Hack for speeding up data migration for large tables using SSMA for Oracle

Posted by

This post has been republished via RSS; it originally appeared at: Microsoft Tech Community - Latest Blogs - .

Introduction

SSMA is a great tool for data migration if you plan to migrate data to SQL database without any additional cost and setup hassle. It initiates a single data migration process for each individual table. However, if you have a very large table in your database, the total data migration time will increase because the SSMA will not initiate multiple copy process for a single table.

 

If you have a database with similar data distribution with handful of tables holding most of the data and none of the other available migration services and tools are suitable for you, then use following hack to split your data into multiple chunks for faster data migration. 

 

Use case

This customer wanted to migrate an Oracle database of 1 TB to Azure SQL DB and wanted to use SSMA as data migration tool for its ease of setup and networking requirements. Their application was able to afford longer downtime. They had a schema that had a single table with 80% of the data. In the first run SSMA took more than 20Hrs to complete the data migration. Upon investigation it was found that the large table took most of the time. 

We decided to break this table into multiple chunks but physically breaking the table was not possible therefore we tried creating views to create logical chunks. But SSMA does not currently support data movement from views. This hack helped the customer in moving the data faster and with minimal effort.

 

Setup 

This can be best explained using an example. 

 

  • We have a table named EMPLOYEES with five million rows.
  • This table has a surrogate key named EMPLOYEE_ID which is also an Indexed Primary Key.
  • SSMA took 3:35 minutes to migrate the data to SQL Server.

         KapilSamant_0-1672160309421.png

  • If this time has to be reduced, then the table will have to be split into smaller chunks. 
  • The easiest way to do this is to create views which subsets the data. e.g.

 

create view  EMPLOYEES1 as select * from EMPLOYEES WHERE EMPLOYEE_ID < 1000000;
create view  EMPLOYEES2 as select * from EMPLOYEES WHERE EMPLOYEE_ID >= 1000000 and EMPLOYEE_ID < 2000000;
create view  EMPLOYEES3 as select * from EMPLOYEES WHERE EMPLOYEE_ID >= 2000000 and EMPLOYEE_ID < 3000000;
create view  EMPLOYEES4 as select * from EMPLOYEES WHERE EMPLOYEE_ID >= 3000000 and EMPLOYEE_ID < 4000000;
create view  EMPLOYEES5 as select * from EMPLOYEES WHERE EMPLOYEE_ID >= 4000000;

 

  • But views cannot migrate data in SSMA. Only tables can transfer the data.
  • To work around this, we will drop the views and create empty tables with same names as views.

 

create table  EMPLOYEES1 as select * from EMPLOYEES WHERE 1 = 2;
create table  EMPLOYEES2 as select * from EMPLOYEES WHERE 1 = 2;
create table  EMPLOYEES3 as select * from EMPLOYEES WHERE 1 = 2;
create table  EMPLOYEES4 as select * from EMPLOYEES WHERE 1 = 2;
create table  EMPLOYEES5 as select * from EMPLOYEES WHERE 1 = 2;

 

  • We then refresh the SSMA oracle schema so that these tables show in the project.

         KapilSamant_1-1672199347227.png

  • Migrate these tables to the target database and sync. This step will create empty tables on the target database

         KapilSamant_4-1672199444508.png

  • Once refreshed, drop the tables and create original views at Source.
    • DO NOT refresh the SSMA project after the above step is complete.
  • You can now either use SSMA GUI or command line to migrate the data to target heap tables.
  • Internally it will read the data from the view and therefore load it in chunks.
  • Data will be loaded into temporary tables at target with the same name as the view.
  • Once the data is migrated to the target, it can then be joined together into the main table.
  • Please note: - Indexes and constraints on the main table should be dropped before running the following insert.

 

INSERT INTO [HR].[EMPLOYEES]
select * from 
(select * FROM [HR].[EMPLOYEES1]
UNION ALL
select * FROM [HR].[EMPLOYEES2]
UNION ALL
select * FROM [HR].[EMPLOYEES3]
UNION ALL
select * FROM [HR].[EMPLOYEES4]
UNION ALL
select * FROM [HR].[EMPLOYEES5]) a

 

Conclusion

The total time taken to split, and load is 1:56 minutes. Thats a saving of 1:39 minutes against direct load (which took 3:35 minutes) from SSMA.

 

Considerations

  • This test was carried out on a local infrastructure with Oracle 11g as Source and SQL Server 2019 as target. The results may vary based on your infrastructure and source target combination. 
  • This blog only covers direct data migration with Views using SSMA for Oracle. Views can also be migrated using ADF pipeline and SSIS package.
  • In this scenario we are referring physical tables created by SSMA as Temporary tables.

Feedback and suggestions

If you have feedback or suggestions for improving this asset, please contact the Data Platform Engineering (datasqlninja@microsoft.com). Thanks for your support! Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide.

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.