Reduce your time to load with Synapse Studio and the COPY statement

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

Azure Synapse Analytics (formerly SQL Data Warehouse) is an analytics platform that enables data professionals to achieve more with fast insights from their data at any scale. The new Synapse Studio provides a unified workspace experience simplifying many common tasks when onboarding analytics workloads. For data engineers using SQL for analytics, data lake exploration using the  Synapse SQL serverless capabilities and external tables has never been easier. The next step after data exploration is data ingestion where you can now expect the same simplified and code-free experience in the Synapse Studio when loading data into your SQL database to serve fast insights for your end users. 
 
Data ingestion with Synapse SQL requires you to specify the source storage location, file format, target table, and generic load settings such as which row to start reading from and what to do when rows are rejected. These steps can be daunting when getting started especially when loading non-binary file formats such as CSV. Loading CSV files requires you to completely understand your data to accurately set up the load process where we have seen users struggle with file parsing, schema mismatch and data type conversion issues. When these issues occur, you typically have to investigate the target table's schema, download a subset of the data, and use external tools such as Notepad++ to manually inspect the file to root cause the issue. Troubleshooting these issues significantly delays the time to load and ultimately time to insight for your end users. 
 
The Synapse Studio now provides a bulk loading wizard which will guide you through this experience to successfully bulk load data. You can now immediately explore your source storage accounts, preview the content of your file to load, and specify the data loading settings through a code-free experience. 
 
1. Right click on a file or folder from your storage account
studio_load_wizard_right_click.png

 

2. Specify the source storage location, file format settings, and error file details. You can preview the content of your file using "Preview data" which will help you specify the correct file format settings. "Preview data" will apply the file format settings you specified and show how the COPY statement will parse your file. 
studio_load_wizard_preview_file_parsing.gif

 

 
Note the wizard will auto-detect the schema when loading Parquet files. 
 
3. Select the target SQL pool and table, column mapping, and click open the script
studio_load_wizard_generate_script.gif

 

 
You can save this auto-generated SQL script to your Synapse workspace for future use or quickly create a SQL Stored Procedure with the COPY statement by removing the comments in the script. 
 
For more information, visit the Azure Synapse documentation here. You can sign up for the Studio preview at this link here.
 

To provide feedback on Synapse SQL experiences in the studio, please reach out to the following email distribution list: SynapseSQLStudioFeedback@service.microsoft.com.

 

Learn more about the simplified COPY statement by visiting the following links:

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.