How to use COPY Into

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

COPY Into provides a new method of loading data from storage directly into a target object with lesser permissions required and no additional objects or credentials to be created in order to load the data.

For full details on Parameters and Syntax review our documentation here

 

Permissions

The User execution the Copy Command must have the following Permissions

  • ADMINISTER DATABASE BULK OPERATIONS
  • INSERT

 

Requirements and Limitations

With COPY Into we DO NOT Have to create any Database Scoped Credentials like we must for Polybase, the credential will be defined in the COPY Into Statement.

 

Each File Type has varying support for credentials Per storage type so when deciding on a credential ensure that the storage and file type supports this method of authentication to storage.

 

 

 

CSV

Parquet

ORC

Azure blob storage

SAS/MSI/SERVICE PRINCIPAL/KEY/AAD

SAS/KEY

SAS/KEY

Azure Data Lake Gen2

SAS/MSI/SERVICE PRINCIPAL/KEY/AAD

SAS/MSI/SERVICE PRINCIPAL/KEY/AAD

SAS/MSI/SERVICE PRINCIPAL/KEY/AAD

 

Important note: When the source file is Parquet or Orc an External File format is required, which is defined via Create External File Format.

 

Storage Level Permissions

As per the table above we support certain credentials per file type per storage account type.

RBAC Storage Blob Data Contributor/Reader/Owner: AAD / MSI / Service Principal

Read and List Permissions: Shared Access Signature

 

Best Practices

  • Avoid Specifying Wildcards that would expand over large number of files rather list multiple file locations instead
  • Use dfs endpoint for ADLS Gen2 for better performance

 

Examples:

For all the examples below there are 2 FROM Entries, one for Storage Account Gen2 and the other for Azure Blob Storage. A data source will be either or, remove the entry which is not required.

 

Authenticating with Managed Service Identity (MSI)

 

IDENTITY: A constant with a value of ‘Managed Identity’

The MSI Authentication type requires you to create the MSI via powershell for the logical SQL Server

 

In order to so run the following Powershell commands, you can run this from the Azure Module if you have it installed or run the command from the Azure Portal using Cloudshell.

 

Connect-AzAccount

Select-AzSubscription -SubscriptionId <your-subscriptionId>

Set-AzSqlServer -ResourceGroupName <your-database-server-resourceGroup> -ServerName <your-database-servername> -AssignIdentity

 

 

 -- MSI

COPY INTO <TARGET TABLE>

FROM 'https://<Storageaccount>.dfs.core.windows.net/container/file' -- Storage Account Gen2

FROM 'https://<Storageaccount>.blob.core.windows.net/container/file' -- Azure Blob Storage

WITH (

    FILE_TYPE = 'CSV',

    CREDENTIAL= (IDENTITY = 'Managed Identity'),

    FIELDQUOTE = '"',

    FIELDTERMINATOR=';',

    ROWTERMINATOR='\n',

    DATEFORMAT = 'ymd',

MAXERRORS = 10

)

 

 

Authenticating with Service Principals

IDENTITY: <CLIENTID>@<OAuth_2.0_V1_Token_EndPoint> 

SECRET: AAD Application Service Principal key

 

Example : 

CREDENTIAL = ( IDENTITY = 'abcdefgh-a123-4567-I9jk-l1m12345678n@https://login.microsoftonline.com/12a345bc-67d8-91ef-23gh-4i5jk678lm91/oauth2/token', SECRET = 'maopiqweh+-alskd!)@ASlkd09k1' )

 

 -- Service Principal

COPY INTO <TARGET TABLE>

FROM 'https://<Storageaccount>.dfs.core.windows.net/container/file' -- Storage Account Gen2

FROM 'https://<Storageaccount>.blob.core.windows.net/container/file' -- Azure Blob Storage

WITH (

    FILE_TYPE = 'CSV',

CREDENTIAL=(IDENTITY= '<client_id>@<OAuth_2.0_Token_EndPoint>',SECRET='<key>'),  

    FIELDQUOTE = '"',

    FIELDTERMINATOR=';',

    ROWTERMINATOR='\n',

    DATEFORMAT = 'ymd',

MAXERRORS = 10

 

)

 

Authentication with AAD

When using AAD Authentication you have to be connected to the DW as the AAD User, a Service Principal Does NOT Qualify as an AAD User and if you want to make use of the Service Principal follow the guidelines above.

 

 -- AAD

COPY INTO <TARGET TABLE>

FROM 'https://<Storageaccount>.dfs.core.windows.net/container/file' -- Storage Account Gen2

FROM 'https://<Storageaccount>.blob.core.windows.net/container/file' -- Azure Blob Storage

WITH (

    FILE_TYPE = 'CSV',

    FIELDQUOTE = '"',

    FIELDTERMINATOR=';',

    ROWTERMINATOR='\n',

    DATEFORMAT = 'ymd',

MAXERRORS = 10

 

)

 

Authenticating with Storage account key

 

IDENTITY: A constant with a value of ‘Storage Account Key’

SECRET: Storage account key

 

-- Storage Account Key

COPY INTO <TARGET TABLE>

FROM 'https://<Storageaccount>.dfs.core.windows.net/container/file' -- Storage Account Gen2

FROM 'https://<Storageaccount>.blob.core.windows.net/container/file' -- Azure Blob Storage

WITH (

    FILE_TYPE = 'CSV',

     CREDENTIAL= (IDENTITY='Storage Account Key', Secret='<Your Account Key>'

    FIELDQUOTE = '"',

    FIELDTERMINATOR=';',

    ROWTERMINATOR='\n',

    DATEFORMAT = 'ymd',

    MAXERRORS = 10

)

 

ORC File with AAD Authentication

The Following example is to provide you with the required steps to read data from ORC files, if you are using Parquet then a File Format for Parquet should be created as well.

 

CREATE EXTERNAL FILE FORMAT <File Format Name>

WITH ( 

    FORMAT_TYPE = ORC, 

    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec' 

);

 

COPY INTO <TARGET TABLE>

FROM 'https://<Storageaccount>.dfs.core.windows.net/container/file' -- Storage Account Gen2

FROM 'https://<Storageaccount>.blob.core.windows.net/container/file' -- Azure Blob Storage

WITH (

    FILE_TYPE = 'ORC',

    FORMAT = <File Format Name>

    FIELDQUOTE = '"',

    FIELDTERMINATOR=';',

    ROWTERMINATOR='\n',

    DATEFORMAT = 'ymd',

    MAXERRORS = 10

)

 

Conclusion:

In the article we covered how to execute COPY Into from a CSV File and what the basic requirements are to use the feature. I strongly advise reading the COPY Into documentation to confirm all the available options and functionality for the feature. 

 

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.