Azure Synapse Analytics CI/CD with Custom Parameters – Made Easy!

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

Introduction

Custom parameter templates for Azure Synapse Analytics Workspace deployments allow you to parameterize Synapse Pipeline features and settings that are not exposed in the default deployment parameters template. In this blog post, I will demonstrate how to:

  • Create Synapse Pipelines that simplify the creation of a custom parameters template
  • Create the custom parameters template
  • Deploy the custom parameters template as part of an Azure DevOps Release Pipeline for Synapse Workspace deployments

Synapse CICD Custom Parms.jpg

If you are new to Azure Synapse Analytics Workspace Deployments, I suggest you start with these articles to gain an understanding of CI/CD with Synapse and the Azure DevOps Release Pipeline task for Azure Synapse Workspace Deployments:

 

Overview

When you integrate a Synapse Analytics Workspace with Azure DevOps Git, you get the advantages of both source control and CI/CD for deployments to other Synapse environments. When you publish a Git-enabled Synapse Workspace, two ARM template files are created in the workspace_publish branch of your Git repo: TemplateForWorkspace.json and TemplateParametersForWorkspace.json:

 

jehayes_0-1682105674761.png

 

By default, only a limited set of values is exposed as parameters in the TemplateParametersForWorkspace.json file, such as linked service connection strings and trigger parameter values. Furthermore, Global Parameters are not available in Synapse Pipelines as they are in Azure Data Factory.

 

When deploying Synapse Workspaces to other environments (such as UAT, prod, etc.), you may need to configure different values in your Synapse pipelines for each environment. For example, let’s say I have a Synapse pipeline that does not fit a metadata-driven pattern; however, I still need a pipeline activity to have a different value in each environment. Or perhaps I have a fully baked metadata-driven pipeline, but I want it to perform a Lookup activity on my metadata table based on the Synapse Workspace environment.

 

To override the default parameter template, you can create a custom parameter template, which must be named template-parameters-definition.json, and place it in the root folder of your collaboration branch in your Git repo:

jehayes_1-1682105814307.png

 

Creating custom parameter templates is well defined here. However, it can be a bit cumbersome to parse through the entire workspace template to figure out how to construct the parameter definitions for each feature you want to parameterize. I found an easier approach: create pipeline parameters for any setting that you want to parameterize and that supports dynamic expressions (and what doesn’t in Synapse?). Then create a simple JSON file that exposes just the pipeline parameter values. This also provides more transparency about which settings are parameterized, much like environment variables in SSIS.

 

In this blog post, we’ll cover:

  1. Creating and leveraging pipeline parameters in the Synapse Workspace
    1. Decide what features need to be parameterized by environment
    2. Create a pipeline parameter for each feature
    3. Use the pipeline parameter in the pipeline activity feature’s dynamic expression
  2. Creating the template-parameters-definition.json file to expose the necessary Synapse pipeline parameters
  3. Overriding the parameter values in Synapse Workspace Deployment task for the DevOps release pipeline

Creating and leveraging pipeline parameters in the Synapse Workspace

I have 2 examples in my Synapse Analytics Workspace:

  1. A pipeline which calls a Spark Notebook where the storage account and the size of the Spark pool’s executors and driver vary by environment
  2. A metadata-driven pipeline where some activity setting values vary by environment

For the first pipeline, I added 4 Pipeline Parameters:

 

jehayes_0-1682106163823.png

Python uses a different storage endpoint than Synapse linked services, so I need to construct the endpoint in my Notebook. The storage account differs by environment, so I need to parameterize it:

 

jehayes_1-1682106250259.png
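As a rough illustration of constructing the endpoint in the Notebook, here is a minimal sketch. It assumes the data lives in ADLS Gen2 and is addressed with an abfss:// URI; the container name, folder path, and the build_abfss_path helper are hypothetical and not taken from my workspace. Only the StorageAccount parameter name comes from the pipeline.

```python
def build_abfss_path(storage_account: str, container: str, relative_path: str = "") -> str:
    """Build an ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
    relative_path = relative_path.strip("/")
    return f"{base}/{relative_path}" if relative_path else base


# In a Synapse notebook, StorageAccount would sit in the parameters cell and be
# overwritten by the Notebook activity. The value below is only a placeholder.
StorageAccount = "mydevstorage"

# "raw" and "sales/2023" are hypothetical container and folder names.
print(build_abfss_path(StorageAccount, "raw", "sales/2023"))
```

Because only the account name changes between environments, the rest of the notebook can stay identical in dev and UAT.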

On the Notebook activity, I set the notebook parameter values to the pipeline parameter values. Though I have a medium Spark pool defined in both my dev and UAT Synapse workspaces, I want to save money by running only a small executor and driver size in dev. In UAT, we’ll want to use medium.

 

jehayes_0-1682106382620.png

 

My second pipeline is a full metadata-driven pipeline which leverages a control table in SQL for Copy Data and Dataflow Activities. I also want to have a single table for some static values that don’t vary by pipeline entities. I want this table to hold values for all environments rather than separate tables or databases for each environment.

 

jehayes_0-1682360648041.png

 

In this pipeline, I have a single parameter called Environment:

 

jehayes_2-1682106558064.png

 

The pipeline has a Lookup activity that queries my table to get the values for that environment:

 

jehayes_3-1682106558076.png

 

Here’s the full query dynamic expression:

 

jehayes_4-1682106558080.png

 

select * from dbo.ParameterLookup where PipelineName = '@{pipeline().Pipeline}' and Environment = '@{pipeline().parameters.Environment}'
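To make it easier to see what the Lookup activity ends up sending to SQL, here is a small sketch that mimics how the @{...} string interpolation resolves. This is not how Synapse evaluates expressions internally; render_query is a hypothetical helper that only handles the two expressions used in this query, and the pipeline name in the example is made up.

```python
import re

QUERY_TEMPLATE = (
    "select * from dbo.ParameterLookup "
    "where PipelineName = '@{pipeline().Pipeline}' "
    "and Environment = '@{pipeline().parameters.Environment}'"
)


def render_query(template: str, pipeline_name: str, parameters: dict) -> str:
    """Replace each @{...} placeholder with the matching pipeline value."""
    prefix = "pipeline().parameters."

    def resolve(match: re.Match) -> str:
        expr = match.group(1)
        if expr == "pipeline().Pipeline":
            return pipeline_name
        if expr.startswith(prefix):
            return str(parameters[expr[len(prefix):]])
        raise ValueError(f"unsupported expression: {expr}")

    return re.sub(r"@\{([^}]*)\}", resolve, template)


# "PL_LoadSales" is a hypothetical pipeline name.
print(render_query(QUERY_TEMPLATE, "PL_LoadSales", {"Environment": "uat"}))
```

The same pipeline therefore reads a different row from dbo.ParameterLookup in each environment, driven only by the Environment parameter.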

 

The results from the Lookup are then used to set a variable for the folder name:

 

jehayes_5-1682106558085.png

jehayes_6-1682106855020.png

The variable along with the other Lookup activity outputs are used in the rest of my pipeline:

jehayes_1-1682361050922.png

 

jehayes_2-1682361190025.png

 

jehayes_3-1682361296870.png

 

Next for the fun (and easy) part.

 

Creating the template-parameters-definition.json file

After committing all the changes to the main or collaboration branch, go to that branch in the DevOps Git repo and create a new file in the root directory called template-parameters-definition.json.

 

jehayes_7-1682106915544.png

 

To expose ALL pipeline parameters to DevOps, the file would simply contain:

 

{
    "Microsoft.Synapse/workspaces/pipelines": {
        "properties": {
            "parameters": {
                "*": {
                    "*": "="
                }
            }
        }
    }
}

 

However, in my case, I only want to parameterize the values of my Environment, StorageAccount, and SparkExecutorSize parameters, so my JSON file contains only those parameters:

 

{
    "Microsoft.Synapse/workspaces/pipelines": {
        "properties": {
            "parameters": {
                "Environment": {
                    "defaultValue": "="
                },
                "StorageAccount": {
                    "defaultValue": "="
                },
                "SparkExecutorSize": {
                    "defaultValue": "="
                }
            }
        }
    }
}
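If the list of parameters grows, the filtered definition above could also be generated rather than hand-edited. The following is a minimal sketch under that assumption; build_definition is a hypothetical helper, and the output simply mirrors the structure shown above.

```python
import json


def build_definition(parameter_names):
    """Build a template-parameters-definition structure exposing the named pipeline parameters."""
    return {
        "Microsoft.Synapse/workspaces/pipelines": {
            "properties": {
                "parameters": {
                    # "=" keeps the current value as the parameter's default.
                    name: {"defaultValue": "="}
                    for name in parameter_names
                }
            }
        }
    }


definition = build_definition(["Environment", "StorageAccount", "SparkExecutorSize"])

# Print the JSON; saving it as template-parameters-definition.json in the
# root of the collaboration branch is left to the reader.
print(json.dumps(definition, indent=4))
```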

 

The next time I publish my Synapse workspace, the pipeline parameters are added to the TemplateParametersForWorkspace.json file:

 

jehayes_0-1682107874725.png
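To check which names were generated after publishing, you can list the entries in the parameters file. The sketch below relies only on the standard ARM parameter file layout (a top-level "parameters" object whose entries each carry a "value"). The sample content, including the MyPipeline_... parameter name, is hypothetical; the real names are whatever appears in your own TemplateParametersForWorkspace.json.

```python
import json

# Hypothetical sample; in practice, read TemplateParametersForWorkspace.json
# from the workspace_publish branch instead.
SAMPLE_PARAMETERS_FILE = """
{
    "contentVersion": "1.0.0.0",
    "parameters": {
        "workspaceName": {"value": "synapse-dev"},
        "MyPipeline_properties_parameters_Environment_defaultValue": {"value": "dev"}
    }
}
"""


def list_parameters(parameters_json: str) -> dict:
    """Return a {parameter name: value} mapping from an ARM parameters file."""
    document = json.loads(parameters_json)
    return {
        name: entry.get("value")
        for name, entry in document.get("parameters", {}).items()
    }


for name, value in list_parameters(SAMPLE_PARAMETERS_FILE).items():
    print(f"{name} = {value}")
```

These are the names you will need when overriding values in the release pipeline.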

 

Overriding the parameter values in Synapse Workspace Deployment task

In the DevOps release pipeline for my UAT environment, I created variables for the 3 pipeline parameters with the values for my UAT environment:

 

jehayes_1-1682107874741.png

 

In the Synapse Deployment release pipeline task, I override the default values of my Synapse pipeline parameters with the variables:

jehayes_2-1682107874764.png
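The override field in the screenshot holds a list of name/value pairs. As a sketch, and assuming the task accepts overrides in the form -parameterName value, the string can be assembled from a mapping like this. The build_override_string helper, the quoting of values containing spaces, and the MyPipeline_... parameter name are all assumptions for illustration; $(Environment) is the usual Azure DevOps syntax for referencing a release variable.

```python
def build_override_string(overrides: dict) -> str:
    """Join {ARM parameter name: value} pairs into '-name value' form."""
    parts = []
    for name, value in overrides.items():
        text = str(value)
        # Assumption: values containing whitespace should be wrapped in double quotes.
        if any(ch.isspace() for ch in text):
            text = f'"{text}"'
        parts.append(f"-{name} {text}")
    return " ".join(parts)


# Hypothetical ARM parameter name mapped to a release variable.
overrides = {
    "MyPipeline_properties_parameters_Environment_defaultValue": "$(Environment)",
}
print(build_override_string(overrides))
```

Keeping the values in release variables means each stage (UAT, prod, and so on) can reuse the same task definition with its own variable values.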

 

After I save my release pipeline, the next deployment to my UAT environment, whether run manually or started by a CI/CD trigger, will update my Synapse UAT Workspace with the new values!

jehayes_3-1682107874770.png

 

jehayes_4-1682107874772.png

 

That’s it! Easy-peasy!

 

Summary

For Synapse pipeline activities with settings that vary by environment, adding pipeline parameters simplifies the creation of a custom parameter template, and the exposed parameter values can then be overridden in your Azure DevOps Release Pipeline for Azure Synapse Workspace deployments.

 

If you are new to Synapse Git integration and DevOps, I also recommend these resources:

CI/CD in Azure Synapse Analytics

Synapse CI/CD Video Series

Automating the Publishing of Workspace Artifacts in Synapse CICD - Microsoft Community Hub

 

For creating a custom parameters template for features beyond pipeline parameters, check out:

CICD Automation in Synapse Analytics: taking advantage of custom parameters in Workspace Templates - Microsoft Community Hub

 

I hope you enjoyed this article and welcome any feedback!

 
