This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.
Introduction
Custom parameter templates for Azure Synapse Analytics Workspace deployments allow you to parameterize Synapse Pipeline features and settings that are not exposed in the default deployment parameters template. In this blog post, I will demonstrate how to:
- Create Synapse Pipelines that simplify the creation of a custom parameters template
- Create the custom parameters template
- Deploy the custom parameters template as part of an Azure DevOps Release Pipeline for Synapse Workspace deployments
If you are new to Azure Synapse Analytics Workspace Deployments, I suggest you start with these articles to gain an understanding of CI/CD with Synapse and the Azure DevOps Release Pipeline task for Azure Synapse Workspace Deployments:
Overview
When integrating Synapse Analytics Workspaces with Azure DevOps Git, you have the advantages of both Source Control and CI/CD for deployments to other Synapse Environments. When you publish a Git-Enabled Synapse Workspace, two ARM Template files are created in your Git repo workspace_publish branch: TemplateForWorkspace.json and TemplateParametersForWorkspace.json:
By default, only a limited set of values is exposed as parameters in the TemplateParametersForWorkspace.json file, such as linked service connection strings and trigger parameter values. Furthermore, unlike Azure Data Factory, Synapse pipelines do not offer Global Parameters.
When deploying Synapse Workspaces to other environments (such as UAT or prod), you may need to configure different values in your Synapse pipelines for each environment. For example, let’s say I have a Synapse pipeline that does not fit a metadata-driven pattern; however, I still need a pipeline activity to have a different value in each environment. Or perhaps I have a fully baked metadata-driven pipeline, but I want it to perform a Lookup activity on my metadata table based upon the Synapse Workspace environment.
To override the default parameter template, you can create a custom parameter template, which must be named template-parameters-definition.json, and place it in the root folder of the collaboration branch in your Git repo:
Creating custom parameter templates is well documented here. However, it can be cumbersome to parse through the entire workspace template to figure out how to construct the parameter definitions for each feature you want to parameterize. I found an easier approach: create a pipeline parameter for any setting you want to parameterize that supports dynamic expressions (and what doesn’t in Synapse?), then create a simple JSON file that exposes just the pipeline parameter values. This also makes it more transparent which settings are parameterized, much like environment variables in SSIS.
In this blog post, we’ll cover:
- Creating and leveraging pipeline parameters in the Synapse Workspace
  - Decide what features need to be parameterized by environment
  - Create a pipeline parameter for each feature
  - Use the pipeline parameter in the pipeline activity feature’s dynamic expression
- Creating the template-parameters-definition.json file to expose the necessary Synapse pipeline parameters
- Overriding the parameter values in Synapse Workspace Deployment task for the DevOps release pipeline
Creating and leveraging pipeline parameters in the Synapse Workspace
I have 2 examples in my Synapse Analytics Workspace:
- A pipeline which calls a Spark Notebook where the storage account and the size of the Spark pool’s executors and driver vary by environment
- A metadata-driven pipeline where some activity setting values vary by environment
For the first pipeline, I added 4 Pipeline Parameters:
Python uses a different storage endpoint than Synapse linked services so I need to construct the endpoint in my Notebook. The storage account will be different based upon the environment, so I need to parameterize it:
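As a rough illustration of that notebook cell, here is a minimal sketch of building the endpoint from a storage account parameter. It assumes the notebook reads from ADLS Gen2 through the abfss scheme; the function name, the container, and the path are hypothetical placeholders.

```python
def build_abfss_path(storage_account: str, container: str, relative_path: str = "") -> str:
    """Return an abfss:// URI for a location in an ADLS Gen2 storage account."""
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
    cleaned = relative_path.strip("/")  # avoid doubled or trailing slashes
    return f"{base}/{cleaned}" if cleaned else base


# In a Synapse notebook, storage_account would live in the parameters cell
# and be overwritten by the value passed from the Notebook activity.
storage_account = "mydevstorageacct"  # hypothetical account name
print(build_abfss_path(storage_account, "raw", "sales/2023"))
# prints: abfss://raw@mydevstorageacct.dfs.core.windows.net/sales/2023
```

Because only the account name changes between environments, the rest of the notebook can stay identical in dev and uat.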
On the Notebook Activity, I set the notebook parameter values to the pipeline parameter values. Though I have a medium Spark pool defined in both my dev and uat Synapse workspaces, I want to save money by running only a small executor and driver size in dev. In uat, we’ll want to use medium.
My second pipeline is a full metadata-driven pipeline which leverages a control table in SQL for Copy Data and Dataflow Activities. I also want to have a single table for some static values that don’t vary by pipeline entities. I want this table to hold values for all environments rather than separate tables or databases for each environment.
In this pipeline, I have a single parameter called Environment:
The pipeline has a Lookup activity that queries my table to get the values for that environment:
Here’s the full query dynamic expression:
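The expression is written in the Synapse pipeline expression language, but in effect it just concatenates the Environment parameter into a SQL statement. The Python sketch below mimics what such an expression might resolve to at run time; the table name, the column name, and the helper name are hypothetical, not the ones from my workspace.

```python
def build_lookup_query(environment: str, table: str = "dbo.EnvironmentConfig") -> str:
    """Mimic the query text a Lookup activity would receive for one environment."""
    # Double any single quotes so the value stays inside the SQL string literal.
    safe_env = environment.replace("'", "''")
    return f"SELECT * FROM {table} WHERE Environment = '{safe_env}'"


# A comparable pipeline expression (assumed form, not copied from the pipeline):
#   @concat('SELECT * FROM dbo.EnvironmentConfig WHERE Environment = ''',
#           pipeline().parameters.Environment, '''')
print(build_lookup_query("uat"))
# prints: SELECT * FROM dbo.EnvironmentConfig WHERE Environment = 'uat'
```

The same query text is used in every environment; only the value of the Environment pipeline parameter differs.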
The results from the Lookup are then used to set a variable for the folder name:
The variable along with the other Lookup activity outputs are used in the rest of my pipeline:
Next for the fun (and easy) part.
Creating the template-parameters-definition.json file
After committing all the changes to the main (collaboration) branch, go to that branch in the DevOps Git repo and create a new file in the root directory called template-parameters-definition.json.
To expose ALL pipeline parameters to DevOps, the contents of the file would simply contain:
However, in my case, I only want to parameterize the values for my Environment, StorageAccount, and SparkExecutorSize parameters so my json file contains only those parameters:
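Both variants of the file share the same shape, so here is a minimal Python sketch that generates either one. The resource type key and the "=" default-value marker reflect my understanding of the custom parameter syntax and should be checked against the Microsoft documentation; build_definition is a hypothetical helper name.

```python
import json

# Resource type under which pipeline settings are parameterized (assumed key).
PIPELINE_RESOURCE = "Microsoft.Synapse/workspaces/pipelines"


def build_definition(parameter_names=None) -> dict:
    """Build the content of template-parameters-definition.json.

    Passing no names exposes every pipeline parameter through the "*" wildcard.
    """
    names = parameter_names or ["*"]
    # "=" keeps the current value as the default of the generated ARM parameter.
    parameters = {name: {"defaultValue": "="} for name in names}
    return {PIPELINE_RESOURCE: {"properties": {"parameters": parameters}}}


# Expose only the three parameters that vary by environment.
selected = build_definition(["Environment", "StorageAccount", "SparkExecutorSize"])
print(json.dumps(selected, indent=4))
```

Calling build_definition() with no arguments yields the wildcard version that exposes all pipeline parameters.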
The next time I publish my Synapse workspace, the pipeline parameters are added to the TemplateParametersForWorkspace.json file:
Overriding the parameter values in Synapse Workspace Deployment task
In the DevOps release pipeline for my UAT environment, I created variables for the 3 pipeline parameters with the values for my UAT environment:
In the Synapse Deployment release pipeline task, I override the default values of my Synapse pipeline parameters with the variables:
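The override field takes a list of "-parameterName value" pairs. As a minimal sketch, the helper below assembles that string from a dictionary. The ARM parameter names shown are hypothetical; copy the real ones from your TemplateParametersForWorkspace.json. The $(...) values are Azure DevOps variable references that the release resolves at run time.

```python
def build_override_arguments(overrides: dict) -> str:
    """Join ARM parameter overrides into one '-name value' argument string."""
    parts = []
    for name, value in overrides.items():
        if " " in value:
            # Quote values containing spaces so they stay a single argument.
            value = f'"{value}"'
        parts.append(f"-{name} {value}")
    return " ".join(parts)


# Hypothetical parameter names; the real ones come from the published
# TemplateParametersForWorkspace.json file.
uat_overrides = {
    "PL_Metadata_properties_parameters_Environment_defaultValue": "$(Environment)",
    "PL_Notebook_properties_parameters_StorageAccount_defaultValue": "$(StorageAccount)",
    "PL_Notebook_properties_parameters_SparkExecutorSize_defaultValue": "$(SparkExecutorSize)",
}
print(build_override_arguments(uat_overrides))
```

Keeping the values as release variables means the same task definition can be cloned for each stage, with only the variable values changing.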
After I save my release pipeline, the next deployment to my UAT environment, whether manual or triggered through CI/CD, will give my Synapse UAT Workspace the new values!
That’s it! Easy-peasy!
Summary
For Synapse pipeline activities with settings that vary by environment, adding parameters to the Synapse pipelines simplifies creating a custom parameter template. You can then override the exposed values in the Azure Synapse Workspace deployment task of your Azure DevOps Release Pipeline.
If you are new to Synapse Git integration and DevOps, I also recommend these resources:
CI/CD in Azure Synapse Analytics
Automating the Publishing of Workspace Artifacts in Synapse CICD - Microsoft Community Hub
To create a custom parameter template for features beyond pipeline parameters, check out:
I hope you enjoyed this article and welcome any feedback!