
Synapse Connectivity Series Part #3 – Synapse Managed VNET and Managed Private Endpoints

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

This is part 3 of a series related to Synapse Connectivity - check out the previous blog articles:

 

In this article we are going to talk about Synapse Managed Virtual Network and Managed Private Endpoints:

1 - Synapse Managed VNET and Data Exfiltration

2 - Managed Private Endpoints

3 - Synapse Managed VNET flavors

4 - Troubleshooting

 

1 - Synapse Managed VNET and Data Exfiltration

When you create your Azure Synapse workspace, you can choose to associate it with an Azure Virtual Network. The Virtual Network associated with your workspace is managed by Azure Synapse. This Virtual Network is called a Managed Workspace Virtual Network, or Synapse Managed VNET.

 

Synapse with Managed VNET supports enabling Data Exfiltration Protection (DEP) for workspaces. With exfiltration protection, you can guard against malicious insiders accessing your Azure resources and exfiltrating sensitive data to locations outside of your organization’s scope. At the time of workspace creation, you can choose to configure the workspace with a managed virtual network and additional protection against data exfiltration.

 

 

 

Azure Synapse provides various analytic capabilities in a workspace:

 

If your workspace has a Managed VNET, ADF - Azure Integration Runtime (AzureIR) and Spark resources are deployed in the VNET. This means that when an Azure IR or Spark VM is created or started for an execution, it will get a private IP from this managed VNET and will comply with the rules of this managed VNET. If you have selected Data Exfiltration Protection, you cannot go out to ANY public endpoint. (More details below)

 

Dedicated SQL pool and serverless SQL pool are multi-tenant and therefore reside outside of the Managed Workspace Virtual Network. Intra-workspace communication from ADF/Spark to dedicated SQL pool and serverless SQL pool uses Managed Private Endpoints. These private endpoints are automatically created for you when you create a workspace with a Managed VNET associated with it.

 

 

 

Taking into account all of the requirements mentioned, we have three variations of Synapse workspaces:

Before we dive into the details of the three options, we will explain what Managed Private Endpoints are.

 

2 - Managed Private Endpoints

Managed private endpoints are Private Endpoints created within a Synapse Managed VNET. Managed private endpoints establish a private link to Azure resources, and Azure Synapse manages these private endpoints on your behalf. You can create Managed private endpoints from your Azure Synapse workspace to access Azure services like Azure Storage or Azure Cosmos DB, as well as Azure-hosted customer/partner services.

 

IMPORTANT !!!

You cannot reuse existing private endpoints from your own Azure VNET. In this scenario we want to connect Synapse resources in a Managed VNET to an Azure resource, not connect your client directly to the resource. That means the traffic will not go through your VNET or through your firewall. It is a VM (ADF or Spark) in a Synapse Managed VNET accessing the resource directly.

 

 

A Managed private endpoint uses a private IP address from your Managed Virtual Network to effectively bring the Azure service that your Azure Synapse workspace is communicating with into your Virtual Network. Managed private endpoints are mapped to a specific resource in Azure and not the entire service. Customers can limit connectivity to a specific resource approved by their organization.

Ref: Synapse Managed private endpoints

 

A private endpoint connection is created in a "Pending" state. The destination resource owner is responsible for approving or rejecting the connection. Only a Managed private endpoint in an approved state can be used to send traffic to the private link resource that is linked to the Managed private endpoint. You can also create a private link between different subscriptions and even different tenants.
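The approval rule can be sketched in a few lines of Python. This is only an illustrative model, not a Synapse API: the `ManagedPrivateEndpoint` class, the endpoint names, and the "Rejected" state label are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ManagedPrivateEndpoint:
    # Hypothetical model of a managed private endpoint, for illustration only.
    name: str
    target_resource: str
    state: str  # assumed values: "Pending", "Approved" or "Rejected"


def usable_endpoints(endpoints):
    """Return only the endpoints that are allowed to carry traffic."""
    return [ep for ep in endpoints if ep.state == "Approved"]


# Example data (made-up names).
endpoints = [
    ManagedPrivateEndpoint("mpe-storage-dfs", "mystorage (dfs)", "Approved"),
    ManagedPrivateEndpoint("mpe-cosmos", "mycosmos", "Pending"),
]

for ep in usable_endpoints(endpoints):
    print(f"{ep.name} can send traffic to {ep.target_resource}")
```

An endpoint that is still "Pending" is filtered out, which mirrors why a freshly created Managed private endpoint does not work until the resource owner approves it.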

 

 

Ref: Manage Azure Private Endpoints 

 

In the image below I'm trying to show that when you start an ADF (Azure IR) execution or when you start a Spark job, we need a machine to actually run it. The machines are created on demand because you pay per use, and because they need to be part of the VNET, they are created joined to the VNET.

 

 

 

 

 

3 - Synapse Managed VNET flavors

3.1 - Option 1 - Synapse with NO VNET

For ADF Azure IR and Spark, a VM is created to process your workload. This process can take a few minutes to get ready.

 


3.2 - Option 2 - Synapse with Managed VNET

For ADF Azure IR and Spark, a VM is created to process your workload. This process can take some minutes to get ready.

"By design, Managed VNet IR takes longer queue time than Azure IR as we are not reserving one compute node per service instance, so there is a warm up for each copy activity to start, and it occurs primarily on VNet join rather than Azure IR."

 


3.3 - Option 3 - Synapse with Managed VNET + DEP (Data Exfiltration Protection)

The difference from option 2 is that you are NOT allowed to access any public endpoint, even the ones that are part of your subscription. You need to access the resources using Managed Private Endpoints.

Check out Data exfiltration protection for Azure Synapse Analytics workspaces for more information.
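The three options can be summarized as a small decision function. This is a simplified model of the rules described in this article, not an official API; the `Flavor` enum and `outbound_allowed` function are names invented for the sketch.

```python
from enum import Enum


class Flavor(Enum):
    # Hypothetical labels for the three workspace options in this article.
    NO_VNET = 1
    MANAGED_VNET = 2
    MANAGED_VNET_DEP = 3


def outbound_allowed(flavor, via_approved_mpe):
    """Can Azure IR / Spark compute reach a target resource?

    via_approved_mpe: True if the connection uses an approved
    Managed private endpoint, False if it uses a public endpoint.
    """
    if via_approved_mpe:
        # Managed private endpoints only exist inside a Managed VNET.
        return flavor is not Flavor.NO_VNET
    # Public endpoints are blocked only when DEP is enabled.
    return flavor is not Flavor.MANAGED_VNET_DEP


for flavor in Flavor:
    print(flavor.name,
          "public:", outbound_allowed(flavor, via_approved_mpe=False),
          "MPE:", outbound_allowed(flavor, via_approved_mpe=True))
```

The only combination that blocks public endpoints is Managed VNET with DEP, which is why every outbound dependency in that option needs its own approved Managed private endpoint.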

 

 

3.3.1 - Alternative to SHIR VM

Instead of using Self Hosted integration runtime you can use proxy machines. We will not go into the details of these solutions in this article, but the following documentation provides a step-by-step guide:

 

4 - Troubleshooting

Troubleshooting inbound connections is the same whether or not you have a Managed VNET. If your issue is with inbound connections, refer to Synapse Connectivity Series Part #2 - Inbound Synapse Private Endpoints.

 

Check the following troubleshooting items:

4.1 - Linked Services

Check if the linked service is using the managed private endpoint.

 

4.2 - Managed Private Endpoints

Check if Managed private endpoints exist and if they are approved.

*Note that some services have multiple endpoints, such as storage (blob and dfs). Which one you need depends on the endpoint you are actually using.

 

You can also check it from the resource's point of view. The name of the private endpoint will be [WORKSPACENAME].[NAME YOU GAVE TO PE]
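As a rough aid for these two checks, the sketch below derives which storage endpoint (blob or dfs) a URL points at and builds the private endpoint name to look for on the resource side. The function names and the sample workspace, account, and endpoint names are made up for illustration; it assumes the usual `<account>.<blob|dfs>.core.windows.net` host pattern.

```python
from urllib.parse import urlparse


def storage_subresource(url):
    """Return "blob" or "dfs" for a storage URL, or None if not recognized."""
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    # Expected shape: <account>.<blob|dfs>.core.windows.net
    if len(labels) >= 3 and labels[1] in ("blob", "dfs"):
        return labels[1]
    return None


def expected_pe_name(workspace_name, mpe_name):
    """Name shown on the target resource: [WORKSPACENAME].[NAME YOU GAVE TO PE]."""
    return f"{workspace_name}.{mpe_name}"


# Example values (hypothetical names).
print(storage_subresource("abfss://data@mystorage.dfs.core.windows.net/raw"))
print(storage_subresource("https://mystorage.blob.core.windows.net/data"))
print(expected_pe_name("myworkspace", "mpe-storage-dfs"))
```

If the linked service uses a dfs URL but only a blob Managed private endpoint exists, the connection will not go through the private endpoint you expect.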

 

4.3 - Test connection

Check if it's using the managed private endpoint.

 

Enable interactive authoring to test connections. As mentioned before, we need a machine that exists in the Synapse Managed VNET to test this connection, and because machines are created on demand, one is not available right away.


4.4 - Test name resolution and port (from Spark)

As we do not have an Azure VM inside the Managed VNET to do some tests, we can use Spark Notebooks to test it directly.

 

Check name resolution. The hostname should resolve to a private IP address such as 10.x.x.x.

 

 

%%pyspark
import socket

hostname = "management.azure.com"
port = 443

############################################################
def resolve_hostname(hostname):
    # Resolve the hostname and print the IP address it maps to.
    try:
        ip = socket.gethostbyname(hostname)
        print(f"{hostname} resolved to {ip}.")
        return ip
    except socket.error:
        print(f"Unable to resolve hostname {hostname}.")
        return None

############################################################
def is_port_open(hostname, port):
    # Try a TCP connection with a short timeout to see if the port is reachable.
    try:
        sock = socket.create_connection((hostname, port), timeout=1)
        sock.close()
        print(f"Port {port} is OPEN to {hostname}")
    except socket.error:
        print(f"Port {port} is CLOSED to {hostname}")

############################################################
resolve_hostname(hostname)
is_port_open(hostname, port)

 

 

 

We can see below that Storage is open because we have a Managed private endpoint, but management.azure.com shows as closed because this was a workspace with DEP and it cannot go to public endpoints, as explained above.
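To avoid checking the resolved address by eye, the test can also report whether the IP is in a private range. The sketch below uses Python's standard `ipaddress` module; the helper names are made up for this example, and in a Synapse notebook the cell would start with `%%pyspark` as in the script above.

```python
import ipaddress
import socket


def is_private_ip(ip):
    """True if the address is in a private range (for example 10.x.x.x)."""
    return ipaddress.ip_address(ip).is_private


def check_private_resolution(hostname):
    """Resolve hostname and report whether it maps to a private IP.

    Returns True/False, or None if the name cannot be resolved.
    """
    try:
        ip = socket.gethostbyname(hostname)
    except socket.error:
        print(f"Unable to resolve hostname {hostname}.")
        return None
    private = is_private_ip(ip)
    kind = "private" if private else "PUBLIC"
    print(f"{hostname} resolved to {ip} ({kind}).")
    return private
```

Calling `check_private_resolution` with the hostname of a resource that has an approved Managed private endpoint should report a private address. A public address suggests the connection is not going through the Managed private endpoint.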

 

 

 

 

 

 

