Getting started with Private Clusters on HDInsight on AKS for securing your analytics workloads

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

HDInsight on AKS is a managed Platform as a Service (PaaS) that runs on Azure Kubernetes Service (AKS). It lets you deploy popular open-source analytics workloads such as Apache Spark™, Apache Flink®, and Trino without the overhead of managing and monitoring containers.

HDInsight on AKS clusters allow you to set up outbound network connections from the cluster to any destination that is reachable from the node's network interface. This means that cluster resources can access any public or private IP address, domain name, or URL on the internet or on your virtual network.

However, in some scenarios you may want to control or restrict the egress traffic from your cluster for security or compliance reasons. For example, you may want to:


  1. Prevent clusters from accessing malicious or unwanted services.
  2. Enforce network policies or firewall rules on the outbound traffic.
  3. Monitor or audit the egress traffic from the cluster for troubleshooting or compliance purposes.


There are different methods for managing the traffic flow. You can learn more about it here.

In this blog, we discuss how to control or restrict the egress traffic from your HDInsight on AKS cluster using User Defined Routing (UDR) in your virtual network.


With this setup, no public IP is created when you spin up an HDInsight on AKS cluster.

Note: A UDR setup requires you to set up firewall rules and define the routing in a custom VNet and subnet before you create an HDInsight on AKS cluster.

Let's get started. 

Step 1: Set up the virtual network (VNet). This step is required only if you don't have an existing VNet.


  1. From the Azure portal, search for "Virtual networks" and select the option to create a new one.

  2. Create a VNet named "contoso-hdi-vnet".
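
Before creating the VNet, it can help to sanity-check the address plan for the subnets used in the later steps. The sketch below uses only Python's standard `ipaddress` module. The address ranges are hypothetical (the post does not specify any), and the /26 check reflects the minimum size Azure requires for the dedicated firewall subnet.

```python
import ipaddress

# Hypothetical address plan; the post does not specify any address ranges.
VNET_SPACE = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "default": ipaddress.ip_network("10.0.0.0/24"),
    "AzureFirewallSubnet": ipaddress.ip_network("10.0.1.0/26"),
}


def check_plan(vnet, subnets):
    """Return a list of problems found in the address plan (empty if none)."""
    problems = []
    # Every subnet must sit inside the VNet address space.
    for name, net in subnets.items():
        if not net.subnet_of(vnet):
            problems.append(f"{name} ({net}) is outside the VNet range {vnet}")
    # No two subnets may overlap.
    names = sorted(subnets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")
    # The dedicated firewall subnet must be /26 or larger.
    fw = subnets.get("AzureFirewallSubnet")
    if fw is not None and fw.prefixlen > 26:
        problems.append(f"AzureFirewallSubnet ({fw}) is smaller than /26")
    return problems


print(check_plan(VNET_SPACE, SUBNETS) or "address plan looks consistent")
```

An empty result only means the plan is internally consistent; it does not check for conflicts with peered or on-premises networks.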



Step 2: Set up the firewall. Deploy the firewall in your virtual network (contoso-hdi-vnet).
To deploy a firewall into the virtual network, you need a subnet called AzureFirewallSubnet.

  1. Navigate to your VNet (contoso-hdi-vnet) and go to Subnets.
  2. Add a subnet with the subnet purpose set to "Azure Firewall".

  3. Go to the Firewall tab and select the option to add a new firewall.

  4. Create a firewall named "contoso-hdi-firewall" with the following details:

    Resource group: Same resource group as the virtual network.
    Name: "contoso-hdi-firewall" or a name of your choice.
    Region: Same region as the virtual network.
    Firewall policy: Create one by selecting Add new.
    Virtual network: Select the virtual network (contoso-hdi-vnet).
    Public IP address: Select an existing address or create one by selecting Add new.

  5. Once deployment is complete, go to the Overview page of the newly created firewall and copy its private IP address. This address is used as the next hop address in the routing rule for the virtual network.


Step 3: Create a route table and associate it with your virtual network to route all traffic to the firewall

When you create a virtual network, Azure automatically creates a default route table for each of its subnets and adds system default routes to the table. In this step, you create a user-defined route table that routes all traffic to the firewall, and then associate it with the subnet that you plan to use for the HDInsight on AKS cluster.


  1. From the Azure portal, search for "Route tables" and select the Route tables resource.

  2. Create a route table named "contoso-hdi-route-table".
    Note: The region should be the same as the firewall's region, for example "East US 2" in this case.

  3. Go to the newly created route table and add a route with the following details:

    Destination type: IP Addresses
    Destination IP addresses/CIDR ranges: 0.0.0.0/0 (all traffic)
    Next hop type: Virtual appliance
    Next hop address: The private IP address of the firewall that you copied

  4. Go to Subnets and associate the subnet that you want to use during HDInsight on AKS cluster setup. Here, the "default" subnet is used.
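
To see why a single user-defined route is enough to send outbound traffic through the firewall, the following sketch models how a next hop is chosen: the longest matching prefix wins, and when prefixes tie, a user-defined route is preferred over a system route. It is a simplified illustration, not Azure's implementation. The address range, the firewall IP, and the reduced set of system routes are all assumptions.

```python
import ipaddress

# Hypothetical values: the post gives no address ranges or firewall IP.
VNET_PREFIX = "10.0.0.0/16"
FIREWALL_PRIVATE_IP = "10.0.1.4"

# (prefix, source, next hop). "system" routes stand in for the defaults
# that Azure adds to every subnet; "user" is the route added in this step.
ROUTES = [
    (VNET_PREFIX, "system", "Virtual network"),
    ("0.0.0.0/0", "system", "Internet"),
    ("0.0.0.0/0", "user", f"Virtual appliance {FIREWALL_PRIVATE_IP}"),
]


def effective_next_hop(destination, routes):
    """Pick the next hop for a destination IP.

    The longest matching prefix wins; when prefixes tie, a user-defined
    route is preferred over a system route.
    """
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), source, hop)
        for prefix, source, hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: (r[0].prefixlen, r[1] == "user"))
    return best[2]


print(effective_next_hop("10.0.0.5", ROUTES))      # traffic inside the VNet
print(effective_next_hop("203.0.113.10", ROUTES))  # traffic leaving the VNet
```

In this model, traffic that stays inside the VNet still matches the more specific VNet route, while everything else falls through to the 0.0.0.0/0 route and is handed to the firewall.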


Step 4: Configure Firewall policies

  1. Navigate to the firewall's Overview page and select its firewall policy.

  2. Add network rules (defined here) with the subnet to be used for the HDInsight on AKS cluster as the source address.

  3. Add application rules (defined here) with the same subnet as the source address.

  4. Depending on the cluster type (Spark, Flink, or Trino), add the additional network and application rules defined here.
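
The effect of scoping rules to the cluster subnet can be sketched as follows. This is a deliberately simplified model of allow-list matching with a default deny, not the firewall's actual rule engine. The subnet range, the rule name, and the `*.example.com` target are placeholders; the real rule lists come from the documentation linked in the steps above and are not reproduced here.

```python
import ipaddress
from fnmatch import fnmatchcase

# Hypothetical values; these are placeholders, not the required rules.
CLUSTER_SUBNET = ipaddress.ip_network("10.0.0.0/24")
APPLICATION_RULES = [
    {
        "name": "allow-example-fqdn",
        "source": CLUSTER_SUBNET,
        "target_fqdns": ["*.example.com"],
        "port": 443,
    },
]


def is_allowed(source_ip, fqdn, port, rules):
    """Return the name of the first matching allow rule, else None (deny)."""
    src = ipaddress.ip_address(source_ip)
    for rule in rules:
        # The rule only applies to traffic that originates in its source range.
        if src not in rule["source"] or port != rule["port"]:
            continue
        if any(fnmatchcase(fqdn.lower(), p) for p in rule["target_fqdns"]):
            return rule["name"]
    return None  # nothing matched: the traffic is denied by default


print(is_allowed("10.0.0.7", "login.example.com", 443, APPLICATION_RULES))
print(is_allowed("10.0.9.7", "login.example.com", 443, APPLICATION_RULES))
```

The second call is denied because its source address lies outside the cluster subnet, which is why the rules must use the same subnet that the cluster will be deployed into.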


Step 5: Set up the HDInsight on AKS cluster pool

  1. From the Azure portal, search for "HDInsight on AKS clusters pool" and create a new HDInsight on AKS cluster pool.

  2. Under Security + network settings, choose the virtual network (contoso-hdi-vnet), the subnet (default), and the egress path (Outbound with userDefinedRouting).



  3. Once the cluster pool is created, verify that no public IP has been created: search for the MC_hdi-<clusterpool deployment id> resource group and check its resources.
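
The same check can be scripted against an exported resource listing instead of inspecting the portal by eye. The sketch below assumes the listing is a JSON array of objects with `name` and `type` fields, for example as produced by a command such as `az resource list --resource-group <name> --output json`; that export step is an assumption and not something the post shows. The sample data is made up.

```python
import json

PUBLIC_IP_TYPE = "Microsoft.Network/publicIPAddresses"


def find_public_ips(resources):
    """Return the names of resources whose type is a public IP address."""
    return [
        r["name"]
        for r in resources
        if r.get("type", "").lower() == PUBLIC_IP_TYPE.lower()
    ]


# Made-up sample shaped like a resource listing; not real output.
sample = json.loads("""
[
  {"name": "aks-nodepool-vmss", "type": "Microsoft.Compute/virtualMachineScaleSets"},
  {"name": "kubernetes-internal", "type": "Microsoft.Network/loadBalancers"}
]
""")

print(find_public_ips(sample))
```

An empty list is the expected outcome for a cluster pool created with the user-defined routing egress path.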





Step 6: Add AKS API Server Address to the network rules in the firewall policy


  1. From the Azure portal, search for the cluster pool name (contoso-hdi-udr-pool) and go to the corresponding Kubernetes resource.



  2. From the Overview tab, copy the API server address.

  3. Navigate to contoso-hdi-firewall-policy and enable DNS proxy from the DNS tab.



  4. Go to the Network rules tab and add a new rule for the API server address that you copied.
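
The post does not list the fields of this rule, so the following is only a sketch of what it might contain. It assumes the rule targets the API server by host name over TCP port 443 and uses the cluster subnet as its source. The field names are illustrative rather than an Azure schema, and the host name in the usage line is made up. Enabling DNS proxy in the previous step matters here because the firewall needs it to resolve host names used in network rules.

```python
from urllib.parse import urlsplit


def api_server_host(address):
    """Extract the bare host name from a copied API server address."""
    # Accept both "host" and "https://host:443" style input.
    candidate = address if "//" in address else f"//{address}"
    host = urlsplit(candidate).hostname
    if not host:
        raise ValueError(f"no host name found in {address!r}")
    return host


def api_server_rule(address, source_subnet):
    """Describe a network rule for the API server (field names are illustrative)."""
    return {
        "name": "allow-aks-api-server",
        "source_addresses": [source_subnet],
        "destination_fqdns": [api_server_host(address)],
        "protocols": ["TCP"],
        "destination_ports": ["443"],  # assumption: HTTPS to the API server
    }


# Made-up host name and subnet, for illustration only.
print(api_server_rule("https://example-api.hcp.eastus2.azmk8s.io:443", "10.0.0.0/24"))
```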




Step 7: Assign the Network Contributor role to the AKS cluster that matches the cluster pool, on the network resources used for defining the routing, such as the virtual network, route table, and NSG (if used).


  1. Navigate to your VNet (contoso-hdi-vnet), go to Access control, and click "Add role assignment".



  2. Select the "Network Contributor" role and set the member type to "Managed identity". Under the managed identity option, select Kubernetes services and then select your cluster pool name.



  3. Click "Review + assign" to complete the role assignment.
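
Because the role has to be granted on each network resource involved in the routing, it can be useful to list the scopes up front. The sketch below builds the Azure Resource Manager IDs for the VNet and route table named in this post. The subscription ID and resource group name are placeholders, since the post does not give them.

```python
def network_scope(subscription_id, resource_group, resource_type, name):
    """Build an Azure Resource Manager ID for a Microsoft.Network resource."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/{resource_type}/{name}"
    )


# Placeholders: the post does not give a subscription or resource group.
SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP = "contoso-hdi-rg"

scopes = [
    network_scope(SUBSCRIPTION_ID, RESOURCE_GROUP, "virtualNetworks", "contoso-hdi-vnet"),
    network_scope(SUBSCRIPTION_ID, RESOURCE_GROUP, "routeTables", "contoso-hdi-route-table"),
]

for scope in scopes:
    print(scope)
```

If an NSG is used, its scope follows the same pattern with the `networkSecurityGroups` resource type.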


Step 8: Create HDInsight on AKS cluster

  1. From the Azure portal, search for the "HDInsight on AKS cluster" service, or click + New cluster from the Overview tab of the cluster pool.



  2. Select the cluster pool (contoso-hdi-udr-pool) and the cluster type (Trino), then click "Review + create".


Step 9: Access the cluster from a client such as a virtual machine (VM)


  1. Create a Windows virtual machine and copy the public IP of the VM.



  2. Navigate to the route table (contoso-hdi-route-table) and add a route for the VM's IP address.



  3. Log in remotely to the VM; from there you can access the cluster web URLs.


With private AKS clusters and an outbound UDR setup, enterprise customers can protect their sensitive data from unauthorized access and theft. They can continue to implement a range of security measures to protect their data-driven applications, and use in-place upgrades to apply regular security updates and patches that keep their systems up to date and secure.


With all of this available, enterprise customers can comply with industry-specific regulations on data privacy and security, and reduce the risk of data breaches and other security incidents.

