Getting started with Private Clusters on HDInsight on AKS for securing your analytics workloads

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

HDInsight on AKS is a managed Platform as a Service (PaaS) that runs on Azure Kubernetes Service (AKS). HDInsight on AKS allows you to deploy popular open-source analytics workloads like Apache Spark™, Apache Flink®, and Trino without the overhead of managing and monitoring containers.


By default, HDInsight on AKS clusters allow outbound network connections from the cluster to any destination that is reachable from the node's network interface. This means that cluster resources can access any public or private IP address, domain name, or URL on the internet or on your virtual network.

However, in some scenarios, you may want to control or restrict the egress traffic from your cluster for security or compliance reasons. For example, you may want to:

 

  1. Prevent clusters from accessing malicious or unwanted services.
  2. Enforce network policies or firewall rules on the outbound traffic.
  3. Monitor or audit the egress traffic from the cluster for troubleshooting or compliance purposes.

 

There are different methods for managing the traffic flow. You can learn more about it here.


In this blog, we discuss how to control or restrict the egress traffic from your HDInsight on AKS cluster using User Defined Routing (UDR) in your virtual network.

 

With this setup, no public IP address is created when you spin up an HDInsight on AKS cluster.

Note: A UDR setup requires you to set up firewall rules and define the routing, using a custom VNet and subnet, before you create an HDInsight on AKS cluster.

Let's get started. 

Step 1: Set up the virtual network (VNet). Required only if you don't have an existing VNet

 

  1. From the Azure portal, search for virtual networks and select the option to create a new one.
  2. Create a VNet named "contoso-hdi-vnet".

     

Step 2: Set up the firewall. Deploy the firewall in your virtual network (contoso-hdi-vnet).
To deploy a firewall into the virtual network, you need a subnet called AzureFirewallSubnet.

  1. Navigate to your VNet (contoso-hdi-vnet) and go to Subnets.
  2. Add a subnet with the subnet purpose set to "Azure Firewall".
  3. Go to the Firewall tab and select the option to add a new firewall.
  4. Create a firewall named "contoso-hdi-firewall" with the following details:

    Setting              Value
    Resource group       Same resource group as the virtual network (contoso-hdi-vnet).
    Name                 "contoso-hdi-firewall" or a name of your choice.
    Region               Same region as the virtual network.
    Firewall policy      Create one by selecting Add new.
    Virtual network      Select the virtual network (contoso-hdi-vnet).
    Public IP address    Select an existing address or create one by selecting Add new.
  5. Once the deployment is complete, go to the Overview page of the newly created firewall and copy its private IP address. This private IP address is used as the next hop address in the routing rule for the virtual network.
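
The subnet layout from Steps 1 and 2 can be sanity-checked before anything is deployed. The following is a minimal sketch using Python's standard `ipaddress` module. The CIDR ranges are hypothetical (the post does not state the actual address ranges), and the /26 minimum size for AzureFirewallSubnet reflects the sizing guidance in the Azure Firewall documentation rather than anything stated in this walkthrough.

```python
import ipaddress

# Hypothetical address plan: replace these ranges with your own.
VNET = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "default": ipaddress.ip_network("10.0.0.0/24"),
    "AzureFirewallSubnet": ipaddress.ip_network("10.0.1.0/26"),
}


def check_plan(vnet, subnets):
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems = []
    # Every subnet must sit inside the VNet address space.
    for name, net in subnets.items():
        if not net.subnet_of(vnet):
            problems.append(f"{name} ({net}) is outside the VNet range {vnet}")
    # Subnets must not overlap each other.
    names = list(subnets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")
    # The firewall needs its own subnet with this exact name.
    firewall_subnet = subnets.get("AzureFirewallSubnet")
    if firewall_subnet is None:
        problems.append("AzureFirewallSubnet is missing")
    elif firewall_subnet.prefixlen > 26:
        problems.append("AzureFirewallSubnet should be /26 or larger")
    return problems


print(check_plan(VNET, SUBNETS) or "address plan looks consistent")
```

This only checks the arithmetic of the address plan; it does not talk to Azure.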


Step 3: Create a Route table and associate it with your virtual network to route all traffic to the firewall

When you create a virtual network, Azure automatically creates a default route table for each of its subnets and adds system default routes to the table. In this step, you create a user-defined route table that routes all traffic to the firewall, and then associate it with the subnet in contoso-hdi-vnet that you will use for the HDInsight on AKS cluster.

 

  1. From the Azure portal, search for "Route tables" and select the Route tables resource.

  2. Create a route table named "contoso-hdi-route-table".
    Note: The region must be the same as the firewall's region, for example "East US 2" in this case.

  3. Go to the newly created route table and add a route with the following details:

    Setting                                 Value
    Destination type                        IP Addresses
    Destination IP addresses/CIDR ranges    0.0.0.0/0
    Next hop type                           Virtual appliance
    Next hop address                        The private IP address of the firewall that you copied in Step 2

  4. Go to Subnets and associate the route table with the subnet you want to use for the HDInsight on AKS cluster. Here, the "default" subnet is used.
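
To see why a single 0.0.0.0/0 route is enough to steer internet-bound traffic through the firewall, it helps to model how a route is chosen. Azure selects the route with the longest matching prefix and, when prefixes tie, prefers a user-defined route over a system route. The sketch below models only that rule. The firewall IP and the VNet range are hypothetical placeholders, and the tuple layout is just an illustration, not an Azure data format.

```python
import ipaddress

# Hypothetical values: use the private IP you copied from the firewall Overview page.
FIREWALL_PRIVATE_IP = "10.0.1.4"

ROUTES = [
    # (address prefix, next hop type, next hop address, source)
    ("10.0.0.0/16", "VnetLocal", None, "system"),
    ("0.0.0.0/0", "Internet", None, "system"),
    ("0.0.0.0/0", "VirtualAppliance", FIREWALL_PRIVATE_IP, "user"),
]


def effective_route(routes, destination):
    """Longest prefix match; on equal prefixes a user-defined route wins."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r[0])]
    return max(
        candidates,
        key=lambda r: (ipaddress.ip_network(r[0]).prefixlen, r[3] == "user"),
    )


for target in ("10.0.0.5", "52.1.2.3"):
    prefix, hop_type, hop_address, source = effective_route(ROUTES, target)
    print(target, "->", hop_type, hop_address)
```

Traffic that stays inside the VNet still uses the more specific system route, while everything else falls through to the user-defined default route and reaches the firewall.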


Step 4: Configure Firewall policies

  1. Navigate to the firewall's Overview page and select its firewall policy.

  2. Add network rules (defined here), using the subnet that will host the HDInsight on AKS cluster as the source address.

  3. Add application rules (defined here), using the same subnet as the source address.
  4. Depending on the cluster type (Spark, Flink, or Trino), add the additional network and application rules defined here.
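
As a rough mental model of what the network rules in this step do: outbound traffic is allowed only when some rule matches its source, destination, protocol, and port, and everything else is denied. The sketch below illustrates that allow-list behavior. It is a simplification, not how Azure Firewall is implemented, and every rule value in it is hypothetical; use the rules from the product documentation for a real deployment.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkRule:
    name: str
    source: str       # CIDR of the cluster subnet
    destination: str  # CIDR of the allowed destination
    protocol: str     # "TCP" or "UDP"
    port: int


# Hypothetical subnet and rules, for illustration only.
CLUSTER_SUBNET = "10.0.0.0/24"
RULES = [
    NetworkRule("allow-example-udp", CLUSTER_SUBNET, "198.51.100.0/24", "UDP", 123),
    NetworkRule("allow-example-tcp", CLUSTER_SUBNET, "203.0.113.0/24", "TCP", 443),
]


def matching_rule(rules, src_ip, dst_ip, protocol, port):
    """Return the name of the first rule that allows the flow, or None (deny)."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if (
            src in ipaddress.ip_network(rule.source)
            and dst in ipaddress.ip_network(rule.destination)
            and protocol == rule.protocol
            and port == rule.port
        ):
            return rule.name
    return None


print(matching_rule(RULES, "10.0.0.10", "203.0.113.7", "TCP", 443))
print(matching_rule(RULES, "10.0.0.10", "192.0.2.1", "TCP", 443))
```

The second call returns `None` because no rule covers that destination, which is the behavior you rely on when restricting egress.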

     

Step 5: Set up the HDInsight on AKS cluster pool

  1. From the Azure portal, search for "HDInsight on AKS cluster pools" and create a new HDInsight on AKS cluster pool.

  2. Under Security + network settings, choose the virtual network (contoso-hdi-vnet), the subnet (default), and the egress path (Outbound with userDefinedRouting).
  3. Once the cluster pool is created, verify that no public IP address was created. To do so, search for the MC_hdi-<clusterpool deployment id> resource group and review the resources it contains.
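
The verification in the last step amounts to confirming that the managed resource group contains no resource of type `Microsoft.Network/publicIPAddresses`. A minimal sketch of that check is below. The resource list is hypothetical sample data standing in for whatever listing you export from the portal or a tool of your choice; only the resource type string is an actual Azure identifier.

```python
PUBLIC_IP_TYPE = "Microsoft.Network/publicIPAddresses"

# Hypothetical listing of the MC_hdi-... resource group (names are made up).
resources = [
    {"name": "example-nodepool-vmss", "type": "Microsoft.Compute/virtualMachineScaleSets"},
    {"name": "example-nodepool-nsg", "type": "Microsoft.Network/networkSecurityGroups"},
    {"name": "example-internal-lb", "type": "Microsoft.Network/loadBalancers"},
]


def public_ips(resource_list):
    """Return the names of all public IP address resources in the listing."""
    wanted = PUBLIC_IP_TYPE.lower()
    return [r["name"] for r in resource_list if r["type"].lower() == wanted]


found = public_ips(resources)
print("public IPs found:", found if found else "none")
```

An empty result is what you expect with the userDefinedRouting egress path.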


     

Step 6: Add the AKS API server address to the network rules in the firewall policy

  1. From the Azure portal, search for the cluster pool name (contoso-hdi-udr-pool) and go to the corresponding Kubernetes resource.

  2. From the Overview tab, copy the API server address.

  3. Navigate to contoso-hdi-firewall-policy and enable DNS proxy from the DNS tab.

     

  4. Go to the Network rules tab and add a new rule that uses the API server address you copied as the destination.
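
The DNS proxy step matters because the API server address is a domain name, and Azure Firewall needs DNS proxy enabled to use FQDNs as destinations in network rules. The sketch below captures that dependency as a simple consistency check over a simplified policy description. The dictionary layout and key names are invented for illustration and are not an Azure schema; the source range is hypothetical, `<api-server-address>` is a placeholder for the value you copied, and TCP port 443 is an assumption based on the API server being an HTTPS endpoint.

```python
def validate_policy(policy):
    """Return problems found in a simplified firewall policy description."""
    problems = []
    uses_fqdn = any(rule.get("destination_fqdns") for rule in policy["network_rules"])
    if uses_fqdn and not policy.get("dns_proxy_enabled", False):
        problems.append("network rules with FQDN destinations need DNS proxy enabled")
    return problems


# Illustrative structure only; keys and values are assumptions, not an Azure schema.
policy = {
    "dns_proxy_enabled": True,
    "network_rules": [
        {
            "name": "allow-aks-api-server",
            "source_addresses": ["10.0.0.0/24"],          # hypothetical cluster subnet
            "destination_fqdns": ["<api-server-address>"],  # value copied from the Overview tab
            "protocols": ["TCP"],
            "destination_ports": ["443"],                 # assumed HTTPS port
        }
    ],
}

print(validate_policy(policy) or "policy is consistent")
```

If the rule is added before DNS proxy is turned on, the check reports the missing prerequisite.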


     

 

Step 7: Assign the Network Contributor role to the AKS cluster that matches the cluster pool, on the network resources that are used for defining the routing, such as the virtual network, route table, and NSG (if used).

  1. Navigate to your VNet (contoso-hdi-vnet), go to Access control, and click "Add role assignment".

  2. Select the "Network Contributor" role and set the member type to "Managed identity". In the Managed identity option, select Kubernetes services and then select your cluster pool name.

     

  3. Click "Review + assign" to complete the role assignment. Repeat these steps for the route table and, if you use one, the NSG.
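
Each role assignment in this step is scoped to one network resource, and that scope is identified by the resource's Azure Resource Manager ID. If you want to keep track of the scopes you need to cover, the sketch below builds them from their parts. The subscription ID is a placeholder and the resource group name is hypothetical; the VNet and route table names are the ones used in this walkthrough.

```python
def network_resource_id(subscription_id, resource_group, resource_type, name):
    """Build the Azure Resource Manager ID of a Microsoft.Network resource."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/{resource_type}/{name}"
    )


SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder
RESOURCE_GROUP = "contoso-hdi-rg"                          # hypothetical name

# Scopes that need the Network Contributor assignment (add the NSG if you use one).
scopes = [
    network_resource_id(SUBSCRIPTION_ID, RESOURCE_GROUP, "virtualNetworks", "contoso-hdi-vnet"),
    network_resource_id(SUBSCRIPTION_ID, RESOURCE_GROUP, "routeTables", "contoso-hdi-route-table"),
]

for scope in scopes:
    print(scope)
```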

 

Step 8: Create HDInsight on AKS cluster

  1. From the Azure portal, search for the "HDInsight on AKS cluster" service, or click + New cluster from the Overview tab of the cluster pool.

  2. Select the cluster pool (contoso-hdi-udr-pool) and the cluster type (Trino), and then click "Review + create".



Step 9: Access the cluster via a client such as a virtual machine (VM)

  1. Create a Windows virtual machine and copy the public IP address of the VM.

  2. Navigate to the route table (contoso-hdi-route-table) and add a route for the VM's IP address.

  3. Log in to the VM remotely. From there, you can access the cluster web URLs.

 

With private AKS clusters and an outbound UDR setup, enterprise customers can protect their sensitive data from unauthorized access and theft. They can continue to implement a range of security measures to protect their data-driven applications, and they can use in-place upgrades to apply regular security updates and patches that keep their systems up to date and secure.

 

With all of this available, enterprise customers can now comply with industry-specific regulations related to data privacy, security, and compliance and reduce the risk of data breaches and other security incidents.

 
