This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.
In this blog we will go through how to set up Transport Layer Security (TLS) encryption for an HDInsight Apache Kafka cluster (between the Apache Kafka brokers and the Apache ZooKeeper nodes).
Prerequisites
1. Create Apache Kafka Cluster
2. SSH Access to the cluster
This blog covers self-signed certificates; the process remains the same for certificates issued by a Certificate Authority (CA).
TLS Setup for Apache Kafka Brokers
You can generate a TLS certificate for each broker (by domain, IP address, or node name), or use a single Subject Alternative Name (SAN) or wildcard certificate. The choice between individual, wildcard, and SAN certificates depends on your cluster behavior.
If you plan to scale out the HDInsight Apache Kafka cluster, a wildcard or SAN certificate is less hassle: at scale-out time you do not need to generate a new certificate for each additional node.
The broker DNS names in an HDInsight Apache Kafka cluster follow a common pattern within a given cluster, wnX-<cluster name>.<unique id>.bx.internal.cloudapp.net, which means you can generate either a wildcard or a SAN certificate for the cluster without worrying about cluster scale-out.
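Because the worker names differ only in their index, the SAN list can be generated in a loop rather than typed by hand. The following is a minimal sketch: the helper name `build_san`, the cluster name `mykafka`, and the unique id `abc123` are illustrative placeholders, and it assumes worker indices run contiguously from 0, which you should check against the real names on your cluster (for example with `hostname -f` on a worker node).

```shell
#!/bin/sh
# Build a SAN list of the form DNS:host0,DNS:host1,... for worker nodes wn0..wn(N-1).
# build_san, "mykafka", and "abc123" are placeholders, not values from a real cluster.
build_san() {
  cluster="$1"
  unique_id="$2"
  count="$3"
  san=""
  i=0
  while [ "$i" -lt "$count" ]; do
    host="wn${i}-${cluster}.${unique_id}.bx.internal.cloudapp.net"
    if [ -z "$san" ]; then
      san="DNS:${host}"
    else
      san="${san},DNS:${host}"
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$san"
}

# Example: cover three existing workers plus two spare names for a later scale-out.
build_san mykafka abc123 5
```

Listing a few extra names up front is what lets a SAN certificate survive a scale-out without being reissued.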
Before we jump into the certificate generation process, let's understand a few points:
- Certificates can be generated outside the cluster nodes, for example in a separate build pipeline, and then made available to the cluster (head node, brokers, and ZooKeeper nodes), or they can be generated on the head node. Here we assume you are generating a self-signed SAN certificate on the head node.
- Self-signed vs. CA certificates: self-signed certificates are public key certificates that are not issued by a certificate authority (CA). They are easy to make and do not cost money, but they do not provide any trust value. A CA-issued certificate is recommended for production workloads.
Steps to generate a SAN certificate for the Kafka and ZooKeeper nodes:
Here we assume you are using a self-signed certificate and generating it from the cluster head node.
1. SSH to the head node, or work from your local development environment.
2. Create a new directory.
3. Create the ca-cert and ca-key files for the self-signed certificate process.
4. Add the CA's public certificate to the truststore.
5. Create a keystore and populate it with a new private certificate. In the case of autoscale, you can add the DNS names of "x" additional worker nodes to the certificate ahead of time.
6. Create a certificate signing request (CSR).
7. Sign the certificate signing request (CSR) using the CA private key and root CA certificate created in step 3.
8. Add the CA's public certificate to the keystore.
9. Add the signed certificate (from step 7) to the keystore.
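Steps 2 through 9 can be sketched with `openssl` and `keytool` as follows. This is a minimal sketch rather than the exact commands for your cluster: the host names, the `kafka-server` alias, the CA common name, and the `san.ext` file name are placeholders chosen for illustration, and `confidential` is only the example password used in this post. Note that `openssl x509 -req` does not copy extensions from the CSR by default, so the SAN list is passed again through `-extfile`.

```shell
#!/bin/sh
# Sketch of steps 2-9. Run it in a fresh directory; keytool refuses to import
# a certificate under an alias that already exists in the store.
SSL_DIR="$HOME/ssl"            # /home/sshuser/ssl when run as sshuser on the head node
STORE_PASS="confidential"      # example password; replace it
# Hypothetical worker FQDNs; add spare names here if you expect to scale out.
CN="wn0-mykafka.abc123.bx.internal.cloudapp.net"
SAN="DNS:wn0-mykafka.abc123.bx.internal.cloudapp.net,DNS:wn1-mykafka.abc123.bx.internal.cloudapp.net,DNS:wn2-mykafka.abc123.bx.internal.cloudapp.net"

# Step 2: create the working directory.
mkdir -p "$SSL_DIR" && cd "$SSL_DIR" || exit 1

# Step 3: create a self-signed CA (ca-key is written unencrypted because of -nodes).
openssl req -new -newkey rsa:4096 -days 365 -x509 \
  -subj "/CN=Kafka-Security-CA" -keyout ca-key -out ca-cert -nodes

# Step 4: add the CA certificate to the truststore.
keytool -keystore kafka.server.truststore.jks -alias CARoot -importcert \
  -file ca-cert -storepass "$STORE_PASS" -noprompt

# Step 5: create the keystore with a new key pair that carries the SAN list.
keytool -genkeypair -keystore kafka.server.keystore.jks -alias kafka-server \
  -keyalg RSA -keysize 2048 -validity 365 \
  -storepass "$STORE_PASS" -keypass "$STORE_PASS" \
  -dname "CN=$CN" -ext "SAN=$SAN"

# Step 6: create the certificate signing request.
keytool -certreq -keystore kafka.server.keystore.jks -alias kafka-server \
  -file cert-file -storepass "$STORE_PASS" -keypass "$STORE_PASS" -ext "SAN=$SAN"

# Step 7: sign the CSR with the CA, re-supplying the SAN extension.
printf 'subjectAltName=%s\n' "$SAN" > san.ext
openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file -out cert-signed \
  -days 365 -CAcreateserial -extfile san.ext

# Step 8: add the CA certificate to the keystore.
keytool -keystore kafka.server.keystore.jks -alias CARoot -importcert \
  -file ca-cert -storepass "$STORE_PASS" -noprompt

# Step 9: add the signed certificate under the same alias as the key pair.
keytool -keystore kafka.server.keystore.jks -alias kafka-server -importcert \
  -file cert-signed -storepass "$STORE_PASS" -keypass "$STORE_PASS" -noprompt
```

To confirm that the SAN entries survived signing, `keytool -list -v -keystore kafka.server.keystore.jks` prints the extensions of each certificate in the store.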
SCP truststore and keystore to worker and zookeeper nodes
You can use sshpass or any other automation script to copy the truststore and keystore to each Kafka and ZooKeeper node. They must be copied to a consistent location across the nodes, for example `/home/sshuser/ssl`. In the case of autoscale, you can use a persisted script action to copy the truststore and keystore to new nodes.
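A minimal sketch of such a copy script is shown below. The function name `copy_stores` is a placeholder, the host names you pass in are your own worker and ZooKeeper FQDNs, and the sketch assumes SSH access from the head node as `sshuser`; it uses plain `ssh` and `scp`, which will prompt for a password unless keys or sshpass are set up.

```shell
#!/bin/sh
# Copy both stores to the same path on every node given as an argument.
SSL_DIR="/home/sshuser/ssl"
SSH_USER="sshuser"

copy_stores() {
  for host in "$@"; do
    # Create the target directory first, then copy both files into it.
    ssh "${SSH_USER}@${host}" "mkdir -p ${SSL_DIR}" || return 1
    scp "${SSL_DIR}/kafka.server.keystore.jks" "${SSL_DIR}/kafka.server.truststore.jks" \
      "${SSH_USER}@${host}:${SSL_DIR}/" || return 1
  done
}
```

It would be called with the full list of nodes, for example `copy_stores wn0-mykafka.abc123.bx.internal.cloudapp.net zk0-mykafka.abc123.bx.internal.cloudapp.net`, where both names are hypothetical.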
Update Zookeeper configuration to use TLS and restart zookeepers
Modify the ZooKeeper-related configuration using Ambari. To complete the configuration modification, do the following steps:
- Sign in to the Azure portal and select your Azure HDInsight Apache Kafka cluster.
- Go to the Ambari UI by clicking Ambari home under Cluster dashboards.
- Under Zookeeper -> Advanced zookeeper-env, add the following lines before `{% if security_enabled %}`:
export SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Dzookeeper.serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory -Dzookeeper.ssl.keyStore.location=/home/sshuser/ssl/kafka.server.keystore.jks -Dzookeeper.ssl.keyStore.password=confidential -Dzookeeper.ssl.trustStore.location=/home/sshuser/ssl/kafka.server.truststore.jks -Dzookeeper.ssl.trustStore.password=confidential"
export CLIENT_JVMFLAGS="$CLIENT_JVMFLAGS -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty -Dzookeeper.client.secure=true -Dzookeeper.ssl.keyStore.location=/home/sshuser/ssl/kafka.server.keystore.jks -Dzookeeper.ssl.keyStore.password=confidential -Dzookeeper.ssl.trustStore.location=/home/sshuser/ssl/kafka.server.truststore.jks -Dzookeeper.ssl.trustStore.password=confidential"
- Under Custom zoo.cfg, add a new property "secureClientPort" with value "2281".
- Save the changes and select "Restart All Affected".
Update Kafka configuration to use TLS and restart brokers
You have now set up each Kafka broker with a keystore and truststore, and imported the correct certificates. Next, modify related Kafka configuration properties using Ambari and then restart the Kafka brokers. To complete the configuration modification, do the following steps:
- Sign in to the Azure portal and select your Azure HDInsight Apache Kafka cluster.
- Go to the Ambari UI by clicking Ambari home under Cluster dashboards.
- Under Kafka -> Kafka Broker:
- Set the listeners property to PLAINTEXT://localhost:9092,SSL://localhost:9093
- Set zookeeper.connect to use port 2281 for the ZooKeeper nodes
- Under Advanced kafka-broker:
- Set the security.inter.broker.protocol property to SSL
- Set ssl.keystore.location and ssl.truststore.location to the complete paths of your keystore and truststore.
- Set ssl.keystore.password and ssl.truststore.password to the passwords set for the keystore and truststore. In this example, confidential.
- Set ssl.key.password to the key password set for the keystore. In this example, confidential.
- Under Advanced kafka-env -> kafka-env template, add the following line at the end:
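The line itself is not reproduced in this copy of the post. As a hedged sketch only, and not the original line, one plausible form passes the same ZooKeeper TLS client properties used in CLIENT_JVMFLAGS above to the broker JVM through `KAFKA_OPTS`, the environment variable that the Kafka start scripts read. The paths and the `confidential` password are the example values from this post.

```shell
# Assumption: hand the ZooKeeper TLS client settings to the broker JVM via KAFKA_OPTS.
export KAFKA_OPTS="$KAFKA_OPTS -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty -Dzookeeper.client.secure=true -Dzookeeper.ssl.keyStore.location=/home/sshuser/ssl/kafka.server.keystore.jks -Dzookeeper.ssl.keyStore.password=confidential -Dzookeeper.ssl.trustStore.location=/home/sshuser/ssl/kafka.server.truststore.jks -Dzookeeper.ssl.trustStore.password=confidential"
```

Without settings of this kind the broker would still try to speak plaintext to port 2281 and fail to connect, so check the broker log after the restart.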
Validate TLS Setup
Zookeeper Connection
1. SSH to a ZooKeeper node from the head node.
2. Connect to ZooKeeper using zookeeper-cli:
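The connection in step 2 can be sketched as below. The helper name `zk_tls_connect` is a placeholder, and the `zkCli.sh` path assumes the usual HDP directory layout, so adjust it if your cluster image differs. The TLS flags repeat the CLIENT_JVMFLAGS values configured in Ambari so the example also works in a shell where they are not already exported.

```shell
#!/bin/sh
# Open a ZooKeeper CLI session against the TLS port (secureClientPort 2281).
ZKCLI="/usr/hdp/current/zookeeper-client/bin/zkCli.sh"   # assumed install path
SSL_DIR="/home/sshuser/ssl"
STORE_PASS="confidential"                                 # example password

zk_tls_connect() {
  zk_host="$1"
  flags="-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty"
  flags="$flags -Dzookeeper.client.secure=true"
  flags="$flags -Dzookeeper.ssl.keyStore.location=$SSL_DIR/kafka.server.keystore.jks"
  flags="$flags -Dzookeeper.ssl.keyStore.password=$STORE_PASS"
  flags="$flags -Dzookeeper.ssl.trustStore.location=$SSL_DIR/kafka.server.truststore.jks"
  flags="$flags -Dzookeeper.ssl.trustStore.password=$STORE_PASS"
  # zkCli.sh passes CLIENT_JVMFLAGS on to the client JVM.
  CLIENT_JVMFLAGS="$flags" "$ZKCLI" -server "${zk_host}:2281"
}
```

Call it with the ZooKeeper node's host name, for example `zk_tls_connect localhost` when already logged in to that node. Once the session opens, a command such as `ls /` confirms that requests are served over the TLS port.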
Test Kafka Connection
1. Copy the ca-cert to the client machine (for example, the passive head node hn1).
2. Import the CA certificate to the truststore.
3. Create the file client-ssl-auth.properties on the client machine. It should have the following lines:
4. Create Kafka Topic
5. Run Kafka producer
6. Run Kafka consumer
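Steps 2 through 6 can be sketched as follows, assuming one-way TLS in which the client only verifies the broker. The broker FQDN, topic name, truststore file name, and Kafka script path are placeholders or assumptions (the path follows the usual HDP layout), and flag names vary with the Kafka version: `--bootstrap-server` on `kafka-topics.sh` needs Kafka 2.2 or later, and newer releases prefer `--bootstrap-server` over `--broker-list` for the producer. The producer and consumer are interactive, so in practice you would run these commands one at a time.

```shell
#!/bin/sh
# Sketch of client steps 2-6; ca-cert must already be in CLIENT_DIR (step 1).
CLIENT_DIR="$HOME/ssl"
STORE_PASS="confidential"                                   # example password
BROKER="wn0-mykafka.abc123.bx.internal.cloudapp.net:9093"   # hypothetical broker, TLS port
KAFKA_BIN="/usr/hdp/current/kafka-broker/bin"               # assumed install path
TOPIC="tls-test"                                            # placeholder topic name

mkdir -p "$CLIENT_DIR" && cd "$CLIENT_DIR" || exit 1

# Step 2: import the CA certificate into a client truststore.
keytool -keystore kafka.client.truststore.jks -alias CARoot -importcert \
  -file ca-cert -storepass "$STORE_PASS" -noprompt

# Step 3: write the client properties file.
cat > client-ssl-auth.properties <<EOF
security.protocol=SSL
ssl.truststore.location=$CLIENT_DIR/kafka.client.truststore.jks
ssl.truststore.password=$STORE_PASS
EOF

# Step 4: create a topic over the TLS listener.
"$KAFKA_BIN/kafka-topics.sh" --create --bootstrap-server "$BROKER" \
  --command-config client-ssl-auth.properties \
  --topic "$TOPIC" --partitions 2 --replication-factor 2

# Step 5: start the console producer (type messages, press Ctrl+C to stop).
"$KAFKA_BIN/kafka-console-producer.sh" --broker-list "$BROKER" --topic "$TOPIC" \
  --producer.config client-ssl-auth.properties

# Step 6: start the console consumer and read the topic from the beginning.
"$KAFKA_BIN/kafka-console-consumer.sh" --bootstrap-server "$BROKER" --topic "$TOPIC" \
  --consumer.config client-ssl-auth.properties --from-beginning
```

If the consumer prints the messages typed into the producer, traffic is flowing through the SSL listener on port 9093.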
References
1. Self-signed certificate - Wikipedia
2. Apache Kafka TLS encryption & authentication - Azure HDInsight | Microsoft Learn