Introducing the Public Preview of HDInsight on Azure Kubernetes Service (AKS)

This post has been republished via RSS; it originally appeared at: Microsoft Tech Community - Latest Blogs.

We're thrilled to unveil the new version of HDInsight – Azure's flagship open-source analytics service. A year's worth of dedicated effort has culminated in this moment, and we're eager to put it into the hands of our valued customers. Introducing HDInsight on Azure Kubernetes Service (AKS), a groundbreaking offering that represents a complete reimagining of our infrastructure, utilizing the power of Azure Kubernetes Service. This update introduces two popular workloads - Trino and Flink - alongside the highly coveted Spark workload. 


In today's data-rich landscape, the quest to transform data into a competitive edge is driving customer demands. Enterprises are deeply entrenched in leveraging open-source technologies to build their data lakehouses. At Microsoft, our commitment to delivering top-tier open-source analytics on Azure remains unwavering. HDInsight on AKS combines Azure Kubernetes Service's exceptional cloud efficiency with the world's premier open-source workloads for streaming and the federated query engine. We are confident that these enhancements will empower customers to construct secure, contemporary, and dependable data applications, all while continuing to enjoy the developer-friendly experience and versatility that HDInsight is synonymous with.


As HDInsight channels substantial investments into the open-source realm, we've taken a pivotal step by basing this new service entirely on Azure Linux, Microsoft's fully open-source Linux distribution. The general availability of the Azure Linux container host on AKS was announced at Build this past May, releasing a fully supported OS for AKS node pools. Azure Linux is Microsoft's default Linux distribution, running millions of cores across thousands of services.


With Azure Linux, we saw several benefits right out of the gate. First, Azure Linux has some of the fastest cluster operation times on AKS. We've noticed 30% faster cluster upgrade times and 40% faster cluster creation times, on average, for our workloads. The lean nature of Azure Linux, with ~500 packages, translates into minimal disk space consumption (up to 5 GB less) on AKS.


Furthermore, Azure Linux undergoes rigorous testing and validation in close collaboration with the HDInsight and AKS teams. Each package update goes through an exhaustive regimen of unit tests and end-to-end testing on the existing image, effectively preventing regressions. In combination with the smaller package count, this significantly reduces the chances of disruptive updates to our application, resulting in fewer incidents and package vulnerabilities, and driving down our operational costs.


Security is paramount to HDInsight, and Azure Linux has achieved the CIS Level 1 benchmarks, setting it apart as the only Linux distribution on AKS to achieve this feat. Lastly, the AKS extensions, add-ons, and open-source projects we utilize have worked seamlessly on Azure Linux from the start. 


“HDInsight on AKS is a true reflection of Microsoft's commitment to open-source with Azure Linux providing a solid foundation for HDInsight to deliver a reliable, secure and open analytics platform for customers to build their data applications and services.” - Balaji Sankaran, General Manager of Azure HDInsight.


In essence, the future of Azure HDInsight on AKS holds exciting potential for driving innovation, enabling deeper insights, and transforming the way organizations process and leverage data to gain a competitive edge in their respective industries.  
