Azure Host OS – Cloud Host

This post has been republished via RSS; it originally appeared at: Microsoft Tech Community - Latest Blogs.

One Windows

 

Windows is a versatile and flexible operating system, running on a variety of machine architectures and available in multiple SKUs. It currently supports x86, x64, and ARM architectures. It even used to support Itanium, PowerPC, Alpha, and MIPS (wiki entry). Windows also runs in a multitude of environments, from data centers, laptops, and phones to embedded devices such as ATMs.

 

Even with all of this support, the core of Windows remains virtually unchanged across all these architectures and SKUs. Windows dynamically scales up, depending on the architecture and the processor it runs on, to exploit the full power of the hardware. The same applies to Microsoft Azure. So, if you have ever wondered how Windows runs Azure nodes in the data center, read on!

 

As Satya says, “we are building Azure as the world’s computer,” and powering the world’s computer shows the ability of Windows to scale up and scale out. To demonstrate this scale, here is a snapshot of taskmgr running directly on the Azure host in an M-series machine (one of the largest VMs available in Azure, showing 896 logical processors) in the data center.

 

Hari_Pulapaka_0-1672935896309.png

M-series taskmgr

 

In this post, we will talk about the internals of the Azure Host OS which powers the Azure hosts in the data center.

 

Cloud Host – Azure Host Operating System

Azure is, of course, Microsoft’s cloud computing service, which provides IaaS (infrastructure as a service) virtual machines (VMs), PaaS (platform as a service) containers, and many other SaaS services (e.g., Azure Storage, Networking, etc.). For the IaaS and PaaS services, all customer code eventually ends up running in a virtual machine. Hence, at the core platform layer, the main purpose of the Azure Host operating system is to manage virtual machines, and to manage them really well! Managing VMs includes launching them, shutting them down, live migrating them, updating them, and so on.

 

Since Azure uses Windows as the operating system, all these VMs run as guests of Microsoft Hyper-V, which is our hypervisor. Microsoft Hyper-V is a type 1 hypervisor, and hence when I say Azure Host operating system, it’s technically the root operating system. This is the OS that has full control of the hardware and provides virtualization facilities to run guest VMs.

 

Keep in mind that the hypervisor we use is the same hypervisor that we use on Windows Client and Windows Server across all our millions of customer machines. We will have upcoming blog posts explaining some of the key features of Microsoft Hyper-V that allow Azure to securely and reliably manage guest VMs.

 

Cloud Host 

As I mentioned, the goal of the Azure Host OS is to be very good at managing the lifecycle of VMs. This means that Windows (aka Azure Host OS) doesn’t need much of the functionality typically associated with Windows to do this job. Hence, we created a specially crafted console-only (no GUI; some also call it headless) edition of Windows called Cloud Host.

 

This is a OneCore-based edition of Windows. OneCore is the base layer upon which all the families of Windows SKUs (or editions) build their functionality. It is a set of components (executables, DLLs, etc.) that are needed by all editions of Windows (PCs, Windows Server, Xbox, or IoT). For a programming analogy, it is the base class from which all the Windows classes inherit (e.g., Object). If you look inside OneCore to see what functionality it provides, you can see API sets that provide core functionality such as Kernel, Hypervisor, File system support, Networking, Security, Win32 APIs, etc. OneCoreUAP, called out in the picture below, is another example: a slightly higher layer that is used to build client PC editions. It includes the UWP programming surface, the GUI stack, and higher-level components such as the media stack and WiFi.

Hari_Pulapaka_0-1672937725585.png

Overview of some representative components available in OneCore

 

How did we build Cloud Host?

There is a minimal amount of code that needs to run on the Azure host to integrate with the control plane as well as monitor and manage containers/VMs. Based on an analysis of the dependency set of this code, we identified the set of functionalities (DLLs and API sets) that Azure needs on top of OneCore. This handful of binaries (tens of them) was then added on top of OneCore to produce the OS for the Azure Host.
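
One way to picture this kind of dependency analysis is as a graph walk: start from the host code, follow its imports transitively, and keep whatever OneCore does not already provide. The Python sketch below is only an illustration under assumptions. Every binary name and the shape of the import map are hypothetical, and it is not the tooling the team actually used.

```python
from collections import deque

def transitive_deps(roots, imports):
    """Return every binary reachable from `roots` through the `imports` map."""
    seen = set()
    queue = deque(roots)
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        queue.extend(imports.get(name, ()))
    return seen

def extra_binaries(roots, imports, onecore):
    """Binaries the host code needs that the base layer does not provide."""
    needed = transitive_deps(roots, imports)
    return sorted(needed - set(onecore) - set(roots))

# Hypothetical data: none of these names are real Windows binaries.
IMPORTS = {
    "host_agent.exe": ["core_a.dll", "vm_mgmt.dll"],
    "vm_mgmt.dll": ["core_a.dll", "extra_net.dll"],
    "extra_net.dll": ["core_a.dll"],
}
ONECORE = {"core_a.dll", "core_b.dll"}

EXTRA = extra_binaries(["host_agent.exe"], IMPORTS, ONECORE)
print(EXTRA)
```

In this toy example the result is the short list of binaries that would have to be layered on top of the base, which mirrors the "handful of binaries" idea described above.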

 

To add these DLLs, we created a brand-new SKU called Cloud Host and added all these binaries to it. You can think of Cloud Host as a “child class” of OneCore. Note that we had to create a new SKU, “Cloud Host,” because we needed to add new binaries on top of OneCore. We could have just added them to OneCore directly, but it’s cleaner to create a purpose-built SKU/edition while keeping OneCore unmodified. In other words, Cloud Host is a special-purpose SKU designed and built to run the Azure Host nodes in the data center. You may be more familiar with other Windows SKUs, often referred to as editions, such as Pro, Enterprise, etc. [wiki]. Cloud Host is a similar SKU/edition, one that is used only for Azure nodes in the data center.
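
To make the base class / child class analogy concrete, here is a minimal Python sketch. It is purely illustrative: the base component names are taken from the OneCore description in this post, while "host-agent-support" is a hypothetical placeholder for the extra binaries, not a real component.

```python
class OneCore:
    """Base layer: components shared by every edition of Windows."""

    def components(self):
        return {"kernel", "hypervisor", "file system", "networking", "security"}


class CloudHost(OneCore):
    """Purpose-built edition: everything in OneCore plus a few extra binaries."""

    def components(self):
        # OneCore itself stays unmodified; the additions live in the child.
        # "host-agent-support" is a made-up name used only for illustration.
        return super().components() | {"host-agent-support"}


added = CloudHost().components() - OneCore().components()
print(sorted(added))
```

The design point is the one made above: the additions sit in the derived edition, so the base layer does not change.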

 

With that explanation, let’s take a look at Cloud Host. Here is a picture of the Cloud Host WIM file (a WIM file is like a zip file that stores the Windows image to boot from). You can see its size is 280 MB, which is more than 10 times smaller than a typical PC WIM file.

 

Hari_Pulapaka_0-1672937889367.png

 

That is significantly smaller than any Windows you use on your PC; a typical client Enterprise WIM file would be 3.6 GB in size.

 

Hari_Pulapaka_1-1672937889369.png
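
As a quick check of the "more than 10 times smaller" figure, the two sizes quoted above can be compared directly. This small sketch assumes decimal units (1 GB = 1000 MB), which the post does not specify; using 1024 instead changes the ratio only slightly.

```python
CLOUD_HOST_WIM_MB = 280   # size quoted for the Cloud Host WIM
CLIENT_WIM_GB = 3.6       # size quoted for a typical client Enterprise WIM
MB_PER_GB = 1000          # assumption: decimal units

ratio = CLIENT_WIM_GB * MB_PER_GB / CLOUD_HOST_WIM_MB
print(f"Client WIM is about {ratio:.1f}x the size of the Cloud Host WIM")
```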

 

Cloud Host boots into a console shell, and the experience is typically similar to Windows Server Core. Here is a picture of a Cloud Host session from one of our test machines.

 

(Keep in mind that we do NOT typically log on to Azure Host nodes; this is purely for demo purposes.)

 

Hari_Pulapaka_0-1672938022848.png

Cloud Host with cmd shell, taskmgr and Regedit

 

Another thing you may have noticed is that taskmgr and even regedit do not look the same as they do on Windows 11. This is because, as I mentioned, Cloud Host is built on OneCore and is headless (or console based); hence, it doesn’t contain any of the GUI pieces of Windows. We have special versions of taskmgr and regedit that don’t link with the modern GUI functionality available in Windows 11, which gives them the “old style” look.

 

API: What kind of code can run on Azure Host nodes?

We can run C++, Python, and even Rust code on Azure Host. The main thing to keep in mind is that if you are a developer building code to run on Azure Host (which means only our internal developers), you can only link against the OneCore SDK (onecore.lib). We have documented the API surface available to OneCore here, along with info on building against OneCore here.

 

Hari_Pulapaka_0-1672938173121.png

 

With that look into the internals of Azure Cloud Host, future blog posts will continue into the code and design internals of updating the Azure Host (e.g., Tardigrade, VM PHU, Hypervisor Hot Restart, and Live Migration), kernel/virtualization features, security and many more areas in the operating system platform.

 

Cheers,

 

Hari (on behalf of the entire Core OS Team)
