04 Azure Machine Learning and Custom Models

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.



Azure Machine Learning (AML) is a powerful platform that caters to the needs of customers who already have their own models or are building on popular open-source frameworks such as scikit-learn, PyTorch, and XGBoost. In such cases, customers are primarily focused on efficiently managing and tracking their models, along with centralized compute capabilities to support their data science scenarios. AML offers a comprehensive suite of features to meet these requirements, including compute resources, MLflow integration for model tracking, a Model Registry for organized management, a Feature Store for sharing feature engineering artifacts, and the ability to deploy custom models to online or batch endpoints. With AML, customers can streamline their model management processes and unleash the full potential of their machine learning workflows.


Azure Machine Learning Compute Instance

Azure Machine Learning offers a fully configured and managed development environment through compute instances. These instances not only serve as your dedicated development and testing environments but can also be utilized as training compute targets. With the ability to run multiple jobs in parallel and a convenient job queue, compute instances provide the flexibility and power you need to streamline your machine learning workflows.

It's important to note that compute instances are designed for individual use and cannot be shared with other users in your workspace. Creating a compute instance is a one-time process for your workspace: once created, you can reuse it as a development workstation or as a compute target for training. Furthermore, you have the option to attach multiple compute instances to your workspace, providing even more flexibility in your development setup.

When it comes to resource allocation, the dedicated cores per region per VM family quota and the total regional quota apply to compute instance creation. It's worth mentioning that stopping a compute instance does not release its quota; this ensures you can easily restart it when needed. Additionally, please note that it's not possible to change the virtual machine size of a compute instance once it has been created.

Compute instances offer a convenient and powerful development environment in the cloud. Not only can you leverage compute instances for development, testing, and training, but you also have access to a range of popular tools to suit your preferences. From the compute instance, you can seamlessly access JupyterLab, Jupyter Notebook, and even VS Code. Whether you prefer the interactive and collaborative nature of Jupyter environments or the robust features of VS Code, Azure Machine Learning compute instances empower you to work efficiently and optimize your productivity.


Manage a compute instance - Azure Machine Learning | Microsoft Learn

Run Jupyter notebooks in your workspace - Azure Machine Learning | Microsoft Learn
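As a sketch of how a compute instance can be created declaratively with the Azure ML CLI v2 (the name and VM size below are illustrative placeholders, not recommendations):

```yaml
# compute-instance.yml — create with: az ml compute create -f compute-instance.yml
# Note: the VM size cannot be changed after creation, so choose it carefully.
$schema: https://azuremlschemas.azureedge.net/latest/computeInstance.schema.json
name: my-dev-ci          # placeholder instance name
type: computeinstance
size: Standard_DS3_v2    # placeholder VM size
```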


Azure Machine Learning Compute Cluster

In addition to compute instances, Azure Machine Learning offers compute clusters, which provide a managed-compute infrastructure for creating single or multi-node compute environments. Unlike compute instances, compute clusters can be shared with other users in your workspace, promoting collaboration and resource optimization. The compute cluster dynamically scales up as jobs are submitted, ensuring efficient resource allocation. Moreover, compute clusters can be deployed within an Azure Virtual Network, offering enhanced security and control over your workloads. With the option for no public IP deployment and support for containerized environments, compute clusters simplify the execution of jobs while packaging your model dependencies within a Docker container. By leveraging Azure Machine Learning compute clusters, you can effortlessly scale your workloads, securely execute jobs, and streamline your machine learning workflows.


Create compute clusters - Azure Machine Learning | Microsoft Learn
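A minimal compute cluster definition for the CLI v2 might look like the following (names and sizes are placeholders). Setting `min_instances: 0` lets the cluster scale down to zero nodes when idle, which is how the dynamic scaling described above keeps costs in check:

```yaml
# cpu-cluster.yml — create with: az ml compute create -f cpu-cluster.yml
$schema: https://azuremlschemas.azureedge.net/latest/amlCompute.schema.json
name: cpu-cluster                 # placeholder cluster name
type: amlcompute
size: Standard_DS3_v2             # placeholder VM size
min_instances: 0                  # scale to zero when no jobs are queued
max_instances: 4                  # cap for scale-out as jobs are submitted
idle_time_before_scale_down: 120  # seconds before idle nodes are released
tier: dedicated                   # or low_priority for spot pricing
```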


Azure Machine Learning Serverless Compute

Azure Machine Learning introduces a new compute target type, serverless compute, which revolutionizes the training process. With serverless compute, machine learning professionals can effortlessly submit their training jobs and let Azure Machine Learning handle the rest. As a fully managed and on-demand compute option, serverless compute takes care of creating, scaling, and managing the compute infrastructure, freeing you from the complexities of compute setup.

By leveraging serverless compute, machine learning professionals can focus on their core expertise of building machine learning models, without the need to worry about compute infrastructure. You can easily specify the resources required for each job, while Azure Machine Learning takes care of managing the compute infrastructure and provides managed network isolation, reducing your workload.

Enterprises can optimize costs by specifying the optimal resources for each job, while IT admins can maintain control by setting cores quota at the subscription and workspace level, and applying Azure policies.

Serverless compute is not limited to model training. It can be used for various tasks, including fine-tuning models in the model catalog, running jobs from Azure Machine Learning studio, SDK, and CLI, building environment images, and enabling responsible AI dashboard scenarios. Serverless jobs consume the same quota as Azure Machine Learning compute, providing flexibility in resource allocation. You can choose between standard (dedicated) or spot (low priority) VMs, and serverless jobs support both managed identity and user identity. The billing model for serverless compute aligns with Azure Machine Learning compute.


Model training on serverless compute - Azure Machine Learning | Microsoft Learn
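In CLI v2 terms, targeting serverless compute amounts to submitting a job without a `compute` field and instead specifying the resources the job needs. A hedged sketch (script path, environment, and sizes are assumptions for illustration):

```yaml
# serverless-job.yml — submit with: az ml job create -f serverless-job.yml
# No `compute:` field: Azure Machine Learning provisions and manages the
# compute on demand based on the `resources` section below.
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: python train.py          # assumed entry script in ./src
code: ./src
environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest
resources:
  instance_type: Standard_DS3_v2  # placeholder; pick per-job resources
  instance_count: 1
```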


Azure Machine Learning Custom Model Training

Azure Machine Learning empowers you to train custom models with ease, giving you the flexibility to choose between running them on your personal compute or utilizing an Azure Machine Learning compute instance or compute cluster. The process remains the same as if you were working in an isolated environment, but with Azure Machine Learning you have the added advantage of tracking with MLflow, and of running hyperparameter tuning, training, and testing on scalable AML compute.

To guide you through the process of training scikit-learn, TensorFlow, Keras, and PyTorch models, we have curated a collection of helpful resources. These links provide step-by-step instructions and best practices to ensure your custom model training journey is smooth and successful. Whether you are a seasoned data scientist or a machine learning enthusiast, Azure Machine Learning simplifies the process, allowing you to focus on refining your models and achieving optimal performance.


Train scikit-learn machine learning models (v2) - Azure Machine Learning | Microsoft Learn

Train and deploy a TensorFlow model (SDK v2) - Azure Machine Learning | Microsoft Learn

Train deep learning Keras models (SDK v2) - Azure Machine Learning | Microsoft Learn

Train deep learning PyTorch models (SDK v2) - Azure Machine Learning | Microsoft Learn

Hyperparameter tuning a model (v2) - Azure Machine Learning | Microsoft Learn

Guidelines for deploying MLflow models - Azure Machine Learning | Microsoft Learn
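The hyperparameter tuning mentioned above is expressed in CLI v2 as a sweep job that wraps a trial command. The sketch below assumes a hypothetical `train.py` that accepts a `--learning_rate` argument and logs an `accuracy` metric via MLflow; all names are illustrative:

```yaml
# sweep-job.yml — submit with: az ml job create -f sweep-job.yml
$schema: https://azuremlschemas.azureedge.net/latest/sweepJob.schema.json
type: sweep
trial:
  code: ./src
  command: python train.py --learning_rate ${{search_space.learning_rate}}
  environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest
compute: azureml:cpu-cluster      # placeholder cluster name
sampling_algorithm: random
search_space:
  learning_rate:
    type: uniform
    min_value: 0.001
    max_value: 0.1
objective:
  goal: maximize
  primary_metric: accuracy        # must match a metric the script logs
limits:
  max_total_trials: 20
  max_concurrent_trials: 4
```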


Azure Machine Learning Model Registry

After your model is trained, Azure Machine Learning seamlessly integrates with MLflow, providing users with a powerful tool for model management. By leveraging MLflow, you can easily support the entire model lifecycle, making it a convenient choice for those already familiar with the MLflow client.


Manage model registries in Azure Machine Learning with MLflow - Azure Machine Learning | Microsoft Learn
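Registering a trained MLflow model can also be done declaratively; a sketch with the CLI v2 (model name and artifact path are placeholders — the path would point at the MLflow model folder produced by your training job):

```yaml
# model.yml — register with: az ml model create -f model.yml
$schema: https://azuremlschemas.azureedge.net/latest/model.schema.json
name: my-sklearn-model            # placeholder registry name
type: mlflow_model                # tells AML this is an MLflow-format model
path: ./artifacts/model           # placeholder path to the MLflow model folder
description: Example MLflow model registration (illustrative)
```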


Deploy a Custom Model

With Azure Machine Learning, you can seamlessly deploy your MLflow model to an online endpoint, enabling real-time inference. The beauty of this deployment approach is that it is no-code: there is no need to write a scoring script or define an environment. When deploying your MLflow model to an online endpoint, Azure Machine Learning dynamically installs the necessary Python packages from the model's conda.yaml file during container runtime. Additionally, Azure Machine Learning provides a curated environment and MLflow base image, which includes essential components such as azureml-inference-server-http and mlflow-skinny. To facilitate inference, a scoring script is generated for you.


Deploy MLflow models to real-time endpoints - Azure Machine Learning | Microsoft Learn

Deploy MLflow models to Online Endpoints

Predictive Maintenance with MLflow (XGBoost)

Predictive Maintenance with MLflow (PyTorch)
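The no-code deployment described above boils down to a deployment definition that references the registered MLflow model and omits both `code_configuration` and `environment`. A sketch (endpoint, deployment, and model names are placeholders, and the endpoint is assumed to exist already):

```yaml
# blue-deployment.yml — deploy with:
#   az ml online-deployment create -f blue-deployment.yml
# No scoring script or environment: for type `mlflow_model`, AML generates
# the scoring script and resolves dependencies from the model's conda.yaml.
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue                          # placeholder deployment name
endpoint_name: my-endpoint          # placeholder; create the endpoint first
model: azureml:my-sklearn-model@latest  # placeholder registered model
instance_type: Standard_DS3_v2      # placeholder VM size
instance_count: 1
```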


Principal author:

  • Shep Sheppard | Senior Customer Engineer, FastTrack for ISV and Startups

Other contributors:

  • Yoav Dobrin | Principal Customer Engineer, FastTrack for ISV and Startups
  • Jones Jebaraj | Senior Customer Engineer, FastTrack for ISV and Startups
  • Olga Molocenco-Ciureanu | Customer Engineer, FastTrack for ISV and Startups

