Deploy and run an Azure OpenAI/ChatGPT application on AKS

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

This article shows how to deploy an Azure Kubernetes Service (AKS) cluster and Azure OpenAI Service, and how to deploy a Python chatbot that authenticates against Azure OpenAI using Azure AD workload identity and calls the Chat Completion API of a ChatGPT model. A chatbot is an application that simulates human-like conversations with users via chat; its key task is to answer user questions with instant messages. The AKS cluster communicates with Azure OpenAI Service via an Azure Private Endpoint. The chatbot application simulates the original Magic 8 Ball, an oversized plastic eight ball used for fortune-telling or seeking advice. You can find the code of the chatbot and the Bicep modules to deploy the environment in this GitHub repository.

 

AI applications can perform tasks such as summarizing articles, writing stories, and engaging in long conversations with users. This is made possible by large language models (LLMs) like OpenAI ChatGPT, deep learning models capable of recognizing, summarizing, translating, predicting, and generating text and other content. LLMs leverage the knowledge acquired from extensive datasets, enabling them to perform tasks that go well beyond generating human language. These models have succeeded in diverse domains, including understanding proteins and writing software code. Apart from applications in natural language processing, such as translation, chatbots, and AI assistants, large language models are also extensively employed in healthcare, software development, and various other fields.

For more information on Azure OpenAI Service and Large Language Models (LLMs), see the following articles:

 

 

Prerequisites

You can run az --version to verify the versions of the installed tools.

To install the aks-preview extension, run the following command:

 

az extension add --name aks-preview

 

Run the following command to update the extension to the latest released version:

 

az extension update --name aks-preview

 

Architecture

This sample provides a set of Bicep modules to deploy an Azure Kubernetes Service (AKS) cluster and Azure OpenAI Service, along with a Python chatbot that authenticates against Azure OpenAI using Azure AD workload identity and calls the Chat Completion API of the ChatGPT model. The AKS cluster communicates with Azure OpenAI Service via an Azure Private Endpoint. The following diagram shows the architecture and network topology deployed by the sample:

 

architecture.png

Bicep modules are parametric, so you can choose any network plugin:

 

 

The Bicep modules also allow installing the following extensions and add-ons for Azure Kubernetes Service (AKS):

In addition, this sample shows how to deploy an Azure Kubernetes Service cluster with the following features:

 

 

In a production environment, we strongly recommend deploying a private AKS cluster with Uptime SLA. For more information, see private AKS cluster with a Public DNS address. Alternatively, you can deploy a public AKS cluster and secure access to the API server using authorized IP address ranges.

The Bicep modules deploy the following Azure resources:

 

NOTE
You can find the architecture.vsdx file used for the diagram under the visio folder.

 

What is Bicep?

Bicep is a domain-specific language (DSL) that uses a declarative syntax to deploy Azure resources. It provides concise syntax, reliable type safety, and support for code reuse. Bicep offers the best authoring experience for your infrastructure-as-code solutions in Azure.

 

What is Azure OpenAI Service?

The Azure OpenAI Service is a platform offered by Microsoft Azure that provides cognitive services powered by OpenAI models. One of the models available through this service is the ChatGPT model, designed for interactive conversational tasks. It allows developers to integrate natural language understanding and generation capabilities into their applications.

Azure OpenAI Service provides REST API access to OpenAI's powerful language models, including the GPT-3, Codex, and Embeddings model series. In addition, the new GPT-4 and ChatGPT model series have now reached general availability. These models can be easily adapted to your specific task, including but not limited to content generation, summarization, semantic search, and natural language to code translation. Users can access the service through REST APIs, the Python SDK, or the web-based interface in the Azure OpenAI Studio.

The Chat Completion API, part of the Azure OpenAI Service, provides a dedicated interface for interacting with the ChatGPT and GPT-4 models. This API is currently in preview and is the preferred method for accessing these models. The GPT-4 models can only be accessed through this API.

GPT-3, GPT-3.5, and GPT-4 models from OpenAI are prompt-based. With prompt-based models, the user interacts with the model by entering a text prompt, to which the model responds with a text completion. This completion is the model’s continuation of the input text. While these models are extremely powerful, their behavior is also very sensitive to the prompt. This makes prompt construction a critical skill to develop. For more information, see Introduction to prompt engineering.

Prompt construction can be complex. In practice, the prompt acts to configure the model weights to complete the desired task, but it's more of an art than a science, often requiring experience and intuition to craft a successful prompt. The goal of this article is to help get you started with this learning process. It attempts to capture general concepts and patterns that apply to all GPT models. However, it's essential to understand that each model behaves differently, so the learnings may not apply equally to all models.

Prompt engineering refers to creating instructions called prompts for Large Language Models (LLMs), such as OpenAI’s ChatGPT. With the immense potential of LLMs to solve a wide range of tasks, leveraging prompt engineering can empower us to save significant time and facilitate the development of impressive applications. It holds the key to unleashing the full capabilities of these huge models, transforming how we interact and benefit from them. For more information, see Prompt engineering techniques.
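To make the message format that prompt engineering targets concrete, the following sketch assembles a system prompt and conversation history into the messages list that the Chat Completion API expects. The helper name build_messages and the example prompts are illustrative, not part of the sample application:

```python
# Sketch: assemble a Chat Completion "messages" payload from a system
# prompt and the running conversation history. The helper name and the
# example prompts are hypothetical.

def build_messages(system_prompt, history, user_input):
    """Return the messages list for a Chat Completion request."""
    messages = [{"role": "system", "content": system_prompt}]
    for question, answer in history:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": user_input})
    return messages

messages = build_messages(
    "You are the infamous Magic 8 Ball.",
    [("Will it rain?", "Outlook good.")],
    "Should I bring an umbrella?",
)
```

The system message carries the instructions that shape the model's behavior, while the alternating user/assistant messages give the model the conversation context it needs to produce a relevant completion.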

 

Deploy the Bicep modules

You can deploy the Bicep modules in the bicep folder using the deploy.sh Bash script in the same folder. Specify a value for the following parameters in the deploy.sh script and main.parameters.json parameters file before deploying the Bicep modules.

 

  • prefix: specifies a prefix for all the Azure resources.
  • authenticationType: specifies the type of authentication when accessing the Virtual Machine. sshPublicKey is the recommended value. Allowed values: sshPublicKey and password.
  • vmAdminUsername: specifies the name of the administrator account of the virtual machine.
  • vmAdminPasswordOrKey: specifies the SSH Key or password for the virtual machine.
  • aksClusterSshPublicKey: specifies the SSH Key or password for AKS cluster agent nodes.
  • aadProfileAdminGroupObjectIDs: when deploying an AKS cluster with Azure AD and Azure RBAC integration, this array parameter contains the list of Azure AD group object IDs that will have the admin role of the cluster.
  • keyVaultObjectIds: Specifies the object ID of the service principals to configure in Key Vault access policies.

 

We suggest reading sensitive configuration data such as passwords or SSH keys from a pre-existing Azure Key Vault resource. For more information, see Use Azure Key Vault to pass secure parameter value during Bicep deployment.

 

OpenAI Bicep Module

The following snippet contains the code of the openAi.bicep module used to deploy the Azure OpenAI Service.

 

// Parameters
@description('Specifies the name of the Azure OpenAI resource.')
param name string = 'aks-${uniqueString(resourceGroup().id)}'

@description('Specifies the resource model definition representing SKU.')
param sku object = {
  name: 'S0'
}

@description('Specifies the identity of the OpenAI resource.')
param identity object = {
  type: 'SystemAssigned'
}

@description('Specifies the location.')
param location string = resourceGroup().location

@description('Specifies the resource tags.')
param tags object

@description('Specifies an optional subdomain name used for token-based authentication.')
param customSubDomainName string = ''

@description('Specifies whether or not public endpoint access is allowed for this account.')
@allowed([
  'Enabled'
  'Disabled'
])
param publicNetworkAccess string = 'Enabled'

@description('Specifies the OpenAI deployments to create.')
param deployments array = [
  {
    name: 'text-embedding-ada-002'
    version: '2'
    raiPolicyName: ''
    capacity: 1
    scaleType: 'Standard'
  }
  {
    name: 'gpt-35-turbo'
    version: '0301'
    raiPolicyName: ''
    capacity: 1
    scaleType: 'Standard'
  }
  {
    name: 'text-davinci-003'
    version: '1'
    raiPolicyName: ''
    capacity: 1
    scaleType: 'Standard'
  }
]

@description('Specifies the workspace id of the Log Analytics used to monitor the Application Gateway.')
param workspaceId string

// Variables
var diagnosticSettingsName = 'diagnosticSettings'
var openAiLogCategories = [
  'Audit'
  'RequestResponse'
  'Trace'
]
var openAiMetricCategories = [
  'AllMetrics'
]
var openAiLogs = [for category in openAiLogCategories: {
  category: category
  enabled: true
}]
var openAiMetrics = [for category in openAiMetricCategories: {
  category: category
  enabled: true
}]

// Resources
resource openAi 'Microsoft.CognitiveServices/accounts@2022-12-01' = {
  name: name
  location: location
  sku: sku
  kind: 'OpenAI'
  identity: identity
  tags: tags
  properties: {
    customSubDomainName: customSubDomainName
    publicNetworkAccess: publicNetworkAccess
  }
}

resource model 'Microsoft.CognitiveServices/accounts/deployments@2022-12-01' = [for deployment in deployments: {
  name: deployment.name
  parent: openAi
  properties: {
    model: {
      format: 'OpenAI'
      name: deployment.name
      version: deployment.version
    }
    raiPolicyName: deployment.raiPolicyName
    scaleSettings: {
      capacity: deployment.capacity
      scaleType: deployment.scaleType
    }
  }
}]

resource openAiDiagnosticSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = {
  name: diagnosticSettingsName
  scope: openAi
  properties: {
    workspaceId: workspaceId
    logs: openAiLogs
    metrics: openAiMetrics
  }
}

// Outputs
output id string = openAi.id
output name string = openAi.name

 

Azure Cognitive Services use custom subdomain names for each resource created through the Azure portal, Azure Cloud Shell, Azure CLI, Bicep, Azure Resource Manager (ARM), or Terraform. Unlike regional endpoints, common for all customers in a specific Azure region, custom subdomain names are unique to the resource. Custom subdomain names are required to enable authentication features like Azure Active Directory (Azure AD). We need to specify a custom subdomain for our Azure OpenAI Service, as our chatbot application will use an Azure AD security token to access it. By default, the main.bicep module sets the value of the customSubDomainName parameter to the lowercase name of the Azure OpenAI resource. For more information on custom subdomains, see Custom subdomain names for Cognitive Services.
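To make the default behavior concrete, here is a minimal sketch of deriving the token-based endpoint from the lowercase resource name, mirroring how main.bicep sets customSubDomainName. The helper name is hypothetical; the `{subdomain}.openai.azure.com` endpoint format is the one Azure OpenAI uses for custom subdomains:

```python
# Sketch: derive the custom-subdomain endpoint of an Azure OpenAI
# resource from its name, mirroring main.bicep's lowercasing of the
# resource name. The helper name is illustrative.

def openai_endpoint(resource_name: str) -> str:
    subdomain = resource_name.lower()
    return f"https://{subdomain}.openai.azure.com/"

endpoint = openai_endpoint("MyOpenAI")
```

The chatbot later reads this endpoint from the AZURE_OPENAI_BASE environment variable when configuring the OpenAI client.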

 

This Bicep module allows you to pass an array containing the definition of one or more model deployments in the deployments parameter. For more information on model deployments, see Create a resource and deploy a model using Azure OpenAI.

 

AKS Cluster Bicep module

The aksCluster.bicep Bicep module is used to deploy the Azure Kubernetes Service (AKS) cluster. In particular, the following code snippet creates the user-defined managed identity that the chatbot uses to acquire a security token from Azure Active Directory via Azure AD workload identity. When the boolean openAiEnabled parameter is true, the Bicep code performs the following steps:

 

  • Creates a new user-defined managed identity.
  • Assigns the Cognitive Services User role to the new managed identity, with the resource group as the scope.
  • Federates the managed identity with the service account used by the chatbot. The following information is required to create the federated identity credentials:
    • The Kubernetes service account name.
    • The Kubernetes namespace that will host the chatbot application.
    • The URL of the OpenID Connect (OIDC) token issuer endpoint for Azure AD workload identity.
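The steps above can be sketched in plain Python to show exactly which fields the federated identity credential carries. The subject and audience values are the ones shown in the aksCluster.bicep snippet below; the helper name, the service account name magic8ball-sa, and the issuer URL are hypothetical examples:

```python
# Sketch: build the federated identity credential fields the Bicep
# module configures for the workload's service account. The helper
# name, service account name, and issuer URL are illustrative.

def federated_credential(namespace: str, service_account: str, issuer_url: str) -> dict:
    return {
        "issuer": issuer_url,
        # Kubernetes service account subject format used by workload identity
        "subject": f"system:serviceaccount:{namespace}:{service_account}",
        "audiences": ["api://AzureADTokenExchange"],
    }

cred = federated_credential("magic8ball", "magic8ball-sa", "https://oidc.example/issuer")
```

Azure AD only exchanges tokens whose issuer, subject, and audience exactly match these values, which is why the Bicep module needs the namespace, service account name, and OIDC issuer URL listed above.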

 

For more information, see the following resources:

 

 

...
@description('Specifies the name of the user-defined managed identity used by the application that uses Azure AD workload identity to authenticate against Azure OpenAI.')
param workloadManagedIdentityName string

@description('Specifies whether to create the Azure OpenAI resource or not.')
param openAiEnabled bool = false
...

// This user-defined managed identity is used by the workload to connect to the Azure OpenAI resource with a security token issued by Azure Active Directory
resource workloadManagedIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = if (openAiEnabled) {
  name: workloadManagedIdentityName
  location: location
  tags: tags
}

// Assign the Cognitive Services User role to the user-defined managed identity used by workloads
resource cognitiveServicesUserRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = if (openAiEnabled) {
  name: guid(workloadManagedIdentity.id, cognitiveServicesUserRoleDefinitionId)
  scope: resourceGroup()
  properties: {
    roleDefinitionId: cognitiveServicesUserRoleDefinitionId
    principalId: workloadManagedIdentity.properties.principalId
    principalType: 'ServicePrincipal'
  }
}

// Create the federated identity for the user-defined managed identity used by the workload
resource federatedIdentityCredentials 'Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials@2023-01-31' = {
  name: letterCaseType == 'UpperCamelCase' ? '${toUpper(first(namespace))}${toLower(substring(namespace, 1, length(namespace) - 1))}FederatedIdentity' : letterCaseType == 'CamelCase' ? '${toLower(namespace)}FederatedIdentity' : '${toLower(namespace)}-federated-identity'
  parent: workloadManagedIdentity
  properties: {
    issuer: aksCluster.properties.oidcIssuerProfile.issuerURL
    subject: 'system:serviceaccount:${namespace}:${serviceAccountName}'
    audiences: [
      'api://AzureADTokenExchange'
    ]
  }
}
...

// Outputs
output id string = aksCluster.id
output name string = aksCluster.name
output issuerUrl string = aksCluster.properties.oidcIssuerProfile.issuerURL
output workloadManagedIdentityClientId string = workloadManagedIdentity.properties.clientId

 

Use Azure AD workload identity with Azure Kubernetes Service (AKS)

Workloads deployed on an Azure Kubernetes Services (AKS) cluster require Azure Active Directory (Azure AD) application credentials or managed identities to access Azure AD-protected resources, such as Azure Key Vault and Microsoft Graph. Azure AD workload identity integrates with the capabilities native to Kubernetes to federate with external identity providers.

Azure AD workload identity uses Service Account Token Volume Projection to enable pods to use a Kubernetes service account. When enabled, the AKS OIDC Issuer issues a service account security token to a workload, and OIDC federation enables the application to access Azure resources securely with Azure AD based on annotated service accounts.
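To see what such a projected service account token carries, here is a stdlib-only sketch that decodes the payload segment of a JWT the way an inspector would. The token below is fabricated and unsigned purely for illustration; a real token projected into the pod is signed by the AKS OIDC issuer and contains the issuer URL and the system:serviceaccount subject:

```python
import base64
import json

# Sketch: decode the payload segment of a JWT to inspect its claims.
# The token built below is fabricated for illustration; real projected
# service account tokens are signed by the AKS OIDC issuer.

def decode_jwt_payload(token: str) -> dict:
    payload_b64 = token.split(".")[1]
    # Restore the base64url padding stripped by JWT encoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def encode_segment(obj: dict) -> str:
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")

# Example claims a projected token would carry (values hypothetical)
claims = {
    "iss": "https://example.oidc.issuer/",
    "sub": "system:serviceaccount:magic8ball:magic8ball-sa",
    "aud": ["api://AzureADTokenExchange"],
}
fake_token = f"{encode_segment({'alg': 'none'})}.{encode_segment(claims)}."
decoded = decode_jwt_payload(fake_token)
```

During the OIDC federation step, Azure AD validates exactly these claims against the federated identity credential configured on the managed identity before issuing an access token.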

Azure AD workload identity works well with the Azure Identity client libraries and the Microsoft Authentication Library (MSAL) collection if you use a registered application instead of a managed identity. Your workload can use any of these libraries to authenticate and access Azure cloud resources seamlessly. For more information, see the following resources:

 

 

Azure Identity client libraries

In the Azure Identity client libraries, you can choose one of the following approaches:

 

  • Use DefaultAzureCredential, which will attempt to use the WorkloadIdentityCredential.
  • Create a ChainedTokenCredential instance that includes WorkloadIdentityCredential.
  • Use WorkloadIdentityCredential directly.

The following table provides the minimum package version required for each language's client library.

 

| Language | Library | Minimum Version | Example |
| --- | --- | --- | --- |
| .NET | Azure.Identity | 1.9.0 | Link |
| Go | azidentity | 1.3.0 | Link |
| Java | azure-identity | 1.9.0 | Link |
| JavaScript | @azure/identity | 3.2.0 | Link |
| Python | azure-identity | 1.13.0 | Link |

 

Microsoft Authentication Library (MSAL)

The following table lists the minimum version required for each language's MSAL client library, along with the corresponding container image.

 

| Language | Library | Image | Example | Has Windows |
| --- | --- | --- | --- | --- |
| .NET | microsoft-authentication-library-for-dotnet | ghcr.io/azure/azure-workload-identity/msal-net | Link | Yes |
| Go | microsoft-authentication-library-for-go | ghcr.io/azure/azure-workload-identity/msal-go | Link | Yes |
| Java | microsoft-authentication-library-for-java | ghcr.io/azure/azure-workload-identity/msal-java | Link | No |
| JavaScript | microsoft-authentication-library-for-js | ghcr.io/azure/azure-workload-identity/msal-node | Link | No |
| Python | microsoft-authentication-library-for-python | ghcr.io/azure/azure-workload-identity/msal-python | Link | No |

 

Deployment Script

The sample makes use of a Deployment Script to run the install-nginx-via-helm-and-create-sa.sh Bash script that creates the namespace and service account for the sample application and installs the following packages to the AKS cluster via Helm. For more information on deployment scripts, see Use deployment scripts in Bicep.

 

 

This sample uses the NGINX Ingress Controller to expose the chatbot to the public internet. Companion Bicep modules allow deploying an Azure Application Gateway and the Application Gateway Ingress Controller simply by setting the value of the applicationGatewayEnabled parameter to true. You can therefore easily modify this sample to expose the chatbot to the public internet using the Application Gateway Ingress Controller instead of the NGINX Ingress Controller.

 

# Install kubectl
az aks install-cli --only-show-errors

# Get AKS credentials
az aks get-credentials \
  --admin \
  --name $clusterName \
  --resource-group $resourceGroupName \
  --subscription $subscriptionId \
  --only-show-errors

# Check whether the cluster is private or not
private=$(az aks show --name $clusterName \
  --resource-group $resourceGroupName \
  --subscription $subscriptionId \
  --query apiServerAccessProfile.enablePrivateCluster \
  --output tsv)

# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 -o get_helm.sh -s
chmod 700 get_helm.sh
./get_helm.sh &>/dev/null

# Add Helm repos
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo add jetstack https://charts.jetstack.io

# Update Helm repos
helm repo update

if [[ $private == 'true' ]]; then
  # Log whether the cluster is public or private
  echo "$clusterName AKS cluster is private"

  # Install Prometheus
  command="helm install prometheus prometheus-community/kube-prometheus-stack \
    --create-namespace \
    --namespace prometheus \
    --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false \
    --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false"
  az aks command invoke \
    --name $clusterName \
    --resource-group $resourceGroupName \
    --subscription $subscriptionId \
    --command "$command"

  # Install NGINX ingress controller
  command="helm install nginx-ingress ingress-nginx/ingress-nginx \
    --create-namespace \
    --namespace ingress-basic \
    --set controller.replicaCount=3 \
    --set controller.nodeSelector.\"kubernetes\.io/os\"=linux \
    --set defaultBackend.nodeSelector.\"kubernetes\.io/os\"=linux \
    --set controller.metrics.enabled=true \
    --set controller.metrics.serviceMonitor.enabled=true \
    --set controller.metrics.serviceMonitor.additionalLabels.release=\"prometheus\" \
    --set controller.service.annotations.\"service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path\"=/healthz"
  az aks command invoke \
    --name $clusterName \
    --resource-group $resourceGroupName \
    --subscription $subscriptionId \
    --command "$command"

  # Install certificate manager
  command="helm install cert-manager jetstack/cert-manager \
    --create-namespace \
    --namespace cert-manager \
    --set installCRDs=true \
    --set nodeSelector.\"kubernetes\.io/os\"=linux"
  az aks command invoke \
    --name $clusterName \
    --resource-group $resourceGroupName \
    --subscription $subscriptionId \
    --command "$command"

  # Create cluster issuer
  command="cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-nginx
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: $email
    privateKeySecretRef:
      name: letsencrypt
    solvers:
    - http01:
        ingress:
          class: nginx
          podTemplate:
            spec:
              nodeSelector:
                \"kubernetes.io/os\": linux
EOF"
  az aks command invoke \
    --name $clusterName \
    --resource-group $resourceGroupName \
    --subscription $subscriptionId \
    --command "$command"

  # Create workload namespace
  command="kubectl create namespace $namespace"
  az aks command invoke \
    --name $clusterName \
    --resource-group $resourceGroupName \
    --subscription $subscriptionId \
    --command "$command"

  # Create service account
  command="cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    azure.workload.identity/client-id: $workloadManagedIdentityClientId
    azure.workload.identity/tenant-id: $tenantId
  labels:
    azure.workload.identity/use: \"true\"
  name: $serviceAccountName
  namespace: $namespace
EOF"
  az aks command invoke \
    --name $clusterName \
    --resource-group $resourceGroupName \
    --subscription $subscriptionId \
    --command "$command"
else
  # Log whether the cluster is public or private
  echo "$clusterName AKS cluster is public"

  # Install Prometheus
  helm install prometheus prometheus-community/kube-prometheus-stack \
    --create-namespace \
    --namespace prometheus \
    --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false \
    --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false

  # Install NGINX ingress controller
  helm install nginx-ingress ingress-nginx/ingress-nginx \
    --create-namespace \
    --namespace ingress-basic \
    --set controller.replicaCount=3 \
    --set controller.nodeSelector."kubernetes\.io/os"=linux \
    --set defaultBackend.nodeSelector."kubernetes\.io/os"=linux \
    --set controller.metrics.enabled=true \
    --set controller.metrics.serviceMonitor.enabled=true \
    --set controller.metrics.serviceMonitor.additionalLabels.release="prometheus" \
    --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz

  # Install certificate manager
  helm install cert-manager jetstack/cert-manager \
    --create-namespace \
    --namespace cert-manager \
    --set installCRDs=true \
    --set nodeSelector."kubernetes\.io/os"=linux

  # Create cluster issuer
  cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-nginx
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: $email
    privateKeySecretRef:
      name: letsencrypt
    solvers:
    - http01:
        ingress:
          class: nginx
          podTemplate:
            spec:
              nodeSelector:
                "kubernetes.io/os": linux
EOF

  # Create workload namespace
  kubectl create namespace $namespace

  # Create service account
  cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    azure.workload.identity/client-id: $workloadManagedIdentityClientId
    azure.workload.identity/tenant-id: $tenantId
  labels:
    azure.workload.identity/use: "true"
  name: $serviceAccountName
  namespace: $namespace
EOF
fi

# Create output as JSON file
echo '{}' |
  jq --arg x $namespace '.namespace=$x' |
  jq --arg x $serviceAccountName '.serviceAccountName=$x' |
  jq --arg x 'prometheus' '.prometheus=$x' |
  jq --arg x 'cert-manager' '.certManager=$x' |
  jq --arg x 'ingress-basic' '.nginxIngressController=$x' >$AZ_SCRIPTS_OUTPUT_PATH

 

The install-nginx-via-helm-and-create-sa.sh Bash script can run on a public AKS cluster, or on a private AKS cluster using az aks command invoke. For more information, see Use command invoke to access a private Azure Kubernetes Service (AKS) cluster.

The install-nginx-via-helm-and-create-sa.sh Bash script returns the following outputs to the deployment script:

 

  • Namespace hosting the chatbot sample. You can change the default magic8ball namespace by assigning a different value to the namespace parameter of the main.bicep module.
  • Service account name
  • Prometheus namespace
  • Cert-manager namespace
  • NGINX ingress controller namespace
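The jq pipeline at the end of the script serializes these outputs into the JSON file that the deployment script reads. A sketch of the equivalent object in Python (the namespace and service account values are examples; the three fixed namespaces match the Helm installs performed by the script):

```python
import json

# Sketch: the JSON object the deployment script's jq pipeline writes
# to $AZ_SCRIPTS_OUTPUT_PATH. The namespace and serviceAccountName
# values are hypothetical examples.

outputs = {
    "namespace": "magic8ball",              # chatbot namespace (configurable)
    "serviceAccountName": "magic8ball-sa",  # example value
    "prometheus": "prometheus",
    "certManager": "cert-manager",
    "nginxIngressController": "ingress-basic",
}
serialized = json.dumps(outputs)
```

Downstream Bicep code can reference these values as deployment script outputs, for example to pass the namespace and service account name to the federated identity credential.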

 

Chatbot Application

The chatbot is a Python application inspired by the sample code in the It’s Time To Create A Private ChatGPT For Yourself Today article.

magic8ball.png

The application is contained in a single file called app.py. The application makes use of the following libraries:

 

  • OpenAI: The OpenAI Python library provides convenient access to the OpenAI API from applications written in Python. It includes a pre-defined set of classes for API resources that initialize themselves dynamically from API responses, making it compatible with a wide range of versions of the OpenAI API. You can find usage examples for the OpenAI Python library in the API reference and the OpenAI Cookbook.
  • Azure Identity: The Azure Identity library provides Azure Active Directory (Azure AD) token authentication support across the Azure SDK. It provides a set of TokenCredential implementations, which can be used to construct Azure SDK clients that support Azure AD token authentication.
  • Streamlit: Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom web apps for machine learning and data science. You can build and deploy powerful data apps in just a few minutes. For more information, see Streamlit documentation.
  • Streamlit-chat: a Streamlit component that provides a configurable user interface for chatbot applications.
  • Dotenv: Python-dotenv reads key-value pairs from a .env file and can set them as environment variables. It helps in the development of applications following the 12-factor principles.

The requirements.txt file under the scripts folder contains the list of packages used by the app.py application that you can restore using the following command:

 

pip install -r requirements.txt --upgrade

 

 

The following snippet contains the code of the app.py chatbot:

 

# Import packages
import os
import sys
import time
import openai
import logging
import streamlit as st
from streamlit_chat import message
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv
from dotenv import dotenv_values

# Load environment variables from .env file
if os.path.exists(".env"):
    load_dotenv(override=True)
    config = dotenv_values(".env")

# Read environment variables
assistant_profile = """
You are the infamous Magic 8 Ball. You need to randomly reply to any question with one of the following answers:

- It is certain.
- It is decidedly so.
- Without a doubt.
- Yes definitely.
- You may rely on it.
- As I see it, yes.
- Most likely.
- Outlook good.
- Yes.
- Signs point to yes.
- Reply hazy, try again.
- Ask again later.
- Better not tell you now.
- Cannot predict now.
- Concentrate and ask again.
- Don't count on it.
- My reply is no.
- My sources say no.
- Outlook not so good.
- Very doubtful.

Add a short comment in a pirate style at the end! Follow your heart and be creative!
For more information, see https://en.wikipedia.org/wiki/Magic_8_Ball
"""
title = os.environ.get("TITLE", "Magic 8 Ball")
text_input_label = os.environ.get("TEXT_INPUT_LABEL", "Pose your question and cross your fingers!")
image_file_name = os.environ.get("IMAGE_FILE_NAME", "magic8ball.png")
image_width = int(os.environ.get("IMAGE_WIDTH", 80))
temperature = float(os.environ.get("TEMPERATURE", 0.9))
system = os.environ.get("SYSTEM", assistant_profile)
api_base = os.getenv("AZURE_OPENAI_BASE")
api_key = os.getenv("AZURE_OPENAI_KEY")
api_type = os.environ.get("AZURE_OPENAI_TYPE", "azure")
api_version = os.environ.get("AZURE_OPENAI_VERSION", "2023-05-15")
engine = os.getenv("AZURE_OPENAI_DEPLOYMENT")
model = os.getenv("AZURE_OPENAI_MODEL")

# Configure OpenAI
openai.api_type = api_type
openai.api_version = api_version
openai.api_base = api_base

# Set default Azure credential
default_credential = DefaultAzureCredential() if openai.api_type == "azure_ad" else None

# Configure a logger
logging.basicConfig(stream=sys.stdout,
                    format='[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s',
                    level=logging.INFO)
logger = logging.getLogger(__name__)

# Log variables
logger.info(f"title: {title}")
logger.info(f"text_input_label: {text_input_label}")
logger.info(f"image_file_name: {image_file_name}")
logger.info(f"image_width: {image_width}")
logger.info(f"temperature: {temperature}")
logger.info(f"system: {system}")
logger.info(f"api_base: {api_base}")
logger.info(f"api_key: {api_key}")
logger.info(f"api_type: {api_type}")
logger.info(f"api_version: {api_version}")
logger.info(f"engine: {engine}")
logger.info(f"model: {model}")

# Authenticate to Azure OpenAI
if openai.api_type == "azure":
    openai.api_key = api_key
elif openai.api_type == "azure_ad":
    openai_token = default_credential.get_token("https://cognitiveservices.azure.com/.default")
    openai.api_key = openai_token.token
    if 'openai_token' not in st.session_state:
        st.session_state['openai_token'] = openai_token
else:
    logger.error("Invalid API type. Please set the AZURE_OPENAI_TYPE environment variable to azure or azure_ad.")
    raise ValueError("Invalid API type. Please set the AZURE_OPENAI_TYPE environment variable to azure or azure_ad.")

# Customize Streamlit UI using CSS
st.markdown("""
<style>
div.stButton > button:first-child {
    background-color: #eb5424;
    color: white;
    font-size: 20px;
    font-weight: bold;
    border-radius: 0.5rem;
    padding: 0.5rem 1rem;
    border: none;
    box-shadow: 0 0.5rem 1rem rgba(0,0,0,0.15);
    width: 300px;
    height: 42px;
    transition: all 0.2s ease-in-out;
}

div.stButton > button:first-child:hover {
    transform: translateY(-3px);
    box-shadow: 0 1rem 2rem rgba(0,0,0,0.15);
}

div.stButton > button:first-child:active {
    transform: translateY(-1px);
    box-shadow: 0 0.5rem 1rem rgba(0,0,0,0.15);
}

div.stButton > button:focus:not(:focus-visible) {
    color: #FFFFFF;
}

@media only screen and (min-width: 768px) {
    /* For desktop: */
    div {
        font-family: 'Roboto', sans-serif;
    }

    div.stButton > button:first-child {
        background-color: #eb5424;
        color: white;
        font-size: 20px;
        font-weight: bold;
        border-radius: 0.5rem;
        padding: 0.5rem 1rem;
        border: none;
        box-shadow: 0 0.5rem 1rem rgba(0,0,0,0.15);
        width: 300px;
        height: 42px;
        transition: all 0.2s ease-in-out;
        position: relative;
        bottom: -32px;
        right: 0px;
    }

    div.stButton > button:first-child:hover {
        transform: translateY(-3px);
        box-shadow: 0 1rem 2rem rgba(0,0,0,0.15);
    }

    div.stButton > button:first-child:active {
        transform: translateY(-1px);
        box-shadow: 0 0.5rem 1rem rgba(0,0,0,0.15);
    }

    div.stButton > button:focus:not(:focus-visible) {
        color: #FFFFFF;
    }

    input {
        border-radius: 0.5rem;
        padding: 0.5rem 1rem;
        border: none;
        box-shadow: 0 0.5rem 1rem rgba(0,0,0,0.15);
        transition: all 0.2s ease-in-out;
        height: 40px;
    }
}
</style>
""", unsafe_allow_html=True)

# Initialize Streamlit session state
if 'prompts' not in st.session_state:
    st.session_state['prompts'] = [{"role": "system", "content": system}]

if 'generated' not in st.session_state:
    st.session_state['generated'] = []

if 'past' not in st.session_state:
    st.session_state['past'] = []

# Refresh the OpenAI security token when it is close to expiring
def refresh_openai_token():
    # Refresh the token if it expires within the next 45 minutes
    if st.session_state['openai_token'].expires_on < int(time.time()) + 45 * 60:
        st.session_state['openai_token'] = default_credential.get_token("https://cognitiveservices.azure.com/.default")
        openai.api_key = st.session_state['openai_token'].token

# Send user prompt to Azure OpenAI
def generate_response(prompt):
    try:
        st.session_state['prompts'].append({"role": "user", "content": prompt})

        if openai.api_type == "azure_ad":
            refresh_openai_token()

        completion = openai.ChatCompletion.create(
            engine=engine,
            model=model,
            messages=st.session_state['prompts'],
            temperature=temperature,
        )

        message = completion.choices[0].message.content
        return message
    except Exception as e:
        logging.exception(f"Exception in generate_response: {e}")

# Reset Streamlit session state to start a new chat from scratch
def new_click():
    st.session_state['prompts'] = [{"role": "system", "content": system}]
    st.session_state['past'] = []
    st.session_state['generated'] = []
    st.session_state['user'] = ""

# Handle on_change event for user input
def user_change():
    # Avoid handling the event twice when clicking the Send button
    chat_input = st.session_state['user']
    st.session_state['user'] = ""
    if (chat_input == '' or
        (len(st.session_state['past']) > 0 and chat_input == st.session_state['past'][-1])):
        return

    # Generate response invoking Azure OpenAI LLM
    if chat_input != '':
        output = generate_response(chat_input)

        # Store the output
        st.session_state['past'].append(chat_input)
        st.session_state['generated'].append(output)
        st.session_state['prompts'].append({"role": "assistant", "content": output})

# Create a 2-column layout. Note: Streamlit columns do not properly render on mobile devices.
# For more information, see https://github.com/streamlit/streamlit/issues/5003
col1, col2 = st.columns([1, 7])

# Display the robot image
with col1:
    st.image(image=os.path.join("images", image_file_name), width=image_width)

# Display the title
with col2:
    st.title(title)

# Create a 3-column layout. Note: Streamlit columns do not properly render on mobile devices.
# For more information, see https://github.com/streamlit/streamlit/issues/5003
col3, col4, col5 = st.columns([7, 1, 1])

# Create text input in column 1
with col3:
    user_input = st.text_input(text_input_label, key="user", on_change=user_change)

# Create send button in column 2
with col4:
    st.button(label="Send")

# Create new button in column 3
with col5:
    st.button(label="New", on_click=new_click)

# Display the chat history in two separate tabs
# - normal: display the chat history as a list of messages using the streamlit_chat message() function
# - rich: display the chat history as a list of messages using the Streamlit markdown() function
if st.session_state['generated']:
    tab1, tab2 = st.tabs(["normal", "rich"])

    with tab1:
        for i in range(len(st.session_state['generated']) - 1, -1, -1):
            message(st.session_state['past'][i], is_user=True, key=str(i) + '_user', avatar_style="fun-emoji", seed="Nala")
            message(st.session_state['generated'][i], key=str(i), avatar_style="bottts", seed="Fluffy")

    with tab2:
        for i in range(len(st.session_state['generated']) - 1, -1, -1):
            st.markdown(st.session_state['past'][i])
            st.markdown(st.session_state['generated'][i])

 

The application uses an internal cascading style sheet (CSS) inside an st.markdown element to add a unique style to the Streamlit chatbot for mobile and desktop devices. For more information on how to tweak the user interface of a Streamlit application, see 3 Tips to Customize your Streamlit App.

 

You can run the application locally using the following command:

streamlit run app.py

 

 

Working with the ChatGPT and GPT-4 models

The generate_response function creates and sends the prompt to the Chat Completion API of the ChatGPT model.

 

def generate_response(prompt):
    try:
        st.session_state['prompts'].append({"role": "user", "content": prompt})

        if openai.api_type == "azure_ad":
            refresh_openai_token()

        completion = openai.ChatCompletion.create(
            engine=engine,
            model=model,
            messages=st.session_state['prompts'],
            temperature=temperature,
        )

        message = completion.choices[0].message.content
        return message
    except Exception as e:
        logging.exception(f"Exception in generate_response: {e}")

 

 

OpenAI trained the ChatGPT and GPT-4 models to accept input formatted as a conversation. The messages parameter takes an array of dictionaries, each with a role (system, user, or assistant) and the message content. The format of a basic chat completion is as follows:

 

 

{"role": "system", "content": "Provide some context and/or instructions to the model"}, {"role": "user", "content": "The users messages goes here"}, {"role": "assistant", "content": "The response message goes here."}

 

The system role, also known as the system message, is included at the beginning of the array. This message provides the initial instructions for the model. You can provide various information in the system role, including:

 

  • A brief description of the assistant
  • Personality traits of the assistant
  • Instructions or rules you would like the assistant to follow
  • Data or information needed for the model, such as relevant questions from an FAQ

You can customize the system role for your use case or include just basic instructions.

 

The system role or message is optional, but it's recommended to include at least a basic one to get the best results. The user role or message represents an input or inquiry from the user, while the assistant message corresponds to the response generated by the GPT API. This dialog exchange aims to simulate a human-like conversation, where the user message initiates the interaction and the assistant message provides a relevant and informative answer. Earlier messages in the array give the model context that helps it generate a more appropriate response; the last user message in the array is the prompt currently being submitted. For more information, see Learn how to work with the ChatGPT and GPT-4 models.
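To make this exchange concrete, here is a minimal sketch of how the messages array grows across turns; the message contents are hypothetical and not taken from the sample app, but the role names are those defined by the Chat Completion API:

```python
# Sketch of how the messages array evolves across conversation turns.
messages = [
    {"role": "system", "content": "You are a Magic 8 Ball. Answer questions with short, oracular advice."}
]

# First turn: append the user prompt, call the API, then append the reply.
messages.append({"role": "user", "content": "Will it rain tomorrow?"})
# response = openai.ChatCompletion.create(engine="gpt-35-turbo", messages=messages)
messages.append({"role": "assistant", "content": "Outlook not so good."})

# Second turn: earlier messages stay in the array, giving the model the
# context it needs to interpret follow-up questions.
messages.append({"role": "user", "content": "Should I bring an umbrella anyway?"})

roles = [m["role"] for m in messages]
print(roles)  # prints ['system', 'user', 'assistant', 'user']
```

The chatbot keeps this array in st.session_state['prompts'] so it survives Streamlit reruns between user interactions.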

 

Application Configuration

Make sure to provide a value for the following environment variables when testing the app.py Python app locally, for example in Visual Studio Code. You can optionally define the environment variables in a .env file in the same folder as the app.py file.

 

  • AZURE_OPENAI_TYPE: specify azure if you want the application to use an API key to authenticate against OpenAI. In this case, make sure to provide the key in the AZURE_OPENAI_KEY environment variable. If you instead want to authenticate using an Azure AD security token, specify azure_ad as the value. In this case, you don't need to provide any value in the AZURE_OPENAI_KEY environment variable.
  • AZURE_OPENAI_BASE: the URL of your Azure OpenAI resource. If you use the API key to authenticate against OpenAI, you can specify the regional endpoint of your Azure OpenAI Service (e.g., https://eastus.api.cognitive.microsoft.com/). If you instead plan to use Azure AD security tokens for authentication, you need to deploy your Azure OpenAI Service with a subdomain and specify the resource-specific endpoint url (e.g., https://myopenai.openai.azure.com/).
  • AZURE_OPENAI_KEY: the key of your Azure OpenAI resource.
  • AZURE_OPENAI_DEPLOYMENT: the name of the ChatGPT deployment used by your Azure OpenAI resource, for example gpt-35-turbo.
  • AZURE_OPENAI_MODEL: the name of the ChatGPT model used by your Azure OpenAI resource, for example gpt-35-turbo.
  • TITLE: the title of the Streamlit app.
  • TEMPERATURE: the temperature used by the OpenAI API to generate the response.
  • SYSTEM: give the model instructions about how it should behave and any context it should reference when generating a response. Used to describe the assistant's personality.

 

When deploying the application to Azure Kubernetes Service (AKS), these values are provided in a Kubernetes ConfigMap. For more information, see the next section.
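As a rough sketch of how the app can pick up these settings, the snippet below reads the environment variables listed above with illustrative defaults; the optional python-dotenv call is a suggestion for local testing only and is not necessarily how the sample app loads its configuration:

```python
import os

# Optionally load a local .env file if python-dotenv is installed
# (hypothetical local-dev convenience; on AKS these values come from the ConfigMap).
try:
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass

# Read the settings used by the chatbot; the defaults here are illustrative.
api_type = os.environ.get("AZURE_OPENAI_TYPE", "azure")
api_base = os.environ.get("AZURE_OPENAI_BASE", "")
api_key = os.environ.get("AZURE_OPENAI_KEY", "")
engine = os.environ.get("AZURE_OPENAI_DEPLOYMENT", "gpt-35-turbo")
model = os.environ.get("AZURE_OPENAI_MODEL", "gpt-35-turbo")
title = os.environ.get("TITLE", "Magic 8 Ball")
temperature = float(os.environ.get("TEMPERATURE", "0.9"))
system = os.environ.get("SYSTEM", "You are a helpful assistant.")
```

Reading every value through os.environ.get with a fallback means the same code runs unchanged whether the values come from a local .env file or from the Kubernetes ConfigMap.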

 

OpenAI Library

To use the openai library with Microsoft Azure endpoints, you need to set the api_type, api_base, and api_version properties in addition to the api_key. The api_type must be set to azure, and the others correspond to the properties of your endpoint. In addition, the deployment name must be passed as the engine parameter. To authenticate with an API key, set api_type to azure and pass the key of your Azure OpenAI resource to api_key.

 

import openai

openai.api_type = "azure"
openai.api_key = "..."
openai.api_base = "https://example-endpoint.openai.azure.com"
openai.api_version = "2023-05-15"

# Create a chat completion
chat_completion = openai.ChatCompletion.create(
    deployment_id="gpt-3.5-turbo",
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello world"}]
)

# Print the completion
print(chat_completion.choices[0].message.content)

 

For a detailed example of how to use fine-tuning and other operations using Azure endpoints, please check out the following Jupyter notebooks:

 

 

To use Azure Active Directory to authenticate to your Azure endpoint, you need to set the api_type to azure_ad and pass the acquired credential token to api_key. The rest of the parameters must be set as specified in the previous section.

 

from azure.identity import DefaultAzureCredential
import openai

# Request credential
default_credential = DefaultAzureCredential()
token = default_credential.get_token("https://cognitiveservices.azure.com/.default")

# Setup parameters
openai.api_type = "azure_ad"
openai.api_key = token.token
openai.api_base = "https://example-endpoint.openai.azure.com/"
openai.api_version = "2023-05-15"

# ...

 

You can use two different authentication methods in the magic8ball chatbot application:

 

  • API key: set the AZURE_OPENAI_TYPE environment variable to azure and the AZURE_OPENAI_KEY environment variable to the key of your Azure OpenAI resource. You can use the regional endpoint, such as https://eastus.api.cognitive.microsoft.com/, in the AZURE_OPENAI_BASE environment variable, to connect to the Azure OpenAI resource.
  • Azure Active Directory: set the AZURE_OPENAI_TYPE environment variable to azure_ad and use a service principal or managed identity with the DefaultAzureCredential object to acquire a security token from Azure Active Directory. For more information on the DefaultAzureCredential in Python, see Authenticate Python apps to Azure services by using the Azure SDK for Python. Make sure to assign the Cognitive Services User role to the service principal or managed identity used to authenticate to your Azure OpenAI Service. For more information, see How to configure Azure OpenAI Service with managed identities. If you want to use Azure AD integrated security, you need to create a custom subdomain for your Azure OpenAI resource and use the specific endpoint containing the custom domain, such as https://myopenai.openai.azure.com/ where myopenai is the custom subdomain. If you specify the regional endpoint, you get an error like the following: Subdomain does not map to a resource. Hence, pass the custom domain endpoint in the AZURE_OPENAI_BASE environment variable. In this case, you also need to refresh the security token periodically.
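The branching between the two authentication methods can be sketched as a small helper that maps AZURE_OPENAI_TYPE to the openai.* settings described above. This is an illustrative sketch, not the sample app's code: build_openai_settings is a hypothetical name, and in the Azure AD branch the real app passes token.token from DefaultAzureCredential rather than an environment variable.

```python
import os

def build_openai_settings(env):
    """Return the openai.* settings for the chosen authentication method."""
    api_type = env.get("AZURE_OPENAI_TYPE", "azure")
    settings = {
        # For azure_ad, this must be the custom-subdomain endpoint.
        "api_base": env.get("AZURE_OPENAI_BASE", "https://myopenai.openai.azure.com/"),
        "api_version": "2023-05-15",
    }
    if api_type == "azure":
        # API-key authentication: a regional endpoint is sufficient.
        settings["api_type"] = "azure"
        settings["api_key"] = env.get("AZURE_OPENAI_KEY", "")
    else:
        # Azure AD authentication: the token comes from DefaultAzureCredential
        # in the real app and must be refreshed periodically before it expires.
        settings["api_type"] = "azure_ad"
        settings["api_key"] = env.get("AZURE_AD_TOKEN", "")  # token.token in the real app
    return settings

print(build_openai_settings({"AZURE_OPENAI_TYPE": "azure", "AZURE_OPENAI_KEY": "sk-..."})["api_type"])  # prints "azure"
```

Keeping the selection in one place makes it easy to switch the deployed chatbot between the two methods just by changing the ConfigMap value.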

 

Build the container image

You can build the container image using the 01-build-docker-image.sh script in the scripts folder.

 

#!/bin/bash

# Variables
source ./00-variables.sh

# Build the docker image
docker build -t $imageName:$tag -f Dockerfile .

 

Before running any script, make sure to customize the values of the variables inside the 00-variables.sh file. This file is sourced by all the scripts and contains the following variables:

 

# Variables acrName="CoralAcr" acrResourceGrougName="CoralRG" location="FranceCentral" attachAcr=false imageName="magic8ball" tag="v2" containerName="magic8ball" image="$acrName.azurecr.io/$imageName:$tag" imagePullPolicy="IfNotPresent" # Always, Never, IfNotPresent managedIdentityName="OpenAiManagedIdentity" federatedIdentityName="Magic8BallFederatedIdentity" # Azure Subscription and Tenant subscriptionId=$(az account show --query id --output tsv) subscriptionName=$(az account show --query name --output tsv) tenantId=$(az account show --query tenantId --output tsv) # Parameters title="Magic 8 Ball" label="Pose your question and cross your fingers!" temperature="0.9" imageWidth="80" # OpenAI openAiName="CoralOpenAi " openAiResourceGroupName="CoralRG" openAiType="azure_ad" openAiBase="https://coralopenai.openai.azure.com/" openAiModel="gpt-35-turbo" openAiDeployment="gpt-35-turbo" # Nginx Ingress Controller nginxNamespace="ingress-basic" nginxRepoName="ingress-nginx" nginxRepoUrl="https://kubernetes.github.io/ingress-nginx" nginxChartName="ingress-nginx" nginxReleaseName="nginx-ingress" nginxReplicaCount=3 # Certificate Manager cmNamespace="cert-manager" cmRepoName="jetstack" cmRepoUrl="https://charts.jetstack.io" cmChartName="cert-manager" cmReleaseName="cert-manager" # Cluster Issuer email="paolos@microsoft.com" clusterIssuerName="letsencrypt-nginx" clusterIssuerTemplate="cluster-issuer.yml" # AKS Cluster aksClusterName="CoralAks" aksResourceGroupName="CoralRG" # Sample Application namespace="magic8ball" serviceAccountName="magic8ball-sa" deploymentTemplate="deployment.yml" serviceTemplate="service.yml" configMapTemplate="configMap.yml" secretTemplate="secret.yml" # Ingress and DNS ingressTemplate="ingress.yml" ingressName="magic8ball-ingress" dnsZoneName="babosbird.com" dnsZoneResourceGroupName="DnsResourceGroup" subdomain="magic8ball" host="$subdomain.$dnsZoneName"

 

Upload Docker container image to Azure Container Registry (ACR)

You can push the Docker container image to Azure Container Registry (ACR) using the 03-push-docker-image.sh script in the scripts folder.

 

#!/bin/bash

# Variables
source ./00-variables.sh

# Login to ACR
az acr login --name $acrName

# Retrieve ACR login server. Each container image needs to be tagged with the loginServer name of the registry.
loginServer=$(az acr show --name $acrName --query loginServer --output tsv)

# Tag the local image with the loginServer of ACR
docker tag ${imageName,,}:$tag $loginServer/${imageName,,}:$tag

# Push the latest container image to ACR
docker push $loginServer/${imageName,,}:$tag

 

Deployment Scripts

If you deployed the Azure infrastructure using the Bicep modules provided with this sample, you only need to deploy the application using the following scripts and YAML templates in the scripts folder.

 

  • 09-deploy-app.sh
  • 10-create-ingress.sh
  • 11-configure-dns.sh
  • configMap.yml
  • deployment.yml
  • ingress.yml
  • service.yml

 

If you instead want to deploy the application in your AKS cluster, you can use the following scripts to configure your environment.

 

The 04-create-nginx-ingress-controller.sh script installs the NGINX Ingress Controller using Helm.

 

#!/bin/bash

# Variables
source ./00-variables.sh

# Use Helm to deploy an NGINX ingress controller
result=$(helm list -n $nginxNamespace | grep $nginxReleaseName | awk '{print $1}')

if [[ -n $result ]]; then
    echo "[$nginxReleaseName] ingress controller already exists in the [$nginxNamespace] namespace"
else
    # Check if the ingress-nginx repository is not already added
    result=$(helm repo list | grep $nginxRepoName | awk '{print $1}')

    if [[ -n $result ]]; then
        echo "[$nginxRepoName] Helm repo already exists"
    else
        # Add the ingress-nginx repository
        echo "Adding [$nginxRepoName] Helm repo..."
        helm repo add $nginxRepoName $nginxRepoUrl
    fi

    # Update your local Helm chart repository cache
    echo 'Updating Helm repos...'
    helm repo update

    # Deploy NGINX ingress controller
    echo "Deploying [$nginxReleaseName] NGINX ingress controller to the [$nginxNamespace] namespace..."
    helm install $nginxReleaseName $nginxRepoName/$nginxChartName \
        --create-namespace \
        --namespace $nginxNamespace \
        --set controller.config.enable-modsecurity=true \
        --set controller.config.enable-owasp-modsecurity-crs=true \
        --set controller.config.modsecurity-snippet=\
'SecRuleEngine On
SecRequestBodyAccess On
SecAuditLog /dev/stdout
SecAuditLogFormat JSON
SecAuditEngine RelevantOnly
SecRule REMOTE_ADDR "@ipMatch 127.0.0.1" "id:87,phase:1,pass,nolog,ctl:ruleEngine=Off"' \
        --set controller.metrics.enabled=true \
        --set controller.metrics.serviceMonitor.enabled=true \
        --set controller.metrics.serviceMonitor.additionalLabels.release="prometheus" \
        --set controller.nodeSelector."kubernetes\.io/os"=linux \
        --set controller.replicaCount=$nginxReplicaCount \
        --set defaultBackend.nodeSelector."kubernetes\.io/os"=linux \
        --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz
fi

 

05-install-cert-manager.sh installs cert-manager using Helm.

 

#!/bin/bash

# Variables
source ./00-variables.sh

# Check if the jetstack repository is not already added
result=$(helm repo list | grep $cmRepoName | awk '{print $1}')

if [[ -n $result ]]; then
    echo "[$cmRepoName] Helm repo already exists"
else
    # Add the Jetstack Helm repository
    echo "Adding [$cmRepoName] Helm repo..."
    helm repo add $cmRepoName $cmRepoUrl
fi

# Update your local Helm chart repository cache
echo 'Updating Helm repos...'
helm repo update

# Install cert-manager Helm chart
result=$(helm list -n $cmNamespace | grep $cmReleaseName | awk '{print $1}')

if [[ -n $result ]]; then
    echo "[$cmReleaseName] cert-manager already exists in the $cmNamespace namespace"
else
    # Install the cert-manager Helm chart
    echo "Deploying [$cmReleaseName] cert-manager to the $cmNamespace namespace..."
    helm install $cmReleaseName $cmRepoName/$cmChartName \
        --create-namespace \
        --namespace $cmNamespace \
        --set installCRDs=true \
        --set nodeSelector."kubernetes\.io/os"=linux
fi

 

06-create-cluster-issuer.sh creates a cluster issuer for the NGINX Ingress Controller based on the Let's Encrypt ACME certificate issuer.

 

#!/bin/bash

# Variables
source ./00-variables.sh

# Check if the cluster issuer already exists
result=$(kubectl get ClusterIssuer -o json | jq -r '.items[].metadata.name | select(. == "'$clusterIssuerName'")')

if [[ -n $result ]]; then
    echo "[$clusterIssuerName] cluster issuer already exists"
    exit
else
    # Create the cluster issuer
    echo "[$clusterIssuerName] cluster issuer does not exist"
    echo "Creating [$clusterIssuerName] cluster issuer..."
    cat $clusterIssuerTemplate |
        yq "(.spec.acme.email)|="\""$email"\" |
        kubectl apply -f -
fi

 

07-create-workload-managed-identity.sh creates the managed identity used by the magic8ball chatbot and assigns it the Cognitive Services User role on the Azure OpenAI Service.

 

#!/bin/bash

# Variables
source ./00-variables.sh

# Check if the user-assigned managed identity already exists
echo "Checking if [$managedIdentityName] user-assigned managed identity actually exists in the [$aksResourceGroupName] resource group..."

az identity show \
    --name $managedIdentityName \
    --resource-group $aksResourceGroupName &>/dev/null

if [[ $? != 0 ]]; then
    echo "No [$managedIdentityName] user-assigned managed identity actually exists in the [$aksResourceGroupName] resource group"
    echo "Creating [$managedIdentityName] user-assigned managed identity in the [$aksResourceGroupName] resource group..."

    # Create the user-assigned managed identity
    az identity create \
        --name $managedIdentityName \
        --resource-group $aksResourceGroupName \
        --location $location \
        --subscription $subscriptionId 1>/dev/null

    if [[ $? == 0 ]]; then
        echo "[$managedIdentityName] user-assigned managed identity successfully created in the [$aksResourceGroupName] resource group"
    else
        echo "Failed to create [$managedIdentityName] user-assigned managed identity in the [$aksResourceGroupName] resource group"
        exit
    fi
else
    echo "[$managedIdentityName] user-assigned managed identity already exists in the [$aksResourceGroupName] resource group"
fi

# Retrieve the clientId of the user-assigned managed identity
echo "Retrieving clientId for [$managedIdentityName] managed identity..."
clientId=$(az identity show \
    --name $managedIdentityName \
    --resource-group $aksResourceGroupName \
    --query clientId \
    --output tsv)

if [[ -n $clientId ]]; then
    echo "[$clientId] clientId for the [$managedIdentityName] managed identity successfully retrieved"
else
    echo "Failed to retrieve clientId for the [$managedIdentityName] managed identity"
    exit
fi

# Retrieve the principalId of the user-assigned managed identity
echo "Retrieving principalId for [$managedIdentityName] managed identity..."
principalId=$(az identity show \
    --name $managedIdentityName \
    --resource-group $aksResourceGroupName \
    --query principalId \
    --output tsv)

if [[ -n $principalId ]]; then
    echo "[$principalId] principalId for the [$managedIdentityName] managed identity successfully retrieved"
else
    echo "Failed to retrieve principalId for the [$managedIdentityName] managed identity"
    exit
fi

# Get the resource id of the Azure OpenAI resource
openAiId=$(az cognitiveservices account show \
    --name $openAiName \
    --resource-group $openAiResourceGroupName \
    --query id \
    --output tsv)

if [[ -n $openAiId ]]; then
    echo "Resource id for the [$openAiName] Azure OpenAI resource successfully retrieved"
else
    echo "Failed to retrieve the resource id for the [$openAiName] Azure OpenAI resource"
    exit -1
fi

# Assign the Cognitive Services User role on the Azure OpenAI resource to the managed identity
role="Cognitive Services User"
echo "Checking if the [$managedIdentityName] managed identity has been assigned to [$role] role with [$openAiName] Azure OpenAI resource as a scope..."
current=$(az role assignment list \
    --assignee $principalId \
    --scope $openAiId \
    --query "[?roleDefinitionName=='$role'].roleDefinitionName" \
    --output tsv 2>/dev/null)

if [[ $current == $role ]]; then
    echo "[$managedIdentityName] managed identity is already assigned to the ["$current"] role with [$openAiName] Azure OpenAI resource as a scope"
else
    echo "[$managedIdentityName] managed identity is not assigned to the [$role] role with [$openAiName] Azure OpenAI resource as a scope"
    echo "Assigning the [$role] role to the [$managedIdentityName] managed identity with [$openAiName] Azure OpenAI resource as a scope..."

    az role assignment create \
        --assignee $principalId \
        --role "$role" \
        --scope $openAiId 1>/dev/null

    if [[ $? == 0 ]]; then
        echo "[$managedIdentityName] managed identity successfully assigned to the [$role] role with [$openAiName] Azure OpenAI resource as a scope"
    else
        echo "Failed to assign the [$managedIdentityName] managed identity to the [$role] role with [$openAiName] Azure OpenAI resource as a scope"
        exit
    fi
fi

 

08-create-service-account.sh creates the namespace and service account for the magic8ball chatbot and federates the service account with the user-assigned managed identity created in the previous step.

 

#!/bin/bash

# Variables
source ./00-variables.sh

# Check if the namespace already exists
result=$(kubectl get namespace -o 'jsonpath={.items[?(@.metadata.name=="'$namespace'")].metadata.name}')

if [[ -n $result ]]; then
    echo "[$namespace] namespace already exists"
else
    # Create the namespace for your ingress resources
    echo "[$namespace] namespace does not exist"
    echo "Creating [$namespace] namespace..."
    kubectl create namespace $namespace
fi

# Check if the service account already exists
result=$(kubectl get sa -n $namespace -o 'jsonpath={.items[?(@.metadata.name=="'$serviceAccountName'")].metadata.name}')

if [[ -n $result ]]; then
    echo "[$serviceAccountName] service account already exists"
else
    # Retrieve the clientId of the user-assigned managed identity
    echo "Retrieving clientId for [$managedIdentityName] managed identity..."
    managedIdentityClientId=$(az identity show \
        --name $managedIdentityName \
        --resource-group $aksResourceGroupName \
        --query clientId \
        --output tsv)

    if [[ -n $managedIdentityClientId ]]; then
        echo "[$managedIdentityClientId] clientId for the [$managedIdentityName] managed identity successfully retrieved"
    else
        echo "Failed to retrieve clientId for the [$managedIdentityName] managed identity"
        exit
    fi

    # Create the service account
    echo "[$serviceAccountName] service account does not exist"
    echo "Creating [$serviceAccountName] service account..."
    cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    azure.workload.identity/client-id: $managedIdentityClientId
    azure.workload.identity/tenant-id: $tenantId
  labels:
    azure.workload.identity/use: "true"
  name: $serviceAccountName
  namespace: $namespace
EOF
fi

# Show the service account YAML manifest
echo "Service Account YAML manifest"
echo "-----------------------------"
kubectl get sa $serviceAccountName -n $namespace -o yaml

# Check if the federated identity credential already exists
echo "Checking if [$federatedIdentityName] federated identity credential actually exists in the [$aksResourceGroupName] resource group..."

az identity federated-credential show \
    --name $federatedIdentityName \
    --resource-group $aksResourceGroupName \
    --identity-name $managedIdentityName &>/dev/null

if [[ $? != 0 ]]; then
    echo "No [$federatedIdentityName] federated identity credential actually exists in the [$aksResourceGroupName] resource group"

    # Get the OIDC Issuer URL
    aksOidcIssuerUrl="$(az aks show \
        --only-show-errors \
        --name $aksClusterName \
        --resource-group $aksResourceGroupName \
        --query oidcIssuerProfile.issuerUrl \
        --output tsv)"

    # Show the OIDC Issuer URL
    if [[ -n $aksOidcIssuerUrl ]]; then
        echo "The OIDC Issuer URL of the $aksClusterName cluster is $aksOidcIssuerUrl"
    fi

    echo "Creating [$federatedIdentityName] federated identity credential in the [$aksResourceGroupName] resource group..."

    # Establish the federated identity credential between the managed identity, the service account issuer, and the subject.
    az identity federated-credential create \
        --name $federatedIdentityName \
        --identity-name $managedIdentityName \
        --resource-group $aksResourceGroupName \
        --issuer $aksOidcIssuerUrl \
        --subject system:serviceaccount:$namespace:$serviceAccountName

    if [[ $? == 0 ]]; then
        echo "[$federatedIdentityName] federated identity credential successfully created in the [$aksResourceGroupName] resource group"
    else
        echo "Failed to create [$federatedIdentityName] federated identity credential in the [$aksResourceGroupName] resource group"
        exit
    fi
else
    echo "[$federatedIdentityName] federated identity credential already exists in the [$aksResourceGroupName] resource group"
fi

 

09-deploy-app.sh creates the Kubernetes config map, deployment, and service used by the magic8ball chatbot.

 

#!/bin/bash

# Variables
source ./00-variables.sh

# Attach ACR to the AKS cluster
if [[ $attachAcr == true ]]; then
    echo "Attaching ACR $acrName to AKS cluster $aksClusterName..."
    az aks update \
        --name $aksClusterName \
        --resource-group $aksResourceGroupName \
        --attach-acr $acrName
fi

# Check if the namespace exists in the cluster
result=$(kubectl get namespace -o jsonpath="{.items[?(@.metadata.name=='$namespace')].metadata.name}")

if [[ -n $result ]]; then
    echo "$namespace namespace already exists in the cluster"
else
    echo "$namespace namespace does not exist in the cluster"
    echo "creating $namespace namespace in the cluster..."
    kubectl create namespace $namespace
fi

# Create the config map
cat $configMapTemplate |
    yq "(.data.TITLE)|="\""$title"\" |
    yq "(.data.LABEL)|="\""$label"\" |
    yq "(.data.TEMPERATURE)|="\""$temperature"\" |
    yq "(.data.IMAGE_WIDTH)|="\""$imageWidth"\" |
    yq "(.data.AZURE_OPENAI_TYPE)|="\""$openAiType"\" |
    yq "(.data.AZURE_OPENAI_BASE)|="\""$openAiBase"\" |
    yq "(.data.AZURE_OPENAI_MODEL)|="\""$openAiModel"\" |
    yq "(.data.AZURE_OPENAI_DEPLOYMENT)|="\""$openAiDeployment"\" |
    kubectl apply -n $namespace -f -

# Create the deployment
cat $deploymentTemplate |
    yq "(.spec.template.spec.containers[0].image)|="\""$image"\" |
    yq "(.spec.template.spec.containers[0].imagePullPolicy)|="\""$imagePullPolicy"\" |
    yq "(.spec.template.spec.serviceAccountName)|="\""$serviceAccountName"\" |
    kubectl apply -n $namespace -f -

# Create the service
kubectl apply -f $serviceTemplate -n $namespace

 

10-create-ingress.sh creates the ingress object to expose the service via the NGINX Ingress Controller.

 

#!/bin/bash

# Variables
source ./00-variables.sh

# Create the ingress
echo "[$ingressName] ingress does not exist"
echo "Creating [$ingressName] ingress..."
cat $ingressTemplate |
    yq "(.spec.tls[0].hosts[0])|="\""$host"\" |
    yq "(.spec.rules[0].host)|="\""$host"\" |
    kubectl apply -n $namespace -f -

 

11-configure-dns.sh creates an A record in the Azure DNS zone to expose the application via a given subdomain (e.g., https://magic8ball.example.com).

 

#!/bin/bash

# Variables
source ./00-variables.sh

# Retrieve the public IP address from the ingress
echo "Retrieving the external IP address from the [$ingressName] ingress..."
publicIpAddress=$(kubectl get ingress $ingressName -n $namespace -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

if [[ -n $publicIpAddress ]]; then
    echo "[$publicIpAddress] external IP address of the application gateway ingress controller successfully retrieved from the [$ingressName] ingress"
else
    echo "Failed to retrieve the external IP address of the application gateway ingress controller from the [$ingressName] ingress"
    exit
fi

# Check if an A record for the subdomain exists in the DNS zone
echo "Retrieving the A record for the [$subdomain] subdomain from the [$dnsZoneName] DNS zone..."
ipv4Address=$(az network dns record-set a list \
    --zone-name $dnsZoneName \
    --resource-group $dnsZoneResourceGroupName \
    --query "[?name=='$subdomain'].arecords[].ipv4Address" \
    --output tsv)

if [[ -n $ipv4Address ]]; then
    echo "An A record already exists in [$dnsZoneName] DNS zone for the [$subdomain] subdomain with [$ipv4Address] IP address"

    if [[ $ipv4Address == $publicIpAddress ]]; then
        echo "The [$ipv4Address] ip address of the existing A record is equal to the ip address of the [$ingressName] ingress"
        echo "No additional step is required"
        exit
    else
        echo "The [$ipv4Address] ip address of the existing A record is different than the ip address of the [$ingressName] ingress"
    fi

    # Retrieve the name of the record set relative to the zone
    echo "Retrieving the name of the record set relative to the [$dnsZoneName] zone..."
    recordSetName=$(az network dns record-set a list \
        --zone-name $dnsZoneName \
        --resource-group $dnsZoneResourceGroupName \
        --query "[?name=='$subdomain'].name" \
        --output tsv 2>/dev/null)

    if [[ -n $recordSetName ]]; then
        echo "[$recordSetName] record set name successfully retrieved"
    else
        echo "Failed to retrieve the name of the record set relative to the [$dnsZoneName] zone"
        exit
    fi

    # Remove the A record
    echo "Removing the A record from the record set relative to the [$dnsZoneName] zone..."
    az network dns record-set a remove-record \
        --ipv4-address $ipv4Address \
        --record-set-name $recordSetName \
        --zone-name $dnsZoneName \
        --resource-group $dnsZoneResourceGroupName

    if [[ $? == 0 ]]; then
        echo "[$ipv4Address] ip address successfully removed from the [$recordSetName] record set"
    else
        echo "Failed to remove the [$ipv4Address] ip address from the [$recordSetName] record set"
        exit
    fi
fi

# Create the A record
echo "Creating an A record in [$dnsZoneName] DNS zone for the [$subdomain] subdomain with [$publicIpAddress] IP address..."
az network dns record-set a add-record \
    --zone-name $dnsZoneName \
    --resource-group $dnsZoneResourceGroupName \
    --record-set-name $subdomain \
    --ipv4-address $publicIpAddress 1>/dev/null

if [[ $? == 0 ]]; then
    echo "A record for the [$subdomain] subdomain with [$publicIpAddress] IP address successfully created in [$dnsZoneName] DNS zone"
else
    echo "Failed to create an A record for the $subdomain subdomain with [$publicIpAddress] IP address in [$dnsZoneName] DNS zone"
fi

 

The scripts used to deploy the YAML templates use the yq tool to customize the manifests with the values of the variables defined in the 00-variables.sh file. yq is a lightweight, portable command-line YAML, JSON, and XML processor that uses jq-like syntax and works with YAML, JSON, XML, properties, CSV, and TSV files. It doesn't yet support everything jq does, but it covers the most common operations and functions, and more are added continuously.

 

Review deployed resources

Use the Azure portal, Azure CLI, or Azure PowerShell to list the deployed resources in the resource group.

 

Azure CLI

 

az resource list --resource-group <resource-group-name>

 

PowerShell

 

Get-AzResource -ResourceGroupName <resource-group-name>

 

Clean up resources

When you no longer need the resources you created, delete the resource group. This will remove all the Azure resources.

 

Azure CLI

 

az group delete --name <resource-group-name>

 

PowerShell

 

Remove-AzResourceGroup -Name <resource-group-name>

 

 
