Azure OpenAI Insights: Monitoring AI with Confidence

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

Welcome to the forefront of AI innovation! In the ever-evolving world of Artificial Intelligence, organizations and entities across various sectors are on a quest to leverage advanced technologies efficiently. Azure OpenAI opens a realm of possibilities, offering both challenges and excitement, particularly for those at the early stages of AI adoption.

 

We recognize these unique challenges. That's why we've crafted this blog post - not as a guide, but as a companion on your journey toward mastering the native platform monitoring of Azure OpenAI. Our focus here is not on the technicalities of creating workbooks. Instead, we take a deep dive into how workbooks, coupled with Azure OpenAI's comprehensive metrics and diagnostic logs, can be a powerful tool for scaling and monitoring your AI initiatives.
This is crucial in the early AI adoption phase, where every decision and resource allocation matters. We will demonstrate how a well-structured workbook, integrated with Azure Monitor, offers deep insights into Azure OpenAI usage, helping you manage costs, optimize performance, and make strategic decisions for a robust AI infrastructure.

Getting started: Step by step

 

  • Step 1: Download the workbook from here.
  • Step 2: Import the workbook into your Azure Monitor workspace. Here is an external guide on how to import workbooks into Azure Monitor. Alternatively, you can use this repo for additional instructions.
  • Step 3 (optional): Enable diagnostic settings for your Azure OpenAI resource. This allows you to view additional dimensions and logs in the workbook. More information on the level of detail later in this post.
  • Step 4: Explore the workbook.

Please check our repository for further enhancements, issue reports, and more. We hope to hear from you via issues and stars.

Workbook Overview

 

workbook_overview.png

Figure 1: Workbook Overview - the information in this pane is derived from Azure Resource Graph
 
This pane provides a high-level overview of your Azure OpenAI resources. It uses KQL to query Azure Resource Graph, providing a holistic view of OpenAI resources across subscriptions, resource groups, network access patterns (open or isolated), and regions.

 

We also provide a matrix of all the resources, together with links to each resource:
 

resource_list.png

Figure 2: Resource list - all relevant resources are listed here

 

Monitor

monitor_tab.png

Figure 3: Monitor tab - Holistic view of all metrics, including token usage, http requests, and more

 

This pane provides a comprehensive view of all metrics across multiple subscriptions and resources. It includes the following information:
  • HTTP requests, by multiple dimensions: model name & version, status code, model deployment name, operation, API name, and region.
  • Token-based usage - multiple metrics: Processed Inference Tokens, Processed Prompt Tokens, Generated Completion Tokens, and Active Tokens; these are displayed with a couple of dimensions, such as model name and model deployment name.
  • PTU Utilization - by multiple dimensions: model name & version, streaming type, and model deployment name.
  • Fine-tuning - here we show the 'Processed FineTuned Training Hours' metric by two dimensions: model name and model deployment name.
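To make the grouping concrete, here is a small Python sketch of how token metrics roll up by deployment name, the way the workbook's charts do. The records and dimension names below are simplified stand-ins for illustration, not the actual Azure Monitor schema:

```python
from collections import defaultdict

# Simplified stand-ins for Azure Monitor metric records; the real schema
# and dimension names differ - these are illustrative only.
metric_records = [
    {"metric": "Processed Prompt Tokens", "deployment": "gpt4-prod", "value": 1200},
    {"metric": "Processed Prompt Tokens", "deployment": "gpt35-dev", "value": 300},
    {"metric": "Generated Completion Tokens", "deployment": "gpt4-prod", "value": 800},
    {"metric": "Generated Completion Tokens", "deployment": "gpt35-dev", "value": 150},
]

def tokens_by_deployment(records):
    """Sum token counts per (deployment, metric) pair, as the workbook charts do."""
    totals = defaultdict(int)
    for r in records:
        totals[(r["deployment"], r["metric"])] += r["value"]
    return dict(totals)

totals = tokens_by_deployment(metric_records)
print(totals[("gpt4-prod", "Processed Prompt Tokens")])  # 1200
```

In the workbook itself this aggregation is expressed as a KQL summarize over the metric dimensions rather than Python, but the grouping logic is the same.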

Insights

 

Before using the Insights tab, diagnostic settings must be enabled for your Azure OpenAI resource.
You will need to choose the Log Analytics workspace in which the diagnostic logs will be stored; multiple workspaces are supported. Within this tab there are three additional tabs:

 

Insights Overview

This tab focuses on an aggregated view of all logs, by multiple dimensions. It includes the following information:
  • Model name
  • Model Deployment name
  • Average Duration (in milliseconds)
  • API operation name

 

insight_1.png
Figure 4: Insights Overview - aggregated view of all logs, by multiple dimensions
 
insights2.png

Figure 5: Insights Overview - additional aggregated views

 

This tab shows the overall utilization and performance of the selected Azure OpenAI resource.

 

By Caller IP

This tab examines the logs by caller IP, which can be used to better understand usage patterns and traffic origins. It includes the following information:

 

  • Request/Response (Model name, Model deployment name & Operation name)
  • Average Duration

bycallerip.png

Figure 6: By Caller IP - examine the logs by caller IP and multiple dimensions
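The same idea in miniature: a Python sketch that averages request duration per caller IP. The log rows below are invented for illustration; real Azure OpenAI diagnostic logs use a different schema, and the workbook queries them with KQL rather than Python:

```python
from collections import defaultdict
from statistics import mean

# Illustrative log rows only; the real diagnostic log schema differs.
logs = [
    {"caller_ip": "10.0.0.1", "operation": "ChatCompletions_Create", "duration_ms": 420},
    {"caller_ip": "10.0.0.1", "operation": "ChatCompletions_Create", "duration_ms": 380},
    {"caller_ip": "10.0.0.2", "operation": "Completions_Create", "duration_ms": 510},
]

def avg_duration_by_ip(rows):
    """Average request duration per caller IP, mirroring the workbook view."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["caller_ip"]].append(row["duration_ms"])
    return {ip: mean(values) for ip, values in grouped.items()}

print(avg_duration_by_ip(logs))  # {'10.0.0.1': 400, '10.0.0.2': 510}
```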

 

All Logs

Essentially, this tab provides a complete view of all logs originating from multiple subscriptions, resource groups, and OpenAI resources.
 

alllogs.png

Figure 7: All Logs - complete view of all logs originating from multiple subscriptions, resource groups, and OpenAI resources

 

Why? Activating Azure OpenAI Monitoring: Cognitive Services, Metrics, and Diagnostics

As the latest addition to Azure AI Services (previously known as Cognitive Services), Azure OpenAI (AOAI) stands out as a revolutionary tool. It joins an impressive lineup of services including speech, text, decision, vision, translation, and content moderation, among others. These services, while diverse in their capabilities, share a common thread in their provisioning and control mechanisms. Whether it's the usage measured in token counts, page numbers, or audio length, each service is designed with flexibility and specificity in mind.

 

Azure AI Services, including AOAI, offer two primary usage patterns: integration through specific SDKs supporting various programming languages like .NET, Python, Go, Java, or direct consumption via REST endpoints. All these services are region-specific, providing options for private endpoints and/or restricted networks, catering to diverse operational needs.

 

For Independent Software Vendors (ISVs), the stakes are uniquely tailored. ISVs face distinct challenges in managing, controlling, and monetizing their customers' usage of these services in a multi-tenant environment. This is especially true with Azure OpenAI. The nuances of monitoring metrics and managing resources in such an environment are critical for ensuring efficient operation and customer satisfaction.

 

Here are a few user stories that highlight these challenges:

 

  • Resource Allocation: As an ISV operator, monitoring and controlling the usage of cognitive services across tenants is vital for fair resource distribution.
  • Billing Accuracy: Keeping track of each tenant's service consumption is crucial for accurate billing and service verification.
  • Monetization Strategy: For ISVs, monetizing cognitive service usage is key to recovering operational costs and maintaining profitability.
  • Usage Limits: Setting limits on service access for each tenant helps in preventing resource monopolization and ensuring service availability for all.
  • Data Segregation: Ensuring strict data segregation between tenants is paramount for maintaining privacy and preventing data leakage.
  • Metrics and Documentation: Having access to detailed documentation on AOAI metrics, error codes, and rate limits is essential for effective system integration.
  • Comprehensive Metrics: Access to extensive metrics like deployment names and hosting hours is crucial for managing usage and performance of cognitive services effectively.
Each of these user stories underscores the unique aspects that ISVs must consider when leveraging Azure AI Services, particularly Azure OpenAI, in their operations.

 

Here are a couple of useful links to get you started:

 

  • Azure OpenAI Service Monitoring: Azure OpenAI Service Monitoring Guide details how to use Azure Monitor tools for tracking the availability, performance, and operation of Azure OpenAI Service resources. It covers different monitoring data types such as platform metrics, resource logs, and activity logs, explaining their collection and storage via diagnostic settings. The guide highlights out-of-box dashboards with categories like HTTP Requests and PTU Utilization, and delves into using the Kusto query language in Log Analytics for complex data queries. Additionally, it provides insights into creating alerts based on various monitoring data and outlines best practices and use cases for proactive notification, making it an essential resource for efficient Azure OpenAI Service management.
  • Azure OpenAI Service Overview: Understanding Azure OpenAI Service offers a comprehensive look at Microsoft's Azure OpenAI Service, which grants access to OpenAI’s advanced language models like GPT-4, GPT-4 Turbo with Vision, and GPT-3.5-Turbo. The service is accessible via REST APIs, Python SDK, or a web-based interface and is tailored for customers with established partnerships with Microsoft, focusing on lower-risk applications and adherence to responsible AI principles. Key features include the Completions Endpoint for generating text completions from prompts, and the introduction of the DALL-E and Whisper models, which are in preview for generating images from text and transcribing or translating speech. The page also guides new users on starting with Azure OpenAI, including creating an Azure OpenAI resource, deploying models, and crafting effective prompts, making it a vital resource for anyone looking to leverage these cutting-edge AI capabilities.

 

How? Approaches to Provision Azure OpenAI Services for ISVs and Enterprises

In the realm of Azure OpenAI, ISVs and enterprises have several strategic options for provisioning services to meet their unique business needs and control measures. These approaches largely fall into two categories: Build and Reuse.
Our focus in this blog is on the 'Reuse' approach.

Reuse: Utilizing Existing Azure Monitoring Tools

  • Overview: The 'Reuse' strategy focuses on leveraging existing Azure tools for monitoring and diagnostics, such as Azure Monitor, Azure Metrics, and diagnostic logs.
  • Detailed View: These tools provide detailed insights into the usage of Azure OpenAI services. By reusing these tools, ISVs can gain a comprehensive view of service utilization, performance metrics, and operational health without the need for extensive custom development.
  • Integration and Customization: Azure Monitor and workbooks are built into the Azure portal, making this approach cost-effective and time-saving.

Build: Crafting Custom Solution

  • Overview: This approach involves ISVs developing their own custom tools tailored to their specific requirements for controlling and monitoring Azure OpenAI services.
  • Considerations: When building a custom solution, ISVs must consider the integration complexity, development cost, and the ongoing maintenance. This route offers maximum flexibility and control but requires significant investment in development resources.
  • Leveraging Existing Platforms/Tools: While the specifics of building custom tools are beyond the scope of this discussion, it's worth noting that these tools can often be built on top of existing platforms or frameworks, enhancing efficiency and reducing development time.

Decision Factors

  • Balancing Flexibility and Resource Investment: The choice between building custom tools or reusing existing Azure tools depends on several factors, including the desired level of customization, available resources, and the specific needs of the ISV or enterprise.
  • Scalability and Future Growth: Considerations should also include scalability and the ability to adapt to future changes in Azure OpenAI services and the broader AI landscape.

Reuse Strategies in Azure OpenAI Provisioning

Under the 'Reuse' umbrella, ISVs have a couple of notable alternatives for managing Azure OpenAI services for their customers. These methods are particularly beneficial for efficiently utilizing the existing infrastructure provided by Microsoft.
 

ipvsdepname.png

Figure 8: Two options to view usage: By IP and by Model Deployment Name.

 

Unique Deployment Names for Each Customer

multi_deploy_diag.png

Figure 9: Multiple deployment names, one per customer

 

  • Overview: In this approach, ISVs assign a unique deployment name for each customer, with individualized settings including TPM (tokens per minute) and RPM (requests per minute). This customization allows for more precise control over how each customer can utilize the service.
  • Controlled Management: By having distinct deployment names, ISVs can fine-tune the service parameters per customer. This ensures that each customer's usage stays within the prescribed limits, helping to manage resource allocation effectively and prevent overutilization.
  • Benefits: This method delegates significant control measures to the Azure platform, reducing the management burden on the ISV. It's particularly suitable for scenarios where customer-specific data segregation and usage monitoring are critical, and where each customer's capacity needs are within the overall model limits.
  • Considerations: While this setup simplifies management for the ISV, it requires careful planning and setup for each customer to ensure that their specific needs are met within the parameters of TPM and RPM.

deployment_name.png

Figure 10: Configuring deployment names
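To illustrate what a per-customer TPM limit does conceptually, here is a toy Python sketch. Azure enforces TPM and RPM server-side per deployment, so you would not implement this yourself; the class, numbers, and fixed one-minute window below are invented purely as a thought model:

```python
import time

class TpmLimiter:
    """Toy tokens-per-minute check for one customer's deployment.

    Azure enforces TPM/RPM server-side per deployment; this class only
    illustrates the idea of a quota over a one-minute window.
    """

    def __init__(self, tpm_limit, now=None):
        self.tpm_limit = tpm_limit
        self.window_start = time.monotonic() if now is None else now
        self.tokens_used = 0

    def allow(self, tokens, now=None):
        now = time.monotonic() if now is None else now
        if now - self.window_start >= 60:  # start a fresh one-minute window
            self.window_start = now
            self.tokens_used = 0
        if self.tokens_used + tokens > self.tpm_limit:
            return False  # request would exceed this customer's TPM quota
        self.tokens_used += tokens
        return True

limiter = TpmLimiter(tpm_limit=1000, now=0)
print(limiter.allow(600, now=0))   # True
print(limiter.allow(600, now=30))  # False: 1200 tokens would exceed the window
print(limiter.allow(600, now=61))  # True: a new window has started
```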

Multiple Endpoints for Increased Capacity

multi_ep_capacity.png

Figure 11: Multiple endpoints

 

  • Overview: Alternatively, ISVs can use multiple endpoints to enhance capacity. In this scenario, each ISV customer uses the same endpoint, and the ISV is responsible for load balancing and monitoring individual customer usage.
  • Challenges: This approach requires the ISV to actively manage load balancing and usage tracking, which can be complex but offers greater flexibility in resource allocation and scalability.
  • Usage Monitoring: The ISV must implement robust systems to accurately monitor and count usage per customer, ensuring fair billing and resource distribution.
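A minimal Python sketch of this pattern, assuming a simple round-robin policy; the class and endpoint URLs are invented for illustration, and a production ISV setup would also need health checks, retries, and durable per-customer usage records:

```python
from collections import defaultdict
from itertools import cycle

class EndpointBalancer:
    """Toy round-robin balancer over several Azure OpenAI endpoints that
    also meters token usage per customer (for billing and chargeback)."""

    def __init__(self, endpoints):
        self._cycle = cycle(endpoints)
        self.usage = defaultdict(int)  # customer_id -> tokens consumed

    def route(self, customer_id, tokens):
        """Pick the next endpoint and record the customer's token spend."""
        self.usage[customer_id] += tokens
        return next(self._cycle)

# Endpoint URLs are made up for illustration.
balancer = EndpointBalancer([
    "https://aoai-eastus.example.net",
    "https://aoai-westeu.example.net",
])
print(balancer.route("tenant-a", 500))  # https://aoai-eastus.example.net
print(balancer.route("tenant-b", 200))  # https://aoai-westeu.example.net
print(balancer.usage["tenant-a"])       # 500
```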

 

Hybrid Approach

  • Possibility: A third alternative could be a hybrid approach, combining elements of both strategies. This could involve using unique deployment names for certain customers with specific needs while employing multiple endpoints for others to scale capacity.
  • Flexibility: This approach offers the greatest flexibility, allowing ISVs to tailor the provisioning strategy to the specific needs and usage patterns of each customer.
  • Management Complexity: While offering adaptability, this approach can increase management complexity and resource requirements for the ISV.

Conclusion and Next Steps

In this blog post, we highlighted two main approaches - Build and Reuse - and discussed their implications, helping you determine the best fit for your unique business needs. We also introduced a valuable resource: a detailed workbook, available for download at the link above. This workbook is designed to guide you through the practical aspects of Azure OpenAI monitoring, providing insights on use cases tailored to the ISV landscape. We encourage you to explore it, adapt it to your needs, and integrate it into your AI strategy.
 
Dolev & Yoav.
