Announcing Azure OpenAI Service Assistants Preview Refresh

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.



In early 2024, we introduced the groundbreaking Assistants API in Azure OpenAI Service (Public Preview) to empower developers to easily build agent-like features into their applications. Building these agentic features was possible before, but it often required significant engineering, third-party libraries, and multiple integrations. Now with Assistants, leveraging the latest GPT models, tools, and knowledge, developers are rapidly creating customized stateful copilots grounded in their enterprise data and capable of handling a diverse range of tasks.


Today, we are announcing the Preview Refresh of Assistants with a range of new features, including File Search and Browse tools, enhanced data security features, improved controls, new models, expanded region support, and various enhancements to make it easy to get from prototyping to production.   


New Assistants Tools 


Azure OpenAI's Assistants API comes packed with prebuilt tools that make it easier than ever for developers to extend the capabilities of their AI applications. The Code Interpreter tool writes and runs Python code to enable advanced data analysis, data visualization with charts and graphs, and solving math problems. With Function Calling, developers can define custom functions and have the model intelligently return the functions that need to be called along with their arguments. Today we are announcing two new tools to supercharge your copilots further:
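As a sketch of the Function Calling flow, the snippet below registers a hypothetical get_stock_price function, describes it in the JSON-schema shape the tools parameter expects, and dispatches the kind of tool call the model returns when a run requires action. All names and figures are illustrative, not a real integration.

```python
import json

# Hypothetical local function the model may ask us to call.
def get_stock_price(symbol: str) -> float:
    prices = {"MSFT": 420.0, "KO": 61.5}  # stand-in data
    return prices.get(symbol.upper(), 0.0)

# Tool schema in the JSON-schema shape the Assistants API expects.
GET_STOCK_PRICE_TOOL = {
    "type": "function",
    "function": {
        "name": "get_stock_price",
        "description": "Return the latest price for a ticker symbol.",
        "parameters": {
            "type": "object",
            "properties": {"symbol": {"type": "string"}},
            "required": ["symbol"],
        },
    },
}

def dispatch(tool_call: dict) -> str:
    """Run the function named by a tool call and return its JSON-encoded output."""
    registry = {"get_stock_price": get_stock_price}
    fn = registry[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return json.dumps(fn(**args))

# When a run enters `requires_action`, the API hands back calls shaped like this:
print(dispatch({"name": "get_stock_price", "arguments": '{"symbol": "MSFT"}'}))  # 420.0
```

The app submits each dispatched result back to the run as a tool output, and the model folds it into its final answer.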


File Search (Public Preview): Enables you to easily connect your enterprise data sources, enable vector search, and implement Retrieval-Augmented Generation (RAG). File Search supports up to 10,000 files per Assistant, supports parallel queries through multi-threaded searches, and features enhanced reranking and query rewriting. We are also introducing vector_store as a new object in the API. Once a file is added to a vector store, it is automatically parsed, chunked, embedded, and made ready to search. Vector stores can be used across assistants and threads, simplifying file management and billing. File Search uses the text-embedding-3-large model at 256 dimensions, a default chunk size of 800 tokens, and a chunk overlap of 400 tokens. File Search is priced at $0.10/GB of vector store storage per day (the first GB of storage is free). This tool will be offered free of charge until June 17, 2024.
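To illustrate the chunking scheme described above (fixed-size chunks whose starts are offset so that adjacent chunks overlap), here is a minimal, self-contained sketch. It operates on a plain token list as a stand-in for a real tokenizer, and the toy sizes in the demo mirror the 800-token chunk / 400-token overlap ratio.

```python
def chunk_tokens(tokens, chunk_size=800, overlap=400):
    """Split a token list into fixed-size chunks; consecutive chunk starts are
    (chunk_size - overlap) apart, so adjacent chunks share `overlap` tokens."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the input
    return chunks

# Tiny demo with toy sizes: 10-token chunks overlapping by 5.
for c in chunk_tokens(list(range(20)), chunk_size=10, overlap=5):
    print(c)
```

Overlapping chunks cost extra storage but reduce the chance that a relevant passage is split across a chunk boundary and missed at retrieval time.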


Browse (Public Preview coming in July 2024): Enables your Assistant to search the web to help answer questions that benefit from the most up-to-date information. With the Browse tool, you can bring intelligent search to your apps and harness the ability to comb billions of webpages, images, videos, and news with a single API call. If your users ask a question that requires the use of the Browse feature (e.g., "What is the weather today in Seattle?"), your Assistant will formulate a keyword search based on this query and submit it to the Bing search engine to retrieve results.


Enhanced Data Security 


Customer-managed key (CMK) support for Assistants Thread State and Files (Public Preview coming in June 2024): Enables users to protect and control access to stateful entities and files. Create your own keys and store them in Azure Key Vault or a managed HSM, or use the Azure Key Vault APIs to generate keys. CMK support for File Search is coming soon.


Bring Your Indexes to File Search (Public Preview coming in July 2024): Allows existing users of Azure OpenAI On Your Data to connect their existing On Your Data indexes to the File Search tool. On Your Data indexes can be created with data in Azure AI Search, Azure Cosmos DB for MongoDB vCore, Azure Blob Storage, Pinecone, Elasticsearch, and more. Simply select your Azure Machine Learning (AML) project indexes in the File Search tool and enable Retrieval-Augmented Generation on this data.


Control Assistants Outputs and Manage Costs (Public Preview): Allows users to view the input and output tokens used in a thread, message, and run; control the maximum number of tokens a run uses in the Assistants API; and set limits on the number of previous messages used in each run. If your scenario necessitates the use of a specific tool, you can force a tool like file_search or code_interpreter in a particular run with the new tool_choice parameter.
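As a rough sketch, the controls above map to run-level parameters such as the following, collected here as the keyword arguments one might pass when creating a run. The parameter names follow the Assistants API's documented shape; the specific values are illustrative.

```python
# Run-level parameters to cap cost and steer tool use
# (passed to the run-creation call, e.g. client.beta.threads.runs.create).
run_params = {
    "max_prompt_tokens": 2000,        # cap tokens drawn from thread history per run
    "max_completion_tokens": 1000,    # cap tokens the run may generate
    "truncation_strategy": {          # only the 10 most recent messages feed the run
        "type": "last_messages",
        "last_messages": 10,
    },
    "tool_choice": {"type": "file_search"},  # force File Search for this run
}

# Worst-case token budget per run under these caps.
total_budget = run_params["max_prompt_tokens"] + run_params["max_completion_tokens"]
print(total_budget)  # 3000
```

Bounding both the prompt and completion sides gives a predictable per-run token ceiling, which makes cost estimation straightforward.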


We are also announcing support for popular model configuration parameters, including temperature, response_format (JSON mode) and top_p in Assistant and Run objects. By adjusting the temperature and top_p parameters, you can achieve different levels of creativity and control in an Assistant’s outputs, making them suitable for a wide range of applications. 

Other Features and Enhancements


Streaming and Polling Support (Public Preview):  We're excited to announce support for streaming responses to reduce perceived latency in your applications using Assistants. With our Python SDK, you can use 'create and stream' helpers to create runs and stream responses seamlessly. We've also added SDK helpers to share object status updates without needing to poll. 
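The streaming pattern can be sketched without the SDK: the helpers surface the response as a sequence of small delta events, and the application renders each delta as it arrives instead of waiting for the full run to finish. The event names and payloads below are simplified stand-ins for illustration, not the SDK's exact types.

```python
from typing import Iterator

def fake_delta_events() -> Iterator[dict]:
    """Stand-in for the SDK's event stream: text arrives in small deltas."""
    for piece in ["The answer", " is", " 42."]:
        yield {"event": "thread.message.delta", "text": piece}
    yield {"event": "thread.run.completed"}

def consume(stream: Iterator[dict]) -> str:
    """Render deltas as they arrive to reduce perceived latency."""
    text = ""
    for ev in stream:
        if ev["event"] == "thread.message.delta":
            text += ev["text"]
            print(ev["text"], end="", flush=True)  # show partial output immediately
    print()
    return text

print(consume(fake_delta_events()))  # The answer is 42.
```

The SDK's event-handler helpers follow the same shape: you react to each delta callback and receive a completion event at the end, with no polling loop of your own.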


Custom Conversation Histories in Threads (Public Preview): Create messages with the assistant role to build custom conversation histories in Threads.
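A minimal sketch of seeding a thread with a prior exchange, using the documented message-list shape (e.g., as passed to thread creation); the conversation content is illustrative:

```python
# Seed a thread with a prior exchange so the Assistant picks up mid-conversation.
# Messages with role "assistant" represent earlier model turns.
history = [
    {"role": "user", "content": "What does File Search do?"},
    {"role": "assistant", "content": "It connects your files for vector search and RAG."},
    {"role": "user", "content": "And how is it priced?"},
]

roles = [m["role"] for m in history]
print(roles)  # ['user', 'assistant', 'user']
```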


Assistants Tracing and Evaluation with the PromptFlow SDK (Public Preview coming in June 2024): As developers move from prototyping with Assistants to production, the complexity of components and calls increases, especially because of the non-deterministic nature of Assistants. As a result, fast switching, evaluating, and comparing of Assistants configurations becomes important. We are announcing support for instrumentation of the Assistants API in the PromptFlow SDK to enable better transparency and debuggability by tracing functions and tool calls. In addition, PromptFlow's evaluation feature will enable developers to measure and assess the quality and safety of their Assistants' outputs through built-in and custom evaluators.




Tracing: The PromptFlow SDK allows users to trace the tool calls of their existing application by instrumenting commonly used libraries and by providing a rich UX to visualize the traces locally and in AI Studio. Tracing will allow the developer to understand the flow of their LLM app both in development and as part of monitoring in production.  
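The idea behind tracing can be sketched generically, without the PromptFlow SDK: wrap each function and tool call so that its name, inputs, duration, and result are recorded as spans, which a trace viewer can then render as a call tree. This is an illustration of the concept, not PromptFlow's actual API.

```python
import functools
import time

TRACE_LOG = []  # collected spans; a real tracer would export these to a viewer

def trace(fn):
    """Record each call's name, arguments, duration, and result as a span."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        TRACE_LOG.append({
            "name": fn.__name__,
            "args": args,
            "duration_s": time.perf_counter() - start,
            "result": result,
        })
        return result
    return wrapper

@trace
def file_search(query: str) -> list:
    return ["doc1.md"]  # stand-in for a real tool call

@trace
def answer(query: str) -> str:
    docs = file_search(query)
    return f"Answered from {docs[0]}"

print(answer("pricing?"))               # Answered from doc1.md
print([s["name"] for s in TRACE_LOG])   # ['file_search', 'answer']
```

Because inner calls finish (and log) before their callers, the span list preserves the nesting of tool calls inside the LLM app's flow.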


Evaluation: The PromptFlow Evaluator SDK will help thoroughly assess the performance of your Assistant-powered application both during development and in operation. With the SDK’s built-in or custom evaluators, you can measure Assistant outputs as well as individual tools’ quality and safety with both code-based metrics and AI-assisted quality and safety evaluators. Built-in evaluators include performance and quality (groundedness, relevance, coherence, fluency, similarity, F1 score) and risk and safety (violence, sexual, self-harm, hate & unfairness).
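As an example of a code-based metric, here is a common token-overlap formulation of the F1 score between a generated answer and a reference answer. This is a sketch of the metric itself, not PromptFlow's implementation.

```python
from collections import Counter

def f1_score(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a generated answer and a reference answer."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(f1_score("the revenue grew 5 percent",
               "revenue grew 5 percent year over year"))
```

F1 balances precision (how much of the answer is supported by the reference) against recall (how much of the reference the answer covers), so padding an answer with extra words lowers the score.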


Azure OpenAI Assistants and AutoGen: AutoGen by Microsoft Research provides a multi-agent conversation framework to enable convenient building of LLM workflows across a wide range of applications. Azure OpenAI assistants are now integrated into AutoGen via GPTAssistantAgent, a new experimental agent that lets you seamlessly add Assistants into AutoGen-based multi-agent workflows. This enables multiple Azure OpenAI assistants, each potentially specialized for a task or domain, to collaborate and tackle complex tasks. Learn more about AutoGen here and how to use GPTAssistantAgent here.


New Models, Regions and Language Support 


GPT-4o Support (Public Preview coming in June 2024): GPT-4o is the latest preview model from OpenAI. GPT-4o integrates text and images in a single model, enabling it to handle multiple data types simultaneously. This multimodal approach enhances accuracy and responsiveness in human-computer interactions. We are announcing support for gpt-4o on Assistants to enable you to create interactive multi-modal experiences.

Finetuned Model Support (Public Preview): Use finetuned gpt-35-turbo (0125) in the Sweden Central and East US 2 regions. We will be expanding model and regional support for finetuned models in the future.


Vision Support (Public Preview coming in June 2024): We are announcing vision support for Assistants through gpt-4-turbo (0409) on Assistants. You can create messages with image URLs or uploaded files, and your assistant will use the visuals as part of its context for the conversation. 
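A sketch of a message mixing text with an image, following the documented content-part shape; the URL is a placeholder, and uploaded files use an image-file part instead of a URL:

```python
# A user message combining text and an image for a vision-capable model.
# The URL is a placeholder; for uploaded files the part would reference a file ID.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What trend does this chart show?"},
        {"type": "image_url",
         "image_url": {"url": "https://example.com/q2-revenue-chart.png"}},
    ],
}

kinds = [part["type"] for part in message["content"]]
print(kinds)  # ['text', 'image_url']
```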


Regional Availability Expansion (Public Preview): We have also expanded regional availability for Assistants to include Japan East, UK South, India South, West US and West US 3. For information on regional model availability, consult the model matrix for Assistants.


Python, JavaScript/TypeScript, .NET, Java and Go SDK Support (Public Preview): Python developers can start using Azure OpenAI Assistants right now with the OpenAI library for Python. We are also announcing a new Azure OpenAI client in the OpenAI library for JavaScript/TypeScript, giving developers access to the latest Azure OpenAI Assistants APIs through a unified SDK. .NET support will be available in June and Java/Go in July.


Now let's look at a few early adopters of Azure OpenAI Assistants who have unlocked net-new scenarios and accelerated their AI transformation.

Assistants Customer Showcase at MSFT Build 2024

(1) The Coca-Cola Company

The Coca-Cola Company has embarked on a transformative AI journey by integrating the Azure OpenAI Assistants API with its KO Assist tool, an enhanced business intelligence GPT that aims to redefine productivity for its 30,000+ associates. Specifically tailored to meet Coca-Cola's unique business and operational needs, KO Assist speeds up basic financial analysis and deeper dives into trends, aids in scenario planning and risk management, and enables the creation of standardized assistants for all organizational units, ensuring consistency and efficiency in internal communications and task execution.


“With Coke’s KO Assist powered by Azure OpenAI Assistants API, we are intersecting human ingenuity and technology to empower collaboration and co-creation within our workforce. Our associates love how KO Assist helps unlock their productivity and surfaces the right business insights at the right time from our enterprise data, effortlessly, with Assistants’ Code Interpreter and File Search tools. Our time-to-market with Assistants has been incredible. We were able to get KO Assist up and running within weeks, instead of months, with Azure’s enterprise promises from Day One. Not only have we been able to streamline our analytical processes, but we have also enhanced our competitive edge in the rapidly evolving beverage industry.”    - Punit Vir, VP Emerging Technologies, The Coca-Cola Company


[Image: Coke KO Assist]



(2) Freshworks Inc.

Freshworks Inc. makes it easy for companies to delight their customers and their employees. Its AI-powered customer and employee-service solutions increase efficiency and improve engagement for companies of all sizes. The result is happier customers and more productive employees. Headquartered in San Mateo, California, Freshworks operates around the world to serve more than 67,000 customers, including American Express, Bridgestone, Databricks, Fila, Nucor and Sony.  

“Our Freddy AI platform leverages the Assistants API from Azure OpenAI to enable our customers to build AI Agents with near zero configuration or coding. Assistants' advanced file search and parallel function calling capabilities provide intelligent, accurate responses from our customers' data corpuses spread within their enterprise. This enables the AI agent to take intelligent automated actions resulting in improved deflection rates and significant personalization."    - Ramesh Parthasarthy, Chief Architect, Freshworks




(3) Microsoft Copilot for Finance 

Microsoft Copilot for Finance is a new Copilot experience for Microsoft 365 that unlocks AI-assisted competencies for financial professionals, right from within productivity applications they use every day. Now available in public preview, Copilot for Finance connects to the organization’s financial systems, including Dynamics 365 and SAP, to provide role-specific workflow automation, guided actions, and recommendations in Microsoft Outlook, Excel, Microsoft Teams and other Microsoft 365 applications —helping to save time and focus on what truly matters: navigating the company to success.


Copilot for Finance is leveraging the Assistants API to create a specialized code-interpreting GPT Assistant for variance analysis to help optimize financial decision making. Users upload relevant datasets and provide specific, tailored instructions to guide the analysis. The Assistant then writes and executes code to perform calculations, transformations, and exploratory analysis. Over successive rounds of analysis, the model builds a better understanding of the queries, and users can ask follow-up questions to clarify and interpret the variance analysis results.
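To make the scenario concrete, here is the kind of budget-vs-actual variance calculation a Code Interpreter Assistant might write and run against an uploaded dataset; all figures here are made up for illustration.

```python
# Budget-vs-actual variance analysis of the kind a Code Interpreter
# Assistant might generate. All figures are illustrative.
lines = [
    {"item": "Revenue",   "budget": 1200.0, "actual": 1350.0},
    {"item": "COGS",      "budget": -700.0, "actual": -760.0},
    {"item": "Marketing", "budget": -150.0, "actual": -120.0},
]

for row in lines:
    row["variance"] = row["actual"] - row["budget"]          # absolute variance
    row["variance_pct"] = 100.0 * row["variance"] / abs(row["budget"])  # % of budget

for row in lines:
    print(f"{row['item']:<10} variance {row['variance']:+8.1f} "
          f"({row['variance_pct']:+.1f}%)")
```

The Assistant would typically follow a table like this with commentary, e.g. flagging that COGS overran budget while marketing spend came in under.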


[Image: Copilot for Finance with the Assistants API]





