Azure AI Translator announces Synchronous Document Translation

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

Seattle, Feb 21, 2024. Today, we are pleased to announce the public preview of the synchronous operation for the document translation feature of the Azure AI Translator service. This new operation lets users translate a document into a target language in real time.


Document translation enables users to translate complex documents in a variety of file formats, including Text, HTML, Word, Excel, PowerPoint, and Outlook messages, while preserving the source document’s format and layout. If the user does not know the language of the source document, the service detects it automatically. The user can also include an optional glossary of terms in the request, to be applied when translating the document.


Enterprise customers using the asynchronous batch operation of document translation have told us that employees who manage highly confidential documents are hesitant to upload them to their organization's shared cloud storage for translation. The new synchronous operation addresses this need by processing and translating the entire document in memory, so documents never need to be stored, even temporarily. The synchronous operation takes a document as part of the request, translates its textual content into the specified target language, and returns the translated document as part of the response. It supplements the asynchronous batch operation of document translation, which has been generally available since May 2021.

 

 

Document Translation

| Asynchronous batch operation | Synchronous operation (preview) |
|---|---|
| Asynchronously translates batches of up to 1000 documents, into up to 10 target languages in a single request. | Synchronously translates a single document into a single target language. |
| Upload the document to translate into Azure blob storage. In the request, send the Azure blob storage location URLs of source and target documents. | In the request, send the source document and get the translated document in the response. |
| Translate large documents of size up to 40 MB. | Translate a document of size up to 10 MB. |
| Supports translation of document formats including Text, HTML, Markdown, Office, Outlook message, PDF, and legacy Office and Open document formats. | Supports translation of document formats including Text, HTML, Markdown, Office, and Outlook message. |
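
The limits in the comparison above can be turned into a quick routing check. The sketch below is illustrative only: `choose_operation` is a hypothetical helper, not part of any Azure SDK, and it assumes that 1 MB means 1024 × 1024 bytes. It does not check file formats, which also differ between the two operations.

```python
# Limits taken from the comparison above; the byte conversion is an assumption.
MB = 1024 * 1024
SYNC_MAX_BYTES = 10 * MB
ASYNC_MAX_BYTES = 40 * MB
ASYNC_MAX_DOCUMENTS = 1000
ASYNC_MAX_TARGETS = 10


def choose_operation(file_sizes, target_languages):
    """Pick an operation for a request (hypothetical helper).

    file_sizes: list of document sizes in bytes.
    target_languages: list of target language codes.
    Returns "synchronous" or "asynchronous"; raises ValueError if the
    request fits neither set of limits.
    """
    if not file_sizes or not target_languages:
        raise ValueError("need at least one document and one target language")

    # Synchronous: one document, one target language, up to 10 MB.
    if (
        len(file_sizes) == 1
        and len(target_languages) == 1
        and file_sizes[0] <= SYNC_MAX_BYTES
    ):
        return "synchronous"

    # Asynchronous batch: up to 1000 documents, 10 targets, 40 MB each.
    if (
        len(file_sizes) <= ASYNC_MAX_DOCUMENTS
        and len(target_languages) <= ASYNC_MAX_TARGETS
        and max(file_sizes) <= ASYNC_MAX_BYTES
    ):
        return "asynchronous"

    raise ValueError("request exceeds the limits listed in the comparison")
```

A single 5 MB document with one target language would be routed to the synchronous operation, while the same document with two target languages would fall back to the asynchronous batch operation.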

 

The synchronous operation of document translation is priced at the same rate as the asynchronous batch operation.

 

To try the synchronous operation of document translation, you need an active Azure subscription and an Azure AI Translator resource. Use the following code samples to try it out.

 

Sample curl command to translate a document:

 

curl -i -X POST "{document-translation-endpoint}/translator/document:translate?sourceLanguage={language_code}&targetLanguage={language_code}&api-version=2023-11-01-preview" \
  -H "Ocp-Apim-Subscription-Key:{Your resource key}" \
  --form "document=@{full-path-to-source-file};type={content-type}/{file-extension}" \
  --output "{full-path-to-translated-file}"

 

Python code sample to translate a document:

 

import requests
import os

# Construct URL
endpoint = "<Your document translation endpoint>"
path = "/translator/document:translate"
url = endpoint + path

headers = {
    "Ocp-Apim-Subscription-Key": "<Your resource key>"
}

# Define the parameters
# Get list of supported languages and code here: https://aka.ms/TranslatorLanguageCodes
params = {
    "sourceLanguage": "<source language code>",
    "targetLanguage": "<target language code>",
    "api-version": "2023-11-01-preview"
}

# Include full path, file name and extension
input_file = "<full path to source file>"
output_file = "<full path to translated file>"

# Open the input file in binary mode
with open(input_file, "rb") as document:
    # Define the data to be sent
    # Find list of supported content types here: https://aka.ms/dtsync-content-type
    data = {
        "document": (os.path.basename(input_file), document, "<Your file content type>")
    }

    # Send the POST request
    response = requests.post(url, headers=headers, files=data, params=params)

# Write the response content to a file
with open(output_file, "wb") as output_document:
    output_document.write(response.content)
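
The samples above always pass a source language and no glossary, while the announcement mentions both language autodetection and an optional glossary. The sketch below shows one way the request could be assembled for those cases. It rests on assumptions that should be checked against the service documentation: that omitting `sourceLanguage` triggers autodetection, and that the glossary is sent as a second multipart field named `glossary` with content type `text/csv`. The helpers `build_url` and `build_form_fields` are hypothetical names used only for illustration.

```python
import os
from urllib.parse import urlencode

API_VERSION = "2023-11-01-preview"  # preview version used in the samples above


def build_url(endpoint, target_language, source_language=None):
    """Build the request URL (hypothetical helper).

    Assumption: leaving out sourceLanguage lets the service autodetect it.
    """
    query = {"targetLanguage": target_language, "api-version": API_VERSION}
    if source_language is not None:
        query["sourceLanguage"] = source_language
    base = endpoint.rstrip("/") + "/translator/document:translate"
    return base + "?" + urlencode(query)


def build_form_fields(document_path, document_type,
                      glossary_path=None, glossary_type="text/csv"):
    """Describe the multipart fields as {field: (file name, path, content type)}.

    Assumption: the optional glossary travels in a form field named "glossary".
    """
    fields = {
        "document": (os.path.basename(document_path), document_path, document_type)
    }
    if glossary_path is not None:
        fields["glossary"] = (
            os.path.basename(glossary_path), glossary_path, glossary_type
        )
    return fields


if __name__ == "__main__":
    # Placeholder endpoint; replace with your own resource endpoint.
    print(build_url("https://example.cognitiveservices.azure.com", "fr"))
```

To send the request, each path in the returned fields would be opened in binary mode and passed to `requests.post` through the `files` argument, in the same way the Python sample above passes `document`.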

 
