Azure Translation API and Azure blob storage

Problem: I am using the Azure Translator API (S1 instance) to translate Polish text into English. When I save the translated text on my local machine it looks fine, but when I store it in Azure Blob Storage the non-ASCII characters come out as escape sequences.

Translated text as stored: "title": "Built-up property located in Rgielew, K\u0142odawa commune"

Desired/correct text: "title": "Built-up property located in Rgielew, Kłodawa commune" or "title": "Built-up property located in Rgielew, Klodawa commune"

I am using a basic template that I found on the web; the code is below.

import requests
import uuid
import json
import os

def translate_dict(input_dict, source_language, target_languages):
    # Convert dictionary to JSON text
    json_text = json.dumps(input_dict, ensure_ascii=False)

    # Add your key and endpoint
    key = os.getenv("translation_key")
    endpoint = os.getenv("translation_endpoint")

    # location, also known as region.
    # required if you're using a multi-service or regional (not global) resource.
    location = os.getenv("translation_location")

    path = '/translate'
    constructed_url = endpoint + path

    params = {
        'api-version': '3.0',
        'from': source_language,
        'to': target_languages
    }

    headers = {
        'Ocp-Apim-Subscription-Key': key,
        # location required if you're using a multi-service or regional (not global) resource.
        'Ocp-Apim-Subscription-Region': location,
        'Content-type': 'application/json',
        'X-ClientTraceId': str(uuid.uuid4())
    }

    # Create body for translation
    body = [{'text': json_text}]

    # Make translation request
    request = requests.post(constructed_url, params=params, headers=headers, json=body)
    response = request.json()

    # Extract translated text from the response
    translated_json_text = response[0]['translations'][0]['text']

    # Convert the translated text back to a dictionary
    translated_dict = json.loads(translated_json_text)

    return translated_dict

If anyone has worked with the Azure Translator API, please let me know how to handle this case. Thank you!

I searched the web to see whether anyone had faced a similar issue. Saving the JSON files locally works fine; the problem only appears when I store them in Azure Blob Storage.

There is 1 answer

Venkatesan

From the question: "When I am trying to store it in Azure Blob Storage I am getting unwanted things." Example translated text: "title": "Built-up property located in Rgielew, K\u0142odawa commune". Desired/correct text: "title": "Built-up property located in Rgielew, Kłodawa commune" or "title": "Built-up property located in Rgielew, Klodawa commune".

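You can use the following modified code to get the desired output when translating Polish text to English with Python. The relevant change is that the translated dictionary is serialized with json.dumps(..., ensure_ascii=False) before it is uploaded, so characters such as "ł" are written as-is rather than as \u0142.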

Code:

import os
import requests
import uuid
import json
from azure.storage.blob import BlobServiceClient

def translate_dict():
    input_dict = {"title": "Nieruchomość zabudowana położona w Rgielewie, gmina Kłodawa"}
    source_language = "pl"
    target_languages = "en"
    json_text = json.dumps(input_dict, ensure_ascii=False)
    key = os.getenv("translation_key")
    endpoint = os.getenv("translation_endpoint")
    location = os.getenv("translation_location")
    path = '/translate'
    constructed_url = endpoint + path

    params = {
        'api-version': '3.0',
        'from': source_language,
        'to': target_languages
    }

    headers = {
        'Ocp-Apim-Subscription-Key': key,
        'Ocp-Apim-Subscription-Region': location,
        'Content-type': 'application/json',
        'X-ClientTraceId': str(uuid.uuid4())
    }

    body = [{'text': json_text}]

    request = requests.post(constructed_url, params=params, headers=headers, json=body)
    response = request.json()
    translated_json_text = response[0]['translations'][0]['text']
    translated_dict = json.loads(translated_json_text)
    # ensure_ascii=False keeps characters such as "ł" as-is instead of "\u0142"
    encoded_text = json.dumps(translated_dict, ensure_ascii=False)

    connection_string = "<Storage connection string>"
    container_name = "test"
    blob_name = "sample.json"
    blob_service_client = BlobServiceClient.from_connection_string(connection_string)
    container_client = blob_service_client.get_container_client(container_name)
    blob_client = container_client.get_blob_client(blob_name)
    blob_client.upload_blob(encoded_text, overwrite=True)

translate_dict()

The above code translates a dictionary from one language (Polish) to another (English) using Azure Cognitive Services Translator APIs and saves the translated dictionary as a JSON file in an Azure Blob Storage container.
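
As a side note on why the escapes most likely appear: Python's json.dumps defaults to ensure_ascii=True, which writes every non-ASCII character as a \uXXXX escape. Both forms are valid JSON and parse back to the same value, so an escaped file is not corrupted, only harder to read. A minimal sketch using only the standard library (the sample dictionary is the translated text from the question):

```python
import json

translated = {"title": "Built-up property located in Rgielew, Kłodawa commune"}

# Default behaviour (ensure_ascii=True): non-ASCII characters are escaped.
escaped = json.dumps(translated)

# With ensure_ascii=False the characters are written as-is.
readable = json.dumps(translated, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode to the same dictionary.
same = json.loads(escaped) == json.loads(readable)
print(same)
```

So if the code that writes to Blob Storage calls json.dumps without ensure_ascii=False, that alone would explain the difference from the local file.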

Portal:

The desired output is "Built-up property located in Rgielew, Kłodawa commune".
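
If the plain-ASCII variant mentioned in the question ("Klodawa") is preferred instead, the diacritics can be stripped before serializing. The sketch below is an assumption on my part rather than part of the Translator API: to_ascii is a hypothetical helper, and the mapping table only covers "ł"/"Ł", which need explicit handling because Unicode defines no decomposition for them.

```python
import unicodedata

# Characters with no Unicode decomposition need an explicit mapping.
# This table is illustrative and only covers the Polish "ł"/"Ł".
_NO_DECOMPOSITION = str.maketrans({"ł": "l", "Ł": "L"})

def to_ascii(text: str) -> str:
    """Strip diacritics from text (hypothetical helper, best effort)."""
    mapped = text.translate(_NO_DECOMPOSITION)
    # NFKD splits letters such as "ó" into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", mapped)
    # Drop the combining marks and keep the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(to_ascii("Kłodawa"))  # prints "Klodawa"
```

This loses information, so it is probably only worth doing when the consumer of the file cannot handle UTF-8.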

If you need to upload the result as plain text, you can use this modified upload step, which encodes the string to UTF-8 bytes explicitly:

    connection_string = "xxx"
    container_name = "test"
    blob_name = "sample.txt"
    blob_service_client = BlobServiceClient.from_connection_string(connection_string)
    container_client = blob_service_client.get_container_client(container_name)
    blob_client = container_client.get_blob_client(blob_name)
    blob_client.upload_blob(encoded_text.encode('utf-8'), overwrite=True)
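
Whichever upload variant is used, the blob should end up holding UTF-8 bytes, so whatever reads it back needs to decode it as UTF-8 before parsing. The sketch below simulates that round trip locally with plain bytes. No storage call is made, and payload is only a stand-in for the downloaded blob content.

```python
import json

translated = {"title": "Built-up property located in Rgielew, Kłodawa commune"}

# What the upload step sends: the JSON text encoded as UTF-8 bytes.
payload = json.dumps(translated, ensure_ascii=False).encode("utf-8")

# "ł" occupies two bytes in UTF-8 (0xC5 0x82), so it is stored unescaped.
has_raw_char = b"\xc5\x82" in payload

# What a reader should do with the downloaded bytes.
restored = json.loads(payload.decode("utf-8"))
print(restored["title"])
```

If a viewer still shows garbled characters for such a file, the bytes are usually fine and the viewer is decoding them with the wrong encoding.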

Reference:

Translator Translate Method - Azure AI services | Microsoft Learn