Using HuggingFace Inference Endpoint for translation


Setup

I've created a HF Inference Endpoint for translation, in particular French -> English. This is the setup:

  • Instance type: GPU · Nvidia A10G · 1x GPU · 24 GB
  • Model: Helsinki-NLP/opus-mt-fr-en
  • Container: default

I have 2k documents to translate and I'm using the Python package requests, together with concurrent.futures, to fire multiple HTTP POST requests at my endpoint. Each document has 100-300 sentences and I wasn't able to translate them in one go, so I split each document into 6 sections (roughly as sketched below).
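For reference, the splitting looks roughly like this (the helper name split_into_sections and the fixed count of 6 are just how I sketch it here):

# Illustrative only: split a document's sentences into n roughly equal sections.
def split_into_sections(sentences, n_sections=6):
    # Ceiling division so the last section picks up any remainder.
    size = max(1, (len(sentences) + n_sections - 1) // n_sections)
    return [sentences[i:i + size] for i in range(0, len(sentences), size)]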

Checking the machine's usage, I realise I barely use any CPU/GPU resources. Also, after a while, some requests start failing.

I'm quite sure this could be much faster and could process more text at once, but I can't figure out how. My data is in a PySpark dataframe: I first tried a UDF (and it was a terrible idea); now I'm collecting the documents locally and using that as input (roughly as sketched below).
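Building the input from Spark looks roughly like this (the column names doc_id and sentences are placeholders for my actual schema):

# Collect the dataframe to the driver and build the id -> sentences mapping.
# Column names "doc_id" and "sentences" are placeholders.
rows = df.select("doc_id", "sentences").collect()
text_dict = {row["doc_id"]: row["sentences"] for row in rows}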

Here's the code I'm using:

import concurrent.futures

import requests


def translate_text(text):
    # POST one chunk of text to the endpoint and return the translated strings.
    payload = {"inputs": text}
    headers = {"Authorization": f"Bearer {API_TOKEN}", "Content-Type": "application/json"}

    try:
        response = requests.post(API_URL, json=payload, headers=headers)
        response.raise_for_status()

        if response.status_code == 200 and response.text:
            # The endpoint returns a list of {"translation_text": ...} objects.
            response_data = response.json()
            return [item.get("translation_text") for item in response_data]
        else:
            print("Translation response is empty or not in JSON format.")
            return None

    # JSONDecodeError subclasses RequestException, so it has to be caught first
    # or this handler would never be reached.
    except requests.exceptions.JSONDecodeError as e:
        print("Failed to decode JSON response:", e)
        return None

    except requests.exceptions.RequestException as e:
        print("Request error:", e)
        return None

def translate_text_dict(text_dict: dict) -> dict:
    # text_dict maps a unique document id to its list of text chunks.
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = executor.map(
            # One worker per document: translate its chunks sequentially.
            lambda args: (args[0], list(map(translate_text, args[1]))),
            text_dict.items(),
        )
        output_dict = dict(results)

    return output_dict

The input dictionary has a unique id as key and the list of sentences to translate as value. Could this be improved?
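For completeness, a call looks roughly like this (the ids and sentences below are made up):

# Hypothetical input: unique id -> list of French text chunks.
text_dict = {
    "doc-001": ["Bonjour tout le monde.", "Comment allez-vous ?"],
    "doc-002": ["Ceci est un exemple de phrase."],
}

translated = translate_text_dict(text_dict)
# Each value should come back as a list of per-chunk translation lists,
# e.g. translated["doc-001"] ~ [["Hello everyone."], ["How are you?"]]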
