Running blenderbot-3B model locally does not provide same result as on Inference API

Question

Asked by BlackHawk on 25 March 2022 at 18:08 (461 views)

I tried the facebook/blenderbot-3B model through the Hosted Inference API (https://huggingface.co/facebook/blenderbot-3B) and it works pretty well. I then tried to run it locally with the Python script below, but the generated responses are much worse than those from the Inference API and do not make sense most of the time.

Does the Inference API use different code, or did I make a mistake?

from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration
import torch

device = "cuda:0" if torch.cuda.is_available() else "cpu"
chat_bots = {
    'BlenderBot': [BlenderbotTokenizer.from_pretrained("hyunwoongko/blenderbot-9B"), BlenderbotForConditionalGeneration.from_pretrained("hyunwoongko/blenderbot-9B").to(device)],
}
key = 'BlenderBot'
tokenizer, model = chat_bots[key]

for step in range(100):
    new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt').to(device)
    if step > 0:
        bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1)
    else:
        bot_input_ids = new_user_input_ids

    chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id).to(device)

    print("Bot: ", tokenizer.batch_decode(chat_history_ids, skip_special_tokens=True)[0])

There is 1 answer

**Mautoz** · Answer 1 · 2023-03-13T15:31:41+00:00

Your code doesn't send an API request; it runs the model on your own machine (on your GPU, if one is available). Note also that the model name in your program is "hyunwoongko/blenderbot-9B", not blenderbot-3B, so this code runs a different model.

The code below runs blenderbot-3B on your own GPU. For API requests you should use "InferenceApi", but I haven't tried it.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "facebook/blenderbot-3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load the model and move it to the GPU (requires CUDA).
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).cuda()

while True:
    text = input("Input: ")
    # Tokenize the prompt and move it to the same device as the model.
    input_ids = tokenizer.encode(text, return_tensors="pt").cuda()
    out = model.generate(input_ids)
    # Decode the first (and only) sequence, dropping special tokens.
    generated_text = tokenizer.decode(out[0], skip_special_tokens=True)
    print("Output: " + generated_text)
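The loop above treats every input as a fresh conversation, whereas the script in the question tried to carry history by concatenating token ids. As a rough sketch of one way to keep multi-turn context with a seq2seq model, the history can be stored as text and trimmed to the input limit before each generation. Everything here is illustrative: `build_prompt`, `chat_turn`, the two-space separator, and the 128-token default are assumptions, not a confirmed description of what the Inference API does. `count_tokens` and `generate_reply` are hypothetical callables that you would back with the tokenizer and the model.

```python
from typing import Callable, List


def build_prompt(
    turns: List[str],
    count_tokens: Callable[[str], int],
    max_tokens: int = 128,  # assumed input limit; check your tokenizer
    sep: str = "  ",        # assumed separator between turns
) -> str:
    """Join the most recent turns that fit within max_tokens.

    The latest turn is always kept, even if it alone exceeds the limit.
    """
    kept: List[str] = []
    # Walk backwards so the newest turns are preferred over old ones.
    for turn in reversed(turns):
        candidate = sep.join([turn] + kept)
        if kept and count_tokens(candidate) > max_tokens:
            break
        kept.insert(0, turn)
    return sep.join(kept)


def chat_turn(
    turns: List[str],
    user_text: str,
    generate_reply: Callable[[str], str],
    count_tokens: Callable[[str], int],
    max_tokens: int = 128,
) -> str:
    """Record the user turn, generate from the trimmed history, record the reply."""
    turns.append(user_text)
    reply = generate_reply(build_prompt(turns, count_tokens, max_tokens))
    turns.append(reply)
    return reply
```

With the objects from the answer, `count_tokens` could be something like `lambda s: len(tokenizer.encode(s))`, and `generate_reply` a small function that encodes the prompt, calls `model.generate`, and decodes the first output. Keeping the history as text avoids feeding generated decoder ids back into the encoder, which is what the original script did.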
