How can I re-train a LLaMA 2 Text Generation model into a Sequence-to-Sequence model?


LLaMA 2 is a text generation model. Is it possible to re-train the model so it can do sequence-to-sequence generation, such as translation? I can access LLaMA 2 via the HuggingFace platform.

Alternatively, should I write a prompt asking LLaMA 2 to translate, and train its translation ability in a question-and-answer style?

Thanks.

1 Answer

Answered by Tolga Aktas (accepted answer):

It is definitely possible, and translation is one of the stronger capabilities of such models. Many translation models have been trained with an encoder-decoder transformer structure, including the model in the very first transformer paper ("Attention Is All You Need").

While you can fine-tune the model specifically for translation (a sketch of that follows), you could also try zero-shot or few-shot prompting at inference time, which should still give good results. Have you tried that, and was the performance good enough?
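For the fine-tuning route, here is a minimal sketch of instruction-style causal-LM fine-tuning with LoRA adapters via the transformers and peft libraries. The model name, prompt template, toy data, and hyperparameters are illustrative assumptions, not a verified recipe; a real run would use a proper parallel corpus and more careful training settings:

```python
# Minimal sketch: adapt LLaMA 2 to translation by fine-tuning on
# prompt-formatted parallel text with LoRA. Illustrative only.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"  # gated: requires Hub access approval
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the base model with small trainable LoRA adapters so only a
# fraction of parameters are updated.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

# Toy parallel data; in practice use a real corpus (e.g. WMT).
pairs = [{"src": "Good morning.", "tgt": "Guten Morgen."}]

def to_prompt(ex):
    # Format each pair as a single causal-LM training example.
    text = (f"Translate English to German.\n"
            f"English: {ex['src']}\nGerman: {ex['tgt']}")
    return tokenizer(text + tokenizer.eos_token,
                     truncation=True, max_length=256)

ds = Dataset.from_list(pairs).map(to_prompt, remove_columns=["src", "tgt"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama2-translate",
                           num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=ds,
    # mlm=False gives next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

At inference you would feed the same prompt template with the German side left empty and let the model complete it.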
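And for the prompting route, a minimal few-shot sketch using the text-generation pipeline; again, the model name and example sentences are assumptions for illustration:

```python
# Minimal sketch: few-shot translation with the base model and no
# fine-tuning. The in-context examples steer the completion.
from transformers import pipeline

generator = pipeline("text-generation", model="meta-llama/Llama-2-7b-hf")

prompt = (
    "Translate English to German.\n"
    "English: The weather is nice today.\nGerman: Das Wetter ist heute schön.\n"
    "English: Where is the train station?\nGerman: Wo ist der Bahnhof?\n"
    "English: I would like a cup of coffee.\nGerman:"
)

out = generator(prompt, max_new_tokens=30, do_sample=False)
print(out[0]["generated_text"])
```

If few-shot quality is already acceptable for your language pair, you may not need to fine-tune at all.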