How do I know the right data format for fine-tuning different LLMs?


I have seen people use different data formats to fine-tune different LLMs. For example, the following format can be used for Llama-2:

  {
    "instruction": "",
    "input": "",
    "output": ""
  }

while the following format is sometimes used for ChatGLM2-6B:

  {
    "content": "",
    "summary": ""
  }

Is the format tied to the one used for pre-training, or can either format be used with different LLMs? How should I organize my custom data if I want to fine-tune an LLM?
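To make the comparison concrete, here is a minimal sketch of how I understand the two formats, assuming a training script eventually flattens each record into a prompt/response pair. The field names come from the two examples above; the function names and the way "instruction" and "input" are joined are hypothetical and not taken from either model's repository.

```python
def from_instruction_record(rec):
    """Flatten an instruction/input/output record into (prompt, response)."""
    prompt = rec["instruction"]
    # "input" is optional context; append it only when it is non-empty.
    # Joining with a blank line is an assumption, not an official template.
    if rec.get("input"):
        prompt = f"{prompt}\n\n{rec['input']}"
    return prompt, rec["output"]


def from_content_summary_record(rec):
    """Flatten a content/summary record into (prompt, response)."""
    return rec["content"], rec["summary"]


def to_pair(rec):
    """Pick a converter based on which keys the record contains."""
    if "instruction" in rec:
        return from_instruction_record(rec)
    if "content" in rec:
        return from_content_summary_record(rec)
    raise ValueError(f"Unrecognized record keys: {sorted(rec)}")


# Hypothetical sample records, one per format.
records = [
    {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"},
    {"content": "A long article text.", "summary": "A short summary."},
]

pairs = [to_pair(r) for r in records]
for prompt, response in pairs:
    print(repr(prompt), "->", repr(response))
```

If this view is right, both formats carry the same information (what the model sees and what it should produce), and the key names would mainly reflect what a particular training script expects.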

