Struggling with Hugging Face PEFT


For an exercise, I trained GPT-2 on a dataset for sequence classification (binary sentiment classification). Specifically, I trained the freshly initialized classification head that AutoModelForSequenceClassification.from_pretrained("gpt2") comes with, following the warning message

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
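
For reference, the model and tokenizer were set up along these lines (abbreviated; the pad-token handling is an assumption on my part about what is needed, since GPT-2 ships without a pad token and padded batches require one):

from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("gpt2", num_labels=2)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# GPT-2 has no pad token; the classification model locates the last
# non-padding token of each sequence, so a pad token must be defined.
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id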

Then I used LoRA from the PEFT library to fine-tune the trained model further, i.e.,

from transformers import Trainer
from peft import LoraConfig, TaskType, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=8,
    task_type=TaskType.SEQ_CLS,
    fan_in_fan_out=True     # GPT-2's Conv1D layers require this
)

lora_model = get_peft_model(model, lora_config)  # model is the head-trained GPT-2 model from above
lora_trainer = Trainer(
    model=lora_model,
    ...
)
lora_trainer.train()
lora_model.save_pretrained('gpt-2_lora')

which worked well up to this point. Then, when loading the model back, i.e.,

from peft import AutoPeftModelForSequenceClassification

lora_model = AutoPeftModelForSequenceClassification.from_pretrained(
    'gpt-2_lora',
    num_labels=2,
    id2label=id2label,  # some label mapping
    label2id=label2id   # details not important here
)

I get the same warning:

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

So apparently, PEFT ignores the trained classification head of GPT-2 and reinitializes it from the base gpt2 checkpoint instead.
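
To verify where the head gets lost, one could inspect what was actually saved with the adapter; this assumes a recent PEFT version that writes adapter_model.safetensors (older versions write adapter_model.bin):

from safetensors.torch import load_file

# List the keys stored in the adapter checkpoint; if no 'score' entries
# show up here, the trained head never made it into the saved files.
adapter_state = load_file('gpt-2_lora/adapter_model.safetensors')
print([k for k in adapter_state if 'score' in k])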

Question #1: What do I need to do to make PEFT pick up the fully trained version of GPT-2?
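
My best guess so far is that the classification head has to be listed explicitly via the modules_to_save option of LoraConfig, so that it is stored with the adapter and restored on loading; this is untested, and I don't know whether task_type=TaskType.SEQ_CLS is supposed to handle it automatically:

lora_config = LoraConfig(
    r=8,
    lora_alpha=8,
    task_type=TaskType.SEQ_CLS,
    fan_in_fan_out=True,
    modules_to_save=['score']   # also save (and fully train) the classification head
)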

Ignoring this for the moment, I wanted to proceed with an evaluation of the PEFT model, i.e.,

import evaluate
from evaluate import evaluator

task_evaluator = evaluator('text-classification')
eval_results = task_evaluator.compute(
    model_or_pipeline=lora_model,
    tokenizer=tokenizer,
    data=ds_test,
    input_column=dataset_textfield_name,
    metric=evaluate.combine(['accuracy', 'f1']),
    label_mapping=label2id
)
print(eval_results)

which earned me a lengthy error message:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/peft/peft_model.py:529, in PeftModel.__getattr__(self, name)
    528 try:
--> 529     return super().__getattr__(name)  # defer to nn.Module's logic
    530 except AttributeError:

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1614, in Module.__getattr__(self, name)
   1613         return modules[name]
-> 1614 raise AttributeError("'{}' object has no attribute '{}'".format(
   1615     type(self).__name__, name))

AttributeError: 'PeftModelForSequenceClassification' object has no attribute 'task'

During handling of the above exception, another exception occurred:

... [more messages of similar kind]

AttributeError: 'GPT2ForSequenceClassification' object has no attribute 'task'

Question #2: What is wrong here, and what do I need to do?

The task evaluator works when used on plain GPT-2 (without PEFT).
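
Since the evaluator handles the plain model fine, one workaround I am considering is merging the LoRA weights back into the base model before evaluating; this assumes merge_and_unload is the right call for a LoRA adapter (untested):

# Fold the LoRA weights into the base weights, yielding a plain
# GPT2ForSequenceClassification that the evaluator knows how to handle.
merged_model = lora_model.merge_and_unload()

eval_results = task_evaluator.compute(
    model_or_pipeline=merged_model,
    tokenizer=tokenizer,
    data=ds_test,
    input_column=dataset_textfield_name,
    metric=evaluate.combine(['accuracy', 'f1']),
    label_mapping=label2id
)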
