Finetune a model with LoRA, then load it in its vanilla architecture


I am following this code to finetune openai-whisper with LoRA. The output is a PeftModel. I want to load it (or do the finetuning) in such a way that I can get back the vanilla Whisper architecture.

The steps of the finetuning are:

1. Instantiate a base model:

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny", load_in_8bit=False)

2. Create a configuration (LoraConfig) where you define LoRA-specific parameters:

config = LoraConfig(r=32, lora_alpha=64, target_modules=["q_proj", "v_proj"], lora_dropout=0.05, bias="none")

3. Wrap the base model with get_peft_model() to get a trainable PeftModel:

model = get_peft_model(model, config)

4. Train the PeftModel as you normally would train the base model:

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    train_dataset=dataset_dict["train"],
    eval_dataset=dataset_dict["test"],
    data_collator=data_collator,
    tokenizer=processor.feature_extractor,
    callbacks=[SavePeftModelCallback],
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!

trainer.train()

I save the model using trainer.save_model(MODEL_SAVE_FOLDER_NAME). I load it as follows:

peft_model_id = "reach-vb/test" # Use the same model ID as before.
language = "en"
task = "transcribe"
peft_config = PeftConfig.from_pretrained(peft_model_id)
model = WhisperForConditionalGeneration.from_pretrained(
    peft_config.base_model_name_or_path, load_in_8bit=False, device_map="auto"
)

model = PeftModel.from_pretrained(model, peft_model_id)

Now this model is a PeftModel, as I can see from its architecture.

I want to load it (or do the finetuning) in such a way that I can get back the vanilla Whisper architecture. (It would be even better if it were not the Hugging Face architecture but the actual openai-whisper architecture.)


There are 0 answers