I am following this code to finetune openai-whisper with LoRA. The output is a PeftModel, and I want to load it (or do the finetuning) in such a way that I can get back the vanilla Whisper architecture.
The steps of finetuning are:
1. Instantiate a base model.
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny", load_in_8bit=False)
2. Create a configuration (LoraConfig) where you define LoRA-specific parameters.
config = LoraConfig(r=32, lora_alpha=64, target_modules=["q_proj", "v_proj"], lora_dropout=0.05, bias="none")
3. Wrap the base model with get_peft_model() to get a trainable PeftModel.
model = get_peft_model(model, config)
4. Train the PeftModel as you would normally train the base model.
trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    train_dataset=dataset_dict["train"],
    eval_dataset=dataset_dict["test"],
    data_collator=data_collator,
    tokenizer=processor.feature_extractor,
    callbacks=[SavePeftModelCallback],
)
model.config.use_cache = False # silence the warnings. Please re-enable for inference!
trainer.train()
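For context on my mental model of what get_peft_model() does here: each targeted matrix (q_proj, v_proj) gets a frozen base weight plus a trainable low-rank update scaled by lora_alpha / r. A toy numpy sketch of that (whisper-tiny's hidden size 384 and the LoraConfig values above; this is my own illustration, not the PEFT internals):

```python
import numpy as np

# Hidden size of whisper-tiny and the LoraConfig above (r=32, lora_alpha=64).
d, r, alpha = 384, 32, 64

rng = np.random.default_rng(0)
W = rng.normal(size=(d, d))          # frozen base weight (e.g. q_proj)
A = rng.normal(size=(r, d)) * 0.01   # trainable LoRA down-projection
B = np.zeros((d, r))                 # trainable LoRA up-projection (zero-init)

x = rng.normal(size=(d,))
scale = alpha / r

# LoRA forward pass: base output plus the scaled low-rank update.
y = W @ x + scale * (B @ (A @ x))

# With B initialised to zero, the wrapped model starts out identical to the base.
assert np.allclose(y, W @ x)

# Trainable parameters per adapted matrix: 2*r*d instead of d*d.
print(2 * r * d, "trainable vs", d * d, "frozen")
```

So the PeftModel is just the base model with these extra A/B matrices attached, which is why the saved checkpoint contains only the adapters.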
I save the model using trainer.save_model(MODEL_SAVE_FOLDER_NAME) and load it back as follows:
peft_model_id = "reach-vb/test" # Use the same model ID as before.
language = "en"
task = "transcribe"
peft_config = PeftConfig.from_pretrained(peft_model_id)
model = WhisperForConditionalGeneration.from_pretrained(
peft_config.base_model_name_or_path, load_in_8bit=False, device_map="auto"
)
model = PeftModel.from_pretrained(model, peft_model_id)
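If I understand correctly, the LoRA update is purely additive, so folding it into the base weight should give a plain matrix that behaves identically, which is presumably what PEFT's merge_and_unload() does. A toy numpy sketch of that identity (my own illustration, not the PEFT internals):

```python
import numpy as np

# Folding a LoRA adapter back into the base weight gives a plain matrix
# that produces the same outputs, with the vanilla shape.
d, r, alpha = 384, 32, 64
rng = np.random.default_rng(1)
W = rng.normal(size=(d, d))   # frozen base weight
A = rng.normal(size=(r, d))   # trained LoRA down-projection
B = rng.normal(size=(d, r))   # trained LoRA up-projection
x = rng.normal(size=(d,))
scale = alpha / r

adapted = W @ x + scale * (B @ (A @ x))   # PeftModel-style forward pass
W_merged = W + scale * (B @ A)            # merged weight, same shape as W

assert np.allclose(W_merged @ x, adapted)
```

So in principle a merged checkpoint needs no PEFT wrapper at all, but I am unsure how to actually produce and save such a model from the code above.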
Now this model is a PeftModel, as I can tell from its printed architecture.
I want to load it (or do the finetuning) in such a way that I can get back the vanilla Whisper architecture (even better if it's not the Hugging Face architecture but the actual openai-whisper architecture).