As the title says, I want to merge my PEFT LoRA adapter model (ArcturusAI/Crystalline-1.1B-v23.12-tagger), which I trained earlier, with the base model (TinyLlama/TinyLlama-1.1B-Chat-v0.6) to produce a new standalone model.
And I got this code from ChatGPT:
from transformers import AutoModel, AutoConfig
# Load the pretrained model and LoRA adapter
pretrained_model_name = "TinyLlama/TinyLlama-1.1B-Chat-v0.6"
pretrained_model = AutoModel.from_pretrained(pretrained_model_name)
lora_adapter = AutoModel.from_pretrained("ArcturusAI/Crystalline-1.1B-v23.12-tagger")
# Assuming the models have the same architecture (encoder, decoder, etc.)
# Get the weights of each model
pretrained_weights = pretrained_model.state_dict()
lora_adapter_weights = lora_adapter.state_dict()
# Combine the weights (adjust the weights based on your preference)
combined_weights = {}
for key in pretrained_weights:
    combined_weights[key] = 0.8 * pretrained_weights[key] + 0.2 * lora_adapter_weights[key]
# Load the combined weights into the pretrained model
pretrained_model.load_state_dict(combined_weights)
# Save the integrated model
pretrained_model.save_pretrained("ArcturusAI/Crystalline-1.1B-v23.12-tagger-fullmodel")
And I got this error:
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-1-d2120d727884> in <cell line: 6>()
4 pretrained_model_name = "TinyLlama/TinyLlama-1.1B-Chat-v0.6"
5 pretrained_model = AutoModel.from_pretrained(pretrained_model_name)
----> 6 lora_adapter = AutoModel.from_pretrained("ArcturusAI/Crystalline-1.1B-v23.12-tagger")
7
8 # Assuming the models have the same architecture (encoder, decoder, etc.)
1 frames
/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
3096 )
3097 else:
-> 3098 raise EnvironmentError(
3099 f"{pretrained_model_name_or_path} does not appear to have a file named"
3100 f" {_add_variant(WEIGHTS_NAME, variant)}, {TF2_WEIGHTS_NAME}, {TF_WEIGHTS_NAME} or"
OSError: ArcturusAI/Crystalline-1.1B-v23.12-tagger does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.
I have no idea what I did wrong there. I would appreciate it if anyone could teach me how to fix it. Or am I going in a completely wrong direction? Thank you.
I tried using transformers and pytorch, and I expected them to merge both models and create a new model out of it.
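The adapter can't be loaded with AutoModel from transformers: the adapter repo holds only the LoRA weights and an adapter_config.json, not a full model checkpoint, which is what the OSError is telling you. The merging suggested by ChatGPT won't work either, because merging a LoRA adapter means adding its scaled low-rank update to the targeted base weights, not averaging two state dicts. Luckily you don't need to rely on AI for that. The peft library has everything ready for you with merge_and_unload.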
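A minimal sketch of the merge_and_unload approach, under some assumptions: the adapter was trained for causal language modeling (so AutoModelForCausalLM is the right base class), the helper name merge_lora_adapter and its parameters are made up for illustration, and both peft and transformers are installed. It also writes the base model's tokenizer next to the merged weights so the output folder is self-contained.

```python
def merge_lora_adapter(base_id, adapter_id, output_dir):
    """Merge a LoRA adapter into its base model and save model + tokenizer.

    Hypothetical helper for illustration; not part of peft or transformers.
    """
    # Imported inside the function so that merely defining it does not
    # require the libraries to be present.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the full base model (assumption: a causal LM head is appropriate).
    base_model = AutoModelForCausalLM.from_pretrained(base_id)

    # Attach the LoRA adapter weights on top of the base model.
    peft_model = PeftModel.from_pretrained(base_model, adapter_id)

    # Fold the LoRA update into the base weights and drop the adapter wrappers.
    merged_model = peft_model.merge_and_unload()
    merged_model.save_pretrained(output_dir)

    # The adapter repo does not replace the tokenizer; reuse the base one.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    tokenizer.save_pretrained(output_dir)
    return merged_model


# Example call (downloads both repos from the Hub, so it is left commented out):
# merge_lora_adapter(
#     "TinyLlama/TinyLlama-1.1B-Chat-v0.6",
#     "ArcturusAI/Crystalline-1.1B-v23.12-tagger",
#     "Crystalline-1.1B-v23.12-tagger-fullmodel",
# )
```

The resulting folder should then load with a plain from_pretrained call from transformers, without peft.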
You can now save merged_model with save_pretrained or do whatever you want with it.
Please note that this is only the model and not the tokenizer. You still need to load the tokenizer from the TinyLlama/TinyLlama-1.1B-Chat-v0.6 repo and save it locally with save_pretrained to have everything in one place.
P.S.: I noticed that you trained the model with a different version of peft. Hence I downloaded the adapter locally and removed some keys from its adapter_config.json to be able to load it with peft==0.6.2.
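If you hit the same version mismatch, the config cleanup from the P.S. can be scripted. This is only a sketch: the function name strip_adapter_config_keys is made up, and the key names in the example call are placeholders, not the keys actually removed here. Use the keys your own error message complains about.

```python
import json


def strip_adapter_config_keys(config_path, keys_to_remove):
    """Delete the given top-level keys from a local adapter_config.json.

    Hypothetical helper for illustration. Returns the keys that were
    actually present and removed.
    """
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)

    # Only touch keys that exist, so the call is safe to repeat.
    removed = [key for key in keys_to_remove if key in config]
    for key in removed:
        del config[key]

    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return removed


# Example call with placeholder key names (assumptions, not the real ones):
# strip_adapter_config_keys(
#     "local_adapter/adapter_config.json",
#     ["placeholder_key_a", "placeholder_key_b"],
# )
```

After that, point the adapter path at the local folder instead of the Hub repo id when loading it.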