Query regarding training Mistral AI models using AWS


What will be the cost of training a Mistral 7B or Code Llama model on an instance, let's say a p3.8xlarge ($13/hour)?

Also, after the model is trained, can we download it and run it on a smaller machine, or do we need to use it on the same machine where it was trained?

Also, would it be cheaper to just use ready-made models? If so, how much cheaper?

I tried to get some answers in forums but couldn't get a good estimate.


1 Answer


You can try converting the fine-tuned model to GGUF format and running it on a server without a GPU as well. The 4-bit quantized model (https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF) also works very well when the layers are offloaded to a GPU (even a T4), using the ctransformers library sample code provided in the above HF card.
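A rough sketch of that usage, based on the sample in the HF model card; the exact model_file name and gpu_layers value are assumptions, so pick whichever quantization file you actually downloaded:

```python
# Minimal sketch: load a 4-bit GGUF build of Mistral 7B Instruct with ctransformers.
# model_file name is taken from the repo's file listing; adjust if you use another quant.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # 4-bit variant, ~4 GB
    model_type="mistral",
    gpu_layers=50,  # number of layers to offload to the GPU; set 0 for CPU-only
)

print(llm("[INST] Summarize why quantized models are cheaper to serve. [/INST]"))
```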

To convert the model to GGUF you can use convert.py from https://github.com/ggerganov/llama.cpp.
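Something like the following should work, assuming a default llama.cpp checkout built with make, and assuming you have already merged the adapters into a full HF model directory (see the QLoRA merge sketch below); all paths and filenames here are placeholders:

```python
# Sketch: call llama.cpp's convert.py to produce an fp16 GGUF, then quantize to 4-bit.
# Paths, the output filenames, and the quantize binary location are assumptions.
import subprocess

subprocess.run(
    [
        "python", "llama.cpp/convert.py",   # convert.py from ggerganov/llama.cpp
        "merged-mistral-7b",                # directory holding the merged HF model
        "--outtype", "f16",                 # export fp16 first, quantize afterwards
        "--outfile", "mistral-7b-f16.gguf",
    ],
    check=True,
)

subprocess.run(
    ["llama.cpp/quantize", "mistral-7b-f16.gguf", "mistral-7b-q4_k_m.gguf", "Q4_K_M"],
    check=True,
)
```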

Also, after merging (if using QLoRA) and converting, the 4-bit file will be roughly 4-5 GB and the 8-bit one 8+ GB. The performance of both seems good in my own testing.
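If you trained with QLoRA, a minimal merge step before the GGUF conversion could look like this; the base model ID and adapter path are placeholders for your own checkpoints:

```python
# Sketch: merge QLoRA adapters back into the fp16 base model, then save the full
# model so convert.py can read it. Adapter path and output dir are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16
)
merged = PeftModel.from_pretrained(base, "path/to/qlora-adapter").merge_and_unload()

merged.save_pretrained("merged-mistral-7b")
AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1").save_pretrained("merged-mistral-7b")
```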

For downloading and using the model on another server, you could transfer the saved model to an S3 bucket.
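For example, with boto3; the bucket name and object keys here are placeholders:

```python
# Sketch: push the quantized GGUF file to S3 from the training instance,
# then pull it down on the smaller inference server.
import boto3

s3 = boto3.client("s3")
s3.upload_file("mistral-7b-q4_k_m.gguf", "my-model-bucket", "models/mistral-7b-q4_k_m.gguf")

# On the inference server:
# s3.download_file("my-model-bucket", "models/mistral-7b-q4_k_m.gguf", "mistral-7b-q4_k_m.gguf")
```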