Query regarding training Mistral AI models using AWS


What will be the cost of training a Mistral 7B or Code Llama model on an instance, let's say a p3.8xlarge ($13/hour)?

Also, after the model is trained, can we download it and run it on a smaller machine, or do we need to use it on the same machine where it was trained?

Also, would it be cheaper to just use ready-made models? If so, how much cheaper?

I tried to get some answers in forums but couldn't get a good estimate.


1 Answer


You can try converting the fine-tuned model to GGUF format and running it on a server without a GPU as well. The 4-bit quantized model (https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF) also works very well when the layers are offloaded to a GPU (even a T4), using the ctransformers library sample code provided in the above HF card.
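A rough sketch of that usage, based on the sample in the HF model card; the exact model_file name and gpu_layers value are assumptions, so pick whichever quantization file you actually downloaded:

```python
# Minimal sketch: load a 4-bit GGUF build of Mistral 7B Instruct with ctransformers.
# model_file name is taken from the repo's file listing; adjust if you use another quant.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # 4-bit variant, ~4 GB
    model_type="mistral",
    gpu_layers=50,  # number of layers to offload to the GPU; set 0 for CPU-only
)

print(llm("[INST] Summarize why quantized models are cheaper to serve. [/INST]"))
```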

To convert the model to GGUF you can use convert.py from https://github.com/ggerganov/llama.cpp.
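Something like the following should work, assuming a default llama.cpp checkout built with make, and assuming you have already merged the adapters into a full HF model directory (see the QLoRA merge sketch below); all paths and filenames here are placeholders:

```python
# Sketch: call llama.cpp's convert.py to produce an fp16 GGUF, then quantize to 4-bit.
# Paths, the output filenames, and the quantize binary location are assumptions.
import subprocess

subprocess.run(
    [
        "python", "llama.cpp/convert.py",   # convert.py from ggerganov/llama.cpp
        "merged-mistral-7b",                # directory holding the merged HF model
        "--outtype", "f16",                 # export fp16 first, quantize afterwards
        "--outfile", "mistral-7b-f16.gguf",
    ],
    check=True,
)

subprocess.run(
    ["llama.cpp/quantize", "mistral-7b-f16.gguf", "mistral-7b-q4_k_m.gguf", "Q4_K_M"],
    check=True,
)
```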

Also, after merging (if using QLoRA) and converting, the 4-bit file will be roughly 4-5 GB and the 8-bit one 8+ GB. The performance of both seems good in my own testing.
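If you trained with QLoRA, a minimal merge step before the GGUF conversion could look like this; the base model ID and adapter path are placeholders for your own checkpoints:

```python
# Sketch: merge QLoRA adapters back into the fp16 base model, then save the full
# model so convert.py can read it. Adapter path and output dir are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16
)
merged = PeftModel.from_pretrained(base, "path/to/qlora-adapter").merge_and_unload()

merged.save_pretrained("merged-mistral-7b")
AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1").save_pretrained("merged-mistral-7b")
```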

For downloading and using the model on another server, you could transfer the saved model to an S3 bucket.
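For example, with boto3; the bucket name and object keys here are placeholders:

```python
# Sketch: push the quantized GGUF file to S3 from the training instance,
# then pull it down on the smaller inference server.
import boto3

s3 = boto3.client("s3")
s3.upload_file("mistral-7b-q4_k_m.gguf", "my-model-bucket", "models/mistral-7b-q4_k_m.gguf")

# On the inference server:
# s3.download_file("my-model-bucket", "models/mistral-7b-q4_k_m.gguf", "mistral-7b-q4_k_m.gguf")
```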