Hi, I am trying to train dolly-v2-12b (or any of the Dolly models) on a custom dataset using an A10 GPU. I am coding in PyCharm on Windows. The task is similar to Q&A: I want to use the model as a communication assistant that can answer queries. My dataset has more than 10,000 entries, and each entry may have around 3,000 characters.
I would like to know whether this is possible with the GPU I have, and how long training on this dataset would take.
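To give a rough sense of the dataset size, here is a back-of-the-envelope sketch of the token count per epoch. The figure of about 4 characters per token is only a common rule of thumb for English text and is an assumption here; the real count depends on the tokenizer. The function name `estimate_tokens` is made up for illustration.

```python
def estimate_tokens(num_entries: int, chars_per_entry: int,
                    chars_per_token: float = 4.0) -> int:
    """Rough token count for one pass over the dataset.

    chars_per_token is an assumed rule of thumb, not a measured value.
    """
    return int(num_entries * chars_per_entry / chars_per_token)


# Numbers from the post: ~10,000 entries of ~3,000 characters each.
tokens_per_epoch = estimate_tokens(10_000, 3_000)
print(f"approx. tokens per epoch: {tokens_per_epoch:,}")
```

Dividing this number by a measured training throughput (tokens per second on the actual GPU) would give an approximate time per epoch.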
UPDATE: I found code and used it to train the model. With my NVIDIA GeForce RTX 3060 graphics card, I was able to train models of up to 1 billion parameters; anything larger fails with a CUDA out-of-memory error. I haven't tried 8-bit mode yet. I will update when I get it working.
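For context on the out-of-memory errors, here is a minimal sketch that estimates the memory needed just to hold the model weights at different precisions. It assumes roughly 12 billion parameters for dolly-v2-12b (taken from the model name) and counts weights only; training also needs memory for gradients, optimizer state, and activations, which this sketch does not model. The helper name `weights_gib` is hypothetical.

```python
def weights_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory in GiB needed to store the weights alone."""
    return n_params * bytes_per_param / 1024**3


# Assumed parameter counts: ~1B (what fit on my card) and ~12B (dolly-v2-12b).
for n_params in (1e9, 12e9):
    for precision, bytes_per_param in (("fp32", 4), ("fp16", 2), ("int8", 1)):
        gib = weights_gib(n_params, bytes_per_param)
        print(f"{n_params / 1e9:.0f}B params, {precision}: {gib:.1f} GiB")
```

Comparing these figures with the GPU's memory shows why 8-bit loading is attractive for the larger models, although the weights are only part of what training needs.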
UPDATE: I have been trying to run Dolly in 8-bit mode on my PC, which runs Windows. However, the code I found uses bitsandbytes and DeepSpeed, and I have found both of them impossible to install and use on Windows. I followed the instructions in both GitHub repositories (bitsandbytes and DeepSpeed), but I still cannot install them. If anyone has gotten this working, please help. Is there a workaround or any other solution for this?