I'm currently working on text generation with my own text. I fine-tuned GPT-2 on it, but the model gives random answers; only for some questions are the answers relevant. Is there a way to fine-tune it further, or can we apply reinforcement learning here?
I used the code from this notebook exactly as written, with my own text: https://www.kaggle.com/code/changyeop/how-to-fine-tune-gpt-2-for-beginners
 
                        
GPT-2 is a decoder-only transformer model, so it can always be fine-tuned further. Even the base model shouldn't give you random answers, so I would start over: reload the pretrained weights afresh (especially if you are pulling them from Hugging Face (HF)), rebuild your dataset or pick a well-curated one from HF, and fine-tune again. You could use reinforcement learning, but it is not a drop-in replacement for this kind of fine-tuning: GPT-2 would act as the policy that predicts the actions your agent takes, and you would also need to define a reward signal. RL is also a noisy and less established way to fine-tune, so expect a lot of setbacks along the way. If you do go down that route, consider a policy-gradient method such as TRPO, which seems like a reasonable fit for this problem.
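
To make the "rebuild your dataset and check the base model" advice concrete, here is a minimal sketch in Python. The helper names (`build_examples`, `train_val_split`, `sample_from_base_model`), the `min_chars` threshold, and the 10% validation split are all my own assumptions for illustration, not something taken from the linked notebook. The last function assumes the `transformers` package is installed and uses its `pipeline` API to load the `gpt2` checkpoint from HF.

```python
import random

EOS = "<|endoftext|>"  # GPT-2's end-of-text token, used to separate documents


def build_examples(raw_docs, min_chars=20):
    """Collapse whitespace, drop very short or duplicate docs, append EOS.

    min_chars is an arbitrary illustrative threshold, tune it for your data.
    """
    seen, examples = set(), []
    for doc in raw_docs:
        text = " ".join(doc.split())  # normalise runs of whitespace
        if len(text) < min_chars or text in seen:
            continue
        seen.add(text)
        examples.append(text + EOS)
    return examples


def train_val_split(examples, val_fraction=0.1, seed=0):
    """Shuffle deterministically and hold out a validation set.

    Watching validation loss is a simple way to notice over- or under-fitting.
    """
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def sample_from_base_model(prompt, model_name="gpt2"):
    """Sanity check: greedy-decode from the untouched pretrained checkpoint.

    Imported lazily so the dataset helpers above work without transformers.
    """
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_name)
    result = generator(prompt, max_new_tokens=40, do_sample=False)
    return result[0]["generated_text"]
```

If `sample_from_base_model("The weather today is")` already produces fluent (if generic) English, the pretrained weights are fine and the problem is more likely in the data or the training settings. The examples returned by `build_examples` can then be tokenized and fed to whatever training loop you are using.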