I'm a newbie in RL, so please forgive me if I ask a stupid question :)
I'm working on a DQN project right now, and it's very similar to the simplest snake game. The game is written in JS and has a demo (in which the snake moves randomly). But since I don't know how to write JS, I can't pass the action value to the game during the training process, so what I'm doing instead is generating random game images and training the DQN model on those.
What I want to ask is: is it possible to train it this way? Can Q(s,a) still converge? If it is possible, is there anything I should pay attention to? And do I still need the epsilon parameter?
Thank you very much:)
I'd definitely say no!
The problem is that the agent will only learn from random decisions and never gets to test whether an action it has learned would actually produce more reward. So everything it learns will be based on the starting frames. Furthermore, in your case the agent will never learn how to handle its own length (if it grows like in Snake), because the random decisions will rarely let it grow that far.
Imagine a child learning to ride a bike, and you lift it off the bike as soon as it has ridden one meter. It will probably learn to ride a meter or two in a straight line, but it will never learn to do turns, etc.
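To make the difference concrete, here is a rough sketch of how a DQN training loop normally interacts with the game. The environment functions and the Q-network call are placeholders (your game is in JS, so these names are purely illustrative), but the key part is the epsilon-greedy branch: most actions come from the current Q-network, so the agent reaches states that its own policy produces, which random frames can never give you.

```python
import random
from collections import deque

import numpy as np

NUM_ACTIONS = 4          # up / down / left / right
EPSILON_START, EPSILON_END = 1.0, 0.05
EPSILON_DECAY = 0.995

def q_values(state):
    """Placeholder for a forward pass of your Q-network on one state."""
    return np.random.randn(NUM_ACTIONS)   # replace with a real model prediction

def env_reset():
    """Placeholder: reset the snake game and return the first frame."""
    return np.zeros((84, 84), dtype=np.float32)

def env_step(action):
    """Placeholder: apply `action` to the game, return (frame, reward, done)."""
    return np.zeros((84, 84), dtype=np.float32), 0.0, random.random() < 0.05

replay_buffer = deque(maxlen=50_000)
epsilon = EPSILON_START

for episode in range(1_000):
    state = env_reset()
    done = False
    while not done:
        # Epsilon-greedy: mostly exploit what the network has learned so far,
        # sometimes explore. Purely random data collection skips the
        # "exploit" branch entirely, so the learned policy is never tested.
        if random.random() < epsilon:
            action = random.randrange(NUM_ACTIONS)
        else:
            action = int(np.argmax(q_values(state)))

        next_state, reward, done = env_step(action)
        replay_buffer.append((state, action, reward, next_state, done))
        state = next_state

        # ... sample a minibatch from replay_buffer and do a gradient step ...

    epsilon = max(EPSILON_END, epsilon * EPSILON_DECAY)
```

So epsilon only makes sense once the agent's own actions actually reach the game; with purely random images there is nothing to anneal between.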