Game-like model in Q-learning

77 views Asked by bugrahaskan At 12 February 2024 at 16:46

I have a modeling question. Apologies in advance: I am new to reinforcement learning.

Suppose we have a game in the style of Pac-Man. The agent can see three circles ahead of it (left-front, center-front, and right-front) and must eat the dots it encounters; skipping a dot incurs a larger penalty. Dots appear randomly and carry different weights, either positive or negative. I want to find the optimal score (the sum of the weights of the dots) and/or the optimal length of a chain of dots over which the agent's score stays positive.

I want to train a Q-learning model for this, though I doubt it is the right approach. Next I plan to try a policy-based approach, because the value-based model gave me a rather linear solution in a stochastic state space: there is only one decision per state that it can change.
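For reference, the standard tabular Q-learning update and the greedy policy derived from it are written out below. The symbols are the usual notation and do not come from the question: $s$ is the observed triple of circles, $a$ the circle the agent moves into, $r$ the weight collected, $s'$ the next observation, and $\alpha$, $\gamma$ the learning rate and discount factor. The greedy policy selects a single action per state, which may be what shows up as "only one decision per state".

```latex
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right],
\qquad
\pi(s) = \arg\max_{a} Q(s,a)
```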

I don't know whether this problem is solvable even in theory.
The dots appear on the fly in a random circle next to the agent. Say the "next states" [+/-1,0,0], [0,+/-1,0], [0,0,+/-1] are all equally likely. I am having trouble posing the question the right way and choosing a terminal state.
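One way to make the setup concrete is the minimal sketch below. It is only an illustration under assumptions that are not in the question: the six "next states" are treated as the full observation, the action is which circle to move into (moving into an empty circle counts as skipping), the skip penalty is -2, and the terminal state is simply a fixed number of dots per episode (`HORIZON`). The names `step`, `train`, and `greedy_policy` are hypothetical helpers.

```python
import random
from collections import defaultdict

# All constants are illustrative assumptions, not values from the question.
# Six observations: exactly one circle holds a dot of weight +1 or -1.
OBSERVATIONS = [tuple(w if i == pos else 0 for i in range(3))
                for pos in range(3) for w in (+1, -1)]
ACTIONS = (0, 1, 2)      # move into the left, center, or right circle
SKIP_PENALTY = -2.0      # assumed: skipping costs more than eating a -1 dot
HORIZON = 20             # assumed terminal condition: fixed number of dots


def step(obs, action):
    """Reward for moving into circle `action` when the layout is `obs`."""
    weight = obs[action]
    return float(weight) if weight != 0 else SKIP_PENALTY


def train(episodes=5000, alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(float)   # (obs, action) -> estimated return
    for _ in range(episodes):
        obs = rng.choice(OBSERVATIONS)
        for t in range(HORIZON):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(obs, a)])
            reward = step(obs, action)
            # The next dot is drawn uniformly, independent of the action.
            next_obs = rng.choice(OBSERVATIONS)
            done = t == HORIZON - 1
            best_next = max(q[(next_obs, a)] for a in ACTIONS)
            target = reward if done else reward + gamma * best_next
            q[(obs, action)] += alpha * (target - q[(obs, action)])
            obs = next_obs
    return q


def greedy_policy(q):
    """Pick the highest-valued action for every observation."""
    return {obs: max(ACTIONS, key=lambda a: q[(obs, a)]) for obs in OBSERVATIONS}


if __name__ == "__main__":
    for obs, action in greedy_policy(train()).items():
        print(obs, "->", action)
```

Under these assumptions the next observation does not depend on the action taken, so the problem behaves like a contextual bandit: the best action in each state depends only on the immediate reward, and a deterministic greedy policy is to be expected. If the real game has dynamics in which the action influences what appears next, the observation would need to carry that information.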

Can you guide me?


There are 0 answers
