I've got an MDP problem with a 3x4 grid environment, with the possible actions Up/Down/Right/Left and a 0.8 chance of moving in the intended direction and a 0.1 chance for each adjoining direction (e.g. for Up: 0.1 chance to go Left, 0.1 chance to go Right).
Now what I need to do is calculate the possible outcomes of starting in (1,1) and running the following sequence of actions:
[Up, Up, Right, Right, Right]
And also calculate the probability of ending up in each field (for each possible outcome) with this action sequence. How can I do this efficiently, i.e. without walking through each of the up to 3^5 = 243 possible outcome branches?
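For concreteness, here is the brute-force enumeration I am trying to avoid. It is only a sketch: I am assuming columns x = 1..4 and rows y = 1..3, that a move off the grid leaves you in place, and I have left out any walls or terminal states.

```python
# Brute-force enumeration of all outcome branches (3 per action, so
# 3^5 = 243 leaves), accumulating the probability of each final cell.
# Assumes x in 1..4, y in 1..3, and staying in place at grid borders.
from collections import defaultdict

MOVES = {"Up": (0, 1), "Down": (0, -1), "Right": (1, 0), "Left": (-1, 0)}
PERP = {"Up": ("Left", "Right"), "Down": ("Left", "Right"),
        "Right": ("Up", "Down"), "Left": ("Up", "Down")}

def move(cell, direction):
    dx, dy = MOVES[direction]
    x, y = cell[0] + dx, cell[1] + dy
    return (x, y) if 1 <= x <= 4 and 1 <= y <= 3 else cell

def enumerate_outcomes(cell, actions, prob, result):
    if not actions:
        result[cell] += prob
        return
    side_a, side_b = PERP[actions[0]]
    for direction, p in [(actions[0], 0.8), (side_a, 0.1), (side_b, 0.1)]:
        enumerate_outcomes(move(cell, direction), actions[1:], prob * p, result)

result = defaultdict(float)
enumerate_outcomes((1, 1), ["Up", "Up", "Right", "Right", "Right"], 1.0, result)
for cell, p in sorted(result.items()):
    print(cell, round(p, 4))
```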
Thanks in advance!

It sounds like you are working on a reinforcement learning (RL) problem. Such problems are usually solved with the Bellman equation and Q-learning.
You may also find these lecture notes helpful: http://cs229.stanford.edu/notes/cs229-notes12.pdf
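The core of Q-learning is a sample-based Bellman backup. A toy sketch of a single update (the table size and the transition here are just illustrative, not your map):

```python
# One tabular Q-learning update on a single observed transition
# (s, a, r, s2); alpha is the learning rate, gamma the discount factor.
import numpy as np

Q = np.zeros((12, 4))        # e.g. 12 cells x 4 actions for a 3x4 grid
alpha, gamma = 0.1, 0.95
s, a, r, s2 = 0, 3, 0.0, 4   # an illustrative transition, not real data
Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
```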
Once you have finished learning, you can repeat the whole process, i.e. run the sequence [Up, Up, Right, Right, Right] many times and estimate the probability of each outcome from those runs, as sketched below. And after learning, the efficiency concern matters less, because the agent reaches the correct answer almost immediately.
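A minimal sketch of that estimation, assuming the same grid conventions as in the question (x = 1..4, y = 1..3, staying in place at grid borders, no walls or terminal states):

```python
# Monte Carlo estimate of the outcome distribution of a fixed action
# sequence under the 0.8/0.1/0.1 transition noise.
import random
from collections import Counter

MOVES = {"Up": (0, 1), "Down": (0, -1), "Right": (1, 0), "Left": (-1, 0)}
PERP = {"Up": ("Left", "Right"), "Down": ("Left", "Right"),
        "Right": ("Up", "Down"), "Left": ("Up", "Down")}

def step(cell, action):
    side_a, side_b = PERP[action]
    actual = random.choices([action, side_a, side_b], weights=[0.8, 0.1, 0.1])[0]
    dx, dy = MOVES[actual]
    x, y = cell[0] + dx, cell[1] + dy
    return (x, y) if 1 <= x <= 4 and 1 <= y <= 3 else cell

counts = Counter()
n_runs = 100_000
for _ in range(n_runs):
    cell = (1, 1)
    for action in ["Up", "Up", "Right", "Right", "Right"]:
        cell = step(cell, action)
    counts[cell] += 1

for cell, count in sorted(counts.items()):
    print(cell, count / n_runs)
```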
I think this example is from AIMA, right? Actually, I have a few questions about the approach: it doesn't seem to give the right answer for my case when you approach it purely theoretically.
And I also wrote some simple code for this with Gym.
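A minimal sketch of what that could look like, using FrozenLake-v1 (a 4x4 slippery gridworld) as a stand-in for the 3x4 map, and assuming the Gym 0.26+ reset/step API:

```python
# Tabular Q-learning on a small Gym gridworld. FrozenLake-v1 is used
# as a stand-in here; swap in an environment matching the 3x4 map.
import gym
import numpy as np

env = gym.make("FrozenLake-v1", is_slippery=True)
Q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.95, 0.1

for episode in range(5000):
    state, _ = env.reset()
    done = False
    while not done:
        # Epsilon-greedy action selection.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # Sample-based Bellman update.
        Q[state, action] += alpha * (
            reward + gamma * np.max(Q[next_state]) - Q[state, action]
        )
        state = next_state
```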