Markov entropy when probabilities are uneven


I've been thinking about information entropy in terms of the Markov equation:

H = -SUM(p(i) lg(p(i))), where lg is the base-2 logarithm.

This assumes that all selections i have equal probability. But what if the probabilities in the given set of choices are unequal? For example, let's say that StackExchange has 20 sites, and that the probability of a user visiting any StackExchange site except StackOverflow is p(i). But the probability of a user visiting StackOverflow is 5 times p(i).
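To make the example concrete, here is a quick sketch (in Python, not part of the question as originally posed) of how those probabilities work out once they are normalized to sum to 1:

```python
import math

# 19 sites are visited with probability p each, StackOverflow with 5*p.
# Probabilities must sum to 1: 19*p + 5*p = 24*p = 1, so p = 1/24.
p = 1 / 24
probs = [p] * 19 + [5 * p]
assert abs(sum(probs) - 1.0) < 1e-12

# Shannon entropy of this uneven distribution, in bits.
H = -sum(q * math.log2(q) for q in probs)

# For comparison, a uniform distribution over 20 sites would give
# log2(20) ≈ 4.32 bits; the uneven distribution gives slightly less.
print(H, math.log2(20))
```

So the formula itself still applies; the uneven probabilities simply produce a lower entropy than the uniform case.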

Would the Markov equation not apply in this case? Or is there an advanced Markov variation that I'm unaware of?


There are 2 answers

Mathias (best answer)

I believe you are mixing up two concepts: entropy and the Markov equation. Entropy measures the "disorder" of a distribution over states, using the equation you gave: H = -SUM(p(i) lg(p(i))), where p(i) is the probability of observing each state i.

The Markov property does not imply that every state has the same probability. Roughly, a system is said to exhibit the Markov property if the probability of observing a state depends only on a few of the previous states - beyond a certain limit, observing extra states adds no information for predicting the next state.

The prototypical Markov model is the Markov chain. From each state i, you can move to each state j with some probability, represented as a transition matrix:

      0    1    2
0   0.2  0.5  0.3
1   0.8  0.1  0.1
2   0.3  0.3  0.4

In this example, the probability of moving from state 0 to 1 is 0.5, and depends only on being in state 0 (knowing more about the previous states would not change that probability).
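A minimal sketch of that matrix in Python (the `step` helper is my own illustration, not part of the answer):

```python
import random

# Transition matrix from the example above: P[i][j] is the probability
# of moving from state i to state j. Every row sums to 1.
P = [
    [0.2, 0.5, 0.3],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
]

def step(state):
    """Sample the next state given only the current one (Markov property)."""
    return random.choices(range(len(P)), weights=P[state])[0]

# The next state depends only on the current state: from state 0,
# state 1 is sampled with probability 0.5 regardless of earlier history.
current = 0
current = step(current)
```

Note that `step` never looks at anything but `state` - that is exactly the memorylessness the answer describes.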

As long as every state can be reached from every other state, then no matter what the initial distribution is, the probability of being in each state converges to a stable, long-term distribution: over a long run you observe each state with a fixed probability, which is not necessarily equal for each state.

In our example, we would end up with probabilities p(0), p(1) and p(2), and you could then compute the entropy of that chain using your formula.
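One way to see that long-run distribution numerically is to push a starting distribution through the matrix repeatedly until it stabilizes (a power-iteration sketch; the helper names are my own, not from the answer):

```python
import math

# Transition matrix from the example above.
P = [
    [0.2, 0.5, 0.3],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
]

def stationary(P, iters=500):
    """Repeatedly apply P to a distribution until it stops changing."""
    n = len(P)
    pi = [1.0 / n] * n  # any starting distribution converges here
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def entropy(dist):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(q * math.log2(q) for q in dist if q > 0)

pi = stationary(P)  # the long-run probabilities p(0), p(1), p(2)
H = entropy(pi)     # entropy of the chain's stationary distribution
```

The resulting `pi` is a fixed point of the matrix (applying one more step leaves it unchanged), and its entropy is computed with exactly the formula from the question.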

Totero

From your example, are you thinking of Markov chains?