Missing values in Bayesian learning

Assume you have the following dataset, where the two variables Color and Size are observed:

Color | Size 
------+------
Red   | Big 
White | Small
Red   | Small
Red   | Big
White | Big
Red   | Big

You are asked to learn the maximum likelihood parameters for the Bayesian network shown below:

Color -> Size
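
For the fully observed table above, the maximum likelihood parameters of Color -> Size are just relative frequencies: P(Color) comes from the color counts, and P(Size | Color) from the pair counts divided by the color counts. A minimal sketch (the variable names are illustrative, not part of the question):

```python
from collections import Counter

# The six fully observed rows from the table, as (Color, Size) pairs.
rows = [
    ("Red", "Big"), ("White", "Small"), ("Red", "Small"),
    ("Red", "Big"), ("White", "Big"), ("Red", "Big"),
]

color_counts = Counter(color for color, _ in rows)
pair_counts = Counter(rows)

# P(Color = c) = count(c) / N
p_color = {c: n / len(rows) for c, n in color_counts.items()}

# P(Size = s | Color = c) = count(c, s) / count(c), keyed as (s, c)
p_size_given_color = {
    (size, color): n / color_counts[color]
    for (color, size), n in pair_counts.items()
}

print(p_color)
print(p_size_given_color)
```

This gives P(Red) = 4/6, P(Big | Red) = 3/4 and P(Big | White) = 1/2. Counting like this only works while every row is complete.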

You get more data for the learning problem described in the table but the new dataset contains missing values. Which algorithm can you use to learn the maximum likelihood parameters now?

There is 1 answer

Sigurd Lund (best answer)

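If you simply discard the cases with missing values, you throw away the information in the observed part of those cases, and the estimates can be biased unless the values are missing completely at random.

Instead, use the Expectation-Maximisation (EM) algorithm. It alternates between two steps: the E-step computes the expected counts for the missing values under the current parameters, and the M-step re-estimates the parameters from those expected counts. See http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm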
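
As a rough sketch of how EM could look for the Color -> Size network, the code below treats a missing value as `None`. The incomplete rows are made up for illustration (the question does not show the new data), and the function name `em`, the uniform starting point and the fixed iteration count are assumptions of this sketch, not a reference implementation.

```python
from collections import defaultdict

COLORS = ("Red", "White")
SIZES = ("Big", "Small")

def em(rows, iters=50):
    """Estimate P(Color) and P(Size | Color) from rows with None for missing."""
    # Assumed starting point: uniform parameters.
    p_color = {c: 1 / len(COLORS) for c in COLORS}
    p_size = {c: {s: 1 / len(SIZES) for s in SIZES} for c in COLORS}
    for _ in range(iters):
        # E-step: spread each row over the completions consistent with
        # its observed values, weighted by the current joint probability.
        expected = defaultdict(float)
        for color, size in rows:
            weights = {
                (c, s): p_color[c] * p_size[c][s]
                for c in COLORS for s in SIZES
                if color in (None, c) and size in (None, s)
            }
            total = sum(weights.values())
            for pair, w in weights.items():
                expected[pair] += w / total
        # M-step: relative frequencies of the expected counts.
        for c in COLORS:
            n_c = sum(expected[(c, s)] for s in SIZES)
            p_color[c] = n_c / len(rows)
            if n_c > 0:  # keep old values if a color has no support
                for s in SIZES:
                    p_size[c][s] = expected[(c, s)] / n_c
    return p_color, p_size

# The six complete rows from the question.
complete_rows = [
    ("Red", "Big"), ("White", "Small"), ("Red", "Small"),
    ("Red", "Big"), ("White", "Big"), ("Red", "Big"),
]
# Hypothetical extra rows with missing values (not from the question).
incomplete_rows = [("Red", None), (None, "Small"), ("White", None), (None, "Big")]

p_color, p_size = em(complete_rows + incomplete_rows)
print(p_color)
print(p_size)
```

With only complete rows, the E-step assigns each row to its single observed pair, so `em(complete_rows)` reduces to ordinary frequency counting. EM converges to a local maximum of the likelihood, so in larger networks the starting point can matter.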