Assume you have the following dataset, where the two variables Color and Size are observed:
Color | Size
------+------
Red   | Big
White | Small
Red   | Small
Red   | Big
White | Big
Red   | Big
You are asked to learn the maximum likelihood parameters for the Bayesian network shown below:
Color -> Size
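For the complete table, the maximum likelihood parameters of Color -> Size are just relative frequencies: P(Color) from the color counts and P(Size | Color) from the counts within each color. A minimal sketch in Python follows; the variable names are illustrative and not part of the original question.

```python
from collections import Counter

# The six complete observations from the table, as (color, size) pairs.
data = [
    ("Red", "Big"), ("White", "Small"), ("Red", "Small"),
    ("Red", "Big"), ("White", "Big"), ("Red", "Big"),
]

color_counts = Counter(color for color, _ in data)
pair_counts = Counter(data)

# P(Color = c) = N(c) / N
p_color = {c: n / len(data) for c, n in color_counts.items()}

# P(Size = s | Color = c) = N(c, s) / N(c), keyed as (size, color)
p_size_given_color = {
    (s, c): n / color_counts[c] for (c, s), n in pair_counts.items()
}

print(p_color)
print(p_size_given_color)
```

With this table the estimates are P(Red) = 2/3, P(White) = 1/3, P(Big | Red) = 3/4, P(Small | Red) = 1/4, and P(Big | White) = P(Small | White) = 1/2.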
You get more data for the learning problem described in the table, but the new dataset contains missing values. Which algorithm can you use to learn the maximum likelihood parameters now?
 
                        
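If you simply discard the cases with missing values, you throw away the information in their observed entries, and the estimates can be biased unless the values are missing completely at random.
Instead, you can use the Expectation-Maximisation (EM) algorithm: it alternates between computing the probability distribution over the missing values under the current parameters (E-step) and re-estimating the parameters from the resulting expected counts (M-step). http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm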
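A minimal sketch of EM for the Color -> Size network, in Python. The question does not give the rows with missing values, so the three rows containing None below are made up purely for illustration, and the function name `em`, the uniform initialisation, and the fixed iteration count are assumptions as well.

```python
COLORS = ("Red", "White")
SIZES = ("Big", "Small")

# The six complete rows from the table, plus three HYPOTHETICAL rows
# with a missing value (None); these extra rows are invented for the example.
data = [
    ("Red", "Big"), ("White", "Small"), ("Red", "Small"),
    ("Red", "Big"), ("White", "Big"), ("Red", "Big"),
    (None, "Big"), ("Red", None), (None, "Small"),
]

def em(data, iterations=50):
    # Start from uniform parameters; p_size is keyed as (size, color).
    p_color = {c: 1 / len(COLORS) for c in COLORS}
    p_size = {(s, c): 1 / len(SIZES) for c in COLORS for s in SIZES}
    for _ in range(iterations):
        # E-step: accumulate expected counts N(c, s). Each row spreads one
        # unit of count over the completions consistent with what was observed,
        # weighted by P(c) * P(s | c) under the current parameters.
        n = {(c, s): 0.0 for c in COLORS for s in SIZES}
        for color, size in data:
            weights = {
                (c, s): p_color[c] * p_size[(s, c)]
                for c in COLORS for s in SIZES
                if color in (None, c) and size in (None, s)
            }
            total = sum(weights.values())
            for key, w in weights.items():
                n[key] += w / total
        # M-step: relative frequencies of the expected counts.
        n_color = {c: sum(n[(c, s)] for s in SIZES) for c in COLORS}
        p_color = {c: n_color[c] / len(data) for c in COLORS}
        p_size = {(s, c): n[(c, s)] / n_color[c]
                  for c in COLORS for s in SIZES}
    return p_color, p_size

p_color, p_size = em(data)
print(p_color)
print(p_size)
```

When every row is complete, the E-step assigns each row to its single observed combination, so the procedure reduces to plain counting. A practical implementation would usually stop when the likelihood or the parameters stop changing rather than after a fixed number of iterations.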