Given this dataset:
Color | Size
Red | Big
White | Small
Red | Big
Red | Small
White | Big
Red | Big
and the following Bayesian network: Color --> Size, I am supposed to find the maximum likelihood parameters for the Bayesian network. What will the estimators be? I am not sure how to proceed here, so any help will be greatly appreciated.
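Assuming a multinomial distribution for your Color and Size variables, you need to estimate the following parameters:
For color:
$$\theta_{r} = P(\text{Color}=\text{red}), \qquad \theta_{w} = P(\text{Color}=\text{white})$$
For size:
$$\theta_{b|r} = P(\text{Size}=\text{big} \mid \text{Color}=\text{red}), \qquad \theta_{s|r} = P(\text{Size}=\text{small} \mid \text{Color}=\text{red})$$
$$\theta_{b|w} = P(\text{Size}=\text{big} \mid \text{Color}=\text{white}), \qquad \theta_{s|w} = P(\text{Size}=\text{small} \mid \text{Color}=\text{white})$$
Which in the end are only 3, since each distribution must sum to one:
$$\theta_{w} = 1 - \theta_{r}, \qquad \theta_{s|r} = 1 - \theta_{b|r}, \qquad \theta_{s|w} = 1 - \theta_{b|w}$$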
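The likelihood is the probability of the observed data given the model, in this case, for a dataset with $n$ observations of color and size:
$$D = \{(c_1, s_1), \dots, (c_n, s_n)\},$$
And parameters:
$$\theta = \big(\theta_{r},\ \theta_{b|r},\ \theta_{b|w}\big) = \big(P(\text{red}),\ P(\text{big} \mid \text{red}),\ P(\text{big} \mid \text{white})\big),$$
The likelihood is given by:
$$L(\theta; D) = \prod_{i=1}^{n} P(c_i \mid \theta)\, P(s_i \mid c_i, \theta)$$
Since we are dealing with Bernoulli distributions here for the color and the size given the color, we can write it like so:
$$L(\theta; D) = \theta_{r}^{M_{rb}+M_{rs}} (1-\theta_{r})^{M_{wb}+M_{ws}} \; \theta_{b|r}^{M_{rb}} (1-\theta_{b|r})^{M_{rs}} \; \theta_{b|w}^{M_{wb}} (1-\theta_{b|w})^{M_{ws}}$$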
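To make the count form of the likelihood concrete, here is a minimal Python sketch that evaluates its logarithm on the dataset from the question. The function name `log_likelihood` and its argument names are only illustrative, and the sketch assumes all three probabilities lie strictly between 0 and 1.

```python
import math
from collections import Counter

# Dataset from the question as (color, size) pairs.
data = [
    ("Red", "Big"),
    ("White", "Small"),
    ("Red", "Big"),
    ("Red", "Small"),
    ("White", "Big"),
    ("Red", "Big"),
]

def log_likelihood(p_red, p_big_given_red, p_big_given_white, rows):
    """Log-likelihood of the Color --> Size network (hypothetical helper).

    Assumes every probability is strictly between 0 and 1,
    otherwise math.log raises a ValueError.
    """
    counts = Counter(rows)
    m_rb = counts[("Red", "Big")]      # red and big
    m_rs = counts[("Red", "Small")]    # red and small
    m_wb = counts[("White", "Big")]    # white and big
    m_ws = counts[("White", "Small")]  # white and small
    return (
        (m_rb + m_rs) * math.log(p_red)
        + (m_wb + m_ws) * math.log(1 - p_red)
        + m_rb * math.log(p_big_given_red)
        + m_rs * math.log(1 - p_big_given_red)
        + m_wb * math.log(p_big_given_white)
        + m_ws * math.log(1 - p_big_given_white)
    )

# Compare two candidate parameter settings on the same data.
print(log_likelihood(0.5, 0.5, 0.5, data))
print(log_likelihood(2 / 3, 3 / 4, 1 / 2, data))
```

Working with the logarithm turns the product into a sum, which is easier to differentiate and numerically more stable; the maximizer is the same because the logarithm is monotone.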
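Where $M_{rs}$ is the count of observations that are red and small, and the other $M$s are defined likewise ($M_{rb}$ for red and big, $M_{wb}$ for white and big, $M_{ws}$ for white and small), so that $n = M_{rb} + M_{rs} + M_{wb} + M_{ws}$.
Finally, by optimizing the likelihood function (take the logarithm and set the derivative with respect to each parameter to zero), you get the parameter estimators:
$$\hat{P}(\text{red}) = \hat{\theta}_{r} = \frac{M_{rb}+M_{rs}}{n}, \qquad \hat{P}(\text{big} \mid \text{red}) = \hat{\theta}_{b|r} = \frac{M_{rb}}{M_{rb}+M_{rs}}, \qquad \hat{P}(\text{big} \mid \text{white}) = \hat{\theta}_{b|w} = \frac{M_{wb}}{M_{wb}+M_{ws}}$$
For the dataset in the question, $M_{rb}=3$, $M_{rs}=1$, $M_{wb}=1$, $M_{ws}=1$ and $n=6$, which gives $\hat{P}(\text{red}) = 4/6 = 2/3$, $\hat{P}(\text{big} \mid \text{red}) = 3/4$ and $\hat{P}(\text{big} \mid \text{white}) = 1/2$.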
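Since the maximum likelihood estimators for this network reduce to relative frequencies, they can be obtained by counting. The following is a small Python sketch under that assumption; the helper name `mle_parameters` is made up for illustration, and it assumes every color appears at least once in the data (otherwise the conditional estimate is undefined).

```python
from collections import Counter
from fractions import Fraction

# Dataset from the question as (color, size) pairs.
data = [
    ("Red", "Big"),
    ("White", "Small"),
    ("Red", "Big"),
    ("Red", "Small"),
    ("White", "Big"),
    ("Red", "Big"),
]

def mle_parameters(rows):
    """Return (P(red), P(big | red), P(big | white)) as exact fractions.

    Hypothetical helper: assumes both colors occur in `rows`,
    otherwise Fraction raises ZeroDivisionError.
    """
    pair_counts = Counter(rows)
    color_counts = Counter(color for color, _ in rows)
    n = len(rows)
    p_red = Fraction(color_counts["Red"], n)
    p_big_given_red = Fraction(pair_counts[("Red", "Big")], color_counts["Red"])
    p_big_given_white = Fraction(pair_counts[("White", "Big")], color_counts["White"])
    return p_red, p_big_given_red, p_big_given_white

estimates = mle_parameters(data)
print(estimates)
```

The remaining three probabilities follow as one minus each of these values.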