lecture 5


The M-step chooses the parameters that minimize the cost function (with the assignment probabilities held fixed)

• This is easy. We just fit each Gaussian to the data, weighting each datapoint by the assignment probability that the Gaussian has for it.

	– When you fit a Gaussian to data, you are maximizing the log probability of the data given the Gaussian. This is the same as minimizing the energies of the datapoints that the Gaussian is responsible for.

	– If a Gaussian is assigned a probability of 0.7 for a datapoint, the fitting treats that datapoint as 0.7 of an observation.
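The weighted fit described in this bullet can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function name `weighted_gaussian_fit`, the toy datapoints, and the weights are all assumptions chosen for the example. Each datapoint contributes to the mean and covariance in proportion to its assignment probability, so a weight of 0.7 counts as 0.7 of an observation.

```python
import numpy as np

def weighted_gaussian_fit(x, r):
    """Fit one Gaussian to data x of shape (N, D), weighting point n by r[n].

    Hypothetical helper for illustration: r holds the assignment
    probabilities this Gaussian has for each datapoint.
    """
    x = np.asarray(x, dtype=float)
    r = np.asarray(r, dtype=float)
    total = r.sum()                                  # effective number of observations
    mean = (r[:, None] * x).sum(axis=0) / total      # weighted mean
    centered = x - mean
    cov = (r[:, None] * centered).T @ centered / total  # weighted covariance
    return mean, cov, total

# Toy data (assumed): the Gaussian takes full responsibility for the first
# two points and has assignment probability 0.7 for the third.
x = np.array([[0.0], [2.0], [10.0]])
r = np.array([1.0, 1.0, 0.7])
mean, cov, total = weighted_gaussian_fit(x, r)
print(total, mean, cov)
```

With all weights equal to 1 this reduces to the ordinary maximum-likelihood fit (sample mean and the covariance normalised by N). In a full mixture model, `total / N` would also give the updated mixing proportion for this Gaussian.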

• Since neither the E-step nor the M-step can increase the cost function they share, EM converges (to a local minimum, not necessarily the global one).
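The convergence claim can be checked numerically with a small sketch. The code below is an illustrative assumption rather than the lecture's implementation: it runs EM on a synthetic 1-D dataset with two Gaussians and records the negative log-likelihood of the data at each iteration. The negative log-likelihood is used here as a stand-in for the slide's cost function, on the assumption that the two agree once the E-step has set the assignment probabilities to the posteriors. The names `log_gauss` and `em_costs`, the initialisation, and the data are all hypothetical.

```python
import numpy as np

def log_gauss(x, mean, var):
    # Log density of a 1-D Gaussian for every (datapoint, component) pair.
    return -0.5 * (np.log(2 * np.pi * var) + (x[:, None] - mean) ** 2 / var)

def em_costs(x, n_iter=20):
    # Assumed starting point: two components placed at the data extremes.
    mean = np.array([x.min(), x.max()])
    var = np.full(2, x.var())
    mix = np.array([0.5, 0.5])
    costs = []
    for _ in range(n_iter):
        # E-step: posterior assignment probabilities, parameters held fixed.
        log_joint = np.log(mix) + log_gauss(x, mean, var)        # shape (N, 2)
        log_px = np.logaddexp(log_joint[:, 0], log_joint[:, 1])  # shape (N,)
        costs.append(-log_px.sum())   # cost under the current parameters
        resp = np.exp(log_joint - log_px[:, None])
        # M-step: weighted Gaussian fits, assignment probabilities held fixed.
        n_k = resp.sum(axis=0)
        mean = (resp * x[:, None]).sum(axis=0) / n_k
        var = (resp * (x[:, None] - mean) ** 2).sum(axis=0) / n_k
        mix = n_k / len(x)
    return np.array(costs)

# Synthetic data (assumed): two well-separated clusters.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 0.5, 100)])
costs = em_costs(x)
print(np.all(np.diff(costs) <= 1e-8))  # cost should not go up between iterations
```

The recorded costs should form a non-increasing sequence up to floating-point tolerance. This shows only that the cost does not go up; which local minimum EM settles in still depends on the initialisation.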