lec14


The M-step chooses the parameters that

		minimize the cost function

	(with the responsibilities held fixed)

•

This is easy. We just fit each Gaussian to the data

weighted by the responsibilities that the Gaussian has

for the data.

–

When you fit a Gaussian to data you are maximizing

the log probability of the data given the Gaussian.

This is the same as minimizing the energies of the

datapoints that the Gaussian is responsible for.

–

If a Gaussian has a responsibility of 0.7 for a

datapoint the fitting treats it as 0.7 of an observation.

•

Since both the E-step and the M-step decrease the

same cost function, EM converges.