• This is easy. We just fit each Gaussian to the data, weighted by the assignment probabilities that the Gaussian has for the data.
  – When you fit a Gaussian to data, you are maximizing the log probability of the data given the Gaussian. This is the same as minimizing the energies of the datapoints that the Gaussian is responsible for.
  – If a Gaussian has an assignment probability of 0.7 for a datapoint, the fitting treats that datapoint as 0.7 of an observation.
• Since both the E-step and the M-step decrease (or at least never increase) the same cost function, EM converges.
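The weighted fit described above can be sketched for one-dimensional data. This is a minimal illustration rather than a reference implementation: the names `weighted_gaussian_fit`, `m_step`, `xs`, and `resp` are hypothetical, the data are made up, and the mixing-proportion update is the usual EM choice that the text above does not spell out.

```python
def weighted_gaussian_fit(xs, weights):
    """Fit a 1-D Gaussian, counting point i as weights[i] of an observation."""
    total = sum(weights)  # effective number of observations (assumed > 0)
    mean = sum(w * x for w, x in zip(weights, xs)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, xs)) / total
    return mean, var, total


def m_step(xs, resp):
    """Refit every Gaussian; resp[i][k] is Gaussian k's assignment
    probability for datapoint i (each row of resp sums to 1)."""
    n = len(xs)
    num_gaussians = len(resp[0])
    params = []
    for k in range(num_gaussians):
        weights = [resp[i][k] for i in range(n)]
        mean, var, total = weighted_gaussian_fit(xs, weights)
        # Assumption: mixing proportion = share of the effective observations.
        params.append({"mean": mean, "var": var, "mix": total / n})
    return params


# Made-up example: the middle point counts as 0.7 of an observation for
# Gaussian 0 and as 0.3 of an observation for Gaussian 1.
xs = [0.0, 1.0, 10.0]
resp = [[1.0, 0.0], [0.7, 0.3], [0.0, 1.0]]
params = m_step(xs, resp)
```

Here Gaussian 0 sees 1.7 effective observations, so its mean is 0.7 / 1.7 (about 0.41): the point at 1.0 pulls the mean with only 0.7 of the weight of a full observation.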