The E-step chooses the assignment
probabilities that minimize the cost function
(with the parameters of the Gaussians held fixed).
How do we find assignment probabilities for a datapoint
that minimize the cost and sum to 1?
The optimal solution to the trade-off between expected
energy and entropy is to make the probabilities be
proportional to the exponentiated negative energies:
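Written out as a sketch: the notation below is assumed rather than taken from the text. Here r_i is the assignment probability of the datapoint x to Gaussian i, \pi_i is the mixing proportion, and E_i is the energy of that assignment, taken to be the negative log of the joint probability of choosing Gaussian i and generating x from it. Minimizing expected energy minus entropy under the sum-to-1 constraint with a Lagrange multiplier \lambda then gives:

```latex
\begin{aligned}
F(r) &= \sum_i r_i E_i + \sum_i r_i \log r_i,
  \qquad E_i = -\log \pi_i - \log p(x \mid \mu_i, \sigma_i^2) \\
0 &= \frac{\partial}{\partial r_i}\Big[ F(r) + \lambda \big( \textstyle\sum_j r_j - 1 \big) \Big]
   = E_i + \log r_i + 1 + \lambda \\
r_i &= \frac{e^{-E_i}}{\sum_j e^{-E_j}}
     = \frac{\pi_i \, p(x \mid \mu_i, \sigma_i^2)}{\sum_j \pi_j \, p(x \mid \mu_j, \sigma_j^2)}
\end{aligned}
```

Under this assumed definition of the energy, the last expression is Bayes' rule for the probability that Gaussian i generated x.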
So using the posterior probabilities as assignment
probabilities minimizes the cost function!
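
The claim can be checked numerically. The following is a minimal sketch, not the lecture's own code: it assumes one-dimensional Gaussians, an energy equal to the negative log of (mixing proportion times density), and a cost equal to expected energy minus entropy. The function names and the example numbers are hypothetical.

```python
import numpy as np

def gaussian_energies(x, mix, means, stds):
    """Assumed energy of assigning x to each Gaussian:
    -log(mixing proportion) - log(Gaussian density at x)."""
    log_density = -0.5 * np.log(2 * np.pi * stds**2) - 0.5 * ((x - means) / stds) ** 2
    return -np.log(mix) - log_density

def optimal_assignments(energies):
    """Probabilities proportional to exp(-energy), normalized to sum to 1."""
    shifted = energies - np.min(energies)  # shift for numerical stability
    weights = np.exp(-shifted)
    return weights / weights.sum()

def free_energy(r, energies):
    """Cost for one datapoint: expected energy minus entropy of r."""
    r = np.asarray(r, dtype=float)
    nonzero = r > 0  # treat 0 * log(0) as 0
    return float(np.sum(r * energies) + np.sum(r[nonzero] * np.log(r[nonzero])))

if __name__ == "__main__":
    # Hypothetical two-component mixture and a single datapoint.
    mix = np.array([0.3, 0.7])
    means = np.array([0.0, 2.0])
    stds = np.array([1.0, 0.5])
    x = 0.5

    energies = gaussian_energies(x, mix, means, stds)
    posterior = optimal_assignments(energies)
    print("posterior assignments:", posterior)
    print("cost at posterior:", free_energy(posterior, energies))

    # Any other distribution over the two Gaussians should cost at least as much.
    rng = np.random.default_rng(0)
    other = rng.dirichlet(np.ones(2))
    print("cost at a random assignment:", free_energy(other, energies))
```

At the minimum the cost equals the negative log probability of the datapoint under the mixture, which is why lowering this cost in the M-step also improves the fit of the model.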