The E-step chooses the responsibilities that
minimize the cost function
(with the parameters of the Gaussians held fixed).
How do we find responsibility values for a datapoint that
minimize the cost and sum to 1?
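The optimal solution to the trade-off between expected
energy and entropy is to make the responsibilities
proportional to the exponentiated negative energies,
normalized so that they sum to 1:
    r_i = exp(-E_i) / sum_j exp(-E_j)
where E_i is the energy of assigning the datapoint to Gaussian i.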
So using the posterior probabilities as responsibilities
minimizes the cost function!
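
A quick numerical check of this claim, as a minimal sketch rather than a definitive implementation. It assumes the usual mixture-of-Gaussians energy, E_i = -log(pi_i) - log p(x | i), and takes the cost to be expected energy minus entropy; neither definition is spelled out in this section. The mixture parameters, the datapoint x, and all function names are made up for illustration.

```python
import math
import random

def energy(x, pi, mu, sigma):
    # Assumed energy of assigning x to one 1-D Gaussian:
    # E_i = -log(pi_i) - log N(x | mu_i, sigma_i^2)
    log_gauss = -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
    return -math.log(pi) - log_gauss

def responsibilities(energies):
    # r_i proportional to exp(-E_i); subtracting the minimum energy
    # first avoids overflow without changing the normalized result.
    m = min(energies)
    weights = [math.exp(-(e - m)) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

def cost(r, energies):
    # Expected energy minus entropy of the responsibilities.
    expected_energy = sum(ri * ei for ri, ei in zip(r, energies))
    neg_entropy = sum(ri * math.log(ri) for ri in r if ri > 0)
    return expected_energy + neg_entropy

# Hypothetical mixture with three components and a single datapoint.
pis = [0.5, 0.3, 0.2]
mus = [-1.0, 0.0, 2.0]
sigmas = [1.0, 0.5, 1.5]
x = 0.4

energies = [energy(x, p, m, s) for p, m, s in zip(pis, mus, sigmas)]
r_star = responsibilities(energies)
best_cost = cost(r_star, energies)

# Compare against random responsibility vectors that also sum to 1.
rng = random.Random(0)
other_costs = []
for _ in range(1000):
    raw = [rng.random() for _ in energies]
    total = sum(raw)
    other_costs.append(cost([v / total for v in raw], energies))

print("posterior responsibilities:", r_star)
print("cost at the posterior:", best_cost)
print("lowest cost among random alternatives:", min(other_costs))
```

Under these assumptions, no random alternative should come in below the cost at the posterior, and that minimum cost should equal -log p(x), the negative log probability of the datapoint under the mixture.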