Minimizing the coding cost
• Pick hidden configurations using a Boltzmann
distribution in their energies
– This is exactly the posterior distribution over
configurations given the data vector
• Minimize the expected energy of the chosen
configurations.
– Change the parameters to minimize the energies of
configurations weighted by their probability of being
picked.
• Don’t worry about the changes in the free energy caused
by changes in the posterior distribution.
– We chose the distribution to minimize free energy.
– So small changes in the distribution have no first-order
effect on the free energy!
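
The points above can be checked numerically. The following is a minimal sketch, not part of the original slide. It assumes temperature 1, four hidden configurations, and energies that depend linearly on a single parameter theta; the names SLOPES, OFFSETS, energies_at, and the specific numbers are hypothetical. It shows that (a) the Boltzmann distribution attains the free energy -log Z, (b) nudging the distribution by eps raises the free energy only by roughly eps squared, and (c) the gradient of the free energy with respect to theta matches the posterior-weighted gradient of the energies, with the posterior held fixed.

```python
import math

# Hypothetical toy model: energy of configuration i is SLOPES[i] * theta + OFFSETS[i].
SLOPES = [0.5, -1.0, 2.0, 0.0]
OFFSETS = [1.0, 2.5, 0.3, 4.0]

def energies_at(theta):
    return [a * theta + b for a, b in zip(SLOPES, OFFSETS)]

def boltzmann(energies):
    """Posterior over configurations: p_i proportional to exp(-E_i) (temperature 1)."""
    m = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - m)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def free_energy(q, energies):
    """Expected energy under q minus the entropy of q."""
    expected_energy = sum(qi * ei for qi, ei in zip(q, energies))
    entropy = -sum(qi * math.log(qi) for qi in q if qi > 0)
    return expected_energy - entropy

def min_free_energy(theta):
    """Free energy with the picking distribution set to the posterior."""
    e = energies_at(theta)
    return free_energy(boltzmann(e), e)

theta = 0.0
energies = energies_at(theta)
posterior = boltzmann(energies)
f_min = free_energy(posterior, energies)

# (a) At the Boltzmann distribution the free energy equals -log Z.
neg_log_z = -math.log(sum(math.exp(-e) for e in energies))

# (b) Perturb the distribution along a direction that sums to zero,
#     so it stays normalized; the free energy rises by about eps**2.
direction = [1.0, -1.0, 0.5, -0.5]
gaps = []
for eps in (1e-3, 1e-4):
    q = [p + eps * d for p, d in zip(posterior, direction)]
    gaps.append(free_energy(q, energies) - f_min)
ratio = gaps[0] / gaps[1]  # close to 100 when the change is second order

# (c) Gradient with the posterior held fixed vs. the full numerical gradient.
fixed_posterior_grad = sum(p * a for p, a in zip(posterior, SLOPES))
h = 1e-5
full_grad = (min_free_energy(theta + h) - min_free_energy(theta - h)) / (2 * h)

print("F at posterior:", f_min, " -log Z:", neg_log_z)
print("free energy gaps:", gaps, " ratio:", ratio)
print("fixed-posterior gradient:", fixed_posterior_grad, " full gradient:", full_grad)
```

Shrinking eps by a factor of 10 shrinks the gap by a factor of about 100, which is the sense in which changes in the posterior can be ignored when taking the gradient.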