The goal of learning
Maximize the product of the probabilities that the
Boltzmann machine assigns to the vectors in the
training set.
This is equivalent to maximizing the sum of
the log probabilities of the training vectors.
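
The equivalence above can be written out explicitly. This is a sketch in notation assumed here, not taken from the slide: $W$ stands for the weights and biases, $v^{(1)}, \dots, v^{(N)}$ for the training vectors, and $p(v \mid W)$ for the probability the Boltzmann machine assigns to visible vector $v$. Because the logarithm is strictly increasing and every $p(v \mid W)$ is positive for finite weights, both objectives are maximized by the same $W$:

```latex
\[
\arg\max_{W} \prod_{n=1}^{N} p\bigl(v^{(n)} \mid W\bigr)
\;=\;
\arg\max_{W} \log \prod_{n=1}^{N} p\bigl(v^{(n)} \mid W\bigr)
\;=\;
\arg\max_{W} \sum_{n=1}^{N} \log p\bigl(v^{(n)} \mid W\bigr)
\]
```
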
It is also equivalent to maximizing the
probability that we would obtain exactly the
training vectors on the visible units if, once
per training vector, we let the whole network
reach thermal equilibrium with no external
input and then sampled the visible units.
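
As a rough illustration of the objective, the sketch below computes the equilibrium distribution over visible vectors exactly, by enumerating every joint state of a tiny network. This is feasible only for toy sizes. The weights, biases, training set, and the function names `energy` and `visible_distribution` are hypothetical choices made for this example, and the energy convention $E(s) = -\sum_i b_i s_i - \sum_{i<j} w_{ij} s_i s_j$ with binary 0/1 units is assumed rather than stated on the slide.

```python
import itertools
import math


def energy(state, weights, biases):
    """Energy of a joint binary state (assumed convention):
    E(s) = -sum_i b_i s_i - sum_{i<j} w_ij s_i s_j."""
    n = len(state)
    e = -sum(biases[i] * state[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e -= weights[i][j] * state[i] * state[j]
    return e


def visible_distribution(weights, biases, n_visible):
    """Exact equilibrium probability of each visible vector, obtained by
    summing exp(-E) over all hidden configurations and normalizing."""
    n = len(biases)
    unnormalized = {}
    for state in itertools.product((0, 1), repeat=n):
        v = state[:n_visible]
        boltzmann_factor = math.exp(-energy(state, weights, biases))
        unnormalized[v] = unnormalized.get(v, 0.0) + boltzmann_factor
    z = sum(unnormalized.values())  # partition function
    return {v: p / z for v, p in unnormalized.items()}


# Hypothetical toy network: units 0 and 1 are visible, unit 2 is hidden.
weights = [
    [0.0, 1.0, -0.5],
    [1.0, 0.0, 0.8],
    [-0.5, 0.8, 0.0],
]
biases = [0.1, -0.2, 0.0]

# Hypothetical training set of visible vectors.
training_set = [(1, 1), (1, 1), (0, 0)]

p_v = visible_distribution(weights, biases, n_visible=2)

# The two equivalent objectives: product of probabilities and sum of logs.
product = math.prod(p_v[v] for v in training_set)
log_likelihood = sum(math.log(p_v[v]) for v in training_set)

print("p(v) at equilibrium:", p_v)
print("product of probabilities:", product)
print("sum of log probabilities:", log_likelihood)
```

Learning would adjust `weights` and `biases` to increase `log_likelihood`. In a network of realistic size the enumeration is intractable, so the equilibrium distribution has to be approximated by sampling.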