Getting a sample from the model
If there are more than a few hidden units, we cannot
compute the normalizing term (the partition function)
because it has exponentially many terms.
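To make the "exponentially many terms" concrete, the partition function can be written out as below. The symbols are assumptions added for illustration and are not defined in the slide: v and h are the visible and hidden state vectors, E(v, h) is the energy of a joint configuration, and N is the total number of binary units.

```latex
Z = \sum_{\mathbf{v},\,\mathbf{h}} e^{-E(\mathbf{v},\mathbf{h})},
\qquad \text{a sum over } 2^{N} \text{ joint configurations of } N \text{ binary units}
```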
So use Markov Chain Monte Carlo to get samples from
the model:
Start at a random global configuration.
Keep picking units at random and allowing them to
stochastically update their states based on their
energy gaps.
At thermal equilibrium, the probability of a global
configuration is given by the Boltzmann distribution.
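The sampling procedure above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation, and it rests on assumptions the slide does not state: units are binary (0 or 1), the energy is E(s) = -sum_i b_i s_i - sum_{i<j} w_ij s_i s_j with symmetric weights, and a unit turns on with probability given by the logistic function of its energy gap divided by a temperature. The names `energy_gap`, `sample_boltzmann`, `weights`, `biases`, `n_updates` and `temperature` are hypothetical.

```python
import math
import random


def energy_gap(i, state, weights, biases):
    """Energy gap of unit i: E(s_i = 0) - E(s_i = 1), under the assumed energy."""
    return biases[i] + sum(
        weights[i][j] * state[j] for j in range(len(state)) if j != i
    )


def sample_boltzmann(weights, biases, n_updates, temperature=1.0, rng=None):
    """Run n_updates single-unit stochastic updates and return the final state.

    Assumes `weights` is a symmetric square list of lists. The returned state
    is only approximately a sample from the Boltzmann distribution, and only
    if n_updates is large enough for the chain to approach equilibrium.
    """
    rng = rng or random.Random()
    n = len(biases)
    # Start at a random global configuration.
    state = [rng.randint(0, 1) for _ in range(n)]
    for _ in range(n_updates):
        # Pick a unit at random.
        i = rng.randrange(n)
        gap = energy_gap(i, state, weights, biases)
        # Assumed update rule: logistic function of the energy gap.
        p_on = 1.0 / (1.0 + math.exp(-gap / temperature))
        state[i] = 1 if rng.random() < p_on else 0
    return state


if __name__ == "__main__":
    # Two units coupled by a positive weight tend to agree with each other.
    w = [[0.0, 2.0], [2.0, 0.0]]
    b = [-1.0, -1.0]
    print(sample_boltzmann(w, b, n_updates=1000, rng=random.Random(0)))
```

How many updates are needed before the chain is close to equilibrium depends on the weights and the temperature; the value of 1000 in the usage example is arbitrary.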