












• The obvious Markov chain makes a random perturbation to the data and accepts it with a probability that depends on the energy change.




– Diffuses very slowly over flat regions.




– Cannot cross energy barriers easily.



• In high-dimensional spaces, it is much better to use the gradient to choose good directions and to use momentum.




– Beats diffusion. Scales well.




– Can cross energy barriers.
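
The two samplers contrasted above can be sketched side by side. This is a minimal illustration, not the slides' own code: the quadratic energy, the step sizes, and all function names (`energy`, `grad_energy`, `metropolis_step`, `hmc_step`, `run`) are assumptions chosen for the example. Leapfrog-based Hamiltonian Monte Carlo is used here as one common way to combine the gradient with momentum.

```python
import numpy as np

def energy(x):
    # Assumed example energy (not from the slides): E(x) = 0.5 * ||x||^2.
    return 0.5 * np.dot(x, x)

def grad_energy(x):
    # Gradient of the assumed energy above.
    return x

def metropolis_step(x, rng, step=0.1):
    # Random perturbation, accepted with probability min(1, exp(-delta_E)).
    proposal = x + step * rng.standard_normal(x.shape)
    delta = energy(proposal) - energy(x)
    if rng.random() < np.exp(min(0.0, -delta)):
        return proposal, True
    return x, False

def hmc_step(x, rng, eps=0.1, n_leapfrog=20):
    # Draw a fresh momentum, then follow the gradient with leapfrog steps.
    p = rng.standard_normal(x.shape)
    h_old = energy(x) + 0.5 * np.dot(p, p)
    x_new = x.copy()
    p_new = p - 0.5 * eps * grad_energy(x_new)      # initial half step in momentum
    for i in range(n_leapfrog):
        x_new = x_new + eps * p_new                 # full step in position
        if i < n_leapfrog - 1:
            p_new = p_new - eps * grad_energy(x_new)  # full step in momentum
    p_new = p_new - 0.5 * eps * grad_energy(x_new)  # final half step in momentum
    h_new = energy(x_new) + 0.5 * np.dot(p_new, p_new)
    # Accept or reject based on the change in total (potential + kinetic) energy.
    if rng.random() < np.exp(min(0.0, h_old - h_new)):
        return x_new, True
    return x, False

def run(step_fn, dim=50, n_steps=200, seed=0):
    # Run a chain from the origin and report the final state and acceptance rate.
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    accepted = 0
    for _ in range(n_steps):
        x, ok = step_fn(x, rng)
        accepted += ok
    return x, accepted / n_steps
```

Calling `run(metropolis_step)` and `run(hmc_step)` gives one chain of each kind on the same energy. Each momentum-driven trajectory makes `n_leapfrog` gradient-guided moves in a consistent direction before the accept/reject test, whereas the random-walk chain makes one undirected move per test.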

