Why does the shortcut work?

• If we start at the data, the Markov chain wanders away from the data and towards things that it likes more. We can see what direction it is wandering in after only a few steps. It’s a big waste of time to let it go all the way to equilibrium.

  – All we need to do is lower the probability of the “confabulations” it produces and raise the probability of the data. Then it will stop wandering away.

• The learning cancels out once the confabulations and the data have the same distribution.

• We need to worry about regions of the data-space that the model likes but which are very far from any data.

  – These regions cause the normalization term to be big, and we cannot sense them if we use the shortcut, because a chain started at the data does not reach them in only a few steps.
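
The points above can be sketched in code. This is a minimal illustration, assuming the "shortcut" is a contrastive-divergence-style update for a small binary restricted Boltzmann machine; the slide does not name the model, so the RBM, the omission of bias terms, and the names `sample_confabulations`, `cd_weight_update`, `k` and `lr` are all assumptions made for the example. The chain is started at the data and run for only `k` steps, and the update raises the statistics of the data while lowering those of the confabulations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_confabulations(W, v_data, rng, k=1):
    """Run k steps of alternating Gibbs sampling, starting the chain AT the data.

    W has shape (num_visible, num_hidden). Biases are left out for brevity
    (an assumption of this sketch, not something the slide specifies).
    """
    v = v_data
    for _ in range(k):
        # Sample binary hidden units given the current visible state.
        h_prob = sigmoid(v @ W)
        h = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Sample binary visible units given the hidden state: a "confabulation".
        v_prob = sigmoid(h @ W.T)
        v = (rng.random(v_prob.shape) < v_prob).astype(float)
    return v

def cd_weight_update(W, v_data, v_confab, lr=0.1):
    """Raise the probability of the data and lower that of the confabulations."""
    n = len(v_data)
    positive = v_data.T @ sigmoid(v_data @ W) / n        # statistics at the data
    negative = v_confab.T @ sigmoid(v_confab @ W) / n    # statistics after k steps
    return lr * (positive - negative)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(0.0, 0.1, size=(6, 3))
    v_data = np.array([[1, 1, 1, 0, 0, 0],
                       [0, 0, 0, 1, 1, 1]], dtype=float)
    for _ in range(100):
        v_confab = sample_confabulations(W, v_data, rng, k=1)
        W += cd_weight_update(W, v_data, v_confab)
```

If the confabulations are identical to the data, the two terms in `cd_weight_update` are equal and the update is exactly zero; when they merely share the same distribution, the update is zero in expectation. Because the chain only takes `k` steps from the data, nothing in this update can lower the probability of a region the model likes that lies far from every training case.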