•Instead of taking the negative samples from the equilibrium distribution, use slight corruptions of the datavectors. Only add random momentum once, and only follow the dynamics for a few steps.
–Much less variance because a datavector and its confabulation form a matched pair.
–Seems to be very biased, but maybe it is optimizing a different objective function.
•If the model is perfect and there is an infinite amount of data, the confabulations will be equilibrium samples. So the shortcut will not cause learning to mess up a perfect model.
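
The shortcut described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the method as originally implemented: the energy function is a hypothetical toy, E(x; w) = 0.5 * sum(w * x^2), the dynamics use a plain leapfrog integrator with no accept/reject step, and the names `confabulate`, `cd_update`, `n_steps` and `step_size` are invented for this example.

```python
import numpy as np


def energy_grad_x(x, w):
    # Gradient of the assumed toy energy E(x; w) = 0.5 * sum(w * x**2) w.r.t. x.
    return w * x


def energy_grad_w(x, w):
    # Gradient of the same toy energy w.r.t. w, averaged over the batch.
    return 0.5 * np.mean(x ** 2, axis=0)


def confabulate(x_data, w, rng, n_steps=5, step_size=0.1):
    """Start at the datavectors, add random momentum once, and follow the
    dynamics for a few leapfrog steps (no accept/reject: a simplification)."""
    x = x_data.copy()
    p = rng.standard_normal(x.shape)               # momentum is sampled only once
    p -= 0.5 * step_size * energy_grad_x(x, w)     # initial half step for momentum
    for step in range(n_steps):
        x += step_size * p                         # full step for position
        scale = 1.0 if step < n_steps - 1 else 0.5  # final momentum step is a half step
        p -= scale * step_size * energy_grad_x(x, w)
    return x


def cd_update(x_data, w, rng, n_steps=5, step_size=0.1):
    """Direction in which to move w: the energy gradient on the confabulations
    minus the energy gradient on the matched datavectors."""
    x_conf = confabulate(x_data, w, rng, n_steps, step_size)
    return energy_grad_w(x_conf, w) - energy_grad_w(x_data, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((5000, 2))          # unit-variance toy data
    w = np.array([0.25, 0.25])                     # model is too broad at the start
    for _ in range(200):
        w = w + 0.5 * cd_update(data, w, rng)
    print(w)
```

In this toy setting the update raises `w` when the model is broader than the data and lowers it when the model is narrower. When `w` already matches the data precision, the confabulations have roughly the same statistics as the data, so the update should hover near zero up to sampling noise, which is the "perfect model" case in the last bullet.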