Whitening the learning signal instead
of the data
Contrastive divergence learning can remove the effects
of the second-order statistics on the learning without
actually changing the data.
The lateral connections model the second-order
statistics.
If a pixel can be reconstructed correctly using second-order
statistics, its value will be the same in the
reconstruction as in the data.
The hidden units can then focus on modeling high-
order structure that cannot be predicted by the lateral
connections.
For example, a pixel close to an edge cannot be predicted this way,
because interpolation from nearby pixels causes incorrect smoothing.
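
A minimal sketch of the idea, assuming a Boltzmann machine whose visible units have lateral connections and are reconstructed by a few damped mean-field updates. The function name `cd1_step`, the parameters `n_mean_field` and `damping`, and the choice to start the reconstruction from the data are illustrative assumptions, not details taken from the slide.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, L, b_v, b_h, rng, n_mean_field=10, damping=0.5):
    """One CD-1 step with lateral visible connections (illustrative sketch).

    v0 : (batch, n_vis) data with values in [0, 1]
    W  : (n_vis, n_hid) visible-to-hidden weights
    L  : (n_vis, n_vis) symmetric lateral weights with zero diagonal
    """
    # Positive phase: hidden units driven by the data.
    h0_prob = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(v0.dtype)

    # Reconstruction: top-down input is held fixed while the visible
    # units settle under damped mean-field updates that include the
    # lateral connections (assumed here to start from the data).
    top_down = h0 @ W.T + b_v
    v1 = v0.copy()
    for _ in range(n_mean_field):
        v1 = damping * v1 + (1.0 - damping) * sigmoid(top_down + v1 @ L)

    # Negative phase: hidden units driven by the reconstruction.
    h1_prob = sigmoid(v1 @ W + b_h)

    # Learning signal: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    dW = (v0.T @ h0_prob - v1.T @ h1_prob) / n
    dL = (v0.T @ v0 - v1.T @ v1) / n
    np.fill_diagonal(dL, 0.0)  # no self-connections
    return dW, dL, v1
```

In this sketch the data are never whitened. Pairwise statistics that the reconstruction already reproduces cancel in `dL`, so only the mismatch between data and reconstruction drives the update.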