Another view of why layer-by-layer learning works
There is an unexpected equivalence between RBMs and directed networks with many layers, all of which use the same weights.
This equivalence also gives insight into why
contrastive divergence learning works.
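As a minimal sketch of what this equivalence rests on (using standard RBM notation for the weights $w_{ij}$ and the visible and hidden biases $b_i$, $c_j$, which are introduced here for illustration rather than defined earlier in this section): the conditional distributions used in alternating Gibbs sampling in an RBM are
\[
p(h_j = 1 \mid \mathbf{v}) = \sigma\Big(c_j + \sum_i v_i w_{ij}\Big), \qquad
p(v_i = 1 \mid \mathbf{h}) = \sigma\Big(b_i + \sum_j w_{ij} h_j\Big), \qquad
\sigma(x) = \frac{1}{1 + e^{-x}}.
\]
These factorial logistic conditionals are exactly the top-down conditionals of a sigmoid belief net in which every pair of adjacent layers is connected by the same weight matrix (transposed in alternate layers). Running alternating Gibbs sampling in the RBM therefore performs the same computation as ancestral, top-down sampling in an infinitely deep directed net with tied weights, which is the sense in which the two models are equivalent.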