• Each RBM converts its data distribution into a posterior distribution over its hidden units.

• This divides the task of modeling its data into two tasks:

– Task 1: Learn generative weights that can convert the posterior distribution over the hidden units back into the data.

– Task 2: Learn to model the posterior distribution over the hidden units.

– The RBM does a good job of Task 1 and a not-so-good job of Task 2.

• Task 2 is easier (for the next RBM) than modeling the original data, because the posterior distribution is closer to a distribution that an RBM can model perfectly.
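The hand-off between the two tasks can be sketched in code: train one RBM on the data, then treat its posterior over hidden units as the "data" for the next RBM. This is a minimal illustrative sketch, assuming binary units and one-step contrastive divergence (CD-1) for training; the class and function names (`RBM`, `train_stack`) are my own, not from the original.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Minimal binary RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)  # visible biases
        self.b_h = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        # Posterior p(h = 1 | v): the distribution the next RBM will model (Task 2).
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        # Generative direction (Task 1): convert hidden states back into data.
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # One CD-1 update on a batch of visible vectors v0.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = len(v0)
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)

def train_stack(data, layer_sizes, epochs=50, seed=0):
    """Greedy layer-wise training: each RBM models the posterior of the one below."""
    rng = np.random.default_rng(seed)
    rbms, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(inputs.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(inputs)
        # The posterior over this RBM's hidden units becomes the next layer's "data".
        inputs = rbm.hidden_probs(inputs)
        rbms.append(rbm)
    return rbms

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = (rng.random((200, 12)) < 0.3).astype(float)
    stack = train_stack(data, layer_sizes=[8, 4])
    print(len(stack), stack[0].W.shape, stack[1].W.shape)
```

The line `inputs = rbm.hidden_probs(inputs)` is where Task 2 is handed to the next RBM: the second RBM never sees the original data, only the first RBM's posterior, which (per the last bullet) is an easier distribution to model.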