• Each RBM converts its data distribution into a posterior distribution over its hidden units.
• This divides the task of modeling its data into two tasks:
  – Task 1: Learn generative weights that can convert the posterior distribution over the hidden units back into the data.
  – Task 2: Learn to model the posterior distribution over the hidden units.
  – The RBM does a good job of task 1 and a not-so-good job of task 2.
• Task 2 is easier (for the next RBM) than modeling the original data, because the posterior distribution is closer to a distribution that an RBM can model perfectly (see the sketch below).
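
Since the slides stop at the intuition, here is a minimal NumPy sketch of the greedy stacking idea under stated assumptions: binary stochastic units, single-step contrastive divergence (CD-1), full-batch updates, and illustrative names (RBM, cd1_step, train_stack) that do not come from the slides. Each trained RBM converts its input into a posterior over its hidden units, and the next RBM is then trained on that posterior, i.e. it takes over Task 2.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases

    def hidden_posterior(self, v):
        # p(h=1 | v): converts the data distribution into a posterior
        # distribution over the hidden units (the slide's key step).
        return sigmoid(v @ self.W + self.c)

    def visible_posterior(self, h):
        # p(v=1 | h): the generative direction, i.e. Task 1 --
        # converting the hidden posterior back into the data.
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        # One CD-1 update: up to hidden, down to a reconstruction, up again.
        ph0 = self.hidden_posterior(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        pv1 = self.visible_posterior(h0)
        ph1 = self.hidden_posterior(pv1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)

def train_stack(data, layer_sizes, epochs=20):
    # Greedy layer-wise stacking: each new RBM models the previous
    # RBM's posterior over its hidden units (Task 2), which is closer
    # to something an RBM can model than the raw data is.
    rbms, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        # Hand sampled hidden states to the next layer as its "data".
        ph = rbm.hidden_posterior(x)
        x = (rng.random(ph.shape) < ph).astype(float)
    return rbms

# Toy usage: random binary data, two stacked RBMs.
data = (rng.random((200, 20)) < 0.3).astype(float)
stack = train_stack(data, layer_sizes=[16, 8])

Passing sampled binary hidden states (rather than real-valued probabilities) to the next layer is one common choice here; it keeps every layer's training problem binary, matching the RBM's modeling assumptions.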