• Each RBM converts its data distribution into an aggregated posterior distribution over its hidden units.
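The aggregated posterior is just the average, over the training data, of the posterior over the hidden units given each data vector. A minimal NumPy sketch for a binary RBM (the weight matrix `W`, hidden biases `b_hid`, and the toy sizes are illustrative, not from the source):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(v, W, b_hid):
    # Posterior p(h_j = 1 | v) for a binary RBM: factorial over hidden units.
    return sigmoid(v @ W + b_hid)

# Hypothetical toy setup: 6 visible units, 4 hidden units, 100 binary data vectors.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(6, 4))
b_hid = np.zeros(4)
data = rng.integers(0, 2, size=(100, 6)).astype(float)

# Aggregated posterior: q(h) = (1/N) * sum_n p(h | v_n).
agg_posterior = hidden_probs(data, W, b_hid).mean(axis=0)
```

This averaged distribution over the hidden units is the "data" that the next RBM in the stack will be asked to model.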
• This divides the task of modeling its data into two tasks:
  – Task 1: Learn generative weights that can convert the aggregated posterior distribution over the hidden units back into the data distribution.
  – Task 2: Learn to model the aggregated posterior distribution over the hidden units.
  – The RBM does a good job of task 1 and a moderately good job of task 2.
• Task 2 is easier (for the next RBM) than modeling the original data because the aggregated posterior distribution is closer to a distribution that an RBM can model perfectly.
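The two-task split above is what licenses greedy layer-wise training: train one RBM on the data, then train the next RBM on the first RBM's aggregated posterior. A minimal NumPy sketch, using one-step contrastive divergence (CD-1) as a stand-in training procedure; all names, sizes, and hyperparameters here are illustrative assumptions, not values from the source:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm_cd1(data, n_hidden, epochs=5, lr=0.1, seed=0):
    """Train a binary RBM with one step of contrastive divergence (CD-1)."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
    a = np.zeros(n_visible)   # visible biases
    b = np.zeros(n_hidden)    # hidden biases
    n = data.shape[0]
    for _ in range(epochs):
        v0 = data
        ph0 = sigmoid(v0 @ W + b)                       # p(h | v) on the data
        h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sampled hidden states
        pv1 = sigmoid(h0 @ W.T + a)                     # one-step reconstruction
        ph1 = sigmoid(pv1 @ W + b)
        # CD-1 update: positive statistics minus reconstruction statistics.
        W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        a += lr * (v0 - pv1).mean(axis=0)
        b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b

# Greedy stacking on toy binary data: 8 visible -> 6 hidden -> 4 hidden.
rng = np.random.default_rng(1)
data = rng.integers(0, 2, size=(200, 8)).astype(float)

# Task 1: the first RBM learns generative weights for the data distribution.
W1, a1, b1 = train_rbm_cd1(data, n_hidden=6)

# The aggregated posterior over the first hidden layer becomes the next
# RBM's "data" (probabilities are used here in place of sampled states).
hidden1 = sigmoid(data @ W1 + b1)

# Task 2: the second RBM models the aggregated posterior distribution.
W2, a2, b2 = train_rbm_cd1(hidden1, n_hidden=4)
```

Each layer thus hands the still-unmodeled part of the job (task 2) to the layer above it, and that job gets easier at every level.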