• If we really need distributed representations (which we nearly always do), we can make inference much simpler by using three tricks:
  – Use an RBM for the interactions between hidden and visible variables. This ensures that the main source of information wants the posterior to be factorial.
  – Model short-range temporal information by allowing several previous frames to provide input to the hidden units and to the visible units.
• This leads to a temporal module that can be stacked
  – So we can use greedy learning to learn deep models of temporal structure.
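The temporal module described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation behind the slides: the class name `ConditionalRBM`, the weight names `W`, `A`, `B`, the use of binary units, and all sizes are hypothetical choices made for the example. The point it shows is that the previous frames only shift the biases of the hidden and visible units, so given the current frame and the history, the posterior over hidden units is a product of independent sigmoids.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ConditionalRBM:
    """Hypothetical sketch of one temporal module (binary units assumed)."""

    def __init__(self, n_visible, n_hidden, n_prev, rng):
        # Undirected RBM weights between current visible and hidden units.
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        # Directed weights from the concatenated previous frames to the visible units.
        self.A = 0.01 * rng.standard_normal((n_prev * n_visible, n_visible))
        # Directed weights from the concatenated previous frames to the hidden units.
        self.B = 0.01 * rng.standard_normal((n_prev * n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_posterior(self, v, history):
        # The history acts as a dynamic bias, so each hidden unit's
        # posterior is an independent sigmoid (a factorial posterior).
        return sigmoid(v @ self.W + history @ self.B + self.b_h)

    def visible_conditional(self, h, history):
        # The same idea for the visible units given the hidden units and history.
        return sigmoid(h @ self.W.T + history @ self.A + self.b_v)


rng = np.random.default_rng(0)
n_prev = 3
model = ConditionalRBM(n_visible=6, n_hidden=4, n_prev=n_prev, rng=rng)

# Eight random binary frames stand in for real sequence data.
frames = (rng.random((8, 6)) > 0.5).astype(float)

# Inference for frame 3, conditioned on frames 0..2 (oldest first).
history = frames[0:n_prev].ravel()
p_h = model.hidden_posterior(frames[n_prev], history)
p_v = model.visible_conditional(p_h, history)

# Hidden activities over time; in a stacked model these would be the
# "frames" that the next temporal module is trained on.
hidden_seq = np.stack([
    model.hidden_posterior(frames[t], frames[t - n_prev:t].ravel())
    for t in range(n_prev, len(frames))
])
```

Here `hidden_seq` has one row per frame that has a full history. Treating those rows as the input sequence for a second module of the same kind is one way to read the greedy stacking idea; the training step itself (for example contrastive divergence) is deliberately left out of this sketch.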