Time series models
If we really need distributed representations (which we
nearly always do), we can make inference much simpler
by using three tricks:
Use an RBM for the interactions between hidden and
visible variables. An RBM has no hidden-to-hidden connections,
so the main source of information (the current visible frame)
favors a factorial posterior over the hidden units.
Model short-range temporal information by allowing
several previous frames to provide input to the hidden
units and to the visible units.
This leads to a temporal module that can be stacked,
so we can use greedy learning to learn deep models
of temporal structure.
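
To make the first two tricks concrete, here is a minimal sketch of inference in one such temporal module, assuming binary hidden units. The function and parameter names (hidden_posterior, W, B, b_h, history) are hypothetical and chosen for illustration only. Previous frames act as an extra, data-dependent bias on each hidden unit, so the posterior stays factorial and is computed in a single pass.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def vec_times_matrix(x, M):
    """Return the row vector x (length n) times the matrix M (n rows, m columns)."""
    n_cols = len(M[0])
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(n_cols)]


def hidden_posterior(v_t, history, W, B, b_h):
    """Return p(h_j = 1 | v_t, history) for every hidden unit j.

    v_t:     current visible frame, length n_vis
    history: concatenated previous frames, length n_past * n_vis
    W:       visible-to-hidden RBM weights, n_vis x n_hid (assumed name)
    B:       history-to-hidden weights, (n_past * n_vis) x n_hid (assumed name)
    b_h:     static hidden biases, length n_hid (assumed name)
    """
    from_visible = vec_times_matrix(v_t, W)      # RBM interaction term
    from_history = vec_times_matrix(history, B)  # input from previous frames
    # Each hidden unit is conditionally independent of the others,
    # so the posterior is a product of per-unit sigmoids.
    return [sigmoid(b + a + c)
            for b, a, c in zip(b_h, from_visible, from_history)]


# Toy usage: two visible units, two hidden units, two previous frames.
v_t = [1.0, 0.0]
history = [0.0, 1.0, 1.0, 0.0]
W = [[0.5, -0.5], [0.0, 1.0]]
B = [[0.0, 0.0] for _ in range(4)]
b_h = [0.0, 0.0]
p = hidden_posterior(v_t, history, W, B, b_h)
```

Stacking would then, under the same assumptions, treat the sequence of hidden probabilities from one module as the visible frames of the next, with each module trained greedily in turn.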