The conditional RBM model
 t-1        t
Given the data and the previous hidden
state, the hidden units at time t are
conditionally independent.
So online inference is very easy if we
do not need to propagate uncertainty
about the hidden states.
Learning can be done by using
contrastive divergence.
Reconstruct the data at time t from
the inferred states of the hidden units.
The temporal connections between
hiddens can be learned as if they
were additional biases
t-2       t-1        t