The conditional RBM model
[Figure: the model at time steps t-1 and t]
Given the data and the previous
hidden state, the hidden units at
time t are conditionally independent.
So online inference is very easy.
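
A minimal sketch of this online inference in Python with NumPy, assuming binary logistic units and temporal connections only from the hidden state at t-1. The names `W` (visible-to-hidden weights), `A` (hidden-to-hidden temporal weights), `b_h` (hidden biases), and the all-zero initial hidden state are illustrative assumptions, not taken from the slide.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def infer_hidden(v_t, h_prev, W, A, b_h):
    """Return p(h_t = 1 | v_t, h_prev) for all hidden units at once.

    Assumed shapes: v_t (n_visible,), h_prev (n_hidden,),
    W (n_visible, n_hidden), A (n_hidden, n_hidden), b_h (n_hidden,).
    """
    # The previous hidden state only shifts the hidden biases.
    dynamic_bias = b_h + h_prev @ A
    # Each unit gets its own logistic; no interaction between hidden units.
    return sigmoid(v_t @ W + dynamic_bias)

def filter_sequence(V, W, A, b_h):
    """Online inference over a sequence V of shape (T, n_visible)."""
    h_prev = np.zeros(W.shape[1])  # assumed initial hidden state
    probs = []
    for v_t in V:
        # One feed-forward pass per time step; probabilities are passed
        # forward as the "previous hidden state" (a mean-field choice).
        h_prev = infer_hidden(v_t, h_prev, W, A, b_h)
        probs.append(h_prev)
    return np.array(probs)
```

Passing probabilities forward rather than sampled binary states is one possible design choice; sampling `h_prev` at each step would follow the same structure.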
Learning can be done with
contrastive divergence.
Reconstruct the data at time t
from the inferred states of the
hidden units.
The temporal connections
between hiddens can be learned
as if they were additional biases.
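
The learning procedure above can be sketched as a single CD-1 step in Python with NumPy. This is an illustration under stated assumptions, not a definitive implementation: binary logistic units, temporal input only from the hidden state at t-1, and hypothetical names `W`, `A`, `b_v`, `b_h`, `lr`. The point it shows is that the update for the temporal weights `A` is the hidden-bias update multiplied by the previous hidden state.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v_t, h_prev, W, A, b_v, b_h, lr=0.01, rng=None):
    """One CD-1 update at time t; returns updated (W, A, b_v, b_h).

    Assumed shapes: v_t (n_visible,), h_prev (n_hidden,),
    W (n_visible, n_hidden), A (n_hidden, n_hidden).
    """
    rng = np.random.default_rng() if rng is None else rng

    # The previous hidden state acts as a fixed extra bias on the hiddens.
    dyn = b_h + h_prev @ A

    # Positive phase: infer hidden units from the data at time t.
    p_h_data = sigmoid(v_t @ W + dyn)
    h_sample = (rng.random(p_h_data.shape) < p_h_data).astype(float)

    # Reconstruct the data at time t from the inferred hidden states.
    p_v_recon = sigmoid(h_sample @ W.T + b_v)

    # Negative phase: re-infer the hiddens from the reconstruction.
    p_h_recon = sigmoid(p_v_recon @ W + dyn)

    # Difference between data-driven and reconstruction-driven statistics.
    d_h = p_h_data - p_h_recon
    W_new = W + lr * (np.outer(v_t, p_h_data) - np.outer(p_v_recon, p_h_recon))
    b_v_new = b_v + lr * (v_t - p_v_recon)
    b_h_new = b_h + lr * d_h
    # Temporal weights: the bias update, scaled by the previous hidden state.
    A_new = A + lr * np.outer(h_prev, d_h)
    return W_new, A_new, b_v_new, b_h_new
```

Here `A[i, j]` is taken to connect hidden unit i at time t-1 to hidden unit j at time t, so its update is `h_prev[i]` times the update for the bias of unit j.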
[Figure: the model at time steps t-2, t-1, and t]