• Everything that one weight needs to know about the other weights and the data in order to do maximum likelihood learning is contained in the difference of two correlations:

$$\frac{\partial \log p(\mathbf{v})}{\partial w_{ij}} \;=\; \langle s_i s_j \rangle_{\mathbf{v}} \;-\; \langle s_i s_j \rangle_{\text{model}}$$

• The left-hand side is the derivative of the log probability of one training vector.

• $\langle s_i s_j \rangle_{\mathbf{v}}$ is the expected value of the product of states at thermal equilibrium when the training vector is clamped on the visible units.

• $\langle s_i s_j \rangle_{\text{model}}$ is the expected value of the product of states at thermal equilibrium when nothing is clamped.
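The rule above can be sketched numerically: given binary unit states sampled at thermal equilibrium in the two phases, the weight update is just the difference of the two average co-activation matrices. This is a minimal illustration, not a full Boltzmann machine; the random arrays below stand in for states that would really come from Gibbs sampling, and the names (`clamped_states`, `free_states`, `learning_rate`) are illustrative assumptions.

```python
import numpy as np

# Stand-in samples of binary unit states (rows = samples, cols = units).
# In a real Boltzmann machine these would be collected by running the
# network to thermal equilibrium in each phase.
rng = np.random.default_rng(0)
clamped_states = rng.integers(0, 2, size=(500, 6))  # training vector clamped on visibles
free_states = rng.integers(0, 2, size=(500, 6))     # nothing clamped

# <s_i s_j> under each condition: average of the outer product of states.
corr_clamped = clamped_states.T @ clamped_states / len(clamped_states)
corr_free = free_states.T @ free_states / len(free_states)

# Learning rule: climb the gradient of the log probability of the data.
learning_rate = 0.1
delta_w = learning_rate * (corr_clamped - corr_free)
np.fill_diagonal(delta_w, 0.0)  # no self-connections
```

Note that each weight's update uses only these two locally measurable correlations, which is the point of the slide: no other information about the rest of the network is needed.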