Mixture: Take a weighted average of the distributions. It can never be sharper than the individual distributions. It's a very weak way to combine models.
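
As a rough illustration of the point above (not from the source): a minimal sketch in which two discrete distributions over the same small support are stored as numpy arrays and averaged with weight w. The names and the toy numbers are illustrative assumptions; the output shows the mixture coming out broader than either component.

  # A minimal sketch, assuming discrete distributions over the same support,
  # each stored as a numpy array that sums to 1 (all values are illustrative).
  import numpy as np

  def mixture(p, q, w=0.5):
      # Weighted average; no renormalization needed because the weights
      # sum to 1 and each input already sums to 1.
      return w * p + (1.0 - w) * q

  p = np.array([0.8, 0.1, 0.1])   # sharply peaked on the first outcome
  q = np.array([0.1, 0.1, 0.8])   # sharply peaked on the last outcome
  print(mixture(p, q))            # [0.45 0.1 0.45] -- broader than either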
Product: Multiply the distributions at each point and then renormalize. Exponentially more powerful than a mixture. The normalization makes maximum likelihood learning difficult, but approximations allow us to learn anyway.
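
A companion sketch under the same toy assumptions as above: the pointwise product is renormalized explicitly, which is tractable here only because the support has three outcomes; in real models this normalizing constant is what makes maximum likelihood learning hard, as noted above. Where the two factors agree, the product becomes sharper than either one.

  # A minimal sketch with the same toy representation as before; the explicit
  # renormalization is cheap only because the support is tiny.
  import numpy as np

  def product(p, q):
      unnormalized = p * q                       # multiply pointwise
      return unnormalized / unnormalized.sum()   # renormalize

  p = np.array([0.6, 0.3, 0.1])
  q = np.array([0.6, 0.3, 0.1])
  print(product(p, q))   # ~[0.78 0.20 0.02] -- sharper than either factor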
Composition: Use the values of the latent variables of one model as the data for the next model. Works well for learning multiple layers of representation, but only if the individual models are undirected.
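
A schematic sketch of the composition idea, not code from the source: each layer here is a toy binary RBM (an undirected model) trained with one step of contrastive divergence, with biases omitted for brevity. The helpers train_rbm and infer_latents, the layer sizes, and the random binary data are all illustrative assumptions.

  import numpy as np

  rng = np.random.default_rng(0)

  def sigmoid(x):
      return 1.0 / (1.0 + np.exp(-x))

  def train_rbm(data, n_hidden, epochs=50, lr=0.1):
      # Toy binary RBM trained with one step of contrastive divergence (CD-1).
      n_visible = data.shape[1]
      W = 0.01 * rng.standard_normal((n_visible, n_hidden))
      for _ in range(epochs):
          # Positive phase: infer hidden probabilities from the data.
          h_prob = sigmoid(data @ W)
          h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
          # Negative phase: one reconstruction step.
          v_recon = sigmoid(h_sample @ W.T)
          h_recon = sigmoid(v_recon @ W)
          # CD-1 weight update (biases omitted for brevity).
          W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
      return W

  def infer_latents(data, W):
      # Hidden probabilities serve as the "data" for the next layer.
      return sigmoid(data @ W)

  # Greedy layer-by-layer composition: each layer is trained on the latent
  # values produced by the layer below it.
  data = (rng.random((100, 20)) < 0.5).astype(float)   # toy binary data
  layers, x = [], data
  for n_hidden in [12, 8]:
      W = train_rbm(x, n_hidden)
      layers.append(W)
      x = infer_latents(x, W)      # latents become the next layer's data

Each weight matrix in layers is learned greedily on the representation produced by the layer below, mirroring the layer-by-layer scheme described above.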