


















• Suppose we want to build a model of a complicated data distribution by combining several simple models. What combination rule should we use?



• Mixture models take a weighted sum of the distributions.




– Easy to learn.




– The combination is always vaguer than the individual distributions.



• Products of Experts multiply the distributions together and renormalize.




– The product is much sharper than the individual distributions.




– A nasty normalization term is needed to convert the product of the individual densities into a combined density.
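
The contrast between the two combination rules can be checked numerically. The sketch below is illustrative only and is not taken from the slides: it assumes two one-dimensional Gaussian "experts", equal mixing weights, and a simple grid sum in place of exact integration. The helper names (`gaussian`, `variance`) and the grid settings are hypothetical choices made for this example.

```python
import numpy as np

def gaussian(x, mean, std):
    """Density of a 1-D Gaussian evaluated on the grid x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def variance(x, density, dx):
    """Variance of a density tabulated on a uniform grid (Riemann sum)."""
    mean = np.sum(x * density) * dx
    return np.sum((x - mean) ** 2 * density) * dx

# Uniform grid wide enough that both densities are negligible at the edges.
x = np.linspace(-12.0, 12.0, 24001)
dx = x[1] - x[0]

# Two simple "experts" (assumed for illustration): unit-variance Gaussians.
p1 = gaussian(x, -1.0, 1.0)
p2 = gaussian(x, 1.0, 1.0)

# Mixture: a weighted sum of the densities. It is already normalized
# because the weights sum to one.
mixture = 0.5 * p1 + 0.5 * p2

# Product of experts: multiply the densities, then renormalize.
# Z is the normalization term; here a grid sum is cheap, but in
# high dimensions this integral is what makes the product hard to use.
unnormalized = p1 * p2
Z = np.sum(unnormalized) * dx
product = unnormalized / Z

var_mixture = variance(x, mixture, dx)
var_product = variance(x, product, dx)

print("variance of each expert:", 1.0)
print("variance of the mixture:", var_mixture)
print("variance of the product:", var_product)
print("normalization term Z:  ", Z)
```

In this toy case the mixture comes out broader than either expert and the product comes out narrower, which matches the bullets above. The grid sum for `Z` is only feasible because the example is one-dimensional.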

