CIAR Summer School Tutorial
Lecture 1a: Mixtures of Gaussians, EM, and Variational Free Energy
Two types of density model (with hidden configurations h)
The k-means algorithm
Why k-means converges
The soft assignment step
"How do we find the..."
The re-fitting step
Some difficulties with soft k-means
A generative view of clustering
The mixture of Gaussians generative model
Computing the new mixing proportions
Computing the new means
Computing the new variances
How many Gaussians do we use?
Avoiding local optima
Speeding up the fitting
Proving that EM improves the log probability of the training data
An MDL approach to clustering
How many bits must we send?
Using a Gaussian agreed distribution
What is the best variance to use?
Sending a value assuming a mixture of two equal Gaussians
The bits-back argument
Using another message to make random decisions
The general case
A Canadian example
What is the best distribution?
EM as coordinate descent in Free Energy
The advantage of using F to understand EM
The indecisive means algorithm
An incremental EM algorithm
Stochastic MDL using the wrong distribution over codes
A spectrum of representations
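The outline above covers the EM updates for a mixture of Gaussians: a soft assignment step, then re-fitting the mixing proportions, means, and variances from the soft counts. A minimal 1-D sketch of those two steps in Python, using synthetic data (the data, initial values, and variable names are illustrative, not from the lecture):

```python
import math
import random

random.seed(0)

# Synthetic 1-D data drawn from two Gaussians (illustrative only).
data = [random.gauss(-2.0, 1.0) for _ in range(200)] + \
       [random.gauss(3.0, 1.0) for _ in range(200)]

def gauss_pdf(x, mu_k, var_k):
    """Density of a univariate Gaussian at x."""
    return math.exp(-(x - mu_k) ** 2 / (2 * var_k)) / math.sqrt(2 * math.pi * var_k)

# Initial guesses for mixing proportions, means, and variances.
pi = [0.5, 0.5]
mu = [-1.0, 1.0]
var = [1.0, 1.0]

for _ in range(50):
    # E-step: soft assignment -- the posterior responsibility of each
    # component for each data point under the current parameters.
    resp = []
    for x in data:
        p = [pi[k] * gauss_pdf(x, mu[k], var[k]) for k in range(2)]
        s = sum(p)
        resp.append([pk / s for pk in p])

    # M-step: re-fit each component's parameters from the soft counts.
    for k in range(2):
        nk = sum(r[k] for r in resp)                              # effective count
        pi[k] = nk / len(data)                                    # new mixing proportion
        mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk    # new mean
        var[k] = sum(r[k] * (x - mu[k]) ** 2
                     for r, x in zip(resp, data)) / nk            # new variance

print(sorted(round(m, 2) for m in mu))
```

Each iteration of this loop cannot decrease the log probability of the data, which is the convergence guarantee the lecture proves via the free-energy view of EM.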