A generative view of clustering
The mixture of Gaussians generative model
Fitting a mixture of Gaussians
The E-step: Computing responsibilities
The M-step: Computing new mixing proportions
More M-step: Computing the new means
More M-step: Computing the new variances
How do we know that the updates improve things?
The expected energy of a datapoint
The advantage of using F to understand EM
Beyond Mixture models: Directed Acyclic Graphical models
Ways to define the conditional probabilities
What is easy and what is hard in a DAG?
A trade-off between how well the model fits the data and the accuracy of inference
Using a Gaussian agreed distribution
What is the best variance to use?
Sending a value assuming a mixture of two equal Gaussians
Using another message to make random decisions
What is the best distribution?
EM as coordinate descent in Free Energy