Speeding up the fitting
Fitting a mixture of Gaussians is one of the main
occupations of an intellectually shallow field called
data mining.
If we have huge amounts of data, speed is very
important. Some tricks are:
Initialize the Gaussians using k-means
This makes it easy to get trapped in a poor local optimum.
Initialize k-means using a subset of the datapoints so that
the means lie on the low-dimensional manifold.
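A minimal sketch of this initialization, using only the Python standard library. The names `kmeans_init`, `assign` and `sq_dist`, the spherical variances, and the variance floor are illustrative assumptions, not part of the original slide. The initial centres are drawn from the datapoints themselves, so they start on the data rather than in empty space.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centres):
    """Group points by the index of their nearest centre."""
    clusters = [[] for _ in centres]
    for p in points:
        j = min(range(len(centres)), key=lambda c: sq_dist(p, centres[c]))
        clusters[j].append(p)
    return clusters

def kmeans_init(points, k, iters=20, seed=0):
    """Return (weights, means, variances) to use as a starting point for EM."""
    rng = random.Random(seed)
    # Start from k actual datapoints, so each centre begins on the data itself.
    centres = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        for j, members in enumerate(assign(points, centres)):
            if members:  # an empty cluster keeps its previous centre
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    clusters = assign(points, centres)
    # Mixing proportions: fraction of the datapoints in each cluster.
    weights = [len(c) / len(points) for c in clusters]
    # Spherical variance per cluster (assumption), floored to avoid zero variance.
    dim = len(points[0])
    variances = [
        max(1e-6, sum(sq_dist(p, centres[j]) for p in c) / (len(c) * dim)) if c else 1.0
        for j, c in enumerate(clusters)
    ]
    return weights, centres, variances
```

The returned weights, means and variances would then be handed to EM as its starting mixture. Because k-means itself can settle in a poor solution, trying several seeds is a common precaution.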
Find the Gaussians near a datapoint more efficiently.
Use a KD-tree to quickly eliminate distant Gaussians from
consideration.
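As a rough illustration of the KD-tree idea, the sketch below builds a tree over the Gaussian means and finds the mean nearest to a datapoint, skipping whole subtrees that cannot contain anything closer. It assumes that "near" means Euclidean distance to the mean, ignoring covariances and mixing proportions. The names `build_kdtree` and `nearest_gaussian` are hypothetical.

```python
def build_kdtree(means, indices=None, depth=0):
    """Build a KD-tree over the Gaussian means; nodes store indices into `means`."""
    if indices is None:
        indices = list(range(len(means)))
    if not indices:
        return None
    axis = depth % len(means[0])  # cycle through the coordinate axes
    indices = sorted(indices, key=lambda i: means[i][axis])
    mid = len(indices) // 2
    return {
        "index": indices[mid],
        "axis": axis,
        "left": build_kdtree(means, indices[:mid], depth + 1),
        "right": build_kdtree(means, indices[mid + 1:], depth + 1),
    }

def nearest_gaussian(node, means, x, best=None):
    """Return (squared_distance, index) of the mean closest to datapoint x."""
    if node is None:
        return best
    i, axis = node["index"], node["axis"]
    d2 = sum((a - b) ** 2 for a, b in zip(x, means[i]))
    if best is None or d2 < best[0]:
        best = (d2, i)
    diff = x[axis] - means[i][axis]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest_gaussian(near, means, x, best)
    # Cross the splitting plane only if a closer mean could lie on the far side;
    # otherwise every Gaussian in that subtree is eliminated without being visited.
    if diff ** 2 < best[0]:
        best = nearest_gaussian(far, means, x, best)
    return best
```

For example, `nearest_gaussian(build_kdtree(means), means, x)` returns the squared distance and index of the closest mean. The tree is built once per set of means, so it needs rebuilding whenever the means move during fitting.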
Fit Gaussians greedily
Steal some mixing proportion from the already fitted
Gaussians and use it to fit poorly modeled datapoints better.
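One possible reading of the greedy step, sketched for one-dimensional data. The slide does not specify the procedure, so the details are assumptions: the worst-modeled fraction `frac` of the datapoints seeds a new Gaussian, and the existing mixing proportions are scaled down by the same amount to pay for it. The function names are hypothetical.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_density(x, weights, means, variances):
    """Density of the mixture at x."""
    return sum(w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances))

def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under the mixture."""
    return sum(math.log(mixture_density(x, weights, means, variances)) for x in data)

def add_gaussian_greedily(data, weights, means, variances, frac=0.25, min_var=1e-3):
    """Add one Gaussian fitted to the datapoints the current mixture models worst."""
    n_worst = max(1, int(frac * len(data)))
    # The poorly modeled datapoints are those with the lowest mixture density.
    worst = sorted(data, key=lambda x: mixture_density(x, weights, means, variances))[:n_worst]
    new_mean = sum(worst) / n_worst
    new_var = max(min_var, sum((x - new_mean) ** 2 for x in worst) / n_worst)
    # Steal mixing proportion alpha from the already fitted Gaussians.
    alpha = n_worst / len(data)
    new_weights = [w * (1.0 - alpha) for w in weights] + [alpha]
    return new_weights, means + [new_mean], variances + [new_var]

# Usage: one Gaussian at 0 models the points near 10 very poorly.
data = [-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0, 9.5, 10.0, 10.5]
before = ([1.0], [0.0], [1.0])
after = add_gaussian_greedily(data, *before)
```

In practice the enlarged mixture would probably be refined with a few EM iterations before adding the next Gaussian.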