 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
• |
Fitting a
mixture of Gaussians is one of the main
|
|
|
occupations of
an intellectually shallow field called data-
|
|
mining.
|
|
|
• |
If we have huge
amounts of data, speed is very
|
|
|
important. Some
tricks are:
|
|
|
|
– |
Initialize
the Gaussians using k-means
|
|
|
|
• |
Makes
it easy to get trapped.
|
|
|
|
• |
Initialize
K-means using a subset of the datapoints so that
|
|
|
the
means lie on the low-dimensional manifold.
|
|
|
|
– |
Find
the Gaussians near a datapoint more efficiently.
|
|
|
|
• |
Use a
KD-tree to quickly eliminate distant Gaussians from
|
|
|
consideration.
|
|
|
|
– |
Fit
Gaussians greedily
|
|
|
|
• |
Steal
some mixing proportion from the already fitted
|
|
|
Gaussians
and use it to fit poorly modeled datapoints better.
|
|