 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
Suppose that we want to cluster data in
a way that
|
|
|
guarantees that
we still have a good model even if an
|
|
|
adversary
removes one of the cluster centers from our
|
|
|
model.
|
|
|
E-step: find the
two cluster centers that are closest to
|
|
|
each data point.
Each of these cluster centers is given a
|
|
responsibility of
0.5 for that datapoint.
|
|
|
M-step:
Re-estimate each cluster center to be the mean
|
|
|
of the datapoints
it is responsible for.
|
|
|
|
Proof that it
converges:
|
|
|
|
The
E-step optimizes F subject to the constraint that
|
|
|
the
distribution contains 0.5 in two places.
|
|
|
|
The
M-step optimizes F with the distribution fixed
|
|