lec1a

The indecisive means algorithm

Suppose that we want to cluster data in a way that guarantees that we still have a good model even if an adversary removes one of the cluster centers from our model.

• E-step: For each data point, find the two cluster centers that are closest to it. Each of these two centers is given a responsibility of 0.5 for that data point.

• M-step: Re-estimate each cluster center to be the mean of the data points it is responsible for.

• “Proof” that it converges:

– The E-step optimizes F subject to the constraint that the distribution contains 0.5 in two places.

– The M-step optimizes F with the distribution fixed.
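
The two steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not a reference implementation: the function name `indecisive_means`, the initialization from randomly chosen data points, the fixed iteration count, and the handling of centers that receive no points are all assumptions. The tracked cost (the responsibility-weighted sum of squared distances) is an assumed stand-in for F up to sign and constants, since F is not defined in this section.

```python
import numpy as np

def indecisive_means(X, k, n_iters=20, seed=0):
    """Sketch of the two-nearest-centers procedure (hypothetical helper)."""
    if k < 2:
        raise ValueError("need at least two cluster centers")
    rng = np.random.default_rng(seed)
    # Assumption: start the centers at k distinct, randomly chosen data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    costs = []
    for _ in range(n_iters):
        # E-step: squared distance from every point to every center, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Indices of the two closest centers for each data point.
        nearest_two = np.argsort(d2, axis=1)[:, :2]
        # Responsibility 0.5 for each of those two centers, 0 elsewhere.
        R = np.zeros((len(X), k))
        np.put_along_axis(R, nearest_two, 0.5, axis=1)
        # Assumed cost: responsibility-weighted sum of squared distances.
        costs.append(float((R * d2).sum()))
        # M-step: each center becomes the mean of the points it is responsible for.
        weights = R.sum(axis=0)
        nonempty = weights > 0  # assumption: leave unused centers where they are
        centers[nonempty] = (R.T @ X)[nonempty] / weights[nonempty, None]
    return centers, R, costs
```

Under these assumptions the recorded costs should never increase from one iteration to the next, which mirrors the convergence argument: the E-step picks the best responsibilities that satisfy the "0.5 in two places" constraint for the current centers, and the M-step picks the best centers for those fixed responsibilities.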