Clustering
We assume that the data were generated from a
number of different classes. The aim is to group
datapoints from the same class into the same cluster.
How do we decide the number of classes?
Why not put each datapoint into a separate
class?
What is the objective function that is optimized
by sensible clusterings?
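
A small sketch can make these questions concrete. It assumes one common
choice of objective, the within-cluster sum of squared distances to the
cluster mean (the slide does not name an objective, so this is only an
illustration). The function name within_cluster_sse and the toy 1-D data
are hypothetical, chosen for this example.

```python
def within_cluster_sse(points, labels):
    """Sum of squared distances from each point to the mean of its cluster.

    Assumed objective for illustration only; the slide does not fix one.
    """
    # Group the points by their cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    # Add up the squared deviations from each cluster's mean.
    total = 0.0
    for members in clusters.values():
        mean = sum(members) / len(members)
        total += sum((p - mean) ** 2 for p in members)
    return total


# Toy 1-D data with two visible groups (made up for this sketch).
points = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]

one_cluster = within_cluster_sse(points, [0, 0, 0, 0, 0, 0])
two_clusters = within_cluster_sse(points, [0, 0, 0, 1, 1, 1])
singletons = within_cluster_sse(points, [0, 1, 2, 3, 4, 5])

print(one_cluster, two_clusters, singletons)
```

Under this objective, putting each datapoint into its own class gives a
cost of zero, the best possible value. So the objective alone cannot
decide the number of classes; something extra, such as a fixed number of
classes or a penalty on it, is needed.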