lec14

Clustering

• We assume that the data was generated from a number of different classes. The aim is to cluster data from the same class together.
    – How do we decide the number of classes?
        • Why not put each datapoint into a separate class?
        • What is the payoff for clustering things together?
    – What if the classes are hierarchical?
    – What if each datavector can be classified in many different ways? A one-out-of-N classification is not nearly as informative as a feature vector.
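
The question of why we should not put each datapoint into a separate class can be made concrete with a small sketch. It assumes a squared-distance cost (the sum of squared distances from each point to the mean of its cluster), which is one common choice and is not specified in these notes. The toy data and the helper name `within_cluster_cost` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def within_cluster_cost(points, labels):
    """Sum of squared distances from each point to the mean of its cluster.

    Assumption: squared Euclidean distance is the measure of fit.
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    cost = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Mean of the cluster, computed one coordinate at a time.
        mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        cost += sum((p[d] - mean[d]) ** 2 for p in members for d in range(dim))
    return cost


# Hypothetical toy data: two well-separated groups of three 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

one_cluster = [0] * len(points)            # everything in a single class
two_clusters = [0, 0, 0, 1, 1, 1]          # the two visible groups
all_separate = list(range(len(points)))    # one class per datapoint

cost_one = within_cluster_cost(points, one_cluster)
cost_two = within_cluster_cost(points, two_clusters)
cost_all = within_cluster_cost(points, all_separate)

print("one cluster: ", cost_one)
print("two clusters:", cost_two)
print("all separate:", cost_all)
```

Under this cost, giving every datapoint its own class always reaches zero, so fit alone cannot choose the number of classes. Some payoff for grouping things together (for example, a penalty per class) would have to be added before the comparison means anything.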