lec1a

Clustering

• We assume that the data was generated from a number of different classes. The aim is to cluster data from the same class together.

– How do we decide the number of classes?

• Why not put each datapoint into a separate class?

• What is the payoff for clustering things together?

– Clustering is not a very powerful way to model data, especially if each data-vector can be classified in many different ways. A one-out-of-N classification is not nearly as informative as a feature vector.

• We will see how to learn feature vectors later.
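
The claim that a one-out-of-N classification is less informative than a feature vector can be checked with a small counting argument. The sketch below is illustrative only: the colour and shape attributes and all function names are hypothetical, and the features are assumed to be binary and independent, which gives an upper bound rather than a measurement on real data.

```python
import math
from itertools import product

# Hypothetical attributes, chosen only for illustration.
colours = ["red", "green", "blue"]
shapes = ["circle", "square"]

# One-out-of-N: a single label has to name one combination of attributes,
# so it needs one class per (colour, shape) pair.
classes = list(product(colours, shapes))
n_classes = len(classes)  # 3 colours * 2 shapes


def one_of_n_bits(n):
    """Upper bound on the information in one label chosen from n classes."""
    return math.log2(n)


def binary_feature_bits(k):
    """Upper bound on the information in k independent binary features."""
    return float(k)


def classes_needed(k):
    """Classes a one-out-of-N code needs to distinguish every setting
    of k binary features."""
    return 2 ** k


print(f"{n_classes} classes carry at most {one_of_n_bits(n_classes):.2f} bits")
for k in (3, 10, 20):
    print(
        f"{k} binary features: up to {binary_feature_bits(k):.0f} bits; "
        f"an equivalent one-out-of-N code needs {classes_needed(k)} classes"
    )
```

Under these assumptions the information in a feature vector grows linearly with the number of features, while a one-out-of-N label grows only logarithmically with the number of classes. This is one way to read the slide's point about data that can be classified in many different ways.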