Using “least squares” for classification
This is not the right thing to do and it doesn’t work as well as better methods, but it is easy:
• It reduces classification to least squares regression.
• We already know how to do regression: we can just solve for the optimal weights with some matrix algebra (see lecture 2, and the first sketch after this list).
• We use targets that are equal to the conditional probability of the class given the input.
• When there are more than two classes, we treat each class as a separate regression problem (we can get away with this if we use the “max” decision function; a multi-class sketch follows the binary one below).