**CSC2515 Fall 2007: Introduction to Machine Learning**
**Lecture 3: Linear Classification Methods**

**What is “linear” classification?**

**Representing the target values for classification**

**Three approaches to classification**

**Reminder: Three different spaces that are easy to confuse**

**Discriminant functions for N>2 classes**

**Problems with multi-class discriminant functions**

**Using “least squares” for classification**
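The idea behind this slide can be sketched in a few lines of code (my own illustration, not the lecture's): encode each class as a one-hot target vector, fit a linear map to those targets by ordinary least squares, and classify a new input by taking the argmax of the fitted outputs.

```python
import numpy as np

def fit_least_squares_classifier(X, labels, n_classes):
    """Fit W for the linear model T ~ X_aug @ W with one-hot targets T."""
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a bias column
    T = np.eye(n_classes)[labels]                     # one-hot encoding
    W, *_ = np.linalg.lstsq(X_aug, T, rcond=None)
    return W

def predict(W, X):
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.argmax(X_aug @ W, axis=1)               # largest output wins

# Two well-separated 2-D blobs as toy data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(3, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
W = fit_least_squares_classifier(X, y, 2)
print((predict(W, X) == y).mean())  # training accuracy
```

This works here because the classes are compact and balanced; the following slides discuss why it fails more generally.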

**Problems with using least squares for classification**

**Another example where least squares regression gives poor decision surfaces**

**A picture showing the advantage of Fisher’s linear discriminant**

**Math of Fisher’s linear discriminants**

**More math of Fisher’s linear discriminants**
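The core result of Fisher's analysis is that the best projection direction is w ∝ S_W⁻¹(m₂ − m₁), where S_W is the within-class scatter matrix and m₁, m₂ are the class means. A minimal sketch of that computation (standard formulas, not code from the slides):

```python
import numpy as np

def fisher_direction(X1, X2):
    """Return the unit Fisher direction w ~ S_W^{-1} (m2 - m1)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter: summed outer products of centered samples.
    S_w = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    w = np.linalg.solve(S_w, m2 - m1)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(1)
# Two classes elongated along y, separated along x: the naive direction
# (m2 - m1) already points along x, and S_W^{-1} keeps it there by
# discounting the large within-class spread along y.
base = rng.normal(size=(100, 2)) * np.array([0.3, 2.0])
X1, X2 = base, base + np.array([2.0, 0.0])
w = fisher_direction(X1, X2)
print(w)  # close to [1, 0]
```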

**The perceptron convergence procedure**
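The procedure named in this slide is the classic perceptron learning rule: cycle through the training cases and, on each mistake, add the input vector to the weights for a missed positive case or subtract it for a missed negative case. A small sketch (the algorithm is standard; the toy data below is my own):

```python
import numpy as np

def train_perceptron(X, y, max_epochs=100):
    """Perceptron convergence procedure; y must be in {-1, +1}."""
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias feature
    w = np.zeros(X_aug.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X_aug, y):
            if y_i * (w @ x_i) <= 0:   # wrong side of (or on) the boundary
                w += y_i * x_i         # nudge the boundary toward this case
                mistakes += 1
        if mistakes == 0:              # a full clean pass: converged
            break
    return w

# Linearly separable toy problem (roughly: class = sign of x0 + x1 - 1).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 2]], dtype=float)
y = np.array([-1, -1, -1, 1, 1])
w = train_perceptron(X, y)
print(np.sign(X @ w[:2] + w[2]))  # matches y after convergence
```

If the data are linearly separable, this is guaranteed to stop after finitely many updates, which is what the convergence proofs on the next slides establish.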

**A natural way to try to prove convergence**

**A better way to prove the convergence (using the convexity of the solutions in weight-space)**

**Why the learning procedure works**

**Why connectedness is hard to compute**

**Distinguishing T from C in any orientation and position**

**Logistic regression (jump to page 205)**

**The natural error function for the logistic**

**Using the chain rule to get the error derivatives**
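The punchline of this derivation is standard: for a logistic output y = σ(z) with cross-entropy error E = −[t ln y + (1 − t) ln(1 − y)], the chain rule collapses to the simple form dE/dz = y − t. A quick numerical check of that identity (my own illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(z, t):
    """E = -(t ln y + (1-t) ln(1-y)) with y = sigmoid(z)."""
    y = sigmoid(z)
    return -(t * math.log(y) + (1 - t) * math.log(1 - y))

z, t, eps = 0.7, 1.0, 1e-6
# Central finite difference of E with respect to the logit z.
numeric = (cross_entropy(z + eps, t) - cross_entropy(z - eps, t)) / (2 * eps)
analytic = sigmoid(z) - t          # the chain-rule result dE/dz = y - t
print(abs(numeric - analytic))     # agreement to numerical precision
```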

**The cross-entropy or “softmax” error function for multi-class classification**
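The multi-class analogue has the same pleasant property: with softmax outputs and cross-entropy error, the gradient with respect to the logits is again y − t. A sketch verifying this numerically (standard formulas, not copied from the slides):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())        # subtract the max for numerical stability
    return e / e.sum()

def cross_entropy(z, t):
    """E = -sum_k t_k ln y_k with y = softmax(z) and one-hot t."""
    return -np.sum(t * np.log(softmax(z)))

z = np.array([2.0, 1.0, 0.1])
t = np.array([1.0, 0.0, 0.0])      # one-hot target for class 0

# Finite-difference check that dE/dz = softmax(z) - t.
eps = 1e-6
grad = np.array([(cross_entropy(z + eps * np.eye(3)[i], t)
                  - cross_entropy(z - eps * np.eye(3)[i], t)) / (2 * eps)
                 for i in range(3)])
print(np.abs(grad - (softmax(z) - t)).max())  # near zero
```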

**A special case of softmax for two classes**

**Probabilistic Generative Models for Discrimination**

**A simple example for continuous inputs**
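The standard example for continuous inputs is two Gaussian class-conditional densities sharing one covariance matrix: the quadratic terms then cancel, and the posterior p(C₁|x) is a logistic function of a linear score. A small sketch of that result (the means, covariance, and prior below are made-up toy values):

```python
import numpy as np

# Two Gaussian class-conditionals with a SHARED covariance matrix.
mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 0.0])
Sigma = np.eye(2)                  # shared covariance -> linear boundary
prior1 = 0.5                       # p(C1)

# Linear score derived from the log-odds: w^T x + b.
Sigma_inv = np.linalg.inv(Sigma)
w = Sigma_inv @ (mu1 - mu0)
b = (-0.5 * mu1 @ Sigma_inv @ mu1 + 0.5 * mu0 @ Sigma_inv @ mu0
     + np.log(prior1 / (1 - prior1)))

def posterior_class1(x):
    """p(C1 | x) = sigmoid(w^T x + b)."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

print(posterior_class1(np.array([1.0, 0.0])))  # midpoint between means -> 0.5
```

With equal priors the posterior is exactly 0.5 at the midpoint of the two means, and the decision boundary is the hyperplane w·x + b = 0.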

**A picture of the two Gaussian models and the resulting posterior for the red class**

**A way of thinking about the role of the inverse covariance matrix**

**The posterior when the covariance matrices are different for different classes**

**Two ways to train a set of class-specific generative models**

**An example where the two types of training behave very differently**