CSC2515 Fall 2007
Introduction to Machine Learning
Lecture 2: Linear regression
Some types of basis function in 1-D
Two types of linear model that are equivalent with respect to learning
A geometrical view of the solution
When is minimizing the squared error equivalent to Maximum Likelihood Learning?
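One way to spell out the answer, as a sketch in notation assumed here rather than taken from the slides: suppose each target is the model output plus zero-mean Gaussian noise with a fixed variance, so that t_n = y(x_n, w) + e_n with e_n drawn from N(0, sigma^2). The negative log likelihood is then the sum of squared errors scaled by a constant, plus terms that do not depend on w.

```latex
% Assumed model: t_n = y(\mathbf{x}_n, \mathbf{w}) + \epsilon_n,
% with \epsilon_n \sim \mathcal{N}(0, \sigma^2) and the same \sigma for every case.
\begin{align*}
-\log p(t_n \mid \mathbf{x}_n, \mathbf{w})
  &= \frac{\bigl(t_n - y(\mathbf{x}_n, \mathbf{w})\bigr)^2}{2\sigma^2}
     + \log \sigma + \tfrac{1}{2}\log 2\pi \\
-\sum_{n=1}^{N} \log p(t_n \mid \mathbf{x}_n, \mathbf{w})
  &= \frac{1}{2\sigma^2} \sum_{n=1}^{N} \bigl(t_n - y(\mathbf{x}_n, \mathbf{w})\bigr)^2
     + N \log \sigma + \frac{N}{2}\log 2\pi
\end{align*}
```

Under these assumptions, the w that minimizes the summed squared error is also the w that maximizes the likelihood, because the remaining terms are constant in w.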
Least mean squares: An alternative approach for really big datasets
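A minimal sketch of the least-mean-squares idea, assuming a one-dimensional model t ≈ w0 + w1 * x: the weights are nudged after every training case instead of solving for them in one pass over the whole dataset. The function name `lms_fit`, the learning rate, the epoch count, and the noise-free toy data are all illustrative assumptions, not values from the lecture.

```python
import random


def lms_fit(xs, ts, learning_rate=0.5, epochs=500, seed=0):
    """Online least-mean-squares fit of t ≈ w0 + w1 * x (illustrative sketch)."""
    rng = random.Random(seed)
    w0, w1 = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        # Visit the training cases in a fresh random order each epoch.
        rng.shuffle(order)
        for i in order:
            # Prediction error on this single training case.
            error = ts[i] - (w0 + w1 * xs[i])
            # Step each weight along the negative gradient of the
            # squared error for this case only.
            w0 += learning_rate * error
            w1 += learning_rate * error * xs[i]
    return w0, w1


# Hypothetical noise-free data generated from t = 1 + 2x.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ts = [1.0 + 2.0 * x for x in xs]
w0, w1 = lms_fit(xs, ts)
print(w0, w1)
```

Each update touches one training case, so memory use does not grow with the size of the dataset. With noisy targets the weights would keep jittering around the least-squares solution unless the learning rate is reduced over time.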
A picture of the effect of the regularizer
A problem with the regularizer
The lasso: penalizing the absolute values of the weights
A geometrical view of the lasso compared with a penalty on the squared weights
An example where minimizing the squared error gives terrible estimates
One dimensional cross-sections of loss functions with different powers
The bias-variance trade-off
(a figment of the frequentists' lack of imagination?)
The bias-variance decomposition
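For reference, a standard form of the decomposition, written in assumed notation: y(x; D) is the prediction at x of a model fitted to a training set D, h(x) is the true regression function, and the expectation is taken over training sets of a given size.

```latex
% Assumed notation: y(x; D) is the fitted prediction, h(x) the true
% regression function, \mathbb{E}_D the average over training sets D.
\begin{align*}
\mathbb{E}_D\bigl[(y(x; D) - h(x))^2\bigr]
  &= \underbrace{\bigl(\mathbb{E}_D[y(x; D)] - h(x)\bigr)^2}_{\text{bias}^2}
   + \underbrace{\mathbb{E}_D\bigl[(y(x; D) - \mathbb{E}_D[y(x; D)])^2\bigr]}_{\text{variance}}
\end{align*}
```

The identity follows from adding and subtracting the average prediction inside the square; the cross term vanishes when the expectation over D is taken. When the loss is measured against noisy targets rather than h(x), the noise variance enters as a third term that no choice of model can reduce.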
How the regularization parameter affects the bias and variance terms
An example of the bias-variance trade-off
Beating the bias-variance trade-off
Using the posterior distribution
A way to see the covariance of the predictions for different values of x