| Chapter | Notes | Contents | Links and Other Readings |
| 1. | Introduction to Machine Learning | Overview of Machine Learning topics | Machine Learning (Wikipedia), Linear Algebra Review (by Z. Kolter) |
| 2. | Linear Regression | 1D regression, multidimensional regression, least-squares, pseudo-inverse | Linear Regression (Wikipedia) |
| 3. | Nonlinear Regression | Basis function regression, Radial Basis Functions, Neural networks, K-nearest neighbours | RBFs (Wikipedia), ANNs (Wikipedia), KNN (Wikipedia) |
| 4. | Quadratics (background) | Matrix-vector quadratic forms, gradients, optimization | Linear Algebra Review (by Z. Kolter) |
| 5. | Basic Probability and Statistics (background) | Probability, conditioning, marginalization, density, mathematical expectation | Cox axioms (Wikipedia), Binomial distribution (Wikipedia), Multinomial distribution (Wikipedia) |
| 6. | Probability Density Functions (background) | PDFs, Mean and covariance, Uniform distribution, (multi-dim.) Gaussian distribution | PDFs (Wikipedia), Probability Review (by S. Teong) |
| 7. | Estimation | Bayes' rule, Maximum likelihood, Maximum a Posteriori | Probabilistic LS (by A. Ng) |
| 8. | Introduction to Classification | Class conditional models, Logistic regression, Neural Network Classifiers, Naïve Bayes | Logistic Regression (Wikipedia), Naïve Bayes (Wikipedia) |
| 9. | Gradient Descent (background) | Gradient Descent, Line Search | Gradient descent (Wikipedia), Line Search (Wikipedia), Optimization (Wikipedia) |
| 10. | Cross Validation | Hold-out Validation, N-Fold Cross Validation | Cross-validation (Wikipedia) |
| 11. | Bayesian Methods | Bayesian Regression, Model Averaging, Model Selection | Bayesian model selection demos (Tom Minka) |
| 12. | Monte Carlo Methods (optional) | Sampling Gaussians, Importance Sampling, MCMC, Metropolis-Hastings | MCMC (Wikipedia), MCMC applet |
| 13. | Principal Component Analysis | Dimensionality Reduction, PCA, Probabilistic PCA (optional), Whitening (optional) | PCA (Wikipedia), Introductory PCA Tutorial (by L. Smith) |
| 14. | Lagrange Multipliers (background) | Equality constraints, Bounds constraints | Lagrange Multipliers (Wikipedia) |
| 15. | Clustering | K-means, Mixtures of Gaussians, Expectation-Maximization Algorithm | K-means (Wikipedia), Slides on Mixture Models and EM, Notes on BIC |
| 16. | Hidden Markov Models (optional) | Markov chains, Viterbi, Forward-Backward, Baum-Welch (EM) | HMMs (Wikipedia) |
| 17. | Support Vector Machines | Maximum margin, Loss functions, Kernels | SVMs (Wikipedia) |
| 18. | AdaBoost | Boosting, Ensemble Methods | AdaBoost (Wikipedia) |