Readings / Notes: CSCC11 Introduction to Machine Learning

Front matter: Title page, table of contents, notation

Each chapter entry below lists the notes contents, followed by links and other readings.

1. Introduction to Machine Learning
   Contents: Overview of Machine Learning topics
   Readings: Machine Learning (Wikipedia); Loss Functions (Wikipedia); Linear Algebra Review (by Z. Kolter)

2. Linear Regression
   Contents: 1D regression, multidimensional regression, least-squares, pseudo-inverse (see the least-squares sketch after this list)
   Readings: Linear Regression (Wikipedia); Linear Algebra Review (by Z. Kolter); Common matrix identities (S. Roweis)

3. Nonlinear Regression
   Contents: Basis function regression, Radial Basis Functions, Neural networks, K-nearest neighbours
   Readings: RBFs (Wikipedia); ANNs (Wikipedia); KNN (Wikipedia)

4. Quadratics (background)
   Contents: Matrix-vector quadratic forms, gradients, optimization

5. Basic Probability and Statistics (background)
   Contents: Probability, conditioning, marginalization, density, mathematical expectation
   Readings: Cox axioms (Wikipedia); Binomial distribution (Wikipedia); Multinomial distribution (Wikipedia)

6. Probability Density Functions (background)
   Contents: PDFs, mean and covariance, Uniform distribution, (multi-dim.) Gaussian distribution
   Readings: PDFs (Wikipedia); Probability Review (by S. Teong)

7. Estimation
   Contents: Bayes' rule, Maximum likelihood (ML), Maximum a posteriori (MAP), Bayes' estimates
   Readings: Probabilistic LS (by A. Ng)

8. Information Theory
   Contents: Entropy, Mutual Information, KL Divergence, Cross-Entropy
   Readings: Information Theory (Wikipedia); Entropy (Wikipedia)

9. Classification Methods
   Contents: k-NN classifiers, Decision trees, Class conditional models, Naïve Bayes, Logistic regression
   Readings: Decision Trees (Wikipedia); Logistic Regression (Wikipedia); Naïve Bayes (Wikipedia)

10. Gradient Descent (background)
    Contents: Gradient descent, Line search
    Readings: Gradient descent (Wikipedia); Line Search (Wikipedia); Optimization (Wikipedia); GD with Momentum

11. Cross Validation
    Contents: N-fold cross-validation, LOOCV
    Readings: Cross-validation (Wikipedia)

12. Bayesian Methods
    Contents: Bayesian regression, Model averaging, Model selection
    Readings: Bayesian model selection demos (Tom Minka)

13. Monte Carlo Methods
    Contents: Sampling Gaussian and categorical distributions, Importance sampling, MCMC
    Readings: MCMC (Wikipedia); MCMC applet

14. Principal Component Analysis
    Contents: Dimensionality reduction, PCA, Probabilistic PCA
    Readings: PCA (Wikipedia); PCA Tutorial (by L. Smith)

15. Lagrange Multipliers (background)
    Contents: Equality constraints, Bounds constraints
    Readings: Lagrange Multipliers (Wikipedia)

16. Clustering
    Contents: K-means, Gaussian Mixture Models, Expectation-Maximization algorithm (see the K-means sketch after this list)
    Readings: K-means (Wikipedia); K-means++ (Wikipedia); Slides on Mixture Models and EM

17. Neural Networks I (optional)
    Contents: Multi-Layer Perceptron, Activation functions, Back-propagation
    Readings: MLPs (by R. Grosse); Back-Prop (by R. Grosse)
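
As a small illustration of the least-squares and pseudo-inverse material listed under Chapter 2, here is a minimal sketch using Python/NumPy. The synthetic data, seed, and variable names are illustrative assumptions, not taken from the course notes.

```python
# Minimal sketch: least-squares linear regression via the Moore-Penrose
# pseudo-inverse (Chapter 2). Data below is synthetic and illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1D data: y = 2x + 1 plus a little Gaussian noise.
x = rng.uniform(-1.0, 1.0, size=50)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(50)

# Design matrix with a constant (bias) column, so the model is y ~ X @ w.
X = np.column_stack([x, np.ones_like(x)])

# Least-squares weights: w = pinv(X) @ y minimizes ||X w - y||^2.
w = np.linalg.pinv(X) @ y
print("estimated [slope, intercept]:", w)
```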
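Similarly, for the K-means topic listed under Chapter 16, the following is a rough sketch of the alternating assign/update loop, again in Python/NumPy under illustrative assumptions (random initialization from data points, synthetic two-blob data); it is not the implementation from the course materials.

```python
# Minimal K-means sketch (Chapter 16): alternate between assigning points to
# the nearest centre and moving each centre to the mean of its assigned points.
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centres with k distinct data points chosen at random.
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: index of the nearest centre for each point.
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Update step: each centre becomes the mean of its cluster
        # (empty clusters keep their previous centre).
        new_centres = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centres[j] for j in range(k)])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return centres, labels

# Example: two well-separated Gaussian blobs in 2D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(5.0, 0.5, (50, 2))])
centres, labels = kmeans(X, k=2)
print("cluster centres:\n", centres)
```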