Chapter |
Notes |
Contents |
Links and Other Readings |
1. |
Introduction to Machine Learning |
Overview of Machine Learning topics
|
Machine Learning (Wikipedia)
Loss Functions (Wikipedia)
Linear Algebra Review (by Z. Kolter)
|
2. |
Linear Regression |
1D regression, multidimensional regression,
least-squares, pseudo-inverse
|
Linear Regression (Wikipedia)
Linear Algebra Review (by Z. Kolter)
Common matrix identities (S. Roweis)
|
3. |
Nonlinear Regression |
Basis function regression, Radial Basis Functions, Neural networks, K-nearest neighbours
|
RBFs (Wikipedia)
ANNs (Wikipedia)
KNN (Wikipedia)
|
4. |
Quadratics (background) |
Matrix-vector quadratic forms, gradients, optimization
|
|
5. |
Basic Probability and Statistics (background) |
Probability, conditioning, marginalization, density, mathematical expectation
|
Cox axioms (Wikipedia)
Binomial distribution (Wikipedia)
Multinomial distribution (Wikipedia)
|
6. |
Probability Density Functions (background) |
PDFs, Mean and covariance, Uniform distribution, (multi-dim.) Gaussian distribution
|
PDFs (Wikipedia)
Probability Review (by S. Teong)
|
7. |
Estimation |
Bayes' rule, Maximum likelihood (ML), Maximum a Posteriori (MAP), Bayes' estimates
|
Probabilistic LS (by A. Ng)
|
8. |
Information Theory |
Entropy, Mutual Information, KL Divergence, Cross-Entropy
|
Information Theory (Wikipedia)
Entropy (Wikipedia)
|
9. |
Classification Methods |
k-NN classifiers, Decision trees, Class conditional models,
Naïve Bayes, Logistic regression
|
Decision Trees (Wikipedia)
Logistic Regression (Wikipedia)
Naïve Bayes (Wikipedia)
|
10. |
Gradient Descent (background) |
Gradient Descent, Line Search
|
Gradient descent (Wikipedia)
Line Search (Wikipedia)
Optimization (Wikipedia)
GD with Momentum
|
11. |
Cross Validation |
N-Fold Cross-Validation, LOOCV
|
Cross-validation (Wikipedia)
|
12. |
Bayesian Methods |
Bayesian Regression, Model Averaging, Model Selection
|
Bayesian model selection demos (Tom Minka)
|
13. |
Monte Carlo Methods |
Sampling Gaussians and Categorical Distributions, Importance Sampling, MCMC
|
MCMC (Wikipedia)
MCMC applet
|
14. |
Principal Component Analysis |
Dimensionality Reduction, PCA, Probabilistic PCA
|
PCA (Wikipedia)
PCA Tutorial (by L. Smith)
|
15. |
Lagrange Multipliers (background) |
Equality constraints, Bounds constraints
|
Lagrange Multipliers (Wikipedia)
|
16. |
Clustering |
K-means, Gaussian Mixture Models, Expectation-Maximization Algorithm
|
K-means (Wikipedia)
K-means++ (Wikipedia)
Slides on Mixture Models and EM
|
17. |
Neural Networks I (optional) |
Multi-Layer Perceptron, Activation Functions, Back-Propagation
|
MLPs (by R. Grosse)
Back-Prop (by R. Grosse)
|