1a - Why do we need machine learning

1b - What are neural networks

1c - Some simple models of neurons

1d - A simple example of learning

1e - Three types of learning


2a - An overview of the main types of network architecture

2b - Perceptrons

2c - A geometrical view of perceptrons

2d - Why the learning works

2e - What perceptrons cannot do


3a - Learning the weights of a linear neuron

3b - The error surface for a linear neuron

3c - Learning the weights of a logistic output neuron

3d - The backpropagation algorithm

3e - How to use the derivatives computed by the backpropagation algorithm


4a - Learning to predict the next word

4b - A brief diversion into cognitive science

4c - Another diversion: The softmax output function

4d - Neuro-probabilistic language models

4e - Ways to deal with a large number of possible outputs


5a - Why object recognition is difficult

5b - Ways to achieve viewpoint invariance

5c - Convolutional neural networks for hand-written digit recognition

5d - Convolutional neural networks for object recognition


6a - Overview of mini-batch gradient descent

6b - A bag of tricks for mini-batch gradient descent

6c - The momentum method

6d - A separate, adaptive learning rate for each connection

6e - rmsprop: Divide the gradient by a running average of its recent magnitude


7a - Modeling sequences: A brief overview

7b - Training RNNs with backpropagation

7c - A toy example of training an RNN

7d - Why it is difficult to train an RNN

7e - Long short-term memory


8a - A brief overview of Hessian-free optimization

8b - Modeling character strings with multiplicative connections

8c - Learning to predict the next character using HF

8d - Echo state networks


9a - Overview of ways to improve generalization

9b - Limiting the size of the weights

9c - Using noise as a regularizer

9d - Introduction to the Bayesian approach

9e - The Bayesian interpretation of weight decay

9f - MacKay's quick and dirty method of fixing weight costs


10a - Why it helps to combine models

10b - Mixtures of experts

10c - The idea of full Bayesian learning

10d - Making full Bayesian learning practical

10e - Dropout: an efficient way to combine neural nets


11a - Hopfield nets

11b - Dealing with spurious minima in Hopfield nets

11c - Hopfield nets with hidden units

11d - Using stochastic units to improve search

11e - How a Boltzmann machine models data


12a - The Boltzmann machine learning algorithm

12b - More efficient ways to get the statistics

12c - Restricted Boltzmann machines

12d - An example of contrastive divergence learning

12e - RBMs for collaborative filtering


13a - The ups and downs of backpropagation

13b - Belief nets

13d - The wake-sleep algorithm


14a - Learning layers of features by stacking RBMs

14b - Discriminative fine-tuning for DBNs

14c - What happens during discriminative fine-tuning

14d - Modeling real-valued data with an RBM

14e - RBMs are infinite sigmoid belief nets


15a - From principal components analysis to autoencoders

15b - Deep autoencoders

15c - Deep autoencoders for document retrieval and visualization

15d - Semantic hashing

15e - Learning binary codes for image retrieval

15f - Shallow autoencoders for pre-training


16a - Learning a joint model of images and captions

16b - Hierarchical coordinate frames

16c - Bayesian optimization of neural network hyperparameters