CIAR Second Summer School Tutorial
Lecture 1a

Sigmoid Belief Nets and Boltzmann Machines

A very old idea about how to build a perceptual system

Good old-fashioned neural networks

What is wrong with back-propagation?

Overcoming the limitations of back-propagation

The building blocks: Binary stochastic neurons
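
A minimal sketch of a binary stochastic neuron, assuming the usual logistic form in which the unit turns on with probability sigmoid(bias + weighted sum of its inputs). The function names `prob_on` and `sample_state` and the example weights are illustrative assumptions, not taken from the slides.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real-valued total input to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def prob_on(bias, inputs, weights):
    """Probability that the unit adopts state 1 given its binary inputs."""
    total_input = bias + sum(s * w for s, w in zip(inputs, weights))
    return sigmoid(total_input)


def sample_state(bias, inputs, weights, rng):
    """Draw the unit's binary state (0 or 1) from its on-probability."""
    return 1 if rng.random() < prob_on(bias, inputs, weights) else 0


rng = random.Random(0)  # seeded only so the sketch is repeatable
# Illustrative inputs: the active inputs contribute 2.0 and -2.0, which cancel.
p = prob_on(0.0, [1, 0, 1], [2.0, -1.0, -2.0])
state = sample_state(0.0, [1, 0, 1], [2.0, -1.0, -2.0], rng)
```

With a total input of zero the unit is equally likely to be on or off; a large positive total input makes it behave almost deterministically.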

Bayes Nets: Directed Acyclic Graphical models

Ways to define the conditional probabilities

What is easy and what is hard in a DAG?

Explaining away

The learning rule for sigmoid belief nets

The derivatives of the log prob

Sampling from the posterior distribution

Gibbs sampling

The recipe for Gibbs sampling

Computing the posterior for i given the rest

Terms in the global energy

Approximate inference

The Free Energy

A trade-off between how well the model fits the data and the tractability of inference

The wake-sleep algorithm

What the wake phase achieves

The flaws in the wake-sleep algorithm

Mode averaging

Summary

How a Boltzmann Machine models data

The Energy of a joint configuration

Using energies to define probabilities

An example of how weights define a distribution
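
One way to make this concrete is to enumerate every joint configuration of a tiny network and convert energies to probabilities. The sketch below assumes the standard convention E(s) = -sum_i b_i s_i - sum_{i<j} s_i s_j w_ij and p(s) proportional to exp(-E(s)); the three-unit network, its weights, and the function names are hypothetical and chosen only for illustration.

```python
import itertools
import math


def energy(state, biases, weights):
    """Energy of one joint configuration under the assumed convention."""
    e = -sum(b * s for b, s in zip(biases, state))
    for (i, j), w in weights.items():
        e -= w * state[i] * state[j]
    return e


def distribution(biases, weights):
    """Exact probabilities over all binary states (feasible only for tiny nets)."""
    n = len(biases)
    states = list(itertools.product([0, 1], repeat=n))
    unnormalized = [math.exp(-energy(s, biases, weights)) for s in states]
    partition = sum(unnormalized)  # the partition function Z
    return {s: u / partition for s, u in zip(states, unnormalized)}


# Hypothetical network: units 0 and 1 support each other, units 1 and 2 compete.
biases = [0.0, 0.0, 0.0]
weights = {(0, 1): 2.0, (1, 2): -1.0}
dist = distribution(biases, weights)
most_probable = max(dist, key=dist.get)
```

Exhaustive enumeration grows as 2^n, which is why larger networks need sampling rather than an exact sum over configurations.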

Getting a sample from the model

Thermal equilibrium

Getting a sample from the posterior distribution over distributed representations for a given data vector

The goal of learning

Why the learning could be difficult

A very surprising fact

The batch learning algorithm

Why is the derivative so simple?

Why do we need the negative phase?

Comparison of sigmoid belief nets and Boltzmann machines

Two types of density model with hidden units