CSC2535 Lecture 4

Boltzmann Machines, Sigmoid Belief Nets and Gibbs sampling

Another computational role for Hopfield nets

An example: Interpreting a line drawing

Noisy networks find better energy minima

Stochastic units

How a Boltzmann Machine models data

The Energy of a joint configuration

Using energies to define probabilities

An example of how weights define a distribution

Getting a sample from the model

Thermal equilibrium

An analogy

Detailed Balance

Getting a sample from the posterior distribution over distributed representations for a given data vector

The goal of learning

Why the learning could be difficult

A very surprising fact

The batch learning algorithm

Why is the derivative so simple?

Why do we need the negative phase?

Bayes Nets: Directed Acyclic Graphical models

Ways to define the conditional probabilities

What is easy and what is hard in a DAG?

Explaining away

The learning rule for sigmoid belief nets

The derivatives of the log prob

Sampling from the posterior distribution

Gibbs sampling

The recipe for Gibbs sampling

Computing the posterior for i given the rest

Terms in the global energy

Ways to combine Gibbs sampling with learning

A clever trick

Comparison of sigmoid belief nets and Boltzmann machines

Two types of density model with hidden units