CSC 2535: Advanced Machine Learning

Lecture 5
Energy-Based Models

Two types of density model

Using energies to define probabilities

How to combine simple density models

A picture of the two combination methods

Products of Experts and energies

How sharp are products of experts?

“Uni-gauss” experts

Combining energy dimples

Generating from a product of experts

Relationship to causal generative models

Learning a Product of Experts

Ways to deal with the intractable sum

The Markov chain for uni-gauss experts

A shortcut

Good and bad properties of the shortcut

Contrastive divergence

15 axis-aligned uni-gauss experts fitted to 24 clusters (one cluster is missing from the grid)

Fantasies from the model (it fills in the missing cluster)

Energy-Based Models with deterministic hidden units

Reminder: Maximum likelihood learning is hard

Hybrid Monte Carlo

Simulating the dynamics

A numerical problem

The leapfrog method for keeping numerical errors small

Combining the last move of one interval with the first move of the next interval

Dealing with the remaining numerical error

Backpropagation can compute the gradient that Hybrid Monte Carlo needs

The online HMC learning procedure

The shortcut

A simple 2-D dataset

The network for the 4 squares task

A different kind of hidden structure

Frequently Approximately Satisfied constraints

Learning the constraints on an arm

Superimposing constraints

Dealing with missing inputs

Learning constraints from natural images (Yee-Whye Teh)

How to learn a topographic map

Faster mixing chains

Pros and cons of Gibbs sampling

Independent Components Analysis

The energy-based view of ICA

Independence relationships of hidden variables in three types of model that have one hidden layer

Over-complete ICA using a causal model

Over-complete ICA using an energy-based model