CSC2535 Lecture 12

Learning Multiplicative Interactions

Two different meanings of “multiplicative”

The heavy-tailed world

But first: One final product of experts

Why products of HMMs should be better than ordinary HMMs

Inference in a PoHMM

How to reconstruct from a PoHMM

How well do PoHMMs work?

Back to multiplicative interactions

Learning how style and content interact

Slide 11

Some ways to use the bilinear model

Slide 13

Slide 14

Fitting the asymmetric model

Decomposing the matrix

Fitting the symmetric model

Higher order Boltzmann machines (Sejnowski, ~1986)

A higher-order Boltzmann machine with one visible group and two hidden groups

Using higher-order Boltzmann machines to model image transformations (Memisevic and Hinton, 2007)

Making the reconstruction easier

Roland’s unfactorized model

The main problem with 3-way interactions

Factoring three-way interactions

A picture of factor f

Another picture of factor f

Factoring the three-way interactions

The dynamics

Belief propagation

The learning

A nasty numerical problem

Roland’s experiments

A principle of hierarchical systems

Why hierarchical generative models require lateral interactions

Restricted Boltzmann Machines with multiplicative interactions

A picture of factor f

Factoring the three-way interactions

An advantage of modeling correlations between pixels rather than pixels

Keeping perceptual inference tractable

Why the hiddens remain conditionally independent

Where does the asymmetry in the independence relations of visibles and hiddens come from?

Summary of the learning procedure

Learning a factored Boltzmann Machine

Linear filters learned by the factors on MNIST digits

Linear-Linear blow-up

Three-way interactions between pixels

Factoring the three-way interactions between pixels

How to create the reconstructions for linear visible units

A very similar model

How to learn a topographic map

Slide 51

Two models with similar energy functions

THE END