CSC2535: Advanced Machine Learning
Lecture 11
Learning by maximizing agreement between outputs
The aims of unsupervised learning
Temporally invariant properties
Some obvious measures of agreement
A new way to get a teaching signal
Some advantages of mutual information
Simple forms for the relationship
Spatially invariant properties
Maximizing mutual information between a local region and a larger context
But what about discontinuities?
Mixtures of expert interpolators
The mixture of interpolators net
Mutual information with multi-dimensional output
Beware of Gaussian assumptions
Violating the Gaussian Assumption
(experiments by Russ Salakhutdinov)
Kernel Canonical Correlation
(Bach and Jordan)
Slow Feature Analysis
(Berkes & Wiskott, Wiskott & Sejnowski)
Relationship to a linear dynamical system
A way to learn non-linear transformations that maximize agreement between the outputs of two modules
An energy-based model of agreement
It’s the same cost as symmetric SNE!
The forces acting on the output vectors
Combining symmetric SNE with a feedforward neural net