Lecture 8: Deep Belief Nets

• Deep belief nets can benefit a lot from unlabeled data when labeled data is scarce.
    – They just use the labeled data for fine-tuning.

• Kernel methods, like Gaussian processes, work well on small labeled training sets but are very slow for large training sets.

• So when there is a lot of unlabeled data and only a little labeled data, combine the two approaches:
    – First learn a deep belief net without using the labels.
    – Then apply Gaussian process models to the deepest layer of features. This works better than using the raw data.
    – Use GPs to get the derivatives that are backpropagated through the deep belief net. This is a further win. It allows GPs to fine-tune complicated kernels.
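The first two steps of the combined approach can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method used in the lecture: a small stack of binary RBMs trained with one-step contrastive divergence (CD-1) stands in for the deep belief net, the GP is plain regression with an RBF kernel, the data are synthetic, and all function names (`train_rbm`, `pretrain_stack`, `top_features`, `gp_predict`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=20, lr=0.1, seed=0):
    """Train one binary RBM with CD-1 (an assumed stand-in for one DBN layer)."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    W = 0.01 * rng.standard_normal((data.shape[1], n_hidden))
    b_vis = np.zeros(data.shape[1])
    b_hid = np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data.
        h_prob = sigmoid(data @ W + b_hid)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one reconstruction step.
        v_recon = sigmoid(h_sample @ W.T + b_vis)
        h_recon = sigmoid(v_recon @ W + b_hid)
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n
        b_vis += lr * (data - v_recon).mean(axis=0)
        b_hid += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_hid

def pretrain_stack(data, layer_sizes, **kw):
    """Greedy layer-wise pretraining without labels."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b_hid = train_rbm(x, n_hidden, **kw)
        layers.append((W, b_hid))
        x = sigmoid(x @ W + b_hid)  # features feed the next RBM
    return layers

def top_features(x, layers):
    """Deterministic upward pass to the deepest layer of features."""
    for W, b_hid in layers:
        x = sigmoid(x @ W + b_hid)
    return x

def rbf_kernel(a, b, length_scale=1.0):
    sq = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_predict(f_train, y_train, f_test, noise=0.1, length_scale=1.0):
    """Posterior mean of GP regression applied to feature vectors."""
    K = rbf_kernel(f_train, f_train, length_scale)
    K += noise**2 * np.eye(len(f_train))
    alpha = np.linalg.solve(K, y_train)
    return rbf_kernel(f_test, f_train, length_scale) @ alpha

# Toy data (synthetic): many unlabeled cases, only a few labeled ones.
rng = np.random.default_rng(0)
unlabeled = (rng.random((200, 16)) < 0.5).astype(float)
labeled_x = unlabeled[:20]
labeled_y = labeled_x[:, 0] - labeled_x[:, 1]

layers = pretrain_stack(unlabeled, [12, 8])            # step 1: no labels used
f_train = top_features(labeled_x, layers)              # deepest-layer features
f_test = top_features(unlabeled[20:25], layers)
pred = gp_predict(f_train, labeled_y, f_test)          # step 2: GP on features
```

The sketch stops before the third step. Fine-tuning would additionally require the gradient of the GP's log marginal likelihood with respect to the top-layer features, which is then backpropagated through `layers`.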