• Deep belief nets can benefit a lot from unlabeled data
when labeled data is scarce.
– They only need the labeled data for fine-tuning (a sketch of the unsupervised pre-training follows).
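A minimal sketch of the unsupervised phase, assuming the DBN is pre-trained greedily as a stack of binary RBMs with one-step contrastive divergence (CD-1); the layer sizes, learning rate, and toy data are illustrative, not from the source:

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_rbm(V, n_hidden, epochs=10, lr=0.05):
        """One-step contrastive divergence (CD-1) for a binary RBM."""
        n_visible = V.shape[1]
        W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        a = np.zeros(n_visible)                    # visible biases
        b = np.zeros(n_hidden)                     # hidden biases
        for _ in range(epochs):
            ph = sigmoid(V @ W + b)                # positive-phase hidden probs
            h = (rng.random(ph.shape) < ph) * 1.0  # sample hidden states
            pv = sigmoid(h @ W.T + a)              # one-step reconstruction
            ph2 = sigmoid(pv @ W + b)              # negative-phase hidden probs
            W += lr * (V.T @ ph - pv.T @ ph2) / len(V)
            a += lr * (V - pv).mean(axis=0)
            b += lr * (ph - ph2).mean(axis=0)
        return W, b

    # Greedy layer-wise pre-training on unlabeled data: each RBM's hidden
    # activations become the "visible" data for the next RBM in the stack.
    X = (rng.random((1000, 784)) < 0.5) * 1.0      # toy unlabeled binary data
    weights, data = [], X
    for n_hidden in (500, 50):
        W, b = train_rbm(data, n_hidden)
        weights.append((W, b))
        data = sigmoid(data @ W + b)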
• Kernel methods, like Gaussian processes, work well on small labeled training sets, but exact GP inference costs O(N³) in the number of training cases, so they are slow for large training sets.
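For concreteness, a toy GP regression sketch (plain NumPy, RBF kernel, all values illustrative); the Cholesky factorisation of the N×N kernel matrix is the O(N³) step that makes exact GPs expensive on large training sets:

    import numpy as np

    def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
        """Squared-exponential kernel matrix between two sets of points."""
        d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
        return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

    def gp_predict(X, y, X_star, noise=1e-2):
        """Exact GP regression; returns the predictive mean at X_star."""
        K = rbf_kernel(X, X) + noise * np.eye(len(X))
        L = np.linalg.cholesky(K)        # O(N^3): the bottleneck for large N
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
        return rbf_kernel(X_star, X) @ alpha

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, (50, 1))      # small labeled training set
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
    print(gp_predict(X, y, np.array([[0.0]])))   # prediction near sin(0) = 0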
• So when there is a lot of unlabeled data and only a little
labeled data, combine the two approaches:
– First learn a deep belief net without using the labels.
– Then fit Gaussian process models to the deepest layer of features. This gives better predictions than fitting the GP to the raw data.
– Then use the GP to get derivatives of its log marginal likelihood and back-propagate them through the deep belief net. This is a further win: it lets the GP fine-tune a complicated, domain-specific kernel (see the sketch after this list).
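A sketch of the combined approach, assuming PyTorch for the backward pass. The feed-forward dbn network, layer sizes, hyperparameters, and toy data are stand-ins, not from the source; a real run would initialise dbn from the unsupervised pre-training above. Minimising the GP's negative log marginal likelihood adapts the kernel hyperparameters and, through back-propagation, the DBN weights, so the net learns a domain-specific kernel:

    import torch

    def rbf_kernel(A, B, log_ls, log_var):
        """Squared-exponential kernel with log-parameterised hyperparameters."""
        d2 = (A[:, None, :] - B[None, :, :]).pow(2).sum(-1)
        return torch.exp(log_var) * torch.exp(-0.5 * d2 / torch.exp(2.0 * log_ls))

    # Hypothetical stand-in for the pre-trained DBN's recognition pass.
    dbn = torch.nn.Sequential(
        torch.nn.Linear(784, 500), torch.nn.Sigmoid(),
        torch.nn.Linear(500, 50), torch.nn.Sigmoid(),
    )

    log_ls = torch.zeros((), requires_grad=True)        # kernel lengthscale (log)
    log_var = torch.zeros((), requires_grad=True)       # kernel variance (log)
    log_noise = torch.tensor(-2.0, requires_grad=True)  # observation noise (log)

    def gp_nll(F, y):
        """GP negative log marginal likelihood on features F (up to a constant)."""
        N = F.shape[0]
        K = rbf_kernel(F, F, log_ls, log_var) + torch.exp(log_noise) * torch.eye(N)
        L = torch.linalg.cholesky(K)
        alpha = torch.cholesky_solve(y[:, None], L)
        return 0.5 * (y[None, :] @ alpha).squeeze() + torch.log(torch.diagonal(L)).sum()

    X = torch.rand(100, 784)                            # toy labeled inputs
    y = torch.randn(100)                                # toy labeled targets
    opt = torch.optim.SGD([*dbn.parameters(), log_ls, log_var, log_noise], lr=1e-3)

    for step in range(100):
        F = dbn(X)              # deepest-layer features
        loss = gp_nll(F, y)     # GP marginal likelihood on those features
        opt.zero_grad()
        loss.backward()         # derivatives flow back through the DBN
        opt.step()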