NIPS 2007 Tutorial on Deep Belief Nets

How to compress the count vector

Output vector: 2000 reconstructed counts


•	We train the neural
	network to reproduce its
	input vector as its output.

•	This forces it to
	compress as much
	information as possible
	into the 10 numbers in
	the central bottleneck.

•	These 10 numbers are
	then a good way to
	compare documents.
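
The layer sizes on this slide can be sketched as a forward pass in plain Python. This is only an illustration under stated assumptions, not the tutorial's implementation: the logistic units, the random untrained weights, the synthetic word counts, and the use of cosine similarity to compare the 10-number codes are all choices made here, and the helper names (init_weights, forward, cosine) are hypothetical. Training is not shown, so the codes carry no meaning yet. The sketch only shows where the 10 numbers come from and how two documents could be compared with them.

```python
import math
import random

# Layer sizes as drawn on the slide: input, encoder, bottleneck, decoder, output.
LAYER_SIZES = [2000, 500, 250, 10, 250, 500, 2000]
CODE_LAYER = 3  # index of the 10-unit bottleneck in LAYER_SIZES


def init_weights(sizes, seed=0):
    """Random, UNTRAINED weights (assumption): one (matrix, bias) pair per layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        W = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((W, b))
    return layers


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(layers, x):
    """Return the activations of every layer, starting with the input itself."""
    activations = [x]
    for W, b in layers:
        # Logistic units everywhere is an assumption made for this sketch.
        x = [logistic(sum(w * v for w, v in zip(row, x)) + bias)
             for row, bias in zip(W, b)]
        activations.append(x)
    return activations


def cosine(u, v):
    """Cosine similarity, one possible way to compare two 10-number codes."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


layers = init_weights(LAYER_SIZES)

# Two synthetic documents, each a vector of 2000 word counts (made-up data).
rng = random.Random(1)
doc_a = [rng.randint(0, 3) for _ in range(LAYER_SIZES[0])]
doc_b = [rng.randint(0, 3) for _ in range(LAYER_SIZES[0])]

acts_a = forward(layers, doc_a)
acts_b = forward(layers, doc_b)
code_a, code_b = acts_a[CODE_LAYER], acts_b[CODE_LAYER]

print("layer widths:", [len(a) for a in acts_a])
print("code length:", len(code_a))
print("code similarity:", cosine(code_a, code_b))
```

In this sketch the final layer produces values between 0 and 1 rather than counts. A trained model would need an output layer and a reconstruction error suited to count data, which the slide does not specify.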

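[Figure: network layers from the output side (top) to the input side (bottom): 500 neurons, 250 neurons, 10 (central bottleneck), 250 neurons, 500 neurons]

Input vector: 2000 word counts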