the next generation of neural networks

How to compress document count vectors

output

vector

2000 reconstructed counts


•	We train the
	autoencoder to
	reproduce its input
	vector as its output

•	This forces it to
	compress as much
	information as possible
	into the 2 real numbers
	in the central bottleneck.

•	These 2 numbers are
	then a good way to
	visualize documents.

500 neurons

250 neurons

2

250 neurons

500 neurons


Input vector uses
Poisson units

2000 word counts