How to compress document count vectors
Output vector: 2000 reconstructed counts
We train the autoencoder to reproduce its input vector as its output. This forces it to compress as much information as possible into the 2 real numbers in the central bottleneck. These 2 numbers are then a good way to visualize documents.
Hidden layers (from the output end to the input end): 500 neurons, 250 neurons, 2 (bottleneck), 250 neurons, 500 neurons
Input vector (uses Poisson units): 2000 word counts
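The layer stack above can be sketched as a forward pass in NumPy. This is a minimal illustration, not the training procedure behind the slide: the layer widths come from the diagram, but the logistic hidden units, the linear 2-unit code layer, the log(1 + count) input scaling, the random initialisation, and the use of a Poisson negative log-likelihood as the reconstruction error are all assumptions made for the example. The names init_params, forward, and poisson_nll are hypothetical.

```python
import numpy as np

# Layer widths read off the slide: 2000 word counts in, 2 numbers in the
# bottleneck, 2000 reconstructed counts out.
LAYER_SIZES = [2000, 500, 250, 2, 250, 500, 2000]
BOTTLENECK_INDEX = 3  # position of the 2-unit code layer in LAYER_SIZES


def init_params(rng, sizes=LAYER_SIZES, scale=0.01):
    """Small random weights and zero biases (an assumed initialisation)."""
    return [
        (rng.normal(0.0, scale, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, counts):
    """Return (code, rates) for a batch of documents.

    code  : the 2 bottleneck numbers per document
    rates : predicted Poisson rates for the 2000 word counts
    """
    h = np.log1p(counts)  # assumed input scaling, not stated on the slide
    code = None
    last = len(params) - 1
    for i, (W, b) in enumerate(params):
        z = h @ W + b
        if i == BOTTLENECK_INDEX - 1:
            h = z                          # linear code layer (assumption)
            code = h
        elif i == last:
            h = np.exp(z)                  # Poisson rates must be positive
        else:
            h = 1.0 / (1.0 + np.exp(-z))   # logistic hidden units (assumption)
    return code, h


def poisson_nll(counts, rates):
    """Mean Poisson negative log-likelihood per document.

    The log(count!) term is dropped because it does not depend on the weights.
    """
    return float(np.mean(np.sum(rates - counts * np.log(rates), axis=1)))


rng = np.random.default_rng(0)
params = init_params(rng)
docs = rng.poisson(0.5, size=(4, 2000)).astype(float)  # 4 synthetic documents
code, rates = forward(params, docs)
print(code.shape, rates.shape)  # (4, 2) (4, 2000)
```

Training would adjust the weights to lower poisson_nll(docs, rates). After training, each row of code gives one document's 2-D coordinates, which can be drawn as a point in a scatter plot.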