How to compress the count vector

[Figure: network diagram. Surviving labels: "2000 reconstructed counts" (output), "input vector".]

• We train the neural network to reproduce its input vector as its output.
• This forces it to compress as much information as possible into the 10 numbers in the central bottleneck.
• These 10 numbers are then a good way to compare documents.
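
The idea on this slide can be tried out on a small scale. The following is only a minimal sketch, not the network from the lecture. It assumes a single linear hidden layer of 10 units, a squared-error loss, and plain gradient descent. The toy Poisson word counts, the log and unit-length preprocessing, and every function name are hypothetical choices made for illustration. Only the sizes 2000 and 10 come from the slide.

```python
import numpy as np

N_WORDS = 2000   # length of the count vector (from the slide)
N_CODE = 10      # width of the central bottleneck (from the slide)


def make_toy_counts(n_docs=100, seed=0):
    """Hypothetical data: two 'topics', each favouring half of the vocabulary."""
    rng = np.random.default_rng(seed)
    doc_topic = np.arange(n_docs) % 2                       # alternate topics 0, 1, 0, 1, ...
    word_topic = (np.arange(N_WORDS) >= N_WORDS // 2).astype(int)
    # On-topic words are frequent (rate 3.0), off-topic words are rare (rate 0.1).
    rates = np.where(doc_topic[:, None] == word_topic[None, :], 3.0, 0.1)
    return rng.poisson(rates), doc_topic


def preprocess(counts):
    """Assumed preprocessing: log-scale the counts, then give each document unit length."""
    x = np.log1p(counts.astype(float))
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def train_autoencoder(counts, n_steps=200, lr=0.3, seed=0):
    """Train input -> 10 numbers -> reconstruction so that the output matches the input."""
    rng = np.random.default_rng(seed)
    x = preprocess(counts)
    mean = x.mean(axis=0)
    x = x - mean                                            # centre each word dimension
    n_docs = x.shape[0]
    w_enc = rng.normal(0.0, 0.01, size=(N_WORDS, N_CODE))   # encoder weights
    w_dec = rng.normal(0.0, 0.01, size=(N_CODE, N_WORDS))   # decoder weights
    losses = []
    for _ in range(n_steps):
        code = x @ w_enc                                    # the 10-number bottleneck
        err = code @ w_dec - x                              # reconstruction minus input
        losses.append(0.5 * float(np.sum(err ** 2)) / n_docs)
        grad_recon = err / n_docs                           # d(loss)/d(reconstruction)
        grad_w_dec = code.T @ grad_recon
        grad_w_enc = x.T @ (grad_recon @ w_dec.T)
        w_enc -= lr * grad_w_enc
        w_dec -= lr * grad_w_dec
    return {"mean": mean, "w_enc": w_enc, "w_dec": w_dec, "losses": losses}


def encode(model, counts):
    """Map count vectors to their 10-number codes."""
    return (preprocess(counts) - model["mean"]) @ model["w_enc"]


def cosine(a, b):
    """Cosine similarity between two code vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A possible usage: call `make_toy_counts()`, pass the counts to `train_autoencoder`, then compare two documents with `cosine(codes[i], codes[j])` on the rows returned by `encode`. Cosine similarity is one reasonable way to compare codes; the slide does not say which measure is intended.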