A final example: Document retrieval
We can use an autoencoder to find low-
dimensional codes for documents that allow
fast and accurate retrieval of similar
documents from a large set.
We start by converting each document into a
“bag of words”.  This a 2000 dimensional
vector that contains the counts for each of the
2000 commonest words.