NIPS 2007 Tutorial on Deep Belief Nets


Retrieving documents that are similar to a query document


•	We can use an autoencoder to find low-dimensional
	codes for documents that allow fast and accurate
	retrieval of similar documents from a large set.

•	We start by converting each document into a
	“bag of words”. This is a 2000-dimensional
	vector that contains the counts for each of the
	2000 commonest words.
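
The bag-of-words step can be sketched roughly as follows. This is a minimal illustration, not the tutorial's own code: the function names `build_vocabulary` and `bag_of_words`, the whitespace tokenization, the lowercasing, and the toy documents are all assumptions made here. The slide only specifies that the vector holds counts for the 2000 commonest words.

```python
from collections import Counter


def build_vocabulary(documents, vocab_size=2000):
    """Return the vocab_size most common words across all documents.

    Tokenization by lowercasing and splitting on whitespace is an
    assumption for this sketch; the slide does not specify it.
    """
    counts = Counter()
    for doc in documents:
        counts.update(doc.lower().split())
    return [word for word, _ in counts.most_common(vocab_size)]


def bag_of_words(document, vocabulary):
    """Map a document to a count vector with one entry per vocabulary word."""
    counts = Counter(document.lower().split())
    # Words outside the vocabulary are ignored; absent words count as 0.
    return [counts[word] for word in vocabulary]


# Toy corpus (hypothetical), with a 3-word vocabulary instead of 2000.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
]
vocab = build_vocabulary(docs, vocab_size=3)
vectors = [bag_of_words(d, vocab) for d in docs]
```

With a real corpus, `vocab_size` would stay at 2000, so each document becomes a 2000-dimensional count vector that can be fed to the autoencoder.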