A spectrum of representations
PCA is powerful because it uses distributed representations, but limited because its representations are linearly related to the data.
Clustering is powerful because it uses very non-linear representations, but limited because its representations are local (not componential).
We need representations that are both distributed and non-linear.
Unfortunately, these are typically very hard to learn.
              Local         Distributed
Linear                      PCA
Non-linear    clustering    What we need
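
As a rough illustration of the two ends of this spectrum, the sketch below computes both kinds of code for the same data using NumPy. The function names `pca_codes` and `kmeans_codes`, the synthetic Gaussian data, and the choice of plain Lloyd's k-means as the clustering method are assumptions made for the example, not something given in the original. The PCA code is a linear function of the input in which every component is typically active; the clustering code is a one-hot vector in which exactly one unit is active.

```python
import numpy as np


def pca_codes(X, k):
    """Distributed, linear code: project centered data onto the top-k principal directions."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W = Vt[:k].T  # (d, k) projection matrix
    return (X - mean) @ W, mean, W


def kmeans_codes(X, k, iters=20, seed=0):
    """Local, non-linear code: one-hot indicator of the nearest cluster center."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points (fancy indexing returns a copy).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    # Final assignment against the updated centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    return np.eye(k)[labels], centers


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))  # hypothetical data, for illustration only

    Z, mean, W = pca_codes(X, k=3)
    H, centers = kmeans_codes(X, k=4)

    print("PCA code for first point:    ", Z[0])  # several real-valued components
    print("Cluster code for first point:", H[0])  # a single 1, the rest 0
```

With k components, the one-hot cluster code can distinguish only k regions of the input space, whereas the PCA code uses its components jointly but can only represent linear structure. Neither falls in the "what we need" cell of the table.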