Let's make a linear autoencoder, i.e. PCA, by hand: three visible units, one hidden unit, and of course three output units. All units are linear.

Dataset A, containing four training cases: [1, 1, 0], [0.5, 0.5, 0], [-2, -2.1, 0.1], [2, 2, 0.1].

Dataset B, also containing four training cases: [1, 1, 0], [1, 0, 1], [1, 2, -1], [1, -1, 2].

For each dataset, find good weights and biases for the linear autoencoder. They don't have to be the best possible; just find decent weights.

Is training an autoencoder supervised learning or unsupervised learning? Give an argument for each view.

What's the point of using PCA (or another autoencoder) for re-encoding data? See slide 7 of chapter 15. How can any computer program have runtime that's better than linear in the amount of data that's provided?

If we take a trained RBM and convert it to an autoencoder like we did a while back in class (before we had the term "autoencoder"), will it be a good autoencoder? What does that depend on?
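If you want to check your hand-derived weights, here is one possible sketch (not part of the assignment): the best 3-1-3 linear autoencoder uses the top principal direction of the centered data as both encoder and decoder weights, with the decoder bias absorbing the mean. The helper name `fit_linear_autoencoder` is made up for illustration; it assumes numpy and a squared-error objective.

```python
import numpy as np

def fit_linear_autoencoder(X):
    """Top-1 PCA solution for a 3-1-3 linear autoencoder (illustrative helper).

    Returns encoder weights, encoder bias, decoder bias (the data mean),
    and the mean squared reconstruction error.
    """
    mu = X.mean(axis=0)
    Xc = X - mu
    # The first right-singular vector of the centered data is the top
    # principal direction; it serves as both encoder and decoder weights.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    v = Vt[0]
    enc_bias = -v @ mu            # so hidden = v @ x + enc_bias = v @ (x - mu)
    recon = mu + np.outer(Xc @ v, v)
    err = np.mean((recon - X) ** 2)
    return v, enc_bias, mu, err

A = np.array([[1, 1, 0], [0.5, 0.5, 0], [-2, -2.1, 0.1], [2, 2, 0.1]])
B = np.array([[1, 1, 0], [1, 0, 1], [1, 2, -1], [1, -1, 2]])

_, _, _, err_A = fit_linear_autoencoder(A)
_, _, _, err_B = fit_linear_autoencoder(B)
print(err_A, err_B)
```

Note that dataset B's cases all satisfy x2 + x3 = 1 with x1 = 1, so they lie on a one-dimensional affine line and a single hidden unit can reconstruct them essentially perfectly; dataset A is only approximately one-dimensional, so a small residual error remains.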