Overcoming the limitations of back-propagation
Keep the efficiency and simplicity of using a
gradient method for adjusting the weights, but use
it for modeling the structure of the sensory input.
Adjust the weights to maximize the probability
that a generative model would have produced
the sensory input.
Learn p(image), not p(label | image)
If you want to do computer vision, first learn
computer graphics
What kind of generative model should we learn?
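
To make the objective above concrete, here is a minimal sketch of "adjust the weights to maximize the probability that a generative model would have produced the sensory input". It does not answer the question of which generative model to learn. The model is an assumption chosen only for illustration: each binary pixel is an independent Bernoulli variable with one bias parameter, so p(image) is a product over pixels. The names sigmoid, mean_log_likelihood, and fit are hypothetical, and the data is synthetic. The point is that the same kind of gradient method is used, but the quantity being climbed is log p(image) and no labels appear anywhere.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mean_log_likelihood(biases, images):
    """Average log p(image) under the toy model (independent Bernoulli pixels)."""
    total = 0.0
    for image in images:
        for b, v in zip(biases, image):
            p = sigmoid(b)  # model's probability that this pixel is on
            total += math.log(p) if v else math.log(1.0 - p)
    return total / len(images)


def fit(images, steps=200, learning_rate=0.5):
    """Gradient ascent on the mean log-likelihood of the images; no labels used."""
    n_pixels = len(images[0])
    # Fraction of training images in which each pixel is on.
    data_means = [sum(image[i] for image in images) / len(images)
                  for i in range(n_pixels)]
    biases = [0.0] * n_pixels
    for _ in range(steps):
        for i in range(n_pixels):
            # Gradient of mean log p(image) w.r.t. bias i:
            # (how often the data turns pixel i on) - (how often the model does).
            biases[i] += learning_rate * (data_means[i] - sigmoid(biases[i]))
    return biases


# Synthetic "sensory input": 500 four-pixel binary images (assumed data).
random.seed(0)
true_probs = [0.9, 0.1, 0.5, 0.7]
images = [[1 if random.random() < p else 0 for p in true_probs]
          for _ in range(500)]

before = mean_log_likelihood([0.0] * len(true_probs), images)
biases = fit(images)
after = mean_log_likelihood(biases, images)
print("mean log p(image) before:", before)
print("mean log p(image) after: ", after)
```

In this toy case the gradient is the difference between a statistic of the data and the same statistic under the model, and learning stops when the two match. A model this simple cannot capture structure between pixels, which is why the choice of generative model matters.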