7
We need to keep the efficiency of using a gradient method
for adjusting the weights, but use it for modeling the
structure of the sensory input.
Adjust the weights to maximize the probability that a
generative model would have produced the sensory
input. This is the only place to get 10^5 bits per second.
Learn p(image)  not  p(label | image)
What kind of generative model could the brain be using?