The deep autoencoder
784 -> 1000 -> 500 -> 250 -> 30 linear units -> 250 -> 500 -> 1000 -> 784
If you start with small random weights, the network will not
learn. If you break symmetry randomly by using
bigger weights, it will not find a good solution.
So we train a stack of 4 RBMs and then “unroll”
them. Then we fine-tune with gentle backprop.
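
The "unrolling" step can be sketched roughly as follows. This is a minimal illustration, not the original implementation: the RBM weights here are random placeholders standing in for pretrained ones, and the helper names (`make_placeholder_rbms`, `unroll`, `forward`), the logistic hidden and output units, and the untied copies of the weights are all assumptions. Only the layer sizes and the linear 30-unit code layer come from the slide.

```python
import numpy as np

# Layer sizes from the slide: 784 -> 1000 -> 500 -> 250 -> 30
LAYER_SIZES = [784, 1000, 500, 250, 30]


def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))


def make_placeholder_rbms(sizes, rng):
    """Hypothetical stand-in for pretraining.

    Returns one (W, hidden_bias, visible_bias) tuple per RBM. Real RBMs
    would be trained layer by layer; here the weights are just random.
    """
    rbms = []
    for n_vis, n_hid in zip(sizes[:-1], sizes[1:]):
        W = rng.normal(0.0, 0.01, size=(n_vis, n_hid))
        rbms.append((W, np.zeros(n_hid), np.zeros(n_vis)))
    return rbms


def unroll(rbms):
    """Turn a stack of RBMs into a list of (W, b) feed-forward layers.

    Encoder layers use each RBM's weights and hidden biases. Decoder layers
    use the transposed weights and visible biases, in reverse order. The
    copies are independent (an assumption), so fine-tuning can change
    encoder and decoder weights separately.
    """
    encoder = [(W.copy(), b_hid.copy()) for W, b_hid, _ in rbms]
    decoder = [(W.T.copy(), b_vis.copy()) for W, _, b_vis in reversed(rbms)]
    return encoder + decoder


def forward(layers, x, code_index):
    """Run the unrolled network; the code layer is linear, the rest logistic."""
    h = x
    for i, (W, b) in enumerate(layers):
        pre = h @ W + b
        h = pre if i == code_index else logistic(pre)
    return h


rng = np.random.default_rng(0)
rbms = make_placeholder_rbms(LAYER_SIZES, rng)
layers = unroll(rbms)
layer_shapes = [W.shape for W, _ in layers]

batch = rng.random((5, 784))  # fake input batch, values in [0, 1)
recon = forward(layers, batch, code_index=len(rbms) - 1)
```

The unrolled list has 8 weight matrices, 4 for the encoder and 4 for the decoder. Fine-tuning would then backpropagate the reconstruction error through all 8 layers, starting from these pretrained values rather than from random ones.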