Combining the outputs of replicated features
Get a small amount of translational invariance at
each level by averaging four neighboring
replicated detectors to give a single output to the
next level.
Taking the maximum of the four should work
better.
Achieving invariance in multiple stages seems to
be what the monkey visual system does.
Segmentation may also be done in multiple
stages.