Combining networks reduces variance
•
We want to compare two expected squared errors
–
Method 1: Pick one of the predictors at random
–
Method 2:
Use the average of the predictors, y
This term vanishes