Slide 11 of Lecture 6 shows a simple network: no hidden units, just one output unit (which is linear), and no biases. It has only two learnable parameters, w1 and w2, so for an input (x1, x2) the network's output is y = w1*x1 + w2*x2. Below are some training sets that we could use with that network. For each training set, describe the error surface:

1. Is there a setting of the weights that achieves zero error? If so, what setting(s)?
2. Draw a contour plot of the error surface, with w1 on the horizontal ('x') axis and w2 on the vertical ('y') axis.
3. Describe the surface in English. Is it a quadratic bowl? Is it some other shape? Describe any differences between the two axes.

(a) One training case:
    (0, 0) -> 5

(b) One training case:
    (0, 2) -> 10

(c) One training case:
    (20, 0) -> 100

(d) Two training cases:
    (0, 2) -> 10
    (20, 0) -> 100

(e) Three training cases:
    (0, 2) -> 10
    (20, 0) -> 100
    (0, 0) -> 20

(f) One training case:
    (0.01, 0.02) -> 70

(g) Three training cases:
    (0, 2) -> 10
    (20, 0) -> 100
    (0.01, 0.02) -> 70

(h) Two training cases:
    (0, 2) -> 10
    (0, 3) -> 100
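You can check your hand-drawn contour plots numerically. Below is a minimal sketch that evaluates the error over a grid of (w1, w2) values, assuming the error is the sum of squared differences between the output y = w1*x1 + w2*x2 and the target (if the lecture uses a factor of 1/2, that only rescales the contour labels, not their shape). The function name, grid ranges, and grid resolution here are illustrative choices, not part of the assignment.

```python
import numpy as np

def error_surface(training_set, w1_range, w2_range, n=201):
    """Sum-of-squared-errors of the linear unit y = w1*x1 + w2*x2,
    evaluated on an n-by-n grid of (w1, w2) values."""
    w1, w2 = np.meshgrid(np.linspace(*w1_range, n),
                         np.linspace(*w2_range, n))
    err = np.zeros_like(w1)
    for (x1, x2), t in training_set:
        # Each case contributes its own squared residual to the surface.
        err += (x1 * w1 + x2 * w2 - t) ** 2
    return w1, w2, err

# Example: training set (d), with two cases.
cases = [((0, 2), 10), ((20, 0), 100)]
w1, w2, err = error_surface(cases, (-10, 20), (-10, 20))

# Grid location of the lowest error found.
i, j = np.unravel_index(err.argmin(), err.shape)
print(w1[i, j], w2[i, j], err[i, j])
```

Passing the result to matplotlib's `plt.contour(w1, w2, err)` produces the contour plot asked for in question 2, and scanning `err` for (near-)zero entries is a quick sanity check on your answer to question 1.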