Common problems with Assignment 2

Question 1.
Marking scheme: (roughly) 6 marks for loop invariant, 3 each for
    proving the loop invariant, partial correctness, and termination.

- Most students did get the program right; for those that didn't I 
provided a sample input that would cause the program to behave badly 
in some way. In all cases I tried to be sure the problem was not 
simply a typo by checking the proof quickly. Most wrong programs were 
due to people simply copying the usual binary search algorithm in the 
vague hope that it would do the right thing. Some people who used this 
approach recognized the problem and tried to handle it by appending a 
few extra tests and rolling back to first - 1 if needed. This 
generally wasn't done correctly; and where it was the student was 
still left with the problem of trying to prove it was correct (more on 
this below).

- By far the most common problem was the use of a loop invariant that 
was not sufficient to prove partial correctness given termination i.e. 
omitting any mention that the index we were seeking was the maximum i 
such that A[i] > x. Without this all we know is that the array 
contains some value greater than x; a fact that can be verified merely 
by comparing x to A[1]. Most people who made this mistake did not seem 
to notice that something was missing.

- Handling of inequalities was a problem for some. A common problem 
was the use of the relatively weak statement:
    first <= mid = (first + last + 1)/2 <= last
to support contentions such as:
    first[i+1] = first[i] <= last[i+1] = mid - 1.
To compound this, some incorrectly derived:
    first <= (first + last + 1)/2 < last
due no doubt to adopting the sub-lemma from the handout on binary 
search proofs without any changes.

- Many people assumed when proving the base case of the induction that 
the loop is never executed i.e. started off with induction on the size 
of A. In all cases the proof would subsequently revert to induction on 
the number of iterations performed (thank God).

- As mentioned above, some people were able to squeak by using 
ordinary binary search and then subsequently fixing the result. In all 
cases but one that I can remember they then either
    a) claimed in their loop invariant that some i such that A[i] > x
       satisfied first <= i <= last (which is disingenuous to say the
       least), or 
    b) assumed that x was in fact in the array.


Question 2.
Marking scheme: same as above.

- Again, insufficient loop invariants were the main problem. Many 
people only tried to prove s >= 1 (which is of no use for partial 
correctness, and can be done in two lines anyways), and only a 
minority attempted anything like (r + s)^2 > x. Often (in contrast to 
above) it was recognized that something more than r^2 <= x was needed 
to show the correctness of the program. Some people attempted very 
suggestive summation arguments, but since they had no way to 
characterize the terms of the summation apart from what it added up 
to, they couldn't go anywhere with them (for those people, a hint: try 
restricting s to a particular set of values).

- Again, many people assumed in their base case that the loop was 
never executed.

Question 3.
Marking scheme (very roughly):
i) 2 marks for verifying that all values used in the loop body had
   size O(n)
ii) 3 marks for correctly deducing that the loop body ran in O(m^2)
   for some unspecified m.
iii) 3 marks for correctly working out the number of iterations.
iv) 2 marks for remembering that you multiply the number of iterations
   by the time of the loop body to get the program's running time.

- Very few people made any attempt at (i), which is what the answer in 
the solution set devotes most of its time to. It seems to me that the 
idea of the size of a number in terms of storage space has not been 
fully digested by a lot of the class. Most students started off by 
counting up the operations in terms of m as if it were some external 
constant; many did not even state at any point that m had any relation 
to n whatsoever: they just derived nm^2 and then just blandly asserted 
that T(n) = O(n^3).

- Many people had problems working out that the number of iterations 
was log s = n/2. Some laboured at length to get the (roughly) correct 
result; others came up with stuff like log n iterations and the like. 
Very many who did know that log_2 of 2^k is k ignored the value for s 
specified in the question and used s = 2^n, possibly because they 
didn't know what to do with the square root of a power of 2 (no marks 
were taken off for this).

- More people than I would have expected only worked out the time 
taken for the body of the loop, and then did things like O(m^2) or 
O(n^2) is a subset of O(n^3) to get the result.

- Some people tried ill-advised inductions or recurrence relation 
schemas to prove the result. No marks were explicitly taken off for doing this, 
but in no case that I remember did such an approach get anywhere.

Question 4.
Marking scheme: a) 3 marks, b) 4 marks, c) 3 marks.

Most people got all of these without any real problems. Most Common 
mistakes involved ill-treatment or confusion of constants used in 
definitions of e.g. O(g) and O(h). A Common not-so-good approach was an
attempt to argue in terms of "asymptotic upper/lower bounds" rather 
than using the definitions. Some people in (c) first "converted" g to 
O(n^4) and f to O(n^5) and then claimed on that basis f > g.