The following code can be traced with a few examples in order to discern its purpose.
Assume these declarations have been made:
int n;
// Read a value into n;
int data[] = new int[n];
int frequency[] = new int[n];
int i;
Imagine this code comes later:
/* Preconditions: ??? */
i = 0;
while (data[i] != -1)
{
frequency[data[i]] = frequency[data[i]] + 1;
i++;
}
/* Postconditions: ??? */
You will notice that the code could certainly have been clearer in stating its purpose. Comments would have helped. Once we figure out the purpose of the program, we can write specifications for it - guidelines that the program will follow.
We would like to figure what must be true in order for the program to
work.
From the code, we can see the purpose is to store the frequency of
occurrences of elements in the data array in the
corresponding positions in the
frequency array. However, the code in between the
declarations and the loop hasn't been specified, so it could be
anything. We want the loop code to work so let's figure out some
situations where it won't work. Once we do that, we can write
preconditions so that those using the loop will know when it will work.
When the Loop Code will Fail:
We know from conditions 1, 3, 4, and 5 that the data
array has to be initialized, it can't have any negative numbers or
numbers greater than the size of the frequency array before the first -1
element, it can, it must have at least one -1 element, and the first -1
element. Many different formats can work under these restrictions. We
can make the preconditions simpler by imposing another restriction.
We can say the -1 element has to be at the end of the data
array. That way, there is only one -1 element to think about and all the
data in the array up to the -1 element at the end will be processed, and
none of the other elements will have "out-of-range" values.
For the frequency array, we know it must be initialized to some sensible
values. We can further impose that all these values must be zero. Thus,
the loop code will result in the frequency array containing frequencies
of all the data elements in only the data array.
We can write the preconditions in a more mathematical way:
data[n-1] = -1
frequency[j] = 0 for all 0 <= j <= n-1
0 <= data[j] < n-1 for all 0 <= j <= n-2
The postconditions assert what is true after the code is terminated (assuming of course, that the preconditions were true). With the purpose of the program in mind, what can we say is true once the loop is done? ... Well, we know that the frequencies of the data elements have been stored in the corresponding positions in the data array, so let that be our postcondition. Expressed in a mathematical way, the postcondition would be:
for all 0 <= j <= n-1
frequency[j] = the number of occurrences of j in data[0..n-2]
Assume these declarations have been made:
int n;
// Read a value into n;
int data[] = new int[n];
int frequency[] = new int[n];
int i;
Imagine this code comes later:
/* Preconditions:
data[n-1] = -1
frequency[j] = 0 for all 0 <= j <= n-1
0 <= data[j] < n-1 for all 0 <= j <= n-2 */
i = 0;
while (data[i] != -1)
{
frequency[data[i]] = frequency[data[i]] + 1;
i++;
}
/* Postconditions:
for all 0 <= j <= n-1
frequency[j] = the number of occurrences of j in data[0..n-2] */
We now have to find a loop invariant.. something that stays the same as the loop progresses. It also has to help us prove the postconditions on termination of the loop. Here is a loop invariant that shows the progress of the loop but stays true throughout the loop:
for all 0 <= j <= n-1
frequency[j] = #occurrences of j in data[0..i-1]
This simply means that at the beginning of each iteration i, we have
processed the data array up to i - 1.
A second loop invariant would be that the data array
remains unchanged:
for all 0 <= j <= n
data'[j] = data[j]
If the preconditions are true and the loop terminates, then the
postconditions should be true. We can prove this. We will be making
use of an equivalence derived from the preconditions to do this proof:
data[k] = -1 <=> k=n-1. These two statements are equivalent.
Convince yourself of this by examining the preconditions. The equivalency
remains true in the loop since the data array does not change (second
loop invariant).
Now, assuming the loop invariants and the negation of the loop guard, we can prove the postconditions:
LI#1:
for all 0 <= j <= n-1
frequency[j] = #occurrences of j in data[0..i-1]
LI#2:
for all 0 <= j <= n
data'[j] = data[j]
Negation of Loop guard:
data[i] = -1
Since the last statement is equivalent to i = n - 1, we also know
i = n - 1
We can now substitute this i into the loop invariant #1, yielding:
for all 0 <= j <= n-1
frequency[j] = #occurrences of j in data[0..n-2]
The last statement obtained is the postcondition we wanted! The above is
a proof that the postconditions are true upon termination of the loop.
For a loop to work, it has to terminate. If it loops infinitely, it is incorrect. So, part of proving a loop program correct is proving that the loop terminates.
First, we must find an integer-valued quantity that decreases on each
iteration but can never become negative. This quantity is called the
bound function,t - it is an upper bound on the number of
iterations left.t must have these
properties:
t > 0
t decreases at every iteration.
We can find such a bound on our loop. In the loop, we progress through
the data array and exit upon reaching the last element,
since, by the preconditions, it must be -1. Thus, a bound on the
iterations left can be the number of elements left to look at: the index
of the last element
of the array minus the current position
in the array. We can thus use (n-i)-1 as our bound function.
This becomes zero when it is time to exit. n-i will also
work. So will (n-i)+400. (n-i)-50 won't work
because it violates the first condition.
A bound function exists for the loop, thus the loop terminates.
We used the invariant to show that the postconditions were true if the loop terminated. However, in order to complete our proof, we must prove that the loop invariants are indeed invariant.
Our loop invariant #1 was:
for all 0 <= j <= n-1
frequency[j] = #occurrences of j in data[0..i-1]
We prove invariance of this statement using induction.
frequency array must have frequencies of all the data
contained in data[0..-1] since i is 0. This is an array
with zero elements. Since all data has occurred 0 times in such
an array, the frequency array has the correct contents;
it has been initialized so all its elements are zero (according to
the preconditions). So the statement is true for i=0.
for all 0 <= j <= n-1
frequency[j] = #occurrences of j in data[0..k-1]
frequency[data[k]] = frequency[data[k]] + 1
By inspection, we see that this is the correct processing for the
k-th element of data. This means:
frequency[data[k]] = #occurrences of data[k] in data[0..k]
Thus, given that the statement is true for k, we have shown it must
be true for k+1.