Problem Set 8

Iterative Correctness

1 Buy Low Sell High

In this problem, we will examine algorithms for finding the maximum possible profit.

Here’s the setup. We’re given a list \(p\) containing the prices of a single stock over several days, where \(p[i]\) is the price of the stock on day \(i\). If I bought on day \(i\) and sold on day \(j\), my profit would be \(p[j] - p[i]\). Note that you can only sell on or after the day you buy.

Throughout the problem, use 0-indexing, so that the first day is day 0, and the last day is day \(\mathrm{len}(p) - 1\).

We are interested in finding the maximum possible profit when making at most one buy trade and one sell trade of the stock in the time period captured in \(p\) (the minimum is 0, since you can choose to neither buy nor sell). More formally, we wish to find

\[\max_{\substack{i, j \in \mathbb{N}, \\ 0 \leq i \leq j < \mathrm{len}(p)}} \left\{p[j] - p[i]\right\}\]

The edge case when \(i = j\) captures the idea that you may choose not to trade at all (i.e., you can buy and sell on the same day for the same price, for a profit of 0).

In your answers you may talk either about the informal ‘maximum possible profit’ definition of the problem or the more formal mathematical definition.
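For reference, the formal definition translates directly into a brute-force check over all valid \((i, j)\) pairs. The sketch below is only a baseline to compare the faster algorithms against; it is not one of the algorithms analyzed in this problem.

def max_profit_brute(p):
    # Try every buy day i and sell day j with i <= j (O(n^2) pairs).
    best = 0
    for i in range(len(p)):
        for j in range(i, len(p)):
            best = max(best, p[j] - p[i])
    return best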

Both of the algorithms below have the following inputs, preconditions, and postconditions.

Input: \(p\)

Precondition: \(p\) is a list of natural numbers with a length of at least \(2\).

Postcondition: Returns the maximum possible profit when making a single buy and sell trade over the time period captured by \(p\).

Here is an algorithm.

def max_profit1(p):
    m = 0  # best profit found so far
    for i in range(len(p) - 1):
        t = max(p[i + 1:]) - p[i]  # best profit when buying on day i
        if t > m:
            m = t
    return m

1.1

Prove MaxProfit1 is correct. You may assume that the max function is correct. That is, max(\(l\)) correctly returns the maximum element of the (non-empty) list \(l\).

Remember to use a well-selected loop invariant and to prove initialization, maintenance, and termination.

Solution

Loop Invariant. \(P(n)\): after the \(n\)th iteration, \(m\) stores the value of the maximum profit possible when buying before day \(n\).

Initialization. \(P(0)\). Before the loop, \(m\) is initialized to \(0\), and the maximum profit possible when buying before day \(0\) is \(0\) (you’re not allowed to buy anything yet!), so \(P(0)\) holds.

Maintenance. Let \(i \in \mathbb{N}\) and suppose \(P(i)\); we’ll show \(P(i+1)\).

\(m\) is updated according to the following rule.

\[m_{i + 1} = \begin{cases} \max(p[i+1:]) - p[i] & \text{if } \max(p[i+1:]) - p[i] > m_i\\ m_i & \text{otherwise}\\ \end{cases}\]

The maximum profit possible when buying on day \(i\) is obtained when you sell for the highest price after day \(i\). This highest price is \(\max(p[i+1:])\), and the buy price is \(p[i]\), so the maximum profit when buying on day \(i\) is computed as \(\max(p[i+1:]) - p[i]\).

By the inductive hypothesis, \(m_i\) stores the maximum profit possible when buying before day \(i\).

Thus, \(m_{i+1}\) stores the larger of the maximum profit when buying on day \(i\) and the maximum profit when buying before day \(i\), i.e., the maximum profit possible when buying before day \(i+1\), as required.

Termination. The algorithm terminates after exactly \(\text{len}(p) - 1\) iterations, at which point the LI implies that \(m\) stores the maximum profit possible when buying before day \(\text{len}(p) - 1\). This is the maximum possible profit since the last chance to buy is day \(\text{len}(p) - 2\).

1.2

Find the asymptotic worst-case runtime of MaxProfit1 on an input list of length \(n\). Assume each call to the max function takes time \(k\), where \(k\) is the length of the list passed to max.

Justify your answer briefly.

Solution

The runtime is dominated by the calls to max. At iteration \(i\), max takes time \(n-i-1\). Thus, the total amount of work done by max is \(\sum_{i = 0}^{n-2}(n-i-1) = \sum_{i = 1}^{n-1}i = \frac{n(n-1)}{2} = \Theta(n^2)\).

Here is another algorithm.

def max_profit2(p):
    b = p[0]  # lowest buy price seen so far
    m = 0  # maximum profit so far
    i = 1  # day

    while i < len(p):
        x = p[i]  # today's price
        t = x - b  # profit if we sell today at the best buy price so far
        if t > m:
            m = t
        if b > x:
            b = x
        i += 1
    return m
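As a quick sanity check on a made-up price list (the list below is just an illustration), both algorithms should agree:

p = [7, 1, 5, 3, 6, 4]  # hypothetical prices
# buy at 1 (day 1), sell at 6 (day 4): profit 5
assert max_profit1(p) == max_profit2(p) == 5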

1.3

Prove MaxProfit2 is correct.

Remember to use a well-selected loop invariant and to prove initialization, maintenance, and termination.

Solution

Loop Invariant. \(P(n)\): after the \(n\)th iteration,

  • \(b\) corresponds to the lowest buying price seen up to day \(n\)

  • \(m\) corresponds to the maximum possible profit when selling on or before day \(n\).

  • \(i = n+1\)

Initialization. \(P(0)\). The lowest buying price seen up to day \(0\) is the price on day \(0\), so \(b_0 = p[0]\). Furthermore, since selling on day \(0\) is impossible, \(m_0 = 0\). Additionally, \(i\) is initialized to \(1\), which is \(0 + 1\).

Maintenance. Let \(k \in \mathbb{N}\) and suppose \(P(k)\); we’ll show \(P(k+1)\). Note that \(x\) is the price on day \(i_k = k+1\).

\(t = x-b_k\); since \(b_k\) is the smallest buying price seen up to day \(k\), \(t\) stores the maximum profit possible when selling on day \(k+1\). Then we have \(m_{k+1} = t\) if \(t > m_k\), and \(m_{k+1} = m_k\) otherwise. Since \(m_k\) stores the maximum profit when selling on or before day \(k\), \(m_{k+1}\) now stores the maximum profit when selling on or before day \(k+1\).

Next, if \(b_k > x\), then \(x\) is smaller than the smallest buying price seen up to day \(k\), and hence \(b_{k+1} = x\) is the smallest buying price seen up to day \(k+1\). Otherwise, \(b_k\) is still the smallest buying price seen up to day \(k+1\), and \(b_{k+1} = b_k\).

Finally, \(i_{k+1} = i_k + 1 = k+2\).

Termination. Let \(n = \text{len}(p)\). By the last part of the loop invariant, after the \((n-1)\)th iteration \(i\) equals \(n\), so the loop terminates after exactly \(n-1\) iterations. The second part of the loop invariant then implies that \(m\) stores the maximum possible profit when selling on or before day \(n-1\), which is the maximum possible profit overall since day \(n-1\) is the last possible day to sell.

1.4

Find the asymptotic worst-case runtime of MaxProfit2 on an input list of length \(n\).

Justify your answer briefly.

Solution

\(O(n)\). Each iteration of the loop does a constant amount of work, and there are \(n\) iterations.

2 Task Scheduling Part 2

This question extends some ideas from HW3 Problem 2. It might be a good idea to review that problem! I have made an optional companion notebook for this problem which you can access here. It contains the prerequisite relationships for the CSC courses.

Let \(\prec\) be a valid prerequisites relation (defined in problem set 3) over a finite non-empty set of tasks \(\mathrm{Tasks}\). For any two tasks \(a, b \in \mathrm{Tasks}\), \(a \prec b\) if and only if \(a\) is a prerequisite for \(b\).

In HW3 we showed, by induction, that there exists an ordering of all the tasks in \(\mathrm{Tasks}\) that respects \(\prec\). In this problem we’ll prove the same thing algorithmically. I.e. to show that the ordering exists, we’ll demonstrate an algorithm that finds the ordering!

In this problem, we’ll view the prerequisite relation \(\prec\) over \(\mathrm{Tasks}\) as a graph. Specifically, the vertices will be the tasks and there will be a directed edge \((a, b)\) if and only if \(a\) is a direct prerequisite of \(b\). I.e. \(a \prec b\) and there does not exist a different task \(c\) such that \(a \prec c\) and \(c \prec b\). Formally, define \(G_{\mathrm{Tasks}, \prec} = (\mathrm{Tasks}, E)\) where \(E = \{(a, b): (a \prec b) \land \neg \exists c. (a \prec c \land c \prec b)\}\)
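For concreteness, here is a tiny hypothetical instance (the three-course chain is made up for illustration). Only consecutive pairs are direct prerequisites, so they are the only edges.

V = {"CSC108", "CSC148", "CSC236"}
E = {("CSC108", "CSC148"), ("CSC148", "CSC236")}
G = (V, E)
# ("CSC108", "CSC236") is not an edge: CSC148 sits between them.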

Consider the following implementations of isMinimal and findMinimal:

def isMinimal(a, G):
    V, E = G
    # a is minimal iff no vertex has an edge into a
    for b in V:
        if (b, a) in E:
            return False
    return True

def findMinimal(G):
    V, E = G
    for a in V:
        if isMinimal(a, G):
            return a

We propose FindOrdering1 as an algorithm to find an ordering of \(\mathrm{Tasks}\) that respects \(\prec\).

def FindOrdering1(G):
    V, E = G
    ordering = []
    n = len(V)

    while len(ordering) < n:
        t = findMinimal(G)
        ordering.append(t)
        V.remove(t)

    return ordering
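On the hypothetical three-course instance above, this should produce the only valid ordering. Note that FindOrdering1 empties \(V\) as it runs, so \(G\) must be rebuilt before being reused.

print(FindOrdering1(G))  # ['CSC108', 'CSC148', 'CSC236']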

2.1

Prove that FindOrdering1 is correct. I.e. Let \(\mathrm{Tasks}\) be any non-empty finite set of tasks and \(\prec\) be any valid prerequisites relation over \(\mathrm{Tasks}\). Prove that FindOrdering1 on input \(G_{\mathrm{Tasks}, \prec}\) returns an ordering of all the tasks in \(\mathrm{Tasks}\) that respects \(\prec\).

You may assume findMinimal and isMinimal are correct.

Solution

Define the following loop invariant. \(P(i)\). At the end of the \(i\)th iteration, the following hold:

  a. \(\texttt{ordering}_i = [o_1,...,o_{i}]\) such that for any \(j, k\) with \(j < k\), \(o_k \not \prec o_j\). I.e., the ordering so far is valid and has length \(i\).

  b. For all \(t\) in \(\texttt{ordering}_i\) and \(t' \in V_i\), \(t' \not \prec t\). I.e., nothing in \(V_i\) is a prerequisite for anything in \(\texttt{ordering}_i\).

  c. \(V_i\) and \(\texttt{ordering}_i\) form a partition of the tasks. I.e., every task is in exactly one of \(V\) or \(\texttt{ordering}\).

Initialization. The loop invariant holds at the start of the while loop since \(\texttt{ordering}_0 = []\) and \(V_0\) contains all the tasks.

Maintenance. Let \(i \in \mathbb{N}\) and assume \(P(i)\); we’ll show \(P(i+1)\).

By \(P(i)\).a, \(P(i)\).c, and the fact that the loop condition passed, we have that \(V_i\) is non-empty and finite. By HW3, it has a minimal element, and by the assumption that \(\texttt{findMinimal}\) is correct, \(t\) is a minimal element of \(V_i\). It is appended to \(\texttt{ordering}\) and removed from \(V\). We’ll now show the loop invariant holds.

We have \(\texttt{ordering}_{i+1} = \texttt{ordering}_i + [t]\) and \(V_{i+1} = V_i \setminus \{t\}\).

  a. \(P(i+1)\).a holds because \(P(i)\).b implies that \(t\) is not a prerequisite of anything in \(\texttt{ordering}_i\).

  b. \(P(i+1)\).b holds because \(t\) is minimal in \(V_i\), so nothing in \(V_{i+1}\) is a prerequisite of \(t\); combined with \(P(i)\).b, nothing in \(V_{i+1}\) is a prerequisite of anything in \(\texttt{ordering}_{i+1}\).

  c. \(P(i+1)\).c holds from the fact that we simply moved one task from \(V\) to \(\texttt{ordering}\).

Thus, the loop invariant holds at the end of iteration \(i+1\).

Termination. Part a. of the loop invariant implies that the loop terminates after the \(n\)th iteration, at which point \(P(n)\).a and \(P(n)\).c imply that \(\texttt{ordering}_n\) is an ordering of all the tasks in \(\mathrm{Tasks}\) that respects \(\prec\).

Here’s the story so far.

  • HW3: There exists an ordering

  • Previous problem: We can find an ordering

The next chapter has to do with efficiency - how efficiently can we find the ordering?

Let \(T_1(n, m)\) be the runtime of FindOrdering1 on a graph \(G=(V, E)\) where \(|V| = n\), and \(|E| = m\). Assume \(V\) and \(E\) are given as sets, and that set operations are constant time.

2.2

Prove that the worst case runtime of FindOrdering1 is \(O(n^3)\).

You can assume that the check (b, a) in E in isMinimal takes constant time.

Solution

isMinimal takes time \(O(n)\) in the worst case since it iterates over all the vertices. findMinimal takes time \(O(n^2)\) in the worst case: there may be just one minimal element, and if we get unlucky with the iteration order and it comes last, isMinimal runs on all \(n\) vertices.

FindOrdering1 then runs findMinimal \(n\) times (once per removed vertex). Thus, in the worst case, FindOrdering1 takes time \(O(n^3)\).

2.3

Find a non-empty finite set of tasks \(\mathrm{Tasks}\) and a valid prerequisite relation \(\prec\) over \(\mathrm{Tasks}\) such that FindOrdering1 run on \(G_{\mathrm{Tasks}, \prec}\) takes time \(\Omega(n^3)\). You can assume that the order of iteration over the sets is bad (i.e., works in your favor to prove the lower bound).

Solution

Let \(\mathrm{Tasks} = [n] = \{1, \ldots, n\}\), and let \(\prec\) be \(<\). Note that there is only ever one minimal element. Let \(k_i = |V_i| = n - i + 1\). In the worst case, findMinimal tests the minimal element last, so iteration \(i\) does at least \(k_i(k_i-1) + k_i = \Omega(k_i^2)\) work. Over the first \(n/2\) iterations, we do \(\Omega(n^2) + \Omega((n-1)^2) + \cdots + \Omega((n/2)^2)\) work, which is at least \((n/2)\cdot\Omega((n/2)^2) = \Omega(n^3)\), as required.
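As a sketch, this worst-case family is easy to build explicitly: under \(<\), the only direct prerequisites are consecutive numbers, so the graph is a chain.

def chain_instance(n):
    # Tasks are 1..n, and i < j means i is a prerequisite of j;
    # the only direct prerequisites are the pairs (i, i + 1).
    V = set(range(1, n + 1))
    E = {(i, i + 1) for i in range(1, n)}
    return (V, E)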

FindOrdering2 is another proposed algorithm for the same task.

from collections import deque

def FindOrdering2(G):
    V, E = G
    ordering = []
    # Build num_prereqs and adjacent with a single pass over the edges.
    num_prereqs = {b: 0 for b in V}
    adjacent = {a: [] for a in V}
    for (x, y) in E:
        num_prereqs[y] += 1
        adjacent[x].append(y)
    # A deque makes popping from the front O(1).
    minimal = deque(a for a in V if num_prereqs[a] == 0)
    for i in range(len(V)):
        a = minimal.popleft()
        ordering.append(a)
        for b in adjacent[a]:
            num_prereqs[b] -= 1
            if num_prereqs[b] == 0:
                minimal.append(b)
    return ordering

Note that num_prereqs and adjacent can be computed by iterating over all the edges once, and minimal can be computed by iterating over all the vertices.
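As a quick check, running FindOrdering2 on the hypothetical chain instance from 2.3 should recover the usual order:

print(FindOrdering2(chain_instance(5)))  # [1, 2, 3, 4, 5]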

2.4

Prove that this algorithm is correct. You may assume the inner for loop does what you want it to do - you could prove the inner for loop works as well but that would make the proof extra long.

Solution

Notes: FindOrdering2 is called Kahn’s algorithm. The task of finding a valid ordering of tasks is also known as topological sorting.

We’ll show the following loop invariant.

\(P(i)\): At the end of the \(i\)th iteration, let \(G_i=(V_i, E_i)\) be the graph \(G\) with the vertices already in \(\texttt{ordering}_i\) removed. I.e., \(V_i = V \setminus \texttt{ordering}_i\) and \(E_i = \{(u, v) \in E : u, v \in V_i\}\).

  a. \(\texttt{num\_prereqs}_i\) is a mapping such that for all \(v \in V_i\), \(\texttt{num\_prereqs}_i[v]\) is the number of tasks \(u \in V_i\) such that \((u, v) \in E_i\).

  b. \(\texttt{minimal}_i\) contains all the vertices \(v \in V_i\) with \(\texttt{num\_prereqs}_i[v] = 0\).

  c. \(\texttt{ordering}_i\) is a list of \(i\) elements that respects \(\prec\).

  d. For any \(v \in \texttt{ordering}_i, u \in V_i\), \(u \not\prec v\). I.e., nothing in \(V_i\) is a prerequisite for anything in \(\texttt{ordering}_i\).

  e. Every task \(v\) is either in \(\texttt{ordering}_i\), in \(\texttt{minimal}_i\), or has \(\texttt{num\_prereqs}_i[v] > 0\).

Initialization. \(\texttt{num\_prereqs}_0\) and \(\texttt{minimal}_0\) are computed correctly before the loop, so a., b., and e. hold. \(\texttt{ordering}_0 = []\), so c. and d. also hold.

Maintenance. Let \(i \in \mathbb{N}\) and assume \(P(i)\); we’ll show \(P(i+1)\).

By HW3, \(V_i\) has a minimal element; by \(P(i)\).a its \(\texttt{num\_prereqs}_i\) value is \(0\), so by \(P(i)\).b, \(\texttt{minimal}_i\) is non-empty and the pop gives us a task \(a\).

We have \(\texttt{ordering}_{i+1} = \texttt{ordering}_i + [a]\). By \(P(i)\).d, \(a\) is not a prerequisite of anything in \(\texttt{ordering}_i\), and thus \(P(i+1)\).c holds.

Since \(a\) was in \(\texttt{minimal}_i\), \(P(i)\).b implies \(\texttt{num\_prereqs}_i[a] = 0\), i.e., nothing in \(V_i\) is a prerequisite of \(a\); together with \(P(i)\).d, this gives \(P(i+1)\).d.

Next, the inner loop decrements \(\texttt{num\_prereqs}[b]\) for every \(b \in V_{i+1}\) for which \(a\) is a direct prerequisite, and appends any vertex whose count drops to \(0\) to \(\texttt{minimal}\). This step ensures that a., b., and e. hold.

Thus, \(P(i+1)\).

Termination. After iteration \(n\), \(P(n)\).c and \(P(n)\).e imply that \(\texttt{ordering}\) is a list of \(n\) elements that respects \(\prec\), i.e., an ordering of all the tasks, which is what we wanted.

2.5

Prove that the worst case runtime of FindOrdering2 is \(O(n + m)\).

Solution

num_prereqs and adjacent are computed by iterating over the edges once, which takes \(O(m)\) time; minimal is then computed by iterating through all the vertices, which takes \(O(n)\) time.

Let \(k\) be the sum of all the values in num_prereqs. Note that before the main loop, \(k = m\), since each edge contributes one prerequisite. Each time \(\texttt{num\_prereqs}[b]\) is decremented, \(k\) decreases by \(1\). Since \(\texttt{num\_prereqs}[v]\) is always at least \(0\) for each \(v\), we have \(k \geq 0\), so the body of the inner loop executes at most \(m\) times in total. The pop and append at the top of the outer loop execute \(n\) times.

In total, the algorithm takes time at most \(O(n + m)\).

Since \(m\) is always at most \(n^2\), FindOrdering2 is strictly better than FindOrdering1 in the worst case!

3 Majority

Let \(l\) be any list of length \(n\). The majority element of a list is an element that occurs \(> n/2\) times. Note that not all lists have a majority element: for example, \([1, 1, 2]\) has majority element \(1\), while \([1, 2]\) has none.

In this problem we will examine algorithms that find the majority element in a list.

Input. a list \(l\).

Precondition. -

Postcondition. Returns the majority element of \(l\) if one exists, otherwise returns None.

The following are two algorithms for this problem.

from collections import defaultdict

def MajorityNaive(l):
    n = len(l)
    counts = defaultdict(int)  # occurrence counts per distinct element
    for x in l:
        counts[x] += 1
        if counts[x] > n / 2:
            return x
    # implicitly returns None when no majority element exists

def Majority2(l):
    m = None  # current candidate for majority element
    i = 0  # candidate's counter
    for x in l:
        if i == 0:
            m = x
            i = 1
        elif x == m:
            i += 1
        else:
            i -= 1
    if checkMajority(l, m):
        return m
    # implicitly returns None when m is not a majority element

Recall that it requires \(\Theta(\log(n))\) bits of space to represent the natural number \(n\).

checkMajority(l, m) returns \(\texttt{True}\) iff \(m\) occurs in \(l\) \(>n/2\) times. The implementation keeps a counter and iterates over the list. In the worst case, the runtime is \(O(n)\) and the algorithm requires \(O(\log(n))\) space (to store the counter, which has value at most \(n/2 + 1\).).
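The implementation of checkMajority is not shown; a minimal sketch consistent with the description above (one pass, one counter) would be:

def checkMajority(l, m):
    # Count how many times m occurs in l, using a single counter.
    count = 0
    for x in l:
        if x == m:
            count += 1
    return count > len(l) / 2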

Let \(n\) be the length of the list.

3.1

Show that MajorityNaive requires \(\Omega(n)\) space in the worst case.

Hint. Create an example list that makes the variable counts very big!

Solution

Consider the list \(l = [1, 2, 3, \ldots, n]\). The keys of the counts dictionary are the distinct elements seen so far, and all \(n\) elements are distinct (each count is \(1 \leq n/2\) for \(n \geq 2\), so the algorithm never returns early). By the end of the for loop, counts is a dictionary with \(n\) keys and thus requires \(\Omega(n)\) space.

3.2

Trace Majority2 on input \(l = [2,1,3,1,1]\). Report the values of \(m\) and \(i\) at the end of every iteration (together with their initial values).

Solution

\[ \begin{aligned} m_0 = \text{None} , i_0 = 0 \\ m_1 = 2 , i_1 = 1 \\ m_2 = 2 , i_2 = 0 \\ m_3 = 3 , i_3 = 1 \\ m_4 = 3 , i_4 = 0 \\ m_5 = 1 , i_5 = 1 \\ \end{aligned} \]

The return value is \(m_5 = 1\) (checkMajority confirms that \(1\) occurs \(3 > 5/2\) times).
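For reference, the trace can be reproduced with a small instrumented copy of the loop (this helper is just for illustration, not part of the algorithm):

def trace_majority2(l):
    m, i = None, 0
    print(m, i)  # state before the first iteration
    for x in l:
        if i == 0:
            m, i = x, 1
        elif x == m:
            i += 1
        else:
            i -= 1
        print(m, i)  # state at the end of each iteration

trace_majority2([2, 1, 3, 1, 1])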

3.3

Let \(l\) be any list with a majority element. Prove that by the end of the for loop, the variable \(m\) contains the majority element. You should use a well-selected loop invariant.

Solution

Let \(j\) be the iteration number; iteration \(j\) processes \(x = l[j]\) (0-indexed). We’ll show the following loop invariant. Let \(\alpha\) be the majority element, and let \(m_j, i_j\) be the values of \(m\) and \(i\) at the start of iteration \(j\), respectively.

For any natural number \(k\), let \(b(k)\) be the number of non-\(\alpha\) elements in \(l[k],...,l[n-1]\), and \(a(k)\) be the number of \(\alpha\)s in \(l[k],...,l[n-1]\).

\(P(j)\): \(i_j\) is non-negative, and

  a. either \(m_j = \alpha\) and \(b(j) - a(j) < i_j\),

  b. or \(m_j \neq \alpha\) and \(a(j) - b(j) > i_j\).

Initialization. \(m_0 = \text{None} \neq \alpha\), \(i_0 = 0\), and \(a(0) - b(0) > 0\) since \(\alpha\) is the majority element (it occurs more than \(n/2\) times), so case b. holds.

Maintenance. Let \(j \in \mathbb{N}\) and suppose \(P(j)\). We’ll show that \(P(j+1)\) holds.

There are several cases.

Case \(m_j = \alpha, x = \alpha, i_j=0\). \[ \begin{aligned} b(j+1) - a(j+1) & = b(j) - (a(j) - 1) \\ & < i_j + 1 \\ & = i_{j + 1} \end{aligned} \]

Case \(m_j = \alpha, x = \alpha, i_j>0\). The calculation is the same as above.

Case \(m_j = \alpha, x \neq \alpha, i_j>0\). \[ \begin{aligned} b(j+1) - a(j+1) & = (b(j) - 1) - a(j) \\ & < i_j - 1 \\ & = i_{j + 1} \end{aligned} \]

Case \(m_j = \alpha, x \neq \alpha, i_j=0\). Then we claim \(b.)\) holds for iteration \(j+1\). We have \(m_{j+1} \neq \alpha\), and

\[ \begin{aligned} a(j+1) - b(j+1) & = a(j) - (b(j) - 1) \\ & = a(j) - b(j) + 1 \\ & > i_j + 1 & (b(j) - a(j) < i_j) \\ & = i_{j+1} \\ \end{aligned} \]

Case \(m_j \neq \alpha, x = \alpha, i_j=0\). We claim that \(a.)\) holds for iteration \(j+1\): the code sets \(m_{j+1} = x = \alpha\) and \(i_{j+1} = 1\).

\[ \begin{aligned} b(j+1) - a(j+1) & = b(j) - (a(j) - 1) \\ & < i_j + 1 \\ & = i_{j+1} \\ \end{aligned} \]

Case \(m_j \neq \alpha, x = \alpha, i_j>0\). \[ \begin{aligned} a(j+1)- b(j+1) & = a(j) - 1 - b(j) \\ & > i_j - 1 \\ & = i_{j+1} \\ \end{aligned} \]

Case \(m_j \neq \alpha, x \neq \alpha, i_j=0\). Here \(m_{j+1} = x \neq \alpha\) and \(i_{j+1} = 1\), so \(b.)\) still applies. \[ \begin{aligned} a(j+1) - b(j+1) & = a(j) - (b(j) - 1) \\ & > i_j + 1 \\ & = i_{j+1} \\ \end{aligned} \]

Case \(m_j \neq \alpha, x \neq \alpha, i_j>0\). There are two subcases: if \(x = m_j\), then \(i_{j+1} = i_j + 1\); if \(x \neq m_j\), then \(i_{j+1} = i_j - 1\). In both subcases \(i_{j+1} \leq i_j + 1\), and the same calculation as above gives \(a(j+1) - b(j+1) = a(j) - b(j) + 1 > i_j + 1 \geq i_{j+1}\).

Finally, note that \(i\) is only ever decremented when \(i_j > 0\), so \(i_{j+1} \geq 0\). Thus, in every case, the loop invariant holds.

Termination. The loop invariant holds at the end of iteration \(n-1\) (equivalently, at the start of the imaginary iteration \(n\)). Case \(b.)\) of the loop invariant cannot hold, because \(a(n) - b(n) = 0\) while \(i_n \geq 0\). Thus, case \(a.)\) holds and we have \(m_n = \alpha\), as required.

3.4

Show that the worst-case runtime of Majority2 is \(O(n)\).

Solution

The for loop runs \(n\) times, and every operation in the for loop is constant time. checkMajority also runs in \(O(n)\) time. Thus, the overall runtime of the algorithm is \(O(n)\).

3.5

Show that the worst-case space usage of Majority2 is \(O(\log(n))\).

Solution

In the for loop, we only ever keep track of a single element and a single counter whose value is at most \(n\) (hence requiring \(O(\log(n))\) space to store). Furthermore, checkMajority requires \(O(\log(n))\) space, so we never use more than \(O(\log(n))\) space.