===========================================================================
CSC 373H                Lecture Summary for Week  8             Winter 2006
===========================================================================

Selection.  [Section 13.5]

  - Given list A, rank k, return k-th smallest element in A.

  - Solution 1: sort A, return element at position k.
    Runtime: Theta(n log n).

  - However, can easily be faster for special cases (k=1: find min,
    k=n: find max, both doable in linear time).
    Would like linear time for arbitrary k.

  - Idea: no need to fully sort array to find k-th smallest.
    Given list A, rank k:
      . pick pivot element p;
      . partition A into B = [elements < p], C = [elements > p]
        (assuming the elements of A are distinct);
      . if k = |B| + 1, return p;
      . if k < |B| + 1, return k-th smallest in B;
      . if k > |B| + 1, return (k-|B|-1)-th smallest in C.

  - Runtime depends on choice of pivot, like quicksort:
      . worst-case degenerates to Theta(n^2);
      . average-case is Theta(n); picking the pivot uniformly at random
        yields a randomized algorithm with expected runtime Theta(n) on
        every input.
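
  - Sketch (not from the lecture): the selection steps above, written in
    Python with a uniformly random pivot.  The name `select` and the
    1-based rank k are choices made for this illustration, and the code
    assumes the elements of A are distinct.

```python
import random


def select(a, k):
    """Return the k-th smallest element of a (k is 1-based).

    Illustrative sketch; assumes the elements of a are distinct.
    """
    if not 1 <= k <= len(a):
        raise ValueError("k must satisfy 1 <= k <= len(a)")
    p = random.choice(a)             # pivot chosen uniformly at random
    b = [x for x in a if x < p]      # B = elements smaller than the pivot
    c = [x for x in a if x > p]      # C = elements larger than the pivot
    if k == len(b) + 1:
        return p                     # exactly |B| elements are below p
    if k < len(b) + 1:
        return select(b, k)          # answer lies in B, same rank
    return select(c, k - len(b) - 1) # skip B and the pivot itself
```

    For example, select([7, 2, 9, 4, 1], 3) returns 4, the third smallest
    element, whichever pivots happen to be chosen.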

  - Clever strategy can be used to get deterministic algorithm with
    worst-case performance Theta(n).

Closest points.

  - Given points p_1 = (x_1,y_1), ..., p_n = (x_n,y_n), find a pair i,j
    such that d(p_i,p_j) is minimal (d(p,q) = distance).

  - Assumption:  No two points with same x or y coordinate.  (Can be
    eliminated but makes presentation simpler.)

  - Idea:  Divide points into two halves, find closest pairs in each half,
    find closest pair across the two halves, and return closest overall.

  - Details:  The input will consist of
      . P = set of points,
      . P_x = list of points sorted by x-coordinate,
      . P_y = list of points sorted by y-coordinate.

  - Divide:  Split P by a vertical line (i.e., by x-coordinate) into
      . Q = leftmost ceil(n/2) points,
      . R = rightmost floor(n/2) points,
      . (lists Q_x, Q_y, R_x, R_y can be computed from P_x, P_y
        in linear time).

  - Recurse:  Recursively find
      . q_0, q_1: closest points in Q,
      . r_0, r_1: closest points in R.

  - Combine:  Let d = min( d(q_0,q_1), d(r_0,r_1) ).
    Need to determine if there are points q in Q, r in R with d(q,r) < d
    (in which case q,r are closest in P) or not (in which case q_0,q_1 or
    r_0,r_1 are closest in P).

    Consider any vertical line L that "splits" Q and R (i.e., with
    x-coordinate between rightmost point in Q and leftmost point in R).
    If there are points q in Q, r in R with d(q,r) < d, then q,r both
    lie within distance d of L (because L lies between q and r, so the
    horizontal distance from either point to L is at most the horizontal
    q-r distance, which is at most d(q,r) < d).

    Fix line L and let S = points in P within distance d of L.  S_x and S_y
    can be constructed in linear time from P_x, P_y.

    Fact:  If points q in Q, r in R have d(q,r) < d, then q, r appear
    within 15 positions of each other in S_y!
    Proof:  page 229 in textbook.

  - Algorithm:
      . construct P_x, P_y  (time Theta(n log n))
      . (p_0,p_1) := ClosestPairRec(P_x, P_y)

    ClosestPairRec(P_x, P_y):
        if |P| <= 3:
            find closest points by brute force
        else:
            construct Q_x, Q_y, R_x, R_y  (time Theta(n))
            (q_0,q_1) := ClosestPairRec(Q_x, Q_y)
            (r_0,r_1) := ClosestPairRec(R_x, R_y)
            d := min{ d(q_0,q_1), d(r_0,r_1) }
            m := average of rightmost x-coordinate in Q and leftmost
                 x-coordinate in R
            S := points in P with x-coordinate within distance d of m
            construct S_x, S_y  (time Theta(n))
            for each s in S_y, compute distance to next 15 points in S_y
                and let (s_0,s_1) be closest pair found
            return closest of (q_0,q_1) or (r_0,r_1) or (s_0,s_1)
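
    Sketch (not from the lecture): one possible Python rendering of
    ClosestPairRec.  Points are assumed to be (x, y) tuples with pairwise
    distinct x- and y-coordinates, as in the assumption above; the helper
    names `dist`, `brute_force` and `closest_pair` are made up for this
    illustration.

```python
import math


def dist(p, q):
    """Euclidean distance between points p and q."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def brute_force(points):
    """Closest pair found by checking all pairs (used for |P| <= 3)."""
    best = None
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if best is None or dist(points[i], points[j]) < dist(*best):
                best = (points[i], points[j])
    return best


def closest_pair_rec(px, py):
    """px, py: the same points sorted by x and by y respectively."""
    n = len(px)
    if n <= 3:
        return brute_force(px)
    half = (n + 1) // 2
    qx, rx = px[:half], px[half:]
    # Build Q_y, R_y in linear time by filtering P_y; this relies on
    # the x-coordinates being distinct.
    x_max_q = qx[-1][0]
    qy = [p for p in py if p[0] <= x_max_q]
    ry = [p for p in py if p[0] > x_max_q]
    best = min(closest_pair_rec(qx, qy), closest_pair_rec(rx, ry),
               key=lambda pair: dist(*pair))
    d = dist(*best)
    m = (qx[-1][0] + rx[0][0]) / 2           # x-coordinate of line L
    sy = [p for p in py if abs(p[0] - m) < d]  # S_y, still sorted by y
    for i, s in enumerate(sy):
        for t in sy[i + 1:i + 16]:           # next 15 points in S_y
            if dist(s, t) < d:
                best, d = (s, t), dist(s, t)
    return best


def closest_pair(points):
    if len(points) < 2:
        raise ValueError("need at least two points")
    px = sorted(points)                       # sorted by x-coordinate
    py = sorted(points, key=lambda p: p[1])   # sorted by y-coordinate
    return closest_pair_rec(px, py)
```

    Shrinking d inside the scan of S_y is a small deviation from the
    pseudocode; it only tightens the comparison and does not change which
    pair is returned as closest.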


-----------------------
Network Flow Algorithms
-----------------------

Definition: a "network" is a directed graph N=(V,E) with
  - a "source" s in V with no incoming edge,
  - a "target" t with no outgoing edge (sometimes called "sink"),
  - a nonnegative integer weight (the "capacity") for each edge.

  - Example picture.  Networks can be used to represent, e.g., computer
    networks (capacity = bandwidth), electrical networks, etc.

Network flow problem: assign flow f(e) for each edge e such that we have
maximum flow in the network, subject to:
  - capacity constraint: 0 <= f(e) <= c(e) (flow does not exceed capacity);
  - conservation constraint: for each vertex v != s,t,
        flow into v = flow out of v
    (flow into v = sum_{e in E-(v)} f(e); flow out of v = ...E+..., where
    E-(v) = in-edges, E+(v) = out-edges).

  - Value of flow f in N = flow out of s = flow into t (by conservation).
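
  - Sketch (not from the lecture): checking the two constraints in
    Python.  Representing the network as a dict from edges (u, v) to
    capacities, and a flow as a dict from edges to values, is an
    assumption made for this illustration; `is_valid_flow` and
    `flow_value` are hypothetical names.

```python
def is_valid_flow(capacity, flow, s, t):
    """Check the capacity and conservation constraints for flow."""
    # Capacity constraint: 0 <= f(e) <= c(e) for every edge e.
    for e, c in capacity.items():
        if not 0 <= flow.get(e, 0) <= c:
            return False
    # Conservation constraint: flow in = flow out at every v != s, t.
    net = {}  # net[v] = (flow into v) - (flow out of v)
    for (u, v) in capacity:
        f = flow.get((u, v), 0)
        net[u] = net.get(u, 0) - f
        net[v] = net.get(v, 0) + f
    return all(x == 0 for v, x in net.items() if v not in (s, t))


def flow_value(capacity, flow, s):
    """Value of the flow = total flow on edges leaving s."""
    return sum(flow.get(e, 0) for e in capacity if e[0] == s)


# A small made-up network with unit capacities.
capacity = {("s", "a"): 1, ("s", "b"): 1, ("a", "b"): 1,
            ("a", "t"): 1, ("b", "t"): 1}
good = {("s", "a"): 1, ("a", "b"): 1, ("b", "t"): 1}
bad = {("s", "a"): 1}  # one unit enters a but nothing leaves it
```

    Here `good` satisfies both constraints and has value 1, while `bad`
    violates conservation at vertex a.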

Augmenting paths:

  - First idea: path P = s -> ... -> t where f(e) < c(e) for each e.
    Define "residual capacity" delta_f(e) = c(e) - f(e), and residual
    capacity delta_f(P) = MIN (delta_f(e) for e in P).
    Augment f along P by adding delta_f(P) to f(e) for each edge e on P.

  - Problem: notion too narrow, can get stuck with sub-optimal solution.
    (Example.)

  - Second idea: allow "backward" edges on path (edges traversed against
    their direction) and re-define residual capacity:
    delta_f(e) = c(e) - f(e) if e is a forward edge on the path, and
    delta_f(e) = f(e) if e is a backward edge.

  - Augmenting path = s-t path where each edge has positive residual
    capacity (i.e., c(e)-f(e) > 0 for forward edges e, f(e) > 0 for
    backward edges e).

    (A backward edge with positive flow represents extra flow that can be
    reassigned to forward edges.)

  - Augmentation: add delta_f(P) (defined as before) to forward edges,
    subtract it from backward edges.

  - Example.
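
  - Sketch (not from the lecture): a small made-up network on which the
    first idea gets stuck and a backward edge fixes it.  A path is given
    as a list of (edge, forward) pairs, where forward=False marks a
    backward edge; this representation and the names `residual` and
    `augment` are assumptions made for this illustration.

```python
def residual(capacity, flow, edge, forward):
    """Residual capacity: c(e) - f(e) if forward, f(e) if backward."""
    return capacity[edge] - flow[edge] if forward else flow[edge]


def augment(capacity, flow, path):
    """Return a new flow obtained by augmenting flow along path."""
    delta = min(residual(capacity, flow, e, fwd) for e, fwd in path)
    if delta <= 0:
        raise ValueError("not an augmenting path")
    new_flow = dict(flow)
    for e, fwd in path:
        new_flow[e] += delta if fwd else -delta
    return new_flow


capacity = {("s", "a"): 1, ("s", "b"): 1, ("a", "b"): 1,
            ("a", "t"): 1, ("b", "t"): 1}
f0 = {e: 0 for e in capacity}

# Augment along s -> a -> b -> t using forward edges only.
f1 = augment(capacity, f0,
             [(("s", "a"), True), (("a", "b"), True), (("b", "t"), True)])

# Now s->a and b->t are saturated, so no forward-only path remains,
# although a flow of value 2 exists.  Using a->b as a backward edge
# gives the augmenting path s -> b, b <- a, a -> t.
f2 = augment(capacity, f1,
             [(("s", "b"), True), (("a", "b"), False), (("a", "t"), True)])
```

    After the second augmentation the flow on a->b is back to 0 and the
    value of the flow is 2: the unit that went a -> b -> t has been
    rerouted to a -> t, making room for s -> b -> t.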

Ford-Fulkerson algorithm:
    start with any flow f (e.g., f(e) = 0 for all e in E)
    while there is an augmenting path P
        augment f using P
    output f
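
Sketch (not from the lecture): a minimal Python version of the loop above.
The algorithm does not say how to find an augmenting path; the depth-first
search used here is one arbitrary choice.  The dict-of-edges representation
and the names `find_augmenting_path` and `ford_fulkerson` are assumptions
made for this illustration.  Termination relies on the capacities being
nonnegative integers, as in the definition of a network.

```python
def find_augmenting_path(capacity, flow, s, t):
    """Return an s-t path as a list of (edge, forward) pairs, or None."""
    parent = {s: None}  # vertex -> (previous vertex, edge, forward)
    stack = [s]
    while stack:
        u = stack.pop()
        if u == t:
            break
        for (x, y), c in capacity.items():
            if x == u and y not in parent and c - flow[(x, y)] > 0:
                parent[y] = (u, (x, y), True)    # forward edge u -> y
                stack.append(y)
            elif y == u and x not in parent and flow[(x, y)] > 0:
                parent[x] = (u, (x, y), False)   # backward edge x -> u
                stack.append(x)
    if t not in parent:
        return None
    path, v = [], t
    while parent[v] is not None:                 # walk back from t to s
        u, e, fwd = parent[v]
        path.append((e, fwd))
        v = u
    path.reverse()
    return path


def ford_fulkerson(capacity, s, t):
    """Return a maximum flow as a dict from edges to flow values."""
    flow = {e: 0 for e in capacity}              # start with zero flow
    while True:
        path = find_augmenting_path(capacity, flow, s, t)
        if path is None:
            return flow
        delta = min(capacity[e] - flow[e] if fwd else flow[e]
                    for e, fwd in path)
        for e, fwd in path:
            flow[e] += delta if fwd else -delta


capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
            ("a", "t"): 2, ("b", "t"): 3}
max_flow = ford_fulkerson(capacity, "s", "t")
value = sum(f for (u, v), f in max_flow.items() if u == "s")
```

Scanning every edge at each vertex keeps the sketch short but is slow; an
adjacency-list representation would be the usual choice in practice.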