===========================================================================
CSC 373H / L0101         Lecture Summary for Week 7             Winter 2005
===========================================================================

Integer multiplication.

  - Picture of savings from 4 T(n/2) to 3 T(n/2).

Merge sort.

  - Given array A:
     1. split into two halves B,C;
     2. sort B,C recursively;
     3. merge B,C back into A (in linear time).

  - Runtime:
        T(1) = Theta(1)
        T(n) = 2 T(n/2) + Theta(n)
    Closed-form: T(n) = Theta(n log n) (a = 2 = 2^1 = b^d).

Quick sort.

  - Given array A:
     1. pick "pivot" element p;
     2. partition A into B = [elements < p] and C = [elements > p],
        in place (i.e., so that A = [B,p,C]);
     3. sort B,C recursively, in place.

  - Runtime depends on pivot picked at each step:
      . worst-case degenerates to Theta(n^2);
      . average-case is Theta(n log n).

Counting inversions.

  - Given a_1,...,a_n, a permutation of [1,...,n], count the number of
    inversions (pairs i < j with a_i > a_j).

  - Application: comparative ranking.

  - Solution 1: Consider each pair. Runtime: Theta(n^2).

  - Solution 2: (divide and conquer)
     1. split input permutation A into halves B,C;
     2. count inversions in B,C recursively;
     3. count inversions between B,C, i.e., number of pairs of elements
        b in B, c in C such that b > c;
     4. return total number of inversions.

  - Problem: step 3 takes time Theta(n^2).

  - Solution 2': (make it possible to do step 3 in linear time)
     1. split input permutation A into halves B,C;
     2. SORT and count inversions in B,C recursively;
     3. MERGE and count inversions between B,C, i.e., at each step of
        merge, comparing next element b in B, c in C:
          . if b < c then b < all remaining elements in C, so no more
            inversions involve b
          . if c < b then c < all remaining elements in B, so add
            remaining number of elements in B to inversion count
        (note: number of inversions between B,C not affected by sorting
        within each of B,C because all pairs b in B, c in C such that
        b > c remain the same);
     4. return total number of inversions.
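
The steps of Solution 2' can be sketched in Python roughly as follows. This
is a minimal illustration, not code from the lecture: the function name
sort_and_count is hypothetical, and the use of "<=" in the merge (so that
equal elements are not counted as inversions) is an added assumption, since
the notes only consider permutations, where ties cannot occur.

```python
def sort_and_count(a):
    """Return (sorted copy of a, number of inversions in a)."""
    n = len(a)
    if n <= 1:
        return list(a), 0

    # Steps 1-2: split into halves B, C; sort and count each recursively.
    mid = n // 2
    b, inv_b = sort_and_count(a[:mid])
    c, inv_c = sort_and_count(a[mid:])

    # Step 3: merge B and C while counting inversions between them.
    merged = []
    cross = 0
    i = j = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            # b[i] is no larger than every remaining element of C,
            # so no further inversions involve b[i].
            merged.append(b[i])
            i += 1
        else:
            # c[j] is smaller than every remaining element of B;
            # each of those elements forms an inversion with c[j].
            cross += len(b) - i
            merged.append(c[j])
            j += 1
    merged.extend(b[i:])
    merged.extend(c[j:])

    # Step 4: total = inversions inside B + inside C + between B and C.
    return merged, inv_b + inv_c + cross
```

For example, sort_and_count([2, 4, 1, 3, 5]) counts the three inversions
(2,1), (4,1), (4,3) and also returns the sorted list.
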
  - Extra sorting makes it possible to count inversions between B,C in
    linear time (at the same time as merge step), and total time is the
    same as Mergesort: Theta(n log n).

Selection. [Section 13.5]

  - Given list A, rank k, return k-th smallest element in A.

  - Solution 1: sort A, return element at position k.
    Runtime: Theta(n log n).

  - However, can easily be faster for special cases (k=1: find min,
    k=n: find max, both doable in linear time). Would like linear time
    for arbitrary k.

  - Idea: no need to fully sort array to find k-th smallest.
    Given list A, rank k:
      . pick pivot element p;
      . partition A into B = [elements < p], C = [elements > p];
      . if k = |B| + 1, return p;
      . if k < |B| + 1, return k-th smallest in B;
      . if k > |B| + 1, return (k-|B|-1)-th smallest in C
        (the |B| elements of B and the pivot p are all smaller than
        every element of C).

  - Runtime depends on choice of pivot, like quicksort:
      . worst-case degenerates to Theta(n^2);
      . average-case is Theta(n) -- picking the pivot at random yields
        a randomized algorithm with expected runtime Theta(n) on every
        input.

  - Clever strategy can be used to get deterministic algorithm with
    worst-case performance Theta(n).

Closest points.

  - Given points p_1 = (x_1,y_1), ..., p_n = (x_n,y_n), find a pair i,j
    such that d(p_i,p_j) is minimal (d(p,q) = distance).

  - Assumption: No two points with same x or y coordinate.
    (Can be eliminated but makes presentation simpler.)

  - Idea: Divide points into two halves, find closest pairs in each
    half, find closest pair across the two halves, and return closest
    overall.

  - Details: The input will consist of
      . P = set of points,
      . P_x = list of points sorted by x-coordinate,
      . P_y = list of points sorted by y-coordinate.

  - Divide: Split P by x-coordinate (along x axis) into
      . Q = leftmost n/2 points,
      . R = rightmost n/2 points,
      . (lists Q_x, Q_y, R_x, R_y can be computed from P_x, P_y in
        linear time).

  - Recurse: Recursively find
      . q_0, q_1: closest points in Q,
      . r_0, r_1: closest points in R.

  - Combine: Let d = min( d(q_0,q_1), d(r_0,r_1) ).
    Need to determine if there are points q in Q, r in R with
    d(q,r) < d (in which case q,r are closest in P) or not (in which
    case q_0,q_1 or r_0,r_1 are closest in P).

    Consider any vertical line L that "splits" Q and R (i.e., with
    x-coordinate between rightmost point in Q and leftmost point in R).
    If there are points q in Q, r in R with d(q,r) < d, then q,r both
    lie within distance d of L (because d(q,r) < d means horizontal
    q-r distance also < d and so is distance to L).

    Fix line L and let S = points in P within distance d of L.
    S_x and S_y can be constructed in linear time from P_x, P_y.

    Fact: If points q in Q, r in R have d(q,r) < d, then q, r appear
    within 15 positions of each other in S_y!
    Proof: pp. 255-256 in textbook.

  - Algorithm:
      . construct P_x, P_y (time Theta(n log n))
      . (p_0,p_1) := ClosestPairRec(P_x, P_y)

    ClosestPairRec(P_x, P_y):
        if |P| <= 3:
            find closest points by brute force
        else:
            construct Q_x, Q_y, R_x, R_y (time Theta(n))
            (q_0,q_1) := ClosestPairRec(Q_x, Q_y)
            (r_0,r_1) := ClosestPairRec(R_x, R_y)
            d := min{ d(q_0,q_1), d(r_0,r_1) }
            m := average of rightmost x-coordinate in Q and
                 leftmost x-coordinate in R
            S := points in P with x-coordinate within distance d of m
            construct S_x, S_y (time Theta(n))
            for each s in S_y, compute distance to next 15 points in S_y
                and let (s_0,s_1) be closest pair found
            return closest of (q_0,q_1) or (r_0,r_1) or (s_0,s_1)
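
The ClosestPairRec pseudocode above might be turned into runnable Python
along the following lines. This is only a sketch under stated assumptions:
points are (x, y) tuples with pairwise distinct x-coordinates (as assumed in
the notes), the input has at least two points, and the names dist,
brute_force, closest_pair_rec and closest_pair are hypothetical. Each call
returns a triple (distance, point, point) rather than just the pair.

```python
import math


def dist(p, q):
    """Euclidean distance between points p and q."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def brute_force(points):
    """Check every pair; returns (distance, p, q), or None if < 2 points."""
    best = None
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = dist(points[i], points[j])
            if best is None or d < best[0]:
                best = (d, points[i], points[j])
    return best


def closest_pair_rec(px, py):
    """px, py: the same points sorted by x and by y, respectively."""
    n = len(px)
    if n <= 3:
        return brute_force(px)

    # Divide: Q = leftmost half, R = rightmost half (by x-coordinate).
    half = n // 2
    qx, rx = px[:half], px[half:]
    split_x = qx[-1][0]  # rightmost x in Q; assumes distinct x-coordinates
    # One linear pass over py keeps Q_y and R_y sorted by y.
    qy = [p for p in py if p[0] <= split_x]
    ry = [p for p in py if p[0] > split_x]

    # Recurse on both halves and keep the better result.
    best = min(closest_pair_rec(qx, qy), closest_pair_rec(rx, ry),
               key=lambda t: t[0])
    d = best[0]

    # Combine: S_y = points within distance d of the splitting line x = m.
    m = (qx[-1][0] + rx[0][0]) / 2
    sy = [p for p in py if abs(p[0] - m) < d]
    for i, s in enumerate(sy):
        # Compare s only with the next 15 points of S_y.
        for t in sy[i + 1:i + 16]:
            dt = dist(s, t)
            if dt < best[0]:
                best = (dt, s, t)
    return best


def closest_pair(points):
    """Sort once by x and once by y (Theta(n log n)), then recurse."""
    px = sorted(points)
    py = sorted(points, key=lambda p: (p[1], p[0]))
    return closest_pair_rec(px, py)
```

A call such as closest_pair([(0, 0), (10, 10), (3, 4), (7, 1), (4, 5),
(12, 3)]) should report the pair (3, 4), (4, 5). Filtering S_y directly from
P_y in one pass is one way to realize the "construct S_x, S_y in linear
time" step; S_x itself is not needed in this sketch.
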