===========================================================================
CSC 236             Lecture Summary for Week 4                Winter 2008
===========================================================================

--------------------
Algorithm complexity
--------------------

Refresher:

  - Measure running time by counting "elementary steps" in algorithm:
    arithmetic operations, assignments, array accesses, comparisons, return
    statements, etc. -- our convention: count everything *except* evaluating
    variables and constants.

  - Measure as a function of input "size": usually, number of input
    elements (size of array/list, number of bits in numbers).

  - Work with worst-case measure (maximum over all inputs of same size),
    using asymptotic notation.

  - Upper bound: prove worst-case time T(n) (- O(f(n)) by showing that there
    are a real constant c > 0 and a constant b such that, for all n >= b,
    runtime <= c f(n) for *all* inputs of size n.

  - Lower bound: prove worst-case time T(n) (- \Omega(f(n)) by showing that
    there are a real constant c > 0 and a constant b such that, for all
    n >= b, runtime >= c f(n) for *some* input of size n.

  - Tight bound: prove worst-case time T(n) (- \Theta(f(n)) by showing
    T(n) (- O(f(n)) and T(n) (- \Omega(f(n)).

  - Recall how to deal with straight line algorithms (one statement after
    the other), branching (if-statements), loops, method calls.

Recursive algorithms:

  - Recursive binary search algorithm:

        RecBinSearch(x,A,b,e):
        1.  if b == e:
        2.      if x <= A[b]:
        3.          return b
                else:
        4.          return e+1
            else:
        5.      m = (b + e) / 2  # integer division
        6.      if x <= A[m]:
        7.          return RecBinSearch(x,A,b,m)
                else:
        8.          return RecBinSearch(x,A,m+1,e)

    (Correctness to be considered later.)

  - Define T(n) = worst-case running time of RecBinSearch over all inputs
    of size n = e-b+1.

  - Number of "steps" for each line, ignoring evaluation of variables and
    literals (constants):

        1. 2 steps (comparison, if)
        2. 3 steps (comparison, array access, if)
        3. 1 step (return)
        4. 2 steps (arithmetic, return)
        5. 3 steps (2 arithmetic, assignment)
        6. 3 steps (comparison, array access, if)
        7. 1 step (return) + T(ceil(n/2)) steps (call)
        8. 2 steps (arithmetic, return) + T(floor(n/2)) steps (call)

  - Hence, T(n) satisfies recurrence:

        T(1) = 7    (lines 1, 2, 4: 2 + 3 + 2),
        T(n) = 8 + max{ T(ceil(n/2)) + 1, T(floor(n/2)) + 2 }  for n > 1
                    (lines 1, 5, 6 cost 2 + 3 + 3 = 8, then line 7 or 8).

  - Closed form? Later...

Solving recurrence relations:

  - Example 1: factorial

        Fact(n):
            if n == 0 or n == 1:
                return 1
            else:
                return n * Fact(n-1)

  - Worst-case running time of factorial satisfies recurrence:

               { 5             if n <= 1,
        T(n) = {
               { 7 + T(n-1)    if n > 1.

    Closed form?

    1. Repeated substitutions:

        T(n) = 7 + T(n-1)
             = 7 + 7 + T(n-2)
             = 7 + 7 + 7 + T(n-3)
        pattern after i steps:
             = 7i + T(n-i)
        base case for n-i <= 1, i.e., i >= n-1:
             = 7(n-1) + T(n-(n-1))
             = 7n - 7 + 5
             = 7n - 2.

       This is only a guess, not a proof.

    2. Prove guess by induction. Formally, prove that T(n) = 7n-2 for all
       n >= 1.

        Base Case: n = 1. By def'n, T(1) = 5 = 7-2 = 7(1)-2.

        Ind. Hyp.: Let n > 1 and assume T(k) = 7k-2 for 1 <= k < n.

        Ind. Step: Since n > 1,
            T(n) = 7 + T(n-1)       (by recurrence since n > 1)
                 = 7 + 7(n-1)-2     (by IH since 1 <= n-1 < n)
                 = 7n - 2.

        Hence, by induction, T(n) = 7n-2 for all n >= 1.

  - Example 2: Fibonacci (0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...)
    Recall F_0 = 0, F_1 = 1, F_n = F_{n-1} + F_{n-2} for n >= 2.

        Fib(n):
            if n == 0 or n == 1:
                return n
            else:
                return Fib(n-1) + Fib(n-2)

    Worst-case time satisfies recurrence:

               { 5                        if n <= 1,
        T(n) = {
               { 8 + T(n-1) + T(n-2)      if n > 1.

    Closed form? Repeated substitution:

        T(n) = 8 + T(n-1) + T(n-2)
             = 8 + 8 + T(n-2) + T(n-3) + 8 + T(n-3) + T(n-4)
             = 8 + 8 + 8 + T(n-2) + 2 T(n-3) + T(n-4)
             = ... messy!

    Notice T satisfies recurrence very similar to Fibonacci itself:
    T(n) ~= T(n-1) + T(n-2). Closed form for Fibonacci is
    F_n (- \Theta(phi^n), where phi = (1 + sqrt{5}) / 2 = 1.618..., and it's
    possible to prove the same holds for T(n).

  - Can we do better? Idea: store previous values and compute bottom-up.

        Fib2(n):
            F = [0, 1]
            while n > 1:
                F = [F[1], F[0]+F[1]]
                n -= 1
            return F[n]

    Runtime? \Theta(n): the loop runs n-1 times for n >= 1, and each
    iteration takes constant time.

  - Back to RecBinSearch. Recurrence for worst-case running time:

               { 7                                            if n = 1
        T(n) = {
               { 9 + max{ T(ceil(n/2)), T(floor(n/2)) + 1 }   if n > 1

    1. Repeated substitutions. Make simplifying assumptions (for guess):
       ignore floors and ceilings, so that
       max{ T(n/2), T(n/2) + 1 } = T(n/2) + 1.

        T(n) = 10 + T(n/2)   (approximately)
             = 10 + 10 + T(n/4)
             = 10 + 10 + 10 + T(n/8)
        After i substitutions:
        T(n) = 10i + T(n/2^i),
        Base case for n/2^i = 1 <==> n = 2^i <==> log_2 n = i
        so substitute i = log_2 n:
        T(n) = 10 log_2 n + T(1)
             = 10 log_2 n + 7.

       Notation: lg n = log_2(n).

    2. This cannot be the exact solution because of the simplifications,
       but we expect T(n) (- \Theta(log n).

       (a) T(n) (- \Omega(log n): T(n) >= 9 lg(n) for all n >= 1.

            Base Case: T(1) = 7 >= 0 = 9 lg(1).

            Ind. Hyp.: Let k > 1 and suppose T(j) >= 9 lg(j) for
            1 <= j < k.

            Ind. Step: Since k > 1,
                T(k) = 9 + max{ T(ceil(k/2)), T(floor(k/2)) + 1 }
                    >= 9 + T(ceil(k/2))
                         (by properties of max)
                    >= 9 + 9 lg(ceil(k/2))
                         (by IH, since 1 <= ceil(k/2) < k for k > 1)
                    >= 9 + 9(lg(k) - lg(2))
                         (since ceil(x) >= x so lg(ceil(x)) >= lg(x)
                          for all x > 0, and lg(k/2) = lg(k) - lg(2))
                     = 9 lg(k)
                         (since lg(2) = 1).

            Hence, by induction, T(n) >= 9 lg n for all n >= 1, i.e.,
            T(n) (- \Omega(lg n).

       (b) Think of how to prove T(n) (- O(log n).
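
  - Numerical sanity check (a sketch, not a proof). The Python below
    transliterates RecBinSearch and the recurrence for T(n) from these
    notes, then compares T(n) against the lower bound 9 lg n and the guess
    10 lg n + 7. The names rec_bin_search, T and lower_bound_holds are
    illustrative choices, not from the lecture, and the docstring's
    description of the return value is only a reading of the pseudocode,
    since the notes defer correctness.

```python
from functools import lru_cache
from math import log2


def rec_bin_search(x, A, b, e):
    """Transliteration of RecBinSearch for a sorted slice A[b..e].

    Assumed reading of the pseudocode: returns the first index i in
    b..e with x <= A[i], or e+1 if there is no such index.
    """
    if b == e:
        return b if x <= A[b] else e + 1
    m = (b + e) // 2  # integer division, as in line 5
    if x <= A[m]:
        return rec_bin_search(x, A, b, m)
    return rec_bin_search(x, A, m + 1, e)


@lru_cache(maxsize=None)
def T(n):
    """Worst-case step count from the recurrence in the notes."""
    if n == 1:
        return 7
    # (n + 1) // 2 is ceil(n/2) and n // 2 is floor(n/2) for n >= 1.
    return 9 + max(T((n + 1) // 2), T(n // 2) + 1)


def lower_bound_holds(limit):
    """Check T(n) >= 9 lg n for every n in 1..limit."""
    return all(T(n) >= 9 * log2(n) for n in range(1, limit + 1))


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 8, 1024):
        print(n, T(n), 10 * log2(n) + 7)
    print(lower_bound_holds(2000))
```

    On powers of two the recurrence gives T(2^k) = 10k + 7, which agrees
    with the guess 10 lg n + 7. For other sizes the guess can be too small
    (for example, T(3) = 26 while 10 lg 3 + 7 is about 22.85), which is
    why part (b) needs a slightly more careful upper bound than the guess
    itself.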