2025-06-04
Last time, we used induction to prove asymptotic bounds on recursive functions. For example, we showed that $\mathrm{Fib}(n) = \Theta(\varphi^n)$ and $T_{\mathrm{BinSearch}}(n) = \Theta(\log(n))$.
Last time, the process looked like this: guess a bound, then prove it by induction.
There are two weaknesses to this approach: it requires a good guess up front, and the proofs can get quite technical.
Today, we will see how to streamline the process with the substitution method, recursion trees, and the Master Theorem.
The base case typically involves calculating some values of the recursive function and selecting constants that are large enough so that the calculations work out.
The base case usually works out and is a little tedious to check, so for the rest of this class, you may skip this step, as long as you swear the following oath.
I swear that I understand that a full proof by induction requires a base case.
In divide-and-conquer algorithms, we typically split the problem into subproblems of roughly equal size. For example, we might split a problem of size $n$ into 2 subproblems of size $n/2$. When $n$ is not divisible by 2, this means one subproblem of size $\lfloor n/2 \rfloor$ and another of size $\lceil n/2 \rceil$.
However, replacing $\lceil n/2 \rceil$ and $\lfloor n/2 \rfloor$ with $n/2$ has a negligible impact on the asymptotics, so we can just ignore floors and ceilings. See Introduction to Algorithms (CLRS), section 4.6.2 for a discussion of this.
The substitution method for solving recurrences is proof by (complete) induction with the simplifications applied.
You can also add as many lower-order terms as you want, i.e., you can show $T(n) \le c\,f(n) + d$.
$$T_{\mathrm{BinSearch}}(n) = T_{\mathrm{BinSearch}}(n/2) + 1$$
Claim. $T_{\mathrm{BinSearch}}(n) = O(\log(n))$.
Use the substitution method:
$$T_{\mathrm{BinSearch}}(n) = T_{\mathrm{BinSearch}}(n/2) + 1 \le c\log(n/2) + 1 = c(\log(n) - \log(2)) + 1 = c\log(n) - c + 1,$$
which is at most $c\log(n)$ when $c \ge 1$.
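As a quick numeric sanity check (a minimal sketch, assuming the base case $T(1) = 1$ and exact integer halving), we can evaluate the recurrence directly and compare it against $c\log_2(n)$ with $c = 2$:
```python
import math

def t_bin_search(n):
    # T(n) = T(n/2) + 1, with the assumed base case T(1) = 1
    if n <= 1:
        return 1
    return t_bin_search(n // 2) + 1

for n in [2, 16, 1024, 10**6]:
    # T(n) stays at or below 2 * log2(n) for n >= 2
    print(n, t_bin_search(n), 2 * math.log2(n))
```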
Input. A list l.
Output. l, but sorted.
Let’s think of $l \in \mathrm{List}[\mathbb{N}]$, i.e., $l$ is a list of natural numbers. Then $l$ is sorted iff $i \le j \implies l[i] \le l[j]$ (for all valid indices $i, j$).
In general, sorting makes sense for $l \in \mathrm{List}[A]$, as long as the elements of $A$ can be ordered.
```python
def merge_sort(l):
    n = len(l)
    if n <= 1:
        return l
    else:
        left = merge_sort(l[:n//2])   # Sort the left subarray
        right = merge_sort(l[n//2:])  # Sort the right subarray
        return merge(left, right)     # Merge the sorted subarrays
```
```python
def merge(l1, l2):
    """
    Input: sorted lists: l1, l2
    Output: l, a sorted list of the elements from both l1 and l2
    """
    l = []
    while True:
        # If either list is empty, concatenate the other list to the end and return
        if len(l1) == 0:
            return l + l2
        if len(l2) == 0:
            return l + l1
        # Otherwise, both lists are non-empty, so append the smallest element in either list
        if l1[0] <= l2[0]:
            l.append(l1.pop(0))  # pop(0) retrieves the first element and removes it from the list
        else:
            l.append(l2.pop(0))
```
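For example, a quick check of the two functions above:
```python
print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
print(merge([1, 3, 5], [2, 4, 6]))           # [1, 2, 3, 4, 5, 6]
```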
Looking at merge_sort, the running time satisfies the recurrence
$$T_{MS}(n) = 2\,T_{MS}(n/2) + T_{\mathrm{Merge}}(n)$$
Let $n$ be the total number of elements in l1 and l2. What is the complexity of merge in terms of $n$?
$\Theta(n)$. Explanation: Each iteration of the while loop either returns or moves one element into the merged list, so there are at most $n$ iterations.
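As an aside, Python's list.pop(0) itself takes time linear in the length of the list, so the implementation above is not literally $\Theta(n)$ as written. A common alternative (a sketch with a hypothetical merge_indexed helper, not the version used above) walks the lists with indices so that each iteration is $O(1)$:
```python
def merge_indexed(l1, l2):
    """Merge two sorted lists in Theta(n) time using indices instead of pop(0)."""
    l = []
    i = j = 0
    while i < len(l1) and j < len(l2):
        if l1[i] <= l2[j]:
            l.append(l1[i])
            i += 1
        else:
            l.append(l2[j])
            j += 1
    # At most one of these slices is non-empty; append the leftovers.
    return l + l1[i:] + l2[j:]
```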
Since $T_{\mathrm{Merge}}(n) = \Theta(n)$, the merge sort recurrence becomes
$$T_{MS}(n) = 2\,T_{MS}(n/2) + n$$
$$\begin{aligned}
T_{MS}(n) &= 2T_{MS}(n/2) + n \\
&= 2(2T_{MS}(n/4) + n/2) + n = 4T_{MS}(n/4) + 2n \\
&= 4(2T_{MS}(n/8) + n/4) + 2n = 8T_{MS}(n/8) + 3n \\
&= 8(2T_{MS}(n/16) + n/8) + 3n = 16T_{MS}(n/16) + 4n \\
&\;\;\vdots
\end{aligned}$$
Let’s say $n = 2^k$ for some $k$. Then eventually, we get to
$$T_{MS}(n) = 2^k T_{MS}(n/2^k) + kn = n\,T_{MS}(1) + kn = \Theta(n\log(n)).$$
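As a quick numeric sanity check (a sketch, assuming $T_{MS}(1) = 1$ and $n$ a power of two), the ratio $T_{MS}(n)/(n\log_2(n))$ should approach a constant:
```python
import math

def t_ms(n):
    # T(n) = 2*T(n/2) + n, with the assumed base case T(1) = 1
    if n <= 1:
        return 1
    return 2 * t_ms(n // 2) + n

for k in [4, 8, 12, 16]:
    n = 2 ** k
    # T(2^k) = n*(k + 1), so the ratios (k+1)/k approach 1
    print(n, t_ms(n) / (n * math.log2(n)))
```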
Recursion Trees are a great way to visualize the sum.
$$T_{MS}(n) = 2\,T_{MS}(n/2) + n$$
Like in the previous example, we can sometimes use the recursion tree to compute the runtime directly.
Other times, we won’t be able to compute the runtime directly, but we can still use recursion trees to make a good guess. We can then prove our guess was correct using the substitution method.
For example, consider a recurrence like $T(n) = T(n/3) + T(2n/3) + n$, whose recursion tree is unbalanced: the shallowest leaf is at depth $\log_3(n)$ and the deepest is at depth $\log_{3/2}(n)$.
Lower bound: Remove all nodes at depth greater than $\log_3(n)$. The remaining tree is perfect, has height $\log_3(n)$, and does $n$ work at each level. So guess $n\log(n)$.
Upper bound: Imagine the levels below depth $\log_3(n)$ also do $n$ work each. In this case, we do $n$ work on each of $\log_{3/2}(n)$ levels, so again guess $n\log(n)$.
Prove the guess using the substitution method (exercise).
A recurrence is in standard form if it is written as
$$T(n) = a\,T(n/b) + f(n)$$
for some constants $a \ge 1$, $b > 1$, and some function $f : \mathbb{N} \to \mathbb{R}$.
Most divide-and-conquer algorithms will have recurrences that look like this.
Draw a recursion tree for the standard form recurrence. In terms of $a$, $b$, and $f$:
The height of the tree is $\log_b(n)$.
The number of vertices at level $h$ is $a^h$.
The total non-recursive work done at level $h$ is $a^h f(n/b^h)$. Of note are the root (level $0$), which does $f(n)$ work, and the leaves (level $\log_b(n)$), which number $a^{\log_b(n)} = n^{\log_b(a)}$ and each do constant work.
Summing up the levels, the total amount of work done is
$$\sum_{h=0}^{\log_b(n)} a^h f(n/b^h).$$
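To make the sum concrete, here is a small sketch that evaluates the level sums for the merge sort parameters ($a = 2$, $b = 2$, $f(n) = n$); every level contributes exactly $n$:
```python
import math

a, b, n = 2, 2, 1024
f = lambda m: m  # the non-recursive work; f(n) = n for merge sort
levels = round(math.log(n, b))
level_work = [a**h * f(n / b**h) for h in range(levels + 1)]
print(level_work)       # every level does 1024.0 work
print(sum(level_work))  # 11 levels * 1024 = 11264.0
```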
The Master Theorem is a way to solve most standard form recurrences quickly.
We get the Master Theorem by analyzing the recursion tree for a generic standard form recurrence.
Let $T(n) = a\,T(n/b) + f(n)$. Define the following cases based on how the root work compares with the leaf work.
Leaf heavy. $f(n) = O(n^{\log_b(a) - \epsilon})$ for some constant $\epsilon > 0$.
Balanced. $f(n) = \Theta(n^{\log_b(a)})$.
Root heavy. $f(n) = \Omega(n^{\log_b(a) + \epsilon})$ for some constant $\epsilon > 0$, and $a f(n/b) \le c f(n)$ for some constant $c < 1$ for all sufficiently large $n$.
Then,
$$T(n) = \begin{cases}
\Theta(n^{\log_b(a)}) & \text{Leaf heavy case} \\
\Theta(f(n)\log(n)) & \text{Balanced case} \\
\Theta(f(n)) & \text{Root heavy case}
\end{cases}$$
$f(n) = O(n^{\log_b(a) - \epsilon})$ for some $\epsilon > 0$ means that $f(n)$ is smaller than $n^{\log_b(a)}$ by a factor of at least $n^\epsilon$. You might find it easier to think of $\epsilon$ as $0.0001$, and $n^{\log_b(a) - \epsilon}$ as $\frac{n^{\log_b(a)}}{n^\epsilon}$.
For example, $n^{1.9} = O(n^{2 - \epsilon})$ for some $\epsilon > 0$ (e.g., $\epsilon = 0.01$), but $n^2/\log(n) \ne O(n^{2 - \epsilon})$ for any $\epsilon > 0$, since $n^{2 - \epsilon} = n^2/n^\epsilon$ and $\log(n) = O(n^\epsilon)$ for any choice of $\epsilon > 0$.
The condition in the root heavy case that $a f(n/b) \le c f(n)$ for some constant $c < 1$ for all sufficiently large $n$ is called the regularity condition.
In the root heavy case, most of the work is done at the root. $a f(n/b)$ is the total work done at level 1 of the tree. The regularity condition says that if most of the work is done at the root, we had better do more work at the root than at level 1 of the tree! For example, with $a = 2$, $b = 2$, $f(n) = n^2$, we get $a f(n/b) = 2(n/2)^2 = n^2/2 = \frac{1}{2} f(n)$, so the condition holds with $c = 1/2$.
Write the recurrence in standard form to find the parameters $a$, $b$, $f$.
Compare $n^{\log_b(a)}$ to $f$ to determine the case split.
Read off the asymptotics from the relevant case.
$$T_{MS}(n) = 2\,T_{MS}(n/2) + n$$
$T_{MS}$ is a standard form recurrence with $a = 2$, $b = 2$, $f(n) = n$. We have $n^{\log_2(2)} = n^1$. Thus, $f = \Theta(n^{\log_b(a)})$, and we are in the balanced case of the Master Theorem. Hence $T_{MS}(n) = \Theta(n\log(n))$.
$$T_{\mathrm{BinSearch}}(n) = T_{\mathrm{BinSearch}}(n/2) + 1$$
$T_{\mathrm{BinSearch}}$ is a standard form recurrence with $a = 1$, $b = 2$, $f(n) = 1$. We have $n^{\log_2(1)} = n^0 = 1$. Thus, $f = \Theta(n^{\log_b(a)})$, and we are in the balanced case of the Master Theorem. Hence $T_{\mathrm{BinSearch}}(n) = \Theta(\log(n))$.
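The case split can even be mechanized when $f$ is a simple polynomial. Here is a small sketch (a hypothetical classify helper, assuming $f(n) = n^d$; for such $f$, the regularity condition in the root heavy case holds automatically):
```python
import math

def classify(a, b, d):
    """Classify T(n) = a*T(n/b) + n**d via the Master Theorem."""
    crit = math.log(a, b)  # the critical exponent log_b(a)
    if math.isclose(d, crit):
        return f"balanced: Theta(n^{d} * log n)"
    elif d < crit:
        return f"leaf heavy: Theta(n^{crit:g})"
    else:
        return f"root heavy: Theta(n^{d})"

print(classify(2, 2, 1))  # merge sort: balanced, Theta(n log n)
print(classify(1, 2, 0))  # binary search: balanced, Theta(log n)
print(classify(8, 2, 1))  # leaf heavy: Theta(n^3)
print(classify(1, 2, 1))  # root heavy: Theta(n)
```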
| Method | Pros | Cons |
|---|---|---|
| Induction | Always works; can get more precision | Requires a guess, can get technical, and proofs can get quite complex |
| Substitution | Always works | Requires a guess and is slower than the methods below |
| Recursion Tree | More intuitive/visual | Doesn’t always work, but is a good starting point and good for generating guesses |
| Master Theorem | Proofs are super short | Restricted scope (recurrence must be in standard form and must fall into one of the cases) |
A useful identity:
$$\begin{aligned}
a^{\log_b(n)} &= a^{\frac{\log_a(n)}{\log_a(b)}} && \text{(change of base)} \\
&= \left(a^{\log_a(n)}\right)^{1/\log_a(b)} \\
&= n^{1/\log_a(b)} \\
&= n^{\log_b(a)} && (1/\log_a(b) = \log_b(a))
\end{aligned}$$
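A quick numeric check of the identity (up to floating-point error), with, say, $a = 4$, $b = 2$, $n = 1024$:
```python
import math

a, b, n = 4, 2, 1024
print(a ** math.log(n, b))  # 1048576.0
print(n ** math.log(a, b))  # 1048576.0 (= n^2, since log_2(4) = 2)
```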
The remaining slides provide an outline of the proof of the Master Theorem.
In this class, you only need to know how to apply the Master Theorem.
However, understanding the proof can be helpful for getting an intuition for the case splits, remembering the conditions, and applying the theorem.
Here we go.
Analyze the recursion tree for the generic standard form recurrence. Apply the case splits to $f$.
Before we prove the Master Theorem, let’s review geometric series. A geometric series is a sum that looks like
$$S = a + ar + ar^2 + \dots + ar^{n-1} = \sum_{i=0}^{n-1} ar^i,$$
i.e., each term in the sum is obtained by multiplying the previous term by $r$.
The closed-form solution for $S$ (when $r \ne 1$) is
$$S = a\left(\frac{r^n - 1}{r - 1}\right).$$
To see this, start from $S = a + ar + ar^2 + \dots + ar^{n-1}$ and multiply both sides by $r$: $rS = ar + ar^2 + \dots + ar^n$. Subtracting, $rS - S = ar^n - a$, so $S = a\left(\frac{r^n - 1}{r - 1}\right)$.
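A quick numeric check of the closed form, with, say, $a = 3$, $r = 2$, $n = 10$:
```python
a, r, n = 3, 2, 10
print(sum(a * r**i for i in range(n)))  # 3069
print(a * (r**n - 1) // (r - 1))        # 3069
```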
Balanced case: $f(n) = \Theta(n^{\log_b(a)})$. Here every level does $a^h f(n/b^h) = \Theta(n^{\log_b(a)})$ work (since $b^{\log_b(a)} = a$), and there are $\Theta(\log(n))$ levels, giving $T(n) = \Theta(n^{\log_b(a)}\log(n)) = \Theta(f(n)\log(n))$.
Leaf heavy case: $f(n) = O(n^{\log_b(a) - \epsilon})$. Here the level sums grow geometrically (by a factor of $b^\epsilon$ per level), so the total is dominated by the leaf level, giving $T(n) = \Theta(n^{\log_b(a)})$.
The third case is similar to the previous cases. Check CLRS section 4.6.1 for the details.
CSC236 Summer 2025