Here is MergeSort:
def merge(L1, L2):
    '''Return a list that contains the elements of L1 and L2 and is sorted.
    Precondition: L1 and L2 are sorted
    '''
    i, j = 0, 0
    # L1[:i] was already processed
    # L2[:j] was already processed
    merged = []
    while i < len(L1) and j < len(L2):
        if L1[i] < L2[j]:
            merged.append(L1[i])
            i += 1
        else:
            merged.append(L2[j])
            j += 1
    # At most one of L1[i:] and L2[j:] is non-empty, so we can just append
    # them both in arbitrary order
    merged.extend(L1[i:])
    merged.extend(L2[j:])
    return merged
def merge_sort(L):
    '''Return a sorted version of L
    '''
    # base case
    if len(L) <= 1:
        # Return a copy of L, in case the user wants to modify
        # the original L and also modify the returned list; the
        # two need to be kept separate
        return L[:]
    # Sort the first half, sort the second half, and then merge the two halves!
    mid = len(L)//2
    return merge(merge_sort(L[:mid]), merge_sort(L[mid:]))
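As a quick sanity check, here is a small demo of both functions; the sample lists are made up just for illustration.

# A minimal sanity check of merge() and merge_sort();
# the sample lists are just for illustration.
print(merge([1, 3, 5], [2, 4]))    # [1, 2, 3, 4, 5]
print(merge_sort([3, 2, 1, 4]))    # [1, 2, 3, 4]

original = [3, 2, 1, 4]
result = merge_sort(original)
print(original)                    # [3, 2, 1, 4] -- the original is unchanged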
Let's first consider the complexity of merge(). Here's what happens in merge(): we append, in some order, all the elements of L1 and all the elements of L2. Note that list.extend() is just a loop that repeatedly appends elements to the end of a list. Since each append operation takes the same amount of time, and we perform len(L1) + len(L2) append operations (and basically nothing else) inside merge(L1, L2), it follows that the complexity of merge(L1, L2) is $\mathcal{O}(\text{len(L1)} + \text{len(L2)})$.
Another way to think about it is that we go over L1 and L2 just once, so the runtime should be proportional to the sum of the lengths of the two lists.
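To make this concrete, here is a sketch of an instrumented variant of merge() that counts the append operations; merge_counting is a hypothetical name, written just for this experiment.

def merge_counting(L1, L2):
    # Same algorithm as merge(), but also returns the number of append
    # operations performed (extend() counts as one append per element).
    i, j = 0, 0
    merged = []
    appends = 0
    while i < len(L1) and j < len(L2):
        if L1[i] < L2[j]:
            merged.append(L1[i])
            i += 1
        else:
            merged.append(L2[j])
            j += 1
        appends += 1   # exactly one append per loop iteration
    appends += (len(L1) - i) + (len(L2) - j)   # appends done by the two extend()s
    merged.extend(L1[i:])
    merged.extend(L2[j:])
    return merged, appends

# The count is always len(L1) + len(L2), regardless of the contents:
_, count = merge_counting([1, 3, 5], [2, 4, 6, 8])
print(count)   # 7 == len(L1) + len(L2)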
We are now ready to compute the runtime complexity of merge_sort(). Consider the call tree for [3, 2, 1, 4]:
[3]   [2]   [1]   [4]
  \   /       \   /
   \ /         \ /
  [3,2]       [1,4]
      \         /
       \       /
      [3, 2, 1, 4]
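One way to see where this tree comes from is to trace the recursive calls. The sketch below is a hypothetical variant of merge_sort() with an added depth parameter used only for printing; it is not part of the algorithm itself.

def merge_sort_traced(L, depth=0):
    # Same algorithm as merge_sort(), with a print that shows
    # which list each recursive call receives.
    print('  ' * depth + str(L))
    if len(L) <= 1:
        return L[:]
    mid = len(L) // 2
    left = merge_sort_traced(L[:mid], depth + 1)
    right = merge_sort_traced(L[mid:], depth + 1)
    return merge(left, right)

merge_sort_traced([3, 2, 1, 4])
# [3, 2, 1, 4]
#   [3, 2]
#     [3]
#     [2]
#   [1, 4]
#     [1]
#     [4]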
In general, the call tree will look like this, where [k] denotes a list of length k. Note that this is identical to the call tree for sum_list2() that we saw before.
                                               level
[1].......[1]...[1]....................[1]     log_2 n + 1
 .                                       .
  .                                     .
   .                                   .
    .                                 .
     .                               .
[n/4]    [n/4]         [n/4]    [n/4]          3
     \    /                 \    /
      \  /                   \  /
     [n/2]                  [n/2]              2
          \                   /
           \                 /
            \               /
             \             /
              [n]                              1
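We can check the number of levels directly. The helper below (a hypothetical depth(), written just for this check) computes the height of the call tree; for a list whose length is a power of 2, it comes out to $\log_2 n + 1$, matching the diagram.

def depth(L):
    # Height of the call tree for merge_sort() on list L.
    if len(L) <= 1:
        return 1
    mid = len(L) // 2
    return 1 + max(depth(L[:mid]), depth(L[mid:]))

for n in [1, 2, 4, 8, 16]:
    print(n, depth(list(range(n))))   # prints log2(n) + 1 for each n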
It's not the case that each call takes the same amount of time to run, however. Most of the computation is performed in merge(), and since merge() is $\mathcal{O}(\text{len(L1)} + \text{len(L2)}) = \mathcal{O}(\text{len(L[:mid])} + \text{len(L[mid:])}) = \mathcal{O}(\text{len(L)})$, a call on a list of length $n$ runs in time $kn$ for some constant $k$ (not counting the recursive calls).
We'll now use a trick: we'll compute the runtime that all the calls on each level take together. It's easy to see that that runtime is $kn$ on every level: for example, on the third level from the bottom, the runtime is $k(n/4)+k(n/4)+k(n/4)+k(n/4)=kn$. That's the runtime on every level except the top level, where merge() is not called (since merge() isn't called when $n=1$), so we consider the runtime there to be 0.
There are $\log_2 n$ levels in total where the runtime isn't 0, so the total runtime is $kn\log_2 n$, which is $\mathcal{O}(n\log n)$.
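As a rough empirical check of the $kn\log_2 n$ total, here is a sketch that counts the total number of append operations merge_sort() performs, via a hypothetical instrumented variant merge_sort_count() (it reuses merge() from above), and compares the count to $n\log_2 n$.

import math

def merge_sort_count(L):
    # Return (sorted list, total number of appends performed in merge()).
    if len(L) <= 1:
        return L[:], 0
    mid = len(L) // 2
    left, c1 = merge_sort_count(L[:mid])
    right, c2 = merge_sort_count(L[mid:])
    # merge() performs len(left) + len(right) == len(L) appends
    return merge(left, right), c1 + c2 + len(L)

for n in [2**10, 2**15, 2**20]:
    _, count = merge_sort_count(list(range(n, 0, -1)))
    print(n, count, round(count / (n * math.log2(n)), 2))
# The last column stays at 1.0 for powers of 2,
# matching the k*n*log2(n) analysis.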
$\mathcal{O}(n\log n)$ is pretty fast -- for $n = 1,000,000$, $n^2$ is a trillion, but $n\log_2 n$ is just 20 million.
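The back-of-the-envelope arithmetic:

import math

n = 1_000_000
print(n ** 2)                   # 1000000000000  (a trillion)
print(round(n * math.log2(n)))  # 19931569       (about 20 million)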