sum_list2 again

Here is sum_list2 again:
def sum_list2(L):
    '''Return the sum of the list of ints L'''
    if len(L) == 0:
        return 0
    if len(L) == 1:
        return L[0]  # the sum of the list is L[0] if L[0] is the only element
    mid = len(L)//2  # the index of the approximate midpoint of the list
    return sum_list2(L[:mid]) + sum_list2(L[mid:])
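For example, a quick sanity check (using a small made-up list, assuming the definition above):

L = [5, 2, 8, 1]
print(sum_list2(L))    # prints 16, the same as sum(L)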
In order to figure out the complexity of sum_list2, we will draw a call tree. We'll use the notation [k] to denote a list of length k. For convenience, we'll assume that len(L) is a power of 2. That's a common assumption that's usually harmless (in the sense that it doesn't change the complexity estimate) -- we will not pursue the detailed analysis that proves that this is the case here: it's slightly tedious, but not very difficult.
[1].......[1]...[1]....................[1]
  .                                  .
   .                                .
    .                              .
     .                            .
      .                          .
    [n/4]   [n/4]    [n/4]   [n/4]
        \    /           \    /
         \  /             \  /
        [n/2]            [n/2]
            \             /
             \           /
              \         /
               \       /
                 [n]
Each call produces two more calls (for example, for a list of length $n/4$, we need to compute the sums of two lists, both of length $n/8$). Eventually, we reach calls to sum_list2() for lists of length 1, at which point we return.
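To see the call tree concretely, we can print each sublist as it is received. Here is a minimal sketch (sum_list2_traced and its depth parameter are made up for this illustration; depth just controls the indentation of the printout):

def sum_list2_traced(L, depth=0):
    '''Like sum_list2, but print each sublist the function is called on'''
    print('.' * depth, L)
    if len(L) == 0:
        return 0
    if len(L) == 1:
        return L[0]
    mid = len(L)//2
    return sum_list2_traced(L[:mid], depth + 1) + sum_list2_traced(L[mid:], depth + 1)

Calling sum_list2_traced([1, 2, 3, 4]) prints the length-4 list, then its two halves, then the four single-element lists -- the same shape as the tree above.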
IF WE ASSUME THAT SLICING COSTS NOTHING (which is not true), each call takes the same amount of time, so we just need to count the calls. We'll do that by counting the number of calls on each level:
                                           Number of calls
[1].......[1]...[1]....................[1]      2^k
  .                                  .
   .                                .
    .                              .
     .                            .
      .                          .
    [n/4]   [n/4]    [n/4]   [n/4]              2^2
        \    /           \    /
         \  /             \  /
        [n/2]            [n/2]                  2^1
            \             /
             \           /
              \         /
               \       /
                 [n]                            2^0
Each level has twice as many calls as the previous level. We start with $1 = 2^0$ call, and end at $2^k$, where $k$ is the number of levels minus 1 (since we start counting at $2^0$).
The sum of a geometric progression is: $$\sum_{i=0}^{k}\alpha^i = \frac{1-\alpha^{k+1}}{1-\alpha}$$
In total, therefore, we'll have $$\sum_{i=0}^k 2^i = \frac{1-2^{k+1}}{1-2} = 2^{k+1}-1$$
calls.
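As a quick sanity check of the formula, take $\alpha = 2$ and $k = 3$: $$\sum_{i=0}^{3} 2^i = 1 + 2 + 4 + 8 = 15 = \frac{1-2^{4}}{1-2} = 2^{4}-1$$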
What's $k$? It's the number of times we need to divide $n$ by 2 in order to get to 1. In other words, by definition, it is $\log_2 n$. (For example, for $n = 8$, we have $8 \to 4 \to 2 \to 1$: three divisions, and indeed $\log_2 8 = 3$.)
(Another way to arrive at the same conclusion is to observe that we'll make $n$ calls to sum_list2() with lists of size 1 -- one per element of the original list L. Since we said that the last level contains $2^k$ calls, it follows that $n = 2^k$, so that, again, $k = \log_2 n$, which we get by taking the $\log_2$ of both sides.)
The total number of calls is therefore $$2^{k+1}-1 = 2^k \times 2^1 -1 = 2\times 2^k - 1 = 2\times 2^{\log_2 n} - 1 = 2n-1$$
(Note that $2^{\log_2 n} = n$, since $\log_2 n$ is by definition the power to which $2$ is raised to get $n$.)
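We can also verify the $2n-1$ count empirically. Here is a minimal sketch (count_calls is a helper made up for this check); it mirrors sum_list2's recursion, counting one call per invocation:

def count_calls(L):
    '''Return the number of calls that sum_list2(L) would make'''
    if len(L) <= 1:
        return 1    # a list of length 0 or 1 causes exactly one call
    mid = len(L)//2
    return 1 + count_calls(L[:mid]) + count_calls(L[mid:])

print(count_calls(list(range(8))))    # prints 15, i.e., 2*8 - 1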
So the complexity of sum_list2 is $\mathcal{O}(n)$, assuming slicing is costless.
What's the true story here? Actually, slicing creates a new list, which costs us time proportional to the length of the resulting new list. On every level of the call tree, the slices add up to about $n$ elements in total, and there are about $\log_2 n$ levels, so the runtime here is actually $\mathcal{O}(n\log n)$ -- see the analysis of MergeSort for the details. We assumed that slicing is costless because we can actually avoid it:
def sum_list2_fast(L, start, end):
    '''Return the sum of the ints in L[start:end], without slicing L'''
    if end == start:
        return 0
    if end == start + 1:
        return L[start]
    mid = (start + end)//2
    return sum_list2_fast(L, start, mid) + sum_list2_fast(L, mid, end)
Here, the complexity really is $\mathcal{O}(n)$.
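To sum an entire list, we pass in the full range of indices (a quick usage sketch, assuming the definition above):

L = [5, 2, 8, 1]
print(sum_list2_fast(L, 0, len(L)))    # prints 16, the same as sum(L)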