Here is selection sort again:
def max_i(L1):
    '''Return the index of a largest element in L1. If there is more than
    one maximal element, return the index of the leftmost one.'''
    cur_max = L1[0]    # the largest element up to index i
    cur_max_i = 0      # the index of the largest element up to index i
    for i in range(1, len(L1)):
        if L1[i] > cur_max:    # BLOCK A
            cur_max = L1[i]
            cur_max_i = i
    return cur_max_i
def selection_sort(L):
    '''Modify L so that it's sorted in non-decreasing order.
    Argument:
    L -- a list of ints
    '''
    for j in range(len(L)):
        ind_of_max = max_i(L[:len(L)-j])                  # LINE 1
        L[ind_of_max], L[-1-j] = L[-1-j], L[ind_of_max]   # LINE 2
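Before analyzing the runtime, it helps to confirm the code behaves as intended. Here is a self-contained copy of the two functions with a quick sanity check (the sample list is our own choice):

```python
def max_i(L1):
    '''Return the index of the leftmost largest element in L1.'''
    cur_max = L1[0]
    cur_max_i = 0
    for i in range(1, len(L1)):
        if L1[i] > cur_max:          # BLOCK A
            cur_max = L1[i]
            cur_max_i = i
    return cur_max_i

def selection_sort(L):
    '''Sort L in place, in non-decreasing order.'''
    for j in range(len(L)):
        ind_of_max = max_i(L[:len(L)-j])                  # LINE 1
        L[ind_of_max], L[-1-j] = L[-1-j], L[ind_of_max]   # LINE 2

L = [30, 10, 20, 10]
selection_sort(L)
print(L)  # [10, 10, 20, 30]
```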
Let's analyze Selection Sort again, this time a little more carefully than last time. Let n = len(L).
In the for-loop in selection_sort(), we previously argued that LINE 1 takes different amounts of time depending on j. Let's detail what we mean:
First, LINE 1 creates the new list L[:len(L)-j] and places it in memory. It is reasonable to think (and broadly correct) that this operation is $\mathcal{O}(len(L)-j)$, i.e., that it takes time proportional to len(L)-j. Let's assume that the operation takes a*(n-j) time, for some constant a.
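One rough way to see that slicing costs time proportional to the slice length is to time L[:k] for doubling values of k (the sizes here are our own choice, and absolute times are machine-dependent — only the roughly-doubling pattern matters):

```python
import time

# Time L[:k] for doubling k; elapsed times should grow roughly
# linearly with k (exact numbers depend on the machine).
L = list(range(8_000_000))
for k in [1_000_000, 2_000_000, 4_000_000]:
    start = time.perf_counter()
    s = L[:k]
    elapsed = time.perf_counter() - start
    print(k, round(elapsed, 4))
```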
Second, LINE 1 calls max_i() with a list of length len(L)-j. Like we argued previously, BLOCK A repeats len(L)-j-1 times, and that's what takes the bulk of the time in the call to max_i(). Let's assume, therefore, that the call to max_i() takes b*(n-j-1) time, for some constant b.
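We can check the count of BLOCK A repetitions directly by instrumenting the code with a comparison counter (the `comparisons` counter is our addition, not part of the original code). Summing n-j-1 over j = 0, ..., n-1 gives n(n-1)/2, which is 45 for n = 10:

```python
comparisons = 0  # our instrumentation: counts executions of BLOCK A's test

def max_i(L1):
    global comparisons
    cur_max = L1[0]
    cur_max_i = 0
    for i in range(1, len(L1)):
        comparisons += 1             # BLOCK A runs once per iteration
        if L1[i] > cur_max:
            cur_max = L1[i]
            cur_max_i = i
    return cur_max_i

def selection_sort(L):
    for j in range(len(L)):
        ind_of_max = max_i(L[:len(L)-j])
        L[ind_of_max], L[-1-j] = L[-1-j], L[ind_of_max]

n = 10
L = list(range(n, 0, -1))   # worst-case-looking input; the count is the
selection_sort(L)           # same for any input of length n
print(comparisons, n * (n - 1) // 2)  # 45 45
```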
Now LINE 2 takes a constant amount of time, let's say c.
In total, for a given j, we take a*(n-j)+b*(n-j-1)+c time. Rearranging terms, we get (a+b)*n-(a+b)*j+c-b.
Now, the total runtime will be
$\sum_{j=0}^{n-1} \left[(a+b)n-(a+b)j+c-b\right] = (a+b)n^2 - (a+b)\sum_{j=0}^{n-1}j + n(c-b) = (a+b)\left(n^2-\frac{n(n-1)}{2}\right)+n(c-b)$
This is equal to
$(a+b)\frac{n(n+1)}{2}+(c-b)n = \frac{a+b}{2}n^2 +\left(c-\frac{b}{2}+\frac{a}{2}\right)n.$
So the algorithm is indeed $\mathcal{O}(n^2)$. Note that we still left out some details -- for example, we didn't account for the operations needed to execute the loop header for j in range(len(L)). However, it's clear that LINE 1 is the costly one.
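We can also check the quadratic growth empirically by timing selection_sort on doubling input sizes: doubling n should roughly quadruple the runtime. (The sizes are our own choice, and the printed times are machine-dependent, so treat the ratios as rough.)

```python
import random
import time

def max_i(L1):
    cur_max = L1[0]
    cur_max_i = 0
    for i in range(1, len(L1)):
        if L1[i] > cur_max:
            cur_max = L1[i]
            cur_max_i = i
    return cur_max_i

def selection_sort(L):
    for j in range(len(L)):
        ind_of_max = max_i(L[:len(L)-j])
        L[ind_of_max], L[-1-j] = L[-1-j], L[ind_of_max]

# Doubling n should roughly quadruple the elapsed time for an O(n^2)
# algorithm; exact numbers vary by machine.
for n in [1000, 2000, 4000]:
    L = random.sample(range(n), n)   # a random permutation of 0..n-1
    start = time.perf_counter()
    selection_sort(L)
    elapsed = time.perf_counter() - start
    print(n, round(elapsed, 4))
```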
One conclusion we can draw from this is that we need to be careful about creating new lists -- in this case, this didn't affect the asymptotic runtime complexity, but it's easy to see how it might.
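For instance, one way to avoid creating a new list on every pass is to give max_i an explicit upper bound instead of a slice. This is a sketch of that idea; the parameter name `end` is our addition:

```python
def max_i(L1, end):
    '''Return the index of the leftmost largest element in L1[:end],
    without actually creating the slice.'''
    cur_max = L1[0]
    cur_max_i = 0
    for i in range(1, end):
        if L1[i] > cur_max:
            cur_max = L1[i]
            cur_max_i = i
    return cur_max_i

def selection_sort(L):
    '''Sort L in place, in non-decreasing order.'''
    for j in range(len(L)):
        ind_of_max = max_i(L, len(L) - j)   # no new list is created
        L[ind_of_max], L[-1-j] = L[-1-j], L[ind_of_max]

L = [5, 3, 1, 4, 2]
selection_sort(L)
print(L)  # [1, 2, 3, 4, 5]
```

This version does the same comparisons as before, so it is still $\mathcal{O}(n^2)$, but it drops the a*(n-j) slicing cost from each pass.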