Let's come back to the example from before:
def find_i(L, e):
    for i in range(len(L)):
        if L[i] == e:
            return i
    return None
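To recall what the function does, here is a quick check of its behavior (the example lists are our own):

```python
def find_i(L, e):
    """Return the index of the first occurrence of e in L, or None if e is not in L."""
    for i in range(len(L)):
        if L[i] == e:
            return i
    return None

print(find_i([5, 2, 8], 8))   # prints 2
print(find_i([5, 2, 8], 7))   # prints None
```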
Ideally, we'd want to know how fast this function runs for any list. But that requires knowing exactly what computer it runs on.
Let's instead count the number of operations this function needs to perform in order to run. If every operation takes, for example, 0.01ms, then we can compute the total time by multiplying the number of operations needed by 0.01.
Let's try to count the operations that each line performs:
def find_i(L, e):
    for i in range(len(L)):   #1 operation: load a new number into i
        if L[i] == e:         #2 operations: access the i-th element of L, and compare it to e
            return i          #1 operation
    return None               #1 operation
Those counts are necessarily approximate. For example, it takes more than one operation behind the scenes to load a new number into i
and also decide whether that was the last number. But that doesn't matter much, as we'll soon see.
So what's the number of operations required? We have two cases:

Case 1: L[k] == e for some k

In this case, the loop runs k+1 times (once for each of i = 0, 1, ..., k). We repeat the following two lines k+1 times:
    for i in range(len(L)):   #1 operation: load a new number into i
        if L[i] == e:         #2 operations: access the i-th element of L, and compare it to e
That takes 3(k+1) operations. Then we also return (once), taking, in our accounting, one more operation.
In total, we would perform 3(k+1) + 1 = 3k + 4 operations.
Case 2: e is not in L

In this case, the loop runs len(L) times, and returns once. Setting n = len(L), we would perform 3n + 1 operations.
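We can sanity-check this accounting with an instrumented version of the function (the counter and the per-line costs below are our own bookkeeping, not a standard Python tool). With this accounting, finding e at index k costs 3(k+1) + 1 = 3k + 4 operations, and an unsuccessful search over a list of length n costs 3n + 1:

```python
def find_i_count(L, e):
    """find_i, instrumented with the per-line operation counts we used above."""
    ops = 0
    for i in range(len(L)):
        ops += 1              # load a new number into i
        ops += 2              # access L[i] and compare it to e
        if L[i] == e:
            ops += 1          # return i
            return i, ops
    ops += 1                  # return None
    return None, ops

print(find_i_count([5, 2, 8, 1, 7], 8))   # (2, 10): 3*(2+1) + 1 = 10
print(find_i_count([5, 2, 8, 1], 7))      # (None, 13): 3*4 + 1 = 13
```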
A lot of the time, we want a more concise answer. We'd like to know, for input of size n, the longest amount of time the function could possibly take. (We have to specify the size of the input -- otherwise the function could take arbitrarily long amounts of time.)
In the worst case, e is not in L, so the function will run for 3n + 1 operations.
What we're really interested in is how much time the function will take. Let's say that on our computer, each elementary operation takes 0.01 ms. (Actually, operations are much faster than that on ECF -- we'll see that tomorrow.)
That means that, in the worst case (i.e., when e is not in L), the function should take 0.01(3n + 1) ms to run, where n == len(L).
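With 0.01 ms per operation and 3n + 1 operations in the worst case, the estimate is easy to compute (a sketch; the function name and the default cost per operation are our own):

```python
def worst_case_ms(n, ms_per_op=0.01):
    """Estimated worst-case runtime of find_i on a list of length n,
    assuming each elementary operation takes ms_per_op milliseconds."""
    return ms_per_op * (3 * n + 1)

print(worst_case_ms(100))    # about 3.01 ms
print(worst_case_ms(1000))   # about 30.01 ms
```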
But the coefficient 0.01 is just made up! What we really want to know is what the runtime of the algorithm will be proportional to. Essentially, we want to say that the runtime, in this case, will be roughly proportional to n.
We write this as: the runtime is O(n).
Informally, a function f is O(g) if f(n) grows no faster than g(n), for large n. In other words, the ratio f(n)/g(n) doesn't tend to infinity.
For example, if f(n) = 3n + 1 and g(n) = n, then f is O(g). We also write it as:
3n + 1 is O(n).
It's also the case, for example, that:
5n is O(n).
3n + 1 is O(n^2).
n is O(n^2). (Because n grows slower than n^2)
n^2 + n is O(n^2).
But:
n^2 is not O(n).
n^3 is not O(n^2).
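One way to build intuition for the ratio test is to just evaluate f(n)/g(n) for larger and larger n (a sketch; the specific functions are illustrative):

```python
# f(n)/g(n) stays bounded (it approaches 3), so 3n + 1 is O(n).
# h(n)/g(n) = n grows without bound, so n^2 is not O(n).
f = lambda n: 3 * n + 1
g = lambda n: n
h = lambda n: n ** 2

for n in [10, 1000, 100000]:
    print(n, f(n) / g(n), h(n) / g(n))
```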
As we said, basically, f is O(g) if the ratio f(n)/g(n) doesn't tend to infinity. We can write this technically as follows: f is O(g) if limsup of f(n)/g(n) as n goes to infinity is finite.
You can think of limsup almost as a lim. The reason limsup is used is that it's defined in more situations than lim is. Read more e.g. here: https://en.wikipedia.org/wiki/Limit_superior_and_limit_inferior
We don't really care about the technical definition here -- the important thing to understand is that when we say f is O(g), we mean that for large enough n, f(n) is either almost proportional to g(n), or is much smaller than g(n).
In most cases, we can identify a loop such that the number of iterations of that loop is proportional to the total number of operations.
This is the case here. In the worst case, the loop in find_i() runs n times, and we can say that the worst-case runtime complexity of find_i() is O(n), where n = len(L).
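To see this proportionality concretely, we can count the loop iterations in the worst case for different list lengths (a sketch using our own counter; doubling the input size doubles the count):

```python
def worst_case_iterations(n):
    """Count the loop iterations of find_i when e is not in L (the worst case)."""
    L = list(range(n))        # search for -1, which is not in L
    count = 0
    for i in range(len(L)):
        count += 1
        if L[i] == -1:
            return count
    return count

print(worst_case_iterations(1000))   # 1000
print(worst_case_iterations(2000))   # 2000
```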
It is also technically true to say that the runtime complexity of find_i() is O(n^2). However, O(n^2) is not a tight upper bound on the worst-case runtime complexity. By the tight bound, we mean the slowest-growing possible function g such that the runtime is O(g).