Longest run

We'd like to write the function longest_run(s, c), which returns the length of the longest run of consecutive occurrences of the character c in the string s. For example:

>>> longest_run("aababbbabb", "b")
3

>>> longest_run("aababbbabb", "a")
2

Here is a "Pythonic" way of solving this. Suppose that c == "a". Here's what we want to do:

Check: is "a" in s? If not, return 0.
Check: is "aa" in s? If not, return 1.
Check: is "aaa" in s? If not, return 2.
...

Recall that we can form strings of the form "aaa...a" by multiplying a one-character string by an integer:

In [1]:
c = "z"
c * 10
Out[1]:
'zzzzzzzzzz'

We can now write the function:

In [2]:
def longest_run(s, c):
    
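    # Start with i = 1, so that we have the option of returning 0.
    # The last value of i is len(s)+1 (the upper bound of range is exclusive).
    # Assuming c is a single character, c*(len(s)+1) is longer than s and so
    # is never in s, which gives us the option of returning len(s)+1-1 = len(s).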
    for i in range(1, len(s)+2):
        if c*i not in s:
            return i-1
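
To see the checks described above in action, here is a small sketch that records each string we test and whether it was found. The helper name run_probes is made up for this illustration; it is not part of the solution itself.

```python
def run_probes(s, c):
    """Return the (probe, found) pairs tried, stopping at the first miss."""
    probes = []
    for i in range(1, len(s) + 2):
        probe = c * i            # "b", "bb", "bbb", ...
        found = probe in s
        probes.append((probe, found))
        if not found:
            break                # longest_run would return i-1 here
    return probes


for probe, found in run_probes("aababbbabb", "b"):
    print(repr(probe), found)
```

The number of probes that were found equals the length of the longest run, which is why longest_run returns i-1 at the first probe that is missing.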
            

Here's another approach: let's go through s letter-by-letter, and keep track of the length of the current run, as well as the length of the maximum run.

In [3]:
def longest_run_buggy(s, c):
    run = 0     #the length of the current run
    max_run = 0 #the largest run that we saw
                #so far
    
    for ch in s:
        if ch != c:
            max_run = max(run, max_run)
            run = 0
        else:
            run += 1
    
    return max_run

Note the technique here. We defined two variables, said what they were, and then, inside the loop, we tried to make sure that the variables stay up-to-date.

There is a small bug here. What if s = "abaabaaaaaa" and c = "a"?

Then max_run is never updated for the final run of six a's, because s ends before we see a character other than c. The function returns 2 instead of 6.
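
To watch the bug happen, we can record run and max_run after each character. This is only a sketch: trace_longest_run_buggy is a hypothetical helper that repeats the logic of longest_run_buggy and additionally returns the recorded steps.

```python
def trace_longest_run_buggy(s, c):
    run = 0      # the length of the current run
    max_run = 0  # the largest run that we saw so far
    steps = []   # (character, run, max_run) after processing each character

    for ch in s:
        if ch != c:
            max_run = max(run, max_run)
            run = 0
        else:
            run += 1
        steps.append((ch, run, max_run))

    return max_run, steps


result, steps = trace_longest_run_buggy("abaabaaaaaa", "a")
for ch, run, max_run in steps:
    print(ch, run, max_run)
print("returned:", result)
```

In the last recorded step run is 6 while max_run is still 2: the loop ended before the two were compared.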

Here's one way to get around this: add a character that's not equal to c to the end of s. (Strictly speaking, we make s refer to a new string that equals the old s plus a character other than c.)

In [4]:
def longest_run2(s, c):
    run = 0     #the length of the current run
    max_run = 0 #the largest run that we saw
                #so far
    
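    # Append a character that is guaranteed to differ from c, so that
    # the final run is always followed by a character other than c.
    if c == "z":
        s += "y"
    else:
        s += "z"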
    
    for ch in s:
        if ch != c:
            max_run = max(run, max_run)
            run = 0
        else:
            run += 1
    
    return max_run

Another thing we can do is make sure that max_run is updated one last time before returning. We can do that in the return statement:

In [5]:
def longest_run3(s, c):
    run = 0     #the length of the current run
    max_run = 0 #the largest run that we saw
                #so far
    
    # No extra character is appended to s here: the return statement
    # below compares the final run against max_run.
    for ch in s:
        if ch != c:
            max_run = max(run, max_run)
            run = 0
        else:
            run += 1
    
    return max(run, max_run)
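
As a sanity check on the functions above, here is a sketch of an independent solution built on itertools.groupby from the standard library. This is a different technique from the loops above, and the name longest_run_groupby is an assumption made for this example.

```python
from itertools import groupby


def longest_run_groupby(s, c):
    # groupby(s) splits s into maximal blocks of equal consecutive characters,
    # e.g. "aabab" -> ("a", "aa"), ("b", "b"), ("a", "a"), ("b", "b").
    # We keep the blocks whose character is c and take the longest one;
    # default=0 covers the case where c does not occur in s at all.
    return max((len(list(group)) for ch, group in groupby(s) if ch == c),
               default=0)


print(longest_run_groupby("aababbbabb", "b"))
print(longest_run_groupby("abaabaaaaaa", "a"))
```

Comparing its results with those of the loop-based versions on a few strings, including ones that end in a run of c, is a quick way to catch bugs like the one above.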