Here is how we actually sort the 250 midterms for CSC180: we first separate them into piles by the first initial (i.e., the first letter of the last name). We then sort each pile individually. That's much faster: if we use, for example, Selection Sort, and we are able to separate the 250 midterms into 25 piles of 10, we'll be making about $25\times 10^2 /2 = 1{,}250$ comparisons instead of about $250^2/2 = 31{,}250$ comparisons.
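The arithmetic above can be checked with a few lines of Python. This is only a sketch: it uses the same rough $n^2/2$ estimate for Selection Sort and assumes the piles come out evenly sized. The helper name `selection_sort_comparisons` is made up for this illustration.

```python
def selection_sort_comparisons(n):
    """Rough comparison count for Selection Sort on n items (the n^2/2 estimate)."""
    return n * n // 2


n_midterms = 250
n_piles = 25
pile_size = n_midterms // n_piles  # 10 per pile, assuming evenly sized piles

one_big_sort = selection_sort_comparisons(n_midterms)
piles_then_sort = n_piles * selection_sort_comparisons(pile_size)

print(one_big_sort)     # 31250
print(piles_then_sort)  # 1250
```

Real piles are rarely all the same size (far more last names start with S than with X), so the actual savings would be somewhat smaller than this estimate suggests.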

Here, we'll concentrate on a simpler scenario: suppose we want to sort non-negative integers. We again want to separate them into piles. Suppose the integers are:

1, 5, 10, 1, 2, 100, 500, 2, 4, 5, 6

We'll separate them into piles as follows: we'll have a pile of 1's, a pile of 2's, a pile of 3's, ..., a pile of 500's. We won't need to sort the individual piles, since every number in a pile is the same, so once the piles are built the sorted list is almost ready. Here's what the piles would look like:

pile of: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ..., 100, 101, ..., 500
         *  *     *  *  *           *                 *              *
         *  *        *

After we have the piles, we can read off the sorted list: 1, 1, 2, 2, 4, 5, 5, 6, 10, 100, 500
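
Taken literally, the pile idea could be sketched as follows. This is an illustration only, assuming the inputs are non-negative integers. The function name `sort_with_piles` is hypothetical, and the sketch also creates an (empty) pile for 0 so that each value can be used directly as an index.

```python
def sort_with_piles(numbers):
    """Sort non-negative integers by literally building one pile per value."""
    if not numbers:
        return []

    # piles[v] holds every copy of the value v seen so far
    piles = [[] for _ in range(max(numbers) + 1)]
    for x in numbers:
        piles[x].append(x)

    # Read the piles off from the smallest value to the largest
    result = []
    for pile in piles:
        result.extend(pile)
    return result


print(sort_with_piles([1, 5, 10, 1, 2, 100, 500, 2, 4, 5, 6]))
# [1, 1, 2, 2, 4, 5, 5, 6, 10, 100, 500]
```

Most of the 501 piles stay empty here, which hints at the cost of this approach: the work grows with the largest value, not just with the number of items.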

We don't literally need to maintain the piles: instead, we can keep track of how many 1's, 2's, 3's, etc. we've seen. Here is the algorithm. It is called Counting Sort, and is a special case of Bucket Sort.

In [ ]:
def bucket_sort(L):
    """Sort the list L of non-negative integers in place (Counting Sort)."""
    if not L:
        return  # nothing to sort; max() would raise on an empty list

    max_int = max(L)

    # counts[i] will be the number of times that
    # i appears in L
    counts = [0] * (1 + max_int)
    for e in L:
        counts[e] += 1

    # Now, build up the sorted version of the list. Note that if we have
    # counts[i] copies of the number i, we need to extend the sorted list
    # by [i] * counts[i], which is just [i, i, i, ..., i]
    # (i repeated counts[i] times)
    sorted_L = []
    for i in range(len(counts)):
        sorted_L.extend([i] * counts[i])

    # Modify the contents of L to be the contents of sorted_L
    L[:] = sorted_L
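
The function above uses each value directly as an index into `counts`, so it relies on the integers being non-negative. One possible variation, sketched here as an assumption and not as part of the original algorithm, shifts every value by the minimum so that negative integers work too. The name `counting_sort_with_offset` is made up for this example.

```python
def counting_sort_with_offset(L):
    """Counting Sort, in place, for integers that may be negative (illustrative variant)."""
    if not L:
        return

    lo, hi = min(L), max(L)

    # counts[k] is the number of times the value (k + lo) appears in L
    counts = [0] * (hi - lo + 1)
    for e in L:
        counts[e - lo] += 1

    # Rebuild the list in increasing order, undoing the shift by lo
    sorted_L = []
    for k, c in enumerate(counts):
        sorted_L.extend([k + lo] * c)

    L[:] = sorted_L


nums = [1, 5, 10, 1, 2, 100, 500, 2, 4, 5, 6]
counting_sort_with_offset(nums)
print(nums)   # [1, 1, 2, 2, 4, 5, 5, 6, 10, 100, 500]

temps = [3, -2, 0, -2, 7]
counting_sort_with_offset(temps)
print(temps)  # [-2, -2, 0, 3, 7]
```

Either way, the size of `counts` depends on the range of the values, so this style of sorting is only attractive when that range is not much larger than the number of items.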