CS4HS Workshop: Algorithms and Data Structures

July 13–14, 2011
Department of Computer Science
University of Toronto
François Pitt <francois.pitt@utoronto.ca>

What to expect

Goal

Topics

General references

Contact me!


Outline

Algorithms

Data Structures


Brute Force Algorithms

Discussion

What is it?

Advantage(s)?

Disadvantage(s)?

Try it!

Regroup

High-level solution (in Python)?

    H, block_list = process_input("DATA4.txt")
    subset_list = generate_subsets(block_list)
    minimum = 0
    for subset in subset_list:
        if sum(subset) == H and (minimum == 0 or len(subset) < minimum):
            minimum = len(subset)
    generate_output(minimum, "OUT4.txt")

Difficulty?

Solution (in Python)?

    def generate_subsets(block_list):
        if block_list == []:
            return [[]]
        else:
            # Generate the list of subsets for every block except the last.
            first_half = generate_subsets(block_list[:-1])
            # Make a copy of every subset in the first half, but with the
            # last block added to it.
            second_half = []
            for subset in first_half:
                second_half.append(subset + [block_list[-1]])
            return first_half + second_half

Further examples

[Back to outline]

Graph Traversal Algorithms

Discussion

What is this about?

Depth-first search

Breadth-first search

Try it!

Regroup

High-level solution?

  1. Read input into a 2D array.
  2. Scan input to generate graph representation of connections:
    1. For each number, "flood-fill" to find all portal entrances and store in a list.
    2. For each portal exit, "flood-fill" to find all numbers and portal entrances and store in a list.
    Result is adjacency list representation of graph: list of neighbours for every node.
    Adjacency lists for sample input:
  3. Perform BFS/DFS on graph to generate lists of reachable points of interest — but modified so that points of interest cannot be used as intermediary nodes.
  4. Print sorted lists, in order.

What's "flood-fill"?

Try it!

Regroup

High-level solution?

  1. Perform BFS — not DFS — on the implicit graph defined by knight moves.

What's an "implicit graph"?

Why BFS and not DFS?

[Back to outline]

Dynamic Programming

Discussion

What is it?

Classic example: Fibonacci numbers

Why not just use recursion?

    def fib(n):
        if n == 0:  return 0
        if n == 1:  return 1
        return fib(n-1) + fib(n-2)

Problem!

    fib(5):
        fib(4):
            fib(3):
                fib(2):
                    fib(1)
                    fib(0)
                fib(1)
            fib(2):
                fib(1)
                fib(0)
        fib(3):
            fib(2):
                fib(1)
                fib(0)
            fib(1)

Solution

        def fib2(n):
            L = [0, 1]
            for i in range(n):  # for i = 0,1,...,n-1
                L.append(L[-2] + L[-1])
            return L[n]

Try it!

Regroup

High-level solution?

Is it recursive?
Yes: solution either does not use last block (fewer blocks, same target height) or it uses last block (fewer blocks, reduced target height).
Does it have optimal substructure?
Yes: if optimal solution does not use last block, then it is optimal for rest of blocks;
if optimal solution uses last block, then it is optimal for rest of blocks and reduced height.
Are subproblems "simple"?
Yes: each subproblem characterized by target height (range [0..H]) and index of last block available (range [0..S]).
Do subproblems overlap?
Yes: small subproblems (e.g., with S = 1) must be evaluated multiple times as part of solving larger problems.

Step 1: define a table of solutions.

Step 2: write recursive equations for table values.

Step 3: write the algorithm.

Step 4: reconstruct the solution.

Try it!

Regroup

High-level solution?

Is it recursive?
Yes: solution either does not use last song or it does (in which case capacity is reduced accordingly).
Does it have optimal substructure?
Yes: if optimal solution does not use last song, then it is optimal (highest total rating) for rest of songs and original capacity;
if optimal solution uses last song, then it is optimal for rest of songs and reduced capacity.
Are subproblems "simple"?
Yes: each subproblem characterized by capacity (range [0..C]) and index of last song considered (range [0..N]).
Do subproblems overlap?
Yes: small subproblems (e.g., with N = 1) must be evaluated multiple times as part of solving larger problems.

Step 1: define a table of solutions.

Step 2: write recursive equations for table values.

Step 3: write the algorithm.

Step 4: reconstruct the solution.

Further examples

[Back to outline]

Greedy Algorithms

Discussion

What is it?

Example: making change

Input:
Coin denominations 1 = d1 < d2 < ... < dN; amount A.
Output:
Exact change for A, using as few total coins as possible.

Algorithm

Problem

Solution?

How do we know when greedy will work?

Try it!

Regroup

High-level solution

  1. Translate input into a graph representation.
    Careful! We want a graph where each pen is a "node", the outside is also one "node", and there are "edges" between any two pens that share an edge.
  2. Pick edges with smallest cost, until every pen is connected — skip edges between pens that are already connected.

Minimum Spanning Trees

Definition

Kruskal's algorithm

  1. Initialization:
    1. Sort the edges by non-decreasing cost.
    2. Initialize an empty tree (subset of edges).
  2. Repeat n−1 times (where n is the number of vertices):
    1. Select an edge with smallest cost whose endpoints are not already connected, and add it to the partial tree.

Prim's algorithm

  1. Initialization:
    1. Pick a start vertex s. The entire graph is partitioned into two sides: vertices connected to s, and the rest.
    2. Initialize an empty tree (subset of edges).
  2. Repeat n−1 times:
    1. Select a minimum-cost edge between some vertex connected to s and some unconnected vertex, and add it to the tree.

Runtime

Argument of correctness

[Back to outline]

Divide-and-Conquer Algorithms

Discussion

What is it?

Example: Mergesort

The Master Theorem

Try it!

Regroup

Trick

Resulting algorithm and runtime

How did this happen?

[Back to outline]

Hash Tables

Discussion

Direct-address table

Problem

Solution

Hash functions

Collisions

Complexity

[Back to outline]

Heaps and Priority Queues

Discussion

Motivation

Definition and properties

Implementation

Operations and complexity

[Back to outline]
© Copyright 2011 François Pitt <francois.pitt@utoronto.ca>
last updated at 17:13 (EDT) on Sat 16 Jul 2011