The Algorithm
A 'Depth-first search' of a graph is a procedure for iterating over vertices in a graph. When choosing which vertex to explore next, we favour children over siblings (as opposed to a breadth-first search order where siblings are favoured over children).
As was the case with breadth-first search (and any graph traversal) the evolution of a DFS may be described as a process by which every node in the graph is coloured, starting with white, at some point becoming grey when 'activated', and finally becoming black when 'finished'.
To get a feel for this, we need to run some DFSs on some graphs. There are many nice pictures in CLRS, as well as in the the wiki entry on Depth-first search.
While it makes sense to do a DFS on any graph, the applications we'll discuss concern directed graphs, so imagine our underlying graph is directed.
So what's the output of a DFS?
DFS is an algorithm in that it describes a procedure. Many of the algorithms we discuss solve particular problems are understood as having an output (even in the data structure world: the input is the current data-structure state plus some auxiliary arguments for the current operation, the output is the data-structure state after making the modifications). While an algorithm having an output is a natural thing, it isn't a necessary thing, and we won't usually talk about the output of a DFS either. We will see shortly that taking a DFS is an important subroutine of other algorithms (which will have outputs).
(Some readers may have been happy without this discussion. If so, great.)
Graph Coordinates derived from the algorithm
Running a DFS on a graph gives us some quantities and coordinates with which to make sense of graph structure.
Vertex coordinates: Discovery and Finish Times
There are two numbers one may associate with a node $v$: the discovery time $d(v)$ and the finish time $f(v)$. The $d(v)$ is the time(-step) when $v$ is coloured grey, $f(v)$ the time when $v$ is coloured black.
Edge type: Tree, forward, back or cross
The edges can be classified into one of four types:
- Tree edges; those edges that were followed by the DFS just before activating a new node
Note that the tree edges, taken together with $V$, form a tree (d'uh). Ok so they form a forest, since we aren't guaranteed that the graph is fully or strongly connected. The edges that are not tree edges are one of three types:
- Forward edges: edges that point to an already discovered node that is a descendent in the DFS tree
- Back edges: edges that point to one's ancestor in the DFS tree
- Cross edges: anything edge that is not classified by one of the previous three categories.
And a proposition about these categories (thm 22.10 from CLRS):
Properties of activation intervals
The following theorem appears in CLRS:
We formulated the following proposition together in tutorial