Starting Point
Have a list of rules with:
Target
One or more dependencies
Which may be targets as well
Action(s) to bring target up to date
Zero or more targets may be out of date
I.e., older than their dependencies
Goal: execute required actions
Must respect dependency order
Do not update something until all dependencies are up to date
Must only update things once
Efficiency: wasteful to recompile something eleven times
Correctness: actions may have side effects
Example
Can update D and E in any order
Can't update B until both D and E have been updated
![[Dependency Ordering]](../img/graph/dependencyOrdering.png)
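A minimal sketch of the goal stated above: visit dependencies before the target, and record each target only once. The class name `UpdateOrder`, the map-of-lists input, and the sample rules (A depends on B and C; B depends on D and E) are assumptions made for illustration, not taken from the figure. The sketch ignores timestamps and assumes there are no circular dependencies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class UpdateOrder {
    // Returns the targets needed for 'target' in an order that respects
    // dependencies, with each target listed exactly once.
    // Assumption: the dependency graph contains no cycles.
    static List<String> order(Map<String, List<String>> deps, String target) {
        Set<String> done = new LinkedHashSet<>();
        visit(deps, target, done);
        return new ArrayList<>(done);
    }

    private static void visit(Map<String, List<String>> deps, String target, Set<String> done) {
        if (done.contains(target)) {
            return; // already updated: never do it twice
        }
        List<String> needed = deps.getOrDefault(target, Collections.<String>emptyList());
        for (String dep : needed) {
            visit(deps, dep, done); // bring every dependency up to date first
        }
        done.add(target); // stands in for running the rule's actions
    }
}
```

With the sample rules, `UpdateOrder.order(deps, "A")` lists D and E before B, and B and C before A.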
Graphs
A graph is a set of nodes connected by arcs
Directed graph if the arcs have direction
Undirected graph if the arcs simply show connections
There are a lot of graphs in the world
Bus routes
Transitions in a finite state machine
Project dependencies
One of the fundamental data structures in computing
Kind of odd that java.util.Graph doesn't exist
Design Choices
Three basic representations
Graph owns nodes and arcs
Graph owns nodes, nodes own arcs
Graph owns arcs, arcs own nodes
Each one makes some algorithms cheap, and others expensive
We will implement directed graphs using the second option
Graph is a map
Keys are names of nodes
Values are sets of reachable nodes
I.e., the other ends of the arcs out of the node identified by the key
Note: make parameters and return values as general as possible
Use Collection instead of List, Set instead of HashSet, and so on
Makes code more flexible (fewer constraints on users)
Makes code easier to change (fewer external commitments)
Example
![[Graph Representations Diagram]](../img/graph/representations.png)
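To make the map-of-sets idea concrete, here is a small sketch that builds a three-node graph by hand. The class name `MapGraphSketch` and the node names are illustrative assumptions, and the sketch uses generics for readability even though the class skeleton in these notes uses raw collection types.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class MapGraphSketch {
    // Builds the graph a -> b, a -> c, b -> c as a map from node name
    // to the set of nodes reachable along one arc.
    static Map<String, Set<String>> build() {
        Map<String, Set<String>> nodes = new HashMap<>();
        nodes.put("a", new HashSet<>());
        nodes.put("b", new HashSet<>());
        nodes.put("c", new HashSet<>()); // a node with no outgoing arcs still needs an entry
        nodes.get("a").add("b");
        nodes.get("a").add("c");
        nodes.get("b").add("c");
        return nodes;
    }
}
```

Note that the variable is declared as `Map` and `Set`, not `HashMap` and `HashSet`, in line with the advice above about keeping types general.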
Class Skeleton
class DirectedGraph {
    public DirectedGraph() {...}
    public Set getNodes() {...}
    public boolean hasNode(Object node) {...}
    public void addNode(Object node) {...}
    public void addNode(Object node, Collection arcs) {...}
    public void removeNode(Object node) {...}
    public Set getArcs(Object node) {...}
    public boolean hasArc(Object src, Object dst) {...}
    public void addArc(Object src, Object dst) {...}
    public void addArcs(Object src, Collection allDst) {...}
    public void removeArc(Object src, Object dst) {...}
    public String toString() {...}
    protected Map fNodes;
}
More Design Choices
What happens if we try to add a node that's already present?
Or duplicate an arc?
Or remove a node or arc that doesn't exist?
Could throw an exception
Means that users have to write code like "if not present then add" and "if present then delete"
Could just ignore operations that don't make sense
Results in late failures
Still More Design Choices
What should a graph return when asked for its nodes?
Return map's keys as a set
What if it is asked for a node's arcs?
The set used to store the arcs?
Efficient, but allows users to mess up data structure's internals
A copy of that set?
Less efficient for large graphs with many arcs
But safer
What if a user wants all arcs?
Not directly available, but easy to construct
A set of two-element arrays?
A list of two-element lists?
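One way the "list of two-element lists" option might look, assuming the graph is stored as a map from node to a set of destination nodes. The class and method names (`AllArcs`, `allArcs`) are hypothetical and are not part of the class skeleton above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AllArcs {
    // Flattens a map-of-sets graph into a list of [source, destination] pairs.
    // The order of the pairs follows the iteration order of the map and sets.
    static List<List<String>> allArcs(Map<String, Set<String>> nodes) {
        List<List<String>> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : nodes.entrySet()) {
            for (String dst : entry.getValue()) {
                result.add(Arrays.asList(entry.getKey(), dst));
            }
        }
        return result;
    }
}
```

Because the result is built fresh on every call, callers can modify it without affecting the graph, at the cost of time proportional to the number of arcs.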
Implementation
public DirectedGraph() {
    fNodes = new HashMap();
}
public Set getNodes() {
    return fNodes.keySet();
}
public boolean hasNode(Object node) {
    return fNodes.containsKey(node);
}
public void addNode(Object node) {
    addEmptyNode(node);
}
public void addNode(Object node, Collection arcs) {
    addEmptyNode(node);
    addArcs(node, arcs);
}
protected Map fNodes;
Utility Methods
Have to use a concrete set class for storing arcs
If we ever want to change how this is done, only want to have to change it in one place
Also need to handle nodes that already exist
protected void addEmptyNode(Object node) {
    if (!fNodes.containsKey(node)) {
        fNodes.put(node, new HashSet());
    }
}
Removing Nodes
Must ensure integrity of data structure
When removing a node, must also remove all arcs that point to it
Rely on being able to "remove" nonexistent arcs
public void removeNode(Object node) {
    fNodes.remove(node);
    Iterator ni = getNodes().iterator();
    while (ni.hasNext()) {
        removeArc(ni.next(), node);
    }
}
Adding Arcs
When adding an arc from A to B, must ensure that A and B exist
Data structure integrity again
Note: work with Set interface, not HashSet class
public void addArc(Object src, Object dst) {
    addNode(src);
    addNode(dst);
    Set arcSet = (Set)fNodes.get(src);
    arcSet.add(dst);
}
Other methods are straightforward
See exmpl/graph/DirectedGraph.java
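The referenced file is not reproduced in these notes, so the following is only a guess at how the remaining arc methods might look. It assumes the "ignore operations that don't make sense" policy and the "return a copy" policy discussed earlier; the class name `SketchGraph` is hypothetical, and generics are used here although the original code uses raw types.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SketchGraph {
    protected Map<Object, Set<Object>> fNodes = new HashMap<>();

    // Adds an arc, creating either endpoint if it does not exist yet.
    public void addArc(Object src, Object dst) {
        fNodes.computeIfAbsent(src, k -> new HashSet<Object>());
        fNodes.computeIfAbsent(dst, k -> new HashSet<Object>());
        fNodes.get(src).add(dst);
    }

    public boolean hasArc(Object src, Object dst) {
        Set<Object> arcs = fNodes.get(src);
        return (arcs != null) && arcs.contains(dst);
    }

    // Returns a copy, so callers cannot alter the graph's internals.
    public Set<Object> getArcs(Object node) {
        Set<Object> arcs = fNodes.get(node);
        if (arcs == null) {
            return new HashSet<Object>();
        }
        return new HashSet<Object>(arcs);
    }

    // Silently ignores arcs (and source nodes) that do not exist,
    // which is what removeNode relies on.
    public void removeArc(Object src, Object dst) {
        Set<Object> arcs = fNodes.get(src);
        if (arcs != null) {
            arcs.remove(dst);
        }
    }
}
```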
Reading Graphs
Write a parser to read graphs from files
Should be easy after the Makefile parser
Graph file format:
Blank lines and comments
Data lines containing:
Node name
Colon
Space-separated list of reachable nodes
Note: nodes may be implied
I.e., node may be listed as reachable, but not have its own entry
# Example graph file
a : b c
b : c
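A rough sketch of a parser for this format. It assumes that comments are whole lines starting with `#` (as in the example), that the first colon on a line separates the node name from its arcs, and that the result is a plain map of sets rather than a `DirectedGraph`. The class name `GraphFileSketch` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GraphFileSketch {
    // Parses lines of the form "node : dst1 dst2 ..." into a map of sets.
    static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> nodes = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Missing colon: " + raw);
            }
            String src = line.substring(0, colon).trim();
            Set<String> arcs = nodes.computeIfAbsent(src, k -> new LinkedHashSet<String>());
            String rest = line.substring(colon + 1).trim();
            if (!rest.isEmpty()) {
                for (String dst : rest.split("\\s+")) {
                    arcs.add(dst);
                    // Implied node: reachable, but possibly without its own entry
                    nodes.computeIfAbsent(dst, k -> new LinkedHashSet<String>());
                }
            }
        }
        return nodes;
    }
}
```

For the example file, node `c` never appears to the left of a colon, but it still ends up in the map with an empty arc set.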
Graph Algorithms
Crop up everywhere in computing
Because many problems are best represented as graphs
Most are recursive
Must handle circularity
A->B, B->C, C->A
Solution is to keep track of nodes already visited using a set
If node being examined is already in the set, don't recurse
Example: Reachability
public static boolean reachable(DirectedGraph graph, Object src, Object dst, Set seen) {
    // Are we there yet? (use equals(), not ==, since nodes are arbitrary objects)
    if (src.equals(dst)) {
        return true;
    }
    // Try to get there indirectly
    Iterator ia = graph.getArcs(src).iterator();
    while (ia.hasNext()) {
        Object next = ia.next();
        // Only recurse into nodes that have not been visited yet
        if (!seen.contains(next)) {
            seen.add(next);
            if (reachable(graph, next, dst, seen)) {
                return true;
            }
        }
    }
    return false;
}
Reachability in Action: 1
![[Reachability Diagram Stage 1]](../img/graph/reachability1.png)
Reachability in Action: 2
![[Reachability Diagram Stage 2]](../img/graph/reachability2.png)
Reachability in Action: 3
![[Reachability Diagram Stage 3]](../img/graph/reachability3.png)
Reachability in Action: 4
![[Reachability Diagram Stage 4]](../img/graph/reachability4.png)
How to Call
public static void caller() {
    DirectedGraph g = new DirectedGraph();
    // ...add nodes to graph...
    boolean b = reachable(g, "A", "Z", new HashSet());
}
Requiring users to create and pass in a set that is only used internally is clumsy
In practice, probably provide two methods:
public static boolean reachable(DirectedGraph graph, Object src, Object dst) {
    return reachable(graph, src, dst, new HashSet());
}
protected static boolean reachable(DirectedGraph graph, Object src, Object dst, Set seen) {
    ...as before...
}
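A self-contained sketch of the two-method pattern, run against a graph with a cycle. It assumes a plain map of sets as a stand-in for `DirectedGraph`, and the class name `ReachabilitySketch` is hypothetical. As a small variation on the code above, it also puts the starting node into the seen set before recursing.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ReachabilitySketch {
    // Public entry point: callers never see the bookkeeping set.
    static boolean reachable(Map<String, Set<String>> graph, String src, String dst) {
        Set<String> seen = new HashSet<>();
        seen.add(src);
        return reachable(graph, src, dst, seen);
    }

    private static boolean reachable(Map<String, Set<String>> graph,
                                     String src, String dst, Set<String> seen) {
        if (src.equals(dst)) {
            return true;
        }
        Set<String> arcs = graph.getOrDefault(src, Collections.<String>emptySet());
        for (String next : arcs) {
            // add() returns false if 'next' was already in the set
            if (seen.add(next)) {
                if (reachable(graph, next, dst, seen)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

On the circular graph A->B, B->C, C->A, a search from A for an unreachable node terminates and returns false, because each node enters the seen set at most once.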
Example: Shortest Path
Can't just keep adding nodes to seen set
Instead, have to add and remove nodes to keep track of current path
public static List shortest(DirectedGraph graph, Object src, Object dst, Set seen) {
    // Are we there yet? (use equals(), not ==, since nodes are arbitrary objects)
    if (src.equals(dst)) {
        List result = new ArrayList();
        result.add(src);
        return result;
    }
    ...Look at lengths of all paths starting from here...
    // If a valid path exists, put the starting node on it
    if (best != null) {
        best.add(0, src);
    }
    return best;
}
Shortest Path (cont.)
// Look at lengths of all paths starting from here
List best = null;
Iterator ia = graph.getArcs(src).iterator();
while (ia.hasNext()) {
    Object node = ia.next();
    if (!seen.contains(node)) {
        seen.add(node);
        List path = shortest(graph, node, dst, seen);
        seen.remove(node);
        if (path != null) {
            if ((best == null) || (path.size() < best.size())) {
                best = path;
            }
        }
    }
}
Notes on Shortest Path
Set of nodes already seen grows and shrinks
May be several ways to reach a node
Never include more than once in any path
Question: what happens if nodes accidentally left in set?
Note how the case where src and dst are the same node is handled
shortest(a, a) produces [a], not [a,a]
Ensures that paths of length greater than one are constructed correctly
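The two halves of the shortest-path method can be assembled into one runnable sketch. As before, a map of sets stands in for `DirectedGraph`, the class name `ShortestPathSketch` is hypothetical, and the starting node is placed in the seen set up front so that it cannot reappear later in a path.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ShortestPathSketch {
    // Returns the nodes on a shortest path from src to dst, or null if none exists.
    static List<String> shortest(Map<String, Set<String>> graph, String src, String dst) {
        Set<String> seen = new HashSet<>();
        seen.add(src);
        return shortest(graph, src, dst, seen);
    }

    private static List<String> shortest(Map<String, Set<String>> graph,
                                         String src, String dst, Set<String> seen) {
        if (src.equals(dst)) {
            List<String> result = new ArrayList<>();
            result.add(src);
            return result;
        }
        List<String> best = null;
        Set<String> arcs = graph.getOrDefault(src, Collections.<String>emptySet());
        for (String node : arcs) {
            if (!seen.contains(node)) {
                seen.add(node);    // node is on the current path...
                List<String> path = shortest(graph, node, dst, seen);
                seen.remove(node); // ...and comes off it again afterwards
                if ((path != null) && ((best == null) || (path.size() < best.size()))) {
                    best = path;
                }
            }
        }
        if (best != null) {
            best.add(0, src); // put the starting node on the front
        }
        return best;
    }
}
```

This explores every simple path, so its running time can grow exponentially with the size of the graph; it is meant to illustrate the grow-and-shrink bookkeeping, not to be efficient.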
A Note on Testing
Test graph algorithms by checking output for sample input
But looking at output repeatedly is tedious and error-prone
Better solution: build a tool
Read file containing expected output
Read actual output from program
Report whether there are differences
How to get program's output to the comparison program?
Could save to a temporary file
Better to use a Unix pipe
See exmpl/graph/TextDiff.java for comparison tool
Makefile contains lines like these:
JAVA_FLAGS = -classpath ${SCJAVAPATH}
JAVA_RUN = java ${JAVA_FLAGS} -enableassertions
RUN_ALG = ${JAVA_RUN} GraphAlgorithms
RUN_DIFF = ${JAVA_RUN} TextDiff
${RUN_ALG} shortest empty.grf | ${RUN_DIFF} shortest_empty.txt
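The comparison tool itself is not shown in these notes. The sketch below is one guess at how such a tool could work: it reads the expected output from the file named on the command line, reads the actual output from standard input (the receiving end of the pipe), and reports the first differing line. The class name `TextDiffSketch`, the messages, and the exit status are all assumptions, not the behavior of the real `TextDiff.java`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

class TextDiffSketch {
    // Returns the 1-based number of the first differing line, or 0 if identical.
    static int firstDifference(List<String> expected, List<String> actual) {
        int common = Math.min(expected.size(), actual.size());
        for (int i = 0; i < common; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i + 1;
            }
        }
        // Same prefix: they differ only if one has extra lines
        return (expected.size() == actual.size()) ? 0 : common + 1;
    }

    public static void main(String[] args) throws IOException {
        List<String> expected = Files.readAllLines(Paths.get(args[0]));
        BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
        List<String> actual = stdin.lines().collect(Collectors.toList());
        int line = firstDifference(expected, actual);
        if (line == 0) {
            System.out.println("pass");
        } else {
            System.out.println("FAIL: first difference at line " + line);
            System.exit(1);
        }
    }
}
```

Exiting with a non-zero status on a mismatch lets Make stop at the first failing test.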
$Id: graph.html,v 1.1.1.1 2004/01/04 05:02:31 reid Exp $