=========================================================================== CSC 373H1 / L0101 Tutorial Notes -- Week 3 Winter 2012 =========================================================================== This tutorial is based on the PDF steiner_santuari.pdf. For details, see those notes. I will also post those notes on the course homepage. Introduce the graphical Steiner tree problem (ST) (problem 38, Chp 8, text). Defn: Graphical Steiner tree problem ST(G,R,k): Given a graph G = (V, E), a subset of vertices R contained in V, and an integer k, does their exist a subgraph T = (W, F) of G (i.e. W contained in V, F contained in E) such that: a) R subset equals W, b) T is a connected graph, b) |F| <= k (i.e., T has at most k edges). Ask: 1. What is the definition of "connected"? 2. What is an algorithm for detemining if a graph G is connected? 2a. What is the runtime? ("reachability", BFS, O(|V| + |E|), from CSC263.) We wish to sketch the proof that ST is NP-complete, by following the recipe: A) Show ST is in NP B) Choose an NP-complete problem X. C) Show X <=_p ST. =================== Step A: Sketch that ST is in NP: Note ST is a decision problem... good so far. Given the string s representing G = (V, E), R subset V, and k, define |s| to be |V| + |E|. For (G,R,k) in ST define a certificate t to denote a subgraph T = (W, F) where |t| is defined to be |W| + |F|, and all of the following are satisfied: a) T is a subgraph of G (W contained in V and F contained in E) b) R subset equal W c) |F| <=k d) T is a tree. Show that this choice of t satisfies the conditions for a poly-time certificate. Specifically: there exists a polynomial runtime algorithm C((G,R,k), T) (i.e., C runs in time O(p(|s| + |t|)) such that: (G,R,k) in ST (*) iff there exists T = (W,F), a subgraph of G (so |t| <= |s|), and C((G,R,k), T) is true. In particular, define the algorithm C to check all the conditions a-d above, and return true iff they are all satisfied. ASK: Why is the equivalence (*) satisfied? In particular, given a (G,R,k) in ST, the definition of ST did not require that the subgraph T = (W,F) formed a tree. ASK: What is the definition of a tree? ASK: Why can we limit our attention to T = (W,F) being a tree? If T was a connected graph containing a cycle, we could remove an edge from it and maintain connectedness. This reduces the number of edges by one. We are asking for a solution with no more than k edges, so the removal of one edge doesn't hurt. So WLOG we can assume, for any solution T of the ST problem, T is a tree. This argument shows the forward direction of the implication (*). The reverse direction follows from the definition of C (i.e. the conditions a-d), the fact that a tree is a connected graph, and the ST problem. Why does C(s,t) run in poly-time? (Recall |s| = |V| + |E|, |t| = |W|+|F| <= |s|). We can clearly check conditions (a) through (c) in time O(|s|). What about (d)? Use BFS in T, checking all vertices in W are reachable (from any vertex in W). Time O(|t|). Therefore C(s,t) is poly-time. ================== Step B: Choose an NP-complete problem Say the NP-complete problem: Exact cover by 3-sets (X3C). This is a generalization of 3-Dimensional Matching discussed on p.481 of the text. (Remark that we are assuming they have/will read all of chapter 8.) =========== STEP C: Show that X3C <=_p ST See steiner_santuari.pdf for details of the reduction. Sketch the graph G in Figure 1 The graph can be generated in polytime. Here s specifies a X3C problem, say |s| = q, and poly-time refers to p(|s|). Lemma: Suppose (X,C) is an instance of the X3C problem, and suppose (G,R,k) is the Steiner tree instance constructed from (X,C) using the above procedure. Then: (**) (X,C) in X3C iff (G,R,k) in ST See pdf for => implication. For the reverse, <=, implication I would add a couple of details to the provided proof, replacing the following quotes: Line 4: "so T contains at most q c-nodes" so the number of c-nodes in T, nc, satisfies: nc <= (4q+1) - (3q+1) (Total number of nodes, minus |R| = |X|+|v|) = q Line 6: "...impossible to hit all the 3q x-nodes if the tree contains less than 4q+1 nodes." ....impossible to hit all the 3q x-nodes if the tree contains less than q c-nodes. Since the tree must also contain X and v, we have that the tree must contain at least 3q + q + 1 = 4q+1 nodes. Any tree with 4q+1 nodes has 4q edges. Completes the proof that ST is NP-complete =========== If time: ASK: What happens if we tried a similar reduction from exact set cover, i.e. set cover with the restriction that the sets in the cover cannot intersect? Where did we need to use the fact that we had sets with exactly 3 elements? ASK: Could we use the similar style reduction beginning with 3-Dim Matching instead of X3C? Recall: 3-Dim Matching: Given disjoint sets X,Y,Z, each of size n, and given a set C subset equals X x Y x Z (i.e. C is a set of ordered triples), does there exist a subset S of C such that |S| = n, and each element of X U Y U Z is contained in exactly one element of S.