===========================================================================
CSC 363         Lecture Summary for Week 10                     Winter 2008
===========================================================================

Examples of NP-completeness:

  - Definition: a propositional formula is in Conjunctive Normal Form (CNF)
    if it is written as a conjunction ("and") of one or more "clauses":
        C_1 /\ C_2 /\ ... /\ C_r
    where each clause is a disjunction ("or") of one or more "literals":
        C_j = (a_{j,1} \/ a_{j,2} \/ ... \/ a_{j,s_j})
    where each literal is either a propositional variable or the negation
    of a propositional variable.

    For example, (p \/ ~q) /\ q is in CNF, but ~(p /\ q) /\ (r \/ ~p) is
    NOT in CNF (it is equivalent to some CNF formula, but the property of
    being in CNF is all about the way the formula is written).

    The proof of the Cook-Levin Theorem actually shows that CNF-SAT is NPc
    (where CNF-SAT = { <F> : F is a formula in CNF that is satisfiable }),
    because it's possible to construct the formula F_x to be in CNF.

  - Definition: a propositional formula is in 3-CNF if it is in CNF where
    each clause contains exactly 3 literals, e.g.,
        (p \/ q \/ ~r) /\ (p \/ p \/ ~q)

    3SAT = { <F> : F is a formula in 3-CNF that is satisfiable } is NPc:
      . 3SAT (- NP because it's a special case of SAT (same verifier
        works).
      . 3SAT is NP-hard because CNF-SAT <=p 3SAT: reduction was written up
        and posted along with summary of lecture notes for last week.

    Note: Careful with directions! Trivially, 3SAT <=p CNF-SAT (3SAT is a
    special case of CNF-SAT). But we need the other direction, transforming
    instances of the general problem into instances of the restricted
    problem.

  - VERTEX-COVER (VC for short) = { <G,k> : G is a graph that contains a
    vertex cover of size k, i.e., a set C of k vertices such that each edge
    of G has at least one endpoint in C }

      . VC (- NP: certificate = vertex cover of size k.
      . VC is NP-hard: 3SAT <=p VC.
        Given F = (a1 \/ b1 \/ c1) /\ ...
        /\ (ar \/ br \/ cr), where ai,bi,ci in {x1,~x1,x2,~x2,...,xs,~xs},
        construct G=(V,E) and k such that F is satisfiable iff G contains a
        vertex cover of size k, as follows:
            k = s + 2r
            V = { a1,b1,c1, ..., ar,br,cr, x1,~x1, ..., xs,~xs }
            E = { (xi,~xi) : 1 <= i <= s } U
                { (ai,bi),(bi,ci),(ci,ai) : 1 <= i <= r } U
                { (l,x) : l = ai or bi or ci, and
                          x = xj or ~xj corresponding to l }

        For example, if
            F = (x1 \/ ~x2 \/ ~x4) /\ (x2 \/ ~x3 \/ x1) /\ (~x3 \/ x4 \/ ~x2),
        then
            a1=x1,  b1=~x2, c1=~x4,
            a2=x2,  b2=~x3, c2=x1,
            a3=~x3, b3=x4,  c3=~x2
        so
            k = 4 + 2*3 = 10
            V = {a1,b1,c1, a2,b2,c2, a3,b3,c3,
                 x1,~x1, x2,~x2, x3,~x3, x4,~x4}
            E = { (x1,~x1), (x2,~x2), (x3,~x3), (x4,~x4),
                  (a1,b1), (b1,c1), (c1,a1), (a1,x1), (b1,~x2), (c1,~x4),
                  (a2,b2), (b2,c2), (c2,a2), (a2,x2), (b2,~x3), (c2,x1),
                  (a3,b3), (b3,c3), (c3,a3), (a3,~x3), (b3,x4), (c3,~x2) }

        Clearly, the construction can be done in polytime (with one scan of
        F).

        Also, if F is satisfiable, then there is an assignment of truth
        values that makes at least one literal in each clause true. Pick a
        cover C as follows: for each variable, C contains xi or ~xi,
        whichever is true under the truth assignment; for each clause, C
        contains every vertex of the clause's triangle except one whose
        literal is true (pick arbitrarily if more than one true literal).
        C contains exactly s+2r vertices and is a cover: all edges (xi,~xi)
        are covered; all edges in clause triangles are covered (because we
        picked two vertices from each triangle); all edges between
        "clauses" and "variables" are covered (two by the triangle vertices
        in C, the third by the true literal's "variable" vertex, which is
        in C).

        Finally, if G contains a cover C of size k=s+2r, C must contain at
        least one of xi or ~xi for each i (because of edges (xi,~xi)) and
        at least two of ai,bi,ci for each i (because of the triangle), so
        the only way for C to have size s+2r is to contain exactly one of
        xi or ~xi and exactly two of ai,bi,ci, for each i.
        Since C covers all edges with only two vertices per triangle, the
        third vertex in each triangle must have its "outside" edge covered
        by the xi or ~xi at its other end, which is therefore in C. If we
        set literals according to the choices of xi or ~xi in C, this will
        make formula F true: at least one literal will be true in each
        clause (because at least one edge from "variables" to "clauses" is
        covered by the variable in C).

  - INDEPENDENT-SET = { <G,k> : G is an undirected graph that contains an
    independent set of size k, i.e., a set of k vertices with NO edge
    between any two of them }

    INDEPENDENT-SET (IS) is NPc:
      . in NP: certificate = independent set of size k.
      . NP-hard: VC <=p IS:
        Given <G,k>, construct <G',k'> as follows: G' = G, k' = n-k, where
        n is the number of vertices of G.
        Clearly this can be done in polytime.
        Also, if G contains a vertex cover of size k, the vertices outside
        the cover form an independent set of size n-k (no edge can have
        both endpoints outside the cover).
        Finally, if G' contains an independent set of size n-k, the
        vertices outside the independent set form a vertex cover of size k
        (every edge has at least one endpoint outside the independent set,
        i.e., in the cover).

  - CLIQUE is NPc: Already known in NP. For NP-hardness, use reduction from
    INDEPENDENT-SET:
    On input <G,k>, construct <G',k'> as follows:
        G' = complement of G (same vertices, edge in G' iff edge not in G)
        k' = k
    This construction can obviously be carried out in polytime. Moreover,
    if G contains an I.S. of size k, then the same set in G' will be a
    clique of size k; conversely, if G' contains a clique of size k, the
    same set in G will be an I.S. of size k.

    Note: textbook uses a different reduction, from 3SAT.

  - SUBSET-SUM (SS) is NPc: Already known in NP. For NP-hardness, show
    3SAT <=p SS:
    Given formula F = (a1 \/ b1 \/ c1) /\ ... /\ (ar \/ br \/ cr) where
    ai,bi,ci in {x1,~x1,...,xs,~xs}, construct numbers (written in base 10)
    as follows:
      . For j = 1,...,s,
        number xj = 1 followed by s-j 0s followed by r digits, where the
        k-th of these r digits equals 1 if xj appears in clause C_k, 0
        otherwise;
        number ~xj = 1 followed by s-j 0s followed by r digits, where the
        k-th of these r digits equals 1 if ~xj appears in clause C_k, 0
        otherwise.
      . For j = 1,...,r,
        number Cj = 1 followed by r-j 0s and
        number C'j = 2 followed by r-j 0s.
      . Target t = s 1s followed by r 4s.
    Clearly, this can be constructed in polytime.

    Example of reduction for
        F = (x1 \/ ~x2 \/ ~x4) /\ (x2 \/ ~x3 \/ x1) /\ (~x3 \/ x4 \/ ~x2):
        S = {  x1 = 1000110,  ~x1 = 1000000,
               x2 = 0100010,  ~x2 = 0100101,
               x3 = 0010000,  ~x3 = 0010011,
               x4 = 0001001,  ~x4 = 0001100,
               C1 = 0000100,  C'1 = 0000200,
               C2 = 0000010,  C'2 = 0000020,
               C3 = 0000001,  C'3 = 0000002 }
        t = 1111444

    Note: slightly different from book to ensure S contains all distinct
    numbers (book's reduction constructs S with repeated numbers, so S is
    not really a set).

    If F is satisfiable, then there is a setting of variables such that
    each clause of F contains at least one true literal. Consider the
    subset S' = {numbers that correspond to true literals}. By
    construction, \sum_{x (- S'} x = s 1s followed by r digits, each one of
    which is either 1, 2, or 3 (because each clause contains at least one
    true literal). This means it is possible to add suitable numbers from
    {C1,C'1,...,Cr,C'r} so that the last r digits of the sum are all equal
    to 4, i.e., there is a subset S'' of S (S' together with these numbers)
    such that \sum_{x (- S''} x = t.

    Conversely, note that adding numbers from S never produces a carry:
    each of the first s digit positions is nonzero in only two numbers (xj
    and ~xj), and each of the last r positions adds up to at most
    3 + 1 + 2 = 6 over all of S. So sums can be compared with t digit by
    digit. If there is a subset S' of S such that \sum_{x (- S'} x = t,
    then S' must contain exactly one of {xj,~xj} for j = 1,...,s, because
    that is the only way for the numbers in S' to add to the target (with a
    1 in each of the first s digits).
    Then, F is satisfied by setting each variable according to the numbers
    in S' (xj is true if number xj (- S', false if number ~xj (- S'): for
    each clause j, the corresponding digit in the target is equal to 4 but
    the numbers Cj and C'j together only add up to 3 in that digit; this
    means that the selection of numbers in S' must include some literal
    with a 1 in that digit, i.e., clause j contains at least one true
    literal.
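
    The 3SAT <=p SS construction above can be sketched in code. This is
    only an illustration, not part of the lecture: the function names and
    the encoding of literals as signed integers (+j for xj, -j for ~xj)
    are assumptions made here. The first function builds the numbers and
    the target; the second follows the "if F is satisfiable" direction of
    the proof to pick a subset that sums to t.

```python
def reduce_3sat_to_subset_sum(clauses, s):
    """Build the SUBSET-SUM instance described in the notes.

    clauses: list of 3-tuples of nonzero ints; +j means xj, -j means ~xj
             (this encoding is an assumption of this sketch).
    s:       number of variables x1..xs.
    Returns (numbers, target), where numbers maps a label to its value.
    """
    r = len(clauses)
    numbers = {}
    for j in range(1, s + 1):
        for sign, label in ((+1, f"x{j}"), (-1, f"~x{j}")):
            digits = ["0"] * (s + r)
            digits[j - 1] = "1"              # the 1 in variable position j
            for k, clause in enumerate(clauses):
                if sign * j in clause:       # literal appears in clause k+1
                    digits[s + k] = "1"
            numbers[label] = int("".join(digits))
    for k in range(1, r + 1):
        numbers[f"C{k}"] = 10 ** (r - k)         # 1 followed by r-k 0s
        numbers[f"C'{k}"] = 2 * 10 ** (r - k)    # 2 followed by r-k 0s
    target = int("1" * s + "4" * r)
    return numbers, target


def subset_from_assignment(clauses, s, assignment):
    """Given a satisfying assignment {j: bool}, list the labels of a
    subset that sums to the target, as in the first half of the proof."""
    chosen = [f"x{j}" if assignment[j] else f"~x{j}" for j in range(1, s + 1)]
    for k, clause in enumerate(clauses, start=1):
        # Count distinct true literals: this is the clause digit so far.
        true_count = sum(1 for lit in set(clause)
                         if assignment[abs(lit)] == (lit > 0))
        if true_count == 0:
            raise ValueError(f"clause {k} is not satisfied")
        pad = 4 - true_count                 # 1, 2, or 3
        if pad in (1, 3):
            chosen.append(f"C{k}")
        if pad in (2, 3):
            chosen.append(f"C'{k}")
    return chosen


# The example formula from the notes:
# (x1 \/ ~x2 \/ ~x4) /\ (x2 \/ ~x3 \/ x1) /\ (~x3 \/ x4 \/ ~x2)
F = [(1, -2, -4), (2, -3, 1), (-3, 4, -2)]
numbers, t = reduce_3sat_to_subset_sum(F, 4)
picked = subset_from_assignment(F, 4, {1: True, 2: False, 3: False, 4: False})
assert sum(numbers[label] for label in picked) == t
```

    On the example formula, the sketch reproduces the set S and the target
    t = 1111444 listed above (leading zeros are dropped, since the values
    are stored as integers). The assignment used in the last lines is one
    satisfying assignment chosen for illustration.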