===========================================================================
CSC 363               Lecture Summary for Week  9               Winter 2008
===========================================================================

More examples of languages in NP.

  - SUBSET-SUM = { <S,t> : S = {x_1,x_2,...,x_k} is a set of positive
        integers and there is a subset {y_1,y_2,...,y_j} (_ S such that
        \sum y_i = t }

    For example, S={4,22,10,8}, t=14 belongs to SUBSET-SUM (4 + 10 = 14),
    but S={4,22,10,8}, t=13 does not belong to SUBSET-SUM (every element
    of S is even, so every subset sum is even).

    SUBSET-SUM (- P? Unknown (checking all 2^k subsets of S is not
    polytime).

    SUBSET-SUM (- NP? Yes:
    Verifier = "On input <S,t,c>:
        1. Check that c encodes a set of numbers.
        2. Check that c (_ S.
        3. Check that \sum_{y (- c} y = t.
        Accept if all checks pass, reject otherwise."
    Clearly runs in polytime (addition of numbers is polytime), and
    if <S,t> (- SUBSET-SUM, then there is some value of c such that
    verifier accepts <S,t,c> (c = subset whose sum equals t);
    if <S,t> !(- SUBSET-SUM, then verifier rejects <S,t,c> for all values
    of c (there is no subset that works).

  - CLIQUE = { <G,k> : G is an undirected graph that contains a k-clique
        -- a subset of k vertices with all edges between them }

      a---b    For example, the graph pictured on the left contains a
      |\ /|    3-clique (there are sets of 3 vertices with all edges
      | c |    between them, e.g., {a,b,c}), but it does not contain a
      |/ \|    4-clique (every set of 4 vertices is missing at least one
      d---e    edge, e.g., {a,b,c,d} is missing (b,d)).

    CLIQUE (- P? Unknown (checking all possible subsets not polytime
    because k not fixed, part of input).

    CLIQUE (- NP? Yes:
    Verifier = "On input <G,k,c>:
        1. Check that c encodes a set of k vertices.
        2. Check that every vertex in c belongs to G.
        3. Check that G contains the edge between every pair of
           vertices from c.
        Accept if all checks pass, reject otherwise."
    Verifier runs in polytime:
        1. checking encodings can be done in polytime,
        2. time O(kn), where n = number of vertices of G,
        3. time O(k^2 n^2) (O(k^2) pairs in c, time O(n^2) for each one).
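    The two verifiers described above can be sketched in Python as
    follows. This is only an illustration under assumptions that are not
    part of the notes: S and c are given as Python lists, G is given as a
    list of vertices plus a list of edges, and the function names are
    hypothetical.

```python
from itertools import combinations


def verify_subset_sum(S, t, c):
    """Sketch of the SUBSET-SUM verifier on input <S,t,c>."""
    # 1. c must encode a set of numbers (integers, no repeats).
    if not all(isinstance(y, int) for y in c) or len(set(c)) != len(c):
        return False
    # 2. c must be a subset of S.
    if not set(c) <= set(S):
        return False
    # 3. The elements of c must sum to t.
    return sum(c) == t


def verify_clique(vertices, edges, k, c):
    """Sketch of the CLIQUE verifier on input <G,k,c>, G = (vertices, edges)."""
    # 1. c must encode a set of exactly k distinct vertices.
    if len(c) != k or len(set(c)) != k:
        return False
    # 2. Every vertex in c must belong to G.
    if not set(c) <= set(vertices):
        return False
    # 3. G must contain the edge between every pair of vertices from c.
    #    (Assumption: edges are stored in a hash set, which is faster than
    #    the O(n^2) scan per pair used in the analysis above.)
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(c, 2))


if __name__ == "__main__":
    S = [4, 22, 10, 8]
    print(verify_subset_sum(S, 14, [4, 10]))

    # The example graph from the notes.
    V = ["a", "b", "c", "d", "e"]
    E = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
         ("b", "e"), ("c", "d"), ("c", "e"), ("d", "e")]
    print(verify_clique(V, E, 3, ["a", "b", "c"]))
    print(verify_clique(V, E, 4, ["a", "b", "c", "d"]))
```

    Note that neither function searches for a certificate: each only
    checks the one value of c it is handed, which is why it runs in
    polytime.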
    If <G,k> (- CLIQUE, then verifier accepts <G,k,c> when c = a k-clique
    of G; if <G,k> !(- CLIQUE, then no value of c makes verifier accept.

  - Contrast with TRIANGLE = { <G> : G contains a triangle }:
    TRIANGLE (- NP: On input <G,c>, check c encodes a triangle in G.
    But TRIANGLE (- P: There are only (n choose 3) = O(n^3) triplets of
    vertices to check, and each one can be checked in linear time, for
    total time O(n^4).
    What's the difference? Same argument with CLIQUE gives that it takes
    time O(n^{k+1}), except that k is part of the input (instead of being
    fixed) so this could be as bad as, e.g., O(n^{n/2}) -- definitely not
    polytime.

Notes:

  - ~HAMPATH, ~CLIQUE, ~SUBSET-SUM (complements) don't appear to belong
    to NP: apparently, no way to give a short certificate of
    NON-membership in HAMPATH, CLIQUE, or SUBSET-SUM.

  - On the other hand, ~COMPOSITES = PRIMES can be shown to belong to NP
    (using number theory). In fact, a recent research result (Agrawal,
    Kayal, Saxena 2002) showed that PRIMES is actually in P (for more
    details, see http://crypto.cs.mcgill.ca/~stiglic/PRIMES_P_FAQ.html).

Definition: coNP = { ~L : L (- NP } = { complements of languages in NP }.

Note: coNP != ~NP!
    L (- coNP iff ~L (- NP   but   L (- ~NP iff L !(- NP
    (for example, A_TM (- ~NP but A_TM !(- coNP).

[Picture: P (_ (NP n coNP) (_ DECIDABLE; analogy with computability:
DECIDABLE = RECOGNIZABLE n coRECOGNIZABLE.
Notation: 'A n B' = 'A intersection B'.]

Open question: P ?= NP n coNP   (No strong consensus.)
Open question: NP ?= coNP       (Strongly believed to be NO.)
Open question: P ?= NP          (Strongly believed to be NO.)

Settling P ?= NP is worth 1 million dollars! (It is one of the
"Millennium Prize Problems" recognized by the Clay Mathematics Institute.)

---------------------
Polytime reducibility
---------------------

Defn: Language A is "polytime reducible" to language B (written A <=p B)
    if there is a polytime computable function f : \Sigma* -> \Sigma*
    such that for all w (- \Sigma*, w (- A iff f(w) (- B.
Almost identical to many-one reducibility, with added constraint that f
can be computed in polytime. In fact, most (if not all) reductions we've
seen so far have been polytime.

Just like <=m, think of "<=p" as comparing the difficulty of deciding the
languages. So A <=p B intuitively says "A is no more difficult to solve
than B" or equivalently, "B is at least as hard to solve as A".

Theorem: A <=p B and B (- P (or NP) implies A (- P (or NP).
    Main proof idea: On input x (or <x,c>), compute f(x), in polytime,
    then run decider for B on f(x) (or verifier for B on <f(x),c>), in
    polytime.

Corollary: A <=p B and A !(- P (or NP) implies B !(- P (or NP).

Just like for decidability/recognizability, one example of a language not
in P could be used to prove more, using <=p.
Problem: such languages are hard to find, and the only known examples are
outside NP... We want to focus on NP because it contains a vast majority
of problems from "real life" applications.

Idea: try to identify "hardest" problems in NP.

Defn: Language A is "NP-complete" (NPc) if
      - A (- NP,
      - B <=p A for all B (- NP (A is "NP-hard").

Notes:

  - NP-hardness (the condition that B <=p A for all B (- NP) is different
    from A (- NP: it's possible for a language to be NP-hard *without*
    belonging to NP -- for instance, A_TM is NP-hard (left as an
    exercise).

  - This is a very "ambitious" definition: the condition of NP-hardness
    requires us to prove that B <=p A for ALL B (- NP, and this is a
    strong requirement -- it's not clear that there's any language A for
    which it's possible to do this.

  - However, the definition is useful, as shown in the next theorem.

Theorem: If A is NPc, then A (- P iff P = NP.
    Proof:
      - If P = NP, then A NPc -> A (- NP -> A (- P.
      - If A (- P, then A NPc -> for all B (- NP, B <=p A -> B (- P
        (because A (- P), i.e., NP (_ P so P = NP.

Corollary: If P != NP and A is NPc, then A !(- P.

So proving NP-completeness is "as good as" proving not in P.
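The proof idea of the earlier theorem in this section (A <=p B and B (- P
implies A (- P) amounts to composing the reduction with the decider. A
minimal sketch in Python, using toy languages chosen only for
illustration and not taken from the notes: A = binary strings with an
even number of 1s, B = decimal strings denoting even numbers, and
f(w) = the number of 1s in w written in decimal. All three function names
are hypothetical.

```python
def f(w: str) -> str:
    """Polytime reduction from A to B: w is in A iff f(w) is in B."""
    return str(w.count("1"))


def decide_B(y: str) -> bool:
    """Polytime decider for the toy language B (even decimal numbers)."""
    return int(y) % 2 == 0


def decide_A(w: str) -> bool:
    """Decider for A obtained exactly as in the proof idea:
    compute f(w), then run the decider for B on the result."""
    return decide_B(f(w))


if __name__ == "__main__":
    for w in ["", "1001", "1011"]:
        print(repr(w), decide_A(w))
```

Both steps run in polytime, and f(w) has length polynomial in |w|, so the
composition runs in polytime as well.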
---------------
NP-completeness
---------------

  - SAT = { <F> : F is a propositional formula that is satisfiable, i.e.,
        there is some assignment of truth-values to the variables of F
        that makes F true }

    For example, (p /\ q) -> (p \/ r) is satisfiable, while p /\ ~p is
    not satisfiable.

  - Cook-Levin Theorem: SAT is NPc

    SAT (- NP: Given <F,c>, check that c encodes an assignment of truth
    values to the variables of F that makes F evaluate to TRUE. This can
    be done in polytime, and F is satisfiable iff there is some value of
    c that makes the verifier accept.

    SAT is NP-hard: (high-level idea only)
    For an arbitrary language A (- NP, by definition, there is some NTM
    M_A that decides A in time <= C n^k for some constants C, k. Given an
    input x, we can construct a formula F_x that describes possible
    computation paths (of length at most C |x|^k) of M_A on x, such that
    F_x is satisfiable iff there is some computation path of M_A that
    accepts x. Intuitively, we are simulating the TM model of computation
    using propositional formulas, which are similar to digital circuits.
    Details are in the textbook and are needed to ensure that this can be
    done in polytime.

In general, to prove A is NP-hard, it's sufficient to show B <=p A for
some B known to be NP-hard: if B <=p A then for all L (- NP, L <=p B (by
definition of NP-hardness for B) so L <=p A (since <=p is transitive,
something you can prove as an easy exercise).

Template for proofs of NP-completeness: To show A is NPc, prove that

    A (- NP: Describe a polytime verifier for A:
            "Given <x,c>, check that c..."
        Argue that verifier runs in polytime and that x (- A iff verifier
        accepts <x,c> for some c.

        Note that all languages in NP we've seen so far have a similar
        structure to their definition: "the set of objects A for which
        there is some related object B such that some property holds
        about A and B" -- for example, CLIQUE = "the set of undirected
        graphs G and integers k for which there is a subset of vertices C
        such that C makes up a k-clique in G".
        For all such languages, the verifier will also have a common
        structure: "on input <A,c>, check that c encodes an object B and
        that A and B have the required property". Because of the way
        languages are defined, this guarantees <A,c> is accepted for some
        c iff <A> belongs to the language. All that remains is to ensure
        checking property of A,B can be done in polytime.

    A is NP-hard: Show B <=p A for some NP-hard B:
            "Given y, construct x_y as follows: ..."
        Argue that construction can be carried out in polytime and that
        y (- B iff x_y (- A. To recap:
          . start with arbitrary input y for B,
          . describe explicit construction of specific input x_y for A,
          . argue construction can be carried out in polytime,
          . argue if y (- B, then x_y (- A,
          . argue if x_y (- A, then y (- B
            (or if y !(- B, then x_y !(- A).
        Watch that last step! The argument starts from x_y constructed
        earlier (not from arbitrary x input for A), and relates it to
        arbitrary y that x_y was constructed from.
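
As a concrete instance of the first half of the template (membership in
NP), the SAT verifier from the Cook-Levin discussion can be sketched in
Python. The encoding is an assumption made only for this sketch: a
formula F is a nested tuple such as ("and", F1, F2), ("or", F1, F2),
("implies", F1, F2), ("not", F1) or ("var", "p"), and the certificate c
is a dict from variable names to booleans. The function names are
hypothetical.

```python
def variables(F):
    """Return the set of variable names occurring in formula F."""
    if F[0] == "var":
        return {F[1]}
    return set().union(*(variables(sub) for sub in F[1:]))


def evaluate(F, c):
    """Evaluate F under assignment c (one pass over the formula)."""
    op = F[0]
    if op == "var":
        return c[F[1]]
    if op == "not":
        return not evaluate(F[1], c)
    if op == "and":
        return evaluate(F[1], c) and evaluate(F[2], c)
    if op == "or":
        return evaluate(F[1], c) or evaluate(F[2], c)
    if op == "implies":
        return (not evaluate(F[1], c)) or evaluate(F[2], c)
    raise ValueError("unknown connective: %r" % (op,))


def verify_sat(F, c):
    """Sketch of the SAT verifier on input <F,c>."""
    # Check that c encodes a truth assignment to all variables of F.
    if not isinstance(c, dict):
        return False
    if not all(v in c and isinstance(c[v], bool) for v in variables(F)):
        return False
    # Check that the assignment makes F evaluate to TRUE.
    return evaluate(F, c)


if __name__ == "__main__":
    p, q, r = ("var", "p"), ("var", "q"), ("var", "r")
    F1 = ("implies", ("and", p, q), ("or", p, r))   # (p /\ q) -> (p \/ r)
    F2 = ("and", p, ("not", p))                     # p /\ ~p
    print(verify_sat(F1, {"p": True, "q": True, "r": False}))
    print(any(verify_sat(F2, {"p": b}) for b in (True, False)))
```

The last line of the usage example tries both certificates for p /\ ~p
by brute force; that search is not part of the verifier, which only
checks the single c it is given.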