===========================================================================
CSC 363             Lecture Summary for Week 13                 Winter 2008
===========================================================================

Back to P, NP, PSPACE:

  - Why do we care? Despite our belief that P != NP != PSPACE, the classes
    still correspond to some notion of efficiency, and NP contains the vast
    majority of problems that arise from "real-life" applications.

  - Most likely: P ( NP ( PSPACE (using '(' for "strict subset of").

  - Inside PSPACE:

    Defn: A "TM with oracle for language A" is a 2-tape TM with special
    states q_Q, q_Y, q_N. When the TM enters state q_Q with string w on
    its second tape, the next state becomes q_Y if w (- A, q_N if w !(- A.
    This formalizes the notion of "what would be possible if we could
    decide A".

    For complexity classes C, D, C^D represents the languages recognizable
    within class C by TMs with oracles for some language in class D.
    For example,

        P^P  = { languages recognizable in polytime by TMs with an oracle
                 for some polytime language }
             = P (oracle not necessary as answer can be computed in
                 polytime)

        NP^P = { languages recognizable in polytime by NTMs with an oracle
                 for some polytime language }
             = NP (oracle not necessary as answer can be computed in
                 polytime)

        P^NP = { languages recognizable in polytime by TMs with an oracle
                 for some language in NP }

    P^NP is most likely different from P (e.g., SAT (- P^NP), but also
    from NP and coNP: NP subset of P^NP and also coNP subset of P^NP
    (query the oracle once and, for coNP, negate its answer).

  - Polynomial Hierarchy:

    Definition: \Delta^p_0 = \Sigma^p_0 = \Pi^p_0 = P, and for all k >= 0,

        \Delta^p_{k+1} = P^{\Sigma^p_k}
        \Sigma^p_{k+1} = NP^{\Sigma^p_k}
        \Pi^p_{k+1}    = co-\Sigma^p_{k+1}

    Properties:
      . P = \Delta^p_1, NP = \Sigma^p_1, coNP = \Pi^p_1
      . \Delta^p_k (_ \Sigma^p_k n \Pi^p_k
      . \Sigma^p_k u \Pi^p_k (_ \Delta^p_{k+1}

    The Polynomial Hierarchy: PH = U_{k >= 0} \Sigma^p_k

    All of PH is a subset of PSPACE: any polytime TM/NTM with oracle for a
    language in PSPACE can be implemented in PSPACE.

    Unknown: PH = PSPACE? PH = P?
    More generally, PH = \Sigma^p_k for some k?
    Believed: all inclusions are proper.

    Languages:

        QBF_k = { <F> : F = -] p_1 \-/ p_2 ... Q p_k F' for some
                  propositional formula F', and F is true }

    (each p_i stands for a block of variables, and the k quantifiers
    alternate, so Q is -] for odd k and \-/ for even k).
    QBF_k is complete for \Sigma^p_k.
    The similar language whose formulas start with \-/ is complete for
    \Pi^p_k.

  - Inside NP:

    If P != NP, then there are problems in NP that are neither in P nor
    NPc, and there are infinitely many intermediate classes between P and
    NPc, with complexity that gets larger and larger.

    For example, consider the following language:

        GRAPH-ISOMORPHISM = { <G,H> : G and H are two undirected graphs
            that are isomorphic, i.e., there is a one-to-one and onto
            function f that maps vertices of G to vertices of H such that
            all corresponding edges are the same -- (u,v) is an edge of G
            iff (f(u),f(v)) is an edge of H }

    This is clearly in NP (a certificate is the function f, described as a
    list of pairs), but it is not known (or believed) to be in P, and it
    is not known (or believed) to be NP-complete.

----------------------------
Dealing with NP-completeness
----------------------------

NP is important because it contains a huge number of real-life problems
that arise in various application domains. The vast majority of these
problems are either in P or NP-complete.

We know NP-complete problems do not have efficient solutions (unless P=NP)
but this doesn't make real-life applications go away. Example: the VLSI
circuit layout problem is NP-hard, but that doesn't mean we can forget
about it; it must still be solved somehow!

NP-complete means there is no exact, efficient algorithm (unless P=NP), so
something has to be compromised.

  - Heuristics: compromise on efficiency -- some problems have algorithms
    that run in exponential time in the worst case but where the worst
    case does not seem to happen often in practice. Requires extensive
    testing on "real" inputs.
  - Approximation: compromise on exactness -- find an efficient algorithm
    that may not return the exact answer but something "close", e.g.,
    instead of finding a k-clique, maybe it will find a (k/2)-clique or
    k vertices that are "almost" a clique.

  - Randomization: by making random choices, the algorithm hopefully
    avoids worst-case behaviour. Intuitively, this works well when many
    choices are reasonably good but it's hard to select one
    deterministically for all inputs (e.g., Quicksort).
    Two types of randomized algorithms:
      . random runtime but exact answer (e.g., Quicksort)
      . fixed runtime but possibly incorrect answer (e.g., primality
        testing)
    In both cases, the algorithm must be designed carefully so that bounds
    can be proved on the probability of error (or on the expected
    runtime).

  - NP-completeness is based on worst-case analysis: in practical
    applications, the worst case may not come up often (if at all).
    Average-case performance may be more indicative (but much harder to
    compute properly) -- however, beware: uniformly random inputs may not
    be representative of "real life" inputs, and algorithm behaviour on
    uniformly random inputs may be misleading.

    Alternatively, it is sometimes possible to work with restricted
    classes of inputs. For example, 2SAT (- P, UNARY-SUBSET-SUM (- P (and
    so is the problem if all input integers have a value bounded by some
    polynomial function of the input size), etc.

    In some cases, it is possible to prove performance results, e.g., the
    "greedy by degree" graph colouring algorithm can be shown to produce
    an optimal colouring for all "co-graphs" -- graphs that have a certain
    property. For graphs that don't have this property, there is a gradual
    loss of optimality (i.e., graphs that are "close" to having the
    property will be coloured using "close" to the smallest number of
    colours), instead of a sharp rise.

Heuristics are useful not just for problems in NPc, but also for problems
in P whose running time is a high-degree polynomial (degree 4 or above).
Many practical applications deal with huge inputs (sizes 10^6 and above),
where the difference between n^2 and n^4 algorithms is significant.

For example, the "linear programming" problem asks to minimize a linear
function of some variables subject to a set of linear constraints on those
variables. There is a polytime algorithm to solve linear programming, but
its running time is a high-degree polynomial (something like n^6). Because
linear programming is often used to model very large systems, this
algorithm is not usable in practice; instead, the "simplex method" is
often used. The simplex method has a worst-case running time that is
exponential, but for most inputs encountered in practice, it does much
better (including much better than the complicated polytime algorithm).

------
REVIEW
------

Main topics:

  - Computability
      . models of computation; robustness
      . diagonalization; countability/uncountability
      . decidability/recognizability; dovetailing
      . A_TM; undecidability/unrecognizability
      . mapping reductions (<=m); examples

  - Complexity
      . models of computation; P, NP, coNP
      . polytime reductions (<=p); Cook's theorem; NP-completeness
      . polytime self-reducibility
      . space complexity; PSPACE, L, NL; intractable problems

  - Reducibility (A <= B) is the central tool used. Understand it well!
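
Illustration (not part of the original summary): a minimal Python sketch
of polytime self-reducibility, tying it back to the oracle machines
defined under "Inside PSPACE". Given an oracle that only DECIDES SAT, a
satisfying assignment for a formula over n variables can be found with
n + 1 oracle queries and polynomial extra work. Everything named here is
an assumption made for the example: the CNF encoding (lists of nonzero
integers), and the helper names simplify, brute_force_sat_oracle and
find_assignment. The "oracle" is a brute-force stand-in that takes
exponential time; in the oracle model each query would count as one step.

```python
from itertools import product

# Assumed encoding: a CNF formula is a list of clauses, each clause a list
# of nonzero ints. Literal k means "variable k is true", -k means "false".


def simplify(cnf, var, value):
    """Return the formula obtained by fixing variable `var` to `value`."""
    true_lit = var if value else -var
    result = []
    for clause in cnf:
        if true_lit in clause:
            continue  # clause is already satisfied, drop it
        # remove the falsified literal (may leave an empty clause)
        result.append([lit for lit in clause if lit != -true_lit])
    return result


def brute_force_sat_oracle(cnf):
    """Stand-in for the SAT oracle: answers yes/no only (exponential time)."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf):
            return True
    return False


def find_assignment(cnf, num_vars, oracle=brute_force_sat_oracle):
    """Find a satisfying assignment using only yes/no oracle answers."""
    if not oracle(cnf):
        return None  # unsatisfiable: nothing to find
    assignment = {}
    for var in range(1, num_vars + 1):
        candidate = simplify(cnf, var, True)
        if oracle(candidate):
            assignment[var] = True
            cnf = candidate
        else:
            # the current formula is satisfiable, so `var` must be False
            assignment[var] = False
            cnf = simplify(cnf, var, False)
    return assignment


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(find_assignment([[1, 2], [-1, 3], [-2, -3]], 3))
# prints {1: True, 2: False, 3: True}
```

The loop keeps the invariant that the current formula is satisfiable, so
when fixing a variable to True fails, False is safe without a second
query. This is why n + 1 queries suffice.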