===========================================================================
CSC 363               Lecture Summary for Week 4                  Fall 2009
===========================================================================

---------------
Diagonalization -- section 4.2, pp.174-178
---------------

What does this have to do with languages?

  - Number of TMs countable: each TM can be represented as a string over
    some fixed alphabet, so list all strings over that alphabet and keep
    only the ones that describe valid TMs. Because the number of strings
    is countable, so is the number of possible TMs: cannot have more TMs
    than strings to represent them.
    Number of recognizable languages countable: each one is recognized by
    at least one TM.

  - Number of languages same as number of infinite binary sequences (not
    "strings", because strings are finite).
    Let \Sigma* = {s_0, s_1, s_2, ..., s_n, ...} (lexicographic order).
    Each language L (_ \Sigma* is associated with the infinite binary
    sequence b_0 b_1 b_2 ... b_n ... where b_i = 1 if s_i (- L, and
    b_i = 0 otherwise. This association is one-to-one and onto, so the
    number of languages is the same as the number of infinite binary
    sequences.
    Same proof as for R shows uncountably many infinite binary sequences:
    given a supposed list of all infinite binary sequences, construct a
    sequence that differs from the i-th sequence in the list in bit i, so
    it appears nowhere in the list.
    Hence, uncountably many languages over \Sigma.

  - This means *most* languages are unrecognizable!

---------------------
The "Halting" problem
---------------------

Recall "acceptance" problem/language for TMs:
    A_TM = { <M,w> | M is a TM that accepts input w }.

  - From this point on, algorithms will be described in high-level stages
    (i.e., without specifying details of head movement, states, or tape
    symbols), with indentation for blocks that represent loops. Each stage
    must be simple (and clear) enough that it is obvious it can be
    implemented on a Turing machine -- if not, break it up into simpler
    stages.
    Algorithms always start with input, always a string, using <> notation
    when working with inputs of a specific type (as described above).
    Algorithms must always include clear, explicit conditions for accepting
    and rejecting -- no condition for looping, because that's not an action
    explicitly taken by a TM.
    Algorithms should always be followed by a brief argument of correctness
    (that acceptance/rejection/looping occurs exactly when expected).

Theorem: A_TM is undecidable.

Proof:
  - For contradiction, assume A_TM is decidable, i.e., there is a TM S that
    accepts input <M,w> if M accepts w, and that rejects <M,w> if M does
    not accept w (either because M rejects or loops).

  - Construct TM D that includes S as a hard-coded subroutine. On input
    <M>, D runs S's instructions on <M,<M>>; if S accepts, D rejects; if S
    rejects, D accepts. Using the convention above:
        D = "On input <M>:
              1. Run S on <M,<M>>.
              2. Accept if S rejects; reject if S accepts."
    From the definition of S, D rejects input <M> if M accepts input <M>,
    and D accepts input <M> if M does not accept input <M>.
    Note that because S is a decider, D is also a decider, i.e., D never
    loops (for any input).

  - What happens if we give <D> as input to D? D should reject <D> if D
    accepts input <D>, and D should accept <D> if D does not accept input
    <D>, i.e., D accepts input <D> iff D does not accept input <D>.

  - Contradiction! Hence, D cannot exist, which means S cannot exist,
    i.e., no TM can decide A_TM.

Note: Result much stronger than simply "nobody knows how to decide A_TM" --
rather, it shows no decider is possible (now or ever)! By the Church-Turing
thesis, this implies there is *no* algorithm for solving this problem!

What about unrecognizable languages?

Theorem 4.22 on pp.181-182: Language L is decidable iff both L and ~L are
recognizable (~L is the complement of L).

Proof idea:
  - If L is decidable, then L is recognizable (by its decider M). Then ~L
    is recognizable by exchanging q_A and q_R in M, since M always halts
    -- note that this wouldn't work if M weren't a decider.
  - If L and ~L are recognizable, then L is decidable by running
    recognizers for L and ~L in parallel (one step of each computation at
    a time) until one accepts -- every input belongs to exactly one of L
    and ~L, so exactly one recognizer must accept eventually, and we can
    always decide.
    (See proof in textbook for detailed version.)

Corollary: ~A_TM (the complement of A_TM) is unrecognizable.
(Because A_TM is recognizable but undecidable.)

Each of the following languages is also undecidable:

  - HALT_TM = { <M,w> | M is a TM that halts on input w } is undecidable.

    Proof: For a contradiction, suppose HALT_TM has decider H, and consider:
        A = "On input <M,w>:
              1. Run H on <M,w>. Reject if H rejects. Otherwise,
              2. Simulate M on w. Accept if M accepts; reject if M
                 rejects."
    Then, A accepts <M,w> if M accepts w, and A rejects <M,w> if M loops
    on w or if M rejects w, i.e., A decides A_TM, a contradiction. Hence,
    HALT_TM is undecidable.

    HALT_TM is recognizable (on input <M,w>, simulate M on w and accept if
    M accepts or rejects), so this means ~HALT_TM is unrecognizable.

  - E_TM = { <M> | M is a TM such that L(M) = {} }:

    Assume R decides E_TM, and construct S as follows:
        S = "On input <M,w>:
              - Compute <Q>, the description of the following TM:
                    Q = "On input x:
                          . Ignore x and simulate M on w; accept if M
                            accepts; reject if M rejects."
              - Run R on input <Q> and do the opposite (if R accepts,
                reject; if R rejects, accept)."
    Then S accepts <M,w> iff R rejects <Q> iff L(Q) != {} iff
    L(Q) = \Sigma* (by construction, either Q accepts all strings, if M
    accepts w, or Q accepts no string, if M rejects or loops on w) iff M
    accepts w. Moreover, S always halts because R always halts (and the
    first stage only constructs the description of Q, <Q>, without
    actually executing it). Hence, S decides A_TM, a contradiction!

  - EQ_TM = { <M_1,M_2> | M_1 and M_2 are TMs such that L(M_1) = L(M_2) }:

    Assume R decides EQ_TM and construct S as follows:
        S = "On input <M>:
              - Compute <M'>, the description of the following TM:
                    M' = "On input x: reject."
              - Run R on input <M,M'> and do the same."
    Because L(M') = {}, R accepts <M,M'> iff L(M) = {}, and S always halts
    because R does. Then S decides E_TM, a contradiction.
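
Illustration (not from the lecture): the parallel simulation used in the
proof idea of Theorem 4.22 can be sketched in Python. This is only a rough
model under stated assumptions: a "recognizer" is represented as a generator
function in which each yield stands for one computation step, returning True
means accept, returning False means reject, and yielding forever means
looping. The names step, decide, rec_even and rec_odd are hypothetical, and
the toy language (strings of even length) is chosen only so the sketch runs.

```python
from typing import Callable, Generator, Iterator, Optional

# Assumed model: one `yield` = one simulated step; `return True` = accept;
# `return False` = reject; yielding forever = looping.
Recognizer = Callable[[str], Generator[None, None, bool]]


def step(run: Iterator[None]) -> Optional[bool]:
    """Advance a run by one step; return its verdict if it halted, else None."""
    try:
        next(run)
        return None
    except StopIteration as halt:
        return bool(halt.value)


def decide(rec_l: Recognizer, rec_co_l: Recognizer, w: str) -> bool:
    """Decide whether w is in L by alternating single steps of both runs."""
    run_l, run_co = rec_l(w), rec_co_l(w)
    done_l = done_co = False
    while True:
        if not done_l:
            verdict = step(run_l)
            if verdict is True:
                return True            # recognizer for L accepted: w in L
            done_l = verdict is False  # halted by rejecting
        if not done_co:
            verdict = step(run_co)
            if verdict is True:
                return False           # recognizer for ~L accepted: w not in L
            done_co = verdict is False
        if done_l and done_co:
            # Cannot happen if the two really recognize L and ~L.
            raise ValueError("neither recognizer accepted " + repr(w))


def rec_even(w: str) -> Generator[None, None, bool]:
    """Toy recognizer for L = strings of even length; loops outside L."""
    for _ in w:
        yield              # one step per input symbol
    if len(w) % 2 == 0:
        return True
    while True:            # never halts on strings outside L
        yield


def rec_odd(w: str) -> Generator[None, None, bool]:
    """Toy recognizer for ~L = strings of odd length; loops outside ~L."""
    for _ in w:
        yield
    if len(w) % 2 == 1:
        return True
    while True:
        yield


if __name__ == "__main__":
    print(decide(rec_even, rec_odd, "ab"))
    print(decide(rec_even, rec_odd, "abc"))
```

Neither toy recognizer halts on strings outside its own language, yet decide
always returns: alternating single steps guarantees that the run that will
eventually accept is never starved by the run that loops.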