===========================================================================
CSC 363                Lecture Summary for Week 3                 Fall 2009
===========================================================================

Enumerators:

  - Similar to regular TM except: no input, two tapes, no accept/reject
    states (q_A/q_R), special "print" state (q_P) and "halt" state (q_H).
    Machine starts with both tapes blank and whenever it enters state q_P,
    string written on second tape is "printed" (technically, only the part
    of the string over \Sigma). Language is set of strings printed during
    computation (could be infinite if computation never halts).

  - Example: Enumerator with states {q_P,q_H}, tape alphabet {_,0,1},
    initial state q_P, transition function
        \delta(q_P,a,a) = (q_P,a,L,0,R) for all a (- {_,0,1}.
    Language = {\epsilon, 0, 00, 000, ...} = L(0*).

  - Equivalent to TM: (Theorem 3.21 on p.135)

      . Given enumerator E for language A, construct TM M that recognizes
        A as follows:
            M = "On input w:
                  1. Run E; every time a string is "printed", compare it
                     with w.
                  2. Accept if w is ever printed. Reject if E stops
                     without printing w."
        If w (- A, then E eventually prints w so M accepts w.
        If w !(- A, then E never prints w so M rejects or loops on w.
        Hence, M recognizes A.

      . Given TM M that recognizes A, construct enumerator E for A as
        follows, where s_0,s_1,s_2,... is a complete list of all strings
        in \Sigma* (e.g., in lexicographic order):
            E = "Repeat for i = 0, 1, 2, 3, ...
                  1. Run M for i steps on each input string
                     s_0,s_1,...,s_i.
                  2. Print every string that is accepted."
        If w (- A, then M accepts w so E will eventually print w
        (infinitely many times).
        If w !(- A, then M rejects or loops on w so E never prints w.
        Hence, E generates language A.

        Note: Why "for i steps"? Suppose E instead ran M to completion on
        each string in turn. If M loops on s_k but accepts s_{k+1}, then E
        would enter infinite loop running M on s_k and never get to print
        s_{k+1}.
        Solution is the technique above, called "dovetailing": run
        multiple partial computations, stopping early to avoid infinite
        loops, but make sure to eventually continue all unfinished
        computations.
        There are other ways to achieve the same result, e.g., keep track
        of all ongoing computations on tape, separated by special symbol
        '#', and at each stage perform one more step of each computation
        as well as starting one more computation for the next string,
        removing any computation that has halted -- this avoids repeating
        computations and printing strings multiple times, but is slightly
        harder to describe precisely.

Other models:

  - Register machines, Post correspondence systems, recursive functions,
    Conway's "Game of Life", etc. Given formal definitions of the
    different models, all have been shown equivalent to each other!

------------------------
The Church-Turing thesis
------------------------

"All reasonable models of computation are equivalent".

"Reasonable" means: has access to unlimited resources, but can only carry
out finite amount of work in one step.

A "thesis" rather than a "theorem" because it states something about the
informal notion of "reasonable model of computation" -- so it can never be
proven. Any formal model chosen can be *proven* equivalent to others.

In other words, any reasonable model of computation captures informal
notion of "computation" precisely, i.e., there is a single, well-defined
notion of "algorithm" independent of model used to define it.

In particular, Turing machines are as powerful as any other model, and
every pseudo-code algorithm you've ever seen has a TM implementation!

Question: Is there some language that cannot be recognized by a TM?

---------------------
The "Halting" problem
---------------------

Consider the "acceptance" problem/language for TMs:

    A_TM = { <M,w> | M is a TM that accepts input w }.
Notation: <M,w> represents some "reasonable" encoding of M,w as a string,
over a fixed alphabet -- where "reasonable" means the encoding scheme is
fixed and can be parsed by a TM (i.e., by some algorithm). For example,
over alphabet \Sigma = {(,),;,q,s,0,1,L,R}, list states
"(q0;q1;q10;...);", then initial state "q0;", accepting and rejecting
states "q1101;q1111;", then input and tape alphabets
"(s0;s1;...);(s0;s1;...);", then transitions
"((q0;s0;q11;s101;L);...)".

A_TM is recognizable.

  - There is a TM U (the "universal TM") that takes as input <M,w> and
    that carries out the computation of M on w. U accepts if M accepts w,
    U rejects if M rejects w, and U loops if M loops on w.
    More formally:
        U = "On input <M,w>:
              1. Simulate M on input w (use portion of tape to represent
                 M's configuration -- state and content of M's tape,
                 including head position -- and move back-and-forth
                 between M's description and M's configuration for each
                 simulation step).
              2. Accept if M accepts; reject if M rejects."

    Notes:

      . U is "general-purpose computer": like all TMs, hard-wired to carry
        out exactly one task, but that task depends on instructions
        provided as part of input.

      . "Simulate M on w" different from "run M on w": "simulate" means
        description of M is part of input, and algorithm must parse that
        description and follow it (none of M's behaviour is hard-coded
        into the algorithm); "run" means description of M is hard-coded
        into algorithm (none of M's behaviour is described in the input).

Conventions:

  - "On input <M,w>: ..." is shorthand for
        "On input x: (meaning: let x be the value on the tape when we
                      start)
          0. Check that x has the form <M,w>, i.e., that x is the string
             encoding of some TM M and input string w -- reject if x does
             *not* have that form; otherwise, ..."

  - From this point on, algorithms will be described in high-level stages
    (i.e., without specifying details of head movement, states, or tape
    symbols), with indentation for blocks that represent loops.
    Each stage simple (and clear) enough that it is obvious it can be
    implemented on Turing machine -- if not, break it up into simpler
    stages. Algorithms always start with input, always a string, using <>
    notation when working with inputs of a specific type (as described
    above). Algorithms must always include clear, explicit conditions for
    accepting and rejecting -- no condition for looping, because that's
    not an action explicitly taken by TM.

Theorem: A_TM is undecidable.

Proof:

  - For contradiction, assume A_TM decidable, i.e., there is a TM S that
    accepts input <M,w> if M accepts w, and that rejects <M,w> if M does
    not accept w (either because M rejects or loops).

  - Construct TM D that includes S as a hard-coded subroutine. On input
    <M>, D runs S's instructions on <M,<M>>; if S accepts, D rejects; if S
    rejects, D accepts. Using convention above:
        D = "On input <M>:
              1. Run S on <M,<M>>.
              2. Accept if S rejects; reject if S accepts."
    From definition of S, D rejects input <M> if M accepts input <M> and
    D accepts input <M> if M does not accept input <M>. Note that because
    S is decider, D is also decider, i.e., D never loops (for any input).

  - What happens if we give <D> as input to D? D should reject <D> if D
    accepts input <D> and D should accept <D> if D does not accept input
    <D>, i.e., D accepts input <D> iff D does not accept input <D>.

  - Contradiction! Hence, D cannot exist, which means S cannot exist,
    i.e., no TM can decide A_TM.

Note: Result much stronger than simply "nobody knows how to decide A_TM"
-- rather, it shows no decider is possible (now or ever)! By Church-Turing
thesis, this implies there is *no* algorithm for solving this problem!

What about unrecognizable languages?

Theorem 4.22 on pp.181-182: Language L is decidable iff both L and ~L are
recognizable (~L is complement of L).

Proof idea:

  - If L decidable, then L recognizable (with decider M). Then, ~L
    recognizable by exchanging q_A, q_R in M, since M always halts -- note
    that this wouldn't work if M weren't a decider.
  - If L and ~L recognizable, then L decidable by running recognizers for
    L and ~L in parallel (one step of each computation at a time) until
    one accepts -- every string is in exactly one of L, ~L, so exactly one
    recognizer must accept eventually, and we can always decide.
    (See proof in textbook for detailed version.)

Corollary: ~A_TM (the complement of A_TM) is unrecognizable.
(Because A_TM is recognizable but undecidable: if ~A_TM were also
recognizable, A_TM would be decidable by the theorem above.)
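
The "run both recognizers in parallel" step in the proof idea of Theorem 4.22 can be sketched in Python. This is only an illustration under stated assumptions, not the textbook construction: a "machine" is modelled as a hypothetical Python generator function in which each `yield` stands for one computation step and the return value is True (accept) or False (reject); the names `step`, `decide`, `rec_even` and `rec_odd` are invented for this example.

```python
from typing import Callable, Generator, Optional

# Assumption: a "machine" is a generator function. Each `yield` is one step;
# returning True means accept, returning False means reject; it may never halt.
Run = Generator[None, None, bool]
Machine = Callable[[str], Run]


def step(run: Run) -> Optional[bool]:
    """Advance one step; return the verdict if the run halted, else None."""
    try:
        next(run)
        return None
    except StopIteration as halt:
        return halt.value


def decide(rec_l: Machine, rec_co_l: Machine, w: str) -> bool:
    """Decide whether w is in L by alternating single steps of a recognizer
    for L and a recognizer for ~L until one of them accepts."""
    # Key = the answer to give if that run accepts.
    runs = {True: rec_l(w), False: rec_co_l(w)}
    while runs:
        for answer, run in list(runs.items()):
            verdict = step(run)
            if verdict is True:
                return answer
            if verdict is False:
                del runs[answer]  # halted by rejecting; keep stepping the other
    raise ValueError("both machines rejected: not recognizers for L and ~L")


# Hypothetical recognizers (L = strings of even length) that loop forever
# on non-members instead of rejecting.
def rec_even(w: str) -> Run:
    for _ in w:
        yield
    if len(w) % 2 == 0:
        return True
    while True:
        yield


def rec_odd(w: str) -> Run:
    for _ in w:
        yield
    if len(w) % 2 == 1:
        return True
    while True:
        yield


print(decide(rec_even, rec_odd, "0101"))  # w has even length
print(decide(rec_even, rec_odd, "010"))   # w has odd length
```

Running either recognizer to completion by itself could loop forever on a non-member; taking one step of each in turn is what guarantees that `decide` halts, for the same reason dovetailing was needed in the enumerator construction.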