===========================================================================
CSC 236                 Tutorial Notes for Week 12                Fall 2009
===========================================================================

1.  Prove that L_2 = { s^2 : s (- {a,b}* } is not regular.

    For a contradiction, suppose that it is and that L(A_2) = L_2 for some
    FSA A_2.  Let K be the number of states of A_2, let s = a^K b^K, and
    consider the behaviour of A_2 on input s^2 = ss = a^K b^K a^K b^K.
    Since s^2 (- L_2, A_2 accepts it (by assumption).  While reading the
    first K symbols (the prefix a^K), A_2 passes through K+1 states, so
    some state must repeat: there is a loop in the transitions of A_2 used
    to process the symbols from a^K -- say the loop is over k states, where
    1 <= k <= K.  But then, A_2 must accept input t = a^{K+k} b^K a^K b^K
    (because this string simply goes around the loop one more time).
    However, t !(- L_2: if k is odd, t has odd length; if k is even, each
    half of t has length 2K + k/2, so the first half starts with a while
    the second half is b^{k/2} a^K b^K and starts with b.  So A_2 accepts
    some string not in L_2, a contradiction.  Hence, there can be no FSA
    that accepts exactly the strings in L_2, i.e., L_2 is not regular.

2.  Give a RE for L(A_1), where A_1 is the FSA below.

                         _________b_________
          b     a    b |/   a          a    \ a
    --> (q_0) -----> q_1 -----> q_2 -----> (q_3)
          |\_____________________/
                      b

    (A label directly above a state is a self-loop.  The top arc goes from
    q_3 to q_1 on b; the bottom arc goes from q_2 to q_0 on b.)

    Eliminate q_2 first (may lead to slightly simpler REs because q_2 has
    no loop):

                         ____b____
          b     a    b |/  aa     \ a
    --> (q_0) -----> q_1 ------> (q_3)
          |\__________/
                ab

    Eliminate q_1:

                       ab*aa
        b+ab*ab     --------->     a+bb*aa
    --> (q_0)                      (q_3)
                    <---------
                       bb*ab

    RE for q_3:

        R_{q_3} = (b+ab*ab)* ab*aa (a + bb*aa + bb*ab(b+ab*ab)*ab*aa)*

    But we're not done: q_0 is also accepting!  So eliminate q_3, leaving
    a single self-loop on q_0:

    --> (q_0)    b + ab*ab + ab*aa(a+bb*aa)*bb*ab

    RE for q_0:

        R_{q_0} = (b + ab*ab + ab*aa(a+bb*aa)*bb*ab)*

    Final RE for A_1 is:

        R_1 = R_{q_0} + R_{q_3}
            = (b + ab*ab + ab*aa(a+bb*aa)*bb*ab)*
              + (b+ab*ab)*ab*aa(a + bb*aa + bb*ab(b+ab*ab)*ab*aa)*

3.  Give a CFG for L_3 = { s (- {a,b}* : s contains as many a's as b's }.

    Many possible answers, including:

        S -> aSbS | bSaS | \epsilon

        S -> SaSbS | SbSaS | \epsilon

        S -> SS | aSb | bSa | \epsilon

    Main idea: use productions that always introduce one a and one b at the
    same time -- ensuring the same number of each -- while allowing all
    possible combinations.

      - Example derivation using the first CFG -- the symbol between dots
        is the one replaced in the next step (\epsilon not written
        explicitly):

            S => a.S.bS => aa.S.bSbS => aabSb.S. => aabSbb.S.aS
              => aab.S.bbaSbSaS => aabaSbSbbaSbSaS
              => (replace each S with \epsilon...) => aababbbaba

      - Argument that L(G) = L_3 for the first CFG G above:

          . All strings generated contain the same number of a's and b's:
            productions introduce them only in matching pairs.

          . All strings with the same number of a's and b's can be
            generated (by induction on the length of the string; the empty
            string is generated by S -> \epsilon):
            Given such a non-empty string s, say s starts with an 'a'
            (a similar argument works if s starts with 'b', using
            S -> bSaS).  Consider the quantity "#a's - #b's" computed over
            the prefix of s ending at each symbol -- e.g., if s = aabbab,
            the quantity corresponding to each symbol is 1, 2, 1, 0, 1, 0.
            The quantity starts at 1 and, since s contains as many b's as
            a's, ends at 0, so there must be a first symbol where
            #a's = #b's.  That symbol is a 'b' (the quantity has just
            dropped from 1 to 0), i.e., it is possible to write
            s = a.u.b.v where u and v are strings that contain as many a's
            as b's (in the example, s = a.ab.b.ab).  Then, s can be
            generated using the production S -> aSbS followed by
            productions to generate the shorter strings u, v.
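
    The decomposition s = a.u.b.v used in the argument above can be turned
    into a small recursive procedure.  The following is a minimal sketch
    in Python, not part of the original notes: the names decompose, parse
    and yield_of are illustrative, and a parse tree is represented (by
    assumption) as either the empty tuple, standing for S -> \epsilon, or
    a 4-tuple (x, tree for u, y, tree for v), standing for S -> aSbS or
    S -> bSaS.

```python
def decompose(s):
    """Split a non-empty balanced string s as x.u.y.v, where x is the
    first symbol, y is the other symbol, and u, v are balanced.

    Uses the first position where the running count
    (#occurrences of x) - (#occurrences of the other symbol) returns to 0.
    """
    balance = 0
    for i, ch in enumerate(s):
        balance += 1 if ch == s[0] else -1
        if balance == 0:
            return s[0], s[1:i], s[i], s[i + 1:]
    raise ValueError("not balanced: " + repr(s))


def parse(s):
    """Build a parse tree for s under S -> aSbS | bSaS | epsilon.

    Returns () for the empty string, otherwise (x, tree_u, y, tree_v).
    Raises ValueError if s is not over {a, b} or is not balanced.
    """
    if set(s) - {"a", "b"}:
        raise ValueError("string must be over {a, b}: " + repr(s))
    if s == "":
        return ()  # production S -> epsilon
    x, u, y, v = decompose(s)
    # production S -> x S y S, with the two S's generating u and v
    return (x, parse(u), y, parse(v))


def yield_of(tree):
    """Read the generated string back off a parse tree."""
    if tree == ():
        return ""
    x, tree_u, y, tree_v = tree
    return x + yield_of(tree_u) + y + yield_of(tree_v)
```

    For instance, decompose("aabbab") splits the example string from the
    argument into 'a', 'ab', 'b', 'ab', and yield_of(parse(s)) gives back
    s for any balanced s.  Recursion always terminates because u and v
    are strictly shorter than s, mirroring the induction on length.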