CS500 Homework #2 Solutions

Similar documents
Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

CISC4090: Theory of Computation

Theory of Computation (IV) Yijia Chen Fudan University

2.1 Solution. E T F a. E E + T T + T F + T a + T a + F a + a

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL, DPDA PDA)

Computational Models - Lecture 4

CS481F01 Prelim 2 Solutions

CS 341 Homework 16 Languages that Are and Are Not Context-Free

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL)

Introduction to Theory of Computing

Context-Free Languages (Pre Lecture)

Foundations of Informatics: a Bridging Course

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?

Notes for Comp 497 (Comp 454) Week 10 4/5/05

The View Over The Horizon

Section 1 (closed-book) Total points 30

AC68 FINITE AUTOMATA & FORMULA LANGUAGES JUNE 2014

SCHEME FOR INTERNAL ASSESSMENT TEST 3

Context-Free and Noncontext-Free Languages

Part 4 out of 5 DFA NFA REX. Automata & languages. A primer on the Theory of Computation. Last week, we showed the equivalence of DFA, NFA and REX

CS481F01 Solutions 6 PDAS

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen

NPDA, CFG equivalence

Languages, regular languages, finite automata

CISC 4090 Theory of Computation

Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules).

Please give details of your answer. A direct answer without explanation is not counted.

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

Computational Models: Class 5

ÖVNINGSUPPGIFTER I SAMMANHANGSFRIA SPRÅK. 17 april Classrum Edition

The Pumping Lemma. for all n 0, u 1 v n u 2 L (i.e. u 1 u 2 L, u 1 vu 2 L [but we knew that anyway], u 1 vvu 2 L, u 1 vvvu 2 L, etc.

Computational Models - Lecture 4 1

MA/CSSE 474 Theory of Computation

Automata Theory. CS F-10 Non-Context-Free Langauges Closure Properties of Context-Free Languages. David Galles

Fundamentele Informatica II

MA/CSSE 474 Theory of Computation

Properties of Context Free Languages

Pushdown Automata. Chapter 12

Context Free Languages and Grammars

Theory of Computation

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

HW6 Solutions. Micha l Dereziński. March 20, 2015

CS375: Logic and Theory of Computing

DM17. Beregnelighed. Jacob Aae Mikkelsen

Grade 6 Math Circles October 20/21, Formalism and Languages: Beyond Regular Languages

Notes for Comp 497 (454) Week 10

Pumping Lemma for CFLs

CISC 4090 Theory of Computation

Chapter 6. Properties of Regular Languages

Theory of Computation (Classroom Practice Booklet Solutions)

Automata and Computability. Solutions to Exercises

CSE 355 Test 2, Fall 2016

Closure Properties of Context-Free Languages. Foundations of Computer Science Theory

Context Free Languages (CFL) Language Recognizer A device that accepts valid strings. The FA are formalized types of language recognizer.

60-354, Theory of Computation Fall Asish Mukhopadhyay School of Computer Science University of Windsor

Context-Free Languages

Non-context-Free Languages. CS215, Lecture 5 c

Intro to Theory of Computation

Concordia University Department of Computer Science & Software Engineering

Final exam study sheet for CS3719 Turing machines and decidability.

Turing machines and linear bounded automata

Ogden s Lemma for CFLs

Automata and Computability. Solutions to Exercises

CS5371 Theory of Computation. Lecture 5: Automata Theory III (Non-regular Language, Pumping Lemma, Regular Expression)

NODIA AND COMPANY. GATE SOLVED PAPER Computer Science Engineering Theory of Computation. Copyright By NODIA & COMPANY

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Turing machines and linear bounded automata

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

Cliff s notes for equivalence of CFLs and L(PDAs) LisaCFL L = L(M) for some PDA M L=L(M)forsomePDAM L = L(G) for some CFG G

Homework 5 - Solution

Pushdown Automata (2015/11/23)

UNIT-VI PUSHDOWN AUTOMATA

Lecture 17: Language Recognition

CSE 105 Homework 5 Due: Monday November 13, Instructions. should be on each page of the submission.

Properties of Context-Free Languages. Closure Properties Decision Properties

CS20a: summary (Oct 24, 2002)

Comment: The induction is always on some parameter, and the basis case is always an integer or set of integers.

CpSc 421 Final Exam December 15, 2006

Homework 8. a b b a b a b. two-way, read/write

Closure under the Regular Operations

Miscellaneous. Closure Properties Decision Properties

HW 3 Solutions. Tommy November 27, 2012

(b) If G=({S}, {a}, {S SS}, S) find the language generated by G. [8+8] 2. Convert the following grammar to Greibach Normal Form G = ({A1, A2, A3},

PUSHDOWN AUTOMATA (PDA)

The Pumping Lemma for Context Free Grammars

Instructor s Solutions Manual for Introduction to the Theory of Computation third edition Michael Sipser Mathematics Department MIT

CSE 105 THEORY OF COMPUTATION

Turing machines and linear bounded automata

Fall 1999 Formal Language Theory Dr. R. Boyer. Theorem. For any context free grammar G; if there is a derivation of w 2 from the

Computational Models - Lecture 5 1

Pushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata.

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

October 6, Equivalence of Pushdown Automata with Context-Free Gramm

INSTITUTE OF AERONAUTICAL ENGINEERING

Context Free Languages: Decidability of a CFL

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar

Theory of Computation - Module 3

Turing Machines Part III

Definition: A grammar G = (V, T, P,S) is a context free grammar (cfg) if all productions in P have the form A x where

Transcription:

CS500 Homework #2 Solutions 1. Consider the two languages Show that L 1 is context-free but L 2 is not. L 1 = {a i b j c k d l i = j k = l} L 2 = {a i b j c k d l i = k j = l} Answer. L 1 is the concatenation of {a n b n } with {c n d n }, and each of these is easily seen to be context-free, e.g. with grammars S asb ɛ and similarly S csd ɛ. Since the context-free languages are closed under concatenation we re done. To prove that L 2 is not context-free, we prove by contradiction that the pumping lemma does not hold. Suppose it did with some value of p; then consider the word w = a p b p c p d p. Writing w = uvxyz, we see that for uv i xy i z to be in L 2 for all i, neither v nor y can cross a boundary between the blocks of different symbols, since repeating them would then give a word outside a b c d L 2. Moreover, if v is in the block of as or bs, then y must be in the block of cs or ds respectively to keep the corresponding block the same size. But this implies that x contains either b p or c p, and in either case since vy > 0 this violates the third condition of the pumping lemma, vxy p. 2. Show that both the context-free languages and the deterministic context-free languages are closed under intersection with regular languages. That is, suppose R is regular; show that if L is context-free, then so is L R, and if L is deterministic context-free, then so is L R. Answer. The idea is to run the PDA for L and the DFA for R in parallel, just as we constructed a DFA for the intersection of two regular languages by running both their DFAs in parallel and accepting only if both machines accept. Formally, suppose the PDA for L has finite states Q 1, stack alphabet Γ, transition function δ 1, and accepting states F 1, and the DFA for R has similarly Q 2, δ 2, and F 2. Assume for simplicity that the PDA for L reads a input symbol on every step. Then the PDA for L R will have states Q = Q 1 Q 2 and transition function δ : Q Σ Γ ɛ Q Γ ɛ, where the new state is (δ 1 (q 1, a), δ 2 (q 2, a)) and we use δ 1 to determine what, if anything, we push on the stack. Our set of accepting states is F = F 1 F 2 Q. The resulting PDA is deterministic if the original PDA for L is, so L R is a CFL or DCFL if L is. 1

3. Inspired by the previous problem, let L be the parenthesis language {ɛ, (), (()), ()(),...} and R the regular language where neither ( nor ) can occur more than 3 times in a row. Give a context-free grammar for L R. Answer. The idea is for the variables to know how many left or right parentheses there already are next to them. Let there be 9 variables, {S l,r 0 l, r < 3} plus the start symbol S. Then S l,r means a nonempty string with l ( s to its left and r ) s to its right. The simplest grammar I found is S l,r (S l+1,1 )S 0,r (S l+1,r+1 ) ()S 0,r () where we disallow any rules where l 3 or r 3 on the right-hand side. Note that we don t allow S l,r to disappear, since then we wouldn t know whether r in S l,r (S l+1,r )S 0,r should be 1 or r + 1. Finally, we have S S 0,0 ɛ which also allows us to generate the empty word. 4. Recall that a grammar in Chomsky normal form is one where all rules are of the form A BC or A a, where capital and lower-case letters represent variables and terminals respectively. Let G be a grammar in Chomsky normal form with V = k variables, and let L be the language generated by G. Show that if L contains a word of length greater than 2 k 1, then L is infinite. Answer. Derivations of grammars in Chomsky normal form are binary trees, where the internal nodes correspond to rules of the form A BC and the leaves correspond to rules of the form A a. Since a binary tree of depth k has at most 2 k 1 leaves (counting both the root and the leaves toward the depth) any word of length greater than 2 k 1 must have a derivation tree of depth greater than k. That means that it contains some path of length greater than k, and by the Pigeonhole Principle some variable is repeated along that path. The argument of the pumping lemma then applies, and we can generate an infinite number of words of the form uv i xy i z. 5. Show that the language L = {a i b j c k i j j k} is context-free but not deterministic context-free. Answer. To prove it is context-free, recall that context-free languages are closed under union, and write L as L = {a i b j c k i j} {a i b j c k j k} Both of these languages are easily seen to be context-free; for the first one, for instance, push when you read a and pop when you read b, and accept if the stack goes empty before we re done with the b s, or if it is still nonempty when we read the first c. 2

However, if L were deterministic context-free then L would be also. But if L were context-free, then {a n b n c n } = L a b c would be context-free since the CFLs are closed under intersection with regular languages. Since we know that {a n b n c n } is not context-free, this is a contradiction. Thus L is a CFL but not a DCFL. 6. Show that the complement of the set of palindromes, L = {w {a, b} w w R }, is context-free. Since both L and L are context-free, does this mean that they are deterministic context-free? Give a DPDA that recognizes L, or some intuition for why one does not exist. Answer. Here is a context-free grammar for L: S asa bsb asb bsa at b bt a T at a at b bt a bt b a b ɛ We require S to emit at least one mismatched pair before it becomes a T, which then emits more pairs until it leaves behind a single symbol or disappears. However, just because L and L are both context-free doesn t mean that they are deterministic context-free (although the converse is true); it could be that they both require non-deterministic PDAs. In fact, imagine a DPDA trying to recognize a palindrome. At the center of the word, it needs to stop pushing and start popping; but without some kind of marker, it has no way of knowing when it is halfway through. An NPDA, on the other hand, can guess when to turn around. 7. We saw in class that while {a n b n c n } is not context-free, it is the intersection of two context-free languages. Consider a k-symbol alphabet {a 1,..., a k }, and the language L k = {a n 1a n 2 a n k }. What is the smallest number of context-free languages such that L k is equal to their intersection? Answer. Two. L k = {a n 1 1 a n 2 2 a n k k n i = n i+1 for even i} {a n 1 1 a n 2 2 a n k k n i = n i+1 for odd i} Both of these are CFLs since they are concatenations of k/2 or k/2 1 CFLs of the form {a n b n } with perhaps the regular languages a 1 and a k. 8. Show that the language L = {ucv u, v {a, b}, u v} is context-free. (Note that u and v are not required to have the same length, and c is used as a marker.) Hint: first, design a grammar for {ucv u, v {a, b}, u = v, and u and v differ in their last symbol} 3

and then figure out how to extend u and v with arbitrary strings. Answer. We have Σ = {a, b}, so Σ i means any word of length i. Then rewrite L as (Σ i aσ c Σ i bσ ) (Σ i bσ c Σ i aσ ) This formula requires that u i+1 v i+1 for some i, and therefore u v. We start by generating (Σ i a c Σ i b) (Σ i b c Σ i a), in which u and v differ on their last symbol, with the grammar S T b Ua T at a at b bt a bt b ac U aua aub bua bub bc Then we get L by allowing arbitrary words Σ to the left of c and at the end of the word. To do this, add the rules c ac bc S Sa Sb which completes the grammar for L. 9. Let L be the copy language, L = {ww w {a, b} }. Show that its complement L is context-free. This is tricky, but the idea is similar to the previous problem. Answer. L is the union of the set of words of odd length (which is regular and therefore context-free) with the language We can rewrite this as L = {uv u = v and u v} L = (Σ i aσ j Σ i bσ j ) (Σ i bσ j Σ i aσ j ) As in the previous problem, this ensures that u i+1 v i+1 and so u v. Now for the trick: since Σ j Σ i = Σ i Σ j, this is the same as L = (Σ i aσ i Σ j bσ j ) (Σ i bσ i Σ j aσ j ) 4

Now the Σ i s and Σ j s no longer cross, and each of the languages in this union is the concatenation of two context-free languages. Our grammar is then S T U UT T at a at b bt a bt b a U aua aub bua bub b Of course, this provides another example of a language which is a CFL but not a DCFL. 10. Consider the language L = {w {a, b} # a (w) = # b (w)} i.e., words with an equal number of as and bs. (a) How many words N(l) are there of each length l? Answer. If l is odd, N(l) = 0; if l is even, we just need to choose which l/2 of the symbols are a s, giving N(l) = ( l l/2). (b) What is its generating function g(z) = l N(l) zl? (Feel free to ask Mathematica or Maple to sum the series.) Answer. We get l=0,2,4,... ( ) l z l = l/2 n=0 ( ) 2n z 2n = n 1 1 4z 2 (c) Recall the unambiguous grammar for the bracket language, S (S)S ɛ. Inspired by this, we might hope that S asbs bsas ɛ is an unambiguous grammar for L, but unfortunately it s not. Explain why. Answer. Because we can derive the word abab in two distinct ways: S asbs asb[asbs] abab, and S asbs a[bsas]bs abab (here [ ] marks where the rule is applied in the second step). (d) Now construct an unambiguous grammar for L. Hint: what kind of paths tracking the depth of the stack, or tracking the imbalance between as and bs seen so far do words in L correspond to, and how can we unambiguously define them as being made up of smaller paths? Answer. If we define the height as the number of as so far minus the number of bs or, equivalently, the depth of the stack if there have been more as and minus the depth if there have been more bs the words in L correspond to paths which begin and end at height zero. 5

One way to decompose these unambiguously is shown in the figure. Look at the first place where the height crosses zero i.e., the shortest initial subword which is also in L. Consider this part of the path. It either consists of a path which is nowhere negative, bracketed by a and b (the example shown in the figure) or a path which is nowhere positive, bracketed by b and a. Let s define variables A and B for the nowhere-negative and nowhere-positive paths respectively. In both cases the remainder of the path is another path which begins and ends at zero, and thus which corresponds to another word in L; this is generated by another copy of the start symbol S. Paths which are nowhere negative are isomorphic to words in the bracket language, where ( and ) correspond to a and b respectively. We can generate this unambiguously with the grammar A aaba ɛ. We generate nowhere-positive paths by switching a and b. All this gives the following set of rules: S aabs bbas ɛ A aaba ɛ B bbab ɛ A a b S Figure 1: An example of the decomposition. Here aabaabbbbaab = a(abaabb)b baab. (e) Use your unambiguous grammar from (d) to derive the generating function g(z) and check that it gives the same answer as you gave in (b) above. Answer. The rules of the grammar above imply that g(z) = z 2 g A (z)g(z) + z 2 g B (z)g(z) + 1 g A (z) = z 2 g A (z) 2 + 1 g B (z) = z 2 g B (z) 2 + 1 where g A and g B are the generating functions for the languages generated by A and B. Solving the quadratic equations for g A and g B, we get g A (z) = g B (z) = 1 1 4z 2 2z 2 6

which is the generating function for the bracket language. Substituting this into the above equation for g(z) we get g(z) = z 2 (g A (z) + g B (z)) g(z) + 1 = (1 ) 1 4z 2 g(z) + 1 and solving this linear equation for g gives g(z) = 1 1 4z 2. 7