Notes for Comp 497 (454) Week 10


Today we look at the last two chapters in Part II. Cohen presents some results concerning the two categories of language we have seen so far:

Regular languages (RL)
Context-free languages (CFL)

He also looks at some decidability issues.

Homework #5 (due April 8, 2014) is at the end of the notes.

Chapter 17

Errata (Chapter 17)
p. 383, two lines from the end, make that "(baabbbbb)(a)"
p. 392, eight lines from the end, append "in" to the line.
p. 398, Question 2(iii), 2x+2z should be 2x+z

Earlier we looked at the union, intersection and Kleene closure of regular languages. Let us see what properties context-free languages have, both in themselves and in conjunction with regular languages.

First, where L1 and L2 are context-free languages, what can we say about

(a) Union L1 + L2
(b) Product L1 L2
(c) Kleene closure L1*

All these turn out to be CFLs. We can prove this by construction in a couple of ways: (1) using grammars for the languages, (2) using machines. We look at grammars here. Cohen also discusses machine-based proofs.

(a) Union

THEOREM 36
If L1 and L2 are context-free languages, so is their union L1 + L2.

Comp 454 Notes Page 1 of 10 April 1, 2014

For each of L1 and L2 there is a CFG. Call these grammars CFG1 and CFG2 respectively. We modify CFG1 by adding the subscript 1 to each nonterminal. Thus X becomes X1. Similarly we modify CFG2, adding a different subscript, 2, to each nonterminal. Now we can combine the grammars into a single grammar with no clash between nonterminal names. Finally, we add a new rule

S → S1 | S2

Thus we have devised a CFG that generates L1 + L2 and shown that it is therefore a CFL.

(b) Product

THEOREM 37
If L1 and L2 are context-free languages, so is their product L1 L2.

The proof is similar to that of Theorem 36: subscripting followed by the addition of

S → S1 S2

(c) Closure

THEOREM 38
If L is a CFL then so is L*.

Change S to S1 throughout the existing grammar, then add the new rule

S → S S1 | Λ

What about (d) the intersection of two CFLs and (e) the complement of a CFL?

(d) Intersection

THEOREM 39
The intersection of two CFLs may or may not be a CFL.

We can show that both possibilities exist. For example, if both L1 and L2 are regular then L1 ∩ L2 is regular by Theorem 12 and therefore a CFL. But what if

L1 = a^n b^n a^m (m, n > 0) (this is a CFL; we can easily devise a CFG for it) and
L2 = a^n b^m a^m (m, n > 0) (also a CFL; we can easily devise a CFG for it)?

The intersection of these two languages is a^n b^n a^n, which we know is non-context-free.

It turns out there is no algorithm to which we can give two CFLs that will determine whether their intersection is a CFL.

(e) Complement

THEOREM 40
The complement of a CFL may or may not be a CFL.

Again we can show that both possibilities exist. If L is regular, and therefore CF, its complement is also regular by Theorem 11 and therefore CF.

To show that the complement of a CFL is not always CF, consider the following proof by contradiction. Write L' for the complement of L.

Assume that the complement of every CFL is CF.
If L1 and L2 are CFLs,
then L1' and L2' are CFLs (our assumption),
then (L1' + L2') is a CFL (Theorem 36),
then (L1' + L2')' is a CFL (our assumption again),
then L1 ∩ L2 is a CFL (de Morgan's law).

But we know that the intersection of two CFLs is not always CF, so it must be the case that our assumption is wrong and the complement of a CFL is not always CF.

Mixing context-free (CF) and regular languages (RL)

Union: CF + RL

The union of a CFL and an RL is CF because the RL is itself CF and we can apply Theorem 36. The union may or may not be regular.

Example 1: PALINDROME + (a+b)*
Every palindrome is already in (a+b)*, so the union is (a+b)*, which is regular.

Example 2: PALINDROME + a*
Every word in a* is already a palindrome, so the union is PALINDROME, which is non-regular.
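The grammar constructions behind Theorems 36 to 38 can be tried out in a few lines of Python. This is only a sketch under stated assumptions: the dictionary representation of a grammar, the start symbol name "S", and the helper names subscript, union, product, closure and words are all illustrative choices, not anything taken from Cohen. The words helper lists every word up to a length bound so that the combined grammars can be inspected.

```python
# A grammar is assumed to be a dict: nonterminal -> list of right-hand sides,
# each right-hand side a tuple of symbols. Any symbol that is not a key is
# treated as a terminal. The start symbol is assumed to be "S".

def subscript(grammar, tag):
    """Copy of the grammar with `tag` appended to every nonterminal."""
    def rename(sym):
        return sym + tag if sym in grammar else sym
    return {rename(nt): [tuple(rename(s) for s in rhs) for rhs in rules]
            for nt, rules in grammar.items()}

def union(g1, g2):
    """Theorem 36: subscript both grammars, then add S -> S1 | S2."""
    g = {**subscript(g1, "1"), **subscript(g2, "2")}
    g["S"] = [("S1",), ("S2",)]
    return g

def product(g1, g2):
    """Theorem 37: subscript both grammars, then add S -> S1 S2."""
    g = {**subscript(g1, "1"), **subscript(g2, "2")}
    g["S"] = [("S1", "S2")]
    return g

def closure(g1):
    """Theorem 38: rename S to S1, then add S -> S S1 | Lambda."""
    g = subscript(g1, "1")
    g["S"] = [("S", "S1"), ()]  # the empty tuple stands for Lambda
    return g

def words(grammar, max_len, start="S"):
    """All words of length <= max_len derivable from `start`."""
    lang = {nt: set() for nt in grammar}
    changed = True
    while changed:  # iterate until no nonterminal gains a new word
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                partial = {""}
                for sym in rhs:
                    choices = lang[sym] if sym in grammar else {sym}
                    partial = {p + c for p in partial for c in choices
                               if len(p + c) <= max_len}
                if not partial <= lang[nt]:
                    lang[nt] |= partial
                    changed = True
    return lang[start]

# Example grammars (illustrative): a^n b^n with n > 0, and aa*.
anbn = {"S": [("a", "S", "b"), ("a", "b")]}
aplus = {"S": [("a", "S"), ("a",)]}

print(sorted(words(union(anbn, aplus), 4)))
print(sorted(words(product(anbn, aplus), 4)))
print(sorted(words(closure(anbn), 4)))
```

Because the subscripts 1 and 2 are appended to every nonterminal, no rule of one grammar can accidentally rewrite a nonterminal of the other, which is the whole point of the renaming step in the proofs.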

Intersection: CFL ∩ RL

THEOREM 41
The intersection of a CFL and an RL is always CF.

The proof is by construction. Given a PDA for the CFL and an FA for the RL, we can construct a PDA for the intersection language. The states of the new PDA are combinations of the old PDA states and the FA states. Cohen sketches the construction logic on pages 394-395 and then shows how to construct a PDA that recognizes the intersection of EQUAL (the CFL of strings with the same number of a's as b's) and ENDA (the RL of strings ending in a).

Consider also DOUBLEWORD (= ww where w is any word in (a+b)*).

We know from Chapter 10 that it is non-regular.
We know from Chapter 16 that it is non-context-free (pumping lemma proof).

We can also prove it is non-context-free by means of Theorem 41 and a careful choice of regular language with which to intersect it. Consider the intersection of DOUBLEWORD with aa*bb*aa*bb*. The intersection is L = ww where w = a^n b^m with m, n > 0, i.e. a^n b^m a^n b^m. But we know this language is non-context-free (see last week), so DOUBLEWORD must be non-context-free also; otherwise it would contradict Theorem 41.

Chapter 18

Errata (Chapter 18)
p. 410, line 10, replace B A by B baa

In this chapter we look at some decidability issues similar to those considered in Chapter 11 for RL. Paraphrasing Cohen, the first group of questions (p. 402) is:

1. Do two CFGs define the same language?
2. Is a particular CFG ambiguous?
3. If a CFG is ambiguous, is there another CFG defining the same language that is not?
4. How can we tell if the complement of a CFL is CF?
5. How can we tell if CFL1 ∩ CFL2 is CF?
6. Given CFG1 defining CFL1 and CFG2 defining CFL2, is CFL1 ∩ CFL2 empty?
7. Given a CFG defining a CFL, is the CFL equivalent to (a+b)*?

These questions are all undecidable: no algorithm can exist for any of them. We will see more undecidable questions in Part III of the book.

However, there are still some questions concerning CFGs that we can answer:

1. (Emptiness) Does a particular CFG generate any words at all?
2. (Finiteness) Given a CFG, is the CFL it generates finite or infinite?
3. (Membership) Given a CFG, is a particular word w in the language it generates?

We will see similar questions to these in Part III.

Emptiness

THEOREM 42
Given a CFG, there is an algorithm to determine if it generates any words at all.

We can prove this is true by finding such an algorithm.

(1) Determine whether Λ is in the language, that is, whether S is nullable. If it is, the CFG generates a word and we are done.
(2) Otherwise, convert the grammar to CNF.
(3) Find a rule of the form N → t, where t is a terminal, and back substitute t for N in the other rules. Repeat.
(4) If S is eliminated, we are done: the CFG produces a string. If we cannot eliminate S, the CFG produces no strings.

This is an algorithm because the back substitution must terminate in a finite number of steps: there are a finite number of rules in the grammar.

The example on page 405 is of a grammar that does produce strings. The example on page 406 is of a grammar that does not produce any strings. If you draw the derivation tree of the page 406 grammar, you will see that we can never get rid of all the nonterminals in a working string.

The following theorem is somewhat related.

THEOREM 43
Given a CFG with nonterminal X, there is an algorithm to determine if X is ever used in the generation of words.
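The emptiness test of Theorem 42 can be sketched in Python. This is a minimal sketch that uses the usual marking formulation (repeatedly mark every nonterminal that has a rule whose right-hand side contains only terminals and already-marked nonterminals) rather than Cohen's literal back substitution on a CNF grammar; the two reach the same verdict, and the marking version does not need the CNF conversion. The grammar representation, the function name generates_some_word and the two sample grammars are assumptions made for illustration.

```python
# A grammar is assumed to be a dict: nonterminal -> list of right-hand sides,
# each a tuple of symbols. Symbols that are not keys are treated as terminals,
# so every nonterminal must appear as a key (possibly with an empty rule list).

def generates_some_word(grammar, start="S"):
    """Return True if the grammar derives at least one string of terminals."""
    productive = set()  # nonterminals known to derive a terminal string
    changed = True
    while changed:      # at most one pass per nonterminal can add something
        changed = False
        for nt, rules in grammar.items():
            if nt in productive:
                continue
            for rhs in rules:
                # Every symbol is a terminal or an already-productive nonterminal.
                if all(sym not in grammar or sym in productive for sym in rhs):
                    productive.add(nt)
                    changed = True
                    break
    return start in productive

# Illustrative grammars, not taken from the book.
dead = {"S": [("A", "B")], "A": [("a",)], "B": [("B", "b")]}  # B never terminates
live = {"S": [("A", "B"), ("a",)], "A": [("a",)], "B": [("B", "b")]}

print(generates_some_word(dead))
print(generates_some_word(live))
```

Termination follows the same reasoning as in the notes: each pass either marks a new nonterminal or stops, and there are only finitely many nonterminals.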

Again, we can prove Theorem 43 by devising such an algorithm. We could break the problem down into two subproblems: (i) can we generate a string of terminals from X? (ii) can we obtain from S a working string containing X?

(i) In a copy of the grammar (CFG2), exchange S and X wherever they occur, so that X plays the role of the start symbol. Now apply the algorithm of Theorem 42 to CFG2. If CFG2 produces any words, then X in the original CFG produces words.

(ii) Use back substitution of X to see if we can obtain a working string containing X from S. In the example on page 408, X can be produced from S if A can be produced from S (second rule). A can be produced from S according to the first rule, so X can appear in a working string derived from S.

Finiteness

THEOREM 44
There is an algorithm to decide if a given CFG generates a finite or an infinite language.

Again the proof is to construct such an algorithm. Note that if any word in the language is long enough to apply the pumping lemma to, we can pump it to produce infinitely many words, so the language is infinite. Conversely, if the language is infinite there must be some words long enough to apply the pumping lemma to. So we need to determine if the pumping lemma can be applied: we need to see if there is a self-embedded nonterminal that is involved in the derivation of words in the language.

Algorithm
(1) Eliminate useless nonterminals from the grammar (see Theorem 43).
(2) Use back substitution similar to that of Theorem 43 to see if there is a self-embedded (directly or indirectly) nonterminal.

Example on p. 410-411.

Membership

We would like to be able to determine if the language defined by a particular CFG contains a particular word w.

THEOREM 45
Given a CFG and a string x (over the same alphabet), we can determine if x can be generated by the CFG.

Once more the proof of the theorem is the demonstration that an algorithm exists to answer the question. The algorithm is the CYK algorithm (p. 410).

We assume the CFG is in CNF. We wish to determine which substrings of x = x_1 x_2 ... x_n are derivable from which nonterminals in the grammar S, N_1, N_2, ... N_M.

The substrings of length 1 are easy to identify because they are on the RHS of rules of the form N → t. For each substring of length 1 we have a list of the nonterminals that can produce it.

For a substring of length 2, e.g. x_i x_(i+1), to be producible, x_i must be producible from some N_p, x_(i+1) must be producible from some N_q, and there must be a rule N_r → N_p N_q. For each producible substring of length 2 we have a list of the nonterminals (N_r) that can produce it.

Similarly, we can determine which substrings of length 3 are producible, by splitting each into two shorter pieces whose lists we already have, then which substrings of length 4, and so on.

Eventually, we will consider the (sub)string of length n (the length of the word of interest). If S is among the nonterminals that can produce it then x is in the language.

Because the string x is finite in length, this algorithm is finite.

Parsing Simple Arithmetic

In Chapter 3 we had a simple (recursive) grammar (AE) for arithmetic expressions. However, it did not reflect the different operator precedences. A better grammar, PLUS-TIMES, is given on page 414. It distinguishes between a lower precedence operator (+) and a higher precedence operator (*).

We can easily extend the grammar to include subtraction and division operators. We can also extend the grammar to include operators with precedence greater than * (for example exponentiation) and operators with precedence lower than + (for example bitwise AND).
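Returning briefly to Theorem 45, the CYK table-filling described above can be sketched in Python. This is a sketch under stated assumptions: the grammar is already in CNF and is passed as two lists, one of rules N → t and one of rules N_r → N_p N_q; the function name cyk and the sample CNF grammar for a^n b^n (n > 0) are illustrative and not taken from the book.

```python
def cyk(word, terminal_rules, binary_rules, start="S"):
    """Decide membership of `word` for a grammar in Chomsky Normal Form.

    terminal_rules: list of (N, t) pairs for rules N -> t
    binary_rules:   list of (Nr, Np, Nq) triples for rules Nr -> Np Nq
    """
    n = len(word)
    if n == 0:
        return False  # a CNF grammar (no Lambda rules) cannot produce the empty word

    # table[length][i] = nonterminals that can produce word[i : i + length]
    table = [[set() for _ in range(n)] for _ in range(n + 1)]

    # Substrings of length 1 come straight from rules N -> t.
    for i, letter in enumerate(word):
        table[1][i] = {nt for nt, t in terminal_rules if t == letter}

    # Longer substrings: try every way of splitting into two shorter pieces.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[split][i]
                right = table[length - split][i + split]
                for nr, np_, nq in binary_rules:
                    if np_ in left and nq in right:
                        table[length][i].add(nr)

    return start in table[n][0]

# Illustrative CNF grammar for a^n b^n, n > 0:
#   S -> A B | A X,   X -> S B,   A -> a,   B -> b
terminal_rules = [("A", "a"), ("B", "b")]
binary_rules = [("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")]

for w in ["ab", "aabb", "abab", "aab"]:
    print(w, cyk(w, terminal_rules, binary_rules))
```

The three nested loops over length, start position and split point make the running time roughly proportional to n^3 times the number of rules, which is finite for any finite word, as the notes observe.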

The derivation tree for an arithmetic expression generated using PLUS-TIMES will properly reflect operator precedence (see p. 416).

The parsing problem is how to determine whether a string x is in a language L; in this case, whether a string is a valid arithmetic expression. Two approaches are:

(1) Top-down: start with S and see if we can generate x
(2) Bottom-up: start with x and see if we can reduce it to S

(1) Top-down: On pages 416-420, Cohen gives an example of this approach showing how a derivation tree is grown and pruned and how, in this case, a tree can be constructed with the target string as its leaves. Note that we don't need to grow the tree in full; we can explore a branch, then backtrack to the parent and try another if it doesn't work out.

(2) Bottom-up: On pages 421-423. In this case the root of the tree is the string x and we try to construct a path to a leaf S.

We know from Data Structures classes that a postfix (Reverse Polish) arithmetic expression can be evaluated using a stack of operands. A PDA has a stack, so it seems reasonable to devise a PDA that can read a postfix expression and output its value. We need to add ADD, MPY and PRINT operators to our existing set of nodes. This PDA is given on page 424.

We also know from Data Structures that Dijkstra's algorithm for converting infix to postfix also uses a stack. This time the stack contains operators and open parentheses. The PDA for this process is on page 427. Input is an infix expression; output is the corresponding postfix.

Reading Assignment

Read Chapters 17 and 18.
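The two stack-based processes mentioned in the notes, evaluating a postfix expression and Dijkstra's infix-to-postfix conversion, can be sketched in ordinary Python rather than as PDAs. This is only an illustration: it is not the machine on page 424 or page 427, it handles just +, * and parentheses over integer operands, and the function names to_postfix and eval_postfix and the token-list input format are assumptions.

```python
PRECEDENCE = {"+": 1, "*": 2}  # * binds more tightly than +

def to_postfix(tokens):
    """Convert a list of infix tokens to postfix using a stack of operators."""
    output, stack = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators of greater or equal precedence (left associativity).
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced parentheses")
            stack.pop()  # discard the matching "("
        else:
            output.append(tok)  # operands go straight to the output
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced parentheses")
        output.append(op)
    return output

def eval_postfix(tokens):
    """Evaluate a postfix expression using a stack of operands."""
    stack = []
    for tok in tokens:
        if tok in PRECEDENCE:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if tok == "+" else left * right)
        else:
            stack.append(int(tok))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]

for infix in ["2 + 3 * 4", "( 2 + 3 ) * 4"]:
    postfix = to_postfix(infix.split())
    print(infix, "=>", " ".join(postfix), "=", eval_postfix(postfix))
```

In the conversion the stack holds operators and open parentheses, while in the evaluation it holds operands, matching the two uses of the stack described in the notes.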

Homework #5

Here is Homework #5, due April 8, 2014. Each question is worth 20 points. It covers Chapters 16, 17 and 18.

1. Prove that the language

{ a^n b^n a^n b^n a^n for n = 1, 2, 3, ... } = { ababa aabbaabbaa aaabbbaaabbbaaa ... }

is non-context-free.

2. Let VERYEQUAL be the language of all words over Σ = { a b c } that have the same number of a's, b's and c's.

VERYEQUAL = { abc acb bac bca cab cba aabbcc aabcbc ... }

Notice that the order of the letters does not matter. Prove that VERYEQUAL is non-context-free.

3. Language L1 is defined by the following CFG:

S → aSa | aTa
T → b | bT

Language L2 is defined by the following CFG:

S → XY
X → aXb | ab
Y → a | aY

Is the intersection of these languages context-free? Why or why not?

4. For each of the following grammars, determine whether it generates any words, using the algorithm of Theorem 42 (page 403).

(i)
S → aSa | bSb

(ii)
S → XY
X → SY
Y → SX
X → a
Y → b

(iii)
S → AB
A → BC
C → DA
B → CD
D → a
A → b

(iv)
S → XS
Y → YX
Y → YY
Y → XX
X → a

(v)
S → AB
A → BSB
B → AAS
A → CC
B → CC
C → SS
A → a | b
C → b | bb

5. Using bottom-up parsing, find any derivation in the grammar PLUS-TIMES for the following expressions:

(a) i * (i)
(b) i * (i + i)