Chap. 7 Properties of Context-free Languages


7.1 Normal Forms for Context-free Grammars
Context-free grammars: A → α where A ∈ N, α ∈ (N ∪ T)*.
0. Chomsky Normal Form: A → BC or A → a, except S → ε, where A, B, C ∈ N, a ∈ T.
1. Eliminating useless symbols (non-generating and non-reachable)
2. Eliminating ε-productions (no A → ε except S → ε): |α| ≥ 1
3. Eliminating unit productions (no A → B): A → α or A → a where A ∈ N, α ∈ (N ∪ T)*, |α| ≥ 2, a ∈ T.
4. Introducing a variable for each terminal (B_a → a, a ∈ T): A → α or A → a where A ∈ N, α ∈ N*, |α| ≥ 2, a ∈ T.
5. Reducing the length of each RHS to two: A → BC or A → a, except S → ε, where A, B, C ∈ N, a ∈ T.
11/8/16 Kwang-Moo Choe

7.1.1 Eliminating Useless Symbols
We say X ∈ N ∪ T is useful if S ⇒* αXβ ⇒* w, with α, β ∈ (N ∪ T)*, w ∈ T*; useless otherwise.
1. X is generating if X ⇒* w, w ∈ T*.
2. X is reachable if S ⇒* αXβ.
Procedure:
1. Eliminate non-generating symbols and their productions.
2. Eliminate non-reachable symbols and their productions.
Theorem 7.2 Let G = (N, T, P, S) be a CFG where L(G) ≠ ∅.
1. Eliminate non-generating symbols and productions in G, giving G_2 = (N_2, T_2, P_2, S).
2. Eliminate non-reachable symbols and productions from G_2, giving G_1 = (N_1, T_1, P_1, S).
Then G_1 has no useless symbols, and L(G_1) = L(G).

Theorem 7.4 Finding generating symbols
Basis: every a ∈ T is generating.
Rec.: for A → X_1 … X_n ∈ P, if (∀ 1 ≤ i ≤ n: X_i is generating) or (n = 0), then A is generating.
Proof: every X found by the algorithm satisfies X ⇒* w, w ∈ T* (induction on the number of steps in the algorithm).
Basis: zero steps. a ⇒* a.
Rec.: suppose X is found using X → X_1 … X_n ∈ P. For 1 ≤ i ≤ n, X_i ⇒* w_i, w_i ∈ T* by IH. Then X ⇒ X_1 … X_n ⇒* w_1 … w_n = w, w ∈ T*.

Theorem 7.6 Finding reachable symbols
Basis: S is reachable.
G_reach = (N, E), where (A, B) ∈ E if A → αBβ ∈ P.
Rec.: if A ∈ N is reachable and A → X_1 … X_n ∈ P, then ∀ 1 ≤ i ≤ n, X_i is reachable.
Proof: X ∈ (N ∪ T) is reachable if S ⇒* αXβ; this corresponds to a path to X in G_reach = (N, E) starting from S ∈ N.
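The two fixed-point computations above can be sketched in Python. This is a minimal illustration, not the slides' own code: the grammar encoding (a dict from each nonterminal to a list of RHS tuples), the function names, and the sample grammar are all assumptions made for the example.

```python
def generating_symbols(productions, terminals):
    """Theorem 7.4: terminals are generating; A is generating if some
    A -> X1...Xn has every Xi generating (n = 0 included)."""
    generating = set(terminals)
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in generating and any(
                all(sym in generating for sym in body) for body in bodies
            ):
                generating.add(head)
                changed = True
    return generating


def reachable_symbols(productions, start):
    """Theorem 7.6: graph search from the start symbol."""
    reachable, stack = {start}, [start]
    while stack:
        head = stack.pop()
        for body in productions.get(head, []):
            for sym in body:
                if sym not in reachable:
                    reachable.add(sym)
                    stack.append(sym)
    return reachable


def remove_useless(productions, terminals, start):
    """Theorem 7.2: drop non-generating symbols first, then non-reachable ones."""
    gen = generating_symbols(productions, terminals)
    step1 = {
        head: [b for b in bodies if all(s in gen for s in b)]
        for head, bodies in productions.items()
        if head in gen
    }
    reach = reachable_symbols(step1, start)
    return {h: bs for h, bs in step1.items() if h in reach}


# Hypothetical grammar: S -> AB | a, A -> b, and B has no productions.
grammar = {"S": [("A", "B"), ("a",)], "A": [("b",)], "B": []}
cleaned = remove_useless(grammar, {"a", "b"}, "S")
print(cleaned)
```

In this example B is not generating, so S → AB is dropped in step 1; A then becomes unreachable and is dropped in step 2. Running the steps in the opposite order would leave A in the grammar, which is why the order in Theorem 7.2 matters.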

7.1.3 Eliminating ε-Productions
A → ε is called an ε-production. G is ε-free if P has no ε-production. A is nullable iff A ⇒* ε.
Theorem 7.7 Finding nullable symbols
Basis: if A → ε ∈ P, A is nullable.
Rec.: for B → C_1 … C_k ∈ P, if every C_i ∈ N (1 ≤ i ≤ k) is nullable, then B is nullable.
Proof: A ⇒* ε if and only if A is nullable in the above algorithm.
(If) If A is nullable in the above algorithm, then A ⇒* ε. Trivial.
(Only if) Induction on the length of the shortest derivation of ε.
Basis: one step. A → ε ∈ P, A ⇒^1 ε, so A is nullable by the basis rule.
Induction: suppose A ⇒^n ε with n > 1 derivation steps. Then A ⇒ B_1 … B_k ⇒+ ε, where ∀ 1 ≤ i ≤ k: B_i ⇒* ε in fewer than n steps. ∀ 1 ≤ i ≤ k: B_i is nullable by IH, so A is nullable in the above algorithm.
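The nullable-symbol computation of Theorem 7.7 can be sketched as follows. The dict-of-tuples grammar encoding, the function name, and the sample grammar are assumptions chosen for illustration; the empty tuple () stands for an ε body.

```python
def nullable_symbols(productions):
    """Iterate the basis and recursive rule of Theorem 7.7 to a fixed point."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            # all() over an empty body is True, which covers the basis A -> ε.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


# Hypothetical grammar: S -> AB, A -> aAA | ε, B -> bBB | ε
grammar = {
    "S": [("A", "B")],
    "A": [("a", "A", "A"), ()],
    "B": [("b", "B", "B"), ()],
}
print(sorted(nullable_symbols(grammar)))  # ['A', 'B', 'S']
```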

Theorem 7.9 Let G = (N, T, P, S) be a cfg. Then G_1 = (N, T, P_1, S) is ε-free and L(G_1) = L(G) - {ε}.
P_1: for each A → X_1 … X_n ∈ P, add A → Z_1 … Z_n to P_1, where
i) if X_i is not nullable, Z_i = X_i (no change!);
ii) if X_i is nullable, Z_i = (X_i | ε);
iii) remove A → ε, if any.
(With m nullable symbols in the body, 2^m rules; if m = n, the upper bound is 2^n - 1 rules.)
Proof: A ⇒*_{G1} w if and only if A ⇒*_G w and w ≠ ε (w ∈ T+).
(If) If A ⇒^k_G w and w ≠ ε, then A ⇒*_{G1} w.
Basis: if A ⇒_G w and w ≠ ε (w ∈ T+), then A → w ∈ P_1, so A ⇒_{G1} w.
Induction: A ⇒_G X_1 … X_n ⇒^{k-1}_G w = w_1 … w_n, with X_i ⇒*_G w_i and w ≠ ε. If w_i ≠ ε, then X_i ⇒*_{G1} w_i by IH (X_i derives w_i in G in fewer than k steps).

If w_i = ε, X_i is nullable.
A ⇒_G X_1 … X_{i-1} X_i X_{i+1} … X_n ⇒*_G w_1 … w_{i-1} w_{i+1} … w_n = w.
A → X_1 … X_{i-1} X_{i+1} … X_n ∈ P_1 by construction of P_1.
A ⇒_{G1} X_1 … X_{i-1} X_{i+1} … X_n ⇒*_{G1} w_1 … w_{i-1} w_{i+1} … w_n = w.
(Only if) If A ⇒^k_{G1} w, then A ⇒*_G w and w ≠ ε.
Basis: if A ⇒_{G1} w, then w ≠ ε (G_1 is ε-free), and A ⇒_G α, α ⇒*_G w (using ε-rules only, w ≠ ε).
Induction: assume A ⇒_{G1} Z_1 … Z_n ⇒^{k-1}_{G1} w = x_1 … x_n, with Z_i ⇒*_{G1} x_i. A → Z_1 … Z_n ∈ P_1 comes from some A → X_1 … X_m ∈ P (m ≥ n).
A ⇒_G X_1 … X_m ⇒*_G Z_1 … Z_n (ε-rules only) ⇒*_G x_1 … x_n = w, where x_i ≠ ε by IH (Z_i derives x_i in G_1 in fewer than k steps), so w ≠ ε.
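The construction of P_1 in Theorem 7.9 can be sketched as below. This is an illustrative sketch under assumptions: the dict-of-tuples grammar encoding, the function name, and the sample grammar are not from the slides, and the set of nullable symbols is passed in as already computed.

```python
from itertools import product


def eliminate_epsilon(productions, nullable):
    """Build P_1: every nullable symbol in a body is either kept or dropped,
    and bodies that become empty are never added (rule iii)."""
    new_productions = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # Rule ii: a nullable X_i may appear or vanish; rule i: others stay.
            options = [
                ((sym,), ()) if sym in nullable else ((sym,),) for sym in body
            ]
            for choice in product(*options):
                candidate = tuple(s for part in choice for s in part)
                if candidate:
                    new_bodies.add(candidate)
        new_productions[head] = sorted(new_bodies)
    return new_productions


# Hypothetical grammar: S -> AB, A -> aAA | ε, B -> bBB | ε (all nullable)
grammar = {
    "S": [("A", "B")],
    "A": [("a", "A", "A"), ()],
    "B": [("b", "B", "B"), ()],
}
result = eliminate_epsilon(grammar, nullable={"S", "A", "B"})
for head, bodies in result.items():
    print(head, "->", " | ".join(" ".join(b) for b in bodies))
```

A body with m nullable symbols yields up to 2^m variants, which is where the exponential cost of this step comes from.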

7.1.4 Eliminating Unit Productions
A → B is called a unit production if A, B ∈ N. (A, B) is called a unit pair if A ⇒* B using unit productions only.
Theorem 7.11 The following algorithm finds exactly the unit pairs.
Basis: (A, A) is a unit pair.
Induction: if (A, B) is a unit pair and B → C ∈ P, then (A, C) is a unit pair.
Proof: induction on the number of derivation steps by which a unit pair is found.
Basis: zero steps. A = B, and (A, A) is added in the basis.
Induction: assume A ⇒^n C. Then ∃B: A ⇒^{n-1} B ⇒ C. (A, B) is a unit pair (IH), and the induction rule with B → C ∈ P adds (A, C) as a unit pair.
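The unit-pair algorithm of Theorem 7.11 can be sketched as a small closure computation. The grammar encoding (dict from nonterminal to a list of RHS tuples), the function name, and the sample expression grammar are assumptions made for illustration.

```python
def unit_pairs(productions):
    """Basis: (A, A). Induction: (A, B) and B -> C with C a variable give (A, C)."""
    nonterminals = set(productions)
    pairs = {(a, a) for a in nonterminals}
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for body in productions.get(b, []):
                is_unit = len(body) == 1 and body[0] in nonterminals
                if is_unit and (a, body[0]) not in pairs:
                    pairs.add((a, body[0]))
                    changed = True
    return pairs


# Hypothetical grammar: E -> T | E+T, T -> F | T*F, F -> a | (E)
grammar = {
    "E": [("T",), ("E", "+", "T")],
    "T": [("F",), ("T", "*", "F")],
    "F": [("a",), ("(", "E", ")")],
}
print(sorted(unit_pairs(grammar)))
```

For this grammar the pairs beyond the reflexive ones are (E, T), (T, F), and (E, F).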

Theorem 7.13 Let G = (N, T, P, S) be a cfg. Then there is G_1 = (N, T, P_1, S) that has no unit productions and L(G_1) = L(G), where
P_1 = {A → α | (A, B) is a unit pair, B → α ∈ P, α ∉ N}.
Proof: A ⇒*_G w if and only if A ⇒*_{G1} w.
If A → α ∈ P_1, then α ∉ N: P_1 contains non-unit productions only.
(If) If A → α ∈ P_1, then A → α ∈ P or A ⇒*_G B ⇒_G α. So if A → α ∈ P_1, then A ⇒*_G α. Hence if A ⇒*_{G1} w, then A ⇒*_G w.
(Only if) If A ⇒*_G w, then A ⇒*_lm w in G. Assume A = α_0 ⇒_lm α_1 ⇒_lm α_2 ⇒_lm … ⇒_lm α_n = w in G. For each 0 ≤ i < n:
1) If α_i ⇒_lm α_{i+1} by a non-unit production in G, then α_i ⇒_lm α_{i+1} in G_1.
2) If α_i ⇒_lm α_{i+1} in G by a unit production, then ∃k > i such that ∀ i ≤ j < k, α_j ⇒_lm α_{j+1} in G by unit productions, and finally α_k ⇒_lm α_{k+1} by a non-unit production. Then α_i ⇒_lm α_{k+1} in G_1.
Hence if A ⇒*_lm w in G, then A ⇒*_lm w in G_1.

7.1.5 Chomsky Normal Form (CNF)
Every production has one of the forms
1. S → ε ∈ P, or
2. A → BC ∈ P where B, C ∈ N, or
3. A → a ∈ P where a ∈ T.
Theorem 7.16 Let G = (N, T, P, S) be a CFG. There is a CFG G_1 such that G_1 is in CNF and L(G) = L(G_1).
Proof
1. Eliminate useless symbols and productions.
2. Eliminate ε-rules.

3. Eliminate unit productions.
Now there are no ε-productions and no unit productions. If ε ∈ L(G), add S → ε ∈ P_1.
A → a ∈ P: already CNF.
A → X_1 … X_n ∈ P where n ≥ 2, X_i ∈ N ∪ T: for each X_i ∈ T, add B_a → a (a = X_i) to P_1 and replace X_i by B_a.
A → C_1 … C_n ∈ P where n ≥ 2, C_i ∈ N: if n = 2, already CNF.
A → C_1 … C_n ∈ P where n ≥ 3, C_i ∈ N: replace by
A → C_1 D_1 ∈ P_1, D_1 → C_2 D_2 ∈ P_1, …, D_{n-3} → C_{n-2} D_{n-2} ∈ P_1, D_{n-2} → C_{n-1} C_n ∈ P_1.

Proof: that G_1 is in CNF is trivial.
1) If A → X_1 … X_k ∈ P, then A ⇒+_{G1} X_1 … X_k. Hence if A ⇒*_G w, then A ⇒*_{G1} w.
2) If A ⇒*_{G1} w, consider the parse tree of w in G_1 and convert it into a parse tree of w in G:
i) A → C_1 D_1, …, D_{n-3} → C_{n-2} D_{n-2}, D_{n-2} → C_{n-1} C_n into A → C_1 … C_{n-1} C_n (Fig. 7.4);
ii) B_a → a into a.
Therefore L(G) = L(G_1). See Ex. 7.15 on p. 273.
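The last two steps of the CNF construction (introducing B_a for terminals and breaking long bodies with D_i) can be sketched as follows. This is a sketch under stated assumptions: the input grammar is assumed to have no ε-productions and no unit productions, the generated names B_a and D1, D2, … are assumed not to clash with existing variables, and the dict-of-tuples encoding and function name are hypothetical.

```python
from itertools import count


def to_cnf(productions, terminals):
    """Apply steps 4 and 5 only (terminal variables, then binary bodies)."""
    fresh = count(1)
    cnf = {head: [] for head in productions}
    for head, bodies in productions.items():
        for body in bodies:
            if len(body) == 1:
                cnf[head].append(body)  # A -> a is already in CNF
                continue
            # Step 4: replace each terminal a in a long body by B_a, B_a -> a.
            symbols = []
            for sym in body:
                if sym in terminals:
                    var = f"B_{sym}"
                    cnf.setdefault(var, [(sym,)])
                    symbols.append(var)
                else:
                    symbols.append(sym)
            # Step 5: A -> C1 D1, D1 -> C2 D2, ..., D(n-2) -> C(n-1) Cn.
            current = head
            while len(symbols) > 2:
                new_var = f"D{next(fresh)}"
                cnf[current].append((symbols[0], new_var))
                cnf[new_var] = []
                current, symbols = new_var, symbols[1:]
            cnf[current].append(tuple(symbols))
    return cnf


# Hypothetical grammar: S -> aSb | ab
grammar = {"S": [("a", "S", "b"), ("a", "b")]}
cnf = to_cnf(grammar, {"a", "b"})
for head, bodies in cnf.items():
    print(head, "->", " | ".join(" ".join(b) for b in bodies))
```

A body of length n ≥ 3 produces n - 2 new variables, so this step grows the grammar only linearly.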

Regular (type 3) grammar (normal form)
A → aB or b, where A, B ∈ N, a, b ∈ T: right linear
  A → aB | aC: non-deterministic!
  No A → aB | aC with B ≠ C: deterministic!
A → Ba or b, where A, B ∈ N, a, b ∈ T: left linear
(Extended) regular (type 3) grammar
A → xB or y, where A, B ∈ N, x, y ∈ T*: extended right linear
A → Bx or y, where A, B ∈ N, x, y ∈ T*: extended left linear
Context-free (type 2) grammar (Chomsky's normal form)
A → BC or a, where A, B, C ∈ N, a ∈ T
Context-free (type 2) grammar (extended)
A → α, where A ∈ N, α ∈ (N ∪ T)*

7.2 The Pumping Lemma for Context-free Languages
7.2.1 The size of parse trees
Theorem 7.17 Let G = (N, T, P, S) be a Chomsky Normal Form context-free grammar and consider a parse tree for w ∈ L(G). If n is the length (number of edges) of the longest path in the parse tree, then |w| ≤ 2^(n-1).
Proof: induction on n.
i) n = 1: w ∈ T, |w| = 1 ≤ 2^(1-1) = 1.
ii) n > 1: S → AB is at the root of the tree. There are two subtrees with roots A and B, respectively; assume A ⇒* w_a, B ⇒* w_b, and w = w_a w_b. By the induction hypothesis, |w_a| ≤ 2^(n-2) and |w_b| ≤ 2^(n-2). So |w| ≤ 2^(n-2) + 2^(n-2) = 2^(n-1).
The tree is binary (A → BC) with unary leaves (A → a), so |w| ≤ 2^(n-1).

7.2.2 Statement of the pumping lemma
Theorem 7.18 (The pumping lemma for context-free languages) Let L be a CFL. Then ∃n ∈ ℕ such that if z ∈ L and |z| ≥ n, we can write z = uvwxy with
1) |vwx| ≤ n (the first pump),
2) vx ≠ ε (nontrivial pump),
3) ∀i ≥ 0, u v^i w x^i y ∈ L (pump).
Proof: since L is a CFL, ∃G = (N, T, P, S) where L = L(G) and G is in CNF. Choose n = 2^|N| and suppose the longest path in the parse tree for z ∈ L has length k+1.
n = 2^|N| ≤ |z| ≤ 2^((k+1)-1) = 2^k (def. of n, and Thm. 7.17), so |N| ≤ k.
Consider the longest path (of length k+1): (A_0, A_1, …, A_k, a),

where A_0 = S, ∀ 0 ≤ i ≤ k: A_i ∈ N, and a ∈ T. (Fig. 7.5)
Since |N| ≤ k, the path has at least |N| + 1 variables, so ∃ 0 ≤ i < j ≤ k with A_i = A_j = A.
Assume S ⇒* u A_i y ⇒* u v A_j x y ⇒* uvwxy. (Fig. 7.6)
Note that A = A_i ⇒* v A_j x = vAx, A = A_j ⇒* w, and S ⇒* uAy.
(1) S ⇒* uAy ⇒* uwy = u v^0 w x^0 y; and
(2) assume A ⇒* v^(i-1) w x^(i-1) for i ≥ 1; then S ⇒* uAy ⇒* uvAxy ⇒* u v v^(i-1) w x x^(i-1) y = u v^i w x^i y. (Fig. 7.7)
So S ⇒* u v^i w x^i y for i ≥ 0. This gives 3), pumping (compare x y^i z for regular languages).
Since G has no useless symbols and is ε-free (CNF), v ≠ ε or x ≠ ε, so vx ≠ ε. This gives 2), nonempty pumping (compare y ≠ ε for regular languages).
We can select A_i to be the repetition closest to the bottom of the tree, so that k - i ≤ |N|. Since the longest path in the A_i-subtree then has length ≤ |N| + 1, |vwx| ≤ 2^|N| = n (Thm. 7.17). This gives 1), the first pump (compare |xy| ≤ n for regular languages).

7.2.3 Application of the Pumping Lemma for CFL's
1. Pick the language L that we want to prove is not context-free.
2. The adversary picks n (any possible n).
3. Pick z; we may use n as a parameter.
4. The adversary breaks z into uvwxy with |vwx| ≤ n, vx ≠ ε.
5. To win the game, find i such that u v^i w x^i y ∉ L.
Context-free languages cannot match more than two groups of symbols for equality or inequality.
Example 7.19 L = {0^n 1^n 2^n | n ≥ 1}
Let n be the adversary's number, and z = 0^n 1^n 2^n. For all breaks of z into uvwxy with |vwx| ≤ n, vx ≠ ε, the substring vwx cannot contain both 0's and 2's:
(1) vwx lies within 0^i 1^(n-i), with u = 0^(n-i) … and y = … 1^i 2^n. Since vx ≠ ε, uwy has fewer 0's or 1's than 2's, so uwy ∉ L.
(2) vwx lies within 1^i 2^(n-i), with u = 0^n 1^(n-i) … and y = … 2^i. Since vx ≠ ε, uwy has fewer 1's or 2's than 0's, so uwy ∉ L.
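The game for Example 7.19 can be checked by brute force for one small n. The sketch below enumerates every split z = uvwxy allowed to the adversary and tests i = 0, 1, 2 only; the function names and the choice n = 4 are assumptions for illustration. A result of False means every split already fails for some i ≤ 2; this is a sanity check for one value of n, not a proof.

```python
def in_L(s):
    """Membership in L = {0^n 1^n 2^n | n >= 1}."""
    n = len(s) // 3
    return n >= 1 and s == "0" * n + "1" * n + "2" * n


def survives_pumping(z, n, max_i=2):
    """True if some split z = uvwxy with |vwx| <= n and vx != ε keeps
    u v^i w x^i y in L for every i in 0..max_i."""
    for start in range(len(z) + 1):  # u = z[:start]
        for end in range(start, min(start + n, len(z)) + 1):  # vwx = z[start:end]
            for a in range(start, end + 1):  # v = z[start:a]
                for b in range(a, end + 1):  # w = z[a:b], x = z[b:end]
                    u, v, w = z[:start], z[start:a], z[a:b]
                    x, y = z[b:end], z[end:]
                    if not (v or x):
                        continue  # the adversary must choose vx != ε
                    if all(
                        in_L(u + v * i + w + x * i + y)
                        for i in range(max_i + 1)
                    ):
                        return True
    return False


n = 4
z = "0" * n + "1" * n + "2" * n
print(survives_pumping(z, n))  # False: no split of z can be pumped
```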

Matches between two groups cannot be interleaved.
Example 7.20 L = {0^i 1^j 2^i 3^j | i, j ≥ 1}
Let n be the adversary's number and z = 0^n 1^n 2^n 3^n. For all breaks of z into uvwxy with |vwx| ≤ n, vx ≠ ε:
vwx is a substring containing at most two consecutive kinds of symbols.
Nontrivial (vx ≠ ε) pumping of v and x changes only the at most n symbols that are in vwx.
CFL's cannot match two strings of arbitrary length.
Exercise 7.21 L_ww = {ww | w ∈ (0+1)*} vs. L_wwr = {w w^R | w ∈ (0+1)*}
Consider z = 0^n 1^n 0^n 1^n.

7.3 Closure Properties of Context-Free Languages
Context-free languages are closed under 1. union, 2. concatenation, 3. closure, 4. substitution, 5. reversal.
Let G_A = (N_A, T_A, P_A, S_A) and G_B = (N_B, T_B, P_B, S_B) be cfg's with N_A ∩ N_B = ∅. Then
G_1 = (N_A ∪ N_B ∪ {S_1}, T_A ∪ T_B, P_A ∪ P_B ∪ {S_1 → S_A | S_B}, S_1), L(G_1) = L(G_A) ∪ L(G_B),
G_2 = (N_A ∪ N_B ∪ {S_2}, T_A ∪ T_B, P_A ∪ P_B ∪ {S_2 → S_A S_B}, S_2), L(G_2) = L(G_A)L(G_B),
G_3 = (N_A ∪ {S_3}, T_A, P_A ∪ {S_3 → S_A S_3 | ε}, S_3), L(G_3) = L(G_A)*,
G_5 = (N_A, T_A, {A → α^R | A → α ∈ P_A}, S_A), L(G_5) = L(G_A)^R.
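The grammar constructions above are simple enough to write down directly. The sketch below assumes a dict-of-tuples encoding of productions, disjoint variable sets, and fresh start names S1, S2, S3 that do not already occur; the function names and sample grammars are hypothetical.

```python
def union(pa, sa, pb, sb, new_start="S1"):
    """G_1: add S1 -> S_A | S_B."""
    return {**pa, **pb, new_start: [(sa,), (sb,)]}, new_start


def concatenation(pa, sa, pb, sb, new_start="S2"):
    """G_2: add S2 -> S_A S_B."""
    return {**pa, **pb, new_start: [(sa, sb)]}, new_start


def star(pa, sa, new_start="S3"):
    """G_3: add S3 -> S_A S3 | ε (the empty tuple stands for ε)."""
    return {**pa, new_start: [(sa, new_start), ()]}, new_start


def reversal(pa):
    """G_5: reverse the body of every production."""
    return {h: [tuple(reversed(b)) for b in bodies] for h, bodies in pa.items()}


# Hypothetical grammars: A -> 0A1 | 01 and B -> 2B | 2
p_a = {"A": [("0", "A", "1"), ("0", "1")]}
p_b = {"B": [("2", "B"), ("2",)]}

g2, s2 = concatenation(p_a, "A", p_b, "B")
print(s2, g2[s2])     # S2 [('A', 'B')]
print(reversal(p_a))  # {'A': [('1', 'A', '0'), ('1', '0')]}
```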

Context-free languages are not closed under intersection.
Example 7.26 We know from Ex. 7.19 that L = {0^n 1^n 2^n | n ≥ 1} is not a CFL. Consider
L_1 = {0^n 1^n 2^i | n ≥ 1, i ≥ 1}, G_1: S → AB, A → 0A1 | 01, B → 2B | 2.
L_2 = {0^i 1^n 2^n | n ≥ 1, i ≥ 1}, G_2: S → AB, A → 0A | 0, B → 1B2 | 12.
L_1 and L_2 are context-free, but L = L_1 ∩ L_2 = {0^n 1^n 2^n | n ≥ 1} is not context-free: a counterexample.

Theorem 7.27 If L is a CFL and R is a regular language, then L ∩ R is context-free.
Proof: let P = (Q_P, T, Γ, δ_P, q_P, Z_P, F_P) be a PDA with L(P) = L, and A = (Q_A, T, δ_A, q_A, F_A) be a FA with L(A) = R.
Then P' = (Q_P × Q_A, T, Γ, δ, (q_P, q_A), Z_P, F_P × F_A), where for a ∈ T ∪ {ε},
δ((q, p), a, X) = {((r, s), γ) | s = δ̂_A(p, a), (r, γ) ∈ δ_P(q, a, X)}.
Induction: (q_P, w, Z_P) ⊢* (q, ε, γ) in P if and only if ((q_P, q_A), w, Z_P) ⊢* ((q, p), ε, γ) in P' and p = δ̂_A(q_A, w).
Theorem 7.29 Let L, L_1, and L_2 be CFL's and R a regular language.
1. L - R is context-free.
2. The complement of L is not necessarily context-free.
3. L_1 - L_2 is not necessarily context-free.

7.4 Decision Properties of CFL's
Complexity of conversions:
PDA by empty stack ↔ PDA by final state (Thm 6.9, 6.11): O(n)
CFG → PDA (Thm 6.13): O(n)
PDA → CFG (Thm 6.14): O(n^3)
CFG → CNF: O(n^2)
1. Detecting reachable and generating symbols: O(n); eliminating useless symbols and productions: O(n)
2. Eliminating ε-productions: O(2^k), where k is the maximum length of a RHS; O(2^n) in the worst case
3. Eliminating unit productions: O(n^2)
4. Replacing terminal symbols by nonterminal symbols: O(n)
5. Breaking the length of each RHS: O(n)
If step 5 is done before step 2, every RHS has length at most 2, so eliminating ε-productions costs O(2^2 · n) = O(n).

Membership problem: CYK algorithm (Cocke, Younger, Kasami)
Given w = a_1 … a_n ∈ T* and a cfg G in CNF, test whether w ∈ L(G) or not.
We can compute X_ij = {A ∈ N | A ⇒* a_i … a_j}, 1 ≤ i ≤ j ≤ n.
If S ∈ X_1n, then w ∈ L(G); otherwise w ∉ L(G).
How to compute X_ij (w.l.o.g. assume CNF):
Basis: X_ii = {A | A → a_i ∈ P}.
Induction: assume A ⇒* a_i … a_j. Since i < j and G is in CNF (ε-free), A → BC ∈ P where B ⇒* a_i … a_k and C ⇒* a_{k+1} … a_j, i ≤ k < j.
If B ∈ X_ik, C ∈ X_{k+1,j}, and A → BC ∈ P, then A ∈ X_ij.
Test the j - i pairs (X_ii, X_{i+1,j}), (X_{i,i+1}, X_{i+2,j}), …, (X_{i,j-1}, X_jj).
There are O(n^2) sets X_ij and at most n pairs to test for each, so the total is O(n^3).

CYK algorithm in PASCAL style
for i:=1 to n do for j:=i to n do X_ij := ∅;   (* initialize O(n^2), i ≤ j, see Fig. 7.12 *)
for i:=1 to n do                               (* basis O(n) *)
  for A → a_i ∈ P do X_ii := X_ii ∪ {A};
for k:=1 to n-1 do
  for i:=1 to n-k do                           (* consider X_{i,i+k} *)
    for j:=i to i+k-1 do                       (* recursion O(n^3) *)
      for A → BC ∈ P do
        if (B ∈ X_{i,j}) and (C ∈ X_{j+1,i+k})
          then X_{i,i+k} := X_{i,i+k} ∪ {A};   (* See Fig. 7.13 *)
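A runnable counterpart of the table-filling procedure above, as a minimal Python sketch. The grammar encoding (dict from variable to a list of RHS tuples), the function name, the 0-based indexing, and the sample CNF grammar are assumptions chosen for illustration; the empty word is simply rejected, which assumes S → ε is not in the grammar.

```python
def cyk(word, productions, start):
    """Return True iff `word` is derivable from `start` in a CNF grammar."""
    n = len(word)
    if n == 0:
        return False  # assumption: the grammar has no S -> ε
    # table[i][j] = variables deriving word[i..j] (0-based, inclusive)
    table = [[set() for _ in range(n)] for _ in range(n)]
    # Basis: X_ii = {A | A -> a_i}
    for i, symbol in enumerate(word):
        for head, bodies in productions.items():
            if (symbol,) in bodies:
                table[i][i].add(head)
    # Induction: substrings of increasing length, every split point k
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                for head, bodies in productions.items():
                    for body in bodies:
                        if (
                            len(body) == 2
                            and body[0] in table[i][k]
                            and body[1] in table[k + 1][j]
                        ):
                            table[i][j].add(head)
    return start in table[0][n - 1]


# Hypothetical CNF grammar: S -> AB | BC, A -> BA | a, B -> CC | b, C -> AB | a
grammar = {
    "S": [("A", "B"), ("B", "C")],
    "A": [("B", "A"), ("a",)],
    "B": [("C", "C"), ("b",)],
    "C": [("A", "B"), ("a",)],
}
print(cyk("baaba", grammar, "S"))  # True
print(cyk("bb", grammar, "S"))     # False
```

The three nested loops over length, i, and k give the O(n^3) bound for a fixed grammar; the inner scan over productions contributes only a grammar-dependent constant.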

Some undecidable problems on CFL's
1. Is a given CFG G ambiguous?
2. Is a given CFL L inherently ambiguous?
3. Is the intersection of two CFL's empty?
4. Are two CFL's the same?
5. Given a CFL L, is L = Σ*, where Σ is the alphabet of L?