Context-Free Grammars. 2IT70 Finite Automata and Process Theory


Context-Free Grammars 2IT70 Finite Automata and Process Theory Technische Universiteit Eindhoven May 18, 2016

Generating strings
language L1 = {a^n b^n | n > 0}
  ab ∈ L1
  if w ∈ L1 then awb ∈ L1
  production rules S → ab and S → aSb
language L2 = (01)*
  ε ∈ L2
  if w ∈ L2 then 01w ∈ L2
  production rules S → ε and S → 01S

Variables, terminals, production rules, start symbol
palindromes over {a, b}
  S → ε, S → a, S → b, S → aSa, S → bSb
  alternative notation: S → ε | a | b | aSa | bSb
binary integer expressions
  E → I, E → N, E → E+E, E → E*E, E → (E)
  I → a, I → I0, I → I1
  N → 1, N → N0, N → N1
  alternative notation: E → I | N | E+E | E*E | (E), I → a | I0 | I1, N → 1 | N0 | N1

Clicker question L81
Consider again the grammar given by E → I | N | E+E | E*E | (E), I → a | I0 | I1, N → 1 | N0 | N1.
How many of the strings aa1, a01, 011, 11a, a01+a01, (a11*a10), +101, (110) do you expect cannot be generated by the grammar?
A. Two strings
B. Three strings
C. Four strings
D. Six strings
E. Can't tell

Language of a CFG
context-free grammar G = (V, T, R, S)
  V variables and T terminals
  R ⊆ V × (V ∪ T)* production rules, written A → α
  S ∈ V start symbol
productions ⇒_G ⊆ (V ∪ T)* × (V ∪ T)*
  γ ⇒_G γ' if γ = β1 A β2, A → α is a rule of G, and γ' = β1 α β2
production sequences γ0 ⇒_G γ1 ⇒_G ··· ⇒_G γn
language of a variable: L_G(A) = {w ∈ T* | A ⇒*_G w}
language of the grammar: L(G) = L_G(S) = {w ∈ T* | S ⇒*_G w}
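The production relation ⇒_G and the language L(G) can be explored mechanically. The following is a minimal sketch, not taken from the slides: the names `step` and `language_up_to` are hypothetical, symbols are assumed to be single characters, every variable is assumed to appear as a key of `rules`, and no rule may have an empty right-hand side (so that sentential forms never shrink and pruning by length is sound).

```python
from collections import deque

def step(rules, form):
    """Yield every sentential form obtained from `form` by one production."""
    for i, symbol in enumerate(form):
        for rhs in rules.get(symbol, []):
            # Replace the variable at position i by the right-hand side.
            yield form[:i] + rhs + form[i + 1:]

def language_up_to(rules, start, max_len):
    """Collect all terminal strings of length <= max_len derivable from start.

    Assumption: no right-hand side is empty, so forms longer than max_len
    can never lead to a word of length <= max_len and may be discarded.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if not any(symbol in rules for symbol in form):
            words.add(form)  # only terminals left: a word of the language
            continue
        for nxt in step(rules, form):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return words

# Grammar for L1 = {a^n b^n | n > 0}: S -> ab | aSb
rules_l1 = {"S": ["ab", "aSb"]}
words_l1 = language_up_to(rules_l1, "S", 6)
```

For the grammar of L1, `words_l1` contains exactly ab, aabb and aaabbb.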

More examples
expression ::= term | expression + term
term ::= factor | term * factor
factor ::= identifier | ( expression )
identifier ::= a | b | c | ...

char ::= a | ... | z | A | ... | Z | ...
text ::= ε | char text
doc ::= ε | element doc
element ::= text | <EM> doc </EM> | <P> doc | <OL> list </OL>
listitem ::= <LI> doc
list ::= ε | listitem doc

Combining and splitting productions
lemma: CFG G = (V, T, R, S)
  if X1 ⇒^n1_G γ1, ..., Xk ⇒^nk_G γk, then X1···Xk ⇒^n_G γ1···γk where n = n1 + ··· + nk
  if X1···Xk ⇒^n_G γ, then X1 ⇒^n1_G γ1, ..., Xk ⇒^nk_G γk where n = n1 + ··· + nk and γ = γ1···γk
with X1, ..., Xk ∈ V ∪ T and γ1, ..., γk ∈ (V ∪ T)*

The parentheses language L()
CFG S → ε | SS | (S)
several production sequences for the string ()(())
  S ⇒_G SS ⇒_G S(S) ⇒_G S((S)) ⇒_G S(()) ⇒_G (S)(()) ⇒_G ()(())   (rightmost)
  S ⇒_G SS ⇒_G (S)S ⇒_G ()S ⇒_G ()(S) ⇒_G ()((S)) ⇒_G ()(())   (leftmost)
  S ⇒_G SS ⇒_G (S)S ⇒_G (S)(S) ⇒_G ()(S) ⇒_G ()((S)) ⇒_G ()(())   (mixed)
leftmost, rightmost, mixed production sequence

Clicker question L82
Given the CFG S → SS | (S) | (). How many production sequences are there for the string (())((()))?
A. (())((())) has 5 possible production sequences
B. (())((())) has 6 possible production sequences
C. (())((())) has 10 possible production sequences
D. (())((())) has 12 possible production sequences
E. Can't tell

Proving a grammar correct
CFG G with production rules S → ab and S → aSb
for L = {a^n b^n | n ≥ 1} it holds that L(G) = L
proof
  induction on n: if S ⇒^n_G w with w ∈ T*, then w ∈ L; thus L(G) ⊆ L
  induction on n: if w = a^n b^n, then w ∈ L(G); thus L ⊆ L(G)

Avoiding the inductive proofs
lemma: CFGs G1 = (V1, T1, R1, S1) and G2 = (V2, T2, R2, S2), moreover V1 and V2 disjoint
define CFG G = ({S} ∪ V1 ∪ V2, T1 ∪ T2, R, S)
  if R = {S → S1 | S2} ∪ R1 ∪ R2, then L(G) = L(G1) ∪ L(G2)
  if R = {S → S1 S2} ∪ R1 ∪ R2, then L(G) = L(G1) · L(G2)
  if R = {S → ε | S1 S} ∪ R1, then L(G) = L(G1)*
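The three constructions of the lemma translate directly into code. This is a sketch under assumptions that are not part of the slides: a grammar is represented as a tuple (V, T, R, S) with R a dictionary from variables to lists of right-hand sides (tuples of symbols, the empty tuple standing for ε), and the function names `union`, `concat` and `star` are hypothetical.

```python
def _check_fresh(start, *variable_sets):
    """Raise if the new start symbol or the variable sets overlap."""
    seen = {start}
    for variables in variable_sets:
        if seen & variables:
            raise ValueError("variable sets must be disjoint and start fresh")
        seen = seen | variables

def union(g1, g2, start="S"):
    """R = {S -> S1 | S2} + R1 + R2, so L(G) = L(G1) union L(G2)."""
    (v1, t1, r1, s1), (v2, t2, r2, s2) = g1, g2
    _check_fresh(start, v1, v2)
    rules = {start: [(s1,), (s2,)], **r1, **r2}
    return (v1 | v2 | {start}, t1 | t2, rules, start)

def concat(g1, g2, start="S"):
    """R = {S -> S1 S2} + R1 + R2, so L(G) = L(G1) . L(G2)."""
    (v1, t1, r1, s1), (v2, t2, r2, s2) = g1, g2
    _check_fresh(start, v1, v2)
    rules = {start: [(s1, s2)], **r1, **r2}
    return (v1 | v2 | {start}, t1 | t2, rules, start)

def star(g1, start="S"):
    """R = {S -> epsilon | S1 S} + R1, so L(G) = L(G1)*."""
    v1, t1, r1, s1 = g1
    _check_fresh(start, v1)
    rules = {start: [(), (s1, start)], **r1}
    return (v1 | {start}, set(t1), rules, start)

# Two small grammars: S1 -> aB, B -> epsilon | bB and S2 -> bA, A -> epsilon | aA
G1 = ({"S1", "B"}, {"a", "b"}, {"S1": [("a", "B")], "B": [(), ("b", "B")]}, "S1")
G2 = ({"S2", "A"}, {"a", "b"}, {"S2": [("b", "A")], "A": [(), ("a", "A")]}, "S2")
G_union = union(G1, G2)
```

Because the lemma already guarantees the language of each combined grammar, no separate induction is needed for `G_union`; only the component languages have to be known.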

Avoiding the inductive proofs (cont.)
CFG G with production rules
  S → S1 | S2
  S1 → aB, B → ε | bB
  S2 → bA, A → ε | aA
then L(G) = {ab^n, ba^m | n, m ≥ 0}
proof: use the lemma
  L_G(A) = {a^m | m ≥ 0} and L_G(B) = {b^n | n ≥ 0}
  L_G(S1) = {a} · {b^n | n ≥ 0} and L_G(S2) = {b} · {a^m | m ≥ 0}
  L(G) = {ab^n | n ≥ 0} ∪ {ba^m | m ≥ 0}

Context-free languages
language L is context-free if L = L(G) for some CFG G
{a^n b^n | n ≥ 0} and {w w^R | w ∈ {0,1}*} are context-free
theorem: if L is regular, then L is context-free
proof: for a DFA D = (Q, Σ, δ, q0, F) with L = L(D), put G = (Q, Σ, R, q0) where
  R = {q → aq' | δ(q, a) = q'} ∪ {q → ε | q ∈ F}
then L = L(G)
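The construction in the proof is short enough to write out. The sketch below assumes a DFA given as plain Python collections with a transition dictionary keyed by (state, symbol); the function name `dfa_to_cfg` and the example DFA for (01)* (including its sink state `dead`) are illustrative assumptions, and state names must not clash with alphabet symbols.

```python
def dfa_to_cfg(states, alphabet, delta, q0, accepting):
    """Build G = (Q, Sigma, R, q0) with
    R = {q -> a q' | delta(q, a) = q'} + {q -> epsilon | q in F}.

    Right-hand sides are tuples of symbols; () stands for epsilon.
    """
    rules = {q: [] for q in states}
    for (q, a), q_next in delta.items():
        rules[q].append((a, q_next))   # q -> a q'
    for q in accepting:
        rules[q].append(())            # q -> epsilon for accepting q
    return (set(states), set(alphabet), rules, q0)

# Hypothetical DFA for (01)*; "dead" is a sink state.
states = {"q0", "q1", "dead"}
alphabet = {"0", "1"}
delta = {
    ("q0", "0"): "q1", ("q0", "1"): "dead",
    ("q1", "0"): "dead", ("q1", "1"): "q0",
    ("dead", "0"): "dead", ("dead", "1"): "dead",
}
V, T, R, S = dfa_to_cfg(states, alphabet, delta, "q0", {"q0"})
```

Every production sequence of the resulting grammar mirrors a run of the DFA: the single variable in a sentential form is the current state, and it can only be erased in an accepting state.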

Chomsky Normal Form

Useless symbols
CFG G = (V, T, R, S)
symbol X ∈ V ∪ T is generating if X ⇒*_G w for some w ∈ T*
symbol X ∈ V ∪ T is reachable if ∃α, β: S ⇒*_G αXβ
symbol X ∈ V ∪ T is useful if it is both generating and reachable

Clicker question L83
Consider the grammar S → AB | c, A → a, C → c.
Which of the following statements about variables holds true?
A. 2 variables generating, 2 reachable, 1 useful
B. 2 variables generating, 2 reachable, 2 useful
C. 3 variables generating, 3 reachable, 2 useful
D. 3 variables generating, 3 reachable, 3 useful
E. Can't tell

Finding generating variables
CFG G = (V, T, R, S) with L(G) ≠ ∅
symbol X ∈ V ∪ T is generating if X ⇒*_G w for some w ∈ T*, and Gen(G) = {X | X generating}
lemma: put
  Gen_0 = T
  Gen_{i+1} = Gen_i ∪ {A | A → α ∈ R, α ∈ Gen_i*}
  Gen = ⋃_{i≥0} Gen_i
then Gen(G) = Gen
theorem: CFG G' = (V', T, R', S) with V' = V ∩ Gen(G) and R' = {A → α ∈ R | A ∈ Gen, α ∈ Gen*}
then L(G) = L(G') and all symbols of G' are generating
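The sets Gen_i form an increasing chain, so Gen can be computed by iterating until nothing changes. A minimal sketch, assuming single-character symbols and rules stored as a dictionary from variables to lists of right-hand-side strings (the empty string standing for ε); the function name `generating` is hypothetical.

```python
def generating(variables, terminals, rules):
    """Compute Gen as the limit of Gen_0 = T, Gen_{i+1} = Gen_i + {A | A -> alpha, alpha in Gen_i*}."""
    gen = set(terminals)               # Gen_0 = T
    changed = True
    while changed:                     # stop when Gen_{i+1} == Gen_i
        changed = False
        for a in variables:
            if a in gen:
                continue
            # A is generating if some right-hand side uses only generating symbols.
            if any(all(x in gen for x in rhs) for rhs in rules.get(a, [])):
                gen.add(a)
                changed = True
    return gen

# Example grammar: S -> AB | c, A -> a, C -> c (B has no rules)
example_rules = {"S": ["AB", "c"], "A": ["a"], "C": ["c"]}
gen = generating({"S", "A", "B", "C"}, {"a", "c"}, example_rules)
```

In this example B is the only variable that is not generating, because it has no production rules at all.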

Finding reachable variables
CFG G = (V, T, R, S) with L(G) ≠ ∅
symbol X ∈ V ∪ T is reachable if S ⇒*_G αXβ, and Reach(G) = {X | X reachable}
lemma: put
  Reach_0 = {S}
  Reach_{i+1} = Reach_i ∪ {X | A → γ ∈ R, A ∈ Reach_i, γ = αXβ}
  Reach = ⋃_{i≥0} Reach_i
then Reach(G) = Reach
theorem: CFG G' = (V', T, R', S) with V' = V ∩ Reach(G) and R' = {A → α ∈ R | A ∈ Reach}
then L(G) = L(G') and all variables of G' are reachable
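Reach can be computed as a graph search from the start symbol. The sketch below uses a worklist instead of the level-by-level sets Reach_i, which yields the same limit; it assumes single-character symbols and rules stored as a dictionary from variables to lists of right-hand-side strings, and the name `reachable` is hypothetical.

```python
def reachable(rules, start):
    """Compute Reach: all symbols occurring in some form derivable from start."""
    reach = {start}                    # Reach_0 = {S}
    worklist = [start]
    while worklist:
        a = worklist.pop()
        for rhs in rules.get(a, []):   # terminals have no rules
            for x in rhs:
                if x not in reach:     # every symbol of gamma is reachable
                    reach.add(x)
                    worklist.append(x)
    return reach

# Example grammar: S -> AB | c, A -> a, C -> c
example_rules = {"S": ["AB", "c"], "A": ["a"], "C": ["c"]}
reach = reachable(example_rules, "S")
```

Here C is not reachable, since no right-hand side that can be produced from S mentions it.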

Parse Trees

Identifying production sequences
parentheses grammar S → ε | SS | (S)
several production sequences for the string ()(())
  S ⇒_G SS ⇒_G (S)S ⇒_G ()S ⇒_G ()(S) ⇒_G ()((S)) ⇒_G ()(())
  S ⇒_G SS ⇒_G (S)S ⇒_G (S)(S) ⇒_G ()(S) ⇒_G ()((S)) ⇒_G ()(())
the sequences differ only by swapping independent productions

Identifying production sequences (cont.)
[Figure: the parse tree for ()(()) built up one production at a time.]

Identifying production sequences (once more)
[Figure: the same parse tree for ()(()) built up again, one production at a time.]

Yield of a parse tree
CFG G = (V, T, R, S)
set PT_G of all parse trees of G
  [X]: single-node tree, X ∈ V ∪ T
  [A → ε]: two-node tree with root A and leaf ε, for a rule A → ε ∈ R
  [A → PT1, PT2, ..., PTk]: for a rule A → X1···Xk ∈ R and parse trees PTi with root Xi
yield function yield : PT_G → (V ∪ T)*
  yield([X]) = X
  yield([A → ε]) = ε
  yield([A → PT1, ..., PTk]) = yield(PT1) ··· yield(PTk)
parse tree PT is complete if yield(PT) ∈ T*
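The three kinds of parse tree and the yield function map onto a small recursive data type. This is an illustrative sketch only: the class `Tree`, its `epsilon` flag and the functions `tree_yield` and `is_complete` are assumed names, and symbols are taken to be single characters.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Tree:
    label: str                                   # root symbol X or A
    children: List["Tree"] = field(default_factory=list)
    epsilon: bool = False                        # True for the tree [A -> epsilon]

def tree_yield(tree):
    """yield([X]) = X, yield([A -> eps]) = eps, otherwise concatenate child yields."""
    if tree.epsilon:
        return ""
    if not tree.children:
        return tree.label
    return "".join(tree_yield(child) for child in tree.children)

def is_complete(tree, terminals):
    """A parse tree is complete if its yield consists of terminals only."""
    return all(symbol in terminals for symbol in tree_yield(tree))

# Parse tree for ()(()) in the grammar S -> epsilon | SS | (S)
def parens(inner):
    return Tree("S", [Tree("("), inner, Tree(")")])

empty = Tree("S", epsilon=True)
root = Tree("S", [parens(empty), parens(parens(Tree("S", epsilon=True)))])
partial = Tree("S", [Tree("S"), parens(empty)])  # left S not expanded yet
```

The tree `root` has yield ()(()) and is complete, while `partial` has yield S() and is not, because its left subtree is still the single node [S].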

A parse tree with yield ()(())
[Figure: parse tree with root S and rule S → SS at the root; the left subtree uses S → (S) and S → ε and has yield (), the right subtree uses S → (S) twice and S → ε and has yield (()).]

Another parse tree
CFG S → AB, A → ε | aaA, B → ε | Bb
[Figure: parse tree with yield aab; the root S has children A and B, A is expanded by A → aaA and then A → ε, B is expanded by B → Bb and then B → ε.]

Parsing
CFG G with rules S → ε | aSb | bSa | SS, w ∈ {a,b}*
is aabb ∈ L(G)?
candidate parse trees after one round, with the shape of their yields
  1: S → ε, yield ε
  2: S → aSb, yield awb
  3: S → bSa, yield bwa
  4: S → SS, yield w1w2

Parsing
CFG G with rules S → ε | aSb | bSa | SS
is aabb ∈ L(G)?
expanding candidate 2 (S → aSb) by one more round
  2.1: inner S → ε, yield ab
  2.2: inner S → aSb, yield aawbb
  2.3: inner S → bSa, yield abwab
  2.4: inner S → SS, yield aw1w2b

Parsing
CFG G with rules S → ε | aSb | bSa | SS
is aabb ∈ L(G)?
expanding candidate 2.2 (S → aSb twice) by one more round
  2.2.1: innermost S → ε, yield aabb
  2.2.2: innermost S → aSb, yield aaawbbb
  2.2.3: innermost S → bSa, yield aabwabb
  2.2.4: innermost S → SS, yield aaw1w2bb
Thus aabb ∈ L(G)

Clicker question L91
parsing takes at most 2|w| − 1 rounds if
  no summand is ε
  summands have at least one terminal or at least two variables
With the parsing procedure and restrictions above, we have that
A. Parsing is linear in the length of the string
B. Parsing is quadratic in the length of the string
C. Parsing is exponential in the length of the string
D. Can't tell
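Under the two restrictions above, a brute-force membership test is guaranteed to terminate, because a sentential form never shrinks and can be discarded once it is longer than w. The sketch below is one possible search over leftmost production sequences, not necessarily the procedure used on the slides; the function name `derives` and the ε-free example grammar are assumptions made for illustration.

```python
from collections import deque

def derives(rules, start, word):
    """Return True if `word` can be produced from `start`.

    rules maps each variable to a list of right-hand sides (tuples of symbols).
    Assumptions: no right-hand side is empty, and each one contains at least
    one terminal or at least two variables.
    """
    target = tuple(word)
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        # Index of the leftmost variable, if any.
        idx = next((i for i, x in enumerate(form) if x in rules), None)
        if idx is None:
            continue                   # a different terminal string
        if form[:idx] != target[:idx]:
            continue                   # terminal prefix already disagrees with w
        for rhs in rules[form[idx]]:
            nxt = form[:idx] + rhs + form[idx + 1:]
            if len(nxt) <= len(target) and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical epsilon-free grammar: S -> ab | aSb | ba | bSa | SS
rules = {"S": [("a", "b"), ("a", "S", "b"), ("b", "a"), ("b", "S", "a"), ("S", "S")]}
```

With these rules, `derives(rules, "S", "aabb")` succeeds and `derives(rules, "S", "aab")` fails. The search may still visit a number of sentential forms that grows quickly with the length of w, even though each individual production sequence is short.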

Generated strings of terminals vs. yields of parse trees
theorem: CFG G = (V, T, R, S)
  A ⇒*_G w implies w = yield(PT) for some parse tree PT with root A
proof: by induction on n
  A ⇒^n_G w implies ∃PT ∈ PT_G(A): w = yield(PT), for all A ∈ V and w ∈ T*
thus L(G) = {w ∈ T* | S ⇒*_G w} ⊆ {yield(PT) | PT complete parse tree of G with root S}

Clicker question L92
Suppose X ⇒*_l β for a CFG G, where ⇒_l denotes a leftmost production of G. Then it holds that
A. αXγ ⇒*_l αβγ for all α ∈ T* and γ ∈ T*
B. αXγ ⇒*_l αβγ for all α ∈ T* and γ ∈ (V ∪ T)*
C. αXγ ⇒*_l αβγ for all α ∈ (V ∪ T)* and γ ∈ T*
D. αXγ ⇒*_l αβγ for all α ∈ (V ∪ T)* and γ ∈ (V ∪ T)*
E. Can't tell

From parse tree to leftmost production sequence
theorem: CFG G
  for a parse tree PT with root A and yield w ∈ T*: A ⇒*_l w by a leftmost production sequence
proof: induction on the height of the parse tree PT
thus {yield(PT) | PT complete parse tree of G with root S} ⊆ {w ∈ T* | S ⇒*_G w} = L(G)

Different parse trees (harmless)
[Figure: two different parse trees, both with yield aabb.]
ambiguous grammar S → ε | aSb | SS

Different parse trees (harmful)
E → I | E + E | E * E | ( E )
I → a | b | c
[Figure: two different parse trees for a*b+c, one with E → E * E at the root and one with E → E + E at the root.]
ambiguous grammar

Different parse trees (harmful, cont.)
E → I | E + E | E * E | ( E )
I → a | b | c
[Figure: the two parse tree shapes applied to 2*3+4. With E → E * E at the root the value is 2 * 7 = 14 (wrong); with E → E + E at the root the value is 6 + 4 = 10 (right).]
ambiguous grammar

Disambiguation
E → T | E + T
T → F | T * F
F → I | ( E )
I → a | b | c
[Figure: the parse tree for a*b+c; the root uses E → E + T, and a*b is derived from the left E via T → T * F.]
syntactic categories: expression, term, factor, identifier
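The layered grammar fixes both precedence and associativity, which a small evaluator makes visible. The sketch below is an illustration under assumptions: identifiers are replaced by single decimal digits so that expressions can be evaluated, the left-recursive rules E → E + T and T → T * F are implemented as loops (which produces the same left-associative grouping), and the function name `evaluate` is hypothetical.

```python
def evaluate(text):
    """Evaluate an expression over single digits, +, * and parentheses."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def expression():                  # E -> T | E + T
        nonlocal pos
        value = term()
        while peek() == "+":
            pos += 1
            value += term()
        return value

    def term():                        # T -> F | T * F
        nonlocal pos
        value = factor()
        while peek() == "*":
            pos += 1
            value *= factor()
        return value

    def factor():                      # F -> I | ( E ), with I a single digit here
        nonlocal pos
        ch = peek()
        if ch == "(":
            pos += 1
            value = expression()
            if peek() != ")":
                raise ValueError("expected ')'")
            pos += 1
            return value
        if ch is not None and ch.isdigit():
            pos += 1
            return int(ch)
        raise ValueError("unexpected input at position %d" % pos)

    result = expression()
    if pos != len(text):
        raise ValueError("trailing input at position %d" % pos)
    return result
```

With this structure `evaluate("2*3+4")` gives 10, matching the intended reading, and the value 14 is only obtained when the parentheses are written explicitly, as in `evaluate("2*(3+4)")`.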