Context Free Languages and Grammars

Similar documents
CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

SYLLABUS. Introduction to Finite Automata, Central Concepts of Automata Theory. CHAPTER - 3 : REGULAR EXPRESSIONS AND LANGUAGES

Context Free Languages (CFL) Language Recognizer A device that accepts valid strings. The FA are formalized types of language recognizer.

Automata Theory CS F-08 Context-Free Grammars

Properties of Context-Free Languages

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?

(b) If G=({S}, {a}, {S SS}, S) find the language generated by G. [8+8] 2. Convert the following grammar to Greibach Normal Form G = ({A1, A2, A3},

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL)

Notes for Comp 497 (Comp 454) Week 10 4/5/05

Computational Models - Lecture 4

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

This lecture covers Chapter 7 of HMU: Properties of CFLs

60-354, Theory of Computation Fall Asish Mukhopadhyay School of Computer Science University of Windsor

Foundations of Informatics: a Bridging Course

COMP-330 Theory of Computation. Fall Prof. Claude Crépeau. Lec. 10 : Context-Free Grammars

CPS 220 Theory of Computation

Context-Free Grammars (and Languages) Lecture 7

1. (a) Explain the procedure to convert Context Free Grammar to Push Down Automata.

Non-context-Free Languages. CS215, Lecture 5 c

Computational Models - Lecture 4 1

Einführung in die Computerlinguistik

St.MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad

Deterministic Finite Automata (DFAs)

MA/CSSE 474 Theory of Computation

Definition: A grammar G = (V, T, P,S) is a context free grammar (cfg) if all productions in P have the form A x where

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Properties of Context-Free Languages. Closure Properties Decision Properties

NPDA, CFG equivalence

Properties of Context-free Languages. Reading: Chapter 7

CS500 Homework #2 Solutions

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Introduction to Theory of Computing

Computational Models - Lecture 4 1

Ogden s Lemma for CFLs

Einführung in die Computerlinguistik

2.1 Solution. E T F a. E E + T T + T F + T a + T a + F a + a

Computational Models: Class 5

Closure Properties of Context-Free Languages. Foundations of Computer Science Theory

Computational Models - Lecture 5 1

Properties of Context Free Languages

AC68 FINITE AUTOMATA & FORMULA LANGUAGES JUNE 2014

FLAC Context-Free Grammars

Functions on languages:

CSE 355 Test 2, Fall 2016

Final exam study sheet for CS3719 Turing machines and decidability.

Non-deterministic Finite Automata (NFAs)

Notes for Comp 497 (454) Week 10

Section 1 (closed-book) Total points 30

CS20a: summary (Oct 24, 2002)

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

INSTITUTE OF AERONAUTICAL ENGINEERING

DD2371 Automata Theory

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL, DPDA PDA)

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen

Part 4 out of 5 DFA NFA REX. Automata & languages. A primer on the Theory of Computation. Last week, we showed the equivalence of DFA, NFA and REX

Theory of Computation - Module 3

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

Context Free Grammars

Lecture 11 Context-Free Languages

CS375: Logic and Theory of Computing

Even More on Dynamic Programming

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013

Context-Free and Noncontext-Free Languages

Computational Models - Lecture 3

CSE 105 THEORY OF COMPUTATION

Deterministic Finite Automata (DFAs)

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u,

Chapter 5: Context-Free Languages

Miscellaneous. Closure Properties Decision Properties

Context Free Languages. Automata Theory and Formal Grammars: Lecture 6. Languages That Are Not Regular. Non-Regular Languages

Computational Models - Lecture 5 1

Context-Free Languages (Pre Lecture)

CSE 105 THEORY OF COMPUTATION

Deterministic Finite Automata (DFAs)

Einführung in die Computerlinguistik Kontextfreie Grammatiken - Formale Eigenschaften

The View Over The Horizon

Concordia University Department of Computer Science & Software Engineering

CISC4090: Theory of Computation

Comment: The induction is always on some parameter, and the basis case is always an integer or set of integers.

Lecture 17: Language Recognition

Theory of Computation (IV) Yijia Chen Fudan University

Solutions to Problem Set 3

Homework. Context Free Languages. Announcements. Before We Start. Languages. Plan for today. Final Exam Dates have been announced.

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) October,

Part 3 out of 5. Automata & languages. A primer on the Theory of Computation. Last week, we learned about closure and equivalence of regular languages

CPS 220 Theory of Computation Pushdown Automata (PDA)

Properties of context-free Languages

Theory of Computation (Classroom Practice Booklet Solutions)

10. The GNFA method is used to show that

Chap. 7 Properties of Context-free Languages

Context-Free Grammars and Languages. Reading: Chapter 5

Theory of Computation

The Pumping Lemma. for all n 0, u 1 v n u 2 L (i.e. u 1 u 2 L, u 1 vu 2 L [but we knew that anyway], u 1 vvu 2 L, u 1 vvvu 2 L, etc.

Automata and Computability. Solutions to Exercises

Chapter 4: Context-Free Grammars

Context-Free Grammars. 2IT70 Finite Automata and Process Theory

Transcription:

Algorithms & Models of Computation CS/ECE 374, Fall 2017 Context Free Languages and Grammars Lecture 7 Tuesday, September 19, 2017 Sariel Har-Peled (UIUC) CS374 1 Fall 2017 1 / 36

What stack got to do with it? What s a stack but a second hand memory? 1 DFA/NFA/Regular expressions. constant memory computation. 2 Turing machines DFA/NFA + unbounded memory. a standard computer/program. Sariel Har-Peled (UIUC) CS374 2 Fall 2017 2 / 36

What stack got to do with it? What s a stack but a second hand memory? 1 DFA/NFA/Regular expressions. constant memory computation. 2 NFA + stack context free grammars (CFG). 3 Turing machines DFA/NFA + unbounded memory. a standard computer/program. Sariel Har-Peled (UIUC) CS374 2 Fall 2017 2 / 36

What stack got to do with it? What s a stack but a second hand memory? 1 DFA/NFA/Regular expressions. constant memory computation. 2 NFA + stack context free grammars (CFG). 3 Turing machines DFA/NFA + unbounded memory. a standard computer/program. NFA with two stacks. Sariel Har-Peled (UIUC) CS374 2 Fall 2017 2 / 36

Context Free Languages and Grammars Programming Language Specification Parsing Natural language understanding Generative model giving structure... Sariel Har-Peled (UIUC) CS374 3 Fall 2017 3 / 36

Programming Languages Sariel Har-Peled (UIUC) CS374 4 Fall 2017 4 / 36

Natural Language Processing Sariel Har-Peled (UIUC) CS374 5 Fall 2017 5 / 36

Models of Growth L-systems http://www.kevs3d.co.uk/dev/lsystems/ Sariel Har-Peled (UIUC) CS374 6 Fall 2017 6 / 36

Kolam drawing generated by grammar Sariel Har-Peled (UIUC) CS374 7 Fall 2017 7 / 36

Context Free Grammar (CFG) Definition Definition A CFG is a quadruple G = (V, T, P, S) V is a finite set of non-terminal symbols T is a finite set of terminal symbols (alphabet) P is a finite set of productions, each of the form A α where A V and α is a string in (V T ). Formally, P V (V T ). S V is a start symbol ( G = Variables, Terminals, Productions, Start var ) Sariel Har-Peled (UIUC) CS374 8 Fall 2017 8 / 36

Context Free Grammar (CFG) Definition Definition A CFG is a quadruple G = (V, T, P, S) V is a finite set of non-terminal symbols T is a finite set of terminal symbols (alphabet) P is a finite set of productions, each of the form A α where A V and α is a string in (V T ). Formally, P V (V T ). S V is a start symbol ( G = Variables, Terminals, Productions, Start var ) Sariel Har-Peled (UIUC) CS374 8 Fall 2017 8 / 36

Context Free Grammar (CFG) Definition Definition A CFG is a quadruple G = (V, T, P, S) V is a finite set of non-terminal symbols T is a finite set of terminal symbols (alphabet) P is a finite set of productions, each of the form A α where A V and α is a string in (V T ). Formally, P V (V T ). S V is a start symbol ( G = Variables, Terminals, Productions, Start var ) Sariel Har-Peled (UIUC) CS374 8 Fall 2017 8 / 36

Context Free Grammar (CFG) Definition Definition A CFG is a quadruple G = (V, T, P, S) V is a finite set of non-terminal symbols T is a finite set of terminal symbols (alphabet) P is a finite set of productions, each of the form A α where A V and α is a string in (V T ). Formally, P V (V T ). S V is a start symbol ( G = Variables, Terminals, Productions, Start var ) Sariel Har-Peled (UIUC) CS374 8 Fall 2017 8 / 36

Example V = {S} T = {a, b} P = {S ɛ a b asa bsb} (abbrev. for S ɛ, S a, S b, S asa, S bsb) S asa absba abbsbba abb b bba What strings can S generate like this? Sariel Har-Peled (UIUC) CS374 9 Fall 2017 9 / 36

Example V = {S} T = {a, b} P = {S ɛ a b asa bsb} (abbrev. for S ɛ, S a, S b, S asa, S bsb) S asa absba abbsbba abb b bba What strings can S generate like this? Sariel Har-Peled (UIUC) CS374 9 Fall 2017 9 / 36

Example V = {S} T = {a, b} P = {S ɛ a b asa bsb} (abbrev. for S ɛ, S a, S b, S asa, S bsb) S asa absba abbsbba abb b bba What strings can S generate like this? Sariel Har-Peled (UIUC) CS374 9 Fall 2017 9 / 36

Example formally... V = {S} T = {a, b} P = {S ɛ a b asa bsb} (abbrev. for S ɛ, S a, S b, S asa, S bsb) G = {S}, {a, b}, S ɛ, S a, S b S asa S bsb S Sariel Har-Peled (UIUC) CS374 10 Fall 2017 10 / 36

Palindromes Madam in Eden I m Adam Dog doo? Good God! Dogma: I am God. A man, a plan, a canal, Panama Are we not drawn onward, we few, drawn onward to new era? Doc, note: I dissent. A fast never prevents a fatness. I diet on cod. http://www.palindromelist.net Sariel Har-Peled (UIUC) CS374 11 Fall 2017 11 / 36

Examples L = {0 n 1 n n 0} S ɛ 0S1 Sariel Har-Peled (UIUC) CS374 12 Fall 2017 12 / 36

Examples L = {0 n 1 n n 0} S ɛ 0S1 Sariel Har-Peled (UIUC) CS374 12 Fall 2017 12 / 36

Notation and Convention Let G = (V, T, P, S) then a, b, c, d,..., in T (terminals) A, B, C, D,..., in V (non-terminals) u, v, w, x, y,... in T for strings of terminals α, β, γ,... in (V T ) X, Y, X in V T Sariel Har-Peled (UIUC) CS374 13 Fall 2017 13 / 36

Derives relation Formalism for how strings are derived/generated Definition Let G = (V, T, P, S) be a CFG. For strings α 1, α 2 (V T ) we say α 1 derives α 2 denoted by α 1 G α 2 if there exist strings β, γ, δ in (V T ) such that α 1 = βaδ α 2 = βγδ A γ is in P. Examples: S ɛ, S 0S1, 0S1 00S11, 0S1 01. Sariel Har-Peled (UIUC) CS374 14 Fall 2017 14 / 36

Derives relation continued Definition For integer k 0, α 1 k α 2 inductive defined: α 1 0 α 2 if α 1 = α 2 α 1 k α 2 if α 1 β 1 and β 1 k 1 α 2. Alternative definition: α 1 k α 2 if α 1 k 1 β 1 and β 1 α 2 is the reflexive and transitive closure of. α 1 α 2 if α 1 k α 2 for some k. Examples: S ɛ, 0S1 0000011111. Sariel Har-Peled (UIUC) CS374 15 Fall 2017 15 / 36

Derives relation continued Definition For integer k 0, α 1 k α 2 inductive defined: α 1 0 α 2 if α 1 = α 2 α 1 k α 2 if α 1 β 1 and β 1 k 1 α 2. Alternative definition: α 1 k α 2 if α 1 k 1 β 1 and β 1 α 2 is the reflexive and transitive closure of. α 1 α 2 if α 1 k α 2 for some k. Examples: S ɛ, 0S1 0000011111. Sariel Har-Peled (UIUC) CS374 15 Fall 2017 15 / 36

Derives relation continued Definition For integer k 0, α 1 k α 2 inductive defined: α 1 0 α 2 if α 1 = α 2 α 1 k α 2 if α 1 β 1 and β 1 k 1 α 2. Alternative definition: α 1 k α 2 if α 1 k 1 β 1 and β 1 α 2 is the reflexive and transitive closure of. α 1 α 2 if α 1 k α 2 for some k. Examples: S ɛ, 0S1 0000011111. Sariel Har-Peled (UIUC) CS374 15 Fall 2017 15 / 36

Context Free Languages Definition The language generated by CFG G = (V, T, P, S) is denoted by L(G) where L(G) = {w T S w}. Definition A language L is context free (CFL) if it is generated by a context free grammar. That is, there is a CFG G such that L = L(G). Sariel Har-Peled (UIUC) CS374 16 Fall 2017 16 / 36

Context Free Languages Definition The language generated by CFG G = (V, T, P, S) is denoted by L(G) where L(G) = {w T S w}. Definition A language L is context free (CFL) if it is generated by a context free grammar. That is, there is a CFG G such that L = L(G). Sariel Har-Peled (UIUC) CS374 16 Fall 2017 16 / 36

Example L = {0 n 1 n n 0} S ɛ 0S1 L = {0 n 1 m m > n} L = {w { (, ) } } w is properly nested string of parenthesis Sariel Har-Peled (UIUC) CS374 17 Fall 2017 17 / 36

Closure Properties of CFLs G 1 = (V 1, T, P 1, S 1 ) and G 2 = (V 2, T, P 2, S 2 ) Assumption: V 1 V 2 =, that is, non-terminals are not shared Theorem CFLs are closed under union. L 1, L 2 CFLs implies L 1 L 2 is a CFL. Theorem CFLs are closed under concatenation. L 1, L 2 CFLs implies L 1 L 2 is a CFL. Theorem CFLs are closed under Kleene star. If L is a CFL = L is a CFL. Sariel Har-Peled (UIUC) CS374 18 Fall 2017 18 / 36

Closure Properties of CFLs G 1 = (V 1, T, P 1, S 1 ) and G 2 = (V 2, T, P 2, S 2 ) Assumption: V 1 V 2 =, that is, non-terminals are not shared Theorem CFLs are closed under union. L 1, L 2 CFLs implies L 1 L 2 is a CFL. Theorem CFLs are closed under concatenation. L 1, L 2 CFLs implies L 1 L 2 is a CFL. Theorem CFLs are closed under Kleene star. If L is a CFL = L is a CFL. Sariel Har-Peled (UIUC) CS374 18 Fall 2017 18 / 36

Closure Properties of CFLs Union G 1 = (V 1, T, P 1, S 1 ) and G 2 = (V 2, T, P 2, S 2 ) Assumption: V 1 V 2 =, that is, non-terminals are not shared. Theorem CFLs are closed under union. L 1, L 2 CFLs implies L 1 L 2 is a CFL. Sariel Har-Peled (UIUC) CS374 19 Fall 2017 19 / 36

Closure Properties of CFLs Concatenation Theorem CFLs are closed under concatenation. L 1, L 2 CFLs implies L 1 L 2 is a CFL. Sariel Har-Peled (UIUC) CS374 20 Fall 2017 20 / 36

Closure Properties of CFLs Stardom (i.e, Kleene star) Theorem CFLs are closed under Kleene star. If L is a CFL = L is a CFL. Sariel Har-Peled (UIUC) CS374 21 Fall 2017 21 / 36

Exercise Prove that every regular language is context-free using previous closure properties. Prove the set of regular expressions over an alphabet Σ forms a non-regular language which is context-free. Sariel Har-Peled (UIUC) CS374 22 Fall 2017 22 / 36

Closure Properties of CFLs continued Theorem CFLs are not closed under complement or intersection. Theorem If L 1 is a CFL and L 2 is regular then L 1 L 2 is a CFL. Sariel Har-Peled (UIUC) CS374 23 Fall 2017 23 / 36

Canonical non-cfl Theorem L = {a n b n c n n 0} is not context-free. Proof based on pumping lemma for CFLs. Technical and outside the scope of this class. Sariel Har-Peled (UIUC) CS374 24 Fall 2017 24 / 36

Parse Trees or Derivation Trees A tree to represent the derivation S w. Rooted tree with root labeled S Non-terminals at each internal node of tree Terminals at leaves Children of internal node indicate how non-terminal was expanded using a production rule A picture is worth a thousand words Sariel Har-Peled (UIUC) CS374 25 Fall 2017 25 / 36

Parse Trees or Derivation Trees A tree to represent the derivation S w. Rooted tree with root labeled S Non-terminals at each internal node of tree Terminals at leaves Children of internal node indicate how non-terminal was expanded using a production rule A picture is worth a thousand words Sariel Har-Peled (UIUC) CS374 25 Fall 2017 25 / 36

Example S A derivation tree for abbaab (also called parse tree ) a S b b S a S à asb bsa SS ab ba ε S S b a ε A corresponding derivation of abbaab S è asb è absab è abssab è abbasab è abbaab Sariel Har-Peled (UIUC) CS374 26 Fall 2017 26 / 36

Ambiguity in CFLs Definition A CFG G is ambiguous if there is a string w L(G) with two different parse trees. If there is no such string then G is unambiguous. Example: S S S 1 2 3 S S S S S S 3 S S S S 1 2 1 3 2 3 (2 1) (3 2) 1 Sariel Har-Peled (UIUC) CS374 27 Fall 2017 27 / 36

Ambiguity in CFLs Original grammar: S S S 1 2 3 Unambiguous grammar: S S C 1 2 3 C 1 2 3 S S C S C 1 3 2 (3 2) 1 The grammar forces a parse corresponding to left-to-right evaluation. Sariel Har-Peled (UIUC) CS374 28 Fall 2017 28 / 36

Inherently ambiguous languages Definition A CFL L is inherently ambiguous if there is no unambiguous CFG G such that L = L(G). There exist inherently ambiguous CFLs. Example: L = {a n b m c k n = m or m = k} Given a grammar G it is undecidable to check whether L(G) is inherently ambiguous. No algorithm! Sariel Har-Peled (UIUC) CS374 29 Fall 2017 29 / 36

Inherently ambiguous languages Definition A CFL L is inherently ambiguous if there is no unambiguous CFG G such that L = L(G). There exist inherently ambiguous CFLs. Example: L = {a n b m c k n = m or m = k} Given a grammar G it is undecidable to check whether L(G) is inherently ambiguous. No algorithm! Sariel Har-Peled (UIUC) CS374 29 Fall 2017 29 / 36

Inherently ambiguous languages Definition A CFL L is inherently ambiguous if there is no unambiguous CFG G such that L = L(G). There exist inherently ambiguous CFLs. Example: L = {a n b m c k n = m or m = k} Given a grammar G it is undecidable to check whether L(G) is inherently ambiguous. No algorithm! Sariel Har-Peled (UIUC) CS374 29 Fall 2017 29 / 36

Inductive proofs for CFGs Question: How do we formally prove that a CFG L(G) = L? Example: S ɛ a b asa bsb Theorem L(G) = {palindromes} = {w w = w R } Two directions: L(G) L, that is, S w then w = w R L L(G), that is, w = w R then S w Sariel Har-Peled (UIUC) CS374 30 Fall 2017 30 / 36

Inductive proofs for CFGs Question: How do we formally prove that a CFG L(G) = L? Example: S ɛ a b asa bsb Theorem L(G) = {palindromes} = {w w = w R } Two directions: L(G) L, that is, S w then w = w R L L(G), that is, w = w R then S w Sariel Har-Peled (UIUC) CS374 30 Fall 2017 30 / 36

L(G) L Show that if S w then w = w R By induction on length of derivation, meaning For all k 1, S k w implies w = w R. If S 1 w then w = ɛ or w = a or w = b. Each case w = w R. Assume that for all k < n, that if S k w then w = w R Let S n w (with n > 1). Wlog w begin with a. Then S asa k 1 aua where w = aua. And S n 1 u and hence IH, u = u R. Therefore w r = (aua) R = (ua) R a = au R a = aua = w. Sariel Har-Peled (UIUC) CS374 31 Fall 2017 31 / 36

L(G) L Show that if S w then w = w R By induction on length of derivation, meaning For all k 1, S k w implies w = w R. If S 1 w then w = ɛ or w = a or w = b. Each case w = w R. Assume that for all k < n, that if S k w then w = w R Let S n w (with n > 1). Wlog w begin with a. Then S asa k 1 aua where w = aua. And S n 1 u and hence IH, u = u R. Therefore w r = (aua) R = (ua) R a = au R a = aua = w. Sariel Har-Peled (UIUC) CS374 31 Fall 2017 31 / 36

L L(G) Show that if w = w R then S w. By induction on w That is, for all k 0, w = k and w = w R implies S w. Exercise: Fill in proof. Sariel Har-Peled (UIUC) CS374 32 Fall 2017 32 / 36

Mutual Induction Situation is more complicated with grammars that have multiple non-terminals. See Section 5.3.2 of the notes for an example proof. Sariel Har-Peled (UIUC) CS374 33 Fall 2017 33 / 36

Normal Forms Normal forms are a way to restrict form of production rules Advantage: Simpler/more convenient algorithms and proofs Two standard normal forms for CFGs Chomsky normal form Greibach normal form Sariel Har-Peled (UIUC) CS374 34 Fall 2017 34 / 36

Normal Forms Normal forms are a way to restrict form of production rules Advantage: Simpler/more convenient algorithms and proofs Two standard normal forms for CFGs Chomsky normal form Greibach normal form Sariel Har-Peled (UIUC) CS374 34 Fall 2017 34 / 36

Normal Forms Chomsky Normal Form: Productions are all of the form A BC or A a. If ɛ L then S ɛ is also allowed. Every CFG G can be converted into CNF form via an efficient algorithm Advantage: parse tree of constant degree. Greibach Normal Form: Only productions of the form A aβ are allowed. All CFLs without ɛ have a grammar in GNF. Efficient algorithm. Advantage: Every derivation adds exactly one terminal. Sariel Har-Peled (UIUC) CS374 35 Fall 2017 35 / 36

Normal Forms Chomsky Normal Form: Productions are all of the form A BC or A a. If ɛ L then S ɛ is also allowed. Every CFG G can be converted into CNF form via an efficient algorithm Advantage: parse tree of constant degree. Greibach Normal Form: Only productions of the form A aβ are allowed. All CFLs without ɛ have a grammar in GNF. Efficient algorithm. Advantage: Every derivation adds exactly one terminal. Sariel Har-Peled (UIUC) CS374 35 Fall 2017 35 / 36

Things to know: Pushdown Automata PDA: a NFA coupled with a stack PDAs and CFGs are equivalent: both generate exactly CFLs. PDA is a machine-centric view of CFLs. Sariel Har-Peled (UIUC) CS374 36 Fall 2017 36 / 36