Introduction to Computational Linguistics: Context-Free Grammars - Formal Properties

Laura, Heinrich-Heine-Universität Düsseldorf, summer semester 2013

Overview

1. Normal forms
2. Closure properties
3. Pumping lemma

[Hopcroft and Ullman, 1979]

Normal forms (1)

A normal form of a grammar formalism F is a further restriction on the grammars in F that does not affect the set of generated string languages.

Let $G = \langle N, T, P, S \rangle$ be a CFG. A symbol $X \in N \cup T$ is called
- useful if there is a derivation $S \overset{*}{\Rightarrow} \alpha X \beta \overset{*}{\Rightarrow} w$ with $w \in T^*$,
- useless otherwise.

For each CFG, there exists an equivalent CFG (= a CFG generating the same string language) without useless symbols.

Normal forms (2)

To eliminate all useless symbols, two things need to be done:

1. All $X \in N$ that cannot lead to a terminal sequence need to be eliminated. This can be done recursively: starting from the terminals and following the productions from right to left, the set of all symbols leading to terminals can be computed. Productions containing symbols that are not in this set are eliminated.
2. In the resulting CFG, the unreachable symbols need to be eliminated. This is done starting from $S$ and applying productions. Each time, the symbols from the right-hand sides are added. Again, productions containing non-terminals or terminals that are not in the set are eliminated.

Normal Forms (3)

A production of the form $A \rightarrow \epsilon$ is called an $\epsilon$-production.

The following holds: For each CFG $G$, there is a CFG $G'$ without $\epsilon$-productions such that $L(G') = L(G) \setminus \{\epsilon\}$.

Normal Forms (4)

In order to eliminate $\epsilon$-productions, we compute the set $N_\epsilon = \{A \mid A \overset{*}{\Rightarrow} \epsilon\}$ recursively:

1. $N_\epsilon := \{A \in N \mid A \rightarrow \epsilon \in P\}$.
2. For all $A$ with $A \rightarrow \alpha$, $\alpha \in N_\epsilon^{+}$: add $A$ to $N_\epsilon$.
3. Repeat 2. until $N_\epsilon$ does not change any more.

Then delete the $\epsilon$-productions and, for each $A \rightarrow X_1 \ldots X_n$, add all productions one can obtain by deleting some $X \in N_\epsilon$ from the right-hand side, as long as one does not delete all of $X_1, \ldots, X_n$.

Normal Forms (5)

A production of the form $A \rightarrow B$ with $A, B \in N$ is called a unary production.

For each CFL that does not contain $\epsilon$, a CFG without unary productions can be found.

Elimination of unary productions for a CFG without $\epsilon$-productions:
- For all $A \overset{*}{\Rightarrow} B$ and all $B \rightarrow \beta$, $\beta \notin N$: add $A \rightarrow \beta$.
- Delete all unary productions.

Normal Forms (6)

There are two important normal forms for CFGs. A CFG for a language without $\epsilon$ is
- in Chomsky normal form iff all productions have either the form $A \rightarrow BC$ or $A \rightarrow a$ with $A, B, C \in N$, $a \in T$,
- in Greibach normal form iff all productions have the form $A \rightarrow a\alpha$ with $a \in T$, $\alpha \in N^*$.
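The ε-elimination procedure of Normal Forms (4) can be sketched as follows. The grammar encoding (pairs `(lhs, rhs)` with `rhs` a tuple, the empty tuple standing for ε) and the function names are assumptions of this sketch, not notation from the slides.

```python
from itertools import product

def nullable_nonterminals(productions):
    """Compute N_eps, the set of non-terminals that derive the empty word."""
    # Step 1: left-hand sides of epsilon-productions.
    nullable = {lhs for lhs, rhs in productions if len(rhs) == 0}
    # Steps 2 and 3: repeat until the set does not change any more.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in nullable and rhs and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
    return nullable

def eliminate_epsilon(productions):
    """Return an epsilon-free production set generating L(G) without epsilon."""
    nullable = nullable_nonterminals(productions)
    result = set()
    for lhs, rhs in productions:
        # Each nullable symbol may be kept (True) or deleted (False);
        # every other symbol must be kept.
        options = [(True, False) if s in nullable else (True,) for s in rhs]
        for choice in product(*options):
            new_rhs = tuple(s for s, keep in zip(rhs, choice) if keep)
            if new_rhs:  # never delete all symbols of a right-hand side
                result.add((lhs, new_rhs))
    return result

# Hypothetical example: S -> A b B, A -> a | eps, B -> A A
EXAMPLE = [
    ("S", ("A", "b", "B")),
    ("A", ("a",)),
    ("A", ()),
    ("B", ("A", "A")),
]

for production in sorted(eliminate_epsilon(EXAMPLE)):
    print(production)
```

The result is stored in a set because deleting different occurrences of a nullable symbol can yield the same production, as with `B -> A` in the example.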

Normal Forms (7)

For each CFL $L$ without $\epsilon$, there is a CFG $G$ in Chomsky normal form with $L = L(G)$.

Construction of an equivalent CFG in CNF for a given CFG (after elimination of useless symbols, $\epsilon$-productions and unary productions):

1. For each terminal $a$: introduce a new non-terminal $C_a$, replace $a$ with $C_a$ in all right-hand sides of length > 1 and add the production $C_a \rightarrow a$.

Normal Forms (8)

2. For each production $A \rightarrow B_0 \ldots B_n$ with $n \geq 2$, introduce new non-terminals $D_1, \ldots, D_{n-1}$ and replace the production with the productions $A \rightarrow B_0 D_1$, $D_1 \rightarrow B_1 D_2$, $D_2 \rightarrow B_2 D_3$, ..., $D_{n-1} \rightarrow B_{n-1} B_n$.

[Figure: the flat tree with root $A$ and daughters $B_0 \ldots B_n$ is replaced by a right-branching binary tree with inner nodes $D_1, \ldots, D_{n-1}$.]

Normal Forms (9)

For each CFL $L$ without $\epsilon$, there is a CFG $G$ in Greibach normal form with $L = L(G)$.

For the construction see [Hopcroft and Ullman, 1994].

Closure Properties (1)

CFLs are closed
- under union (construction: add $S \rightarrow S_1 \mid S_2$ where $S_1, S_2$ are the start symbols of the two CFGs; condition: the non-terminals of the two CFGs are pairwise disjoint),
- under concatenation and Kleene closure (construction as with regular languages),
- under homomorphisms (construction: replace terminals in productions with their images under the homomorphism),
- under substitution (construction: replace terminals in the first CFG with the start symbols of the corresponding CFGs; make sure the non-terminals of the involved CFGs are pairwise disjoint).
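The two steps of the Chomsky normal form construction from Normal Forms (7) and (8) can be sketched as follows. The encoding of productions as `(lhs, rhs)` tuples and the naming scheme `C_a` / `D_1` for new non-terminals are assumptions of this example; the sketch assumes these names do not clash with existing non-terminals and that useless symbols, ε-productions and unary productions have already been eliminated.

```python
def to_cnf(productions, terminals):
    """Convert a preprocessed CFG into Chomsky normal form (steps 1 and 2)."""
    # Step 1: in right-hand sides of length > 1, replace each terminal a
    # with a new non-terminal C_a and add the production C_a -> a.
    step1 = []
    used_terminals = set()
    for lhs, rhs in productions:
        if len(rhs) > 1:
            new_rhs = []
            for s in rhs:
                if s in terminals:
                    used_terminals.add(s)
                    new_rhs.append("C_" + s)
                else:
                    new_rhs.append(s)
            rhs = tuple(new_rhs)
        step1.append((lhs, rhs))
    # Deviation from the slide: C_a is only introduced for terminals that
    # actually occur in a long right-hand side, to avoid useless symbols.
    for a in sorted(used_terminals):
        step1.append(("C_" + a, (a,)))

    # Step 2: binarise right-hand sides longer than 2 with new D_i.
    result = []
    counter = 0
    for lhs, rhs in step1:
        while len(rhs) > 2:
            counter += 1
            new_nt = "D_%d" % counter
            result.append((lhs, (rhs[0], new_nt)))
            lhs, rhs = new_nt, rhs[1:]
        result.append((lhs, rhs))
    return result

# Hypothetical example: S -> a S b | a b
EXAMPLE = [("S", ("a", "S", "b")), ("S", ("a", "b"))]

for production in to_cnf(EXAMPLE, {"a", "b"}):
    print(production)
```

Unlike the slide, which numbers the new non-terminals $D_1, \ldots, D_{n-1}$ per production, this sketch uses one global counter so that the new names are unique across the whole grammar.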

Closure Properties (2)

CFLs are closed under intersection with regular languages (construction: take the CFG and the DFA of the regular language; then build a new CFG by replacing non-terminals $A$ with triples $\langle q_1, A, q_2 \rangle$, where the triple stands for a derivation of the yield of $A$ while traversing a DFA path from $q_1$ to $q_2$).

Closure Properties (3)

Example for intersection with regular languages:

CFG: $S \rightarrow aSb \mid \epsilon$, regular language $a^+ b^+$.

DFA (figure): states $q_0, q_1, q_2$ with start state $q_0$ and transitions $\delta(q_0, a) = q_1$, $\delta(q_1, a) = q_1$, $\delta(q_1, b) = q_2$, $\delta(q_2, b) = q_2$.

Intersection grammar, start symbol $S'$:

$S' \rightarrow \langle q_0, S, q_2 \rangle$ ($q_2$ is the only final state)
$\langle q_0, S, q_2 \rangle \rightarrow a \langle q_1, S, q_1 \rangle b$ (with $\delta(q_0, a) = q_1$, $\delta(q_1, b) = q_2$)
$\langle q_0, S, q_2 \rangle \rightarrow a \langle q_1, S, q_2 \rangle b$ (with $\delta(q_0, a) = q_1$, $\delta(q_2, b) = q_2$)
$\langle q_1, S, q_2 \rangle \rightarrow a \langle q_1, S, q_2 \rangle b$ (with $\delta(q_1, a) = q_1$, $\delta(q_2, b) = q_2$)
$\langle q_1, S, q_2 \rangle \rightarrow a \langle q_1, S, q_1 \rangle b$ (with $\delta(q_1, a) = q_1$, $\delta(q_1, b) = q_2$)
$\langle q_1, S, q_1 \rangle \rightarrow \epsilon$ (since $q_1 = q_1$)

Pumping Lemma (1)

In a context-free derivation, the expansion of a non-terminal $A$ does not depend on the context $A$ occurs in.

Consequently, if we have a derivation

$S \overset{+}{\Rightarrow} xAz \overset{+}{\Rightarrow} x v_1 A v_2 z \overset{+}{\Rightarrow} x v_1 y v_2 z$

then the part $A \overset{+}{\Rightarrow} v_1 A v_2$ of the derivation can be iterated, i.e., we can also have

$S \overset{+}{\Rightarrow} x v_1^i y v_2^i z$

for any $i \geq 1$.

Pumping Lemma (2)

Looking at the derivation trees, this is even clearer: if a derivation tree has a certain minimal height (maximal length of paths from root to leaves), it contains a path from the root (symbol $S$) to a leaf such that
- on this path, a non-terminal $A$ occurs twice, and
- below the higher of these $A$s, there is only a single $A$ and no other non-terminal is repeated on any path.

Since the number of non-terminals is finite, from a certain string length on, every derivation tree of a word in the language is necessarily of this form.
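The idea of Pumping Lemma (1) and (2) can be made concrete on a small derivation tree. The sketch below is an illustration under stated assumptions: trees are encoded as `(label, children)` tuples with plain strings as terminal leaves, and all function names are made up for this example. For simplicity it picks the first repeated non-terminal found from the root, not the lowest one described on the slide.

```python
def tree_yield(node):
    """Concatenate the terminal leaves of a tree from left to right."""
    if isinstance(node, str):
        return node
    _, children = node
    return "".join(tree_yield(c) for c in children)

def find_repeat(node, path=(), seen=None):
    """Return (outer_path, inner_path) of two equally labelled nodes on one
    root-to-leaf path, or None if no non-terminal is repeated."""
    if isinstance(node, str):
        return None
    seen = dict(seen or {})
    label, children = node
    if label in seen:
        return seen[label], path
    seen[label] = path
    for i, child in enumerate(children):
        found = find_repeat(child, path + (i,), seen)
        if found:
            return found
    return None

def context(node, path):
    """Return (left yield, subtree at path, right yield)."""
    left, right = "", ""
    for index in path:
        _, children = node
        left = left + "".join(tree_yield(c) for c in children[:index])
        right = "".join(tree_yield(c) for c in children[index + 1:]) + right
        node = children[index]
    return left, node, right

def decompose(tree):
    """Split the yield into (x, v1, y, v2, z) around a repeated non-terminal.
    Assumes the tree contains a repetition (find_repeat is not None)."""
    outer_path, inner_path = find_repeat(tree)
    x, outer, z = context(tree, outer_path)
    v1, inner, v2 = context(outer, inner_path[len(outer_path):])
    return x, v1, tree_yield(inner), v2, z

def pump(parts, i):
    """Build x v1^i y v2^i z."""
    x, v1, y, v2, z = parts
    return x + v1 * i + y + v2 * i + z

# Derivation tree of "aabb" for the grammar S -> a S b | eps.
TREE = ("S", ["a", ("S", ["a", ("S", []), "b"]), "b"])

parts = decompose(TREE)
print(parts)
print([pump(parts, i) for i in range(4)])
```

Every pumped string is again of the form $a^n b^n$, as expected for a language generated by this grammar.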

Pumping Lemma (3)

The part of the derivation tree in between the two nodes with the same non-terminal $A$ can be iterated. This means that the strings yielded by this part are pumped.

[Figure: derivation tree with root $S$ and two nested $A$ nodes; its yield is divided into $x\ v_1\ y\ v_2\ z$.]

Pumping Lemma (4)

Pumping lemma for context-free languages:

Let $L$ be a context-free language. Then there is a constant $k$ such that for all $w \in L$ with $|w| \geq k$: $w = x v_1 y v_2 z$ with
- $|v_1 v_2| \geq 1$,
- $|v_1 y v_2| \leq k$, and
- for all $i \geq 0$: $x v_1^i y v_2^i z \in L$.

Pumping Lemma (5)

With the pumping lemma and the closure properties, we can show for a lot of languages that they are not context-free:

$L_1 = \{a^n b^n c^n \mid n \geq 1\}$ is not context-free.

Proof: Assume that $L_1$ is context-free. Then it must satisfy the pumping lemma with some constant $k$. Consequently, for every $w \in L_1$ with $|w| \geq k$, and in particular for $a^k b^k c^k$, we must find substrings $v_1, v_2$ that can be iterated.
- Either each of them contains only occurrences of a single terminal. Then the iteration yields words that no longer have the same numbers of $a$s, $b$s and $c$s.
- Or at least one of them contains at least two different terminals. Then the iteration necessarily leads to words in which the $a$s, $b$s and $c$s get mixed.

Hence $L_1$ does not satisfy the pumping lemma, contrary to the assumption, and therefore $L_1$ cannot be context-free.

Pumping Lemma (6)

$L_2 = \{w \mid w \in \{a, b, c\}^*, |w|_a = |w|_b = |w|_c\}$ is not context-free.

Proof: Assume that $L_2$ is context-free. Then its intersection with the regular language $a^+ b^+ c^+$ must also be context-free. However, this intersection is $L_1 = \{a^n b^n c^n \mid n \geq 1\}$, which we have just shown not to be context-free. Therefore our assumption is false and $L_2$ is not context-free either.
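The argument of Pumping Lemma (5) can be checked by brute force for one small, fixed constant $k$. The sketch below is only an illustration for that single value of $k$, not a proof; the function names and the bound `max_i` on the tested exponents are assumptions of this example. It enumerates every decomposition $w = x v_1 y v_2 z$ allowed by Pumping Lemma (4) and keeps those that stay in the language for all tested $i$.

```python
import re

def in_l1(w):
    """Membership test for L1 = { a^n b^n c^n | n >= 1 }."""
    m = re.fullmatch(r"(a+)(b+)(c+)", w)
    return bool(m) and len(m.group(1)) == len(m.group(2)) == len(m.group(3))

def in_anbn(w):
    """Membership test for the context-free language { a^n b^n | n >= 1 }."""
    m = re.fullmatch(r"(a+)(b+)", w)
    return bool(m) and len(m.group(1)) == len(m.group(2))

def pumpable_decompositions(w, k, in_language, max_i=3):
    """All (x, v1, y, v2, z) with |v1 v2| >= 1 and |v1 y v2| <= k such that
    x v1^i y v2^i z is in the language for every i in 0..max_i."""
    found = []
    n = len(w)
    for p in range(n + 1):                 # x  = w[:p]
        for q in range(p, n + 1):          # v1 = w[p:q]
            for r in range(q, n + 1):      # y  = w[q:r]
                for s in range(r, n + 1):  # v2 = w[r:s], z = w[s:]
                    if s - p > k:          # |v1 y v2| <= k violated
                        break
                    x, v1, y, v2, z = w[:p], w[p:q], w[q:r], w[r:s], w[s:]
                    if len(v1) + len(v2) < 1:
                        continue
                    if all(in_language(x + v1 * i + y + v2 * i + z)
                           for i in range(max_i + 1)):
                        found.append((x, v1, y, v2, z))
    return found

k = 3
print(pumpable_decompositions("a" * k + "b" * k + "c" * k, k, in_l1))
print(len(pumpable_decompositions("a" * k + "b" * k, k, in_anbn)) > 0)
```

If no decomposition survives even the finitely many tested values of $i$, then none can satisfy the lemma for all $i \geq 0$, which is the situation for $a^k b^k c^k$. For $a^k b^k$, by contrast, decompositions such as $v_1 = a$, $v_2 = b$ around the middle of the word survive.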

References

[Hopcroft and Ullman, 1979] Hopcroft, J. E. and Ullman, J. D. (1979). Introduction to Automata Theory, Languages and Computation. Addison Wesley.

[Hopcroft and Ullman, 1994] Hopcroft, J. E. and Ullman, J. D. (1994). Einführung in die Automatentheorie, Formale Sprachen und Komplexitätstheorie. Addison Wesley, 3rd edition.