Context Free Grammars: Introduction. Context Free Grammars: Simplifying CFGs

Context Free Grammars: Introduction

CFGs are more powerful than RGs because of the following 2 properties:

1. Recursion
   A rule is recursive if it is of the form $X \to w_1 Y w_2$, where $Y \Rightarrow^* w_3 X w_4$ and $w_1, w_2, w_3, w_4 \in V^*$
2. Self-embedment
   A rule is self-embedding if it is of the form $X \to w_1 Y w_2$, where $Y \Rightarrow^* w_3 X w_4$ and $w_1, w_2, w_3, w_4 \in \Sigma^+$

Context Free Grammars: Simplifying CFGs

Simplifying grammars

1. Eliminating symbols that do not terminate

    CFG removeunproductive (CFG G)
        marked = G.Sigma;                        // productive symbols
        do
            oldmarked = marked;
            for (each X -> alpha in G.R)
                productive = TRUE;
                for (each s in alpha)
                    if (s not in marked) productive = FALSE;
                if (productive) marked = marked + X;
        while (marked != oldmarked);
        G'.R = NULL;
        for (each X -> alpha in G.R)
            keep = TRUE;
            for (each s in alpha)
                if (s not in marked) keep = FALSE;
            if (keep) G'.R = G'.R + (X -> alpha);
        G'.V = marked;
        G'.Sigma = G.Sigma;
        G'.S = G.S;
        return G';

   X is marked only when every symbol of some right-hand side is already marked, and it is never unmarked afterwards, so the loop is guaranteed to reach a fixed point.
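The marking loop of removeunproductive can be sketched in Python as follows. This is a minimal sketch under assumptions that are not in the slides: rules are `(lhs, rhs)` pairs with `rhs` a tuple of symbols (the empty tuple stands for epsilon), nonterminals and terminals are passed as separate sets (the slides fold Σ into V), and the function name `remove_unproductive` is hypothetical.

```python
def remove_unproductive(nonterminals, terminals, rules, start):
    """Drop symbols that cannot derive a terminal string, and every rule using them.

    rules: list of (lhs, rhs) pairs; rhs is a tuple of symbols, () means epsilon.
    """
    productive = set(terminals)          # terminals are trivially productive
    changed = True
    while changed:                       # iterate to a fixed point
        changed = False
        for lhs, rhs in rules:
            if lhs not in productive and all(s in productive for s in rhs):
                productive.add(lhs)
                changed = True
    # keep only rules whose symbols are all productive
    kept_rules = [(lhs, rhs) for lhs, rhs in rules
                  if lhs in productive and all(s in productive for s in rhs)]
    kept_nonterminals = {n for n in nonterminals if n in productive}
    return kept_nonterminals, set(terminals), kept_rules, start


# Illustrative grammar (an assumption, not from the slides):
# S -> a S b | a b | A,   A -> a A   (A never terminates)
example_rules = [("S", ("a", "S", "b")), ("S", ("a", "b")),
                 ("S", ("A",)), ("A", ("a", "A"))]
result = remove_unproductive({"S", "A"}, {"a", "b"}, example_rules, "S")
```

In this illustrative grammar only S survives, together with its two terminating rules.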

2. Eliminating unreachable symbols

    CFG removeunreachable (CFG G)
        marked = G.S;                            // reachable symbols
        do
            oldmarked = marked;
            for (each X -> alpha in G.R)
                for (each s in alpha)
                    if ((X in marked) AND (s not in marked))
                        marked = marked + s;
        while (marked != oldmarked);
        G'.V = G.V;
        for (each X in (G.V - G.Sigma))
            if (X not in marked) G'.V = G'.V - X;
        G'.Sigma = G.Sigma;
        for (each s in G.Sigma)
            if (s not in marked) G'.Sigma = G'.Sigma - s;
        G'.R = G.R;
        for (each (X -> alpha) in G.R)
            if (X not in marked) G'.R = G'.R - (X -> alpha);
        G'.S = G.S;
        return G';

Context Free Grammars: Proving Grammar Is Correct

Proving grammar correctness

    string generate (grammar G)
        w = G.S;
        do
            apply some rule r = X -> alpha A beta from G.R;
        while (w contains X in (G.V - G.Sigma));
        return w;

Proof constructed as follows:

1. Construct loop invariant I for above algorithm
2. Show that
   (a) I is true when loop starts
   (b) I holds on each iteration of loop
   (c) At termination, $w \in L(G)$
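Returning to removeunreachable above, its marking loop can be sketched in Python. As before, this is only a sketch under assumptions: rules are `(lhs, rhs)` pairs with tuple right-hand sides, nonterminals and terminals are separate sets, and the name `remove_unreachable` is hypothetical.

```python
def remove_unreachable(nonterminals, terminals, rules, start):
    """Drop symbols that no derivation from `start` can reach, and their rules."""
    reachable = {start}
    changed = True
    while changed:                       # iterate to a fixed point
        changed = False
        for lhs, rhs in rules:
            if lhs in reachable:
                for s in rhs:
                    if s not in reachable:
                        reachable.add(s)
                        changed = True
    kept_rules = [(lhs, rhs) for lhs, rhs in rules if lhs in reachable]
    return (set(nonterminals) & reachable,
            set(terminals) & reachable,
            kept_rules,
            start)


# Illustrative grammar (an assumption, not from the slides):
# S -> a A,   A -> b,   B -> c   (B and c are unreachable from S)
example_rules = [("S", ("a", "A")), ("A", ("b",)), ("B", ("c",))]
result = remove_unreachable({"S", "A", "B"}, {"a", "b", "c"}, example_rules, "S")
```

A rule is kept exactly when its left-hand side is reachable, which matches the pseudocode: once X is reachable, every symbol on its right-hand sides has been marked too.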

Context Free Grammars: Ambiguity

Issues arise with CFGs that are not inherent in RGs

1. CFG derivation order not obvious from parse tree
   Not an issue for RGs: there can only be a single NT on RHS

   Example: Consider grammar

    <S>  -> <NP><VP>
    <NP> -> <A><N><PP> | <A><N>
    <VP> -> <V><NP><PP> | <V><NP>
    <PP> -> <P><NP>
    <N>  -> boy
    <N>  -> man
    <N>  -> telescope
    <A>  -> the
    <A>  -> a
    <V>  -> saw
    <P>  -> with

   Derivations of "the man saw a boy":

    S => <NP><VP>
      => <A><N><VP>
      => the <N><VP>
      => the <N><V><NP>
      => the <N> saw <NP>
      => the <N> saw <A><N>
      => the <N> saw <A> boy
      => the <N> saw a boy
      => the man saw a boy

   Vs

    S => <NP><VP>
      => <A><N><VP>
      => the <N><VP>
      => the man <VP>
      => the man <V><NP>
      => the man saw <NP>
      => the man saw <A><N>
      => the man saw a <N>
      => the man saw a boy

   Both derivations correspond to the same parse tree; they differ only in the order in which the NTs are expanded.

2. CFG may be ambiguous
   Example:
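The slide's own ambiguity example is not reproduced in this transcription. As an illustration under assumptions, the sketch below counts parse trees for the grammar given above; the sentence "the man saw a boy with a telescope" is chosen here for illustration and is not taken from the slides. The helper names `count_trees`, `derive` and `match` are hypothetical, and the counting relies on the fact that this particular grammar has no epsilon rules and no unit cycles.

```python
from functools import lru_cache

# The example grammar from the slide; symbols without an entry are terminals.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("A", "N", "PP"), ("A", "N")],
    "VP": [("V", "NP", "PP"), ("V", "NP")],
    "PP": [("P", "NP")],
    "N": [("boy",), ("man",), ("telescope",)],
    "A": [("the",), ("a",)],
    "V": [("saw",)],
    "P": [("with",)],
}


def count_trees(sentence, start="S"):
    """Count the distinct parse trees of `sentence` rooted at `start`."""
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # number of trees in which `symbol` yields words[i:j]
        if symbol not in RULES:                      # terminal symbol
            return 1 if j == i + 1 and words[i] == symbol else 0
        return sum(match(rhs, i, j) for rhs in RULES[symbol])

    @lru_cache(maxsize=None)
    def match(rhs, i, j):
        # number of ways the symbol sequence `rhs` yields words[i:j]
        if not rhs:
            return 1 if i == j else 0
        head, rest = rhs[0], rhs[1:]
        # every symbol of this grammar yields at least one word
        return sum(derive(head, i, k) * match(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return derive(start, 0, len(words))


print(count_trees("the man saw a boy"))                     # 1
print(count_trees("the man saw a boy with a telescope"))    # 2
```

The second sentence has two trees because "with a telescope" can attach either to the verb phrase (VP -> V NP PP) or to the noun phrase "a boy" (NP -> A N PP).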

To reduce ambiguity

1. Eliminate ɛ rules

    CFG removeeps (CFG G)
        nullable = NULL;
        for (each rule (X -> alpha) in G.R)
            if (alpha == epsilon) nullable = nullable + X;
        do
            oldnullable = nullable;
            for (each rule (X -> alpha) in G.R)
                if (alpha contains only NTs)
                    flag = TRUE;
                    for (each Y in alpha)
                        if (Y not in nullable) flag = FALSE;
                    if (flag) nullable = nullable + X;
        while (oldnullable != nullable);
        R' = G.R;
        do
            oldr = R';
            for (each (X -> alpha Q beta) in R')
                if (Q in nullable)
                    if (((X -> alpha beta) not in R') AND (alpha beta != epsilon) AND (X != alpha beta))
                        R' = R' + (X -> alpha beta);
        while (oldr != R');
        for (each X -> alpha in R')
            if (alpha == epsilon) R' = R' - (X -> alpha);
        G' = (G.V, G.Sigma, R', G.S);
        return G';

   Need to check special case in above algorithm: the language may include ɛ.
   Add special rule for this case at end of algorithm, where S' is a new symbol:

        if (G.S in nullable)
            G'.V = G'.V + S';
            G'.R = G'.R - (S -> epsilon);
            G'.R = G'.R + (S' -> epsilon);
            G'.R = G'.R + (S' -> S);
            G'.S = S';                           // S' becomes the start symbol

2. Eliminating symmetric recursive rules
3. Ambiguous attachment
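The ɛ-rule elimination in item 1 can be sketched in Python. This sketch makes a few assumptions that are not in the slides: rules are `(lhs, rhs)` pairs with tuple right-hand sides and `()` for epsilon, the name `remove_epsilon` is hypothetical, and instead of dropping one nullable symbol per pass as the pseudocode does, it enumerates all keep/drop choices for each rule directly. The special case for a nullable start symbol is left out.

```python
from itertools import product


def remove_epsilon(rules):
    """Return (new_rules, nullable) for a list of (lhs, rhs) rules."""
    # 1. nullable nonterminals, computed to a fixed point
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            # an empty rhs (epsilon) satisfies all() trivially
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True

    # 2. for every rule, each nullable symbol may be kept or dropped
    new_rules = set()
    for lhs, rhs in rules:
        options = [((s,), ()) if s in nullable else ((s,),) for s in rhs]
        for choice in product(*options):
            new_rhs = tuple(sym for part in choice for sym in part)
            # skip epsilon rules and trivial X -> X rules
            if new_rhs and new_rhs != (lhs,):
                new_rules.add((lhs, new_rhs))
    return new_rules, nullable


# Illustrative grammar (an assumption, not from the slides):
# S -> A b A,   A -> a | epsilon
example_rules = [("S", ("A", "b", "A")), ("A", ("a",)), ("A", ())]
new_rules, nullable = remove_epsilon(example_rules)
```

For the illustrative grammar, A is the only nullable symbol and S gains the variants b A, A b and b alongside the original A b A.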

Context Free Grammars: Normal Forms

1. Chomsky Normal Form
   All rules are in one of the forms
   (a) $X \to a$, where $a \in \Sigma$
   (b) $X \to BC$, where $B, C \in V - \Sigma$
2. Greibach Normal Form
   All rules are in the form
   (a) $X \to \alpha\beta$, where $\alpha \in \Sigma$, $\beta \in (V - \Sigma)^*$

Context Free Grammars: Converting to Normal Forms

Theorem 11.3: Rule Substitution

Statement:
Let $G = (V, \Sigma, R, S)$ be a CFG with a rule $r_i$ of the form $X \to \alpha Y \beta$, where $\alpha, \beta \in V^*$, $Y \in V - \Sigma$
Let $Y \to \gamma_1 \mid \gamma_2 \mid \ldots \mid \gamma_n$ be the set of rules in $R$ with $Y$ as LHS
Let $R' = (R - r_i) \cup \{X \to \alpha\gamma_1\beta,\ X \to \alpha\gamma_2\beta,\ \ldots,\ X \to \alpha\gamma_n\beta\}$
Let $G' = (V, \Sigma, R', S)$
Then, $L(G') = L(G)$

Context Free Grammars: Converting to Normal Forms - CNF

Algorithm uses 4 basic steps

    CFG converttocnf (CFG G)
        Eliminate epsilon rules;
        Eliminate unit productions;
        Eliminate rules where |RHS| > 1 and have terminal symbol on RHS;
        Eliminate rules where |RHS| > 2;
        return (G.V, G.Sigma, modified rules, G.S);

1. ɛ productions: See above
2. Unit productions

    CFG removeunits (CFG G)
        R' = G.R;
        visited = NULL;
        while (R' contains a unit production)
            r = (X -> Y) in R';
            R' = R' - r;
            visited = visited + r;
            for (each r = (Y -> beta) in R')
                if ((X -> beta) not in visited)
                    R' = R' + (X -> beta);
                    visited = visited + (X -> beta);
        return (G.V, G.Sigma, R', G.S);
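A Python sketch of unit-production removal is given below. It follows the same idea as removeunits but is not a line-by-line transcription: it records only the unit productions that have been removed, so that none of them is ever added back. The rule representation (`(lhs, rhs)` pairs with tuple right-hand sides) and the names `remove_units` and `is_unit` are assumptions made for this sketch.

```python
def remove_units(rules, nonterminals):
    """Eliminate unit productions X -> Y (Y a nonterminal) from a set of rules."""
    rules = set(rules)
    removed = set()                      # unit productions already eliminated

    def is_unit(rule):
        lhs, rhs = rule
        return len(rhs) == 1 and rhs[0] in nonterminals

    while True:
        # pick some remaining unit production (sorted only for determinism)
        unit = next((r for r in sorted(rules) if is_unit(r)), None)
        if unit is None:
            return rules
        rules.remove(unit)
        removed.add(unit)
        x, (y,) = unit
        # X inherits every remaining rule of Y, except removed unit productions
        for lhs, rhs in list(rules):
            if lhs == y and (x, rhs) not in removed:
                rules.add((x, rhs))


# Illustrative grammar with a unit cycle (an assumption, not from the slides):
# S -> A,   A -> B | a,   B -> S | b
example_rules = [("S", ("A",)), ("A", ("B",)), ("A", ("a",)),
                 ("B", ("b",)), ("B", ("S",))]
result = remove_units(example_rules, {"S", "A", "B"})
```

Because a removed unit production is never re-added, the loop terminates even when the unit productions form a cycle, as they do in the illustrative grammar.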

3. Replacing rules where |RHS| > 1 and RHS contains a terminal

    CFG removemixed (CFG G)
        R' = G.R;
        for (each a in G.Sigma)
            create nonterminal symbol T_a;
            G.V = G.V + T_a;
            R' = R' + (T_a -> a);
        for (each r in R')
            if (length(r.rhs) > 1)
                for (each s in r.rhs)
                    if (s in G.Sigma) replace s in r with T_s;
        return (G.V, G.Sigma, R', G.S);

4. Replace rules where |RHS| > 2 with rules where |RHS| = 2

    CFG removelong (CFG G)
        R' = NULL;
        for (each r in G.R)
            if (length(r.rhs) > 2)
                n = length(r.rhs);
                r = (X -> Y_1 Y_2 ... Y_n);
                // R_1, ..., R_(n-2) are new nonterminals created for this rule
                R' = R' + (X -> Y_1 R_1);
                G.V = G.V + R_1;
                for (i = 1; i < n - 2; i++)
                    G.V = G.V + R_(i+1);
                    R' = R' + (R_i -> Y_(i+1) R_(i+1));
                G.V = G.V + R_(n-2);
                R' = R' + (R_(n-2) -> Y_(n-1) Y_n);
            else
                R' = R' + r;
        return (G.V, G.Sigma, R', G.S);
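Steps 3 and 4 can be sketched together in Python. This is a minimal sketch under assumptions: rules are `(lhs, rhs)` pairs with tuple right-hand sides, the generated names `T_a` and `R_1, R_2, ...` are assumed not to clash with existing symbols, and the function names `remove_mixed` and `remove_long` are hypothetical. Unlike the pseudocode, a single counter numbers the `R_k` symbols across all rules so that each long rule gets its own fresh nonterminals.

```python
def remove_mixed(rules, terminals):
    """Replace each terminal a inside a right-hand side of length > 1 by T_a."""
    new_rules = []
    for lhs, rhs in rules:
        if len(rhs) > 1:
            rhs = tuple(f"T_{s}" if s in terminals else s for s in rhs)
        new_rules.append((lhs, rhs))
    # one rule T_a -> a per terminal, as in the pseudocode
    new_rules += [(f"T_{a}", (a,)) for a in sorted(terminals)]
    return new_rules


def remove_long(rules):
    """Split every rule X -> Y_1 ... Y_n with n > 2 into a chain of binary rules."""
    new_rules = []
    counter = 0                          # shared counter keeps R_k names unique
    for lhs, rhs in rules:
        head = lhs
        while len(rhs) > 2:
            counter += 1
            fresh = f"R_{counter}"
            new_rules.append((head, (rhs[0], fresh)))
            head, rhs = fresh, rhs[1:]
        new_rules.append((head, rhs))
    return new_rules


# Illustrative grammar (an assumption, not from the slides): S -> a S b | a b
example_rules = [("S", ("a", "S", "b")), ("S", ("a", "b"))]
mixed = remove_mixed(example_rules, {"a", "b"})
binary = remove_long(mixed)
```

After both passes every rule of the illustrative grammar has either a single terminal or exactly two nonterminals on its right-hand side.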

Context Free Grammars: Converting to Normal Form - GNF

Algorithm uses 3 basic steps

    CFG converttognf (CFG G)
        G' = converttocnf(G);
        Modify rules so that they are of the form S -> epsilon, A -> c alpha, or A -> B alpha;
        Modify rules so that they are in GNF;
        return (G.V, G.Sigma, modified rules, G.S);

1. Convert G to CNF (See CNF algorithm)
2. Modify the rules so that they are of the form
   $S \to \varepsilon$
   $A \to c\alpha$, where $c \in \Sigma$, $\alpha \in (V - \Sigma)^*$
   $A \to B\alpha$, where $B \in V - \Sigma$, $\alpha \in (V - \Sigma)^*$
   (a) Number the NTs
   (b) Left recursion is eliminated
       To eliminate direct left recursion:
       Consider rules $A \to Au_1 \mid Au_2 \mid \ldots \mid Au_j$ and $A \to v_1 \mid v_2 \mid \ldots \mid v_k$, where the first symbol of each $v_i$ is not $A$
       Replace these rules with the following:
       $Z \to u_1 \mid u_2 \mid \ldots \mid u_j \mid u_1 Z \mid u_2 Z \mid \ldots \mid u_j Z$, where $Z$ is a new symbol
       $A \to v_1 \mid v_2 \mid \ldots \mid v_k \mid v_1 Z \mid v_2 Z \mid \ldots \mid v_k Z$
3. Modify rules to GNF
   For rules of the form $X \to Y\gamma$, where $Y \in V - \Sigma$, replace $Y$ with the RHSs of the rules with $Y$ on the LHS
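The direct left recursion step in 2(b) can be sketched in Python. The sketch assumes rules are `(lhs, rhs)` pairs with tuple right-hand sides, that the grammar has no rule A -> A and no epsilon rules (as holds after conversion to CNF), and that the caller supplies a symbol `fresh` that is not already in use; the function name `eliminate_direct_left_recursion` is hypothetical.

```python
def eliminate_direct_left_recursion(rules, a, fresh):
    """Rewrite A -> A u_i | v_i as A -> v_i | v_i Z and Z -> u_i | u_i Z.

    a:     the nonterminal A whose direct left recursion is removed
    fresh: the new symbol Z (assumed unused elsewhere in the grammar)
    """
    # u_i: what follows the leading A in each left-recursive rule
    recursive = [rhs[1:] for lhs, rhs in rules if lhs == a and rhs[:1] == (a,)]
    # v_i: right-hand sides of A that do not start with A
    others = [rhs for lhs, rhs in rules if lhs == a and rhs[:1] != (a,)]
    if not recursive:
        return list(rules)               # nothing to do

    result = [rule for rule in rules if rule[0] != a]
    for v in others:
        result += [(a, v), (a, v + (fresh,))]
    for u in recursive:
        result += [(fresh, u), (fresh, u + (fresh,))]
    return result


# Illustrative grammar (an assumption, not from the slides): A -> A a | b
example_rules = [("A", ("A", "a")), ("A", ("b",))]
result = eliminate_direct_left_recursion(example_rules, "A", "Z")
```

For the illustrative grammar the result is A -> b | b Z and Z -> a | a Z, which generates the same strings (b followed by any number of a's) without a left-recursive rule.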