Parsing beyond context-free grammar: Parsing Multiple Context-Free Grammars

Laura Kallmeyer, Wolfgang Maier, University of Tübingen
ESSLLI Course 2008 (Parsing beyond CFG: MCFG Parsing)

Overview
1. Multiple Context-Free Grammars
2. CYK Parsing
   (a) The basic algorithm
   (b) The naïve algorithm
   (c) The active algorithm
   (d) The incremental algorithm
   (e) Prediction strategies
3. Conclusion

Multiple Context-Free Grammars (1)
Seki et al. (1991), Seki & Kato (2008).
Motivation: describe discontinuity.
Idea: Non-terminal symbols can span a tuple of strings that need not be adjacent in the input string.
Rewrite rules have the form $A_0 \to f[A_1, \ldots, A_q]$, where $f$ is a function describing how to compute an $A_0$-tuple from tuples satisfying $A_1, \ldots, A_q$ such that
- each component of the value of $f$ is a concatenation of some constant strings and some components of its arguments;
- each component of the rhs arguments is not allowed to appear in the value of $f$ more than once.

Multiple Context-Free Grammars (2)
An MCFG is a 5-tuple $\langle N, T, F, P, S \rangle$ where
- $N$ is a finite set of non-terminals; each $A \in N$ has a dimension $dim(A) \geq 1$, $dim(A) \in \mathbb{N}$;
- $T$ is a finite set of terminals;
- $F$ is a finite set of mcf-functions (see below);
- $P$ is a finite set of rules of the form $A_0 \to f[A_1, \ldots, A_k]$ with $k \geq 0$, $f \in F$ such that $f : (T^*)^{dim(A_1)} \times \ldots \times (T^*)^{dim(A_k)} \to (T^*)^{dim(A_0)}$;
- $S \in N$ is the start symbol, with $dim(S) = 1$.
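The rule format described above can be made concrete with a small sketch. The encoding below (a function as a list of components, each a list of terminal strings and argument references) and the three example functions for the language a^n b^n c^n d^n are illustrative assumptions, not taken from the slides.

```python
from typing import List, Sequence, Tuple, Union

# A symbol is a terminal string or a reference (i, j) to the j-th component
# of the i-th argument (0-based; this encoding is an assumption of the sketch).
Symbol = Union[str, Tuple[int, int]]
McfFunction = List[List[Symbol]]


def is_linear(f: McfFunction) -> bool:
    """True if no argument component is referenced more than once in f."""
    refs = [s for comp in f for s in comp if isinstance(s, tuple)]
    return len(refs) == len(set(refs))


def apply_mcf(f: McfFunction, args: Sequence[Sequence[str]]) -> Tuple[str, ...]:
    """Compute the value of f on string tuples by plain concatenation."""
    if not is_linear(f):
        raise ValueError("an argument component is used more than once")
    return tuple(
        "".join(s if isinstance(s, str) else args[s[0]][s[1]] for s in comp)
        for comp in f
    )


# Hypothetical functions for the language {a^n b^n c^n d^n | n >= 1}:
F_BASE = [["ab"], ["cd"]]                            # A -> f_base[]
F_WRAP = [["a", (0, 0), "b"], ["c", (0, 1), "d"]]    # A -> f_wrap[A]
F_START = [[(0, 0), (0, 1)]]                         # S -> f_start[A]

pair = apply_mcf(F_WRAP, [apply_mcf(F_BASE, [])])
word = apply_mcf(F_START, [pair])[0]
print(pair, word)  # ('aabb', 'ccdd') aabbccdd
```

The non-terminal A here has dimension 2 and S has dimension 1; the linearity check mirrors the restriction that each argument component appears at most once in the value of the function.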

Multiple Context-Free Grammars (3)
$f$ is an mcf-function if there is a $k \geq 0$ and there are $d_i > 0$ for $0 \leq i \leq k$ such that $f$ is a total function from $(T^*)^{d_1} \times \ldots \times (T^*)^{d_k}$ to $(T^*)^{d_0}$ such that
- the components of $f(\vec{x}_1, \ldots, \vec{x}_k)$ are concatenations of a limited number of terminal symbols and the components $x_{ij}$ of the $\vec{x}_i$ ($1 \leq i \leq k$, $1 \leq j \leq d_i$), and
- the components $x_{ij}$ of the $\vec{x}_i$ are used at most once in the components of $f(\vec{x}_1, \ldots, \vec{x}_k)$.

Multiple Context-Free Grammars (4)
Given an input string $w$, each $A \in N$ can be considered as a predicate that is true for certain vectors of substrings of $w$.
To distinguish between different substrings containing the same terminal symbols, we introduce ranges:
- Let $w$ be the input word, $w = w_1 \ldots w_n$. $Pos(w) := \{0, \ldots, n\}$.
- We call a pair $\langle l, r \rangle \in Pos(w) \times Pos(w)$ with $l \leq r$ a range in $w$. Its yield $\langle l, r \rangle(w)$ is the substring $w_{l+1} \ldots w_r$.
- For two ranges $\rho_1 = \langle l_1, r_1 \rangle$, $\rho_2 = \langle l_2, r_2 \rangle$: if $r_1 = l_2$, then $\rho_1 \cdot \rho_2 = \langle l_1, r_2 \rangle$; otherwise $\rho_1 \cdot \rho_2$ is undefined.

Multiple Context-Free Grammars (5)
Two ranges $\langle l_1, r_1 \rangle$, $\langle l_2, r_2 \rangle$ are overlapping if either
a) $l_1 \leq l_2 < r_1$ and $l_1 < r_2$, or
b) $l_1 < r_2 \leq r_1$ and $l_2 < r_1$.
A $\vec{\rho} \in (Pos(w) \times Pos(w))^k$ is a k-dimensional range vector for $w$ iff $\vec{\rho} = \langle \langle l_1, r_1 \rangle, \ldots, \langle l_k, r_k \rangle \rangle$ with
a) $\langle l_i, r_i \rangle$ is a range in $w$ for $1 \leq i \leq k$, and
b) the elements of $\vec{\rho}$ are pairwise non-overlapping.
We then define $\vec{\rho}(w) := \langle \langle l_1, r_1 \rangle(w), \ldots, \langle l_k, r_k \rangle(w) \rangle$.

Multiple Context-Free Grammars (6)
Now we define the range vectors in the yield of a given predicate $A$ wrt $w$:
- For every terminating rule $A \to f[\,]$ and every range $\langle l, r \rangle$: if $\langle l, r \rangle(w) = f[\,]$, then $A(\langle l, r \rangle)$.
- Let $A \to f[A_1, \ldots, A_k]$ be a production and $\vec{\rho}_i$ range vectors with $A_i(\vec{\rho}_i)$ for $1 \leq i \leq k$. We now apply $f$ directly to the range vectors while mapping the terminals in the lhs to appropriate ranges of length 1. This way, $f$ is no longer a function and it is no longer defined for all range vectors. (In some cases, it might yield undefined concatenations of ranges.) For all $\vec{\rho} \in f(\vec{\rho}_1, \ldots, \vec{\rho}_k)$: $A(\vec{\rho})$.
- For any other $\vec{\rho}$, the predicate $A$ is false.
The language of an MCFG $G$ is $\{w \in T^* \mid S(\langle 0, n \rangle) \text{ wrt. } w\}$, where $n$ is the length of $w$.
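The range operations defined above can be sketched in a few lines. The function names are hypothetical, and the overlap test is an assumption: it uses the standard intersection test for half-open intervals rather than transcribing the two-case condition from the slide.

```python
from itertools import combinations
from typing import Optional, Sequence, Tuple

Range = Tuple[int, int]  # (l, r) with 0 <= l <= r <= len(w)


def is_range(rho: Range, w: Sequence) -> bool:
    """A pair (l, r) of positions in w with l <= r."""
    l, r = rho
    return 0 <= l <= r <= len(w)


def range_yield(rho: Range, w: str) -> str:
    """The substring w_{l+1} ... w_r (1-based), i.e. w[l:r] in Python."""
    l, r = rho
    return w[l:r]


def concat(rho1: Range, rho2: Range) -> Optional[Range]:
    """Concatenation is defined only if rho1 ends where rho2 starts."""
    return (rho1[0], rho2[1]) if rho1[1] == rho2[0] else None


def overlapping(rho1: Range, rho2: Range) -> bool:
    """Assumption: half-open intervals overlap iff they share a position."""
    return max(rho1[0], rho2[0]) < min(rho1[1], rho2[1])


def is_range_vector(vec: Sequence[Range], w: Sequence) -> bool:
    """All elements are ranges in w and are pairwise non-overlapping."""
    return all(is_range(r, w) for r in vec) and not any(
        overlapping(a, b) for a, b in combinations(vec, 2)
    )
```

For example, with w = "abcd", `range_yield((1, 3), w)` selects "bc", and `concat((0, 2), (3, 4))` is undefined (returned as `None`) because the two ranges are not adjacent.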

CYK Parsing: the basic algorithm (1)
Seki et al. (1991).
Idea: process the input from left to right and calculate for each position $i$ all predicates $A$, together with their yield position vector, whose rightmost yield component ends at some position $j \leq i$, starting with the terminating rules.
$w$ is in the language iff $S$ with position vector $\langle 0, n \rangle$ is in the final set.

CYK Parsing: the basic algorithm (2)
Deduction rules:
Items $[A, \vec{\rho}]$ with $A \in N$, where $\vec{\rho}$ is a $dim(A)$-dimensional range vector in $w$.
Axioms:
$$\frac{}{[A, \vec{\rho}]} \quad A \to f[\,],\ f[\,] = \vec{\rho}(w)$$
Complete:
$$\frac{[A_1, \vec{\rho}_1], \ldots, [A_m, \vec{\rho}_m]}{[A, \vec{\rho}]} \quad A \to f[A_1, \ldots, A_m],\ \vec{\rho} \in f[\vec{\rho}_1, \ldots, \vec{\rho}_m]$$
Goal item: $[S, \langle 0, n \rangle]$

CYK Parsing: the naïve algorithm (1)
Problem with the basic CYK algorithm: in order to perform a complete step, one has to find items for all arguments of a rhs at the same time.
Burden & Ljunglöf (2005) propose to modify the basic CYK algorithm such that only one daughter needs to be found at a time.
This amounts to a binarization with dotted items for partially completed rhs (similar to Chomsky Normal Form for CFGs). Such items must contain all range vectors for the already recognized predicates of the rhs.

CYK Parsing: the naïve algorithm (2)
We know that the arguments of the rhs predicates are taken as single components of the arguments of the lhs. We refer to them as $A_i^{(k)}$. Then we can write a rewriting rule as follows:
$$A_0 \to f[A_1, \ldots, A_n] := \langle x_1, \ldots, x_k \rangle$$
where $k = dim(A_0)$ and $x_i \in (T \cup \{A^{(m)} \mid A \in \{A_1, \ldots, A_n\},\ m \in \{1, \ldots, dim(A)\}\})^*$.
The vector $\vec{x} = \langle x_1, \ldots, x_k \rangle$ is called a range constraint vector.
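The axiom and complete rules of the basic algorithm can be sketched as a small recognizer. This is a minimal sketch under stated assumptions, not the implementation from Seki et al.: it closes the chart with a plain fixpoint loop instead of a left-to-right sweep, terminals are single characters, function components are non-empty, and the rule encoding (lhs, function, rhs) with 0-based argument references is hypothetical.

```python
from itertools import combinations, product


def overlapping(a, b):
    # Assumption: standard half-open interval intersection test.
    return max(a[0], b[0]) < min(a[1], b[1])


def instantiate(component, args, w):
    """All ranges obtained by concatenating the symbols of one component.

    Terminals are mapped to ranges of length 1; (i, j) refers to the j-th
    range of the i-th argument. Undefined concatenations are dropped.
    """
    results = [None]  # None means: nothing consumed yet
    for sym in component:
        nxt = []
        for cur in results:
            if isinstance(sym, str):
                starts = [cur[1]] if cur else range(len(w))
                for l in starts:
                    if l < len(w) and w[l] == sym:
                        nxt.append((cur[0] if cur else l, l + 1))
            else:
                l, r = args[sym[0]][sym[1]]
                if cur is None:
                    nxt.append((l, r))
                elif cur[1] == l:
                    nxt.append((cur[0], r))
        results = nxt
    return [r for r in results if r is not None]


def apply_to_ranges(f, args, w):
    """Apply f to range vectors; the result is a set of range vectors."""
    per_component = [instantiate(c, args, w) for c in f]
    return {
        vec for vec in product(*per_component)
        if not any(overlapping(a, b) for a, b in combinations(vec, 2))
    }


def recognize(rules, start, w):
    """Close the chart under Axioms and Complete, then test the goal item."""
    chart = set()
    changed = True
    while changed:
        changed = False
        for lhs, f, rhs in rules:
            cands = [[rho for (a, rho) in chart if a == b] for b in rhs]
            for args in product(*cands):  # one empty tuple if rhs is empty
                for rho in apply_to_ranges(f, args, w):
                    if (lhs, rho) not in chart:
                        chart.add((lhs, rho))
                        changed = True
    return (start, ((0, len(w)),)) in chart


# Hypothetical grammar for {a^n b^n c^n d^n | n >= 1}
RULES = [
    ("S", [[(0, 0), (0, 1)]], ["A"]),
    ("A", [["a", (0, 0), "b"], ["c", (0, 1), "d"]], ["A"]),
    ("A", [["a", "b"], ["c", "d"]], []),
]
```

With this grammar, `recognize(RULES, "S", "aabbccdd")` derives A over ((1,3),(5,7)), then A over ((0,4),(4,8)), and finally the goal item. Note that every complete step needs items for all rhs arguments at once, which is exactly the cost the naïve algorithm is meant to reduce.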

CYK Parsing: the naïve algorithm (3)
Given a $w$, we can map the terminal symbols in a range constraint vector to ranges in $w$:
Let $\vec{x}$ be a range constraint vector, $x$ a component of $\vec{x}$. We define
- if $x \in T^*$, then $[\![x]\!]_w = \{\langle l, r \rangle \mid \langle l, r \rangle(w) = x\}$;
- if $x = yVz$ with $V = A^{(m)}$, then $[\![x]\!]_w = \{\alpha_1 A^{(m)} \alpha_2 \mid \alpha_1 \in [\![y]\!]_w,\ \alpha_2 \in [\![z]\!]_w\}$.
$[\![\vec{x}]\!]_w$ is then obtained by applying this to all components of $\vec{x}$ such that the ranges occurring in the result are all pairwise non-overlapping.

CYK Parsing: the naïve algorithm (4)
Naïve algorithm: passive items $[A, \vec{\rho}]$ and active items $[A_0 \to f[\vec{A} \bullet \vec{A}']; \vec{\phi}]$, where the components of $\vec{\phi}$ are concatenations of ranges and variables $A^{(i)}$.
Predict introduces new axioms:
$$\frac{}{[A \to f[\bullet \vec{B}]; \vec{\phi}]} \quad A \to f[\vec{B}] := \vec{x} \text{ and } \vec{\phi} \in [\![\vec{x}]\!]_w$$
Note that this is a completely blind prediction: any rule is predicted as being potentially used.

CYK Parsing: the naïve algorithm (5)
Convert turns a completely recognized active item into a passive item:
$$\frac{[A \to f[\vec{B} \bullet]; \vec{\phi}]}{[A; \vec{\phi}]}$$
Complete moves the dot over a non-terminal if a corresponding passive item exists:
$$\frac{[A \to f[\vec{B} \bullet B_k \vec{B}']; \vec{\phi}],\ [B_k; \vec{\psi}]}{[A \to f[\vec{B} B_k \bullet \vec{B}']; \vec{\phi}']} \quad \vec{\phi}' = \vec{\phi}[B_k / \vec{\psi}]$$
Here, $\vec{\phi}[B_k / \vec{\psi}]$ means replacing every occurrence of $B_k^{(i)}$ in $\vec{\phi}$ with $\vec{\psi}(i)$.

CYK Parsing: the active algorithm (1)
Idea: use the dot to traverse the range constraint vector $\vec{\phi}$.
Passive items $[A; \Gamma]$ as before.
Active items $[A \to f[\vec{B}]; (\vec{\phi}, \rho \bullet x, \vec{\psi}); \Gamma]$.
Such an active item indicates that the first arguments of $A$ have been recognized, yielding the ranges $\vec{\phi}$, and the next argument is recognized up to the position marked by the dot so far, yielding $\rho$. The rest of this argument (range constraints $x$) and the following arguments (range constraints $\vec{\psi}$) are still waiting for completion.
$\Gamma$ contains range vectors for the predicates in $\vec{B}$ if these are found; otherwise it contains the variables $B_k^{(i)}$ for these ranges.
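The substitution used by the Complete rule of the naïve algorithm can be sketched as follows. The data layout is an assumption made for illustration: a range constraint vector is a list of components, each a list of ranges and variables, and the `Var` class is hypothetical. Adjacent ranges are merged by concatenation, and the substitution fails (returns `None`) when a concatenation is undefined; the pairwise non-overlap check is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple, Union

Range = Tuple[int, int]


@dataclass(frozen=True)
class Var:
    """The variable B^(i): component `comp` (1-based) of rhs predicate `name`."""
    name: str
    comp: int


Elem = Union[Range, Var]


def substitute(
    phi: Sequence[Sequence[Elem]], name: str, psi: Sequence[Range]
) -> Optional[List[List[Elem]]]:
    """Replace every occurrence of name^(i) in phi with psi(i)."""
    result: List[List[Elem]] = []
    for component in phi:
        new_comp: List[Elem] = []
        for e in component:
            if isinstance(e, Var) and e.name == name:
                e = psi[e.comp - 1]
            prev_is_range = bool(new_comp) and not isinstance(new_comp[-1], Var)
            if prev_is_range and not isinstance(e, Var):
                l, r = new_comp[-1]
                if r != e[0]:
                    return None  # undefined concatenation of ranges
                new_comp[-1] = (l, e[1])
            else:
                new_comp.append(e)
        result.append(new_comp)
    return result


# Hypothetical active item content for A -> f[B, C] := <a B(1) C(1), B(2) b>
phi = [[(0, 1), Var("B", 1), Var("C", 1)], [Var("B", 2), (5, 6)]]
print(substitute(phi, "B", [(1, 2), (4, 5)]))
# [[(0, 2), Var(name='C', comp=1)], [(4, 6)]]
```

The variable for C stays in place until a passive item for C is combined in a later step, which is the point of finding one daughter at a time.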

CYK Parsing: the active algorithm (2)
Predict introduces a new rule with the dot on the left of its range constraint vector:
$$\frac{}{[A \to f[\vec{B}]; (\bullet x, \Psi); \Gamma_B]} \quad A \to f[\vec{B}] := (x, \Psi)$$
($\Gamma_B$ contains the range variables for the vector $\vec{B}$.)
Complete moves a dot that is at the end of an argument to the next argument:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet, x, \Psi); \Gamma]}{[A \to f[\vec{B}]; (\Phi, \alpha, \bullet x, \Psi); \Gamma]}$$
Scan moves the dot over a terminal:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet a x, \Psi); \Gamma]}{[A \to f[\vec{B}]; (\Phi, \alpha \langle l, r \rangle \bullet x, \Psi); \Gamma]} \quad \langle l, r \rangle(w) = a$$

CYK Parsing: the active algorithm (3)
Combine moves the dot over a non-terminal if the corresponding passive item has been found:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet B_k^{(i)} x, \Psi); \Gamma],\ [B_k; \vec{\rho}]}{[A \to f[\vec{B}]; (\Phi, \alpha \vec{\rho}(i) \bullet x, \Psi); \Gamma']} \quad \Gamma(k) \text{ compatible with } \vec{\rho},\ \Gamma' = \Gamma(k, i := \vec{\rho}(i))$$
("compatible" means: for every $1 \leq i \leq dim(B_k)$, either $\Gamma(k)(i) = \vec{\rho}(i)$ or $\Gamma(k)(i) = B_k^{(i)}$.)
Convert turns a fully recognized active item into a passive item:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet); \Gamma]}{[A; (\Phi, \alpha)]}$$

CYK Parsing: the incremental algorithm (1)
Problem of the active algorithm: only passive items are used in combine steps. That is, in a situation where the dot precedes $A^{(i)}$, in order to use the $i$th component of the predicate $A$, all the other components of $A$ must already have been recognized.
Better: process incrementally and allow active items in combine steps.
Idea: read one token at a time and calculate all possible consequences of that token before the next token is read.

CYK Parsing: the incremental algorithm (2)
We now use explicit features $r_1, \ldots, r_k$ for the range constraints of the $k$ ranges of a predicate $A$. This way, the argument index is no longer given by the position in the range constraint vector and we can process the arguments in any order.
Only active items: $[A \to f[\vec{B}]; (\vec{\phi}, r_i = \rho \bullet x, \vec{\psi}); \Gamma]$.
These are as in the active algorithm, except that the order in the range constraint vectors need not be the same as in the original rule in the grammar.
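The side condition of the Combine rule in the active algorithm (compatibility of Γ(k) with a passive item's range vector, followed by the update of one entry) can be sketched as below. The representation is an assumption: Γ is a dictionary from rhs index k to a tuple with one entry per component, where `None` stands for a component that is still the variable B_k^(i), and component indices are 0-based.

```python
from typing import Dict, Optional, Sequence, Tuple

Range = Tuple[int, int]
Entry = Optional[Range]  # None: still the variable B_k^(i)
Gamma = Dict[int, Tuple[Entry, ...]]


def compatible(gamma_k: Sequence[Entry], rho: Sequence[Range]) -> bool:
    """Every entry is either still a variable or equals the range in rho."""
    return len(gamma_k) == len(rho) and all(
        g is None or g == r for g, r in zip(gamma_k, rho)
    )


def bind(gamma: Gamma, k: int, i: int, rho: Sequence[Range]) -> Gamma:
    """Return a copy of gamma with Gamma(k)(i) set to rho(i)."""
    updated = list(gamma[k])
    updated[i] = rho[i]
    new_gamma = dict(gamma)
    new_gamma[k] = tuple(updated)
    return new_gamma


# Hypothetical state: the second component of B_0 was already used as (4, 5).
gamma = {0: (None, (4, 5))}
rho = ((1, 2), (4, 5))
if compatible(gamma[0], rho):
    gamma = bind(gamma, 0, 0, rho)
print(gamma)  # {0: ((1, 2), (4, 5))}
```

Returning a fresh dictionary rather than mutating in place fits chart parsing, where the premise item usually has to stay in the chart unchanged for other deductions.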

CYK Parsing: the incremental algorithm (3)
Predict introduces a new rule with the dot on the left of one of the arguments of the lhs:
$$\frac{}{[A \to f[\vec{B}]; (r_i = \langle k, k \rangle \bullet x, \Psi_1, \Psi_2); \Gamma_B]} \quad A \to f[\vec{B}] := (\Psi_1, x, \Psi_2)$$
with $x$ the $i$th element, $1 \leq i \leq dim(A)$, $1 \leq k \leq n$. ($\Gamma_B$ contains the range variables for the vector $\vec{B}$.)
Complete moves a dot that is at the end of an argument to another argument:
$$\frac{[A \to f[\vec{B}]; (\Phi, r_i = \langle l_i, r_i \rangle \bullet, \Psi_1, r_j = x, \Psi_2); \Gamma]}{[A \to f[\vec{B}]; (\Phi, r_i = \alpha, r_j = \langle k, k \rangle \bullet x, \Psi_1, \Psi_2); \Gamma]} \quad r_i \leq k \leq n$$

CYK Parsing: the incremental algorithm (4)
Scan moves the dot over a terminal:
$$\frac{[A \to f[\vec{B}]; (\Phi, r_i = \langle l, r \rangle \bullet a x, \Psi); \Gamma]}{[A \to f[\vec{B}]; (\Phi, r_i = \langle l, r + 1 \rangle \bullet x, \Psi); \Gamma]} \quad \langle r, r + 1 \rangle(w) = a$$
Combine moves the dot over a non-terminal if the corresponding passive item has been found:
$$\frac{[A \to f[\vec{B}]; (\Phi_1, r_j = \alpha \bullet B_k^{(i)} x, \Psi_1); \Gamma],\ [B_k; (\Phi_2, r_i = \beta \bullet, \Psi_2)]}{[A \to f[\vec{B}]; (\Phi_1, r_j = \alpha \beta \bullet x, \Psi_1); \Gamma(k, i := \beta)]} \quad \Gamma(k) \text{ compatible with } (\Phi_2)$$
("compatible" means: for every $1 \leq h \leq dim(B_k)$, if $r_h = \alpha_h \in (\Phi_2)$, then $\Gamma(k)(h) = \alpha_h$.)

CYK Parsing: Prediction strategies
Problem of the predict operations presented above: we compute partial results that are not reachable given the predicates we are looking for or the predicates we have already found.
Solution: replace the unrestricted predict rule with more intelligent predictions.
Possible strategies: $A \to f[\vec{B}]$ with the dot left of $r_i = \alpha$ is only predicted if
- there is another item looking for $A^{(i)}$ (top-down prediction), or
- there is a passive item that has found the first symbol in $\alpha$ (bottom-up prediction).

Conclusion
- Starting point: the basic algorithm (Seki et al.).
- Refinement: decompose single items and deduction steps into different items and smaller deduction steps. This gives the naïve algorithm (Burden & Ljunglöf, 2005).
- Further refinement: divide the combine rule into complete, scan and combine. This gives the active algorithm (Burden & Ljunglöf, 2005).
- Further refinement: predict and complete can select from any possible remaining range constraint, not just the following one. This gives the incremental algorithm (Burden & Ljunglöf, 2005).
The algorithms from Burden & Ljunglöf (2005) have been implemented in the Grammatical Framework System (Ranta, 2004).


More information

Foundations of Informatics: a Bridging Course

Foundations of Informatics: a Bridging Course Foundations of Informatics: a Bridging Course Week 3: Formal Languages and Semantics Thomas Noll Lehrstuhl für Informatik 2 RWTH Aachen University noll@cs.rwth-aachen.de http://www.b-it-center.de/wob/en/view/class211_id948.html

More information

Probabilistic Context Free Grammars

Probabilistic Context Free Grammars 1 Defining PCFGs A PCFG G consists of Probabilistic Context Free Grammars 1. A set of terminals: {w k }, k = 1..., V 2. A set of non terminals: { i }, i = 1..., n 3. A designated Start symbol: 1 4. A set

More information

Computational Models - Lecture 3

Computational Models - Lecture 3 Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 1 Computational Models - Lecture 3 Equivalence of regular expressions and regular languages (lukewarm leftover

More information

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1 Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS 164 -- Berkley University 1 Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient

More information

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) Parsing Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) S N VP V NP D N John hit the ball Levels of analysis Level Morphology/Lexical POS (morpho-synactic), WSD Elements

More information

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF) CS5371 Theory of Computation Lecture 7: Automata Theory V (CFG, CFL, CNF) Announcement Homework 2 will be given soon (before Tue) Due date: Oct 31 (Tue), before class Midterm: Nov 3, (Fri), first hour

More information

CS375: Logic and Theory of Computing

CS375: Logic and Theory of Computing CS375: Logic and Theory of Computing Fuhua (Frank) Cheng Department of Computer Science University of Kentucky 1 Table of Contents: Week 1: Preliminaries (set algebra, relations, functions) (read Chapters

More information

Multiple Context-free Grammars

Multiple Context-free Grammars Multiple Context-free Grammars Course 4: pumping properties Sylvain Salvati INRI Bordeaux Sud-Ouest ESSLLI 2011 The pumping Lemma for CFL Outline The pumping Lemma for CFL Weak pumping Lemma for MCFL No

More information

Introduction to Computational Linguistics

Introduction to Computational Linguistics Introduction to Computational Linguistics Olga Zamaraeva (2018) Based on Bender (prev. years) University of Washington May 3, 2018 1 / 101 Midterm Project Milestone 2: due Friday Assgnments 4& 5 due dates

More information

Context-Free Grammars and Languages. Reading: Chapter 5

Context-Free Grammars and Languages. Reading: Chapter 5 Context-Free Grammars and Languages Reading: Chapter 5 1 Context-Free Languages The class of context-free languages generalizes the class of regular languages, i.e., every regular language is a context-free

More information

LCFRS Exercises and solutions

LCFRS Exercises and solutions LCFRS Exercises and solutions Laura Kallmeyer SS 2010 Question 1 1. Give a CFG for the following language: {a n b m c m d n n > 0, m 0} 2. Show that the following language is not context-free: {a 2n n

More information

Decidability (What, stuff is unsolvable?)

Decidability (What, stuff is unsolvable?) University of Georgia Fall 2014 Outline Decidability Decidable Problems for Regular Languages Decidable Problems for Context Free Languages The Halting Problem Countable and Uncountable Sets Diagonalization

More information

Context-Free Grammars (and Languages) Lecture 7

Context-Free Grammars (and Languages) Lecture 7 Context-Free Grammars (and Languages) Lecture 7 1 Today Beyond regular expressions: Context-Free Grammars (CFGs) What is a CFG? What is the language associated with a CFG? Creating CFGs. Reasoning about

More information

Computational Models - Lecture 4

Computational Models - Lecture 4 Computational Models - Lecture 4 Regular languages: The Myhill-Nerode Theorem Context-free Grammars Chomsky Normal Form Pumping Lemma for context free languages Non context-free languages: Examples Push

More information

Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition

Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition Salil Vadhan October 11, 2012 Reading: Sipser, Section 2.3 and Section 2.1 (material on Chomsky Normal Form). Pumping Lemma for

More information

Top-Down Parsing and Intro to Bottom-Up Parsing

Top-Down Parsing and Intro to Bottom-Up Parsing Predictive Parsers op-down Parsing and Intro to Bottom-Up Parsing Lecture 7 Like recursive-descent but parser can predict which production to use By looking at the next few tokens No backtracking Predictive

More information

Context Free Grammars: Introduction. Context Free Grammars: Simplifying CFGs

Context Free Grammars: Introduction. Context Free Grammars: Simplifying CFGs Context Free Grammars: Introduction CFGs are more powerful than RGs because of the following 2 properties: 1. Recursion Rule is recursive if it is of the form X w 1 Y w 2, where Y w 3 Xw 4 and w 1, w 2,

More information

Finite Automata Theory and Formal Languages TMV026/TMV027/DIT321 Responsible: Ana Bove

Finite Automata Theory and Formal Languages TMV026/TMV027/DIT321 Responsible: Ana Bove Finite Automata Theory and Formal Languages TMV026/TMV027/DIT321 Responsible: Ana Bove Tuesday 28 of May 2013 Total: 60 points TMV027/DIT321 registration VT13 TMV026/DIT321 registration before VT13 Exam

More information

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY 15-453 FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY REVIEW for MIDTERM 1 THURSDAY Feb 6 Midterm 1 will cover everything we have seen so far The PROBLEMS will be from Sipser, Chapters 1, 2, 3 It will be

More information

Mildly Context-Sensitive Grammar Formalisms: Embedded Push-Down Automata

Mildly Context-Sensitive Grammar Formalisms: Embedded Push-Down Automata Mildly Context-Sensitive Grammar Formalisms: Embedded Push-Down Automata Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Sommersemester 2011 Intuition (1) For a language L, there is a TAG G with

More information

Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules).

Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules). Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules). 1a) G = ({R, S, T}, {0,1}, P, S) where P is: S R0R R R0R1R R1R0R T T 0T ε (S generates the first 0. R generates

More information

Pushdown Automata (Pre Lecture)

Pushdown Automata (Pre Lecture) Pushdown Automata (Pre Lecture) Dr. Neil T. Dantam CSCI-561, Colorado School of Mines Fall 2017 Dantam (Mines CSCI-561) Pushdown Automata (Pre Lecture) Fall 2017 1 / 41 Outline Pushdown Automata Pushdown

More information

Context Free Grammars

Context Free Grammars Automata and Formal Languages Context Free Grammars Sipser pages 101-111 Lecture 11 Tim Sheard 1 Formal Languages 1. Context free languages provide a convenient notation for recursive description of languages.

More information

Push-down Automata = FA + Stack

Push-down Automata = FA + Stack Push-down Automata = FA + Stack PDA Definition A push-down automaton M is a tuple M = (Q,, Γ, δ, q0, F) where Q is a finite set of states is the input alphabet (of terminal symbols, terminals) Γ is the

More information

Recursive descent for grammars with contexts

Recursive descent for grammars with contexts 39th International Conference on Current Trends in Theory and Practice of Computer Science Špindleruv Mlýn, Czech Republic Recursive descent parsing for grammars with contexts Ph.D. student, Department

More information

This lecture covers Chapter 7 of HMU: Properties of CFLs

This lecture covers Chapter 7 of HMU: Properties of CFLs This lecture covers Chapter 7 of HMU: Properties of CFLs Chomsky Normal Form Pumping Lemma for CFs Closure Properties of CFLs Decision Properties of CFLs Additional Reading: Chapter 7 of HMU. Chomsky Normal

More information

Chapter 4: Context-Free Grammars

Chapter 4: Context-Free Grammars Chapter 4: Context-Free Grammars 4.1 Basics of Context-Free Grammars Definition A context-free grammars, or CFG, G is specified by a quadruple (N, Σ, P, S), where N is the nonterminal or variable alphabet;

More information

Suppose h maps number and variables to ɛ, and opening parenthesis to 0 and closing parenthesis

Suppose h maps number and variables to ɛ, and opening parenthesis to 0 and closing parenthesis 1 Introduction Parenthesis Matching Problem Describe the set of arithmetic expressions with correctly matched parenthesis. Arithmetic expressions with correctly matched parenthesis cannot be described

More information

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions? Before We Start The Pumping Lemma Any questions? The Lemma & Decision/ Languages Future Exam Question What is a language? What is a class of languages? Context Free Languages Context Free Languages(CFL)

More information

MTH401A Theory of Computation. Lecture 17

MTH401A Theory of Computation. Lecture 17 MTH401A Theory of Computation Lecture 17 Chomsky Normal Form for CFG s Chomsky Normal Form for CFG s For every context free language, L, the language L {ε} has a grammar in which every production looks

More information

Syntactic Analysis. Top-Down Parsing

Syntactic Analysis. Top-Down Parsing Syntactic Analysis Top-Down Parsing Copyright 2015, Pedro C. Diniz, all rights reserved. Students enrolled in Compilers class at University of Southern California (USC) have explicit permission to make

More information

Knuth-Morris-Pratt Algorithm

Knuth-Morris-Pratt Algorithm Knuth-Morris-Pratt Algorithm Jayadev Misra June 5, 2017 The Knuth-Morris-Pratt string matching algorithm (KMP) locates all occurrences of a pattern string in a text string in linear time (in the combined

More information

Handout 8: Computation & Hierarchical parsing II. Compute initial state set S 0 Compute initial state set S 0

Handout 8: Computation & Hierarchical parsing II. Compute initial state set S 0 Compute initial state set S 0 Massachusetts Institute of Technology 6.863J/9.611J, Natural Language Processing, Spring, 2001 Department of Electrical Engineering and Computer Science Department of Brain and Cognitive Sciences Handout

More information

Lecture 11 Context-Free Languages

Lecture 11 Context-Free Languages Lecture 11 Context-Free Languages COT 4420 Theory of Computation Chapter 5 Context-Free Languages n { a b : n n { ww } 0} R Regular Languages a *b* ( a + b) * Example 1 G = ({S}, {a, b}, S, P) Derivations:

More information

UNIT II REGULAR LANGUAGES

UNIT II REGULAR LANGUAGES 1 UNIT II REGULAR LANGUAGES Introduction: A regular expression is a way of describing a regular language. The various operations are closure, union and concatenation. We can also find the equivalent regular

More information

Context-free Grammars and Languages

Context-free Grammars and Languages Context-free Grammars and Languages COMP 455 002, Spring 2019 Jim Anderson (modified by Nathan Otterness) 1 Context-free Grammars Context-free grammars provide another way to specify languages. Example:

More information

Computing if a token can follow

Computing if a token can follow Computing if a token can follow first(b 1... B p ) = {a B 1...B p... aw } follow(x) = {a S......Xa... } There exists a derivation from the start symbol that produces a sequence of terminals and nonterminals

More information