Parsing beyond context-free grammar: Parsing Multiple Context-Free Grammars
Laura Kallmeyer, Wolfgang Maier, University of Tübingen. ESSLLI Course 2008.

Overview

1. Multiple Context-Free Grammars
2. CYK Parsing
   (a) The basic algorithm
   (b) The naïve algorithm
   (c) The active algorithm
   (d) The incremental algorithm
   (e) Prediction strategies
3. Conclusion

Multiple Context-Free Grammars (1)

Seki et al. (1991), Seki & Kato (2008)

Motivation: describe discontinuity.

Idea: Non-terminal symbols can span a tuple of strings that need not be adjacent in the input string. Rewrite rules have the form $A_0 \to f[A_1, \ldots, A_q]$, where $f$ is a function describing how to compute an $A_0$-tuple from tuples satisfying $A_1, \ldots, A_q$ such that
- each component of the value of $f$ is a concatenation of some constant strings and some components of its arguments;
- each component of the rhs arguments is not allowed to appear in the value of $f$ more than once.

Multiple Context-Free Grammars (2)

An MCFG is a 5-tuple $\langle N, T, F, P, S \rangle$ where
- $N$ is a finite set of non-terminals; each $A \in N$ has a dimension $\dim(A) \ge 1$, $\dim(A) \in \mathbb{N}$;
- $T$ is a finite set of terminals;
- $F$ is a finite set of mcf-functions (see below);
- $P$ is a finite set of rules of the form $A_0 \to f[A_1, \ldots, A_k]$ with $k \ge 0$ and $f \in F$ such that $f : (T^*)^{\dim(A_1)} \times \cdots \times (T^*)^{\dim(A_k)} \to (T^*)^{\dim(A_0)}$;
- $S \in N$ is the start symbol, with $\dim(S) = 1$.
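To make the two conditions on $f$ concrete, here is a minimal sketch of how an mcf-function could be stored and applied to string tuples. The encoding (a list of components, each a list of terminals and `(argument, component)` index pairs) and the names `is_linear`, `apply_mcf`, `BASE`, `WRAP` and `JOIN` are assumptions made for illustration; they do not come from the slides.

```python
from typing import List, Sequence, Tuple, Union

# A symbol inside a function component is either a terminal string or a
# pair (i, j) standing for component j of argument i (both 0-based).
# This encoding is an assumption made for this sketch.
Symbol = Union[str, Tuple[int, int]]
McfFunction = List[List[Symbol]]


def is_linear(components: McfFunction) -> bool:
    """True if no argument component occurs more than once in the value."""
    seen = set()
    for comp in components:
        for sym in comp:
            if isinstance(sym, tuple):
                if sym in seen:
                    return False
                seen.add(sym)
    return True


def apply_mcf(components: McfFunction,
              args: Sequence[Tuple[str, ...]]) -> Tuple[str, ...]:
    """Compute the tuple of strings denoted by the function for given arguments."""
    if not is_linear(components):
        raise ValueError("an argument component is used more than once")
    return tuple(
        "".join(s if isinstance(s, str) else args[s[0]][s[1]] for s in comp)
        for comp in components
    )


# Hypothetical grammar for a^n b^n c^n d^n:
#   A -> BASE[]          := <eps, eps>
#   A -> WRAP[A]         := <a A(1) b, c A(2) d>
#   S -> JOIN[A]         := <A(1) A(2)>
BASE: McfFunction = [[], []]
WRAP: McfFunction = [["a", (0, 0), "b"], ["c", (0, 1), "d"]]
JOIN: McfFunction = [[(0, 0), (0, 1)]]

pair = apply_mcf(BASE, [])          # two empty strings
pair = apply_mcf(WRAP, [pair])      # one level of wrapping
pair = apply_mcf(WRAP, [pair])      # two levels of wrapping
word = apply_mcf(JOIN, [pair])      # a 1-tuple holding the derived word
```

The non-terminal `A` here has dimension 2: its two components are built in parallel and only joined by the start rule, which is the kind of discontinuity the formalism is meant to describe.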
Multiple Context-Free Grammars (3)

$f$ is an mcf-function if there is a $k \ge 0$ and there are $d_i > 0$ for $0 \le i \le k$ such that $f$ is a total function from $(T^*)^{d_1} \times \cdots \times (T^*)^{d_k}$ to $(T^*)^{d_0}$ such that
- the components of $f(\vec{x}_1, \ldots, \vec{x}_k)$ are concatenations of a finite number of terminal symbols and the components $x_{ij}$ of the $\vec{x}_i$ ($1 \le i \le k$, $1 \le j \le d_i$), and
- each component $x_{ij}$ of the $\vec{x}_i$ is used at most once in the components of $f(\vec{x}_1, \ldots, \vec{x}_k)$.

Multiple Context-Free Grammars (4)

Given an input string $w$, each $A \in N$ can be considered as a predicate that is true for certain vectors of substrings of $w$. To distinguish between different substrings containing the same terminal symbols, we introduce ranges:

Let $w$ be the input word, $w = w_1 \ldots w_n$, and let $\mathrm{Pos}(w) := \{0, \ldots, n\}$.

We call a pair $\langle l, r \rangle \in \mathrm{Pos}(w) \times \mathrm{Pos}(w)$ with $l \le r$ a range in $w$. Its yield $\langle l, r \rangle(w)$ is the substring $w_{l+1} \ldots w_r$.

For two ranges $\rho_1 = \langle l_1, r_1 \rangle$ and $\rho_2 = \langle l_2, r_2 \rangle$: if $r_1 = l_2$, then $\rho_1 \cdot \rho_2 = \langle l_1, r_2 \rangle$; otherwise $\rho_1 \cdot \rho_2$ is undefined.

Multiple Context-Free Grammars (5)

Two ranges $\langle l_1, r_1 \rangle$, $\langle l_2, r_2 \rangle$ are overlapping if either
a) $l_1 \le l_2 < r_1$ and $l_1 < r_2$, or
b) $l_1 < r_2 \le r_1$ and $l_2 < r_1$.

A $\vec{\rho} \in (\mathrm{Pos}(w) \times \mathrm{Pos}(w))^k$ is a $k$-dimensional range vector for $w$ iff $\vec{\rho} = \langle \langle l_1, r_1 \rangle, \ldots, \langle l_k, r_k \rangle \rangle$ with
a) $\langle l_i, r_i \rangle$ is a range in $w$ for $1 \le i \le k$, and
b) the elements of $\vec{\rho}$ are pairwise non-overlapping.

We then define $\vec{\rho}(w) := \langle \langle l_1, r_1 \rangle(w), \ldots, \langle l_k, r_k \rangle(w) \rangle$.

Multiple Context-Free Grammars (6)

Now we define the range vectors in the yield of a given predicate $A$ with respect to $w$:

For every terminating rule $A \to f[\,]$ and every range $\langle l, r \rangle$: if $\langle l, r \rangle(w) = f[\,]$, then $A(\langle l, r \rangle)$.

Let $A \to f[A_1, \ldots, A_k]$ be a production and $\vec{\rho}_i$ range vectors with $A_i(\vec{\rho}_i)$ for $1 \le i \le k$. We now apply $f$ directly to the range vectors while mapping the terminals in the lhs to appropriate ranges of length 1. This way, $f$ is no longer a function and it is no longer defined for all range vectors. (In some cases, it might yield undefined concatenations of ranges.)

For all $\vec{\rho} \in f(\vec{\rho}_1, \ldots, \vec{\rho}_k)$: $A(\vec{\rho})$.

For any other $\vec{\rho}$, the predicate $A$ is false.

The language of an MCFG $G$ is $\{ w \in T^* \mid S(\langle 0, n \rangle) \text{ holds with respect to } w \}$, where $n$ is the length of $w$.
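The range definitions above translate almost directly into code. The following is a minimal sketch, assuming ranges are stored as Python pairs `(l, r)` and the input word as a string; the function names are hypothetical. Because the overlap condition as stated is not symmetric in its two arguments (a range strictly containing the other is only caught in one order), the range-vector check below tests every pair in both orders, which is an assumption of this sketch.

```python
from itertools import combinations
from typing import Optional, Sequence, Tuple

Range = Tuple[int, int]


def is_range(rho: Range, w: str) -> bool:
    """<l, r> is a range in w iff 0 <= l <= r <= |w|."""
    l, r = rho
    return 0 <= l <= r <= len(w)


def range_yield(rho: Range, w: str) -> str:
    """Yield of <l, r>: the substring w_{l+1} ... w_r (1-based), i.e. w[l:r]."""
    l, r = rho
    return w[l:r]


def concat(rho1: Range, rho2: Range) -> Optional[Range]:
    """Range concatenation; None stands for 'undefined'."""
    return (rho1[0], rho2[1]) if rho1[1] == rho2[0] else None


def overlapping(rho1: Range, rho2: Range) -> bool:
    """Conditions a) and b) of the overlap definition, in the given order."""
    (l1, r1), (l2, r2) = rho1, rho2
    return (l1 <= l2 < r1 and l1 < r2) or (l1 < r2 <= r1 and l2 < r1)


def is_range_vector(rhos: Sequence[Range], w: str) -> bool:
    """All elements are ranges in w and no pair overlaps (checked both ways)."""
    return all(is_range(rho, w) for rho in rhos) and not any(
        overlapping(a, b) or overlapping(b, a) for a, b in combinations(rhos, 2)
    )
```

For the word `"abcab"`, the ranges `(0, 2)` and `(3, 5)` have the same yield `"ab"` but are distinct ranges, which is exactly why ranges rather than substrings are used.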
CYK Parsing: the basic algorithm (1)

Seki et al. (1991).

Idea: process the input from left to right and calculate, for each position $i$, all predicates $A$ together with their yield position vector whose rightmost yield component ends at some position $j \le i$, starting with the terminating rules.

$w$ is in the language iff $S$ with position vector $\langle 0, n \rangle$ is in the final set.

CYK Parsing: the basic algorithm (2)

Deduction rules:

Items $[A, \vec{\rho}]$ with $A \in N$ and $\vec{\rho}$ a $\dim(A)$-dimensional range vector in $w$.

Axioms:
$$\frac{}{[A, \vec{\rho}]} \quad A \to f[\,],\ f[\,] = \vec{\rho}(w)$$

Complete:
$$\frac{[A_1, \vec{\rho}_1], \ldots, [A_m, \vec{\rho}_m]}{[A, \vec{\rho}]} \quad A \to f[A_1, \ldots, A_m],\ \vec{\rho} \in f[\vec{\rho}_1, \ldots, \vec{\rho}_m]$$

Goal item: $[S, \langle 0, n \rangle]$

CYK Parsing: the naïve algorithm (1)

Problem with the basic CYK algorithm: in order to perform a complete step, one has to find items for all arguments of a rhs at the same time.

Burden & Ljunglöf (2005) propose to modify the basic CYK algorithm such that only one daughter needs to be found at a time. This amounts to a binarization with dotted items for partially completed rhs (similar to Chomsky Normal Form for CFGs).

Such items must contain all range vectors for the already recognized predicates of the rhs.

CYK Parsing: the naïve algorithm (2)

We know that the arguments of the rhs predicates are taken as single components of the arguments of the lhs. We refer to them as $A_i^{(k)}$. Then we can write a rewriting rule as follows:

$$A_0 \to f[A_1, \ldots, A_n] := \langle x_1, \ldots, x_k \rangle$$

where $k = \dim(A_0)$ and $x_i \in (T \cup \{ A^{(m)} \mid A \in \{A_1, \ldots, A_n\},\ m \in \{1, \ldots, \dim(A)\} \})^*$.

The vector $\vec{x} = \langle x_1, \ldots, x_k \rangle$ is called a range constraint vector.
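The side condition of the basic Complete rule, that the new range vector is one of the values obtained by applying $f$ to the daughters' range vectors, can be sketched as follows. This is an illustrative sketch under assumptions, not the authors' implementation: a function is encoded as a list of components whose symbols are single-character terminals or `(argument, component)` index pairs, and the names `instantiate`, `apply_to_ranges`, `WRAP` and `JOIN` are hypothetical.

```python
from itertools import combinations, product
from typing import List, Sequence, Set, Tuple, Union

Range = Tuple[int, int]
Symbol = Union[str, Tuple[int, int]]  # terminal, or (argument, component)


def overlapping(a: Range, b: Range) -> bool:
    (l1, r1), (l2, r2) = a, b
    return (l1 <= l2 < r1 and l1 < r2) or (l1 < r2 <= r1 and l2 < r1)


def instantiate(component: Sequence[Symbol],
                arg_vectors: Sequence[Sequence[Range]],
                w: str) -> List[Range]:
    """All ranges one component can denote: terminals become ranges of
    length 1, argument references become the daughter's range, and adjacent
    ranges must concatenate (otherwise the candidate is dropped)."""
    partial: list = [None]  # None: nothing consumed yet
    for sym in component:
        extended = []
        for cur in partial:
            if isinstance(sym, str):
                starts = range(len(w)) if cur is None else [cur[1]]
                cands = [(p, p + 1) for p in starts
                         if p < len(w) and w[p] == sym]
            else:
                cands = [arg_vectors[sym[0]][sym[1]]]
            for c in cands:
                if cur is None:
                    extended.append(c)
                elif cur[1] == c[0]:
                    extended.append((cur[0], c[1]))
        partial = extended
    if partial == [None]:  # empty component: every empty range qualifies
        return [(p, p) for p in range(len(w) + 1)]
    return partial


def apply_to_ranges(components: Sequence[Sequence[Symbol]],
                    arg_vectors: Sequence[Sequence[Range]],
                    w: str) -> Set[Tuple[Range, ...]]:
    """Set of range vectors in f(rho_1, ..., rho_k); may be empty."""
    per_component = (instantiate(c, arg_vectors, w) for c in components)
    return {
        combo for combo in product(*per_component)
        if not any(overlapping(a, b) or overlapping(b, a)
                   for a, b in combinations(combo, 2))
    }


# Hypothetical rules: A -> WRAP[A] := <a A(1) b, c A(2) d>, S -> JOIN[A] := <A(1) A(2)>
WRAP = [["a", (0, 0), "b"], ["c", (0, 1), "d"]]
JOIN = [[(0, 0), (0, 1)]]
w = "aabbccdd"
inner = ((1, 3), (5, 7))                  # daughter item [A, <<1,3>, <5,7>>]
outer = apply_to_ranges(WRAP, [inner], w)  # range vectors for the mother item
```

The result is a set rather than a single value, reflecting the remark that $f$ applied to range vectors is no longer a function; an empty set corresponds to undefined concatenations.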
CYK Parsing: the naïve algorithm (3)

Given a $w$, we can map the terminal symbols in a range constraint vector to ranges in $w$:

Let $\vec{x}$ be a range constraint vector and $x$ a component of $\vec{x}$. We define $[\![x]\!]_w$ as follows:
- if $x \in T^*$, then $[\![x]\!]_w = \{ \langle l, r \rangle \mid \langle l, r \rangle(w) = x \}$;
- if $x = yVz$ with $V = A^{(m)}$, then $[\![x]\!]_w = \{ \alpha_1 A^{(m)} \alpha_2 \mid \alpha_1 \in [\![y]\!]_w,\ \alpha_2 \in [\![z]\!]_w \}$.

$[\![\vec{x}]\!]_w$ is then obtained by applying this to all components of $\vec{x}$ such that the ranges occurring in the result are all pairwise non-overlapping.

CYK Parsing: the naïve algorithm (4)

Naïve algorithm: passive items $[A, \vec{\rho}]$ and active items $[A_0 \to f[\vec{A} \bullet \vec{A}']; \phi]$, where the components of $\phi$ are concatenations of ranges and variables $A^{(i)}$.

Predict introduces new axioms:
$$\frac{}{[A \to f[\bullet \vec{B}]; \phi]} \quad A \to f[\vec{B}] := \vec{x} \text{ and } \phi \in [\![\vec{x}]\!]_w$$

Note that this is a completely blind prediction: any rule is predicted as being potentially used.

CYK Parsing: the naïve algorithm (5)

Convert turns a completely recognized active item into a passive item:
$$\frac{[A \to f[\vec{B} \bullet]; \phi]}{[A; \phi]}$$

Complete moves the dot over a non-terminal if a corresponding passive item exists:
$$\frac{[A \to f[\vec{B} \bullet B_k \vec{B}']; \phi], \quad [B_k; \psi]}{[A \to f[\vec{B} B_k \bullet \vec{B}']; \phi']} \quad \phi' = \phi[B_k / \psi]$$

Here, $\phi[B_k / \psi]$ means replacing every occurrence of $B_k^{(i)}$ in $\phi$ with $\psi(i)$.

CYK Parsing: the active algorithm (1)

Idea: use the dot to traverse the range constraint vector $\phi$.

Passive items $[A; \Gamma]$ as before.

Active items $[A \to f[\vec{B}]; (\phi, \rho \bullet x, \psi); \Gamma]$.

Such an active item indicates that the first arguments of $A$ have been recognized, yielding the ranges $\phi$, and that the next argument has been recognized up to the position marked by the dot, so far yielding $\rho$. The rest of this argument (range constraints $x$) and the following arguments (range constraints $\psi$) are still waiting for completion. $\Gamma$ contains range vectors for the predicates in $\vec{B}$ if these have been found; otherwise it contains the variables $B_k^{(i)}$ for these ranges.
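For the naïve algorithm, the two operations that do the real work are the mapping of a range constraint component to ranges in $w$ (used by Predict) and the substitution $\phi[B_k / \psi]$ (used by Complete). Below is a minimal sketch of both. The `Var` class, the list encoding of components, and the names `occurrences`, `denote` and `substitute` are assumptions made for illustration; the pairwise non-overlap filter over the whole vector is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Var:
    """Range variable B_k^(i): component i of the k-th rhs predicate (0-based)."""
    k: int
    i: int


Range = Tuple[int, int]


def occurrences(y: str, w: str) -> List[Range]:
    """All ranges <l, r> in w whose yield is the terminal string y."""
    return [(l, l + len(y)) for l in range(len(w) - len(y) + 1)
            if w[l:l + len(y)] == y]


def denote(x: Sequence, w: str) -> List[tuple]:
    """Map one component (terminals and Vars) to sequences of ranges and Vars."""
    for pos, sym in enumerate(x):
        if isinstance(sym, Var):
            # x = y V z: combine every reading of y with every reading of z.
            left = occurrences("".join(x[:pos]), w)
            return [(a1, sym) + rest
                    for a1 in left for rest in denote(x[pos + 1:], w)]
    return [(rho,) for rho in occurrences("".join(x), w)]  # x in T*


def substitute(phi: Sequence[tuple], k: int,
               psi: Sequence[Range]) -> Optional[tuple]:
    """phi[B_k / psi]: replace each Var(k, i) by psi[i] and concatenate
    adjacent ranges; None if some concatenation is undefined."""
    result = []
    for comp in phi:
        merged: list = []
        for el in comp:
            if isinstance(el, Var) and el.k == k:
                el = psi[el.i]
            if merged and not isinstance(el, Var) \
                    and not isinstance(merged[-1], Var):
                if merged[-1][1] != el[0]:
                    return None
                merged[-1] = (merged[-1][0], el[1])
            else:
                merged.append(el)
        result.append(tuple(merged))
    return tuple(result)


w = "aabbccdd"
x1 = ["a", Var(0, 0), "b"]          # first component of <a A(1) b, c A(2) d>
readings = denote(x1, w)            # 2 choices for "a" times 2 choices for "b"
phi = (((0, 1), Var(0, 0), (3, 4)), ((4, 5), Var(0, 1), (7, 8)))
done = substitute(phi, 0, ((1, 3), (5, 7)))
```

The number of readings grows with every terminal occurrence in $w$, which illustrates why the slides call this prediction completely blind.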
CYK Parsing: the active algorithm (2)

Predict introduces a new rule with the dot on the left of its range constraint vector:
$$\frac{}{[A \to f[\vec{B}]; (\bullet x, \Psi); \Gamma_{\vec{B}}]} \quad A \to f[\vec{B}] := (x, \Psi)$$
($\Gamma_{\vec{B}}$ contains the range variables for the vector $\vec{B}$.)

Complete moves a dot that is at the end of an argument to the next argument:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet, x, \Psi); \Gamma]}{[A \to f[\vec{B}]; (\Phi, \alpha, \bullet x, \Psi); \Gamma]}$$

Scan moves the dot over a terminal:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet a x, \Psi); \Gamma]}{[A \to f[\vec{B}]; (\Phi, \alpha \langle l, r \rangle \bullet x, \Psi); \Gamma]} \quad \langle l, r \rangle(w) = a$$

CYK Parsing: the active algorithm (3)

Combine moves the dot over a non-terminal if the corresponding passive item has been found:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet B_k^{(i)} x, \Psi); \Gamma], \quad [B_k; \vec{\rho}]}{[A \to f[\vec{B}]; (\Phi, \alpha\, \vec{\rho}(i) \bullet x, \Psi); \Gamma']} \quad \Gamma(k) \text{ compatible with } \vec{\rho},\ \Gamma' = \Gamma(k, i := \vec{\rho}(i))$$
("Compatible" means that for every $1 \le i \le \dim(B_k)$, either $\Gamma(k)(i) = \vec{\rho}(i)$ or $\Gamma(k)(i) = B_k^{(i)}$.)

Convert turns a fully recognized active item into a passive item:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet); \Gamma]}{[A; (\Phi, \alpha)]}$$

CYK Parsing: the incremental algorithm (1)

Problem of the active algorithm: only passive items are used in combine steps. That is, in a situation where the dot precedes $A^{(i)}$, in order to use the $i$th component of the predicate $A$, all the other components of $A$ must already have been recognized.

Better: process incrementally and allow active items to be used in combine steps.

Idea: read one token at a time and calculate all possible consequences of that token before the next token is read.

CYK Parsing: the incremental algorithm (2)

We now use explicit features $r_1, \ldots, r_k$ for the range constraints of the $k$ ranges of a predicate $A$. This way, the argument index is no longer given by the position in the range constraint vector and we can process the arguments in any order.

Only active items: $[A \to f[\vec{B}]; (\phi, r_i = \rho \bullet x, \psi); \Gamma]$.

These are as in the active algorithm, except that the order in the range constraint vectors need not be the same as in the original rule in the grammar.
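The Scan and Combine steps of the active algorithm, together with the compatibility check on $\Gamma$, can be sketched as below. Only the argument currently under the dot is modelled: `alpha` is the range recognized so far (or `None` if nothing has been consumed yet), `rest` is the list of remaining symbols, and `gamma` holds one row per rhs predicate. This reduced item layout, the `Var` class and all function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Var:
    """Range variable B_k^(i) (0-based indices)."""
    k: int
    i: int


Range = Tuple[int, int]


def concat(a: Range, b: Range) -> Optional[Range]:
    return (a[0], b[1]) if a[1] == b[0] else None


def compatible(gamma_k: Sequence, rho: Sequence[Range], k: int) -> bool:
    """Each entry of Gamma(k) is either still the variable or equals rho(i)."""
    return all(g == r or g == Var(k, i)
               for i, (g, r) in enumerate(zip(gamma_k, rho)))


def scan(alpha: Optional[Range], rest: list, w: str) -> List[tuple]:
    """Move the dot over a terminal; returns all (new_alpha, new_rest) pairs."""
    if not rest or not isinstance(rest[0], str):
        return []
    starts = range(len(w)) if alpha is None else [alpha[1]]
    return [((p if alpha is None else alpha[0], p + 1), rest[1:])
            for p in starts if p < len(w) and w[p] == rest[0]]


def combine(alpha: Optional[Range], rest: list, gamma: List[list],
            k: int, rho: Sequence[Range]) -> Optional[tuple]:
    """Move the dot over B_k^(i) using the passive item [B_k; rho]."""
    if not rest or not isinstance(rest[0], Var) or rest[0].k != k:
        return None
    i = rest[0].i
    if not compatible(gamma[k], rho, k):
        return None
    new_alpha = rho[i] if alpha is None else concat(alpha, rho[i])
    if new_alpha is None:
        return None
    new_gamma = [list(row) for row in gamma]
    new_gamma[k][i] = rho[i]  # Gamma' = Gamma(k, i := rho(i))
    return new_alpha, rest[1:], new_gamma


w = "aabbccdd"
rest0 = ["a", Var(0, 0), "b"]                 # first argument of <a A(1) b, c A(2) d>
gamma0 = [[Var(0, 0), Var(0, 1)]]             # nothing found yet for A
after_scan = scan(None, rest0, w)             # "a" can be read at two positions
alpha1, rest1 = after_scan[0]
after_combine = combine(alpha1, rest1, gamma0, 0, ((1, 3), (5, 7)))
```

Once `gamma` records the range `(1, 3)` for the first component, a later Combine for the second component only accepts passive items whose first component is also `(1, 3)`, which is what the compatibility condition enforces.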
CYK Parsing: the incremental algorithm (3)

Predict introduces a new rule with the dot on the left of one of the arguments of the lhs:
$$\frac{}{[A \to f[\vec{B}]; (r_i = \langle k, k \rangle \bullet x, \Psi_1, \Psi_2); \Gamma_{\vec{B}}]} \quad A \to f[\vec{B}] := (\Psi_1, x, \Psi_2)$$
with $x$ the $i$th element, $1 \le i \le \dim(A)$, $1 \le k \le n$. ($\Gamma_{\vec{B}}$ contains the range variables for the vector $\vec{B}$.)

Complete moves a dot that is at the end of an argument to another argument:
$$\frac{[A \to f[\vec{B}]; (\Phi, r_i = \langle l_i, r_i \rangle \bullet, \Psi_1, r_j = x, \Psi_2); \Gamma]}{[A \to f[\vec{B}]; (\Phi, r_i = \langle l_i, r_i \rangle, r_j = \langle k, k \rangle \bullet x, \Psi_1, \Psi_2); \Gamma]} \quad r_i \le k \le n$$

CYK Parsing: the incremental algorithm (4)

Scan moves the dot over a terminal:
$$\frac{[A \to f[\vec{B}]; (\Phi, r_i = \langle l, r \rangle \bullet a x, \Psi); \Gamma]}{[A \to f[\vec{B}]; (\Phi, r_i = \langle l, r + 1 \rangle \bullet x, \Psi); \Gamma]} \quad \langle r, r + 1 \rangle(w) = a$$

Combine moves the dot over a non-terminal if the corresponding item has been found:
$$\frac{[A \to f[\vec{B}]; (\Phi_1, r_j = \alpha \bullet B_k^{(i)} x, \Psi_1); \Gamma], \quad [B_k; (\Phi_2, r_i = \beta \bullet, \Psi_2)]}{[A \to f[\vec{B}]; (\Phi_1, r_j = \alpha \beta \bullet x, \Psi_1); \Gamma(k, i := \beta)]} \quad \Gamma(k) \text{ compatible with } \Phi_2$$
("Compatible" means that for every $1 \le h \le \dim(B_k)$: if $r_h = \alpha_h \in \Phi_2$, then $\Gamma(k)(h) = \alpha_h$.)

CYK Parsing: Prediction strategies

Problem of the predict operations presented above: we compute partial results that are not reachable given the predicates we are looking for or the predicates we have already found.

Solution: replace the unrestricted predict rule with more intelligent predictions.

Possible strategies: $A \to f[\vec{B}]$ with the dot left of $r_i = \alpha$ is only predicted if
- there is another item looking for $A^{(i)}$ (top-down prediction), or
- there is a passive item that has found the first symbol in $\alpha$ (bottom-up prediction).

Conclusion

Starting point: basic algorithm (Seki et al.).

Refinement: decompose single items and deduction steps into different items and smaller deduction steps. This gives the naïve algorithm (Burden & Ljunglöf, 2005).

Further refinement: divide the combine rule into complete, scan and combine. This gives the active algorithm (Burden & Ljunglöf, 2005).

Further refinement: predict and complete can select from any possible remaining range constraint, not just the following one. This gives the incremental algorithm (Burden & Ljunglöf, 2005).

The algorithms from Burden & Ljunglöf (2005) have been implemented in the Grammatical Framework System (Ranta, 2004).
More informationContext Free Grammars: Introduction. Context Free Grammars: Simplifying CFGs
Context Free Grammars: Introduction CFGs are more powerful than RGs because of the following 2 properties: 1. Recursion Rule is recursive if it is of the form X w 1 Y w 2, where Y w 3 Xw 4 and w 1, w 2,
More informationFinite Automata Theory and Formal Languages TMV026/TMV027/DIT321 Responsible: Ana Bove
Finite Automata Theory and Formal Languages TMV026/TMV027/DIT321 Responsible: Ana Bove Tuesday 28 of May 2013 Total: 60 points TMV027/DIT321 registration VT13 TMV026/DIT321 registration before VT13 Exam
More informationFORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY
15-453 FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY REVIEW for MIDTERM 1 THURSDAY Feb 6 Midterm 1 will cover everything we have seen so far The PROBLEMS will be from Sipser, Chapters 1, 2, 3 It will be
More informationMildly Context-Sensitive Grammar Formalisms: Embedded Push-Down Automata
Mildly Context-Sensitive Grammar Formalisms: Embedded Push-Down Automata Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Sommersemester 2011 Intuition (1) For a language L, there is a TAG G with
More informationNote: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules).
Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules). 1a) G = ({R, S, T}, {0,1}, P, S) where P is: S R0R R R0R1R R1R0R T T 0T ε (S generates the first 0. R generates
More informationPushdown Automata (Pre Lecture)
Pushdown Automata (Pre Lecture) Dr. Neil T. Dantam CSCI-561, Colorado School of Mines Fall 2017 Dantam (Mines CSCI-561) Pushdown Automata (Pre Lecture) Fall 2017 1 / 41 Outline Pushdown Automata Pushdown
More informationContext Free Grammars
Automata and Formal Languages Context Free Grammars Sipser pages 101-111 Lecture 11 Tim Sheard 1 Formal Languages 1. Context free languages provide a convenient notation for recursive description of languages.
More informationPush-down Automata = FA + Stack
Push-down Automata = FA + Stack PDA Definition A push-down automaton M is a tuple M = (Q,, Γ, δ, q0, F) where Q is a finite set of states is the input alphabet (of terminal symbols, terminals) Γ is the
More informationRecursive descent for grammars with contexts
39th International Conference on Current Trends in Theory and Practice of Computer Science Špindleruv Mlýn, Czech Republic Recursive descent parsing for grammars with contexts Ph.D. student, Department
More informationThis lecture covers Chapter 7 of HMU: Properties of CFLs
This lecture covers Chapter 7 of HMU: Properties of CFLs Chomsky Normal Form Pumping Lemma for CFs Closure Properties of CFLs Decision Properties of CFLs Additional Reading: Chapter 7 of HMU. Chomsky Normal
More informationChapter 4: Context-Free Grammars
Chapter 4: Context-Free Grammars 4.1 Basics of Context-Free Grammars Definition A context-free grammars, or CFG, G is specified by a quadruple (N, Σ, P, S), where N is the nonterminal or variable alphabet;
More informationSuppose h maps number and variables to ɛ, and opening parenthesis to 0 and closing parenthesis
1 Introduction Parenthesis Matching Problem Describe the set of arithmetic expressions with correctly matched parenthesis. Arithmetic expressions with correctly matched parenthesis cannot be described
More informationBefore We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?
Before We Start The Pumping Lemma Any questions? The Lemma & Decision/ Languages Future Exam Question What is a language? What is a class of languages? Context Free Languages Context Free Languages(CFL)
More informationMTH401A Theory of Computation. Lecture 17
MTH401A Theory of Computation Lecture 17 Chomsky Normal Form for CFG s Chomsky Normal Form for CFG s For every context free language, L, the language L {ε} has a grammar in which every production looks
More informationSyntactic Analysis. Top-Down Parsing
Syntactic Analysis Top-Down Parsing Copyright 2015, Pedro C. Diniz, all rights reserved. Students enrolled in Compilers class at University of Southern California (USC) have explicit permission to make
More informationKnuth-Morris-Pratt Algorithm
Knuth-Morris-Pratt Algorithm Jayadev Misra June 5, 2017 The Knuth-Morris-Pratt string matching algorithm (KMP) locates all occurrences of a pattern string in a text string in linear time (in the combined
More informationHandout 8: Computation & Hierarchical parsing II. Compute initial state set S 0 Compute initial state set S 0
Massachusetts Institute of Technology 6.863J/9.611J, Natural Language Processing, Spring, 2001 Department of Electrical Engineering and Computer Science Department of Brain and Cognitive Sciences Handout
More informationLecture 11 Context-Free Languages
Lecture 11 Context-Free Languages COT 4420 Theory of Computation Chapter 5 Context-Free Languages n { a b : n n { ww } 0} R Regular Languages a *b* ( a + b) * Example 1 G = ({S}, {a, b}, S, P) Derivations:
More informationUNIT II REGULAR LANGUAGES
1 UNIT II REGULAR LANGUAGES Introduction: A regular expression is a way of describing a regular language. The various operations are closure, union and concatenation. We can also find the equivalent regular
More informationContext-free Grammars and Languages
Context-free Grammars and Languages COMP 455 002, Spring 2019 Jim Anderson (modified by Nathan Otterness) 1 Context-free Grammars Context-free grammars provide another way to specify languages. Example:
More informationComputing if a token can follow
Computing if a token can follow first(b 1... B p ) = {a B 1...B p... aw } follow(x) = {a S......Xa... } There exists a derivation from the start symbol that produces a sequence of terminals and nonterminals
More information