Parsing beyond context-free grammar: Parsing Multiple Context-Free Grammars

Laura Kallmeyer, Wolfgang Maier
University of Tübingen
ESSLLI Course 2008
Parsing beyond CFG: MCFG Parsing

Overview
1. Multiple Context-Free Grammars
2. CYK Parsing
   (a) The basic algorithm
   (b) The naïve algorithm
   (c) The active algorithm
   (d) The incremental algorithm
   (e) Prediction strategies
3. Conclusion

Multiple Context-Free Grammars (1)
Seki et al. (1991), Seki & Kato (2008)
Motivation: describe discontinuity.
Idea: Non-terminal symbols can span a tuple of strings that need not be adjacent in the input string.
Rewrite rules have the form $A_0 \to f[A_1, \ldots, A_q]$, where $f$ is a function describing how to compute an $A_0$-tuple from tuples satisfying $A_1, \ldots, A_q$ such that
each component of the value of $f$ is a concatenation of some constant strings and some components of its arguments;
no component of the rhs arguments appears in the value of $f$ more than once.

Multiple Context-Free Grammars (2)
An MCFG is a 5-tuple $\langle N, T, F, P, S \rangle$ where
$N$ is a finite set of non-terminals, each $A \in N$ has a dimension $\dim(A) \geq 1$, $\dim(A) \in \mathbb{N}$;
$T$ is a finite set of terminals;
$F$ is a finite set of mcf-functions (see below);
$P$ is a finite set of rules of the form $A_0 \to f[A_1, \ldots, A_k]$ with $k \geq 0$, $f \in F$ such that $f : (T^*)^{\dim(A_1)} \times \cdots \times (T^*)^{\dim(A_k)} \to (T^*)^{\dim(A_0)}$;
$S \in N$ is the start symbol, $\dim(S) = 1$.
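
The definition of mcf-functions can be made concrete with a small sketch. The encoding below is an assumption made for illustration and is not part of the slides: a function is a tuple of components, a terminal string is a Python `str`, and a pair `(i, j)` stands for component `j` of argument `i` (both 0-based). The names `apply_mcf`, `uses_each_variable_at_most_once`, `h`, `g` and `f` are hypothetical; the three functions describe a toy grammar for a^n b^n c^n.

```python
from collections import Counter

def apply_mcf(components, args):
    """Apply an mcf-function to a list of string tuples (one tuple per rhs predicate)."""
    value = []
    for comp in components:
        # A str is a constant string; a pair (i, j) picks component j of argument i.
        parts = [s if isinstance(s, str) else args[s[0]][s[1]] for s in comp]
        value.append("".join(parts))
    return tuple(value)

def uses_each_variable_at_most_once(components):
    """Check the restriction that no argument component is used more than once."""
    counts = Counter(s for comp in components for s in comp if not isinstance(s, str))
    return all(c <= 1 for c in counts.values())

# Hypothetical functions for the toy language a^n b^n c^n (n >= 1):
h = (["ab"], ["c"])                       # A -> h[]   yields <ab, c>
g = (["a", (0, 0), "b"], ["c", (0, 1)])   # A -> g[A]  wraps <x1, x2> into <a x1 b, c x2>
f = ([(0, 0), (0, 1)],)                   # S -> f[A]  concatenates <x1, x2> into <x1 x2>

pair = apply_mcf(g, [apply_mcf(h, [])])   # the A-tuple for n = 2
word = apply_mcf(f, [pair])               # the 1-dimensional S-tuple
```

The two components of the `A`-tuple end up in different parts of the final string, which is the kind of discontinuity the formalism is meant to describe.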

Multiple Context-Free Grammars (3)
$f$ is an mcf-function if there is a $k \geq 0$ and there are $d_i > 0$ for $0 \leq i \leq k$ such that $f$ is a total function from $(T^*)^{d_1} \times \cdots \times (T^*)^{d_k}$ to $(T^*)^{d_0}$ such that
the components of $f(\vec{x}_1, \ldots, \vec{x}_k)$ are concatenations of a finite number of terminal symbols and the components $x_{ij}$ of the $\vec{x}_i$ ($1 \leq i \leq k$, $1 \leq j \leq d_i$), and
each component $x_{ij}$ of the $\vec{x}_i$ is used at most once in the components of $f(\vec{x}_1, \ldots, \vec{x}_k)$.

Multiple Context-Free Grammars (4)
Given an input string $w$, each $A \in N$ can be considered as a predicate that is true for certain vectors of substrings of $w$.
To distinguish between different substrings containing the same terminal symbols, we introduce ranges:
Let $w$ be the input word, $w = w_1 \ldots w_n$. $Pos(w) := \{0, \ldots, n\}$.
We call a pair $\langle l, r \rangle \in Pos(w) \times Pos(w)$ with $l \leq r$ a range in $w$. Its yield $\langle l, r \rangle(w)$ is the substring $w_{l+1} \ldots w_r$.
For two ranges $\rho_1 = \langle l_1, r_1 \rangle$, $\rho_2 = \langle l_2, r_2 \rangle$: if $r_1 = l_2$, then $\rho_1 \cdot \rho_2 = \langle l_1, r_2 \rangle$; otherwise $\rho_1 \cdot \rho_2$ is undefined.

Multiple Context-Free Grammars (5)
Two ranges $\langle l_1, r_1 \rangle$, $\langle l_2, r_2 \rangle$ are overlapping if either
a) $l_1 \leq l_2 < r_1$ and $l_1 < r_2$, or
b) $l_1 < r_2 \leq r_1$ and $l_2 < r_1$.
A $\vec{\rho} \in (Pos(w) \times Pos(w))^k$ is a $k$-dimensional range vector for $w$ iff $\vec{\rho} = \langle \langle l_1, r_1 \rangle, \ldots, \langle l_k, r_k \rangle \rangle$ with
a) $\langle l_i, r_i \rangle$ is a range in $w$ for $1 \leq i \leq k$, and
b) the elements of $\vec{\rho}$ are pairwise non-overlapping.
We then define $\vec{\rho}(w) := \langle \langle l_1, r_1 \rangle(w), \ldots, \langle l_k, r_k \rangle(w) \rangle$.

Multiple Context-Free Grammars (6)
Now we define the range vectors in the yield of a given predicate $A$ with respect to $w$:
For every terminating rule $A \to f[\,]$ and every range $\langle l, r \rangle$: if $\langle l, r \rangle(w) = f[\,]$, then $A(\langle l, r \rangle)$.
Let $A \to f[A_1, \ldots, A_k]$ be a production and $\vec{\rho}_i$ range vectors with $A_i(\vec{\rho}_i)$ for $1 \leq i \leq k$.
We now apply $f$ directly to the range vectors while mapping the terminals in the lhs to appropriate ranges of length 1. This way, $f$ is no longer a function and it is no longer defined for all range vectors. (In some cases, it might yield undefined concatenations of ranges.)
For all $\vec{\rho} \in f(\vec{\rho}_1, \ldots, \vec{\rho}_k)$: $A(\vec{\rho})$.
For any other $\vec{\rho}$, the predicate $A$ is false.
The language of an MCFG $G$ is $\{ w \in T^* \mid S(\langle 0, n \rangle) \text{ with respect to } w \}$.
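
The range definitions above can be tried out with a minimal sketch. It assumes that a range is a Python pair `(l, r)` and that the input word is a `str`; all function names are hypothetical. One further assumption is labeled in the code: the overlap conditions a) and b) are tested in both argument orders when checking a range vector.

```python
def range_yield(w, rng):
    """Yield of a range: symbols w_{l+1} ... w_r, which is the slice w[l:r] in Python."""
    l, r = rng
    return w[l:r]

def concat(r1, r2):
    """Concatenation of two ranges; None models the undefined case."""
    return (r1[0], r2[1]) if r1[1] == r2[0] else None

def overlapping(r1, r2):
    """Conditions a) and b) of the overlap definition, for this argument order."""
    (l1, e1), (l2, e2) = r1, r2
    return (l1 <= l2 < e1 and l1 < e2) or (l1 < e2 <= e1 and l2 < e1)

def is_range_vector(w, vec):
    """True if every element is a range in w and the elements are pairwise non-overlapping."""
    n = len(w)
    in_w = all(0 <= l <= r <= n for l, r in vec)
    # Assumption: the test is applied in both orders, so that a range lying
    # strictly inside another one is also rejected.
    clash = any(overlapping(a, b) or overlapping(b, a)
                for i, a in enumerate(vec) for b in vec[i + 1:])
    return in_w and not clash
```

For example, `concat((0, 2), (2, 3))` gives `(0, 3)`, while `concat((0, 2), (3, 4))` is undefined because the two ranges are not adjacent.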

CYK Parsing: the basic algorithm (1)
Seki et al. (1991).
Idea: process the input from left to right; starting with the terminating rules, calculate for each position $i$ all predicates $A$ together with their yield position vector whose rightmost yield component ends at some position $j \leq i$.
$w$ is in the language iff $S$ with position vector $\langle 0, n \rangle$ is in the final set.

CYK Parsing: the basic algorithm (2)
Deduction rules:
Items $[A, \vec{\rho}]$ with $A \in N$, $\vec{\rho}$ a $\dim(A)$-dimensional range vector in $w$.
Axioms:
$$\frac{}{[A, \vec{\rho}]} \quad A \to f[\,],\ f[\,] = \vec{\rho}(w)$$
Complete:
$$\frac{[A_1, \vec{\rho}_1], \ldots, [A_m, \vec{\rho}_m]}{[A, \vec{\rho}]} \quad A \to f[A_1, \ldots, A_m],\ \vec{\rho} \in f[\vec{\rho}_1, \ldots, \vec{\rho}_m]$$
Goal item: $[S, \langle 0, n \rangle]$

CYK Parsing: the naïve algorithm (1)
Problem with the basic CYK algorithm: in order to perform a complete step, one has to find items for all arguments of a rhs at the same time.
Burden & Ljunglöf (2005) propose to modify the basic CYK algorithm such that only one daughter needs to be found at a time.
This amounts to a binarization with dotted items for partially completed rhs (similar to Chomsky Normal Form for CFGs). Such items must contain all range vectors for the already recognized predicates of the rhs.

CYK Parsing: the naïve algorithm (2)
We know that the components of the arguments of the rhs predicates occur as single elements in the components of the lhs argument. We refer to the $k$th component of $A_i$ as $A_i^{(k)}$.
Then we can write a rewriting rule as follows:
$$A_0 \to f[A_1, \ldots, A_n] := \langle x_1, \ldots, x_k \rangle$$
where $k = \dim(A_0)$ and $x_i \in (T \cup \{A^{(m)} \mid A \in \{A_1, \ldots, A_n\},\ m \in \{1, \ldots, \dim(A)\}\})^*$.
The vector $\vec{x} = \langle x_1, \ldots, x_k \rangle$ is called a range constraint vector.
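
The Axioms and Complete rules of the basic algorithm can be sketched as a fixpoint computation over a set of items. This is a minimal illustration under stated assumptions, not the implementation of Seki et al.: rules are hypothetical triples `(lhs, components, rhs)`, a terminal string is a `str`, a pair `(i, j)` is component `j` of rhs predicate `i` (0-based), and a simple symmetric interval test stands in for the overlap definition. The helper names `instantiate`, `disjoint` and `recognize` are made up for this sketch.

```python
from itertools import product

def instantiate(comp, args, w):
    """All ranges one component of f can take, given range vectors for the rhs predicates."""
    found = []
    for start in range(len(w) + 1):
        pos = start
        for s in comp:
            if isinstance(s, str):             # terminal string: must match the input here
                if w[pos:pos + len(s)] != s:
                    break
                pos += len(s)
            else:                              # variable (i, j): range j of argument i
                l, r = args[s[0]][s[1]]
                if l != pos:                   # range concatenation would be undefined
                    break
                pos = r
        else:
            found.append((start, pos))
    return found

def disjoint(ranges):
    """Assumed simplification: plain interval disjointness for every pair of ranges."""
    return all(a[1] <= b[0] or b[1] <= a[0]
               for i, a in enumerate(ranges) for b in ranges[i + 1:])

def recognize(rules, start, w):
    """Apply all rules until no new item [A, rho] is derived, then look for the goal item."""
    chart = set()
    changed = True
    while changed:
        changed = False
        for lhs, comps, rhs in rules:
            pools = [[rho for (a, rho) in chart if a == b] for b in rhs]
            for args in product(*pools):       # one item per rhs predicate (none for axioms)
                for rho in product(*(instantiate(c, args, w) for c in comps)):
                    if disjoint(rho) and (lhs, rho) not in chart:
                        chart.add((lhs, rho))
                        changed = True
    return (start, ((0, len(w)),)) in chart

# Hypothetical toy grammar for a^n b^n c^n (n >= 1).
RULES = [
    ("A", (["ab"], ["c"]), []),
    ("A", (["a", (0, 0), "b"], ["c", (0, 1)]), ["A"]),
    ("S", ([(0, 0), (0, 1)],), ["A"]),
]
```

Note that the Complete step here needs an item for every rhs predicate at once (the `product` over `pools`), which is exactly the cost the naïve algorithm sets out to avoid.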

CYK Parsing: the naïve algorithm (3)
Given a $w$, we can map the terminal symbols in a range constraint vector to ranges in $w$:
Let $\vec{x}$ be a range constraint vector, $x$ a component of $\vec{x}$. We define
if $x \in T^*$, then $x_w = \{ \langle l, r \rangle \mid \langle l, r \rangle(w) = x \}$;
if $x = yVz$ with $V = A^{(m)}$, then $x_w = \{ \alpha_1 A^{(m)} \alpha_2 \mid \alpha_1 \in y_w,\ \alpha_2 \in z_w \}$.
$\vec{x}_w$ is then obtained by applying this to all components of $\vec{x}$ such that the ranges occurring in the result are all pairwise non-overlapping.

CYK Parsing: the naïve algorithm (4)
Naïve algorithm: Passive items $[A, \vec{\rho}]$ and active items $[A_0 \to f[\vec{A} \bullet \vec{A}']; \vec{\phi}]$ where the components of $\vec{\phi}$ are concatenations of ranges and variables $A^{(i)}$.
Predict introduces new axioms:
$$\frac{}{[A \to f[\bullet \vec{B}]; \vec{\phi}]} \quad A \to f[\vec{B}] := \vec{x} \text{ and } \vec{\phi} \in \vec{x}_w$$
Note that this is a completely blind prediction: any rule is predicted as being potentially used.

CYK Parsing: the naïve algorithm (5)
Convert turns a completely recognized active item into a passive item:
$$\frac{[A \to f[\vec{B} \bullet]; \vec{\phi}]}{[A; \vec{\phi}]}$$
Complete moves the dot over a non-terminal if a corresponding passive item exists:
$$\frac{[A \to f[\vec{B} \bullet B_k \vec{B}']; \vec{\phi}],\ [B_k; \vec{\psi}]}{[A \to f[\vec{B} B_k \bullet \vec{B}']; \vec{\phi}']} \quad \vec{\phi}' = \vec{\phi}[B_k / \vec{\psi}]$$
Here, $\vec{\phi}[B_k / \vec{\psi}]$ means replacing every occurrence of $B_k^{(i)}$ in $\vec{\phi}$ with $\vec{\psi}(i)$.

CYK Parsing: the active algorithm (1)
Idea: use the dot to traverse the range constraint vector $\vec{\phi}$.
Passive items $[A; \Gamma]$ as before.
Active items $[A \to f[\vec{B}]; (\phi, \rho \bullet x, \psi); \vec{\Gamma}]$
Such an active item indicates that the first arguments of $A$ have been recognized, yielding the ranges $\phi$, and that the next argument is recognized up to the position marked by the dot, so far yielding $\rho$. The rest of this argument (range constraints $x$) and the following arguments (range constraints $\psi$) are still waiting for completion.
$\vec{\Gamma}$ contains range vectors for the predicates in $\vec{B}$ if these are found; otherwise it contains the variables $B_k^{(i)}$ for these ranges.
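
The substitution used by the naïve Complete rule, replacing every variable of a predicate by the corresponding range of a passive item, can be sketched as follows. The encoding is an assumption for illustration: a component is a list whose elements are either ranges `(l, r)` of integers or variables `(name, i)` with a 1-based component index `i`; adjacent ranges are merged by range concatenation, and `None` is returned when a concatenation is undefined. The name `substitute` is hypothetical, and the pairwise non-overlap check on the result is left out to keep the sketch short.

```python
def is_range(el):
    """Ranges are pairs of ints; variables are pairs (name, index) with a str name."""
    return isinstance(el[0], int)

def substitute(phi, name, psi):
    """Replace each variable (name, i) in phi by psi[i - 1] and merge adjacent ranges."""
    result = []
    for comp in phi:
        new = []
        for el in comp:
            if not is_range(el) and el[0] == name:
                el = psi[el[1] - 1]            # the i-th range of the passive item
            if new and is_range(new[-1]) and is_range(el):
                if new[-1][1] != el[0]:        # ranges not adjacent: concatenation undefined
                    return None
                new[-1] = (new[-1][0], el[1])  # <l1, r1> . <l2, r2> = <l1, r2>
            else:
                new.append(el)
        result.append(new)
    return result

# Range constraints after Predict for a rule with components <a B(1) b, B(2)>,
# with the terminals already mapped to ranges of the input.
phi = [[(0, 1), ("B", 1), (3, 4)], [("B", 2)]]
done = substitute(phi, "B", [(1, 3), (5, 6)])
```

When every variable has been replaced and all merges succeed, each component is a single range, which is the situation in which Convert can turn the active item into a passive one.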

CYK Parsing: the active algorithm (2)
Predict introduces a new rule with the dot on the left of its range constraint vector:
$$\frac{}{[A \to f[\vec{B}]; (\bullet x, \Psi); \vec{\Gamma}_B]} \quad A \to f[\vec{B}] := (x, \Psi)$$
($\vec{\Gamma}_B$ contains the range variables for the vector $\vec{B}$.)
Complete moves a dot that is at the end of an argument to the next argument:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet, x, \Psi); \Gamma]}{[A \to f[\vec{B}]; (\Phi, \alpha, \bullet x, \Psi); \Gamma]}$$
Scan moves the dot over a terminal:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet a x, \Psi); \Gamma]}{[A \to f[\vec{B}]; (\Phi, \alpha \langle l, r \rangle \bullet x, \Psi); \Gamma]} \quad \langle l, r \rangle(w) = a$$

CYK Parsing: the active algorithm (3)
Combine moves the dot over a non-terminal if the corresponding passive item has been found:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet B_k^{(i)} x, \Psi); \Gamma],\ [B_k; \vec{\rho}]}{[A \to f[\vec{B}]; (\Phi, \alpha\, \vec{\rho}(i) \bullet x, \Psi); \Gamma']} \quad \Gamma(k) \text{ compatible with } \vec{\rho},\ \Gamma' = \Gamma(k, i := \vec{\rho}(i))$$
("Compatible" means that for every $1 \leq i \leq \dim(B_k)$: either $\Gamma(k)(i) = \vec{\rho}(i)$ or $\Gamma(k)(i) = B_k^{(i)}$.)
Convert turns a fully recognized active item into a passive item:
$$\frac{[A \to f[\vec{B}]; (\Phi, \alpha \bullet); \Gamma]}{[A; (\Phi, \alpha)]}$$

CYK Parsing: the incremental algorithm (1)
Problem of the active algorithm: only passive items are used in combine steps. I.e., in a situation where the dot precedes $A^{(i)}$, in order to use the $i$th component of the predicate $A$, all the other components of $A$ must already have been recognized.
Better: process incrementally, allowing active items to be used in combine steps.
Idea: read one token at a time and calculate all possible consequences of that token before the next token is read.

CYK Parsing: the incremental algorithm (2)
We now use explicit features $r_1, \ldots, r_k$ for the range constraints of the $k$ ranges of a predicate $A$. This way, the argument index is no longer given by the position in the range constraint vector and we can process the arguments in any order.
Only active items $[A \to f[\vec{B}]; (\phi, r_i = \rho \bullet x, \psi); \Gamma]$
These are as in the active algorithm, except that the order in the range constraint vectors need not be the same as in the original rule in the grammar.
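
The side conditions of the active algorithm's Combine rule, compatibility of Γ(k) with ρ and the update Γ(k, i := ρ(i)), can be illustrated with a short sketch. The data layout is an assumption: Γ is a list with one entry per rhs predicate, each entry a list whose elements are either ranges `(l, r)` or still-open variables `(name, i)` with a 1-based index `i`; here `k` is a 0-based position in Γ. The names `compatible` and `update` are hypothetical.

```python
def compatible(gamma_k, rho, name):
    """Every entry of Gamma(k) is either the same range as in rho or still the variable."""
    return all(g == r or g == (name, i)
               for i, (g, r) in enumerate(zip(gamma_k, rho), start=1))

def update(gamma, k, i, rho):
    """Return a copy of Gamma in which component i of predicate k is set to rho(i)."""
    new = [list(entry) for entry in gamma]
    new[k][i - 1] = rho[i - 1]
    return new

# One rhs predicate B of dimension 2: its second range is already known,
# its first range is still the variable B(1).
gamma = [[("B", 1), (4, 6)]]
rho = [(0, 2), (4, 6)]
ok = compatible(gamma[0], rho, "B")
gamma_after = update(gamma, 0, 1, rho)
```

The compatibility test is what prevents two different passive items for the same predicate from being mixed within one active item.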

CYK Parsing: the incremental algorithm (3)
Predict introduces a new rule with the dot on the left of one of the arguments of the lhs:
$$\frac{}{[A \to f[\vec{B}]; (r_i = \langle k, k \rangle \bullet x, \Psi_1, \Psi_2); \vec{\Gamma}_B]} \quad A \to f[\vec{B}] := (\Psi_1, x, \Psi_2)$$
with $x$ the $i$th element, $1 \leq i \leq \dim(A)$, $1 \leq k \leq n$. ($\vec{\Gamma}_B$ contains the range variables for the vector $\vec{B}$.)
Complete moves a dot that is at the end of an argument to another argument:
$$\frac{[A \to f[\vec{B}]; (\Phi, r_i = \langle l_i, r_i \rangle \bullet, \Psi_1, r_j = x, \Psi_2); \Gamma]}{[A \to f[\vec{B}]; (\Phi, r_i = \alpha, r_j = \langle k, k \rangle \bullet x, \Psi_1, \Psi_2); \Gamma]} \quad r_i \leq k \leq n$$
where $\alpha$ is the completed range $\langle l_i, r_i \rangle$.

CYK Parsing: the incremental algorithm (4)
Scan moves the dot over a terminal:
$$\frac{[A \to f[\vec{B}]; (\Phi, r_i = \langle l, r \rangle \bullet a x, \Psi); \Gamma]}{[A \to f[\vec{B}]; (\Phi, r_i = \langle l, r + 1 \rangle \bullet x, \Psi); \Gamma]} \quad \langle r, r + 1 \rangle(w) = a$$
Combine moves the dot over a non-terminal if the corresponding passive item has been found:
$$\frac{[A \to f[\vec{B}]; (\Phi_1, r_j = \alpha \bullet B_k^{(i)} x, \Psi_1); \Gamma],\ [B_k; (\Phi_2, r_i = \beta, \Psi_2)]}{[A \to f[\vec{B}]; (\Phi_1, r_j = \alpha \beta \bullet x, \Psi_1); \Gamma(k, i := \beta)]} \quad \Gamma(k) \text{ compatible with } \Phi_2$$
("Compatible" means that for every $1 \leq h \leq \dim(B_k)$: if $r_h = \alpha_h \in \Phi_2$, then $\Gamma(k)(h) = \alpha_h$.)

CYK Parsing: Prediction strategies
Problem of the predict operations presented above: we compute partial results that are not reachable given the predicates we are looking for or the predicates we have already found.
Solution: replace the unrestricted predict rule with more intelligent predictions.
Possible strategies: $A \to f[\vec{B}]$ with the dot left of $r_i = \alpha$ is only predicted if
there is another item looking for $A^{(i)}$ (top-down prediction), or
there is a passive item that has found the first symbol in $\alpha$ (bottom-up prediction).

Conclusion
Starting point: the basic algorithm (Seki et al.).
Refinement: decompose single items and deduction steps into different items and smaller deduction steps. This gives the naïve algorithm (Burden & Ljunglöf, 2005).
Further refinement: divide the combine rule into complete, scan and combine. This gives the active algorithm (Burden & Ljunglöf, 2005).
Further refinement: predict and complete can select from any possible remaining range constraint, not just the following one. This gives the incremental algorithm (Burden & Ljunglöf, 2005).
The algorithms from Burden & Ljunglöf (2005) have been implemented in the Grammatical Framework System (Ranta, 2004).