
Parsing: Weighted Deductive Parsing
Laura Kallmeyer
Heinrich-Heine-Universität Düsseldorf
Winter 2017/18
1 / 26

Table of contents
1 Idea
2 Algorithm
3 CYK Example
4 Parsing
5 Left Corner Example
2 / 26

Idea (1)
Idea of weighted deductive parsing (Nederhof, 2003):
Give a deductive definition of the probability of a parse tree.
Use Knuth's algorithm to compute the best parse tree for category S and a given input w.
Advantages:
Yields the best parse without exhaustive parsing.
Can be used to parse any grammar formalism, as long as an appropriate weighted deductive system can be defined.
3 / 26

Idea (2)
Reminder: Parsing schemata understand parsing as a deductive process. The deduction of new items from existing ones can be described using inference rules of the general form
\[
\frac{\text{antecedent}}{\text{consequent}}\quad\text{side conditions}
\]
where antecedent and consequent are lists of items.
Application: if the antecedent can be deduced and the side conditions hold, then the consequent can be deduced as well.
4 / 26

Idea (3)
A parsing schema consists of
deduction rules;
an axiom (or axioms), which can be written as deduction rules with an empty antecedent; and
a goal item.
The parsing algorithm succeeds if, for a given input, it is possible to deduce the goal item.
5 / 26

Idea (4)
Example: deduction-based definition of bottom-up CFG parsing (CYK) for a grammar in Chomsky Normal Form. For an input w = w_1 ... w_n with |w| = n:
1 Item form: [A, i, j] with A a non-terminal, 1 ≤ i ≤ j ≤ n.
2 Deduction rules:
Scan: \[\frac{}{[A, i, i]}\quad A \to w_i \in P\]
Complete: \[\frac{[B, i, j],\ [C, j+1, k]}{[A, i, k]}\quad A \to B\,C \in P\]
3 Goal item: [S, 1, n].
6 / 26
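
To make the deductive view concrete, here is a minimal Python sketch of this system as a bottom-up closure. The encoding of rules as tuples is an assumption for illustration, and the grammar is the one used in the CYK example later in the deck.

```python
from itertools import product

# Hypothetical encoding: binary rules A -> B C as (A, B, C),
# lexical rules A -> a as (A, a); grammar from the CYK example below.
binary = {("S", "S", "S"), ("S", "S", "A")}
lexical = {("S", "a"), ("A", "a"), ("A", "b")}

def cyk_closure(w):
    n = len(w)
    # Scan: deduce [A, i, i] for every rule A -> w_i.
    items = {(A, i, i) for i in range(1, n + 1)
             for (A, a) in lexical if a == w[i - 1]}
    # Complete: deduce [A, i, k] from [B, i, j] and [C, j+1, k];
    # iterate until no new item can be deduced.
    changed = True
    while changed:
        changed = False
        for (B, i, j), (C, j2, k) in product(tuple(items), repeat=2):
            for (A, B2, C2) in binary:
                if (B2, C2) == (B, C) and j2 == j + 1 and (A, i, k) not in items:
                    items.add((A, i, k))
                    changed = True
    return items

print(("S", 1, 2) in cyk_closure("aa"))   # goal item [S, 1, n]: True
```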

Idea (5)
Extension to a weighted deduction system:
Each item has an additional weight. Intuition: weight = the cost of building an item. (Usually, the higher the cost, the lower the probability.)
The deduction rules specify how to compute the weight of the consequent item from the weights of the antecedent items.
Extending CYK with weights:
Scan: \[\frac{}{|\log(p)| : [A, i, i]}\quad p : A \to w_i\]
Complete: \[\frac{x_1 : [B, i, j],\ x_2 : [C, j+1, k]}{x_1 + x_2 + |\log(p)| : [A, i, k]}\quad p : A \to B\,C\]
(Note that $p_1 \cdot p_2 = 10^{\log_{10}(p_1)} \cdot 10^{\log_{10}(p_2)} = 10^{\log_{10}(p_1)+\log_{10}(p_2)}$, so adding log weights corresponds to multiplying probabilities.)
7 / 26
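
As a worked instance (my numbers, using the rounded weights of the example grammar later in the deck): combining .5 : [S, 1, 1] and .5 : [S, 2, 2] with the rule 0.6 : S → S S, whose weight is $|\log_{10}(0.6)| \approx .2$, yields
\[
x_1 + x_2 + |\log_{10}(p)| = .5 + .5 + .2 = 1.2 : [S, 1, 2],
\]
which corresponds, up to the rounding of the individual weights, to the parse probability $0.3 \cdot 0.3 \cdot 0.6 = 0.054$ (exactly, $|\log_{10}(0.054)| \approx 1.27$).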

Idea (6)
Reminder: |log 1| = 0, |log 0| = ∞.
[Figure: plot of |log₁₀(x)| for x ∈ (0, 1]; the weight decreases monotonically from large values near x = 0 to 0 at x = 1.]
8 / 26

Algorithm (1)
There is a linear order < defined on the weights. The lower the weight, the better the item.
For Knuth's algorithm, the weight functions f must be monotone nondecreasing in each variable and must satisfy f(x_1, ..., x_m) ≥ max(x_1, ..., x_m).
In our example, this is the case:
Complete: \[\frac{x_1 : [B, i, j],\ x_2 : [C, j+1, k]}{x_1 + x_2 + |\log(p)| : [A, i, k]}\quad p : A \to B\,C\]
Here f(x_1, x_2) = x_1 + x_2 + c with a constant c ≥ 0; since all weights are non-negative, f(x_1, x_2) ≥ max(x_1, x_2).
9 / 26

Algorithm (2)
The algorithm for computing the goal item with the lowest weight goes back to Knuth.
Goal: find the possible items, each with its lowest possible weight.
We need two sets:
A set C (the chart) that contains items that have reached their final weight.
A set A (the agenda) that contains items that are waiting to be processed as possible antecedents in further rule applications and that have not necessarily reached their final weight.
Initially, C = ∅ and A contains all items that can be deduced from an empty antecedent set.
10 / 26

Algorithm (3) while A do remove the best item x : I from A and add it to C if I goal item then stop and output true else for all y : I deduced from x : I and items in C: if there is no z with z : I C or z : I A then add y : I to A else if z : I A for some z then update weight of I in A to min(y, z) 11 / 26

Algorithm (4)
If the weight functions are as required, then the following is guaranteed: whenever an item is the best item in the agenda, it has reached its lowest weight.
Therefore, if this item is a goal item, then we have found the best parse tree for the input. If it is not a goal item, it can be stored in the chart.
⇒ no exhaustive parsing needed.
However: A needs to be treated as a priority queue, which can be expensive.
12 / 26

CYK Example
.6 : S → S S (.2)
.1 : S → S A (1)
.3 : S → a (.5)
.4 : A → a (.4)
.6 : A → b (.2)
(the number in brackets is the absolute value of the log₁₀ of the rule probability)
Input: aa

Chart: ∅
Agenda: .4 : [A, 1, 1], .4 : [A, 2, 2], .5 : [S, 1, 1], .5 : [S, 2, 2]

Chart: .4 : [A, 1, 1]
Agenda: .4 : [A, 2, 2], .5 : [S, 1, 1], .5 : [S, 2, 2]

Chart: .4 : [A, 1, 1], .4 : [A, 2, 2]
Agenda: .5 : [S, 1, 1], .5 : [S, 2, 2]

Chart: .4 : [A, 1, 1], .4 : [A, 2, 2], .5 : [S, 1, 1]
Agenda: .5 : [S, 2, 2], 1.9 : [S, 1, 2]

Chart: .4 : [A, 1, 1], .4 : [A, 2, 2], .5 : [S, 1, 1], .5 : [S, 2, 2]
Agenda: 1.2 : [S, 1, 2]
13 / 26
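
Running the sketch from the Algorithm section on this grammar reproduces the trace (same hypothetical encoding as before): [S, 1, 2] first enters the agenda with weight 1.9 via S → S A and is then updated to 1.2 via S → S S.

```python
# Grammar from the example, with rounded |log10| weights.
lexical = {("S", "a", 0.5), ("A", "a", 0.4), ("A", "b", 0.2)}
binary = {("S", "S", "S", 0.2), ("S", "S", "A", 1.0)}

print(knuth_parse("aa", lexical, binary, goal=("S", 1, 2)))   # 1.2
```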

Parsing
Extension to parsing:
Whenever we generate a new item, we store it not only with its weight but also with backpointers to its antecedent items.
Whenever we update the weight of an item, we also have to update its backpointers.
In order to read off the best parse tree, we start from the best goal item and follow the backpointers.
14 / 26
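
One possible way to add this to the Python sketch above (again a hypothetical encoding, not spelled out on the slide):

```python
back = {}   # item -> tuple of antecedent items of its cheapest known derivation

def record(item, antecedents):
    # Call this whenever push() lowers best[item]: remember how item was built,
    # overwriting the backpointers of the now-suboptimal derivation.
    back[item] = tuple(antecedents)

def best_tree(item):
    # Read off the best parse tree: start from the best goal item and
    # recursively follow the backpointers (scanned items have none).
    return (item, [best_tree(a) for a in back.get(item, ())])
```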

Left Corner Example
Deduction rules with weights (goal item [S, 0, n]):
Scan: \[\frac{}{0 : [a, i, i+1]}\quad w_{i+1} = a\]
Left corner predict: \[\frac{x : [A, i, j]}{x : [B \to A \bullet \alpha, i, j]}\quad B \to A\alpha \in P\]
Complete: \[\frac{x_1 : [A \to \alpha \bullet B\beta, i, j],\ x_2 : [B, j, k]}{x_1 + x_2 : [A \to \alpha B \bullet \beta, i, k]}\]
Convert: \[\frac{x : [B \to \gamma \bullet, j, k]}{x + |\log(p)| : [B, j, k]}\quad p : B \to \gamma\]
15 / 26
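
A hedged Python sketch of these four rules as item constructors; the item encoding is my assumption, and note that in the trace below, predict is also applied with a terminal as the left corner.

```python
from math import log10

# Passive items: (X, i, j) with X a terminal or a category;
# dotted items: (A, done, todo, i, j) for A -> done . todo.

def scan(w, i):
    # Scan: 0 : [w_{i+1}, i, i+1]
    return (0.0, (w[i], i, i + 1))

def predict(x, passive, rule):
    # Left corner predict: from x : [X, i, j] and a rule B -> X alpha,
    # deduce x : [B -> X . alpha, i, j].
    X, i, j = passive
    B, rhs = rule                       # rhs is a tuple starting with X
    assert rhs[0] == X
    return (x, (B, rhs[:1], rhs[1:], i, j))

def complete(x1, dotted, x2, passive):
    # Complete: from x1 : [A -> alpha . B beta, i, j] and x2 : [B, j, k],
    # deduce x1 + x2 : [A -> alpha B . beta, i, k].
    A, done, todo, i, j = dotted
    B, j2, k = passive
    assert todo[0] == B and j2 == j
    return (x1 + x2, (A, done + (B,), todo[1:], i, k))

def convert(x, dotted, p):
    # Convert: from x : [B -> gamma ., j, k] and rule probability p,
    # deduce x + |log10(p)| : [B, j, k].
    B, done, todo, j, k = dotted
    assert todo == ()
    return (x + abs(log10(p)), (B, j, k))
```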

Left Corner Example
.6 : S → S S (.2)
.1 : S → S A (1)
.3 : S → a (.5)
.4 : A → a (.4)
.6 : A → b (.2)
Input: aa
1. A = {0 : [a, 0, 1], 0 : [a, 1, 2]}
Chart: ∅
16 / 26

Left Corner Example
2. A = {0 : [a, 1, 2], 0 : [S → a •, 0, 1], 0 : [A → a •, 0, 1]}
Chart:
(0, 1): 0 : a
3. A = {0 : [S → a •, 0, 1], 0 : [S → a •, 1, 2], 0 : [A → a •, 0, 1], 0 : [A → a •, 1, 2]}
Chart:
(0, 1): 0 : a
(1, 2): 0 : a
17 / 26

Left Corner Example
4. A = {0 : [A → a •, 0, 1], 0 : [A → a •, 1, 2], .5 : [S, 0, 1], .5 : [S, 1, 2]}
Chart (after two convert operations):
(0, 1): 0 : a, 0 : S → a •
(1, 2): 0 : a, 0 : S → a •
5. A = {.4 : [A, 0, 1], .4 : [A, 1, 2], .5 : [S, 0, 1], .5 : [S, 1, 2]}
Chart (after two convert operations):
(0, 1): 0 : a, 0 : S → a •, 0 : A → a •
(1, 2): 0 : a, 0 : S → a •, 0 : A → a •
18 / 26

Left Corner Example
6. A = {.5 : [S, 0, 1], .5 : [S, 1, 2]}
Chart (two steps):
(0, 1): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A
(1, 2): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A
19 / 26

Left Corner Example
7. A = {.5 : [S → S • A, 0, 1], .5 : [S → S • S, 0, 1], .5 : [S → S • A, 1, 2], .5 : [S → S • S, 1, 2]}
Chart:
(0, 1): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S
(1, 2): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S
20 / 26

Left Corner Example
8. A = {.5 : [S → S • S, 0, 1], .5 : [S → S • A, 1, 2], .5 : [S → S • S, 1, 2], .9 : [S → S A •, 0, 2]}
Chart:
(0, 1): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S, .5 : S → S • A
(1, 2): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S
21 / 26

Left Corner Example
9. A = {.9 : [S → S A •, 0, 2], 1 : [S → S S •, 0, 2]}
Chart:
(0, 1): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S, .5 : S → S • A, .5 : S → S • S
(1, 2): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S, .5 : S → S • A, .5 : S → S • S
22 / 26

Left Corner Example
10. A = {1 : [S → S S •, 0, 2], 1.9 : [S, 0, 2]}
Chart:
(0, 2): .9 : S → S A •
(0, 1): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S, .5 : S → S • A, .5 : S → S • S
(1, 2): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S, .5 : S → S • A, .5 : S → S • S
23 / 26
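
The two competing derivations of [S, 0, 2], with the weight arithmetic spelled out (numbers from the trace):
\[
\begin{aligned}
\text{via } .1 : S \to S\,A:&\quad .5 + .4 + 1 = 1.9\\
\text{via } .6 : S \to S\,S:&\quad .5 + .5 + .2 = 1.2
\end{aligned}
\]
so in the next step, converting [S → S S •, 0, 2] lowers the agenda weight of [S, 0, 2] from 1.9 to 1.2.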

Left Corner Example
11. A = {1.2 : [S, 0, 2]}
Chart:
(0, 2): .9 : S → S A •, 1 : S → S S •
(0, 1): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S, .5 : S → S • A, .5 : S → S • S
(1, 2): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S, .5 : S → S • A, .5 : S → S • S
24 / 26

Left Corner Example
12. A = ∅
Chart:
(0, 2): .9 : S → S A •, 1 : S → S S •, 1.2 : S
(0, 1): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S, .5 : S → S • A, .5 : S → S • S
(1, 2): 0 : a, 0 : S → a •, 0 : A → a •, .4 : A, .5 : S, .5 : S → S • A, .5 : S → S • S
25 / 26

Nederhof, Mark-Jan. 2003. Weighted Deductive Parsing and Knuth's Algorithm. Computational Linguistics 29(1), 135–143.
26 / 26