Parsing. Unger s Parser. Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is

Similar documents
Parsing. Unger s Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21

Parsing. Left-Corner Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 17

Parsing. Probabilistic CFG (PCFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 22

Einführung in die Computerlinguistik

CMPT-825 Natural Language Processing. Why are parsing algorithms important?

Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

Einführung in die Computerlinguistik Kontextfreie Grammatiken - Formale Eigenschaften

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering

Einführung in die Computerlinguistik

Parsing. Weighted Deductive Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

Syntactical analysis. Syntactical analysis. Syntactical analysis. Syntactical analysis

CKY & Earley Parsing. Ling 571 Deep Processing Techniques for NLP January 13, 2016

Parsing Beyond Context-Free Grammars: Tree Adjoining Grammars

Context Free Grammars: Introduction. Context Free Grammars: Simplifying CFGs

Even More on Dynamic Programming

Remembering subresults (Part I): Well-formed substring tables

Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition

Parsing with CFGs L445 / L545 / B659. Dept. of Linguistics, Indiana University Spring Parsing with CFGs. Direction of processing

Parsing with CFGs. Direction of processing. Top-down. Bottom-up. Left-corner parsing. Chart parsing CYK. Earley 1 / 46.

Introduction to Computational Linguistics

Probabilistic Context-Free Grammars. Michael Collins, Columbia University

Computational Models - Lecture 4 1

CPS 220 Theory of Computation

CISC4090: Theory of Computation

Context Free Grammars: Introduction

Parsing beyond context-free grammar: Parsing Multiple Context-Free Grammars

Tree Adjoining Grammars

Space Complexity. The space complexity of a program is how much memory it uses.

CS 314 Principles of Programming Languages

Computational Linguistics II: Parsing

Computational Models - Lecture 4 1

Context- Free Parsing with CKY. October 16, 2014

Mildly Context-Sensitive Grammar Formalisms: Thread Automata

CS626: NLP, Speech and the Web. Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 14: Parsing Algorithms 30 th August, 2012

Handout 8: Computation & Hierarchical parsing II. Compute initial state set S 0 Compute initial state set S 0

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example

Recursive descent for grammars with contexts

Solutions to Problem Set 3

Parsing. A Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 27

Everything You Always Wanted to Know About Parsing

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)

Context-free Grammars and Languages

Parsing with Context-Free Grammars

Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules).

A Polynomial Time Algorithm for Parsing with the Bounded Order Lambek Calculus

Context-Free Grammars and Languages. Reading: Chapter 5

Parsing Regular Expressions and Regular Grammars

Probabilistic Context-free Grammars

MA/CSSE 474 Theory of Computation

A* Search. 1 Dijkstra Shortest Path

Natural Language Processing. Lecture 13: More on CFG Parsing

Turing Machines and Time Complexity

Chapter 14 (Partially) Unsupervised Parsing

The Pumping Lemma for Context Free Grammars

Computational Models - Lecture 5 1

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

CYK Algorithm for Parsing General Context-Free Grammars

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1

In this chapter, we explore the parsing problem, which encompasses several questions, including:

Decoding and Inference with Syntactic Translation Models

Suppose h maps number and variables to ɛ, and opening parenthesis to 0 and closing parenthesis

Pushdown Automata (Pre Lecture)

RNA Secondary Structure Prediction

Top-Down Parsing and Intro to Bottom-Up Parsing

Statistical Machine Translation

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

COMP-330 Theory of Computation. Fall Prof. Claude Crépeau. Lec. 10 : Context-Free Grammars

CS415 Compilers Syntax Analysis Bottom-up Parsing

Pushdown Automata: Introduction (2)

Attendee information. Seven Lectures on Statistical Parsing. Phrase structure grammars = context-free grammars. Assessment.

Foundations of Informatics: a Bridging Course

CSE302: Compiler Design

Accept or reject. Stack

CMSC 330: Organization of Programming Languages. Pushdown Automata Parsing

Probabilistic Context Free Grammars. Many slides from Michael Collins

CpSc 421 Final Exam December 6, 2008

Creating a Recursive Descent Parse Table

Context Free Grammars

SOLUTION: SOLUTION: SOLUTION:

Easy Shortcut Definitions

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

CFLs and Regular Languages. CFLs and Regular Languages. CFLs and Regular Languages. Will show that all Regular Languages are CFLs. Union.

Mildly Context-Sensitive Grammar Formalisms: Embedded Push-Down Automata

Intro to Theory of Computation

Probabilistic Context Free Grammars

Context-Free Grammar

CS 406: Bottom-Up Parsing

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013

Computational Models - Lecture 5 1

Probabilistic Context Free Grammars. Many slides from Michael Collins and Chris Manning

Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write:

CFGs and PDAs are Equivalent. We provide algorithms to convert a CFG to a PDA and vice versa.

Pushdown Automata (2015/11/23)

SYLLABUS. Introduction to Finite Automata, Central Concepts of Automata Theory. CHAPTER - 3 : REGULAR EXPRESSIONS AND LANGUAGES

Computability Theory

Lecture 11 Context-Free Languages

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

CSCI Compiler Construction

Parsing Algorithms. CS 4447/CS Stephen Watt University of Western Ontario

Transcription:

Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is Unger s Parser Laura Heinrich-Heine-Universität Düsseldorf Wintersemester 2012/2013 a top-down parser: we start with S and subsequently replace lefthand sides of productions with righthand sides. a non-directional parser: the expanding of non-terminals (with appropriate righthand sides) is not ordered; therefore we need to guess the yields of all non-terminals in a right-hand side at once. Unger s parser 1 24 October 2012 Unger s parser 3 24 October 2012 Introduction (2) S NP VP, VP VP PP, VP V NP, NP Mary, Overview 1. Introduction 2. The parser 3. An example 4. Optimizations 5. Conclusion 1. S Mary sees the man with the telescope 2. NP Mary S NP VP from 1. 3. VP sees the man with the telescope 4. NP Mary sees S NP VP from 1. 5. VP the man with the telescope 6. Mary Mary NP Mary from 2. 7. VP sees VP VP PP from 3. 8. PP the man with the telescope Unger s parser 2 24 October 2012 Unger s parser 4 24 October 2012

Introduction (3) The parser takes a X N T and a substring w of the input. Initially, this is S and the entire input. If X and the remaining substring are equal, we can stop (success for X and w). Otherwise, X must be a non-terminal that can be further expanded. We then choose a X-production and partition w into further substrings that are paired with the righthand side elements of the production. The parser (2) Extension to deal with ǫ-productions and loops: Add a list of preceding calls; pass this list when calling the parser again; if the new call is already on the list, stop and return false. Initial call: unger(w, S, ) The parser continues recursively. Unger s parser 5 24 October 2012 Unger s parser 7 24 October 2012 The parser (1) The parser (3) Assume CFG without ǫ-productions and without loops A + A. function unger(w,x,l): function unger(w,x): out := false; out := false; if X,w L, return out; if w = X, then out := true else if w = X or (w = ǫ and X ǫ P) else for all X X 1 X k : then out := true for all x 1,,x k T + with w = x 1 x k : else for all X X 1 X k P: return out if k i=1 unger(x i,x i ) then out := true; for all x 1,,x k T with w = x 1 x k : if k i=1 unger(x i,x i, L { X,w }) then out := true; unger(w,x) iff X w (for X N T,w T ) return out Unger s parser 6 24 October 2012 Unger s parser 8 24 October 2012

The parser (4) So far, we have a recognizer, not a parser. To turn this into a parser, every call unger(..) must return a (set of) parse trees. This can be obtained from the succssful productions X X 1 X k, and the parse trees returned by the calls unger(x i,x i ). Note however that there might be a large amount of parse trees since in each call, there might be more than one successful production. We will come back to the compact presentation of several analyses in a parse forest. An example (2) Partitions according to Unger s parser: S NP VP Mr. Sarkozy s pension protections Mr. Sarkozy s pension protections Mr. Sarkozy s pension labor protections w = 34, consequently we have 33 different partitions. Unger s parser 9 24 October 2012 Unger s parser 11 24 October 2012 An example (1) Assume a CFG without ǫ. Production S NP VP. Input sentence w: Mr. Sarkozy s pension reform, which only affects about 500,000 public sector employees, is the opening salvo in a series of measures aimed more broadly at rolling back France s system of labor protections. (New York Times) An example (3) Take the following partition: NP S Mr. Sarkozy s pension reform, which employees, is protections For NP NP S, there are 12 partitions of the NP part. VP In the worst case, parsing is exponential in the length n of the input string. Unger s parser 10 24 October 2012 Unger s parser 12 24 October 2012

An example (4) Optimizations (2) We say that an algorithm is of Tabulation: avoid computing several times the same thing: polynomial time complexity if there is a constant c and a k such that the parsing of a string of length n takes an amount of time cn k. Notation: O(n k ). exponential time complexity if there is a constant c and a k such that the parsing of a string of length n takes an amount of time ck n. Notation: O(k n ). whenever unger(x,w) yields a result res, we store X,w,res in our table of partial parsing results; in every call unger(x,w), we first check whether we have already computed a result X,w,res and if so, we stop immediately and return res. Unger s parser 13 24 October 2012 Unger s parser 15 24 October 2012 Optimizations (1) Conditions on the partitions: check on occurrences of terminals in rhs; check on minimal length of terminal string derived by a non-terminal; check on obligatory terminals (pre-terminals) in strings derived by non-terminals; (e.g., each NP contains a N, each VP a V, ) check on the first terminals derivable from a non-terminal Optimizations (3) Results can be stored in a three-dimensional table (chart) C: If k = N +T and non-terminals and terminals X have a unique index k and w = n with w = w 1 w n, then use a k n n table, the chart. Whenever unger(x,w i w j ) yields a result res and m index of X, then C(m,i,j) = res. In every call unger(x,w i w j ), we first check whether we have already a value in C(m,i,j) and if so, we stop and return C(m,i,j). Advantage: access of C(m, i, j) in constant time. (Assumption: grammar ε-free, otherwise we need a k (n+1) (n+1) chart.) Unger s parser 14 24 October 2012 Unger s parser 16 24 October 2012

Optimizations (4) In addition, we can tabulate entire productions with the spans of their different symbols. This gives us a compact presentation of the parse forest. In every call unger(x,w i w j ), we first check whether we have already a value in C(m,i,j) and if so, we stop and return C(m,i,j). Otherwise, we compute all possible first steps of derivations X w: for every production X X 1 X k and all w 1,,w k such that the recursive Unger calls yield true, we add X,w X 1,w 1 X k,w k with the indices of the spans to the list of productions. If at least one such production has been found, we return true, otherwise false. Example: see handout. Unger s parser 17 24 October 2012 References [Grune and Jacobs, 2008] Grune, D. and Jacobs, C. (2008). Techniques. A Practical Guide. Monographs in Computer Science. Springer. Second Edition. Unger s parser 19 24 October 2012 Conclusion Unger s parser is a non-directional top-down parser; highly non-deterministic because during parsing, the yields of all non-terminals in righthand sides must be guessed; in general of exponential complexity; polynomial if tabulation is applied. Unger s parser 18 24 October 2012