Introduction to Natural Language Processing

Similar documents
CKY & Earley Parsing. Ling 571 Deep Processing Techniques for NLP January 13, 2016

Parsing. Probabilistic CFG (PCFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 22

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering

Natural Language Processing. Lecture 13: More on CFG Parsing

CMPT-825 Natural Language Processing. Why are parsing algorithms important?

A* Search. 1 Dijkstra Shortest Path

Remembering subresults (Part I): Well-formed substring tables

Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG

Parsing with Context-Free Grammars

CS20a: summary (Oct 24, 2002)

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Parsing. Based on presentations from Chris Manning's course on Statistical Parsing (Stanford)

Parsing. Left-Corner Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 17

Decoding and Inference with Syntactic Translation Models

Computational Linguistics

Parsing beyond context-free grammar: Parsing Multiple Context-Free Grammars

LECTURER: BURCU CAN Spring

An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities

Parsing with CFGs L445 / L545 / B659. Dept. of Linguistics, Indiana University Spring Parsing with CFGs. Direction of processing

Parsing with CFGs. Direction of processing. Top-down. Bottom-up. Left-corner parsing. Chart parsing CYK. Earley 1 / 46.

Parsing. Unger's Parser. Introduction (1) Unger's parser [Grune and Jacobs, 2008] is a CFG parser that is

Statistical Methods for NLP

Computational Linguistics II: Parsing

Advanced Natural Language Processing Syntactic Parsing

Sharpening the empirical claims of generative syntax through formalization

Parsing. Unger's Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21

Handout 8: Computation & Hierarchical parsing II. Compute initial state set S 0 Compute initial state set S 0

Efficient Divide-and-Conquer Parsing of Practical Context-Free Languages

Statistical Machine Translation

To make a grammar probabilistic, we need to assign a probability to each context-free rewrite

Parsing Beyond Context-Free Grammars: Tree Adjoining Grammars

Everything You Always Wanted to Know About Parsing

Parser Combinators, (Simply) Indexed Grammars, Natural Language Parsing

Complexity, Parsing, and Factorization of Tree-Local Multi-Component Tree-Adjoining Grammar

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

Context- Free Parsing with CKY. October 16, 2014

Probabilistic Context-Free Grammars and beyond

A proof theoretical account of polarity items and monotonic inference.

Categorial Grammar. Larry Moss NASSLLI. Indiana University

Decision problem of substrings in Context Free Languages.

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Natural Language Processing : Probabilistic Context Free Grammars. Updated 5/09

cmp-lg/ Nov 1994

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Parsing. Weighted Deductive Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

LING 473: Day 10. START THE RECORDING Coding for Probability Hidden Markov Models Formal Grammars

Probabilistic Context-free Grammars

Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars

Maschinelle Sprachverarbeitung

Lab 12: Structured Prediction

Aspects of Tree-Based Statistical Machine Translation

Context-Free Languages (Pre Lecture)

Maschinelle Sprachverarbeitung

Formal Languages, Grammars and Automata Lecture 5

Natural Language Processing 1. lecture 7: constituent parsing. Ivan Titov. Institute for Logic, Language and Computation

Top-Down Parsing, Part II

INF4820: Algorithms for Artificial Intelligence and Natural Language Processing. Hidden Markov Models

CS 662 Sample Midterm

Computational Linguistics

http://lara.epfl.ch Compiler Construction 2011 CYK Algorithm and Chomsky Normal Form

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science

National Centre for Language Technology School of Computing Dublin City University

Tree Adjoining Grammars

Computational Linguistics. Acknowledgements. Phrase-Structure Trees. Dependency-based Parsing

A parsing technique for TRG languages

CS 406: Bottom-Up Parsing

CSCI Compiler Construction

Lecture 5: UDOP, Dependency Grammars

Parsing Linear Context-Free Rewriting Systems

Recurrent neural network grammars

Context-Free Grammars and Languages. Reading: Chapter 5

Bottom-up Analysis. Theorem: Proof: Let a grammar G be reduced and left-recursive, then G is not LL(k) for any k.

A DOP Model for LFG. Rens Bod and Ronald Kaplan. Kathrin Spreyer Data-Oriented Parsing, 14 June 2005

Parsing with Context-Free Grammars

Computational Models - Lecture 4 1

An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities

In this chapter, we explore the parsing problem, which encompasses several questions, including:

Suppose h maps number and variables to ɛ, and opening parenthesis to 0 and closing parenthesis

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1

Semantics and Generative Grammar. Expanding Our Formalism, Part 1 1

Introduction to Kleene Algebras

Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition

Even More on Dynamic Programming

Chapter 14 (Partially) Unsupervised Parsing

Introduction to Computational Linguistics

Plan for 2nd half. Just when you thought it was safe. Just when you thought it was safe. Theory Hall of Fame. Chomsky Normal Form

Chapter 4: Bottom-up Analysis 106 / 338

Surface Reasoning Lecture 2: Logic and Grammar

Unification Grammars and Off-Line Parsability. Efrat Jaeger

CSCI 5582 Artificial Intelligence. Today 9/28. Knowledge Representation. Lecture 9

An Efficient Context-Free Parsing Algorithm. Speakers: Morad Ankri Yaniv Elia

Talen en Compilers. Johan Jeuring , period 2. October 29, Department of Information and Computing Sciences Utrecht University

INF4820: Algorithms for Artificial Intelligence and Natural Language Processing. Hidden Markov Models

CS 314 Principles of Programming Languages

Syntax-Based Decoding

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών. Lecture 7a Syntax Analysis Elias Athanasopoulos

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 3. Y.N. Srikant

A Polynomial Time Algorithm for Parsing with the Bounded Order Lambek Calculus

Bringing machine learning & compositional semantics together: central concepts

Attendee information. Seven Lectures on Statistical Parsing. Phrase structure grammars = context-free grammars. Assessment.


Introduction to Natural Language Processing
PARSING: Earley, Bottom-Up Chart Parsing
Jean-Cédric Chappelier (Jean-Cedric.Chappelier@epfl.ch)
Artificial Intelligence Laboratory

Objectives of this lecture
After the CYK algorithm, present two other algorithms used for syntactic parsing.

Earley Parsing
Top-down (predictive) algorithm. (Bottom-up = inference; top-down = search.)
3 advantages:
- best known worst-case complexity (same as CYK)
- adaptive complexity for less complex languages (e.g. regular languages)
- no need for a special form of the CF grammar
2 drawbacks:
- no way to correct/reconstruct non-parsable sentences ("early error detection" only)
- not very intuitive

Earley Parsing (2)
Idea: on-line (i.e. during parsing) binarization of the grammar, via dotted rules and Earley items.
Dotted rules: X → X_1 ... X_k • X_{k+1} ... X_m, with X → X_1 ... X_m a rule of the grammar.
Earley item: one dotted rule with one integer i (0 ≤ i ≤ n, n: size of the input string); the part before the dot (•) represents the subpart of the rule that derives a substring of the input string starting at position i+1.
Example: (VP → V • NP, 2) is an Earley item for the input string "the cat ate a mouse" (word positions 1 2 3 4 5).
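To make the notation concrete, here is one minimal way to represent such items in Python (our illustration, not part of the lecture; the names Item, lhs, rhs, dot and start are our own):

```python
from collections import namedtuple

# An Earley item (X -> alpha . beta, i): a grammar rule, a dot position
# inside its right-hand side, and the start index i of the derived substring.
Item = namedtuple("Item", ["lhs", "rhs", "dot", "start"])

# (VP -> V . NP, 2) for the input "the cat ate a mouse"
item = Item(lhs="VP", rhs=("V", "NP"), dot=1, start=2)
```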

Earley Parsing (3)
Principle: starting from all possible (S → • X_1 ... X_m, 0), parallel construction of all the dotted rules deriving (larger and larger) substrings of the input string, up to the point where the whole input sentence is derived.
Construction of sets of items E_j such that:
(X → α • β, i) ∈ E_j ⇔ ∃ γ, δ : S ⇒* γ X δ and γ ⇒* w_1 ... w_i and α ⇒* w_{i+1} ... w_j
Example: in the former example, (VP → V • NP, 2) ∈ E_3.
The input string (length n) is syntactically correct (accepted) iff at least one (S → X_1 ... X_m •, 0) is in E_n.

Earley Parsing (4)
➊ Initialization: construction of E_0
1. For each rule S → X_1 ... X_m in the grammar: add (S → • X_1 ... X_m, 0) to E_0
2. For each (X → • Y β, 0) in E_0 and every rule Y → γ: add (Y → • γ, 0) to E_0
3. Iterate (2) until convergence of E_0

Earley Parsing: Interpretation
➋ Iterations: building the derivations of w_1 ... w_j (E_j)
1. Linking with words: introduce word w_j whenever a derivation of w_1 ... w_{j-1} can "eat" w_j (i.e. there is a • before w_j)
2. Stepping in the derivation: whenever non-terminal X can derive a subsequence starting at w_{i+1}, and there exists one subderivation ending at w_i which can "eat" X, do it!
3. Prediction (of useful items): if at some place Y could be eaten by some rule, then introduce all the rules that might (later on) produce Y

Earley Parsing (next)
➋ Iterations: construction of the E_j sets (1 ≤ j ≤ n)
1. For all (X → α • w_j β, i) in E_{j-1}: add (X → α w_j • β, i) to E_j
2. For all (X → γ •, i) in E_j, for all (Y → α • X β, k) in E_i: add (Y → α X • β, k) to E_j
3. For all (Y → α • X β, i) in E_j and for each rule X → γ: add (X → • γ, j) to E_j
4. Go back to (2) while E_j keeps changing
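These four steps fit in a short recognizer. Below is a minimal sketch (our code, not the lecture's), assuming the grammar is given as a list of (lhs, rhs) pairs whose terminals are the input words themselves:

```python
from collections import namedtuple

Item = namedtuple("Item", ["lhs", "rhs", "dot", "start"])  # (X -> alpha . beta, i)

def earley_recognize(words, rules, start="S"):
    """Earley recognizer following steps 1-4 above."""
    n = len(words)
    nonterminals = {lhs for lhs, _ in rules}
    E = [set() for _ in range(n + 1)]

    def close(j):                      # steps 2-3, iterated while E_j keeps changing
        changed = True
        while changed:
            changed = False
            for item in list(E[j]):
                if item.dot == len(item.rhs):             # 2. complete (X -> gamma ., i)
                    for it in list(E[item.start]):
                        if it.dot < len(it.rhs) and it.rhs[it.dot] == item.lhs:
                            new = Item(it.lhs, it.rhs, it.dot + 1, it.start)
                            if new not in E[j]:
                                E[j].add(new); changed = True
                elif item.rhs[item.dot] in nonterminals:  # 3. predict (X -> . gamma, j)
                    for lhs, rhs in rules:
                        if lhs == item.rhs[item.dot]:
                            new = Item(lhs, rhs, 0, j)
                            if new not in E[j]:
                                E[j].add(new); changed = True

    for lhs, rhs in rules:             # initialization of E_0 (previous slide)
        if lhs == start:
            E[0].add(Item(start, rhs, 0, 0))
    close(0)

    for j in range(1, n + 1):          # 1. scan w_j from E_{j-1} into E_j
        for item in E[j - 1]:
            if item.dot < len(item.rhs) and item.rhs[item.dot] == words[j - 1]:
                E[j].add(Item(item.lhs, item.rhs, item.dot + 1, item.start))
        close(j)

    # accepted iff some (S -> X_1 ... X_m ., 0) is in E_n
    return any(it.lhs == start and it.dot == len(it.rhs) and it.start == 0
               for it in E[n])
```

(A sketch for grammars without ε-rules; ε-rules need extra care in the completion step.)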

Earley Parsing: Full Example
Example for "I think", and the grammar:
S → NP VP
NP → Pron
NP → Det N
VP → V
VP → V S
VP → V NP
Pron → I
V → think

E_0: (S → • NP VP, 0), (NP → • Pron, 0), (NP → • Det N, 0), (Pron → • I, 0)
E_1: (Pron → I •, 0), (NP → Pron •, 0), (S → NP • VP, 0), (VP → • V, 1), (VP → • V S, 1), (VP → • V NP, 1), (V → • think, 1)
E_2: (V → think •, 1), (VP → V •, 1), (VP → V • S, 1), (VP → V • NP, 1), (S → NP VP •, 0), (S → • NP VP, 2), (NP → • Pron, 2), (NP → • Det N, 2), (Pron → • I, 2)
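The sketch above reproduces exactly these sets; for instance:

```python
rules = [("S", ("NP", "VP")), ("NP", ("Pron",)), ("NP", ("Det", "N")),
         ("VP", ("V",)), ("VP", ("V", "S")), ("VP", ("V", "NP")),
         ("Pron", ("I",)), ("V", ("think",))]

print(earley_recognize(["I", "think"], rules))  # True: (S -> NP VP ., 0) is in E_2
```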

Link between CYK and Earley
(X → α • β, i) ∈ E_j ⇔ (X → α • β) ∈ cell (j−i, i+1)
[Chart figure: the CYK-style chart for "I think", each cell (span length, start position) holding the dotted rules of the corresponding Earley items from E_0, E_1 and E_2 above.]
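With the Item representation from the sketches above, this correspondence is just an index change (again our illustration):

```python
from collections import defaultdict

def to_cyk_chart(E):
    """Map the Earley sets E_0..E_n (a list of sets of Item) to
    CYK-style cells indexed by (span length, start position)."""
    cells = defaultdict(set)
    for j, items in enumerate(E):
        for it in items:
            # (X -> alpha . beta, i) in E_j  <=>  cell (j - i, i + 1)
            cells[j - it.start, it.start + 1].add((it.lhs, it.rhs, it.dot))
    return cells
```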

Bottom-up Chart Parsing
Idea: keep the best of both CYK and Earley: on-line binarization à la Earley (and even better), within a bottom-up, CYK-like algorithm.
Mainly:
- no need for indices in items: the cell position is enough
- factorize (with respect to α) all the X → α • β into a single item "α..."; this is possible when processing bottom-up
- replace all the X → α • simply by X (suppression of X → α •); this is possible when processing bottom-up (and without lookahead)

Bottom-up Chart Parsing: Example
[Chart figure for "The crocodile ate the cat": the lexical row holds Det, N, V, Det, N; partial items such as Det..., V..., NP... and completed constituents NP, VP and S fill the upper cells.]

Bottom-up Chart Parsing (3)
More formally, a CYK-like algorithm in which, if cell contents are denoted by [α..., i, j] and [X, i, j] respectively, then initialization is:
w_{ij} ⊢ [X, i, j] for X → w_{ij} ∈ R
and the completion phase becomes (association of two cells):
[α..., i, j], [X, k, j+i] ⊢ [α X..., i+k, j] if Y → α X β ∈ R
[α..., i, j], [X, k, j+i] ⊢ [Y, i+k, j] if Y → α X ∈ R
("self-filling"):
[X, i, j] ⊢ [X..., i, j] if Y → X β ∈ R
[X, i, j] ⊢ [Y, i, j] if Y → X ∈ R
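A compact sketch of this algorithm (our code, under the assumptions that cells are indexed by (span length, start position), factorized items α... are stored as tuples of symbols, and completed constituents as plain strings):

```python
from collections import defaultdict

def parse_bottom_up(words, rules, lexicon, start="S"):
    """Bottom-up chart recognizer with factorized items.
    rules: (lhs, rhs) pairs for non-lexical rules; lexicon: word -> preterminals."""
    prefixes = {rhs[:k] for _, rhs in rules for k in range(1, len(rhs))}
    by_rhs = defaultdict(set)                    # full right-hand side -> {lhs}
    for lhs, rhs in rules:
        by_rhs[rhs].add(lhs)

    chart = defaultdict(set)                     # (length, start) -> items

    def add(i, j, item):                         # insertion with "self-filling"
        if item in chart[i, j]:
            return
        chart[i, j].add(item)
        if isinstance(item, str):                # a completed non-terminal X
            if (item,) in prefixes:              # [X, i, j] |- [X..., i, j]
                add(i, j, (item,))
            for lhs in by_rhs.get((item,), ()):  # [X, i, j] |- [Y, i, j] if Y -> X
                add(i, j, lhs)

    for j, w in enumerate(words, start=1):       # initialization
        for x in lexicon.get(w, ()):
            add(1, j, x)

    n = len(words)
    for length in range(2, n + 1):               # completion, smaller spans first
        for j in range(1, n - length + 2):
            for i in range(1, length):
                k = length - i
                for a in [a for a in chart[i, j] if isinstance(a, tuple)]:
                    for x in [x for x in chart[k, j + i] if isinstance(x, str)]:
                        if a + (x,) in prefixes:             # [alpha X..., i+k, j]
                            add(length, j, a + (x,))
                        for lhs in by_rhs.get(a + (x,), ()): # [Y, i+k, j]
                            add(length, j, lhs)

    return start in chart[n, 1]
```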

Bottom-up Chart Parsing: Example
Initialization: the lexical row Det, N, V, Det, N under "The dog hate the cat".
Completion: [chart figures showing the cells being filled step by step with partial items (Det..., V...) and completed constituents (NP, VP).]

[Final chart figure for "The crocodile ate the cat": NP over "the crocodile" and "the cat", VP over "ate the cat", partial items NP..., Det..., V... in intermediate cells, and S spanning the whole sentence.]
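The recognizer sketched above reproduces this chart with a plausible grammar for the example (our choice of rules and lexicon):

```python
rules = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("VP", ("V", "NP"))]
lexicon = {"The": {"Det"}, "the": {"Det"}, "crocodile": {"N"},
           "ate": {"V"}, "cat": {"N"}}

print(parse_bottom_up("The crocodile ate the cat".split(), rules, lexicon))  # True
```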

Dealing with compounds
Example of how to deal with compounds during the initialization phase:
[Chart figure for "credit card": besides the ordinary lexical entries for "credit" (N, V) and "card" (N), initialization directly adds an N entry spanning the whole compound "credit card".]

Complexity: still in O(n³)
What coefficient for n³? (with respect to grammar parameters)
m(R) · |NT| · n³
where m(R) is the number of internal nodes of the trie of the right-hand sides of the non-lexical grammar rules, NT the set of non-terminals, and R the set of non-lexical grammar rules.
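As an illustration of m(R): the internal nodes of the trie are the nodes with at least one child, i.e. the distinct proper prefixes (root included) of the right-hand sides. A small sketch, under our reading of the definition:

```python
def m_of_R(rules):
    """Number of internal nodes of the trie of the right-hand sides:
    the distinct proper prefixes of the rhs tuples (rhs[:0] == () is the root)."""
    return len({rhs[:k] for _, rhs in rules for k in range(len(rhs))})

rules = [("S", ("NP", "VP")), ("NP", ("Det", "N")),
         ("VP", ("V",)), ("VP", ("V", "S")), ("VP", ("V", "NP"))]
print(m_of_R(rules))  # 4: the root, (NP,), (Det,), (V,)
```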

Keypoints
- The way the algorithms work (Earley items, linking, stepping, prediction, the CYK-Earley link)
- Worst-case complexity O(n³)
- Advantages and drawbacks of the algorithms

References
[1] D. Jurafsky & J. H. Martin, Speech and Language Processing, pp. 377-385, Prentice Hall, 2000.
[2] R. Dale, H. Moisl, H. Somers (eds.), Handbook of Natural Language Processing, pp. 69-73, Dekker, 2000.