Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Similar documents
Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)

Probabilistic Context-free Grammars

CKY & Earley Parsing. Ling 571 Deep Processing Techniques for NLP January 13, 2016

Parsing with CFGs L445 / L545 / B659. Dept. of Linguistics, Indiana University Spring Parsing with CFGs. Direction of processing

Parsing with CFGs. Direction of processing. Top-down. Bottom-up. Left-corner parsing. Chart parsing CYK. Earley 1 / 46.

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science

LECTURER: BURCU CAN Spring

Parsing with Context-Free Grammars

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering

Advanced Natural Language Processing Syntactic Parsing

CMPT-825 Natural Language Processing. Why are parsing algorithms important?

10/17/04. Today s Main Points

Statistical Machine Translation

Statistical Methods for NLP

Parsing with Context-Free Grammars

Handout 8: Computation & Hierarchical parsing II. Compute initial state set S 0 Compute initial state set S 0

Parsing. Probabilistic CFG (PCFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 22

In this chapter, we explore the parsing problem, which encompasses several questions, including:

Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG

Context- Free Parsing with CKY. October 16, 2014

Probabilistic Context-Free Grammars. Michael Collins, Columbia University

Natural Language Processing : Probabilistic Context Free Grammars. Updated 5/09

Probabilistic Context Free Grammars. Many slides from Michael Collins and Chris Manning

Attendee information. Seven Lectures on Statistical Parsing. Phrase structure grammars = context-free grammars. Assessment.

Chapter 14 (Partially) Unsupervised Parsing

Multilevel Coarse-to-Fine PCFG Parsing

A Supertag-Context Model for Weakly-Supervised CCG Parser Learning

Introduction to Computational Linguistics

CISC4090: Theory of Computation

CMSC 723: Computational Linguistics I Session #5 Hidden Markov Models. The ischool University of Maryland. Wednesday, September 30, 2009

Soft Inference and Posterior Marginals. September 19, 2013

CS 6120/CS4120: Natural Language Processing

Lecture 12: Algorithms for HMMs

Natural Language Processing. Lecture 13: More on CFG Parsing

Lecture 12: Algorithms for HMMs

A* Search. 1 Dijkstra Shortest Path

Parsing. Left-Corner Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 17

Unit 2: Tree Models. CS 562: Empirical Methods in Natural Language Processing. Lectures 19-23: Context-Free Grammars and Parsing

Processing/Speech, NLP and the Web

Syntax-Based Decoding

Part of Speech Tagging: Viterbi, Forward, Backward, Forward- Backward, Baum-Welch. COMP-599 Oct 1, 2015

Natural Language Processing

Midterm sample questions

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example

LING 473: Day 10. START THE RECORDING Coding for Probability Hidden Markov Models Formal Grammars

CSCI 5832 Natural Language Processing. Today 2/19. Statistical Sequence Classification. Lecture 9

Probabilistic Context Free Grammars. Many slides from Michael Collins

Lecture 3: ASR: HMMs, Forward, Viterbi

Empirical Methods in Natural Language Processing Lecture 11 Part-of-speech tagging and HMMs

Hidden Markov Models

CS460/626 : Natural Language

DT2118 Speech and Speaker Recognition

Natural Language Processing 1. lecture 7: constituent parsing. Ivan Titov. Institute for Logic, Language and Computation

CS838-1 Advanced NLP: Hidden Markov Models

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

Probabilistic Context Free Grammars

{Probabilistic Stochastic} Context-Free Grammars (PCFGs)

Decoding and Inference with Syntactic Translation Models

Grammars and introduction to machine learning. Computers Playing Jeopardy! Course Stony Brook University

Roger Levy Probabilistic Models in the Study of Language draft, October 2,

CS : Speech, NLP and the Web/Topics in AI

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1

Lecture 13: Structured Prediction

Sequence Labeling: HMMs & Structured Perceptron

CSE 105 THEORY OF COMPUTATION

Everything You Always Wanted to Know About Parsing

Parsing -3. A View During TD Parsing

Context Free Grammars

Spectral Unsupervised Parsing with Additive Tree Metrics

Computers playing Jeopardy!

Aspects of Tree-Based Statistical Machine Translation

Algorithms for Syntax-Aware Statistical Machine Translation

Topics in Natural Language Processing

Natural Language Processing Prof. Pushpak Bhattacharyya Department of Computer Science & Engineering, Indian Institute of Technology, Bombay

S NP VP 0.9 S VP 0.1 VP V NP 0.5 VP V 0.1 VP V PP 0.1 NP NP NP 0.1 NP NP PP 0.2 NP N 0.7 PP P NP 1.0 VP NP PP 1.0. N people 0.

Parsing. Unger s Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21

Lecture 15. Probabilistic Models on Graph

Data-Intensive Computing with MapReduce

Maschinelle Sprachverarbeitung

A Context-Free Grammar

Final exam study sheet for CS3719 Turing machines and decidability.

Maschinelle Sprachverarbeitung

Probabilistic Context-Free Grammar

CS460/626 : Natural Language

CMSC 330: Organization of Programming Languages. Pushdown Automata Parsing

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) October,

Part 3 out of 5. Automata & languages. A primer on the Theory of Computation. Last week, we learned about closure and equivalence of regular languages

Hidden Markov Models

CS 545 Lecture XVI: Parsing

Part A. P (w 1 )P (w 2 w 1 )P (w 3 w 1 w 2 ) P (w M w 1 w 2 w M 1 ) P (w 1 )P (w 2 w 1 )P (w 3 w 2 ) P (w M w M 1 )

Introduction to Bottom-Up Parsing

INF4820: Algorithms for Artificial Intelligence and Natural Language Processing. Hidden Markov Models

Parsing Algorithms. CS 4447/CS Stephen Watt University of Western Ontario

Parsing VI LR(1) Parsers

INF4820: Algorithms for Artificial Intelligence and Natural Language Processing. Language Models & Hidden Markov Models

This kind of reordering is beyond the power of finite transducers, but a synchronous CFG can do this.

Introduction to Bottom-Up Parsing

NPDA, CFG equivalence

Transcription:

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Natural Language Processing CS 4120/6120 Spring 2017 Northeastern University David Smith with some slides from Jason Eisner & Andrew McCallum

From Shift-Reduce to CKY Shift-reduce parsing can make wrong turns, needs backtracking Shift-reduce must pop the top of the stack, but how many items to pop? Time-space tradeoff Chomsky normal form

Chomsky Normal Form Any CFL can be generated by an equivalent grammar in CNF Rules of three types X Y Z X, Y, Z nonterminals X a X nonterminal, a terminal S ε S the start symbol NB: the derivation of a given string may change

CNF Conversion Create new start symbol Remove NTs that can generate epsilon Remove NTs that can generate each other, (unary rule cycles) Chain rules with RHS > 2 Related topic: rule Markovization (later)

CKY Algorithm Input: string of n words Output (of recognizer): grammatical or not Dynamic programming in a chart: rows labeled 0 to n-1 columns labeled 1 to n cell [i,j] lists possible constituents spanning words between i and j

CKY Algorithm! for i := 1 to n! Add to [i-1,i] all (part-of-speech) categories for the i th word! for width := 2 to n! for start := 0 to n-width! Define end := start + width! for mid := start+1 to end-1! for every constituent X in [start,mid]! for every constituent Y in [mid,end]! for all ways of combining X and Y (if any)! Add the resulting constituent to [start,end] if it s not already there.

CKY Algorithm! for i := 1 to n! Add to [i-1,i] all (part-of-speech) categories for the i th word! for width := 2 to n! for start := 0 to n-width! Define end := start + width! for mid := start+1 to end-1! for every constituent X in [start,mid]! for every constituent Y in [mid,end]! for all ways of combining X and Y (if any)! Add the resulting constituent to [start,end] if it s not already there. Time complexity?

CKY Algorithm! for i := 1 to n! Add to [i-1,i] all (part-of-speech) categories for the i th word! for width := 2 to n! for start := 0 to n-width! Define end := start + width! for mid := start+1 to end-1! for every constituent X in [start,mid]! for every constituent Y in [mid,end]! for all ways of combining X and Y (if any)! Add the resulting constituent to [start,end] if it s not already there. Time complexity? O(Gn 3 )

CKY Algorithm! for i := 1 to n! Add to [i-1,i] all (part-of-speech) categories for the i th word! for width := 2 to n! for start := 0 to n-width! Define end := start + width! for mid := start+1 to end-1! for every constituent X in [start,mid]! for every constituent Y in [mid,end]! for all ways of combining X and Y (if any)! Add the resulting constituent to [start,end] if it s not already there. Time complexity? O(Gn 3 ) Space complexity?

CKY Algorithm! for i := 1 to n! Add to [i-1,i] all (part-of-speech) categories for the i th word! for width := 2 to n! for start := 0 to n-width! Define end := start + width! for mid := start+1 to end-1! for every constituent X in [start,mid]! for every constituent Y in [mid,end]! for all ways of combining X and Y (if any)! Add the resulting constituent to [start,end] if it s not already there. Time complexity? O(Gn 3 ) Space complexity? O(Tn 2 )

time 1 flies 2 like 3 an 4 arrow 5 NP 3 Vst 3 0 NP! time Vst! time NP! flies VP! flies P! like V! like Det! an N! arrow 1 2 3 4 NP 4 VP 4 P 2 V 5 Det 1 N 8 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 Vst 3 0 1 NP 4 VP 4 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 Det 1 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 Vst 3 NP 10 0 1 NP 4 VP 4 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 Det 1 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 Vst 3 NP 10 S 8 0 1 NP 4 VP 4 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 Det 1 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 0 NP 3 Vst 3 NP 10 S 8 S 13 1 NP 4 VP 4 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 Det 1 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 0 NP 3 Vst 3 NP 10 S 8 S 13 1 NP 4 VP 4 _ 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 0 NP 3 Vst 3 NP 10 S 8 S 13 1 NP 4 VP 4 _ 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 0 NP 3 Vst 3 NP 10 S 8 S 13 1 NP 4 VP 4 _ 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 _ Vst 3 S 8 S 13 0 1 NP 4 VP 4 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 _ Vst 3 S 8 S 13 0 1 NP 4 VP 4 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 _ Vst 3 S 8 S 13 0 1 NP 4 VP 4 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 Vst 3 S 8 S 13 0 1 NP 4 VP 4 NP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 Vst 3 S 8 S 13 0 1 NP 4 VP 4 NP 18 S 21 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 Vst 3 S 8 S 13 0 1 NP 4 VP 4 NP 18 S 21 VP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 Vst 3 S 8 S 13 0 1 NP 4 VP 4 NP 18 S 21 VP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 NP 24 Vst 3 S 8 S 13 0 1 NP 4 VP 4 NP 18 S 21 VP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 NP 24 Vst 3 S 8 S 22 S 13 0 1 NP 4 VP 4 NP 18 S 21 VP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 NP 24 Vst 3 S 8 S 22 S 13 S 27 0 1 NP 4 VP 4 NP 18 S 21 VP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 NP 24 Vst 3 S 8 S 22 S 13 S 27 0 1 NP 4 VP 4 NP 18 S 21 VP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 NP 24 Vst 3 S 8 S 22 S 13 S 27 0 NP 24 1 NP 4 VP 4 NP 18 S 21 VP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 NP 24 Vst 3 S 8 S 22 S 13 S 27 0 NP 24 1 NP 4 VP 4 S 27 NP 18 S 21 VP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 NP 24 Vst 3 S 8 S 22 S 13 S 27 0 NP 24 1 NP 4 VP 4 S 27 S 22 NP 18 S 21 VP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 NP 24 Vst 3 S 8 S 22 S 13 S 27 0 NP 24 1 NP 4 VP 4 S 27 S 22 S 27 NP 18 S 21 VP 18 1 S! NP VP 6 S! Vst NP 2 S! S PP 1 VP! V NP 2 VP! VP PP 2 3 4 P 2 V 5 _ Det 1 PP 12 VP 16 NP 10 N 8 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

Follow backpointers S time 1 flies 2 like 3 an 4 arrow 5 NP 3 NP 10 NP 24 Vst 3 S 8 S 22 S 13 S 27 0 NP 24 S 27 S 22 1 NP 4 VP 4 S 27 NP 18 S 21 1 S! NP VP 6 S! Vst NP 2 S! S PP 2 3 4 P 2 V 5 _ Det 1 VP 18 PP 12 VP 16 NP 10 N 8 1 VP! V NP 2 VP! VP PP 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

S time 1 flies 2 like 3 an 4 arrow 5 NP VP NP 3 NP 10 NP 24 Vst 3 S 8 S 22 S 13 S 27 0 NP 24 S 27 S 22 1 NP 4 VP 4 S 27 NP 18 S 21 1 S! NP VP 6 S! Vst NP 2 S! S PP 2 3 4 P 2 V 5 _ Det 1 VP 18 PP 12 VP 16 NP 10 N 8 1 VP! V NP 2 VP! VP PP 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

S time 1 flies 2 like 3 an 4 arrow 5 NP VP NP 3 Vst 3 NP 10 S 8 NP 24 S 22 VP PP S 13 S 27 0 NP 24 S 27 S 22 1 NP 4 VP 4 S 27 NP 18 S 21 1 S! NP VP 6 S! Vst NP 2 S! S PP 2 3 4 P 2 V 5 _ Det 1 VP 18 PP 12 VP 16 NP 10 N 8 1 VP! V NP 2 VP! VP PP 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

S time 1 flies 2 like 3 an 4 arrow 5 NP VP NP 3 Vst 3 NP 10 S 8 NP 24 S 22 VP PP 0 S 13 S 27 NP 24 P NP S 27 S 22 1 NP 4 VP 4 S 27 NP 18 S 21 1 S! NP VP 6 S! Vst NP 2 S! S PP 2 3 4 P 2 V 5 _ Det 1 VP 18 PP 12 VP 16 NP 10 N 8 1 VP! V NP 2 VP! VP PP 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

S time 1 flies 2 like 3 an 4 arrow 5 NP VP NP 3 Vst 3 NP 10 S 8 NP 24 S 22 VP PP 0 S 13 S 27 NP 24 P NP S 27 S 22 Det N 1 NP 4 VP 4 S 27 NP 18 S 21 1 S! NP VP 6 S! Vst NP 2 S! S PP 2 3 4 P 2 V 5 _ Det 1 VP 18 PP 12 VP 16 NP 10 N 8 1 VP! V NP 2 VP! VP PP 1 NP! Det N 2 NP! NP PP 3 NP! NP NP 0 PP! P NP

Treebank Grammars What rules would you extract from this tree? What probabilities would you assign them? Det The NP NOM Noun S Verb read VP Det NP NOM man this Noun book

Treebank Grammars Penn Treebank Lots of rules have high fanout (flat phrases) Lots of unary cycles How should we evaluate? What are the consequences of CNF conversion?

Parsing as Deduction CKY as inference rules CKY as Prolog program But Prolog is top-down with backtracking i.e., backward chaining, CKY is forward chaining Inference rules as Boolean semiring

Probabilistic CFGs Generative process (already familiar) It s context free: Rules are applied independently, therefore we multiply their probabilities How to estimate probabilities? Supervised and unsupervised

Questions for PCFGs What is the most likely parse for a sentence? (parsing) What is the probability of a sentence? (language modeling) What rule probabilities maximize the probability of a sentence? (parameter estimation)

Algorithms for PCFGs Exact analogues to HMM algorithms Parsing: Viterbi CKY Language modeling: inside probabilities Parameter estimation: inside-outside probabilities with EM

Parsing as Deduction A, B, C N,W V,0 i, j, k m constit(b, i, j) constit(c, j, k) A BC constit(a, i, k) word(w, i) A W constit(a, i, i + 1) In Prolog: constit(a, I1, I) :- word(i, W), (A ---> [W]), I1 is I - 1. constit(a, I, K) :- constit(b, I, J), constit(c, J, K), (A ---> [B, C]). But Prolog uses top-down search with backtracking...

Parsing as Deduction A, B, C N,W V,0 i, j, k m constit(b, i, j) constit(c, j, k) A BC constit(a, i, k) word(w, i) A W constit(a, i, i + 1) constit(a, i, k) = _ B,C,j constit(b,i,j) ^ constit(c, j, k) ^ A! BC constit(a, i, j) = _ W word(w, i, j) ^ A! W

Parsing as Deduction constit(a, i, k) = _ B,C,j constit(b,i,j) ^ constit(c, j, k) ^ A! BC constit(a, i, j) = _ W word(w, i, j) ^ A! W score(constit(a, i, k)) = max B,C,j score(constit(b,i,j)) score(constit(a, i, j)) = max W score(constit(c, j, k)) score(a! BC) score(word(w, i, j)) score(a! W ) And how about the inside algorithm?

Inside & Viterbi Algorithms Let A(i, j) = p(constit(a, i, j)) NB: index between words; M&S index words = p(w ij nonterminal A from i to j) A(i, k) = B,C,j B(i, j) C(j, k) p(a BC) Let A(i, j) = p best (constit(a, i, j)) A(i, k) = max B,C,j B (i, j)) C(j, k) p(a BC) S(0,n) =? S(0,n) =?

Forward-Backward Algorithm NN αnn(2) a[vb NN]b[interest VB] a[nn VB]b[rates NN] βnn(4) NNS αnns(2) a[vb NNS]b[interest VB] a[nns VB]b[rates NNS] βnns(4) NNP αnnp(2) a[vb NNP]b[interest VB] a[nnp VB]b[rates NNP] βnnp(4) VB αvb(2) a[vb VB]b[interest VB] a[vb VB]b[rates VB] βvb(4) VBZ αvbz(2) a[vb VBZ]b[interest VB] a[vbz VB]b[rates VBZ] βvbz(4) Fed raises interest rates

Inside & Outside constit(a, i, j) p(words 0-i, words j-n, constit) w(0, 1) w(i-1, i) w(j, j+1) w(n-1, n)

Inside & Outside constit(a, i, j) p(words 0-i, words j-n, constit) Inside: p(words i-j constit) w(0, 1) w(i-1, i) w(j, j+1) w(n-1, n)

Inside & Outside constit(a, i, j) p(words 0-i, words j-n, constit) Inside: p(words i-j constit) w(0, 1) w(i-1, i) w(j, j+1) w(n-1, n)

Inside & Outside constit(a, i, j) Inside x Outside = Inside probability of the whole tree p(words 0-i, words j-n, constit) Inside: p(words i-j constit) w(0, 1) w(i-1, i) w(j, j+1) w(n-1, n)

Outside Algorithm A(i, j) =p(w 0,i,A i,j,w j,n ) Uses inside probs. A(i, j) = S(0,n) =? PP(0,n) =? n B,C,k=j + i B,C,k=0 B(i, k) C (j, k) p(b AC) B(k, j) C (k, i) p(b CA) Some resemblance to derivative product rule

Problems with Inside-Outside EM Each sentence at each iteration takes O(m 3 n 3 ) Local maxima even more problematic than for HMMs: Charniak (1993) found a different maximum for each of 300 trials More NTs needed to learn a good model NTs don t correspond to intuitions: HMMs are easier to constrain with tag dictionaries

Top-Down/Bottom-Up Top-down parsers Can get caught in infinite loops Take exponential time backtracking CKY Needs Chomsky normal form Builds all possible constituents

Earley Parser (1970) Nice combination of dynamic programming incremental interpretation avoids infinite loops no restrictions on the form of the context-free grammar. A! B C the D of causes no problems O(n 3 ) worst case, but faster for many grammars Uses left context and optionally right context to constrain search.

Earley s Overview Finds constituents and partial constituents in input A! B C. D E is partial: only the first half of the A A B C D E D + = A B C D E A! B C. D E A! B C D. E

Earley s Overview Proceeds incrementally left-to-right Before it reads word 5, it has already built all hypotheses that are consistent with first 4 words Reads word 5 & attaches it to immediately preceding hypotheses. Might yield new constituents that are then attached to hypotheses immediately preceding them E.g., attaching D to A! B C. D E gives A! B C D. E Attaching E to that gives A! B C D E. Now we have a complete A that we can attach to hypotheses immediately preceding the A, etc.

The Parse Table Columns 0 through n corresponding to the gaps between words Entries in column 5 look like (3, NP! NP. PP) (but we ll omit the! etc. to save space) Built while processing word 5 Means that the input substring from 3 to 5 matches the initial NP portion of a NP! NP PP rule Dot shows how much we ve matched as of column 5 Perfectly fine to have entries like (3, VP! is it. true that S)

The Parse Table Entries in column 5 look like (3, NP! NP. PP) What will it mean that we have this entry? Unknown right context: Doesn t mean we ll necessarily be able to find a VP starting at column 5 to complete the S. Known left context: Does mean that some dotted rule back in column 3 is looking for an S that starts at 3. So if we actually do find a VP starting at column 5, allowing us to complete the S, then we ll be able to attach the S to something. And when that something is complete, it too will have a customer to its left In short, a top-down (i.e., goal-directed) parser: it chooses to start building a constituent not because of the input but because that s what the left context needs. In the spoon, won t build spoon as a verb because there s no way to use a verb there. So any hypothesis in column 5 could get used in the correct parse, if words 1-5 are continued in just the right way by words 6-n.

Earley s as a Recognizer Add ROOT!. S to column 0. For each j from 0 to n: For each dotted rule in column j, (including those we add as we go!) look at what s after the dot: If it s a word w, SCAN: If w matches the input word between j and j+1, advance the dot and add the resulting rule to column j+1 If it s a non-terminal X, PREDICT: Add all rules for X to the bottom of column j, wth the dot at the start: e.g. X!. Y Z If there s nothing after the dot, ATTACH: We ve finished some constituent, A, that started in column I<j. So for each rule in column j that has A after the dot: Advance the dot and add the result to the bottom of column j. Output yes just if last column has ROOT! S. NOTE: Don t add an entry to a column if it s already there!

Earley s Summary Process all hypotheses one at a time in order. (Current hypothesis is shown in blue.) This may add new hypotheses to the end of the to-do list, or try to add old hypotheses again. Process a hypothesis according to what follows the dot: If a word, scan input and see if it matches If a nonterminal, predict ways to match it (we ll predict blindly, but could reduce # of predictions by looking ahead k symbols in the input and only making predictions that are compatible with this limited right context) If nothing, then we have a complete constituent, so attach it to all its customers

A (Whimsical) Grammar S! NP VP NP! Papa NP! Det N N! caviar NP! NP PP N! spoon VP! V NP V! ate VP! VP PP P! with PP! P NP Det! the Det! a An Input Sentence Papa ate the caviar with a spoon.

0 initialize 0 ROOT. S Remember this stands for (0, ROOT!. S)

0 0 ROOT. S 0 S. NP VP predict the kind of S we are looking for Remember this stands for (0, S!. NP VP)

0 0 ROOT. S 0 S. NP VP 0 NP. Det N 0 NP. NP PP 0 NP. Papa predict the kind of NP we are looking for (actually we ll look for 3 kinds: any of the 3 will do)

0 0 ROOT. S 0 S. NP VP 0 NP. Det N 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a predict the kind of Det we are looking for (2 kinds)

0 0 ROOT. S 0 S. NP VP 0 NP. Det N 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a predict the kind of NP we re looking for but we were already looking for these so don t add duplicate goals! Note that this happened when we were processing a left-recursive rule.

0 Papa 1 0 ROOT. S 0 S. NP VP 0 NP. Det N 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a 0 NP Papa. scan: the desired word is in the input!

0 Papa 1 0 ROOT. S 0 S. NP VP 0 NP. Det N 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a 0 NP Papa. scan: failure

0 Papa 1 0 ROOT. S 0 S. NP VP 0 NP. Det N 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a 0 NP Papa. scan: failure

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a attach the newly created NP (which starts at 0) to its customers (incomplete constituents that end at 0 and have NP after the dot)

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 0 Det. a predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 P. with predict

0 Papa 1 ate 2 0 ROOT. S 0 NP Papa. 1 V ate. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 P. with scan: success!

0 Papa 1 ate 2 0 ROOT. S 0 NP Papa. 1 V ate. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 P. with scan: failure

0 Papa 1 ate 2 0 ROOT. S 0 NP Papa. 1 V ate. 0 S. NP VP 0 S NP. VP 1 VP V. NP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 P. with attach

0 Papa 1 ate 2 0 ROOT. S 0 NP Papa. 1 V ate. 0 S. NP VP 0 S NP. VP 1 VP V. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 P. with predict

0 Papa 1 ate 2 0 ROOT. S 0 NP Papa. 1 V ate. 0 S. NP VP 0 S NP. VP 1 VP V. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with predict (these next few steps should look familiar)

0 Papa 1 ate 2 0 ROOT. S 0 NP Papa. 1 V ate. 0 S. NP VP 0 S NP. VP 1 VP V. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with predict

0 Papa 1 ate 2 0 ROOT. S 0 NP Papa. 1 V ate. 0 S. NP VP 0 S NP. VP 1 VP V. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with scan (this time we fail since Papa is not the next word)

0 Papa 1 ate 2 the 3 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 0 S. NP VP 0 S NP. VP 1 VP V. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with scan: success!

0 Papa 1 ate 2 the 3 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 0 S. NP VP 0 S NP. VP 1 VP V. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with

0 Papa 1 ate 2 the 3 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 0 NP. Det N 0 NP NP. PP 2 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with

0 Papa 1 ate 2 the 3 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with

0 Papa 1 ate 2 the 3 caviar 4 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with

0 Papa 1 ate 2 the 3 caviar 4 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with

0 Papa 1 ate 2 the 3 caviar 4 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with attach

0 Papa 1 ate 2 the 3 caviar 4 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 0 Det. a 1 V. ate 2 Det. a 1 P. with attach (again!)

0 Papa 1 ate 2 the 3 caviar 4 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 1 P. with attach (again!)

0 Papa 1 ate 2 the 3 caviar 4 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with

0 Papa 1 ate 2 the 3 caviar 4 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. attach (again!)

0 Papa 1 ate 2 the 3 caviar 4 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S.

0 Papa 1 ate 2 the 3 caviar 4 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with

0 Papa 1 ate 2 the 3 caviar 4 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with

0 Papa 1 ate 2 the 3 caviar 4 with 5 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 4 P with. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with

0 Papa 1 ate 2 the 3 caviar 4 with 5 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 4 P with. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with

0 Papa 1 ate 2 the 3 caviar 4 with 5 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 4 P with. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 5 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with

0 Papa 1 ate 2 the 3 caviar 4 with 5 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 4 P with. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 5 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 5 Det. the 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 5 Det. a 1 P. with 0 ROOT S. 4 P. with

0 Papa 1 ate 2 the 3 caviar 4 with 5 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 4 P with. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 5 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 5 Det. the 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 5 Det. a 1 P. with 0 ROOT S. 4 P. with

0 Papa 1 ate 2 the 3 caviar 4 with 5 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 4 P with. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 5 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 5 Det. the 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 5 Det. a 1 P. with 0 ROOT S. 4 P. with

0 Papa 1 ate 2 the 3 caviar 4 with 5 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 4 P with. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 5 NP. Papa 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 5 Det. the 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 5 Det. a 1 P. with 0 ROOT S. 4 P. with

e 2 the 3 caviar 4 with 5 a 6 1 V ate. 2 Det the. 3 N caviar. 4 P with. 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP P 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP P 2 NP. Papa 0 S NP VP. 5 NP. Papa 2 Det. the 1 VP VP. PP 5 Det. the 2 Det. a 4 PP. P NP 5 Det. a 0 ROOT S. 4 P. with 5 Det a.

e 2 the 3 caviar 4 with 5 a 6 1 V ate. 2 Det the. 3 N caviar. 4 P with. 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP P 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP P 2 NP. Papa 0 S NP VP. 5 NP. Papa 2 Det. the 1 VP VP. PP 5 Det. the 2 Det. a 4 PP. P NP 5 Det. a 0 ROOT S. 4 P. with 5 Det a. 5 NP Det. N

e 2 the 3 caviar 4 with 5 a 6 1 V ate. 2 Det the. 3 N caviar. 4 P with. 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP P 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP P 2 NP. Papa 0 S NP VP. 5 NP. Papa 2 Det. the 1 VP VP. PP 5 Det. the 2 Det. a 4 PP. P NP 5 Det. a 0 ROOT S. 4 P. with 5 Det a. 5 NP Det. N 6 N. caviar 6 N. spoon

e 2 the 3 caviar 4 with 5 a 6 1 V ate. 2 Det the. 3 N caviar. 4 P with. 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP P 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP P 2 NP. Papa 0 S NP VP. 5 NP. Papa 2 Det. the 1 VP VP. PP 5 Det. the 2 Det. a 4 PP. P NP 5 Det. a 0 ROOT S. 4 P. with 5 Det a. 5 NP Det. N 6 N. caviar 6 N. spoon

e 2 the 3 caviar 4 with 5 a 6 spoon 7 1 V ate. 2 Det the. 3 N caviar. 4 P with. 5 Det a. 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP 5 NP Det. N P 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 6 N. caviar 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP 6 N. spoon P 2 NP. Papa 0 S NP VP. 5 NP. Papa 2 Det. the 1 VP VP. PP 5 Det. the 2 Det. a 4 PP. P NP 5 Det. a 0 ROOT S. 4 P. with 6 N spoon.

e 2 the 3 caviar 4 with 5 a 6 spoon 7 1 V ate. 2 Det the. 3 N caviar. 4 P with. 5 Det a. 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP 5 NP Det. N P 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 6 N. caviar 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP 6 N. spoon P 2 NP. Papa 0 S NP VP. 5 NP. Papa 2 Det. the 1 VP VP. PP 5 Det. the 2 Det. a 4 PP. P NP 5 Det. a 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N.

e 2 the 3 caviar 4 with 5 a 6 spoon 7 1 V ate. 2 Det the. 3 N caviar. 4 P with. 5 Det a. 1 VP V. NP 2 NP Det. N 2 NP Det N. 4 PP P. NP 5 NP Det. N P 2 NP. Det N 3 N. caviar 1 VP V NP. 5 NP. Det N 6 N. caviar 2 NP. NP PP 3 N. spoon 2 NP NP. PP 5 NP. NP PP 6 N. spoon P 2 NP. Papa 0 S NP VP. 5 NP. Papa 2 Det. the 1 VP VP. PP 5 Det. the 2 Det. a 4 PP. P NP 5 Det. a 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP.

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 7 P. with

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 7 P. with

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 7 P. with

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 7 P. with 0 ROOT S.

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 7 P. with 0 ROOT S.

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 7 P. with 0 ROOT S.

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 7 P. with 0 ROOT S.

0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 NP Papa. 1 V ate. 2 Det the. 3 N caviar. 0 S. NP VP 0 S NP. VP 1 VP V. NP 2 NP Det. N 2 NP Det N. 0 NP. Det N 0 NP NP. PP 2 NP. Det N 3 N. caviar 1 VP V NP. 0 NP. NP PP 1 VP. V NP 2 NP. NP PP 3 N. spoon 2 NP NP. PP 0 NP. Papa 1 VP. VP PP 2 NP. Papa 0 S NP VP. 0 Det. the 1 PP. P NP 2 Det. the 1 VP VP. PP 0 Det. a 1 V. ate 2 Det. a 4 PP. P NP 1 P. with 0 ROOT S. 4 P. with 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 7 P. with 0 ROOT S.

Left Recursion Kills Pure Top-Down Parsing VP Andrew McCallum, UMass Amherst

Left Recursion Kills Pure Top-Down Parsing VP VP PP Andrew McCallum, UMass Amherst

Left Recursion Kills Pure Top-Down Parsing VP VP PP VP PP Andrew McCallum, UMass Amherst

Left Recursion Kills Pure Top-Down Parsing VP VP PP VP PP makes new hypotheses ad infinitum before we ve seen the PPs at all VP PP hypotheses try to predict in advance how many PP s will arrive in input Andrew McCallum, UMass Amherst

but Earley s Alg is Okay! VP 1 VP!. VP PP VP PP (in column 1) Andrew McCallum, UMass Amherst

but Earley s Alg is Okay! VP 1 VP!. VP PP VP PP (in column 1) VP 1 VP! V NP. V NP ate the caviar (in column 4) Andrew McCallum, UMass Amherst

but Earley s Alg is Okay! VP 1 VP!. VP PP VP PP (in column 1) attach VP 1 VP! VP. PP VP PP V NP ate the caviar (in column 4) Andrew McCallum, UMass Amherst

but Earley s Alg is Okay! VP 1 VP!. VP PP VP PP (in column 1) VP 1 VP! VP PP. VP PP with a spoon V NP ate the caviar (in column 7) Andrew McCallum, UMass Amherst

but Earley s Alg is Okay! VP VP PP (in column 1) 1 VP!. VP PP can be reused VP 1 VP! VP PP. VP PP with a spoon V NP ate the caviar (in column 7) Andrew McCallum, UMass Amherst

but Earley s Alg is Okay! VP VP PP (in column 1) 1 VP!. VP PP can be reused attach VP 1 VP! VP. PP VP PP VP PP with a spoon V NP ate the caviar (in column 7) Andrew McCallum, UMass Amherst

but Earley s Alg is Okay! VP VP PP (in column 1) 1 VP!. VP PP can be reused VP VP PP in his bed VP PP with a spoon V NP ate the caviar (in column 10) 1 VP! VP PP. Andrew McCallum, UMass Amherst

but Earley s Alg is Okay! VP VP PP (in column 1) 1 VP!. VP PP can be reused again VP VP PP in his bed VP PP with a spoon V NP ate the caviar (in column 10) 1 VP! VP PP. Andrew McCallum, UMass Amherst

but Earley s Alg is Okay! VP VP PP (in column 1) 1 VP!. VP PP can be reused again attach VP VP VP PP in his bed VP PP with a spoon V NP ate the caviar (in column 10) 1 VP! VP. PP PP Andrew McCallum, UMass Amherst

0 Det. the 0 Det. a 0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 S. NP VP 0 NP. Det N 0 NP. NP PP 0 NP. Papa 0 NP Papa. 0 S NP. VP 0 NP NP. PP 1 VP. V NP 1 VP. VP PP 1 PP. P NP 1 V. ate 1 P. with 1 V ate. 1 VP V. NP 2 NP. Det N 2 NP. NP PP 2 NP. Papa 2 Det. the 2 Det. a 2 Det the. 2 NP Det. N 3 N. caviar 3 N. spoon 3 N caviar. 2 NP Det N. 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 4 PP. P NP 0 ROOT S. 4 P. with completed a VP in col 4 col 1 lets us use it in a VP PP structure 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 7 P. with 0 ROOT S.

0 Det. the 0 Det. a 0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7 0 ROOT. S 0 S. NP VP 0 NP. Det N 0 NP. NP PP 0 NP. Papa 0 NP Papa. 0 S NP. VP 0 NP NP. PP 1 VP. V NP 1 VP. VP PP 1 PP. P NP 1 V. ate 1 P. with 1 V ate. 1 VP V. NP 2 NP. Det N 2 NP. NP PP 2 NP. Papa 2 Det. the 2 Det. a 2 Det the. 2 NP Det. N 3 N. caviar 3 N. spoon 3 N caviar. 2 NP Det N. 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 4 PP. P NP 0 ROOT S. 4 P. with completed that VP = VP PP in col 7 col 1 would let us use it in a VP PP structure can reuse col 1 as often as we need 6 N spoon. 5 NP Det N. 4 PP P NP. 5 NP NP. PP 2 NP NP PP. 1 VP VP PP. 7 PP. P NP 1 VP V NP. 2 NP NP. PP 0 S NP VP. 1 VP VP. PP 7 P. with 0 ROOT S.

Beyond Recognition So far, we ve described an Earley recognizer Note what we did when we tried to create entries that already existed What should we do when combining items? How to derive outside algorithm?

Parsing Tricks

Left-Corner Parsing Technique for 1 word of lookahead in algorithms like Earley s (can also do multi-word lookahead but it s harder)

Basic Earley s Algorithm 0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a attach

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 0 Det. a predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted predict!.v makes us add all the verbs in the vocabulary!! Slow we d like a shortcut.

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted predict! Every.VP adds all VP " rules again.! Before adding a rule, check it s not a duplicate.! Slow if there are > 700 VP " rules

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted 1 P. with predict

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted 1 P. with predict!.p makes us add all the prepositions

1-word lookahead would help 0 Papa 1 ate 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted 1 P. with

1-word lookahead would help 0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted 1 P. with ate No point in adding words other than ate

1-word lookahead would help 0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 PP. P NP 0 Det. a 1 V. ate 1 V. drank 1 V. snorted 1 P. with ate In fact, no point in adding any constituent that can t start with ate Don t bother adding PP, P, etc. No point in adding words other than ate

With Left-Corner Filter 0 Papa 1 ate 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a attach

With Left-Corner Filter 0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a ate attach PP can t start with ate

With Left-Corner Filter 0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a ate attach PP can t start with ate Pruning now we won t predict 1 PP. P NP 1 PP. ate

With Left-Corner Filter 0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a ate attach PP can t start with ate Pruning now we won t predict 1 PP. P NP 1 PP. ate either!

With Left-Corner Filter 0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 0 NP. Papa 0 Det. the 0 Det. a ate attach PP can t start with ate Pruning now we won t predict 1 PP. P NP 1 PP. ate either!

0 Papa 1 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 0 Det. a ate predict

0 Papa 1 ate 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 V. ate 0 Det. a 1 V. drank 1 V. snorted predict

0 Papa 1 ate 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 V. ate 0 Det. a 1 V. drank 1 V. snorted predict

0 Papa 1 ate 0 ROOT. S 0 NP Papa. 0 S. NP VP 0 S NP. VP 0 NP. Det N 0 NP NP. PP 0 NP. NP PP 1 VP. V NP 0 NP. Papa 1 VP. VP PP 0 Det. the 1 V. ate 0 Det. a 1 V. drank 1 V. snorted predict

Merging Right-Hand Sides Grammar might have rules X A G H P X B G H P Could end up with both of these in chart: (2, X A. G H P) in column 5 (2, X B. G H P) in column 5 But these are now interchangeable: if one produces X then so will the other To avoid this redundancy, can always use dotted rules of this form: X... G H P

Merging Right-Hand Sides Similarly, grammar might have rules X A G H P X A G H Q Could end up with both of these in chart: (2, X A. G H P) in column 5 (2, X A. G H Q) in column 5 Not interchangeable, but we ll be processing them in parallel for a while Solution: write grammar as X A G H (P Q)

Merging Right-Hand Sides Combining the two previous cases: X A G H P X A G H Q X B G H P X B G H Q becomes X (A B) G H (P Q) And often nice to write stuff like NP (Det ε) Adj* N

Merging Right-Hand Sides

Merging Right-Hand Sides X (A B) G H (P Q) NP (Det ε) Adj* N

Merging Right-Hand Sides X (A B) G H (P Q) NP (Det ε) Adj* N These are regular expressions!

Merging Right-Hand Sides X (A B) G H (P Q) NP (Det ε) Adj* N These are regular expressions! Build their minimal DFAs:

Merging Right-Hand Sides X (A B) G H (P Q) NP (Det ε) Adj* N These are regular expressions! Build their minimal DFAs: A P X B G H Q

Merging Right-Hand Sides X (A B) G H (P Q) NP (Det ε) Adj* N These are regular expressions! Build their minimal DFAs: A P X B G H Q Adj Det NP N Adj N

Merging Right-Hand Sides X (A B) G H (P Q) NP (Det ε) Adj* N These are regular expressions! Build their minimal DFAs: A P X B G H Q Adj! Automaton states Det replace dotted NP N rules (X A G. H P) Adj N

Merging Right-Hand Sides Indeed, all NP rules can be unioned into a single DFA! NP ADJP ADJP JJ JJ NN NNS NP ADJP DT NN NP ADJP JJ NN NP ADJP JJ NN NNS NP ADJP JJ NNS NP ADJP NN NP ADJP NN NN NP ADJP NN NNS NP ADJP NNS NP ADJP NPR NP ADJP NPRS NP DT NP DT ADJP NP DT ADJP, JJ NN NP DT ADJP ADJP NN NP DT ADJP JJ JJ NN NP DT ADJP JJ NN NP DT ADJP JJ NN NN

Merging Right-Hand Sides Indeed, all NP rules can be unioned into a single DFA! NP ADJP ADJP JJ JJ NN NNS ADJP DT NN ADJP JJ NN ADJP JJ NN NNS ADJP JJ NNS ADJP NN ADJP NN NN ADJP NN NNS ADJP NNS ADJP NPR ADJP NPRS DT DT ADJP DT ADJP, JJ NN DT ADJP ADJP NN DT ADJP JJ JJ NN DT ADJP JJ NN DT ADJP JJ NN NN etc.