Probabilistic Context Free Grammars

Size: px
Start display at page:

Download "Probabilistic Context Free Grammars"

Transcription

1 1 Defining PCFGs A PCFG G consists of Probabilistic Context Free Grammars 1. A set of terminals: {w k }, k = 1..., V 2. A set of non terminals: { i }, i = 1..., n 3. A designated Start symbol: 1 4. A set of rules: { i ξ j }, i = 1..., n 5. A probability function P which assigns probabilities to rules so that, for all nonterminals i Conventions: P( i ξ j ) = 1 j otation Meaning G Grammar (PCFG) L Language (generated or accepted by a grammar) t parse tree { 1,..., n } onterminal vocabulary ( 1 start symbol) {w 1,..., w V } Terminal vocabulary w 1... w m Sentence to be parsed j pq onterminal j spans positions p through q in sentence α j (p, q) Outside probabilities β j (p, q) Inside probabilities j w p... w q onterminal j dominates words w p through w q in sentence (not necessarily directly) A key property of PCFGs is what we will call the independence assumption: Independence Assumption The probability of a node sequence s r depends only on the immediate mother node, not any node above that or outside the current constituent. We first address the problem of finding the probability of a string given a PCFG grammar. This is directly useful for some applications, such as filtering the hypoethses of a speech recognizer. But considering some efficient algorithms for calculating string probabilities will have a sideeffect. It will provide a line of attack on a somewhat more central problem for this course: finding the most probable parse of a string. 1

2 2 Example Thus the only parameters of a PCFG grammar are the rule probabilities, as in the following simple PCFG: (1) S P VP 1.0 P P PP 0.4 PP P P 1.0 P astronomers 0.1 VP V P 0.7 P ears 0.18 VP VP PP 0.3 P saw 0.04 P with 1.0 P stars 0.18 V saw 1.0 P telescopes 0.1 otice the constraint P(P ξ j ) = 1 is met. The sum of the rule probabilities for all the P rules is 1. We consider the probability of the sentence (2) Astronomers saw stars with ears. j with respect to this grammar, which admits the two analyses shown in Figures 1 and 2. The probability of each anaylsis is computed simply by multiplying the probabilities of the rules. We compute the probability of the string by summing the probabilities of its two analyses: P(w 15 ) = P(t 1 ) + P(t 2 ) = = Two algorithms for efficiently computing string probabilities The naive approach to computing string probabilities: Compute the probability of each parse and add. But: The number of parses of a string with a CFG grammar in general grows exponentially with the length of the string. Therefore this is inefficient. 3.1 Inside Probabilities P(w 1,m G) = P( 1 w 1,m G) = P(w 1,m 1 1,m, G) = β 1 (1, m) The probability of a wordstring given the grammar equals the probability that the start symbol exhaustively dominates the string, which is the probability of the string w 1,m given that the start symbol dominates words 1 through m, which we call the inside probability of the string for the span 1m and the category 1. The general definition of an inside probability: β j (p, q) = P(w p,q j p,q, G) 2

3 There is an an algotrithm for computing the inside probability of a strinng known as the inside algorithm. It breaks down to a base case and an induction: 1. Base case: The event that a nonterminal j exhaustively dominates the word at position k is j kk and the probability that that happens given the grammar is: P( j kk ) = β j(k, k) = P( j w k G) The point of the β notation is that it uses only indices, indices of word positions (subscripts) and grammar elements (superscripts). The point of the 2nd line is that this reduces to a fact about the grammar, irrespective of word positions. 2. Induction We assume a Chomsky normal grammar, so the situation is as pictured in Figure 3. Then β j (p, q) = P(w pq j pq, G) (1) = r,s q 1 P(w pd, r pd, w(d + 1)q, s (d+1)q j pq, G) (2) d=p We regard the probability of an analysis of w pq as an instance of category j as the sum of the probabilities of an analysis using some way of expanding j ; each such analysis uses some production: j r s Following the picture in Figure 3, we split the word string into two halves at word d, p d < q, w pd and w (d+1)q, one corresponding to r and one to s. Thus the jointly occurring events in (2) are wordstring w pd being analyzed as r and word string w (d+1)q being analyzed as s. β j (p, q) = r,s q 1 P( r pd, s (d+1)q j pq, G) P(w pd j pq, r pd, s (d+1)q, G) d=p P(w (d+1)q j pq, r pd, s (d+1)q, w pd, G) (3) Equation (3) uses the chain rule on the joint events of (2). ow we use the independence assumptions of CFGs to conclude: β j (p, q) = r,s q 1 d=p P( r pd, s (d+1)q j pq, G) P(w pd r pd, G) P(w (d+1)q s (d+1)q, G) (4) And substituting in the definitions of an inside probability and a rule probability, we have the basic equation we can compute with: β j (p, q) = r,s q 1 P( j r s j pq, G) β r (p, d) β s (d + 1, q) (5) d=p 3

4 This means we can compute inside probabilities bottum up. We use a dynamic programming algorithm, and as is customary represent the computations in a table. Cell (i, j) for row i and column j contains the result of the β computations for the string beginning with word i and ending with word j. First we do all the word βs: β P = β P = 0.04 β V = β P = β P = β P = 0.18 astronomers saw stars with ears We now proceed as with CYK, trying to build constituents of length 2 next: β P = β P = 0.04 β VP = β V = β P = β P = 0.1 β PP = β P = 0.18 astronomers saw stars with ears The computation of β VP (2, 3) and β PP (4, 5) And in the next round: β VP (2, 3) = P(VP V P) β V (2, 2) β P (3, 3) (6) = (7) = (8) β PP (4, 5) = P(PP P P) β P (4, 4) β P (5, 5) (9) = (10) = 0.18 (11) β P = 0.1 β S = β P = 0.04 β V = 1.0 β VP = β P = 0.18 β P = β P = 0.1 β PP = β P = 0.18 astronomers saw stars with ears 4

5 The computation of β S (1, 3) and β P (3, 5) β S (1, 3) = P(S P VP) β P (1, 1) β VP (2, 3) (12) = (13) = (14) β P (3, 5) = P(P P PP) β P (3, 3) β PP (4, 5) (15) And in the only round with an ambiguous constituent: = (16) = (17) β P = 0.1 β S = β P = 0.04 β VP = β VP = β V = β P = 0.18 β P = β P = 0.1 β PP = β P = 0.18 astronomers saw stars with ears The computation of β VP (2, 5) : β VP (2, 5) = (P(VP V P) β V (2, 2) β P (3, 5)) + (P(VP VP PP) β VP (2, 3) β PP (4, 5)) (18) = ( ) + ( ) (19) = (20) = (21) The probability table (or chart) may be completed in the next step: β P = 0.1 β S = β S = β P = 0.04 β VP = β VP = β V = β P = 0.18 β P = β P = 0.1 β PP = β P = 0.18 astronomers saw stars with ears The computation of β S (1, 5) is left as an exercise. (22) 5

6 3.2 Outside Probabilities We define α j (p, q), the outside probability for a category j and a span w pq as α j (p, q) = P(w 1(p 1), j pq, w (q+1)m G) This is the probability of generating j covering the span (p, q) (whatever the words in it) along with the all the words outside it. Using the notion outside probability, we can view the problem of computing the probability of a string from the point of view of any word in it. Pick a word w k. The following is true: P(w 1,m G) = j P(w 1,(k 1), w k, w (k+1)m, j kk G) = j P(w 1,(k 1), w (k+1)m, j kk G) P(w k w 1,(k 1), j kk, w (k+1)m, G) = α j (k, k) P( j w k ) We start with the observation that any assignment of a category to w k fixes a probability for analyses of the entire string compatible with that assignment. So we sum up such analysis probabilities for each category j in the grammar. We then apply the chain rule, and the independence assumptions of PCFGs, and the definition of α j (p, q) to derive an expression for the probability of the entire string. So if we knew the α j (k, k) values, computing the string probability would be easy. We now show the outside algorithm, a method for solving the more general problem of computing α j (p, q). This is done topdown. We break it down as before into a base case and an induction: 1. Base case: α 1 (1, m) = 1 = 0 for j 1 2. Induction The situation is as pictured in Figure 4 or Figure 5 Then α j (p, q) = [ m f,g j e=q+1 P(w 1(p 1), w(q + 1)m, f pe, j pq, g (q+1)e )] +[ p 1 f,g e=1 P(w 1(p 1), w(q + 1)m, f eq, g e(p 1), j pq)] The first double sum is the case of Figure 4. The second is the case of Figure 5. In the first case, we restrict the two categories not to be equal, so as not to count rules of the form: twice. X Y Y α j (p, q) = [ m f,g j e=q+1 P(w 1(p 1), w(q + 1)m, f pe) P( j pq, g (q+1)e f pe) P(w(q + 1)m g (q+1)e )] + [ p 1 f,g e=1 P(w 1(e 1), w(q + 1)m, f eq) P( g e(p 1), j pq f eq) P(w e(p 1) g e(p 1) )] = [ m f,g j e=q+1 α f(p, e)p( f j g )β g (q + 1, e)] +[ p 1 f,g e=1 α f(e, q)p( j g j )β g (e, p 1)] (23) 6

7 4 Finding the most probable parse We define δ j (p, q) as the highest probability parse of w pq as j What follows is a Viterbi like algorithm for finding the best parse. Corresponding to Viterbi valkues we will have δ values. Corresponding to a Viterbi bnacktrace we will have poointers to the subtrees that help us produce our highest probvability parse for each (category,start, end) triple. 1. Initialization: δ i (p, p) = P( i w p ) 2. Induction: δ i (p, q) = max P( i j k )δ j (p, r)δ k (r + 1, q) 1 j,k n p r<q Store backtrace: Ψ i (p, q) = argmax j,k,r P( i j k )δ j (p, r)δ k (r + 1, q) 3. Termination: The probability of the max probability tree ˆt is: P(ˆt) = δ 1 (1, m) In addition we know the mother node and span of the highest probability parse: 1 and (p, q). In general given a mother node and a span (i, p, q), we can construct a highest probability parse tree below it by following the Ψ pointers: Let Ψ i (p, q) = (j, k, r) Then left daughter = j pr right daughter = k (r+1)q 7

8 S 1.0 P 0.1 VP 0.7 Astronomers V 1.0 P 0.4 saw P 0.18 PP 1.0 stars P 1.0 P 0.18 with ears P(t 1 ) = = Figure 1: t 1 S 1.0 P 0.1 VP 0.3 Astronomers VP 0.7 PP 1.0 V 1.0 P 0.18 P 1.0 P 0.18 saw stars with ears P(t 2 ) = = Figure 2: t 2 j r w p... w d s w d+1... w q Figure 3: Computation of inside probability: β j (p, q) 8

9 1 w 1... w p 1. f w e+1... w m j w p... w q g w q+1... w e Figure 4: Computation of outside probability (left category): α j (p, q) 9

10 1 w 1... w e 1. f w q+1... w m g w e... w p 1 j w p... w q Figure 5: Computation of outside probability (right category): α j (p, q) 10

PCFGs 2 L645 / B659. Dept. of Linguistics, Indiana University Fall PCFGs 2. Questions. Calculating P(w 1m ) Inside Probabilities

PCFGs 2 L645 / B659. Dept. of Linguistics, Indiana University Fall PCFGs 2. Questions. Calculating P(w 1m ) Inside Probabilities 1 / 22 Inside L645 / B659 Dept. of Linguistics, Indiana University Fall 2015 Inside- 2 / 22 for PCFGs 3 questions for Probabilistic Context Free Grammars (PCFGs): What is the probability of a sentence

More information

Probabilistic Context-Free Grammar

Probabilistic Context-Free Grammar Probabilistic Context-Free Grammar Petr Horáček, Eva Zámečníková and Ivana Burgetová Department of Information Systems Faculty of Information Technology Brno University of Technology Božetěchova 2, 612

More information

Natural Language Processing : Probabilistic Context Free Grammars. Updated 5/09

Natural Language Processing : Probabilistic Context Free Grammars. Updated 5/09 Natural Language Processing : Probabilistic Context Free Grammars Updated 5/09 Motivation N-gram models and HMM Tagging only allowed us to process sentences linearly. However, even simple sentences require

More information

{Probabilistic Stochastic} Context-Free Grammars (PCFGs)

{Probabilistic Stochastic} Context-Free Grammars (PCFGs) {Probabilistic Stochastic} Context-Free Grammars (PCFGs) 116 The velocity of the seismic waves rises to... S NP sg VP sg DT NN PP risesto... The velocity IN NP pl of the seismic waves 117 PCFGs APCFGGconsists

More information

CS : Speech, NLP and the Web/Topics in AI

CS : Speech, NLP and the Web/Topics in AI CS626-449: Speech, NLP and the Web/Topics in AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture-17: Probabilistic parsing; insideoutside probabilities Probability of a parse tree (cont.) S 1,l NP 1,2

More information

Processing/Speech, NLP and the Web

Processing/Speech, NLP and the Web CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 25 Probabilistic Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 14 th March, 2011 Bracketed Structure: Treebank Corpus [ S1[

More information

Parsing. Probabilistic CFG (PCFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 22

Parsing. Probabilistic CFG (PCFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 22 Parsing Probabilistic CFG (PCFG) Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2017/18 1 / 22 Table of contents 1 Introduction 2 PCFG 3 Inside and outside probability 4 Parsing Jurafsky

More information

Maschinelle Sprachverarbeitung

Maschinelle Sprachverarbeitung Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other

More information

Maschinelle Sprachverarbeitung

Maschinelle Sprachverarbeitung Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other

More information

Attendee information. Seven Lectures on Statistical Parsing. Phrase structure grammars = context-free grammars. Assessment.

Attendee information. Seven Lectures on Statistical Parsing. Phrase structure grammars = context-free grammars. Assessment. even Lectures on tatistical Parsing Christopher Manning LA Linguistic Institute 7 LA Lecture Attendee information Please put on a piece of paper: ame: Affiliation: tatus (undergrad, grad, industry, prof,

More information

Probabilistic Context-free Grammars

Probabilistic Context-free Grammars Probabilistic Context-free Grammars Computational Linguistics Alexander Koller 24 November 2017 The CKY Recognizer S NP VP NP Det N VP V NP V ate NP John Det a N sandwich i = 1 2 3 4 k = 2 3 4 5 S NP John

More information

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering Review Earley Algorithm Chapter 13.4 Lecture #9 October 2009 Top-Down vs. Bottom-Up Parsers Both generate too many useless trees Combine the two to avoid over-generation: Top-Down Parsing with Bottom-Up

More information

A* Search. 1 Dijkstra Shortest Path

A* Search. 1 Dijkstra Shortest Path A* Search Consider the eight puzzle. There are eight tiles numbered 1 through 8 on a 3 by three grid with nine locations so that one location is left empty. We can move by sliding a tile adjacent to the

More information

Probabilistic Context-Free Grammars. Michael Collins, Columbia University

Probabilistic Context-Free Grammars. Michael Collins, Columbia University Probabilistic Context-Free Grammars Michael Collins, Columbia University Overview Probabilistic Context-Free Grammars (PCFGs) The CKY Algorithm for parsing with PCFGs A Probabilistic Context-Free Grammar

More information

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) Parsing Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) S N VP V NP D N John hit the ball Levels of analysis Level Morphology/Lexical POS (morpho-synactic), WSD Elements

More information

Introduction to Computational Linguistics

Introduction to Computational Linguistics Introduction to Computational Linguistics Olga Zamaraeva (2018) Based on Bender (prev. years) University of Washington May 3, 2018 1 / 101 Midterm Project Milestone 2: due Friday Assgnments 4& 5 due dates

More information

Advanced Natural Language Processing Syntactic Parsing

Advanced Natural Language Processing Syntactic Parsing Advanced Natural Language Processing Syntactic Parsing Alicia Ageno ageno@cs.upc.edu Universitat Politècnica de Catalunya NLP statistical parsing 1 Parsing Review Statistical Parsing SCFG Inside Algorithm

More information

CS460/626 : Natural Language

CS460/626 : Natural Language CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 23, 24 Parsing Algorithms; Parsing in case of Ambiguity; Probabilistic Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 8 th,

More information

Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG

Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing Laura Kallmeyer, Timm Lichte, Wolfgang Maier Universität Tübingen Part I Formal Properties of TAG 16.05.2007 und 21.05.2007 TAG Parsing

More information

CKY & Earley Parsing. Ling 571 Deep Processing Techniques for NLP January 13, 2016

CKY & Earley Parsing. Ling 571 Deep Processing Techniques for NLP January 13, 2016 CKY & Earley Parsing Ling 571 Deep Processing Techniques for NLP January 13, 2016 No Class Monday: Martin Luther King Jr. Day CKY Parsing: Finish the parse Recognizer à Parser Roadmap Earley parsing Motivation:

More information

Properties of Context-Free Languages

Properties of Context-Free Languages Properties of Context-Free Languages Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr

More information

CS460/626 : Natural Language

CS460/626 : Natural Language CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 27 SMT Assignment; HMM recap; Probabilistic Parsing cntd) Pushpak Bhattacharyya CSE Dept., IIT Bombay 17 th March, 2011 CMU Pronunciation

More information

Even More on Dynamic Programming

Even More on Dynamic Programming Algorithms & Models of Computation CS/ECE 374, Fall 2017 Even More on Dynamic Programming Lecture 15 Thursday, October 19, 2017 Sariel Har-Peled (UIUC) CS374 1 Fall 2017 1 / 26 Part I Longest Common Subsequence

More information

In this chapter, we explore the parsing problem, which encompasses several questions, including:

In this chapter, we explore the parsing problem, which encompasses several questions, including: Chapter 12 Parsing Algorithms 12.1 Introduction In this chapter, we explore the parsing problem, which encompasses several questions, including: Does L(G) contain w? What is the highest-weight derivation

More information

Quiz 1, COMS Name: Good luck! 4705 Quiz 1 page 1 of 7

Quiz 1, COMS Name: Good luck! 4705 Quiz 1 page 1 of 7 Quiz 1, COMS 4705 Name: 10 30 30 20 Good luck! 4705 Quiz 1 page 1 of 7 Part #1 (10 points) Question 1 (10 points) We define a PCFG where non-terminal symbols are {S,, B}, the terminal symbols are {a, b},

More information

To make a grammar probabilistic, we need to assign a probability to each context-free rewrite

To make a grammar probabilistic, we need to assign a probability to each context-free rewrite Notes on the Inside-Outside Algorithm To make a grammar probabilistic, we need to assign a probability to each context-free rewrite rule. But how should these probabilities be chosen? It is natural to

More information

Einführung in die Computerlinguistik

Einführung in die Computerlinguistik Einführung in die Computerlinguistik Context-Free Grammars (CFG) Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Summer 2016 1 / 22 CFG (1) Example: Grammar G telescope : Productions: S NP VP NP

More information

Parsing with Context-Free Grammars

Parsing with Context-Free Grammars Parsing with Context-Free Grammars Berlin Chen 2005 References: 1. Natural Language Understanding, chapter 3 (3.1~3.4, 3.6) 2. Speech and Language Processing, chapters 9, 10 NLP-Berlin Chen 1 Grammars

More information

Natural Language Processing

Natural Language Processing SFU NatLangLab Natural Language Processing Anoop Sarkar anoopsarkar.github.io/nlp-class Simon Fraser University September 27, 2018 0 Natural Language Processing Anoop Sarkar anoopsarkar.github.io/nlp-class

More information

Decoding and Inference with Syntactic Translation Models

Decoding and Inference with Syntactic Translation Models Decoding and Inference with Syntactic Translation Models March 5, 2013 CFGs S NP VP VP NP V V NP NP CFGs S NP VP S VP NP V V NP NP CFGs S NP VP S VP NP V NP VP V NP NP CFGs S NP VP S VP NP V NP VP V NP

More information

This kind of reordering is beyond the power of finite transducers, but a synchronous CFG can do this.

This kind of reordering is beyond the power of finite transducers, but a synchronous CFG can do this. Chapter 12 Synchronous CFGs Synchronous context-free grammars are a generalization of CFGs that generate pairs of related strings instead of single strings. They are useful in many situations where one

More information

Introduction to Probablistic Natural Language Processing

Introduction to Probablistic Natural Language Processing Introduction to Probablistic Natural Language Processing Alexis Nasr Laboratoire d Informatique Fondamentale de Marseille Natural Language Processing Use computers to process human languages Machine Translation

More information

Computability Theory

Computability Theory CS:4330 Theory of Computation Spring 2018 Computability Theory Decidable Problems of CFLs and beyond Haniel Barbosa Readings for this lecture Chapter 4 of [Sipser 1996], 3rd edition. Section 4.1. Decidable

More information

Natural Language Processing. Lecture 13: More on CFG Parsing

Natural Language Processing. Lecture 13: More on CFG Parsing Natural Language Processing Lecture 13: More on CFG Parsing Probabilistc/Weighted Parsing Example: ambiguous parse Probabilistc CFG Ambiguous parse w/probabilites 0.05 0.05 0.20 0.10 0.30 0.20 0.60 0.75

More information

Chapter 14 (Partially) Unsupervised Parsing

Chapter 14 (Partially) Unsupervised Parsing Chapter 14 (Partially) Unsupervised Parsing The linguistically-motivated tree transformations we discussed previously are very effective, but when we move to a new language, we may have to come up with

More information

Expectation Maximization (EM)

Expectation Maximization (EM) Expectation Maximization (EM) The EM algorithm is used to train models involving latent variables using training data in which the latent variables are not observed (unlabeled data). This is to be contrasted

More information

Parsing with Context-Free Grammars

Parsing with Context-Free Grammars Parsing with Context-Free Grammars CS 585, Fall 2017 Introduction to Natural Language Processing http://people.cs.umass.edu/~brenocon/inlp2017 Brendan O Connor College of Information and Computer Sciences

More information

The Inside-Outside Algorithm

The Inside-Outside Algorithm The Inside-Outside Algorithm Karl Stratos 1 The Setup Our data consists of Q observations O 1,..., O Q where each observation is a sequence of symbols in some set X = {1,..., n}. We suspect a Probabilistic

More information

Parsing with CFGs L445 / L545 / B659. Dept. of Linguistics, Indiana University Spring Parsing with CFGs. Direction of processing

Parsing with CFGs L445 / L545 / B659. Dept. of Linguistics, Indiana University Spring Parsing with CFGs. Direction of processing L445 / L545 / B659 Dept. of Linguistics, Indiana University Spring 2016 1 / 46 : Overview Input: a string Output: a (single) parse tree A useful step in the process of obtaining meaning We can view the

More information

Parsing with CFGs. Direction of processing. Top-down. Bottom-up. Left-corner parsing. Chart parsing CYK. Earley 1 / 46.

Parsing with CFGs. Direction of processing. Top-down. Bottom-up. Left-corner parsing. Chart parsing CYK. Earley 1 / 46. : Overview L545 Dept. of Linguistics, Indiana University Spring 2013 Input: a string Output: a (single) parse tree A useful step in the process of obtaining meaning We can view the problem as searching

More information

Roger Levy Probabilistic Models in the Study of Language draft, October 2,

Roger Levy Probabilistic Models in the Study of Language draft, October 2, Roger Levy Probabilistic Models in the Study of Language draft, October 2, 2012 224 Chapter 10 Probabilistic Grammars 10.1 Outline HMMs PCFGs ptsgs and ptags Highlight: Zuidema et al., 2008, CogSci; Cohn

More information

Parsing. Unger s Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21

Parsing. Unger s Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21 Parsing Unger s Parser Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2016/17 1 / 21 Table of contents 1 Introduction 2 The Parser 3 An Example 4 Optimizations 5 Conclusion 2 / 21 Introduction

More information

Probabilistic Linguistics

Probabilistic Linguistics Matilde Marcolli MAT1509HS: Mathematical and Computational Linguistics University of Toronto, Winter 2019, T 4-6 and W 4, BA6180 Bernoulli measures finite set A alphabet, strings of arbitrary (finite)

More information

Probabilistic Context Free Grammars. Many slides from Michael Collins

Probabilistic Context Free Grammars. Many slides from Michael Collins Probabilistic Context Free Grammars Many slides from Michael Collins Overview I Probabilistic Context-Free Grammars (PCFGs) I The CKY Algorithm for parsing with PCFGs A Probabilistic Context-Free Grammar

More information

Chapter 16: Non-Context-Free Languages

Chapter 16: Non-Context-Free Languages Chapter 16: Non-Context-Free Languages Peter Cappello Department of Computer Science University of California, Santa Barbara Santa Barbara, CA 93106 cappello@cs.ucsb.edu Please read the corresponding chapter

More information

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science Natural Language Processing CS 6840 Lecture 06 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Statistical Parsing Define a probabilistic model of syntax P(T S):

More information

Algorithms for Syntax-Aware Statistical Machine Translation

Algorithms for Syntax-Aware Statistical Machine Translation Algorithms for Syntax-Aware Statistical Machine Translation I. Dan Melamed, Wei Wang and Ben Wellington ew York University Syntax-Aware Statistical MT Statistical involves machine learning (ML) seems crucial

More information

Soft Inference and Posterior Marginals. September 19, 2013

Soft Inference and Posterior Marginals. September 19, 2013 Soft Inference and Posterior Marginals September 19, 2013 Soft vs. Hard Inference Hard inference Give me a single solution Viterbi algorithm Maximum spanning tree (Chu-Liu-Edmonds alg.) Soft inference

More information

A DOP Model for LFG. Rens Bod and Ronald Kaplan. Kathrin Spreyer Data-Oriented Parsing, 14 June 2005

A DOP Model for LFG. Rens Bod and Ronald Kaplan. Kathrin Spreyer Data-Oriented Parsing, 14 June 2005 A DOP Model for LFG Rens Bod and Ronald Kaplan Kathrin Spreyer Data-Oriented Parsing, 14 June 2005 Lexical-Functional Grammar (LFG) Levels of linguistic knowledge represented formally differently (non-monostratal):

More information

Context-Free Grammars. 2IT70 Finite Automata and Process Theory

Context-Free Grammars. 2IT70 Finite Automata and Process Theory Context-Free Grammars 2IT70 Finite Automata and Process Theory Technische Universiteit Eindhoven May 18, 2016 Generating strings language L 1 = {a n b n n > 0} ab L 1 if w L 1 then awb L 1 production rules

More information

Probabilistic Context Free Grammars. Many slides from Michael Collins and Chris Manning

Probabilistic Context Free Grammars. Many slides from Michael Collins and Chris Manning Probabilistic Context Free Grammars Many slides from Michael Collins and Chris Manning Overview I Probabilistic Context-Free Grammars (PCFGs) I The CKY Algorithm for parsing with PCFGs A Probabilistic

More information

Natural Language Processing 1. lecture 7: constituent parsing. Ivan Titov. Institute for Logic, Language and Computation

Natural Language Processing 1. lecture 7: constituent parsing. Ivan Titov. Institute for Logic, Language and Computation atural Language Processing 1 lecture 7: constituent parsing Ivan Titov Institute for Logic, Language and Computation Outline Syntax: intro, CFGs, PCFGs PCFGs: Estimation CFGs: Parsing PCFGs: Parsing Parsing

More information

LECTURER: BURCU CAN Spring

LECTURER: BURCU CAN Spring LECTURER: BURCU CAN 2017-2018 Spring Regular Language Hidden Markov Model (HMM) Context Free Language Context Sensitive Language Probabilistic Context Free Grammar (PCFG) Unrestricted Language PCFGs can

More information

Statistical Methods for NLP

Statistical Methods for NLP Statistical Methods for NLP Stochastic Grammars Joakim Nivre Uppsala University Department of Linguistics and Philology joakim.nivre@lingfil.uu.se Statistical Methods for NLP 1(22) Structured Classification

More information

CYK Algorithm for Parsing General Context-Free Grammars

CYK Algorithm for Parsing General Context-Free Grammars CYK Algorithm for Parsing General Context-Free Grammars Why Parse General Grammars Can be difficult or impossible to make grammar unambiguous thus LL(k) and LR(k) methods cannot work, for such ambiguous

More information

Section 1 (closed-book) Total points 30

Section 1 (closed-book) Total points 30 CS 454 Theory of Computation Fall 2011 Section 1 (closed-book) Total points 30 1. Which of the following are true? (a) a PDA can always be converted to an equivalent PDA that at each step pops or pushes

More information

Introduction to Theory of Computing

Introduction to Theory of Computing CSCI 2670, Fall 2012 Introduction to Theory of Computing Department of Computer Science University of Georgia Athens, GA 30602 Instructor: Liming Cai www.cs.uga.edu/ cai 0 Lecture Note 3 Context-Free Languages

More information

CMPT-825 Natural Language Processing. Why are parsing algorithms important?

CMPT-825 Natural Language Processing. Why are parsing algorithms important? CMPT-825 Natural Language Processing Anoop Sarkar http://www.cs.sfu.ca/ anoop October 26, 2010 1/34 Why are parsing algorithms important? A linguistic theory is implemented in a formal system to generate

More information

(pp ) PDAs and CFGs (Sec. 2.2)

(pp ) PDAs and CFGs (Sec. 2.2) (pp. 117-124) PDAs and CFGs (Sec. 2.2) A language is context free iff all strings in L can be generated by some context free grammar Theorem 2.20: L is Context Free iff a PDA accepts it I.e. if L is context

More information

Aspects of Tree-Based Statistical Machine Translation

Aspects of Tree-Based Statistical Machine Translation Aspects of Tree-Based Statistical Machine Translation Marcello Federico Human Language Technology FBK 2014 Outline Tree-based translation models: Synchronous context free grammars Hierarchical phrase-based

More information

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Lecture 14 Ana Bove May 14th 2018 Recap: Context-free Grammars Simplification of grammars: Elimination of ǫ-productions; Elimination of

More information

d(ν) = max{n N : ν dmn p n } N. p d(ν) (ν) = ρ.

d(ν) = max{n N : ν dmn p n } N. p d(ν) (ν) = ρ. 1. Trees; context free grammars. 1.1. Trees. Definition 1.1. By a tree we mean an ordered triple T = (N, ρ, p) (i) N is a finite set; (ii) ρ N ; (iii) p : N {ρ} N ; (iv) if n N + and ν dmn p n then p n

More information

Solutions to Problem Set 3

Solutions to Problem Set 3 V22.0453-001 Theory of Computation October 8, 2003 TA: Nelly Fazio Solutions to Problem Set 3 Problem 1 We have seen that a grammar where all productions are of the form: A ab, A c (where A, B non-terminals,

More information

Non-context-Free Languages. CS215, Lecture 5 c

Non-context-Free Languages. CS215, Lecture 5 c Non-context-Free Languages CS215, Lecture 5 c 2007 1 The Pumping Lemma Theorem. (Pumping Lemma) Let be context-free. There exists a positive integer divided into five pieces, Proof for for each, and..

More information

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26 Parsing Context-Free Grammars (CFG) Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2017/18 1 / 26 Table of contents 1 Context-Free Grammars 2 Simplifying CFGs Removing useless symbols Eliminating

More information

Context- Free Parsing with CKY. October 16, 2014

Context- Free Parsing with CKY. October 16, 2014 Context- Free Parsing with CKY October 16, 2014 Lecture Plan Parsing as Logical DeducBon Defining the CFG recognibon problem BoHom up vs. top down Quick review of Chomsky normal form The CKY algorithm

More information

Improved TBL algorithm for learning context-free grammar

Improved TBL algorithm for learning context-free grammar Proceedings of the International Multiconference on ISSN 1896-7094 Computer Science and Information Technology, pp. 267 274 2007 PIPS Improved TBL algorithm for learning context-free grammar Marcin Jaworski

More information

Context Free Grammars

Context Free Grammars Automata and Formal Languages Context Free Grammars Sipser pages 101-111 Lecture 11 Tim Sheard 1 Formal Languages 1. Context free languages provide a convenient notation for recursive description of languages.

More information

An introduction to PRISM and its applications

An introduction to PRISM and its applications An introduction to PRISM and its applications Yoshitaka Kameya Tokyo Institute of Technology 2007/9/17 FJ-2007 1 Contents What is PRISM? Two examples: from population genetics from statistical natural

More information

October 6, Equivalence of Pushdown Automata with Context-Free Gramm

October 6, Equivalence of Pushdown Automata with Context-Free Gramm Equivalence of Pushdown Automata with Context-Free Grammar October 6, 2013 Motivation Motivation CFG and PDA are equivalent in power: a CFG generates a context-free language and a PDA recognizes a context-free

More information

Suppose h maps number and variables to ɛ, and opening parenthesis to 0 and closing parenthesis

Suppose h maps number and variables to ɛ, and opening parenthesis to 0 and closing parenthesis 1 Introduction Parenthesis Matching Problem Describe the set of arithmetic expressions with correctly matched parenthesis. Arithmetic expressions with correctly matched parenthesis cannot be described

More information

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Natural Language Processing CS 4120/6120 Spring 2017 Northeastern University David Smith with some slides from Jason Eisner & Andrew

More information

Remembering subresults (Part I): Well-formed substring tables

Remembering subresults (Part I): Well-formed substring tables Remembering subresults (Part I): Well-formed substring tables Detmar Meurers: Intro to Computational Linguistics I OSU, LING 684.01, 1. February 2005 Problem: Inefficiency of recomputing subresults Two

More information

Lecture 12: Algorithms for HMMs

Lecture 12: Algorithms for HMMs Lecture 12: Algorithms for HMMs Nathan Schneider (some slides from Sharon Goldwater; thanks to Jonathan May for bug fixes) ENLP 26 February 2018 Recap: tagging POS tagging is a sequence labelling task.

More information

Doctoral Course in Speech Recognition. May 2007 Kjell Elenius

Doctoral Course in Speech Recognition. May 2007 Kjell Elenius Doctoral Course in Speech Recognition May 2007 Kjell Elenius CHAPTER 12 BASIC SEARCH ALGORITHMS State-based search paradigm Triplet S, O, G S, set of initial states O, set of operators applied on a state

More information

Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write:

Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write: Languages A language is a set (usually infinite) of strings, also known as sentences Each string consists of a sequence of symbols taken from some alphabet An alphabet, V, is a finite set of symbols, e.g.

More information

Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars

Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars Salil Vadhan October 2, 2012 Reading: Sipser, 2.1 (except Chomsky Normal Form). Algorithmic questions about regular

More information

CSCI Compiler Construction

CSCI Compiler Construction CSCI 742 - Compiler Construction Lecture 12 Cocke-Younger-Kasami (CYK) Algorithm Instructor: Hossein Hojjat February 20, 2017 Recap: Chomsky Normal Form (CNF) A CFG is in Chomsky Normal Form if each rule

More information

Parsing beyond context-free grammar: Parsing Multiple Context-Free Grammars

Parsing beyond context-free grammar: Parsing Multiple Context-Free Grammars Parsing beyond context-free grammar: Parsing Multiple Context-Free Grammars Laura Kallmeyer, Wolfgang Maier University of Tübingen ESSLLI Course 2008 Parsing beyond CFG 1 MCFG Parsing Multiple Context-Free

More information

CMSC 723: Computational Linguistics I Session #5 Hidden Markov Models. The ischool University of Maryland. Wednesday, September 30, 2009

CMSC 723: Computational Linguistics I Session #5 Hidden Markov Models. The ischool University of Maryland. Wednesday, September 30, 2009 CMSC 723: Computational Linguistics I Session #5 Hidden Markov Models Jimmy Lin The ischool University of Maryland Wednesday, September 30, 2009 Today s Agenda The great leap forward in NLP Hidden Markov

More information

Handout 8: Computation & Hierarchical parsing II. Compute initial state set S 0 Compute initial state set S 0

Handout 8: Computation & Hierarchical parsing II. Compute initial state set S 0 Compute initial state set S 0 Massachusetts Institute of Technology 6.863J/9.611J, Natural Language Processing, Spring, 2001 Department of Electrical Engineering and Computer Science Department of Brain and Cognitive Sciences Handout

More information

Tree Adjoining Grammars

Tree Adjoining Grammars Tree Adjoining Grammars TAG: Parsing and formal properties Laura Kallmeyer & Benjamin Burkhardt HHU Düsseldorf WS 2017/2018 1 / 36 Outline 1 Parsing as deduction 2 CYK for TAG 3 Closure properties of TALs

More information

Parsing. Unger s Parser. Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is

Parsing. Unger s Parser. Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is Unger s Parser Laura Heinrich-Heine-Universität Düsseldorf Wintersemester 2012/2013 a top-down parser: we start with S and

More information

Parsing. Left-Corner Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 17

Parsing. Left-Corner Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 17 Parsing Left-Corner Parsing Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2017/18 1 / 17 Table of contents 1 Motivation 2 Algorithm 3 Look-ahead 4 Chart Parsing 2 / 17 Motivation Problems

More information

Notes for Comp 497 (Comp 454) Week 10 4/5/05

Notes for Comp 497 (Comp 454) Week 10 4/5/05 Notes for Comp 497 (Comp 454) Week 10 4/5/05 Today look at the last two chapters in Part II. Cohen presents some results concerning context-free languages (CFL) and regular languages (RL) also some decidability

More information

CPS 220 Theory of Computation

CPS 220 Theory of Computation CPS 22 Theory of Computation Review - Regular Languages RL - a simple class of languages that can be represented in two ways: 1 Machine description: Finite Automata are machines with a finite number of

More information

Computational Models - Lecture 4

Computational Models - Lecture 4 Computational Models - Lecture 4 Regular languages: The Myhill-Nerode Theorem Context-free Grammars Chomsky Normal Form Pumping Lemma for context free languages Non context-free languages: Examples Push

More information

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL)

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL) CS5371 Theory of Computation Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL) Objectives Introduce Pumping Lemma for CFL Apply Pumping Lemma to show that some languages are non-cfl Pumping Lemma

More information

Lecture 12: Algorithms for HMMs

Lecture 12: Algorithms for HMMs Lecture 12: Algorithms for HMMs Nathan Schneider (some slides from Sharon Goldwater; thanks to Jonathan May for bug fixes) ENLP 17 October 2016 updated 9 September 2017 Recap: tagging POS tagging is a

More information

Lecture 5: UDOP, Dependency Grammars

Lecture 5: UDOP, Dependency Grammars Lecture 5: UDOP, Dependency Grammars Jelle Zuidema ILLC, Universiteit van Amsterdam Unsupervised Language Learning, 2014 Generative Model objective PCFG PTSG CCM DMV heuristic Wolff (1984) UDOP ML IO K&M

More information

Chapter 4: Context-Free Grammars

Chapter 4: Context-Free Grammars Chapter 4: Context-Free Grammars 4.1 Basics of Context-Free Grammars Definition A context-free grammars, or CFG, G is specified by a quadruple (N, Σ, P, S), where N is the nonterminal or variable alphabet;

More information

Properties of context-free Languages

Properties of context-free Languages Properties of context-free Languages We simplify CFL s. Greibach Normal Form Chomsky Normal Form We prove pumping lemma for CFL s. We study closure properties and decision properties. Some of them remain,

More information

(pp ) PDAs and CFGs (Sec. 2.2)

(pp ) PDAs and CFGs (Sec. 2.2) (pp. 117-124) PDAs and CFGs (Sec. 2.2) A language is context free iff all strings in L can be generated by some context free grammar Theorem 2.20: L is Context Free iff a PDA accepts it I.e. if L is context

More information

Recap: HMM. ANLP Lecture 9: Algorithms for HMMs. More general notation. Recap: HMM. Elements of HMM: Sharon Goldwater 4 Oct 2018.

Recap: HMM. ANLP Lecture 9: Algorithms for HMMs. More general notation. Recap: HMM. Elements of HMM: Sharon Goldwater 4 Oct 2018. Recap: HMM ANLP Lecture 9: Algorithms for HMMs Sharon Goldwater 4 Oct 2018 Elements of HMM: Set of states (tags) Output alphabet (word types) Start state (beginning of sentence) State transition probabilities

More information

Statistical Methods for NLP

Statistical Methods for NLP Statistical Methods for NLP Sequence Models Joakim Nivre Uppsala University Department of Linguistics and Philology joakim.nivre@lingfil.uu.se Statistical Methods for NLP 1(21) Introduction Structured

More information

Simplification and Normalization of Context-Free Grammars

Simplification and Normalization of Context-Free Grammars implification and Normalization, of Context-Free Grammars 20100927 Slide 1 of 23 Simplification and Normalization of Context-Free Grammars 5DV037 Fundamentals of Computer Science Umeå University Department

More information

Hidden Markov Models

Hidden Markov Models Hidden Markov Models Slides mostly from Mitch Marcus and Eric Fosler (with lots of modifications). Have you seen HMMs? Have you seen Kalman filters? Have you seen dynamic programming? HMMs are dynamic

More information

Pushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata.

Pushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata. Pushdown Automata We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata. Next we consider a more powerful computation model, called a

More information

The Pumping Lemma for Context Free Grammars

The Pumping Lemma for Context Free Grammars The Pumping Lemma for Context Free Grammars Chomsky Normal Form Chomsky Normal Form (CNF) is a simple and useful form of a CFG Every rule of a CNF grammar is in the form A BC A a Where a is any terminal

More information

MTH401A Theory of Computation. Lecture 17

MTH401A Theory of Computation. Lecture 17 MTH401A Theory of Computation Lecture 17 Chomsky Normal Form for CFG s Chomsky Normal Form for CFG s For every context free language, L, the language L {ε} has a grammar in which every production looks

More information