Probabilistic Context Free Grammars
|
|
- Arline Hines
- 6 years ago
- Views:
Transcription
1 1 Defining PCFGs A PCFG G consists of Probabilistic Context Free Grammars 1. A set of terminals: {w k }, k = 1..., V 2. A set of non terminals: { i }, i = 1..., n 3. A designated Start symbol: 1 4. A set of rules: { i ξ j }, i = 1..., n 5. A probability function P which assigns probabilities to rules so that, for all nonterminals i Conventions: P( i ξ j ) = 1 j otation Meaning G Grammar (PCFG) L Language (generated or accepted by a grammar) t parse tree { 1,..., n } onterminal vocabulary ( 1 start symbol) {w 1,..., w V } Terminal vocabulary w 1... w m Sentence to be parsed j pq onterminal j spans positions p through q in sentence α j (p, q) Outside probabilities β j (p, q) Inside probabilities j w p... w q onterminal j dominates words w p through w q in sentence (not necessarily directly) A key property of PCFGs is what we will call the independence assumption: Independence Assumption The probability of a node sequence s r depends only on the immediate mother node, not any node above that or outside the current constituent. We first address the problem of finding the probability of a string given a PCFG grammar. This is directly useful for some applications, such as filtering the hypoethses of a speech recognizer. But considering some efficient algorithms for calculating string probabilities will have a sideeffect. It will provide a line of attack on a somewhat more central problem for this course: finding the most probable parse of a string. 1
2 2 Example Thus the only parameters of a PCFG grammar are the rule probabilities, as in the following simple PCFG: (1) S P VP 1.0 P P PP 0.4 PP P P 1.0 P astronomers 0.1 VP V P 0.7 P ears 0.18 VP VP PP 0.3 P saw 0.04 P with 1.0 P stars 0.18 V saw 1.0 P telescopes 0.1 otice the constraint P(P ξ j ) = 1 is met. The sum of the rule probabilities for all the P rules is 1. We consider the probability of the sentence (2) Astronomers saw stars with ears. j with respect to this grammar, which admits the two analyses shown in Figures 1 and 2. The probability of each anaylsis is computed simply by multiplying the probabilities of the rules. We compute the probability of the string by summing the probabilities of its two analyses: P(w 15 ) = P(t 1 ) + P(t 2 ) = = Two algorithms for efficiently computing string probabilities The naive approach to computing string probabilities: Compute the probability of each parse and add. But: The number of parses of a string with a CFG grammar in general grows exponentially with the length of the string. Therefore this is inefficient. 3.1 Inside Probabilities P(w 1,m G) = P( 1 w 1,m G) = P(w 1,m 1 1,m, G) = β 1 (1, m) The probability of a wordstring given the grammar equals the probability that the start symbol exhaustively dominates the string, which is the probability of the string w 1,m given that the start symbol dominates words 1 through m, which we call the inside probability of the string for the span 1m and the category 1. The general definition of an inside probability: β j (p, q) = P(w p,q j p,q, G) 2
3 There is an an algotrithm for computing the inside probability of a strinng known as the inside algorithm. It breaks down to a base case and an induction: 1. Base case: The event that a nonterminal j exhaustively dominates the word at position k is j kk and the probability that that happens given the grammar is: P( j kk ) = β j(k, k) = P( j w k G) The point of the β notation is that it uses only indices, indices of word positions (subscripts) and grammar elements (superscripts). The point of the 2nd line is that this reduces to a fact about the grammar, irrespective of word positions. 2. Induction We assume a Chomsky normal grammar, so the situation is as pictured in Figure 3. Then β j (p, q) = P(w pq j pq, G) (1) = r,s q 1 P(w pd, r pd, w(d + 1)q, s (d+1)q j pq, G) (2) d=p We regard the probability of an analysis of w pq as an instance of category j as the sum of the probabilities of an analysis using some way of expanding j ; each such analysis uses some production: j r s Following the picture in Figure 3, we split the word string into two halves at word d, p d < q, w pd and w (d+1)q, one corresponding to r and one to s. Thus the jointly occurring events in (2) are wordstring w pd being analyzed as r and word string w (d+1)q being analyzed as s. β j (p, q) = r,s q 1 P( r pd, s (d+1)q j pq, G) P(w pd j pq, r pd, s (d+1)q, G) d=p P(w (d+1)q j pq, r pd, s (d+1)q, w pd, G) (3) Equation (3) uses the chain rule on the joint events of (2). ow we use the independence assumptions of CFGs to conclude: β j (p, q) = r,s q 1 d=p P( r pd, s (d+1)q j pq, G) P(w pd r pd, G) P(w (d+1)q s (d+1)q, G) (4) And substituting in the definitions of an inside probability and a rule probability, we have the basic equation we can compute with: β j (p, q) = r,s q 1 P( j r s j pq, G) β r (p, d) β s (d + 1, q) (5) d=p 3
4 This means we can compute inside probabilities bottum up. We use a dynamic programming algorithm, and as is customary represent the computations in a table. Cell (i, j) for row i and column j contains the result of the β computations for the string beginning with word i and ending with word j. First we do all the word βs: β P = β P = 0.04 β V = β P = β P = β P = 0.18 astronomers saw stars with ears We now proceed as with CYK, trying to build constituents of length 2 next: β P = β P = 0.04 β VP = β V = β P = β P = 0.1 β PP = β P = 0.18 astronomers saw stars with ears The computation of β VP (2, 3) and β PP (4, 5) And in the next round: β VP (2, 3) = P(VP V P) β V (2, 2) β P (3, 3) (6) = (7) = (8) β PP (4, 5) = P(PP P P) β P (4, 4) β P (5, 5) (9) = (10) = 0.18 (11) β P = 0.1 β S = β P = 0.04 β V = 1.0 β VP = β P = 0.18 β P = β P = 0.1 β PP = β P = 0.18 astronomers saw stars with ears 4
5 The computation of β S (1, 3) and β P (3, 5) β S (1, 3) = P(S P VP) β P (1, 1) β VP (2, 3) (12) = (13) = (14) β P (3, 5) = P(P P PP) β P (3, 3) β PP (4, 5) (15) And in the only round with an ambiguous constituent: = (16) = (17) β P = 0.1 β S = β P = 0.04 β VP = β VP = β V = β P = 0.18 β P = β P = 0.1 β PP = β P = 0.18 astronomers saw stars with ears The computation of β VP (2, 5) : β VP (2, 5) = (P(VP V P) β V (2, 2) β P (3, 5)) + (P(VP VP PP) β VP (2, 3) β PP (4, 5)) (18) = ( ) + ( ) (19) = (20) = (21) The probability table (or chart) may be completed in the next step: β P = 0.1 β S = β S = β P = 0.04 β VP = β VP = β V = β P = 0.18 β P = β P = 0.1 β PP = β P = 0.18 astronomers saw stars with ears The computation of β S (1, 5) is left as an exercise. (22) 5
6 3.2 Outside Probabilities We define α j (p, q), the outside probability for a category j and a span w pq as α j (p, q) = P(w 1(p 1), j pq, w (q+1)m G) This is the probability of generating j covering the span (p, q) (whatever the words in it) along with the all the words outside it. Using the notion outside probability, we can view the problem of computing the probability of a string from the point of view of any word in it. Pick a word w k. The following is true: P(w 1,m G) = j P(w 1,(k 1), w k, w (k+1)m, j kk G) = j P(w 1,(k 1), w (k+1)m, j kk G) P(w k w 1,(k 1), j kk, w (k+1)m, G) = α j (k, k) P( j w k ) We start with the observation that any assignment of a category to w k fixes a probability for analyses of the entire string compatible with that assignment. So we sum up such analysis probabilities for each category j in the grammar. We then apply the chain rule, and the independence assumptions of PCFGs, and the definition of α j (p, q) to derive an expression for the probability of the entire string. So if we knew the α j (k, k) values, computing the string probability would be easy. We now show the outside algorithm, a method for solving the more general problem of computing α j (p, q). This is done topdown. We break it down as before into a base case and an induction: 1. Base case: α 1 (1, m) = 1 = 0 for j 1 2. Induction The situation is as pictured in Figure 4 or Figure 5 Then α j (p, q) = [ m f,g j e=q+1 P(w 1(p 1), w(q + 1)m, f pe, j pq, g (q+1)e )] +[ p 1 f,g e=1 P(w 1(p 1), w(q + 1)m, f eq, g e(p 1), j pq)] The first double sum is the case of Figure 4. The second is the case of Figure 5. In the first case, we restrict the two categories not to be equal, so as not to count rules of the form: twice. X Y Y α j (p, q) = [ m f,g j e=q+1 P(w 1(p 1), w(q + 1)m, f pe) P( j pq, g (q+1)e f pe) P(w(q + 1)m g (q+1)e )] + [ p 1 f,g e=1 P(w 1(e 1), w(q + 1)m, f eq) P( g e(p 1), j pq f eq) P(w e(p 1) g e(p 1) )] = [ m f,g j e=q+1 α f(p, e)p( f j g )β g (q + 1, e)] +[ p 1 f,g e=1 α f(e, q)p( j g j )β g (e, p 1)] (23) 6
7 4 Finding the most probable parse We define δ j (p, q) as the highest probability parse of w pq as j What follows is a Viterbi like algorithm for finding the best parse. Corresponding to Viterbi valkues we will have δ values. Corresponding to a Viterbi bnacktrace we will have poointers to the subtrees that help us produce our highest probvability parse for each (category,start, end) triple. 1. Initialization: δ i (p, p) = P( i w p ) 2. Induction: δ i (p, q) = max P( i j k )δ j (p, r)δ k (r + 1, q) 1 j,k n p r<q Store backtrace: Ψ i (p, q) = argmax j,k,r P( i j k )δ j (p, r)δ k (r + 1, q) 3. Termination: The probability of the max probability tree ˆt is: P(ˆt) = δ 1 (1, m) In addition we know the mother node and span of the highest probability parse: 1 and (p, q). In general given a mother node and a span (i, p, q), we can construct a highest probability parse tree below it by following the Ψ pointers: Let Ψ i (p, q) = (j, k, r) Then left daughter = j pr right daughter = k (r+1)q 7
8 S 1.0 P 0.1 VP 0.7 Astronomers V 1.0 P 0.4 saw P 0.18 PP 1.0 stars P 1.0 P 0.18 with ears P(t 1 ) = = Figure 1: t 1 S 1.0 P 0.1 VP 0.3 Astronomers VP 0.7 PP 1.0 V 1.0 P 0.18 P 1.0 P 0.18 saw stars with ears P(t 2 ) = = Figure 2: t 2 j r w p... w d s w d+1... w q Figure 3: Computation of inside probability: β j (p, q) 8
9 1 w 1... w p 1. f w e+1... w m j w p... w q g w q+1... w e Figure 4: Computation of outside probability (left category): α j (p, q) 9
10 1 w 1... w e 1. f w q+1... w m g w e... w p 1 j w p... w q Figure 5: Computation of outside probability (right category): α j (p, q) 10
PCFGs 2 L645 / B659. Dept. of Linguistics, Indiana University Fall PCFGs 2. Questions. Calculating P(w 1m ) Inside Probabilities
1 / 22 Inside L645 / B659 Dept. of Linguistics, Indiana University Fall 2015 Inside- 2 / 22 for PCFGs 3 questions for Probabilistic Context Free Grammars (PCFGs): What is the probability of a sentence
More informationProbabilistic Context-Free Grammar
Probabilistic Context-Free Grammar Petr Horáček, Eva Zámečníková and Ivana Burgetová Department of Information Systems Faculty of Information Technology Brno University of Technology Božetěchova 2, 612
More informationNatural Language Processing : Probabilistic Context Free Grammars. Updated 5/09
Natural Language Processing : Probabilistic Context Free Grammars Updated 5/09 Motivation N-gram models and HMM Tagging only allowed us to process sentences linearly. However, even simple sentences require
More information{Probabilistic Stochastic} Context-Free Grammars (PCFGs)
{Probabilistic Stochastic} Context-Free Grammars (PCFGs) 116 The velocity of the seismic waves rises to... S NP sg VP sg DT NN PP risesto... The velocity IN NP pl of the seismic waves 117 PCFGs APCFGGconsists
More informationCS : Speech, NLP and the Web/Topics in AI
CS626-449: Speech, NLP and the Web/Topics in AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture-17: Probabilistic parsing; insideoutside probabilities Probability of a parse tree (cont.) S 1,l NP 1,2
More informationProcessing/Speech, NLP and the Web
CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 25 Probabilistic Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 14 th March, 2011 Bracketed Structure: Treebank Corpus [ S1[
More informationParsing. Probabilistic CFG (PCFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 22
Parsing Probabilistic CFG (PCFG) Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2017/18 1 / 22 Table of contents 1 Introduction 2 PCFG 3 Inside and outside probability 4 Parsing Jurafsky
More informationMaschinelle Sprachverarbeitung
Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other
More informationMaschinelle Sprachverarbeitung
Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other
More informationAttendee information. Seven Lectures on Statistical Parsing. Phrase structure grammars = context-free grammars. Assessment.
even Lectures on tatistical Parsing Christopher Manning LA Linguistic Institute 7 LA Lecture Attendee information Please put on a piece of paper: ame: Affiliation: tatus (undergrad, grad, industry, prof,
More informationProbabilistic Context-free Grammars
Probabilistic Context-free Grammars Computational Linguistics Alexander Koller 24 November 2017 The CKY Recognizer S NP VP NP Det N VP V NP V ate NP John Det a N sandwich i = 1 2 3 4 k = 2 3 4 5 S NP John
More informationReview. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering
Review Earley Algorithm Chapter 13.4 Lecture #9 October 2009 Top-Down vs. Bottom-Up Parsers Both generate too many useless trees Combine the two to avoid over-generation: Top-Down Parsing with Bottom-Up
More informationA* Search. 1 Dijkstra Shortest Path
A* Search Consider the eight puzzle. There are eight tiles numbered 1 through 8 on a 3 by three grid with nine locations so that one location is left empty. We can move by sliding a tile adjacent to the
More informationProbabilistic Context-Free Grammars. Michael Collins, Columbia University
Probabilistic Context-Free Grammars Michael Collins, Columbia University Overview Probabilistic Context-Free Grammars (PCFGs) The CKY Algorithm for parsing with PCFGs A Probabilistic Context-Free Grammar
More informationParsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)
Parsing Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) S N VP V NP D N John hit the ball Levels of analysis Level Morphology/Lexical POS (morpho-synactic), WSD Elements
More informationIntroduction to Computational Linguistics
Introduction to Computational Linguistics Olga Zamaraeva (2018) Based on Bender (prev. years) University of Washington May 3, 2018 1 / 101 Midterm Project Milestone 2: due Friday Assgnments 4& 5 due dates
More informationAdvanced Natural Language Processing Syntactic Parsing
Advanced Natural Language Processing Syntactic Parsing Alicia Ageno ageno@cs.upc.edu Universitat Politècnica de Catalunya NLP statistical parsing 1 Parsing Review Statistical Parsing SCFG Inside Algorithm
More informationCS460/626 : Natural Language
CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 23, 24 Parsing Algorithms; Parsing in case of Ambiguity; Probabilistic Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 8 th,
More informationGrammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG
Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing Laura Kallmeyer, Timm Lichte, Wolfgang Maier Universität Tübingen Part I Formal Properties of TAG 16.05.2007 und 21.05.2007 TAG Parsing
More informationCKY & Earley Parsing. Ling 571 Deep Processing Techniques for NLP January 13, 2016
CKY & Earley Parsing Ling 571 Deep Processing Techniques for NLP January 13, 2016 No Class Monday: Martin Luther King Jr. Day CKY Parsing: Finish the parse Recognizer à Parser Roadmap Earley parsing Motivation:
More informationProperties of Context-Free Languages
Properties of Context-Free Languages Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationCS460/626 : Natural Language
CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 27 SMT Assignment; HMM recap; Probabilistic Parsing cntd) Pushpak Bhattacharyya CSE Dept., IIT Bombay 17 th March, 2011 CMU Pronunciation
More informationEven More on Dynamic Programming
Algorithms & Models of Computation CS/ECE 374, Fall 2017 Even More on Dynamic Programming Lecture 15 Thursday, October 19, 2017 Sariel Har-Peled (UIUC) CS374 1 Fall 2017 1 / 26 Part I Longest Common Subsequence
More informationIn this chapter, we explore the parsing problem, which encompasses several questions, including:
Chapter 12 Parsing Algorithms 12.1 Introduction In this chapter, we explore the parsing problem, which encompasses several questions, including: Does L(G) contain w? What is the highest-weight derivation
More informationQuiz 1, COMS Name: Good luck! 4705 Quiz 1 page 1 of 7
Quiz 1, COMS 4705 Name: 10 30 30 20 Good luck! 4705 Quiz 1 page 1 of 7 Part #1 (10 points) Question 1 (10 points) We define a PCFG where non-terminal symbols are {S,, B}, the terminal symbols are {a, b},
More informationTo make a grammar probabilistic, we need to assign a probability to each context-free rewrite
Notes on the Inside-Outside Algorithm To make a grammar probabilistic, we need to assign a probability to each context-free rewrite rule. But how should these probabilities be chosen? It is natural to
More informationEinführung in die Computerlinguistik
Einführung in die Computerlinguistik Context-Free Grammars (CFG) Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Summer 2016 1 / 22 CFG (1) Example: Grammar G telescope : Productions: S NP VP NP
More informationParsing with Context-Free Grammars
Parsing with Context-Free Grammars Berlin Chen 2005 References: 1. Natural Language Understanding, chapter 3 (3.1~3.4, 3.6) 2. Speech and Language Processing, chapters 9, 10 NLP-Berlin Chen 1 Grammars
More informationNatural Language Processing
SFU NatLangLab Natural Language Processing Anoop Sarkar anoopsarkar.github.io/nlp-class Simon Fraser University September 27, 2018 0 Natural Language Processing Anoop Sarkar anoopsarkar.github.io/nlp-class
More informationDecoding and Inference with Syntactic Translation Models
Decoding and Inference with Syntactic Translation Models March 5, 2013 CFGs S NP VP VP NP V V NP NP CFGs S NP VP S VP NP V V NP NP CFGs S NP VP S VP NP V NP VP V NP NP CFGs S NP VP S VP NP V NP VP V NP
More informationThis kind of reordering is beyond the power of finite transducers, but a synchronous CFG can do this.
Chapter 12 Synchronous CFGs Synchronous context-free grammars are a generalization of CFGs that generate pairs of related strings instead of single strings. They are useful in many situations where one
More informationIntroduction to Probablistic Natural Language Processing
Introduction to Probablistic Natural Language Processing Alexis Nasr Laboratoire d Informatique Fondamentale de Marseille Natural Language Processing Use computers to process human languages Machine Translation
More informationComputability Theory
CS:4330 Theory of Computation Spring 2018 Computability Theory Decidable Problems of CFLs and beyond Haniel Barbosa Readings for this lecture Chapter 4 of [Sipser 1996], 3rd edition. Section 4.1. Decidable
More informationNatural Language Processing. Lecture 13: More on CFG Parsing
Natural Language Processing Lecture 13: More on CFG Parsing Probabilistc/Weighted Parsing Example: ambiguous parse Probabilistc CFG Ambiguous parse w/probabilites 0.05 0.05 0.20 0.10 0.30 0.20 0.60 0.75
More informationChapter 14 (Partially) Unsupervised Parsing
Chapter 14 (Partially) Unsupervised Parsing The linguistically-motivated tree transformations we discussed previously are very effective, but when we move to a new language, we may have to come up with
More informationExpectation Maximization (EM)
Expectation Maximization (EM) The EM algorithm is used to train models involving latent variables using training data in which the latent variables are not observed (unlabeled data). This is to be contrasted
More informationParsing with Context-Free Grammars
Parsing with Context-Free Grammars CS 585, Fall 2017 Introduction to Natural Language Processing http://people.cs.umass.edu/~brenocon/inlp2017 Brendan O Connor College of Information and Computer Sciences
More informationThe Inside-Outside Algorithm
The Inside-Outside Algorithm Karl Stratos 1 The Setup Our data consists of Q observations O 1,..., O Q where each observation is a sequence of symbols in some set X = {1,..., n}. We suspect a Probabilistic
More informationParsing with CFGs L445 / L545 / B659. Dept. of Linguistics, Indiana University Spring Parsing with CFGs. Direction of processing
L445 / L545 / B659 Dept. of Linguistics, Indiana University Spring 2016 1 / 46 : Overview Input: a string Output: a (single) parse tree A useful step in the process of obtaining meaning We can view the
More informationParsing with CFGs. Direction of processing. Top-down. Bottom-up. Left-corner parsing. Chart parsing CYK. Earley 1 / 46.
: Overview L545 Dept. of Linguistics, Indiana University Spring 2013 Input: a string Output: a (single) parse tree A useful step in the process of obtaining meaning We can view the problem as searching
More informationRoger Levy Probabilistic Models in the Study of Language draft, October 2,
Roger Levy Probabilistic Models in the Study of Language draft, October 2, 2012 224 Chapter 10 Probabilistic Grammars 10.1 Outline HMMs PCFGs ptsgs and ptags Highlight: Zuidema et al., 2008, CogSci; Cohn
More informationParsing. Unger s Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21
Parsing Unger s Parser Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2016/17 1 / 21 Table of contents 1 Introduction 2 The Parser 3 An Example 4 Optimizations 5 Conclusion 2 / 21 Introduction
More informationProbabilistic Linguistics
Matilde Marcolli MAT1509HS: Mathematical and Computational Linguistics University of Toronto, Winter 2019, T 4-6 and W 4, BA6180 Bernoulli measures finite set A alphabet, strings of arbitrary (finite)
More informationProbabilistic Context Free Grammars. Many slides from Michael Collins
Probabilistic Context Free Grammars Many slides from Michael Collins Overview I Probabilistic Context-Free Grammars (PCFGs) I The CKY Algorithm for parsing with PCFGs A Probabilistic Context-Free Grammar
More informationChapter 16: Non-Context-Free Languages
Chapter 16: Non-Context-Free Languages Peter Cappello Department of Computer Science University of California, Santa Barbara Santa Barbara, CA 93106 cappello@cs.ucsb.edu Please read the corresponding chapter
More informationNatural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science
Natural Language Processing CS 6840 Lecture 06 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Statistical Parsing Define a probabilistic model of syntax P(T S):
More informationAlgorithms for Syntax-Aware Statistical Machine Translation
Algorithms for Syntax-Aware Statistical Machine Translation I. Dan Melamed, Wei Wang and Ben Wellington ew York University Syntax-Aware Statistical MT Statistical involves machine learning (ML) seems crucial
More informationSoft Inference and Posterior Marginals. September 19, 2013
Soft Inference and Posterior Marginals September 19, 2013 Soft vs. Hard Inference Hard inference Give me a single solution Viterbi algorithm Maximum spanning tree (Chu-Liu-Edmonds alg.) Soft inference
More informationA DOP Model for LFG. Rens Bod and Ronald Kaplan. Kathrin Spreyer Data-Oriented Parsing, 14 June 2005
A DOP Model for LFG Rens Bod and Ronald Kaplan Kathrin Spreyer Data-Oriented Parsing, 14 June 2005 Lexical-Functional Grammar (LFG) Levels of linguistic knowledge represented formally differently (non-monostratal):
More informationContext-Free Grammars. 2IT70 Finite Automata and Process Theory
Context-Free Grammars 2IT70 Finite Automata and Process Theory Technische Universiteit Eindhoven May 18, 2016 Generating strings language L 1 = {a n b n n > 0} ab L 1 if w L 1 then awb L 1 production rules
More informationProbabilistic Context Free Grammars. Many slides from Michael Collins and Chris Manning
Probabilistic Context Free Grammars Many slides from Michael Collins and Chris Manning Overview I Probabilistic Context-Free Grammars (PCFGs) I The CKY Algorithm for parsing with PCFGs A Probabilistic
More informationNatural Language Processing 1. lecture 7: constituent parsing. Ivan Titov. Institute for Logic, Language and Computation
atural Language Processing 1 lecture 7: constituent parsing Ivan Titov Institute for Logic, Language and Computation Outline Syntax: intro, CFGs, PCFGs PCFGs: Estimation CFGs: Parsing PCFGs: Parsing Parsing
More informationLECTURER: BURCU CAN Spring
LECTURER: BURCU CAN 2017-2018 Spring Regular Language Hidden Markov Model (HMM) Context Free Language Context Sensitive Language Probabilistic Context Free Grammar (PCFG) Unrestricted Language PCFGs can
More informationStatistical Methods for NLP
Statistical Methods for NLP Stochastic Grammars Joakim Nivre Uppsala University Department of Linguistics and Philology joakim.nivre@lingfil.uu.se Statistical Methods for NLP 1(22) Structured Classification
More informationCYK Algorithm for Parsing General Context-Free Grammars
CYK Algorithm for Parsing General Context-Free Grammars Why Parse General Grammars Can be difficult or impossible to make grammar unambiguous thus LL(k) and LR(k) methods cannot work, for such ambiguous
More informationSection 1 (closed-book) Total points 30
CS 454 Theory of Computation Fall 2011 Section 1 (closed-book) Total points 30 1. Which of the following are true? (a) a PDA can always be converted to an equivalent PDA that at each step pops or pushes
More informationIntroduction to Theory of Computing
CSCI 2670, Fall 2012 Introduction to Theory of Computing Department of Computer Science University of Georgia Athens, GA 30602 Instructor: Liming Cai www.cs.uga.edu/ cai 0 Lecture Note 3 Context-Free Languages
More informationCMPT-825 Natural Language Processing. Why are parsing algorithms important?
CMPT-825 Natural Language Processing Anoop Sarkar http://www.cs.sfu.ca/ anoop October 26, 2010 1/34 Why are parsing algorithms important? A linguistic theory is implemented in a formal system to generate
More information(pp ) PDAs and CFGs (Sec. 2.2)
(pp. 117-124) PDAs and CFGs (Sec. 2.2) A language is context free iff all strings in L can be generated by some context free grammar Theorem 2.20: L is Context Free iff a PDA accepts it I.e. if L is context
More informationAspects of Tree-Based Statistical Machine Translation
Aspects of Tree-Based Statistical Machine Translation Marcello Federico Human Language Technology FBK 2014 Outline Tree-based translation models: Synchronous context free grammars Hierarchical phrase-based
More informationFinite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018
Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Lecture 14 Ana Bove May 14th 2018 Recap: Context-free Grammars Simplification of grammars: Elimination of ǫ-productions; Elimination of
More informationd(ν) = max{n N : ν dmn p n } N. p d(ν) (ν) = ρ.
1. Trees; context free grammars. 1.1. Trees. Definition 1.1. By a tree we mean an ordered triple T = (N, ρ, p) (i) N is a finite set; (ii) ρ N ; (iii) p : N {ρ} N ; (iv) if n N + and ν dmn p n then p n
More informationSolutions to Problem Set 3
V22.0453-001 Theory of Computation October 8, 2003 TA: Nelly Fazio Solutions to Problem Set 3 Problem 1 We have seen that a grammar where all productions are of the form: A ab, A c (where A, B non-terminals,
More informationNon-context-Free Languages. CS215, Lecture 5 c
Non-context-Free Languages CS215, Lecture 5 c 2007 1 The Pumping Lemma Theorem. (Pumping Lemma) Let be context-free. There exists a positive integer divided into five pieces, Proof for for each, and..
More informationParsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26
Parsing Context-Free Grammars (CFG) Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2017/18 1 / 26 Table of contents 1 Context-Free Grammars 2 Simplifying CFGs Removing useless symbols Eliminating
More informationContext- Free Parsing with CKY. October 16, 2014
Context- Free Parsing with CKY October 16, 2014 Lecture Plan Parsing as Logical DeducBon Defining the CFG recognibon problem BoHom up vs. top down Quick review of Chomsky normal form The CKY algorithm
More informationImproved TBL algorithm for learning context-free grammar
Proceedings of the International Multiconference on ISSN 1896-7094 Computer Science and Information Technology, pp. 267 274 2007 PIPS Improved TBL algorithm for learning context-free grammar Marcin Jaworski
More informationContext Free Grammars
Automata and Formal Languages Context Free Grammars Sipser pages 101-111 Lecture 11 Tim Sheard 1 Formal Languages 1. Context free languages provide a convenient notation for recursive description of languages.
More informationAn introduction to PRISM and its applications
An introduction to PRISM and its applications Yoshitaka Kameya Tokyo Institute of Technology 2007/9/17 FJ-2007 1 Contents What is PRISM? Two examples: from population genetics from statistical natural
More informationOctober 6, Equivalence of Pushdown Automata with Context-Free Gramm
Equivalence of Pushdown Automata with Context-Free Grammar October 6, 2013 Motivation Motivation CFG and PDA are equivalent in power: a CFG generates a context-free language and a PDA recognizes a context-free
More informationSuppose h maps number and variables to ɛ, and opening parenthesis to 0 and closing parenthesis
1 Introduction Parenthesis Matching Problem Describe the set of arithmetic expressions with correctly matched parenthesis. Arithmetic expressions with correctly matched parenthesis cannot be described
More informationContext-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing
Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Natural Language Processing CS 4120/6120 Spring 2017 Northeastern University David Smith with some slides from Jason Eisner & Andrew
More informationRemembering subresults (Part I): Well-formed substring tables
Remembering subresults (Part I): Well-formed substring tables Detmar Meurers: Intro to Computational Linguistics I OSU, LING 684.01, 1. February 2005 Problem: Inefficiency of recomputing subresults Two
More informationLecture 12: Algorithms for HMMs
Lecture 12: Algorithms for HMMs Nathan Schneider (some slides from Sharon Goldwater; thanks to Jonathan May for bug fixes) ENLP 26 February 2018 Recap: tagging POS tagging is a sequence labelling task.
More informationDoctoral Course in Speech Recognition. May 2007 Kjell Elenius
Doctoral Course in Speech Recognition May 2007 Kjell Elenius CHAPTER 12 BASIC SEARCH ALGORITHMS State-based search paradigm Triplet S, O, G S, set of initial states O, set of operators applied on a state
More informationLanguages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write:
Languages A language is a set (usually infinite) of strings, also known as sentences Each string consists of a sequence of symbols taken from some alphabet An alphabet, V, is a finite set of symbols, e.g.
More informationHarvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars
Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars Salil Vadhan October 2, 2012 Reading: Sipser, 2.1 (except Chomsky Normal Form). Algorithmic questions about regular
More informationCSCI Compiler Construction
CSCI 742 - Compiler Construction Lecture 12 Cocke-Younger-Kasami (CYK) Algorithm Instructor: Hossein Hojjat February 20, 2017 Recap: Chomsky Normal Form (CNF) A CFG is in Chomsky Normal Form if each rule
More informationParsing beyond context-free grammar: Parsing Multiple Context-Free Grammars
Parsing beyond context-free grammar: Parsing Multiple Context-Free Grammars Laura Kallmeyer, Wolfgang Maier University of Tübingen ESSLLI Course 2008 Parsing beyond CFG 1 MCFG Parsing Multiple Context-Free
More informationCMSC 723: Computational Linguistics I Session #5 Hidden Markov Models. The ischool University of Maryland. Wednesday, September 30, 2009
CMSC 723: Computational Linguistics I Session #5 Hidden Markov Models Jimmy Lin The ischool University of Maryland Wednesday, September 30, 2009 Today s Agenda The great leap forward in NLP Hidden Markov
More informationHandout 8: Computation & Hierarchical parsing II. Compute initial state set S 0 Compute initial state set S 0
Massachusetts Institute of Technology 6.863J/9.611J, Natural Language Processing, Spring, 2001 Department of Electrical Engineering and Computer Science Department of Brain and Cognitive Sciences Handout
More informationTree Adjoining Grammars
Tree Adjoining Grammars TAG: Parsing and formal properties Laura Kallmeyer & Benjamin Burkhardt HHU Düsseldorf WS 2017/2018 1 / 36 Outline 1 Parsing as deduction 2 CYK for TAG 3 Closure properties of TALs
More informationParsing. Unger s Parser. Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is
Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is Unger s Parser Laura Heinrich-Heine-Universität Düsseldorf Wintersemester 2012/2013 a top-down parser: we start with S and
More informationParsing. Left-Corner Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 17
Parsing Left-Corner Parsing Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2017/18 1 / 17 Table of contents 1 Motivation 2 Algorithm 3 Look-ahead 4 Chart Parsing 2 / 17 Motivation Problems
More informationNotes for Comp 497 (Comp 454) Week 10 4/5/05
Notes for Comp 497 (Comp 454) Week 10 4/5/05 Today look at the last two chapters in Part II. Cohen presents some results concerning context-free languages (CFL) and regular languages (RL) also some decidability
More informationCPS 220 Theory of Computation
CPS 22 Theory of Computation Review - Regular Languages RL - a simple class of languages that can be represented in two ways: 1 Machine description: Finite Automata are machines with a finite number of
More informationComputational Models - Lecture 4
Computational Models - Lecture 4 Regular languages: The Myhill-Nerode Theorem Context-free Grammars Chomsky Normal Form Pumping Lemma for context free languages Non context-free languages: Examples Push
More informationCS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL)
CS5371 Theory of Computation Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL) Objectives Introduce Pumping Lemma for CFL Apply Pumping Lemma to show that some languages are non-cfl Pumping Lemma
More informationLecture 12: Algorithms for HMMs
Lecture 12: Algorithms for HMMs Nathan Schneider (some slides from Sharon Goldwater; thanks to Jonathan May for bug fixes) ENLP 17 October 2016 updated 9 September 2017 Recap: tagging POS tagging is a
More informationLecture 5: UDOP, Dependency Grammars
Lecture 5: UDOP, Dependency Grammars Jelle Zuidema ILLC, Universiteit van Amsterdam Unsupervised Language Learning, 2014 Generative Model objective PCFG PTSG CCM DMV heuristic Wolff (1984) UDOP ML IO K&M
More informationChapter 4: Context-Free Grammars
Chapter 4: Context-Free Grammars 4.1 Basics of Context-Free Grammars Definition A context-free grammars, or CFG, G is specified by a quadruple (N, Σ, P, S), where N is the nonterminal or variable alphabet;
More informationProperties of context-free Languages
Properties of context-free Languages We simplify CFL s. Greibach Normal Form Chomsky Normal Form We prove pumping lemma for CFL s. We study closure properties and decision properties. Some of them remain,
More information(pp ) PDAs and CFGs (Sec. 2.2)
(pp. 117-124) PDAs and CFGs (Sec. 2.2) A language is context free iff all strings in L can be generated by some context free grammar Theorem 2.20: L is Context Free iff a PDA accepts it I.e. if L is context
More informationRecap: HMM. ANLP Lecture 9: Algorithms for HMMs. More general notation. Recap: HMM. Elements of HMM: Sharon Goldwater 4 Oct 2018.
Recap: HMM ANLP Lecture 9: Algorithms for HMMs Sharon Goldwater 4 Oct 2018 Elements of HMM: Set of states (tags) Output alphabet (word types) Start state (beginning of sentence) State transition probabilities
More informationStatistical Methods for NLP
Statistical Methods for NLP Sequence Models Joakim Nivre Uppsala University Department of Linguistics and Philology joakim.nivre@lingfil.uu.se Statistical Methods for NLP 1(21) Introduction Structured
More informationSimplification and Normalization of Context-Free Grammars
implification and Normalization, of Context-Free Grammars 20100927 Slide 1 of 23 Simplification and Normalization of Context-Free Grammars 5DV037 Fundamentals of Computer Science Umeå University Department
More informationHidden Markov Models
Hidden Markov Models Slides mostly from Mitch Marcus and Eric Fosler (with lots of modifications). Have you seen HMMs? Have you seen Kalman filters? Have you seen dynamic programming? HMMs are dynamic
More informationPushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata.
Pushdown Automata We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata. Next we consider a more powerful computation model, called a
More informationThe Pumping Lemma for Context Free Grammars
The Pumping Lemma for Context Free Grammars Chomsky Normal Form Chomsky Normal Form (CNF) is a simple and useful form of a CFG Every rule of a CNF grammar is in the form A BC A a Where a is any terminal
More informationMTH401A Theory of Computation. Lecture 17
MTH401A Theory of Computation Lecture 17 Chomsky Normal Form for CFG s Chomsky Normal Form for CFG s For every context free language, L, the language L {ε} has a grammar in which every production looks
More information