Dependency Parsing. COSI 114 Computational Linguistics Marie Meteer. March 21, 2015 Brandeis University
2 Dependency Grammar and Dependency Structure
- Dependency syntax postulates that syntactic structure consists of lexical items linked by binary asymmetric relations ("arrows") called dependencies
- The arrows are commonly typed with the name of grammatical relations (subject, prepositional object, apposition, etc.)
[Figure: dependency tree for "Bills on ports and immigration were submitted by Senator Brownback, Republican of Kansas", with arcs labeled nsubjpass, auxpass, prep, pobj, nn, appos, cc, conj]
3 Dependency Grammar and Dependency Structure
- Dependency syntax postulates that syntactic structure consists of lexical items linked by binary asymmetric relations ("arrows") called dependencies
- The arrow connects a head (governor, superior, regent) with a dependent (modifier, inferior, subordinate)
- Usually, dependencies form a tree (connected, acyclic, single-head)
[Figure: the same dependency tree for "Bills on ports and immigration were submitted by Senator Brownback, Republican of Kansas"]
4 Dependency Grammar/Parsing
- A sentence is parsed by relating each word to other words in the sentence which depend on it.
- The idea of dependency structure goes back a long way
  - To Pāṇini's grammar (c. 5th century BCE)
- Constituency is a new-fangled invention
  - A 20th-century invention
- Modern work is often linked to the work of L. Tesnière (1959)
  - Dominant approach in the "East" (Eastern bloc/East Asia)
- Among the earliest kinds of parsers in NLP, even in the US:
  - David Hays, one of the founders of computational linguistics, built an early (first?) dependency parser (Hays 1962)
5 Dependency structure
Example: $ Shaw Publishing acquired 30% of American City in March ($ marks the wall/ROOT position)
- Words are linked from head (regent) to dependent
- Warning! Some people draw the arrows one way; some the other way (Tesnière has them point from head to dependent).
- Usually a fake ROOT is added so every word is a dependent of exactly one other node
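Concretely, such a parse is just a set of head-to-dependent arcs over token positions plus the fake ROOT. Below is a minimal sketch (the arc labels for this sentence are our own illustrative guesses, not from the slide) that stores a parse as triples and checks the tree conditions from slide 3: single head per word and no cycles.

```python
# A minimal sketch of a dependency parse as arcs over token indices.
# Token 0 is the fake ROOT; every other token is a dependent of exactly one head.
# The relation labels are illustrative, not taken from the slides.

tokens = ["ROOT", "Shaw", "Publishing", "acquired", "30", "%", "of",
          "American", "City", "in", "March"]

# (head_index, dependent_index, relation)
arcs = [
    (3, 2, "nsubj"), (2, 1, "nn"), (0, 3, "root"),
    (3, 5, "dobj"), (5, 4, "num"), (5, 6, "prep"),
    (6, 8, "pobj"), (8, 7, "nn"), (3, 9, "prep"), (9, 10, "pobj"),
]

def is_tree(arcs, n_tokens):
    """Check the single-head and acyclicity conditions for a dependency graph."""
    head = {}
    for h, d, _ in arcs:
        if d in head:                    # a word with two heads violates single-head
            return False
        head[d] = h
    if set(head) != set(range(1, n_tokens)):   # every non-ROOT word needs a head
        return False
    for d in head:                       # follow head links; a cycle never reaches ROOT
        seen = set()
        while d != 0:
            if d in seen:
                return False
            seen.add(d)
            d = head[d]
    return True

print(is_tree(arcs, len(tokens)))   # True
```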
6 Relation between phrase structure and dependency structure
- A dependency grammar has a notion of a head. Officially, CFGs don't.
- But modern linguistic theory and all modern statistical parsers (Charniak, Collins, Stanford, ...) do, via hand-written phrasal "head rules":
  - The head of a Noun Phrase is a noun/number/adj/...
  - The head of a Verb Phrase is a verb/modal/...
- The head rules can be used to extract a dependency parse from a CFG parse (sketched below)
- The closure of dependencies gives constituency from a dependency tree
- But the dependents of a word must be at the same level (i.e., "flat"): there can be no VP!
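As an illustration of that extraction, here is a minimal sketch with a toy head-rule table (a small stand-in for the real Collins/Stanford rules): heads are percolated up the constituency tree, and each non-head child contributes one dependency on the phrase's head.

```python
# Sketch: extracting dependencies from a constituency parse via head rules.
# The head-rule table below is a toy stand-in, not the real Collins/Stanford rules.

HEAD_RULES = {"S": ["VP", "NP"], "VP": ["VBD", "VB", "VP"], "NP": ["NN", "NNS", "NP"]}

def head_word(tree):
    """tree is (label, children) for phrases or (tag, word) for leaves.
    Returns the lexical head word of the tree."""
    label, children = tree
    if isinstance(children, str):          # leaf: (POS tag, word)
        return children
    head_child = children[-1]              # default: rightmost child
    for cand in HEAD_RULES.get(label, []): # first matching category wins
        for child in children:
            if child[0] == cand:
                head_child = child
                break
        else:
            continue
        break
    return head_word(head_child)

def dependencies(tree, deps=None):
    """One dependency from the phrase head to the head of each non-head child."""
    label, children = tree
    deps = [] if deps is None else deps
    if isinstance(children, str):
        return deps
    h = head_word(tree)
    for child in children:
        ch = head_word(child)
        if ch != h:
            deps.append((h, ch))
        dependencies(child, deps)
    return deps

parse = ("S", [("NP", [("NN", "John")]),
               ("VP", [("VBD", "hit"),
                       ("NP", [("DT", "the"), ("NN", "ball")])])])
print(dependencies(parse))   # [('hit', 'John'), ('hit', 'ball'), ('ball', 'the')]
```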
7 Dependency Conditioning Preferences
Sources of information:
- bilexical dependencies
- distance of dependencies
- valency of heads (number of dependents)
A word's dependents (adjuncts, arguments) tend to fall near it in the string.
(The next 6 slides are based on slides by Jason Eisner and Noah Smith.)
8 Probabilistic dependency grammar: generative model
1. Start with left wall $
2. Generate root w0
3. Generate left children w-1, w-2, ..., w-l from the FSA λ_w0
4. Generate right children w1, w2, ..., wr from the FSA ρ_w0
5. Recurse on each w_i for i in {-l, ..., -1, 1, ..., r}, sampling α_i (steps 2-4)
6. Return α_-l ... α_-1 w0 α_1 ... α_r
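A toy sampler for this generative story follows. The child automata are collapsed to a candidate-child list plus a single stop probability per head, a drastic simplification of the λ/ρ FSAs; every word and number here is invented for illustration.

```python
import random

# Toy sampler for the generative story above. Each head's child "FSAs" are
# collapsed into a candidate-child list and a stop probability; all values
# are invented for illustration.

VOCAB = {
    "eat":        (["we", "sandwiches"], 0.5),   # (candidate children, P(stop))
    "we":         ([], 1.0),
    "sandwiches": (["the"], 0.6),
    "the":        ([], 1.0),
}

def generate(head):
    """Generate left children, then right children, recursing on each child."""
    kids, p_stop = VOCAB[head]
    def one_side():
        out = []
        while kids and random.random() > p_stop:        # geometric number of children
            out.extend(generate(random.choice(kids)))   # recurse (steps 2-4)
        return out
    return one_side() + [head] + one_side()             # left + head + right

random.seed(4)
print(generate("eat"))   # a random projective word sequence headed by "eat"
```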
9 Naïve Recognition/Parsing
- O(n^5) combinations; O(n^5 N^3) if N nonterminals
[Figure: chart items spanning positions i..k with head position p, illustrated on the sentence "It takes two to tango"]
10 Dependency Grammar Cubic Recognition/Parsing (Eisner & Satta, 1999)
- Triangles: span over words, where the tall side of the triangle is the head, the other side is a dependent, and no non-head words are expecting more dependents
- Trapezoids: span over words, where the larger side is the head, the smaller side is a dependent, and the smaller side is still looking for dependents on its side of the trapezoid
11 Dependency Grammar Cubic Recognition/Parsing (Eisner & Satta, 1999)
- One trapezoid per dependency.
- A triangle is a head with some left (or right) subtrees.
[Figure: triangles and trapezoids over the sentence "It takes two to tango"]
12 Cubic Recognition/Parsing (Eisner & Satta, 1999)
- Goal items: O(n) combinations
- Triangle and trapezoid combination steps over indices i, j, k: O(n^3) combinations each
- Gives O(n^3) dependency grammar parsing
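As a concrete sketch of the cubic algorithm, the code below fills in exactly these triangle and trapezoid items for a first-order, arc-factored model. It is recognition only (it returns the best total arc score; backpointers for recovering the tree are omitted), and the score matrix is an invented toy input, so this is a minimal sketch of the Eisner recursions rather than the Eisner & Satta (1999) presentation verbatim.

```python
# Sketch of O(n^3) projective dependency parsing with Eisner-style items.
# score[h][d] is the score of the arc h -> d; index 0 is ROOT.

NEG_INF = float("-inf")

def eisner_best_score(score):
    n = len(score)                        # number of tokens incl. ROOT
    # C[i][j][dir][comp]: dir 0 = head on the right, dir 1 = head on the left;
    # comp 0 = trapezoid (incomplete), comp 1 = triangle (complete)
    C = [[[[NEG_INF] * 2 for _ in range(2)] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for d in range(2):
            C[i][i][d][1] = 0.0           # single-word triangles cost nothing
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # trapezoids: join two facing triangles and add one arc
            best = max(C[i][r][1][1] + C[r + 1][j][0][1] for r in range(i, j))
            C[i][j][0][0] = best + score[j][i]    # arc j -> i
            C[i][j][1][0] = best + score[i][j]    # arc i -> j
            # triangles: extend a trapezoid with a complete triangle
            C[i][j][0][1] = max(C[i][r][0][1] + C[r][j][0][0] for r in range(i, j))
            C[i][j][1][1] = max(C[i][r][1][0] + C[r][j][1][1] for r in range(i + 1, j + 1))
    return C[0][n - 1][1][1]              # full-sentence triangle headed at ROOT

# Toy 4-token example (ROOT + 3 words) with invented arc scores:
toy = [[NEG_INF, 1.0, 9.0, 1.0],
       [NEG_INF, NEG_INF, 2.0, 1.0],
       [NEG_INF, 8.0, NEG_INF, 7.0],
       [NEG_INF, 1.0, 2.0, NEG_INF]]
print(eisner_best_score(toy))   # 24.0: arcs ROOT->2, 2->1, 2->3
```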
13 Evaluation of Dependency Parsing: simply use (labeled) dependency accuracy
Accuracy = (number of correct dependencies) / (total number of dependencies) = 2 / 5 = 40%

GOLD                       PARSED
1 2 We       SUBJ          1 2 We       SUBJ
2 0 eat      ROOT          2 0 eat      ROOT
3 5 the      DET           3 4 the      DET
4 5 cheese   MOD           4 2 cheese   OBJ
5 2 sandwich OBJ           5 2 sandwich PRED
14 McDonald et al. (2005, ACL): Online Large-Margin Training of Dependency Parsers
- Builds a discriminative dependency parser
  - Can condition on rich features in that context
- Best-known recent dependency parser; lots of recent dependency parsing activity connected with the CoNLL 2006/2007 shared tasks
- Doesn't/can't report constituent LP/LR, but evaluating the dependencies it gets correct:
  - Accuracy is similar to, but a fraction below, dependencies extracted from Collins: 90.9% vs. 91.4%; combining them gives 92.2% [all lengths]
  - Stanford parser on sentences of length up to 40:
    - Pure generative dependency model: 85.0%
    - Lexicalized factored parser: 91.0%
15 McDonald et al. (2005, ACL): Online Large-Margin Training of Dependency Parsers
- The score of a parse is the sum of the scores of its dependencies
- Each dependency score is a linear function of features times weights
- Feature weights are learned by MIRA, an online large-margin algorithm
  - But you could think of it as using a perceptron or maxent classifier
- Features cover:
  - Head and dependent word and POS separately
  - Head and dependent word and POS bigram features
  - Words between head and dependent
  - Length and direction of the dependency
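To make the arc-factored model concrete, here is a minimal sketch with toy feature templates in the spirit of the list above (not McDonald et al.'s exact templates) and a plain perceptron update standing in for MIRA, which the paper actually uses.

```python
# Sketch of arc-factored scoring: each candidate arc gets a sparse feature
# vector, the arc score is features . weights, and the parse score is the sum
# of its arc scores. Templates are toy versions of those on the slide.

from collections import defaultdict

def arc_features(words, tags, h, d):
    direction = "R" if h < d else "L"
    dist = min(abs(h - d), 5)                      # bucketed arc length
    return [f"hw={words[h]}", f"ht={tags[h]}",
            f"dw={words[d]}", f"dt={tags[d]}",
            f"hw,dw={words[h]},{words[d]}",
            f"ht,dt={tags[h]},{tags[d]}",
            f"dir,dist={direction},{dist}"]

def arc_score(weights, words, tags, h, d):
    return sum(weights[f] for f in arc_features(words, tags, h, d))

def parse_score(weights, words, tags, heads):
    """heads[d] = head of token d; token 0 is ROOT and has no head."""
    return sum(arc_score(weights, words, tags, heads[d], d)
               for d in range(1, len(words)))

def update(weights, words, tags, gold_heads, pred_heads):
    """Perceptron-style update (a simpler stand-in for MIRA):
    reward gold arcs, penalise wrongly predicted arcs."""
    for d in range(1, len(words)):
        if gold_heads[d] != pred_heads[d]:
            for f in arc_features(words, tags, gold_heads[d], d):
                weights[f] += 1.0
            for f in arc_features(words, tags, pred_heads[d], d):
                weights[f] -= 1.0

weights = defaultdict(float)
words = ["ROOT", "We", "eat", "sandwiches"]
tags  = ["ROOT", "PRP", "VBP", "NNS"]
update(weights, words, tags, [None, 2, 0, 2], [None, 2, 0, 1])
print(parse_score(weights, words, tags, [None, 2, 0, 2]))  # gold now outscores pred
```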
16 Extracting grammatical relations from statistical constituency parsers [de Marneffe et al., LREC 2006]
- Exploit the high-quality syntactic analysis done by statistical constituency parsers to get the grammatical relations [typed dependencies]
- Dependencies are generated by pattern-matching rules
[Figure: the phrase-structure tree for "Bills on ports and immigration were submitted by Senator Brownback" and the typed dependencies extracted from it: nsubjpass(submitted, Bills), auxpass(submitted, were), agent(submitted, Brownback), prep_on(Bills, ports), cc_and(ports, immigration), nn(Brownback, Senator)]
17 Methods of Dependency Parsing
1. Dynamic programming (like in the CKY algorithm)
   You can do it similarly to lexicalized PCFG parsing: an O(n^5) algorithm. Eisner (1996) gives a clever algorithm that reduces the complexity to O(n^3), by producing parse items with heads at the ends rather than in the middle
2. Graph algorithms
   You create a Maximum Spanning Tree for a sentence. McDonald et al.'s (2005) MSTParser scores dependencies independently using an ML classifier (he uses MIRA, for online learning, but it could be MaxEnt)
3. Constraint satisfaction
   Edges that don't satisfy hard constraints are eliminated. Karlsson (1990), etc.
4. Deterministic parsing
   Greedy choice of attachments guided by machine learning classifiers. MaltParser (Nivre et al. 2008), discussed in the next segment
18 Dependency Conditioning Preferences
What are the sources of information for dependency parsing?
1. Bilexical affinities: [issues → the] is plausible
2. Dependency distance: mostly with nearby words
3. Intervening material: dependencies rarely span intervening verbs or punctuation
4. Valency of heads: how many dependents on which side are usual for a head?
Example: ROOT Discussion of the outstanding issues was completed.
19 Greedy Transition-Based Parsing: MaltParser
20 MaltParser [Nivre et al. 2008]
- A simple form of greedy discriminative dependency parser
- The parser does a sequence of bottom-up actions
  - Roughly like "shift" or "reduce" in a shift-reduce parser, but the "reduce" actions are specialized to create dependencies with head on the left or right
- The parser has:
  - a stack σ, written with its top to the right
    - which starts with the ROOT symbol
  - a buffer β, written with its top to the left
    - which starts with the input sentence
  - a set of dependency arcs A
    - which starts off empty
  - a set of actions
21 Basic transition-based dependency parser
Start: σ = [ROOT], β = w1, ..., wn, A = ∅
1. Shift:       σ, wi|β, A  →  σ|wi, β, A
2. Left-Arc_r:  σ|wi, wj|β, A  →  σ, wj|β, A ∪ {r(wj, wi)}
3. Right-Arc_r: σ|wi, wj|β, A  →  σ, wi|β, A ∪ {r(wi, wj)}
Finish: β = ∅
Note: unlike the regular presentation of the CFG reduce step, dependencies combine one thing from each of the stack and the buffer
22 Actions ("arc-eager" dependency parser)
Start: σ = [ROOT], β = w1, ..., wn, A = ∅
1. Left-Arc_r:  σ|wi, wj|β, A  →  σ, wj|β, A ∪ {r(wj, wi)}
   Precondition: wi has no head yet (no r'(wk, wi) ∈ A), and wi ≠ ROOT
2. Right-Arc_r: σ|wi, wj|β, A  →  σ|wi|wj, β, A ∪ {r(wi, wj)}
3. Reduce:      σ|wi, β, A  →  σ, β, A
   Precondition: wi already has a head (some r'(wk, wi) ∈ A)
4. Shift:       σ, wi|β, A  →  σ|wi, β, A
Finish: β = ∅
This is the common "arc-eager" variant: a head can immediately take a right dependent, before all of its own dependents have been found
23 Example: "Happy children like to play with their friends ."
(Transition rules as defined on slide 22.)
Start      σ = [ROOT]                 β = [Happy, children, ...]    A = {}
Shift      [ROOT, Happy]              [children, like, ...]
LA_amod    [ROOT]                     [children, like, ...]         A1 = {amod(children, Happy)}
Shift      [ROOT, children]           [like, to, ...]               A1
LA_nsubj   [ROOT]                     [like, to, ...]               A2 = A1 ∪ {nsubj(like, children)}
RA_root    [ROOT, like]               [to, play, ...]               A3 = A2 ∪ {root(ROOT, like)}
Shift      [ROOT, like, to]           [play, with, ...]             A3
LA_aux     [ROOT, like]               [play, with, ...]             A4 = A3 ∪ {aux(play, to)}
RA_xcomp   [ROOT, like, play]         [with, their, ...]            A5 = A4 ∪ {xcomp(like, play)}
24 Example (continued): "Happy children like to play with their friends ."
RA_xcomp   [ROOT, like, play]               [with, their, ...]      A5 = A4 ∪ {xcomp(like, play)}
RA_prep    [ROOT, like, play, with]         [their, friends, ...]   A6 = A5 ∪ {prep(play, with)}
Shift      [ROOT, like, play, with, their]  [friends, .]            A6
LA_poss    [ROOT, like, play, with]         [friends, .]            A7 = A6 ∪ {poss(friends, their)}
RA_pobj    [ROOT, like, play, with, friends] [.]                    A8 = A7 ∪ {pobj(with, friends)}
Reduce     [ROOT, like, play, with]         [.]                     A8
Reduce     [ROOT, like, play]               [.]                     A8
Reduce     [ROOT, like]                     [.]                     A8
RA_punc    [ROOT, like, .]                  []                      A9 = A8 ∪ {punc(like, .)}
You terminate as soon as the buffer is empty. Dependencies = A9
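Here is a minimal sketch that implements the four arc-eager actions and replays exactly the action sequence from this trace. Tokens are assumed to be distinct strings for simplicity; a real parser would use token indices.

```python
# Sketch of the arc-eager transition system, replaying the action sequence
# from the "Happy children like to play with their friends ." example above.

def parse(words, actions):
    stack, buffer, arcs = ["ROOT"], list(words), []
    has_head = set()
    for act in actions:
        if act == "S":                                  # Shift
            stack.append(buffer.pop(0))
        elif act == "R":                                # Reduce
            assert stack[-1] in has_head                # precondition: has a head
            stack.pop()
        elif act.startswith("LA"):                      # Left-Arc_r
            dep = stack.pop()
            assert dep != "ROOT" and dep not in has_head
            arcs.append((act[3:], buffer[0], dep))      # r(head = buffer front, dep)
            has_head.add(dep)
        elif act.startswith("RA"):                      # Right-Arc_r
            dep = buffer.pop(0)
            arcs.append((act[3:], stack[-1], dep))      # r(head = stack top, dep)
            has_head.add(dep)
            stack.append(dep)                           # dep stays on the stack
    assert not buffer                                   # finish when buffer empty
    return arcs

words = "Happy children like to play with their friends .".split()
actions = ["S", "LA_amod", "S", "LA_nsubj", "RA_root", "S", "LA_aux",
           "RA_xcomp", "RA_prep", "S", "LA_poss", "RA_pobj",
           "R", "R", "R", "RA_punc"]
for rel, head, dep in parse(words, actions):
    print(f"{rel}({head}, {dep})")    # amod(children, Happy), ... punc(like, .)
```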
25 MaltParser [Nivre et al. 2008]
- It remains to explain how we choose the next action
- Each action is predicted by a discriminative classifier (often an SVM; it could be a maxent classifier) over each legal move
  - Max of 4 untyped choices; max of |R| × 2 + 2 when typed
  - Features: top-of-stack word and POS; first-in-buffer word and POS; etc.
- There is NO search (in the simplest and usual form)
  - But you could do some kind of beam search if you wish
- The model's accuracy is slightly below the best LPCFGs (evaluated on dependencies), but
  - it provides close to state-of-the-art parsing performance
  - it provides VERY fast, linear-time parsing
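A sketch of what that classifier sees: a handful of indicator features over the current configuration (top of stack, front of buffer) and a greedy argmax over legal actions. The features and weights here are toy inventions; MaltParser's real feature models are much richer.

```python
# Sketch: features over a parser configuration, plus greedy action selection.

def config_features(stack, buffer, tags):
    feats = {"bias": 1.0}
    s0 = stack[-1] if stack else None       # top of stack
    b0 = buffer[0] if buffer else None      # front of buffer
    if s0 is not None:
        feats[f"s0.w={s0}"] = 1.0
        feats[f"s0.t={tags.get(s0, 'ROOT')}"] = 1.0
    if b0 is not None:
        feats[f"b0.w={b0}"] = 1.0
        feats[f"b0.t={tags[b0]}"] = 1.0
    if s0 is not None and b0 is not None:   # tag-pair conjunction
        feats[f"s0.t,b0.t={tags.get(s0, 'ROOT')},{tags[b0]}"] = 1.0
    return feats

def best_action(weights, legal_actions, feats):
    """Greedy choice: the legal action with the highest linear score."""
    def score(a):
        return sum(weights.get((a, f), 0.0) * v for f, v in feats.items())
    return max(legal_actions, key=score)

tags = {"children": "NNS", "like": "VBP"}
feats = config_features(["ROOT", "children"], ["like", "to"], tags)
print(best_action({("LA_nsubj", "s0.t=NNS"): 2.0},
                  ["S", "LA_nsubj", "RA_dobj"], feats))   # LA_nsubj
```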
26 Evaluation of Dependency Parsing: (labeled) dependency accuracy
Acc = (# correct deps) / (# of deps)
Example: ROOT She saw the video lecture
UAS (unlabeled attachment score: head correct) = 4 / 5 = 80%
LAS (labeled attachment score: head and label correct) = 2 / 5 = 40%

Gold                      Parsed
1 2 She     nsubj         1 2 She     nsubj
2 0 saw     root          2 0 saw     root
3 5 the     det           3 4 the     det
4 5 video   nn            4 5 video   nsubj
5 2 lecture dobj          5 2 lecture ccomp
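Both scores are easy to compute from per-token (head, label) pairs; this small sketch reproduces the worked example on the slide.

```python
# Sketch of UAS/LAS computation, matching the example above.

def attachment_scores(gold, parsed):
    """gold/parsed: lists of (head_index, label), one entry per word."""
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, parsed)) / n   # head only
    las = sum(g == p for g, p in zip(gold, parsed)) / n         # head and label
    return uas, las

gold   = [(2, "nsubj"), (0, "root"), (5, "det"), (5, "nn"),    (2, "dobj")]
parsed = [(2, "nsubj"), (0, "root"), (4, "det"), (5, "nsubj"), (2, "ccomp")]
print(attachment_scores(gold, parsed))   # (0.8, 0.4)
```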
27 Representative performance numbers
- The CoNLL-X (2006) shared task provides evaluation numbers for various dependency parsing approaches over 13 languages
  - MALT: LAS scores from 65-92%, depending greatly on language/treebank
- Here we give a few UAS numbers for English to allow some comparison to constituency parsing

Parser                                                        UAS%
Sagae and Lavie (2006), ensemble of dependency parsers        92.7
Charniak (2000), generative, constituency                     92.2
Collins (1999), generative, constituency                      91.7
McDonald and Pereira (2005), MST graph-based dependency       91.5
Yamada and Matsumoto (2003), transition-based dependency      90.4
28 Projectivity
- Dependencies from a CFG tree using heads must be projective
  - There must not be any crossing dependency arcs when the words are laid out in their linear order, with all arcs above the words.
- But dependency theory normally does allow non-projective structures to account for displaced constituents
  - You can't easily get the semantics of certain constructions right without these non-projective dependencies
Example: Who did Bill buy the coffee from yesterday?
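Projectivity can be checked directly from the head array: a tree is non-projective exactly when two arcs cross. A minimal sketch follows, using one plausible analysis of the slide's wh-question (the fronted "Who" attached as the object of "from"); the head assignments are our illustration, not from the slide.

```python
# Sketch: a dependency tree is projective iff no two arcs cross when drawn
# above the sentence. heads[d] is the head of word d; index 0 is ROOT.

def is_projective(heads):
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads) if d > 0]
    for (i, j) in arcs:
        for (k, l) in arcs:
            if i < k < j < l:       # arc (k, l) crosses arc (i, j)
                return False
    return True

# 0=ROOT 1=Who 2=did 3=Bill 4=buy 5=the 6=coffee 7=from 8=yesterday
# "Who" takes "from" (position 7) as its head, yielding a crossing arc.
heads = [0, 7, 4, 4, 0, 6, 4, 4, 4]
print(is_projective(heads))   # False
```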
29 Handling non-projectivity
- The arc-eager algorithm we presented only builds projective dependency trees
- Possible directions to head:
  1. Just declare defeat on non-projective arcs
  2. Use a dependency formalism which only admits projective representations (a CFG doesn't represent such structures...)
  3. Use a postprocessor on a projective dependency parsing algorithm to identify and resolve non-projective links
  4. Add extra types of transitions that can model at least most non-projective structures
  5. Move to a parsing mechanism that does not use or require any constraints on projectivity (e.g., the graph-based MSTParser)
30 Dependencies encode relational structure: Relation Extraction with Stanford Dependencies
31 Dependency paths identify relations like protein interaction [Erkan et al. EMNLP 07, Fundel et al. 2007]
[Figure: dependency parse of "The results demonstrated that KaiC interacts rhythmically with SasA, KaiA, and KaiB", with arcs det, nsubj, compl, ccomp, advmod, prep_with, conj_and]
Extracted paths:
KaiC ←nsubj← interacts →prep_with→ SasA
KaiC ←nsubj← interacts →prep_with→ SasA →conj_and→ KaiA
KaiC ←nsubj← interacts →prep_with→ SasA →conj_and→ KaiB
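Paths like these can be read off a parse by walking both mentions up to their lowest common ancestor. The sketch below uses a hand-built fragment of the slide's parse (heads and labels are our toy reconstruction, not parser output).

```python
# Sketch: extracting the dependency path between two tokens by climbing each
# to the root and splicing at the lowest common ancestor.

def path_to_root(node, heads):
    path = [node]
    while heads[node] is not None:
        node = heads[node]
        path.append(node)
    return path

def dependency_path(a, b, heads, labels, words):
    up_a, up_b = path_to_root(a, heads), path_to_root(b, heads)
    shared = set(up_a) & set(up_b)
    lca = next(n for n in up_a if n in shared)       # lowest common ancestor
    out = []
    for n in up_a[:up_a.index(lca)]:                 # climb: dep <-label- head
        out.append(f"{words[n]} <-{labels[n]}-")
    out.append(words[lca])
    for n in reversed(up_b[:up_b.index(lca)]):       # descend: head -label-> dep
        out.append(f"-{labels[n]}-> {words[n]}")
    return " ".join(out)

# Toy fragment of the slide's parse: heads[i] is the head of token i.
words  = ["demonstrated", "interacts", "KaiC", "SasA", "KaiA"]
heads  = [None, 0, 1, 1, 3]
labels = [None, "ccomp", "nsubj", "prep_with", "conj_and"]
print(dependency_path(2, 4, heads, labels, words))
# KaiC <-nsubj- interacts -prep_with-> SasA -conj_and-> KaiA
```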
32 Stanford Dependencies [de Marneffe et al. LREC 2006]
- The basic dependency representation is projective
- It can be generated by postprocessing headed phrase-structure parses (Penn Treebank syntax)
- It can also be generated directly by dependency parsers, such as MaltParser or the Easy-First Parser
[Figure: dependency tree for "the little boy jumped over the fence", with arcs nsubj, det, amod, prep, pobj, det]
33 Graph modification to facilitate semantic analysis
Example: Bell, based in LA, makes and distributes electronic and computer products.
[Figure: basic dependencies: nsubj(makes, Bell), partmod(Bell, based), prep(based, in), pobj(in, LA), cc(makes, and), conj(makes, distributes), dobj(makes, products), amod(products, electronic), cc(electronic, and), conj(electronic, computer)]
34 Graph modification to facilitate semantic analysis
Example: Bell, based in LA, makes and distributes electronic and computer products.
[Figure: collapsed and propagated dependencies: nsubj(makes, Bell), nsubj(distributes, Bell), conj_and(makes, distributes), dobj(makes, products), partmod(Bell, based), prep_in(based, LA), amod(products, electronic), conj_and(electronic, computer), amod(products, computer)]
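As a concrete sketch of one of these transformations, the function below performs just the preposition-collapsing step (prep + pobj becomes a single prep_<word> arc); the conj_and collapsing and subject propagation shown above would be analogous passes. Word forms stand in for node identifiers and are assumed unique.

```python
# Sketch: merge prep(h, p) + pobj(p, o) into a single prep_<p>(h, o) arc,
# as in prep(based, in) + pobj(in, LA) => prep_in(based, LA).
# Arcs are (relation, head, dependent) triples over (assumed unique) words.

def collapse_prepositions(arcs):
    pobj = {h: d for rel, h, d in arcs if rel == "pobj"}   # preposition -> object
    collapsed = []
    for rel, h, d in arcs:
        if rel == "prep" and d in pobj:
            collapsed.append((f"prep_{d.lower()}", h, pobj[d]))
        elif rel != "pobj":               # pobj arcs are absorbed into prep_<word>
            collapsed.append((rel, h, d))
    return collapsed

arcs = [("nsubj", "makes", "Bell"), ("partmod", "Bell", "based"),
        ("prep", "based", "in"), ("pobj", "in", "LA"),
        ("conj", "makes", "distributes"), ("dobj", "makes", "products")]
print(collapse_prepositions(arcs))
# [('nsubj', 'makes', 'Bell'), ('partmod', 'Bell', 'based'),
#  ('prep_in', 'based', 'LA'), ('conj', 'makes', 'distributes'),
#  ('dobj', 'makes', 'products')]
```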
35 BioNLP 2009/2011 relation extraction shared tasks [Björne et al. 2009]