Syntax-Based Decoding


Philipp Koehn, 9 November 2017

Syntax-Based Models

Synchronous Context Free Grammar Rules

Nonterminal rules:
  NP → DET₁ NN₂ JJ₃ | DET₁ JJ₃ NN₂

Terminal rules:
  N → maison | house
  NP → la maison bleue | the blue house

Mixed rules:
  NP → la maison JJ₁ | the JJ₁ house
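To make the notation concrete, here is a minimal sketch of how such synchronous rules could be represented in Python. The class and field names are illustrative, not from the slides; nonterminals are (label, index) pairs, and the shared index links the two sides.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class SCFGRule:
        # A synchronous CFG rule: one left-hand side, two linked
        # right-hand sides (illustrative layout, not from the slides).
        lhs: str     # left-hand-side nonterminal, e.g. "NP"
        src: tuple   # source side: words and (label, index) pairs
        tgt: tuple   # target side: words and (label, index) pairs

    # Nonterminal rule with reordering: NP -> DET1 NN2 JJ3 | DET1 JJ3 NN2
    reorder = SCFGRule("NP",
                       (("DET", 1), ("NN", 2), ("JJ", 3)),
                       (("DET", 1), ("JJ", 3), ("NN", 2)))

    # Mixed rule: NP -> la maison JJ1 | the JJ1 house
    mixed = SCFGRule("NP",
                     ("la", "maison", ("JJ", 1)),
                     ("the", ("JJ", 1), "house"))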

Extracting Minimal Rules

[Figure: English parse tree (S, VP, PP, NP over PRP, MD, VB, VBG, RP, TO, DT) aligned to the German sentence]
  I shall be passing on to you some comments
  Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: S → X₁ X₂ | PRP₁ VP₂
Note: one rule per alignable constituent.
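The note "one rule per alignable constituent" relies on a consistency check: a constituent's source span must map to a contiguous target span that no outside source word aligns into. A minimal sketch of this check, assuming word alignments given as (source index, target index) pairs (function names are mine, and this simplifies the usual extraction conditions):

    def aligned_span(align, src_span):
        # Target positions aligned to source positions in src_span.
        pts = [t for s, t in align if src_span[0] <= s < src_span[1]]
        return (min(pts), max(pts) + 1) if pts else None

    def is_alignable(align, src_span):
        # Simplified condition: the constituent's target span exists and
        # no source word outside the constituent aligns into it.
        tgt = aligned_span(align, src_span)
        if tgt is None:
            return False
        return not any(tgt[0] <= t < tgt[1] and not (src_span[0] <= s < src_span[1])
                       for s, t in align)

    align = [(0, 0), (1, 1), (6, 3), (7, 4)]   # (source, target) pairs
    print(is_alignable(align, (6, 8)))          # True: contiguous target span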

Decoding

Syntactic Decoding

Inspired by monolingual syntactic chart parsing: during decoding of the source sentence, a chart with translations for the O(n²) spans has to be filled.

[Figure: German input Sie will eine Tasse Kaffee trinken with POS tags (PPER, VAFIN, ART, VVINF) and syntax tree spans NP, VP, S]

Syntax Decoding

[Figures: the derivation is built up step by step over the German input sentence with its tree]

- German input sentence with tree.
- Purely lexical rules fill a span with a translation (a constituent in the chart): PRO: she over Sie, VB: drink over trinken, NN: coffee over Kaffee.
- A complex rule matches underlying constituent spans and covers words, building NP: a cup of coffee (with DET: a and IN: of underneath).
- A complex rule with reordering builds the VP: wants to drink a cup of coffee (VBZ: wants, TO: to).
- Finally, an S hypothesis covers the whole sentence: she wants to drink a cup of coffee.

Bottom-Up Decoding

For each span, a stack of (partial) translations is maintained. Bottom-up: a higher stack is filled once the underlying stacks are complete.

Chart Organization

[Figure: chart of cells over the input Sie will eine Tasse Kaffee trinken, with NP, VP, and S entries]

The chart consists of cells that cover contiguous spans over the input sentence. Each cell contains a set of hypotheses. A hypothesis is a translation of a span with a target-side constituent label. (In the book, hypotheses are called chart entries.)
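A minimal sketch of this chart organization in Python (the names are illustrative, not from the slides):

    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass
    class Hypothesis:
        label: str        # target-side constituent label, e.g. "NP"
        translation: str  # target string for the covered span
        score: float      # model score, higher is better; used for pruning

    # chart[(start, end)] holds the hypotheses for span [start, end)
    chart = defaultdict(list)
    chart[(4, 5)].append(Hypothesis("NN", "coffee", -0.5))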

Naive Algorithm

Input: foreign sentence f = f1, ..., f_lf, with syntax tree
Output: English translation e

for all spans [start, end] (bottom up) do
   for all sequences s of hypotheses and words in span [start, end] do
      for all rules r do
         if rule r applies to chart sequence s then
            create new hypothesis c
            add hypothesis c to chart
         end if
      end for
   end for
end for
return English translation e from best hypothesis in span [0, l_f]
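A rough executable rendering of this pseudocode, continuing the chart sketch above. The rule format (source symbols, target label, target template, score), where integers in the template index matched nonterminal positions, and all helper names are my own, not from the slides:

    def seqs(chart, f, start, end):
        # Enumerate every way to cover [start, end) with a sequence of
        # input words and/or hypotheses over sub-spans (exponential!).
        if start == end:
            yield ()
            return
        for rest in seqs(chart, f, start + 1, end):   # next item: the word f[start]
            yield (f[start],) + rest
        for mid in range(start + 1, end + 1):         # next item: a hypothesis
            for h in chart[(start, mid)]:
                for rest in seqs(chart, f, mid, end):
                    yield (h,) + rest

    def matches(src, seq):
        # A rule's source side matches if, position by position, words
        # equal words and nonterminal labels equal hypothesis labels.
        return len(src) == len(seq) and all(
            sym == (item.label if isinstance(item, Hypothesis) else item)
            for sym, item in zip(src, seq))

    # Example rule: NP -> NP1 des NN2 | NP1 of the NN2
    # rules = [(("NP", "des", "NN"), "NP", (0, "of", "the", 2), -0.7)]
    def naive_decode(f, rules, chart):
        n = len(f)
        for length in range(1, n + 1):                # spans, bottom-up
            for start in range(n - length + 1):
                end = start + length
                # snapshot: hypotheses created here are not re-fed to
                # unary rules within the same span in this sketch
                for s in list(seqs(chart, f, start, end)):
                    for src, label, template, score in rules:
                        if not matches(src, s):
                            continue
                        words, total = [], score
                        for t in template:            # build target string
                            if isinstance(t, int):    # index of a matched nonterminal
                                words.append(s[t].translation)
                                total += s[t].score
                            else:                     # literal target word
                                words.append(t)
                        chart[(start, end)].append(
                            Hypothesis(label, " ".join(words), total))
        return max(chart[(0, n)], key=lambda h: h.score).translation

Running this on even medium-length sentences is hopeless, which is exactly the point of the blow-up slide below.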

Stack Pruning

- The number of hypotheses in each chart cell explodes.
- Dynamic programming (recombination) is not enough: we need to discard bad hypotheses, e.g., keep only the 100 best.
- Different stacks for different output constituent labels?
- Cost estimates: the translation model cost is known; the language model cost for internal words is known; for the initial words we use estimates. Outside cost estimate? (How useful will an NP covering input words 3-5 be later on?)
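A sketch of recombination plus histogram pruning for one cell, reusing the Hypothesis class above. The limit of 100 follows the slide; the recombination key (same label, same language-model boundary words) is a common choice, and the helper names are mine:

    def lm_state(text, order=3):
        # What an n-gram LM still needs to see: first and last n-1 words.
        w = text.split()
        return (tuple(w[:order - 1]), tuple(w[-(order - 1):]))

    def prune_cell(hyps, limit=100):
        # Recombination: hypotheses indistinguishable for future rule
        # applications and LM scoring are merged, keeping the best one.
        best = {}
        for h in hyps:
            key = (h.label, lm_state(h.translation))
            if key not in best or h.score > best[key].score:
                best[key] = h
        # Histogram pruning: keep at most `limit` survivors.
        return sorted(best.values(), key=lambda h: h.score, reverse=True)[:limit]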

Naive Algorithm: Blow-ups

- Many subspan sequences: "for all sequences s of hypotheses and words in span [start, end]"
- Many rules: "for all rules r"
- Checking if a rule applies is not trivial: "rule r applies to chart sequence s"

The naive algorithm is unworkable.

Solution

- Prefix tree data structure for rules
- Dotted rules
- Cube pruning

Storing Rules Efficiently

Storing Rules

First concern: do they apply to the span? We have to match available hypotheses and input words.

Example rule: NP → X₁ des X₂ | NP₁ of the NN₂

Check for applicability:
- Is there an initial sub-span with a hypothesis with constituent label NP?
- Is it followed by a sub-span over the word des?
- Is it followed by a final sub-span with a hypothesis with label NN?

Sequence of relevant information: NP • des • NN → NP₁ of the NN₂

Rule Applicability Check

[Figures: applying the rule NP • des • NN → NP₁ of the NN₂ to a span of six words]

- Trying to cover a span of six words with the given rule.
- First: check for hypotheses with output constituent label NP.
- Found an NP hypothesis in a cell: matched the first symbol of the rule.
- Matched the word des: matched the second symbol of the rule.
- Found an NN hypothesis in a cell: matched the last symbol of the rule.
- Matched the entire rule: apply it to create an NP hypothesis.
- Look up the output words to create the new hypothesis, e.g. NP: the house + des + NN: architect Frank Gehry → NP: the house of the architect Frank Gehry. (Note: there may be many matching underlying NP and NN hypotheses.)

Checking Rules vs. Finding Rules

What we showed: given a rule, check if and how it can be applied. But there are too many rules (millions) to check them all. Instead: given the underlying chart cells and input words, find which rules apply.

Prefix Tree for Rules

[Figure: rules stored in a prefix tree, indexed by their source-side symbol sequences; paths such as NP → DET → NN, NP → des → NN, DET → NN, and das → Haus lead to nodes that list the target sides of all rules with that source side]

Highlighted rules include:
  NP → NP₁ DET₂ NN₃ | NP₁ IN₂ NN₃
  NP → NP₁ | NP₁
  NP → NP₁ des NN₂ | NP₁ of the NN₂
  NP → DET₁ NN₂ | DET₁ NN₂
  NP → das Haus | the house
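A minimal sketch of such a rule prefix tree, with an illustrative node layout and the rule format from the earlier sketch (all names mine):

    class TrieNode:
        # Arcs are labeled with source-side symbols (input words or
        # constituent labels); a node stores the target sides of all
        # rules whose source side ends exactly here.
        def __init__(self):
            self.arcs = {}    # symbol -> TrieNode
            self.rules = []   # (target label, target template, score)

    def insert_rule(root, src, rule):
        node = root
        for sym in src:
            node = node.arcs.setdefault(sym, TrieNode())
        node.rules.append(rule)

    root = TrieNode()
    insert_rule(root, ("NP", "des", "NN"), ("NP", (0, "of", "the", 2), -0.7))
    insert_rule(root, ("das", "Haus"), ("NP", ("the", "house"), -0.2))

Lookup walks the arcs symbol by symbol, so all rules sharing a source-side prefix share the work of matching it.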

Dotted Rules

Dotted Rules: Key Insight

If we can apply a rule like p: A B C → x to a span, then we could also have applied a rule like q: A B → y to a sub-span with the same starting word. We can re-use the rule lookup by storing the dotted rule A B •.
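In terms of the trie sketch above, a dotted rule can be represented as a trie node plus the span it covers; extending the dot is then a single arc lookup (illustrative, continuing the earlier sketch):

    from dataclasses import dataclass

    @dataclass
    class DottedRule:
        node: TrieNode   # how far into the prefix tree we have matched
        start: int       # where the matched span begins
        end: int         # where it ends (exclusive)

    def extend(d, symbol, new_end):
        # Advance the dot over one more symbol (an input word or a
        # constituent label from a neighboring cell), if possible.
        nxt = d.node.arcs.get(symbol)
        return DottedRule(nxt, d.start, new_end) if nxt else None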

Finding Applicable Rules in Prefix Tree

[Figures: a step-by-step walkthrough over the input das Haus des Architekten Frank Gehry]

Covering the first cell, the span over das:
- Looking up rules in the prefix tree: follow the arc labeled das from the root.
- Taking note of the dotted rule: record das • for this span.
- Checking if the dotted rule has translations: the node lists DET: the and DET: that.
- Applying the translation rules: the hypotheses DET: that and DET: the are added to the cell.
- Looking up the constituent label DET in the prefix tree and adding the dotted rule DET • to the span's list of dotted rules.

Moving on to the next cell, the span over Haus:
- Looking up rules in the prefix tree: follow the arc labeled Haus.
- Taking note of the dotted rule: record Haus •.
- Checking if the dotted rule has translations: the node lists NN: house and NP: house.
- Applying the translation rules: the hypotheses NN: house and NP: house are added to the cell.
- Looking up the constituent labels in the prefix tree and adding the dotted rules NN • and NP •.

More of the same fills the remaining cells: over des (IN: of, DET: the), over Architekten (NP: architect, NN: architect), and over Frank and Gehry (NNP: Frank, NNP: Gehry), each with its dotted rules. Then decoding moves on to the next, larger spans.

Covering a Longer Span

We cannot consume multiple words at once: all rules are extensions of existing dotted rules. Here, only extensions of the span over das are possible.

Extensions of the Span over das

- The dotted rules das • and DET • can each be extended with what covers the neighboring cell: the constituent labels NN and NP, or the word Haus.
- Looking up rules in the prefix tree and taking note of the dotted rules yields das Haus, das NN, DET Haus, and DET NN.
- Checking if the dotted rules have translations: das Haus → NP: the house; das NN → NP: the NN; DET Haus → NP: DET house; DET NN → NP: DET NN.
- Applying the translation rules creates the hypotheses NP: that house and NP: the house for the span das Haus.
- Looking up the constituent label NP in the prefix tree and adding the dotted rule NP • to the span's list completes the cell.

Even Larger Spans

Extend the lists of dotted rules with cell constituent labels: combine a sub-span's dotted rules (with the same start) with the constituent labels of hypotheses in the neighboring sub-span (with the same end), as sketched below.
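Continuing the sketches above, this extension step could look as follows; length-one dotted rules are assumed to be seeded directly from each cell's word and hypothesis labels (all names mine):

    def extend_dotted(chart, dotted, f, start, end):
        # Dotted rules for [start, end): extend a dotted rule over a
        # prefix [start, mid) by one more symbol covering [mid, end),
        # either the input word itself or a hypothesis label.
        out = []
        for mid in range(start + 1, end):
            for d in dotted[(start, mid)]:
                if end - mid == 1:                        # one more input word
                    nd = extend(d, f[mid], end)
                    if nd:
                        out.append(nd)
                for label in {h.label for h in chart[(mid, end)]}:
                    nd = extend(d, label, end)
                    if nd:
                        out.append(nd)
        return out

Completed rules at a resulting dotted rule's trie node (nd.node.rules) are then applied, as in the earlier sketches, to create new hypotheses for the span.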

Reflections

Complexity is O(rn³), with sentence length n and size r of the dotted rule lists:
- we may introduce a maximum size for spans that do not start at the beginning of the sentence
- we may limit the size of the dotted rule list (very arbitrary)

Does the list of dotted rules explode? Yes, if there are many rules with neighboring input non-terminals: such rules apply in many places. Rules with words are much more restricted.

Difficult Rules

Some rules may apply in too many ways.

Neighboring input non-terminals, e.g. NP → X₁ X₂ | NP₂ to NP₁:
- the non-terminals may match many different pairs of spans
- especially a problem for hierarchical models (no constituent label restrictions)
- may be okay for syntax models

Three neighboring input non-terminals, e.g. VP → trifft X₁ X₂ X₃ heute | meets NP₁ today PP₂ PP₃:
- this will get out of hand even for syntax models

Summary

- Basic idea: bottom-up chart parsing
- Prefix tree structure for easy rule access
- Caching rule matching with dotted rules

Coming up:
- cube pruning for syntax-based decoding
- recombination and state
- scope-3 pruning
- recursive CKY+
- coarse-to-fine
