A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions

1 A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions
Wei Lu and Hwee Tou Ng, National University of Singapore

2 The Task
(Logical Form) λx₀.state(x₀) ∧ ∃x₁.[loc(miss_r, x₁) ∧ state(x₁) ∧ next_to(x₁, x₀)]
(Natural Language Sentence) "give me the states bordering states that the mississippi runs through"

3 The Task
(Logical Form) argmax(x, river(x) ∧ ∃y.[state(y) ∧ next_to(y, indiana_s) ∧ loc(x, y)], len(x))
(Natural Language Sentence) ???

4 Challenges
How to transform complex logical forms with rich internal structure into text?
Major Contribution (1): A Novel Forest-to-String Algorithm
A novel packed-forest representation of formal semantics (λ-expressions), and a novel reduction-based weighted binary SCFG for language generation; inspired by the hierarchical phrase-based translation model (Chiang 2005, 2007).

5 Challenges
How to automatically acquire the lexicon that maps logical terms to natural language words?
Major Contribution (2): A Novel Grammar Induction Algorithm
Acquires synchronous grammar rules by learning the correspondence between logical sub-expressions and (possibly discontiguous) natural language word sequences; inspired by the hybrid tree model (Lu et al. 2008).

6-8 Previous Work
From logical/semantic forms, but not probabilistic:
Wang (1980), On computational sentence generation from logical form
Shieber et al. (1990), Semantic-head-driven generation
Probabilistic, but from specialized representations:
Variable-free tree-structured representations — Wong and Mooney (2007), Generation by inverting a semantic parser that uses statistical machine translation; Lu et al. (2009), Natural language generation with tree conditional random fields
Database entries — Angeli et al. (2010), A simple domain-independent probabilistic approach to generation
From formal logical forms, and probabilistic:
This work

9 Notes about λ-calculus
Alternative notations for functional application: f g ≡ (f g); f g h ≡ ((f g) h)
Types — basic types: e (entity), t (truth value); composite types: ⟨e,t⟩ takes in type e and returns type t.
Conversions:
α-conversion: λy.state(y) ⇔ λx.state(x)
β-reduction: (λy.λx.loc(y, x)) miss_r ⇒ λx.loc(miss_r, x)
(Restricted) higher-order unification (Kwiatkowski et al. 2010): λx.loc(miss_r, x) ∧ state(x) ⇒ (λg.λf.λx.g(x) ∧ f(x)) (λx.loc(miss_r, x)) (λx.state(x))
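The two conversions are mechanical enough to sketch in code. Below is a minimal Python sketch of substitution and a single β-reduction step over a toy term representation; the Term/Var/Lam/App/Const classes are illustrative, not the authors' implementation, and α-conversion is assumed to have already made bound variables unique.

    # A toy lambda-term representation with one-step beta-reduction.
    from dataclasses import dataclass

    class Term: ...

    @dataclass(frozen=True)
    class Var(Term):
        name: str

    @dataclass(frozen=True)
    class Lam(Term):
        param: str
        body: Term

    @dataclass(frozen=True)
    class App(Term):
        func: Term
        arg: Term

    @dataclass(frozen=True)
    class Const(Term):      # e.g. miss_r, or predicates such as loc
        name: str

    def substitute(term, name, value):
        """term[name := value]; assumes bound variables are uniquely
        named (i.e. alpha-conversion has been applied first)."""
        if isinstance(term, Var):
            return value if term.name == name else term
        if isinstance(term, Const):
            return term
        if isinstance(term, App):
            return App(substitute(term.func, name, value),
                       substitute(term.arg, name, value))
        if isinstance(term, Lam):
            if term.param == name:      # name is shadowed; stop here
                return term
            return Lam(term.param, substitute(term.body, name, value))
        raise TypeError(term)

    def beta_reduce(term):
        """One beta-reduction step: (lambda x. body) arg => body[x := arg]."""
        if isinstance(term, App) and isinstance(term.func, Lam):
            return substitute(term.func.body, term.func.param, term.arg)
        return term

    # The beta-reduction example from the slide:
    # (lambda y. lambda x. loc(y, x)) miss_r  =>  lambda x. loc(miss_r, x)
    loc_y_x = App(App(Const("loc"), Var("y")), Var("x"))
    term = App(Lam("y", Lam("x", loc_y_x)), Const("miss_r"))
    print(beta_reduce(term))    # Lam('x', loc(miss_r, x))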

10 Packed Meaning Forest
λx.loc(miss_r, x) ∧ state(x) admits multiple decompositions:
(1) (λg.λf.λx.g(x) ∧ f(x)) (λx.loc(miss_r, x)) (λx.state(x)) — "the mississippi runs through which states"
(2) (λg.λf.λx.f(x) ∧ g(x)) (λx.state(x)) (λx.loc(miss_r, x)) — "which states have the mississippi river"

11-12 Packed Meaning Forest
(λg.λf.λx.g(x) ∧ f(x)) (λx.loc(miss_r, x)) (λx.state(x))
The sub-expression λx.loc(miss_r, x) can itself be further decomposed:
(λg.λf.λx.g(x) ∧ f(x)) ((λy.λx.loc(y, x)) miss_r) (λx.state(x))

13 Packed Meaning Forest
Forest nodes, each with its production, reduced expression, and word span:
root ⟨e,t⟩: λx.loc(miss_r, x) ∧ state(x) — "states that the mississippi river runs through"
⟨e,t⟩: (λg.λf.λx.g(x) ∧ f(x)) ⟨e,t⟩₁ ⟨e,t⟩₂ ⇒ λx.loc(miss_r, x) ∧ state(x) — "states that the mississippi river runs through"
⟨e,t⟩: (λg.λf.λx.f(x) ∧ g(x)) ⟨e,t⟩₁ ⟨e,t⟩₂ ⇒ λx.loc(miss_r, x) ∧ state(x) — "states that the mississippi river runs through"
⟨e,t⟩: λx.state(x) ⇒ λx.state(x) — "states"
⟨e,t⟩: (λy.λx.loc(y, x)) e₁ ⇒ λx.loc(miss_r, x) — "that the mississippi river runs through"
e: miss_r ⇒ miss_r — "the mississippi river"
⟨e,t⟩: λx.loc(miss_r, x) ⇒ λx.loc(miss_r, x) — "that the mississippi river runs through"

14-16 Reduction-based Synchronous CFG
Grammar used for language generation:
⟨e,t⟩ → ⟨ (λy.λx.loc(y, x)) e₁ , that e₁ runs through ⟩   (3)
e → ⟨ miss_r , the mississippi ⟩   (4)
A derivation with (3)+(4):
⟨e,t⟩: ⟨ λx.loc(miss_r, x) , that the mississippi runs through ⟩
where λx.loc(miss_r, x) ⇐ (λy.λx.loc(y, x)) miss_r
How do we automatically induce such a grammar from data?
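To make the derivation step concrete, here is a toy sketch of composing rules (3) and (4): each rule is a (category, source λ-term, target string) triple with a shared slot, and substituting the child rule on both sides leaves a β-redex on the source side. The encoding is illustrative, not the paper's data structure.

    # Rule (3): <e,t> -> < (\y.\x.loc(y,x)) [e_1], "that [e_1] runs through" >
    # Rule (4): e     -> < miss_r, "the mississippi" >
    rule3 = ("<e,t>", "(\\y.\\x.loc(y,x)) [e_1]", "that [e_1] runs through")
    rule4 = ("e", "miss_r", "the mississippi")

    def apply_rule(parent, child, slot="[e_1]"):
        """Substitute a child rule pair into the matching slot on both
        sides. After substitution the source side is the beta-redex
        (\\y.\\x.loc(y,x)) miss_r, which reduces to \\x.loc(miss_r,x)."""
        lhs, src, tgt = parent
        _, child_src, child_tgt = child
        return (lhs, src.replace(slot, child_src), tgt.replace(slot, child_tgt))

    print(apply_rule(rule3, rule4))
    # ('<e,t>', '(\y.\x.loc(y,x)) miss_r', 'that the mississippi runs through')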

17-19 Grammar Induction
Problem: how to find mappings between λ-sub-expressions and NL words in an unsupervised manner?
Challenges: logical forms (e.g., λ-expressions) have complex internal structures and variable dependencies; text-to-text aligners (GIZA++, the Berkeley aligner) are not applicable.
Solution: the λ-hybrid tree, a new generative model that explicitly models the correspondence between λ-sub-expressions and NL word sequences.

20 λ-hybrid Tree
A tree whose leaves are natural language words and whose internal nodes are λ-productions.
Generated by an underlying joint generative process.
Extensions to Lu et al. (2008): internal nodes involve λ-expressions; the meaning representation has a packed-forest representation.
Example:
⟨e,t⟩₂: (λg.λf.λx.g(x) ∧ f(x)) ⟨e,t⟩₁ ⟨e,t⟩₂
  ⟨e,t⟩₂: λx.state(x) — "states"
  ⟨e,t⟩₁: (λy.λx.loc(y, x)) e₁ — "that [e₁] runs through"
    e₁: miss_r — "the mississippi"

21-28 λ-hybrid Tree
p₁ = ⟨e,t⟩: (λy.λx.loc(y, x)) e — "that [p₂] runs through"
  p₂ = e: miss_r — "the mississippi"
Let T be the (partial) hybrid tree above. Then
P(T) = φ(m → wYw | p₁) × ψ(that e₁ runs through | p₁) × ρ(p₂ | p₁, arg₁) × φ(m → w | p₂) × ψ(the mississippi | p₂)
φ: pattern parameters; ψ: emission parameters; ρ: MR model parameters
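The factored score is just a product of lookups in the three parameter tables, as in the sketch below; the dictionaries and every number in them are made up for illustration, and real training estimates them with EM (see the later slides).

    # A minimal sketch of the factored P(T) above, computed in log space.
    import math

    phi = {("p1", "m->wYw"): 0.6, ("p2", "m->w"): 0.7}      # pattern params
    psi = {("p1", "that [e1] runs through"): 0.05,
           ("p2", "the mississippi"): 0.1}                  # emission params
    rho = {("p1", "arg1", "p2"): 0.4}                       # MR model params

    def log_prob_partial_tree():
        """log P(T) = log phi(m->wYw | p1) + log psi(that e1 runs through | p1)
                    + log rho(p2 | p1, arg1) + log phi(m->w | p2)
                    + log psi(the mississippi | p2)"""
        return (math.log(phi[("p1", "m->wYw")])
                + math.log(psi[("p1", "that [e1] runs through")])
                + math.log(rho[("p1", "arg1", "p2")])
                + math.log(phi[("p2", "m->w")])
                + math.log(psi[("p2", "the mississippi")]))

    print(math.exp(log_prob_partial_tree()))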

29 λ-hybrid Tree
λ-expression: λx.loc(miss_r, x) ∧ state(x)
English hybrid tree:
⟨e,t⟩₂: (λg.λf.λx.g(x) ∧ f(x)) ⟨e,t⟩₁ ⟨e,t⟩₂
  ⟨e,t⟩₂: λx.state(x) — "states"
  ⟨e,t⟩₁: (λy.λx.loc(y, x)) e₁ — "that [e₁] runs through"
    e₁: miss_r — "the mississippi"
Chinese hybrid tree:
⟨e,t⟩₂: (λg.λf.λx.g(x) ∧ f(x)) ⟨e,t⟩₁ ⟨e,t⟩₂
  ⟨e,t⟩₁: (λy.λx.loc(y, x)) e₁ — "[e₁] 穿越 的" ("that [e₁] runs through")
    e₁: miss_r — "密西西比河" ("the Mississippi River")
  ⟨e,t⟩₂: λx.state(x) — "州" ("states")

30 λ-hybrid Tree
Which hybrid tree is the correct one? For ⟨e,t⟩: (λy.λx.loc(y, x)) e₁ with e₁: miss_r, the words can be split between the two nodes in many ways (brackets mark the words attributed to e₁):
that the [mississippi] runs through
that [the mississippi runs] through
[that the mississippi] runs through
[that the mississippi runs through]
Hybrid trees are hidden structures that need to be estimated with the inside-outside algorithm.
We have developed an efficient algorithm that runs in time cubic in the number of words of the NL sentence.

31 Grammar Induction — Overall Algorithm
For each training instance, construct its packed meaning forest.
Train the λ-hybrid tree generative model on the training set, find the most probable λ-hybrid tree for each training instance, and then extract the grammar rules from it.
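A high-level, runnable skeleton of this pipeline is sketched below; all four step functions are stubs whose names and signatures are assumptions, standing in for packed-forest construction, EM (inside-outside) training, Viterbi hybrid-tree decoding, and rule extraction.

    def build_packed_forest(logical_form):
        return logical_form            # stub: real version enumerates decompositions

    def train_em(forests, sentences, iterations):
        return {}                      # stub: inside-outside parameter estimation

    def viterbi_hybrid_tree(forest, sentence, params):
        return (forest, sentence)      # stub: most probable hybrid tree

    def extract_rules(tree):
        return {("<e,t>", str(tree))}  # stub: one-level / subtree / two-level rules

    def induce_grammar(training_set, em_iterations=50):
        forests = [build_packed_forest(lf) for lf, _ in training_set]
        params = train_em(forests, [s for _, s in training_set], em_iterations)
        rules = set()
        for forest, (_, sentence) in zip(forests, training_set):
            rules |= extract_rules(viterbi_hybrid_tree(forest, sentence, params))
        return rules

    print(induce_grammar([("lambda x. state(x)", "states")]))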

32 Rule Extraction
From the hybrid tree for "states that the mississippi runs through" (slide 20):
One-level rules:
⟨e,t⟩ → ⟨ (λy.λx.loc(y, x)) e₁ , that e₁ runs through ⟩

33 Rule Extraction
From the same hybrid tree:
Subtree rules:
⟨e,t⟩ → ⟨ λx.loc(miss_r, x) , that the mississippi runs through ⟩

34 Rule Extraction
From the same hybrid tree:
Two-level rules, obtained by substitution, β-reductions, and α-conversion:
⟨e,t⟩ → ⟨ (λy.λx.loc(y, x) ∧ state(x)) e₁ , states that e₁ runs through ⟩

35-38 Rule Extraction
Two-level rules: substitute e₁ by a fresh variable y, then apply β-reductions and α-conversion:
λy.[ (λg.λf.λx.g(x) ∧ f(x)) ((λy.λx.loc(y, x)) y) (λx.state(x)) ] e₁
⇒ (λy.λx.loc(y, x) ∧ state(x)) e₁
⟨e,t⟩ → ⟨ (λy.λx.loc(y, x) ∧ state(x)) e₁ , states that e₁ runs through ⟩
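One-level extraction, the simplest of the three rule types, can be sketched as a walk over the hybrid tree: each internal node pairs its λ-production (with typed child slots) against its immediate yield, with child subtrees replaced by the matching slots. The tree encoding below is illustrative, not the paper's representation.

    from dataclasses import dataclass, field

    @dataclass
    class HybridNode:
        category: str             # e.g. "<e,t>" or "e"
        expression: str           # lambda-production, children as [e_1] etc.
        children: list = field(default_factory=list)  # str words or HybridNodes
        slot: str = ""            # slot name this node fills in its parent

    def one_level_rules(node, rules=None):
        if rules is None:
            rules = []
        # Immediate yield: words stay, child subtrees become their slots.
        target = " ".join(c if isinstance(c, str) else c.slot
                          for c in node.children)
        rules.append((node.category, node.expression, target))
        for c in node.children:
            if isinstance(c, HybridNode):
                one_level_rules(c, rules)
        return rules

    miss = HybridNode("e", "miss_r", ["the", "mississippi"], slot="[e_1]")
    loc = HybridNode("<e,t>", "(\\y.\\x.loc(y,x)) [e_1]",
                     ["that", miss, "runs", "through"])
    for rule in one_level_rules(loc):
        print(rule)
    # ('<e,t>', '(\y.\x.loc(y,x)) [e_1]', 'that [e_1] runs through')
    # ('e', 'miss_r', 'the mississippi')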

39 Log-linear Model
We assign a score to each derivation D:
w(D) = ( ∏_{r ∈ D} ∏_i f_i(r)^{w_i} ) × p_LM(ŝ)^{w_LM}
4 simple and general features: 3 rule-specific features + 1 LM feature.
Minimum Error Rate Training (Och 2003) for learning the feature weights.
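In log space the score is a weighted sum, as in this minimal sketch; the three rule-level feature functions and the weights are placeholders, not the paper's exact features.

    import math

    def score_derivation(rules, lm_logprob, feature_fns, weights, w_lm):
        """log w(D) = sum_{r in D} sum_i w_i * log f_i(r) + w_LM * log p_LM(s)."""
        total = w_lm * lm_logprob
        for r in rules:
            for f, w in zip(feature_fns, weights):
                total += w * math.log(f(r))
        return total

    # Three illustrative rule-level features (placeholders):
    feature_fns = [
        lambda r: r["relative_freq"],   # e.g. relative frequency of the rule
        lambda r: r["src_given_tgt"],   # e.g. a lexical/translation score
        lambda r: math.e,               # rule-count penalty: log f = 1 per rule
    ]
    weights = [1.0, 0.5, -0.2]
    rules = [{"relative_freq": 0.3, "src_given_tgt": 0.4}]
    print(score_derivation(rules, lm_logprob=-12.7, feature_fns=feature_fns,
                           weights=weights, w_lm=0.8))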

40 Decoding
Forest-to-String Decoding: for a given source expression e, find the most probable derivation D, as scored by w, that produces e; the target side gives the generated sentence ŝ:
ŝ = s( argmax_{D s.t. e(D) = e} w(D) )
A bottom-up dynamic programming algorithm with cube pruning.
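The heart of cube pruning is lazily enumerating the best combinations of sorted child k-best lists with a frontier heap, instead of scoring every pair. The sketch below shows that core step for a single binary hyperedge, with costs as negative log probabilities (lower is better) and the language-model interaction omitted.

    import heapq

    def k_best_combinations(a, b, k):
        """a, b: sorted lists of (cost, string). Returns up to k best
        (cost_a + cost_b, string_a + ' ' + string_b) combinations."""
        if not a or not b:
            return []
        heap = [(a[0][0] + b[0][0], 0, 0)]   # frontier: best corner of the grid
        seen = {(0, 0)}
        out = []
        while heap and len(out) < k:
            cost, i, j = heapq.heappop(heap)
            out.append((cost, a[i][1] + " " + b[j][1]))
            for ni, nj in ((i + 1, j), (i, j + 1)):   # expand the frontier
                if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                    seen.add((ni, nj))
                    heapq.heappush(heap, (a[ni][0] + b[nj][0], ni, nj))
        return out

    left = [(0.5, "that the mississippi"), (0.9, "that mississippi")]
    right = [(0.2, "runs through"), (0.4, "flows through")]
    print(k_best_combinations(left, right, k=3))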

41 Automatic Evaluation
The Geoquery dataset (880 instances), annotated with complete sentences in both English and Chinese.
Systems compared, on BLEU-1 and TER for both English and Chinese: Moses and Joshua baselines, each fed a direct textual linearization of the logical form and its preorder, inorder, and postorder traversals, versus this work.
p < 0.01 for all cases, except for the comparison against Joshua-preorder (p < 0.05).

42 Importance of Different Rules
Subtree rules and two-level rules can model some longer-range dependencies.
Ablations compared, on BLEU-1 and TER for both English and Chinese: with all rules, without subtree rules, and without two-level rules.

43 Human Evaluation
Randomly sampled 50% of the test examples; five judges each for both languages.

    English      Flu       Sem
    Moses        4.48 ±    ± 0.20
    Joshua       4.40 ±    ± 0.18
    This work    4.66 ±    ± 0.16

    Chinese      Flu       Sem
    Moses        4.14 ±    ± 0.17
    Joshua       4.00 ±    ± 0.21
    This work    4.59 ±    ± 0.10

p < 0.01 for all cases.

44 Variable-free Datasets
The model can also be applied to variable-free datasets with tree-structured representations, by converting each tree into a λ-expression: for example, midfield(opp) becomes (λx.midfield(x)) opp.
Systems compared, on BLEU and NIST over Robocup (300) and Geoquery (880): Wong and Mooney (2007), Lu et al. (2009), and this work.

45 Conclusions
Introduced a novel reduction-based binary SCFG, together with a forest-to-string algorithm, for language generation from typed lambda calculus expressions represented as packed meaning forests.
Introduced a novel grammar induction algorithm, built on top of the λ-hybrid tree model, which models the joint generative process of λ-expressions and natural language text.
