Lecture 9: Decoding
Andreas Maletti
Statistical Machine Translation, Stuttgart, January 20, 2012

Lecture 9
Last time
Synchronous grammars (tree transducers)
Rule extraction
Weight training (EM)
This time
Inside and outside weights (again)
n-best derivations
Cube pruning
Factorization

Contents
1 Inside and Outside Weights
2 n-best Derivations
3 Cube Pruning
4 Factorization

State Metrics of a Weighted Tree Grammar
Given a weighted tree grammar $M = (Q, \Sigma, I, R)$, let $D_M^q$ denote the set of derivations in $M$ starting from state $q$.

Definition
Inside weight of $q \in Q$: $\mathrm{in}(q) = \sum_{d \in D_M^q} \mathrm{wt}(d)$
Outside weight of $q \in Q$: $\mathrm{out}(q) = \sum_{t \in C_\Sigma(\{q\})} \sum_{d \in D_M(t)} \mathrm{wt}(d)$
where $C_\Sigma(\{q\})$ is the set of contexts over $\Sigma$ containing exactly one occurrence of $q$.

Inside Weight
Question: Why did nobody complain about the infinite sum
$\mathrm{in}(q) = \sum_{d \in D_M^q} \mathrm{wt}(d)$?
(For a cyclic WTG the set $D_M^q$ is infinite, so this is an infinite series.)
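To see why the series can nevertheless be well defined for probability-like weights, here is a minimal worked example (my own illustration, not from the slides, anticipating the recursion on the next slide): take a single state $q$ with the rules $q \xrightarrow{0.5} \sigma(q, q)$ and $q \xrightarrow{0.5} \alpha$. The inside weight must satisfy

$$\mathrm{in}(q) = 0.5 \cdot \mathrm{in}(q)^2 + 0.5,$$

that is, $\mathrm{in}(q)^2 - 2\,\mathrm{in}(q) + 1 = 0$, whose only solution is $\mathrm{in}(q) = 1$: the weights of the infinitely many derivations sum to exactly $1$.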

Computing Inside Weights
Theorem
$\mathrm{in}(q) = \sum_{q \xrightarrow{c} \sigma(q_1, \dots, q_k) \in R} c \cdot \mathrm{in}(q_1) \cdots \mathrm{in}(q_k)$

Problem: this is not a well-founded recursion! Turn it into a system of equations
$0 = -\mathrm{in}(q) + \sum_{q \xrightarrow{c} \sigma(q_1, \dots, q_k) \in R} c \cdot \mathrm{in}(q_1) \cdots \mathrm{in}(q_k)$
(one such equation for every state $p \in Q$)
$|Q|$ variables and $|Q|$ equations
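Before turning to root-finding methods, a minimal sketch of the naive alternative: iterate the equations as assignments until the values stabilize. The rule encoding is a hypothetical one chosen for illustration.

```python
def inside_weights(states, rules, max_iters=1000, tol=1e-12):
    """Fixed-point iteration for inside weights.

    rules: list of (q, c, children) encoding q --c--> sigma(children);
    the tree symbol itself is irrelevant for the inside weight.
    """
    inside = {q: 0.0 for q in states}
    for _ in range(max_iters):
        new = {q: 0.0 for q in states}
        for q, c, children in rules:
            term = c
            for child in children:
                term *= inside[child]
            new[q] += term
        delta = max(abs(new[q] - inside[q]) for q in states)
        inside = new
        if delta < tol:
            break
    return inside

# The one-state example from above: the values approach in(q) = 1,
# though convergence is slow near this (double) solution.
print(inside_weights(["q"], [("q", 0.5, ["q", "q"]), ("q", 0.5, [])]))
```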

Solving a System of Equations
$|Q|$ variables and $|Q|$ equations:
$0 = -\mathrm{in}(q) + \sum_{q \xrightarrow{c} \sigma(q_1, \dots, q_k) \in R} c \cdot \mathrm{in}(q_1) \cdots \mathrm{in}(q_k)$

Question: Is it a linear system of equations? NO! (Every rule of rank $k \geq 2$ contributes a product of unknowns.)

Approximation: use any method for approximating zeros of functions
Regula falsi
Tangent method (NEWTON-RAPHSON)
Secant method
...

Newton-Raphson Method
Algorithm
1 Select an initial point p
2 Determine the tangent to the curve at p
3 Determine the zero p' of the tangent and set p := p'
4 Go back to 2
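A minimal sketch of the scalar case (for the inside-weight system one applies the multivariate analogue with the Jacobian in place of the derivative):

```python
def newton_raphson(f, df, p, max_iters=50, tol=1e-12):
    """Approximate a zero of f starting from initial point p; df is the derivative."""
    for _ in range(max_iters):
        step = f(p) / df(p)   # the zero of the tangent at p is p - f(p)/df(p)
        p -= step
        if abs(step) < tol:
            break
    return p

# Example: solve x**2 - 2 = 0 starting from p = 1.
root = newton_raphson(lambda x: x * x - 2.0, lambda x: 2.0 * x, p=1.0)
print(root)  # 1.41421356...; the number of correct digits roughly doubles per step
```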

Newton-Raphson Iteration
[Figure: a sequence of slides animating the Newton-Raphson iteration, repeatedly moving from the current point to the zero of its tangent]

Convergence Speed
Given a good initial point, Newton-Raphson converges quadratically to the solution (the number of correct decimal digits doubles with each iteration).

But
Convergence is not guaranteed
The initial point needs to be reasonably close to the zero

Convergence
[Figure: illustration of the convergence behaviour of the Newton-Raphson iteration]

Contents
1 Inside and Outside Weights
2 n-best Derivations
3 Cube Pruning
4 Factorization

n-best Derivations
Problem: Given a WTG $M = (Q, \Sigma, I, R)$ and $q \in Q$, compute the $n$ highest-scoring derivations in $D_M^q$.

Caveats: the semantics is given by
$\mathrm{wt}(t) = \sum_{q \in I} \big( \sum_{d \in D_M^q(t)} \mathrm{wt}(d) \big)$
We disregard the actual input: use the product construction to restrict to the actual input
We disregard the summation: no fix! (we rank derivations, not trees)

Viterbi Algorithm
Algorithm
1 Sort the states topologically
2 Select the least unprocessed state q
3 Determine its best derivation (based on the previously processed states)
4 Mark q and return to 2

Requirement: the optimal substructure property (dynamic programming)
$a_i \leq b_i$ implies $f(a_1, \dots, a_k) \leq f(b_1, \dots, b_k)$
in our case: $a_i \leq b_i$ must imply $\prod_{i=1}^{k} a_i \leq \prod_{i=1}^{k} b_i$

Viterbi Algorithm
Example WTG with states 1, 2, 3 and rules
1 → α / 0.7    1 → β / 0.3
2 → α / 0.2    2 → β / 0.8
3 → σ(1, 2) / 0.8    3 → β / 0.3
Best derivations:
state 1: α (weight 0.7)
state 2: β (weight 0.8)
state 3: σ(α, β) (weight 0.8 · 0.7 · 0.8 = 0.448)
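A minimal sketch of step 3 for an acyclic WTG, using the same hypothetical rule encoding as before (now carrying the tree symbol, since we must output derivations):

```python
def best_derivations(states, rules):
    """states must be topologically ordered: children before parents.

    rules: list of (q, c, sym, children) encoding q --c--> sym(children).
    Returns, for every state, its best derivation as (weight, tree).
    """
    best = {}
    for q in states:
        for state, c, sym, children in rules:
            if state != q:
                continue
            w = c
            for child in children:
                w *= best[child][0]   # best subderivations yield the best derivation
            tree = (sym, [best[child][1] for child in children])
            if q not in best or w > best[q][0]:
                best[q] = (w, tree)
    return best

# The example WTG from the slide (sigma-rule weight 0.8 as read off the figure):
rules = [("1", 0.7, "alpha", []), ("1", 0.3, "beta", []),
         ("2", 0.2, "alpha", []), ("2", 0.8, "beta", []),
         ("3", 0.8, "sigma", ["1", "2"]), ("3", 0.3, "beta", [])]
print(best_derivations(["1", "2", "3"], rules)["3"])  # weight 0.448, sigma(alpha, beta)
```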

Viterbi Algorithm
Exercise (taken from KAESLIN: A gentle introduction to dynamic programming)

Viterbi Algorithm
Suitable weight structures
$(\mathbb{R}_{\geq 0}, +, \cdot, 0, 1)$ YES
$(\mathbb{R}, +, \cdot, 0, 1)$ NO (negative weights break monotonicity)
$(\{0, 1\}, \max, \min, 0, 1)$ YES
$(\mathbb{N}, +, \cdot, 0, 1)$ YES
$([0, 1], \max, \cdot, 0, 1)$ YES

Extension: Can we handle cyclic WTG? YES, under the additional requirement
$f(a_1, \dots, a_k) \leq a_i$, or in our case $\prod_{i=1}^{k} a_i \leq a_i$
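A quick numeric check (my own illustration) of why $(\mathbb{R}, +, \cdot, 0, 1)$ fails the optimal substructure property:

```python
# Monotonicity a_i <= b_i  =>  a_1 * a_2 <= b_1 * b_2 fails once weights may be negative:
a = (-2.0, -2.0)
b = (-1.0, -1.0)
assert a[0] <= b[0] and a[1] <= b[1]
print(a[0] * a[1], "<=", b[0] * b[1], "?", a[0] * a[1] <= b[0] * b[1])  # 4.0 <= 1.0 ? False
```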

n-best Algorithm
Observations
The best subderivations yield the best derivation
How do we obtain the 2nd-best derivation? Replace the best subderivation at exactly one position by the 2nd-best subderivation for that position, and take the best such candidate (worked out in the 3-best example below).

n-best Algorithm
Optimization: lazy computation of the values yields $O(|E| + |D_{\max}| \cdot n \log n)$
$|E|$: number of transitions
$|D_{\max}|$: size of the longest derivation among the top $n$
$n$: desired number of derivations

3-best Algorithm
Example WTG as before:
1 → α / 0.7    1 → β / 0.3
2 → α / 0.2    2 → β / 0.8
3 → σ(1, 2) / 0.8    3 → β / 0.3
1st derivation: σ(α, β) with weight 0.8 · 0.7 · 0.8 = 0.448
2nd derivation: best of
  β with weight 0.3
  σ with 2nd best from 1 and best from 2: 0.8 · 0.3 · 0.8 = 0.192
  σ with best from 1 and 2nd best from 2: 0.8 · 0.7 · 0.2 = 0.112
so β with weight 0.3
3rd derivation: σ(β, β) with weight 0.8 · 0.3 · 0.8 = 0.192
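A compact eager sketch of the n-best computation that reproduces this example (the lazy variant is what achieves the bound stated earlier; this version simply keeps a full n-best list per state):

```python
from heapq import nlargest
from itertools import product

def n_best(states, rules, n):
    """states topologically ordered; rules: (q, c, sym, children) as before."""
    best = {q: [] for q in states}
    for q in states:
        candidates = []
        for state, c, sym, children in rules:
            if state != q:
                continue
            # combine every choice of subderivations for the child states
            for combo in product(*(best[child] for child in children)):
                w = c
                for cw, _ in combo:
                    w *= cw
                candidates.append((w, (sym, [t for _, t in combo])))
        best[q] = nlargest(n, candidates, key=lambda cand: cand[0])
    return best

rules = [("1", 0.7, "alpha", []), ("1", 0.3, "beta", []),
         ("2", 0.2, "alpha", []), ("2", 0.8, "beta", []),
         ("3", 0.8, "sigma", ["1", "2"]), ("3", 0.3, "beta", [])]
for w, t in n_best(["1", "2", "3"], rules, 3)["3"]:
    print(round(w, 3), t)   # 0.448 sigma(alpha, beta); 0.3 beta; 0.192 sigma(beta, beta)
```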

5-best Algorithm
Exercise

Contents
1 Inside and Outside Weights
2 n-best Derivations
3 Cube Pruning
4 Factorization

Cube Pruning
Cube pruning = n-best + language model

Cube Pruning
Syntax-based systems: Input -> Parser -> Machine translation system -> Language model -> Output

Notes
We can compute the n-best lists for each component individually
But that leads to large search errors
Slightly better: a $10^{n+2}$-best list and a $10^n$-best list

Alternatives
Full integration (intersection): too expensive
Pruning and beam search: too bad (large search errors)
Cube pruning

Cube Pruning
Algorithm: proceed as for n-best, but multiply in the language model score at the end
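A minimal sketch of the core enumeration for one binary rule, under the usual cube-pruning scheme: pop candidate index pairs from a heap ordered by the model score alone, and multiply in the (non-monotonic) language model score only when a candidate is popped. The helper names are illustrative, not from any particular toolkit.

```python
from heapq import heappush, heappop

def cube_prune(left, right, lm_score, n):
    """left/right: n-best lists [(weight, hyp), ...] sorted by descending weight.

    Returns up to n combined hypotheses, rescored with lm_score(h1, h2).
    """
    seen = {(0, 0)}
    heap = [(-(left[0][0] * right[0][0]), 0, 0)]   # min-heap on the negated score
    results = []
    while heap and len(results) < n:
        _, i, j = heappop(heap)
        w1, h1 = left[i]
        w2, h2 = right[j]
        results.append((w1 * w2 * lm_score(h1, h2), (h1, h2)))  # LM multiplied at the end
        for ni, nj in ((i + 1, j), (i, j + 1)):    # push the two grid neighbours
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heappush(heap, (-(left[ni][0] * right[nj][0]), ni, nj))
    results.sort(key=lambda r: -r[0])   # the popped order is not sorted once LM scores enter
    return results
```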

Cube Pruning
Observations
retains the efficiency of the n-best derivation algorithm
the output list is not sorted
no guarantees (approximate search)

Contents
1 Inside and Outside Weights
2 n-best Derivations
3 Cube Pruning
4 Factorization

Factorization
Motivation: the rank $\mathrm{rk}(M)$ is an essential part of the complexity of XTT operations [like input restriction]:
$O(|R| \cdot \mathrm{len}(M) \cdot |P|^{2\,\mathrm{rk}(M)+5})$
Factorization: reduce $\mathrm{rk}(M)$ by decomposing rules; a maximal decomposition is preferred
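To get a feel for the exponent (a hypothetical instance, with $|P| = 100$): lowering $\mathrm{rk}(M)$ from $4$ to $2$ shrinks the factor $|P|^{2\,\mathrm{rk}(M)+5}$ from $100^{13}$ to $100^{9}$, a saving of a factor $100^4 = 10^8$.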

Factorization
References
ZHANG, HUANG, GILDEA, KNIGHT: Synchronous binarization for machine translation. HLT-NAACL 2006 [for SCFG]
GILDEA, SATTA, ZHANG: Factoring synchronous grammars by sorting. COLING/ACL 2006 [for SCFG]
NESSON, SATTA, SHIEBER: Optimal k-arization of synchronous tree-adjoining grammar. ACL 2008 [for STAG]
GILDEA: Optimal parsing strategies for linear context-free rewriting systems. HLT-NAACL 2010 [for LCFRS]

Maximal Factorization
Common advertisement slogans: "Optimal k-arization", "Optimal parsing strategy"
But
factorization is just one way to reduce $\mathrm{rk}(M)$
the obtained optimal rank is not optimal for the given transformation
$\mathrm{rk}(M)$ is just one parameter of the parsing complexity

Conclusion
optimal k-arization amounts to maximal factorization
optimal parsing strategy amounts to maximal factorization

Construction
Example
[Figure: an example XTOP rule of rank 3 for state q, built from γ and σ over the linked states q1, q2, q3; it is decomposed into two constructed rules of smaller rank by cutting out the shared subtree pairing q2 and q3 and introducing a fresh state for it]
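The slide's construction operates on both sides of a synchronous tree rule at once. As a simpler illustration of the rank-reduction idea, here is the monolingual analogue, left-branching binarization of CFG-style rules (my own sketch, not the slide's algorithm):

```python
def binarize(rules):
    """Reduce every rule (lhs, rhs_symbols) to rank <= 2 using fresh nonterminals."""
    out, fresh = [], 0
    for lhs, rhs in rules:
        rhs = list(rhs)
        while len(rhs) > 2:
            fresh += 1
            new = f"_X{fresh}"                 # fresh intermediate nonterminal
            out.append((lhs, [rhs[0], new]))   # peel off the leftmost symbol
            lhs, rhs = new, rhs[1:]
        out.append((lhs, rhs))
    return out

print(binarize([("S", ["A", "B", "C", "D"])]))
# [('S', ['A', '_X1']), ('_X1', ['B', '_X2']), ('_X2', ['C', 'D'])]
```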

Construction
Notes
runs in linear time
returns a maximal (meaningful) factorization
the returned XTOP is rank-optimal (among all factorizations)

More Notes
factorization is strongly related to rule extraction
rule extraction always yields a rank-optimal XTOP

References
HUANG, CHIANG: Better k-best parsing. IWPT 2005
CHIANG: Hierarchical phrase-based translation. Computational Linguistics 33, 2007
MAY, KNIGHT: TIBURON: a weighted tree automata toolkit. CIAA 2006

Thank you for your attention!

Questions?
