Applications of Tree Automata Theory Lecture VI: Back to Machine Translation


Andreas Maletti, Institute of Computer Science, Universität Leipzig, Germany
on leave from: Institute for Natural Language Processing, Universität Stuttgart, Germany
Yekaterinburg, August 25, 2014
Lecture VI: Tree Transducers in SMT (A. Maletti)

Roadmap
1 Theory of Tree Automata
2 Parsing: Basics and Evaluation
3 Parsing: Advanced Topics
4 Machine Translation: Basics and Evaluation
5 Theory of Tree Transducers
6 Machine Translation: Advanced Topics
Always ask questions right away!

Tree Transducers in Machine Translation: Important relations
- SCFG = synchronous context-free grammar [CHIANG, 2007]; as a tree relation: LTG-LTG (synchronous local tree grammar); corresponds to ln-top, a special top-down tree transducer
- STSG = synchronous tree substitution grammar [EISNER, 2003]; TSG-TSG; corresponds to ln-xtop, a special extended top-down tree transducer
- STAG = synchronous tree adjunction grammar [SHIEBER, SCHABES, 1990]; TAG-TAG
- SCFTG = synchronous context-free tree grammar [NEDERHOF, VOGLER, 2012]; CFTG-CFTG

Tree Transducers in Machine Translation: Towards asymmetric relations
- STSSG = synchronous tree-sequence substitution grammar [ZHANG et al., 2008]; TSSG-TSSG
- lmbot = local shallow multi bottom-up tree transducer [BRAUNE et al., 2013]; LTG-TSSG
- ln-xmbot corresponds roughly to RTG-TSSG

Tree Transducers in Machine Translation: Extended Multi Bottom-up Tree Transducers

Extended Multi Bottom-up Tree Transducers

[Figure series: a word-aligned sentence pair, Arabic "w kana ynzran Aly h b $kl mdhk" and English "And they were looking at him in a funny way", shown with parse trees on both sides. Successive slides annotate the aligned subtrees bottom-up with states (q_w, q_kana, q_ynzran, q_Aly, q_NP, q_b, q_$kl, q_mdhk, q_PP-CLR, q_PP-MNR, ...), illustrating how the rules of an extended multi bottom-up tree transducer are extracted; the final slide lists the extracted rules.]

Extended Multi Bottom-up Tree Transducers
Synchronous grammar notation vs. tree transducer notation: in the latter, the links are expressed by shared variables x1, x2, x3.
[Figure: the same VP rule once with explicit links and once with the variables x1, x2, x3]
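Shared variables make rule application a plain match-and-substitute operation. Below is a minimal sketch; the tree encoding and all labels are made up for illustration. Trees are nested tuples, and strings starting with "x" act as the linking variables.

```python
# Applying a rule whose links are shared variables: a minimal sketch.
# Trees are nested tuples (label, child, ...); strings starting with "x"
# are variables; all labels below are made up for illustration.

def match(pattern, tree, binding):
    """Match pattern against tree, extending binding; None on failure."""
    if isinstance(pattern, str):
        if pattern.startswith("x"):
            binding[pattern] = tree
            return binding
        return binding if pattern == tree else None
    if not isinstance(tree, tuple) or len(pattern) != len(tree) or pattern[0] != tree[0]:
        return None
    for p, t in zip(pattern[1:], tree[1:]):
        if match(p, t, binding) is None:
            return None
    return binding

def substitute(pattern, binding):
    """Replace every variable in pattern by the tree bound to it."""
    if isinstance(pattern, str):
        return binding.get(pattern, pattern)
    return (pattern[0],) + tuple(substitute(p, binding) for p in pattern[1:])

# Rule VP(x1, x2, x3) -> VP(x2, x3, x1): the shared variables carry the links.
lhs, rhs = ("VP", "x1", "x2", "x3"), ("VP", "x2", "x3", "x1")
tree = ("VP", ("V", "kana"), ("NP", "h"), ("PP", "Aly"))
b = match(lhs, tree, {})
print(substitute(rhs, b))  # -> ('VP', ('NP', 'h'), ('PP', 'Aly'), ('V', 'kana'))
```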

Extended Multi Bottom-up Tree Transducers
Definition (Alternative for linear XMBOT)
A linear extended multi bottom-up tree transducer is a tuple (Q, Σ, Δ, I, R) with
- a finite set Q of states
- alphabets Σ and Δ of input/output symbols
- initial states I ⊆ Q
- a finite set R ⊆ T_Σ(Q) × Q × T_Δ(Q) of rules such that for every rule (l, q, r) ∈ R
  - each q ∈ Q occurs at most once in l
  - each q ∈ Q that occurs in r also occurs in l
[Figure: example rules]

Extended Multi Bottom-up Tree Transducers: Rules
[Figure: two extracted rules in synchronous notation, an S rule combining q_w with q_kana and q_VP, and the one-state rule for q_kana and the VP]

Tree Transducers in Machine Translation: Evaluation of XMBOT

Statistical Machine Translation: Implementation [BRAUNE et al., 2013]
- shallow lmbot implemented in the MOSES framework [KOEHN et al., 2007]
- variant of the syntax-based component [HOANG et al., 2009]
- hard- and soft-matching available, incl. optimizations like cube pruning

Statistical Machine Translation: Evaluation
English-to-German, WMT 2009 translation task; 4th version of EUROPARL and news commentary (approx. 1.5 million sentence pairs)

  System                            BLEU-4   constrained?
  University of Edinburgh (winner)  15.2
  GOOGLE                            14.7
  University of Stuttgart           12.5
  Moses (SCFG) tree-to-tree         12.6
  lmbot tree-to-tree                13.1
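For reference, BLEU-4 combines modified n-gram precisions for n = 1..4 with a brevity penalty. A simplified sentence-level sketch; real evaluations such as the one above use corpus-level BLEU, which aggregates n-gram counts over all sentence pairs, and often apply smoothing. This illustrative version assumes a single reference and no smoothing.

```python
# Illustrative sentence-level BLEU-4 (single reference, no smoothing).
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(candidate, reference):
    c, r = candidate.split(), reference.split()
    precisions = []
    for n in range(1, 5):
        cand, ref = ngrams(c, n), ngrams(r, n)
        overlap = sum(min(count, ref[g]) for g, count in cand.items())
        precisions.append(overlap / max(sum(cand.values()), 1))
    if min(precisions) == 0:          # some n-gram order has no match at all
        return 0.0
    bp = 1.0 if len(c) >= len(r) else math.exp(1 - len(r) / len(c))
    return bp * math.exp(sum(math.log(p) for p in precisions) / 4)

print(bleu4("the cat sat on the mat", "the cat sat on the mat"))  # -> 1.0
```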

Statistical Machine Translation: Example
Source: president václav klaus has again commented on the problem of global warming
lmbot translation: das problem der globalen erwärmung hat präsident václav klaus wieder kommentiert
baseline translation: präsident václav klaus hat wieder kommentiert das problem der globalen erwärmung
(the lmbot output places the participle "kommentiert" in the correct clause-final position; the baseline does not)

Statistical Machine Translation: Evaluation [POPOVA, 2014]
Russian-to-English, WMT 13 translation task; YANDEX corpus (approx. 1 million sentence pairs)

  System                       BLEU-4
  winner [PINO et al., 2013]   25.9   (hierarchical phrase-based)
  hierarchical phrase-based    21.9
  lmbot string-to-tree         20.7
  string-to-tree               19.8

Statistical Machine Translation: Pipeline
Input → Parser → Translation model → Language model → Output
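The pipeline is just a composition of stages. A toy sketch with placeholder stages; all three functions are hypothetical stand-ins, since in the real system the parser produces a weighted tree automaton and the translation model is an XMBOT.

```python
# The pipeline above as plain function composition.  All three stage
# functions are hypothetical stand-ins for the real components.
from functools import reduce

def parse(sentence):                  # stand-in parser: one flat "tree"
    return ("S",) + tuple(sentence.split())

def translate(tree):                  # stand-in translation model
    return list(tree[1:])             # identity on the yield

def rescore(tokens):                  # stand-in language model
    return " ".join(tokens)

stages = [parse, translate, rescore]

def pipeline(sentence):
    return reduce(lambda value, stage: stage(value), stages, sentence)

print(pipeline("they were looking at him"))  # -> they were looking at him
```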

Statistical Machine Translation
Definition (One-symbol normal form)
An XMBOT (Q, Σ, Δ, I, R) is in one-symbol normal form if l contains at most one (occurrence of a) symbol of Σ for all (l, q, r) ∈ R.

Theorem [ENGELFRIET et al., 2009]
For every XMBOT there exists an equivalent XMBOT in one-symbol normal form.
[Figure: a two-symbol rule for "NP h / NP him" split into one-symbol rules using a fresh state q1]

Statistical Machine Translation
Transformation into one-symbol normal form: a linear-time procedure.
[Figure: a VP rule over the states q_ynzran, q_PP-CLR, q_PP-MNR split into one-symbol rules using fresh states q2, q3 with ε-output]
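The rule-splitting idea can be illustrated on the simplest special case, a unary left-hand side a(b(c(q))): the rule is split into a chain of one-symbol rules connected by fresh states. This sketch covers only that case; the general linear-time construction of [ENGELFRIET et al., 2009] handles arbitrary rules, and the fresh-state names f1, f2, ... are hypothetical.

```python
# Splitting a rule with a unary left-hand side into one-symbol rules.
# The general linear-time construction handles arbitrary rules; the
# state names f1, f2, ... are hypothetical fresh states.
import itertools

def split_unary_rule(symbols, state, target, fresh):
    """a(b(c(q))) -> target  becomes  c(q) -> f1, b(f1) -> f2, a(f2) -> target."""
    rules, current = [], state
    inner_to_outer = list(reversed(symbols))
    for i, symbol in enumerate(inner_to_outer):
        nxt = target if i == len(inner_to_outer) - 1 else next(fresh)
        rules.append(((symbol, current), nxt))   # exactly one symbol per rule
        current = nxt
    return rules

fresh = (f"f{i}" for i in itertools.count(1))
for rule in split_unary_rule(["a", "b", "c"], "q", "q_out", fresh):
    print(rule)
# -> (('c', 'q'), 'f1')
#    (('b', 'f1'), 'f2')
#    (('a', 'f2'), 'q_out')
```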

Statistical Machine Translation: Implementation
- the parse forest is often given as a weighted tree automaton
- parsing yields an (unambiguous) weighted tree automaton for the parses of the input sentence
- this is an efficient representation of L: T_Σ → ℝ≥0

Approach
- we can intersect all (weighted) parses with our translation model (XMBOT)
- improved stability under parse errors (but only at decode time)

Statistical Machine Translation
Definition (Input product)
1 weighted translation τ: T_Σ × T_Δ → ℝ≥0
2 weighted language p: Σ* → ℝ≥0 (language model)
pτ: T_Σ × T_Δ → ℝ≥0 with (t, u) ↦ τ(t, u) · p(yd(t))
[Figure: an automaton fragment with states s1, s2, s3 and a transition weight δ(s1, α) illustrating the product construction]

Statistical Machine Translation
Definition (Input product)
1 weighted translation τ: T_Σ × T_Δ → ℝ≥0
2 weighted tree language L: T_Σ → ℝ≥0 (parses)
Lτ: T_Σ × T_Δ → ℝ≥0 with (t, u) ↦ τ(t, u) · L(t)

Theorem [MALETTI, 2011]
The product of a wXMBOT M with
- a weighted automaton A on the input side has size O(|M| · |A|³); with a wta A: O(|M| · |A|)
- a weighted automaton A on the output side has size O(|M| · |A|^(2·rk(M)+2)); with a wta A: O(|M| · |A|^rk(M))
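On explicit finite maps the input product is a one-liner; a real system of course keeps τ as a wXMBOT and L as a weighted tree automaton, so the sketch below only mirrors the defining equation (t, u) ↦ τ(t, u) · L(t). The trees "t1", "t2" and all weights are made-up toy data.

```python
# The input product on explicit finite maps (toy data; real systems keep
# tau as a wXMBOT and L as a weighted tree automaton).

def input_product(tau, L):
    """tau: (t, u) -> weight; L: t -> weight; result: (t, u) -> tau(t,u)*L(t)."""
    return {(t, u): w * L.get(t, 0.0) for (t, u), w in tau.items()}

tau = {("t1", "u1"): 0.5, ("t1", "u2"): 0.25, ("t2", "u1"): 1.0}
L = {"t1": 0.8}                    # parse weights; t2 is not a parse
print(input_product(tau, L))
# -> {('t1', 'u1'): 0.4, ('t1', 'u2'): 0.2, ('t2', 'u1'): 0.0}
```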

Statistical Machine Translation: Example (Input product)
[Figure: a rule with states q_w, q_VP paired with parser states S-1, VP-3, VP-5, yielding product states such as (S-1, q_S), (VP-5, q_w), (VP-3, q_VP)]

Statistical Machine Translation: Pipeline
Input → Parser → Translation model → Language model → Output
- exact decoding of the first two stages (the highlighted part of the pipeline)
- integrating the language model would require O(|M| · |A|^(2·rk(M)+2))

Statistical Machine Translation: Implementation [QUERNHEIM, 2014]
1 exactly computes a wta representing the derivations for the first two models
2 extracts the k-best derivations
3 reranks them by the language model (i.e., multiplies their score with the LM score and resorts)

Disadvantages
1 slow
2 language model not integrated (needs strict structure)
3 strictness leads to coverage problems
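Steps 2 and 3 can be sketched concretely: extract the k best derivations, multiply each score by its LM score, and resort. All scores below are made up, and the real system extracts the k-best derivations from the wta rather than from a flat list.

```python
# k-best extraction and LM reranking on toy data (the real system reads
# the derivations off a weighted tree automaton).
import heapq

def k_best(derivations, k):
    """derivations: iterable of (score, output) pairs."""
    return heapq.nlargest(k, derivations)

def rerank(candidates, lm_score):
    """Multiply each model score by its LM score and resort."""
    return sorted(((score * lm_score(out), out) for score, out in candidates),
                  reverse=True)

derivations = [(0.75, "a"), (0.5, "b"), (0.25, "c"), (0.625, "d")]
lm = {"a": 0.25, "b": 1.0, "d": 0.5}
top = k_best(derivations, 3)               # [(0.75, 'a'), (0.625, 'd'), (0.5, 'b')]
print(rerank(top, lambda out: lm[out]))    # -> [(0.5, 'b'), (0.3125, 'd'), (0.1875, 'a')]
```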

Statistical Machine Translation: Evaluation
English-to-German, WMT 13 and WMT 14 translation tasks; 7th version of EUROPARL, news commentary, and Common Crawl (approx. … million sentence pairs)

  System                        BLEU-4
  winner                        …
  Moses (SCFG) tree-to-tree     13.1
  ExactMBOT tree-to-tree        …
  Moses (SCFG) string-to-tree   14.7
  lmbot string-to-tree          15.5
  Moses phrase-based            17.5

Tree Transducers in Machine Translation: Composition

Composition
τ₁ ; τ₂ = {(s, u) | ∃t: (s, t) ∈ τ₁, (t, u) ∈ τ₂}
- compositions support modular development
- they allow integration of external knowledge sources

Question: given a class C of transformations, is there an n ∈ ℕ such that Cⁿ = ⋃_{k≥1} C^k, where C^k = C ; ⋯ ; C (k times)?
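On explicit finite relations, composition and its powers C^k are directly executable; this sketch mirrors the defining equation on toy pairs.

```python
# The composition tau1 ; tau2 and its powers, on explicit finite
# relations (sets of pairs); the pairs below are toy data.
from functools import reduce

def compose(tau1, tau2):
    """{(s, u) | exists t: (s, t) in tau1 and (t, u) in tau2}."""
    return {(s, u) for (s, t1) in tau1 for (t2, u) in tau2 if t1 == t2}

def power(tau, k):
    """tau ; tau ; ... ; tau (k times, k >= 1)."""
    return reduce(compose, [tau] * k)

tau1 = {("s1", "t1"), ("s2", "t2")}
tau2 = {("t1", "u1"), ("t1", "u2"), ("t3", "u3")}
print(sorted(compose(tau1, tau2)))   # -> [('s1', 'u1'), ('s1', 'u2')]
print(power({(1, 2), (2, 3), (3, 4)}, 2))
```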

[Figure: Hasse diagram of the composition hierarchy relating l(n)-xmbot, l-xtop², le-xtop², ln-xtop², lne-xtop², l-xtop^R, and the base classes l-xtop, le-xtop, ln-xtop, lne-xtop]

Composition closure (number of compositions needed)
                                  l-top   l-xtop   l-xmbot
  ε-free, strict, nondeleting      1        ?        1
  ε-free, strict                   2        ?        1
  ε-free                           2        ?        1
  otherwise (without delabeling)   2        ?        1

Abbreviations: e = ε-free; d = delabeling; s = strict; n = nondeleting

Theorem [FÜLÖP, MALETTI, 2013]
A delabeling can be switched from back to front:
le[s]-xtop^R ; l[s]d-top^R ⊆ l[s]d-top^R ; lesn-xtop

Notes
- the other transducer becomes strict and nondeleting
- the other transducer loses its look-ahead

Abbreviations: e = ε-free; d = delabeling; s = strict; n = nondeleting

Theorem
(le[s]-xtop^R)^n ⊆ l[s]d-top^R ; lesn-xtop² ⊆ (le[s]-xtop^R)³

Proof.
(le[s]-xtop^R)^{n+1}
  ⊆ le[s]-xtop^R ; l[s]d-top^R ; lesn-xtop²
  ⊆ l[s]d-top^R ; lesn-xtop³
  ⊆ l[s]d-top^R ; lesn-xtop²

Corollary
le[s]-xtopⁿ ⊆ l[s]d-top ; lesn-xtop² ⊆ le[s]-xtop⁴

Proof. Uses only the standard encoding of look-ahead.

Composition closure
                        l-top   le-xtop
  strict, nondeleting     1       2
  strict, look-ahead      1
  strict                  2
  look-ahead              1       2

Theorem
A delabeling homomorphism can be moved from front to back:
lsd-hom ; les-xtop ⊆ lesn-xtop ; lsd-hom

Notes
- the other transducer becomes nondeleting
- the other transducer needs to be strict and have no look-ahead

Theorem
(les-xtop^R)^n ⊆ lesn-xtop ; les-xtop ⊆ les-xtop²

Proof.
(les-xtop^R)^{n+1}
  ⊆ (les-xtop^R)^n ; les-xtop
  ⊆ lesn-xtop ; lsd-hom ; les-xtop²
  ⊆ lesn-xtop³ ; lsd-hom
  ⊆ lesn-xtop² ; lsd-hom
  ⊆ lesn-xtop ; les-xtop^R
  ⊆ lesn-xtop ; les-xtop

Composition closure
                        l-top   le-xtop
  strict, nondeleting     1       2
  strict, look-ahead      1       2
  strict                  2       2
  look-ahead

Definition (Hierarchy properties)
A dependency ⟨t, D, u⟩ is input hierarchical if, for all (v₁, w₁), (v₂, w₂) ∈ D with v₁ < v₂:
1 w₂ ≤ w₁, or
2 there is a (v₁, w₁′) ∈ D with w₁′ ≤ w₂.

It is strictly input hierarchical if, for all (v₁, w₁), (v₂, w₂) ∈ D:
1 v₁ < v₂ implies w₁ ≤ w₂
2 v₁ = v₂ implies w₁ ≤ w₂ or w₂ ≤ w₁
(here < and ≤ denote the proper and reflexive prefix orders on positions)
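Positions can be encoded as tuples of child indices, with ≤ the prefix order. The following sketch checks the strictly-input-hierarchical condition exactly as reconstructed above; both example link sets are hypothetical toy data.

```python
# Prefix-order machinery for the definitions above: positions are tuples
# of child indices, and D is a set of links (v, w).  The check follows
# the "strictly input hierarchical" condition as stated in the text.

def is_prefix(p, q):
    return len(p) <= len(q) and q[:len(p)] == p

def proper_prefix(p, q):
    return len(p) < len(q) and q[:len(p)] == p

def strictly_input_hierarchical(D):
    for v1, w1 in D:
        for v2, w2 in D:
            if proper_prefix(v1, v2) and not is_prefix(w1, w2):
                return False
            if v1 == v2 and not (is_prefix(w1, w2) or is_prefix(w2, w1)):
                return False
    return True

D_ok = {((), ()), ((0,), (1,)), ((0, 0), (1, 0))}
D_bad = {((), (0,)), ((0,), (1,))}   # the deeper link escapes its ancestor's link
print(strictly_input_hierarchical(D_ok))   # -> True
print(strictly_input_hierarchical(D_bad))  # -> False
```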

Definition (Distance properties)
A dependency ⟨t, D, u⟩ is input link-distance bounded by b ∈ ℕ if for all (v₁, w₁), (v₁v, w₂) ∈ D with |v| > b there is a (v₁v′, w₃) ∈ D such that v′ < v and 1 ≤ |v′| ≤ b.

It has strict input link-distance bounded by b if for all v₁, v₁v ∈ pos(t) with |v| > b there is a (v₁v′, w₃) ∈ D such that v′ < v and 1 ≤ |v′| ≤ b.

Model vs. property overview
                   hierarchical           link-distance bounded
  Model            input      output      input      output
  ln-xtop          strictly   strictly    strictly   strictly
  l-xtop^R         strictly   strictly    strictly
  l-mbot           strictly              strictly

Theorem [MALETTI et al., 2009]
les-xtop ⊊ les-xtop^R, but les-xtop² = (les-xtop^R)²

Proof.
- look-ahead adds power at the first level
- none of the basic classes is closed under composition


Theorem [FÜLÖP, MALETTI, 2013]
le-xtop² ⊊ (le-xtop^R)² and le-xtop³ ⊊ (le-xtop^R)³

[Figure: the counterexample trees t, u, s built from σ₁, σ₂ and subtrees t₁, …, tₙ, with the case analysis on whether v is below v_{i-1}, v_i, or v_{i+1}]

Composition closure
                        l-top   le-xtop
  strict, nondeleting     1       2
  strict, look-ahead      1       2
  strict                  2       2
  look-ahead                      (4)

Composition closure
                        l-top   l-xtop   l-xtop^R
  ε-free, nondeleting     1
  strict                  2
  nondeleting             1
  strict, nondeleting     1        2

Proof. Uses a completely different technique [FÜLÖP, MALETTI, 2013].

Composition closure
                                  l-top   l-xtop   l-xtop^R   l-mbot
  ε-free, strict, nondeleting
  ε-free, strict
  ε-free
  otherwise (w/o delabeling)        2                           1

[Figure: final hierarchy diagram relating l(n)-mbot, le-xtop⁴, le-xtop³, l-xtop², le-xtop², ln-xtop², lne-xtop², l-xtop^R, and the base classes l-xtop, le-xtop, ln-xtop, lne-xtop]

Literature: selected references
- ARNOLD, DAUCHET: Morphismes et bimorphismes d'arbres. Theoret. Comput. Sci. 20, 1982
- BRAUNE, SEEMANN, QUERNHEIM, MALETTI: Shallow local multi bottom-up tree transducers in statistical machine translation. Proc. ACL, 2013
- FÜLÖP, MALETTI: Composition closure of ε-free linear extended top-down tree transducers. Proc. DLT, 2013
- MALETTI, GRAEHL, HOPKINS, KNIGHT: The power of extended top-down tree transducers. SIAM J. Comput. 39, 2009


More information

Variational Decoding for Statistical Machine Translation

Variational Decoding for Statistical Machine Translation Variational Decoding for Statistical Machine Translation Zhifei Li, Jason Eisner, and Sanjeev Khudanpur Center for Language and Speech Processing Computer Science Department Johns Hopkins University 1

More information

A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions

A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions Wei Lu and Hwee Tou Ng National University of Singapore 1/26 The Task (Logical Form) λx 0.state(x 0

More information

Statistical Machine Translation

Statistical Machine Translation Statistical Machine Translation -tree-based models (cont.)- Artem Sokolov Computerlinguistik Universität Heidelberg Sommersemester 2015 material from P. Koehn, S. Riezler, D. Altshuler Bottom-Up Decoding

More information

Compositions of Extended Top-down Tree Transducers

Compositions of Extended Top-down Tree Transducers Compositions of Extended Top-down Tree Transducers Andreas Maletti ;1 International Computer Science Institute 1947 Center Street, Suite 600, Berkeley, CA 94704, USA Abstract Unfortunately, the class of

More information

Aspects of Tree-Based Statistical Machine Translation

Aspects of Tree-Based Statistical Machine Translation Aspects of Tree-Based Statistical Machine Translation Marcello Federico Human Language Technology FBK 2014 Outline Tree-based translation models: Synchronous context free grammars Hierarchical phrase-based

More information

Compositions of Top-down Tree Transducers with ε-rules

Compositions of Top-down Tree Transducers with ε-rules Compositions of Top-down Tree Transducers with ε-rules Andreas Maletti 1, and Heiko Vogler 2 1 Universitat Rovira i Virgili, Departament de Filologies Romàniques Avinguda de Catalunya 35, 43002 Tarragona,

More information

Syntax-Directed Translations and Quasi-alphabetic Tree Bimorphisms Revisited

Syntax-Directed Translations and Quasi-alphabetic Tree Bimorphisms Revisited Syntax-Directed Translations and Quasi-alphabetic Tree Bimorphisms Revisited Andreas Maletti and C t lin Ionuµ Tîrn uc Universitat Rovira i Virgili Departament de Filologies Romàniques Av. Catalunya 35,

More information

A Syntax-based Statistical Machine Translation Model. Alexander Friedl, Georg Teichtmeister

A Syntax-based Statistical Machine Translation Model. Alexander Friedl, Georg Teichtmeister A Syntax-based Statistical Machine Translation Model Alexander Friedl, Georg Teichtmeister 4.12.2006 Introduction The model Experiment Conclusion Statistical Translation Model (STM): - mathematical model

More information

SYNCHRONOUS GRAMMARS AS TREE TRANSDUCERS

SYNCHRONOUS GRAMMARS AS TREE TRANSDUCERS SYNCHRONOUS GRAMMARS AS TREE TRANSDUCERS STUART M. SHIEBER ABSTRACT. Tree transducer formalisms were developed in the formal language theory community as generalizations of finite-state transducers from

More information

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS Automata Theory Lecture on Discussion Course of CS2 This Lecture is about Mathematical Models of Computation. Why Should I Care? - Ways of thinking. - Theory can drive practice. - Don t be an Instrumentalist.

More information

Syntax-based Statistical Machine Translation

Syntax-based Statistical Machine Translation Syntax-based Statistical Machine Translation Philip Williams and Philipp Koehn 29 October 2014 Part I Part II - Introduction - Rule Extraction Part III - Decoding Part IV - Extensions Syntax-based Statistical

More information

Syntax-Based Decoding

Syntax-Based Decoding Syntax-Based Decoding Philipp Koehn 9 November 2017 1 syntax-based models Synchronous Context Free Grammar Rules 2 Nonterminal rules NP DET 1 2 JJ 3 DET 1 JJ 3 2 Terminal rules N maison house NP la maison

More information

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u,

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u, 1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u, v, x, y, z as per the pumping theorem. 3. Prove that

More information

Quasi-Synchronous Phrase Dependency Grammars for Machine Translation. lti

Quasi-Synchronous Phrase Dependency Grammars for Machine Translation. lti Quasi-Synchronous Phrase Dependency Grammars for Machine Translation Kevin Gimpel Noah A. Smith 1 Introduction MT using dependency grammars on phrases Phrases capture local reordering and idiomatic translations

More information

Decoding and Inference with Syntactic Translation Models

Decoding and Inference with Syntactic Translation Models Decoding and Inference with Syntactic Translation Models March 5, 2013 CFGs S NP VP VP NP V V NP NP CFGs S NP VP S VP NP V V NP NP CFGs S NP VP S VP NP V NP VP V NP NP CFGs S NP VP S VP NP V NP VP V NP

More information

Efficient Incremental Decoding for Tree-to-String Translation

Efficient Incremental Decoding for Tree-to-String Translation Efficient Incremental Decoding for Tree-to-String Translation Liang Huang 1 1 Information Sciences Institute University of Southern California 4676 Admiralty Way, Suite 1001 Marina del Rey, CA 90292, USA

More information

Introduction to Computers & Programming

Introduction to Computers & Programming 16.070 Introduction to Computers & Programming Theory of computation: What is a computer? FSM, Automata Prof. Kristina Lundqvist Dept. of Aero/Astro, MIT Models of Computation What is a computer? If you

More information

Structure and Complexity of Grammar-Based Machine Translation

Structure and Complexity of Grammar-Based Machine Translation Structure and of Grammar-Based Machine Translation University of Padua, Italy New York, June 9th, 2006 1 2 Synchronous context-free grammars Definitions Computational problems 3 problem SCFG projection

More information

Part I - Introduction Part II - Rule Extraction Part III - Decoding Part IV - Extensions

Part I - Introduction Part II - Rule Extraction Part III - Decoding Part IV - Extensions Syntax-based Statistical Machine Translation Philip Williams and Philipp Koehn 29 October 2014 Part I - Introduction Part II - Rule Extraction Part III - Decoding Part IV - Extensions Syntax-based Statistical

More information

Tree transformations and dependencies

Tree transformations and dependencies Tree transformations and deendencies Andreas Maletti Universität Stuttgart Institute for Natural Language Processing Azenbergstraße 12, 70174 Stuttgart, Germany andreasmaletti@imsuni-stuttgartde Abstract

More information

Finite State Transducers

Finite State Transducers Finite State Transducers Eric Gribkoff May 29, 2013 Original Slides by Thomas Hanneforth (Universitat Potsdam) Outline 1 Definition of Finite State Transducer 2 Examples of FSTs 3 Definition of Regular

More information

Analysing Soft Syntax Features and Heuristics for Hierarchical Phrase Based Machine Translation

Analysing Soft Syntax Features and Heuristics for Hierarchical Phrase Based Machine Translation Analysing Soft Syntax Features and Heuristics for Hierarchical Phrase Based Machine Translation David Vilar, Daniel Stein, Hermann Ney IWSLT 2008, Honolulu, Hawaii 20. October 2008 Human Language Technology

More information

Bimorphism Machine Translation

Bimorphism Machine Translation Bimorphism Machine Translation Von der Fakultät für Mathematik und Informatik der Universität Leipzig angenommene D I S S E R T A T I O N zur Erlangung des akademischen Grades DOCTOR PHILOSOPHIAE (Dr.

More information

Fuzzy Tree Transducers

Fuzzy Tree Transducers Advances in Fuzzy Mathematics. ISSN 0973-533X Volume 11, Number 2 (2016), pp. 99-112 Research India Publications http://www.ripublication.com Fuzzy Tree Transducers S. R. Chaudhari and M. N. Joshi Department

More information

Finite-state Machines: Theory and Applications

Finite-state Machines: Theory and Applications Finite-state Machines: Theory and Applications Unweighted Finite-state Automata Thomas Hanneforth Institut für Linguistik Universität Potsdam December 10, 2008 Thomas Hanneforth (Universität Potsdam) Finite-state

More information

Natural Language Processing (CSEP 517): Machine Translation

Natural Language Processing (CSEP 517): Machine Translation Natural Language Processing (CSEP 57): Machine Translation Noah Smith c 207 University of Washington nasmith@cs.washington.edu May 5, 207 / 59 To-Do List Online quiz: due Sunday (Jurafsky and Martin, 2008,

More information

Learning to translate with neural networks. Michael Auli

Learning to translate with neural networks. Michael Auli Learning to translate with neural networks Michael Auli 1 Neural networks for text processing Similar words near each other France Spain dog cat Neural networks for text processing Similar words near each

More information

Pure and O-Substitution

Pure and O-Substitution International Journal of Foundations of Computer Science c World Scientific Publishing Company Pure and O-Substitution Andreas Maletti Department of Computer Science, Technische Universität Dresden 01062

More information

Cross-Entropy and Estimation of Probabilistic Context-Free Grammars

Cross-Entropy and Estimation of Probabilistic Context-Free Grammars Cross-Entropy and Estimation of Probabilistic Context-Free Grammars Anna Corazza Department of Physics University Federico II via Cinthia I-8026 Napoli, Italy corazza@na.infn.it Giorgio Satta Department

More information

What we have done so far

What we have done so far What we have done so far DFAs and regular languages NFAs and their equivalence to DFAs Regular expressions. Regular expressions capture exactly regular languages: Construct a NFA from a regular expression.

More information

Tree Transducers in Machine Translation

Tree Transducers in Machine Translation Tree Transducers in Machine Translation Andreas Maletti Universitat Rovira i Virgili Tarragona, pain andreas.maletti@urv.cat Jena August 23, 2010 Tree Transducers in MT A. Maletti 1 Machine translation

More information

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism, CS 54, Lecture 2: Finite Automata, Closure Properties Nondeterminism, Why so Many Models? Streaming Algorithms 0 42 Deterministic Finite Automata Anatomy of Deterministic Finite Automata transition: for

More information

Extended Multi Bottom-Up Tree Transducers

Extended Multi Bottom-Up Tree Transducers Extended Multi Bottom-Up Tree Transducers Joost Engelfriet 1, Eric Lilin 2, and Andreas Maletti 3;? 1 Leiden Instite of Advanced Comper Science Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands

More information

Introduction to Automata

Introduction to Automata Introduction to Automata Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr 1 /

More information

Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation

Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation Joern Wuebker, Hermann Ney Human Language Technology and Pattern Recognition Group Computer Science

More information

Tree Adjoining Grammars

Tree Adjoining Grammars Tree Adjoining Grammars TAG: Parsing and formal properties Laura Kallmeyer & Benjamin Burkhardt HHU Düsseldorf WS 2017/2018 1 / 36 Outline 1 Parsing as deduction 2 CYK for TAG 3 Closure properties of TALs

More information

Harvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata

Harvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata Harvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata Salil Vadhan October 4, 2012 Reading: Sipser, 2.2. Another example of a CFG (with proof) L = {x {a, b} : x has the same # of a s and

More information

Composition and Decomposition of Extended Multi Bottom-Up Tree Transducers?

Composition and Decomposition of Extended Multi Bottom-Up Tree Transducers? Composition and Decomposition of Extended Multi Bottom-Up Tree Transducers? Joost Engelfriet 1, Eric Lilin 2, and Andreas Maletti 3;?? 1 Leiden Institute of Advanced Computer Science Leiden University,

More information

Context-Free Grammars (and Languages) Lecture 7

Context-Free Grammars (and Languages) Lecture 7 Context-Free Grammars (and Languages) Lecture 7 1 Today Beyond regular expressions: Context-Free Grammars (CFGs) What is a CFG? What is the language associated with a CFG? Creating CFGs. Reasoning about

More information

Feature-Rich Translation by Quasi-Synchronous Lattice Parsing

Feature-Rich Translation by Quasi-Synchronous Lattice Parsing Feature-Rich Translation by Quasi-Synchronous Lattice Parsing Kevin Gimpel and Noah A. Smith Language Technologies Institute School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213,

More information

Syntax Analysis Part I

Syntax Analysis Part I 1 Syntax Analysis Part I Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007-2013 2 Position of a Parser in the Compiler Model Source Program Lexical Analyzer

More information

Computational Models - Lecture 4

Computational Models - Lecture 4 Computational Models - Lecture 4 Regular languages: The Myhill-Nerode Theorem Context-free Grammars Chomsky Normal Form Pumping Lemma for context free languages Non context-free languages: Examples Push

More information

Einführung in die Computerlinguistik

Einführung in die Computerlinguistik Einführung in die Computerlinguistik Context-Free Grammars (CFG) Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Summer 2016 1 / 22 CFG (1) Example: Grammar G telescope : Productions: S NP VP NP

More information

Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Machine Translation. Uwe Dick

Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Machine Translation. Uwe Dick Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Machine Translation Uwe Dick Google Translate Rosetta Stone Hieroglyphs Demotic Greek Machine Translation Automatically translate

More information

Dual Decomposition for Natural Language Processing. Decoding complexity

Dual Decomposition for Natural Language Processing. Decoding complexity Dual Decomposition for atural Language Processing Alexander M. Rush and Michael Collins Decoding complexity focus: decoding problem for natural language tasks motivation: y = arg max y f (y) richer model

More information

Why Synchronous Tree Substitution Grammars?

Why Synchronous Tree Substitution Grammars? Why ynchronous Tr ustitution Grammars? Andras Maltti Univrsitat Rovira i Virgili Tarragona, pain andras.maltti@urv.cat Los Angls Jun 4, 2010 Why ynchronous Tr ustitution Grammars? A. Maltti 1 Motivation

More information

Shift-Reduce Word Reordering for Machine Translation

Shift-Reduce Word Reordering for Machine Translation Shift-Reduce Word Reordering for Machine Translation Katsuhiko Hayashi, Katsuhito Sudoh, Hajime Tsukada, Jun Suzuki, Masaaki Nagata NTT Communication Science Laboratories, NTT Corporation 2-4 Hikaridai,

More information

Algorithms for Syntax-Aware Statistical Machine Translation

Algorithms for Syntax-Aware Statistical Machine Translation Algorithms for Syntax-Aware Statistical Machine Translation I. Dan Melamed, Wei Wang and Ben Wellington ew York University Syntax-Aware Statistical MT Statistical involves machine learning (ML) seems crucial

More information

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs Harry Lewis October 8, 2013 Reading: Sipser, pp. 119-128. Pushdown Automata (review) Pushdown Automata = Finite automaton

More information

Minimization of Weighted Automata

Minimization of Weighted Automata Minimization of Weighted Automata Andreas Maletti Universitat Rovira i Virgili Tarragona, Spain Wrocław May 19, 2010 Minimization of Weighted Automata Andreas Maletti 1 In Collaboration with ZOLTÁN ÉSIK,

More information

Syntactical analysis. Syntactical analysis. Syntactical analysis. Syntactical analysis

Syntactical analysis. Syntactical analysis. Syntactical analysis. Syntactical analysis Context-free grammars Derivations Parse Trees Left-recursive grammars Top-down parsing non-recursive predictive parsers construction of parse tables Bottom-up parsing shift/reduce parsers LR parsers GLR

More information

UNIT II REGULAR LANGUAGES

UNIT II REGULAR LANGUAGES 1 UNIT II REGULAR LANGUAGES Introduction: A regular expression is a way of describing a regular language. The various operations are closure, union and concatenation. We can also find the equivalent regular

More information

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) Parsing Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) S N VP V NP D N John hit the ball Levels of analysis Level Morphology/Lexical POS (morpho-synactic), WSD Elements

More information

Lecture 3: Nondeterministic Finite Automata

Lecture 3: Nondeterministic Finite Automata Lecture 3: Nondeterministic Finite Automata September 5, 206 CS 00 Theory of Computation As a recap of last lecture, recall that a deterministic finite automaton (DFA) consists of (Q, Σ, δ, q 0, F ) where

More information

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY 5-453 FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY NON-DETERMINISM and REGULAR OPERATIONS THURSDAY JAN 6 UNION THEOREM The union of two regular languages is also a regular language Regular Languages Are

More information

A Discriminative Model for Semantics-to-String Translation

A Discriminative Model for Semantics-to-String Translation A Discriminative Model for Semantics-to-String Translation Aleš Tamchyna 1 and Chris Quirk 2 and Michel Galley 2 1 Charles University in Prague 2 Microsoft Research July 30, 2015 Tamchyna, Quirk, Galley

More information

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013 CMPSCI 250: Introduction to Computation Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013 λ-nfa s to NFA s to DFA s Reviewing the Three Models and Kleene s Theorem The Subset

More information

Multiple System Combination. Jinhua Du CNGL July 23, 2008

Multiple System Combination. Jinhua Du CNGL July 23, 2008 Multiple System Combination Jinhua Du CNGL July 23, 2008 Outline Introduction Motivation Current Achievements Combination Strategies Key Techniques System Combination Framework in IA Large-Scale Experiments

More information

Lagrangian Relaxation Algorithms for Inference in Natural Language Processing

Lagrangian Relaxation Algorithms for Inference in Natural Language Processing Lagrangian Relaxation Algorithms for Inference in Natural Language Processing Alexander M. Rush and Michael Collins (based on joint work with Yin-Wen Chang, Tommi Jaakkola, Terry Koo, Roi Reichart, David

More information

Syntax Analysis Part I

Syntax Analysis Part I 1 Syntax Analysis Part I Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007-2013 2 Position of a Parser in the Compiler Model Source Program Lexical Analyzer

More information

1. (a) Explain the procedure to convert Context Free Grammar to Push Down Automata.

1. (a) Explain the procedure to convert Context Free Grammar to Push Down Automata. Code No: R09220504 R09 Set No. 2 II B.Tech II Semester Examinations,December-January, 2011-2012 FORMAL LANGUAGES AND AUTOMATA THEORY Computer Science And Engineering Time: 3 hours Max Marks: 75 Answer

More information

Probabilistic Graphical Models: Lagrangian Relaxation Algorithms for Natural Language Processing

Probabilistic Graphical Models: Lagrangian Relaxation Algorithms for Natural Language Processing Probabilistic Graphical Models: Lagrangian Relaxation Algorithms for atural Language Processing Alexander M. Rush (based on joint work with Michael Collins, Tommi Jaakkola, Terry Koo, David Sontag) Uncertainty

More information

Synchronous Grammars

Synchronous Grammars ynchronous Grammars ynchronous grammars are a way of simultaneously generating pairs of recursively related strings ynchronous grammar w wʹ ynchronous grammars were originally invented for programming

More information

Intro to Theory of Computation

Intro to Theory of Computation Intro to Theory of Computation 1/19/2016 LECTURE 3 Last time: DFAs and NFAs Operations on languages Today: Nondeterminism Equivalence of NFAs and DFAs Closure properties of regular languages Sofya Raskhodnikova

More information

Chapter 3. Regular grammars

Chapter 3. Regular grammars Chapter 3 Regular grammars 59 3.1 Introduction Other view of the concept of language: not the formalization of the notion of effective procedure, but set of words satisfying a given set of rules Origin

More information

Multi-Task Word Alignment Triangulation for Low-Resource Languages

Multi-Task Word Alignment Triangulation for Low-Resource Languages Multi-Task Word Alignment Triangulation for Low-Resource Languages Tomer Levinboim and David Chiang Department of Computer Science and Engineering University of Notre Dame {levinboim.1,dchiang}@nd.edu

More information

Google s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

Google s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Google s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Y. Wu, M. Schuster, Z. Chen, Q.V. Le, M. Norouzi, et al. Google arxiv:1609.08144v2 Reviewed by : Bill

More information

Unit 2: Tree Models. CS 562: Empirical Methods in Natural Language Processing. Lectures 19-23: Context-Free Grammars and Parsing

Unit 2: Tree Models. CS 562: Empirical Methods in Natural Language Processing. Lectures 19-23: Context-Free Grammars and Parsing CS 562: Empirical Methods in Natural Language Processing Unit 2: Tree Models Lectures 19-23: Context-Free Grammars and Parsing Oct-Nov 2009 Liang Huang (lhuang@isi.edu) Big Picture we have already covered...

More information

Lecture Notes on Inductive Definitions

Lecture Notes on Inductive Definitions Lecture Notes on Inductive Definitions 15-312: Foundations of Programming Languages Frank Pfenning Lecture 2 August 28, 2003 These supplementary notes review the notion of an inductive definition and give

More information

Translation as Weighted Deduction

Translation as Weighted Deduction Translation as Weighted Deduction Adam Lopez University of Edinburgh 10 Crichton Street Edinburgh, EH8 9AB United Kingdom alopez@inf.ed.ac.uk Abstract We present a unified view of many translation algorithms

More information

Overview (Fall 2007) Machine Translation Part III. Roadmap for the Next Few Lectures. Phrase-Based Models. Learning phrases from alignments

Overview (Fall 2007) Machine Translation Part III. Roadmap for the Next Few Lectures. Phrase-Based Models. Learning phrases from alignments Overview Learning phrases from alignments 6.864 (Fall 2007) Machine Translation Part III A phrase-based model Decoding in phrase-based models (Thanks to Philipp Koehn for giving me slides from his EACL

More information

Statistical Machine Translation. Part III: Search Problem. Complexity issues. DP beam-search: with single and multi-stacks

Statistical Machine Translation. Part III: Search Problem. Complexity issues. DP beam-search: with single and multi-stacks Statistical Machine Translation Marcello Federico FBK-irst Trento, Italy Galileo Galilei PhD School - University of Pisa Pisa, 7-19 May 008 Part III: Search Problem 1 Complexity issues A search: with single

More information

DT2118 Speech and Speaker Recognition

DT2118 Speech and Speaker Recognition DT2118 Speech and Speaker Recognition Language Modelling Giampiero Salvi KTH/CSC/TMH giampi@kth.se VT 2015 1 / 56 Outline Introduction Formal Language Theory Stochastic Language Models (SLM) N-gram Language

More information

Shift-Reduce Word Reordering for Machine Translation

Shift-Reduce Word Reordering for Machine Translation Shift-Reduce Word Reordering for Machine Translation Katsuhiko Hayashi, Katsuhito Sudoh, Hajime Tsukada, Jun Suzuki, Masaaki Nagata NTT Communication Science Laboratories, NTT Corporation 2-4 Hikaridai,

More information

Translation as Weighted Deduction

Translation as Weighted Deduction Translation as Weighted Deduction Adam Lopez University of Edinburgh 10 Crichton Street Edinburgh, EH8 9AB United Kingdom alopez@inf.ed.ac.uk Abstract We present a unified view of many translation algorithms

More information

Succinctness of the Complement and Intersection of Regular Expressions

Succinctness of the Complement and Intersection of Regular Expressions Succinctness of the Complement and Intersection of Regular Expressions Wouter Gelade and Frank Neven Hasselt University and transnational University of Limburg February 21, 2008 W. Gelade (Hasselt University)

More information

Multi-Source Neural Translation

Multi-Source Neural Translation Multi-Source Neural Translation Barret Zoph and Kevin Knight Information Sciences Institute Department of Computer Science University of Southern California {zoph,knight}@isi.edu In the neural encoder-decoder

More information