Tree Transducers in Machine Translation

Andreas Maletti
Universitat Rovira i Virgili, Tarragona, Spain
andreas.maletti@urv.cat
Jena, August 23, 2010
Slide footer: Tree Transducers in MT, A. Maletti

Machine translation
Applications
- Technical manuals
Example (An mp3 player)
The synchronous manifestation of lyrics is a procedure for can broadcasting the music, waiting the mp3 file at the same time showing the lyrics. With the this kind method that the equipments that synchronous function of support up broadcast to make use of document create setup, you can pass the LCD window way the check at the document contents that broadcast. That procedure returns offerings to have to modify, and delete, and stick top, keep etc. edit function.

Machine translation
Applications
- Technical manuals
- TripAdvisor®
Example (Hotel Uppsala, Sweden)
German: Wir hatten die Zimmer eingestuft wird als Superior weil sie renoviert wurde im letzten Jahr oder zwei. Unsere Zimmer hatten Parkettboden und waren sehr geräumig. Man musste allerdings nicht musste seitwärts bewegen.
Spanish: Nos alojamos en habitaciones clasificado como superior porque se lo habían renovado en el año pasado o dos. Nuestras habitaciones tenían suelos de madera y eran espaciosas. No te tenías que caminar arriba para movernos por allí.
English original: We stayed in rooms classified as superior because they had been renovated in the last year or two. Our rooms had wood floors and were roomy. You didn't have to walk sideways to move around.

Machine translation
Applications
- Technical manuals
- TripAdvisor®
- Military
Example (JONES, SHEN, HERZOG 2009)
Soldier: Okay, what is your name?
Local: Abdul.
Soldier: And your last name?
Local: Al Farran.
Speech-to-text machine translation
Soldier: Okay, what's your name?
Local: milk a mechanic and I am here I mean yes
Soldier: What is your last name?
Local: every two weeks my son's name is ismail

Machine translation
Applications
- Technical manuals
- TripAdvisor®
- Military
- MSDN, Knowledge Base
- ...

Machine translation (cont'd)
Systems
- GOOGLE translate (translate.google.com)
- BING translator
- LANGUAGE WEAVER + SDL
- ...
Try them!

Machine translation (cont'd)
History
1. Dark age (60s-90s)
   - rule-based systems (e.g., SYSTRAN)
   - CHOMSKYAN approach
   - perfect translation, poor coverage
2. Reformation (1991-present)
   - word-based, phrase-based, syntax-based systems
   - statistical approach
   - cheap, automatically trained
3. Potential future
   - semantics-based systems (e.g., FRAMENET)
   - semi-supervised, statistical approach
   - basic understanding of translated text

Machine translation (cont'd)
Schema
Input -> Machine translation system -> Language model -> Output

Word-based system (FST)
Source: And then the matter was decided, and everything was put in place
Target: f kan An tm AlHsm w wdet Almwr fy na ha
Derivation (remaining input | output so far)
1. And then the matter was decided, and everything was put in place | (empty)
2. then the matter was decided, and everything was put in place | (empty)
3. the matter was decided, and everything was put in place | (empty)
4. the matter was decided, and everything was put in place | f
5. the matter was decided, and everything was put in place | f kan
6. the matter, and everything was put in place | f kan An tm AlHsm
7. the matter and everything was put in place | f kan An tm AlHsm
8. the matter everything was put in place | f kan An tm AlHsm w
9. the matter was put in place | f kan An tm AlHsm w
10. the matter in place | f kan An tm AlHsm w wdet
11. in place | f kan An tm AlHsm w wdet Almwr
12. place | f kan An tm AlHsm w wdet Almwr fy
13. (empty) | f kan An tm AlHsm w wdet Almwr fy na ha

Phrase-based machine translation
Schema
Input -> Machine translation system -> Language model -> Output
Phrase-based systems
Input -> Segmenter -> Machine translation system -> Language model -> Output

Phrase-based system (FST+Perm)
Source: And then the matter was decided, and everything was put in place
Target: f kan An tm AlHsm w wdet Almwr fy na ha
Derivation (each segment carries its target position in brackets)
Input: [And then](1) [the matter](5) [was decided](2) [, and everything](3) [was put](4) [in place](6)
Output: [f kan](1) [An tm AlHsm](2) [w](3) [wdet](4) [Almwr](5) [fy na ha](6)
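
The segment, translate, permute derivation above can be sketched in a few lines of Python. The phrase pairs and target positions are the ones shown on the slide; the names `PHRASES` and `translate` are illustrative assumptions, and a real phrase-based system would choose the segmentation and the order by scoring rather than by a fixed table.

```python
# Phrase pairs from the slide: (source phrase, target phrase, target position).
PHRASES = [
    ("And then", "f kan", 1),
    ("the matter", "Almwr", 5),
    ("was decided", "An tm AlHsm", 2),
    (", and everything", "w", 3),
    ("was put", "wdet", 4),
    ("in place", "fy na ha", 6),
]


def translate(phrases):
    """Translate each segment, then reorder the segments by target position."""
    ordered = sorted(phrases, key=lambda phrase: phrase[2])
    return " ".join(target for _source, target, _position in ordered)


if __name__ == "__main__":
    print(translate(PHRASES))
```

The permutation is the only thing that distinguishes this from the word-based derivation: "the matter" is read second but emitted fifth.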

Machine translation (cont'd)
Phrase-based systems
Input -> Segmenter -> Machine translation system -> Language model -> Output
Syntax-based systems
Input -> Parser -> Machine translation system -> Language model -> Output

Parser
[Figure: parse tree of the sentence "And then the matter was decided, and everything was put in place"; node labels garbled in extraction]
(thanks to KEVIN KNIGHT for the data)

[Figure: the English parse tree of the same sentence next to the parse tree of its transliterated Arabic translation "f kan An tm AlHsm w wdet Almwr fy na ha"]

[Figure: aligned parse trees for a second sentence pair; English words: Yugoslav, President, Voislav, signed, for, Serbia; transliterated Arabic words as extracted: twly, fwyslaf, Alr}ys, AlywgwslAfy, AltwyE, En, rya, w]

Contents
1. Machine Translation
2. Extended Top-down Tree Transducers
3. Multi Bottom-up Tree Transducers
4. Synchronous Tree-Adjoining Grammars

Weight structure
Definition
Commutative semiring $(C, +, \cdot, 0, 1)$ if
- $(C, +, 0)$ and $(C, \cdot, 1)$ are commutative monoids
- $\cdot$ distributes over finite (incl. empty) sums
Example
- BOOLEAN semiring $(\{0, 1\}, \max, \min, 0, 1)$
- Semiring $(\mathbb{R}_{\geq 0}, +, \cdot, 0, 1)$ of probabilities
- Tropical semiring $(\mathbb{N} \cup \{\infty\}, \min, +, \infty, 0)$
- Any field, ring, etc.
Most of the talk: BOOLEAN semiring
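
To make the role of the weight structure concrete, here is a minimal Python sketch of the three example semirings. It assumes (this is not stated on the slide) that the weight of a derivation is the product of its rule weights and that the weights of alternative derivations are summed; the names `Semiring` and `total_weight` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    add: Callable[[Any, Any], Any]   # semiring addition
    mul: Callable[[Any, Any], Any]   # semiring multiplication
    zero: Any                        # neutral element of add
    one: Any                         # neutral element of mul


BOOLEAN = Semiring(max, min, 0, 1)
PROBABILITY = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
TROPICAL = Semiring(min, lambda a, b: a + b, math.inf, 0)


def total_weight(semiring, derivations):
    """Sum, over all derivations, of the product of each derivation's rule weights."""
    total = semiring.zero            # the empty sum is the additive identity
    for rule_weights in derivations:
        weight = semiring.one        # the empty product is the multiplicative identity
        for w in rule_weights:
            weight = semiring.mul(weight, w)
        total = semiring.add(total, weight)
    return total
```

With the same code, the BOOLEAN semiring answers "is there a derivation?", the probability semiring adds up derivation probabilities, and the tropical semiring returns the cost of the cheapest derivation.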

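Syntax
Definition (ARNOLD, DAUCHET 1976; GRAEHL, KNIGHT 2004)
Extended top-down tree transducer (XTOP) $M = (Q, \Sigma, \Delta, I, R)$ with finitely many rules $q(l) \to r$, where
- the left-hand side applies the state $q$ to a tree $l$ over $\Sigma$ with variables $x_1, \dots, x_k$
- the right-hand side $r$ is a tree over $\Delta$ containing state calls $q'(x_i), \dots, p(x_j)$
- $q, q', p \in Q$ are states
- $i, j \in \{1, \dots, k\}$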

Syntax (cont'd)
Definition (ROUNDS 1970, THATCHER 1970)
- Top-down tree transducer (TOP) if all rules have the form $q(\sigma(x_1, \dots, x_k)) \to r$, with state calls $q'(x_i), \dots, p(x_j)$ in $r$
- linear if no variable occurs twice in $r$, for all rules $l \to r$
- nondeleting if $\mathrm{var}(l) = \mathrm{var}(r)$, for all rules $l \to r$

Semantics
Example
Three states (names garbled in extraction), of which only one is initial.
[Figure: the rules over variables x_1, x_2 and a step-by-step derivation on an input tree with subtrees t_1, t_2, t_3; tree structure lost in extraction]

Semantics (cont'd)
Definition
Computed transformation: $\tau_M = \{(t, u) \in T_\Sigma \times T_\Delta \mid \exists q \in I \colon q(t) \Rightarrow^* u\}$
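
The computed transformation can be illustrated with a small Python interpreter for XTOP rules. This is a sketch under stated assumptions: trees are tuples `(label, child, ...)`, variables are plain strings, left-hand sides are linear, and the three rules in `RULES` are a hypothetical toy grammar, not the rules from the slide.

```python
import itertools
from collections import namedtuple

# A state call q(x) on the right-hand side of a rule.
Call = namedtuple("Call", ["state", "var"])


def match(pattern, tree, binding):
    """Match a left-hand side against an input tree, recording variable bindings."""
    if isinstance(pattern, str):                 # a variable matches any subtree
        binding[pattern] = tree
        return True
    if pattern[0] != tree[0] or len(pattern) != len(tree):
        return False
    return all(match(p, t, binding) for p, t in zip(pattern[1:], tree[1:]))


def expand(rules, rhs, binding):
    """Yield every output tree obtained by resolving the state calls in rhs."""
    if isinstance(rhs, Call):
        yield from apply(rules, rhs.state, binding[rhs.var])
        return
    options = [list(expand(rules, child, binding)) for child in rhs[1:]]
    for children in itertools.product(*options):
        yield (rhs[0], *children)


def apply(rules, state, tree):
    """Yield every u with state(tree) =>* u."""
    for rule_state, lhs, rhs in rules:
        binding = {}
        if rule_state == state and match(lhs, tree, binding):
            yield from expand(rules, rhs, binding)


# Hypothetical toy rules; the first has an extended (depth-2) left-hand side
# and swaps the two variable subtrees.
RULES = [
    ("q", ("S", "x1", ("VP", "x2")), ("S", Call("q", "x2"), Call("q", "x1"))),
    ("q", ("a",), ("A",)),
    ("q", ("b",), ("B",)),
]

if __name__ == "__main__":
    print(list(apply(RULES, "q", ("S", ("a",), ("VP", ("b",))))))
```

Collecting `apply(rules, q, t)` over all initial states `q` gives the set of outputs related to `t` by the transformation; an input that no rule matches simply yields nothing.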

[Figure: the aligned tree pair for "Yugoslav President Voislav signed for Serbia" and its transliterated Arabic translation, shown in four steps with successively highlighted alignments]

Rule extraction
[Figure: rules extracted step by step from the aligned tree pair. Recoverable fragments: a rule with variables x_1, x_2 pairing "VBD signed" with "CONJ w", "PV twly" and "DET-NN AltwyE", in which x_1 and x_2 appear in swapped order on the output side; a rule pairing "Voislav" with "NN-PROP fwyslaf"; a rule pairing "NML (JJ Yugoslav, President)" with "DET-NN Alr}ys, DET-ADJ AlywgwslAfy". Tree structure lost in extraction]

Symmetry
[Figure: the original rule (input side with "VBD signed" and variables x_1, x_2; output side with "CONJ w", "PV twly", "DET-NN AltwyE" and the variables in the order x_2, x_1) next to the inverted rule, which swaps the two sides]
Linear nondeleting XTT can be inverted
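
A minimal sketch of rule inversion, assuming the synchronous view of a rule as a pair of tree fragments that share variables (state annotations are omitted for brevity). The tree shapes in `RULE` are only loosely modelled on the "signed"/"twly" rule and are hypothetical, as are the function names.

```python
def variables(fragment):
    """List the variables (plain strings) of a tree fragment, left to right."""
    if isinstance(fragment, str):
        return [fragment]
    return [v for child in fragment[1:] for v in variables(child)]


def is_linear_nondeleting(lhs, rhs):
    """Each variable occurs exactly once on each side."""
    left, right = variables(lhs), variables(rhs)
    return (len(set(left)) == len(left)
            and len(set(right)) == len(right)
            and set(left) == set(right))


def invert(rule):
    """Swap the two sides of a linear nondeleting rule."""
    lhs, rhs = rule
    if not is_linear_nondeleting(lhs, rhs):
        raise ValueError("only linear nondeleting rules can be inverted this way")
    return (rhs, lhs)


# Hypothetical rule shapes; only the words come from the slide.
RULE = (
    ("S", "x1", ("VP", ("VBD", ("signed",)), "x2")),
    ("S", ("CONJ", ("w",)), ("VP", ("PV", ("twly",)), "x2", "x1")),
)
```

A copying rule such as `(("a", "x1"), ("s", "x1", "x1"))` fails the check, because after swapping the sides the new left-hand side would contain `x1` twice.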

Preservation of regularity
Schematics
Input -> Parser -> Parse trees -> XTT -> ...
Parse trees
- best parse tree
- n-best parses
- all parses
Can all be represented by a regular tree language

Preservation of regularity (cont'd)
Schematics
Regular language -> XTT -> Regular language?
Approach
- Input restriction
- Project to output
Result
Linear XTT preserve regularity

Composition
Schematics
Parse trees -> Stage 1 XTT -> Stage 2 XTT -> Stage 3 XTT -> ...
Goal: Parse trees -> Composed XTT -> ...
Example (YAMADA, KNIGHT 2002)
- Reorder
- Insert words
- Translate words

Composition (cont'd)
Example (ARNOLD, DAUCHET 1982)
[Figure: trees built from the symbols σ and δ over subtrees t_1, ..., t_n, illustrating a transformation computed by a composition of transducers; tree structure lost in extraction]

Summary
Criteria (columns): EXPR, SYM, PRES, PRES^{-1}, COMP [table entries lost in extraction]
Models (rows):
- Linear nondeleting TOP
- Linear TOP
- Linear TOP^R
- General TOP
- General TOP^R
- Linear nondeleting XTOP
- Linear XTOP
- Linear XTOP^R
- General XTOP
- General XTOP^R
- Comp. closure ln-XTOP
- composable ln-XTOP (marked ?? on the slide)

Implementation
TIBURON [MAY, KNIGHT 2006]
- Implements XTOP (and tree automata; everything also weighted)
- Framework with command-line interface
- Optimized for machine translation
Algorithms
- Application of XTOP to input tree/language
- Backward application of XTOP to output language
- Composition (for some XTOP)
Example
.(DT(the) N(boy)) -> (N(atefl)) [state name and root labels lost in extraction]

Multi Bottom-up Tree Transducers
[Figure: the aligned "Yugoslav President Voislav signed for Serbia" tree pair, repeated as section opener]

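Syntax
Definition
Extended multi bottom-up tree transducer (XMBOT) is $M = (Q, \Sigma, \Delta, F, R)$ with finitely many rules, where
- the left-hand side is a tree over $\Sigma$ whose leaves may be state calls $q(x_1, \dots, x_l), \dots, p(x_m, \dots, x_n)$
- the right-hand side is a state $q'$ applied to trees over $\Delta$ containing the variables $x_i, \dots, x_j$
- $q, p, q' \in Q$ are now ranked states
- $F \subseteq Q_1$ are the final states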

Example
States $\{f^{(1)}, q^{(2)}\}$ with final state $f$.
[Figure: the rules over the symbols e, a, σ and variables x_1, x_2; tree structure lost in extraction]
Example (Derivation)
[Figure: a bottom-up derivation shown in six steps; the last step reaches the final state f with an output tree rooted in σ]

Semantics
Definition
Computed transformation: $\tau_M = \{(t, u) \in T_\Sigma \times T_\Delta \mid \exists q \in F \colon t \Rightarrow^* q(u)\}$
Example
For the rules of the previous example: $\tau_M = \{\langle t, \sigma(t, t) \rangle \mid t \in T_\Sigma\}$
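
The copying transformation can be sketched in Python by letting a rank-2 state carry two output trees upward. The rule shapes are an assumption (the slide's rule trees did not survive extraction): every input symbol is rebuilt once in each component of the state `q`, and a final rule wraps both components under σ. The names `run_q` and `copy_transducer` are hypothetical.

```python
def run_q(tree):
    """Process the input bottom-up; return the two output trees held by state q.

    Assumed rule shape for a symbol g with k children:
        g(q(x11, x12), ..., q(xk1, xk2)) -> q(g(x11, ..., xk1), g(x12, ..., xk2))
    Each variable is used exactly once, so the rule is linear.
    """
    label, children = tree[0], tree[1:]
    pairs = [run_q(child) for child in children]
    first = (label, *(pair[0] for pair in pairs))
    second = (label, *(pair[1] for pair in pairs))
    return first, second


def copy_transducer(tree):
    """Final rule: q(x1, x2) -> f(sigma(x1, x2)); return the tree under f."""
    x1, x2 = run_q(tree)
    return ("sigma", x1, x2)


if __name__ == "__main__":
    print(copy_transducer(("a", ("a", ("e",)))))
```

No rule copies a variable; the duplication comes entirely from the state holding two trees at once.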

[Figure: the aligned "Yugoslav President Voislav signed for Serbia" tree pair, shown in five steps with successively highlighted alignments]

Rule extraction
[Figure: XMBOT rules extracted from the aligned tree pair. Recoverable fragments: a rule with variables x_1, x_2, x_3 and a state p whose output side contains "CONJ w" and the variables in the order x_2, x_3, x_1; a rule pairing "VBD signed" with "PV twly" and "DET-NN AltwyE" under state p. Tree structure lost in extraction]

One-symbol normal form
Definition
A rule is in one-symbol normal form if it contains at most one symbol.
Example (ENGELFRIET, LILIN, 2009)
[Figure: a rule over the variables x_1, x_2 and its decomposition into rules in one-symbol normal form; tree structure lost in extraction]

Basic properties
Example (Copying translation)
$\tau_M = \{\langle t, \sigma(t, t) \rangle \mid t \in T_\Sigma\}$
Consequences
- XMBOT are not symmetric
- XMBOT do not preserve regularity
- but they can be composed

Composition
[Figure: composition of rules over the symbols σ, α, γ with states p, p_1, p_2 and variables x_1, x_2, built up in four steps; tree structure lost in extraction]
Simple composition works in the typical cases [BAKER 1979, ENGELFRIET 1975]

Summary
Criteria (columns): EXPR, SYM, PRES, PRES^{-1}, COMP [table entries lost in extraction]
Models (rows):
- Linear nondeleting TOP
- Linear nondeleting XTOP
- Linear nondeleting XMBOT
- Linear XMBOT
- General XMBOT
- reg.-preserving linear XMBOT
- invertable linear XMBOT

Implementation
No implementation yet, but stay tuned

Synchronous Tree-Adjoining Grammars
[Figure: the aligned "Yugoslav President Voislav signed for Serbia" tree pair, repeated as section opener]

Syntax
Definition (SHIEBER, SCHABES 1990)
Synchronous tree-adjoining grammar (STAG) is $G = (N, \Sigma, \Delta, S, R)$ with a finite set $R$ of
- substitution rules
- adjunction rules
Example (Substitution rule)
[Figure: a pair of trees rooted in nonterminal X, over Σ on the input side, with linked nonterminals X_1, ..., X_k; tree structure lost in extraction]
Example (Adjunction rule)
[Figure: a pair of auxiliary trees with root and foot nonterminal X; tree structure lost in extraction]

Example
[Figure: a synchronous derivation of an English/French tree pair, shown step by step. Rules used, as far as recoverable:]
1. Substitution rule introducing the paired sentence skeletons with linked positions 1 and 2 around V
2. Substitution rule: V(likes) paired with V(aime)
3. Substitution rule introducing DT(les) and N on the French side
4. Substitution rule: N(candies) paired with N(bonbons)
5. Adjunction rule: ADJ placed before N on the English side, after N on the French side
6. Substitution rule: ADJ(red) paired with ADJ(rouges)
Visible result: "likes red candies" paired with "aime les bonbons rouges"

Semantics
Definition
Computed transformation: $\tau_G = \{(t, u) \in T_\Sigma \times T_\Delta \mid (S, S) \Rightarrow^* (t, u)\}$

Relation to tree transducers
Illustration
[Diagram relating STSG and STAG through the classes EMB and HOM and the mapping eval; layout lost in extraction]
Definition (SHIEBER 2006)
An embedded tree transducer is a macro tree transducer that is
- linear, nondeleting, deterministic, total
- 1-parameter: linear, nondeleting

Copying example
Example
[Figure: STAG rules over the nonterminal T and the symbols a, c, and a derivation with them; tree structure lost in extraction]
String translation
$\{(w c w^R, w c w) \mid w \in \{a, b\}^*\}$
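
For readers who want to see the string translation concretely, the following Python sketch enumerates its pairs up to a length bound. It enumerates the relation directly and does not simulate STAG derivations; the alphabet {a, b} is an assumption (the second letter is missing in the extracted slide), and `copy_pairs` is a hypothetical name.

```python
from itertools import product


def copy_pairs(max_len, alphabet="ab"):
    """Yield (w c w^R, w c w) for every word w over `alphabet` with len(w) <= max_len."""
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            w = "".join(letters)
            yield (w + "c" + w[::-1], w + "c" + w)


if __name__ == "__main__":
    for source, target in copy_pairs(2):
        print(source, target)
```

The input side is a palindrome language around `c`, while the output side is the copy language, which is why neither direction preserves regularity.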

Basic properties
Example (Copying translation)
$\tau_G = \{(w c w^R, w c w) \mid w \in \{a, b\}^*\}$
Consequences
- STAG are symmetric
- STAG do not preserve regularity (neither direction)

Summary
Criteria (columns): EXPR, SYM, PRES, PRES^{-1}, COMP [table entries lost in extraction]
Models (rows):
- Linear nondeleting TOP
- Linear nondeleting XTOP
- Linear nondeleting XMBOT
- Linear XMBOT
- General XMBOT
- reg.-preserving linear XMBOT
- invertable linear XMBOT
- STAG

132 Implementation XTAG [THE XTAG PROJECT 2008] Implements TAG, TAG Optimized for natural language applications Application of TAG Tree Transducers in MT A. Maletti 47


134 References
ARNOLD, DAUCHET: Bi-transductions de forêts. ICALP 1976
BERSTEL, REUTENAUER: Recognizable formal power series on trees. Theor. Comput. Sci. 18, 1982
ENGELFRIET: Top-down tree transducers with regular look-ahead. Math. Systems Theory 10, 1977
GALLEY, HOPKINS, KNIGHT, MARCU: What's in a translation rule? HLT-NAACL 2004
GRAEHL, KNIGHT: Training tree transducers. HLT-NAACL 2004
MAY, KNIGHT: TIBURON: a weighted tree automata toolkit. CIAA 2006
ROUNDS: Mappings and grammars on trees. Math. Systems Theory 4, 1970
THATCHER: Generalized$^2$ sequential machine maps. J. Comput. System Sci. 4, 1970

Rule Extraction for Machine Translation

Rule Extraction for Machine Translation Rule Extraction for Machine Translation Andreas Maletti Institute of Computer cience Universität Leipzig, Germany on leave from: Institute for Natural Language Processing Universität tuttgart, Germany

More information

How to train your multi bottom-up tree transducer

How to train your multi bottom-up tree transducer How to train your multi bottom-up tree transducer Andreas Maletti Universität tuttgart Institute for Natural Language Processing tuttgart, Germany andreas.maletti@ims.uni-stuttgart.de Portland, OR June

More information

Applications of Tree Automata Theory Lecture V: Theory of Tree Transducers

Applications of Tree Automata Theory Lecture V: Theory of Tree Transducers Applications of Tree Automata Theory Lecture V: Theory of Tree Transducers Andreas Maletti Institute of Computer Science Universität Leipzig, Germany on leave from: Institute for Natural Language Processing

More information

Multi Bottom-up Tree Transducers

Multi Bottom-up Tree Transducers Multi Bottom-up Tree Transducers Andreas Maletti Institute for Natural Language Processing Universität tuttgart, Germany maletti@ims.uni-stuttgart.de Avignon April 24, 2012 Multi Bottom-up Tree Transducers

More information

Lecture 7: Introduction to syntax-based MT

Lecture 7: Introduction to syntax-based MT Lecture 7: Introduction to syntax-based MT Andreas Maletti Statistical Machine Translation Stuttgart December 16, 2011 SMT VII A. Maletti 1 Lecture 7 Goals Overview Tree substitution grammars (tree automata)

More information

Linking Theorems for Tree Transducers

Linking Theorems for Tree Transducers Linking Theorems for Tree Transducers Andreas Maletti maletti@ims.uni-stuttgart.de peyer October 1, 2015 Andreas Maletti Linking Theorems for MBOT Theorietag 2015 1 / 32 tatistical Machine Translation

More information

Why Synchronous Tree Substitution Grammars?

Why Synchronous Tree Substitution Grammars? Why ynchronous Tr ustitution Grammars? Andras Maltti Univrsitat Rovira i Virgili Tarragona, pain andras.maltti@urv.cat Los Angls Jun 4, 2010 Why ynchronous Tr ustitution Grammars? A. Maltti 1 Motivation

More information

Hierarchies of Tree Series TransducersRevisited 1

Hierarchies of Tree Series TransducersRevisited 1 Hierarchies of Tree Series TransducersRevisited 1 Andreas Maletti 2 Technische Universität Dresden Fakultät Informatik June 27, 2006 1 Financially supported by the Friends and Benefactors of TU Dresden

More information

Statistical Machine Translation of Natural Languages

Statistical Machine Translation of Natural Languages 1/26 Statistical Machine Translation of Natural Languages Heiko Vogler Technische Universität Dresden Germany Graduiertenkolleg Quantitative Logics and Automata Dresden, November, 2012 1/26 Weighted Tree

More information

Compositions of Bottom-Up Tree Series Transformations

Compositions of Bottom-Up Tree Series Transformations Compositions of Bottom-Up Tree Series Transformations Andreas Maletti a Technische Universität Dresden Fakultät Informatik D 01062 Dresden, Germany maletti@tcs.inf.tu-dresden.de May 17, 2005 1. Motivation

More information

Preservation of Recognizability for Weighted Linear Extended Top-Down Tree Transducers

Preservation of Recognizability for Weighted Linear Extended Top-Down Tree Transducers Preservation of Recognizability for Weighted Linear Extended Top-Down Tree Transducers Nina eemann and Daniel Quernheim and Fabienne Braune and Andreas Maletti University of tuttgart, Institute for Natural

More information

The Power of Tree Series Transducers

The Power of Tree Series Transducers The Power of Tree Series Transducers Andreas Maletti 1 Technische Universität Dresden Fakultät Informatik June 15, 2006 1 Research funded by German Research Foundation (DFG GK 334) Andreas Maletti (TU

More information

Compositions of Tree Series Transformations

Compositions of Tree Series Transformations Compositions of Tree Series Transformations Andreas Maletti a Technische Universität Dresden Fakultät Informatik D 01062 Dresden, Germany maletti@tcs.inf.tu-dresden.de December 03, 2004 1. Motivation 2.

More information

Lecture 9: Decoding. Andreas Maletti. Stuttgart January 20, Statistical Machine Translation. SMT VIII A. Maletti 1

Lecture 9: Decoding. Andreas Maletti. Stuttgart January 20, Statistical Machine Translation. SMT VIII A. Maletti 1 Lecture 9: Decoding Andreas Maletti Statistical Machine Translation Stuttgart January 20, 2012 SMT VIII A. Maletti 1 Lecture 9 Last time Synchronous grammars (tree transducers) Rule extraction Weight training

More information

Applications of Tree Automata Theory Lecture VI: Back to Machine Translation

Applications of Tree Automata Theory Lecture VI: Back to Machine Translation Applications of Tree Automata Theory Lecture VI: Back to Machine Translation Andreas Maletti Institute of Computer Science Universität Leipzig, Germany on leave from: Institute for Natural Language Processing

More information

Compositions of Bottom-Up Tree Series Transformations

Compositions of Bottom-Up Tree Series Transformations Compositions of Bottom-Up Tree Series Transformations Andreas Maletti a Technische Universität Dresden Fakultät Informatik D 01062 Dresden, Germany maletti@tcs.inf.tu-dresden.de May 27, 2005 1. Motivation

More information

Compositions of Top-down Tree Transducers with ε-rules

Compositions of Top-down Tree Transducers with ε-rules Compositions of Top-down Tree Transducers with ε-rules Andreas Maletti 1, and Heiko Vogler 2 1 Universitat Rovira i Virgili, Departament de Filologies Romàniques Avinguda de Catalunya 35, 43002 Tarragona,

More information

Extended Multi Bottom-Up Tree Transducers

Extended Multi Bottom-Up Tree Transducers Extended Multi Bottom-Up Tree Transducers Joost Engelfriet 1, Eric Lilin 2, and Andreas Maletti 3;? 1 Leiden Instite of Advanced Comper Science Leiden University, P.O. Box 9512, 2300 RA Leiden, The Netherlands

More information

Composition Closure of ε-free Linear Extended Top-down Tree Transducers

Composition Closure of ε-free Linear Extended Top-down Tree Transducers Composition Closure of ε-free Linear Extended Top-down Tree Transducers Zoltán Fülöp 1, and Andreas Maletti 2, 1 Department of Foundations of Computer Science, University of Szeged Árpád tér 2, H-6720

More information

Key words. tree transducer, natural language processing, hierarchy, composition closure

Key words. tree transducer, natural language processing, hierarchy, composition closure THE POWER OF EXTENDED TOP-DOWN TREE TRANSDUCERS ANDREAS MALETTI, JONATHAN GRAEHL, MARK HOPKINS, AND KEVIN KNIGHT Abstract. Extended top-down tree transducers (transducteurs généralisés descendants [Arnold,

More information

The Power of Weighted Regularity-Preserving Multi Bottom-up Tree Transducers

The Power of Weighted Regularity-Preserving Multi Bottom-up Tree Transducers International Journal of Foundations of Computer cience c World cientific Publishing Company The Power of Weighted Regularity-Preserving Multi Bottom-up Tree Transducers ANDREA MALETTI Universität Leipzig,

More information

A Syntax-based Statistical Machine Translation Model. Alexander Friedl, Georg Teichtmeister

A Syntax-based Statistical Machine Translation Model. Alexander Friedl, Georg Teichtmeister A Syntax-based Statistical Machine Translation Model Alexander Friedl, Georg Teichtmeister 4.12.2006 Introduction The model Experiment Conclusion Statistical Translation Model (STM): - mathematical model

More information

Minimization of Weighted Automata

Minimization of Weighted Automata Minimization of Weighted Automata Andreas Maletti Universitat Rovira i Virgili Tarragona, Spain Wrocław May 19, 2010 Minimization of Weighted Automata Andreas Maletti 1 In Collaboration with ZOLTÁN ÉSIK,

More information

Syntax-based Machine Translation using Multi Bottom-up Tree Transducers

Syntax-based Machine Translation using Multi Bottom-up Tree Transducers yntx-sed Mchine Trnsltion using Multi Bottom-up Tree Trnsducers Andres Mletti Fienne Brune, Dniel Quernheim, Nin eemnn Institute for Nturl Lnguge Processing Universität tuttgrt, Germny mletti@ims.uni-stuttgrt.de

More information

Statistical Machine Translation of Natural Languages

Statistical Machine Translation of Natural Languages 1/37 Statistical Machine Translation of Natural Languages Rule Extraction and Training Probabilities Matthias Büchse, Toni Dietze, Johannes Osterholzer, Torsten Stüber, Heiko Vogler Technische Universität

More information

Compositions of Extended Top-down Tree Transducers

Compositions of Extended Top-down Tree Transducers Compositions of Extended Top-down Tree Transducers Andreas Maletti ;1 International Computer Science Institute 1947 Center Street, Suite 600, Berkeley, CA 94704, USA Abstract Unfortunately, the class of

More information

Syntax-Directed Translations and Quasi-alphabetic Tree Bimorphisms Revisited

Syntax-Directed Translations and Quasi-alphabetic Tree Bimorphisms Revisited Syntax-Directed Translations and Quasi-alphabetic Tree Bimorphisms Revisited Andreas Maletti and C t lin Ionuµ Tîrn uc Universitat Rovira i Virgili Departament de Filologies Romàniques Av. Catalunya 35,

More information

Composition and Decomposition of Extended Multi Bottom-Up Tree Transducers?

Composition and Decomposition of Extended Multi Bottom-Up Tree Transducers? Composition and Decomposition of Extended Multi Bottom-Up Tree Transducers? Joost Engelfriet 1, Eric Lilin 2, and Andreas Maletti 3;?? 1 Leiden Institute of Advanced Computer Science Leiden University,

More information

Pure and O-Substitution

Pure and O-Substitution International Journal of Foundations of Computer Science c World Scientific Publishing Company Pure and O-Substitution Andreas Maletti Department of Computer Science, Technische Universität Dresden 01062

More information

Quasi-Synchronous Phrase Dependency Grammars for Machine Translation. lti

Quasi-Synchronous Phrase Dependency Grammars for Machine Translation. lti Quasi-Synchronous Phrase Dependency Grammars for Machine Translation Kevin Gimpel Noah A. Smith 1 Introduction MT using dependency grammars on phrases Phrases capture local reordering and idiomatic translations

More information

Random Generation of Nondeterministic Tree Automata

Random Generation of Nondeterministic Tree Automata Random Generation of Nondeterministic Tree Automata Thomas Hanneforth 1 and Andreas Maletti 2 and Daniel Quernheim 2 1 Department of Linguistics University of Potsdam, Germany 2 Institute for Natural Language

More information

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science Natural Language Processing CS 6840 Lecture 06 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Statistical Parsing Define a probabilistic model of syntax P(T S):

More information

Fuzzy Tree Transducers

Fuzzy Tree Transducers Advances in Fuzzy Mathematics. ISSN 0973-533X Volume 11, Number 2 (2016), pp. 99-112 Research India Publications http://www.ripublication.com Fuzzy Tree Transducers S. R. Chaudhari and M. N. Joshi Department

More information

CS460/626 : Natural Language

CS460/626 : Natural Language CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 23, 24 Parsing Algorithms; Parsing in case of Ambiguity; Probabilistic Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 8 th,

More information

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan Language Processing Systems Prof. Mohamed Hamada Software Engineering La. The University of izu Japan Syntax nalysis (Parsing) 1. Uses Regular Expressions to define tokens 2. Uses Finite utomata to recognize

More information

Structure and Complexity of Grammar-Based Machine Translation

Structure and Complexity of Grammar-Based Machine Translation Structure and of Grammar-Based Machine Translation University of Padua, Italy New York, June 9th, 2006 1 2 Synchronous context-free grammars Definitions Computational problems 3 problem SCFG projection

More information

LECTURER: BURCU CAN Spring

LECTURER: BURCU CAN Spring LECTURER: BURCU CAN 2017-2018 Spring Regular Language Hidden Markov Model (HMM) Context Free Language Context Sensitive Language Probabilistic Context Free Grammar (PCFG) Unrestricted Language PCFGs can

More information

Synchronous Grammars

Synchronous Grammars ynchronous Grammars ynchronous grammars are a way of simultaneously generating pairs of recursively related strings ynchronous grammar w wʹ ynchronous grammars were originally invented for programming

More information

Hasse Diagrams for Classes of Deterministic Bottom-Up Tree-to-Tree-Series Transformations

Hasse Diagrams for Classes of Deterministic Bottom-Up Tree-to-Tree-Series Transformations Hasse Diagrams for Classes of Deterministic Bottom-Up Tree-to-Tree-Series Transformations Andreas Maletti 1 Technische Universität Dresden Fakultät Informatik D 01062 Dresden Germany Abstract The relationship

More information

This kind of reordering is beyond the power of finite transducers, but a synchronous CFG can do this.

This kind of reordering is beyond the power of finite transducers, but a synchronous CFG can do this. Chapter 12 Synchronous CFGs Synchronous context-free grammars are a generalization of CFGs that generate pairs of related strings instead of single strings. They are useful in many situations where one

More information

Tuning as Linear Regression

Tuning as Linear Regression Tuning as Linear Regression Marzieh Bazrafshan, Tagyoung Chung and Daniel Gildea Department of Computer Science University of Rochester Rochester, NY 14627 Abstract We propose a tuning method for statistical

More information

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) Parsing Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) S N VP V NP D N John hit the ball Levels of analysis Level Morphology/Lexical POS (morpho-synactic), WSD Elements

More information

Exam: Synchronous Grammars

Exam: Synchronous Grammars Exam: ynchronous Grammars Duration: 3 hours Written documents are allowed. The numbers in front of questions are indicative of hardness or duration. ynchronous grammars consist of pairs of grammars whose

More information

Natural Language Processing (CSEP 517): Machine Translation

Natural Language Processing (CSEP 517): Machine Translation Natural Language Processing (CSEP 57): Machine Translation Noah Smith c 207 University of Washington nasmith@cs.washington.edu May 5, 207 / 59 To-Do List Online quiz: due Sunday (Jurafsky and Martin, 2008,

More information

Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG

Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing Laura Kallmeyer, Timm Lichte, Wolfgang Maier Universität Tübingen Part I Formal Properties of TAG 16.05.2007 und 21.05.2007 TAG Parsing

More information

Tree transformations and dependencies

Tree transformations and dependencies Tree transformations and deendencies Andreas Maletti Universität Stuttgart Institute for Natural Language Processing Azenbergstraße 12, 70174 Stuttgart, Germany andreasmaletti@imsuni-stuttgartde Abstract

More information

Algorithms for Syntax-Aware Statistical Machine Translation

Algorithms for Syntax-Aware Statistical Machine Translation Algorithms for Syntax-Aware Statistical Machine Translation I. Dan Melamed, Wei Wang and Ben Wellington ew York University Syntax-Aware Statistical MT Statistical involves machine learning (ML) seems crucial

More information

Shift-Reduce Word Reordering for Machine Translation

Shift-Reduce Word Reordering for Machine Translation Shift-Reduce Word Reordering for Machine Translation Katsuhiko Hayashi, Katsuhito Sudoh, Hajime Tsukada, Jun Suzuki, Masaaki Nagata NTT Communication Science Laboratories, NTT Corporation 2-4 Hikaridai,

More information

CISC4090: Theory of Computation

CISC4090: Theory of Computation CISC4090: Theory of Computation Chapter 2 Context-Free Languages Courtesy of Prof. Arthur G. Werschulz Fordham University Department of Computer and Information Sciences Spring, 2014 Overview In Chapter

More information

Advanced Automata Theory 10 Transducers and Rational Relations

Advanced Automata Theory 10 Transducers and Rational Relations Advanced Automata Theory 10 Transducers and Rational Relations Frank Stephan Department of Computer Science Department of Mathematics National University of Singapore fstephan@comp.nus.edu.sg Advanced

More information

Notes for Comp 497 (Comp 454) Week 12 4/19/05. Today we look at some variations on machines we have already seen. Chapter 21

Notes for Comp 497 (Comp 454) Week 12 4/19/05. Today we look at some variations on machines we have already seen. Chapter 21 Notes for Comp 497 (Comp 454) Week 12 4/19/05 Today we look at some variations on machines we have already seen. Errata (Chapter 21): None known Chapter 21 So far we have seen the equivalence of Post Machines

More information

RELATING TREE SERIES TRANSDUCERS AND WEIGHTED TREE AUTOMATA

RELATING TREE SERIES TRANSDUCERS AND WEIGHTED TREE AUTOMATA International Journal of Foundations of Computer Science Vol. 16, No. 4 (2005) 723 741 c World Scientific Publishing Company RELATING TREE SERIES TRANSDUCERS AND WEIGHTED TREE AUTOMATA ANDREAS MALETTI

More information

Processing/Speech, NLP and the Web

Processing/Speech, NLP and the Web CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 25 Probabilistic Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 14 th March, 2011 Bracketed Structure: Treebank Corpus [ S1[

More information

Computer Science 160 Translation of Programming Languages

Computer Science 160 Translation of Programming Languages Computer Science 160 Translation of Programming Languages Instructor: Christopher Kruegel Building a Handle Recognizing Machine: [now, with a look-ahead token, which is LR(1) ] LR(k) items An LR(k) item

More information

Probabilistic Context Free Grammars. Many slides from Michael Collins

Probabilistic Context Free Grammars. Many slides from Michael Collins Probabilistic Context Free Grammars Many slides from Michael Collins Overview I Probabilistic Context-Free Grammars (PCFGs) I The CKY Algorithm for parsing with PCFGs A Probabilistic Context-Free Grammar

More information

Dependency Parsing. Statistical NLP Fall (Non-)Projectivity. CoNLL Format. Lecture 9: Dependency Parsing

Dependency Parsing. Statistical NLP Fall (Non-)Projectivity. CoNLL Format. Lecture 9: Dependency Parsing Dependency Parsing Statistical NLP Fall 2016 Lecture 9: Dependency Parsing Slav Petrov Google prep dobj ROOT nsubj pobj det PRON VERB DET NOUN ADP NOUN They solved the problem with statistics CoNLL Format

More information

An algebraic characterization of unary two-way transducers

An algebraic characterization of unary two-way transducers An algebraic characterization of unary two-way transducers (Extended Abstract) Christian Choffrut 1 and Bruno Guillon 1 LIAFA, CNRS and Université Paris 7 Denis Diderot, France. Abstract. Two-way transducers

More information

Decision Problems of Tree Transducers with Origin

Decision Problems of Tree Transducers with Origin Decision Problems of Tree Transducers with Origin Emmanuel Filiot a, Sebastian Maneth b, Pierre-Alain Reynier 1, Jean-Marc Talbot 1 a Université Libre de Bruxelles & FNRS b University of Edinburgh c Aix-Marseille

More information

Alessandro Mazzei MASTER DI SCIENZE COGNITIVE GENOVA 2005

Alessandro Mazzei MASTER DI SCIENZE COGNITIVE GENOVA 2005 Alessandro Mazzei Dipartimento di Informatica Università di Torino MATER DI CIENZE COGNITIVE GENOVA 2005 04-11-05 Natural Language Grammars and Parsing Natural Language yntax Paolo ama Francesca yntactic

More information

Maschinelle Sprachverarbeitung

Maschinelle Sprachverarbeitung Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other

More information

Maschinelle Sprachverarbeitung

Maschinelle Sprachverarbeitung Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other

More information

Shift-Reduce Word Reordering for Machine Translation

Shift-Reduce Word Reordering for Machine Translation Shift-Reduce Word Reordering for Machine Translation Katsuhiko Hayashi, Katsuhito Sudoh, Hajime Tsukada, Jun Suzuki, Masaaki Nagata NTT Communication Science Laboratories, NTT Corporation 2-4 Hikaridai,

More information

Chapter 14 (Partially) Unsupervised Parsing

Chapter 14 (Partially) Unsupervised Parsing Chapter 14 (Partially) Unsupervised Parsing The linguistically-motivated tree transformations we discussed previously are very effective, but when we move to a new language, we may have to come up with

More information

Unterspezifikation in der Semantik Scope Semantics in Lexicalized Tree Adjoining Grammars

Unterspezifikation in der Semantik Scope Semantics in Lexicalized Tree Adjoining Grammars in der emantik cope emantics in Lexicalized Tree Adjoining Grammars Laura Heinrich-Heine-Universität Düsseldorf Wintersemester 2011/2012 LTAG: The Formalism (1) Tree Adjoining Grammars (TAG): Tree-rewriting

More information

Introduction to Computers & Programming

Introduction to Computers & Programming 16.070 Introduction to Computers & Programming Theory of computation: What is a computer? FSM, Automata Prof. Kristina Lundqvist Dept. of Aero/Astro, MIT Models of Computation What is a computer? If you

More information

From Monadic Second-Order Definable String Transformations to Transducers

From Monadic Second-Order Definable String Transformations to Transducers From Monadic Second-Order Definable String Transformations to Transducers Rajeev Alur 1 Antoine Durand-Gasselin 2 Ashutosh Trivedi 3 1 University of Pennsylvania 2 LIAFA, Université Paris Diderot 3 Indian

More information

Aspects of Tree-Based Statistical Machine Translation

Aspects of Tree-Based Statistical Machine Translation Aspects of Tree-Based Statistical Machine Translation Marcello Federico Human Language Technology FBK 2014 Outline Tree-based translation models: Synchronous context free grammars Hierarchical phrase-based

More information

Optimal k-arization of synchronous tree-adjoining grammar

Optimal k-arization of synchronous tree-adjoining grammar Optimal k-arization of synchronous tree-adjoining grammar The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters itation Rebecca Nesson,

More information

Parsing and Translation Algorithms Based on Weighted Extended Tree Transducers

Parsing and Translation Algorithms Based on Weighted Extended Tree Transducers Paring and Tranlation Algorithm Baed on Weighted Extended Tree Tranducer Andrea Maletti Departament de Filologie Romànique Univeritat Rovira i Virgili Tarragona, Spain Giorgio Satta Department of Information

More information

CS626: NLP, Speech and the Web. Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 14: Parsing Algorithms 30 th August, 2012

CS626: NLP, Speech and the Web. Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 14: Parsing Algorithms 30 th August, 2012 CS626: NLP, Speech and the Web Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 14: Parsing Algorithms 30 th August, 2012 Parsing Problem Semantics Part of Speech Tagging NLP Trinity Morph Analysis

More information

Arturo Carpi 1 and Cristiano Maggi 2

Arturo Carpi 1 and Cristiano Maggi 2 Theoretical Informatics and Applications Theoret. Informatics Appl. 35 (2001) 513 524 ON SYNCHRONIZED SEQUENCES AND THEIR SEPARATORS Arturo Carpi 1 and Cristiano Maggi 2 Abstract. We introduce the notion

More information

Probabilistic Context-free Grammars

Probabilistic Context-free Grammars Probabilistic Context-free Grammars Computational Linguistics Alexander Koller 24 November 2017 The CKY Recognizer S NP VP NP Det N VP V NP V ate NP John Det a N sandwich i = 1 2 3 4 k = 2 3 4 5 S NP John

More information

Backward and Forward Bisimulation Minimisation of Tree Automata

Backward and Forward Bisimulation Minimisation of Tree Automata Backward and Forward Bisimulation Minimisation of Tree Automata Johanna Högberg 1, Andreas Maletti 2, and Jonathan May 3 1 Dept. of Computing Science, Umeå University, S 90187 Umeå, Sweden johanna@cs.umu.se

More information

Effectiveness of complex index terms in information retrieval

Effectiveness of complex index terms in information retrieval Effectiveness of complex index terms in information retrieval Tokunaga Takenobu, Ogibayasi Hironori and Tanaka Hozumi Department of Computer Science Tokyo Institute of Technology Abstract This paper explores

More information

Probabilistic Context-Free Grammars. Michael Collins, Columbia University

Probabilistic Context-Free Grammars. Michael Collins, Columbia University Probabilistic Context-Free Grammars Michael Collins, Columbia University Overview Probabilistic Context-Free Grammars (PCFGs) The CKY Algorithm for parsing with PCFGs A Probabilistic Context-Free Grammar

More information

Accelerated Natural Language Processing Lecture 3 Morphology and Finite State Machines; Edit Distance

Accelerated Natural Language Processing Lecture 3 Morphology and Finite State Machines; Edit Distance Accelerated Natural Language Processing Lecture 3 Morphology and Finite State Machines; Edit Distance Sharon Goldwater (based on slides by Philipp Koehn) 20 September 2018 Sharon Goldwater ANLP Lecture

More information

Decision Problems of Tree Transducers with Origin

Decision Problems of Tree Transducers with Origin Decision Problems of Tree Transducers with Origin Emmanuel Filiot 1, Sebastian Maneth 2, Pierre-Alain Reynier 3, and Jean-Marc Talbot 3 1 Université Libre de Bruxelles 2 University of Edinburgh 3 Aix-Marseille

More information

Statistical Machine Learning Theory. From Multi-class Classification to Structured Output Prediction. Hisashi Kashima.

Statistical Machine Learning Theory. From Multi-class Classification to Structured Output Prediction. Hisashi Kashima. http://goo.gl/jv7vj9 Course website KYOTO UNIVERSITY Statistical Machine Learning Theory From Multi-class Classification to Structured Output Prediction Hisashi Kashima kashima@i.kyoto-u.ac.jp DEPARTMENT

More information

MTH 65 WS 3 ( ) Radical Expressions

MTH 65 WS 3 ( ) Radical Expressions MTH 65 WS 3 (9.1-9.4) Radical Expressions Name: The next thing we need to develop is some new ways of talking aout the expression 3 2 = 9 or, more generally, 2 = a. We understand that 9 is 3 squared and

More information

problem X reduces to Problem Y solving X would be easy, if we knew how to solve Y

problem X reduces to Problem Y solving X would be easy, if we knew how to solve Y CPS220 Reducibility A reduction is a procedure to convert one problem to another problem, in such a way that a solution to the second problem can be used to solve the first problem. The conversion itself

More information

A Discriminative Model for Semantics-to-String Translation

A Discriminative Model for Semantics-to-String Translation A Discriminative Model for Semantics-to-String Translation Aleš Tamchyna 1 and Chris Quirk 2 and Michel Galley 2 1 Charles University in Prague 2 Microsoft Research July 30, 2015 Tamchyna, Quirk, Galley

More information

Multiword Expression Identification with Tree Substitution Grammars

Multiword Expression Identification with Tree Substitution Grammars Multiword Expression Identification with Tree Substitution Grammars Spence Green, Marie-Catherine de Marneffe, John Bauer, and Christopher D. Manning Stanford University EMNLP 2011 Main Idea Use syntactic

More information

Dependency grammar. Recurrent neural networks. Transition-based neural parsing. Word representations. Informs Models

Dependency grammar. Recurrent neural networks. Transition-based neural parsing. Word representations. Informs Models Dependency grammar Morphology Word order Transition-based neural parsing Word representations Recurrent neural networks Informs Models Dependency grammar Morphology Word order Transition-based neural parsing

More information

Bisimulation Minimisation for Weighted Tree Automata

Bisimulation Minimisation for Weighted Tree Automata Bisimulation Minimisation for Weighted Tree Automata Johanna Högberg 1, Andreas Maletti 2, and Jonathan May 3 1 Department of Computing Science, Umeå University S 90187 Umeå, Sweden johanna@cs.umu.se 2

More information

A Context-Free Grammar

A Context-Free Grammar Statistical Parsing A Context-Free Grammar S VP VP Vi VP Vt VP VP PP DT NN PP PP P Vi sleeps Vt saw NN man NN dog NN telescope DT the IN with IN in Ambiguity A sentence of reasonable length can easily

More information

Computational Linguistics

Computational Linguistics Computational Linguistics Dependency-based Parsing Clayton Greenberg Stefan Thater FR 4.7 Allgemeine Linguistik (Computerlinguistik) Universität des Saarlandes Summer 2016 Acknowledgements These slides

More information

On simulations and bisimulations of general flow systems

On simulations and bisimulations of general flow systems On simulations and bisimulations of general flow systems Jen Davoren Department of Electrical & Electronic Engineering The University of Melbourne, AUSTRALIA and Paulo Tabuada Department of Electrical

More information

Logical Characterization of Weighted Pebble Walking Automata

Logical Characterization of Weighted Pebble Walking Automata Logical Characterization of Weighted Pebble Walking Automata Benjamin Monmege Université libre de Bruxelles, Belgium Benedikt Bollig and Paul Gastin (LSV, ENS Cachan, France) Marc Zeitoun (LaBRI, Bordeaux

More information

Introduction to Kleene Algebras

Introduction to Kleene Algebras Introduction to Kleene Algebras Riccardo Pucella Basic Notions Seminar December 1, 2005 Introduction to Kleene Algebras p.1 Idempotent Semirings An idempotent semiring is a structure S = (S, +,, 1, 0)

More information

Lecture Notes on Inductive Definitions

Lecture Notes on Inductive Definitions Lecture Notes on Inductive Definitions 15-312: Foundations of Programming Languages Frank Pfenning Lecture 2 August 28, 2003 These supplementary notes review the notion of an inductive definition and give

More information
