Rule Extraction for Machine Translation


Andreas Maletti
Institute of Computer Science, Universität Leipzig, Germany
on leave from: Institute for Natural Language Processing, Universität Stuttgart, Germany
Darmstadt, September 10, 2014

Main notions

Machine translation (MT): automatic natural language translation (by a computer), as opposed to manual translation and computer-aided translation (e.g., translation memory).

Statistical machine translation (SMT): MT using systems automatically obtained from translations, as opposed to rule-based machine translation (old; e.g., SYSTRAN) and example-based machine translation (translation by analogy).

Rule Extraction for MT, A. Maletti

Short history

Timeline:
1. Dark age (60s-90s): rule-based systems (e.g., SYSTRAN), CHOMSKYan approach; perfect translation, poor coverage.
2. Reformation (1991-present): phrase-based and syntax-based systems, statistical approach; cheap, automatically trained.
3. Potential future: semantics-based systems (e.g., FRAMENET-based), semi-supervised statistical approach; basic understanding of the translated text.

Examples

Applications: technical manuals.

Example (an mp3 player):
"The synchronous manifestation of lyrics is a procedure for can broadcasting the music, waiting the mp3 file at the same time showing the lyrics."

Example (an mp3 player):
"With the this kind method that the equipments that synchronous function of support up broadcast to make use of document create setup, you can pass the LCD window way the check at the document contents that broadcast."

Example (an mp3 player):
"That procedure returns offerings to have to modify, and delete, and stick top, keep etc. edit function."

Example (Hotel Uppsala, Sweden):
"Wir hatten die Zimmer eingestuft wird als Superior weil sie renoviert wurde im letzten Jahr oder zwei. Unsere Zimmer hatten Parkettboden und waren sehr geräumig. Man musste allerdings nicht musste seitwärts bewegen."

Original English review:
"We stayed in rooms classified as superior because they had been renovated in the last year or two. Our rooms had wood floors and were roomy. You didn't have to walk sideways to move around."

Example (JONES, SHEN, HERZOG 2009), US military:
Soldier: Okay, what is your name?
Local: Abdul.
Soldier: And your last name?
Local: Al Farran.

The same dialogue through speech-to-text machine translation:
Soldier: Okay, what's your name?
Local: milk a mechanic and I am here I mean yes
Soldier: What is your last name?
Local: every two weeks my son's name is ismail

Examples

Applications: technical manuals, US military, MSDN, Knowledge Base, ...

Standard pipeline

Schema: Input → Translation model → Language model → Output
(the models are often integrated in practice)

Required resources:
- bilingual text (sentences in both languages): 1.5M sent.
- monolingual text (in the target language): 44M sent.
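As a toy illustration of how the two models combine in this pipeline, the sketch below scores translation candidates by adding translation-model and language-model log-probabilities. The word tables and numbers are invented for illustration, not real estimates.

```python
import math

# Toy translation model: log P(source | target) for a single word.
TM = {("haus", "house"): math.log(0.7), ("haus", "home"): math.log(0.3)}
# Toy language model: unigram log P(target word) as a stand-in.
LM = {"house": math.log(0.6), "home": math.log(0.4)}

def decode(source_word):
    """Pick the target word maximizing the combined log-score."""
    candidates = [tgt for (src, tgt) in TM if src == source_word]
    return max(candidates, key=lambda t: TM[(source_word, t)] + LM[t])

print(decode("haus"))  # -> house
```

In a real system both models contribute weighted feature scores and the maximization runs over whole sentences, but the additive combination of log-scores is the same.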

Word Alignment

English-German example:
We can help countries catch up, but not by putting their neighbours on hold
Wir können Ländern beim Aufholen helfen, aber nicht, indem wir ihre Nachbarn in den Wartesaal schicken

English-Russian example:
I was the one who sat down and copied them
Я был единственным, кто занялся копированием демо на кассете

Parallel Corpus

EUROPARL German-English parallel corpus:
- 1,920,209 parallel sentences
- 44,548,491 words in German
- 47,818,827 words in English
- sentence-aligned, but not word-aligned
- from parliament proceedings

Phrase-based Models

Word positions (the alignment A links source position l to target position l′):
Könnten 1 Sie 2 mir 3 eine 4 Auskunft 5 zu 6 Artikel im 9 Zusammenhang 10 mit 11 der 12 Unzulässigkeit 13 geben 14
I 1 would 2 like 3 your 4 advice 5 about 6 Rule concerning 9 inadmissibility 10

Algorithm:
1. A phrase pair ([j, j′], [i, i′]) is consistently aligned if
   - l′ ∈ [i, i′] for all l ∈ [j, j′] with (l, l′) ∈ A, and
   - l ∈ [j, j′] for all l′ ∈ [i, i′] with (l, l′) ∈ A.
2. Extract all consistently aligned phrase pairs.
3. (Restrict the length of the phrases based on corpus size.)
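A minimal sketch of this extraction, assuming the alignment A is given as a set of 1-indexed (source, target) link pairs; function and variable names are my own:

```python
def consistent(alignment, j1, j2, i1, i2):
    """A phrase pair ([j1,j2],[i1,i2]) is consistently aligned iff every
    link touching the source span lands inside the target span and vice
    versa, and at least one link lies inside."""
    inside = False
    for (j, i) in alignment:
        src_in = j1 <= j <= j2
        tgt_in = i1 <= i <= i2
        if src_in != tgt_in:          # link crosses the phrase boundary
            return False
        inside = inside or (src_in and tgt_in)
    return inside

def extract_phrases(alignment, src_len, tgt_len, max_len=7):
    """Enumerate all consistently aligned phrase pairs up to max_len."""
    pairs = []
    for j1 in range(1, src_len + 1):
        for j2 in range(j1, min(j1 + max_len, src_len + 1)):
            for i1 in range(1, tgt_len + 1):
                for i2 in range(i1, min(i1 + max_len, tgt_len + 1)):
                    if consistent(alignment, j1, j2, i1, i2):
                        pairs.append(((j1, j2), (i1, i2)))
    return pairs

# First links of the slide's example: Könnten–would/like, Sie–your.
A = {(1, 2), (1, 3), (2, 4)}
print(consistent(A, 1, 1, 2, 3))  # True: ([1,1],[2,3]) is consistent
```

The length restriction from step 3 appears here as `max_len`.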

Formally, the minimal consistently aligned phrase pairs are:
([1,1],[2,3]) ([2,2],[4,4]) ([3,3],[1,1]) ([4,5],[5,5]) ([6,6],[6,6]) ([7,7],[7,7]) ([8,8],[8,8]) ([9,11],[9,9]) ([12,13],[10,10])

For better readability:
Könnten → would like
Sie → your
mir → I
eine Auskunft → advice
zu → about
Artikel → Rule
im Zusammenhang mit → concerning
der Unzulässigkeit → inadmissibility

Notes:
- these were only the minimal phrase pairs
- extract all (sensible) combinations of these; e.g., ([1,1],[2,3]) and ([2,2],[4,4]) yield ([1,2],[2,4])
- unaligned words can be added to neighboring phrases; e.g., ([12,13],[10,10]) extends to ([12,14],[10,10])

Resulting phrases:
Könnten Sie → would like your
der Unzulässigkeit geben → inadmissibility
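The combination step can be sketched as follows. This is a simplified rule that only joins pairs adjacent on both sides in the same order; the general procedure also allows reordered neighbors:

```python
def combine(p, q):
    """Join two phrase pairs that are adjacent on both the source and
    the target side into one larger phrase pair, or return None."""
    (j1, j2), (i1, i2) = p
    (k1, k2), (l1, l2) = q
    if k1 == j2 + 1 and l1 == i2 + 1:
        return ((j1, k2), (i1, l2))
    return None

# The slide's example: ([1,1],[2,3]) and ([2,2],[4,4]) yield ([1,2],[2,4]).
print(combine(((1, 1), (2, 3)), ((2, 2), (4, 4))))  # ((1, 2), (2, 4))
```

Extending a phrase over neighboring unaligned words works analogously: the enlarged span stays consistent because unaligned words contribute no alignment links.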

Phrase-based Models

Alternative representation: each extracted phrase pair corresponds to a rectangle in the word-alignment matrix (figure not reproduced here).

Phrase-based Models

Rule weights:
- simple relative frequencies during extraction
- normally, different normalizations serve as features

Ideally:
- weights should be set to the utility of the rule for explaining the training data
- this would require reprocessing the training data
- EM or similar algorithms are available, but impractical; EM is not used
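The relative-frequency estimate mentioned above, sketched on invented counts:

```python
from collections import Counter

def relative_frequencies(extracted_pairs):
    """phi(tgt | src) = count(src, tgt) / count(src): the simple
    relative-frequency estimate computed at extraction time."""
    pair_counts = Counter(extracted_pairs)
    src_counts = Counter(src for (src, tgt) in extracted_pairs)
    return {(src, tgt): c / src_counts[src]
            for (src, tgt), c in pair_counts.items()}

# Hypothetical extracted instances, for illustration only.
probs = relative_frequencies([("Auskunft", "advice"),
                              ("Auskunft", "advice"),
                              ("Auskunft", "information")])
print(probs[("Auskunft", "advice")])  # 2/3
```

In practice the same counts are normalized in several directions (source given target, target given source, lexical weights) and used as separate features.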

Syntax-based Machine Translation

Syntax-based systems: Input → Parser → Machine translation system → Language model → Output

Syntax-based Machine Translation

CHARNIAK parser [CHARNIAK, JOHNSON, 2005]: parse tree of the English sentence "We must bear in mind the Community as a whole" (tree not reproduced here).

BitPar parser [SCHMID, 2006]: parse tree of the German sentence "Wir müssen uns davor hüten, alles vergemeinschaften zu wollen" (tree not reproduced here).

Syntax-based Machine Translation

Arabic-English:
Yugoslav President Voislav signed for Serbia.
Translit.: w twly AltwqyE En Srbya Alr}ys AlywgwslAfy fwyslaf.

And then the matter was decided, and everything was put in place.
Translit.: f kan An tm AlHsm w wdet Al>mwr fy nab ha.

Below are the male and female winners in the different categories.
Translit.: w hna Al>wA}l w Al>wlyAt fy mxtlf Alf}At.

Syntax-based Machine Translation

Alignment:
Yugoslav President Voislav signed for Serbia
w twly AltwqyE En Srbya Alr}ys AlywgwslAfy fwyslaf

Syntax-based Machine Translation [GALLEY et al., 2004]

Word-aligned tree pair: the English parse of "Yugoslav President Voislav signed for Serbia" (a subject NP containing the NML "Yugoslav President" and "Voislav", a VP with "signed" and the PP "for Serbia") aligned to the Arabic parse of "w twly AltwqyE En Srbya Alr}ys AlywgwslAfy fwyslaf" (trees not reproduced here).

Syntax-based Machine Translation

Rule extraction on the aligned tree pair:
- Select the next node bottom-up
- Identify the maximal subtree of aligned nodes
- Identify the subtree of nodes aligned to the aligned nodes, etc.
- Extract the rule and leave a state
- Repeat

Extracted lexical rules so far (each extracted subtree is replaced by a state):
Yugoslav -q_Y-> AlywgwslAfy
President -q_P-> Alr}ys
Voislav -q_V-> fwyslaf
for -q_f-> En
Serbia -q_S-> Srbya

Continuing upward, the NML node yields the rule
NML(q_Y, q_P) -q_NML-> (q_P, q_Y)
(the two children swap order on the Arabic side).

Further extracted rules:
S(q_S) -q_S-> S(q_S)
PP(q_f, q_S) -q_PP-> PP(q_f, q_S)
SBJ(q_NML, q_V) -q_SBJ-> SBJ(q_NML, S(q_V))

Syntax-based Machine Translation

Extracted rules:
Yugoslav -q_Y-> AlywgwslAfy
President -q_P-> Alr}ys
Voislav -q_V-> fwyslaf
for -q_f-> En
Serbia -q_S-> Srbya
NML(q_Y, q_P) -q_NML-> (q_P, q_Y)
S(q_S) -q_S-> S(q_S)
PP(q_f, q_S) -q_PP-> PP(q_f, q_S)
SBJ(q_NML, q_V) -q_SBJ-> SBJ(q_NML, S(q_V))

These are rules of an Extended Top-down Tree Transducer.
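A compact sketch of the frontier-node computation behind this procedure, assuming the span/closure idea of GALLEY et al., 2004: a node may be cut (and replaced by a state) when the closure of its target span does not overlap the spans of nodes outside its subtree. The Node class and the alignment positions are illustrative stand-ins for a real parser and word aligner.

```python
class Node:
    """Hypothetical parse-tree node; `align` holds the target
    positions a leaf word is aligned to."""
    def __init__(self, label, children=None, align=None):
        self.label = label
        self.children = children or []
        self.align = set(align or [])

    def span(self):
        # Target positions aligned to any word in this node's yield.
        if not self.children:
            return set(self.align)
        return set().union(*(c.span() for c in self.children))

def frontier_set(root):
    """Collect nodes whose span closure is disjoint from the spans of
    all nodes outside their subtree: the cut points for rules."""
    frontier = []

    def visit(node, outside):
        s = node.span()
        if s:
            closure = set(range(min(s), max(s) + 1))
            if not closure & outside:
                frontier.append(node)
        for i, child in enumerate(node.children):
            siblings = set().union(set(),
                *(c.span() for j, c in enumerate(node.children) if j != i))
            visit(child, outside | siblings)

    visit(root, set())
    return frontier

# The NML subtree of the example: Yugoslav and President are aligned
# to adjacent Arabic positions, so every node here is a frontier node.
example = Node("NML", [
    Node("JJ", [Node("Yugoslav", align=[7])]),
    Node("N", [Node("President", align=[6])]),
])
print([n.label for n in frontier_set(example)])
```

A node whose span interleaves with a sibling's span would fail the closure test and could not become a rule boundary.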

Extended Top-down Tree Transducer

Advantages:
- simple and natural model
- easy to train [GRAEHL et al., 2008]
- symmetric (obtained from linguistic resources)

Disadvantages:
- weights are set in the same manner and do not capture utility
- EM is available, but not used in practice

From Automata to Transducers

Graphical representation: a state q_SBJ with the two trees SBJ(q_NML, q_V) and SBJ(q_NML, S(q_V)) corresponds to the two productions
q_SBJ → SBJ(q_NML, q_V)
q_SBJ → SBJ(q_NML, S(q_V))

From Automata to Transducers

General idea: synchronous grammars are essentially two grammars over the same nonterminals whose productions are paired.

Convention: the same nonterminals are synchronized (or linked) and develop at the same time.

From Automata to Transducers

Approach:
- join two productions q1 → r1 and q2 → r2 into (q1, q2) → (r1, r2)
- demand q1 = q = q2 for simplicity and write r1 -q-> r2
- paired productions develop the input and the output tree at the same time
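In code, the pairing step is just a join on the shared state (a sketch; the function name and tree encoding as strings are mine):

```python
def pair_productions(input_prods, output_prods):
    """Synchronize two grammars: productions q -> r1 and q -> r2 that
    share the state q are joined into one synchronous production,
    written r1 <-q-> r2 on the slides."""
    return [(q1, r1, r2)
            for (q1, r1) in input_prods
            for (q2, r2) in output_prods
            if q1 == q2]

sync = pair_productions(
    [("q_SBJ", "SBJ(q_NML, q_V)")],
    [("q_SBJ", "SBJ(q_NML, S(q_V))")])
print(sync)
```

Every state shared by both grammars thus yields the cross product of its input and output right-hand sides.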

From Automata to Transducers

Example derivation (English "the boy saw the door" paired with Arabic "wa ra'aa atefl albab"):
- start with the linked state q
- apply q -> ⟨q, S(CONJ(wa), q)⟩: the Arabic side gains the conjunction wa
- apply q -> ⟨S(q1, VP(p, q2)), S(p, q1, q2)⟩: English is subject-verb-object, Arabic is verb-initial
- apply p -> ⟨V(saw), V(ra'aa)⟩
- apply the rules for the nominals: DT(the) and N(boy) yield N(atefl) (English "the boy" corresponds to the single Arabic word atefl), and DT(the) and N(door) yield N(albab)

Result: the English tree for "the boy saw the door" paired with the Arabic tree for "wa ra'aa atefl albab".
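The simultaneous development of linked nonterminals can be mimicked on strings. The toy rules below use hypothetical nonterminal names NP1 and NP2 and the transliterations from the slides; each rule rewrites its nonterminal on both sides at once:

```python
def derive(src, tgt, rules):
    """Rewrite linked nonterminals in both strings simultaneously
    until only terminal words remain (toy synchronous derivation)."""
    changed = True
    while changed:
        changed = False
        for nt, (s_rhs, t_rhs) in rules.items():
            if nt in src.split() and nt in tgt.split():
                src = " ".join(s_rhs if w == nt else w for w in src.split())
                tgt = " ".join(t_rhs if w == nt else w for w in tgt.split())
                changed = True
    return src, tgt

rules = {
    "NP1": ("the boy", "atefl"),
    "NP2": ("the door", "albab"),
}
print(derive("NP1 saw NP2", "wa ra'aa NP1 NP2", rules))
# ('the boy saw the door', "wa ra'aa atefl albab")
```

On trees the mechanism is the same, except that the linked nonterminals are states sitting at tree nodes rather than words in a string.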

From Automata to Transducers

Remarks:
- synchronization breaks almost all existing constructions (e.g., the normalization construction)
- the basic grammar model is therefore very important

Other Syntax-based Models

Important relations:
- SCFG = synchronous context-free grammar [CHIANG, 2007]: LTG-LTG (synchronous local tree grammar); ln-TOP is a special top-down tree transducer
- STSG = synchronous tree substitution grammar [EISNER, 2003]: TSG-TSG; ln-XTOP is a special extended top-down tree transducer
- STAG = synchronous tree adjunction grammar [SHIEBER, SCHABES, 1990]: TAG-TAG
- SCFTG = synchronous context-free tree grammar [NEDERHOF, VOGLER, 2012]: CFTG-CFTG

Other Syntax-based Models

Towards asymmetric relations:
- STSSG = synchronous tree-sequence substitution grammar [ZHANG et al., 2008]: TSSG-TSSG
- lMBOT = local shallow multi bottom-up tree transducer [BRAUNE et al., 2013]: LTG-TSSG
- ln-XMBOT corresponds roughly to RTG-TSSG

Where is Machine Learning?

Examples:
- unsupervised learning of translation models for STSG [BLUNSOM et al., 2008]
- uses GIBBS sampling, a hierarchical DIRICHLET process, ...

But...
- can only be used on small data
- does not scale well
- currently not used in practice

Machine Learning & Grammatical Inference

Quo vadis?

Literature

Selected references:
- ARNOLD, DAUCHET: Morphismes et Bimorphismes d'Arbres. Theoret. Comput. Sci. 20, 1982
- BRAUNE, SEEMANN, QUERNHEIM, MALETTI: Shallow local multi bottom-up tree transducers in statistical machine translation. Proc. ACL, 2013
- GALLEY, HOPKINS, KNIGHT, MARCU: What's in a Translation Rule? Proc. NAACL, 2004
- KOEHN: Statistical Machine Translation. Cambridge University Press, 2010
- GRAEHL, HOPKINS, KNIGHT: The Power of Extended Top-down Tree Transducers. SIAM J. Comput. 39, 2009
- OCH, NEY: The Alignment Template Approach to Statistical Machine Translation. Comput. Linguist. 30, 2004


More information

Hierarchies of Tree Series TransducersRevisited 1

Hierarchies of Tree Series TransducersRevisited 1 Hierarchies of Tree Series TransducersRevisited 1 Andreas Maletti 2 Technische Universität Dresden Fakultät Informatik June 27, 2006 1 Financially supported by the Friends and Benefactors of TU Dresden

More information

The Power of Weighted Regularity-Preserving Multi Bottom-up Tree Transducers

The Power of Weighted Regularity-Preserving Multi Bottom-up Tree Transducers International Journal of Foundations of Computer cience c World cientific Publishing Company The Power of Weighted Regularity-Preserving Multi Bottom-up Tree Transducers ANDREA MALETTI Universität Leipzig,

More information

Maschinelle Sprachverarbeitung

Maschinelle Sprachverarbeitung Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other

More information

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science Natural Language Processing CS 6840 Lecture 06 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Statistical Parsing Define a probabilistic model of syntax P(T S):

More information

Tuning as Linear Regression

Tuning as Linear Regression Tuning as Linear Regression Marzieh Bazrafshan, Tagyoung Chung and Daniel Gildea Department of Computer Science University of Rochester Rochester, NY 14627 Abstract We propose a tuning method for statistical

More information

LECTURER: BURCU CAN Spring

LECTURER: BURCU CAN Spring LECTURER: BURCU CAN 2017-2018 Spring Regular Language Hidden Markov Model (HMM) Context Free Language Context Sensitive Language Probabilistic Context Free Grammar (PCFG) Unrestricted Language PCFGs can

More information

The Power of Tree Series Transducers

The Power of Tree Series Transducers The Power of Tree Series Transducers Andreas Maletti 1 Technische Universität Dresden Fakultät Informatik June 15, 2006 1 Research funded by German Research Foundation (DFG GK 334) Andreas Maletti (TU

More information

Latent Variable Models in NLP

Latent Variable Models in NLP Latent Variable Models in NLP Aria Haghighi with Slav Petrov, John DeNero, and Dan Klein UC Berkeley, CS Division Latent Variable Models Latent Variable Models Latent Variable Models Observed Latent Variable

More information

Aspects of Tree-Based Statistical Machine Translation

Aspects of Tree-Based Statistical Machine Translation Aspects of Tree-Based Statistical Machine Translation Marcello Federico Human Language Technology FBK 2014 Outline Tree-based translation models: Synchronous context free grammars Hierarchical phrase-based

More information

Syntax-based Statistical Machine Translation

Syntax-based Statistical Machine Translation Syntax-based Statistical Machine Translation Philip Williams and Philipp Koehn 29 October 2014 Part I Part II - Introduction - Rule Extraction Part III - Decoding Part IV - Extensions Syntax-based Statistical

More information

Natural Language Processing (CSEP 517): Machine Translation

Natural Language Processing (CSEP 517): Machine Translation Natural Language Processing (CSEP 57): Machine Translation Noah Smith c 207 University of Washington nasmith@cs.washington.edu May 5, 207 / 59 To-Do List Online quiz: due Sunday (Jurafsky and Martin, 2008,

More information
