
1 Welcome to Pittsburgh!

2 Overview of the IWSLT 2005 Evaluation Campaign Matthias Eck and Chiori Hori, InterACT, Carnegie Mellon University

3 IWSLT 2005 Evaluation Campaign Working on the same test bed. Training corpus release: May 20, 2005; Technical paper submission: July 25, 2005; Test corpus release: Aug 16, 2005; Result submission due: Aug 18, 2005; Camera-ready paper: Sep 25, 2005

4 Translation target Manual transcription (plain sentences in BTEC) and ASR output of spoken BTEC sentences. Examples: Where would you like to go? Is there a discount for children? Did you have fun today? Sure. Can I have a receipt? I'd like to try some local wine. No discourse.

5 Scientific questions How well can ASR output be translated in the face of recognition errors? How much can MT performance be enhanced by considering multiple hypotheses? Which hypotheses can contribute to MT performance?

6 Translation target ASR output of spoken BTEC sentences. Real evaluation conditions: read-aloud speech of the BTEC, no spontaneity. The difference between text and ASR output translations -> handling recognition errors. Providing multiple hypotheses: N-best, lattice (HTK format)

7 Directions and source input Translation directions: Chinese-English, Japanese-English, Arabic-English, Korean-English, English-Chinese; source input: manual transcription and ASR output. ASR output provided for Chinese-English (Dr. Chen, NLPR), Japanese-English (Dr. Yamamoto, ATR) and English-Chinese (Mr. Paulik, UKA).

8 Provided Data All data from the BTEC corpus. 2 development sets: C-STAR 2003 test set (506 sentences) and IWSLT 2004 test set (500 sentences). Training data: 20K sentences. Test data: 506 sentences.

9 Data and tool restriction Four conditions: Supplied (IWSLT05 corpus only); Supplied & Tools (adds tagger/chunker/parser); Unrestricted (adds public data); C-STAR (adds proprietary data).

10 Detail of data and tool restriction Supplied Data Track: the supplied corpus only. Supplied Data + Tools Track: the training data is limited to the supplied corpus; parser/chunker and tagger tools are available. Unrestricted Data Track: all publicly available data, including data crawled from the web. C-STAR Track: no limitations on the linguistic resources; full BTEC corpus and proprietary data.

11 Participants - 17 institutions / 16 groups Institution - Systems: RWTH Aachen University - RWTH; ITC-irst Center for Scientific and Technological Research - ITC-IRST; University of Edinburgh - EDINBURGH; Nagaoka University of Technology - NGKT; University of Southern California Information Sciences Institute - USC-ISI; University of Tokyo - TOKYO; ATR Spoken Language Communication Research Labs - ATR-ALEPH, ATR-SLR

12 Participants 19 translation systems Institution - Systems: MIT Lincoln Laboratory / Air Force Research Laboratory - MIT-LL/AFRL; National Laboratory of Pattern Recognition - NLPR; NTT Cyber Space Laboratories - NTT; TALP Research Center - TALP-ngram, TALP-phrase; Microsoft Research - MICROSOFT; Carnegie Mellon University; Oki Electric Industry Co., Ltd. - OKI; Sehda Inc. - SEHDA

13 Participants - Techniques 19 translation systems grouped by technique (SMT, SMT+Syntax, EBMT, MEMT): matrix assigning SEHDA, TOKYO, EDINBURGH, TALP-phrase, TALP-ngram, MICROSOFT, ATR-ALEPH, NGKT, OKI, ATR-SLR, ITC-IRST, MIT-LL/AFRL and NLPR to their techniques.

14 Translation systems Techniques: SMT 12, SMT+Syntax 3, EBMT 3, MEMT 1. Country of origin: Japan 7 (5 groups), USA 6, Spain 2 (1 group), Italy 1, China 1, UK 1, Germany 1.

15 System participation - manual transcription Tracks (Supplied, Supplied & Tools, Unrestricted, C-STAR) by direction (Chinese-English, Japanese-English, Arabic-English, Korean-English, English-Chinese): table of per-cell system counts.

16 System participation - ASR output Tracks (Supplied, Supplied & Tools, Unrestricted, C-STAR) by direction (Chinese-English, Japanese-English, English-Chinese): table of per-cell system counts.

17 Results - Manual Transcription (Matthias)

18 BLEU BLEU: Geometric mean of the n-gram precisions of the hypothesis compared to the reference translations, with a length penalty for short translations. Scores: 0-1. Benefits: Missing references can be covered by combining other references; correlates well with Fluency. Problems: Re-combination of references could cause errors; all words are equally important; weak correlation with Adequacy.
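The computation sketched on this slide can be illustrated in code. This is a toy version of the BLEU idea (clipped n-gram precisions, geometric mean, brevity penalty against the closest reference), not the official scorer, and the example sentences are invented:

```python
from collections import Counter
import math

def bleu(hypothesis, references, max_n=4):
    """Toy BLEU: geometric mean of 1..max_n n-gram precisions with a
    brevity penalty. Hypothesis and references are lists of words."""
    hyp_len = len(hypothesis)
    # closest reference length for the brevity penalty
    ref_len = min((abs(len(r) - hyp_len), len(r)) for r in references)[1]
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(tuple(hypothesis[i:i + n])
                             for i in range(hyp_len - n + 1))
        # clip each n-gram count at its maximum count over the references
        max_ref = Counter()
        for ref in references:
            ref_ngrams = Counter(tuple(ref[i:i + n])
                                 for i in range(len(ref) - n + 1))
            for g, c in ref_ngrams.items():
                max_ref[g] = max(max_ref[g], c)
        matches = sum(min(c, max_ref[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        log_prec_sum += math.log(max(matches, 1e-9) / total)
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / max(hyp_len, 1))
    return bp * math.exp(log_prec_sum / max_n)

hyp = "is there a discount for children".split()
refs = ["is there a discount for children".split(),
        "do you have a children s discount".split()]
print(round(bleu(hyp, refs), 3))  # 1.0: exact match with the first reference
```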

19 NIST NIST: Variant of BLEU using the arithmetic mean of weighted n-gram precision values. Scores: 0 and up (no fixed upper bound). Benefits: Considers information gain; uses up to 9-grams, usually 5-grams; good correlation with Adequacy. Problems: Re-combination of references could cause errors; weak correlation with Fluency (human judgment).
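A sketch of the information-gain idea: rarer n-grams (estimated here, as an assumption, from the reference side) earn more credit when matched. Real NIST also applies a length penalty, which this sketch omits:

```python
from collections import Counter
import math

def nist(hypothesis, references, max_n=5):
    """Toy NIST: arithmetic mean of information-weighted n-gram matches,
    with info(w1..wn) = log2(count(w1..w_{n-1}) / count(w1..wn))."""
    counts = Counter()
    total_words = 0
    for ref in references:
        total_words += len(ref)
        for n in range(1, max_n + 1):
            for i in range(len(ref) - n + 1):
                counts[tuple(ref[i:i + n])] += 1

    def info(gram):
        numer = counts[gram[:-1]] if len(gram) > 1 else total_words
        return math.log2(numer / counts[gram])

    score = 0.0
    for n in range(1, max_n + 1):
        hyp_grams = [tuple(hypothesis[i:i + n])
                     for i in range(len(hypothesis) - n + 1)]
        ref_grams = set()
        for ref in references:
            ref_grams.update(tuple(ref[i:i + n])
                             for i in range(len(ref) - n + 1))
        gained = sum(info(g) for g in hyp_grams if g in ref_grams)
        score += gained / max(len(hyp_grams), 1)
    return score

refs = [["the", "man", "saw", "the", "dog"],
        ["the", "dog", "bit", "the", "man"]]
print(nist(refs[0], refs) > nist(["the", "xyz"], refs))  # True
```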

20 mWER, mPER mWER: Word Error Rate on multiple references (edit distance between the hypothesis and the closest reference). Scores: 0-1. mPER: mWER without considering word order. Benefits: Correlates well with human judgment... Problems: ...only if enough references are available.
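A minimal sketch of both metrics: word-level Levenshtein distance against the closest reference for mWER; the position-independent variant here is a common bag-of-words simplification of PER:

```python
from collections import Counter

def edit_distance(a, b):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def mwer(hyp, references):
    """mWER: WER against the closest of multiple references."""
    return min(edit_distance(hyp, r) / len(r) for r in references)

def mper(hyp, references):
    """mPER: like mWER but ignoring word order (bag-of-words mismatch)."""
    def per(hyp, ref):
        h, r = Counter(hyp), Counter(ref)
        return max(sum((r - h).values()), sum((h - r).values())) / len(ref)
    return min(per(hyp, r) for r in references)

refs = [["where", "is", "the", "station"]]
print(mwer(["the", "station", "is", "where"], refs))  # 1.0: words misplaced
print(mper(["the", "station", "is", "where"], refs))  # 0.0: order ignored
```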

21 GTM, METEOR GTM: Similarity between texts using a unigram-based F-measure. METEOR: Considers exact matches, stem matches and synonym matches (using WordNet); case insensitive; cannot be used on Chinese output (yet). Scores: 0-1. Example for "houses": houses (exact match), house (stem match), home (synonym match).
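The unigram F-measure behind GTM's simplest setting can be sketched as follows (real GTM additionally rewards longer contiguous matching runs; the sentences are invented):

```python
from collections import Counter

def unigram_f(hyp, ref):
    """Harmonic mean of unigram precision and recall over word multisets."""
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# "house" vs. "houses" does not match without stemming, which is exactly
# the gap METEOR's stem and synonym matches are meant to close
print(unigram_f("the house is red".split(), "the houses are red".split()))  # 0.5
```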

22 Automatic Evaluation Evaluation specification for English outputs. Focus on speech-to-speech translation: punctuation marks and mixed casing are less relevant. Standard evaluation: case insensitive (all lowercase); removed punctuation marks . ? ! , : ; and removed - to split compounds. Optional evaluation: case sensitive (mixed case), separated punctuation marks; only done if the submitted data contained mixed-case characters. No numbers reported here; please refer to the overview paper.
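The standard-evaluation preprocessing described above can be sketched as a small normalizer; the exact punctuation list is as read from the slide, so treat it as an assumption:

```python
import re

def normalize_english(text, case_sensitive=False):
    """Sketch of the described preprocessing: lowercase (standard eval),
    split hyphenated compounds, strip the listed punctuation marks."""
    if not case_sensitive:
        text = text.lower()
    text = text.replace("-", " ")          # "-" removed to split compounds
    text = re.sub(r"[.?!,:;]", "", text)   # punctuation marks removed
    return " ".join(text.split())

print(normalize_english("Non-native English, right?"))  # non native english right
```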

23 Automatic Evaluation Evaluation specification for Chinese outputs. Evaluation 1: using the given (ASR) segmentation; removed punctuation marks. Evaluation 2: character segmented, which eliminates the segmentation influence; removed punctuation marks.
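Evaluation 2 re-segments the output into single characters so that scoring no longer depends on the word segmenter; a one-liner illustrates the idea:

```python
def char_segment(text):
    """Character-based segmentation for Chinese scoring: drop the given
    word segmentation (spaces) and emit one token per character."""
    return list(text.replace(" ", ""))

print(char_segment("你 好吗"))  # ['你', '好', '吗']
```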

24 Online Evaluation Server

25 Online Evaluation Server Submission form fields: Language Pair, Data Track, File, Further Comments.

26 Evaluation Server Output

27 Evaluation Server Output Checks: mixed case (automatically detected), number of lines.

28 Evaluation Server Output (2)

29 Subjective Evaluation Fluency/Adequacy Typically used metrics: Fluency/Adequacy (e.g. IWSLT 2004); here 0-4 instead of 1-5. Fluency: 4 Flawless English, 3 Good English, 2 Non-native English, 1 Disfluent English, 0 Incomprehensible. Adequacy: 4 All information, 3 Most information, 2 Much information, 1 Little information, 0 None.

30 Subjective Evaluation Meaning Maintenance Meaning Maintenance: 4 Exactly the same meaning, 3 Almost the same meaning, 2 Partially the same meaning and no new information, 1 Partially the same meaning but misleading information is introduced, 0 Totally different meaning. Adequacy (for comparison): 4 All information, 3 Most information, 2 Much information, 1 Little information, 0 None.

31 Why Meaning Maintenance? Focus on comparing the meaning of the translation with the source: what degree of misleading information is introduced? 2 types of errors: (1) Obvious error, no meaning change: the translation is still useful, and the Adequacy and Meaning Maintenance scores are similar. (2) The error changes the meaning (e.g. negation): the translation is not useful, but an Adequacy grader might ignore the change and judge only the correct parts; this is prevented by the focus on meaning.

32 Subjective Evaluation procedure All translations shown at the same time, randomly ordered; comparison among translations of the same sentence. No explicit reference shown: the reference is included among the translations, so there is no bias from a shown reference, and it yields an oracle score. The source is shown for the Adequacy and Meaning Maintenance scores. 5 bilingual graders (scores shown are for 3 graders). First all Fluency scores, then Adequacy, finally Meaning Maintenance.

33 Subjective Evaluation Tool - Part 1: Fluency

34 Subjective Evaluation Tool - Part 2: Adequacy

35 Subjective Evaluation Tool - Part 3: Meaning Maintenance

36 Evaluation Results Human evaluation was only done for the most popular track: Chinese-English translation of manual transcription, Supplied Data Track. 11 submissions for this track were evaluated; +10% of the translations were graded a second time by the same grader to measure inconsistencies.

37 Human Evaluation Results Top systems per score. Adequacy: MIT-LL/AFRL 2.71, ITC-IRST, TALP-phrase, TALP-ngram 2.44, EDINBURGH 2.33. Fluency: ITC-IRST 3.15, TALP-phrase, TALP-ngram, EDINBURGH 2.81, MIT-LL/AFRL 2.79. Meaning Maint.: MIT-LL/AFRL 2.63, ITC-IRST 2.60, TALP-ngram 2.40, EDINBURGH, TALP-phrase.

38 Human Evaluation Results - Adequacy Upper bound: reference performance (bar chart of per-system Adequacy scores)

39 Adequacy - Significance? (same chart with significance intervals)

40 ...and Fluency (bar chart of per-system Adequacy and Fluency scores)

41 ...and Meaning Maintenance (bar chart of per-system Adequacy, Fluency and Meaning Maint. scores)

42 Analysis: Adequacy - Fluency (scatter plot of Fluency vs. Adequacy)

43 Analysis: Adequacy - Fluency (scatter plot with regions: Fluency >> Adequacy, Fluency ~ Adequacy, Fluency << Adequacy)

44 Analysis: Adequacy - Fluency Fluency 3 (Good English) is rare; scores cluster at 4 (Flawless) and 2 (Non-native).

45 Consistency? (Inter-Grader) How consistent are the scores assigned by the 3 graders? Average differences between grades for each grader pair (G1-G2, G1-G3, G2-G3) and on average, for Adequacy, Fluency and Meaning Maint. Agreement between all 3 graders for about 40% of sentences; agreement between 2 graders for about 60% of sentences.
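The consistency numbers on this slide are simple statistics over per-sentence grade triples; a sketch with invented grades:

```python
from itertools import combinations

def grader_stats(grades):
    """grades: one (g1, g2, g3) tuple of scores per sentence.
    Returns (average pairwise difference,
             fraction of sentences where all 3 graders agree,
             fraction of sentences where at least 2 graders agree)."""
    diffs = [abs(a - b) for row in grades for a, b in combinations(row, 2)]
    avg_diff = sum(diffs) / len(diffs)
    all3 = sum(g1 == g2 == g3 for g1, g2, g3 in grades) / len(grades)
    any2 = sum(g1 == g2 or g2 == g3 or g1 == g3
               for g1, g2, g3 in grades) / len(grades)
    return avg_diff, all3, any2

# invented example grades for 4 sentences
print(grader_stats([(3, 3, 3), (2, 3, 2), (0, 1, 2), (4, 4, 2)]))
```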

46 Consistency? (Intra-Grader) How consistent are the scores assigned by each grader (based on the 10% of sentences graded twice)? Average differences between the first and second grade for Grader 1, Grader 2, Grader 3 and on average, for Adequacy, Fluency and Meaning Maint.

47 Do we need Meaning Maintenance? The difference to Adequacy is less than 2 in 91% of the cases; high correlation with Adequacy (Pearson: 0.82). But low correlation for low scores (Meaning Maint. 0, 1): avg. difference is 0.75, Pearson 0.20. For high scores (3, 4): avg. difference is 0.25, Pearson 0.65. Graders tend to use similar scores on good translations, with differences on bad translations; lower grader inconsistency for Meaning Maintenance. No additional score necessary; instead, make graders aware of meaning in Adequacy scoring.

48 BLEU: Chinese English MT, Supplied Data (bar chart of per-system BLEU scores)

49 BLEU: Chinese English MT, Supplied Data (same chart)

50 NIST: Chinese English MT, Supplied Data (bar chart of per-system NIST scores)

51 NIST: Chinese English MT, Supplied Data (same chart)

52 mPER (bar chart of per-system mPER scores)

53 mWER, mPER (bar chart of per-system mWER and mPER scores)

54 Additional metric for Chinese English, MT, Supplied: TER Translation Error Rate, a newly introduced metric: measure error as the minimum number of edits needed to change the hypothesis so that it exactly matches one of the references. TER = <# of edits> / <avg # of reference words>. TER is calculated against the best (closest) reference. Edits include insertions, deletions, substitutions and shifts; all edits count as 1 error (edit distance). A shift moves a sequence of words within the hypothesis; shifting any sequence of words (any distance) counts as only 1 error. Scores: 0-1.
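The definition above can be sketched in code. This is only an approximation: it searches shifts greedily, accepting a block move whenever it pays for its own cost of 1, whereas real TER uses a more constrained, optimized shift search:

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def ter(hyp, references):
    """TER sketch: greedy block shifts (cost 1 each) followed by edit
    distance, scored against the closest reference and divided by the
    average reference length."""
    avg_ref_len = sum(len(r) for r in references) / len(references)
    best = float("inf")
    for ref in references:
        h, shifts = list(hyp), 0
        improved = True
        while improved:
            improved = False
            base = edit_distance(h, ref)
            for i in range(len(h)):                  # block start
                for l in range(1, len(h) - i + 1):   # block length
                    block = h[i:i + l]
                    rest = h[:i] + h[i + l:]
                    for k in range(len(rest) + 1):   # new position
                        cand = rest[:k] + block + rest[k:]
                        # accept a shift only if it pays for its cost of 1
                        if edit_distance(cand, ref) < base - 1:
                            h, shifts, improved = cand, shifts + 1, True
                            break
                    if improved:
                        break
                if improved:
                    break
        best = min(best, shifts + edit_distance(h, ref))
    return best / avg_ref_len

hyp = "tonight the fun begins".split()
ref = "the fun begins tonight".split()
print(ter(hyp, [ref]))  # 0.25: one shift, divided by 4 reference words
```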

55 mWER, mPER (bar chart of per-system mWER and mPER scores)

56 mWER, mPER and TER (bar chart of per-system mWER, mPER and TER scores)

57 Chinese English Supplied Data - Rankings Table comparing the system rankings under BLEU, NIST, mWER, mPER, GTM, METEOR and TER with the Adequacy, Fluency and Meaning Maint. rankings; ITC-IRST and MIT-LL/AFRL trade the top positions depending on the metric.

58 Chinese English Supplied Data - Rankings (same table)

59 Chinese English Supplied Data - Rankings (same table)

60 Chinese English Supplied Data - Rankings (same table)

61 Chinese English Supplied Data - Rankings (same table)

62 Chinese English Supplied Data - Rankings (same table)

63 Chinese English Supplied Data - Rankings (same table)

64 Chinese English Supplied Data - Rankings (same table)

65 Chinese English Supplied Data - Rankings (same table)

66 Chinese English Supplied Data - Rankings (same table)

67 Chinese English Supplied Data - Rankings (same table)

68 Chinese English Supplied Data - Rankings (same table)

69 Correlation Human - Automatic Scores Pearson correlation between the human scores (Adequacy, Fluency, Meaning Maint.) and the automatic scores (BLEU, NIST, mWER, mPER, GTM, METEOR, TER). Other data conditions
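The Pearson values used throughout this section (e.g. the 0.82 for Adequacy on an earlier slide) are the standard sample correlation; for reference, with invented per-system scores:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between paired score lists,
    e.g. per-system BLEU vs. per-system Adequacy."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# invented scores: a metric that tracks the human ranking correlates highly
print(round(pearson([0.52, 0.44, 0.40, 0.31], [2.7, 2.6, 2.4, 2.3]), 2))  # 0.96
```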

70 Chinese English BLEU Scores - Supplied (bar chart of per-system BLEU scores)

71 Chinese English BLEU Scores - Supplied, Supplied + Tools

72 Chinese English BLEU Scores - Supplied, Supplied + Tools, Unrestricted

73 Chinese English BLEU Scores - Supplied, Supplied + Tools, Unrestricted, C-STAR

74 Chinese English NIST Scores - Supplied (bar chart of per-system NIST scores)

75 Chinese English NIST Scores - Supplied, Supplied + Tools

76 Chinese English NIST Scores - Supplied, Supplied + Tools, Unrestricted

77 Chinese English NIST Scores - Supplied, Supplied + Tools, Unrestricted, C-STAR

78 Japanese English BLEU Scores - Supplied (bar chart of per-system BLEU scores)

79 Japanese English BLEU Scores - Supplied, Supplied + Tools

80 Japanese English BLEU Scores - Supplied, Supplied + Tools, Unrestricted

81 Japanese English BLEU Scores - Supplied, Supplied + Tools, Unrestricted, C-STAR

82 Japanese English NIST Scores - Supplied (bar chart of per-system NIST scores)

83 Japanese English NIST Scores - Supplied, Supplied + Tools

84 Japanese English NIST Scores - Supplied, Supplied + Tools, Unrestricted

85 Japanese English NIST Scores - Supplied, Supplied + Tools, Unrestricted, C-STAR

86 Japanese English Supplied Data - Rankings Table comparing the system rankings under BLEU, NIST, mWER, mPER, GTM and METEOR; ITC-IRST and EDINBURGH appear at the top under different metrics. Different metrics rank differently!

87 Japanese English Supplied Data - Rankings (same table)

88 Japanese English Supplied Data - Rankings (same table)

89 Japanese English Supplied Data - Rankings (same table)

90 Japanese English Supplied Data - Rankings (same table)

91 Japanese English Supplied Data - Rankings (same table)

92 Japanese English Supplied Data - Rankings (same table)

93 Japanese English Supplied Data - Rankings (same table)

94 Arabic English BLEU Scores - Supplied (bar chart of per-system BLEU scores)

95 Arabic English BLEU Scores - Supplied, Supplied + Tools

96 Arabic English BLEU Scores - Supplied, Supplied + Tools, Unrestricted

97 Arabic English BLEU Scores - Supplied, Supplied + Tools, Unrestricted, C-STAR

98 Arabic English NIST Scores - Supplied (bar chart of per-system NIST scores)

99 Arabic English NIST Scores - Supplied, Supplied + Tools

100 Arabic English NIST Scores - Supplied, Supplied + Tools, Unrestricted

101 Arabic English NIST Scores - Supplied, Supplied + Tools, Unrestricted, C-STAR

102 Korean English BLEU Scores - Supplied (bar chart of per-system BLEU scores)

103 Korean English BLEU Scores - Supplied, Supplied + Tools

104 Korean English BLEU Scores - Supplied, Supplied + Tools, Unrestricted

105 Korean English BLEU Scores - Supplied, Supplied + Tools, Unrestricted, C-STAR

106 Korean English NIST Scores - Supplied (bar chart of per-system NIST scores)

107 Korean English NIST Scores - Supplied, Supplied + Tools

108 Korean English NIST Scores - Supplied, Supplied + Tools, Unrestricted

109 Korean English NIST Scores - Supplied, Supplied + Tools, Unrestricted, C-STAR

110 English Chinese BLEU Score - Supplied: EDINBURGH, MICROSOFT, ATR-ALEPH (bar chart)

111 English Chinese BLEU Score - Supplied, Supplied + Tools: EDINBURGH, MICROSOFT, ATR-ALEPH

112 English Chinese BLEU Score - Supplied, Supplied + Tools, Unrestricted: EDINBURGH, MICROSOFT, ATR-ALEPH

113 English Chinese BLEU Score - Supplied, Supplied + Tools, Unrestricted, C-STAR: EDINBURGH, MICROSOFT, ATR-ALEPH

114 English Chinese NIST Score - Supplied: EDINBURGH, MICROSOFT, ATR-ALEPH (bar chart)

115 English Chinese NIST Score - Supplied, Supplied + Tools: EDINBURGH, MICROSOFT, ATR-ALEPH

116 English Chinese NIST Score - Supplied, Supplied + Tools, Unrestricted: EDINBURGH, MICROSOFT, ATR-ALEPH

117 English Chinese NIST Score - Supplied, Supplied + Tools, Unrestricted, C-STAR: EDINBURGH, MICROSOFT, ATR-ALEPH

118 ASR Results (Chiori)

119 Directions and source input Translation directions: Chinese-English, Japanese-English, Arabic-English, Korean-English, English-Chinese; source input: manual transcription and ASR output. ASR output provided for Chinese-English (Dr. Chen, NLPR), Japanese-English (Dr. Yamamoto, ATR) and English-Chinese (Mr. Paulik, UKA).

120 Japanese ASR performance 506 utterances were recognized. Word error rate (%) for 1-best, 20-best and lattice on DEVSET1, DEVSET2 and TESTSET (bar chart)

121 Japanese ASR performance (same chart)

122 Japanese ASR performance (same chart)

123 Chinese ASR performance 506 utterances were recognized. Word error rate (%) for 1-best, 20-best and lattice on DEVSET1, DEVSET2 and TESTSET (bar chart)

124 Chinese ASR performance (same chart)

125 Chinese ASR performance (same chart)

126 How much is MT performance degraded by ASR recognition errors? Chinese English track (histogram of #sentences by word error rate: 0-20%, 20-40%, 40-60%, 60-80%, ...)

127 BLEU (scatter plot of BLEU score vs. word error rate)

128 BLEU (same plot)

129 NIST (scatter plot of NIST score vs. word error rate)

130 NIST (same plot)

131 Multiple ASR hypotheses translation (plots of #sentences and of BLEU/NIST scores vs. word error rate for CASIA)

132 Acknowledgment All participants; NLPR and ATR for ASR; BBN for TER; UKA. Thanks a lot!


More information

The Geometry of Statistical Machine Translation

The Geometry of Statistical Machine Translation The Geometry of Statistical Machine Translation Presented by Rory Waite 16th of December 2015 ntroduction Linear Models Convex Geometry The Minkowski Sum Projected MERT Conclusions ntroduction We provide

More information

Chapter 3: Basics of Language Modeling

Chapter 3: Basics of Language Modeling Chapter 3: Basics of Language Modeling Section 3.1. Language Modeling in Automatic Speech Recognition (ASR) All graphs in this section are from the book by Schukat-Talamazzini unless indicated otherwise

More information

Deep Learning Sequence to Sequence models: Attention Models. 17 March 2018

Deep Learning Sequence to Sequence models: Attention Models. 17 March 2018 Deep Learning Sequence to Sequence models: Attention Models 17 March 2018 1 Sequence-to-sequence modelling Problem: E.g. A sequence X 1 X N goes in A different sequence Y 1 Y M comes out Speech recognition:

More information

MIDDLE GRADES MATHEMATICS

MIDDLE GRADES MATHEMATICS MIDDLE GRADES MATHEMATICS Content Domain Range of Competencies l. Number Sense and Operations 0001 0002 17% ll. Algebra and Functions 0003 0006 33% lll. Measurement and Geometry 0007 0009 25% lv. Statistics,

More information

CS 224N HW:#3. (V N0 )δ N r p r + N 0. N r (r δ) + (V N 0)δ. N r r δ. + (V N 0)δ N = 1. 1 we must have the restriction: δ NN 0.

CS 224N HW:#3. (V N0 )δ N r p r + N 0. N r (r δ) + (V N 0)δ. N r r δ. + (V N 0)δ N = 1. 1 we must have the restriction: δ NN 0. CS 224 HW:#3 ARIA HAGHIGHI SUID :# 05041774 1. Smoothing Probability Models (a). Let r be the number of words with r counts and p r be the probability for a word with r counts in the Absolute discounting

More information

External Backward Linkage and External Forward Linkage. in Asian International Input-Output Table

External Backward Linkage and External Forward Linkage. in Asian International Input-Output Table Prepared for the 20 th INFORUM World Conference in Firenze, Italy External Backward Linkage and External Forward Linkage in Asian International Input-Output Table Toshiaki Hasegawa Faculty of Economics

More information

Lecture 10. Discriminative Training, ROVER, and Consensus. Michael Picheny, Bhuvana Ramabhadran, Stanley F. Chen

Lecture 10. Discriminative Training, ROVER, and Consensus. Michael Picheny, Bhuvana Ramabhadran, Stanley F. Chen Lecture 10 Discriminative Training, ROVER, and Consensus Michael Picheny, Bhuvana Ramabhadran, Stanley F. Chen IBM T.J. Watson Research Center Yorktown Heights, New York, USA {picheny,bhuvana,stanchen}@us.ibm.com

More information

Language Models. Philipp Koehn. 11 September 2018

Language Models. Philipp Koehn. 11 September 2018 Language Models Philipp Koehn 11 September 2018 Language models 1 Language models answer the question: How likely is a string of English words good English? Help with reordering p LM (the house is small)

More information

Chapter 3: Basics of Language Modelling

Chapter 3: Basics of Language Modelling Chapter 3: Basics of Language Modelling Motivation Language Models are used in Speech Recognition Machine Translation Natural Language Generation Query completion For research and development: need a simple

More information

Recap: Language models. Foundations of Natural Language Processing Lecture 4 Language Models: Evaluation and Smoothing. Two types of evaluation in NLP

Recap: Language models. Foundations of Natural Language Processing Lecture 4 Language Models: Evaluation and Smoothing. Two types of evaluation in NLP Recap: Language models Foundations of atural Language Processing Lecture 4 Language Models: Evaluation and Smoothing Alex Lascarides (Slides based on those from Alex Lascarides, Sharon Goldwater and Philipp

More information

{ Jurafsky & Martin Ch. 6:! 6.6 incl.

{ Jurafsky & Martin Ch. 6:! 6.6 incl. N-grams Now Simple (Unsmoothed) N-grams Smoothing { Add-one Smoothing { Backo { Deleted Interpolation Reading: { Jurafsky & Martin Ch. 6:! 6.6 incl. 1 Word-prediction Applications Augmentative Communication

More information

N-gram Language Modeling Tutorial

N-gram Language Modeling Tutorial N-gram Language Modeling Tutorial Dustin Hillard and Sarah Petersen Lecture notes courtesy of Prof. Mari Ostendorf Outline: Statistical Language Model (LM) Basics n-gram models Class LMs Cache LMs Mixtures

More information

Speech and Language Processing. Chapter 9 of SLP Automatic Speech Recognition (II)

Speech and Language Processing. Chapter 9 of SLP Automatic Speech Recognition (II) Speech and Language Processing Chapter 9 of SLP Automatic Speech Recognition (II) Outline for ASR ASR Architecture The Noisy Channel Model Five easy pieces of an ASR system 1) Language Model 2) Lexicon/Pronunciation

More information

Natural Language Processing SoSe Language Modelling. (based on the slides of Dr. Saeedeh Momtazi)

Natural Language Processing SoSe Language Modelling. (based on the slides of Dr. Saeedeh Momtazi) Natural Language Processing SoSe 2015 Language Modelling Dr. Mariana Neves April 20th, 2015 (based on the slides of Dr. Saeedeh Momtazi) Outline 2 Motivation Estimation Evaluation Smoothing Outline 3 Motivation

More information

Mark Scheme (Results) Summer Pearson Edexcel GCE in Statistics 3R (6691/01R)

Mark Scheme (Results) Summer Pearson Edexcel GCE in Statistics 3R (6691/01R) Mark Scheme (Results) Summer 2014 Pearson Edexcel GCE in Statistics 3R (6691/01R) Edexcel and BTEC Qualifications Edexcel and BTEC qualifications come from Pearson, the world s leading learning company.

More information

Statistical Phrase-Based Speech Translation

Statistical Phrase-Based Speech Translation Statistical Phrase-Based Speech Translation Lambert Mathias 1 William Byrne 2 1 Center for Language and Speech Processing Department of Electrical and Computer Engineering Johns Hopkins University 2 Machine

More information

Automated Summarisation for Evidence Based Medicine

Automated Summarisation for Evidence Based Medicine Automated Summarisation for Evidence Based Medicine Diego Mollá Centre for Language Technology, Macquarie University HAIL, 22 March 2012 Contents Evidence Based Medicine Our Corpus for Summarisation Structure

More information

Mark Scheme (Results) Summer International GCSE Mathematics (4MA0) Paper 4HR

Mark Scheme (Results) Summer International GCSE Mathematics (4MA0) Paper 4HR Mark Scheme (Results) Summer 0 International GCSE Mathematics (4MA0) Paper 4HR Edexcel and BTEC Qualifications Edexcel and BTEC qualifications come from Pearson, the world s leading learning company. We

More information

Mass Asset Additions. Overview. Effective mm/dd/yy Page 1 of 47 Rev 1. Copyright Oracle, All rights reserved.

Mass Asset Additions.  Overview. Effective mm/dd/yy Page 1 of 47 Rev 1. Copyright Oracle, All rights reserved. Overview Effective mm/dd/yy Page 1 of 47 Rev 1 System References None Distribution Oracle Assets Job Title * Ownership The Job Title [list@yourcompany.com?subject=eduxxxxx] is responsible for ensuring

More information

Homework 4, Part B: Structured perceptron

Homework 4, Part B: Structured perceptron Homework 4, Part B: Structured perceptron CS 585, UMass Amherst, Fall 2016 Overview Due Friday, Oct 28. Get starter code/data from the course website s schedule page. You should submit a zipped directory

More information

Improved Decipherment of Homophonic Ciphers

Improved Decipherment of Homophonic Ciphers Improved Decipherment of Homophonic Ciphers Malte Nuhn and Julian Schamper and Hermann Ney Human Language Technology and Pattern Recognition Computer Science Department, RWTH Aachen University, Aachen,

More information

Statistical Machine Translation and Automatic Speech Recognition under Uncertainty

Statistical Machine Translation and Automatic Speech Recognition under Uncertainty Statistical Machine Translation and Automatic Speech Recognition under Uncertainty Lambert Mathias A dissertation submitted to the Johns Hopkins University in conformity with the requirements for the degree

More information

Elastic and Inelastic Collisions

Elastic and Inelastic Collisions Elastic and Inelastic Collisions - TA Version Physics Topics If necessary, review the following topics and relevant textbook sections from Serway / Jewett Physics for Scientists and Engineers, 9th Ed.

More information

PMT. Mark Scheme (Results) Summer Pearson Edexcel GCSE In Mathematics A (1MA0) Higher (Calculator) Paper 2H

PMT. Mark Scheme (Results) Summer Pearson Edexcel GCSE In Mathematics A (1MA0) Higher (Calculator) Paper 2H Mark Scheme (Results) Summer 2014 Pearson Edexcel GCSE In Mathematics A (1MA0) Higher (Calculator) Paper 2H Edexcel and BTEC Qualifications Edexcel and BTEC qualifications are awarded by Pearson, the UK

More information

Part I: Web Structure Mining Chapter 1: Information Retrieval and Web Search

Part I: Web Structure Mining Chapter 1: Information Retrieval and Web Search Part I: Web Structure Mining Chapter : Information Retrieval an Web Search The Web Challenges Crawling the Web Inexing an Keywor Search Evaluating Search Quality Similarity Search The Web Challenges Tim

More information

1. Markov models. 1.1 Markov-chain

1. Markov models. 1.1 Markov-chain 1. Markov models 1.1 Markov-chain Let X be a random variable X = (X 1,..., X t ) taking values in some set S = {s 1,..., s N }. The sequence is Markov chain if it has the following properties: 1. Limited

More information

Math.3336: Discrete Mathematics. Applications of Propositional Logic

Math.3336: Discrete Mathematics. Applications of Propositional Logic Math.3336: Discrete Mathematics Applications of Propositional Logic Instructor: Dr. Blerina Xhabli Department of Mathematics, University of Houston https://www.math.uh.edu/ blerina Email: blerina@math.uh.edu

More information

Language Processing with Perl and Prolog

Language Processing with Perl and Prolog Language Processing with Perl and Prolog Chapter 5: Counting Words Pierre Nugues Lund University Pierre.Nugues@cs.lth.se http://cs.lth.se/pierre_nugues/ Pierre Nugues Language Processing with Perl and

More information

Machine Learning for natural language processing

Machine Learning for natural language processing Machine Learning for natural language processing N-grams and language models Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Summer 2016 1 / 25 Introduction Goals: Estimate the probability that a

More information

Statistical Machine Translation

Statistical Machine Translation Statistical Machine Translation Marcello Federico FBK-irst Trento, Italy Galileo Galilei PhD School University of Pisa Pisa, 7-19 May 2008 Part V: Language Modeling 1 Comparing ASR and statistical MT N-gram

More information

Unsupervised Vocabulary Induction

Unsupervised Vocabulary Induction Infant Language Acquisition Unsupervised Vocabulary Induction MIT (Saffran et al., 1997) 8 month-old babies exposed to stream of syllables Stream composed of synthetic words (pabikumalikiwabufa) After

More information

Triplet Lexicon Models for Statistical Machine Translation

Triplet Lexicon Models for Statistical Machine Translation Triplet Lexicon Models for Statistical Machine Translation Saša Hasan, Juri Ganitkevitch, Hermann Ney and Jesús Andrés Ferrer lastname@cs.rwth-aachen.de CLSP Student Seminar February 6, 2009 Human Language

More information

Statistical Substring Reduction in Linear Time

Statistical Substring Reduction in Linear Time Statistical Substring Reduction in Linear Time Xueqiang Lü Institute of Computational Linguistics Peking University, Beijing lxq@pku.edu.cn Le Zhang Institute of Computer Software & Theory Northeastern

More information

Discriminative Learning in Speech Recognition

Discriminative Learning in Speech Recognition Discriminative Learning in Speech Recognition Yueng-Tien, Lo g96470198@csie.ntnu.edu.tw Speech Lab, CSIE Reference Xiaodong He and Li Deng. "Discriminative Learning in Speech Recognition, Technical Report

More information

Penn Treebank Parsing. Advanced Topics in Language Processing Stephen Clark

Penn Treebank Parsing. Advanced Topics in Language Processing Stephen Clark Penn Treebank Parsing Advanced Topics in Language Processing Stephen Clark 1 The Penn Treebank 40,000 sentences of WSJ newspaper text annotated with phrasestructure trees The trees contain some predicate-argument

More information

A L A BA M A L A W R E V IE W

A L A BA M A L A W R E V IE W A L A BA M A L A W R E V IE W Volume 52 Fall 2000 Number 1 B E F O R E D I S A B I L I T Y C I V I L R I G HT S : C I V I L W A R P E N S I O N S A N D TH E P O L I T I C S O F D I S A B I L I T Y I N

More information

Kneser-Ney smoothing explained

Kneser-Ney smoothing explained foldl home blog contact feed Kneser-Ney smoothing explained 18 January 2014 Language models are an essential element of natural language processing, central to tasks ranging from spellchecking to machine

More information

Language Modeling. Michael Collins, Columbia University

Language Modeling. Michael Collins, Columbia University Language Modeling Michael Collins, Columbia University Overview The language modeling problem Trigram models Evaluating language models: perplexity Estimation techniques: Linear interpolation Discounting

More information

Out of GIZA Efficient Word Alignment Models for SMT

Out of GIZA Efficient Word Alignment Models for SMT Out of GIZA Efficient Word Alignment Models for SMT Yanjun Ma National Centre for Language Technology School of Computing Dublin City University NCLT Seminar Series March 4, 2009 Y. Ma (DCU) Out of Giza

More information

A Convolutional Neural Network-based

A Convolutional Neural Network-based A Convolutional Neural Network-based Model for Knowledge Base Completion Dat Quoc Nguyen Joint work with: Dai Quoc Nguyen, Tu Dinh Nguyen and Dinh Phung April 16, 2018 Introduction Word vectors learned

More information

An Algorithm for Fast Calculation of Back-off N-gram Probabilities with Unigram Rescaling

An Algorithm for Fast Calculation of Back-off N-gram Probabilities with Unigram Rescaling An Algorithm for Fast Calculation of Back-off N-gram Probabilities with Unigram Rescaling Masaharu Kato, Tetsuo Kosaka, Akinori Ito and Shozo Makino Abstract Topic-based stochastic models such as the probabilistic

More information

Accelerated Natural Language Processing Lecture 3 Morphology and Finite State Machines; Edit Distance

Accelerated Natural Language Processing Lecture 3 Morphology and Finite State Machines; Edit Distance Accelerated Natural Language Processing Lecture 3 Morphology and Finite State Machines; Edit Distance Sharon Goldwater (based on slides by Philipp Koehn) 20 September 2018 Sharon Goldwater ANLP Lecture

More information

Spatial Role Labeling CS365 Course Project

Spatial Role Labeling CS365 Course Project Spatial Role Labeling CS365 Course Project Amit Kumar, akkumar@iitk.ac.in Chandra Sekhar, gchandra@iitk.ac.in Supervisor : Dr.Amitabha Mukerjee ABSTRACT In natural language processing one of the important

More information

ACS Introduction to NLP Lecture 3: Language Modelling and Smoothing

ACS Introduction to NLP Lecture 3: Language Modelling and Smoothing ACS Introduction to NLP Lecture 3: Language Modelling and Smoothing Stephen Clark Natural Language and Information Processing (NLIP) Group sc609@cam.ac.uk Language Modelling 2 A language model is a probability

More information

Elastic and Inelastic Collisions

Elastic and Inelastic Collisions Physics Topics Elastic and Inelastic Collisions If necessary, review the following topics and relevant textbook sections from Serway / Jewett Physics for Scientists and Engineers, 9th Ed. Kinetic Energy

More information

Decoding Revisited: Easy-Part-First & MERT. February 26, 2015

Decoding Revisited: Easy-Part-First & MERT. February 26, 2015 Decoding Revisited: Easy-Part-First & MERT February 26, 2015 Translating the Easy Part First? the tourism initiative addresses this for the first time the die tm:-0.19,lm:-0.4, d:0, all:-0.65 tourism touristische

More information

Statistical Machine Translation

Statistical Machine Translation Statistical Machine Translation -tree-based models (cont.)- Artem Sokolov Computerlinguistik Universität Heidelberg Sommersemester 2015 material from P. Koehn, S. Riezler, D. Altshuler Bottom-Up Decoding

More information

Conditional Language Modeling. Chris Dyer

Conditional Language Modeling. Chris Dyer Conditional Language Modeling Chris Dyer Unconditional LMs A language model assigns probabilities to sequences of words,. w =(w 1,w 2,...,w`) It is convenient to decompose this probability using the chain

More information

Statistical methods for NLP Estimation

Statistical methods for NLP Estimation Statistical methods for NLP Estimation UNIVERSITY OF Richard Johansson January 29, 2015 why does the teacher care so much about the coin-tossing experiment? because it can model many situations: I pick

More information

CS446: Machine Learning Spring Problem Set 4

CS446: Machine Learning Spring Problem Set 4 CS446: Machine Learning Spring 2017 Problem Set 4 Handed Out: February 27 th, 2017 Due: March 11 th, 2017 Feel free to talk to other members of the class in doing the homework. I am more concerned that

More information

Decoding in Statistical Machine Translation. Mid-course Evaluation. Decoding. Christian Hardmeier

Decoding in Statistical Machine Translation. Mid-course Evaluation. Decoding. Christian Hardmeier Decoding in Statistical Machine Translation Christian Hardmeier 2016-05-04 Mid-course Evaluation http://stp.lingfil.uu.se/~sara/kurser/mt16/ mid-course-eval.html Decoding The decoder is the part of the

More information

Experiments with a Gaussian Merging-Splitting Algorithm for HMM Training for Speech Recognition

Experiments with a Gaussian Merging-Splitting Algorithm for HMM Training for Speech Recognition Experiments with a Gaussian Merging-Splitting Algorithm for HMM Training for Speech Recognition ABSTRACT It is well known that the expectation-maximization (EM) algorithm, commonly used to estimate hidden

More information

Semantics with Dense Vectors. Reference: D. Jurafsky and J. Martin, Speech and Language Processing

Semantics with Dense Vectors. Reference: D. Jurafsky and J. Martin, Speech and Language Processing Semantics with Dense Vectors Reference: D. Jurafsky and J. Martin, Speech and Language Processing 1 Semantics with Dense Vectors We saw how to represent a word as a sparse vector with dimensions corresponding

More information

Acoustic Modeling for Speech Recognition

Acoustic Modeling for Speech Recognition Acoustic Modeling for Speech Recognition Berlin Chen 2004 References:. X. Huang et. al. Spoken Language Processing. Chapter 8 2. S. Young. The HTK Book (HTK Version 3.2) Introduction For the given acoustic

More information

The statement calculus and logic

The statement calculus and logic Chapter 2 Contrariwise, continued Tweedledee, if it was so, it might be; and if it were so, it would be; but as it isn t, it ain t. That s logic. Lewis Carroll You will have encountered several languages

More information

FROM QUERIES TO TOP-K RESULTS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

FROM QUERIES TO TOP-K RESULTS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS FROM QUERIES TO TOP-K RESULTS Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Retrieval models Retrieval evaluation Link

More information

] Automatic Speech Recognition (CS753)

] Automatic Speech Recognition (CS753) ] Automatic Speech Recognition (CS753) Lecture 17: Discriminative Training for HMMs Instructor: Preethi Jyothi Sep 28, 2017 Discriminative Training Recall: MLE for HMMs Maximum likelihood estimation (MLE)

More information

Mark Scheme (Results) November Pearson Edexcel GCSE in Mathematics Linear (1MA0) Higher (Non-Calculator) Paper 1H

Mark Scheme (Results) November Pearson Edexcel GCSE in Mathematics Linear (1MA0) Higher (Non-Calculator) Paper 1H Mark Scheme (Results) November 2013 Pearson Edexcel GCSE in Mathematics Linear (1MA0) Higher (Non-Calculator) Paper 1H Edexcel and BTEC Qualifications Edexcel and BTEC qualifications are awarded by Pearson,

More information

Collaborative NLP-aided ontology modelling

Collaborative NLP-aided ontology modelling Collaborative NLP-aided ontology modelling Chiara Ghidini ghidini@fbk.eu Marco Rospocher rospocher@fbk.eu International Winter School on Language and Data/Knowledge Technologies TrentoRISE Trento, 24 th

More information

Language Modeling. Introduction to N-grams. Many Slides are adapted from slides by Dan Jurafsky

Language Modeling. Introduction to N-grams. Many Slides are adapted from slides by Dan Jurafsky Language Modeling Introduction to N-grams Many Slides are adapted from slides by Dan Jurafsky Probabilistic Language Models Today s goal: assign a probability to a sentence Why? Machine Translation: P(high

More information

DARPA ATIS Test Results June 1990

DARPA ATIS Test Results June 1990 DARPA ATIS Test Results June 1990 D. S. Pallett, W. M. Fisher, J. G. Fiscus, and J. S. Garofolo Room A 216 Technology Building National Institute of Standards and Technology (NIST) Gaithersburg, MD 20899

More information

The distribution of characters, bi- and trigrams in the Uppsala 70 million words Swedish newspaper corpus

The distribution of characters, bi- and trigrams in the Uppsala 70 million words Swedish newspaper corpus Uppsala University Department of Linguistics The distribution of characters, bi- and trigrams in the Uppsala 70 million words Swedish newspaper corpus Bengt Dahlqvist Abstract The paper describes some

More information

Modeling Norms of Turn-Taking in Multi-Party Conversation

Modeling Norms of Turn-Taking in Multi-Party Conversation Modeling Norms of Turn-Taking in Multi-Party Conversation Kornel Laskowski Carnegie Mellon University Pittsburgh PA, USA 13 July, 2010 Laskowski ACL 1010, Uppsala, Sweden 1/29 Comparing Written Documents

More information

Machine Translation: Examples. Statistical NLP Spring Levels of Transfer. Corpus-Based MT. World-Level MT: Examples

Machine Translation: Examples. Statistical NLP Spring Levels of Transfer. Corpus-Based MT. World-Level MT: Examples Statistical NLP Spring 2009 Machine Translation: Examples Lecture 17: Word Alignment Dan Klein UC Berkeley Corpus-Based MT Levels of Transfer Modeling correspondences between languages Sentence-aligned

More information

Sparse vectors recap. ANLP Lecture 22 Lexical Semantics with Dense Vectors. Before density, another approach to normalisation.

Sparse vectors recap. ANLP Lecture 22 Lexical Semantics with Dense Vectors. Before density, another approach to normalisation. ANLP Lecture 22 Lexical Semantics with Dense Vectors Henry S. Thompson Based on slides by Jurafsky & Martin, some via Dorota Glowacka 5 November 2018 Previous lectures: Sparse vectors recap How to represent

More information

ANLP Lecture 22 Lexical Semantics with Dense Vectors

ANLP Lecture 22 Lexical Semantics with Dense Vectors ANLP Lecture 22 Lexical Semantics with Dense Vectors Henry S. Thompson Based on slides by Jurafsky & Martin, some via Dorota Glowacka 5 November 2018 Henry S. Thompson ANLP Lecture 22 5 November 2018 Previous

More information

Automatic Speech Recognition and Statistical Machine Translation under Uncertainty

Automatic Speech Recognition and Statistical Machine Translation under Uncertainty Outlines Automatic Speech Recognition and Statistical Machine Translation under Uncertainty Lambert Mathias Advisor: Prof. William Byrne Thesis Committee: Prof. Gerard Meyer, Prof. Trac Tran and Prof.

More information

An Introduction to Bioinformatics Algorithms Hidden Markov Models

An Introduction to Bioinformatics Algorithms   Hidden Markov Models Hidden Markov Models Outline 1. CG-Islands 2. The Fair Bet Casino 3. Hidden Markov Model 4. Decoding Algorithm 5. Forward-Backward Algorithm 6. Profile HMMs 7. HMM Parameter Estimation 8. Viterbi Training

More information