Automatic Evaluation in Text Summarization
Hidetsugu Nanba (Hiroshima City University)
Tsutomu Hirao (NTT Communication Science Laboratories)

Keywords: text summarization, summary evaluation

Summary

Recently, the evaluation of computer-produced summaries has become recognized as one of the problem areas that must be addressed in the field of automatic text summarization, and a number of automatic and manual methods have been proposed. Manual methods evaluate summaries correctly, because humans evaluate them, but they are costly. On the other hand, automatic methods, which use evaluation tools or programs, are low in cost, although they cannot evaluate summaries as accurately as manual methods. To solve this problem, new automatic methods have been investigated that reduce the errors of traditional automatic methods by using several evaluation results obtained manually. In this paper, we survey the state of the art of automatic evaluation methods for text summarization.

1. Introduction

Research on automatic text summarization goes back to Luhn [Luhn 58]. As summarization systems have matured, how to evaluate their output has itself become a central research problem. In machine translation, BLEU [Papineni 02] showed that system output can be scored automatically by n-gram matching against human references; ROUGE [Lin 04b], which brought the same idea to summarization, is now in wide use, and a workshop devoted to evaluation measures for machine translation and summarization was held at ACL 2005.
2. Manual Evaluation Methods

Since Luhn [Luhn 58], sentence-extraction systems have commonly been evaluated against sentences selected by human judges, for example by the F-measure; this style of evaluation was standard through the 1990s. Manual judgments have also been used to assess the readability of revised extracts [Mani 99, Nanba 00] and the usefulness of query-biased snippets in information retrieval [Tombros 98]. Large-scale evaluation workshops include the Text Summarization Challenge (TSC), the Tipster SUMMAC evaluation, and the Document Understanding Conference (DUC). Nenkova et al. proposed the Pyramid method, in which the content of the reference summaries R is decomposed into Summary Content Units (SCUs), each weighted by the number of references that express it; a candidate summary C is then scored by the total weight of the SCUs it covers [Nenkova 07].

3. Automatic Evaluation Methods

3.1 ROUGE

ROUGE-N [Lin 03] scores a candidate summary C against a reference summary R by n-gram recall:

  ROUGE-N(C,R) = Σ_{e ∈ n-gram(R)} Count_clip(e) / Σ_{e ∈ n-gram(R)} Count(e)   (1)

where n-gram(R) denotes the n-grams of R, Count(e) is the number of occurrences of e in R, and Count_clip(e) = min(Count(e, n-gram(C)), Count(e, n-gram(R))), i.e., the count of e in C clipped by its count in R. Lin examined values of N from 1 to 4 on DUC data.
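The clipped-count recall of equation (1) is straightforward to realize in code. The following Python is a minimal sketch, not an official implementation: the function names are ours, and whitespace tokenization with a single reference summary is assumed.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rouge_n(candidate, reference, n):
    """Eq. (1): n-gram recall of candidate C against reference R,
    with Count_clip(e) = min(count of e in C, count of e in R)."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    clipped = sum(min(cand[e], c) for e, c in ref.items())
    return clipped / total

reference = "the cat sat on the mat".split()
candidate = "the cat was on the mat".split()
score = rouge_n(candidate, reference, 1)  # 5 of 6 reference unigrams match
```

For this toy pair, ROUGE-1 is 5/6: every reference unigram except "sat" is covered by the candidate.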
Besides ROUGE-N, Lin proposed ROUGE-S, which matches skip-bigrams, i.e., ordered word pairs with arbitrary gaps, instead of contiguous n-grams [Lin 04a, Lin 04b]:

  ROUGE-S(C,R) = (1 + γ²) P_S(C,R) R_S(C,R) / (R_S(C,R) + γ² P_S(C,R))   (2)

where the skip-bigram precision P_S(C,R) and recall R_S(C,R) are defined as

  P_S(C,R) = Σ_{e ∈ skip-bigram(C)} Count_clip(e) / Σ_{e ∈ skip-bigram(C)} Count(e)   (3)

  R_S(C,R) = Σ_{e ∈ skip-bigram(R)} Count_clip(e) / Σ_{e ∈ skip-bigram(R)} Count(e)   (4)

A variant of ROUGE-S called ROUGE-SU counts unigrams as well as skip-bigrams [Lin 04a, Lin 04b], which makes it less strict than ROUGE-S and ROUGE-N.

3.2 ESK

As a further extension of the ROUGE family, the Extended String Subsequence Kernel (ESK) was proposed [ 06]. ESK extends the String Subsequence Kernel (SSK) of Lodhi et al. [Lodhi 02] and the Word Sequence Kernel (WSK) of Cancedda et al. [Cancedda 03]: it matches word subsequences up to length d, and allows a word to match not only on its surface form but also on its word sense (semantic code). Consider the reference R and candidate C

  R: Becoming a cosmonaut:SPACEMAN is my great dream:DREAM
  C: Becoming an astronaut:SPACEMAN is my ambition:DREAM

where "cosmonaut" and "astronaut" share the semantic code SPACEMAN, and "dream" and "ambition" share DREAM. Each matched subsequence is discounted by a decay parameter λ (0 ≤ λ ≤ 1) according to the gaps it spans. For this pair, the numbers of matching units are 3 for ROUGE-1, 1 for ROUGE-2, 3 for ROUGE-S and 4 for ROUGE-SU, while ESK_{d=2}(C,R) = 7 + λ + 2λ² + λ³ + λ⁴ + λ⁵ + λ⁶ + λ⁹ (Table 1): through the semantic codes, ESK credits content matches such as cosmonaut/astronaut that the surface-based ROUGE measures miss. The kernel value is normalized by the self-similarities of C and R. With λ = 0 the gapped matches vanish and ESK behaves much like the ROUGE measures; λ = 0.5 was used in [ 06].
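Equations (2)-(4) can be sketched the same way: the skip-bigrams of a sentence are simply its in-order word pairs. A minimal Python illustration follows (our own function names; γ = 1 gives the balanced F-measure):

```python
from collections import Counter
from itertools import combinations

def skip_bigrams(tokens):
    """Multiset of all in-order word pairs, with arbitrary gaps."""
    return Counter(combinations(tokens, 2))

def rouge_s(candidate, reference, gamma=1.0):
    """Eq. (2): F-measure over the clipped skip-bigram precision
    P_S (Eq. 3) and recall R_S (Eq. 4)."""
    c, r = skip_bigrams(candidate), skip_bigrams(reference)
    clipped = sum(min(c[e], r[e]) for e in c)  # shared pairs, clipped
    p_s = clipped / sum(c.values()) if c else 0.0
    r_s = clipped / sum(r.values()) if r else 0.0
    if p_s + r_s == 0.0:
        return 0.0
    return (1 + gamma**2) * p_s * r_s / (r_s + gamma**2 * p_s)

reference = "police killed the gunman".split()
candidate = "police kill the gunman".split()
score = rouge_s(candidate, reference)  # 3 of 6 skip-bigrams shared
```

Here both sentences have six skip-bigrams and share three of them, so precision, recall, and the F-measure are all 0.5.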
3.3 Basic Elements

In the same family as ROUGE and ESK, Hovy et al. proposed Basic Elements (BE) [Hovy 06]. A BE is a triple <head, modifier, relation> extracted from a syntactic parse; the reference R and the candidate C are each decomposed into sets of such triples, and C is scored by the BEs it shares with R [ 06]. Because BEs abstract away from word order and surface form, BE matching can credit content that n-gram matching in ROUGE misses.

4. Automatic Methods Using Multiple Manual Evaluation Results

Automatic measures inevitably disagree with human judgment in places. A complementary line of work therefore estimates the manual score of a new system from manual scores already assigned to other systems. Suppose n systems produce candidate summaries C_1, ..., C_n for a reference R, and that manual evaluation scores H(C_1,R), ..., H(C_n,R) are available. The manual score of a new system X can then be estimated by weighting each H(C_i,R) by an automatically computed similarity sim_i between X's summary and C_i, for example as the similarity-weighted average

  H(X,R) ≈ Σ_{i=1}^{n} sim_i H(C_i,R) / Σ_{i=1}^{n} sim_i

This framework was investigated in [ 03]. Nanba and Okumura [Nanba 06] proposed an automatic evaluation method along these lines, extending the method of Yasuda et al. [Yasuda 03], which was originally developed for measuring the capability of speech translation systems. Yasuda's method assumes that if two systems A and B obtain similar automatic scores, their manual scores are also similar, so the manual score of X can be interpolated from the scores of the manually evaluated systems closest to it. The procedure takes as input (1) summaries S_1, ..., S_n for which manual evaluation scores are available and (2) summaries T_1, ..., T_m produced by the systems under evaluation.
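The similarity-weighted estimate above can be sketched in a few lines of Python. The similarity values and manual scores below are invented for illustration, and the function name is ours:

```python
def estimate_manual_score(sims, human_scores):
    """Estimate H(X,R) as the similarity-weighted average of the
    manual scores H(C_i,R) of already-evaluated systems."""
    total = sum(sims)
    if total == 0:
        raise ValueError("all similarities are zero")
    return sum(s * h for s, h in zip(sims, human_scores)) / total

# Three systems with manual scores; X is most similar to the first.
sims = [0.8, 0.5, 0.2]   # automatic similarity of X's summary to C_1..C_3
human = [0.9, 0.6, 0.3]  # manual scores H(C_i, R)
estimate = estimate_manual_score(sims, human)
```

The estimate is pulled toward the manual scores of the systems whose output is most similar to X's.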
In outline, ROUGE scores are computed for the summaries of the new system X and for the manually evaluated summaries S_1, ..., S_n over the test summaries T_1, ..., T_m, and the manual score of X is then interpolated from the manual scores of the systems whose automatic scores are closest to X's. Nanba and Okumura evaluated the method on TSC2 data and reported that its estimates agree with manual evaluation more closely than those obtained from Yasuda's method or from ROUGE scores alone.

Another approach combines several automatic measures by multiple linear regression [ 07]. The manual score H(C_i,R) is modeled from ROUGE-1, ROUGE-S and ESK as

  H(C_1,R) = β_0 + β_1 ROUGE-1(C_1,R) + β_2 ROUGE-S(C_1,R) + β_3 ESK(C_1,R) + ε_1
  H(C_2,R) = β_0 + β_1 ROUGE-1(C_2,R) + β_2 ROUGE-S(C_2,R) + β_3 ESK(C_2,R) + ε_2
  ...
  H(C_n,R) = β_0 + β_1 ROUGE-1(C_n,R) + β_2 ROUGE-S(C_n,R) + β_3 ESK(C_n,R) + ε_n   (5)

With y = (H(C_1,R), H(C_2,R), ..., H(C_n,R))^T, ε = (ε_1, ε_2, ..., ε_n)^T, β = (β_0, β_1, β_2, β_3)^T and the design matrix

      | 1  ROUGE-1(C_1,R)  ROUGE-S(C_1,R)  ESK(C_1,R) |
  X = | 1  ROUGE-1(C_2,R)  ROUGE-S(C_2,R)  ESK(C_2,R) |
      |                      ...                       |
      | 1  ROUGE-1(C_n,R)  ROUGE-S(C_n,R)  ESK(C_n,R) |

the system (5) is written y = Xβ + ε, and the least-squares estimate of β is

  β̂ = (X^T X)^{-1} X^T y   (6)

Rather than always using all three measures, the best combination is chosen by model selection [ 07]. From E = {ROUGE-1, ROUGE-S, ESK}, the seven non-empty subsets {ROUGE-1}, {ROUGE-S}, {ESK}, {ROUGE-1, ROUGE-S}, {ROUGE-1, ESK}, {ROUGE-S, ESK} and {ROUGE-1, ROUGE-S, ESK} define candidate models M_1, M_2, ..., M_7. Each model is scored by Akaike's Information Criterion (AIC) [Akaike 73]; because the number of samples n is small relative to the number of parameters p, the corrected criterion AIC_c [Sugiura 78] is used instead:

  AIC_c(M) = AIC(M) + 2p(p + 1) / (n - p - 1)   (7)
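Equations (6) and (7) amount to fitting each candidate subset of measures by least squares and comparing the fits by AIC_c. The following numpy sketch illustrates the procedure; the automatic and manual scores are randomly generated stand-ins, not data from the paper, and here p counts the intercept among the parameters.

```python
import itertools
import math

import numpy as np

def aic_c(X, y):
    """AIC_c of a Gaussian linear model (Eq. 7): AIC = -2*MLL + 2p
    plus the small-sample correction 2p(p+1)/(n-p-1)."""
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # least squares, Eq. (6)
    rss = float(np.sum((y - X @ beta) ** 2))     # residual sum of squares
    mll = -n / 2 * (1 + math.log(2 * math.pi) + math.log(rss / n))
    return -2 * mll + 2 * p + 2 * p * (p + 1) / (n - p - 1)

rng = np.random.default_rng(0)
n = 30
auto = rng.random((n, 3))  # columns: ROUGE-1, ROUGE-S, ESK scores
# Synthetic manual scores driven by ROUGE-1 and ESK only.
y = 0.2 + 0.5 * auto[:, 0] + 0.3 * auto[:, 2] + rng.normal(0.0, 0.02, n)

features = ("ROUGE-1", "ROUGE-S", "ESK")
scores = {}
for k in range(1, 4):
    for cols in itertools.combinations(range(3), k):
        X = np.hstack([np.ones((n, 1)), auto[:, list(cols)]])  # beta_0 column
        scores[tuple(features[i] for i in cols)] = aic_c(X, y)
best = min(scores, key=scores.get)  # subset with the smallest AIC_c
```

With these synthetic scores the selected subset contains ROUGE-1 and ESK, the two measures that actually drive y; differences ΔAIC_c against the best model can then be thresholded as described in the text.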
where

  AIC(M) = -2 MLL + 2p   (8)

and MLL, the maximum log-likelihood of model M under Gaussian errors, is

  MLL = -(n/2) { (1 + log(2π)) + log(R_p / n) }

with R_p the residual sum of squares of the model with p parameters. Among M_1, ..., M_7, the model with the smallest AIC_c, AIC_c^best = min_i AIC_c(M_i), is the best candidate; models whose difference ΔAIC_c = AIC_c(M_i) - AIC_c^best falls below a threshold τ are regarded as comparably good. With τ = 5, the models M_2, M_4, M_5 and M_7 were retained, and averaging their four estimates gave 0.72 [ 06]. In experiments on the TSC3 and DUC2004 data, measures including ROUGE-2 and ESK were compared [ 06] (see [Hirao 04] for the TSC3 evaluation).

References

[Akaike 73] Akaike, H.: Information Theory as an Extension of the Maximum Likelihood Principle, in Proc. of the 2nd International Symposium on Information Theory (1973)
[Cancedda 03] Cancedda, N., Gaussier, E., Goutte, C. and Renders, J.-M.: Word Sequence Kernels, Journal of Machine Learning Research, Vol. 3 (2003)
[Hirao 04] Hirao, T., Okumura, M., Fukushima, T. and Nanba, H.: Text Summarization Challenge 3 - Text Summarization Evaluation at NTCIR Workshop 4, in Working Notes of the 4th NTCIR Workshop Meeting (2004)
[Hovy 06] Hovy, E., Lin, C.-Y., Zhou, L. and Fukumoto, J.: Automated Summarization Evaluation with Basic Elements, in Proc. of the 5th Conference on Language Resources and Evaluation (2006)
[Lin 03] Lin, C.-Y. and Hovy, E.: Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics, in Proc. of the 4th Meeting of the North American Chapter of the Association for Computational Linguistics and Human Language Technology (2003)
[Lin 04a] Lin, C.-Y.: ROUGE: A Package for Automatic Evaluation of Summaries, in Proc. of the Workshop on Text Summarization Branches Out, Post-Conference Workshop of ACL 2004 (2004)
[Lin 04b] Lin, C.-Y. and Och, F.: Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics, in Proc. of the 42nd Annual Meeting of the Association for Computational Linguistics (2004)
[Lodhi 02] Lodhi, H., Saunders, C., Shawe-Taylor, J., Cristianini, N. and Watkins, C.: Text Classification Using String Kernels, Journal of Machine Learning Research, Vol. 2 (2002)
[Luhn 58] Luhn, H.: The Automatic Creation of Literature Abstracts, IBM Journal of Research and Development, Vol. 2, No. 2 (1958)
[Mani 99] Mani, I., Gates, B. and Bloedorn, E.: Improving Summaries by Revising Them, in Proc. of the 37th Annual Meeting of the Association for Computational Linguistics (1999)
[Nanba 00] Nanba, H. and Okumura, M.: Producing More Readable Extracts by Revising Them, in Proc. of the 18th International Conference on Computational Linguistics (2000)
[Nanba 06] Nanba, H. and Okumura, M.: An Automatic Method for Summary Evaluation Using Multiple Evaluation Results by a Manual Method, in Proc. of the COLING/ACL 2006 Main Conference Poster Sessions (2006)
[Nenkova 07] Nenkova, A., Passonneau, R. and McKeown, K.: The Pyramid Method: Incorporating Human Content Selection Variation in Summarization Evaluation, ACM Transactions on Speech and Language Processing, Vol. 4, Issue 2 (2007)
[Papineni 02] Papineni, K., Roukos, S., Ward, T. and Zhu, W.-J.: BLEU: a Method for Automatic Evaluation of Machine Translation, in Proc. of the 40th Annual Meeting of the Association for Computational Linguistics (2002)
[Sugiura 78] Sugiura, N.: Further Analysis of the Data by Akaike's Information Criterion and the Finite Corrections, Communications in Statistics - Theory and Methods, Vol. A7, No. 1 (1978)
[Tombros 98] Tombros, A. and Sanderson, M.: Advantages of Query Biased Summarization in Information Retrieval, in Proc. of the 21st Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, pp. 2-10 (1998)
[Yasuda 03] Yasuda, K., Sugaya, F., Takezawa, T., Yamamoto, S. and Yanagida, M.: Applications of Automatic Evaluation Methods to Measuring a Capability of Speech Translation System, in Proc. of the 10th Conference of the European Chapter of the Association for Computational Linguistics (2003)
[ 03] Arrigan, T. et al.: NL-158 (2003) (in Japanese)
[ 06] Vol. 47, No. 6 (2006) (in Japanese)
[ 07] Vol. 22, No. 2B (2007) (in Japanese)
[ 06] Basic Element, FI-084 (2006) (in Japanese)

(Author biographies)
Hidetsugu Nanba (Hiroshima City University). Member of the ACL and the ACM. E-mail: nanba@hiroshima-cu.ac.jp
Tsutomu Hirao (NTT Communication Science Laboratories; joined NTT in 2000). Member of the ACL.
More informationTemporal context calibrates interval timing
Temporal context calibrates interval timing, Mehrdad Jazayeri & Michael N. Shadlen Helen Hay Whitney Foundation HHMI, NPRC, Department of Physiology and Biophysics, University of Washington, Seattle, Washington
More informationAutomatically Evaluating Text Coherence using Anaphora and Coreference Resolution
1 1 Barzilay 1) Automatically Evaluating Text Coherence using Anaphora and Coreference Resolution Ryu Iida 1 and Takenobu Tokunaga 1 We propose a metric for automatically evaluating discourse coherence
More informationSYNTHER A NEW M-GRAM POS TAGGER
SYNTHER A NEW M-GRAM POS TAGGER David Sündermann and Hermann Ney RWTH Aachen University of Technology, Computer Science Department Ahornstr. 55, 52056 Aachen, Germany {suendermann,ney}@cs.rwth-aachen.de
More informationBuilding Cognitive Applications
Building Cognitive Applications * *creating visualizations using cognitive APIs Jonathan Kaufman @kauffecup jkaufman.io June 15, 2016 1. What is cognitive? 2. Demo some apps + look at code 3. Build our
More informationSupport Vector Machine (SVM) and Kernel Methods
Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2014 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin
More informationCS4705. Probability Review and Naïve Bayes. Slides from Dragomir Radev
CS4705 Probability Review and Naïve Bayes Slides from Dragomir Radev Classification using a Generative Approach Previously on NLP discriminative models P C D here is a line with all the social media posts
More informationLearning k-edge Deterministic Finite Automata in the Framework of Active Learning
Learning k-edge Deterministic Finite Automata in the Framework of Active Learning Anuchit Jitpattanakul* Department of Mathematics, Faculty of Applied Science, King Mong s University of Technology North
More informationComputation of Csiszár s Mutual Information of Order α
Computation of Csiszár s Mutual Information of Order Damianos Karakos, Sanjeev Khudanpur and Care E. Priebe Department of Electrical and Computer Engineering and Center for Language and Speech Processing
More information5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1)
5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1) Assumption #A1: Our regression model does not lack of any further relevant exogenous variables beyond x 1i, x 2i,..., x Ki and
More informationText mining and natural language analysis. Jefrey Lijffijt
Text mining and natural language analysis Jefrey Lijffijt PART I: Introduction to Text Mining Why text mining The amount of text published on paper, on the web, and even within companies is inconceivably
More informationUAlacant word-level machine translation quality estimation system at WMT 2015
UAlacant word-level machine translation quality estimation system at WMT 2015 Miquel Esplà-Gomis Felipe Sánchez-Martínez Mikel L. Forcada Departament de Llenguatges i Sistemes Informàtics Universitat d
More information