Complex Question Answering with ASQA at NTCIR 7 ACLIA
|
|
- Arnold Logan
- 5 years ago
- Views:
Transcription
1 Complex Queston Answerng wth ASQA at NTCIR 7 ACLIA Y-Hsun Lee 1, Cheng-We Lee 12, Cheng-Lung Sung 1, Mon-Tn Tzou 1, Chh-Chen Wang 1, Shh-Hung Lu 1, Cheng-We Shh 1, Pe-Yn Yang 1, Wen-Lan Hsu 1 1 Insttute of Informaton Scence, Academa Snca, Tawan, R.O.C 2 Department of Computer Scence, Natonal Tsng-Hua Unversty, Tawan, R.O.C correspondng author {rog,aska,clsung,mttzou,vncent,journey,dap,annabel,hsu}@s.snca.edu.tw Abstract At NTCIR 7, we mplemented the Academa Snca Queston Answerng (ASQA) system for complex questons. The system uses three methods to select answer strngs from a news corpus. (a) It uses syntactc patterns, whch are usually used by QA systems, to retreve more precse answer strngs than those derved by tradtonal IR. (b) Usng external knowledge, the system can fnd accurate answers to specfc questons that the tradtonal IR approach can not process. (c) Entropy-based and co-occurrence-based mnng methods are used to retreve relevant answer strngs for document retreval. In the NTCIR 7 CCLQA task, ASQA acheved 0.26 n the CT-CT task and 0.20 n the CS-CS task. 1. Introducton Because of the hgh level of nformaton overload on the Internet, research nto queston answerng, whch focuses on how to respond to users queres wth exact answers, s becomng ncreasngly mportant. In recent years, many nternatonal queston answerng contest-type evaluaton tasks have been held at conferences and workshops, such as TREC [3], CLEF [1], and NTCIR [2]. At NTCIR 7, the systems were requred to process complex questons related to defntons, bographes, relatonshps, and events. Ths year, our system focused on external knowledge, syntax patterns and some data mnng technques. We ntroduce the methods n Secton 5. The remander of ths paper s organzed as follows. Secton 2 provdes an overvew of the system archtecture. In Sectons 3 and 4, we ntroduce the queston analyss strategy and document retreval method, respectvely. In Secton 5, we dscuss the methods used at NTCIR 7. In Secton 6, we consder the redundancy removal module. Secton 7 evaluates the system performance, and Secton 8 contans a dscusson. Secton 9 summarzes our conclusons. 2. System Descrpton Our queston answerng system ASQA (Academa Snca Queston Answerng) for complex questons s dvded nto fve modules: queston processng, document retreval, sentence selecton, answer rankng, and redundancy removal. Questons are frst analyzed by the queston processng module to extract the keywords and queston topc, whch are then used to retreve documents from the corpus. In the next step, the sentence selecton module obtans the sentence scores, after whch the sentence rankng module determnes whether or not the sentences are relevant. Fnally, the redundancy removal module flters smlar sentences as answer strngs. 3. Queston Analyss The queston processng module uses smple surface patterns to classfy questons nto four types: (a) defnton, (b) bography, (c) relatonshp, and (d) event. For example, bography questons always start wth, whch we call a queston term. The man topcs of such questons are usually adjacent to the queston terms, so we use surface text patterns to retreve a queston s topc. In bography 70
2 questons, we consder that the queston topc and the queston keywords should be the same. In the other queston types, we use a keyword extractor to extract keywords from queston topcs. To ensure that a queston s not dvded nto meanngless small peces by the keyword extractor, we retreve Wkpeda ttles to address the problem. If the queston topc appears n the Wkpeda ttles, we set the keywords as the queston topc. 4. Document Retreval The document retreval module uses Lucene, an open source nformaton retreval engne, to ndex documents by character. In the retreval process, we use the AND operator to form a query strng comprsed of queston topcs, whch s then sent to the Lucene engne. If the number of retreved documents s larger than a threshold, we termnate the retreval process; otherwse, we use the keywords to form a new query strng. If the number of retreved documents s stll less than the threshold, we use the OR operator to combne the keywords as a query strng. After retrevng the documents, we splt each one nto several sentences. 5. Sentence Selecton The sentence selecton module uses co-occurrence-based and entropy-based methods, syntactc patterns, and external knowledge to fnd relevant sentences. We descrbe the process n detal n followng sub-sectons Co-occurrence-based Sentence Selecton Magnn et al. [5] consder that the number of documents n whch the queston terms and the answer co-occur s useful for QA. The hypothess s smlar to that of Clarke et al. [4], who use co-occurrence methods to measure the relevance of an answer to the gven queston based on Web search results. Although the method s used n factod QA, we thnk the hypothess could also be useful for complex QA. Our ratonale s that, n good qualty passages, the more often a term co-occurs wth the queston topc, the hgher the confdence that the passage wll be relevant. We regard a co-occurrence as an ndcaton that the passage can be taken as correct based on the co-occurrng queston topcs. Let the gven term be t and the gven queston be Q, where Q conssts of a set (QT) of queston terms {qt1, qt2, qt3,, qtn} created from the queston processng result of Q. We use the followng equaton to calculate each term s weght: p( t Q) frequency( t Q) / freuqency( Q), p( t ) frequency( t ) / N where P (t QT) denotes the probablty of the term when QT appears; N s the number of documents n the corpus; and P(t) s the probablty of the term occurrng n the news corpus. We consder that the probablty P (t QT) should be sgnfcantly dfferent to the probablty P(t). After calculatng the probabltes of the terms, we use the followng hypothess test to determne whether or not the term s relevant: H 0 : p( t Q) p( t ). H : p( t Q) p( t ) Entropy-based Sentence Retreval Inspred by the work of Xu and Croft [6], we use the concept of entropy, whch s a measure of uncertanty n nformaton theory, to perform sentence retreval n our queston answer system. Ths concept s smlar to the nverse document frequency (IDF) n nformaton retreval. If the entropy value of a term s large, the term s regarded as a stop word or keyword n some topc. For example, the term poltcal party should be mportant n poltcal news. The entropy value of each term we use s calculated by: n Entropy( t) p( t ) log p( (1) 1 10 t ) Let t and n denote the gven term and the number of retreved documents respectvely. Then, the probablty of t s calculated by frequency( t ) P( t ) (2) Total( t) where frequency( t ) represents the number of passages n whch the term t occurs, and Total (t) denotes the number of retreved passages n whch t occurs. Next, we select relevant sentences based on ther entropy values. We adopt the vector space model to calculate the smlarty between the retreved passages and an entropy vector, whch s constructed from the terms whose entropy values are greater than a threshold. In our system, the threshold s set at 0.6. In addton, we use the IDF to flter stop words. The sentence vector s constructed based on the segmentaton result. Each weght of the vector represents the term s entropy value or a default value. Because the entropy method s data-drven, we only use t f the queston topc appears n several documents. 71
3 5.3. External-Knowledge-based Sentence Retreval There are several knowledge databases, such as WordNet, HowNet, and Wkpeda. The knowledge they contan should be useful for QA systems. To support our ASQA, we use Wkpeda, an onlne encyclopeda whose content s provded by Internet users. If the keywords of a queston appear n a ttle n Wkpeda, the contents of the ttle should be the standard answers that also appear n the corpus. In addton, we consder the case where the queston does not appear on a Wkpeda page. To address ths problem, we employ the three methods descrbed n the followng sub-sectons Related Term-based Sentence Retreval In ths sentence retreval method, we also use entropy values to fnd related terms. A low entropy value ndcates a topc-specfc term, whch should be useful for dentfyng answer sentences. However, n our experence, terms wth small entropy values are not useful when the corpus s derved from news artcles or the Web, because Web data or corpus data contans too much nose. Therefore, we only apply the entropy method to the Wkpeda data because t contans less nosy nformaton. For example, n Wkpeda Lesley who s the daughter of Ma Yng-jeou only co-occurs wth Ma Yng-jeou, but on the Web, Lesley co-occurs wth many terms that are not of nterest to us. Obvously, ths sentence selecton method can only be used when the queston topc exsts n Wkpeda. Equaton (1) s used to calculate each term s entropy value. Then, we apply the followng equaton to calculate the sentence score: If entropy(term ) 0 m Score( s) 1 (3) entropy( term 1 ) The smaller a term s entropy value, the more relevant the term wll be. If a sentence contans many relevant terms, t wll be taken as the standard answer. Therefore, we aggregate the nverse of the terms entropy values as the score of s Defntonal Term-based Sentence Retreval Some terms or phrases provde useful hnts for dentfyng defntonal sentences. For example, the phrase s a knd of s a clear ndcaton that the sentence s about the type or category of an object or term. Snce such phrases or terms are used a great deal n Wkpeda, we collect the Wkpeda pages to extract such phrases or terms, whch we call defntonal terms. Ths sentence selecton method uses Equaton (1) to fnd defntonal terms. In contrast to the method descrbed n sub-secton 5.3.1, whch utlzes small entropy values to locate related terms, we want to fnd terms or phrases wth large entropy values that occur n most Wkpeda artcles. However, these knds of terms could also be stop words or defntonal terms, so they must be fltered out. We assume that stop words are evenly dstrbuted n any corpus. If a term has a hgh entropy value or a hgh IDF value n both the Wkpeda corpus and the News corpus, the term s probably a stop word. Fnally, we calculate the score of a sentence by aggregatng the scores of terms that have hgh entropy values as follows: 0 If entropy(term ) threshold m Score( s) (4) entropyterm ( ) Syntactc Surface Pattern-based Sentence Retreval Syntactc patterns are useful for fndng precse nformaton for bography questons. Next, we descrbe how we construct bographcal syntactc surface patterns (SPs). 72
4 Table 1. TREC BographyQueston Type Q Types Year 2003 Year 2004 Year 2005 Year 2006 Sum percentage Appellaton % Ancestral Home % Natonalty % Domcle % Brth/Death Day and Place % Race % Famly % Parentage % Occupaton % Educaton % Belef % Characterstcs % Honour % Contrbuton % Notables % Organzaton % Quotatons % Publcatons % Dsease % Dsablty % Msadventure % Scandal % Controversy % Others % Syntactc Pattern Generaton To construct SPs for bography questons, we need To construct SPs for bography questons, we need categores of bographcal nformaton. Snce 2008 was the frst year that TREC consdered bography questons, we had to use Englsh tranng data for our analyss. We took the test questons from TREC 2003 to TREC 2006 and categorzed them nto 24 types. Some of the queston types were combned based on ther smlarty, whch resulted n the 20 types shown n Table 1. Then, from the bographcal nformaton n the table, we created our own Chnese tranng questons and answer nuggets to tran the SPs. We created the SPs sem-automatcally from the followng sources. 1. Algnment-based Template Generaton: For relatonal nformaton, such as the relatonshp between a husband and wfe, we collected the relatonal term pars and appled our algnment-based template generaton method to obtan prelmnary sentence patterns. Then, hgh frequency patterns were rewrtten nto answer templates by our analysts. 2. Sentence makng: We tred to thnk of possble bographcal answers for each type ourselves and then wrote correspondng templates. 3. Internet: Defntonal sentences n bographcal artcles obtaned from the Internet, such as Wkpeda, were also referenced durng template formaton. 4. News Corpus: Our analysts used the Lucene search engne to extract artcles about the targeted person from the CIRB20 and CIRB40 corpora, and then selected relevant defntonal sentences n the artcles 73
5 Table 2. Bography Template Type Bography20Attrbutes Instance Rule TOTAL Rules Appellaton Belef Characterstcs Contrbuton Controversy Domcle DPBD&Natonalty Educaton Famly Honour IPC Msadventure Notables Org&Occ Others Parentage Publcatons Quotatons Race Scandal TOTAL to make correspondng templates Negaton Term To help screen out unwanted sentences whle keepng the good ones, we added two categores of keywords: Terms to be excluded and Terms that should not be excluded. Most of the content of the frst category s comprsed of negatve terms, whle the second contans double negatves. The followng are some examples of the two types of terms; the words n parentheses are the lteral meanngs of the Chnese terms: 1. Terms to be excluded: (reject)(ban) (cant)(wont)(should) (Dont thnk about)(questonng partcle) (rumor)(not necessary) 2. Terms that should not to be excluded: (cant help but)(cantwthout ) (no..not...) (no other than...) (none...not...) Punctuaton In Chnese, the sentence structures of declaratves and questons are often the same. If a queston does not have a questonng partcle, such as or, t would be very dffcult for a computer program to dstngush t from a declaratve. Therefore, a sentence that matches our templates may be just a queston, rather than a declaratve that we wsh to retreve. To avod ths problem, all sentences endng wth queston marks were deleted when retrevng sentences, unless the queston marks were part of the matched templates. 74
6 6. Redundancy Removal Rankng Canddates ALL Best IASL-CT-CT-01-T IASL-CS-CS-01-T X Term OverLap 8. Dscusson and Error Analyss N-gram Overlap VSM Model Our analyss of the experment results dentfed certan problems that need to be addressed. We dscuss them n the followng sub-sectons Segmentaton of NER problems Answer Strngs Fgure 1 Redundancy Removal FlowChart A news corpus may contan a great deal of redundant nformaton, especally f the news s hot, because smlar nformaton could be mentoned n dfferent news artcles. To handle ths problem, we need a module to flter out smlar sentences. We adopt a two-stage redundancy removal approach, as shown n Fgure 2. Frst, we use the concept of term overlap to remove smlar sentences. The sentence selecton module dentfes relevant terms, defntonal terms, and entropy terms. In ths stage, we set a new threshold to fnd good terms from the set of terms derved by the sentence selecton module. It s assumed that these terms would be the same n smlar sentences. In the second stage, f we have answer canddates that match some of the syntactc patterns descrbed n sub-secton 5.4, we use an N-gram overlap method to remove smlar sentences; otherwse, we use a vector space model to remove such sentences. 7. System Performance In the NTCIR 7 CCLQA Task, we acheved an F-score of n CT-CT and n CS-CS. Among the four queston types, our system acheved a hgher F-score for bography questons than for the other three types. The system dd not perform well on event questons because we only used IR technques to retreve relevant sentences. All of our system s modules are based on tradtonal Chnese; therefore we have to convert smplfed Chnese questons and corpora. However, as shown by the results, the process degrades the system s performance. Because we only used a character-based ndex for document retreval and dd not use an NER technque to flter out wrong documents wth dfferent people s names, we encountered some problems when nformaton about the wrong person was retreved. For example, there was a queston. However, there was another person called n the news corpus. Examples lke ths cause the performance to deterorate on some questons Dfferent Perspectves between Tranng and Test Data Accordng to the tranng data provded for the task, most of the answers were summarzaton-orented. The answers contaned a lot of related nformaton about the queston focus, and they were qute dfferent from the answers for factod questons. Unfortunately, we found a dscrepancy between the tranng data and test data because some of the test questons were judged as f they were factod questons. For example, there was a queston. In the tranng set, the answers to ths knd of queston would contan nformaton lke brthplace, brthday, famly or occupaton, but there s only a small porton of ths nformaton n the standard answers n the test data. The above-mentoned dscrepancy had a substantal mpact on the system s performance Inconsstency n Treatng People wth Same Name For questons lke and, there s more than one person wth the same name n the news corpus. We found that the task evaluators treated them n dfferent ways. For, there are at least fve people wth that name n the news corpus, but only the person who does gymnastcs was labeled as correct. However, for, all the answers related to dfferent were labeled as correct. Some errors 75
7 were caused by ths nconsstency Unmarked Correct Answers System responses judged as correct answers should be marked, yet some correct answers were unmarked. Ths degraded our system s performance substantally. For example, the standard answer, whch appeared n our system response, was unmarked. for Answer Valdaton, Proceedngs of the 40th Annual Meetng on Assocaton for Computatonal Lngustcs, 2001, pp [6] J. Xu and W. B. Croft, Query expanson usng local and global document analyss, Proceedngs of the 19th annual nternatonal ACM SIGIR conference, Zurch, Swtzerland, 1996, pp Concluson For NTCIR 7 CCLQA, we bult a complex queston answerng system. Snce the task s complex, we assume the answers are summarzaton-orented, whch means they contan varous types of nformaton that requres careful flterng. We used several syntax patterns to select precse answers when dealng wth bography questons, and used the entropy method to select defntonal terms and related terms from Wkpeda for defnton questons and relaton questons respectvely. Even though the pattern-based methods were not hghly accurate, we stll encountered the low coverage problem. Therefore, we adopted the co-occurrence and entropy-based methods to fnd relevant sentences. 10. Acknowledgments Ths research was supported n part by the Natonal Scence Councl of Tawan under Center of Excellence Grant NSC E PAE; and by the Research Center for Humantes and Socal Scences, Academa Snca, and the Thematc Program of Academa Snca under Grant AS 95ASIA02. We wsh to thank the Chnese Knowledge and Informaton Processng Group (CKIP) of Academa Snca for provdng us wth AutoTag for Chnese word segmentaton. 11. References [1] Cross Language Evaluaton Forum (CLEF), [2] NTCIR Workshop, [3] Text REtreval Conference (TREC), [4] C. L. A. Clarke, G. Cormack, G. Kemkes, M. Laszlo, T. Lynam, E. Terra and P. Tlker, Statstcal Selecton of Exact Answers (MultText Experments for TREC 2002), Proc. of TREC, 2002, pp [5] B. Magnn, M. N. R. Prevete and H. Tanev, Is t the rght answer?: explotng web redundancy 76
Question Classification Using Language Modeling
Queston Classfcaton Usng Language Modelng We L Center for Intellgent Informaton Retreval Department of Computer Scence Unversty of Massachusetts, Amherst, MA 01003 ABSTRACT Queston classfcaton assgns a
More informationSimilar Sentence Retrieval for Machine Translation Based on Word-Aligned Bilingual Corpus
Smlar Sentence Retreval for Machne Translaton Based on Word-Algned Blngual Corpus Wen-Han Chao and Zhou-Jun L School of Computer Scence, Natonal Unversty of Defense Technology, Chna, 40073 cwhk@63.com
More informationCOMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS
Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS
More informationPulse Coded Modulation
Pulse Coded Modulaton PCM (Pulse Coded Modulaton) s a voce codng technque defned by the ITU-T G.711 standard and t s used n dgtal telephony to encode the voce sgnal. The frst step n the analog to dgtal
More informationStatistics II Final Exam 26/6/18
Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the
More informationThe Order Relation and Trace Inequalities for. Hermitian Operators
Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence
More informationUncertainty and auto-correlation in. Measurement
Uncertanty and auto-correlaton n arxv:1707.03276v2 [physcs.data-an] 30 Dec 2017 Measurement Markus Schebl Federal Offce of Metrology and Surveyng (BEV), 1160 Venna, Austra E-mal: markus.schebl@bev.gv.at
More informationFuzzy Boundaries of Sample Selection Model
Proceedngs of the 9th WSES Internatonal Conference on ppled Mathematcs, Istanbul, Turkey, May 7-9, 006 (pp309-34) Fuzzy Boundares of Sample Selecton Model L. MUHMD SFIIH, NTON BDULBSH KMIL, M. T. BU OSMN
More informationExtracting Pronunciation-translated Names from Chinese Texts using Bootstrapping Approach
Extractng Pronuncaton-translated Names from Chnese Texts usng Bootstrappng Approach Jng Xao School of Computng, Natonal Unversty of Sngapore xaojng@comp.nus.edu.sg Jmn Lu School of Computng, Natonal Unversty
More informationOrientation Model of Elite Education and Mass Education
Proceedngs of the 8th Internatonal Conference on Innovaton & Management 723 Orentaton Model of Elte Educaton and Mass Educaton Ye Peng Huanggang Normal Unversty, Huanggang, P.R.Chna, 438 (E-mal: yepeng@hgnc.edu.cn)
More informationMSU at ImageCLEF: Cross Language and Interactive Image Retrieval
MSU at ImageCLEF: Cross Language and Interactve Image Retreval Vneet Bansal, Chen Zhang, Joyce Y. Cha, Rong Jn Department of Computer Scence and Engneerng, Mchgan State Unversty East Lansng, MI48824, U.S.A.
More informationCorpora and Statistical Methods Lecture 6. Semantic similarity, vector space models and wordsense disambiguation
Corpora and Statstcal Methods Lecture 6 Semantc smlarty, vector space models and wordsense dsambguaton Part 1 Semantc smlarty Synonymy Dfferent phonologcal/orthographc words hghly related meanngs: sofa
More informationPop-Click Noise Detection Using Inter-Frame Correlation for Improved Portable Auditory Sensing
Advanced Scence and Technology Letters, pp.164-168 http://dx.do.org/10.14257/astl.2013 Pop-Clc Nose Detecton Usng Inter-Frame Correlaton for Improved Portable Audtory Sensng Dong Yun Lee, Kwang Myung Jeon,
More informationExtending Relevance Model for Relevance Feedback
Extendng Relevance Model for Relevance Feedback Le Zhao, Chenmn Lang and Jame Callan Language Technologes Insttute School of Computer Scence Carnege Mellon Unversty {lezhao, chenmnl, callan}@cs.cmu.edu
More informationChapter 13: Multiple Regression
Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to
More informationANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)
Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of
More informationOn the Repeating Group Finding Problem
The 9th Workshop on Combnatoral Mathematcs and Computaton Theory On the Repeatng Group Fndng Problem Bo-Ren Kung, Wen-Hsen Chen, R.C.T Lee Graduate Insttute of Informaton Technology and Management Takmng
More informationCHAPTER IV RESEARCH FINDING AND ANALYSIS
CHAPTER IV REEARCH FINDING AND ANALYI A. Descrpton of Research Fndngs To fnd out the dfference between the students who were taught by usng Mme Game and the students who were not taught by usng Mme Game
More informationREAL-TIME DETERMINATION OF INDOOR CONTAMINANT SOURCE LOCATION AND STRENGTH, PART II: WITH TWO SENSORS. Beijing , China,
REAL-TIME DETERMIATIO OF IDOOR COTAMIAT SOURCE LOCATIO AD STREGTH, PART II: WITH TWO SESORS Hao Ca,, Xantng L, Wedng Long 3 Department of Buldng Scence, School of Archtecture, Tsnghua Unversty Bejng 84,
More informationModule 9. Lecture 6. Duality in Assignment Problems
Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept
More informationOn the correction of the h-index for career length
1 On the correcton of the h-ndex for career length by L. Egghe Unverstet Hasselt (UHasselt), Campus Depenbeek, Agoralaan, B-3590 Depenbeek, Belgum 1 and Unverstet Antwerpen (UA), IBW, Stadscampus, Venusstraat
More informationModule 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:
More informationχ x B E (c) Figure 2.1.1: (a) a material particle in a body, (b) a place in space, (c) a configuration of the body
Secton.. Moton.. The Materal Body and Moton hyscal materals n the real world are modeled usng an abstract mathematcal entty called a body. Ths body conssts of an nfnte number of materal partcles. Shown
More informationNUMERICAL DIFFERENTIATION
NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the
More informationCS47300: Web Information Search and Management
CS47300: Web Informaton Search and Management Probablstc Retreval Models Prof. Chrs Clfton 7 September 2018 Materal adapted from course created by Dr. Luo S, now leadng Albaba research group 14 Why probabltes
More informationCHAPTER 6 GOODNESS OF FIT AND CONTINGENCY TABLE PREPARED BY: DR SITI ZANARIAH SATARI & FARAHANIM MISNI
CHAPTER 6 GOODNESS OF FIT AND CONTINGENCY TABLE Expected Outcomes Able to test the goodness of ft for categorcal data. Able to test whether the categorcal data ft to the certan dstrbuton such as Bnomal,
More informationA New Scrambling Evaluation Scheme based on Spatial Distribution Entropy and Centroid Difference of Bit-plane
A New Scramblng Evaluaton Scheme based on Spatal Dstrbuton Entropy and Centrod Dfference of Bt-plane Lang Zhao *, Avshek Adhkar Kouch Sakura * * Graduate School of Informaton Scence and Electrcal Engneerng,
More informationThe optimal delay of the second test is therefore approximately 210 hours earlier than =2.
THE IEC 61508 FORMULAS 223 The optmal delay of the second test s therefore approxmately 210 hours earler than =2. 8.4 The IEC 61508 Formulas IEC 61508-6 provdes approxmaton formulas for the PF for smple
More informationCONTRAST ENHANCEMENT FOR MIMIMUM MEAN BRIGHTNESS ERROR FROM HISTOGRAM PARTITIONING INTRODUCTION
CONTRAST ENHANCEMENT FOR MIMIMUM MEAN BRIGHTNESS ERROR FROM HISTOGRAM PARTITIONING N. Phanthuna 1,2, F. Cheevasuvt 2 and S. Chtwong 2 1 Department of Electrcal Engneerng, Faculty of Engneerng Rajamangala
More informationComparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method
Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method
More informationPsychology 282 Lecture #24 Outline Regression Diagnostics: Outliers
Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.
More informationRecover plaintext attack to block ciphers
Recover plantext attac to bloc cphers L An-Png Bejng 100085, P.R.Chna apl0001@sna.com Abstract In ths paper, we wll present an estmaton for the upper-bound of the amount of 16-bytes plantexts for Englsh
More informationPower law and dimension of the maximum value for belief distribution with the max Deng entropy
Power law and dmenson of the maxmum value for belef dstrbuton wth the max Deng entropy Bngy Kang a, a College of Informaton Engneerng, Northwest A&F Unversty, Yanglng, Shaanx, 712100, Chna. Abstract Deng
More informationLecture 3 Stat102, Spring 2007
Lecture 3 Stat0, Sprng 007 Chapter 3. 3.: Introducton to regresson analyss Lnear regresson as a descrptve technque The least-squares equatons Chapter 3.3 Samplng dstrbuton of b 0, b. Contnued n net lecture
More informationKernel Methods and SVMs Extension
Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general
More informationAn Improved multiple fractal algorithm
Advanced Scence and Technology Letters Vol.31 (MulGraB 213), pp.184-188 http://dx.do.org/1.1427/astl.213.31.41 An Improved multple fractal algorthm Yun Ln, Xaochu Xu, Jnfeng Pang College of Informaton
More informationMidterm Examination. Regression and Forecasting Models
IOMS Department Regresson and Forecastng Models Professor Wllam Greene Phone: 22.998.0876 Offce: KMC 7-90 Home page: people.stern.nyu.edu/wgreene Emal: wgreene@stern.nyu.edu Course web page: people.stern.nyu.edu/wgreene/regresson/outlne.htm
More informationEcon107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)
I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes
More informationCalculating the Quasi-static Pressures of Confined Explosions Considering Chemical Reactions under the Constant Entropy Assumption
Appled Mechancs and Materals Onlne: 202-04-20 ISS: 662-7482, ol. 64, pp 396-400 do:0.4028/www.scentfc.net/amm.64.396 202 Trans Tech Publcatons, Swtzerland Calculatng the Quas-statc Pressures of Confned
More information2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification
E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton
More informationEvaluation for sets of classes
Evaluaton for Tet Categorzaton Classfcaton accuracy: usual n ML, the proporton of correct decsons, Not approprate f the populaton rate of the class s low Precson, Recall and F 1 Better measures 21 Evaluaton
More informationTurbulence classification of load data by the frequency and severity of wind gusts. Oscar Moñux, DEWI GmbH Kevin Bleibler, DEWI GmbH
Turbulence classfcaton of load data by the frequency and severty of wnd gusts Introducton Oscar Moñux, DEWI GmbH Kevn Blebler, DEWI GmbH Durng the wnd turbne developng process, one of the most mportant
More informationClock-Gating and Its Application to Low Power Design of Sequential Circuits
Clock-Gatng and Its Applcaton to Low Power Desgn of Sequental Crcuts ng WU Department of Electrcal Engneerng-Systems, Unversty of Southern Calforna Los Angeles, CA 989, USA, Phone: (23)74-448 Massoud PEDRAM
More informationTemperature. Chapter Heat Engine
Chapter 3 Temperature In prevous chapters of these notes we ntroduced the Prncple of Maxmum ntropy as a technque for estmatng probablty dstrbutons consstent wth constrants. In Chapter 9 we dscussed the
More informationA New Evolutionary Computation Based Approach for Learning Bayesian Network
Avalable onlne at www.scencedrect.com Proceda Engneerng 15 (2011) 4026 4030 Advanced n Control Engneerng and Informaton Scence A New Evolutonary Computaton Based Approach for Learnng Bayesan Network Yungang
More informationImprovement of Histogram Equalization for Minimum Mean Brightness Error
Proceedngs of the 7 WSEAS Int. Conference on Crcuts, Systems, Sgnal and elecommuncatons, Gold Coast, Australa, January 7-9, 7 3 Improvement of Hstogram Equalzaton for Mnmum Mean Brghtness Error AAPOG PHAHUA*,
More informationChapter 11: Simple Linear Regression and Correlation
Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests
More informationDepartment of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution
Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable
More informationA Note on Bound for Jensen-Shannon Divergence by Jeffreys
OPEN ACCESS Conference Proceedngs Paper Entropy www.scforum.net/conference/ecea- A Note on Bound for Jensen-Shannon Dvergence by Jeffreys Takuya Yamano, * Department of Mathematcs and Physcs, Faculty of
More informationDUE: WEDS FEB 21ST 2018
HOMEWORK # 1: FINITE DIFFERENCES IN ONE DIMENSION DUE: WEDS FEB 21ST 2018 1. Theory Beam bendng s a classcal engneerng analyss. The tradtonal soluton technque makes smplfyng assumptons such as a constant
More informationMDL-Based Unsupervised Attribute Ranking
MDL-Based Unsupervsed Attrbute Rankng Zdravko Markov Computer Scence Department Central Connectcut State Unversty New Brtan, CT 06050, USA http://www.cs.ccsu.edu/~markov/ markovz@ccsu.edu MDL-Based Unsupervsed
More informationSemi-supervised Classification with Active Query Selection
Sem-supervsed Classfcaton wth Actve Query Selecton Jao Wang and Swe Luo School of Computer and Informaton Technology, Beng Jaotong Unversty, Beng 00044, Chna Wangjao088@63.com Abstract. Labeled samples
More informationFREQUENCY DISTRIBUTIONS Page 1 of The idea of a frequency distribution for sets of observations will be introduced,
FREQUENCY DISTRIBUTIONS Page 1 of 6 I. Introducton 1. The dea of a frequency dstrbuton for sets of observatons wll be ntroduced, together wth some of the mechancs for constructng dstrbutons of data. Then
More informationNeryškioji dichotominių testo klausimų ir socialinių rodiklių diferencijavimo savybių klasifikacija
Neryškoj dchotomnų testo klausmų r socalnų rodklų dferencjavmo savybų klasfkacja Aleksandras KRYLOVAS, Natalja KOSAREVA, Julja KARALIŪNAITĖ Technologcal and Economc Development of Economy Receved 9 May
More informationResource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis
Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques
More informationRegulation No. 117 (Tyres rolling noise and wet grip adhesion) Proposal for amendments to ECE/TRANS/WP.29/GRB/2010/3
Transmtted by the expert from France Informal Document No. GRB-51-14 (67 th GRB, 15 17 February 2010, agenda tem 7) Regulaton No. 117 (Tyres rollng nose and wet grp adheson) Proposal for amendments to
More informationA Robust Method for Calculating the Correlation Coefficient
A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal
More informationFeature Weighting and Instance Selection for Collaborative Filtering
Proc. nd Internatonal Workshop on Management of Informaton on the Web - Web Data and Text Mnng (MIW 01) Feature Weghtng and Instance Selecton for Collaboratve Flterng Ka Yu, Zhong Wen, Xaowe Xu 1, Martn
More informationMessage modification, neutral bits and boomerangs
Message modfcaton, neutral bts and boomerangs From whch round should we start countng n SHA? Antone Joux DGA and Unversty of Versalles St-Quentn-en-Yvelnes France Jont work wth Thomas Peyrn 1 Dfferental
More informationDifference Equations
Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1
More informationLearning Theory: Lecture Notes
Learnng Theory: Lecture Notes Lecturer: Kamalka Chaudhur Scrbe: Qush Wang October 27, 2012 1 The Agnostc PAC Model Recall that one of the constrants of the PAC model s that the data dstrbuton has to be
More informationA new construction of 3-separable matrices via an improved decoding of Macula s construction
Dscrete Optmzaton 5 008 700 704 Contents lsts avalable at ScenceDrect Dscrete Optmzaton journal homepage: wwwelsevercom/locate/dsopt A new constructon of 3-separable matrces va an mproved decodng of Macula
More informationx = , so that calculated
Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to
More informationUsing Multivariate Rank Sum Tests to Evaluate Effectiveness of Computer Applications in Teaching Business Statistics
Usng Multvarate Rank Sum Tests to Evaluate Effectveness of Computer Applcatons n Teachng Busness Statstcs by Yeong-Tzay Su, Professor Department of Mathematcs Kaohsung Normal Unversty Kaohsung, TAIWAN
More informationSimulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests
Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth
More informationSupporting Information
Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to
More informationModule 14: THE INTEGRAL Exploring Calculus
Module 14: THE INTEGRAL Explorng Calculus Part I Approxmatons and the Defnte Integral It was known n the 1600s before the calculus was developed that the area of an rregularly shaped regon could be approxmated
More informationSystem in Weibull Distribution
Internatonal Matheatcal Foru 4 9 no. 9 94-95 Relablty Equvalence Factors of a Seres-Parallel Syste n Webull Dstrbuton M. A. El-Dacese Matheatcs Departent Faculty of Scence Tanta Unversty Tanta Egypt eldacese@yahoo.co
More information2016 Wiley. Study Session 2: Ethical and Professional Standards Application
6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton
More informationCHAPTER IV RESEARCH FINDING AND DISCUSSIONS
CHAPTER IV RESEARCH FINDING AND DISCUSSIONS A. Descrpton of Research Fndng. The Implementaton of Learnng Havng ganed the whole needed data, the researcher then dd analyss whch refers to the statstcal data
More informationarxiv:cs.cv/ Jun 2000
Correlaton over Decomposed Sgnals: A Non-Lnear Approach to Fast and Effectve Sequences Comparson Lucano da Fontoura Costa arxv:cs.cv/0006040 28 Jun 2000 Cybernetc Vson Research Group IFSC Unversty of São
More informationUsing the estimated penetrances to determine the range of the underlying genetic model in casecontrol
Georgetown Unversty From the SelectedWorks of Mark J Meyer 8 Usng the estmated penetrances to determne the range of the underlyng genetc model n casecontrol desgn Mark J Meyer Neal Jeffres Gang Zheng Avalable
More informationChapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems
Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons
More informationModule 2. Random Processes. Version 2 ECE IIT, Kharagpur
Module Random Processes Lesson 6 Functons of Random Varables After readng ths lesson, ou wll learn about cdf of functon of a random varable. Formula for determnng the pdf of a random varable. Let, X be
More informationCompilers. Spring term. Alfonso Ortega: Enrique Alfonseca: Chapter 4: Syntactic analysis
Complers Sprng term Alfonso Ortega: alfonso.ortega@uam.es nrque Alfonseca: enrque.alfonseca@uam.es Chapter : Syntactc analyss. Introducton. Bottom-up Analyss Syntax Analyser Concepts It analyses the context-ndependent
More informationFoundations of Arithmetic
Foundatons of Arthmetc Notaton We shall denote the sum and product of numbers n the usual notaton as a 2 + a 2 + a 3 + + a = a, a 1 a 2 a 3 a = a The notaton a b means a dvdes b,.e. ac = b where c s an
More informationBi-Relational Network Analysis Using a Fast Random Walk with Restart
B-Relatonal Network Analyss Usng a Fast Random Walk wth Restart Jng Xa, Dona Caragea and Wllam Hsu Departmet of Computng and Informaton Scences Kansas State Unversty, Manhattan, USA {xajng, dcaragea, bhsu}@ksuedu
More informationLecture 6: Introduction to Linear Regression
Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6
More informationChapter 8 Indicator Variables
Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n
More informationNote on EM-training of IBM-model 1
Note on EM-tranng of IBM-model INF58 Language Technologcal Applcatons, Fall The sldes on ths subject (nf58 6.pdf) ncludng the example seem nsuffcent to gve a good grasp of what s gong on. Hence here are
More informationOn an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1
On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool
More informationThe Quadratic Trigonometric Bézier Curve with Single Shape Parameter
J. Basc. Appl. Sc. Res., (3541-546, 01 01, TextRoad Publcaton ISSN 090-4304 Journal of Basc and Appled Scentfc Research www.textroad.com The Quadratc Trgonometrc Bézer Curve wth Sngle Shape Parameter Uzma
More informationThermodynamics and statistical mechanics in materials modelling II
Course MP3 Lecture 8/11/006 (JAE) Course MP3 Lecture 8/11/006 Thermodynamcs and statstcal mechancs n materals modellng II A bref résumé of the physcal concepts used n materals modellng Dr James Ellott.1
More informationSTATISTICS QUESTIONS. Step by Step Solutions.
STATISTICS QUESTIONS Step by Step Solutons www.mathcracker.com 9//016 Problem 1: A researcher s nterested n the effects of famly sze on delnquency for a group of offenders and examnes famles wth one to
More informationInteractive Bi-Level Multi-Objective Integer. Non-linear Programming Problem
Appled Mathematcal Scences Vol 5 0 no 65 3 33 Interactve B-Level Mult-Objectve Integer Non-lnear Programmng Problem O E Emam Department of Informaton Systems aculty of Computer Scence and nformaton Helwan
More informationThe Synchronous 8th-Order Differential Attack on 12 Rounds of the Block Cipher HyRAL
The Synchronous 8th-Order Dfferental Attack on 12 Rounds of the Block Cpher HyRAL Yasutaka Igarash, Sej Fukushma, and Tomohro Hachno Kagoshma Unversty, Kagoshma, Japan Emal: {garash, fukushma, hachno}@eee.kagoshma-u.ac.jp
More informationEntropy generation in a chemical reaction
Entropy generaton n a chemcal reacton E Mranda Área de Cencas Exactas COICET CCT Mendoza 5500 Mendoza, rgentna and Departamento de Físca Unversdad aconal de San Lus 5700 San Lus, rgentna bstract: Entropy
More informationThis column is a continuation of our previous column
Comparson of Goodness of Ft Statstcs for Lnear Regresson, Part II The authors contnue ther dscusson of the correlaton coeffcent n developng a calbraton for quanttatve analyss. Jerome Workman Jr. and Howard
More information= z 20 z n. (k 20) + 4 z k = 4
Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5
More informationDepartment of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6
Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.
More informationEvaluating ADM on a Four-Level Relevance Scale Document Set from NTCIR. (DRAFT - Not for Quotation)
Workng Notes of NTCIR-4, Tokyo, 2-4 June 24 Evaluatng on a Four-Level Relevance Scale Document Set from NTCIR (DRAFT - Not for Quotaton) Vncenzo Della Mea 1, Luca D Gaspero 2, Stefano Mzzaro 1 1 Department
More informationDO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. Introductory Econometrics 1 hour 30 minutes
25/6 Canddates Only January Examnatons 26 Student Number: Desk Number:...... DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR Department Module Code Module Ttle Exam Duraton
More informationCase A. P k = Ni ( 2L i k 1 ) + (# big cells) 10d 2 P k.
THE CELLULAR METHOD In ths lecture, we ntroduce the cellular method as an approach to ncdence geometry theorems lke the Szemeréd-Trotter theorem. The method was ntroduced n the paper Combnatoral complexty
More informationCS-433: Simulation and Modeling Modeling and Probability Review
CS-433: Smulaton and Modelng Modelng and Probablty Revew Exercse 1. (Probablty of Smple Events) Exercse 1.1 The owner of a camera shop receves a shpment of fve cameras from a camera manufacturer. Unknown
More informationTime-Varying Systems and Computations Lecture 6
Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy
More informationChapter 9: Statistical Inference and the Relationship between Two Variables
Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,
More informationMAE140 - Linear Circuits - Fall 13 Midterm, October 31
Instructons ME140 - Lnear Crcuts - Fall 13 Mdterm, October 31 () Ths exam s open book. You may use whatever wrtten materals you choose, ncludng your class notes and textbook. You may use a hand calculator
More informationBasically, if you have a dummy dependent variable you will be estimating a probability.
ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy
More informationLINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables
LINEAR REGRESSION ANALYSIS MODULE VIII Lecture - 7 Indcator Varables Dr. Shalabh Department of Maematcs and Statstcs Indan Insttute of Technology Kanpur Indcator varables versus quanttatve explanatory
More informationUncertainty in measurements of power and energy on power networks
Uncertanty n measurements of power and energy on power networks E. Manov, N. Kolev Department of Measurement and Instrumentaton, Techncal Unversty Sofa, bul. Klment Ohrdsk No8, bl., 000 Sofa, Bulgara Tel./fax:
More information