On Statistical Analysis and Optimization of Information Retrieval Effectiveness Metrics


Jun Wang and Jianhan Zhu
Department of Computer Science, University College London, UK

ABSTRACT

This paper presents a new way of thinking about IR metric optimization. It is argued that the optimal ranking problem should be factorized into two distinct yet interrelated stages: the relevance prediction stage and the ranking decision stage. During retrieval the relevance of documents is not known a priori, and the joint probability of relevance is used to measure the uncertainty about the relevance of the documents in the collection as a whole. The resulting optimization objective function in the latter stage is, thus, the expected value of the IR metric with respect to this probability measure of relevance. Through statistically analyzing the expected values of IR metrics under such uncertainty, we discover and explain some interesting properties of IR metrics that have not been known before. Our analysis and optimization framework do not assume a particular (relevance) retrieval model or metric, making them applicable to many existing IR models and metrics. The experiments on one of the resulting applications have demonstrated its significance in adapting to various IR metrics.

Categories and Subject Descriptors
H.3 [Information Storage and Retrieval]: H.3.1 Content Analysis and Indexing; H.3.3 Information Search and Retrieval

General Terms
Algorithms, Experimentation, Measurement, Performance

1. INTRODUCTION

In information retrieval modelling, the main efforts have been devoted to, for a specific information need (query), automatically scoring individual documents with respect to their relevance states. Representative examples include the Probabilistic Indexing model, which studies how likely a query term is to be assigned to a relevant document [7], and the RSJ model, which derives a scoring function on the basis of the log-ratio of the probability of relevance [], to name just a few.
And yet, given the fact that in many practical situations relevance information is not steadily available, major developments have shifted their focus to estimating text statistics in the documents and queries and then building up the link through these statistics [,, 34]. For example, scoring functions such as TF-IDF, the Vector Space Model, and the Divergence from Randomness (DFR) model [] have been developed [6]. A practical approximation of the RSJ model led to the popular BM25 scoring function []. Another direction in probabilistic modelling was to build a language model of a document and assess its likelihood of generating a given query [34]; a query language model is also covered under the Kullback-Leibler divergence based loss function [5].

Despite these efforts on the retrieval side, in the evaluation phase many IR tasks have evaluation criteria that go beyond simply counting the number of relevant documents in a ranked list. Measuring IR effectiveness by different metrics is critical because, for different retrieval goals, we need to capture different aspects of retrieval performance. In the case where the preference goes strongly towards early-retrieved documents, MRR (Mean Reciprocal Rank) is a good measure [8], whereas if we try to capture a broader summary of retrieval performance, MAP (Mean Average Precision) becomes suitable [3]. Thus, there is a gap between the underlying (ranking) decision process of retrieval models and the final evaluation criterion used to measure success in a task. Ideally, it is desirable to have retrieval systems adapt to the specific IR effectiveness metric.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGIR'10, July 19-23, 2010, Geneva, Switzerland. Copyright 2010 ACM.
In fact, IR researchers have already started to explore this opportunity. One extreme case is learning to rank: it directly constructs a document ranking model from training data, bypassing the step of estimating the relevance states of individual documents [8]. Under this paradigm, some attempts have been made to directly optimize IR metrics such as NDCG (Normalized Discounted Cumulative Gain) and MAP [3, 33]. However, it is known that some evaluation metrics are less informative than others [4]. As argued in [3], some IR metrics thus do not necessarily summarize the (training) data well; if we begin optimizing IR metrics right from the data, the statistics of the data may not be fully explored and utilized. A somewhat opposite direction is to still focus on designing a scoring function for a document, but with the acknowledgement of various retrieval goals and the final rank context. The "less is more" model proposed in [] is one such example. By treating the previously retrieved documents as non-relevant when calculating the relevance of documents for the current rank position, the algorithm is shown to be equivalent to maximizing the Reciprocal Rank measure. In [35], a more general and flexible treatment in this direction is proposed. In that framework, Bayesian decision theory is applied to incorporate various ranking strategies through predefined loss functions. Despite its generality, the resulting IR models, however, lack the ability to directly incorporate IR metrics into the ranking decision.

In this paper, we argue that regarding the retrieval task solely as either optimizing IR metrics or deriving a (relevance) scoring function presents a partial view of the underlying problem; a more unified view is to divide the retrieval process into two distinct stages, namely the relevance prediction and the ranking decision optimization stages, and solve them sequentially. In the first stage, the aim is to estimate the relevance of documents as accurately as possible, and to summarize it by the joint probability of the documents' relevance. Only in the second stage is the rank preference specified, possibly by an IR metric. The ranking decision is a stochastic one due to the uncertainty about relevance. As a result, the optimal ranking action is the one that maximizes the expected value of the IR metric. We shall show that statistical analysis of the expected values of IR metrics gives insight into the properties of the metrics. One of the findings is that AP (Average Precision) encourages documents whose relevance is positively correlated with previously retrieved documents, while RR (Reciprocal Rank) does the opposite. It follows that if a ranking achieves superior results on AP, it must pay with inferiority on RR. Apart from the theoretical contribution, our experiments on TREC data sets demonstrate the significance of our probabilistic framework.

The remainder of the paper is organized as follows. We first establish our optimization scheme, and study the major expected IR metrics and practical issues. We then provide an empirical evaluation, and finally conclude our work.

Figure 1: The two distinct stages in the statistical document ranking process.

2. STATISTICAL RANKING MECHANICS

In this section, we present the framework of optimizing IR metrics in the situation where the relevance of documents is unknown. To keep our discussion simple, we consider binary relevance, while the treatment can be extended to graded relevance similarly. Given an information need, let us assume each document in the corpus is either relevant or non-relevant. We denote the relevance states jointly as a vector r = (r_1, ..., r_k, ..., r_N) in {0,1}^N, where k in {1, ..., N} and N denotes the number of documents; r_k = 1 if document k is relevant, and otherwise r_k = 0.
Our view is the following: first, the IR model should focus on estimating the relevance of documents. The relevance in this stage is the true topical relevance [8], different from the user-perceived relevance that will be qualified in the next stage. In statistical modelling, we assign to every possible relevance state r a number p(r|q), which we interpret as the probability that a user who issues query q will find the documents' relevance states to be r. Given the observations so far (the query, the user's interaction, etc.), the posterior probability p(r|q) represents our (or the IR model's) belief about the relevance states of the documents in the collection as a whole. Note that we use the joint distribution of relevance instead of the marginal distribution p(r_k|q) in order to cover the dependency of relevance among documents. It is argued that only in the second stage does the retrieval model make a ranking decision under the uncertainty specified by the joint probability of relevance. To formulate this, we follow the terminology in natural language processing [6]; a ranking order is represented by a vector a = (a_1, ..., a_i, ..., a_N), where a_i in {1, ..., N}. If document k is in rank position i, then a_i = k. The retrieval task is, thus, to find an optimal rank order a that maximizes a certain retrieval objective. Formally, an IR metric (measure) m(a|r) is defined as a score function of a given r. A good metric should be able to measure the user's gain or utility of a rank order a when the true relevance states of all the documents, r, are known. m(a|r) can also be seen as a measure of the user's perceived relevance in the context of a ranked list. For example, Precision favors a solution that finds as many relevant documents as possible in the list regardless of their order, while Reciprocal Rank (the inverse of the rank of the first relevant document retrieved) makes sure the first relevant document is retrieved as early as possible, regardless of the rank positions of the remaining relevant documents.
Given the fact that different IR effectiveness metrics are useful for capturing different aspects of retrieval quality, it is desirable to optimize a with respect to the specific metric m. Bayesian decision theory suggests that the optimal rank order â is obtained by maximizing the expected IR metric:

  â = argmax_a E_r[m|q] = argmax_a Σ_{r in {0,1}^N} m(a|r) p(r|q),   (1)

where E[·|q] denotes an expectation with respect to the conditional distribution p(·|q), and the subscript r indicates that the average is taken over all possible r. Eq. (1) shows that: first, the true relevance state of the documents, r, is generated from the probability p(r|q) estimated by an IR model. Under the relevance state r, the score of a given rank order a is calculated. E_r[m|q], the expected score of the rank order, is obtained by averaging over all possible relevance states r. Finally, the optimal rank order is chosen by maximizing E_r[m|q]. Although the formulation can be thought of as a special instantiation of the general retrieval decision framework in [5, 35], our underlying idea and development are quite different from their instantiated models. The advantage is that, as illustrated in Figure 1, in our framework the IR metric (utility) relies only on the true relevance and the ranking order, while (relevance) IR models are for estimating the relevance. Decoupling them is essential for directly taking any retrieval metric and plugging it into the optimization procedure. More discussion can be found in Section 4. To solve Eq. (1), we analyze the expected IR metrics E_r[m|q] in Section 2.1 and present a practical implementation and maximization (search) method in Section 2.2.

2.1 Analysis of Expected IR Metrics

2.1.1 Expected Average Precision

Average Precision (AP) is a widely-adopted metric. For each query, it is the average of the precision scores obtained at the rank positions where each relevant document is retrieved; relevant documents that are not retrieved receive a precision score of zero [7]. The metric, in fact, is the area under the Precision-Recall curve, capturing a broad summary of retrieval performance with a single value [4].
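Before the analysis of individual metrics, Eq. (1) can be made concrete with a brute-force sketch (my own illustration, not the authors' implementation; the helper names are hypothetical). Both the sum over relevance states and the search over rankings are exponential, so this is only feasible for toy collections:

```python
import itertools

def average_precision(ranking, rel):
    """AP of a ranked list of document ids, given a binary relevance vector."""
    n_rel = sum(rel)
    if n_rel == 0:
        return 0.0
    hits, total = 0, 0.0
    for i, doc in enumerate(ranking, start=1):
        if rel[doc]:
            hits += 1
            total += hits / i          # precision at each relevant position
    return total / n_rel

def expected_metric(ranking, joint, metric):
    """E_r[m(a|r)]: average the metric over relevance states r ~ p(r|q)."""
    return sum(p * metric(ranking, rel) for rel, p in joint.items())

def optimal_ranking(n_docs, joint, metric):
    """Eq. (1): argmax over all rankings of the expected metric (toy sizes only)."""
    return max(itertools.permutations(range(n_docs)),
               key=lambda a: expected_metric(a, joint, metric))
```

For two documents whose joint distribution gives document 1 the higher marginal probability of relevance, the optimal expected-AP ranking places document 1 first, as one would expect.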
By definition, the Average Precision measure is as follows:

  m_A(a|r) = (1/N_R) Σ_{i=1}^{M} (r_{a_i}/i) (1 + Σ_{j=1}^{i-1} r_{a_j}),   (2)

where M ≤ N (and Σ_{j=1}^{i-1} r_{a_j} = 0 when i = 1). N_R is the number of relevant documents, and its expected value equals Σ_{i=1}^{N} p(r_{a_i} = 1), the summation of the marginal probabilities of relevance. For simplicity, we write p(r_{a_i} = 1) as p(r_{a_i}) in the remainder of the paper. Because r is hidden during retrieval, m_A(a|r) cannot be calculated exactly. Instead, its expected value under the joint probability of relevance is derived by making use of the properties of expectation. (Throughout this paper the expectation is always conditioned on a given query q and taken with respect to r; for simplicity, we drop the subscript r and the notation q in E[·] from now on.)

  E[m_A] = Σ_{N_R} p(N_R|q) E[m_A | N_R]   (3)

Figure 2: (a) The adaptive weight w_i^A of the expected Average Precision, (b) the adaptive weight w_i^R of the expected Reciprocal Rank, and (c) a comparison of the weights in different expected IR metrics.

  = Σ_{N_R} p(N_R|q) (1/N_R) Σ_{i=1}^{M} (1/i) ( E[r_{a_i}|N_R] + Σ_{j=1}^{i-1} E[r_{a_i} r_{a_j}|N_R] )

  = Σ_{N_R} p(N_R|q) (1/N_R) Σ_{i=1}^{M} (1/i) ( E[r_{a_i}|N_R] + Σ_{j=1}^{i-1} ( Cov(r_{a_i}, r_{a_j}) + E[r_{a_i}] E[r_{a_j}] ) ),

where Cov(r_{a_i}, r_{a_j}) denotes the covariance between the relevance values of the documents at ranks i and j, given that the number of relevant documents is N_R. Eq. (3) shows that the expected AP can be interpreted as follows: for the given query, an IR model first estimates the number of relevant documents in the collection, and then estimates the expected AP for that number of relevant documents. The final expected measure is the average, weighted by p(N_R|q), across all possible numbers of relevant documents. We can obtain more insight into the expected AP by making a simple approximation to the average over N_R. By assuming that the posterior distribution of N_R is sharply peaked around the most probable value (the mode) N̂_R, we can use the mode to approximate the average [5]. This gives:

  E_r[m_A] ≈ (1/N̂_R) Σ_{i=1}^{M} ( w_i^A p(r_{a_i}) + (1/i) Σ_{j=1}^{i-1} Cov(r_{a_i}, r_{a_j}) ),   (4)

where E[r_{a_i}] = Σ_{r_{a_i}} r_{a_i} p(r_{a_i}) = p(r_{a_i}) is the marginal probability of the relevance of the document at rank i. Note that the equation removes the dependency on N̂_R because the conditional expectation and covariance are well approximated by the unconditional ones when p(N̂_R|q) ≈ 1. To simplify the equation, we also define w_i^A = (1/i)(1 + Σ_{j=1}^{i-1} p(r_{a_j})), which is regarded as an adaptive weight of rank i. The first term in this simple approximation indicates that the expected AP is a weighted average of the scores across all rank positions, and as we increase the marginal probability of relevance p(r_{a_i}) in the ranked list, the expected AP increases.
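As a sketch (my own illustration, not the authors' code), the approximation in Eq. (4) can be evaluated directly from estimated marginals and covariances; here the mode N̂_R is approximated by the sum of the marginals, a simplification relative to the text:

```python
def expected_ap(p, cov):
    """Approximate expected AP, Eq. (4).

    p[i]     : marginal probability of relevance at rank i+1 (rank order).
    cov[i][j]: covariance of relevance between ranks i+1 and j+1.
    """
    n_rel = sum(p)  # expected number of relevant docs, standing in for the mode
    total = 0.0
    for i in range(len(p)):
        w = (1.0 + sum(p[:i])) / (i + 1)          # adaptive weight w_i^A
        c = sum(cov[i][j] for j in range(i))      # sum_{j<i} Cov(r_i, r_j)
        total += w * p[i] + c / (i + 1)
    return total / n_rel
```

Raising any covariance with an earlier rank raises the value, which is exactly the positive-correlation behaviour of the second term in Eq. (4).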
Furthermore, the weight ratio is:

  w_{i+1}^A / w_i^A = [i/(i+1)] · (1 + p(r_{a_i}) + Σ_{j=1}^{i-1} p(r_{a_j})) / (1 + Σ_{j=1}^{i-1} p(r_{a_j})).   (5)

The ratio is bounded below by i/(i+1) and is adaptive to the expected relevance received so far (defined as Σ_{j=1}^{i-1} p(r_{a_j})). To gain insight into it, we approximate the weight by setting every p(r_{a_i}) equal to a common value p(r). We plot the weight ratio against the marginal probability p(r) and the rank position in Figure 2(a). It illustrates that when we have more confidence about the relevance of the early-retrieved documents (p(r) approaches one), the weight ratio approaches one. As a result, the metric worries less about the early-retrieved documents, thus putting nearly equal weights on the later-retrieved documents. This is similar to the Precision metric. But once less confident documents (p(r) approaches zero) are retrieved, particularly in the top rank positions, the weight ratio approaches its lower bound. As a consequence, the weight penalizes later-retrieved relevant documents more, and the weight ratio of the expected AP behaves more like that of the expected DCG, which will be discussed later.

The second term in Eq. (4) indicates that a document will contribute more to the expected AP if its relevance is more positively correlated with those of the previously retrieved documents. The consequence is that positively correlated documents are pushed up the ranked list. This is an interesting finding because it shows that the expected AP is in fact nonlinear: it models the dependencies between documents' relevance and incorporates them in deciding the preferred rank order. The rationale for encouraging positively correlated relevant documents is that if a document is relevant, it is likely that its positively correlated documents are also relevant. This theoretically explains why pseudo relevance feedback helps improve MAP: the top-ranked documents are generally likely to be relevant, and finding other documents similar to these top-ranked ones improves the metric [4].

2.1.2 Expected DCG and Precision

Discounted Cumulative Gain (DCG) is another popular measure of ranking effectiveness, especially in web search.
DCG measures the usefulness, or gain, of a document based on its (graded) relevance [4] (for the moment, let us consider r_{a_i} to cover graded relevance too); the gain is accumulated from the top of the result list to the bottom. To penalize late-retrieved relevant documents, the gain of each result is discounted by a function of its rank position. By definition, we have the DCG measure:

  m_D(a|r) = Σ_{i=1}^{M} w_i^D g(r_{a_i}),   (6)

where w_i^D is the discount weight for rank position i, and g(r_{a_i}) is a gain function mapping the relevance value to the retrieval gain. Unlike the expected AP, the expected DCG is linear with respect to rank positions. We thus have:

  E_r[m_D] = Σ_{i=1}^{M} w_i^D E[g(r_{a_i})].   (7)

Since g(r_{a_i}) is infinitely differentiable in the neighborhood

of the mean of r_{a_i}, i.e., r̂_{a_i} = E[r_{a_i}], the mean of g(r_{a_i}) can be represented by a Taylor series:

  E[g(r_{a_i})] = g(r̂_{a_i}) + E[(r_{a_i} − r̂_{a_i})] g'(r̂_{a_i}) + (1/2) E[(r_{a_i} − r̂_{a_i})^2] g''(r̂_{a_i}) + ...
              = g(r̂_{a_i}) + (1/2) Var(r_{a_i}) g''(r̂_{a_i}) + ...
              ≈ g(r̂_{a_i}) + (1/2) Var(r_{a_i}) g''(r̂_{a_i}).   (8)

The expected DCG is thus approximated by:

  E_r[m_D] ≈ Σ_{i=1}^{M} w_i^D ( g(r̂_{a_i}) + (1/2) g''(r̂_{a_i}) Var(r_{a_i}) ),   (9)

where Var(r_{a_i}) denotes the variance of r_{a_i}. Eq. (9) shows that the expected value of DCG is determined by both the mean and the variance of the relevance of the documents at rank positions 1 to M. Whether the variance is added or subtracted depends on the sign of the second derivative of the gain function. In the case of graded relevance, if we consider highly relevant documents more valuable than marginally relevant documents and give them more gain, we can use a gain function like g(r_{a_i}) = 2^{r_{a_i}} − 1; in this case, the variance is added. It follows that when w_1^D > w_2^D > ... > w_M^D, the document with the highest score of g(r̂_{a_i}) + (1/2) g''(r̂_{a_i}) Var(r_{a_i}) is retrieved first, the document with the next highest score is retrieved second, and so on. It is common to define w_i^D = 1/log_2(1+i). Compared to the adaptive weight in the expected AP, it penalizes late-retrieved relevant documents more; Figure 2(c) compares their weight ratios. Precision at M is a special case of DCG, where the discount is a constant and the gain function is linear. Thus, the expected Precision measure is:

  E[m_P] = (1/M) Σ_{i=1}^{M} E[r_{a_i}] = (1/M) Σ_{i=1}^{M} p(r_{a_i}).   (10)

2.1.3 Expected Reciprocal Rank

In cases like web search and question answering tasks, we quite often expect a relevant document to be retrieved as early as possible [, 8]. Expected Search Length and Reciprocal Rank (RR) are strongly biased towards early-retrieved documents. This section analyzes RR, while Expected Search Length can be treated similarly. RR is the inverse of the rank of the first relevant document and is bounded between 0 and 1.
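The derivation that follows makes the expectation of RR precise. As a standalone sketch (my own illustration, not the authors' code), note that if the relevance states at different ranks are treated as independent, the probability that the first relevant document appears exactly at rank i is p_i times the product of the non-relevance probabilities above it, which gives a simple closed form:

```python
def expected_rr(p):
    """Expected Reciprocal Rank assuming independent relevance across ranks:
    E[RR] = sum_i (1/i) * p[i] * prod_{j<i} (1 - p[j])."""
    total, none_above = 0.0, 1.0
    for i, pi in enumerate(p, start=1):
        total += none_above * pi / i
        none_above *= (1.0 - pi)  # prob. that all docs up to rank i are non-relevant
    return total
```

The running product is the strong discount discussed below: once a high-probability document has been placed, everything after it contributes very little.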
It is formally defined as:

  m_R(a|r) = r_{a_1} + (1/2) r_{a_2} (1 − r_{a_1}) + (1/3) r_{a_3} (1 − r_{a_1})(1 − r_{a_2}) + ...
           = Σ_{i=1}^{N} (1/i) r_{a_i} Π_{j=1}^{i-1} (1 − r_{a_j}) = Σ_{i=1}^{N} (1/i) v_i r_{a_i},   (11)

where we define v_i = Π_{j=1}^{i-1} (1 − r_{a_j}), a function of the relevance values of the documents ranked above i (v_1 = 1 when i = 1). Conceptually, the RR measure can be thought of as a weighted average of relevance values at different rank positions, where the weights are adaptive to the earlier-retrieved documents. The expected value of the RR measure is the following:

  E[m_R] = E[ Σ_{i=1}^{M} (1/i) v_i r_{a_i} ] = Σ_{i=1}^{M} (1/i) E[v_i r_{a_i}]
         = Σ_{i=1}^{M} (1/i) ( E[v_i] E[r_{a_i}] + Cov(r_{a_i}, v_i) )
         = Σ_{i=1}^{M} (1/i) ( w_i^R p(r_{a_i}) + Cov(r_{a_i}, v_i) ),   (12)

where, similarly, we consider E[v_i] an adaptive weight and denote it w_i^R. It can be approximated by assuming that the irrelevance of the documents above rank i is independent, i.e., w_i^R = E[v_i] ≈ Π_{j=1}^{i-1} (1 − p(r_{a_j})). Thus w_i^R > w_{i+1}^R. On the one hand, similarly to the expected DCG, the weight w_i^R is a discount factor penalizing late-retrieved relevant documents. As a result, maximizing the measure tends to push documents with a high marginal probability of relevance to the top. However, the penalty is much larger than those in the expected DCG and the expected AP. To see this, let us again approximate the weight by setting p(r_{a_i}) ≈ p(r). The weight ratio is compared with those of the expected AP and the expected DCG in Figure 2(c): the expected RR has the smallest weight ratio, the expected AP the largest, and the expected DCG lies in the middle. On the other hand, the weight is updated in a completely different way compared to the expected AP. Figure 2(b) plots the weight ratio against the marginal probability p(r) and the rank position. Differently from the expected AP, the weight ratio of the expected RR becomes smaller when p(r) is larger, reinforcing the discount further. As a consequence, the metric focuses entirely on the quality of a few early-retrieved documents. For example, the upper bound of w_3^R is 0.25 if we consider p(r_{a_i}) > 0.5 for i in {1, 2, 3}, while for DCG the corresponding weight usually equals 1/log_2 4 = 0.5, and for the expected AP it is even larger. The covariance part in Eq.
(12) shows that, overall, the expected value of RR increases when the relevance of a document is more positively correlated with v_i, the product of the non-relevances (1 − r_{a_j}) of the documents above it. The effect is that negatively correlated documents will have a higher expected RR than positively correlated documents, an effect discounted by a factor 1/i at rank i. This is an entirely opposite preference compared to the expected AP. To see this, suppose we have two documents to rank:

  E[m_R] = E[r_{a_1}] + (1/2) E[r_{a_2} (1 − r_{a_1})]
         = p(r_{a_1}) + (1/2) ( p(r_{a_2}) − E[r_{a_1} r_{a_2}] )
         = p(r_{a_1}) + (1/2) ( p(r_{a_2}) − Cov(r_{a_1}, r_{a_2}) − p(r_{a_1}) p(r_{a_2}) )
         = p(r_{a_1}) + (1/2) w_2^R p(r_{a_2}) − (1/2) Cov(r_{a_1}, r_{a_2}),   (13)

where w_2^R = 1 − p(r_{a_1}). It shows that a negatively correlated document has a higher expected RR, confirming the findings in [, 9] that the RR metric is optimized by diversifying the ranked list of documents.

2.1.4 A General View

Through our analysis, it can be seen that the expected IR metrics roughly have two components. A unified definition is given as follows:

  E[m(a|r)] = Σ_{i=1}^{M} ( W_i p(r_{a_i}) + V_i(r_{a_1}, ..., r_{a_i}) ),   (14)

where W_i is the discount weight at position i, and V_i is a

function defining the correlation between documents. The specific definitions with respect to the different metrics are summarized in Table 1. Notice that for DCG, in the case of binary relevance, g(r_{a_i}) = 2^{r_{a_i}} − 1 can be approximated as a linear function, and the variance part in Eq. (9) vanishes.

Table 1: A unified view of expected IR metrics.

             Definition m(a|r)                                 W_i                                    V_i(r_{a_1}, ..., r_{a_i})
Precision:   (1/M) Σ_i r_{a_i}                                 1/M                                    0
DCG:         Σ_i r_{a_i} / log_2(1+i)                          1/log_2(1+i)                           0
AP:          (1/N_R) Σ_i (r_{a_i}/i)(1 + Σ_{j<i} r_{a_j})      (1/(N̂_R i))(1 + Σ_{j<i} p(r_{a_j}))    (1/(N̂_R i)) Σ_{j<i} Cov(r_{a_i}, r_{a_j})
RR:          Σ_i (r_{a_i}/i) Π_{j<i} (1 − r_{a_j})             (1/i) Π_{j<i} (1 − p(r_{a_j}))         (1/i) Cov(r_{a_i}, Π_{j<i} (1 − r_{a_j}))

The first component is linear with respect to the marginal probability p(r_{a_i}). Strictly speaking, this is not exactly true, as W_i is adaptive to the previously retrieved documents; but since the weight ratio W_{i+1}/W_i is usually smaller than one, the maximum value of the first component is still achieved by ranking in decreasing order of the marginal probability of relevance. This is identical to what the Probability Ranking Principle suggests [9]. We call it the general ranking preference. The second component is what makes the IR metrics differ from each other; we call it the specific ranking preference. A more detailed discussion and comparison is presented in Section 3.1 through a simulation.

2.2 Practical Considerations

Stack Search: Maximizing Eq. (14) is a non-trivial task because it requires searching over all possible ranking combinations. We use a stack search similar to [3], which keeps a list of the best n ranking combinations seen so far as candidates. These candidates are incomplete solutions up to rank i. The search then iteratively expands each of the best partial solutions by adding a document at rank i+1. For each candidate, we select the top n documents that give the maximum increases of the expected IR metric in Eq. (14). We then put all the resulting partial solutions (in this case, n × n of them) onto the stack and trim the resulting list of partial solutions back to the top n candidates. We repeat the loop until the end of the ranked list is reached. The solution is the one having the maximum value among the candidate solutions.
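The stack search just described can be sketched as a small beam search (my own illustration, not the authors' code; `score_gain` is a hypothetical callback returning the increase of the chosen expected metric when a document is appended to a partial ranking):

```python
import heapq

def stack_search(docs, score_gain, n=3):
    """Beam ('stack') search: keep the n best partial rankings, extend each
    with its n best next documents, trim back to the top n, and repeat."""
    beams = [(0.0, ())]  # (expected metric value so far, partial ranking)
    for _ in range(len(docs)):
        candidates = []
        for value, prefix in beams:
            remaining = [d for d in docs if d not in prefix]
            best_next = sorted(remaining,
                               key=lambda d: score_gain(prefix, d),
                               reverse=True)[:n]
            for d in best_next:
                candidates.append((value + score_gain(prefix, d), prefix + (d,)))
        beams = heapq.nlargest(n, candidates)  # trim the stack to the n best
    return max(beams)[1]
```

With n = 1 this reduces to the greedy approach; a DCG-style gain could be, for instance, `lambda prefix, d: p[d] / math.log2(len(prefix) + 2)` for estimated marginals `p`.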
Such a sequential update does not necessarily provide a globally optimal solution, but it provides an excellent trade-off between accuracy and efficiency through the adjustment of n. When n is 1, it reduces to the greedy approach; when we increase n, better solutions may be found at the expense of more computational cost. For details refer to [3].

IR Model Calibration: To calculate the expected IR metrics during retrieval, we need to estimate the joint probability of relevance. An obvious solution is to estimate it directly from (training) data []. Relevance information is, however, not steadily available in many practical situations, making it hard to build a robust relevance model. In this paper, we instead conduct an indirect estimation using existing IR models. It has been observed in many text retrieval experiments that the calculated ranking scores can serve as robust indicators of documents' relevance with respect to queries. Thus, a mapping function can be developed from the ranking scores to the probability of relevance. Similar to [9], the joint probability of relevance p(r|q) is summarized by the marginal probabilities p(r_{a_i}|q) and the covariances Cov[r_{a_i}, r_{a_j}]. Let us first look at p(r_{a_i}|q), and treat it as the utility of the ranking score. We expect the utility, defined as u, to be a non-decreasing function of the ranking score; thus the first derivative u' > 0. It is also expected that u approaches a maximum value as the ranking score increases; thus the second derivative u'' < 0. Our experiment on TREC data (Section 3.2) has confirmed this intuition. Applying an exponential utility function (u' > 0 and u'' < 0) [] gives the mapping function:

  p(r_{a_i}|q) ≈ u(s) = 1 − e^{−bs},   (15)

where u(s), in the range [0, 1), is the utility of the ranking score s, with s ≥ 0, and b denotes a constant. For the empirical study of the mapping, we refer to Section 3.2.

Figure 3: The gain in performance for Average Precision, DCG, and RR, respectively, as the correlation between documents is adjusted from negative to positive values.
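As a minimal sketch of the calibration in Eq. (15) (my own illustration; b = 9 anticipates the value fixed in the experiments below):

```python
import math

def score_to_prob(score, b=9.0):
    """Map a non-negative, query-length-normalised ranking score to a marginal
    probability of relevance via the exponential utility u(s) = 1 - exp(-b*s)."""
    return 1.0 - math.exp(-b * score)
```

The function is increasing (u' > 0) and concave (u'' < 0), matching the two requirements stated above, and its value stays in [0, 1) as a probability should.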
The next question is how to estimate the covariance:

  Cov[r_{a_i}, r_{a_j}] = ρ(r_{a_i}, r_{a_j}) sqrt( Var[r_{a_i}] Var[r_{a_j}] ),   (16)

where Var[r_{a_i}] = (1 − p(r_{a_i})) p(r_{a_i}), since r_{a_i} follows a Bernoulli distribution. The correlation coefficient ρ(r_{a_i}, r_{a_j}) models the dependency of relevance between the documents at ranks i and j. During retrieval, it is reasonable to use the correlation between the documents' scores to estimate the relevance correlation, i.e., ρ(r_{a_i}, r_{a_j}) ≈ ρ(s_{a_i}, s_{a_j}). Strictly speaking, the score correlation is query-dependent. A practical solution is, however, to approximate it by sampling queries and calculating the correlation between the documents' ranking scores from an IR model. In our implementation, we construct each of these queries by randomly sampling query terms from the vocabulary of the data set. For the expected RR, we need to compute the covariance between document a_i and the variable v_i, where v_i = Π_{j=1}^{i-1} (1 − r_{a_j}) is the meta-relevance of the previously retrieved documents, as defined in Section 2.1.3. In our implementation, we aggregate the content of the top i−1 documents into a meta-document, and estimate the correlation between r_{a_i} and v_i as minus the correlation between the meta-document's ranking score and document a_i's ranking score.

3. EXPERIMENTS

3.1 Simulation

In this section, we carried out a simulation as a confirmation of our analysis of the effect of the correlation between different documents' relevance on a range of IR metrics. The relevance states of documents were generated over repeated trials. In each trial, for each rank position, we kept

the marginal probability of relevance p(r_{a_i}|q) unchanged and generated the relevance/non-relevance states of the document. The samples were then randomly perturbed so that the correlation between each pair of variables increases from negative to positive (the x-axis in Figure 3). For each sample in each trial we calculated the value of an IR metric, and then averaged the metric values across all the trials. We used the value of the IR metric at zero correlation as the basis for calculating the gain on the metric when the correlation changes. The results for AP, DCG, and RR are shown in Figure 3. They confirm our derivation for the expected DCG, which is insensitive to correlation. The AP value increases when the correlation increases, whereas RR does the opposite. We tried different settings, such as the number of documents and the marginals, and obtained findings similar to those reported above. Previous empirical studies on TREC data have found that one cannot optimize both the RR and AP metrics at the same time [4, 9]. The analytical forms and the simulation provide direct evidence: the AP metric encourages positively correlated documents whereas the RR metric encourages the opposite.

3.2 IR Model Calibration

Figure 4: The probability that a result from each bin is relevant, plotted against the median score of each bin.

In this section, TREC data is used to gain insight into what the mapping function u looks like. Similar to the experimental setup in [], we measured the utility of ranking scores by the probability that documents with the given ranking scores are judged relevant. Documents were binned based on their ranking scores for analysis; we estimated the probability that a randomly picked document from each bin is judged relevant. More specifically, we ran the Jelinek-Mercer smoothing language model on the 249 topics of the TREC 2004 Robust Track with the smoothing parameter λ set to its typical value [34].
The top 100 documents were returned for each topic, and there were in total 24,660 results returned for these 249 queries; the track's judgments contain 17,412 relevant documents in total. The queries contain different numbers of terms; to make the ranking scores comparable across queries, we normalized the ranking scores of all results of each query by dividing them by the number of terms in the query. We sorted the 24,660 results in descending order of their scores, and divided this ranked list into bins of 1,500 results each, yielding 17 bins: the first 16 bins containing 1,500 results each, and the last bin containing the 660 documents with the lowest scores. We selected the median score in each bin to represent the bin. In Figure 4, the utility of each bin, i.e., the probability that a randomly chosen result from the bin is relevant, is estimated as the number of relevant documents in the bin divided by the bin size. The data points are based on the pairs of the median score of each bin and the probability of relevance, and the data points are connected by smoothed curves.

Table 2: Overview of the six TREC collections.

Name          Description                 Size      # Docs      Topics
TREC8         TREC disks 4&5 minus CR     1.86 GB   528,155     401-450
Robust2004    TREC disks 4&5 minus CR     1.86 GB   528,155     301-450 and 601-700 minus 672
Robust2004    TREC disks 4&5 minus CR     1.86 GB   528,155     50 difficult topics
  Hard
WT10g         TREC Web collection         10 GB     1,692,096   451-550
CSIRO         CSIRO crawl                 4.2 GB    370,715     1-50 minus 8 unjudged topics
.GOV          crawl of the .gov domain    18 GB     1,247,753   topic distillation topics

Figure 4 confirms our intuition that the mapping function is approximately a concave curve (u' > 0 and u'' < 0), and fitting Eq. (15) to the data in Figure 4 gives a fitted value of b. Our experiments showed that the performance of our approach is robust with respect to the choice of b: values in a wide range around this fit result in negligible changes in performance on all the test collections. For the remaining experiments, we fix the parameter b at 9, while bearing in mind that tuning it on training data might offer further performance improvements.
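The binning procedure above can be sketched as follows (my own illustration, not the authors' code; it assumes a hypothetical data layout of parallel lists of normalised scores and binary judgments):

```python
def bin_utilities(scores, judgments, bin_size=1500):
    """Sort results by score (descending), cut into fixed-size bins, and return
    (median score, fraction judged relevant) per bin, as in Figure 4."""
    ranked = sorted(zip(scores, judgments), key=lambda x: -x[0])
    points = []
    for start in range(0, len(ranked), bin_size):
        bin_ = ranked[start:start + bin_size]
        median_score = bin_[len(bin_) // 2][0]              # bin representative
        relevant_fraction = sum(j for _, j in bin_) / len(bin_)  # empirical utility
        points.append((median_score, relevant_fraction))
    return points
```

Fitting Eq. (15) to the returned points (e.g. by least squares over b) then yields the calibration constant.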
3.3 Performance

We continued our empirical study of the proposed probabilistic retrieval framework, focusing on understanding its ability to optimize IR metrics. The Dirichlet and Jelinek-Mercer smoothing language models were chosen as the two baseline IR models, since they are frequently reported to perform well on TREC test collections [34]. For each query, the ranking score of each document, calculated by either of the two IR models, is normalized by dividing it by the number of terms in the query; it is then used as the input for estimating the marginal probabilities and covariances on the basis of the discussion in Section 2.2. The stack search is then applied to find an optimal ranking list that maximizes a given IR metric in Eq. (14). For the stack search, we simply set n = 1, i.e., equivalent to a greedy approach, while leaving this line of research to future work. Standard stemming and stopword removal were carried out for both queries and documents. The smoothing parameters of the language models were tuned for the optimal performance on a metric on each data set. The results are reported on six TREC test collections, described in Table 2. TREC8, Robust 2004, and the Robust 2004 hard topics are three plain-text collections, while the TREC ad hoc task on WT10g data, the TREC 2007 enterprise track document search task on CSIRO data, and the TREC topic distillation task on .GOV data are three Web collections. The results in Table 3 indicate that when we choose a certain IR metric to maximize, in most cases we obtain better performance on that metric than by optimizing other metrics or using the baselines. More specifically, our approach always had the best performance with respect to MAP and MRR when the objective was to maximize the expected AP and RR, respectively. When we aimed to optimize the expected DCG, our approach improved the baseline in terms of NDCG in 8 out of 12 cases. It is worth mentioning that no parameter tuning was needed when optimizing the metrics; without any tuning, our approach consistently outperformed the two baseline models, and eight improvements are statistically significant.
Recall from our earlier analysis that the expected AP and RR have rather opposite rank preferences (utilities): the expected AP favors a document whose relevance is positively correlated with those of the documents ranked above it, whereas the expected RR suggests the opposite. Table 3 demonstrates that optimization of the expected RR always leads to better performance on MRR than optimization

of the expected AP, and vice versa.

Table 3: Performance on MAP, NDCG and MRR when the objective is to optimize AP, DCG, and RR, respectively. We used the Dirichlet and Jelinek-Mercer smoothing language models, whose smoothing parameters were tuned for the optimal performance of a metric on each data set, as the baselines in optimization. The highest performance is highlighted in bold. A Wilcoxon signed-rank test (p < .05) was conducted, and statistically significant improvements over the baselines are marked.
[Table body omitted: for each of the six collections, rows compare each baseline with Maximize AP, Maximize DCG, and Maximize RR on MAP, NDCG, and MRR; the numeric entries are garbled in this transcription.]

The result supports our theoretical finding that RR and AP are two different types of metric, and optimizing either of them cannot lead to optimal performance on the other. Table 3 also shows that optimization of AP can sometimes lead to better performance on NDCG than direct optimization of DCG. A similar finding appeared in the learning-to-rank paradigm, where it was argued that this is because MAP is more informative than DCG [32]. Yet we think that the informativeness explanation, although true in learning to rank, does not necessarily hold in our probabilistic framework, since we do not use IR metrics to summarize the training data. Our belief is supported by the results from the simulation in Section 3
that the expected DCG is invariant to changes of the relevance correlation between documents; as a result, optimizing AP (which promotes documents whose relevance is positively correlated with that of previously ranked documents) should not do any better than directly optimizing DCG for the NDCG metric. We thus believe that the somewhat contradictory finding on the real data sets may be attributed to the estimation of the joint probability of relevance, more specifically the relevance correlation, given that we used textual content to infer relevance. As the cluster hypothesis suggests that relevant documents tend to be similar to each other and form clusters [25], a document is likely to be relevant if it is similar to relevant documents. As a result, the expected AP is biased towards putting documents that are similar to each other in the top rank positions. When the top-ranked documents are relevant, these other documents are also likely to be relevant; their marginal probabilities of relevance might be higher than estimated. As a result, metrics such as NDCG and Precision are improved. Finally, we provide a further account of RR and AP, the two differently behaving metrics. Recall the figure in which the properties of the expected RR and AP were depicted by adjusting the weight functions w_A and w_R using a single parameter p(r). Figure 5 uses the TREC8 test collection to further show the effect of p(r) on the resulting MRR and MAP performance. For comparison, the performance of the baseline Dirichlet smoothing language model, and of the exact optimization of RR, MAP and DCG, is also plotted.

Figure 5: MRR vs. MAP.

It shows that adjusting p(r) to approximate AP is very stable, since the solution stays roughly the same for all eight values of p(r). This can be explained by the fact that the weight ratio between w_A^+ and w_A^- saturates at 1 for all values of p(r) once i increases above 4. By contrast, the RR approximation is more volatile with respect to p(r): as p(r) increases from .1 to .5, the MRR performance increases whereas the MAP performance decreases.
This is due to the fact that as p(r) decreases, the weight ratio of RR becomes similar to those of DCG and AP. Thus p(r) can be used to trade off between performance on MAP and MRR. When p(r) = .3, the performance on MRR even slightly exceeds that of the exact optimization of RR. This suggests that there might still be scope to improve our stack search algorithm by setting n higher than 1.

4. LINKS TO OTHER WORK
To complement the earlier discussion, we continue with related work here. In the learning-to-rank paradigm, optimizing IR metrics is conducted in a discriminative manner, where Support Vector Machines or Neural Networks are commonly used [23, 33]. By contrast, we study the problem in a probabilistic framework where the intention is to combine both the generative and discriminative processes. Our formulation of optimal ranking also fundamentally departs from the idea in [26], where a probability distribution over document permutations (ranks) is defined and the expectation of IR metrics is taken under that distribution. We believe, however, that the expectation of IR metrics should be taken with respect to a distribution of relevance, because the uncertainty comes solely from the fact that we cannot know the relevance of documents with absolute certainty. For the purpose of evaluation, the estimation of IR metrics, particularly MAP, has been investigated in the past.

For example, to reduce the variability of test collections, a normalization technique was introduced [11]; to deal with incomplete judgements, sampling approaches were proposed [3, 31]. Empirically, their error rates were measured [7], and the uncertainty arising from the variability of relevance judgments in TREC was also examined [27]. By contrast, our study is for the purpose of retrieval, and thus IR metric estimation and optimization are explored in a completely different situation, where relevance is not known a priori. The most closely related work can be found in [10, 15, 35]. The study in [10] argued that in some tasks users would be satisfied with a limited number of relevant documents rather than requiring all relevant documents; the authors therefore proposed to maximize the probability of finding a relevant document among the top n. By treating the previously retrieved documents as non-relevant, their algorithm is equivalent to optimizing Reciprocal Rank. A more general solution is proposed in [35] on the basis of the Bayesian rank decision framework in [15]. In these solutions, different rank preferences are expressed by different utility functions and can be incorporated when calculating the score for each document. The two ideas are close in spirit to the Maximal Marginal Relevance (MMR) criterion in [9], and can be called marginal-relevance IR models because they are designed to calculate the additional information a document contributes to a result list. Unfortunately, this framework lacks the capacity to model and optimize different IR metrics. This paper takes a rather different view: although, like [15, 35], we also follow Bayesian decision theory, we argue that the rank utility has nothing to do with the (relevance) model parameters but only with the hidden true topical relevance, and that the relevance states of documents need to be estimated before any user (rank) utility is known. A good IR metric should be able to specify one type of rank utility.
Once we summarize our belief about the true relevance by the joint probability of relevance, the utility, expressed by an evaluation metric, can be estimated under this uncertainty, and the optimal decision is the one that optimizes the expected value. The two distinct retrieval steps do not assume a particular (relevance) retrieval model, making the framework applicable to many existing IR models and IR metrics. Our work is also related to the portfolio theory of document ranking [29]. By analogy with financial problems, it argued that an optimal rank order is one that balances the overall relevance (mean) of the ranked list against its risk level (variance). This paper follows the idea of using the mean and variance to summarize a distribution and to analyze the expected IR metrics; our analytical forms of expected IR metrics, expressed in terms of the mean and variance, reveal some interesting properties that have not been shown before.

5. CONCLUSIONS
In this paper, we have studied the statistical properties of expected IR metrics when the relevance of documents is unknown. An implementation based on our analysis and the two-stage framework has demonstrated its ability to optimize major IR metrics in a probabilistic framework. In the future, it is of great interest to apply it to web search, where click-through data can be viewed as indirect evidence of document relevance. Also, during evaluation, the Cranfield paradigm considers relevance as deterministic values, either binary or graded. It is, however, more general to consider IR evaluation as a stochastic process too. Thus, although our study of expected IR metrics is aimed at retrieval, the analysis and development are also relevant to evaluation if the disagreement between relevance assessors needs to be modelled.

6. REFERENCES
[1] G. Amati and C. J. van Rijsbergen. Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Trans. Inf. Syst., 20(4):357-389, 2002.
[2] K. Arrow. Aspects of the Theory of Risk-Bearing. Helsinki: Yrjö Jahnsson Foundation, 1965.
[3] J. A. Aslam, V.
Pavlu, and E. Yilmaz. A statistical method for system evaluation using incomplete judgments. In SIGIR, 2006.
[4] J. A. Aslam, E. Yilmaz, and V. Pavlu. The maximum entropy method for analyzing retrieval measures. In SIGIR, 2005.
[5] C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
[6] P. F. Brown, V. J. D. Pietra, S. A. D. Pietra, and R. L. Mercer. The mathematics of statistical machine translation: parameter estimation. Comput. Linguist., 1993.
[7] C. Buckley and E. M. Voorhees. Evaluating evaluation measure stability. In SIGIR, 2000.
[8] C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender. Learning to rank using gradient descent. In ICML, 2005.
[9] J. Carbonell and J. Goldstein. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In SIGIR, 1998.
[10] H. Chen and D. R. Karger. Less is more: probabilistic models for retrieving fewer relevant documents. In SIGIR, 2006.
[11] G. V. Cormack and T. R. Lynam. Statistical precision of information retrieval evaluation. In SIGIR, 2006.
[12] W. B. Croft and D. J. Harper. Using probabilistic models of document retrieval without relevance information. Document Retrieval Systems, 1988.
[13] D. Harman. Overview of the second Text REtrieval Conference (TREC-2). In HLT, 1994.
[14] K. Järvelin and J. Kekäläinen. Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst., 20(4):422-446, 2002.
[15] J. Lafferty and C. Zhai. Document language models, query models, and risk minimization for information retrieval. In SIGIR, 2001.
[16] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
[17] M. E. Maron and J. L. Kuhns. On relevance, probabilistic indexing and information retrieval. J. ACM, 1960.
[18] S. Mizzaro. Relevance: The whole history. Journal of the American Society of Information Science, 1997.
[19] S. E. Robertson. The probability ranking principle in IR. Journal of Documentation, 33(4):294-304, 1977.
[20] S. E. Robertson and K. Spärck Jones. Relevance weighting of search terms.
Journal of the American Society for Information Science, 27(3):129-146, 1976.
[21] S. E. Robertson and S. Walker. Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In SIGIR, 1994.
[22] A. Singhal, C. Buckley, and M. Mitra. Pivoted document length normalization. In SIGIR, pages 21-29, 1996.
[23] M. Taylor, J. Guiver, S. Robertson, and T. Minka. SoftRank: optimizing non-smooth rank metrics. In WSDM, 2008.
[24] S. Tomlinson. Early precision measures: implications from the downside of blind feedback. In SIGIR, 2006.
[25] C. J. van Rijsbergen. Information Retrieval. Butterworths, London, UK, 1979.
[26] M. N. Volkovs and R. S. Zemel. BoltzRank: learning to maximize expected ranking gain. In ICML, 2009.
[27] E. Voorhees. Variations in relevance judgments and the measurement of retrieval effectiveness. Information Processing and Management, 1998.
[28] E. M. Voorhees. The TREC-8 question answering track report. In TREC-8, pages 77-82, 1999.
[29] J. Wang and J. Zhu. Portfolio theory of information retrieval. In SIGIR, 2009.
[30] Y. Wang and A. Waibel. Decoding algorithm in statistical machine translation. In EACL, 1997.
[31] E. Yilmaz, E. Kanoulas, and J. A. Aslam. A simple and efficient sampling method for estimating AP and NDCG. In SIGIR, 2008.
[32] E. Yilmaz and S. Robertson. On the choice of effectiveness measures for learning to rank. Information Retrieval, 2009.
[33] Y. Yue, T. Finley, F. Radlinski, and T. Joachims. A support vector method for optimizing average precision. In SIGIR, 2007.
[34] C. Zhai. Statistical language models for information retrieval: a critical review. Found. Trends Inf. Retr., 2(3):137-213, 2008.
[35] C. Zhai and J. D. Lafferty. A risk minimization framework for information retrieval. Inf. Process. Manage., 42(1):31-55, 2006.


More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Exerments-I MODULE III LECTURE - 2 EXPERIMENTAL DESIGN MODELS Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 2 We consder the models

More information

MAXIMUM A POSTERIORI TRANSDUCTION

MAXIMUM A POSTERIORI TRANSDUCTION MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,

More information

THE SUMMATION NOTATION Ʃ

THE SUMMATION NOTATION Ʃ Sngle Subscrpt otaton THE SUMMATIO OTATIO Ʃ Most of the calculatons we perform n statstcs are repettve operatons on lsts of numbers. For example, we compute the sum of a set of numbers, or the sum of the

More information

Singular Value Decomposition: Theory and Applications

Singular Value Decomposition: Theory and Applications Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real

More information

Week 5: Neural Networks

Week 5: Neural Networks Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple

More information

Temperature. Chapter Heat Engine

Temperature. Chapter Heat Engine Chapter 3 Temperature In prevous chapters of these notes we ntroduced the Prncple of Maxmum ntropy as a technque for estmatng probablty dstrbutons consstent wth constrants. In Chapter 9 we dscussed the

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

Limited Dependent Variables

Limited Dependent Variables Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages

More information

Uncertainty and auto-correlation in. Measurement

Uncertainty and auto-correlation in. Measurement Uncertanty and auto-correlaton n arxv:1707.03276v2 [physcs.data-an] 30 Dec 2017 Measurement Markus Schebl Federal Offce of Metrology and Surveyng (BEV), 1160 Venna, Austra E-mal: markus.schebl@bev.gv.at

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

CS286r Assign One. Answer Key

CS286r Assign One. Answer Key CS286r Assgn One Answer Key 1 Game theory 1.1 1.1.1 Let off-equlbrum strateges also be that people contnue to play n Nash equlbrum. Devatng from any Nash equlbrum s a weakly domnated strategy. That s,

More information

Computing MLE Bias Empirically

Computing MLE Bias Empirically Computng MLE Bas Emprcally Kar Wa Lm Australan atonal Unversty January 3, 27 Abstract Ths note studes the bas arses from the MLE estmate of the rate parameter and the mean parameter of an exponental dstrbuton.

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline Outlne Bayesan Networks: Maxmum Lkelhood Estmaton and Tree Structure Learnng Huzhen Yu janey.yu@cs.helsnk.f Dept. Computer Scence, Unv. of Helsnk Probablstc Models, Sprng, 200 Notces: I corrected a number

More information

EM and Structure Learning

EM and Structure Learning EM and Structure Learnng Le Song Machne Learnng II: Advanced Topcs CSE 8803ML, Sprng 2012 Partally observed graphcal models Mxture Models N(μ 1, Σ 1 ) Z X N N(μ 2, Σ 2 ) 2 Gaussan mxture model Consder

More information

3.1 ML and Empirical Distribution

3.1 ML and Empirical Distribution 67577 Intro. to Machne Learnng Fall semester, 2008/9 Lecture 3: Maxmum Lkelhood/ Maxmum Entropy Dualty Lecturer: Amnon Shashua Scrbe: Amnon Shashua 1 In the prevous lecture we defned the prncple of Maxmum

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

Case A. P k = Ni ( 2L i k 1 ) + (# big cells) 10d 2 P k.

Case A. P k = Ni ( 2L i k 1 ) + (# big cells) 10d 2 P k. THE CELLULAR METHOD In ths lecture, we ntroduce the cellular method as an approach to ncdence geometry theorems lke the Szemeréd-Trotter theorem. The method was ntroduced n the paper Combnatoral complexty

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal

More information

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

Natural Language Processing and Information Retrieval

Natural Language Processing and Information Retrieval Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support

More information

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques

More information

STAT 511 FINAL EXAM NAME Spring 2001

STAT 511 FINAL EXAM NAME Spring 2001 STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

Gaussian Mixture Models

Gaussian Mixture Models Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous

More information

Chapter 5 Multilevel Models

Chapter 5 Multilevel Models Chapter 5 Multlevel Models 5.1 Cross-sectonal multlevel models 5.1.1 Two-level models 5.1.2 Multple level models 5.1.3 Multple level modelng n other felds 5.2 Longtudnal multlevel models 5.2.1 Two-level

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016 CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

arxiv: v2 [stat.me] 26 Jun 2012

arxiv: v2 [stat.me] 26 Jun 2012 The Two-Way Lkelhood Rato (G Test and Comparson to Two-Way χ Test Jesse Hoey June 7, 01 arxv:106.4881v [stat.me] 6 Jun 01 1 One-Way Lkelhood Rato or χ test Suppose we have a set of data x and two hypotheses

More information

Finding Dense Subgraphs in G(n, 1/2)

Finding Dense Subgraphs in G(n, 1/2) Fndng Dense Subgraphs n Gn, 1/ Atsh Das Sarma 1, Amt Deshpande, and Rav Kannan 1 Georga Insttute of Technology,atsh@cc.gatech.edu Mcrosoft Research-Bangalore,amtdesh,annan@mcrosoft.com Abstract. Fndng

More information

Expectation Maximization Mixture Models HMMs

Expectation Maximization Mixture Models HMMs -755 Machne Learnng for Sgnal Processng Mture Models HMMs Class 9. 2 Sep 200 Learnng Dstrbutons for Data Problem: Gven a collecton of eamples from some data, estmate ts dstrbuton Basc deas of Mamum Lelhood

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information