Selecting Good Expansion Terms for Pseudo-Relevance Feedback


Guihong Cao, Jian-Yun Nie
Department of Computer Science and Operations Research
University of Montreal, Canada
{caogui, ...}

Jianfeng Gao
Microsoft Research, Redmond, USA
jfgao@microsoft.com

Stephen Robertson
Microsoft Research at Cambridge, Cambridge, UK
ser@microsoft.com

ABSTRACT
Pseudo-relevance feedback assumes that the most frequent terms in the pseudo-feedback documents are useful for retrieval. In this study, we re-examine this assumption and show that it does not hold in reality: many expansion terms identified by traditional approaches are in fact unrelated to the query and harmful to retrieval. We also show that good expansion terms cannot be distinguished from bad ones merely on their distributions in the feedback documents and in the whole collection. We therefore propose to integrate a term classification process to predict the usefulness of expansion terms, into which multiple additional features can be integrated. Our experiments on three TREC collections show that retrieval effectiveness can be much improved when term classification is used. In addition, we demonstrate that good terms should be identified directly according to their possible impact on retrieval effectiveness, i.e. using supervised learning, instead of unsupervised learning.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Retrieval models

General Terms
Design, Algorithm, Theory, Experimentation

Keywords
Pseudo-relevance feedback, Expansion Term Classification, SVM, Language Models

1. INTRODUCTION
User queries are usually too short to describe the information need accurately. Many important terms can be absent from the query, leading to poor coverage of the relevant documents. To solve this problem, query expansion has been widely used [9], [15], [21], [22]. Among all the approaches, pseudo-relevance feedback (PRF), which exploits the initial retrieval result, has been the most effective [21]. The basic assumption of PRF is that the top-ranked documents in the first retrieval result contain many useful terms that can help discriminate relevant documents from irrelevant ones. In general, expansion terms are extracted either according to the term distributions in the feedback documents (i.e. one tries to extract the most frequent terms), or according to a comparison between the term distributions in the feedback documents and in the whole document collection (i.e. to extract the most specific terms in the feedback documents). Several additional criteria have been proposed. For example, idf is widely used in the vector space model [15]; query length has been considered in [7] for the weighting of expansion terms; and some linguistic features have been tested in [16]. However, few studies have directly examined whether the expansion terms extracted from pseudo-feedback documents by the existing methods do indeed help retrieval. In general, one has been concerned only with the global impact of a set of expansion terms on retrieval effectiveness. A fundamental question that is often overlooked is whether the extracted expansion terms are truly related to the query and are useful for IR.
In fact, as we will show in this paper, the assumption that most expansion terms extracted from the feedback documents are useful does not hold, even when the global retrieval effectiveness is improved. Among the extracted terms, a non-negligible part is either unrelated to the query or harmful, rather than helpful, to retrieval effectiveness. So a crucial question is: how can we better select useful expansion terms from pseudo-feedback documents?

In this study, we propose to use a supervised learning method for term selection. The term selection problem can be cast as a term classification problem: we try to separate good expansion terms from the others directly according to their potential impact on retrieval effectiveness. This method differs from the existing ones, which can typically be considered unsupervised learning. SVM [6], [20] is used for term classification, exploiting not only the term distribution criteria used in previous studies, but also several additional criteria such as term proximity. The proposed approach has at least the following advantages: 1) expansion terms are no longer selected merely on term distributions and other criteria only indirectly related to retrieval effectiveness, but directly according to their possible impact on it, so we can expect the selected terms to have a higher impact on effectiveness; 2) the term classification process can naturally integrate various criteria, and thus provides a framework for incorporating different sources of evidence.

We evaluate our method on three TREC collections and compare it to the traditional approaches. The experimental results show that retrieval effectiveness can be improved significantly when term classification is integrated. To our knowledge, this is the first attempt to investigate the direct impact of individual expansion terms on retrieval effectiveness in pseudo-relevance feedback.

The remainder of the paper is organized as follows. Section 2 reviews related work and the state-of-the-art approaches to query expansion. In Section 3, we examine the PRF assumption used in previous studies and show that it does not hold in reality. Section 4 presents experiments investigating the potential usefulness of selecting good terms for expansion. Section 5 describes our term classification method and reports an evaluation of the classification process. The integration of the classification results into the PRF methods is described in Section 6. In Section 7, we evaluate the resulting retrieval method on three TREC collections. Section 8 concludes the paper and suggests some avenues for future work.

2. Related Work
Pseudo-relevance feedback has been widely used in IR. It has been implemented in different retrieval models: the vector space model [15], the probabilistic model [13], and so on. Recently, the PRF principle has also been implemented within the language modeling framework. Since our work is also carried out in language modeling, we review the related studies in this framework in more detail.

The basic ranking function in language modeling uses KL-divergence as follows:

Score(d, q) = \sum_{w \in V} P(w|\theta_q) \log P(w|\theta_d)    (1)

where V is the vocabulary of the whole collection, and \theta_q and \theta_d are respectively the query model and the document model. The document model has to be smoothed to solve the zero-probability problem. A commonly used smoothing method is Dirichlet smoothing [23]:

P(w|\theta_d) = ( tf(w,d) + u P(w|C) ) / ( |d| + u )    (2)

where |d| is the length of the document, tf(w,d) the term frequency of w within d, P(w|C) the probability of w in the whole collection C estimated with MLE (Maximum Likelihood Estimation), and u the Dirichlet prior (set at 1,500 in our experiments).

The query model describes the user's information need. In most traditional approaches using language modeling, this model is estimated with MLE without smoothing; we denote it by P(w|\theta_o). In general, this query model has poor coverage of the relevant and useful terms, especially for short queries: many terms related to the query's topic are absent from (or have a zero probability in) the model. Pseudo-relevance feedback is often used to improve the query model. We mention two representative approaches here: the relevance model and the mixture model.

The relevance model [8] assumes that a query term is generated by a relevance model P(w|\theta_R). However, it is impossible to define the relevance model without any relevance information; [8] thus exploits the top-ranked feedback documents by assuming them to be samples from the relevance model. The relevance model is then estimated as follows:

P(w|\theta_R) \approx \sum_{D \in F} P(w|D) P(D|\theta_R)

where F denotes the feedback documents. On the right side, the relevance model \theta_R is approximated by the original query Q. Applying Bayes' rule and making some simplifications, we obtain:

P(w|\theta_R) = \sum_{D \in F} P(w|D) P(Q|D) P(D) / P(Q)    (3)

That is, the probability of a term w in the relevance model is determined by its probability in the feedback documents (i.e. P(w|D)) as well as by the correspondence of the latter to the query (i.e. P(Q|D)). The relevance model is then used to enhance the original query model by the following interpolation:

P(w|\theta_q) = (1 - \lambda) P(w|\theta_o) + \lambda P(w|\theta_R)    (4)

where \lambda is the interpolation weight (set at 0.5 in our experiments). Notice that the above interpolation can also be implemented as document re-ranking in practice, in which only the top-ranked documents are re-ranked according to the relevance model.

The mixture model [22] also tries to build a language model for the query topic from the feedback documents, but in a different way. It assumes that the query topic model P(w|\theta_T) to be extracted corresponds to the part of the feedback documents that is most distinctive from the whole document collection. This distinctive part is extracted as follows: each feedback document is assumed to be generated by a mixture of the topic model and the collection model, and the EM algorithm [3] is used to extract the topic model so as to maximize the likelihood of the feedback documents. The topic model is then combined with the original query model by an interpolation similar to that of the relevance model.
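To make the components above concrete, here is a minimal sketch (not the authors' implementation) of the Dirichlet-smoothed document model of equation (2), the KL-divergence score of equation (1), and the interpolation of equation (4); the relevance-model estimate follows equation (3) under the simplifying assumption that P(D) is uniform over the feedback set.

```python
import math
from collections import Counter

def dirichlet_lm(doc_tokens, p_coll, mu=1500):
    """Equation (2): P(w|theta_d) with Dirichlet smoothing."""
    tf, dlen = Counter(doc_tokens), len(doc_tokens)
    return lambda w: (tf[w] + mu * p_coll(w)) / (dlen + mu)

def kl_score(query_model, doc_lm):
    """Equation (1): Score(d,q) = sum_w P(w|theta_q) log P(w|theta_d)."""
    return sum(p * math.log(doc_lm(w)) for w, p in query_model.items() if p > 0)

def relevance_model(feedback_docs, p_query_given_doc, p_coll):
    """Equation (3) with P(D) assumed uniform: P(w|theta_R) ~ sum_D P(w|D) P(Q|D)."""
    rm = Counter()
    for doc, p_q in zip(feedback_docs, p_query_given_doc):
        lm = dirichlet_lm(doc, p_coll)
        for w in set(doc):
            rm[w] += lm(w) * p_q
    z = sum(rm.values())
    return {w: v / z for w, v in rm.items()}

def interpolate(original, feedback, lam=0.5):
    """Equation (4): (1-lam) P(w|theta_o) + lam P(w|theta_R)."""
    vocab = set(original) | set(feedback)
    return {w: (1 - lam) * original.get(w, 0.0) + lam * feedback.get(w, 0.0)
            for w in vocab}
```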
Although the specific techniques used in the above two approaches differ, both assume that the strong terms contained in the feedback documents are related to the query and useful for improving retrieval effectiveness. In both cases, the strong terms are determined according to their distributions. The only difference is that the relevance model tries to extract the most frequent terms from the feedback documents (i.e. those with a strong P(w|D)), while the mixture model tries to extract those that are the most distinctive between the feedback documents and the general collection. These criteria have also been used in other PRF approaches (e.g. [21]).

Several additional criteria have been used to select terms related to the query. For example, [14] proposed the principle that the selected terms should have a higher probability in the relevant documents than in the irrelevant documents. For document filtering, term selection is more widely used in order to update the topic profile; for example, [24] extracted terms from true relevant and irrelevant documents to update the user profile (i.e. the query) using the Rocchio method. Kwok et al. [7] also made use of the query length as well as the size of the vocabulary. Smeaton and Van Rijsbergen [16] examined the impact of determining expansion terms using a minimal spanning tree and some simple linguistic analysis.

Despite the large number of studies, a crucial question that has not been directly examined is whether the expansion terms selected in one way or another are truly useful for retrieval. One was usually concerned with the global impact of a set of expansion terms. Indeed, in many experiments, improvements in retrieval effectiveness have been observed with PRF [8], [9], [19], [22]. This might suggest that most expansion terms are useful. Is it really so? We examine this question in the next section. Notice that some studies (e.g. [11]) have tried to understand the effect of query expansion; however, those studies examined terms extracted from the whole collection instead of from the feedback documents, and they also focused on the term distribution aspects.

3. A Re-examination of the PRF Assumption
The general assumption behind PRF can be formulated as follows: the most frequent or distinctive terms in pseudo-relevance feedback documents are useful and can improve retrieval effectiveness when added to the query. To test this assumption, we consider all the terms extracted from the feedback documents using the mixture model, and test each of these terms in turn to see its impact on retrieval effectiveness.

The following score function is used to integrate an expansion term e:

Score(d, q) = \sum_{t \in q} P(t|\theta_o) \log P(t|\theta_d) + w \log P(e|\theta_d)    (5)

where t is a query term, P(t|\theta_o) is the original query model described in Section 2, e is the expansion term under consideration, and w is its weight. The above expression is a simplified form of query expansion with a single term. To keep the test simple, we make the following simplifications: 1) an expansion term is assumed to act on the query independently of the other expansion terms; 2) each expansion term is added to the query with an equal weight, w being set at 0.01 or -0.01. In practice, an expansion term may act on the query in dependence with other terms, and the weights may differ. Despite these simplifications, the test can still reflect the main characteristics of the expansion terms. Good expansion terms are those that improve effectiveness when w is 0.01 and hurt effectiveness when w is -0.01; bad expansion terms produce the opposite effect; neutral expansion terms produce a similar effect whether w is 0.01 or -0.01. We can therefore generate three groups of expansion terms: good, bad and neutral. Ideally, we would like to expand queries with good expansion terms only.

Let us describe the identification of the three groups of terms in more detail. Suppose MAP(q) and MAP(q \cup e) are respectively the MAP of the original query and of the expanded query (expanded with e). We measure the performance change due to e by the ratio

chg(e) = [ MAP(q \cup e) - MAP(q) ] / MAP(q)

We set a threshold on this ratio: good and bad expansion terms must produce a performance change whose magnitude exceeds the threshold. In addition to the performance change, we also assume that a term appearing fewer than 3 times in the feedback documents is not an important expansion term; this allows us to filter out some noise. The above identification produces the desired (reference) labels for term classification.
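The labeling protocol above is straightforward to operationalize. The sketch below assumes a hypothetical helper eval_map(query, term, weight) that runs retrieval with equation (5) and returns the MAP; the threshold default is an arbitrary stand-in, since the exact value is a tuning choice.

```python
def chg(q_map, base_map):
    """Relative MAP change due to one expansion term: [MAP(q u e) - MAP(q)] / MAP(q)."""
    return (q_map - base_map) / base_map

def label_term(e, query, eval_map, w=0.01, threshold=0.005):
    # eval_map is a hypothetical evaluation helper; threshold is a tuning choice.
    base = eval_map(query, None, 0.0)
    up = chg(eval_map(query, e, +w), base)    # add e with weight +0.01
    down = chg(eval_map(query, e, -w), base)  # add e with weight -0.01
    if up > threshold and down < -threshold:
        return "good"
    if up < -threshold and down > threshold:
        return "bad"
    return "neutral"
```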
Now, we examine whether the candidate expansion terms proposed by the mixture model are good terms. Our verification is made on three TREC collections, AP, WSJ and Disk4&5, whose characteristics are described in Section 7.1. We consider 150 queries for each collection and the 80 expansion terms with the largest probabilities for each query. Table 1 shows the proportions of good, bad and neutral terms over all the queries in each collection.

Collection   Good Terms   Neutral Terms   Bad Terms
AP           17.52%       47.59%          36.69%
WSJ          17.41%       49.89%          32.69%
Disk4&5      17.64%       56.46%          25.88%

Table 1. Proportions of each group of expansion terms selected by the mixture model

As we can see, fewer than 18% of the expansion terms used in the mixture model are good terms, on all three collections. The proportion of bad terms is higher. This shows that the expansion process indeed added more bad terms than good ones. We also notice from Table 1 that a large proportion of the expansion terms are neutral, with little impact on retrieval effectiveness. Although these terms do not necessarily hurt retrieval, adding them to the query produces a long query and thus heavier query traffic (longer evaluation time). It is therefore desirable to remove them, too.

Figure 1. Distribution of the expansion terms for "airbus subsidies" in the feedback documents and in the collection

The above analysis clearly shows that the term selection process used in the mixture model is insufficient. A similar phenomenon is observed with the relevance model, and the observation generalizes to all the methods exploiting the same criteria. This suggests that the term selection criteria used, namely the term distributions in the feedback documents and in the whole document collection, are insufficient. It also indicates that good and bad expansion terms may have similar distributions, since the mixture model, which exploits the difference in term distribution between the feedback documents and the collection, fails to distinguish them.

To illustrate the last point, let us look at the distribution of the expansion terms selected with the mixture model for TREC query #51, "airbus subsidies". In Figure 1, we place the top 80 expansion terms with the largest probabilities in a two-dimensional space: one dimension represents the logarithm of a term's probability in the pseudo-relevant documents, and the other represents that in the whole collection. To make the illustration easier, a simple normalization is applied so that the final values lie in the range [0, 1]. Figure 1 shows the distribution of the three groups of expansion terms. We can observe that the neutral terms are somewhat isolated from the good and bad terms (in the lower-right corner), but the good expansion terms are intertwined with the bad ones. This figure illustrates the difficulty of separating good from bad expansion terms according to term distributions alone. It is then desirable to use additional criteria to better select useful expansion terms.
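For reference, the two coordinates plotted in Figure 1 can be computed as below. The paper specifies only "a simple normalization" to [0, 1], so the min-max scaling and the 0.5 smoothing count are assumptions.

```python
import math
from collections import Counter

def figure1_coords(terms, feedback_tokens, collection_tokens):
    """(x, y) per term: scaled log P(w|F) and scaled log P(w|C)."""
    def scaled_log_probs(tokens):
        tf, n = Counter(tokens), len(tokens)
        raw = {w: math.log((tf[w] + 0.5) / n) for w in terms}  # 0.5 smoothing: an assumption
        lo, hi = min(raw.values()), max(raw.values())
        return {w: (v - lo) / ((hi - lo) or 1.0) for w, v in raw.items()}
    x = scaled_log_probs(feedback_tokens)
    y = scaled_log_probs(collection_tokens)
    return {w: (x[w], y[w]) for w in terms}
```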

4. Usefulness of Selecting Good Terms
Before proposing an approach to select good terms, let us first examine the possible impact of a good term selection process. Assume an oracle classifier that correctly separates good, bad and neutral expansion terms as determined in Section 3. In this experiment, we keep only the good expansion terms for each query. All the good terms are integrated into the new query model in the same way as in either the relevance model or the mixture model. Table 2 shows the MAP (Mean Average Precision) over the top 1000 results with the original query model (LM), the query models expanded by the relevance model (REL) and by the mixture model (MIX), as well as by the oracle expansion terms (REL+Oracle and MIX+Oracle). The superscripts L, R and M indicate that the improvement over LM, REL and MIX respectively is statistically significant at p<0.05.

Models        AP         WSJ        Disk4&5
LM            ...        ...        ...
REL           ...(L)     ...(L)     ...(L)
REL+Oracle    ...(R,L)   ...(R,L)   ...(R,L)
MIX           ...(L)     ...(L)     ...(L)
MIX+Oracle    ...(M,L)   ...(M,L)   ...(M,L)

Table 2. The impact of the oracle expansion classifier

We can see that retrieval effectiveness can be much improved if term classification is done perfectly: the oracle expansion terms generally improve the MAP of the relevance model and the mixture model by 8-30%. This shows the usefulness of correctly classifying the expansion terms and the high potential of improving retrieval effectiveness through good term classification. The MAP obtained with the oracle expansion terms represents the upper bound of the retrieval effectiveness we can expect from pseudo-relevance feedback. Our problem now is to develop an effective method to correctly classify the expansion terms.

5. Classification of Expansion Terms
5.1 Classifier
Any classifier can be used for term classification; here we use SVM. More specifically, we use the SVM variant C-SVM [2] because of its effectiveness and simplicity [20]. Several kernel functions can be used in SVM. We use the radial basis function (RBF) kernel because it has relatively few hyper-parameters and has been shown to be effective in previous studies [2], [5]. This function is defined as follows:

K(x_i, x_j) = \exp( -||x_i - x_j||^2 / (2 \sigma^2) )    (6)

where \sigma is a parameter controlling the shape of the RBF function: the function gets flatter when \sigma is larger. Another parameter C>0 in C-SVM controls the trade-off between the slack variable penalty and the margin [2]. Both parameters are estimated with 5-fold cross-validation so as to maximize the classification accuracy on the training data (see Table 4).

In our term classification, we are interested in knowing not only whether a term is good, but also the extent to which it is good. This latter value is useful for measuring the importance of an expansion term and for weighting it in the new query. Therefore, once we obtain a classification score, we use the method described in [12] to transform it into a posterior probability. Suppose the classification score calculated with the SVM is s(x). Then the probability of x belonging to the class of good terms (denoted by +) is defined by:

P(+|x) = 1 / ( 1 + \exp( A s(x) + B ) )    (7)

where A and B are parameters estimated by minimizing the cross-entropy on a portion of the training data, namely the development data. This process is automated in LIBSVM [5]. We have P(+|x)>0.5 if and only if the term x is classified as a good term. More details about this model can be found in [12]. Note that the above probabilistic SVM may produce classification results different from those of the simple SVM, which classifies instances according to sign(s(x)). In our experiments, we tested both probabilistic and simple SVMs, and found that the former performs better. We use the SVM implementation LIBSVM [5] in our experiments.
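This classifier can be reproduced with any LIBSVM front end. The sketch below uses scikit-learn, whose SVC wraps LIBSVM: kernel="rbf" gives the kernel of equation (6) (gamma plays the role of 1/(2 sigma^2)), probability=True fits the sigmoid of equation (7) internally, and the 5-fold grid search mirrors the cross-validation described above. The data and the parameter grids are stand-ins, not the authors' setup.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# X: feature vectors of candidate expansion terms, y: 1 = good, 0 = not good.
X = np.random.rand(200, 8)                      # stand-in data for illustration
y = (X[:, 0] + X[:, 3] > 1.0).astype(int)       # stand-in labels

# C-SVM with an RBF kernel; probability=True makes LIBSVM fit eq. (7) internally.
grid = GridSearchCV(SVC(kernel="rbf", probability=True),
                    {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
                    cv=5)
grid.fit(X, y)
p_good = grid.best_estimator_.predict_proba(X)[:, 1]  # P(+|x) for each term
```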
5.2 Features Used for Term Classification
Each expansion term is represented by a feature vector F(e) = [f_1(e), f_2(e), ..., f_N(e)]^T in R^N, where ^T denotes the transpose of a vector. Useful features include those already used in traditional approaches, such as the term distribution in the feedback documents and in the whole collection. As mentioned, these features are insufficient. We therefore consider the following additional features:
- co-occurrences of the expansion term with the original query terms;
- proximity of the expansion term to the query terms.
We explain several groups of features below. Our assumption is that the most useful features are those that make the largest difference between the feedback documents and the whole collection (similar to the principle used in the mixture model). So we define two sets of features, one for the feedback documents and another for the whole collection. Technically, both sets can be obtained in a similar way, so we only describe the features for the feedback documents; the others are defined similarly.

Term distributions. The first features are the term distributions in the pseudo-relevant documents and in the collection. The feature for the feedback documents is defined as follows:

f_1(e) = \log [ \sum_{D \in F} tf(e, D) / \sum_t \sum_{D \in F} tf(t, D) ]

where F is the set of feedback documents. f_2(e) is defined similarly on the whole collection. These features are the traditional ones used in the relevance model and the mixture model.

Co-occurrence with single query terms. Many studies have found that terms that frequently co-occur with the query terms are often related to the query [1]. We therefore define the following feature to capture this:

f_3(e) = \log [ (1/n) \sum_{i=1}^{n} C(t_i, e | D) / tf(t_i, D) ]

where C(t_i, e | D) is the frequency of co-occurrences of query term t_i and the expansion term e within text windows in document D, and n is the number of query terms. The window size is empirically set to 12 words.

Co-occurrence with pairs of query terms. A stronger co-occurrence relation for an expansion term is with two query terms together. [1] has shown that this type of co-occurrence relation is much better than the previous one because it can take some query context into account. The text window size used here is 15 words. Given \Omega, the set of possible query-term pairs, we define the following feature, slightly extended from the previous one:

f_5(e) = \log [ (1/|\Omega|) \sum_{(t_i, t_j) \in \Omega} C(t_i, t_j, e | D) / tf(t_i, D) ]

Weighted term proximity. The idea of using term proximity has appeared in several studies [18]. Here we also assume that two terms co-occurring at a smaller distance are more closely related. There are several ways to define the distance between two terms in a set of documents [18]; we define it as the minimum number of words between the two terms among all their co-occurrences in the documents, denoting the distance between t_i and t_j in the set B of documents by dist(t_i, t_j | B). For a query of multiple words, we have to aggregate the distances between the expansion term and all query terms. The simplest method is the average distance, similar to the average distance defined in [18]; however, it does not produce good results in our experiments. Instead, the weighted average distance works better: each distance is weighted by the frequency of the corresponding co-occurrences. We then have the following feature:

f_7(e) = \log [ \sum_{i=1}^{n} C(t_i, e) dist(t_i, e | F) / \sum_{i=1}^{n} C(t_i, e) ]

where C(t_i, e) is the frequency of co-occurrences of t_i and e within text windows in the collection. The window size is set to 12 words as before.

Document frequency of the query terms and the expansion term together. The feature in this group counts the documents in which the expansion term co-occurs with all query terms:

f_9(e) = \log [ \sum_{D \in F} I( (\forall t \in q: t \in D) \wedge e \in D ) + 0.5 ]

where I(x) is the indicator function, whose value is 1 when the Boolean expression x is true, and 0 otherwise. The constant 0.5 acts as a smoothing factor to avoid a zero value.

To prevent a feature whose values vary over a large numeric range from dominating those varying over smaller ranges, feature scaling is necessary [5]. The scaling is done in a query-by-query manner. Let e in GEN(q) be an expansion term of query q, and f_i(e) one feature value of e. We scale f_i(e) as follows:

f'_i(e) = ( f_i(e) - min_i ) / ( max_i - min_i )

where min_i = min_{e in GEN(q)} f_i(e) and max_i = max_{e in GEN(q)} f_i(e). With this transformation, each feature becomes a real number in [0, 1].

In our experiments, only the above features are used, but the general method is not limited to them; other features can be added. The possibility of integrating arbitrary features for the selection of expansion terms is indeed an advantage of our method.
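As an illustration, here is a small sketch of f_1, f_9 and the query-wise scaling, assuming feedback documents are given as token lists; the window-based co-occurrence counts needed for f_3, f_5 and f_7 are omitted for brevity.

```python
import math

def f1(e, feedback_docs):
    """f1(e): log relative frequency of e in the feedback documents.
    Candidates occur at least 3 times in F (Section 3), so the count is positive."""
    tf_e = sum(doc.count(e) for doc in feedback_docs)
    total = sum(len(doc) for doc in feedback_docs)
    return math.log(tf_e / total)

def f9(e, query_terms, feedback_docs):
    """f9(e): smoothed log count of feedback docs containing e and all query terms."""
    n = sum(1 for doc in feedback_docs
            if e in doc and all(t in doc for t in query_terms))
    return math.log(n + 0.5)

def minmax_scale(values):
    """Query-by-query scaling of one feature over GEN(q) to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]
```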

5.3 Classification Experiments
Let us now examine the quality of our classification. We use three test collections (see Table 4 in Section 7.1), with 150 queries for each collection. We divide these queries into three groups of 50 queries and perform leave-one-out cross-validation over the groups to evaluate the classification accuracy. To generate training and test data, we use the method described in Section 3 to label the possible expansion terms of each query as good terms or non-good terms (including bad and neutral terms), and then represent each expansion term with the features described in Section 5.2. The candidate expansion terms are those that occur no fewer than three times in the feedback documents (the top 20 documents of the initial retrieval).

Coll.     Percentage of good terms   Accuracy   Rec.   Prec.
AP        ...                        ...        ...    ...
WSJ       ...                        ...        ...    ...
Disk4&5   ...                        ...        ...    ...

Table 3. Classification results of SVM

Table 3 shows the classification results. In this table, we show the percentage of good expansion terms for all the queries in each collection: around 1/3. Using the SVM classifier, we obtain a classification accuracy of about 69%. This number is not high: a naive classifier that always assigns the non-good class would have an only slightly lower accuracy (i.e. one minus the percentage of good terms). However, such a classifier is useless for our purpose because no expansion term is classified as good. Better indicators are recall and, more particularly, precision. Although the classifier identifies only about 1/3 of the good terms (recall), around 60% of the identified terms are truly good terms (precision). Compared with Table 1, which concerns the expansion terms selected by the mixture model, the terms selected by the SVM classifier are of much higher quality. This shows that the additional features we use in the classification are useful, although they could be further improved in the future. In the next section, we describe how the selected expansion terms are integrated into our retrieval model.
6. Re-weighting Expansion Terms with Term Classification
The classification process performs a further selection among the expansion terms proposed by the relevance model and the mixture model respectively. The selected terms can be integrated into these models in two different ways: hard filtering, i.e. we keep only the expansion terms classified as good; or soft filtering, i.e. we use the classification score to increase the weight of good terms in the final query model. Our experiments show that the second method performs better; we compare the two methods in Section 7.4. In this section, we focus on the second method, which amounts to redefining the models P(w|\theta_R) for the relevance model and P(w|\theta_T) for the mixture model. For a term e such that P(+|e) > 0.5, these models are redefined as follows:

P_new(e|\theta_R) = (1/Z) P_old(e|\theta_R) ( \alpha + P(+|e) )
P_new(e|\theta_T) = (1/Z) P_old(e|\theta_T) ( \alpha + P(+|e) )    (8)

where Z is the normalization factor, and \alpha is a coefficient estimated on development data in our experiments using line search [4]. Once the expansion terms are re-weighted, we retain the top 80 terms with the highest probabilities for expansion. Their weights are normalized before being interpolated with the original query model. The number 80 is used for a fair comparison with the relevance model and the mixture model.
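A small sketch of the soft-filtering re-weighting of equation (8) and the top-80 truncation follows; term models are plain dictionaries, and the alpha value is an arbitrary stand-in for the one tuned on development data.

```python
def soft_reweight(term_probs, p_good, alpha=0.6):
    """Equation (8): boost terms the classifier labels good, then renormalize.
    term_probs: {term: P_old(term|theta)}; p_good: {term: P(+|term)}.
    alpha=0.6 is an arbitrary stand-in; the paper tunes it by line search."""
    boosted = {t: p * (alpha + p_good[t]) if p_good.get(t, 0.0) > 0.5 else p
               for t, p in term_probs.items()}
    z = sum(boosted.values())                 # normalization factor Z
    return {t: v / z for t, v in boosted.items()}

def top_k(model, k=80):
    """Keep the k highest-probability expansion terms and renormalize."""
    best = dict(sorted(model.items(), key=lambda kv: -kv[1])[:k])
    z = sum(best.values())
    return {t: v / z for t, v in best.items()}
```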

Name      Description                  #Docs     Train topics   Dev. topics   Test topics
AP        Assoc. Press, 1988-90        ...       ...            ...           ...
WSJ       Wall St. Journal, 1987-92    173,...   ...            ...           ...
Disk4&5   TREC disks 4&5               556,...   ...            ...           ...

Table 4. Statistics of evaluation data sets

7. IR Experiments
7.1 Experimental Settings
We evaluate our method on three TREC collections: AP88-90, WSJ87-92 and all documents on TREC disks 4&5. Table 4 shows the statistics of the three collections. For each dataset, we split the available topics into three parts: training data to train the SVM classifier, development data to estimate the parameter \alpha in equation (8), and test data. We use only the title of each TREC topic as the query. Both documents and queries are stemmed with the Porter stemmer, and stop words are removed. The main evaluation metric is Mean Average Precision (MAP) over the top 1000 documents. Since some previous studies showed that PRF improves recall but may hurt precision, we also report precision over the top 30 and 100 documents (P@30 and P@100), as well as recall as a supplementary measure. We perform a query-by-query analysis and conduct a t-test to determine whether the improvement in MAP is statistically significant.

The Indri 2.6 search engine [17] is used as our basic retrieval system. We use the relevance model implemented in Indri, but implemented the mixture model following [22], since Indri does not include this model.

7.2 Ad-hoc Retrieval Results
In the experiments, the following methods are compared:
- LM: the KL-divergence retrieval model with the original queries;
- REL: the relevance model;
- REL+SVM: the relevance model with term classification;
- MIX: the mixture model;
- MIX+SVM: the mixture model with term classification.
These models require some parameters, such as the weight of the original model when forming the final query representation, the Dirichlet prior for document model smoothing, and so on. Since the purpose of this paper is not to optimize these parameters, we set them to the same values for all models.

Tables 5, 6 and 7 show the results obtained with the different models on the three collections. In the tables, Imp. denotes the improvement rate over the LM model; * indicates that the improvement is statistically significant at p<0.05, and ** at p<0.01. The superscripts R and M indicate that the result is statistically better than the relevance model and the mixture model respectively at p<0.05.

Model      P@30   P@100   MAP        Imp.       Recall
LM         ...    ...     ...
REL        ...    ...     ...        ...%**     ...
REL+SVM    ...    ...     ...(R)     22.93%**   ...
MIX        ...    ...     ...        ...%**     ...
MIX+SVM    ...    ...     ...(M,R)   28.36%**   ...

Table 5. Ad-hoc retrieval results on AP data

Model      P@30   P@100   MAP      Imp.       Recall
LM         ...    ...     ...
REL        ...    ...     ...      ...%**     ...
REL+SVM    ...    ...     ...      ...%**     ...
MIX        ...    ...     ...      ...%**     ...
MIX+SVM    ...    ...     ...(R)   14.82%**   0.70

Table 6. Ad-hoc retrieval results on WSJ data

Model      P@30   P@100   MAP        Imp.       Recall
LM         ...    ...     ...
REL        ...    ...     ...        ...%*      ...
REL+SVM    ...    ...     ...(R)     14.20%**   ...
MIX        ...    ...     ...        ...%**     ...
MIX+SVM    ...    ...     ...(M,R)   25.96%**   ...

Table 7. Ad-hoc retrieval results on Disk4&5 data

From the tables, we observe that both the relevance model and the mixture model, which exploit a form of PRF, improve the retrieval effectiveness of LM significantly. This observation is consistent with previous studies, and the MAP obtained with these two models represents state-of-the-art effectiveness on these test collections. Comparing the relevance model and the mixture model, we see that the latter performs better. The reason may be the following: the mixture model relies more than the relevance model on the difference between the feedback documents and the whole collection to select expansion terms, and in doing so it filters out more bad or neutral expansion terms.

On all three collections, the model integrating term classification performs very well. When the classification model is used together with a PRF model, effectiveness is always improved. On the AP and Disk4&5 collections, the improvements are more than 7.5% and statistically significant. The improvements on the WSJ collection are smaller (about 3.5%) and not statistically significant. Regarding precision, term classification also improves precision over the top-ranked documents, except on Disk4&5 when SVM is added to REL. This shows that in most cases, adding the expansion terms does not hurt, but improves, precision.

Table 8 shows the expansion terms for the queries "machine translation" and "natural language processing"; the stemmed words have been restored for better readability. All the terms in the table were suggested by the mixture model, but only part of them (in italics) are useful expansion terms. Many of them are general terms that are not useful, for example "food", "make", "year", "50", and so on.

machine translation: compute, year, soviet, work, company, make, typewriter, english, busy, ibm, increase, people, ...
natural language processing: english, publish, word, nation, french, develop, food, russian, make, program, world, dictionary, gorilla, ...

Table 8. Expansion terms of two queries. The terms in italics are real good expansion terms, and those in bold are classified as good terms
The classification process helps identify the useful expansion terms well (in bold): although not all the useful expansion terms are identified, those that are (e.g. "program", "dictionary") are highly related and useful. As the weight of these terms is increased, the relative weight of the other terms is decreased, making their weights in the final query model smaller. These examples illustrate why the term classification process can improve retrieval effectiveness.

7.3 Supervised vs. Unsupervised Learning
Compared to the relevance model and the mixture model, the approach with term classification makes two changes: it uses supervised instead of unsupervised learning, and it uses several additional features. It is then important to see which of these changes contributes the most to the increase in retrieval effectiveness. To do so, we design a method using unsupervised learning, but with the same additional features. This unsupervised method extends the mixture model in the following way.

Each feedback document is still considered to be generated from the topic model (to be extracted) and the collection model, and we try to extract the topic model so as to maximize the likelihood of the feedback documents, as in the mixture model. The difference is that, instead of defining the topic model P(w|\theta_T) as a multinomial model, we define it as a log-linear model that combines all the features:

P(w|\theta_T) = (1/Z) \exp( \lambda^T F(w) )    (9)

where F(w) is the feature vector defined in Section 5.2, \lambda is the weight vector and Z is the normalization factor that makes P(w|\theta_T) a proper probability distribution. \lambda is estimated by maximizing the likelihood of the feedback documents. To avoid overfitting, we regularize \lambda by assuming a zero-mean Gaussian prior distribution [2]. The objective function to be maximized then becomes:

L(F) = \sum_{D \in F} \sum_{w \in V} tf(w,D) \log( (1-\alpha) P(w|\theta_C) + \alpha P(w|\theta_T) ) - \beta \lambda^T \lambda    (10)

where \beta is the regularization factor, set to 0.01 in our experiments, and \alpha is the probability of using the topic model to generate a pseudo-relevant document, set to a fixed value as in [22] (0.5 in our case). Since L(F) is a concave function w.r.t. \lambda, it has a unique maximum. We solve this unconstrained optimization problem with the L-BFGS algorithm [10].

Model        AP         WSJ    Disk4&5
MIX          ...        ...    ...
Log-linear   ...        ...    ...
MIX+SVM      ...(M,L)   ...    ...(M,L)

Table 9. Supervised learning vs. unsupervised learning

Table 9 shows the results measured by MAP. Again, the superscripts M and L indicate that the improvement over MIX and over the log-linear model respectively is statistically significant at p<0.05. From this table, we observe that the log-linear model outperforms the mixture model only slightly. This shows that an unsupervised learning method, even with additional features, cannot improve retrieval effectiveness by a large margin. The possible reason is that the objective function L(F) does not correlate well with MAP: the parameters maximizing L(F) do not necessarily produce a good MAP. In comparison, the MIX+SVM model largely outperforms the log-linear model on all three collections, and the improvements on AP and Disk4&5 are statistically significant. This result shows that a supervised learning method can capture the characteristics of genuinely good expansion terms more effectively than an unsupervised method.
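For concreteness, here is a compact sketch of this baseline: equation (9) as a softmax over feature scores and the regularized likelihood of equation (10), maximized with SciPy's L-BFGS by negating the objective; the gradient is left to numerical differentiation for brevity.

```python
import numpy as np
from scipy.optimize import minimize

def fit_log_linear(Feat, tf, p_coll, alpha=0.5, beta=0.01):
    """Feat: |V| x N feature matrix F(w); tf: counts of each word summed over the
    feedback documents (the mixture in eq. (10) does not depend on D, so the
    double sum collapses); p_coll: P(w|theta_C) as an array."""
    def neg_objective(lam):
        logits = Feat @ lam
        p_topic = np.exp(logits - logits.max())
        p_topic /= p_topic.sum()                       # equation (9)
        mix = (1 - alpha) * p_coll + alpha * p_topic
        return -(tf * np.log(mix)).sum() + beta * lam @ lam
    res = minimize(neg_objective, np.zeros(Feat.shape[1]), method="L-BFGS-B")
    logits = Feat @ res.x
    p = np.exp(logits - logits.max())
    return p / p.sum()                                 # P(w|theta_T)
```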
7.4 Soft Filtering vs. Hard Filtering
We mentioned two possible ways to use the classification results: hard filtering, which retains only the expansion terms classified as good, and soft filtering, which increases the weight of the good terms. In this section, we compare the two methods. Table 10 shows the results obtained with both. In the table, the superscripts M, R and H indicate that the improvement over MIX, REL and HARD respectively is statistically significant at p<0.05.

Model       AP         WSJ     Disk4&5
MIX         ...        ...     ...
MIX+HARD    ...(M)     ...     ...(M)
MIX+SOFT    ...(M,H)   ...     ...(M,H)
REL         ...        ...     ...
REL+HARD    ...        ...     ...
REL+SOFT    ...(R,H)   ...     ...(R)

Table 10. Soft filtering vs. hard filtering

From this table, we see that both hard and soft filtering improve effectiveness. The improvements with hard filtering are smaller but steady across the three collections; however, only its improvement over the MIX model on the AP and Disk4&5 data is statistically significant. In comparison, the soft filtering method performs much better. Our explanation is that, since the classification accuracy is far from perfect (less than 70%, as shown in Table 3), some top-ranked good expansion terms, which could improve performance significantly, may be removed by hard filtering. With soft filtering, even if top-ranked good terms are misclassified, we only reduce their relative weight in the final query model rather than removing them, so these expansion terms can still contribute to performance. In other words, soft filtering is less affected by classification errors.

7.5 Reducing Query Traffic
A critical aspect of query expansion is that, as more terms are added to the query, the query traffic, i.e. the time needed to evaluate the query, grows. In the previous sections, for comparison with previous methods, we used 80 expansion terms. In practice, this number can be too large. In this section, we examine the possibility of further reducing the number of expansion terms. In this experiment, after re-weighting with soft filtering, instead of keeping 80 expansion terms we select only the top 10. These terms are used to construct a small query topic model P(w|\theta_T), which is interpolated with the original query model as before. Table 11 shows the results using the mixture model.

Model      AP     WSJ    Disk4&5
MIX+SOFT   ...    ...    ...

Table 11. Soft filtering with 10 terms

As expected, the effectiveness with 10 expansion terms is lower than with 80. However, we still obtain much higher effectiveness than the traditional language model LM, and all the improvements are statistically significant. The results with 10 expansion terms also compare favorably to the mixture model with 80 expansion terms: on both the AP and Disk4&5 collections, the effectiveness is higher than that of the mixture model, and on WSJ it is very close. This experiment shows that we can reduce the number of expansion terms and still greatly increase retrieval effectiveness even with a reasonably small number. This observation allows us to keep query traffic within an acceptable range, making the method more feasible for search engines.

8. Conclusion
Pseudo-relevance feedback, which adds terms extracted from the feedback documents to the query, is an effective method for improving the query representation and the retrieval effectiveness. Its basic assumption is that most strong terms in the feedback documents are useful for IR. In this paper, we re-examined this hypothesis on three test collections and showed that the expansion terms determined in the traditional ways are not all useful. In reality, only a small proportion of the suggested expansion terms are useful, and many others are either harmful or useless.

In addition, we showed that the traditional criteria for the selection of expansion terms, based on term distributions, are insufficient: good and bad expansion terms are not distinguishable by these distributions. Motivated by these observations, we proposed to further classify expansion terms using additional features, and to select the expansion terms directly according to their possible impact on retrieval effectiveness. This method differs from the existing ones, which often rely on criteria that do not always correlate with retrieval effectiveness. Our experiments on three TREC collections showed that the expansion terms selected with our method are significantly better than the traditional expansion terms. We also showed that it is possible to limit the query traffic by controlling the number of expansion terms, and that this still leads to quite large improvements in retrieval effectiveness.

This study shows the importance of examining the crucial question of the usefulness of expansion terms before the terms are used. The method we propose also provides a general framework for integrating multiple sources of evidence. It suggests several interesting research avenues for future investigation. The results we obtained with term classification are much lower than those with the oracle expansion terms, which means there is still much room for improvement; in particular, improvements in classification quality could directly translate into improvements in retrieval effectiveness. Such improvements could be obtained by integrating more useful features: in this paper, we limited our investigation to a few often-used features, and more discriminative features can be investigated in the future.

REFERENCES
[1] Bai, J., Nie, J., Bouchard, H. and Cao, G. Using query contexts in information retrieval. In Proceedings of SIGIR 2007, Amsterdam, Netherlands, 2007.
[2] Bishop, C. Pattern recognition and machine learning. Springer, 2006.
[3] Dempster, A., Laird, N. and Rubin, D. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1-38, 1977.
[4] Gao, J., Qi, H., Xia, X. and Nie, J. Linear discriminant model for information retrieval. In Proceedings of SIGIR 2005, 2005.
[5] Hsu, C., Chang, C. and Lin, C. A practical guide to support vector classification. Technical Report, National Taiwan University.
[6] Joachims, T. Text categorization with support vector machines: learning with many relevant features. In ECML, pp. 137-142, 1998.
[7] Kwok, K.L., Grunfeld, L. and Chan, K. TREC-8 ad-hoc, query and filtering track experiments using PIRCS. In Proceedings of TREC-8.
[8] Lavrenko, V. and Croft, B. Relevance-based language models. In Proceedings of SIGIR 2001, pp. 120-128, 2001.
[9] Metzler, D. and Croft, B. Latent concept expansion using Markov random fields. In Proceedings of SIGIR 2007.
[10] Nocedal, J. and Wright, S. Numerical optimization. Springer.
[11] Peat, H.J. and Willett, P. The limitations of term co-occurrence data for query expansion in document retrieval systems. JASIS, 42(5), 1991.
[12] Platt, J. Probabilities for SV machines. In Advances in Large Margin Classifiers, pp. 61-74, MIT Press, Cambridge, MA.
[13] Robertson, S. and Sparck Jones, K. Relevance weighting of search terms. JASIS, 27:129-146, 1976.
[14] Robertson, S.E. On term selection for query expansion. Journal of Documentation, 46(4), 1990.
[15] Rocchio, J. Relevance feedback in information retrieval. In The SMART Retrieval System: Experiments in Automatic Document Processing, 1971.
[16] Smeaton, A.F. and Van Rijsbergen, C.J. The retrieval effects of query expansion on a feedback document retrieval system. Computer Journal, 26(3), 1983.
[17] Strohman, T., Metzler, D., Turtle, H. and Croft, B. Indri: a language-model based search engine for complex queries. In Proceedings of the International Conference on Intelligence Analysis, 2004.
[18] Tao, T. and Zhai, C. An exploration of proximity measures in information retrieval. In Proceedings of SIGIR 2007.
[19] Tao, T. and Zhai, C. Regularized estimation of mixture models for robust pseudo-relevance feedback. In Proceedings of SIGIR 2006.
[20] Vapnik, V. Statistical Learning Theory. New York: Wiley, 1998.
[21] Xu, J. and Croft, B. Query expansion using local and global document analysis. In Proceedings of SIGIR 1996, pp. 4-11, 1996.
[22] Zhai, C. and Lafferty, J. Model-based feedback in the KL-divergence retrieval model. In CIKM, 2001a.
[23] Zhai, C. and Lafferty, J. A study of smoothing methods for language models applied to ad hoc information retrieval. In Proceedings of SIGIR 2001, 2001b.
[24] Zhang, Y. and Callan, J. The bias problem and language models in adaptive filtering. In Proceedings of TREC, pp. 78-83, 2001.


More information

CS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015

CS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015 CS 3710: Vsual Recognton Classfcaton and Detecton Adrana Kovashka Department of Computer Scence January 13, 2015 Plan for Today Vsual recognton bascs part 2: Classfcaton and detecton Adrana s research

More information

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Chapter 6. Supplemental Text Material

Chapter 6. Supplemental Text Material Chapter 6. Supplemental Text Materal S6-. actor Effect Estmates are Least Squares Estmates We have gven heurstc or ntutve explanatons of how the estmates of the factor effects are obtaned n the textboo.

More information

Pattern Classification

Pattern Classification Pattern Classfcaton All materals n these sldes ere taken from Pattern Classfcaton (nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John Wley & Sons, 000 th the permsson of the authors and the publsher

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

Support Vector Machines

Support Vector Machines /14/018 Separatng boundary, defned by w Support Vector Machnes CISC 5800 Professor Danel Leeds Separatng hyperplane splts class 0 and class 1 Plane s defned by lne w perpendcular to plan Is data pont x

More information

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

Kernels in Support Vector Machines. Based on lectures of Martin Law, University of Michigan

Kernels in Support Vector Machines. Based on lectures of Martin Law, University of Michigan Kernels n Support Vector Machnes Based on lectures of Martn Law, Unversty of Mchgan Non Lnear separable problems AND OR NOT() The XOR problem cannot be solved wth a perceptron. XOR Per Lug Martell - Systems

More information

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs

More information

Uncertainty in measurements of power and energy on power networks

Uncertainty in measurements of power and energy on power networks Uncertanty n measurements of power and energy on power networks E. Manov, N. Kolev Department of Measurement and Instrumentaton, Techncal Unversty Sofa, bul. Klment Ohrdsk No8, bl., 000 Sofa, Bulgara Tel./fax:

More information

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of

More information

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline Outlne Bayesan Networks: Maxmum Lkelhood Estmaton and Tree Structure Learnng Huzhen Yu janey.yu@cs.helsnk.f Dept. Computer Scence, Unv. of Helsnk Probablstc Models, Sprng, 200 Notces: I corrected a number

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

MAXIMUM A POSTERIORI TRANSDUCTION

MAXIMUM A POSTERIORI TRANSDUCTION MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,

More information

Learning from Data 1 Naive Bayes

Learning from Data 1 Naive Bayes Learnng from Data 1 Nave Bayes Davd Barber dbarber@anc.ed.ac.uk course page : http://anc.ed.ac.uk/ dbarber/lfd1/lfd1.html c Davd Barber 2001, 2002 1 Learnng from Data 1 : c Davd Barber 2001,2002 2 1 Why

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

Limited Dependent Variables

Limited Dependent Variables Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages

More information

On the correction of the h-index for career length

On the correction of the h-index for career length 1 On the correcton of the h-ndex for career length by L. Egghe Unverstet Hasselt (UHasselt), Campus Depenbeek, Agoralaan, B-3590 Depenbeek, Belgum 1 and Unverstet Antwerpen (UA), IBW, Stadscampus, Venusstraat

More information

9.913 Pattern Recognition for Vision. Class IV Part I Bayesian Decision Theory Yuri Ivanov

9.913 Pattern Recognition for Vision. Class IV Part I Bayesian Decision Theory Yuri Ivanov 9.93 Class IV Part I Bayesan Decson Theory Yur Ivanov TOC Roadmap to Machne Learnng Bayesan Decson Makng Mnmum Error Rate Decsons Mnmum Rsk Decsons Mnmax Crteron Operatng Characterstcs Notaton x - scalar

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2) 1/16 MATH 829: Introducton to Data Mnng and Analyss The EM algorthm (part 2) Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 20, 2016 Recall 2/16 We are gven ndependent observatons

More information

Unified Subspace Analysis for Face Recognition

Unified Subspace Analysis for Face Recognition Unfed Subspace Analyss for Face Recognton Xaogang Wang and Xaoou Tang Department of Informaton Engneerng The Chnese Unversty of Hong Kong Shatn, Hong Kong {xgwang, xtang}@e.cuhk.edu.hk Abstract PCA, LDA

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata TC XVII IMEKO World Congress Metrology n the 3rd Mllennum June 7, 3,

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

Pop-Click Noise Detection Using Inter-Frame Correlation for Improved Portable Auditory Sensing

Pop-Click Noise Detection Using Inter-Frame Correlation for Improved Portable Auditory Sensing Advanced Scence and Technology Letters, pp.164-168 http://dx.do.org/10.14257/astl.2013 Pop-Clc Nose Detecton Usng Inter-Frame Correlaton for Improved Portable Audtory Sensng Dong Yun Lee, Kwang Myung Jeon,

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours UNIVERSITY OF TORONTO Faculty of Arts and Scence December 005 Examnatons STA47HF/STA005HF Duraton - hours AIDS ALLOWED: (to be suppled by the student) Non-programmable calculator One handwrtten 8.5'' x

More information

Support Vector Machines

Support Vector Machines Separatng boundary, defned by w Support Vector Machnes CISC 5800 Professor Danel Leeds Separatng hyperplane splts class 0 and class 1 Plane s defned by lne w perpendcular to plan Is data pont x n class

More information

Online Appendix to: Axiomatization and measurement of Quasi-hyperbolic Discounting

Online Appendix to: Axiomatization and measurement of Quasi-hyperbolic Discounting Onlne Appendx to: Axomatzaton and measurement of Quas-hyperbolc Dscountng José Lus Montel Olea Tomasz Strzaleck 1 Sample Selecton As dscussed before our ntal sample conssts of two groups of subjects. Group

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

Lecture 4. Instructor: Haipeng Luo

Lecture 4. Instructor: Haipeng Luo Lecture 4 Instructor: Hapeng Luo In the followng lectures, we focus on the expert problem and study more adaptve algorthms. Although Hedge s proven to be worst-case optmal, one may wonder how well t would

More information

4DVAR, according to the name, is a four-dimensional variational method.

4DVAR, according to the name, is a four-dimensional variational method. 4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The

More information

Gaussian Mixture Models

Gaussian Mixture Models Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous

More information

Natural Images, Gaussian Mixtures and Dead Leaves Supplementary Material

Natural Images, Gaussian Mixtures and Dead Leaves Supplementary Material Natural Images, Gaussan Mxtures and Dead Leaves Supplementary Materal Danel Zoran Interdscplnary Center for Neural Computaton Hebrew Unversty of Jerusalem Israel http://www.cs.huj.ac.l/ danez Yar Wess

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

Classification as a Regression Problem

Classification as a Regression Problem Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class

More information

FREQUENCY DISTRIBUTIONS Page 1 of The idea of a frequency distribution for sets of observations will be introduced,

FREQUENCY DISTRIBUTIONS Page 1 of The idea of a frequency distribution for sets of observations will be introduced, FREQUENCY DISTRIBUTIONS Page 1 of 6 I. Introducton 1. The dea of a frequency dstrbuton for sets of observatons wll be ntroduced, together wth some of the mechancs for constructng dstrbutons of data. Then

More information

Large-Margin HMM Estimation for Speech Recognition

Large-Margin HMM Estimation for Speech Recognition Large-Margn HMM Estmaton for Speech Recognton Prof. Hu Jang Department of Computer Scence and Engneerng York Unversty, Toronto, Ont. M3J 1P3, CANADA Emal: hj@cs.yorku.ca Ths s a jont work wth Chao-Jun

More information

A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS. Dr. Derald E. Wentzien, Wesley College, (302) ,

A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS. Dr. Derald E. Wentzien, Wesley College, (302) , A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS Dr. Derald E. Wentzen, Wesley College, (302) 736-2574, wentzde@wesley.edu ABSTRACT A lnear programmng model s developed and used to compare

More information

MULTISPECTRAL IMAGE CLASSIFICATION USING BACK-PROPAGATION NEURAL NETWORK IN PCA DOMAIN

MULTISPECTRAL IMAGE CLASSIFICATION USING BACK-PROPAGATION NEURAL NETWORK IN PCA DOMAIN MULTISPECTRAL IMAGE CLASSIFICATION USING BACK-PROPAGATION NEURAL NETWORK IN PCA DOMAIN S. Chtwong, S. Wtthayapradt, S. Intajag, and F. Cheevasuvt Faculty of Engneerng, Kng Mongkut s Insttute of Technology

More information

Support Vector Machines CS434

Support Vector Machines CS434 Support Vector Machnes CS434 Lnear Separators Many lnear separators exst that perfectly classfy all tranng examples Whch of the lnear separators s the best? Intuton of Margn Consder ponts A, B, and C We

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

STATISTICS QUESTIONS. Step by Step Solutions.

STATISTICS QUESTIONS. Step by Step Solutions. STATISTICS QUESTIONS Step by Step Solutons www.mathcracker.com 9//016 Problem 1: A researcher s nterested n the effects of famly sze on delnquency for a group of offenders and examnes famles wth one to

More information

Lecture 2: Prelude to the big shrink

Lecture 2: Prelude to the big shrink Lecture 2: Prelude to the bg shrnk Last tme A slght detour wth vsualzaton tools (hey, t was the frst day... why not start out wth somethng pretty to look at?) Then, we consdered a smple 120a-style regresson

More information

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results. Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson

More information

Bayesian Learning. Smart Home Health Analytics Spring Nirmalya Roy Department of Information Systems University of Maryland Baltimore County

Bayesian Learning. Smart Home Health Analytics Spring Nirmalya Roy Department of Information Systems University of Maryland Baltimore County Smart Home Health Analytcs Sprng 2018 Bayesan Learnng Nrmalya Roy Department of Informaton Systems Unversty of Maryland Baltmore ounty www.umbc.edu Bayesan Learnng ombnes pror knowledge wth evdence to

More information