A Probabilistic Multimedia Retrieval Model and Its Evaluation


EURASIP Journal on Applied Signal Processing 2003:2, 186–198
© 2003 Hindawi Publishing Corporation

Thijs Westerveld
National Research Institute for Mathematics and Computer Science (CWI), P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
Email: thijs@cwi.nl

Arjen P. de Vries
National Research Institute for Mathematics and Computer Science (CWI), P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
Email: arjen@cwi.nl

Alex van Ballegooij
National Research Institute for Mathematics and Computer Science (CWI), P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
Email: alexb@cwi.nl

Franciska de Jong
University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
Email: fdejong@cs.utwente.nl

Djoerd Hiemstra
University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
Email: hiemstra@cs.utwente.nl

Received 2 March 2002 and in revised form November 2002

We present a probabilistic model for the retrieval of multimodal documents. The model is based on Bayesian decision theory and combines models for text-based search with models for visual search. The textual model is based on the language modelling approach to text retrieval, and the visual information is modelled as a mixture of Gaussian densities. Both models have proved successful on various standard retrieval tasks. We evaluate the multimodal model on the search task of TREC's video track. We found that the disclosure of video material based on visual information only is still too difficult. Even with purely visual information needs, text-based retrieval still outperforms visual approaches. The probabilistic model is useful for text, visual, and multimedia retrieval. Unfortunately, simplifying assumptions that reduce its computational complexity degrade retrieval effectiveness. Regarding the question whether the model can effectively combine information from different modalities, we conclude that whenever both modalities yield reasonable scores, a combined run outperforms the individual runs.

Keywords and phrases: multimedia retrieval, evaluation, probabilistic models, Gaussian mixture models, language models.

1. INTRODUCTION

Both image analysis and video motion processing have been unable to meet the requirements for disclosing the content of large-scale unstructured video archives. There appear to be two major unsolved problems in the indexing and retrieval of video material on the basis of these technologies, namely, (a) image and video processing is still far away from understanding the content of a picture in the sense of a knowledge-based understanding, and (b) there is no effective query language (in the wider sense) for searching image and video databases. Unlike the target content in the field of text retrieval, the content of video archives is hard to capture at the conceptual level. An increasing number of developers that accept this analysis of the state of the art in the field have started to use human language as the media interlingua, making the assumption that, as long as there is no possibility to carry out both a broad-scale recognition of visual objects and an automatic mapping from such objects to linguistic representations, the detailed content of video material is best disclosed through the linguistic content (text) that may be associated with the images: speech transcripts,

manually generated annotations, subtitles, captions, and so on [1]. Since the recent advances in automatic speech recognition, the potential role of speech transcripts in improving the disclosure of multimedia archives has especially been given a lot of attention. One of the insights gained by these investigations is that, for the purpose of indexing and retrieval, perfect word recognition is not an indispensable condition: not every word will have to make it into the index, relevant words are likely to occur more than once, and not every expression in the index is likely to be queried. Research into the differences between text retrieval and spoken document retrieval indicates that, given the current level of performance of information retrieval techniques, recognition errors do not add new problems to the retrieval task [2, 3].

The limitations inherent in the deployment of language features only have already led to several attempts to deal with the requirements of video retrieval through a closer integration of human language technology and image processing. The notions of multimodal and, even more ambitious, cross-modal retrieval have come into use to refer to the exploitation of the analysis of a variety of feature types in representing and indexing aspects of video documents [4, 5, 6, 7, 8, 9]. As indicated, many useful tools and techniques have become available from the various research areas that have contributed to the domain of multimedia retrieval, but the integration of automatically generated multimodal metadata is most often done in an ad hoc manner. The various information modalities that play a role in video documents are each handled by different tools. How the various analyses affect the retrieval performance is hard to establish, and it is impossible to give an explanation of performance results in terms of a formal retrieval model.

This paper describes an approach which employs both textual and image features and represents them in terms of one uniform theoretical framework. The output from various feature extraction tools is represented in probabilistic models based on Bayesian decision theory, and the resulting model is a transparent combination of two similar models: one for textual features, based on language models for text and speech retrieval [10], and the other for image features, based on a mixture of Gaussian densities [11]. Initial deployment of the approach within the search tasks for the video retrieval tracks in TREC-2001 [12] and TREC-2002 [13] has demonstrated the possibility of using this model in retrieval experiments for unstructured video content. Additional experiments have taken place for smaller test collections.

Section 2 of this paper describes the general probabilistic retrieval model and its textual (Section 2.1) and visual constituents (Section 2.2). Section 3 presents the experimental setup, followed by a number of experimental results to evaluate the effectiveness of the retrieval model. Finally, Section 4 summarises our main conclusions.

2. PROBABILISTIC RETRIEVAL MODEL

If we reformulate the information retrieval problem as one of pattern classification, the goal is to find the class to which the query belongs. Let Ω = {ω_1, ω_2, ..., ω_M} be the set of classes underlying our document collection and Q be a query representation. Using the optimal Bayes or maximum a posteriori classifier, we can then find the class ω*, with minimal probability of classification error,

\omega^\ast = \arg\max_i P(\omega_i \mid Q).  (1)

In a retrieval setting, the best strategy is to rank classes by increasing probability of classification error. When no classification is available, we can simply let each document be a separate class.
It is hard to estimate (1) directly; therefore, we reverse the probabilities using Bayes' rule:

\omega^\ast = \arg\max_i \frac{P(Q \mid \omega_i) P(\omega_i)}{P(Q)} = \arg\max_i P(Q \mid \omega_i) P(\omega_i).  (2)

If the a priori probabilities of all classes are equal (i.e., P(ω_i) is uniform), the maximum a posteriori classifier (2) reduces to the maximum likelihood classifier, which is approximated by the Kullback-Leibler (KL) divergence between query model and class model:

\omega^\ast = \arg\min_i \mathrm{KL}\bigl[ P_{\omega_q}(x) \,\|\, P_{\omega_i}(x) \bigr].  (3)

The KL divergence measures the amount of information there is to discriminate one model from another. The best matching document is the document with the model that is hardest to discriminate from the query model. Figure 1 illustrates the retrieval framework. We build models for queries and documents and compare them using the KL divergence between the models. The visual part is modelled as a mixture of Gaussians (see Section 2.2); for the textual part, we use the language modelling approach, in which documents are treated as bags of words (see Section 2.1). (The query model is here, like the document models, represented as a Gaussian mixture model, but it can also be represented as a bag of blocks; see Section 2.2.)

The KL divergence between query model and document model is defined as follows:

\mathrm{KL}\bigl[ P_{\omega_q}(x) \,\|\, P_{\omega_i}(x) \bigr] = \int P(x \mid \omega_q) \log \frac{P(x \mid \omega_q)}{P(x \mid \omega_i)} \, dx = \int P(x \mid \omega_q) \log P(x \mid \omega_q) \, dx - \int P(x \mid \omega_q) \log P(x \mid \omega_i) \, dx.  (4)

The first integral is independent of ω_i and can be ignored; thus,

\omega^\ast = \arg\min_i \mathrm{KL}\bigl[ P_{\omega_q}(x) \,\|\, P_{\omega_i}(x) \bigr] = \arg\max_i \int P(x \mid \omega_q) \log P(x \mid \omega_i) \, dx.  (5)
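To make the reduction from (3) to (5) concrete, the following is a minimal sketch (not taken from the paper) of ranking by the remaining cross-entropy term for a discrete feature space; the document models are assumed to be already smoothed so that no zero probabilities occur.

```python
import numpy as np

def rank_by_cross_entropy(query_model, doc_models):
    """Rank documents by the query-weighted log-likelihood of their models,
    i.e. by the only KL term that depends on the document (cf. (4)-(5)).
    query_model: probabilities over a discrete feature space (sums to 1).
    doc_models:  one smoothed probability vector per document (no zeros).
    Returns document indices ordered from best to worst match."""
    q = np.asarray(query_model, dtype=float)
    scores = [float(np.sum(q * np.log(np.asarray(p, dtype=float))))
              for p in doc_models]
    return list(np.argsort(scores)[::-1])

# Toy example over a four-symbol feature space: document 0 is closest to the
# query model and should therefore rank first.
query = [0.5, 0.3, 0.1, 0.1]
docs = [[0.4, 0.3, 0.2, 0.1],
        [0.1, 0.1, 0.4, 0.4],
        [0.25, 0.25, 0.25, 0.25]]
print(rank_by_cross_entropy(query, docs))
```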

Figure 1: Retrieval framework: the image is represented as a Gaussian mixture and the text as a language model ("bags of words"); query and document models are compared using the KL divergence.

When working with multimodal material like video, the documents in our collection contain features in different modalities. This means that the classes underlying our document collection may contain different feature subclasses. The class conditional densities can thus be described as mixtures of feature densities

P(x \mid \omega_i) = \sum_{f=1}^{F} P(x \mid \omega_{i,f}) P(\omega_{i,f}),  (6)

where F is the number of underlying feature subclasses, P(ω_{i,f}) is the probability of subclass f of class ω_i, and P(x | ω_{i,f}) is the subclass conditional density for this subclass. When we draw a random sample from class ω_i, we first select a feature subclass according to P(ω_{i,f}) and then draw a sample from this subclass using P(x | ω_{i,f}).

To arrive at a generic expression for similarity between mixture models, Vasconcelos [11] partitions the feature space into disjoint subspaces, where each point in the feature space is assigned to the subspace corresponding to the most probable feature subclass:

\chi_k = \bigl\{ x : P(\omega_{i,k} \mid x) \geq P(\omega_{i,l} \mid x), \ \forall l \neq k \bigr\}.  (7)

Using this partition, (5) can be rewritten as (the proof is given in [11])

\int P(x \mid \omega_q) \log P(x \mid \omega_i) \, dx = \sum_{f,k} P(\omega_{q,f}) \Bigl[ \log P(\omega_{i,k}) + \int_{\chi_k} P(x \mid \omega_{q,f}, x \in \chi_k) \log \frac{P(x \mid \omega_{i,k})}{P(\omega_{i,k} \mid x)} \, dx \Bigr] \int_{\chi_k} P(x \mid \omega_{q,f}) \, dx.  (8)

When the subspaces χ_k form the same hard partitioning of the feature space for all query and document models, that is, when

P(\omega_{i,k} \mid x) = P(\omega_{q,k} \mid x) = \begin{cases} 1, & \text{if } x \in \chi_k, \\ 0, & \text{otherwise}, \end{cases}  (9)

then

\int_{\chi_k} P(x \mid \omega_{q,f}) \, dx = \begin{cases} 1, & \text{if } f = k, \\ 0, & \text{otherwise}, \end{cases} \qquad P(\omega_{i,k} \mid x) = 1, \ \forall x \in \chi_k.  (10)

This reduces (8) to

\int P(x \mid \omega_q) \log P(x \mid \omega_i) \, dx = \sum_{f} P(\omega_{q,f}) \log P(\omega_{i,f}) + \sum_{f} P(\omega_{q,f}) \int_{\chi_f} P(x \mid \omega_{q,f}) \log P(x \mid \omega_{i,f}) \, dx.  (11)

This ranking formula is general and can, in principle, be used for any kind of multimodal document collection. In the rest of the paper, we limit ourselves to video collections represented by still frames and speech-recognized transcripts. The classes underlying our collection are defined through the shots in the videos. Furthermore, we assume that we have two feature subclasses, namely, a subclass generating textual features and another generating visual features. We can now partition the feature space into two distinct subspaces for textual and visual features: χ_t and χ_v. This partitioning is hard, that is, a feature can be textual or visual but never both. Our ranking formula becomes

\omega^\ast = \arg\max_i \int P(x \mid \omega_q) \log P(x \mid \omega_i) \, dx = \arg\max_i \Bigl[ P(\omega_{q,t}) \log P(\omega_{i,t}) + P(\omega_{q,t}) \int_{\chi_t} P(x \mid \omega_{q,t}) \log P(x \mid \omega_{i,t}) \, dx + P(\omega_{q,v}) \log P(\omega_{i,v}) + P(\omega_{q,v}) \int_{\chi_v} P(x \mid \omega_{q,v}) \log P(x \mid \omega_{i,v}) \, dx \Bigr].  (12)

The mixture probabilities for the textual and visual models, P(ω_{i,t}) and P(ω_{i,v}), might be derived from background knowledge about the class ω_i. If, for example, we know that ω_i is a class from a news broadcast, we might assign a higher value to P(ω_{i,t}), since the probability that there is text that helps us in finding relevant information is relatively high. On the other hand, if ω_i is from a documentary or a silent movie, we might gain less information from the text from ω_i and assign a lower value to P(ω_{i,t}). At the moment, however, we have no background information; therefore, we do not distinguish between classes and use uniform mixture probabilities. This means that the first and third terms from (12) are independent of ω_i and can be ignored. Our final (general) ranking formula becomes

\omega^\ast = \arg\max_i \Bigl[ P(t) \int_{\chi_t} P(x \mid \omega_{q,t}) \log P(x \mid \omega_{i,t}) \, dx + P(v) \int_{\chi_v} P(x \mid \omega_{q,v}) \log P(x \mid \omega_{i,v}) \, dx \Bigr],  (13)

where P(t) and P(v) are the class-independent probabilities of drawing textual and visual features, respectively.

2.1. Text retrieval

For the textual part of our ranking function, we use statistical language models. A famous application of these models is Shannon's illustration of the implications of coding and information theory using models of letter sequences and word sequences [14]. In the 1970s, statistical language models were developed as a general natural language processing tool, first for automatic speech recognition [15] and later also for, for example, part-of-speech tagging [16] and machine translation [17]. Recently, statistical language models have been suggested for information retrieval by Ponte and Croft [18], Hiemstra [19], and Miller et al. [20].

The language modelling approach to information retrieval defines a simple unigram language model for each document in a collection. For each document ω_{i,t}, the language model defines the probability P(x_{t,1}, ..., x_{t,N_t} | ω_{i,t}) of a sequence of N_t textual features (i.e., words) x_{t,1}, ..., x_{t,N_t}, and the documents are ranked by that probability. The standard language modelling approach to information retrieval uses a linear interpolation of the document model P(x_{t,j} | ω_i) with a general collection model P(x_{t,j}) [19, 20, 21, 22]. As these models operate on discrete signals, the integral from (13) can be replaced by a sum. Furthermore, if we use the empirical distribution of the query as the query model, then the standard textual part of (13) is

\omega_t^\ast = \arg\max_i \frac{1}{N_t} \sum_{j=1}^{N_t} \log \bigl[ \lambda P(x_{t,j} \mid \omega_i) + (1 - \lambda) P(x_{t,j}) \bigr].  (14)

The linear combination needs a smoothing parameter λ, which is set empirically on some test collection or, alternatively, estimated by the expectation-maximisation (EM) algorithm [23] on a test collection.
The probability of drawing textual feature x_{t,j} from document ω_i, P(x_{t,j} | ω_i), is computed as follows: if the document contains 10 terms in total and the term x_{t,j} occurs 2 times, this probability would simply be 2/10 = 0.2. Similarly, P(x_{t,j}) is the probability of drawing x_{t,j} from the entire document collection.

Using the statistical language modelling approach for video retrieval, we would like to exploit the hierarchical data model of video, in which a video is subdivided into scenes, which are subdivided into shots, which are, in turn, subdivided into frames. Statistical language models are particularly well suited for modelling such complex representations of the data. We can simply extend the mixture to include the different levels of the hierarchy, with models for shots and scenes (we assume that each shot is a separate class and replace ω_i with Shot):

\mathrm{Shot}^\ast = \arg\max_{\mathrm{Shot}} \frac{1}{N_t} \sum_{j=1}^{N_t} \log \bigl[ \lambda_{\mathrm{Shot}} P(x_{t,j} \mid \mathrm{Shot}) + \lambda_{\mathrm{Scene}} P(x_{t,j} \mid \mathrm{Scene}) + \lambda_{\mathrm{Coll}} P(x_{t,j}) \bigr], \quad \text{with } \lambda_{\mathrm{Coll}} = 1 - \lambda_{\mathrm{Shot}} - \lambda_{\mathrm{Scene}}.  (15)

The main idea behind this approach is that a good shot contains the query terms and is part of a scene having more occurrences of the query terms. Also, by including scenes in the ranking function, we hope to retrieve the shot of interest even if the video's speech describes the shot just before it begins or just after it is finished. Depending on the information need of the user, we might use a similar strategy to rank scenes or complete videos instead of shots; that is, the best scene might be a scene that contains a shot in which the query terms (co-)occur.
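As an illustration of equation (15), the sketch below scores one shot against a query, under the assumption that shots, scenes, and the collection are available as plain lists of stemmed terms; the function names and the default λ values are illustrative, not the paper's implementation.

```python
import math
from collections import Counter

def term_prob(term, bag):
    """Empirical probability of `term` in a bag of terms."""
    return Counter(bag)[term] / len(bag) if bag else 0.0

def shot_score(query_terms, shot_terms, scene_terms, collection_terms,
               lambda_shot=0.3, lambda_scene=0.2):
    """Score one shot with the three-level mixture of equation (15);
    lambda_coll follows from lambda_coll = 1 - lambda_shot - lambda_scene."""
    lambda_coll = 1.0 - lambda_shot - lambda_scene
    total = 0.0
    for t in query_terms:
        p = (lambda_shot * term_prob(t, shot_terms)
             + lambda_scene * term_prob(t, scene_terms)
             + lambda_coll * term_prob(t, collection_terms))
        # A query term absent from the whole collection contributes -inf,
        # effectively ruling the shot out.
        total += math.log(p) if p > 0.0 else float("-inf")
    return total / len(query_terms) if query_terms else 0.0

# Shots are ranked by decreasing shot_score; the scene model is built from the
# transcripts of a window of neighbouring shots, and the collection model from
# all transcripts in the collection.
```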

2.2. Image retrieval

In order to specialise the visual part of our ranking formula (13), we need to estimate the class conditional densities for the visual features P(x_v | ω_i). We follow Vasconcelos [11] and model them using Gaussian mixture models. The idea behind modelling shots as a mixture of Gaussians is that each shot contains a certain number of classes or components and that each sample from a shot (i.e., each block of 8 by 8 pixels extracted from a frame) was generated by one of these components. The class conditional densities for a Gaussian mixture model are defined as follows:

P(x_v \mid \omega_i) = \sum_{c=1}^{C} P(\theta_{i,c}) \, \mathcal{N}\bigl(x_v, \mu_{i,c}, \Sigma_{i,c}\bigr),  (16)

where C is the number of components in the mixture model, θ_{i,c} is component c of class model ω_i, and N(x, μ, Σ) is the Gaussian density with mean vector μ and covariance matrix Σ,

\mathcal{N}(x, \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}} \, e^{-(1/2) \|x - \mu\|_\Sigma^2},  (17)

where n is the dimensionality of the feature space and

\|x - \mu\|_\Sigma^2 = (x - \mu)^T \Sigma^{-1} (x - \mu).  (18)

2.2.1. Estimating model parameters

The parameters of the models for a given shot can be estimated using the EM algorithm (looking at a single shot, we can drop the class subscripts). This algorithm iterates between estimating the a posteriori class probabilities for each sample, P(θ_c | x_v) (the E-step), and re-estimating the component parameters (μ_c, Σ_c, and P(θ_c)) based on the sample distribution (the M-step). The approach is rather general: any kind of feature vectors can be used to describe samples. Our sampling process is as follows (it is illustrated in Figure 2). First, we convert the keyframe of a shot to the YCbCr colour space. Then, we cut it into distinct blocks of 8 by 8 pixels. On these blocks, we perform the discrete cosine transform (DCT) for each of the colour channels. We now take the first 10 DCT coefficients from the Y channel and only the DC coefficient from both the Cb and the Cr channels to describe the samples. These feature vectors are then fed to the EM algorithm to find the parameters (μ_c, Σ_c, and P(θ_c)).

The EM algorithm first assigns each sample to a random component. Next, we compute the parameters (μ_c, Σ_c, and P(θ_c)) for each component, based on the samples assigned to that component. We then re-estimate the class assignments, that is, we compute the posterior probabilities P(θ_c | x) for all c. (In practice, a sample does not always belong entirely to one component. In fact, we compute means, covariances, and priors on the weighted feature vectors, where the feature vectors are weighted by their proportion of belonging to the class under consideration.) We iterate between estimating class assignments (expectation step) and estimating class parameters (maximisation step) until the algorithm converges. Figure 3 shows a query image and the component assignments after different iterations of the EM algorithm. Instead of a random initialisation, we initially assigned the left-most part of the samples to component 1, the samples in the middle to component 2, and the right-most samples to component 3. This way, it is clearly visible how the component assignments move about the image. Finally, after convergence of the EM algorithm, we describe the position in the image plane of each component as a 2D Gaussian with mean and covariance computed from the positions of the samples assigned to this component.
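The following sketch mirrors the sampling process just described, under several assumptions the text leaves open: the key frame is given as an RGB array, the RGB-to-YCbCr conversion uses the standard ITU-R BT.601 weights, and "the first 10 DCT coefficients" are read as the 10 lowest-frequency coefficients of the 8-by-8 Y-channel DCT, ordered by diagonal. It relies on SciPy for the DCT and on scikit-learn's GaussianMixture for the EM estimation; the 2D position Gaussians added after convergence are omitted, and the number of components is a free choice here.

```python
import numpy as np
from scipy.fft import dctn                   # 2-D discrete cosine transform
from sklearn.mixture import GaussianMixture  # EM estimation of the mixture

# Lowest-frequency-first ordering of the 8x8 DCT coefficients (by diagonal).
ORDER = sorted(((r, c) for r in range(8) for c in range(8)),
               key=lambda rc: (rc[0] + rc[1], rc[0]))

def rgb_to_ycbcr(img):
    """ITU-R BT.601 RGB -> YCbCr conversion (an assumption; the paper does not
    specify the exact conversion)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def block_features(img):
    """Cut the key frame into distinct 8x8 blocks and return one feature vector
    per block: 10 low-frequency DCT coefficients of Y plus the DC coefficients
    of Cb and Cr (12 dimensions in total)."""
    y, cb, cr = rgb_to_ycbcr(np.asarray(img, dtype=float))
    h, w = y.shape[0] - y.shape[0] % 8, y.shape[1] - y.shape[1] % 8
    feats = []
    for i in range(0, h, 8):
        for j in range(0, w, 8):
            dy = dctn(y[i:i + 8, j:j + 8], norm="ortho")
            dcb = dctn(cb[i:i + 8, j:j + 8], norm="ortho")
            dcr = dctn(cr[i:i + 8, j:j + 8], norm="ortho")
            feats.append([dy[r, c] for r, c in ORDER[:10]]
                         + [dcb[0, 0], dcr[0, 0]])
    return np.array(feats)

def fit_shot_model(keyframe, n_components=8):
    """EM estimation of a shot's Gaussian mixture model from its key frame."""
    return GaussianMixture(n_components=n_components, covariance_type="full",
                           max_iter=100).fit(block_features(keyframe))
```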
2.2.2. Bags of blocks

Just like in our textual approach, for the query model we can simply take the empirical distribution of the query samples. If a query image x_v consists of N_v samples, x_v = (x_{v,1}, x_{v,2}, ..., x_{v,N_v}), then P(x_{v,j} | ω_q) = 1/N_v. For the document model, we take a mixture of foreground and background probabilities, that is, the (foreground) probability of drawing a query sample from the document's Gaussian mixture model and the (background) probability of drawing it from any Gaussian mixture in the collection. In other words, the query image is viewed as a bag of blocks (BoB), and its probability is estimated as the joint probability of all its blocks. The BoB measure for query images then becomes

\omega_v^\ast = \arg\max_i \frac{1}{N_v} \sum_{j=1}^{N_v} \log \bigl[ \kappa P(x_{v,j} \mid \omega_i) + (1 - \kappa) P(x_{v,j}) \bigr],  (19)

where κ is a mixing parameter and the background probability P(x_{v,j}) can be found by marginalising over all M documents in the collection:

P(x_{v,j}) = \sum_{i=1}^{M} P(x_{v,j} \mid \omega_i) P(\omega_i).  (20)

Again, we assume uniform document priors (P(ω_i) = 1/M for all i). In text retrieval, one of the reasons for mixing the document model with a collection model is to assign nonzero probabilities to words that are not observed in a document. Smoothing is not necessary in the visual case, since the documents are modelled as mixtures of Gaussians having infinite support. Another motivation for mixing is to weight term importance: a common sample x (i.e., a sample that occurs frequently in the collection) has a relatively high probability P(x) (equal for all documents), and, therefore, P(x | ω) has only little influence on the probability estimate. In other words, common terms and common blocks influence the final ranking only marginally.
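A compact sketch of the BoB score of equation (19), assuming every document model is a fitted scikit-learn GaussianMixture (as in the previous sketch) and `query_blocks` holds the feature vectors of one query image; κ = 0.9 is only an illustrative setting.

```python
import numpy as np

def bob_scores(query_blocks, doc_models, kappa=0.9):
    """Bag-of-blocks scores of equation (19) for all documents.
    query_blocks: (n_blocks, n_features) array of one query image's samples.
    doc_models:   fitted GaussianMixture objects, one per document.
    The background density marginalises over all document models with
    uniform priors P(omega_i) = 1/M, as in equation (20)."""
    # Foreground densities P(x_j | omega_i): shape (n_docs, n_blocks).
    fg = np.array([np.exp(m.score_samples(query_blocks)) for m in doc_models])
    bg = fg.mean(axis=0)                      # background density P(x_j)
    return np.log(kappa * fg + (1.0 - kappa) * bg).mean(axis=1)

# Documents are ranked by decreasing score, e.g. np.argsort(scores)[::-1].
```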

Figure 2: Building a Gaussian mixture model from an image: split the colour channels (Y, Cb, Cr), cut out 8-by-8 samples, compute DCT coefficients, and estimate the mixture with the EM algorithm.

2.2.3. Asymptotic likelihood approximation

A disadvantage of using the BoB measure is its computational complexity. In order to rank the collection given a query, we need to compute the posterior probability P(x_v | ω_i) of each image block x_v in the query for each document ω_i in the collection.

Figure 3: Class assignments (3 classes) for the image at the top after different numbers of iterations.

For evaluating a retrieval method, this is fine, but for an interactive retrieval system, optimisation is necessary. An alternative is to represent the query image, like the document image, as a Gaussian model (instead of by its empirical distribution as a bag of blocks) and then compare these two models using the KL divergence. Yet, if we use Gaussians to model the class conditional densities of the mixture components, there is no closed-form solution for the visual part of the resulting ranking formula (13). As a solution, Vasconcelos assumes that the Gaussians are well separated and derives an approximation ignoring the overlap between the mixture components: the asymptotic likelihood approximation (ALA) [11]. Starting from (8), he arrives at

\omega_v^\ast = \arg\max_i \int_{\chi_v} P(x_v \mid \omega_q) \log P(x_v \mid \omega_i) \, dx_v \approx \arg\max_i \mathrm{ALA}\bigl[ P_q(x_v) \,\|\, P_i(x_v) \bigr] = \arg\max_i \sum_c P(\theta_{q,c}) \Bigl\{ \log P(\theta_{i,\alpha(c)}) + \log \mathcal{N}\bigl( \mu_{q,c}, \mu_{i,\alpha(c)}, \Sigma_{i,\alpha(c)} \bigr) - \tfrac{1}{2} \mathrm{trace}\bigl[ \Sigma_{i,\alpha(c)}^{-1} \Sigma_{q,c} \bigr] \Bigr\}, \quad \text{where } \alpha(c) = k \iff \| \mu_{q,c} - \mu_{i,k} \|_{\Sigma_{i,k}} < \| \mu_{q,c} - \mu_{i,l} \|_{\Sigma_{i,l}}, \ \forall l \neq k.  (21)

In this equation, subscripts indicate, respectively, classes and components (e.g., μ_{i,c} is the mean for component θ_c of class ω_i).
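A sketch of the ALA score of equation (21) between a query model and a document model, again assuming both are scikit-learn GaussianMixture objects with full covariance matrices; this is an illustration of the formula, not the authors' code.

```python
import numpy as np
from scipy.stats import multivariate_normal

def ala_score(query_gmm, doc_gmm):
    """Asymptotic likelihood approximation of equation (21) between two fitted
    Gaussian mixture models (full covariance matrices assumed)."""
    score = 0.0
    for w_q, mu_q, cov_q in zip(query_gmm.weights_, query_gmm.means_,
                                query_gmm.covariances_):
        # alpha(c): the document component whose Mahalanobis distance to the
        # query component mean (under the document covariance) is smallest.
        dists = [float((mu_q - mu) @ np.linalg.inv(cov) @ (mu_q - mu))
                 for mu, cov in zip(doc_gmm.means_, doc_gmm.covariances_)]
        k = int(np.argmin(dists))
        cov_d = doc_gmm.covariances_[k]
        score += w_q * (np.log(doc_gmm.weights_[k])
                        + multivariate_normal.logpdf(mu_q,
                                                     mean=doc_gmm.means_[k],
                                                     cov=cov_d)
                        - 0.5 * np.trace(np.linalg.inv(cov_d) @ cov_q))
    return score

# Ranking: compute ala_score(query_gmm, doc_gmm) for every document model and
# sort the documents by decreasing score.
```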

2.4. ALA assumptions

The main assumption behind the ALA is that the Gaussians for the components θ_c within a class model ω_i have small overlap; in fact, there are two parts to this [11]. The first assumption is that each image sample is assigned to one and only one of the mixture components. The second is that samples from the support set of a single query component are all assigned to the same document component. More formally, we have the following assumptions.

Assumption 1. For each sample, the component with maximum posterior probability has posterior probability one:

\forall \omega_i, x : \ \max_k P(\theta_{i,k} \mid x) = 1.  (22)

Assumption 2. For any document ω_j, the component with maximum posterior probability is the same for all samples of the support set of a single query component θ_{q,k}:

\forall \theta_{q,k}, \omega_j \ \exists l : \ \forall x, \ P(x \mid \theta_{q,k}) > 0 \implies \arg\max_{l'} P(\theta_{j,l'} \mid x) = l.  (23)

We used Monte Carlo simulation to test these assumptions on our collection (the TREC-2002 video collection, see Section 3.1) as follows. First, we took a random document ω_i from the search collection and then a random mixture component θ_{i,k} from the mixture model of this document. We then drew 1,000 random samples from this component and, for each sample x, computed

(i) P(θ_{i,l} | x), the posterior component assignment within document i, for all components θ_{i,l};
(ii) P(θ_{j,m} | x), the posterior component assignment in a different, randomly chosen document j, for all components θ_{j,m}.

For the first measure, we simply took the maximum posterior probability for each sample. We averaged the second measure over all 1,000 samples and took the maximum over all components to approximate the proportion of samples assigned to the most probable component (remember, there should be a component that explains all samples). We repeated this process for 1,000 iterations, with different documents and components selected at random, and histogrammed the results (Figure 4).

Figure 4: Testing ALA Assumptions 1 (histogram (a)) and 2 (histogram (b)); samples x are drawn from P(x | θ_{i,k}). Histogram (a) plots max_l P(θ_{i,l} | x); histogram (b) plots max_m P(θ_{j,m} | x).

Both measures should be close to 1, the first to satisfy Assumption 1 and the second to satisfy Assumption 2. As we can see from the plots in Figure 4, the first assumption appears reasonable, but the second does not hold. (The bar at probability zero results from a truncation error in the Bayesian inversion used to compute P(θ_{j,m} | x) from a too small probability P(x | θ_{j,m}).) We investigate the effect of this observation in the retrieval experiments below.

3. EXPERIMENTS

We evaluated the model outlined above and the presented measures on the search task of the video track of the Text REtrieval Conference, TREC-2002 [13].

3.1. TREC video track

TREC is a series of workshops for large-scale evaluation of information retrieval technology [24, 25]. The goal is to test retrieval technology on realistic test collections using uniform and appropriate scoring procedures. The general procedure is as follows: (i) a set of statements of an information need (topics) is created; (ii) participants search the collection and return the top N results for each topic; (iii) returned documents are pooled and judged for relevance to the topic; (iv) systems are evaluated using the relevance judgements. The measures used in evaluation are usually precision and recall oriented. Precision and recall are defined as follows:

precision = (number of relevant shots retrieved) / (total number of shots retrieved),
recall = (number of relevant shots retrieved) / (total number of relevant shots in the collection).  (24)

The video track was introduced at TREC-2001 to evaluate content-based retrieval from digital video [12]. Here, we use the data from the TREC-2002 video track [13]. The track defines three tasks: shot boundary detection, feature detection, and general information search. The goal of the shot boundary task is to identify shot boundaries in a given video clip. In the feature detection task, we have to assign a set of predefined features to a shot, for example, indoor, outdoor, people, and speech. In the search task, the goal is to find relevant shots given a description of an information need, expressed by a multimedia topic. Both in the feature detection task and in the search task, a predefined set of shots is to be used. In our experiments, we focus on the search task. The collection to be searched in this task consists of approximately 40 hours of MPEG-1 encoded video; in addition, a set of 23 hours of training material was available. The topics consist of a textual description of the information need, accompanied by images, video fragments, and/or audio fragments illustrating what is needed. For each topic, a system could return a ranked list of video fragments. The top-ranked returned shots of each run are then pooled and judged. We report experimental results using the standard TREC measures: average precision and mean average precision (MAP). Average precision is the average of the precision values obtained after each relevant document is retrieved (when a relevant document is not retrieved at all, its precision is assumed to be 0). MAP is the mean of the average precision values over all topics.
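For reference, the two evaluation measures can be computed as follows; `ranking` is the returned list of shot identifiers for one topic and `relevant` the set of shots judged relevant (a relevant shot that is never retrieved contributes a precision of 0, as stated above).

```python
def average_precision(ranking, relevant):
    """Average precision for one topic: the mean of the precision values
    obtained after each relevant shot is retrieved; unretrieved relevant
    shots contribute 0."""
    hits, precision_sum = 0, 0.0
    for rank, shot in enumerate(ranking, start=1):
        if shot in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(per_topic_runs):
    """per_topic_runs: list of (ranking, relevant_set) pairs, one per topic."""
    return (sum(average_precision(r, rel) for r, rel in per_topic_runs)
            / len(per_topic_runs))

# Example: one of two relevant shots retrieved at rank 2 gives AP = (1/2)/2 = 0.25.
print(average_precision(["s1", "s2", "s3"], {"s2", "s9"}))
```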

Figure 5: MAP on the video search task for different values of κ.

For the textual descriptions of the shots, we used speech transcripts kindly provided by LIMSI. These transcripts were aligned to the predefined video shots. We did not have or define a semantic division of the video into scenes but defined scenes simply as overlapping windows of 5 consecutive shots. (In preliminary experiments on the TREC-2001 collection, when varying the window lengths, 5 shots were the optimum.) We removed common words from the transcripts (stopping) and stemmed all terms using the Porter stemmer [26]. For the visual description, we took keyframes from the common video shots, and we used EM to find the parameters of the Gaussian mixture models. Keyframe selection was straightforward: we simply used the middle frame from each shot as representative for the shot.

3.2. Estimating the mixture parameters

The model does not specify the value of the mixing parameters λ, λ_Shot, λ_Scene, and κ. An optimal value can only be found a posteriori by evaluating retrieval performance for different values on a test collection; a priori, we must make an educated guess at the right values. Figure 5 shows the MAP scores on the TREC-2002 video track search task for κ ranging from 0 to 1. We can see that retrieval results are insensitive to the value of the mixing parameter as long as we take both foreground and background into account. The plot has a similar shape to that found in Hiemstra's thesis for the λ parameter in the standard language model [10].

For the transcripts, we tried over thirty combinations of settings, using two sets of text queries (see also Section 3.4). For query set Tlong, this resulted in optimal settings for MAP with λ_Shot = 0.09, λ_Scene = 0.21, and λ_Coll = 0.70. Here, modelling the hierarchy in the video makes sense because shot and scene both contribute to the results in the ranking (λ_Shot and λ_Scene are larger than zero). For set Tshort, however, the optimal setting of λ_Shot was such that the resulting model is identical to the original language model. Summarising and ranking transcript units longer than shots is important, but we cannot conclude from these experiments whether modelling the hierarchy is really necessary. In all experiments, the differences between the better parameter choices are not significant, but a particularly bad choice may seriously degrade retrieval effectiveness. In the remainder of this work, we have used κ = 0.9, λ_Shot = 0.09, λ_Scene = 0.21, and λ_Coll = 0.70.

3.3. Using all or some image examples

In general, it is hard to guess what would be a good example image for a specific query. If we look for shots of the Golden Gate Bridge, we might not care from what angle the bridge was filmed, or whether the clip was filmed on a sunny or a cloudy day; visually, however, such examples may be quite different (Figure 6). If a user has presented three examples and no additional information, the best we can do is to try to find documents that describe all example images well. Unfortunately, a document may be ranked low even though it models the samples from one example image well, since it may not explain the samples from the other images.

For each topic, we computed which of the example images would have given the best results if it had been used as the only example for that topic. We compared these best example results to the full topic results, in which we used all available visual examples. The experiment was done using both the ALA and the BoB measure. In the full topic case, the set of available examples was regarded as one large bag of samples. For the ALA measure, we built one mixture model to describe all available visual examples.
For BoB, we ranked documents by their probability of generating all samples in all query images. For the single-image queries in the best example case, we built a separate mixture model from each example and used it for ALA ranking. For BoB ranking, we used all samples from the single visual example. Since it is problematic to use multiple examples in a query, we wanted to see whether it is possible to guess in advance what would be a good example for a specific topic. Therefore, for each topic, we also hand-picked a single representative from the available examples and compared these manual example results to the other two result sets.

The results for the different settings are listed in Table 1. A first thing to notice is that all scores are rather low. When we take a closer look at the topics with higher average precision scores, we see that these mainly contain examples from the search collection. In other words, we can find similar shots from within the same video, but generalisation is a problem. Comparing BoB to ALA, we see that, averaged over all topics for each set of examples, BoB outperforms ALA. For some specific topics, the ALA gives higher scores, but again these are cases with examples from within the collection. In general, the BoB approach, which uses fewer assumptions, performs better.

The fact that using the best image example outperforms the use of all examples shows that combining results from different visual examples can indeed degrade results.

Figure 6: Visual examples of the Golden Gate Bridge.

Table 1: MAP for full topics, best examples, and manual examples. (Columns: Topic; BoB and ALA scores under each of the full topic, best example, and manual example settings; one row per topic and a final row giving the overall MAP.)

Looking at the results, manually selecting good examples seems a nontrivial task, but the drop in performance is partly due to the generalisation problem. If one of the image examples happens to come from the collection, it scores high. If we fail to select that particular example, the score for the manual example run drops. Simply counting how often the manually selected example was the same as the best-performing example, we see that this was the case for 8 of the topics considered (ignoring the topics for which there is only one example and the ones for which the best example scored 0).

3.4. Using example transcripts

We took two different approaches to building textual queries from the multimedia topics. The first set of textual queries, Tshort, was constructed simply by taking the textual description from the topic. In the second set of queries, Tlong, we augmented these with the speech transcripts from the video examples available for a topic. The assumption here is that relevant shots share a vocabulary with example shots; thus, using example transcripts might improve retrieval results. In both sets of queries, we removed common words and stemmed all terms. We found that, across topics, Tlong outperformed Tshort in MAP. For detailed per-topic information, see Table 2.

3.5. Combining textual and visual runs

We combined textual and visual runs using our combined ranking formula (13). Since we had no data to estimate the parameters for mixing textual and visual information, we used P(t) = P(v) = 0.5. For the textual part, we tried both short and long queries; for the visual part, we used full queries and best-example queries. Table 2 shows the results for combinations with the BoB measure. We also experimented with combinations with the ALA measure, but we found that, in the ALA case, it is difficult to combine textual and visual scores because they are on different scales. The BoB measure is closer to the KL divergence and, on top of that, more similar to our textual approach, and thus easier to combine with the textual scores.

Table 2: Average precision per topic for textual runs, BoB runs, and combined runs. (Columns: Topic, Tshort, Tlong, BoBfull, BoBbest, BoBfull+Tshort, BoBfull+Tlong, BoBbest+Tshort, BoBbest+Tlong; one row per topic and a final row giving the MAP.)

For most of the topics, textual runs give the best results; however, for some topics, using the visual examples is useful. This is mainly the case when either the topics come from the search collection or the relevant documents are outliers in the collection. This illustrates how difficult it is to search a generic video collection using visual information only. We succeed only if the relevant documents are either highly similar to the examples provided or very dissimilar from the other documents in the collection (and, therefore, relatively similar to the query examples). When both textual and visual runs have reasonable scores, combining the runs can improve on the individual runs; however, when one of them has inferior performance, a combination only adds noise and lowers the scores.

4. CONCLUSIONS

We presented a probabilistic framework for multimodal retrieval in which textual and visual retrieval models are integrated seamlessly, and we evaluated the framework using the search task from the TREC-2002 video track.

We found that even though the topics were specifically designed for content-based retrieval and relevance was defined visually, a textual search outperforms visual search for most topics. As we have seen before [6], standard image retrieval techniques cannot readily be applied to satisfy a variety of information requests from a generic video collection. Future work has to show how incorporating different sources of additional information (e.g., contextual frames, the movement in the video, or user interaction) can help improve results.

In the text-only experiments, we saw that using the transcripts from the example videos in queries improves results. We also found that it is useful to take transcripts from surrounding shots into account to describe a shot. However, it is still unclear whether a hierarchical description of scenes and shots is necessary.

In our visual experiments, we found that the general probabilistic framework is useful for image retrieval. However, we found that one of the assumptions underlying the ALA of the KL divergence does not hold for the generic video collection we used. This was reflected in the difference in performance between the ALA and the BoB model. Unfortunately, computing the joint block probabilities in the BoB model is computationally expensive and unsuitable for an interactive retrieval system. Future work will investigate ways to speed up the process. Furthermore, we noticed generalisation problems. The visual models only gave satisfying results if the relevant documents were either highly similar to the query image(s) (i.e., the query images came from the collection) or highly dissimilar to the rest of the collection (i.e., the relevant documents were outliers in the collection).

When either textual or visual results are poor, combining them, and thus adding noise, seems to degrade the scores. However, when both modalities yield reasonable scores, a combined run outperforms the individual runs.

REFERENCES

[1] F. de Jong, J.-L. Gauvain, D. Hiemstra, and K. Netter, "Language-based multimedia information retrieval," in Proc. RIAO 2000 Content-Based Multimedia Information Access, Paris, France, April 2000.
[2] J.-L. Gauvain, L. Lamel, and G. Adda, "Transcribing broadcast news for audio and video indexing," Communications of the ACM, vol. 43, no. 2, pp. 64–70, 2000.
[3] G. J. F. Jones, J. T. Foote, K. Spärck Jones, and S. J. Young, "The video mail retrieval project: experiences in retrieving spoken documents," in Intelligent Multimedia Information Retrieval, M. T. Maybury, Ed., AAAI Press/MIT Press, Cambridge, Mass, USA, 1997.

[4] K. Barnard and D. Forsyth, "Learning the semantics of words and pictures," in Proc. International Conference on Computer Vision, vol. 2, pp. 408–415, Vancouver, Canada, 2001.
[5] M. La Cascia, S. Sethi, and S. Sclaroff, "Combining textual and visual cues for content-based image retrieval on the world wide web," in Proc. IEEE Workshop on Content-Based Access of Image and Video Libraries, pp. 24–28, Santa Barbara, Calif, USA, June 1998.
[6] The Lowlands Team, "Lazy users and automatic video retrieval tools in (the) Lowlands," in The 10th Text REtrieval Conference (TREC-2001), E. M. Voorhees and D. K. Harman, Eds., National Institute of Standards and Technology (NIST), Gaithersburg, Md, USA, 2002.
[7] T. Westerveld, "Image retrieval: Content versus context," in Proc. RIAO 2000 Content-Based Multimedia Information Access, Paris, France, April 2000.
[8] T. Westerveld, "Probabilistic multimedia retrieval," in Proc. 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Tampere, Finland, 2002.
[9] T. Westerveld, A. P. de Vries, and A. van Ballegooij, "CWI at the TREC-2002 video track," in The 11th Text REtrieval Conference (TREC-2002), E. M. Voorhees and D. K. Harman, Eds., National Institute of Standards and Technology (NIST), Gaithersburg, Md, USA, 2002.
[10] D. Hiemstra, Using Language Models for Information Retrieval, Ph.D. thesis, Centre for Telematics and Information Technology, University of Twente, The Netherlands, 2001.
[11] N. Vasconcelos, Bayesian Models for Visual Information Retrieval, Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, Mass, USA, 2000.
[12] P. Over and R. Taban, "The TREC-2001 video track framework," in The 10th Text REtrieval Conference (TREC-2001), E. M. Voorhees and D. K. Harman, Eds., National Institute of Standards and Technology (NIST), Gaithersburg, Md, USA, 2002.
[13] A. F. Smeaton and P. Over, "The TREC-2002 video track report," in The 11th Text REtrieval Conference (TREC-2002), E. M. Voorhees and D. K. Harman, Eds., National Institute of Standards and Technology (NIST), Gaithersburg, Md, USA, 2002.
[14] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, 1948.
[15] F. Jelinek, Statistical Methods for Speech Recognition, MIT Press, Cambridge, Mass, USA, 1997.
[16] D. Cutting, J. Kupiec, J. Pedersen, and P. Sibun, "A practical part-of-speech tagger," in Proc. 3rd Conference on Applied Natural Language Processing, Trento, Italy, 1992.
[17] P. F. Brown, J. Cocke, S. A. Della Pietra, et al., "A statistical approach to machine translation," Computational Linguistics, vol. 16, no. 2, 1990.
[18] J. M. Ponte and W. B. Croft, "A language modeling approach to information retrieval," in Proc. 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Melbourne, Australia, 1998.
[19] D. Hiemstra, "A linguistically motivated probabilistic model of information retrieval," in Proc. 2nd European Conference on Research and Advanced Technology for Digital Libraries, C. Nicolaou and C. Stephanidis, Eds., Heraklion, Crete, Greece, September 1998.
[20] D. R. H. Miller, T. Leek, and R. M. Schwartz, "A hidden Markov model information retrieval system," in Proc. 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 214–221, Berkeley, Calif, USA, August 1999.
[21] J. Lafferty and C. Zhai, "Document language models, query models, and risk minimization for information retrieval," in Proc. 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 111–119, New Orleans, La, USA, September 2001.
[22] K. Ng, "A maximum likelihood ratio information retrieval model," in Proc. 8th Text REtrieval Conference (TREC-8), NIST Special Publications, National Institute of Standards and Technology (NIST), Gaithersburg, Md, USA, 2000.
[23] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, Series B, vol. 39, no. 1, pp. 1–38, 1977.
[24] E. M. Voorhees and D. K. Harman, Eds., The 11th Text REtrieval Conference (TREC-2002), National Institute of Standards and Technology (NIST), Gaithersburg, Md, USA, 2002.
[25] E. M. Voorhees and D. K. Harman, Eds., The 10th Text REtrieval Conference (TREC-2001), National Institute of Standards and Technology (NIST), Gaithersburg, Md, USA, 2002.
[26] M. F. Porter, "An algorithm for suffix stripping," Program, vol. 14, no. 3, pp. 130–137, 1980.

Thijs Westerveld received the M.S. degree in computer science from the University of Twente. As a Research Assistant at the same university, he has participated in a number of EU projects in the area of multimedia information retrieval. Working on the national Waterland project at the CWI, the National Research Institute for Mathematics and Computer Science in the Netherlands, he investigates, for his Ph.D., the use of probabilistic models for retrieval from generic multimedia collections.

Arjen P. de Vries received his Ph.D. in computer science from the University of Twente in 1999, on the integration of content management in database systems. He is especially interested in the design of database systems that support search in multimedia digital libraries. Arjen works as a Postdoctoral Researcher at the CWI, the National Research Institute for Mathematics and Computer Science in the Netherlands.

Alex van Ballegooij received the M.S. degree in computer science from the Vrije Universiteit of Amsterdam in 1999. He works towards his Ph.D. on the national ICES-KIS MIA project at the CWI, the National Research Institute for Mathematics and Computer Science in the Netherlands. His current research activities entail the investigation of aspects that make a database system suitable for computationally intensive tasks, specifically search in multimedia digital libraries.

Franciska de Jong has been Full Professor of language technology at the Computer Science Department of the University of Twente, Enschede, since 1992. She is also affiliated with the TNO TPD in Delft. She has a background in theoretical and computational linguistics and received the Ph.D. degree at the University of Utrecht in 1991. She worked as a Researcher at Philips Research on the Rosetta machine translation project. Currently, her main research interest is in the field of multimedia indexing and retrieval. She is frequently involved in international program committees, expert groups, and review panels and has initiated a number of EU projects.

Djoerd Hiemstra has been an Assistant Professor in the Database Group of the Computer Science Department of the University of Twente since 2001. At this same university, he studied computer science and graduated in the field of language technology (1996). He worked for three months at Microsoft Research in Cambridge. He wrote a Ph.D. thesis on probabilistic retrieval using language models. Multimedia databases, cross-language information retrieval, and statistical language modelling are among the research themes he is currently working on. Together with Arjen de Vries, he initiated the project CIRQUID, funded by NWO.


Evaluation for sets of classes Evaluaton for Tet Categorzaton Classfcaton accuracy: usual n ML, the proporton of correct decsons, Not approprate f the populaton rate of the class s low Precson, Recall and F 1 Better measures 21 Evaluaton

More information

= z 20 z n. (k 20) + 4 z k = 4

= z 20 z n. (k 20) + 4 z k = 4 Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5

More information

Appendix B: Resampling Algorithms

Appendix B: Resampling Algorithms 407 Appendx B: Resamplng Algorthms A common problem of all partcle flters s the degeneracy of weghts, whch conssts of the unbounded ncrease of the varance of the mportance weghts ω [ ] of the partcles

More information

Note on EM-training of IBM-model 1

Note on EM-training of IBM-model 1 Note on EM-tranng of IBM-model INF58 Language Technologcal Applcatons, Fall The sldes on ths subject (nf58 6.pdf) ncludng the example seem nsuffcent to gve a good grasp of what s gong on. Hence here are

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors Stat60: Bayesan Modelng and Inference Lecture Date: February, 00 Reference Prors Lecturer: Mchael I. Jordan Scrbe: Steven Troxler and Wayne Lee In ths lecture, we assume that θ R; n hgher-dmensons, reference

More information

Unified Subspace Analysis for Face Recognition

Unified Subspace Analysis for Face Recognition Unfed Subspace Analyss for Face Recognton Xaogang Wang and Xaoou Tang Department of Informaton Engneerng The Chnese Unversty of Hong Kong Shatn, Hong Kong {xgwang, xtang}@e.cuhk.edu.hk Abstract PCA, LDA

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

MSU at ImageCLEF: Cross Language and Interactive Image Retrieval

MSU at ImageCLEF: Cross Language and Interactive Image Retrieval MSU at ImageCLEF: Cross Language and Interactve Image Retreval Vneet Bansal, Chen Zhang, Joyce Y. Cha, Rong Jn Department of Computer Scence and Engneerng, Mchgan State Unversty East Lansng, MI48824, U.S.A.

More information

Semi-supervised Classification with Active Query Selection

Semi-supervised Classification with Active Query Selection Sem-supervsed Classfcaton wth Actve Query Selecton Jao Wang and Swe Luo School of Computer and Informaton Technology, Beng Jaotong Unversty, Beng 00044, Chna Wangjao088@63.com Abstract. Labeled samples

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Difference Equations

Difference Equations Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1

More information

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of

More information

Clustering gene expression data & the EM algorithm

Clustering gene expression data & the EM algorithm CG, Fall 2011-12 Clusterng gene expresson data & the EM algorthm CG 08 Ron Shamr 1 How Gene Expresson Data Looks Entres of the Raw Data matrx: Rato values Absolute values Row = gene s expresson pattern

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS HCMC Unversty of Pedagogy Thong Nguyen Huu et al. A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS Thong Nguyen Huu and Hao Tran Van Department of mathematcs-nformaton,

More information

Regularized Discriminant Analysis for Face Recognition

Regularized Discriminant Analysis for Face Recognition 1 Regularzed Dscrmnant Analyss for Face Recognton Itz Pma, Mayer Aladem Department of Electrcal and Computer Engneerng, Ben-Guron Unversty of the Negev P.O.Box 653, Beer-Sheva, 845, Israel. Abstract Ths

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin Proceedngs of the 007 Wnter Smulaton Conference S G Henderson, B Bller, M-H Hseh, J Shortle, J D Tew, and R R Barton, eds LOW BIAS INTEGRATED PATH ESTIMATORS James M Calvn Department of Computer Scence

More information

Extending Relevance Model for Relevance Feedback

Extending Relevance Model for Relevance Feedback Extendng Relevance Model for Relevance Feedback Le Zhao, Chenmn Lang and Jame Callan Language Technologes Insttute School of Computer Scence Carnege Mellon Unversty {lezhao, chenmnl, callan}@cs.cmu.edu

More information

Tracking with Kalman Filter

Tracking with Kalman Filter Trackng wth Kalman Flter Scott T. Acton Vrgna Image and Vdeo Analyss (VIVA), Charles L. Brown Department of Electrcal and Computer Engneerng Department of Bomedcal Engneerng Unversty of Vrgna, Charlottesvlle,

More information

Clustering & Unsupervised Learning

Clustering & Unsupervised Learning Clusterng & Unsupervsed Learnng Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 175A Wnter 2012 UCSD Statstcal Learnng Goal: Gven a relatonshp between a feature vector x and a vector y, and d data samples (x,y

More information

Bayesian Learning. Smart Home Health Analytics Spring Nirmalya Roy Department of Information Systems University of Maryland Baltimore County

Bayesian Learning. Smart Home Health Analytics Spring Nirmalya Roy Department of Information Systems University of Maryland Baltimore County Smart Home Health Analytcs Sprng 2018 Bayesan Learnng Nrmalya Roy Department of Informaton Systems Unversty of Maryland Baltmore ounty www.umbc.edu Bayesan Learnng ombnes pror knowledge wth evdence to

More information

1 Matrix representations of canonical matrices

1 Matrix representations of canonical matrices 1 Matrx representatons of canoncal matrces 2-d rotaton around the orgn: ( ) cos θ sn θ R 0 = sn θ cos θ 3-d rotaton around the x-axs: R x = 1 0 0 0 cos θ sn θ 0 sn θ cos θ 3-d rotaton around the y-axs:

More information

Statistical pattern recognition

Statistical pattern recognition Statstcal pattern recognton Bayes theorem Problem: decdng f a patent has a partcular condton based on a partcular test However, the test s mperfect Someone wth the condton may go undetected (false negatve

More information

MAXIMUM A POSTERIORI TRANSDUCTION

MAXIMUM A POSTERIORI TRANSDUCTION MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

STATS 306B: Unsupervised Learning Spring Lecture 10 April 30

STATS 306B: Unsupervised Learning Spring Lecture 10 April 30 STATS 306B: Unsupervsed Learnng Sprng 2014 Lecture 10 Aprl 30 Lecturer: Lester Mackey Scrbe: Joey Arthur, Rakesh Achanta 10.1 Factor Analyss 10.1.1 Recap Recall the factor analyss (FA) model for lnear

More information

Retrieval Models: Language models

Retrieval Models: Language models CS-590I Informaton Retreval Retreval Models: Language models Luo S Department of Computer Scence Purdue Unversty Introducton to language model Ungram language model Document language model estmaton Maxmum

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law: CE304, Sprng 2004 Lecture 4 Introducton to Vapor/Lqud Equlbrum, part 2 Raoult s Law: The smplest model that allows us do VLE calculatons s obtaned when we assume that the vapor phase s an deal gas, and

More information

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Learning from Data 1 Naive Bayes

Learning from Data 1 Naive Bayes Learnng from Data 1 Nave Bayes Davd Barber dbarber@anc.ed.ac.uk course page : http://anc.ed.ac.uk/ dbarber/lfd1/lfd1.html c Davd Barber 2001, 2002 1 Learnng from Data 1 : c Davd Barber 2001,2002 2 1 Why

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2) 1/16 MATH 829: Introducton to Data Mnng and Analyss The EM algorthm (part 2) Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 20, 2016 Recall 2/16 We are gven ndependent observatons

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

SPANC -- SPlitpole ANalysis Code User Manual

SPANC -- SPlitpole ANalysis Code User Manual Functonal Descrpton of Code SPANC -- SPltpole ANalyss Code User Manual Author: Dale Vsser Date: 14 January 00 Spanc s a code created by Dale Vsser for easer calbratons of poston spectra from magnetc spectrometer

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

On the correction of the h-index for career length

On the correction of the h-index for career length 1 On the correcton of the h-ndex for career length by L. Egghe Unverstet Hasselt (UHasselt), Campus Depenbeek, Agoralaan, B-3590 Depenbeek, Belgum 1 and Unverstet Antwerpen (UA), IBW, Stadscampus, Venusstraat

More information

Automatic Object Trajectory- Based Motion Recognition Using Gaussian Mixture Models

Automatic Object Trajectory- Based Motion Recognition Using Gaussian Mixture Models Automatc Object Trajectory- Based Moton Recognton Usng Gaussan Mxture Models Fasal I. Bashr, Ashfaq A. Khokhar, Dan Schonfeld Electrcal and Computer Engneerng, Unversty of Illnos at Chcago. Chcago, IL,

More information

8 : Learning in Fully Observed Markov Networks. 1 Why We Need to Learn Undirected Graphical Models. 2 Structural Learning for Completely Observed MRF

8 : Learning in Fully Observed Markov Networks. 1 Why We Need to Learn Undirected Graphical Models. 2 Structural Learning for Completely Observed MRF 10-708: Probablstc Graphcal Models 10-708, Sprng 2014 8 : Learnng n Fully Observed Markov Networks Lecturer: Erc P. Xng Scrbes: Meng Song, L Zhou 1 Why We Need to Learn Undrected Graphcal Models In the

More information

The big picture. Outline

The big picture. Outline The bg pcture Vncent Claveau IRISA - CNRS, sldes from E. Kjak INSA Rennes Notatons classes: C = {ω = 1,.., C} tranng set S of sze m, composed of m ponts (x, ω ) per class ω representaton space: R d (=

More information

Image Processing for Bubble Detection in Microfluidics

Image Processing for Bubble Detection in Microfluidics Image Processng for Bubble Detecton n Mcrofludcs Introducton Chen Fang Mechancal Engneerng Department Stanford Unverst Startng from recentl ears, mcrofludcs devces have been wdel used to buld the bomedcal

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

UNR Joint Economics Working Paper Series Working Paper No Further Analysis of the Zipf Law: Does the Rank-Size Rule Really Exist?

UNR Joint Economics Working Paper Series Working Paper No Further Analysis of the Zipf Law: Does the Rank-Size Rule Really Exist? UNR Jont Economcs Workng Paper Seres Workng Paper No. 08-005 Further Analyss of the Zpf Law: Does the Rank-Sze Rule Really Exst? Fungsa Nota and Shunfeng Song Department of Economcs /030 Unversty of Nevada,

More information

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

Hidden Markov Models & The Multivariate Gaussian (10/26/04) CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models

More information

Report on Image warping

Report on Image warping Report on Image warpng Xuan Ne, Dec. 20, 2004 Ths document summarzed the algorthms of our mage warpng soluton for further study, and there s a detaled descrpton about the mplementaton of these algorthms.

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

THE SUMMATION NOTATION Ʃ

THE SUMMATION NOTATION Ʃ Sngle Subscrpt otaton THE SUMMATIO OTATIO Ʃ Most of the calculatons we perform n statstcs are repettve operatons on lsts of numbers. For example, we compute the sum of a set of numbers, or the sum of the

More information

CS 468 Lecture 16: Isometry Invariance and Spectral Techniques

CS 468 Lecture 16: Isometry Invariance and Spectral Techniques CS 468 Lecture 16: Isometry Invarance and Spectral Technques Justn Solomon Scrbe: Evan Gawlk Introducton. In geometry processng, t s often desrable to characterze the shape of an object n a manner that

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information