Probabilistic Information Retrieval CE-324: Modern Information Retrieval Sharif University of Technology
1 Probabilistic Information Retrieval
CE-324: Modern Information Retrieval, Sharif University of Technology. M. Soleymani, Fall 2016.
Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276, Stanford)
2 Why probabilities in IR?
User Information Need → Query Representation: our understanding of the user need is uncertain. How to match? Documents → Document Representation: an uncertain guess of whether a doc has relevant content.
In traditional IR systems, matching between each doc and query is attempted in a semantically imprecise space of index terms. Probabilities provide a principled foundation for uncertain reasoning. Can we use probabilities to quantify our uncertainties?
3 Probabilistic IR
Probabilistic methods are one of the oldest but also one of the currently hottest topics in IR. Traditionally: neat ideas, but they didn't win on performance. It may be different now.
4 Probabilistic IR topics
- Classical probabilistic retrieval model: Probability Ranking Principle; Binary Independence Model (we will see that it is a Naïve Bayes text categorization); (Okapi) BM25
- Bayesian networks for IR [we will not cover this topic]
- Language model approach to IR: an important emphasis on this approach in recent work
5 The document ranking problem
Problem specification: we have a collection of docs; a user issues a query; a list of docs needs to be returned. The ranking method is the core of an IR system: in what order do we present documents to the user? Idea: rank by probability of relevance of the doc w.r.t. the information need: $P(R = 1 \mid \text{doc}, \text{query})$
6 Probability Ranking Principle (PRP)
"If a reference retrieval system's response to each request is a ranking of the docs in the collection in order of decreasing probability of relevance to the user who submitted the request, where the probabilities are estimated as accurately as possible on the basis of whatever data have been made available to the system for this purpose, the overall effectiveness of the system to its user will be the best that is obtainable on the basis of those data." [1960s/1970s] S. Robertson, W. S. Cooper, M. E. Maron; van Rijsbergen (1979:113); Manning & Schütze (1999:538)
7 Recall a few probability basics
Product rule: $p(a, b) = p(a \mid b)\, p(b)$
Sum rule: $p(a) = \sum_b p(a, b)$
Bayes' rule (posterior from prior): $p(a \mid b) = \dfrac{p(b \mid a)\, p(a)}{p(b)} = \dfrac{p(b \mid a)\, p(a)}{p(b \mid a)\, p(a) + p(b \mid \bar{a})\, p(\bar{a})}$
Odds: $O(a) = \dfrac{p(a)}{p(\bar{a})} = \dfrac{p(a)}{1 - p(a)}$
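To make the rules concrete, here is a minimal Python sketch; the numbers are invented for illustration, not from the slides. It applies the sum rule, Bayes' rule, and the odds transform to a toy relevance event:

```python
# Toy example (assumed numbers): posterior and odds for a relevance event.
p_a = 0.1               # prior p(a): probability a doc is relevant
p_b_given_a = 0.6       # p(b | a): evidence likelihood given relevance
p_b_given_not_a = 0.05  # p(b | not a)

# Sum rule over the two cases of a gives p(b); Bayes' rule gives p(a | b).
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
posterior = p_b_given_a * p_a / p_b

# Odds: O(a) = p(a) / (1 - p(a))
odds = posterior / (1 - posterior)
print(round(posterior, 3), round(odds, 3))  # 0.571 1.333
```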
8 Probability Ranking Principle (PRP)
d: doc; q: query; R: relevance of a doc w.r.t. a given (fixed) query; R = 1: relevant; R = 0: not relevant.
We need to find the probability that a doc d is relevant to a query q: $p(R = 1 \mid d, q)$, with $p(R = 0 \mid d, q) = 1 - p(R = 1 \mid d, q)$
9 Probability Ranking Principle (PRP)
$p(R = 1 \mid d, q) = \dfrac{p(d \mid R = 1, q)\, p(R = 1 \mid q)}{p(d \mid q)}$
$p(R = 0 \mid d, q) = \dfrac{p(d \mid R = 0, q)\, p(R = 0 \mid q)}{p(d \mid q)}$
$p(d \mid R = 1, q)$: probability of d in the class of docs relevant to the query q. $p(d \mid R = 0, q)$: probability of d in the class of docs non-relevant to the query q.
10 Probability Ranking Principle (PRP)
Bayes optimal decision rule: d is relevant iff $P(R = 1 \mid d, q) > P(R = 0 \mid d, q)$, i.e., d is relevant iff $P(R = 1 \mid d, q) > 0.5$.
But the PRP in action: rank all documents by $P(R = 1 \mid d, q)$.
Theorem: using the PRP is optimal, in that it minimizes the loss (Bayes risk) under 1/0 loss. Provable if all probabilities are correct, etc. [e.g., Ripley 1996]
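A minimal sketch of the PRP in action, assuming a hypothetical estimator `prob_relevant(d, q)` is available; the principle itself is just a sort by estimated probability of relevance:

```python
# PRP ranking sketch: sort docs by estimated P(R = 1 | d, q), highest first.
# `prob_relevant` is a hypothetical estimator, e.g. the BIM developed below.
def prp_rank(docs, query, prob_relevant):
    return sorted(docs, key=lambda d: prob_relevant(d, query), reverse=True)
```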
11 Probability Ranking Principle (PRP)
How do we compute all those probabilities? We do not know the exact probabilities and have to use estimates. The Binary Independence Model (BIM), which we discuss next, is the simplest model.
12 Probabilistic retrieval strategy
Estimate how terms contribute to relevance: how do things like tf, df, and length influence your judgments about doc relevance? (A more nuanced answer is the Okapi formula: Spärck Jones / Robertson.) Combine the estimated values to find the doc relevance probability, then order docs by decreasing probability.
13 Probabilistic ranking
Basic concept: "For a given query, if we know some docs that are relevant, terms that occur in those docs should be given greater weighting in searching for other relevant docs. By making assumptions about the distribution of terms and applying Bayes' theorem, it is possible to derive weights theoretically." — van Rijsbergen
14 Binary Independence Model
Traditionally used in conjunction with the PRP.
"Binary" = Boolean: docs are represented as binary incidence vectors of terms: $\vec{x} = [x_1, x_2, \ldots, x_M]$, where $x_i = 1$ iff term i is present in the document.
"Independence": terms occur in docs independently.
Equivalent to the multivariate Bernoulli Naïve Bayes model sometimes used for text categorization [we will see this in the next lectures].
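A minimal sketch of this representation, on invented toy documents: term counts and order are discarded, and each doc becomes an incidence vector over the vocabulary.

```python
# Toy data (assumed): docs as binary term-incidence vectors.
docs = ["the cat sat", "the dog sat on the cat", "a dog barked"]
vocab = sorted({t for d in docs for t in d.split()})

def incidence_vector(doc: str) -> list:
    """x_i = 1 iff term i is present in the doc; frequency is ignored."""
    present = set(doc.split())
    return [1 if t in present else 0 for t in vocab]

X = [incidence_vector(d) for d in docs]
```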
15 Binary Independence Model
We will use odds and Bayes' rule:
$O(R \mid q, \vec{x}) = \dfrac{P(R = 1 \mid q, \vec{x})}{P(R = 0 \mid q, \vec{x})} = \dfrac{P(R = 1 \mid q)}{P(R = 0 \mid q)} \cdot \dfrac{P(\vec{x} \mid R = 1, q)}{P(\vec{x} \mid R = 0, q)}$
16 Binary Independence Model
$O(R \mid q, \vec{x}) = \dfrac{P(R = 1 \mid q)}{P(R = 0 \mid q)} \cdot \dfrac{P(\vec{x} \mid R = 1, q)}{P(\vec{x} \mid R = 0, q)}$ — the first factor is constant for a given query; the second needs estimation.
Using the independence assumption:
$\dfrac{P(\vec{x} \mid R = 1, q)}{P(\vec{x} \mid R = 0, q)} = \prod_{i=1}^{M} \dfrac{p(x_i \mid R = 1, q)}{p(x_i \mid R = 0, q)}$
So: $O(R \mid q, \vec{x}) = O(R \mid q) \cdot \prod_{i=1}^{M} \dfrac{p(x_i \mid R = 1, q)}{p(x_i \mid R = 0, q)}$
17 Binary Independence Model
Since each $x_i$ is either 0 or 1:
$O(R \mid q, \vec{x}) = O(R \mid q) \cdot \prod_{x_i = 1} \dfrac{P(x_i = 1 \mid R = 1, q)}{P(x_i = 1 \mid R = 0, q)} \cdot \prod_{x_i = 0} \dfrac{P(x_i = 0 \mid R = 1, q)}{P(x_i = 0 \mid R = 0, q)}$
Let $p_i = P(x_i = 1 \mid R = 1, q)$ and $u_i = P(x_i = 1 \mid R = 0, q)$.
Assume, for all terms not occurring in the query ($q_i = 0$), that $p_i = u_i$. This can be changed (e.g., in relevance feedback).
18 Probabilities
                       document relevant (R=1)   not relevant (R=0)
term present, x_i = 1          p_i                     u_i
term absent,  x_i = 0        1 − p_i                 1 − u_i
Then...
19 Binary Independence Model
$O(R \mid q, \vec{x}) = O(R \mid q) \cdot \prod_{x_i = q_i = 1} \dfrac{p_i}{u_i} \cdot \prod_{x_i = 0,\, q_i = 1} \dfrac{1 - p_i}{1 - u_i}$  (matching query terms; non-matching query terms)
$= O(R \mid q) \cdot \prod_{x_i = q_i = 1} \dfrac{p_i (1 - u_i)}{u_i (1 - p_i)} \cdot \prod_{q_i = 1} \dfrac{1 - p_i}{1 - u_i}$  (all matching terms; all query terms)
20 Binary Independence Model
$O(R \mid q, \vec{x}) = O(R \mid q) \cdot \prod_{x_i = q_i = 1} \dfrac{p_i (1 - u_i)}{u_i (1 - p_i)} \cdot \prod_{q_i = 1} \dfrac{1 - p_i}{1 - u_i}$ — the last product is constant for each query.
The only quantity to be estimated for rankings is the Retrieval Status Value (RSV):
$RSV = \log \prod_{x_i = q_i = 1} \dfrac{p_i (1 - u_i)}{u_i (1 - p_i)} = \sum_{x_i = q_i = 1} \log \dfrac{p_i (1 - u_i)}{u_i (1 - p_i)}$
21 Binary Independence Model
All boils down to computing the RSV:
$RSV = \sum_{x_i = q_i = 1} c_i$, where $c_i = \log \dfrac{p_i (1 - u_i)}{u_i (1 - p_i)}$
The $c_i$'s function as the term weights in this model. So, how do we compute the $c_i$'s from our data?
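A minimal sketch of the RSV computation, assuming the per-term probabilities $p_i$ and $u_i$ are already given as dictionaries keyed by term:

```python
import math

def c_weight(p_i: float, u_i: float) -> float:
    """Term weight c_i = log[ p_i (1 - u_i) / (u_i (1 - p_i)) ]."""
    return math.log(p_i * (1 - u_i) / (u_i * (1 - p_i)))

def rsv(query_terms: set, doc_terms: set, p: dict, u: dict) -> float:
    """Sum c_i over the terms present in both the query and the doc."""
    return sum(c_weight(p[t], u[t]) for t in query_terms & doc_terms)
```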
22 BIM: example
$q = \{x_1, x_2\}$. Relevance judgments from 20 docs, together with the distribution of $x_1, x_2$ within these docs, give:
$p_1 = 8/12$, $u_1 = 3/8$; $p_2 = 7/12$, $u_2 = 4/8$.
$c_1 = \log \frac{10}{3}$, $c_2 = \log \frac{7}{5}$
[Table of document counts by incidence pattern $(x_1, x_2) \in \{(1,1), (1,0), (0,1), (0,0)\}$ not reproduced.]
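As a sanity check, the slide's weights can be reproduced with the `c_weight` sketch above:

```python
import math
c1 = math.log((8/12) * (1 - 3/8) / ((3/8) * (1 - 8/12)))  # = log(10/3)
c2 = math.log((7/12) * (1 - 4/8) / ((4/8) * (1 - 7/12)))  # = log(7/5)
assert math.isclose(c1, math.log(10/3)) and math.isclose(c2, math.log(7/5))
```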
23 Binary Independence Model
Estimating RSV coefficients in theory. For each term i, look at this table of document counts:
              Relevant     Non-Relevant          Total
x_i = 1          s_i         df_i − s_i           df_i
x_i = 0        S − s_i    N − df_i − S + s_i    N − df_i
Total             S            N − S                N
Estimates: $p_i = \dfrac{s_i}{S}$, $u_i = \dfrac{df_i - s_i}{N - S}$
Weight of the i-th term: $c_i = \log \dfrac{s_i / (S - s_i)}{(df_i - s_i) / (N - df_i - S + s_i)}$
For now, assume no zero terms.
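A minimal sketch of the count-based estimate; as the slide says, it assumes no zero cells (a smoothing constant such as 1/2 per cell, which appears in the feedback slides below, is the usual fix):

```python
import math

def c_from_counts(s: int, S: int, df: int, N: int) -> float:
    """c_i = log[ (s/(S-s)) / ((df-s)/(N-df-S+s)) ], assuming no zero cells.
    s: relevant docs containing term i, S: number of relevant docs,
    df: docs containing term i, N: collection size."""
    return math.log((s / (S - s)) / ((df - s) / (N - df - S + s)))
```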
24 Estimation — key challenge
If non-relevant docs are approximated by the whole collection: $u_i = df_i / N$ (the probability of occurrence in non-relevant docs for the query), and
$\log \dfrac{1 - u_i}{u_i} = \log \dfrac{N - df_i}{df_i} \approx \log \dfrac{N}{df_i}$ — IDF!
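A quick numeric check with invented numbers shows how close the two forms are when $df_i \ll N$:

```python
import math
N, df = 1_000_000, 100
print(math.log((N - df) / df))  # 9.2102...
print(math.log(N / df))         # 9.2103...  (essentially identical)
```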
25 Estimation — key challenge
$p_i$ (the probability of occurrence in relevant docs) cannot be approximated as easily as $u_i$. $p_i$ can be estimated in various ways:
- constant (Croft and Harper combination match); then we just get idf weighting of terms
- proportional to the probability of occurrence in the collection; Greiff (SIGIR 1998) argues for $1/3 + 2/3 \cdot df_i/N$
- from relevant docs, if we know some; relevance weighting can be used in a feedback loop
26 Probabilistic relevance feedback
1. Guess $p_i$ and $u_i$ and use them to retrieve a first set of relevant docs VR.
2. Interact with the user to refine the description: the user specifies some definite members with R = 1 (the set VR) and R = 0 (the set VNR).
3. Re-estimate $p_i$ and $u_i$: $p_i = \dfrac{|VR_i| + 1/2}{|VR| + 1}$, $u_i = \dfrac{|VNR_i| + 1/2}{|VNR| + 1}$ (where $VR_i$, $VNR_i$ are the docs in VR, VNR containing term i).
4. Repeat, thus generating a succession of approximations to the relevant docs.
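A minimal sketch of the re-estimation step; the 1/2 and 1 are the smoothing constants in the slide's formulas, which keep estimates away from 0 and 1:

```python
def reestimate(VR_i: int, VR: int, VNR_i: int, VNR: int):
    """Step 3: p_i, u_i from the judged relevant (VR) and non-relevant (VNR)
    set sizes, where VR_i / VNR_i count the docs containing term i."""
    p_i = (VR_i + 0.5) / (VR + 1)
    u_i = (VNR_i + 0.5) / (VNR + 1)
    return p_i, u_i
```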
27 Probabilistic relevance feedback
1. Guess $p_i$ and $u_i$ and use them to retrieve a first set of relevant docs VR.
2. Interact with the user to refine the description: learn some definite members with R = 1 and R = 0.
3. Re-estimate $p_i$ and $u_i$, or combine the new info with the original guess (a Bayesian update):
$p_i^{(t+1)} = \dfrac{|VR_i| + \kappa\, p_i^{(t)}}{|VR| + \kappa}$, where κ is the prior weight.
4. Repeat, thus generating a succession of approximations to the relevant docs.
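A minimal sketch of this Bayesian variant: instead of replacing the old estimate, blend it with the new evidence, with κ acting as the weight (in pseudo-counts) given to the prior; the default value here is an assumption for illustration:

```python
def bayes_update(VR_i: int, VR: int, p_prev: float, kappa: float = 5.0) -> float:
    """p_i^(t+1) = (|VR_i| + kappa * p_i^(t)) / (|VR| + kappa)."""
    return (VR_i + kappa * p_prev) / (VR + kappa)
```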
28 Iteratively estimating p_i (= pseudo-relevance feedback)
1. Assume that $p_i$ is constant over all $x_i$ in the query: $p_i = 0.5$ (even odds) for any given doc.
2. Determine a guess of the relevant doc set: V is a fixed-size set of the highest-ranked docs under this model.
3. We need to improve our guesses for $p_i$ and $u_i$. Let $V_i$ be the set of docs in V containing $x_i$: $p_i = \dfrac{|V_i| + 1/2}{|V| + 1}$. Assume that if a doc is not retrieved, it is not relevant: $u_i = \dfrac{df_i - |V_i| + 1/2}{N - |V| + 1}$.
4. Go to 2 until convergence, then return the ranking.
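A minimal sketch of the whole loop. `rank` and `contains` are hypothetical helpers (not from the slides): `rank(p, u)` returns doc ids ordered by RSV under the current weights, and `contains(t)` returns the set of doc ids containing term t.

```python
def pseudo_feedback(query_terms, N, df, rank, contains, k=10, iters=5):
    p = {t: 0.5 for t in query_terms}         # step 1: even odds
    u = {t: df[t] / N for t in query_terms}   # non-relevant ~ whole collection
    for _ in range(iters):                    # fixed iterations stand in for
        V = set(rank(p, u)[:k])               # a convergence test; step 2
        for t in query_terms:                 # step 3: re-estimate p_i, u_i
            V_t = len(V & contains(t))
            p[t] = (V_t + 0.5) / (len(V) + 1)
            u[t] = (df[t] - V_t + 0.5) / (N - len(V) + 1)
    return p, u
```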
29 PRP and BIM
Getting reasonable approximations of the probabilities is possible, but it requires restrictive assumptions:
- Boolean representation of docs/queries/relevance
- term independence
- terms that do not appear in the query don't affect the outcome
- doc relevance values are independent
Some of these assumptions can be removed. Problem: the models either require partial relevance information or can only derive somewhat inferior term weights.
30 Removing term independence
In general, index terms aren't independent, and dependencies can be complex. van Rijsbergen (1979) proposed a model of simple tree dependencies — exactly Friedman and Goldszmidt's Tree Augmented Naïve Bayes (AAAI 13, 1996) — in which each term is dependent on one other. In the 1970s, estimation problems held back the success of this model.
31 A key limitation of the BIM
The BIM was designed for titles or abstracts, not for modern full-text search, like much of the original IR work. We want to pay attention to term frequency and doc lengths, just as in the other models we've discussed.
32 Okapi BM25
BM25 = "Best Match 25" (they had a bunch of tries!). Developed in the context of the Okapi system. Started to be increasingly adopted by other teams during the TREC competitions. It works well. Goal: relax some assumptions of the BIM while not adding too many parameters (Spärck Jones et al. 2000). I'll omit the theory, but show the form.
33 Recall: BIM
Boils down to:
$RSV^{BIM} = \sum_{x_i = q_i = 1} c_i^{BIM}$, $c_i^{BIM} = \log \dfrac{p_i (1 - u_i)}{(1 - p_i)\, u_i}$ (a log odds ratio)
                       document relevant (R=1)   not relevant (R=0)
term present, x_i = 1          p_i                     u_i
term absent,  x_i = 0        1 − p_i                 1 − u_i
Simplifies to (with constant $p_i = 0.5$):
$RSV^{BIM} = \sum_{x_i = q_i = 1} \log \dfrac{N}{df_i}$
34 Early versions of BM25
Version 1: using the saturation function:
$c_i^{BM25v1}(tf_i) = c_i^{BIM} \cdot \dfrac{tf_i}{k_1 + tf_i}$
Version 2: with the BIM simplification to IDF:
$c_i^{BM25v2}(tf_i) = \log \dfrac{N}{df_i} \cdot \dfrac{(k_1 + 1)\, tf_i}{k_1 + tf_i}$
The $(k_1 + 1)$ factor doesn't change the ranking, but makes the term score 1 when $tf_i = 1$. Similar to tf-idf, but term scores are bounded.
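A minimal sketch of the version-2 saturation factor; note the bound: the factor equals 1 at $tf_i = 1$ and approaches $k_1 + 1$ as $tf_i$ grows, unlike raw tf:

```python
def saturation(tf: float, k1: float = 1.2) -> float:
    """(k1 + 1) * tf / (k1 + tf): equals 1.0 at tf = 1, asymptote k1 + 1."""
    return (k1 + 1) * tf / (k1 + tf)

# tf = 1, 2, 5, 20 -> 1.0, 1.375, 1.774, 2.075 (bounded by k1 + 1 = 2.2)
```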
35 Document length normalization
Longer documents are likely to have larger tf values. Why might documents be longer?
- Verbosity: suggests the observed tf is too high
- Larger scope: suggests the observed tf may be right
A real document collection probably has both effects, so we should apply some kind of normalization.
36 Document length normalization
Document length: $dl = \sum_{i \in V} tf_i$; avdl: average document length over the collection.
Length normalization component: $B = (1 - b) + b \cdot \dfrac{dl}{avdl}$, $0 \le b \le 1$
b = 1: full document length normalization; b = 0: no document length normalization.
37 Okapi BM25
Factor in the frequency of each term versus the doc length:
$RSV^{BM25} = \sum_{i \in q} c_i^{BM25}(tf_i)$, $c_i^{BM25}(tf_i) = \log \dfrac{N}{df_i} \cdot \dfrac{(k_1 + 1)\, tf_i}{k_1 \left( (1 - b) + b \cdot \frac{dl}{avdl} \right) + tf_i}$
$tf_i$ is the term frequency of term i in doc d; dl is the length of d and avdl is the average doc length; $k_1$ and b are tuning parameters.
38 Okapi BM25
$RSV^{BM25} = \sum_{i \in q} \log \dfrac{N}{df_i} \cdot \dfrac{(k_1 + 1)\, tf_i}{k_1 \left( (1 - b) + b \cdot \frac{dl}{avdl} \right) + tf_i}$
$k_1$ controls term frequency scaling: $k_1 = 0$ is the binary model; large $k_1$ is raw term frequency.
b controls doc length normalization: b = 0 is no length normalization; b = 1 is relative frequency (fully scale by doc length).
Typically, $k_1$ is set around 1.2 to 2 and b around 0.75.
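Putting the pieces together, a minimal BM25 scorer under the slide's formula (no query-term-frequency component; the data structures here are assumptions for illustration):

```python
import math

def bm25_score(query_terms, tf, dl, avdl, df, N, k1=1.5, b=0.75):
    """tf: term -> frequency in this doc; df: term -> document frequency;
    dl: this doc's length; avdl: average doc length; N: collection size."""
    B = (1 - b) + b * dl / avdl              # length normalization component
    score = 0.0
    for t in query_terms:
        if tf.get(t, 0) > 0:
            idf = math.log(N / df[t])
            score += idf * (k1 + 1) * tf[t] / (k1 * B + tf[t])
    return score
```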
39 Resources
S. E. Robertson and K. Spärck Jones. 1976. Relevance Weighting of Search Terms. Journal of the American Society for Information Science 27(3): 129–146.
C. J. van Rijsbergen. 1979. Information Retrieval. 2nd ed. London: Butterworths, chapter 6. [Most details of the math]
N. Fuhr. 1992. Probabilistic Models in Information Retrieval. The Computer Journal 35(3): 243–255. [Easiest read, with BNs]
F. Crestani, M. Lalmas, C. J. van Rijsbergen, and I. Campbell. 1998. "Is This Document Relevant?... Probably": A Survey of Probabilistic Models in Information Retrieval. ACM Computing Surveys 30(4): 528–552. [Adds very little material that isn't in van Rijsbergen or Fuhr]