MIMA Group. Chapter 2 Bayesian Decision Theory. School of Computer Science and Technology, Shandong University. Xin-Shun SDU
1 MIMA Group. Chapter 2: Bayesian Decision Theory. Xin-Shun SDU, School of Computer Science and Technology, Shandong University
2 Bayesian Decision Theory
Bayesian decision theory is a statistical approach to data mining/pattern recognition.
It is a mathematical foundation for decision making.
It uses a probabilistic approach to help make decisions so as to minimize the risk (cost).
3 Bayesian Decision Theory: Basic Assumptions
The decision problem is posed (formalized) in probabilistic terms.
All the relevant probability values are known.
Key principle: Bayes' theorem.
4 Preliminaries and Notations
$\omega_i$: a state of nature
$P(\omega_i)$: prior probability
$\mathbf{x}$: feature vector
$p(\mathbf{x})$: evidence probability
$p(\mathbf{x}|\omega_i)$: class-conditional density / likelihood
$P(\omega_i|\mathbf{x})$: posterior probability
5 Decision Before Observation
The problem: to make a decision where the prior probability is known and no observation is allowed.
Naive decision rule: decide $\omega_1$ if $P(\omega_1) > P(\omega_2)$; otherwise decide $\omega_2$.
This is the best we can do without observation.
Fixed prior probabilities -> the same decision every time.
6 Bayes Theorem
$P(\omega_i|\mathbf{x}) = \frac{p(\mathbf{x}|\omega_i)\,P(\omega_i)}{p(\mathbf{x})}$, where $p(\mathbf{x}) = \sum_{j=1}^{c} p(\mathbf{x}|\omega_j)\,P(\omega_j)$
Thomas Bayes (1702-1761)
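7 Decision After Observation
$D(\mathbf{x}) = \arg\max_i P(\omega_i|\mathbf{x}) = \arg\max_i \frac{p(\mathbf{x}|\omega_i)\,P(\omega_i)}{p(\mathbf{x})} = \arg\max_i p(\mathbf{x}|\omega_i)\,P(\omega_i)$
The evidence $p(\mathbf{x})$ is the same for every class, so it is unimportant in making the decision.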
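8 Decision After Observation
Bayes formula: $P(\omega_i|\mathbf{x}) = \frac{p(\mathbf{x}|\omega_i)\,P(\omega_i)}{p(\mathbf{x})}$
Known: the prior probability $P(\omega_i)$, the class-conditional pdf $p(\mathbf{x}|\omega_i)$, and the observation $\mathbf{x}$.
Unknown: the posterior probability $P(\omega_i|\mathbf{x})$.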
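9 Special Cases
$P(\omega_i|\mathbf{x}) = \frac{p(\mathbf{x}|\omega_i)\,P(\omega_i)}{p(\mathbf{x})}$, that is, posterior = likelihood × prior / evidence.
Case I: equal prior probabilities, $P(\omega_1) = P(\omega_2) = \dots = P(\omega_c) = 1/c$. The decision depends only on the likelihood $p(\mathbf{x}|\omega_i)$.
Case II: equal likelihoods, $p(\mathbf{x}|\omega_1) = p(\mathbf{x}|\omega_2) = \dots = p(\mathbf{x}|\omega_c)$. The rule degenerates to the naive decision rule.
Normally, the prior probability and the likelihood function act together in the Bayesian decision process.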
10 An example
$\omega_1$: sea bass, $\omega_2$: salmon; $P(\omega_1) = 2/3$, $P(\omega_2) = 1/3$.
[Figure: class-conditional pdf for lightness]
What will the posterior probability for either type of fish look like?
Decide $\omega_1$ if $p(x|\omega_1)P(\omega_1) > p(x|\omega_2)P(\omega_2)$; otherwise decide $\omega_2$.
11 An example
[Figure: posterior probability for either type of fish, with the decision regions marked]
Horizontal axis: lightness of fish scales. Vertical axis: posterior probability for each type of fish.
Black curve: sea bass. Red curve: salmon.
For each value of $x$, the higher curve yields the output of the Bayesian decision.
For each value of $x$, the posteriors of the two curves sum to 1.0.
12 Another Example
Problem statement: a new medical test is used to detect whether a patient has a certain cancer or not. The test result is either + (positive) or - (negative).
For a patient with this cancer, the probability of returning a positive test result is 0.98.
For a patient without this cancer, the probability of returning a negative test result is 0.97.
The probability for any person to have this cancer is [value missing].
Question: if a positive test result is returned, does she/he have cancer?
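The question can be answered by plugging the three quantities into Bayes' formula. The following is only a minimal sketch: the prior probability of cancer is not preserved in this transcription, so `ASSUMED_PRIOR = 0.008` is a placeholder chosen purely for illustration, and the function name is hypothetical.

```python
def posterior_cancer_given_positive(prior, sensitivity, specificity):
    """Return P(cancer | +) using Bayes' formula."""
    p_pos_given_cancer = sensitivity           # P(+ | cancer)
    p_pos_given_healthy = 1.0 - specificity    # P(+ | no cancer)
    # Evidence P(+) by the law of total probability.
    evidence = (p_pos_given_cancer * prior
                + p_pos_given_healthy * (1.0 - prior))
    return p_pos_given_cancer * prior / evidence


# ASSUMPTION: the slide's prior is missing; 0.008 is an illustrative value only.
ASSUMED_PRIOR = 0.008
post = posterior_cancer_given_positive(ASSUMED_PRIOR, 0.98, 0.97)
print(f"P(cancer | +) = {post:.3f}")
```

Under that assumed prior the posterior stays well below 0.5, so the Bayes decision would still be "no cancer" despite the positive result. A different prior can change the outcome.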
14 Feasibility of Bayes Formula
$P(\omega_i|\mathbf{x}) = \frac{p(\mathbf{x}|\omega_i)\,P(\omega_i)}{p(\mathbf{x})}$ (posterior = likelihood × prior / evidence)
To compute the posterior probability, we need to know the prior probability and the likelihood.
How do we know these probabilities?
A simple solution: counting relative frequencies.
An advanced solution: conduct density estimation.
15 A Further Example
Problem: based on the height of a car in some campus, decide whether it costs more than $50,000 or not.
$\omega_1$: price > $50,000; $\omega_2$: price <= $50,000; $x$: height of a car.
Decide $\omega_1$ if $P(\omega_1|x) > P(\omega_2|x)$; otherwise decide $\omega_2$.
Quantities to know: the priors $P(\omega_1)$, $P(\omega_2)$ and the likelihoods $p(x|\omega_1)$, $p(x|\omega_2)$. How to get them?
16 A Further Example (cont.)
Collecting samples: suppose we have randomly picked 1209 cars in the campus, got prices from their owners, and measured their heights.
Compute $P(\omega_1)$ and $P(\omega_2)$ as relative frequencies: $P(\omega_i) = \frac{\#\,\text{cars in } \omega_i}{\#\,\text{cars in total}}$.
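17 A Further Example (cont.)
Compute $p(x|\omega_1)$ and $p(x|\omega_2)$: discretize the height spectrum (say [0.5m, 2.5m]) into 20 intervals, each of length 0.1m, and then count the number of cars falling into each interval for either class.
Suppose $x = 1.05$, which means that $x$ falls into the interval $I_x$ = [1.0m, 1.1m].
For $\omega_1$, the number of cars in $I_x$ is 46; for $\omega_2$, the number of cars in $I_x$ is 59.
The likelihood estimate is $p(x|\omega_i) \approx \frac{\#\,\text{cars of } \omega_i \text{ in } I_x}{\#\,\text{cars in } \omega_i}$.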
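18 A Further Example (cont.)
Question: for a car with height 1.05m, is its price greater than $50,000?
With the counting estimates the class totals cancel, so $\frac{P(\omega_1|x)}{P(\omega_2|x)} = \frac{p(x|\omega_1)P(\omega_1)}{p(x|\omega_2)P(\omega_2)} = \frac{46}{59} < 1$.
Hence we decide $\omega_2$: price <= $50,000.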
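The counting solution for the car example can be sketched in a few lines. The interval counts 46 and 59 come from the slides; the per-class totals passed as `class_totals` are not preserved in this transcription and are assumptions for illustration only, as is the function name.

```python
def posteriors_from_counts(in_interval, class_totals):
    """Estimate P(w_i | x) from relative frequencies.

    in_interval[i]:  number of class-i cars whose height falls in the
                     interval that contains x.
    class_totals[i]: number of class-i cars in the whole sample.
    """
    n = sum(class_totals)
    priors = [t / n for t in class_totals]                        # P(w_i)
    likelihoods = [k / t for k, t in zip(in_interval, class_totals)]  # p(x|w_i)
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)                                         # p(x)
    return [j / evidence for j in joint]


# ASSUMPTION: class totals (221, 988) are illustrative, not from the source.
post = posteriors_from_counts(in_interval=[46, 59], class_totals=[221, 988])
decision = 1 if post[0] > post[1] else 2
print(post, "-> decide omega_%d" % decision)
```

Because the likelihood and prior are both relative frequencies, the class totals cancel and the posterior depends only on the two interval counts, which is why the decision here does not hinge on the assumed totals.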
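19 Is Bayes Decision Rule Optimal?
Consider two categories. Decide $\omega_1$ if $P(\omega_1|\mathbf{x}) > P(\omega_2|\mathbf{x})$; otherwise decide $\omega_2$.
When we observe $\mathbf{x}$, the probability of error is:
$P(\text{error}|\mathbf{x}) = P(\omega_1|\mathbf{x})$ if we decide $\omega_2$, and $P(\text{error}|\mathbf{x}) = P(\omega_2|\mathbf{x})$ if we decide $\omega_1$.
Thus, under the Bayes decision rule, we have $P(\text{error}|\mathbf{x}) = \min[P(\omega_1|\mathbf{x}), P(\omega_2|\mathbf{x})]$.
For every $\mathbf{x}$, we ensure that $P(\text{error}|\mathbf{x})$ is as small as possible.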
21 Generalized Bayes Decision Rule
Allowing more than one feature: $\mathbf{x} \in \mathbb{R}^d$, the $d$-dimensional Euclidean space.
Allowing more than two states of nature: $\Omega = \{\omega_1, \dots, \omega_c\}$, a set of $c$ states of nature.
Allowing actions other than merely deciding the state of nature: $A = \{\alpha_1, \dots, \alpha_a\}$, a set of $a$ possible actions.
Note that $c$ and $a$ need not be equal.
22 Generalized Bayes Decision Rule (cont.)
Introducing a loss function more general than the probability of error.
$\lambda: \Omega \times A \to \mathbb{R}$ (loss function)
$\lambda(\omega_j, \alpha_i)$: the loss incurred for taking action $\alpha_i$ when the state of nature is $\omega_j$.
For ease of reference, it is usually written as $\lambda(\alpha_i|\omega_j)$.
We want to minimize the expected loss (the risk) in making a decision.
24 Generalized Bayes Decision Rule (cont.)
Problem: given a particular $\mathbf{x}$, we have to decide which action to take.
To do this, we need to know the loss of taking each action $\alpha_i$ ($1 \le i \le a$).
In $\lambda(\alpha_i|\omega_j)$, $\alpha_i$ is the action being taken and $\omega_j$ is the true state of nature.
However, the true state of nature is uncertain, so we use the expected (average) loss.
We want to minimize the expected loss (the risk) in making a decision.
25 Generalized Bayes Decision Rule (cont.)
Expected loss: $R(\alpha_i|\mathbf{x}) = \sum_{j=1}^{c} \lambda(\alpha_i|\omega_j)\,P(\omega_j|\mathbf{x})$
Given $\mathbf{x}$, this is the expected loss (risk) associated with taking action $\alpha_i$.
$\lambda(\alpha_i|\omega_j)$: the incurred loss of taking action $\alpha_i$ when the true state of nature is $\omega_j$.
$P(\omega_j|\mathbf{x})$: the probability of $\omega_j$ being the true state of nature.
The expected loss is also called the conditional risk.
26 Generalized Bayes Decision Rule (cont.)
Suppose we have a loss table [table missing].
For a particular $\mathbf{x}$: $P(\omega_1|\mathbf{x}) = 0.01$, $P(\omega_2|\mathbf{x}) = 0.99$.
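A small sketch of the conditional-risk computation may help here. The slide's loss table is not preserved in this transcription, so the matrix `LOSS` below is entirely hypothetical; the posteriors 0.01 and 0.99 are read from the slide's garbled values on the assumption that they sum to one. Function names are illustrative.

```python
def conditional_risks(loss, posteriors):
    """R(a_i | x) = sum_j loss[i][j] * P(w_j | x); rows are actions."""
    return [sum(l_ij * p_j for l_ij, p_j in zip(row, posteriors))
            for row in loss]


def bayes_action(loss, posteriors):
    """Return the index of the minimum-risk action and all risks."""
    risks = conditional_risks(loss, posteriors)
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


# HYPOTHETICAL loss table: rows = actions a_1, a_2; columns = states w_1, w_2.
LOSS = [[0.0, 1.0],      # a_1: cheap mistake if the truth is w_2
        [1000.0, 0.0]]   # a_2: very costly mistake if the truth is w_1
POST = [0.01, 0.99]      # P(w_1 | x), P(w_2 | x)

best, risks = bayes_action(LOSS, POST)
print("risks:", risks, "-> take action a_%d" % (best + 1))
```

With these assumed losses the minimum-risk action is $\alpha_1$ even though $\omega_2$ is far more probable, which illustrates how a loss function can override the maximum-posterior choice.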
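27 Generalized Bayes Decision Rule (cont.): 0/1 Loss Function
$\lambda(\alpha_i|\omega_j) = 0$ if $i = j$ ($\alpha_i$ is the correct decision associated with $\omega_j$), and $\lambda(\alpha_i|\omega_j) = 1$ otherwise.
$R(\alpha_i|\mathbf{x}) = \sum_{j=1}^{c} \lambda(\alpha_i|\omega_j)\,P(\omega_j|\mathbf{x}) = \sum_{j \neq i} P(\omega_j|\mathbf{x}) = 1 - P(\omega_i|\mathbf{x}) = P(\text{error}|\mathbf{x})$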
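28 Generalized Bayes Decision Rule (cont.)
Bayes decision rule (general case): $\alpha(\mathbf{x}) = \arg\min_{\alpha_i \in A} R(\alpha_i|\mathbf{x}) = \arg\min_{\alpha_i \in A} \sum_{j=1}^{c} \lambda(\alpha_i|\omega_j)\,P(\omega_j|\mathbf{x})$
Overall risk: $R = \int R(\alpha(\mathbf{x})|\mathbf{x})\,p(\mathbf{x})\,d\mathbf{x}$, where $\alpha(\mathbf{x})$ is the decision function.
For every $\mathbf{x}$, we ensure that the conditional risk $R(\alpha(\mathbf{x})|\mathbf{x})$ is as small as possible; thus, the overall risk over all possible $\mathbf{x}$ must be as small as possible.
The Bayes decision rule is the optimal one to minimize the overall risk. Its resulting overall risk is called the Bayes risk.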
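29 General Case: Two-Category
$\Omega = \{\omega_1, \omega_2\}$, $A = \{\alpha_1, \alpha_2\}$.
Loss function (action versus state of nature): $\lambda_{ij} = \lambda(\alpha_i|\omega_j)$ for $i, j \in \{1, 2\}$.
$R(\alpha_1|\mathbf{x}) = \lambda_{11} P(\omega_1|\mathbf{x}) + \lambda_{12} P(\omega_2|\mathbf{x})$
$R(\alpha_2|\mathbf{x}) = \lambda_{21} P(\omega_1|\mathbf{x}) + \lambda_{22} P(\omega_2|\mathbf{x})$
30 General Case: Two-Category
Perform $\alpha_1$ if $R(\alpha_2|\mathbf{x}) > R(\alpha_1|\mathbf{x})$; otherwise perform $\alpha_2$.
$\lambda_{21} P(\omega_1|\mathbf{x}) + \lambda_{22} P(\omega_2|\mathbf{x}) > \lambda_{11} P(\omega_1|\mathbf{x}) + \lambda_{12} P(\omega_2|\mathbf{x})$
$(\lambda_{21} - \lambda_{11})\,P(\omega_1|\mathbf{x}) > (\lambda_{12} - \lambda_{22})\,P(\omega_2|\mathbf{x})$
31 General Case: Two-Category
Both $(\lambda_{21} - \lambda_{11})$ and $(\lambda_{12} - \lambda_{22})$ are positive, so the posterior probabilities are scaled before comparison.
32 General Case: Two-Category
Substituting Bayes' formula and cancelling the common evidence $p(\mathbf{x})$:
$(\lambda_{21} - \lambda_{11})\,p(\mathbf{x}|\omega_1)\,P(\omega_1) > (\lambda_{12} - \lambda_{22})\,p(\mathbf{x}|\omega_2)\,P(\omega_2)$
33 General Case: Two-Category
Likelihood ratio versus threshold: perform $\alpha_1$ if $\frac{p(\mathbf{x}|\omega_1)}{p(\mathbf{x}|\omega_2)} > \frac{(\lambda_{12} - \lambda_{22})\,P(\omega_2)}{(\lambda_{21} - \lambda_{11})\,P(\omega_1)}$; otherwise perform $\alpha_2$.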
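The two-category rule reduces to comparing a likelihood ratio against a fixed threshold. The sketch below assumes $\lambda_{21} > \lambda_{11}$ (a wrong action costs more than a correct one); the function name, the likelihood values 0.2 and 0.3, and the use of a 0/1 loss are illustrative assumptions.

```python
def likelihood_ratio_decision(px_w1, px_w2, prior1, prior2, loss):
    """Two-category minimum-risk decision.

    loss[i][j] is the loss of action a_(i+1) when the state is w_(j+1).
    Assumes loss[1][0] > loss[0][0], so the threshold is well defined.
    Returns (chosen action in {1, 2}, threshold).
    """
    (l11, l12), (l21, l22) = loss
    threshold = (l12 - l22) * prior2 / ((l21 - l11) * prior1)
    ratio = px_w1 / px_w2
    return (1 if ratio > threshold else 2), threshold


# Illustrative numbers: 0/1 loss, priors 2/3 and 1/3, assumed likelihood values.
ZERO_ONE = [[0.0, 1.0], [1.0, 0.0]]
action, thr = likelihood_ratio_decision(0.2, 0.3, 2 / 3, 1 / 3, ZERO_ONE)
print("threshold =", thr, "-> perform alpha_%d" % action)
```

The threshold depends only on the priors and the losses, so it can be computed once; each new observation then needs just one ratio and one comparison.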
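34 Discriminant Function
Discriminant functions for multicategory: $g_i: \mathbb{R}^d \to \mathbb{R}$ ($1 \le i \le c$), one function per category.
[Diagram: $\mathbf{x}$ feeds $g_1(\mathbf{x}), g_2(\mathbf{x}), \dots, g_c(\mathbf{x})$, whose outputs determine the action (e.g., classification)]
Assign $\mathbf{x}$ to $\omega_i$ if $g_i(\mathbf{x}) > g_j(\mathbf{x})$ for all $j \neq i$.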
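35 Discriminant Function
Minimum risk case: $g_i(\mathbf{x}) = -R(\alpha_i|\mathbf{x})$
Minimum error-rate case: $g_i(\mathbf{x}) = P(\omega_i|\mathbf{x})$, or equivalently $g_i(\mathbf{x}) = p(\mathbf{x}|\omega_i)\,P(\omega_i)$, or $g_i(\mathbf{x}) = \ln p(\mathbf{x}|\omega_i) + \ln P(\omega_i)$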
36 Discriminant Function
Relationship between minimum risk and minimum error rate: under the 0/1 loss, $R(\alpha_i|\mathbf{x}) = 1 - P(\omega_i|\mathbf{x})$, so minimizing the risk is the same as minimizing the error rate.
37 Discriminant Function
Various discriminant functions can give identical classification results.
If $f(\cdot)$ is a monotonically increasing function, then the $f(g_i(\cdot))$ are also discriminant functions.
Examples:
$f(x) = k \cdot x$ ($k > 0$), giving $f(g_i(\mathbf{x})) = k \cdot g_i(\mathbf{x})$, $1 \le i \le c$
$f(x) = \ln x$, giving $f(g_i(\mathbf{x})) = \ln g_i(\mathbf{x})$, $1 \le i \le c$
38 Decision Regions
$c$ discriminant functions result in $c$ decision regions:
$R_i = \{\mathbf{x} \mid g_i(\mathbf{x}) > g_j(\mathbf{x})\ \forall j \neq i\}$, where $R_i \cap R_j = \emptyset$ ($i \neq j$) and $\bigcup_{i=1}^{c} R_i = \mathbb{R}^d$.
Decision boundary: decision regions are separated by decision boundaries.
[Figure: two-category example]
39 The Normal Distribution
Discrete random variable $X$ (assume integer):
Probability mass function (pmf): $p(x) = P(X = x)$
Cumulative distribution function (cdf): $F(x) = P(X \le x) = \sum_{t=-\infty}^{x} p(t)$
Continuous random variable $X$:
Probability density function (pdf): $p(x)$ or $f(x)$ (not a probability)
Cumulative distribution function (cdf): $F(x) = P(X \le x) = \int_{-\infty}^{x} p(t)\,dt$
40 Expectations
A.k.a. expected value, mean, or average of a random variable.
If $X$ is a random variable, the expectation of $X$ is
$E[X] = \sum_{x} x\,p(x)$ if $X$ is discrete, and $E[X] = \int x\,p(x)\,dx$ if $X$ is continuous.
The $k$-th moment: $E[X^k]$
The 1st moment: $\mu_X = E[X]$
The $k$-th central moment: $E[(X - \mu_X)^k]$
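41 Important Expectations
Mean: $\mu_X = E[X] = \sum_{x} x\,p(x)$ if $X$ is discrete, and $\int x\,p(x)\,dx$ if $X$ is continuous.
Variance: $\sigma_X^2 = \mathrm{Var}[X] = E[(X - \mu_X)^2] = \sum_{x} (x - \mu_X)^2 p(x)$ if $X$ is discrete, and $\int (x - \mu_X)^2 p(x)\,dx$ if $X$ is continuous.
Notation: $\mathrm{Var}[X] = \sigma^2$, where $\sigma$ is the standard deviation.
Fact: $\mathrm{Var}[X] = E[X^2] - (E[X])^2$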
42 Entropy
The entropy measures the fundamental uncertainty in the value of points selected randomly from a distribution.
$H[X] = -\sum_{x} p(x) \log p(x)$ if $X$ is discrete, and $H[X] = -\int p(x) \log p(x)\,dx$ if $X$ is continuous.
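As a quick numerical illustration of the discrete entropy formula, the sketch below evaluates $H[X]$ for a few probability mass functions. The function name, the choice of base 2, and the example distributions are assumptions made for illustration.

```python
import math


def entropy(pmf, base=2.0):
    """Discrete entropy H[X] = -sum_x p(x) log p(x).

    Terms with p(x) = 0 are skipped, following the convention 0 log 0 = 0.
    """
    return -sum(p * math.log(p, base) for p in pmf if p > 0.0)


fair_coin = entropy([0.5, 0.5])          # maximal uncertainty for 2 outcomes
certain = entropy([1.0, 0.0])            # no uncertainty at all
uniform4 = entropy([0.25] * 4)           # uniform over 4 outcomes
print(fair_coin, certain, uniform4)
```

The uniform distributions give the largest entropy for their number of outcomes, while a deterministic outcome gives zero, which matches the reading of entropy as uncertainty.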
43 Univariate Gaussian Distribution
Gaussian distribution, a.k.a. Gaussian density, normal density.
$X \sim N(\mu, \sigma^2)$: $p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
$E[X] = \mu$, $\mathrm{Var}[X] = \sigma^2$
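45 Random Vectors
A $d$-dimensional random vector is $\mathbf{X} = (X_1, X_2, \dots, X_d)^T$, $\mathbf{X}: \Omega \to \mathbb{R}^d$.
$\mathbf{X} \sim p(\mathbf{x}) = p(x_1, x_2, \dots, x_d)$ (joint pdf)
Expected vector: $E[\mathbf{X}] = (E[X_1], E[X_2], \dots, E[X_d])^T = (\mu_1, \mu_2, \dots, \mu_d)^T = \boldsymbol{\mu}$
$\mu_i = E[X_i] = \int x_i\,p(x_i)\,dx_i$, where $p(x_i)$ is the marginal pdf on the $i$-th component.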
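46 Random Vectors
Covariance matrix: $\Sigma = E[(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})^T] = [\sigma_{ij}]_{d \times d}$
$\sigma_{ij} = E[(X_i - \mu_i)(X_j - \mu_j)] = \int\!\!\int (x_i - \mu_i)(x_j - \mu_j)\,p(x_i, x_j)\,dx_i\,dx_j$, where $p(x_i, x_j)$ is the marginal pdf on a pair of random variables.
Properties: symmetric, positive semidefinite.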
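47 Multivariate Gaussian Distribution
$\mathbf{X}$ is a $d$-dimensional random vector, $\mathbf{X} \sim N(\boldsymbol{\mu}, \Sigma)$:
$p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left[-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]$
$E[\mathbf{X}] = \boldsymbol{\mu}$, $E[(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})^T] = \Sigma$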
48 Properties of $N(\boldsymbol{\mu}, \Sigma)$
$\mathbf{X}$ is a $d$-dimensional random vector, and $\mathbf{X} \sim N(\boldsymbol{\mu}, \Sigma)$.
If $\mathbf{Y} = A^T \mathbf{X}$, where $A$ is a $d \times k$ matrix, then $\mathbf{Y} \sim N(A^T \boldsymbol{\mu}, A^T \Sigma A)$.
49 On Covariance Matrix
As mentioned before, $\Sigma$ is symmetric and positive semidefinite. Thus,
$\Sigma = \Phi \Lambda \Phi^T = \Phi \Lambda^{1/2} \Lambda^{1/2} \Phi^T = (\Phi \Lambda^{1/2})(\Phi \Lambda^{1/2})^T$
$\Phi$: orthonormal matrix whose columns are eigenvectors of $\Sigma$.
$\Lambda$: diagonal matrix of eigenvalues.
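50 Mahalanobis Distance
$\mathbf{X} \sim N(\boldsymbol{\mu}, \Sigma)$
Mahalanobis distance: $r^2 = (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})$
$p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{r^2}{2}\right)$: the leading factor is constant, so the density depends only on the value of $r^2$.
[Portrait: P. C. Mahalanobis]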
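The link between the Mahalanobis distance and the Gaussian density can be checked numerically. The following is a minimal pure-Python sketch restricted to $d = 2$ so that the matrix inverse can be written out by hand; the function names and the example covariance matrix are assumptions for illustration.

```python
import math


def mahalanobis_sq_2d(x, mu, cov):
    """Squared Mahalanobis distance r^2 for a 2x2 covariance matrix.

    Returns (r^2, det(cov)); the inverse is computed in closed form.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    r2 = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    return r2, det


def gaussian_pdf_2d(x, mu, cov):
    """Bivariate normal density, written as constant * exp(-r^2 / 2)."""
    r2, det = mahalanobis_sq_2d(x, mu, cov)
    norm = (2.0 * math.pi) * math.sqrt(det)   # (2*pi)^(d/2) |cov|^(1/2), d = 2
    return math.exp(-0.5 * r2) / norm


MU = [0.0, 0.0]
COV = [[4.0, 0.0], [0.0, 1.0]]                # illustrative covariance
r2_a, _ = mahalanobis_sq_2d([2.0, 0.0], MU, COV)
r2_b, _ = mahalanobis_sq_2d([0.0, 1.0], MU, COV)
density = gaussian_pdf_2d([2.0, 1.0], MU, COV)
print(r2_a, r2_b, density)
```

The points (2, 0) and (0, 1) have different Euclidean distances from the mean but the same Mahalanobis distance, so they lie on the same contour of constant density.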
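51 Discriminant Functions for Gaussian Density
Minimum-error-rate classification: $g_i(\mathbf{x}) = \ln p(\mathbf{x}|\omega_i) + \ln P(\omega_i)$, $1 \le i \le c$
With $p(\mathbf{x}|\omega_i) = \frac{1}{(2\pi)^{d/2} |\Sigma_i|^{1/2}} \exp\left[-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \Sigma_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i)\right]$:
$g_i(\mathbf{x}) = -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \Sigma_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) - \frac{d}{2} \ln 2\pi - \frac{1}{2} \ln |\Sigma_i| + \ln P(\omega_i)$
The term $-\frac{d}{2} \ln 2\pi$ is constant and could be ignored.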
52 Discriminant Functions for Gaussian Density: three cases
Case 1: $\Sigma_i = \sigma^2 I$. Classes are centered at different means, and their feature components are pairwise independent and have the same variance.
Case 2: $\Sigma_i = \Sigma$. Classes are centered at different means, but have the same variation.
Case 3: $\Sigma_i \neq \Sigma_j$. Arbitrary.
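53 Case 1: $\Sigma_i = \sigma^2 I$
Starting from $g_i(\mathbf{x}) = -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \Sigma_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) - \frac{d}{2} \ln 2\pi - \frac{1}{2} \ln |\Sigma_i| + \ln P(\omega_i)$, the terms $-\frac{d}{2} \ln 2\pi$ and $-\frac{1}{2} \ln |\Sigma_i|$ are irrelevant (the same for every class).
$g_i(\mathbf{x}) = -\frac{\|\mathbf{x} - \boldsymbol{\mu}_i\|^2}{2\sigma^2} + \ln P(\omega_i) = -\frac{1}{2\sigma^2} (\mathbf{x}^T \mathbf{x} - 2\boldsymbol{\mu}_i^T \mathbf{x} + \boldsymbol{\mu}_i^T \boldsymbol{\mu}_i) + \ln P(\omega_i)$, where $\mathbf{x}^T \mathbf{x}$ is also irrelevant.
54 Case 1: $\Sigma_i = \sigma^2 I$
$g_i(\mathbf{x}) = \frac{1}{\sigma^2} \boldsymbol{\mu}_i^T \mathbf{x} - \frac{1}{2\sigma^2} \boldsymbol{\mu}_i^T \boldsymbol{\mu}_i + \ln P(\omega_i)$
It is a linear discriminant function: $g_i(\mathbf{x}) = \mathbf{w}_i^T \mathbf{x} + w_{i0}$, where
weight vector: $\mathbf{w}_i = \frac{1}{\sigma^2} \boldsymbol{\mu}_i$
threshold/bias: $w_{i0} = -\frac{1}{2\sigma^2} \boldsymbol{\mu}_i^T \boldsymbol{\mu}_i + \ln P(\omega_i)$
55 Case 1: $\Sigma_i = \sigma^2 I$
Boundary between $\omega_i$ and $\omega_j$: $g_i(\mathbf{x}) = g_j(\mathbf{x})$, i.e. $(\mathbf{w}_i - \mathbf{w}_j)^T \mathbf{x} + (w_{i0} - w_{j0}) = 0$, which can be written as
$\mathbf{w}^T (\mathbf{x} - \mathbf{x}_0) = 0$, with $\mathbf{w} = \boldsymbol{\mu}_i - \boldsymbol{\mu}_j$ and $\mathbf{x}_0 = \frac{1}{2} (\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \frac{\sigma^2}{\|\boldsymbol{\mu}_i - \boldsymbol{\mu}_j\|^2} \ln \frac{P(\omega_i)}{P(\omega_j)} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$
56 Case 1: $\Sigma_i = \sigma^2 I$
The decision boundary will be a hyperplane perpendicular to the line between the means, located somewhere along that line.
$\mathbf{x}_0$ is the midpoint of the means if $P(\omega_i) = P(\omega_j)$.
57 Case 1: $\Sigma_i = \sigma^2 I$
Minimum distance classifier (template matching).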
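The linear discriminant for Case 1 can be sketched directly from the weight vector and bias above. The class means, the variance, and the priors below are illustrative assumptions, as are the function names.

```python
import math


def linear_discriminant_case1(mu, sigma2, prior):
    """Return (w_i, w_i0) for Sigma_i = sigma^2 I."""
    w = [m / sigma2 for m in mu]                        # w_i = mu_i / sigma^2
    w0 = -sum(m * m for m in mu) / (2.0 * sigma2) + math.log(prior)
    return w, w0


def classify(x, means, sigma2, priors):
    """Assign x to the class with the largest g_i(x) = w_i^T x + w_i0."""
    scores = []
    for mu, prior in zip(means, priors):
        w, w0 = linear_discriminant_case1(mu, sigma2, prior)
        scores.append(sum(wi * xi for wi, xi in zip(w, x)) + w0)
    return max(range(len(scores)), key=scores.__getitem__)


MEANS = [[0.0, 0.0], [4.0, 0.0]]     # illustrative class means
equal = [classify(x, MEANS, 1.0, [0.5, 0.5]) for x in ([1.9, 0.0], [2.1, 0.0])]
skewed = classify([2.1, 0.0], MEANS, 1.0, [0.9, 0.1])
print(equal, skewed)
```

With equal priors the boundary sits at the midpoint of the means, so the rule acts as a minimum distance classifier. With unequal priors the boundary shifts toward the less probable class, and the point (2.1, 0) changes label.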
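61 Case 2: $\Sigma_i = \Sigma$
Starting from $g_i(\mathbf{x}) = -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \Sigma_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) - \frac{d}{2} \ln 2\pi - \frac{1}{2} \ln |\Sigma_i| + \ln P(\omega_i)$, the terms $-\frac{d}{2} \ln 2\pi$ and $-\frac{1}{2} \ln |\Sigma_i|$ are irrelevant.
$g_i(\mathbf{x}) = -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) + \ln P(\omega_i)$: the first term is the squared Mahalanobis distance, and $\ln P(\omega_i)$ is irrelevant if $P(\omega_i) = P(\omega_j)$ for all $i, j$.
Expanding and dropping the irrelevant term $\mathbf{x}^T \Sigma^{-1} \mathbf{x}$: $g_i(\mathbf{x}) = \mathbf{w}_i^T \mathbf{x} + w_{i0}$, with $\mathbf{w}_i = \Sigma^{-1} \boldsymbol{\mu}_i$ and $w_{i0} = -\frac{1}{2} \boldsymbol{\mu}_i^T \Sigma^{-1} \boldsymbol{\mu}_i + \ln P(\omega_i)$
62 Case 2: $\Sigma_i = \Sigma$
Boundary between $\omega_i$ and $\omega_j$: $g_i(\mathbf{x}) = g_j(\mathbf{x})$, which can be written as $\mathbf{w}^T (\mathbf{x} - \mathbf{x}_0) = 0$, with
$\mathbf{w} = \Sigma^{-1} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$ and $\mathbf{x}_0 = \frac{1}{2} (\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \frac{\ln [P(\omega_i) / P(\omega_j)]}{(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)^T \Sigma^{-1} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$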
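65 Case 3: $\Sigma_i \neq \Sigma_j$
Starting from $g_i(\mathbf{x}) = -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \Sigma_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) - \frac{d}{2} \ln 2\pi - \frac{1}{2} \ln |\Sigma_i| + \ln P(\omega_i)$, only the term $-\frac{d}{2} \ln 2\pi$ is irrelevant:
$g_i(\mathbf{x}) = -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \Sigma_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i) - \frac{1}{2} \ln |\Sigma_i| + \ln P(\omega_i)$
$g_i(\mathbf{x}) = \mathbf{x}^T W_i \mathbf{x} + \mathbf{w}_i^T \mathbf{x} + w_{i0}$, with $W_i = -\frac{1}{2} \Sigma_i^{-1}$, $\mathbf{w}_i = \Sigma_i^{-1} \boldsymbol{\mu}_i$, and $w_{i0} = -\frac{1}{2} \boldsymbol{\mu}_i^T \Sigma_i^{-1} \boldsymbol{\mu}_i - \frac{1}{2} \ln |\Sigma_i| + \ln P(\omega_i)$
Cases 1 and 2 are the same form without the quadratic term $\mathbf{x}^T W_i \mathbf{x}$.
Decision surfaces are hyperquadrics, e.g., hyperplanes, hyperspheres, hyperellipsoids, hyperhyperboloids.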
66 Case 3: $\Sigma_i \neq \Sigma_j$
Non-simply connected decision regions can arise in one dimension for Gaussians having unequal variance.
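A one-dimensional sketch shows how a disconnected decision region can appear. It uses the Case 3 discriminant specialized to $d = 1$, where $|\Sigma_i|$ reduces to the variance; the two Gaussians below (same mean, variances 1 and 16, equal priors) are illustrative assumptions.

```python
import math


def g_quadratic_1d(x, mu, var, prior):
    """Case 3 discriminant in one dimension (constant term dropped)."""
    return -0.5 * (x - mu) ** 2 / var - 0.5 * math.log(var) + math.log(prior)


def decide(x, params):
    """Index of the class with the largest discriminant value at x."""
    scores = [g_quadratic_1d(x, mu, var, prior) for mu, var, prior in params]
    return max(range(len(scores)), key=scores.__getitem__)


# (mean, variance, prior): a narrow class 0 and a wide class 1, both at 0.
PARAMS = [(0.0, 1.0, 0.5), (0.0, 16.0, 0.5)]
labels = [decide(x, PARAMS) for x in (-5.0, 0.0, 5.0)]
print(labels)
```

The wide class wins on both sides of the narrow one, so its decision region consists of two separate intervals, one on each side of the origin.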
71 Summary
Bayesian Decision Theory
Basic concepts
Bayes theorem
Bayes decision rule
Feasibility of Bayes Decision Rule
Prior probability + likelihood
Solution I: counting relative frequencies
Solution II: conduct density estimation
72 Summary
Bayes decision rule: the general scenario
Allowing more than one feature
Allowing more than two states of nature
Allowing actions other than merely deciding the state of nature
Loss function
Expected loss (conditional risk)
General Bayes decision rule
Minimum-error-rate classification
Discriminant functions
Gaussian density
Discriminant functions for Gaussian pdf
73 k-means
74 Any questions?
More informationWeek3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity
Week3, Chapter 4 Moton n Two Dmensons Lecture Quz A partcle confned to moton along the x axs moves wth constant acceleraton from x =.0 m to x = 8.0 m durng a 1-s tme nterval. The velocty of the partcle
More informationHomework Assignment 3 Due in class, Thursday October 15
Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.
More informationCommunication with AWGN Interference
Communcaton wth AWG Interference m {m } {p(m } Modulator s {s } r=s+n Recever ˆm AWG n m s a dscrete random varable(rv whch takes m wth probablty p(m. Modulator maps each m nto a waveform sgnal s m=m
More informationPROBABILITY PRIMER. Exercise Solutions
PROBABILITY PRIMER Exercse Solutons 1 Probablty Prmer, Exercse Solutons, Prncples of Econometrcs, e EXERCISE P.1 (b) X s a random varable because attendance s not known pror to the outdoor concert. Before
More informationCS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements
CS 750 Machne Learnng Lecture 5 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square CS 750 Machne Learnng Announcements Homework Due on Wednesday before the class Reports: hand n before
More informationMarkov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement
Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs
More informationStatistical Foundations of Pattern Recognition
Statstcal Foundatons of Pattern Recognton Learnng Objectves Bayes Theorem Decson-mang Confdence factors Dscrmnants The connecton to neural nets Statstcal Foundatons of Pattern Recognton NDE measurement
More informationStat 543 Exam 2 Spring 2016
Stat 543 Exam 2 Sprng 2016 I have nether gven nor receved unauthorzed assstance on ths exam. Name Sgned Date Name Prnted Ths Exam conssts of 11 questons. Do at least 10 of the 11 parts of the man exam.
More informationProbabilistic Classification: Bayes Classifiers. Lecture 6:
Probablstc Classfcaton: Bayes Classfers Lecture : Classfcaton Models Sam Rowes January, Generatve model: p(x, y) = p(y)p(x y). p(y) are called class prors. p(x y) are called class condtonal feature dstrbutons.
More informationProbability Theory (revisited)
Probablty Theory (revsted) Summary Probablty v.s. plausblty Random varables Smulaton of Random Experments Challenge The alarm of a shop rang. Soon afterwards, a man was seen runnng n the street, persecuted
More informationProbability Theory. The nth coefficient of the Taylor series of f(k), expanded around k = 0, gives the nth moment of x as ( ik) n n!
8333: Statstcal Mechancs I Problem Set # 3 Solutons Fall 3 Characterstc Functons: Probablty Theory The characterstc functon s defned by fk ep k = ep kpd The nth coeffcent of the Taylor seres of fk epanded
More informationStat 543 Exam 2 Spring 2016
Stat 543 Exam 2 Sprng 206 I have nether gven nor receved unauthorzed assstance on ths exam. Name Sgned Date Name Prnted Ths Exam conssts of questons. Do at least 0 of the parts of the man exam. I wll score
More informationHere is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)
Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,
More informationWhich Separator? Spring 1
Whch Separator? 6.034 - Sprng 1 Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng 3 Margn of a pont " # y (w $ + b) proportonal
More informationClustering & (Ken Kreutz-Delgado) UCSD
Clusterng & Unsupervsed Learnng Nuno Vasconcelos (Ken Kreutz-Delgado) UCSD Statstcal Learnng Goal: Gven a relatonshp between a feature vector x and a vector y, and d data samples (x,y ), fnd an approxmatng
More informationLecture 20: Hypothesis testing
Lecture : Hpothess testng Much of statstcs nvolves hpothess testng compare a new nterestng hpothess, H (the Alternatve hpothess to the borng, old, well-known case, H (the Null Hpothess or, decde whether
More informationSee Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)
Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes
More informationThe exam is closed book, closed notes except your one-page cheat sheet.
CS 89 Fall 206 Introducton to Machne Learnng Fnal Do not open the exam before you are nstructed to do so The exam s closed book, closed notes except your one-page cheat sheet Usage of electronc devces
More informationFeb 14: Spatial analysis of data fields
Feb 4: Spatal analyss of data felds Mappng rregularly sampled data onto a regular grd Many analyss technques for geophyscal data requre the data be located at regular ntervals n space and/or tme. hs s
More information1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands
Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of
More informationComparison of Regression Lines
STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence
More informationInner Product. Euclidean Space. Orthonormal Basis. Orthogonal
Inner Product Defnton 1 () A Eucldean space s a fnte-dmensonal vector space over the reals R, wth an nner product,. Defnton 2 (Inner Product) An nner product, on a real vector space X s a symmetrc, blnear,
More informationClassification Bayesian Classifiers
lassfcaton Bayesan lassfers Jeff Howbert Introducton to Machne Learnng Wnter 2014 1 Bayesan classfcaton A robablstc framework for solvng classfcaton roblems. Used where class assgnment s not determnstc,.e.
More informationComputation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models
Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,
More informationChapter 1. Probability
Chapter. Probablty Mcroscopc propertes of matter: quantum mechancs, atomc and molecular propertes Macroscopc propertes of matter: thermodynamcs, E, H, C V, C p, S, A, G How do we relate these two propertes?
More informationMean Field / Variational Approximations
Mean Feld / Varatonal Appromatons resented by Jose Nuñez 0/24/05 Outlne Introducton Mean Feld Appromaton Structured Mean Feld Weghted Mean Feld Varatonal Methods Introducton roblem: We have dstrbuton but
More informationÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE
ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE School of Computer and Communcaton Scences Handout 0 Prncples of Dgtal Communcatons Solutons to Problem Set 4 Mar. 6, 08 Soluton. If H = 0, we have Y = Z Z = Y
More informationTHE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE
THE ROYAL STATISTICAL SOCIETY 6 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutons to assst canddates preparng for the eamnatons n future years and for
More information( ) [ ] MAP Decision Rule
Announcemens Bayes Decson Theory wh Normal Dsrbuons HW0 due oday HW o be assgned soon Proec descrpon posed Bomercs CSE 90 Lecure 4 CSE90, Sprng 04 CSE90, Sprng 04 Key Probables 4 ω class label X feaure
More informationMACHINE APPLIED MACHINE LEARNING LEARNING. Gaussian Mixture Regression
11 MACHINE APPLIED MACHINE LEARNING LEARNING MACHINE LEARNING Gaussan Mture Regresson 22 MACHINE APPLIED MACHINE LEARNING LEARNING Bref summary of last week s lecture 33 MACHINE APPLIED MACHINE LEARNING
More informationβ0 + β1xi and want to estimate the unknown
SLR Models Estmaton Those OLS Estmates Estmators (e ante) v. estmates (e post) The Smple Lnear Regresson (SLR) Condtons -4 An Asde: The Populaton Regresson Functon B and B are Lnear Estmators (condtonal
More informationKernel Methods and SVMs Extension
Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general
More informationComparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method
Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method
More informationBasically, if you have a dummy dependent variable you will be estimating a probability.
ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy
More informationMaximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models
ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models
More informationSupport Vector Machines. Vibhav Gogate The University of Texas at dallas
Support Vector Machnes Vbhav Gogate he Unversty of exas at dallas What We have Learned So Far? 1. Decson rees. Naïve Bayes 3. Lnear Regresson 4. Logstc Regresson 5. Perceptron 6. Neural networks 7. K-Nearest
More informationSome basic statistics and curve fitting techniques
Some basc statstcs and curve fttng technques Statstcs s the dscplne concerned wth the study of varablty, wth the study of uncertanty, and wth the study of decsonmakng n the face of uncertanty (Lndsay et
More informationLecture Nov
Lecture 18 Nov 07 2008 Revew Clusterng Groupng smlar obects nto clusters Herarchcal clusterng Agglomeratve approach (HAC: teratvely merge smlar clusters Dfferent lnkage algorthms for computng dstances
More informationModule 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur
Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:
More information7. Multivariate Probability
7. Multvarate Probablty Chrs Pech and Mehran Saham May 2017 Often you wll work on problems where there are several random varables (often nteractng wth one another). We are gong to start to formally look
More information