PR Lecture 4: Bayes Classification Rule. Theory and Applications of Pattern Recognition, Dept. of Electrical and Computer Engineering
Slide 1. Theory and Applications of Pattern Recognition
(c) 2003, Robi Polikar, Rowan University, Glassboro, NJ

Lecture 4: Bayes Classification Rule
Dept. of Electrical and Computer Engineering

Topics: a priori / a posteriori probabilities; the loss function; the Bayes decision rule; the likelihood ratio test; the maximum a posteriori (MAP) criterion; minimum error rate classification; discriminant functions; error bounds and probabilities.

Background: Pattern Classification, Duda, Hart and Stork, Copyright John Wiley and Sons, 2001. PR logo Copyright Robi Polikar, 2001.
Slide 2. Today in PR
Review of Bayes theorem. Bayes decision theory: Bayes rule, loss function and expected loss, minimum error rate classification, classification using discriminant functions, error bounds and probabilities.
Slide 3. Bayes Rule
Suppose we know $P(\omega_1)$, $P(\omega_2)$, $p(x \mid \omega_1)$ and $p(x \mid \omega_2)$, and that we have observed the value of the feature (a random variable) $x$. How would you decide on the state of nature (type of fish) based on this information? Bayes theorem allows us to compute the posterior probabilities from the prior and class-conditional probabilities:

$$P(\omega_j \mid x) = \frac{P(x \cap \omega_j)}{P(x)} = \frac{p(x \mid \omega_j)\, P(\omega_j)}{\sum_{k=1}^{C} p(x \mid \omega_k)\, P(\omega_k)}$$

Likelihood: the (class-conditional) probability of observing a feature value of $x$, given that the correct class is $\omega_j$. All things being equal, the category with the higher class-conditional probability is more likely to be the correct class.
Posterior probability: the (conditional) probability of the correct class being $\omega_j$, given that feature value $x$ has been observed.
Prior probability: the total probability of the correct class being $\omega_j$, determined from prior experience.
Evidence: the total probability of observing the feature value $x$.
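A minimal numerical sketch of this computation (the prior and likelihood values below are illustrative assumptions, not from the lecture):

```python
import numpy as np

# Hypothetical two-class problem: priors and class-conditional
# likelihoods evaluated at one observed feature value x.
priors = np.array([0.7, 0.3])          # P(w1), P(w2): assumed values
likelihoods = np.array([0.05, 0.20])   # p(x|w1), p(x|w2): assumed values

evidence = np.sum(likelihoods * priors)        # P(x), the normalizer
posteriors = likelihoods * priors / evidence   # P(w_j | x) via Bayes rule

print(posteriors)        # -> [0.368... 0.631...]
print(posteriors.sum())  # -> 1.0, as posteriors must
```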
Slide 4. Bayes Decision Rule
Choose $\omega_i$ if $P(\omega_i \mid x) > P(\omega_j \mid x)$ for all $j \neq i$, $i, j = 1, 2, \ldots, c$. If there are multiple features, $\mathbf{x} = \{x_1, x_2, \ldots, x_d\}$: choose $\omega_i$ if $P(\omega_i \mid \mathbf{x}) > P(\omega_j \mid \mathbf{x})$ for all $j \neq i$, $i, j = 1, \ldots, c$.
Slide 5. The Loss Function
A mathematical description of how costly each action (making a class decision) is. Are certain mistakes costlier than others?
- $\{\omega_1, \omega_2, \ldots, \omega_c\}$: the set of states of nature (classes).
- $\{\alpha_1, \alpha_2, \ldots, \alpha_a\}$: the set of possible actions. Note that $a$ need not equal $c$, because we may have more (or fewer) actions than classes; for example, not making a decision (rejection) is also an action.
- $\{\lambda_1, \lambda_2, \ldots, \lambda_a\}$: the losses associated with each action.
- $\lambda(\alpha_i \mid \omega_j)$: the loss function, the loss incurred by taking action $\alpha_i$ when the true state of nature is in fact $\omega_j$.
- $R(\alpha_i \mid x)$: the conditional risk, the expected loss for taking action $\alpha_i$:

$$R(\alpha_i \mid x) = \sum_{j=1}^{c} \lambda(\alpha_i \mid \omega_j)\, P(\omega_j \mid x)$$

The Bayes decision takes the action that minimizes the conditional risk (see the sketch below)!
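As a sketch, the conditional risk for every action is just the loss matrix applied to the posterior vector; the loss values here are invented for illustration:

```python
import numpy as np

# Rows are actions a_i (decide w1, decide w2, reject); columns are true
# states w_j. Entries are lambda(a_i | w_j): assumed, not from the lecture.
loss = np.array([[0.0, 10.0],   # deciding w1 is costly if the truth is w2
                 [5.0,  0.0],   # deciding w2 is costly if the truth is w1
                 [1.0,  1.0]])  # rejection carries a small constant cost

posteriors = np.array([0.368, 0.632])   # P(w_j | x), e.g. from Bayes rule

cond_risk = loss @ posteriors            # R(a_i | x) for each action
print(cond_risk)                         # -> [6.32 1.84 1.0]
print('Bayes action:', np.argmin(cond_risk))   # minimum conditional risk
```

Here the small rejection cost wins: when the posteriors are this ambiguous relative to the losses, refusing to decide is the minimum-risk action.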
Slide 6. Bayes Decision Rule Using Conditional Risk
1. Compute the conditional risk $R(\alpha_i \mid x) = \sum_{j=1}^{c} \lambda(\alpha_i \mid \omega_j)\, P(\omega_j \mid x)$ for each action $\alpha_i$.
2. Select the action that has the minimum conditional risk; let this be action $\alpha_k$.
3. The overall risk is then

$$R = \int_{x \in X} R(\alpha_k \mid x)\, p(x)\, dx$$

integrated over all possible values of $x$, where $R(\alpha_k \mid x)$ is the conditional risk associated with taking action $\alpha(x)$ based on the observation $x$, and $p(x)$ is the probability that $x$ will be observed.
4. This is the Bayes risk, the minimum possible risk that can be achieved by any classifier!
Slide 7. Two-Class Special Case
Definitions: $\alpha_1$: decide on $\omega_1$; $\alpha_2$: decide on $\omega_2$; $\lambda_{ij} = \lambda(\alpha_i \mid \omega_j)$: the loss for deciding on $\omega_i$ when the SON (state of nature) is $\omega_j$. The conditional risks are

$$R(\alpha_1 \mid x) = \lambda_{11} P(\omega_1 \mid x) + \lambda_{12} P(\omega_2 \mid x)$$
$$R(\alpha_2 \mid x) = \lambda_{21} P(\omega_1 \mid x) + \lambda_{22} P(\omega_2 \mid x)$$

Note that $\lambda_{11}$ and $\lambda_{22}$ need not be zero, though we expect $\lambda_{11} < \lambda_{12}$ and $\lambda_{22} < \lambda_{21}$. Decide on $\omega_1$ if $R(\alpha_1 \mid x) < R(\alpha_2 \mid x)$; decide on $\omega_2$ otherwise. Equivalently:

$$\Lambda(x) = \frac{p(x \mid \omega_1)}{p(x \mid \omega_2)} \;\underset{\omega_2}{\overset{\omega_1}{\gtrless}}\; \frac{(\lambda_{12} - \lambda_{22})}{(\lambda_{21} - \lambda_{11})} \cdot \frac{P(\omega_2)}{P(\omega_1)}$$

The Likelihood Ratio Test (LRT): pick $\omega_1$ if the likelihood ratio is greater than a threshold that is independent of $x$. This rule, which minimizes the Bayes risk, is also called the Bayes criterion.
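A sketch of this two-class LRT with 1-D Gaussian likelihoods; the means, variances, priors, and losses below are illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

# Assumed class-conditional densities and priors
p1 = norm(loc=0.0, scale=1.0)    # p(x|w1)
p2 = norm(loc=2.0, scale=1.0)    # p(x|w2)
P1, P2 = 0.5, 0.5
l11, l12, l21, l22 = 0.0, 2.0, 1.0, 0.0   # assumed losses

# The decision threshold does not depend on x
threshold = (l12 - l22) / (l21 - l11) * (P2 / P1)

def decide(x):
    lrt = p1.pdf(x) / p2.pdf(x)           # likelihood ratio Lambda(x)
    return 'w1' if lrt > threshold else 'w2'

print(decide(0.3), decide(1.8))           # -> w1 w2
```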
Slide 8. Example
(Worked example shown as a figure; from R. Gutierrez-Osuna, TAMU.)
Slide 9. Example (to be fully solved on request on Friday)
(Worked example with a loss matrix $\lambda_{11}, \lambda_{12}, \lambda_{21}, \lambda_{22}$ shown as a figure; modified from R. Gutierrez-Osuna, TAMU.)
Slide 10. Minimum Error-Rate Classification: Multiclass Case
If we associate taking action $\alpha_i$ with selecting class $\omega_i$, and if all errors are equally costly, we obtain the zero-one loss (a symmetric cost function):

$$\lambda(\alpha_i \mid \omega_j) = \begin{cases} 0, & i = j \\ 1, & i \neq j \end{cases}$$

This loss function assigns no loss to a correct classification and unit loss to any misclassification. The risk corresponding to this loss function is then

$$R(\alpha_i \mid x) = \sum_{j \neq i} P(\omega_j \mid x) = 1 - P(\omega_i \mid x)$$

What does this tell us? To minimize this risk (the average probability of error), we choose the class that maximizes the posterior probability: decide $\omega_i$ if $P(\omega_i \mid x) > P(\omega_j \mid x)$ for all $j \neq i$, $i = 1, \ldots, c$. This is the maximum a posteriori (MAP) criterion. In likelihood-ratio form,

$$\Lambda(x) = \frac{p(x \mid \omega_1)}{p(x \mid \omega_2)} \;\underset{\omega_2}{\overset{\omega_1}{\gtrless}}\; \frac{P(\omega_2)}{P(\omega_1)}$$

which reduces to the maximum likelihood criterion for equal priors.
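Under the zero-one loss, the general risk machinery collapses to picking the largest posterior; a quick sketch with an assumed posterior vector:

```python
import numpy as np

posteriors = np.array([0.15, 0.55, 0.30])   # assumed P(w_j | x), c = 3

# Zero-one loss matrix: 0 on the diagonal, 1 everywhere else
c = len(posteriors)
loss = np.ones((c, c)) - np.eye(c)

risk = loss @ posteriors                 # R(a_i|x) = 1 - P(w_i|x)
print(risk)                              # -> [0.85 0.45 0.70]
print(np.argmin(risk) == np.argmax(posteriors))   # -> True (MAP decision)
```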
Slide 11. Error Probabilities (Bayes Rule Rules!)
In a two-class case there are two sources of error: $x$ falls in $R_1$ yet the SON is $\omega_2$, or vice versa. Thus

$$P(\text{error}) = \int p(\text{error} \mid x)\, p(x)\, dx = \int_{R_1} P(\omega_2 \mid x)\, p(x)\, dx + \int_{R_2} P(\omega_1 \mid x)\, p(x)\, dx$$

equivalently

$$P(\text{error}) = P(x \in R_1, \omega_2) + P(x \in R_2, \omega_1), \qquad P(x \in R_i, \omega_j) = P(x \in R_i \mid \omega_j)\, P(\omega_j)$$

(Figure: $x_B$ marks the optimal Bayes decision boundary; $x^*$ marks a non-optimal one.)
Slide 12. Probability of Error
In the multi-class case there are more ways to be wrong than to be right, so we exploit the fact that $P(\text{error}) = 1 - P(\text{correct})$, where

$$P(\text{correct}) = \sum_{i=1}^{C} P(x \in R_i, \omega_i) = \sum_{i=1}^{C} P(x \in R_i \mid \omega_i)\, P(\omega_i) = \sum_{i=1}^{C} \int_{R_i} p(x \mid \omega_i)\, P(\omega_i)\, dx = \sum_{i=1}^{C} \int_{R_i} P(\omega_i \mid x)\, p(x)\, dx$$

Of course, in order to minimize $P(\text{error})$ we need to maximize $P(\text{correct})$, for which we need to maximize each and every one of the integrals. Note that $p(x)$ is common to all integrals; therefore the expression is maximized by choosing the decision regions $R_i$ where the posterior probabilities $P(\omega_i \mid x)$ are maximum. (From R. Gutierrez-Osuna, TAMU.)
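These integrals can be checked numerically in 1-D; a sketch for two assumed Gaussian classes, integrating the upper envelope of the joint densities over the MAP decision regions:

```python
import numpy as np
from scipy.stats import norm

# Assumed two-class 1-D problem
P = np.array([0.6, 0.4])
pdfs = [norm(0.0, 1.0), norm(3.0, 1.5)]

x, dx = np.linspace(-10.0, 15.0, 20001, retstep=True)
joint = np.vstack([P[i] * pdfs[i].pdf(x) for i in range(2)])  # P(w_i) p(x|w_i)

# MAP decision regions keep, at each x, the class with the largest joint
# density; integrating that upper envelope gives P(correct).
p_correct = np.sum(joint.max(axis=0)) * dx     # simple Riemann sum
print('P(correct) =', p_correct, ' P(error) =', 1 - p_correct)
```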
Slide 13. Discriminant Based Classification
A discriminant is a function $g_i(x)$ that discriminates between classes. This function assigns the input vector to a class according to its definition: choose class $\omega_i$ if $g_i(x) > g_j(x)$ for all $j \neq i$, $i, j = 1, 2, \ldots, c$. Bayes rule can be implemented in terms of discriminant functions: $g_i(x) = P(\omega_i \mid x)$. The $c$ discriminant functions generate $c$ decision regions $R_1, \ldots, R_c$, which are separated by decision boundaries. Decision regions need NOT be contiguous. The decision boundary satisfies $g_i(x) = g_j(x)$, and $x \in R_i$ if $g_i(x) > g_j(x)$ for all $j \neq i$, $j = 1, 2, \ldots, c$.
Slide 14. Discriminant Functions
We may view the classifier as an automated machine that computes $c$ discriminants and selects the category corresponding to the largest discriminant. A neural network is one such classifier.
- For the Bayes classifier with non-uniform risks: $g_i(x) = -R(\alpha_i \mid x)$ (negated, since we maximize discriminants but minimize risk).
- For the MAP classifier (uniform risks): $g_i(x) = P(\omega_i \mid x)$.
- For the maximum likelihood classifier (equal priors): $g_i(x) = P(x \mid \omega_i)$.
Slide 15. Discriminant Functions (continued)
In fact, multiplying every DF by the same positive constant, or adding/subtracting the same constant to all DFs, does not change the decision boundary. In general, every $g_i(x)$ can be replaced by $f(g_i(x))$, where $f(\cdot)$ is any monotonically increasing function, without affecting the actual decision boundary. Some linear or nonlinear transformations of the previously stated DFs may greatly simplify the design of the classifier. What examples can you think of?
Slide 16. Normal Densities
If the likelihoods are normally distributed, a number of simplifications can be made. For $p(x \mid \omega_i) \sim N(\mu_i, \Sigma_i)$,

$$p(x \mid \omega_i) = \frac{1}{(2\pi)^{d/2}\, |\Sigma_i|^{1/2}} \exp\!\left[-\tfrac{1}{2} (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)\right]$$

In particular, the discriminant function can be written in this greatly simplified form (!):

$$g_i(x) = -\tfrac{1}{2} (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) - \tfrac{d}{2} \ln 2\pi - \tfrac{1}{2} \ln |\Sigma_i| + \ln P(\omega_i)$$

There are three distinct cases that can occur:
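A direct transcription of this discriminant into code; the Gaussian parameters below are placeholders, a sketch rather than anything from the lecture:

```python
import numpy as np

def gaussian_discriminant(x, mu, Sigma, prior):
    """g_i(x) for p(x|w_i) ~ N(mu, Sigma), following the slide's formula."""
    d = len(mu)
    diff = x - mu
    Sigma_inv = np.linalg.inv(Sigma)
    return (-0.5 * diff @ Sigma_inv @ diff
            - 0.5 * d * np.log(2.0 * np.pi)
            - 0.5 * np.log(np.linalg.det(Sigma))
            + np.log(prior))

# Assumed parameters for two classes in 2-D
mu1, S1 = np.array([0.0, 0.0]), np.eye(2)
mu2, S2 = np.array([2.0, 2.0]), np.array([[2.0, 0.5], [0.5, 1.0]])

x = np.array([1.0, 1.5])
g = [gaussian_discriminant(x, mu1, S1, 0.5),
     gaussian_discriminant(x, mu2, S2, 0.5)]
print('decide class', int(np.argmax(g)) + 1)
```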
Slide 17. Case 1: $\Sigma_i = \sigma^2 I$
Features are statistically independent, and all features have the same variance. Distributions are spherical in $d$ dimensions, the boundary is a generalized hyperplane (linear discriminant) of $d-1$ dimensions, and features create equal-sized hyperspherical clusters. (Figure: examples of such hyperspherical clusters.) The general form of the discriminant is then

$$g_i(x) = -\frac{\lVert x - \mu_i \rVert^2}{2\sigma^2} + \ln P(\omega_i)$$

If the priors are the same:

$$g_i(x) = -\frac{(x - \mu_i)^T (x - \mu_i)}{2\sigma^2}$$

This is the minimum distance classifier.
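With equal priors, the rule reduces to assigning each point to the nearest class mean; a sketch with assumed means:

```python
import numpy as np

# Minimum (Euclidean) distance classifier: valid when Sigma_i = s^2 I
# and the priors are equal. The class means below are assumed.
means = np.array([[0.0, 0.0],
                  [3.0, 1.0],
                  [1.0, 4.0]])

def classify(x):
    d2 = np.sum((means - x) ** 2, axis=1)   # squared distance to each mean
    return int(np.argmin(d2))               # nearest mean wins

print(classify(np.array([2.5, 1.2])))       # -> 1
```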
Slide 18. Case 1: $\Sigma_i = \sigma^2 I$ (continued)
This case results in linear discriminants that can be written in the form

$$g_i(x) = w_i^T x + w_{i0}, \qquad w_i = \frac{\mu_i}{\sigma^2}, \qquad w_{i0} = -\frac{\mu_i^T \mu_i}{2\sigma^2} + \ln P(\omega_i)$$

where $w_{i0}$ is the threshold (bias) of the $i$-th category. (Figures: the 1-D, 2-D, and 3-D cases.) Note how the priors shift the decision boundary away from the more likely mean!
Slide 19. Case 2: $\Sigma_i = \Sigma$
Covariance matrices are arbitrary, but equal to each other for all classes. Features then form hyperellipsoidal clusters of equal size and shape. This also results in linear discriminant functions whose decision boundaries are again hyperplanes:

$$g_i(x) = -\tfrac{1}{2} (x - \mu_i)^T \Sigma^{-1} (x - \mu_i) + \ln P(\omega_i)$$

equivalently

$$g_i(x) = w_i^T x + w_{i0}, \qquad w_i = \Sigma^{-1} \mu_i, \qquad w_{i0} = -\tfrac{1}{2} \mu_i^T \Sigma^{-1} \mu_i + \ln P(\omega_i)$$
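A sketch of these linear parameters with an assumed shared covariance (the quadratic term $x^T \Sigma^{-1} x$ is common to all classes, which is why it can be dropped):

```python
import numpy as np

# Case 2 (shared covariance): the discriminant is linear,
# g_i(x) = w_i^T x + w_i0. All parameters below are assumed.
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
Sigma_inv = np.linalg.inv(Sigma)

def linear_params(mu, prior):
    w = Sigma_inv @ mu
    w0 = -0.5 * mu @ Sigma_inv @ mu + np.log(prior)
    return w, w0

w1, w10 = linear_params(np.array([0.0, 0.0]), 0.5)
w2, w20 = linear_params(np.array([2.0, 1.0]), 0.5)

x = np.array([1.0, 0.4])
print('decide w1' if w1 @ x + w10 > w2 @ x + w20 else 'decide w2')
```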
Slide 20. Case 3: $\Sigma_i$ = arbitrary
All bets are off! In the two-class case the decision boundaries form hyperquadrics. The discriminant functions are now, in general, quadratic (not linear), and the decision regions may be non-contiguous:

$$g_i(x) = x^T W_i x + w_i^T x + w_{i0}$$
$$W_i = -\tfrac{1}{2} \Sigma_i^{-1}, \qquad w_i = \Sigma_i^{-1} \mu_i, \qquad w_{i0} = -\tfrac{1}{2} \mu_i^T \Sigma_i^{-1} \mu_i - \tfrac{1}{2} \ln |\Sigma_i| + \ln P(\omega_i)$$

(Figure: the boundary can be hyperbolic, parabolic, linear, ellipsoidal, or circular.)
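The quadratic parameters follow directly from each class's own covariance; a sketch with assumed parameters:

```python
import numpy as np

# Case 3 (arbitrary covariances): quadratic discriminant
# g_i(x) = x^T W_i x + w_i^T x + w_i0. All parameters are assumed.
def quadratic_params(mu, Sigma, prior):
    S_inv = np.linalg.inv(Sigma)
    W = -0.5 * S_inv
    w = S_inv @ mu
    w0 = (-0.5 * mu @ S_inv @ mu
          - 0.5 * np.log(np.linalg.det(Sigma))
          + np.log(prior))
    return W, w, w0

def g(x, params):
    W, w, w0 = params
    return x @ W @ x + w @ x + w0

params1 = quadratic_params(np.array([0.0, 0.0]), np.eye(2), 0.5)
params2 = quadratic_params(np.array([2.0, 2.0]), np.diag([3.0, 0.5]), 0.5)

x = np.array([1.0, 1.0])
print('decide w1' if g(x, params1) > g(x, params2) else 'decide w2')
```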
Slide 21. Case 3: $\Sigma_i$ = arbitrary (continued)
For the multi-class case, the boundaries look even more complicated. (Figure: an example set of decision boundaries.)
Slide 22. Case 3: $\Sigma_i$ = arbitrary (continued)
(Figure: the same situation in 3-D.)
Slide 23. Conclusions
- The Bayes classifier for normally distributed classes is in general a quadratic classifier, and it can be computed analytically.
- The Bayes classifier for normally distributed classes with equal covariance matrices is a linear classifier.
- For normally distributed classes with equal covariance matrices and equal priors, it is a minimum Mahalanobis distance classifier.
- For normally distributed classes with equal covariance matrices proportional to the identity matrix and with equal priors, it is a minimum Euclidean distance classifier.
- Note that using a minimum Euclidean or Mahalanobis distance classifier implicitly makes certain assumptions about the statistical properties of the data, which may or may not (and in general are not) true. However, in many cases certain simplifications and approximations can be made that warrant making such assumptions even when they do not hold exactly. The bottom line in practice, in deciding whether the assumptions are warranted: does the damn thing solve my classification problem?
Slide 24. Error Bounds
It is difficult at best, if possible at all, to compute the error probabilities analytically, particularly when the decision regions are not contiguous. However, upper bounds for this error can be obtained: the Chernoff bound and its approximation, the Bhattacharyya bound, are two such bounds that are often used. If the distributions are Gaussian, these expressions are relatively easy to compute; often even non-Gaussian cases are treated as Gaussian.
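For two Gaussian classes the Bhattacharyya bound has a closed form, $P(\text{error}) \le \sqrt{P(\omega_1)\, P(\omega_2)}\; e^{-k(1/2)}$, where $k(1/2)$ is the Bhattacharyya distance; a sketch with assumed class parameters:

```python
import numpy as np

def bhattacharyya_bound(mu1, S1, mu2, S2, P1, P2):
    """Upper bound on P(error) for two Gaussian classes."""
    S = 0.5 * (S1 + S2)
    dmu = mu2 - mu1
    # Bhattacharyya distance k(1/2)
    k = (0.125 * dmu @ np.linalg.inv(S) @ dmu
         + 0.5 * np.log(np.linalg.det(S)
                        / np.sqrt(np.linalg.det(S1) * np.linalg.det(S2))))
    return np.sqrt(P1 * P2) * np.exp(-k)

# Assumed class parameters
mu1, S1 = np.array([0.0, 0.0]), np.eye(2)
mu2, S2 = np.array([2.0, 1.0]), np.array([[1.5, 0.2], [0.2, 1.0]])
print('P(error) <=', bhattacharyya_bound(mu1, S1, mu2, S2, 0.5, 0.5))
```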