Statistical pattern recognition

1 Statistical pattern recognition

2 Bayes theorem
Problem: deciding if a patient has a particular condition based on a particular test. However, the test is imperfect:
  Someone with the condition may go undetected (false negative)
  Someone without the condition may come out positive (false positive)
Test properties:
  SPECIFICITY or true negative rate: $P(\text{NEG} \mid \neg\text{COND})$
  SENSITIVITY or true positive rate: $P(\text{POS} \mid \text{COND})$
Ricardo Gutierrez-Osuna, TAMU CSE

3 Problem definition
Assume a population of 10,000, where 1 out of every 100 people has the medical condition.
Assume that we design a test with 98% specificity $P(\text{NEG} \mid \neg\text{COND})$ and 90% sensitivity $P(\text{POS} \mid \text{COND})$.
You take the test, and it comes out POSITIVE. What conditional probability are we after? How likely is it that you have the condition?

4 Solution: Joint frequency table
The answer is the ratio of individuals with the condition to total individuals (considering only individuals that tested positive), or 90/288 = 0.3125.

                    TEST IS POSITIVE                   TEST IS NEGATIVE                  ROW TOTAL
HAS CONDITION       True positive, P(POS|COND):        False negative, P(NEG|COND):      100
                    100 × 0.90 = 90                    100 × (1 − 0.90) = 10
FREE OF CONDITION   False positive, P(POS|¬COND):      True negative, P(NEG|¬COND):      9,900
                    9,900 × (1 − 0.98) = 198           9,900 × 0.98 = 9,702
COLUMN TOTAL        288                                9,712                             10,000

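5 Conditional probability
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{for } P(B) > 0$$
Total probability: if $B_1, B_2, \ldots, B_N$ partition the sample space $S$,
$$P(A) = P(A \cap S) = P(A \cap B_1) + \cdots + P(A \cap B_N) = P(A \mid B_1)P(B_1) + \cdots + P(A \mid B_N)P(B_N) = \sum_{k=1}^{N} P(A \mid B_k)P(B_k)$$
[Figures: Venn diagram of $A$ and $B$ inside $S$ once $B$ has occurred; partition of $S$ into $B_1, \ldots, B_N$ overlapping $A$]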

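6 Alternative solution: Bayes theorem
$$P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)}$$
$$P(\text{cond} \mid +) = \frac{P(+ \mid \text{cond})P(\text{cond})}{P(+)} = \frac{P(+ \mid \text{cond})P(\text{cond})}{P(+ \mid \text{cond})P(\text{cond}) + P(+ \mid \neg\text{cond})P(\neg\text{cond})} = \frac{0.90 \times 0.01}{0.90 \times 0.01 + (1 - 0.98) \times 0.99} = 0.3125$$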
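The Bayes computation can be checked numerically. This is a minimal sketch, assuming a prevalence of 1 in 100, 90% sensitivity and 98% specificity; the function and parameter names are illustrative and not part of the slides.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """Return P(cond | +) using Bayes theorem."""
    true_pos = sensitivity * prevalence                    # P(+|cond) P(cond)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(+|no cond) P(no cond)
    return true_pos / (true_pos + false_pos)               # divide by P(+)


# Assumed example values: 1 in 100 has the condition, 90% sensitivity, 98% specificity
p = posterior_given_positive(prevalence=0.01, sensitivity=0.90, specificity=0.98)
print(round(p, 4))  # 0.3125, the same ratio as 90/288
```

Even with a fairly accurate test, a positive result here means roughly a 31% chance of having the condition, because the condition is rare.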

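7 In SPR, Bayes theorem is expressed as
$$P(\omega_j \mid x) = \frac{P(x \mid \omega_j)P(\omega_j)}{\sum_{k=1}^{N} P(x \mid \omega_k)P(\omega_k)} = \frac{P(x \mid \omega_j)P(\omega_j)}{P(x)}$$
where $P(\omega_j \mid x)$ is the posterior, $P(x \mid \omega_j)$ the likelihood, $P(\omega_j)$ the prior, and $P(x)$ the normalization constant.
And we assign sample $x$ to the class $\omega_k$ with the highest posterior. It can be shown that this rule minimizes the probability of error.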
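The decision rule can be sketched in a few lines. This example assumes one-dimensional Gaussian likelihoods and two made-up classes; the class parameters and the names `posteriors` and `classify` are hypothetical, chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """1-D Gaussian likelihood P(x | class)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, classes):
    """classes maps label -> (prior, mean, std); returns label -> P(label | x)."""
    joint = {c: prior * gaussian_pdf(x, mean, std)
             for c, (prior, mean, std) in classes.items()}
    evidence = sum(joint.values())  # P(x), the normalization constant
    return {c: j / evidence for c, j in joint.items()}


def classify(x, classes):
    """Assign x to the class with the highest posterior."""
    post = posteriors(x, classes)
    return max(post, key=post.get)


# Hypothetical classes: equal priors, unit variance, means at 0 and 3
classes = {"w1": (0.5, 0.0, 1.0), "w2": (0.5, 3.0, 1.0)}
label = classify(1.0, classes)
```

With equal priors and equal variances the decision boundary sits halfway between the two means, so `x = 1.0` falls on the `w1` side.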

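8 Discriminant functions
[Figure: the features $x_1, x_2, \ldots, x_d$ feed the discriminant functions $g_1(x), g_2(x), \ldots, g_C(x)$, which are combined with costs; a "select max" stage produces the class assignment]
Assign $x$ to class $\omega_i$ if $g_i(x) > g_j(x)$ for all $j \neq i$, where $g_i(x) = P(\omega_i \mid x)$.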

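9 Quadratic classifiers
For normally distributed classes, the posterior can be reduced to a very simple expression.
Recall that an $n$-dimensional Gaussian density is
$$p(x) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left[-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right]$$
Using Bayes rule, the discriminant function (DF) can be written as
$$g_i(x) = P(\omega_i \mid x) = \frac{P(x \mid \omega_i)P(\omega_i)}{P(x)} = \frac{1}{(2\pi)^{n/2}|\Sigma_i|^{1/2}} \exp\left[-\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1} (x-\mu_i)\right] \frac{P(\omega_i)}{P(x)}$$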

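10 Eliminating constant terms
$$g_i(x) = |\Sigma_i|^{-1/2} \exp\left[-\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1} (x-\mu_i)\right] P(\omega_i)$$
And taking logs
$$g_i(x) = -\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1} (x-\mu_i) - \frac{1}{2}\log|\Sigma_i| + \log P(\omega_i)$$
This is known as a quadratic discriminant function (because it is a quadratic function of $x$).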
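A minimal sketch of the log-form discriminant, restricted to two features so that the inverse and determinant of the covariance can be written in closed form. The two classes below are hypothetical and share a mean, so they differ only in covariance.

```python
import math


def quadratic_discriminant(x, mean, cov, prior):
    """g_i(x) = -1/2 (x-mu)^T Sigma^-1 (x-mu) - 1/2 log|Sigma| + log P(w_i), 2-D only."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))  # closed-form 2x2 inverse
    dx = (x[0] - mean[0], x[1] - mean[1])
    quad = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return -0.5 * quad - 0.5 * math.log(det) + math.log(prior)


# Hypothetical classes with the same mean but different covariance
tight = {"mean": (0.0, 0.0), "cov": ((1.0, 0.0), (0.0, 1.0)), "prior": 0.5}
wide = {"mean": (0.0, 0.0), "cov": ((4.0, 0.0), (0.0, 4.0)), "prior": 0.5}


def classify(x):
    g_tight = quadratic_discriminant(x, **tight)
    g_wide = quadratic_discriminant(x, **wide)
    return "tight" if g_tight > g_wide else "wide"
```

Points near the shared mean go to the tight class and distant points go to the wide class. A distance-to-mean rule cannot separate these two classes; the $\log|\Sigma_i|$ and $\Sigma_i^{-1}$ terms are what make the boundary quadratic.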

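11 Case 1: $\Sigma_i = \sigma^2 I$
Features are statistically independent, and have the same variance for all classes.
In this case, the quadratic discriminant function becomes
$$g_i(x) = -\frac{1}{2}(x-\mu_i)^T (\sigma^2 I)^{-1} (x-\mu_i) - \frac{1}{2}\log|\sigma^2 I| + \log P(\omega_i) = -\frac{1}{2\sigma^2}(x-\mu_i)^T (x-\mu_i) - \frac{1}{2}\log|\sigma^2 I| + \log P(\omega_i)$$
Assuming equal priors and dropping constant terms
$$g_i(x) = -(x-\mu_i)^T (x-\mu_i) = -\sum_{d=1}^{\text{DIM}} (x_d - \mu_{i,d})^2$$
This is called a Euclidean distance or nearest mean classifier.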
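The nearest mean rule is short enough to write out directly. A sketch with three hypothetical class means; maximizing $g_i(x)$ is the same as minimizing the squared Euclidean distance to $\mu_i$.

```python
def nearest_mean(x, means):
    """Return the label whose mean has the smallest squared Euclidean distance to x."""
    def sq_dist(mu):
        return sum((xd - md) ** 2 for xd, md in zip(x, mu))
    return min(means, key=lambda label: sq_dist(means[label]))


# Hypothetical class means in a 2-D feature space
means = {"w1": (0.0, 0.0), "w2": (4.0, 0.0), "w3": (0.0, 4.0)}
label = nearest_mean((3.0, 1.0), means)  # squared distances: 10, 2, 18
```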

12 [Figure: three-class example showing the means $\mu_1, \mu_2, \mu_3$ and covariance matrices $\Sigma_1, \Sigma_2, \Sigma_3$]

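13 Case 2: $\Sigma_i = \Sigma$
All classes have the same covariance matrix, but the matrix is not diagonal.
In this case, the quadratic discriminant becomes
$$g_i(x) = -\frac{1}{2}(x-\mu_i)^T \Sigma^{-1} (x-\mu_i) - \frac{1}{2}\log|\Sigma| + \log P(\omega_i)$$
Assuming equal priors and eliminating constants
$$g_i(x) = -(x-\mu_i)^T \Sigma^{-1} (x-\mu_i)$$
This is known as a Mahalanobis distance classifier.
[Figure: contours around $\mu$ in the $(x_1, x_2)$ plane comparing the Euclidean distance $\|x-\mu\|^2 = K$ with the Mahalanobis distance $\|x-\mu\|^2_{\Sigma^{-1}} = K$]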
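A sketch of the Mahalanobis rule for two features, again using the closed-form 2x2 inverse. The shared covariance and the two means are hypothetical, picked so that the Euclidean and Mahalanobis rules disagree on the query point.

```python
def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x-mu)^T Sigma^-1 (x-mu) for a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Sigma^-1 = 1/det * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def classify(x, means, cov):
    """Pick the class whose mean is closest in Mahalanobis distance."""
    return min(means, key=lambda label: mahalanobis_sq(x, means[label], cov))


# Hypothetical shared covariance, elongated along the (1, 1) direction
cov = ((2.0, 1.5), (1.5, 2.0))
means = {"w1": (0.0, 0.0), "w2": (3.0, 0.0)}
label = classify((2.0, 2.0), means, cov)
```

The point (2, 2) is closer to `w2` in Euclidean terms (squared distance 5 versus 8), but it lies along the direction of high variance from `w1`, so the Mahalanobis rule assigns it to `w1`.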

14 [Figure: three-class example showing the means $\mu_1, \mu_2, \mu_3$ and covariance matrices $\Sigma_1, \Sigma_2, \Sigma_3$]

15 General case
[Figure: three-class example showing the means $\mu_1, \mu_2, \mu_3$ and covariance matrices $\Sigma_1, \Sigma_2, \Sigma_3$, together with a zoomed-out view]

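16 k nearest neighbors
Non-parametric approximation. The likelihood of each class is
$$P(x \mid \omega_i) = \frac{k_i}{N_i V}$$
where $V$ is the volume around $x$ that contains the $k$ nearest neighbors, $k_i$ of which belong to class $\omega_i$, and $N_i$ is the number of examples from class $\omega_i$. And the priors are
$$P(\omega_i) = \frac{N_i}{N}$$
Then, the posterior becomes
$$P(\omega_i \mid x) = \frac{P(x \mid \omega_i)P(\omega_i)}{P(x)} = \frac{\dfrac{k_i}{N_i V} \cdot \dfrac{N_i}{N}}{\dfrac{k}{N V}} = \frac{k_i}{k}$$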

17 Example
Given the three classes, assign a class label for the unknown example $x_u$.
Assume the Euclidean distance and $k$ neighbors.
Of the $k$ closest neighbors, all but one belong to a single class $\omega_i$ and one belongs to a different class, so $x_u$ is assigned to $\omega_i$, the predominant class.
[Figure: scatter plot of the three classes with the unknown example $x_u$]
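A small sketch of the voting rule. The training points, the query point and the choice k = 5 are all made up for illustration and are not taken from the slide's figure.

```python
import math
from collections import Counter


def knn_classify(x, samples, k):
    """samples is a list of (point, label); returns (label, posterior estimates k_i / k)."""
    nearest = sorted(samples, key=lambda s: math.dist(x, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    posteriors = {label: count / k for label, count in votes.items()}
    return votes.most_common(1)[0][0], posteriors


# Hypothetical three-class training set
samples = [
    ((1.0, 1.0), "w1"), ((1.0, 2.0), "w1"), ((2.0, 1.0), "w1"), ((2.0, 2.0), "w1"),
    ((6.0, 6.0), "w2"), ((7.0, 6.0), "w2"),
    ((2.5, 3.0), "w3"), ((8.0, 1.0), "w3"),
]
label, post = knn_classify((2.0, 2.5), samples, k=5)
```

Four of the five nearest neighbors carry the label `w1` and one carries `w3`, so the query is assigned to `w1` with estimated posteriors of 0.8 and 0.2.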

19 [Figure: three panels comparing k-NN results for three different values of k]

20 Advantages
  Simple implementation
  Nearly optimal in the large sample limit ($N \to \infty$): $P[\text{error}]_{\text{Bayes}} < P[\text{error}]_{1\text{NN}} < 2\,P[\text{error}]_{\text{Bayes}}$
  Uses local information, which can yield highly adaptive behavior
  Lends itself very easily to parallel implementations
Disadvantages
  Large storage requirements
  Computationally intensive recall
  Highly susceptible to the curse of dimensionality

21 Dimensionality reduction

22 Why do dimensionality reduction?
The so-called curse of dimensionality
  Exponential growth in the number of examples required to accurately estimate a function
Exploratory data analysis
  Visualizing the structure of the data in a low-dimensional subspace

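23 Two approaches to perform dimensionality reduction
Feature selection: choose a subset of all the features
$$[x_1\; x_2\; \ldots\; x_N] \rightarrow [x_{i_1}\; x_{i_2}\; \ldots\; x_{i_M}]$$
Feature extraction: create new features by combining the existing ones
$$[x_1\; x_2\; \ldots\; x_N] \rightarrow [y_1\; y_2\; \ldots\; y_M] = f([x_1\; x_2\; \ldots\; x_N])$$
Feature extraction is typically a linear transform:
$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1N} \\ w_{21} & w_{22} & \cdots & w_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ w_{M1} & w_{M2} & \cdots & w_{MN} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}$$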
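The two approaches differ only in how the reduced vector is built. A sketch with a hypothetical 4-feature input and a hypothetical 2x4 weight matrix:

```python
def select_features(x, indices):
    """Feature selection: keep a subset of the original features."""
    return [x[i] for i in indices]


def extract_features(x, W):
    """Linear feature extraction y = W x, with W given as M rows of N weights."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


x = [1.0, 2.0, 3.0, 4.0]              # N = 4 original features (made-up values)
selected = select_features(x, [0, 2])  # keeps x_1 and x_3
W = [[0.5, 0.5, 0.0, 0.0],             # hypothetical weights: average pairs of features
     [0.0, 0.0, 0.5, 0.5]]
extracted = extract_features(x, W)     # M = 2 new features
```

Feature selection is the special case of this transform in which every row of `W` has a single 1 and zeros elsewhere.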

24 Representation vs. classification
[Figure: plot over two feature axes]

25 PCA
Solution
  Project the data onto the eigenvectors of the largest eigenvalues of the covariance matrix
  PCA finds orthogonal directions of largest variance
Properties
  If data is Gaussian, PCA finds independent axes
  Otherwise, it simply de-correlates the axes
Limitation
  Directions of high variance do not necessarily contain discriminatory information
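For two features, the eigenvectors of the covariance matrix have a closed form, which keeps a PCA sketch free of any linear algebra library. The data points are hypothetical; a general implementation would use a numerical eigen-solver instead.

```python
import math


def pca_2d(points):
    """Return the mean and [(eigenvalue, unit eigenvector), ...] sorted by decreasing variance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Closed-form eigen-decomposition of a symmetric 2x2 matrix
    half_trace = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # angle of the first principal axis
    axis1 = (math.cos(theta), math.sin(theta))
    axis2 = (-math.sin(theta), math.cos(theta))     # orthogonal to axis1
    return (mx, my), [(half_trace + radius, axis1), (half_trace - radius, axis2)]


def project(point, mean, axis):
    """Coordinate of a centered point along one principal axis."""
    return (point[0] - mean[0]) * axis[0] + (point[1] - mean[1]) * axis[1]


# Hypothetical data lying exactly on the line y = x
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
mean, components = pca_2d(points)
```

For this data all the variance lies along the (1, 1) direction, so the second eigenvalue is zero and projecting onto the first axis loses nothing.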

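26 LDA
Define scatter matrices
Within class:
$$S_W = \sum_{i=1}^{C} S_i, \quad S_i = \sum_{x \in \omega_i} (x-\mu_i)(x-\mu_i)^T$$
Between class:
$$S_B = \sum_{i=1}^{C} N_i (\mu_i - \mu)(\mu_i - \mu)^T$$
Then maximize the ratio
$$J(W) = \frac{|W^T S_B W|}{|W^T S_W W|}$$
[Figure: class means $\mu_i$ and overall mean $\mu$ in the $(x_1, x_2)$ plane, with the within-class scatter $S_W$ and between-class scatter $S_B$ marked]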

27 Solution
Optimal projections are the eigenvectors of the largest eigenvalues of the generalized eigenvalue problem
$$(S_B - \lambda_i S_W) w_i = 0$$
NOTE
$S_B$ is the sum of $C$ matrices of rank one or less, and the mean vectors $\mu_i$ are constrained through the overall mean $\mu$. Therefore, $S_B$ will be at most of rank $(C-1)$, and LDA produces at most $C-1$ feature projections.
Limitations
  Overfitting
  Information not in the mean of the data
  Classes significantly non-Gaussian
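With two classes there is a single projection, and the generalized eigenvalue problem reduces to the well-known closed form $w \propto S_W^{-1}(\mu_a - \mu_b)$. The sketch below uses that two-class shortcut for two features rather than a general eigen-solver; the data are hypothetical.

```python
def class_mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, mean):
    """S_i = sum of (x - mu)(x - mu)^T over one class, returned as (sxx, sxy, syy)."""
    sxx = sum((p[0] - mean[0]) ** 2 for p in points)
    sxy = sum((p[0] - mean[0]) * (p[1] - mean[1]) for p in points)
    syy = sum((p[1] - mean[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(class_a, class_b):
    """Two-class LDA projection w = S_W^-1 (mu_a - mu_b) for 2-D data."""
    mu_a, mu_b = class_mean(class_a), class_mean(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    sxx, sxy, syy = sa[0] + sb[0], sa[1] + sb[1], sa[2] + sb[2]  # S_W
    det = sxx * syy - sxy * sxy
    dx, dy = mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]
    # Closed-form 2x2 inverse of S_W applied to the mean difference
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)


# Hypothetical classes: same shape, shifted along the first feature
class_a = [(0.0, 0.0), (2.0, 2.0), (1.0, 0.0), (1.0, 2.0)]
class_b = [(3.0, 0.0), (5.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
w = fisher_direction(class_a, class_b)
proj_a = [w[0] * p[0] + w[1] * p[1] for p in class_a]
proj_b = [w[0] * p[0] + w[1] * p[1] for p in class_b]
```

The resulting direction is not the line joining the two means: $S_W^{-1}$ tilts it away from the direction in which both classes are spread out, which is what keeps the projected classes apart.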

28 [Figure: projections of the same dataset onto PCA axes and onto LDA axes]

29 LDA and overfitting
Generate an artificial dataset
  Three classes, a fixed number of examples per class, all with the exact same likelihood: a multivariate Gaussian with zero mean and identity covariance
[Figure: four panels, each labeled with the number of dimensions used]