Pattern Classification (II) 杜俊

1 Pattern Classification (II) 杜俊

2 Review
- Probability & Statistics: Bayes' theorem; random variables (discrete vs. continuous); probability distributions (PDF and CDF); statistics (mean, variance, moments); parameter estimation (MLE)
- Information Theory: entropy, mutual information, information channel, KL divergence
- Function Optimization: constrained/unconstrained optimization
- Linear Algebra: matrix manipulation

3 Outline
- Pattern Classification Problems: inference and decision
- Bayesian Decision Theory: how to make the optimal decision? The maximum a posteriori (MAP) decision rule
- Generative Models: joint distribution of observation and label sequences; model estimation (MLE, Bayesian learning, discriminative training)
- Discriminative Models: model the posterior probability directly (discriminant function); logistic regression, support vector machine, neural network

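4 Bayesian Decision Theory (I)
- Bayesian decision theory is a fundamental statistical approach to all pattern classification problems.
- The pattern classification problem is posed in probabilistic terms:
  - The observation $X$ is viewed as a random variable (vector).
  - The class $\omega \in \{\omega_1, \omega_2, \ldots, \omega_N\}$ is treated as a discrete random variable.
  - All information about $X$ and $\omega$ can be obtained from the joint distribution $p(X, \omega) = P(\omega)\, p(X \mid \omega)$.
- Bayesian decision theory leads to the optimal classification, in the sense that it guarantees the minimum average classification error.
- This minimum classification error is called the Bayes error.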

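5 Bayesian Decision Theory (II)
- Prior probability of each class, $P(\omega_i)$: how likely any pattern is to come from class $\omega_i$ before any features are observed. It encodes prior knowledge from previous experience, with $\sum_{i=1}^{N} P(\omega_i) = 1$.
- Class-conditional probability of the observed feature, $p(X \mid \omega_i)$: how the feature is distributed over all patterns belonging to class $\omega_i$.
  - If $X$ is continuous, $p(X \mid \omega_i)$ is a PDF.
  - If $X$ is discrete, $p(X \mid \omega_i)$ is a PMF.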

6 Examples of Class-Conditional Probability

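7 Bayes Decision Rule (I)
- If no feature of an incoming unknown pattern is observed, classify it based on prior knowledge only: guess the class with the largest prior probability, $\omega^* = \arg\max_{\omega_i} P(\omega_i)$.
- If some features $X$ of the unknown pattern are observed, Bayes' theorem converts the prior probability into a posterior probability:
  $P(\omega_i \mid X) = \dfrac{P(\omega_i)\, p(X \mid \omega_i)}{p(X)}$, that is, posterior = prior × likelihood / evidence.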
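The conversion from prior to posterior can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `posterior` and the numbers below are assumptions made for the example, not values from the slides.

```python
def posterior(priors, likelihoods):
    """Return P(class | x) for each class via Bayes' theorem.

    priors[i]      : prior probability of class i
    likelihoods[i] : class-conditional probability of the observed x under class i
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # p(x) = sum over classes of prior * likelihood
    if evidence == 0.0:
        raise ValueError("observation has zero probability under every class")
    return [j / evidence for j in joint]


# Hypothetical two-class numbers, chosen only for illustration.
priors = [0.7, 0.3]
likelihoods = [0.2, 0.6]
post = posterior(priors, likelihoods)
print(post)
```

With these assumed numbers the prior favors class 0, but after the observation the posterior favors class 1 (about 0.56 against 0.44).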

8 Bayes Decision Rule (II)
[Figure: prior, likelihood, evidence, and posterior]

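9 Bayes Decision Rule (III)
Intuitively, we classify an unknown pattern into the class with the largest posterior probability. This is the maximum a posteriori (MAP) decision rule, also called the Bayes decision rule:
$\omega_{MAP} = \arg\max_{\omega_i} P(\omega_i \mid X) = \arg\max_{\omega_i} P(\omega_i)\, p(X \mid \omega_i)$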
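A minimal sketch of the MAP rule, assuming the priors and the class-conditional likelihoods at the observed point are already known. Because the evidence is the same for every class, the sketch compares prior × likelihood directly and never normalizes. The function names and numbers are hypothetical.

```python
def prior_only_decide(priors):
    """Guess the class with the largest prior (no features observed)."""
    return max(range(len(priors)), key=lambda i: priors[i])


def map_decide(priors, likelihoods):
    """Pick the class maximizing prior * likelihood, i.e. the MAP class."""
    scores = [p * l for p, l in zip(priors, likelihoods)]
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical numbers for illustration only.
priors = [0.7, 0.3]
likelihoods = [0.2, 0.6]  # class-conditional probabilities of the observed feature
guess_without_features = prior_only_decide(priors)
guess_with_features = map_decide(priors, likelihoods)
print(guess_without_features, guess_with_features)
```

Here the prior-only guess is class 0, while the MAP decision after observing the feature is class 1.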

10 The MAP Decision Rule is Optimal (I)
- How well does the MAP decision rule behave?
- Optimality: assuming complete knowledge of $P(\omega_i)$ and $p(X \mid \omega_i)$, the MAP decision rule is the optimal way to classify patterns, meaning it achieves the lowest average classification error rate.
- Proof of optimality of the MAP rule: given a pattern $X$ whose true class is $\omega_i$ but which we classify as $\omega_j$, the classification error is counted as
  $l(\omega_j \mid \omega_i) = 0$ if $i = j$, and $1$ otherwise,
  which is also known as the 0-1 loss function.

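11 The MAP Decision Rule is Optimal (II)
- The expected (average) classification error of classifying $X$ as $\omega_j$ is
  $R(\omega_j \mid X) = \sum_{i=1}^{N} l(\omega_j \mid \omega_i)\, P(\omega_i \mid X) = \sum_{i \neq j} P(\omega_i \mid X) = 1 - P(\omega_j \mid X)$.
- The optimal classification minimizes $R(\omega_j \mid X)$, which is the same as maximizing $P(\omega_j \mid X)$. Hence the MAP decision rule is optimal.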
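The optimality argument can be checked numerically. The sketch below computes the expected loss of each possible decision under a 0-1 loss and confirms that the decision with the smallest expected loss is the one with the largest posterior. The posterior values and the helper names are assumptions for illustration.

```python
def zero_one_loss(n):
    """loss[j][i] = cost of deciding class j when the true class is i."""
    return [[0 if i == j else 1 for i in range(n)] for j in range(n)]


def conditional_risk(posteriors, loss):
    """Expected loss of each decision j, averaged over the posterior."""
    n = len(posteriors)
    return [sum(loss[j][i] * posteriors[i] for i in range(n)) for j in range(n)]


# Hypothetical posterior over three classes.
posteriors = [0.2, 0.5, 0.3]
risks = conditional_risk(posteriors, zero_one_loss(3))
best_by_risk = min(range(3), key=lambda j: risks[j])
best_by_posterior = max(range(3), key=lambda j: posteriors[j])
print(risks, best_by_risk, best_by_posterior)
```

Under the 0-1 loss each risk equals one minus the corresponding posterior, so both selections pick class 1 here. Passing a different loss matrix to `conditional_risk` would give the more general minimum-risk decision.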

12 The MAP Decision Rule
- A general decision rule is a mapping function $X \to \omega(X) \in \{\omega_1, \ldots, \omega_N\}$.
- A decision rule partitions the entire feature space of $X$ into $N$ different regions $O_1, O_2, \ldots, O_N$. Each region $O_i$ may consist of many contiguous areas.
- If $X$ is located in region $O_i$, we classify it as class $\omega_i$.
- The MAP decision rule is optimal among all possible decision rules in terms of minimizing the average classification error, provided that we have complete knowledge of the underlying problem.
[Figure: the feature space mapped to class 1, class 2, ..., class N]

13 Example
[Figure: a feature space partitioned into decision regions $O_i$]

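14 Classification Error Probability
- Assume an $N$-class problem. Any decision rule partitions the feature space into $N$ regions $O_1, O_2, \ldots, O_N$.
- $\Pr(X \in O_i, \omega_j)$ denotes the probability that an observation $X$ with true class $\omega_j$ falls in region $O_i$.
- The overall classification error probability of the decision rule is
  $\Pr(\text{error}) = \sum_{i=1}^{N} \sum_{j \neq i} \Pr(X \in O_i, \omega_j) = 1 - \Pr(\text{correct}) = 1 - \sum_{i=1}^{N} \Pr(X \in O_i, \omega_i) = 1 - \sum_{i=1}^{N} P(\omega_i) \int_{O_i} p(X \mid \omega_i)\, dX$.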
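For a discrete feature the integral over each decision region becomes a sum, which makes the error probability of any decision rule easy to compute. The sketch below does this for a hypothetical two-class problem with a feature taking three values, and compares the MAP rule with a rule that ignores the feature. All numbers and function names are assumptions for illustration.

```python
def error_probability(priors, cond, rule):
    """Error probability of a decision rule on a discrete feature.

    cond[i][x] : probability of feature value x given class i
    rule[x]    : class assigned to feature value x (this defines the regions)
    """
    correct = sum(priors[rule[x]] * cond[rule[x]][x] for x in range(len(rule)))
    return 1.0 - correct


def map_rule(priors, cond):
    """Assign each feature value to the class with the largest prior * likelihood."""
    n_values = len(cond[0])
    return [
        max(range(len(priors)), key=lambda i: priors[i] * cond[i][x])
        for x in range(n_values)
    ]


# Hypothetical problem: 2 classes, feature x in {0, 1, 2}.
priors = [0.6, 0.4]
cond = [[0.5, 0.3, 0.2],
        [0.1, 0.3, 0.6]]

rule_map = map_rule(priors, cond)
err_map = error_probability(priors, cond, rule_map)
err_prior_only = error_probability(priors, cond, [0, 0, 0])
print(rule_map, err_map, err_prior_only)
```

With these numbers the MAP rule assigns feature values 0 and 1 to class 0 and value 2 to class 1, giving an error of about 0.28, compared with 0.40 for the rule that always answers class 0.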

15 Example
[Figure: regions contributing to the classification error]

16 Bayes Error
- Bayes error: the error probability of the Bayes (MAP) decision rule.
- Since the Bayes decision rule guarantees the minimum error, the Bayes error is the lower bound of all possible error probabilities.
- The Bayes error is difficult to calculate, even for very simple cases, because of the discontinuous nature of the decision regions in the integral, especially in high dimensions.
- Some approximation methods estimate an upper bound:
  - Chernoff bound
  - Bhattacharyya bound
- Alternatively, evaluate on an independent test set.
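As a rough illustration of how a bound relates to the Bayes error, the sketch below treats a one-dimensional, two-class Gaussian problem. It approximates the Bayes error by numerically integrating the smaller of the two joint densities, and compares it with the Bhattacharyya bound in its standard closed form for univariate Gaussians. The parameter values, integration range, and function names are assumptions, not taken from the slides.

```python
import math


def gauss_pdf(x, mu, sigma):
    """Density of a univariate Gaussian with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def bayes_error_numeric(p1, mu1, s1, p2, mu2, s2, lo=-20.0, hi=20.0, steps=40000):
    """Midpoint-rule integral of min(P1 p1(x), P2 p2(x)), the two-class Bayes error."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += min(p1 * gauss_pdf(x, mu1, s1), p2 * gauss_pdf(x, mu2, s2))
    return total * h


def bhattacharyya_bound(p1, mu1, s1, p2, mu2, s2):
    """Upper bound sqrt(P1 P2) * exp(-k) for two univariate Gaussians."""
    v = 0.5 * (s1 ** 2 + s2 ** 2)  # average variance
    k = (mu2 - mu1) ** 2 / (8.0 * v) + 0.5 * math.log(v / (s1 * s2))
    return math.sqrt(p1 * p2) * math.exp(-k)


# Hypothetical setting: equal priors, unit variances, means 0 and 2.
bayes_err = bayes_error_numeric(0.5, 0.0, 1.0, 0.5, 2.0, 1.0)
bound = bhattacharyya_bound(0.5, 0.0, 1.0, 0.5, 2.0, 1.0)
print(bayes_err, bound)
```

In this setting the Bayes error is about 0.159 and the bound is about 0.303, so the bound holds but is fairly loose. The numerical integral is only practical in low dimensions, which is the reason bounds and held-out test sets are used in practice.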

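17 Example: $X$ is Discrete (I)
- A simple case, the binomial model: two classes $\omega_1$ and $\omega_2$; the feature vector $X = (x_1, x_2, \ldots, x_d)^t$ is a $d$-dimensional vector whose components are binary-valued, $x_i \in \{0, 1\}$, and conditionally independent.
- $p_i = \Pr(x_i = 1 \mid \omega_1)$ and $q_i = \Pr(x_i = 1 \mid \omega_2)$, so that
  $p(X \mid \omega_1) = \prod_{i=1}^{d} p_i^{x_i} (1 - p_i)^{1 - x_i}$ and $p(X \mid \omega_2) = \prod_{i=1}^{d} q_i^{x_i} (1 - q_i)^{1 - x_i}$.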

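18 Example: $X$ is Discrete (II)
- The MAP decision rule: classify $X$ to $\omega_1$ if $P(\omega_1)\, p(X \mid \omega_1) \geq P(\omega_2)\, p(X \mid \omega_2)$, otherwise to $\omega_2$.
- Equivalently, we have the decision function
  $g(X) = \sum_{i=1}^{d} \left[ x_i \ln \dfrac{p_i}{q_i} + (1 - x_i) \ln \dfrac{1 - p_i}{1 - q_i} \right] + \ln \dfrac{P(\omega_1)}{P(\omega_2)}$.
- If $g(X) \geq 0$, classify $X$ to $\omega_1$, otherwise to $\omega_2$.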
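The decision function for independent binary features is a sum of per-feature log-likelihood ratios plus the log ratio of the priors. A minimal sketch, with hypothetical per-feature probabilities for the two classes (named `p` and `q` here) and a hypothetical function name:

```python
import math


def decision_function(x, p, q, prior1, prior2):
    """Log of [prior1 * P(x | class 1)] / [prior2 * P(x | class 2)].

    x : list of 0/1 feature values
    p : p[i] = probability that feature i equals 1 under class 1
    q : q[i] = probability that feature i equals 1 under class 2
    """
    total = math.log(prior1 / prior2)
    for x_i, p_i, q_i in zip(x, p, q):
        total += x_i * math.log(p_i / q_i)
        total += (1 - x_i) * math.log((1.0 - p_i) / (1.0 - q_i))
    return total


# Hypothetical parameters for three binary features.
p = [0.8, 0.7, 0.6]
q = [0.2, 0.3, 0.4]

g_a = decision_function([1, 1, 0], p, q, 0.5, 0.5)
g_b = decision_function([0, 0, 0], p, q, 0.5, 0.5)
label_a = 1 if g_a >= 0 else 2
label_b = 1 if g_b >= 0 else 2
print(g_a, label_a, g_b, label_b)
```

The function is linear in the feature values, so the decision boundary is a hyperplane. With these assumed parameters the pattern `[1, 1, 0]` goes to class 1 and `[0, 0, 0]` goes to class 2.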

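19 Example: $X$ is Continuous
- Gaussian model: two classes $\omega_1$ and $\omega_2$; the feature $x$ is a real-valued scalar with
  $p(x \mid \omega_1) = N(x; \mu_1, \sigma_1^2) = \dfrac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(x - \mu_1)^2}{2\sigma_1^2}}$ and $p(x \mid \omega_2) = N(x; \mu_2, \sigma_2^2) = \dfrac{1}{\sqrt{2\pi\sigma_2^2}}\, e^{-\frac{(x - \mu_2)^2}{2\sigma_2^2}}$.
- The MAP decision rule: classify $x$ to $\omega_1$ if $P(\omega_1)\, N(x; \mu_1, \sigma_1^2) \geq P(\omega_2)\, N(x; \mu_2, \sigma_2^2)$, otherwise to $\omega_2$.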
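A minimal sketch of the MAP rule for scalar Gaussian classes, working with log-probabilities to avoid underflow far from the means. The class parameters and function names are hypothetical. The second parameter set shows how unequal priors shift the decision boundary.

```python
import math


def log_gauss(x, mu, sigma):
    """Log-density of a univariate Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)


def map_classify(x, classes):
    """classes: list of (prior, mu, sigma). Returns the index of the MAP class."""
    scores = [math.log(prior) + log_gauss(x, mu, sigma) for prior, mu, sigma in classes]
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical parameters: means 0 and 2, unit standard deviations.
equal_priors = [(0.5, 0.0, 1.0), (0.5, 2.0, 1.0)]
skewed_priors = [(0.9, 0.0, 1.0), (0.1, 2.0, 1.0)]

print(map_classify(1.5, equal_priors))   # boundary is at x = 1 for equal priors
print(map_classify(1.5, skewed_priors))  # a strong prior for class 0 moves the boundary right
```

With equal priors and equal variances the boundary sits midway between the means, so `x = 1.5` goes to the second class. With priors 0.9 and 0.1 the same point goes to the first class.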

20 Missing Features/Data (I)
- If we know the full probability structure of a problem, we can construct the optimal Bayes decision rule.
- In some practical situations, we can't observe the full feature vector described in the probability structure for some patterns. Only partial information about the feature vector is observed, and some components are missing.
- How can we classify such corrupted inputs to obtain the minimum average error?
- Let the full feature vector be $X = [X_g, X_b]$, where $X_g$ represents the observed (good) features and $X_b$ represents the missing (bad) ones.
- In this case, the optimal decision rule is constructed as follows:
  $\omega^* = \arg\max_{\omega_i} P(\omega_i \mid X_g)$

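21 Missing Features/Data (II)
$P(\omega_i \mid X_g) = \dfrac{p(\omega_i, X_g)}{p(X_g)} = \dfrac{\int p(\omega_i, X_g, X_b)\, dX_b}{p(X_g)} = \dfrac{\int P(\omega_i \mid X_g, X_b)\, p(X_g, X_b)\, dX_b}{\int p(X_g, X_b)\, dX_b}$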
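For discrete features the marginalization over the missing component is a plain sum. The sketch below assumes a hypothetical two-class problem with one observed (good) and one missing (bad) binary feature, stored as a small table per class. The table values and the function name are assumptions for illustration.

```python
def posterior_given_good(priors, cond, x_good):
    """Posterior over classes when only the good feature is observed.

    cond[i][xg][xb] : probability of (good = xg, bad = xb) given class i
    The missing feature is summed out before Bayes' theorem is applied.
    """
    joint = [priors[i] * sum(cond[i][x_good]) for i in range(len(priors))]
    evidence = sum(joint)
    return [j / evidence for j in joint]


# Hypothetical class-conditional tables; each sums to 1 over (good, bad).
priors = [0.5, 0.5]
cond = [
    [[0.4, 0.3], [0.2, 0.1]],  # class 0
    [[0.1, 0.1], [0.3, 0.5]],  # class 1
]

post = posterior_given_good(priors, cond, x_good=0)
decision = max(range(len(post)), key=lambda i: post[i])
print(post, decision)
```

Observing only `good = 0` gives marginal likelihoods of 0.7 and 0.2, so the posterior is about 0.78 for class 0 and the MAP decision is class 0.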

22 Practical Issue
- The optimal Bayes decision rule is not feasible in practice.
- In any practical problem, we cannot have complete knowledge about the problem. For example, the class-conditional probabilities are always unavailable and extremely hard to estimate.
- However, it is possible to collect a set of sample data for each class in question.
- The sample data are always far from enough to estimate a reliable PDF from the sample data ALONE.
- Question: how do we build a reasonable classifier based on a limited set of sample data, instead of the true PDF?

23 Statistical Data Modeling
- For any real problem, the true PDFs are always unknown.
- Statistical data modeling: based on the available sample data set, choose a proper statistical model to fit to the available data.
- Data modeling stage: once the statistical model is selected, its functional form is known, except that the set of model parameters associated with the model is unknown to us.
- Learning (training) stage: the unknown parameters are estimated by fitting the model to the data set according to a chosen estimation criterion.
- Decision (test) stage: the estimated PDFs are plugged into the optimal Bayes decision rule in place of the real PDFs, giving the so-called plug-in MAP decision rule.
- The plug-in rule is not optimal, but it performs reasonably well in practice.

24 Data Modeling Example

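25 Plug-in MAP Decision Rule
- Once the statistical models are estimated, they are treated as if they were the true distributions of the data and are plugged into the form of the optimal Bayes (MAP) decision rule in place of the unknown true PDFs.
- The plug-in MAP decision rule, with hats denoting estimated quantities:
  $\omega^* = \arg\max_{\omega_i} P(\omega_i \mid X) = \arg\max_{\omega_i} P(\omega_i)\, p(X \mid \omega_i) \approx \arg\max_{\omega_i} \hat{P}(\omega_i)\, \hat{p}(X \mid \omega_i)$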
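The three stages (choose a model, estimate its parameters, plug the estimates into the MAP rule) can be sketched end to end. This sketch assumes a univariate Gaussian model per class fitted by maximum likelihood, with priors estimated by relative class frequency. The tiny data set, the class labels, and the function names are all hypothetical.

```python
import math


def fit_gaussian_mle(samples):
    """Maximum-likelihood mean and variance (variance divides by n, not n - 1)."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((s - mu) ** 2 for s in samples) / n
    return mu, var


def log_gauss(x, mu, var):
    """Log-density of a univariate Gaussian with mean mu and variance var."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)


def train(data):
    """data: dict mapping label -> list of scalar samples.

    Returns a dict mapping label -> (estimated prior, mean, variance).
    """
    total = sum(len(v) for v in data.values())
    return {label: (len(v) / total, *fit_gaussian_mle(v)) for label, v in data.items()}


def plug_in_map(model, x):
    """MAP rule with the estimated prior and density used in place of the true ones."""
    return max(
        model,
        key=lambda c: math.log(model[c][0]) + log_gauss(x, model[c][1], model[c][2]),
    )


# Hypothetical training samples for two classes.
data = {
    "a": [0.9, 1.1, 1.0, 1.2, 0.8],
    "b": [2.9, 3.1, 3.0, 3.2, 2.8],
}
model = train(data)
print(model)
print(plug_in_map(model, 1.3), plug_in_map(model, 2.6))
```

The resulting classifier is only as good as the chosen model and the amount of data: with five samples per class the estimated variances are rough, which is the sense in which the plug-in rule is not optimal.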
