ROC Curves for Multivariate Biometric Matching Models

Sung-Hyuk Cha and Charles C. Tappert

Abstract: The biometric matching problem is a two-class (within or between) classification problem where two types of errors (FRR and FAR) occur. While the receiver operating characteristic or ROC curve, which is a plot of FRR and FAR, can be easily obtained in the simple matching model, it is non-trivial to obtain in the multivariate matching model. Here the problem of obtaining ROC curves for several parametric and nonparametric matchers in a multidimensional case is considered. Graphical decision boundaries are also given for each matcher.

Manuscript received May 3, 9. S. Cha is with Computer Science Department, Pace University, Pleasantville, NY 10570 USA (phone: 94-773-389; fax: 94-773-3533; e-mail: scha@pace.edu). C. C. Tappert is with Computer Science Department, Pace University, Pleasantville, NY 10570 USA (e-mail: ctappert@pace.edu).

I. INTRODUCTION

Are two biometric samples from the same person or from two different people? This two-class (within or between) matching problem is of great importance in various biometric authentication systems such as identification, screening, verification, and continuity of identity [1].

[Fig. 1. Biometric Matching Models. (a) Simple Match: feature extraction of two samples, a scalar distance measure, and a same/different people decision. (b) Dichotomy Model: feature extraction, distance domain transformation, and a multidimensional dichotomizer.]

A typical conventional biometric matching model is the distance-based simple match (SM) shown in Fig. 1(a). The Dichotomy model in Fig. 1(b) was introduced in [2] to assess the power of the individuality of handwriting, and has many advantages over the SM model [3]. While the former is a univariate decision problem, the latter is a two-class multivariate pattern classification problem where numerous pattern classification algorithms can be applied [4, 5].

The Receiver Operating Characteristic (ROC) curve was first used to analyze radar signals [6] and has more recently been employed in machine learning and pattern recognition to evaluate classification algorithms [7, 8]. In the biometric matching models, two types of errors are introduced: false reject rate (FRR) and false accept rate (FAR).
The ROC curve in the biometric matching problem is a graphical plot of FRR versus FAR. In the SM model in Fig. 1(a), the ROC curve can be obtained trivially by altering a scalar threshold value. In the multidimensional case in Fig. 1(b), however, the decision boundary may be controlled by several parameters or complex decision rules rather than by a single threshold. Although finding the optimal ROC curve in the multivariate two-category classification problem can require great computational resources, in practice the complex decision rule is often simplified to a single control parameter decision rule to draw an ROC curve [4].

Here, several multivariate pattern classification algorithms are examined for the biometric matching problem. By fixing or proportioning parameters, a single control parameter is obtained to control the errors. For simplicity and ease of visualization, two-dimensional decision boundaries are given for each model. The purpose of this article is to better understand the distributions of the two classes in biometric matching problems and the meaning of ROC curves in terms of decision boundaries.

The rest of this paper is organized as follows. Section II reviews ROC curves in the conventional SM model. Several parametric and non-parametric classifiers for the dichotomy model are examined in Section III. Finally, Section IV concludes this work.

II. ROC CURVES FOR SIMPLE MATCH MODELS

[Fig. 2. Two-dimensional biometric data from six subjects (s_1 to s_6); axes f_1 and f_2.]

Consider the hypothetical biometric data from six different subjects in Fig. 2. Each biometric data sample, x, is represented by two feature values, x = {x_1, x_2}. These sample data are used as an illustrative example throughout this paper. Let s(x) denote the subject identity of sample x. The Euclidean or L2 distance is defined in (1).

$$ d_{L_2}(x, y) = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2} \tag{1} $$

If two randomly selected biometric samples are from the same subject, the scalar distance between them belongs to the within class (intra-person), W, as defined in (2). If they are from two different subjects, the distance belongs to the between class (inter-person), B, given in (3).

$$ W = \{\, d(x, y) \mid s(x) = s(y) \,\} \tag{2} $$

$$ B = \{\, d(x, y) \mid s(x) \neq s(y) \,\} \tag{3} $$

Fig. 3(a) shows 36 randomly selected W and B univariate histograms from the hypothetical data, i.e., |W| and |B| = 36, and Fig. 3(b) shows their corresponding normal distributions based solely on their means and variances.

[Fig. 3. Within and between-class biometric scalar distance distributions: (a) univariate distance histograms; (b) corresponding normal distributions, with the within/between overlap marked.]

FRR is often called the False Negative Rate (FNR), False Non-Match Rate (FNMR), Type II error, or simply a miss; FAR is referred to as False Positive Rate (FPR), False Match Rate (FMR), Type I error, or simply a false alarm. The four possibilities are shown in the contingency table or confusion matrix of Fig. 4.

            Predicted w                                   Predicted b
Actual w    True positive (True Accept, Hit)              False negative (False Reject, Miss)
Actual b    False positive (False Accept, False Alarm)    True negative (True Reject)

Fig. 4. Contingency table (confusion matrix).

The simple distance-based biometric match model in Fig. 1(a) utilizes a distance measure between two biometric samples such as (1), and the distance value between two biometric samples is classified based on a threshold value t as defined in (4), on the belief that the within-class distance tends to be smaller than the between-class distance.

$$ D(x, y) = \begin{cases} w & \text{if } d(x, y) \le t \\ b & \text{otherwise} \end{cases} \tag{4} $$

For a fixed t value, two types of error probabilities can be determined, as depicted in Fig. 3(b). The False Reject Rate (FRR) is the probability of within-class data classified as between class (5) and the False Accept Rate (FAR) is that of between-class data classified as within class (6).
$$ FRR = \Pr\big( D(x, y) = b \mid s(x) = s(y) \big) \tag{5} $$

$$ FAR = \Pr\big( D(x, y) = w \mid s(x) \neq s(y) \big) \tag{6} $$

[Fig. 5. ROC curves: (a) 3D plot among t, FRR, and FAR; (b), (c) 2D plots between t and each of the two error rates; (d) raw ROC from Fig. 3(a); (e) smooth ROC from Fig. 3(b).]

Fig. 5 shows the plots relating t, FRR, and FAR. As t moves to the left in Fig. 3(b), FRR increases and FAR decreases, and vice versa as t moves to the right. When FRR and FAR are projected into a 2D plot as shown in Fig. 5(d) and (e), these graphs are called Receiver Operating Characteristic (ROC) curves. The ROC curve expresses the trade-off between FRR and FAR: typically, the higher the FRR, the lower the FAR. In other words, by altering t one can control the number of security breaches and the level of convenience.

Another benefit of the ROC curves is that they allow comparing different biometric matchers. In [9], four kinds of distance measures were exploited for hand geometry biometric verification. They are (1) and the following (7~9) measures: city block L1, weighted L1, and weighted L2 distance measures.

$$ d_{L_1}(x, y) = \sum_{i=1}^{d} |x_i - y_i| \tag{7} $$

$$ d_{wL_1}(x, y) = \sum_{i=1}^{d} \frac{|x_i - y_i|}{\sigma_i} \tag{8} $$

$$ d_{wL_2}(x, y) = \sqrt{\sum_{i=1}^{d} \frac{(x_i - y_i)^2}{\sigma_i^2}} \tag{9} $$

In (8) and (9), σ_i is the standard deviation of all i-th feature values in the enrolled biometric samples. Let the SM models in (4) using (7), (8), (1), and (9) be referred to as SM-L1, SM-wL1, SM-L2, and SM-wL2, respectively. Fig. 6 displays their ROC curves. The ROC curve closest to the origin is the best, in this case SM-wL.

[Fig. 6. Comparison of several ROC curves for SM models (SM-L1, SM-wL1, SM-L2, SM-wL2).]

III. MULTIVARIATE DICHOTOMY MODEL

[Fig. 7. 2-feature distance transformed space; axes d_1 = |x_1 - y_1| and d_2 = |x_2 - y_2|; within and between classes.]

Transforming from the feature space in Fig. 2 to the feature distance space in Fig. 7 was called the dichotomy transformation [2, 3] and is defined in (10).

$$ \mathbf{d}(x, y) = \big( \delta(x_1, y_1), \ldots, \delta(x_d, y_d) \big) \tag{10} $$

For example, in Fig. 7 the absolute difference, L1, is used for all δ's and thus d(x, y) = (|x_1 - y_1|, |x_2 - y_2|). In this way, the biometric matching problem can be thought of as a two-class multivariate pattern classification problem. Let's refer to the feature-distance-transformed space as d space. Fig. 8 shows the discrete decision boundaries of the previous four SM models in d space as the respective scalar threshold value t is varied. The d_1 axis range is {~} and the d_2 axis range is {~}. Fig. 8 gives an intuitive idea of why SM-wL is better than the other SM models. SM-L1 gives a linear gradation with equal weighting of the two axes, SM-wL1 weighs the axes appropriately, SM-L2 provides a quadratic gradation with equal weighting, and SM-wL2 weighs the axes.

[Fig. 8. Decision boundaries of SM models over d: (a) SM-L1; (b) SM-wL1; (c) SM-L2; (d) SM-wL2.]

All SM models assume that the within and between-class distributions fall into the class of parametric decision boundaries. But what if the decision boundary does not follow the parametric decision boundary function?
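As a concrete reference point before leaving the SM models, the dichotomy transformation in (10) and the single-threshold sweep behind an SM ROC curve can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy samples, and the threshold grid are all hypothetical, and the absolute difference is assumed for every δ.

```python
from itertools import combinations
import math

def dichotomy_transform(samples, labels):
    """Map every pair of samples to its absolute-difference vector, as in (10).

    Returns (within, between): distance vectors of same-subject pairs and
    of different-subject pairs.
    """
    within, between = [], []
    for i, j in combinations(range(len(samples)), 2):
        d = tuple(abs(a - b) for a, b in zip(samples[i], samples[j]))
        (within if labels[i] == labels[j] else between).append(d)
    return within, between

def l2_norm(d):
    """Scalar L2 distance recovered from a distance vector."""
    return math.sqrt(sum(c * c for c in d))

def roc_points(within, between, thresholds, score=l2_norm):
    """Return (t, FRR, FAR) per threshold; a pair is accepted when score <= t."""
    points = []
    for t in thresholds:
        frr = sum(score(d) > t for d in within) / len(within)     # within rejected
        far = sum(score(d) <= t for d in between) / len(between)  # between accepted
        points.append((t, frr, far))
    return points

# Hypothetical toy data: three subjects with two samples each.
samples = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (20.0, 0.0), (21.0, 0.0)]
labels = ["s1", "s1", "s2", "s2", "s3", "s3"]
within, between = dichotomy_transform(samples, labels)
for t, frr, far in roc_points(within, between, [0.5, 5.0, 50.0]):
    print(f"t={t:5.1f}  FRR={frr:.2f}  FAR={far:.2f}")
```

Plotting the FRR and FAR columns against each other gives the raw ROC curve. Passing a different `score` function (for example a weighted norm) changes the matcher without changing the sweep.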
Instead of SM models, various other parametric or non-parametric pattern classifiers can be applied to solve the dichotomy problem in multidimensional space as given in Fig. 1(b). However, these other parametric models may have more than one parameter to control the two types of errors, and if so, a question arises as to how to reduce them to one parameter to draw the ROC curve. And if non-parametric classifiers are applied, a question arises as to which parameter should be used for the threshold. This section introduces several other models where the respective ROC curves can be attained.

A. Uncorrelated Ellipsoidal Models (UCE)

In d space, it can naturally be assumed that the within-class distribution is clustered near the origin and the between-class distribution is scattered away from the origin. Based on this assumption, one reasonable parametric model is the uncorrelated ellipsoidal (UCE) model (11), where q = d(x, y).

$$ D(q) = \begin{cases} w & \text{if } \dfrac{q_1^2}{r_a^2} + \dfrac{q_2^2}{r_b^2} \le 1 \\ b & \text{otherwise} \end{cases} \tag{11} $$
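The UCE rule in (11) and one way of reducing its two radii to a single control parameter can be sketched as below. This is an illustrative sketch under stated assumptions: the radii are tied together as r_a = t·a and r_b = t·b for fixed axis weights a and b, and the function names, weights, and distance vectors are hypothetical rather than taken from the paper's data.

```python
def uce_classify(q, ra, rb):
    """Return 'w' if the 2-D distance vector q lies inside the axis-aligned
    ellipse with radii (ra, rb) centred at the origin, else 'b' (rule (11))."""
    q1, q2 = q
    return "w" if (q1 / ra) ** 2 + (q2 / rb) ** 2 <= 1.0 else "b"

def uce_roc(within, between, a, b, scales):
    """Sweep a single scale t with proportional radii (t*a, t*b).

    Returns a list of (t, FRR, FAR) triples.
    """
    points = []
    for t in scales:
        frr = sum(uce_classify(q, t * a, t * b) == "b" for q in within) / len(within)
        far = sum(uce_classify(q, t * a, t * b) == "w" for q in between) / len(between)
        points.append((t, frr, far))
    return points

# Hypothetical within- and between-class distance vectors.
within = [(1.0, 0.5), (2.0, 1.0), (0.5, 2.0)]
between = [(6.0, 1.0), (3.0, 8.0), (9.0, 9.0), (1.5, 1.5)]
for t, frr, far in uce_roc(within, between, a=2.0, b=1.0, scales=[0.5, 1.5, 3.0, 10.0]):
    print(f"t={t:4.1f}  FRR={frr:.2f}  FAR={far:.2f}")
```

Fixing the ratio a:b restricts the search to one ray through the (r_a, r_b) plane, so the resulting curve is a ROC curve for that family of ellipses and not necessarily the best one attainable with two free radii.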
[Fig. 9. 3D plots for FRR and FAR with respect to r_a and r_b.]

Unlike the SM model where only a single threshold is used to control the two characteristics, the UCE model in (11) has two parameters to control them. Fig. 9 shows two 3D plots for FRR and FAR with respect to r_a and r_b. One would have to trace the curve in a gradient descent manner to draw the optimal ROC curve with respect to multivariate thresholds rather than in a simple projection with a single threshold as in Fig. 5. In general, it is difficult to draw the ROC curve in the multivariate case.

It should be noted that SM-L2 and SM-wL2 can be thought of as special cases of the UCE model if all features are of numeric type. If r_a = r_b = t, (11) becomes SM-L2, and if r_a = tσ_1 and r_b = tσ_2, (11) becomes SM-wL2. Another reasonable special case of the UCE model is to assign r_a = tμ'_1 and r_b = tμ'_2 in (11), where μ'_1 and μ'_2 are mean values of all within distance values only. Let's call this special dichotomizer the mean uncorrelated ellipsoidal or mUCE classifier in (12). For simplicity, the "= b otherwise" part of (11) is omitted in the remaining definitions.

$$ D(q) = w \quad \text{if } \frac{q_1^2}{\mu_1'^2} + \frac{q_2^2}{\mu_2'^2} \le t^2 \tag{12} $$

If the variances of all within distance values are used, the variance uncorrelated ellipsoidal or vUCE matcher is built. mUCE and vUCE are simple and can be easily generalized to the multivariate cases as in (13) and (14), respectively.

$$ D(q) = w \quad \text{if } \sum_{i=1}^{d} \frac{(x_i - y_i)^2}{\mu_i'^2} \le t^2 \tag{13} $$

$$ D(q) = w \quad \text{if } \sum_{i=1}^{d} \frac{(x_i - y_i)^2}{\sigma_i'^2} \le t^2 \tag{14} $$

While controversial, the within and between-class distributions are often assumed to be normal as in Fig. 3(b). Here, a different kind of Gaussian normal distribution can be assumed to design another special UCE model. In the multidimensional distance space in Fig. 7, the mean of the within-class data can be assumed to be the origin, (0, 0). Then the standard deviation of each within-class distance is defined as in (15).

$$ \hat{\sigma}_i = \sqrt{\frac{\sum_{\mathbf{d} \in W} d_i^2}{|W|}} \tag{15} $$

Assuming that the within-class distance distribution follows the uncorrelated normal Gaussian distribution, another dichotomizer called UCG can be defined as in (16).

$$ D(q) = w \quad \text{if } \frac{1}{2\pi \hat{\sigma}_1 \hat{\sigma}_2} \exp\!\left( -\frac{(x_1 - y_1)^2}{2\hat{\sigma}_1^2} - \frac{(x_2 - y_2)^2}{2\hat{\sigma}_2^2} \right) \ge t \tag{16} $$

Although (16) is not explicitly related to (11), it is implicitly related to the ellipsoidal model, i.e., the corresponding r_a and r_b values in (11) can be found for (16).

[Fig. 10. ROC curves of UCE models with proportional thresholds (mUCE, vUCE, UCG, SM-L2).]

Fig. 10 compares the ROC curves of four uncorrelated ellipsoidal models.
The vUCE model seems to perform better than the others. The two unknown controlling parameters, r_a and r_b in (11), are assumed to be proportional to each other in all UCE models and thus there can be only one parameter to control the two error probabilities. Yet the r_a and r_b of the true optimal uncorrelated ellipsoidal model may not be proportional. Fig. 11 shows the decision boundaries of three uncorrelated ellipsoidal models.

[Fig. 11. Decision boundaries of uncorrelated ellipsoidal models over d: (a) Normal; (b) UCG; (c) mUCE; (d) vUCE.]

B. k-nearest neighbor (k-NN)

One of the most popular and simplest non-parametric classifiers is the k-nearest neighbor (k-NN) algorithm [4, 5, 10]. The class of an unknown sample is determined by the voting of the top k most similar items in the reference set R. Let R = W ∪ B and R' be the set of the top k most similar items to the query q, i.e., R' = {r'_1, ..., r'_k} ⊆ R. Let a(r_i) be the actual class of r_i, i.e., a(r_i) ∈ {w, b}. The k-NN for the dichotomy problem in Fig. 1(b) is formulated as in (17).

$$ D(q) = w \quad \text{if } \big|\{ r'_i \in R' : a(r'_i) = w \}\big| \ge \big|\{ r'_i \in R' : a(r'_i) = b \}\big| \tag{17} $$

In Fig. 12, if k = 3, R' = {W_1, B_1, B_2} and thus D(q) = b. If k = 7, R' = {W_1, W_2, W_3, W_4, B_1, B_2, B_3} and thus D(q) = w because (|a(r') ∈ W| = 4) > (|a(r') ∈ B| = 3).
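The voting rule in (17) can be sketched as follows. The reference set below is hypothetical and laid out only so that it mirrors the two cases just described (one within and two between items among the three nearest, four within and three between among the seven nearest). The function name is an assumption, the Euclidean distance in d space is assumed for ranking neighbors, and ties are assigned to w.

```python
import math

def knn_dichotomizer(q, reference, k):
    """Classify distance vector q as 'w' or 'b' by a vote of its k nearest
    reference items; reference is a list of (vector, label) pairs with
    label in {'w', 'b'}. Ties go to 'w'."""
    nearest = sorted(reference, key=lambda item: math.dist(q, item[0]))[:k]
    votes_w = sum(1 for _, label in nearest if label == "w")
    votes_b = len(nearest) - votes_w
    return "w" if votes_w >= votes_b else "b"

# Hypothetical reference items, listed by increasing distance from q.
reference = [
    ((1.0, 0.0), "w"), ((2.0, 0.0), "b"), ((3.0, 0.0), "b"),
    ((4.0, 0.0), "w"), ((5.0, 0.0), "w"), ((6.0, 0.0), "w"),
    ((7.0, 0.0), "b"), ((8.0, 0.0), "b"),
]
q = (0.0, 0.0)
print(knn_dichotomizer(q, reference, k=3))  # one 'w' vote against two 'b' votes
print(knn_dichotomizer(q, reference, k=7))  # four 'w' votes against three 'b' votes
```

Note that k changes which neighbors vote but does not act as a monotone threshold on the two error rates, which is why k alone is an awkward control parameter for drawing a ROC curve.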
[Fig. 12. Sample k-NN classification of a query q, showing the (k = 3)-NN and (k = 7)-NN neighborhoods. W = within feature distance; B = between feature distance.]

A variation of the standard k-NN includes the distance-weighted k-nearest neighbor or wk-NN in short [5].

$$ D(q) = w \quad \text{if } \sum_{a(r'_i) \in W} \frac{1}{d(q, r'_i)} \ge \sum_{a(r'_i) \in B} \frac{1}{d(q, r'_i)} \tag{18} $$

Another variation of the k-NN includes the rank-weighted k-nearest neighbor or rwk-NN in short. Let rank(r'_i) be the function that returns the position in the descending ordered list of R'. It can have an integer value between 1 and k, and rank(r'_i) > rank(r'_j) if d(q, r'_i) ≤ d(q, r'_j). Then the rwk-NN is defined as (19).

$$ D(q) = w \quad \text{if } \sum_{a(r'_i) \in W} \mathrm{rank}(r'_i) \ge \sum_{a(r'_i) \in B} \mathrm{rank}(r'_i) \tag{19} $$

In Fig. 12 where k = 3, the three nearest neighbors are sorted reversely: (B_2, B_1, W_1), and the query q is classified as w since [rank(W_1) = 3] ≥ [(rank(B_1) = 2) + (rank(B_2) = 1)].

In order to draw ROC curves for the k-NN, wk-NN, and rwk-NN, the parameter k may be a candidate to control the two errors. Unfortunately, the decision boundaries are too tight to draw the ROC curves, as shown in Fig. 13(a~c) where k is varied. Suppose we assign q to w as long as m of the within-class samples are in the k-nearest neighbors. Let's call this classifier the m-match k-nearest neighbor (mk-NN) (20).

$$ D(q) = w \quad \text{if } \big|\{ r'_i \in R' : a(r'_i) = w \}\big| \ge m \tag{20} $$

For the illustrative example in Fig. 12, q is classified as b for (m = 5 and k = 7) and as w for (m = 1 and k = 3). Similarly, the rwk-NN classifier in (19) can be altered with the rank decision in (21) to make another dichotomizer called mrwk-NN.

$$ D(q) = w \quad \text{if } \sum_{a(r'_i) \in W} \mathrm{rank}(r'_i) \ge m \tag{21} $$

The mk-NN and mrwk-NN methods were investigated to obtain ROC curves in a keystroke biometric study [11].

Consider another variation of the nearest neighbor concept. If only the within distance data W is considered as R, and the distance between a query and the within nearest neighbor falls within a certain distance threshold value t, then it is classified as w. Let's call this the within-nearest-neighbor with a threshold distance (WNN-t) classifier (22).

$$ D(q) = w \quad \text{if } \min_{w \in W} d(q, w) \le t \tag{22} $$

Suppose that the average of the top k within nearest neighbors is used instead of the nearest one in (22). Let's call this the k-within-nearest-neighbor with a threshold distance (k-WNN-t) classifier (23).

$$ D(q) = w \quad \text{if } \underset{\text{top } k}{\mathrm{avg}}\; d(q, w) \le t \tag{23} $$

Both WNN-t and k-WNN-t give good decision boundaries over d for the ROC curves, as shown in Fig. 13(f), (g), and (h).

[Fig. 13. Decision boundaries of k-NN and k-WNN models over d: (a) k-NN; (b) wk-NN; (c) rwk-NN; (d) mk-NN; (e) mrwk-NN; (f) WNN-t; (g), (h) k-WNN-t for two values of k.]

Although mrwk-NN appears to give the best ROC curve for the illustrative examples in Fig. 14, Fig. 13(d~h) provide an intuitive indication of the behaviors of the types of classifiers. It should also be noted that if k varies, there are two parameters that control the errors for the optimal ROC curves in several of these nearest neighbor models.

[Fig. 14. ROC curves of the nearest neighbor models (WNN-t, k-WNN-t, mk-NN, mrwk-NN) and vUCE.]

C. Artificial Neural Network

The Artificial Neural Network (ANN) has been widely utilized to solve classification problems [4, 5]. As shown in Fig. 15(a), it consists of input, hidden, and output layers of neurons. The decision is made in the output neuron after training the network and is given in (24).

$$ D(q) = w \quad \text{if } \sum_{i=1}^{h} w_i h_i \ge t \tag{24} $$
[Fig. 15. (a) Typical feed-forward ANN structure: input neurons, a hidden layer (h_1 to h_4) with weights w_1 to w_4, and an output neuron with hard threshold decision making; (b) ANN output histogram for the within and between classes, with their overlap.]

Fig. 15(b) gives the histogram of output values of a trained ANN. ROC curves can be obtained by varying t as shown in Fig. 16. At first glance, ANNs perform better than vUCE according to the ROC curve comparisons. Decision boundaries of several ANNs with different numbers of neurons in the hidden layer are shown in Fig. 17. Fig. 17(d) has two hidden layers with 4 and 3 neurons in each layer, but they were over-fitted dramatically.

[Fig. 16. ROC curves of two trained ANNs (ANN-5, ANN-4-3) and vUCE.]

[Fig. 17. Decision boundaries of four trained ANNs over d: (a) 2-3 ANN; (b) 2-4 ANN; (c) 2-5 ANN; (d) 2-4-3 ANN.]

IV. CONCLUSIONS

This article presented a variety of multivariate matching models for obtaining ROC curves, and the models were illustrated on a small set of hypothetical biometric data. The results provide an intuitive indication of the behaviors of the various types of classifiers. They also indicate that a simple ROC comparison of biometric matchers is not sufficient to fully compare the matchers. These statements are true as long as all features are of numeric type as in Fig. 2. While UCE models can be used even if features are of non-numeric or heterogeneous type, SM models cannot be used [3]. Unlike the ANN, linear discriminant functions such as SVM with proper preprocessing also gave reasonable decision boundaries but were omitted due to space limitation.

REFERENCES

[1] R. M. Bolle, J. H. Connell, S. Pankanti, N. K. Ratha, and A. W. Senior, Guide to Biometrics. New York: Springer-Verlag, 2004.
[2] S.-H. Cha and S. N. Srihari, "Writer Identification: Statistical Analysis and Dichotomizer," LNCS - Advances in Pattern Recognition, vol. 1876, Aug. 2000, pp. 123-132.
[3] S.-H. Cha and S. N. Srihari, "Multiple Feature Integration for Writer Verification," in Proc. of 7th Int'l Workshop on Frontiers in Handwriting Recognition, Amsterdam, Netherlands, Sep. 2000, pp. 333-342.
[4] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. Wiley, 2001.
[5] T. M. Mitchell, Machine Learning. McGraw-Hill, 1997.
[6] D. M. Green and J. M. Swets, Signal detection theory and psychophysics. New York: John Wiley and Sons Inc., 1966.
[7] K. A. Spackman, "Signal detection theory: Valuable tools for evaluating inductive learning," in Proc. of 6th Int'l Workshop on Machine Learning, San Mateo, CA, 1989, pp. 160-163.
[8] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, 2006, pp. 861-874.
[9] A. K. Jain, A. Ross, and S. Pankanti, "A Prototype Hand Geometry-based Verification System," in Proc. of 2nd Int'l Conf. on Audio- and Video-based Biometric Person Authentication, Washington D.C., Mar. 1999, pp. 166-171.
[10] B. V. Dasarathy, editor, Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques. Washington DC: IEEE Computer Society, 1991.
[11] R. S. Zack, C. C. Tappert, S.-H. Cha, J. Aliperti, A. Amatya, T. Mariutto, A. Shah, and M. Warren, "Obtaining biometric ROC curves from a non-parametric classifier in a long-text-input keystroke authentication study," CSIS Technical Report #268, Pace University, November 2009.