Progressive Boosting for Class Imbalance and Its Application to Face Re-Identification


Roghayeh Soleymani a,*, Eric Granger a, Giorgio Fumera b

a Laboratoire d'imagerie, de vision et d'intelligence artificielle, École de technologie supérieure, Université du Québec, Montreal, Canada
b Pattern Recognition and Applications Group, Dept. of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy

Abstract

In practice, pattern recognition applications often suffer from imbalanced data distributions between classes, which may vary during operations w.r.t. the design data. For instance, in many video surveillance applications, e.g., face re-identification, the faces of individuals must be recognized over a distributed network of video cameras. An important challenge in such applications is class imbalance, since the number of faces captured from an individual of interest is greatly outnumbered by those of others. Two-class classification systems designed using imbalanced data tend to recognize the majority (negative) class better, while the class of interest (positive class) often has the smaller number of samples. Several data-level techniques have been proposed to alleviate this issue, where classifier ensembles are designed with balanced data subsets by up-sampling positive samples or under-sampling negative samples. However, some informative samples may be neglected by random under-sampling, and adding synthetic positive samples through up-sampling adds to training complexity. In this paper, a new ensemble learning algorithm called Progressive Boosting (PBoost) is proposed that progressively inserts uncorrelated groups of samples into a Boosting procedure to avoid losing information while generating a diverse pool of classifiers. In many real-world recognition problems, the samples may be regrouped using some application-based contextual information. For example, in face re-identification applications, facial regions of a same person appearing in a camera field of view may be regrouped based on their trajectories found by a face tracker.
From one iteration to the next, the PBoost algorithm accumulates these uncorrelated groups of samples into a set that grows gradually in size and imbalance. Base classifiers are trained on samples selected from this set and validated on the whole set. Consequently, PBoost is more robust when the operational data may have unknown and variable levels of skew. In addition, the computational complexity of PBoost is lower than that of Boosting ensembles in the literature that use under-sampling for learning from imbalanced data, because not all of the base classifiers are validated on all negative samples. The new loss factor used in PBoost avoids biasing performance towards the negative class. Using this loss factor, the weight update of samples and the classifier contribution in final predictions are set according to the ability of classifiers to recognize both classes. The proposed approach was validated and compared using synthetic data and videos from the Faces In Action and COX datasets that emulate face re-identification applications. Results show that PBoost outperforms state-of-the-art techniques in terms of both accuracy and complexity over different levels of imbalance and overlap between classes.

Keywords: Class Imbalance, Ensemble Learning, Boosting, Face Re-Identification, Video Surveillance.

1. Introduction

Class imbalance is a fundamental issue in many real-world pattern recognition applications found in, e.g., automated video surveillance, fraud detection, intrusion detection in computer and network security, risk management, and medical diagnosis. Imbalance appears in binary classification problems and in the binarization of multi-class classification problems using a one-vs-all strategy, where samples from one class are compared against all samples from all other classes (Galar et al., 2011; Wang & Yao, 2012).
In particular, in face re-identification applications, systems for video-to-video face recognition are designed using faces of individuals captured from video sequences, and seek to recognize them when they appear in archived or live videos captured over a network of video cameras. Face tracking systems follow the position of faces over consecutive video frames and define the trajectories by collecting all face captures that correspond to a same high-quality track of an individual. Class imbalance is an important challenge in this application because the number of face captures from an individual of interest (positive class) may be greatly outnumbered by those of unknown or non-target individuals (negative class). In practice, the level of imbalance observed during operations is unknown a priori and varies over time. This level of skew may differ from what is seen in the design data.

* Corresponding author. Email addresses: rsoleymani@livia.etsmtl.ca (Roghayeh Soleymani), Eric.Granger@etsmtl.ca (Eric Granger), Fumera@diee.unica.it (Giorgio Fumera)
Preprint submitted to Expert Systems with Applications, January 25, 2018

Classification algorithms designed using imbalanced data are often biased towards the majority (negative) class, even though the minority class is the (positive) class of interest. The main reason is that learning algorithms are typically designed to optimize performance in terms of standard accuracy. Consequently, correct classification of the negative class becomes their priority due to the abundance of samples for this class.

Several approaches have been proposed in literature to design ensembles of classifiers using imbalanced data (see the reviews (He & Garcia, 2009; Galar et al., 2012; Branco et al., 2016; Krawczyk, 2016; Haixiang et al., 2016)). In this paper, these approaches are divided into data-level and algorithm-level approaches. Data-level approaches either up-sample the positive class, under-sample the negative class, or combine up-sampling and under-sampling to re-balance data for learning an ensemble of classifiers. Algorithm-level methods create or modify learning algorithms to counter the bias towards the negative class, either through cost-free techniques or, in cost-sensitive approaches, by introducing uneven misclassification costs for the samples from different classes.

Ensemble generation techniques can also be categorized into static and dynamic approaches. Static ensembles are designed a priori and face no change during operations. The ensemble's selection or fusion may be set off-line using validation data, but typically assumes a fixed level of imbalance during operations. Dynamic ensembles allow the selection and fusion of base classifiers to be adapted during operations based on the operational data (Xiao et al., 2012; Galar et al., 2013a). Most of the ensemble learning methods to handle imbalance in literature are static approaches.

Boosting (Freund & Schapire, 1995; Freund et al., 1996) is a common static ensemble method that has been modified in several ways to learn from imbalanced data (see the reviews by (Galar et al., 2012; Branco et al., 2016; Krawczyk, 2016; Haixiang et al., 2016)).
In data-level Boosting approaches, training data is rebalanced by up-sampling the positive class, under-sampling the negative class, or using both up-sampling and under-sampling (Chawla et al., 2003; Hu et al., 2009; Mease et al., 2007; Guo & Viktor, 2004; Seiffert et al., 2010; Galar et al., 2013b; Díez-Pastor et al., 2015). Up-sampling methods like SMOTEBoost (Chawla et al., 2003) are often more accurate, but they are computationally complex. In contrast, random under-sampling (RUS) (Seiffert et al., 2010) is more computationally efficient, but suffers from information loss. Partitional approaches (Soleymani et al., 2016a; Yan et al., 2003; Li et al., 2013) avoid information loss by splitting the negative class into uncorrelated subsets and training classifiers using all of these subsets.

Another issue with Boosting-based ensembles is that they may suffer from a bias of performance towards the negative class, because the loss factor that guides their learning process is obtained based on weighted accuracy. In cases of imbalance, weighted accuracy reflects the ability to correctly classify negative samples more than positive ones. This issue can be avoided by adopting a cost-sensitive approach (Fan et al., 1999; Ting, 2000; Sun et al., 2007) that defines different misclassification costs for different classes and integrates these cost factors into the Boosting learning process. The drawback of these cost-sensitive techniques is that they rely on a suitable selection of cost factors, which is often estimated by searching a range of possible values. In contrast, cost-free techniques modify learning algorithms by enhancing the loss factor calculation without considering cost factors (Joshi et al., 2001; Kim et al., 2015; Soleymani et al., 2016b).

In literature, imbalance is addressed for face re-identification through dynamic and static approaches. The dynamic approaches base the selection and fusion of the classifiers on the estimated level of skew (Radtke et al., 2014; De-la Torre et al., 2015) to build robust ensembles of classifiers.
In (Radtke et al., 2014; De-la Torre et al., 2015) the authors design base classifiers for a range of different levels of imbalance for face re-identification in video surveillance applications. Then, they estimate the skew level of the input data stream and select a suitable fusion function based on that level. The level of imbalance may be difficult to estimate accurately during operations, and a diverging selection and fusion function can decrease performance. In contrast, in a static approach (Soleymani et al., 2016a), the range of possible imbalance levels is accounted for during design by training base classifiers on data subsets with different imbalance levels.

In this paper we address the two above-mentioned issues of Boosting-based ensembles in imbalanced problems, and in particular in face re-identification applications: the computational complexity of up-sampling methods and the loss of information of under-sampling ones; and the bias of the standard loss factor towards the negative class. To this aim we propose the Progressive Boosting (PBoost) algorithm to design static classifier ensembles that can maintain a high level of performance over a range of possible levels of imbalance and complexity in the data encountered during operations. PBoost uses a partitioning method inspired by face re-identification applications, Trajectory Under-Sampling (TUS), that we proposed in our previous work (Soleymani et al., 2016a). TUS uses partitions of the negative class based on tracking information to design ensembles of classifiers. In particular, samples from the negative class are regrouped into disjoint partitions and, over iterations, these partitions are gradually accumulated into a temporary design subset. In each iteration, a base classifier is trained on samples drawn from this temporary subset such that samples from the newly added partition and the important samples from previous iterations have an equally high probability of being selected. The base classifier is then validated on the whole temporary subset. As with traditional Boosting ensembles, the samples that are misclassified are considered the most important samples and their weights increase.
With the sample selection scheme proposed in this paper, loss of information is considerably reduced, correlation among subsets of the negative class is low, and only important samples tend to appear in more than one training subset. Therefore, the diversity and accuracy of Boosting ensembles tend to increase. In addition, to avoid biasing the performance towards the negative class, the proposed PBoost algorithm employs a loss factor based on the $F_\beta$-measure, previously proposed by the authors (Soleymani et al., 2016b), that is applicable in any Boosting ensemble.

The diverse pool of classifiers generated with PBoost makes it possible to globally model a range of different levels of imbalance and decision boundary complexities for the data. Therefore, the static ensembles produced using PBoost are robust to possible variations in data processed during operations, because base classifiers are validated on a growing number of negative samples (imbalance level). In addition, the number of samples used per iteration to design (train and validate) a classifier in this ensemble is smaller than in Boosting methods in the literature, which translates to a lower computational complexity for design.

The contributions of the proposed PBoost algorithm for face re-identification are summarized as follows:
- A new sample selection process for designing Boosting ensembles, where negative class samples enter the Boosting process in uncorrelated partitions to avoid loss of information. Specifically for the face re-identification application, the disjoint partitions are selected using tracking information.
- Modifying the validation step in Boosting learning such that base classifiers are validated on a growing number of negative samples, to increase robustness to imbalance and decrease computational complexity.
- Exploiting a specific loss factor (F-measure) in the Boosting algorithm to avoid bias of performance towards the majority class.

The PBoost algorithm has been compared to state-of-the-art Boosting ensembles on synthetic and video datasets that emulate the face re-identification application, in terms of both accuracy and computational complexity.

The rest of the paper is structured as follows. Section 2 contains a review of literature on ensemble learning for class imbalance in general and in the face re-identification application. In Section 3, the proposed PBoost algorithm is described. The experimental methodology and results are presented in Sections 4 and 5, respectively.

2. Boosting Ensemble Learning for Class Imbalance

Learning from imbalanced data has been addressed in literature through data-level, algorithm-level, and cost-sensitive techniques. Ensemble learning methods exploit one or a combination of the aforementioned techniques (Galar et al., 2012) to handle imbalance. Classifier ensembles can provide higher accuracy and robustness than a single classifier system by combining diverse classifiers (Rokach, 2010).

Boosting is a common static ensemble learning algorithm initiated with AdaBoost (Freund & Schapire, 1995) and improved in AdaBoost.M1 (for 2-class problems) and AdaBoost.M2 (for multiple-class problems) (Freund et al., 1996) to effectively promote a weak learner that performs slightly better than random guessing into a stronger ensemble. In AdaBoost.M1 (Algo. 1) samples are assigned weights that indicate their importance. These weights guide the learning process such that base classifiers in the ensemble focus on correct classification of more important samples as the learning iterations proceed. Samples that are misclassified in each iteration gain more importance for the next iteration, and more accurate base classifiers gain a higher contribution in the final decision.

Algorithm 1: AdaBoost.M1 ensemble learning method.
Input: Training set $S = \{(x_i, y_i);\ i = 1,\ldots,M\}$, $y_i \in \{-1, 1\}$; number of iterations $E$; test input $X$
Output: Prediction function $H(\cdot)$
1 Initialize $W_1(i) = 1/M$ for $i = 1,\ldots,M$.
2 for $e = 1,\ldots,E$ do
    i    Create new training set $S'_e$ with weight distribution $W_e$.
    ii   Train classifier $C_e$ on $S'_e$ with $W_e$.
    iii  Test $C_e$ on $S$ and get back a label set $\{Y_i,\ i = 1,\ldots,M\}$.
    iv   Calculate the pseudo-loss for $S$ and $W_e$: $\varepsilon_e = \sum_{(i, y_i):\, y_i \neq Y_i} W_e(i)$.
    v    If $\varepsilon_e > 0.5$ go to step i.
    vi   Calculate the weight update parameter: $\alpha_e = \varepsilon_e / (1 - \varepsilon_e)$.
    vii  Update $W_{e+1}(i) = W_e(i)\,\alpha_e^{\,1 - |y_i - Y_i|/2}$.
    viii Normalize $W_{e+1}$ such that $\sum_i W_{e+1}(i) = 1$.
3 Output the final hypothesis: $H(\cdot) = \sum_{e=1}^{E} h_e(\cdot) \log\frac{1}{\alpha_e}$

These weights are used directly or for re-sampling the training data, depending on the type of base classifier being used. When the base classifier is of a type that is not designed to incorporate sample weights in its learning process (like SVMs), training data is resampled according to the weights of the samples. This case is considered here to explain the Boosting procedure.

Let us consider a two-class problem with $M$ labelled training samples $S = \{(x_i, y_i);\ i = 1,\ldots,M\}$, where $y_i \in \{-1, 1\}$, that contains $M^+$ positive samples and $M^-$ negative samples. All samples in the dataset are initially associated with the same weight $W_1(i) = 1/M$, $i = 1,\ldots,M$. Then, a new training subset $S'_e$ is resampled from $S$ according to $W_e$ to train classifier $C_e$.
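As a minimal sketch of one AdaBoost.M1 round as listed in Algorithm 1, the fragment below resamples indices in proportion to the sample weights and then applies the pseudo-loss, the update parameter, and the weight update. The function names (`resample_indices`, `adaboost_m1_round`, `ensemble_score`) and the toy labels are illustrative assumptions, not part of the original work, and the base classifier itself is left out.

```python
import math
import random


def resample_indices(weights, size, seed=0):
    """Step 2.i: draw sample indices with probability proportional to weight."""
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=weights, k=size)


def adaboost_m1_round(weights, y_true, y_pred):
    """Steps 2.iv to 2.viii for labels in {-1, +1}.

    Returns (eps, alpha, normalized_new_weights).
    """
    # Pseudo-loss: total weight of the misclassified samples.
    eps = sum(w for w, y, p in zip(weights, y_true, y_pred) if y != p)
    if eps > 0.5:
        raise ValueError("classifier too weak: discard it and resample (step 2.v)")
    if eps == 0.0:
        raise ValueError("alpha would be 0, so log(1/alpha) is undefined")
    alpha = eps / (1.0 - eps)
    # The exponent 1 - |y - Y|/2 is 1 for correct samples and 0 for errors,
    # so only correctly classified samples are scaled by alpha (at most 1).
    updated = [w * alpha ** (1 - abs(y - p) / 2)
               for w, y, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return eps, alpha, [w / total for w in updated]


def ensemble_score(outputs, alphas):
    """Final hypothesis: base outputs h_e(x) weighted by log(1/alpha_e)."""
    return sum(h * math.log(1.0 / a) for h, a in zip(outputs, alphas))


# Toy round: four equally weighted samples, the second one is misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, -1, -1]
y_pred = [1, -1, -1, -1]
eps, alpha, new_weights = adaboost_m1_round(weights, y_true, y_pred)
```

In this toy round the single misclassified sample ends up with half of the total weight after normalization, which is what steers the next resampling step towards it.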
This classifier is tested on all training samples ($S$) and a loss factor ($\varepsilon_e$) is calculated as the sum of the weights of misclassified samples:

$$\varepsilon_e = \sum_{(i, y_i):\, y_i \neq Y_i} W_e(i) \tag{1}$$

where $Y_i$ is the label associated with $x_i$ by $C_e$. If the classifier is too weak ($\varepsilon_e > 0.5$), the classifier is discarded and the training set is re-sampled to train another classifier. The loss factor is then used to define a weight update factor:

$$\alpha_e = \frac{\varepsilon_e}{1 - \varepsilon_e}. \tag{2}$$

The weights of the samples are then updated as:

$$W_{e+1}(i) = W_e(i)\,\alpha_e^{\,1 - \frac{1}{2}|y_i - Y_i|}. \tag{3}$$

The weight vector is then normalized, so that the weights of the misclassified samples (the more important samples) increase exponentially while the weights of the correctly classified samples decrease. $\alpha_e$ is also used to determine the contribution of the classifier in final predictions (Equation 4), so that more accurate classifiers play a more important role in identifying the class of the input sample. This process is repeated a predefined number of times to design $E$ classifiers. Considering $h_e(x)$ as the output of $C_e$ (either a classification score or a label) for an input sample $x$, the final prediction of the ensemble is obtained from:

$$H(x) = \sum_{e=1}^{E} h_e(x) \log\frac{1}{\alpha_e} \tag{4}$$

Analogous to most learning algorithms, AdaBoost is not effective at learning from imbalanced data, for two reasons. First, negative samples are the majority, and when training data is re-sampled in line 2.i of AdaBoost (see Algo. 1), they contribute more to $S'_e$. Therefore, $C_e$ is trained with a bias towards correct classification of this class. Second, when $C_e$ is tested on $S$, the loss factor in line 2.iv is calculated as a weighted error rate of classification. Again, negative samples contribute more to the loss factor calculation, and the weight update formula and the classifiers' contribution in the final prediction become biased, such that the weight of negative samples increases for the next iteration and classifiers that mostly classify negative samples correctly get higher importance in the final prediction of the ensemble.

A taxonomy of methods in literature that modify AdaBoost to handle imbalance is presented in Figure 1. Based on the issue these approaches address, they are divided into two categories, data-level and algorithm-level methods, that are presented in subsections 2.1 and 2.2, respectively.

2.1. Data-Level Methods:

Class imbalance can be handled in Boosting ensembles through up-sampling the positive class, under-sampling the negative class, or a combination of them. A popular up-sampling Boosting approach is SMOTEBoost (Chawla et al., 2003), which integrates the Synthetic Minority Over-sampling Technique (SMOTE) into AdaBoost.M2. SMOTE creates synthetic samples by interpolating each positive sample with its k-nearest neighbours. MSMOTEBoost (Hu et al., 2009) uses a modified SMOTE (MSMOTE) that eliminates noisy samples and oversamples only safe samples. Jous-Boost (Mease et al., 2007) oversamples the positive class by duplicating it, instead of creating new samples, and introduces perturbation (jittering) to this data in order to avoid overfitting.
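The interpolation step that SMOTE relies on can be illustrated with a short sketch. It is only a simplified, assumption-laden illustration of the idea described above (one synthetic point between a positive sample and one of its k nearest positive neighbours), not the SMOTEBoost procedure; `smote_sample` and the toy points are hypothetical names and data.

```python
import random


def smote_sample(positives, k=2, seed=0):
    """Create one synthetic positive sample by interpolation.

    `positives` is a list of equal-length numeric tuples (at least two).
    """
    rng = random.Random(seed)
    base = rng.choice(positives)
    others = [p for p in positives if p is not base]
    # k nearest positive neighbours of `base` (squared Euclidean distance).
    neighbours = sorted(
        others,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
    )[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # position along the segment base -> neighbour
    return tuple(b + gap * (n - b) for b, n in zip(base, neighbour))


positives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote_sample(positives)
```

Each call adds one sample and needs a neighbour search over the positive class, which is the source of the extra training cost attributed to up-sampling.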
DataBoost-IM (Guo & Viktor, 2004) oversamples difficult samples from both classes and integrates this into AdaBoost.M1. Up-sampling techniques address the bias of performance in classifiers by balancing the class distribution without loss of information. However, up-sampling in general increases the number of samples and consequently the complexity of learning algorithms, and SMOTE involves additional computations due to interpolating each sample with its k-nearest neighbours to generate synthetic samples.

In the under-sampling Boosting category, RUSBoost (Seiffert et al., 2010) integrates random under-sampling (RUS) into AdaBoost.M1. RUSBoost is similar to AdaBoost presented in Algo. 1, except that in line 2.i of this algorithm, $S'_e$ contains all positive samples and a randomly selected subset of the negative class, often with a size equal to that of the positive class. The subsets of the negative class selected randomly over iterations of RUSBoost could be highly correlated, and the classifiers trained on them can lack diversity, especially when the skew level of training data is high. The sample selection paradigm in RUSBoost is managed in EUSBoost (Galar et al., 2013b) to create less correlated subsets using evolutionary prototype selection (García & Herrera, 2009). Some researchers combine SMOTE and RUS in AdaBoost to achieve greater diversity and avoid loss of information, as in Random Balance Boosting (RB-Boost) (Díez-Pastor et al., 2015). RB-Boost combines SMOTE and RUS to create training subsets with random and different skew levels in AdaBoost.M1. Repetition of sampling in Boosting ensembles increases the chance of low correlation between the subsets of data that are used for designing base classifiers and therefore maintains diversity among them. However, some potentially informative samples may be overlooked from these subsets in the under-sampling process.

In partitional approaches (Soleymani et al., 2016a; Yan et al., 2003; Li et al., 2013), bootstraps are selected without replacement either randomly (Yan et al., 2003), by clustering (Li et al., 2013), or based on prior knowledge from the application (like trajectories in video surveillance applications such as face re-identification (Soleymani et al., 2016a)).
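The difference between RUSBoost-style sampling and partitional sampling can be sketched as follows. This is an illustration under simplifying assumptions (negative samples represented by integer indices, equal-sized random partitions); `rus_subsets` and `disjoint_partitions` are hypothetical helper names.

```python
import random


def rus_subsets(neg_idx, subset_size, n_iter, seed=0):
    """RUS per iteration: independent random subsets, which may overlap."""
    rng = random.Random(seed)
    return [rng.sample(neg_idx, subset_size) for _ in range(n_iter)]


def disjoint_partitions(neg_idx, n_parts, seed=0):
    """Random partitioning without replacement: each negative sample
    lands in exactly one subset."""
    rng = random.Random(seed)
    shuffled = list(neg_idx)
    rng.shuffle(shuffled)
    return [shuffled[k::n_parts] for k in range(n_parts)]


negatives = list(range(100))
rus = rus_subsets(negatives, subset_size=10, n_iter=5)
parts = disjoint_partitions(negatives, n_parts=5)

# Number of distinct negative samples seen across all iterations.
covered_by_rus = len(set().union(*rus))
covered_by_parts = len(set().union(*parts))
```

With five subsets of ten samples, RUS can cover at most half of the 100 negatives, whereas the disjoint partitions cover all of them, at the price of larger training subsets.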
In these ensembles, bootstraps are drawn from a set of negative samples that shrinks in each iteration. In other words, after selection of a bootstrap in each iteration, its samples are eliminated from the main set. In the random partitioning of negative samples by Yan et al. (2003), the negative data is randomly decomposed into a number of subsets and each subset, combined with the positive samples, is used to train a classifier. Li et al. (2013) partition negative data by clustering it using k-means in the feature space, and then create an ensemble from the classifiers trained on each negative cluster and the positive samples. The contributions of the classifiers in the ensemble are then weighted based on the distance between the corresponding negative cluster and the positive class. In (Soleymani et al., 2016a), the negative class is partitioned by selecting samples from a set of trajectories that are formed based on the tracking information, as found in several video surveillance applications like face re-identification. In this approach, data from the trajectories are accumulated as the training iterations proceed and, therefore, base classifiers in the ensemble are trained on different imbalance levels to increase the robustness of the ensemble to possible variations in the skew level and complexity of operational data.

In contrast to RUSBoost, these partitional approaches use all negative samples from the partitions to design ensembles and avoid loss of information. However, not all samples are informative, and using all samples for training may result in unnecessary time and memory complexity. Therefore, enhancing partitional methods with a more intelligent sample selection and ensemble learning algorithm (like AdaBoost) can avoid information loss and excessive time complexity at the same time.

Figure 1: A taxonomy of Boosting ensemble learning methods specialized for imbalanced data.

2.2. Algorithm-Level Methods:

Using the standard loss factor based on misclassification rate in Boosting ensemble learning algorithms biases their performance towards the negative class. In literature this issue is avoided at the algorithm level using two types of techniques: those that employ two different misclassification cost factors, one for the positive and another for the negative class, and those that handle this issue without the use of cost factors.

Cost-sensitive Boosting methods, including AdaCost (Fan et al., 1999), CSB (Ting, 2000) and AdaC (Sun et al., 2007), embed different misclassification cost factors into the loss function or weight update formula of AdaBoost.M2. Given $\mu_i$ as the cost factor of sample $x_i$, in AdaCost (Fan et al., 1999) two cost adjustment functions are defined for each sample as $\phi_+ = -0.5\mu_i + 0.5$ and $\phi_- = 0.5\mu_i + 0.5$, and the weight update formula is changed to:

$$W_{e+1}(i) = \begin{cases} W_e(i)\exp\{\alpha_e \phi_+ |y_i - Y_i|/2\} & \text{for } Y_i = 1 \\ W_e(i)\exp\{\alpha_e \phi_- |y_i - Y_i|/2\} & \text{for } Y_i = -1 \end{cases} \tag{5}$$

CSB (Ting, 2000) introduces two different cost factors for the positive and negative classes, $\mu_+$ and $\mu_-$, respectively:

$$W_{e+1}(i) = \begin{cases} W_e(i)\,\mu_+ \exp\{\alpha_e |y_i - Y_i|/2\} & \text{for } Y_i = 1 \\ W_e(i)\,\mu_- \exp\{\alpha_e |y_i - Y_i|/2\} & \text{for } Y_i = -1 \end{cases} \tag{6}$$

In AdaC1, 2, 3 (Sun et al., 2007) cost factors are embedded into the weight update formula in three different ways. Given $\mu_i \in [0, +\infty)$, in AdaC1:

$$\alpha_e = \frac{1}{2}\ln\frac{1 + \sum_{i,\,y_i = Y_i}\mu_i W_e(i) - \sum_{i,\,y_i \neq Y_i}\mu_i W_e(i)}{1 - \sum_{i,\,y_i = Y_i}\mu_i W_e(i) + \sum_{i,\,y_i \neq Y_i}\mu_i W_e(i)}, \tag{7}$$

$$W_{e+1}(i) = W_e(i)\exp\{-\alpha_e \mu_i Y_i y_i\} \tag{8}$$

In AdaC2:

$$\alpha_e = \frac{1}{2}\ln\frac{\sum_{i,\,y_i = Y_i}\mu_i W_e(i)}{\sum_{i,\,y_i \neq Y_i}\mu_i W_e(i)}, \tag{9}$$

$$W_{e+1}(i) = \mu_i W_e(i)\exp\{-\alpha_e Y_i y_i\} \tag{10}$$

In AdaC3:

$$\alpha_e = \frac{1}{2}\ln\frac{\sum_{i}\mu_i W_e(i) + \sum_{i,\,y_i = Y_i}\mu_i^2 W_e(i) - \sum_{i,\,y_i \neq Y_i}\mu_i^2 W_e(i)}{\sum_{i}\mu_i W_e(i) - \sum_{i,\,y_i = Y_i}\mu_i^2 W_e(i) + \sum_{i,\,y_i \neq Y_i}\mu_i^2 W_e(i)}, \tag{11}$$

$$W_{e+1}(i) = \mu_i W_e(i)\exp\{-\alpha_e \mu_i Y_i y_i\} \tag{12}$$

In these cost-sensitive approaches, by setting $\mu_+$ greater than $\mu_-$, the weights of misclassified samples from the positive class increase more than those of the misclassified samples from the negative class. In addition, the weights of the classifiers that classify the positive class correctly better than the negative class are higher in the final decision. Therefore, these cost-sensitive approaches can make up for the usage of standard error rate in Boosting ensembles and allow the performance to be adapted by selecting proper cost factors based on the application. The drawback of these cost-sensitive approaches is that they require known $\mu_i$'s that are usually set ad hoc or by conducting a search in the space of possible costs for a dataset.

Some cost-free approaches have been proposed to deal with the bias of performance caused by using standard error in Boosting ensembles. In RareBoost (Joshi et al., 2001), two different $\alpha$s are defined for the positive and negative classes as:

$$\alpha_e^+ = \frac{1}{2}\ln\!\left(\frac{TP_e}{FP_e}\right), \qquad \alpha_e^- = \frac{1}{2}\ln\!\left(\frac{TN_e}{FN_e}\right) \tag{13}$$

where $TP_e$ and $TN_e$ are the true positive and true negative counts, respectively (and $FP_e$ and $FN_e$ the corresponding false positive and false negative counts). Then the weight update formula and the final classification prediction are modified as:

$$W_{e+1}(i) = \begin{cases} W_e(i)\exp\{\alpha_e^+ |y_i - Y_i|/2\} & \text{for } Y_i = 1 \\ W_e(i)\exp\{\alpha_e^- |y_i - Y_i|/2\} & \text{for } Y_i = -1 \end{cases} \tag{14}$$

$$H(x) = \mathrm{sign}\!\left(\sum_{e:\,h_e(x) \geq 0} \alpha_e^+ h_e(x) + \sum_{e:\,h_e(x) < 0} \alpha_e^- h_e(x)\right) \tag{15}$$

Kim et al. (Kim et al., 2015) also define two different $\alpha$s for the positive and negative classes as:

$$\alpha_e^+ = \frac{1 - l_e^+}{l_e^+}, \qquad l_e^+ = \frac{\sum_{i;\,y_i = +1} W_e(i)\,|y_i - Y_i|/2}{\sum_{i;\,y_i = +1} W_e(i)} \tag{16}$$

$$\alpha_e^- = \frac{1 - l_e^-}{l_e^-}, \qquad l_e^- = \frac{\sum_{i;\,y_i = -1} W_e(i)\,|y_i - Y_i|/2}{\sum_{i;\,y_i = -1} W_e(i)} \tag{17}$$

where $l_e^+$ and $l_e^-$ are pseudo errors of the classifier in classifying each class. Finally:

$$\alpha_e = \ln(\mu_i\, \alpha_e^+ \alpha_e^-), \tag{18}$$

where $\mu_i$ is a multiplier to control the weight of each sample.

Another cost-free approach (Soleymani et al., 2016b) modifies the loss factor calculation of the Boosting algorithm using the F-measure, one of the most frequently used measures for performance evaluation in class imbalance learning. To calculate this loss factor, the weight vector $W_e$ is split into two weight vectors, $W_e^+$ for the positive class and $W_e^-$ for the negative class. Then, weighted versions of the true positive, false positive, true negative and false negative counts are defined as follows, where each sum runs over the samples of the indicated class and the condition is on the label $Y_i$ predicted by $C_e$:

$$TP_e = \sum_{i:\,Y_i = 1} W_e^+(i), \quad i = 1,\ldots,M^+ \tag{19}$$

$$FP_e = \sum_{i:\,Y_i = 1} W_e^-(i), \quad i = 1,\ldots,M^- \tag{20}$$

$$TN_e = \sum_{i:\,Y_i = -1} W_e^-(i), \quad i = 1,\ldots,M^- \tag{21}$$

$$FN_e = \sum_{i:\,Y_i = -1} W_e^+(i), \quad i = 1,\ldots,M^+ \tag{22}$$

Based on these values, the accuracy of a classifier is computed in terms of the $F_\beta$-measure as:

$$A_F = \frac{(1 + \beta^2)\,TP_e}{(1 + \beta^2)\,TP_e + FP_e + \beta^2 FN_e}, \tag{23}$$

To measure the error of the classifiers, the corresponding loss factor is defined as:

$$L_e = 1 - A_F = \frac{FP_e + \beta^2 FN_e}{(1 + \beta^2)\,TP_e + FP_e + \beta^2 FN_e}. \tag{24}$$

The condition $\varepsilon_e > 0.5$ in line (v) of AdaBoost.M1 (Algo. 1) means that classifiers in a Boosting ensemble should perform better than random guessing. When the $F_\beta$-measure is used as the evaluation metric, the base classifier to beat is the one that predicts everything as positive (Flach & Kull, 2015). Therefore, when the loss factor is calculated using Eq. (24), the accuracy criterion of 0.5 in AdaBoost.M1 should be replaced by $l_b = \frac{M^-}{(1 + \beta^2)M^+ + M^-}$.

Cost-free methods enhance the performance of Boosting ensembles without setting any cost factors, and guide the learning process using a more suitable loss factor calculation, since the use of weighted standard accuracy, as in the original Boosting algorithm, biases the learning process towards correct classification of the negative class. The problem with the loss factor proposed by Kim et al. (Kim et al., 2015) is that, if there are no misclassified samples in one class or in both classes, $\alpha_e$ is undefined. Besides, the $F_\beta$-measure is more sensitive to imbalance than G-mean and at the same time allows us to give more importance to one class than the other. Therefore, this metric is used in the loss factor calculation of the proposed PBoost algorithm.

2.3. Class Imbalance in Face Re-Identification

Face re-identification is a video surveillance application where video-to-video face recognition systems are designed to recognize faces of the individuals in archived or live videos at different time instants and/or locations over a network of distributed cameras. In this application, a face tracker defines facial trajectories for the moving faces captured over consecutive frames. An efficient tracking system does not mix the tracking information from several individuals between frames, and therefore each trajectory corresponds to one individual. A trajectory is defined as a set of facial ROIs that correspond to a same high-quality track of an individual across consecutive frames. A common classification architecture in face re-identification is a modular classification system consisting of a single classifier or an ensemble of classifiers designed per target individual of interest. Two-class classification systems in this application are designed using face captures from the target individual of interest (positive class) and those of the non-target individuals (negative class). This application is challenging due to variations in capture conditions such as pose, illumination, expression, etc. Moreover, an important challenge in this application is that the number of faces captured from the target individual is typically limited and greatly outnumbered by those of non-target ones. In addition, the level of imbalance during operations may differ from that of the design data.

There are some specialized approaches in literature to address imbalance in the face re-identification application. In a dynamic approach (Radtke et al., 2014; De-la Torre et al., 2015), a pool of classifiers is generated using data subsets with different imbalance levels. These classifiers are then combined using Boolean combination and validated on different imbalance levels.
During operations, the level of imbalance is estimated and the suitable Boolean function and its corresponding set of classifiers is selected. The level of imbalance may be difficult to estimate accurately during operations, and a static approach that accounts for variations in the imbalance level of data may be of more interest. In the static approach of (Soleymani et al., 2016a), all the face captures are regrouped into trajectories. The classifiers are trained using a positive class trajectory and different numbers of negative class trajectories to design diverse and accurate classifiers. Selecting samples using trajectories to design classifier ensembles appears to be more effective than using the general-purpose sampling techniques (RUS and CUS) to improve accuracy. The same sample selection technique is utilized in the proposed PBoost algorithm for the face re-identification application.

3. Progressive Boosting for Learning Ensembles from Imbalanced Data:

The Progressive Boosting (PBoost) learning method is proposed to sustain a high level of performance over a range of imbalance and complexity levels in the data seen during operations. This method follows a static approach, and learns ensembles based on a combination of under-sampling and cost-free adjustment of Boosting ensemble learning.

With the PBoost algorithm, the negative class is partitioned into disjoint subsets. These partitions are accumulated into a temporary design set progressively as learning iterations proceed. In each iteration, a subset of this temporary set is used for training a classifier such that the most important samples plus samples from the new partition are given an equally high opportunity to be used in training a base classifier. Loss of information is therefore avoided and ensemble diversity is increased. The trained classifier is then validated on the temporary set that contains all positive samples and only those negative partitions that have already been used in previous training iterations. As the temporary set grows, its imbalance level increases and therefore the ensemble's robustness to diverse levels of skew and decision boundary complexities during operations is increased. In PBoost, the error of the classifier is determined based on its ability to correctly classify both positive and negative classes. This loss factor plays an important role in determining the contribution of classifiers in the final prediction, and in the selection criteria of samples for designing the next classifiers.

There are several possible ways to partition the negative samples into disjoint subsets in literature (Xu & Wunsch, 2005), e.g., prototype-based methods like k-means and GMM algorithms, and affinity-based methods like spectral, normalized-cut and subspace algorithms, to represent the negatives and thus define partitions (number of clusters and association of data to clusters). Two general-purpose partitioning techniques have been used in literature to partition data to learn ensembles from imbalanced data: Random Under-Sampling without replacement (which we call RUSwr in this paper) (Yan et al., 2003), and Cluster Under-Sampling (CUS) (Li et al., 2013). In some applications the data is already partitioned, as in the binarization of multi-class classification problems using a one-vs-all strategy.
In some others, the data may be grouped based on some contextual or application-based knowledge of the data. Trajectory Under-Sampling (TUS) is applicable in video surveillance applications where Regions Of Interest (ROIs), which are faces in face re-identification, are regrouped into a set called a trajectory with a high-quality face tracker (Soleymani et al., 2016a). A high-quality tracking system finds the trajectories by efficiently following the location of the ROIs that belong to the same individual over consecutive video frames. This application-based under-sampling method appeared to be more effective than the general-purpose under-sampling methods in designing classifier ensembles for face re-identification in terms of diversity and accuracy of opinions (Soleymani et al., 2016a).

The progressive Boosting method is presented in Algo. 2 and Figure 2. Its main steps are explained in the following. The negative samples are regrouped into $E$ disjoint partitions $P_e$, where $e = 1,\ldots,E$, one per classifier in the ensemble (line 1). $E$ and the number of negative samples in each partition, $N_e$, vary and depend on the partitioning method and the data distribution. In the case of random under-sampling without replacement, $E$ is preselected and $N_e$ takes a fixed random value $N_e \in [M^+/2,\ 2M^+]$ such that $\sum_{e=1}^{E} N_e = M^-$. In the case of CUS and TUS, $E$ and $N_e$ depend on the number of samples that are assigned to each partition by the clustering algorithm and the tracker, respectively.

Given a training data set $S$, one partition $P_e$ is selected in each iteration and added to a temporary set $S^{tmp}_e$ (line 5.ii), which initially contains the positive samples. The same initial weight $w^{ini}_e$ is assigned to the samples in the new partition, creating a weight vector $W^{p}_e$ (line 5.i), which is also added to a temporary weight set $W^{tmp}_e$ (line 5.ii). In the next step (line 5.iv), $N_e$ samples from the temporary set $S^{tmp}_e$ are selected through random under-sampling to create a new subset $S'_e$ with the corresponding weight distribution $W'_e$. A classifier $C_e$ is trained on $S'_e$ (line 5.v). Then it is tested on the whole temporary set $S^{tmp}_e$, which has an imbalance level of $\lambda_e = 1 : \sum_{f=1}^{e} N_f / M^+$ (line 5.vi).
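To illustrate how the imbalance level of the temporary set grows as partitions are accumulated, the sketch below computes the cumulative number of negatives and the ratio of negatives per positive after each iteration. The function name `temporary_set_skew` and the partition sizes are made-up values for illustration only.

```python
from itertools import accumulate


def temporary_set_skew(num_pos, partition_sizes):
    """For each iteration e, return (cumulative negatives, negatives per
    positive), i.e. the right-hand side of lambda_e = 1 : sum(N_f) / M+."""
    cumulative = list(accumulate(partition_sizes))
    return [(n_neg, n_neg / num_pos) for n_neg in cumulative]


# Hypothetical design set: 20 positives, four negative partitions.
levels = temporary_set_skew(num_pos=20, partition_sizes=[10, 25, 40, 25])
```

With these sizes the validation set moves from a skew of 1:0.5 in the first iteration to 1:5 in the last one, so early classifiers are validated on far fewer negatives than later ones.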
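Therefore, the classifiers in this ensemble are in fact validated on data subsets with a growing level of imbalance and complexity. After that, the loss factor is calculated using the method proposed in (Soleymani et al., 2016b). The temporary weight vector $W^{tmp}_e$ is split into two weight vectors for the positive ($W^{tmp,+}_e$) and negative ($W^{tmp,-}_e$) classes. The size of $W^{tmp,+}_e$ is $M^+$ and the size of $W^{tmp,-}_e$ is $\sum_{f=1}^{e} N_f$, and:

$$W^{tmp,+}_e = \Big\{ W^{tmp}_e(j),\ j = 1, \dots, \big(M^+ + \textstyle\sum_{f=1}^{e} N_f\big) \ \Big|\ y_j = 1 \Big\}, \tag{25}$$

$$W^{tmp,-}_e = \Big\{ W^{tmp}_e(j),\ j = 1, \dots, \big(M^+ + \textstyle\sum_{f=1}^{e} N_f\big) \ \Big|\ y_j = -1 \Big\}. \tag{26}$$

Then, weighted versions of the true positive, false positive, true negative and false negative counts are defined as:

$$TP_e = \sum_{k:\,Y_k = 1} W^{tmp,+}_e(k), \quad k = 1, \dots, M^+ \tag{27}$$

$$FP_e = \sum_{k:\,Y_k = 1} W^{tmp,-}_e(k), \quad k = 1, \dots, \textstyle\sum_{f=1}^{e} N_f \tag{28}$$

$$TN_e = \sum_{k:\,Y_k = -1} W^{tmp,-}_e(k), \quad k = 1, \dots, \textstyle\sum_{f=1}^{e} N_f \tag{29}$$

$$FN_e = \sum_{k:\,Y_k = -1} W^{tmp,+}_e(k), \quad k = 1, \dots, M^+ \tag{30}$$

where $Y_k$ is the label predicted by $C_e$. To measure the error of the classifiers, the corresponding loss factor is defined as:

$$L_e = 1 - F_\beta = \frac{FP_e + \beta^2 FN_e}{(1 + \beta^2)\,TP_e + FP_e + \beta^2 FN_e}. \tag{31}$$

After calculation of $\alpha_e$ (line 5.ix) from:

$$\alpha_e = \frac{L_e}{1 - L_e}, \tag{32}$$

the weights in the temporary set $W^{tmp}_e$ are updated (line 5.x) as:

$$W^{tmp}_{e+1}(j) = W^{tmp}_e(j)\ \alpha_e^{\,1 - |y_j - Y_j|/2}. \tag{33}$$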
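As an illustration of Equations 27 to 33, the following Python sketch computes the weighted counts, the F-measure-based loss factor, and the weight update for labels in {-1, 1}. It is a sketch under stated assumptions rather than the authors' code: the function names are hypothetical, the exponent $1 - |y_j - Y_j|/2$ is taken so that correctly classified samples are down-weighted, and the loss is assumed to lie strictly between 0 and 1.

```python
def weighted_counts(weights, y_true, y_pred):
    """Weighted TP, FP, TN, FN for labels in {-1, 1} (positive class = 1)."""
    tp = fp = tn = fn = 0.0
    for w, y, p in zip(weights, y_true, y_pred):
        if y == 1:
            if p == 1:
                tp += w
            else:
                fn += w
        else:
            if p == 1:
                fp += w
            else:
                tn += w
    return tp, fp, tn, fn

def f_loss(weights, y_true, y_pred, beta=2.0):
    """Loss factor L = 1 - F_beta; assumes positive samples carry non-zero weight."""
    tp, fp, _, fn = weighted_counts(weights, y_true, y_pred)
    b2 = beta * beta
    return (fp + b2 * fn) / ((1 + b2) * tp + fp + b2 * fn)

def update_weights(weights, y_true, y_pred, loss):
    """Return alpha and the renormalized weights after one Boosting iteration."""
    if not 0.0 < loss < 1.0:
        raise ValueError("this sketch assumes 0 < loss < 1")
    alpha = loss / (1.0 - loss)
    # Exponent is 1 for correct predictions and 0 for errors, so only
    # correctly classified samples are scaled by alpha.
    updated = [w * alpha ** (1 - abs(y - p) / 2)
               for w, y, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

if __name__ == "__main__":
    w = [0.25, 0.25, 0.25, 0.25]
    y = [1, 1, -1, -1]
    pred = [1, 1, -1, 1]               # one false positive
    loss = f_loss(w, y, pred)
    print(loss, update_weights(w, y, pred, loss))
```

In the usage example, the single misclassified negative ends up with the largest normalized weight, which makes it more likely to be selected for the next training subset.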

Even though it is desirable to limit the loss of information during under-sampling of data, some samples (like borderline samples) are of more interest than others for training classifiers in the ensemble. In Boosting ensembles, these samples are often detected as misclassified samples, because borderline samples play a more important role in defining the decision boundary and are more likely to be misclassified. More importance is given to these samples by assigning higher weights to them, so that they have a higher chance of being included in the training subset(s). In the proposed PBoost ensemble, after normalization of $W^{tmp}_{e+1}$, its maximum value among negative samples is selected as the initial weight for the next iteration (line 5.xii):

$$w^{ini}_{e+1} = \max_{y_j = -1} \big\{ W^{tmp}_{e+1}(j) \big\}, \quad j = 1, \dots, M^+ + \textstyle\sum_{f=1}^{e} N_f. \tag{34}$$

This value corresponds to the weight of the more important misclassified negative samples. Therefore, in each iteration, new samples and misclassified samples from previous iterations have a higher chance of being included in the training subset. Finally, $\alpha_e$ is used to obtain the final class prediction of the ensemble from (4) (line 6).

PBoost is somewhat inspired by RUSBoost, but differs in three main respects. First, during each iteration, instead of random under-sampling with replacement, most of the training negative samples are selected from disjoint partitions. Consequently, repeated selection of the same samples over all iterations and information loss are avoided, while diversity increases. Second, instead of validating the classifiers on all samples, the classifiers are validated only on a subset of the training set that grows in size and imbalance over iterations. Therefore, robustness to different levels of data imbalance and complexity increases, and the computational complexity of the validation step decreases significantly. Third, instead of weighted accuracy, the loss factor relies on the F-measure, an imbalance-compatible performance metric that avoids biasing performance towards the negative class.

4. Experimental Methodology

In our experiments, the proposed PBoost ensemble learning method is assessed and compared with AdaBoost.M1 (Freund et al., 1996), and one state-of-the-art method from each family of the data-level approaches reviewed in Section 2, including SMOTEBoost (Chawla et al., 2003), RUSBoost (Seiffert et al., 2010), and RB-Boost (Díez-Pastor et al., 2015). The datasets used for the experiments include: (1) a set of synthetic 2D data sets in which the levels of skew and overlap between classes are controllable, (2) the Face In Action (FIA) video database (Goh et al., 2005), which emulates a passport checking scenario in a face re-identification application, and (3) the COX Face dataset for face recognition in video surveillance applications (Huang et al., 2015).

4.1. Datasets

4.1.1. Synthetic Dataset

The performance of classification systems may vary with the levels of overlap and skew between classes in both training and test data. Therefore, in our experiments on synthetic data, synthetic datasets with different overlap and skew levels are generated and used to compare classification systems. The data is generated to emulate both the binarization of a multi-class classification problem with a one-versus-all strategy, and binary classification problems where there is no prior knowledge of optimal partitions. The samples of both positive and negative classes are generated from a mixture of Gaussian distributions. The samples from one normal distribution are considered as the positive class and all other samples are considered as the negative class. To generate the 2D synthetic data, $M^+ =$ positive class samples are generated with a normal distribution $N(m^+, \sigma^+)$, where $m^+ = (,)$ and $\sigma^+ = [\ ]$ indicate its mean and covariance matrix, respectively. Then, $T =$ points are generated randomly from a uniform distribution around $m^+$. These points ($m^-_j,\ j = 1, \dots, T$) are used as the means of $T$ normal distributions ($N(m^-_j, \sigma^-),\ j = 1, \dots, T$) for the negative class, where $\sigma^- = \sigma^+$. Each normal distribution contains $M^+ =$ samples and is considered as an ideal cluster of the negative class (used for PCUS$_i$ and to emulate face captures along trajectories in TUS).
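The generation procedure for the 2D synthetic data can be sketched in Python as follows. This is only an illustration: the function name `make_synthetic` is hypothetical, every numeric default (`std`, `half_width`, the origin used as positive mean) is a placeholder of this sketch rather than a setting of the paper, an isotropic covariance is assumed, and the optional `margin` argument enforces a minimum distance between the negative cluster means and the positive mean by simple rejection sampling.

```python
import math
import random

def make_synthetic(n_per_cluster, n_clusters, margin,
                   std=0.05, half_width=1.0, seed=0):
    """Generate one positive Gaussian cloud and n_clusters negative clouds in 2D.

    Placeholder assumptions: isotropic Gaussians with standard deviation `std`,
    positive mean at the origin, cluster means uniform in a square of half-width
    `half_width` and rejected when closer than `margin` to the positive mean.
    """
    rng = random.Random(seed)
    centre = (0.0, 0.0)

    def cloud(mean):
        return [(rng.gauss(mean[0], std), rng.gauss(mean[1], std))
                for _ in range(n_per_cluster)]

    positives = cloud(centre)
    means = []
    while len(means) < n_clusters:
        c = (centre[0] + rng.uniform(-half_width, half_width),
             centre[1] + rng.uniform(-half_width, half_width))
        if math.hypot(c[0] - centre[0], c[1] - centre[1]) >= margin:
            means.append(c)
    # Each negative cluster has as many samples as the positive class,
    # so using T clusters gives a skew of 1:T.
    negatives = [cloud(c) for c in means]
    return positives, means, negatives

if __name__ == "__main__":
    pos, means, negs = make_synthetic(n_per_cluster=10, n_clusters=5, margin=0.2)
    print(len(pos), len(negs), sum(len(c) for c in negs))
```

Keeping each negative cluster as a separate list preserves the "ideal" partition, which is what a cluster-based or trajectory-based under-sampling scheme would consume.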
The means of these clusters (the $m^-_j$) keep a margin distance $\delta$ from $m^+$. This margin is used to control the level of overlap between positive and negative classes. For the experiments, we selected the parameter $\delta$ as . (maximum overlap) and .2 (medium overlap). For each overlap level, each normal distribution is randomly divided into two subsets for design and testing. Then the design subsets are divided into 5 folds, considering one fold for validation and 4 folds for training. Five replications are carried out by alternating the validation fold in each iteration, and they are repeated after reversing the roles of the design and testing subsets, for a total of 10 replications.

Two skew levels and two overlap levels of training data have been considered for the experiments, which have been combined into the three settings shown in Table 1. $\lambda_{train} = 1 : M^-/M^+$ is set to :5 in two settings and to :2 in the other. When $\lambda_{train} =$ :5, only 5 clusters from the negative class are used for training. The objective is to compare different classification algorithms when they are designed on different levels of imbalance. Properties of the training data generated with these settings are summarized in Table 1 and examples are presented in Figure 3. In a similar way, four imbalance levels ($\lambda_{test} = \{$ :, :2, :5, : $\}$) are considered for testing, to evaluate the robustness of the classification algorithms over varying skew levels of data during operation. Examples of synthetic test data corresponding to setting $D_1$ are presented in Figure 4.

Table 1: Settings used for data generation.
Setting        D_1    D_2    D_3
λ_train        :5     :5     :2
δ

Algorithm 2: Progressive Boosting ensemble learning method.
Input: Training set $S = \{(x_i, y_i);\ i = 1, \dots, M\}$, $y_i \in \{-1, 1\}$, $M = M^- + M^+$
Output: Predicted score or label: $H(\cdot)$
1 Partition non-target samples from $S$ into $E$ clusters $\{P_e;\ e = 1, \dots, E\}$.
2 Create a temporary training set and weight vector: $S^{tmp}_1 \leftarrow \{(x_i, y_i) \in S \mid y_i = 1\}$ and $W^{tmp}_1(k) = 1,\ k = 1, \dots, M^+$.
3 Initialize $w^{ini}_1 = 1$.
4 Set $l_b = \dfrac{M^-}{(1+\beta^2)M^+ + M^-}$.
5 for $e = 1, \dots, E$ do
  i Initialize the weight distribution of $P_e$ as $W^{p}_e(k) = w^{ini}_e,\ k = 1, \dots, N_e$. // $N_e$ is the size of $P_e$.
  ii $S^{tmp}_e \leftarrow S^{tmp}_e \cup P_e$, $W^{tmp}_e \leftarrow W^{tmp}_e \cup W^{p}_e$.
  iii Normalize $W^{tmp}_e$ such that $\sum W^{tmp}_e = 1$.
  iv Randomly select $N_e$ non-target samples from $S^{tmp}_e$ based on $W^{tmp,-}_e$ to create a training subset $S_e$ with $W_e$.
  v Train $C_e$ on $S_e$ with $W_e$.
  vi Test $C_e$ on $S^{tmp}_e$ and get back labels $Y_j,\ j = 1, \dots, (M^+ + \sum_{f=1}^{e} N_f)$.
  vii Calculate the pseudo-loss for $S^{tmp}_e$ from $W^{tmp}_e$ (using Equations 25 to 31):
      $W^{tmp,+}_e = \{W^{tmp}_e(j),\ j = 1, \dots, (M^+ + \sum_{f=1}^{e} N_f) \mid y_j = 1\}$,
      $W^{tmp,-}_e = \{W^{tmp}_e(j),\ j = 1, \dots, (M^+ + \sum_{f=1}^{e} N_f) \mid y_j = -1\}$,
      $TP_e = \sum_{(k, y_k):\,Y_k = 1} W^{tmp,+}_e(k),\ k = 1, \dots, M^+$,
      $FP_e = \sum_{(k, y_k):\,Y_k = 1} W^{tmp,-}_e(k),\ k = 1, \dots, \sum_{f=1}^{e} N_f$,
      $TN_e = \sum_{(k, y_k):\,Y_k = -1} W^{tmp,-}_e(k),\ k = 1, \dots, \sum_{f=1}^{e} N_f$,
      $FN_e = \sum_{(k, y_k):\,Y_k = -1} W^{tmp,+}_e(k),\ k = 1, \dots, M^+$,
      $L_e = 1 - F_\beta = \dfrac{FP_e + \beta^2 FN_e}{(1+\beta^2)TP_e + FP_e + \beta^2 FN_e}$.
  viii If $L_e > l_b$ go to step iv.
  ix Calculate the weight update parameter: $\alpha_e = \dfrac{L_e}{1 - L_e}$.
  x Update $W^{tmp}_{e+1}(j) = W^{tmp}_e(j)\,\alpha_e^{\,1 - |y_j - Y_j|/2}$.
  xi Normalize $W^{tmp}_{e+1}$ such that $\sum W^{tmp}_{e+1} = 1$.
  xii Set $w^{ini}_{e+1} = \max\big(W^{tmp}_{e+1}(j)\big),\ y_j = -1$.
6 Output the final hypothesis: $H(\cdot) = \sum_{e=1}^{E} h_e(\cdot) \log \dfrac{1}{\alpha_e}$ // $h_e(\cdot)$ is the output of $C_e$.

4.1.2. Face Re-Identification Dataset

To compare the performance of the proposed PBoost algorithm to state-of-the-art classification systems, two datasets for video-based face recognition were considered. The FIA video database (Goh et al., 2005) contains video sequences that emulate a passport checking scenario. The video streams are collected from 22 participants under different capture conditions such as pose, illumination and expression in both indoor and outdoor environments. Videos were collected over three sessions, where the second and third sessions are three months later than the previous one. The participants are present before 6 cameras for about 5 seconds, resulting in a total of 8 video sequences per person. For the experiments in this paper using the FIA dataset, only the faces captured with the frontal camera in the indoor environment are used for both design and testing. ROIs are extracted from this video using the Viola-Jones algorithm (Viola & Jones, 2001), converted to grayscale and rescaled to 7 7 pixels. Some examples of ROIs from this data set are presented in Figure 5.

The COX Face dataset for face recognition in video surveillance (Huang et al., 2015) contains videos from participants captured with 4 cameras under different capture conditions. The faces are tracked and resized such that, for each frame with a face detected, an image patch centered at the head of the subject is cropped out with a size of

For experiments with video data, multi-resolution gray-scale and rotation invariant Local Binary Patterns (LBP) (Ojala et al., 2002) histograms have been extracted as features. The local image texture for LBP has been characterized with 8 neighbours on a radius circle centered on each pixel. Finally, a feature vector with a length of 59 has been obtained for each ROI. individuals are randomly selected as targets and 9 individuals are randomly selected as non-targets. In each round of experiment, the face patterns of one target individual (a trajectory) are considered as the positive class and individuals (including 9 other target individuals and 9 non-target individuals) are

selected as the negative class. The ROIs of each individual in the FIA and COX datasets are already grouped. ROI patterns for each trajectory are divided into 2 sets for design and testing. The design set is divided into 5 folds, and for each round one fold is considered for validation and the remaining 4 folds are considered for training. Then the roles of the design and testing sets are reversed.

Figure 2: Block diagram representation of the PBoost learning method.

Therefore, for each target individual, three independent sets are collected from these face patterns for training, validation and testing. Each set contains one group of samples from the target individual, 9 groups of samples from the remaining target

individuals and 9 groups of samples from non-target individuals. Repeating this process for each target individual yields = overall experiments for this dataset.

Figure 3: Examples of synthetic training data generated under different settings: (a) $D_1$, (b) $D_2$ and (c) $D_3$. Each panel shows the negative samples, the cluster centres and the positive samples, with the overlap controlled by $\delta$.
Figure 4: Examples of synthetic test data generated with $\delta =$ .2 and different skew levels $\lambda_{test}$ (panels a to c).
Figure 5: Examples of 2D mapping of LBP feature vectors belonging to 8 individuals using Sammon mapping (Sammon, 1969) on the left, and examples of 7 7 pixels ROIs along a trajectory captured with camera 3 during session one for ID, with their frame numbers, on the right.

Two imbalance levels ($\lambda_{train} =$ :5 and ) and four different imbalance levels $\lambda_{test} = \{$ :, :2, :5, : $\}$ are considered for selecting the training and testing negative class for each positive individual, respectively. This is to evaluate the performance of different classification algorithms when they are trained on different imbalance levels, and to evaluate the robustness of the classification algorithms over varying skew levels during operations. When $\lambda_{train} =$ :5, for each positive individual, only $T = 5$ of the other individuals are used as the negative class from the training set that was collected for that positive individual. Therefore, when $\lambda_{test} =$ :, there are 5 negative individuals in the testing set that were not included in training the classification systems, and the skew level of the test data is higher than the skew level of the training data. When $\lambda_{test} <$ :5, most of the negative individuals that were used for training do not appear in the testing data. When $\lambda_{train} = \lambda_{test} =$, the maximum imbalance level of the testing data is the same as the imbalance level of the training data. Therefore, all individuals are seen in both training and testing. However, in this case a high level of imbalance exists in both the training and testing stages, which makes both learning and classification more difficult.
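The way a skew level is imposed by choosing how many other individuals form the negative class can be sketched as below. This is an illustrative assumption rather than the authors' protocol code: the function name `select_negative_groups` and the dictionary layout are hypothetical, and the skew only equals 1:T exactly when every individual contributes as many samples as the target.

```python
import random

def select_negative_groups(groups, target_id, n_groups, seed=0):
    """Build a positive/negative split with roughly 1:n_groups skew.

    `groups` maps an individual's id to the list of its samples (e.g. one
    trajectory). The target's samples form the positive class; `n_groups`
    other individuals, chosen at random without replacement, form the
    negative class.
    """
    rng = random.Random(seed)
    candidates = sorted(g for g in groups if g != target_id)
    chosen = rng.sample(candidates, n_groups)
    positives = list(groups[target_id])
    negatives = [sample for g in chosen for sample in groups[g]]
    return positives, negatives, chosen

if __name__ == "__main__":
    # Toy data: 10 individuals with 4 ROIs each (placeholder values).
    groups = {i: ["id%d_roi%d" % (i, j) for j in range(4)] for i in range(10)}
    pos, neg, chosen = select_negative_groups(groups, target_id=0, n_groups=5)
    print(len(pos), len(neg), chosen)
```

Calling the function with different `n_groups` values for training and testing reproduces the situation discussed above, where some negative individuals seen in testing were never used in training.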
It is worth mentioning that in all settings, the skew level of the validation data is selected to be the same as that of the testing data.

4.2. Experimental Protocol

For validation, synthetic and video datasets are used to evaluate the algorithm when the ideal partitions (or clusters) of the negative class are known a priori. These data sets are also used for a binary classification problem where no information is available regarding the ideal clusters of data. We use an SVM with an RBF kernel (Chang & Lin, 2011) as the base classifier, where $K(x, x') = \exp\{-\|x - x'\|^2 / 2\kappa^2\}$. The kernel parameter $\kappa$ is set as the average of the mean minimum distance between any two training samples and the scatter radius of the training samples in the input space (Li et al., 2008). The scatter radius is calculated by selecting the maximum distance between the training samples and a point corresponding to the mean of the training samples. We used the LibSVM implementation of (Chang & Lin, 2011).

A brief description of the implemented ensembles, their variants and the abbreviations used for them is given in Table 2. The last column of the table shows the datasets that are used for experiments on these classification systems. The abbreviations assigned to these ensembles are selected based on their sampling techniques and loss factor. The baseline sampling techniques are resampling (in AdaBoost), SMOTE (in SMOTEBoost), random under-sampling (in RUSBoost) and random balance (in RB-Boost). For PBoost, four partitioning techniques are used for under-sampling the negative class, to evaluate the effect of the partitioning technique on the performance of the PBoost ensemble. Random under-sampling without replacement (PRUS) and cluster under-sampling (PCUS) are used as general partitioning techniques for PBoost that disregard the data structure, whether or not the negative class is partitioned a priori. For PCUS, kernel k-means is used for clustering negative samples. To select k, it is varied over a range of possible values and the value of the Dunn index (Dunn, 1973) is calculated for each case using a validation set. Finally, the optimal k is selected as the one for which the Dunn index takes its maximum value. Two cases are considered for PBoost in which the partitions of the negative class are known a priori: ideal cluster under-sampling (PCUS$_i$) with the synthetic datasets and trajectory under-sampling (PTUS) with the video datasets. The loss factor is calculated in two ways: with the traditional technique, i.e. weighted accuracy, and with the F-measure.
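The rule for setting the RBF kernel parameter $\kappa$ described above can be sketched in plain Python. This is one possible reading, not the authors' code: the "mean minimum distance between any two training samples" is interpreted here as the mean, over samples, of the distance to the nearest other sample, and the names `rbf_kappa` and `rbf_kernel` are hypothetical.

```python
import math

def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def rbf_kappa(samples):
    """Average of the mean nearest-neighbour distance and the scatter radius."""
    n, dim = len(samples), len(samples[0])
    nearest = [min(_dist(a, b) for j, b in enumerate(samples) if j != i)
               for i, a in enumerate(samples)]
    mean_min = sum(nearest) / n
    centroid = tuple(sum(s[d] for s in samples) / n for d in range(dim))
    # Scatter radius: largest distance from any sample to the sample mean.
    radius = max(_dist(s, centroid) for s in samples)
    return 0.5 * (mean_min + radius)

def rbf_kernel(x, z, kappa):
    """K(x, z) = exp(-||x - z||^2 / (2 * kappa^2))."""
    sq = sum((u - v) ** 2 for u, v in zip(x, z))
    return math.exp(-sq / (2.0 * kappa ** 2))

if __name__ == "__main__":
    pts = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    k = rbf_kappa(pts)
    print(k, rbf_kernel(pts[0], pts[1], k))
```

When passing this value to an SVM library that parameterizes the RBF kernel as $\exp(-\gamma \|x - x'\|^2)$, the equivalent setting is $\gamma = 1 / (2\kappa^2)$.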
To indicate the use of the F-measure in the Boosting ensembles in Table 2, the abbreviation is followed by -F. For the proposed loss factor calculation with the F-measure, $\beta$ is set to 2 in all experiments, because this value is more suitable for imbalanced data classification when the positive class is the minority class. An experiment is done to evaluate the performance of Boosting ensembles with different values of $\beta$.

In addition to the mentioned Boosting ensembles, the ensembles proposed in (Soleymani et al., 2016a) for face re-identification are also included in the comparison. Rows 7 and 8 in Table 2 summarize the properties of these ensembles. In this technique, the negative class samples are regrouped into subsets with growing imbalance levels, using CUS$_i$ for synthetic data and TUS for video data. In contrast to Boosting ensembles, this method does not involve the use of any loss factor in the learning process. However, the contribution of base classifiers to the final prediction depends on their performance in terms of F-measure.

Figure 6: Performance of baseline Boosting ensembles for different values of $E$ on $D_2$ with $\lambda_{test} =$ : . Panels: (a) AUPR, (b) $F_2$-measure.

In the experiments with synthetic and video data sets, two different imbalance levels are used for training and four different imbalance levels are used for testing. This is to evaluate the sensitivity of classification systems to the level of imbalance during training and their robustness to possible variations in skew level during operations. In experiments with synthetic data, the overlap level between positive and negative classes is also varied, because the issue of imbalance is related to the level of overlap between classes (López et al., 2013).

In experiments with synthetic and video datasets, the size of all Boosting ensembles is set equal to the maximum imbalance level of the data, except for PCUS. The reason for this setting is that the number of ideal clusters and the number of trajectories are both known and equal to the level of skew. In addition, based on a preliminary experiment under setting $D_2$ (see Table 1) on baseline ensembles in Figure 6, it is observed that the size of these ensembles does not have a significant impact on their performance.
The performance of these ensembles varies in terms of $F_2$-measure as the ensemble size grows. However, their global performance in terms of AUPR does not change significantly. For PCUS, the size of the ensemble is set equal to the optimal k obtained using the Dunn index.

4.3. Performance Evaluation

Global performance evaluation curves, such as ROC and precision-recall, show the trade-off between two metrics for different operational settings. For classifiers that output scores or probability estimates, this setting is usually the choice of decision threshold. The area under the curve shows the global performance of the classifier over a range of possible decision thresholds, whereas local evaluation metrics such as the F-measure show the performance for a specific decision threshold. Therefore, when different classifiers are compared in terms of local metrics, the choice of the decision threshold becomes important. The decision threshold may be set to a fixed optimal value with or without considering the operating conditions, i.e. the cost proportions or skew levels (Hernández-Orallo et al., 2012). The performance metrics that can be maximized to set the decision threshold are accuracy, Brier score, AUC, expected cost, G-mean and F-measure (Hernández-Orallo et al., 2012; Lipton et al., 2014).
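The difference between a global metric (area under the precision-recall curve) and a local one (F-measure at a chosen threshold) can be made concrete with a small Python sketch. It is an illustration under simple assumptions, not the evaluation code of the paper: the function names are hypothetical, the area is approximated with a step-wise sum over distinct thresholds, and the threshold is chosen by maximizing $F_\beta$ on the given scores.

```python
def pr_points(scores, labels):
    """(threshold, precision, recall) for each distinct score, highest first.

    A sample is predicted positive when its score is >= the threshold;
    labels equal to 1 denote the positive class.
    """
    n_pos = sum(1 for y in labels if y == 1)
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y != 1)
        points.append((thr, tp / (tp + fp), tp / n_pos))
    return points

def aupr(points):
    """Step-wise approximation of the area under the precision-recall curve."""
    area, prev_recall = 0.0, 0.0
    for _, precision, recall in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area

def best_threshold(points, beta=2.0):
    """Threshold that maximizes the F_beta measure among the candidate points."""
    b2 = beta * beta

    def f_beta(p, r):
        return (1 + b2) * p * r / (b2 * p + r) if (p + r) > 0 else 0.0

    return max(points, key=lambda t: f_beta(t[1], t[2]))[0]

if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6, 0.2]
    labels = [1, -1, 1, -1, -1]
    pts = pr_points(scores, labels)
    print(aupr(pts), best_threshold(pts))
```

With $\beta = 2$ the selected threshold favours recall, so it tends to be lower than the threshold that would maximize precision alone.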


More information

Brief Introduction to Statistical Mechanics

Brief Introduction to Statistical Mechanics Brif Introduction to Statistical Mchanics. Purpos: Ths nots ar intndd to provid a vry quick introduction to Statistical Mchanics. Th fild is of cours far mor vast than could b containd in ths fw pags.

More information

u x v x dx u x v x v x u x dx d u x v x u x v x dx u x v x dx Integration by Parts Formula

u x v x dx u x v x v x u x dx d u x v x u x v x dx u x v x dx Integration by Parts Formula 7. Intgration by Parts Each drivativ formula givs ris to a corrsponding intgral formula, as w v sn many tims. Th drivativ product rul yilds a vry usful intgration tchniqu calld intgration by parts. Starting

More information

Application of Vague Soft Sets in students evaluation

Application of Vague Soft Sets in students evaluation Availabl onlin at www.plagiarsarchlibrary.com Advancs in Applid Scinc Rsarch, 0, (6):48-43 ISSN: 0976-860 CODEN (USA): AASRFC Application of Vagu Soft Sts in studnts valuation B. Chtia*and P. K. Das Dpartmnt

More information

4. Money cannot be neutral in the short-run the neutrality of money is exclusively a medium run phenomenon.

4. Money cannot be neutral in the short-run the neutrality of money is exclusively a medium run phenomenon. PART I TRUE/FALSE/UNCERTAIN (5 points ach) 1. Lik xpansionary montary policy, xpansionary fiscal policy rturns output in th mdium run to its natural lvl, and incrass prics. Thrfor, fiscal policy is also

More information

Abstract Interpretation. Lecture 5. Profs. Aiken, Barrett & Dill CS 357 Lecture 5 1

Abstract Interpretation. Lecture 5. Profs. Aiken, Barrett & Dill CS 357 Lecture 5 1 Abstract Intrprtation 1 History On brakthrough papr Cousot & Cousot 77 (?) Inspird by Dataflow analysis Dnotational smantics Enthusiastically mbracd by th community At last th functional community... At

More information

Association (Part II)

Association (Part II) Association (Part II) nanopoulos@ismll.d Outlin Improving Apriori (FP Growth, ECLAT) Qustioning confidnc masur Qustioning support masur 2 1 FP growth Algorithm Us a comprssd rprsntation of th dtb databas

More information

GEOMETRICAL PHENOMENA IN THE PHYSICS OF SUBATOMIC PARTICLES. Eduard N. Klenov* Rostov-on-Don, Russia

GEOMETRICAL PHENOMENA IN THE PHYSICS OF SUBATOMIC PARTICLES. Eduard N. Klenov* Rostov-on-Don, Russia GEOMETRICAL PHENOMENA IN THE PHYSICS OF SUBATOMIC PARTICLES Eduard N. Klnov* Rostov-on-Don, Russia Th articl considrs phnomnal gomtry figurs bing th carrirs of valu spctra for th pairs of th rmaining additiv

More information

ECE602 Exam 1 April 5, You must show ALL of your work for full credit.

ECE602 Exam 1 April 5, You must show ALL of your work for full credit. ECE62 Exam April 5, 27 Nam: Solution Scor: / This xam is closd-book. You must show ALL of your work for full crdit. Plas rad th qustions carfully. Plas chck your answrs carfully. Calculators may NOT b

More information

Middle East Technical University Department of Mechanical Engineering ME 413 Introduction to Finite Element Analysis

Middle East Technical University Department of Mechanical Engineering ME 413 Introduction to Finite Element Analysis Middl East Tchnical Univrsity Dpartmnt of Mchanical Enginring ME 43 Introduction to Finit Elmnt Analysis Chaptr 3 Computr Implmntation of D FEM Ths nots ar prpard by Dr. Cünyt Srt http://www.m.mtu.du.tr/popl/cunyt

More information

Symmetric centrosymmetric matrix vector multiplication

Symmetric centrosymmetric matrix vector multiplication Linar Algbra and its Applications 320 (2000) 193 198 www.lsvir.com/locat/laa Symmtric cntrosymmtric matrix vctor multiplication A. Mlman 1 Dpartmnt of Mathmatics, Univrsity of San Francisco, San Francisco,

More information

EFFECT OF BALL PROPERTIES ON THE BALL-BAT COEFFICIENT OF RESTITUTION

EFFECT OF BALL PROPERTIES ON THE BALL-BAT COEFFICIENT OF RESTITUTION EFFECT OF BALL PROPERTIES ON THE BALL-BAT COEFFICIENT OF RESTITUTION A. M. NATHAN 1 AND L. V. SMITH 2 1 Univrsity of Illinois, 1110 W. Grn Strt, Urbana, IL 61801, USA, E-mail: a-nathan@illinois.du 2 Washington

More information

A Prey-Predator Model with an Alternative Food for the Predator, Harvesting of Both the Species and with A Gestation Period for Interaction

A Prey-Predator Model with an Alternative Food for the Predator, Harvesting of Both the Species and with A Gestation Period for Interaction Int. J. Opn Problms Compt. Math., Vol., o., Jun 008 A Pry-Prdator Modl with an Altrnativ Food for th Prdator, Harvsting of Both th Spcis and with A Gstation Priod for Intraction K. L. arayan and. CH. P.

More information

The Importance of Action History in Decision Making and Reinforcement Learning

The Importance of Action History in Decision Making and Reinforcement Learning Th Importanc of Action History in Dcision Making and Rinforcmnt Larning Yongjia Wang (yongjiaw@umich.du Univrsity of Michigan, 2260 Hayward Strt Ann Arbor, MI 48109-2121 John E. Laird (laird@umich.du Univrsity

More information

Review Statistics review 14: Logistic regression Viv Bewick 1, Liz Cheek 1 and Jonathan Ball 2

Review Statistics review 14: Logistic regression Viv Bewick 1, Liz Cheek 1 and Jonathan Ball 2 Critical Car Fbruary 2005 Vol 9 No 1 Bwick t al. Rviw Statistics rviw 14: Logistic rgrssion Viv Bwick 1, Liz Chk 1 and Jonathan Ball 2 1 Snior Lcturr, School of Computing, Mathmatical and Information Scincs,

More information

The pn junction: 2 Current vs Voltage (IV) characteristics

The pn junction: 2 Current vs Voltage (IV) characteristics Th pn junction: Currnt vs Voltag (V) charactristics Considr a pn junction in quilibrium with no applid xtrnal voltag: o th V E F E F V p-typ Dpltion rgion n-typ Elctron movmnt across th junction: 1. n

More information

Fourier Transforms and the Wave Equation. Key Mathematics: More Fourier transform theory, especially as applied to solving the wave equation.

Fourier Transforms and the Wave Equation. Key Mathematics: More Fourier transform theory, especially as applied to solving the wave equation. Lur 7 Fourir Transforms and th Wav Euation Ovrviw and Motivation: W first discuss a fw faturs of th Fourir transform (FT), and thn w solv th initial-valu problm for th wav uation using th Fourir transform

More information

The van der Waals interaction 1 D. E. Soper 2 University of Oregon 20 April 2012

The van der Waals interaction 1 D. E. Soper 2 University of Oregon 20 April 2012 Th van dr Waals intraction D. E. Sopr 2 Univrsity of Orgon 20 pril 202 Th van dr Waals intraction is discussd in Chaptr 5 of J. J. Sakurai, Modrn Quantum Mchanics. Hr I tak a look at it in a littl mor

More information

Elements of Statistical Thermodynamics

Elements of Statistical Thermodynamics 24 Elmnts of Statistical Thrmodynamics Statistical thrmodynamics is a branch of knowldg that has its own postulats and tchniqus. W do not attmpt to giv hr vn an introduction to th fild. In this chaptr,

More information

What are those βs anyway? Understanding Design Matrix & Odds ratios

What are those βs anyway? Understanding Design Matrix & Odds ratios Ral paramtr stimat WILD 750 - Wildlif Population Analysis of 6 What ar thos βs anyway? Undrsting Dsign Matrix & Odds ratios Rfrncs Hosmr D.W.. Lmshow. 000. Applid logistic rgrssion. John Wily & ons Inc.

More information

EEO 401 Digital Signal Processing Prof. Mark Fowler

EEO 401 Digital Signal Processing Prof. Mark Fowler EEO 401 Digital Signal Procssing Prof. Mark Fowlr Dtails of th ot St #19 Rading Assignmnt: Sct. 7.1.2, 7.1.3, & 7.2 of Proakis & Manolakis Dfinition of th So Givn signal data points x[n] for n = 0,, -1

More information

A Recognition and Verification Strategy for Handwritten Word Recognition

A Recognition and Verification Strategy for Handwritten Word Recognition A Rcognition and Vrification Stratgy for Handwrittn Word Rcognition M. Morita 1,2, R. Sabourin 1 3, F. Bortolozzi 3 and C. Y. Sun 2 1 Écol d Tchnologi Supériur, Montral, Canada 2 Cntr for Pattrn Rcognition

More information

Answer Homework 5 PHA5127 Fall 1999 Jeff Stark

Answer Homework 5 PHA5127 Fall 1999 Jeff Stark Answr omwork 5 PA527 Fall 999 Jff Stark A patint is bing tratd with Drug X in a clinical stting. Upon admiion, an IV bolus dos of 000mg was givn which yildd an initial concntration of 5.56 µg/ml. A fw

More information

Procdings of IC-IDC0 ( and (, ( ( and (, and (f ( and (, rspctivly. If two input signals ar compltly qual, phas spctra of two signals ar qual. That is

Procdings of IC-IDC0 ( and (, ( ( and (, and (f ( and (, rspctivly. If two input signals ar compltly qual, phas spctra of two signals ar qual. That is Procdings of IC-IDC0 EFFECTS OF STOCHASTIC PHASE SPECTRUM DIFFERECES O PHASE-OLY CORRELATIO FUCTIOS PART I: STATISTICALLY COSTAT PHASE SPECTRUM DIFFERECES FOR FREQUECY IDICES Shunsu Yamai, Jun Odagiri,

More information

Dealing with quantitative data and problem solving life is a story problem! Attacking Quantitative Problems

Dealing with quantitative data and problem solving life is a story problem! Attacking Quantitative Problems Daling with quantitati data and problm soling lif is a story problm! A larg portion of scinc inols quantitati data that has both alu and units. Units can sa your butt! Nd handl on mtric prfixs Dimnsional

More information

2.3 Matrix Formulation

2.3 Matrix Formulation 23 Matrix Formulation 43 A mor complicatd xampl ariss for a nonlinar systm of diffrntial quations Considr th following xampl Exampl 23 x y + x( x 2 y 2 y x + y( x 2 y 2 (233 Transforming to polar coordinats,

More information

Intro to Nuclear and Particle Physics (5110)

Intro to Nuclear and Particle Physics (5110) Intro to Nuclar and Particl Physics (5110) March 09, 009 Frmi s Thory of Bta Dcay (continud) Parity Violation, Nutrino Mass 3/9/009 1 Final Stat Phas Spac (Rviw) Th Final Stat lctron and nutrino wav functions

More information

Where k is either given or determined from the data and c is an arbitrary constant.

Where k is either given or determined from the data and c is an arbitrary constant. Exponntial growth and dcay applications W wish to solv an quation that has a drivativ. dy ky k > dx This quation says that th rat of chang of th function is proportional to th function. Th solution is

More information

Linear Non-Gaussian Structural Equation Models

Linear Non-Gaussian Structural Equation Models IMPS 8, Durham, NH Linar Non-Gaussian Structural Equation Modls Shohi Shimizu, Patrik Hoyr and Aapo Hyvarinn Osaka Univrsity, Japan Univrsity of Hlsinki, Finland Abstract Linar Structural Equation Modling

More information

That is, we start with a general matrix: And end with a simpler matrix:

That is, we start with a general matrix: And end with a simpler matrix: DIAGON ALIZATION OF THE STR ESS TEN SOR INTRO DUCTIO N By th us of Cauchy s thorm w ar abl to rduc th numbr of strss componnts in th strss tnsor to only nin valus. An additional simplification of th strss

More information

Lecture 37 (Schrödinger Equation) Physics Spring 2018 Douglas Fields

Lecture 37 (Schrödinger Equation) Physics Spring 2018 Douglas Fields Lctur 37 (Schrödingr Equation) Physics 6-01 Spring 018 Douglas Filds Rducd Mass OK, so th Bohr modl of th atom givs nrgy lvls: E n 1 k m n 4 But, this has on problm it was dvlopd assuming th acclration

More information

Machine Detector Interface Workshop: ILC-SLAC, January 6-8, 2005.

Machine Detector Interface Workshop: ILC-SLAC, January 6-8, 2005. Intrnational Linar Collidr Machin Dtctor Intrfac Workshop: ILCSLAC, January 68, 2005. Prsntd by Brtt Parkr, BNLSMD Mssag: Tools ar now availabl to optimiz IR layout with compact suprconducting quadrupols

More information

General Notes About 2007 AP Physics Scoring Guidelines

General Notes About 2007 AP Physics Scoring Guidelines AP PHYSICS C: ELECTRICITY AND MAGNETISM 2007 SCORING GUIDELINES Gnral Nots About 2007 AP Physics Scoring Guidlins 1. Th solutions contain th most common mthod of solving th fr-rspons qustions and th allocation

More information

Coupled Pendulums. Two normal modes.

Coupled Pendulums. Two normal modes. Tim Dpndnt Two Stat Problm Coupld Pndulums Wak spring Two normal mods. No friction. No air rsistanc. Prfct Spring Start Swinging Som tim latr - swings with full amplitud. stationary M +n L M +m Elctron

More information

Chapter 13 Aggregate Supply

Chapter 13 Aggregate Supply Chaptr 13 Aggrgat Supply 0 1 Larning Objctivs thr modls of aggrgat supply in which output dpnds positivly on th pric lvl in th short run th short-run tradoff btwn inflation and unmploymnt known as th Phillips

More information

ANALYSIS IN THE FREQUENCY DOMAIN

ANALYSIS IN THE FREQUENCY DOMAIN ANALYSIS IN THE FREQUENCY DOMAIN SPECTRAL DENSITY Dfinition Th spctral dnsit of a S.S.P. t also calld th spctrum of t is dfind as: + { γ }. jτ γ τ F τ τ In othr words, of th covarianc function. is dfind

More information

Title: Vibrational structure of electronic transition

Title: Vibrational structure of electronic transition Titl: Vibrational structur of lctronic transition Pag- Th band spctrum sn in th Ultra-Violt (UV) and visibl (VIS) rgions of th lctromagntic spctrum can not intrprtd as vibrational and rotational spctrum

More information

Discrete Hilbert Transform. Numeric Algorithms

Discrete Hilbert Transform. Numeric Algorithms Volum 49, umbr 4, 8 485 Discrt Hilbrt Transform. umric Algorithms Ghorgh TODORA, Rodica HOLOEC and Ciprian IAKAB Abstract - Th Hilbrt and Fourir transforms ar tools usd for signal analysis in th tim/frquncy

More information

Status of LAr TPC R&D (2) 2014/Dec./23 Neutrino frontier workshop 2014 Ryosuke Sasaki (Iwate U.)

Status of LAr TPC R&D (2) 2014/Dec./23 Neutrino frontier workshop 2014 Ryosuke Sasaki (Iwate U.) Status of LAr TPC R&D (2) 214/Dc./23 Nutrino frontir workshop 214 Ryosuk Sasaki (Iwat U.) Tabl of Contnts Dvlopmnt of gnrating lctric fild in LAr TPC Introduction - Gnrating strong lctric fild is on of

More information

Problem Set 6 Solutions

Problem Set 6 Solutions 6.04/18.06J Mathmatics for Computr Scinc March 15, 005 Srini Dvadas and Eric Lhman Problm St 6 Solutions Du: Monday, March 8 at 9 PM in Room 3-044 Problm 1. Sammy th Shark is a financial srvic providr

More information

Determination of Vibrational and Electronic Parameters From an Electronic Spectrum of I 2 and a Birge-Sponer Plot

Determination of Vibrational and Electronic Parameters From an Electronic Spectrum of I 2 and a Birge-Sponer Plot 5 J. Phys. Chm G Dtrmination of Vibrational and Elctronic Paramtrs From an Elctronic Spctrum of I 2 and a Birg-Sponr Plot 1 15 2 25 3 35 4 45 Dpartmnt of Chmistry, Gustavus Adolphus Collg. 8 Wst Collg

More information

Extraction of Doping Density Distributions from C-V Curves

Extraction of Doping Density Distributions from C-V Curves Extraction of Doping Dnsity Distributions from C-V Curvs Hartmut F.-W. Sadrozinski SCIPP, Univ. California Santa Cruz, Santa Cruz, CA 9564 USA 1. Connction btwn C, N, V Start with Poisson quation d V =

More information

Computing and Communications -- Network Coding

Computing and Communications -- Network Coding 89 90 98 00 Computing and Communications -- Ntwork Coding Dr. Zhiyong Chn Institut of Wirlss Communications Tchnology Shanghai Jiao Tong Univrsity China Lctur 5- Nov. 05 0 Classical Information Thory Sourc

More information

Roadmap. XML Indexing. DataGuide example. DataGuides. Strong DataGuides. Multiple DataGuides for same data. CPS Topics in Database Systems

Roadmap. XML Indexing. DataGuide example. DataGuides. Strong DataGuides. Multiple DataGuides for same data. CPS Topics in Database Systems Roadmap XML Indxing CPS 296.1 Topics in Databas Systms Indx fabric Coopr t al. A Fast Indx for Smistructurd Data. VLDB, 2001 DataGuid Goldman and Widom. DataGuids: Enabling Qury Formulation and Optimization

More information

u r du = ur+1 r + 1 du = ln u + C u sin u du = cos u + C cos u du = sin u + C sec u tan u du = sec u + C e u du = e u + C

u r du = ur+1 r + 1 du = ln u + C u sin u du = cos u + C cos u du = sin u + C sec u tan u du = sec u + C e u du = e u + C Tchniqus of Intgration c Donald Kridr and Dwight Lahr In this sction w ar going to introduc th first approachs to valuating an indfinit intgral whos intgrand dos not hav an immdiat antidrivativ. W bgin

More information

Supplementary Materials

Supplementary Materials 6 Supplmntary Matrials APPENDIX A PHYSICAL INTERPRETATION OF FUEL-RATE-SPEED FUNCTION A truck running on a road with grad/slop θ positiv if moving up and ngativ if moving down facs thr rsistancs: arodynamic

More information

Pair (and Triplet) Production Effect:

Pair (and Triplet) Production Effect: Pair (and riplt Production Effct: In both Pair and riplt production, a positron (anti-lctron and an lctron (or ngatron ar producd spontanously as a photon intracts with a strong lctric fild from ithr a

More information

Limiting value of higher Mahler measure

Limiting value of higher Mahler measure Limiting valu of highr Mahlr masur Arunabha Biswas a, Chris Monico a, a Dpartmnt of Mathmatics & Statistics, Txas Tch Univrsity, Lubbock, TX 7949, USA Abstract W considr th k-highr Mahlr masur m k P )

More information

Function Spaces. a x 3. (Letting x = 1 =)) a(0) + b + c (1) = 0. Row reducing the matrix. b 1. e 4 3. e 9. >: (x = 1 =)) a(0) + b + c (1) = 0

Function Spaces. a x 3. (Letting x = 1 =)) a(0) + b + c (1) = 0. Row reducing the matrix. b 1. e 4 3. e 9. >: (x = 1 =)) a(0) + b + c (1) = 0 unction Spacs Prrquisit: Sction 4.7, Coordinatization n this sction, w apply th tchniqus of Chaptr 4 to vctor spacs whos lmnts ar functions. Th vctor spacs P n and P ar familiar xampls of such spacs. Othr

More information

Quasi-Classical States of the Simple Harmonic Oscillator

Quasi-Classical States of the Simple Harmonic Oscillator Quasi-Classical Stats of th Simpl Harmonic Oscillator (Draft Vrsion) Introduction: Why Look for Eignstats of th Annihilation Oprator? Excpt for th ground stat, th corrspondnc btwn th quantum nrgy ignstats

More information

Estimation of odds ratios in Logistic Regression models under different parameterizations and Design matrices

Estimation of odds ratios in Logistic Regression models under different parameterizations and Design matrices Advancs in Computational Intllignc, Man-Machin Systms and Cybrntics Estimation of odds ratios in Logistic Rgrssion modls undr diffrnt paramtrizations and Dsign matrics SURENDRA PRASAD SINHA*, LUIS NAVA

More information

Random Access Techniques: ALOHA (cont.)

Random Access Techniques: ALOHA (cont.) Random Accss Tchniqus: ALOHA (cont.) 1 Exampl [ Aloha avoiding collision ] A pur ALOHA ntwork transmits a 200-bit fram on a shard channl Of 200 kbps at tim. What is th rquirmnt to mak this fram collision

More information