A Generalized Online Mirror Descent with Applications to Classification and Regression
Journal of Machine Learning Research (Submitted 4/00; Published 10/00)

A Generalized Online Mirror Descent with Applications to Classification and Regression

Francesco Orabona (francesco@orabona.com), Toyota Technological Institute at Chicago, Chicago, IL, USA
Koby Crammer (koby@ee.technion.ac.il), Department of Electrical Engineering, The Technion, Haifa, 32000 Israel
Nicolò Cesa-Bianchi (nicolo.cesa-bianchi@unimi.it), Department of Computer Science, Università degli Studi di Milano, Milano, 20135 Italy

Editor:

Abstract

Online learning algorithms are fast, memory-efficient, easy to implement, and applicable to many prediction problems, including classification, regression, and ranking. Several online algorithms were proposed in the past few decades, some based on additive updates, like the Perceptron, and some others on multiplicative updates, like Winnow. Online convex optimization is a general framework to unify both the design and the analysis of online algorithms using a single prediction strategy: online mirror descent. Different first-order online algorithms are obtained by choosing the regularization function in online mirror descent. We generalize online mirror descent to sequences of time-varying regularizers. Our approach allows us to recover as special cases many recently proposed second-order algorithms, such as the Vovk-Azoury-Warmuth, the second-order Perceptron, and the AROW algorithm. Moreover, we derive a new second-order adaptive p-norm algorithm, and improve bounds for some first-order algorithms, such as Passive-Aggressive (PA-I).

Keywords: Online learning, Convex optimization, Second-order algorithms

1. Introduction

Online learning provides a scalable and flexible approach for the solution of a wide range of prediction problems, including classification, regression, ranking, and portfolio management. Popular online algorithms for classification include the standard Perceptron and its many variants, such as the kernel Perceptron (Freund and Schapire, 1999), the p-norm Perceptron (Gentile, 2003), and Passive-Aggressive (Crammer et al., 2006). These algorithms have well-known counterparts for regression problems, such as the Widrow-Hoff algorithm and its p-norm generalization. Other online algorithms, with properties different from those of the standard Perceptron, are based
on exponential rather than additive updates, such as Winnow (Littlestone, 1988) for classification and Exponentiated Gradient (Kivinen and

© 2000 Francesco Orabona, Koby Crammer, and Nicolò Cesa-Bianchi.
Warmuth, 1997) for regression. Whereas these online algorithms are all essentially variants of stochastic gradient descent (Tsypkin, 1971), in the last decade many algorithms using second-order information from the input features have been proposed. These include the Vovk-Azoury-Warmuth algorithm for regression (Vovk, 2001; Azoury and Warmuth, 2001), the second-order Perceptron (Cesa-Bianchi et al., 2005), the CW/AROW algorithms (Dredze et al., 2008; Crammer et al., 2009), and the algorithms proposed by Duchi et al. (2011), all for binary classification.

Recently, online convex optimization has been proposed as a common unifying framework for designing and analyzing online algorithms. In particular, online mirror descent (OMD) is a general online convex optimization algorithm which is parameterized by a regularizer, i.e., a strongly convex function. By appropriate choices of the regularizer, most first-order online learning algorithms are recovered as special cases of OMD. Moreover, performance guarantees can also be derived simply by instantiating the general OMD bounds to the specific regularizer being used.

The theoretical study of OMD relies on convex analysis. Warmuth and Jagota (1997) and Kivinen and Warmuth (2001) pioneered the use of Bregman divergences in the analysis of online algorithms, as explained in the monograph of Cesa-Bianchi and Lugosi (2006). Shalev-Shwartz and Singer (2007), Shalev-Shwartz (2007) in his dissertation, and Shalev-Shwartz and Kakade (2009) showed a different analysis based on a primal-dual method. Starting from the work of Kakade et al. (2009), it is now clear that many instances of OMD can be analyzed using only a few basic convex duality properties. See the recent survey by Shalev-Shwartz (2012) for a lucid description of these developments.

In this paper we extend and generalize the theoretical framework of Kakade et al. (2009). In particular, we allow OMD to use a sequence of time-varying regularizers. This is known to be the key to obtaining second-order algorithms, and indeed we recover the Vovk-Azoury-Warmuth, the second-order Perceptron, and the AROW algorithm as special cases, with a slightly improved analysis of AROW. Our generalized analysis also captures the efficient variants of
these algorithms that only use the diagonal elements of the second-order information matrix, a result which was not within reach of the previous techniques. Besides being able to express second-order algorithms, time-varying regularizers can be used to perform other types of adaptation to the sequence of observed data. We give a concrete example by introducing a new adaptive regularizer corresponding to a weighted version of the p-norm regularizer. In the case of sparse targets, the corresponding instance of OMD achieves a performance bound better than that of OMD with 1-norm regularization, which is the standard regularizer for the sparse target assumption. Even in the case of first-order algorithms our framework gives improvements on previous results. For example, although aggressive algorithms for binary classification often exhibit a better empirical performance than their conservative counterparts, a theoretical explanation of this behavior remained so far elusive. Using our refined analysis, we are able to prove the first bound for Passive-Aggressive (PA-I) that is never worse and sometimes better than the Perceptron bound.
The generalized gradient-based linear forecaster

2. Online convex programming

Let X be some Euclidean space (a finite-dimensional linear space over the reals equipped with an inner product ⟨·,·⟩). In the online convex optimization protocol an algorithm sequentially chooses elements from S ⊆ X, each time incurring a certain loss. At each step t = 1, 2, … the algorithm chooses w_t ∈ S and then observes a convex loss function ℓ_t : S → R. The value ℓ_t(w_t) is the loss of the learner at step t, and the goal is to control the regret,

    R_T(u) = Σ_{t=1}^T ℓ_t(w_t) − Σ_{t=1}^T ℓ_t(u)

for all u ∈ S and for any sequence of convex loss functions ℓ_t. An important application domain for this protocol is sequential linear regression/classification. In this case, there is a fixed and given loss function ℓ : R × R → R and a fixed but unknown sequence (x_1, y_1), (x_2, y_2), … of examples (x_t, y_t) ∈ X × R. At each step t = 1, 2, … the learner observes x_t and picks w_t ∈ S ⊆ X. The loss suffered at step t is then defined as ℓ_t(w_t) = ℓ(⟨w_t, x_t⟩, y_t). For example, in regression ℓ(⟨w_t, x_t⟩, y_t) = (⟨w_t, x_t⟩ − y_t)². In classification, where y_t ∈ {−1, +1}, a typical loss function is the hinge loss [1 − y_t⟨w_t, x_t⟩]_+, where [a]_+ = max{0, a}. This is a convex upper bound on the true quantity of interest, namely the mistake indicator function I_{y_t⟨w_t, x_t⟩ ≤ 0}.

2.1 Further notation and definitions

We now introduce some basic notions of convex analysis that are used in the paper. We refer to Rockafellar (1970) for definitions and terminology. We consider functions f : X → R that are closed and convex. This is equivalent to saying that their epigraph {(x, y) : f(x) ≤ y} is a convex and closed subset of X × R. The effective domain of f, that is the set {x ∈ X : f(x) < ∞}, is a convex set whenever f is convex. We can always choose any S ⊆ X as the domain of f by letting f(x) = ∞ for x ∉ S. Given a closed and convex function f with domain S ⊆ X, its Fenchel conjugate f* : X → R is defined as f*(u) = sup_{v ∈ S} (⟨v, u⟩ − f(v)). Note that the domain of f* is always X. Moreover, one can prove that f** = f. A generic norm of a vector u ∈ X is denoted by ‖u‖. Its dual ‖·‖* is the norm defined as ‖v‖* = sup_u {⟨u, v⟩ : ‖u‖ ≤ 1}. The Fenchel-Young inequality states that f(u) + f*(v) ≥ ⟨u, v⟩ for all v, u. A vector x is a subgradient of a convex function f at v if f(u) − f(v) ≥ ⟨u − v, x⟩ for any u in the domain of f. The differential set of f at v, denoted by ∂f(v), is the set of all the
subgradients of f at v. If f is also differentiable at v, then ∂f(v) contains a single vector, denoted by ∇f(v), which is the gradient of f at v. A consequence of the Fenchel-Young inequality is the following: for all x ∈ ∂f(v) we have that f(v) + f*(x) = ⟨v, x⟩. A function f is β-strongly convex with respect to a norm ‖·‖ if for any u, v in its domain, and any x ∈ ∂f(u),

    f(v) ≥ f(u) + ⟨x, v − u⟩ + (β/2)‖u − v‖².   (2)
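The strong-convexity inequality can be checked numerically in the canonical case f(w) = ½‖w‖₂², which is 1-strongly convex with respect to the Euclidean norm (there the inequality holds with equality). The snippet below is only an illustration of the definition, not part of the paper's analysis.

```python
# Numeric check of the strong-convexity inequality for f(w) = (1/2)||w||_2^2,
# which is 1-strongly convex w.r.t. the Euclidean norm (beta = 1); for this
# quadratic f the inequality holds with equality.
import random

random.seed(0)

def f(w):
    return 0.5 * sum(wi * wi for wi in w)

ok = True
for _ in range(100):
    u = [random.uniform(-1, 1) for _ in range(3)]
    v = [random.uniform(-1, 1) for _ in range(3)]
    grad_u = u  # gradient of f at u is u itself
    rhs = (f(u)
           + sum(g * (vi - ui) for g, vi, ui in zip(grad_u, v, u))
           + 0.5 * sum((ui - vi) ** 2 for ui, vi in zip(u, v)))
    ok = ok and f(v) >= rhs - 1e-12
```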
The Fenchel conjugate f* of a β-strongly convex function f is everywhere differentiable and (1/β)-strongly smooth. This means that, for all u, v ∈ X,

    f*(v) ≤ f*(u) + ⟨∇f*(u), v − u⟩ + (1/(2β))‖v − u‖*².

See also the paper of Kakade et al. (2009) and references therein. A further property of strongly convex functions f : S → R is the following: for all u ∈ X,

    ∇f*(u) = argmax_{v ∈ S} (⟨v, u⟩ − f(v)).   (1)

This implies the useful identity f(∇f*(u)) + f*(u) = ⟨∇f*(u), u⟩. Strong convexity and strong smoothness are key properties in the design of online learning algorithms. In the following, we often write ‖·‖_f to denote the norm according to which f is strongly convex.

3. Online Mirror Descent

We now introduce our main algorithmic tool: a generalization of the standard OMD algorithm for online convex programming in which the regularizers may change over time.

Algorithm 1: Online Mirror Descent
1: Parameters: A sequence of strongly convex functions f_1, f_2, … defined on a common domain S ⊆ X.
2: Initialize: θ_1 = 0 ∈ X
3: for t = 1, 2, … do
4:    Choose w_t = ∇f_t*(θ_t)
5:    Observe z_t ∈ X
6:    Update θ_{t+1} = θ_t + z_t
7: end for

Standard OMD (see, e.g., Kakade et al., 2009) uses f_t = f for all t. Note the following remarkable property of Algorithm 1: while θ_t moves freely in X as determined by the input sequence z_t, because of (1) the property w_t ∈ S holds for all t. The following lemma is a generalization of Corollary 4 of Kakade et al. (2009) and of Corollary 3 of Duchi et al. (2011).

Lemma 1 Assume OMD is run with functions f_1, f_2, … defined on a common domain S ⊆ X and such that each f_t is β_t-strongly convex with respect to the norm ‖·‖_{f_t}. Then, for any u ∈ S,

    Σ_{t=1}^T ⟨z_t, u − w_t⟩ ≤ f_T(u) + Σ_{t=1}^T ( ‖z_t‖_{f_t,*}²/(2β_t) + f_t*(θ_t) − f_{t−1}*(θ_t) ),

where we set f_0*(0) = 0. Moreover, f_t*(θ_t) − f_{t−1}*(θ_t) ≤ f_{t−1}(w_t) − f_t(w_t) for all t ≥ 1.
Proof Let Δ_t = f_t*(θ_{t+1}) − f_{t−1}*(θ_t). Then

    Σ_{t=1}^T Δ_t = f_T*(θ_{T+1}) − f_0*(θ_1) = f_T*(θ_{T+1}).

Since the functions f_t* are (1/β_t)-strongly smooth with respect to ‖·‖_{f_t,*}, and recalling that θ_{t+1} = θ_t + z_t,

    Δ_t = f_t*(θ_{t+1}) − f_t*(θ_t) + f_t*(θ_t) − f_{t−1}*(θ_t)
        ≤ f_t*(θ_t) − f_{t−1}*(θ_t) + ⟨∇f_t*(θ_t), z_t⟩ + ‖z_t‖_{f_t,*}²/(2β_t)
        = f_t*(θ_t) − f_{t−1}*(θ_t) + ⟨w_t, z_t⟩ + ‖z_t‖_{f_t,*}²/(2β_t),

where we used the definition of w_t in the last step. On the other hand, the Fenchel-Young inequality implies

    Σ_{t=1}^T Δ_t = f_T*(θ_{T+1}) ≥ ⟨u, θ_{T+1}⟩ − f_T(u) = Σ_{t=1}^T ⟨u, z_t⟩ − f_T(u).

Combining the upper and lower bounds on Σ_t Δ_t and summing over t we get

    Σ_{t=1}^T ⟨u, z_t⟩ − f_T(u) ≤ Σ_{t=1}^T ( f_t*(θ_t) − f_{t−1}*(θ_t) + ⟨w_t, z_t⟩ + ‖z_t‖_{f_t,*}²/(2β_t) ).

We now prove the second statement. Recalling again the definition of w_t, (1) implies f_t*(θ_t) = ⟨w_t, θ_t⟩ − f_t(w_t). On the other hand, the Fenchel-Young inequality implies that f_{t−1}*(θ_t) ≥ ⟨w_t, θ_t⟩ − f_{t−1}(w_t). Combining the two we get f_t*(θ_t) − f_{t−1}*(θ_t) ≤ f_{t−1}(w_t) − f_t(w_t), as desired.

Next, we show a general regret bound for Algorithm 1.

Corollary 1 Let R : S → R be a convex function and let g_1, g_2, … be a sequence of nondecreasing convex functions g_t : S → R. Fix η > 0 and assume f_t = g_t + tηR are β_t-strongly convex with respect to ‖·‖_{f_t}. If OMD is run on the input sequence z_t = −η ℓ'_t for some ℓ'_t ∈ ∂ℓ_t(w_t), then

    Σ_{t=1}^T (ℓ_t(w_t) + R(w_t)) − Σ_{t=1}^T (ℓ_t(u) + R(u)) ≤ g_T(u)/η + (η/2) Σ_{t=1}^T ‖ℓ'_t‖_{f_t,*}²/β_t   (3)

for all u ∈ S. Moreover, if f_t = √t g + tηR, where g : S → R is β-strongly convex, then

    Σ_{t=1}^T (ℓ_t(w_t) + R(w_t)) − Σ_{t=1}^T (ℓ_t(u) + R(u)) ≤ √T g(u)/η + (η/β) √T max_{t≤T} ‖ℓ'_t‖_{f,*}²   (4)

for all u ∈ S.
Finally, if f_t = tηR, where R is β-strongly convex with respect to a norm ‖·‖, then

    Σ_{t=1}^T (ℓ_t(w_t) + R(w_t)) − Σ_{t=1}^T (ℓ_t(u) + R(u)) ≤ max_{t≤T} ‖ℓ'_t‖*² (1 + ln T)/(2β)   (5)

for all u ∈ S.

Proof By convexity, ℓ_t(w_t) − ℓ_t(u) ≤ −(1/η)⟨z_t, u − w_t⟩. Using Lemma 1 we have

    −Σ_{t=1}^T ⟨z_t, u − w_t⟩ ≤ g_T(u) + ηT R(u) + (η²/2) Σ_{t=1}^T ‖ℓ'_t‖_{f_t,*}²/β_t + η Σ_{t=1}^T ((t−1)R(w_t) − tR(w_t)),

where we used the fact that the terms g_{t−1}(w_t) − g_t(w_t) are nonpositive under the hypothesis that the functions g_t are nondecreasing. Reordering terms we obtain (3). In order to obtain (4) it is sufficient to note that, by definition of strong convexity, g_t = √t g is √t β-strongly convex because g is β-strongly convex, hence f_t is √t β-strongly convex too. The elementary inequality Σ_{t=1}^T 1/√t ≤ 2√T concludes the proof of (4). Finally, bound (5) is proven by observing that f_t = tηR is tηβ-strongly convex because R is β-strongly convex. The elementary inequality Σ_{t=1}^T 1/t ≤ 1 + ln T concludes the proof.

A special case of OMD is the Regularized Dual Averaging framework of Xiao (2010), where the prediction at each step is defined by

    w_t = argmin_w ( (1/(t−1)) Σ_{s=1}^{t−1} ⟨w, ℓ'_s⟩ + (β_{t−1}/(t−1)) g(w) + R(w) )   (6)

for some ℓ'_s ∈ ∂ℓ_s(w_s), s = 1, …, t−1. Using (1), it is easy to see that this update is equivalent to w_t = ∇f_{t−1}*(−Σ_{s=1}^{t−1} ℓ'_s), where f_{t−1}(w) = β_{t−1} g(w) + (t−1)R(w). The framework of Xiao (2010) has been extended by Duchi et al. (2010) to allow the strongly convex part of the regularizer to increase over time. However, their framework is not flexible enough to include algorithms that update without using the gradient of the loss function with respect to which the regret is calculated. Examples of such algorithms are the Vovk-Azoury-Warmuth algorithm of the next section and the online binary classification algorithms of Section 6. A bound similar to (3) has been recently presented by Duchi et al. (2011) and extended to variable potential functions by Duchi et al. (2010). There, a more immediate tradeoff between the current gradient and the Bregman divergence from the new solution to the previous one is used to update at each time step. Note that the only hypothesis on R is convexity. Hence, R can be a nondifferentiable function as well. Thus we recover the results about minimization of strongly convex and composite loss functions, and adaptive learning rates, in a simple unique framework. In the next sections we show more algorithms that can be viewed as special cases of this framework.¹ (¹ Although Xiao (2010) explicitly mentions that
his results cannot be recovered with the primal-dual proofs, here we prove the contrary.)
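As a concrete illustration of Algorithm 1, here is a minimal sketch assuming the simple time-varying quadratic regularizer f_t(w) = (β_t/2)‖w‖₂², for which the mirror map is ∇f_t*(θ) = θ/β_t. The choice β_t = √t and all names are our illustrative assumptions, not prescriptions from the paper.

```python
# Minimal sketch of Algorithm 1 (OMD with time-varying regularizers),
# specialized to f_t(w) = (beta_t / 2) * ||w||_2^2, whose conjugate gradient
# is grad f_t*(theta) = theta / beta_t.  beta_t = sqrt(t) is an assumption.
import math

def omd(z_stream, dim, beta=lambda t: math.sqrt(t)):
    """Run OMD on a sequence of update vectors z_t; return all predictions w_t."""
    theta = [0.0] * dim                      # step 2: theta_1 = 0
    ws = []
    for t, z in enumerate(z_stream, start=1):
        b = beta(t)
        w = [th / b for th in theta]         # step 4: w_t = grad f_t*(theta_t)
        ws.append(w)
        theta = [th + zi for th, zi in zip(theta, z)]  # step 6: theta update
    return ws

# Toy run with a constant update direction z_t = (1, 0).
preds = omd([[1.0, 0.0]] * 4, dim=2)
```

Since θ_t = (t−1, 0) here, the predictions shrink by the growing β_t, e.g. w_4 = (3/2, 0).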
4. Square Loss

In this section we recover known regret bounds for online regression with the square loss via Lemma 1. Throughout this section, X = R^d and the inner product ⟨u, x⟩ is the standard dot product uᵀx. We set ℓ_t(u) = ½(y_t − uᵀx_t)², where (x_1, y_1), (x_2, y_2), … is some arbitrary sequence of examples (x_t, y_t) ∈ R^d × R.

First, note that it is possible to specialize OMD to the Vovk-Azoury-Warmuth algorithm for online regression by setting z_t = y_t x_t and f_t(u) = ½ uᵀA_t u, where A_t = aI_d + Σ_{s=1}^t x_s x_sᵀ. The regret bound of this algorithm (see, e.g., Theorem 11.8 of Cesa-Bianchi and Lugosi, 2006) is recovered from Lemma 1 by noting that f_t is 1-strongly convex with respect to the norm ‖u‖_{f_t} = √(uᵀA_t u). Hence,

    R_T = Σ_{t=1}^T ½(y_t − w_tᵀx_t)² − Σ_{t=1}^T ½(y_t − uᵀx_t)²
        ≤ f_T(u) + Σ_{t=1}^T ( ½ y_t² x_tᵀA_t^{-1}x_t + f_t*(θ_t) − f_{t−1}*(θ_t) ) − ½ Σ_{t=1}^T (uᵀx_t)²
        ≤ (a/2)‖u‖² + (Y²/2) Σ_{t=1}^T x_tᵀA_t^{-1}x_t,

since f_t*(θ_t) − f_{t−1}*(θ_t) ≤ f_{t−1}(w_t) − f_t(w_t) = −½(w_tᵀx_t)², and by setting Y = max_t |y_t|.

We can also generalize the p-norm LMS algorithm of Kivinen et al. (2006) for controlling the adaptive filtering regret

    R_T^af = Σ_{t=1}^T (w_tᵀx_t − uᵀx_t)².

The reader interested in the motivations behind the study of this regret is addressed to that paper. This is achieved by setting z_t = (y_t − w_tᵀx_t)x_t and f_t(u) = (√t X_t²/β) f(u) in OMD, where f is an arbitrary β-strongly convex function with respect to some norm ‖·‖, and X_t = max_{s≤t} ‖x_s‖*. We can then write

    ½ R_T + ½ R_T^af = Σ_{t=1}^T (y_t − w_tᵀx_t)(uᵀx_t − w_tᵀx_t) ≤ f_T(u) + Σ_{t=1}^T (y_t − w_tᵀx_t)²/(2√t),

where in the last step we used Lemma 1, the √t X_t²-strong convexity of f_t, and the fact that f_t ≥ f_{t−1}. Simplifying the expression we obtain the following adaptive filtering bound:

    R_T^af ≤ (2 X_T² √T/β) f(u) + Σ_{t=1}^T (y_t − uᵀx_t)².

Compared to the bounds of Kivinen et al. (2006), our algorithm inherits the ability to adapt to the maximum norm of x_t without any prior knowledge. Moreover, instead of using a decreasing learning rate, here we use an increasing regularizer.
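A minimal sketch of the Vovk-Azoury-Warmuth specialization above: A_t^{-1} is maintained through the Sherman-Morrison rank-one update, with A_t = aI + Σ_{s≤t} x_s x_sᵀ and w_t = A_t^{-1}θ_t. All helper names are ours, not the paper's.

```python
# Sketch of the Vovk-Azoury-Warmuth instance of OMD: z_t = y_t x_t and
# f_t(u) = (1/2) u' A_t u with A_t = a I + sum_{s<=t} x_s x_s'.  A_t^{-1}
# is kept up to date via the Sherman-Morrison formula.

def _matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def _sherman_morrison(Minv, x):
    """Return (M + x x')^{-1} given Minv = M^{-1} (rank-one update)."""
    Mx = _matvec(Minv, x)
    denom = 1.0 + sum(xi * mxi for xi, mxi in zip(x, Mx))
    d = len(x)
    return [[Minv[i][j] - Mx[i] * Mx[j] / denom for j in range(d)] for i in range(d)]

def vaw(examples, dim, a=1.0):
    """Return the predictions w_t' x_t on the sequence of (x_t, y_t) pairs."""
    Ainv = [[(1.0 / a if i == j else 0.0) for j in range(dim)] for i in range(dim)]
    theta = [0.0] * dim
    preds = []
    for x, y in examples:
        Ainv = _sherman_morrison(Ainv, x)     # A_t = A_{t-1} + x_t x_t'
        w = _matvec(Ainv, theta)              # w_t = A_t^{-1} theta_t
        preds.append(sum(wi * xi for wi, xi in zip(w, x)))
        theta = [th + y * xi for th, xi in zip(theta, x)]  # theta += y_t x_t
    return preds

p = vaw([([1.0, 0.0], 1.0), ([1.0, 0.0], 1.0)], dim=2)
```

Note how the current instance x_t enters A_t before the prediction is made, which is exactly what makes this update not expressible through the loss gradient alone.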
5. A new algorithm for online regression

In this section we show the full power of our framework by introducing a new time-varying regularizer f_t generalizing the squared q-norm. Then, we derive the corresponding regret bound. As in the previous section, let X = R^d and let the inner product ⟨u, x⟩ be the standard dot product uᵀx. Given b_1, …, b_d ∈ R_+ and q ∈ (1, 2], let the weighted q-norm of w ∈ R^d be (Σ_{i=1}^d b_i |w_i|^q)^{1/q}. Define the corresponding regularization function by

    f(w) = (1/(2(q−1))) (Σ_{i=1}^d b_i |w_i|^q)^{2/q}.

This function has the following properties (proof in the appendix).

Lemma 2 The Fenchel conjugate of f is

    f*(θ) = (1/(2(p−1))) (Σ_{i=1}^d b_i^{1−p} |θ_i|^p)^{2/p}   for p = q/(q−1).   (7)

Moreover, the function f(w) is 1-strictly convex with respect to the norm (Σ_{i=1}^d b_i |x_i|^q)^{1/q}, whose dual norm is defined by (Σ_{i=1}^d b_i^{1−p} |θ_i|^p)^{1/p}.

We can now prove the following regret bound for linear regression with absolute loss.

Corollary 3 Let

    f_t(u) = (1/(2(q_t−1))) (Σ_{i=1}^d b_{t,i} |u_i|^{q_t})^{2/q_t},

where b_{t,i} = max_{s=1,…,t} x_{s,i}², and let q_t be defined by

    1/(q_t − 1) = ln( max_{s=1,…,t} ‖x_s‖_0 ).
If OMD is run using the regularizers f_t on the input sequence z_t = −ηℓ'_t, where ℓ'_t ∈ ∂ℓ_t(w_t) for ℓ_t(w) = |wᵀx_t − y_t| and η > 0, then

    Σ_{t=1}^T |w_tᵀx_t − y_t| − Σ_{t=1}^T |uᵀx_t − y_t| ≤ √(e ln(max_{t≤T} ‖x_t‖_0)) ( (Σ_{i=1}^d u_i² B_{T,i}²)/(2η) + ηT/2 )

for any u ∈ R^d, where B_{T,i} = max_{t=1,…,T} |x_{t,i}|.

This bound has the interesting property of being invariant with respect to arbitrary scaling of individual coordinates of the data points x_t. This is unlike running standard OMD with non-adaptive regularizers, which gives bounds of the form ‖u‖ max_t ‖x_t‖ √T. In particular, by an appropriate tuning of η the regret in Corollary 3 is bounded by a quantity of the order of

    √(Σ_{i=1}^d u_i² max_t x_{t,i}²) √(T ln d).

When the good u are sparse, that is, when ‖u‖_1 is small, this is always better than running standard OMD with a non-weighted q-norm regularizer, which for q → 1 (the best choice for the sparse-u case) gives bounds of the form ‖u‖_1 max_t ‖x_t‖_∞ √(T ln d). Indeed, we have

    √(Σ_{i=1}^d u_i² max_t x_{t,i}²) ≤ √(Σ_{i=1}^d u_i² max_{t,j} x_{t,j}²) ≤ ‖u‖_1 max_t ‖x_t‖_∞.

Similar regularization functions are studied by Grave et al. (2011), although in a different context.

6. Binary classification: aggressive and diagonal updates

In this section we show that several known algorithms for online binary classification are special cases of OMD. These algorithms include the p-norm Perceptron (Gentile, 2003), Passive-Aggressive (Crammer et al., 2006), the second-order Perceptron (Cesa-Bianchi et al., 2005), and AROW (Crammer et al., 2009). Besides recovering all previously known mistake bounds, we also show new bounds for Passive-Aggressive and for AROW with diagonal updates.

Fix any Euclidean space with inner product ⟨·,·⟩. Given a fixed but unknown sequence (x_1, y_1), (x_2, y_2), … of examples (x_t, y_t) ∈ X × {−1, +1}, let ℓ_t(w) = ℓ(⟨w, x_t⟩, y_t) be the hinge loss [1 − y_t⟨w, x_t⟩]_+. It is easy to verify that the hinge loss satisfies the following condition: if ℓ_t(w) > 0 then

    ℓ_t(u) ≥ 1 + ⟨u, ℓ'_t⟩   for all u, w ∈ R^d, with ℓ'_t ∈ ∂ℓ_t(w).   (8)

Note that when ℓ_t(w) > 0 the subgradient notation is redundant, as ∂ℓ_t(w) is the singleton {∇ℓ_t(w)}. We apply the OMD algorithm to online binary classification by setting z_t = −η_t ℓ'_t if ℓ_t(w_t) > 0, and z_t = 0 otherwise.
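The claim that the hinge loss upper-bounds the mistake indicator is easy to check numerically; the cases below are arbitrary and only illustrate the definition.

```python
# Check that the hinge loss [1 - y <w, x>]_+ upper-bounds the mistake
# indicator I{y <w, x> <= 0} on a few arbitrary (w, x, y) triples.

def hinge(w, x, y):
    return max(0.0, 1.0 - y * sum(wi * xi for wi, xi in zip(w, x)))

def mistake(w, x, y):
    return 1.0 if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0 else 0.0

cases = [([0.5, -0.2], [1.0, 1.0], +1),   # correct with small margin
         ([0.5, -0.2], [1.0, 1.0], -1),   # mistake
         ([0.0, 0.0], [2.0, 0.0], +1)]    # zero margin counts as a mistake
ok = all(hinge(w, x, y) >= mistake(w, x, y) for w, x, y in cases)
```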
In the following, when T is understood from the context, we denote by M the set of steps t on which the algorithm made a mistake, ŷ_t ≠ y_t. Similarly, we denote by U the set of margin error steps, that is, steps where ŷ_t = y_t but ℓ_t(w_t) > 0. Following a standard terminology, we call conservative (or passive) an algorithm that updates its classifier only on mistake steps, and aggressive an algorithm that updates its classifier both on mistake and on margin error steps.

6.1 First-order algorithms

If we run OMD in conservative mode, and let f_t = f = ½‖·‖_p² for 1 < p ≤ 2, then we recover the p-norm Perceptron of Gentile (2003). We now show how to use our framework to generalize and improve previous analyses for binary classification algorithms that use aggressive updates.

Corollary 4 Assume OMD is run with f_t = f, where f, with domain X, is β-strongly convex with respect to the norm ‖·‖ and satisfies f(λu) ≤ λ² f(u) for all λ ∈ R and all u ∈ X. Further assume the input sequence is z_t = η_t y_t x_t, for some 0 < η_t ≤ 1 such that y_t⟨w_t, x_t⟩ ≤ 0 implies η_t = 1. Then, for all T ≥ 1 and for all u ∈ X,

    M ≤ L(u) + D + (2/β) f(u) X_T² + X_T √((2/β) f(u) L(u)),

where M = |M|, X_T = max_{t≤T} ‖x_t‖*, L(u) = Σ_t [1 − y_t⟨u, x_t⟩]_+, and

    D = Σ_{t∈U} η_t ( η_t‖x_t‖*² + 2β y_t⟨w_t, x_t⟩ − X_T² ) / X_T².

For the conservative p-norm Perceptron, we have U = ∅, ‖·‖* = ‖·‖_q where q = p/(p−1), and β = p−1, because ½‖·‖_p² is (p−1)-strongly convex with respect to ‖·‖_p for 1 < p ≤ 2 (see Lemma 17 of Shalev-Shwartz, 2007). We therefore recover the mistake bound of Gentile (2003).

The term D in the bound of Corollary 4 can be negative. We can minimize it, subject to 0 ≤ η_t ≤ 1, by setting

    η_t = max{ min{ (X_T² − 2β y_t⟨w_t, x_t⟩)/(2‖x_t‖*²), 1 }, 0 }.

This tuning of η_t is quite similar to that of the Passive-Aggressive algorithm (type I) of Crammer et al. (2006). In fact, for f_t = f = ½‖·‖² we would have

    η_t = max{ min{ (X_T² − 2 y_t⟨w_t, x_t⟩)/(2‖x_t‖²), 1 }, 0 },

while the update rule for PA-I is

    η_t = max{ min{ (1 − y_t⟨w_t, x_t⟩)/‖x_t‖², 1 }, 0 }.
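The PA-I rule of Crammer et al. (2006), moving by η_t y_t x_t with the step size clipped at an aggressiveness parameter C, can be sketched as follows; the function name is ours, and C = 1 mirrors the constraint 0 ≤ η_t ≤ 1 used above.

```python
# Sketch of one PA-I update step (Crammer et al., 2006): the step size is
# tau = min{C, hinge_loss / ||x||^2}, clipped at C, and there is no update
# when the hinge loss is zero (the "passive" case).

def pa1_step(w, x, y, C=1.0):
    """Return the updated weight vector after one PA-I step on (x, y)."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)                  # hinge loss [1 - y <w,x>]_+
    if loss == 0.0:
        return w                                   # passive: no update
    tau = min(C, loss / sum(xi * xi for xi in x))  # aggressive, clipped step
    return [wi + tau * y * xi for wi, xi in zip(w, x)]

w = pa1_step([0.0, 0.0], [1.0, 1.0], +1)  # loss 1, ||x||^2 = 2, so tau = 0.5
```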
The mistake bound of Corollary 4 is, however, better than the aggressive bounds for PA-I of Crammer et al. (2006) and Shalev-Shwartz (2007). Indeed, while the PA-I bounds are generally worse than the Perceptron mistake bound

    M ≤ L(u) + ‖u‖² X_T² + ‖u‖ X_T √(L(u)),   (9)

as discussed by Crammer et al. (2006), our bound is better as soon as D < 0. Hence, it can be viewed as the first theoretical evidence in support of aggressive updates.

Proof of Corollary 4 Using (15) in Lemma 5 with the assumption η_t = 1 when t ∈ M, and noting that B_t = 0 since f_t = f, we get

    M + Σ_{t∈U} η_t ≤ L_η(u) + 2 √( f(u) Σ_t ( η_t²‖x_t‖*²/(2β) + η_t y_t⟨w_t, x_t⟩ ) )
                  ≤ L(u) + √( (2/β) f(u) X_T² ( M + Σ_{t∈U} η_t + D ) ),

where we have used the fact that ‖x_t‖* ≤ X_T for all t ≤ T, that y_t⟨w_t, x_t⟩ ≤ 0 for t ∈ M, and the definition of D. Solving the resulting quadratic inequality for M + Σ_{t∈U} η_t we get

    M + Σ_{t∈U} η_t ≤ L(u) + (1/β) f(u) X_T² + √( (2/β) f(u) X_T² (L(u) + D) + ((1/β) f(u) X_T²)² ).   (10)

We further upper bound the right-hand side of (10) using the elementary inequality √(a + b) ≤ √a + b/(2√a), for all a > 0 and b ≥ −a. Applying the inequality √(a + b) ≤ √a + √b, dropping the nonnegative term Σ_{t∈U} η_t from the left-hand side, and rearranging gives the desired bound.

6.2 Second-order algorithms

We now apply our framework to second-order algorithms for binary classification. Here, we let X = R^d and the inner product ⟨u, x⟩ be the standard dot product uᵀx.
Second-order algorithms for binary classification are online variants of Ridge regression. Recall that the Ridge regression linear predictor is defined by

    w_{t+1} = argmin_{w ∈ R^d} ( Σ_{s=1}^t (wᵀx_s − y_s)² + ‖w‖² ).

The closed-form expression for w_{t+1}, which involves the design matrix S_t = [x_1, …, x_t] and the label vector y_t = (y_1, …, y_t)ᵀ, is given by w_{t+1} = (I + S_t S_tᵀ)^{-1} S_t y_t. The second-order Perceptron (see below) uses this weight w_{t+1}, but S_t and y_t only contain the examples (x_s, y_s) on which a mistake occurred. In this sense, it is an online variant of Ridge regression. In practice, second-order algorithms typically perform better than their first-order counterparts, such as the algorithms in the Perceptron family.

There are two basic second-order algorithms: the second-order Perceptron of Cesa-Bianchi et al. (2005) and the AROW algorithm of Crammer et al. (2009). We show that both of them are instances of OMD and recover their mistake bounds as special cases of our analysis. Let f_t(x) = ½ xᵀA_t x, where A_0 = I and A_t = A_{t−1} + (1/r) x_t x_tᵀ with r > 0. Each function f_t is 1-strongly convex with respect to the norm ‖x‖_{f_t} = √(xᵀA_t x), with dual norm ‖x‖_{f_t,*} = √(xᵀA_t^{-1}x). The dual function of f_t is f_t*(x) = ½ xᵀA_t^{-1}x. Now, the conservative version of OMD run with the f_t chosen as above is the second-order Perceptron. The aggressive version corresponds instead to AROW, with a minor difference. Indeed, in this case OMD predicts with the sign of w_tᵀx_t, where

    y_t w_tᵀx_t = r m_t/(r + χ_t),

using the notation χ_t = x_tᵀA_{t−1}^{-1}x_t and m_t = y_t x_tᵀA_{t−1}^{-1}θ_t. On the other hand, AROW simply predicts using the sign of m_t. The sign of the predictions is the same, but OMD updates when r m_t/(r + χ_t) ≤ 1, while AROW updates when m_t ≤ 1. Typically, for large t the value of χ_t is small, and thus the two update rules coincide in practice.

To derive a mistake bound for OMD run with f_t(x) = ½ xᵀA_t x, first observe that using the Woodbury identity we have

    f_t*(θ_t) − f_{t−1}*(θ_t) = − (x_tᵀA_{t−1}^{-1}θ_t)²/(2(r + x_tᵀA_{t−1}^{-1}x_t)) = − m_t²/(2(r + χ_t)).

Hence, using (15) in Lemma 5, and setting η_t = 1, we obtain

    M + |U| ≤ L(u) + √( uᵀA_T u ( Σ_t x_tᵀA_t^{-1}x_t + Σ_t ( 2 y_t w_tᵀx_t − m_t²/(r + χ_t) ) ) )
           ≤ L(u) + √( ( r‖u‖² + Σ_{t∈M∪U} (uᵀx_t)² ) ( ln det(A_T) + Σ_{t∈U} m_t(2r − m_t)/(r(r + χ_t)) ) )

for all u ∈ X, where L(u) = Σ_t [1 − y_t uᵀx_t]_+, and where we used uᵀA_T u = ‖u‖² + (1/r)Σ_t (uᵀx_t)², the identity 2y_t w_tᵀx_t − m_t²/(r + χ_t) = m_t(2r − m_t)/(r + χ_t), and the bound Σ_t x_tᵀA_t^{-1}x_t ≤ r ln det(A_T).
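A minimal sketch of the conservative instance just described (the second-order Perceptron): A_t^{-1} is maintained with the Sherman-Morrison formula for the rank-one update A_t = A_{t−1} + (1/r)x_t x_tᵀ, and both the matrix and θ change only on mistake steps. All helper names are ours.

```python
# Sketch of the conservative second-order Perceptron as an instance of OMD:
# predict with sign(x' A_t^{-1} theta_t), where A_t includes the current
# instance, but commit the update only when a mistake occurs.

def _matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def _rank_one_inv_update(Minv, x):
    """Return (M + x x')^{-1} given Minv = M^{-1} (Sherman-Morrison)."""
    Mx = _matvec(Minv, x)
    denom = 1.0 + sum(xi * mxi for xi, mxi in zip(x, Mx))
    d = len(x)
    return [[Minv[i][j] - Mx[i] * Mx[j] / denom for j in range(d)] for i in range(d)]

def second_order_perceptron(examples, dim, r=1.0):
    """Run the conservative second-order Perceptron; return the mistake count."""
    Ainv = [[float(i == j) for j in range(dim)] for i in range(dim)]  # A_0 = I
    theta = [0.0] * dim
    mistakes = 0
    for x, y in examples:
        xs = [xi / r ** 0.5 for xi in x]      # (1/r) x x' as a rank-one term
        Ainv_try = _rank_one_inv_update(Ainv, xs)
        margin = sum(xi * wi for xi, wi in zip(x, _matvec(Ainv_try, theta)))
        if y * margin <= 0:                   # mistake: commit matrix and theta
            mistakes += 1
            Ainv = Ainv_try
            theta = [th + y * xi for th, xi in zip(theta, x)]
    return mistakes

m = second_order_perceptron([([1.0, 0.0], 1), ([-1.0, 0.0], -1), ([1.0, 0.0], 1)], dim=2)
```

On this separable toy sequence only the first, zero-margin example triggers an update.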
This bound improves slightly over the known bound for AROW, in the last sum under the square root. In fact, in AROW we have the term |U|, while here we have

    Σ_{t∈U} m_t(2r − m_t)/(r(r + χ_t)) ≤ Σ_{t∈U} r/(r + χ_t) ≤ |U|,   (11)

where the first inequality holds because m_t(2r − m_t) ≤ r² for any m_t. In the conservative case, when U = ∅, the bound specializes to the standard second-order Perceptron bound.

6.3 Diagonal updates

AROW and the second-order Perceptron can be run more efficiently using diagonal matrices. In this case, each update takes time linear in d. We now use Lemma 5 to prove a mistake bound for the diagonal version of the second-order Perceptron. Denote by D_t = diag{A_t} the diagonal matrix that agrees with A_t on the diagonal, where A_t is defined as before, and let f_t(x) = ½ xᵀD_t x. Setting η_t = 1, using the second bound of Lemma 5, and Lemma 6, we have

    M + |U| ≤ L(u) + √( uᵀD_T u ( Σ_t x_tᵀD_t^{-1}x_t + 2|U| ) )
           ≤ L(u) + √( ( ‖u‖² + (1/r) Σ_{i=1}^d u_i² Σ_t x_{t,i}² ) ( r Σ_{i=1}^d ln( (1/r) Σ_t x_{t,i}² + 1 ) + 2|U| ) ).   (12)

(We did not optimize the constant multiplying |U| in this bound.)

This allows us to theoretically analyze the cases where this algorithm could be advantageous. In particular, features of NLP data are typically binary, and it is often the case that most of the features are zero most of the time. On the other hand, these rare features are usually the most informative ones (see, e.g., the discussion of Dredze et al., 2008). Figure 1 shows the number of times each feature (word) appears in two sentiment datasets vs. the word rank. Clearly, there are a few very frequent words and many rare words. These exact properties originally motivated the CW and AROW algorithms, and now our analysis provides a theoretical justification. Concretely, the above considerations support the assumption that the optimal hyperplane u satisfies

    Σ_{i=1}^d u_i² Σ_t x_{t,i}² = Σ_{i∈I} u_i² Σ_t x_{t,i}² ≤ s Σ_{i∈I} u_i² ≤ s‖u‖²,   (13)

where I is the set of informative and rare features, and s is the maximum number of times these features appear in the sequence. Running the diagonal version of the second-order Perceptron so that U = ∅, and assuming that (13) holds,
Figure 1: Evidence of heavy tails for NLP data. The plots show the number of times each word appears vs. the word rank on two sentiment datasets.

the last term in the mistake bound (12) can be rewritten and bounded as

    r Σ_{i=1}^d ln( (1/r) Σ_t x_{t,i}² + 1 ) ≤ dr ln( M X_T²/(dr) + 1 ),

where we calculated the maximum of the sum given the constraint Σ_i Σ_t x_{t,i}² ≤ X_T² M. Together with (13), the mistake bound (12) then becomes the implicit inequality

    M ≤ L(u) + √( ‖u‖² (1 + s/r) dr ln( M X_T²/(dr) + 1 ) ).

We can now use Corollary 3 in the appendix to solve this inequality for M and obtain an explicit mistake bound in which the dependence on the cumulative hinge loss enters only through its logarithm. Hence, when hypothesis (13) is verified, the number of mistakes of the diagonal version of AROW depends on √(ln L(u)) rather than on √(L(u)).

7. Conclusions

We proposed a framework for online convex optimization combining online mirror descent with time-varying regularizers. This allowed us to view second-order algorithms such as the Vovk-Azoury-Warmuth forecaster, the second-order Perceptron, and the AROW algorithm as special cases of mirror descent. Our analysis also captures second-order variants that only employ the diagonal elements of the second-order information matrix, a result which was not within reach of the previous techniques. Within our framework, we also derived and analyzed a new regularizer based on an adaptive weighted version of the p-norm Perceptron. In the case of sparse targets, the
corresponding instance of OMD achieves a performance bound better than that of OMD with 1-norm regularization. We also improved previous bounds for existing first-order algorithms. For example, we were able to formally explain the phenomenon according to which aggressive algorithms typically exhibit better empirical performance than their conservative counterparts. Specifically, our refined analysis provides a bound for Passive-Aggressive (PA-I) that is never worse and sometimes better than the Perceptron bound.

One interesting direction to pursue is the derivation and analysis of algorithms based on time-varying versions of the entropic regularizers used by the EG and Winnow algorithms. More in general, it would be useful to devise a more systematic approach to the design of adaptive regularizers enjoying a given set of desired properties. This would help obtaining more examples of adaptation mechanisms that are not based on second-order information.

Acknowledgments

The third author gratefully acknowledges partial support by the PASCAL2 Network of Excellence under EC grant no. This publication only reflects the authors' views. The second author gratefully acknowledges partial support by an Israel Science Foundation grant ISF-1567/10.

Technical lemmas

Proof of Lemma 2 The Fenchel conjugate of f is f*(θ) = sup_v (vᵀθ − f(v)). Set w equal to the gradient of (1/(2(p−1)))(Σ_i b_i^{1−p}|θ_i|^p)^{2/p} with respect to θ. Easy calculations show that

    wᵀθ − f(w) = (1/(2(p−1))) (Σ_{i=1}^d b_i^{1−p}|θ_i|^p)^{2/p}.

We now show that this quantity is indeed sup_v (vᵀθ − f(v)). Pick any v ∈ R^d. Applying Hölder's inequality to the vectors (v_1 b_1^{1/q}, …, v_d b_d^{1/q}) and (θ_1 b_1^{−1/q}, …, θ_d b_d^{−1/q}) we get

    vᵀθ ≤ (Σ_i b_i|v_i|^q)^{1/q} (Σ_i b_i^{−p/q}|θ_i|^p)^{1/p} = (Σ_i b_i|v_i|^q)^{1/q} (Σ_i b_i^{1−p}|θ_i|^p)^{1/p}.

Hence

    vᵀθ − f(v) ≤ (Σ_i b_i|v_i|^q)^{1/q} (Σ_i b_i^{1−p}|θ_i|^p)^{1/p} − (1/(2(q−1))) (Σ_i b_i|v_i|^q)^{2/q}.

The right-hand side is a quadratic function of (Σ_i b_i|v_i|^q)^{1/q}. If we maximize it, we obtain

    vᵀθ − f(v) ≤ ((q−1)/2) (Σ_i b_i^{1−p}|θ_i|^p)^{2/p} = (1/(2(p−1))) (Σ_i b_i^{1−p}|θ_i|^p)^{2/p},
which concludes the proof for f*.

In order to show the second part, we follow Lemma 17 of Shalev-Shwartz (2007) and prove that (Σ_i b_i|x_i|^q)^{2/q} ≤ xᵀ∇²f(w)x. Define Ψ(a) = a^{2/q}/(2(q−1)) and φ(a) = |a|^q, hence f(w) = Ψ(Σ_i b_i φ(w_i)). Clearly Ψ'(a) = a^{2/q − 1}/(q(q−1)) and Ψ''(a) = (2 − q) a^{2/q − 2}/(q²(q−1)). Moreover, φ'(a) = q sign(a)|a|^{q−1} and φ''(a) = q(q−1)|a|^{q−2}. The (i, j) element of ∇²f(w), for i ≠ j, is

    Ψ''(Σ_k b_k φ(w_k)) b_i b_j φ'(w_i) φ'(w_j),

and the diagonal elements of ∇²f(w) are

    Ψ''(Σ_k b_k φ(w_k)) b_i² φ'(w_i)² + Ψ'(Σ_k b_k φ(w_k)) b_i φ''(w_i).

Thus we have

    xᵀ∇²f(w)x = Ψ''(Σ_k b_k φ(w_k)) (Σ_i b_i x_i φ'(w_i))² + Ψ'(Σ_k b_k φ(w_k)) Σ_i b_i x_i² φ''(w_i).

The first term is non-negative since q ≤ 2. Writing the second term explicitly, we have

    xᵀ∇²f(w)x ≥ (Σ_k b_k|w_k|^q)^{2/q − 1} Σ_i b_i x_i² |w_i|^{q−2}.

We now lower bound this quantity using Hölder's inequality:

    Σ_i b_i|x_i|^q = Σ_i ( b_i^{q/2} |x_i|^q |w_i|^{q(q−2)/2} ) ( b_i^{1−q/2} |w_i|^{q(2−q)/2} )
                 ≤ ( Σ_i b_i x_i² |w_i|^{q−2} )^{q/2} ( Σ_i b_i |w_i|^q )^{(2−q)/2},

hence

    Σ_i b_i x_i² |w_i|^{q−2} ≥ (Σ_i b_i|x_i|^q)^{2/q} / (Σ_i b_i|w_i|^q)^{(2−q)/q}.
Combining the last two displays, and using 2/q − 1 = (2 − q)/q, we just showed that

    xᵀ∇²f(w)x ≥ (Σ_k b_k|w_k|^q)^{2/q − 1} (Σ_i b_i|x_i|^q)^{2/q} / (Σ_k b_k|w_k|^q)^{(2−q)/q} = (Σ_i b_i|x_i|^q)^{2/q}.

This concludes the proof of the 1-strict convexity of f.

We now prove that the dual norm of (Σ_i b_i|x_i|^q)^{1/q} is (Σ_i b_i^{1−p}|θ_i|^p)^{1/p}. By definition of dual norm,

    sup_x { uᵀx : (Σ_i b_i|x_i|^q)^{1/q} ≤ 1 } = sup_y { Σ_i u_i b_i^{−1/q} y_i : ‖y‖_q ≤ 1 } = ‖(u_1 b_1^{−1/q}, …, u_d b_d^{−1/q})‖_p,

where 1/q + 1/p = 1. Writing the last norm explicitly and observing that p = q/(q−1),

    (Σ_i |u_i|^p b_i^{−p/q})^{1/p} = (Σ_i |u_i|^p b_i^{1−p})^{1/p},

which concludes the proof.

Lemma 5 Assume OMD is run with functions f_1, f_2, … defined on X and such that each f_t is β_t-strongly convex with respect to the norm ‖·‖_{f_t} and f_t(λu) ≤ λ² f_t(u) for all λ ∈ R and all u ∈ S. Assume further the input sequence is z_t = −η_t ℓ'_t for some η_t > 0, where ℓ'_t ∈ ∂ℓ_t(w_t), ℓ_t(w_t) = 0 implies ℓ'_t = 0, and ℓ_t = ℓ(·, x_t, y_t) satisfies (8). Then, for all T ≥ 1,

    Σ_{t=1}^T η_t ≤ L_η(u) + λ f_T(u) + (1/λ) Σ_{t=1}^T ( B_t + η_t²‖ℓ'_t‖_{f_t,*}²/(2β_t) − η_t⟨w_t, ℓ'_t⟩ )   (14)
for any u ∈ S and any λ > 0, where L_η(u) = Σ_{t=1}^T η_t ℓ_t(u) and B_t = f_t*(θ_t) − f_{t−1}*(θ_t). In particular, choosing the optimal λ, we obtain

    Σ_{t=1}^T η_t ≤ L_η(u) + 2 √( f_T(u) Σ_{t=1}^T ( B_t + η_t²‖ℓ'_t‖_{f_t,*}²/(2β_t) − η_t⟨w_t, ℓ'_t⟩ ) ).   (15)

Proof We apply Lemma 1 with z_t = −η_t ℓ'_t, and with λu in place of u for any λ > 0:

    −Σ_t η_t ⟨ℓ'_t, λu − w_t⟩ ≤ λ² f_T(u) + Σ_t ( η_t²‖ℓ'_t‖_{f_t,*}²/(2β_t) + f_t*(θ_t) − f_{t−1}*(θ_t) ).

Since ℓ_t(w_t) = 0 implies ℓ'_t = 0, and using (8),

    λ Σ_t η_t − λ Σ_t η_t ℓ_t(u) + Σ_t η_t ⟨ℓ'_t, w_t⟩ ≤ −Σ_t η_t ⟨ℓ'_t, λu − w_t⟩.

Dividing by λ and rearranging gives the first bound. The second bound is obtained by choosing the λ that makes equal the last two terms on the right-hand side of (14).

Lemma 6 For all x_1, …, x_T ∈ R^d let D_t = diag{A_t}, where A_0 = I and A_t = A_{t−1} + (1/r) x_t x_tᵀ for some r > 0. Then

    Σ_{t=1}^T x_tᵀD_t^{-1}x_t ≤ r Σ_{i=1}^d ln( (1/r) Σ_{t=1}^T x_{t,i}² + 1 ).

Proof Consider a sequence a_t ≥ 0 and define v_t = a_0 + Σ_{s=1}^t a_s with a_0 > 0. The concavity of the logarithm implies ln b ≤ ln a + (b − a)/a for all a, b > 0. Hence

    Σ_{t=1}^T a_t/v_t = Σ_{t=1}^T (v_t − v_{t−1})/v_t ≤ Σ_{t=1}^T ln(v_t/v_{t−1}) = ln(v_T/v_0) = ln( (a_0 + Σ_{t=1}^T a_t)/a_0 ).

Using the above and the definition of D_t, we obtain

    Σ_t x_tᵀD_t^{-1}x_t = Σ_t Σ_{i=1}^d x_{t,i}²/(1 + (1/r)Σ_{s=1}^t x_{s,i}²)
                      = r Σ_{i=1}^d Σ_t (x_{t,i}²/r)/(1 + (1/r)Σ_{s=1}^t x_{s,i}²)
                      ≤ r Σ_{i=1}^d ln( (1/r) Σ_{t=1}^T x_{t,i}² + 1 ).

We conclude the appendix by proving the results required to solve the implicit logarithmic equations of Section 6.3. We use the following fact of Orabona et al. (2012).
Lemma 7 Let a, x > 0 be such that x ≤ a ln x. Then, for all n > 1,

    x ≤ (n/(n−1)) a ln(na/e).

Corollary 2 For all a, b, c, d, x > 0 such that x ≤ a ln(bx + c) + d, we have

    x ≤ (n/(n−1)) a ln(nab/e) + (n/(n−1)) (d + c/b).

Corollary 3 For all a, b, c, d, x > 0 such that

    x ≤ √( a ln(bx + c) + d ),   (16)

we have

    x ≤ √( 2a ln( 8ab²/e + 2b²c + 2db² ) ) + √(2(c + d)).

Proof Assumption (16) implies

    x² ≤ a ln(bx + c) + d ≤ a ln( b(x + √(c + d)) + c ) + d.   (17)

Bounding the linear term inside the logarithm by a quadratic one, applying the n = 2 case of Corollary 2 to the resulting inequality in x², and repeatedly using the elementary inequalities ln y ≤ y/e and √(a + b) ≤ √a + √b yields the claim, concluding the proof.
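The telescoping inequality at the heart of the proof of Lemma 6, namely Σ_t a_t/v_t ≤ ln(v_T/a_0), can be sanity-checked numerically; the sequence below is arbitrary.

```python
# Numeric check of the telescoping inequality used in the proof of Lemma 6:
# with v_t = a_0 + a_1 + ... + a_t, one has sum_t a_t / v_t <= ln(v_T / a_0),
# because a_t / v_t <= ln(v_t) - ln(v_{t-1}) by concavity of the logarithm.
import math, random

random.seed(1)
a0 = 0.5
a = [random.uniform(0, 3) for _ in range(50)]

v, lhs = a0, 0.0
for at in a:
    v += at            # v_t = a_0 + sum of a_s up to t
    lhs += at / v      # accumulate a_t / v_t
rhs = math.log(v / a0)
gap = rhs - lhs        # should be non-negative
```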
References

K. S. Azoury and M. K. Warmuth. Relative loss bounds for on-line density estimation with the exponential family of distributions. Machine Learning, 43(3):211–246, 2001.

N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.

N. Cesa-Bianchi, A. Conconi, and C. Gentile. A second-order Perceptron algorithm. SIAM Journal on Computing, 34(3):640–668, 2005.

K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer. Online passive-aggressive algorithms. Journal of Machine Learning Research, 7, 2006.

K. Crammer, M. Dredze, and F. Pereira. Exact convex confidence-weighted learning. Advances in Neural Information Processing Systems, 21:345–352, 2009.

K. Crammer, A. Kulesza, and M. Dredze. Adaptive regularization of weight vectors. Advances in Neural Information Processing Systems, 22:414–422, 2009.

M. Dredze, K. Crammer, and F. Pereira. Online confidence-weighted learning. In Proceedings of the 25th International Conference on Machine Learning, 2008.

J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, 2011.

J. Duchi, S. Shalev-Shwartz, Y. Singer, and A. Tewari. Composite objective mirror descent. In Proceedings of the 23rd Annual Conference on Learning Theory, pages 14–26, 2010.

Y. Freund and R. E. Schapire. Large margin classification using the Perceptron algorithm. Machine Learning, pages 277–296, 1999.

C. Gentile. The robustness of the p-norm algorithms. Machine Learning, 53(3):265–299, 2003.

E. Grave, G. Obozinski, and F. R. Bach. Trace Lasso: a trace norm regularization for correlated designs. Advances in Neural Information Processing Systems, 24, 2011.

S. M. Kakade, S. Shalev-Shwartz, and A. Tewari. Regularization techniques for learning with matrices. CoRR, 2009.

J. Kivinen and M. K. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 132(1):1–63, 1997.

J. Kivinen and M. K. Warmuth. Relative loss bounds for multidimensional regression problems. Machine Learning, 45(3):301–329, 2001.

J. Kivinen, M. K. Warmuth, and B. Hassibi. The p-norm generalization of the LMS algorithm for adaptive filtering. IEEE Transactions on Signal Processing, 54(5), 2006.
N. Littlestone. Learning quickly when irrelevant attributes abound: a new linear-threshold algorithm. Machine Learning, 2(4):285–318, 1988.

F. Orabona, N. Cesa-Bianchi, and C. Gentile. Beyond logarithmic bounds in online learning. In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics, JMLR W&CP, 2012.

R. T. Rockafellar. Convex Analysis. Princeton University Press, 1970.

S. Shalev-Shwartz. Online Learning: Theory, Algorithms, and Applications. PhD thesis, The Hebrew University, 2007.

S. Shalev-Shwartz. Online learning and online convex optimization. Foundations and Trends in Machine Learning, 4(2), 2012.

S. Shalev-Shwartz and S. M. Kakade. Mind the duality gap: Logarithmic regret algorithms for online optimization. Advances in Neural Information Processing Systems, 21, 2009.

S. Shalev-Shwartz and Y. Singer. A primal-dual perspective of online learning algorithms. Machine Learning Journal, 2007.

Y. Tsypkin. Adaptation and Learning in Automatic Systems. Academic Press, 1971.

V. Vovk. Competitive on-line statistics. International Statistical Review, 69:213–248, 2001.

M. K. Warmuth and A. K. Jagota. Continuous and discrete-time nonlinear gradient descent: Relative loss bounds and convergence. In Electronic Proceedings of the 5th International Symposium on Artificial Intelligence and Mathematics, 1997.

L. Xiao. Dual averaging methods for regularized stochastic learning and online optimization. Journal of Machine Learning Research, 11, 2010.
More information