
Machine Learning, 43, 265–291, 2001.

Drifting Games

Robert E. Schapire
AT&T Labs Research, Shannon Laboratory, 180 Park Avenue, Room A279, Florham Park, NJ

Abstract. We introduce and study a general, abstract game played between two players called the shepherd and the adversary. The game is played in a series of rounds using a finite set of "chips" which are moved about in $\mathbb{R}^n$. On each round, the shepherd assigns a desired direction of movement and an importance weight to each of the chips. The adversary then moves the chips in any way that need only be weakly correlated with the desired directions assigned by the shepherd. The shepherd's goal is to cause the chips to be moved to low-loss positions, where the loss of each chip at its final position is measured by a given loss function. We present a shepherd algorithm for this game and prove an upper bound on its performance. We also prove a lower bound showing that the algorithm is essentially optimal for a large number of chips. We discuss computational methods for efficiently implementing our algorithm. We show that our general drifting-game algorithm subsumes some well studied boosting and on-line learning algorithms whose analyses follow as easy corollaries of our general result.

Keywords: boosting, on-line learning algorithms

1. Introduction

We introduce a general, abstract game played between two players called the shepherd and the adversary.* The game is played in a series of rounds using a finite set of "chips" which are moved about in $\mathbb{R}^n$. On each round, the shepherd assigns a desired direction of movement to each of the chips, as well as a nonnegative weight measuring the relative importance that each chip be moved in the desired direction. In response, the adversary moves each chip however it wishes, so long as the relative movements of the chips projected in the directions chosen by the shepherd are at least $\delta$, on average. Here, the average is taken with respect to the importance weights that were selected by the shepherd, and $\delta \geq 0$ is a given parameter of the game. Since we think of $\delta$ as a small number, the adversary need move the chips in a fashion that is only weakly correlated with the directions desired by the shepherd. The adversary is also restricted to choose relative movements for the chips from a given set $B \subseteq \mathbb{R}^n$.

* In an earlier version of this paper, the "shepherd" was called the "drifter," a term that was found by some readers to be confusing. The name of the main algorithm has also been changed from "Shepherd" to "OS."

© 2001 Kluwer Academic Publishers. Printed in the Netherlands.

The goal of the shepherd is to force the chips to be moved to low-loss positions, where the loss of each chip at its final position is measured by a given loss function $L$. A more formal description of the game is given in Section 2.

We present in Section 4 a new algorithm called "OS" for playing this game in the role of the shepherd, and we analyze the algorithm's performance for any parameterization of the game meeting certain natural conditions. Under the same conditions, we also prove in Section 5 that our algorithm is the best possible when the number of chips becomes large.

As spelled out in Section 3, the drifting game is closely related to boosting, the problem of finding a highly accurate classification rule by combining many weak classifiers or hypotheses. The drifting game and its analysis are generalizations of Freund's (1995) "majority-vote game" which was used to derive his boost-by-majority algorithm. This latter algorithm is optimal in a certain sense for boosting binary problems using weak hypotheses which are restricted to making binary predictions. However, the boost-by-majority algorithm has never been generalized to multiclass problems, nor to a setting in which weak hypotheses may "abstain" or give graded predictions between two classes. The general drifting game that we study leads immediately to new boosting algorithms for these settings. By our result on the optimality of the OS algorithm, these new boosting algorithms are also best possible, assuming as we do in this paper that the final hypothesis is restricted in form to a simple majority vote. We do not know if the derived algorithms are optimal without this restriction.

In Section 6, we discuss computational methods for implementing the OS algorithm. We give a useful theorem for handling games in which the loss function enjoys certain monotonicity properties. We also give a more general technique using linear programming for implementing OS in many settings, including the drifting game that corresponds to multiclass boosting. In this latter case, the algorithm runs in polynomial time when the number of classes is held constant.

In Section 7, we discuss the analysis of several drifting games corresponding to previously studied learning problems. For the drifting games corresponding to binary boosting with or without abstaining weak hypotheses, we show how to implement the algorithm efficiently. We also show that there are parameterizations of the drifting game under which OS is equivalent to a simplified version of the AdaBoost algorithm (Freund and Schapire, 1997; Schapire and Singer, 1999), as well as Cesa-Bianchi et al.'s (1996) BW algorithm and Littlestone and Warmuth's (1994) weighted majority algorithm for combining the advice of experts in an on-line learning setting. Analyses of these algorithms follow as easy corollaries of the analysis we give for general drifting games.

parameters:
    number of rounds $T$
    dimension of space $n$
    set $B \subseteq \mathbb{R}^n$ of permitted relative movements
    norm $\ell_p$ where $p \geq 1$
    minimum average drift $\delta \geq 0$
    loss function $L : \mathbb{R}^n \to \mathbb{R}$
    number of chips $m$

for $t = 1, \ldots, T$:
    the shepherd chooses a weight vector $w_t^i \in \mathbb{R}^n$ for each chip $i$
    the adversary chooses a drift vector $z_t^i \in B$ for each chip $i$ so that
        $\sum_{i=1}^m w_t^i \cdot z_t^i \geq \delta \sum_{i=1}^m \|w_t^i\|_p$

the final loss suffered by the shepherd is $\frac{1}{m} \sum_{i=1}^m L\left(\sum_{t=1}^T z_t^i\right)$

Figure 1. The drifting game.

2. Drifting games

We begin with a formal description of the drifting game; an outline of the game is shown in Fig. 1. There are two players in the game called the shepherd and the adversary. The game is played in $T$ rounds using $m$ chips. On each round $t$, the shepherd specifies a weight vector $w_t^i \in \mathbb{R}^n$ for each chip $i$. The direction of this vector, $v_t^i = w_t^i / \|w_t^i\|_p$, specifies a desired direction of drift, while the length of the vector $\|w_t^i\|_p$ specifies the relative importance of moving the chip in the desired direction. In response, the adversary chooses a drift vector $z_t^i$ for each chip $i$. The adversary is constrained to choose each $z_t^i$ from a fixed set $B \subseteq \mathbb{R}^n$. Moreover, the $z_t^i$'s must satisfy

$$\sum_i w_t^i \cdot z_t^i \geq \delta \sum_i \|w_t^i\|_p \tag{1}$$

or equivalently

$$\frac{\sum_i \|w_t^i\|_p \, v_t^i \cdot z_t^i}{\sum_i \|w_t^i\|_p} \geq \delta \tag{2}$$

where $\delta \geq 0$ is a fixed parameter of the game. (Here and throughout the paper, when clear from context, $\sum_i$ denotes $\sum_{i=1}^m$; likewise, we will shortly use the notation $\sum_t$ for $\sum_{t=1}^T$.) In words, $v_t^i \cdot z_t^i$ is the amount by which chip $i$ has moved in the desired direction. Thus, the left hand side of Eq. (2) represents a weighted average of the drifts of the chips projected in the desired directions, where chip $i$'s projected drift is weighted by $\|w_t^i\|_p / \sum_{i'} \|w_t^{i'}\|_p$. We require that this average projected drift be at least $\delta$.

The position of chip $i$ at time $t$, denoted by $s_t^i$, is simply the sum of the drifts of that chip up to that point in time. Thus, $s_1^i = 0$ and $s_{t+1}^i = s_t^i + z_t^i$. The final position of chip $i$ at the end of the game is $s_{T+1}^i$.

At the end of $T$ rounds, we measure the shepherd's performance using a function $L$ of the final positions of the chips; this function is called the loss function. Specifically, the shepherd's goal is to minimize

$$\frac{1}{m} \sum_{i=1}^m L(s_{T+1}^i).$$

Summarizing, we see that a game is specified by several parameters: the number of rounds $T$; the dimension $n$ of the space; a norm $\|\cdot\|_p$ on $\mathbb{R}^n$; a set $B \subseteq \mathbb{R}^n$; a minimum drift constant $\delta \geq 0$; a loss function $L$; and the number of chips $m$. Since the lengths of weight vectors $w$ are measured using an $\ell_p$-norm, it is natural to measure drift vectors $z$ using the dual $\ell_q$-norm, where $1/p + 1/q = 1$. When clear from context, we will generally drop $p$ and $q$ subscripts and write simply $\|w\|$ or $\|z\|$.

As an example of a drifting game, suppose that the game is played on the real line and that the shepherd's goal is to get as many chips as possible into the interval $[2, 7]$. Suppose further that the adversary is constrained to move each chip left or right by one unit, and that, on each round, the chips moved in the shepherd's desired direction must outweigh those moved against it by 10% (as weighted by the shepherd's chosen distribution over chips). Then for this game, $n = 1$, $B = \{-1, +1\}$ and $\delta = 0.1$. Any norm will do (since we are working in just one dimension), and the loss function is

$$L(s) = \begin{cases} 0 & \text{if } s \in [2, 7] \\ 1 & \text{otherwise.} \end{cases}$$

We will return to this example later in the paper.

Drifting games bear a certain resemblance to the kind of games studied in Blackwell's (1956) celebrated approachability theory. However, it is unclear what the exact relationship is between these two types of games and whether one type is a special case of the other.
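Returning to the interval example above: the following is a minimal sketch (ours, not from the paper) of this game in Python. The shepherd rule used here is a naive placeholder rather than the OS algorithm of Section 4, and the adversary flips the cheapest chips against the shepherd while respecting the drift constraint (1).

```python
import numpy as np

# A minimal sketch of the one-dimensional example game: B = {-1,+1},
# delta = 0.1, T = 20, and loss 1 outside the goal interval [2, 7].
# The shepherd rule below is a naive placeholder, not the OS algorithm.

T, m, delta = 20, 1000, 0.1

def loss(s):
    return 0.0 if 2 <= s <= 7 else 1.0

def adversary(w):
    """Pick z in {-1,+1}^m minimizing cooperation subject to Eq. (1):
    start fully compliant, then flip chips (cheapest first) while
    keeping  w . z >= delta * ||w||_1."""
    v = np.where(w >= 0, 1.0, -1.0)        # desired directions
    z = v.copy()                           # now w . z = ||w||_1
    slack = (1.0 - delta) * np.abs(w).sum()
    for i in np.argsort(np.abs(w)):        # flipping chip i costs 2|w_i|
        if 2 * abs(w[i]) <= slack:
            z[i] = -v[i]
            slack -= 2 * abs(w[i])
    return z

s = np.zeros(m)                            # all chips start at the origin
for t in range(T):
    w = np.where(s < 4.5, 1.0, -1.0)       # push chips toward the interval
    s += adversary(w)

print("final loss:", np.mean([loss(x) for x in s]))
```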

3. Relation to boosting

In this section, we describe how the general game of drift relates directly to boosting. In the simplest boosting model, there is a boosting algorithm that has access to a weak learning algorithm that it calls in a series of rounds. There are $m$ given labeled examples $(x_1, y_1), \ldots, (x_m, y_m)$ where $x_i \in X$ and $y_i \in \{-1, +1\}$. On each round $t$, the booster chooses a distribution $D_t(i)$ over the examples. The weak learner then must generate a weak hypothesis $h_t : X \to \{-1, +1\}$ whose error is at most $1/2 - \gamma$ with respect to distribution $D_t$. That is,

$$\Pr_{i \sim D_t}[y_i \neq h_t(x_i)] \leq \tfrac{1}{2} - \gamma. \tag{3}$$

Here, $\gamma > 0$ is known a priori to both the booster and the weak learner. After $T$ rounds, the booster outputs a final hypothesis which we here assume is a majority vote of the weak hypotheses:

$$H(x) = \mathrm{sign}\left(\sum_t h_t(x)\right). \tag{4}$$

For our purposes, the goal of the booster is to minimize the fraction of errors of the final hypothesis on the given set of examples:*

$$\frac{1}{m} \left|\{i : y_i \neq H(x_i)\}\right|. \tag{5}$$

We can recast boosting as just described as a special-case drifting game; a similar game, called the "majority-vote game," was studied by Freund (1995) for this case. The chips are identified with examples, and the game is one-dimensional so that $n = 1$. The drift of a chip $z_t^i$ is $+1$ if example $i$ is correctly classified by $h_t$ and $-1$ otherwise; that is, $z_t^i = y_i h_t(x_i)$ and $B = \{-1, +1\}$. The weight $w_t^i$ is formally permitted to be negative, something that does not make sense in the boosting setting; however, for the optimal shepherd described in the next section, this weight will always be nonnegative for this game (by Theorem 7), so we henceforth assume that $w_t^i \geq 0$. The distribution $D_t(i)$ corresponds to $w_t^i / \sum_{i'} w_t^{i'}$. Then the condition in Eq. (3) is equivalent to

$$\sum_i \left[\frac{w_t^i}{\sum_{i'} w_t^{i'}}\right] z_t^i \geq 2\gamma$$

or

$$\sum_i w_t^i z_t^i \geq 2\gamma \sum_i w_t^i. \tag{6}$$

* Of course, the real goal of a boosting algorithm is to find a hypothesis with low generalization error. In this paper, we only focus on the simplified problem of minimizing error on the given training examples.

This is the same as Eq. (1) if we let $\delta = 2\gamma$. Finally, if we define the loss function to be

$$L(s) = \begin{cases} 1 & \text{if } s \leq 0 \\ 0 & \text{if } s > 0 \end{cases} \tag{7}$$

then

$$\frac{1}{m} \sum_i L(s_{T+1}^i) \tag{8}$$

is exactly equal to Eq. (5). Our main result on playing drifting games yields in this case exactly Freund's boost-by-majority algorithm (1995).

There are numerous variants of this basic boosting setting to which Freund's algorithm has never been generalized and analyzed. For instance, we have so far required weak hypotheses to output values in $\{-1, +1\}$. It is natural to generalize this model to allow weak hypotheses to take values in $\{-1, 0, +1\}$ so that the weak hypotheses may "abstain" on some examples, or to take values in $[-1, +1]$ so that a whole range of values are possible. These correspond to simple modifications of the drifting game described above in which we simply change $B$ to $\{-1, 0, +1\}$ or $[-1, +1]$. As before, we require that Eq. (6) hold for all weak hypotheses, and we attempt to design a boosting algorithm which minimizes Eq. (8). For both of these cases, we are able to derive analogs of the boost-by-majority algorithm which we prove are optimal in a particular sense.

Another direction for generalization is to the non-binary multiclass case in which labels $y_i$ belong to a set $Y = \{1, \ldots, n\}$, $n > 2$. Following generalizations of the boosting algorithm AdaBoost to the multiclass case (Freund and Schapire, 1997; Schapire and Singer, 1999), we allow the booster to assign weights both to examples and labels. That is, on each round, the booster devises a distribution $D_t(i, \ell)$ over examples $i$ and labels $\ell \in Y$. The weak learner then computes a weak hypothesis $h_t : X \times Y \to \{-1, +1\}$ which must be correct on a non-trivial fraction of the example-label pairs. That is, if we define

$$y_i(\ell) = \begin{cases} +1 & \text{if } y_i = \ell \\ -1 & \text{otherwise} \end{cases}$$

then we require

$$\Pr_{(i,\ell) \sim D_t}[h_t(x_i, \ell) \neq y_i(\ell)] \leq \tfrac{1}{2} - \gamma. \tag{9}$$

The final hypothesis, we assume, is again a plurality vote of the weak hypotheses:

$$H(x) = \arg\max_{y \in Y} \sum_t h_t(x, y). \tag{10}$$

We can cast this multiclass boosting problem as a drifting game as follows. We have $n$ dimensions, one per class. It will be convenient for the first dimension always to correspond to the correct label, with the remaining $n - 1$ dimensions corresponding to incorrect labels. To do this, let us define a map $\sigma_\ell : \mathbb{R}^n \to \mathbb{R}^n$ which simply swaps coordinates $1$ and $\ell$, leaving the other coordinates untouched. The weight vectors $w_t^i$ correspond to the distribution $D_t$, modulo swapping of coordinates, a correction of sign, and normalization:

$$D_t(i, \ell) = \frac{y_i(\ell)\,(\sigma_{y_i}(w_t^i))_\ell}{\sum_{i'} \|w_t^{i'}\|_1}.$$

The norm used here to measure weight vectors is the $\ell_1$-norm. Also, it will follow from Theorem 7 that, for optimal play of this game, the first coordinate of $w_t^i$ is always nonnegative and all other coordinates are nonpositive. The drift vectors $z_t^i$ are derived as before from the weak hypotheses:

$$z_t^i = \sigma_{y_i}(\langle h_t(x_i, 1), \ldots, h_t(x_i, n) \rangle).$$

It can be verified that the condition in Eq. (9) is equivalent to Eq. (1) with $\delta = 2\gamma$. For binary weak hypotheses, $B = \{-1, +1\}^n$. The final hypothesis $H$ makes a mistake on example $(x, y)$ if and only if

$$\sum_t h_t(x, y) \leq \max_{\ell : \ell \neq y} \sum_t h_t(x, \ell).$$

Therefore, we can count the fraction of mistakes of the final hypothesis in the drifting game context as

$$\frac{1}{m} \sum_i L(s_{T+1}^i)$$

where

$$L(s) = \begin{cases} 1 & \text{if } s_1 \leq \max\{s_2, \ldots, s_n\} \\ 0 & \text{otherwise.} \end{cases} \tag{11}$$

Thus, by giving an algorithm for the general drifting game, we also obtain a generalization of the boost-by-majority algorithm for multiclass problems. The algorithm can be implemented in this case in polynomial time for a constant number of classes $n$, and the algorithm is provably best possible in a particular sense.

We note also that a simplified form of the AdaBoost algorithm (Freund and Schapire, 1997; Schapire and Singer, 1999) can be derived as an instance of the OS algorithm simply by changing the loss function $L$ in Eq. (7) to an exponential $L(s) = \exp(-\eta s)$ for some $\eta > 0$. More details on this game are given in Section 7.2.
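As a concrete illustration of this reduction (ours, not from the paper), the swap map, the drift vectors, and the loss of Eq. (11) can be coded directly; `h_vals` below is a hypothetical array of weak-hypothesis values.

```python
import numpy as np

# Sketch of the multiclass reduction of Section 3.  Labels are 1..n; the
# swap map sigma_ell exchanges coordinates 1 and ell (indices 0 and ell-1
# below).  `h_vals` is a hypothetical length-n array with entries
# h_t(x_i, 1), ..., h_t(x_i, n) in {-1, +1}.

def sigma(vec, ell):
    out = np.array(vec, dtype=float)
    out[[0, ell - 1]] = out[[ell - 1, 0]]
    return out

def drift_vector(h_vals, y_i):
    # z_t^i = sigma_{y_i}(<h_t(x_i,1), ..., h_t(x_i,n)>): coordinate 1 of
    # the chip tracks votes for the correct label, the rest wrong labels.
    return sigma(h_vals, y_i)

def multiclass_loss(s):
    # Eq. (11): the final plurality vote errs iff s_1 <= max(s_2,...,s_n).
    return 1.0 if s[0] <= np.max(s[1:]) else 0.0
```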

Besides boosting problems, the drifting game also generalizes the problem of learning on-line with a set of "experts" (Cesa-Bianchi et al., 1997; Littlestone and Warmuth, 1994). In particular, the BW algorithm of Cesa-Bianchi et al. (1996) and the weighted majority algorithm of Littlestone and Warmuth (1994) can be derived as special cases of our main algorithm for a particular natural parameterization of the drifting game. Details are given in Section 7.3.

4. The algorithm and its analysis

We next describe our algorithm for playing the general drifting game of Section 2. Like Freund's boost-by-majority algorithm (1995), the algorithm we present here uses a "potential function" which is central both to the workings of the algorithm and its analysis. This function can be thought of as a "guess" of the loss that we expect to suffer for a chip at a particular position and at a particular point in time. We denote the potential of a chip at position $s$ on round $t$ by $\phi_t(s)$. The final potential is the actual loss, so that $\phi_T = L$. The potential functions $\phi_t$ for earlier time steps are defined inductively:

$$\phi_{t-1}(s) = \min_{w \in \mathbb{R}^n} \sup_{z \in B} \left( \phi_t(s + z) + w \cdot z - \delta \|w\|_p \right). \tag{12}$$

We will show later that, under natural conditions, the minimum above actually exists. Moreover, the minimizing vector $w$ is the one used by the shepherd for the algorithm we now present. We call our shepherd algorithm "OS" for "optimal shepherd." The weight vector $w_t^i$ chosen by OS for chip $i$ is any vector $w$ which minimizes

$$\sup_{z \in B} \left( \phi_t(s_t^i + z) + w \cdot z - \delta \|w\|_p \right).$$

Returning to the example at the end of Section 2, Fig. 2 shows the potential function $\phi_t$ and the weights that would be selected by OS as a function of the position of each chip for various choices of $t$. For this figure, $T = 20$.

We will need some natural assumptions to analyze this algorithm. The first assumption states merely that the allowed drift vectors in $B$ are bounded; for convenience, we assume they have norm at most one.

ASSUMPTION 1. $\sup_{z \in B} \|z\|_q \leq 1$.

We next assume that the loss function $L$ is bounded.

ASSUMPTION 2. There exist finite $L_{\min}$ and $L_{\max}$ such that $L_{\min} \leq L(s) \leq L_{\max}$ for all $s \in \mathbb{R}^n$.

[Figure 2 appears here: six panels plotting, for various choices of $t$, the potential function against chip position, together with the weights selected by OS.]

Figure 2. Plots of the potential function (top curve in each figure) and the weights selected by OS (bottom curves) as a function of the position of a chip in the example game at the end of Section 2, for various choices of $t$ and with $T = 20$. The vertical dotted lines show the boundary of the goal interval $[2, 7]$. Curves are only meaningful at integer values.

In fact, this assumption need only hold for all $s$ with $\|s\|_q \leq T$, since positions outside this range are never reached, given Assumption 1. Finally, we assume that, for any direction $v$, it is possible to choose a drift whose projection onto $v$ is more than $\delta$ by a constant amount.

ASSUMPTION 3. There exists a number $\nu > 0$ such that for all $w \in \mathbb{R}^n$ there exists $z \in B$ with $w \cdot z \geq (\delta + \nu)\|w\|$.
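Before turning to the analysis, a brute-force sketch (ours, not the paper's) may help fix ideas: for the one-dimensional example game, the recurrence (12) can be evaluated numerically by searching a bounded grid of candidate weights. Lemma 1 below shows that the true minimizer lies in a ball of radius $(L_{\max} - L_{\min})/\nu$, so a fine bounded grid approximates it.

```python
from functools import lru_cache

# Numerical sketch of the potential recurrence (Eq. 12) for the example
# game: B = {-1,+1}, delta = 0.1, T = 20, loss 1 outside [2, 7].  The
# minimizing w is searched on a grid; by Lemma 1 it lies in a ball of
# radius (L_max - L_min)/nu = 1/(1 - delta), so the grid below suffices
# up to discretization error.

T, delta = 20, 0.1
B = (-1, 1)
W_GRID = [i / 1000.0 for i in range(-1200, 1201)]

def L(s):
    return 0.0 if 2 <= s <= 7 else 1.0

@lru_cache(maxsize=None)
def phi(t, s):                      # s is an integer position
    if t == T:
        return L(s)
    return min(max(phi(t + 1, s + z) + w * z - delta * abs(w) for z in B)
               for w in W_GRID)

def os_weight(t, s):                # the weight OS would give a chip at s
    return min(W_GRID,
               key=lambda w: max(phi(t + 1, s + z) + w * z - delta * abs(w)
                                 for z in B))

print("loss bound phi_0(0) =", phi(0, 0))
print("OS weight at origin on round 1:", os_weight(0, 0))
```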

LEMMA 1. Given Assumptions 1, 2 and 3, for all $t = 0, \ldots, T$:

1. the minimum in Eq. (12) exists; and
2. $L_{\min} \leq \phi_t(s) \leq L_{\max}$ for all $s \in \mathbb{R}^n$.

Proof. By backwards induction on $t$. The base case is trivial. Let us fix $s$ and let $F(z) = \phi_t(s + z)$. Let

$$H(w) = \sup_{z \in B} \left( F(z) + w \cdot z - \delta\|w\| \right).$$

Using Assumption 1, for any $w$, $w'$:

$$|H(w') - H(w)| \leq \sup_{z \in B} \left| (F(z) + w' \cdot z - \delta\|w'\|) - (F(z) + w \cdot z - \delta\|w\|) \right| \leq \sup_{z \in B} |(w' - w) \cdot z| + \delta \big| \|w'\| - \|w\| \big| \leq (1 + \delta)\|w' - w\|.$$

Therefore, $H$ is continuous. Moreover, for $w \in \mathbb{R}^n$, by Assumptions 2 and 3 (as well as our inductive hypothesis),

$$H(w) \geq L_{\min} + (\delta + \nu)\|w\| - \delta\|w\| = L_{\min} + \nu\|w\|. \tag{13}$$

Since

$$H(0) \leq L_{\max}, \tag{14}$$

it follows that $H(w) > H(0)$ if $\|w\| > (L_{\max} - L_{\min})/\nu$. Thus, for computing the minimum of $H$, we only need consider points in the compact set

$$\left\{ w : \|w\| \leq \frac{L_{\max} - L_{\min}}{\nu} \right\}.$$

Since a continuous function over a compact set has a minimum, this proves Part 1. Part 2 follows immediately from Eqs. (13) and (14).

We next prove an upper bound on the loss suffered by a shepherd employing the OS algorithm against any adversary. This is the main result of this section. We will shortly see that this bound is essentially best possible for any algorithm. It is important to note that these theorems tell us much more than the almost obvious point that the optimal thing to do is whatever is best in a minimax sense. These theorems prove the nontrivial fact that (nearly) minimax behavior can be obtained without the simultaneous consideration of all of the chips at once. Rather, we can compute each weight vector $w_t^i$ merely as a function of the position of chip $i$, without consideration of the positions of any of the other chips.

THEOREM 2. Under the condition of Assumptions 1, 2 and 3, the final loss suffered by the OS algorithm against any adversary is at most $\phi_0(0)$, where the functions $\phi_t$ are defined above.

Proof. Following Freund's analysis (1995), we show that the total potential never increases. That is, we prove that, for each round $t$,

$$\sum_i \phi_t(s_{t+1}^i) \leq \sum_i \phi_{t-1}(s_t^i). \tag{15}$$

This implies, through repeated application of Eq. (15), that

$$\frac{1}{m} \sum_i L(s_{T+1}^i) = \frac{1}{m} \sum_i \phi_T(s_{T+1}^i) \leq \frac{1}{m} \sum_i \phi_0(s_1^i) = \phi_0(0)$$

as claimed. The definition of $\phi_{t-1}$ given in Eq. (12) implies that for $w_t^i$ chosen by the OS algorithm, and for all $z \in B$:

$$\phi_t(s_t^i + z) + w_t^i \cdot z - \delta\|w_t^i\| \leq \phi_{t-1}(s_t^i).$$

Therefore,

$$\sum_i \phi_t(s_{t+1}^i) = \sum_i \phi_t(s_t^i + z_t^i) \leq \sum_i \left( \phi_{t-1}(s_t^i) - w_t^i \cdot z_t^i + \delta\|w_t^i\| \right) \leq \sum_i \phi_{t-1}(s_t^i)$$

where the last inequality follows from Eq. (1).

Returning again to the example at the end of Section 2, Fig. 3 shows a plot of the bound $\phi_0(0)$ as a function of the total number of rounds $T$. It is rather curious that the bound is not monotonic in $T$ (even discounting the jagged nature of the curve caused by the difference between even and odd length games). Apparently, for this game, having more time to get the chips into the goal region can actually hurt the shepherd.

5. A lower bound

In this section, we prove that the OS algorithm is essentially optimal in the sense that, for any shepherd algorithm, there exists an adversary capable of forcing a loss matching the upper bound of Theorem 2 in the limit of a large number of chips. Specifically, we prove the following theorem, the main result of this section:

[Figure 3 appears here: a plot of the optimal loss against the number of rounds $T$.]

Figure 3. A plot of the loss bound $\phi_0(0)$ as a function of the total number of rounds $T$ for the example game at the end of Section 2. The jagged nature of the curve is due to the difference between a game with an odd or an even number of steps.

THEOREM 3. Let $A$ be any shepherd algorithm for playing a drifting game satisfying Assumptions 1, 2 and 3, where all parameters of the game are fixed, except the number of chips $m$. Let $\phi_t$ be as defined above. Then for any $\epsilon > 0$, there exists an adversary such that for $m$ sufficiently large, the loss suffered by algorithm $A$ is at least $\phi_0(0) - \epsilon$.

To prove the theorem, we will need two lemmas. The first gives an abstract result on computing a minimax of the kind appearing in Eq. (12). The second lemma uses the first to prove a characterization of $\phi_t$ in a form amenable to use in the proof of Theorem 3.

LEMMA 4. Let $S$ be any nonempty, bounded subset of $\mathbb{R}^2$. Let $C$ be the convex hull of $S$. Then

$$\inf_{\lambda \in \mathbb{R}} \sup\{y + \lambda x : (x, y) \in S\} = \sup\{y : (0, y) \in C\}.$$

Proof. Let $\bar{C}$ be the closure of $C$. First, for any $\lambda \in \mathbb{R}$,

$$\sup\{y + \lambda x : (x, y) \in S\} = \sup\{y + \lambda x : (x, y) \in C\} = \sup\{y + \lambda x : (x, y) \in \bar{C}\}. \tag{16}$$

The first equality follows from the fact that if $(x, y) \in C$ then $(x, y) = \sum_{j=1}^N p_j (x_j, y_j)$ for some positive integer $N$, $p_j \in [0, 1]$, $\sum_j p_j = 1$, and $(x_j, y_j) \in S$. But then

$$y + \lambda x = \sum_{j=1}^N p_j (y_j + \lambda x_j) \leq \max_j (y_j + \lambda x_j).$$

The second equality in Eq. (16) follows simply because the supremum of a continuous function on any set is equal to its supremum over the closure of the set. For this same reason,

$$\sup\{y : (0, y) \in C\} = \sup\{y : (0, y) \in \bar{C}\}. \tag{17}$$

Because $\bar{C}$ is closed, convex and bounded, and because the function $y + \lambda x$ is continuous, concave in $(x, y)$ and convex in $\lambda$, we can reverse the order of the "inf sup" (see, for instance, Rockafellar (1970)). That is,

$$\inf_{\lambda \in \mathbb{R}} \sup_{(x,y) \in \bar{C}} (y + \lambda x) = \sup_{(x,y) \in \bar{C}} \inf_{\lambda \in \mathbb{R}} (y + \lambda x). \tag{18}$$

Clearly, if $x \neq 0$ then

$$\inf_{\lambda \in \mathbb{R}} (y + \lambda x) = -\infty.$$

Thus, the right hand side of Eq. (18) is equal to

$$\sup\{y : (0, y) \in \bar{C}\}.$$

Combining with Eqs. (16) and (17) immediately gives the result.

LEMMA 5. Under the condition of Assumptions 1, 2 and 3, and for $\phi_t$ as defined above,

$$\phi_{t-1}(s) = \inf_{v : \|v\| = 1} \sup \sum_{j=1}^N d_j \, \phi_t(s + z_j)$$

where the supremum is taken over all positive integers $N$, all $z_1, \ldots, z_N \in B$ and all nonnegative $d_1, \ldots, d_N$ satisfying $\sum_j d_j = 1$ and

$$\sum_j d_j \, v \cdot z_j = \delta.$$

Proof. To simplify notation, let us fix $t$ and $s$. Let $F$ and $H$ be as defined in the proof of Lemma 1. For $\|v\| = 1$, let

$$G(v) = \sup \sum_{j=1}^N d_j F(z_j) \tag{19}$$

where again the supremum is taken over $d_j$'s and $z_j$'s as in the statement of the lemma. Note that by Assumption 3, this supremum cannot be vacuous. Throughout this proof, we use $v$ to denote a vector of norm one, while $w$ is a vector of unrestricted norm. Our goal is to show that

$$\inf_v G(v) = \inf_w H(w). \tag{20}$$

Let us fix $v$ momentarily. Let

$$S = \{(v \cdot z - \delta, \; F(z)) : z \in B\}.$$

Then $S$ is bounded by Assumptions 1, 2 and 3 (and part 2 of Lemma 1), so we can apply Lemma 4, which gives

$$\inf_{\lambda \in \mathbb{R}} \sup_{z \in B} \left( F(z) + \lambda(v \cdot z - \delta) \right) = G(v). \tag{21}$$

Note that

$$\inf_{\lambda \geq 0} H(\lambda v) = \inf_{\lambda \geq 0} \sup_{z \in B} \left( F(z) + \lambda v \cdot z - \delta\lambda \right) \geq \inf_{\lambda \in \mathbb{R}} \sup_{z \in B} \left( F(z) + \lambda v \cdot z - \delta\lambda \right) \geq \inf_{\lambda \in \mathbb{R}} \sup_{z \in B} \left( F(z) + \lambda v \cdot z - \delta|\lambda| \right) = \inf_{\lambda \in \mathbb{R}} H(\lambda v)$$

(where the second inequality uses $\delta\lambda \leq \delta|\lambda|$, and where we used $\|\lambda v\| = |\lambda|$ since $\|v\| = 1$). The middle expression is exactly the left hand side of Eq. (21), so combining with Eq. (21) gives

$$\inf_v \inf_{\lambda \geq 0} H(\lambda v) \geq \inf_v G(v) \geq \inf_v \inf_{\lambda \in \mathbb{R}} H(\lambda v).$$

Since the left and right terms are both equal to $\inf_w H(w)$, this implies Eq. (20) and completes the proof.

Proof of Theorem 3. We will show that, for $m$ sufficiently large, on each round $t$ the adversary can choose the $z_t^i$'s so that

$$\frac{1}{m} \sum_i \phi_t(s_{t+1}^i) \geq \frac{1}{m} \sum_i \phi_{t-1}(s_t^i) - \frac{\epsilon}{T}. \tag{22}$$

Repeatedly applying Eq. (22) implies that

$$\frac{1}{m} \sum_i L(s_{T+1}^i) = \frac{1}{m} \sum_i \phi_T(s_{T+1}^i) \geq \frac{1}{m} \sum_i \phi_0(s_1^i) - \epsilon = \phi_0(0) - \epsilon,$$

proving the theorem.

Fix $t$. We use a random construction to show that there exist $z_t^i$'s with the desired properties. For each weight vector $w_t^i$ chosen by the shepherd, let $d_{i1}, \ldots, d_{iN} \in [0, 1]$ and $z_{i1}, \ldots, z_{iN} \in B$ be such that $\sum_j d_{ij} = 1$,

$$\sum_j d_{ij} \, w_t^i \cdot z_{ij} = \delta \|w_t^i\|$$

and

$$\sum_j d_{ij} \, \phi_t(s_t^i + z_{ij}) \geq \phi_{t-1}(s_t^i) - \frac{\epsilon}{2T}.$$

Such $d_{ij}$'s and $z_{ij}$'s must exist by Lemma 5. Using Assumption 3, let $z_{i0}$ be such that

$$w_t^i \cdot z_{i0} \geq (\delta + \nu)\|w_t^i\|.$$

Finally, let $Z_i$ be a random variable that is $z_{i0}$ with probability $\mu$ and $z_{ij}$ with probability $(1 - \mu)d_{ij}$ (independent of the other $Z_{i'}$'s). Here,

$$\mu = \frac{\epsilon}{4T(L_{\max} - L_{\min})}.$$

Let $v_i = w_t^i / \|w_t^i\|$, and let $a_i = \|w_t^i\| / \sum_{i'} \|w_t^{i'}\|$. By Assumption 1, $|v_i \cdot Z_i| \leq 1$. Also,

$$\mathbb{E}[v_i \cdot Z_i] \geq (1 - \mu)\delta + \mu(\delta + \nu) = \delta + \mu\nu.$$

Thus, by Hoeffding's inequality (1963),

$$\Pr\left[ \sum_i a_i \, v_i \cdot Z_i < \delta \right] \leq \exp\left( -\frac{\mu^2\nu^2}{2\sum_i a_i^2} \right) \leq e^{-\mu^2\nu^2/2} \tag{23}$$

(the last inequality holds since $\sum_i a_i^2 \leq 1$). Let $S = (1/m) \sum_i \phi_t(s_t^i + Z_i)$. Then

$$\mathbb{E}[S] = \frac{1}{m} \sum_i \left[ (1 - \mu) \sum_j d_{ij} \, \phi_t(s_t^i + z_{ij}) + \mu \, \phi_t(s_t^i + z_{i0}) \right] \geq \frac{1}{m} \sum_i \left[ \phi_{t-1}(s_t^i) - \frac{\epsilon}{2T} + \mu\left( \phi_t(s_t^i + z_{i0}) - \phi_{t-1}(s_t^i) \right) \right] \geq \frac{1}{m} \sum_i \phi_{t-1}(s_t^i) - \frac{\epsilon}{2T} - \mu(L_{\max} - L_{\min}) = \frac{1}{m} \sum_i \phi_{t-1}(s_t^i) - \frac{3\epsilon}{4T}. \tag{24}$$

By Hoeffding's inequality (1963), since $L_{\min} \leq \phi_t(s_t^i + Z_i) \leq L_{\max}$,

$$\Pr\left[ S < \mathbb{E}[S] - \mu(L_{\max} - L_{\min}) \right] \leq e^{-2\mu^2 m}. \tag{25}$$

Now let $m$ be so large that

$$e^{-2\mu^2 m} + e^{-\mu^2\nu^2/2} < 1.$$

Then by Eqs. (23) and (25), there exists a choice of $z_t^i$'s such that

$$\sum_i w_t^i \cdot z_t^i = \left( \sum_{i'} \|w_t^{i'}\| \right) \sum_i a_i \, v_i \cdot z_t^i \geq \delta \sum_i \|w_t^i\|$$

and such that

$$\frac{1}{m} \sum_i \phi_t(s_{t+1}^i) = \frac{1}{m} \sum_i \phi_t(s_t^i + z_t^i) \geq \mathbb{E}[S] - \mu(L_{\max} - L_{\min}) \geq \frac{1}{m} \sum_i \phi_{t-1}(s_t^i) - \frac{\epsilon}{T}$$

by Eq. (24) and our choice of $\mu$.

6. Computational methods

In this section, we discuss general computational methods for implementing the OS algorithm.

6.1. Unate loss functions

We first note that, for loss functions $L$ with certain monotonicity properties, the quadrant in which the minimizing weight vectors are to be found can be determined a priori. This often simplifies the search for minima. To be more precise, for $\beta \in \{-1, +1\}^n$ and $x, y \in \mathbb{R}^n$, let us write $x \preceq_\beta y$ if $\beta_i x_i \leq \beta_i y_i$ for all $1 \leq i \leq n$. We say that a function $f : \mathbb{R}^n \to \mathbb{R}$ is unate with sign vector $\beta \in \{-1, +1\}^n$ if $f(x) \leq f(y)$ whenever $x \preceq_\beta y$.

LEMMA 6. If the loss function $L$ is unate with sign vector $\beta \in \{-1, +1\}^n$, then so is $\phi_t$ (as defined above) for $t = 0, \ldots, T$.

Proof. By backwards induction on $t$. The base case is immediate. Let $x \preceq_\beta y$. Then for any $z \in B$ and $w \in \mathbb{R}^n$, $x + z \preceq_\beta y + z$, and so

$$\phi_t(x + z) + w \cdot z - \delta\|w\| \leq \phi_t(y + z) + w \cdot z - \delta\|w\|$$

by the inductive hypothesis. Therefore, $\phi_{t-1}(x) \leq \phi_{t-1}(y)$, and so $\phi_{t-1}$ is also unate.

For the main theorem of this subsection, we need one more assumption:

ASSUMPTION 4. If $z \in B$ and if $z'$ is such that $|z'_i| = |z_i|$ for all $i$, then $z' \in B$.

THEOREM 7. Under the condition of Assumptions 1, 2, 3 and 4, if $L$ is unate with sign vector $\beta \in \{-1, +1\}^n$, then for any $s \in \mathbb{R}^n$, there is a vector $w$ which minimizes

$$\sup_{z \in B} \left( \phi_t(s + z) + w \cdot z - \delta\|w\| \right)$$

and for which $\beta_i w_i \leq 0$ for all $i$.

Proof. Let $F$ and $H$ be as in the proof of Lemma 1. By Lemma 6, $F$ is unate. Let $w \in \mathbb{R}^n$ have some coordinate $i$ for which $\beta_i w_i > 0$. Let $w'$ be such that

$$w'_j = \begin{cases} w_j & \text{if } j \neq i \\ -w_i & \text{if } j = i. \end{cases}$$

We show that $H(w') \leq H(w)$. Let $z \in B$. If $\beta_i z_i > 0$ then $w_i z_i > 0$, so $w' \cdot z \leq w \cdot z$ and

$$F(z) + w' \cdot z - \delta\|w'\| \leq F(z) + w \cdot z - \delta\|w\|.$$

If $\beta_i z_i \leq 0$ then let $z'$ be defined analogously to $w'$ (that is, $z'_i = -z_i$ and $z'_j = z_j$ for $j \neq i$). By Assumption 4, $z' \in B$. Then $z \preceq_\beta z'$, and so $F(z) \leq F(z')$. Since also $w' \cdot z = w \cdot z'$ and $\|w'\| = \|w\|$,

$$F(z) + w' \cdot z - \delta\|w'\| \leq F(z') + w \cdot z' - \delta\|w\|.$$

In either case the left hand side is at most $H(w)$; hence, $H(w') \leq H(w)$. Applying this argument repeatedly, we can derive from any $w$ a vector $\tilde{w}$ with $\beta_i \tilde{w}_i \leq 0$ for all $i$ and such that $H(\tilde{w}) \leq H(w)$. This proves the theorem.

Note that the loss functions for all of the games in Section 3 are unate (and also satisfy Assumptions 1–4). The same will be true of all of the games discussed in Section 7. Thus, for all of these games, we can determine a priori the signs of each of the coordinates of the minimizing vectors used by the OS algorithm.

6.2. A general technique using linear programming

In many cases, we can use linear programming to implement OS. In particular, let us assume that we measure weight vectors $w$ using the $\ell_1$ norm (i.e., $p = 1$). Also, let us assume that $|B|$ is finite. Then given $\phi_t$ and $s$, computing

$$\phi_{t-1}(s) = \min_{w \in \mathbb{R}^n} \max_{z \in B} \left( \phi_t(s + z) + w \cdot z - \delta\|w\| \right)$$

can be rewritten as an optimization problem:

variables: $w \in \mathbb{R}^n$, $b \in \mathbb{R}$
minimize: $b$
subject to: $\phi_t(s + z) + w \cdot z - \delta\|w\| \leq b$ for all $z \in B$.

The minimizing value of $b$ is the desired value of $\phi_{t-1}(s)$. Note that, with respect to the variables $w$ and $b$, this problem is "almost" a linear program, if not for the norm operator. However, when $L$ is unate with sign vector $\beta$, and when the other conditions of Theorem 7 hold, we can restrict $w$ so that $\beta_i w_i \leq 0$ for all $i$. This allows us to write

$$\|w\|_1 = -\sum_{i=1}^n \beta_i w_i.$$

Adding $\beta_i w_i \leq 0$ as a constraint (or rather, a set of $n$ constraints), we now have derived a linear program with $n + 1$ variables and $|B| + n$ constraints. This can be solved in polynomial time.

Thus, for instance, this technique can be applied to the multiclass boosting problem discussed in Section 3. In this case, $B = \{-1, +1\}^n$. So, for any $s$, $\phi_{t-1}(s)$ can be computed from $\phi_t$ in time polynomial in $2^n$, which may be reasonable for small $n$. In addition, $\phi_t$ must be computed at each reachable position $s$ in an $n$-dimensional integer grid of radius $t$, i.e., for all $s \in \{-t, -t+1, \ldots, t-1, t\}^n$. This involves computation of $\phi_t$ at $(2t + 1)^n$ points, giving an overall running time for the algorithm which is polynomial in $(2T + 1)^n$. Again, this may be reasonable for very small $n$. It is an open problem to find a way to implement the algorithm more efficiently.
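As an illustration (ours, not the paper's), one evaluation of this linear program can be sketched with scipy.optimize.linprog; here `phi_t` is a caller-supplied callable and `beta` is the sign vector of Theorem 7.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

# Sketch of the Section 6.2 linear program for one evaluation of the
# potential: given phi_t (a callable), a position s, sign vector beta
# (so beta_i * w_i <= 0 and ||w||_1 = -sum_i beta_i w_i), the drift set
# B = {-1,+1}^n, and the drift parameter delta, solve
#   min b  s.t.  phi_t(s+z) + w.z - delta*||w||_1 <= b  for all z in B.

def potential_step(phi_t, s, beta, delta, n):
    B = list(itertools.product([-1.0, 1.0], repeat=n))
    # variables: x = (w_1, ..., w_n, b); objective: minimize b
    c = np.zeros(n + 1)
    c[-1] = 1.0
    A_ub, b_ub = [], []
    for z in B:
        z = np.array(z)
        # rewrite the constraint as  w.z + delta*sum_i beta_i w_i - b <= -phi_t(s+z)
        A_ub.append(np.concatenate([z + delta * beta, [-1.0]]))
        b_ub.append(-phi_t(s + z))
    # sign constraints beta_i * w_i <= 0 become variable bounds; b is free
    bounds = [(None, 0) if bi > 0 else (0, None) for bi in beta] + [(None, None)]
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)
    assert res.success
    return res.x[-1], res.x[:-1]   # phi_{t-1}(s) and the OS weight vector
```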

7. Deriving old and new algorithms

In this section, we show how a number of old and new boosting and on-line learning algorithms can be derived and analyzed as instances of the OS algorithm for appropriately chosen drifting games.

7.1. Boost-by-majority and variants

We begin with the drifting game described in Section 3 corresponding to binary boosting with $B = \{-1, +1\}$ (and drift parameter $\delta = 2\gamma$). For this game,

$$\phi_{t-1}(s) = \min_{w \geq 0} \max\{\phi_t(s - 1) - w - 2\gamma w, \;\; \phi_t(s + 1) + w - 2\gamma w\}$$

where we know from Theorem 7 that only nonnegative values of $w$ need to be considered. It can be argued that the minimum must occur when

$$\phi_t(s - 1) - w - 2\gamma w = \phi_t(s + 1) + w - 2\gamma w,$$

i.e., when

$$w = \frac{\phi_t(s - 1) - \phi_t(s + 1)}{2}. \tag{26}$$

This gives

$$\phi_{t-1}(s) = \left(\tfrac{1}{2} + \gamma\right)\phi_t(s + 1) + \left(\tfrac{1}{2} - \gamma\right)\phi_t(s - 1).$$

Solving gives

$$\phi_t(s) = \sum_{0 \leq k \leq (T - t - s)/2} \binom{T - t}{k} \left(\tfrac{1}{2} + \gamma\right)^k \left(\tfrac{1}{2} - \gamma\right)^{T - t - k}$$

(where we follow the convention that $\binom{n}{k} = 0$ if $k < 0$ or $k > n$). Weighting examples using Eq. (26) gives exactly Freund's (1995) boost-by-majority algorithm (the "boosting by resampling" version).

When $B = \{-1, 0, +1\}$, a similar but more involved analysis gives

$$\phi_{t-1}(s) = \max\left\{ (1 - 2\gamma)\,\phi_t(s) + 2\gamma\,\phi_t(s + 1), \;\; \left(\tfrac{1}{2} + \gamma\right)\phi_t(s + 1) + \left(\tfrac{1}{2} - \gamma\right)\phi_t(s - 1) \right\} \tag{27}$$

and the corresponding choice of $w$ is $\phi_t(s) - \phi_t(s + 1)$ or $(\phi_t(s - 1) - \phi_t(s + 1))/2$, depending on whether the maximum in Eq. (27) is realized by the first or second quantity. We do not know how to solve the recurrence in Eq. (27) so that the bound $\phi_0(0)$ given in Theorem 2 can be put in explicit form. Nevertheless, this bound can easily be evaluated numerically, and the algorithm can certainly be implemented efficiently in its present form.

We have thus far been unable to solve the recurrence for the case that $B = [-1, +1]$, even to a point at which the algorithm can be implemented. However, this case can be approximated by the case in which $B = \{i/N : i = -N, \ldots, N\}$ for a moderate value of $N$. In the latter case, the potential function and associated weights can be computed numerically.

For instance, linear programming can be used as discussed in Section 6.2. Alternatively, it can be shown that Lemma 5 combined with Theorem 7 implies that

$$\phi_{t-1}(s) = \max\left\{ p\,\phi_t(s + z_1) + (1 - p)\,\phi_t(s + z_2) : z_1, z_2 \in B, \; p \in [0, 1], \; p z_1 + (1 - p) z_2 = 2\gamma \right\}$$

which can be evaluated using a simple search over all pairs $z_1, z_2$ (since $B$ is finite).

Fig. 4 compares the bound $\phi_0(0)$ for the drifting games associated with boost-by-majority and variants in which $B$ is $\{-1, +1\}$, $\{-1, 0, +1\}$ and $[-1, +1]$ (using the approximation that was just mentioned), as well as AdaBoost (discussed in the next section). These bounds are plotted as a function of the number of rounds $T$.

[Figure 4 appears here: four curves of the loss bound against $T$, labeled AdaBoost, B = [-1,+1], B = {-1,0,+1} and B = {-1,+1}.]

Figure 4. A comparison of the bound $\phi_0(0)$ for the drifting games associated with AdaBoost (Section 7.2) and boost-by-majority (Sections 3 and 7.1). For AdaBoost, $\eta$ is set as in Eq. (28). For boost-by-majority, the bound is plotted when $B$ is $\{-1,+1\}$, $\{-1,0,+1\}$ and $[-1,+1]$. (The latter case is approximated by $B = \{i/100 : i = -100, \ldots, 100\}$.) The bound is plotted as a function of the number of rounds $T$. The edge is fixed to $\gamma = 0.1$. (The jagged nature of the $B = \{-1,+1\}$ curve is due to the fact that games with an even number of rounds, in which ties count as a loss for the shepherd so that $L(0) = 1$, are harder than games with an odd number of rounds.)
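The $B = \{-1, +1\}$ curve in this comparison comes from the closed form above. As a sketch (ours, not the paper's), it can be evaluated directly and set against the AdaBoost bound of Section 7.2:

```python
import math

# Numerical sketch of the Section 7.1 closed form for B = {-1,+1}:
#   phi_t(s) = sum_{0 <= k <= (T-t-s)/2} C(T-t,k) (1/2+g)^k (1/2-g)^(T-t-k),
# compared with the non-adaptive AdaBoost bound (1 - 4g^2)^(T/2); here
# g = gamma is the weak learners' edge, as in Section 3.

def phi_bbm(t, s, T, gamma):
    # binomial tail: probability that a (1/2+gamma)-biased walk over the
    # remaining T-t rounds leaves the chip at a non-positive final position
    n = T - t
    kmax = math.floor((n - s) / 2)
    if kmax < 0:
        return 0.0
    return sum(math.comb(n, k) * (0.5 + gamma) ** k * (0.5 - gamma) ** (n - k)
               for k in range(0, min(kmax, n) + 1))

gamma = 0.1
for T in (10, 20, 40, 80):
    print(T, phi_bbm(0, 0, T, gamma), (1 - 4 * gamma ** 2) ** (T / 2))
```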

7.2. AdaBoost and variants

As mentioned in Section 3, a simplified, non-adaptive version of AdaBoost can be derived as an instance of OS. To do this, we simply replace the loss function (Eq. (7)) in the binary boosting game of Section 3 with an exponential loss function

$$L(s) = e^{-\eta s}$$

where $\eta > 0$ is a parameter of the game. As a special case of the discussion below, it will follow that

$$\phi_t(s) = \beta^{T-t} e^{-\eta s}$$

where $\beta$ is the constant

$$\beta = \left(\tfrac{1}{2} - \gamma\right) e^{\eta} + \left(\tfrac{1}{2} + \gamma\right) e^{-\eta}.$$

Also, the weight given to a chip at position $s$ on round $t$ is

$$\beta^{T-t} \left( \frac{e^{\eta} - e^{-\eta}}{2} \right) e^{-\eta s}$$

which is proportional to $e^{-\eta s}$ (in other words, the weighting function is effectively unchanged from round to round). This weighting is the same as the one used by a non-adaptive version of AdaBoost in which all weak hypotheses are given equal weight. Since $e^{-\eta s}$ is an upper bound on the loss function of Eq. (7), Theorem 2 implies an upper bound on the fraction of mistakes of the final hypothesis of

$$\phi_0(0) = \beta^T.$$

When

$$\eta = \tfrac{1}{2} \ln\left( \frac{1 + 2\gamma}{1 - 2\gamma} \right) \tag{28}$$

so that $\beta$ is minimized, this gives an upper bound of

$$\left( (1 - 2\gamma)(1 + 2\gamma) \right)^{T/2} = (1 - 4\gamma^2)^{T/2}$$

which is equivalent to a non-adaptive version of Freund and Schapire's (1997) analysis.

We next consider a more general drifting game in $n$ dimensions whose loss function is a sum of exponentials

$$L(s) = \sum_{j=1}^k b_j \exp(-\eta_j \, u_j \cdot s) \tag{29}$$

where the $b_j$'s, $\eta_j$'s and $u_j$'s are parameters with $b_j > 0$, $\eta_j > 0$, $\|u_j\|_1 = 1$ and $\beta_i u_{ji} \leq 0$ for all $i$, for some sign vector $\beta \in \{-1, +1\}^n$. For this game, $B = [-1, +1]^n$ and $p = 1$.
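A small numerical sketch (ours, not the paper's) of the one-dimensional specialization just described: the potential, the OS weights, and the agreement of $\phi_0(0)$ with the $(1 - 4\gamma^2)^{T/2}$ bound.

```python
import math

# Sketch of the non-adaptive AdaBoost game of Section 7.2 (binary case,
# drift delta = 2*gamma): phi_t(s) = beta**(T-t) * exp(-eta*s) with
# beta = (1/2-gamma)*e^eta + (1/2+gamma)*e^-eta; choosing eta as in
# Eq. (28) minimizes beta and yields the (1-4*gamma**2)**(T/2) bound.

def beta(eta, gamma):
    return (0.5 - gamma) * math.exp(eta) + (0.5 + gamma) * math.exp(-eta)

def phi(t, s, T, eta, gamma):
    return beta(eta, gamma) ** (T - t) * math.exp(-eta * s)

def weight(t, s, T, eta, gamma):
    # OS weight for a chip at position s on round t; proportional to
    # exp(-eta*s), i.e., AdaBoost's exponential example weights.
    return (beta(eta, gamma) ** (T - t)
            * 0.5 * (math.exp(eta) - math.exp(-eta)) * math.exp(-eta * s))

gamma, T = 0.1, 20
eta = 0.5 * math.log((1 + 2 * gamma) / (1 - 2 * gamma))    # Eq. (28)
print(phi(0, 0, T, eta, gamma))                # equals ...
print((1 - 4 * gamma ** 2) ** (T / 2))         # ... this bound
```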

Many (non-adaptive) variants of AdaBoost correspond to special cases of this game. For instance, AdaBoost.M1 (Freund and Schapire, 1997), a multiclass version of AdaBoost, essentially uses the loss function

$$L(s) = \sum_{j=2}^n e^{-(\eta/2)(s_1 - s_j)}$$

where we follow the multiclass setup of Section 3, so that $n$ is the number of classes and the first component in the drifting game is identified with the correct class. (As before, we only consider a non-adaptive game in which $\eta > 0$ is a fixed, tunable parameter.) Likewise, AdaBoost.MH (Schapire and Singer, 1999), another multiclass version of AdaBoost, uses the loss function

$$L(s) = e^{-\eta s_1} + \sum_{j=2}^n e^{\eta s_j}.$$

Note that both loss functions upper bound the "true" loss for multiclass boosting given in Eq. (11). Moreover, both functions clearly have the form given in Eq. (29).

We claim that, for the general game with loss function as in Eq. (29),

$$\phi_t(s) = \sum_j b_j \, \beta_j^{T-t} \exp(-\eta_j \, u_j \cdot s) \tag{30}$$

where

$$\beta_j = \left( \frac{1 - \delta}{2} \right) e^{\eta_j} + \left( \frac{1 + \delta}{2} \right) e^{-\eta_j}.$$

The proof of Eq. (30) is by backwards induction on $t$. For fixed $t$ and $s$, let

$$w = \sum_j b'_j \left( \frac{e^{\eta_j} - e^{-\eta_j}}{2} \right) u_j$$

where

$$b'_j = b_j \, \beta_j^{T-t} \exp(-\eta_j \, u_j \cdot s).$$

We will show that this is the minimizing weight vector that gets used by OS for a chip at position $s$ at time $t$. Note that

$$\phi_t(s + z) + w \cdot z = \sum_j b'_j \left[ \exp(-\eta_j \, u_j \cdot z) + \left( \frac{e^{\eta_j} - e^{-\eta_j}}{2} \right) u_j \cdot z \right] \leq \sum_j b'_j \left( \frac{e^{\eta_j} + e^{-\eta_j}}{2} \right) \tag{31}$$

since

$$e^{-\eta x} \leq \frac{e^{\eta} + e^{-\eta}}{2} - \left( \frac{e^{\eta} - e^{-\eta}}{2} \right) x$$

for all $\eta \in \mathbb{R}$ and $x \in [-1, +1]$, by convexity of $e^{-\eta x}$. Also, by our assumptions on the $b_j$'s, $u_j$'s and $\eta_j$'s, we can compute

$$\|w\|_1 = \sum_j b'_j \left( \frac{e^{\eta_j} - e^{-\eta_j}}{2} \right). \tag{32}$$

Thus, combining Eqs. (31) and (32) gives

$$\phi_{t-1}(s) \leq \sup_{z \in B} \left( \phi_t(s + z) + w \cdot z - \delta\|w\|_1 \right) \leq \sum_j b'_j \left[ \frac{e^{\eta_j} + e^{-\eta_j}}{2} - \delta \left( \frac{e^{\eta_j} - e^{-\eta_j}}{2} \right) \right] = \sum_j b'_j \beta_j = \sum_j b_j \, \beta_j^{T-t+1} \exp(-\eta_j \, u_j \cdot s).$$

This gives the needed upper bound on $\phi_{t-1}(s)$. For the lower bound, using Theorem 7 (since $L$ is unate with sign vector $\beta$), we have

$$\phi_{t-1}(s) \geq \min_{\beta_i w_i \leq 0} \max_{z \in \{-\beta, +\beta\}} \left( \phi_t(s + z) + w \cdot z - \delta\|w\|_1 \right) = \min_{c \geq 0} \max\left\{ \sum_j b'_j e^{\eta_j} - c - \delta c, \;\; \sum_j b'_j e^{-\eta_j} + c - \delta c \right\}$$

where we have used $u_j \cdot (-\beta) = 1$ and $w \cdot (-\beta) = \|w\|_1$ (since $\beta_i u_{ji} \leq 0$ and $\beta_i w_i \leq 0$), and we have identified $c$ with $\|w\|_1$. Solving the min max expression gives the desired lower bound. This completes the proof of Eq. (30).

7.3. On-line learning algorithms

In this section, we show how Cesa-Bianchi et al.'s (1996) BW algorithm for combining expert advice can be derived as an instance of OS. We will also see how their algorithm can be generalized, and how Littlestone and Warmuth's (1994) weighted majority algorithm can also be derived and analyzed.

Suppose that we have access to $m$ "experts." On each round $t$, each expert $i$ provides a prediction $\xi_t^i \in \{-1, +1\}$. A "master" algorithm combines their predictions into its own prediction $\hat{y}_t \in \{-1, +1\}$.

An outcome $y_t \in \{-1, +1\}$ is then observed. The master makes a mistake if $\hat{y}_t \neq y_t$, and similarly for expert $i$ if $\xi_t^i \neq y_t$. The goal of the master is to minimize how many mistakes it makes relative to the best expert.

We will consider master algorithms which use a weighted majority vote to form their predictions; that is,

$$\hat{y}_t = \mathrm{sign}\left( \sum_{i=1}^m w_t^i \xi_t^i \right).$$

The problem is to derive a good choice of weights $w_t^i$. We also assume that the master algorithm is conservative in the sense that rounds on which the master's predictions are correct are effectively ignored (so that the weights $w_t^i$ only depend upon previous rounds on which mistakes were made).

Let us suppose that there is one expert that makes at most $k$ mistakes. We will (re)derive an algorithm (namely, BW) and a bound on the number of mistakes made by the master, given this assumption. Since we restrict our attention to conservative algorithms, we can assume without loss of generality that a mistake occurs on every round and simply proceed to bound the total number of rounds.

To set up the problem as a drifting game, we identify one chip with each of the experts. The problem is one dimensional, so $n = 1$. The weights $w_t^i$ selected by the master are the same as those chosen by the shepherd. Since we assume that the master makes a mistake on each round, we have for all $t$ that

$$y_t \sum_i w_t^i \xi_t^i \leq 0. \tag{33}$$

Thus, if we define the drift $z_t^i$ to be $-y_t \xi_t^i$, then

$$\sum_i w_t^i z_t^i \geq 0.$$

Setting $\delta = 0$, we see that Eq. (33) is equivalent to Eq. (1). Also, $B = \{-1, +1\}$.

Let $M_t^i$ be the number of mistakes made by expert $i$ on rounds $1, \ldots, t$. Then by the definition of $z_t^i$,

$$s_{t+1}^i = 2M_t^i - t.$$

Let the loss function $L$ be

$$L(s) = \begin{cases} 1 & \text{if } s \leq 2k - T \\ 0 & \text{otherwise.} \end{cases} \tag{34}$$

Then $L(s_{T+1}^i) = 1$ if and only if expert $i$ makes a total of $k$ or fewer mistakes in $T$ rounds. Thus, our assumption that the best expert makes at most $k$ mistakes implies that

$$\frac{1}{m} \leq \frac{1}{m} \sum_i L(s_{T+1}^i). \tag{35}$$

On the other hand, Theorem 2 implies that

$$\frac{1}{m} \sum_i L(s_{T+1}^i) \leq \phi_0(0). \tag{36}$$

By an analysis similar to the one given in Section 7.1, it can be seen that

$$\phi_{t-1}(s) = \frac{\phi_t(s + 1) + \phi_t(s - 1)}{2}.$$

Solving this recurrence gives

$$\phi_t(s) = 2^{t-T} \binom{T - t}{\leq k - (t + s)/2}$$

where

$$\binom{n}{\leq k} = \sum_{j=0}^{\lfloor k \rfloor} \binom{n}{j}.$$

In particular,

$$\phi_0(0) = 2^{-T} \binom{T}{\leq k}. \tag{37}$$

Combining Eqs. (35), (36) and (37) gives

$$\frac{1}{m} \leq 2^{-T} \binom{T}{\leq k}. \tag{38}$$

In other words, the number of mistakes $T$ of the master algorithm must satisfy Eq. (38), and so must be at most

$$\max\left\{ q \in \mathbb{N} : q \leq \lg m + \lg \binom{q}{\leq k} \right\},$$

the same bound given by Cesa-Bianchi et al. (1996). The weighting function obtained is also equivalent to theirs since, by a similar argument to that used in Section 7.1, OS gives

$$w_t^i = \frac{\phi_t(s_t^i - 1) - \phi_t(s_t^i + 1)}{2} = 2^{t-T-1} \binom{T - t}{k - (t - 1 + s_t^i)/2} = 2^{t-T-1} \binom{T - t}{k - M_{t-1}^i}.$$
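As a sketch (ours, not the paper's), the BW potential (37) and weights can be computed directly from binomial tails:

```python
import math

# Sketch of the BW weighting derived in Section 7.3: for an expert with
# M mistakes on rounds 1..t-1 (so s_t = 2M - (t-1)), the OS weight is
#   w_t = 2**(t-T-1) * C(T-t, k-M),
# i.e., proportional to binomial(T-t, k-M).

def binom(n, k):
    return math.comb(n, k) if 0 <= k <= n else 0

def bw_weight(t, M, T, k):
    # t in 1..T (a round on which the master errs); M = mistakes so far;
    # k = mistake budget of the best expert.
    return 2.0 ** (t - T - 1) * binom(T - t, k - M)

def phi(t, s, T, k):
    # Potential (Eq. 37): 2**(t-T) * sum_{j <= k-(t+s)/2} C(T-t, j).
    jmax = math.floor(k - (t + s) / 2)
    if jmax < 0:
        return 0.0
    return 2.0 ** (t - T) * sum(binom(T - t, j) for j in range(jmax + 1))
```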

Note that this argument can be generalized to the case in which the experts' predictions are not restricted to $\{-1, +1\}$ but instead may be all of $[-1, +1]$, or a subset of this interval such as $\{-1, 0, +1\}$. The performance of each expert is then measured on each round using the absolute loss $|\xi_t^i - y_t|/2$ rather than whether or not it made a mistake. In this case, as in the analogous extension of boost-by-majority given in Section 3, we only need to replace $B$ by $[-1, +1]$ or $\{-1, 0, +1\}$. The resulting bound on the number of mistakes of the master is then the largest $T$ for which $1/m \leq \phi_0(0)$ (note that $\phi_0(0)$ depends implicitly on $T$). The resulting master algorithm simply uses the weights computed by OS for the appropriate drifting game. It is an open problem to determine if this generalized algorithm enjoys strong optimality properties similar to those of BW (Cesa-Bianchi et al., 1996).

Littlestone and Warmuth's (1994) weighted majority algorithm can also be derived as an instance of OS. To do this, we simply replace the loss function $L$ in the game above with

$$L(s) = \exp(-\eta(s - 2k + T))$$

for some parameter $\eta > 0$. This loss function upper bounds the one in Eq. (34). We assume that experts are permitted to output predictions in $[-1, +1]$, so that $B = [-1, +1]$. From the results of Section 7.2 applied to this drifting game,

$$\phi_t(s) = \beta^{T-t} \exp(-\eta(s - 2k + T))$$

where

$$\beta = \frac{e^{\eta} + e^{-\eta}}{2}.$$

Therefore, because one expert suffers loss at most $k$,

$$\frac{1}{m} \leq \phi_0(0) = \beta^T e^{\eta(2k - T)}.$$

Equivalently, the number of mistakes $T$ is at most

$$\frac{2\eta k + \ln m}{\ln\left( \frac{2}{1 + e^{-2\eta}} \right)},$$

exactly the bound given by Littlestone and Warmuth (1994). The algorithm is also the same as theirs, since the weight given to an expert (chip) at position $s_t^i$ at time $t$ is

$$w_t^i = \beta^{T-t} \left( \frac{e^{\eta} - e^{-\eta}}{2} \right) \exp(-\eta(s_t^i - 2k + T)) \propto \exp(-2\eta M_{t-1}^i).$$
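A minimal sketch (ours, not the paper's) of the resulting weighted majority master:

```python
import numpy as np

# Sketch of the weighted majority algorithm as recovered in Section 7.3:
# expert i's weight is proportional to exp(-2*eta*M_i), where M_i is its
# cumulative absolute loss so far; the master predicts by weighted vote.

def wm_predict(xi, M, eta):
    """xi: array of expert predictions in [-1,+1]; M: their cumulative
    absolute losses; returns the master's prediction in {-1,+1}."""
    w = np.exp(-2.0 * eta * M)
    return 1.0 if np.dot(w, xi) >= 0 else -1.0

def wm_update(xi, y, M):
    # conservative update: charge each expert its absolute loss |xi - y|/2
    return M + np.abs(xi - y) / 2.0
```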

8. Open problems

This paper represents the first work on general drifting games. As such, there are many open problems.

We have presented closed-form solutions of the potential function for just a few special cases. Are there other cases in which such closed-form solutions are possible? In particular, can the boosting games of Section 3 corresponding to $B = \{-1, 0, +1\}$ and $B = [-1, +1]$ be put into closed form? For games in which a closed form is not possible, is there nevertheless a general method of characterizing the loss bound $\phi_0(0)$, say, as the number of rounds $T$ gets large?

Side products of our work include new versions of boost-by-majority for the multiclass case, as well as binary cases in which the weak hypotheses have range $\{-1, 0, +1\}$ or $[-1, +1]$. However, the optimality proof for the drifting game only carries over to the boosting setting if the final hypothesis has the restricted forms given in Eqs. (4) and (10). Are the resulting boosting algorithms also optimal (for instance, in the sense proved by Freund (1995) for boost-by-majority) without these restrictions?

Likewise, can the extensions of the BW algorithm in Section 7.3 be shown to be optimal? Can this algorithm be extended using drifting games to the multiclass case, or to the case in which the master is allowed to output predictions in $[-1, +1]$ (suffering absolute loss)?

The OS algorithm is non-adaptive in the sense that $\delta$ must be known ahead of time. To what extent can OS be made adaptive? For instance, can Freund's (1999) recent technique for making boost-by-majority adaptive be carried over to the general drifting-game setting? Similarly, what happens if the number of rounds $T$ is not known in advance?

Finally, are there other interesting drifting games for entirely different learning problems, such as regression or density estimation?

Acknowledgments

Many thanks to Yoav Freund for very helpful discussions which led to this research.

References

Blackwell, D.: 1956, 'An analog of the minimax theorem for vector payoffs'. Pacific Journal of Mathematics 6(1), 1–8.

Cesa-Bianchi, N., Y. Freund, D. Haussler, D. P. Helmbold, R. E. Schapire, and M. K. Warmuth: 1997, 'How to Use Expert Advice'. Journal of the Association for Computing Machinery 44(3), 427–485.

Cesa-Bianchi, N., Y. Freund, D. P. Helmbold, and M. K. Warmuth: 1996, 'On-line Prediction and Conversion Strategies'. Machine Learning 25, 71–110.

Freund, Y.: 1995, 'Boosting a weak learning algorithm by majority'. Information and Computation 121(2), 256–285.

Freund, Y.: 1999, 'An adaptive version of the boost by majority algorithm'. In: Proceedings of the Twelfth Annual Conference on Computational Learning Theory. pp. 102–113, ACM Press.

Freund, Y. and R. E. Schapire: 1997, 'A decision-theoretic generalization of on-line learning and an application to boosting'. Journal of Computer and System Sciences 55(1), 119–139.

Hoeffding, W.: 1963, 'Probability inequalities for sums of bounded random variables'. Journal of the American Statistical Association 58(301), 13–30.

Littlestone, N. and M. K. Warmuth: 1994, 'The Weighted Majority Algorithm'. Information and Computation 108, 212–261.

Rockafellar, R. T.: 1970, Convex Analysis. Princeton University Press.

Schapire, R. E. and Y. Singer: 1999, 'Improved boosting algorithms using confidence-rated predictions'. Machine Learning 37(3), 297–336.


Some basic inequalities. Definition. Let V be a vector space over the complex numbers. An inner product is given by a function, V V C

Some basic inequalities. Definition. Let V be a vector space over the complex numbers. An inner product is given by a function, V V C Some basc nequaltes Defnton. Let V be a vector space over the complex numbers. An nner product s gven by a functon, V V C (x, y) x, y satsfyng the followng propertes (for all x V, y V and c C) (1) x +

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Lecture 10 Support Vector Machines. Oct

Lecture 10 Support Vector Machines. Oct Lecture 10 Support Vector Machnes Oct - 20-2008 Lnear Separators Whch of the lnear separators s optmal? Concept of Margn Recall that n Perceptron, we learned that the convergence rate of the Perceptron

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

Solutions HW #2. minimize. Ax = b. Give the dual problem, and make the implicit equality constraints explicit. Solution.

Solutions HW #2. minimize. Ax = b. Give the dual problem, and make the implicit equality constraints explicit. Solution. Solutons HW #2 Dual of general LP. Fnd the dual functon of the LP mnmze subject to c T x Gx h Ax = b. Gve the dual problem, and make the mplct equalty constrants explct. Soluton. 1. The Lagrangan s L(x,

More information

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso Supplement: Proofs and Techncal Detals for The Soluton Path of the Generalzed Lasso Ryan J. Tbshran Jonathan Taylor In ths document we gve supplementary detals to the paper The Soluton Path of the Generalzed

More information

The Expectation-Maximization Algorithm

The Expectation-Maximization Algorithm The Expectaton-Maxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hgh-level verson of the algorthm.

More information

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

The Geometry of Logit and Probit

The Geometry of Logit and Probit The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

NP-Completeness : Proofs

NP-Completeness : Proofs NP-Completeness : Proofs Proof Methods A method to show a decson problem Π NP-complete s as follows. (1) Show Π NP. (2) Choose an NP-complete problem Π. (3) Show Π Π. A method to show an optmzaton problem

More information

Outline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique

Outline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique Outlne and Readng Dynamc Programmng The General Technque ( 5.3.2) -1 Knapsac Problem ( 5.3.3) Matrx Chan-Product ( 5.3.1) Dynamc Programmng verson 1.4 1 Dynamc Programmng verson 1.4 2 Dynamc Programmng

More information

Appendix for Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

Appendix for Causal Interaction in Factorial Experiments: Application to Conjoint Analysis A Appendx for Causal Interacton n Factoral Experments: Applcaton to Conjont Analyss Mathematcal Appendx: Proofs of Theorems A. Lemmas Below, we descrbe all the lemmas, whch are used to prove the man theorems

More information

Lecture Space-Bounded Derandomization

Lecture Space-Bounded Derandomization Notes on Complexty Theory Last updated: October, 2008 Jonathan Katz Lecture Space-Bounded Derandomzaton 1 Space-Bounded Derandomzaton We now dscuss derandomzaton of space-bounded algorthms. Here non-trval

More information

Finding Primitive Roots Pseudo-Deterministically

Finding Primitive Roots Pseudo-Deterministically Electronc Colloquum on Computatonal Complexty, Report No 207 (205) Fndng Prmtve Roots Pseudo-Determnstcally Ofer Grossman December 22, 205 Abstract Pseudo-determnstc algorthms are randomzed search algorthms

More information

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016 CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng

More information

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 )

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 ) Kangweon-Kyungk Math. Jour. 4 1996), No. 1, pp. 7 16 AN ITERATIVE ROW-ACTION METHOD FOR MULTICOMMODITY TRANSPORTATION PROBLEMS Yong Joon Ryang Abstract. The optmzaton problems wth quadratc constrants often

More information

Snce h( q^; q) = hq ~ and h( p^ ; p) = hp, one can wrte ~ h hq hp = hq ~hp ~ (7) the uncertanty relaton for an arbtrary state. The states that mnmze t

Snce h( q^; q) = hq ~ and h( p^ ; p) = hp, one can wrte ~ h hq hp = hq ~hp ~ (7) the uncertanty relaton for an arbtrary state. The states that mnmze t 8.5: Many-body phenomena n condensed matter and atomc physcs Last moded: September, 003 Lecture. Squeezed States In ths lecture we shall contnue the dscusson of coherent states, focusng on ther propertes

More information

A Simple Research of Divisor Graphs

A Simple Research of Divisor Graphs The 29th Workshop on Combnatoral Mathematcs and Computaton Theory A Smple Research o Dvsor Graphs Yu-png Tsao General Educaton Center Chna Unversty o Technology Tape Tawan yp-tsao@cuteedutw Tape Tawan

More information

Some modelling aspects for the Matlab implementation of MMA

Some modelling aspects for the Matlab implementation of MMA Some modellng aspects for the Matlab mplementaton of MMA Krster Svanberg krlle@math.kth.se Optmzaton and Systems Theory Department of Mathematcs KTH, SE 10044 Stockholm September 2004 1. Consdered optmzaton

More information

Lecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem.

Lecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem. prnceton u. sp 02 cos 598B: algorthms and complexty Lecture 20: Lft and Project, SDP Dualty Lecturer: Sanjeev Arora Scrbe:Yury Makarychev Today we wll study the Lft and Project method. Then we wll prove

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

1 Definition of Rademacher Complexity

1 Definition of Rademacher Complexity COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #9 Scrbe: Josh Chen March 5, 2013 We ve spent the past few classes provng bounds on the generalzaton error of PAClearnng algorths for the

More information

Joint Statistical Meetings - Biopharmaceutical Section

Joint Statistical Meetings - Biopharmaceutical Section Iteratve Ch-Square Test for Equvalence of Multple Treatment Groups Te-Hua Ng*, U.S. Food and Drug Admnstraton 1401 Rockvlle Pke, #200S, HFM-217, Rockvlle, MD 20852-1448 Key Words: Equvalence Testng; Actve

More information

CHAPTER 14 GENERAL PERTURBATION THEORY

CHAPTER 14 GENERAL PERTURBATION THEORY CHAPTER 4 GENERAL PERTURBATION THEORY 4 Introducton A partcle n orbt around a pont mass or a sphercally symmetrc mass dstrbuton s movng n a gravtatonal potental of the form GM / r In ths potental t moves

More information

Linear Feature Engineering 11

Linear Feature Engineering 11 Lnear Feature Engneerng 11 2 Least-Squares 2.1 Smple least-squares Consder the followng dataset. We have a bunch of nputs x and correspondng outputs y. The partcular values n ths dataset are x y 0.23 0.19

More information

Physics 5153 Classical Mechanics. D Alembert s Principle and The Lagrangian-1

Physics 5153 Classical Mechanics. D Alembert s Principle and The Lagrangian-1 P. Guterrez Physcs 5153 Classcal Mechancs D Alembert s Prncple and The Lagrangan 1 Introducton The prncple of vrtual work provdes a method of solvng problems of statc equlbrum wthout havng to consder the

More information

Lecture 12: Classification

Lecture 12: Classification Lecture : Classfcaton g Dscrmnant functons g The optmal Bayes classfer g Quadratc classfers g Eucldean and Mahalanobs metrcs g K Nearest Neghbor Classfers Intellgent Sensor Systems Rcardo Guterrez-Osuna

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

Exercise Solutions to Real Analysis

Exercise Solutions to Real Analysis xercse Solutons to Real Analyss Note: References refer to H. L. Royden, Real Analyss xersze 1. Gven any set A any ɛ > 0, there s an open set O such that A O m O m A + ɛ. Soluton 1. If m A =, then there

More information

Canonical transformations

Canonical transformations Canoncal transformatons November 23, 2014 Recall that we have defned a symplectc transformaton to be any lnear transformaton M A B leavng the symplectc form nvarant, Ω AB M A CM B DΩ CD Coordnate transformatons,

More information

Determinants Containing Powers of Generalized Fibonacci Numbers

Determinants Containing Powers of Generalized Fibonacci Numbers 1 2 3 47 6 23 11 Journal of Integer Sequences, Vol 19 (2016), Artcle 1671 Determnants Contanng Powers of Generalzed Fbonacc Numbers Aram Tangboonduangjt and Thotsaporn Thanatpanonda Mahdol Unversty Internatonal

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Supporting Information

Supporting Information Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to

More information

On the Multicriteria Integer Network Flow Problem

On the Multicriteria Integer Network Flow Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of

More information

Computing Correlated Equilibria in Multi-Player Games

Computing Correlated Equilibria in Multi-Player Games Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,

More information

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS BOUNDEDNESS OF THE IESZ TANSFOM WITH MATIX A WEIGHTS Introducton Let L = L ( n, be the functon space wth norm (ˆ f L = f(x C dx d < For a d d matrx valued functon W : wth W (x postve sem-defnte for all

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14 APPROXIMAE PRICES OF BASKE AND ASIAN OPIONS DUPON OLIVIER Prema 14 Contents Introducton 1 1. Framewor 1 1.1. Baset optons 1.. Asan optons. Computng the prce 3. Lower bound 3.1. Closed formula for the prce

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpenCourseWare http://ocw.mt.edu 6.854J / 18.415J Advanced Algorthms Fall 2008 For nformaton about ctng these materals or our Terms of Use, vst: http://ocw.mt.edu/terms. 18.415/6.854 Advanced Algorthms

More information

FACTORIZATION IN KRULL MONOIDS WITH INFINITE CLASS GROUP

FACTORIZATION IN KRULL MONOIDS WITH INFINITE CLASS GROUP C O L L O Q U I U M M A T H E M A T I C U M VOL. 80 1999 NO. 1 FACTORIZATION IN KRULL MONOIDS WITH INFINITE CLASS GROUP BY FLORIAN K A I N R A T H (GRAZ) Abstract. Let H be a Krull monod wth nfnte class

More information

Weighted Voting Systems

Weighted Voting Systems Weghted Votng Systems Elssa Brown Chrssy Donovan Charles Noneman June 5, 2009 Votng systems can be deceptve. For nstance, a votng system mght consst of four people, n whch three of the people have 2 votes,

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

= z 20 z n. (k 20) + 4 z k = 4

= z 20 z n. (k 20) + 4 z k = 4 Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5

More information

An Explicit Construction of an Expander Family (Margulis-Gaber-Galil)

An Explicit Construction of an Expander Family (Margulis-Gaber-Galil) An Explct Constructon of an Expander Famly Marguls-Gaber-Gall) Orr Paradse July 4th 08 Prepared for a Theorst's Toolkt, a course taught by Irt Dnur at the Wezmann Insttute of Scence. The purpose of these

More information

Salmon: Lectures on partial differential equations. Consider the general linear, second-order PDE in the form. ,x 2

Salmon: Lectures on partial differential equations. Consider the general linear, second-order PDE in the form. ,x 2 Salmon: Lectures on partal dfferental equatons 5. Classfcaton of second-order equatons There are general methods for classfyng hgher-order partal dfferental equatons. One s very general (applyng even to

More information

2.3 Nilpotent endomorphisms

2.3 Nilpotent endomorphisms s a block dagonal matrx, wth A Mat dm U (C) In fact, we can assume that B = B 1 B k, wth B an ordered bass of U, and that A = [f U ] B, where f U : U U s the restrcton of f to U 40 23 Nlpotent endomorphsms

More information