Boosting with Online Binary Learners for the Multiclass Bandit Problem


Shang-Tse Chen, School of Computer Science, Georgia Institute of Technology, Atlanta, GA
Hsuan-Tien Lin, Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Chi-Jen Lu, Institute of Information Science, Academia Sinica, Taipei, Taiwan

Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 2014. JMLR: W&CP volume 32. Copyright 2014 by the author(s).

Abstract

We consider the problem of online multiclass prediction in the bandit setting. Compared with the full-information setting, in which the learner receives the true label as feedback after making each prediction, the bandit setting assumes that the learner can only know the correctness of the predicted label. Because the bandit setting is more restricted, it is difficult to design good bandit learners, and currently there are not many of them. In this paper, we propose an approach that systematically converts existing online binary classifiers into promising bandit learners with a strong theoretical guarantee. The approach matches the idea of boosting, which has been shown to be powerful for batch learning as well as online learning. In particular, we establish the weak-learning condition on the online binary classifier, and show that the condition allows automatically constructing a bandit learner with arbitrary strength by combining several of those classifiers. Experimental results on several real-world data sets demonstrate the effectiveness of the proposed approach.

1. Introduction

Recently, machine learning problems that involve partial feedback have received an increasing amount of attention (Auer et al., 2002; Flaxman et al., 2005). These problems occur naturally in many modern applications, such as online advertising and recommender systems (Li et al., 2010). For instance, in a recommender system, the partial feedback represents whether the user likes the content recommended by the system, whereas the user's preference for the other contents that have not been displayed remains unknown.

In this paper, we consider one particular learning problem related to partial feedback: the online multiclass prediction problem in the bandit setting (Kakade et al., 2008). In this problem, the learner iteratively interacts with the environment. In each iteration, the learner observes an instance and is asked to predict its label. The main difference between the traditional full-information setting and the bandit setting is the feedback received after each prediction. In the full-information setting, the true label of the instance is revealed, whereas in the bandit setting, only whether the prediction is correct is known. That is, in the bandit setting, the true label remains unknown if the prediction is incorrect. Our goal is to make as few errors as possible in the harsh environment of the bandit setting.

With more restricted information available, it becomes harder to design good learning algorithms in the bandit setting, except for the case of online binary classification, in which the bandit setting and the full-information setting coincide. Thus, it is desirable to find a systematic way to transform existing online binary classification algorithms, or combine many of them, to get an algorithm that effectively deals with the bandit setting. The motivation calls for boosting (Schapire, 1990), one of the most popular and well-developed ensemble methods in the traditional batch supervised classification framework. While most studies on boosting focus on the batch setting (Freund & Schapire, 1997; Schapire & Singer, 1999), some works have extended the success of boosting to the online setting (Oza & Russell, 2001; Chen et al., 2012).
However, to the best of our knowledge, there is no boosting algorithm yet for the problem of online multiclass prediction in the bandit setting. In this paper, we study the possibility of extending the promising theoretical and empirical results of boosting to the bandit setting. As in the design and analysis of boosting algorithms in other settings, we need an appropriate assumption on weak learners in order for boosting to work. A stronger assumption makes the design of a boosting algorithm easier, but at the expense of more restricted applicability. To weaken the assumption, we consider binary full-information weak learners, instead of multiclass bandit ones, with the given binary examples constructed through the one-versus-rest decomposition of the multiclass examples (Chen et al., 2009). Following (Chen et al., 2012), we propose a similar assumption which requires such binary weak learners to perform better than random guessing with respect to smooth weight distributions over the binary examples. Then we prove that boosting is possible under this assumption by designing a strong bandit algorithm using such binary weak learners.

Our bandit algorithm is extended from the full-information one of (Chen et al., 2012), which provides a method to generate such smooth example weights for updating the weak learners, as well as appropriate voting weights for combining the predictions of the weak learners. Nevertheless, our extension in this paper is non-trivial. To compute these weights exactly in (Chen et al., 2012), one needs the full-information feedback, which is not available in our bandit setting. With the limited information of bandit feedback, we show how to find good estimators for the example weights as well as for the voting weights, and we prove that they can in fact be used to replace the true weights to make boosting work in the bandit setting.

Our proposed bandit boosting algorithm enjoys nice theoretical properties similar to those of its batch counterpart. In particular, the proposed algorithm can achieve a small error rate if the performance of each weak learner is better than that of random guessing with respect to the carefully generated weight distributions. In addition, the algorithm reaches promising empirical performance on real-world data sets, even when using very simple full-information weak learners.

Finally, let us stress the difference between our work and existing ones on the bandit problem. Unlike existing works, our goal is not to construct one specific bandit algorithm and analyze its regret. Instead, our goal is to study the possibility of a general paradigm for designing bandit algorithms in a systematic way. Note that there are currently only a very small number of bandit algorithms for the multiclass prediction problem, and most seem to be based on linear models (Kakade et al., 2008; Hazan & Kale, 2011). With the limited power of such linear models, a high error rate is unavoidable in general, so the focus of these works was to reduce the regret, regardless of whether the actual error rate is high. Our result, on the other hand, works for a broader class of classifiers beyond linear ones. We show how to construct a strong bandit algorithm with an error rate close to zero, when we have weak learners that can perform slightly better than random guessing. Here we allow any weak learners, not just linear ones, and they only need to work in the simpler full-information setting rather than in the more challenging bandit setting. Constructing such weak learners may look much less daunting, but we show that they in fact suffice for constructing strong bandit algorithms. We hope that this could open more possibilities for designing better bandit algorithms in the future.

2. Boosting in different settings

Before formally describing our boosting framework in the online bandit setting, let us first review the traditional batch setting as well as the online full-information setting.
In the batch setting, the boosting algorithm has the whole training set $S = \{(x_1, y_1), \ldots, (x_T, y_T)\}$ available at the beginning, where each $x_t$ is the feature vector from some space $\mathcal{X} \subseteq \mathbb{R}^d$ and $y_t$ is its label from some space $\mathcal{Y}$. For the case of binary classification, we assume $\mathcal{Y} = \{-1, +1\}$, and the boosting algorithm repeatedly calls the batch weak learner for a number of rounds as follows. In round $i$, it feeds $S$ as well as a probability distribution $p^{(i)}$ over $S$ to the weak learner, which then returns a weak hypothesis $h^{(i)}$ after seeing the whole $S$ and $p^{(i)}$. It stops at some round $N$ when the strong hypothesis $H(x) = \mathrm{sign}(\sum_{i=1}^{N} \alpha^{(i)} h^{(i)}(x))$, with $\alpha^{(i)} \in \mathbb{R}$ being the voting weight of $h^{(i)}$, achieves a small error rate over $S$, defined as $|\{t : H(x_t) \neq y_t\}| / T$.

For the case of multiclass classification, we assume $\mathcal{Y} = \{1, \ldots, K\}$, and for simplicity we adopt the one-versus-rest approach to reduce the multiclass problem to a binary one. More precisely, each multiclass example $(x_t, y_t)$ is decomposed into $K$ binary examples $((x_t, k), y_{t,k})$, $k = 1, \ldots, K$, where $y_{t,k}$ is $1$ if $y_t = k$ and $-1$ otherwise. One can then apply the boosting algorithm to such binary examples and use $H(x) = \arg\max_k \sum_i \alpha_k^{(i)} h^{(i)}(x, k)$ as the strong hypothesis for the original multiclass problem.

In the online full-information setting, the examples of $S$ are usually considered as chosen adversarially, and they arrive one at a time. The online boosting algorithm must decide on some number $N$ of online weak learners to start with. At step $t$, the boosting algorithm receives $x_t$ and predicts $H_t(x_t) = \arg\max_k \sum_i \alpha_{t,k}^{(i)} h_t^{(i)}(x_t, k)$, where $h_t^{(i)}$ is the weak hypothesis provided by the $i$-th weak learner and $\alpha_{t,k}^{(i)}$ is its voting weight. After the prediction, the true label $y_t$ is revealed, and to update each weak learner, we would like to feed it with a probability measure on each binary example, as in batch boosting.
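As a small illustration of the one-versus-rest reduction and the weighted-vote strong hypothesis described above, the following sketch (ours, not from the paper) decomposes a multiclass example into $K$ binary examples and predicts by the arg-max of the per-class weighted votes; all function and variable names are illustrative assumptions.

```python
import numpy as np

def decompose(x, y, K):
    """One-versus-rest decomposition: a multiclass example (x, y) becomes
    K binary examples ((x, k), y_k) with y_k = +1 if y == k else -1."""
    return [((x, k), 1 if y == k else -1) for k in range(K)]

def strong_predict(x, K, weak_hypotheses, voting_weights):
    """Strong hypothesis H(x) = argmax_k sum_i alpha[i][k] * h_i(x, k).

    weak_hypotheses: list of N callables h_i(x, k) returning a value in [-1, 1].
    voting_weights:  N x K array of voting weights alpha[i][k].
    """
    scores = np.zeros(K)
    for k in range(K):
        scores[k] = sum(voting_weights[i][k] * h(x, k)
                        for i, h in enumerate(weak_hypotheses))
    return int(np.argmax(scores))
```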

However, in the online setting it is hard to determine a good measure for an example without seeing the remaining examples, so we instead only generate a weight $w_{t,k}^{(i)}$ for $((x_t, k), y_{t,k})$, which after normalization corresponds to the measure $p_{t,k}^{(i)}$, for the $i$-th weak learner. The goal is again to achieve a small error rate over $S$, given that each weak learner has some positive advantage, defined as $\sum_{t,k} p_{t,k}^{(i)} y_{t,k} h_t^{(i)}(x_t, k)$. Chen et al. (2012) proposed an online boosting algorithm that achieves this goal in the binary case, which can be easily adapted to the multiclass case here.

In this paper, we consider the online multiclass prediction problem in the bandit setting. The setting is similar to the full-information one, except that at step $t$ the boosting algorithm only receives the bandit information of whether its prediction is correct or not. The goal is essentially the same: to achieve a small error rate, given that each weak learner has some positive advantage.

Several issues arise in designing such a bandit boosting algorithm. The standard approach in designing a bandit algorithm is to use a full-information algorithm as a black box, with its needed information replaced by some estimated one. Usually, the only information needed by a full-information algorithm is the gradient of the loss function at each step, and this information is used only once, for updating its next strategy or action. As a result, the performance (regret) of such a bandit algorithm can be easily analyzed based on that of the full-information one, as it is usually expressed as a simple function of the gradients. For our boosting problem, we would also like to follow this approach, and the only available full-information boosting algorithm with a theoretical guarantee is that of (Chen et al., 2012). However, it is not obvious what to estimate now, since that algorithm involves three online processes which all need the information $y_t$, but for different purposes. First, the boosting algorithm needs $y_t$ to compute the example weights $w_{t,k}^{(i)}$. Second, the boosting algorithm needs $y_t$ to compute the voting weights $\alpha_{t,k}^{(i)}$. Third, the weak learners also need $y_t$, in addition to the $w_{t,k}^{(i)}$'s, to update their next hypotheses. Can one single bit of bandit information about $y_t$ be used to get good estimators for all three processes? Furthermore, as $y_t$ is used in several places and in a more involved way, the bandit algorithm may not be able to use the full-information one as a simple black box, and its performance (error rate) may not be easily based on that of the full-information one. Finally, it is not clear what appropriate assumption one should make on the weak learners in order for boosting to work in the bandit setting. In fact, it is not even clear what type of weak learners one should use. Perhaps the most natural choice is to use multiclass bandit algorithms. That is, starting from weak multiclass bandit algorithms, we boost them into strong multiclass bandit ones. Surprisingly, we will show that it suffices to use binary full-information algorithms with a positive advantage as weak learners. This not only gives us a stronger result in theory, as a weaker assumption on weak learners is needed, but also provides more possibilities for designing weak learners (and thus strong bandit algorithms) in practice, as most existing multiclass bandit algorithms are linear ones.

We will use the following notation and conventions. For a positive integer $n$, we let $[n]$ denote the set $\{1, \ldots, n\}$. For a condition $\pi$, we use the notation $1[\pi]$, which gives the value $1$ if $\pi$ holds and $0$ otherwise. For simplicity, we assume that each $x_t$ has length $\|x_t\|_2 \leq 1$ and each hypothesis $h_t$ comes from some family $\mathcal{H}$ with $h_t(x, k) \in [-1, 1]$.

3. Online weak learners

In this section, we study reasonable assumptions on weak learners for allowing boosting to work in the bandit setting.
As mentioned in the previous section, instead of using multiclass bandit algorithms as weak learners, we will use binary full-information ones. A natural assumption to make is for such a binary full-information algorithm to achieve a positive advantage with respect to any example weights. However, as noted in (Chen et al., 2012), this assumption is too strong to achieve, as one cannot expect an online algorithm to achieve a positive advantage in extreme cases, such as when only the first example has a nonzero weight. Thus, some constraints must be put on the example weights.

To identify an appropriate constraint, let us follow (Chen et al., 2012) and consider the case that each hypothesis $h_t$ consists of $K$ linear functions with $h_t(x, k) = \langle h_{t,k}, x \rangle$, the inner product of the two vectors $h_{t,k}$ and $x$, with $\|h_{t,k}\|_2 \leq 1$. When given an example $(x_t, k)$, the weak learner uses $h_{t,k}$ to predict the binary label $y_{t,k}$. After that, it receives $y_{t,k}$ as well as the example weight $w_{t,k}$, and uses them to update $h_{t,k}$ into a new $h_{(t+1)k}$. We can reduce the task of such a weak learner to the well-known online linear optimization problem, by using the reward function $r_t(h_{t,k}) = w_{t,k}\, y_{t,k} \langle h_{t,k}, x_t \rangle$, which is linear in $h_{t,k}$. Then we can apply the online gradient descent algorithm of (Zinkevich, 2003) to generate $h_{t,k}$ at step $t$, and a standard regret analysis shows that for some constant $c > 0$,

$\sum_t w_{t,k} y_{t,k} \langle h_{t,k}, x_t \rangle \geq \sum_t w_{t,k} y_{t,k} \langle h_k, x_t \rangle - c \sqrt{\textstyle\sum_t w_{t,k}^2}$

for any $h_k$ with $\|h_k\|_2 \leq 1$. Summing over $k \in [K]$ and using the Cauchy-Schwarz inequality, we get

$\sum_{t,k} w_{t,k} y_{t,k} \langle h_{t,k}, x_t \rangle \geq \sum_{t,k} w_{t,k} y_{t,k} \langle h_k, x_t \rangle - c\sqrt{K} \sqrt{\textstyle\sum_{t,k} w_{t,k}^2}.$

Let $\bar{w}$ denote the total weight $\sum_{t,k} w_{t,k}$, so that $p_{t,k} = w_{t,k}/\bar{w}$ is the measure of example $(x_t, k)$.
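Before continuing the derivation, the reduction just described can be illustrated with a short sketch (ours, not the authors' implementation, and only under the linear assumption above): one linear function per class, updated by projected online gradient ascent on the weighted linear reward; the fixed step size is an arbitrary illustrative choice.

```python
import numpy as np

class LinearOGDWeakLearner:
    """One linear function per class, h(x, k) = <h_k, x> with ||h_k|| <= 1,
    updated by projected online gradient ascent on r(h) = w * y * <h, x>."""

    def __init__(self, n_features, n_classes, eta=0.1):
        self.h = np.zeros((n_classes, n_features))
        self.eta = eta  # illustrative fixed step size

    def predict(self, x, k):
        # Value in [-1, 1] because ||h_k|| <= 1 and ||x|| <= 1.
        return float(np.dot(self.h[k], x))

    def update(self, x, k, y, w):
        """y in {-1, +1}; w >= 0 is the example weight supplied by the booster."""
        # Gradient of the linear reward w * y * <h_k, x> is w * y * x.
        self.h[k] += self.eta * w * y * x
        # Project back onto the unit ball to keep ||h_k|| <= 1.
        norm = np.linalg.norm(self.h[k])
        if norm > 1.0:
            self.h[k] /= norm
```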

Dividing both sides of the inequality above by $\bar{w}$, we obtain

$\sum_{t,k} p_{t,k} y_{t,k} \langle h_{t,k}, x_t \rangle \geq \sum_{t,k} p_{t,k} y_{t,k} \langle h_k, x_t \rangle - c\sqrt{K} \sqrt{\textstyle\sum_{t,k} w_{t,k}^2}\,/\,\bar{w}.$

Note that $\sum_{t,k} p_{t,k} y_{t,k} \langle h_k, x_t \rangle$ is the advantage of the offline learner, and suppose that it is at least $3\gamma > 0$. Moreover, suppose the example weights are large, in the sense that they satisfy the following condition:

$\bar{w} \geq cKB/\gamma^2$,   (1)

where $B \geq \max_{t,k} w_{t,k}$ is a constant that will be fixed later. Then, since $\sum_{t,k} w_{t,k}^2 \leq B\,\bar{w}$, the advantage of the online weak learner becomes

$\sum_{t,k} p_{t,k} y_{t,k} \langle h_{t,k}, x_t \rangle \geq 3\gamma - c\sqrt{K}\sqrt{\textstyle\sum_{t,k} w_{t,k}^2}\,/\,\bar{w} \geq 3\gamma - \gamma = 2\gamma.$

This motivates us to propose the following assumption on weak learners, which need not be linear ones.

Assumption 1. There is an online full-information weak learner which can achieve an advantage $2\gamma > 0$ for any sequence of examples and weights satisfying condition (1).

From the discussion above, we have the following.

Lemma 1. Suppose that for any sequence of examples and weights satisfying condition (1), there exists an offline linear hypothesis with an advantage $3\gamma > 0$. Then Assumption 1 holds.

Let us make two remarks on Assumption 1. First, the assumption that a weak learner has a positive advantage is just the assumption that it predicts better than random guessing, which is the standard assumption used by (almost) all previous batch boosting algorithms. Second, the condition (1) on example weights actually makes our assumption weaker, which in turn makes the boosting task harder and our boosting result in the next section stronger. More precisely, we only require the weak learner to perform well (having a positive advantage) when the weights are large, and we do not care how badly it may perform with small weights. In fact, we will make our boosting algorithm call the weak learner with large weights.

4. Our bandit boosting algorithm

In this section we show how to design a bandit boosting algorithm under Assumption 1. Let WL be such an online full-information weak learner; we will run $N$ copies of WL, for some $N$ to be determined later. We follow the approach of reducing the multiclass problem to a binary one as described in Section 2, and we base our bandit boosting algorithm on the full-information one of (Chen et al., 2012), which works for binary classification in the full-information setting. More precisely, at step $t$ we do the following after receiving the feature vector $x_t$. For each class $k \in [K]$, a new feature vector $(x_t, k)$ is created, we obtain a binary weak hypothesis $h_{t,k}^{(i)}(x) = h_t^{(i)}(x, k)$ from the $i$-th weak learner, for $i \in [N]$, and we form the strong hypothesis $H_t(x) = \arg\max_{k \in [K]} f_{t,k}(x)$, with $f_{t,k}(x) = \sum_{i=1}^{N} \alpha_{t,k}^{(i)} h_{t,k}^{(i)}(x)$, where $\alpha_{t,k}^{(i)}$ is some voting weight for the $i$-th weak learner. Then we make our prediction $\hat{y}_t$ based on $H_t(x_t)$ in some way and receive the feedback $1[\hat{y}_t = y_t]$. Using the feedback, we prepare some example weight $\tilde{w}_{t,k}^{(i)}$ to update the $i$-th weak learner, as well as to compute the next voting weight $\alpha_{(t+1)k}^{(i)}$, for $i \in [N]$. It remains to show how to set the example weights and the voting weights, as well as how to choose $\hat{y}_t$, which we describe and analyze in detail next. The complete algorithm is given in Algorithm 1.

Algorithm 1: Bandit boosting algorithm with online weak learner WL
  Input: streaming examples $(x_1, y_1), \ldots, (x_T, y_T)$.
  Parameters: $0 < \delta < 1$, $0 \leq \theta < \gamma < 1/2$.
  Choose $\alpha_{1,k}^{(i)} = 1/N$ and a random $h_{1,k}^{(i)}$ for $k \in [K]$, $i \in [N]$.
  for $t = 1$ to $T$ do
    Let $H_t(x) = \arg\max_{k \in [K]} \sum_{i \in [N]} \alpha_{t,k}^{(i)} h_{t,k}^{(i)}(x)$.
    Let $p_t(k) = (1 - \delta)\, 1[k = H_t(x_t)] + \delta/K$ for $k \in [K]$.
    Predict $\hat{y}_t$ according to the distribution $p_t$.
    Receive the information $1[\hat{y}_t = y_t]$.
    for $k = 1$ to $K$ and $i = 1$ to $N$ do
      Update $\tilde{w}_{t,k}^{(i)}$ according to (4).
      If $\hat{y}_t = k$, call WL$(h_{t,k}^{(i)}, (x_t, k), y_{t,k}, \tilde{w}_{t,k}^{(i)})$ to obtain $h_{(t+1)k}^{(i)}$; otherwise, let $h_{(t+1)k}^{(i)} = h_{t,k}^{(i)}$.
      Update $\alpha_{(t+1)k}^{(i)}$ according to (6).
    end for
  end for
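To make the flow of Algorithm 1 concrete, here is a minimal illustrative sketch (ours, not the authors' code) of one step of the loop. The weak-learner interface matches the sketch from Section 3, and for brevity the full example-weight computation of equations (2)-(4) is replaced by plain importance weighting, while the voting-weight update of equation (6) is omitted; all names are assumptions.

```python
import numpy as np

def bandit_boost_step(x, y_true, learners, alpha, delta, rng):
    """One step of the bandit boosting loop (simplified illustration).

    learners: list of N weak learners, each exposing predict(x, k) -> [-1, 1]
              and update(x, k, y, w), e.g. LinearOGDWeakLearner above.
    alpha:    (N, K) array of voting weights (each column on the simplex).
    """
    N, K = alpha.shape
    # Strong hypothesis: per-class weighted vote over the weak hypotheses.
    scores = np.array([sum(alpha[i, k] * learners[i].predict(x, k)
                           for i in range(N)) for k in range(K)])
    H = int(np.argmax(scores))

    # Exploration: play H with probability 1 - delta, otherwise a uniform label.
    p = np.full(K, delta / K)
    p[H] += 1.0 - delta
    y_hat = int(rng.choice(K, p=p))

    # Bandit feedback: we only learn whether y_hat equals the true label.
    correct = (y_hat == y_true)

    # Only the binary problem with k == y_hat can be updated, because only there
    # do we know the binary label y_{t,k}; the weight is importance-weighted by
    # 1 / p(k) (a stand-in for eq. (4), with the base weight of eq. (2) set to 1).
    k = y_hat
    y_bin = 1 if correct else -1
    w_tilde = 1.0 / p[k]
    for i in range(N):
        learners[i].update(x, k, y_bin, w_tilde)
    # (The voting weights alpha would also be updated here, via eq. (6).)
    return y_hat
```

A typical way to drive this sketch would be `rng = np.random.default_rng(0)`, `alpha = np.full((N, K), 1.0 / N)` (matching the initialization in Algorithm 1), and a list of `N` weak learners such as the one sketched in Section 3.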
The example weight of $(x_t, k)$ for the $i$-th weak learner used by the full-information algorithm of (Chen et al., 2012) is

$w_{t,k}^{(i)} = \min\left\{(1-\gamma)^{z_{t,k}^{(i-1)}/2},\ 1\right\}$,   (2)

where $z_{t,k}^{(0)} = 0$ and $z_{t,k}^{(i-1)}$, for $i \geq 2$, is defined as

$z_{t,k}^{(i-1)} = \sum_{j=1}^{i-1}\left(y_{t,k}\, h_{t,k}^{(j)}(x_t) - \theta\right)$, with $\theta = \gamma/(2+\gamma)$,   (3)

which depends on the information $y_t$. As we are in the bandit setting, we do not have $y_t$ to compute such weights in general.
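As a small illustration (assuming the notation above; not code from the paper), the full-information example weight of equations (2)-(3) can be computed from the margins of the earlier weak learners as follows.

```python
def example_weight(margins, gamma):
    """Full-information example weight of (x_t, k) for weak learner i,
    following eqs. (2)-(3): w = min((1 - gamma) ** (z / 2), 1) with
    z = sum_{j < i} (y_{t,k} * h^{(j)}(x_t, k) - theta), theta = gamma / (2 + gamma).

    margins: values y_{t,k} * h^{(j)}(x_t, k) for the earlier learners j < i.
    """
    theta = gamma / (2.0 + gamma)
    z = sum(m - theta for m in margins)
    return min((1.0 - gamma) ** (z / 2.0), 1.0)
```

Note that when the accumulated margin $z$ is nonpositive, the exponential term is at least $1$ and the weight saturates at $1$, which is exactly the property used in the proof of Lemma 2 below.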

Thus, we balance exploitation with exploration by independently predicting $H_t(x_t)$ with probability $1 - \delta$ and a random label with probability $\delta$, with $\delta \leq \gamma$; let $\hat{y}_t$ denote our prediction. For $k \in [K]$, let $p_t(k)$ denote the probability that $\hat{y}_t = k$. Then we replace the example weight $w_{t,k}^{(i)}$ by the estimator

$\tilde{w}_{t,k}^{(i)} = w_{t,k}^{(i)}/p_t(k)$ if $\hat{y}_t = k$, and $\tilde{w}_{t,k}^{(i)} = 0$ otherwise,   (4)

which we can compute, because when $\hat{y}_t = k$, we do have $y_{t,k}$ to compute $w_{t,k}^{(i)}$. As $p_t(k) \geq \delta/K$, we can choose $B = K/\delta$ and have $w_{t,k}^{(i)} \leq 1$ and $\tilde{w}_{t,k}^{(i)} \leq B$ for any $t, k, i$. Note that the $\tilde{w}_{t,k}^{(i)}$ are random variables, and the following shows that they are in fact good estimators for the $w_{t,k}^{(i)}$'s.

Claim 1. For any $t, k, i$, $\mathbb{E}[\tilde{w}_{t,k}^{(i)}] = \mathbb{E}[w_{t,k}^{(i)}]$. For any $k, i$ and $\lambda > 0$,

$\Pr\left[\left|\sum_t \tilde{w}_{t,k}^{(i)} - \sum_t w_{t,k}^{(i)}\right| > \lambda T\right] \leq 2^{-\Omega(\lambda^2 T/B^2)}.$

Proof. Observe that any fixing of the randomness up to step $t-1$ leaves $\tilde{w}_{t,k}^{(i)}$ and $w_{t,k}^{(i)}$ with the same conditional expectation. Thus, $\mathbb{E}[\tilde{w}_{t,k}^{(i)}] = \mathbb{E}[w_{t,k}^{(i)}]$. Moreover, as the random variables $M_t = \tilde{w}_{t,k}^{(i)} - w_{t,k}^{(i)}$, for $t \in [T]$, form a martingale difference sequence with $|M_t| \leq B$, the probability bound follows from Azuma's inequality.

This claim allows us to use $\tilde{w}_{t,k}^{(i)}$ as the example weight of $(x_t, k)$ to update the $i$-th weak learner. However, as each weak learner is assumed to be a full-information one, it also needs the label $y_{t,k}$ to update, which we may not know. One may try to take a similar approach as before and feed the weak learner with an estimator which is $y_{t,k}/p_t(k)$ when $\hat{y}_t = k$ and $0$ otherwise, but this does not work, as it does not take a value in $\{-1, 1\}$ as needed by the binary weak learner. Instead, we take a different approach: we only call the weak learner to update when $\hat{y}_t = k$, so that we know $y_{t,k}$. That is, when $\hat{y}_t = k$, we call the $i$-th weak learner with $\tilde{w}_{t,k}^{(i)}$ and $y_{t,k}$, which can then update and return the next weak hypothesis $h_{(t+1)k}^{(i)}$; otherwise, we do not call the $i$-th weak learner to update and we simply let the next hypothesis $h_{(t+1)k}^{(i)}$ be the current $h_{t,k}^{(i)}$.

Another issue is that a weak learner is only assumed to work well when given large example weights satisfying condition (1), and even then, it only works well on those examples which are given to it to update. This is dealt with by the following.

Lemma 2. Let $\delta \in [0, 1]$, let $m$ be the largest number such that $\sum_{t,k} \tilde{w}_{t,k}^{(i)} \geq \delta K T$ for every $i \leq m$, and let $f_{t,k}(x) = \sum_{i=1}^{m} \frac{1}{m} h_{t,k}^{(i)}(x)$. Then when $T \geq c'(K^2/\delta^4)\log(K/\delta)$ for a large enough constant $c'$,

$\Pr\left[\left|\{(t, k) : y_{t,k} f_{t,k}(x_t) < \theta\}\right| > 2\delta K T\right] \leq \delta,$

for the parameter $\theta = \gamma/(2+\gamma)$ introduced in (3).

Proof. Note that according to the definition, for any $t$ and $k$, $w_{t,k}^{(m+1)} \leq 1$, and $w_{t,k}^{(m+1)} = 1$ if $y_{t,k} f_{t,k}(x_t) < \theta$, as then $z_{t,k}^{(m)} = m(y_{t,k} f_{t,k}(x_t) - \theta) \leq 0$. This implies that

$\left|\{(t, k) : y_{t,k} f_{t,k}(x_t) < \theta\}\right| \leq \sum_{t,k} w_{t,k}^{(m+1)}.$

As $\sum_{t,k} \tilde{w}_{t,k}^{(m+1)} < \delta K T$ by the definition of $m$, we have

$\Pr\left[\left|\{(t, k) : y_{t,k} f_{t,k}(x_t) < \theta\}\right| > 2\delta K T\right] \leq \Pr\left[\sum_{t,k} w_{t,k}^{(m+1)} - \sum_{t,k} \tilde{w}_{t,k}^{(m+1)} > \delta K T\right],$

which by a union bound and Claim 1 is at most $K \cdot 2^{-\Omega(\delta^2 T/B^2)} \leq \delta$.
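Before moving on, here is a minimal sketch of the estimator (4) used above (our illustration, not the authors' code); the caller is assumed to pass in the full-information weight of equation (2), which is computable exactly when the prediction matches class $k$.

```python
def estimated_example_weight(w, p_k, y_hat, k):
    """Bandit estimator of eq. (4): w_tilde = w / p(k) if y_hat == k, else 0.

    w:    the full-information weight from eq. (2); when y_hat == k, the bandit
          feedback reveals the binary label y_{t,k}, so w can be computed.
    p_k:  probability that the algorithm predicted class k at this step.
    """
    return w / p_k if y_hat == k else 0.0
```

Since class $k$ is predicted with probability $p_t(k)$, the estimator takes the value $w/p_t(k)$ with probability $p_t(k)$ and $0$ otherwise, so its conditional expectation is exactly $w$; this is the unbiasedness stated in Claim 1.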

The following lemma gives an upper bound on the parameter $m$ defined in Lemma 2.

Lemma 3. Suppose Assumption 1 holds and $T \geq cK/(\delta^2\gamma^2)$ for the constant $c$ in condition (1). Then the parameter $m$ in Lemma 2 is at most $O(K/(\delta^2\gamma^2))$.

Proof. Note that for any $i \in [m]$, $\sum_{t,k} \tilde{w}_{t,k}^{(i)} \geq \delta K T \geq cKB/\gamma^2$, with $B = K/\delta$, so condition (1) is satisfied. Thus from Assumption 1, we have

$\sum_{i,t,k} \tilde{w}_{t,k}^{(i)} y_{t,k} h_{t,k}^{(i)}(x_t) \geq 2\gamma \sum_{i,t,k} \tilde{w}_{t,k}^{(i)}$,   (5)

with the sums over $i$ above, as well as in the rest of the proof, taken over $i \in [m]$. On the other hand, we have the following claim.

Claim 2. $\sum_{i,t,k} \tilde{w}_{t,k}^{(i)} y_{t,k} h_{t,k}^{(i)}(x_t) \leq O(BKT/\gamma) + \gamma \sum_{i,t,k} \tilde{w}_{t,k}^{(i)}.$

We omit its proof here as it is very similar to that of Lemma 5 in (Servedio, 2003). (Although that lemma is for the $w_{t,k}^{(i)}$'s, its proof can be easily modified to work for the $\tilde{w}_{t,k}^{(i)}$'s, but with an additional factor of $B$ appearing in the term $O(BKT/\gamma)$ here.) Combining the bound in Claim 2 with the inequality (5), we have $\gamma \sum_{i,t,k} \tilde{w}_{t,k}^{(i)} \leq O(BKT/\gamma)$. Since $\sum_{t,k} \tilde{w}_{t,k}^{(i)} \geq \delta K T$ for $i \in [m]$ and $B = K/\delta$, we get $\gamma m \delta K T \leq O(K^2 T/(\delta\gamma))$. From this, the required bound on $m$ follows, and we have the lemma.

Let us suppose that $T \geq c'(K^2/\delta^4)\log(K/\delta)$ for a large enough constant $c'$, so that both lemmas apply. Then Lemma 2 shows that one can obtain a strong learner by combining the first $m$ weak learners. However, one cannot determine the number $m$ before seeing all the examples, and in fact in our online setting, we need to decide the number $N$ of weak learners even before seeing the first example. Following (Chen et al., 2012), we set $N$ to be the upper bound given by Lemma 3. Then at step $t$, for each $k \in [K]$, we consider the function $f_{t,k}(x) = \sum_{i=1}^{N} \alpha_{t,k}^{(i)} h_{t,k}^{(i)}(x)$ and reduce the task of finding such $\alpha_{t,k} = (\alpha_{t,k}^{(1)}, \ldots, \alpha_{t,k}^{(N)})$ to the Online Convex Programming problem. More precisely, we use the $N$-dimensional probability simplex, denoted by $\mathcal{P}_N$, as the feasible set and define the loss function as

$L_{t,k}(\alpha) = \max\left\{0,\ \theta - y_{t,k} \sum_{i=1}^{N} \alpha^{(i)} h_{t,k}^{(i)}(x_t)\right\},$

which is a convex function of $\alpha$. However, unlike in (Chen et al., 2012), we are in the bandit setting and thus may not know $y_{t,k}$. To overcome this, we use a similar idea as before to estimate a subgradient $\nabla L_{t,k}(\alpha_{t,k})$ by

$\ell_{t,k} = \nabla L_{t,k}(\alpha_{t,k})/p_t(k)$ if $\hat{y}_t = k$, and $\ell_{t,k} = 0$ otherwise.

One can then use $\ell_{t,k} = (\ell_{t,k}^{(1)}, \ldots, \ell_{t,k}^{(N)})$ to perform gradient descent to update $\alpha_{t,k}$ as in (Chen et al., 2012). However, to get a better theoretical bound, here we choose to perform a multiplicative update on $\alpha_{t,k}$ to get $\alpha_{(t+1)k} = (\alpha_{(t+1)k}^{(1)}, \ldots, \alpha_{(t+1)k}^{(N)})$ for step $t+1$, with

$\alpha_{(t+1)k}^{(i)} = \alpha_{t,k}^{(i)}\, e^{-\eta \ell_{t,k}^{(i)}} / Z_{(t+1)k}$,   (6)

where $Z_{(t+1)k}$ is the normalization factor and $\eta$ is the learning rate, which we set to $\delta^3/K$. Then we have the following.

Lemma 4. $\Pr\left[\sum_{t,k} L_{t,k}(\alpha_{t,k}) \leq O(\delta K T)\right] \geq 1 - 2\delta.$

Proof. Following the standard analysis, one can show that for any $k \in [K]$ and any $\bar{\alpha}_k \in \mathcal{P}_N$,

$\sum_t \langle \ell_{t,k}, \alpha_{t,k} - \bar{\alpha}_k \rangle \leq O((\log N)/\eta) + \eta \sum_t \|\ell_{t,k}\|_\infty^2 \leq O(\delta K T)$,   (7)

since $\|\ell_{t,k}\|_\infty^2 \leq B^2 = K^2/\delta^2$, $\eta = \delta^3/K$ and $N \leq O(K/\delta^4)$. Now note that for any $t$ and $k$, $\mathbb{E}[\langle \ell_{t,k}, \alpha_{t,k} - \bar{\alpha}_k \rangle] = \mathbb{E}[\langle \nabla L_{t,k}(\alpha_{t,k}), \alpha_{t,k} - \bar{\alpha}_k \rangle]$, because given the randomness up to step $t-1$, $\alpha_{t,k}$ is fixed and the conditional expectation of $\ell_{t,k}$ equals $\nabla L_{t,k}(\alpha_{t,k})$. Moreover, as $L_{t,k}(\alpha_{t,k}) - L_{t,k}(\bar{\alpha}_k) \leq \langle \nabla L_{t,k}(\alpha_{t,k}), \alpha_{t,k} - \bar{\alpha}_k \rangle$ for convex $L_{t,k}$, we have $\mathbb{E}[L_{t,k}(\alpha_{t,k}) - L_{t,k}(\bar{\alpha}_k)] \leq \mathbb{E}[\langle \ell_{t,k}, \alpha_{t,k} - \bar{\alpha}_k \rangle]$. Then using the bound in (7) and applying a similar martingale analysis as before, one can show that for any $\bar{\alpha}_k \in \mathcal{P}_N$,

$\Pr\left[\sum_{t,k}\left(L_{t,k}(\alpha_{t,k}) - L_{t,k}(\bar{\alpha}_k)\right) > O(\delta K T)\right] \leq \delta.$

Let $\bar{\alpha}_k = (\bar{\alpha}_k^{(1)}, \ldots, \bar{\alpha}_k^{(N)})$, with $\bar{\alpha}_k^{(i)} = 1/m$ for $i \leq m$ and $\bar{\alpha}_k^{(i)} = 0$ for $i > m$, so that

$L_{t,k}(\bar{\alpha}_k) = \max\{0,\ \theta - y_{t,k} f_{t,k}(x_t)\} \leq (1+\theta)\, 1[y_{t,k} f_{t,k}(x_t) < \theta].$

Then we know from Lemma 2 that

$\Pr\left[\sum_{t,k} L_{t,k}(\bar{\alpha}_k) > (1+\theta)\, 2\delta K T\right] \leq \delta.$

Combining the two probability bounds together, we have the lemma.
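For concreteness, here is a hedged sketch (our illustration, not the authors' code) of the multiplicative update (6) with the importance-weighted subgradient estimate; the function and argument names are assumptions.

```python
import numpy as np

def update_voting_weights(alpha_k, h_values, y_bin, p_k, matched, theta, eta):
    """Multiplicative (Hedge-style) update of eq. (6) on the probability simplex.

    alpha_k:  current voting weights (length N, nonnegative, summing to 1).
    h_values: weak hypothesis values h^{(i)}(x_t, k), i = 1..N.
    y_bin:    binary label y_{t,k} in {-1, +1} (known only when matched is True).
    p_k:      probability of having predicted class k at this step.
    matched:  whether y_hat == k, i.e. whether the feedback applies to class k.
    """
    alpha_k = np.asarray(alpha_k, dtype=float)
    h_values = np.asarray(h_values, dtype=float)
    if not matched:
        return alpha_k  # estimated subgradient is zero, weights unchanged
    # Subgradient of L(alpha) = max(0, theta - y * <alpha, h>) w.r.t. alpha.
    margin = y_bin * float(np.dot(alpha_k, h_values))
    grad = -y_bin * h_values if margin < theta else np.zeros_like(h_values)
    l_hat = grad / p_k                      # importance-weighted estimate
    new_alpha = alpha_k * np.exp(-eta * l_hat)
    return new_alpha / new_alpha.sum()      # renormalize onto the simplex
```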

Finally, recall that to predict each $y_t$, we independently output $H_t(x_t) = \arg\max_{k \in [K]} f_{t,k}(x_t)$ with probability $1 - \delta$ and a random label with probability $\delta$. Thus, by a Chernoff bound, our algorithm makes at most $|\{t : H_t(x_t) \neq y_t\}| + 2\delta T$ errors with probability $1 - 2^{-\Omega(\delta^2 T)} \geq 1 - \delta$. On the other hand, as

$1[H_t(x_t) \neq y_t] \leq \sum_k 1[y_{t,k} f_{t,k}(x_t) < 0] \leq \sum_k L_{t,k}(\alpha_{t,k})/\theta,$

Lemma 4 implies that

$\Pr\left[|\{t : H_t(x_t) \neq y_t\}| \leq O(\delta K T/\theta)\right] \geq 1 - 2\delta.$

Consequently, for $\theta = \gamma/(2+\gamma)$, we can conclude that our algorithm makes at most $O(\delta K T/\theta) + 2\delta T \leq O(\delta K T/\gamma)$ errors with probability at least $1 - 3\delta$. Therefore, we have the following, which is the main result of our paper.

Theorem 1. Suppose Assumption 1 holds and $T \geq c'(K^2/\delta^4)\log(K/\delta)$ for a large enough constant $c'$. Then our bandit algorithm uses $O(K/(\delta^2\gamma^2))$ weak learners and makes $O(\delta K T/\gamma)$ errors with probability $1 - 3\delta$.

Note that the error rate of our algorithm is $O(\delta K/\gamma)$, which can be made smaller than any $\varepsilon$ by setting $\delta = O(\varepsilon\gamma/K)$, with the requirement on $T$ and the number of weak learners adjusted accordingly. We remark that we did not attempt to optimize our bounds (which we believe can be improved), as our focus was on establishing the possibility of boosting in the bandit setting. Moreover, it does not seem appropriate to compare our error bound with the regret bounds of existing bandit algorithms. This is because existing algorithms are usually based on linear classifiers, which may have large error rates even though their regrets are small. On the other hand, our boosting algorithm works for any type of classifiers and achieves a small error rate as long as we have weak learners which satisfy Assumption 1.

5. Experiments

In this section, we validate the empirical performance of the proposed algorithm on several real-world data sets. We compare with two representative algorithms. The first one is Banditron (Kakade et al., 2008), which is one of the first proposed algorithms for the bandit setting. It is modified from a multiclass variant of the well-known Perceptron algorithm (Rosenblatt, 1962) using the so-called Kesler's construction (Duda & Hart, 1973). By doing some random exploration, it can accurately construct an estimate of the update step of the full-information multiclass Perceptron. The algorithm has a good theoretical guarantee, especially when the data is linearly separable. The algorithm can be viewed as a direct modification of a full-information learner (Perceptron) for the bandit setting, without combining the learners for boosting.

The second one is Conservative OVA (C-OVA) (Chen et al., 2009), which uses the one-versus-all multiclass-to-binary decomposition similar to our algorithm. But unlike most of the bandit algorithms, it does not do random exploration at all. Instead, it conservatively updates using whatever it gets from the partial feedback, hence the name. Note that although it embeds an online binary learning algorithm as its base learner, it does not perform boosting by combining several base learners like our algorithm does. Also, C-OVA performs a margin-based decoding of the binary classification results, and hence may not work well with non-margin-based base learners.

To demonstrate the boosting ability of our proposed algorithm, we choose two completely different types of online binary classifiers as our weak learners. The first one is Perceptron, a standard margin-based linear classifier. Note that (Chen et al., 2009) used the similar but more complex Online Passive-Aggressive Algorithm (PA) (Crammer et al., 2006) as its internal learner. Since we found little difference in performance between the PA algorithm and the Perceptron algorithm on the data sets we tested, we only report the results using the simpler and more famous Perceptron algorithm, to compare fairly with Banditron. The second weak learner we use is Naive Bayes, a simple statistical classifier that estimates the posterior probability of each class using Bayes' theorem and the assumption of conditional independence between features.

5.1. Results

We test our algorithm on 5 public real-world data sets from various domains with different sizes: CAR, NURSERY, and CONNECT4 from the UCI machine learning repository (Frank & Asuncion, 2010); DNA from the Statlog project (Michie et al., 1994); REUTERS4 from the paper of (Kakade et al., 2008).

Table 1. The data sets used in our experiments.

Data set   | Car    | DNA    | Nursery | Connect4 | Reuters4
#classes   | 4      | 3      | 5       | 3        | 4
#features  |        | 180    |         |          | 346,810
#examples  | 1,728  | 3,186  | 12,960  | 67,557   | 673,768
Basic information of these data sets is summarized in Table 1. As described previously, each example is first used for prediction before the disclosure of its label, and the error rate is the number of prediction errors divided by the total number of examples. All the experiments are repeated 10 times with different random orderings of the examples. For fairness of comparison to Banditron, we do not tune the parameters other than the exploration rate. We fix the number of weak learners and the assumed weak learner advantage $\gamma$ as in the full-information online boosting algorithm (Chen et al., 2012). For the exploration rate, we test a wide range of values to see the effect of random exploration.

The results are shown in Figure 1. Note that C-OVA is not included in this figure, since C-OVA does not perform random exploration at all and is parameter-free. One can see that for a reasonable range of values of the exploration rate (around 0.1), the performance of our algorithm is quite strong and relatively stable, while setting it too high or too low results in worse performance, as expected. Table 2 summarizes the average error rate and the standard deviation when the best choices of the exploration rate are used in Banditron and in our algorithm.

Let us first focus on the case when Perceptron is used as the weak learner. Here, the categorical features are transformed into numerical ones by decomposition into binary vectors. We can see that the proposed bandit boosting algorithm consistently outperforms Banditron on all the data sets, and is also comparable to C-OVA, especially on larger data sets.

Figure 1. (a)-(e): Error rate of Banditron, BanditBoost + Perceptron, and BanditBoost + NaiveBayes under different values of the exploration rate on the CAR, DNA, NURSERY, CONNECT4, and REUTERS4 data sets. (f): Learning curve (error rate versus number of examples) on REUTERS4 using the best exploration rate, comparing Banditron, C-OVA + Perceptron, and BanditBoost + Perceptron.

Table 2. Average (over 10 trials) error rate (%) and standard deviation of Banditron, C-OVA, and BanditBoost, the latter two each with Perceptron and Naive Bayes weak learners, on the five data sets (Naive Bayes results on REUTERS4 are N/A).

To take a closer look at the performance of these algorithms, we plot the learning curve for the largest data set (REUTERS4) in Figure 1(f). One can see that our algorithm begins to outperform the other algorithms when the number of examples is sufficiently large. This is due to the more complex model we use and the need for random exploration, as opposed to the deterministic C-OVA algorithm. Note that this is in accordance with our analysis in Theorem 1, as the error bound only holds when the number of rounds $T$ is large.

Next, let us consider the situation when the weak learner is switched to Naive Bayes. Note that here we did not test on the REUTERS4 data set, due to the slow inference of Naive Bayes for high-dimensional data. It can be seen that our algorithm consistently reaches the best performance on all the data sets. Moreover, we see a large difference between C-OVA and our algorithm, especially on the DNA and NURSERY data sets. The superiority echoes the earlier conjecture that C-OVA may not work well with non-margin-based base learners. On the other hand, the proposed bandit boosting algorithm enjoys a stronger theoretical guarantee and works well with various types of weak learners.

6. Conclusion

We propose a boosting algorithm to efficiently generate strong multiclass bandit learners by exploiting the abundance of existing online binary learners. The proposed algorithm can be viewed as a careful combination of the online boosting algorithm for binary classification (Chen et al., 2012) and some key estimation techniques from the bandit algorithms. While the proposed algorithm is simple, we show some non-trivial theoretical analysis that leads to a sound theoretical guarantee. To the best of our knowledge, our proposed boosting algorithm is the first one that comes with such a theoretical guarantee. In addition, experimental results on real-world data sets show that the proposed bandit boosting algorithm can be easily coupled with different weak binary learners to reach promising performance.

References

Auer, P., Cesa-Bianchi, N., Freund, Y., and Schapire, R. E. The nonstochastic multi-armed bandit problem. SIAM Journal on Computing, 32(1):48-77, 2002.

Chen, G., Chen, G., Zhang, J., Chen, S., and Zhang, C. Beyond Banditron: A conservative and efficient reduction for online multiclass prediction with bandit setting model. In Proceedings of ICDM, pp. 71-80, 2009.

Chen, S.-T., Lin, H.-T., and Lu, C.-J. An online boosting algorithm with theoretical justifications. In Proceedings of ICML, July 2012.

Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., and Singer, Y. Online passive-aggressive algorithms. Journal of Machine Learning Research, 7:551-585, December 2006.

Duda, R. O. and Hart, P. E. Pattern Classification and Scene Analysis. Wiley, 1973.

Flaxman, A. D., Kalai, A. T., and McMahan, H. B. Online convex optimization in the bandit setting: gradient descent without a gradient. In Proceedings of SODA, Philadelphia, PA, USA, 2005.

Frank, A. and Asuncion, A. UCI machine learning repository, 2010. URL http://archive.ics.uci.edu/ml.

Freund, Y. and Schapire, R. E. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119-139, 1997.

Hazan, E. and Kale, S. Newtron: an efficient bandit algorithm for online multiclass prediction. In Proceedings of NIPS, 2011.

Kakade, S. M., Shalev-Shwartz, S., and Tewari, A. Efficient bandit algorithms for online multiclass prediction. In Proceedings of ICML, New York, NY, USA, 2008.

Li, L., Chu, W., Langford, J., and Schapire, R. E. A contextual-bandit approach to personalized news article recommendation. In Proceedings of WWW, New York, NY, USA, 2010. ACM.

Michie, D., Spiegelhalter, D. J., and Taylor, C. C. Machine Learning, Neural and Statistical Classification, 1994.

Oza, N. C. and Russell, S. Online bagging and boosting. In Proceedings of AISTATS, 2001.

Rosenblatt, F. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan, 1962.

Schapire, R. E. The strength of weak learnability. Machine Learning, 5(2):197-227, July 1990.

Schapire, R. E. and Singer, Y. Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37(3):297-336, December 1999.

Servedio, R. A. Smooth boosting and learning with malicious noise. Journal of Machine Learning Research, 4:633-648, 2003.

Zinkevich, M. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of ICML, 2003.


Conservative Contextual Linear Bandits

Conservative Contextual Linear Bandits Conservaive Conexual Linear Bandis Abbas Kazerouni Sanford Universiy abbask@sanford.edu Yasin Abbasi-Yadkori Adobe Research abbasiya@adobe.com Mohammad Ghavamzadeh DeepMind ghavamza@google.com Benjamin

More information

DEPARTMENT OF STATISTICS

DEPARTMENT OF STATISTICS A Tes for Mulivariae ARCH Effecs R. Sco Hacker and Abdulnasser Haemi-J 004: DEPARTMENT OF STATISTICS S-0 07 LUND SWEDEN A Tes for Mulivariae ARCH Effecs R. Sco Hacker Jönköping Inernaional Business School

More information

20. Applications of the Genetic-Drift Model

20. Applications of the Genetic-Drift Model 0. Applicaions of he Geneic-Drif Model 1) Deermining he probabiliy of forming any paricular combinaion of genoypes in he nex generaion: Example: If he parenal allele frequencies are p 0 = 0.35 and q 0

More information

We just finished the Erdős-Stone Theorem, and ex(n, F ) (1 1/(χ(F ) 1)) ( n

We just finished the Erdős-Stone Theorem, and ex(n, F ) (1 1/(χ(F ) 1)) ( n Lecure 3 - Kövari-Sós-Turán Theorem Jacques Versraëe jacques@ucsd.edu We jus finished he Erdős-Sone Theorem, and ex(n, F ) ( /(χ(f ) )) ( n 2). So we have asympoics when χ(f ) 3 bu no when χ(f ) = 2 i.e.

More information

Echocardiography Project and Finite Fourier Series

Echocardiography Project and Finite Fourier Series Echocardiography Projec and Finie Fourier Series 1 U M An echocardiagram is a plo of how a porion of he hear moves as he funcion of ime over he one or more hearbea cycles If he hearbea repeas iself every

More information

Non-Stochastic Bandit Slate Problems

Non-Stochastic Bandit Slate Problems Non-Sochasic Bandi Slae Problems Sayen Kale Yahoo! Research Sana Clara, CA skale@yahoo-inccom Lev Reyzin Georgia Ins of echnology Alana, GA lreyzin@ccgaechedu Absrac Rober E Schapire Princeon Universiy

More information

Methodology. -ratios are biased and that the appropriate critical values have to be increased by an amount. that depends on the sample size.

Methodology. -ratios are biased and that the appropriate critical values have to be increased by an amount. that depends on the sample size. Mehodology. Uni Roo Tess A ime series is inegraed when i has a mean revering propery and a finie variance. I is only emporarily ou of equilibrium and is called saionary in I(0). However a ime series ha

More information

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter Sae-Space Models Iniializaion, Esimaion and Smoohing of he Kalman Filer Iniializaion of he Kalman Filer The Kalman filer shows how o updae pas predicors and he corresponding predicion error variances when

More information

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t Exercise 7 C P = α + β R P + u C = αp + βr + v (a) (b) C R = α P R + β + w (c) Assumpions abou he disurbances u, v, w : Classical assumions on he disurbance of one of he equaions, eg. on (b): E(v v s P,

More information

Lecture Notes 2. The Hilbert Space Approach to Time Series

Lecture Notes 2. The Hilbert Space Approach to Time Series Time Series Seven N. Durlauf Universiy of Wisconsin. Basic ideas Lecure Noes. The Hilber Space Approach o Time Series The Hilber space framework provides a very powerful language for discussing he relaionship

More information

L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS. NA568 Mobile Robotics: Methods & Algorithms

L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS. NA568 Mobile Robotics: Methods & Algorithms L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS NA568 Mobile Roboics: Mehods & Algorihms Today s Topic Quick review on (Linear) Kalman Filer Kalman Filering for Non-Linear Sysems Exended Kalman Filer (EKF)

More information

MATH 5720: Gradient Methods Hung Phan, UMass Lowell October 4, 2018

MATH 5720: Gradient Methods Hung Phan, UMass Lowell October 4, 2018 MATH 5720: Gradien Mehods Hung Phan, UMass Lowell Ocober 4, 208 Descen Direcion Mehods Consider he problem min { f(x) x R n}. The general descen direcions mehod is x k+ = x k + k d k where x k is he curren

More information

A Specification Test for Linear Dynamic Stochastic General Equilibrium Models

A Specification Test for Linear Dynamic Stochastic General Equilibrium Models Journal of Saisical and Economeric Mehods, vol.1, no.2, 2012, 65-70 ISSN: 2241-0384 (prin), 2241-0376 (online) Scienpress Ld, 2012 A Specificaion Tes for Linear Dynamic Sochasic General Equilibrium Models

More information

Lecture 12: Multiple Hypothesis Testing

Lecture 12: Multiple Hypothesis Testing ECE 830 Fall 00 Saisical Signal Processing insrucor: R. Nowak, scribe: Xinjue Yu Lecure : Muliple Hypohesis Tesing Inroducion In many applicaions we consider muliple hypohesis es a he same ime. Example

More information

References are appeared in the last slide. Last update: (1393/08/19)

References are appeared in the last slide. Last update: (1393/08/19) SYSEM IDEIFICAIO Ali Karimpour Associae Professor Ferdowsi Universi of Mashhad References are appeared in he las slide. Las updae: 0..204 393/08/9 Lecure 5 lecure 5 Parameer Esimaion Mehods opics o be

More information

The General Linear Test in the Ridge Regression

The General Linear Test in the Ridge Regression ommunicaions for Saisical Applicaions Mehods 2014, Vol. 21, No. 4, 297 307 DOI: hp://dx.doi.org/10.5351/sam.2014.21.4.297 Prin ISSN 2287-7843 / Online ISSN 2383-4757 The General Linear Tes in he Ridge

More information

Two Coupled Oscillators / Normal Modes

Two Coupled Oscillators / Normal Modes Lecure 3 Phys 3750 Two Coupled Oscillaors / Normal Modes Overview and Moivaion: Today we ake a small, bu significan, sep owards wave moion. We will no ye observe waves, bu his sep is imporan in is own

More information

Conservative Contextual Linear Bandits

Conservative Contextual Linear Bandits Conservaive Conexual Linear Bandis Abbas Kazerouni, Mohammad Ghavamzadeh and Benjamin Van Roy 1 Absrac Safey is a desirable propery ha can immensely increase he applicabiliy of learning algorihms in real-world

More information

Understanding the asymptotic behaviour of empirical Bayes methods

Understanding the asymptotic behaviour of empirical Bayes methods Undersanding he asympoic behaviour of empirical Bayes mehods Boond Szabo, Aad van der Vaar and Harry van Zanen EURANDOM, 11.10.2011. Conens 2/20 Moivaion Nonparameric Bayesian saisics Signal in Whie noise

More information

Some Ramsey results for the n-cube

Some Ramsey results for the n-cube Some Ramsey resuls for he n-cube Ron Graham Universiy of California, San Diego Jozsef Solymosi Universiy of Briish Columbia, Vancouver, Canada Absrac In his noe we esablish a Ramsey-ype resul for cerain

More information

Monochromatic Infinite Sumsets

Monochromatic Infinite Sumsets Monochromaic Infinie Sumses Imre Leader Paul A. Russell July 25, 2017 Absrac WeshowhahereisaraionalvecorspaceV suchha,whenever V is finiely coloured, here is an infinie se X whose sumse X+X is monochromaic.

More information

Non-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important

Non-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important on-parameric echniques Insance Based Learning AKA: neares neighbor mehods, non-parameric, lazy, memorybased, or case-based learning Copyrigh 2005 by David Helmbold 1 Do no fi a model (as do LDA, logisic

More information

di Bernardo, M. (1995). A purely adaptive controller to synchronize and control chaotic systems.

di Bernardo, M. (1995). A purely adaptive controller to synchronize and control chaotic systems. di ernardo, M. (995). A purely adapive conroller o synchronize and conrol chaoic sysems. hps://doi.org/.6/375-96(96)8-x Early version, also known as pre-prin Link o published version (if available):.6/375-96(96)8-x

More information

Written HW 9 Sol. CS 188 Fall Introduction to Artificial Intelligence

Written HW 9 Sol. CS 188 Fall Introduction to Artificial Intelligence CS 188 Fall 2018 Inroducion o Arificial Inelligence Wrien HW 9 Sol. Self-assessmen due: Tuesday 11/13/2018 a 11:59pm (submi via Gradescope) For he self assessmen, fill in he self assessmen boxes in your

More information

Non-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important

Non-parametric techniques. Instance Based Learning. NN Decision Boundaries. Nearest Neighbor Algorithm. Distance metric important on-parameric echniques Insance Based Learning AKA: neares neighbor mehods, non-parameric, lazy, memorybased, or case-based learning Copyrigh 2005 by David Helmbold 1 Do no fi a model (as do LTU, decision

More information

ACE 562 Fall Lecture 8: The Simple Linear Regression Model: R 2, Reporting the Results and Prediction. by Professor Scott H.

ACE 562 Fall Lecture 8: The Simple Linear Regression Model: R 2, Reporting the Results and Prediction. by Professor Scott H. ACE 56 Fall 5 Lecure 8: The Simple Linear Regression Model: R, Reporing he Resuls and Predicion by Professor Sco H. Irwin Required Readings: Griffihs, Hill and Judge. "Explaining Variaion in he Dependen

More information

Optimal Paired Choice Block Designs. Supplementary Material

Optimal Paired Choice Block Designs. Supplementary Material Saisica Sinica: Supplemen Opimal Paired Choice Block Designs Rakhi Singh 1, Ashish Das 2 and Feng-Shun Chai 3 1 IITB-Monash Research Academy, Mumbai, India 2 Indian Insiue of Technology Bombay, Mumbai,

More information

Section 3.5 Nonhomogeneous Equations; Method of Undetermined Coefficients

Section 3.5 Nonhomogeneous Equations; Method of Undetermined Coefficients Secion 3.5 Nonhomogeneous Equaions; Mehod of Undeermined Coefficiens Key Terms/Ideas: Linear Differenial operaor Nonlinear operaor Second order homogeneous DE Second order nonhomogeneous DE Soluion o homogeneous

More information

STATE-SPACE MODELLING. A mass balance across the tank gives:

STATE-SPACE MODELLING. A mass balance across the tank gives: B. Lennox and N.F. Thornhill, 9, Sae Space Modelling, IChemE Process Managemen and Conrol Subjec Group Newsleer STE-SPACE MODELLING Inroducion: Over he pas decade or so here has been an ever increasing

More information