Stochastic Multi-armed Bandits in Constant Space


David Liau, Eric Price, Zhao Song, Ger Yang
The University of Texas at Austin

Abstract

We consider the stochastic bandit problem in the sublinear space setting, where one cannot record the win-loss record for all K arms. We give an algorithm using O(1) words of space with regret

Σ_{i=1}^{K} (1/Δ_i) · log(Δ_i/Δ) · log T,

where Δ_i is the gap between the best arm and arm i, and Δ is the gap between the best and the second-best arms. If the rewards are bounded away from 0 and 1, this is within an O(log(1/Δ)) factor of the optimum regret possible without space constraints.

1 Introduction

In this paper, we study the multi-armed bandit problem in a sublinear space setting. In an instance of the bandit problem, there are K arms and a finite time horizon 1, ..., T, where T could be unknown to us. At each time step, we pull one of the K arms and receive a reward that depends on our choice. The goal is to find a strategy that achieves regret sublinear in time, where regret is defined as the difference between the cumulative reward received by our strategy and the reward we could have received had we always pulled the best arm in hindsight. There are many formulations of the bandit problem; in this paper we consider the stochastic setting specifically. In the stochastic setting, one assumes the rewards from the i-th arm are i.i.d. random variables with mean µ_i and support [0, 1].

(Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS) 2018, Lanzarote, Spain. PMLR: Volume 84. Copyright 2018 by the authors.)

A well-known algorithm for the stochastic bandit is the UCB algorithm (Auer et al., 2002), and it is known that UCB achieves regret O(Σ_{i:Δ_i>0} (log T)/Δ_i). The UCB algorithm requires Ω(K) space, since it records the estimated rewards of all K arms. However, in settings with limited space, such as streaming algorithms, or settings with infinitely many arms (Kleinberg, 2004), this requirement is problematic. There is a significant literature addressing this problem, but existing approaches assume structural properties on the set of arms, e.g.
combinatorial structure (Cesa-Bianchi and Lugosi, 2012) or a continuum of arms with a local Lipschitz condition (Kleinberg, 2004). A natural question is: what can we do without these structural assumptions, given limited space? A particular example is the streaming algorithm setting, where space is much more limited than time, as in a router (Zhang, 2013). If the space constraint is o(K) but the time constraint is Ω(K), one cannot run traditional UCB. In this case, O(K) total regret is still acceptable, and by accepting O(K) total regret we can avoid requiring structural assumptions. In a router, a more complicated strategy corresponds to a larger set K of possible strategies, which grants us a tradeoff: a larger K will result in higher regret but a better optimum. Since routers have strict space constraints, running UCB would keep K small, resulting in an extremely small average regret over time of K/T = space/time, which is acceptable for routers. Our algorithm provides more flexibility in this bias/variance tradeoff.

Our algorithm is based on fairly simple ideas. First, suppose we know the time horizon T and the expected value µ* of the optimal arm. We could then make a single pass through the arms; for each arm i, flip it until we have high (1 − 1/T³)

confidence that Δ_i = µ* − µ_i > 0, where µ_i is the expected value of arm i. Once this happens, move to the next arm. This will flip each arm O((log T)/Δ_i²) times, inducing regret O((log T)/Δ_i) from this arm. The total regret will then be O(Σ_i (log T)/Δ_i), which is ideal, with only constant space required.

The problem is that we don't know T or µ*. Not knowing T isn't a big deal — we can partition the time horizon into O(log log T) scales, and the last log T_l term will dominate (Auer and Ortner, 2010) — but not knowing µ* is a serious problem. We solve this problem by iteratively refining upper and lower bounds µ_UB and µ_LB on µ*. In each pass through the arms, we get new estimates that are half as far from each other. After log(1/Δ) passes, where Δ = min_{i:Δ_i>0} Δ_i is the minimal gap between the optimal arm and any suboptimal arm, only the best arm will remain in the interval. This gives an algorithm that loses at most an O(log(1/Δ)) factor in the regret. In some cases, the loss is significantly smaller. Hence, we obtain the following result, which improves the log(1/Δ) factor to a log(Δ_i/Δ) factor.

Theorem 1.1. Given a stochastic bandit instance with K arms and expected values µ_1, ..., µ_K ∈ [0, 1], let µ* = max_{i∈[K]} µ_i, Δ_i = µ* − µ_i, and Δ = min_{i:Δ_i>0} Δ_i. For any T > 0, there exists an algorithm that uses O(1) words of space and achieves regret

O( Σ_{i:Δ_i>0} (1/Δ_i) · log(Δ_i/Δ) · log T ).

Recall that the well-known UCB algorithm gives regret O(Σ_{i:Δ_i>0} (log T)/Δ_i). Our algorithm is always within a log(1/Δ) factor of its space-unlimited version. In certain situations, we can do slightly better by refining our estimate of µ* by more than a constant factor in each iteration. This gives us the following result.

Theorem 1.2. Under the same setting as Theorem 1.1, for any γ > 0, there exists an algorithm that uses O(1) words of space and achieves regret

O( Σ_{i:Δ_i>0} (1/Δ_i) · ( log^γ(2/Δ_i) + log(Δ_i/Δ)/(γ · log log(Δ_i/Δ)) ) · log T ).

In particular, if we set γ = 1/2, this algorithm is always within an O( log(1/Δ)/log log(1/Δ) ) factor of the space-unlimited UCB algorithm.

The paper is organized as follows. Section 2 reviews the related work. Section 3 provides detailed preliminaries on the problem formulation and the background needed for our results.
Sections 4 and 5 contain the algorithms that give the results of Theorems 1.1 and 1.2, respectively, with known time horizon T. Section 6 demonstrates how to extend the algorithms to the case of an unknown time horizon. The full version is available on arXiv (Liau et al., 2017).

2 Related Work

For stochastic bandits, the seminal work by Lai and Robbins (1985) demonstrated the idea of using confidence intervals to solve the problem, and it showed a lower bound of Ω( Σ_i Δ_i log(T) / KL(µ_i, µ*) ) on the regret. The UCB algorithm, a simple solution to stochastic bandits, was analyzed in Auer et al. (2002). The UCB algorithm is based on Hoeffding's inequality, which is optimal only when Δ_i² is on the order of KL(µ_i, µ*). In certain situations this can be improved using different types of concentration inequalities; for example, Audibert et al. (2009) used Bernstein's inequality to derive an algorithm whose regret depends on the second moments. Later, Garivier and Cappé (2011) and Maillard et al. (2011) independently proposed the KL-UCB algorithm, which matches the lower bound. We refer the reader to the comprehensive survey by Bubeck and Cesa-Bianchi (2012) for general bandit problems.

In addition to regret analysis for online decision making, there is a set of papers that discuss the sample complexity of the pure exploration problem, i.e., how to identify the best arm (Mannor and Tsitsiklis, 2004; Even-Dar et al., 2002; Jamieson et al., 2014; Karnin et al., 2013; Kaufmann et al., 2015; Even-Dar et al., 2006). Similar algorithms have been used in the regime of online decision making (Bui et al., 2011; Auer and Ortner, 2010). Building on best-arm identification, the explore-then-commit (ETC) policy is designed to first perform some tests to identify the best arm, and then commit to it for the remaining time horizon. The ETC policy is known to be suboptimal (Garivier et al., 2016) but simplifies the analysis. In particular, our algorithm is based on the framework of Auer and Ortner (2010), but our algorithm takes only O(1) space while their method takes O(K) space.
Moreover, there is a small set of papers that integrate sketching techniques from streaming into online learning (Hazan and Seshadhri, 2009; Luo et al., 2016). Hazan and Seshadhri (2009) considered

the problem of minimizing α-exp-concave losses, where the regret is required to be O(log T) uniformly over time; they used ideas from streaming to keep a small active set of experts. Luo et al. (2016) considered the online convex optimization problem and used sketching ideas to reduce the cost of computing online Newton steps; however, the complexity is still Ω(K).

3 Preliminaries

Notation. For any positive integer n, we use [n] to denote the set {1, 2, ..., n}. For a random variable X, let E[X] denote its expectation (if this quantity exists). In addition to O(·) notation, for two functions f, g, we use the shorthand f ≲ g (resp. ≳) to indicate that f ≤ Cg (resp. ≥) for an absolute constant C. We use f ≂ g to mean cf ≤ g ≤ Cf for constants c, C. We measure space in words using the word RAM model, so that the input values, such as K, T, and the rewards, as well as our variables, can each be expressed in one word of O(log(KT)) bits. For more details on the word RAM model, we refer the reader to Aho et al. (1974) and Cormen et al. (2009).

3.1 Problem Formulation

Definition 3.1. In a multi-armed bandit problem, there are K arms in total and a finite time horizon 1, 2, ..., T. At each time step t ∈ [T], the player has to choose an arm I_t ∈ [K] to play, and receives a reward X_{I_t,t} associated with that arm. Without loss of generality, assume that for each arm i ∈ [K] and each time step t ∈ [T], X_{i,t} ∈ [0, 1]. The goal of the player is to maximize the total reward. We measure the performance of an algorithm via its regret, defined as the difference between the best reward in hindsight and the reward received by the algorithm:

Ψ̂_T = max_{i∈[K]} Σ_{t=1}^{T} X_{i,t} − Σ_{t=1}^{T} X_{I_t,t}.

In this paper, we consider the stochastic setting, where we assume the rewards come from stochastic processes.

Definition 3.2. In a stochastic bandit, we assume each arm i ∈ [K] is associated with a distribution D_i over [0, 1] with mean µ_i. The reward X_{i,t} at time t ∈ [T] is drawn from D_i independently.
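As a concrete illustration of the regret in Definition 3.1, the following sketch (illustrative code, not from the paper; the reward table and play sequence are made up) computes the realized regret of a fixed sequence of choices on a toy instance:

```python
def empirical_regret(rewards, choices):
    """Regret per Definition 3.1: best fixed-arm reward in hindsight minus
    the reward actually collected by the play sequence `choices`."""
    T = len(choices)
    best = max(sum(row[:T]) for row in rewards)       # best arm in hindsight
    got = sum(rewards[choices[t]][t] for t in range(T))
    return best - got

# Toy instance with K = 2 arms and T = 4 steps; always playing arm 1
# collects 1.0, while arm 0 would have collected 3.0 in hindsight.
rewards = [[1.0, 0.0, 1.0, 1.0],   # X_{0,t}
           [0.0, 1.0, 0.0, 0.0]]   # X_{1,t}
assert empirical_regret(rewards, [1, 1, 1, 1]) == 2.0
```

In the stochastic setting of Definition 3.2, each row of such a table would be drawn i.i.d. from the arm's distribution D_i.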
For stochastic bandits, instead of the regret defined above, we consider the pseudo-regret:

Ψ_T = max_{i∈[K]} E[ Σ_{t=1}^{T} X_{i,t} ] − E[ Σ_{t=1}^{T} X_{I_t,t} ].

We can rewrite the pseudo-regret using Wald's identity:

Ψ_T = Σ_{j=1}^{K} E[N_{j,T}] · Δ_j,

where N_{j,T} is the number of times arm j is chosen up to time T, and we define Δ_j = µ* − µ_j to be the gap between the means of the best arm and arm j. We use µ* to denote the mean reward of the arm with the highest mean, i.e., µ* = max_{i∈[K]} µ_i.

3.2 Concentration Inequalities

In this paper, for simplicity, we use the Chernoff-Hoeffding inequality to analyze the concentration behavior of random variables with bounded support.

Fact 3.3 (Chernoff-Hoeffding bound). Let x_1, x_2, ..., x_n be i.i.d. random variables in [0, 1], and let X̄ = (1/n) Σ_{i=1}^{n} x_i. Then for any ε > 0, Pr[ |X̄ − E[X̄]| > ε ] ≤ 2e^{−2nε²}.

4 UCBConstSpace with Known T

The original UCB-1 algorithm (Auer et al., 2002) needs O(K) space to achieve O(Σ_{i:Δ_i>0} (log T)/Δ_i) regret. In this section, we propose a new algorithm that requires only O(1) space in exchange for a slightly worse regret. First, we consider the setting where T is known. The main result is presented in the following theorem.

Theorem 4.1. Given a stochastic bandit instance with known T, let Δ_i = µ* − µ_i, and let Δ = min_{i:Δ_i>0} Δ_i. Then for any T > 0, there exists an algorithm that uses O(1) words of space and achieves regret

O( Σ_{i:Δ_i>0} (1/Δ_i) · log(Δ_i/Δ) · log T ).

We present the method in Algorithm 1, where we iteratively improve our estimate of µ*. More precisely, we scan through the arms in multiple rounds. In the r-th round, we sample each arm up to some

precision g_r. The desired precision g_r is halved after each round. In this sampling process, we keep only the information of the best and the second-best arms seen in the current and the previous rounds, instead of saving it for all arms. With the information of the best arm and the current precision g_r, we can refine the upper and lower bounds µ_UB and µ_LB on µ*. If an arm's upper confidence value is less than µ_LB, we can rule it out without continuing to precision g_r. The process terminates once we can distinguish the best arm from the rest.

Algorithm 1 UCB algorithm with constant space and known T (Theorem 4.1)

1: procedure UCBConstSpace(K, T)
2:   Set δ ← 1/T³; initialize g_1 ← 1/2, t ← 1
3:   Exploration Phase:
4:   for rounds r = 1, 2, ... do
5:     ã* ← the best arm from the previous round; µ̃* ← mean reward recorded for arm ã* in the previous round
6:     N ← 4 log(1/δ)/g_r², the maximum number of plays for each arm in the current round
7:     Initialize a, b ← 0, the best and the second-best arms in this round
8:     Initialize µ̃_a, µ̃_b ← 0, the means for arms a and b
9:     for each arm i = 1, ..., K do
10:      Set µ̃_i ← 0, which keeps the mean reward for arm i in the current round
11:      for n = 1, ..., N do
12:        Pull arm i and receive reward v
13:        t ← t + 1
14:        Update µ̃_i with v: µ̃_i ← ((n − 1)µ̃_i + v)/n
15:        if µ̃_i + √(log(1/δ)/n) < µ̃* − g_{r−1}/2 then
16:          break, i.e., we rule out arm i for the current round
17:        end if
18:      end for
19:      if µ̃_i > µ̃_a then b ← a, µ̃_b ← µ̃_a, a ← i, µ̃_a ← µ̃_i  (update the best and the 2nd best arms)
20:      else if µ̃_i > µ̃_b then b ← i, µ̃_b ← µ̃_i  (update the 2nd best arm)
21:    end for
22:    Stopping Criterion: if µ̃_a − g_r/2 > µ̃_b + g_r/2 or t > T then break
23:    Update ã* ← a and µ̃* ← µ̃_a
24:    Set the new precision: g_{r+1} ← g_r/2
25:  end for
26:  Exploitation Phase:
27:  Pull arm a for the remaining time steps.
28: end procedure

We define a_r and b_r as the best arm and the second-best arm stored at the end of the r-th round. Also, we let µ̃_i^r be the recorded empirical mean for arm i at the end of the r-th round, and denote by n_i^r the total number of pulls of arm i in the r-th round. Then, we define µ̃_{i,n}^r as the empirical mean µ̃_i stored for arm i after pulling it n times in round r.
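To make the control flow concrete, here is a minimal simulation sketch of Algorithm 1 (hypothetical Python, not the authors' code; `pull(i)` is an assumed reward oracle returning values in [0, 1], and the constants are for illustration). Note that only O(1) scalars are stored, regardless of K:

```python
import math

def ucb_const_space(pull, K, T):
    """Sketch of Algorithm 1: constant-space UCB with known horizon T.
    Returns the arm committed to and the total reward collected."""
    delta = 1.0 / T**3
    g = 0.5                             # current precision g_r
    t, total = 0, 0.0
    star_mean = None                    # previous round's best empirical mean
    while True:
        N = max(1, math.ceil(4 * math.log(1 / delta) / g**2))
        a, mu_a = None, -1.0            # best arm this round
        b, mu_b = None, -1.0            # second-best arm this round
        for i in range(K):
            mu_i = 0.0
            for n in range(1, N + 1):
                if t >= T:
                    break
                v = pull(i)
                total, t = total + v, t + 1
                mu_i += (v - mu_i) / n  # running mean, O(1) space
                # rule arm i out early if its UCB falls below the lower
                # confidence bound on mu* from the previous round
                # (previous precision was 2g, so the threshold is mean - g)
                if star_mean is not None and \
                   mu_i + math.sqrt(math.log(1 / delta) / n) < star_mean - g:
                    break
            if mu_i > mu_a:
                b, mu_b, a, mu_a = a, mu_a, i, mu_i
            elif mu_i > mu_b:
                b, mu_b = i, mu_i
        # stopping criterion: best and second-best separated, or time is up
        if t >= T or mu_a - g / 2 > mu_b + g / 2:
            break
        star_mean = mu_a
        g /= 2                          # halve the precision each round
    while t < T:                        # exploitation phase
        total += pull(a)
        t += 1
    return a, total

# Deterministic sanity check (degenerate reward distributions are allowed):
best, reward = ucb_const_space(lambda i: [0.2, 0.9][i], K=2, T=5000)
assert best == 1
```

With these gaps (Δ = 0.7 > g_1 = 1/2), the stopping criterion already fires after the first round, and the remaining budget is spent on the committed arm.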
Further, we define r* as the value of r at the moment the algorithm exits the loop in Line 22.

Definition 4.2. For each r ∈ [r*], define ξ_r to be the event: ∃ r′ ∈ [r], i ∈ [K], n ∈ [n_i^{r′}] such that |µ̃_{i,n}^{r′} − µ_i| > √(log(1/δ)/n), i.e., the event that some estimate µ̃_{i,n}^{r′} up to round r is not within our desired confidence interval.

Throughout the first part of our analysis, when discussing the state of the algorithm at round r, we focus on the case where the complement ξ̄_r holds, i.e., all estimates are within our desired confidence interval.

Lemma 4.3. In Algorithm 1, at any round r ∈ [r*], given ξ̄_r, the following statements are true:

1. n_{a_r}^r = 4 log(1/δ)/g_r², i.e., the claimed optimal arm cannot be ruled out early.
2. n_{i*}^r = 4 log(1/δ)/g_r², i.e., the true optimal arm i* cannot be ruled out early.
3. |µ̃_{a_r}^r − µ*| ≤ g_r/2.

Proof. We prove this lemma by induction. For the base case, the first and the second statements are true because in the first round every arm has to be played 4 log(1/δ)/g_1²

times. For the third statement, we prove by contradiction. Assume the contrary, i.e., µ̃_{a_1}^1 − µ* > g_1/2 or µ* − µ̃_{a_1}^1 > g_1/2. Note that under ξ̄_1, √(log(1/δ)/n) = g_1/2 when n = 4 log(1/δ)/g_1². If µ̃_{a_1}^1 − µ* > g_1/2, then we have

µ* < µ̃_{a_1}^1 − g_1/2 ≤ µ_{a_1} + √(log(1/δ)/n_{a_1}^1) − g_1/2 ≤ µ_{a_1} + g_1/2 − g_1/2 = µ_{a_1},

where the second step follows from condition ξ̄_1 and the third step follows from n_{a_1}^1 = 4 log(1/δ)/g_1². The above leads to a contradiction, because µ* ≥ µ_i for any i. Similarly, if µ* − µ̃_{a_1}^1 > g_1/2, then we have

µ̃_{a_1}^1 < µ* − g_1/2 ≤ µ̃_{i*}^1 + √(log(1/δ)/n_{i*}^1) − g_1/2 ≤ µ̃_{i*}^1 + g_1/2 − g_1/2 = µ̃_{i*}^1,

where the second step follows from condition ξ̄_1 and the third step follows from n_{i*}^1 = 4 log(1/δ)/g_1². This also results in a contradiction, because for any arm to be assigned as a_1, its empirical mean must be at least µ̃_{i*}^1; in particular µ̃_{a_1}^1 ≥ µ̃_{i*}^1.

For the induction step, we assume the three statements are true for all rounds up to r − 1, and consider round r. We first prove the second statement. Assume the contrary, i.e., the true optimal arm has been ruled out early, meaning that for some n,

µ̃_{i*,n}^r + √(log(1/δ)/n) < µ̃_{a_{r−1}}^{r−1} − g_{r−1}/2.

Then, we can see that

µ* ≤ µ̃_{i*,n}^r + √(log(1/δ)/n) < µ̃_{a_{r−1}}^{r−1} − g_{r−1}/2 ≤ µ*,    (3)

where the first step follows from condition ξ̄_r and the last step follows from the induction hypothesis |µ̃_{a_{r−1}}^{r−1} − µ*| ≤ g_{r−1}/2. There is a contradiction in (3), hence the second statement is true.

Next, we can see that the first statement is now clear, because we have shown that there is at least one arm, namely i*, that is played 4 log(1/δ)/g_r² times in the r-th round. If arm a_r is not arm i*, then for a_r to be assigned as the best arm we must have µ̃_{a_r}^r ≥ µ̃_{i*}^r, so arm a_r cannot have triggered the elimination test and has to be pulled 4 log(1/δ)/g_r² times as well.

For the third statement, the proof is similar to the base case, where we prove by contradiction. Assume the contrary, i.e., µ̃_{a_r}^r − µ* > g_r/2 or µ* − µ̃_{a_r}^r > g_r/2. If µ̃_{a_r}^r − µ* > g_r/2, then we have

µ* < µ̃_{a_r}^r − g_r/2 ≤ µ_{a_r} + √(log(1/δ)/n_{a_r}^r) − g_r/2 ≤ µ_{a_r} + g_r/2 − g_r/2 = µ_{a_r},

where the second step follows from condition ξ̄_r and the third step follows from n_{a_r}^r = 4 log(1/δ)/g_r², by the first statement. This results in a contradiction, because µ* ≥ µ_i for any i ∈ [K]. Similarly, if µ* − µ̃_{a_r}^r > g_r/2, then we have

µ̃_{a_r}^r < µ* − g_r/2 ≤ µ̃_{i*}^r + √(log(1/δ)/n_{i*}^r) − g_r/2 ≤ µ̃_{i*}^r + g_r/2 − g_r/2 = µ̃_{i*}^r,

where the second step follows from condition ξ̄_r and the third step follows from n_{i*}^r = 4 log(1/δ)/g_r², by the second statement. This results in a contradiction, because for any arm to be assigned as a_r, we must have µ̃_{a_r}^r ≥ µ̃_{i*}^r.

Lemma 4.4. In Algorithm 1, conditioning on the event ξ̄_{r*}, we have r* ≤ log(2/Δ).

Proof. Assume the contrary, i.e., at the end of round r = log(2/Δ), the best arm and the second-best arm are still not differentiated, meaning that we still have

µ̃_{a_r}^r − g_r/2 ≤ µ̃_{b_r}^r + g_r/2.

First note that r > log(2/Δ) implies g_r = 2^{−r} < Δ/2. We have

µ* ≤ µ̃_{i*}^r + √(log(1/δ)/n_{i*}^r) ≤ µ̃_{i*}^r + g_r/2 < µ̃_{a_r}^r + 3g_r/2 < µ̃_{a_r}^r + 3Δ/4,

where the third step uses µ̃_{i*}^r ≤ µ̃_{a_r}^r + g_r, which holds whether or not i* = a_r, since the arms are not differentiated. Similarly, we can show that µ_{a_r} > µ̃_{a_r}^r − Δ/4. Then, we have

µ* − µ_{a_r} < (µ̃_{a_r}^r + 3Δ/4) − (µ̃_{a_r}^r − Δ/4) = Δ,

which results in a contradiction if a_r ≠ i*, since then µ* − µ_{a_r} ≥ Δ. (If a_r = i*, the same argument applied to b_r, whose empirical mean is within g_r of µ̃_{a_r}^r, gives µ* − µ_{b_r} < Δ, again a contradiction.) This implies that, given ξ̄_{r*}, we must have r* ≤ log(2/Δ).

Lemma 4.5. In Algorithm 1, at any round r, given ξ̄_r, the number of plays of any arm i ∈ [K] is upper bounded by n_i^r ≤ 4 log(1/δ)/(Δ_i − 2g_r)² whenever Δ_i > 2g_r.

Proof. First, note that as long as arm i has not been ruled out at pull n of round r, we have

µ̃_{i,n}^r + √(log(1/δ)/n) ≥ µ̃_{a_{r−1}}^{r−1} − g_{r−1}/2.    (4)

Then, we can show

µ_i + 2√(log(1/δ)/n) ≥ µ̃_{i,n}^r + √(log(1/δ)/n) ≥ µ̃_{a_{r−1}}^{r−1} − g_{r−1}/2 ≥ µ* − g_{r−1} = µ* − 2g_r,

where the first step follows from condition ξ̄_r, the second step follows from (4), and the third step follows from Lemma 4.3. Hence 2√(log(1/δ)/n) ≥ Δ_i − 2g_r, and reorganizing this inequality proves the lemma.

Proof of Theorem 4.1. Consider Algorithm 1. For each round r ∈ [r*], conditioned on ξ̄_{r*} (i.e., the confidence intervals are correct), we first recognize two bounds on the number of plays n_i^r of each arm i ∈ [K]. By the definition of Algorithm 1, we have

n_i^r ≤ 4 log(1/δ)/g_r².    (5)

Also, from Lemma 4.5, we have

n_i^r ≤ 4 log(1/δ)/(Δ_i − 2g_r)².    (6)

Combining (5) and (6) gives n_i^r ≲ log(1/δ)/max{g_r, Δ_i}², and together with r* ≤ log(2/Δ) from Lemma 4.4, we can upper bound the regret resulting from pulling arm i in the algorithm:

Σ_{r=1}^{r*} Δ_i n_i^r ≲ Σ_{r: g_r ≥ Δ_i} Δ_i log(1/δ)/g_r² + Σ_{r: g_r < Δ_i} log(1/δ)/Δ_i ≲ log(1/δ)/Δ_i + log(Δ_i/Δ) · log(1/δ)/Δ_i ≲ log(2Δ_i/Δ) · log(1/δ)/Δ_i,    (7)

where the first sum is geometric because g_r is halved each round, and the second sum has at most log(Δ_i/Δ) + O(1) terms because g_r only ranges from roughly Δ_i down to roughly Δ.

For the next step, we find an upper bound for the probability of the event ξ_{r*} = { ∃ r ∈ [r*], i ∈ [K], n ∈ [n_i^r] s.t. |µ̃_{i,n}^r − µ_i| > √(log(1/δ)/n) }. By Fact 3.3 and a union bound over at most T² triples (r, i, n),

Pr[ξ_{r*}] ≤ Σ_{r,i,n} Pr[ |µ̃_{i,n}^r − µ_i| > √(log(1/δ)/n) ] ≤ T² · 2e^{−2 log(1/δ)} ≤ T²δ.    (8)

Finally, by choosing δ = 1/T³ and combining (7) and (8), we have

Ψ_T ≲ Σ_{i:Δ_i>0} (1/Δ_i) log(Δ_i/Δ) log T + T · T²δ ≲ Σ_{i:Δ_i>0} (1/Δ_i) log(Δ_i/Δ) log T,

which proves the theorem.

5 Improved Algorithm for UCBConstSpace

The result in Theorem 4.1 carries an additional log(Δ_i/Δ) factor relative to the original UCB-1 algorithm of Auer et al. (2002). This means that in a

bad scenario, for example, if most of the arms have gap Δ_i = 1/2 while Δ = 1/K, the log(Δ_i/Δ) factor translates into an additional log K factor in the regret. In this section, we show that we are able to improve the additional log(Δ_i/Δ) factor to a log(Δ_i/Δ)/log log(Δ_i/Δ) factor by slightly changing the update rule for the precision g_r. This means that in the bad example described above, we are improving the competitive ratio from log K to log K / log log K. We present our result in the following theorem.

Theorem 5.1. Given a stochastic bandit instance with known T, let Δ_i = µ* − µ_i, and let Δ = min_{i:Δ_i>0} Δ_i. For any γ > 0 and any T > 0, there exists an algorithm that uses O(1) words of space and achieves regret

O( Σ_{i:Δ_i>0} (1/Δ_i) · ( log^γ(2/Δ_i) + log(Δ_i/Δ)/(γ · log log(Δ_i/Δ)) ) · log T ).

We consider a modified version of Algorithm 1, where the update rule in Line 24 is replaced by

g_{r+1} = g_r / (log(1/g_r))^ε,    (9)

where ε is some constant to be determined later. In the following lemma, we show that with this update rule, given any D < 1, it takes only O( (1/ε) · log(1/D)/log log(1/D) ) steps to reach accuracy D.

Lemma 5.2. Given any g_0, D ∈ (0, 1) with D < g_0, let r_0 = log(g_0/D)/log log(g_0/D). Suppose that for every positive integer r, g_r = g_{r−1}/(log(1/g_{r−1}))^ε. Then, for any r ≥ (1 + 2/ε) r_0, we have g_r ≤ D.

Proof. First, note that by the definition of g_r, we have g_r ≤ g_0 · 2^{−r} for any r. Therefore, for any r ≥ r_0, we have

g_r ≤ g_0 · 2^{−r} ≤ g_0 · 2^{−r_0} = g_0 · (D/g_0)^{1/log log(g_0/D)},

so that log(1/g_r) ≥ log(g_0/D)/log log(g_0/D) for all r ≥ r_0. As a result, each subsequent step divides g_r by at least

( log(g_0/D)/log log(g_0/D) )^ε ≥ ( log(g_0/D) )^{ε/2},

so after (2/ε) r_0 further steps, g_r has been divided by at least (log(g_0/D))^{r_0} = g_0/D, since r_0 · log log(g_0/D) = log(g_0/D). This implies that for any r ≥ r_0 + (2/ε) r_0 = (1 + 2/ε) r_0, we have g_r ≤ D.

Note that we can apply Lemma 4.3 and Lemma 4.5 to Algorithm 1 with update rule (9), because they do not rely on the specific update rule. Before we proceed to the proof of Theorem 5.1, we need the following lemma for an upper bound on r*.

Lemma 5.3. In Algorithm 1 with update rule (9), given ξ̄_{r*}, we have r* ≲ (1/ε) · log(2/Δ)/log log(2/Δ).

Due to space constraints, we provide the detailed proof of this lemma in the full version of our paper (Liau et al., 2017).

Proof of Theorem 5.1. Consider Algorithm 1 with update rule (9).
For each arm i ∈ [K], if we condition on ξ̄_{r*}, then by Lemma 4.5 and Lemma 5.3 we can upper bound the regret resulting from pulling arm i in the algorithm:

Σ_{r=1}^{r*} Δ_i n_i^r ≲ Σ_{r=1}^{r_i} Δ_i log(1/δ)/g_r² + Σ_{r=r_i+1}^{r*} log(1/δ)/Δ_i,    (10)

where r_i is the minimal round r such that g_r < Δ_i/4. For the first term of (10), since g_r decays super-exponentially (i.e., g_{r+1} ≤ g_r/2), we have

Σ_{r=1}^{r_i} Δ_i log(1/δ)/g_r² ≲ Δ_i log(1/δ)/g_{r_i}² = (log(1/g_{r_i−1}))^{2ε} · Δ_i log(1/δ)/g_{r_i−1}² ≲ (log(2/Δ_i))^{2ε} · log(1/δ)/Δ_i,    (12)

where the second step follows from the update rule (9), and the last step follows because g_{r_i−1} ≥ Δ_i/4 by the definition of r_i. For the second term of (10), by Lemma 5.2 we can find that it takes O( (1/ε) · log(Δ_i/Δ)/log log(Δ_i/Δ) ) rounds to get from Δ_i/4 to Δ/4.

As a result, we can upper bound the second term of (10) by

Σ_{r=r_i+1}^{r*} log(1/δ)/Δ_i ≲ (1/ε) · ( log(Δ_i/Δ)/log log(Δ_i/Δ) ) · log(1/δ)/Δ_i.    (13)

Using a similar argument to the one in the proof of Theorem 4.1, we can find that

Pr[ξ_{r*}] ≤ T²δ.    (14)

Finally, by combining (12), (13), and (14), we get

Ψ_T ≲ Σ_{i:Δ_i>0} ( (log(2/Δ_i))^{2ε} + (1/ε) · log(Δ_i/Δ)/log log(Δ_i/Δ) ) · log(1/δ)/Δ_i + T · T²δ.

By choosing δ = 1/T³ and ε = γ/2, we can find that

Ψ_T ≲ Σ_{i:Δ_i>0} (1/Δ_i) · ( log^γ(2/Δ_i) + log(Δ_i/Δ)/(γ · log log(Δ_i/Δ)) ) · log T,

which proves the theorem.

We conjecture below that the log(1/Δ)/log log(1/Δ) factor is not improvable given the space constraint. The discussion of our conjectured hard instance is in the full version of our paper (Liau et al., 2017).

Conjecture 5.4. There exists a distribution over stochastic bandit problems such that any algorithm taking O(1) words of space will have regret

Ω( Σ_{i:Δ_i>0} (1/Δ_i) · ( log(1/Δ)/log log(1/Δ) ) · log T ).

6 Unknown Horizon T

Now, we show that, using the technique described in Auer and Ortner (2010), we are able to get the same regret as in Theorem 4.1 when T is unknown.

Theorem 6.1 (Restatement of Theorem 1.1). Given a stochastic bandit instance with unknown T, let Δ_i = µ* − µ_i, and let Δ = min_{i:Δ_i>0} Δ_i. For any T > 0, there exists an algorithm that uses O(1) words of space and achieves regret

O( Σ_{i:Δ_i>0} (1/Δ_i) · log(Δ_i/Δ) · log T ).

Algorithm 2 UCB algorithm with constant space and unknown T (Theorem 6.1 and Theorem 6.2)

1: procedure UCBCS-UnknownT(K)
2:   Initialize T_0 to a constant
3:   l ← 0, t ← 1
4:   while t ≤ T do
5:     Call UCBConstSpace(K, T_l)
6:     t ← t + T_l
7:     l ← l + 1
8:     T_l ← T_{l−1}²
9:   end while
10: end procedure

Due to space constraints, we defer the proof of this theorem to the full version of our paper (Liau et al., 2017). Similarly, we are able to use this trick for the improved algorithm in Section 5 and get the same regret as in Theorem 5.1.

Theorem 6.2 (Restatement of Theorem 1.2). Given a stochastic bandit instance with unknown T, let Δ_i = µ* − µ_i, and let Δ = min_{i:Δ_i>0} Δ_i. For any γ > 0 and any T > 0, there exists an algorithm that uses O(1) words of space and achieves regret

O( Σ_{i:Δ_i>0} (1/Δ_i) · ( log^γ(2/Δ_i) + log(Δ_i/Δ)/(γ · log log(Δ_i/Δ)) ) · log T ).

7 Conclusion

We proposed a constant-space algorithm for the stochastic multi-armed bandit problem.
Our algorithms proceed by iteratively refining a confidence interval containing the best arm's value. In the simpler version of our algorithm, we refine the interval by a constant factor in each step, and each iteration only uses O(OPT) regret. This gives an O(log(1/Δ))-competitive algorithm. We then showed how to improve this by a log log(1/Δ) factor in certain cases, by using fewer rounds that each make more progress. Finally, we showed how to adapt our algorithms — which involve parameters that depend on the time horizon T — to situations with an unknown time horizon.
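For concreteness, the horizon-guessing schedule used by Algorithm 2 in Section 6 can be sketched as follows (illustrative code, not the authors'; the initial guess T_0 = 2 is an assumption, as the constant is left unspecified above):

```python
def squaring_horizons(T, T0=2):
    """Epoch schedule from Algorithm 2: run the known-horizon algorithm with
    guess T_l, then square the guess (T_l = T_{l-1}^2), so that only
    O(log log T) epochs occur before the guess exceeds the true horizon T."""
    horizons = []
    t, guess = 0, T0
    while t < T:
        horizons.append(guess)  # here Algorithm 2 calls UCBConstSpace(K, guess)
        t += guess
        guess *= guess
    return horizons

assert squaring_horizons(10**6)[:5] == [2, 4, 16, 256, 65536]
```

Because log T_l doubles with each epoch, the log T_l term of the final epoch dominates the summed regret up to a constant factor, which is why the unknown-horizon bounds match the known-horizon ones.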

References

A. V. Aho, J. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley Series in Computer Science and Information Processing, 1974.

J.-Y. Audibert, R. Munos, and C. Szepesvári. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theoretical Computer Science, 410(19):1876-1902, 2009.

P. Auer and R. Ortner. UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem. Periodica Mathematica Hungarica, 61(1-2):55-65, 2010.

P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002.

S. Bubeck and N. Cesa-Bianchi. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning, 5(1), 2012.

L. X. Bui, R. Johari, and S. Mannor. Committing bandits. In Advances in Neural Information Processing Systems, 2011.

N. Cesa-Bianchi and G. Lugosi. Combinatorial bandits. Journal of Computer and System Sciences, 78(5):1404-1422, 2012.

T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, 2009.

E. Even-Dar, S. Mannor, and Y. Mansour. PAC bounds for multi-armed bandit and Markov decision processes. In International Conference on Computational Learning Theory. Springer, 2002.

E. Even-Dar, S. Mannor, and Y. Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7(Jun):1079-1105, 2006.

A. Garivier and O. Cappé. The KL-UCB algorithm for bounded stochastic bandits and beyond. In COLT, 2011.

A. Garivier, T. Lattimore, and E. Kaufmann. On explore-then-commit strategies. In Advances in Neural Information Processing Systems, 2016.

E. Hazan and C. Seshadhri. Efficient learning algorithms for changing environments. In Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 2009.

K. G. Jamieson, M. Malloy, R. D. Nowak, and S. Bubeck. lil' UCB: An optimal exploration algorithm for multi-armed bandits. In COLT, volume 35, 2014.

Z. S. Karnin, T. Koren, and O. Somekh. Almost optimal exploration in multi-armed bandits. In ICML (3), 28:1238-1246, 2013.

E. Kaufmann, O. Cappé, and A. Garivier. On the complexity of best-arm identification in multi-armed bandit models. The Journal of Machine Learning Research, 2015.

R. D. Kleinberg. Nearly tight bounds for the continuum-armed bandit problem. In Advances in Neural Information Processing Systems, 2004.

T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1):4-22, 1985.

D. Liau, E. Price, Z. Song, and G. Yang. Stochastic multi-armed bandits in constant space. arXiv preprint, 2017.

H. Luo, A. Agarwal, N. Cesa-Bianchi, and J. Langford. Efficient second order online learning by sketching. In Advances in Neural Information Processing Systems, 2016.

O.-A. Maillard, R. Munos, G. Stoltz, et al. A finite-time analysis of multi-armed bandit problems with Kullback-Leibler divergences. In COLT, 2011.

S. Mannor and J. N. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research, 5(Jun):623-648, 2004.

Q. Zhang. Introduction. In Lecture Notes of Sublinear Algorithms for Big Data. qzhangcs/b669-3-fall-sublnear/sldes/space--dst.pdf, 2013.


More information

Further Optimal Regret Bounds for Thompson Sampling

Further Optimal Regret Bounds for Thompson Sampling Further Optmal Regret Bounds for hompson Samplng Shpra Agrawal Mcrosoft Research Inda Navn Goyal Mcrosoft Research Inda Abstract proven recently by Kaufmann et al. [5. hompson Samplng s one of the oldest

More information

Finding Primitive Roots Pseudo-Deterministically

Finding Primitive Roots Pseudo-Deterministically Electronc Colloquum on Computatonal Complexty, Report No 207 (205) Fndng Prmtve Roots Pseudo-Determnstcally Ofer Grossman December 22, 205 Abstract Pseudo-determnstc algorthms are randomzed search algorthms

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

P exp(tx) = 1 + t 2k M 2k. k N

P exp(tx) = 1 + t 2k M 2k. k N 1. Subgaussan tals Defnton. Say that a random varable X has a subgaussan dstrbuton wth scale factor σ< f P exp(tx) exp(σ 2 t 2 /2) for all real t. For example, f X s dstrbuted N(,σ 2 ) then t s subgaussan.

More information

Lecture 10: May 6, 2013

Lecture 10: May 6, 2013 TTIC/CMSC 31150 Mathematcal Toolkt Sprng 013 Madhur Tulsan Lecture 10: May 6, 013 Scrbe: Wenje Luo In today s lecture, we manly talked about random walk on graphs and ntroduce the concept of graph expander,

More information

Further Optimal Regret Bounds for Thompson Sampling

Further Optimal Regret Bounds for Thompson Sampling Further Optmal Regret Bounds for hompson Samplng Shpra Agrawal Mcrosoft Research Inda Navn Goyal Mcrosoft Research Inda Abstract hompson Samplng s one of the oldest heurstcs for mult-armed bandt problems.

More information

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora prnceton unv. F 13 cos 521: Advanced Algorthm Desgn Lecture 3: Large devatons bounds and applcatons Lecturer: Sanjeev Arora Scrbe: Today s topc s devaton bounds: what s the probablty that a random varable

More information

11 Tail Inequalities Markov s Inequality. Lecture 11: Tail Inequalities [Fa 13]

11 Tail Inequalities Markov s Inequality. Lecture 11: Tail Inequalities [Fa 13] Algorthms Lecture 11: Tal Inequaltes [Fa 13] If you hold a cat by the tal you learn thngs you cannot learn any other way. Mark Twan 11 Tal Inequaltes The smple recursve structure of skp lsts made t relatvely

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

Complete subgraphs in multipartite graphs

Complete subgraphs in multipartite graphs Complete subgraphs n multpartte graphs FLORIAN PFENDER Unverstät Rostock, Insttut für Mathematk D-18057 Rostock, Germany Floran.Pfender@un-rostock.de Abstract Turán s Theorem states that every graph G

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

The Minimum Universal Cost Flow in an Infeasible Flow Network

The Minimum Universal Cost Flow in an Infeasible Flow Network Journal of Scences, Islamc Republc of Iran 17(2): 175-180 (2006) Unversty of Tehran, ISSN 1016-1104 http://jscencesutacr The Mnmum Unversal Cost Flow n an Infeasble Flow Network H Saleh Fathabad * M Bagheran

More information

Lecture 17 : Stochastic Processes II

Lecture 17 : Stochastic Processes II : Stochastc Processes II 1 Contnuous-tme stochastc process So far we have studed dscrete-tme stochastc processes. We studed the concept of Makov chans and martngales, tme seres analyss, and regresson analyss

More information

Online Appendix. t=1 (p t w)q t. Then the first order condition shows that

Online Appendix. t=1 (p t w)q t. Then the first order condition shows that Artcle forthcomng to ; manuscrpt no (Please, provde the manuscrpt number!) 1 Onlne Appendx Appendx E: Proofs Proof of Proposton 1 Frst we derve the equlbrum when the manufacturer does not vertcally ntegrate

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

ECE559VV Project Report

ECE559VV Project Report ECE559VV Project Report (Supplementary Notes Loc Xuan Bu I. MAX SUM-RATE SCHEDULING: THE UPLINK CASE We have seen (n the presentaton that, for downlnk (broadcast channels, the strategy maxmzng the sum-rate

More information

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture # 15 Scribe: Jieming Mao April 1, 2013

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture # 15 Scribe: Jieming Mao April 1, 2013 COS 511: heoretcal Machne Learnng Lecturer: Rob Schapre Lecture # 15 Scrbe: Jemng Mao Aprl 1, 013 1 Bref revew 1.1 Learnng wth expert advce Last tme, we started to talk about learnng wth expert advce.

More information

On the Multicriteria Integer Network Flow Problem

On the Multicriteria Integer Network Flow Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Finding Dense Subgraphs in G(n, 1/2)

Finding Dense Subgraphs in G(n, 1/2) Fndng Dense Subgraphs n Gn, 1/ Atsh Das Sarma 1, Amt Deshpande, and Rav Kannan 1 Georga Insttute of Technology,atsh@cc.gatech.edu Mcrosoft Research-Bangalore,amtdesh,annan@mcrosoft.com Abstract. Fndng

More information

The Experts/Multiplicative Weights Algorithm and Applications

The Experts/Multiplicative Weights Algorithm and Applications Chapter 2 he Experts/Multplcatve Weghts Algorthm and Applcatons We turn to the problem of onlne learnng, and analyze a very powerful and versatle algorthm called the multplcatve weghts update algorthm.

More information

arxiv: v1 [cs.lg] 5 Jun 2017

arxiv: v1 [cs.lg] 5 Jun 2017 Sparse Stochastc Bandts arxv:706.0383v [cs.lg 5 Jun 207 Joon Kwon CMAP, École polytechnque, Unversté Pars Saclay joon.kwon@ens-lyon.org Vanney Perchet CMLA, École Normale Supéreure Pars Saclay & Crteo

More information

Channel Selection for Cognitive Radio Terminals

Channel Selection for Cognitive Radio Terminals Channel Selecton for Cogntve Rado Termnals Lng-Hung Kung; SUID: 04906103 1 Introducton Due to the excessve need of wreless spectrum and the neffcency n utlzng t, the technology of cogntve rado (CR) addresses

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

1 Convex Optimization

1 Convex Optimization Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,

More information

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin Proceedngs of the 007 Wnter Smulaton Conference S G Henderson, B Bller, M-H Hseh, J Shortle, J D Tew, and R R Barton, eds LOW BIAS INTEGRATED PATH ESTIMATORS James M Calvn Department of Computer Scence

More information

= z 20 z n. (k 20) + 4 z k = 4

= z 20 z n. (k 20) + 4 z k = 4 Problem Set #7 solutons 7.2.. (a Fnd the coeffcent of z k n (z + z 5 + z 6 + z 7 + 5, k 20. We use the known seres expanson ( n+l ( z l l z n below: (z + z 5 + z 6 + z 7 + 5 (z 5 ( + z + z 2 + z + 5 5

More information

Computing Correlated Equilibria in Multi-Player Games

Computing Correlated Equilibria in Multi-Player Games Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,

More information

Maximizing the number of nonnegative subsets

Maximizing the number of nonnegative subsets Maxmzng the number of nonnegatve subsets Noga Alon Hao Huang December 1, 213 Abstract Gven a set of n real numbers, f the sum of elements of every subset of sze larger than k s negatve, what s the maxmum

More information

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD Matrx Approxmaton va Samplng, Subspace Embeddng Lecturer: Anup Rao Scrbe: Rashth Sharma, Peng Zhang 0/01/016 1 Solvng Lnear Systems Usng SVD Two applcatons of SVD have been covered so far. Today we loo

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

E Tail Inequalities. E.1 Markov s Inequality. Non-Lecture E: Tail Inequalities

E Tail Inequalities. E.1 Markov s Inequality. Non-Lecture E: Tail Inequalities Algorthms Non-Lecture E: Tal Inequaltes If you hold a cat by the tal you learn thngs you cannot learn any other way. Mar Twan E Tal Inequaltes The smple recursve structure of sp lsts made t relatvely easy

More information

Anti-van der Waerden numbers of 3-term arithmetic progressions.

Anti-van der Waerden numbers of 3-term arithmetic progressions. Ant-van der Waerden numbers of 3-term arthmetc progressons. Zhanar Berkkyzy, Alex Schulte, and Mchael Young Aprl 24, 2016 Abstract The ant-van der Waerden number, denoted by aw([n], k), s the smallest

More information

Ensemble Methods: Boosting

Ensemble Methods: Boosting Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement

More information

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals Smultaneous Optmzaton of Berth Allocaton, Quay Crane Assgnment and Quay Crane Schedulng Problems n Contaner Termnals Necat Aras, Yavuz Türkoğulları, Z. Caner Taşkın, Kuban Altınel Abstract In ths work,

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

A new Approach for Solving Linear Ordinary Differential Equations

A new Approach for Solving Linear Ordinary Differential Equations , ISSN 974-57X (Onlne), ISSN 974-5718 (Prnt), Vol. ; Issue No. 1; Year 14, Copyrght 13-14 by CESER PUBLICATIONS A new Approach for Solvng Lnear Ordnary Dfferental Equatons Fawz Abdelwahd Department of

More information

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran

More information

Online Algorithms for the Multi-Armed Bandit Problem with Markovian Rewards

Online Algorithms for the Multi-Armed Bandit Problem with Markovian Rewards Onlne Algorthms for the Mult-Armed Bandt Problem wth Markovan Rewards Cem Tekn, Mngyan Lu Abstract We consder the classcal mult-armed bandt problem wth Markovan rewards. When played an arm changes ts state

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

An Interactive Optimisation Tool for Allocation Problems

An Interactive Optimisation Tool for Allocation Problems An Interactve Optmsaton ool for Allocaton Problems Fredr Bonäs, Joam Westerlund and apo Westerlund Process Desgn Laboratory, Faculty of echnology, Åbo Aadem Unversty, uru 20500, Fnland hs paper presents

More information

The Expectation-Maximization Algorithm

The Expectation-Maximization Algorithm The Expectaton-Maxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hgh-level verson of the algorthm.

More information

Lecture 4: November 17, Part 1 Single Buffer Management

Lecture 4: November 17, Part 1 Single Buffer Management Lecturer: Ad Rosén Algorthms for the anagement of Networs Fall 2003-2004 Lecture 4: November 7, 2003 Scrbe: Guy Grebla Part Sngle Buffer anagement In the prevous lecture we taled about the Combned Input

More information

Outline. Communication. Bellman Ford Algorithm. Bellman Ford Example. Bellman Ford Shortest Path [1]

Outline. Communication. Bellman Ford Algorithm. Bellman Ford Example. Bellman Ford Shortest Path [1] DYNAMIC SHORTEST PATH SEARCH AND SYNCHRONIZED TASK SWITCHING Jay Wagenpfel, Adran Trachte 2 Outlne Shortest Communcaton Path Searchng Bellmann Ford algorthm Algorthm for dynamc case Modfcatons to our algorthm

More information

On the size of quotient of two subsets of positive integers.

On the size of quotient of two subsets of positive integers. arxv:1706.04101v1 [math.nt] 13 Jun 2017 On the sze of quotent of two subsets of postve ntegers. Yur Shtenkov Abstract We obtan non-trval lower bound for the set A/A, where A s a subset of the nterval [1,

More information

Economics 101. Lecture 4 - Equilibrium and Efficiency

Economics 101. Lecture 4 - Equilibrium and Efficiency Economcs 0 Lecture 4 - Equlbrum and Effcency Intro As dscussed n the prevous lecture, we wll now move from an envronment where we looed at consumers mang decsons n solaton to analyzng economes full of

More information

VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES

VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES BÂRZĂ, Slvu Faculty of Mathematcs-Informatcs Spru Haret Unversty barza_slvu@yahoo.com Abstract Ths paper wants to contnue

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

Boostrapaggregating (Bagging)

Boostrapaggregating (Bagging) Boostrapaggregatng (Baggng) An ensemble meta-algorthm desgned to mprove the stablty and accuracy of machne learnng algorthms Can be used n both regresson and classfcaton Reduces varance and helps to avod

More information

Exercises of Chapter 2

Exercises of Chapter 2 Exercses of Chapter Chuang-Cheh Ln Department of Computer Scence and Informaton Engneerng, Natonal Chung Cheng Unversty, Mng-Hsung, Chay 61, Tawan. Exercse.6. Suppose that we ndependently roll two standard

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 31 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 6. Rdge regresson The OLSE s the best lnear unbased

More information

Remarks on the Properties of a Quasi-Fibonacci-like Polynomial Sequence

Remarks on the Properties of a Quasi-Fibonacci-like Polynomial Sequence Remarks on the Propertes of a Quas-Fbonacc-lke Polynomal Sequence Brce Merwne LIU Brooklyn Ilan Wenschelbaum Wesleyan Unversty Abstract Consder the Quas-Fbonacc-lke Polynomal Sequence gven by F 0 = 1,

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

Stanford University CS254: Computational Complexity Notes 7 Luca Trevisan January 29, Notes for Lecture 7

Stanford University CS254: Computational Complexity Notes 7 Luca Trevisan January 29, Notes for Lecture 7 Stanford Unversty CS54: Computatonal Complexty Notes 7 Luca Trevsan January 9, 014 Notes for Lecture 7 1 Approxmate Countng wt an N oracle We complete te proof of te followng result: Teorem 1 For every

More information

Edge Isoperimetric Inequalities

Edge Isoperimetric Inequalities November 7, 2005 Ross M. Rchardson Edge Isopermetrc Inequaltes 1 Four Questons Recall that n the last lecture we looked at the problem of sopermetrc nequaltes n the hypercube, Q n. Our noton of boundary

More information

arxiv:submit/ [cs.lg] 30 Aug 2011

arxiv:submit/ [cs.lg] 30 Aug 2011 No Internal Regret va Neghborhood Watch Dean Foster Department of Statstcs Unversty of Pennsylvana Alexander Rakhln Department of Statstcs Unversty of Pennsylvana arxv:submt/0308560 cs.lg 30 Aug 2011 August

More information

Min Cut, Fast Cut, Polynomial Identities

Min Cut, Fast Cut, Polynomial Identities Randomzed Algorthms, Summer 016 Mn Cut, Fast Cut, Polynomal Identtes Instructor: Thomas Kesselhem and Kurt Mehlhorn 1 Mn Cuts n Graphs Lecture (5 pages) Throughout ths secton, G = (V, E) s a mult-graph.

More information

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS HCMC Unversty of Pedagogy Thong Nguyen Huu et al. A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS Thong Nguyen Huu and Hao Tran Van Department of mathematcs-nformaton,

More information

Excess Error, Approximation Error, and Estimation Error

Excess Error, Approximation Error, and Estimation Error E0 370 Statstcal Learnng Theory Lecture 10 Sep 15, 011 Excess Error, Approxaton Error, and Estaton Error Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton So far, we have consdered the fnte saple

More information

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law: CE304, Sprng 2004 Lecture 4 Introducton to Vapor/Lqud Equlbrum, part 2 Raoult s Law: The smplest model that allows us do VLE calculatons s obtaned when we assume that the vapor phase s an deal gas, and

More information

An (almost) unbiased estimator for the S-Gini index

An (almost) unbiased estimator for the S-Gini index An (almost unbased estmator for the S-Gn ndex Thomas Demuynck February 25, 2009 Abstract Ths note provdes an unbased estmator for the absolute S-Gn and an almost unbased estmator for the relatve S-Gn for

More information

Expected Value and Variance

Expected Value and Variance MATH 38 Expected Value and Varance Dr. Neal, WKU We now shall dscuss how to fnd the average and standard devaton of a random varable X. Expected Value Defnton. The expected value (or average value, or

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

Grover s Algorithm + Quantum Zeno Effect + Vaidman

Grover s Algorithm + Quantum Zeno Effect + Vaidman Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the

More information

arxiv: v1 [cs.lg] 5 Nov 2018

arxiv: v1 [cs.lg] 5 Nov 2018 Mult-armed Bandts wth Compensaton Swe Wang IIIS, Tsnghua Unversty wangsw5@mals.tsnghua.edu.cn Longbo Huang IIIS, Tsnghua Unversty longbohuang@tsnghua.edu.cn arxv:8.075v [cs.lg] 5 ov 08 Abstract We propose

More information

Conservative Contextual Linear Bandits

Conservative Contextual Linear Bandits Conservatve Contextual Lnear Bandts Abbas Kazeroun 1, Mohammad Ghavamzadeh 2, Yasn Abbas-Yadkor 3, and Benjamn Van Roy 4 arxv:1611.06426v2 [stat.ml] 4 Mar 2017 1 Stanford Unversty, abbask@stanford.edu

More information