Online Learning with Queries


Chao-Kai Chiang and Chi-Jen Lu
Institute of Information Science, Academia Sinica, Taipei, Taiwan ({chaokai,cjlu}@iis.sinica.edu.tw); Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan.

Abstract
The online learning problem requires a player to iteratively choose an action in an unknown and changing environment. In the standard setting of this problem, the player has to choose an action in each round before knowing anything about the corresponding loss. However, there are situations in which it seems possible for the player to spend effort or resources to collect some prior information before her actions. This motivates us to study a variant of the online learning problem in which the player is allowed to query $B$ bits of the loss vector in each round before choosing her action. Suppose each loss value is represented by $K$ bits and distinct loss values differ by at least some amount $\delta$, and suppose there are $N$ actions to choose from and $T$ rounds to play. We provide an algorithm for this problem which achieves a regret of the following form. Before $B$ approaches $B_1 = NK/2$, the regret stays at $O(\sqrt{T \ln N})$; after $B$ exceeds $B_1$ but before it approaches $B_2 = NK/2 + 3K/2 - 1$, the regret drops slightly to $O(\sqrt{(T \ln N)/N})$; and after $B$ exceeds $B_2$, the regret takes a dramatic drop to $(N \ln N)/\delta$. Our algorithm is in fact close to optimal, as we also provide regret lower bounds which almost match the regret upper bounds achieved by our algorithm.

1 Introduction
Many situations in daily life seem to involve making repeated decisions in an unknown and changing environment, including examples such as trading stocks, commuting to work, routing in a network, forecasting weather, playing games, etc. This motivates the study of the well-known online learning problem, in which a player iteratively chooses an action and receives a loss (or a reward) for a number of rounds. In each round, the player must choose her action before knowing the corresponding loss, but after choosing her action, she gets to know the whole loss vector (one entry per action) of that round. The player would like to have an online algorithm which can learn from the past and hopefully make better decisions as time goes by, so that the total accumulated loss is small. The standard way of evaluating such an online algorithm is to compare its total loss with that of the best fixed action in hindsight. The difference between these two losses is called the regret, and the goal of an online algorithm is to minimize its regret. There have been many wonderful works on this problem, and it has grown into a rich topic with contributions coming from several areas such as machine learning, algorithm design, and statistics. More information can be found in survey papers such as [3, 5] or the nice book [6], and a sample of more recent works includes [1, 17, 4, 7, 16, 8, 10, 18, 11, 2].

For the online learning problem with $N$ actions to choose from and $T$ rounds to play, there are algorithms which achieve a regret of $O(\sqrt{T \ln N})$, and this bound is in fact tight, as a matching lower bound of $\Omega(\sqrt{T \ln N})$ can be shown (see e.g. [6]). Note that these bounds hold in the most general and adversarial setting, in which the loss vector in each round could be any arbitrary one in $[0,1]^N$. On the other hand, it becomes possible to achieve a smaller regret when the loss vectors have constraints. For the online convex optimization problem, which generalizes the online learning problem, Hazan, Agarwal, and Kale [13] showed that when the loss functions satisfy some nice properties, such as strict convexity (with bounded first and second derivatives), a regret of $O(\ln T)$ can be achieved.
The result, however, does not seem to carry over to the online learning problem. For the online linear optimization problem, Hazan and Kale [14] considered the case in which the sequence of $T$ loss functions has a small variation $V$, and they showed that a regret of $O(\sqrt{V})$ can be achieved. They also have an analogous result for the online learning problem. Another situation in which one can have constraints on the loss functions/vectors, even though they could still be arbitrary, is when one can obtain some prior information about them. For the online linear optimization problem, Hazan and Megiddo [15] showed that if the player knows the first entry of the loss vector (as the prior information) before choosing her action in each round, a regret of $O(N^2 \ln T)$ can be achieved. They also considered modeling the prior information in each round as some state vector and measuring the regret against (stronger) offline algorithms which are allowed to have their action in that round depend in a certain way on the same prior information. In this setting, they showed that a regret can be achieved which depends on $T$ in the form of $O(T^{1-1/(d+2)})$, where $d$ is the dimension of the state vectors.

In these previous works, the online player seems to be considered as having a passive role in the environment, with no control over the constraints on the loss functions: either the player is in a somewhat benign environment in which the loss functions themselves satisfy some nice property, or the player passively observes some revealed information about the loss functions. On the other hand, there are scenarios in which it seems possible for the player to spend effort or resources to actively collect some information of her choice about the loss functions. For example, before deciding which route to take, a driver may first select some routes and try to collect their traffic conditions; before deciding which stocks to trade, an investor may first select some stocks and try to do some research on their potential; before choosing the next move in a game, a player may first select some moves and try to evaluate how good they are. However, in most situations, one is unlikely to have an unlimited amount of effort or resources to collect all the information one would like to have; therefore, one needs to decide how to spend the limited effort or resources in an efficient way.

We would like to initiate a study of such scenarios. As a start, we consider modifying the online learning problem in the following way. In each round, we give the player a $B$-bit budget which allows her to query $B$ bits of her choice from the loss vector before choosing her action, where we assume that each loss value is represented by a $K$-bit string and distinct loss values differ by at least some amount $\delta$. We allow the queries to be made in a randomized way, but we also allow an adversary to set each bit of a loss vector after receiving the corresponding query made by the online algorithm, although we still require that the adversary fix a loss vector before seeing the algorithm's action. This has the purpose of limiting the power of queries and capturing the potential delay between the queries and actions made by the algorithm. Note that our model has the original online learning problem as a special case, when $B = 0$. On the other hand, when $B = NK$, one can achieve zero regret, since one has enough budget to figure out the whole loss vector and choose the best action in each round. The interesting case is when the value of $B$ lies in the middle, and some questions arise. With a limited number of queries, where should one spend them? It is natural to expect that with a larger $B$, one can obtain more information about the loss vectors and achieve a smaller regret, but what does the regret look like as a function of the budget bound $B$? We will try to answer these questions in this paper, by providing an algorithm for this problem together with lower bounds on the regret which almost match those achieved by the algorithm.

Our algorithm is based on the well-known weighted average algorithm, which achieves an optimal regret for the original online learning problem. To work in our new setting, we add a step for making the queries and modify the way an action is chosen in each round (while keeping the weights updated in the same multiplicative way). Instead of using the probability distribution $p^t$ of the weighted average algorithm to choose an action in each round, we use $p^t$ to guide our queries, and from the query result, we modify the distribution $p^t$ by moving probabilities around among some actions. Our strategy is to use queries to find actions with different loss values, so that by moving the probabilities to actions with a smaller loss, the expected loss in that step can be reduced below that of the weighted average algorithm.
We start the queries on actions with larger probabilities in $p^t$, hoping that a larger amount of probability can be moved around so that a larger reduction of the loss can be achieved. The regret which our algorithm achieves depends on the budget bound $B$ in the following way. Before $B$ approaches the bound $B_1 = NK/2$, the regret remains at $O(\sqrt{T \ln N})$, which is of the same order as that of the no-query case ($B = 0$). After $B$ passes the bound $B_1$ but before it approaches the bound $B_2 = NK/2 + 3K/2 - 1$, there is a noticeable drop of the regret to $O(\sqrt{(T \ln N)/N})$. Finally, after $B$ passes the bound $B_2$, the regret takes a dramatic drop to $(N \ln N)/\delta$, which is independent of $T$. One may see our regret bound as having two phase transitions, one minor and one major, at the two critical points $B_1$ and $B_2$.

One may wonder whether this interesting shape of the regret bound is just an artifact of the particular algorithm we design. We show that this is not the case and that it actually comes from the nature of the problem. We do this by providing regret lower bounds which almost match the regret bounds achieved by our algorithm. As a result, we know that unless one can query close to half of the bits in the loss vectors, the queries do not help much, as they can only reduce the regret by a constant factor. Moreover, even when one can have the number of queries close to $B_2$, one can only reduce the regret by a factor of $\sqrt{N}$. On the other hand, according to our algorithm, when the budget bound exceeds $B_2$, the queries suddenly become extremely useful, and the regret can be made extremely small, not even depending on $T$.

It is a pleasant surprise to see how the budget affects the regret in such an interesting way.

We consider our work as a preliminary step in the new direction of allowing queries in online learning. There are many questions that remain to be answered, and next we list three of them. First, recall that in our model, we allow an adversary to set each bit of a loss vector after receiving the corresponding query made by the online algorithm. This somewhat limits the power of the queries, even though the queries are allowed to be randomized. Still, we show that queries can be very powerful when their number exceeds some threshold. We would like to understand whether the queries could become even more powerful when the adversary has to fix a loss vector before the online algorithm makes any query on it. Next, our algorithm is based on the specific weighted average algorithm, and our regret analysis seems to rely crucially on some of its special properties. We would like to understand whether it is possible to modify any existing online algorithm, instead of just the weighted average algorithm, to use queries to achieve a smaller regret. Finally, in our query model, we allow the online algorithm to obtain individual bits of a loss vector, but this may not be realistic in some settings. For those settings, we would like to have more appropriate query models which capture the kind of information one can obtain from loss vectors, and then to design algorithms which can utilize such queries to achieve small regret.

The outline of the paper is the following. In Section 2, we introduce some definitions and provide some basic facts. In Section 3, we consider a special case of the problem and provide a simple algorithm with a simple analysis, which contains the essential ideas. Then we provide an algorithm and analyze its regret for the general problem in Sections 4 and 5. Finally, we prove regret lower bounds which almost match the regret upper bounds achieved by our algorithm.

2 Preliminaries
First, we introduce some notation which will be used in this paper. For a binary vector $v$, let $\#_1(v)$ denote the number of ones in $v$. For a set $S$, let $|S|$ denote the number of elements in $S$. For a positive integer $N$, let $[N]$ denote the set $\{1, 2, \ldots, N\}$.

Next, let us describe the original online learning problem. Suppose there is a set of $N$ available actions and there are a total of $T$ rounds to play. In each round $t \in [T]$, an online algorithm $A$ chooses to play an action according to some distribution $p^t = (p_1^t, \ldots, p_N^t)$ over the $N$ actions, where $p_i^t$ is the probability that $A$ plays action $i$ in round $t$. After that, a loss vector $\ell^t = (\ell_1^t, \ldots, \ell_N^t) \in [0,1]^N$ is revealed to $A$, where $\ell_i^t$ is the loss of playing action $i$ in round $t$, and $A$ suffers an expected loss $\sum_i p_i^t \ell_i^t$. The expected loss of $A$ in $T$ rounds of play is

$L_A^T = \sum_{t=1}^{T} \sum_i p_i^t \ell_i^t$,

and we compare it with that of the best fixed action in hindsight, which is

$L_{\min}^T = \min_i \sum_{t=1}^{T} \ell_i^t$.

The goal of $A$ is to minimize its regret, defined as $R_A^T = L_A^T - L_{\min}^T$. For this problem, there are algorithms which achieve an optimal regret of $O(\sqrt{T \ln N})$. Next, we describe one of them, called the weighted average algorithm and denoted $A_0$, which will be used later to build our algorithm. In each round $t$, $A_0$ maintains a weight vector $w^t = (w_1^t, \ldots, w_N^t)$ (initially, $w_i^1 = 1/N$) together with the distribution $p^t = (p_1^t, \ldots, p_N^t)$ such that for each $i \in [N]$,

(2.1)  $p_i^t = w_i^t / W^t$, where $W^t = \sum_{j \in [N]} w_j^t$,

and performs the following two steps:

Step 1: $A_0$ plays an action sampled according to the distribution $p^t = (p_1^t, \ldots, p_N^t)$.
Step 2: After receiving the loss vector $\ell^t = (\ell_1^t, \ldots, \ell_N^t)$, $A_0$ updates its weights to $w^{t+1} = (w_1^{t+1}, \ldots, w_N^{t+1})$ according to the rule that for each $i \in [N]$,

(2.2)  $w_i^{t+1} = w_i^t \cdot e^{-\eta \ell_i^t}$,

where the parameter $\eta$ is the learning rate, which one can choose. Several ways are known for bounding the regret of this algorithm. However, for our result to work, we will use the particular one given in the following lemma, which we prove in Appendix A. Note that it guarantees a regret of at most $(\ln N)/\eta + \eta T = 2\sqrt{T \ln N}$ by choosing $\eta = \sqrt{(\ln N)/T}$.
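Before stating that lemma, here is a minimal Python sketch of $A_0$ (our illustration, not code from the paper). The callback `get_loss_vector`, which reveals $\ell^t$ after the action is played, is a hypothetical stand-in for the environment.

```python
import math
import random

def weighted_average(T, N, get_loss_vector, eta):
    """Sketch of the weighted average algorithm A0."""
    w = [1.0 / N] * N                                  # initially w^1_i = 1/N
    total_loss = 0.0
    for t in range(T):
        W = sum(w)
        p = [wi / W for wi in w]                       # (2.1): p^t_i = w^t_i / W^t
        action = random.choices(range(N), weights=p)[0]            # Step 1
        loss = get_loss_vector(t)                      # full vector revealed afterwards
        total_loss += loss[action]
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, loss)]  # (2.2)
    return total_loss
```

Calling it with `eta = math.sqrt(math.log(N) / T)` corresponds to the $2\sqrt{T \ln N}$ guarantee noted above.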

Lemma 2.1. For any $i_1, i_2, \ldots, i_T \in [N]$, the regret of $A_0$ is at most

$\frac{\ln N}{\eta} + \eta \sum_{t=1}^{T} \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t$.

In this paper, we study a new setting in which the online algorithm is allowed to query some information about the loss vector before choosing its action to play in each round. More precisely, in each round $t$, the algorithm is allowed to query $B$ bits from the loss vector $\ell^t$. Here, we assume that each loss value $\ell_i^t$ comes from a set of at most $2^K$ values, so that we can represent each value by a $K$-bit string, with a smaller binary representation for a smaller loss value, and we assume furthermore that any two distinct loss values differ by at least some amount $\delta$. For the clarity of our presentation, we assume here that $\delta$ (and thus $K$) is a constant, and we also assume that the algorithm knows the numbers $B$, $\delta$, and $T$ before it starts. In each round, while we allow the online algorithm to make randomized queries, we also allow the adversary the power to set the bits of the loss vector after receiving the corresponding queries, but still the adversary must fix the loss vector before seeing the action chosen by the algorithm.

3 A Special Case
In this section, we provide a simple example showing that even with a one-bit query in each round, it becomes possible to reduce the regret significantly. We use this simple case to illustrate the basic ideas, which will be extended to the more difficult general case in the next section. The result of this section is the following.

Theorem 3.1. For the special case of the online learning problem with $N$ actions such that loss vectors are from $\{0,1\}^N$ and the budget bound is $B = 1$ per round, there exists an algorithm $A_1$ which achieves a regret of at most $N \ln N$.

Before proving the theorem, let us first see how some partial information about a loss vector can be used to save some loss for the online algorithm. One example is that if we know $\ell_i^t > \ell_j^t$ in round $t$, then by moving some probability $q_i$ from playing action $i$ to playing action $j$, we can reduce the expected loss by the amount

(3.3)  $q_i \ell_i^t - q_i \ell_j^t = q_i (\ell_i^t - \ell_j^t) = q_i$,

since $\ell_i^t, \ell_j^t \in \{0,1\}$, which means that a larger $q_i$ gives a larger saving. This suggests that we query the bit $\ell_i^t$ when action $i$ is the one that we initially plan to play with the highest probability, hoping that from it we can move a large probability to some other action with a smaller loss value. Using this idea, we design the algorithm $A_1$ and analyze its regret next.

Proof. (of Theorem 3.1) The algorithm $A_1$ is based on the weighted average algorithm $A_0$ described in the previous section, but it adds a query step and then modifies the distribution of actions in each round. More precisely, in round $t$, $A_1$ maintains a weight vector $w^t = (w_1^t, \ldots, w_N^t)$ and the distribution $p^t = (p_1^t, \ldots, p_N^t)$ defined as in (2.1), but it replaces Step 1 of $A_0$ by the following:

Step 1.1. $A_1$ queries the bit $\ell_{i_t}^t$ of the loss vector, where $i_t$ is the action such that $p_{i_t}^t \geq p_j^t$ for every $j \in [N]$.

Step 1.2. $A_1$ derives the distribution $\hat{p}^t$ from $p^t$ by moving its probabilities in the following way. If $\ell_{i_t}^t = 0$, then $A_1$ moves all the probabilities of the other actions to action $i_t$, so that $\hat{p}_{i_t}^t = 1$ and $\hat{p}_j^t = 0$ for any $j \neq i_t$. If $\ell_{i_t}^t = 1$, then $A_1$ moves the probability of action $i_t$ to the other actions evenly, so that $\hat{p}_{i_t}^t = 0$ and $\hat{p}_j^t = p_j^t + p_{i_t}^t/(N-1)$ for any $j \neq i_t$.

Step 1.3. $A_1$ plays an action sampled according to the distribution $\hat{p}^t$.

Next, we analyze the regret of $A_1$. We do this by comparing it with that of $A_0$, which by Lemma 2.1 is at most

$\frac{\ln N}{\eta} + \eta \sum_{t=1}^{T} \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t$.

According to (3.3), in each round $t$, by moving the probabilities around, the algorithm $A_1$ can reduce the loss of $A_0$ by some amount $s_t$, such that when $\ell_{i_t}^t = 0$,

$s_t = \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t (\ell_i^t - \ell_{i_t}^t) = \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t$,

and when $\ell_{i_t}^t = 1$,

$s_t = \sum_{i: \ell_i^t \neq \ell_{i_t}^t} \frac{p_{i_t}^t}{N-1} (\ell_{i_t}^t - \ell_i^t) \geq \frac{1}{N} \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t$,

since $p_{i_t}^t \geq p_i^t$ for any $i$. As a result, the regret of $A_1$ is at most

$\frac{\ln N}{\eta} + \sum_{t=1}^{T} \Big( \eta \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t - s_t \Big) \leq \frac{\ln N}{\eta} + \Big( \eta - \frac{1}{N} \Big) \sum_{t=1}^{T} \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t = N \ln N$,

by choosing $\eta = 1/N$. This proves Theorem 3.1. □
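For concreteness, the following is a minimal sketch of $A_1$ in Python (again our illustration, not the paper's code). The hypothetical callbacks `query_bit` and `get_loss_vector` stand for the one-bit query and the end-of-round feedback.

```python
import math
import random

def algorithm_A1(T, N, query_bit, get_loss_vector):
    """Sketch of A1 for binary losses and a one-bit query budget per round."""
    eta = 1.0 / N                        # learning rate from the proof of Theorem 3.1
    w = [1.0 / N] * N
    total_loss = 0.0
    for t in range(T):
        W = sum(w)
        p = [wi / W for wi in w]
        i_t = max(range(N), key=lambda i: p[i])      # heaviest action
        bit = query_bit(t, i_t)                      # Step 1.1: spend the one-bit budget
        if bit == 0:                                 # Step 1.2: pile all mass on i_t ...
            p_hat = [0.0] * N
            p_hat[i_t] = 1.0
        else:                                        # ... or spread its mass evenly
            p_hat = [pi + p[i_t] / (N - 1) for pi in p]
            p_hat[i_t] = 0.0
        action = random.choices(range(N), weights=p_hat)[0]        # Step 1.3
        loss = get_loss_vector(t)                    # revealed after the action
        total_loss += loss[action]
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, loss)]  # same update as A0
    return total_loss
```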

4 Main Result
In this section, we consider the general online learning problem described in Section 2. We generalize the algorithm $A_1$ of the previous section to the general setting, and our main result is the following theorem.

Theorem 4.1. Let $D = \max\{0,\, NK/2 + 3K/2 - 1 - B\}$. Then for the general online learning problem described in Section 2, there exists an online algorithm $A_2$ which, given a budget of $B$ queries per round, achieves a regret

$R_{A_2}^T \leq \begin{cases} (N \ln N)/\delta & \text{if } D = 0, \\ \sqrt{8DT(\ln N)/(NK)} & \text{if } D > 0, \end{cases}$

for a large enough $T$.

Before proving the theorem, let us try to understand the somewhat complicated-looking regret bound better, and in particular, to see how the regret is affected by the budget bound $B$. First, observe that as $B$ increases from zero, the quantity $D$ decreases, and consequently the regret $R_{A_2}^T$ decreases. This matches what one would normally expect. Next, let us take a closer look at how $R_{A_2}^T$ decreases as $B$ increases. Interestingly, the value of $R_{A_2}^T$ appears to go through two phase transitions, one minor and one major, around $B = NK/2$ and $B = NK/2 + 3K/2 - 1$, in the following sense. When $B \leq (1-\varepsilon)NK/2$ for any small positive constant $\varepsilon$, $R_{A_2}^T$ remains at $O(\sqrt{T \ln N})$, which is of the same order as that of the no-query case ($B = 0$). When $NK/2 \leq B \leq NK/2 + (1-\varepsilon) \cdot 3K/2$ for any small positive constant $\varepsilon$, $R_{A_2}^T$ takes a noticeable drop to $O(\sqrt{(T \ln N)/N})$. Finally, when $B \geq NK/2 + 3K/2 - 1$, $R_{A_2}^T$ takes a dramatic drop to $(N \ln N)/\delta$, which is very small and independent of $T$.

Next, we proceed to prove Theorem 4.1 by providing the algorithm $A_2$ and then bounding its regret, in the following two subsections respectively.

4.1 The Algorithm $A_2$. The algorithm $A_2$ is based on the algorithm $A_1$ of the previous section (which in turn is based on the weighted average algorithm $A_0$), but it modifies Step 1.1 (for making queries) and Step 1.2 (for deriving the distribution $\hat{p}^t$) in order to handle the more general case.

Consider any round $t$. Just as in $A_1$, we would like to use queries to find out some relationships among the losses of actions, so that we can move probabilities to actions with a smaller loss. Now in the general case, which can have $B > 1$ and $K > 1$, we need to decide where to spend the $B$ bits of budget; if we spend them efficiently, we can find out more relationships. We call an action $i$ heavier than an action $j$ if $p_i^t \geq p_j^t$, and we call it lighter than $j$ otherwise. Let $i_t$ denote the heaviest action. Our strategy is to use its loss value $\ell_{i_t}^t$ as a basis and to find out its relationship with $\ell_i^t$ for as many actions $i$ as possible. Here, we look first for a partial relationship such as $\ell_i^t \leq \ell_{i_t}^t$, instead of an exact one such as $\ell_i^t = \ell_{i_t}^t$ or $\ell_i^t < \ell_{i_t}^t$, so that we can spend as few queries as possible and still know some way to move the probability. Following the idea in Section 3, we query heavier actions before lighter ones, hoping that larger probabilities can be moved among them. Formally, in each round $t$, $A_2$ replaces Step 1.1 of $A_1$ by the following (a sketch of one possible implementation appears after this step):

Step 1.1. Before the $B$-bit budget runs out, $A_2$ queries the $K$ bits of $\ell_{i_t}^t$, where $i_t$ is the heaviest action, and then repeats the following if $\ell_{i_t}^t \notin \{1^K, 0^K\}$:
(a) $A_2$ finds the next heaviest action $i$.
(b) $A_2$ queries those bits of $\ell_i^t$ in the positions which have zeros in $\ell_{i_t}^t$ if $\ell_{i_t}^t$ has fewer zeros than ones (i.e., $\#_1(\ell_{i_t}^t) > K/2$), and queries the other bits of $\ell_i^t$ otherwise. (For example, if $\ell_{i_t}^t = 100$, then $A_2$ queries only the leftmost bit of $\ell_i^t$.)
(c) If any of the queried bits in $\ell_i^t$ differs from the corresponding bit in $\ell_{i_t}^t$, $A_2$ queries all the remaining bits in $\ell_i^t$.

Note that if $\ell_{i_t}^t$ equals $1^K$ or $0^K$, $A_2$ does not make any further query on any other action $i$, because it already knows the relationship $\ell_i^t \leq \ell_{i_t}^t$ or $\ell_i^t \geq \ell_{i_t}^t$, respectively. If in Step 1.1(b) all the queried bits match the corresponding bits in $\ell_{i_t}^t$, $A_2$ knows the relationship $\ell_i^t \leq \ell_{i_t}^t$ or $\ell_i^t \geq \ell_{i_t}^t$ when those bits are all zeros or all ones, respectively. Otherwise (if there is a mismatch), then in Step 1.1(c) $A_2$ queries the remaining bits in $\ell_i^t$ to determine whether $\ell_i^t < \ell_{i_t}^t$ or $\ell_i^t > \ell_{i_t}^t$.
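The following Python sketch shows one way to organize the bookkeeping of Step 1.1 (our illustration, with simplified budget handling, not the paper's code). The hypothetical callback `query_bit(i, pos)` returns bit `pos` of action $i$'s $K$-bit loss, with position 0 the most significant; the function classifies every action by its known relation to the heaviest action.

```python
def a2_query_step(p, K, B, query_bit):
    """Sketch of Step 1.1 of A2: spend at most B bit queries and return the
    actions grouped by their known relation to the loss of the heaviest action."""
    N = len(p)
    order = sorted(range(N), key=lambda i: -p[i])      # heavier actions first
    i_t, rest = order[0], order[1:]
    budget = [B]                                       # boxed so ask() can spend it

    def ask(i, pos):
        if budget[0] == 0:
            return None                                # budget exhausted
        budget[0] -= 1
        return query_bit(i, pos)                       # assumed to return 0 or 1

    sets = {"<": [], "<=": [], "=": [i_t], ">=": [], ">": [], "?": []}
    base = [ask(i_t, pos) for pos in range(K)]         # read l_{i_t} in full
    if None in base:                                   # could not even read the base
        sets["?"].extend(rest)
        return sets
    if sum(base) in (0, K):                            # l_{i_t} is 0^K or 1^K
        sets[">=" if sum(base) == 0 else "<="].extend(rest)
        return sets
    ones = [q for q in range(K) if base[q] == 1]
    zeros = [q for q in range(K) if base[q] == 0]
    probe = zeros if len(ones) > len(zeros) else ones  # query the cheaper side
    for i in rest:
        got = {q: ask(i, q) for q in probe}
        if None in got.values():                       # ran out mid-action
            sets["?"].append(i)
            continue
        if all(got[q] == base[q] for q in probe):      # all probed bits match
            sets["<=" if probe is zeros else ">="].append(i)
            continue
        for q in range(K):                             # mismatch: read the rest
            if q not in got:
                got[q] = ask(i, q)
        if None in got.values():
            sets["?"].append(i)
            continue
        bits = [got[q] for q in range(K)]
        sets["<" if bits < base else ">"].append(i)    # full comparison, MSB first
    return sets
```

Once the budget is exhausted, `ask` returns `None` and each remaining action falls into the unknown class; this matches how the analysis below charges at most $K-1$ wasted bits to the single partially queried action.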

From such information, $A_2$ can divide the $N$ actions into six sets, $I_<$, $I_\leq$, $I_=$, $I_\geq$, $I_>$, and $I_?$, in the following way. If $A_2$ knows $\ell_i^t < \ell_{i_t}^t$ or $\ell_i^t > \ell_{i_t}^t$, it puts action $i$ in $I_<$ or $I_>$, respectively. If $A_2$ only knows $\ell_i^t \leq \ell_{i_t}^t$ or $\ell_i^t \geq \ell_{i_t}^t$, it puts action $i$ in $I_\leq$ or $I_\geq$, respectively. If $A_2$ still does not know any relationship between $\ell_i^t$ and $\ell_{i_t}^t$ after running out of budget, it puts action $i$ in $I_?$. Finally, let $I_= = \{i_t\}$.

With such information at hand, $A_2$ derives the new distribution $\hat{p}^t$ from the distribution $p^t$ by trying to move probabilities to actions with a smaller loss. We say that the probabilities of some set $I$ of actions are moved to another set $I'$ of actions evenly if $\hat{p}_i^t = 0$ for $i \in I$ and $\hat{p}_i^t = p_i^t + \sum_{j \in I} p_j^t / |I'|$ for $i \in I'$. Formally, in each round $t$, $A_2$ replaces Step 1.2 of $A_1$ by the following (a sketch in the same style follows):

Step 1.2. $A_2$ derives the distribution $\hat{p}^t$ from $p^t$ by moving its probabilities in the following way:
If $I_< \neq \emptyset$, $A_2$ moves all the probabilities from $I_= \cup I_\leq \cup I_\geq \cup I_>$ to some $i_0 \in I_<$.
If $I_< = \emptyset \neq I_\leq$, $A_2$ moves all the probabilities from $I_= \cup I_\geq \cup I_>$ to $I_\leq$ evenly.
If $I_< = \emptyset = I_\leq$, $A_2$ moves all the probabilities from $I_\geq \cup I_>$ to $I_=$.

The other steps of the algorithm $A_1$ are all inherited without change by the algorithm $A_2$, except that now $A_2$ sets its learning rate as

(4.4)  $\eta = \begin{cases} \delta/N & \text{if } D = 0, \\ \sqrt{NK(\ln N)/(2TD)} & \text{if } D > 0. \end{cases}$
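Continuing the sketch above, the redistribution of Step 1.2 can be written as follows (ours, not the paper's; `sets` is the hypothetical dictionary returned by the query-step sketch).

```python
def a2_move_probabilities(p, sets):
    """Sketch of Step 1.2 of A2: derive p_hat from p using the query results."""
    p_hat = list(p)

    def move(sources, targets):
        """Move all probability mass from `sources` onto `targets`, evenly."""
        mass = sum(p_hat[i] for i in sources)
        for i in sources:
            p_hat[i] = 0.0
        for i in targets:
            p_hat[i] += mass / len(targets)

    if sets["<"]:
        # some action is known to be strictly better than i_t: pile mass onto one
        move(sets["="] + sets["<="] + sets[">="] + sets[">"], [sets["<"][0]])
    elif sets["<="]:
        # only "no worse than i_t" actions are known: spread the rest over them
        move(sets["="] + sets[">="] + sets[">"], sets["<="])
    else:
        # nothing better is known: move the provably no-better mass onto i_t
        move(sets[">="] + sets[">"], sets["="])
    return p_hat                          # actions in I_? keep their probability
```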

Next, we show that the algorithm $A_2$ indeed achieves the regret bound given in Theorem 4.1.

4.2 Proof of Theorem 4.1. We follow the analysis in Section 3. For each round $t$, let $s_t$ denote the amount of loss $A_2$ saves over $A_0$ by moving the probabilities around (and playing according to the distribution $\hat{p}^t$ instead of $p^t$), and let

$r_t = \eta \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t - s_t$.

According to Lemma 2.1 and the discussion in Section 3, we can bound the regret of $A_2$ as

(4.5)  $R_{A_2}^T \leq \frac{\ln N}{\eta} + \sum_{t=1}^{T} r_t$.

We then bound each $r_t$ by the following lemma.

Lemma 4.1. Let $D = \max\{0,\, NK/2 + 3K/2 - 1 - B\}$, and suppose $\eta \leq \delta/N$. Then for any $t \in [T]$,

$r_t \leq \frac{2\eta D}{NK}$.

We will prove the lemma in Section 5. For now, let us apply it to the bound in (4.5) and consider two cases, depending on the value of $D$. If $D = 0$, by choosing $\eta = \delta/N$, we have

$R_{A_2}^T \leq \frac{\ln N}{\eta} = \frac{N \ln N}{\delta}$.

If $D > 0$, by choosing $\eta = \sqrt{NK(\ln N)/(2TD)}$, which is at most $\delta/N$ for a large enough $T$, we have

$R_{A_2}^T \leq \frac{\ln N}{\eta} + \frac{2\eta DT}{NK} = \sqrt{\frac{8DT \ln N}{NK}}$.

This completes the proof of Theorem 4.1. □

5 Proof of Lemma 4.1
Consider any $t \in [T]$, and let $i_t$ be the heaviest action, which $A_2$ queries first in round $t$. Recall that

$r_t = \eta \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t - s_t$,

where $s_t$ is the saving of loss in round $t$ from playing according to the probability distribution $\hat{p}^t$ instead of $p^t$. Our goal is to show that

(5.6)  $r_t \leq \frac{2\eta D}{NK}$,

where $D = \max\{0,\, NK/2 + 3K/2 - 1 - B\}$. For this, we consider two cases, depending on whether or not $|I_<| = 0$.

First, let us consider the easier case that $|I_<| \neq 0$. In this case, the algorithm $A_2$ moves the probability $p_{i_t}^t$ from the action $i_t$ (and possibly also probabilities from other actions) to some action $i_0 \in I_<$ with $\ell_{i_0}^t < \ell_{i_t}^t$, which means that the saving of loss is $s_t \geq p_{i_t}^t (\ell_{i_t}^t - \ell_{i_0}^t)$. Since $i_t$ is the heaviest action, we have $p_{i_t}^t \geq 1/N$, and since distinct loss values differ by at least $\delta$, we have $\ell_{i_t}^t - \ell_{i_0}^t \geq \delta$. As a result, we have

$r_t = \eta \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t - s_t \leq \eta - \frac{\delta}{N} \leq 0$,

by the assumption that $\eta \leq \delta/N$. Thus, the bound in (5.6) holds in this case.

Next, let us consider the more difficult case that $|I_<| = 0$. We rely on the following claim, which we prove in Subsection 5.1.

Claim 5.1. If $|I_<| = 0$, then we have

$r_t \leq \eta \sum_{k \in I_?} p_k^t - (\delta - \eta) \sum_{j \in I_>} p_j^t$  and  $|I_?| - |I_>| \leq \frac{2D}{K}$.

Recall that $A_2$ queries heavier actions before lighter ones, which implies that $p_k^t \leq p_j^t$ for any $k \in I_?$ and $j \notin I_?$. Now let $I_{?1}$ be the set of the $|I_>|$ heaviest actions in $I_?$, and let $I_{?2}$ be the set of the remaining actions in $I_?$, so that $I_{?2}$ consists of the $|I_{?2}| \leq 2D/K$ lightest actions among all the $N$ actions. Then we have

$\sum_{k \in I_?} p_k^t = \sum_{k \in I_{?1}} p_k^t + \sum_{k \in I_{?2}} p_k^t \leq \sum_{j \in I_>} p_j^t + \frac{|I_{?2}|}{N} \leq \sum_{j \in I_>} p_j^t + \frac{2D}{NK}$,

and substituting this into the bound in Claim 5.1, we obtain

$r_t \leq \eta \Big( \sum_{j \in I_>} p_j^t + \frac{2D}{NK} \Big) - (\delta - \eta) \sum_{j \in I_>} p_j^t = \frac{2\eta D}{NK} - (\delta - 2\eta) \sum_{j \in I_>} p_j^t \leq \frac{2\eta D}{NK}$,

since $\delta \geq 2\eta$ by the assumption that $\eta \leq \delta/N$. Thus, the bound in (5.6) also holds in the case that $|I_<| = 0$. To complete the proof of Lemma 4.1, it remains to prove Claim 5.1, which we do next.

5.1 Proof of Claim 5.1. Assume $|I_<| = 0$. Let us consider two cases according to the range of $\#_1(\ell_{i_t}^t)$, as the algorithm $A_2$ behaves differently in them.

Case 1: $\#_1(\ell_{i_t}^t) \leq K/2$. In this case, $A_2$ starts its queries on the positions corresponding to ones in $\ell_{i_t}^t$, and after finishing all the queries, each action $i \neq i_t$ belongs to one of the three sets $I_\geq$, $I_>$, and $I_?$. Since $A_2$ moves all the probabilities from $I_\geq \cup I_>$ to $I_= = \{i_t\}$, it reduces the loss of $A_0$ by at least

$\sum_{i \in I_\geq:\, \ell_i^t > \ell_{i_t}^t} p_i^t (\ell_i^t - \ell_{i_t}^t) + \sum_{j \in I_>} p_j^t (\ell_j^t - \ell_{i_t}^t)$,

which implies that

(5.7)  $s_t \geq \eta \sum_{i \in I_\geq:\, \ell_i^t > \ell_{i_t}^t} p_i^t + \delta \sum_{j \in I_>} p_j^t$,

since distinct loss values differ by at least $\delta \geq \eta$. On the other hand,

(5.8)  $\sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t \leq \sum_{i \in I_\geq:\, \ell_i^t > \ell_{i_t}^t} p_i^t + \sum_{j \in I_>} p_j^t + \sum_{k \in I_?} p_k^t$,

and note that $\eta$ times the first term in (5.8) is at most the first term in (5.7), since $\eta \leq \delta$. As a result,

$r_t = \eta \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t - s_t \leq \eta \Big( \sum_{j \in I_>} p_j^t + \sum_{k \in I_?} p_k^t \Big) - \delta \sum_{j \in I_>} p_j^t = \eta \sum_{k \in I_?} p_k^t - (\delta - \eta) \sum_{j \in I_>} p_j^t$.

Next, let us bound $|I_?| - |I_>|$. We may assume that $A_2$ runs out of budget in Step 1.1, because otherwise we have $|I_?| = 0$ and hence $|I_?| - |I_>| \leq 0 \leq 2D/K$. Assuming that no budget remains, and since the numbers of queries $A_2$ spends on the actions in $I_=$, $I_\geq$, $I_>$, and $I_?$ are at most $K$, $(K/2)|I_\geq|$, $K|I_>|$, and $K-1$, respectively, we have

$B \leq K + \frac{K}{2}|I_\geq| + K|I_>| + (K - 1)$.

On the other hand, we know that $N = 1 + |I_\geq| + |I_>| + |I_?|$, and by combining these two bounds to eliminate $|I_\geq|$, we obtain

$|I_?| - |I_>| \leq \frac{2}{K}\Big( \frac{NK}{2} + \frac{3K}{2} - 1 - B \Big) \leq \frac{2D}{K}$.

Case 2: $\#_1(\ell_{i_t}^t) > K/2$. In this case, $A_2$ starts its queries on the positions corresponding to zeros in $\ell_{i_t}^t$, and after finishing all the queries, each action $i \neq i_t$ belongs to one of the three sets $I_\leq$, $I_>$, and $I_?$. Since $A_2$ moves all the probabilities from $I_= \cup I_>$ to $I_\leq$ evenly, it reduces the loss of $A_0$ by at least

$\sum_{i \in I_\leq:\, \ell_i^t < \ell_{i_t}^t} \frac{p_{i_t}^t}{|I_\leq|} (\ell_{i_t}^t - \ell_i^t) + \sum_{j \in I_>} \sum_{i \in I_\leq} \frac{p_j^t}{|I_\leq|} (\ell_j^t - \ell_i^t)$,

which implies that

(5.9)  $s_t \geq \frac{\delta}{N} \sum_{i \in I_\leq:\, \ell_i^t < \ell_{i_t}^t} p_i^t + \delta \sum_{j \in I_>} p_j^t$.

On the other hand,

(5.10)  $\sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t \leq \sum_{i \in I_\leq:\, \ell_i^t < \ell_{i_t}^t} p_i^t + \sum_{j \in I_>} p_j^t + \sum_{k \in I_?} p_k^t$,

and note that again $\eta$ times the first term in (5.10) is at most the first term in (5.9), because $\eta \leq \delta/N$ and $p_{i_t}^t \geq p_i^t$ for any $i$. Therefore, by subtracting (5.9) from $\eta$ times (5.10), we obtain the same bound for $r_t$ as in Case 1. Furthermore, following a similar argument as in Case 1, one can show that

$B \leq K + \frac{K}{2}|I_\leq| + K|I_>| + (K - 1)$  and  $N = 1 + |I_\leq| + |I_>| + |I_?|$,

which together give $|I_?| - |I_>| \leq 2D/K$. □

6 Lower Bounds
In this section, we provide regret lower bounds which almost match the upper bounds achieved by our algorithm. The result of this section is the following; for the simplicity of our presentation, we assume here that $K$ is even.

Theorem 6.1. Suppose $\varepsilon$ is any constant in $(0,1)$ and $c$ is a large enough constant. Then any algorithm $A$ for the general online learning problem must have

$R_A^T \geq \begin{cases} \Omega(\sqrt{T \ln N}) & \text{if } B \leq (N - \varepsilon N)K/2, \\ \Omega(\sqrt{(T \ln N)/N}) & \text{if } B \leq (N - c)K/2, \end{cases}$

for a large enough $N$.

Here we do not attempt to prove a matching lower bound for the case which has the $(N \ln N)/\delta$ regret upper bound in Theorem 4.1, since we consider that bound to be extremely small: it can be seen as a constant in terms of $T$. Our proof of Theorem 6.1 basically follows the approach of [9, 3] for proving lower bounds on approximately solving a game. A key tool used there is a lower bound on the tail of the binomial distribution, while for our proof, we need the following bound for more general distributions.

Lemma 6.1. Suppose that $\mu, \delta_1, \delta_2, \ldots, \delta_n$ are constants in $(0,1)$, and $X_1, X_2, \ldots, X_n$ are independent random variables such that for each $i \in [n]$, $\Pr[X_i = \mu - \delta_i] = \Pr[X_i = \mu + \delta_i] = 1/2$. Then for any $\lambda \geq c/\sqrt{n}$ for a large enough constant $c$, we have

$\Pr\Big[ \sum_i X_i \leq (1 - \lambda)\mu n \Big] \geq e^{-O(\lambda^2 n)}$.

We will prove the lemma in Subsection 6.1; for now, let us proceed to prove Theorem 6.1.

Proof. (of Theorem 6.1) Consider any algorithm $A$. We would like to show the existence of a sequence of $T$ loss vectors from which $A$ suffers a large regret. We prove its existence by the probabilistic method: we generate the $T$ loss vectors in some probabilistic way. Let us view the $T$ loss vectors as an $N \times T$ matrix, in which the entry in row $i \in [N]$ and column $t \in [T]$ is the loss value of action $i$ in round $t$. Here we consider $2^K$ possible loss values in the range from $0$ to $1 - 2^{-K}$, with the natural $K$-bit binary representation. We would like each entry to be independently distributed and have the same expected value $\mu$, for some constant $\mu$. This means that the expected loss of $A$ in each round is exactly $\mu$, and the loss vectors are independent of each other. Thus, a Chernoff-Hoeffding bound shows that

(6.11)  $\Pr[L_A^T \leq \mu T - v] \leq e^{-\Omega(v^2/T)}$,

for any $v > 0$.

To make $A$ spend as many queries as possible on an entry without figuring out its relationship with $\mu$, we choose $\mu$ to have the binary representation $(01)^{K/2}$, which has alternating zeros and ones. Then in each round, we answer queries and sample entries in the following way. For each query to some bit of an entry, we answer it with the corresponding bit of $\mu$. After answering all the queries, some bits of the loss vector have been fixed, and some remain free. For any entry with two adjacent bits which have not been fixed (the corresponding two bits of $\mu$ must have different values), we make the entry uncertain for $A$ as follows: set those two bits of the entry to 00 or 10 with equal probability if they are 01 in $\mu$, and set them to 01 or 11 with equal probability if they are 10 in $\mu$. All the other bits are then fixed to those of $\mu$. In this way, each entry indeed has expected value $\mu$ and is independent of the others (although some have the fixed value $\mu$), and we can see each uncertain entry as a random variable satisfying the condition in Lemma 6.1.
For the clarity of our presentation, we assume here that $\mu$ and each $\delta_i$ are constants, but it is not hard to derive the dependence on them in our bounds. Next, we analyze the regret by considering two cases, depending on the range of $B$.
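As an illustration of this construction (ours, not the paper's), the following sketch samples one entry after the round's queries have been answered with the bits of $\mu = (01)^{K/2}$; the hypothetical argument `queried_positions` is the set of bit positions of this entry that $A$ queried.

```python
import random

def sample_uncertain_entry(K, queried_positions):
    """Sketch of the adversary's sampling rule in the proof of Theorem 6.1."""
    mu_bits = [q % 2 for q in range(K)]               # (01)^{K/2}, MSB first
    free = [q for q in range(K) if q not in queried_positions]
    pair = next((q for q in free if q + 1 in free), None)  # adjacent free pair?
    bits = list(mu_bits)
    if pair is not None:
        # the two bits of mu at (pair, pair+1) are 01 or 10; flip them symmetrically
        if bits[pair] == 0:                           # "01" -> "00" or "10"
            bits[pair:pair + 2] = random.choice([[0, 0], [1, 0]])
        else:                                         # "10" -> "01" or "11"
            bits[pair:pair + 2] = random.choice([[0, 1], [1, 1]])
    return sum(b * 2.0 ** (-(q + 1)) for q, b in enumerate(bits))
```

Each entry comes out either as exactly $\mu$, or as $\mu \pm 2^{-(q+2)}$ with equal probability for some position $q$, which is precisely the two-point distribution required by Lemma 6.1.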

Case 1: $B \leq (N - \varepsilon N)K/2$ for some constant $\varepsilon \in (0,1)$. Since it takes at least $K/2$ queries to an entry to avoid its uncertainty (otherwise, the queries must miss some pair of adjacent bits), there must be at least $\varepsilon N$ actions whose corresponding entries in the loss vector are left uncertain after the budget runs out in each round. Thus, the total number of uncertain entries in the matrix is at least $\varepsilon NT$, which implies the existence of a collection $S$ of $\varepsilon N/2$ actions (rows), each of which has uncertain entries in $\varepsilon T/2$ rounds (columns). This is because otherwise the total number of uncertain entries in the matrix would be less than $(\varepsilon N/2)T + N(\varepsilon T/2) = \varepsilon NT$, a contradiction.

Now consider any action in $S$ and the $n = \varepsilon T/2$ rounds in which it has uncertain entries (fixing any additional uncertain entries to $\mu$). By applying Lemma 6.1 to those rounds, one can show that the accumulated loss of that action in those rounds is at most $(1-\lambda)\mu n$ with probability at least $e^{-O(\lambda^2 n)} \geq e^{-(1/2)\ln N} = 1/\sqrt{N}$, for some $\lambda = \Theta(\sqrt{(\ln N)/n})$, and when this happens, its total loss in $T$ rounds is at most

$(1-\lambda)\mu n + \mu(T - n) = \mu T - \lambda \mu n \leq \mu T - \Omega(\sqrt{T \ln N})$.

Therefore, the probability that some action in $S$ has such a total loss is at least $1 - (1 - 1/\sqrt{N})^{|S|} \geq 1 - e^{-\Omega(\sqrt{N})}$. On the other hand, using the bound in (6.11) with $v = \sqrt{T}$, we have $\Pr[L_A^T \leq \mu T - v] \leq e^{-\Omega(1)}$. As a result, we can conclude that

$R_A^T \geq (\mu T - v) - \big( \mu T - \Omega(\sqrt{T \ln N}) \big) = \Omega(\sqrt{T \ln N}) - \sqrt{T} = \Omega(\sqrt{T \ln N})$,

with probability at least $1 - e^{-\Omega(\sqrt{N})} - e^{-\Omega(1)} > 0$, for a large enough $N$. This implies the existence of a sequence of $T$ loss vectors from which the algorithm $A$ suffers such a large regret.

Case 2: $B \leq (N - c)K/2$ for a large enough constant $c$. Following the same reasoning as in Case 1, one can show that there must be at least $c$ uncertain entries in each round (column), and thus the total number of uncertain entries in the matrix is at least $cT$. We then claim that either there are $r(c-2)/2$ actions (rows), each of which has uncertain entries in $T/e^r$ rounds, for some $r \leq \bar{r} = \ln N - \ln\ln N$, or there are $N/\ln N$ actions, each of which has uncertain entries in $T/N$ rounds. This is because otherwise the total number of uncertain entries in the rows with the most uncertain entries would be less than

$\sum_{r=1}^{\bar{r}} \frac{c-2}{2} \cdot \frac{T}{e^{r-1}} < \frac{c-2}{2}\, T \sum_{r \geq 0} e^{-r} < (c-2)T$,

while the total number of uncertain entries in the remaining rows would be less than $(N/\ln N)(T/e^{\bar{r}}) + N(T/N) = 2T$, so the total number of uncertain entries in the matrix would be less than $(c-2)T + 2T = cT$, a contradiction.

Now let us first consider the subcase in which there are $r(c-2)/2$ actions, each of which has uncertain entries in $n = T/e^r$ rounds, for some $r \leq \bar{r}$. In this subcase, we can choose $\lambda = \Theta(\sqrt{1/n})$ and follow the argument in Case 1 to show that one of these actions has a total loss of at most

$\mu T - \lambda \mu n \leq \mu T - \Omega(\sqrt{T/e^r})$

with probability at least $1 - (1 - e^{-O(1)})^{r(c-2)/2} \geq 1 - e^{-\Omega(cr)}$. On the other hand, using the bound in (6.11) with $v = c_0\sqrt{T/e^r}$ for a small enough constant $c_0$, we have $\Pr[L_A^T \leq \mu T - v] \leq e^{-\Omega(e^{-r})}$, which implies that

$R_A^T \geq (\mu T - v) - \big( \mu T - \Omega(\sqrt{T/e^r}) \big) = \Omega(\sqrt{T/e^r}) - c_0\sqrt{T/e^r} = \Omega(\sqrt{T/e^r}) \geq \Omega(\sqrt{(T \ln N)/N})$,

with probability at least

$1 - e^{-\Omega(cr)} - e^{-\Omega(e^{-r})} \geq 1 - e^{-\Omega(cr)} - \big( 1 - \Omega(e^{-r}) \big) = \Omega(e^{-r}) - e^{-\Omega(cr)} > 0$,

for any $r \in [1, \bar{r}]$ and a large enough constant $c$.

Next, let us consider the subcase in which there are $N/\ln N$ actions, each of which has uncertain entries in $n = T/N$ rounds. In this subcase, we can choose $\lambda = \Theta(\sqrt{(\ln N)/n})$ and follow the argument in Case 1 to show that one of these actions has a total loss of at most

$\mu T - \lambda \mu n \leq \mu T - \Omega(\sqrt{(T \ln N)/N})$

with probability at least $1 - (1 - 1/\sqrt{N})^{N/\ln N} \geq 1 - e^{-\Omega(\sqrt{N}/\ln N)}$. On the other hand, using the bound in (6.11) with $v = c_0\sqrt{(T \ln N)/N}$ for a small enough constant $c_0$, we have $\Pr[L_A^T \leq \mu T - v] \leq e^{-\Omega((\ln N)/N)}$, which implies that

$R_A^T \geq (\mu T - v) - \big( \mu T - \Omega(\sqrt{(T \ln N)/N}) \big) = \Omega(\sqrt{(T \ln N)/N}) - c_0\sqrt{(T \ln N)/N} = \Omega(\sqrt{(T \ln N)/N})$,

with probability at least

$1 - e^{-\Omega(\sqrt{N}/\ln N)} - e^{-\Omega((\ln N)/N)} \geq 1 - e^{-\Omega(\sqrt{N}/\ln N)} - \big( 1 - \Omega((\ln N)/N) \big) = \Omega((\ln N)/N) - e^{-\Omega(\sqrt{N}/\ln N)} > 0$,

for a large enough $N$. From these two subcases, we can conclude the existence of a sequence of $T$ loss vectors such that $R_A^T \geq \Omega(\sqrt{(T \ln N)/N})$. □

6.1 Proof of Lemma 6.1. Let $Y = (Y_1, Y_2, \ldots, Y_n)$ be a sequence of independent random variables with $\Pr[Y_i = 1] = \Pr[Y_i = -1] = 1/2$ for each $i \in [n]$. It is known that for any $\alpha \in (0,1)$,

(6.12)  $\Pr\Big[ \sum_i Y_i \leq -\alpha n \Big] \geq e^{-O(\alpha^2 n)}$,

which can be shown using Stirling's formula. Note that each random variable $X_i$ has the same distribution as $\mu + \delta_i Y_i$, and thus

$\Pr\Big[ \sum_i X_i \leq (1-\lambda)\mu n \Big] = \Pr\Big[ \sum_i (\mu + \delta_i Y_i) \leq (1-\lambda)\mu n \Big] = \Pr\Big[ \sum_i \delta_i Y_i \leq -\lambda\mu n \Big]$.

Let $\bar{\delta} = \sum_i \delta_i / n$ and let $\gamma = \lambda\mu/\bar{\delta}$, so that $\lambda\mu n = \gamma\bar{\delta}n$. Let $A$ denote the event that $\sum_i \delta_i Y_i \leq -\gamma\bar{\delta}n$; our goal now becomes to bound $\Pr[A]$. For this, we consider another, related event, denoted $B$, that $\sum_i Y_i \leq -2\gamma n$, and we know from (6.12) that $\Pr[B] \geq e^{-O(\gamma^2 n)}$. Observe that in the simpler case when all the $\delta_i$'s are the same (and thus equal to $\bar{\delta}$), event $B$ implies event $A$, so that we have $\Pr[A] \geq \Pr[B]$. However, when the $\delta_i$'s are different, event $B$ does not necessarily imply event $A$, so $\Pr[A]$ may not be as large as $\Pr[B]$ in general. Still, we will show that $\Pr[A]$ is in fact almost as large as $\Pr[B]$. One approach is to use the inequality $\Pr[A] \geq \Pr[A \cap B] = \Pr[B] \Pr[A \mid B]$ and show that $\Pr[A \mid B]$ is large. However, it turns out to require some tedious calculation to bound $\Pr[A \mid B]$, so we take a slightly different approach. Let us decompose the event $B$ into several disjoint events in the following way. For any integer $t \leq n$, let $B_t$ be the event that exactly $t$ of the $n$ random variables $Y_1, Y_2, \ldots, Y_n$ have the value $-1$, or equivalently, $\sum_i Y_i = n - 2t$. Since $n - 2t \leq -2\gamma n$ if and only if $t \geq (1/2 + \gamma)n$, we have

$B = \bigcup_{t \geq (1/2+\gamma)n} B_t$  and  $\Pr[B] = \sum_{t \geq (1/2+\gamma)n} \Pr[B_t]$.

Then we use the following bound:

(6.13)  $\Pr[A] \geq \Pr\Big[ A \cap \Big( \bigcup_{t \geq (1/2+\gamma)n} B_t \Big) \Big] = \sum_{t \geq (1/2+\gamma)n} \Pr[A \cap B_t] = \sum_{t \geq (1/2+\gamma)n} \Pr[A \mid B_t]\, \Pr[B_t]$,

so it suffices to show that each $\Pr[A \mid B_t]$ is large. Let us fix any integer $t \geq (1/2+\gamma)n$; we next show that $\Pr[A \mid B_t] \geq 1/2$ by proving that $\Pr[\neg A \mid B_t] \leq 1/2$. Observe that the distribution of $Y = (Y_1, Y_2, \ldots, Y_n)$ conditioned on $B_t$ is the same as that of sampling uniformly from the strings in $\{-1,1\}^n$ with exactly $t$ entries equal to $-1$; let $Z = (Z_1, Z_2, \ldots, Z_n)$ denote such a conditional distribution. Then we have

(6.14)  $\Pr[\neg A \mid B_t] = \Pr\Big[ \sum_i \delta_i Z_i > -\gamma\bar{\delta}n \Big]$,

which we bound using the second moment method. Note that all the random variables $Z_1, Z_2, \ldots, Z_n$ have the same distribution and thus the same expected value, which we denote by $\beta$, with

$\beta = \frac{n-t}{n} - \frac{t}{n} = \frac{n-2t}{n} \leq -2\gamma$.

Furthermore, any two of the random variables are negatively correlated, in the following sense.

Claim 6.1. For any distinct $i, j \in [n]$, $\mathrm{E}[Z_i Z_j] \leq \mathrm{E}[Z_i]\,\mathrm{E}[Z_j]$.

We will prove the claim later. Now observe that the probability in (6.14) equals

$\Pr\Big[ \sum_i \delta_i (Z_i - \beta) > -\gamma\bar{\delta}n - \beta \sum_i \delta_i \Big] \leq \Pr\Big[ \sum_i \delta_i (Z_i - \beta) > \gamma\bar{\delta}n \Big]$,

since $-\gamma\bar{\delta}n - \beta\sum_i \delta_i \geq -\gamma\bar{\delta}n + 2\gamma\bar{\delta}n = \gamma\bar{\delta}n$. The probability above is at most

$\frac{\mathrm{E}\big[ \big( \sum_i \delta_i (Z_i - \beta) \big)^2 \big]}{(\gamma\bar{\delta}n)^2}$

by Markov's inequality, and the numerator equals

$\sum_{i,j \in [n]} \delta_i \delta_j\, \mathrm{E}[(Z_i - \beta)(Z_j - \beta)] = \sum_{i,j \in [n]} \delta_i \delta_j \big( \mathrm{E}[Z_i Z_j] - \beta^2 \big)$.

Note that when $i \neq j$, we have $\mathrm{E}[Z_i Z_j] - \beta^2 = \mathrm{E}[Z_i Z_j] - \mathrm{E}[Z_i]\mathrm{E}[Z_j] \leq 0$ by Claim 6.1, and when $i = j$, we have $\mathrm{E}[Z_i Z_j] - \beta^2 \leq \mathrm{E}[Z_i Z_j] = 1$. Combining all these bounds together, we have

$\Pr[\neg A \mid B_t] \leq \frac{\sum_i \delta_i^2}{(\gamma\bar{\delta}n)^2} \leq \frac{\sum_i \delta_i}{\gamma^2\bar{\delta}^2 n^2} = \frac{1}{\gamma^2\bar{\delta}n}$.

Since we assume that $\lambda \geq c/\sqrt{n}$ for a large enough constant $c$, we have $\gamma^2\bar{\delta}n = \lambda^2\mu^2 n/\bar{\delta} \geq c^2\mu^2/\bar{\delta} \geq 2$, and thus

$\Pr[\neg A \mid B_t] \leq \frac{1}{\gamma^2\bar{\delta}n} \leq \frac{1}{2}$.

Finally, by substituting the above bound into (6.13), we have

$\Pr[A] \geq \sum_{t \geq (1/2+\gamma)n} \frac{1}{2} \Pr[B_t] = \frac{1}{2} \Pr[B]$,

and then by applying (6.12) to bound $\Pr[B]$, we obtain $\Pr[A] \geq \frac{1}{2} e^{-O(\gamma^2 n)} = e^{-O(\lambda^2 n)}$, as $\gamma = \lambda\mu/\bar{\delta} = \Theta(\lambda)$. Thus, to finish the proof of Lemma 6.1, it remains to prove Claim 6.1, which we do next.

Proof. (of Claim 6.1) Fix any distinct $i, j \in [n]$. Note that we have

$\Pr[Z_j = 1 \mid Z_i = -1] = \frac{n-t}{n-1} \geq \frac{n-t}{n} = \Pr[Z_j = 1]$,

which implies that

$\mathrm{E}[Z_j \mid Z_i = -1] = 2\Pr[Z_j = 1 \mid Z_i = -1] - 1 \geq 2\Pr[Z_j = 1] - 1 = \mathrm{E}[Z_j]$,

and we also have

$\Pr[Z_j = 1 \mid Z_i = 1] = \frac{n-t-1}{n-1} \leq \frac{n-t}{n} = \Pr[Z_j = 1]$,

which implies that

$\mathrm{E}[Z_j \mid Z_i = 1] = 2\Pr[Z_j = 1 \mid Z_i = 1] - 1 \leq 2\Pr[Z_j = 1] - 1 = \mathrm{E}[Z_j]$.

As a result, we have

$\mathrm{E}[Z_i Z_j] = \Pr[Z_i = 1]\,\mathrm{E}[Z_j \mid Z_i = 1] - \Pr[Z_i = -1]\,\mathrm{E}[Z_j \mid Z_i = -1] \leq \Pr[Z_i = 1]\,\mathrm{E}[Z_j] - \Pr[Z_i = -1]\,\mathrm{E}[Z_j] = \mathrm{E}[Z_i]\,\mathrm{E}[Z_j]$. □

References
[1] J. Abernethy, P. Bartlett, and A. Rakhlin. Multitask learning with expert advice. In Proceedings of the 20th Annual Conference on Learning Theory (COLT), 2007.
[2] D. Angluin, J. Aspnes, J. Chen, and L. Reyzin. Learning large-alphabet and analog circuits with value injection queries. In Proceedings of the 20th Annual Conference on Learning Theory (COLT), 2007.
[3] S. Arora, E. Hazan, and S. Kale. The multiplicative weights update method: a meta algorithm and applications. Manuscript.
[4] S. Ben-David, D. Pal, and S. Shalev-Shwartz. Agnostic online learning. In Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
[5] A. Blum and Y. Mansour. Learning, regret minimization, and equilibria. In Algorithmic Game Theory, Cambridge University Press, New York, 2007.
[6] N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press, New York, 2006.
[7] E. Even-Dar, M. Kearns, Y. Mansour, and J. Wortman. Regret to the best vs. regret to the average. In Proceedings of the 20th Annual Conference on Learning Theory (COLT), 2007.
[8] E. Even-Dar, R. Kleinberg, S. Mannor, and Y. Mansour. Online learning for global cost functions. In Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
[9] Y. Freund and R. Schapire. Adaptive game playing using multiplicative weights. Games and Economic Behavior, 29, 1999.
[10] S. Guha and K. Munagala. Approximation algorithms for budgeted learning problems. In Proceedings of the 39th Annual ACM Symposium on Theory of Computing (STOC), 2007.
[11] A. György, G. Lugosi, and G. Ottucsák. On-line sequential bin packing. In Proceedings of the 21st Annual Conference on Learning Theory (COLT), 2008.
[12] D. Haussler, J. Kivinen, and M. K. Warmuth. Sequential prediction of individual sequences under general loss functions. IEEE Transactions on Information Theory, 44(5), 1998.
[13] E. Hazan, A. Kalai, S. Kale, and A. Agarwal. Logarithmic regret algorithms for online convex optimization. In Proceedings of the 19th Annual Conference on Learning Theory (COLT), 2006.
[14] E. Hazan and S. Kale. Extracting certainty from uncertainty: regret bounded by variation in costs. In Proceedings of the 21st Annual Conference on Learning Theory (COLT), 2008.
[15] E. Hazan and N. Megiddo. Online learning with prior knowledge. In Proceedings of the 20th Annual Conference on Learning Theory (COLT), 2007.
[16] E. Hazan and C. Seshadhri. Adaptive algorithms for online decision problems. Electronic Colloquium on Computational Complexity (ECCC), TR07-088, 2007.
[17] G. Lugosi, O. Papaspiliopoulos, and G. Stoltz. Online multi-task learning with hard constraints. In Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
[18] M. K. Warmuth and D. Kuzmin. Online variance minimization. In Proceedings of the 19th Annual Conference on Learning Theory (COLT), 2006.
[19] M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the Twentieth International Conference on Machine Learning (ICML), 2003.

A Proof of Lemma 2.1
Recall the update rule from (2.2), that for any $t \in [T]$ and $i \in [N]$, $w_i^{t+1} = w_i^t e^{-\eta \ell_i^t}$, and recall that $W^t = \sum_i w_i^t$. Following a standard analysis of the weighted average algorithm (see e.g. [5, 6]), we have

$\ln \frac{W^{T+1}}{W^1} \geq \ln \frac{e^{-\eta \sum_{t \in [T]} \ell_i^t}}{N}$ for any $i \in [N]$, so that in particular

$\ln \frac{W^{T+1}}{W^1} \geq \ln \frac{e^{-\eta L_{\min}^T}}{N} = -\eta L_{\min}^T - \ln N$,

and moreover

$\ln \frac{W^{T+1}}{W^1} = \sum_{t=1}^{T} \ln \frac{W^{t+1}}{W^t} = \sum_{t=1}^{T} \ln \sum_i \frac{w_i^t e^{-\eta \ell_i^t}}{W^t} = \sum_{t=1}^{T} \ln \sum_i p_i^t e^{-\eta \ell_i^t}$.

To get the specific bound of the lemma, we rely on the following claim.

Claim A.1. Suppose $\eta \in [0, 1/2]$, $p_i, \ell_i \in [0,1]$ for every $i \in [N]$, and $\sum_i p_i = 1$. Then for any $i^* \in [N]$,

$\ln \sum_i p_i e^{-\eta \ell_i} \leq -\eta \sum_i p_i \ell_i + \eta^2 \sum_{i: \ell_i \neq \ell_{i^*}} p_i$.

We will prove the claim later. Assuming it for now and combining it (applied in each round $t$ with $i^* = i_t$) with the bounds above, we have

$-\eta L_{\min}^T - \ln N \leq \ln \frac{W^{T+1}}{W^1} \leq -\eta \sum_{t=1}^{T} \sum_i p_i^t \ell_i^t + \eta^2 \sum_{t=1}^{T} \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t = -\eta L_{A_0}^T + \eta^2 \sum_{t=1}^{T} \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t$,

which implies that

$L_{A_0}^T - L_{\min}^T \leq \frac{\ln N}{\eta} + \eta \sum_{t=1}^{T} \sum_{i: \ell_i^t \neq \ell_{i_t}^t} p_i^t$.

Thus, to complete the proof of Lemma 2.1, it remains to prove Claim A.1, which we do next.

A.1 Proof of Claim A.1. Consider the function $f$ on $x = (x_1, \ldots, x_N) \in [0,1]^N$ defined by

$f(x) = \ln \sum_i p_i e^{-\eta x_i}$.

Our goal is then to bound the value of $f$ at the point $\ell = (\ell_1, \ell_2, \ldots, \ell_N)$. Using Taylor's theorem, by expanding $f$ at the point $\ell^* = (\ell_{i^*}, \ell_{i^*}, \ldots, \ell_{i^*})$, we have

(A.1)  $f(\ell) = f(\ell^*) + \sum_i \frac{\partial f(\ell^*)}{\partial x_i} (\ell_i - \ell_{i^*})$

(A.2)  $\qquad\quad + \frac{1}{2} \sum_{i,j \in [N]} \frac{\partial^2 f(v)}{\partial x_i \partial x_j} (\ell_i - \ell_{i^*})(\ell_j - \ell_{i^*})$,

for some $v \in [0,1]^N$. Since $\sum_i p_i = 1$, we have $f(\ell^*) = \ln \sum_i p_i e^{-\eta \ell_{i^*}} = -\eta \ell_{i^*}$, and it remains to bound the two terms in (A.1) and (A.2). Let $h(x) = \sum_i p_i e^{-\eta x_i}$, so that $f(x) = \ln h(x)$, and let

$g_i(x) = -\frac{\partial h(x)}{\partial x_i} = \eta p_i e^{-\eta x_i}$, for $i \in [N]$.

Then it is not hard to show that

$\frac{\partial f(x)}{\partial x_i} = -\frac{g_i(x)}{h(x)}$

and

$\frac{\partial^2 f(x)}{\partial x_i \partial x_j} = \begin{cases} \eta\, \frac{g_i(x)}{h(x)} - \Big( \frac{g_i(x)}{h(x)} \Big)^2 & \text{if } i = j, \\[4pt] -\frac{g_i(x) g_j(x)}{h^2(x)} & \text{if } i \neq j. \end{cases}$

Using this, the term in (A.1) can be written as

$\sum_i \Big( -\frac{g_i(\ell^*)}{h(\ell^*)} \Big) (\ell_i - \ell_{i^*}) = \sum_i (-\eta p_i)(\ell_i - \ell_{i^*}) = \eta \ell_{i^*} - \eta \sum_i p_i \ell_i$,

while the term in (A.2) can be written as

$\frac{1}{2} \sum_i \Big( \eta\, \frac{g_i(v)}{h(v)} - \Big( \frac{g_i(v)}{h(v)} \Big)^2 \Big) (\ell_i - \ell_{i^*})^2 - \sum_{1 \leq i < j \leq N} \frac{g_i(v) g_j(v)}{h^2(v)} (\ell_i - \ell_{i^*})(\ell_j - \ell_{i^*})$

$= \frac{1}{2} \sum_i \eta\, \frac{g_i(v)}{h(v)} (\ell_i - \ell_{i^*})^2 - \frac{1}{2} \Big( \sum_i \frac{g_i(v)}{h(v)} (\ell_i - \ell_{i^*}) \Big)^2$

$\leq \frac{1}{2} \sum_i \eta\, \frac{g_i(v)}{h(v)} (\ell_i - \ell_{i^*})^2 \leq \eta^2 \sum_i p_i (\ell_i - \ell_{i^*})^2$,

where the last line follows from the fact that with $\eta \in [0, 1/2]$,

$\frac{g_i(v)}{h(v)} = \frac{\eta p_i e^{-\eta v_i}}{\sum_j p_j e^{-\eta v_j}} \leq \frac{\eta p_i e^0}{e^{-1/2}} \leq 2\eta p_i$.

Finally, by combining all these bounds together, we have

$f(\ell) \leq -\eta \ell_{i^*} + \eta \ell_{i^*} - \eta \sum_i p_i \ell_i + \eta^2 \sum_i p_i (\ell_i - \ell_{i^*})^2 \leq -\eta \sum_i p_i \ell_i + \eta^2 \sum_{i: \ell_i \neq \ell_{i^*}} p_i$,

by using the fact that $(\ell_i - \ell_{i^*})^2 \leq 1$ when $\ell_i \neq \ell_{i^*}$ (and $= 0$ otherwise). This proves Claim A.1. □
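To close, here is a small numerical sanity check of Claim A.1 (our addition, not part of the paper): it draws random instances and verifies that the left-hand side never exceeds the right-hand side.

```python
import math
import random

def claim_a1_gap(N=8, eta=0.3, trials=10000):
    """Return the largest observed value of lhs - rhs over random instances of
    Claim A.1; the claim predicts this is never positive for eta in [0, 1/2]."""
    worst = float("-inf")
    for _ in range(trials):
        raw = [random.random() for _ in range(N)]
        p = [x / sum(raw) for x in raw]                 # random distribution
        l = [random.choice([0.0, 0.25, 0.5, 0.75, 1.0]) for _ in range(N)]
        i_star = random.randrange(N)                    # random reference index
        lhs = math.log(sum(pi * math.exp(-eta * li) for pi, li in zip(p, l)))
        rhs = -eta * sum(pi * li for pi, li in zip(p, l)) \
              + eta ** 2 * sum(pi for pi, li in zip(p, l) if li != l[i_star])
        worst = max(worst, lhs - rhs)
    return worst

print(claim_a1_gap())  # expected: a non-positive number
```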


More information

Predator - Prey Model Trajectories and the nonlinear conservation law

Predator - Prey Model Trajectories and the nonlinear conservation law Predaor - Prey Model Trajecories and he nonlinear conservaion law James K. Peerson Deparmen of Biological Sciences and Deparmen of Mahemaical Sciences Clemson Universiy Ocober 28, 213 Ouline Drawing Trajecories

More information

Longest Common Prefixes

Longest Common Prefixes Longes Common Prefixes The sandard ordering for srings is he lexicographical order. I is induced by an order over he alphabe. We will use he same symbols (,

More information

Optimality Conditions for Unconstrained Problems

Optimality Conditions for Unconstrained Problems 62 CHAPTER 6 Opimaliy Condiions for Unconsrained Problems 1 Unconsrained Opimizaion 11 Exisence Consider he problem of minimizing he funcion f : R n R where f is coninuous on all of R n : P min f(x) x

More information

CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK

CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK 175 CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK 10.1 INTRODUCTION Amongs he research work performed, he bes resuls of experimenal work are validaed wih Arificial Neural Nework. From he

More information

Let us start with a two dimensional case. We consider a vector ( x,

Let us start with a two dimensional case. We consider a vector ( x, Roaion marices We consider now roaion marices in wo and hree dimensions. We sar wih wo dimensions since wo dimensions are easier han hree o undersand, and one dimension is a lile oo simple. However, our

More information

BU Macro BU Macro Fall 2008, Lecture 4

BU Macro BU Macro Fall 2008, Lecture 4 Dynamic Programming BU Macro 2008 Lecure 4 1 Ouline 1. Cerainy opimizaion problem used o illusrae: a. Resricions on exogenous variables b. Value funcion c. Policy funcion d. The Bellman equaion and an

More information

Lecture 20: Riccati Equations and Least Squares Feedback Control

Lecture 20: Riccati Equations and Least Squares Feedback Control 34-5 LINEAR SYSTEMS Lecure : Riccai Equaions and Leas Squares Feedback Conrol 5.6.4 Sae Feedback via Riccai Equaions A recursive approach in generaing he marix-valued funcion W ( ) equaion for i for he

More information

GMM - Generalized Method of Moments

GMM - Generalized Method of Moments GMM - Generalized Mehod of Momens Conens GMM esimaion, shor inroducion 2 GMM inuiion: Maching momens 2 3 General overview of GMM esimaion. 3 3. Weighing marix...........................................

More information

Math 2142 Exam 1 Review Problems. x 2 + f (0) 3! for the 3rd Taylor polynomial at x = 0. To calculate the various quantities:

Math 2142 Exam 1 Review Problems. x 2 + f (0) 3! for the 3rd Taylor polynomial at x = 0. To calculate the various quantities: Mah 4 Eam Review Problems Problem. Calculae he 3rd Taylor polynomial for arcsin a =. Soluion. Le f() = arcsin. For his problem, we use he formula f() + f () + f ()! + f () 3! for he 3rd Taylor polynomial

More information

Linear Response Theory: The connection between QFT and experiments

Linear Response Theory: The connection between QFT and experiments Phys540.nb 39 3 Linear Response Theory: The connecion beween QFT and experimens 3.1. Basic conceps and ideas Q: How do we measure he conduciviy of a meal? A: we firs inroduce a weak elecric field E, and

More information

Essential Microeconomics : OPTIMAL CONTROL 1. Consider the following class of optimization problems

Essential Microeconomics : OPTIMAL CONTROL 1. Consider the following class of optimization problems Essenial Microeconomics -- 6.5: OPIMAL CONROL Consider he following class of opimizaion problems Max{ U( k, x) + U+ ( k+ ) k+ k F( k, x)}. { x, k+ } = In he language of conrol heory, he vecor k is he vecor

More information

Games Against Nature

Games Against Nature Advanced Course in Machine Learning Spring 2010 Games Agains Naure Handous are joinly prepared by Shie Mannor and Shai Shalev-Shwarz In he previous lecures we alked abou expers in differen seups and analyzed

More information

STA 114: Statistics. Notes 2. Statistical Models and the Likelihood Function

STA 114: Statistics. Notes 2. Statistical Models and the Likelihood Function STA 114: Saisics Noes 2. Saisical Models and he Likelihood Funcion Describing Daa & Saisical Models A physicis has a heory ha makes a precise predicion of wha s o be observed in daa. If he daa doesn mach

More information

SZG Macro 2011 Lecture 3: Dynamic Programming. SZG macro 2011 lecture 3 1

SZG Macro 2011 Lecture 3: Dynamic Programming. SZG macro 2011 lecture 3 1 SZG Macro 2011 Lecure 3: Dynamic Programming SZG macro 2011 lecure 3 1 Background Our previous discussion of opimal consumpion over ime and of opimal capial accumulaion sugges sudying he general decision

More information

Some Ramsey results for the n-cube

Some Ramsey results for the n-cube Some Ramsey resuls for he n-cube Ron Graham Universiy of California, San Diego Jozsef Solymosi Universiy of Briish Columbia, Vancouver, Canada Absrac In his noe we esablish a Ramsey-ype resul for cerain

More information

MATH 4330/5330, Fourier Analysis Section 6, Proof of Fourier s Theorem for Pointwise Convergence

MATH 4330/5330, Fourier Analysis Section 6, Proof of Fourier s Theorem for Pointwise Convergence MATH 433/533, Fourier Analysis Secion 6, Proof of Fourier s Theorem for Poinwise Convergence Firs, some commens abou inegraing periodic funcions. If g is a periodic funcion, g(x + ) g(x) for all real x,

More information

STATE-SPACE MODELLING. A mass balance across the tank gives:

STATE-SPACE MODELLING. A mass balance across the tank gives: B. Lennox and N.F. Thornhill, 9, Sae Space Modelling, IChemE Process Managemen and Conrol Subjec Group Newsleer STE-SPACE MODELLING Inroducion: Over he pas decade or so here has been an ever increasing

More information

Econ107 Applied Econometrics Topic 7: Multicollinearity (Studenmund, Chapter 8)

Econ107 Applied Econometrics Topic 7: Multicollinearity (Studenmund, Chapter 8) I. Definiions and Problems A. Perfec Mulicollineariy Econ7 Applied Economerics Topic 7: Mulicollineariy (Sudenmund, Chaper 8) Definiion: Perfec mulicollineariy exiss in a following K-variable regression

More information

Diebold, Chapter 7. Francis X. Diebold, Elements of Forecasting, 4th Edition (Mason, Ohio: Cengage Learning, 2006). Chapter 7. Characterizing Cycles

Diebold, Chapter 7. Francis X. Diebold, Elements of Forecasting, 4th Edition (Mason, Ohio: Cengage Learning, 2006). Chapter 7. Characterizing Cycles Diebold, Chaper 7 Francis X. Diebold, Elemens of Forecasing, 4h Ediion (Mason, Ohio: Cengage Learning, 006). Chaper 7. Characerizing Cycles Afer compleing his reading you should be able o: Define covariance

More information

A Local Regret in Nonconvex Online Learning

A Local Regret in Nonconvex Online Learning Sergul Aydore Lee Dicker Dean Foser Absrac We consider an online learning process o forecas a sequence of oucomes for nonconvex models. A ypical measure o evaluae online learning policies is regre bu such

More information

Math 333 Problem Set #2 Solution 14 February 2003

Math 333 Problem Set #2 Solution 14 February 2003 Mah 333 Problem Se #2 Soluion 14 February 2003 A1. Solve he iniial value problem dy dx = x2 + e 3x ; 2y 4 y(0) = 1. Soluion: This is separable; we wrie 2y 4 dy = x 2 + e x dx and inegrae o ge The iniial

More information

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé Bias in Condiional and Uncondiional Fixed Effecs Logi Esimaion: a Correcion * Tom Coupé Economics Educaion and Research Consorium, Naional Universiy of Kyiv Mohyla Academy Address: Vul Voloska 10, 04070

More information

Ensamble methods: Bagging and Boosting

Ensamble methods: Bagging and Boosting Lecure 21 Ensamble mehods: Bagging and Boosing Milos Hauskrech milos@cs.pi.edu 5329 Senno Square Ensemble mehods Mixure of expers Muliple base models (classifiers, regressors), each covers a differen par

More information

3.1.3 INTRODUCTION TO DYNAMIC OPTIMIZATION: DISCRETE TIME PROBLEMS. A. The Hamiltonian and First-Order Conditions in a Finite Time Horizon

3.1.3 INTRODUCTION TO DYNAMIC OPTIMIZATION: DISCRETE TIME PROBLEMS. A. The Hamiltonian and First-Order Conditions in a Finite Time Horizon 3..3 INRODUCION O DYNAMIC OPIMIZAION: DISCREE IME PROBLEMS A. he Hamilonian and Firs-Order Condiions in a Finie ime Horizon Define a new funcion, he Hamilonian funcion, H. H he change in he oal value of

More information

Some Basic Information about M-S-D Systems

Some Basic Information about M-S-D Systems Some Basic Informaion abou M-S-D Sysems 1 Inroducion We wan o give some summary of he facs concerning unforced (homogeneous) and forced (non-homogeneous) models for linear oscillaors governed by second-order,

More information

5.1 - Logarithms and Their Properties

5.1 - Logarithms and Their Properties Chaper 5 Logarihmic Funcions 5.1 - Logarihms and Their Properies Suppose ha a populaion grows according o he formula P 10, where P is he colony size a ime, in hours. When will he populaion be 2500? We

More information

MODULE 3 FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION LECTURES PROBABILITY DISTRIBUTION OF A FUNCTION OF A RANDOM VARIABLE

MODULE 3 FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION LECTURES PROBABILITY DISTRIBUTION OF A FUNCTION OF A RANDOM VARIABLE Topics MODULE 3 FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION LECTURES 2-6 3. FUNCTION OF A RANDOM VARIABLE 3.2 PROBABILITY DISTRIBUTION OF A FUNCTION OF A RANDOM VARIABLE 3.3 EXPECTATION AND MOMENTS

More information

Article from. Predictive Analytics and Futurism. July 2016 Issue 13

Article from. Predictive Analytics and Futurism. July 2016 Issue 13 Aricle from Predicive Analyics and Fuurism July 6 Issue An Inroducion o Incremenal Learning By Qiang Wu and Dave Snell Machine learning provides useful ools for predicive analyics The ypical machine learning

More information

Solutions for Assignment 2

Solutions for Assignment 2 Faculy of rs and Science Universiy of Torono CSC 358 - Inroducion o Compuer Neworks, Winer 218 Soluions for ssignmen 2 Quesion 1 (2 Poins): Go-ack n RQ In his quesion, we review how Go-ack n RQ can be

More information

Comments on Window-Constrained Scheduling

Comments on Window-Constrained Scheduling Commens on Window-Consrained Scheduling Richard Wes Member, IEEE and Yuing Zhang Absrac This shor repor clarifies he behavior of DWCS wih respec o Theorem 3 in our previously published paper [1], and describes

More information

Biol. 356 Lab 8. Mortality, Recruitment, and Migration Rates

Biol. 356 Lab 8. Mortality, Recruitment, and Migration Rates Biol. 356 Lab 8. Moraliy, Recruimen, and Migraion Raes (modified from Cox, 00, General Ecology Lab Manual, McGraw Hill) Las week we esimaed populaion size hrough several mehods. One assumpion of all hese

More information

INTRODUCTION TO MACHINE LEARNING 3RD EDITION

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN The MIT Press, 2014 Lecure Slides for INTRODUCTION TO MACHINE LEARNING 3RD EDITION alpaydin@boun.edu.r hp://www.cmpe.boun.edu.r/~ehem/i2ml3e CHAPTER 2: SUPERVISED LEARNING Learning a Class

More information

Online Learning Applications

Online Learning Applications Online Learning Applicaions Sepember 19, 2016 In he las lecure we saw he following guaranee for minimizing misakes wih Randomized Weighed Majoriy (RWM). Theorem 1 Le M be misakes of RWM and M i he misakes

More information

On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems

On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems MATHEMATICS OF OPERATIONS RESEARCH Vol. 38, No. 2, May 2013, pp. 209 227 ISSN 0364-765X (prin) ISSN 1526-5471 (online) hp://dx.doi.org/10.1287/moor.1120.0562 2013 INFORMS On Boundedness of Q-Learning Ieraes

More information

References are appeared in the last slide. Last update: (1393/08/19)

References are appeared in the last slide. Last update: (1393/08/19) SYSEM IDEIFICAIO Ali Karimpour Associae Professor Ferdowsi Universi of Mashhad References are appeared in he las slide. Las updae: 0..204 393/08/9 Lecure 5 lecure 5 Parameer Esimaion Mehods opics o be

More information

Electrical and current self-induction

Electrical and current self-induction Elecrical and curren self-inducion F. F. Mende hp://fmnauka.narod.ru/works.hml mende_fedor@mail.ru Absrac The aricle considers he self-inducance of reacive elemens. Elecrical self-inducion To he laws of

More information

Seminar 4: Hotelling 2

Seminar 4: Hotelling 2 Seminar 4: Hoelling 2 November 3, 211 1 Exercise Par 1 Iso-elasic demand A non renewable resource of a known sock S can be exraced a zero cos. Demand for he resource is of he form: D(p ) = p ε ε > A a

More information

Explaining Total Factor Productivity. Ulrich Kohli University of Geneva December 2015

Explaining Total Factor Productivity. Ulrich Kohli University of Geneva December 2015 Explaining Toal Facor Produciviy Ulrich Kohli Universiy of Geneva December 2015 Needed: A Theory of Toal Facor Produciviy Edward C. Presco (1998) 2 1. Inroducion Toal Facor Produciviy (TFP) has become

More information

2.7. Some common engineering functions. Introduction. Prerequisites. Learning Outcomes

2.7. Some common engineering functions. Introduction. Prerequisites. Learning Outcomes Some common engineering funcions 2.7 Inroducion This secion provides a caalogue of some common funcions ofen used in Science and Engineering. These include polynomials, raional funcions, he modulus funcion

More information

Learning a Class from Examples. Training set X. Class C 1. Class C of a family car. Output: Input representation: x 1 : price, x 2 : engine power

Learning a Class from Examples. Training set X. Class C 1. Class C of a family car. Output: Input representation: x 1 : price, x 2 : engine power Alpaydin Chaper, Michell Chaper 7 Alpaydin slides are in urquoise. Ehem Alpaydin, copyrigh: The MIT Press, 010. alpaydin@boun.edu.r hp://www.cmpe.boun.edu.r/ ehem/imle All oher slides are based on Michell.

More information

Chapter 7: Solving Trig Equations

Chapter 7: Solving Trig Equations Haberman MTH Secion I: The Trigonomeric Funcions Chaper 7: Solving Trig Equaions Le s sar by solving a couple of equaions ha involve he sine funcion EXAMPLE a: Solve he equaion sin( ) The inverse funcions

More information

11!Hí MATHEMATICS : ERDŐS AND ULAM PROC. N. A. S. of decomposiion, properly speaking) conradics he possibiliy of defining a counably addiive real-valu

11!Hí MATHEMATICS : ERDŐS AND ULAM PROC. N. A. S. of decomposiion, properly speaking) conradics he possibiliy of defining a counably addiive real-valu ON EQUATIONS WITH SETS AS UNKNOWNS BY PAUL ERDŐS AND S. ULAM DEPARTMENT OF MATHEMATICS, UNIVERSITY OF COLORADO, BOULDER Communicaed May 27, 1968 We shall presen here a number of resuls in se heory concerning

More information

Learning a Class from Examples. Training set X. Class C 1. Class C of a family car. Output: Input representation: x 1 : price, x 2 : engine power

Learning a Class from Examples. Training set X. Class C 1. Class C of a family car. Output: Input representation: x 1 : price, x 2 : engine power Alpaydin Chaper, Michell Chaper 7 Alpaydin slides are in urquoise. Ehem Alpaydin, copyrigh: The MIT Press, 010. alpaydin@boun.edu.r hp://www.cmpe.boun.edu.r/ ehem/imle All oher slides are based on Michell.

More information

Innova Junior College H2 Mathematics JC2 Preliminary Examinations Paper 2 Solutions 0 (*)

Innova Junior College H2 Mathematics JC2 Preliminary Examinations Paper 2 Solutions 0 (*) Soluion 3 x 4x3 x 3 x 0 4x3 x 4x3 x 4x3 x 4x3 x x 3x 3 4x3 x Innova Junior College H Mahemaics JC Preliminary Examinaions Paper Soluions 3x 3 4x 3x 0 4x 3 4x 3 0 (*) 0 0 + + + - 3 3 4 3 3 3 3 Hence x or

More information

The General Linear Test in the Ridge Regression

The General Linear Test in the Ridge Regression ommunicaions for Saisical Applicaions Mehods 2014, Vol. 21, No. 4, 297 307 DOI: hp://dx.doi.org/10.5351/sam.2014.21.4.297 Prin ISSN 2287-7843 / Online ISSN 2383-4757 The General Linear Tes in he Ridge

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION DOI: 0.038/NCLIMATE893 Temporal resoluion and DICE * Supplemenal Informaion Alex L. Maren and Sephen C. Newbold Naional Cener for Environmenal Economics, US Environmenal Proecion

More information

dt = C exp (3 ln t 4 ). t 4 W = C exp ( ln(4 t) 3) = C(4 t) 3.

dt = C exp (3 ln t 4 ). t 4 W = C exp ( ln(4 t) 3) = C(4 t) 3. Mah Rahman Exam Review Soluions () Consider he IVP: ( 4)y 3y + 4y = ; y(3) = 0, y (3) =. (a) Please deermine he longes inerval for which he IVP is guaraneed o have a unique soluion. Soluion: The disconinuiies

More information

More Digital Logic. t p output. Low-to-high and high-to-low transitions could have different t p. V in (t)

More Digital Logic. t p output. Low-to-high and high-to-low transitions could have different t p. V in (t) EECS 4 Spring 23 Lecure 2 EECS 4 Spring 23 Lecure 2 More igial Logic Gae delay and signal propagaion Clocked circui elemens (flip-flop) Wriing a word o memory Simplifying digial circuis: Karnaugh maps

More information

Overview. COMP14112: Artificial Intelligence Fundamentals. Lecture 0 Very Brief Overview. Structure of this course

Overview. COMP14112: Artificial Intelligence Fundamentals. Lecture 0 Very Brief Overview. Structure of this course OMP: Arificial Inelligence Fundamenals Lecure 0 Very Brief Overview Lecurer: Email: Xiao-Jun Zeng x.zeng@mancheser.ac.uk Overview This course will focus mainly on probabilisic mehods in AI We shall presen

More information

Families with no matchings of size s

Families with no matchings of size s Families wih no machings of size s Peer Franl Andrey Kupavsii Absrac Le 2, s 2 be posiive inegers. Le be an n-elemen se, n s. Subses of 2 are called families. If F ( ), hen i is called - uniform. Wha is

More information

Ensamble methods: Boosting

Ensamble methods: Boosting Lecure 21 Ensamble mehods: Boosing Milos Hauskrech milos@cs.pi.edu 5329 Senno Square Schedule Final exam: April 18: 1:00-2:15pm, in-class Term projecs April 23 & April 25: a 1:00-2:30pm in CS seminar room

More information

EXERCISES FOR SECTION 1.5

EXERCISES FOR SECTION 1.5 1.5 Exisence and Uniqueness of Soluions 43 20. 1 v c 21. 1 v c 1 2 4 6 8 10 1 2 2 4 6 8 10 Graph of approximae soluion obained using Euler s mehod wih = 0.1. Graph of approximae soluion obained using Euler

More information

Online Learning, Regret Minimization, Minimax Optimality, and Correlated Equilibrium

Online Learning, Regret Minimization, Minimax Optimality, and Correlated Equilibrium Algorihm Online Learning, Regre Minimizaion, Minimax Opimaliy, and Correlaed Equilibrium High level Las ime we discussed noion of Nash equilibrium Saic concep: se of prob Disribuions (p,q, ) such ha nobody

More information

SMT 2014 Calculus Test Solutions February 15, 2014 = 3 5 = 15.

SMT 2014 Calculus Test Solutions February 15, 2014 = 3 5 = 15. SMT Calculus Tes Soluions February 5,. Le f() = and le g() =. Compue f ()g (). Answer: 5 Soluion: We noe ha f () = and g () = 6. Then f ()g () =. Plugging in = we ge f ()g () = 6 = 3 5 = 5.. There is a

More information