Regret in Online Combinatorial Optimization


Jean-Yves Audibert
Imagine, Université Paris Est, and Sierra, CNRS/ENS/INRIA
audibert@imagine.enpc.fr

Sébastien Bubeck
Department of Operations Research and Financial Engineering, Princeton University
sbubeck@princeton.edu

Gábor Lugosi
ICREA and Pompeu Fabra University
gabor.lugosi@upf.edu

March 29, 2013

Abstract

We address online linear optimization problems when the possible actions of the decision maker are represented by binary vectors. The regret of the decision maker is the difference between her realized loss and the minimal loss she would have achieved by picking, in hindsight, the best possible action. Our goal is to understand the magnitude of the best possible (minimax) regret. We study the problem under three different assumptions for the feedback the decision maker receives: full information, and the partial information models of the so-called semi-bandit and bandit problems. In the full information case we show that the standard exponentially weighted average forecaster is a provably suboptimal strategy. For the semi-bandit model, by combining the Mirror Descent algorithm and the INF (Implicitly Normalized Forecaster) strategy, we are able to prove the first optimal bounds. Finally, in the bandit case we discuss existing results in light of a new lower bound, and suggest a conjecture on the optimal regret in that case.

1 Introduction.

In this paper we consider the framework of online linear optimization. The setup may be described as a repeated game between a decision maker (or simply player, or forecaster) and an adversary

as follows: at each time instance $t = 1, \dots, n$, the player chooses, possibly in a randomized way, an action from a given finite action set $A \subseteq \mathbb{R}^d$. The action chosen by the player at time $t$ is denoted by $a_t \in A$. Simultaneously to the player, the adversary chooses a loss vector $z_t \in \mathcal{Z} \subseteq \mathbb{R}^d$, and the loss incurred by the forecaster is $a_t^T z_t$. The goal of the player is to minimize the expected cumulative loss $\mathbb{E} \sum_{t=1}^n a_t^T z_t$, where the expectation is taken with respect to the player's internal randomization and, possibly, the adversary's randomization. In the basic full-information version of this problem, the player observes the adversary's move $z_t$ at the end of round $t$. Another important model for feedback is the so-called bandit problem, in which the player only observes the incurred loss $a_t^T z_t$. As a measure of performance we define the regret$^1$ of the player as
$$R_n = \mathbb{E} \sum_{t=1}^n a_t^T z_t - \min_{a \in A} \mathbb{E} \sum_{t=1}^n a^T z_t.$$

In this paper we address a specific example of online linear optimization: we assume that the action set $A$ is a subset of the $d$-dimensional hypercube $\{0,1\}^d$ such that $\|a\|_1 = m$ for all $a \in A$, and the adversary has a bounded loss per coordinate, that is,$^2$ $\mathcal{Z} = [0,1]^d$. We call this setting online combinatorial optimization. As we will see below, this restriction of the general framework contains a rich class of problems; indeed, in many interesting cases, actions are naturally represented by Boolean vectors. In addition to the full information and bandit versions of online combinatorial optimization, we also consider another type of feedback which makes sense only in this combinatorial setting. In the semi-bandit version, we assume that the player observes only the coordinates of $z_t$ that were played in $a_t$, that is, the player observes the vector $(a_t(1) z_t(1), \dots, a_t(d) z_t(d))$. All three variants of online combinatorial optimization are sketched in Figure 1.

Figure 1: Online combinatorial optimization.
  Parameters: set of actions $A \subseteq \{0,1\}^d$; number of rounds $n \in \mathbb{N}$.
  For each round $t = 1, 2, \dots, n$:
  (1) the player chooses a probability distribution $p_t$ over $A$ and draws a random action $a_t \in A$ according to $p_t$;
  (2) simultaneously, the adversary selects a loss vector $z_t \in [0,1]^d$ (without revealing it);
  (3) the player incurs the loss $a_t^T z_t$. She observes
      - the loss vector $z_t$ in the full information setting,
      - the coordinates $z_t(i)$ with $a_t(i) = 1$ in the semi-bandit setting,
      - the instantaneous loss $a_t^T z_t$ in the bandit setting.
  Goal: the player tries to minimize her cumulative loss $\sum_{t=1}^n a_t^T z_t$.

More rigorously, online combinatorial optimization is defined as a repeated game between a player and an adversary. At each round $t = 1, \dots, n$ of the game, the player chooses a probability distribution $p_t$ over the set of actions $A \subseteq \{0,1\}^d$ and draws a random action $a_t \in A$ according to $p_t$. Simultaneously, the adversary chooses a vector $z_t \in [0,1]^d$; more formally, $z_t$ is a measurable function of the past $(p_s, a_s, z_s)_{s=1,\dots,t-1}$. In the full information case, $p_t$ is a measurable function of $(p_s, a_s, z_s)_{s=1,\dots,t-1}$. In the semi-bandit case, $p_t$ is a measurable function of $(p_s, a_s, (a_s(i) z_s(i))_{i=1,\dots,d})_{s=1,\dots,t-1}$, and in the bandit problem it is a measurable function of $(p_s, a_s, a_s^T z_s)_{s=1,\dots,t-1}$.

Footnote 1: In the full information version, it is straightforward to obtain upper bounds for the stronger notion of regret $\mathbb{E} \sum_{t=1}^n a_t^T z_t - \mathbb{E} \min_{a \in A} \sum_{t=1}^n a^T z_t$, which is always at least as large as $R_n$. However, for partial information games, this requires more work. In this paper we only consider $R_n$ as a measure of the regret.

Footnote 2: Note that since all actions have the same size, i.e., $\|a\|_1 = m$ for all $a \in A$, one can reduce the case of $\mathcal{Z} = [\alpha, \beta]^d$ to $\mathcal{Z} = [0,1]^d$ via a simple renormalization.
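To make the three feedback models concrete, the following minimal simulation of the protocol of Figure 1 may help; it is an illustrative sketch, not code from the paper, and the `player` and `adversary` callables are our own scaffolding.

```python
import numpy as np

def play_game(actions, player, adversary, n, feedback="semi-bandit", rng=None):
    """Simulate online combinatorial optimization (illustrative sketch).

    actions:   (K, d) 0/1 matrix, one row per action in A
    player:    callable(history) -> probability vector p_t over the K actions
    adversary: callable(t) -> loss vector z_t in [0, 1]^d
    """
    rng = rng or np.random.default_rng(0)
    history, total_loss = [], 0.0
    for t in range(n):
        p_t = player(history)                # distribution over A
        k = rng.choice(len(actions), p=p_t)  # draw a_t ~ p_t
        a_t, z_t = actions[k], adversary(t)  # simultaneous choices
        loss = a_t @ z_t
        total_loss += loss
        if feedback == "full":               # observe all of z_t
            obs = z_t
        elif feedback == "semi-bandit":      # observe only played coordinates
            obs = a_t * z_t
        else:                                # bandit: the scalar loss only
            obs = loss
        history.append((p_t, a_t, obs))
    return total_loss
```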

1.1 Motivating examples.

Many problems can be tackled under the online combinatorial optimization framework. We give here three simple examples.

m-sets. In this example we consider the set $A$ of all $\binom{d}{m}$ Boolean vectors in dimension $d$ with exactly $m$ ones. In other words, at every time step the player selects $m$ actions out of $d$ possibilities. When $m = 1$, the semi-bandit and bandit versions coincide and correspond to the standard adversarial multi-armed bandit problem.

Online shortest path problem. Consider a communication network represented by a graph in which one has to send a sequence of packets from one fixed vertex to another. For each packet one chooses a path through the graph and suffers a certain delay which is the sum of the delays on the edges of the path. Depending on the traffic, the delays on the edges may change and, at the end of each round, according to the assumed level of feedback, the player observes either the delays of all edges, the delays of each edge on the chosen path, or only the total delay of the chosen path. The player's objective is to minimize the total delay for the sequence of packets. One can represent the set of valid paths from the starting vertex to the end vertex as a set $A \subseteq \{0,1\}^d$, where $d$ is the number of edges. If at time $t$, $z_t \in [0,1]^d$ is the vector of delays on the edges, then the delay of a path $a \in A$ is $z_t^T a$. Thus this problem is an instance of online combinatorial optimization in dimension $d$, where $d$ is the number of edges in the graph. In this paper we assume, for simplicity, that all valid paths have the same length $m$.

Ranking. Consider the problem of selecting a ranking of $m$ items out of $M$ possible items. For example, a website could have a set of $M$ ads, and it has to select a ranked list of $m$ of these ads to appear on the webpage. One can rephrase this problem as selecting a matching of size $m$ on the complete bipartite graph $K_{m,M}$ with $d = mM$ edges. In the online learning version of this problem, each day the website chooses one such list, and gains one dollar for each click on the ads. This problem can easily be formulated as an online combinatorial optimization problem.

Our theory applies to many more examples, such as spanning trees (which can be useful in certain communication problems) or m-intervals.
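As a concrete toy illustration of the m-sets example (our own snippet, not from the paper), one can enumerate $A$ and evaluate the loss of each action against a given loss vector:

```python
from itertools import combinations
import numpy as np

def m_sets(d, m):
    """All Boolean vectors of dimension d with exactly m ones (the m-sets A)."""
    rows = []
    for support in combinations(range(d), m):
        a = np.zeros(d, dtype=int)
        a[list(support)] = 1
        rows.append(a)
    return np.array(rows)

A = m_sets(d=4, m=2)                  # binom(4, 2) = 6 actions
z = np.array([0.1, 0.9, 0.4, 0.4])    # one loss vector z_t
losses = A @ z                        # loss a^T z of every action a in A
best = A[losses.argmin()]             # in-hindsight best action
```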

1.2 Previous work.

Full information. The full-information setting is now fairly well understood, and an optimal regret bound in terms of $m$, $d$, $n$ was obtained by Koolen, Warmuth, and Kivinen [26]. Previous papers under full information feedback also include Gentile and Warmuth [14], Kivinen and Warmuth [25], Grove, Littlestone, and Schuurmans [15], Takimoto and Warmuth [34], Kalai and Vempala [22], Warmuth and Kuzmin [36], Helmbold and Warmuth [19], and Hazan, Kale, and Warmuth [18].

Semi-bandit. The first paper on the adversarial multi-armed bandit problem (i.e., the special case of m-sets with $m = 1$) is by Auer, Cesa-Bianchi, Freund, and Schapire [4], who derived a regret bound of order $\sqrt{dn \log d}$. This result was improved to $\sqrt{dn}$ by Audibert and Bubeck [2, 3]. György, Linder, Lugosi, and Ottucsák [16] consider the online shortest path problem and derive regret bounds that are suboptimal in terms of the dependency on $m$ and $d$. Uchiya, Nakamura, and Kudo [35] (respectively, Kale, Reyzin, and Schapire [23]) derived regret bounds that are optimal up to logarithmic factors for the case of m-sets (respectively, for the problem of ranking selection).

Bandit. McMahan and Blum [27] and Awerbuch and Kleinberg [5] were the first to consider this setting, and obtained regret bounds with a suboptimal dependency on $n$. The first paper with the optimal dependency on $n$ was by Dani, Hayes, and Kakade [12]. The dependency on $m$ and $d$ was then improved in various ways by Abernethy, Hazan, and Rakhlin [1], Cesa-Bianchi and Lugosi [11], and Bubeck, Cesa-Bianchi, and Kakade [9]. We discuss these bounds in detail in Section 4; in particular, we argue that the optimal regret bound in terms of $d$ and $m$ is still an open problem. We also refer the interested reader to the recent survey [8] for an overview of bandit problems in various other settings.

1.3 Contribution and contents of the paper.

In this paper we are primarily interested in the optimal minimax regret in terms of $m$, $d$ and $n$. More precisely, our aim is to determine the order of magnitude of the following quantity: for a given feedback assumption, write $\sup$ for the supremum over all adversaries and $\inf$ for the infimum over all strategies allowed for the player under the feedback assumption (recall the definitions of adversary and player from the introduction). Then we are interested in
$$\max_{A \subseteq \{0,1\}^d :\, \|a\|_1 = m \ \forall a \in A}\ \inf \sup R_n.$$
Our contribution to the study of this quantity is threefold. First, we unify the algorithms used in Abernethy, Hazan, and Rakhlin [1], Koolen, Warmuth, and Kivinen [26], Uchiya, Nakamura, and Kudo [35], and Kale, Reyzin, and Schapire [23] under the umbrella of mirror descent. The idea of mirror descent goes back to Nemirovski [28] and Nemirovski and Yudin [29]. A somewhat similar concept was re-discovered in online learning by Herbster and Warmuth [20], Grove, Littlestone, and Schuurmans [15], and Kivinen and Warmuth [25] under the name of potential-based gradient descent; see [10, Chapter 11].

              | Full Information              | Semi-Bandit    | Bandit
  Lower Bound | $m\sqrt{n \log\frac{d}{m}}$   | $\sqrt{mdn}$   | $m\sqrt{dn}$
  Upper Bound | $m\sqrt{n \log\frac{d}{m}}$   | $\sqrt{mdn}$   | $m^{3/2}\sqrt{dn \log\frac{d}{m}}$

Table 1: Bounds on the minimax regret up to constant factors. The new results are set in boldface. In this paper we also show that EXP2 in the full information case has a regret bounded below by $d^{3/2}\sqrt{n}$ when $m$ is of order $d$.

Recently, these ideas have been flourishing; see for instance Shalev-Shwartz [33], Rakhlin [30], Hazan [17], and Bubeck [7]. Our main theorem (Theorem 2) allows one to recover almost all known regret bounds for online combinatorial optimization. This first contribution leads to our second main result, the improvement of the known upper bounds for the semi-bandit game. In particular, we propose a different proof of the minimax regret bound of order $\sqrt{nd}$ in the standard $d$-armed bandit game that is much simpler than the one provided in Audibert and Bubeck [3], and which also improves the constant factor. In addition to these upper bounds we prove two new lower bounds. First we answer a question of Koolen, Warmuth, and Kivinen [26] by showing that the exponentially weighted average forecaster is provably suboptimal for online combinatorial optimization. Our second lower bound is a minimax lower bound in the bandit setting which improves known results by an order of magnitude. A summary of known bounds and the new bounds proved in this paper can be found in Table 1.

The paper is organized as follows. In Section 2 we introduce the two algorithms discussed in this paper. In particular, in Section 2.1 we discuss the popular exponentially weighted average forecaster and we show that it is a provably suboptimal strategy. Then in Section 2.2 we describe our main algorithm, OSMD (Online Stochastic Mirror Descent), and prove a general regret bound in terms of the Bregman divergence of the Fenchel-Legendre dual of the Legendre function defining the strategy. In Section 3 we derive upper bounds for the regret in the semi-bandit case for OSMD with appropriately chosen Legendre functions. Finally, in Section 4 we prove a new lower bound for the bandit setting, and we formulate a conjecture on the correct order of magnitude of the regret for that problem, based on this new result and the regret bounds obtained in [1, 9].

2 Algorithms.

In this section we discuss two classes of algorithms that have been proposed for online combinatorial optimization.

2.1 Expanded Exponential weights (EXP2).

The simplest approach to online combinatorial optimization is to consider each action of $A$ as an independent expert, and then apply a generic regret minimizing strategy. Perhaps the most popular such strategy is the exponentially weighted average forecaster (see, e.g., [10]).

This strategy is sometimes called Hedge; see Freund and Schapire [13]. We call the resulting strategy for the online combinatorial optimization problem EXP2; see Figure 2. In the full information setting, EXP2 corresponds to Expanded Hedge, as defined in Koolen, Warmuth, and Kivinen [26]. In the semi-bandit case, EXP2 was studied by György, Linder, Lugosi, and Ottucsák [16], while in the bandit case it was studied by Dani, Hayes, and Kakade [12], Cesa-Bianchi and Lugosi [11], and Bubeck, Cesa-Bianchi, and Kakade [9]. Note that in the bandit case, EXP2 is mixed with an exploration distribution; see Section 4 for more details.

Figure 2: The EXP2 strategy. The notation $\mathbb{E}_{a \sim p_t}$ denotes expected value with respect to the random choice of $a$ when it is distributed according to $p_t$.
  EXP2:
  Parameter: learning rate $\eta$.
  Let $p_1 = (1/|A|, \dots, 1/|A|) \in \mathbb{R}^{|A|}$.
  For each round $t = 1, 2, \dots, n$:
  (a) Play $a_t \sim p_t$ and observe
      - the loss vector $z_t$ in the full information game,
      - the coordinates $z_t(i) \mathbf{1}_{a_t(i)=1}$ in the semi-bandit game,
      - the instantaneous loss $a_t^T z_t$ in the bandit game.
  (b) Estimate the loss vector $z_t$ by $\tilde{z}_t$. For instance, one may take
      - $\tilde{z}_t = z_t$ in the full information game,
      - $\tilde{z}_t(i) = \frac{z_t(i)\, a_t(i)}{\sum_{a \in A :\, a(i)=1} p_t(a)}$ in the semi-bandit game,
      - $\tilde{z}_t = P_t^{+} a_t a_t^T z_t$, with $P_t = \mathbb{E}_{a \sim p_t}(a a^T)$, in the bandit game.
  (c) Update the probabilities: for all $a \in A$,
      $$p_{t+1}(a) = \frac{\exp(-\eta\, a^T \tilde{z}_t)\, p_t(a)}{\sum_{b \in A} \exp(-\eta\, b^T \tilde{z}_t)\, p_t(b)}.$$

Despite strong interest in this strategy, no optimal regret bound has been derived for it in the combinatorial setting. More precisely, the best bound which can be derived from a standard argument (see for example [12] or [26]) is of order $m^{3/2}\sqrt{n \log\frac{d}{m}}$. On the other hand, in [26] the authors showed that by using Mirror Descent (see next section) with the negative entropy, one obtains a regret bounded by $m\sqrt{n \log\frac{d}{m}}$. Furthermore, this latter bound is clearly optimal up to a numerical constant, as one can see from the standard lower bound in prediction with expert advice (consider the set $A$ that corresponds to playing $m$ expert problems in parallel, with $d/m$ experts in each problem). In [26] the authors leave as an open question the problem of whether it is possible to improve the bound for EXP2 to obtain the optimal order of magnitude. The following theorem shows that this is impossible, and that in fact EXP2 is a provably suboptimal strategy.

Theorem 1. Let $n \ge d$. There exists a subset $A \subseteq \{0,1\}^d$ such that, in the full information setting, the regret of the EXP2 strategy, for any learning rate $\eta$, satisfies
$$\sup_{\text{adversary}} R_n \ \ge\ 0.01\, d^{3/2} \sqrt{n}.$$
The proof is deferred to the Appendix.
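For concreteness, here is a minimal sketch of the EXP2 update in the full information game, where $\tilde z_t = z_t$ (our illustration, not the paper's implementation; note that enumerating $A$ row by row is exponential in $d$ in general, which is one reason to prefer the mirror descent formulation of the next section):

```python
import numpy as np

def exp2_full_info(actions, loss_vectors, eta):
    """EXP2 with full-information feedback (illustrative sketch).

    actions: (K, d) 0/1 matrix listing A row by row
    loss_vectors: iterable of loss vectors z_t in [0, 1]^d
    Returns the sequence of expected instantaneous losses p_t . (A z_t).
    """
    K = len(actions)
    log_w = np.zeros(K)                 # log-weights; p_1 is uniform
    expected_losses = []
    for z_t in loss_vectors:
        log_w -= log_w.max()            # stabilize before exponentiating
        p_t = np.exp(log_w)
        p_t /= p_t.sum()
        expected_losses.append(p_t @ (actions @ z_t))
        log_w -= eta * (actions @ z_t)  # p_{t+1}(a) ~ exp(-eta a^T z_t) p_t(a)
    return expected_losses
```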

2.2 Online Stochastic Mirror Descent.

In this section we describe the main algorithm studied in this paper. We call it Online Stochastic Mirror Descent (OSMD). Each term in this name refers to a part of the algorithm. Mirror Descent originates in the work of Nemirovski and Yudin [29]: the idea is to perform a gradient descent where the update with the gradient is performed in the dual space defined by some Legendre function $F$, rather than in the primal (see below for a precise formulation). The Stochastic part takes its origin from Robbins and Monro [31] and from Kiefer and Wolfowitz [24]: the key idea is that it is enough to observe an unbiased estimate of the gradient, rather than the true gradient, in order to perform a gradient descent. Finally, the Online part comes from Zinkevich [37], who derived the Online Gradient Descent (OGD) algorithm, a version of gradient descent tailored to online optimization.

To properly describe the OSMD strategy, we recall a few concepts from convex analysis; see Hiriart-Urruty and Lemaréchal [21] for a thorough treatment of this subject. Let $\mathcal{D} \subseteq \mathbb{R}^d$ be an open convex set, and $\overline{\mathcal{D}}$ the closure of $\mathcal{D}$.

Definition 1. We call Legendre any continuous function $F : \overline{\mathcal{D}} \to \mathbb{R}$ such that
(i) $F$ is strictly convex and continuously differentiable on $\mathcal{D}$,
(ii) $\lim_{x \to \overline{\mathcal{D}} \setminus \mathcal{D}} \|\nabla F(x)\| = +\infty$.$^3$
The Bregman divergence $D_F : \overline{\mathcal{D}} \times \mathcal{D} \to \mathbb{R}$ associated to a Legendre function $F$ is defined by
$$D_F(x, y) = F(x) - F(y) - (x - y)^T \nabla F(y).$$
Moreover, we say that $\mathcal{D}^* = \nabla F(\mathcal{D})$ is the dual space of $\mathcal{D}$ under $F$. We also denote by $F^*$ the Legendre-Fenchel transform of $F$, defined by
$$F^*(u) = \sup_{x \in \overline{\mathcal{D}}} \big( x^T u - F(x) \big).$$

Lemma 1. Let $F$ be a Legendre function. Then $F^{**} = F$, and $\nabla F^* = (\nabla F)^{-1}$ on the set $\mathcal{D}^*$. Moreover, for all $x, y \in \mathcal{D}$,
$$D_F(x, y) = D_{F^*}\big(\nabla F(y), \nabla F(x)\big). \qquad (1)$$

Footnote 3: By the equivalence of norms in $\mathbb{R}^d$, this definition does not depend on the choice of the norm.
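As a quick numerical sanity check of Lemma 1 (an illustration we add here), take the negative entropy $F(x) = \sum_i (x_i \log x_i - x_i)$ on $\mathcal{D} = (0, \infty)^d$, for which $\nabla F(x) = \log x$ coordinatewise, $F^*(u) = \sum_i e^{u_i}$ and $\nabla F^*(u) = e^u$; identity (1) then holds exactly:

```python
import numpy as np

# Negative entropy F(x) = sum(x log x - x); gradients are coordinatewise.
F       = lambda x: np.sum(x * np.log(x) - x)
Fstar   = lambda u: np.sum(np.exp(u))
gradF   = np.log
bregman = lambda phi, grad, x, y: phi(x) - phi(y) - (x - y) @ grad(y)

rng = np.random.default_rng(1)
x, y = rng.uniform(0.1, 2.0, size=5), rng.uniform(0.1, 2.0, size=5)

lhs = bregman(F, gradF, x, y)
rhs = bregman(Fstar, np.exp, gradF(y), gradF(x))   # identity (1)
assert np.isclose(lhs, rhs)
```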

The lemma above is the key to understanding how a Legendre function acts on the space. The gradient $\nabla F$ maps $\mathcal{D}$ to the dual space $\mathcal{D}^*$, and $\nabla F^*$ is the inverse mapping from the dual space to the original (primal) space. Moreover, (1) shows that the Bregman divergence in the primal space corresponds exactly to the Bregman divergence of the Legendre-Fenchel transform in the dual space. A proof of this result can be found, for example, in [10, Chapter 11].

We now have all the ingredients to describe the OSMD strategy; see Figure 3 for the precise formulation. Note that step (d) is well defined if the following consistency condition is satisfied:
$$\nabla F(x) - \eta \tilde{z}_t \in \mathcal{D}^*, \qquad \forall x \in \mathrm{Conv}(A) \cap \mathcal{D}. \qquad (2)$$

Figure 3: Online Stochastic Mirror Descent (OSMD).
  OSMD:
  Parameters: learning rate $\eta > 0$; Legendre function $F$ defined on $\overline{\mathcal{D}} \supseteq \mathrm{Conv}(A)$.
  Let $x_1 \in \mathrm{argmin}_{x \in \mathrm{Conv}(A)} F(x)$.
  For each round $t = 1, 2, \dots, n$:
  (a) Let $p_t$ be a distribution on the set $A$ such that $x_t = \mathbb{E}_{a \sim p_t}\, a$.
  (b) Draw a random action $a_t$ according to the distribution $p_t$ and observe the feedback.
  (c) Based on the observed feedback, estimate the loss vector $z_t$ by $\tilde{z}_t$.
  (d) Let $w_{t+1} \in \mathcal{D}$ satisfy
      $$\nabla F(w_{t+1}) = \nabla F(x_t) - \eta \tilde{z}_t. \qquad (3)$$
  (e) Project the weight vector $w_{t+1}$ defined by (3) on the convex hull of $A$:
      $$x_{t+1} \in \mathrm{argmin}_{x \in \mathrm{Conv}(A)} D_F(x, w_{t+1}). \qquad (4)$$

In the full information setting, algorithms of this type were studied by Abernethy, Hazan, and Rakhlin [1], Rakhlin [30], and Hazan [17]. In these papers the authors adopted the presentation suggested by Beck and Teboulle [6], which corresponds to a Follow-the-Regularized-Leader (FTRL) type strategy; there the focus was on $F$ being strongly convex with respect to some norm. Moreover, in [1] the authors also consider the bandit case, and switch to $F$ being a self-concordant barrier for the convex hull of $A$ (see Section 4 for more details). Another line of work studied this type of algorithm with $F$ being the negative entropy; see Koolen, Warmuth, and Kivinen [26] for the full information case and Uchiya, Nakamura, and Kudo [35] and Kale, Reyzin, and Schapire [23] for specific instances of the semi-bandit case. All these results are unified and described in detail in Bubeck [7]. In this paper we consider a new type of Legendre function $F$, inspired by Audibert and Bubeck [3]; see Section 3.

Regarding computational complexity, OSMD is efficient as soon as the polytope $\mathrm{Conv}(A)$ can be described by a number of constraints that is polynomial in $d$. Indeed, in that case steps (a)-(b) can be performed efficiently jointly (one can get an algorithm by looking at the proof of Carathéodory's theorem), and step (d) is a convex program with a polynomial number of constraints. In many interesting examples (such as m-sets, selection of rankings, spanning trees, paths in acyclic graphs), one can describe the convex hull of $A$ by a polynomial number of constraints; see Schrijver [32]. On the other hand, there also exist important examples where this is not the case, such as paths on general graphs. Also note that for some specific examples it is possible to implement OSMD with improved computational complexity; see Koolen, Warmuth, and Kivinen [26].

In this paper we restrict our attention to the combinatorial learning setting in which $A$ is a subset of $\{0,1\}^d$ and the loss is linear. However, one should note that this specific form of $A$ plays no role in the definition of OSMD. Moreover, if the loss is not linear, then one can modify OSMD by performing a gradient update with a gradient of the loss rather than the loss vector $z_t$. See Bubeck [7] for more details on this approach.
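As a concrete instantiation of Figure 3 (a sketch we add for illustration, not the implementation used in the paper), take $F$ to be the negative entropy and $A$ the canonical basis ($m = 1$), so that $\mathrm{Conv}(A)$ is the simplex: step (a) is then trivial ($p_t = x_t$), step (d) becomes the multiplicative update $w_{t+1}(i) = x_t(i) e^{-\eta \tilde z_t(i)}$, and the projection (e) reduces to a renormalization. The loss estimate used here is the importance-weighted one that Section 3 introduces as (6).

```python
import numpy as np

def osmd_negentropy_simplex(d, n, eta, draw_loss, rng=None):
    """OSMD (Figure 3) with F(x) = sum(x log x - x) on the simplex; m = 1 sketch.

    draw_loss: callable(t) -> loss vector z_t in [0, 1]^d.
    Returns the total realized loss.
    """
    rng = rng or np.random.default_rng(0)
    x = np.full(d, 1.0 / d)             # x_1 minimizes F over the simplex
    total = 0.0
    for t in range(n):
        i = rng.choice(d, p=x)          # steps (a)-(b): for m = 1, p_t = x_t
        z = draw_loss(t)
        total += z[i]
        z_tilde = np.zeros(d)           # step (c): importance-weighted estimate
        z_tilde[i] = z[i] / x[i]
        w = x * np.exp(-eta * z_tilde)  # step (d): grad F(w) = grad F(x) - eta z~
        x = w / w.sum()                 # step (e): Bregman projection = renormalize
    return total
```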

The following result is at the basis of our improved regret bounds for OSMD in the semi-bandit setting; see Section 3.

Theorem 2. Suppose that (2) is satisfied and the loss estimates are unbiased, in the sense that $\mathbb{E}_{a_t \sim p_t} \tilde{z}_t = z_t$. Then the regret of the OSMD strategy satisfies
$$R_n \ \le\ \sup_{a \in A} \frac{F(a) - F(x_1)}{\eta} + \frac{1}{\eta} \sum_{t=1}^n \mathbb{E}\, D_{F^*}\big( \nabla F(x_t) - \eta \tilde{z}_t,\ \nabla F(x_t) \big).$$

Proof. Let $a \in A$. Using that $a_t$ and $\tilde{z}_t$ are unbiased estimates of $x_t$ and $z_t$, we have
$$\mathbb{E} \sum_{t=1}^n (a_t - a)^T z_t = \mathbb{E} \sum_{t=1}^n (x_t - a)^T \tilde{z}_t.$$
Using (3), and applying the definition of the Bregman divergence, one obtains
$$\eta\, \tilde{z}_t^T (x_t - a) = (a - x_t)^T \big( \nabla F(w_{t+1}) - \nabla F(x_t) \big) = D_F(a, x_t) + D_F(x_t, w_{t+1}) - D_F(a, w_{t+1}).$$
By the Pythagorean theorem for Bregman divergences (see, e.g., Lemma 11.3 of [10]), we have $D_F(a, w_{t+1}) \ge D_F(a, x_{t+1}) + D_F(x_{t+1}, w_{t+1})$, hence
$$\eta\, \tilde{z}_t^T (x_t - a) \ \le\ D_F(a, x_t) + D_F(x_t, w_{t+1}) - D_F(a, x_{t+1}) - D_F(x_{t+1}, w_{t+1}).$$
Summing over $t$ gives
$$\eta \sum_{t=1}^n \tilde{z}_t^T (x_t - a) \ \le\ D_F(a, x_1) - D_F(a, x_{n+1}) + \sum_{t=1}^n \big( D_F(x_t, w_{t+1}) - D_F(x_{t+1}, w_{t+1}) \big).$$
By the nonnegativity of the Bregman divergences, we get
$$\eta \sum_{t=1}^n \tilde{z}_t^T (x_t - a) \ \le\ D_F(a, x_1) + \sum_{t=1}^n D_F(x_t, w_{t+1}).$$
From (1), one has $D_F(x_t, w_{t+1}) = D_{F^*}\big( \nabla F(x_t) - \eta \tilde{z}_t,\ \nabla F(x_t) \big)$. Moreover, by writing the first-order optimality condition for $x_1$, one directly obtains $D_F(a, x_1) \le F(a) - F(x_1)$, which concludes the proof.
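The second display in the proof is a three-point identity that holds for any Legendre $F$ once $w_{t+1}$ satisfies (3); the following check (our illustration, again with the negative entropy) confirms it numerically:

```python
import numpy as np

F     = lambda x: np.sum(x * np.log(x) - x)        # negative entropy
gradF = np.log
D_F   = lambda x, y: F(x) - F(y) - (x - y) @ gradF(y)

rng = np.random.default_rng(2)
x, a = rng.uniform(0.1, 1.0, 4), rng.uniform(0.1, 1.0, 4)
eta, z_tilde = 0.5, rng.uniform(0.0, 1.0, 4)
w = x * np.exp(-eta * z_tilde)          # (3): grad F(w) = grad F(x) - eta z~

lhs = eta * z_tilde @ (x - a)
rhs = D_F(a, x) + D_F(x, w) - D_F(a, w) # three-point identity
assert np.isclose(lhs, rhs)
```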

Note that if $F$ admits a Hessian, denoted $\nabla^2 F$, that is always invertible, then one can prove that, up to a third-order term in $\tilde{z}_t$, the regret bound can be written as
$$R_n \ \le\ \sup_{a \in A} \frac{F(a) - F(x_1)}{\eta} + \frac{\eta}{2} \sum_{t=1}^n \mathbb{E}\, \tilde{z}_t^T \big( \nabla^2 F(x_t) \big)^{-1} \tilde{z}_t. \qquad (5)$$
The main technical difficulty is to control the third-order error term in this inequality.

3 Semi-bandit feedback.

In this section we consider online combinatorial optimization with semi-bandit feedback. As we already discussed, in the full information case Koolen, Warmuth, and Kivinen [26] proved that OSMD with the negative entropy is a minimax optimal strategy. We first prove a regret bound when one uses this strategy with the following estimate for the loss vector:
$$\tilde{z}_t(i) = \frac{z_t(i)\, a_t(i)}{x_t(i)}. \qquad (6)$$
Note that this is a valid estimate since it makes use only of $(z_t(1) a_t(1), \dots, z_t(d) a_t(d))$. Moreover, it is unbiased with respect to the random draw of $a_t$ from $p_t$, since by definition $\mathbb{E}_{a_t \sim p_t}\, a_t = x_t$; in other words, $\mathbb{E}_{a_t \sim p_t} \tilde{z}_t = z_t$.
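A short Monte Carlo check (our illustration) of this unbiasedness claim: averaging the estimate (6) over draws $a_t \sim p_t$ recovers $z_t$ on every coordinate with $x_t(i) > 0$.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
d, m = 5, 2
A = np.array([[1 if i in s else 0 for i in range(d)]
              for s in combinations(range(d), m)])
p = rng.dirichlet(np.ones(len(A)))      # any distribution p_t over A
x = p @ A                               # x_t = E_{a ~ p_t} a
z = rng.uniform(size=d)                 # a loss vector z_t

draws = rng.choice(len(A), size=200_000, p=p)
z_tilde = (A[draws] * z) / x            # estimate (6), one row per draw
print(np.abs(z_tilde.mean(axis=0) - z).max())   # ~0: the estimate is unbiased
```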

Theorem 3. The regret of OSMD with $F(x) = \sum_{i=1}^d x_i \log x_i - \sum_{i=1}^d x_i$, $\mathcal{D} = (0, +\infty)^d$, and any non-negative unbiased loss estimate $\tilde{z}_t \ge 0$ satisfies
$$R_n \ \le\ \frac{m \log\frac{d}{m}}{\eta} + \frac{\eta}{2} \sum_{t=1}^n \sum_{i=1}^d \mathbb{E}\, x_t(i)\, \tilde{z}_t(i)^2.$$
In particular, with the estimate (6) and $\eta = \sqrt{\frac{2 m \log\frac{d}{m}}{n d}}$,
$$R_n \ \le\ \sqrt{2 m d n \log\frac{d}{m}}.$$

Proof. One can easily see that for the negative entropy the dual space is $\mathcal{D}^* = \mathbb{R}^d$. Thus (2) is satisfied and OSMD is well defined. Moreover, again by straightforward computations, one can see that
$$D_{F^*}\big( \nabla F(x), \nabla F(y) \big) = \sum_{i=1}^d y_i\, \Theta\big( \nabla F(x)_i - \nabla F(y)_i \big), \qquad (7)$$
where $\Theta(x) = \exp(x) - 1 - x$. Thus, using Theorem 2 and the facts that $\Theta(x) \le \frac{x^2}{2}$ for $x \le 0$ and $\sum_{i=1}^d x_t(i) \le m$, one obtains
$$R_n \ \le\ \sup_{a \in A} \frac{F(a) - F(x_1)}{\eta} + \frac{1}{\eta} \sum_{t=1}^n \mathbb{E}\, D_{F^*}\big( \nabla F(x_t) - \eta \tilde{z}_t,\ \nabla F(x_t) \big) \ \le\ \sup_{a \in A} \frac{F(a) - F(x_1)}{\eta} + \frac{\eta}{2} \sum_{t=1}^n \sum_{i=1}^d \mathbb{E}\, x_t(i)\, \tilde{z}_t(i)^2.$$
The proof of the first inequality is concluded by noting that, for any $a \in A$,
$$F(a) - F(x_1) = \sum_{i=1}^d x_1(i) \log\frac{1}{x_1(i)} \ \le\ m \log\frac{d}{m},$$
where the inequality follows from Jensen's inequality and $\sum_{i=1}^d x_1(i) = m$. The second inequality follows from
$$\mathbb{E}\, x_t(i)\, \tilde{z}_t(i)^2 \ \le\ \mathbb{E}\, \frac{a_t(i)}{x_t(i)} = 1.$$

Using the standard $\sqrt{dn}$ lower bound for the multi-armed bandit (which corresponds to the case where $A$ is the canonical basis), see e.g. [3, Theorem 30], one can directly obtain a lower bound of order $\sqrt{mdn}$ for our setting. Thus the upper bound derived in Theorem 3 has an extraneous logarithmic factor compared to the lower bound. This phenomenon already appeared in the basic multi-armed bandit setting. In that case, the extra logarithmic factor was removed in Audibert and Bubeck [2] by resorting to a new class of strategies for the expert problem, called INF (Implicitly Normalized Forecaster). Next we generalize this class of algorithms to the combinatorial setting, and thus remove the extra logarithmic factor.

First we introduce the notion of a potential and the associated Legendre function.

Definition 2. Let $\omega \ge 0$. A function $\psi : (-\infty, a) \to \mathbb{R}_+^*$ (for some $a \in \mathbb{R} \cup \{+\infty\}$) is called an $\omega$-potential if it is convex, continuously differentiable, and satisfies
$$\lim_{x \to -\infty} \psi(x) = \omega, \qquad \lim_{x \to a} \psi(x) = +\infty, \qquad \psi' > 0, \qquad \int_\omega^{\omega+1} |\psi^{-1}(s)|\, ds < +\infty.$$
To every potential $\psi$ we associate the function $F_\psi$ defined on $\mathcal{D} = (\omega, +\infty)^d$ by
$$F_\psi(x) = \sum_{i=1}^d \int_\omega^{x_i} \psi^{-1}(s)\, ds.$$
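Two examples used below make the definition concrete. The exponential potential $\psi(x) = e^x$ (with $\omega = 0$, $a = +\infty$) has $\psi^{-1}(s) = \log s$, and integrating gives back the negative entropy, while $\psi(x) = (-x)^{-q}$ (with $\omega = 0$, $a = 0$) has $\psi^{-1}(s) = -s^{-1/q}$:
$$F_\psi(x) = \sum_{i=1}^d \int_0^{x_i} \log s\, ds = \sum_{i=1}^d \big( x_i \log x_i - x_i \big), \qquad\text{respectively}\qquad F_\psi(x) = -\frac{q}{q-1} \sum_{i=1}^d x_i^{1 - 1/q}.$$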

In this paper we restrict our attention to 0-potentials, which we will simply call potentials. A non-zero value of $\omega$ may be used to derive regret bounds that hold with high probability, instead of pseudo-regret bounds; see footnote 1. The first-order optimality condition for (4) implies that OSMD with $F_\psi$ is a direct generalization of INF with potential $\psi$, in the sense that the two algorithms coincide when $A$ is the canonical basis. Note, in particular, that with $\psi(x) = \exp(x)$ we recover the negative entropy for $F_\psi$. In [3], the choice of $\psi(x) = (-x)^{-q}$ with $q > 1$ was recommended. We show in Theorem 4 that here, again, this choice gives a minimax optimal strategy.

Lemma 2. Let $\psi$ be a potential. Then $F = F_\psi$ is Legendre, and for all $u, v \in \mathcal{D}^* = (-\infty, a)^d$ such that $u_i \le v_i$ for all $i \in \{1, \dots, d\}$,
$$D_{F^*}(u, v) \ \le\ \frac{1}{2} \sum_{i=1}^d \psi'(v_i)\, (u_i - v_i)^2.$$

Proof. A direct examination shows that $F = F_\psi$ is a Legendre function. Moreover, since $\nabla F^*(u) = (\nabla F)^{-1}(u) = (\psi(u_1), \dots, \psi(u_d))$, we obtain
$$D_{F^*}(u, v) = \sum_{i=1}^d \Big( \int_{v_i}^{u_i} \psi(s)\, ds - (u_i - v_i)\, \psi(v_i) \Big).$$
From a Taylor expansion, we get
$$D_{F^*}(u, v) \ \le\ \sum_{i=1}^d \frac{1}{2} \max_{s \in [u_i, v_i]} \psi'(s)\, (u_i - v_i)^2.$$
Since the function $\psi$ is convex, $\psi'$ is non-decreasing, and since $u_i \le v_i$ we have $\max_{s \in [u_i, v_i]} \psi'(s) = \psi'(v_i)$, which gives the desired result.

Theorem 4. Let $\psi$ be a potential. The regret of OSMD with $F = F_\psi$ and any non-negative unbiased loss estimate $\tilde{z}_t$ satisfies
$$R_n \ \le\ \sup_{a \in A} \frac{F(a) - F(x_1)}{\eta} + \frac{\eta}{2} \sum_{t=1}^n \sum_{i=1}^d \mathbb{E}\, \frac{\tilde{z}_t(i)^2}{(\psi^{-1})'(x_t(i))}.$$
In particular, with the estimate (6), $\psi(x) = (-x)^{-q}$ for $q > 1$, and $\eta = \sqrt{\frac{2}{q-1} \cdot \frac{m^{1 - 2/q}}{d^{1 - 2/q}\, n}}$,
$$R_n \ \le\ q \sqrt{\frac{2}{q-1}\, m d n}.$$
With $q = 2$ this gives $R_n \le 2\sqrt{2 m d n}$.
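To make the connection with INF concrete, here is a sketch (ours, for the $m = 1$ simplex case) of one OSMD update with $\psi(x) = (-x)^{-q}$: by the first-order optimality condition for (4), the projection amounts to finding a shift $\lambda$ such that the coordinates $\psi(\psi^{-1}(w_{t+1}(i)) + \lambda)$ sum to one — the "implicit normalization" of INF — which we locate by bisection (the bracketing constants and iteration count below are arbitrary choices for this demo, not from the paper).

```python
import numpy as np

q = 2.0
psi     = lambda u: (-u) ** -q          # potential, defined for u < 0
psi_inv = lambda x: -x ** (-1.0 / q)

def inf_update(x, z_tilde, eta, iters=100):
    """One OSMD/INF step on the simplex with psi(u) = (-u)^(-q) (sketch)."""
    u = psi_inv(x) - eta * z_tilde      # dual step (3); coordinates stay < 0
    hi = -u.max() - 1e-12               # need u_i + lam < 0 for all i
    lo = hi - 10.0
    while psi(u + lo).sum() > 1.0:      # widen bracket until the sum drops below 1
        lo -= 10.0
    for _ in range(iters):              # bisection on sum psi(u + lam) = 1
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if psi(u + mid).sum() < 1.0 else (lo, mid)
    return psi(u + 0.5 * (lo + hi))     # projected point x_{t+1}
```

With $q = 2$ and the estimate (6), this is (up to the tuning of $\eta$) the strategy achieving the $2\sqrt{2mdn}$ bound above in the case $m = 1$.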

In the case $m = 1$, the above theorem improves the bound $R_n \le 8\sqrt{nd}$ obtained in Theorem 11 of [3].

Proof. First note that since $\mathcal{D}^* = (-\infty, a)^d$ and $\tilde{z}_t$ has non-negative coordinates, OSMD is well defined, that is, (2) is satisfied. The first inequality follows from Theorem 2, Lemma 2, and the fact that $\psi'\big(\psi^{-1}(s)\big) = \frac{1}{(\psi^{-1})'(s)}$.

Let $\psi(x) = (-x)^{-q}$. Then $\psi^{-1}(x) = -x^{-1/q}$ and $F(x) = -\frac{q}{q-1} \sum_{i=1}^d x_i^{1-1/q}$. In particular, note that by Hölder's inequality, since $\sum_{i=1}^d x_1(i) = m$,
$$F(a) - F(x_1) \ \le\ \frac{q}{q-1} \sum_{i=1}^d x_1(i)^{1-1/q} \ \le\ \frac{q}{q-1}\, m^{1-1/q}\, d^{1/q}.$$
Moreover, note that $(\psi^{-1})'(x) = \frac{1}{q}\, x^{-1-1/q}$, and, with the estimate (6),
$$\sum_{i=1}^d \mathbb{E}\, \frac{\tilde{z}_t(i)^2}{(\psi^{-1})'(x_t(i))} \ \le\ \sum_{i=1}^d q\, x_t(i)^{1/q} \ \le\ q\, m^{1/q}\, d^{1-1/q},$$
which concludes the proof.

4 Bandit feedback.

In this section we consider online combinatorial optimization with bandit feedback. This setting is much more challenging than the semi-bandit case, and in order to obtain sublinear regret bounds all known strategies add an exploration component to the algorithm. For example, in EXP2, instead of playing an action at random according to the exponentially weighted average distribution $p_t$, one draws a random action from $p_t$ with probability $1 - \gamma$ and from some fixed exploration distribution $\mu$ with probability $\gamma$. In OSMD, on the other hand, one randomly perturbs $x_t$ to some $\tilde{x}_t$, and then plays at random a point in $A$ such that on average one plays $\tilde{x}_t$.

In Bubeck, Cesa-Bianchi, and Kakade [9], the authors study the EXP2 strategy with the exploration distribution $\mu$ supported on the contact points between the polytope $\mathrm{Conv}(A)$ and the John ellipsoid of this polytope (i.e., the ellipsoid of minimal volume enclosing the polytope). Using this method they are able to prove the best known upper bound for online combinatorial optimization with bandit feedback: they show that the regret of EXP2 mixed with John's exploration, and with the estimate described in Figure 2, satisfies
$$R_n \ \le\ 2 m^{3/2} \sqrt{3 d n \log\frac{e d}{m}}.$$
Our next theorem shows that no strategy can achieve a regret smaller than a constant times $m\sqrt{dn}$, leaving a gap of a factor of $\sqrt{m \log\frac{d}{m}}$. As we argue below, we conjecture that the lower bound is of the correct order of magnitude. However, improving the upper bound seems to require some substantially new ideas.
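For reference, the bandit loss estimate of Figure 2 is easy to compute with a Moore-Penrose pseudo-inverse; the snippet below is our own sketch, with a generic mixed distribution `p` standing in for the mixture of $p_t$ and John's exploration (whose construction we do not reproduce here).

```python
import numpy as np

def bandit_estimate(A, p, a_idx, loss):
    """Pseudo-inverse estimate z~_t = P_t^+ a_t (a_t^T z_t) from Figure 2.

    A: (K, d) action matrix, p: distribution over the K actions,
    a_idx: index of the played action, loss: observed scalar a_t^T z_t.
    """
    P = (A * p[:, None]).T @ A          # P_t = E_{a ~ p}[a a^T]
    return np.linalg.pinv(P) @ A[a_idx] * loss

# Mixing p_t with an exploration distribution mu (with probability gamma)
# keeps P_t well conditioned; John's exploration is one principled choice of mu.
```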

Note that the following bound gives limitations that no strategy can surpass, contrary to Theorem 1, which was dedicated to the EXP2 strategy.

Theorem 5. Let $n \ge d \ge 2m$. There exists a subset $A \subseteq \{0,1\}^d$ with $\|a\|_1 = m$ for all $a \in A$ such that, under bandit feedback, one has
$$\inf_{\text{strategies}} \ \sup_{\text{adversaries}} \ R_n \ \ge\ 0.02\, m \sqrt{dn}, \qquad (8)$$
where the infimum and the supremum are taken over the class of strategies for the player and for the adversary as defined in the introduction.

Note that it should not come as a surprise that EXP2 with John's exploration is suboptimal, since even in the full information case the basic EXP2 strategy is provably suboptimal; see Theorem 1. We conjecture that the correct order of magnitude for the minimax regret in the bandit case is $m\sqrt{dn}$, as the above lower bound suggests. A promising approach to resolving this conjecture is to consider again the OSMD approach. However, we believe that in the bandit case one has to consider Legendre functions with a non-diagonal Hessian, contrary to the Legendre functions considered so far in this paper. Abernethy, Hazan, and Rakhlin [1] propose to use a self-concordant barrier function for the polytope $\mathrm{Conv}(A)$; they then randomly perturb the point $x_t$ given by OSMD using the eigenstructure of the Hessian. This approach leads to a regret upper bound of order $m d \sqrt{\theta n \log n}$ (for $\theta > 0$) when $\mathrm{Conv}(A)$ admits a $\theta$-self-concordant barrier function. Unfortunately, even when there exists an $O(1)$-self-concordant barrier, this bound is still larger than the conjectured optimal bound by a factor $\sqrt{d}$. In fact, it was proved in [9] that in some cases there exist better choices for the Legendre function and the perturbation than those described in [1], even when there is an $O(1)$-self-concordant function for the action set. How to generalize this approach to the polytopes involved in online combinatorial optimization is a challenging open problem.

A Proof of Theorem 1.

For the sake of simplicity, we assume that $d$ is a multiple of 4 and that $n$ is even. We consider the following subset of the hypercube:
$$A = \Big\{ a \in \{0,1\}^d :\ \sum_{i=1}^{d/2} a(i) = d/4, \ \text{ and } \ \big( a(i) = 1,\ \forall i \in \{d/2+1, \dots, d/2+d/4\} \big) \ \text{ or } \ \big( a(i) = 1,\ \forall i \in \{d/2+d/4+1, \dots, d\} \big) \Big\}.$$
That is, choosing a point in $A$ corresponds to choosing a subset of $d/4$ elements among the first half of the coordinates, and choosing one of the two disjoint intervals of size $d/4$ in the second half of the coordinates.

We prove that for any parameter $\eta$, there exists an adversary such that EXP2 with parameter $\eta$ has a regret of at least $\frac{nd}{16} \tanh\big(\frac{\eta d}{8}\big)$, and that there exists another adversary such that its regret is at least $\min\big( \frac{d \log 2}{12 \eta},\ \frac{nd}{12} \big)$. As a consequence, we have
$$\sup_{\text{adversary}} R_n \ \ge\ \max\Big( \frac{nd}{16} \tanh\Big(\frac{\eta d}{8}\Big),\ \min\Big( \frac{d \log 2}{12 \eta},\ \frac{nd}{12} \Big) \Big) \ \ge\ \min\Big( \max\Big( \frac{nd}{16} \tanh\Big(\frac{\eta d}{8}\Big),\ \frac{d \log 2}{12 \eta} \Big),\ \frac{nd}{12} \Big) \ \ge\ \min\Big( A_\eta,\ \frac{nd}{12} \Big),$$

with
$$A_\eta \ \ge\ \min_{\eta \in [0, +\infty)} \max\Big( \frac{nd}{16} \tanh\Big(\frac{\eta d}{8}\Big),\ \frac{d \log 2}{12 \eta} \Big).$$
For $\eta d \ge 8$ we have $\tanh\big(\frac{\eta d}{8}\big) \ge \tanh(1)$, while for $\eta d < 8$, since $\tanh$ is concave and increasing on $\mathbb{R}_+$, we have $\tanh\big(\frac{\eta d}{8}\big) \ge \frac{\eta d}{8} \tanh(1)$. Hence
$$A_\eta \ \ge\ \min\Big( \frac{nd}{16} \tanh(1),\ \min_{\eta d < 8} \max\Big( \frac{n \eta d^2}{128} \tanh(1),\ \frac{d \log 2}{12 \eta} \Big) \Big) \ \ge\ \min\Big( \frac{nd}{16} \tanh(1),\ \sqrt{\frac{n d^3 \log 2 \tanh(1)}{1536}} \Big) \ \ge\ \min\big( 0.04\, nd,\ 0.01\, d^{3/2} \sqrt{n} \big).$$
As $n \ge d$, this implies the stated lower bound.

First we prove the lower bound $\frac{nd}{16} \tanh\big(\frac{\eta d}{8}\big)$. Define the following adversary:
$$z_t(i) = \begin{cases} 1 & \text{if } i \in \{d/2+1, \dots, d/2+d/4\} \text{ and } t \text{ odd}, \\ 1 & \text{if } i \in \{d/2+d/4+1, \dots, d\} \text{ and } t \text{ even}, \\ 0 & \text{otherwise.} \end{cases}$$
This adversary always puts a zero loss on the first half of the coordinates, and alternates between a loss of $d/4$ for choosing the first interval in the second half of the coordinates and the second interval. At the beginning of odd rounds, any vertex $a \in A$ has the same cumulative loss, and thus EXP2 picks its expert uniformly at random, which yields an expected cumulative loss equal to $nd/16$. On the other hand, at even rounds the probability distribution over the vertices $a \in A$ is always the same. More precisely, the probability of selecting a vertex which contains the interval $\{d/2+d/4+1, \dots, d\}$ (i.e., the interval with a $d/4$ loss at this round) is exactly $\frac{1}{1 + \exp(-\eta d / 4)}$. This adds an expected cumulative loss equal to $\frac{nd}{8} \cdot \frac{1}{1 + \exp(-\eta d/4)}$. Finally, note that the loss of any fixed vertex is $nd/8$. Thus we obtain
$$R_n = \frac{nd}{16} + \frac{nd}{8} \cdot \frac{1}{1 + \exp(-\eta d / 4)} - \frac{nd}{8} = \frac{nd}{16} \tanh\Big(\frac{\eta d}{8}\Big).$$

It remains to show a lower bound proportional to $1/\eta$. To this end, we consider a different adversary defined by
$$z_t(i) = \begin{cases} 1 - \varepsilon & \text{if } i \le d/4, \\ 1 & \text{if } i \in \{d/4+1, \dots, d/2\}, \\ 0 & \text{otherwise,} \end{cases}$$
for some fixed $\varepsilon > 0$. Note that against this adversary the choice of the interval in the second half of the components does not matter. Moreover, by symmetry, the weight of any coordinate in $\{d/4+1, \dots, d/2\}$ is the same at any round. Finally, note that this weight is decreasing with $t$. Thus we have the following bound (in the big sums, $i$ denotes the number of components selected in the first

$d/4$ components):
$$R_n \ \ge\ n\varepsilon\, \frac{\sum_{i=0}^{d/4} \big( \frac{d}{4} - i \big) \binom{d/4}{i}^2 \exp(\eta n \varepsilon i)}{\sum_{i=0}^{d/4} \binom{d/4}{i}^2 \exp(\eta n \varepsilon i)},$$
since a vertex with $i$ components in the first $d/4$ coordinates suffers an instantaneous regret $\varepsilon(d/4 - i)$, there are $\binom{d/4}{i}^2$ such vertices, their weight at round $t$ is proportional to $\exp(\eta (t-1) \varepsilon i)$, and (by the monotonicity argument in the proof of Lemma 3) the resulting per-round regret is smallest with the weights of the last round. Thus, taking $\varepsilon = \min\big( \frac{\log 2}{\eta n},\ 1 \big)$ yields
$$R_n \ \ge\ \min\Big( \frac{d \log 2}{4\eta},\ \frac{nd}{4} \Big)\, \frac{\sum_{i=0}^{d/4} \big( 1 - \frac{i}{d/4} \big) \binom{d/4}{i}^2 c^{\,i}}{\sum_{i=0}^{d/4} \binom{d/4}{i}^2 c^{\,i}} \ \ge\ \min\Big( \frac{d \log 2}{12 \eta},\ \frac{nd}{12} \Big), \qquad \text{where } c = \min\big( 2,\ e^{\eta n} \big) \in [1, 2],$$
and where the last inequality follows from Lemma 3 in the appendix (applied with $k = d/4$). This concludes the proof of the lower bound.

B Proof of Theorem 5.

The structure of the proof is similar to that of [3, Theorem 30], which deals with the simple case $m = 1$. The main conceptual difference is contained in Lemma 4, which is at the heart of this new proof. The main argument follows the line of standard lower bounds for bandit problems, see, e.g., [10]: the worst-case regret is bounded from below by taking an average over a conveniently chosen class of strategies for the adversary. Then, by Pinsker's inequality, the problem is reduced to computing the Kullback-Leibler divergence of certain distributions. The main technical argument, given in Lemma 4, provides manageable bounds for the relevant Kullback-Leibler divergence.

For the sake of simplifying notation, we assume that $d$ is a multiple of $m$, and we identify $\{0,1\}^d$ with the set of $m \times d/m$ binary matrices $\{0,1\}^{m \times \frac{d}{m}}$. We consider the following set of actions:
$$A = \Big\{ a \in \{0,1\}^{m \times \frac{d}{m}} :\ \sum_{j=1}^{d/m} a(i,j) = 1,\ \forall i \in \{1, \dots, m\} \Big\}.$$
In other words, the player is playing in parallel $m$ finite games with $d/m$ actions each. From step 1 to step 3 we restrict our attention to the case of deterministic strategies for the player, and we show how to extend the results to arbitrary strategies in step 4.

First step: definitions.

We denote by $I_{i,t} \in \{1, \dots, d/m\}$ the random variable such that $a_t(i, I_{i,t}) = 1$; that is, $I_{i,t}$ is the action chosen at time $t$ in the $i$-th game. Moreover, let $\tau$ be drawn uniformly at random from $\{1, \dots, n\}$.

In this proof we consider random adversaries, indexed by $A$. More precisely, for $\alpha \in A$, we define the $\alpha$-adversary as follows: for any $t \in \{1, \dots, n\}$, $z_t(i,j)$ is drawn from a Bernoulli distribution with parameter $\frac{1}{2} - \varepsilon\, \alpha(i,j)$. In other words, against adversary $\alpha$, in the $i$-th game, the action $j$ such that $\alpha(i,j) = 1$ has a loss slightly smaller in expectation than the other actions. We denote by $\mathbb{E}_\alpha$ integration with respect to the loss generation process of the $\alpha$-adversary. We write $P_{i,\alpha}$ for the probability distribution of $\alpha(i, I_{i,\tau})$ when the player faces the $\alpha$-adversary. Note that $P_{i,\alpha}(1) = \mathbb{E}_\alpha \frac{1}{n} \sum_{t=1}^n \mathbf{1}_{\alpha(i, I_{i,t}) = 1}$; hence, against the $\alpha$-adversary, we have
$$R_n = \mathbb{E}_\alpha \sum_{t=1}^n \sum_{i=1}^m \varepsilon \big( 1 - \mathbf{1}_{\alpha(i, I_{i,t}) = 1} \big) = n \varepsilon \sum_{i=1}^m \big( 1 - P_{i,\alpha}(1) \big),$$
which implies (since the maximum is larger than the mean)
$$\max_{\alpha \in A} R_n \ \ge\ n \varepsilon \sum_{i=1}^m \Big( 1 - \frac{1}{(d/m)^m} \sum_{\alpha \in A} P_{i,\alpha}(1) \Big). \qquad (9)$$

Second step: information inequality.

Let $P'_{i,\alpha}$ be the probability distribution of $\alpha(i, I_{i,\tau})$ against the adversary which plays like the $\alpha$-adversary, except that in the $i$-th game the losses of all coordinates are drawn from a Bernoulli distribution of parameter $1/2$. We call it the $(i,\alpha)$-adversary, and we denote by $\mathbb{E}'_{i,\alpha}$ integration with respect to its loss generation process. By Pinsker's inequality,
$$P_{i,\alpha}(1) \ \le\ P'_{i,\alpha}(1) + \sqrt{\tfrac{1}{2}\, KL\big( P'_{i,\alpha},\ P_{i,\alpha} \big)},$$
where $KL$ denotes the Kullback-Leibler divergence. Moreover, note that, by symmetry of the adversaries,
$$\frac{1}{(d/m)^m} \sum_{\alpha \in A} P'_{i,\alpha}(1) \ =\ \frac{1}{(d/m)^m} \sum_{\alpha \in A} \mathbb{E}'_{i,\alpha}\, \alpha(i, I_{i,\tau}) \ =\ \frac{m}{d}, \qquad (10)$$
since the distribution of $I_{i,\tau}$ under the $(i,\alpha)$-adversary does not depend on $\alpha(i, \cdot)$, and averaging over the $d/m$ possible values of $\alpha(i, \cdot)$ then gives $m/d$. Thus, thanks to the concavity of the square root,
$$\frac{1}{(d/m)^m} \sum_{\alpha \in A} P_{i,\alpha}(1) \ \le\ \frac{m}{d} + \sqrt{\frac{1}{2 (d/m)^m} \sum_{\alpha \in A} KL\big( P'_{i,\alpha},\ P_{i,\alpha} \big)}. \qquad (11)$$

Third step: computation of $KL(P'_{i,\alpha}, P_{i,\alpha})$ with the chain rule.

Note that since the forecaster is deterministic, the sequence of observed losses up to time $n$, $W_n \in \{0, \dots, m\}^n$, uniquely determines the empirical distribution of plays; in particular, the probability distribution of $\alpha(i, I_{i,\tau})$ conditionally on $W_n$ is the same for any adversary. Thus, if we denote by $\mathbb{P}^n_\alpha$ (respectively $\mathbb{P}^n_{i,\alpha}$) the probability distribution of $W_n$ when the forecaster plays against the $\alpha$-adversary (respectively the $(i,\alpha)$-adversary), then one can easily prove that $KL(P'_{i,\alpha}, P_{i,\alpha}) \le KL(\mathbb{P}^n_{i,\alpha}, \mathbb{P}^n_\alpha)$. Now we use the chain rule for the Kullback-Leibler divergence iteratively, to introduce the probability distributions $\mathbb{P}^t_\alpha$ of the observed losses $W_t$ up to time $t$. More precisely, we have
$$KL(\mathbb{P}^n_{i,\alpha}, \mathbb{P}^n_\alpha) = KL(\mathbb{P}^1_{i,\alpha}, \mathbb{P}^1_\alpha) + \sum_{t=2}^n \sum_{w_{t-1} \in \{0,\dots,m\}^{t-1}} \mathbb{P}^{t-1}_{i,\alpha}(w_{t-1})\, KL\big( \mathbb{P}^t_{i,\alpha}(\cdot \mid w_{t-1}),\ \mathbb{P}^t_\alpha(\cdot \mid w_{t-1}) \big)$$
$$= \sum_{t=1}^n \sum_{w_{t-1} :\, \alpha(i, I_{i,t}) = 1} \mathbb{P}^{t-1}_{i,\alpha}(w_{t-1})\, KL\big( \mathcal{B}_{w_{t-1}},\ \mathcal{B}'_{w_{t-1}} \big)$$
(with the convention that the $t = 1$ term is unconditioned; recall that the forecaster is deterministic, so $I_{i,t}$ is a function of $w_{t-1}$), where $\mathcal{B}_{w_{t-1}}$ and $\mathcal{B}'_{w_{t-1}}$ are sums of $m$ independent Bernoulli distributions with parameters in $\{1/2,\ 1/2 - \varepsilon\}$, and such that the number of Bernoullis with parameter $1/2$ in $\mathcal{B}_{w_{t-1}}$ is equal to the number of Bernoullis with parameter $1/2$ in $\mathcal{B}'_{w_{t-1}}$ plus one. Now, using Lemma 4 (see below), we obtain
$$KL\big( \mathcal{B}_{w_{t-1}},\ \mathcal{B}'_{w_{t-1}} \big) \ \le\ \frac{8 \varepsilon^2}{(1 - 4 \varepsilon^2)\, m}.$$
In particular, this gives
$$KL(\mathbb{P}^n_{i,\alpha}, \mathbb{P}^n_\alpha) \ \le\ \frac{8 \varepsilon^2}{(1 - 4 \varepsilon^2)\, m}\, \mathbb{E}'_{i,\alpha} \sum_{t=1}^n \mathbf{1}_{\alpha(i, I_{i,t}) = 1} \ =\ \frac{8 \varepsilon^2 n}{(1 - 4 \varepsilon^2)\, m}\, P'_{i,\alpha}(1).$$
Summing and plugging this into (11), we obtain, again thanks to (10), for $\varepsilon \le 1/8$,
$$\frac{1}{(d/m)^m} \sum_{\alpha \in A} \sum_{i=1}^m P_{i,\alpha}(1) \ \le\ \frac{m^2}{d} + \varepsilon\, m \sqrt{\frac{8n}{d}}.$$
To conclude the proof of (8) for deterministic players, one needs to plug this last equation into (9), along with straightforward computations (choosing $\varepsilon$ of order $\sqrt{d/n}$; note that $\varepsilon \le 1/8$ is then satisfied since $n \ge d$).

Fourth step: Fubini's theorem to handle non-deterministic players.

Consider now a randomized player, and let $\mathbb{E}_{\mathrm{rand}}$ denote the expectation with respect to the randomization of the player. Then, thanks to Fubini's theorem, one has
$$\frac{1}{(d/m)^m} \sum_{\alpha \in A} \mathbb{E}_\alpha \Big( \sum_{t=1}^n a_t^T z_t - \sum_{t=1}^n \alpha^T z_t \Big) \ =\ \mathbb{E}_{\mathrm{rand}}\, \frac{1}{(d/m)^m} \sum_{\alpha \in A} \mathbb{E}_\alpha \Big( \sum_{t=1}^n a_t^T z_t - \sum_{t=1}^n \alpha^T z_t \Big).$$
Now note that if we fix the realization of the forecaster's randomization, then the results of the previous steps apply; in particular, one can lower bound $\frac{1}{(d/m)^m} \sum_{\alpha \in A} \mathbb{E}_\alpha \big( \sum_{t=1}^n a_t^T z_t - \sum_{t=1}^n \alpha^T z_t \big)$ as before (note that $\alpha$ is the optimal action in expectation against the $\alpha$-adversary).

C Technical lemmas.

Lemma 3. For any $k \in \mathbb{N}^*$ and any $1 \le c \le 2$, we have
$$\frac{\sum_{i=0}^k (1 - i/k) \binom{k}{i}^2 c^{\,i}}{\sum_{i=0}^k \binom{k}{i}^2 c^{\,i}} \ \ge\ \frac{1}{3}.$$

Proof. Let $f(c)$ denote the expression on the left-hand side of the inequality. Introduce the random variable $X$, equal to $i \in \{0, \dots, k\}$ with probability $\binom{k}{i}^2 c^{\,i} / \sum_{j=0}^k \binom{k}{j}^2 c^{\,j}$. We have
$$f'(c) = \frac{1}{c} \Big( \mathbb{E}[X]\, \mathbb{E}[1 - X/k] - \mathbb{E}\big[ X (1 - X/k) \big] \Big) = -\frac{1}{ck}\, \mathrm{Var}\, X \ \le\ 0.$$
So the function $f$ is decreasing on $[1, 2]$, and it therefore suffices to consider $c = 2$. Numerator and denominator of the left-hand side differ only by the factor $(1 - i/k)$; a lower bound can thus be obtained by showing that the terms with $i$ close to $k$ are not essential to the value of the denominator. To prove this, we may use Stirling's formula, which implies that for any $k \ge 2$ and $i \in \{1, \dots, k-1\}$,
$$e^{-1/6}\, \frac{k^k}{i^i (k-i)^{k-i}} \sqrt{\frac{k}{2\pi i (k-i)}} \ <\ \binom{k}{i} \ <\ e^{1/12}\, \frac{k^k}{i^i (k-i)^{k-i}} \sqrt{\frac{k}{2\pi i (k-i)}}.$$
Introduce $\lambda = i/k$ and $\chi(\lambda) = \frac{2^\lambda}{\lambda^{2\lambda} (1-\lambda)^{2(1-\lambda)}}$, so that
$$[\chi(\lambda)]^k\, \frac{e^{-1/3}}{2\pi k \lambda (1-\lambda)} \ <\ \binom{k}{i}^2 2^i \ <\ [\chi(\lambda)]^k\, \frac{e^{1/6}}{2\pi k \lambda (1-\lambda)}. \qquad (12)$$
Lemma 3 can be numerically verified for $k \le 10^6$; we now consider $k > 10^6$. The function $\chi$ is decreasing on $[0.656, 1]$, and the ratio $\chi(0.657)/\chi(0.666) > 1$ is large enough that $[\chi(0.666)]^k < [\chi(0.657)]^k / (1000\, k^3)$ for $k > 10^6$. Hence, for $\lambda \ge 0.666$ and $k > 10^6$, combining with (12) and the fact that there exists $i' \in \{1, \dots, k-1\}$ such that $i'/k \in [0.656, 0.657]$,
$$\binom{k}{i}^2 2^i \ <\ [\chi(0.666)]^k\, \frac{e^{1/6}}{2\pi k \lambda (1-\lambda)} \ <\ \frac{1}{1000\, k}\, [\chi(0.657)]^k\, \frac{2 e^{-1/3}}{\pi k} \ \le\ \frac{1}{1000\, k} \max_{i' \le 0.666 k} \binom{k}{i'}^2 2^{i'}. \qquad (13)$$
Inequality (13) implies that
$$\sum_{i \ge 0.666 k} \binom{k}{i}^2 2^i \ <\ \frac{1}{1000} \max_{i' \le 0.666 k} \binom{k}{i'}^2 2^{i'} \ \le\ \frac{1}{1000} \sum_{i < 0.666 k} \binom{k}{i}^2 2^i.$$
To conclude, introducing $A_0 = \sum_{0 \le i < 0.666 k} \binom{k}{i}^2 2^i$, we have
$$\frac{\sum_{i=0}^k (1 - i/k) \binom{k}{i}^2 2^i}{\sum_{i=0}^k \binom{k}{i}^2 2^i} \ \ge\ \frac{(1 - 0.666)\, A_0}{A_0 + A_0 / 1000} \ >\ \frac{1}{3}.$$
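The proof appeals to a numerical verification for $k \le 10^6$; a direct exact-arithmetic check of the statement at the worst case $c = 2$ (our own snippet, over a small range of $k$) looks as follows:

```python
from math import comb
from fractions import Fraction

def f(k, c):
    """Left-hand side of Lemma 3: sum (1 - i/k) C(k,i)^2 c^i / sum C(k,i)^2 c^i."""
    num = sum(Fraction(k - i, k) * comb(k, i) ** 2 * c ** i for i in range(k + 1))
    den = sum(comb(k, i) ** 2 * c ** i for i in range(k + 1))
    return num / den

# f is decreasing in c, so c = 2 is the worst case on [1, 2];
# equality f(1, 2) = 1/3 shows the constant is tight at k = 1.
assert all(f(k, 2) >= Fraction(1, 3) for k in range(1, 200))
```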

Lemma 4. Let $l$ and $n$ be integers with $1 \le \frac{n}{2} \le l \le n$. Let $p, p', q, p_1, \dots, p_n$ be real numbers in $(0,1)$ with $q \in \{p, p'\}$, $p_1 = \dots = p_l = q$ and $p_{l+1} = \dots = p_n$. Let $\mathcal{B}$ (respectively $\mathcal{B}'$) be the sum of $n+1$ independent Bernoulli distributions with parameters $p, p_1, \dots, p_n$ (respectively $p', p_1, \dots, p_n$). We have
$$KL(\mathcal{B}, \mathcal{B}') \ \le\ \frac{2 (p - p')^2}{(1 - p')\, (n+2)\, q}.$$

Proof. Let $Z, Z', Z_1, \dots, Z_n$ be independent Bernoulli random variables with parameters $p, p', p_1, \dots, p_n$. Define $S = \sum_{i=1}^l Z_i$, $T = \sum_{i=l+1}^n Z_i$, and $V = Z + S$. By a slight (and usual) abuse of notation, we use $KL$ to denote the Kullback-Leibler divergence both between probability distributions and between random variables. Then we may write (the inequality is an easy consequence of the chain rule for the Kullback-Leibler divergence)
$$KL(\mathcal{B}, \mathcal{B}') = KL(Z + S + T,\ Z' + S + T) \ \le\ KL\big( (Z + S,\ T),\ (Z' + S,\ T) \big) = KL(Z + S,\ Z' + S).$$
Let $s_k = \mathbb{P}(S = k)$ for $k = -1, 0, \dots, l+1$ (so that $s_{-1} = s_{l+1} = 0$). Using the equality
$$s_k = \binom{l}{k} q^k (1 - q)^{l - k} = \frac{q}{1 - q} \cdot \frac{l - k + 1}{k}\, s_{k-1},$$
which holds for $1 \le k \le l$, we obtain
$$KL(Z + S,\ Z' + S) = \sum_{k=0}^{l+1} \mathbb{P}(V = k) \log\frac{\mathbb{P}(Z + S = k)}{\mathbb{P}(Z' + S = k)} = \sum_{k=0}^{l+1} \mathbb{P}(V = k) \log\frac{p\, s_{k-1} + (1 - p)\, s_k}{p'\, s_{k-1} + (1 - p')\, s_k}$$
$$= \mathbb{E} \log \frac{(p - q)\, V + (1 - p)\, q\, (l + 1)}{(p' - q)\, V + (1 - p')\, q\, (l + 1)}. \qquad (14)$$

First case: $q = p'$. By Jensen's inequality, using that $\mathbb{E} V = p + l p'$ in this case, we get
$$KL(Z + S,\ Z' + S) \ \le\ \log \frac{(p - p')\, \mathbb{E} V + (1 - p)\, p' (l+1)}{(1 - p')\, p'\, (l+1)} = \log\Big( 1 + \frac{(p - p')^2}{(1 - p')\, p'\, (l+1)} \Big) \ \le\ \frac{(p - p')^2}{(1 - p')\, p'\, (l+1)}.$$

Second case: $q = p$. In this case, $V$ is a binomial distribution with parameters $l+1$ and $p$. From (14), we have
$$KL(Z + S,\ Z' + S) = -\mathbb{E} \log \frac{(p' - p)\, V + (1 - p')\, p\, (l+1)}{(1 - p)\, p\, (l+1)} = -\mathbb{E} \log\Big( 1 + \frac{(p' - p)(V - \mathbb{E} V)}{(1 - p)\, p\, (l+1)} \Big). \qquad (15)$$
To conclude, we will use the following lemma.

Lemma 5. The following inequality holds for any $x \ge x_0$ with $x_0 \in (0, 1)$:
$$\log(x) \ \ge\ (x - 1) - \frac{(x - 1)^2}{2 x_0}.$$

Proof. Introduce $f(x) = -(x - 1) + \frac{(x - 1)^2}{2 x_0} + \log(x)$. We have
$$f'(x) = -1 + \frac{x - 1}{x_0} + \frac{1}{x} = (x - 1)\, \frac{x - x_0}{x_0\, x},$$
so $f'$ is negative on $(x_0, 1)$ and positive on $(1, +\infty)$. Since $f(1) = 0$, this leads to $f$ nonnegative on $[x_0, +\infty)$.

Finally, from Lemma 5 and (15), using $x_0 = \frac{1 - p'}{1 - p}$ (the argument of the logarithm in (15) always lies in $[x_0, +\infty)$), and since $V - \mathbb{E} V$ has zero mean, we obtain
$$KL(Z + S,\ Z' + S) \ \le\ \frac{(p - p')^2\, \mathbb{E}\big[ (V - \mathbb{E} V)^2 \big]}{2 x_0 \big( (1 - p)\, p\, (l+1) \big)^2} = \frac{(p - p')^2\, (l+1)\, p\, (1 - p)}{2\, \frac{1 - p'}{1 - p}\, \big( (1 - p)\, p\, (l+1) \big)^2} = \frac{(p - p')^2}{2\, (1 - p')\, (l+1)\, p}.$$
In both cases the denominator contains the factor $(1 - p')\, (l+1)\, q$, and since $l + 1 \ge \frac{n+2}{2}$, the bound of the lemma follows.

Acknowledgements. G. Lugosi is supported by the Spanish Ministry of Science and Technology (grant MTM) and by the PASCAL2 Network of Excellence (EC grant).

References

[1] J. Abernethy, E. Hazan, and A. Rakhlin. Competing in the dark: an efficient algorithm for bandit linear optimization. In Proceedings of the 21st Annual Conference on Learning Theory (COLT), 2008.

[2] J.-Y. Audibert and S. Bubeck. Minimax policies for adversarial and stochastic bandits. In Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.

[3] J.-Y. Audibert and S. Bubeck. Regret bounds and minimax policies under partial monitoring. Journal of Machine Learning Research, 11:2785-2836, 2010.

[4] P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire. The non-stochastic multi-armed bandit problem. SIAM Journal on Computing, 32(1):48-77, 2002.

[5] B. Awerbuch and R. Kleinberg. Adaptive routing with end-to-end feedback: distributed learning and geometric approaches. In STOC'04: Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing, 2004.

[6] A. Beck and M. Teboulle. Mirror descent and nonlinear projected subgradient methods for convex optimization. Operations Research Letters, 31(3):167-175, 2003.

[7] S. Bubeck. Introduction to online optimization. Lecture notes, 2011.

[8] S. Bubeck and N. Cesa-Bianchi. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning, 5(1), 2012.

[9] S. Bubeck, N. Cesa-Bianchi, and S. M. Kakade. Towards minimax policies for online linear optimization with bandit feedback. Arxiv preprint, 2012.

[10] N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.

[11] N. Cesa-Bianchi and G. Lugosi. Combinatorial bandits. Journal of Computer and System Sciences, 2011. To appear.

[12] V. Dani, T. Hayes, and S. Kakade. The price of bandit information for online optimization. In Advances in Neural Information Processing Systems (NIPS), volume 20, 2008.

[13] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55:119-139, 1997.

[14] C. Gentile and M. Warmuth. Linear hinge loss and average margin. In Advances in Neural Information Processing Systems (NIPS), 1998.

[15] A. Grove, N. Littlestone, and D. Schuurmans. General convergence results for linear discriminant updates. Machine Learning, 43:173-210, 2001.

[16] A. György, T. Linder, G. Lugosi, and G. Ottucsák. The on-line shortest path problem under partial monitoring. Journal of Machine Learning Research, 8:2369-2403, 2007.

[17] E. Hazan. The convex optimization approach to regret minimization. In Optimization for Machine Learning (S. Sra, S. Nowozin, and S. Wright, eds.), MIT Press, 2011.

[18] E. Hazan, S. Kale, and M. Warmuth. Learning rotations with little regret. In Proceedings of the 23rd Annual Conference on Learning Theory (COLT), 2010.

[19] D. P. Helmbold and M. Warmuth. Learning permutations with exponential weights. Journal of Machine Learning Research, 10:1705-1736, 2009.

[20] M. Herbster and M. Warmuth. Tracking the best expert. Machine Learning, 32:151-178, 1998.

[21] J.-B. Hiriart-Urruty and C. Lemaréchal. Fundamentals of Convex Analysis. Springer, 2001.

[22] A. Kalai and S. Vempala. Efficient algorithms for online decision problems. Journal of Computer and System Sciences, 71:291-307, 2005.

[23] S. Kale, L. Reyzin, and R. Schapire. Non-stochastic bandit slate problems. In Advances in Neural Information Processing Systems (NIPS), 2010.

[24] J. Kiefer and J. Wolfowitz. Stochastic estimation of the maximum of a regression function. Annals of Mathematical Statistics, 23:462-466, 1952.

[25] J. Kivinen and M. Warmuth. Relative loss bounds for multidimensional regression problems. Machine Learning, 45:301-329, 2001.

[26] W. Koolen, M. Warmuth, and J. Kivinen. Hedging structured concepts. In Proceedings of the 23rd Annual Conference on Learning Theory (COLT), 2010.

[27] H. McMahan and A. Blum. Online geometric optimization in the bandit setting against an adaptive adversary. In Proceedings of the 17th Annual Conference on Learning Theory (COLT), 2004.

[28] A. Nemirovski. Efficient methods for large-scale convex optimization problems. Ekonomika i Matematicheskie Metody. In Russian.

[29] A. Nemirovski and D. Yudin. Problem Complexity and Method Efficiency in Optimization. Wiley Interscience, 1983.

[30] A. Rakhlin. Lecture notes on online learning, 2009.

[31] H. Robbins and S. Monro. A stochastic approximation method. Annals of Mathematical Statistics, 22:400-407, 1951.

[32] A. Schrijver. Combinatorial Optimization. Springer, 2003.

[33] S. Shalev-Shwartz. Online Learning: Theory, Algorithms, and Applications. Ph.D. thesis, The Hebrew University of Jerusalem, 2007.

[34] E. Takimoto and M. Warmuth. Path kernels and multiplicative updates. Journal of Machine Learning Research, 4:773-818, 2003.

[35] T. Uchiya, A. Nakamura, and M. Kudo. Algorithms for adversarial bandit problems with multiple plays. In Proceedings of the 21st International Conference on Algorithmic Learning Theory (ALT), 2010.

[36] M. Warmuth and D. Kuzmin. Randomized online PCA algorithms with regret bounds that are logarithmic in the dimension. Journal of Machine Learning Research, 9:2287-2320, 2008.

[37] M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the Twentieth International Conference on Machine Learning (ICML), 2003.


More information

Foundations of Arithmetic

Foundations of Arithmetic Foundatons of Arthmetc Notaton We shall denote the sum and product of numbers n the usual notaton as a 2 + a 2 + a 3 + + a = a, a 1 a 2 a 3 a = a The notaton a b means a dvdes b,.e. ac = b where c s an

More information

Lecture 20: November 7

Lecture 20: November 7 0-725/36-725: Convex Optmzaton Fall 205 Lecturer: Ryan Tbshran Lecture 20: November 7 Scrbes: Varsha Chnnaobreddy, Joon Sk Km, Lngyao Zhang Note: LaTeX template courtesy of UC Berkeley EECS dept. Dsclamer:

More information

Finding Primitive Roots Pseudo-Deterministically

Finding Primitive Roots Pseudo-Deterministically Electronc Colloquum on Computatonal Complexty, Report No 207 (205) Fndng Prmtve Roots Pseudo-Determnstcally Ofer Grossman December 22, 205 Abstract Pseudo-determnstc algorthms are randomzed search algorthms

More information

The Experts/Multiplicative Weights Algorithm and Applications

The Experts/Multiplicative Weights Algorithm and Applications Chapter 2 he Experts/Multplcatve Weghts Algorthm and Applcatons We turn to the problem of onlne learnng, and analyze a very powerful and versatle algorthm called the multplcatve weghts update algorthm.

More information

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal Inner Product Defnton 1 () A Eucldean space s a fnte-dmensonal vector space over the reals R, wth an nner product,. Defnton 2 (Inner Product) An nner product, on a real vector space X s a symmetrc, blnear,

More information

Complete subgraphs in multipartite graphs

Complete subgraphs in multipartite graphs Complete subgraphs n multpartte graphs FLORIAN PFENDER Unverstät Rostock, Insttut für Mathematk D-18057 Rostock, Germany Floran.Pfender@un-rostock.de Abstract Turán s Theorem states that every graph G

More information

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin Proceedngs of the 007 Wnter Smulaton Conference S G Henderson, B Bller, M-H Hseh, J Shortle, J D Tew, and R R Barton, eds LOW BIAS INTEGRATED PATH ESTIMATORS James M Calvn Department of Computer Scence

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Communication Complexity 16:198: February Lecture 4. x ij y ij

Communication Complexity 16:198: February Lecture 4. x ij y ij Communcaton Complexty 16:198:671 09 February 2010 Lecture 4 Lecturer: Troy Lee Scrbe: Rajat Mttal 1 Homework problem : Trbes We wll solve the thrd queston n the homework. The goal s to show that the nondetermnstc

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Ensemble Methods: Boosting

Ensemble Methods: Boosting Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement

More information

TAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES

TAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES TAIL BOUNDS FOR SUMS OF GEOMETRIC AND EXPONENTIAL VARIABLES SVANTE JANSON Abstract. We gve explct bounds for the tal probabltes for sums of ndependent geometrc or exponental varables, possbly wth dfferent

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

Appendix for Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

Appendix for Causal Interaction in Factorial Experiments: Application to Conjoint Analysis A Appendx for Causal Interacton n Factoral Experments: Applcaton to Conjont Analyss Mathematcal Appendx: Proofs of Theorems A. Lemmas Below, we descrbe all the lemmas, whch are used to prove the man theorems

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Min Cut, Fast Cut, Polynomial Identities

Min Cut, Fast Cut, Polynomial Identities Randomzed Algorthms, Summer 016 Mn Cut, Fast Cut, Polynomal Identtes Instructor: Thomas Kesselhem and Kurt Mehlhorn 1 Mn Cuts n Graphs Lecture (5 pages) Throughout ths secton, G = (V, E) s a mult-graph.

More information

Minimax Policies for Combinatorial Prediction Games

Minimax Policies for Combinatorial Prediction Games Minimax Policies for Combinatorial Prediction Games Jean-Yves Audibert Imagine, Univ. Paris Est, and Sierra, CNRS/ENS/INRIA, Paris, France audibert@imagine.enpc.fr Sébastien Bubeck Centre de Recerca Matemàtica

More information

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models

More information

Affine transformations and convexity

Affine transformations and convexity Affne transformatons and convexty The purpose of ths document s to prove some basc propertes of affne transformatons nvolvng convex sets. Here are a few onlne references for background nformaton: http://math.ucr.edu/

More information

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011 Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected

More information

The Geometry of Logit and Probit

The Geometry of Logit and Probit The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.

More information

NP-Completeness : Proofs

NP-Completeness : Proofs NP-Completeness : Proofs Proof Methods A method to show a decson problem Π NP-complete s as follows. (1) Show Π NP. (2) Choose an NP-complete problem Π. (3) Show Π Π. A method to show an optmzaton problem

More information

Perfect Competition and the Nash Bargaining Solution

Perfect Competition and the Nash Bargaining Solution Perfect Competton and the Nash Barganng Soluton Renhard John Department of Economcs Unversty of Bonn Adenauerallee 24-42 53113 Bonn, Germany emal: rohn@un-bonn.de May 2005 Abstract For a lnear exchange

More information

CSC 411 / CSC D11 / CSC C11

CSC 411 / CSC D11 / CSC C11 18 Boostng s a general strategy for learnng classfers by combnng smpler ones. The dea of boostng s to take a weak classfer that s, any classfer that wll do at least slghtly better than chance and use t

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

A Note on Bound for Jensen-Shannon Divergence by Jeffreys

A Note on Bound for Jensen-Shannon Divergence by Jeffreys OPEN ACCESS Conference Proceedngs Paper Entropy www.scforum.net/conference/ecea- A Note on Bound for Jensen-Shannon Dvergence by Jeffreys Takuya Yamano, * Department of Mathematcs and Physcs, Faculty of

More information

Lecture 4: September 12

Lecture 4: September 12 36-755: Advanced Statstcal Theory Fall 016 Lecture 4: September 1 Lecturer: Alessandro Rnaldo Scrbe: Xao Hu Ta Note: LaTeX template courtesy of UC Berkeley EECS dept. Dsclamer: These notes have not been

More information

MAXIMUM A POSTERIORI TRANSDUCTION

MAXIMUM A POSTERIORI TRANSDUCTION MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,

More information

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper Games of Threats Elon Kohlberg Abraham Neyman Workng Paper 18-023 Games of Threats Elon Kohlberg Harvard Busness School Abraham Neyman The Hebrew Unversty of Jerusalem Workng Paper 18-023 Copyrght 2017

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Excess Error, Approximation Error, and Estimation Error

Excess Error, Approximation Error, and Estimation Error E0 370 Statstcal Learnng Theory Lecture 10 Sep 15, 011 Excess Error, Approxaton Error, and Estaton Error Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton So far, we have consdered the fnte saple

More information

Lecture 17 : Stochastic Processes II

Lecture 17 : Stochastic Processes II : Stochastc Processes II 1 Contnuous-tme stochastc process So far we have studed dscrete-tme stochastc processes. We studed the concept of Makov chans and martngales, tme seres analyss, and regresson analyss

More information

The Expectation-Maximization Algorithm

The Expectation-Maximization Algorithm The Expectaton-Maxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hgh-level verson of the algorthm.

More information

ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION

ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION Advanced Mathematcal Models & Applcatons Vol.3, No.3, 2018, pp.215-222 ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EUATION

More information

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS BOUNDEDNESS OF THE IESZ TANSFOM WITH MATIX A WEIGHTS Introducton Let L = L ( n, be the functon space wth norm (ˆ f L = f(x C dx d < For a d d matrx valued functon W : wth W (x postve sem-defnte for all

More information

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora prnceton unv. F 13 cos 521: Advanced Algorthm Desgn Lecture 3: Large devatons bounds and applcatons Lecturer: Sanjeev Arora Scrbe: Today s topc s devaton bounds: what s the probablty that a random varable

More information

Vapnik-Chervonenkis theory

Vapnik-Chervonenkis theory Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown

More information

Announcements EWA with ɛ-exploration (recap) Lecture 20: EXP3 Algorithm. EECS598: Prediction and Learning: It s Only a Game Fall 2013.

Announcements EWA with ɛ-exploration (recap) Lecture 20: EXP3 Algorithm. EECS598: Prediction and Learning: It s Only a Game Fall 2013. Lecture 0: EXP3 Algorthm 1 EECS598: Predcton and Learnng: It s Only a Game Fall 013 Prof. Jacob Abernethy Lecture 0: EXP3 Algorthm Scrbe: Zhhao Chen Announcements None 0.1 EWA wth ɛ-exploraton (recap)

More information

Supporting Information

Supporting Information Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to

More information

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso Supplement: Proofs and Techncal Detals for The Soluton Path of the Generalzed Lasso Ryan J. Tbshran Jonathan Taylor In ths document we gve supplementary detals to the paper The Soluton Path of the Generalzed

More information

arxiv:submit/ [cs.lg] 30 Aug 2011

arxiv:submit/ [cs.lg] 30 Aug 2011 No Internal Regret va Neghborhood Watch Dean Foster Department of Statstcs Unversty of Pennsylvana Alexander Rakhln Department of Statstcs Unversty of Pennsylvana arxv:submt/0308560 cs.lg 30 Aug 2011 August

More information

Lecture Space-Bounded Derandomization

Lecture Space-Bounded Derandomization Notes on Complexty Theory Last updated: October, 2008 Jonathan Katz Lecture Space-Bounded Derandomzaton 1 Space-Bounded Derandomzaton We now dscuss derandomzaton of space-bounded algorthms. Here non-trval

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

CSCE 790S Background Results

CSCE 790S Background Results CSCE 790S Background Results Stephen A. Fenner September 8, 011 Abstract These results are background to the course CSCE 790S/CSCE 790B, Quantum Computaton and Informaton (Sprng 007 and Fall 011). Each

More information

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways

More information

Lecture Randomized Load Balancing strategies and their analysis. Probability concepts include, counting, the union bound, and Chernoff bounds.

Lecture Randomized Load Balancing strategies and their analysis. Probability concepts include, counting, the union bound, and Chernoff bounds. U.C. Berkeley CS273: Parallel and Dstrbuted Theory Lecture 1 Professor Satsh Rao August 26, 2010 Lecturer: Satsh Rao Last revsed September 2, 2010 Lecture 1 1 Course Outlne We wll cover a samplng of the

More information

Computing Correlated Equilibria in Multi-Player Games

Computing Correlated Equilibria in Multi-Player Games Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,

More information

The lower and upper bounds on Perron root of nonnegative irreducible matrices

The lower and upper bounds on Perron root of nonnegative irreducible matrices Journal of Computatonal Appled Mathematcs 217 (2008) 259 267 wwwelsevercom/locate/cam The lower upper bounds on Perron root of nonnegatve rreducble matrces Guang-Xn Huang a,, Feng Yn b,keguo a a College

More information

Economics 101. Lecture 4 - Equilibrium and Efficiency

Economics 101. Lecture 4 - Equilibrium and Efficiency Economcs 0 Lecture 4 - Equlbrum and Effcency Intro As dscussed n the prevous lecture, we wll now move from an envronment where we looed at consumers mang decsons n solaton to analyzng economes full of

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Lecture 4: Universal Hash Functions/Streaming Cont d

Lecture 4: Universal Hash Functions/Streaming Cont d CSE 5: Desgn and Analyss of Algorthms I Sprng 06 Lecture 4: Unversal Hash Functons/Streamng Cont d Lecturer: Shayan Oves Gharan Aprl 6th Scrbe: Jacob Schreber Dsclamer: These notes have not been subjected

More information

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI

Logistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Linear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space.

Linear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space. Lnear, affne, and convex sets and hulls In the sequel, unless otherwse specfed, X wll denote a real vector space. Lnes and segments. Gven two ponts x, y X, we defne xy = {x + t(y x) : t R} = {(1 t)x +

More information

On the Multicriteria Integer Network Flow Problem

On the Multicriteria Integer Network Flow Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of

More information

Finding Dense Subgraphs in G(n, 1/2)

Finding Dense Subgraphs in G(n, 1/2) Fndng Dense Subgraphs n Gn, 1/ Atsh Das Sarma 1, Amt Deshpande, and Rav Kannan 1 Georga Insttute of Technology,atsh@cc.gatech.edu Mcrosoft Research-Bangalore,amtdesh,annan@mcrosoft.com Abstract. Fndng

More information

Maximal Margin Classifier

Maximal Margin Classifier CS81B/Stat41B: Advanced Topcs n Learnng & Decson Makng Mamal Margn Classfer Lecturer: Mchael Jordan Scrbes: Jana van Greunen Corrected verson - /1/004 1 References/Recommended Readng 1.1 Webstes www.kernel-machnes.org

More information