Regret in Online Combinatorial Optimization
Jean-Yves Audibert
Imagine, Université Paris Est, and Sierra, CNRS/ENS/INRIA
audibert@imagine.enpc.fr

Sébastien Bubeck
Department of Operations Research and Financial Engineering, Princeton University
sbubeck@princeton.edu

Gábor Lugosi
ICREA and Pompeu Fabra University
gabor.lugosi@upf.edu

March 29, 2013

Abstract

We address online linear optimization problems when the possible actions of the decision maker are represented by binary vectors. The regret of the decision maker is the difference between her realized loss and the minimal loss she would have achieved by picking, in hindsight, the best possible action. Our goal is to understand the magnitude of the best possible (minimax) regret. We study the problem under three different assumptions for the feedback the decision maker receives: full information, and the partial information models of the so-called semi-bandit and bandit problems. In the full information case we show that the standard exponentially weighted average forecaster is a provably suboptimal strategy. For the semi-bandit model, by combining the Mirror Descent algorithm and the INF (Implicitly Normalized Forecaster) strategy, we are able to prove the first optimal bounds. Finally, in the bandit case we discuss existing results in light of a new lower bound, and suggest a conjecture on the optimal regret in that case.

1 Introduction.

In this paper we consider the framework of online linear optimization. The setup may be described as a repeated game between a decision maker (or simply player, or forecaster) and an adversary as follows: at each time instance $t = 1, \dots, n$, the player chooses, possibly in a randomized way, an action from a given finite action set $\mathcal{A} \subset \mathbb{R}^d$. The action chosen by the player at time $t$ is denoted by $a_t \in \mathcal{A}$. Simultaneously to the player, the adversary chooses a loss vector $z_t \in \mathcal{Z} \subset \mathbb{R}^d$, and the loss incurred by the forecaster is $a_t^T z_t$. The goal of the player is to minimize the expected cumulative loss $\mathbb{E} \sum_{t=1}^n a_t^T z_t$, where the expectation is taken with respect to the player's internal randomization and, possibly, the adversary's randomization. In the basic full-information version of this problem, the player observes the adversary's move $z_t$ at the end of round $t$. Another important model for feedback is the so-called bandit problem, in which the player only observes the incurred loss $a_t^T z_t$. As a measure of performance we define the regret (see Footnote 1) of the player as
$$R_n = \mathbb{E} \sum_{t=1}^n a_t^T z_t - \min_{a \in \mathcal{A}} \mathbb{E} \sum_{t=1}^n a^T z_t.$$

In this paper we address a specific example of online linear optimization: we assume that the action set $\mathcal{A}$ is a subset of the $d$-dimensional hypercube $\{0,1\}^d$ such that $\|a\|_1 = m$ for all $a \in \mathcal{A}$, and that the adversary has a bounded loss per coordinate, that is (see Footnote 2), $\mathcal{Z} = [0,1]^d$. We call this setting online combinatorial optimization. As we will see below, this restriction of the general framework contains a rich class of problems. Indeed, in many interesting cases, actions are naturally represented by Boolean vectors. In addition to the full information and bandit versions of online combinatorial optimization, we also consider another type of feedback which makes sense only in this combinatorial setting. In the semi-bandit version, we assume that the player observes only the coordinates of $z_t$ that were played in $a_t$, that is, the player observes the vector $(a_{t,1} z_{t,1}, \dots, a_{t,d} z_{t,d})$. All three variants of online combinatorial optimization are sketched in Figure 1.

More rigorously, online combinatorial optimization is defined as a repeated game between a player and an adversary.
At each round $t = 1, \dots, n$ of the game, the player chooses a probability distribution $p_t$ over the set of actions $\mathcal{A} \subset \{0,1\}^d$ and draws a random action $a_t \in \mathcal{A}$ according to $p_t$. Simultaneously, the adversary chooses a vector $z_t \in [0,1]^d$. More formally, $z_t$ is a measurable function of the past $(p_s, a_s, z_s)_{s=1,\dots,t-1}$. In the full information case, $p_t$ is a measurable function of $(p_s, a_s, z_s)_{s=1,\dots,t-1}$. In the semi-bandit case, $p_t$ is a measurable function of $(p_s, a_s, (a_{s,i} z_{s,i})_{i=1,\dots,d})_{s=1,\dots,t-1}$, and in the bandit problem it is a measurable function of $(p_s, a_s, a_s^T z_s)_{s=1,\dots,t-1}$.

[Footnote 1: In the full information version, it is straightforward to obtain upper bounds for the stronger notion of regret $\mathbb{E}\left[ \sum_{t=1}^n a_t^T z_t - \min_{a \in \mathcal{A}} \sum_{t=1}^n a^T z_t \right]$, which is always at least as large as $R_n$. However, for partial information games, this requires more work. In this paper we only consider $R_n$ as a measure of the regret.]

[Footnote 2: Note that since all actions have the same size, i.e., $\|a\|_1 = m$ for all $a \in \mathcal{A}$, one can reduce the case of $\mathcal{Z} = [\alpha, \beta]^d$ to $\mathcal{Z} = [0,1]^d$ via a simple renormalization.]

Parameters: set of actions $\mathcal{A} \subset \{0,1\}^d$; number of rounds $n \in \mathbb{N}$.
For each round $t = 1, 2, \dots, n$:
(1) the player chooses a probability distribution $p_t$ over $\mathcal{A}$ and draws a random action $a_t \in \mathcal{A}$ according to $p_t$;
(2) simultaneously, the adversary selects a loss vector $z_t \in [0,1]^d$ without revealing it;
(3) the player incurs the loss $a_t^T z_t$. She observes
- the loss vector $z_t$ in the full information setting,
- the coordinates $(z_{t,i} a_{t,i})_{i=1,\dots,d}$ in the semi-bandit setting,
- the instantaneous loss $a_t^T z_t$ in the bandit setting.
Goal: the player tries to minimize her cumulative loss $\sum_{t=1}^n a_t^T z_t$.

Figure 1: Online combinatorial optimization.

1.1 Motivating examples.

Many problems can be tackled under the online combinatorial optimization framework. We give here three simple examples.

m-sets. In this example we consider the set $\mathcal{A}$ of all $\binom{d}{m}$ Boolean vectors in dimension $d$ with exactly $m$ ones. In other words, at every time step, the player selects $m$ actions out of $d$ possibilities. When $m = 1$, the semi-bandit and bandit versions coincide and correspond to the standard adversarial multi-armed bandit problem.

Online shortest path problem. Consider a communication network represented by a graph in which one has to send a sequence of packets from one fixed vertex to another. For each packet one chooses a path through the graph and suffers a certain delay, which is the sum of the delays on the edges of the path. Depending on the traffic, the delays on the edges may change, and, at the end of each round, according to the assumed level of feedback, the player observes either the delays of all edges, the delays of each edge on the chosen path, or only the total delay of the chosen path. The player's objective is to minimize the total delay for the sequence of packets. One can represent the set of valid paths from the starting vertex to the end vertex as a set $\mathcal{A} \subset \{0,1\}^d$, where $d$ is the number of edges. If at time $t$, $z_t \in [0,1]^d$ is the vector of delays on the edges, then the delay of a path $a \in \mathcal{A}$ is $z_t^T a$. Thus this problem is an instance of online combinatorial optimization in dimension $d$, where $d$ is the number of edges in the graph. In this paper we assume, for simplicity, that all valid paths have the same length $m$.

Ranking. Consider the problem of selecting a ranking of $m$ items out of $M$ possible items.
For example, a website could have a set of $M$ ads, and it has to select a ranked list of $m$ of these ads to appear on the webpage. One can rephrase this problem as selecting a matching of size $m$ in the complete bipartite graph $K_{m,M}$, with $d = mM$ edges. In the online learning version of this problem, each day the website chooses one such list, and gains one dollar for each click on the ads. This problem can easily be formulated as an online combinatorial optimization problem.

Our theory applies to many more examples, such as spanning trees (which can be useful in certain communication problems) or $m$-intervals.
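To make the combinatorial setting concrete, here is a minimal sketch (not from the paper; the helper names `m_sets` and `regret` are our own) of the $m$-sets example: actions are binary vectors with exactly $m$ ones, losses are linear, and the regret compares the incurred loss to the best fixed action in hindsight.

```python
import itertools
import numpy as np

def m_sets(d, m):
    """All binary vectors in {0,1}^d with exactly m ones (the m-sets action set)."""
    actions = []
    for ones in itertools.combinations(range(d), m):
        a = np.zeros(d, dtype=int)
        a[list(ones)] = 1
        actions.append(a)
    return np.array(actions)

def regret(actions, losses, played):
    """Realized regret: cumulative loss of the played actions minus the
    cumulative loss of the best fixed action in hindsight."""
    incurred = sum(a @ z for a, z in zip(played, losses))
    best = min(actions @ losses.sum(axis=0))
    return incurred - best

# Toy run: d = 4, m = 2, n = 3 rounds, losses in [0,1]^d.
A = m_sets(4, 2)             # 6 actions
Z = np.array([[0.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0, 0.0]])
played = [A[0], A[0], A[0]]  # A[0] selects coordinates {0, 1}
print(regret(A, Z, played))  # 3.0: best action in hindsight picks {0, 3} for loss 0
```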
1.2 Previous work.

Full information. The full-information setting is now fairly well understood, and an optimal regret bound in terms of $m$, $d$, $n$ was obtained by Koolen, Warmuth, and Kivinen [26]. Previous papers under full information feedback also include Gentile and Warmuth [14], Kivinen and Warmuth [25], Grove, Littlestone, and Schuurmans [15], Takimoto and Warmuth [34], Kalai and Vempala [22], Warmuth and Kuzmin [36], Herbster and Warmuth [19], and Hazan, Kale, and Warmuth [18].

Semi-bandit. The first paper on the adversarial multi-armed bandit problem (i.e., the special case of $m$-sets with $m = 1$) is by Auer, Cesa-Bianchi, Freund, and Schapire [4], who derived a regret bound of order $\sqrt{dn \log d}$. This result was improved to $\sqrt{dn}$ by Audibert and Bubeck [2, 3]. György, Linder, Lugosi, and Ottucsák [16] consider the online shortest path problem and derive regret bounds which are suboptimal in terms of the dependency on $m$ and $d$. Uchiya, Nakamura, and Kudo [35] (respectively, Kale, Reyzin, and Schapire [23]) derived regret bounds that are optimal up to logarithmic factors for the case of $m$-sets (respectively, for the problem of ranking selection).

Bandit. McMahan and Blum [27] and Awerbuch and Kleinberg [5] were the first to consider this setting, and obtained regret bounds with a suboptimal dependence on $n$. The first paper with the optimal dependency on $n$ was by Dani, Hayes, and Kakade [12]. The dependency on $m$ and $d$ was then improved in various ways by Abernethy, Hazan, and Rakhlin [1], Cesa-Bianchi and Lugosi [11], and Bubeck, Cesa-Bianchi, and Kakade [9]. We discuss these bounds in detail in Section 4. In particular, we argue that the optimal regret bound in terms of $d$ and $m$ is still an open problem. We also refer the interested reader to the recent survey [8] for an overview of bandit problems in various other settings.

1.3 Contribution and contents of the paper.

In this paper we are primarily interested in the optimal minimax regret in terms of $m$, $d$ and $n$.
More precisely, our aim is to determine the order of magnitude of the following quantity. For a given feedback assumption, write $\sup$ for the supremum over all adversaries and $\inf$ for the infimum over all allowed strategies for the player under the feedback assumption (recall the definitions of adversary and player from the introduction). Then we are interested in
$$\max_{\mathcal{A} \subset \{0,1\}^d :\ \|a\|_1 = m\ \forall a \in \mathcal{A}}\ \inf\ \sup\ R_n.$$

Our contribution to the study of this quantity is threefold. First, we unify the algorithms used in Abernethy, Hazan, and Rakhlin [1], Koolen, Warmuth, and Kivinen [26], Uchiya, Nakamura, and Kudo [35], and Kale, Reyzin, and Schapire [23] under the umbrella of mirror descent. The idea of mirror descent goes back to Nemirovski [28] and Nemirovski and Yudin [29]. A somewhat similar concept was re-discovered in online learning by Herbster and Warmuth [20], Grove, Littlestone, and Schuurmans [15], and Kivinen and Warmuth [25] under the name of potential-based gradient descent, see [10, Chapter 11]. Recently, these ideas have been flourishing, see for instance Shalev-Shwartz [33], Rakhlin [30], Hazan [17], and Bubeck [7]. Our main theorem (Theorem 2) allows one to recover almost all known regret bounds for online combinatorial optimization.

This first contribution leads to our second main result, the improvement of the known upper bounds for the semi-bandit game. In particular, we propose a different proof of the minimax regret bound of order $\sqrt{nd}$ in the standard $d$-armed bandit game that is much simpler than the one provided in Audibert and Bubeck [3], and which also improves the constant factor. In addition to these upper bounds we prove two new lower bounds. First we answer a question of Koolen, Warmuth, and Kivinen [26] by showing that the exponentially weighted average forecaster is provably suboptimal for online combinatorial optimization. Our second lower bound is a minimax lower bound in the bandit setting which improves known results by an order of magnitude. A summary of known bounds and the new bounds proved in this paper can be found in Table 1.

              Full Information         Semi-Bandit    Bandit
Lower Bound   $m\sqrt{n \log(d/m)}$    $\sqrt{mdn}$   $m\sqrt{dn}$
Upper Bound   $m\sqrt{n \log(d/m)}$    $\sqrt{mdn}$   $m^{3/2}\sqrt{dn \log(d/m)}$

Table 1: Bounds on the minimax regret up to constant factors. The new results are set in boldface. In this paper we also show that EXP2 in the full information case has a regret bounded below by $d^{3/2}\sqrt{n}$ when $m$ is of order $d$.

The paper is organized as follows. In Section 2 we introduce the two algorithms discussed in this paper. In particular, in Section 2.1 we discuss the popular exponentially weighted average forecaster and we show that it is a provably suboptimal strategy. Then, in Section 2.2, we describe our main algorithm, OSMD (Online Stochastic Mirror Descent), and prove a general regret bound in terms of the Bregman divergence of the Fenchel-Legendre dual of the Legendre function defining the strategy. In Section 3 we derive upper bounds for the regret in the semi-bandit case for OSMD with appropriately chosen Legendre functions.
Finally, in Section 4 we prove a new lower bound for the bandit setting, and we formulate a conjecture on the correct order of magnitude of the regret for that problem, based on this new result and the regret bounds obtained in [1, 9].

2 Algorithms.

In this section we discuss two classes of algorithms that have been proposed for online combinatorial optimization.

2.1 Expanded Exponential weights (EXP2).

The simplest approach to online combinatorial optimization is to consider each action of $\mathcal{A}$ as an independent expert, and then apply a generic regret minimizing strategy. Perhaps the most popular such strategy is the exponentially weighted average forecaster, see, e.g., [10]. This strategy is sometimes called Hedge, see Freund and Schapire [13]. We call the resulting strategy for the online combinatorial optimization problem EXP2, see Figure 2. In the full information setting, EXP2 corresponds to Expanded Hedge, as defined in Koolen, Warmuth, and Kivinen [26]. In the semi-bandit case, EXP2 was studied by György, Linder, Lugosi, and Ottucsák [16], while in the bandit case it was studied by Dani, Hayes, and Kakade [12], Cesa-Bianchi and Lugosi [11], and Bubeck, Cesa-Bianchi, and Kakade [9]. Note that in the bandit case, EXP2 is mixed with an exploration distribution, see Section 4 for more details.

Despite strong interest in this strategy, no optimal regret bound has been derived for it in the combinatorial setting. More precisely, the best bound which can be derived from a standard argument (see, for example, [12] or [26]) is of order $m^{3/2}\sqrt{n \log(d/m)}$. On the other hand, in [26] the authors showed that by using Mirror Descent (see the next section) with the negative entropy, one obtains a regret bounded by $m\sqrt{n \log(d/m)}$. Furthermore, this latter bound is clearly optimal up to a numerical constant, as one can see from the standard lower bound in prediction with expert advice (consider the set $\mathcal{A}$ that corresponds to playing $m$ expert problems in parallel, with $d/m$ experts in each problem). In [26] the authors leave as an open question the problem of whether it is possible to improve the bound for EXP2 to obtain the optimal order of magnitude. The following theorem shows that this is impossible, and that in fact EXP2 is a provably suboptimal strategy.

Theorem 1. Let $n \geq d$. There exists a subset $\mathcal{A} \subset \{0,1\}^d$ such that, in the full information setting, the regret of the EXP2 strategy, for any learning rate $\eta$, satisfies
$$\sup_{\text{adversary}} R_n \geq 0.01\, d^{3/2} \sqrt{n}.$$
The proof is deferred to the Appendix.

2.2 Online Stochastic Mirror Descent.

In this section we describe the main algorithm studied in this paper. We call it Online Stochastic Mirror Descent (OSMD). Each term in this name refers to a part of the algorithm. Mirror Descent originates in the work of Nemirovski and Yudin [29].
The idea of mirror descent is to perform a gradient descent where the update with the gradient is performed in the dual space, defined by some Legendre function $F$, rather than in the primal (see below for a precise formulation). The Stochastic part takes its origin from Robbins and Monro [31] and from Kiefer and Wolfowitz [24]. The key idea is that it is enough to observe an unbiased estimate of the gradient, rather than the true gradient, in order to perform a gradient descent. Finally, the Online part comes from Zinkevich [37], who derived the Online Gradient Descent (OGD) algorithm, a version of gradient descent tailored to online optimization.

To properly describe the OSMD strategy, we recall a few concepts from convex analysis, see Hiriart-Urruty and Lemaréchal [21] for a thorough treatment of this subject. Let $\mathcal{D} \subset \mathbb{R}^d$ be an open convex set, and $\bar{\mathcal{D}}$ the closure of $\mathcal{D}$.

Definition 1. We call Legendre any continuous function $F : \bar{\mathcal{D}} \to \mathbb{R}$ such that
(i) $F$ is strictly convex and continuously differentiable on $\mathcal{D}$,
(ii) $\lim_{x \to \bar{\mathcal{D}} \setminus \mathcal{D}} \|\nabla F(x)\| = +\infty$.
[Footnote 3: By the equivalence of norms in $\mathbb{R}^d$, this definition does not depend on the choice of the norm.]

The Bregman divergence $D_F : \bar{\mathcal{D}} \times \mathcal{D} \to \mathbb{R}$ associated to a Legendre function $F$ is defined by
$$D_F(x, y) = F(x) - F(y) - (x - y)^T \nabla F(y).$$
Moreover, we say that $\mathcal{D}^* = \nabla F(\mathcal{D})$ is the dual space of $\mathcal{D}$ under $F$. We also denote by $F^*$ the Legendre-Fenchel transform of $F$, defined by
$$F^*(u) = \sup_{x \in \mathcal{D}} \left( x^T u - F(x) \right).$$

Lemma 1. Let $F$ be a Legendre function. Then $F^{**} = F$ and $\nabla F^* = (\nabla F)^{-1}$ on the set $\mathcal{D}^*$. Moreover, for all $x, y \in \mathcal{D}$,
$$D_F(x, y) = D_{F^*}\left( \nabla F(y), \nabla F(x) \right). \tag{1}$$

EXP2:
Parameter: learning rate $\eta$.
Let $p_1 = (1/|\mathcal{A}|, \dots, 1/|\mathcal{A}|) \in \mathbb{R}^{|\mathcal{A}|}$.
For each round $t = 1, 2, \dots, n$:
(a) Play $a_t \sim p_t$ and observe
- the loss vector $z_t$ in the full information game,
- the coordinates $z_{t,i} \mathbb{1}_{a_{t,i}=1}$ in the semi-bandit game,
- the instantaneous loss $a_t^T z_t$ in the bandit game.
(b) Estimate the loss vector $z_t$ by $\tilde{z}_t$. For instance, one may take
- $\tilde{z}_t = z_t$ in the full information game,
- $\tilde{z}_{t,i} = \dfrac{z_{t,i}\, a_{t,i}}{\sum_{a \in \mathcal{A} : a_i = 1} p_t(a)}$ in the semi-bandit game,
- $\tilde{z}_t = P_t^{+} a_t a_t^T z_t$, with $P_t = \mathbb{E}_{a \sim p_t}\left[ a a^T \right]$, in the bandit game.
(c) Update the probabilities: for all $a \in \mathcal{A}$,
$$p_{t+1}(a) = \frac{\exp\left( -\eta\, a^T \tilde{z}_t \right) p_t(a)}{\sum_{b \in \mathcal{A}} \exp\left( -\eta\, b^T \tilde{z}_t \right) p_t(b)}.$$

Figure 2: The EXP2 strategy. The notation $\mathbb{E}_{a \sim p_t}$ denotes the expected value with respect to the random choice of $a$ when it is distributed according to $p_t$.
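In the full information case, the EXP2 update of Figure 2 is just a multiplicative-weights update over the expanded action set. The following is a minimal sketch (function and variable names are our own), kept deliberately naive: it enumerates all of $\mathcal{A}$ explicitly, which is only feasible for small action sets.

```python
import numpy as np

def exp2_full_info(actions, losses, eta):
    """EXP2 in the full information setting: exponentially weighted average
    forecaster over the expanded action set A (each action is one expert).
    Returns the sequence of distributions p_1, ..., p_{n+1} over A."""
    N = len(actions)
    p = np.full(N, 1.0 / N)           # p_1 is uniform over A
    dists = [p.copy()]
    for z in losses:                  # full information: z_tilde = z
        w = p * np.exp(-eta * actions @ z)
        p = w / w.sum()               # multiplicative update, renormalized
        dists.append(p.copy())
    return dists

# Toy run: 3 actions in {0,1}^2, two rounds of losses.
A = np.array([[1, 0], [0, 1], [1, 1]])
Z = np.array([[1.0, 0.0], [1.0, 0.0]])
dists = exp2_full_info(A, Z, eta=1.0)
# The mass concentrates on the action (0, 1), which has zero cumulative loss.
print(dists[-1].argmax())  # 1
```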
The lemma above is the key to understanding how a Legendre function acts on the space. The gradient $\nabla F$ maps $\mathcal{D}$ to the dual space $\mathcal{D}^*$, and $\nabla F^*$ is the inverse mapping, from the dual space back to the original (primal) space. Moreover, (1) shows that the Bregman divergence in the primal space corresponds exactly to the Bregman divergence of the Legendre-Fenchel transform in the dual space. A proof of this result can be found, for example, in [10, Chapter 11].

We now have all the ingredients to describe the OSMD strategy, see Figure 3 for the precise formulation. Note that step (d) is well defined if the following consistency condition is satisfied:
$$\nabla F(x) - \eta \tilde{z}_t \in \mathcal{D}^*, \qquad \forall x \in \mathrm{Conv}(\mathcal{A}) \cap \mathcal{D}. \tag{2}$$

In the full information setting, algorithms of this type were studied by Abernethy, Hazan, and Rakhlin [1], Rakhlin [30], and Hazan [17]. In these papers the authors adopted the presentation suggested by Beck and Teboulle [6], which corresponds to a Follow-the-Regularized-Leader (FTRL) type strategy. There the focus was on $F$ being strongly convex with respect to some norm. Moreover, in [1] the authors also consider the bandit case, and switch to $F$ being a self-concordant barrier for the convex hull of $\mathcal{A}$ (see Section 4 for more details). Another line of work studied this type of algorithm with $F$ being the negative entropy, see Koolen, Warmuth, and Kivinen [26] for the full information case, and Uchiya, Nakamura, and Kudo [35] and Kale, Reyzin, and Schapire [23] for specific instances of the semi-bandit case. All these results are unified and described in detail in Bubeck [7]. In this paper we consider a new type of Legendre function $F$, inspired by Audibert and Bubeck [3], see Section 3.

Regarding computational complexity, OSMD is efficient as soon as the polytope $\mathrm{Conv}(\mathcal{A})$ can be described by a number of constraints that is polynomial in $d$. Indeed, in that case steps (a)-(b) can be performed efficiently jointly (one can get an algorithm by looking at the proof of Carathéodory's theorem), and step (d) is a convex program with a polynomial number of constraints.
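The duality of Lemma 1 can be sanity-checked numerically (this check is ours, not from the paper). With the negative entropy $F(x) = \sum_i x_i \log x_i - x_i$ on $\mathcal{D} = (0,\infty)^d$ one has $\nabla F(x) = \log x$, $F^*(u) = \sum_i e^{u_i}$ and $\nabla F^*(u) = e^u = (\nabla F)^{-1}(u)$, and identity (1) can be verified on random points:

```python
import numpy as np

# Negative entropy F(x) = sum_i x_i log x_i - x_i, with dual F*(u) = sum_i exp(u_i).
# We check Lemma 1:  D_F(x, y) = D_{F*}(grad F(y), grad F(x)).

def F(x):      return np.sum(x * np.log(x) - x)
def gradF(x):  return np.log(x)
def Fstar(u):  return np.sum(np.exp(u))
def gradFs(u): return np.exp(u)

def bregman(f, gradf, x, y):
    """Bregman divergence D_f(x, y) = f(x) - f(y) - (x - y)^T grad f(y)."""
    return f(x) - f(y) - gradf(y) @ (x - y)

rng = np.random.default_rng(0)
x = rng.uniform(0.1, 2.0, size=3)
y = rng.uniform(0.1, 2.0, size=3)
lhs = bregman(F, gradF, x, y)                       # primal divergence
rhs = bregman(Fstar, gradFs, gradF(y), gradF(x))    # dual divergence, arguments swapped
print(np.isclose(lhs, rhs))  # True
```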
In many interesting examples (such as $m$-sets, selection of rankings, spanning trees, and paths in acyclic graphs) one can describe the convex hull of $\mathcal{A}$ by a polynomial number of constraints, see Schrijver [32]. On the other hand, there also exist important examples where this is not the case, such as paths on general graphs. Also note that for some specific examples it is possible to implement OSMD with improved computational complexity, see Koolen, Warmuth, and Kivinen [26].

In this paper we restrict our attention to the combinatorial learning setting in which $\mathcal{A}$ is a subset of $\{0,1\}^d$ and the loss is linear. However, one should note that this specific form of $\mathcal{A}$ plays no role in the definition of OSMD. Moreover, if the loss is not linear, then one can modify OSMD by performing a gradient update with a gradient of the loss rather than the loss vector $z_t$. See Bubeck [7] for more details on this approach.

The following result is at the basis of our improved regret bounds for OSMD in the semi-bandit setting, see Section 3.

Theorem 2. Suppose that (2) is satisfied and that the loss estimates are unbiased, in the sense that $\mathbb{E}_{a_t \sim p_t} \tilde{z}_t = z_t$. Then the regret of the OSMD strategy satisfies
$$R_n \leq \sup_{a \in \mathcal{A}} \frac{F(a) - F(x_1)}{\eta} + \frac{1}{\eta} \sum_{t=1}^n \mathbb{E}\, D_{F^*}\!\left( \nabla F(x_t) - \eta \tilde{z}_t,\ \nabla F(x_t) \right).$$
OSMD:
Parameters: learning rate $\eta > 0$; Legendre function $F$ defined on $\bar{\mathcal{D}} \supset \mathrm{Conv}(\mathcal{A})$.
Let $x_1 \in \mathrm{argmin}_{x \in \mathrm{Conv}(\mathcal{A})} F(x)$.
For each round $t = 1, 2, \dots, n$:
(a) Let $p_t$ be a distribution on the set $\mathcal{A}$ such that $x_t = \mathbb{E}_{a \sim p_t}\, a$.
(b) Draw a random action $a_t$ according to the distribution $p_t$ and observe the feedback.
(c) Based on the observed feedback, estimate the loss vector $z_t$ by $\tilde{z}_t$.
(d) Let $w_{t+1} \in \mathcal{D}$ satisfy
$$\nabla F(w_{t+1}) = \nabla F(x_t) - \eta \tilde{z}_t. \tag{3}$$
(e) Project the weight vector $w_{t+1}$ defined by (3) onto the convex hull of $\mathcal{A}$:
$$x_{t+1} \in \mathrm{argmin}_{x \in \mathrm{Conv}(\mathcal{A})} D_F(x, w_{t+1}). \tag{4}$$

Figure 3: Online Stochastic Mirror Descent (OSMD).

Proof. Let $a \in \mathcal{A}$. Using that $a_t$ and $\tilde{z}_t$ are unbiased estimates of $x_t$ and $z_t$, we have
$$\mathbb{E}\,(a_t - a)^T z_t = \mathbb{E}\,(x_t - a)^T \tilde{z}_t.$$
Using (3) and applying the definition of the Bregman divergence, one obtains
$$\eta\, \tilde{z}_t^T (x_t - a) = (a - x_t)^T \left( \nabla F(w_{t+1}) - \nabla F(x_t) \right) = D_F(a, x_t) + D_F(x_t, w_{t+1}) - D_F(a, w_{t+1}).$$
By the Pythagorean theorem for Bregman divergences (see, e.g., Lemma 11.3 of [10]), we have $D_F(a, w_{t+1}) \geq D_F(a, x_{t+1}) + D_F(x_{t+1}, w_{t+1})$, hence
$$\eta\, \tilde{z}_t^T (x_t - a) \leq D_F(a, x_t) + D_F(x_t, w_{t+1}) - D_F(a, x_{t+1}) - D_F(x_{t+1}, w_{t+1}).$$
Summing over $t$ gives
$$\eta \sum_{t=1}^n \tilde{z}_t^T (x_t - a) \leq D_F(a, x_1) - D_F(a, x_{n+1}) + \sum_{t=1}^n \left( D_F(x_t, w_{t+1}) - D_F(x_{t+1}, w_{t+1}) \right).$$
By the nonnegativity of the Bregman divergences, we get
$$\eta \sum_{t=1}^n \tilde{z}_t^T (x_t - a) \leq D_F(a, x_1) + \sum_{t=1}^n D_F(x_t, w_{t+1}).$$
From (1), one has $D_F(x_t, w_{t+1}) = D_{F^*}\left( \nabla F(x_t) - \eta \tilde{z}_t, \nabla F(x_t) \right)$. Moreover, by writing the first-order optimality condition for $x_1$, one directly obtains $D_F(a, x_1) \leq F(a) - F(x_1)$, which concludes the proof.

Note that, if $F$ admits a Hessian, denoted $\nabla^2 F$, that is always invertible, then one can prove that, up to a third-order term in $\tilde{z}_t$, the regret bound can be written as
$$R_n \lesssim \sup_{a \in \mathcal{A}} \frac{F(a) - F(x_1)}{\eta} + \frac{\eta}{2} \sum_{t=1}^n \mathbb{E}\, \tilde{z}_t^T \left( \nabla^2 F(x_t) \right)^{-1} \tilde{z}_t. \tag{5}$$
The main technical difficulty is to control the third-order error term in this inequality.

3 Semi-bandit feedback.

In this section we consider online combinatorial optimization with semi-bandit feedback. As we already discussed, in the full information case Koolen, Warmuth, and Kivinen [26] proved that OSMD with the negative entropy is a minimax optimal strategy. We first prove a regret bound when one uses this strategy with the following estimate for the loss vector:
$$\tilde{z}_{t,i} = \frac{z_{t,i}\, a_{t,i}}{x_{t,i}}. \tag{6}$$
Note that this is a valid estimate, since it makes use only of $(z_{t,1} a_{t,1}, \dots, z_{t,d} a_{t,d})$. Moreover, it is unbiased with respect to the random draw of $a_t$ from $p_t$, since, by definition, $\mathbb{E}_{a_t \sim p_t}\, a_t = x_t$. In other words, $\mathbb{E}_{a_t \sim p_t}\, \tilde{z}_t = z_t$.

Theorem 3. The regret of OSMD with $F(x) = \sum_{i=1}^d x_i \log x_i - \sum_{i=1}^d x_i$, $\mathcal{D} = (0, +\infty)^d$, and any non-negative unbiased loss estimate $\tilde{z}_t \geq 0$ satisfies
$$R_n \leq \frac{m \log(d/m)}{\eta} + \frac{\eta}{2} \sum_{t=1}^n \sum_{i=1}^d \mathbb{E}\, x_{t,i}\, \tilde{z}_{t,i}^2.$$
In particular, with the estimate (6) and $\eta = \sqrt{\frac{2 m \log(d/m)}{n d}}$,
$$R_n \leq \sqrt{2 m d n \log(d/m)}.$$
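Before turning to the proof, here is a minimal runnable sketch of this strategy (our own illustration, not the paper's implementation), specialized to $m = 1$, i.e., $\mathcal{A}$ is the canonical basis and $\mathrm{Conv}(\mathcal{A})$ is the simplex. In that case the Bregman projection (4) under the negative entropy reduces to a plain renormalization; for general $m$ one would need a genuine Bregman projection onto $\mathrm{Conv}(\mathcal{A})$, which we skip here.

```python
import numpy as np

def osmd_neg_entropy_bandit(d, n, eta, loss_fn, rng):
    """OSMD with the negative entropy and the estimate (6), specialized to
    m = 1 (the d-armed bandit): p_t = x_t, and the projection step is a
    renormalization on the simplex.  Sketch only."""
    x = np.full(d, 1.0 / d)              # x_1 = argmin F over the simplex (uniform)
    total_loss = 0.0
    for t in range(n):
        i = rng.choice(d, p=x)           # draw a_t ~ p_t
        z = loss_fn(t)                   # adversary's loss vector in [0,1]^d
        total_loss += z[i]
        z_tilde = np.zeros(d)
        z_tilde[i] = z[i] / x[i]         # estimate (6): unbiased, E[z_tilde] = z
        w = x * np.exp(-eta * z_tilde)   # dual update (3): grad F(w) = grad F(x) - eta z_tilde
        x = w / w.sum()                  # projection (4) on the simplex
    return total_loss

rng = np.random.default_rng(42)
d, n = 5, 2000
eta = np.sqrt(2 * np.log(d) / (n * d))   # Theorem 3's eta with m = 1
# Stochastic adversary (an assumption for the demo): arm 0 is slightly better.
means = np.array([0.3, 0.5, 0.5, 0.5, 0.5])
loss = osmd_neg_entropy_bandit(d, n, eta,
                               lambda t: rng.binomial(1, means).astype(float), rng)
print(loss < n * 0.5)  # the learner beats a uniformly random arm
```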
Proof. One can easily see that for the negative entropy the dual space is $\mathcal{D}^* = \mathbb{R}^d$. Thus (2) is verified and OSMD is well defined. Moreover, again by straightforward computations, one can see that
$$D_{F^*}\left( \nabla F(x), \nabla F(y) \right) = \sum_{i=1}^d y_i\, \Theta\!\left( \nabla_i F(x) - \nabla_i F(y) \right), \tag{7}$$
where $\Theta(x) = \exp(x) - 1 - x$. Thus, using Theorem 2 and the facts that $\Theta(x) \leq \frac{x^2}{2}$ for $x \leq 0$ and $\sum_{i=1}^d x_{t,i} = m$, one obtains
$$R_n \leq \sup_{a \in \mathcal{A}} \frac{F(a) - F(x_1)}{\eta} + \frac{1}{\eta} \sum_{t=1}^n \mathbb{E}\, D_{F^*}\left( \nabla F(x_t) - \eta \tilde{z}_t, \nabla F(x_t) \right) \leq \sup_{a \in \mathcal{A}} \frac{F(a) - F(x_1)}{\eta} + \frac{\eta}{2} \sum_{t=1}^n \sum_{i=1}^d \mathbb{E}\, x_{t,i}\, \tilde{z}_{t,i}^2.$$
The proof of the first inequality is concluded by noting that, by Jensen's inequality,
$$F(a) - F(x_1) = \sum_{i=1}^d x_{1,i} \log \frac{1}{x_{1,i}} = m \sum_{i=1}^d \frac{x_{1,i}}{m} \log \frac{1}{x_{1,i}} \leq m \log\left( \sum_{i=1}^d \frac{x_{1,i}}{m} \cdot \frac{1}{x_{1,i}} \right) = m \log \frac{d}{m}.$$
The second inequality follows from
$$\mathbb{E}\, x_{t,i}\, \tilde{z}_{t,i}^2 \leq \mathbb{E}\, \frac{a_{t,i}}{x_{t,i}} = 1.$$

Using the standard $\sqrt{dn}$ lower bound for the multi-armed bandit (which corresponds to the case where $\mathcal{A}$ is the canonical basis), see, e.g., [Theorem 30, [3]], one can directly obtain a lower bound of order $\sqrt{mdn}$ for our setting. Thus the upper bound derived in Theorem 3 has an extraneous logarithmic factor compared to the lower bound. This phenomenon already appeared in the basic multi-armed bandit setting. In that case, the extra logarithmic factor was removed in Audibert and Bubeck [2] by resorting to a new class of strategies for the expert problem, called INF (Implicitly Normalized Forecaster). Next we generalize this class of algorithms to the combinatorial setting, and thus remove the extra logarithmic factor.

First we introduce the notion of a potential and the associated Legendre function.

Definition 2. Let $\omega \geq 0$. A function $\psi : (-\infty, a) \to \mathbb{R}_+^*$ (for some $a \in \mathbb{R} \cup \{+\infty\}$) is called an $\omega$-potential if it is convex, continuously differentiable, and satisfies
$$\lim_{x \to -\infty} \psi(x) = \omega, \qquad \lim_{x \to a} \psi(x) = +\infty, \qquad \psi' > 0, \qquad \int_\omega^{\omega+1} |\psi^{-1}(s)|\, ds < +\infty.$$
To every potential $\psi$ we associate the function $F_\psi$ defined on $\mathcal{D} = (\omega, +\infty)^d$ by
$$F_\psi(x) = \sum_{i=1}^d \int_\omega^{x_i} \psi^{-1}(s)\, ds.$$
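The construction of $F_\psi$ in Definition 2 can be sanity-checked numerically (our own check, not from the paper): with $\psi = \exp$, one has $\psi^{-1} = \log$ and $\int_0^x \log s\, ds = x \log x - x$, the per-coordinate negative entropy, as claimed below.

```python
import math

# Definition 2 builds F_psi(x) = sum_i integral_0^{x_i} psi^{-1}(s) ds.
# With psi = exp (so psi^{-1} = log), each summand should equal x log x - x.

def F_psi_coord(x, steps=200000):
    """Midpoint-rule approximation of the integral of log over (0, x];
    log is integrable at 0, so the improper integral converges."""
    h = x / steps
    return sum(math.log((k + 0.5) * h) * h for k in range(steps))

x = 1.7
numeric = F_psi_coord(x)
closed = x * math.log(x) - x
print(abs(numeric - closed) < 1e-4)  # True
```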
In this paper we restrict our attention to $0$-potentials, which we simply call potentials. A non-zero value of $\omega$ may be used to derive regret bounds that hold with high probability, instead of pseudo-regret bounds, see Footnote 1. The first-order optimality condition for (4) implies that OSMD with $F_\psi$ is a direct generalization of INF with potential $\psi$, in the sense that the two algorithms coincide when $\mathcal{A}$ is the canonical basis. Note, in particular, that with $\psi(x) = \exp(x)$ we recover the negative entropy for $F_\psi$. In [3], the choice of $\psi(x) = (-x)^{-q}$ with $q > 1$ was recommended. We show in Theorem 4 that here, again, this choice gives a minimax optimal strategy.

Lemma 2. Let $\psi$ be a potential. Then $F = F_\psi$ is Legendre, and for all $u, v \in \mathcal{D}^* = (-\infty, a)^d$ such that $u_i \leq v_i$ for all $i \in \{1, \dots, d\}$,
$$D_{F^*}(u, v) \leq \frac{1}{2} \sum_{i=1}^d \psi'(v_i) (u_i - v_i)^2.$$

Proof. A direct examination shows that $F = F_\psi$ is a Legendre function. Moreover, since $\nabla F^*(u) = (\nabla F)^{-1}(u) = (\psi(u_1), \dots, \psi(u_d))$, we obtain
$$D_{F^*}(u, v) = \sum_{i=1}^d \left( \int_{v_i}^{u_i} \psi(s)\, ds - (u_i - v_i)\, \psi(v_i) \right).$$
From a Taylor expansion, we get
$$D_{F^*}(u, v) \leq \sum_{i=1}^d \frac{1}{2} \max_{s \in [u_i, v_i]} \psi'(s)\, (u_i - v_i)^2.$$
Since the function $\psi$ is convex and $u_i \leq v_i$, we have
$$\max_{s \in [u_i, v_i]} \psi'(s) \leq \psi'\left( \max(u_i, v_i) \right) = \psi'(v_i),$$
which gives the desired result.

Theorem 4. Let $\psi$ be a potential. The regret of OSMD with $F = F_\psi$ and any non-negative unbiased loss estimate $\tilde{z}_t$ satisfies
$$R_n \leq \sup_{a \in \mathcal{A}} \frac{F(a) - F(x_1)}{\eta} + \frac{\eta}{2} \sum_{t=1}^n \sum_{i=1}^d \mathbb{E}\, \frac{\tilde{z}_{t,i}^2}{(\psi^{-1})'(x_{t,i})}.$$
In particular, with the estimate (6), $\psi(x) = (-x)^{-q}$ with $q > 1$, and $\eta = \sqrt{\frac{2}{q-1}\, \frac{m^{1-2/q}}{d^{1-2/q}\, n}}$, we get
$$R_n \leq q \sqrt{\frac{2}{q-1}\, m d n}.$$
With $q = 2$ this gives
$$R_n \leq 2\sqrt{2 m d n}.$$
In the case $m = 1$, the above theorem improves on the bound $R_n \leq 8\sqrt{nd}$ obtained in Theorem 11 of [3].

Proof. First note that, since $\mathcal{D}^* = (-\infty, a)^d$ and $\tilde{z}_t$ has non-negative coordinates, OSMD is well defined, that is, (2) is satisfied. The first inequality follows from Theorem 2, Lemma 2, and the fact that $\psi'(\psi^{-1}(x)) = \frac{1}{(\psi^{-1})'(x)}$.

Let $\psi(x) = (-x)^{-q}$. Then $\psi^{-1}(x) = -x^{-1/q}$ and $F(x) = -\frac{q}{q-1} \sum_{i=1}^d x_i^{1-1/q}$. In particular, note that by Hölder's inequality, since $\sum_{i=1}^d x_{1,i} = m$,
$$F(a) - F(x_1) \leq \frac{q}{q-1} \sum_{i=1}^d x_{1,i}^{1-1/q} \leq \frac{q}{q-1}\, m^{1-1/q}\, d^{1/q}.$$
Moreover, note that $(\psi^{-1})'(x) = \frac{1}{q} x^{-1-1/q}$, and
$$\sum_{i=1}^d \mathbb{E}\, \frac{\tilde{z}_{t,i}^2}{(\psi^{-1})'(x_{t,i})} \leq \sum_{i=1}^d q\, x_{t,i}^{1/q} \leq q\, m^{1/q}\, d^{1-1/q},$$
which concludes the proof.

4 Bandit feedback.

In this section we consider online combinatorial optimization with bandit feedback. This setting is much more challenging than the semi-bandit case, and in order to obtain sublinear regret bounds all known strategies add an exploration component to the algorithm. For example, in EXP2, instead of playing an action at random according to the exponentially weighted average distribution $p_t$, one draws a random action from $p_t$ with probability $1 - \gamma$ and from some fixed exploration distribution $\mu$ with probability $\gamma$. On the other hand, in OSMD, one randomly perturbs $x_t$ to some $\tilde{x}_t$, and then plays at random a point in $\mathcal{A}$ such that on average one plays $\tilde{x}_t$.

In Bubeck, Cesa-Bianchi, and Kakade [9], the authors study the EXP2 strategy with the exploration distribution $\mu$ supported on the contact points between the polytope $\mathrm{Conv}(\mathcal{A})$ and the John ellipsoid of this polytope (i.e., the ellipsoid of minimal volume enclosing the polytope). Using this method they are able to prove the best known upper bound for online combinatorial optimization with bandit feedback. They show that the regret of EXP2 mixed with John's exploration and with the estimate described in Figure 2 satisfies
$$R_n \leq 2 m^{3/2} \sqrt{3 d n \log \frac{ed}{m}}.$$
Our next theorem shows that no strategy can achieve a regret smaller than a constant times $m\sqrt{dn}$, leaving a gap of a factor of order $\sqrt{m \log(d/m)}$. As we argue below, we conjecture that the lower bound is of the correct order of magnitude.
However, improving the upper bound seems to require substantially new ideas. Note that the following bound gives a limitation that no strategy can surpass, contrary to Theorem 1, which was dedicated to the EXP2 strategy.
Theorem 5. Let $n \geq d \geq 2m$. There exists a subset $\mathcal{A} \subset \{0,1\}^d$ with $\|a\|_1 = m$ for all $a \in \mathcal{A}$ such that, under bandit feedback, one has
$$\inf_{\text{strategies}}\ \sup_{\text{adversaries}}\ R_n \geq 0.02\, m \sqrt{dn}, \tag{8}$$
where the infimum and the supremum are taken over the class of strategies for the player and for the adversary, as defined in the introduction.

Note that it should not come as a surprise that EXP2 with John's exploration is suboptimal, since even in the full information case the basic EXP2 strategy is provably suboptimal, see Theorem 1. We conjecture that the correct order of magnitude for the minimax regret in the bandit case is $m\sqrt{dn}$, as the above lower bound suggests. A promising approach to resolve this conjecture is to consider again the OSMD strategy. However, we believe that in the bandit case one has to consider Legendre functions with a non-diagonal Hessian, contrary to the Legendre functions considered so far in this paper. Abernethy, Hazan, and Rakhlin [1] propose to use a self-concordant barrier function for the polytope $\mathrm{Conv}(\mathcal{A})$. They then randomly perturb the point $x_t$ given by OSMD using the eigenstructure of the Hessian. This approach leads to a regret upper bound of order $md\sqrt{\theta n \log n}$ when $\mathrm{Conv}(\mathcal{A})$ admits a $\theta$-self-concordant barrier function. Unfortunately, even when there exists an $O(1)$-self-concordant barrier, this bound is still larger than the conjectured optimal bound by a factor of order $\sqrt{d}$. In fact, it was proved in [9] that in some cases there exist better choices for the Legendre function and the perturbation than those described in [1], even when there is an $O(1)$-self-concordant function for the action set. How to generalize this approach to the polytopes involved in online combinatorial optimization is a challenging open problem.

A Proof of Theorem 1.

For the sake of simplicity, we assume that $d$ is a multiple of 4 and that $n$ is even. We consider the following subset of the hypercube:
$$\mathcal{A} = \left\{ a \in \{0,1\}^d : \sum_{i=1}^{d/2} a_i = \frac{d}{4} \ \text{and} \ \left( a_i = 1\ \forall i \in \{d/2+1, \dots, d/2+d/4\} \ \text{or} \ a_i = 1\ \forall i \in \{d/2+d/4+1, \dots, d\} \right) \right\}.$$
That is, choosing a point in $\mathcal{A}$ corresponds to choosing a subset of $d/4$ elements among the first half of the coordinates, and choosing one of the two disjoint intervals of size $d/4$ in the second half of the coordinates.

We prove that for any parameter $\eta$ there exists an adversary such that EXP2 with parameter $\eta$ has a regret of at least $\frac{nd}{16} \tanh\left( \frac{\eta d}{8} \right)$, and that there exists another adversary such that its regret is at least $\min\left( \frac{d \log 2}{12 \eta}, \frac{nd}{12} \right)$. As a consequence, we have
$$\sup_{\text{adversary}} R_n \geq \max\left( \frac{nd}{16} \tanh\frac{\eta d}{8},\ \min\left( \frac{d \log 2}{12\eta}, \frac{nd}{12} \right) \right) \geq \min\left( \max\left( \frac{nd}{16} \tanh\frac{\eta d}{8},\ \frac{d \log 2}{12\eta} \right),\ \frac{nd}{12} \right) \geq \min\left( A,\ \frac{nd}{12} \right),$$
with
$$A = \min_{\eta \in [0, +\infty)} \max\left( \frac{nd}{16} \tanh\frac{\eta d}{8},\ \frac{d \log 2}{12 \eta} \right) \geq \min\left( \min_{\eta:\, \eta d \geq 8} \frac{nd}{16} \tanh(1),\ \min_{\eta:\, \eta d < 8} \max\left( \frac{n \eta d^2}{128} \tanh(1),\ \frac{d \log 2}{12 \eta} \right) \right) \geq \min\left( \frac{nd}{16} \tanh(1),\ \sqrt{\frac{n d^3 \log 2 \tanh(1)}{1536}} \right) \geq \min\left( 0.04\, nd,\ 0.01\, d^{3/2} \sqrt{n} \right),$$
where we used the fact that $\tanh$ is concave and increasing on $\mathbb{R}_+$ (so that $\tanh(x) \geq x \tanh(1)$ for $x \leq 1$). As $n \geq d$, this implies the stated lower bound.

First we prove the lower bound $\frac{nd}{16}\tanh\left(\frac{\eta d}{8}\right)$. Define the following adversary:
$$z_{t,i} = \begin{cases} 1 & \text{if } i \in \{d/2+1, \dots, d/2+d/4\} \text{ and } t \text{ odd}, \\ 1 & \text{if } i \in \{d/2+d/4+1, \dots, d\} \text{ and } t \text{ even}, \\ 0 & \text{otherwise.} \end{cases}$$
This adversary always puts a zero loss on the first half of the coordinates, and alternates between a loss of $d/4$ for choosing the first interval in the second half of the coordinates and a loss of $d/4$ for choosing the second interval. At the beginning of odd rounds, any vertex $a \in \mathcal{A}$ has the same cumulative loss, and thus EXP2 picks its action uniformly at random, which yields an expected cumulative loss equal to $nd/16$ over the odd rounds. On the other hand, at even rounds the probability distribution over the vertices is always the same. More precisely, the probability of selecting a vertex which contains the interval $\{d/2+d/4+1, \dots, d\}$ (i.e., the interval with a $d/4$ loss at this round) is exactly $\frac{1}{1+\exp(-\eta d/4)}$. This adds an expected cumulative loss equal to $\frac{nd}{8} \cdot \frac{1}{1+\exp(-\eta d/4)}$. Finally, note that the loss of any fixed vertex is $nd/8$. Thus, we obtain
$$R_n = \frac{nd}{16} + \frac{nd}{8} \cdot \frac{1}{1+\exp(-\eta d/4)} - \frac{nd}{8} = \frac{nd}{16} \tanh\left( \frac{\eta d}{8} \right).$$

It remains to show a lower bound proportional to $1/\eta$. To this end, we consider a different adversary, defined by
$$z_{t,i} = \begin{cases} 1 - \varepsilon & \text{if } i \leq d/4, \\ 1 & \text{if } i \in \{d/4+1, \dots, d/2\}, \\ 0 & \text{otherwise}, \end{cases}$$
for some fixed $\varepsilon > 0$. Note that against this adversary the choice of the interval in the second half of the components does not matter. Moreover, by symmetry, the weight assigned by EXP2 to any coordinate in $\{d/4+1, \dots, d/2\}$ is the same at any round; finally, this weight is decreasing with $t$. Thus, denoting by $k_t$ the number of components selected at round $t$ in the first $d/4$ coordinates, and noting that the per-round loss of an action with $k$ such components is $d/4 - \varepsilon k$ (so that the number of actions with exactly $k$ such components is proportional to $\binom{d/4}{k}\binom{d/4}{d/4-k} = \binom{d/4}{k}^2$), we have
$$R_n = \varepsilon \sum_{t=1}^n \mathbb{E}\left[ \frac{d}{4} - k_t \right] \geq n \varepsilon\, \frac{\sum_{k=0}^{d/4} \binom{d/4}{k}^2 \left( \frac{d}{4} - k \right) e^{\eta n \varepsilon k}}{\sum_{k=0}^{d/4} \binom{d/4}{k}^2\, e^{\eta n \varepsilon k}}.$$
Thus, taking $\varepsilon = \min\left( \frac{\log 2}{\eta n}, 1 \right)$, so that $e^{\eta n \varepsilon} = \min(2, e^{\eta n})$, yields
$$R_n \geq \min\left( \frac{d \log 2}{4\eta}, \frac{nd}{4} \right) \cdot \frac{\sum_{k=0}^{d/4} \binom{d/4}{k}^2 \left( 1 - \frac{4k}{d} \right) \min(2, e^{\eta n})^k}{\sum_{k=0}^{d/4} \binom{d/4}{k}^2 \min(2, e^{\eta n})^k} \geq \min\left( \frac{d \log 2}{12\eta}, \frac{nd}{12} \right),$$
where the last inequality follows from Lemma 3 in the appendix. This concludes the proof of the lower bound.

B Proof of Theorem 5.

The structure of the proof is similar to that of [3, Theorem 30], which deals with the simple case where $m = 1$. The most important conceptual difference is contained in Lemma 4, which is at the heart of this new proof. The main argument follows the line of standard lower bounds for bandit problems, see, e.g., [10]: the worst-case regret is bounded from below by taking an average over a conveniently chosen class of strategies for the adversary. Then, by Pinsker's inequality, the problem is reduced to computing the Kullback-Leibler divergence of certain distributions. The main technical argument, given in Lemma 4, proves manageable bounds for the relevant Kullback-Leibler divergence.

For the sake of simplifying notation, we assume that $d$ is a multiple of $m$, and we identify $\{0,1\}^d$ with the set of $m \times \frac{d}{m}$ binary matrices. We consider the following set of actions:
$$\mathcal{A} = \left\{ a \in \{0,1\}^{m \times \frac{d}{m}} : \sum_{j=1}^{d/m} a(i,j) = 1,\ \forall i \in \{1, \dots, m\} \right\}.$$
In other words, the player is playing in parallel $m$ finite games with $d/m$ actions each. From Step 1 to Step 3 we restrict our attention to deterministic strategies for the player, and we show how to extend the results to arbitrary strategies in Step 4.

First step: definitions.
We denote by I_{i,t} ∈ {1, ..., d/m} the random variable such that a_t(i, I_{i,t}) = 1. That is, I_{i,t} is the action chosen at time t in the ith game. Moreover, let τ be drawn uniformly at random from {1, ..., n}. In this proof we consider random adversaries indexed by A. More precisely, for α ∈ A, we define the α-adversary as follows: for any t ∈ {1, ..., n}, z_t(i, j) is drawn from a Bernoulli distribution with parameter 1/2 − ε α(i, j). In other words, against adversary α, in the ith game, the action j such that α(i, j) = 1 has a loss slightly smaller in expectation than the other actions. We denote by E_α integration with respect to the loss generation process of the α-adversary. We write P_{i,α} for the probability distribution of α(i, I_{i,τ}) when the player faces the α-adversary. Note that we have P_{i,α}(1) = E_α (1/n) Σ_{t=1}^{n} 1{α(i, I_{i,t}) = 1}; hence, against the α-adversary, we have

    R_n = E_α Σ_{t=1}^{n} Σ_{i=1}^{m} ε (1 − 1{α(i, I_{i,t}) = 1}) = nε Σ_{i=1}^{m} (1 − P_{i,α}(1)),

which implies (since the maximum is larger than the mean)

    max_{α ∈ A} R_n ≥ nε Σ_{i=1}^{m} ( 1 − (1/(d/m)^m) Σ_{α ∈ A} P_{i,α}(1) ).    (9)

Second step: information inequality.

Let P_{i,−α} be the probability distribution of α(i, I_{i,τ}) against the adversary which plays like the α-adversary except that in the ith game the losses of all coordinates are drawn from a Bernoulli distribution of parameter 1/2. We call it the (−i, α)-adversary and we denote by E_{−i,α} integration with respect to its loss generation process. By Pinsker's inequality,

    P_{i,α}(1) ≤ P_{i,−α}(1) + √( (1/2) KL(P_{i,−α}, P_{i,α}) ),

where KL denotes the Kullback-Leibler divergence. Moreover, note that by symmetry of the adversaries (the law of I_{i,τ} under the (−i, β)-adversary does not depend on the ith row of β, and for each β exactly one α agreeing with β outside game i satisfies α(i, I_{i,τ}) = 1),

    (1/(d/m)^m) Σ_{α ∈ A} P_{i,−α}(1) = (1/(d/m)^m) Σ_{α ∈ A} E_{−i,α} α(i, I_{i,τ})
        = (1/(d/m)^m) Σ_{β ∈ A} E_{−i,β} Σ_{α: (−i,α) = (−i,β)} α(i, I_{i,τ}) · (1/(d/m))
        = m/d,    (10)

and thus, thanks to the concavity of the square root,

    (1/(d/m)^m) Σ_{α ∈ A} P_{i,α}(1) ≤ m/d + √( (1/(2(d/m)^m)) Σ_{α ∈ A} KL(P_{i,−α}, P_{i,α}) ).    (11)
Third step: computation of KL(P_{i,−α}, P_{i,α}) with the chain rule.

Note that since the forecaster is deterministic, the sequence of observed losses up to time n, W_n ∈ {0, ..., m}^n, uniquely determines the empirical distribution of plays and, in particular, the probability distribution of α(i, I_{i,τ}) conditionally on W_n is the same for any adversary. Thus, if we denote by P^n_α (respectively P^n_{−i,α}) the probability distribution of W_n when the forecaster plays against the α-adversary (respectively the (−i, α)-adversary), then one can easily prove that

    KL(P_{i,−α}, P_{i,α}) ≤ KL(P^n_{−i,α}, P^n_α).

Now we use the chain rule for Kullback-Leibler divergence iteratively to introduce the probability distributions P^t_α of the observed losses W_t up to time t. More precisely, we have

    KL(P^n_{−i,α}, P^n_α)
      = KL(P^1_{−i,α}, P^1_α) + Σ_{t=2}^{n} Σ_{w_{t−1} ∈ {0,...,m}^{t−1}} P^{t−1}_{−i,α}(w_{t−1}) KL( P^t_{−i,α}(· | w_{t−1}), P^t_α(· | w_{t−1}) )
      = KL(B, B′) 1{α(i, I_{i,1}) = 1} + Σ_{t=2}^{n} Σ_{w_{t−1}: α(i, I_{i,t}) = 1} P^{t−1}_{−i,α}(w_{t−1}) KL( B_{w_{t−1}}, B′_{w_{t−1}} ),

where B_{w_{t−1}} and B′_{w_{t−1}} are sums of m Bernoulli distributions with parameters in {1/2, 1/2 − ε} and such that the number of Bernoullis with parameter 1/2 in B_{w_{t−1}} is equal to the number of Bernoullis with parameter 1/2 in B′_{w_{t−1}} plus one. Now using Lemma 4 (see below) we obtain

    KL( B_{w_{t−1}}, B′_{w_{t−1}} ) ≤ 8ε² / ((1 − 4ε²) m).

In particular, this gives

    KL(P^n_{−i,α}, P^n_α) ≤ (8ε² / ((1 − 4ε²) m)) E_{−i,α} Σ_{t=1}^{n} 1{α(i, I_{i,t}) = 1} = (8ε² n / ((1 − 4ε²) m)) P_{i,−α}(1).

Summing and plugging this into (11), we obtain (again thanks to (10)), for ε ≤ 1/8,

    (1/(d/m)^m) Σ_{α ∈ A} P_{i,α}(1) ≤ m/d + ε √(8n/d).

To conclude the proof of (8) for deterministic players, one needs to plug this last inequality into (9), along with straightforward computations.

Fourth step: Fubini's theorem to handle non-deterministic players.

Consider now a randomized player, and let E_rand denote the expectation with respect to the randomization of the player. Then one has, thanks to Fubini's theorem,

    (1/(d/m)^m) Σ_{α ∈ A} E_α [ Σ_{t=1}^{n} a_t^T z_t − Σ_{t=1}^{n} α^T z_t ] = E_rand (1/(d/m)^m) Σ_{α ∈ A} E_α [ Σ_{t=1}^{n} a_t^T z_t − Σ_{t=1}^{n} α^T z_t ].
Now note that if we fix the realization of the forecaster's randomization, then the results of the previous steps apply and, in particular, one can lower bound (1/(d/m)^m) Σ_{α ∈ A} E_α [ Σ_{t=1}^{n} a_t^T z_t − Σ_{t=1}^{n} α^T z_t ] as before (note that α is the optimal action in expectation against the α-adversary).
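The second step of the proof above rests on Pinsker's inequality, which converts a Kullback-Leibler divergence bound into a bound on how far an adversary can shift the player's play probabilities. A minimal numerical sketch of that mechanism (Python; the helper name `kl_bernoulli` is ours, and plain Bernoulli laws stand in for the distributions P_{i,α} of the proof):

```python
import math

def kl_bernoulli(p, q):
    """KL(Ber(p), Ber(q)) in nats."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

# Pinsker's inequality: total variation distance <= sqrt(KL/2).
# For two Bernoulli laws the total variation distance is |p - q|.
for p in [0.3, 0.4, 0.5]:
    for eps in [0.01, 0.05, 0.1]:
        q = p - eps
        assert abs(p - q) <= math.sqrt(kl_bernoulli(p, q) / 2) + 1e-12

# As in the proof, biasing a fair coin by eps costs a KL of order eps^2:
# KL(Ber(1/2 - eps), Ber(1/2)) = 2*eps^2 + O(eps^4).
eps = 0.01
assert abs(kl_bernoulli(0.5 - eps, 0.5) - 2 * eps ** 2) < 1e-6
```

The ε² rate of the divergence is what allows the adversary's bias ε to be tuned against the horizon n when extracting the final lower bound.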
C Technical lemmas

Lemma 3 For any k ∈ N and any 1 ≤ c ≤ 2, we have

    [ Σ_{i=0}^{k} (1 − i/k) C(k, i)² c^i ] / [ Σ_{i=0}^{k} C(k, i)² c^i ] ≥ 1/3.

Proof Let f(c) denote the expression on the left-hand side of the inequality. Introduce the random variable X, which is equal to i ∈ {0, ..., k} with probability C(k, i)² c^i / Σ_{j=0}^{k} C(k, j)² c^j. We have f(c) = E[1 − X/k] and

    f′(c) = (1/c) E[X (1 − X/k)] − (1/c) E[X] E[1 − X/k] = − Var(X) / (ck) ≤ 0.

So the function f is decreasing on [1, 2], and therefore it suffices to consider c = 2. Numerator and denominator of the left-hand side differ only by the factor (1 − i/k). A lower bound for the left-hand side can thus be obtained by showing that the terms for i close to k are not essential to the value of the denominator. To prove this, we may use Stirling's formula, which implies that for any k ≥ 2 and i ∈ [1, k − 1],

    √( k / (2πi(k − i)) ) (k/i)^i (k/(k − i))^{k−i} e^{−1/6} < C(k, i) < √( k / (2πi(k − i)) ) (k/i)^i (k/(k − i))^{k−i} e^{1/12}.

Introduce λ = i/k and χ(λ) = 2^λ λ^{−2λ} (1 − λ)^{−2(1−λ)}. Squaring the bounds above and multiplying by 2^i, we have

    [χ(λ)]^k e^{−1/3} / (2πλ(1 − λ)k) < C(k, i)² 2^i < [χ(λ)]^k e^{1/6} / (2πλ(1 − λ)k).    (12)

Lemma 3 can be numerically verified for k ≤ 10^6. We now consider k > 10^6. Since the function χ can be shown to be decreasing on [0.656, 1], for λ ≥ 0.666 the inequality C(k, i)² 2^i < [χ(0.666)]^k e^{1/6} holds. We have χ(0.657)/χ(0.666) > 1.005; consequently, for k > 10^6, we have [χ(0.666)]^k < [χ(0.657)]^k / (πk³). So for λ ≥ 0.666 and k > 10^6, we have

    C(k, i)² 2^i < [χ(0.657)]^k e^{1/6} / (πk³) = (e^{1/6} / (πk³)) min_{λ ∈ [0.656, 0.657]} [χ(λ)]^k < (1/k²) max_{i ∈ {1,...,k−1}: i/k ≤ 0.666} C(k, i)² 2^i,    (13)

where the last inequality comes from (12) and the fact that there exists i ∈ {1, ..., k − 1} such that i/k ∈ [0.656, 0.657]. Inequality (13) implies that for any i ∈ {1, ..., k} with i ≥ 0.666k, we have

    C(k, i)² 2^i < (1/k²) max_{j ∈ {1,...,k−1}: j ≤ 0.666k} C(k, j)² 2^j ≤ (1/k²) Σ_{j < 0.666k} C(k, j)² 2^j.
To conclude, introducing A = Σ_{0 ≤ i < 0.666k} C(k, i)² 2^i, we have

    [ Σ_{i=0}^{k} (1 − i/k) C(k, i)² 2^i ] / [ Σ_{i=0}^{k} C(k, i)² 2^i ] ≥ (1 − 0.666) A / (A + A/k) ≥ 1/3.

Lemma 4 Let l and n be integers with 1 ≤ n/2 ≤ l ≤ n. Let p, p′, q, p_1, ..., p_n be real numbers in (0, 1/2] with q ∈ {p, p′}, p_1 = ⋯ = p_l = q and p_{l+1} = ⋯ = p_n. Let B (resp. B′) be the sum of n + 1 independent Bernoulli distributions with parameters p, p_1, ..., p_n (resp. p′, p_1, ..., p_n). We have

    KL(B, B′) ≤ 2 (p − p′)² / ( (1 − p)(n + 2) q ).

Proof Let Z, Z′, Z_1, ..., Z_n be independent Bernoulli random variables with parameters p, p′, p_1, ..., p_n. Define S = Σ_{i=1}^{l} Z_i, T = Σ_{i=l+1}^{n} Z_i and V = Z + S. By a slight and usual abuse of notation, we use KL to denote the Kullback-Leibler divergence of both probability distributions and random variables. Then we may write (the inequality is an easy consequence of the chain rule for Kullback-Leibler divergence)

    KL(B, B′) = KL(Z + S + T, Z′ + S + T) ≤ KL( (Z + S, T), (Z′ + S, T) ) = KL(Z + S, Z′ + S).

Let s_k = P(S = k) for k = −1, 0, ..., l + 1 (so that s_{−1} = s_{l+1} = 0). Using the equality

    s_k = C(l, k) q^k (1 − q)^{l−k} = (q/(1 − q)) · ((l − k + 1)/k) · s_{k−1},

which holds for 1 ≤ k ≤ l + 1, we obtain

    KL(Z + S, Z′ + S) = Σ_{k=0}^{l+1} P(V = k) log[ P(Z + S = k) / P(Z′ + S = k) ]
        = Σ_{k=0}^{l+1} P(V = k) log[ (p s_{k−1} + (1 − p) s_k) / (p′ s_{k−1} + (1 − p′) s_k) ]
        = Σ_{k=0}^{l+1} P(V = k) log[ (p (1 − q) k + (1 − p)(l − k + 1) q) / (p′ (1 − q) k + (1 − p′)(l − k + 1) q) ]
        = E log[ ((p − q) V + (1 − p) q (l + 1)) / ((p′ − q) V + (1 − p′) q (l + 1)) ].    (14)
First case: q = p′. By Jensen's inequality, using that EV = p + lp′ in this case, we get from (14)

    KL(Z + S, Z′ + S) ≤ log[ ( (p − p′)(p + lp′) + (1 − p) p′ (l + 1) ) / ( (1 − p′) p′ (l + 1) ) ]
        = log[ 1 + (p − p′)² / ((1 − p′) p′ (l + 1)) ]
        ≤ (p − p′)² / ((1 − p′) p′ (l + 1)).

Second case: q = p. In this case, V is a binomial distribution with parameters l + 1 and p. From (14), we have

    KL(Z + S, Z′ + S) = − E log[ ( (p′ − p) V + (1 − p′) p (l + 1) ) / ( (1 − p) p (l + 1) ) ]
        = − E log[ 1 + (p′ − p)(V − EV) / ((1 − p) p (l + 1)) ].    (15)

To conclude, we will use the following lemma.

Lemma 5 The following inequality holds for any x ≥ x₀ with x₀ ∈ (0, 1):

    log(x) ≥ x − 1 − (x − 1)² / (2 x₀).

Proof Introduce f(x) = −(x − 1) + (x − 1)²/(2x₀) + log(x). We have

    f′(x) = −1 + (x − 1)/x₀ + 1/x = (x − 1)(x − x₀) / (x₀ x).

Hence f′ is negative on (x₀, 1) and positive on (1, +∞), so f attains its minimum over [x₀, +∞) at x = 1, where f(1) = 0. This leads to f nonnegative on [x₀, +∞).

Finally, from Lemma 5 and (15), using x₀ = 1 − p, we obtain

    KL(Z + S, Z′ + S) ≤ ( (p′ − p)² / (2 x₀) ) · E[(V − EV)²] / ((1 − p) p (l + 1))²
        = (p′ − p)² (l + 1) p (1 − p) / ( 2 (1 − p)³ p² (l + 1)² )
        = (p − p′)² / ( 2 (1 − p)² (l + 1) p ).

Acknowledgements

G. Lugosi is supported by the Spanish Ministry of Science and Technology grant MTM and by the PASCAL2 Network of Excellence under an EC grant.
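Both technical lemmas above lend themselves to a direct numerical check, in the spirit of the verification for k ≤ 10^6 invoked in the proof of Lemma 3. A small sketch (Python; the function name `lemma3_ratio` is ours) evaluates the ratio of Lemma 3 and the inequality of Lemma 5 on a grid, including the boundary case k = 1, c = 2 where the ratio equals 1/3 exactly:

```python
import math
from math import comb

def lemma3_ratio(k, c):
    """Left-hand side of Lemma 3: sum_i (1 - i/k) C(k,i)^2 c^i / sum_i C(k,i)^2 c^i."""
    num = sum((1 - i / k) * comb(k, i) ** 2 * c ** i for i in range(k + 1))
    den = sum(comb(k, i) ** 2 * c ** i for i in range(k + 1))
    return num / den

# Lemma 3: the ratio is at least 1/3 for every k and every 1 <= c <= 2
# (k kept moderate here to stay within float range).
for k in [1, 2, 5, 20, 100]:
    for c in [1.0, 1.25, 1.5, 1.75, 2.0]:
        assert lemma3_ratio(k, c) >= 1 / 3 - 1e-12

# Lemma 5: log(x) >= (x - 1) - (x - 1)^2 / (2 x0) for all x >= x0 in (0, 1).
for x0 in [0.1, 0.5, 0.9]:
    for j in range(1000):
        x = x0 + 0.01 * j
        assert math.log(x) >= (x - 1) - (x - 1) ** 2 / (2 * x0) - 1e-12
```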
References

[1] J. Abernethy, E. Hazan, and A. Rakhlin, Competing in the dark: An efficient algorithm for bandit linear optimization, Proceedings of the 21st Annual Conference on Learning Theory (COLT), 2008.
[2] J.-Y. Audibert and S. Bubeck, Minimax policies for adversarial and stochastic bandits, Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
[3] J.-Y. Audibert and S. Bubeck, Regret bounds and minimax policies under partial monitoring, Journal of Machine Learning Research, 2010.
[4] P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire, The non-stochastic multi-armed bandit problem, SIAM Journal on Computing, no. 1, 2002.
[5] B. Awerbuch and R. Kleinberg, Adaptive routing with end-to-end feedback: distributed learning and geometric approaches, STOC '04: Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, 2004.
[6] A. Beck and M. Teboulle, Mirror descent and nonlinear projected subgradient methods for convex optimization, Operations Research Letters, no. 3, 2003.
[7] S. Bubeck, Introduction to online optimization, Lecture Notes, 2011.
[8] S. Bubeck and N. Cesa-Bianchi, Regret analysis of stochastic and nonstochastic multi-armed bandit problems, Foundations and Trends in Machine Learning, no. 1, 2012.
[9] S. Bubeck, N. Cesa-Bianchi, and S. M. Kakade, Towards minimax policies for online linear optimization with bandit feedback, arXiv preprint.
[10] N. Cesa-Bianchi and G. Lugosi, Prediction, learning, and games, Cambridge University Press, 2006.
[11] N. Cesa-Bianchi and G. Lugosi, Combinatorial bandits, Journal of Computer and System Sciences, 2011, to appear.
[12] V. Dani, T. Hayes, and S. Kakade, The price of bandit information for online optimization, Advances in Neural Information Processing Systems (NIPS), vol. 20, 2008.
[13] Y. Freund and R. E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences, 1997.
[14] C. Gentile and M. Warmuth, Linear hinge loss and average margin, Advances in Neural Information Processing Systems (NIPS), 1998.
[15] A. Grove, N. Littlestone, and D. Schuurmans, General convergence results for linear discriminant updates, Machine Learning, 2001.
[16] A. György, T. Linder, G. Lugosi, and G. Ottucsák, The on-line shortest path problem under partial monitoring, Journal of Machine Learning Research, 2007.
[17] E. Hazan, The convex optimization approach to regret minimization, in Optimization for Machine Learning (S. Sra, S. Nowozin, and S. Wright, eds.), MIT Press, 2011.
[18] E. Hazan, S. Kale, and M. Warmuth, Learning rotations with little regret, Proceedings of the 23rd Annual Conference on Learning Theory (COLT), 2010.
[19] D. P. Helmbold and M. Warmuth, Learning permutations with exponential weights, Journal of Machine Learning Research, 2009.
[20] M. Herbster and M. Warmuth, Tracking the best expert, Machine Learning, 1998.
[21] J.-B. Hiriart-Urruty and C. Lemaréchal, Fundamentals of convex analysis, Springer, 2001.
[22] A. Kalai and S. Vempala, Efficient algorithms for online decision problems, Journal of Computer and System Sciences, 2005.
[23] S. Kale, L. Reyzin, and R. Schapire, Non-stochastic bandit slate problems, Advances in Neural Information Processing Systems (NIPS), 2010.
[24] J. Kiefer and J. Wolfowitz, Stochastic estimation of the maximum of a regression function, Annals of Mathematical Statistics, 1952.
[25] J. Kivinen and M. Warmuth, Relative loss bounds for multidimensional regression problems, Machine Learning, 2001.
[26] W. Koolen, M. Warmuth, and J. Kivinen, Hedging structured concepts, Proceedings of the 23rd Annual Conference on Learning Theory (COLT), 2010.
[27] H. McMahan and A. Blum, Online geometric optimization in the bandit setting against an adaptive adversary, Proceedings of the 17th Annual Conference on Learning Theory (COLT), 2004.
[28] A. Nemirovski, Efficient methods for large-scale convex optimization problems, Ekonomika i Matematicheskie Metody, 1979 (in Russian).
[29] A. Nemirovski and D. Yudin, Problem complexity and method efficiency in optimization, Wiley Interscience, 1983.
[30] A. Rakhlin, Lecture notes on online learning, 2009.
[31] H. Robbins and S. Monro, A stochastic approximation method, Annals of Mathematical Statistics, 1951.
[32] A. Schrijver, Combinatorial optimization, Springer, 2003.
[33] S. Shalev-Shwartz, Online learning: Theory, algorithms, and applications, Ph.D. thesis, The Hebrew University of Jerusalem, 2007.
[34] E. Takimoto and M. Warmuth, Paths kernels and multiplicative updates, Journal of Machine Learning Research, 2003.
[35] T. Uchiya, A. Nakamura, and M. Kudo, Algorithms for adversarial bandit problems with multiple plays, Proceedings of the 21st International Conference on Algorithmic Learning Theory (ALT), 2010.
[36] M. Warmuth and D. Kuzmin, Randomized online PCA algorithms with regret bounds that are logarithmic in the dimension, Journal of Machine Learning Research, 2008.
[37] M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, Proceedings of the Twentieth International Conference on Machine Learning (ICML), 2003.
More informationVapnik-Chervonenkis theory
Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown
More informationAnnouncements EWA with ɛ-exploration (recap) Lecture 20: EXP3 Algorithm. EECS598: Prediction and Learning: It s Only a Game Fall 2013.
Lecture 0: EXP3 Algorthm 1 EECS598: Predcton and Learnng: It s Only a Game Fall 013 Prof. Jacob Abernethy Lecture 0: EXP3 Algorthm Scrbe: Zhhao Chen Announcements None 0.1 EWA wth ɛ-exploraton (recap)
More informationSupporting Information
Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to
More informationSupplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso
Supplement: Proofs and Techncal Detals for The Soluton Path of the Generalzed Lasso Ryan J. Tbshran Jonathan Taylor In ths document we gve supplementary detals to the paper The Soluton Path of the Generalzed
More informationarxiv:submit/ [cs.lg] 30 Aug 2011
No Internal Regret va Neghborhood Watch Dean Foster Department of Statstcs Unversty of Pennsylvana Alexander Rakhln Department of Statstcs Unversty of Pennsylvana arxv:submt/0308560 cs.lg 30 Aug 2011 August
More informationLecture Space-Bounded Derandomization
Notes on Complexty Theory Last updated: October, 2008 Jonathan Katz Lecture Space-Bounded Derandomzaton 1 Space-Bounded Derandomzaton We now dscuss derandomzaton of space-bounded algorthms. Here non-trval
More informationEEE 241: Linear Systems
EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they
More informationCSCE 790S Background Results
CSCE 790S Background Results Stephen A. Fenner September 8, 011 Abstract These results are background to the course CSCE 790S/CSCE 790B, Quantum Computaton and Informaton (Sprng 007 and Fall 011). Each
More informationMarkov Chain Monte Carlo Lecture 6
where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways
More informationLecture Randomized Load Balancing strategies and their analysis. Probability concepts include, counting, the union bound, and Chernoff bounds.
U.C. Berkeley CS273: Parallel and Dstrbuted Theory Lecture 1 Professor Satsh Rao August 26, 2010 Lecturer: Satsh Rao Last revsed September 2, 2010 Lecture 1 1 Course Outlne We wll cover a samplng of the
More informationComputing Correlated Equilibria in Multi-Player Games
Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,
More informationThe lower and upper bounds on Perron root of nonnegative irreducible matrices
Journal of Computatonal Appled Mathematcs 217 (2008) 259 267 wwwelsevercom/locate/cam The lower upper bounds on Perron root of nonnegatve rreducble matrces Guang-Xn Huang a,, Feng Yn b,keguo a a College
More informationEconomics 101. Lecture 4 - Equilibrium and Efficiency
Economcs 0 Lecture 4 - Equlbrum and Effcency Intro As dscussed n the prevous lecture, we wll now move from an envronment where we looed at consumers mang decsons n solaton to analyzng economes full of
More informationCSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography
CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve
More informationLecture 4: Universal Hash Functions/Streaming Cont d
CSE 5: Desgn and Analyss of Algorthms I Sprng 06 Lecture 4: Unversal Hash Functons/Streamng Cont d Lecturer: Shayan Oves Gharan Aprl 6th Scrbe: Jacob Schreber Dsclamer: These notes have not been subjected
More informationLogistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI
Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton
More informationEcon107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)
I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes
More informationLinear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space.
Lnear, affne, and convex sets and hulls In the sequel, unless otherwse specfed, X wll denote a real vector space. Lnes and segments. Gven two ponts x, y X, we defne xy = {x + t(y x) : t R} = {(1 t)x +
More informationOn the Multicriteria Integer Network Flow Problem
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of
More informationFinding Dense Subgraphs in G(n, 1/2)
Fndng Dense Subgraphs n Gn, 1/ Atsh Das Sarma 1, Amt Deshpande, and Rav Kannan 1 Georga Insttute of Technology,atsh@cc.gatech.edu Mcrosoft Research-Bangalore,amtdesh,annan@mcrosoft.com Abstract. Fndng
More informationMaximal Margin Classifier
CS81B/Stat41B: Advanced Topcs n Learnng & Decson Makng Mamal Margn Classfer Lecturer: Mchael Jordan Scrbes: Jana van Greunen Corrected verson - /1/004 1 References/Recommended Readng 1.1 Webstes www.kernel-machnes.org
More information