Minimizing Regret on Reflexive Banach Spaces and Nash Equilibria in Continuous Zero-Sum Games


Maximilian Balandat, Walid Krichene, Claire Tomlin, Alexandre Bayen
Electrical Engineering and Computer Sciences, UC Berkeley

30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.

Abstract

We study a general adversarial online learning problem, in which we are given a decision set $\mathcal{X}$ in a reflexive Banach space $X$ and a sequence of reward vectors in the dual space of $X$. At each iteration, we choose an action from $\mathcal{X}$, based on the observed sequence of previous rewards. Our goal is to minimize regret. Using results from infinite dimensional convex analysis, we generalize the method of Dual Averaging to our setting and obtain upper bounds on the worst-case regret that generalize many previous results. Under the assumption of uniformly continuous rewards, we obtain explicit regret bounds in a setting where the decision set is the set of probability distributions on a compact metric space $S$. Importantly, we make no convexity assumptions on either $S$ or the reward functions. We also prove a general lower bound on the worst-case regret for any online algorithm. We then apply these results to the problem of learning in repeated two-player zero-sum games on compact metric spaces. In doing so, we first prove that if both players play a Hannan-consistent strategy, then with probability 1 the empirical distributions of play weakly converge to the set of Nash equilibria of the game. We then show that, under mild assumptions, Dual Averaging on the (infinite-dimensional) space of probability distributions indeed achieves Hannan-consistency.

1 Introduction

Regret analysis is a general technique for designing and analyzing algorithms for sequential decision problems in adversarial or stochastic settings (Shalev-Shwartz, 2012; Bubeck and Cesa-Bianchi, 2012). Online learning algorithms have applications in machine learning (Xiao, 2010), portfolio optimization (Cover, 1991), online convex optimization (Hazan et al., 2007) and other areas. Regret analysis also plays an important role in the study of repeated play of finite games (Hart and Mas-Colell, 2001). It is well known, for example, that in a two-player zero-sum finite game, if both players play according to a Hannan-consistent strategy (Hannan, 1957), their (marginal) empirical distributions of play almost surely converge to the set of Nash equilibria of the game (Cesa-Bianchi and Lugosi, 2006). Moreover, it can be shown that playing a strategy that achieves sublinear regret almost surely guarantees Hannan-consistency.

A natural question then is whether a similar result holds for games with infinite action sets. In this article we provide a positive answer. In particular, we prove that in a continuous two-player zero sum game over compact (not necessarily convex) metric spaces, if both players follow a Hannan-consistent strategy, then with probability 1 their empirical distributions of play weakly converge to the set of Nash equilibria of the game. This in turn raises another important question: Do algorithms that ensure Hannan-consistency exist in such a setting? More generally, can one develop algorithms that guarantee sub-linear growth of the worst-case regret? We answer these questions affirmatively as well. To this end, we develop a general framework to study the Dual Averaging (or Follow the Regularized Leader) method on reflexive Banach spaces. This framework generalizes a wide range of existing results in the literature, including algorithms for online learning on finite sets (Arora et al., 2012) and finite-dimensional online convex optimization (Hazan et al., 2007).

Given a convex subset $\mathcal{X}$ of a reflexive Banach space $X$, the generalized Dual Averaging (DA) method maximizes, at each iteration, the cumulative past rewards (which are elements of $X^*$, the dual space of $X$) minus a regularization term $h$. We show that under certain conditions, the maximizer in the DA update is the Fréchet gradient $Dh^*$ of the regularizer's conjugate function. In doing so, we develop a novel characterization of the duality between essential strong convexity of $h$ and essential Fréchet differentiability of $h^*$ in reflexive Banach spaces, which is of independent interest.

We apply these general results to the problem of minimizing regret when the rewards are uniformly continuous functions over a compact metric space $S$. Importantly, we do not assume convexity of either $S$ or the rewards, and show that it is possible to achieve sublinear regret under a mild geometric condition on $S$ (namely, the existence of a locally $Q$-regular Borel measure). We provide explicit bounds for a class of regularizers, which guarantee sublinear worst-case regret. We also prove a general lower bound on the regret for any online algorithm and show that DA asymptotically achieves this bound up to a $\sqrt{\log t}$ factor.

Our results are related to work by Lehrer (2003) and Sridharan and Tewari (2010); Srebro et al. (2011). Lehrer (2003) gives necessary geometric conditions for Blackwell approachability in infinite-dimensional spaces, but no implementable algorithm guaranteeing Hannan-consistency. Sridharan and Tewari (2010) derive general regret bounds for Mirror Descent (MD) under the assumption that the strategy set is uniformly bounded in the norm of the Banach space. We do not make such an assumption here. In fact, this assumption does not hold in general for our applications in Section 3.

The paper is organized as follows: In Section 2 we introduce and provide a general analysis of Dual Averaging in reflexive Banach spaces.
In Section 3 we apply these results to obtain explicit regret bounds on compact metric spaces with uniformly continuous reward functions. We use these results in Section 4 in the context of learning Nash equilibria in continuous two-player zero sum games, and provide a numerical example in Section 4.3. All proofs are given in the supplementary material.

2 Regret Minimization on Reflexive Banach Spaces

Consider a sequential decision problem in which we are to choose a sequence $(x_1, x_2, \dots)$ of actions from some feasible subset $\mathcal{X}$ of a reflexive Banach space $X$, and seek to maximize a sequence $(u_1(x_1), u_2(x_2), \dots)$ of rewards, where the $u_\tau : X \to \mathbb{R}$ are elements of a given subset $\mathcal{U} \subset X^*$, with $X^*$ the dual space of $X$. We assume that $x_t$, the action chosen at time $t$, may only depend on the sequence of previously observed reward vectors $(u_1, \dots, u_{t-1})$. We call any such algorithm an online algorithm. We consider the adversarial setting, i.e., we do not make any distributional assumptions on the rewards. In particular, they could be picked maliciously by some adversary.

The notion of regret is a standard measure of performance for such a sequential decision problem. For a sequence $(u_1, \dots, u_t)$ of reward vectors, and a sequence of decisions $(x_1, \dots, x_t)$ produced by an algorithm, the regret of the algorithm w.r.t. a (fixed) decision $x \in \mathcal{X}$ is the gap between the realized reward and the reward under $x$, i.e.,
$$ R_t(x) := \sum_{\tau=1}^{t} u_\tau(x) - \sum_{\tau=1}^{t} u_\tau(x_\tau). $$
The regret is defined as $R_t := \sup_{x \in \mathcal{X}} R_t(x)$. An algorithm is said to have sublinear regret if for any sequence $(u_t)_{t \ge 1}$ in the set of admissible reward functions $\mathcal{U}$, the regret grows sublinearly, i.e. $\limsup_t R_t / t \le 0$.

Example 1. Consider a finite action set $S = \{1, \dots, n\}$, let $X = X^* = \mathbb{R}^n$, and let $\mathcal{X} = \Delta^{n-1}$, the probability simplex in $\mathbb{R}^n$. A reward function can be identified with a vector $u \in \mathbb{R}^n$, such that the $i$-th element $u_i$ is the reward of action $i$. A choice $x \in \mathcal{X}$ corresponds to a randomization over the $n$ actions in $S$. This is the classic setting of many regret-minimizing algorithms in the literature.

Example 2. Suppose $S$ is a compact metric space with $\mu$ a finite measure on $S$.
Consider $X = X^* = L^2(S, \mu)$ and let $\mathcal{X} = \{x \in X : x \ge 0 \text{ a.e.}, \|x\|_1 = 1\}$. A reward function is an $L^2$-integrable function on $S$, and each choice $x \in \mathcal{X}$ corresponds to a probability distribution (absolutely continuous w.r.t. $\mu$) over $S$. We will explore a more general variant of this problem in Section 3.

In this Section, we prove a general bound on the worst-case regret for DA. DA was introduced by Nesterov (2009) for (finite dimensional) convex optimization, and has also been applied to online learning, e.g. by Xiao (2010). In the finite dimensional case, the method solves, at each iteration, the optimization problem
$$ x_{t+1} = \arg\max_{x \in \mathcal{X}} \; \eta_t \Big\langle \sum_{\tau=1}^{t} u_\tau, x \Big\rangle - h(x), $$
where $h$ is a strongly convex regularizer defined on $\mathcal{X} \subset \mathbb{R}^n$ and $(\eta_t)_{t \ge 0}$ is a sequence of learning rates. The regret analysis of the method relies on the duality between strong convexity and smoothness (Nesterov, 2009, Lemma 1). In order to generalize DA to our Banach space setting, we develop an analogous duality result in Theorem 1. In particular, we show that the correct notion of strong convexity is (uniform) essential strong convexity. Equipped with this duality result, we analyze the regret of the Dual Averaging method and derive a general bound in Theorem 2.

2.1 Preliminaries

Let $(X, \|\cdot\|)$ be a reflexive Banach space, and denote by $\langle \cdot, \cdot \rangle : X \times X^* \to \mathbb{R}$ the canonical pairing between $X$ and its dual space $X^*$, so that $\langle x, \xi \rangle := \xi(x)$ for all $x \in X$, $\xi \in X^*$. By the effective domain of an extended real-valued function $f : X \to [-\infty, +\infty]$ we mean the set $\operatorname{dom} f = \{x \in X : f(x) < +\infty\}$. A function $f$ is proper if $f > -\infty$ and $\operatorname{dom} f$ is non-empty. The conjugate or Legendre-Fenchel transform of $f$ is the function $f^* : X^* \to [-\infty, +\infty]$ given by
$$ f^*(\xi) = \sup_{x \in X} \; \langle x, \xi \rangle - f(x) \tag{1} $$
for all $\xi \in X^*$. If $f$ is proper, lower semicontinuous and convex, its subdifferential $\partial f$ is the set-valued mapping $\partial f(x) = \{\xi \in X^* : f(y) \ge f(x) + \langle y - x, \xi \rangle \text{ for all } y \in X\}$. We define $\operatorname{dom} \partial f := \{x \in X : \partial f(x) \ne \emptyset\}$. Let $\Gamma$ denote the set of all convex, lower semicontinuous functions $\gamma : [0, \infty) \to [0, \infty]$ such that $\gamma(0) = 0$, and let
$$ \Gamma_U := \{\gamma \in \Gamma : \forall r > 0,\ \gamma(r) > 0\}, \qquad \Gamma_L := \{\gamma \in \Gamma : \gamma(r)/r \to 0 \text{ as } r \to 0\}. \tag{2} $$
We now introduce some definitions. Additional results are reviewed in the supplementary material.

Definition 1 (Strömberg, 2011). A proper convex lower semicontinuous function $f : X \to (-\infty, \infty]$ is essentially strongly convex if
(i) $f$ is strictly convex on every convex subset of $\operatorname{dom} \partial f$,
(ii) $(\partial f)^{-1}$ is locally bounded on its domain,
(iii) for every $x_0 \in \operatorname{dom} \partial f$ there exist $\xi_0 \in X^*$ and $\gamma \in \Gamma_U$ such that
$$ f(x) \ge f(x_0) + \langle x - x_0, \xi_0 \rangle + \gamma(\|x - x_0\|), \quad \forall x \in X. \tag{3} $$
If (3) holds with $\gamma$ independent of $x_0$, $f$ is uniformly essentially strongly convex with modulus $\gamma$.
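As a quick worked instance of inequality (3), consider the special case (an assumption made here for illustration, not a setting singled out by the text) where $X$ is a Hilbert space identified with its dual and $f(x) = \tfrac{K}{2}\|x\|^2$ for some $K > 0$. Expanding the square shows that (3) then holds with equality, with a modulus that does not depend on $x_0$:

```latex
\begin{align*}
f(x) &= \tfrac{K}{2}\,\| x_0 + (x - x_0) \|^2 \\
     &= \tfrac{K}{2}\,\|x_0\|^2 + \langle x - x_0,\, K x_0 \rangle + \tfrac{K}{2}\,\|x - x_0\|^2 \\
     &= f(x_0) + \langle x - x_0,\, \xi_0 \rangle + \gamma(\|x - x_0\|),
     \qquad \xi_0 = K x_0, \quad \gamma(r) = \tfrac{K}{2}\, r^2 .
\end{align*}
```

Since $\gamma(r) = \tfrac{K}{2} r^2$ is convex, vanishes at $0$ and is positive for $r > 0$, it belongs to $\Gamma_U$, so this $f$ satisfies condition (iii) uniformly.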
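Definition 2 (Strömberg, 2011). A proper convex lower semicontinuous function $f : X \to (-\infty, \infty]$ is essentially Fréchet differentiable if $\operatorname{int} \operatorname{dom} f \ne \emptyset$, $f$ is Fréchet differentiable on $\operatorname{int} \operatorname{dom} f$ with Fréchet derivative $Df$, and $\|Df(x_j)\|_* \to \infty$ for any sequence $(x_j)_j$ in $\operatorname{int} \operatorname{dom} f$ converging to some boundary point of $\operatorname{dom} f$.

Definition 3. A proper Fréchet differentiable function $f : X \to (-\infty, \infty]$ is essentially strongly smooth if $\forall x_0 \in \operatorname{dom} \partial f$, $\exists\, \xi_0 \in X^*$, $\kappa \in \Gamma_L$ such that
$$ f(x) \le f(x_0) + \langle \xi_0, x - x_0 \rangle + \kappa(\|x - x_0\|), \quad \forall x \in X. \tag{4} $$
If (4) holds with $\kappa$ independent of $x_0$, $f$ is uniformly essentially strongly smooth with modulus $\kappa$.

With this we are now ready to give our main duality result:

Theorem 1. Let $f : X \to (-\infty, +\infty]$ be proper, lower semicontinuous and uniformly essentially strongly convex with modulus $\gamma \in \Gamma_U$. Then
(i) $f^*$ is proper and essentially Fréchet differentiable with Fréchet derivative
$$ Df^*(\xi) = \arg\max_{x \in X} \; \langle x, \xi \rangle - f(x). \tag{5} $$
If, in addition, $\tilde\gamma(r) := \gamma(r)/r$ is strictly increasing, then
$$ \|Df^*(\xi_1) - Df^*(\xi_2)\| \le \tilde\gamma^{-1}\big(\|\xi_1 - \xi_2\|_* / 2\big). \tag{6} $$
In other words, $Df^*$ is uniformly continuous with modulus of continuity $\chi(r) = \tilde\gamma^{-1}(r/2)$.
(ii) $f^*$ is uniformly essentially smooth with modulus $\gamma^*$.

Corollary 1. If $\gamma(r) \ge C r^{1+\kappa}$, $\forall r \ge 0$, then $\|Df^*(\xi_1) - Df^*(\xi_2)\| \le (2C)^{-1/\kappa} \|\xi_1 - \xi_2\|_*^{1/\kappa}$. In particular, with $\gamma(r) = \frac{K}{2} r^2$, Definition 1 becomes the classic definition of $K$-strong convexity, and (6) yields the result familiar from the finite-dimensional case that the gradient $Df^*$ is $1/K$ Lipschitz with respect to the dual norm (Nesterov, 2009, Lemma 1).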
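As a finite-dimensional sanity check of the Lipschitz statement with $K = 1$, the following sketch (an illustration, not taken from the paper) uses negative entropy on the simplex of Example 1. By Pinsker's inequality that regularizer is 1-strongly convex with respect to the $\ell_1$ norm, and the gradient of its conjugate is the softmax map, so one expects $\|Dh^*(\xi_1) - Dh^*(\xi_2)\|_1 \le \|\xi_1 - \xi_2\|_\infty$. The function names, dimension and sampling ranges are assumptions chosen for illustration.

```python
import math
import random


def softmax(xi):
    """Gradient of the conjugate of negative entropy restricted to the simplex."""
    m = max(xi)  # subtract the max for numerical stability
    w = [math.exp(v - m) for v in xi]
    z = sum(w)
    return [v / z for v in w]


def l1_dist(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))


def linf_dist(a, b):
    return max(abs(p - q) for p, q in zip(a, b))


def worst_lipschitz_ratio(n=5, trials=2000, seed=0):
    """Largest observed ratio ||softmax(xi1) - softmax(xi2)||_1 / ||xi1 - xi2||_inf."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        xi1 = [rng.uniform(-3.0, 3.0) for _ in range(n)]
        xi2 = [rng.uniform(-3.0, 3.0) for _ in range(n)]
        ratio = l1_dist(softmax(xi1), softmax(xi2)) / linf_dist(xi1, xi2)
        worst = max(worst, ratio)
    return worst


if __name__ == "__main__":
    print("worst observed ratio:", worst_lipschitz_ratio())
```

A random search like this can only fail to find a counterexample; it does not replace the proof, but it is a cheap way to catch a mis-stated norm or constant.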

2.2 Dual Averaging in Reflexive Banach Spaces

We call a proper convex function $h : X \to (-\infty, +\infty]$ a regularizer function on a set $\mathcal{X} \subset X$ if $h$ is essentially strongly convex and $\operatorname{dom} h = \mathcal{X}$. We emphasize that we do not assume $h$ to be Fréchet-differentiable. Definition 1 in conjunction with Lemma S.1 (supplemental material) implies that for any regularizer $h$, the supremum of any function of the form $\langle \cdot, \xi \rangle - h(\cdot)$ over $X$, where $\xi \in X^*$, will be attained at a unique element of $\mathcal{X}$, namely $Dh^*(\xi)$, the Fréchet gradient of $h^*$ at $\xi$.

DA with regularizer $h$ and a sequence of learning rates $(\eta_t)_{t \ge 1}$ generates a sequence of decisions using the simple update rule $x_{t+1} = Dh^*(\eta_t U_t)$, where $U_t = \sum_{\tau=1}^{t} u_\tau$ and $U_0 := 0$.

Theorem 2. Let $h$ be a uniformly essentially strongly convex regularizer on $\mathcal{X}$ with modulus $\gamma$ and let $(\eta_t)_{t \ge 1}$ be a positive non-increasing sequence of learning rates. Then, for any sequence of payoff functions $(u_t)_{t \ge 1}$ in $X^*$ for which there exists $M < \infty$ such that $\sup_{x \in \mathcal{X}} |\langle u_t, x \rangle| \le M$ for all $t$, the sequence of plays $(x_t)_{t \ge 0}$ given by
$$ x_{t+1} = Dh^*\Big(\eta_t \sum_{\tau=1}^{t} u_\tau\Big) \tag{7} $$
ensures that
$$ R_t(x) := \sum_{\tau=1}^{t} \langle u_\tau, x \rangle - \sum_{\tau=1}^{t} \langle u_\tau, x_\tau \rangle \le \frac{h(x) - \underline{h}}{\eta_t} + \sum_{\tau=1}^{t} \|u_\tau\|_* \, \tilde\gamma^{-1}\Big(\frac{\eta_{\tau-1}}{2} \|u_\tau\|_*\Big), \tag{8} $$
where $\underline{h} = \inf_{x \in \mathcal{X}} h(x)$, $\tilde\gamma(r) := \gamma(r)/r$ and $\eta_0 := \eta_1$.

It is possible to obtain a regret bound similar to (8) also in a continuous-time setting. In fact, following Kwon and Mertikopoulos (2014), we derive the bound (8) by first proving a bound on a suitably defined notion of continuous-time regret, and then bounding the difference between the continuous-time and discrete-time regrets. This analysis is detailed in the supplementary material.

Note that the condition that $\sup_{x \in \mathcal{X}} |\langle u_t, x \rangle| \le M$ in Theorem 2 is weaker than the one in Sridharan and Tewari (2010), as it does not imply a uniformly bounded strategy set (e.g., if $X = L^2(\mathbb{R})$ and $\mathcal{X}$ is the set of distributions on $X$, then $\mathcal{X}$ is unbounded in $L^2$, but the condition may still hold).

Theorem 2 provides a regret bound for a particular choice $x \in \mathcal{X}$. Recall that $R_t := \sup_{x \in \mathcal{X}} R_t(x)$. In Example 1 the set $\mathcal{X}$ is compact, so any continuous regularizer $h$ will be bounded, and hence taking the supremum over $x$ in (8) poses no issue.
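For the finite setting of Example 1, the bound (8) can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes $h$ to be negative entropy on the simplex, measured in the $\ell_1$ norm, so that $\gamma(r) = r^2/2$, $\tilde\gamma^{-1}(y) = 2y$, $Dh^*$ is the softmax map and $h(x) - \underline{h} \le \log n$. With these choices the right-hand side of (8) becomes $\log n / \eta_t + \sum_\tau \eta_{\tau-1} \|u_\tau\|_\infty^2$. The function names and the learning rate $\eta_t = 1/\sqrt{t}$ are hypothetical.

```python
import math
import random


def softmax(v):
    m = max(v)
    w = [math.exp(c - m) for c in v]
    z = sum(w)
    return [c / z for c in w]


def dual_averaging_hedge(rewards, eta):
    """Run entropy Dual Averaging on the simplex.

    rewards: list of reward vectors u_1, ..., u_T (lists of equal length n)
    eta:     function t -> eta_t for t >= 1 (positive, non-increasing)
    Returns (regret, bound), where bound is the right-hand side of (8)
    specialised to negative entropy and the l1 / l-infinity norm pair.
    """
    n = len(rewards[0])
    cumulative = [0.0] * n  # U_{t-1}
    realized = 0.0          # sum of <u_tau, x_tau>
    penalty = 0.0           # sum of eta_{tau-1} * ||u_tau||_inf^2
    for t, u in enumerate(rewards, start=1):
        eta_prev = eta(max(t - 1, 1))  # convention eta_0 := eta_1
        x = softmax([eta_prev * c for c in cumulative])  # x_t = Dh*(eta_{t-1} U_{t-1})
        realized += sum(ui * xi for ui, xi in zip(u, x))
        penalty += eta_prev * max(abs(ui) for ui in u) ** 2
        cumulative = [c + ui for c, ui in zip(cumulative, u)]
    regret = max(cumulative) - realized  # sup over the simplex is attained at a vertex
    bound = math.log(n) / eta(len(rewards)) + penalty
    return regret, bound


if __name__ == "__main__":
    rng = random.Random(1)
    us = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(500)]
    print(dual_averaging_hedge(us, lambda t: 1.0 / math.sqrt(t)))
```

The supremum over $x$ is harmless here precisely because the entropy is bounded on the simplex, which is the point made in the text about Example 1.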
However, this is not the case in our general setting, as the regularizer may be unbounded on $\mathcal{X}$. For instance, consider Example 2 with the entropy regularizer $h(x) = \int_S x(s) \log(x(s))\, ds$, which is easily seen to be unbounded on $\mathcal{X}$. As a consequence, obtaining a worst-case bound will in general require additional assumptions on the reward functions and the decision set $\mathcal{X}$. This will be investigated in detail in Section 3.

Corollary 2. Suppose that $\gamma(r) \ge C r^{1+\kappa}$, $\forall r \ge 0$, for some $C > 0$ and $\kappa > 0$. Then
$$ R_t(x) \le \frac{h(x) - \underline{h}}{\eta_t} + (2C)^{-1/\kappa} \sum_{\tau=1}^{t} \eta_{\tau-1}^{1/\kappa} \|u_\tau\|_*^{1+1/\kappa}. \tag{9} $$
In particular, if $\|u_t\|_* \le M$ for all $t$ and $\eta_t = \eta\, t^{-\beta}$, then
$$ R_t(x) \le \frac{h(x) - \underline{h}}{\eta}\, t^{\beta} + \frac{\kappa}{\kappa - \beta} \Big(\frac{\eta}{2C}\Big)^{1/\kappa} M^{1+1/\kappa}\, t^{1 - \beta/\kappa}. \tag{10} $$
Assuming $h$ is bounded, optimizing over $\beta$ yields a rate of $R_t(x) = O(t^{\kappa/(1+\kappa)})$: the two exponents $\beta$ and $1 - \beta/\kappa$ in (10) balance at $\beta = \kappa/(1+\kappa)$. In particular, if $\gamma(r) = \frac{K}{2} r^2$, which corresponds to the classic definition of strong convexity, then $R_t(x) = O(\sqrt{t})$. For non-vanishing $u_\tau$ we will need that $\eta_t \searrow 0$ for the sum in (9) to converge. Thus we could get potentially tighter control over the rate of this term for $\kappa < 1$, at the expense of larger constants.

3 Online Optimization on Compact Metric Spaces

We now apply the above results to the problem of minimizing regret on compact metric spaces under the additional assumption of uniformly continuous reward functions. We make no assumptions on convexity of either the feasible set or the rewards. Essentially, we lift the non-convex problem of minimizing a sequence of functions over the (possibly non-convex) set $S$ to the convex (albeit infinite-dimensional) problem of minimizing a sequence of linear functionals over a set $\mathcal{X}$ of probability measures (a convex subset of the vector space of measures on $S$).
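The lifting step can be made concrete on a small discretized example. The sketch below is only an illustration: the two-interval domain, the reward function and all names are hypothetical. It shows that the expected reward $\langle u, x \rangle = \sum_k u(s_k)\, x_k$ is linear in the distribution $x$ even though $u$ is not concave and the domain is not convex, and that nothing is lost by the lifting, since the best distribution is a point mass on a best action.

```python
import math

# Hypothetical non-convex domain: grid points of [0, 0.3] union [0.7, 1].
GRID = [k / 100 for k in range(0, 31)] + [k / 100 for k in range(70, 101)]


def reward(s):
    """Hypothetical non-concave reward on the domain."""
    return math.sin(12 * s) * (1 - s)


def expected_reward(x):
    """<u, x> = sum_k u(s_k) * x_k, a linear functional of the distribution x."""
    return sum(reward(s) * p for s, p in zip(GRID, x))


def point_mass(k):
    """Distribution that plays grid point k with probability 1."""
    return [1.0 if j == k else 0.0 for j in range(len(GRID))]


def mix(x, y, lam):
    """Convex combination lam * x + (1 - lam) * y of two distributions."""
    return [lam * a + (1 - lam) * b for a, b in zip(x, y)]


if __name__ == "__main__":
    uniform = [1.0 / len(GRID)] * len(GRID)
    best = max(range(len(GRID)), key=lambda k: reward(GRID[k]))
    print("best pure action:", GRID[best], "reward:", reward(GRID[best]))
    print("uniform mixture reward:", expected_reward(uniform))
```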

3.1 An Upper Bound on the Worst-Case Regret

Let $(S, d)$ be a compact metric space, and let $\mu$ be a Borel measure on $S$. Suppose that the reward vectors $u_\tau$ are given by elements in $L^q(S, \mu)$, where $q > 1$. Let $X = L^p(S, \mu)$, where $p$ and $q$ are Hölder conjugates, i.e., $\frac{1}{p} + \frac{1}{q} = 1$. Consider $\mathcal{X} = \{x \in X : x \ge 0 \text{ a.e.}, \|x\|_1 = 1\}$, the set of probability measures on $S$ that are absolutely continuous w.r.t. $\mu$ with $p$-integrable Radon-Nikodym derivatives. Moreover, denote by $\mathcal{Z}$ the class of non-decreasing $\chi : [0, \infty) \to [0, \infty]$ such that $\lim_{r \to 0} \chi(r) = \chi(0) = 0$. The following assumption will be made throughout this section:

Assumption 1. The reward vectors $u_t$ have modulus of continuity $\chi$ on $S$, uniformly in $t$. That is, there exists $\chi \in \mathcal{Z}$ such that $|u_t(s) - u_t(s')| \le \chi(d(s, s'))$ for all $t$ and for all $s, s' \in S$.

Let $B(s, r) = \{s' \in S : d(s, s') < r\}$ and denote by $\mathcal{B}(s, \delta) \subset \mathcal{X}$ the elements of $\mathcal{X}$ with support contained in $B(s, \delta)$. Furthermore, let $D_S := \sup_{s, s' \in S} d(s, s')$. Then we have the following:

Theorem 3. Let $(S, d)$ be compact, and suppose that Assumption 1 holds. Let $h$ be a uniformly essentially strongly convex regularizer on $\mathcal{X}$ with modulus $\gamma$, and let $(\eta_t)_{t \ge 1}$ be a positive non-increasing sequence of learning rates. Then, under (7), for any positive sequence $(\vartheta_t)_{t \ge 1}$,
$$ R_t \le \frac{\sup_{s \in S} \inf_{x \in \mathcal{B}(s, \vartheta_t)} h(x) - \underline{h}}{\eta_t} + t\, \chi(\vartheta_t) + \sum_{\tau=1}^{t} \|u_\tau\|_* \, \tilde\gamma^{-1}\Big(\frac{\eta_{\tau-1}}{2} \|u_\tau\|_*\Big). \tag{11} $$

Remark 1. The sequence $(\vartheta_t)_{t \ge 1}$ in Theorem 3 is not a parameter of the algorithm, but rather a parameter in the regret bound. In particular, (11) holds true for any such sequence, and we will use this fact later on to obtain explicit bounds by instantiating (11) with a particular choice of $(\vartheta_t)_{t \ge 1}$.

It is important to realize that the infimum over $\mathcal{B}(s, \vartheta_t)$ in (11) may be infinite, in which case the bound is meaningless. This happens for example if $s$ is an isolated point of some $S \subset \mathbb{R}^n$ and $\mu$ is the Lebesgue measure, in which case $\mathcal{B}(s, \vartheta_t) = \emptyset$. However, under an additional regularity assumption on the measure $\mu$ we can avoid such degenerate situations.

Definition 4 (Heinonen et al., 2015). A Borel measure $\mu$ on a metric space $(S, d)$ is (Ahlfors) $Q$-regular if there exist $0 < c_0 \le C_0 < \infty$ such that for any open ball $B(s, r)$
$$ c_0 r^Q \le \mu(B(s, r)) \le C_0 r^Q. \tag{12} $$
We say that $\mu$ is $r_0$-locally $Q$-regular if (12) holds for all $0 < r \le r_0$.

Intuitively, under an $r_0$-locally $Q$-regular measure, the mass in the neighborhood of any point of $S$ is uniformly bounded from above and below. This will allow us, at each iteration, to assign sufficient probability mass around the maximizer(s) of the cumulative reward function.

Example 3. The canonical example for a $Q$-regular measure is the Lebesgue measure $\lambda$ on $\mathbb{R}^n$. If $d$ is the metric induced by the Euclidean norm, then $Q = n$ and the bound (12) is tight with $c_0 = C_0$, a dimensional constant. However, for general sets $S \subset \mathbb{R}^n$, $\lambda$ need not be locally $Q$-regular. A sufficient condition for local regularity of $\lambda$ is that $S$ is $v$-uniformly fat (Krichene et al., 2015).

Assumption 2. The measure $\mu$ is $r_0$-locally $Q$-regular on $(S, d)$.

Under Assumption 2, $\mathcal{B}(s, \vartheta_t) \ne \emptyset$ for all $s \in S$ and $\vartheta_t > 0$, hence we may hope for a bound on $\inf_{x \in \mathcal{B}(s, \vartheta_t)} h(x)$ uniform in $s$. To obtain explicit convergence rates, we have to consider a more specific class of regularizers.

3.2 Explicit Rates for f-divergences on $L^p(S)$

We consider a particular class of regularizers called f-divergences or Csiszár divergences (Csiszár, 1967). Following Audibert et al. (2014), we define $\omega$-potentials and the associated f-divergence.

Definition 5. Let $\omega \le 0$ and $a \in (-\infty, +\infty]$. A continuous increasing diffeomorphism $\phi : (-\infty, a) \to (\omega, \infty)$ is an $\omega$-potential if $\lim_{z \to -\infty} \phi(z) = \omega$, $\lim_{z \to a} \phi(z) = +\infty$ and $\phi(0) \le 1$. Associated to $\phi$ is the convex function $f_\phi : [0, \infty) \to \mathbb{R}$ defined by $f_\phi(x) = \int_1^x \phi^{-1}(z)\, dz$ and the $f_\phi$-divergence, defined by $h_\phi(x) = \int_S f_\phi(x(s))\, d\mu(s) + \iota_{\mathcal{X}}(x)$, where $\iota_{\mathcal{X}}$ is the indicator function of $\mathcal{X}$ (i.e. $\iota_{\mathcal{X}}(x) = 0$ if $x \in \mathcal{X}$ and $\iota_{\mathcal{X}}(x) = +\infty$ if $x \notin \mathcal{X}$).

A remarkable fact is that for regularizers based on $\omega$-potentials, the DA update (7) can be computed efficiently. More precisely, it can be shown (see Proposition 3 in Krichene (2015)) that the maximizer in this case has a simple expression in terms of the dual problem, and the problem of computing $x_{t+1} = Dh^*(\eta_t \sum_{\tau=1}^{t} u_\tau)$ reduces to computing a scalar dual variable $\nu_t$.
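To illustrate how the update can reduce to a one-dimensional search, here is a minimal sketch on a discretized domain. It assumes (this form is derived here from first-order optimality conditions and is not quoted from Krichene (2015), whose precise statement should be consulted) that the maximizer is $x(s) = \max(\phi(\xi(s) + \nu), 0)$ with $\xi = \eta_t U_t$, where the scalar $\nu$ is chosen so that $x$ integrates to one. Since $\phi$ is increasing, the total mass is monotone in $\nu$ and bisection applies. The function name `da_update`, the bracket `[lo, hi]` and the grid weights are hypothetical.

```python
import math


def da_update(xi, weights, phi, lo=-50.0, hi=50.0, iters=200):
    """Discretized Dual Averaging update for a potential phi.

    xi:      values of eta_t * U_t at the grid points
    weights: mu-mass of the cell around each grid point
    phi:     increasing potential function
    Returns (density values at the grid points, scalar dual variable nu).
    Assumes the mass at `lo` is below 1 and the mass at `hi` is above 1.
    """
    def mass(nu):
        return sum(w * max(phi(v + nu), 0.0) for v, w in zip(xi, weights))

    for _ in range(iters):  # bisection on the scalar dual variable
        mid = 0.5 * (lo + hi)
        if mass(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    return [max(phi(v + nu), 0.0) for v in xi], nu


if __name__ == "__main__":
    xi = [0.3, -1.2, 2.0, 0.5]
    weights = [0.25] * 4
    density, nu = da_update(xi, weights, lambda z: math.exp(z - 1.0))
    print(density, nu)
```

For the exponential potential $\phi(z) = e^{z-1}$ the normalization can of course be done in closed form; the bisection is only needed for potentials where no such formula is available.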

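Proposition 1. Suppose that $\mu(S) = 1$, and that Assumption 2 holds with constants $r_0 > 0$ and $0 < c_0 \le C_0 < \infty$. Under the Assumptions of Theorem 3, with $h = h_\phi$ the regularizer associated to an $\omega$-potential $\phi$, we have that, for any positive sequence $(\vartheta_t)_{t \ge 1}$ with $\vartheta_t \le r_0$,
$$ R_t \le \frac{\min(C_0 \vartheta_t^Q, \mu(S))}{\eta_t}\, f_\phi\big(c_0^{-1} \vartheta_t^{-Q}\big) + t\, \chi(\vartheta_t) + \sum_{\tau=1}^{t} \|u_\tau\|_* \, \tilde\gamma^{-1}\Big(\frac{\eta_{\tau-1}}{2} \|u_\tau\|_*\Big). \tag{13} $$
For particular choices of the sequences $(\eta_t)_{t \ge 1}$ and $(\vartheta_t)_{t \ge 1}$, we can derive explicit regret rates.

3.3 Analysis for Entropy Dual Averaging (The Generalized Hedge Algorithm)

Taking $\phi(z) = e^{z-1}$, we have that $f_\phi(x) = \int_1^x \phi^{-1}(z)\, dz = x \log x$, and hence the regularizer is $h_\phi(x) = \int_S x(s) \log x(s)\, d\mu(s)$. Then
$$ Dh^*(\xi)(s) = \frac{\exp \xi(s)}{\|\exp \xi(s)\|_1}. $$
This corresponds to a generalized Hedge algorithm (Arora et al., 2012; Krichene et al., 2015) or the entropic barrier of Bubeck and Eldan (2014) for Euclidean spaces. The regularizer $h_\phi$ can be shown to be essentially strongly convex with modulus $\gamma(r) = \frac{1}{2} r^2$.

Corollary 3. Suppose that $\mu(S) = 1$, that $\mu$ is $r_0$-locally $Q$-regular with constants $c_0, C_0$, that $\|u_t\|_* \le M$ for all $t$, and that $\chi(r) = C_\alpha r^\alpha$ for $0 < \alpha \le 1$ (that is, the rewards are $\alpha$-Hölder continuous). Then, under Entropy Dual Averaging, choosing $\eta_t = \eta \sqrt{\log t / t}$ with
$$ \eta = \frac{1}{M} \Big( \frac{C_0}{2 c_0} \Big( \log\big(c_0^{-1} \vartheta^{-Q/\alpha}\big) + \frac{Q}{2\alpha} \Big) \Big)^{1/2} $$
and $\vartheta > 0$, we have that
$$ R_t \le \bigg( 2M \sqrt{\frac{2 C_0}{c_0} \Big( \log\big(c_0^{-1} \vartheta^{-Q/\alpha}\big) + \frac{Q}{2\alpha} \Big)} + C_\alpha \vartheta \bigg) \sqrt{t \log t} \tag{14} $$
whenever $\sqrt{\log t / t} < r_0^\alpha \vartheta^{-1}$.

One can now further optimize over the choice of $\vartheta$ to obtain the best constant in the bound. Note also that the case $\alpha = 1$ corresponds to Lipschitz continuity.

3.4 A General Lower Bound

Theorem 4. Let $(S, d)$ be compact, suppose that Assumption 2 holds, and let $w : \mathbb{R} \to \mathbb{R}$ be any function with modulus of continuity $\chi \in \mathcal{Z}$ such that $\|w(d(\cdot, s'))\|_q \le M$ for some $s' \in S$ for which there exists $s \in S$ with $d(s, s') = D_S$. Then for any online algorithm, there exists a sequence $(u_\tau)_{\tau=1}^{t}$ of reward vectors $u_\tau \in X^*$ with $\|u_\tau\|_* \le M$ and modulus of continuity $\chi_\tau < \chi$ such that
$$ R_t \ge \frac{w(D_S)}{2\sqrt{2}} \sqrt{t}. \tag{15} $$
Maximizing the constant in (15) is of interest in order to benchmark the bound against the upper bounds obtained in the previous sections. This problem is however quite challenging, and we will defer this analysis to future work.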
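The constant $1/(2\sqrt{2})$ in (15) is the one produced by a classical two-point randomized construction, sketched below as a toy illustration; whether the proof in the supplementary material proceeds exactly this way is an assumption. Put reward $+\varepsilon_\tau w/2$ on one point and $-\varepsilon_\tau w/2$ on the other, with i.i.d. uniform signs $\varepsilon_\tau$. Any online algorithm then earns zero in expectation, while the better of the two points earns $(w/2)\,|\sum_\tau \varepsilon_\tau|$, and Khintchine's inequality gives $E|\sum_\tau \varepsilon_\tau| \ge \sqrt{t/2}$. The code computes the expectation exactly from the binomial distribution; all function names are hypothetical.

```python
import math


def expected_abs_sum(t):
    """Exact E|eps_1 + ... + eps_t| for i.i.d. uniform signs eps in {-1, +1}."""
    total = sum(math.comb(t, k) * abs(t - 2 * k) for k in range(t + 1))
    return total / 2 ** t


def expected_regret_two_points(t, w):
    """Expected regret of any algorithm against the random two-point rewards.

    The algorithm's expected reward is 0, the best fixed point earns
    (w / 2) * |sum of signs|, so the expected regret is (w / 2) * E|sum|.
    """
    return 0.5 * w * expected_abs_sum(t)


def lower_bound(t, w):
    """Right-hand side of (15) with w(D_S) replaced by the number w."""
    return w * math.sqrt(t) / (2 * math.sqrt(2))


if __name__ == "__main__":
    for t in (1, 2, 10, 100):
        print(t, expected_regret_two_points(t, 1.0), lower_bound(t, 1.0))
```

Because the expected regret over the random signs is at least the bound, some realization of the signs must force at least that much regret, which is how a randomized construction yields a worst-case statement.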
For Hölder-continuous functions, we have the following result:

Proposition 2. In the setting of Theorem 4, suppose that $\mu(S) = 1$ and that $\chi(r) = C_\alpha r^\alpha$ for some $0 < \alpha \le 1$. Then
$$ R_t \ge \frac{\min\big(C_\alpha^{1/\alpha} D_S^\alpha, M\big)}{2\sqrt{2}} \sqrt{t}. \tag{16} $$
Observe that, up to a $\sqrt{\log t}$ factor, the asymptotic rate of this general lower bound for any online algorithm matches that of the upper bound (14) of Entropy Dual Averaging.

4 Learning in Continuous Two-Player Zero-Sum Games

Consider a two-player zero sum game $G = (S^1, S^2, u)$, in which the strategy spaces $S^1$ and $S^2$ of player 1 and 2, respectively, are Hausdorff spaces, and $u : S^1 \times S^2 \to \mathbb{R}$ is the payoff function of player 1 (as $G$ is zero-sum, the payoff function of player 2 is $-u$). For each $i$, denote by $\mathcal{P}^i := \mathcal{P}(S^i)$ the set of Borel probability measures on $S^i$. Denote $S := S^1 \times S^2$ and $\mathcal{P} := \mathcal{P}^1 \times \mathcal{P}^2$. For a (joint) mixed strategy $x \in \mathcal{P}$, we define the natural extension $\bar{u} : \mathcal{P} \to \mathbb{R}$ by $\bar{u}(x) := E_x[u] = \int_S u(s^1, s^2)\, dx(s^1, s^2)$, which is the expected payoff of player 1 under $x$.

A continuous zero-sum game $G$ is said to have value $V$ if
$$ \sup_{x^1 \in \mathcal{P}^1} \inf_{x^2 \in \mathcal{P}^2} \bar{u}(x^1, x^2) = \inf_{x^2 \in \mathcal{P}^2} \sup_{x^1 \in \mathcal{P}^1} \bar{u}(x^1, x^2) = V. \tag{17} $$
The elements $x^1 \times x^2 \in \mathcal{P}$ at which (17) holds are the (mixed) Nash Equilibria of $G$. We denote the set of Nash equilibria of $G$ by $\mathcal{N}(G)$. In the case of finite games, it is well known that every two-player zero-sum game has a value. This is not true in general for continuous games, and additional conditions on strategy sets and payoffs are required, see e.g. (Glicksberg, 1950).

4.1 Repeated Play

We consider repeated play of the continuous two-player zero-sum game. Given a game $G$ and a sequence of plays $(s^1_t)_{t \ge 1}$ and $(s^2_t)_{t \ge 1}$, we say that player $i$ has sublinear (realized) regret if
$$ \limsup_{t \to \infty} \; \sup_{s^i \in S^i} \frac{1}{t} \sum_{\tau=1}^{t} \Big( u^i(s^i, s^{-i}_\tau) - u^i(s^i_\tau, s^{-i}_\tau) \Big) \le 0 \tag{18} $$
where we use $-i$ to denote the other player.

A strategy $\sigma^i$ for player $i$ is, loosely speaking, a (possibly random) mapping from past observations to its actions. Of primary interest to us are Hannan-consistent strategies:

Definition 6 (Hannan, 1957). A strategy $\sigma^i$ of player $i$ is Hannan consistent if, for any sequence $(s^{-i}_t)_{t \ge 1}$, the sequence of plays $(s^i_t)_{t \ge 1}$ generated by $\sigma^i$ has sublinear regret almost surely.

Note that the almost sure statement in Definition 6 is with respect to the randomness in the strategy $\sigma^i$. The following result is a generalization of its counterpart for discrete games (e.g. Corollary 7.1 in (Cesa-Bianchi and Lugosi, 2006)):

Proposition 3. Suppose $G$ has value $V$ and consider a sequence of plays $(s^1_t)_{t \ge 1}$, $(s^2_t)_{t \ge 1}$ and assume that both players have sublinear realized regret. Then $\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=1}^{t} u(s^1_\tau, s^2_\tau) = V$.

As in the discrete case (Cesa-Bianchi and Lugosi, 2006), we can also say something about convergence of the empirical distributions of play to the set of Nash Equilibria. Since these distributions have finite support for every $t$, we can at best hope for convergence in the weak sense as follows:

Theorem 5. Suppose that in a repeated two-player zero sum game $G$ that has a value both players follow a Hannan-consistent strategy, and denote by $\hat{x}^i_t = \frac{1}{t} \sum_{\tau=1}^{t} \delta_{s^i_\tau}$ the marginal empirical distribution of play of player $i$ at iteration $t$.
Let $\hat{x}_t := (\hat{x}^1_t, \hat{x}^2_t)$. Then $\hat{x}_t \rightharpoonup \mathcal{N}(G)$ almost surely, that is, with probability 1 the sequence $(\hat{x}_t)_{t \ge 1}$ weakly converges to the set of Nash equilibria of $G$.

Corollary 4. If $G$ has a unique Nash equilibrium $x^*$, then with probability 1, $\hat{x}_t \rightharpoonup x^*$.

4.2 Hannan-Consistent Strategies

By Theorem 5, if each player follows a Hannan-consistent strategy, then the empirical distributions of play weakly converge to the set of Nash equilibria of the game. But do such strategies exist? Regret minimizing strategies are intuitive candidates, and the intimate connection between regret minimization and learning in games is well studied in many cases, e.g. for finite games (Cesa-Bianchi and Lugosi, 2006) or potential games (Monderer and Shapley, 1996). Using our results from Section 3, we will show that, under the appropriate assumption on the information revealed to the player, no-regret learning based on Dual Averaging leads to Hannan consistency in our setting.

Specifically, suppose that after each iteration $t$, each player $i$ observes a partial payoff function $\tilde{u}^i_t : S^i \to \mathbb{R}$ describing their payoff as a function of only their own action, $s^i$, holding the action played by the other player fixed. That is, $\tilde{u}^1_t(s^1) := u(s^1, s^2_t)$ and $\tilde{u}^2_t(s^2) := -u(s^1_t, s^2)$.

Remark 2. Note that we do not assume that the players have knowledge of the joint utility function $u$. However, we do assume that the player has full information feedback, in the sense that they observe partial reward functions $u(\cdot, s^{-i}_\tau)$ on their entire action set, as opposed to only observing the reward $u(s^1_\tau, s^2_\tau)$ of the action played (the latter corresponds to the bandit setting).

We denote by $\tilde{U}^i_t = (\tilde{u}^i_\tau)_{\tau=1}^{t}$ the sequence of partial payoff functions observed by player $i$. We use $\mathcal{U}^i_t$ to denote the set of all possible such histories, and define $\mathcal{U}^i_0 := \emptyset$. A strategy $\sigma^i$ of player $i$ is a collection $(\sigma^i_t)_{t=1}^{\infty}$ of (possibly random) mappings $\sigma^i_t : \mathcal{U}^i_{t-1} \to S^i$, such that at iteration $t$, player $i$ plays $s^i_t = \sigma^i_t(\tilde{U}^i_{t-1})$. We make the following assumption on the payoff function:

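Assumption 3. The payoff function $u$ is uniformly continuous in $s^i$ with modulus of continuity independent of $s^{-i}$ for $i = 1, 2$. That is, for each $i$ there exists $\chi^i \in \mathcal{Z}$ such that $|u(s, s^{-i}) - u(s', s^{-i})| \le \chi^i(d^i(s, s'))$ for all $s^{-i} \in S^{-i}$.

It is easy to see that Assumption 3 implies that the game has a value (see supplementary material). It also makes our setting compatible with that of Section 3. Suppose now that each player randomizes their play according to the sequence of probability distributions on $S^i$ generated by DA with regularizer $h_i$. That is, suppose that each $\sigma^i_t$ is a random variable with the following distribution:
$$ \sigma^i_t \sim Dh_i^*\Big(\eta_{t-1} \sum_{\tau=1}^{t-1} \tilde{u}^i_\tau\Big). \tag{19} $$

Theorem 6. Suppose that player $i$ uses strategy $\sigma^i$ according to (19), and that the DA algorithm ensures sublinear regret (i.e. $\limsup_t R_t / t \le 0$). Then $\sigma^i$ is Hannan-consistent.

Corollary 5. If both players use strategies according to (19) with the respective Dual Averaging ensuring that $\limsup_t R_t / t \le 0$, then with probability 1 the sequence $(\hat{x}_t)_{t \ge 1}$ of empirical distributions of play weakly converges to the set of Nash equilibria of $G$.

4.3 Example

Consider a zero-sum game $G_1$ between two players on the unit interval with payoff function $u(s^1, s^2) = s^1 s^2 - a^1 s^1 - a^2 s^2$, where $a^1 = \frac{e-2}{e-1}$ and $a^2 = \frac{1}{e-1}$. It is easy to verify that the pair
$$ \big(x^1(s), x^2(s)\big) = \Big( \frac{\exp(s)}{e-1}, \frac{\exp(1-s)}{e-1} \Big) $$
is a mixed-strategy Nash equilibrium of $G_1$: the mean of $x^1$ is $a^2$ and the mean of $x^2$ is $a^1$, so each player is indifferent between all of their actions. For sequences $(s^1_\tau)_{\tau=1}^{t}$ and $(s^2_\tau)_{\tau=1}^{t}$, the cumulative payoff functions for fixed action $s \in [0, 1]$ are given, respectively, by
$$ U^1_t(s^1) = \Big( \sum_{\tau=1}^{t} s^2_\tau - a^1 t \Big) s^1 - a^2 \sum_{\tau=1}^{t} s^2_\tau, \qquad U^2_t(s^2) = \Big( a^2 t - \sum_{\tau=1}^{t} s^1_\tau \Big) s^2 + a^1 \sum_{\tau=1}^{t} s^1_\tau. $$
If each player $i$ uses the Generalized Hedge Algorithm with learning rates $(\eta_\tau)_{\tau=1}^{t}$, their strategy in period $t$ is to sample from the distribution $x^i_t(s) \propto \exp(\alpha^i_t s)$, where $\alpha^1_t = \eta_t \big( \sum_{\tau=1}^{t} s^2_\tau - a^1 t \big)$ and $\alpha^2_t = \eta_t \big( a^2 t - \sum_{\tau=1}^{t} s^1_\tau \big)$.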
Interestingly, in this case the sum of the opponent's past plays is a sufficient statistic, in the sense that it completely determines the mixed strategy at time $t$.

[Figure 1: Normalized histograms of the empirical distributions of play in $G_1$ (100 bins). Panels show player 1 against $x^1(s)$ and player 2 against $x^2(s)$ at three iteration counts, one of which is $t = 50000$.]

Figure 1 shows normalized histograms of the empirical distributions of play at different iterations $t$. As $t$ grows the histograms approach the equilibrium densities $x^1$ and $x^2$, respectively. However, this does not mean that the individual strategies $x^i_t$ converge. Indeed, Figure 2 shows the $\alpha^i_t$ oscillating around the equilibrium parameters $1$ and $-1$, respectively, even for very large $t$. We do, however, observe that the time-averaged parameters $\bar\alpha^i_t$ converge to the equilibrium values $1$ and $-1$.

[Figure 2: Evolution of parameters $\alpha^i_t$ and $\bar\alpha^i_t := \frac{1}{t} \sum_{\tau=1}^{t} \alpha^i_\tau$ in $G_1$.]

In the supplementary material we provide additional numerical examples, including one that illustrates how our algorithms can be utilized as a tool to compute approximate Nash equilibria in continuous zero-sum games on non-convex domains.
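The example can be reproduced in a few lines. The sketch below is an illustration under stated assumptions rather than the authors' experiment: it uses the learning rate $\eta_t = \sqrt{\log t / t}$ with no tuned constant, draws from the density proportional to $\exp(\alpha s)$ on $[0, 1]$ by inverting its CDF, and tracks only the sums of past plays, which is all the update needs. The function names, the seed, the horizon and the clamping of $\alpha$ are hypothetical choices.

```python
import math
import random

A1 = (math.e - 2) / (math.e - 1)  # a^1 in the payoff function
A2 = 1 / (math.e - 1)             # a^2 in the payoff function


def mean_exp_density(alpha):
    """Mean of the density proportional to exp(alpha * s) on [0, 1], alpha != 0."""
    return math.exp(alpha) / (math.exp(alpha) - 1) - 1 / alpha


def sample_exp_density(alpha, u):
    """Inverse-CDF sample from the density proportional to exp(alpha * s) on [0, 1]."""
    alpha = max(-700.0, min(700.0, alpha))  # keep exp() within floating-point range
    if abs(alpha) < 1e-12:
        return u  # the uniform density
    if alpha > 0:
        return 1.0 + math.log(u + (1.0 - u) * math.exp(-alpha)) / alpha
    return math.log(1.0 - u * (1.0 - math.exp(alpha))) / alpha


def simulate(horizon, seed=0):
    """Both players run Generalized Hedge; returns the empirical means of play."""
    rng = random.Random(seed)
    sum1 = sum2 = 0.0  # cumulative plays of player 1 and player 2
    for t in range(1, horizon + 1):
        n = t - 1  # number of completed rounds
        eta = math.sqrt(math.log(n) / n) if n >= 1 else 0.0
        alpha1 = eta * (sum2 - A1 * n)  # slope of player 1's cumulative payoff
        alpha2 = eta * (A2 * n - sum1)  # slope of player 2's cumulative payoff
        s1 = sample_exp_density(alpha1, rng.random())
        s2 = sample_exp_density(alpha2, rng.random())
        sum1 += s1
        sum2 += s2
    return sum1 / horizon, sum2 / horizon


if __name__ == "__main__":
    m1, m2 = simulate(5000)
    print("empirical means:", m1, m2, "equilibrium means:", A2, A1)
```

Because the payoff is bilinear in the two actions, a mixed strategy enters the expected payoff only through its mean, so comparing the empirical means of play with $a^2$ and $a^1$ is a natural first check before looking at full histograms.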

References

Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8(1), 2012.
Jean-Yves Audibert, Sébastien Bubeck, and Gábor Lugosi. Regret in online combinatorial optimization. Mathematics of Operations Research, 39(1):31-45, 2014.
S. Bubeck and R. Eldan. The entropic barrier: a simple and optimal universal self-concordant barrier. ArXiv e-prints, December 2014.
Sébastien Bubeck and Nicolò Cesa-Bianchi. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning, 5(1):1-122, 2012.
Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge UP, 2006.
Thomas M. Cover. Universal portfolios. Mathematical Finance, 1(1):1-29, 1991.
Imre Csiszár. Information-type measures of difference of probability distributions and indirect observations. Studia Scientiarum Mathematicarum Hungarica, 2, 1967.
Irving L. Glicksberg. Minimax theorem for upper and lower semicontinuous payoffs. Research Memorandum RM-478, The RAND Corporation, Oct 1950.
James Hannan. Approximation to Bayes risk in repeated play. In Contributions to the Theory of Games, vol III of Annals of Mathematics Studies 39. Princeton University Press, 1957.
Sergiu Hart and Andreu Mas-Colell. A general class of adaptive strategies. Journal of Economic Theory, 98(1):26-54, 2001.
Elad Hazan, Amit Agarwal, and Satyen Kale. Logarithmic regret algorithms for online convex optimization. Machine Learning, 69(2-3), 2007.
Juha Heinonen, Pekka Koskela, Nageswari Shanmugalingam, and Jeremy T. Tyson. Sobolev Spaces on Metric Measure Spaces: An Approach Based on Upper Gradients. New Mathematical Monographs. Cambridge University Press, 2015.
Walid Krichene. Dual averaging on compactly-supported distributions and application to no-regret learning on a continuum. CoRR, 2015.
Walid Krichene, Maximilian Balandat, Claire Tomlin, and Alexandre Bayen. The Hedge Algorithm on a Continuum. In 32nd International Conference on Machine Learning, 2015.
Joon Kwon and Panayotis Mertikopoulos. A continuous-time approach to online optimization. ArXiv e-prints, January 2014.
Ehud Lehrer. Approachability in infinite dimensional spaces. International Journal of Game Theory, 31(2), 2003.
Dov Monderer and Lloyd S. Shapley. Potential games. Games and Economic Behavior, 14(1), 1996.
Yurii Nesterov. Primal-dual subgradient methods for convex problems. Mathematical Programming, 120(1), 2009.
Shai Shalev-Shwartz. Online learning and online convex optimization. Foundations and Trends in Machine Learning, 4(2), 2012.
Nati Srebro, Karthik Sridharan, and Ambuj Tewari. On the universality of online mirror descent. In Advances in Neural Information Processing Systems 24 (NIPS), 2011.
Karthik Sridharan and Ambuj Tewari. Convex games in Banach spaces. In COLT 2010, The 23rd Conference on Learning Theory, pages 1-13, Haifa, Israel, June 2010.
Thomas Strömberg. Duality between Fréchet differentiability and strong convexity. Positivity, 15(3), 2011.
Lin Xiao. Dual averaging methods for regularized stochastic learning and online optimization. J. Mach. Learn. Res., 11, December 2010.


More information

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October ISSN

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October ISSN Inernaional Journal of Scienific & Engineering Research, Volume 4, Issue 10, Ocober-2013 900 FUZZY MEAN RESIDUAL LIFE ORDERING OF FUZZY RANDOM VARIABLES J. EARNEST LAZARUS PIRIYAKUMAR 1, A. YAMUNA 2 1.

More information

Online Convex Optimization Example And Follow-The-Leader

Online Convex Optimization Example And Follow-The-Leader CSE599s, Spring 2014, Online Learning Lecure 2-04/03/2014 Online Convex Opimizaion Example And Follow-The-Leader Lecurer: Brendan McMahan Scribe: Sephen Joe Jonany 1 Review of Online Convex Opimizaion

More information

A Primal-Dual Type Algorithm with the O(1/t) Convergence Rate for Large Scale Constrained Convex Programs

A Primal-Dual Type Algorithm with the O(1/t) Convergence Rate for Large Scale Constrained Convex Programs PROC. IEEE CONFERENCE ON DECISION AND CONTROL, 06 A Primal-Dual Type Algorihm wih he O(/) Convergence Rae for Large Scale Consrained Convex Programs Hao Yu and Michael J. Neely Absrac This paper considers

More information

CHARACTERIZATION OF REARRANGEMENT INVARIANT SPACES WITH FIXED POINTS FOR THE HARDY LITTLEWOOD MAXIMAL OPERATOR

CHARACTERIZATION OF REARRANGEMENT INVARIANT SPACES WITH FIXED POINTS FOR THE HARDY LITTLEWOOD MAXIMAL OPERATOR Annales Academiæ Scieniarum Fennicæ Mahemaica Volumen 31, 2006, 39 46 CHARACTERIZATION OF REARRANGEMENT INVARIANT SPACES WITH FIXED POINTS FOR THE HARDY LITTLEWOOD MAXIMAL OPERATOR Joaquim Marín and Javier

More information

arxiv: v1 [math.fa] 9 Dec 2018

arxiv: v1 [math.fa] 9 Dec 2018 AN INVERSE FUNCTION THEOREM CONVERSE arxiv:1812.03561v1 [mah.fa] 9 Dec 2018 JIMMIE LAWSON Absrac. We esablish he following converse of he well-known inverse funcion heorem. Le g : U V and f : V U be inverse

More information

MODULE 3 FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION LECTURES PROBABILITY DISTRIBUTION OF A FUNCTION OF A RANDOM VARIABLE

MODULE 3 FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION LECTURES PROBABILITY DISTRIBUTION OF A FUNCTION OF A RANDOM VARIABLE Topics MODULE 3 FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION LECTURES 2-6 3. FUNCTION OF A RANDOM VARIABLE 3.2 PROBABILITY DISTRIBUTION OF A FUNCTION OF A RANDOM VARIABLE 3.3 EXPECTATION AND MOMENTS

More information

Convergence of the Neumann series in higher norms

Convergence of the Neumann series in higher norms Convergence of he Neumann series in higher norms Charles L. Epsein Deparmen of Mahemaics, Universiy of Pennsylvania Version 1.0 Augus 1, 003 Absrac Naural condiions on an operaor A are given so ha he Neumann

More information

EXERCISES FOR SECTION 1.5

EXERCISES FOR SECTION 1.5 1.5 Exisence and Uniqueness of Soluions 43 20. 1 v c 21. 1 v c 1 2 4 6 8 10 1 2 2 4 6 8 10 Graph of approximae soluion obained using Euler s mehod wih = 0.1. Graph of approximae soluion obained using Euler

More information

Lecture 10: The Poincaré Inequality in Euclidean space

Lecture 10: The Poincaré Inequality in Euclidean space Deparmens of Mahemaics Monana Sae Universiy Fall 215 Prof. Kevin Wildrick n inroducion o non-smooh analysis and geomery Lecure 1: The Poincaré Inequaliy in Euclidean space 1. Wha is he Poincaré inequaliy?

More information

The Asymptotic Behavior of Nonoscillatory Solutions of Some Nonlinear Dynamic Equations on Time Scales

The Asymptotic Behavior of Nonoscillatory Solutions of Some Nonlinear Dynamic Equations on Time Scales Advances in Dynamical Sysems and Applicaions. ISSN 0973-5321 Volume 1 Number 1 (2006, pp. 103 112 c Research India Publicaions hp://www.ripublicaion.com/adsa.hm The Asympoic Behavior of Nonoscillaory Soluions

More information

T L. t=1. Proof of Lemma 1. Using the marginal cost accounting in Equation(4) and standard arguments. t )+Π RB. t )+K 1(Q RB

T L. t=1. Proof of Lemma 1. Using the marginal cost accounting in Equation(4) and standard arguments. t )+Π RB. t )+K 1(Q RB Elecronic Companion EC.1. Proofs of Technical Lemmas and Theorems LEMMA 1. Le C(RB) be he oal cos incurred by he RB policy. Then we have, T L E[C(RB)] 3 E[Z RB ]. (EC.1) Proof of Lemma 1. Using he marginal

More information

An introduction to the theory of SDDP algorithm

An introduction to the theory of SDDP algorithm An inroducion o he heory of SDDP algorihm V. Leclère (ENPC) Augus 1, 2014 V. Leclère Inroducion o SDDP Augus 1, 2014 1 / 21 Inroducion Large scale sochasic problem are hard o solve. Two ways of aacking

More information

Cash Flow Valuation Mode Lin Discrete Time

Cash Flow Valuation Mode Lin Discrete Time IOSR Journal of Mahemaics (IOSR-JM) e-issn: 2278-5728,p-ISSN: 2319-765X, 6, Issue 6 (May. - Jun. 2013), PP 35-41 Cash Flow Valuaion Mode Lin Discree Time Olayiwola. M. A. and Oni, N. O. Deparmen of Mahemaics

More information

Hamilton Jacobi equations

Hamilton Jacobi equations Hamilon Jacobi equaions Inoducion o PDE The rigorous suff from Evans, mosly. We discuss firs u + H( u = 0, (1 where H(p is convex, and superlinear a infiniy, H(p lim p p = + This by comes by inegraion

More information

MATH 5720: Gradient Methods Hung Phan, UMass Lowell October 4, 2018

MATH 5720: Gradient Methods Hung Phan, UMass Lowell October 4, 2018 MATH 5720: Gradien Mehods Hung Phan, UMass Lowell Ocober 4, 208 Descen Direcion Mehods Consider he problem min { f(x) x R n}. The general descen direcions mehod is x k+ = x k + k d k where x k is he curren

More information

Finish reading Chapter 2 of Spivak, rereading earlier sections as necessary. handout and fill in some missing details!

Finish reading Chapter 2 of Spivak, rereading earlier sections as necessary. handout and fill in some missing details! MAT 257, Handou 6: Ocober 7-2, 20. I. Assignmen. Finish reading Chaper 2 of Spiva, rereading earlier secions as necessary. handou and fill in some missing deails! II. Higher derivaives. Also, read his

More information

Expert Advice for Amateurs

Expert Advice for Amateurs Exper Advice for Amaeurs Ernes K. Lai Online Appendix - Exisence of Equilibria The analysis in his secion is performed under more general payoff funcions. Wihou aking an explici form, he payoffs of he

More information

Empirical Process Theory

Empirical Process Theory Empirical Process heory 4.384 ime Series Analysis, Fall 27 Reciaion by Paul Schrimpf Supplemenary o lecures given by Anna Mikusheva Ocober 7, 28 Reciaion 7 Empirical Process heory Le x be a real-valued

More information

Supplement for Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence

Supplement for Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence Supplemen for Sochasic Convex Opimizaion: Faser Local Growh Implies Faser Global Convergence Yi Xu Qihang Lin ianbao Yang Proof of heorem heorem Suppose Assumpion holds and F (w) obeys he LGC (6) Given

More information

Hamilton- J acobi Equation: Explicit Formulas In this lecture we try to apply the method of characteristics to the Hamilton-Jacobi equation: u t

Hamilton- J acobi Equation: Explicit Formulas In this lecture we try to apply the method of characteristics to the Hamilton-Jacobi equation: u t M ah 5 2 7 Fall 2 0 0 9 L ecure 1 0 O c. 7, 2 0 0 9 Hamilon- J acobi Equaion: Explici Formulas In his lecure we ry o apply he mehod of characerisics o he Hamilon-Jacobi equaion: u + H D u, x = 0 in R n

More information

An Introduction to Backward Stochastic Differential Equations (BSDEs) PIMS Summer School 2016 in Mathematical Finance.

An Introduction to Backward Stochastic Differential Equations (BSDEs) PIMS Summer School 2016 in Mathematical Finance. 1 An Inroducion o Backward Sochasic Differenial Equaions (BSDEs) PIMS Summer School 2016 in Mahemaical Finance June 25, 2016 Chrisoph Frei cfrei@ualbera.ca This inroducion is based on Touzi [14], Bouchard

More information

MATH 4330/5330, Fourier Analysis Section 6, Proof of Fourier s Theorem for Pointwise Convergence

MATH 4330/5330, Fourier Analysis Section 6, Proof of Fourier s Theorem for Pointwise Convergence MATH 433/533, Fourier Analysis Secion 6, Proof of Fourier s Theorem for Poinwise Convergence Firs, some commens abou inegraing periodic funcions. If g is a periodic funcion, g(x + ) g(x) for all real x,

More information

Online Learning with Partial Feedback. 1 Online Mirror Descent with Estimated Gradient

Online Learning with Partial Feedback. 1 Online Mirror Descent with Estimated Gradient Avance Course in Machine Learning Spring 2010 Online Learning wih Parial Feeback Hanous are joinly prepare by Shie Mannor an Shai Shalev-Shwarz In previous lecures we alke abou he general framework of

More information

6. Stochastic calculus with jump processes

6. Stochastic calculus with jump processes A) Trading sraegies (1/3) Marke wih d asses S = (S 1,, S d ) A rading sraegy can be modelled wih a vecor φ describing he quaniies invesed in each asse a each insan : φ = (φ 1,, φ d ) The value a of a porfolio

More information

2. Nonlinear Conservation Law Equations

2. Nonlinear Conservation Law Equations . Nonlinear Conservaion Law Equaions One of he clear lessons learned over recen years in sudying nonlinear parial differenial equaions is ha i is generally no wise o ry o aack a general class of nonlinear

More information

Essential Maps and Coincidence Principles for General Classes of Maps

Essential Maps and Coincidence Principles for General Classes of Maps Filoma 31:11 (2017), 3553 3558 hps://doi.org/10.2298/fil1711553o Published by Faculy of Sciences Mahemaics, Universiy of Niš, Serbia Available a: hp://www.pmf.ni.ac.rs/filoma Essenial Maps Coincidence

More information

Lecture Notes 2. The Hilbert Space Approach to Time Series

Lecture Notes 2. The Hilbert Space Approach to Time Series Time Series Seven N. Durlauf Universiy of Wisconsin. Basic ideas Lecure Noes. The Hilber Space Approach o Time Series The Hilber space framework provides a very powerful language for discussing he relaionship

More information

arxiv: v1 [math.pr] 23 Jan 2019

arxiv: v1 [math.pr] 23 Jan 2019 Consrucion of Liouville Brownian moion via Dirichle form heory Jiyong Shin arxiv:90.07753v [mah.pr] 23 Jan 209 Absrac. The Liouville Brownian moion which was inroduced in [3] is a naural diffusion process

More information

SOME MORE APPLICATIONS OF THE HAHN-BANACH THEOREM

SOME MORE APPLICATIONS OF THE HAHN-BANACH THEOREM SOME MORE APPLICATIONS OF THE HAHN-BANACH THEOREM FRANCISCO JAVIER GARCÍA-PACHECO, DANIELE PUGLISI, AND GUSTI VAN ZYL Absrac We give a new proof of he fac ha equivalen norms on subspaces can be exended

More information

Correspondence should be addressed to Nguyen Buong,

Correspondence should be addressed to Nguyen Buong, Hindawi Publishing Corporaion Fixed Poin Theory and Applicaions Volume 011, Aricle ID 76859, 10 pages doi:101155/011/76859 Research Aricle An Implici Ieraion Mehod for Variaional Inequaliies over he Se

More information

Multiarmed Bandits With Limited Expert Advice

Multiarmed Bandits With Limited Expert Advice uliarmed Bandis Wih Limied Exper Advice Sayen Kale Yahoo Labs ew York sayen@yahoo-inc.com Absrac We consider he problem of minimizing regre in he seing of advice-efficien muliarmed bandis wih exper advice.

More information

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality Marix Versions of Some Refinemens of he Arihmeic-Geomeric Mean Inequaliy Bao Qi Feng and Andrew Tonge Absrac. We esablish marix versions of refinemens due o Alzer ], Carwrigh and Field 4], and Mercer 5]

More information

A Stochastic View of Optimal Regret through Minimax Duality

A Stochastic View of Optimal Regret through Minimax Duality A Sochasic View of Opimal Regre hrough Minimax Dualiy Jacob Abernehy Compuer Science Division UC Berkeley Alekh Agarwal Compuer Science Division UC Berkeley Peer L. Barle Compuer Science Division Deparmen

More information

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing Applicaion of a Sochasic-Fuzzy Approach o Modeling Opimal Discree Time Dynamical Sysems by Using Large Scale Daa Processing AA WALASZE-BABISZEWSA Deparmen of Compuer Engineering Opole Universiy of Technology

More information

t 2 B F x,t n dsdt t u x,t dxdt

t 2 B F x,t n dsdt t u x,t dxdt Evoluion Equaions For 0, fixed, le U U0, where U denoes a bounded open se in R n.suppose ha U is filled wih a maerial in which a conaminan is being ranspored by various means including diffusion and convecion.

More information

arxiv: v1 [math.ca] 15 Nov 2016

arxiv: v1 [math.ca] 15 Nov 2016 arxiv:6.599v [mah.ca] 5 Nov 26 Counerexamples on Jumarie s hree basic fracional calculus formulae for non-differeniable coninuous funcions Cheng-shi Liu Deparmen of Mahemaics Norheas Peroleum Universiy

More information

Homogenization of random Hamilton Jacobi Bellman Equations

Homogenization of random Hamilton Jacobi Bellman Equations Probabiliy, Geomery and Inegrable Sysems MSRI Publicaions Volume 55, 28 Homogenizaion of random Hamilon Jacobi Bellman Equaions S. R. SRINIVASA VARADHAN ABSTRACT. We consider nonlinear parabolic equaions

More information

A proof of Ito's formula using a di Title formula. Author(s) Fujita, Takahiko; Kawanishi, Yasuhi. Studia scientiarum mathematicarum H Citation

A proof of Ito's formula using a di Title formula. Author(s) Fujita, Takahiko; Kawanishi, Yasuhi. Studia scientiarum mathematicarum H Citation A proof of Io's formula using a di Tile formula Auhor(s) Fujia, Takahiko; Kawanishi, Yasuhi Sudia scieniarum mahemaicarum H Ciaion 15-134 Issue 8-3 Dae Type Journal Aricle Tex Version auhor URL hp://hdl.handle.ne/186/15878

More information

Dual Representation as Stochastic Differential Games of Backward Stochastic Differential Equations and Dynamic Evaluations

Dual Representation as Stochastic Differential Games of Backward Stochastic Differential Equations and Dynamic Evaluations arxiv:mah/0602323v1 [mah.pr] 15 Feb 2006 Dual Represenaion as Sochasic Differenial Games of Backward Sochasic Differenial Equaions and Dynamic Evaluaions Shanjian Tang Absrac In his Noe, assuming ha he

More information

Class Meeting # 10: Introduction to the Wave Equation

Class Meeting # 10: Introduction to the Wave Equation MATH 8.5 COURSE NOTES - CLASS MEETING # 0 8.5 Inroducion o PDEs, Fall 0 Professor: Jared Speck Class Meeing # 0: Inroducion o he Wave Equaion. Wha is he wave equaion? The sandard wave equaion for a funcion

More information

Vehicle Arrival Models : Headway

Vehicle Arrival Models : Headway Chaper 12 Vehicle Arrival Models : Headway 12.1 Inroducion Modelling arrival of vehicle a secion of road is an imporan sep in raffic flow modelling. I has imporan applicaion in raffic flow simulaion where

More information

On Oscillation of a Generalized Logistic Equation with Several Delays

On Oscillation of a Generalized Logistic Equation with Several Delays Journal of Mahemaical Analysis and Applicaions 253, 389 45 (21) doi:1.16/jmaa.2.714, available online a hp://www.idealibrary.com on On Oscillaion of a Generalized Logisic Equaion wih Several Delays Leonid

More information

arxiv: v2 [math.ap] 16 Oct 2017

arxiv: v2 [math.ap] 16 Oct 2017 Unspecified Journal Volume 00, Number 0, Pages 000 000 S????-????XX0000-0 MINIMIZATION SOLUTIONS TO CONSERVATION LAWS WITH NON-SMOOTH AND NON-STRICTLY CONVEX FLUX CAREY CAGINALP arxiv:1708.02339v2 [mah.ap]

More information

4 Sequences of measurable functions

4 Sequences of measurable functions 4 Sequences of measurable funcions 1. Le (Ω, A, µ) be a measure space (complee, afer a possible applicaion of he compleion heorem). In his chaper we invesigae relaions beween various (nonequivalen) convergences

More information

arxiv: v1 [math.pr] 19 Feb 2011

arxiv: v1 [math.pr] 19 Feb 2011 A NOTE ON FELLER SEMIGROUPS AND RESOLVENTS VADIM KOSTRYKIN, JÜRGEN POTTHOFF, AND ROBERT SCHRADER ABSTRACT. Various equivalen condiions for a semigroup or a resolven generaed by a Markov process o be of

More information

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Simulaion-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Week Descripion Reading Maerial 2 Compuer Simulaion of Dynamic Models Finie Difference, coninuous saes, discree ime Simple Mehods Euler Trapezoid

More information

Introduction to Probability and Statistics Slides 4 Chapter 4

Introduction to Probability and Statistics Slides 4 Chapter 4 Inroducion o Probabiliy and Saisics Slides 4 Chaper 4 Ammar M. Sarhan, asarhan@mahsa.dal.ca Deparmen of Mahemaics and Saisics, Dalhousie Universiy Fall Semeser 8 Dr. Ammar Sarhan Chaper 4 Coninuous Random

More information

Two Popular Bayesian Estimators: Particle and Kalman Filters. McGill COMP 765 Sept 14 th, 2017

Two Popular Bayesian Estimators: Particle and Kalman Filters. McGill COMP 765 Sept 14 th, 2017 Two Popular Bayesian Esimaors: Paricle and Kalman Filers McGill COMP 765 Sep 14 h, 2017 1 1 1, dx x Bel x u x P x z P Recall: Bayes Filers,,,,,,, 1 1 1 1 u z u x P u z u x z P Bayes z = observaion u =

More information

Lecture 4: November 13

Lecture 4: November 13 Compuaional Learning Theory Fall Semeser, 2017/18 Lecure 4: November 13 Lecurer: Yishay Mansour Scribe: Guy Dolinsky, Yogev Bar-On, Yuval Lewi 4.1 Fenchel-Conjugae 4.1.1 Moivaion Unil his lecure we saw

More information

Optimal Investment under Dynamic Risk Constraints and Partial Information

Optimal Investment under Dynamic Risk Constraints and Partial Information Opimal Invesmen under Dynamic Risk Consrains and Parial Informaion Wolfgang Puschögl Johann Radon Insiue for Compuaional and Applied Mahemaics (RICAM) Ausrian Academy of Sciences www.ricam.oeaw.ac.a 2

More information

IMPLICIT AND INVERSE FUNCTION THEOREMS PAUL SCHRIMPF 1 OCTOBER 25, 2013

IMPLICIT AND INVERSE FUNCTION THEOREMS PAUL SCHRIMPF 1 OCTOBER 25, 2013 IMPLICI AND INVERSE FUNCION HEOREMS PAUL SCHRIMPF 1 OCOBER 25, 213 UNIVERSIY OF BRIISH COLUMBIA ECONOMICS 526 We have exensively sudied how o solve sysems of linear equaions. We know how o check wheher

More information

Approximation Algorithms for Unique Games via Orthogonal Separators

Approximation Algorithms for Unique Games via Orthogonal Separators Approximaion Algorihms for Unique Games via Orhogonal Separaors Lecure noes by Konsanin Makarychev. Lecure noes are based on he papers [CMM06a, CMM06b, LM4]. Unique Games In hese lecure noes, we define

More information

1 Solutions to selected problems

1 Solutions to selected problems 1 Soluions o seleced problems 1. Le A B R n. Show ha in A in B bu in general bd A bd B. Soluion. Le x in A. Then here is ɛ > 0 such ha B ɛ (x) A B. This shows x in B. If A = [0, 1] and B = [0, 2], hen

More information

Generalized Snell envelope and BSDE With Two general Reflecting Barriers

Generalized Snell envelope and BSDE With Two general Reflecting Barriers 1/22 Generalized Snell envelope and BSDE Wih Two general Reflecing Barriers EL HASSAN ESSAKY Cadi ayyad Universiy Poly-disciplinary Faculy Safi Work in progress wih : M. Hassani and Y. Ouknine Iasi, July

More information

Article from. Predictive Analytics and Futurism. July 2016 Issue 13

Article from. Predictive Analytics and Futurism. July 2016 Issue 13 Aricle from Predicive Analyics and Fuurism July 6 Issue An Inroducion o Incremenal Learning By Qiang Wu and Dave Snell Machine learning provides useful ools for predicive analyics The ypical machine learning

More information

The Optimal Stopping Time for Selling an Asset When It Is Uncertain Whether the Price Process Is Increasing or Decreasing When the Horizon Is Infinite

The Optimal Stopping Time for Selling an Asset When It Is Uncertain Whether the Price Process Is Increasing or Decreasing When the Horizon Is Infinite American Journal of Operaions Research, 08, 8, 8-9 hp://wwwscirporg/journal/ajor ISSN Online: 60-8849 ISSN Prin: 60-8830 The Opimal Sopping Time for Selling an Asse When I Is Uncerain Wheher he Price Process

More information

The consumption-based determinants of the term structure of discount rates: Corrigendum. Christian Gollier 1 Toulouse School of Economics March 2012

The consumption-based determinants of the term structure of discount rates: Corrigendum. Christian Gollier 1 Toulouse School of Economics March 2012 The consumpion-based deerminans of he erm srucure of discoun raes: Corrigendum Chrisian Gollier Toulouse School of Economics March 0 In Gollier (007), I examine he effec of serially correlaed growh raes

More information

Approximating positive solutions of nonlinear first order ordinary quadratic differential equations

Approximating positive solutions of nonlinear first order ordinary quadratic differential equations Dhage & Dhage, Cogen Mahemaics (25, 2: 2367 hp://dx.doi.org/.8/233835.25.2367 APPLIED & INTERDISCIPLINARY MATHEMATICS RESEARCH ARTICLE Approximaing posiive soluions of nonlinear firs order ordinary quadraic

More information

12: AUTOREGRESSIVE AND MOVING AVERAGE PROCESSES IN DISCRETE TIME. Σ j =

12: AUTOREGRESSIVE AND MOVING AVERAGE PROCESSES IN DISCRETE TIME. Σ j = 1: AUTOREGRESSIVE AND MOVING AVERAGE PROCESSES IN DISCRETE TIME Moving Averages Recall ha a whie noise process is a series { } = having variance σ. The whie noise process has specral densiy f (λ) = of

More information

Lecture 2 October ε-approximation of 2-player zero-sum games

Lecture 2 October ε-approximation of 2-player zero-sum games Opimizaion II Winer 009/10 Lecurer: Khaled Elbassioni Lecure Ocober 19 1 ε-approximaion of -player zero-sum games In his lecure we give a randomized ficiious play algorihm for obaining an approximae soluion

More information

Stable approximations of optimal filters

Stable approximations of optimal filters Sable approximaions of opimal filers Joaquin Miguez Deparmen of Signal Theory & Communicaions, Universidad Carlos III de Madrid. E-mail: joaquin.miguez@uc3m.es Join work wih Dan Crisan (Imperial College

More information

Ann. Funct. Anal. 2 (2011), no. 2, A nnals of F unctional A nalysis ISSN: (electronic) URL:

Ann. Funct. Anal. 2 (2011), no. 2, A nnals of F unctional A nalysis ISSN: (electronic) URL: Ann. Func. Anal. 2 2011, no. 2, 34 41 A nnals of F uncional A nalysis ISSN: 2008-8752 elecronic URL: www.emis.de/journals/afa/ CLASSIFICAION OF POSIIVE SOLUIONS OF NONLINEAR SYSEMS OF VOLERRA INEGRAL EQUAIONS

More information

Lecture 33: November 29

Lecture 33: November 29 36-705: Inermediae Saisics Fall 2017 Lecurer: Siva Balakrishnan Lecure 33: November 29 Today we will coninue discussing he boosrap, and hen ry o undersand why i works in a simple case. In he las lecure

More information

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle Chaper 2 Newonian Mechanics Single Paricle In his Chaper we will review wha Newon s laws of mechanics ell us abou he moion of a single paricle. Newon s laws are only valid in suiable reference frames,

More information

Undetermined coefficients for local fractional differential equations

Undetermined coefficients for local fractional differential equations Available online a www.isr-publicaions.com/jmcs J. Mah. Compuer Sci. 16 (2016), 140 146 Research Aricle Undeermined coefficiens for local fracional differenial equaions Roshdi Khalil a,, Mohammed Al Horani

More information

Supplementary Material

Supplementary Material Dynamic Global Games of Regime Change: Learning, Mulipliciy and iming of Aacks Supplemenary Maerial George-Marios Angeleos MI and NBER Chrisian Hellwig UCLA Alessandro Pavan Norhwesern Universiy Ocober

More information

Endpoint Strichartz estimates

Endpoint Strichartz estimates Endpoin Sricharz esimaes Markus Keel and Terence Tao (Amer. J. Mah. 10 (1998) 955 980) Presener : Nobu Kishimoo (Kyoo Universiy) 013 Paricipaing School in Analysis of PDE 013/8/6 30, Jeju 1 Absrac of he

More information

Online Appendix to Solution Methods for Models with Rare Disasters

Online Appendix to Solution Methods for Models with Rare Disasters Online Appendix o Soluion Mehods for Models wih Rare Disasers Jesús Fernández-Villaverde and Oren Levinal In his Online Appendix, we presen he Euler condiions of he model, we develop he pricing Calvo block,

More information

STABILITY OF PEXIDERIZED QUADRATIC FUNCTIONAL EQUATION IN NON-ARCHIMEDEAN FUZZY NORMED SPASES

STABILITY OF PEXIDERIZED QUADRATIC FUNCTIONAL EQUATION IN NON-ARCHIMEDEAN FUZZY NORMED SPASES Novi Sad J. Mah. Vol. 46, No. 1, 2016, 15-25 STABILITY OF PEXIDERIZED QUADRATIC FUNCTIONAL EQUATION IN NON-ARCHIMEDEAN FUZZY NORMED SPASES N. Eghbali 1 Absrac. We deermine some sabiliy resuls concerning

More information

STATE-SPACE MODELLING. A mass balance across the tank gives:

STATE-SPACE MODELLING. A mass balance across the tank gives: B. Lennox and N.F. Thornhill, 9, Sae Space Modelling, IChemE Process Managemen and Conrol Subjec Group Newsleer STE-SPACE MODELLING Inroducion: Over he pas decade or so here has been an ever increasing

More information

Exponential Weighted Moving Average (EWMA) Chart Under The Assumption of Moderateness And Its 3 Control Limits

Exponential Weighted Moving Average (EWMA) Chart Under The Assumption of Moderateness And Its 3 Control Limits DOI: 0.545/mjis.07.5009 Exponenial Weighed Moving Average (EWMA) Char Under The Assumpion of Moderaeness And Is 3 Conrol Limis KALPESH S TAILOR Assisan Professor, Deparmen of Saisics, M. K. Bhavnagar Universiy,

More information

Notes for Lecture 17-18

Notes for Lecture 17-18 U.C. Berkeley CS278: Compuaional Complexiy Handou N7-8 Professor Luca Trevisan April 3-8, 2008 Noes for Lecure 7-8 In hese wo lecures we prove he firs half of he PCP Theorem, he Amplificaion Lemma, up

More information

On Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature

On Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature On Measuring Pro-Poor Growh 1. On Various Ways of Measuring Pro-Poor Growh: A Shor eview of he Lieraure During he pas en years or so here have been various suggesions concerning he way one should check

More information

Heat kernel and Harnack inequality on Riemannian manifolds

Heat kernel and Harnack inequality on Riemannian manifolds Hea kernel and Harnack inequaliy on Riemannian manifolds Alexander Grigor yan UHK 11/02/2014 onens 1 Laplace operaor and hea kernel 1 2 Uniform Faber-Krahn inequaliy 3 3 Gaussian upper bounds 4 4 ean-value

More information

Oscillation of an Euler Cauchy Dynamic Equation S. Huff, G. Olumolode, N. Pennington, and A. Peterson

Oscillation of an Euler Cauchy Dynamic Equation S. Huff, G. Olumolode, N. Pennington, and A. Peterson PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON DYNAMICAL SYSTEMS AND DIFFERENTIAL EQUATIONS May 4 7, 00, Wilmingon, NC, USA pp 0 Oscillaion of an Euler Cauchy Dynamic Equaion S Huff, G Olumolode,

More information

Notes on Kalman Filtering

Notes on Kalman Filtering Noes on Kalman Filering Brian Borchers and Rick Aser November 7, Inroducion Daa Assimilaion is he problem of merging model predicions wih acual measuremens of a sysem o produce an opimal esimae of he curren

More information

RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY

RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY ECO 504 Spring 2006 Chris Sims RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY 1. INTRODUCTION Lagrange muliplier mehods are sandard fare in elemenary calculus courses, and hey play a cenral role in economic

More information

14 Autoregressive Moving Average Models
