SUPPLEMENT TO CONTROLLED EQUILIBRIUM SELECTION IN STOCHASTICALLY PERTURBED DYNAMICS

Similar documents
b i (µ, x, s) ei ϕ(x) µ s (dx) ds (2) i=1

Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations.

On Ergodic Impulse Control with Constraint

Laplace s Equation. Chapter Mean Value Formulas

THE VALUE FUNCTION IN ERGODIC CONTROL OF DIFFUSION PROCESSES WITH PARTIAL OBSERVATIONS II

ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME

1 The Observability Canonical Form

Harmonic Functions and Brownian motion

Lecture No 1 Introduction to Diffusion equations The heat equat

Partial Differential Equations

A variational formula for risk-sensitive control

Stochastic optimal control with rough paths

Continuous dependence estimates for the ergodic problem with an application to homogenization

SHARP BOUNDARY TRACE INEQUALITIES. 1. Introduction

Nonlinear Control. Nonlinear Control Lecture # 3 Stability of Equilibrium Points

A Concise Course on Stochastic Partial Differential Equations

Asymptotics of minimax stochastic programs

Régularité des équations de Hamilton-Jacobi du premier ordre et applications aux jeux à champ moyen

Thuong Nguyen. SADCO Internal Review Metting

Uniformly Uniformly-ergodic Markov chains and BSDEs

Richard F. Bass Krzysztof Burdzy University of Washington

1 Lyapunov theory of stability

THE MULTIPLICATIVE ERGODIC THEOREM OF OSELEDETS

On a Class of Multidimensional Optimal Transportation Problems

Dynamical systems with Gaussian and Levy noise: analytical and stochastic approaches

Nonlinear stabilization via a linear observability

Scaling Limits of Waves in Convex Scalar Conservation Laws under Random Initial Perturbations

Zero-sum semi-markov games in Borel spaces: discounted and average payoff

Half of Final Exam Name: Practice Problems October 28, 2014

A new approach for investment performance measurement. 3rd WCMF, Santa Barbara November 2009

Elliptic Operators with Unbounded Coefficients

INFINITE TIME HORIZON OPTIMAL CONTROL OF THE SEMILINEAR HEAT EQUATION

PDEs in Image Processing, Tutorials

GLOBAL ATTRACTIVITY IN A CLASS OF NONMONOTONE REACTION-DIFFUSION EQUATIONS WITH TIME DELAY

Maximum Process Problems in Optimal Control Theory

Some lecture notes for Math 6050E: PDEs, Fall 2016

The principle of least action and two-point boundary value problems in orbital mechanics

On an uniqueness theorem for characteristic functions

Measure and Integration: Solutions of CW2

Introduction to Nonlinear Control Lecture # 3 Time-Varying and Perturbed Systems

HOPF S DECOMPOSITION AND RECURRENT SEMIGROUPS. Josef Teichmann

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

Lecture 8. Chapter 5: Input-Output Stability Chapter 6: Passivity Chapter 14: Passivity-Based Control. Eugenio Schuster.

Lyapunov Stability Theory

On the bang-bang property of time optimal controls for infinite dimensional linear systems

LINEAR-CONVEX CONTROL AND DUALITY

arxiv: v1 [math.oc] 9 Oct 2018

Variational approach to mean field games with density constraints

Presenter: Noriyoshi Fukaya

Asymptotic stability of an evolutionary nonlinear Boltzmann-type equation

Introduction to Random Diffusions

An introduction to Mathematical Theory of Control

Applied Math Qualifying Exam 11 October Instructions: Work 2 out of 3 problems in each of the 3 parts for a total of 6 problems.

Hypoelliptic multiscale Langevin diffusions and Slow fast stochastic reaction diffusion equations.

A MODEL FOR THE LONG-TERM OPTIMAL CAPACITY LEVEL OF AN INVESTMENT PROJECT

Nonlinear Elliptic Systems and Mean-Field Games

Approximating a Diffusion by a Finite-State Hidden Markov Model

Sobolev embeddings and interpolations

Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3

ON THE HÖLDER CONTINUITY OF MATRIX FUNCTIONS FOR NORMAL MATRICES

Sobolev Spaces. Chapter 10

EXISTENCE OF SOLUTIONS TO ASYMPTOTICALLY PERIODIC SCHRÖDINGER EQUATIONS

u xx + u yy = 0. (5.1)

Strong uniqueness for stochastic evolution equations with possibly unbounded measurable drift term

The heat equation in time dependent domains with Neumann boundary conditions

Weak convergence and large deviation theory

THEOREMS, ETC., FOR MATH 515

A CONVEX-CONCAVE ELLIPTIC PROBLEM WITH A PARAMETER ON THE BOUNDARY CONDITION

Math The Laplacian. 1 Green s Identities, Fundamental Solution

An iterative procedure for constructing subsolutions of discrete-time optimal control problems

Multi-dimensional Stochastic Singular Control Via Dynkin Game and Dirichlet Form

Exponential stability of families of linear delay systems

Mean Field Games on networks

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

VISCOSITY SOLUTIONS. We follow Han and Lin, Elliptic Partial Differential Equations, 5.

OPTIMAL SCALING FOR P -NORMS AND COMPONENTWISE DISTANCE TO SINGULARITY

for all subintervals I J. If the same is true for the dyadic subintervals I D J only, we will write ϕ BMO d (J). In fact, the following is true

On Stopping Times and Impulse Control with Constraint

2 Statement of the problem and assumptions

Probability and Measure

Parameter Dependent Quasi-Linear Parabolic Equations

On some weighted fractional porous media equations

NONTRIVIAL SOLUTIONS TO INTEGRAL AND DIFFERENTIAL EQUATIONS

1 Stochastic Dynamic Programming

Mean-field dual of cooperative reproduction

OBSERVABILITY INEQUALITY AND DECAY RATE FOR WAVE EQUATIONS WITH NONLINEAR BOUNDARY CONDITIONS

THE INVERSE FUNCTION THEOREM

ODE Final exam - Solutions

SYMMETRY RESULTS FOR PERTURBED PROBLEMS AND RELATED QUESTIONS. Massimo Grosi Filomena Pacella S. L. Yadava. 1. Introduction

Some Properties of NSFDEs

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R.

Internal Stabilizability of Some Diffusive Models

Robotics. Control Theory. Marc Toussaint U Stuttgart

Solving Stochastic Partial Differential Equations as Stochastic Differential Equations in Infinite Dimensions - a Review

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011

ξ,i = x nx i x 3 + δ ni + x n x = 0. x Dξ = x i ξ,i = x nx i x i x 3 Du = λ x λ 2 xh + x λ h Dξ,

MODULUS OF CONTINUITY OF THE DIRICHLET SOLUTIONS

BLOWUP THEORY FOR THE CRITICAL NONLINEAR SCHRÖDINGER EQUATIONS REVISITED

Functional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability...

Transcription:

SUPPLEMENT TO CONTROLLED EQUILIBRIUM SELECTION IN STOCHASTICALLY PERTURBED DYNAMICS By Ari Arapostathis, Anup Biswas, and Vivek S. Borkar The University of Texas at Austin, Indian Institute of Science Education and Research, Pune, and Indian Institute of Technology Bombay This document contains the proofs of Lemma.3 and Theorem.4 in Section.3, and Lemma.7 and Theorem.9 in Section.5.. Proofs of Lemma.3 and Theorem.4. We start with the proof of Lemma.3. Proof of Lemma.3. The proof is standard. Let U be given, and define M t := E [ t U s 2 ds, t R +. For T >, let HT 2 denote the space of {F t }-adapted processes Y defined on [, T, having continuous sample paths, and satisfying E [ sup t T Y t 2 <. The space HT 2 more precisely the set of equivalence classes in HT 2 ) is a Banach space under the norm Y H 2 T := [ ) /2 E sup Y t 2. t T It is standard to show, for example following the proof of Arapostathis, Borkar and Ghosh, 22, Theorem 2.2.2), that any solution X of ) X t = X + satisfies t mxs ) + ε U s ) ds + ε ν W t, t, 2) X X 2 H 2 t κ t + t) + M t + E [ X 2) e κ t t, for some constants κ and κ that depend only on m. The existence of a pathwise unique solution then follows by applying the contraction mapping theorem as in Arapostathis, Borkar and Ghosh, 22, Theorem 2.2.4). The rest of this section is devoted to the proof of Theorem.4. Without loss of generality we fix ε =, and suppress the dependence on ε in all the

2 A. ARAPOSTATHIS, A. BISWAS, AND V. S. BORKAR variables. Also, throughout the rest of this section, without loss of generality we assume that l. We proceed by establishing two key lemmas, followed by the proof of Theorem.4. Recall that the running cost R is defined by Rx, u) := lx) + 2 u 2, and that since l is Lipschitz, there exists a constant c l independent of ε such that 3) l dη ε c l. For x R d, and α >, we define the subset U α x of admissible controls by { [ } 4) U α x := U U : E U x e αs RX s, U s ) ds <, where E U x denotes the expectation under the law of X, U) which solves the Itô equation 5) X t = x + with X = x. t mx s ) ds + t U s ds + W t, t, Lemma. The equation 6) 2 V α + m, V α 2 V α 2 + l = αv α has a solution in C 2 R d ) for all α, ). Moreover, for all α, ), we have the following. i) For some constant c >, not depending on α, it holds that 7) V α x) c + x, and αvα x) lx) + c α for all x R d. ii) The function V α satisfies 8) V α x) inf U U α x E U x [ iii) With c l the constant in 3), we have e αs RX s, U s ) ds, x R d. inf {x : lx) c l } αv α = inf R d αv α c l.

SUPPLEMENT TO CONTROLLED EQUILIBRIUM SELECTION 3 Proof. In Bensoussan and Frehse, 22, Theorem 4.8, p. 77) it is proved that 6) has a solution in C 2 R d ), and it is also shown in the proof of this theorem that there exists a constant κ > which does not depend on α such that 9) α V α x) κ x R d. By Ichihara, 22, Theorem B.) there exists a constant C not depending on R > such that ) ) sup V α C + sup αvα ) + sup l + + sup l /3, B R B R+ B R+ B R+ from which gradient estimate in 7) follows. The following structural hypothesis on the Hamiltonian hx, p) is assumed in Ichihara, 22, Theorem B.): p hx, p) is strictly convex for all x R d, and there exists some constant k > such that ) k p 2 hx, p) k p 2, x hx, p) k + p 2 ), for x, p) R 2d. This Hamiltonian corresponds to hx, p) = 2 p 2 m, p for the equation in 6), and the first bound in ) is not satisfied. However, replacing this bound with k p 2 k ) hx, p) k p 2 + k ), for some constant k, the proof of Ichihara, 22, Theorem B.) goes through unmodified. Recall that the class of controls Û is defined by { [ t } Û := U U : E U s 2 ds < for all t. Writing 6) in HJB form, and applying Itô s formula, we obtain 2) V α x) e αt E U [ x Vα X t ) [ t E U x e αs RX s, U s ) ds t >, and all U Û. Since m is bounded, it is standard to show using 5) that [ 3) E U x sup Xs) x m t + [ t t + E U x U s ds < s t for all U Û and t >. Also, if E x denotes the expectation E U x with U =, then by 5), we have the estimate 4) E x[ Xt 2 κ 2 + t 2 + x 2) < t >,

4 A. ARAPOSTATHIS, A. BISWAS, AND V. S. BORKAR for some constant κ 2. As shown in the proof of Bensoussan and Frehse, 22, Theorem 4.8, p. 77), α αv α ) is bounded on, ), which together with the gradient estimate in 7) we have already established, provides us with a liberal bound of V α of the form V α x) C + x 2 ) for some constant C. This combined with 4) implies that e αt E x[ Vα X t ) as t. Therefore, using 3), and the Lipschitz constant C l of l, we obtain by 2) that [ αv α x) E x α e αs lx s ) ds lx) + C l α e αs m s + 2 s ) ds x R d, which results in the estimate given in 7), where without loss of generality we use a common constant c for the two inequalities. This completes the proof of part i). Let gx, t) := x + m t + 2 t. Multiplying both sides of 3) by e αt, then strengthening the inequality, and applying the Hölder inequality, we obtain e αt E U [ x Xt [ t 5) gx, t) e αt + e α 2 t E U x e α 2 s U s ds gx, t) e αt + te α 4 t E U x t U Uα x, [ t ) /2 e αs U s 2 ds with U α x as defined in 4). Taking limits as t in 2), and using 5), and the bound of V α in 7) together with the inequality lx) C l x + l), we obtain 8). We now turn to part iii). Let and χx) := 3 min y B x) [ly) d + + 2 d m ) 2) /2, ψx) := V α x) + 2κ α χx ) x x 2), x B x ),

SUPPLEMENT TO CONTROLLED EQUILIBRIUM SELECTION 5 where κ > is the constant in 9). With φx) := x x 2, we have 2 ψ m V α, ψ + αψ ) = 2 V α m, V α + 2 V α 2 + αv α + 2 V α χx ) φ 2 2χ 2 x )φ + 2κ ) χx ) 2 φ + m, φ + α φ) l 2χ 2 x ) + 2κ d + 2 d m + ) χx ) l 3χ 2 x ) d + + 2 d m ) 2 in B x ), for all α, ), where we use 6) and the fact that κ. Since ψ > on B x ) by 9), an application of the strong maximum principle shows that ψ in B x ), which implies that αv α x) αχx) + κ x R d. Since l is inf-compact, and the same is true for χ by its definition, this shows that αv α is inf-compact. In particular, it attains its infimum in R d. With η denoting the invariant probability measure of the diffusion in 5) under the control U =, using 8), we obtain 6) inf R d V α V α dη R d E x [ e αs lx s ) ds η dx) c l α, where the last inequality follows by 3). One more application of the maximum principle implies that if V α attains its infimum at ˆx R d then lˆx) αv α ˆx). This together with 6) implies part iii). Remark 2. We should mention, even though we do not need it for the proof of the main theorem, that 8) holds with equality, and thus V α is indeed the value of the infinite horizon discounted control problem. The proof of this assertion goes as follows. Since V α has at most linear growth, the diffusion in 5) under the Markov control v α = V α has a unique strong solution. It is also clear by 7) that for any α > we can select a constant κ α) such that V α x) κ α) + α 6x. Thus using a standard estimate Arapostathis, Borkar and Ghosh, 22, Theorem 2.2.2) we obtain [ 7) E vα x sup Xs) 2 κ 2 α) + t 2 ) + x 2 )e α 2 t s t

6 A. ARAPOSTATHIS, A. BISWAS, AND V. S. BORKAR for some constant κ 2 α) >. With τ R denoting the first exit time from B R, applying Dynkin s formula we obtain [ t τr V α x) = E vα x e αs [ RX s, v α X s )) ds + E vα x e αt τ R ) V α X t τr ). We write with [ E vα x e αt τ R ) V α X t τr ) = A t, R) + A 2 t, R), [ A t, R) := E vα x e αt V α X t τr ) {t τr }, A 2 t, R) := E vα x [ e ατ R V α X t τr ) {τr < t}. Since V α has at most linear growth in x by 7), it follows by 7) that lim t lim sup R A t, R) =. We also have lim sup R A 2 t, R) = by dominated convergence, since τr < t ) as R. Thus, taking limits first as R, and then P vα x as t in 7), we obtain V α x) E vα x [ Thus the converse inequality to 8) also holds. e αs RX s, v α X s )) ds. Define the class of controls U x by { [ T U x := U U : lim sup T T EU x RX s, U s ) ds } <. Lemma 3. There exists an inf-compact V C 2 R d ) which satisfies 8) A[V x) := 2 V + m, V 2 V 2 + l = β, with and β = β := inf U U x lim sup T [ T T EU x RX s, U s ) ds, 9) V x) c + x x R d, and for some positive constant c. In addition, under the Markov control U t = v X t ), with v = V, the diffusion in 5) is positive recurrent, and β = R R[v d x) dη, where η is the invariant probability measure corresponding to the control v.

SUPPLEMENT TO CONTROLLED EQUILIBRIUM SELECTION 7 Proof. The existence of a solution to 8) is established as a limit of V α ) V α ), along some sequence α n, where V α is the solution of 6) in Lemma Bensoussan and Frehse, 22, p. 75). That V is inf-compact follows by Bensoussan and Frehse, 22, Theorem 4.2). It also follows from the proof of this convergence result that β lim sup α αv α x) for all x R d. We first show that β β. For this, we employ the following assertion which is a special case of the Hardy Littlewood theorem Sznajder and Filar 992). For any sequence {a n } of non-negative real numbers, it holds that 2) lim sup θ θ) θ n a n n= lim sup N N N a n. n= Concerning this assertion, note that if the right hand side of 2) is finite, then the set { an n } is bounded. Therefore, n= θn a n is finite for every θ <. Hence, we can apply Sznajder and Filar, 992, Theorem 2.2) to obtain 2). Fix x R d, and U U x. Define [ n a n := E U x RX s, U s ) ds, n, n and let θ = e α. Applying 2), with N running over the set of natural numbers, we obtain 2) lim sup N [ N N EU x RX s, U s ) ds lim sup θ lim sup α lim sup α θ) n= θ n a n [ n ) e α ) E U x e αs RX s, U s ) ds n= n [ e α ) E U x e αs RX s, U s ) ds β. For the last inequality in 2), we use the fact that lim sup α αv α x) β. Since U U x is arbitrary, 2) together with the definition of β imply that β β. Note also that 2) implies that U α x U x for all α, ). Next, we prove the converse inequality. It is clear that 9) follows from 7). Therefore, since the Markov control v := V x) has at most linear

8 A. ARAPOSTATHIS, A. BISWAS, AND V. S. BORKAR growth, there exists a unique strong solution to 7) under the control v. Applying Itô s formula to 8), and using the notation R[vx) := R x, vx) ) = lx) + 2 vx) 2, we obtain [ E v x V XT τr ) [ T τr V x) + E x R[v X s ) ds = β E x [ T τr, where τ R denotes the exit time from the ball of radius R > centered at. Since V is bounded from below and τ R a.s., as R, by first using Fatou s lemma for the integral term in the above display, and then dividing by T and taking limits as T, we obtain lim sup T [ T T Ev x R[v X s ) ds β. Thus β = β. Since l is inf-compact, this also implies that the diffusion under the control v is positive recurrent, and by Birkhoff s ergodic theorem we obtain β = R d R[v x) dη. This completes the proof. and Let L: C 2 R d ) CR d R d ), and L v denote the operators defined by L[fx, u) := 2 fx) + mx) + u, fx), L v fx) := 2 fx) + mx) + vx), fx), f C 2 R d ), respectively. Also, let P denote the set of infinitesimal ergodic occupation measures, i.e., the set of probability measures π PR d R d ) which satisfy 22) L[fx, u) πdx, du) = f Cc R d ), R d R d where Cc R d ) denotes the class of real-valued smooth functions with compact support. Note that if π = η v v P then 22) can be written as R L d v fx) η v dx) =. Let ˆvx) = vdu x). Since 23) u 2 η v dx) vdu x) R d R d ˆvx) 2 η v dx), R d

SUPPLEMENT TO CONTROLLED EQUILIBRIUM SELECTION 9 and since η v ˆv is also an infinitesimal ergodic occupation measure, it is evident that as far as the proof of strong duality is concerned we may restrict our attention to the subset of P that corresponds to precise controls, which we denote as P. We have the following lemma. 24) then 25) Lemma 4. If π = η v v P is such that R d R[vx) η v dx) = β + 2 R d R d R[vx) η v dx) <, R d vx) + V x) 2 ηv dx). In addition the measure η v has a density ϱ v L d /d ) R d ). Proof. Let χ be a concave C 2 R d ) function such that χx) = x for x, and χx) = for x. Then χ and χ are nonnegative on, ). Define χ R x) := R + χx R) for R >. Using 8), and completing the square, we obtain L v V 2 v + V 2 + R[v β =. Applying L v to the function χ R V ), results in 26) L v χ R V ) 2 χ RV ) V 2 2 χ RV ) v + V 2 + χ RV )R[v χ RV )β =. Observe that χ R V ) R is compactly supported by construction. Thus R L d v χ R V x))η v dx) = for all R >. Since R lx)η d v dx) < by 24), the bound in ) shows that 27) V x) 2 η v dx) <. R d Integrating 26) with respect to η v, using 27), and passing to the limit as R, we obtain 25). We have thus shown that R d mx) + vx) 2 η v dx) <. By Theorem. in Bogachev, Krylov and Röckner 996), this implies that the measure η v has a density in L d /d ) R d ). This completes the proof.

A. ARAPOSTATHIS, A. BISWAS, AND V. S. BORKAR Proof of Theorem.4. Without loss of generality we assume ε =, and we suppress the explicit dependence on ε in the notation used in the theorem. The statement concerning existence of solutions and the behavior above and below a critical value for β follows by the results in Ichihara 2). For this, we need to first verify a Foster Lyapunov type hypothesis, which is part of the assumptions. Note that the operator F in Ichihara 2) has a negative sign in the Laplacian so that A[ϕ = F [ϕ, where A is the operator defined in 8). So, given that l is inf-compact, ϕ = is an obvious choice to satisfy A4) in Ichihara 2). Then of course A[ϕ as x. Note that Theorem 2.2 in Ichihara 2) then asserts that V is bounded below in R d. Next, consider ϕ = a V with a := inf K c m, V V, where K is as V 2 in Hypothesis. 3), and V is as in Lemma 2.3. Since V agrees with V outside some compact set by Lemma 2.3, it follows by Hypothesis. 3) that a >. Then we obtain 2 ϕ + m, ϕ 2 ϕ 2 = a 4 V V a a 4 V 2 m, V + a V V 4V V 2) ) V m, V on K c. Thus, since V is bounded by Hypothesis. 3b), we obtain A[ϕ as x. It is also clear that φ x) φ x) as x. Thus, Hypothesis A.4) in Ichihara 2) is also satisfied. Therefore, as shown in Ichihara, 2, Theorem 2.), there exists some critical value λ such that the HJB equation [ V + min m + u, V + l + 2 u R d 2 u 2 = β has no solution for β > λ. Also by Theorem 2.2 and Corollary 2.3 in Ichihara 2), if V is a solution for β < λ, then under the control v = V, the diffusion is transient. Moreover, for β = λ there exists a unique solution V = V up to an additive constant), and under the control v = V the diffusion X t = X + t mxs ) + vx s ) ) ds + W t, t, is positive recurrent. It is clear then that Lemma 3 implies that λ = β. We next turn to the proof of items a) e). Part a) follows directly by Metafune, Pallara and Rhandi, 25, Lemma 5.). Note that a sharper estimate was established in the proof of Lemma 3 when β = β. The uniqueness

SUPPLEMENT TO CONTROLLED EQUILIBRIUM SELECTION of the solution for β = β follows by the results in Ichihara 2, 22) discussed above, while the rest of the assertions in part b) follow by Lemma 3. Part c) follows by Lemma 4. We now turn to part d). It is enough to show that for any sequence {U n } Û and a sequence of times {t n} diverging to, 28) lim inf n [ tn E x RXs n, Us n ) ds t n β ε, where X n denotes the process controlled by U n. All the terms in 28) are finite, since T EU x [RX s, U s ds < for any U Û, a fact which clearly follows by 2). We include the dependence on the initial condition X n = x explicitly in the notation, and thus we denote the corresponding sequence of mean empirical measures by Φ U n x,t n. Recall that this is defined by R d R d fx, u) Φ U n x,t n dx, du) = E [ tn fx n s, U n s ) ds X n = x for all f C b R d U). Extract a subsequence of {t n } over which the terms on the left hand side of 28) converge to the lim inf and suppose without loss of generality that this limit is finite. Then the corresponding subsequence of mean empirical measures is tight. Let π P be any limit point of this subsequence. It follows that the left hand side of 28) is lower bounded by πr). However, πr) β by 23) and Lemma 4. This completes the proof of part d). It remains to prove part e). Let π = η v v P be any optimal ergodic occupation measure, and π := η v, with v = V. By Lemma 4, η v has density, which we denote by ρ v. Let ξ v := ρ v ρ v + ρ, and ξ := ρ ρ v + ρ, and also define v := ξ v v + ξ v and η := 2 η v + η ). Using the property that the drift of ) is an affine function of the control, together with Arapostathis, Borkar and Ghosh, 22, Lemma 3.2.3), it is straightforward to verify that η v P.

2 A. ARAPOSTATHIS, A. BISWAS, AND V. S. BORKAR By optimality, we have 29) 2 R[ v d η R[v dη v R[v dη R d R d R d = = ξ v v + ξ v 2 d η R d 2 v 2 dη v R d 2 v 2 dη R d ξ v v + ξ v 2 ξ v v 2 ξ v 2) d η R d = 2 R d ρ v x) ρ x) ρ v x) + ρ x) vx) v x) 2 dx. Since ρ is strictly positive, 29) implies that ρ v v v = a.e. in R d, and thus v = v on the support of η v. It is clear that if v is modified outside the support of η v, then the modified η v v is also an infinitesimal ergodic occupation measure. Therefore η v v P. The uniqueness of the invariant measure of the diffusion with generator L v then implies that η v = η, which in turn implies since v = v on the support of η v ) that v = V a.e. in R d. This completes the proof of part e), and also of the theorem. 2. Proofs of Lemma.7 and Theorem.9. We start with the proof of Lemma.7. Proof of Lemma.7. Suppose that M has a number q of eigenvalues on the open right half complex plane. Using a similarity transformation we can transform M to a matrix of the form diagm, M 2 ) where M R d q) d q) and M 2 R q q are Hurwitz matrices. So without loss of generality, we assume M has this form. Let S and S 2 be the unique symmetric positive definite matrices solving the Lyapunov equations S M + M T S = I, and S 2 M 2 + M T 2 S 2 = I, respectively. Extend these to symmetric matrices in R d d by defining S = diags, ) and S 2 = diag, S 2 ), and also define, for α >, ϕ x) := e α x, S x, ϕ 2 x) := e α x, S 2 x, and ϕ := + ϕ ϕ 2. Let T = diagi d q) d q), q q ), and T 2 = diag d q) d q), I q q ). Then,

SUPPLEMENT TO CONTROLLED EQUILIBRIUM SELECTION 3 with L v fx) := 2 fx) + Mx + vx), fx), we obtain L v ϕ2 x) ) 3) = αϕ 2 x) trace S 2 ) 2α x, S 2x 2 + T 2 x 2 + 2 vx), S ) 2 x αϕ 2 x) trace S 2 ) + T 2 x 2 2 T 2x 2 For the inequality in 3) we use 2α S 2 2 T 2 x 2 2 S 2 2 vx) 2) = αϕ 2 x) traces 2 ) + 2 2α S 2 2) T 2 x 2 2 S 2 2 vx) 2). 2 vx), S 2 x = 2 S2 vx), T 2 x T 2 x 2 2 2 S2 vx) 2 2 T 2x 2 2 S 2 2 vx) 2. Using the analogous inequality for L v ϕ x) and combining the equations we obtain 3) L v ϕx) αe α x, S x traces ) + 2 + 2α S 2) T x 2 2 S 2 T vx) 2) + αe α x, S 2 x traces 2 ) + 2 2α S 2 2) T 2 x 2 2 S 2 2 T 2 vx) 2) α traces ) + e α x,sx 2 2α S 2) x 2 2 S 2 vx) 2), with S := diags, S 2 ). Applying Itô s formula to 3), dividing by α, and also using the fact that ϕ and ϕ = 2, we obtain E x [ T traces ) + e α Xt,SXt 2 2α S 2) X t 2 2 S 2 vx t ) 2) dt 2 α.

4 A. ARAPOSTATHIS, A. BISWAS, AND V. S. BORKAR Dividing by T, letting T and rearranging terms, we conclude that e α x,sx x 2 is integrable with respect to invariant probability measure µ v under the control v for any α <, and the following bound holds 4 S 2 e α x,sx x 2 µ v dx) traces ) 2 S 2 R d + 2 2α S 2 2 vx) 2 µ v dx). 2α S 2 R d Taking limits as α, using monotone convergence, we obtain x 2 µ v dx) 2 traces ) + 4 S 2 vx) 2 µ v dx). R d R d The proof is complete. Proof Theorem.9. It is well known Brockett, 97, Theorem 3, p. 5) that there exists at most one symmetric matrix Q satisfying 32) M T Q + QM = Q 2 and 33) M Q)Σ + ΣM Q) T = I. For κ >, consider the ergodic control problem of minimizing [ T 34) J κ v) := lim sup T T E κ X s 2 + 2 vx s) 2) ds, over v U SSM, subject to the linear controlled diffusion 35) X t = X + t MXs + vx s ) ) ds + W t, t. As is also well known, an optimal stationary Markov control for this problem takes the form vx) = Q κ x, where Q κ is the unique positive definite symmetric solution to the matrix Riccati equation 36) Q 2 κ M T Q κ Q κ M = 2 κi. Moreover, Q κ has the following property. Consider a deterministic linear control system ẋt) = Mxt) + ut), with x, u R d, and initial condition x) = x. Let U denote the space of controls u satisfying T ut) 2 dt < for all T >, and φ u t x ) denote the solution of the differential equation under a control u U. Then 37) x, Q κ x = min u U ut) 2 + 2κ φ u t x ) 2) dt.

SUPPLEMENT TO CONTROLLED EQUILIBRIUM SELECTION 5 For these assertions, see Brockett, 97, Theorem, p. 47, and Theorem 5, p. 5). On the other hand, Ψ κ x) = 2 x, Qκ x is a solution of the associated HJB equation 38) 2 Ψ κx)+ min u R d [ Mx+u, Ψκ x) + 2 u 2 +κ x 2 = 2 traceq κ). The HJB equation 38) characterizes the optimal cost, i.e., inf J κ v) = 2 traceq κ). v U SSM Let GM), M R d d, denote the collection of all matrices G R d d such that M G is Hurwitz. Also, for G GM), let Σ G denote the unique) symmetric solution of the Lyapunov equation 39) M G) Σ G + Σ G M G) T = I, and define 4) J G M) := 2 trace G Σ G G T), J M) := inf J GM). G GM) Since the stationary probability distribution of 35) under the control vx) = Q κ x is Gaussian, it follows by 34) that G = Q κ minimizes J G;κ M) := κ trace Σ G ) + 2 trace G Σ G G T) over all matrices G GM), where Σ G is as in 39) note that J G; M) = J G M) which is the right hand side of 4)). Combining this with 38), we obtain 4) inf G GM) By Lemma.7, we have 42) J G;κ M) = traceσ Qκ ) J Qκ;κM) = 2 traceq κ). C + JQκ;κM) ) = C + 2 traceq κ) ). It also follows by 37) that Q κ Q κ is nonnegative definite if κ κ. Therefore Q κ has a unique limit Q as κ. It is evident that Q is nonnegative semidefinite, and 36) shows that it satisfies 32). Since traceσ Qκ )

6 A. ARAPOSTATHIS, A. BISWAS, AND V. S. BORKAR is bounded by 42), it follows that Σ Qκ converges along some subsequence κ n to a symmetric positive semidefinite matrix Σ. Thus 33) holds. However, 33) implies that Σ is invertible, and therefore, it is positive definite. In turn, 33) implies that M Q is Hurwitz. Since v G x) = Gx, G GM), is in general suboptimal for the criterion J κ v), applying Lemma.7 once more, we obtain J Qκ M) Therefore, we have J Qκ;κM) κ C + JG M) ) + J G M) G GM). J M) J Qκ;κM) κ C + J M) ) + J M), and taking limits as κ, this implies by 4) that J M) = 2 traceq). It remains to show that Λ + M) = 2 traceq). Let T be a unitary matrix such that Q := T QT T takes the form Q = diag, Q 2 ), with Q 2 R q q a positive definite matrix, for some q d. Write the corresponding block structure of T MT T as ) M := T MT T = M M2 with M 22 R q q. Since M T Q+QM = Q 2, we obtain M T Q+ Q M = Q2, and block multiplication shows that Q 2 M2 =, which implies that M 2 =. Since M Q is similar to M Q, the latter must be Hurwitz, which implies that M is Hurwitz. By block multiplication we have M 2 43) M T 22 Q2 + Q 2 M22 = Q 2 2. Since Q 2 is positive definite, the matrix M 22 is Hurwitz by the Lyapunov theorem. Thus Λ + M) = trace M 22 ), since M is Hurwitz. Therefore, since Q 2 is invertible, and traceq) = trace Q), we obtain by 43) that M22 traceq) = trace Q 2 ) = trace M T 22 + M 22 ) = 2 trace M 22 ) = 2Λ + M). This proves part a). Now let ˆv U SSM be any control. Let V x) = 2 x, Qx. Then V satisfies 38) with κ =. Since [ Mx min + u, V x) + u R d 2 u 2 = Mx, V x) 2 Qx 2, = Mx + ˆvx), V x) + 2 ˆvx) 2 Qx + ˆvx) 2, 2

we obtain SUPPLEMENT TO CONTROLLED EQUILIBRIUM SELECTION 7 44) 2 V x) + Mx + ˆvx), V x) + 2 ˆvx) 2 = 2 traceq) + 2 Qx + ˆvx) 2. Applying Itô s formula to 44), and using the fact that µˆv has finite second moments as shown in Lemma.7, and V is quadratic, a standard argument gives 45) 2 ˆvx) 2 2 Qx + ˆvx) 2) µˆv dx) = 2 traceq). R d Thus R d 2 ˆvx) 2 µˆv dx) 2 traceq) = J M). Therefore, we have 46) inf v U SSM R d 2 vx) 2 µ v dx) = Λ + M). Suppose ˆv is optimal, i.e., attains the infimum in 46). By 45), we obtain R d Qx + ˆvx) 2 µˆv dx) =. Therefore, since µˆv has a positive density, it holds that ˆvx) = Qx a.e. in R d. This completes the proof of part b). We have already shown that V x) = 2 x, Qx satisfies 2 V x) + Mx, V x) V x) 2 = Λ + M), 2 and that the associated process is positive recurrent. Therefore, as in the proof of Theorem.4 for a bounded m, part c) follows by Theorems 2. 2.2 and Corollary 2.3 in Ichihara 2). Note that Hypothesis A4) in Ichihara 2) is easily satisfied for the linear problem. Indeed, since M is exponentially dichotomous, then as seen in the proof of Theorem 2.2, there exist symmetric matrices S and Ŝ, with Ŝ positive definite such that M T S +SM = Ŝ. Consider the function ϕ x) := a x, Sx, with a := 4 Ŝ S 2). Since Ŝ x, Ŝx x 2 S 2 Sx 2, we obtain Ā[ϕ x) := 2 ϕ x) + Mx, ϕ x) 2 ϕ x) 2 = a trace S + a x, Ŝx 2a2 Sx 2 > a 2 2 trace S+ x, Ŝx ). Thus Ā[ϕ x) as x. This completes the proof.

8 A. ARAPOSTATHIS, A. BISWAS, AND V. S. BORKAR References. Arapostathis, A., Borkar, V. S. and Ghosh, M. K. 22). Ergodic control of diffusion processes. Encyclopedia of Mathematics and its Applications 43. Cambridge University Press, Cambridge. Bensoussan, A. and Frehse, J. 22). Regularity results for nonlinear elliptic systems and applications. Applied Mathematical Sciences 5. Springer-Verlag, Berlin. Bogachev, V. I., Krylov, N. and Röckner, M. 996). Regularity of invariant measures: the case of non-constant diffusion part. J. Funct. Anal. 38 223 242. Brockett, R. W. 97). Finite dimensional linear systems. John Wiley & Sons. Ichihara, N. 2). Recurrence and transience of optimal feedback processes associated with Bellman equations of ergodic type. SIAM J. Control Optim. 49 938 96. Ichihara, N. 22). Large time asymptotic problems for optimal stochastic control with superlinear cost. Stochastic Process. Appl. 22 248 275. Metafune, G., Pallara, D. and Rhandi, A. 25). Global properties of invariant measures. J. Funct. Anal. 223 396 424. Sznajder, R. and Filar, J. A. 992). Some comments on a theorem of Hardy and Littlewood. J. Optim. Theory Appl. 75 2 28. Dept. of Electrical and Computer Eng. The University of Texas at Austin 25 Speedway, EER 7.824 Austin, TX 7872 E-mail: ari@ece.utexas.edu Department of Electrical Engineering Indian Institute of Technology Powai, Mumbai, India E-mail: borkar@ee.iitb.ac.in Indian Institute of Science Education and Research Dr. Homi Bhabha Road Pune 48, India E-mail: anup@iiserpune.ac.in