A variational formula for risk-sensitive control


1 A variational formula for risk-sensitive control Vivek S. Borkar, IIT Bombay May 9, 2018, IMA, Minneapolis

2 PART I: Risk-sensitive reward for Markov chains (joint work with V. Anantharam, University of California, Berkeley). PART II: Risk-sensitive cost for reflected diffusions (joint work with Ari Arapostathis, University of Texas at Austin, and K. Suresh Kumar, IIT Bombay). PART III: Linear/dynamic programming approaches for degenerate finite risk-sensitive reward problems.

3 PART I

4 Courant-Fischer formula for the principal eigenvalue of a positive definite matrix $A \in \mathbb{R}^{d \times d}$:
$$\lambda = \max_{0 \neq x \in \mathbb{R}^d} \frac{x^T A x}{x^T x}.$$
Consider an irreducible nonnegative $Q \in \mathbb{R}^{d \times d}$. Then the Perron-Frobenius theorem guarantees a positive principal eigenvalue with an associated positive eigenvector. Is there a counterpart of the Courant-Fischer formula?

5 YES!! The Collatz-Wielandt formula for the principal eigenvalue of an irreducible nonnegative matrix $Q = [[q(i,j)]] \in \mathbb{R}^{d \times d}$:
$$\lambda = \sup_{x = [x_1, \dots, x_d]^T,\ x_i > 0\ \forall i}\ \min_{i : x_i > 0} \frac{(Qx)_i}{x_i} = \inf_{x = [x_1, \dots, x_d]^T,\ x_i > 0\ \forall i}\ \max_{i : x_i > 0} \frac{(Qx)_i}{x_i}.$$
An alternative characterization: write $Q = DP$, where $D = \mathrm{diag}(d_1, \dots, d_d)$ and $P = [[p(j|i)]]$ is stochastic.
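
[Not part of the slides: a small numerical sanity check of the Collatz-Wielandt characterization, using NumPy on an arbitrary strictly positive (hence irreducible) matrix; the matrix and tolerances are illustrative choices only.]

```python
# Sanity check of the Collatz-Wielandt formula for a nonnegative matrix.
# Illustrative sketch only; Q below is an arbitrary strictly positive matrix.
import numpy as np

rng = np.random.default_rng(0)
d = 5
Q = rng.uniform(0.1, 1.0, size=(d, d))    # strictly positive => irreducible

# Perron-Frobenius eigenvalue (largest real part, positive for this Q).
eigvals, eigvecs = np.linalg.eig(Q)
k = np.argmax(eigvals.real)
lam = eigvals[k].real
v = np.abs(eigvecs[:, k].real)            # positive Perron eigenvector

def cw_lower(x):  # min_i (Qx)_i / x_i, a lower bound on lambda for x > 0
    return np.min(Q @ x / x)

def cw_upper(x):  # max_i (Qx)_i / x_i, an upper bound on lambda for x > 0
    return np.max(Q @ x / x)

# Any positive x brackets lambda; the Perron eigenvector makes both sides equal.
x = rng.uniform(0.5, 1.5, size=d)
print(cw_lower(x), "<=", lam, "<=", cw_upper(x))
assert cw_lower(x) <= lam + 1e-10 and lam <= cw_upper(x) + 1e-10
assert np.isclose(cw_lower(v), lam) and np.isclose(cw_upper(v), lam)
```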

6 Also define $G_0 := \{ (\pi, \tilde{P}) : \pi$ is a stationary probability for the stochastic matrix $\tilde{P} = [[\tilde{p}(j|i)]] \}$. Then the following representation holds:
$$\log \lambda = \sup_{(\pi, \tilde{P}) \in G_0} \sum_i \pi(i) \big[ \log d(i) - D(\tilde{p}(\cdot|i) \,\|\, p(\cdot|i)) \big].$$
This is the Donsker-Varadhan formula for the principal eigenvalue of a nonnegative matrix (cf. the book by Dembo and Zeitouni).
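
[Again not from the slides: a hedged numerical illustration of this representation. For the decomposition $Q = DP$, the supremum is attained at the "twisted" kernel built from the Perron eigenvector; evaluating the bracketed expression at that kernel recovers $\log \lambda$. The matrix is arbitrary and the choice of maximizer is assumed.]

```python
# Numerical illustration of the Donsker-Varadhan representation
#   log(lambda) = sup_{(pi, P~)} sum_i pi(i) [ log d(i) - D(p~(.|i) || p(.|i)) ],
# evaluated at the candidate maximizer induced by the Perron eigenvector.
import numpy as np

rng = np.random.default_rng(1)
d = 4
Q = rng.uniform(0.1, 1.0, size=(d, d))

eigvals, eigvecs = np.linalg.eig(Q)
k = np.argmax(eigvals.real)
lam = eigvals[k].real
v = np.abs(eigvecs[:, k].real)             # right Perron eigenvector

D_row = Q.sum(axis=1)                       # d(i): row sums
P = Q / D_row[:, None]                      # p(j|i) = q(i,j)/d(i), stochastic

# Twisted kernel p~(j|i) proportional to q(i,j) v(j), normalized row-wise.
P_twist = Q * v[None, :]
P_twist = P_twist / P_twist.sum(axis=1, keepdims=True)

# Stationary distribution pi of the twisted kernel (left eigenvector for eigenvalue 1).
w, vl = np.linalg.eig(P_twist.T)
pi = np.abs(vl[:, np.argmin(np.abs(w - 1.0))].real)
pi = pi / pi.sum()

kl_rows = (P_twist * np.log(P_twist / P)).sum(axis=1)   # D(p~(.|i) || p(.|i))
value = np.dot(pi, np.log(D_row) - kl_rows)

print(value, "should match", np.log(lam))
assert np.isclose(value, np.log(lam), atol=1e-8)
```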

7 An infinite-dimensional generalization of the Perron-Frobenius theorem is given by the Krein-Rutman theorem: Let 1. $B$ be a Banach space with a positive cone $K$ such that $K - K$ is dense in $B$, and 2. $T : B \to B$ a compact linear operator which is strongly positive. Then a principal eigenvalue (unique, positive) and eigenvector (positive) exist.

8 Our interest is in the following nonlinear scenario arising in risk-sensitive control: consider a controlled Markov chain $\{X_n\}$ on a compact metric state space $S$, an associated control process $\{Z_n\}$ in a compact metric control space $U$, a per-stage reward function $r : S \times U \times S \to \mathbb{R}$ with $r \in C(S \times U \times S)$,

9 a controlled transition kernel $p(dy|x,u)$ with full support, such that
$$P(X_{n+1} \in A \mid X_m, Z_m,\ m \leq n) = p(A \mid X_n, Z_n),$$
and the maps $(x,u) \mapsto \int f(y)\, p(dy|x,u)$, $f \in C(S)$, $\|f\| \leq 1$, are equicontinuous. (This can be relaxed via an approximation argument.)

10 The control problem is to maximize the asymptotic growth rate of the exponential reward:
$$\lambda := \sup_{x \in S} \sup \liminf_{N \to \infty} \frac{1}{N} \log E\Big[ e^{\sum_{m=0}^{N-1} r(X_m, Z_m, X_{m+1})} \,\Big|\, X_0 = x \Big].$$
The second supremum is over all admissible (i.e., non-anticipative) controls. We allow relaxed (i.e., probability measure valued) controls. Define
$$T f(x) := \sup_{\varphi \in P(U)} \int\!\!\int p(dy|x,u)\, \varphi(du)\, e^{r(x,u,y)} f(y), \qquad T^{(n)} := T \circ T \circ \cdots \circ T \ (n \text{ times}), \qquad T^{(0)} := \mathrm{Id}.$$
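
[Not from the slides: a finite-state, fixed-policy (i.e., uncontrolled) illustration of the quantity inside the supremum. The expected exponential reward over $N$ steps equals $(M^N \mathbf{1})(i)$ with $M(i,j) = p(j|i)\, e^{r(i,j)}$, so the growth rate converges to the log of the Perron root of $M$. All model data below are made up.]

```python
# Finite-state, uncontrolled illustration of the exponential-reward growth rate:
#   E_i[ exp(sum_{m<N} r(X_m, X_{m+1})) ] = (M^N 1)(i),  M(i,j) = p(j|i) e^{r(i,j)},
# so (1/N) log E_i[...] converges to log of the Perron root of M. Toy data only.
import numpy as np

rng = np.random.default_rng(2)
d = 4
P = rng.uniform(0.1, 1.0, size=(d, d))
P = P / P.sum(axis=1, keepdims=True)        # transition matrix p(j|i)
r = rng.uniform(-1.0, 1.0, size=(d, d))     # per-stage reward r(i, j)

M = P * np.exp(r)                            # "multiplicative" kernel
log_rho = np.log(np.max(np.linalg.eigvals(M).real))

for N in (1, 10, 50, 200):
    w = np.linalg.matrix_power(M, N) @ np.ones(d)   # E_i[exp(total reward)] over N steps
    print(N, np.log(w) / N)                          # converges componentwise to log_rho
print("log Perron root:", log_rho)
```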

11 $T^{(n)} : C(S) \to C(S)$, $n \geq 0$, is the Nisio semigroup, satisfying: 1. strictly increasing: $f > g \implies T^{(n)} f > T^{(n)} g$; 2. strongly positive: $f \geq 0$, $f \neq \theta \implies T^{(n)} f \in \mathrm{int}(C_+(S))$; 3. positively 1-homogeneous: $c > 0 \implies T^{(n)}(cf) = c\, T^{(n)} f$; 4. compact.

12 This leads to an abstract Collatz-Wielandt formula:
Theorem. There exist $\rho > 0$, $\psi \in \mathrm{int}(C_+(S))$ such that $T\psi = \rho\psi$ and
$$\rho = \inf_{f \in \mathrm{int}(C_+(S))} \sup_{\mu \in M_+(S)} \frac{\int T f\, d\mu}{\int f\, d\mu} = \sup_{f \in \mathrm{int}(C_+(S))} \inf_{\mu \in M_+(S)} \frac{\int T f\, d\mu}{\int f\, d\mu}.$$
Furthermore, $\log \rho$ is the optimal reward for the risk-sensitive control problem. The proof uses the Nussbaum-Ogiwara formulation of a nonlinear Krein-Rutman theorem.
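
[Not from the slides: a sketch of how $(\rho, \psi)$ can be approximated in a finite-state, finite-action model by normalized fixed-point ("power") iteration on $T$, using the fact that for finite $U$ the supremum over $P(U)$ is attained at a point mass. All data are synthetic and full support of the kernels is assumed.]

```python
# Normalized fixed-point iteration for the nonlinear eigenproblem T(psi) = rho * psi,
# where (T f)(i) = max_u sum_j p(j|i,u) e^{r(i,u,j)} f(j). Synthetic finite model.
import numpy as np

rng = np.random.default_rng(3)
nS, nU = 5, 3
P = rng.uniform(0.1, 1.0, size=(nU, nS, nS))
P = P / P.sum(axis=2, keepdims=True)              # p(j|i,u), full support
r = rng.uniform(-1.0, 1.0, size=(nU, nS, nS))     # r(i,u,j), indexed as r[u,i,j]

def T(f):
    # (T f)(i) = max over u of sum_j p(j|i,u) * exp(r(i,u,j)) * f(j)
    return np.max(np.einsum('uij,uij,j->ui', P, np.exp(r), f), axis=0)

psi = np.ones(nS)
for _ in range(2000):
    Tpsi = T(psi)
    rho = np.max(Tpsi)          # any norm works: T is positively 1-homogeneous
    psi_new = Tpsi / rho
    if np.max(np.abs(psi_new - psi)) < 1e-12:
        psi = psi_new
        break
    psi = psi_new

print("rho =", rho, " log rho (optimal growth rate) =", np.log(rho))
print("fixed-point residual:", np.max(np.abs(T(psi) - rho * psi)))
```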

13 A variational formula: Let $G :=$ the set of probability measures $\eta(dx, du, dy) \in P(S \times U \times S)$ which disintegrate as $\eta(dx, du, dy) = \eta_0(dx)\, \eta_1(du|x)\, \eta_2(dy|x,u)$, such that $\eta_0$ is invariant under the transition kernel $\int_U \eta_2(dy|x,u)\, \eta_1(du|x)$. Let $D(\cdot \,\|\, \cdot)$ denote the Kullback-Leibler divergence or relative entropy.

14 Theorem. Under the above hypotheses,
$$\log \rho = \sup_{\eta \in G} \int \eta_0(dx)\, \eta_1(du|x) \Big[ \int r(x,u,y)\, \eta_2(dy|x,u) - D(\eta_2(dy|x,u) \,\|\, p(dy|x,u)) \Big].$$
This generalizes the Donsker-Varadhan formula. The hypotheses can be relaxed to: 1. $r$ may take the value $-\infty$, with $e^r \in C(S \times U \times S)$; 2. $p(dy|x,u)$ need not have full support. (Extension via an approximation argument.)

15 We have an equivalent concave maximization problem, as opposed to the team problem one would obtain from the usual logarithmic transformation. If $\rho(\varphi)$ denotes the asymptotic growth rate for a randomized Markov control $\varphi$, then $\rho = \max_\varphi \rho(\varphi)$ (sufficiency of randomized Markov controls). Related to entropy-penalized control (Bierkens-Kappen, Guan-Raginsky-Willett, Todorov).

16 Applications: 1. Growth rate of the number of directed paths in a graph (requires $-\infty$ as a possible reward value). 2. Portfolio optimization in the framework of Bielecki, Hernandez-Hernández and Pliska. 3. The problem of minimizing the exit rate from a domain.

17 PART II

18 Reflected diffusion:
$$dX(t) = b(X(t), v(t))\, dt + \sigma(X(t))\, dW(t) - \gamma(X(t))\, d\xi(t), \qquad d\xi(t) = I\{X(t) \in \partial Q\}\, d\xi(t), \qquad t \geq 0.$$
Here: 1. $Q$ is open and bounded with $C^3$ boundary $\partial Q$. 2. $b$ is continuous, with $b(\cdot, u)$ Lipschitz uniformly in $u$. 3. $\sigma$ is $C^{1, \beta_0}$ and uniformly non-degenerate.

19 4. $\gamma(x) = \sigma(x)\sigma(x)^T \eta(x)$, where $\eta(x)$ is the unit outward normal. 5. $W(\cdot)$ is a Brownian motion and $v(\cdot)$ is a non-anticipative control taking values in a compact action space.
OBJECTIVE: Minimize
$$\lim_{t \to \infty} \frac{1}{t} \log E\Big[ e^{\int_0^t r(X(s), v(s))\, ds} \Big],$$
where $r$ is continuous.
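
[Not from the slides: a crude Monte Carlo sketch of this objective for a one-dimensional diffusion reflected at the boundary of $Q = (0,1)$ with a fixed (constant) control, using Euler-Maruyama with simple projection onto $[0,1]$ in place of an exact reflection scheme. All coefficients are made up and the estimate is only indicative, since the expectation of an exponential functional is heavy-tailed.]

```python
# Crude Monte Carlo estimate of (1/t) log E[ exp( int_0^t r(X(s)) ds ) ]
# for a 1-D diffusion reflected at the boundary of Q = (0,1).
# Euler-Maruyama with projection; illustrative only, coefficients are arbitrary.
import numpy as np

rng = np.random.default_rng(4)

def b(x):      return -0.5 * (x - 0.3)     # drift (fixed control absorbed into b)
def sigma(x):  return 0.4                  # diffusion coefficient
def r(x):      return x * (1.0 - x)        # running cost

T, dt, n_paths = 20.0, 1e-3, 2000
n_steps = int(T / dt)

x = np.full(n_paths, 0.5)
integral_r = np.zeros(n_paths)
for _ in range(n_steps):
    integral_r += r(x) * dt
    dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    x = x + b(x) * dt + sigma(x) * dw
    x = np.clip(x, 0.0, 1.0)               # projection in place of exact reflection

estimate = np.log(np.mean(np.exp(integral_r))) / T
print("Monte Carlo estimate of the risk-sensitive cost:", estimate)
```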

20 Nisio semigroup: for $t \geq 0$,
$$S_t f(x) := \inf_{v(\cdot)} E_x\Big[ e^{\int_0^t r(X(s), v(s))\, ds} f(X(t)) \Big].$$
Then $S_t : C(\bar{Q}) \to C(\bar{Q})$ is a strongly continuous semigroup of bounded, Lipschitz, monotone operators with infinitesimal generator $G$ defined by
$$G f(x) = \frac{1}{2} \mathrm{tr}\big( \sigma(x)\sigma^T(x)\, \nabla^2 f(x) \big) + \min_v \big[ \langle b(x,v), \nabla f(x) \rangle + r(x,v) f(x) \big].$$
It is also superadditive, positively 1-homogeneous, strongly positive, and completely continuous.

21 Let $C^2_{\gamma,+}(\bar{Q}) := \{ f : \bar{Q} \to [0, \infty) : f$ is $C^2$ with $\langle \nabla f(x), \gamma(x) \rangle = 0$ for $x \in \partial Q \}$.
Nonlinear Krein-Rutman theorem $\implies$ there exists a unique pair $(\rho, \varphi) \in \mathbb{R} \times C^2_{\gamma,+}(\bar{Q})$ with $\varphi \geq 0$, $\|\varphi\|_{\infty, \bar{Q}} = 1$, such that $S_t \varphi = e^{\rho t} \varphi$. This solves
$$G\varphi(x) = \rho\, \varphi(x), \ x \in Q, \qquad \langle \nabla \varphi(x), \gamma(x) \rangle = 0, \ x \in \partial Q.$$

22 Abstract Collatz-Wielandt formula $\implies$
$$\rho = \inf_{f \in C^2_{\gamma,+}(\bar{Q}),\ f > 0}\ \sup_{\nu \in P(\bar{Q})} \int \frac{G f}{f}\, d\nu = \sup_{f \in C^2_{\gamma,+}(\bar{Q}),\ f > 0}\ \inf_{\nu \in P(\bar{Q})} \int \frac{G f}{f}\, d\nu.$$
In the uncontrolled case, the first formula is the convex dual of the Donsker-Varadhan formula for the principal eigenvalue of $G$:
$$\rho = \sup_{\nu \in P(\bar{Q})} \Big( \int_{\bar{Q}} r(x)\, \nu(dx) - I(\nu) \Big), \qquad \text{where} \quad I(\nu) := -\inf_{f \in C^2_{\gamma,+}(\bar{Q}),\ f > 0} \int_{\bar{Q}} \frac{\tilde{G} f}{f}\, d\nu,$$
with $\tilde{G} := G - r$ denoting the generator without the potential term.

23 PART III

24 Notation: $\{X_n, n \geq 0\}$ : controlled Markov chain with finite state space $S := \{1, 2, \dots, s\}$ and finite action space $U$. $\{Z_n\}$ : $U$-valued control process. $P_u = [[p(j|i,u)]]_{i,j \in S}$ : controlled transition matrix. $r(\cdot,\cdot,\cdot) \in C(S \times U \times S)$ : per-stage reward.

25 Risk-sensitive reward. Aim: maximize the asymptotic growth rate
$$\lambda := \sup_i \sup_{\{Z_n\}} \liminf_{N \to \infty} \frac{1}{N} \log E_i\Big[ e^{\sum_{m=0}^{N-1} r(X_m, Z_m, X_{m+1})} \Big].$$
Here $E_i[\,\cdot\,] :=$ expectation given $X_0 = i$, and the inner supremum is over all admissible controls.

26 Consider a controlled Markov chain $\{Y_n\}$ on $S$ with state-dependent action space at state $i$ given by
$$\tilde{U}_i := \bigcup_{u \in U} \big( \{u\} \times V_{i,u} \big), \qquad \text{where} \quad V_{i,u} := \Big\{ q(\cdot|i,u) : q(\cdot|i,u) \geq 0,\ \sum_j q(j|i,u) = 1 \Big\}.$$
(This is isomorphic to $P(S)$.) Let $K := \bigcup_{i \in S} \big( \{i\} \times \tilde{U}_i \big)$. The (controlled) transition probabilities of $\{Y_n\}$ are
$$\tilde{p}\big(j \mid i, (u, q(\cdot|i,u))\big) := q(j|i,u).$$

27 Define the per-stage reward $\tilde{r} : K \times S \to \mathbb{R}$ by
$$\tilde{r}\big(i, (u, q(\cdot|i,u)), j\big) := r(i,u,j) - D\big(q(\cdot|i,u) \,\|\, p(\cdot|i,u)\big).$$
Let $\{(Z_n, Q_n), n \geq 0\}$ := the $\tilde{U}_{Y_n}$-valued control process. Consider the problem: maximize the long-run average reward
$$\liminf_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} E_x\big[ \tilde{r}\big(Y_n, (Z_n, Q_n), Y_{n+1}\big) \big].$$

28 Define the corresponding ergodic occupation measure $\gamma \in P(K \times S)$ by
$$\gamma(i, (u, dq), j) := \gamma_1(i)\, \gamma_2(u, dq \mid i)\, \gamma_3(j \mid i, (u, q)),$$
where $\gamma_1$ is an invariant probability distribution (not necessarily unique) under the transition kernel
$$\check{\gamma}(j|i) = \sum_u \int_{V_{i,u}} \gamma_2(u, dq \mid i)\, \gamma_3(j \mid i, (u, q)).$$
Let $E :=$ the set of such $\gamma$.

29 The above average-reward control problem is equivalent to the linear program
(P0): Maximize $\sum_{i,j,u} \int \gamma(i, (u, dq), j)\, \tilde{r}(i, (u, q), j)$ over $E$.
(Recall that $E$ is specified by linear constraints.) The maximum will be attained at an extreme point of $E$ corresponding to a stationary Markov policy.

30 This LP can be simplified as: Maximize
$$\sum_{i,u,j} \gamma'(i,u,j) \big[ r(i,u,j) - D\big(q(\cdot|i,u) \,\|\, p(\cdot|i,u)\big) \big]$$
over
$$\tilde{E} := \Big\{ \gamma' \in P(S \times U \times S) : \gamma'(i,u,j) = \gamma'_1(i)\, \phi(u|i)\, q(j|i,u), \text{ where } \gamma'_1(\cdot) \text{ is invariant under the transition kernel } \hat{\gamma}(j|i) := \sum_u \phi(u|i)\, q(j|i,u) \Big\}.$$
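
[Not part of the slides: a sketch of how this simplified concave program might be solved numerically for a small synthetic model. The objective is rewritten with relative-entropy atoms $\gamma'(i,u,j)\log\big(\gamma'(i,u,j)/(m(i,u)\,p(j|i,u))\big)$, $m(i,u) := \sum_j \gamma'(i,u,j)$, and the invariance condition becomes linear balance equations. It uses CVXPY and assumes an exponential-cone-capable solver is installed.]

```python
# Sketch of the simplified concave program over occupation measures gamma'(i,u,j):
#   maximize  sum gamma'*r  -  sum_{i,u,j} gamma' log( gamma' / (m(i,u) p(j|i,u)) ),
# subject to gamma' >= 0, sum = 1, and the balance (invariance) equations.
# Synthetic model; by the variational formula the optimal value equals log rho.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(5)
nS, nU = 4, 2
P = rng.uniform(0.1, 1.0, size=(nS, nU, nS))
P = P / P.sum(axis=2, keepdims=True)             # p(j|i,u)
R = rng.uniform(-1.0, 1.0, size=(nS, nU, nS))    # r(i,u,j)

P2 = P.reshape(nS * nU, nS)                      # row index = i*nU + u
R2 = R.reshape(nS * nU, nS)

g = cp.Variable((nS * nU, nS), nonneg=True)      # gamma'(i,u,j)
m = cp.sum(g, axis=1)                            # m(i,u) = sum_j gamma'(i,u,j)

# KL-type penalty: sum_{i,u,j} rel_entr( gamma'(i,u,j), m(i,u) * p(j|i,u) )
m_mat = cp.reshape(m, (nS * nU, 1)) @ np.ones((1, nS))
penalty = cp.sum(cp.rel_entr(g, cp.multiply(P2, m_mat)))
objective = cp.sum(cp.multiply(R2, g)) - penalty

constraints = [cp.sum(g) == 1]
for j in range(nS):
    inflow = cp.sum(g[:, j])                     # sum_{i,u} gamma'(i,u,j)
    outflow = cp.sum(g[j * nU:(j + 1) * nU, :])  # sum_{u,k} gamma'(j,u,k)
    constraints.append(inflow == outflow)

prob = cp.Problem(cp.Maximize(objective), constraints)
prob.solve()
print("optimal value (should equal log rho for this model):", prob.value)
```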

31 The dual LP is: Minimize $\lambda$ subject to
$$\lambda \geq \lambda(i), \qquad \lambda(i) \geq \sum_j q(j|i,u)\, \lambda(j), \qquad \lambda(i) + V(i) \geq \sum_j q(j|i,u) \big( \tilde{r}(i, (u, q(\cdot|i,u)), j) + V(j) \big),$$
for all $i \in S$ and $(u, q(\cdot|i,u)) \in \tilde{U}_i$. The proof goes through finite approximations. Note that the LP has infinitely many constraints.

32 Dynamic Programming. The equivalent dynamic programming formulation is:
$$\lambda = \max_i \lambda(i),$$
$$\lambda(i) + V(i) = \max_{(u,\, q(\cdot|i,u))} \sum_j q(j|i,u) \big( V(j) + \tilde{r}(i, (u, q(\cdot|i,u)), j) \big), \qquad (*)$$
$$\lambda(i) = \max_{(u,\, q(\cdot|i,u)) \in B_i} \sum_j q(j|i,u)\, \lambda(j), \qquad i \in S,$$
where $B_i$ is the Argmax in $(*)$. Once again, the proof goes through finite approximations.

33 The maximization over $q$ in $(*)$ can be performed explicitly using the Gibbs variational principle from statistical mechanics: for fixed $i, u$, the maximum is attained at
$$q^*(j|i,u) := \frac{p(j|i,u)\, e^{r(i,u,j) + V(j)}}{\sum_k p(k|i,u)\, e^{r(i,u,k) + V(k)}}.$$
Substitute back, set $\Phi(i) := e^{V(i)}$, $\Lambda(i) := e^{\lambda(i)}$, $i \in S$, and exponentiate both sides of $(*)$.
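
[Not from the slides: the Gibbs variational principle invoked here states that $\max_q \{ \sum_j q(j) a(j) - D(q \,\|\, p) \} = \log \sum_j p(j) e^{a(j)}$, attained at $q^*(j) \propto p(j) e^{a(j)}$; here $a(j)$ plays the role of $r(i,u,j) + V(j)$ for fixed $(i,u)$. A quick numerical check with arbitrary data:]

```python
# Check of the Gibbs variational principle used to eliminate q in (*):
#   max_q [ sum_j q(j) a(j) - D(q || p) ] = log sum_j p(j) exp(a(j)),
# attained at q*(j) = p(j) exp(a(j)) / sum_k p(k) exp(a(k)). Toy data only.
import numpy as np

rng = np.random.default_rng(6)
n = 6
p = rng.uniform(0.1, 1.0, size=n); p /= p.sum()
a = rng.uniform(-1.0, 1.0, size=n)

def value(q):
    return np.dot(q, a) - np.dot(q, np.log(q / p))   # objective at kernel q

q_star = p * np.exp(a); q_star /= q_star.sum()
log_sum = np.log(np.dot(p, np.exp(a)))

print(value(q_star), "=", log_sum)
assert np.isclose(value(q_star), log_sum)

# Any other q gives a value no larger than the log-sum-exp.
for _ in range(5):
    q = rng.uniform(0.1, 1.0, size=n); q /= q.sum()
    assert value(q) <= log_sum + 1e-12
```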

34 This leads to the multiplicative dynamic programming equations for infinite-horizon risk-sensitive reward in the general degenerate case:
$$\Lambda(i)\Phi(i) = \max_u \sum_j p(j|i,u)\, e^{r(i,u,j)}\, \Phi(j), \qquad (**)$$
$$\Lambda(i) = \max_{u \in D_i} \sum_j \frac{p(j|i,u)\, e^{r(i,u,j)}\, \Phi(j)}{\sum_k p(k|i,u)\, e^{r(i,u,k)}\, \Phi(k)}\, \Lambda(j), \qquad i \in S,$$
where $D_i$ is the Argmax in $(**)$. This is the analog of the Howard-Kallenberg results for ergodic control. Observe the occurrence of the twisted kernel.

35 FUTURE PROBLEMS: 1. Extension to non-compact state spaces (cf. recent work of Cavazos-Cadena (Math. OR, 2018)) 2. Degenerate case for diffusions

36 References
1. V. Anantharam, V. S. Borkar, A variational formula for risk-sensitive reward, submitted.
2. A. Arapostathis, V. S. Borkar, K. Suresh Kumar, Risk-sensitive control and an abstract Collatz-Wielandt formula, Journal of Theoretical Probability 29 (2016), 1458-1484.

37 3. V. S. Borkar, Linear and dynamic programming approaches to degenerate risk-sensitive reward processes, 56th IEEE Conference on Decision and Control, December 2017, Melbourne, Australia.

38 THANK YOU!
