Introduction to optimal control theory in continuous time (with economic applications)

Salvatore Federico

June 26, 2017


Contents

1 Introduction to optimal control problems in continuous time
  1.1 From static to dynamic optimization
  1.2 Basic ODE theory
  1.3 Formulation of optimal control problems
    1.3.1 State equation
    1.3.2 Set of admissible control strategies
    1.3.3 Objective functional
    1.3.4 Feedback control strategies
  1.4 Examples
    1.4.1 Example 1: Cake's problem
    1.4.2 Example 2: Optimal consumption in AK growth models
    1.4.3 Example 3: Optimal investment

2 Dynamic Programming method
  2.1 Dynamic Programming Principle
  2.2 HJB equation
  2.3 Verification Theorem and optimal feedbacks
    2.3.1 Autonomous problems with infinite horizon

3 Application to economic problems
  3.1 AK model: finite horizon
  3.2 AK model: infinite horizon
  3.3 Optimal investment model: infinite horizon
  3.4 Exercises


Chapter 1
Introduction to optimal control problems in continuous time

1.1 From static to dynamic optimization

The problem of utility maximization is a classical problem of Economic Theory which, in its simplest formulation, nicely illustrates the passage from static to dynamic optimization problems.

Static case
Consider a consumer with an initial amount of money $x_0$ who may consume $k$ different goods and aims at maximizing his satisfaction from consumption (without taking on any debt). If $c = (c_1, \dots, c_k)$ denotes the vector of consumed quantities (clearly nonnegative), $p = (p_1, \dots, p_k)$ the vector of nonnegative prices of the $k$ goods available, and $U(c)$ the satisfaction from the consumption allocation $c$, where $U : \mathbb{R}^k_+ \to \mathbb{R}_+$ is a utility function (jointly concave in all the variables and nondecreasing in each component), then the problem is

$$\max U(c) \quad \text{s.t.} \quad c \geq 0, \ \langle p, c \rangle \leq x_0. \qquad (1.1.1)$$

The main ingredients of the problem are:
- the objective function $U$ to maximize;
- the constraints to be fulfilled.

In one dimension the problem simply becomes

$$\max_{c \in [0,\, x_0/p]} U(c), \qquad (1.1.2)$$

where $U : \mathbb{R}_+ \to \mathbb{R}$ is a utility (i.e. concave and nondecreasing) function.

Dynamic case
The dynamic formulation of (1.1.2) is the following. The consumer may spread consumption over a certain period of time: this means that the consumption $c$ becomes a time-dependent function, and so may the price $p$. The set of times at which the consumption decision may be taken is usually a subset $\mathcal{T}$ of the positive half line $\mathbb{R}_+ = [0, +\infty)$. How to choose the subset $\mathcal{T}$? Usually one sets the initial time $t = 0$ and chooses a time horizon $T \in [0, +\infty]$: if $T < +\infty$, we are considering a finite horizon problem; if $T = \infty$, we are considering an infinite horizon problem. Typically
- $\mathcal{T} = [0, T] \cap \mathbb{N}$, i.e. the discrete time case;
- $\mathcal{T} = [0, T] \cap \mathbb{R}_+$, i.e. the continuous time case.

In these notes we deal with the continuous time case, so $\mathcal{T} = [0, T] \cap \mathbb{R}_+$. The consumption $c$ is now a function representing the rate of consumption at time $t \in \mathcal{T}$, and the consumer aims at maximizing an intertemporal utility from consumption

$$\int_0^T e^{-\rho t}\, U(c(t))\, dt,$$

where $\rho > 0$ is a discount factor, according to a usual economic assumption: the consumer is less "satisfied" by postponing consumption.

Considering now the constraints on the decision variable in the static case, let us see how they can be formulated in this dynamic case. The nonnegativity constraint $c \geq 0$ naturally becomes

$$c(t) \geq 0 \quad \forall t \in \mathcal{T}.$$

The budget constraint $\langle p, c \rangle \leq x_0$ is naturally rephrased as

$$x(t) \geq 0 \quad \forall t \in \mathcal{T},$$

where $x(t)$ is now the money in the pocket at time $t$, i.e.

$$x(t) := x_0 - \int_0^t p(s)\, c(s)\, ds, \quad t \in \mathcal{T}.$$

Notice that, if $p(\cdot)$ and $c(\cdot)$ are continuous, then $x'(t) = -p(t)c(t)$ for every $t \in \mathcal{T}$. So the problem can be written in the compact form

$$\max \int_0^T e^{-\rho t}\, U(c(t))\, dt,$$

subject to the pointwise constraints

$$c(t) \geq 0 \quad \text{and} \quad x(t) \geq 0 \quad \forall t \in \mathcal{T},$$

and to the differential constraint

$$x'(t) = -p(t)c(t), \ t \in \mathcal{T}, \qquad x(0) = x_0.$$

Remark 1.1.1. For simplicity we shall deal with one-dimensional problems, but we stress that everything can be suitably reformulated and studied for n-dimensional problems, replacing $\mathbb{R}$ with $\mathbb{R}^n$ as state space (see below for the notion of state space).
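As a minimal numerical sketch of the budget dynamics above, the snippet below simulates $x(t) = x_0 - \int_0^t p(s)c(s)\, ds$ and the discounted utility integral, assuming (purely for illustration, not from the text) a constant price, a constant consumption rate and logarithmic utility.

```python
import numpy as np

# Minimal sketch of the dynamic consumer's budget dynamics x'(t) = -p(t) c(t),
# assuming a constant price p and a constant consumption rate c (illustrative choices).
rho, T = 0.05, 10.0          # discount factor and horizon (illustrative values)
x0, p, c = 100.0, 1.0, 8.0

t = np.linspace(0.0, T, 1001)
x = x0 - p * c * t                                  # x(t) = x0 - ∫_0^t p c ds
utility = np.trapz(np.exp(-rho * t) * np.log(c), t) # ∫_0^T e^{-rho t} U(c(t)) dt with U = log

print("final wealth x(T)        =", x[-1])
print("wealth stays nonnegative :", (x >= 0).all())
print("discounted utility       =", utility)
```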

1.2 Basic ODE theory

We recall some basic facts about one-dimensional Ordinary Differential Equations (ODEs). Given $f : \mathcal{T} \times \mathbb{R} \to \mathbb{R}$ and $(t_0, x_0) \in \mathcal{T} \times \mathbb{R}$, we consider the so-called Cauchy problem

$$x'(t) = f(t, x(t)), \qquad x(t_0) = x_0. \qquad (1.2.1)$$

A (classical) solution to (1.2.1) is a differentiable function $x : \mathcal{T} \to \mathbb{R}$ satisfying the two requirements
1. $x(t_0) = x_0$;
2. $x'(t) = f(t, x(t))$ for every $t \in \mathcal{T}$.

The first basic theorem of ODE theory is the following.

Theorem 1.2.1 (Cauchy-Lipschitz). Let $f \in C(\mathcal{T} \times \mathbb{R}; \mathbb{R})$ be Lipschitz continuous with respect to space, uniformly in time: i.e., for some $L > 0$,

$$|f(t, x) - f(t, x')| \leq L |x - x'|, \quad \forall t \in \mathcal{T}, \ \forall x, x' \in \mathbb{R},$$

and let $(t_0, x_0) \in \mathcal{T} \times \mathbb{R}$. Then the Cauchy problem (1.2.1) has a unique solution.

To deal with optimal control problems we need a refinement of this theorem, allowing for $f$ that are less regular in time. Given $U \subseteq \mathbb{R}$, we consider the space

$$L^1_{loc}(\mathcal{T}; U) := \left\{ u : \mathcal{T} \to U \ : \ \int_0^R |u(t)|\, dt < \infty \ \ \forall R > 0 \right\}.$$

Remark 1.2.2. Actually, in the true definition of the space above, the functions are not allowed to be arbitrarily irregular: they need to be measurable. We drop this issue to avoid technical complications; we only note that not all functions are measurable, but all the functions we can "imagine" are. Moreover, such spaces are usually defined not as spaces of functions but, rather, as spaces of equivalence classes of functions. Finally, in the case $T < \infty$, we simply have

$$L^1_{loc}(\mathcal{T}; U) = L^1(\mathcal{T}; U) := \left\{ u : \mathcal{T} \to U \ : \ \int_0^T |u(t)|\, dt < \infty \right\}.$$

Given $t_0 \in \mathcal{T}$ and $u \in L^1_{loc}(\mathcal{T}; U)$, we can consider the integral function

$$U(t) = \int_{t_0}^t u(s)\, ds, \quad t \in \mathcal{T},$$

where $t_0$ is a reference point within $\mathcal{T}$. As $u$ is not continuous a priori, the function $U$ is not in general differentiable; it is only absolutely continuous. Absolutely continuous functions are, however, almost everywhere (a.e.) differentiable (a concept that can be rigorously formalized). For functions belonging to $L^1_{loc}(\mathcal{T})$ the Fundamental Theorem of Calculus holds in a.e. form, i.e.

$$U'(t) = u(t) \quad \text{for a.e. } t \in \mathcal{T}.$$

Theorem 1.2.3 (Carathéodory). Let $f : \mathcal{T} \times \mathbb{R} \to \mathbb{R}$ be Lipschitz continuous with respect to space, uniformly in time, i.e., for some $L > 0$,

$$|f(t, x) - f(t, x')| \leq L |x - x'|, \quad \forall t \in \mathcal{T}, \ \forall x, x' \in \mathbb{R},$$

and such that $f(\cdot, 0) \in L^1_{loc}(\mathcal{T}; \mathbb{R})$. Let $(t_0, x_0) \in \mathcal{T} \times \mathbb{R}$. Then the Cauchy problem (1.2.1) has a unique integral solution, in the sense that there exists a unique (absolutely continuous) function $x : \mathcal{T} \to \mathbb{R}$ such that

$$x(t) = x_0 + \int_{t_0}^t f(s, x(s))\, ds, \quad t \in \mathcal{T}.$$

Remark 1.2.4. For the solution $x$ of Theorem 1.2.3 it holds that $x'(t) = f(t, x(t))$ for a.e. $t \in \mathcal{T}$.
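A minimal numerical sketch of Theorem 1.2.3: the right-hand side below is Lipschitz in the state but only piecewise continuous in time, and we check the integral form of the solution. The specific dynamics, jump time and data are illustrative assumptions, not taken from the text.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Carathéodory-type Cauchy problem: f is discontinuous in t but Lipschitz in x,
# and the unique absolutely continuous solution satisfies the integral equation.
def f(t, x):
    a = 1.0 if t < 1.0 else -2.0    # coefficient with a jump at t = 1 (illustrative)
    return a * x

t0, x0, T = 0.0, 1.0, 3.0
sol = solve_ivp(f, (t0, T), [x0], max_step=0.01, dense_output=True)

# Check x(T) = x0 + ∫_{t0}^{T} f(s, x(s)) ds numerically.
ts = np.linspace(t0, T, 2001)
xs = sol.sol(ts)[0]
integral = np.trapz([f(s, x) for s, x in zip(ts, xs)], ts)
print("x(T) from the solver        :", xs[-1])
print("x0 + integral of f(s, x(s)) :", x0 + integral)
```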

1.3 Formulation of optimal control problems

Optimal control problems (OCPs) are a special kind of dynamic constrained optimization problems. A control system is a physical or economic system whose behavior is described by a state variable and which can be controlled by a controller through a control variable; the latter enters the dynamics of the system, affecting the evolution of the state. The state and the control variables are required to satisfy an ODE, the so-called state equation, which represents the differential constraint of the control system. Optimal control aims at optimizing the behavior of the system by maximizing or minimizing a given functional of the state and the control variables.

To be more precise, the key ingredients of a continuous time optimal control problem are the following (in what follows, we denote by $x(\cdot)$, $y(\cdot)$, $c(\cdot)$, etc. real functions defined on $\mathcal{T}$).
- A control set $C \subseteq \mathbb{R}$: the set where the control variable takes values.
- The control variable $c : \mathcal{T} \to C$ and the state variable $x : \mathcal{T} \to \mathbb{R}$.
- The state equation, an ODE stating the dynamics of the state variable $x : \mathcal{T} \to \mathbb{R}$ for each given $c : \mathcal{T} \to C$ in the given set of admissible control strategies.
- A set of admissible control strategies, i.e. a subset of $\{c : \mathcal{T} \to C\}$.
- An objective functional, depending on time and on the paths of $c(\cdot)$ and $x(\cdot)$, to be optimized over $c(\cdot)$ ranging in the set of admissible strategies.

In these notes we want to deal with a class of optimal control problems that includes some important economic examples. The main goals are:

- to give a quite general formulation for a wide class of optimal control problems;
- to give a brief outline of the Dynamic Programming method used to treat such problems;
- to show how to apply this method to some economic examples.

For reasons that will become clear in the following, it is convenient to define the problem also when starting from a time $t_0 \in \mathcal{T}$ greater than 0. To this end we also define the sets

$$\mathcal{T}_{t_0} := \mathcal{T} \setminus [0, t_0), \quad t_0 \in \mathcal{T},$$

and the space

$$L^1_{loc}(\mathcal{T}_{t_0}; U) := \left\{ u : \mathcal{T}_{t_0} \to U \ : \ \int_{t_0}^R |u(t)|\, dt < \infty \ \ \forall R > t_0 \right\}.$$

1.3.1 State equation

Let $t_0 \in \mathcal{T}$ and let $C \subseteq \mathbb{R}$ be a control set. A control strategy starting at $t_0$ is a function $c(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; C)$. Given a control strategy $c(\cdot)$ starting at $t_0$, the evolution of the state variable (state trajectory) on $\mathcal{T}_{t_0}$ is determined by a state equation (SE), an ODE with dynamics specified by a function $f : \mathcal{T} \times \mathbb{R} \times C \to \mathbb{R}$. We will consider the following standing assumptions:

(H1) there exists $L \geq 0$ such that $|f(t, x, c) - f(t, x', c')| \leq L(|x - x'| + |c - c'|)$ for all $t \in \mathcal{T}$, $x, x' \in \mathbb{R}$, $c, c' \in C$;

(H2) $f(\cdot, x, c) \in L^1_{loc}(\mathcal{T}; \mathbb{R})$ for some (hence, by (H1), for all) $x \in \mathbb{R}$, $c \in C$.

Given $c(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; C)$ and an initial state $x_0 \in \mathbb{R}$, we can use (H1)-(H2) and apply Theorem 1.2.3 to get the existence and uniqueness of an absolutely continuous function solving (in the integral sense) the Cauchy problem of the ODE (state equation)

$$(SE) \qquad x'(t) = f(t, x(t), c(t)), \ t \in \mathcal{T}_{t_0}, \qquad x(t_0) = x_0.$$

The state equation is the core of what we call a control system, or a controlled dynamical system. The unique solution to (SE) will be denoted by $x(\cdot; t_0, x_0, c(\cdot))$, or simply by $x(\cdot)$ when no confusion may arise.

1.3.2 Set of admissible control strategies

The state variable may be required to lie in some interval $S \subseteq \mathbb{R}$. Then the set of all admissible control strategies $c(\cdot)$ starting from the initial couple $(t_0, x_0) \in \mathcal{T} \times \mathbb{R}$ is accordingly defined as

$$\mathcal{A}(t_0, x_0) = \left\{ c(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; C) \ : \ x(t; t_0, x_0, c(\cdot)) \in S \ \ \forall t \in \mathcal{T}_{t_0} \right\}.$$

1.3.3 Objective functional

The objective of the problem is to maximize/minimize a given functional over the set $\mathcal{A}(t_0, x_0)$. We provide a class of functionals that are commonly used. A function

$$g : \mathcal{T} \times S \times C \to \mathbb{R}$$

and, in the case $T < \infty$, another function

$$\phi : S \to \mathbb{R}$$

are given. They represent, respectively, the instantaneous performance index of the system and the payoff from the final state in the case $T < \infty$. Then we define the functional, for the case $T = \infty$,

$$J(t_0, x_0; c(\cdot)) := \int_{t_0}^{+\infty} g(t, x(t), c(t))\, dt, \quad c(\cdot) \in \mathcal{A}(t_0, x_0),$$

and, for the case $T < \infty$,

$$J(t_0, x_0; c(\cdot)) := \int_{t_0}^{T} g(t, x(t), c(t))\, dt + \phi(x(T)), \quad c(\cdot) \in \mathcal{A}(t_0, x_0).$$

Usually the time dependence of the function $g$ in economic problems is of the form $g(t, x, c) = e^{-\rho t} g_0(x, c)$, where $\rho > 0$ is a discount factor and $g_0 : S \times C \to \mathbb{R}$. The problem is then

$$(P) \qquad \text{Max/Min } J(t_0, x_0; c(\cdot)) \ \text{ over } \ c(\cdot) \in \mathcal{A}(t_0, x_0).$$

Remark 1.3.1. We always consider maximization problems here. Recalling that, for a given function $F$,

$$\max F = -\min(-F),$$

we can treat minimization problems with the same ideas.

The concept of optimality is naturally the following.

Definition 1.3.2. A control strategy $c^*(\cdot) \in \mathcal{A}(t_0, x_0)$ is called an optimal control strategy at the starting point $(t_0, x_0)$ if

$$J(t_0, x_0; c^*(\cdot)) \geq J(t_0, x_0; c(\cdot)) \quad \forall c(\cdot) \in \mathcal{A}(t_0, x_0).$$

The corresponding state trajectory $x(\cdot; t_0, x_0, c^*(\cdot))$ is called an optimal state trajectory and will often be denoted simply by $x^*(\cdot)$. The state-control couple $(x^*(\cdot), c^*(\cdot))$ is called an optimal couple.

The value function of problem (P) is the optimum of the problem:

$$V(t_0, x_0) := \sup_{c(\cdot) \in \mathcal{A}(t_0, x_0)} J(t_0, x_0; c(\cdot)). \qquad (1.3.1)$$

Remark 1.3.3. We observe that the definition of optimal control strategy at $(t_0, x_0)$ makes sense if we know that the value function is finite at that point. Of course, it can happen that $V = +\infty$ or $-\infty$ at some points. This is the case, for example, in many infinite horizon problems arising in economic applications, for some values of the parameters. In these cases one has to introduce a more general concept of optimality. We do not treat this case.

1.3.4 Feedback control strategies

The concept of feedback strategy plays a crucial role in optimal control theory. The idea of feedback is simply that of looking at the system at any time $t \in \mathcal{T}_{t_0}$, observing its current state $x(t)$, and then choosing in real time the control $c(t)$ as a function of the state (and possibly of time) at the same instant:

$$c(t) = G(t, x(t))$$

for a suitable map $G : \mathcal{T} \times \mathbb{R} \to C$. A key point is that the form of $G$ does not depend on the initial time and state $(t_0, x_0)$: this is more or less obvious in the philosophy of controlling in real time. To be more precise, we introduce the following concepts.

Definition 1.3.4. A function $G : \mathcal{T} \times \mathbb{R} \to C$ is called an admissible feedback map for problem (P) if, for any initial data $(t_0, x_0) \in \mathcal{T} \times \mathbb{R}$, the closed loop equation

$$x'(t) = f(t, x(t), G(t, x(t))), \ t \in \mathcal{T}_{t_0}, \qquad x(t_0) = x_0,$$

admits a unique solution, denoted by $x_G(\cdot; t_0, x_0)$, and the corresponding feedback control strategy

$$c_{(t_0, x_0, G)}(t) := G(t, x_G(t; t_0, x_0)), \quad t \in \mathcal{T}_{t_0},$$

is admissible, i.e. it belongs to $\mathcal{A}(t_0, x_0)$.

An admissible control strategy for problem (P) is usually called an open loop control strategy. A feedback control strategy is called a closed loop control strategy.

Definition 1.3.5. An admissible feedback map $G$ is optimal for problem (P) if, for every initial data $(t_0, x_0) \in \mathcal{T} \times \mathbb{R}$, the state-control couple $(x_G(\cdot; t_0, x_0), c_{(t_0, x_0, G)}(\cdot))$ is optimal in the sense of Definition 1.3.2.

Remark 1.3.6. Observe that, if we know an optimal feedback map $G$, we are able to optimally control the system in real time without knowing the whole input map $c_{(t_0, x_0, G)}(\cdot)$ in advance. In fact, it is enough to know $G$ and to install a feedback device that reads the state $x(t)$ and returns the value $c^*(t) = G(t, x(t))$ at any time $t$. This is a common technique in many real systems (especially in engineering).

Remark 1.3.7. The two philosophies of open loop and closed loop control strategies are substantially different, mostly because of their different use of information. Looking for open loop strategies means that at the starting point we look at the problem, assume perfect foresight of the future, and choose the optimal strategy once and for all, without changing it afterwards. On the other hand, looking for closed loop strategies means that we adjust our policy at any time, depending on our observation of the system. This is clearly a better policy in terms of the use of information, and the two methods are equivalent if we are in a deterministic world with perfect foresight.

1.4 Examples

1.4.1 Example 1: Cake's problem

1.4.2 Example 2: Optimal consumption in AK growth models

This is a classical growth model in Economic Theory. Consider an economy represented by an aggregate variable $k(t)$, the capital at time $t \in \mathcal{T}$, and denote by $c(t)$ the consumption rate at time $t \in \mathcal{T}$. We consider as state equation, describing the evolution of the economy,

$$k'(t) = A k(t) - c(t), \ t \in \mathcal{T}_{t_0}, \qquad k(t_0) = k_0, \qquad (1.4.1)$$

where $A = \tilde{A} - \delta$, with $\tilde{A} > 0$ a parameter representing the technological level of the economy and $\delta > 0$ the depreciation rate of capital. The natural control set to consider in this context is $C = \mathbb{R}_+$, which corresponds to requiring a nonnegative consumption rate. Moreover, it is natural to assume that the capital cannot become negative, i.e. to require $k(t) \in \mathbb{R}_+$ (so $S = \mathbb{R}_+$ in the notation of the previous subsection) at any $t \in \mathcal{T}_{t_0}$. Hence the set of admissible strategies starting from $(t_0, k_0) \in \mathcal{T} \times S$ is

$$\mathcal{A}(t_0, k_0) := \left\{ c(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; \mathbb{R}_+) \ : \ k(t; t_0, k_0, c(\cdot)) \in \mathbb{R}_+ \ \ \forall t \in \mathcal{T}_{t_0} \right\}.$$

The problem is

$$\max_{c(\cdot) \in \mathcal{A}(t_0, k_0)} J(t_0, k_0; c(\cdot)),$$

where

$$J(t_0, k_0; c(\cdot)) = \int_{t_0}^{T} e^{-\rho t}\, u(c(t))\, dt + e^{-\rho T} \phi(k(T)), \quad \text{if } T < +\infty,$$

$$J(t_0, k_0; c(\cdot)) = \int_{t_0}^{\infty} e^{-\rho t}\, u(c(t))\, dt, \quad \text{if } T = \infty,$$

where $\rho > 0$ is the discount rate of the agent, $u : \mathbb{R}_+ \to \mathbb{R}_+$ is the instantaneous utility from consumption, and $\phi : \mathbb{R}_+ \to \mathbb{R}_+$ is, possibly, the utility from the remaining capital stock. The functions $u$ and $\phi$ are usually chosen strictly increasing, concave and twice differentiable. A standard choice of $u$ is the so-called CES (Constant Elasticity of Substitution) utility function, given by

$$u_\sigma(c) = \frac{c^{1-\sigma} - 1}{1-\sigma} \quad \text{if } \sigma > 0, \ \sigma \neq 1, \qquad u_1(c) = \log c \quad \text{if } \sigma = 1.$$
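A minimal numerical sketch of the AK state equation under a (hypothetical) proportional consumption rule $c(t) = \theta k(t)$, with CES utility; all parameter values are illustrative, not taken from the text.

```python
import numpy as np
from scipy.integrate import solve_ivp

# AK state equation k'(t) = A k(t) - c(t) with the illustrative rule c = theta * k.
A, rho, sigma, theta = 0.08, 0.05, 2.0, 0.04
k0, T = 1.0, 50.0

def rhs(t, k):
    return A * k - theta * k               # k' = A k - c with c = theta k

sol = solve_ivp(rhs, (0.0, T), [k0], max_step=0.05, dense_output=True)
ts = np.linspace(0.0, T, 5001)
ks = sol.sol(ts)[0]
cs = theta * ks                             # consumption path (nonnegative)
u = (cs**(1.0 - sigma) - 1.0) / (1.0 - sigma)   # CES utility, sigma != 1
J = np.trapz(np.exp(-rho * ts) * u, ts)         # discounted utility over [0, T]

print("capital stays positive :", (ks > 0).all())
print("discounted utility J    :", J)
```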

1.4.3 Example 3: Optimal investment

This is a classical optimal investment problem with adjustment costs. Consider a firm that at time $t \in \mathcal{T}$ produces goods using a certain amount of capital stock $k(t)$ (i.e. the machines used for production, or the cattle) and may invest at a rate $i(t)$ at time $t$ to increase its capital. A simple model (state equation) for the evolution of $k(\cdot)$ is

$$k'(t) = -\delta k(t) + i(t), \ t \in \mathcal{T}_{t_0}, \qquad k(t_0) = k_0, \qquad (1.4.2)$$

where $\delta > 0$ is the depreciation rate of capital (the machines become older and can break down). The firm has to choose the investment strategy respecting some constraints. For example, one could assume $i(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; \mathbb{R})$, so $C = \mathbb{R}$ (allowing for negative investment, i.e. disinvestment), and then impose either no state constraint or the state constraint $k(t; t_0, k_0, i(\cdot)) \in \mathbb{R}_+$ for every $t \in \mathcal{T}_{t_0}$ (so $S = \mathbb{R}_+$); otherwise one could assume irreversibility of investment, imposing $i(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; \mathbb{R}_+)$, so $C = [0, \infty)$: in this case we automatically have $k(t; t_0, k_0, i(\cdot)) \in \mathbb{R}_+$ for every $t \in \mathcal{T}_{t_0}$ as soon as $k_0 \in \mathbb{R}_+$. That is, we may have

$$\mathcal{A}(t_0, k_0) := \left\{ i(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; \mathbb{R}) \right\},$$

or

$$\mathcal{A}(t_0, k_0) := \left\{ i(\cdot) \in L^1_{loc}(\mathcal{T}_{t_0}; \mathbb{R}) \ : \ k(t; t_0, k_0, i(\cdot)) \in \mathbb{R}_+ \right\},$$

or

$$\mathcal{A}(t_0, k_0) := L^1_{loc}(\mathcal{T}_{t_0}; \mathbb{R}_+)$$

(in this last case restricting to initial data $k_0 \in \mathbb{R}_+$, otherwise $\mathcal{A}(t_0, k_0)$ is empty). We model the behavior of the firm by assuming that it wants to maximize the discounted intertemporal profit $J(t_0, k_0; i(\cdot))$:

$$\max_{i(\cdot) \in \mathcal{A}(t_0, k_0)} J(t_0, k_0; i(\cdot)),$$

where

$$J(t_0, k_0; i(\cdot)) := \int_{t_0}^{T} e^{-\rho t} g(k(t), i(t))\, dt + e^{-\rho T} \phi(k(T)), \quad \text{if } T < +\infty,$$

$$J(t_0, k_0; i(\cdot)) := \int_{t_0}^{+\infty} e^{-\rho t} g(k(t), i(t))\, dt, \quad \text{if } T = +\infty,$$

where $\rho > 0$ is an interest rate, $g(k, i)$ gives the instantaneous profit rate for the given levels of capital stock $k$ and investment rate $i$, and $\phi(k)$ gives the profit for keeping a quantity of capital $k$ at the end of the period (e.g. its market value). The function $g : \mathbb{R}_+ \times C \to \mathbb{R}$ might have an additive form $g(k, i) = f_1(k) - f_2(i)$, where $f_1 : \mathbb{R}_+ \to \mathbb{R}$ is strictly increasing and concave and $f_2 : C \to \mathbb{R}$ is strictly convex and superlinear, i.e. $\lim_{i \to +\infty} f_2(i)/i = +\infty$ (e.g. $f_1(k) = \alpha k$, $f_2(i) = \beta i + \gamma i^2$). Similarly, $\phi$ is usually concave and strictly increasing (e.g. $\phi(k) = \delta k$).
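A minimal numerical sketch of this investment model under a (hypothetical) constant investment rate, with the additive profit form above; the parameter values are illustrative assumptions.

```python
import numpy as np

# Capital dynamics k'(t) = -delta k(t) + i(t) with a constant investment rate i,
# and profit g(k, i) = alpha*k - (beta*i + gamma*i**2). All values are illustrative.
delta, rho = 0.1, 0.05
alpha, beta, gamma = 1.0, 0.5, 0.25
k0, i_const, T = 2.0, 0.3, 60.0

ts = np.linspace(0.0, T, 6001)
# Closed form of the linear ODE with constant control: k(t) = i/delta + (k0 - i/delta) e^{-delta t}
ks = i_const / delta + (k0 - i_const / delta) * np.exp(-delta * ts)
profit = alpha * ks - (beta * i_const + gamma * i_const**2)
J = np.trapz(np.exp(-rho * ts) * profit, ts)
print("discounted profit over [0, T] ≈", J)
```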


Chapter 2
Dynamic Programming method

The starting point of the Dynamic Programming (DP) method is the idea of embedding a given optimal control problem (OCP) into a family of OCPs indexed by the initial data $(t_0, x_0)$. This means that we keep the horizon $T$ fixed and let the data $(t_0, x_0)$ vary, trying to establish a relation among such problems. The core of this relation is somehow hidden in the following sentence, which was the first formulation (in the 1950s) of the celebrated Bellman Optimality Principle:

"The remaining part of an optimal trajectory is still optimal."

The main idea of DP is the following. First, state precisely the relationship between problems with different data (Bellman's Optimality Principle). Then use this relationship (possibly in a modified form: this happens especially in the continuous time case, where its infinitesimal form is studied) to get information about optimal control strategies. The key tool to find this relationship is the value function of the problem, see (1.3.1).

Before passing to precise statements of theorems, we give an outline of the main ideas of (classical, i.e. based on the concept of classical solutions of PDEs) Dynamic Programming. The list is purely a rough indication.

1. Define the value function of the problem as in (1.3.1): this is a function of the initial time and the initial state $(t, x) = (t_0, x_0)$.
2. Find a functional equation for $V$, the so-called Dynamic Programming Principle (DPP) or Bellman principle, which is satisfied by $V$ (Theorem 2.1.1).
3. Pass to the limit in the DPP to get its differential form: a PDE called the Hamilton-Jacobi-Bellman (HJB) equation.
4. Find a solution $v$, if possible, of the HJB equation and prove that this solution is the value function $V$ through the so-called Verification Theorem (Theorem 2.3.2).
5. As a byproduct of step 4, find a (candidate) optimal feedback map and then, via the closed loop equation, the optimal couples (again Theorem 2.3.2).

The notation of this chapter will be that of Section 1.3, with the following change. From now on, the initial data $(t_0, x_0)$ will be denoted by $(t, x)$ for simplicity of notation. The running time index will be $s$ in place of $t$ and, in order to avoid confusion with the notation $x$ for the initial state, we will rename the state variable $x(\cdot)$ as $y(\cdot)$. The assumptions (H1)-(H2) on $f$ will be standing throughout the whole chapter. The value function of the problem is defined as

$$V(t, x) := \sup_{c(\cdot) \in \mathcal{A}(t, x)} J(t, x; c(\cdot)),$$

where, given an interval $S \subseteq \mathbb{R}$, we set

$$\mathcal{A}(t, x) = \left\{ c(\cdot) \in L^1_{loc}(\mathcal{T}_t; C) \ : \ y(s; t, x, c(\cdot)) \in S \ \ \forall s \in \mathcal{T}_t \right\}.$$

In the following, to simplify the treatment, we assume that:
- $S$ is an open interval;
- the objects $\int_t^{t'} g(s, y(s; t, x, c(\cdot)), c(s))\, ds$, $J(t, x; c(\cdot))$, $V(t, x)$ are well defined and finite for all $(t, x) \in \mathcal{T} \times S$, $t' \in \mathcal{T}_t$, and $c(\cdot) \in \mathcal{A}(t, x)$.

The first assumption avoids considering the case when the state trajectory may touch the boundary of the state set (on the boundary the HJB equation is not defined). The second assumption has to be checked case by case when dealing with specific problems.

2.1 Dynamic Programming Principle

We start with Bellman's Optimality Principle.

Theorem 2.1.1 (Bellman's Optimality Principle). For every $(t, x) \in \mathcal{T} \times S$ and $t' \in \mathcal{T}_t$ we have

$$V(t, x) = \sup_{c(\cdot) \in \mathcal{A}(t, x)} \left\{ \int_t^{t'} g(s, y(s; t, x, c(\cdot)), c(s))\, ds + V\big(t', y(t'; t, x, c(\cdot))\big) \right\}. \qquad (2.1.1)$$

Remark 2.1.2. The proof of the above result is based on the following properties of admissible controls.

1. For every $0 \leq t \leq t' < T$, $x \in S$,

$$c(\cdot) \in \mathcal{A}(t, x) \ \Longrightarrow \ c(\cdot)\big|_{\mathcal{T}_{t'}} \in \mathcal{A}\big(t', x(t'; t, x, c)\big)$$

(i.e. the second part of an admissible strategy is admissible).

2. For every $0 \leq t \leq t' < T$, $x \in S$,

$$c_1(\cdot) \in \mathcal{A}(t, x), \ c_2(\cdot) \in \mathcal{A}\big(t', x(t'; t, x, c_1(\cdot))\big) \ \Longrightarrow \ c \in \mathcal{A}(t, x), \quad \text{where } c(s) := \begin{cases} c_1(s), & \text{if } s \in [t, t'), \\ c_2(s), & \text{if } s \in \mathcal{T}_{t'}, \end{cases}$$

(i.e. the concatenation of two admissible strategies is admissible).

2.2 HJB equation

Equation (2.1.1) is a functional equation satisfied by the value function. It is an alternative representation of $V$ that can be useful to determine its properties or even to compute it. Of course, the functional form (2.1.1) is not easy to handle. It is convenient, then, to find a differential form of it: the so-called Hamilton-Jacobi-Bellman (HJB) equation.

Theorem 2.2.1. Assume that $g$ is uniformly continuous. Assume moreover that $V \in C^1([0, T) \times S)$. Then $V$ is a classical solution (in the sense that all derivatives exist and the equation is satisfied for every $(t, x) \in [0, T) \times S$) of the following Partial Differential Equation (PDE):

$$-v_t(t, x) = H_{max}(t, x, v_x(t, x)), \quad (t, x) \in [0, T) \times S, \qquad (2.2.1)$$

where the function $H_{max} : [0, T) \times S \times \mathbb{R} \to \mathbb{R}$ (the "maximum value Hamiltonian" or, simply, the Hamiltonian) is given by

$$H_{max}(t, x, p) := \sup_{c \in C} H_{cv}(t, x, p; c), \qquad (2.2.2)$$

where $H_{cv}$ (the "current value Hamiltonian") is

$$H_{cv}(t, x, p; c) := f(t, x, c)\, p + g(t, x, c). \qquad (2.2.3)$$

Remark 2.2.2. Equation (2.2.1) usually bears the names of Hamilton and Jacobi because such PDEs were first studied by them in connection with calculus of variations and classical mechanics. Bellman was the first to discover their relationship with control problems. We will call it the Hamilton-Jacobi-Bellman (HJB) equation.

Remark 2.2.3. The function $H_{max}(t, x, p)$ is usually called (in the mathematics literature) the Hamiltonian of the problem. However, in many cases the Hamiltonian is defined differently. In particular, in the economic literature the name Hamiltonian (or "current value Hamiltonian", while the other is the "maximum value Hamiltonian") is often used for the function to be maximized in (2.2.2). To avoid misunderstandings, we will use the notation

$$H_{cv}(t, x, p; c) := f(t, x, c)\, p + g(t, x, c)$$

for the current value Hamiltonian and

$$H_{max}(t, x, p) := \sup_{c \in C} H_{cv}(t, x, p; c)$$

for the maximum value Hamiltonian.

Remark 2.2.4. The key point of the above result is that it gives an alternative characterization of the value function in terms of the PDE (2.2.1). In fact, this provides a very powerful tool to study the properties of $V$ and to compute it by numerical analysis (at least in low dimension). Knowing $V$, one can obtain important information on the optimal state-control trajectories, as we will see below. However, to get a real characterization one needs a much stronger result: here we assumed $V \in C^1([0, T) \times S)$ and we did not get uniqueness. A good result should state that the value function $V$ is the unique solution of (2.2.1) under general hypotheses on the data. Obtaining such a result was a difficult open problem for many years, because the usual definitions of classical or generalized solutions do not adapt well to PDEs of HJB type. The problem was solved in the 1980s with the introduction of the concept of viscosity solution by Crandall and Lions. With this concept it is possible to state that the value function $V$ is the unique viscosity solution of (2.2.1) under very weak assumptions on the data.
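A minimal numerical sketch of (2.2.2)-(2.2.3): for the AK example of Section 1.4.2 (here without the discount factor), we compute $H_{max}$ by maximizing $H_{cv}$ over $c \geq 0$ on a grid and compare it with the closed form used later in Chapter 3. The state, costate and parameter values are illustrative.

```python
import numpy as np

# AK example: f(t,x,c) = A x - c,  g(t,x,c) = c^{1-sigma}/(1-sigma) (undiscounted here).
# Closed form: H_max(x, p) = A x p + sigma/(1-sigma) * p^{(sigma-1)/sigma} for p > 0.
A, sigma = 0.08, 2.0
x, p = 1.5, 0.7                        # sample state and costate values, p > 0

def H_cv(c):
    return (A * x - c) * p + c**(1.0 - sigma) / (1.0 - sigma)

c_grid = np.linspace(1e-4, 20.0, 200001)
H_max_grid = H_cv(c_grid).max()
H_max_closed = A * x * p + sigma / (1.0 - sigma) * p**((sigma - 1.0) / sigma)
c_star = p**(-1.0 / sigma)             # argmax of H_cv, as used in Section 3.1

print("H_max (grid search) :", H_max_grid)
print("H_max (closed form) :", H_max_closed)
print("argmax c* = p^{-1/sigma} =", c_star)
```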

more powerful result: here we assumed V C 1 ([0, T) S) and we did not get uniqueness. A good result should state that the value function V is the unique solution of (2.2.1) under general hypotheses on the data. Such kind of result have been a difficult problem for many years because the usual definitions of classical or generalized solution did not adapt to PDE of HJB type. The problem was solved in the 80ies with the introduction of the concept of viscosity solution by Crandall and Lions. With this concept it is possible to state that the value function V is the unique viscosity solution of (2.2.1) under very weak assumptions on the data. 2.3 Verification Theorem and optimal feedbacks The HJB equation has a crucial importance for solving the optimal control problem (P). Before to give the main result on it we prove a fundamental identity in next lemma. Lemma 2.3.1 (Fundamental identity). Let v C(T S) C 1 ([0, T) S) be a classical solution of (2.2.1) and assume that g is continuous. Then the following fundamental identity holds: for every (t, x) T S, for every c( ) A (t, x) and every t T t, setting y(s) := y(s; t, x, c( )) we have v(t, x) v(t, y(t )) = t t t + t g(s, y(s), c(s))ds (2.3.1) [H max (s, y(s), v x (s, y(s))) H cv (s, y(s), v x (s, y(s)); c(s))] ds Proof. For all s [t, t ) we calculate, using that v is a classical solution of (2.2.1) d ds v(s, y(s)) = v t (s, y(s)) + v x (s, y(s)) y (s), = H max (s, y(s), v x (s, y(s))) + f (s, y(s), c) v x (s, y(s)) = H max (s, y(s), v x (s, y(s))) + f (s, y(s), c) v x (s, y(s)) + g (s, x(s), c(s)) g (s, y(s), c(s)) = H max (s, y(s), v x (s, y(s))) + H cv (s, y(s), v x (s, x(s)); c(s)) g (s, y(s), c (s)) In the case T = we just integrating the above identity on [ t, t ] and using the continuity of v and we then get (2.3.1). In the case T < we can do the same passage and get (2.3.1) for t < T. To get (2.3.1) also for t = T one uses it for t ε = T ε and passes to the limit for ε 0 using the continuity of v, g up to T and the fact that H max H cv. Theorem 2.3.2 (Verification Theorem - Finite horizon case). Let T < and (t, x) T S. Assume that v C(T S) C 1 ([0, T) S) is a classical solution of (2.2.1) satisfying the terminal condition v(t, ) = φ( ). (2.3.2) and that g is continuous. Then we have the following claims. 1. v(t, x) V (t, x). 18

2. Assume that $c^*(\cdot) \in \mathcal{A}(t, x)$ is such that, denoting $y^*(\cdot) := y(\cdot; t, x, c^*(\cdot))$, we have (it is actually sufficient to require the equality up to a set of null measure)

$$H_{max}\big(s, y^*(s), v_x(s, y^*(s))\big) - H_{cv}\big(s, y^*(s), v_x(s, y^*(s)); c^*(s)\big) = 0, \quad \forall s \in [t, T],$$

i.e.

$$c^*(s) \in \arg\max_{c \in C} H_{cv}\big(s, y^*(s), v_x(s, y^*(s)); c\big), \quad \forall s \in [t, T]. \qquad (2.3.3)$$

Then $c^*(\cdot)$ is an optimal control strategy starting at $(t, x)$ and $v(t, x) = V(t, x)$.

3. If we know from the beginning that $v(t, x) = V(t, x)$, then every optimal strategy starting at $(t, x)$ satisfies (2.3.3).

4. Assume that:
- for every $(s, z) \in [t, T] \times S$ the map $C \to \mathbb{R}$, $c \mapsto H_{cv}(s, z, v_x(s, z); c)$, admits a unique maximum point $G(s, z) \in C$;
- the closed loop equation

$$y'(s) = f(s, y(s), G(s, y(s))), \qquad y(t) = x,$$

has a solution $y_G(\cdot)$;
- the feedback control strategy

$$c_G(s) := G(s, y_G(s)), \quad s \in [t, T],$$

belongs to $\mathcal{A}(t, x)$.

Then $v(t, x) = V(t, x)$ and $(y_G(\cdot), c_G(\cdot))$ is an optimal couple starting at $(t, x)$.

Proof. Applying Lemma 2.3.1 with $t' = T$ and the terminal condition (2.3.2), we get

$$v(t, x) - \phi(y(T)) = \int_t^T g(s, y(s), c(s))\, ds + \int_t^T \left[ H_{max}(s, y(s), v_x(s, y(s))) - H_{cv}(s, y(s), v_x(s, y(s)); c(s)) \right] ds,$$

i.e.

$$v(t, x) = J(t, x; c(\cdot)) + \int_t^T \left[ H_{max}(s, y(s), v_x(s, y(s))) - H_{cv}(s, y(s), v_x(s, y(s)); c(s)) \right] ds. \qquad (2.3.4)$$

1. As (2.3.4) holds for every $c(\cdot)$ and $H_{max}(\cdot) \geq H_{cv}(\cdot; c)$ for every $c \in C$, this shows $v \geq V$.

2. By the first item, for such a $c^*(\cdot) \in \mathcal{A}(t, x)$ we get

$$V(t, x) \geq J(t, x; c^*(\cdot)) = v(t, x) \geq V(t, x)$$

and the claim follows.

3. If $c^*(\cdot) \in \mathcal{A}(t, x)$ is optimal starting at $(t, x)$ and $v(t, x) = V(t, x)$, from (2.3.4) we get

$$V(t, x) = J(t, x; c^*(\cdot)) + \int_t^T \left[ H_{max}(s, y^*(s), v_x(s, y^*(s))) - H_{cv}\big(s, y^*(s), v_x(s, y^*(s)); c^*(s)\big) \right] ds.$$

Since the integrand in the equality above is nonnegative and $J(t, x; c^*(\cdot)) = V(t, x)$, we conclude that the integrand must be null (up to a set of null measure).

4. The control strategy constructed in this way automatically satisfies the assumptions of the second item, and we conclude.

Remark 2.3.3. In Lemma 2.3.1 and in Theorem 2.3.2 the function $v$ is not necessarily the value function. Of course, if we know (for example from Theorem 2.2.1) that the value function $V$ is a classical solution of equation (2.2.1), it is natural to choose $v = V$.

Remark 2.3.4. The above results can be used only in a few cases, however interesting. Indeed, the HJB equation (2.2.1) does not in general admit a classical solution.

Remark 2.3.5. Note that the above results do not need uniqueness of solutions of (2.2.1), and that such uniqueness can be obtained as a consequence of Theorem 2.3.2.

2.3.1 Autonomous problems with infinite horizon

When dealing with autonomous (time-homogeneous) infinite horizon problems with a discount factor, the HJB equation can be simplified. In this section we illustrate this fact. Consider the problem $(\bar{P})$ of maximizing the functional

$$J(t, x; c(\cdot)) = \int_t^{\infty} e^{-\rho s}\, g_0(y(s), c(s))\, ds, \quad \text{with } \rho > 0,$$

where $y(\cdot) := y(\cdot; t, x, c(\cdot))$ is the solution of the state equation

$$y'(s) = f_0(y(s), c(s)), \ s \in [t, +\infty), \qquad y(t) = x \in S,$$

and

$$c(\cdot) \in \mathcal{A}(t, x) := \left\{ c \in L^1_{loc}([t, +\infty); C) \ : \ y(s; t, x, c) \in S \ \ \forall s \geq t \right\}.$$

Problem $(\bar{P})$ is nothing other than problem (P) with infinite horizon, time-homogeneous dynamics $f(s, y, c) = f_0(y, c)$ and current profit $g(s, y, c) = e^{-\rho s} g_0(y, c)$. The value function satisfies

$$V(t, x) = \sup_{c(\cdot) \in \mathcal{A}(t, x)} \int_t^{\infty} e^{-\rho s}\, g_0(y(s), c(s))\, ds = e^{-\rho t} \sup_{c(\cdot) \in \mathcal{A}(t, x)} \int_t^{\infty} e^{-\rho (s - t)}\, g_0(y(s), c(s))\, ds = e^{-\rho t} \sup_{c(\cdot) \in \mathcal{A}(t, x)} \int_0^{\infty} e^{-\rho \tau}\, g_0(y(t + \tau), c(t + \tau))\, d\tau.$$

Now, $f_0$ being autonomous, we have

$$y(s + t; t, x, c(t + \cdot)) = y(s; 0, x, c(\cdot)), \quad \forall c(\cdot) \in L^1_{loc}(\mathcal{T}; C),$$

so, in particular,

$$c(t + \cdot) \in \mathcal{A}(t, x) \iff c(\cdot) \in \mathcal{A}(0, x).$$

With this observation we get

$$V(t, x) = e^{-\rho t} \sup_{c(\cdot) \in \mathcal{A}(0, x)} \int_0^{\infty} e^{-\rho \tau}\, g_0(y(\tau; 0, x, c(\cdot)), c(\tau))\, d\tau = e^{-\rho t}\, V(0, x). \qquad (2.3.5)$$

Let us write the HJB equation. The current value Hamiltonian is

$$H_{cv}(t, x, p; c) = f_0(x, c)\, p + e^{-\rho t} g_0(x, c) = e^{-\rho t} \left[ f_0(x, c)\, p\, e^{\rho t} + g_0(x, c) \right].$$

Setting

$$H^0_{cv}(x, p; c) := f_0(x, c)\, p + g_0(x, c),$$

it becomes

$$H_{cv}(t, x, p; c) = e^{-\rho t}\, H^0_{cv}(x, e^{\rho t} p; c).$$

The maximum value Hamiltonian is

$$H_{max}(t, x, p) = \sup_{c \in C} H_{cv}(t, x, p; c) = e^{-\rho t} \sup_{c \in C} H^0_{cv}(x, e^{\rho t} p; c).$$

Then, inspired by (2.3.5), we reduce the HJB equation by restricting to solutions of the form $v(t, x) = e^{-\rho t} v_0(x)$. Plugging this form into (2.2.1), it becomes

$$\rho\, e^{-\rho t}\, v_0(x) = e^{-\rho t}\, H^0_{max}\big(x, e^{\rho t} (e^{-\rho t} v_0'(x))\big),$$

where

$$H^0_{max}(x, p) = \sup_{c \in C} H^0_{cv}(x, p; c).$$

Hence we get the equation for $v_0$ formally associated to $V_0$:

$$\rho\, v_0(x) = H^0_{max}(x, v_0'(x)). \qquad (2.3.6)$$

In studying problem $(\bar{P})$ it is convenient to study (2.3.6) instead of (2.2.1), since (2.3.6) is just an ODE and $V_0$ is just a function of one variable. Due to time-stationarity, to implement the DP method we can consider the problem only for $t = 0$. Set

$$\mathcal{A}_0(x) := \mathcal{A}(0, x); \qquad J_0(x; c(\cdot)) := J(0, x; c(\cdot)); \qquad V_0(x) := V(0, x).$$
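A minimal symbolic sketch of the reduction above: plugging $v(t, x) = e^{-\rho t} v_0(x)$ into $-v_t = H_{max}(t, x, v_x)$, with $H_{max}(t, x, p) = e^{-\rho t} H^0_{max}(x, e^{\rho t} p)$, leaves exactly $\rho v_0(x) = H^0_{max}(x, v_0'(x))$. Here $H^0_{max}$ is kept as a generic symbolic function.

```python
import sympy as sp

t, x, rho = sp.symbols('t x rho', positive=True)
v0 = sp.Function('v0')
H0max = sp.Function('H0max')    # generic maximum value Hamiltonian H0_max(x, p)

v = sp.exp(-rho * t) * v0(x)
lhs = -sp.diff(v, t)                                                  # -v_t
rhs = sp.exp(-rho * t) * H0max(x, sp.exp(rho * t) * sp.diff(v, x))    # H_max(t, x, v_x)

residual = sp.simplify((lhs - rhs) * sp.exp(rho * t))
print(residual)   # rho*v0(x) - H0max(x, Derivative(v0(x), x)), i.e. equation (2.3.6)
```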

Let us see what Lemma 2.3.1 provides in this case.

Theorem 2.3.6 (Verification Theorem - Infinite horizon case). Let $T = \infty$ and $x \in S$. Assume that $v_0 \in C^1(S)$ is a classical solution of (2.3.6) such that, setting $y(s) := y(s; 0, x, c(\cdot))$, it satisfies the transversality condition

$$\lim_{t \to \infty} e^{-\rho t}\, v_0(y(t)) = 0, \quad \forall c(\cdot) \in \mathcal{A}_0(x). \qquad (2.3.7)$$

Finally, assume that $g$ is continuous. Then the following claims hold.

1. $v_0(x) \geq V_0(x)$.

2. Assume that $c^*(\cdot) \in \mathcal{A}_0(x)$ is such that, denoting $y^*(\cdot) := y(\cdot; 0, x, c^*(\cdot))$, we have (again it is sufficient to require the equality up to a set of null measure)

$$H^0_{max}\big(y^*(s), v_0'(y^*(s))\big) - H^0_{cv}\big(y^*(s), v_0'(y^*(s)); c^*(s)\big) = 0, \quad \forall s \in \mathbb{R}_+,$$

i.e.

$$c^*(s) \in \arg\max_{c \in C} H^0_{cv}\big(y^*(s), v_0'(y^*(s)); c\big), \quad \forall s \in \mathbb{R}_+. \qquad (2.3.8)$$

Then $c^*(\cdot)$ is an optimal control strategy starting at $(0, x)$ and $v_0(x) = V_0(x)$.

3. If we know from the beginning that $v_0(x) = V_0(x)$, then every optimal strategy starting at $(0, x)$ satisfies (2.3.8).

4. Assume that:
- for every $z \in S$ the map $C \to \mathbb{R}$, $c \mapsto H^0_{cv}(z, v_0'(z); c)$, admits a unique maximum point $G(z)$;
- the closed loop equation

$$y'(s) = f_0(y(s), G(y(s))), \qquad y(0) = x,$$

has a solution $y_G(\cdot)$;
- the feedback control strategy

$$c_G(s) := G(y_G(s)), \quad s \in \mathbb{R}_+,$$

belongs to $\mathcal{A}_0(x)$.

Then $v_0(x) = V_0(x)$ and $(y_G(\cdot), c_G(\cdot))$ is an optimal couple starting at $(0, x)$.

Proof. Considering (2.3.1) with $t = 0$, passing to the limit as $t' \to \infty$, and using the transversality condition (2.3.7), we get, setting $y(s) := y(s; 0, x, c(\cdot))$,

$$v_0(x) = J_0(x; c(\cdot)) + \int_0^{+\infty} e^{-\rho s} \left[ H^0_{max}\big(y(s), v_0'(y(s))\big) - H^0_{cv}\big(y(s), v_0'(y(s)); c(s)\big) \right] ds.$$

Then the proof follows the arguments of the proof of Theorem 2.3.2.

Chapter 3
Application to economic problems

Now we want to apply the DP method to our examples. First we give an outline of the main steps of the DP method in the simplest cases (i.e. when the main assumptions of the above results are verified). We will try to carry out the following steps.

1. Compute the Hamiltonians $H_{cv}$ and $H_{max}$, together with $\arg\max H_{cv}$.
2. Write the HJB equation and find an explicit classical solution $v$.
3. Compute the feedback map $G$, then solve the closed loop equation, finding the optimal state-control couple.

Remark 3.0.1. Of course, in general it is impossible to perform these steps (notably, to find an explicit solution to HJB).

3.1 AK model: finite horizon

The state equation is

$$y'(s) = A y(s) - c(s), \ s \in [t, T], \qquad y(t) = x > 0,$$

and the problem is

$$\max_{c(\cdot) \in \mathcal{A}(t, x)} J(t, x; c(\cdot)),$$

where

$$\mathcal{A}(t, x) = \left\{ c(\cdot) \in L^1([0, T]; \mathbb{R}_+) \ : \ y(s; t, x, c(\cdot)) > 0 \ \ \forall s \in [t, T] \right\}$$

and

$$J(t, x; c(\cdot)) = \int_t^T \frac{c(s)^{1-\sigma}}{1-\sigma}\, ds + \alpha\, \frac{y(T)^{1-\sigma}}{1-\sigma}, \quad \text{where } \alpha \geq 0, \ \sigma > 0$$

(we adopt the convention that, if $\sigma = 1$, we read $\frac{z^{1-\sigma}}{1-\sigma} := \log z$). The value function of the problem is

$$V(t, x) = \sup_{c(\cdot) \in \mathcal{A}(t, x)} J(t, x; c(\cdot)).$$

We treat the case $\sigma \neq 1$.

Hamiltonians
The current value Hamiltonian does not depend on $t$ and is given by

$$H_{cv}(t, x, p; c) = A x p - c p + \frac{c^{1-\sigma}}{1-\sigma}, \quad x > 0, \ p \in \mathbb{R}, \ c \in \mathbb{R}_+.$$

If $p > 0$, then the maximum point of $H_{cv}(t, x, p; \cdot)$ is attained at

$$c^*(t, x, p) := \arg\max_{c \geq 0} H_{cv}(t, x, p; c) = p^{-1/\sigma},$$

so

$$H_{max}(t, x, p) = A x p + \frac{\sigma}{1-\sigma}\, p^{\frac{\sigma-1}{\sigma}}.$$

HJB equation: classical solution
The HJB equation associated to our problem is

$$-v_t(t, x) = H_{max}(t, x, v_x(t, x)), \ (t, x) \in [0, T) \times (0, +\infty), \qquad v(T, x) = \alpha\, \frac{x^{1-\sigma}}{1-\sigma}. \qquad (3.1.1)$$

We look for a solution of the HJB equation of the form

$$v(t, x) = a(t)\, \frac{x^{1-\sigma}}{1-\sigma}, \qquad (3.1.2)$$

with $a(\cdot) > 0$. In this case $v_x > 0$ and (3.1.1) becomes

$$-v_t(t, x) = A x\, v_x(t, x) + \frac{\sigma}{1-\sigma}\, v_x(t, x)^{\frac{\sigma-1}{\sigma}}, \ (t, x) \in [0, T) \times (0, +\infty), \qquad v(T, x) = \alpha\, \frac{x^{1-\sigma}}{1-\sigma}. \qquad (3.1.3)$$

Let us plug (3.1.2) into (3.1.3). We get

$$-a'(t)\, \frac{x^{1-\sigma}}{1-\sigma} = A x\, a(t)\, x^{-\sigma} + \frac{\sigma}{1-\sigma}\, a(t)^{\frac{\sigma-1}{\sigma}} \big( x^{-\sigma} \big)^{\frac{\sigma-1}{\sigma}}, \qquad a(T)\, \frac{x^{1-\sigma}}{1-\sigma} = \alpha\, \frac{x^{1-\sigma}}{1-\sigma},$$

i.e.

$$-a'(t)\, \frac{x^{1-\sigma}}{1-\sigma} = A\, a(t)\, x^{1-\sigma} + \frac{\sigma}{1-\sigma}\, a(t)^{\frac{\sigma-1}{\sigma}} x^{1-\sigma}, \qquad a(T)\, \frac{x^{1-\sigma}}{1-\sigma} = \alpha\, \frac{x^{1-\sigma}}{1-\sigma}.$$

Dividing by $x^{1-\sigma}$ and multiplying by $1-\sigma$, we get the ODE with terminal condition

$$a'(t) = -A(1-\sigma)\, a(t) - \sigma\, a(t)^{\frac{\sigma-1}{\sigma}}, \qquad a(T) = \alpha. \qquad (3.1.4)$$

Let us look for a solution of this terminal value problem. Consider

$$b(t) := a(T - t)^{1/\sigma}.$$

We have

$$b'(t) = -\frac{1}{\sigma}\, a(T - t)^{\frac{1}{\sigma} - 1}\, a'(T - t), \quad t \in [0, T].$$

Hence, in terms of this function, the ODE above rewrites as

$$b'(t) = \frac{A(1-\sigma)}{\sigma}\, b(t) + 1, \qquad b(0) = \alpha^{1/\sigma}. \qquad (3.1.5)$$

Set

$$\delta := \frac{A(1-\sigma)}{\sigma}.$$

Then

$$b(t) = e^{\delta t}\, \alpha^{1/\sigma} + \frac{e^{\delta t} - 1}{\delta}$$

solves (3.1.5) and is strictly positive. Hence

$$a(t) = b(T - t)^{\sigma} \qquad (3.1.6)$$

solves (3.1.4) and is strictly positive, so we conclude that $v$ given by (3.1.2), where $a(\cdot)$ is given by (3.1.6), solves the HJB equation (3.1.1).

Optimal feedback control
Consider the feedback map coming from the optimization of $H_{cv}(\cdot; c)$ when $v_x(s, z)$ is plugged in place of the formal argument $p$. It is the map

$$G(s, z) = c^*(s, z, v_x(s, z)) = a(s)^{-1/\sigma}\, z,$$

where $a(\cdot)$ is given by (3.1.6). The closed loop equation associated to this feedback map is a linear ODE,

$$y'(s) = A y(s) - G(s, y(s)) = \big( A - a(s)^{-1/\sigma} \big)\, y(s), \qquad y(t) = x > 0,$$

with unique solution $y_G(\cdot; t, x)$, which is strictly positive. The feedback strategy associated to the feedback map $G$ is then

$$c_G(s) := G(s, y_G(s)), \quad s \in [t, T].$$

One now verifies that $c_G(\cdot) \in \mathcal{A}(t, x)$, as $y(s; t, x, c_G(\cdot)) = y_G(s; t, x) > 0$. Hence, by Theorem 2.3.2, we conclude that $v(t, x) = V(t, x)$ and that $(c_G(\cdot), y_G(\cdot))$ is an optimal couple for the problem starting at $(t, x)$.
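A minimal numerical sketch of the construction above: build $a(t)$ from (3.1.5)-(3.1.6), run the closed loop equation, and compare $v(t, x) = a(t)\, x^{1-\sigma}/(1-\sigma)$ with the payoff of the feedback strategy. The parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

A, sigma, alpha, T = 0.08, 2.0, 1.0, 10.0   # illustrative values
t0, x0 = 0.0, 1.0

delta = A * (1.0 - sigma) / sigma
b = lambda t: np.exp(delta * t) * alpha**(1.0 / sigma) + (np.exp(delta * t) - 1.0) / delta
a = lambda t: b(T - t)**sigma                # a(t) = b(T - t)^sigma, cf. (3.1.6)

def closed_loop(s, y):                       # y' = (A - a(s)^{-1/sigma}) y
    return (A - a(s)**(-1.0 / sigma)) * y

sol = solve_ivp(closed_loop, (t0, T), [x0], max_step=0.005, dense_output=True)
ss = np.linspace(t0, T, 4001)
ys = sol.sol(ss)[0]
cs = a(ss)**(-1.0 / sigma) * ys              # c_G(s) = a(s)^{-1/sigma} y_G(s)
J = np.trapz(cs**(1.0 - sigma) / (1.0 - sigma), ss) + alpha * ys[-1]**(1.0 - sigma) / (1.0 - sigma)
v = a(t0) * x0**(1.0 - sigma) / (1.0 - sigma)

print("v(t0, x0)       =", v)
print("J(t0, x0; c_G)  =", J)   # should be close to v(t0, x0)
```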

and we want to maximize the intertemporal discounted utility J(x; c( )) = 0 e ρt c (t)1 σ 1 σ dt, where σ (0,1). over all consumption strategies c( ) A (x) where A (x) = c( ) L 1 loc (R+ ;R + ) : k( ; x, c( )) > 0 }. The value function of the problem is We treat the case σ (0,1). V (x) = sup c( ) A (x) J(x; c( )). Hamiltonians The current value Hamiltonian Hcv 0 of our problem is H 0 cv (x, p; c) = Axp cp + c1 σ 1 σ, x > 0, p R, c R+. If p > 0, the maximum of H 0 CV (x, p; ) over R+ is attained at so c (x, p) = argmax c 0 H0 (x, p; c) = p 1/σ, H 0 max (x, p) = Axp + σ 1 σ p σ 1 σ. HJB equation: classical solution The Hamilton-Jacobi equation associated to our problem is where, if p > 0, Hmax 0 (x, p) = } sup Axp cp + c1 σ c 0 1 σ We look for a solution to the HJB equation in the form ρv(x) = H 0 max (x, v (x)) x > 0. (3.2.1) = Axp + σ 1 σ p σ 1 σ. v(x) = a x1 σ, a > 0. (3.2.2) 1 σ This is motivated by the fact that given the linearity of the state equation and the (1 σ)- homogeneity of the functional, one can actually show that V is (1 σ)-homogeneous. Let us plug (3.2.2) into (3.2.1). We get ρa x1 σ 1 σ = Axax σ + σ 1 σ (ax σ ) σ 1 σ [ = x 1 σ Aa + σ ] 1 σ a σ 1 σ 26

from which we see that (3.2.2) solves (3.2.1) if (and only if)

$$a = \left[ \frac{\rho - A(1-\sigma)}{\sigma} \right]^{-\sigma}, \qquad (3.2.3)$$

provided that

$$\rho > A(1-\sigma). \qquad (3.2.4)$$

We take the latter as an assumption. Note that, since $c(\cdot) \geq 0$, we have a maximal growth for $y(\cdot)$: indeed

$$y(s; 0, x; c(\cdot)) \leq y_M(s) := y(s; 0, x; c(\cdot) \equiv 0) = x e^{A s}, \quad s \in \mathbb{R}_+. \qquad (3.2.5)$$

Hence, setting $y(s) := y(s; 0, x, c(\cdot))$, we have by (3.2.4)

$$0 \leq \lim_{s \to \infty} e^{-\rho s}\, v(y(s)) \leq \lim_{s \to \infty} e^{-\rho s}\, v(y_M(s)) = \lim_{s \to \infty} a\, \frac{x^{1-\sigma}}{1-\sigma}\, e^{(A(1-\sigma) - \rho) s} = 0,$$

so the transversality condition (2.3.7) is fulfilled: we can apply Theorem 2.3.6.

Optimal control in feedback form
Consider the feedback map coming from the optimization of $H^0_{cv}(\cdot; c)$ when $v'(z)$ is plugged in place of the formal argument $p$. It is the linear map

$$z \mapsto G(z) = c^*(z, v'(z)) = \frac{\rho - A(1-\sigma)}{\sigma}\, z, \quad z > 0.$$

The closed loop equation associated to this feedback map is linear,

$$y'(s) = A y(s) - G(y(s)) = \frac{A - \rho}{\sigma}\, y(s), \qquad y(0) = x > 0,$$

with unique solution

$$y_G(s) = x\, e^{\frac{A - \rho}{\sigma} s}.$$

Note that $y_G(s) > 0$ for every $s \in \mathbb{R}_+$, so the feedback strategy associated to the feedback map $G$,

$$c_G(s) := G(y_G(s)) = \frac{\rho - A(1-\sigma)}{\sigma}\, x\, e^{\frac{A - \rho}{\sigma} s},$$

belongs to $\mathcal{A}(x)$. Then, by Theorem 2.3.6, we deduce that $v(x) = V(x)$ and that $(c_G(\cdot), y_G(\cdot))$ is an optimal couple for the problem starting at $x$.
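A minimal numerical sketch of Section 3.2: with illustrative parameters satisfying $\rho > A(1-\sigma)$, we compare $v(x) = a\, x^{1-\sigma}/(1-\sigma)$, with $a$ from (3.2.3), against the discounted utility of the feedback strategy $c_G$.

```python
import numpy as np

A, rho, sigma, x = 0.08, 0.06, 0.5, 1.0     # illustrative values
assert rho > A * (1.0 - sigma)

a = ((rho - A * (1.0 - sigma)) / sigma) ** (-sigma)
v = a * x**(1.0 - sigma) / (1.0 - sigma)

s = np.linspace(0.0, 500.0, 500001)          # long horizon to approximate the integral
yG = x * np.exp((A - rho) / sigma * s)
cG = (rho - A * (1.0 - sigma)) / sigma * yG
J = np.trapz(np.exp(-rho * s) * cG**(1.0 - sigma) / (1.0 - sigma), s)

print("v(x)       =", v)
print("J(x; c_G)  ≈", J)   # should be close to v(x)
```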

3.3 Optimal investment model: infinite horizon

Let us consider the classical optimal investment problem with quadratic adjustment costs and linear production function. The state equation is

$$y'(s) = -\delta y(s) + c(s), \ s \in \mathbb{R}_+, \qquad y(0) = x > 0,$$

where $\delta > 0$, and we want to maximize

$$J(x; c(\cdot)) := \int_0^{\infty} e^{-\rho t} \left( \alpha y(t) - \frac{\gamma}{2} c^2(t) \right) dt, \quad \alpha, \gamma, \rho > 0,$$

over the set of admissible strategies

$$\mathcal{A}(x) := L^1_{loc}(\mathbb{R}_+; [0, M]),$$

where $M > 0$. The value function is

$$V(x) = \sup_{c(\cdot) \in \mathcal{A}(x)} J(x; c(\cdot)).$$

Hamiltonians
The current value Hamiltonian is

$$H^0_{cv}(x, p; c) = (-\delta x + c)\, p + \alpha x - \frac{\gamma}{2} c^2 = -\delta x p + \alpha x + c p - \frac{\gamma}{2} c^2.$$

Assuming $p \geq 0$, the unique maximum point of $H^0_{cv}(x, p; \cdot)$ over $[0, M]$ is

$$c^*(x, p) = \begin{cases} \dfrac{p}{\gamma}, & \text{if } \dfrac{p}{\gamma} \leq M, \\[2mm] M, & \text{if } \dfrac{p}{\gamma} > M. \end{cases}$$

Therefore the maximum value Hamiltonian is

$$H^0_{max}(x, p) = \begin{cases} -\delta x p + \alpha x + \dfrac{p^2}{2\gamma}, & \text{if } \dfrac{p}{\gamma} \leq M, \\[2mm] -\delta x p + \alpha x + M p - \dfrac{\gamma}{2} M^2, & \text{if } \dfrac{p}{\gamma} > M. \end{cases}$$

HJB equation: classical solution
The HJB equation is

$$\rho\, v(x) = \begin{cases} -\delta x\, v'(x) + \alpha x + \dfrac{(v'(x))^2}{2\gamma}, & \text{if } 0 \leq v'(x) \leq \gamma M, \\[2mm] -\delta x\, v'(x) + \alpha x + M v'(x) - \dfrac{\gamma}{2} M^2, & \text{if } v'(x) > \gamma M, \end{cases} \quad x > 0. \qquad (3.3.1)$$

We look for a solution in affine form:

$$v(x) = a x + b, \quad a \in [0, \gamma M], \ b \geq 0. \qquad (3.3.2)$$

Plugging (3.3.2) into (3.3.1) we get

$$\rho (a x + b) = -\delta x\, a + \alpha x + \frac{a^2}{2\gamma}, \quad x > 0,$$

i.e.

$$\left[ (\rho + \delta)\, a - \alpha \right] x + \rho b - \frac{a^2}{2\gamma} = 0, \quad x > 0.$$

Hence, under the assumption $\frac{\alpha}{\rho + \delta} \in [0, \gamma M]$, the function $v$ given in (3.3.2) with

$$a = \frac{\alpha}{\rho + \delta}, \qquad b = \frac{a^2}{2\gamma\rho}$$

solves (3.3.1).
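A minimal numerical sketch checking the affine guess: with $a = \alpha/(\rho+\delta)$ and $b = a^2/(2\gamma\rho)$ as reconstructed above, the HJB residual in the interior regime should vanish for every $x$. Parameter values are illustrative.

```python
import numpy as np

delta, rho, alpha, gamma, M = 0.1, 0.05, 1.0, 2.0, 5.0   # illustrative values
a = alpha / (rho + delta)
b = a**2 / (2.0 * gamma * rho)
assert 0.0 <= a <= gamma * M      # interior regime: c* = v'(x)/gamma

xs = np.linspace(0.1, 10.0, 100)
v = a * xs + b
residual = rho * v - (-delta * xs * a + alpha * xs + a**2 / (2.0 * gamma))
print("max |HJB residual| =", np.abs(residual).max())    # ≈ 0 up to round-off
```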

Optimal control in feedback form
The feedback map is now simply constant:

$$G(z) = c^*(z, v'(z)) = \frac{a}{\gamma} =: \bar{c} \in [0, M].$$

The corresponding closed loop equation has solution

$$y_G(s) = \frac{\bar{c}}{\delta} + e^{-\delta s} \left[ x - \frac{\bar{c}}{\delta} \right], \quad s \in \mathbb{R}_+. \qquad (3.3.3)$$

The "feedback" (it is constant!) control strategy associated to $G$ is

$$c_G(s) = \bar{c}, \quad s \in \mathbb{R}_+.$$

The transversality condition (2.3.7) is easily checked. Therefore $v(x) = V(x)$ and the couple $(y_G(\cdot), c_G(\cdot))$ is optimal starting at $x$.

3.4 Exercises

Exercise 3.4.1. Solve the problem of Section 3.1 in the case $\sigma = 1$, i.e. with

$$J(t, x; c(\cdot)) = \int_t^T \log c(s)\, ds + \alpha \log(y(T)), \quad \text{where } \alpha \geq 0.$$

(Hint: Guess a solution to the HJB equation of the form $v(t, x) = a(t) \log x + b(t)$ with $a(\cdot) > 0$.)

Solution. The current value Hamiltonian does not depend on $t$ and is given by

$$H_{cv}(t, x, p; c) = A x p - c p + \log c, \quad x > 0, \ p \in \mathbb{R}, \ c \in \mathbb{R}_+.$$

If $p > 0$, then the maximum point of $H_{cv}(t, x, p; \cdot)$ is attained at

$$c^*(t, x, p) := \arg\max_{c \geq 0} H_{cv}(t, x, p; c) = p^{-1},$$

so

$$H_{max}(t, x, p) = A x p - 1 - \log p.$$

The HJB equation associated to our problem is

$$-v_t(t, x) = H_{max}(t, x, v_x(t, x)), \ (t, x) \in [0, T) \times (0, +\infty), \qquad v(T, x) = \alpha \log x. \qquad (3.4.1)$$

We look for a solution of the HJB equation of the form

$$v(t, x) = a(t) \log x + b(t), \qquad (3.4.2)$$

with $a(\cdot) > 0$. In this case $v_x > 0$ and (3.4.1) becomes

$$-v_t(t, x) = A x\, v_x(t, x) - 1 - \log v_x(t, x), \ (t, x) \in [0, T) \times (0, +\infty), \qquad v(T, x) = \alpha \log x. \qquad (3.4.3)$$

Let us plug (3.4.2) into (3.4.3). We get

$$-a'(t) \log x - b'(t) = A x\, \frac{a(t)}{x} - 1 - \log \frac{a(t)}{x}, \qquad a(T) \log x + b(T) = \alpha \log x,$$

i.e.

$$-a'(t) \log x - b'(t) = A\, a(t) - 1 - \log a(t) + \log x, \qquad a(T) \log x + b(T) = \alpha \log x.$$

Equating the terms containing $\log x$ and those which do not contain it, we get two ODEs with terminal conditions:

$$a'(t) = -1, \quad a(T) = \alpha; \qquad b'(t) = -A\, a(t) + 1 + \log a(t), \quad b(T) = 0.$$

From the first one we get

$$a(t) = \alpha + T - t. \qquad (3.4.4)$$

Plugging this expression into the second one we get

$$b'(t) = -A(\alpha + T - t) + 1 + \log(\alpha + T - t), \qquad b(T) = 0,$$

hence

$$b(t) = \int_t^T \left[ A(\alpha + T - s) - 1 - \log(\alpha + T - s) \right] ds,$$

which can be explicitly computed, yielding ...

Consider the feedback map coming from the optimization of $H_{cv}(\cdot; c)$ when $v_x(s, z)$ is plugged in place of the formal argument $p$. It is the map

$$G(s, z) = c^*(s, z, v_x(s, z)) = \frac{z}{a(s)} = \frac{z}{\alpha + T - s}.$$

The closed loop equation associated to this feedback map is a linear ODE,

$$y'(s) = A y(s) - G(s, y(s)) = \left( A - \frac{1}{\alpha + T - s} \right) y(s), \qquad y(t) = x > 0,$$

with unique solution $y_G(\cdot; t, x)$, which is strictly positive. The feedback strategy associated to the feedback map $G$ is then

$$c_G(s) := G(s, y_G(s)), \quad s \in [t, T].$$

One now verifies that $c_G(\cdot) \in \mathcal{A}(t, x)$, as $y(s; t, x, c_G(\cdot)) = y_G(s; t, x) > 0$. Hence, by Theorem 2.3.2, we conclude that $v(t, x) = V(t, x)$ and that $(c_G(\cdot), y_G(\cdot))$ is an optimal couple for the problem starting at $(t, x)$.
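A minimal numerical sketch of Exercise 3.4.1: with $a(s) = \alpha + T - s$ and $b(t)$ as reconstructed above (computed here by numerical integration rather than in closed form), we compare $v(t, x) = a(t)\log x + b(t)$ with the payoff of the feedback strategy $c_G(s) = y_G(s)/a(s)$. Parameter values are illustrative.

```python
import numpy as np
from scipy.integrate import solve_ivp, quad

A, alpha, T = 0.08, 1.0, 10.0        # illustrative values
t0, x0 = 0.0, 1.0

a = lambda s: alpha + T - s
b_t0 = quad(lambda s: A * a(s) - 1.0 - np.log(a(s)), t0, T)[0]
v = a(t0) * np.log(x0) + b_t0

sol = solve_ivp(lambda s, y: (A - 1.0 / a(s)) * y, (t0, T), [x0],
                max_step=0.005, dense_output=True)
ss = np.linspace(t0, T, 4001)
ys = sol.sol(ss)[0]
cs = ys / a(ss)                       # c_G(s) = y_G(s) / a(s)
J = np.trapz(np.log(cs), ss) + alpha * np.log(ys[-1])

print("v(t0, x0)       =", v)
print("J(t0, x0; c_G)  =", J)        # should be close to v(t0, x0)
```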

Exercise 3.4.2. Under the condition

$$0 < \rho < A,$$

solve the dynamic optimization problem

$$\max_{u(\cdot) \in \mathcal{A}(x)} J(x; u(\cdot)),$$

where

$$J(x; u(\cdot)) = \int_0^{+\infty} e^{-\rho t} \log(u(t)\, y(t))\, dt,$$

with state equation

$$y'(t) = (A - u(t))\, y(t), \qquad y(0) = x,$$

and admissible set of control strategies $\mathcal{A}(x) := L^1_{loc}(\mathbb{R}_+; [0, A])$.

(Hint: Guess a solution to the HJB equation of the form $v(x) = a \log x + b$ with $a > \frac{1}{A}$.)

Solution. The current value Hamiltonian $H^0_{cv}$ of our problem is

$$H^0_{cv}(x, p; u) = A x p - u x p + \log(u x), \quad x > 0, \ p \in \mathbb{R}, \ u \in [0, A].$$

If $(x p)^{-1} \in (0, A]$, the maximum of $H^0_{cv}(x, p; \cdot)$ over $[0, A]$ is attained at

$$u^*(x, p) = (x p)^{-1},$$

so

$$H^0_{max}(x, p) = A x p - 1 - \log p.$$

The Hamilton-Jacobi-Bellman equation associated to our problem is

$$\rho\, v(x) = H^0_{max}(x, v'(x)), \quad x > 0, \qquad (3.4.5)$$

where, if $(x v'(x))^{-1} \in (0, A]$,

$$H^0_{max}(x, v'(x)) = A x\, v'(x) - 1 - \log v'(x).$$

We look for a solution of the HJB equation of the form

$$v(x) = a \log x + b, \quad a > \frac{1}{A}, \ b \in \mathbb{R}, \qquad (3.4.6)$$

in which case $(x v'(x))^{-1} = 1/a \in (0, A]$. Let us plug (3.4.6) into (3.4.5). We get

$$\rho (a \log x + b) = A a - 1 + \log x - \log a,$$

from which we see that (3.4.6) solves (3.4.5) if (and only if)

$$a = \frac{1}{\rho} > \frac{1}{A}, \qquad b = \frac{1}{\rho} \left[ \frac{A}{\rho} - 1 + \log \rho \right]. \qquad (3.4.7)$$

Note that, since $u(\cdot) \geq 0$, we have a maximal growth for $y(\cdot)$: indeed

$$y(s; 0, x; u(\cdot)) \leq y_M(s) := y(s; 0, x; u(\cdot) \equiv 0) = x e^{A s}, \quad s \in \mathbb{R}_+. \qquad (3.4.8)$$

On the other hand, since $u(\cdot) \leq A$, we also have a minimal growth for $y(\cdot)$: indeed

$$y(s; 0, x; u(\cdot)) \geq y_m(s) := y(s; 0, x; u(\cdot) \equiv A) = x, \quad s \in \mathbb{R}_+. \qquad (3.4.9)$$

Hence, setting $y(s) := y(s; 0, x, u(\cdot))$, we have

$$0 = \lim_{s \to \infty} e^{-\rho s} (a \log x + b) = \lim_{s \to \infty} e^{-\rho s} v(y_m(s)) \leq \lim_{s \to \infty} e^{-\rho s} v(y(s)) \leq \lim_{s \to \infty} e^{-\rho s} v(y_M(s)) = \lim_{s \to \infty} e^{-\rho s} (a \log x + a A s + b) = 0,$$

so the transversality condition (2.3.7) is fulfilled: we can apply Theorem 2.3.6.

Consider the feedback map coming from the optimization of $H^0_{cv}(\cdot; u)$ when $v'(z)$ is plugged in place of the formal argument $p$. It is the constant map

$$z \mapsto G(z) = u^*(z, v'(z)) \equiv \rho.$$

The closed loop equation associated to this "feedback" (it is actually constant!) map is

$$y'(s) = (A - \rho)\, y(s), \qquad y(0) = x > 0,$$

with unique solution

$$y_G(s) = x\, e^{(A - \rho) s} > 0.$$

Note that the "feedback" control strategy associated to the feedback map $G$,

$$u_G(s) := G(y_G(s)) \equiv \rho \in (0, A],$$

belongs to $\mathcal{A}(x)$. Then, by Theorem 2.3.6, we deduce that $v(x) = V(x)$ and that $(u_G(\cdot), y_G(\cdot))$ is an optimal couple for the problem starting at $x$.

Exercise 3.4.3. Under the condition

$$0 < \rho < A,$$

solve the problem of Section 3.2 in the case $\sigma = 1$, i.e. with

$$J(x; c(\cdot)) = \int_0^{+\infty} e^{-\rho t} \log c(t)\, dt,$$

over the set of admissible control strategies

$$\mathcal{A}(x) := \left\{ c(\cdot) \in L^1_{loc}(\mathbb{R}_+; \mathbb{R}_+) \ : \ c(t) \leq A\, y(t; 0, x, c(\cdot)) \ \ \forall t \in \mathbb{R}_+ \right\}.$$

Comment: Note that the problem has a mixed state-control constraint:

$$(c(t), y(t)) \in W := \left\{ (\alpha, \beta) \in \mathbb{R}^2 : 0 \leq \alpha \leq A \beta \right\} \quad \forall t \geq 0.$$

We have not treated this kind of constraint.

Hint: Reduce the problem to the one of Exercise 3.4.2 by using the change of control variable $c(t) = u(t)\, y(t)$.

Note that, since u( ) 0, we have a maximal growth for y( ): indeed y(s;0, x; c( )) y M (s) := y(s;0, x; u( ) 0) = xe As, s R +. (3.4.8) On the other hand, since u( ) A we have also a minimal growth for y( ): indeed y(s;0, x; c( )) y m (s) := y(s;0, x; u( ) A) x, s R +. (3.4.9) Hence, setting y(s) := y(s;0, x, c( )), we have 0 = lim s e ρs (alog x + b) = lim s e ρs v(y m (s)) lim e ρs v(y(s)) lim e ρs v(y M (s)) = lim e ρs (alog e As + b) = 0, s s s so the transversality condition (2.3.7) is fulfilled: we can apply Theorem 2.3.6. Consider the feedback map coming from the optimization of H 0 cv (p; c) when v (z) is plugged in place of the formal argument p. It is the constant map z G(z) = u (z, v (z)) ρ. The closed loop equation associated to this feedback" (it is actually constant!) map is y (s) = (A ρ)y(s), with unique solution y(0) = x > 0, y G (s) = xe (A ρ)s > 0. Note feedback" control strategy associated to the feedback map G c G (s) := G(y G (s)) ρ (0, A] belongs to A (x). Then, by Theorem 2.3.6, we deduce that v(x) = V (x) and that (c G ( ), y G ( )) is an optimal couple for the problem starting at x. Exercise 3.4.3. Under the condition 0 < ρ < A, solve the problem of Section 3.2 in the case σ = 1, i.e. with J(x; c( )) = + over the set of admissible control strategies 0 e ρt log c(t)dt, A (x) := L 1 loc (R+ ;R + ) : c(t) A y(t;0, x, c( )) t R + }. Comment: Note that the problem has a mixed state-control constraint: (c(t), y(t)) W := (α,β) R 2 : 0 α Aβ} t 0. We have not treated this kind of constraint. Hint: Reduce the problem to the one of Exercise 3.4.2 by using the change of control variable c(t) = u(t)y(t). 32