DYNAMIC LECTURE 5: DISCRETE TIME INTERTEMPORAL OPTIMIZATION
UNIVERSITY OF MARYLAND: ECON 600

Date: Summer 2013. Notes compiled on August 23, 2013.

1. Alternative Methods of Discrete Time Intertemporal Optimization

We will start by solving a discrete time intertemporal optimization problem using two simple methods: the method of substitution and the Lagrange method. We will then study in more detail the Maximum Principle and the dynamic programming approach. Suppose that an individual solves the following problem:

    max_{{c_t, a_{t+1}}_{t=0}^T}  Σ_{t=0}^T β^t u(c_t)
    s.t.  a_{t+1} = (1+r)(a_t + y_t − c_t),  t = 0, 1, ..., T,
          r > 0,  a_0 given

where a_t denotes assets (or wealth) held at the beginning of period t, y_t is labor income in period t, c_t denotes consumption expenditure incurred in period t, β is the discount factor, r is the interest rate, and u(·) represents the period-by-period utility function, assumed to be twice continuously differentiable, strictly increasing and strictly concave. We also assume that lim_{c_t→0} u′(c_t) = ∞.

1.1. The method of substitution. Substitute the period-by-period budget constraint into the objective function to get:

    max_{{a_{t+1}}_{t=0}^T}  Σ_{t=0}^T β^t u( a_t + y_t − a_{t+1}/(1+r) )

Now we have an unconstrained optimization problem in the decision (or choice) variables a_{t+1}, t = 0, 1, ..., T. Since the objective function is strictly concave, the first-order conditions will be necessary and sufficient to determine the unique global maximum point. The FOC for a_{t+1} is:

    β^t u′( a_t + y_t − a_{t+1}/(1+r) ) · 1/(1+r) = β^{t+1} u′( a_{t+1} + y_{t+1} − a_{t+2}/(1+r) )

or

    u′(c_t) = β(1+r) u′(c_{t+1})
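The substitution method lends itself to a direct numerical check. The sketch below is illustrative only: the parameter values, and the CRRA utility used to make the problem concrete (it is introduced formally below), are assumptions, not part of the notes. It solves a short-horizon version of the problem by optimizing over the asset sequence {a_{t+1}} with a_{T+1} = 0 imposed, then verifies that the Euler equation u′(c_t) = β(1+r)u′(c_{t+1}) holds at the numerical optimum.

    import numpy as np
    from scipy.optimize import minimize

    # Illustrative parameters (assumptions, not from the notes)
    T, beta, r, sigma, a0 = 10, 0.96, 0.04, 2.0, 1.0
    y = np.ones(T + 1)  # constant labor income y_t = 1

    def u_prime(c):
        return c ** (-sigma)  # CRRA marginal utility

    def objective(a_next):  # a_next = (a_1, ..., a_T); a_{T+1} = 0 is imposed
        a = np.concatenate(([a0], a_next, [0.0]))        # full asset path a_0, ..., a_{T+1}
        c = a[:-1] + y - a[1:] / (1 + r)                 # from a_{t+1} = (1+r)(a_t + y_t - c_t)
        if np.any(c <= 0):
            return np.inf                                # infeasible consumption
        t = np.arange(T + 1)
        return -np.sum(beta ** t * c ** (1 - sigma) / (1 - sigma))

    res = minimize(objective, x0=np.full(T, a0), method="Nelder-Mead",
                   options={"maxiter": 50000, "xatol": 1e-10, "fatol": 1e-12})
    a = np.concatenate(([a0], res.x, [0.0]))
    c = a[:-1] + y - a[1:] / (1 + r)
    # Euler check: each ratio u'(c_t) / [beta(1+r)u'(c_{t+1})] should be approximately 1
    print(u_prime(c[:-1]) / (beta * (1 + r) * u_prime(c[1:])))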

From the period-by-period budget constraint, we can obtain the lifetime budget constraint:

    Σ_{t=0}^T (1/(1+r))^t c_t + (1/(1+r))^{T+1} a_{T+1} = a_0 + Σ_{t=0}^T (1/(1+r))^t y_t

If the individual cannot die with unpaid debts, a_{T+1} ≥ 0, then a_{T+1} = 0 will always hold, since it would not be optimal for the individual to die with unused resources. Let us specify a functional form for u(c_t):

    u(c_t) = c_t^{1−σ}/(1−σ),  σ > 0

Then the Euler equation becomes:

    c_{t+1} = [β(1+r)]^{1/σ} c_t

Note that time-iterating using the Euler equation yields

    c_1 = [β(1+r)]^{1/σ} c_0
    c_2 = [β(1+r)]^{1/σ} c_1 = [β(1+r)]^{2/σ} c_0
    ...
    c_t = [β(1+r)]^{t/σ} c_0

Replacing the expression for c_t into the lifetime budget constraint yields (after some algebra):

    c_0 = [ (1 − β^{1/σ}(1+r)^{(1−σ)/σ}) / (1 − [β^{1/σ}(1+r)^{(1−σ)/σ}]^{T+1}) ] · [ a_0 + Σ_{t=0}^T (1/(1+r))^t y_t ]

In the limit as T → ∞, and assuming β^{1/σ}(1+r)^{(1−σ)/σ} < 1,

    c_0 = (1 − β^{1/σ}(1+r)^{(1−σ)/σ}) [ a_0 + Σ_{t=0}^∞ (1/(1+r))^t y_t ]

If we need to solve an infinite horizon problem, it is usually simpler to solve the infinite horizon problem directly, instead of taking the limit of the finite horizon solution. Consider the infinite horizon problem:

    max_{{c_t, a_{t+1}}}  Σ_{t=0}^∞ β^t u(c_t)
    s.t.  a_{t+1} = (1+r)(a_t + y_t − c_t)

Using the same process of substitution as in the finite time case, we can derive the Euler equation

    u′(c_t) = β(1+r) u′(c_{t+1})
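For a quick numerical read on these closed forms, the sketch below (parameter values are assumptions chosen only for illustration) computes c_0 for both horizons and confirms that the implied consumption path exhausts lifetime resources, i.e. that the lifetime budget constraint holds with a_{T+1} = 0.

    import numpy as np

    T, beta, r, sigma, a0 = 10, 0.96, 0.04, 2.0, 1.0
    y = np.ones(T + 1)  # constant labor income, an illustrative assumption

    q = beta ** (1 / sigma) * (1 + r) ** ((1 - sigma) / sigma)   # growth factor of discounted consumption
    wealth = a0 + np.sum(y / (1 + r) ** np.arange(T + 1))        # PDV of lifetime resources

    c0_finite = (1 - q) / (1 - q ** (T + 1)) * wealth
    c0_infinite = (1 - q) * (a0 + (1 + r) / r)   # with y_t = 1 forever, sum_t (1+r)^{-t} = (1+r)/r

    # Budget check: the Euler-consistent path c_t = [beta(1+r)]^{t/sigma} c_0 uses up all wealth
    c = c0_finite * (beta * (1 + r)) ** (np.arange(T + 1) / sigma)
    print(np.isclose(np.sum(c / (1 + r) ** np.arange(T + 1)), wealth))  # True
    print(c0_finite, c0_infinite)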

As for the lifetime budget constraint, we need an additional condition, which is referred to as the transversality condition:

    lim_{T→∞} (1/(1+r))^{T+1} a_{T+1} = 0

What is the economic meaning of that condition? If lim_{T→∞} (1/(1+r))^{T+1} a_{T+1} < 0, the present discounted value of the individual's lifetime expenditure would be greater than the present discounted value of her lifetime resources by an amount that does not converge to zero. Her debt grows at a rate that is at least as great as the interest rate. To rule out that possibility, we impose the no-Ponzi-game condition:

    lim_{T→∞} (1/(1+r))^{T+1} a_{T+1} ≥ 0

On the other hand, if lim_{T→∞} (1/(1+r))^{T+1} a_{T+1} > 0, the present discounted value of the individual's lifetime expenditures is lower than the present discounted value of her lifetime resources by an amount that does not converge to zero. That means that the individual could increase her lifetime utility by consuming more. Therefore, under the no-Ponzi-game condition, the transversality condition must hold, and the lifetime budget constraint is:

    Σ_{t=0}^∞ (1/(1+r))^t c_t = a_0 + Σ_{t=0}^∞ (1/(1+r))^t y_t

which, combined with the Euler equation, yields the same expression for c_0 as we derived using the limiting argument. Show this.

1.2. The Lagrange method. The finite horizon problem stated above, taking into account the terminal non-indebtedness condition, can be written as:

    max_{{c_t}_{t=0}^T}  Σ_{t=0}^T β^t u(c_t)
    s.t.  Σ_{t=0}^T (1/(1+r))^t c_t + (1/(1+r))^{T+1} a_{T+1} = a_0 + Σ_{t=0}^T (1/(1+r))^t y_t
    and   (1/(1+r))^{T+1} a_{T+1} ≥ 0

We argued earlier that (1/(1+r))^{T+1} a_{T+1} = 0 using economic reasoning. That condition will be formally implied by the Kuhn-Tucker theorem when we use the Lagrange method. Note that the two constraints can be combined as

    a_0 + Σ_{t=0}^T (1/(1+r))^t y_t − Σ_{t=0}^T (1/(1+r))^t c_t ≥ 0

and the Lagrangian can be written as

    L = Σ_{t=0}^T β^t u(c_t) + λ [ a_0 + Σ_{t=0}^T (1/(1+r))^t y_t − Σ_{t=0}^T (1/(1+r))^t c_t ]

The first-order conditions can be written as

    β^t u′(c_t) = λ (1/(1+r))^t,  t = 0, 1, ..., T
    ∂L/∂λ = a_0 + Σ_{t=0}^T (1/(1+r))^t y_t − Σ_{t=0}^T (1/(1+r))^t c_t ≥ 0,  λ ≥ 0,  λ ∂L/∂λ = 0

The first condition at time t and t+1 can be used to derive the Euler equation

    u′(c_t) = β(1+r) u′(c_{t+1})

Note that λ = u′(c_0), and therefore the shadow value of the lifetime budget constraint is equal to the marginal utility of consumption at t = 0. Also, from the last FONC we can see that unless λ = u′(c_0) = 0, which cannot be true given economic scarcity, the lifetime budget constraint must hold with equality (a_{T+1} = 0). Then:

    (1/(1+r))^{T+1} a_{T+1} = 0

Thus, by the Kuhn-Tucker theorem, we get the same solution for consumption as the substitution method. We can also use the complementary slackness condition to derive the transversality condition for the infinite horizon problem. In the limit, as T → ∞, we must have:

    lim_{T→∞} (1/(1+r))^{T+1} a_{T+1} = 0
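The first condition says that β^t (1+r)^t u′(c_t) is constant across t and equal to λ = u′(c_0) along the optimal path. A small check under the same illustrative CRRA parameterization used above (an assumption, not part of the notes):

    import numpy as np

    T, beta, r, sigma, a0 = 10, 0.96, 0.04, 2.0, 1.0
    y = np.ones(T + 1)

    q = beta ** (1 / sigma) * (1 + r) ** ((1 - sigma) / sigma)
    wealth = a0 + np.sum(y / (1 + r) ** np.arange(T + 1))
    c0 = (1 - q) / (1 - q ** (T + 1)) * wealth
    c = c0 * (beta * (1 + r)) ** (np.arange(T + 1) / sigma)  # Euler-consistent consumption path

    t = np.arange(T + 1)
    lam = beta ** t * (1 + r) ** t * c ** (-sigma)  # beta^t (1+r)^t u'(c_t), one value per t
    print(np.allclose(lam, c0 ** (-sigma)))         # True: the multiplier equals u'(c_0)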

1.3. The Maximum Principle. Intertemporal optimization problems often have a special structure that allows us to characterize their solutions in a certain way. The most important aspect of that structure is the existence of stock-flow relationships among the variables. We will use x_t to denote the stock variables (or state variables, in the mathematical terminology), measured at the beginning of period t, and u_t to denote the flow variables (or control variables). Economic activity in one period determines the changes in stocks from that period to the next. Therefore, increments to stocks depend on both the stocks and the flows during this period:

    x_{t+1} − x_t = f_t(x_t, u_t)

In addition to the constraints that govern the changes in the state variables, there may be constraints on the variables pertaining to any one time period, such as:

    g_t(x_t, u_t) ≥ 0

Constraints for stocks and flows to be non-negative can also be included in the previous equation. The objective function is often additively separable: it can be expressed as a sum of instantaneous return functions, where each return function depends on the variables pertaining to one time period only: r_t(x_t, u_t). The optimization problem is given by:

    max_{{u_t, x_{t+1}}_{t=0}^T}  Σ_{t=0}^T r_t(x_t, u_t)
    s.t.  x_{t+1} − x_t = f_t(x_t, u_t)
          g_t(x_t, u_t) ≥ 0
          x_0 given

We assume that r, f and g are at least C¹. Introduce the Lagrange multipliers λ_t for the equation describing the transition of the state variable and µ_t for the additional constraint g_t(x_t, u_t) ≥ 0. The Lagrangian is given by:

    L = Σ_{t=0}^T { r_t(x_t, u_t) + λ_t [f_t(x_t, u_t) + x_t − x_{t+1}] + µ_t g_t(x_t, u_t) }

Assuming interior solutions, the FONC are given by

(1.1)  ∂L/∂u_t = ∂r_t/∂u_t + λ_t ∂f_t/∂u_t + µ_t ∂g_t/∂u_t = 0
(1.2)  ∂L/∂x_t = ∂r_t/∂x_t + λ_t ∂f_t/∂x_t + λ_t − λ_{t−1} + µ_t ∂g_t/∂x_t = 0
(1.3)  ∂L/∂λ_t = f_t(x_t, u_t) + x_t − x_{t+1} = 0
(1.4)  g_t(x_t, u_t) ≥ 0,  µ_t ≥ 0,  µ_t g_t(x_t, u_t) = 0

These conditions can be written in a more compact way. Define the function H_t, called the Hamiltonian, by

    H_t(x_t, u_t, λ_t) = r_t(x_t, u_t) + λ_t f_t(x_t, u_t)

Then the problem can be reformulated as maximizing H_t subject to g_t(x_t, u_t) ≥ 0. Denote by H_t*(x_t, λ_t) the resulting maximum value. The Lagrangian for this single-period optimization problem is given by

    Ĥ_t = H_t(x_t, u_t, λ_t) + µ_t g_t(x_t, u_t)

Now think of u_t as the only choice variable: it has to be chosen to maximize H_t(x_t, u_t, λ_t) subject to g_t(x_t, u_t) ≥ 0. Thus we can rewrite (1.2) as

    λ_{t−1} − λ_t = ∂Ĥ_t/∂x_t

Ĥ_t is the Lagrangian for the static optimization problem in which only the u_t are choice variables, and the x_t and λ_t are parameters. Therefore, by the Envelope Theorem we have

    λ_{t−1} − λ_t = ∂H_t*/∂x_t

The Envelope Theorem also yields ∂H_t*/∂λ_t = ∂Ĥ_t/∂λ_t = f_t. Therefore, (1.3) can be written as

    x_{t+1} − x_t = ∂H_t*/∂λ_t

The Maximum Principle: the first-order necessary conditions for the intertemporal optimization problem are:

(1) For each t, u_t maximizes the Hamiltonian H_t(x_t, u_t, λ_t) subject to the single-period constraints g_t(x_t, u_t) ≥ 0,
(2) For each t, the change in x_t over time is given by x_{t+1} − x_t = ∂H_t*/∂λ_t,
(3) For each t, the change in λ_t over time is given by λ_{t−1} − λ_t = ∂H_t*/∂x_t.

The first-order conditions are sufficient for a unique optimum if the appropriate curvature conditions are imposed on r, f and g. In particular, sufficiency for a unique optimum holds if a strictly concave function is maximized over a closed strictly convex region.

We can interpret the maximization condition (1.1) by noting that we would not want to choose u_t to maximize r_t(x_t, u_t) alone. We know that the choice of u_t affects x_{t+1} via the transition equation of x_t, and therefore affects the terms in the objective function at times t+1, etc. We can capture all these future effects by using the shadow price of the affected stock. The effect of u_t on x_{t+1} equals its effect on f_t(x_t, u_t), and the resulting change in the objective function is found by multiplying this by the shadow price λ_t of x_{t+1}. That is what we add to r_t to get the Hamiltonian.

Note that the equation λ_{t−1} − λ_t = ∂H_t*/∂x_t has a useful economic interpretation. We can write it as

    [ ∂r_t/∂x_t + µ_t ∂g_t/∂x_t ] + λ_t ∂f_t/∂x_t + λ_t − λ_{t−1} = 0

A marginal unit of x_t yields the marginal return ∂r_t/∂x_t + µ_t ∂g_t/∂x_t within period t, and an extra ∂f_t/∂x_t next period, valued at λ_t. We can think of these as a dividend. And the change in the price, λ_t − λ_{t−1}, is like a capital gain. When x_t is optimal, the overall return (the sum of these components) should be zero. In other words, the shadow prices take values that do not allow for an excess return from holding the stock; this is an intertemporal no-arbitrage condition.
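For the consumption problem, the Maximum Principle conditions can be turned into a shooting algorithm: guess c_0, propagate the Euler equation (the costate condition) and the asset transition forward, and adjust c_0 until the terminal condition a_{T+1} = 0 holds. A minimal sketch under the illustrative CRRA parameterization used earlier (the parameter values and the bisection bracket are assumptions):

    import numpy as np

    T, beta, r, sigma, a0 = 10, 0.96, 0.04, 2.0, 1.0
    y = np.ones(T + 1)

    def terminal_assets(c0):
        """Run the Euler equation and transition forward from (a_0, c_0); return a_{T+1}."""
        a, c = a0, c0
        for t in range(T + 1):
            a = (1 + r) * (a + y[t] - c)             # a_{t+1} = (1+r)(a_t + y_t - c_t)
            c = c * (beta * (1 + r)) ** (1 / sigma)  # Euler: u'(c_t) = beta(1+r) u'(c_{t+1})
        return a

    # Bisect on c_0: too little consumption leaves assets unspent, too much ends in debt
    lo, hi = 1e-9, a0 + np.sum(y)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if terminal_assets(mid) > 0 else (lo, mid)
    print("c_0 from shooting:", 0.5 * (lo + hi))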

1.4. Dynamic Programming. Consider the optimization problem:

    max_{{u_t}_{t=0}^T}  Σ_{t=0}^T r_t(x_t, u_t)
    s.t.  x_{t+1} − x_t = f_t(x_t, u_t)
          g_t(x_t, u_t) ≥ 0
          x_0 given

Define the value function as the resulting maximum value of the objective function expressed as a function of the initial state variables, say V_0(x_0). Since the objective function is separable, instead of starting at time 0, we can start at another particular time, say t = τ. Given that the state variables at a point in time determine all other variables both currently and at all future dates, they determine the maximum value that the objective function can attain. Let V_τ(x_τ) denote the value function for the problem of maximizing Σ_{t=τ}^T r_t(x_t, u_t) subject to x_{t+1} − x_t = f_t(x_t, u_t) and g_t(x_t, u_t) ≥ 0, t = τ, τ+1, ..., T.

The idea underlying the dynamic programming approach is Bellman's principle of optimality: pick any t and consider the decision about the control variables u_t at that time. This choice will lead to next period's state x_{t+1} according to the transition equation for x_t. Thereafter it remains to solve the subproblem starting at t+1 and achieve the maximum value V_{t+1}(x_{t+1}). Then the total value starting at t can be broken down into two terms: r_t(x_t, u_t), which accrues at once, and V_{t+1}(x_{t+1}), which accrues thereafter. The choice of u_t should maximize the sum of these two terms:

    V_t(x_t) = max_{u_t}  r_t(x_t, u_t) + V_{t+1}(x_{t+1})
    s.t.  x_{t+1} − x_t = f_t(x_t, u_t)
          g_t(x_t, u_t) ≥ 0
          x_t given

The idea is that whatever the decision at t, the subsequent decisions should be optimal for the subproblem starting at t+1. This recursive relationship is known as Bellman's equation. Using this equation, we can start in the final period T and proceed recursively to earlier time periods. In period T, we have

    V_T(x_T) = max_{u_T}  r_T(x_T, u_T)
    s.t.  x_{T+1} − x_T = f_T(x_T, u_T)
          g_T(x_T, u_T) ≥ 0
          x_T given

This is a static optimization problem, and yields the policy function u_T = h_T(x_T) which, together with the transition equation x_{T+1} − x_T = f_T(x_T, u_T), gives the value function V_T(x_T).

The value function can then be used in the right-hand side of the static optimization problem for period T−1:

    V_{T−1}(x_{T−1}) = max_{u_{T−1}}  r_{T−1}(x_{T−1}, u_{T−1}) + V_T(x_T)
    s.t.  x_T − x_{T−1} = f_{T−1}(x_{T−1}, u_{T−1})
          g_{T−1}(x_{T−1}, u_{T−1}) ≥ 0
          x_{T−1} given

This is another static optimization problem, and yields the value function V_{T−1}(x_{T−1}). We can proceed recursively backwards all the way to period 0. Note that the Bellman equation shows that dynamic programming problems are two-period problems, where the periods are today and the future. But this works only when the instantaneous return function and the constraint function have the property that controls u_t at t influence states x_{t+s} and returns r_{t+s}(x_{t+s}, u_{t+s}) for s > 0 only through next period's state x_{t+1}.

If Bellman's principle of optimality is applicable, it leads to the same decision rules as the Maximum Principle. Consider the intertemporal optimization problem we have been discussing. Substituting for x_{t+1} from the transition equation into the Bellman equation, we have

    V_t(x_t) = max_{u_t}  r_t(x_t, u_t) + V_{t+1}( f_t(x_t, u_t) + x_t )
    s.t.  g_t(x_t, u_t) ≥ 0, x_t given

Letting µ_t denote the Lagrange multipliers on the constraints, the first-order condition is

    ∂r_t/∂u_t + V′_{t+1}(x_{t+1}) ∂f_t/∂u_t + µ_t ∂g_t/∂u_t = 0

By the Envelope Theorem:

    V′_t(x_t) = ∂r_t/∂x_t + V′_{t+1}(x_{t+1}) (1 + ∂f_t/∂x_t) + µ_t ∂g_t/∂x_t

Recall that with the Maximum Principle we had:

    λ_{t−1} = ∂r_t/∂x_t + λ_t (1 + ∂f_t/∂x_t) + µ_t ∂g_t/∂x_t

Comparing the two equations above reveals that V′_{t+1}(x_{t+1}) = λ_t and V′_t(x_t) = λ_{t−1}. Using this in the first-order condition yields

    ∂r_t/∂u_t + λ_t ∂f_t/∂u_t + µ_t ∂g_t/∂u_t = 0

which is exactly the FOC for u_t to maximize the Hamiltonian defined in the earlier section.
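Before turning to a concrete example, note that the backward recursion translates directly into a generic grid-based solver. The sketch below is illustrative (the function names, grid sizes and interpolation scheme are assumptions): it recurses V_t(x) = max_u r_t(x, u) + V_{t+1}(x′) from T down to 0, encoding violations of g_t(x, u) ≥ 0 as a return of minus infinity.

    import numpy as np

    def solve_backward(ret, x_next, x_grid, u_grid, T):
        """Finite-horizon DP by backward induction on a grid.
        ret(t, x, u): per-period return (use -inf where g_t(x, u) >= 0 fails);
        x_next(t, x, u): next period's state x + f_t(x, u)."""
        V = np.zeros(len(x_grid))  # V_{T+1} = 0: no value beyond the horizon
        policies = []
        X, U = np.meshgrid(x_grid, u_grid, indexing="ij")
        for t in range(T, -1, -1):
            cont = np.interp(x_next(t, X, U), x_grid, V)  # V_{t+1} evaluated at x'
            values = ret(t, X, U) + cont
            best = values.argmax(axis=1)
            V = values[np.arange(len(x_grid)), best]
            policies.append(u_grid[best])
        return V, policies[::-1]  # policies[t] gives u_t = h_t(x) on x_grid

    # Tiny usage example, anticipating the log-utility growth model of the example below
    alpha, beta = 0.3, 0.95  # illustrative parameters
    grid = np.linspace(0.05, 0.5, 400)
    ret = lambda t, k, kp: np.where(k ** alpha - kp > 0,
                                    beta ** t * np.log(np.maximum(k ** alpha - kp, 1e-300)),
                                    -np.inf)
    V0, pol = solve_backward(ret, lambda t, k, kp: kp, grid, grid, T=5)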

Example. Consider a finite horizon consumption problem and let u(c_t) = ln(c_t) and f(k_t) = k_t^α, 0 < α < 1. The discount factor is 0 < β < 1. Assume that the horizon ends at T and there is no value to capital carried over to T+1. Let h be the decision rule for consumption and g be the decision rule for capital. The problem can then be written as

    max_{{c_t, k_{t+1}}_{t=0}^T}  Σ_{t=0}^T β^t u(c_t)
    s.t.  k_{t+1} = f(k_t) − c_t,  k_0 given

Note that we can use either c_t or k_{t+1} as the control variable. Let us use the latter and substitute in using the resource constraint. The Bellman equation for period T is:

    V_T(k_T) = max_{k_{T+1}}  ln(k_T^α − k_{T+1})

The solution is k_{T+1} = 0 (so g_T(k_T) = 0). Then we have c_T = k_T^α = h_T(k_T) and V_T(k_T) = ln(k_T^α). At T−1 we have

    V_{T−1}(k_{T−1}) = max_{k_T}  ln(k_{T−1}^α − k_T) + β ln(k_T^α)

which yields the FOC

    1/(k_{T−1}^α − k_T) = αβ/k_T

which yields

    k_T = [αβ/(1+αβ)] k_{T−1}^α = g_{T−1}(k_{T−1})

Then,

    c_{T−1} = h_{T−1}(k_{T−1}) = [1/(1+αβ)] k_{T−1}^α

and

    V_{T−1}(k_{T−1}) = αβ ln(αβ) + (1+αβ) ln( k_{T−1}^α / (1+αβ) )

Similarly we can keep going back to t = 0. It can be shown that the consumption decision rule takes the form:

    c_t = h_t(k_t) = k_t^α / Σ_{τ=0}^{T−t} (αβ)^τ

and

    k_{t+1} = g_t(k_t) = [ Σ_{τ=0}^{T−t−1} (αβ)^τ / Σ_{τ=0}^{T−t} (αβ)^τ ] αβ k_t^α
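These closed forms can be confirmed without grids by recursing on the value-function coefficients: writing V_t(k) = A_t + B_t ln k, the FOC gives the savings rate βB_{t+1}/(1+βB_{t+1}) out of output, and matching coefficients gives B_t = α(1+βB_{t+1}) with B_T = α. A sketch (the parameter values are illustrative assumptions):

    import numpy as np

    alpha, beta, T = 0.3, 0.95, 10  # illustrative parameters

    # Backward recursion on the slope of V_t(k) = A_t + B_t ln k
    B = np.zeros(T + 1)
    B[T] = alpha  # V_T(k) = ln(k^alpha) = alpha ln k, since k_{T+1} = 0
    for t in range(T - 1, -1, -1):
        B[t] = alpha * (1 + beta * B[t + 1])

    # Savings rate out of output k_t^alpha implied by the FOC at each t < T
    s_foc = np.array([beta * B[t + 1] / (1 + beta * B[t + 1]) for t in range(T)])

    # Closed form from the notes: s_t = alpha beta S_{T-t-1} / S_{T-t}, S_n = sum_{tau=0}^{n} (alpha beta)^tau
    S = np.cumsum((alpha * beta) ** np.arange(T + 1))
    s_closed = np.array([alpha * beta * S[T - t - 1] / S[T - t] for t in range(T)])

    print(np.allclose(s_foc, s_closed))  # True: k_{t+1} = s_t k_t^alpha and c_t = (1 - s_t) k_t^alpha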

1.4.1. Infinite Horizon. What happens as T → ∞? In the finite-horizon case, the value function changes with time, since problems that start at different dates differ not only in the initial value of the state but also in the time remaining until the end of the planning period. If T is very large, however, the starting point is less important. In fact, as T → ∞, under certain conditions that we will explore later, the value function and the optimal decision rules are time invariant. If h_t, f_t, r_t change radically with time, it will be hard to find a solution. A common setup to simplify this is to have

    r_t(x_t, u_t) = β^t r(x_t, u_t),  β ∈ (0, 1)
    f_t(x_t, u_t) = f(x_t, u_t)
    g_t(x_t, u_t) = g(x_t, u_t)

There are two ways of solving the Bellman equation, which we will study next.

1.4.2. Guess and Verify. If one could figure out the form the value function would take, the method of undetermined coefficients would give a solution, in the following way:

(1) Guess a form for the value function. For instance, we could believe that the value function is of the form V^G(x) = A + Bh(x) + Cz(x), for known functions h(x) and z(x) but unknown coefficients A, B and C.
(2) Plug the conjecture V^G into both sides of Bellman's equation:

    V^G(x_t) = max_{u_t}  r(x_t, u_t) + βV^G( f(x_t, u_t) + x_t )

(3) Obtain the policy function from the FOC and plug it back into the Bellman equation.
(4) Find values of the coefficients that make the equation hold.

This method will work if we are close to the correct form of the value function, but trial and error on what functional forms to include will usually be fruitless.

Example. Consider the same example as above, but now let the problem be an infinite horizon problem. Guess the form of the value function as

    V(k_t) = A + B ln(k_t)

Recall that the Bellman equation was

    V(k_t) = max_{k_{t+1}}  ln(k_t^α − k_{t+1}) + βV(k_{t+1})

Plug the guess for the value function into the Bellman equation:

    A + B ln(k_t) = max_{k_{t+1}}  ln(k_t^α − k_{t+1}) + β(A + B ln(k_{t+1}))

The FOC yields

    k_{t+1} = [βB/(1+βB)] k_t^α

Replace this into the RHS of the Bellman equation and rearrange to get:

    A + B ln(k_t) = βA + ln(1/(1+βB)) + βB ln(βB/(1+βB)) + α(1+βB) ln(k_t)

Matching the constant terms gives A = βA + ln(1/(1+βB)) + βB ln(βB/(1+βB)), and matching the coefficients on ln(k_t) gives B = α(1+βB).

We can then solve for B = α/(1−αβ) and A = [1/(1−β)] [ ln(1−αβ) + (αβ/(1−αβ)) ln(αβ) ], and hence the value function. Then from the FOC we can get

    k_{t+1} = αβ k_t^α

and then from the resource constraint c_t = (1−αβ) k_t^α. Note how these are the same as the finite horizon problem decision rules in the limit as T → ∞.

1.4.3. Method of Successive Approximations. Denote by V^j(x) the j-th guess of V(x). We proceed as follows:

(1) Start with V^0(x), arbitrary.
(2) Plug V^0(x) into the right-hand side of the Bellman equation to generate a new function on the left-hand side, namely:

    V^1(x_t) = max_{u_t}  r(x_t, u_t) + βV^0( f(x_t, u_t) + x_t )

(3) Obtain the policy function from the FOC and plug it back into the above equation.
(4) If V^1(x) = V^0(x), the guess is correct, and V(x) = V^0(x).
(5) Otherwise, use V^1(x) as the initial guess and repeat.

We could solve the previous example using this method also. To verify that, start with V^0(x) = 0 and find V^1 by inspection rather than brute force calculus. Why? By substituting in for x_{t+1} you lose the x_{t+1} ≥ 0 constraint. Either keep track of this as well, or have x_{t+1} be your control variable. Then proceed normally. You should get the same sequence as above when recursing backwards in the finite horizon problem (note that this does not always happen!).

1.4.4. Unique Solution. We need to explore the conditions under which the value function is time invariant and the sequence of functions V^j(x) converges to a unique function, say V*(x), which reproduces itself if plugged into the right-hand side of the Bellman equation. For the optimization problem given at the beginning of the section, let T → ∞; then we have

    V_t(x_t) = max_{u_t}  β^t r(x_t, u_t) + V_{t+1}(x_{t+1})
    s.t.  x_{t+1} − x_t = f(x_t, u_t) and g(x_t, u_t) ≥ 0

It seems natural to conjecture that V_t(x) = β^t V(x), so that V_{t+1} = β^{t+1} V = β · β^t V = βV_t. Then the Bellman equation becomes

    V(x_t) = max_{u_t}  r(x_t, u_t) + βV(x_{t+1})

Solving the problem defined above is equivalent to finding a fixed point for the mapping K, such that KV = V, in

    KV(x_t) = max_{u_t}  r(x_t, u_t) + βV(x_{t+1})
    s.t.  x_{t+1} − x_t = f(x_t, u_t) and g(x_t, u_t) ≥ 0
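Both the guess-and-verify coefficients and the successive-approximation scheme can be checked numerically: iterate V^{j+1} = KV^j on a grid starting from V^0 = 0 and compare the limit with V(k) = A + B ln k and the policy k_{t+1} = αβ k_t^α. A sketch (the grid bounds, grid size and tolerance are illustrative assumptions):

    import numpy as np

    alpha, beta = 0.3, 0.95
    A = (np.log(1 - alpha * beta)
         + alpha * beta / (1 - alpha * beta) * np.log(alpha * beta)) / (1 - beta)
    B = alpha / (1 - alpha * beta)

    k_grid = np.linspace(0.02, 0.4, 500)  # brackets the steady state (alpha beta)^(1/(1-alpha))
    K, KP = np.meshgrid(k_grid, k_grid, indexing="ij")
    C = K ** alpha - KP  # consumption implied by each (k, k') pair
    R = np.where(C > 0, np.log(np.maximum(C, 1e-300)), -np.inf)  # feasibility: c_t > 0

    V = np.zeros_like(k_grid)  # successive approximations, starting from V^0 = 0
    for _ in range(2000):
        V_new = (R + beta * V[None, :]).max(axis=1)  # V^{j+1} = K V^j
        if np.max(np.abs(V_new - V)) < 1e-10:
            break
        V = V_new

    policy = k_grid[(R + beta * V[None, :]).argmax(axis=1)]
    print(np.max(np.abs(V - (A + B * np.log(k_grid)))))             # small, up to grid error
    print(np.max(np.abs(policy - alpha * beta * k_grid ** alpha)))  # policy ~ alpha beta k^alpha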

Under what conditions can we ensure that such a fixed point exists? To answer this question we need to introduce some concepts.

Definition 1. Let B(X) be the space of bounded functions from X to R, let (B(X), d) be a metric space, and let K : B(X) → B(X). We say that K is a contraction if there exists β ∈ [0, 1) such that

    d[K(v), K(v′)] ≤ β d(v, v′),  for all v, v′ ∈ B(X)

Fact 2. Let (B(X), d) be a complete metric space, and K : B(X) → B(X) a contraction. Then:
- there is a unique point v* ∈ B(X) such that K(v*) = v* (i.e., a fixed point), and
- the sequence {v_n}, defined by v_1 = K(v_0), v_2 = K(v_1), ..., v_{n+1} = K(v_n), converges to v* for any starting point v_0 ∈ B(X).

Thus, we want to show that the mapping K is a contraction. The following theorem gives sufficient conditions for an operator in a useful function space to be a contraction.

Theorem 3 (Blackwell's Sufficient Conditions for a Contraction). Let K be an operator on a metric space (B(X), d), where B(X) is a space of functions. Suppose K satisfies:

(1) Monotonicity: for any v, v′ ∈ B(X), v(x) ≤ v′(x) for all x ∈ X implies (Kv)(x) ≤ (Kv′)(x) for all x ∈ X.
(2) Discounting: there is a β ∈ [0, 1) such that K(v + c) ≤ K(v) + βc for all v ∈ B(X) and any positive real c.

Then K is a contraction with modulus β.

In light of Blackwell's sufficient conditions, then, we only need to show that the mapping K defined above satisfies the monotonicity and discounting properties on a complete metric space. To this purpose, let us assume that the return function r(x_t, u_t) is real valued, continuous, concave and bounded, and that the constraint set {x_t, x_{t+1}, u_t : x_{t+1} = f(x_t, u_t) + x_t, g(x_t, u_t) ≥ 0} is convex and compact. We work with the metric space of continuous bounded functions mapping x ∈ X into the real line, with the sup metric d(v, v′) = sup_{x∈X} |v(x) − v′(x)|. This metric space can be shown to be complete, and it can also be shown that K maps a continuous bounded function V into a continuous bounded function KV. Furthermore, K satisfies Blackwell's sufficient conditions. Let V(x) ≤ W(x) for all x ∈ X, and define

    u_t^V = argmax_{u_t}  r(x_t, u_t) + βV( f(x_t, u_t) + x_t )  s.t. g(x_t, u_t) ≥ 0

Then

    KV(x_t) = max_{u_t}  r(x_t, u_t) + βV( f(x_t, u_t) + x_t )  s.t. g(x_t, u_t) ≥ 0
            = r(x_t, u_t^V) + βV( f(x_t, u_t^V) + x_t )
            ≤ r(x_t, u_t^V) + βW( f(x_t, u_t^V) + x_t )
            ≤ max_{u_t}  r(x_t, u_t) + βW( f(x_t, u_t) + x_t )  s.t. g(x_t, u_t) ≥ 0
            = KW(x_t)

and K is monotonic. Also, for any positive constant c,

    K(V(x_t) + c) = max_{u_t}  r(x_t, u_t) + β[V(x_{t+1}) + c]  s.t. the constraints
                  = βc + max_{u_t}  r(x_t, u_t) + βV(x_{t+1})   s.t. the constraints
                  = βc + K(V(x_t))

and K discounts.
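The discounting property is what makes successive approximations converge geometrically: the sup-norm distance between consecutive iterates shrinks by at least the factor β each step. A quick numerical illustration on the growth example (parameters again illustrative assumptions):

    import numpy as np

    alpha, beta = 0.3, 0.95
    k_grid = np.linspace(0.02, 0.4, 300)
    K, KP = np.meshgrid(k_grid, k_grid, indexing="ij")
    R = np.where(K ** alpha - KP > 0,
                 np.log(np.maximum(K ** alpha - KP, 1e-300)), -np.inf)

    def K_op(V):
        """The Bellman operator on the grid: (KV)(k) = max_{k'} ln(k^alpha - k') + beta V(k')."""
        return (R + beta * V[None, :]).max(axis=1)

    V_prev = np.zeros_like(k_grid)
    V = K_op(V_prev)
    for _ in range(10):
        V_next = K_op(V)
        ratio = np.max(np.abs(V_next - V)) / np.max(np.abs(V - V_prev))
        print(round(ratio, 4), "<= beta =", beta)  # contraction: d(KV, KW) <= beta d(V, W)
        V_prev, V = V, V_next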