Numerical Methods in Economics


Numerical Methods in Economics (MIT Press, 1998), Chapter 12 Notes
Numerical Dynamic Programming
Kenneth L. Judd, Hoover Institution
November 15

Discrete-Time Dynamic Programming

Objective:
$$ E\left\{ \sum_{t=1}^{T} \pi(x_t, u_t, t) + W(x_{T+1}) \right\}, \qquad (12.1.1) $$
- $X$: set of states
- $D$: the set of controls
- $\pi(x, u, t)$: payoff in period $t$, for $x \in X$ at the beginning of period $t$, when control $u \in D$ is applied in period $t$
- $D(x, t) \subseteq D$: controls which are feasible in state $x$ at time $t$
- $F(A; x, u, t)$: probability that $x_{t+1} \in A \subseteq X$ conditional on the time-$t$ control and state

Value function:
$$ V(x, t) \equiv \sup_{\mathcal{U}(x,t)} E\left\{ \sum_{s=t}^{T} \pi(x_s, u_s, s) + W(x_{T+1}) \,\Big|\, x_t = x \right\}. \qquad (12.1.2) $$

Bellman equation:
$$ V(x, t) = \sup_{u \in D(x,t)} \pi(x, u, t) + E\{ V(x_{t+1}, t+1) \mid x_t = x, \ u_t = u \}. \qquad (12.1.3) $$

Existence: boundedness of $\pi$ is sufficient.

Autonomous, Infinite-Horizon Problem

Objective:
$$ \max_{u_t} E\left\{ \sum_{t=1}^{\infty} \beta^t \pi(x_t, u_t) \right\} \qquad (12.1.7) $$
- $X$: set of states
- $D$: the set of controls
- $D(x) \subseteq D$: controls which are feasible in state $x$
- $\pi(x, u)$: payoff in period $t$ if $x \in X$ at the beginning of period $t$ and control $u \in D$ is applied in period $t$
- $F(A; x, u)$: probability that $x^+ \in A \subseteq X$ conditional on current control $u$ and current state $x$

Value function definition: if $\mathcal{U}(x)$ is the set of all feasible strategies starting at $x$, then
$$ V(x) \equiv \sup_{\mathcal{U}(x)} E\left\{ \sum_{t=0}^{\infty} \beta^t \pi(x_t, u_t) \,\Big|\, x_0 = x \right\}. \qquad (12.1.8) $$

Bellman equation for $V(x)$:
$$ V(x) = \sup_{u \in D(x)} \pi(x, u) + \beta E\{ V(x^+) \mid x, u \} \equiv (TV)(x). \qquad (12.1.9) $$

The optimal policy function, $U(x)$, if it exists, is defined by
$$ U(x) \in \arg\max_{u \in D(x)} \pi(x, u) + \beta E\{ V(x^+) \mid x, u \}. $$

Standard existence theorem:

Theorem 1. If $X$ is compact, $\beta < 1$, and $\pi$ is bounded above and below, then the map
$$ (TV)(x) = \sup_{u \in D(x)} \pi(x, u) + \beta E\{ V(x^+) \mid x, u \} $$
is monotone in $V$, is a contraction mapping with modulus $\beta$ in the space of bounded functions, and has a unique fixed point.

Deterministic Growth Example

Problem:
$$ V(k_0) = \max_{c_t} \sum_{t=0}^{\infty} \beta^t u(c_t) \quad \text{s.t.} \quad k_{t+1} = F(k_t) - c_t, \ k_0 \text{ given.} $$

Euler equation:
$$ u'(c_t) = \beta u'(c_{t+1}) F'(k_{t+1}). $$

Bellman equation:
$$ V(k) = \max_c u(c) + \beta V(F(k) - c). $$

The solution is a policy function $C(k)$ and a value function $V(k)$ satisfying
$$ 0 = u'(C(k)) F'(k) - V'(k), $$
$$ V(k) = u(C(k)) + \beta V(F(k) - C(k)). $$
The second equation defines the value of an arbitrary policy function $C(k)$, not just the optimal $C(k)$. Together, the pair expresses the value function given a policy and a first-order condition for optimality.

Stochastic Growth Accumulation Problem:
$$ V(k, \theta) = \max_{c_t} E\left\{ \sum_{t=0}^{\infty} \beta^t u(c_t) \right\} $$
$$ k_{t+1} = F(k_t, \theta_t) - c_t, \quad \theta_{t+1} = g(\theta_t, \varepsilon_t), \quad \varepsilon_t \text{ i.i.d.}, \quad k_0 = k, \ \theta_0 = \theta. $$

State variables:
- $k$: productive capital stock, endogenous
- $\theta$: productivity state, exogenous

The dynamic programming formulation is
$$ V(k, \theta) = \max_c u(c) + \beta E\{ V(F(k, \theta) - c, \theta^+) \mid \theta \}, \qquad \theta^+ = g(\theta, \varepsilon). $$
The control law $c = C(k, \theta)$ satisfies the first-order condition
$$ 0 = u_c(C(k, \theta)) - \beta E\{ u_c(C(k^+, \theta^+)) F_k(k^+, \theta^+) \mid \theta \}, $$
where $k^+ \equiv F(k, \theta) - C(k, \theta)$.

General Stochastic Accumulation Problem:
$$ V(k, \theta) = \max_{c_t, l_t} E\left\{ \sum_{t=0}^{\infty} \beta^t u(c_t, l_t) \right\} $$
$$ k_{t+1} = F(k_t, l_t, \theta_t) - c_t, \quad \theta_{t+1} = g(\theta_t, \varepsilon_t), \quad k_0 = k, \ \theta_0 = \theta. $$

State variables:
- $k$: productive capital stock, endogenous
- $\theta$: productivity state, exogenous

The dynamic programming formulation is
$$ V(k, \theta) = \max_{c, l} u(c, l) + \beta E\{ V(F(k, l, \theta) - c, \theta^+) \mid \theta \}, $$
where $\theta^+$ is next period's $\theta$ realization. The control laws $c = C(k, \theta)$ and $l = L(k, \theta)$ satisfy the f.o.c.'s
$$ 0 = u_c(C(k,\theta), L(k,\theta))\, F_k(k, L(k,\theta), \theta) - V_k(k, \theta), $$
$$ 0 = u_l(C(k,\theta), L(k,\theta)) + F_l(k, L(k,\theta), \theta)\, u_c(C(k,\theta), L(k,\theta)). $$
The Euler equation implies
$$ 0 = u_c(C(k,\theta), L(k,\theta)) - \beta E\{ u_c(C(k^+, \theta^+), l^+)\, F_k(k^+, l^+, \theta^+) \mid \theta \}, $$
where next period's capital stock and labor supply are
$$ k^+ \equiv F(k, L(k,\theta), \theta) - C(k,\theta), \qquad l^+ \equiv L(k^+, \theta^+). $$

Discrete State Space Problems
- State space $X = \{x_i, \ i = 1, \ldots, n\}$
- Controls $D = \{u_i \mid i = 1, \ldots, m\}$
- $q_{ij}^t(u) = \Pr(x_{t+1} = x_j \mid x_t = x_i, \ u_t = u)$
- $Q^t(u) = \left( q_{ij}^t(u) \right)_{i,j}$: Markov transition matrix at $t$ if $u_t = u$

Value Function Iteration

Terminal value: $V_i^{T+1} = W(x_i)$, $i = 1, \ldots, n$.
Bellman equation: the time-$t$ value function is
$$ V_i^t = \max_u \left[ \pi(x_i, u, t) + \beta \sum_{j=1}^{n} q_{ij}^t(u)\, V_j^{t+1} \right], \quad i = 1, \ldots, n. $$
The Bellman equation can be directly implemented; this is called value function iteration. It is the only choice for finite-horizon problems, because each period has a different value function.

Infinite-horizon problems: the Bellman equation is now a simultaneous set of equations for the $V_i$ values:
$$ V_i = \max_u \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, V_j, \quad i = 1, \ldots, n. $$
Value function iteration is now
$$ V_i^{k+1} = \max_u \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, V_j^k, \quad i = 1, \ldots, n. $$
One can use value function iteration with an arbitrary $V_i^0$ and iterate $k \to \infty$. The error is given by the contraction mapping property:
$$ \| V^k - V^* \| \le \frac{1}{1 - \beta} \| V^{k+1} - V^k \|. $$

Algorithm 12.1: Value Function Iteration Algorithm

Objective: Solve the Bellman equation, (12.3.4).
Step 0: Make initial guess $V^0$; choose stopping criterion $\epsilon > 0$.
Step 1: For $i = 1, \ldots, n$, compute $V_i^{l+1} = \max_{u \in D} \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u) V_j^l$.
Step 2: If $\| V^{l+1} - V^l \| < \epsilon$, then go to step 3; else go to step 1.
Step 3: Compute the final solution: set $U^* = \mathcal{U} V^{l+1}$ (the optimal policy given $V^{l+1}$), $P_i^* = \pi(x_i, U_i^*)$, $i = 1, \ldots, n$, and $V^* = (I - \beta Q^{U^*})^{-1} P^*$, and STOP.
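A minimal Python sketch of Algorithm 12.1 for a discrete problem (not from the text): it assumes payoffs are stored in a hypothetical array `pi[i, a]` = $\pi(x_i, u_a)$ and transitions in a list `Q` with `Q[a][i, j]` = $q_{ij}(u_a)$.

```python
import numpy as np

def value_function_iteration(pi, Q, beta, eps=1e-8, max_iter=100_000):
    """Algorithm 12.1 sketch: pi[i, a] = payoff in state i under control a,
    Q[a][i, j] = Pr(next state j | state i, control a)."""
    n, m = pi.shape
    V = np.zeros(n)                       # arbitrary initial guess V^0
    for _ in range(max_iter):
        # Step 1: candidate values for every (state, control) pair
        cand = pi + beta * np.stack([Q[a] @ V for a in range(m)], axis=1)
        V_new = cand.max(axis=1)
        done = np.max(np.abs(V_new - V)) < eps   # Step 2
        V = V_new
        if done:
            break
    # Step 3: evaluate the converged policy exactly, V = (I - beta Q^U)^{-1} P
    U = cand.argmax(axis=1)
    Q_U = np.array([Q[U[i]][i] for i in range(n)])
    P = pi[np.arange(n), U]
    V = np.linalg.solve(np.eye(n) - beta * Q_U, P)
    return V, U
```

The final linear solve carries out Step 3: once the policy has settled, evaluating it exactly is cheaper and more accurate than stopping at the last iterate.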

Policy Iteration (a.k.a. Howard improvement)

Value function iteration is a slow process:
- Linear convergence at rate $\beta$
- Convergence is particularly slow if $\beta$ is close to 1

Policy iteration is faster:
- Current guess: $V_i^k$, $i = 1, \ldots, n$.
- Iteration: compute the optimal policy today if $V^k$ is the value tomorrow:
$$ U_i^{k+1} = \arg\max_u \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, V_j^k, \quad i = 1, \ldots, n, $$
- Compute the value function if the policy $U^{k+1}$ is used forever, which is the solution to the linear system
$$ V_i^{k+1} = \pi\left( x_i, U_i^{k+1} \right) + \beta \sum_{j=1}^{n} q_{ij}(U_i^{k+1})\, V_j^{k+1}, \quad i = 1, \ldots, n. $$

Comments:
- Policy iteration depends only on monotonicity.
- Policy iteration is faster than value function iteration: if the initial guess is above or below the solution, then the policy iterate lies between the truth and the value function iterate.
- Works well even for $\beta$ close to 1.

Algorithm 12.2: Policy Function Algorithm

Objective: Solve the Bellman equation, (12.3.4).
Step 0: Choose stopping criterion $\epsilon > 0$. EITHER make an initial guess $V^0$ for the value function and go to step 1, OR make an initial guess $U^1$ for the policy function and go to step 2.
Step 1: $U^{l+1} = \mathcal{U} V^l$
Step 2: $P_i^{l+1} = \pi\left( x_i, U_i^{l+1} \right)$, $i = 1, \ldots, n$
Step 3: $V^{l+1} = \left( I - \beta Q^{U^{l+1}} \right)^{-1} P^{l+1}$
Step 4: If $\| V^{l+1} - V^l \| < \epsilon$, STOP; else go to step 1.
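Under the same hypothetical `pi`/`Q` conventions as the earlier sketch, a sketch of Algorithm 12.2; each pass couples one improvement step with an exact evaluation of the new policy via the linear solve of Step 3.

```python
def policy_iteration(pi, Q, beta, eps=1e-8, max_iter=1_000):
    """Algorithm 12.2 sketch: improvement step, then exact policy evaluation."""
    n, m = pi.shape
    V = np.zeros(n)
    for _ in range(max_iter):
        # Step 1: U^{l+1} = optimal policy if V is tomorrow's value
        cand = pi + beta * np.stack([Q[a] @ V for a in range(m)], axis=1)
        U = cand.argmax(axis=1)
        # Steps 2-3: value of using U forever, V = (I - beta Q^U)^{-1} P
        Q_U = np.array([Q[U[i]][i] for i in range(n)])
        P = pi[np.arange(n), U]
        V_new = np.linalg.solve(np.eye(n) - beta * Q_U, P)
        if np.max(np.abs(V_new - V)) < eps:    # Step 4
            return V_new, U
        V = V_new
    return V, U
```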

Modified Policy Iteration

- If $n$ is large, it is difficult to solve the linear system in the policy iteration step.
- Alternative approximation: assume the policy $U^{l+1}$ is used for only $k$ periods:
$$ V^{l+1} = \sum_{t=0}^{k} \beta^t \left( Q^{U^{l+1}} \right)^t P^{l+1} + \beta^{k+1} \left( Q^{U^{l+1}} \right)^{k+1} V^l. \qquad (12.4.1) $$
- The theorem below points out that as the policy function gets close to $U^*$, the linear rate of convergence approaches $\beta^{k+1}$; hence convergence accelerates as the iterates converge.

Theorem 2 (Puterman and Shin). The successive iterates of modified policy iteration with $k$ steps, (12.4.1), satisfy the error bound
$$ \| V^* - V^{l+1} \| \le \min\left\{ \beta, \ \frac{\beta(1 - \beta^k)}{1 - \beta} \| U^l - U^* \| + \beta^{k+1} \right\} \| V^* - V^l \|. \qquad (12.4.3) $$
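A sketch of (12.4.1) under the same conventions: the exact linear solve is replaced by $k+1$ applications of the fixed-policy Bellman operator, which reproduces the partial sum in (12.4.1).

```python
def modified_policy_iteration(pi, Q, beta, k=20, eps=1e-8, max_iter=100_000):
    """Sketch of (12.4.1): evaluate each policy for only k periods."""
    n, m = pi.shape
    V = np.zeros(n)
    for _ in range(max_iter):
        cand = pi + beta * np.stack([Q[a] @ V for a in range(m)], axis=1)
        U = cand.argmax(axis=1)
        Q_U = np.array([Q[U[i]][i] for i in range(n)])
        P = pi[np.arange(n), U]
        V_new = V
        for _ in range(k + 1):     # (T^U)^{k+1} V^l equals the sum in (12.4.1)
            V_new = P + beta * Q_U @ V_new
        if np.max(np.abs(V_new - V)) < eps:
            return V_new, U
        V = V_new
    return V, U
```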

Gaussian Acceleration Methods for Infinite-Horizon Models

Key observation: the Bellman equation is a simultaneous set of equations,
$$ V_i = \max_u \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, V_j, \quad i = 1, \ldots, n. $$
Idea: treat the problem as a large system of nonlinear equations.

Value function iteration is the pre-Gauss-Jacobi iteration
$$ V_i^{k+1} = \max_u \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, V_j^k, \quad i = 1, \ldots, n. $$
True Gauss-Jacobi is
$$ V_i^{k+1} = \max_u \frac{ \pi(x_i, u) + \beta \sum_{j \ne i} q_{ij}(u)\, V_j^k }{ 1 - \beta q_{ii}(u) }, \quad i = 1, \ldots, n. $$

Gauss-Seidel alternatives use new information immediately. Pre-Gauss-Seidel iteration: suppose we have $V_i^l$. At each $x_i$, given $V_j^{l+1}$ for $j < i$, compute $V_i^{l+1}$ in pre-Gauss-Seidel fashion:
$$ V_i^{l+1} = \max_u \pi(x_i, u) + \beta \sum_{j < i} q_{ij}(u)\, V_j^{l+1} + \beta \sum_{j \ge i} q_{ij}(u)\, V_j^l \qquad (12.4.7) $$
Iterate (12.4.7) for $i = 1, \ldots, n$.

Gauss-Seidel Iteration

Suppose we have $V_i^l$. If the optimal control at state $i$ is $u$, then the Gauss-Seidel iterate would be
$$ V_i^{l+1} = \frac{ \pi(x_i, u) + \beta \sum_{j < i} q_{ij}(u)\, V_j^{l+1} + \beta \sum_{j > i} q_{ij}(u)\, V_j^l }{ 1 - \beta q_{ii}(u) }. $$
Gauss-Seidel: at each $x_i$, given $V_j^{l+1}$ for $j < i$, compute
$$ V_i^{l+1} = \max_u \frac{ \pi(x_i, u) + \beta \sum_{j < i} q_{ij}(u)\, V_j^{l+1} + \beta \sum_{j > i} q_{ij}(u)\, V_j^l }{ 1 - \beta q_{ii}(u) } \qquad (12.4.8) $$
and iterate this for $i = 1, \ldots, n$.

Gauss-Seidel iteration, better notation: there is no reason to keep track of $l$, the number of iterations. At each $x_i$,
$$ V_i \leftarrow \max_u \frac{ \pi(x_i, u) + \beta \sum_{j \ne i} q_{ij}(u)\, V_j }{ 1 - \beta q_{ii}(u) }, $$
and iterate this for $i = 1, \ldots, n, 1, \ldots$, etc.
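A sketch of one sweep of this iteration (same hypothetical conventions as before); updating `V` in place is exactly what lets later states in the sweep use the newest values of earlier ones.

```python
def gauss_seidel_sweep(V, pi, Q, beta, order):
    """One Gauss-Seidel sweep over states in `order`, updating V in place.
    States visited earlier in the sweep already hold their new values."""
    n, m = pi.shape
    for i in order:
        best = -np.inf
        for a in range(m):
            q = Q[a][i]
            cont = q @ V - q[i] * V[i]          # sum_{j != i} q_ij(u) V_j
            best = max(best, (pi[i, a] + beta * cont) / (1 - beta * q[i]))
        V[i] = best
    return V
```

The `order` argument is the subject of the upwind discussion on the next slides: convergence depends heavily on visiting states from downwind to upwind.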

Upwind Gauss-Seidel

The Gauss-Seidel methods in (12.4.7) and (12.4.8) are sensitive to the ordering of the states; one needs good ordering schemes to enhance convergence.

Example: two states, $x_1$ and $x_2$, and two controls, $u_1$ and $u_2$; $u_i$ causes the state to move to $x_i$, $i = 1, 2$. Payoffs:
$$ \pi(x_1, u_1) = -1, \quad \pi(x_1, u_2) = 0, \quad \pi(x_2, u_1) = 0, \quad \pi(x_2, u_2) = 1, \quad \beta = 0.9. \qquad (12.4.9) $$

Solution:
- Optimal policy: always choose $u_2$, moving to $x_2$
- Value function: $V(x_1) = 9$, $V(x_2) = 10$
- $x_2$ is the unique steady state, and it is stable

Value iteration with $V^0(x_1) = V^0(x_2) = 0$ converges linearly:
$V^1(x_1) = 0$, $V^1(x_2) = 1$, $U^1(x_1) = 2$, $U^1(x_2) = 2$,
$V^2(x_1) = 0.9$, $V^2(x_2) = 1.9$, $U^2(x_1) = 2$, $U^2(x_2) = 2$,
$V^3(x_1) = 1.71$, $V^3(x_2) = 2.71$, $U^3(x_1) = 2$, $U^3(x_2) = 2$, ...

Policy iteration converges after two iterations:
$V^1(x_1) = 0$, $V^1(x_2) = 1$, $U^1(x_1) = 2$, $U^1(x_2) = 2$,
$V^2(x_1) = 9$, $V^2(x_2) = 10$, $U^2(x_1) = 2$, $U^2(x_2) = 2$.
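The example (12.4.9) is small enough to check the routines above directly; the transition matrices below encode "$u_i$ moves the state to $x_i$".

```python
pi_ex = np.array([[-1.0, 0.0],      # rows: states x1, x2; columns: u1, u2
                  [ 0.0, 1.0]])
Q_ex = [np.array([[1.0, 0.0],       # u1 sends either state to x1
                  [1.0, 0.0]]),
        np.array([[0.0, 1.0],       # u2 sends either state to x2
                  [0.0, 1.0]])]
V_ex, U_ex = policy_iteration(pi_ex, Q_ex, beta=0.9)
print(V_ex, U_ex)                   # [ 9. 10.] [1 1]: always choose u2
```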

Upwind Gauss-Seidel

- The value function at absorbing states is trivial to compute: if $s$ is an absorbing state with control $u$, then
$$ V(s) = \pi(s, u) / (1 - \beta). $$
- With $V(s)$ known, we can compute $V(s')$ for any state $s'$ that sends the system to $s$:
$$ V(s') = \pi(s', u) + \beta V(s). $$
- With $V(s')$, we can compute the values of states $s''$ that send the system to $s'$; etc.

Alternating Sweep

It may be difficult to find the proper order. Idea: alternate between two sweeps with opposite directions (see the sketch below):
$$ W = V^k, $$
$$ W_i = \max_u \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, W_j, \quad i = 1, 2, 3, \ldots, n, $$
$$ W_i = \max_u \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, W_j, \quad i = n, n-1, \ldots, 1, $$
$$ V^{k+1} = W. $$
This will always work well in one-dimensional problems, since the state moves either right or left, and the alternating sweep will exploit this half of the time. In two dimensions, there may still be a natural ordering to be exploited.

Simulated Upwind Gauss-Seidel

It may be difficult to find the proper order in higher dimensions. Idea: simulate using the latest policy function to find the downwind direction:
- Simulate to get an example path $x^1, x^2, x^3, x^4, \ldots, x^m$
- Execute Gauss-Seidel with the states in the order $x^m, x^{m-1}, x^{m-2}, \ldots, x^1$
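With the single-sweep routine sketched earlier, alternating sweep is a short loop (a sketch, not from the text); a simulated-upwind variant would simply pass the simulated path, reversed, as `order`.

```python
def alternating_sweep(pi, Q, beta, n_iter=1_000, eps=1e-8):
    """Forward sweep, then backward sweep, until V stops moving."""
    n = pi.shape[0]
    V = np.zeros(n)
    for _ in range(n_iter):
        V_old = V.copy()
        gauss_seidel_sweep(V, pi, Q, beta, order=range(n))              # i = 1,...,n
        gauss_seidel_sweep(V, pi, Q, beta, order=range(n - 1, -1, -1))  # i = n,...,1
        if np.max(np.abs(V - V_old)) < eps:
            break
    return V
```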

Linear Programming Approach

If $D$ is finite, we can reformulate dynamic programming as a linear programming problem: (12.3.4) is equivalent to the linear program
$$ \min_{V_i} \sum_{i=1}^{n} V_i \quad \text{s.t.} \quad V_i \ge \pi(x_i, u) + \beta \sum_{j=1}^{n} q_{ij}(u)\, V_j, \quad \forall i, \ \forall u \in D. $$
Computational considerations:
- This may be a large problem
- The OR literature does not favor this approach
- Trick and Zin (1997) pursued an acceleration approach with success
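A sketch of this LP using `scipy.optimize.linprog` (same hypothetical `pi`/`Q` conventions); note that `linprog`'s default non-negativity bounds must be lifted, since the $V_i$ may be negative.

```python
from scipy.optimize import linprog

def lp_solve(pi, Q, beta):
    """min sum_i V_i  s.t.  V_i >= pi(x_i,u) + beta sum_j q_ij(u) V_j for all i, u,
    rewritten as (beta Q(u) - I) V <= -pi(., u) to fit linprog's A_ub V <= b_ub."""
    n, m = pi.shape
    I = np.eye(n)
    A_ub = np.vstack([beta * Q[a] - I for a in range(m)])
    b_ub = -pi.T.reshape(-1)                  # control-major order, matching A_ub
    res = linprog(c=np.ones(n), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * n)  # each V_i is free, not >= 0
    return res.x
```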

Continuous States: Discretization

Method: replace the continuous state space $X$ with a finite set $\hat X = \{x_i, \ i = 1, \ldots, n\} \subset X$, and proceed with a finite-state method.
Problems:
- Sometimes one needs to alter the space of controls to assure landing on an $x$ in $\hat X$.
- A fine discretization is often necessary to get accurate approximations.

Continuous States: Linear-Quadratic Dynamic Programming

Problem:
$$ \max_{u_t} \sum_{t=0}^{T} \beta^t \left( \tfrac{1}{2} x_t^\top Q_t x_t + u_t^\top R_t x_t + \tfrac{1}{2} u_t^\top S_t u_t \right) + \tfrac{1}{2} \beta^{T+1} x_{T+1}^\top W_{T+1} x_{T+1} $$
$$ x_{t+1} = A_t x_t + B_t u_t. \qquad (12.6.1) $$

Bellman equation:
$$ V(x, t) = \max_u \tfrac{1}{2} x^\top Q_t x + u^\top R_t x + \tfrac{1}{2} u^\top S_t u + \beta V(A_t x + B_t u, \ t+1). \qquad (12.6.2) $$

Finite horizon:
- Key fact: we know the solution is quadratic, so we solve for the unknown coefficients.
- The guess $V(x, t) = \frac{1}{2} x^\top W_t x$ implies the f.o.c.
$$ 0 = S_t u_t + R_t x + \beta B_t^\top W_{t+1} (A_t x + B_t u_t). $$
- The f.o.c. implies the time-$t$ control law
$$ u_t = -\left( S_t + \beta B_t^\top W_{t+1} B_t \right)^{-1} \left( R_t + \beta B_t^\top W_{t+1} A_t \right) x \equiv U_t x. \qquad (12.6.3) $$
- Substitution into the Bellman equation implies the Riccati equation for $W_t$:
$$ W_t = Q_t + \beta A_t^\top W_{t+1} A_t + \left( \beta B_t^\top W_{t+1} A_t + R_t \right)^\top U_t. \qquad (12.6.4) $$
- The value function method iterates (12.6.4) backward, beginning with the known matrix of coefficients $W_{T+1}$.

Autonomous, Infinite-Horizon Case

Assume $R_t = R$, $Q_t = Q$, $S_t = S$, $A_t = A$, and $B_t = B$. The guess $V(x) \equiv \frac{1}{2} x^\top W x$ implies the algebraic Riccati equation
$$ W = Q + \beta A^\top W A - \left( \beta B^\top W A + R \right)^\top \left( S + \beta B^\top W B \right)^{-1} \left( \beta B^\top W A + R \right). \qquad (12.6.5) $$

Two convergent procedures:

Value function iteration: from a negative definite initial guess $W_0$,
$$ W_{k+1} = Q + \beta A^\top W_k A - \left( \beta B^\top W_k A + R \right)^\top \left( S + \beta B^\top W_k B \right)^{-1} \left( \beta B^\top W_k A + R \right). \qquad (12.6.6) $$

Policy function iteration: from an initial guess $W_0$,
$$ U_{i+1} = -\left( S + \beta B^\top W_i B \right)^{-1} \left( R + \beta B^\top W_i A \right): \text{ optimal policy for } W_i, $$
and $W_{i+1}$ is the value of using $U_{i+1}$ forever, the solution of the linear equation
$$ W_{i+1} = Q + U_{i+1}^\top R + R^\top U_{i+1} + U_{i+1}^\top S U_{i+1} + \beta \left( A + B U_{i+1} \right)^\top W_{i+1} \left( A + B U_{i+1} \right). $$

Lessons:
- We used a functional form to solve the dynamic programming problem
- We solved for the unknown coefficients
- We did not restrict either the state or control set
- Can we do this in general?
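A sketch of value function iteration on (12.6.6), in the sign conventions above (payoff $\frac{1}{2}x^\top Qx + u^\top Rx + \frac{1}{2}u^\top Su$, law of motion $x^+ = Ax + Bu$); all matrix arguments are hypothetical inputs.

```python
def riccati_vfi(A, B, Q, R, S, beta, eps=1e-12, max_iter=100_000):
    """Iterate (12.6.6) from a negative definite guess; return W and the
    stationary control law U in u = U x."""
    n = A.shape[0]
    W = -np.eye(n)                          # negative definite initial guess
    for _ in range(max_iter):
        M = R + beta * B.T @ W @ A          # beta B'WA + R
        W_new = (Q + beta * A.T @ W @ A
                 - M.T @ np.linalg.solve(S + beta * B.T @ W @ B, M))
        if np.max(np.abs(W_new - W)) < eps:
            W = W_new
            break
        W = W_new
    M = R + beta * B.T @ W @ A
    U = -np.linalg.solve(S + beta * B.T @ W @ B, M)   # control law u = Ux
    return W, U
```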

Continuous Methods for Continuous-State Problems

Basic Bellman equation:
$$ V(x) = \max_{u \in D(x)} \pi(u, x) + \beta E\{ V(x^+) \mid x, u \} \equiv (TV)(x). \qquad (12.7.1) $$
- Discretization essentially approximates $V$ with a step function.
- Approximation theory provides better methods to approximate continuous functions.

General task:
- Find a good approximation for $V$
- Identify the parameters

Continuous States: Parametric Approximation and Simulation

General idea: parameterize critical functions and find parameter values that generate a good approximation.

Direct approach: parameterize the control law, $\hat U(x; a)$, and use simulation to find the $a$ that produces the highest value.

Example: consider the stochastic growth problem
$$ V(k) = \max_c u(c) + \beta E\{ V(k - c + \theta f(k - c)) \mid k, c \}. \qquad (12.8.1) $$
Parameterize the savings function $S(k) \equiv k - C(k)$; consider linear rules $\hat S(k) = a + bk$. Use simulation to approximate the value of a savings rule:
- Simulate a sequence of productivity shocks $\theta_t$, $t = 1, \ldots, T$.
- For given $k_0$, $\theta_t$, and $\hat S(k)$, compute paths for $c_t$ and $k_t$:
$$ c_t = k_t - \hat S(k_t), \qquad k_{t+1} = \hat S(k_t) + \theta_t f(\hat S(k_t)). $$
- Compute the realized discounted utility
$$ W(\theta; \hat S) = \sum_{t=0}^{T} \beta^t u(c_t). \qquad (12.8.2) $$
- Repeat for several $\theta$ sequences. The value of $\hat S$ at $k_0$ is $V(k_0; \hat S) = E\{ W(\theta; \hat S) \}$, approximated by the average
$$ \frac{1}{N} \sum_{j=1}^{N} W(\theta^j; \hat S) = \frac{1}{N} \sum_{j=1}^{N} \sum_{t=0}^{T} \beta^t u(c_t^j). \qquad (12.8.3) $$
- Iterate over various $a$ and $b$ to find the optimal rule; a sketch of this search follows below.
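A sketch of the simulation steps (12.8.2)–(12.8.3) for the linear rule $\hat S(k) = a + bk$; the log utility and $f(k) = k^\alpha$ below are illustrative assumptions, not from the text.

```python
def rule_value(a, b, k0, theta, beta, alpha):
    """Average of (12.8.2) over simulated shock paths theta[j, t];
    c_t = k_t - S(k_t), k_{t+1} = S(k_t) + theta_t * S(k_t)**alpha."""
    N, T = theta.shape
    total = 0.0
    for j in range(N):
        k, disc, W = k0, 1.0, 0.0
        for t in range(T):
            s = np.clip(a + b * k, 1e-10, k - 1e-10)  # keep s and c = k - s positive
            W += disc * np.log(k - s)
            k = s + theta[j, t] * s ** alpha
            disc *= beta
        total += W
    return total / N

rng = np.random.default_rng(0)
theta = rng.lognormal(mean=0.0, sigma=0.1, size=(200, 300))   # N=200 shock paths
# crude grid search over (a, b) for the best linear savings rule
grid = [(a, b) for a in np.linspace(0.0, 0.2, 5) for b in np.linspace(0.1, 0.6, 6)]
a_star, b_star = max(grid, key=lambda ab: rule_value(*ab, 1.0, theta, 0.95, 0.3))
```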

General Parametric Approach: Approximating V(x)

Choose a finite-dimensional parameterization
$$ V(x) \doteq \hat V(x; a), \quad a \in \mathbb{R}^m, \qquad (12.7.2) $$
and a finite number of states
$$ X = \{x_1, x_2, \ldots, x_n\}. \qquad (12.7.3) $$
Possibilities:
- polynomials with coefficients $a$ and collocation points $X$
- splines with coefficients $a$ and uniform nodes $X$
- a rational function with parameters $a$ and nodes $X$
- a neural network
- a specially designed functional form

Objective: find coefficients $a \in \mathbb{R}^m$ such that $\hat V(x; a)$ approximately satisfies the Bellman equation.

General Parametric Approach: Approximating T

For each $x_j$, $(TV)(x_j)$ is defined by
$$ v_j = (TV)(x_j) = \max_{u \in D(x_j)} \pi(u, x_j) + \beta \int \hat V(x^+; a)\, dF(x^+ \mid x_j, u). \qquad (12.7.5) $$
In practice, we compute the approximation $\hat T$: $v_j = (\hat T V)(x_j) \doteq (TV)(x_j)$.

Integration step: for weights $\omega_l$ and nodes $\varepsilon_l$ from some numerical quadrature formula,
$$ E\{ \hat V(x^+; a) \mid x_j, u \} = \int \hat V(g(x_j, u, \varepsilon); a)\, dF(\varepsilon) \doteq \sum_l \omega_l \hat V(g(x_j, u, \varepsilon_l); a). $$

Maximization step: for each $x_i \in X$, evaluate $v_i = (T \hat V)(x_i)$.
- Hot starts
- Concave stopping rules

Fitting step:
- Data: $(v_i, x_i)$, $i = 1, \ldots, n$
- Objective: find an $a \in \mathbb{R}^m$ such that $\hat V(x; a)$ best fits the data
- Methods: determined by $\hat V(x; a)$

Approximating T with Hermite Data

Conventional methods just generate data on $V(x_j)$:
$$ v_j = \max_{u \in D(x_j)} \pi(u, x_j) + \beta \int \hat V(x^+; a)\, dF(x^+ \mid x_j, u). \qquad (12.7.5) $$
Envelope theorem: if the solution $u$ is interior,
$$ v_j' = \pi_x(u, x_j) + \beta \int \hat V(x^+; a)\, dF_x(x^+ \mid x_j, u); $$
if the solution $u$ is on the boundary,
$$ v_j' = \mu + \pi_x(u, x_j) + \beta \int \hat V(x^+; a)\, dF_x(x^+ \mid x_j, u), $$
where $\mu$ is a Kuhn-Tucker multiplier.

Since computing $v_j'$ is cheap, we should include it in the data:
- Data: $(v_i, v_i', x_i)$, $i = 1, \ldots, n$
- Objective: find an $a \in \mathbb{R}^m$ such that $\hat V(x; a)$ best fits the Hermite data
- Methods: determined by $\hat V(x; a)$

General Parametric Approach: Value Function Iteration

Comparison with discretization:
$$ \text{guess } a \ \to\ \hat V(x; a) \ \to\ (v_i, x_i), \ i = 1, \ldots, n \ \to\ \text{new } a $$
- This procedure examines only a finite number of points, but it does not assume that future points lie in the same finite set.
- Our choices for the $x_i$ are guided by systematic numerical considerations.

Synergies:
- Smooth interpolation schemes allow us to use Newton's method in the maximization step.
- They also make it easier to evaluate the integral in (12.7.5).

Algorithm 12.5: Parametric Dynamic Programming with Value Function Iteration

Objective: Solve the Bellman equation, (12.7.1).
Step 0: Choose a functional form for $\hat V(x; a)$, and choose the approximation grid $X = \{x_1, \ldots, x_n\}$. Make initial guess $\hat V(x; a^0)$, and choose stopping criterion $\epsilon > 0$.
Step 1: Maximization step: compute $v_j = (T \hat V(\cdot; a^i))(x_j)$ for all $x_j \in X$.
Step 2: Fitting step: using the appropriate approximation method, compute the $a^{i+1} \in \mathbb{R}^m$ such that $\hat V(x; a^{i+1})$ approximates the $(v_i, x_i)$ data.
Step 3: If $\| \hat V(x; a^i) - \hat V(x; a^{i+1}) \| < \epsilon$, STOP; else go to step 1.
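A sketch of Algorithm 12.5 for a deterministic growth model, fitting a Chebyshev polynomial at Chebyshev nodes in the fitting step; the utility and production functions, grid bounds, and polynomial degree are all illustrative assumptions.

```python
from scipy.optimize import minimize_scalar
from numpy.polynomial import chebyshev as C

def parametric_vfi(alpha=0.33, beta=0.95, deg=6, n_nodes=12, eps=1e-8):
    """Maximization step at each node, then a Chebyshev least-squares fit."""
    k_lo, k_hi = 0.3, 3.0
    z = np.cos((2 * np.arange(n_nodes) + 1) * np.pi / (2 * n_nodes))  # nodes in [-1,1]
    nodes = k_lo + (z + 1) * (k_hi - k_lo) / 2                        # Step 0 grid
    scale = lambda k: 2 * (k - k_lo) / (k_hi - k_lo) - 1
    a = np.zeros(deg + 1)                                             # initial guess a^0
    for _ in range(1_000):
        v = np.empty(n_nodes)
        for j, k in enumerate(nodes):                                 # Step 1
            y = k ** alpha + 0.9 * k            # resources, F(k) = k^alpha + 0.9 k
            lo = max(y - k_hi, 1e-6)            # keep k+ = y - c inside [k_lo, k_hi]
            res = minimize_scalar(
                lambda c: -(np.log(c) + beta * C.chebval(scale(y - c), a)),
                bounds=(lo, y - k_lo), method='bounded')
            v[j] = -res.fun
        a_new = C.chebfit(z, v, deg)                                  # Step 2
        if np.max(np.abs(C.chebval(z, a_new) - C.chebval(z, a))) < eps:
            return a_new, nodes                                       # Step 3
        a = a_new
    return a, nodes
```

Because `C.chebval(scale(k), a)` evaluates $\hat V$ anywhere in $[k_{lo}, k_{hi}]$, future states need not lie on the grid, which is exactly the comparison with discretization drawn on the previous slide.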

Convergence

- $T$ is a contraction mapping, but $\hat T$ may be neither monotonic nor a contraction.
- Shape problems, an instructive example:
  [Figure 1: Shape problems may become worse with value function iteration]
- Shape-preserving approximation implies monotonicity.

Comparisons

We apply the various methods to the deterministic growth model.

[Table: relative L2 errors over [0.7, 1.3] for $(\beta, \gamma) \in \{(.95, -10), (.95, -2), (.95, -.5), (.99, -10), (.99, -2), (.99, -.5)\}$ and several grid sizes $N$, comparing: the discrete model; linear interpolation; cubic splines; polynomials (without slopes); shape-preserving quadratic Hermite interpolation; and shape-preserving quadratic interpolation (ignoring slopes). Most numerical entries were lost in extraction; "DNC" marks a case that did not converge.]

General Parametric Approach: Policy Iteration

Basic Bellman equation:
$$ V(x) = \max_{u \in D(x)} \pi(u, x) + \beta E\{ V(x^+) \mid x, u \} \equiv (TV)(x). $$

Policy iteration:
- Current guess: a finite-dimensional linear parameterization $V(x) \doteq \hat V(x; a)$, $a \in \mathbb{R}^m$.
- Iteration: compute the optimal policy today if $\hat V(x; a)$ is the value tomorrow, i.e., the $U(x)$ solving the first-order condition
$$ 0 = \pi_u(x, U(x)) + \beta \frac{d}{du} E\{ \hat V(x^+; a) \mid x, u \} \Big|_{u = U(x)}, $$
and represent it with some approximation scheme $\hat U(x; b)$.
- Compute the value function if the policy $\hat U(x; b)$ is used forever, which is the solution of the linear integral equation
$$ \hat V(x; a') = \pi(\hat U(x; b), x) + \beta E\{ \hat V(x^+; a') \mid x, \hat U(x; b) \}, $$
which can be solved by a projection method.

Summary

Discretization methods:
- Easy to implement
- Numerically stable
- Amenable to many accelerations
- Poor approximation to continuous problems

Continuous approximation methods:
- Can exploit smoothness in problems
- Possible numerical instabilities
- Acceleration is less possible
