Economics 2010c: Lecture 2 Iterative Methods in Dynamic Programming


David Laibson, 9/04/2014

Outline:
1. Functional operators
2. Iterative solutions for the Bellman Equation
3. Contraction Mapping Theorem
4. Blackwell's Theorem (Blackwell: 1919-2010)
5. Application: Search and stopping problem

1 Functional operators

Sequence Problem: Find $v(x)$ such that
$$v(x_0) = \sup_{\{x_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t F(x_t, x_{t+1})$$
subject to $x_{t+1} \in \Gamma(x_t)$, with $x_0$ given.

Bellman Equation: Find $v(x)$ such that
$$v(x) = \sup_{x_{+1} \in \Gamma(x)} \left\{ F(x, x_{+1}) + \beta v(x_{+1}) \right\}$$

The Bellman operator $B$, operating on a function $w$, is defined by
$$(Bw)(x) \equiv \sup_{x_{+1} \in \Gamma(x)} \left\{ F(x, x_{+1}) + \beta w(x_{+1}) \right\}$$
The definition is expressed pointwise, for one value of $x$, but it applies to all values of $x$. The operator $B$ maps the function $w$ to a new function $Bw$. Because it maps functions to functions, it is referred to as a functional operator. The argument of the operator may or may not be a solution to the Bellman Equation.

Let $v$ be a solution to the Bellman Equation. If $w = v$, then $Bv = v$:
$$(Bv)(x) = \sup_{x_{+1} \in \Gamma(x)} \left\{ F(x, x_{+1}) + \beta v(x_{+1}) \right\} = v(x)$$
The function $v$ is a fixed point of $B$ ($B$ maps $v$ to $v$).

2 Iterative Solutions for the Bellman Equation

1. Guess a solution (as in the last lecture; this is not iterative).
2. Pick some $v_0$ and analytically iterate $B^n v_0$ until convergence. (This is really just for illustration.)
3. Pick some $v_0$ and numerically iterate $B^n v_0$ until convergence. (This is the way that iterative methods are used in most cases; a numerical sketch appears below, after the discussion of convergence.)

What does it mean to iterate?
$$(Bv_0)(x) = \sup_{x_{+1} \in \Gamma(x)} \left\{ F(x, x_{+1}) + \beta v_0(x_{+1}) \right\}$$
$$(B(Bv_0))(x) = \sup_{x_{+1} \in \Gamma(x)} \left\{ F(x, x_{+1}) + \beta (Bv_0)(x_{+1}) \right\}$$
$$(B(B^2 v_0))(x) = \sup_{x_{+1} \in \Gamma(x)} \left\{ F(x, x_{+1}) + \beta (B^2 v_0)(x_{+1}) \right\}$$
$$\vdots$$
$$(B(B^n v_0))(x) = \sup_{x_{+1} \in \Gamma(x)} \left\{ F(x, x_{+1}) + \beta (B^n v_0)(x_{+1}) \right\}$$

What does it mean for functions to converge? Notation:
$$\lim_{n \to \infty} B^n v_0 = v$$
Informally, as $n$ gets large, the remaining elements of the sequence of functions $\{B^n v_0, B^{n+1} v_0, B^{n+2} v_0, B^{n+3} v_0, \ldots\}$ are getting closer and closer together. What does it mean for one function to be close to another? The maximum distance between the functions is bounded by $\varepsilon$. We will not worry about these technical issues in this mini-course.
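For the numerical approach (method 3 above), this convergence criterion is easy to operationalize. Below is a minimal sketch, not from the lecture: a function is represented by its values on a finite grid, and an operator is applied until successive iterates are within a tolerance in the sup metric. The names `sup_distance` and `iterate_to_convergence` are illustrative.

```python
import numpy as np

def sup_distance(f, g):
    """Approximate d(f, g) = sup_x |f(x) - g(x)| on a finite grid."""
    return np.max(np.abs(f - g))

def iterate_to_convergence(B, v0, tol=1e-8, max_iter=10_000):
    """Apply the operator B repeatedly, stopping when successive
    iterates are within tol of each other in the sup metric."""
    v = v0
    for _ in range(max_iter):
        v_next = B(v)
        if sup_distance(v_next, v) < tol:
            return v_next
        v = v_next
    raise RuntimeError("did not converge within max_iter iterations")
```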

3 Contraction Mapping Theorem

Why does $B^n v_0$ converge as $n \to \infty$? Answer: $B$ is a contraction mapping.

Definition 3.1 Let $(S, d)$ be a metric space and $B : S \to S$ be a function mapping $S$ into itself. $B$ is a contraction mapping if for some $\beta \in (0, 1)$,
$$d(Bf, Bg) \le \beta\, d(f, g)$$
for any two functions $f$ and $g$.

Intuition: $B$ is a contraction mapping if operating on any two functions moves them strictly closer together: $Bf$ and $Bg$ are strictly closer together than $f$ and $g$.

Remark 3.1 A metric (distance function) is just a way of representing the distance between functions (e.g., the supremum of the pointwise gap between the two functions).

Theorem 3.1 If $(S, d)$ is a complete metric space and $B : S \to S$ is a contraction mapping, then:
1) $B$ has exactly one fixed point $v \in S$;
2) for any $v_0 \in S$, $\lim_{n \to \infty} B^n v_0 = v$;
3) $B^n v_0$ converges to $v$ at an exponential rate at least as great as $-\ln(\beta)$.

Consider the contraction mapping $(Bw)(x) \equiv h(x) + \beta w(x)$, with $\beta \in (0, 1)$:
$$(Bw)(x) = h(x) + \beta w(x)$$
$$(B(Bw))(x) = \beta^2 w(x) + \beta h(x) + h(x)$$
$$(B(B^2 w))(x) = \beta^3 w(x) + \beta^2 h(x) + \beta h(x) + h(x)$$
$$\vdots$$
$$(B^n w)(x) = \beta^n w(x) + \beta^{n-1} h(x) + \cdots + h(x)$$
$$\lim_{n \to \infty} (B^n w)(x) = \frac{h(x)}{1 - \beta}$$
So the fixed point $v(x)$ is $v(x) = h(x)/(1 - \beta)$. Note that
$$(Bv)(x) = h(x) + \beta \frac{h(x)}{1 - \beta} = \frac{h(x)}{1 - \beta} = v(x)$$
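This example is easy to check numerically. The sketch below is an illustration, not part of the lecture: the bounded function $h$ is an arbitrary assumption. It iterates $B$ on a grid and prints the sup distance to the fixed point $h/(1-\beta)$, which falls by the factor $\beta$ each iteration, consistent with the $-\ln(\beta)$ rate in Theorem 3.1.

```python
import numpy as np

beta = 0.9                            # modulus of the contraction
x = np.linspace(0.0, 1.0, 101)        # grid for the state x
h = np.sin(x) + 1.0                   # an arbitrary bounded h (assumption)
v_star = h / (1.0 - beta)             # the fixed point derived above

w = np.zeros_like(x)                  # start from w = 0
for n in range(1, 6):
    w = h + beta * w                  # one application of B
    print(n, np.max(np.abs(w - v_star)))  # shrinks by factor beta each step
```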

Proof of Contraction Mapping Theorem

Broad strategy:
1. Show that $\{B^n v_0\}_{n=0}^{\infty}$ is a Cauchy sequence.
2. Show that the limit point $v$ is a fixed point of $B$.
3. Show that only one fixed point exists.
4. Bound the rate of convergence.

Choose some $v_0$. Let $v_n = B^n v_0$. Since $B$ is a contraction,
$$d(v_2, v_1) = d(Bv_1, Bv_0) \le \beta\, d(v_1, v_0)$$
Continuing by induction,
$$d(v_{n+1}, v_n) \le \beta^n d(v_1, v_0)$$
We can now bound the distance between $v_m$ and $v_n$ when $m > n$:
$$d(v_m, v_n) \le d(v_m, v_{m-1}) + \cdots + d(v_{n+2}, v_{n+1}) + d(v_{n+1}, v_n)$$
$$\le \left[ \beta^{m-1} + \cdots + \beta^{n+1} + \beta^n \right] d(v_1, v_0)$$
$$= \beta^n \left[ \beta^{m-n-1} + \cdots + \beta + 1 \right] d(v_1, v_0)$$
$$\le \frac{\beta^n}{1 - \beta}\, d(v_1, v_0)$$
Since this bound vanishes as $n \to \infty$, $\{v_n\}_{n=0}^{\infty}$ is Cauchy. ∎

Since $S$ is complete, $v_n \to v \in S$. We now have a candidate fixed point $v$. To show that $Bv = v$, note
$$d(Bv, v) \le d(Bv, B^n v_0) + d(B^n v_0, v) \le \beta\, d(v, B^{n-1} v_0) + d(B^n v_0, v)$$
These inequalities must hold for all $n$, and both terms on the RHS converge to zero as $n \to \infty$. So $d(Bv, v) = 0$, implying that $Bv = v$. ∎

Now we show that our fixed point is unique. Suppose there were two fixed points: $v \ne \hat{v}$. Then $Bv = v$ and $B\hat{v} = \hat{v}$ (since both are fixed points). We also have $d(Bv, B\hat{v}) \le \beta\, d(v, \hat{v})$ (since $B$ is a contraction). So
$$d(v, \hat{v}) = d(Bv, B\hat{v}) \le \beta\, d(v, \hat{v}) < d(v, \hat{v})$$
Contradiction. So the fixed point is unique. ∎

Finally, note that
$$d(B^n v_0, v) = d(B(B^{n-1} v_0), Bv) \le \beta\, d(B^{n-1} v_0, v)$$
So $d(B^n v_0, v) \le \beta^n d(v_0, v)$ follows by induction. ∎

This completes the proof of the Contraction Mapping Theorem.

4 Blackwell's Theorem

These are sufficient conditions for an operator to be a contraction mapping.

Theorem 4.1 (Blackwell's sufficient conditions) Let $X \subseteq \mathbb{R}^l$ and let $C(X)$ be a space of bounded functions $f : X \to \mathbb{R}$, with the sup-metric. Let $B : C(X) \to C(X)$ be an operator satisfying two conditions:
1) (monotonicity) if $f, g \in C(X)$ and $f(x) \le g(x)$ for all $x \in X$, then $(Bf)(x) \le (Bg)(x)$ for all $x \in X$;
2) (discounting) there exists some $\beta \in (0, 1)$ such that
$$[B(f + a)](x) \le (Bf)(x) + \beta a \quad \text{for all } f \in C(X),\ a \ge 0,\ x \in X$$
Then $B$ is a contraction with modulus $\beta$.

Remark 4.1 Note that $a$ is a constant and $(f + a)$ is the function generated by adding $a$ to the function $f$.

Remark 4.2 Blackwell's conditions are sufficient but not necessary for $B$ to be a contraction.

Proof of Blackwell's Theorem: For any $f, g \in C(X)$,
$$f(x) \le g(x) + d(f, g)$$
We'll write this as $f \le g + d(f, g)$. So properties 1) and 2) imply
$$Bf \le B(g + d(f, g)) \le Bg + \beta\, d(f, g)$$
$$Bg \le B(f + d(f, g)) \le Bf + \beta\, d(f, g)$$

Combining the last two lines, we have
$$|(Bf)(x) - (Bg)(x)| \le \beta\, d(f, g) \quad \text{for all } x$$
$$\sup_x |(Bf)(x) - (Bg)(x)| \le \beta\, d(f, g)$$
$$d(Bf, Bg) \le \beta\, d(f, g) \quad ∎$$

Example 4.1 Check Blackwell's conditions for the Bellman operator in a consumption problem (with stochastic asset returns, stochastic labor income, and a liquidity constraint):
$$(Bf)(x) = \sup_{c \in [0, x]} \left\{ u(c) + \beta\, E\, f\!\left( R_{+1}(x - c) + y_{+1} \right) \right\}$$

Monotonicity. Assume $f(x) \le g(x)$ for all $x$. Suppose $c^*$ is the optimal policy when the continuation value function is $f$:
$$(Bf)(x) = \sup_{c \in [0, x]} \left\{ u(c) + \beta\, E\, f\!\left( R_{+1}(x - c) + y_{+1} \right) \right\}$$
$$= u(c^*) + \beta\, E\, f\!\left( R_{+1}(x - c^*) + y_{+1} \right)$$
$$\le u(c^*) + \beta\, E\, g\!\left( R_{+1}(x - c^*) + y_{+1} \right)$$
$$\le \sup_{c \in [0, x]} \left\{ u(c) + \beta\, E\, g\!\left( R_{+1}(x - c) + y_{+1} \right) \right\} = (Bg)(x)$$

Discounting. Adding a constant $a$ to an optimization problem does not affect the optimal choice, so
$$[B(f + a)](x) = \sup_{c \in [0, x]} \left\{ u(c) + \beta\, E\left[ f\!\left( R_{+1}(x - c) + y_{+1} \right) + a \right] \right\}$$
$$= \sup_{c \in [0, x]} \left\{ u(c) + \beta\, E\, f\!\left( R_{+1}(x - c) + y_{+1} \right) \right\} + \beta a$$
$$= (Bf)(x) + \beta a$$
Here discounting holds with equality, which is sufficient.
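Having verified Blackwell's conditions, this operator can be iterated numerically. The sketch below is not from the lecture: the grids, log utility, and the two-point distributions for the return $R_{+1}$ and income $y_{+1}$ are all illustrative assumptions, and continuation values off the grid are handled by linear interpolation.

```python
import numpy as np

beta = 0.95
x_grid = np.linspace(0.01, 10.0, 120)        # cash-on-hand grid (assumption)
shares = np.linspace(0.01, 1.0, 60)          # consumption as a share of x
R_vals, R_prob = [0.98, 1.06], [0.5, 0.5]    # stochastic asset return (assumption)
y_vals, y_prob = [0.5, 1.5], [0.5, 0.5]      # stochastic labor income (assumption)

def u(c):
    return np.log(c)                         # log utility (assumption)

def B(v):
    """One application of the Bellman operator on the grid."""
    v_new = np.empty_like(v)
    for i, x in enumerate(x_grid):
        best = -np.inf
        for s in shares:                     # liquidity constraint: c in [0, x]
            c = s * x
            ev = 0.0                         # E[v(R(x - c) + y)]
            for R, pR in zip(R_vals, R_prob):
                for y, py in zip(y_vals, y_prob):
                    ev += pR * py * np.interp(R * (x - c) + y, x_grid, v)
            best = max(best, u(c) + beta * ev)
        v_new[i] = best
    return v_new

v = np.zeros_like(x_grid)
for _ in range(500):                         # error shrinks like beta^n
    v_next = B(v)
    if np.max(np.abs(v_next - v)) < 1e-6:
        break
    v = v_next
```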

5 Application: Search/stopping problem

An agent draws an offer, $x$, from a uniform distribution with support in the unit interval. The agent can either accept the offer and realize net present value $x$ (ending the game), or reject the offer and draw again a period later. All draws are independent. Rejections are costly because the agent discounts the future with discount factor $\delta$. This game continues until the agent receives an offer that she is willing to accept. The Bellman equation for this problem is:
$$v(x) = \max\left\{ x,\ \delta\, E\left[ v(x') \right] \right\}$$

In the first lecture, we showed that the solution would be a stationary threshold $x^*$: REJECT if $x < x^*$; ACCEPT if $x \ge x^*$. This implies that
$$v(x) = \begin{cases} x^* & \text{if } x \le x^* \\ x & \text{if } x \ge x^* \end{cases}$$
We showed that if
$$x^* = \frac{1}{\delta}\left( 1 - \sqrt{1 - \delta^2} \right)$$
then $v(x)$ solves the Bellman Equation.

Iteration of the Bellman Operator, starting with $v_0(x) = 0$. This is the Bellman Operator for our problem:
$$(Bw)(x) \equiv \max\left\{ x,\ \delta\, E\left[ w(x') \right] \right\}$$
Iterating the Bellman Operator once on $v_0(x)$ yields
$$v_1(x) = (Bv_0)(x) = \max\left\{ x,\ \delta\, E\left[ v_0(x') \right] \right\} = \max\{x, 0\} = x$$
So the associated threshold cutoff is $x_1^* = 0$. In other words, accept all offers greater than or equal to 0, since the continuation payoff is $\delta\, E\left[ v_0(x') \right] = 0$.
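A quick numerical version of this iteration, offered as a sketch rather than as part of the lecture: represent $w$ on a grid of offers, approximate $E[w(x')]$ for the uniform draw by the grid average, and apply $B$ starting from $v_0 = 0$. The value of $\delta$ is chosen to match the figure below ($\rho = 0.1 = -\ln \delta$).

```python
import numpy as np

delta = np.exp(-0.1)                 # discount factor with rho = 0.1
x = np.linspace(0.0, 1.0, 1001)      # grid of offers on the unit interval
v = np.zeros_like(x)                 # v_0(x) = 0

for n in range(1, 11):
    cutoff = delta * np.mean(v)      # x_n* = delta * E[v_{n-1}(x')]
    v = np.maximum(x, cutoff)        # one application of B
    print(n, cutoff)                 # x_1* = 0, x_2* = delta/2, ...
```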

General notation. Iteration $n$ of the Bellman Operator:
$$v_n(x) = (B^n v_0)(x) = (B(B^{n-1} v_0))(x) = \max\left\{ x,\ \delta\, E\left[ (B^{n-1} v_0)(x') \right] \right\}$$
Let $x_n^* \equiv \delta\, E\left[ (B^{n-1} v_0)(x') \right]$, which is the continuation payoff for $v_n$. $x_n^*$ is also the cutoff threshold associated with $v_n(x) = (B^n v_0)(x)$:
$$v_n(x) = (B^n v_0)(x) = \begin{cases} x_n^* & \text{if } x \le x_n^* \\ x & \text{if } x \ge x_n^* \end{cases}$$
$x_n^*$ is a sufficient statistic for $(B^n v_0)(x)$.

Another illustrative example. Iterate the functional operator on $v_1$:
$$v_2(x) = (B^2 v_0)(x) = (Bv_1)(x) = \max\left\{ x,\ \delta\, E\left[ v_1(x') \right] \right\} = \max\left\{ x,\ \delta\, E\left[ x' \right] \right\} = \max\left\{ x,\ \frac{\delta}{2} \right\}$$
So $x_2^* = \delta\, E[x'] = \delta/2$, and
$$v_2(x) = (B^2 v_0)(x) = \begin{cases} \delta/2 & \text{if } x \le \delta/2 \\ x & \text{if } x \ge \delta/2 \end{cases}$$

The proof itself: since
$$(B^{n-1} v_0)(x) = \begin{cases} x_{n-1}^* & \text{if } x \le x_{n-1}^* \\ x & \text{if } x \ge x_{n-1}^* \end{cases}$$
we have
$$x_n^* = \delta\, E\left[ (B^{n-1} v_0)(x') \right] = \delta \left[ \int_{x'=0}^{x_{n-1}^*} x_{n-1}^*\, dx' + \int_{x'=x_{n-1}^*}^{1} x'\, dx' \right] = \delta \left[ \left( x_{n-1}^* \right)^2 + \frac{1 - \left( x_{n-1}^* \right)^2}{2} \right] = \frac{\delta}{2}\left[ 1 + \left( x_{n-1}^* \right)^2 \right]$$
Set $x_n^* = x_{n-1}^*$ (and take the root that lies in the unit interval) to confirm that this sequence converges to
$$\lim_{n \to \infty} x_n^* = \frac{1}{\delta}\left( 1 - \sqrt{1 - \delta^2} \right)$$
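As a check (again an illustration, not part of the lecture notes), the scalar recursion $x_n^* = \frac{\delta}{2}\left[1 + (x_{n-1}^*)^2\right]$ can be iterated directly and compared with the closed-form limit; since $x_n^*$ is a sufficient statistic for $B^n v_0$, this reproduces the grid iteration above at a fraction of the cost.

```python
import math

delta = math.exp(-0.1)               # matches the figure's rho = 0.1
x_star = (1.0 - math.sqrt(1.0 - delta**2)) / delta   # closed-form limit

xn = 0.0                             # x_1* = 0: accept everything at first
for n in range(1, 11):
    print(n, xn, x_star - xn)        # the gap falls toward zero
    xn = 0.5 * delta * (1.0 + xn**2) # the recursion derived above
```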

[Figure: Iterating the functional operator for the search/stopping problem with discount rate $\rho = 0.1 = -\ln(\delta)$. Horizontal axis: iteration number (1 through 10); vertical axis: $x_n^*$, which rises from $x_1^* = 0$ toward its limit ($\approx 0.63$), with $x_1$ and $x_2$ labeled.]

Review of Today's Lecture:
1. Functional operators
2. Iterative solutions for the Bellman Equation
3. Contraction Mapping Theorem
4. Blackwell's Theorem
5. Application: Search and stopping problem (analytic iteration)