Econometric Methods for the Analysis of Dynamic General Equilibrium Models


Overview

Multiple equation methods:
- State space-observer form
- Three examples of the versatility of the state space-observer form:
  - Smoothing and filtering (estimation of the output gap, the real interest rate, ...)
  - Handling mixed monthly/quarterly data
  - Connection between DSGE models and vector autoregressions
- Limited information estimation: impulse response function matching
  - Impulse response functions
  - Formal connection between VARs and DSGE models
- Full information estimation
  - Maximum likelihood
  - Bayesian inference

Single equation ("limited information") methods:
- Introduction to the Generalized Method of Moments

State Space-Observer Form

Compact summary of the model and of the data used in the analysis. Typically, data are available in log form, so the following is useful. If $x$ is the steady state of $x_t$:

$$\hat{x}_t \equiv \frac{x_t - x}{x}, \qquad \frac{x_t}{x} = 1 + \hat{x}_t, \qquad \log\left(\frac{x_t}{x}\right) = \log(1 + \hat{x}_t) \approx \hat{x}_t.$$

Suppose we have a model solution in hand:

$$z_t = A z_{t-1} + B s_t, \qquad s_t = P s_{t-1} + \epsilon_t, \qquad E\epsilon_t\epsilon_t' = D.$$

State Space-Observer Form...

Consider example #3 in the solution notes, in which

$$z_t = \begin{pmatrix} \hat{K}_{t+1} \\ \hat{N}_t \end{pmatrix}, \qquad s_t = \hat{\varepsilon}_t, \qquad \epsilon_t = e_t.$$

The data used in the analysis may include variables in $z_t$ and/or other variables. Suppose the variables of interest include employment and GDP. GDP, $y_t$:

$$y_t = \varepsilon_t K_t^{\alpha} N_t^{1-\alpha},$$

so that

$$\hat{y}_t = \hat{\varepsilon}_t + \alpha\hat{K}_t + (1-\alpha)\hat{N}_t = \begin{pmatrix} 0 & 1-\alpha \end{pmatrix} z_t + \begin{pmatrix} \alpha & 0 \end{pmatrix} z_{t-1} + s_t.$$

Then,

$$Y_t^{data} = \begin{pmatrix} \log y_t \\ \log N_t \end{pmatrix} = \begin{pmatrix} \log y \\ \log N \end{pmatrix} + \begin{pmatrix} \hat{y}_t \\ \hat{N}_t \end{pmatrix}.$$

State Space-Observer Form...

Model prediction for the data:

$$Y_t^{data} = \begin{pmatrix} \log y \\ \log N \end{pmatrix} + \begin{pmatrix} \hat{y}_t \\ \hat{N}_t \end{pmatrix} = \begin{pmatrix} \log y \\ \log N \end{pmatrix} + \begin{pmatrix} 0 & 1-\alpha \\ 0 & 1 \end{pmatrix} z_t + \begin{pmatrix} \alpha & 0 \\ 0 & 0 \end{pmatrix} z_{t-1} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} s_t = a + H\xi_t,$$

where

$$\xi_t = \begin{pmatrix} z_t \\ z_{t-1} \\ \hat{\varepsilon}_t \end{pmatrix}, \qquad a = \begin{pmatrix} \log y \\ \log N \end{pmatrix}, \qquad H = \begin{pmatrix} 0 & 1-\alpha & \alpha & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \end{pmatrix}.$$

The observer equation may include measurement error, $w_t$:

$$Y_t^{data} = a + H\xi_t + w_t, \qquad E w_t w_t' = R.$$

Semantics: $\xi_t$ is the state of the system (do not confuse it with the economic state, $(K_t, \varepsilon_t)$!).

State Space-Observer Form...

The state equation: law of motion of the state, $\xi_t$:

$$\xi_t = F\xi_{t-1} + u_t, \qquad E u_t u_t' = Q,$$

$$\begin{pmatrix} z_{t+1} \\ z_t \\ s_{t+1} \end{pmatrix} = \begin{pmatrix} A & 0 & BP \\ I & 0 & 0 \\ 0 & 0 & P \end{pmatrix}\begin{pmatrix} z_t \\ z_{t-1} \\ s_t \end{pmatrix} + \begin{pmatrix} B \\ 0 \\ I \end{pmatrix}\epsilon_{t+1},$$

$$u_t = \begin{pmatrix} B \\ 0 \\ I \end{pmatrix}\epsilon_t, \qquad Q = \begin{pmatrix} BDB' & 0 & BD \\ 0 & 0 & 0 \\ DB' & 0 & D \end{pmatrix}, \qquad F = \begin{pmatrix} A & 0 & BP \\ I & 0 & 0 \\ 0 & 0 & P \end{pmatrix}.$$
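To fix ideas, here is a minimal numpy sketch assembling $(F, Q, a, H)$ from a model solution $(A, B, P, D)$ in the spirit of example #3 (scalar technology shock, $z_t = (\hat{K}_{t+1}, \hat{N}_t)'$). The numerical values of $A$, $B$, $P$, $D$, $\alpha$ and the steady-state levels below are purely illustrative placeholders, not calibrated values from the notes.

```python
import numpy as np

# Illustrative model solution: z_t = A z_{t-1} + B s_t, s_t = P s_{t-1} + eps_t
A = np.array([[0.9, 0.1],
              [0.3, 0.0]])          # z_t = (Khat_{t+1}, Nhat_t)'
B = np.array([[0.2],
              [0.5]])
P = np.array([[0.95]])
D = np.array([[0.01**2]])           # E eps_t eps_t' = D
alpha = 0.36
log_y, log_N = 1.0, -0.5            # steady-state levels (illustrative)

n_z, n_s = A.shape[0], P.shape[0]

# F = [A 0 BP; I 0 0; 0 0 P], with state xi_t = (z_t', z_{t-1}', s_t')'
F = np.block([
    [A,                    np.zeros((n_z, n_z)), B @ P],
    [np.eye(n_z),          np.zeros((n_z, n_z)), np.zeros((n_z, n_s))],
    [np.zeros((n_s, n_z)), np.zeros((n_s, n_z)), P],
])

# u_t = (B', 0, I)' eps_t  =>  Q = G D G'
G = np.vstack([B, np.zeros((n_z, n_s)), np.eye(n_s)])
Q = G @ D @ G.T

# Observer equation for (log y_t, log N_t): Y_t = a + H xi_t (+ w_t)
a = np.array([log_y, log_N])
H = np.array([[0.0, 1 - alpha, alpha, 0.0, 1.0],
              [0.0, 1.0,       0.0,   0.0, 0.0]])
```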

State Space-Observer Form...

Summary: the state-space, observer system is

$$\xi_t = F\xi_{t-1} + u_t, \qquad E u_t u_t' = Q,$$
$$Y_t^{data} = a + H\xi_t + w_t, \qquad E w_t w_t' = R.$$

It can be constructed from the model parameters $\theta = (\beta, \delta, \ldots)$, so

$$F = F(\theta), \quad Q = Q(\theta), \quad a = a(\theta), \quad H = H(\theta), \quad R = R(\theta).$$

State Space-Observer Form...

The state space-observer system is very useful:
- Estimation of $\theta$ and forecasting of $\xi_t$ and $Y_t^{data}$.
- Can take into account situations in which the data are a mixture of quarterly, monthly, and daily observations.
- Software is readily available on the web and elsewhere.
- Useful for solving the following forecasting problems (a minimal filtering sketch follows this list):
  - Filtering: $P\left[\xi_t \mid Y_{t-1}^{data}, Y_{t-2}^{data}, \ldots, Y_1^{data}\right]$, $t = 1, 2, \ldots, T$.
  - Smoothing: $P\left[\xi_t \mid Y_T^{data}, \ldots, Y_1^{data}\right]$, $t = 1, 2, \ldots, T$.
- Example: the real rate of interest and the output gap can be recovered from $\xi_t$ using example #5 in the solution notes.
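To make the filtering problem concrete, here is a bare-bones Kalman filter sketch for the system $\xi_t = F\xi_{t-1} + u_t$, $Y_t^{data} = a + H\xi_t + w_t$. It is a minimal illustration only (no smoother, no missing-data handling); the initialization and the matrices $F, Q, a, H, R$ are taken as given, e.g. from the construction sketched above.

```python
import numpy as np

def kalman_filter(Y, F, Q, a, H, R):
    """Bare-bones Kalman filter for xi_t = F xi_{t-1} + u_t,
    Y_t = a + H xi_t + w_t. Returns one-step-ahead predicted states,
    filtered (updated) states, innovations, and innovation variances."""
    T, n = Y.shape[0], F.shape[0]
    xi_pred = np.zeros((T, n))           # E[xi_t | Y_{t-1}, ..., Y_1]
    xi_filt = np.zeros((T, n))           # E[xi_t | Y_t, ..., Y_1]
    innov, innov_var = [], []

    xi, P = np.zeros(n), np.eye(n)       # simple initialization (illustrative)
    for t in range(T):
        # prediction step
        xi_p = F @ xi
        P_p = F @ P @ F.T + Q
        xi_pred[t] = xi_p

        # update step
        v = Y[t] - (a + H @ xi_p)        # innovation
        S = H @ P_p @ H.T + R            # innovation variance
        K = P_p @ H.T @ np.linalg.inv(S) # Kalman gain
        xi = xi_p + K @ v
        P = P_p - K @ H @ P_p
        xi_filt[t] = xi
        innov.append(v); innov_var.append(S)

    return xi_pred, xi_filt, np.array(innov), np.array(innov_var)
```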

Mixed Monthly/Quarterly Observations

Different data arrive at different frequencies: daily, monthly, quarterly, etc. This feature can be handled easily in the state space-observer system. Example:
- suppose inflation and hours are monthly, $t = 0, 1/3, 2/3, 1, 4/3, 5/3, 2, \ldots$
- suppose GDP is quarterly, $t = 0, 1, 2, 3, \ldots$

$$Y_t^{data} = \begin{pmatrix} GDP_t \\ \text{monthly inflation}_t \\ \text{monthly inflation}_{t-1/3} \\ \text{monthly inflation}_{t-2/3} \\ \text{hours}_t \\ \text{hours}_{t-1/3} \\ \text{hours}_{t-2/3} \end{pmatrix}, \qquad t = 0, 1, 2, \ldots$$

That is, we can think of our data set as actually being quarterly, with quarterly observations on the first month's inflation, quarterly observations on the second month's inflation, etc.

Mixed Monthly/Quarterly Observations...

Problem: find a state-space observer system in which the observed data are

$$Y_t^{data} = \begin{pmatrix} GDP_t \\ \text{monthly inflation}_t \\ \text{monthly inflation}_{t-1/3} \\ \text{monthly inflation}_{t-2/3} \\ \text{hours}_t \\ \text{hours}_{t-1/3} \\ \text{hours}_{t-2/3} \end{pmatrix}, \qquad t = 0, 1, 2, \ldots$$

Solution: easy!

Mixed Monthly/Quarterly Observations...

Model: specified at a monthly level, with solution, for $t = 0, 1/3, 2/3, \ldots$:

$$z_t = A z_{t-1/3} + B s_t, \qquad s_t = P s_{t-1/3} + \epsilon_t, \qquad E\epsilon_t\epsilon_t' = D.$$

Monthly state-space observer system, $t = 0, 1/3, 2/3, \ldots$:

$$\xi_t = F\xi_{t-1/3} + u_t, \qquad E u_t u_t' = Q, \qquad u_t \ \text{iid}, \ t = 0, 1/3, 2/3, \ldots$$
$$Y_t = H\xi_t, \qquad Y_t = \begin{pmatrix} y_t \\ \pi_t \\ h_t \end{pmatrix}.$$

Note: there is a first-order vector autoregressive representation for the quarterly state (see the derivation below):

$$\xi_t = F^3\xi_{t-1} + u_t + F u_{t-1/3} + F^2 u_{t-2/3},$$

and $u_t + F u_{t-1/3} + F^2 u_{t-2/3}$ is iid for $t = 0, 1, 2, \ldots$!!
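The quarterly representation follows from iterating the monthly state equation three times, a step left implicit on the slide:

$$\xi_t = F\xi_{t-1/3} + u_t = F\left(F\xi_{t-2/3} + u_{t-1/3}\right) + u_t = F^2\left(F\xi_{t-1} + u_{t-2/3}\right) + F u_{t-1/3} + u_t = F^3\xi_{t-1} + F^2 u_{t-2/3} + F u_{t-1/3} + u_t.$$

Since successive monthly $u$'s are iid, the composite quarterly innovation is iid across $t = 0, 1, 2, \ldots$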

Mixed Monthly/Quarterly Observations...

Consider the following system:

$$\begin{pmatrix} \xi_t \\ \xi_{t-1/3} \\ \xi_{t-2/3} \end{pmatrix} = \begin{pmatrix} F^3 & 0 & 0 \\ F^2 & 0 & 0 \\ F & 0 & 0 \end{pmatrix}\begin{pmatrix} \xi_{t-1} \\ \xi_{t-4/3} \\ \xi_{t-5/3} \end{pmatrix} + \begin{pmatrix} I & F & F^2 \\ 0 & I & F \\ 0 & 0 & I \end{pmatrix}\begin{pmatrix} u_t \\ u_{t-1/3} \\ u_{t-2/3} \end{pmatrix}.$$

Define

$$\tilde{\xi}_t = \begin{pmatrix} \xi_t \\ \xi_{t-1/3} \\ \xi_{t-2/3} \end{pmatrix}, \qquad \tilde{F} = \begin{pmatrix} F^3 & 0 & 0 \\ F^2 & 0 & 0 \\ F & 0 & 0 \end{pmatrix}, \qquad \tilde{u}_t = \begin{pmatrix} I & F & F^2 \\ 0 & I & F \\ 0 & 0 & I \end{pmatrix}\begin{pmatrix} u_t \\ u_{t-1/3} \\ u_{t-2/3} \end{pmatrix},$$

so that $\tilde{\xi}_t = \tilde{F}\tilde{\xi}_{t-1} + \tilde{u}_t$, with $\tilde{u}_t$ iid in quarterly data, $t = 0, 1, 2, \ldots$, and

$$E\tilde{u}_t\tilde{u}_t' = \tilde{Q} = \begin{pmatrix} I & F & F^2 \\ 0 & I & F \\ 0 & 0 & I \end{pmatrix}\begin{pmatrix} Q & 0 & 0 \\ 0 & Q & 0 \\ 0 & 0 & Q \end{pmatrix}\begin{pmatrix} I & F & F^2 \\ 0 & I & F \\ 0 & 0 & I \end{pmatrix}'.$$

Mixed Monthly/Quarterly Observations...

Done! The state space-observer system for mixed monthly/quarterly data, for $t = 0, 1, 2, \ldots$:

$$\tilde{\xi}_t = \tilde{F}\tilde{\xi}_{t-1} + \tilde{u}_t, \qquad \tilde{u}_t \ \text{iid}, \qquad E\tilde{u}_t\tilde{u}_t' = \tilde{Q},$$
$$Y_t^{data} = \tilde{H}\tilde{\xi}_t + w_t, \qquad w_t \ \text{iid}, \qquad E w_t w_t' = R.$$

Here, $\tilde{H}$ selects elements of $\tilde{\xi}_t$ to construct $Y_t^{data}$ (a construction sketch follows below):
- It can easily handle the distinction between quarterly data that represent monthly averages (as with flow variables) and point-in-time observations on one month in the quarter (as with stock variables).
- Can use the Kalman filter to forecast current-quarter data based on the first month's (day's, week's) observations.
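A minimal sketch of the stacking step, assuming the monthly $F$ and $Q$ have already been built (e.g. as above). The selection matrix `H_tilde` shown in the usage comment simply picks out point-in-time monthly observations and is illustrative only; averaging quarterly flows would use weights 1/3 on the three months instead.

```python
import numpy as np

def quarterly_stack(F, Q):
    """Stack a monthly state equation xi_t = F xi_{t-1/3} + u_t, E u u' = Q,
    into the quarterly system xi~_t = F~ xi~_{t-1} + u~_t, E u~ u~' = Q~."""
    n = F.shape[0]
    Z = np.zeros((n, n))
    F2, F3 = F @ F, F @ F @ F

    F_tilde = np.block([[F3, Z, Z],
                        [F2, Z, Z],
                        [F,  Z, Z]])

    # u~_t = [I F F^2; 0 I F; 0 0 I] (u_t', u_{t-1/3}', u_{t-2/3}')'
    G = np.block([[np.eye(n), F,         F2],
                  [Z,         np.eye(n), F],
                  [Z,         Z,         np.eye(n)]])
    Q_tilde = G @ np.kron(np.eye(3), Q) @ G.T
    return F_tilde, Q_tilde

# Example usage with the (F, Q) built earlier; H_tilde selects the first
# state element in each of the three months of the quarter (illustrative).
# F_tilde, Q_tilde = quarterly_stack(F, Q)
# n = F.shape[0]
# H_tilde = np.zeros((3, 3 * n))
# H_tilde[0, 0] = H_tilde[1, n] = H_tilde[2, 2 * n] = 1.0
```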

Matching Impulse Response Functions

Set $\epsilon_t = 1$ for $t = 1$, $\epsilon_t = 0$ otherwise. The impulse response function is the log deviation of the data with the shock from where the data would have been in the absence of a shock:

$$u_t = \begin{pmatrix} B \\ 0 \\ I \end{pmatrix}\epsilon_t, \qquad \xi_t = F\xi_{t-1} + u_t, \qquad \xi_0 = 0,$$

$$\text{impulse response function} = \tilde{Y}_t^{data} = H\xi_t, \qquad t = 1, 2, \ldots$$

Choose the model parameters, $\theta$, to match $\tilde{Y}_t^{data}$ with the corresponding estimate from a VAR (more on this later).
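A minimal sketch of the impulse response calculation, assuming $F$, $H$, and the shock-loading column $G = (B', 0, I)'$ are available (e.g. from the construction above); the horizon is an arbitrary illustrative choice.

```python
import numpy as np

def impulse_response(F, H, G, horizon=20):
    """Impulse response of the observables to a unit shock at t = 1:
    xi_0 = 0, u_1 = G * 1, u_t = 0 for t > 1,
    xi_t = F xi_{t-1} + u_t, Y~_t = H xi_t."""
    G = np.atleast_2d(G).reshape(-1, 1)
    xi = G.flatten()                  # xi_1 = F * 0 + G * 1
    irf = [H @ xi]
    for _ in range(1, horizon):
        xi = F @ xi                   # no further shocks after t = 1
        irf.append(H @ xi)
    return np.array(irf)              # horizon x (number of observables)
```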

Connection Between DSGEs and VARs

The Fernandez-Villaverde, Rubio-Ramirez, Sargent result. Vector autoregression:

$$Y_t = B_1 Y_{t-1} + B_2 Y_{t-2} + \ldots + u_t,$$

where $u_t$ is iid. The "matching impulse response functions" strategy for building DSGE models fits VARs and assumes the $u_t$ are a rotation of the economic shocks (for details, see later notes). We can use the state space-observer representation to assess this assumption from the perspective of a DSGE model.

Connection Between DSGEs and VARs...

System (ignoring constant terms and measurement error):

$$\text{(state equation)}\quad \xi_t = F\xi_{t-1} + D\epsilon_t, \qquad D = \begin{pmatrix} B \\ 0 \\ I \end{pmatrix},$$
$$\text{(observer equation)}\quad Y_t = H\xi_t.$$

Substituting:

$$Y_t = HF\xi_{t-1} + HD\epsilon_t.$$

Suppose $HD$ is square and invertible. Then

$$\epsilon_t = (HD)^{-1}Y_t - (HD)^{-1}HF\xi_{t-1}. \qquad (*)$$

Substitute the latter into the state equation:

$$\xi_t = F\xi_{t-1} + D(HD)^{-1}Y_t - D(HD)^{-1}HF\xi_{t-1} = \left[I - D(HD)^{-1}H\right]F\xi_{t-1} + D(HD)^{-1}Y_t.$$

Connection Between DSGEs and VARs...

We have:

$$\xi_t = M\xi_{t-1} + D(HD)^{-1}Y_t, \qquad M = \left[I - D(HD)^{-1}H\right]F.$$

If the eigenvalues of $M$ are less than unity,

$$\xi_t = D(HD)^{-1}Y_t + MD(HD)^{-1}Y_{t-1} + M^2 D(HD)^{-1}Y_{t-2} + \ldots$$

Substituting into $(*)$:

$$\epsilon_t = (HD)^{-1}Y_t - (HD)^{-1}HF\left[D(HD)^{-1}Y_{t-1} + MD(HD)^{-1}Y_{t-2} + M^2 D(HD)^{-1}Y_{t-3} + \ldots\right],$$

or,

Connection Between DSGEs and VARs...

$$Y_t = B_1 Y_{t-1} + B_2 Y_{t-2} + \ldots + u_t,$$

where

$$u_t = HD\epsilon_t, \qquad B_j = HFM^{j-1}D(HD)^{-1}, \qquad j = 1, 2, \ldots$$

The latter is the VAR representation.
- Note: $\epsilon_t$ is invertible because it lies in the space of current and past $Y_t$'s.
- Note: the VAR is infinite-ordered.
- Note: we assumed the system is square. Sims-Zha (Macroeconomic Dynamics) show that squareness is not necessary.
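A minimal sketch of this invertibility check and of the implied VAR coefficients, assuming $F$, $H$, and the shock-loading matrix $D$ from the state equation are in hand, with $H$ chosen so that $HD$ is square.

```python
import numpy as np

def dsge_to_var(F, H, D, n_lags=4):
    """Check the invertibility condition and compute the implied VAR
    coefficients B_j = H F M^{j-1} D (HD)^{-1}, where
    M = [I - D (HD)^{-1} H] F."""
    HD = H @ D
    HD_inv = np.linalg.inv(HD)             # requires HD square and invertible
    M = (np.eye(F.shape[0]) - D @ HD_inv @ H) @ F

    eigs = np.linalg.eigvals(M)
    invertible = bool(np.all(np.abs(eigs) < 1))  # eigenvalues inside unit circle?

    B = []
    Mj = np.eye(F.shape[0])                # M^0
    for _ in range(n_lags):
        B.append(H @ F @ Mj @ D @ HD_inv)  # B_1, B_2, ...
        Mj = Mj @ M
    return invertible, eigs, B
```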

Maximum Likelihood Estimation

State space-observer system:

$$\xi_{t+1} = F\xi_t + u_{t+1}, \qquad E u_t u_t' = Q,$$
$$Y_t^{data} = a_0 + H\xi_t + w_t, \qquad E w_t w_t' = R.$$

The parameters of the system, $(F, Q, a_0, H, R)$, are functions of the model parameters, $\theta$. Formulas for computing the likelihood

$$P\left(Y^{data}\mid\theta\right) = P\left(Y_1^{data}, \ldots, Y_T^{data}\mid\theta\right)$$

are standard (see Hamilton's textbook).
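Under Gaussian shocks, the standard formula is the prediction-error decomposition delivered by the Kalman filter sketched earlier, with innovations $v_t = Y_t^{data} - a_0 - H\xi_{t|t-1}$ and variances $S_t = HP_{t|t-1}H' + R$; stated here for reference:

$$\log P\left(Y^{data}\mid\theta\right) = -\frac{1}{2}\sum_{t=1}^{T}\left[k\log(2\pi) + \log\left|S_t\right| + v_t'S_t^{-1}v_t\right],$$

where $k$ is the number of observed variables.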

Bayesian Maximum Likelihood

Bayesians describe the mapping from prior beliefs about $\theta$, summarized in $p(\theta)$, to new posterior beliefs in the light of observing the data, $Y^{data}$. A general property of probabilities:

$$p\left(Y^{data}, \theta\right) = \begin{cases} p\left(Y^{data}\mid\theta\right)p(\theta) \\ p\left(\theta\mid Y^{data}\right)p\left(Y^{data}\right) \end{cases}$$

which implies Bayes' rule:

$$p\left(\theta\mid Y^{data}\right) = \frac{p\left(Y^{data}\mid\theta\right)p(\theta)}{p\left(Y^{data}\right)},$$

the mapping from prior to posterior induced by $Y^{data}$.

Bayesian Maximum Likelihood...

Properties of the posterior distribution, $p\left(\theta\mid Y^{data}\right)$:
- The value of $\theta$ that maximizes $p\left(\theta\mid Y^{data}\right)$ (the mode of the posterior distribution).
- Graphs that compare the marginal posterior distribution of individual elements of $\theta$ with the corresponding prior.
- Probability intervals about the mode of $\theta$ ("Bayesian confidence intervals").
- Other properties of $p\left(\theta\mid Y^{data}\right)$ helpful for assessing model fit.

Bayesian Maximum Likelihood...

Computation of the mode is sometimes referred to as Bayesian maximum likelihood:

$$\theta^{mode} = \arg\max_{\theta}\left[\log p\left(Y^{data}\mid\theta\right) + \sum_{i=1}^{N}\log p_i(\theta_i)\right],$$

i.e., maximum likelihood with a penalty function.

The shape of the posterior distribution, $p\left(\theta\mid Y^{data}\right)$, is obtained by the Metropolis-Hastings algorithm. The algorithm computes $\theta^{(1)}, \ldots, \theta^{(N)}$ which, as $N \rightarrow \infty$, has a density that approximates $p\left(\theta\mid Y^{data}\right)$ well. The marginal posterior distribution of any element of $\theta$ is displayed as the histogram of the corresponding element of $\{\theta^{(i)}, i = 1, \ldots, N\}$.

Metropolis-Hastings Algorithm (MCMC)

We have (except for a constant):

$$f\Big(\underbrace{\theta}_{N\times 1}\mid Y\Big) = \frac{f(Y\mid\theta)f(\theta)}{f(Y)}.$$

We want the marginal posterior distribution of $\theta_i$:

$$h\left(\theta_i\mid Y\right) = \int_{\theta_{j\neq i}} f\left(\theta\mid Y\right)d\theta_{j\neq i}, \qquad i = 1, \ldots, N.$$

The MCMC algorithm can approximate $h\left(\theta_i\mid Y\right)$. Obtain ($V$ is produced automatically by gradient-based maximization methods):

$$\theta^{mode} \equiv \theta^* = \arg\max_{\theta} f(Y\mid\theta)f(\theta), \qquad V^{-1} \equiv -\frac{\partial^2 \log\left[f(Y\mid\theta)f(\theta)\right]}{\partial\theta\,\partial\theta'}\Bigg|_{\theta=\theta^*}.$$

Metropolis-Hastings Algorithm (MCMC)...

Compute the sequence $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(M)}$ ($M$ large), whose distribution turns out to have pdf $f(\theta\mid Y)$.
- $\theta^{(1)} = \theta^*$.
- To compute $\theta^{(r)}$, for $r > 1$:
  - Step 1: select a candidate $\theta^{(r)}$, call it $x$: draw $\underbrace{x}_{N\times 1}$ from the jump distribution $\theta^{(r-1)} + k\,N(0, V)$, where $k$ is a scalar.
  - Step 2: compute the scalar $\lambda$:
    $$\lambda = \frac{f(Y\mid x)f(x)}{f\left(Y\mid\theta^{(r-1)}\right)f\left(\theta^{(r-1)}\right)}.$$
  - Step 3: compute $\theta^{(r)}$:
    $$\theta^{(r)} = \begin{cases} \theta^{(r-1)} & \text{if } u > \lambda \\ x & \text{if } u < \lambda \end{cases},$$
    where $u$ is a realization from the uniform $[0,1]$ distribution.
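A minimal random-walk Metropolis-Hastings sketch along these lines, assuming a user-supplied function `log_posterior(theta)` returning $\log\left[f(Y\mid\theta)f(\theta)\right]$ up to a constant, plus a mode `theta_star` and a scale matrix `V` (e.g. from a gradient-based optimizer). The defaults for `M`, `k`, and the seed are illustrative.

```python
import numpy as np

def metropolis_hastings(log_posterior, theta_star, V, M=10_000, k=0.3, seed=0):
    """Random-walk Metropolis-Hastings: candidate x = theta^(r-1) + k*N(0, V),
    accepted when u < lambda, lambda = posterior(x) / posterior(theta^(r-1))."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(V)
    n = len(theta_star)

    draws = np.empty((M, n))
    draws[0] = theta_star
    logp_curr = log_posterior(theta_star)
    accepted = 0

    for r in range(1, M):
        x = draws[r - 1] + k * (chol @ rng.standard_normal(n))
        logp_x = log_posterior(x)
        # accept if u < lambda, i.e. log(u) < log-posterior difference
        if np.log(rng.uniform()) < logp_x - logp_curr:
            draws[r], logp_curr = x, logp_x
            accepted += 1
        else:
            draws[r] = draws[r - 1]

    return draws, accepted / (M - 1)   # tune k so the acceptance rate is ~0.27
```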

Metropolis-Hastings Algorithm (MCMC)...

Approximating the marginal posterior distribution, $h(\theta_i\mid Y)$, of $\theta_i$: compute and display the histogram of $\theta_i^{(1)}, \theta_i^{(2)}, \ldots, \theta_i^{(M)}$, $i = 1, \ldots, N$.

Other objects of interest: the mean and variance of the posterior distribution of $\theta$:

$$E\theta \simeq \bar{\theta} \equiv \frac{1}{M}\sum_{j=1}^{M}\theta^{(j)}, \qquad Var(\theta) \simeq \frac{1}{M}\sum_{j=1}^{M}\left[\theta^{(j)} - \bar{\theta}\right]\left[\theta^{(j)} - \bar{\theta}\right]';$$

and the marginal density of $Y$ (actually, Geweke's harmonic mean works better):

$$f(Y) \simeq \frac{1}{M}\sum_{j=1}^{M} f\left(Y\mid\theta^{(j)}\right)f\left(\theta^{(j)}\right).$$

Metropolis-Hastings Algorithm (MCMC)...

Some intuition:
- The algorithm is more likely to select moves into high-probability regions than into low-probability regions.
- The set $\{\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(M)}\}$ is populated relatively more by elements near the mode of $f(\theta\mid Y)$.
- The set $\{\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(M)}\}$ is also populated (though less so) by elements far from the mode of $f(\theta\mid Y)$.

Metropolis-Hastings Algorithm (MCMC)...

Practical issues:
- What value should you set $k$ to? Set $k$ so that you accept (i.e., $\theta^{(r)} = x$) in step 3 of the MCMC algorithm roughly 27 percent of the time.
- What value of $M$ should you set? A value such that, if $M$ is increased further, your results do not change. In practice, $M = 10{,}000$ (a small value) up to $M = 1{,}000{,}000$. Large $M$ is time-consuming; one could use the Laplace approximation (after checking its accuracy) in the initial phases of a research project.

Laplace Approximation to Posterior Distribution

In practice, the Metropolis-Hastings algorithm is very time intensive. Do it last! In practice, the Laplace approximation is quick, essentially free, and very accurate. Let $\theta \in R^N$ denote the $N$-dimensional vector of parameters and

$$g(\theta) \equiv \log f(y\mid\theta)f(\theta),$$

where $f(y\mid\theta)$ is the likelihood of the data, $f(\theta)$ is the prior on the parameters, and $\theta^*$ is the maximum of $g(\theta)$ (i.e., the mode).

Laplace Approximation to Posterior Distribution...

Second-order Taylor series expansion about $\theta = \theta^*$:

$$g(\theta) \approx g(\theta^*) + g_\theta(\theta^*)(\theta - \theta^*) - \frac{1}{2}(\theta - \theta^*)'g_{\theta\theta}(\theta^*)(\theta - \theta^*),$$

where

$$g_{\theta\theta}(\theta^*) = -\frac{\partial^2 \log f(y\mid\theta)f(\theta)}{\partial\theta\,\partial\theta'}\Bigg|_{\theta=\theta^*}.$$

Interior optimality implies:

$$g_\theta(\theta^*) = 0, \qquad g_{\theta\theta}(\theta^*) \text{ positive definite.}$$

Then,

$$f(y\mid\theta)f(\theta) \simeq f(y\mid\theta^*)f(\theta^*)\exp\left\{-\frac{1}{2}(\theta - \theta^*)'g_{\theta\theta}(\theta^*)(\theta - \theta^*)\right\}.$$

Laplace Approximation to Posterior Distribution...

Note that

$$\frac{1}{(2\pi)^{N/2}}\left|g_{\theta\theta}(\theta^*)\right|^{1/2}\exp\left\{-\frac{1}{2}(\theta - \theta^*)'g_{\theta\theta}(\theta^*)(\theta - \theta^*)\right\}$$

is the multinormal density for an $N$-dimensional random variable $\theta$ with mean $\theta^*$ and variance $g_{\theta\theta}(\theta^*)^{-1}$. So the posterior of $\theta_i$ (i.e., $h(\theta_i\mid Y)$) is approximately

$$\theta_i \sim N\left(\theta_i^*, \left[g_{\theta\theta}(\theta^*)^{-1}\right]_{ii}\right).$$

This formula for the posterior distribution is essentially free, because $g_{\theta\theta}$ is computed as part of gradient-based numerical optimization procedures.

Laplace Approximation to Posterior Distribution...

The marginal likelihood of the data, $y$, is useful for model comparisons, and it is easy to compute using the Laplace approximation. Property of the Normal distribution:

$$\int \frac{1}{(2\pi)^{N/2}}\left|g_{\theta\theta}(\theta^*)\right|^{1/2}\exp\left\{-\frac{1}{2}(\theta - \theta^*)'g_{\theta\theta}(\theta^*)(\theta - \theta^*)\right\}d\theta = 1.$$

Then,

$$\int f(y\mid\theta)f(\theta)\,d\theta \simeq \int f(y\mid\theta^*)f(\theta^*)\exp\left\{-\frac{1}{2}(\theta - \theta^*)'g_{\theta\theta}(\theta^*)(\theta - \theta^*)\right\}d\theta$$
$$= \frac{f(y\mid\theta^*)f(\theta^*)}{\frac{1}{(2\pi)^{N/2}}\left|g_{\theta\theta}(\theta^*)\right|^{1/2}}\int \frac{1}{(2\pi)^{N/2}}\left|g_{\theta\theta}(\theta^*)\right|^{1/2}\exp\left\{-\frac{1}{2}(\theta - \theta^*)'g_{\theta\theta}(\theta^*)(\theta - \theta^*)\right\}d\theta = \frac{f(y\mid\theta^*)f(\theta^*)}{\frac{1}{(2\pi)^{N/2}}\left|g_{\theta\theta}(\theta^*)\right|^{1/2}}.$$

Laplace Approximation to Posterior Distribution...

Formula for the marginal likelihood based on the Laplace approximation:

$$f(y) = \int f(y\mid\theta)f(\theta)\,d\theta \simeq \frac{f(y\mid\theta^*)f(\theta^*)(2\pi)^{N/2}}{\left|g_{\theta\theta}(\theta^*)\right|^{1/2}}.$$

Suppose $f(y\mid \text{Model 1}) > f(y\mid \text{Model 2})$. Then the posterior odds on Model 1 are higher than on Model 2: Model 1 fits better than Model 2. We can use this to compare two different models, or to evaluate the contribution to fit of various model features: habit persistence, adjustment costs, etc.
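A minimal sketch of the Laplace approximation and the implied log marginal likelihood, assuming a user-supplied `log_posterior(theta)` returning $g(\theta) = \log f(y\mid\theta)f(\theta)$. The finite-difference Hessian, the BFGS optimizer, and the step size are illustrative choices, not the only way to obtain $g_{\theta\theta}(\theta^*)$.

```python
import numpy as np
from scipy.optimize import minimize

def laplace_approximation(log_posterior, theta0):
    """Posterior mode, Gaussian approximation N(theta*, g_tt^{-1}), and
    log f(y) ~= g(theta*) + (N/2) log(2*pi) - 0.5 log|g_tt(theta*)|."""
    res = minimize(lambda th: -log_posterior(th), theta0, method="BFGS")
    theta_star = res.x
    N = len(theta_star)

    # g_tt = -Hessian of the log posterior at the mode, central finite differences
    h = 1e-5
    g_tt = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            ei, ej = np.eye(N)[i] * h, np.eye(N)[j] * h
            g_tt[i, j] = -(log_posterior(theta_star + ei + ej)
                           - log_posterior(theta_star + ei - ej)
                           - log_posterior(theta_star - ei + ej)
                           + log_posterior(theta_star - ei - ej)) / (4 * h * h)

    cov = np.linalg.inv(g_tt)                     # approximate posterior variance
    sign, logdet = np.linalg.slogdet(g_tt)
    log_marginal = (log_posterior(theta_star)
                    + 0.5 * N * np.log(2 * np.pi) - 0.5 * logdet)
    return theta_star, cov, log_marginal
```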

Generalized Method of Moments

Express your econometric estimator in Hansen's GMM framework and you get standard errors. Essentially any estimation strategy fits (see Hamilton). GMM works when the parameters of interest, $\beta$, have the following property:

$$E\underbrace{u_t(\beta)}_{N\times 1} = 0, \qquad \underbrace{\beta}_{n\times 1} \text{ the true value of some parameter(s) of interest,}$$

where $u_t(\beta)$ is a stationary stochastic process (and other conditions hold).
- $n = N$: exactly identified.
- $n < N$: overidentified.

Generalized Method of Moments...

Example 1: mean.

$$\beta = Ex_t, \qquad u_t(\beta) = \beta - x_t.$$

Example 2: mean and variance.

$$\beta = \begin{pmatrix} \mu \\ \sigma \end{pmatrix}, \qquad Ex_t = \mu, \qquad E(x_t - \mu)^2 = \sigma^2;$$

then,

$$u_t(\beta) = \begin{pmatrix} \mu - x_t \\ (x_t - \mu)^2 - \sigma^2 \end{pmatrix}.$$

Generalized Method of Moments...

Example 3: mean, variance, correlation, relative standard deviation.

$$\beta = \begin{pmatrix} \mu_y \\ \sigma_y \\ \mu_x \\ \sigma_x \\ \rho_{xy} \\ \lambda \end{pmatrix}, \qquad \lambda \equiv \sigma_x/\sigma_y,$$

where

$$Ey_t = \mu_y, \quad E\left(y_t - \mu_y\right)^2 = \sigma_y^2, \quad Ex_t = \mu_x, \quad E(x_t - \mu_x)^2 = \sigma_x^2, \quad \rho_{xy} = \frac{E\left(y_t - \mu_y\right)(x_t - \mu_x)}{\sigma_y\sigma_x};$$

then

$$u_t(\beta) = \begin{pmatrix} \mu_x - x_t \\ (x_t - \mu_x)^2 - \sigma_x^2 \\ \mu_y - y_t \\ \left(y_t - \mu_y\right)^2 - \sigma_y^2 \\ \sigma_y\sigma_x\rho_{xy} - \left(y_t - \mu_y\right)(x_t - \mu_x) \\ \sigma_y\lambda - \sigma_x \end{pmatrix}.$$

Generalized Method of Moments...

Example 4: New Keynesian Phillips curve.

$$\pi_t = 0.99E_t\pi_{t+1} + \gamma s_t,$$

or,

$$\pi_t - 0.99\pi_{t+1} - \gamma s_t = \eta_{t+1},$$

where

$$\eta_{t+1} = 0.99\left(E_t\pi_{t+1} - \pi_{t+1}\right) \;\Rightarrow\; E_t\eta_{t+1} = 0.$$

Under rational expectations, $\eta_{t+1}$ is orthogonal to time $t$ information, $z_t$:

$$u_t(\gamma) = \left[\pi_t - 0.99\pi_{t+1} - \gamma s_t\right]z_t.$$

Generalized Method of Moments...

Inference about $\beta$. Estimator of $\beta$ in the exactly identified case ($n = N$): choose $\hat{\beta}$ to mimic the population property of the true $\beta$, $Eu_t(\beta) = 0$. Define:

$$g_T(\beta) = \frac{1}{T}\sum_{t=1}^{T}u_t(\beta).$$

Solve

$$\hat{\beta}: \quad \underbrace{g_T\left(\hat{\beta}\right)}_{N\times 1} = \underbrace{0}_{N\times 1}.$$

Generalized Method of Moments...

Example 1: mean.

$$\beta = Ex_t, \qquad u_t(\beta) = \beta - x_t.$$

Choose $\hat{\beta}$ so that

$$g_T\left(\hat{\beta}\right) = \frac{1}{T}\sum_{t=1}^{T}u_t\left(\hat{\beta}\right) = \hat{\beta} - \frac{1}{T}\sum_{t=1}^{T}x_t = 0,$$

and $\hat{\beta}$ is simply the sample mean.

Generalized Method of Moments...

Example 4 in the exactly identified case:

$$Eu_t(\gamma) = E\left[\pi_t - 0.99\pi_{t+1} - \gamma s_t\right]z_t, \qquad z_t \text{ scalar.}$$

Choose $\hat{\gamma}$ so that

$$g_T\left(\hat{\gamma}\right) = \frac{1}{T}\sum_{t=1}^{T}\left[\pi_t - 0.99\pi_{t+1} - \hat{\gamma}s_t\right]z_t = 0,$$

or, the standard instrumental variables estimator:

$$\hat{\gamma} = \frac{\frac{1}{T}\sum_{t=1}^{T}\left[\pi_t - 0.99\pi_{t+1}\right]z_t}{\frac{1}{T}\sum_{t=1}^{T}s_t z_t}.$$
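A minimal numpy sketch of this IV estimator for the Phillips-curve example; the arrays `pi`, `pi_lead`, `s`, and `z` are hypothetical stand-ins for the series $\pi_t$, $\pi_{t+1}$, $s_t$, $z_t$, assumed already aligned in time.

```python
import numpy as np

def nkpc_iv(pi, pi_lead, s, z, beta_disc=0.99):
    """Exactly identified IV/GMM estimator of gamma in
    pi_t = beta_disc * E_t pi_{t+1} + gamma * s_t, instrument z_t:
    gamma_hat = sum[(pi_t - 0.99 pi_{t+1}) z_t] / sum[s_t z_t]."""
    num = np.mean((pi - beta_disc * pi_lead) * z)
    den = np.mean(s * z)
    return num / den
```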

Generalized Method of Moments...

Key message: in the exactly identified case, GMM does not deliver a new estimator you would not have thought of on your own (means, correlations, regression coefficients, exactly identified IV estimation, maximum likelihood). GMM provides a framework for deriving asymptotically valid formulas for estimating sampling uncertainty.

Generalized Method of Moments...

Estimating $\beta$ in the overidentified case ($N > n$). We cannot exactly implement the sample analog of $Eu_t(\beta) = 0$,

$$g_T\Big(\underbrace{\hat{\beta}}_{n\times 1}\Big) = \underbrace{0}_{N\times 1},$$

since there are $N$ equations in only $n$ unknowns. Instead, do the best you can:

$$\hat{\beta} = \arg\min_{\beta}\, g_T(\beta)'W_T\,g_T(\beta),$$

where $W_T$ is a positive definite weighting matrix. GMM works for any positive definite $W_T$, but is most efficient if $W_T$ is the inverse of an estimator of the variance-covariance matrix of $g_T\left(\hat{\beta}\right)$:

$$\left(W_T\right)^{-1} = Eg_T\left(\hat{\beta}\right)g_T\left(\hat{\beta}\right)'.$$

Generalized Method of Moments...

This choice of weighting matrix is very sensible: weight heavily those moment conditions (i.e., elements of $g_T\left(\hat{\beta}\right)$) that are precisely estimated, and pay less attention to the others.

Generalized Method of Moments...

Estimator of $W_T^{-1}$. Note:

$$Eg_T\left(\hat{\beta}\right)g_T\left(\hat{\beta}\right)' = \frac{1}{T^2}E\left[u_1\left(\hat{\beta}\right) + u_2\left(\hat{\beta}\right) + \ldots + u_T\left(\hat{\beta}\right)\right]\left[u_1\left(\hat{\beta}\right) + u_2\left(\hat{\beta}\right) + \ldots + u_T\left(\hat{\beta}\right)\right]'$$
$$= \frac{1}{T}\Bigg[\frac{T}{T}Eu_t\left(\hat{\beta}\right)u_t\left(\hat{\beta}\right)' + \frac{T-1}{T}\left(Eu_t\left(\hat{\beta}\right)u_{t-1}\left(\hat{\beta}\right)' + Eu_t\left(\hat{\beta}\right)u_{t+1}\left(\hat{\beta}\right)'\right)$$
$$\qquad + \frac{T-2}{T}\left(Eu_t\left(\hat{\beta}\right)u_{t-2}\left(\hat{\beta}\right)' + Eu_t\left(\hat{\beta}\right)u_{t+2}\left(\hat{\beta}\right)'\right) + \ldots + \frac{1}{T}\left(Eu_t\left(\hat{\beta}\right)u_{t-T+1}\left(\hat{\beta}\right)' + Eu_t\left(\hat{\beta}\right)u_{t+T-1}\left(\hat{\beta}\right)'\right)\Bigg]$$
$$= \frac{1}{T}\left[C(0) + \sum_{r=1}^{T-1}\frac{T-r}{T}\left(C(r) + C(r)'\right)\right],$$

where

$$C(r) = Eu_t\left(\hat{\beta}\right)u_{t-r}\left(\hat{\beta}\right)'.$$

That is, $W_T^{-1}$ is $\frac{1}{T}$ times the spectral density matrix at frequency zero, $S_0$, of $u_t\left(\hat{\beta}\right)$.

Generalized Method of Moments...

Conclude:

$$W_T^{-1} = Eg_T\left(\hat{\beta}\right)g_T\left(\hat{\beta}\right)' = \frac{1}{T}\left[C(0) + \sum_{r=1}^{T-1}\frac{T-r}{T}\left(C(r) + C(r)'\right)\right] = \frac{S_0}{T},$$

estimated by

$$\widehat{W_T^{-1}} = \frac{1}{T}\left[\hat{C}(0) + \sum_{r=1}^{T-1}\frac{T-r}{T}\left(\hat{C}(r) + \hat{C}(r)'\right)\right] = \frac{1}{T}\hat{S}_0,$$

imposing whatever restrictions are implied by the null hypothesis, e.g. (as in Example 4) $C(r) = 0$ for $r > R$, for some $R$. This is the Newey-West estimator of the spectral density at frequency zero.

Problem: you need $\hat{\beta}$ to compute $W_T^{-1}$, and you need $W_T^{-1}$ to compute $\hat{\beta}$!! Solution: first compute $\hat{\beta}$ using $W_T = I$, then iterate...
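A minimal two-step GMM sketch along these lines, assuming a user-supplied function `u(beta, data)` returning the $T\times N$ matrix of moment terms $u_t(\beta)$; the truncation lag `R`, the Bartlett weights used in the Newey-West step, and the Nelder-Mead optimizer are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

def newey_west(U, R):
    """Newey-West estimate of S_0, the spectral density at frequency zero
    of u_t, from the T x N matrix U of moment terms, truncated at lag R."""
    T = U.shape[0]
    U = U - U.mean(axis=0)
    S = U.T @ U / T                              # C_hat(0)
    for r in range(1, R + 1):
        C_r = U[r:].T @ U[:-r] / T               # C_hat(r)
        S += (1 - r / (R + 1)) * (C_r + C_r.T)   # Bartlett weights
    return S

def two_step_gmm(u, beta0, data, R=4):
    """Two-step GMM: first step with W_T = I, second step with
    W_T = (S_hat_0 / T)^{-1}."""
    def objective(beta, W):
        g = u(beta, data).mean(axis=0)           # g_T(beta)
        return g @ W @ g

    T, N = u(beta0, data).shape
    beta1 = minimize(objective, beta0, args=(np.eye(N),), method="Nelder-Mead").x
    W = np.linalg.inv(newey_west(u(beta1, data), R) / T)
    beta2 = minimize(objective, beta1, args=(W,), method="Nelder-Mead").x
    return beta2, W
```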

Generalized Method of Moments...

Sampling uncertainty in $\hat{\beta}$: the exactly identified case. By the mean value theorem, $g_T\left(\hat{\beta}\right)$ can be expressed as follows:

$$g_T\left(\hat{\beta}\right) = g_T(\beta_0) + D\left(\hat{\beta} - \beta_0\right),$$

where $\beta_0$ is the true value of the parameters and

$$D = \frac{\partial g_T(\beta)}{\partial\beta'}\Bigg|_{\beta=\bar{\beta}},$$

for some $\bar{\beta}$ between $\beta_0$ and $\hat{\beta}$. Since $g_T\left(\hat{\beta}\right) = 0$ and $g_T(\beta_0) \overset{a}{\sim} N(0, S_0/T)$, it follows that

$$\hat{\beta} - \beta_0 = -D^{-1}g_T(\beta_0),$$

so

$$\hat{\beta} - \beta_0 \overset{a}{\sim} N\left(0, \frac{\left(D'S_0^{-1}D\right)^{-1}}{T}\right).$$

Generalized Method of Moments...

The overidentified case is an extension of the ideas we have already discussed. You can derive the results for yourself, using the delta function method for deriving the sampling distribution of statistics. Hamilton's textbook has a great review of GMM.