Estimation of moment-based models with latent variables

Estimation of moment-based models with latent variables (work in progress)
Raffaella Giacomini and Giuseppe Ragusa
UCL/Cemmap and UCI/Luiss
UPenn, 5/4/2010

Dynamic latent variables in macroeconomic models
- E.g., time-varying parameters, structural shocks, stochastic volatility, etc.
- Typical parametric setting: $X^T = (X_1, \dots, X_T) = (Y^T, Z^T)$, with $Y^T$ observable and $Z^T$ latent
- Joint density $p(x, \theta_0) = p(y^T \mid z^T, \theta_0)\, p(z^T, \theta_0)$ $\Rightarrow$ estimation of $\theta_0$ based on the integrated likelihood
$$\hat\theta = \arg\max_\theta \int p(y^T \mid z^T, \theta)\, p(z^T, \theta)\, dz^T$$
- Integrated likelihood computed by state-space methods

Existing state-space methods
- State equation $\rightarrow$ $p(z^T, \theta)$ known in closed form
- Observation equation $\rightarrow$ $p(y^T \mid z^T, \theta)$: "filtering" density known in closed form (e.g., Kalman filter) or easy to simulate
- Integral can be computed by MCMC methods

State-space methods for limited information models?
We consider the following scenario:
- $p(z^T, \theta)$ known $\rightarrow$ state equation same as before
- $p(y^T \mid z^T, \theta)$ unknown. The only information about $\theta$ is in the form of (non-linear) moment conditions
$$E_{t-1}[g(Y_t, Z_t, \theta)] = 0$$
$\rightarrow$ substitute the observation equation with moment conditions

Applications. GMM with time-varying parameters
Example #1. Time-varying "structural" parameters:
$$E[g(Y_t, \beta_t)] = 0, \qquad \beta_t = \Phi \beta_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{iid } N(0, \Sigma)$$
- $E[\cdot]$ is defined with respect to the joint distribution of $Y_t$ and $\beta_t$
- Want to estimate $\theta = (\Phi, \Sigma)$ and the sequence of "smoothed" $\beta_t$
- Application: Cogley and Sbordone's (2005) analysis of the stability of structural parameters in a Calvo model of inflation

Applications. "Robust" stochastic volatility estimation
Example #2.
$$Y_t = \sigma_t \varepsilon_t, \qquad \log \sigma_t^2 = \alpha + \beta \log \sigma_{t-1}^2 + v_t, \quad v_t \sim \text{iid } N(0, 1)$$
- Existing estimation methods require a distributional assumption on $\varepsilon_t$ (typically $N(0, 1)$)
- Problem: this does not capture the "fat tails" of financial data $\Rightarrow$ include jumps or use a fat-tailed distribution for $\varepsilon_t$ (not as straightforward as in the GARCH case)
- Our method is robust to misspecification in the distribution of $\varepsilon_t$
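To make the misspecification concern concrete, the following is a minimal simulation sketch of the stochastic volatility model above with a fat-tailed $\varepsilon_t$. The function names, the Student-t choice with 5 degrees of freedom, and the parameter values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def simulate_sv(T, alpha, beta, eps_sampler, seed=0):
    """Simulate Y_t = sigma_t * eps_t with
    log sigma_t^2 = alpha + beta * log sigma_{t-1}^2 + v_t, v_t ~ iid N(0, 1).
    The recursion is started at alpha / (1 - beta), which assumes |beta| < 1."""
    rng = np.random.default_rng(seed)
    log_var = np.empty(T)
    prev = alpha / (1.0 - beta)
    for t in range(T):
        log_var[t] = alpha + beta * prev + rng.standard_normal()
        prev = log_var[t]
    sigma = np.exp(0.5 * log_var)
    eps = eps_sampler(rng, T)
    return sigma * eps, sigma

def student_t_unit_variance(rng, n, df=5):
    """Fat-tailed shocks: Student-t draws rescaled to unit variance (needs df > 2)."""
    return rng.standard_t(df, size=n) * np.sqrt((df - 2) / df)

# Illustrative parameter values (hypothetical).
y, sigma = simulate_sv(500, alpha=-0.5, beta=0.9,
                       eps_sampler=student_t_unit_variance)
```

A Gaussian likelihood for $\varepsilon_t$ would be misspecified for data generated this way, which is the situation the moment-based approach is meant to handle.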

Applications. Nonlinear DSGE models
Example #3. Prototypical DSGE model. Optimality conditions:
$$E_{t-1}[m(Y_t, S_t, Z_t, \beta)] = 0$$
$$S_t = f(S_{t-1}, Y_t, Z_t, \beta)$$
$$Z_t = \Phi Z_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{iid } N(0, \Sigma)$$
- Want to estimate $\theta = (\beta, \Phi, \Sigma)$
- $Y_t$ = observable variables
- $S_t$ = endogenous latent variables
- $Z_t$ = exogenous latent variables
- $m(\cdot)$ and $f(\cdot)$ known

An and Schorfheide (2007) DSGE model
- In the AS model, the endogenous latent variable equation has a simple form:
$$S_t = f(Y_t, Z_t, \beta) \quad (1)$$
- Can substitute $S_t$ and rewrite the equilibrium conditions as
$$E_{t-1}[g(Y_t, Z_t, \beta)] = 0, \qquad Z_t = \Phi Z_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{iid } N(0, \Sigma)$$
- Warning: not all DSGEs fit this framework

Existing approaches to estimation of DSGE models
1. Theory does not provide a likelihood $\rightarrow$ must use approximation methods
2. Linearize around the steady state (Smets and Wouters, 2003; Woodford, 2003)
   - Solve the model to find policy functions $Y_t = h(S_t, Z_t)$
   - Construct the likelihood by Kalman filter
3. Nonlinear approximations (Fernandez-Villaverde and Rubio-Ramirez, 2005)
   - Solve the model (numerically, or analytically in the case of second-order approximations around the steady state) to find policy functions
   - Construct the likelihood by nonlinear state-space methods (e.g., particle filter)

Drawbacks of existing likelihood-based approaches
1. Linearization = possible loss of information (Fernandez-Villaverde and Rubio-Ramirez, 2005)
2. Must impose structure to solve the model
   1. Add "shocks"/measurement error to avoid stochastic singularity
   2. Restrict parameters to rule out indeterminacy (multiple rational expectations solutions)
3. Nonlinear state-space methods are computationally intensive (must solve the model for each parameter draw) $\Rightarrow$ so far mostly applied to simple models

Relationship with simulation-based method of moments
- GMM, SMM, EMM, indirect inference (e.g., Ruge-Murcia, 2010)
- Difference: these methods require knowledge of $p(y^T \mid z^T)$ or focus on moments of the type
$$E_Y[g(Y, \beta)] = 0, \quad (2)$$
where $g(Y, \beta)$ can be computed by simulation
- In our case, the model gives $E_{Y,Z}[m(Y, Z, \beta)] = 0$ $\Rightarrow$ can be written as (2) only if $p(z \mid y)$ is known
- Unlike these methods, we directly obtain estimates of the smoothed latent variables

The idea
- Propose methods for estimating non-linear moment-based models that "exploit" the information contained in the moment conditions
- The methods are:
1. Computationally convenient
2. Classical or Bayesian

Key elements of methodology
- Recall the problem we want to solve (e.g., in the classical framework):
$$\max_\theta \int \underbrace{p(y^T \mid z^T, \theta)}_{\text{unknown}}\; \underbrace{p(z^T, \theta)}_{\text{known}}\, dz^T$$
- Two steps:
1. Approximate the unknown likelihood $p(y^T \mid z^T, \theta)$
2. Integrate out the latent variables using classical or Bayesian methods
- For DSGEs: from an exact likelihood of the approximate model... to an approximate likelihood of the exact model

Approximate likelihoods
- We consider two different approximation strategies
- Both use projection theory (for the case without latent variables, see Kim (2002), Chernozhukov and Hong (2003), Ragusa (2009)): out of all probability measures satisfying the moment conditions, choose the one that minimizes the Kullback-Leibler information distance
- Method 1 does not require solving the model (but is not applicable to models with dynamic latent endogenous variables)
- Method 2 is applicable to all models but requires the solution of an (approximate) model

Approximate likelihoods - Method 1
Find the density that satisfies the moment conditions and minimizes the distance from the true density. This gives the approximate likelihood
$$\tilde p(y^T \mid z^T, \theta) \propto \exp\left\{-\frac{1}{2}\, g_T(Y^T, Z^T, \theta)'\, V_T^{-1}(Y^T, Z^T, \theta)\, g_T(Y^T, Z^T, \theta)\right\}$$
$$g_T(Y^T, Z^T, \theta) = \frac{1}{\sqrt{T}} \sum_{t=1}^{T} g(Y_t, Z_t, \theta) \otimes w_{t-1}$$
$$V_T(Y^T, Z^T, \theta) = \mathrm{Var}\left(g_T(Y^T, Z^T, \theta)\right), \qquad w_{t-1} = \text{instruments}$$

Approximate likelihoods - Method 1
$$\tilde p(y^T \mid z^T, \theta) \propto \exp\left\{-\frac{1}{2}\, g_T(Y^T, Z^T, \theta)'\, V_T^{-1}(Y^T, Z^T, \theta)\, g_T(Y^T, Z^T, \theta)\right\}$$
- $\tilde p(y^T \mid z^T, \theta)$ is a simple transformation of the GMM objective function.
- Intuition: when $(Z^T, \theta)$ is consistent with the model, $g_T(Y^T, Z^T, \theta) \approx 0$ $\Rightarrow$ $\tilde p(y^T \mid z^T, \theta)$ is close to its maximum value of 1.
- When $(Z^T, \theta)$ is inconsistent with the moment conditions $\Rightarrow$ large values of $g_T(Y^T, Z^T, \theta)'\, V_T^{-1}(Y^T, Z^T, \theta)\, g_T(Y^T, Z^T, \theta)$ $\Rightarrow$ $\tilde p(y^T \mid z^T, \theta) \approx 0$.
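The Method 1 objective can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, and $V_T$ is estimated by the sample covariance of the moment contributions, which assumes they are serially uncorrelated (a HAC estimator would be needed otherwise). The example uses the mean and variance moments $Y_t - \mu$ and $Y_t^2 - \sigma^2$ on simulated mean-zero data.

```python
import numpy as np

def method1_loglik(g, V=None):
    """Log of the Method 1 approximate likelihood, up to a constant:
    -0.5 * g_T' V_T^{-1} g_T, with g_T = T^{-1/2} * sum_t g_t.
    g is a (T, k) array of moment contributions (instruments already applied).
    If V is None, the sample covariance of g_t is used (assumes no serial
    correlation in g_t; this is an assumption of the sketch)."""
    g = np.asarray(g, dtype=float)
    if g.ndim == 1:
        g = g[:, None]
    T = g.shape[0]
    g_T = g.sum(axis=0) / np.sqrt(T)
    if V is None:
        centered = g - g.mean(axis=0)
        V = centered.T @ centered / T
    V = np.atleast_2d(V)
    return float(-0.5 * g_T @ np.linalg.solve(V, g_T))

def mean_variance_moments(y, mu, sigma2):
    """Moment contributions g_1 = y - mu and g_2 = y^2 - sigma2."""
    return np.column_stack([y - mu, y**2 - sigma2])

rng = np.random.default_rng(0)
y = rng.standard_normal(400)  # simulated data with mean 0 and variance 1
ll_true = method1_loglik(mean_variance_moments(y, 0.0, 1.0))
ll_wrong = method1_loglik(mean_variance_moments(y, 1.0, 1.0))
```

At parameter values consistent with the moment conditions the quadratic form is small and the log-likelihood is near its maximum of 0; at inconsistent values it is strongly negative, so `ll_true` exceeds `ll_wrong`.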

Approximate likelihoods - Method 2
- Write $p(y^T \mid z^T) = \prod_{t=1}^{T} p(y_t \mid z_t, Y^{t-1})$
- Choose an approximate density $\hat p(y_t \mid z_t, Y^{t-1}, \theta)$ (it does not need to satisfy the moment condition, but should be easy to calculate)
- For DSGEs, e.g., linearize the model around the steady state and apply the Kalman filter $\Rightarrow$ $\hat p(y_t \mid z_t, Y^{t-1}, \theta)$ are the filtered densities

Approximate likelihoods - Method 2
"Tilt" $\hat p(y_t \mid z_t, Y^{t-1}, \theta)$ towards the moment condition $E_{t-1}[g(Y_t, Z_t, \theta)] = 0$. The new density $\tilde p(\cdot)$ satisfies the moment condition and minimizes the Kullback-Leibler distance from $\hat p(\cdot)$. Solve the problem:
$$\min_{h \in \mathcal{H}} \int\!\!\int \log\left(\frac{h(Y_t \mid Z_t, Y^{t-1})}{\hat p(Y_t \mid Z_t, Y^{t-1}, \theta)}\right) h(Y_t \mid Z_t, Y^{t-1})\, dY_t\, dF(Z_t)$$
$$\text{s.t.} \quad \int\!\!\int g(Y_t, Z_t, \theta)\, h(Y_t \mid Z_t, Y^{t-1})\, dY_t\, dF(Z_t) = 0$$

Approximate likelihoods - Method 2
Under regularity conditions the solution is
$$\tilde p(y_t \mid z_t, Y^{t-1}, \theta) = \exp\{\eta_t + \lambda_t\, g(Y_t, Z_t, \theta)\}\, \hat p(y_t \mid z_t, Y^{t-1}, \theta)$$
where
$$(\eta_t, \lambda_t) = \arg\min_{\eta, \lambda} \int \exp\{\eta + \lambda\, g(Y_t, Z_t, \theta)\}\, \hat p(y_t \mid z_t, Y^{t-1}, \theta)\, dy_t$$
- $\lambda_t$ = "weights for each moment condition"; $\eta_t$ = integration constant
- $(\eta_t, \lambda_t)$ are functions of $Z_t$, $Y^{t-1}$, $\theta$

Approximate likelihoods - Method 2
$$\tilde p(y_t \mid z_t, Y^{t-1}, \theta) = \exp\{\eta_t + \lambda_t\, g(Y_t, Z_t, \theta)\}\, \hat p(y_t \mid z_t, Y^{t-1}, \theta)$$
- In practice, approximate the integral and compute $(\eta_t, \lambda_t)$ by simulating $N$ times from $\hat p(y_t \mid z_t, Y^{t-1}, \theta)$ $\Rightarrow$
$$(\eta_t, \lambda_t) = \arg\min_{\eta, \lambda} \frac{1}{N} \sum_{i=1}^{N} \exp\left\{\eta + \lambda\, g\left(Y_t^{(i)}, Z_t, \theta\right)\right\}$$
- Well-behaved objective function $\Rightarrow$ for DSGEs, small additional computational cost relative to the Kalman filter (cf. particle filter?)
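The simulated tilting step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: $\lambda$ is found by minimizing the sample average of $\exp\{\lambda' g\}$ with a damped Newton iteration, and $\eta$ is then set so that the tilted weights sum to one (a standard exponential-tilting normalization assumed here). The function name `tilt_weights` is hypothetical.

```python
import numpy as np

def tilt_weights(g, max_iter=100, tol=1e-10):
    """Exponential tilting on N simulated draws.
    g: (N, k) array of moment values g(Y^(i), Z, theta) at draws from p-hat.
    Returns (eta, lam, w), where w_i is proportional to exp(lam' g_i),
    sum(w) = 1, and sum_i w_i g_i = 0 at convergence."""
    g = np.asarray(g, dtype=float)
    if g.ndim == 1:
        g = g[:, None]
    N, k = g.shape
    lam = np.zeros(k)
    for _ in range(max_iter):
        e = np.exp(g @ lam)
        grad = g.T @ e / N                   # gradient of mean(exp(lam' g))
        if np.linalg.norm(grad) < tol:
            break
        hess = (g * e[:, None]).T @ g / N    # Hessian (positive semidefinite)
        step = np.linalg.solve(hess, grad)
        # Backtracking: halve the step until the objective does not increase.
        s, f0 = 1.0, e.mean()
        while np.exp(g @ (lam - s * step)).mean() > f0 and s > 1e-8:
            s *= 0.5
        lam = lam - s * step
    e = np.exp(g @ lam)
    eta = -np.log(e.mean())                  # normalization constant
    w = e / e.sum()                          # tilted probabilities on the draws
    return eta, lam, w

# Example: draws from N(0.3, 1), tilted towards the moment E[Y] = 0.
rng = np.random.default_rng(0)
draws = rng.normal(0.3, 1.0, size=5000)
g = draws[:, None]
eta, lam, w = tilt_weights(g)
```

For a normal initial density with unit variance, tilting by $\exp\{\lambda y\}$ shifts the mean by $\lambda$, so in this example `lam` should be close to $-0.3$, and the tilted mean `w @ g` is zero up to numerical error.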

The two methods in a simple case
- No latent variables, $Y^T = (Y_1, \dots, Y_T)$ with mean $\mu_0$ and variance $\sigma_0^2$
- The moment conditions identifying the parameters are
$$g_1(Y_t, \mu, \sigma^2) = Y_t - \mu, \qquad g_2(Y_t, \mu, \sigma^2) = Y_t^2 - \sigma^2$$
- Method 1:
$$(\hat\mu, \hat\sigma^2) = \arg\max_{\theta = (\mu, \sigma^2)} \exp\left\{-\frac{1}{2}\, g_T(Y^T, \theta)'\, V_T^{-1}(Y^T, \theta)\, g_T(Y^T, \theta)\right\}$$
$\Rightarrow$ our estimator is the same as GMM (Chernozhukov and Hong (2003))

The two methods in a simple case
- Method 2: start from the pdf of $N(\mu, \sigma^2)$:
$$\hat p(Y_t) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}(Y_t - \mu)^2\right\}$$
and "tilt it" towards the moment conditions:
$$\tilde p(Y_t) = \exp\left\{\eta + \lambda_1 (Y_t - \mu) + \lambda_2 (Y_t^2 - \sigma^2)\right\} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(Y_t - \mu)^2}$$
$$\lambda_1 = \frac{\mu_0}{\sigma_0^2} - \frac{\mu}{\sigma^2}; \qquad \lambda_2 = \frac{1}{2\sigma^2} - \frac{1}{2\sigma_0^2}$$
- No tilting if $\mu = \mu_0$, $\sigma^2 = \sigma_0^2$
- In this case $\tilde p(Y_t) \sim N(\mu_0, \sigma_0^2)$ $\Rightarrow$ our estimator is the same as (Q)MLE
- Normality here is a special result: $\tilde p(\cdot)$ is no longer normal if, e.g., $g(\cdot)$ is non-linear
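The tilting weights in this example can be checked by matching coefficients. The short derivation below is a sketch that absorbs all terms not involving $y$ (including $-\lambda_1 \mu - \lambda_2 \sigma^2$) into the constant $\eta$; here $y$ stands for a generic value of $Y_t$.

```latex
\begin{align*}
\hat p(y)\, \exp\{\lambda_1 y + \lambda_2 y^2\}
  &\propto \exp\Big\{ -\Big(\tfrac{1}{2\sigma^2} - \lambda_2\Big) y^2
     + \Big(\tfrac{\mu}{\sigma^2} + \lambda_1\Big) y \Big\}, \\
N(\mu_0, \sigma_0^2) \text{ density}
  &\propto \exp\Big\{ -\tfrac{1}{2\sigma_0^2}\, y^2
     + \tfrac{\mu_0}{\sigma_0^2}\, y \Big\}.
\end{align*}
% Matching the coefficients on y^2 and y:
\begin{align*}
\tfrac{1}{2\sigma^2} - \lambda_2 = \tfrac{1}{2\sigma_0^2}
  &\;\Longrightarrow\; \lambda_2 = \tfrac{1}{2\sigma^2} - \tfrac{1}{2\sigma_0^2}, \\
\tfrac{\mu}{\sigma^2} + \lambda_1 = \tfrac{\mu_0}{\sigma_0^2}
  &\;\Longrightarrow\; \lambda_1 = \tfrac{\mu_0}{\sigma_0^2} - \tfrac{\mu}{\sigma^2}.
\end{align*}
```

Both weights vanish when $\mu = \mu_0$ and $\sigma^2 = \sigma_0^2$, which is the "no tilting" case.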

Step 2. Integrate out latent variables
- Classical estimation approach: solve
$$\hat\theta = \arg\max_\theta \int \tilde p(y^T \mid z^T, \theta)\, p(z^T, \theta)\, dz^T$$
Using Jacquier, Johannes and Polson (2007) to compute the integral works well in our limited experience
- Bayesian estimation approach: assume a prior $\pi(\theta)$ for $\theta$ (and $Z_0$) and calculate the approximate posterior
$$\tilde p(\theta, Z^T \mid Y^T) \propto \tilde p(Y^T \mid Z^T, \theta)\, p(Z^T \mid \theta)\, \pi(\theta)$$
- The integration of the latent variables is the same as in the previous literature

Econometric properties
- For Method 2 (tilted density), one can show that the MLE based on the approximate integrated likelihood $\tilde p(y^T, \theta)$ is consistent for
$$\theta^* = \arg\min_\theta \int \log\left(\frac{p(y^T)}{\tilde p(y^T, \theta)}\right) p(y^T)\, dy^T$$
- $\theta^*$ = the parameter that sets the approximate density consistent with the moment conditions as close as possible to the true density
- In particular, if the moment condition uniquely identifies the parameter $\theta_0$, then by construction $\theta^* = \theta_0$

Econometric properties
- Back to the simple example: $Y_t \sim \text{iid}(\mu_0, \sigma_0^2)$, $g(Y_t, \theta) = (Y_t - \mu,\; Y_t^2 - \sigma^2)$, initial density $\hat p \sim N(\mu, \sigma^2)$
- If we tilt towards both moments, the approximate density is $\tilde p \sim N(\mu_0, \sigma_0^2)$ $\Rightarrow$ our estimator (= QMLE) is consistent for the true parameters
- What if we tilt towards only one moment condition? E.g., only use $g_2(Y_t, \theta) = Y_t^2 - \sigma^2$ $\Rightarrow$ $\tilde p \sim N\left(\mu \frac{\sigma_0^2}{\sigma^2},\; \sigma_0^2\right)$
- Variance estimated consistently; mean not estimated consistently
- Suggests that not using moments can cause distortions $\Rightarrow$ need to understand the tradeoffs between too many and too few moments
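The one-moment case can be worked out in the same way as the two-moment case. The sketch below sets $\lambda_1 = 0$ and takes the tilted variance to be $\sigma_0^2$, as stated on the slide; $s^2$ and $m$ are notation introduced here for the variance and mean of the tilted density.

```latex
\begin{align*}
\hat p(y)\, \exp\{\lambda_2 y^2\}
  &\propto \exp\Big\{ -\Big(\tfrac{1}{2\sigma^2} - \lambda_2\Big) y^2
     + \tfrac{\mu}{\sigma^2}\, y \Big\}
   \quad\Longrightarrow\quad
   s^2 = \Big(\tfrac{1}{\sigma^2} - 2\lambda_2\Big)^{-1},
   \qquad m = \mu\, \tfrac{s^2}{\sigma^2}, \\
s^2 = \sigma_0^2
  &\;\Longrightarrow\;
   \lambda_2 = \tfrac{1}{2\sigma^2} - \tfrac{1}{2\sigma_0^2},
   \qquad m = \mu\, \tfrac{\sigma_0^2}{\sigma^2}.
\end{align*}
```

The tilted mean depends on $\sigma^2$ and equals $\mu$ only when $\sigma^2 = \sigma_0^2$, which is why dropping the first moment condition distorts the mean.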

Econometric properties
- Hypothesis testing and model selection are relatively straightforward for Method 2
- E.g., one could test whether $\lambda$ (or individual components) $= 0$, to understand the importance of non-linearities in DSGE models
- Open issue: identification (here assumed, but challenging because of the presence of latent variables and the nonlinearity of the moment conditions)

Method 1 in a simple example
- Data-generating process:
$$Y_t = 0.9 Z_t + v_t, \quad v_t \sim \text{iid } N(0, 1); \qquad Z_t = 0.9 Z_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{iid } N(0, 1)$$
- Moment condition:
$$E[Z_t (Y_t - \beta Z_t)] = 0, \qquad Z_t = \rho Z_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{iid } N(0, 1)$$
$$g(Y_t, Z_t, \beta) = Z_t (Y_t - \beta Z_t)$$
- Priors: $\beta \sim U(0, 2)$, $\rho \sim U(0, 1)$, $Z_0 \sim N\left(0, \frac{1}{1 - \rho^2}\right)$, $T = 100$
- Use Jacquier et al. (2007)
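The data-generating process and the Method 1 objective for this example can be sketched as follows. This is a simplified illustration: it evaluates the objective over a grid of $\beta$ values while conditioning on the true simulated path of $Z_t$, whereas the method described in the slides integrates $Z^T$ out by MCMC. The function names, the grid, and the use of the sample variance of $g_t$ for $V_T$ are assumptions of this sketch.

```python
import numpy as np

def simulate_dgp(T=100, beta=0.9, rho=0.9, seed=0):
    """Y_t = beta * Z_t + v_t, Z_t = rho * Z_{t-1} + eps_t, shocks iid N(0, 1).
    Z_0 is drawn from the stationary distribution N(0, 1 / (1 - rho^2))."""
    rng = np.random.default_rng(seed)
    z = np.empty(T)
    z_prev = rng.normal(0.0, np.sqrt(1.0 / (1.0 - rho**2)))
    for t in range(T):
        z[t] = rho * z_prev + rng.standard_normal()
        z_prev = z[t]
    y = beta * z + rng.standard_normal(T)
    return y, z

def log_quasi_lik(beta, y, z):
    """-0.5 * g_T^2 / V_T for the scalar moment g_t = Z_t (Y_t - beta Z_t).
    V_T is the sample variance of g_t (assumes no serial correlation)."""
    g = z * (y - beta * z)
    g_T = g.sum() / np.sqrt(g.size)
    return -0.5 * g_T**2 / g.var()

y, z = simulate_dgp()
grid = np.linspace(0.0, 2.0, 201)           # support of the U(0, 2) prior
vals = np.array([log_quasi_lik(b, y, z) for b in grid])
beta_hat = grid[vals.argmax()]              # grid maximizer given the true Z path
```

Given the true latent path, the objective peaks where the sample moment is zero, so `beta_hat` should land near the value 0.9 used in the simulation.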

[Figure: Distribution of β (density plot)]

[Figure: Distribution of ρ (density plot; horizontal axis: ρ)]

[Figure: Smoothed Probabilities. Smoothed p(z | x) and actual z plotted against time, t = 0 to 100]

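Simulation: AS New Keynesian model
$$1 = \beta E_t\left[e^{-\tau \hat c_{t+1} + \tau \hat c_t + \hat R_t - \hat z_{t+1} - \hat\pi_{t+1}}\right] \quad (3)$$
$$\frac{1 - \nu}{\nu \phi \pi^2}\left(e^{\tau \hat c_t} - 1\right) = \left(e^{\hat\pi_t} - 1\right)\left[\left(1 - \frac{1}{2\nu}\right) e^{\hat\pi_t} + \frac{1}{2\nu}\right] - \beta E_t\left[\left(e^{\hat\pi_{t+1}} - 1\right) e^{-\tau \hat c_{t+1} + \tau \hat c_t + \hat y_{t+1} - \hat y_t + \hat\pi_{t+1}}\right] \quad (4)$$
$$e^{\hat c_t - \hat y_t} = e^{-\hat g_t} - \frac{\phi \pi^2 g}{2}\left(e^{\hat\pi_t} - 1\right)^2 \quad (5)$$
$$\hat R_t = \rho_r \hat R_{t-1} + (1 - \rho_r)\psi_1 \hat\pi_t + (1 - \rho_r)\psi_2 (\hat y_t - \hat g_t) + \sigma_R \varepsilon_{R,t} \quad (6)$$
$$\hat z_t = \rho_z \hat z_{t-1} + \sigma_z \varepsilon_{z,t}, \qquad \hat g_t = \rho_g \hat g_{t-1} + \sigma_g \varepsilon_{g,t}$$
The $\varepsilon$'s are independent $N(0, 1)$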

AS New Keynesian model
- Observable variables: $Y_t = (X_t, \pi_t, R_t)'$ (output, inflation and interest rate), where
$$X_t = \gamma^{(Q)} + 100(\hat y_t - \hat y_{t-1} + \hat z_t)$$
$$\pi_t = \pi^{(A)} + 400 \hat\pi_t$$
$$R_t = \pi^{(A)} + r^{(A)} + 4\gamma^{(Q)} + 400 \hat R_t$$
- $\hat y_t, \hat R_t, \hat\pi_t$ = deviations from steady state
- Endogenous latent variable: $S_t = \hat c_t$ = deviation of consumption from steady state
- Exogenous latent variables: $Z_t = (\hat z_t, \hat g_t)'$ = technology and government spending
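The measurement equations above translate directly into code. The sketch below maps paths of the model variables (in deviations from steady state) into the three observables; the function name and the parameter values in the example are hypothetical and chosen only for illustration.

```python
import numpy as np

def observables(y_hat, z_hat, pi_hat, R_hat, gamma_Q, pi_A, r_A):
    """Map deviations from steady state into observables (X_t, pi_t, R_t).
    Inputs are arrays of length T + 1 (index 0 provides the lag of y_hat);
    the output is a (T, 3) array for t = 1, ..., T."""
    y_hat, z_hat, pi_hat, R_hat = (np.asarray(a, dtype=float)
                                   for a in (y_hat, z_hat, pi_hat, R_hat))
    X = gamma_Q + 100.0 * (y_hat[1:] - y_hat[:-1] + z_hat[1:])
    pi = pi_A + 400.0 * pi_hat[1:]
    R = pi_A + r_A + 4.0 * gamma_Q + 400.0 * R_hat[1:]
    return np.column_stack([X, pi, R])

# At the steady state all deviations are zero, so the observables are constants.
zeros = np.zeros(5)
obs = observables(zeros, zeros, zeros, zeros, gamma_Q=0.5, pi_A=3.0, r_A=1.0)
```

With these illustrative values every row of `obs` equals the steady-state vector $(\gamma^{(Q)},\ \pi^{(A)},\ \pi^{(A)} + r^{(A)} + 4\gamma^{(Q)}) = (0.5, 3.0, 6.0)$.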

AS model in compact form
- (5) implies an expression for $S_t$ as a function of $Y_t$ and $Z_t$ $\Rightarrow$ substitute into the moment conditions
- Write the policy rule as moment conditions
- Choose instruments to transform $E_t[\cdot]$ into $E[\cdot]$
- Write the model as
$$E[g(Y_{t+1}, Y_t, Z_{t+1}, Z_t, \theta)] = 0$$
$$Z_t = \begin{pmatrix} \rho_z & 0 \\ 0 & \rho_g \end{pmatrix} Z_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \text{iid } N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_z^2 & 0 \\ 0 & \sigma_g^2 \end{pmatrix}\right)$$
- $g(\cdot)$ is $11 \times 1$, $\theta = (\tau, \nu, \phi, 1/g, \psi_1, \psi_2, \rho_R, \sigma_R, \pi^{(A)}, \gamma^{(Q)}, r^{(A)}, \rho_z, \rho_g, \sigma_z, \sigma_g)$

AS model posterior
Approximate posterior:
$$\tilde p(\theta, Z^T \mid Y^T) \propto \exp\left\{-\frac{1}{2}\, g_T(Y^T, Z^T, \theta)'\, V_T^{-1}(Y^T, Z^T, \theta)\, g_T(Y^T, Z^T, \theta)\right\} \prod_{t=1}^{T} p(z_t \mid z_{t-1}, \theta) \prod_{t=1}^{T} p(g_t \mid g_{t-1}, \gamma)\; p(z_0, g_0 \mid \gamma)$$
$z_0$ and $g_0$ are drawn from their stationary distributions

Simulation exercise
- Same DGP as AS: generate a time series ($T = 80$) from a second-order approximation to the model
- Parameters and priors as in AS
- Compare the posteriors for $\theta$ obtained by our method to those in AS (both linear and nonlinear solution methods)

AS estimation results
- Draws from priors and posteriors for parameters $\pi^{(A)}, \gamma^{(Q)}, r^{(A)}, \rho_z, \rho_g, \sigma_z, \sigma_g$
- Red lines = true parameter values
- Estimation time: 100,000 MCMC draws, 6 days

[Figure 17: Posterior Draws: Linear versus Quadratic Approximation II. Columns: Prior, Linear/Kalman Posterior, Quadratic/Particle Posterior. Rows: γ(Q) against π(A); π(A) against r(A); ρ_z against ρ_g; σ_z against σ_g]

Our estimation results
- Draws from priors and posteriors for parameters $\pi^{(A)}, \gamma^{(Q)}, r^{(A)}, \rho_z, \rho_g, \sigma_z, \sigma_g$
- Red lines = true parameter values
- Estimation time: 2 million MCMC draws, 4-5 hours

[Figure: Draws from priors and posteriors. Panels (Priors and Posteriors): γ(Q) and π(A); π(A) and r(A); ρ_z and ρ_g; σ_z and σ_g]

Conclusion
- Two new methods for estimating structural parameters in moment-based models that depend on dynamic latent variables
- Projection-based approximate likelihoods that satisfy the moment conditions
- Marries the computational convenience of MCMC in high-dimensional problems with the ability of GMM to handle nonlinear moment conditions
- Directly delivers "smoothed" latent variables
- Potential for estimating realistic models and understanding the importance of non-linearities