DSGE Methods. Estimation of DSGE models: Maximum Likelihood & Bayesian. Willi Mutschler, M.Sc.

DSGE Methods Estimation of DSGE models: Maximum Likelihood & Bayesian Willi Mutschler, M.Sc. Institute of Econometrics and Economic Statistics University of Münster willi.mutschler@uni-muenster.de Summer 2014 Willi Mutschler (Institute of Econometrics) DSGE Methods Summer 2014 1 / 35

Full information estimation: Idea

Full information estimation requires a complete characterization of the data-generating process (not only specific moments). Consider the linear first-order state-space representation of the model:

z_t = A(θ) z_{t-1} + B(θ) ε_t,  with E[ε_t] = 0, E[ε_t ε_t'] = Σ_ε,   (1)
d_t = D(θ) z_t + µ_t,  with E[µ_t] = 0, E[µ_t µ_t'] = Σ_µ.   (2)

z_t = (x̂_t', ŷ_t')' contains all model variables as deviations from the steady state, and A, B and D are functions of g_x and h_x. The matrix D links the model variables z_t to the observable data variables d_t.

Equation (1), the state or transition equation, corresponds to the solution of the model; ε_t are the stochastic innovations.
Equation (2), the observation equation, corresponds to the measurement equations, subject to possible measurement errors µ_t in the data.
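To make the notation concrete, here is a minimal Python simulation sketch of (1)-(2); all matrices and covariances are illustrative placeholders, not taken from any particular model:

```python
import numpy as np

# Minimal simulation sketch of the state-space form (1)-(2). All matrices
# and covariances here are illustrative placeholders, not from any model.
rng = np.random.default_rng(0)

A = np.array([[0.9, 0.1],
              [0.0, 0.7]])          # transition matrix A(theta)
B = np.eye(2)                       # shock-loading matrix B(theta)
D = np.array([[1.0, 0.5]])          # observation matrix D(theta)
Sigma_eps = 0.01 * np.eye(2)        # innovation covariance Sigma_eps
Sigma_mu = 0.001 * np.eye(1)        # measurement-error covariance Sigma_mu

T = 200
z = np.zeros((T, 2))                # states z_t (deviations from steady state)
d = np.zeros((T, 1))                # observables d_t
for t in range(1, T):
    eps = rng.multivariate_normal(np.zeros(2), Sigma_eps)
    z[t] = A @ z[t - 1] + B @ eps                  # state equation (1)
    mu = rng.multivariate_normal(np.zeros(1), Sigma_mu)
    d[t] = D @ z[t] + mu                           # observation equation (2)
```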

Full information estimation: Idea

Given distributional assumptions about ε_t and µ_t, one can derive the log-likelihood function log L(d|θ) analytically or numerically. In the log-linear case with normally distributed shocks, the Kalman filter is used to calculate the likelihood analytically. In the nonlinear case the policy functions are functions of the parameter vector θ; an extended Kalman filter, a particle filter, efficient importance sampling or Sequential Monte Carlo methods are then used to evaluate the likelihood numerically.

There are two approaches for analyzing and evaluating the log-likelihood:
1. the classical (frequentist) Maximum Likelihood method,
2. the Bayesian method.

Kalman filter, no measurement errors: Notation

We simplify, considering only the linear case and ignoring possible measurement errors in the data:

d_t = D z_t,
z_{t+1} = A z_t + B ε_{t+1},  ε_i ~ iid N(0, Σ_ε),  Σ_ε = E(ε_i ε_i'),  E(ε_i ε_j') = 0 for i ≠ j.

Notation for the linear projections:

ẑ_{t|t-j} = E(z_t | d_{t-j}, d_{t-j-1}, ..., d_1),
Σ_{t|t-j} = E[(z_t - ẑ_{t|t-j})(z_t - ẑ_{t|t-j})'],
d̂_{t|t-j} = E(d_t | d_{t-j}, d_{t-j-1}, ..., d_1),
u_t = d_t - d̂_{t|t-1} = D(z_t - ẑ_{t|t-1}),
E(u_t u_t') = D Σ_{t|t-1} D',

for t = 1, 2, ..., T and j = 0, 1, ..., T.

Kalman filter, no measurement errors: Initialization

Since z_t is covariance-stationary, its unconditional variance Σ_z = E(z_t z_t') is given by:

E(z_t z_t') = E[(A z_{t-1} + B ε_t)(A z_{t-1} + B ε_t)'],
Σ_z = A E(z_{t-1} z_{t-1}') A' + B E(ε_t ε_t') B' = A Σ_z A' + B Σ_ε B',
vec(Σ_z) = (I - A ⊗ A)^{-1} vec(B Σ_ε B').

Vectorization: the vec-operator stacks the columns of an m×n matrix M into an mn×1 vector vec(M). Then for matrices A (m×n), B (n×p) and C (p×k):

vec(ABC) = (C' ⊗ A) vec(B),

with ⊗ denoting the Kronecker product.
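The vectorization trick translates directly into code. A sketch, reusing the hypothetical matrices from the simulation above (np.kron and column-major reshaping implement ⊗ and vec):

```python
import numpy as np

# Sketch: solve Sigma_z = A Sigma_z A' + B Sigma_eps B' via the vec identity
# vec(Sigma_z) = (I - A kron A)^{-1} vec(B Sigma_eps B'). order="F" gives the
# column-major stacking that matches vec(ABC) = (C' kron A) vec(B).
def unconditional_cov(A, B, Sigma_eps):
    n = A.shape[0]
    Q = B @ Sigma_eps @ B.T
    vec_Q = Q.reshape(-1, order="F")
    vec_Sigma = np.linalg.solve(np.eye(n * n) - np.kron(A, A), vec_Q)
    return vec_Sigma.reshape(n, n, order="F")
```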

Kalman filter, no measurement errors: Initialization

The unconditional expectation of z_1 is used to initialize the Kalman filter, since there is no additional information yet:

ẑ_{1|0} = E(z_1) = A E(z_0) + B E(ε_1) = 0,

since E(z_t) = E(z) = 0 and E(ε_1) = 0, and

Σ_{1|0} = E[(z_1 - 0)(z_1 - 0)'] = Σ_z,  so  vec(Σ_{1|0}) = (I - A ⊗ A)^{-1} vec(B Σ_ε B').

Note: without loss of generality we consider a steady state of zero.

Kalman filter, no measurement errors: Recursion

The recursion is then given by ẑ_{t+1|t} = A ẑ_{t|t}, using the formula for updating a linear projection (Hamilton (1994, pp. 99 and 379)):

ẑ_{t|t} = ẑ_{t|t-1} + E[(z_t - ẑ_{t|t-1})(d_t - d̂_{t|t-1})'] {E[(d_t - d̂_{t|t-1})(d_t - d̂_{t|t-1})']}^{-1} u_t
        = ẑ_{t|t-1} + Σ_{t|t-1} D' (D Σ_{t|t-1} D')^{-1} u_t,
ẑ_{t+1|t} = A ẑ_{t|t} = A ẑ_{t|t-1} + A Σ_{t|t-1} D' (D Σ_{t|t-1} D')^{-1} u_t,

with u_t = d_t - d̂_{t|t-1} = d_t - D ẑ_{t|t-1}.

Kalman filter, no measurement errors: Recursion

z_{t+1} - ẑ_{t+1|t} = A(z_t - ẑ_{t|t-1}) + B ε_{t+1} - A Σ_{t|t-1} D' (D Σ_{t|t-1} D')^{-1} u_t.

The mean squared error (MSE) Σ_{t+1|t} = E[(z_{t+1} - ẑ_{t+1|t})(z_{t+1} - ẑ_{t+1|t})'] is given by:

Σ_{t+1|t} = A Σ_{t|t-1} A' + B Σ_ε B' - A Σ_{t|t-1} D' (D Σ_{t|t-1} D')^{-1} E(u_t u_t') (D Σ_{t|t-1} D')^{-1} D Σ_{t|t-1} A'
          = A Σ_{t|t-1} A' + B Σ_ε B' - A Σ_{t|t-1} D' (D Σ_{t|t-1} D')^{-1} D Σ_{t|t-1} A',

since E(u_t u_t') (D Σ_{t|t-1} D')^{-1} = (D Σ_{t|t-1} D')(D Σ_{t|t-1} D')^{-1} = I.

Kalman filter, no measurement errors: Summary

The Kalman filter can be summarized as follows:

1. Initialization: ẑ_{1|0} = 0, vec(Σ_{1|0}) = (I - A ⊗ A)^{-1} vec(B Σ_ε B').
2. Period-t likelihood ingredients:
   u_t = d_t - D ẑ_{t|t-1},  d̂_{t|t-1} = D ẑ_{t|t-1},  Ω_{t|t-1} := E(u_t u_t') = D Σ_{t|t-1} D'.
3. Period-t filtering density:
   ẑ_{t|t} = ẑ_{t|t-1} + Σ_{t|t-1} D' (D Σ_{t|t-1} D')^{-1} u_t,
   Σ_{t|t} = Σ_{t|t-1} - Σ_{t|t-1} D' (D Σ_{t|t-1} D')^{-1} D Σ_{t|t-1}.
4. Period-t predictive density:
   ẑ_{t+1|t} = A ẑ_{t|t-1} + K_t u_t,
   Σ_{t+1|t} = A Σ_{t|t-1} A' + B Σ_ε B' - K_t D Σ_{t|t-1} A',
   with Kalman gain K_t = A Σ_{t|t-1} D' (D Σ_{t|t-1} D')^{-1}.
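A sketch of the full recursion in Python, combining steps 1-4 with the log-likelihood of the next slide; the state-space matrices are again the hypothetical ones, and measurement errors are ignored as on these slides:

```python
import numpy as np

# Sketch of steps 1-4 above, accumulating the Gaussian log-likelihood of the
# next slide. d is a T x n data matrix; A, B, D, Sigma_eps are (hypothetical)
# state-space matrices; measurement errors are ignored, as on these slides.
def kalman_loglik(d, A, B, D, Sigma_eps):
    T, n = d.shape
    m = A.shape[0]
    # Step 1: initialize at the unconditional moments.
    z_pred = np.zeros(m)
    Q = B @ Sigma_eps @ B.T
    Sigma_pred = np.linalg.solve(
        np.eye(m * m) - np.kron(A, A), Q.reshape(-1, order="F")
    ).reshape(m, m, order="F")
    loglik = 0.0
    for t in range(T):
        # Step 2: forecast error u_t and its covariance Omega_{t|t-1}.
        u = d[t] - D @ z_pred
        Omega = D @ Sigma_pred @ D.T
        Omega_inv = np.linalg.inv(Omega)
        loglik += -0.5 * (n * np.log(2 * np.pi)
                          + np.log(np.linalg.det(Omega))
                          + u @ Omega_inv @ u)
        # Step 3: filtering (update).
        gain = Sigma_pred @ D.T @ Omega_inv
        z_filt = z_pred + gain @ u
        Sigma_filt = Sigma_pred - gain @ D @ Sigma_pred
        # Step 4: prediction.
        z_pred = A @ z_filt
        Sigma_pred = A @ Sigma_filt @ A.T + Q
    return loglik
```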

Log-likelihood

Given the Gaussian assumption about the (n×1) forecast error u_t, one can derive the distribution of the data d_t conditional on (d_{t-1}, d_{t-2}, ...) and set up the log-likelihood function:

log L(d|θ) = Σ_{t=1}^T log L(d_t|θ) = -(nT/2) log(2π) - (1/2) Σ_{t=1}^T log|Ω_t| - (1/2) Σ_{t=1}^T u_t' Ω_t^{-1} u_t.
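As a quick usage check, one could evaluate this likelihood on the simulated sample from the first sketch (its small measurement error is ignored, since this filter assumes none):

```python
# Usage check: the likelihood at the true matrices of the simulation sketch.
ll = kalman_loglik(d, A, B, D, Sigma_eps)
print(f"log-likelihood at the true matrices: {ll:.1f}")
```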

Maximum Likelihood: Idea

Approach: the parameters θ are fixed, and the data are a random realization of this specific parametrization. The Maximum Likelihood estimator θ_ML is then defined as

θ_ML = argmax_θ Σ_{t=1}^T log L(d_t|θ).

Given some regularity conditions, the ML estimator is consistent, asymptotically efficient and asymptotically Gaussian. Uncertainty and inference are based upon the assumption that to each realization of data there corresponds a different vector of parameters that maximizes the likelihood.

Hint for the estimation of the parameters of a DSGE model: the dimension of d_t must not exceed the dimension of the structural shocks ε_t, since otherwise the forecast-error variance-covariance matrix is singular (stochastic singularity). If it does: add measurement errors or additional shocks.
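A sketch of the ML step under stated assumptions: build_state_space is a hypothetical mapping from θ to the state-space matrices (in practice it comes from solving the model, i.e. from g_x and h_x), and a standard optimizer maximizes the Kalman-filter likelihood:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of the ML step. build_state_space is a hypothetical mapping
# theta -> (A, B, D, Sigma_eps) obtained from the model solution at theta.
def neg_loglik(theta, d, build_state_space):
    A, B, D, Sigma_eps = build_state_space(theta)
    return -kalman_loglik(d, A, B, D, Sigma_eps)

# With an initial guess theta0, the estimate would be obtained as e.g.:
# result = minimize(neg_loglik, theta0, args=(d, build_state_space),
#                   method="L-BFGS-B")
# theta_ML = result.x
```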

Exercise 4: An and Schorfheide (2007) via ML

Consider the following New Keynesian model:

ŷ_t = E_t[ŷ_{t+1}] + ĝ_t - E_t[ĝ_{t+1}] - (1/τ)(R̂_t - E_t[π̂_{t+1}] - E_t[ẑ_{t+1}]),
π̂_t = β E_t[π̂_{t+1}] + κ(ŷ_t - ĝ_t),
ĉ_t = ŷ_t - ĝ_t,
R̂_t = ρ_R R̂_{t-1} + (1-ρ_R)ψ_1 π̂_t + (1-ρ_R)ψ_2 (ŷ_t - ĝ_t) + ε_{R,t},
ĝ_t = ρ_g ĝ_{t-1} + ε_{g,t},
ẑ_t = ρ_z ẑ_{t-1} + ε_{z,t}.

All variables with a hat denote the logarithmic deviation from the steady state, i.e. x̂_t = log(x_t) - log(x). The stochastic shocks are normally distributed with E[ε_{i,t}] = 0 and E[ε²_{i,t}] = σ²_i.

Exercise 4: An and Schorfheide (2007) via ML

Assume you have quarterly data to estimate the parameters: quarterly growth of GDP per capita in percent (YGR_t), annualized inflation rates in percent (INFL_t), annualized nominal interest rates in percent (INT_t). Model variables and observed data are linked by the following equations:

YGR_t = γ^(Q) + 100(ŷ_t - ŷ_{t-1} + ẑ_t),
INFL_t = π^(A) + 400 π̂_t,
INT_t = π^(A) + r^(A) + 4γ^(Q) + 400 R̂_t.

The parameters γ^(Q), π^(A) and r^(A) are linked to the steady-state values of the model:

γ = A_{t+1}/A_t = e^{γ^(Q)/100} ≈ 1 + γ^(Q)/100,
π = e^{π^(A)/400} ≈ 1 + π^(A)/400,
r = e^{r^(A)/400} ≈ 1 + r^(A)/400,
β = e^{-r^(A)/400} ≈ 1/(1 + r^(A)/400).

Exercise 4: An and Schorfheide (2007) via ML

Write a mod-file for the model in order to estimate it via Maximum Likelihood.

(a) Use the simulated dataset simdat1.mat for the estimation. The true values are τ = 2.000, κ = 0.150, ψ_1 = 1.500, ψ_2 = 1.000, ρ_R = 0.600, ρ_z = 0.650, ρ_g = 0.950, σ_R = 0.2/100, σ_g = 0.8/100, σ_z = 0.45/100, π^(A) = 4.000, γ^(Q) = 0.500, r^(A) = 0.400.

1. Estimate all parameters via ML. Why does it not work?
   Hint: the nonlinear model implies that β = γ/r = e^{γ^(Q)/100} / e^{r^(A)/400}.
2. Calibrate the parameters ψ_1 and r^(A) to their true values and estimate the other parameters. Why does it work now?

Exercise 4: An and Schorfheide (2007) via ML

(b) Use the simulated dataset simdat2.mat for the estimation. The true parameters are the same, except now r^(A) = 4.

1. Estimate all parameters via ML. Why does it still not work?
2. Calibrate ψ_1 to its true value and estimate all other parameters. Discuss your results.

Maximum Likelihood: Discussion

Experience shows that it can be pretty hard and tricky to estimate a DSGE model via Maximum Likelihood:
- Data is often not sufficiently informative, i.e. the likelihood is flat in some directions (identification).
- DSGE models are always misspecified. This can lead to absurd parameter values.

Bayesian methods: Idea

Based upon the likelihood as well: the complete characterization of the data-generating process. Approach: the parameters θ are random and the data d is fixed. The idea is to combine known information (data) with additional beliefs (prior beliefs) about the parameters and to get an expression for the conditional probability of the parameters. Hence, one is able to put more weight on a suspected region of the parameter space. Bayesian methods are a bridge between calibration and the Maximum Likelihood method:

"Bayesian Inference is a Way of Thinking, Not a Basket of Methods." (Christopher Sims)

Bayesian methods: Idea

The likelihood function L(d|θ) is a conditional density of observed data given the parameters: p(d|θ) = L(d|θ). Denote by π(θ) the known prior density of the vector of parameters; then, using Bayes' rule:

π(θ|d) = L(d|θ) π(θ) / p(d) = L(d|θ) π(θ) / ∫ L(d|θ) π(θ) dθ ∝ L(d|θ) π(θ),

with ∝ meaning "proportional to". p(d) is the marginal likelihood of the data and ultimately only a constant that normalizes the expression to unity; it is independent of the parameters. Removing it doesn't change the form of the posterior density π(θ|d); the density merely doesn't integrate to one. This non-normalized density is called the posterior kernel or, in logs, the log-posterior kernel.
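In code, the log-posterior kernel is just the sum of the Kalman-filter log-likelihood and the log prior. A sketch, reusing the hypothetical kalman_loglik and build_state_space from the ML sketches; the Beta prior is purely illustrative:

```python
import numpy as np
from scipy import stats

# Sketch of the log-posterior kernel log L(d|theta) + log pi(theta).
def log_posterior_kernel(theta, d, build_state_space, log_prior):
    lp = log_prior(theta)
    if not np.isfinite(lp):                 # theta outside the prior support
        return -np.inf
    A, B, D, Sigma_eps = build_state_space(theta)
    return kalman_loglik(d, A, B, D, Sigma_eps) + lp

# Purely illustrative prior: a Beta(2, 2) density on one parameter in (0, 1).
def log_prior(theta):
    return stats.beta(2, 2).logpdf(theta[0])
```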

Bayesian methods: Idea

The mode of the posterior is the Bayesian estimator θ_B of the true parameter vector:

θ_B = argmax_θ log π(θ|d) = argmax_θ {log L(d|θ) + log π(θ)}.

Procedure: calculate the log-likelihood with the Kalman filter and simulate the log-posterior kernel through sampling or Monte Carlo methods. In the literature and in Dynare, the Metropolis-Hastings algorithm is commonly used. Inference can then be conducted via the properties of the posterior distribution.
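A minimal mode-finding sketch under the same assumptions (theta0 is an initial guess; the optimizer's inverse Hessian doubles as Σ_B for the proposal distribution below):

```python
from scipy.optimize import minimize

# Mode-finding sketch: theta_B maximizes the log-posterior kernel.
def neg_kernel(theta, d, build_state_space, log_prior):
    return -log_posterior_kernel(theta, d, build_state_space, log_prior)

# result = minimize(neg_kernel, theta0, args=(d, build_state_space, log_prior),
#                   method="L-BFGS-B")
# theta_B = result.x
# Sigma_B = result.hess_inv.todense()   # inverse Hessian at the mode
```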

Bayesian methods: Metropolis-Hastings algorithm

An and Schorfheide (2007, p. 132): "The algorithm constructs a Gaussian approximation around the posterior mode and uses a scaled version of the asymptotic covariance matrix as the covariance matrix for the proposal distribution. This allows for an efficient exploration of the posterior distribution at least in the neighborhood of the mode."

The algorithm uses the fact that under very general regularity conditions the moments of a distribution are asymptotically normal. It constructs a sequence of draws (a Markov chain) from a proposal density. This need not be identical to the posterior density; it is only required that the algorithm can draw samples from the whole range of the posterior density.

Bayesian methods: Metropolis-Hastings algorithm

The current candidate (draw) θ* depends on the previous draw θ^(s-1). All candidates carry the same weight; however, they are only accepted with a certain probability α, calculated as the ratio of the posterior kernel of the current candidate to that of the previous one. By this construction, the algorithm tends to shift the draws from areas of low posterior probability to areas of high probability:

- If θ^(s-1) is in an area of high posterior probability, it is likely that only candidates in the same area are accepted.
- If θ^(s-1) is in an area of low posterior probability, it is very likely that new candidates are accepted.

The covariance matrix of the proposal distribution plays a major role, since it is important to set α neither too large nor too small. Current practice uses the covariance matrix at the mode θ_B and scales it with a factor c such that the average acceptance rate is between 20% and 30%.

Bayesian methods: Metropolis-Hastings algorithm

Proposal distribution:
1. Specify c_0, c and S.
2. Maximize log L(d|θ) + log π(θ) using numerical methods. θ_B denotes the mode.
3. Calculate the inverse of the Hessian evaluated at the mode; denote it Σ_B.
4. Specify an initial value θ^(0), or draw it from N(θ_B, c_0² Σ_B).

Bayesian methods: Metropolis-Hastings algorithm

Metropolis-Hastings steps:
5. For s = 1, ..., S:
   - Draw θ* from the candidate-generating distribution (proposal density) N(θ^(s-1), c² Σ_B).
   - Calculate the acceptance probability α:
     α = α(θ^(s-1), θ*) = [L(d|θ*) π(θ*)] / [L(d|θ^(s-1)) π(θ^(s-1))].
   - With probability min{α, 1} accept the jump from θ^(s-1) to θ*. In other words: if α ≥ 1, set θ^(s) = θ*. Otherwise draw a uniformly distributed random variable r between 0 and 1: if r ≤ α, set θ^(s) = θ* (jump); if r > α, set θ^(s) = θ^(s-1) (don't jump).
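A minimal random-walk Metropolis-Hastings sketch of these steps; log_kernel is the log-posterior kernel, θ_B and Σ_B come from the mode-finding sketch, and c and S are tuning choices, not prescribed values:

```python
import numpy as np

# Minimal random-walk Metropolis-Hastings sketch of steps 1-5 above.
def metropolis_hastings(log_kernel, theta_B, Sigma_B, c=0.3, S=10_000, seed=0):
    rng = np.random.default_rng(seed)
    draws = np.empty((S, len(theta_B)))
    theta = np.asarray(theta_B, dtype=float)
    logk = log_kernel(theta)
    accepted = 0
    for s in range(S):
        # Proposal: theta* ~ N(theta^(s-1), c^2 Sigma_B).
        prop = rng.multivariate_normal(theta, c**2 * Sigma_B)
        logk_prop = log_kernel(prop)
        # Accept with probability min{alpha, 1}, alpha the kernel ratio;
        # comparing logs of r and alpha avoids numerical under/overflow.
        if np.log(rng.uniform()) < logk_prop - logk:
            theta, logk = prop, logk_prop
            accepted += 1
        draws[s] = theta
    print(f"average acceptance rate: {accepted / S:.2f}")   # aim for ~0.2-0.3
    return draws
```

Posterior expectations (step 6 on the next slide) are then Monte Carlo averages of the returned draws, e.g. draws.mean(axis=0).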

Bayesian methods: Metropolis-Hastings algorithm

6. Estimate the posterior expectation of a function h(θ) by (1/S) Σ_{s=1}^S h(θ^(s)).
7. If the average acceptance rate does not attain a desirable value (typically between 20% and 30%), or the algorithm does not converge, change c_0, c or S.

Bayesian methods: Remarks

Bayesian estimation of a DSGE model requires at least as many shocks as observable variables. Common choices for priors: Gaussian (normal), (shifted or inverse) Gamma, Beta, or the uniform distribution. Choosing a proper prior, one has to consider lower and upper bounds as well as the skewness and kurtosis of the distribution.

The results can vary with the choice of priors and their parametrization. Therefore one has to check the robustness of the results:
- Try a different parametrization.
- Try more general priors.
- Noninformative priors.
- Sensitivity analysis.

Properties of the posterior distribution

The posterior density combines all information about θ: information after the data is observed as well as information prior to the data. Bayesian estimation works for every sample size; however, it also has the following asymptotic properties:

1. The priors become irrelevant for the determination of the posterior.
2. The posterior converges to a degenerate distribution around the true value (spike).
3. The posterior mode is approximately Gaussian.

Using the posterior distribution one can set up Bayesian confidence intervals (credible sets), calculate forecasts using the predictive density

L(d_f|d) = ∫ L(d_f|θ, d) π(θ|d) dθ,

and compare models.

Model comparison

Models can differ in their prior distribution, the likelihood and the parameters. Bayesian approach: calculate the probability that model i is the true model, given the data. Suppose there are i = 1, 2 models M_i with prior probability p_i = P(M_i) that model M_i is the true model. Each model has a set of parameters θ_i with a prior distribution π_i(θ_i) and a likelihood L_i(d|θ_i). Then the probability of model 1 being the true model, given the data, is:

P(M_1|d) = P(M_1) L_1(d|M_1) / L(d) = p_1 ∫ L_1(d, θ_1|M_1) dθ_1 / L(d) = p_1 ∫ L_1(d|θ_1, M_1) π_1(θ_1|M_1) dθ_1 / L(d),

with

L(d) = p_1 ∫ L_1(d|θ_1, M_1) π_1(θ_1|M_1) dθ_1 + p_2 ∫ L_2(d|θ_2, M_2) π_2(θ_2|M_2) dθ_2.

Model comparison

The expected value of the likelihood under the prior distribution is the so-called marginal likelihood of model i:

m_i(d) = ∫ L_i(d|θ_i, M_i) π_i(θ_i|M_i) dθ_i.

Using this, one can calculate the posterior odds:

PO_12 = P(M_1|d) / P(M_2|d) = [p_1 / p_2] × [m_1(d) / m_2(d)],

where p_1/p_2 is the prior-odds ratio and m_1(d)/m_2(d) the Bayes factor. Together with P(M_1|d) + P(M_2|d) = 1 one gets the posterior model probabilities:

P(M_1|d) = PO_12 / (1 + PO_12),  P(M_2|d) = 1 - P(M_1|d).
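In code, the whole comparison reduces to a few lines once the (log) marginal likelihoods are available; the numbers below are made up purely for illustration:

```python
import numpy as np

# Sketch: posterior odds and model probabilities from log marginal likelihoods.
log_m1, log_m2 = -1320.4, -1325.1   # e.g. from Laplace or harmonic-mean approx.
p1, p2 = 0.5, 0.5                   # prior model probabilities

bayes_factor = np.exp(log_m1 - log_m2)
posterior_odds = (p1 / p2) * bayes_factor        # PO_12
prob_M1 = posterior_odds / (1 + posterior_odds)  # P(M1|d)
print(f"PO_12 = {posterior_odds:.1f}, P(M1|d) = {prob_M1:.3f}")
```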

Model comparison

The marginal likelihood measures the ability of a model to characterize the data. The posterior odds don't point to the true model; they solely describe which model, compared to the other, has the higher conditional probability. A PO_12 >> 1 is an indication that the data as well as the priors favor model 1. Guidelines of Jeffreys (1961):

- 1:1 up to 3:1: weak evidence for model 1,
- 10:1 up to 100:1: strong evidence for model 1,
- above 100:1: decisive evidence for model 1.

Implementation and calculation of the integrals is done by numerical MCMC and sampling methods, as well as Laplace or harmonic-mean approximations.

Discussion of full information estimators

More restrictive assumptions are needed compared to limited information estimation: specification of the distribution of the shocks, i.e. the likelihood. Advantages of Maximum Likelihood estimation lie in the full characterization of the data-generating process and the exact, consistent and efficient estimation of the parameters.

Dilemma of "absurd parameter estimates": a problem of ML estimation due to wrong distributional assumptions, problems in the optimization algorithm, or parameters that are not separately identifiable. Even transformations, upper and lower bounds, etc. are of only limited help in overcoming this problem when the likelihood is flat.

Discussion of full information estimators

This is where Bayesian methods come in and bridge the gap between calibration and the ML principle. By means of priors one can incorporate additional information into a model.

Dilemma of "absurd parameter estimates": even with Bayesian means it is not possible to estimate these parameters (the posterior looks almost the same as the prior), but one can assign prior probability such that these parameter values are very unlikely; using priors one can exclude these absurd parameter estimates. Nevertheless, robustness and identification of the parameters remain critical topics.

Discussion of full information estimators

An and Schorfheide (2006, p. 124): "Once one acknowledges that the DSGE model provides merely an approximation to the law of motion of the time series (...), then it seems reasonable to assume that there need not exist a single parameter vector (...) that delivers, say, the true intertemporal substitution elasticity or price adjustment costs and, simultaneously, the most precise impulse responses to a technology or monetary policy shock. Each estimation method is associated with a particular measure of discrepancy between the true law of motion and the class of approximating models."

Exercise: Estimation with Bayesian methods

Consider the following simplified RBC model (social planner problem):

max_{{c_{t+j}, l_{t+j}, k_{t+j}}_{j=0}^∞}  W_t = Σ_{j=0}^∞ β^j u(c_{t+j}, l_{t+j})

s.t. y_t = c_t + i_t,
     y_t = A_t f(k_{t-1}, l_t),
     k_t = i_t + (1-δ) k_{t-1},
     A_t = A e^{a_t},
     a_t = ρ a_{t-1} + ε_t,  ε_t ~ N(0, σ²_ε),

where preferences and technology follow:

u(c_t, l_t) = [c_t^θ (1-l_t)^{1-θ}]^{1-τ} / (1-τ),
f(k_{t-1}, l_t) = [α k_{t-1}^ψ + (1-α) l_t^ψ]^{1/ψ}.

Optimality is given by:

u_c(c_t, l_t) - β E_t{u_c(c_{t+1}, l_{t+1}) [A_{t+1} f_k(k_t, l_{t+1}) + 1 - δ]} = 0,
u_l(c_t, l_t)/u_c(c_t, l_t) + A_t f_l(k_{t-1}, l_t) = 0,
c_t + k_t - A_t f(k_{t-1}, l_t) - (1-δ) k_{t-1} = 0.

Exercise: Estimation with Bayesian methods

(a) Write a mod-file for this model (with a sensible calibration and a steady-state block).
(b) Simulate a sample of 10000 observations for c_t, l_t and y_t using stoch_simul and save it in a mat-file.
(c) Define priors for α, θ and τ (or a different set of parameters).
(d) Estimate the posterior mode using the estimation command and a limited sample with 100 observations. How many observable variables do you need? Check the posterior mode using mode_check. If you get errors due to a non-positive-definite Hessian, try a different optimization algorithm or change the initial values.

Exercise: Estimation with Bayesian methods

(e) If you are satisfied with the posterior mode, approximate the posterior distribution using the Metropolis-Hastings algorithm with 3 × 5000 iterations. If it does not converge to the (ergodic) posterior distribution, repeat the algorithm without discarding the previous draws.
(f) How robust are the results with regard to the specification of the priors? Repeat the estimation of the posterior mode for different priors.
(g) Use the same dataset to estimate the parameters of a misspecified model. Use the same model, however with a small difference: a Cobb-Douglas production function, or a separable utility function, or a model in which the household supplies inelastically one unit of labor. Hint: don't forget to adjust the model equations as well as the steady-state block.
(h) Compare the estimates of the common parameters as well as the marginal densities of the different models. Calculate the posterior odds and the posterior model probabilities.