
ECO 513 Spring 2015
TAKEHOME FINAL EXAM

Date: January 31, 2016. © 2016 by Christopher A. Sims. This document is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. http://creativecommons.org/licenses/by-nc-sa/3.0/

(1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form:

    y_t = 1.6974 y_{t-1} - 0.9604 y_{t-2} + ε_t - 1.6628 ε_{t-1} + 0.9216 ε_{t-2},    (1)

where ε is i.i.d. N(0, 1).

(a) Show that the process is stationary and that ε_t is its innovation process (i.e. its one-step-ahead forecast error).

The roots of the AR polynomial in the lag operator are complex, both 1.02 in absolute value, hence outside the unit circle. Therefore that polynomial has a convergent one-sided inverse and the model implies convergence to a stationary process. The roots of the MA polynomial are also complex and of absolute value 1.0417, so that moving average operator also has a convergent one-sided inverse, implying that ε can be represented as a convergent linear combination of current and past values of y, and thus that it is the innovation in y. Note that the phase angles of the roots (b in their representation as r = e^{a+bi}) are 2π/12 to within 5 significant figures in both cases.

(b) Plot the spectral density of y.

If y = c(L)ε, with ε serially uncorrelated with unit variance, its spectral density is |c(e^{-iω})|². In this example we have

    c(e^{-iω}) = (1 - 1.6628 e^{-iω} + 0.9216 e^{-2iω}) / (1 - 1.6974 e^{-iω} + 0.9604 e^{-2iω}).

In R or Matlab, which know how to do complex arithmetic, one simply calculates and plots |c(e^{-iω})|², which is by construction real-valued. Here is the plot.

[Figure: abs((atil/btil)^2), the spectral density, plotted against omega over (0, 2π); it hovers near 1 except for tall narrow spikes, reaching about 4, near 2π/12 and 2π - 2π/12.]

Note the high spikes at 2π/12 and 2π - 2π/12. This is the spectral density of the sum of a white noise with variance 1 and another process, independent of it, that is a slowly varying sine wave of period 12.
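Here is a minimal R sketch of that calculation. The names atil and btil follow the y-axis label of the plot; the grid of 1000 frequencies and the dashed reference lines are choices of this sketch, not of the original:

    omega <- seq(0.001, 2 * pi, length.out = 1000)
    z <- exp(-1i * omega)                    # e^{-i omega}
    atil <- 1 - 1.6628 * z + 0.9216 * z^2    # MA polynomial evaluated at e^{-i omega}
    btil <- 1 - 1.6974 * z + 0.9604 * z^2    # AR polynomial evaluated at e^{-i omega}
    plot(omega, abs((atil / btil)^2), type = "l",
         xlab = "omega", ylab = "abs((atil/btil)^2)")
    abline(v = c(2 * pi / 12, 2 * pi - 2 * pi / 12), lty = 2)  # seasonal frequencies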

(c) Show that if the data are monthly, this process will display strong seasonal variation. What kind of a seasonal pattern will it display?

It actually doesn't display a strong seasonal pattern. I had plotted the spectral density and noted the peaks at the frequencies of a period-12 sine wave, but did not think carefully enough about the fact that the peaks are so narrow that their integrals are small. This means that the variance of the data is dominated by the white noise component. There is a slowly varying sine wave in there, but it is very hard to see in plots of a simulated series. Simulating 6000 periods of realizations of this process and calculating an estimated autocorrelation function, one sees a clear, statistically significant seasonal pattern in the coefficients, but the absolute size of the autocorrelations is small. (A sketch of this check appears after part (d).) To get a clearly visible sine wave in the realizations I would have had to make the peaks higher, by making the absolute values of the AR roots closer to one. For example, 1.730319 on the first lag and -0.998001 on the second would have worked.

(d) Explain why this example illustrates a potential pitfall of fitting an ARMA(2,2) to seasonally adjusted data.

With seasonally adjusted data, power at the seasonal frequencies is greatly reduced. Therefore, in fitting a model that is only approximately correct, usual estimation methods will allow more approximation error at those frequencies. A low-order autoregression or a low-order MA model for a univariate time series can only produce a smooth spectral density across seasonal frequencies, so the lack of power at the seasonal frequencies in the data is not an issue. But though an ARMA(2,2) model may look like it does not contain any clear seasonal mechanism, it can produce big spikes at seasonal frequencies. It could therefore lead to estimates that imply spurious seasonal variation in the data, or (perhaps worse) match the dips at the seasonal frequencies in the spectral density that arise from seasonal adjustment, in effect exploiting the two-sided nature of seasonal adjustment filters to imply spuriously accurate forecasts.
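A minimal R sketch of the simulation check described in part (c); the seed and the 48-lag window are arbitrary choices of this sketch, not from the original:

    set.seed(513)
    # R's arima.sim() convention writes the MA part as e_t + ma1*e_{t-1} + ma2*e_{t-2},
    # so the signs of the MA coefficients in (1) are flipped here.
    y <- arima.sim(n = 6000,
                   model = list(ar = c(1.6974, -0.9604), ma = c(-1.6628, 0.9216)))
    acf(y, lag.max = 48)  # clearly periodic (period 12) but small autocorrelations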

(2) Here is a version of the Solow growth model:

    C_t + I_t = A_t K_{t-1}^α                                                   (2)
    I_t = θ_t (C_t + I_t)                                                       (3)
    K_t = (1 - δ) K_{t-1} + I_t                                                 (4)
    log(A_t) = log(A_{t-1}) + ε_t                                               (5)
    log(θ_t / (1 - θ_t)) = (1 - ρ) θ̄ + ρ log(θ_{t-1} / (1 - θ_{t-1})) + ν_t.    (6)

ε and ν are assumed i.i.d. normal, with unknown variances σ²_ε and σ²_ν.

(a) Explain in detail how, given values for the Greek-letter parameters and an initial value K_0 for capital, one could use a computer to generate a simulated time path for all the variables in the model.

The two exogenous drivers for the model are the A (technology) and θ (savings rate) processes. A is non-stationary, so an initial value for it must be selected arbitrarily, for example A_0 = 1. θ is most naturally regarded as stationary (otherwise it is eventually almost exactly one or zero almost all the time). So it makes sense to draw log(θ_0 / (1 - θ_0)) from its steady-state distribution N(θ̄, σ²_ν / (1 - ρ²)). Then of course one has to use this value to solve for θ_0 itself. With initial values in hand we can use the last two equations to recursively generate time paths for A_t and θ_t, having the computer draw shocks from the given normal distributions. Then the first three equations can be solved to eliminate C and I, yielding

    θ_t A_t K_{t-1}^α = K_t - (1 - δ) K_{t-1}.                                  (∗)

This can be solved recursively to deliver a time path for K. Then (4) can be used to generate a time path for I_t, and finally (2) can be used to generate the path of C_t.
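For concreteness, here is a minimal R sketch of this recursion. The parameter values, sample length, and seed are arbitrary illustrative choices of this sketch, not part of the exam:

    set.seed(1)
    T <- 200
    alpha <- 0.33; delta <- 0.1; rho <- 0.9
    thetabar <- log(0.2 / 0.8)                 # mean of log(theta/(1-theta))
    sig.eps <- 0.02; sig.nu <- 0.1; K0 <- 1

    A <- exp(cumsum(rnorm(T, 0, sig.eps)))     # (5), with A_0 = 1
    ltheta <- numeric(T)                       # ltheta = log(theta/(1-theta))
    lprev <- rnorm(1, thetabar, sig.nu / sqrt(1 - rho^2))  # steady-state draw for theta_0
    for (t in 1:T) {
      ltheta[t] <- (1 - rho) * thetabar + rho * lprev + rnorm(1, 0, sig.nu)  # (6)
      lprev <- ltheta[t]
    }
    theta <- 1 / (1 + exp(-ltheta))            # solve for theta itself

    K <- numeric(T); Kprev <- K0
    for (t in 1:T) {
      K[t] <- (1 - delta) * Kprev + theta[t] * A[t] * Kprev^alpha  # (*) rearranged
      Kprev <- K[t]
    }
    Klag <- c(K0, K[-T])
    I <- K - (1 - delta) * Klag                # (4)
    C <- A * Klag^alpha - I                    # (2)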

(b) If we have observed time series for C and K (and hence also I), calculating the likelihood for this model is harder than simulating it. Explain why.

We have a normal density for the two shocks ε and ν, and there are packaged routines to draw efficiently from these distributions. In the previous part we showed that there is then a non-iterative algorithm that generates a simulated path for the variables. The likelihood also has a recursive structure, as usual in time series models, in that we can write the log likelihood plus the log prior as

    ∑_{t=1}^T log p(C_t, K_t | C_{t-s}, K_{t-s}, s > 0; α, δ, ρ, θ̄, σ²_ε, σ²_ν) + log π(α, δ, ρ, θ̄, σ²_ε, σ²_ν, A_0, θ_0),    (∗∗)

where π is the pdf of a prior. We can solve for ε_t, ν_t at any given date, given the history of C_t and K_t up to that date: we get A_t from (2) and θ_t from (3) and (4). But we can't find the likelihood by simply substituting functions of the observed data into (∗∗), because of the nonlinearity in the mappings. We need Jacobian terms. To be precise, (2)-(4) imply that the Jacobian of the transformation between (C_t, K_t) and (A_t, θ_t) is

    ∂(A_t, θ_t) / ∂(C_t, K_t) = 1 / (A_t K_{t-1}^{2α}).

And then to get the Jacobian of the full transformation from (C_t, K_t) to (ε_t, ν_t), we need to multiply the Jacobian above by

    ∂(ε_t, ν_t) / ∂(A_t, θ_t) = 1 / (A_t θ_t (1 - θ_t)).

So, since we have the Gaussian form for the conditional pdfs in (∗∗) and have an analytic form for ε_t, ν_t as functions of C_t, K_t (and past values and parameter values) at each date, we can construct the likelihood. It involves quite a bit more algebra than the simulation, and would be slower computationally, but this model is simple enough that the analytic form of the likelihood could be calculated quite quickly on a desktop or laptop.

Note the contrast with a rational expectations version of this model, in which agents choose I_t and C_t optimally at each date rather than saving a fixed fraction of income. In the rational expectations version there is no analytic form available (unless δ = 1) for the mapping from shocks to choice variables. We can approximate the mapping itself numerically, but numerically approximating the Jacobian would be many times harder.

(c) Sometimes particle filtering provides a simple way to approximate the likelihood in dynamic models that are easily simulated. That is not so true in this model. Why?

We can fairly easily compute the exact likelihood here, so the gap in difficulty between simulating the model and evaluating the likelihood is not that great. Furthermore, the particle filter involves generating at each date draws of an imperfectly observed state, then reweighting those draws based on the observed data. But here the model implies the state variable is perfectly observed, so this strategy simply doesn't work. If the observable were just C, not K, the particle filter would be a help. In that case, we could at each date and for each particle make a draw for K_t from (∗), then reweight the particles using the conditional pdf of K_t given C_t, for which an analytic formula could be constructed. Proceeding this way would avoid the need to integrate K_t out of the joint density of K_t, C_t given history.

(d) Would the particle filter become easier to implement if we assumed that one or both of observed C and K contain i.i.d. normal measurement error? Explain your answer.

Yes, if both have measurement error, because then at each step of the particle filter we could draw particles from the mechanism in the original model, then reweight by the normal measurement error density. Of course this adds two parameters to the model (the measurement error variances), and changes its economic interpretation if the model implies non-trivial measurement error variances. If only one has measurement error, the particle filter is still no help.
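To illustrate the both-measurement-error case, here is a skeletal R sketch of one particle-filter step. The function and variable names, and the resampling at every step, are assumptions of this sketch, not from the exam; each particle carries the state (K, A, ltheta = log(θ/(1-θ))), and Cobs, Kobs are the date-t observations:

    pf_step <- function(particles, Cobs, Kobs,
                        alpha, delta, rho, thetabar, sig.eps, sig.nu,
                        sig.mC, sig.mK) {     # sig.mC, sig.mK: measurement error SDs
      N <- nrow(particles)
      # Propagate each particle through the model's own transition, (5), (6), (*):
      A      <- particles[, "A"] * exp(rnorm(N, 0, sig.eps))
      ltheta <- (1 - rho) * thetabar + rho * particles[, "ltheta"] + rnorm(N, 0, sig.nu)
      theta  <- 1 / (1 + exp(-ltheta))
      Kprev  <- particles[, "K"]
      K      <- (1 - delta) * Kprev + theta * A * Kprev^alpha
      C      <- A * Kprev^alpha - (K - (1 - delta) * Kprev)
      # Reweight by the normal measurement-error densities:
      logw <- dnorm(Cobs, C, sig.mC, log = TRUE) + dnorm(Kobs, K, sig.mK, log = TRUE)
      w <- exp(logw - max(logw)); w <- w / sum(w)
      # Resample to return an equally weighted swarm:
      idx <- sample.int(N, N, replace = TRUE, prob = w)
      cbind(K = K[idx], A = A[idx], ltheta = ltheta[idx])
    }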

(3) In a reduced form vector autoregression, orthogonalized by a triangular normalization, simulated draws from the posterior distribution of the impulse responses can be constructed without use of MCMC, because the posterior takes the form of a standard distribution.

(a) Explain how this is done. (This is a standard result, so you can be brief.)

Under a flat or conjugate prior, the covariance matrix of residuals has an inverse-Wishart posterior distribution (equivalently, its inverse is Wishart), and conditional on it the regression coefficients have a joint normal distribution. Since these are standard distributions, we can draw from them directly (covariance matrix first, then coefficients) without MCMC or importance sampling.
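A minimal R sketch of one such posterior draw under a flat prior, mapped into orthogonalized impulse responses. Here Y is the T×n data matrix and X the regressor matrix with the p lag blocks first and a constant last; these names, the degrees-of-freedom convention, and the horizon are assumptions of this sketch:

    draw_irf <- function(Y, X, p, H = 20) {
      T <- nrow(Y); n <- ncol(Y); k <- ncol(X)
      Bhat <- solve(crossprod(X), crossprod(X, Y))    # OLS coefficients, k x n
      U <- Y - X %*% Bhat                             # residuals
      # Draw Sigma (inverse-Wishart) via a Wishart draw for its inverse:
      Sigma <- solve(rWishart(1, df = T - k, Sigma = solve(crossprod(U)))[, , 1])
      # Draw coefficients: vec(B) ~ N(vec(Bhat), Sigma (x) (X'X)^{-1}):
      L <- t(chol(solve(crossprod(X))))
      B <- Bhat + L %*% matrix(rnorm(k * n), k, n) %*% chol(Sigma)
      # Triangular orthogonalization: impact responses are the Cholesky factor of Sigma:
      Phi <- array(0, c(n, n, H + 1))
      Phi[, , 1] <- t(chol(Sigma))
      for (h in 1:H) for (j in 1:min(h, p)) {
        Bj <- t(B[((j - 1) * n + 1):(j * n), ])       # n x n lag-j coefficient block
        Phi[, , h + 1] <- Phi[, , h + 1] + Bj %*% Phi[, , h - j + 1]
      }
      Phi                                             # responses at horizons 0..H
    }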

(b) In a structural VAR where the number of free parameters in the matrix A_0 of coefficients on current values is exactly (n² + n)/2 and lagged coefficients are unrestricted, it may be possible to construct posteriors on the impulse responses without MCMC by basing them on draws from the posterior on the reduced form parameters. But there are cases where this is not straightforward. Suppose the pattern of zero restrictions on A_0 in a 3-variable system is

    [ x  x  0 ]
    [ x  0  x ]        (7)
    [ 0  x  x ]

How could you sample from the posterior for this model? Could your sampling be based on an initial set of draws from the reduced form posterior?

With this pattern of zeros in A_0, there are pairs of distinct A_0 matrices that both satisfy the zero restrictions and imply the same reduced form covariance matrix Σ = (A_0′ A_0)^{-1}. This implies that differences in posterior density across such pairs come entirely from any prior one might be imposing on A_0. This is not a problem for sampling from the posterior in itself: a posterior with multiple peaks properly reflects the fact that multiple parameter values are equally likely according to the data. It is a problem for the strategy of starting by drawing values for the reduced form covariance matrix, though, as then there is no unique way to produce an A_0 that corresponds to Σ. Equally important is that with this pattern of zeros in A_0 there are Σ matrices such that no A_0 satisfying the restrictions can match Σ. So if we make draws from the reduced form posterior in the usual way, some of them will be Σ matrices that violate the zero constraints on A_0.

Nonetheless, it is possible to treat the prior as flat or conjugate (i.e. Wishart) over Σ values that can be represented as (A_0′ A_0)^{-1}, with pairs of A_0 values implying the same Σ given equal weight. (That there are no more than two A_0 values matching a given Σ is true, though not obvious.) One would draw from the posterior on Σ, discard any draw for which no A_0 satisfying the restrictions can match it, then check whether there are two A_0's that match the drawn Σ. If there are two, pick one with 50% probability; otherwise just keep the draw. This prior would give odd results, and result in few draws being retained, if the estimated reduced form Σ itself failed to satisfy the constraints. Also, if model comparisons were to be made, a separate Monte Carlo calculation would be required to determine what proportion of draws from the prior on Σ fail to satisfy the restrictions.

A better approach would be to impose a prior (say Gaussian), based on substantive reasoning about the application, on the elements of A_0, and to use an MCMC scheme to draw from the posterior on A_0. Because the distribution of the AR coefficients (both reduced form and structural) is of known Gaussian form conditional on A_0, we can easily integrate over the coefficient distribution analytically to calculate the marginal posterior density for each draw of A_0, using it to guide a random-walk Metropolis (for example) MCMC scheme. Then to get a draw from the joint posterior on A_0 and the coefficients, we can go through the MCMC sample of A_0 draws and for each one make a draw from the corresponding Gaussian distribution for the coefficients. Of course this procedure is just what one would do in any over-identified structural VAR model.
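A skeletal R sketch of that random-walk Metropolis scheme. The function log_marg_post(), which should return the log marginal posterior of A_0 (the likelihood with the lag coefficients integrated out analytically, plus the log of the Gaussian prior on A_0), is stubbed here with a standard normal placeholder; it, the step size, and the chain length are all assumptions of this sketch:

    log_marg_post <- function(a) sum(dnorm(a, log = TRUE))  # placeholder stub

    rwm_A0 <- function(a0, n_draws = 10000, step = 0.1) {
      # a0: vector of the free (unrestricted) elements of A_0
      draws <- matrix(NA_real_, n_draws, length(a0))
      lp <- log_marg_post(a0)
      for (i in 1:n_draws) {
        prop <- a0 + step * rnorm(length(a0))              # random-walk proposal
        lp_prop <- log_marg_post(prop)
        if (log(runif(1)) < lp_prop - lp) {                # Metropolis accept/reject
          a0 <- prop; lp <- lp_prop
        }
        draws[i, ] <- a0
      }
      draws
    }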