
Online appendix to "On the stability of the excess sensitivity of aggregate consumption growth in the US"

Gerdie Everaert¹, Lorenzo Pozzi²*, and Ruben Schoonackers³

¹ Ghent University & SHERPPA
² Erasmus University Rotterdam & Tinbergen Institute
³ National Bank of Belgium & Ghent University

August 30, 2016

Abstract: This online appendix provides technical details on the estimation methodology used in the main paper, including an offset mixture representation for the stochastic volatility components, a general outline of the Gibbs sampler and details on the exact implementation of each of the Gibbs blocks.

* Corresponding author at: Department of Economics, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands. Tel: +31 10 4081256. Email: pozzi@ese.eur.nl. Website: http://people.few.eur.nl/pozzi.

A1 Offset mixture representation for the stochastic volatility components

Using $\nu_t \sim N(0, e^{h_{\nu t}})$ and $\mu_t \sim N(0, e^{h_{\mu t}})$ together with equation (18) in the main paper, we can write $\nu_t = e^{h_{\nu t}/2}\mu_{1t}$ and $\mu_t = e^{h_{\mu t}/2}\mu_{2t}$, with $\mu_{1t}$ and $\mu_{2t}$ i.i.d. error terms with unit variance. A key feature of these stochastic volatility components is that they are non-linear but can be transformed into linear components by taking the logarithm of their squares:

$$\ln\left(e^{h_{\nu t}/2}\mu_{1t}\right)^2 = h_{\nu t} + \ln \mu_{1t}^2, \qquad \ln\left(e^{h_{\mu t}/2}\mu_{2t}\right)^2 = h_{\mu t} + \ln \mu_{2t}^2, \tag{A-1}$$

where $\ln \mu_{1t}^2$ and $\ln \mu_{2t}^2$ are log-chi-square distributed with expected value $-1.2704$ and variance $4.93$. Following Kim et al. (1998), we approximate the linear models in (A-1) by an offset mixture time series model

$$g_{jt} = h_{jt} + \epsilon_{jt}, \tag{A-2}$$

for $j = \nu, \mu$, where $g_{\nu t} = \ln\left(\left(e^{h_{\nu t}/2}\mu_{1t}\right)^2 + c\right)$ and $g_{\mu t} = \ln\left(\left(e^{h_{\mu t}/2}\mu_{2t}\right)^2 + c\right)$, with $c = 0.001$ an offset constant, and the distribution of $\epsilon_{jt}$ given by the following mixture of normals:

$$f(\epsilon_{jt}) = \sum_{n=1}^{M} q_n \, f_N\!\left(\epsilon_{jt} \mid m_n - 1.2704, s_n^2\right), \tag{A-3}$$

with component probabilities $q_n$, means $m_n - 1.2704$ and variances $s_n^2$. Equivalently, this mixture density can be written in terms of the component indicator variable $\kappa_{jt}$ as

$$\epsilon_{jt} \mid \kappa_{jt} = n \sim N\!\left(m_n - 1.2704, s_n^2\right), \quad \text{with } \Pr(\kappa_{jt} = n) = q_n. \tag{A-4}$$

Following Kim et al. (1998), we use a mixture of $M = 7$ normal distributions to make the approximation to the log-chi-square distribution sufficiently accurate. Values for $\{q_n, m_n, s_n^2\}$ are provided by Kim et al. in their Table 4.
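For concreteness, the minimal sketch below tabulates the $M = 7$ component constants $\{q_n, m_n, s_n^2\}$ as they are commonly transcribed from Table 4 of Kim et al. (1998), builds $g_{jt}$ from a residual series, and draws the mixture indicators of (A-4), which is also how the indicators are used in Block 3a below. The function names are this sketch's own, not the paper's.

```python
# Sketch of the Kim-Shephard-Chib (1998) offset mixture (equations A-2 to A-4).
import numpy as np

# Component probabilities, means and variances as commonly transcribed from
# Table 4 of Kim et al. (1998); the mixture mean sum(q*m) - 1.2704 = -1.2704.
q = np.array([0.00730, 0.10556, 0.00002, 0.04395, 0.34001, 0.24566, 0.25750])
m = np.array([-10.12999, -3.97281, -8.56686, 2.77786, 0.61942, 1.79518, -1.08819])
s2 = np.array([5.79596, 2.61369, 5.17950, 0.16735, 0.64009, 0.34023, 1.26261])

def to_offset_mixture(resid, c=0.001):
    """Transform residuals e_t = exp(h_t/2) u_t into g_t = ln(e_t^2 + c) (A-2)."""
    return np.log(resid**2 + c)

def sample_mixture_indicators(g, h, rng):
    """Draw kappa_t from Pr(kappa_t = n | g_t, h_t) ∝ q_n N(g_t - h_t | m_n - 1.2704, s_n^2),
    returned as a zero-based component index per observation."""
    eps = (g - h)[:, None]                       # T x 1 residuals of the linear model (A-2)
    logp = (np.log(q) - 0.5 * np.log(2 * np.pi * s2)
            - 0.5 * (eps - (m - 1.2704))**2 / s2)  # T x 7 log posterior kernel
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    u = rng.random((g.shape[0], 1))
    return (p.cumsum(axis=1) < u).sum(axis=1)    # inverse-CDF draw per observation
```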

A2 General outline of the Gibbs sampler

Taken together, equations (14) and (26) in the main paper constitute the observation equations of a state space (SS) model, with the unobserved states $\beta_{it}^*$ and $h_{jt}^*$ evolving according to the state equations (22)-(23) and (24)-(25) respectively. In a standard linear Gaussian SS model, the Kalman filter can be used to filter the unobserved states from the data and to construct the likelihood function, such that the unknown parameters can be estimated by maximum likelihood (ML). However, the stochastic volatility components introduced in Section 3.1 and the stochastic model specification search outlined in Section 3.2 imply a non-regular estimation problem for which the standard approach via the Kalman filter and ML is not feasible. Instead, we use the Gibbs sampler, a Markov chain Monte Carlo (MCMC) method, to simulate draws from the intractable joint and marginal posterior distributions of the unknown parameters and the unobserved states using only tractable conditional distributions. Intuitively, this amounts to reducing the complex non-linear model to a sequence of blocks for subsets of parameters/states, each of which is tractable conditional on the other blocks in the sequence.

For notational convenience, define the non-centered time-varying parameter vector $\beta_t^* = (\beta_{0t}^*, \beta_{1t}^*)'$, the non-centered stochastic volatilities vector $h_t^* = (h_{\nu t}^*, h_{\mu t}^*)'$, the mixture indicators vector $\kappa_t = (\kappa_{\nu t}, \kappa_{\mu t})'$, the unknown parameter vectors $\theta = (\theta_1, \ldots, \theta_q)'$, $\phi_\beta = (\beta_{00}, \beta_{10}, \gamma, \rho, \sigma_{\eta_0}, \sigma_{\eta_1})'$, $\phi_h = (h_{\nu 0}, h_{\mu 0}, \sigma_{\upsilon_\nu}, \sigma_{\upsilon_\mu})'$ and $\phi = (\delta', \theta', \phi_\beta', \phi_h')'$, and the model indicators $M_\theta = (\iota_{\theta_1}, \ldots, \iota_{\theta_q})$, $M_{\phi_\beta} = (\iota_{\beta_{00}}, \iota_{\beta_{10}}, \iota_\gamma, \iota_\rho, \iota_{\beta_{0t}}, \iota_{\beta_{1t}})$, $M_{\phi_h} = (\iota_{h_\nu}, \iota_{h_\mu})$ and $M = (M_\theta, M_{\phi_\beta}, M_{\phi_h})$. Further, let $D_t = (\ln C_t, \ln Y_t, Z_t)$ be the data vector. Stacking observations over time, we denote $D = \{D_t\}_{t=1}^T$, and similarly for $\beta^*$, $h^*$ and $\kappa$. The posterior density of interest is then given by $f(\phi, \beta^*, h^*, \kappa, M \mid D)$.

Building on Frühwirth-Schnatter and Wagner (2010) for the stochastic model specification part, on Chib and Greenberg (1994) for the moving average (MA) part and on Kim et al. (1998) for the stochastic volatility part, our MCMC scheme is as follows:

1. Sample the binary indicators $M_\theta$, $M_{\phi_\beta}$ and the constant parameters $\delta, \theta, \phi_\beta$ conditional on the time-varying parameters $\beta^*$ and on the stochastic volatilities $h^*$. This is accomplished as follows:

   (a) Sample the first step parameters $\delta$ from $f(\delta \mid \phi_\beta, h^*, D)$.

   (b) Sample the binary indicators $M_\theta$ and the MA coefficients $\theta$:

      i. Sample the binary indicators $M_\theta$ from $f(M_\theta \mid \delta, \phi_\beta, \phi_h, \beta^*, h^*, D)$, marginalizing over the parameters $\theta$ for which variable selection is carried out.

      ii. Sample the unrestricted parameters in $\theta$ from $f(\theta \mid \delta, \phi_\beta, \phi_h, \beta^*, h^*, M_\theta, D)$, while setting the restricted parameters in $\theta$, i.e. those for which the corresponding binary indicator in $M_\theta$ is 0, equal to 0.

   (c) Sample the binary indicators $M_{\phi_\beta}$ and the parameters $\phi_\beta$:

      i. Sample the binary indicators $M_{\phi_\beta}$ from $f(M_{\phi_\beta} \mid \delta, \theta, \phi_h, \beta^*, h^*, D)$, marginalizing over the parameters $\phi_\beta$ for which variable selection is carried out.

      ii. Sample the unrestricted parameters in $\phi_\beta$ from $f(\phi_\beta \mid \delta, \theta, \phi_h, \beta^*, h^*, M_{\phi_\beta}, D)$, while setting the restricted parameters in $\phi_\beta$, i.e. those for which the corresponding binary indicator in $M_{\phi_\beta}$ is 0, equal to 0.

2. Sample the unrestricted time-varying parameters in $\beta^*$ from $f(\beta^* \mid \delta, \theta, \phi_\beta, h^*, M_{\phi_\beta}, D)$. The restricted time-varying parameters in $\beta^*$, i.e. those for which the corresponding binary indicator is 0, are sampled directly from their prior distribution in equation (23).

3. Sample the mixture indicators $\kappa$, the binary indicators $M_{\phi_h}$, the constant parameters $\phi_h$ and the stochastic volatility components $h^*$ conditional on the constant parameters $\delta, \theta, \phi_\beta$ and on the time-varying parameters $\beta^*$. This is accomplished as follows:

   (a) Sample the mixture indicators $\kappa$ from $f(\kappa \mid \phi, \beta^*, h^*, D)$.¹

   (b) Sample the binary indicators $M_{\phi_h}$ and the constant parameters $\phi_h$ conditional on the constant parameters $\delta, \theta, \phi_\beta$, on the time-varying parameters $\beta^*$, on the stochastic volatility components $h^*$ and on the mixture indicators $\kappa$:

      i. Sample the binary indicators $M_{\phi_h}$ from $f(M_{\phi_h} \mid \delta, \theta, \phi_\beta, \beta^*, h^*, \kappa, D)$, marginalizing over the parameters $\phi_h$ for which variable selection is carried out.

      ii. Sample the unrestricted parameters in $\phi_h$ from $f(\phi_h \mid \delta, \theta, \phi_\beta, \beta^*, h^*, \kappa, M_{\phi_h}, D)$, while setting the restricted parameters in $\phi_h$, i.e. those for which the corresponding binary indicator in $M_{\phi_h}$ is 0, equal to 0.

   (c) Sample the unrestricted stochastic volatilities in $h^*$ from $f(h^* \mid \phi, \beta^*, \kappa, M_{\phi_h}, D)$. The restricted stochastic volatilities in $h^*$, i.e. those for which the corresponding binary indicator is 0, are sampled directly from their prior distribution in equation (25).

4. Perform a random sign switch for $\sigma_{\eta_i}$ and $\{\beta_{it}^*\}_{t=1}^T$ and for $\sigma_{\upsilon_j}$ and $\{h_{jt}^*\}_{t=1}^T$; e.g., $\sigma_{\eta_1}$ and $\{\beta_{1t}^*\}_{t=1}^T$ are left unchanged with probability 0.5, while with the same probability they are replaced by $-\sigma_{\eta_1}$ and $\{-\beta_{1t}^*\}_{t=1}^T$.

¹ Note that the ordering of the Gibbs steps is in line with Del Negro and Primiceri (2015), who argue that the mixture indicators $\kappa$ should be drawn after the components that do not condition on $\kappa$ directly but before the components that do condition on $\kappa$.

Given an arbitrary set of starting values, sampling from these blocks is iterated $J$ times and, after a sufficiently large number of burn-in draws $B$, the sequence of draws $B+1, \ldots, J$ approximates a sample from the joint posterior distribution $f(\phi, \beta^*, h^*, \kappa, M \mid D)$. Details on the exact implementation of each of the blocks can be found in Section A3. The results reported in the main paper are based on 15,000 iterations, with the first 5,000 draws discarded as a burn-in sequence. A schematic sketch of one sweep of the sampler, including the sign switch of step 4, is given below.
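In the following minimal sketch, the `blocks` callables stand in for the samplers detailed in Section A3, and the state-dictionary keys are placeholders of this sketch rather than names from the paper; only the sign switch is implemented concretely.

```python
# Schematic Gibbs sweep with the random sign switch of step 4.
import numpy as np

def sign_switch(sigma, states, rng):
    """Step 4: with probability 0.5 replace (sigma, {states_t}) by (-sigma, {-states_t});
    the likelihood is invariant to this switch, so it only mixes the sign."""
    return (-sigma, -states) if rng.random() < 0.5 else (sigma, states)

def run_gibbs(blocks, pairs, init, n_iter=15_000, burn_in=5_000, seed=0):
    """Iterate the Gibbs blocks J = n_iter times and keep draws B+1,...,J (B = burn_in).
    blocks: callables for blocks 1a-1c, 2 and 3a-3c, applied in that order.
    pairs: (scale_key, state_key) tuples for the sign switches of step 4."""
    rng = np.random.default_rng(seed)
    state, draws = dict(init), []
    for j in range(n_iter):
        for block in blocks:
            state = block(state, rng)
        for scale_key, state_key in pairs:
            state[scale_key], state[state_key] = sign_switch(
                state[scale_key], state[state_key], rng)
        if j >= burn_in:
            draws.append(dict(state))
    return draws
```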

A3 Details on the blocks in the Gibbs sampling algorithm

In this section we provide details on the exact implementation of the Gibbs sampling algorithm outlined in Section A2 to jointly sample the binary indicators $M$, the constant parameters $\phi$, the time-varying parameters $\beta^*$, the stochastic volatilities $h^*$ and the mixture indicators $\kappa$. For notational convenience, let us first define a general regression model

$$y = x^m b^m + \theta(L)e, \qquad e \sim N(0, \Sigma), \tag{A-5}$$

where $y = (y_1, \ldots, y_T)'$ is a $T \times 1$ vector stacking observations on the dependent variable $y_t$, $x$ is a $T \times k$ unrestricted predictor matrix with typical row $x_t$, $b$ is a $k \times 1$ unrestricted parameter vector, $\theta(L)$ is a lag polynomial of order $q$, and $\Sigma$ is a diagonal matrix with elements $\sigma_{et}^2$ on the diagonal that may vary over time to allow for heteroskedasticity of a known form. The restricted predictor matrix $x^m$ and restricted parameter vector $b^m$ exclude those elements in $x$ and $b$ for which the corresponding binary indicator in $m$ is zero, with $m$ being a subset of the model $M$.

The MA(q) errors in equation (A-5) imply a model that is non-linear in the parameters. As suggested by Ullah et al. (1986) and Chib and Greenberg (1994), conditional on $\theta$ a linear model can be obtained from a recursive transformation of the data. For $t = 1, \ldots, T$ let

$$\tilde{y}_t = y_t - \sum_{i=1}^{q} \theta_i \tilde{y}_{t-i}, \quad \text{with } \tilde{y}_t = 0 \text{ for } t \leq 0, \tag{A-6}$$

$$\tilde{x}_t = x_t - \sum_{i=1}^{q} \theta_i \tilde{x}_{t-i}, \quad \text{with } \tilde{x}_t = 0 \text{ for } t \leq 0, \tag{A-7}$$

and further, for $j = 1, \ldots, q$,

$$\omega_{jt} = -\sum_{i=1}^{q} \theta_i \omega_{j,t-i} + \theta_{t+j-1}, \quad \text{with } \omega_{jt} = 0 \text{ for } t \leq 0, \tag{A-8}$$

where $\theta_s = 0$ for $s > q$. Equation (A-5) can then be transformed as

$$\tilde{y} = \tilde{x}^m b^m + \omega\lambda + e = w^m \Phi^m + e, \tag{A-9}$$

where $w = (\tilde{x}, \omega)$ with $\omega = (\omega_1', \ldots, \omega_T')'$ and $\omega_t = (\omega_{1t}, \ldots, \omega_{qt})$, $\Phi^m = (b^{m\prime}, \lambda')'$ and $\lambda = (e_0, \ldots, e_{-q+1})'$ initial conditions that can be estimated as unknown parameters. Conditional on $\theta$ and $\sigma_{et}^2$, equation (A-9) is a standard linear regression with observed variables $\tilde{y}$ and $w^m$ and heteroskedastic normal errors with known covariance matrix $\Sigma$.
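A minimal sketch of the recursions (A-6)-(A-8): conditional on $\theta$, it returns the transformed data $\tilde{y}$, $\tilde{x}$ and the $T \times q$ matrix $\omega$ that multiplies the initial conditions $\lambda$ in (A-9). Indexing is zero-based, so $\theta_{t+j-1}$ becomes `theta[t + j]`; the function name is illustrative, not from the paper.

```python
import numpy as np

def ma_transform(y, x, theta):
    """Recursive transformation (A-6)-(A-8) of the MA(q) model (A-5),
    conditional on theta; returns y_tilde, x_tilde and omega."""
    T, k = x.shape
    q = len(theta)
    y_t, x_t, om = np.zeros(T), np.zeros((T, k)), np.zeros((T, q))
    for t in range(T):
        lags = [i for i in range(q) if t - 1 - i >= 0]
        # (A-6)-(A-7): y~_t = y_t - sum_i theta_i y~_{t-i}, same for each column of x
        y_t[t] = y[t] - sum(theta[i] * y_t[t - 1 - i] for i in lags)
        x_t[t] = x[t] - sum(theta[i] * x_t[t - 1 - i] for i in lags)
        for j in range(q):
            # (A-8): omega_{jt} = -sum_i theta_i omega_{j,t-i} + theta_{t+j-1},
            # with theta_s = 0 for s > q (zero-based: theta[t + j])
            om[t, j] = -sum(theta[i] * om[t - 1 - i, j] for i in lags) \
                       + (theta[t + j] if t + j < q else 0.0)
    return y_t, x_t, om
```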

A naive implementation of the Gibbs sampler would be to first sample the binary indicators in $m$ from $f(m \mid \Phi, \Sigma, \tilde{y}, w)$ and next $\Phi^m$ from $f(\Phi^m \mid m, \Sigma, \tilde{y}, w)$. However, this approach does not result in an irreducible Markov chain: whenever an indicator in $m$ equals zero, the corresponding coefficient in $\Phi$ is also zero, which implies that the chain has absorbing states. Therefore, as in Frühwirth-Schnatter and Wagner (2010), we marginalize over the parameters $\Phi$ when sampling $m$ and next draw $\Phi^m$ conditional on the binary indicators in $m$. The posterior distribution $f(m \mid \Sigma, \tilde{y}, w)$ can be obtained using Bayes' theorem as

$$f(m \mid \Sigma, \tilde{y}, w) \propto f(\tilde{y} \mid m, \Sigma, w)\, p(m), \tag{A-10}$$

with $p(m)$ the prior probability of $m$ and $f(\tilde{y} \mid m, \Sigma, w)$ the marginal likelihood of the regression model (A-9) in which the effect of the parameters $\Phi$ has been integrated out. Under the normal conjugate prior

$$p(\Phi^m) = f_N\left(\Phi^m \mid \Phi_0^m, V_0^m\right), \tag{A-11}$$

the closed form solution of the marginal likelihood is given by

$$f(\tilde{y} \mid m, \Sigma, w) \propto |\Sigma|^{-0.5}\,\frac{|V_T^m|^{0.5}}{|V_0^m|^{0.5}}\,\exp\left\{-\frac{1}{2}\left(\tilde{y}'\Sigma^{-1}\tilde{y} + \Phi_0^{m\prime}\left(V_0^m\right)^{-1}\Phi_0^m - \Phi_T^{m\prime}\left(V_T^m\right)^{-1}\Phi_T^m\right)\right\}, \tag{A-12}$$

where the posterior moments $\Phi_T^m$ and $V_T^m$ can be calculated as

$$\Phi_T^m = V_T^m\left(w^{m\prime}\Sigma^{-1}\tilde{y} + \left(V_0^m\right)^{-1}\Phi_0^m\right), \tag{A-13}$$

$$V_T^m = \left(w^{m\prime}\Sigma^{-1}w^m + \left(V_0^m\right)^{-1}\right)^{-1}. \tag{A-14}$$

Following George and McCulloch (1993), we use a single-move sampler in which the binary indicators $\iota_k$ in $m$ are sampled recursively from the Bernoulli distribution with probability

$$p(\iota_k = 1 \mid \iota_{-k}, \Sigma, \tilde{y}, w) = \frac{f(\iota_k = 1 \mid \iota_{-k}, \Sigma, \tilde{y}, w)}{f(\iota_k = 0 \mid \iota_{-k}, \Sigma, \tilde{y}, w) + f(\iota_k = 1 \mid \iota_{-k}, \Sigma, \tilde{y}, w)}, \tag{A-15}$$

for $k$ depending on the particular subset $m$ of $M$. We further randomize over the sequence in which the binary indicators are drawn. Conditional on $m$, the posterior distribution of $\Phi^m$ is given by

$$p(\Phi^m \mid m, \Sigma, \tilde{y}, w) = f_N\left(\Phi_T^m, V_T^m\right), \tag{A-16}$$

with posterior moments $\Phi_T^m$ and $V_T^m$ given in equations (A-13)-(A-14).
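A minimal sketch of this variable-selection machinery: the posterior moments (A-13)-(A-14), the log marginal likelihood kernel of (A-12), and a single-move Bernoulli draw in the spirit of (A-15), here parameterized by a prior inclusion probability; all names are this sketch's own.

```python
import numpy as np

def posterior_moments(y_t, w, Sigma_diag, Phi0, V0):
    """Phi_T and V_T of (A-13)-(A-14) for known diagonal Sigma."""
    W = w / Sigma_diag[:, None]                            # Sigma^{-1} w
    V_T = np.linalg.inv(w.T @ W + np.linalg.inv(V0))       # (A-14)
    Phi_T = V_T @ (W.T @ y_t + np.linalg.solve(V0, Phi0))  # (A-13)
    return Phi_T, V_T

def log_marginal_likelihood(y_t, w, Sigma_diag, Phi0, V0):
    """Log of the marginal likelihood kernel in (A-12)."""
    Phi_T, V_T = posterior_moments(y_t, w, Sigma_diag, Phi0, V0)
    quad = (y_t**2 / Sigma_diag).sum() \
         + Phi0 @ np.linalg.solve(V0, Phi0) - Phi_T @ np.linalg.solve(V_T, Phi_T)
    return -0.5 * (np.log(Sigma_diag).sum()
                   + np.linalg.slogdet(V0)[1] - np.linalg.slogdet(V_T)[1] + quad)

def draw_indicator(logml_with, logml_without, prior_incl, rng):
    """Single-move Bernoulli draw of one indicator as in (A-15), with prior odds."""
    log_odds = (logml_with + np.log(prior_incl)) - (logml_without + np.log1p(-prior_incl))
    return rng.random() < 1.0 / (1.0 + np.exp(-log_odds))
```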

Block 1: Sampling $M_\theta$, $M_{\phi_\beta}$ and $\delta, \theta, \phi_\beta$

Block 1a: Sampling the first step parameters $\delta$

Conditional on the stochastic volatility component $h_{\nu t}$, equation (14) can be written in the general notation of equation (A-5) as: $y_t = \ln Y_t$, $x_t^m = x_t = Z_t$, $b^m = b = \delta$ and $\theta(L) = 1$, such that $\theta(L)e_t = \nu_t$ and the elements in $\Sigma$ are given by $\sigma_{et}^2 = e^{h_{\nu t}}$. In the notation of equation (A-9) we further have $\tilde{y}_t = y_t$, $w_t^m = w_t = x_t$ and $\Phi^m = \Phi = \delta$. The parameters in $\delta$ can then be sampled from the posterior distribution in equation (A-16) and used to calculate $E_{t-1}\ln Y_t = Z_t\delta$, $\nu_t = \ln Y_t - Z_t\delta$ and

$$\nu_t^* = \frac{\theta(L)\,\nu_t\, e^{h_{\mu t}/2}}{\sqrt{1-\rho^2}\, e^{h_{\nu t}/2}},$$

conditional on $\theta$, $\rho$, $h_{\nu t}$ and $h_{\mu t}$.

Block 1b: Sampling the binary indicators $M_\theta$ and the MA coefficients $\theta$

Conditional on the parameters $\delta$, $\phi_h$ and $\phi_\beta$, on the time-varying coefficients $\beta_t^*$, on the stochastic volatility $h_{\mu t}$ and on the binary indicators $M_{\phi_\beta}$, equation (16) can be written in the general notation of equation (A-5) as: $y_t = \ln C_t$, $x_t = (1, Z_t\delta, \beta_{0t}^*, \beta_{1t}^* Z_t\delta, \ln C_{t-1})$, $b = (\beta_{00}, \beta_{10}, \sigma_{\eta_0}, \sigma_{\eta_1}, \gamma)'$ and $\theta(L)e_t = \theta(L)\varepsilon_t$, such that the elements in $\Sigma$ are given by $\sigma_{et}^2 = \sigma_{\varepsilon t}^2 = e^{h_{\mu t}}/(1-\rho^2)$. The values of the binary indicators in the subset $M_{\phi_\beta}$ of $M$ then imply the restricted $x_t^{M_{\phi_\beta}}$ and $b^{M_{\phi_\beta}}$. Under the normal conjugate prior $p(\theta) = f_N(\theta_0, V_0^\theta)$, the exact conditional distribution of $\theta$ is

$$p\left(\theta \mid \Phi^{M_{\phi_\beta}}, \Sigma, y, x\right) \propto \exp\left\{-\sum_{t=1}^{T}\frac{e_t(\theta)^2}{2\sigma_{et}^2}\right\}\exp\left\{-\frac{1}{2}\left(\theta - \theta_0\right)'\left(V_0^\theta\right)^{-1}\left(\theta - \theta_0\right)\right\}, \tag{A-17}$$

where $e_t(\theta) = \tilde{y}_t(\theta) - \tilde{w}_t^m(\theta)\Phi^{M_{\phi_\beta}}$ is calculated from the transformed model in equation (A-9), further conditioning on the initial conditions $\lambda$ to obtain $\Phi^{M_{\phi_\beta}} = (b^{M_{\phi_\beta}\prime}, \lambda')'$. Direct sampling of $\theta$ using equation (A-17) is not possible, though, as $e_t(\theta)$ is a non-linear function of $\theta$. To solve this issue, Chib and Greenberg (1994) propose to linearize $e_t(\theta)$ around $\hat{\theta}$ using a first-order Taylor expansion

$$e_t(\theta) \approx e_t(\hat{\theta}) - \Psi_t\left(\theta - \hat{\theta}\right), \tag{A-18}$$

where $\Psi_t = (\Psi_{1t}, \ldots, \Psi_{qt})$ is a $1 \times q$ vector containing the first-order derivatives of $-e_t(\theta)$ evaluated at $\hat{\theta}$, obtained using the following recursion:

$$\Psi_{it} = e_{t-i}(\hat{\theta}) - \sum_{j=1}^{q}\hat{\theta}_j\Psi_{i,t-j}, \tag{A-19}$$

where $\Psi_{it} = 0$ for $t \leq 0$. An adequate approximation can be obtained by choosing $\hat{\theta}$ to be the non-linear least squares estimate of $\theta$ conditional on the other parameters in the model, which can be obtained as

$$\hat{\theta} = \operatorname*{argmin}_{\theta}\sum_{t=1}^{T} e_t(\theta)^2. \tag{A-20}$$
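A minimal sketch of the ingredients of this linearization: the residual recursion defining $e_t(\theta)$, the derivative recursion (A-19), and the conditional non-linear least squares problem (A-20), the latter handed to `scipy.optimize.least_squares` rather than a hand-rolled optimizer. The treatment of the initial conditions $\lambda$ and the heteroskedastic weighting is omitted for brevity, and the function names are this sketch's own.

```python
import numpy as np
from scipy.optimize import least_squares

def ma_residuals(theta, y, X, b):
    """e_t(theta) from theta(L) e_t = y_t - x_t b, with e_t = 0 for t <= 0."""
    u = y - X @ b
    e = np.zeros_like(y)
    q = len(theta)
    for t in range(len(y)):
        e[t] = u[t] - sum(theta[i] * e[t - 1 - i] for i in range(q) if t - 1 - i >= 0)
    return e

def psi_recursion(theta_hat, e_hat):
    """Psi_{it} = e_{t-i}(theta_hat) - sum_j theta_hat_j Psi_{i,t-j} (A-19)."""
    T, q = len(e_hat), len(theta_hat)
    Psi = np.zeros((T, q))
    for t in range(T):
        for i in range(q):
            lead = e_hat[t - 1 - i] if t - 1 - i >= 0 else 0.0
            Psi[t, i] = lead - sum(theta_hat[j] * Psi[t - 1 - j, i]
                                   for j in range(q) if t - 1 - j >= 0)
    return Psi

def nls_theta(theta_init, y, X, b):
    """theta_hat = argmin_theta sum_t e_t(theta)^2 (A-20)."""
    return least_squares(ma_residuals, theta_init, args=(y, X, b)).x
```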

Conditional on $\hat{\theta}$, equation (A-18) can be rewritten as an approximate linear regression model

$$e_t(\hat{\theta}) + \Psi_t\hat{\theta} \approx \Psi_t\theta + e_t(\theta), \tag{A-21}$$

with dependent variable $e_t(\hat{\theta}) + \Psi_t\hat{\theta}$ and explanatory variables $\Psi_t$. As such, the approximate expression in (A-21) can be written in the general notation of equation (A-5) by setting $y_t = e_t(\hat{\theta}) + \Psi_t\hat{\theta}$, $x_t = \Psi_t$, $b = \theta$ and $\theta(L) = 1$, such that $\theta(L)e_t = \varepsilon_t$ and $\sigma_{et}^2 = \sigma_{\varepsilon t}^2 = e^{h_{\mu t}}/(1-\rho^2)$. The values of the binary indicators in the subset $M_\theta$ of $M$ imply the restricted $x_t^{M_\theta}$ and $b^{M_\theta}$. In the notation of equation (A-9) we further have $\tilde{y}_t = y_t$, $w_t^m = x_t^{M_\theta}$ and $\Phi^m = b^{M_\theta}$.

The binary indicators in $M_\theta$ can then be sampled from the posterior distribution in equation (A-10), with the marginal likelihood calculated from equation (A-12). Next, we sample $\theta^{M_\theta}$ using a Metropolis-Hastings (MH) step. Suppose $\theta^{M_\theta}(i)$ is the current draw in the Markov chain. To obtain the next draw $\theta^{M_\theta}(i+1)$, first draw a candidate $\theta_c^{M_\theta}$ from the normal proposal distribution

$$q\left(\theta^{M_\theta} \mid \hat{\theta}, \Sigma, y, x\right) = f_N\left(\Phi_T^{M_\theta}, V_T^{M_\theta}\right), \tag{A-22}$$

with posterior moments $\Phi_T^{M_\theta}$ and $V_T^{M_\theta}$ calculated using equations (A-13)-(A-14). The MH step then implies a further randomization, which amounts to accepting the candidate draw $\theta_c^{M_\theta}$ with probability

$$\alpha\left(\theta^{M_\theta}(i), \theta_c^{M_\theta}\right) = \min\left\{\frac{p\left(\theta_c^{M_\theta} \mid \Phi^{M_{\phi_\beta}}, \Sigma, y, x\right)\, q\left(\theta^{M_\theta}(i) \mid \hat{\theta}, \Sigma, y, x\right)}{p\left(\theta^{M_\theta}(i) \mid \Phi^{M_{\phi_\beta}}, \Sigma, y, x\right)\, q\left(\theta_c^{M_\theta} \mid \hat{\theta}, \Sigma, y, x\right)}, 1\right\}. \tag{A-23}$$

If $\theta_c^{M_\theta}$ is accepted, $\theta^{M_\theta}(i+1)$ is set equal to $\theta_c^{M_\theta}$, while if $\theta_c^{M_\theta}$ is rejected, $\theta^{M_\theta}(i+1)$ is set equal to $\theta^{M_\theta}(i)$. Note that the unrestricted $\theta$ is restricted to obtain $\theta^{M_\theta}$ by excluding $\theta_l$ when $\iota_{\theta_l} = 0$. In this case $\theta_l$ is not sampled but set equal to zero.
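A minimal sketch of the resulting MH update: a candidate from the proposal (A-22) is accepted with probability (A-23). Here `log_target` evaluates the log of the exact conditional posterior kernel (A-17) and is supplied by the caller; function and argument names are this sketch's, not the paper's.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mh_step_theta(theta_cur, Phi_T, V_T, log_target, rng):
    """One MH update for theta: propose from N(Phi_T, V_T) (A-22) and accept
    with probability alpha of (A-23), evaluated on the log scale."""
    prop = multivariate_normal(mean=Phi_T, cov=V_T)
    theta_c = np.atleast_1d(prop.rvs(random_state=rng))
    log_alpha = (log_target(theta_c) + prop.logpdf(theta_cur)) \
              - (log_target(theta_cur) + prop.logpdf(theta_c))
    if np.log(rng.random()) < min(0.0, log_alpha):
        return theta_c       # accept the candidate
    return theta_cur         # reject: keep the current draw
```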

Block 1c: Sampling the binary indicators $M_{\phi_\beta}$ and the second step parameters $\phi_\beta$

Conditional on $\beta_t^*$, $\delta$ and $\nu_t^*$, equation (26) can be written in the general notation of equation (A-5) by setting $y_t = \ln C_t$, $x_t = (1, Z_t\delta, \beta_{0t}^*, \beta_{1t}^* Z_t\delta, \ln C_{t-1}, \nu_t^*)$, $b = (\beta_{00}, \beta_{10}, \sigma_{\eta_0}, \sigma_{\eta_1}, \gamma, \rho)'$ and $\theta(L)e_t = \theta(L)\mu_t$, such that the elements in $\Sigma$ are given by $\sigma_{et}^2 = e^{h_{\mu t}}$. Further conditioning on the MA parameters $\theta$, the unrestricted transformed variables $\tilde{y}_t$ and $w_t$ in equation (A-9) are obtained, with corresponding unrestricted extended parameter vector $\Phi = (b', \lambda')'$. The values of the binary indicators in $M_{\phi_\beta}$ then imply the restricted $w_t^{M_{\phi_\beta}}$ and $\Phi^{M_{\phi_\beta}}$. First, the binary indicators in $M_{\phi_\beta}$ are sampled from the posterior distribution in equation (A-10), with the marginal likelihood calculated from equation (A-12). Second, conditional on $M_{\phi_\beta}$, $\phi_\beta$ is sampled, together with $\lambda$, by drawing $\Phi^{M_{\phi_\beta}}$ from the general expression in equation (A-16). Note that the unrestricted $\Phi = (\beta_{00}, \beta_{10}, \sigma_{\eta_0}, \sigma_{\eta_1}, \gamma, \rho, \lambda')'$ is restricted to obtain $\Phi^{M_{\phi_\beta}}$ by excluding those coefficients for which the corresponding binary indicator in $M_{\phi_\beta}$ is zero. These restricted coefficients are not sampled but set equal to zero.

Block 2: Sampling $\beta^*$

In this block we use the forward-filtering backward-sampling approach of Carter and Kohn (1994) and De Jong and Shephard (1995) to sample the time-varying parameters $\beta^*$. Conditional on the coefficients $\phi_\beta$, on the stochastic volatility $h_{\mu t}$, on the first block results $Z_t\delta$ and $\nu_t^*$, and on the binary indicators $M_{\phi_\beta}$, equation (26) can be rewritten as

$$y_t^* = \iota_{\beta_{0t}}\sigma_{\eta_0}\beta_{0t}^* + \iota_{\beta_{1t}}\sigma_{\eta_1}\beta_{1t}^* x_{1t} + \theta(L)\mu_t, \tag{A-24}$$

with $y_t^* = \ln C_t - \beta_{00} - \beta_{10}Z_t\delta - \gamma\ln C_{t-1} - \rho\nu_t^*$ and $x_{1t} = Z_t\delta$. Using the recursive transformation suggested by Ullah et al. (1986) and Chib and Greenberg (1994), the model in equation (A-24) can be transformed to

$$\tilde{y}_t = \iota_{\beta_{0t}}\sigma_{\eta_0}\tilde{\beta}_{0t} + \iota_{\beta_{1t}}\sigma_{\eta_1}\tilde{\beta}_{1t} + \omega_t\lambda + \mu_t, \tag{A-25}$$

where $\tilde{y}_t$ and $\omega_t = (\omega_{1t}, \ldots, \omega_{qt})$ are calculated conditional on $\theta$ from equations (A-6) and (A-8), and similarly

$$\tilde{\beta}_{0t} = \beta_{0t}^* - \sum_{i=1}^{q}\theta_i\tilde{\beta}_{0,t-i}, \quad \text{with } \tilde{\beta}_{0t} = 0 \text{ for } t \leq 0, \tag{A-26}$$

$$\tilde{\beta}_{1t} = \beta_{1t}^* x_{1t} - \sum_{i=1}^{q}\theta_i\tilde{\beta}_{1,t-i}, \quad \text{with } \tilde{\beta}_{1t} = 0 \text{ for } t \leq 0. \tag{A-27}$$

Substituting equation (23) in (A-26)-(A-27) yields

$$\tilde{\beta}_{0,t+1} = \beta_{0t}^* - \sum_{i=1}^{q}\theta_i\tilde{\beta}_{0,t+1-i} + \eta_{0t}, \tag{A-28}$$

$$\tilde{\beta}_{1,t+1} = \beta_{1t}^* x_{1,t+1} - \sum_{i=1}^{q}\theta_i\tilde{\beta}_{1,t+1-i} + x_{1,t+1}\eta_{1t}, \tag{A-29}$$

such that the state space representation of the model in equations (A-25), (23) and (A-28)-(A-29) is given by

$$\tilde{y}_t - \omega_t\lambda = \underbrace{\begin{bmatrix} 0 & \sigma_{\eta_0} & 0 & \cdots & 0 & 0 & \sigma_{\eta_1} & 0 & \cdots & 0 \end{bmatrix}}_{\sigma_\eta}\underbrace{\begin{bmatrix} \alpha_{0t} \\ \alpha_{1t} \end{bmatrix}}_{\alpha_t} + \mu_t, \tag{A-30}$$

$$\underbrace{\begin{bmatrix} \alpha_{0,t+1} \\ \alpha_{1,t+1} \end{bmatrix}}_{\alpha_{t+1}} = \underbrace{\begin{bmatrix} T_{0t} & 0 \\ 0 & T_{1t} \end{bmatrix}}_{T_t}\underbrace{\begin{bmatrix} \alpha_{0t} \\ \alpha_{1t} \end{bmatrix}}_{\alpha_t} + \underbrace{\begin{bmatrix} K_{0t} & 0 \\ 0 & K_{1t} \end{bmatrix}}_{K_t}\underbrace{\begin{bmatrix} \eta_{0t} \\ \eta_{1t} \end{bmatrix}}_{\eta_t}, \tag{A-31}$$

with $\alpha_{i,t+1}$ given by

$$\underbrace{\begin{bmatrix} \beta_{i,t+1}^* \\ \tilde{\beta}_{i,t+1} \\ \tilde{\beta}_{it} \\ \vdots \\ \tilde{\beta}_{i,t-q+2} \end{bmatrix}}_{\alpha_{i,t+1}} = \underbrace{\begin{bmatrix} 1 & 0 & \cdots & 0 & 0 \\ x_{i,t+1} & -\theta_1 & \cdots & -\theta_{q-1} & -\theta_q \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}}_{T_{it}}\underbrace{\begin{bmatrix} \beta_{it}^* \\ \tilde{\beta}_{it} \\ \tilde{\beta}_{i,t-1} \\ \vdots \\ \tilde{\beta}_{i,t-q+1} \end{bmatrix}}_{\alpha_{it}} + \underbrace{\begin{bmatrix} 1 \\ x_{i,t+1} \\ 0 \\ \vdots \\ 0 \end{bmatrix}}_{K_{it}}\eta_{it}, \tag{A-32}$$

for $i = 0, 1$, where $x_{0t} = 1$ for all $t$, $\mu_t \sim N(0, e^{h_{\mu t}})$ and $\eta_{it} \sim N(0, 1)$. In line with equations (23) and (A-26)-(A-27), each of the states in $\alpha_t$ is initialized at zero. Equations (A-30)-(A-31) constitute a standard linear Gaussian state space model, from which the unknown state variables $\alpha_t$ can be filtered using the standard Kalman filter. Sampling $\alpha_t$ from its conditional distribution can then be done using the multimove simulation smoother of Carter and Kohn (1994) and De Jong and Shephard (1995), sketched below. Using $\beta_{i0}$, $\sigma_{\eta_i}$ and $\beta_{it}^*$, the centered time-varying coefficients $\beta_{it}$ in equation (20) can easily be reconstructed from equation (22). Note that in a restricted model with $\iota_{\beta_{it}} = 0$, $\sigma_{\eta_i}$ is excluded from $\sigma_\eta$ and $\alpha_{it}$ is dropped from the state vector $\alpha_t$. In this case, no forward-filtering and backward-sampling for $\beta_{it}^*$ is needed, as it can be sampled directly from its prior distribution using equation (23).
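A minimal sketch of forward-filtering backward-sampling for a generic linear Gaussian state space model $y_t = Z_t\alpha_t + \varepsilon_t$, $\alpha_{t+1} = T_t\alpha_t + \eta_t$ with scalar observations, in the spirit of Carter and Kohn (1994). It allows a possibly singular state covariance, as implied by the deterministic rows in (A-32); all names and dimensions are this sketch's own rather than the paper's.

```python
import numpy as np

def ffbs(y, Z, T, Q, h, a0, P0, rng):
    """Draw alpha_{1:n} for y_t = Z_t alpha_t + eps_t, eps_t ~ N(0, h_t), and
    alpha_{t+1} = T_t alpha_t + eta_t, eta_t ~ N(0, Q_t).
    y: (n,), Z: (n,m), T,Q: (n,m,m), h: (n,), a0,P0: prior mean/variance of alpha_1."""
    n, m = Z.shape
    a_f, P_f = np.zeros((n, m)), np.zeros((n, m, m))
    a, P = a0.copy(), P0.copy()
    for t in range(n):                                # Kalman filter (forward pass)
        F = Z[t] @ P @ Z[t] + h[t]                    # prediction-error variance
        K = P @ Z[t] / F                              # Kalman gain
        a_f[t] = a + K * (y[t] - Z[t] @ a)            # filtered mean
        P_f[t] = P - np.outer(K, Z[t] @ P)            # filtered variance
        a, P = T[t] @ a_f[t], T[t] @ P_f[t] @ T[t].T + Q[t]
    alpha = np.zeros((n, m))                          # backward sampling
    alpha[-1] = rng.multivariate_normal(a_f[-1], P_f[-1], check_valid="ignore")
    for t in range(n - 2, -1, -1):
        Pp = T[t] @ P_f[t] @ T[t].T + Q[t]            # var(alpha_{t+1} | y_{1:t})
        G = P_f[t] @ T[t].T @ np.linalg.pinv(Pp)      # conditioning gain
        mean = a_f[t] + G @ (alpha[t + 1] - T[t] @ a_f[t])
        cov = P_f[t] - G @ T[t] @ P_f[t]
        alpha[t] = rng.multivariate_normal(mean, cov, check_valid="ignore")
    return alpha
```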

Block 3: Sampling $\kappa$, $M_{\phi_h}$, $\phi_h$ and $h^*$

Block 3a: Sampling the mixture indicators $\kappa$

Conditional on $h_{jt}$ and on $\epsilon_{jt}$, calculated from equation (A-2) as $\epsilon_{jt} = g_{jt} - h_{jt}$, we first sample the mixture indicators $\kappa_{jt}$ from the conditional probability mass function

$$p(\kappa_{jt} = n \mid h_{jt}, \epsilon_{jt}) \propto q_n\, f_N\!\left(\epsilon_{jt} \mid m_n - 1.2704, s_n^2\right), \tag{A-33}$$

for $j = \nu, \mu$.

Block 3b: Sampling the binary indicators $M_{\phi_h}$ and the constant parameters $\phi_h$

Conditional on $h_{jt}^*$, $g_{jt}$ and $\kappa_{jt}$, and using the non-centered parameterization in equation (27), equation (A-2) can be rewritten in the general linear regression format of (A-5) by setting $y_t = g_{jt} - (m_{\kappa_{jt}} - 1.2704)$, $x_t = (1, h_{jt}^*)$, $b = (h_{j0}, \sigma_{\upsilon_j})'$ and $\theta(L)e_t = \epsilon_{jt}^* = \epsilon_{jt} - (m_{\kappa_{jt}} - 1.2704)$, for $j = \nu, \mu$. Given the mixture distribution of $\epsilon_{jt}$ defined in equation (A-4), the centered error term $\epsilon_{jt}^*$ has a heteroskedastic variance $s_{\kappa_{jt}}^2$, such that $\Sigma = \mathrm{diag}(s_{\kappa_{j1}}^2, \ldots, s_{\kappa_{jT}}^2)$. In the notation of equation (A-9) we further have $\tilde{y}_t = y_t$ and the unrestricted $w_t = x_t$ and $\Phi = b$. The values of the binary indicators in $M_{\phi_h}$ then imply the restricted $w_t^{M_{\phi_h}}$ and $\Phi^{M_{\phi_h}}$. Hence, the binary indicators $\iota_{h_j}$, for $j = \nu, \mu$, in $M_{\phi_h}$ can be sampled from the posterior distribution in equation (A-10), with the marginal likelihood calculated from equation (A-12). Next, conditional on $M_{\phi_h}$, the parameters $(h_{j0}, \sigma_{\upsilon_j})$, for $j = \nu, \mu$, in $\phi_h$ are sampled by drawing $\Phi^{M_{\phi_h}}$ using the general expression in equation (A-16). Note that the unrestricted $\Phi = (h_{j0}, \sigma_{\upsilon_j})'$ is restricted to obtain $\Phi^{M_{\phi_h}}$ by excluding $\sigma_{\upsilon_j}$ when $\iota_{h_j} = 0$. In this case $\sigma_{\upsilon_j}$ is not sampled but set equal to zero.

Block 3c: Sampling the stochastic volatilities $h^*$

Conditional on the transformed errors $g_{jt}$ defined below equation (A-2), on the mixture indicators $\kappa_{jt}$ and on the parameters $\phi_h$, equations (A-2) and (24)-(25) can be written in the following conditional state space representation:

$$g_{jt} - (m_{\kappa_{jt}} - 1.2704) = \begin{bmatrix} h_{j0} & \sigma_{\upsilon_j} \end{bmatrix}\begin{bmatrix} 1 \\ h_{jt}^* \end{bmatrix} + \epsilon_{jt}^*, \tag{A-34}$$

$$\begin{bmatrix} 1 \\ h_{j,t+1}^* \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ h_{jt}^* \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix}\upsilon_{jt}, \tag{A-35}$$

for $j = \nu, \mu$, where $\epsilon_{jt}^* = \epsilon_{jt} - (m_{\kappa_{jt}} - 1.2704)$ is $\epsilon_{jt}$ centered around zero with $\mathrm{var}(\epsilon_{jt}^*) = s_{\kappa_{jt}}^2$, and $\mathrm{var}(\upsilon_{jt}) = 1$. In line with equation (25), the random walk component $h_{jt}^*$ is initialized at zero. As equations (A-34)-(A-35) constitute a standard linear Gaussian state space model, $h_{jt}^*$ can be filtered using the Kalman filter and sampled using the multimove simulation smoother of Carter and Kohn (1994) and De Jong and Shephard (1995); a specialized scalar version is sketched below. Using $h_{j0}$, $\sigma_{\upsilon_j}$ and $h_{jt}^*$, the centered stochastic volatility component $h_{jt}$ in equation (21) can easily be reconstructed from equation (24). Note that in a restricted model with $\iota_{h_j} = 0$, and hence $\sigma_{\upsilon_j} = 0$, no forward-filtering and backward-sampling for $h_{jt}^*$ is needed, as it can be sampled directly from its prior distribution in equation (25).
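A minimal sketch of this step for one volatility component: scalar Kalman recursions for the random-walk state $h_{jt}^*$ in (A-34)-(A-35) followed by backward sampling, with the state initialized at zero as in the text. The observation passed in is assumed to be already centered, $y_t = g_{jt} - (m_{\kappa_{jt}} - 1.2704) - h_{j0}$; names are this sketch's own.

```python
import numpy as np

def sample_h_star(y, s2, sigma, rng):
    """Draw h*_{1:n} for y_t = sigma * h*_t + e_t, e_t ~ N(0, s2_t), with
    h*_{t+1} = h*_t + v_t, v_t ~ N(0, 1) and h*_1 initialized at zero."""
    n = len(y)
    m_f, p_f = np.zeros(n), np.zeros(n)
    m, p = 0.0, 0.0                  # h*_1 = 0 (equation 25): zero prior variance
    for t in range(n):               # scalar Kalman filter
        if t > 0:
            p += 1.0                 # random-walk prediction, var(v_t) = 1
        F = sigma**2 * p + s2[t]     # prediction-error variance
        K = sigma * p / F            # Kalman gain
        m_f[t] = m + K * (y[t] - sigma * m)
        p_f[t] = p * (1.0 - K * sigma)
        m, p = m_f[t], p_f[t]
    h = np.zeros(n)                  # backward sampling
    h[-1] = rng.normal(m_f[-1], np.sqrt(p_f[-1]))
    for t in range(n - 2, -1, -1):
        g = p_f[t] / (p_f[t] + 1.0)  # weight on the draw for h*_{t+1}
        mean = m_f[t] + g * (h[t + 1] - m_f[t])
        h[t] = rng.normal(mean, np.sqrt(p_f[t] * (1.0 - g)))
    return h
```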

References

Carter, C. and Kohn, R. (1994). On Gibbs sampling for state space models. Biometrika, 81:541-553.

Chib, S. and Greenberg, E. (1994). Bayes inference in regression models with ARMA(p,q) errors. Journal of Econometrics, 64:183-206.

De Jong, P. and Shephard, N. (1995). The simulation smoother for time series models. Biometrika, 82:339-350.

Del Negro, M. and Primiceri, G. (2015). Time varying structural vector autoregressions and monetary policy: a corrigendum. The Review of Economic Studies, 82:1342-1345.

Frühwirth-Schnatter, S. and Wagner, H. (2010). Stochastic model specification search for Gaussian and partial non-Gaussian state space models. Journal of Econometrics, 154(1):85-100.

George, E. and McCulloch, R. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association, 88:881-889.

Kim, S., Shephard, N., and Chib, S. (1998). Stochastic volatility: likelihood inference and comparison with ARCH models. Review of Economic Studies, 65(3):361-393.

Ullah, A., Vinod, H., and Singh, R. (1986). Estimation of linear models with moving average disturbances. Journal of Quantitative Economics, 2:137-152.