The Role of Aggregation in the Nonlinear Relationship between Monetary Policy and Output

Luiggi Donayre
Department of Economics
Washington University in St. Louis
August 2010

Abstract

Within a Bayesian framework, this paper studies the roles of aggregation and firm heterogeneity in the asymmetric response of output to monetary policy shocks of different magnitudes. To the extent that these potential roles imply a smooth change in regimes, a threshold autoregressive (TAR) process and a smooth transition autoregressive (STAR) process are compared within an unobserved components model of output, augmented with a monetary policy variable. The Bayesian model comparison favors the notion that the nonlinear dynamics are better described by a smooth transition between regimes, which suggests that aggregation and firm heterogeneity play a role in understanding whether the effects of monetary policy on output vary disproportionately with the size of the monetary shock.

JEL Classification Codes: C15, C22

Keywords: Bayesian Analysis, Asymmetry, Monetary Policy, Smooth Transition Autoregressive Process, Threshold Autoregressive Process, Unobserved Components Model, MCMC Methods.

I am thankful to James Morley for valuable suggestions and discussion. All remaining errors are my own.

1 Introduction

Several past studies have used nonlinear models to investigate whether the effects of monetary policy shocks on output vary disproportionately with the size of the shock (Weise, 1999, Ravn and Sola, 2004, Lo and Piger, 2005). Despite some mixed results regarding the significance of this asymmetry, there is evidence that, once the threshold that classifies monetary policy shocks by size is endogenized, output does respond asymmetrically to the size of the monetary policy shock (Donayre, 2010).

To motivate the distinction between small and large monetary policy shocks, these studies have related their results to the implications of theoretical models with menu costs (Ball and Romer, 1990, Ball and Mankiw, 1994, Golosov and Lucas, 2007). In such settings, only small monetary shocks have an effect on output, since keeping nominal prices fixed is associated with only a second-order cost. By contrast, because the menu cost becomes relatively small when monetary shocks are large, firms find it optimal to adjust their prices, leaving real output unchanged.

In the empirical literature, the results only partially support the implications of models with menu costs. Weise (1999) finds that while large and small shocks have different effects on output, the relative size of these two effects depends on the time horizon under consideration. Ravn and Sola (2004), on the other hand, find that small monetary policy shocks have disproportionately large effects on output when the monetary instrument is the Federal Funds rate (FFR). However, the evidence is not conclusive when M1 is used as the measure of monetary policy. More recently, the results in Donayre (2010) suggest that the response of output to large monetary policy shocks is neutral only after a few quarters. This paper argues that this difference between theory and data could be caused by aggregation.
At the disaggregated level, a particular monetary shock might be deemed small by some agents and large by others. That is, the aggregate response of output to monetary shocks is an average of the individual responses of economic agents and, as a consequence, that response is likely to capture the heterogeneity of behaviors. Furthermore, some firms may be able to react contemporaneously, while others might face constraints that only allow them to adjust their prices with a lag. Implicitly, the implications of theoretical models with menu costs rely on the assumption that all firms are homogeneous in the way they interpret monetary policy shocks and in the timing of their reactions. These assumptions, however, may not be consistent with the features of the data.

From a theoretical viewpoint, this aggregation hypothesis can be supported by Ss-type models (Caplin and Spulber, 1987, Caballero and Engel, 1991, Gertler and Leahy, 2006, Golosov and Lucas,

2007, Caballero and Engel, 2007, Caplin and Leahy, 2010). These models link infrequent price adjustment at the microeconomic level with aggregate price stickiness by means of fixed-cost inventory adjustments. Caplin and Spulber (1987) use an Ss model of microeconomic price adjustment to show that aggregation wipes out the effects of microeconomic stickiness. Ss-type dynamics can also imply, however, that large shocks could have real effects on output if firms do not adjust their prices optimally, making them fluctuate between the high state S and the low state s (Caballero and Engel, 2007).

If firms are homogeneous in the way they interpret economic conditions and can adjust their prices simultaneously, the dynamics of the model can be described by an abrupt change in regime. In particular, threshold autoregressive (TAR) processes capture this idea in an appealing way. [1] On the other hand, when aggregation plays a role in determining the effects of monetary policy on output conditional on the size of the monetary shocks, the dynamics are likely to be better captured by a model whose deterministic components permit a smooth rather than an abrupt adjustment between regimes. In this case, smooth transition autoregressive (STAR) processes provide a better description of the change in regimes.

To determine whether aggregation plays a role in explaining the different results found in the empirical literature, a formal comparison between TAR-driven and STAR-driven dynamics is needed. In this paper, a Bayesian approach is used to estimate and compare these TAR and STAR models. There are two main reasons for adopting a Bayesian approach. First, the estimation of threshold-type nonlinear models in a frequentist environment is cumbersome because it requires a grid-search procedure over the threshold and the parameters of the nonlinear transition function. In a Bayesian framework, all parameters can easily be estimated jointly.
Second, despite the fact that a TAR process is nested within a STAR model, the comparison between these two models is difficult in a frequentist environment, as it requires developing the limiting distribution of the likelihood ratio test statistic when the smoothing parameter approaches infinity. [2] By contrast, Bayesian model comparison using marginal likelihoods is conceptually straightforward for any set of models.

The approach taken in this paper is closely related to the work of Koop and Potter (1999, 2004), who argue in favor of a Bayesian approach to evaluating evidence of TAR-type nonlinearities in economic time series. It is also similar in motivation to the studies that find evidence of STAR-type dynamics in the relationship between monetary policy and output (Weise, 1999, Rothman, van Dijk, and Franses, 2001). However, it departs from the existing literature in two directions. First, it introduces the STAR process within an unobserved components (UC) framework. By considering this framework, the monetary policy shocks are assumed to affect only the transitory component of output, consistent with the notion of long-run money neutrality. Second, the paper estimates the UC model with a STAR-driven transitory component using Markov-chain Monte Carlo methods.

[1] Similarly, Markov-switching models can also capture this type of regime-switching. However, the state variable in this setting is unobserved and, as a consequence, the threshold that triggers the change in regimes is not estimated. To the extent that the significance of the nonlinearity depends on the estimation of such a threshold, a TAR model is considered to capture the abrupt change in dynamics.

[2] A STAR process converges to a TAR one as the smoothing parameter approaches infinity.

To date, only a few authors have dealt with threshold models of the STAR type from a Bayesian perspective. Lubrano (1999) considers Bayesian analysis of threshold regression models and finds that the shape of the posterior density is greatly determined by the type of threshold and transition function considered. Gefang and Strachan (2008) investigate the impact of international business cycles on the U.K. economy using a Bayesian LSTVAR and find that the U.K. business cycle is asymmetrically influenced by the U.S., Germany and France. Their results for real exchange rate data for Canada, Japan, the U.K. and the U.S. suggest some support for nonlinearity. Lo and Morley (2010) investigate the persistence of exchange rate fluctuations using a multiple-regime logistic smooth transition autoregressive (MR-LSTAR) model. Their model nests TAR and STAR processes and the authors find strong evidence of nonlinearities in the exchange rate data.

Using data for the U.S., the results of the paper support nonlinear threshold dynamics in the relationship between monetary policy and output when the threshold variable is the size of the monetary policy shock. Based on the Bayesian model comparison, both the TAR-driven and STAR-driven dynamics perform better than a linear specification. That is, there is evidence that the effects of monetary policy on output vary disproportionately with the size of the monetary shock. Furthermore, the analysis of marginal likelihoods also favors STAR-driven dynamics over a more abrupt change in regimes, like the one described by a TAR process.
This suggests that aggregation and firm heterogeneity could potentially explain why the implications of menu cost models have found only partial support in the empirical literature.

The remainder of this paper is organized as follows. The second section presents a UC model with linear and threshold-type nonlinear dynamics for the transitory component of output. In the third section, the empirical approach and practical issues for Bayesian estimation and model comparison are discussed. The fourth section reports the empirical results when these models are applied to U.S. data. Some concluding remarks are provided in the fifth section.

2 Empirical Models

Since at least the 1930s, economists have argued that the effects of monetary policy on the real economy are asymmetric. Such beliefs have only been formally modeled in recent years, in response to the development and improvement of nonlinear modeling. Based on this, a first distinction occurs between linear and nonlinear models of monetary policy effects on output. Furthermore, to address the importance of the role of aggregation (as discussed in the previous section), a second distinction occurs between TAR and STAR nonlinear dynamics. In this paper, all three types of dynamics (linear, TAR and STAR) are considered within the framework of a UC model.

Typically, economists are interested in the effects of monetary policy on the output gap (i.e., deviations of output from its potential level). In a UC framework, such effects are directly modeled by measuring the output gap as the transitory component of output. The general UC model can be described by:

\[ y_t = y^T_t + y^C_t \qquad (2.1) \]
\[ y^T_t = \mu + y^T_{t-1} + \nu_t \qquad (2.2) \]
\[ y^C_t = F(z_t) + \epsilon_t \qquad (2.3) \]

where $y_t$ is a measure of output, $y^T_t$ is the permanent (or trend) component of output, $y^C_t$ is the transitory (or cyclical) component of output, $z_t$ is a $1 \times k$ vector that includes lags of $y^C_t$ as well as a measure of monetary policy, and $F(\cdot)$ captures the functional form between output and monetary policy. The innovations $\epsilon_t$ and $\nu_t$ have a joint normal distribution with mean zero and variance-covariance matrix $\Omega$.

The system (2.1)-(2.3) is a modified version of the simple UC decomposition of real output into the permanent and transitory components, as in Watson (1986). Following the original model, the permanent component of output, given in equation (2.2), is modeled as a random walk with a drift term, $\mu$.
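To make the decomposition in (2.1)-(2.3) concrete, the following is a minimal simulation sketch. It assumes a purely autoregressive AR(2) form for $F(\cdot)$ with no policy terms, and every numerical value (drift, coefficients, standard deviations, correlation) is an illustrative assumption rather than an estimate from the paper. The function name `simulate_uc` is hypothetical.

```python
import random


def simulate_uc(n, mu=0.7, phi=(0.6, 0.0), sd_nu=1.0, sd_eps=1.0, rho=0.0, seed=0):
    """Simulate output as the sum of a trend and a cycle.

    Trend: random walk with drift mu, as in (2.2).
    Cycle: AR(2) with coefficients phi (an assumed linear choice of F).
    rho is the correlation between the trend and cycle innovations.
    All default values are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    output, trend, cycle = [], [], []
    trend_prev, cyc_1, cyc_2 = 0.0, 0.0, 0.0
    for _ in range(n):
        # Jointly normal innovations (nu_t, eps_t) with correlation rho.
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        nu = sd_nu * z1
        eps = sd_eps * (rho * z1 + (1.0 - rho ** 2) ** 0.5 * z2)
        trend_t = mu + trend_prev + nu                    # equation (2.2)
        cycle_t = phi[0] * cyc_1 + phi[1] * cyc_2 + eps   # assumed form of (2.3)
        trend.append(trend_t)
        cycle.append(cycle_t)
        output.append(trend_t + cycle_t)                  # equation (2.1)
        trend_prev, cyc_2, cyc_1 = trend_t, cyc_1, cycle_t
    return output, trend, cycle
```

Calling `simulate_uc(200, rho=-0.9)` would, under these assumptions, produce a series whose trend and cycle innovations are strongly negatively correlated.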
For the benchmark case, the UC model is characterized by a linear transitory component (UC-linear) and can be described by (2.1)-(2.2) and the following autoregressive distributed lag (ADL) process replacing (2.3):

\[ y^C_t = \sum_{p=1}^{P} \phi_p y^C_{t-p} + \sum_{j=1}^{J} \alpha_j x_{t-j} + \epsilon_t \qquad (2.4) \]

where $x_t$, the independent variable, is a measure of monetary policy shocks and all roots of the polynomial $\phi(L) = 1 - \sum_{p=1}^{P} \phi_p L^p$ lie outside the unit circle. In order to be consistent with the measures of monetary policy considered below, where the monetary variable does not affect output contemporaneously, only lags of $x_t$ are allowed to enter equation (2.4).

In terms of the nonlinear dynamics, the UC model characterized by a TAR-driven transitory component (UC-TAR) can be described by (2.1)-(2.2) and the following ADL replacing (2.3):

\[ y^C_t = \sum_{p=1}^{P} \phi_p y^C_{t-p} + \sum_{j=1}^{J} \alpha^S_j x_{t-j} I(s_t \le c) + \sum_{j=1}^{J} \alpha^L_j x_{t-j} I(s_t > c) + \epsilon_t \qquad (2.5) \]

where $I(\cdot)$ denotes the indicator function, $s_t$ is the threshold variable, and $c$ is the threshold parameter. When $s_t \le c$, the response coefficients are captured by the $J \times 1$ vector $\alpha^S$ and, when $s_t > c$, they are captured by the $J \times 1$ vector $\alpha^L$. Note that the coefficients $\phi_p$, $p = 1, \dots, P$, are not state-dependent. The autoregressive dynamics are assumed to be the same in both regimes because the question of interest concerns the differences in the coefficients on the monetary shocks across regimes.

When the regime-switching is smooth, the UC model with a STAR-driven transitory component (UC-STAR) can be described by (2.1)-(2.2) and the following ADL replacing (2.3):

\[ y^C_t = \sum_{p=1}^{P} \phi_p y^C_{t-p} + \sum_{j=1}^{J} \alpha_j x_{t-j} + \sum_{j=1}^{J} \alpha^G_j x_{t-j} G(s_t) + \epsilon_t \qquad (2.6) \]

where $G(s_t)$ is a smooth function, bounded between 0 and 1, that can be described according to:

\[ G(s_t) = \{1 + \exp(-\gamma (s_t - c))\}^{-1} \qquad (2.7) \]

where $c$ is a location parameter and $\gamma$ determines the smoothness of the change in the value of the logistic function (2.7). This transition function determines the weights put on each regime according to a logistic specification that depends on the smooth transition parameter. A desirable feature of this function is that it nests the TAR model as a special case. As $\gamma \to 0$, the model becomes linear. However, as $\gamma \to \infty$, (2.7) approaches the indicator function and the model turns into a TAR one.

To the extent that the asymmetry studied in this paper refers to the size of the monetary policy shock, the threshold variable $s_t$ is given by the absolute value of the monetary policy shock in both nonlinear models. Thus, the dynamics of the model will depend on whether the size of the shock is relatively big or small, as measured by the absolute value.

An additional feature of each model estimated here is that it allows for a one-time break in the variance-covariance matrix. Output, as well as many other macroeconomic aggregates, has experienced a reduction in volatility since the mid 1980s, an episode known as the Great Moderation. To account for this fact, an exogenous break date is set to the first quarter of 1984 to split the sample accordingly. [3] To reduce the dimensionality of the estimation, the parameter $\lambda \in [0, 1]$ rescales the variance-covariance matrix $\Omega$ to account for this reduction in volatility. That is, the variance-covariance matrix after the break is given by $\lambda\Omega$. [4] This assumption is supported by the findings in Ahmed, Levin, and Wilson (2004), who cannot reject the hypothesis that the reduction in volatility in U.S. real GDP is proportional to the one prevailing during the Great Moderation.

2.1 Bayesian Econometric Approach

The Bayesian estimation of the models is conducted by means of a multiple-block Metropolis-Hastings (MH) algorithm with a random-walk chain proposal. The MH algorithm is a posterior simulator that is useful in environments in which it is not known how to draw directly from the posterior, but where it is possible to evaluate relative densities.
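Before the sampler is described in more detail, a short numerical aside on the transition function (2.7) may help fix ideas about its limiting cases. The sketch below evaluates the logistic function in a numerically stable way and compares it with the TAR indicator $I(s_t > c)$. The function names and the values of `c` and `gamma` in the usage note are hypothetical choices made for illustration only.

```python
import math


def logistic_transition(s, c, gamma):
    """Logistic transition function G(s) = 1 / (1 + exp(-gamma * (s - c))).

    The two branches avoid overflow in exp() for large |gamma * (s - c)|.
    """
    z = gamma * (s - c)
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def tar_indicator(s, c):
    """Abrupt-regime counterpart: equals 1 when s > c and 0 otherwise."""
    return 1.0 if s > c else 0.0
```

With `gamma = 0`, `logistic_transition` returns 0.5 for every `s`, so the transition term in (2.6) collapses to a constant weight and the model is linear in the policy shocks. With a very large `gamma`, the function is numerically indistinguishable from `tar_indicator` away from `s = c`.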
From a practical point of view, the algorithm involves drawing from a proposal distribution, with the values accepted or rejected as draws from the target distribution based on the relative densities of the proposal and target distributions. To provide an accurate approximation of the target distribution, the proposal distribution used in the paper is a multivariate Student t distribution, following the applied literature.

In the estimation of threshold-type nonlinear models, two issues arise in the implementation of the MH algorithm. The first one affects only the estimation time, as there is a need to grid-search across the location parameter $c$ to find the posterior mode. However, this only applies to constructing the proposal distribution for the first draw, given the random-walk chain proposal. The second issue poses a problem for obtaining the scale of the proposal density, since the grid-search procedure over the location parameter makes it infeasible to use numerical derivatives to evaluate the curvature of the posterior. To overcome the latter, the paper follows Lo and Morley (2010) and considers an alternative measure of the curvature of the posterior with respect to the location parameter $c$. It involves inverting the likelihood ratio statistic for the threshold parameter, based on a $\chi^2(1)$ distribution assumption, to construct a 95% confidence interval and obtain a corresponding implied standard error. For further details, refer to Lo and Morley (2010).

[3] The focus of this paper is not on break dates. Given that many authors have estimated the Great Moderation to begin in the mid 1980s, the break date is set to the last quarter of 1983, broadly consistent with previous findings.

[4] This approach is also undertaken by Morley and Piger (2010) and Sinclair (2009) when accounting for the Great Moderation in UC models of U.S. real GDP.

In the case of the UC-STAR model, the smoothing parameter $\gamma$ faces the same two issues described above. Hence, in a similar fashion, the practical implementation of the MH algorithm involves a grid-search procedure across values of $\gamma$ to pin down the posterior mode. This issue, however, only involves the construction of the proposal distribution for the MH algorithm. After getting around these difficulties, the Bayesian estimation of the models can be carried out in a standard way.

An additional assumption made in the estimation of the threshold-type nonlinear models is that the location parameter $c$ in the UC-TAR case, and both the location parameter $c$ and the smoothing parameter $\gamma$ in the UC-STAR case, are not correlated with the rest of the parameters in the model. [5] Hence, for $\theta$ the set of all parameters in the model, the proposal distribution is constructed as follows:

\[ \theta \sim mt(\mu_\theta, \kappa\Sigma_\theta, \eta) \]

where $\mu_\theta$ is set to the posterior mode for the first draw and to the previous draw from the random-walk chain for all other draws, $\eta$ is the degrees of freedom parameter, set exogenously to 15 following the applied Bayesian literature [6], and $\kappa$ is a scaling factor for the proposal density, adjusted to attain an acceptance rate for the MH algorithm between 30% and 60%.
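The random-walk chain described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's sampler: it updates all parameters in a single block (the paper uses multiple blocks), it takes $\Sigma_\theta$ to be diagonal with user-supplied standard deviations `scales`, and it is meant to be tried on a toy target such as the hypothetical `log_std_normal` below. The function names are hypothetical; the defaults for `eta`, `n_draws` and `burn_in` mirror the values quoted in the text.

```python
import math
import random


def student_t_step(rng, scales, kappa, eta):
    """Multivariate Student-t increment with diagonal scale (an assumption).

    A normal vector is divided by sqrt(chi2(eta) / eta); the chi-square draw
    is obtained as a gamma variate with shape eta / 2 and scale 2.
    """
    w = rng.gammavariate(eta / 2.0, 2.0) / eta
    return [math.sqrt(kappa) * s * rng.gauss(0.0, 1.0) / math.sqrt(w) for s in scales]


def rw_metropolis(log_post, start, scales, kappa=1.0, eta=15,
                  n_draws=20000, burn_in=4000, seed=0):
    """Single-block random-walk Metropolis-Hastings sampler.

    Returns the retained draws and the acceptance rate over those draws.
    In practice kappa would be tuned until the rate falls in a target band.
    """
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_post(current)
    draws, accepted = [], 0
    for i in range(burn_in + n_draws):
        step = student_t_step(rng, scales, kappa, eta)
        proposal = [c + d for c, d in zip(current, step)]
        proposal_lp = log_post(proposal)
        # The proposal is symmetric, so only the posterior ratio matters.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            if i >= burn_in:
                accepted += 1
        if i >= burn_in:
            draws.append(current)
    return draws, accepted / n_draws


def log_std_normal(theta):
    """Toy target: log density (up to a constant) of independent N(0, 1)."""
    return -0.5 * sum(v * v for v in theta)
```

For example, `rw_metropolis(log_std_normal, [0.0, 0.0], [1.0, 1.0])` runs the chain on the toy target; raising `kappa` lowers the acceptance rate and lowering it raises the rate, which is the tuning margin the text refers to.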
Given the assumption described above, that $\hat{c}$ (and $\hat{\gamma}$) are uncorrelated with the rest of the parameters in the model, the proposal variance-covariance matrix $\Sigma_\theta$ for the UC-TAR model is given by:

\[ \Sigma_\theta = \begin{pmatrix} \hat{\sigma}^2(\hat{\theta}_{-c}) & 0 \\ 0 & \hat{\sigma}^2_{\hat{c}} \end{pmatrix} \]

where $\hat{\sigma}^2(\hat{\theta}_{-c})$ is the variance-covariance matrix of all parameters in the model, except for the threshold parameter, based on the estimated inverse Hessian matrix evaluated at the posterior mode, and $\hat{\sigma}^2_{\hat{c}}$ is the variance of the estimated location parameter $\hat{c}$ based on the Lo and Morley (2010) approach discussed above. The proposal variance-covariance matrix $\Sigma_\theta$ for the UC-STAR model is given by:

\[ \Sigma_\theta = \begin{pmatrix} \hat{\sigma}^2(\hat{\theta}_{-c,-\gamma}) & 0 \\ 0 & \hat{\sigma}^2_{\hat{c},\hat{\gamma}} \end{pmatrix} \]

where $\hat{\sigma}^2(\hat{\theta}_{-c,-\gamma})$ is the variance-covariance matrix of all parameters in the model, except for the location and smoothing parameters, and

\[ \hat{\sigma}^2_{\hat{c},\hat{\gamma}} = \begin{pmatrix} \hat{\sigma}^2_{\hat{c}} & 0 \\ 0 & \hat{\sigma}^2_{\hat{\gamma}} \end{pmatrix} \]

is the variance-covariance matrix of $(\hat{c}, \hat{\gamma})$.

[5] Since it is infeasible to obtain numerical derivatives to evaluate the curvature of the posterior with respect to the threshold parameter (and smoothing parameter), the variance-covariance matrix of all parameters in the model cannot be jointly estimated based on the inverse Hessian matrix.

[6] One of the reasons to use a multivariate Student t distribution as the proposal distribution is that the fatter tails allow for a wider set of possible draws. To guarantee that the tails are indeed fat, the degrees of freedom parameter is set to 15.

Once the models are estimated, they can be compared based on their marginal likelihoods, which can be interpreted as the expected value of the likelihood function with respect to the prior distribution. To formally compare two models, the posterior odds ratio is calculated:

\[ \frac{\Pr(M_i \mid y)}{\Pr(M_j \mid y)} = \frac{p(M_i)}{p(M_j)} \times \frac{m(y \mid M_i)}{m(y \mid M_j)} \qquad (2.8) \]

where the first factor on the right-hand side of (2.8) is the prior odds ratio, the ratio of the prior probability of model $i$ to the prior probability of model $j$. The second factor on the right-hand side of (2.8) is the Bayes factor, the ratio of the marginal likelihoods of the two models. The marginal likelihood is given by

\[ m(y \mid M_i) = \int f(y \mid \theta, M_i)\, \pi(\theta \mid M_i)\, d\theta \]

for model $i$, and can be obtained following Chib and Jeliazkov (2001). For each run of the MH algorithm, 20,000 draws were considered after discarding the first 4,000 draws.

3 Results

3.1 Data and model specification

All data are quarterly. Output is measured as the first differences in the natural logarithm of real Gross Domestic Product (GDP). The monetary instrument is the Federal Funds rate (FFR) and inflation is

measured using the first differences in the natural logarithm of the GDP deflator. All data are taken from the Federal Reserve Economic Data (FRED) database and are seasonally adjusted. The sample period runs from 1954:Q3 through 2008:Q3.

To approximate the monetary policy variable, an interest rate-based monetary shock is constructed from the residuals of an identified VAR, which contains three variables: the Federal Funds rate (FFR), the logarithm of real GDP and the logarithm of the GDP deflator. To identify the shock, the policy variable is ordered last in the VAR (i.e., monetary shocks do not affect output contemporaneously) and four lags of each variable are included.

The number of autoregressive coefficients for $y^C_t$, $P$, and the number of lags for the monetary shock, $J$, are set based on the results in Lo and Piger (2005). They estimate similar models for different numbers of lags and find that the best model is the one with $P = J = 2$.

3.2 Prior distributions

In setting values for the priors, a number of considerations are taken into account. First, it is known that the degree of parameterization of the model influences the quality of inference in finite samples. For instance, priors that are very informative might bias the estimates. Second, given that Bayes factors are functions of the prior normalizing constants, the prior settings can have a strong influence on the posterior model weights (Strachan and van Dijk, 2004). Generally, less informative priors tend to penalize more highly parameterized models. Third, the understanding of the behavior of economic variables is sometimes limited. Thus, there exists a conflict between the desire to specify uninformative priors and the desire for informative priors that would improve the efficiency of estimation. Fourth, the use of larger, more parameterized models is neither avoided nor preferred a priori. Taking into account these considerations, the priors are elicited as follows.
Table 1 provides a summary of the priors for all three models. For all models considered, the priors for the state-independent autoregressive coefficients follow a normal distribution. Each of the prior means is set to zero and each of the prior variances is set to 0.5. These are relatively uninformative priors, considering the truncation of the normal distribution to ensure stationarity (i.e., the roots of the polynomial $\phi(L)$ lie outside the unit circle).

Likewise, the drift in the permanent component, $\mu$, and the policy coefficients, $\alpha$, are each assumed to follow a normal distribution with mean zero and variance 0.5. In the case of the nonlinear models, it is important to note that all policy coefficients are assumed to have the same mean and variance. Even if the effects of monetary policy shocks on output do vary disproportionately with the size of the shock, this view is not imposed a priori.

Table 1: Prior distributions and parameters for all models

Parameter              Support   Density   Mean   Variance
\phi                   R         Normal*   0      0.5
\alpha                 R         Normal    0      0.5
\alpha^L               R         Normal    0      0.5
\alpha^S               R         Normal    0      0.5
\alpha^G               R         Normal    0      0.5
\mu                    R         Normal    0      0.5
\sigma_\epsilon        R+        Wishart   1.1    1.5
\sigma_\nu             R+        Wishart   1.1    1.5
\sigma_{\epsilon\nu}   R         Wishart   0      0.5
\lambda                R+        Gamma     0.5    0.2
c                      R+        Gamma     0.57   0.29
\gamma                 R++       Gamma     20     6

* denotes a truncated distribution.

For the variance-covariance matrix of the forecast errors, the variances are assumed to follow a gamma distribution with parameters $a$ and $b$. For any random variable $X \sim gamma(a, b)$, the mean and variance satisfy the following equations:

\[ a = \frac{[E(X)]^2}{Var(X)} \quad \text{and} \quad b = \frac{E(X)}{Var(X)} \qquad (3.1) \]

Hence, for the prior on the forecast error variance of both the permanent and transitory components of output, a mean of 1.1 and a variance of 1.5 are assumed, which in turn imply that $a = 0.806$ and $b = 0.733$. The covariance term is assumed to follow a normal distribution with mean zero and variance 0.5. This is a relatively uninformative prior. [7]

The parameter $\lambda$, which rescales the variance-covariance matrix $\Omega$, is assumed to follow a beta distribution with parameters $\bar{a}$ and $\bar{b}$. For any random variable $X \in [0, 1]$ with $X \sim beta(\bar{a}, \bar{b})$, the mean and variance satisfy the following equations:

\[ \bar{a} = E(X)\, \frac{E(X)(1 - E(X)) - Var(X)}{Var(X)} \]
\[ \bar{b} = (1 - E(X))\, \frac{E(X)(1 - E(X)) - Var(X)}{Var(X)} \]

The prior mean for $\lambda$ is set to 0.5 and the prior variance is set to 0.2. Thus, the parameters for the beta distribution are $\bar{a} = \bar{b} = 0.125$.

[7] Only draws that make the variance-covariance matrix of forecast errors positive definite are considered.

With respect to the priors of the threshold and smoothing parameters, they are assumed to follow a gamma distribution. As explained before, when $\gamma = 0$, the smooth transition function in (2.7) becomes a constant and, as a consequence, the elements in $\alpha^G$ become unidentified. Hence, following Lubrano (1999), the point $\gamma = 0$ is excluded a priori from the support of $\gamma$. Notice that, even if the prior for $\gamma$ excludes zero, this does not bias the results in favor of asymmetry, since the priors for the elements in $\alpha^G$ are centered around zero. The prior mean for $\gamma$ is set to 20, in line with the estimates in frequentist studies (Weise, 1999, van Dijk, Teräsvirta, and Franses, 2000, Rothman et al., 2001). The prior variance for $\gamma$ is set to 6. Given the equations in (3.1), this implies that the parameters for the gamma distribution for $\gamma$ are $a = 66.667$ and $b = 3.333$.

For the prior of the location parameter $c$, a gamma distribution is assumed because the transition variable to be considered is the absolute value of the monetary policy shock, given that the paper is focused on asymmetries with respect to the size of the shock. The prior mean is set to the median of the state variable, $s_t$, and the prior variance is set to 0.5 times the median of $s_t$. For both $\gamma$ and $c$, the priors are relatively uninformative. The results of the model, however, are robust across different parameterizations of the prior distributions.

3.3 Posteriors

Table 2 reports the logarithm of the marginal likelihood values, the Bayes factors and the posterior odds ratios for all models estimated, as shown in equation (2.8). The Bayes factor is calculated as the ratio between the marginal likelihood of a specific model and the largest marginal likelihood value among all models. Seven specifications were estimated: UC-linear, UC-TAR and five versions of a UC-STAR model. These five versions correspond to five possible delay lags of the threshold variable.
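As a numerical aside on the prior settings of Section 3.2, the moment-matching relations in (3.1) and their beta counterparts can be checked directly. The sketch below is only an arithmetic check; the function names are hypothetical, and the inputs are the prior means and variances quoted in the text.

```python
def gamma_params(mean, var):
    """Gamma parameters (a, b) implied by a mean and variance, as in (3.1)."""
    return mean ** 2 / var, mean / var


def beta_params(mean, var):
    """Beta parameters implied by a mean and variance on [0, 1]."""
    common = (mean * (1.0 - mean) - var) / var
    return mean * common, (1.0 - mean) * common


# Prior means and variances quoted in the text.
a_sigma, b_sigma = gamma_params(1.1, 1.5)    # forecast error variances
a_gamma, b_gamma = gamma_params(20.0, 6.0)   # smoothing parameter
a_lam, b_lam = beta_params(0.5, 0.2)         # rescaling parameter lambda
```

Under these formulas, the three pairs agree with the values reported in the text up to rounding in the last digit.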
When the regime-switching is abrupt, it is implicitly assumed that all firms respond to a monetary policy shock at the same time. Hence, they all respond to one threshold variable. As discussed above, however, this assumption might not be accurate at the aggregate level. Some firms might react contemporaneously, while others face constraints that only allow them to react with a lag. This implies that, potentially, firms may react to different lags of the threshold variable. For this reason, five versions of the UC-STAR model are considered, corresponding to reaction lags that range from contemporaneous to four quarters.
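The Bayes factors and posterior odds reported in Table 2 follow mechanically from the log marginal likelihoods through (2.8). The sketch below reproduces that arithmetic with the log marginal likelihoods copied from Table 2. The prior weights are an assumption about how the downweighting is implemented: weight 1 for UC-linear and UC-TAR and 1/5 for each UC-STAR version, with the UC-STAR0 model as the reference. Function and variable names are hypothetical.

```python
import math

# Log marginal likelihoods as reported in Table 2.
log_ml = {
    "UC-linear": -195.54, "UC-TAR": -193.34, "UC-STAR0": -184.42,
    "UC-STAR1": -190.02, "UC-STAR2": -186.99, "UC-STAR3": -189.21,
    "UC-STAR4": -191.23,
}

# Assumed unnormalised prior weights: each UC-STAR version gets 1/5.
prior_weight = {m: (0.2 if m.startswith("UC-STAR") else 1.0) for m in log_ml}


def bayes_factors(log_ml):
    """Bayes factor of each model against the model with the largest log m(y)."""
    best = max(log_ml.values())
    return {m: math.exp(v - best) for m, v in log_ml.items()}


def posterior_odds(log_ml, prior_weight, reference):
    """Posterior odds against a reference model: prior odds times Bayes factor."""
    return {
        m: (prior_weight[m] / prior_weight[reference])
        * math.exp(log_ml[m] - log_ml[reference])
        for m in log_ml
    }


bf = bayes_factors(log_ml)
po = posterior_odds(log_ml, prior_weight, "UC-STAR0")
```

Working in logs and exponentiating only the differences avoids underflow, since the marginal likelihoods themselves are far too small to represent directly. Small differences from the table in the last reported digit are possible because the inputs are rounded to two decimals.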

Table 2: Log marginal likelihood, Bayes factors and posterior odds

Model       Log marg. lik.   Bayes factor   Posterior odds
UC-linear   -195.54          0.00001        0.00007
UC-TAR      -193.34          0.00013        0.00067
UC-STAR0    -184.42          1.00000        1.00000
UC-STAR1    -190.02          0.00374        0.00374
UC-STAR2    -186.99          0.07654        0.07654
UC-STAR3    -189.21          0.00831        0.00831
UC-STAR4    -191.23          0.00110        0.00110

The Bayes factors are calculated with respect to the model with the highest log marginal likelihood. The posterior odds ratio takes into account the downweight factor to account for the bias towards UC-STAR models.

Because five versions of the UC-STAR model are compared with a single UC-linear and a single UC-TAR specification, treating all seven models as equally likely a priori would favor the UC-STAR class. In the calculation of the posterior odds ratio, this bias towards UC-STAR models is taken into account via the prior odds ratio. Instead of all models being equally likely a priori, the prior probability of each of the versions of the UC-STAR model is downweighted by 1/5.

Based on the analysis of the posterior odds ratio, the results in table 2 show that the best model is the UC-STAR0 model. In general, there is strong evidence that the response of output varies disproportionately with the size of the monetary policy shock. Each nonlinear model is favored against the UC-linear specification. More importantly, the evidence suggests that the transition between regimes is smooth. Even after downweighting the UC-STAR models for the reasons explained above, each of them performs better than the UC-TAR one. The fact that none of the UC-STAR specifications is outperformed by the UC-TAR model is taken as evidence that aggregation and firm heterogeneity are important in understanding whether the effects of monetary policy on output vary disproportionately with the size of the shock.

To gain some insight into the features of the nonlinearities in the data, table 3 summarizes the posterior distributions of the best nonlinear model. In this model, the sum of the policy response coefficients in the regime when monetary policy shocks are small, $\alpha_1$ and $\alpha_2$, is -0.202 and at least one of them is large and significant.
That is, a small monetary policy shock that reduces the FFR by 25 basis points increases output by 0.051%. Once the issue of aggregation and firm heterogeneity is taken into account, however, large monetary policy shocks are neutral: the policy response coefficients in the regime when the monetary policy shocks are big, $\alpha^G_1$ and $\alpha^G_2$, are very close to zero, as suggested by the 90% Bayesian credible intervals. These results suggest that, when the transition between regimes is smooth, the data are consistent with the implications of menu-cost models.

Table 3: Posterior distributions for the UC-STAR0 model

Parameter              Mean     90% interval
\phi_1                 0.608    [0.296, 0.969]
\phi_2                 0.003    [-0.268, 0.221]
\mu                    0.730    [0.611, 0.845]
\sigma_\epsilon        1.347    [1.030, 1.614]
\sigma_\nu             1.682    [1.375, 2.021]
\sigma_{\epsilon\nu}   -2.064   [-2.785, -1.482]
\lambda                0.239    [0.174, 0.313]
c                      0.323    [0.152, 0.970]
\gamma                 19.050   [16.906, 20.732]
\alpha_1               -0.192   [-0.363, -0.018]
\alpha_2               -0.010   [-0.196, 0.122]
\alpha^G_1             0.018    [-0.185, 0.203]
\alpha^G_2             -0.068   [-0.203, 0.133]

To better understand the form of the regime-switching, the transition function (2.7) is plotted over the range of the threshold variable $s_t$ in figure 1. As can be seen from the graph, the change in the dynamics is relatively smooth. Moreover, figure 1 also supports the STAR dynamics and the idea that aggregation plays a key role in understanding whether the effects of monetary policy on output vary with the size of the monetary shock, as the logistic function exhibits many points between 0 and 1.

[Figure 1: Smooth transition function]

It is important to evaluate how these responses vary over time, given that the UC-STAR0 model allows for asymmetries in the behavior of the coefficients. As discussed in Koop, Pesaran, and Potter (1996), impulse response functions (IRFs) of nonlinear models are history- and shock-dependent. This

setting contrasts with traditional IRFs in a linear VAR, where shocks are treated symmetrically and independent of the regime prevailing in the economy. Therefore, to address these issues, generalized impulse-response functions (GIRFs) were constructed. See the appendix for details. Figure 2 presents the GIRFs for the transitory component of output. Since each particular history generates a given forecast of yt C, the median of these forecasts are reported, together with the 25th and 75th quantiles (dashed lines). The left panel of figure 2 plots the response of y C t for q = 15 periods ahead in regime 1, that is when the monetary shock hitting the system is small, corresponding to 58 possible histories. The right panel plots the response of y C t for the same number of periods ahead in regime 2, corresponding to 132 possible histories. Figure 2: Generalized impulse-response functions for the transitory component of the UC-STAR0 model.2 Small Shock Regime (58 histories).2 Large Shock Regime (132 histories).1.1.0.0 -.1 -.1 -.2 -.2 -.3 -.3 -.4 -.4 -.5 -.5 -.6 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 -.6 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Generalized impulse-response functions of the transitory component of output to a positive shock to the monetary policy variable. The size of the shocks corresponds to a standard deviation difference between the small and large shocks, with the estimated threshold as the middle point. From figure 2, it can be observed that the median response of the transitory component of output to a small positive monetary policy shock is larger than the median response to a large positive monetary shock (on impact, the transitory component of output falls -0.248 and -0.126, respectively). Nonetheless, the response to a large positive monetary policy shock is not significant, as the bands include zero for the all 15 quarters of the forecast. 
Hence, again, once aggregation and firm heterogeneity are taken into account, the data are consistent with the implications of models with menu costs.
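As a rough check on the model comparison in table 2, the Bayes factors and posterior odds can be recomputed from the reported log marginal likelihoods. The sketch below assumes that the posterior odds of a model relative to the best one equal the Bayes factor times the prior odds, with each UC-STAR version given a prior weight of 1/5 and the UC-linear and UC-TAR models a weight of 1; the names `log_ml`, `prior_weight`, `bayes_factor` and `posterior_odds` are illustrative and not taken from the paper. Because the log marginal likelihoods are rounded to two decimals, the recomputed values may differ slightly from the tabulated ones.

```python
import math

# Log marginal likelihoods as reported in table 2.
log_ml = {
    "UC-linear": -195.54,
    "UC-TAR": -193.34,
    "UC-STAR0": -184.42,
    "UC-STAR1": -190.02,
    "UC-STAR2": -186.99,
    "UC-STAR3": -189.21,
    "UC-STAR4": -191.23,
}

# Assumed prior weights: each UC-STAR version is downweighted by 1/5,
# while the UC-linear and UC-TAR models keep unit weight.
prior_weight = {m: (0.2 if m.startswith("UC-STAR") else 1.0) for m in log_ml}

# Reference model: the one with the highest log marginal likelihood.
best = max(log_ml, key=log_ml.get)


def bayes_factor(model):
    """Bayes factor of `model` against the best model."""
    return math.exp(log_ml[model] - log_ml[best])


def posterior_odds(model):
    """Bayes factor times prior odds, both relative to the best model."""
    return bayes_factor(model) * prior_weight[model] / prior_weight[best]


for m in log_ml:
    print(f"{m:10s} {bayes_factor(m):.5f} {posterior_odds(m):.5f}")
```

Under these assumptions the UC-STAR rows have identical Bayes factors and posterior odds, since their prior odds against UC-STAR0 equal one, while the UC-linear and UC-TAR rows are scaled up by a factor of five.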

4 Concluding Remarks

Using Bayesian methods, this paper examines the role that aggregation and firm heterogeneity play in understanding whether the effects of monetary policy on output vary disproportionately with the size of the monetary shock. The results show that there is strong evidence of nonlinearities in the data. In particular, the response of output is asymmetric with respect to the magnitude of the shock. More importantly, there is evidence that the transition from one regime to the other is smooth, consistent with the idea that firms are heterogeneous in their beliefs about the economy and the timing with which they respond to economic circumstances.

Furthermore, the estimated coefficients suggest that the response of output in periods in which monetary shocks are small is large and significant. On the other hand, the response of output to large monetary shocks is neutral, once aggregation and firm heterogeneity are taken into account. This result seems to provide some support for the implications of menu-cost models.

However, it is important to note that this paper only describes the behavior of the linkages between output and the size of the monetary policy variable. To provide further support for the aggregation hypothesis, it would be interesting to evaluate whether the transition function exhibits a more abrupt change in regimes when considering the effects of monetary policy on more disaggregated output data. This is left for future research.

Appendix

Computation of generalized impulse-response functions

The procedure to compute the generalized impulse-response functions (GIRFs) follows the one described in Koop et al. (1996), to which the reader is referred for further details. A GIRF can be defined as the effect of a one-time shock on the forecast of variables in a particular model, given a specific history. The response constructed must then be compared to a benchmark no-shock scenario. In this way, the GIRF can be expressed as follows:

$$GI_Y(q, \nu_t, \omega_{t-1}) = E[Y_{t+q} \mid \nu_t, \omega_{t-1}] - E[Y_{t+q} \mid \omega_{t-1}]$$

where $GI_Y$ is the generalized impulse-response function of a variable $Y$ for period $q$, given the specific history $\omega_{t-1}$ and initial shock $\nu_t$, and $E[\cdot]$ is the expectations operator.

To compute the GIRF, the conditional expectations in the equation above are simulated. The nonlinear model is assumed to be known (i.e., sampling variability is ignored). The shock to $Y$, $\nu_0$, occurs in period 0, and responses are computed for $q$ periods ahead. Thus, the $GI_Y$ function is generated according to the following steps:

Step 1: Pick a history $\omega_{i,t-1}$. The history is the actual value of the lagged endogenous variables at a particular date, or for a particular episode (e.g., those values of the endogenous variables that fall under regime 1).

Step 2: Pick a sequence of two-dimensional shocks $\nu_{j,t+q}$, $q = 0, 1, \ldots, n$. This vector of shocks includes both monetary and idiosyncratic shocks. They are drawn with replacement from the vector of monetary shocks (the residuals from the identified VAR) and from the estimated residuals of the transitory component of the model.

Step 3: Using $\omega_{i,t-1}$ and $\nu_{j,t+q}$, simulate the path for $y_{t+q}$ over $n$ periods according to equation (2.6). This benchmark path is denoted as $Y_{t+q}(\omega_{i,t-1}, \nu_{j,t+q})$ for $q = 1, \ldots, n$.

Step 4: Using the same $\omega_{i,t-1}$ and $\nu_{j,t+q}$, plus an additional initial shock $\nu_0$, simulate the path for $y_{t+q}$ over $n + 1$ periods according to the equation for the transitory component of output. This profile path is denoted $Y_{t+q}(\nu_0, \omega_{i,t-1}, \nu_{j,t+q})$ for $q = 0, 1, \ldots, n$.

Step 5: Repeat steps 2 to 4 $B$ times.

Step 6: Repeat steps 1 to 5 $R$ times and compute the quantiles of the difference between the profile and benchmark paths, $Y_{t+q}(\nu_0, \omega_{i,t-1}, \nu_{j,t+q}) - Y_{t+q}(\omega_{i,t-1}, \nu_{j,t+q})$.
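The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `simulate` is a hypothetical linear AR(2) stand-in for the transitory-component equation (the model's equation (2.6) is not reproduced here), the monetary and idiosyncratic residuals are resampled independently, both paths are simulated for $q = 0, \ldots, n$ for simplicity, and all names (`simulate`, `girf_quantiles`, `phi`, `alpha`) are illustrative rather than the author's.

```python
import numpy as np


def simulate(history, m_shocks, e_shocks, phi, alpha, nu0=0.0):
    """Toy stand-in (an assumption) for the transitory-component equation:
    y_t = phi[0]*y_{t-1} + phi[1]*y_{t-2} + alpha*m_t + e_t."""
    y1, y2 = history  # (y_{t-1}, y_{t-2})
    path = np.empty(len(m_shocks))
    for q in range(len(m_shocks)):
        # The additional shock nu0 hits the monetary variable in period 0 only.
        m = m_shocks[q] + (nu0 if q == 0 else 0.0)
        y = phi[0] * y1 + phi[1] * y2 + alpha * m + e_shocks[q]
        path[q] = y
        y1, y2 = y, y1
    return path


def girf_quantiles(histories, m_resid, e_resid, nu0, n, B, phi, alpha, seed=0):
    """Quantiles (25%, 50%, 75%) of profile minus benchmark paths."""
    rng = np.random.default_rng(seed)
    diffs = []
    for history in histories:                                  # Steps 1 and 6
        for _ in range(B):                                     # Step 5
            # Step 2: resample both kinds of shocks with replacement.
            m = rng.choice(m_resid, size=n + 1, replace=True)
            e = rng.choice(e_resid, size=n + 1, replace=True)
            benchmark = simulate(history, m, e, phi, alpha)       # Step 3
            profile = simulate(history, m, e, phi, alpha, nu0)    # Step 4
            diffs.append(profile - benchmark)
    return np.quantile(np.array(diffs), [0.25, 0.5, 0.75], axis=0)


# Hypothetical inputs, chosen only to exercise the functions.
histories = [(0.0, 0.0), (1.0, -0.5)]
m_resid = np.array([-1.0, 0.5, 0.2])
e_resid = np.array([0.3, -0.4, 0.1])
bands = girf_quantiles(histories, m_resid, e_resid, nu0=1.0, n=3, B=5,
                       phi=(0.5, 0.0), alpha=-0.2)
```

In this linear toy the difference between the profile and benchmark paths does not depend on the history or on the resampled shocks, so the three quantiles coincide. With a nonlinear equation such as a STAR specification, the same loop would produce history- and shock-dependent responses, and the quantile bands would spread out.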

References

Ahmed, S., Levin, A., and Wilson, B. A. (2004), Recent U.S. Macroeconomic Stability: Good Policies, Good Practice or Good Luck? The Review of Economics and Statistics, 86(3), 824–832.
Ball, L. and Mankiw, N. G. (1994), Asymmetric Price-Adjustment and Economic Fluctuations, The Economic Journal, 423, 247–261.
Ball, L. and Romer, D. (1990), Are Prices Too Sticky? Quarterly Journal of Economics, 104(3), 507–524.
Caballero, R. J. and Engel, E. (1991), Dynamic (S,s) Economies, Econometrica, 59(6), 1659–1686.
Caballero, R. J. and Engel, E. (2007), Price Stickiness in Ss Models: New Interpretations of Old Results, Center Discussion Papers, 952, Economic Growth Center, Yale University.
Caplin, A. and Leahy, J. (2010), Economic Theory and a World of Practice: A Celebration of the (S,s) Model, Journal of Economic Perspectives, 24(1), 183–201.
Caplin, A. and Spulber, D. (1987), Menu Costs and the Neutrality of Money, Quarterly Journal of Economics, 102, 703–726.
Chib, S. and Jeliazkov, I. (2001), Marginal Likelihood from the Metropolis-Hastings Output, Journal of the American Statistical Association, 96, 270–281.
Donayre, L. (2010), Estimated Thresholds in the Response of Output to Monetary Policy: Are Large Policy Changes Less Effective? Working Paper, Washington University in St. Louis.
Gefang, D. and Strachan, R. (2008), Nonlinear Impact of International Business Cycles on the U.K.: A Bayesian Smooth Transition VAR, Discussion Papers in Economics 08/4, Dept. of Economics, University of Leicester.
Gertler, M. and Leahy, J. (2006), A Phillips Curve with an Ss Foundation, NBER Working Papers, 11971.
Golosov, M. and Lucas, R. E. (2007), Menu Costs and Phillips Curves, NBER Working Paper, 10187.
Koop, G., Pesaran, M. H., and Potter, S. (1996), Impulse-Response Analysis in Nonlinear Multivariate Models, Journal of Econometrics, 74, 119–147.
Koop, G. and Potter, S. (1999), Bayes Factors and Nonlinearity: Evidence from Economic Time Series, Journal of Econometrics, 88, 251–281.
Koop, G. and Potter, S. (2004), Dynamic Asymmetries in U.S. Unemployment, Journal of Business and Economic Statistics, 17(3), 298–312.
Lo, M.-C. and Morley, J. (2010), Bayesian Analysis of Nonlinear Exchange Rate Dynamics and the Purchasing Power Parity Persistence Puzzle, Working Paper, Washington University in St. Louis.
Lo, M.-C. and Piger, J. (2005), Is the Response of Output to Monetary Policy Asymmetric? Evidence from a Regime-Switching Coefficients Model, Journal of Money, Credit and Banking, 37, 865–887.
Lubrano, M. (1999), Bayesian Analysis of Nonlinear Time Series with a Threshold, Nonlinear Econometric Modelling, Cambridge: Cambridge University Press.
Morley, J. C. and Piger, J. (2010), The Asymmetric Business Cycle, Working Paper, Washington University in St. Louis.
Ravn, M. and Sola, M. (2004), Asymmetric Effects of Monetary Policy in the U.S.: Positive versus Negative or Big versus Small? Federal Reserve Bank of St. Louis Review, 86(5), 41–60.
Rothman, P., van Dijk, D., and Franses, P. H. (2001), Multivariate STAR Analysis of the Money-Output Relationship, Macroeconomic Dynamics, 5, 506–532.
Sinclair, T. (2009), Asymmetry in the Business Cycle: Friedman's Plucking Model with Correlated Innovations, Studies in Nonlinear Dynamics & Econometrics, 14(1), Article 3.
van Dijk, D., Teräsvirta, T., and Franses, P. H. (2000), Smooth Transition Autoregressive Models: A Survey of Recent Developments, Econometric Institute Research Report EI 2000-23/A.
Watson, M. W. (1986), Univariate Detrending Methods with Stochastic Trends, Journal of Monetary Economics, 18, 49–75.
Weise, C. (1999), The Asymmetric Effects of Monetary Policy: A Nonlinear Vector Autoregression Approach, Journal of Money, Credit and Banking, 31, 85–108.