Persistence in Forecasting Performance
Marco Aiolfi (Bocconi) and Allan Timmermann (UCSD)

December 27, 2003

Abstract

This paper considers various measures of persistence in the forecasting performance of linear and nonlinear time-series models applied to output growth in the G7 countries. We find strong evidence of persistence among both top and bottom forecasting models, but also find systematic evidence of crossings - where a previously good (bad) forecasting model delivers bad (good) forecasting performance out-of-sample - particularly among linear models. We consider a range of forecast combination methods including spreads that put positive weights on models with good past forecasting performance and negative weights on models with poor past performance. We also propose a new three-stage model combination method that first sorts models into clusters based on their past performance, then pools forecasts within each cluster, followed by estimation of the optimal forecast combination weights for these clusters and shrinkage towards equal weights. This method is shown to work well empirically in out-of-sample forecasting experiments for output growth.

1 Introduction

Forecasts are of considerable importance to decision makers throughout economics and finance and are routinely used by private enterprises, government institutions and professional economists. It is therefore not surprising that considerable effort has gone into designing
and estimating forecasting models ranging from simple, autoregressive specifications to complicated non-linear models or models with time-varying parameters. A reason why such a wide range of forecasting models is often considered is that the true data generating process underlying a particular series of interest is unknown and even the most complicated model is likely to be misspecified and can, at best, provide a reasonable local approximation to the target variable. Conditions under which the true model is selected asymptotically are quite strict, c.f. White (1990), Sin and White (1996), and are unlikely to be relevant in practical forecasting situations in macroeconomics with a large cross-section of forecasting models and a short time-series dimension. Forecasting models are also likely to be subject to structural instability as documented by Stock and Watson (1996) for a large set of macroeconomic variables. Viewed this way, it is highly unlikely that a single model will dominate all others at all points in time and the identity of the best local approximation is also likely to change over time. If the identity of the best local model is time-varying, a question immediately arises, namely whether it is possible to design a forecasting strategy that, at each point in time, selects the best current model. This strategy can be expected to be successful if the best current forecasting model outperforms all other models by a suitably large margin and if the ranking of the forecasting models is persistent. If either of these conditions fails to be met, a strategy of selecting only a single best model is unlikely to work well. Most obviously, if (ex-ante) the identity of the best model varies randomly from period to period, it will not be possible to identify this model by considering past forecasting performance across models.
Similarly, if the best model only outperforms other models by a margin that is small relative to random sampling variation, it becomes difficult to identify this model by means of statistical methods based on past performance. Even if the single best model could be identified in this situation, it is likely that diversification gains from combining across a set of forecasting models with similar performance will dominate the strategy of only using a single model. In general there are hence good theoretical reasons for combining forecasts produced by several models. A popular strategy is to assign equal weights to the individual forecasting models. This becomes an optimal strategy if there is no ex-ante indication of the individual
models' prospective out-of-sample forecasting performance, either because the models are of similar quality or because their (relative) performance is unstable over time. In practice, the sources that give rise to changes in the ranking of different forecasting models - e.g., major oil price shocks, policy changes, institutional shifts or market participants' learning behaviour - are neither likely to be very swift nor to unfold over several decades. In this case, one might expect the relative performance of forecasting models to be subject to change but also to display moderate degrees of persistence. How much persistence is a question of great practical relevance, however. Unfortunately, little is known about this question.[1] The first part of our paper therefore considers this question by studying the extent of persistence in forecasting performance using a range of non-parametric statistics. We find strong evidence of persistence among both top and bottom forecasting models, but also find systematic evidence of crossings - where a previously good (bad) forecasting model delivers bad (good) forecasting performance out-of-sample - particularly among linear models. If the (relative) performance of forecasting models changes over time, it is crucial to decide how long a sorting or ranking window to base the combination weights on. One possibility is to use the models' full track record, in which case an expanding window is used for ranking. Alternatively, if performance persistence is more local in time, the models can be ranked over a much shorter recent time span that considers either their most recent performance or their performance over a rolling window of recent data. In the case where the correlation between the forecasts and the target variable does not change too rapidly over time, it becomes attractive to estimate forecast combination weights by least squares regression or (equivalently) using portfolio variance minimization methods.
This approach introduces new problems, however. First, given the sample sizes typically available in macroeconomic forecasting, the combination weights are often imprecisely estimated, giving rise to poor out-of-sample forecasting performance. In particular, this is a problem when the number of models is large relative to the length of the time-series so that the covariance matrix of the forecast errors either cannot be estimated or is estimated very imprecisely. Furthermore, the assumption of a stable covariance structure is unlikely to be satisfied in practice due to the presence of instabilities in the data generating process. To deal with such problems, we propose a new three-stage non-parametric approach for model combination that first sorts models into clusters based on their past performance, then pools forecasts within each cluster, followed by estimation of the optimal forecast combination weights for these clusters and shrinkage towards equal weights. The premise of this approach is that, when combining forecasts from a large cross-section of models, it is generally difficult to distinguish between the performance of the top models, but one can tell the difference between the best and worst models. This suggests including a subset of good models in the combined forecast. Our method is shown to work well empirically in out-of-sample forecasting experiments for output growth in the G7 countries.

The paper is organized as follows. Section 2 studies the persistence of forecasts produced by a range of linear and nonlinear time-series models. Section 3 introduces the forecast combination problem while Section 4 studies the out-of-sample forecasting performance of a range of standard combination methods proposed in the literature as well as some new ones. Section 5 proposes our new three-stage combination method, while Section 6 concludes.

2 Persistence in Forecasting Performance

2.1 Data Set

Our empirical application uses data on real GDP growth in the G7 countries (Canada, France, Germany, Italy, Japan, UK and the US) and is taken from the larger data set compiled by Stock and Watson (2003). The data contains quarterly, seasonally adjusted time series over the sample period, although some series are available only for a shorter period.

[1] Stock and Watson (1999) consider combination methods based on expanding window and rolling window estimations, two approaches that are usually associated with a stable and an unstable data generating process, respectively.
For all countries the initial estimation period is 1959.I-1969.IV so the first available forecast is for 1970.I (denoted T_0 + 1), while the last forecast is for 1999.IV (denoted T). The number of linear model specifications ranges from 41 for the UK and Canada over 43 for Germany, 45 for France, 51 for Japan, to 66 for the US.

[2] More specifically, the sample size is 107 for the US and Canada, 89 for the UK, 80 for Japan, 103 for Germany and 67 for France.

2.2 Forecasting Models and Methods

The underlying forecasts in our problem are generated by time-series models of the form:

    y_{t+h} = f_i(Z_t; θ_{i,h}) + u_{i,t+h,t}.    (1)

Here i is an index for the forecasting model, θ_{i,h} is a vector of unknown parameters, u_{i,t+h,t} is an error term, and Z_t is a vector of predictor variables that are known at time t. Individual forecasting models generally only use a subset of the elements of Z_t. All forecasts are computed recursively, out-of-sample, i.e., the forecast of y_{t+h} by the i-th model is computed as f_i(Z_t; θ̂_{it,h}). Following the analysis of Stock and Watson (2001), we consider both linear and nonlinear forecasting models. The class of linear models includes simple autoregressions with lag lengths selected recursively using the BIC and considering up to four lags:

    y_{t+h} = c + A(L) y_t + ε_{t+h}.    (2)

We also consider bivariate autoregressive models that include a single additional regressor, x_t, which is an element of Z_t:

    y_{t+h} = c + A(L) y_t + B(L) x_t + ε_{t+h}.    (3)

Lag lengths are again recursively selected using the BIC with between 1 and 4 lags of x_t and between 0 and 4 lags of y_t. The class of non-linear forecasting models includes 18 Artificial Neural Network models with one and two hidden layers and different numbers of lags, p. Single hidden layer feedforward neural network models take the form

    y_{t+h} = β'_0 ζ_t + Σ_{i=1}^{n_1} γ_{1i} g(β'_{1i} ζ_t) + ε_{t+h},    (4)
    ζ_t = (1, y_t, y_{t-1}, ..., y_{t-p+1})',  p = 1, 2, 3,
where g(z) = 1/(1 + e^{-z}) is the logistic function. Neural network models with two hidden layers take the form

    y_{t+h} = β'_0 ζ_t + Σ_{j=1}^{n_2} γ_{2j} g( Σ_{i=1}^{n_1} β_{2ji} g(β'_{1i} ζ_t) ) + ε_{t+h}.    (5)

The choices for single hidden layer ANNs are n_1 = 1, 2, 3 and p = 1, 2, 3 for a total of nine basic models. The choices for two hidden layers are n_1 = 2, n_2 = 1, 2, 3 and p = 1, 2, 3, producing nine basic models and thus spanning many of the basic neural net designs, c.f. Swanson and White (1997).[3] We also consider 15 Logistic Smooth Transition Autoregression (LSTAR) models:

    y_{t+h} = α'_0 ζ_t + d_t β' ζ_t + ε_{t+h},    (6)
    d_t = 1 / (1 + exp(γ_0 + γ'_1 ξ_t)),
    ξ_t = (y_t, y_{t-1}, ..., y_{t-p+1})',  p = 1, 2, 3.

The LSTAR models differ by the variable used to define the threshold and by the lag length p. Finally, we consider time-varying autoregressions (TVARs). These are autoregressive models whose parameters are allowed to evolve according to a multivariate random walk:

    y_{t+h} = θ'_{t,h} ξ_t + ε_{t+h},    (7)
    θ_{t,h} = θ_{t-1,h} + u_{t,h},
    u_{t,h} ~ iid (0, λ² σ²_Q).

Here σ²_Q is the variance of u_{t,h}. We consider different values of λ and up to three lags for a total of 21 TVAR models, all of which are estimated by the Kalman filter.

[3] For all ANN models, coefficients were estimated by recursive non-linear least squares, minimizing the objective function by an ad-hoc algorithm developed by Stock and Watson.
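As a concrete illustration, the recursive BIC-based lag selection used for the linear autoregressions in equation (2) can be sketched as follows. This is a minimal sketch: the function and variable names are ours, and details such as the direct h-step regression layout are assumptions rather than the authors' code.

```python
import numpy as np

def fit_ar_bic(y, h=1, max_lag=4):
    """Select the AR lag length by BIC and return direct h-step forecast
    coefficients, a sketch of equation (2). Names are illustrative."""
    T = len(y)
    best = None
    for p in range(1, max_lag + 1):
        # Regress y_{t+h} on (1, y_t, ..., y_{t-p+1}) over all valid t.
        X = np.column_stack([np.ones(T - p - h + 1)] +
                            [y[p - 1 - j: T - h - j] for j in range(p)])
        target = y[p - 1 + h:]
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        n = len(target)
        sigma2 = resid @ resid / n
        # Gaussian BIC with p + 1 estimated mean parameters.
        bic = n * np.log(sigma2) + (p + 1) * np.log(n)
        if best is None or bic < best[0]:
            best = (bic, p, beta)
    return best[1], best[2]
```

In a recursive out-of-sample exercise this selection would be repeated each period on the expanding estimation sample, mimicking the paper's recursive scheme.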
2.3 Sorting Windows

For each country and model type (linear versus non-linear), j, and forecast horizon, h, we produce a ((T − h − T_0 + 1) × M_j) panel of forecasts

    | ŷ^(1)_{T,T-h}        ŷ^(2)_{T,T-h}        ...  ŷ^(M_j)_{T,T-h}       |
    | ŷ^(1)_{T-1,T-h-1}    ŷ^(2)_{T-1,T-h-1}    ...  ŷ^(M_j)_{T-1,T-h-1}   |
    | ...                  ...                  ...  ...                   |
    | ŷ^(1)_{T_0+h,T_0}    ŷ^(2)_{T_0+h,T_0}    ...  ŷ^(M_j)_{T_0+h,T_0}   |

where the superscript, i, tracks the model, i = 1, ..., M_j; M_j is the number of models for country/forecasting method j and h denotes the forecast horizon. We implement an automatic procedure to control for missing values and outliers to produce a balanced panel of forecasts. The h-period performance of the generic forecasting model (i) is denoted L^(i)_{t+h,t} ≡ L(y_{t+h}, ŷ^(i)_{t+h,t}). In line with common practice, we shall assume mean squared error or quadratic loss, i.e.

    L(y_{t+h}, ŷ_{t+h,t}) = e²_{t+h,t},    (8)

where e_{t+h,t} = y_{t+h} − ŷ_{t+h,t}. We track the historical forecasting performance over a window of the last win periods by computing S^(i)_t = (1/win) Σ_{τ=t−win+1}^{t} L^(i)_{τ,τ−h}. To study how persistent forecasting performance is through time, we consider three different tracking or sorting windows used to rank the forecasting models based on their historical performance:

1. Symmetric sorting and assessment window: win = 1: S^(i)_t = L(e^(i)_{t,t−h});
2. Asymmetric (rolling) sorting and assessment window: win = 20: S^(i)_t = (1/win) Σ_{τ=t−win+1}^{t} L^(i)_{τ,τ−h};
3. Expanding sorting window: S^(i)_t = (1/(t − h + 1)) Σ_{τ=h+1}^{t} L^(i)_{τ,τ−h}.

In each case the out-of-sample assessment window is based on forecasts for the period [T_0 + h; T] so that the forecasting performance is assessed using A^(i)_t = (1/(T − h − T_0 + 1)) Σ_{τ=T_0+h}^{T} L^(i)_{τ,τ−h}.
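The three sorting-window statistics S^(i)_t above can be sketched as follows, taking a (time × models) array of squared forecast errors as input; the function name and array layout are our assumptions.

```python
import numpy as np

def sorting_scores(sq_errors, t, window="expanding", win=20):
    """Compute the tracking statistic S_t^(i) for every model from a
    (time x models) array of squared forecast errors e^2_{tau,tau-h}.
    Sketch of the three sorting windows of Section 2.3; names are ours."""
    if window == "symmetric":
        # win = 1: only the most recent squared error.
        return sq_errors[t]
    if window == "rolling":
        # Asymmetric window: average over the last `win` periods.
        return sq_errors[t - win + 1: t + 1].mean(axis=0)
    # Expanding window: average over all losses available up to time t.
    return sq_errors[: t + 1].mean(axis=0)
```

The model ranking at time t is then obtained by sorting models on these scores, smallest MSFE first.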
2.4 Empirical Evidence

For each model, i, we record its rank at time t, R_it. The model with the best (MSFE) forecasting performance gets a rank of 1, the second best a rank of 2 and so on. Anticipating the analysis in Section 4, we sort the models into quartiles and use 4 × 4 contingency tables to cross-tabulate the forecasting models' sorting-period performance against their out-of-sample performance.[4] Tables 1-6 report empirical evidence on persistence in forecasting performance for the three sorting windows and linear (Tables 1-3) and nonlinear (Tables 4-6) models. Transition probability estimates, P̂_ij, in these tables give the probability of moving from quartile i (based on historical performance up to time t) to quartile j (based on future performance, e_{t+h,t}, averaged over the out-of-sample period [T_0 + h; T]). We show only the four corners of the tables (i.e., P_11, P_14, P_41, P_44) since these effectively convey information about persistence or anti-persistence in forecasting performance.

It is easy to benchmark the null of no persistence in these tables. If performance was purely random over time we would expect the transition probabilities to be close to 25%, so the probability of good (or bad) future performance should be unaffected by past performance. Persistent forecasting performance would lead to estimates of P_11 and P_44 above 0.25, while P_14 and P_41 (the probability that a historically good model becomes a bad future model or that a historically bad model produces good future forecasts) should be well below 0.25. Conversely, anti-persistence corresponds to low values of P_11 and P_44 and high values of P_14 and P_41. A chi-squared test statistic can easily be constructed for the estimated transition probabilities when h = 1. However, at longer horizons (h ≥ 2) the data is overlapping so the performance statistics are serially correlated.
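A minimal sketch of how such 4 × 4 contingency tables and transition probability estimates P̂_ij can be computed from sorting-period and out-of-sample ranks; the function name and quartile assignment rule are illustrative, not taken from the paper.

```python
import numpy as np

def transition_matrix(rank_in, rank_out, n_quartiles=4):
    """Cross-tabulate sorting-period quartiles against out-of-sample
    quartiles and estimate transition probabilities P_ij, in the spirit
    of the 4 x 4 contingency tables of Section 2.4."""
    M = len(rank_in)
    # Assign each model to a quartile 1..4 on the basis of its rank.
    q_in = np.ceil(np.asarray(rank_in) * n_quartiles / M).astype(int)
    q_out = np.ceil(np.asarray(rank_out) * n_quartiles / M).astype(int)
    P = np.zeros((n_quartiles, n_quartiles))
    for i, j in zip(q_in, q_out):
        P[i - 1, j - 1] += 1
    # Normalise each row so that the probabilities P[i, :] sum to one.
    row = P.sum(axis=1, keepdims=True)
    return np.divide(P, row, out=np.zeros_like(P), where=row > 0)
```

Under the no-persistence null every entry would be close to 0.25; the corner entries P[0, 0], P[0, 3], P[3, 0] and P[3, 3] correspond to P_11, P_14, P_41 and P_44.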
To assess the statistical significance in this situation, we therefore use a bootstrap to construct confidence intervals for the transition probability estimates, P̂_ij.

[4] We also considered rank-order statistics for forecasting performance and found similar results.

Several interesting results emerge from the tables. First, forecasting performance is generally persistent both for the linear and nonlinear models. Among the linear models, the
stayer probability estimate for the top quartile (P̂_11) is significantly greater than 25% in at least 26 out of 28 cases. Hence, there is robust evidence of persistence among the best linear forecasting models. Using an expanding sorting window, 21 out of 28 of the estimates of P_14 are significantly larger than 0.25, however, suggesting that the historically best performing models have an above-average chance of becoming the future worst models. This probability declines, however, as shorter sorting windows are used - only half of the P_14 estimates are above 0.25 under the asymmetric (rolling) and symmetric sorting windows in Tables 2 and 3. Interestingly, for the US under the expanding window P̂_14 > P̂_11 at all four horizons.

For the historically worst models, the degree of persistence is clearly inversely related to the length of the assessment window. Using an expanding window, 17 of 28 estimates of P_44 are significantly larger than 0.25, growing to 23 for the asymmetric (rolling) window and 28 for the symmetric window. Persistence among the worst models thus appears to be more local in time. There is also a lower chance that a historically bad model delivers good forecasting performance in the future than the probability that a historically good model becomes a poor future performer: for each sorting window the estimates of P_41 are only significantly higher than 0.25 in between six and nine of the 28 cases considered in Tables 1-3. The relative persistence among the top and bottom models also depends on whether an expanding or a short symmetric sorting window is used in the rankings. Under the expanding sorting window, in general P̂_11 > P̂_44 and persistence is strongest among the top models. In contrast, under the much shorter symmetric sorting window, P̂_44 > P̂_11, suggesting that the worst forecasting models have the strongest local persistence.
There is generally a much lower probability of crossings in the performance of the non-linear forecasting models and far fewer of the off-diagonal transition probability estimates are significant in Tables 4-6. At most four of 28 cases in each of these tables produce a significant value of P̂_41 while at most two of 28 cases of the P_14 estimates are significant. In contrast, irrespective of the forecast horizon, h, the estimated transition
probabilities for the top and bottom quartile models continue to exceed 0.25 at significant margins in at least 23 and 25 of 28 cases for the top and bottom quartiles, respectively. Among the non-linear forecasting models, the persistence tends to be highest among the bottom models, i.e. P̂_44 > P̂_11. This pattern is particularly strong for the measures of local persistence, i.e., when a symmetric sorting window is used.

We conclude the following from these findings. First, perhaps unsurprisingly, there is systematic evidence of persistence in forecasting performance both at the top and at the bottom end of the rankings. Second, there is unfortunately also a strong tendency for the previous best linear models to become future underperformers (in the sense that their out-of-sample MSFE performance belongs to the bottom quartile of models), particularly when an expanding sorting window is used. Third, as shorter and more recent sorting windows are used, the probability of crossings - the situation where a previous top quartile model becomes a future bottom quartile performer or vice versa - declines and the persistence in the linear forecasting models' performance grows, particularly at the bottom end. Fourth, there is in general stronger persistence in the forecasting performance of the non-linear models and a lower probability of crossings in these models' forecasting performance than was observed for the linear models.

2.5 Persistence Under a Stable Factor Structure

It is not surprising that we find persistence in the forecasting models' performance. This is in fact what one would expect if, as seems reasonable, the predictors and target variable share a common factor structure. It is more difficult to say how much persistence one would expect under this type of assumption since this will depend on the relative variability of the factors and the distribution of factor loadings across forecasting models.
To quantify how much persistence to expect under a stable factor model, we fitted a factor model of the form:

    X = F L + e,    (9)

where X is (T × k), F is (T × m), L is (m × k), and the number of factors, m, is less than the number of regressors, k. We estimated four factors F̂ and the loadings L̂ by applying principal component methods to the variance-covariance matrix of X. We then fit a VAR(1) model to the factors and simulate these over time:

    F̂_t = A F̂_{t-1} + u_t,    (10)

where u_t ~ N(0, Σ̂), Σ̂ is the estimated variance-covariance matrix of the VAR(1) residuals, and draws are generated using the Cholesky factor of Σ̂. This allows us to simulate factors and regressors by drawing innovations, η:

    x̂^i_t = F̂_t L̂ + η^i_t,  i = 1, ..., MC.    (11)

We set the number of simulations, MC = 1,000, and draw the η-values from an IIN(0,1) distribution. For each of the simulated series we re-estimate the linear and nonlinear forecasting models and use these to generate h-period forecasts. The resulting forecast errors are then used to compute transition probabilities. Tables 1-6 report 95% confidence intervals computed across the factor simulations in square brackets underneath the estimates for the actual data.

Many interesting results emerge from these simulations. First of all, unsurprisingly, the stable factor structure explains a considerable amount of the forecasting persistence found in the actual data. This can be seen by noting that the factor confidence intervals are generally not centered on 0.25 but tend to be above this value for P̂_11 and P̂_44 and below it for P̂_14 and P̂_41. Clearly, however, a stable factor structure cannot explain the observed degree of persistence in the data. Relative to what one would expect under a stable factor model, there is more persistence in the forecasts from the linear models at the longest horizons (h = 4 or h = 8). Furthermore, there is far more local persistence among the worst models' forecasting performance than what one would expect under the factor model. In particular, P̂_44 is outside the 95% confidence interval generated by the stable factor model in 18 out of 28 cases under the asymmetric (rolling) sorting window in Table 2 and in 26 of 28 cases under the symmetric (short) sorting window in Table 3.
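The stable-factor simulation just described - principal-component factors, a VAR(1) fitted to them, and regressors rebuilt with iid N(0,1) innovations - can be sketched as follows. This is a sketch under stated assumptions: the helper name and details such as centering, the use of the top m eigenvectors, and the seeding are ours, not the authors'.

```python
import numpy as np

def simulate_factor_panel(X, m=4, horizon=100, seed=0):
    """Sketch of the Section 2.5 simulation: extract m principal-component
    factors from X (T x k), fit a VAR(1) and simulate new regressors."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    # Principal components of the variance-covariance matrix of X.
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    L = eigvec[:, -m:]                 # loadings (k x m): top m eigenvectors
    F = Xc @ L                         # estimated factors (T x m)
    # VAR(1) on the factors: F_t = A F_{t-1} + u_t, fitted by least squares.
    A, *_ = np.linalg.lstsq(F[:-1], F[1:], rcond=None)
    U = F[1:] - F[:-1] @ A
    C = np.linalg.cholesky(np.cov(U, rowvar=False))
    # Simulate factors forward; rebuild regressors with iid N(0,1) noise.
    f = F[-1].copy()
    sim = np.empty((horizon, X.shape[1]))
    for t in range(horizon):
        f = f @ A + C @ rng.standard_normal(m)
        sim[t] = f @ L.T + rng.standard_normal(X.shape[1])
    return sim
```

Each of the MC simulated panels would then be fed through the same recursive forecasting exercise as the actual data to obtain the benchmark transition probabilities.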
Turning to the non-linear forecasting models, there is considerably more persistence among the top models sorted by means of the expanding window, particularly at long forecast horizons, than could be expected under a stable factor structure. For these nonlinear models
we once again see that the persistence patterns found in the actual data are particularly at odds with the factor simulations in the case of local persistence.

3 The Forecast Combination Problem

Suppose we are interested in forecasting the conditional mean of a univariate process, y, h periods ahead and that an N-vector of forecasts generated at time t, ŷ_{t+h,t} = (ŷ_{1t+h,t}, ŷ_{2t+h,t}, ..., ŷ_{Nt+h,t})', is available. The information set underlying each forecast need not be observable to the decision maker in the forecasting problem since the forecasts are taken as the primitive information. We will instead simply assume that the information set at time t, F_t, comprises the vector of current and past forecasts: F_t = {ŷ_{t+h,t}, ..., ŷ_{h+1,1}}. This assumption is quite general. For example, autoregressive specifications are included by equating individual forecasts with lagged values of y. Similarly, a constant can trivially be included as one of the forecasts so that the combination scheme allows for an intercept term, a strategy recommended by Granger and Ramanathan (1984) and - for a variety of loss functions - by Elliott and Timmermann (2002).

The general forecast combination problem can be posed as that of choosing a mapping g(ŷ_{t+h,t}; θ) from the set of predictions to the real line, Y_{t+h,t} → R, that best approximates the conditional expectation, E[y_{t+h} | ŷ_{t+h,t}]. This general class of combination schemes comprises non-linear and time-varying combination methods, but it is far more common to limit the analysis by assuming that g(.) is a linear function and choosing weights, {ω_{it,h}}_{i=1}^{N}, to produce a combined forecast, ŷ^c_{t+h,t}, of the form:

    ŷ^c_{t+h,t} = Σ_{i=1}^{N} ω_{it,h} ŷ_{it+h,t}.    (12)

The weights ω_{it,h} must obviously be measurable on F_t. This gives rise to the forecast error

    e^c_{t+h,t} = y_{t+h} − ŷ^c_{t+h,t}.    (13)
Following common practice, the forecaster is assumed to be endowed with a loss function that only depends on the forecast error, L(e_{t+h,t}). The period-t combination weights ω_{th} =
(ω_{1t,h}, ..., ω_{Nt,h})' can therefore be chosen to solve the problem

    ω_{th} = argmin_{ω_{th}} E[ L(e^c_{t+h,t}) | F_t ].    (14)

Under MSE loss, the combination weights are easy to characterize in population and only depend on the first two conditional moments of the joint distribution of y and ŷ: the vector (y_{t+h}, ŷ'_{t+h,t})' has mean (μ_{yt}, μ'_t)' and covariance matrix

    | σ²_{yt}   σ'_{21t} |
    | σ_{21t}   Σ_{22t}  |.

To keep the notation simple, we suppress the dependence of the joint distribution and moments on the forecast horizon, h. Assuming that the first forecast is simply a constant, the optimal (population) values of the constant and the vector of combination weights, ω^c_{0t} and ω_t, are

    ω^c_{0t} = μ_{yt} − ω'_t μ_t,
    ω_t = Σ_{22t}^{-1} σ_{21t}.    (15)

In general, the optimal combination weights depend on the full covariance matrix of forecasts. Sample estimates of these weights can be obtained by OLS regression of y_{t+h} on a constant and ŷ_{t+h,t}, c.f. Diebold and Pauly (1987). In cases with large N and small sample size, T, this estimation strategy is either not feasible or unlikely to perform well given the large number of parameters required to be estimated and the resulting large parameter estimation errors. Many empirical studies have found that it is difficult to produce more precise forecasts than those generated by simple equal-weighted combinations, c.f. Clemen (1989).

The forecast combination problem (14) is similar to that of minimizing the variance of a portfolio with the errors from the individual forecasts playing the role of asset returns. Intuition can in fact be gained from this analogy. Consider two forecasts whose forecast errors have variances σ²_1, σ²_2 with correlation ρ_12. Further suppose that both forecasts are unbiased so the combination weights of an unbiased forecast add up to one. The optimal combination weights for this case are

    ω_1 = σ_2 (σ_2 − ρ_12 σ_1) / (σ²_1 + σ²_2 − 2 ρ_12 σ_1 σ_2),
    ω_2 = 1 − ω_1.    (16)
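The sample analogue of the optimal weights in equation (15) - an OLS regression of the target on a constant and the individual forecasts, as in Diebold and Pauly (1987) - can be sketched as follows; the function name is ours.

```python
import numpy as np

def ols_combination_weights(y, yhat):
    """Estimate the optimal linear combination weights of equation (15)
    by OLS of the target on a constant and the N individual forecasts,
    the sample analogue of omega = Sigma_22^{-1} sigma_21 (sketch).

    y    : (T,) array of realized values
    yhat : (T, N) array of individual forecasts
    """
    X = np.column_stack([np.ones(len(y)), yhat])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]   # intercept omega_0 and weight vector omega
```

As the text notes, with large N and short samples these estimates are noisy, which is what motivates the pooling and shrinkage methods considered later.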
The optimal weights reflect the relative magnitude of the variance of the individual forecast errors in addition to their correlation. A special case arises when one model - e.g. the i-th model - has a much smaller forecast error variance and its forecast error is uncorrelated with that of other models, so that diversification gains are less likely to be important. In this case only a single model is selected so that

    ω_t = ξ_i,    (17)

where ξ_i is an N-vector with zeros everywhere except for unity in the i-th place. The opposite extreme arises when all forecasting models have forecast errors of (roughly) the same size. In this case, equal weights are likely to work well irrespective of the correlations between the forecast errors:

    ω_t = ι_N / N,    (18)

where ι_N is an N-vector of ones. In the special case where the individual forecast errors are uncorrelated so all off-diagonal elements of Σ_22 are zero, the optimal combination weights take the form of a Discounted MSFE (DMSFE) scheme and are inversely proportional to the past MSFE performance (S_it):

    ω_it = S_it^{-1} / ( Σ_{i=1}^{N} S_it^{-1} ).    (19)

We further consider combination schemes that use ranking information on past performance as a sufficient statistic determining the weight of a given forecasting model. Such weighting schemes are non-parametric in the sense that they belong to the class ω_it = f(R_it). The triangular kernel (TK) weighting scheme sets weights inversely proportional to the models' ranks:

    ω_it = R_it^{-1} / ( Σ_{i=1}^{N} R_it^{-1} ).    (20)
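The performance-based weighting schemes - the two-forecast weights of equation (16), the DMSFE weights of equation (19), the triangular kernel weights of equation (20), and the spread weights introduced below - can be sketched as follows; all function names are ours.

```python
import numpy as np

def two_forecast_weight(s1, s2, rho):
    """Optimal weight on forecast 1 for two unbiased forecasts with error
    standard deviations s1, s2 and correlation rho, equation (16)."""
    w1 = s2 * (s2 - rho * s1) / (s1**2 + s2**2 - 2 * rho * s1 * s2)
    return w1, 1.0 - w1

def dmsfe_weights(S):
    """Weights inversely proportional to past MSFE S_it, equation (19)."""
    inv = 1.0 / np.asarray(S, dtype=float)
    return inv / inv.sum()

def rank_weights(R):
    """Triangular kernel weights inversely proportional to rank R_it,
    equation (20)."""
    inv = 1.0 / np.asarray(R, dtype=float)
    return inv / inv.sum()

def spread_weights(ranks, alpha=0.25, omega_bar=0.5):
    """Spread combination weights of equation (21): the top models by rank
    get 1 + omega_bar, the bottom models get -omega_bar, the rest zero."""
    R = np.asarray(ranks, dtype=float)
    N = len(R)
    w = np.zeros(N)
    w[R <= alpha * N] = 1.0 + omega_bar
    w[R >= (1.0 - alpha) * N] = -omega_bar
    return w
```

Note that with equally skilled, uncorrelated forecasts (s1 = s2, rho = 0) the two-forecast rule reduces to equal weights, consistent with the discussion around equation (18).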
We also introduce a new set of spread combinations with weights

    ω_it = 1 + ω̄    if R_it ≤ αN,
    ω_it = 0        if αN < R_it < (1 − α)N,    (21)
    ω_it = −ω̄      if R_it ≥ (1 − α)N,

where α is the proportion of top models that - based on performance up to time t - get a weight of 1 + ω̄, while α is also the proportion of models that gets a weight of −ω̄. We set ω̄ = 0.5 and α = 0.25 or α = 0.50 and thus study spread combinations that put a weight of 1.5 on the top 50% or top 25% of all models and a weight of -0.5 on the bottom 50% or 25% of the models ranked by their previous forecasting performance.

4 Empirical Results

The evidence in Section 2 suggests that there is strong persistence in the forecasting performance of standard time-series models. The extent to which this evidence can be translated into improved out-of-sample forecasting performance is still an open issue, however. To address this question we study the out-of-sample forecasting performance for the combination strategies described above. Results from these experiments are provided in Table 7 (linear models) and Table 8 (nonlinear models). In each case we normalize the out-of-sample RMSFE-value of the previous best model at 1.00 and report the performance of all models relative to this benchmark. Values below 1 indicate that the previous best model does not generate the lowest RMSFE performance out-of-sample. For both sets of models there are four forecast horizons, three sorting windows and seven countries, yielding a total of 84 panels. Table 7 reports summary measures for the distribution of these (relative) RMSFE values across the 84 panels considered in our analysis. For each forecasting method we show the minimum, maximum, top and bottom 10% and 25%, mean and median value.

Every single forecast combination method produces a mean or median value below 1. This is clear evidence that, in general, a strategy of selecting the top model based on past
forecasting performance and then using this to compute forecasts out-of-sample does not work very well. This holds both for the linear and non-linear forecasting methods. For the linear models the best out-of-sample RMSFE performance is generated by the spread combinations and the trimmed mean based on the top quartile of models, which produce median values for the (relative) RMSFE below 0.85 compared to a median value of 0.91 for the average ("mean") combination. Furthermore, these two combination strategies produce the best forecasting performance in 32 and 33 of the 84 combinations of sorting windows, countries and forecast horizons listed in Table 7. Surprisingly, the mean forecast computed across all models - a combination strategy supported by many of the empirical results in Stock and Watson (1999) - is only best in two out of the 84 cases, less even than the eight cases generated by the previous worst quartile of models.

Comparing the out-of-sample forecasting performance of the models across the expanding, asymmetric and symmetric sorting windows, the best sorting window varies considerably across countries. For example, for the US the lowest out-of-sample MSFE values were generated by a model selected by an expanding window irrespective of horizon, whereas for the UK, a model selected by a symmetric window generated the smallest forecast errors. Overall, across countries and horizons, the top model is selected by the expanding, rolling and symmetric window in 7, 11 and 10 out of the 28 cases, respectively. Overall, the best strategy is thus to average over the models in the top quartile, followed by the two spread combinations and the triangular kernel. These methods all produce better results than the popular and frequently used methods of simply averaging across all models or using a single best model.
Turning to the results for the nonlinear models, Table 8 suggests that a broader set of models should be used for averaging when dealing with non-linear forecasting models. The lowest median RMSFE performance is produced by averaging over the top 75% of models. This is consistent with the evidence in Tables 4-6 of strong persistence in the performance of the worst models, so that it pays to exclude the bottom quartile of forecasts. The finding in these tables of persistence among the top quartile of forecasting models is further consistent with the good performance of the triangular kernel weighting scheme, which weights the
17 forecasts from each model inversely with their historical forecasting performance. Once again, the spread combinations perform quite well, producing the lowest RMSFE values in 22 out of 84 cases. The triangular kernel and DMSFE model also do quite well, both producing top performance in nine cases. 4.1 Statistical Significance of Forecasting Performance So far we have not evaluated the statistical significance of the out-of-sample forecasting performance of the models relative to the benchmark that selects the single best model based on historical forecasting performance. To do so we apply the performance test proposed by Diebold and Mariano (1995). For each combination strategy we regress its out-of-sample squared forecast errors relative to that of the benchmark model on an intercept and compute the t-statistic of the intercept coefficient using Newey-West heteroskedasticity and autocorrelation consistent standard errors. We then bootstrap the residuals from this regression and project this sequence on an intercept whose (pivotal) t-statistic is compared against the value produced by the actual data to evaluate its statistical significance under the null of equal forecasting performance (which is imposed in the bootstrap replications). Overall, we found only few cases where the results are statistically significant even under conventional critical values which there may be good reasons not to rely on. Such a finding is unsurprising in the light of the relative short out-of-sample period and the poor smallsample properties of out-of-sample tests for predictive ability documented in Chao, Corradi and Swanson (2001). For the linear models it is only at the longest horizon (h =8)for Japan and at the two shortest horizons for the spread combinations in the case of Canada that performance is systematically better than the benchmark. 
For the non-linear forecasts there are a few isolated cases - mainly indicating performance statistically worse than the benchmark - particularly at short horizons for Japan.
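The testing procedure described above can be sketched as follows: the Diebold-Mariano statistic is the HAC t-statistic on the intercept in a regression of the squared-error differential on a constant, and the bootstrap recentres that differential to impose the null of equal accuracy. This is a minimal sketch; the i.i.d. residual bootstrap below is a simplification of the resampling scheme (a block scheme would better respect the serial correlation induced at horizons h > 1), and the truncation lag is left as an input.

```python
import numpy as np

def newey_west_lrv(d, lags):
    """Newey-West (Bartlett-kernel) long-run variance of a series."""
    d = np.asarray(d, dtype=float)
    d = d - d.mean()
    T = len(d)
    lrv = np.dot(d, d) / T
    for j in range(1, lags + 1):
        gamma = np.dot(d[j:], d[:-j]) / T
        lrv += 2.0 * (1.0 - j / (lags + 1)) * gamma
    return lrv

def dm_tstat(e_model, e_bench, lags):
    """t-statistic on the intercept in a regression of the squared-error
    differential on a constant; this equals the Diebold-Mariano statistic."""
    d = np.asarray(e_model) ** 2 - np.asarray(e_bench) ** 2
    T = len(d)
    return float(np.mean(d) / np.sqrt(newey_west_lrv(d, lags) / T))

def dm_bootstrap_pvalue(e_model, e_bench, lags, B=999, seed=0):
    """Residual bootstrap under equal forecast accuracy: recentre the loss
    differential, resample it, and compare the bootstrap t-statistics with
    the one from the actual data (two-sided p-value)."""
    rng = np.random.default_rng(seed)
    d = np.asarray(e_model) ** 2 - np.asarray(e_bench) ** 2
    t_obs = dm_tstat(e_model, e_bench, lags)
    d0 = d - d.mean()                      # impose the null in the replications
    count = 0
    for _ in range(B):
        db = rng.choice(d0, size=len(d0), replace=True)
        tb = db.mean() / np.sqrt(newey_west_lrv(db, lags) / len(db))
        if abs(tb) >= abs(t_obs):
            count += 1
    return (count + 1) / (B + 1)
```

A negative t-statistic indicates that the combination strategy beats the benchmark, since the differential is the strategy's squared error minus the benchmark's.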
5 Conditional Pooling and Combination Strategies

Given the large number of forecasting models (N) relative to the number of time-series observations (T), it is generally not feasible to estimate optimal combination weights at the level of the individual forecasts. Furthermore, it is generally found in the empirical literature on forecasting performance that estimated combination weights tend to lead to worse forecasting performance than simple equal-weighted averages. For this reason we propose in this section a range of new (conditional) two-step combination strategies that pool the forecasting models by their historical forecasting performance in the first stage and then combine the mean forecasts computed for the selected clusters of models using a range of weighting schemes. One prominent example belonging to this class is the thick modeling approach proposed by Granger and Yeon (2001), which removes outliers from the set of forecasting models in a step preceding the equal weighting. Granger and Yeon argue that an advantage of thick modeling is that one no longer needs to worry about difficult decisions between close alternatives or about deciding the outcome of a test that is not decisive.

5.1 Combination of Forecasts from Pre-selected Quartiles

The first set of combination methods operates at the level of the quartile-sorted forecasts and uses information on the estimated transition probabilities to select which quartiles to include in the combination. Hence, the forecasting models are initially assigned to quartiles based on their historical forecasting performance up to the point of the prediction, t. For each quartile, a pooled (average) forecast is then computed. If the transition probability estimates (using information up to time t) suggest that a particular quartile produced better-than-average forecasts, then the average forecast from this quartile is included in the combination. The pooling gives us between one and four forecasts.
This number is small enough to let us consider estimating optimal combination weights by least squares methods, following the proposal of Diebold and Pauly (1987). We also consider shrinkage methods that shrink the least-squares estimates of the combination weights towards equal weights. As an extreme case, this includes simply using equal weights. As the number of forecasts (N) increases, Elliott (2002) established conditions under which the expected loss from averaging gets closer to the expected loss from using the optimal weights. In small samples, however, there may be gains from using shrinkage estimators of the type

$\hat{\omega}_t(\hat{\gamma}) = (1 - \hat{\gamma})\,\hat{\omega}_{OLS} + \hat{\gamma}\,\overline{\omega}$   (22)

Most shrinkage estimators shrink the least squares estimates of the weights, $\hat{\omega}_{OLS}$, towards the sample average, $\overline{\omega}$, and specify a function of the data for $\hat{\gamma}$, the parameter governing the amount of shrinkage. To set out the three-stage strategy, let $q^i_{t+h,t}$ be the $p \times 1$ vector containing the forecasts belonging to quartile $i = 1, 2, 3, 4$, where $p = N/4$ or the closest integer. We use the information contained in the quartile-by-quartile transition matrix to select quartiles deemed to produce precise forecasts as follows:

1. If $\hat{P}_{11} > \hat{P}_{14}$: include forecasts from all models in the top quartile.
2. If $\hat{P}_{21} + \hat{P}_{22} > \hat{P}_{23} + \hat{P}_{24}$: include forecasts from all models in the second quartile.
3. If $\hat{P}_{31} + \hat{P}_{32} > \hat{P}_{33} + \hat{P}_{34}$: include forecasts from all models in the third quartile.
4. If $\hat{P}_{41} > \hat{P}_{44}$: include forecasts from all models in the fourth quartile.

Let $I_i$ be an indicator taking the value 1 if the $i$th quartile is included and zero otherwise, while $\iota_p$ is a $p \times 1$ vector of ones. Then we consider three sets of combination weights applied to the forecasts pooled into quartiles, namely equal-weighted (QEW), optimally weighted (QOW) and shrinkage-weighted (QSW) combinations:

1. $QEW = \left( \sum_{i=1}^{4} I_i \right)^{-1} \sum_{i=1}^{4} I_i \, (\iota_p'/p) \, q^i_{t+h,t}$.

2. $QOW = \sum_{i=1}^{4} I_i \, \hat{\omega}_{it} \, (\iota_p'/p) \, q^i_{t+h,t}$, where $\hat{\omega}_{it}$ are estimates of the optimal combination weights for the included quartiles.

3. $QSW = \sum_{i=1}^{4} I_i \, \hat{s}_{it} \, (\iota_p'/p) \, q^i_{t+h,t}$, where $\hat{s}_{it}$ are shrinkage weights applied to the selected quartiles, computed as $\hat{s}_i = \lambda \hat{\omega}_i + (1 - \lambda) \left( \sum_{j=1}^{4} I_j \right)^{-1}$, $\lambda = \max\left\{ 0, \; 1 - \kappa \, \frac{\sum_{i=1}^{4} I_i}{T - \sum_{i=1}^{4} I_i} \right\}$, $\kappa = .25, .5, 1$.
If none of the quartiles passes the test in the first step, we set $QEW = QOW = QSW = \frac{1}{N} \sum_{i=1}^{N} y^i_{t+h,t}$ and average across all forecasting models.
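A minimal sketch of the quartile-selection and combination step, assuming the pooled quartile forecasts and the estimated transition matrix are already in hand. The treatment of the OLS weights (`omega_hat`, taken as given rather than estimated here) and the fallback when no quartile qualifies (the mean of the four quartile means, which equals the grand mean when the quartiles are equal-sized) are illustrative simplifications.

```python
import numpy as np

def select_quartiles(P):
    """Apply the inclusion rules to a 4x4 estimated transition matrix:
    a quartile is included when its models are judged likely to deliver
    better-than-average future performance."""
    I = np.array([
        P[0, 0] > P[0, 3],                      # top:    P11 > P14
        P[1, 0] + P[1, 1] > P[1, 2] + P[1, 3],  # second: P21+P22 > P23+P24
        P[2, 0] + P[2, 1] > P[2, 2] + P[2, 3],  # third:  P31+P32 > P33+P34
        P[3, 0] > P[3, 3],                      # bottom: P41 > P44
    ])
    return I

def quartile_combinations(q_means, I, omega_hat=None, kappa=0.5, T=100):
    """q_means: length-4 array of pooled (average) quartile forecasts.
    Returns (QEW, QSW); falls back to averaging all quartile means when
    no quartile passes the test."""
    if not I.any():
        g = float(q_means.mean())
        return g, g
    n_inc = int(I.sum())
    qew = float(q_means[I].mean())
    # shrink the (given) weights towards equal weights on the included quartiles
    lam = max(0.0, 1.0 - kappa * n_inc / (T - n_inc))
    if omega_hat is None:                       # no OLS weights supplied:
        omega_hat = np.full(4, 1.0 / n_inc)     # shrinkage reduces to equal weights
    s = lam * omega_hat + (1.0 - lam) / n_inc
    qsw = float(np.dot(s[I], q_means[I]))
    return qew, qsw
```

With a transition matrix whose top rows put most mass on the better quartiles, only the first two quartiles are included, and QEW is simply the average of their pooled forecasts.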
5.2 Clustering by k-means algorithm

The approach of sorting models into quartiles can be criticized for using arbitrary cut-off points. Two models with very similar in-sample forecasting performance may get assigned to different quartiles, so that one model gets included in the combination while the other does not. To deal with this problem, we propose to use k-means clustering algorithms to divide the models into a finite number of clusters. To motivate this approach, Figure 1 plots the in-sample MSFE performance (up to period $T_0$) against the out-of-sample forecasting performance across linear forecasting models, while Figure 2 does the same for the non-linear models. In general there is not much support for a simple monotonic or linear relationship between past and future forecasting performance. However, there is indication of performance clusters in some countries, notably France, Germany and Japan. There is also some evidence - e.g., for Italy and Japan - at the extreme end that the models with the very worst in-sample forecasting performance tend to generate the highest out-of-sample RMSFE values. This suggests trimming the worst models prior to forming forecasts. Suppose we identify $M$ clusters and denote by $y^k_{t+h,t}$ the $p_k \times 1$ vector containing the subset of forecasts belonging to cluster $k = 1, \ldots, M$, where cluster $k = 1$ contains the models with the lowest historical MSFE values. We consider the following conditional combination strategies:

1. $CPB = (\iota_{p_1}'/p_1) \, y^1_{t+h,t}$: select the cluster with the lowest in-sample MSFE values and use the simple mean of the forecasts of its elements;

2. $CEW = \frac{1}{M-1} \sum_{k=1}^{M-1} (\iota_{p_k}'/p_k) \, y^k_{t+h,t}$: exclude the worst cluster and apply equal weights to the forecasts from the top $M-1$ clusters;

3. $COW = \sum_{k=1}^{M} \hat{\omega}_k \, (\iota_{p_k}'/p_k) \, y^k_{t+h,t}$, where $\hat{\omega}_k$ are least-squares estimates of the optimal combination weights for the $M$ clusters;

4. $CSW = \sum_{k=1}^{M} \hat{s}_k \, (\iota_{p_k}'/p_k) \, y^k_{t+h,t}$, where $\hat{s}_k$ are the shrinkage weights for the $M$ clusters, computed as $\hat{s}_k = \lambda \hat{\omega}_k + (1 - \lambda) \frac{1}{M}$, $\lambda = \max\left\{ 0, \; 1 - \kappa \, \frac{n}{t - h - T_0 - n} \right\}$, $\kappa = .25, .5, 1$.
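The clustering stage can be sketched with a plain one-dimensional k-means on the historical MSFE values; in one dimension no clustering library is needed. The CPB and CEW rules then operate on the cluster means of the forecasts. Initialization, the iteration cap and tie-breaking below are illustrative choices, not the authors' implementation.

```python
import numpy as np

def kmeans_1d(x, M, iters=100, seed=0):
    """Plain k-means on a 1-D array of historical MSFE values. Returns labels
    relabelled so that cluster 0 has the lowest mean MSFE (the best models)."""
    rng = np.random.default_rng(seed)
    centers = np.sort(rng.choice(x, size=M, replace=False))
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        new = np.array([x[labels == m].mean() if (labels == m).any() else centers[m]
                        for m in range(M)])
        if np.allclose(new, centers):
            break
        centers = new
    order = np.argsort(centers)                 # best (lowest MSFE) cluster first
    relabel = np.empty(M, dtype=int)
    relabel[order] = np.arange(M)
    return relabel[labels]

def cluster_forecasts(forecasts, past_mse, M=2):
    """CPB: mean forecast of the best cluster. CEW: equal-weighted mean of the
    cluster averages, excluding the worst cluster."""
    labels = kmeans_1d(np.asarray(past_mse, dtype=float), M)
    f = np.asarray(forecasts, dtype=float)
    means = np.array([f[labels == m].mean() for m in range(M)])
    cpb = float(means[0])
    cew = float(means[:-1].mean()) if M > 1 else cpb
    return cpb, cew
```

When the MSFE values split cleanly into a good and a bad group, both CPB and CEW (with M = 2) reduce to the mean forecast of the good group, which is exactly the trimming behaviour the figures motivate.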
These methods require that part of the out-of-sample period be used to establish an initial ranking of the models. For each experiment we therefore use the first 20 out-of-sample observations for the initial sorting.

5.3 Empirical Results

Results from the three-stage combination strategies are presented in Tables 9 (linear models) and 10 (non-linear models). The forecasting results are not directly comparable to those in Tables 7-8, which use the longer out-of-sample period. To allow comparison with the earlier strategies, we report the results based on the previous best model, the top quartile and the simple equal-weighted average (mean) forecast. Several interesting findings emerge from Table 9, which summarizes the distribution of performance measures across countries, horizons and sorting windows. Both among the quartile combinations and among the clustering combinations, there are many strategies that outperform the best methods identified in Table 7 (TMB 25%) as well as the previous best model and the simple mean across models. Among the quartile combinations there are large gains from using shrinkage methods. Among the clustering methods we consider strategies based on M = 2, 3, i.e., using either two or three clusters. The best strategy appears to be a simple average across the top cluster (C3PB or C2PB) rather than combining models from the top cluster with those from the lower-ranked clusters. In general the results tend to be better for the combination methods based on two rather than three clusters. Furthermore, shrinkage towards equal weights across clusters also improves forecasting performance. Combination of forecasts tends to be more effective for the pooled quartile forecasts than for the clusters. This is easy to understand: forecast precision is unlikely to be as clearly divided across quartiles as it is across clusters, and combining forecasts from different quartiles leads to a diversification gain.
In contrast, once models have been clustered according to their past MSFE values, the difference between the top and second cluster's forecasting precision tends to be quite large and dominates the diversification gain from combining. To see how the performance of the various combination schemes varies across sorting windows and forecast horizons, we also considered some more disaggregated results than those
shown in Table 9. Under the expanding and asymmetric sorting windows (pooling results across forecast horizons and countries), shrinkage applied to the quartile portfolios worked best, while the forecast based on the top quartile was best under the symmetric (single-period) sorting window. Looking across forecast horizons, the top quartile gave the best performance for the short horizons (h = 1, 2), while the shrinkage methods applied to the quartiles and the forecasts from the top cluster of models were best for the longer horizons (h = 4, 8). Turning finally to the results for the non-linear models, Figure 2 indicates very different patterns in the relationship between in-sample (past) and out-of-sample (future) forecasting performance. Table 10 shows that once again shrinkage provides a reduction in the forecast errors, as does clustering, at least when the median forecasting performance is considered. The three-stage methods involving clustering, pooling, estimation of optimal combination weights and shrinkage towards equal weights continue to perform better than the mean forecast or the average forecast from the top quartile of models.

6 Conclusion

Forecast combination entails using information from a typically large set of forecasts and emerges as an attractive strategy when individual forecasting models are misspecified in a way that is unknown to the modeller. Misspecification is likely to be related not simply to functional form but also to instability in the joint distribution of forecasts and the target variable. In this situation, the identity of the best forecasting model is likely to change over time, and a key question is for how long the relative performance of forecasting models persists. This paper investigated the nature of persistence in forecasting performance across a large set of linear and nonlinear models used to forecast GDP growth in the G7 countries.
Much of the paper was exploratory since there is not, to our knowledge, any previous research on this question. We found strong evidence of persistence in forecasting performance: models that were in the top and bottom quartiles when ranked by their historical performance have a higher than average chance of remaining in the top and bottom quartiles, respectively, in the out-of-sample period. However, we also found systematic evidence of crossings, where the previous best models become the worst models in the future or vice versa, particularly among the linear forecasting models. Overall, the ranking of the forecasts tends to be more persistent for the non-linear models than for the linear models. We next explored a range of non-parametric combination schemes that use ranking information based on past forecasting performance and do not require estimation of forecast combination weights. We found evidence that a new set of spread combinations performs quite well. These assign a positive weight greater than one to the top models while the worst models get a negative weight, and the weights sum to one. There is also evidence that using the average forecast from the top quartile of models results in out-of-sample forecasting performance better than that produced by the average forecast or the previous best forecast. Finally, we proposed a range of new combination strategies that first pre-select or pool models into either quartiles or clusters on the basis of the distribution of past forecasting performance across models, then pool forecasts within each cluster, and finally estimate optimal combination weights that are shrunk towards equal weights. We find evidence in our data that these conditional combination strategies lead to better forecasting performance than simpler strategies in common use, such as using the previous best model or simply averaging across all forecasting models or a small subset of these.

References

[1] Aiolfi, M. and C. A. Favero, 2003, Model Uncertainty, Thick Modeling and the Predictability of Stock Returns. CEPR Discussion Paper No.
[2] Bates, J.M. and C.W.J. Granger, 1969, The Combination of Forecasts. Operations Research Quarterly 20.
[3] Chao, J., V. Corradi and N.R. Swanson, 2001, An Out of Sample Test for Granger Causality. Macroeconomic Dynamics 5.
[4] Clemen, R.T., 1989, Combining Forecasts: A Review and Annotated Bibliography. International Journal of Forecasting 5.
[5] Corradi, V. and N.R. Swanson, 2002, A Consistent Test for Nonlinear Out of Sample Predictive Accuracy. Journal of Econometrics 110.
[6] Diebold, F.X. and R. Mariano, 1995, Comparing Predictive Accuracy. Journal of Business and Economic Statistics 13.
[7] Diebold, F.X. and P. Pauly, 1987, Structural Change and the Combination of Forecasts. Journal of Forecasting 6.
[8] Elliott, G., 2002, Forecast Combination with Many Forecasts. Mimeo.
[9] Elliott, G. and A. Timmermann, 2002, Optimal Forecast Combinations under General Loss Functions and Forecast Error Distributions. Forthcoming in Journal of Econometrics.
[10] Giacomini, R. and H. White, 2003, Tests of Conditional Predictive Ability. UC San Diego Department of Economics Discussion Paper No. 09.
[11] Granger, C.W.J., 1989, Combining Forecasts - Twenty Years Later. Journal of Forecasting 8.
[12] Granger, C.W.J. and P. Newbold, 1986, Forecasting Economic Time Series, 2nd Edition. Academic Press.
[13] Granger, C.W.J. and R. Ramanathan, 1984, Improved Methods of Combining Forecasts. Journal of Forecasting 3.
[14] Granger, C.W.J. and Y. Yeon, 2001, Thick Models. Mimeo, UC San Diego Department of Economics.
[15] Sin, C.Y. and H. White, 1996, Information Criteria for Selecting Possibly Mis-specified Parametric Models. Journal of Econometrics 71.
[16] Stock, J.H. and M. Watson, 2001, A Comparison of Linear and Nonlinear Univariate Models for Forecasting Macroeconomic Time Series. Pages 1-44 in R.F. Engle and H. White (eds), Festschrift in Honour of Clive Granger.
[17] Stock, J.H. and M. Watson, 2003, Combination Forecasts of Output Growth in a Seven-Country Data Set. Working Paper, Princeton University and Harvard University.
[18] Swanson, N.R. and H. White, 1997, A Model Selection Approach to Real-Time Macroeconomic Forecasting Using Linear Models and Artificial Neural Networks. Review of Economics and Statistics 79.
[19] White, H., 1990, A Consistent Model Selection. In C.W.J. Granger (ed), Modeling Economic Series. Oxford University Press.
Table 1: Expanding Sorting Window, Linear Models. Each cell reports the corner probabilities of the quartile-by-quartile contingency table for the forecasting models' initial and subsequent h-period rankings. Transition probabilities $P_{ij}$ give the probability of moving from quartile $i$ (based on historical performance, $e_{t,t-h}$) to quartile $j$ (based on future performance, $e_{t+h,t}$). Significance at the 95% confidence level is indicated by a *, while square brackets report 95% confidence intervals computed across 1,000 simulations under a stable factor structure. Entries cover real GDP growth for the G7 countries (USA, UK, France, Germany, Japan, Canada and Italy) at horizons h = 1, 2, 4, 8.
[Table body omitted: only the bracketed 95% confidence intervals survive in this version; the point estimates of the corner transition probabilities are not recoverable.]
More informationNon-nested model selection. in unstable environments
Non-nested model selection in unstable environments Raffaella Giacomini UCLA (with Barbara Rossi, Duke) Motivation The problem: select between two competing models, based on how well they fit thedata Both
More informationUniversity of Pretoria Department of Economics Working Paper Series
University of Pretoria Department of Economics Working Paper Series Predicting Stock Returns and Volatility Using Consumption-Aggregate Wealth Ratios: A Nonlinear Approach Stelios Bekiros IPAG Business
More informationGlossary. The ISI glossary of statistical terms provides definitions in a number of different languages:
Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the
More informationAnalysis of Fast Input Selection: Application in Time Series Prediction
Analysis of Fast Input Selection: Application in Time Series Prediction Jarkko Tikka, Amaury Lendasse, and Jaakko Hollmén Helsinki University of Technology, Laboratory of Computer and Information Science,
More informationMonitoring Forecasting Performance
Monitoring Forecasting Performance Identifying when and why return prediction models work Allan Timmermann and Yinchu Zhu University of California, San Diego June 21, 2015 Outline Testing for time-varying
More informationForecast Combination when Outcomes are Di cult to Predict
Forecast Combination when Outcomes are Di cult to Predict Graham Elliott University of California, San Diego 9500 Gilman Drive LA JOLLA, CA, 92093-0508 May 0, 20 Abstract We show that when outcomes are
More informationOut-of-Sample Forecasting of Unemployment Rates with Pooled STVECM Forecasts
Out-of-Sample Forecasting of Unemployment Rates with Pooled STVECM Forecasts Costas Milas Department of Economics, Keele University Rimini Centre for Economic Analysis Philip Rothman Department of Economics,
More informationForecasting Euro Area Real GDP: Optimal Pooling of Information
Euro Area Business Cycle Network (EABCN) Workshop Using Euro Area Data: Issues and Consequences for Economic Analysis Cambridge, 26-28 March 2008 Hosted by University of Cambridge Forecasting Euro Area
More informationAugmenting our AR(4) Model of Inflation. The Autoregressive Distributed Lag (ADL) Model
Augmenting our AR(4) Model of Inflation Adding lagged unemployment to our model of inflationary change, we get: Inf t =1.28 (0.31) Inf t 1 (0.39) Inf t 2 +(0.09) Inf t 3 (0.53) (0.09) (0.09) (0.08) (0.08)
More informationA Simple Explanation of the Forecast Combination Puzzle Å
OXFORD BULLETIN OF ECONOMICS AND STATISTICS, 7, 3 (9) 35-949 doi:./j.468-84.8.54.x A Simple Explanation of the Forecast Combination Puzzle Å Jeremy Smith and Kenneth F. Wallis Department of Economics,
More informationNowcasting gross domestic product in Japan using professional forecasters information
Kanagawa University Economic Society Discussion Paper No. 2017-4 Nowcasting gross domestic product in Japan using professional forecasters information Nobuo Iizuka March 9, 2018 Nowcasting gross domestic
More informationInternal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.
Section 7 Model Assessment This section is based on Stock and Watson s Chapter 9. Internal vs. external validity Internal validity refers to whether the analysis is valid for the population and sample
More informationThis note introduces some key concepts in time series econometrics. First, we
INTRODUCTION TO TIME SERIES Econometrics 2 Heino Bohn Nielsen September, 2005 This note introduces some key concepts in time series econometrics. First, we present by means of examples some characteristic
More informationDarmstadt Discussion Papers in Economics
Darmstadt Discussion Papers in Economics The Effect of Linear Time Trends on Cointegration Testing in Single Equations Uwe Hassler Nr. 111 Arbeitspapiere des Instituts für Volkswirtschaftslehre Technische
More informationTopic 4 Unit Roots. Gerald P. Dwyer. February Clemson University
Topic 4 Unit Roots Gerald P. Dwyer Clemson University February 2016 Outline 1 Unit Roots Introduction Trend and Difference Stationary Autocorrelations of Series That Have Deterministic or Stochastic Trends
More informationForecasting. Bernt Arne Ødegaard. 16 August 2018
Forecasting Bernt Arne Ødegaard 6 August 208 Contents Forecasting. Choice of forecasting model - theory................2 Choice of forecasting model - common practice......... 2.3 In sample testing of
More information9) Time series econometrics
30C00200 Econometrics 9) Time series econometrics Timo Kuosmanen Professor Management Science http://nomepre.net/index.php/timokuosmanen 1 Macroeconomic data: GDP Inflation rate Examples of time series
More informationFaMIDAS: A Mixed Frequency Factor Model with MIDAS structure
FaMIDAS: A Mixed Frequency Factor Model with MIDAS structure Frale C., Monteforte L. Computational and Financial Econometrics Limassol, October 2009 Introduction After the recent financial and economic
More informationLECTURE 11. Introduction to Econometrics. Autocorrelation
LECTURE 11 Introduction to Econometrics Autocorrelation November 29, 2016 1 / 24 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists of choosing: 1. correct
More informationTesting for Regime Switching in Singaporean Business Cycles
Testing for Regime Switching in Singaporean Business Cycles Robert Breunig School of Economics Faculty of Economics and Commerce Australian National University and Alison Stegman Research School of Pacific
More informationECON3327: Financial Econometrics, Spring 2016
ECON3327: Financial Econometrics, Spring 2016 Wooldridge, Introductory Econometrics (5th ed, 2012) Chapter 11: OLS with time series data Stationary and weakly dependent time series The notion of a stationary
More informationInference in VARs with Conditional Heteroskedasticity of Unknown Form
Inference in VARs with Conditional Heteroskedasticity of Unknown Form Ralf Brüggemann a Carsten Jentsch b Carsten Trenkler c University of Konstanz University of Mannheim University of Mannheim IAB Nuremberg
More informationForecasting Euro Area Macroeconomic Variables using a Factor Model Approach for Backdating. Ralf Brüggemann and Jing Zeng
University of Konstanz Dep artment of Economics Forecasting Euro Area Macroeconomic Variables using a Factor Model Approach for Backdating Ralf Brüggemann and Jing Zeng Working Paper Series 2012 15 http://www.wiwi.uni
More informationFurther Evidence on the Usefulness of Real-Time Datasets for Economic Forecasting
http://www.aimspress.com/journal/qfe QFE, 1(1): 2-25 DOI: 10.3934/QFE.2017.1.2 Received: 29 January 2017 Accepted: 28 February 2017 Published: 10 April 2017 Research article Further Evidence on the Usefulness
More informationA radial basis function artificial neural network test for ARCH
Economics Letters 69 (000) 5 3 www.elsevier.com/ locate/ econbase A radial basis function artificial neural network test for ARCH * Andrew P. Blake, George Kapetanios National Institute of Economic and
More informationDYNAMIC ECONOMETRIC MODELS Vol. 9 Nicolaus Copernicus University Toruń Mariola Piłatowska Nicolaus Copernicus University in Toruń
DYNAMIC ECONOMETRIC MODELS Vol. 9 Nicolaus Copernicus University Toruń 2009 Mariola Piłatowska Nicolaus Copernicus University in Toruń Combined Forecasts Using the Akaike Weights A b s t r a c t. The focus
More informationWooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems
Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Functional form misspecification We may have a model that is correctly specified, in terms of including
More informationMultivariate Time Series Analysis and Its Applications [Tsay (2005), chapter 8]
1 Multivariate Time Series Analysis and Its Applications [Tsay (2005), chapter 8] Insights: Price movements in one market can spread easily and instantly to another market [economic globalization and internet
More informationMultivariate GARCH models.
Multivariate GARCH models. Financial market volatility moves together over time across assets and markets. Recognizing this commonality through a multivariate modeling framework leads to obvious gains
More informationA nonparametric test for seasonal unit roots
Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna To be presented in Innsbruck November 7, 2007 Abstract We consider a nonparametric test for the
More informationSimple robust averages of forecasts: Some empirical results
Available online at www.sciencedirect.com International Journal of Forecasting 24 (2008) 163 169 www.elsevier.com/locate/ijforecast Simple robust averages of forecasts: Some empirical results Victor Richmond
More informationForecasting Lecture 2: Forecast Combination, Multi-Step Forecasts
Forecasting Lecture 2: Forecast Combination, Multi-Step Forecasts Bruce E. Hansen Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Forecast Combination and Multi-Step Forecasts
More informationLecture 2: Univariate Time Series
Lecture 2: Univariate Time Series Analysis: Conditional and Unconditional Densities, Stationarity, ARMA Processes Prof. Massimo Guidolin 20192 Financial Econometrics Spring/Winter 2017 Overview Motivation:
More informationAPPLIED ECONOMETRIC TIME SERIES 4TH EDITION
APPLIED ECONOMETRIC TIME SERIES 4TH EDITION Chapter 2: STATIONARY TIME-SERIES MODELS WALTER ENDERS, UNIVERSITY OF ALABAMA Copyright 2015 John Wiley & Sons, Inc. Section 1 STOCHASTIC DIFFERENCE EQUATION
More informationTesting an Autoregressive Structure in Binary Time Series Models
ömmföäflsäafaäsflassflassflas ffffffffffffffffffffffffffffffffffff Discussion Papers Testing an Autoregressive Structure in Binary Time Series Models Henri Nyberg University of Helsinki and HECER Discussion
More informationEUROPEAN ECONOMY EUROPEAN COMMISSION DIRECTORATE-GENERAL FOR ECONOMIC AND FINANCIAL AFFAIRS
EUROPEAN ECONOMY EUROPEAN COMMISSION DIRECTORATE-GENERAL FOR ECONOMIC AND FINANCIAL AFFAIRS ECONOMIC PAPERS ISSN 1725-3187 http://europa.eu.int/comm/economy_finance N 249 June 2006 The Stacked Leading
More informationSteven Cook University of Wales Swansea. Abstract
On the finite sample power of modified Dickey Fuller tests: The role of the initial condition Steven Cook University of Wales Swansea Abstract The relationship between the initial condition of time series
More informationA Bootstrap Test for Causality with Endogenous Lag Length Choice. - theory and application in finance
CESIS Electronic Working Paper Series Paper No. 223 A Bootstrap Test for Causality with Endogenous Lag Length Choice - theory and application in finance R. Scott Hacker and Abdulnasser Hatemi-J April 200
More informationSelecting a Nonlinear Time Series Model using Weighted Tests of Equal Forecast Accuracy
Selecting a Nonlinear Time Series Model using Weighted Tests of Equal Forecast Accuracy Dick van Dijk Econometric Insitute Erasmus University Rotterdam Philip Hans Franses Econometric Insitute Erasmus
More informationFinQuiz Notes
Reading 9 A time series is any series of data that varies over time e.g. the quarterly sales for a company during the past five years or daily returns of a security. When assumptions of the regression
More informationFinancial Econometrics
Financial Econometrics Multivariate Time Series Analysis: VAR Gerald P. Dwyer Trinity College, Dublin January 2013 GPD (TCD) VAR 01/13 1 / 25 Structural equations Suppose have simultaneous system for supply
More informationGARCH Models. Eduardo Rossi University of Pavia. December Rossi GARCH Financial Econometrics / 50
GARCH Models Eduardo Rossi University of Pavia December 013 Rossi GARCH Financial Econometrics - 013 1 / 50 Outline 1 Stylized Facts ARCH model: definition 3 GARCH model 4 EGARCH 5 Asymmetric Models 6
More informationEconometrics. 9) Heteroscedasticity and autocorrelation
30C00200 Econometrics 9) Heteroscedasticity and autocorrelation Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Heteroscedasticity Possible causes Testing for
More informationNext, we discuss econometric methods that can be used to estimate panel data models.
1 Motivation Next, we discuss econometric methods that can be used to estimate panel data models. Panel data is a repeated observation of the same cross section Panel data is highly desirable when it is
More informationResearch Brief December 2018
Research Brief https://doi.org/10.21799/frbp.rb.2018.dec Battle of the Forecasts: Mean vs. Median as the Survey of Professional Forecasters Consensus Fatima Mboup Ardy L. Wurtzel Battle of the Forecasts:
More informationA Nonparametric Approach to Identifying a Subset of Forecasters that Outperforms the Simple Average
A Nonparametric Approach to Identifying a Subset of Forecasters that Outperforms the Simple Average Constantin Bürgi The George Washington University cburgi@gwu.edu Tara M. Sinclair The George Washington
More informationLATVIAN GDP: TIME SERIES FORECASTING USING VECTOR AUTO REGRESSION
LATVIAN GDP: TIME SERIES FORECASTING USING VECTOR AUTO REGRESSION BEZRUCKO Aleksandrs, (LV) Abstract: The target goal of this work is to develop a methodology of forecasting Latvian GDP using ARMA (AutoRegressive-Moving-Average)
More informationWhat s New in Econometrics? Lecture 14 Quantile Methods
What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression
More information