IB test for Direct versus Indirect approach in Seasonal Adjustment

Enrico Infante (1), Dario Buono (2), Adriana Buono (3)

(1) Eurostat, unit C: "National accounts methodology. Sector accounts. Financial indicators", e-mail: enrico.infante@gmail.com
(2) Eurostat, unit B: "Quality, Methodology and Research", e-mail: dario.buono@ec.europa.eu
(3) Università del Sannio, Department of Biological and Environmental Sciences, e-mail: adriana.buono000@gmail.com

Abstract

The seasonally adjusted series of an aggregate can be obtained by seasonally adjusting the aggregate itself ("direct approach") or by aggregating the seasonally adjusted individual series ("indirect approach"). The literature to date has mainly focused on a posteriori comparisons of the results achieved by applying the different approaches. Here a new a priori test (IB test) for choosing between the direct and the indirect approach in seasonal adjustment is proposed. The test is applied before running any seasonal adjustment procedure. When the individual series present common seasonal patterns the aggregate should be adjusted directly; otherwise an indirect approach could be preferred. Sections 3 and 4 include a simulation and a case study, respectively. This paper seeks to set out an a priori strategy for identifying the most effective seasonal adjustment approach.

Keywords: Seasonal adjustment; direct vs. indirect approach; moving seasonality.

Acknowledgements: We would like to thank all the people who helped us with their comments. Special thanks go to Robert Kirchner for his comments on how to build the test and for the idea of the cube, to Agustín Maravall for his comments on the direct and indirect approaches, and to Gian Luigi Mazzi, Riccardo Gatto, Jean Palate and David Stephen Pollock for commenting on earlier versions of this paper.
The views and opinions expressed in this paper are solely those of the authors and do not necessarily reflect those of the institutions for which they work.

1. Introduction

An aggregated time series $Y_t$ can be expressed as follows:

$$Y_t = f(X_{1t}, \ldots, X_{kt}, \ldots, X_{St})$$

where $X_{kt}$, $k = 1, \ldots, S$, are the individual series.

A special case is when $f$ is an additive function, which can be generalized as follows:

$$Y_t = \omega_1 X_{1t} + \ldots + \omega_k X_{kt} + \ldots + \omega_S X_{St} = \sum_{k=1}^{S} \omega_k X_{kt}$$

where $\omega_k$ is the weight of the k-th of the $S$ individual series. A reference example of this kind of aggregate is the European Union GDP, obtained as the sum of the GDPs of the 27 EU countries. To obtain seasonally adjusted figures, at least two different approaches can be applied (see Mazzi et al., 2001a, 2001b, for more details):

- Direct approach: the seasonally adjusted data are computed directly by seasonally adjusting the aggregate $Y_t$.
- Indirect approach: the seasonally adjusted data are computed indirectly by seasonally adjusting each series $X_{kt}$. The seasonally adjusted $Y_t$ is then given by the sum of the seasonally adjusted components.

A third option could be the mixed approach: if it is possible to define a criterion for separating the series into groups, creating sub-aggregates (e.g. series that have common seasonal patterns), then the seasonally adjusted figures can be computed by summing the seasonally adjusted data of these sub-aggregates.

The direct and the indirect approaches have been discussed for many years, and there is no consensus on which is the best approach. See, for instance, Maravall (2006) or Hood and Findley (2001). The main drawback of the direct approach is that there is no accounting consistency between the aggregate and the individual series. A controversial point with the direct approach is the so-called cancel-out effect: if there are two series with opposite patterns of seasonality, then the aggregated series will possibly show no seasonality, i.e. the aggregated series can show no seasonality even if all the individual series are seasonal. According to Maravall (2006), this is not a drawback.

The indirect approach also has some drawbacks. First of all, the presence of residual seasonality should always be carefully checked in all of the indirectly seasonally adjusted aggregates. Moreover, applying an indirect approach means working with a larger number of series, and in that case the computational burden can be substantial.

The numerical results obtained with the different approaches are usually close in terms of medium and long term evolution, but they can still diverge in terms of the signs of the growth rates in the short term. They are likely to coincide if the aggregate is an algebraic sum, the decomposition model is additive, there are no outliers and the filter used is the same for all the series. These conditions are rarely met with real data sets.

According to the ESS guidelines on seasonal adjustment (Eurostat, 2009), if the series $X_{kt}$ do not show similar seasonal patterns, indirect adjustment is preferred. Otherwise, if the series show common seasonal patterns and approximately the same timing in their peaks and troughs, the direct approach is preferred. In this case the aggregation will produce a smoother series with no loss of information on the seasonal patterns. The direct approach is preferred for transparency and accuracy, while the indirect approach is preferred for consistency.
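The additive aggregate and the cancel-out effect described above can be illustrated with a minimal sketch. The function name `aggregate`, the quarterly pattern and the levels 100 and 50 are illustrative assumptions, not values taken from this paper.

```python
def aggregate(series, weights):
    """Return Y_t = sum_k w_k * X_kt for a list of equally long series."""
    length = len(series[0])
    if any(len(x) != length for x in series):
        raise ValueError("all series must have the same length")
    return [sum(w * x[t] for w, x in zip(weights, series))
            for t in range(length)]


# Hypothetical quarterly seasonal pattern (illustrative values only).
pattern = [4.0, -6.0, 5.0, -3.0]
years = 3

# Two series with exactly opposite seasonal patterns around flat levels.
x1 = [100.0 + pattern[t % 4] for t in range(4 * years)]
x2 = [50.0 - pattern[t % 4] for t in range(4 * years)]

# Unweighted sum: the seasonal movements cancel out in the aggregate.
y = aggregate([x1, x2], [1.0, 1.0])
seasonal_range = max(y) - min(y)
print(y[:4], seasonal_range)
```

Both components are clearly seasonal, yet the aggregate is flat, which is the situation in which a direct adjustment of the aggregate would find nothing to remove.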

The aim of this paper is to propose a new test (Infante and Buono test, hereafter IB test), based on a three-way ANOVA model, in order to identify which series can be aggregated in groups and to define at which level the seasonal adjustment procedure should be run. The principal advantage of this test is that it gives information about the approach to follow before seasonally adjusting the series, so it can be considered an a priori method for choosing the approach. A first elaboration of this idea is in Buono and Infante (2012). The need for this kind of test is also stated in Cristadoro and Sabbatini (2000).

2. The IB test

The classical test for moving seasonality (Higginson, 1975) is based on a two-way ANOVA model, where the two factors are the time frequency (usually months or quarters) and the years. In order to test for the presence of moving seasonality between different series (not between the years of the same series, as in the classical moving seasonality test) we propose the IB test, which is based on a three-way ANOVA model. The three factors are the time frequency, the years and the series.

The tested variable in the classical test for moving seasonality is the absolute value of the final estimate of the unmodified Seasonal-Irregular differences, if the decomposition model is additive, or the absolute value of the Seasonal-Irregular ratios minus one, if the decomposition model is multiplicative (in X-12-ARIMA the Seasonal-Irregular series is presented in Table D8). As the IB test needs to be performed a priori (i.e. before running a seasonal adjustment procedure), it is not possible to use the Seasonal-Irregular differences (or ratios), as in the classical test for moving seasonality. Thus, to create the trend series $T_{kt}$, a Hodrick-Prescott filter is applied to each series $X_{kt}$.
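The detrending step can be sketched as follows. This is a minimal, standard-library implementation of the Hodrick-Prescott trend, obtained by solving $(I + \lambda D'D)\tau = y$ with $D$ the second-difference operator; it is not the authors' code. The paper does not state the smoothing parameter, so the value $\lambda = 1600$ used in the example (a common convention for quarterly data) is an assumption, as are the function names `hp_trend` and `detrend` and the toy series.

```python
def hp_trend(y, lam):
    """Hodrick-Prescott trend: solve (I + lam * D'D) tau = y."""
    n = len(y)
    if n < 3:
        raise ValueError("need at least three observations")
    # Build A = I + lam * D'D, where each row of D is (1, -2, 1).
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0
    w = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for p in range(3):
            for q in range(3):
                a[r + p][r + q] += lam * w[p] * w[q]
    # Gaussian elimination with partial pivoting on the dense system.
    b = [float(v) for v in y]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            if f != 0.0:
                for c in range(col, n):
                    a[r][c] -= f * a[col][c]
                b[r] -= f * b[col]
    # Back substitution.
    tau = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = b[r] - sum(a[r][c] * tau[c] for c in range(r + 1, n))
        tau[r] = s / a[r][r]
    return tau


def detrend(y, lam):
    """Return the series minus its Hodrick-Prescott trend."""
    trend = hp_trend(y, lam)
    return [v - t for v, t in zip(y, trend)]


# Toy quarterly series: linear trend plus a hypothetical seasonal pattern.
pattern = [4.0, -6.0, 5.0, -3.0]
x = [100.0 + 0.5 * t + pattern[t % 4] for t in range(40)]
i_series = detrend(x, 1600.0)  # lam = 1600 is an assumed, conventional value
print([round(v, 2) for v in i_series[:4]])
```

A dense solver is used only to keep the sketch self-contained; for long series a banded solver or an existing implementation in a statistical library would be the more natural choice.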
Then the proposed tested variable is obtained by subtracting the trend series from the original one:

$$I_{ijk} = X_{ijk} - T^{HP}_{ijk}$$

We keep the notation $I$ to remark that it is a de-trended series. As such, the tested variable is a three-dimensional array (cube), where the rows correspond to the i-th time frequency, the columns to the j-th year, and the depth to the k-th series. The test is performed only on the part of the time series that covers entire years.

The model is specified as follows:

$$I_{ijk} = a_i + b_j + c_k + e_{ijk}$$

This equation implies that the value $I_{ijk}$ represents the sum of:

- A term $a_i$, $i = 1, \ldots, M$, representing the numerical contribution due to the effect of the i-th time frequency (usually $M = 12$, for monthly series, or $M = 4$, for quarterly series).
- A term $b_j$, $j = 1, \ldots, N$, representing the numerical contribution due to the effect of the j-th year.
- A term $c_k$, $k = 1, \ldots, S$, representing the numerical contribution due to the effect of the k-th series of the aggregate.
- A residual component term $e_{ijk}$, assumed to be normally distributed with zero mean, constant variance and zero covariance. It represents the effect, on the values of $I_{ijk}$, of the whole set of factors not explicitly taken into account in the model.

The test is based on the decomposition of the variance of the observations:

$$S^2 = S^2_M + S^2_N + S^2_S + S^2_R$$

Denoting by

$$\bar{x}_{\cdot\cdot\cdot} = \frac{1}{MNS} \sum_{i=1}^{M} \sum_{j=1}^{N} \sum_{k=1}^{S} I_{ijk}$$

the general mean, by

$$\bar{x}_{i\cdot\cdot} = \frac{1}{NS} \sum_{j=1}^{N} \sum_{k=1}^{S} I_{ijk}$$

the $M$ time frequency means, by

$$\bar{x}_{\cdot j\cdot} = \frac{1}{MS} \sum_{i=1}^{M} \sum_{k=1}^{S} I_{ijk}$$

the $N$ yearly means and by

$$\bar{x}_{\cdot\cdot k} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} I_{ijk}$$

the $S$ series means, we can compute:

$$S^2_M = \frac{NS}{M-1} \sum_{i=1}^{M} (\bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot\cdot\cdot})^2$$

the between time frequencies variance. It is the effect that measures the magnitude of the seasonality.

$$S^2_N = \frac{MS}{N-1} \sum_{j=1}^{N} (\bar{x}_{\cdot j\cdot} - \bar{x}_{\cdot\cdot\cdot})^2$$

the between years variance. It is the effect that measures the movement of the seasonality within the same series.

$$S^2_S = \frac{MN}{S-1} \sum_{k=1}^{S} (\bar{x}_{\cdot\cdot k} - \bar{x}_{\cdot\cdot\cdot})^2$$

the between series variance. It is the effect that measures the movement of the seasonality between different series.

$$S^2_R = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} \sum_{k=1}^{S} (I_{ijk} - \bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot j\cdot} - \bar{x}_{\cdot\cdot k} + 2\bar{x}_{\cdot\cdot\cdot})^2}{(MNS-1)-(M-1)-(N-1)-(S-1)}$$

the residual variance.

The null hypothesis is therefore the following:

$$H_0: c_1 = c_2 = \ldots = c_S$$

When $H_0$ is not rejected, there is no evidence of a change in seasonality across the series (i.e. we cannot exclude that the series have common seasonal patterns) and a direct approach is recommended. Under the null hypothesis, the test statistic follows a Fisher-Snedecor distribution with $(S-1)$ and $(MNS-1)-(M-1)-(N-1)-(S-1)$ degrees of freedom and can be written as:

$$F_S = \frac{S^2_S}{S^2_R}$$
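The computation of the test statistic from the cube can be sketched as below. This is a minimal illustration of the between-series and residual variances and of their ratio, not the authors' implementation; the function name `ib_test` and the tiny 2 x 2 x 2 cube are hypothetical. The p value is not computed here, since the standard library has no F distribution; the statistic would be compared with a critical value obtained elsewhere.

```python
def ib_test(cube):
    """Three-way ANOVA statistic for the series effect.

    cube[i][j][k] is the de-trended value for time frequency i,
    year j and series k.
    """
    M, N, S = len(cube), len(cube[0]), len(cube[0][0])
    cells = [(i, j, k) for i in range(M) for j in range(N) for k in range(S)]
    grand = sum(cube[i][j][k] for i, j, k in cells) / (M * N * S)
    freq_means = [sum(cube[i][j][k] for j in range(N) for k in range(S)) / (N * S)
                  for i in range(M)]
    year_means = [sum(cube[i][j][k] for i in range(M) for k in range(S)) / (M * S)
                  for j in range(N)]
    series_means = [sum(cube[i][j][k] for i in range(M) for j in range(N)) / (M * N)
                    for k in range(S)]

    # Between-series variance (mean square of the series effect).
    df_series = S - 1
    ms_series = M * N * sum((m - grand) ** 2 for m in series_means) / df_series

    # Residual variance of the additive three-way model.
    df_resid = (M * N * S - 1) - (M - 1) - (N - 1) - (S - 1)
    ss_resid = sum(
        (cube[i][j][k] - freq_means[i] - year_means[j]
         - series_means[k] + 2.0 * grand) ** 2
        for i, j, k in cells
    )
    ms_resid = ss_resid / df_resid

    return {
        "F_S": ms_series / ms_resid,
        "df": (df_series, df_resid),
        "ms_series": ms_series,
        "ms_resid": ms_resid,
    }


# Hypothetical cube: a series effect of -1 / +1 plus a pattern that is
# orthogonal to all three main effects and therefore ends up in the residual.
c = [-1.0, 1.0]
cube = [[[c[k] + (-1) ** (i + j + k) for k in range(2)]
         for j in range(2)] for i in range(2)]
result = ib_test(cube)
print(result)  # F_S = 4.0 with (1, 4) degrees of freedom
```

In the toy cube the series means are -1 and +1, so the between-series variance is 8, the residual sum of squares is 8 on 4 degrees of freedom, and the ratio is 4.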

Rejecting the null hypothesis means that the direct approach should be avoided and the indirect one should be taken into consideration. If the null hypothesis is rejected, one option is to run the test on subgroups of the series, in order to find which ones present similar seasonal movements. Once the subgroups of series with common seasonal patterns are determined, a mixed approach could be used.

3. Simulation study

The purpose of the simulations is to check whether the IB test is robust enough to be used systematically when dealing with large data sets (as is the case for large organizations such as Eurostat). The simulations are executed on both monthly and quarterly series.

As regards the monthly data, the IB test is performed on matrices composed of three series with a length of ten years. This means that, following the given notation, $M = 12$, $N = 10$ and $S = 3$. The most common time series model is the so-called airline model, that is to say an ARIMA(0,1,1)(0,1,1). Thus, each of the three series follows an airline model, to which 100 is added, as Eurostat usually deals with index data. In addition, a residual term, normally distributed with zero mean and variance equal to one, is added to each series. The MA coefficients are between -1 and 1, so that the process is invertible. These coefficients have been generated using a uniform distribution.

In order to simulate the seasonal peaks, three different groups have been created. In each group (named A, B and C) a different seasonality scheme is used, as shown in Table 1. Each group represents a case of common seasonal patterns.
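The data-generating process described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' simulation code: pre-sample values are set to zero, no burn-in is used, the seasonal scheme is added as a fixed monthly offset, and the scheme values, the seed handling and the function names `airline_path` and `simulate_series` are hypothetical.

```python
import random


def airline_path(eps, theta, big_theta, period):
    """Integrate (1 - B)(1 - B^s) z_t = (1 - theta B)(1 - big_theta B^s) eps_t.

    Pre-sample values of z and eps are taken to be zero.
    """
    def lag(seq, t, k):
        return seq[t - k] if t - k >= 0 else 0.0

    z = []
    for t in range(len(eps)):
        # Moving-average part of the airline model.
        w = (eps[t]
             - theta * lag(eps, t, 1)
             - big_theta * lag(eps, t, period)
             + theta * big_theta * lag(eps, t, period + 1))
        # Undo the regular and the seasonal difference.
        z.append(lag(z, t, 1) + lag(z, t, period) - lag(z, t, period + 1) + w)
    return z


def simulate_series(scheme, n_years, seed):
    """Airline model + 100 + seasonal scheme + N(0, 1) residual term."""
    rng = random.Random(seed)
    period = len(scheme)
    n = n_years * period
    theta = rng.uniform(-1.0, 1.0)      # regular MA coefficient
    big_theta = rng.uniform(-1.0, 1.0)  # seasonal MA coefficient
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = airline_path(eps, theta, big_theta, period)
    return [100.0 + z[t] + scheme[t % period] + rng.gauss(0.0, 1.0)
            for t in range(n)]


# Hypothetical monthly seasonality scheme (not one of the paper's groups).
scheme = [-10, -8, 4, 2, 6, 8, 5, -30, 9, 7, 5, -8]
group = [simulate_series(scheme, 10, seed) for seed in (1, 2, 3)]
print(len(group), len(group[0]))
```

Three series generated with the same scheme play the role of a same-group combination; using a different scheme for one of them would mimic a mixed combination.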
Table 1. Groups with different seasonality schemes for simulations (monthly series)

Groups  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
A       -5   -6   +7   0    +0   +8   +6   -36  +7   +    +6   -9
B       -30  -5   -4   -6   -4   +4   -3   +    +4   +6   +3   +3
C       -    -    +3   +5   +6   +6   -44  +    +8   +6   +    -

With this template, the expectation is that when the IB test is performed on series of the same group, the p value is high and $F_S$ is low, so that the null hypothesis is accepted. On the other hand, when the IB test is performed on series of different groups, the p value is low and $F_S$ is high, so that the null hypothesis is rejected.

The simulations are executed on all the combinations of the three groups. For each combination the test is performed 1000 times on the simulated series. Table 2 reports, for each combination, the average $F_S$, the bands of the confidence interval on the average $F_S$ and the number of times the null hypothesis has been accepted at the 90% and 95% confidence levels. For a better understanding of Table 2, it should be noted that, under the conditions explained above, the threshold values of $F_S$ for accepting the null hypothesis at the 90% and 95% levels are 2.3184 and 3.0225, respectively.

Table 2. Results of simulations with different combinations of monthly series

Combinations  average F_S  lower band  upper band  No. H0 accepted    No. H0 accepted
                                                   (p value > 0.10)   (p value > 0.05)
AAA           0.848        0.7957      0.9005      97                 966
BBB           0.444        0.4095      0.475       987                996
CCC           0.90         0.864       0.9790      99                 96
ABC           5.6595       5.65        5.7038      0
AAB           4.9          4.08        4.9         0                  0
ABB           3.305        3.6         3.393       0                  0
AAC           3.940        3.587       3.93        47                 399
ACC           3.986        3.69        3.343       53                 394
BBC           6.976        6.958       7.065       0                  0
BCC           5.53         5.485       5.5774      0                  0

In the first three combinations the test works well, showing, as expected, that a direct approach is the better solution. The average $F_S$ is always much lower than the threshold values. The results of the simulations with all three different groups are also good. As there are three different seasonality schemes, the indirect approach is preferred. The average $F_S$ is higher than the threshold values.

As regards the other combinations, the test works well when the schemes involved are A and B or B and C. For the combinations that involve groups A and C, the test hesitates at the 95% level. This is mainly because groups A and C are not very different: except for the two summer months, they always have the same sign. For this reason we recommend choosing the 90% level, as is usually done for the moving seasonality test.

Regarding the quarterly data, the IB test is performed on matrices composed of three series with a length of ten years. This means that, following the given notation, $M = 4$, $N = 10$ and $S = 3$. The series are treated as before. Thus each series follows an (invertible) airline model, to which 100 and a residual term are added. The schemes for creating the seasonal peaks are shown in Table 3.

Table 3. Groups with different seasonality schemes for simulations (quarterly series)

Groups  Qrt1  Qrt2  Qrt3  Qrt4
A       -4    +6    -5    +3
B       -3    -     +8    +6
C       +3    +9    -8    -4

As for the monthly series, the IB test is performed 1000 times on the simulated series for each combination. The results are shown in Table 4.
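The threshold values used in this section can be checked with a short calculation. With three series the numerator of the test statistic has two degrees of freedom, and in that special case the quantile of the Fisher-Snedecor distribution has a closed form, $F_{p}(2, d) = \frac{d}{2}\left[(1-p)^{-2/d} - 1\right]$. The sketch below relies on this identity rather than on a statistical library; the function names are illustrative assumptions.

```python
def residual_df(M, N, S):
    """Residual degrees of freedom of the three-way additive model."""
    return (M * N * S - 1) - (M - 1) - (N - 1) - (S - 1)


def f_critical_df1_2(confidence, df2):
    """Critical value of an F(2, df2) distribution.

    Uses the closed form available when the numerator has exactly
    two degrees of freedom: P(F > x) = (1 + 2x/df2) ** (-df2/2).
    """
    return (df2 / 2.0) * ((1.0 - confidence) ** (-2.0 / df2) - 1.0)


# Simulation designs: three series, ten years, monthly or quarterly data.
designs = {"monthly": residual_df(12, 10, 3), "quarterly": residual_df(4, 10, 3)}
thresholds = {
    name: (f_critical_df1_2(0.90, df2), f_critical_df1_2(0.95, df2))
    for name, df2 in designs.items()
}
for name, (t90, t95) in thresholds.items():
    print(name, designs[name], round(t90, 4), round(t95, 4))
```

For a different number of series the numerator degrees of freedom change and the closed form no longer applies; a general F quantile function from a statistical library would then be needed.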
For a better understanding of Table 4, it should be noted that, under the conditions explained above, the threshold values of $F_S$ for accepting the null hypothesis at the 90% and 95% levels are 2.3538 and 3.0829, respectively.

Table 4. Results of simulations with different combinations of quarterly series

Combinations  average F_S  lower band  upper band  No. H0 accepted    No. H0 accepted
                                                   (p value > 0.10)   (p value > 0.05)
AAA           0.5486       0.508       0.5864      980                99
BBB           0.46         0.398       0.4543      994                998
CCC           0.6799       0.6348      0.750       964                986
ABC           3.78         3.663       3.900       0                  0
AAB           33.44        3.979       33.308      0                  0
ABB           35.590       3.436       3.744       0                  0
AAC           8.9708       8.7439      9.977       5                  3
ACC           8.9460       8.70        9.699       8
BBC           6.057        5.967       6.48        0                  0
BCC           8.97         8.7059      9.396       3                  5

In all the combinations the test performs well, not rejecting the null hypothesis for the combinations with series of the same group, and rejecting it for the combinations with series from different groups.

4. Case Study

A real data example is used to illustrate the proposed test. The data, taken from the Eurostat database (http://epp.eurostat.ec.europa.eu/portal/page/portal/statistics/search_database, September 2012), consist of the quarterly national accounts aggregates by branch. We considered the European big four (France, Germany, Italy and Spain) aggregates for two branches: agriculture, forestry and fishing; industry (excluding construction). The time span is from 2001 to 2011 and the series are quarterly, so that $M = 4$, $N = 11$ and $S = 4$. The results of the test are reported in Table 5.

Table 5. Results of the test for the big four (France, Germany, Italy and Spain)

Agriculture, forestry and fishing
             df    Sum Sq.   Mean Sq.
Time freq.   3     907       969.
Year         10    374       37.4
Series       3     40884     368.
Residual     159   37        77.0
F_S = 77.0747, p value = 0.0000

Industry (excluding construction)
             df    Sum Sq.   Mean Sq.
Time freq.   3     55.90     18.635
Year         10    53.7      5.37
Series       3     19.38     6.459
Residual     159   1212.98   7.629
F_S = 0.8466, p value = 0.4703

For the series on agriculture, forestry and fishing $F_S$ is very high, and consequently the p value is very low. The null hypothesis is rejected, which means that the series do not present common seasonal patterns. For adjusting the big four aggregate, the indirect approach is suggested.

In the case of industry (excluding construction) $F_S$ is low. Consequently the p value is higher than 0.10 and the null hypothesis is accepted. The series present similar seasonal patterns and a direct approach is suggested for seasonally adjusting the aggregate.
As shown in the tables, the results of the test are easy to read, and the test is based on a well-known model. It should be remarked that no seasonal adjustment procedure has been performed, as the test is a priori.

5. Future research lines

Future research lines have already been identified, including:

- Large scale applications: there is a need for practical feedback on the performance of the test.
- Seasonal co-movements test (Centoni and Cubbadda, 2011): for benchmarking reasons, the seasonal co-movements test could be used.
- Outliers: a detailed study of how the presence of outliers affects the test performance.
- JDemetra+: in order to facilitate the use of the test, it could be added as a module to the upcoming Java version of the software Demetra+.

References

Buono, D. (2005). Statistical Analysis of NMS and CCs GDPs: outlier detection, seasonal adjustment and cycle extraction. International Journal of Applied Econometrics and Quantitative Studies, Euro-American Association of Economic Development, October 2004, vol. -.

Buono, D. and Infante, E. (2012). New innovative 3-way ANOVA a-priori test for direct vs. indirect approach in Seasonal Adjustment. Euroindicators working papers, Luxembourg.

Busetti, F. and Harvey, A. (2003). Seasonality Tests. Journal of Business and Economic Statistics, Vol. 21, No. 3, 420-436.

Bušs, G. (2009). Comparing forecasts of Latvia's GDP using Simple Seasonal ARIMA models and Direct versus Indirect Approach. Munich Personal RePEc Archive.

Centoni, M. and Cubbadda, G. (2011). Modelling Comovements of Economic Time Series: A Selective Survey. Centre for Economics and International Studies, Research Paper Series 9, Tor Vergata.

Cohen, B. (2007). Three-way ANOVA, in: Explaining Psychological Statistics (3rd ed.). John Wiley & Sons, New York, pp. 688-746.

Cristadoro, R. and Sabbatini, R. (2000). The Seasonal Adjustment of the Harmonised Index of Consumer Prices for the Euro Area: a Comparison of Direct and Indirect Method. Banca d'Italia.

Eurostat (2009). ESS guidelines on seasonal adjustment. European Communities, Luxembourg.

Findley, D. and Hood, C. (1999). X-12-ARIMA and its Application to Some Italian Indicator Series. US Bureau of the Census.

Ghysels, E. and Osborn, D.R. (2001). The Econometric Analysis of Seasonal Time Series. Cambridge University Press.

Granger, W.J. (1979). Seasonality: causation, interpretation, and implication, Arnold Zellner ed.

Harvey, A. and Trimbur, T. (2008). Trend Estimation and the Hodrick-Prescott Filter. J. Japan Statist. Soc., Vol. 38, no. 1, 41-49.

Higginson, J. (1975). An F Test for the Presence of Moving Seasonality When Using Census Method II-X-11 Variant. Statistics Canada.

Hindrayanto, I. (2004). Seasonal adjustment: direct, indirect or multivariate method? Aenorm, No. 43.

Hood, C. and Findley, D. (2001). Comparing Direct and Indirect Seasonal Adjustment of Aggregate Series. ASA Proceedings, Washington.

Maravall, A. (2006). An application of the TRAMO-SEATS automatic procedure; direct versus indirect approach. Computational Statistics & Data Analysis 50, 2167-2190.

Maravall, A. and Pérez, D. (2011). Applying and interpreting Model-Based Seasonal Adjustment: The Euro-Area Industrial Production Series. Documentos de Trabajo, no. 1116, Banco de España.

Mazzi, G.L., Astolfi, R. and Ladiray, D. (2001a). Business cycle extraction of Euro-zone GDP: direct versus indirect approach. European Communities, Luxembourg.

Mazzi, G.L., Astolfi, R. and Ladiray, D. (2001b). Seasonal Adjustment of European Aggregates: Direct versus Indirect Approach. European Communities, Luxembourg.

Otranto, E. and Triacca, U. (2000). A Distance-based Method for the Choice of Direct or Indirect Seasonal Adjustment. Istituto Nazionale di Statistica, Roma.

Rau, R. (2006). NPST: A Generalized Non-Parametric Seasonality Test. European Population Conference, Liverpool.

Robinson, S. (2004). Simulation: The Practice of Model Development and Use. John Wiley & Sons, New York.

Sutradhar, B.C. and Dagum, E.B. (1998). Bartlett-type modified test for moving seasonality with applications. The Statistician 47, Part 1.