BOOTSTRAP BIAS CORRECTION IN SEMIPARAMETRIC ESTIMATION METHODS FOR ARFIMA MODELS

A Pesquisa Operacional e os Recursos Renováveis, 4-7 November 2003, Natal-RN

Glaura C. Franco, Depto. Estatística, UFMG, Belo Horizonte, MG, 31.270-901. Email: glaura@est.ufmg.br
Valdério A. Reisen, Depto. de Estatística, UFES, Av. Fernando Ferrari, s/n, Goiabeiras, 29060-900. Email: valderio@cce.ufes.br

Summary

This paper evaluates bootstrap bias correction for semiparametric estimators of the fractional parameter d in a particular case of long memory processes, namely ARFIMA models with $d \in (0.0, 0.5)$. The bootstrap techniques studied here are the local bootstrap, the bootstrap in the residuals of the frequency-domain regression equation and the well-known parametric and nonparametric bootstrap in the residuals of the fitted model (Franco and Reisen, 2003). Through simulation, these bootstrap methods are compared on the basis of the mean and the mean square error of the estimators, and the bootstrap bias correction proposed by Efron and Tibshirani (1993) is evaluated.

Keywords: Semiparametric procedures, Fractionally integrated ARMA process, Bootstrap in the residuals, Local bootstrap, Bias correction.

1 Introduction

Several procedures are available for estimating d in a long memory process; among them, the semiparametric methods are very easy to implement. The methods in this class are called semiparametric in the sense that the spectral density is parameterised only within a neighbourhood of zero frequency. Also, to obtain the estimates these methods do not require the full specification of the spectral density of the process, including the ARMA components. The most popular is the one proposed by Geweke and Porter-Hudak (1983), hereafter denoted by GPH. The GPH estimator is obtained through the regression equation of the spectral density of the process. Following the same idea, Reisen (1994) suggested a regression estimator based on the smoothed periodogram function, hereafter denoted by SPR.
Robinson (1995) established the asymptotic distributional properties of a modified form of the GPH estimator. In a different context from the regression methods, Robinson (1994) presented a semiparametric method based on the average of the periodogram function. This method, denoted here by LBR, was also investigated by Lobato and Robinson (1996), who described the limiting distribution of LBR and also provided a simulation study for this estimator.

The bootstrap technique (Efron, 1979) is generally accepted as a powerful tool for approximating certain characteristics of estimators, for example the bias, the variance or the full distribution, that cannot be calculated by analytical means, or only with excessive effort. Bootstrapping time series data has received considerable interest in recent years, but it is not straightforward to implement because the observations are usually not independent. In the time domain, an alternative is to bootstrap the residuals, provided the model is correctly specified. This approach is usually called nonparametric bootstrap and it is completely model dependent, since the residuals will be approximately independent only if the correct model is used. The parametric bootstrap is another possibility in the time domain. In this situation the data generating process of the time series should be known, and the residuals are resampled from this distribution, with the parameters estimated from the sample.

Recent works on bootstrapping time series in the frequency domain have made a considerable contribution to the subject. Among them is the paper by Paparoditis and Politis (1999), which follows the local bootstrap idea of Shi (1991), in which the author proposes drawing samples from a neighbourhood of each data point. Paparoditis and Politis (1999) suggest a simple and direct way of bootstrapping the periodogram and give the asymptotic properties of this procedure applied to several classes of periodogram-based statistics. An alternative bootstrap method, based on the residuals obtained through the regression of the spectral density of the process, is also used in this work. For a comparison of the bootstrap procedures mentioned above, see Franco and Reisen (2003).

The main goal of our simulation study is to evaluate the bias in the estimation of d obtained through the bootstrap techniques, and to use this measure to correct the original estimates in the way proposed by Efron and Tibshirani (1993).

The plan of the paper is as follows. Section 2 briefly introduces the ARFIMA model and the estimates of d. Section 3 describes the bootstrap procedures and the bootstrap bias correction. Simulation results are presented in Section 4 and some concluding remarks are reported in Section 5.

2 The Model and the Estimates of the Long Memory Parameter

Let $\{x_t\}$ be a stationary ARFIMA process defined by

$$\phi(B)(1-B)^{d} x_t = \theta(B)\,\varepsilon_t \qquad (1)$$

where $\phi(B)$ and $\theta(B)$ are polynomials of order p and q, respectively, with all roots outside the unit circle, and B is the backshift operator defined by $B^{j} X_t = X_{t-j}$. $\varepsilon_t$ is a white noise process, Normally distributed with zero mean and finite variance $\sigma_\varepsilon^{2}$, and $(1-B)^{d}$ is the fractional differencing operator.

For $d \in (-0.5, 0.5)$ the process defined in (1) is stationary and invertible and its spectral density, $f(\omega)$, is given by

$$f(\omega) = f_u(\omega)\,\bigl(2\sin(\omega/2)\bigr)^{-2d}, \qquad \omega \in [-\pi, \pi] \qquad (2)$$

where the function $f_u(\omega)$ is the spectral density of the ARMA(p,q) process $u_t = (1-B)^{d} x_t$. A more detailed description of ARFIMA models can be found in Hosking (1981) and Reisen (1994).

A class of semiparametric estimators for d may be obtained by taking the logarithm of the spectral density (2), which gives an approximate regression equation with $\ln f(\omega)$ as the dependent variable:

$$\ln f(\omega) = \ln f_u(0) - d \ln\bigl(2\sin(\omega/2)\bigr)^{2} + \ln\left\{\frac{f_u(\omega)}{f_u(0)}\right\} \qquad (3)$$

We consider two estimators derived from the regression equation (3). The first one was initially proposed by Geweke and Porter-Hudak (1983) (GPH), and uses as an estimate of the spectral density $f(\omega)$ the periodogram function $I(\omega)$, given by

$$I(\omega) = \frac{1}{2\pi}\left[ R(0) + 2\sum_{k=1}^{n-1} R(k)\cos(k\omega) \right] \qquad (4)$$

where $R(\cdot)$ denotes the sample autocovariance of $x_t$ and n is the sample size. The GPH estimator is obtained through the regression equation between $\ln I(\omega)$ and $\ln(2\sin(\omega/2))^{2}$.

The second one, the smoothed periodogram estimator (SPR), was proposed by Reisen (1994). In this case, the estimate of the spectral density is given by the smoothed periodogram function, $f_{sp}(\omega)$:

$$f_{sp}(\omega) = \frac{1}{2\pi}\sum_{k=-(n-1)}^{n-1} \lambda(k)\,R(k)\cos(k\omega) \qquad (5)$$

where $\lambda(k)$ is given by the Parzen lag window

$$\lambda(k) = \begin{cases} 1 - 6\left(\dfrac{k}{m}\right)^{2} + 6\left(\dfrac{|k|}{m}\right)^{3}, & |k| \le m/2 \\[1ex] 2\left(1 - \dfrac{|k|}{m}\right)^{3}, & m/2 < |k| \le m \\[1ex] 0, & |k| > m \end{cases}$$

where $m = n^{\beta}$, $0 < \beta < 1$ (see, for example, Reisen (1994)). The SPR estimator is obtained through the regression equation between $\ln f_{sp}(\omega)$ and $\ln(2\sin(\omega/2))^{2}$.

In both methods the number of observations in the regression equation (the bandwidth) is a function of the sample size, i.e. $g(n) = n^{\alpha}$, with $0.0 < \alpha < 1.0$.

The third semiparametric estimator considered here is the one proposed by Robinson (1994) (LBR). This estimator is based on averages of the unlogged periodogram and depends on the number of frequencies used, the bandwidth $\tau$, and on a constant $q \in (0.0, 1.0)$. The estimator is

$$\hat{d}_{LBR}(q) = 0.5 - \frac{\log\bigl\{\hat{F}(q\omega_\tau)/\hat{F}(\omega_\tau)\bigr\}}{2\log q} \qquad (6)$$

where

$$\hat{F}(\omega_\tau) = \frac{2\pi}{n}\sum_{j=1}^{[n\omega_\tau/2\pi]} I(\omega_j)$$

and $[\cdot]$ means the integer part. Our choices of $\tau = n^{\alpha}$ and $q = 0.5$ are based on the work by Lobato and Robinson (1996). They investigated through simulation the finite sample distribution of LBR and its sensitivity to the quantities $\tau$ and q.
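As an illustration of the regression estimators above, the following is a minimal sketch of the GPH estimator in plain Python. The function names (`periodogram`, `regression_d`, `gph`) are hypothetical and not taken from the paper. The periodogram is computed from the mean-corrected discrete Fourier transform, which is algebraically equal to the autocovariance form in equation (4) at the Fourier frequencies $\omega_j = 2\pi j/n$; truncating $n^{\alpha}$ with `int` is an assumption about how the bandwidth is rounded.

```python
import math
import random

def periodogram(x, j):
    """Periodogram ordinate I(w_j) at w_j = 2*pi*j/n, computed from the
    mean-corrected discrete Fourier transform (equivalent to equation (4))."""
    n = len(x)
    mean = sum(x) / n
    w = 2.0 * math.pi * j / n
    re = sum((v - mean) * math.cos(w * t) for t, v in enumerate(x))
    im = sum((v - mean) * math.sin(w * t) for t, v in enumerate(x))
    return (re * re + im * im) / (2.0 * math.pi * n)

def regression_d(spec, n):
    """Regress ln(spec_j) on ln(2 sin(w_j / 2))^2 for j = 1..g and return
    (d_hat, a_hat, residuals); d_hat is minus the least squares slope."""
    g = len(spec)
    # w_j / 2 = pi * j / n
    z = [math.log((2.0 * math.sin(math.pi * j / n)) ** 2) for j in range(1, g + 1)]
    y = [math.log(s) for s in spec]
    z_bar = sum(z) / g
    y_bar = sum(y) / g
    slope = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
             / sum((zi - z_bar) ** 2 for zi in z))
    a_hat = y_bar - slope * z_bar
    resid = [yi - a_hat - slope * zi for zi, yi in zip(z, y)]
    return -slope, a_hat, resid

def gph(x, alpha=0.5):
    """GPH estimate of d from the first g(n) = n**alpha periodogram ordinates."""
    n = len(x)
    g = int(n ** alpha)  # assumption: bandwidth truncated to an integer
    spec = [periodogram(x, j) for j in range(1, g + 1)]
    return regression_d(spec, n)[0]
```

The same `regression_d` helper could be fed smoothed periodogram ordinates to obtain an SPR-type estimate; only the spectral estimate passed in changes.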

3 Bootstrap methods and bootstrap estimate of bias

In time series there are two main ways of performing the bootstrap: the bootstrap in the residuals of the fitted model and the moving blocks bootstrap. References for the first approach include Stoffer and Wall (1991) and Franco and Souza (2002), among others. The second method is considered in Künsch (1989) and Politis and Romano (1994) and references therein. It is well known that the main difficulty in the use of the moving block approach is choosing a block size that makes the blocks approximately uncorrelated. We believe this is even harder in long memory processes, because the observations remain correlated over long periods of time. For these reasons the moving block bootstrap is not considered in our study, although it remains an interesting topic for further investigation.

In this work, for the estimation of the fractional parameter, other classes of bootstrap methods are proposed and compared with the bootstrap in the residuals of the fitted model. The first one (BOOT-LOCAL) is the local bootstrap applied to the GPH, SPR and LBR estimation methods. In the second approach (BOOT-REG) the bootstrap is performed in the residuals of the regression equation of the GPH and SPR procedures.

Let $x_t$, $t = 1, \ldots, n$, be an observed part of a stochastic process. The bootstrap methods are summarised below.

3.1 The local bootstrap (BOOT-LOCAL)

The local bootstrap is based on the asymptotic independence of the ordinates of the periodogram function. The term "local" refers to the way the resampling is done. Assuming that the spectral density $f(\omega)$ is a smooth function of $\omega$, the periodogram replicates can be obtained locally, that is, by sampling the frequencies that are in a small neighbourhood of the frequency $\omega$ of interest. This procedure does not require an initial estimate of the unknown spectral density to obtain frequency domain residuals.

Let $I(\omega_j)$, $\omega_j = 2\pi j/n$, $j = 1, \ldots, N$, where $N = [n/2]$, be the periodogram ordinates of $\{x_t\}$.
The local bootstrap procedure, as shown in Paparoditis and Politis (1999), is summarized as follows:

1. Select a resampling width $\kappa_n$, where $\kappa_n \in \mathbb{N}$ and $\kappa_n \le [N/2]$.
2. Define i.i.d. discrete random variables $S_1, \ldots, S_N$ taking values in the set $\{0, \pm 1, \ldots, \pm\kappa_n\}$.
3. Each one of the $2\kappa_n + 1$ ordinates can be resampled with probability $p_{\kappa_n, s} = \dfrac{1}{2\kappa_n + 1}$.
4. The bootstrap periodogram is then defined by

$$I^{*}(\omega_j) = \begin{cases} I(\omega_{j+S_j}), & j = 1, 2, \ldots, [n/2] \\ I^{*}(-\omega_j), & \omega_j < 0 \\ 0, & \omega_j = 0 \end{cases}$$
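The four steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `local_bootstrap` is hypothetical, and near the ends of the frequency range the resampling window is simply truncated to the indices 1..N, which is an assumption made here for simplicity (the source does not say how the boundaries are handled).

```python
import random

def local_bootstrap(ordinates, kappa, rng=random):
    """Return one local-bootstrap replicate of the periodogram.

    ordinates[j - 1] holds I(w_j) for j = 1..N. Each replicate ordinate
    I*(w_j) is drawn uniformly from the ordinates whose index lies within
    kappa of j, i.e. with probability 1 / (2 * kappa + 1) away from the
    boundaries. Assumption: the window is truncated to 1..N at the edges.
    """
    n_freq = len(ordinates)
    if kappa < 0 or kappa > n_freq // 2:
        raise ValueError("kappa must satisfy 0 <= kappa <= [N/2]")
    boot = []
    for j in range(1, n_freq + 1):
        lo = max(1, j - kappa)        # left end of the local window
        hi = min(n_freq, j + kappa)   # right end of the local window
        k = rng.randint(lo, hi)       # index j + S_j, uniform on the window
        boot.append(ordinates[k - 1])
    return boot
```

With `kappa = 0` the replicate reproduces the original ordinates; larger values of `kappa` mix each ordinate with more of its neighbours.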

The asymptotic validity of the local bootstrap is given in Paparoditis and Politis (1999). The choice of the resampling width $\kappa_n$, in the case of a finite sample size, requires some care. Paparoditis and Politis (1999) show that, given an appropriate choice of $p_{\kappa_n, s}$ as a kernel estimator of $f(\omega)$, $\kappa_n$ can be chosen by minimizing the mean square error of $\tilde{f}(\omega)$, the expected value of $I^{*}(\omega)$. In particular, taking $p_{\kappa_n, s} = (2\kappa_n + 1)^{-1}$, for any particular point $\omega$ an "optimal" resampling width $\kappa_{n,\omega}$ can be obtained from

$$\kappa_{n,\omega} = \left\{ \frac{9 f^{2}(\omega)}{8\pi^{4}\,\{f^{(2)}(\omega)\}^{2}} \right\}^{1/5} n^{4/5} \qquad (7)$$

assuming that $f^{(2)}(\omega) \neq 0$ (see Paparoditis and Politis (1999) for more details).

In our simulation exercise, the local bootstrap is performed by considering the simplest case of $p_{\kappa_n, s}$, which is given above, and the method is used in the fractional parameter estimation procedures where an estimate of the spectral density is required. In the regression estimators GPH and SPR the local bootstrap is applied in equation (3) to resample $I(\omega)$ and $f_{sp}(\omega)$ to obtain the respective bootstrap estimates. In the LBR method, this bootstrap procedure is applied by resampling $I(\omega)$ to obtain $\hat{F}^{*}(\cdot)$.

As far as the authors are aware, the above bootstrap technique has never previously been used to resample $I(\omega)$ in the GPH and LBR methods and $f_{sp}(\omega)$ in the SPR method. Thus, it is proposed in this work as an alternative bootstrap method to deal with the semiparametric fractional estimators considered here.

3.2 Bootstrap in the residuals of the regression equation (BOOT-REG)

In this case, the bootstrap is performed in the residuals obtained from the approximated regression equation (3). The procedure is summarized as follows. Equation (3) may be written as

$$y_j = A - d \ln\bigl(2\sin(\omega_j/2)\bigr)^{2} + \varepsilon_j \qquad (8)$$

where $\omega_j = 2\pi j/n$, with $j = 1, 2, \ldots, g(n)$, and $g(n)$ is chosen such that $g(n)/n \to 0$ as $n \to \infty$. For the GPH method we have

$$y_j = \ln I(\omega_j), \qquad \varepsilon_j = \ln\left\{\frac{I(\omega_j)}{f(\omega_j)}\right\} = \varepsilon_{GPH,j},$$

and for the Reisen (1994) estimator,

$$y_j = \ln f_{sp}(\omega_j), \qquad \varepsilon_j = \ln\left\{\frac{f_{sp}(\omega_j)}{f(\omega_j)}\right\} = \varepsilon_{SPR,j},$$

where $A = \ln f_u(0)$ is a constant (see equation (3)); the remaining term $\ln\{f_u(\omega_j)/f_u(0)\}$ of equation (3) is negligible for frequencies close to zero.

The above errors are asymptotically independent and identically distributed (see, for example, Reisen (1994) for more details). Thus, since the number of observations involved in the regression equation is a function of the frequencies, the nonparametric bootstrap in the frequency domain can be applied to $\varepsilon_{GPH,j}$ and $\varepsilon_{SPR,j}$, so that the bootstrap series in the case of the GPH method, $\ln I^{*}(\omega_j)$, can be calculated as

$$\ln I^{*}(\omega_j) = A - \hat{d}_{p} \ln\bigl(2\sin(\omega_j/2)\bigr)^{2} + e^{*}_{GPH,j} \qquad (9)$$

and for the SPR method

$$\ln f^{*}_{sp}(\omega_j) = A - \hat{d}_{sp} \ln\bigl(2\sin(\omega_j/2)\bigr)^{2} + e^{*}_{SPR,j} \qquad (10)$$

where $e^{*}_{i,j}$, i = GPH or SPR, are the residual bootstrap estimates and all the terms without a star are kept fixed.

It is well known that the residuals in the regression equations are only asymptotically independent when using either estimate of $f(\omega)$, the periodogram or the smoothed periodogram function. This could invalidate the use of the bootstrap technique in the regression equations (9) and (10). However, as the simulation results will show, the bootstrap in the residuals of the regression model leads to results very close to the simulations, especially when compared to the bootstrap in the residuals of the fitted model. Thus, in spite of this limitation, we strongly believe in the ability of this kind of bootstrap to estimate the bias of the fractional parameter d in long memory processes.

3.3 Nonparametric bootstrap in the residuals of the fitted model (BOOTNP-)

The estimated residuals of the ARFIMA model, $e_t$, are calculated by

$$e_t = \hat{\theta}^{-1}(B)\,\hat{\phi}(B)\,(1-B)^{\hat{d}} x_t \qquad (11)$$

and are supposed to be i.i.d. The nonparametric bootstrap consists of resampling these residuals with replacement, so that the bootstrap series $x^{*}_t$ can be constructed as

$$x^{*}_t = \hat{\phi}^{-1}(B)\,(1-B)^{-\hat{d}}\,\hat{\theta}(B)\,e^{*}_t \qquad (12)$$

where $e^{*}_t$ is the residual bootstrap series.
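Equations (11) and (12) can be sketched for the simplest case, a pure fractional model ARFIMA(0,d,0) with $\hat{\phi}(B) = \hat{\theta}(B) = 1$. This is only an illustration under stated assumptions: the function names `frac_filter` and `nonparametric_bootstrap_series` are hypothetical, and the infinite expansion of $(1-B)^{d}$ is truncated at the start of the sample (pre-sample values are treated as zero), which the source does not specify.

```python
import random

def frac_filter(x, d):
    """Apply the fractional filter (1 - B)^d to the sequence x.

    The expansion coefficients follow the recursion
    w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k.
    Assumption: the expansion is truncated at the start of the sample.
    """
    n = len(x)
    w = [1.0]
    for k in range(1, n):
        w.append(w[-1] * (k - 1 - d) / k)
    return [sum(w[k] * x[t - k] for k in range(t + 1)) for t in range(n)]

def nonparametric_bootstrap_series(x, d_hat, rng=random):
    """One bootstrap series for an ARFIMA(0, d, 0) fit with estimate d_hat."""
    e = frac_filter(x, d_hat)                # residuals, as in equation (11)
    e_star = [rng.choice(e) for _ in e]      # resample with replacement
    return frac_filter(e_star, -d_hat)       # rebuild the series, equation (12)
```

A parametric variant in the spirit of Section 3.4 would replace the `rng.choice(e)` draw by draws from a fitted distribution, for example `rng.gauss(0.0, s)` with `s` the sample standard deviation of the residuals.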

Note that $\hat{d}$, $\hat{\phi}$ and $\hat{\theta}$ are kept fixed during the bootstrap replications, i.e., they are treated as the true population parameters. This method is called nonparametric bootstrap, since the distribution of the residuals is not specified.

3.4 Parametric bootstrap in the residuals of the fitted model (BOOTPA-)

The parametric bootstrap in the residuals of the fitted model (equation (11)) is another way of performing the bootstrap in time series. The procedure is very similar to the BOOTNP-, but now the distribution of the residuals is supposed to be known. Hence, the sampling is done from the distribution that generated the residuals, with the parameters estimated from the original series. A good review of both BOOTNP- and BOOTPA- may be found in Efron and Tibshirani (1993) and Davison and Hinkley (1997).

It should be noted that, in these last two approaches, we have made the assumption that the true model is known and only the parameters need to be estimated. This is not realistic in practice: Smith et al. (1997) show that biased estimates of d may lead the model selection criterion to choose an incorrect ARFIMA specification. Consequently, the errors will not be independent, which may cause problems when applying these bootstrap techniques.

3.5 Bootstrap estimate of bias

The bootstrap estimate of bias is obtained, according to Efron and Tibshirani (1993), by

$$\widehat{Bias} = \hat{d}^{*} - \hat{d}$$

where $\hat{d}^{*}$ is the estimate of d obtained in the bootstrap replications (averaged over the replications) and $\hat{d}$ is the estimate of d from the simulated series. Thus, the bias correction can be performed in the following way:

$$\hat{d}_{Corr} = \hat{d} - \widehat{Bias}.$$

4 Simulation Results

All simulated ARFIMA time series throughout the paper were generated via the algorithm suggested by Hosking (1984) for sample sizes n = 100 and 300, d = 0.2 and three different values of $\phi$. The random numbers N(0,1) were obtained by the subroutine RNNOR from the IMSL-FORTRAN library. The numbers of bootstrap and simulation replications were both set equal to 1000. The bandwidth values of the GPH and SPR methods were fixed at $\alpha$ = 0.5 and 0.8.
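The bias correction of Section 3.5 can be sketched together with the residual resampling of Section 3.2. The code below is a minimal illustration under stated assumptions: the names `ols`, `boot_reg_replicates` and `bias_corrected` are hypothetical, `z` holds the regressors $\ln(2\sin(\omega_j/2))^{2}$ and `y` the log spectral estimates, and the bootstrap mean is taken as the plain average of the replicates.

```python
import random

def ols(z, y):
    """Least squares fit y = a + slope * z; returns (a, slope)."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    slope = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
             / sum((zi - z_bar) ** 2 for zi in z))
    return y_bar - slope * z_bar, slope

def boot_reg_replicates(z, y, n_boot=1000, rng=random):
    """Resample regression residuals as in equations (9)-(10).

    Returns (d_hat, replicates), where each replicate is the estimate of d
    from a bootstrap response built with the fitted terms kept fixed.
    """
    a_hat, slope = ols(z, y)
    resid = [yi - a_hat - slope * zi for zi, yi in zip(z, y)]
    reps = []
    for _ in range(n_boot):
        y_star = [a_hat + slope * zi + rng.choice(resid) for zi in z]
        reps.append(-ols(z, y_star)[1])   # d is minus the slope
    return -slope, reps

def bias_corrected(d_hat, reps):
    """Bootstrap bias estimate and bias-corrected estimate of d."""
    bias = sum(reps) / len(reps) - d_hat
    return bias, d_hat - bias
```

Written out, the corrected value equals twice the original estimate minus the bootstrap mean, so a bootstrap mean above the original estimate pulls the corrected estimate down by the same amount. For instance, with a hypothetical estimate of 0.25 and a hypothetical bootstrap mean of 0.28, the estimated bias is 0.03 and the corrected estimate is 0.22.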
The truncation point $\beta$ in the Parzen lag window, for the smoothed periodogram function, was chosen equal to 0.9 (see Reisen (1994) for more details).

The tables present the simulation results for the mean and the mean square error (mse) of the estimators of d, based on the simulation and bootstrap methods. Tables 4.1 and 4.2 give the results for ARFIMA(1,d,0) models with d = 0.2 and $\phi$ = 0.0, ±0.2. The columns under the title Origi present the results for the bootstrap before a bias correction is performed. The values in bold face in these columns are the ones closest to the simulation results. Under the title Correc are the results after the bias correction. The values in bold face in these columns are the ones closest to the real value d = 0.2.

The results before the bias correction (under the column Origi) were already analysed in another work (Franco and Reisen, 2003). The main conclusions were that the bootstrap in the regression equation (BOOT-REG) outperforms the other bootstrap approaches by presenting mean and mse values very close to those of the simulation, except in the LBR estimates, where this bootstrap technique cannot be applied. Comparing the BOOT-LOCAL with the BOOTNP- and the BOOTPA-, the first one is superior, showing values closer to those obtained by the simulation. As should be expected, for series of size n = 300 the estimates are closer to the true value and the mse values are smaller than for series of size n = 100.

After the bias correction is performed, we can draw the following conclusions.

For GPH, the estimates are very close to the real value of the parameter d = 0.2, and BOOT-REG is the better bootstrap method to perform the bias correction, as it gave the best performance before the correction. The exception occurs for $\phi$ = 0.2, where the BOOTNP- and BOOTPA- show the best performance.

For the SPR estimator an interesting fact emerges. Although the BOOT-REG gives very close approximations to the simulations, the bias corrected estimates for BOOTNP- and BOOTPA- are much closer to the real value d = 0.2. This occurs because the difference between the bootstrap and the simulation estimates compensates for the difference between the simulation estimates and the real value of d. For this estimator the bias correction shows a real gain, as the results using BOOTNP- and BOOTPA- are closer to 0.2 than the simulation estimates. Besides, the mse for the bootstrap bias corrected estimates is even smaller than it was for the original estimates.

The same happens with LBR. For this method the estimates are much farther from the real value d = 0.2, but they are still better than the simulation and the original bootstrap estimates, except for $\phi$ = 0.2.

It can also be noticed that, in general, a bias correction brings the estimates reasonably close to the real value of d in all of the estimation methods studied here.

Table 4.1 - Mean and mse values for ARFIMA(1,d,0) models (n = 100, d = 0.2)

GPH
            φ = -0.2                        φ = 0.0                         φ = 0.2
Method      Origi           Correc          Origi           Correc          Origi           Correc
Simulation  .948 (.082)                     .200 (.062)                     .2283 (.089)
BOOTNP-     .279 (.0985)    .77 (.0686)     .24 (.080)      .906 (.047)     .255 (.0994)    .204 (.0699)
BOOTPA-     .283 (.0987)    .73 (.0684)     .28 (.08)       .902 (.046)     .2559 (.0999)   .2007 (.0696)
BOOT-LOCAL  .200 (.0788)    .894 (.03)      .2002 (.06)     .208 (.072)     .2365 (.0798)   .220 (.03)
BOOT-REG    .947 (.08)      .948 (.085)     .2009 (.062)    .20 (.062)      .2282 (.087)    .2283 (.082)

SPR
            φ = -0.2                        φ = 0.0                         φ = 0.2
Method      Origi           Correc          Origi           Correc          Origi           Correc
Simulation  .209 (.053)                     .754 (.00)                      .527 (.0470)
BOOTNP-     .0657 (.0673)   .760 (.046)     .599 (.028)     .909 (.0098)    .0989 (.059)    .2065 (.040)
BOOTPA-     .0657 (.0673)   .760 (.046)     .602 (.029)     .906 (.0097)    .099 (.059)     .2063 (.040)
BOOT-LOCAL  .383 (.052)     .035 (.0535)    .80 (.00)       .707 (.03)      .738 (.0479)    .36 (.0486)
BOOT-REG    .208 (.053)     .20 (.053)      .753 (.0)       .755 (.0)       .528 (.0470)    .526 (.0470)

LBR
            φ = -0.2                        φ = 0.0                         φ = 0.2
Method      Origi           Correc          Origi           Correc          Origi           Correc
Simulation  .073 (.0466)                    .52 (.0096)                     .324 (.0388)
BOOTNP-     -.07 (.0600)    .237 (.5)       .0822 (.0397)   .222 (.0460)    .768 (.0028)    .0880 (.448)
BOOTPA-     -.07 (.060)     .238 (.56)      .0824 (.0397)   .229 (.0459)    .770 (.0028)    .0878 (.453)
BOOT-LOCAL  .0864 (.0433)   .282 (.0603)    .448 (.000)     .595 (.0098)    .3 (.0353)      .534 (.0523)

Note: Numbers in brackets are the mse.

Table 4.2 - Mean and mse values for ARFIMA(1,d,0) models (n = 300, d = 0.2)

GPH
            φ = -0.2                        φ = 0.0                         φ = 0.2
Method      Origi           Correc          Origi           Correc          Origi           Correc
Simulation  .938 (.0384)                    .204 (.0057)                    .209 (.0360)
BOOTNP-     .2008 (.0272)   .868 (.089)     .2003 (.0034)   .2025 (.0027)   .235 (.0250)    .903 (.076)
BOOTPA-     .2007 (.0274)   .969 (.090)     .2002 (.0034)   .2026 (.0023)   .238 (.0252)    .900 (.075)
BOOT-LOCAL  .944 (.0386)    .932 (.0505)    .202 (.0058)    .206 (.0062)    .22 (.0364)     .206 (.0470)
BOOT-REG    .938 (.0384)    .938 (.0385)    .202 (.0057)    .206 (.0057)    .209 (.036)     .209 (.0363)

SPR
            φ = -0.2                        φ = 0.0                         φ = 0.2
Method      Origi           Correc          Origi           Correc          Origi           Correc
Simulation  .465 (.0274)                    .883 (.0040)                    .590 (.025)
BOOTNP-     .36 (.0228)     .794 (.04)      .798 (.003)     .968 (.0024)    .399 (.077)     .78 (.023)
BOOTPA-     .37 (.0229)     .793 (.04)      .798 (.003)     .968 (.0024)    .402 (.077)     .778 (.023)
BOOT-LOCAL  .562 (.0275)    .368 (.0287)    .903 (.0040)    .863 (.004)     .692 (.0253)    .488 (.0257)
BOOT-REG    .464 (.0274)    .466 (.0274)    .882 (.0040)    .884 (.0040)    .590 (.025)     .590 (.025)

LBR
            φ = -0.2                        φ = 0.0                         φ = 0.2
Method      Origi           Correc          Origi           Correc          Origi           Correc
Simulation  .03 (.0332)                     .77 (.0040)                     .38 (.0293)
BOOTNP-     .0427 (.027)    .599 (.0682)    .535 (.0047)    .899 (.0054)    .704 (.008)     .0572 (.093)
BOOTPA-     .0426 (.0272)   .600 (.0686)    .535 (.0047)    .899 (.0054)    .704 (.009)     .0572 (.0986)
BOOT-LOCAL  .0856 (.0340)   .70 (.0473)     .684 (.0042)    .750 (.004)     .095 (.033)     .325 (.0464)

Note: Numbers in brackets are the mse.

5 Conclusions

We show in this work that a bootstrap bias correction for the semiparametric estimates of the parameter d in ARFIMA(1,d,0) models can be an alternative way to obtain better estimates of this parameter. For GPH, the BOOT-REG is the better bootstrap method to perform the bias correction, as it gave the best performance before the correction. For the SPR, the bias corrected estimates for BOOTNP- and BOOTPA- are much closer to the real value d = 0.2 and there is a substantial gain, as the results are closer to 0.2 than the simulation estimates. The same happens with LBR, although for this method the estimates are much farther from the real value d = 0.2. It can also be noticed that, in general, a bias correction brings the estimates reasonably close to the real value of d in all of the estimation methods studied here.

References

Davison, A.C. and Hinkley, D.V. (1997), Bootstrap Methods and their Application. Cambridge: Cambridge University Press.

Efron, B. (1979), Bootstrap methods: another look at the jackknife. The Annals of Statistics, 7, 1-26.

Efron, B. and Tibshirani, R. (1993), An Introduction to the Bootstrap. New York: Chapman and Hall.

Franco, G.C. and Reisen, V.A. (2003), Bootstrap techniques in semiparametric estimation methods for ARFIMA models: a comparison study. Computational Statistics. To appear.

Franco, G.C. and Souza, R.C. (2002), A comparison of methods for bootstrapping in the local level model. Journal of Forecasting, 21, 27-38.

Geweke, J. and Porter-Hudak, S. (1983), The estimation and application of long memory time series models. Journal of Time Series Analysis, 4(4), 221-238.

Hosking, J. (1981), Fractional differencing. Biometrika, 68(1), 165-175.

Hosking, J. (1984), Modelling persistence in hydrological time series using fractional differencing. Water Resources Research, 20(12), 1898-1908.

Künsch, H.R. (1989), The jackknife and the bootstrap for general stationary observations. The Annals of Statistics, 17, 1217-1241.

Lobato, I. and Robinson, P.M. (1996), Averaged periodogram estimation of long memory. Journal of Econometrics, 73, 303-324.

Paparoditis, E. and Politis, D.N. (1999), The local bootstrap for periodogram statistics. Journal of Time Series Analysis, 20(2), 193-222.

Politis, D.N. and Romano, J.P. (1994), The stationary bootstrap. Journal of the American Statistical Association, 89, 1303-1313.

Reisen, V.A. (1994), Estimation of the fractional difference parameter in the ARIMA(p,d,q) model using the smoothed periodogram. Journal of Time Series Analysis, 15(3), 335-350.

Robinson, P.M. (1994), Semiparametric analysis of long-memory time series. The Annals of Statistics, 22(1), 515-539.

Robinson, P.M. (1995), Log-periodogram regression of time series with long range dependence. The Annals of Statistics, 23(3), 1048-1072.

Shi, S.G. (1991), Local bootstrap. Ann. Inst. Stat. Math., 43, 667-676.

Smith, J., Taylor, N. and Yadav, S. (1997), Comparing the bias and misspecification in ARFIMA models. Journal of Time Series Analysis, 18(5), 507-527.

Stoffer, D.S. and Wall, K.D. (1991), Bootstrapping state-space models: Gaussian maximum likelihood estimation and the Kalman filter. Journal of the American Statistical Association, 86, 1024-1033.