Article
International Journal of Modern Mathematical Sciences, 2014, 11(1): 40-48
Journal homepage: www.modernscientificpress.com/journals/ijmms.aspx
ISSN: 2166-286X, Florida, USA

ARIMA Methods of Detecting Outliers in Time Series Periodic Processes

T. A. Lasisi 1 and D. K. Shangodoyin 2,*

1 Department of Mathematics and Statistics, The Polytechnics, Ibadan, Nigeria
2 Department of Statistics, University of Botswana, Botswana
* Author to whom correspondence should be addressed. E-mail: shangodoyink@mopipi.ub.bw

Article history: Received 4 March 2014, Received in revised form 7 May 2014, Accepted 4 June 2014, Published 8 July 2014.

Abstract: We utilized the periodic likelihood ratio test statistic to assess the evidence that any estimated outlier for a given period is an outlier. The conditions are that the timing $t = T$ of the occurrence of an outlier is known or assumed and the magnitude of the outlier has been estimated or specified. In the February period the AO and TC outlier models each confirmed as significant 30% of the weights of the outliers injected into the series, whereas the LS model confirmed 40% of the injected weights, displaying a more powerful feat in capturing the time points of outliers. In the October period only the IO outlier model confirmed as significant 20% of the weights of the outliers injected into the series. Our findings reveal that two of the four outlier models perform better than the others in terms of their ability to capture the timing of the occurrence of outliers.

Keywords: Likelihood ratio test statistic, ARMA models, Periodic processes

2000 Mathematical Subject Classification: C15, C, C5

1. Introduction

In practice there may be need for detecting outliers if they are influential to a model fitting. Fox (1972) appears to have been the first to address the detection of outliers in time series, assuming an autoregressive process on the outlier-free series with Gaussian noise. The approach is a likelihood ratio criterion for testing the existence of additive and innovative outliers under the condition that serial correlation exists, a good condition for analyzing periodic series. A number of research works have been done on both least squares and maximum likelihood methods of detecting outliers assuming known process models (Bianco et al 2001, Tsay 1996). Bruce and Martin (1989), commenting on Cook's distance statistic (Cook 1977), observe that the test statistic based on the influence of the i-th observation on the parameters of the regression model is well established, but the dependency relations which exist in time series give rise to a smearing effect when the test statistic for a model coefficient is calculated; hence the need for a test statistic specific to time series ARMA models. The identification of influential observations in the complex of ARIMA has been developed (Chang and Tiao 1983, Pena 1984, Tsay 1986 and 1996).

In this paper we utilized the statistic derived in Lasisi T. A. et al (2013), Section 3.1 (3.1.1-3.1.4), p. 92, to assess the evidence that any estimated outlier for a given period is an outlier. The conditions are that the timing $t = T$ of the occurrence of an outlier is known or assumed and the magnitude of the outlier has been estimated or specified. Consideration is given to ARIMA models on periodic processes using the likelihood ratio test statistic developed by Tsay (1986); this is extended to level shift and transitory change outliers for periodic series in this study.

2. Outlier Detection Test Statistic

If we assume a single outlier model:

\[ \psi^{-1}(B)\, Z_{t(r,m)} = \psi^{-1}(B)\, Y_{t(r,m)} + \nu(B)\, D\, \xi_t^{(T)} \tag{1} \]

where $\nu(B)$ is the weight attached to the magnitude of outliers and $\xi_t^{(T)}$ equals one at $t = T$ and zero otherwise. Suppose that the ARIMA models

\[ \phi(B:I)\, Y_{t(r,m)} = \theta(B:I)\, \varepsilon_{t(r,m)} \quad \text{and} \quad \phi(B:I)\, Z_{t(r,m)} = \theta(B:I)\, e_{t(r,m)}, \tag{2} \]

where $\psi(B) = \phi^{-1}(B)\,\theta(B)$, are fitted on the outlier-free series and the outlier-infested series respectively. $\varepsilon_{t(r,m)}$ is a sequence of independent Gaussian variates with mean zero and variance one.
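The operator that turns an observed series into residuals can be expanded as a power series in the backshift operator B. The following is a minimal sketch of that expansion, assuming the usual ARMA sign convention $\phi(B) = 1 - \phi_1 B - \dots$ and $\theta(B) = 1 - \theta_1 B - \dots$; the function name `residual_weights` and the truncation length are hypothetical choices for illustration, not part of the paper.

```python
def residual_weights(phi, theta, n_terms=10):
    """Leading coefficients c_0, c_1, ... of phi(B) / theta(B) in powers of B.

    Assumed convention (an illustration, not taken from the paper):
        phi(B)   = 1 - phi[0]*B - phi[1]*B**2 - ...
        theta(B) = 1 - theta[0]*B - theta[1]*B**2 - ...
    """
    a = [1.0] + [-p for p in phi]      # coefficients of phi(B)
    b = [1.0] + [-t for t in theta]    # coefficients of theta(B)
    c = []
    for k in range(n_terms):
        a_k = a[k] if k < len(a) else 0.0
        # theta(B) * c(B) = phi(B)  =>  c_k = a_k - sum_{j>=1} b_j * c_{k-j}
        acc = sum(b[j] * c[k - j] for j in range(1, min(k, len(b) - 1) + 1))
        c.append(a_k - acc)
    return c


# Example: ARMA(1,1) with phi_1 = 0.5 and theta_1 = 0.3 (made-up values).
weights = residual_weights([0.5], [0.3], n_terms=4)
```

For a pure AR model the expansion is finite and simply reproduces the AR polynomial; a moving-average part makes it infinite, which is why the sketch truncates at `n_terms`.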
In model (1) it is assumed that all the zeros of $\phi^{(m)}(B:I)$ and $\theta^{(m)}(B:I)$ are on or outside the unit circle and that $\phi^{(m)}(B:I)$ and $\theta^{(m)}(B:I)$ have no common factors; if some of the zeros of $\phi^{(m)}(B:I)$ are on the unit circle, then it is assumed that the process starts at a fixed time point with known or given starting values. The white noise process $\varepsilon_t$ has mean zero and variance $\sigma^2$. Applying the least squares theory, we have the following results:

(i) ADDITIVE OUTLIER CASE: The estimated magnitude of the outlier, $\hat{D}_{T(r,m)}$, is $P^{(1)}_{T(r,m)}$, with variance $[\psi^{-1}(B)\,\sigma_{t(r,m)}]^{2}$.

(ii) INNOVATIONAL OUTLIER CASE: The estimate of the outlier magnitude is $P^{(2)}_{T(r,m)} = \hat{\varepsilon}_{T(r,m)}$, with the variance $\sigma^{2}_{T(r,m)}$.

(iii) LEVEL SHIFT CASE: Given that the weight is $(1 - B)^{-1}$, the estimated magnitude of the outlier is $P^{(3)}_{T(r,m)}$, with variance $[(1 - B)^{-1}\psi^{-1}(B)\,\sigma_{t(r,m)}]^{2}$.

(iv) TRANSITORY CHANGE CASE: The weight coefficient of the outlier is $(1 - \delta B)^{-1}$; the estimate of the outlier is $P^{(4)}_{T(r,m)}$, with variance $[(1 - \delta B)^{-1}\psi^{-1}(B)\,\sigma_{t(r,m)}]^{2}$.

Based on the results in (i)-(iv) we may construct the test statistics for testing the existence of an outlier at time point $t = T$. The null hypothesis is that there is no outlier at time point $t = T$; under the assumptions of knowing the time series parameters and the time of occurrence of the outliers, the following test statistics are distributed as $N(0,1)$. Chang and Tiao (1983) suggested the critical values 3.0, 3.5 and 4.0, but in practical time series analysis we suggest the use of $|P^{(i)}_{T}| > c(\alpha)$ for a specified value of $\alpha$ under the normality assumption (Tsay 1988). Assume that $\lambda_{(\cdot)} \sim N(0,1)$ and suppose that the ARIMA$(p,d,q)$ model parameters, the timing of outliers and the magnitude are known; then the test statistic for each scenario would be as follows:

(i) Existence of AO:
\[ \lambda_{AT} = \frac{\hat{D}_{T(r,m)}}{\psi^{-1}(B)\,\hat{\sigma}_{t(r,m)}} \tag{3} \]

(ii) Existence of IO:
\[ \lambda_{IT} = \frac{\hat{D}_{T(r,m)}}{\hat{\sigma}_{t(r,m)}} \tag{4} \]

(iii) Existence of LS:
\[ \lambda_{LT} = \frac{\hat{D}_{T(r,m)}}{(1 - B)^{-1}\psi^{-1}(B)\,\hat{\sigma}_{t(r,m)}} \tag{5} \]

(iv) Existence of TC:
\[ \lambda_{TT} = \frac{\hat{D}_{T(r,m)}}{(1 - \delta B)^{-1}\psi^{-1}(B)\,\hat{\sigma}_{t(r,m)}} \tag{6} \]

2.1. Outliers Detection Algorithms (ODA)

The criteria proposed in (i) through (iv) would be useful in detecting outliers in periodic processes. The estimates of the outlier magnitudes are obtained using the Outlier Detecting Algorithm (ODA) in Figure 1, which gives the magnitude at a particular time point $T$; we assume that outlier-free points have zero magnitude and use the statistics above to check whether this is true. The result in this section is only used in the intermediate steps of the outlier detection procedure. The final estimates of the outliers are from the model incorporating all the outliers, in which all parameters are estimated in the ARIMA$(p,d,q)$ model. The following flow chart demonstrates how automatic outlier detection works. Let $\hat{D}_t$ be the magnitude of outliers and $t = T$ when an outlier occurs.
Figure 1: Outlier Detection Algorithm

3. Empirical Illustration

The computation follows the algorithm below for the detection of outliers, using the methodology described above. The Outlier Detection Algorithm (ODA) follows these steps:

(i) Read the periodic observations $Y_t$, $t = 1, \dots, m$.
(ii) Estimate the parameters of the ARMA process using the SPSS or SYSTAT program.
(iii) Obtain $\psi(B) = \phi^{-1}(B)\,\theta(B)$ from the estimated ARMA processes in step (ii).
(iv) Read the values of $\hat{D}_T$ to compute $\hat{\lambda}_T$.
(v) Read the critical values $C$ as 1.645 (10%) and 1.96 (5%).

The program for computing the existence of outliers using the $\hat{\lambda}_{iT}$'s is:

(i) Do: Calculate $\hat{\lambda}_{iT}$ from the values of $\hat{D}_T$ using the expression
\[ \hat{\lambda}_{iT} = \frac{\hat{D}_{Ti(r,m)}}{\sqrt{\mathrm{Var}(\hat{D}_{Ti(r,m)})}} \]
for each outlier type $i$. If $|\hat{\lambda}_{iT}| > C$, display $\hat{\lambda}_{iT}$; otherwise there is no outlier, then stop or recheck using a different significance level $\alpha$.
(ii) End Do.
(iii) Go and read new periodic observations from the file and perform the algorithm.

We have estimated the time series ARMA parameters from the data collected on Maun Airport precipitation and concentrated on three outlier-free periods. The parameters shown in Table 1 are used for estimating the $\psi(B)$ and $\varepsilon_{t(r,m)}$ in equation (2) for all the outlier models considered. Under the null hypothesis that there is no outlier, the statistics $\lambda_{AT}$, $\lambda_{IT}$, $\lambda_{LT}$ and $\lambda_{TT}$ have the standard normal distribution. This makes the statistics $\lambda_{(\cdot)T}$ readily useful in practical modeling (Tsay 1986). For the practical purpose of detecting the existence and significance of an outlier at time point $T$: if $\lambda_{(\cdot)T}$ is significantly greater than the chosen critical value and $\lambda_{AT}$ is greater than any of $\lambda_{IT}$, $\lambda_{LT}$ and $\lambda_{TT}$, then the additive outlier is more pronounced at this point than any other outlier. We assume the standard normal distribution with critical value $Z_{1-\alpha/2}$ of 1.96 at the 5% significance level.

In Table 1, $\lambda_{AT}$ is significantly greater than the critical value at time points 11, 16, 18, 24, 26 and 30. The $\lambda_{IT}$ is statistically significant at time points 11 and 30. Both $\lambda_{LT}$ and $\lambda_{TT}$ are significant at time points 11, 16, 18, 24, 26 and 30. The presence of LS and TC is prominent at these time points because the $\lambda$-values of $\lambda_{LT}$ and $\lambda_{TT}$ are greater than those of $\lambda_{AT}$ and $\lambda_{IT}$. For this January period the AO, LS and TC outlier models each confirmed as significant 60% of the weights of the outliers injected into the series.

In Table 2, $\lambda_{AT}$ is significantly greater than the critical value at time points 16, 26 and 30. The $\lambda_{IT}$ is only statistically significant at time points 18 and 24. While $\lambda_{LT}$ is significant at time points 11, 16, 26 and 30, $\lambda_{TT}$ is significant at time points 16, 26 and 30.
The presence of LS is prominent at these time points, implying that $\lambda_{AT}$ and $\lambda_{TT}$ are each significant at one time point fewer than $\lambda_{LT}$. For this February period, while the AO and TC outlier models each confirmed as significant 30% of the weights of the outliers injected into the series, the LS model confirmed 40% of the injected weights, showing a more powerful feat in capturing the outliers.

In Table 3, $\lambda_{IT}$ is significantly greater than the critical value at time points 2 and 6. The $\lambda_{AT}$, $\lambda_{LT}$ and $\lambda_{TT}$ are statistically significant at time point 2 only. The obvious reason is that the regime behaves differently in the presence of outliers, as all the models capture the outlier at time point 2. The presence of IO is prominent at time points 2 and 6 because the $\lambda$-value of $\lambda_{IT}$ is greater than those of $\lambda_{AT}$, $\lambda_{LT}$ and $\lambda_{TT}$. For the October period only the IO outlier model confirmed as significant 20% of the weights of the outliers injected into the series.

Table 1: The Values of Likelihood Ratio Test Statistic for January

TIMING  D_AO      λ_AT       D_IO      λ_IT      D_LS      λ_LT       D_TC      λ_TT
2       3.1       0.4633344  3.1       0.489     3.131     0.488487   3.1       0.4881966
6       40.5      0.813395   40.5      0.785461  40.39035  0.8585     40.35     0.85347
11      133.      .6716945   133.      .137311   133.333   .815356    133.      .8159087
16      104.475   .0955351   104.475   1.440965  104.5795  .08158     104.475   .107953
18      11.65     .595074    16.533    -1.8171   11.767    .3810051   11.65     .389591
21      5.5       0.511473   5.5       -1.877    5.555     0.5389759  5.5       0.541991
24      109.875   .038471    111.734   1.89186   110.06    .339371    109.977   .347078
26      104.475   .0955351   3.31      -0.11     104.5795  .08158     104.475   .103044
27      4.075     0.8439305  50.4      -0.97134  4.11708   0.889310   4.17948   0.8936314
30      167.05    3.3501484  167.05    3.117334  167.191   3.53094    167.05    3.5308067

Table 2: The Values of Likelihood Ratio Test Statistic for February

TIMING  D_AO      λ_AT        D_IO      λ_IT        D_LS      λ_LT         D_TC      λ_TT
2       1.75      0.187740    1.75      0.106609    1.7675    0.8350018    1.75      0.1933979
6       5.5       0.77304787  5.5       0.4994937   5.555     1.167353469  5.5       0.79651665
11      100.45    1.47873014  100.45    0.8430651   100.554   .3979867     100.45    1.5404896
16      195.45    .87757867   195.45    1.3834307   195.6455  4.34589131   195.45    .96574179
18      13.95     0.0540986   -177.     -.5676378   13.96395  0.31018493   13.95     0.1455869
21      77.85     1.14631956  77.85     -1.115875   77.9755   1.731011766  77.85     1.1810453
24      75.975    1.1187107   75.975    .378617     76.05098  1.68937346   75.975    1.15357435
26      185.35    .7885898    185.35    1.0476535   185.5103  4.10757191   185.35    .8117351
27      33.55     0.4936468   33.55853  -0.348448   33.55853  0.745438684  33.71033  0.5141314
30      136.5     .0099446    136.5     -0.4199967  136.6365  3.0351190    136.5     .0709518
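The percentages quoted in the discussion appear to be the share of the ten injected time points at which a statistic exceeds the critical value in absolute terms. A minimal sketch of that arithmetic follows, assuming this reading is correct; the function name is hypothetical and the statistic values are made up for illustration, not copied from the tables.

```python
def percent_significant(lambdas, critical=1.96):
    """Percentage of statistics whose absolute value exceeds the critical value."""
    flagged = sum(1 for value in lambdas if abs(value) > critical)
    return 100.0 * flagged / len(lambdas)


# Ten made-up statistics, one per injected time point.
illustrative = [0.2, 0.8, 1.5, 2.9, 0.1, 1.1, 1.1, 2.8, 0.5, 2.0]
share = percent_significant(illustrative)  # three of ten exceed 1.96
```

With ten injected time points each significant statistic contributes ten percentage points, so three detections give 30% and four give 40%.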
Table 3: The Values of Likelihood Ratio Test Statistic for October

TIMING  D_AO    λ_AT       D_IO    λ_IT      D_LS    λ_LT       D_TC    λ_TT
2       75.85   3.903016   75.85   3.57088   75.901  3.91153    75.85   3.903016
6       37.     1.61433    37.     .05393    37.37   1.6146689  37.     1.617533
11      1.375   0.975331   1.375   -1.58691  1.396   0.97773    1.375   0.991473
16      .65     0.1139076  .65     -1.09745  .678    0.116133   .65     0.1148351
18      1.75    0.055365   -0.95   -0.79319  1.76    0.055398   1.75    0.0554404
21      16.75   0.75754    16.75   0.618397  16.74   0.759658   16.75   0.758093
24      3.15    0.1366891  3.36    0.78199   3.153   0.13670    3.1515  0.1374799
26      15.15   0.6574094  1.513   -0.04419  15.165  0.657584   15.15   0.657546
27      16.8    0.790085   18.91   0.767088  16.817  0.79179    16.815  0.7303168
30      0.15    0.006509   0.15    -0.33009  0.15    0.0065043  0.15    0.007387

4. Conclusion

The periodic likelihood ratio test statistic was utilized to assess the evidence that any estimated outlier for a given period is an outlier. The conditions are that the timing $t = T$ of the occurrence of an outlier is known or assumed and the magnitude of the outlier has been estimated or specified. For the Maun Airport data it is observed that for the January period the AO, LS and TC outlier models each confirm as significant 60% of the weights of the outliers injected into the series, while the IO model detects only two time points. In the February period the AO and TC outlier models each confirmed as significant 30% of the weights of the outliers injected into the series, whereas the LS model confirmed 40% of the injected weights, displaying a more powerful feat in capturing the time points of outliers. In the October period only the IO outlier model confirmed as significant 20% of the weights of the outliers injected into the series. Our findings reveal that two of the four outlier models perform better than the others in terms of their ability to capture the timing of the occurrence of outliers.

References

[1] Bianco A. M., Garcia Ben M., Martinez E. J. & Yohai V. J., Outliers detection in regression models with ARIMA errors using robust estimations, Journal of Forecasting, 20(2001): 565-579.
[2] Bruce A. G. and Martin R. D., Leave-k-out diagnostics for Time Series, J. R. Statist. Soc. B, 51(1989): 363-424.
[3] Chang I. and Tiao G. C., Estimation of Time Series Parameters in the presence of outliers, Technical Report 8, University of Chicago, Statistics Research Center, 1983.
[4] Cook R. D., Detection of Influence Observations in Linear Regression, Technometrics, 19(1977): 15-18.
[5] Fox A. J., Outliers in time series, Journal of Royal Stat. Soc., 34(1972): 350-363.
[6] Pena D., Influence Observations in Time Series, Technical Report 178, Mathematics Research Center, University of Wisconsin, Madison, 1984.
[7] Tsay R. S., Time Series Model Specification in the presence of Outliers, J.A.S.A, 81(1986): 132-141.
[8] Lasisi T. A., Shangodoyin D. K. and Moeng S. R. T., Specification of Periodic Autocovariance Structures in the Presence of Outliers, Studies in Mathematical Sciences, 6(2)(2013): 83-95.