ARIMA Methods of Detecting Outliers in Time Series Periodic Processes

Article
International Journal of Modern Mathematical Sciences, 2014, 11(1): 40-48
Journal homepage: www.modernscientificpress.com/journals/ijmms.aspx
ISSN: 2166-286X, Florida, USA

T. A. Lasisi 1 and D. K. Shangodoyin 2,*
1 Department of Mathematics and Statistics, The Polytechnics Ibadan, Nigeria
2 Department of Statistics, University of Botswana, Botswana
* Author to whom correspondence should be addressed. E-mail: shangodoyink@mopipi.ub.bw

Article history: Received 4 March 2014, Received in revised form 7 May 2014, Accepted 4 June 2014, Published 8 July 2014.

Abstract: We utilized the periodic likelihood ratio test statistic to assess the evidence that any estimated outlier for a given period is an outlier. The conditions are that the timing $t = T$ of the occurrence of an outlier is known or assumed and that the magnitude of the outlier has been estimated or specified. In the February period, the additive and transitory change outlier models each confirmed 30% of the weights of the outliers injected into the series, whereas the level shift model confirmed 40% of the injected weights, displaying a more powerful feat in capturing the time points of outliers. In the October period only the innovative outlier model confirmed 20% of the weights of the outliers injected into the series. Our findings reveal that the level shift and transitory change models perform better in terms of their ability to capture the timing of outliers.

Keywords: Likelihood ratio test statistic; ARMA models; Periodic processes

2000 Mathematical Subject Classification: C15 C C5

1. Introduction

In practice there may be a need to detect outliers if they are influential on a model fit. Fox (1972) appears to have been the first to address the detection of outliers in time series, assuming an autoregressive process on an outlier-free series with Gaussian noise. The approach is a likelihood ratio criterion for testing the existence of additive and innovative outliers under the condition that serial correlation exists, which is a good condition for analyzing periodic series. A number of research works have been done on both least squares and maximum likelihood methods of detecting outliers assuming known process models (Bianco et al 2001, Tsay 1996). Bruce and Martin (1989), commenting on Cook's distance statistic (Cook 1977), observe that the test statistic based on the influence of the i-th observation on the parameters of the regression model is well established, but that the dependency relations which exist in time series give rise to a smearing effect when the test statistic for a model coefficient is calculated; hence the need for a test statistic specified for time series ARMA models. The identification of influential observations in the context of ARIMA models has been developed (Chang and Tiao 1983, Pena 1984, Tsay 1986 and 1996).

In this paper we utilized the statistic derived in (Lasisi T. A. et al 2013), Section 3.1 (3.1.1-3.1.4), PP 9, to assess the evidence that any estimated outlier for a given period is an outlier. The conditions are that the timing $t = T$ of the occurrence of an outlier is known or assumed and that the magnitude of the outlier has been estimated or specified. Consideration is given to ARIMA models on periodic processes using the likelihood ratio test statistic developed by Tsay (1986), which is extended in this study to level shift and transitory change outliers for periodic series.

2. Outlier Detection Test Statistic

If we assume a single outlier model:

$$Z_{t(r,m)} = Y_{t(r,m)} + \omega(B)\, D_{t(r,m)}\, \xi_t^{(T)} \qquad (1)$$

where $\omega(B)$ is the weight attached to the magnitude of outliers and $\xi_t^{(T)}$ marks the time point $t = T$ of the outlier. Suppose that the ARIMA models

$$\phi(B:I)\, Y_{t(r,m)} = \theta(B:I)\, \varepsilon_{t(r,m)} \quad \text{and} \quad \phi(B:I)\, Z_{t(r,m)} = \theta(B:I)\, e_{t(r,m)} \qquad (2)$$

where $\pi(B) = \phi(B)\,\theta(B)^{-1}$, are fitted on the outlier-free series and the outlier-infested series respectively. $\varepsilon_{t(r,m)}$ is a sequence of independent Gaussian variates with mean zero and variance one.
In these models it is assumed that all the zeros of $\phi^{(m)}(B:I)$ and $\theta^{(m)}(B:I)$ are on or outside the unit circle and that $\phi^{(m)}(B:I)$ and $\theta^{(m)}(B:I)$ have no common factors. If some of the zeros of $\phi^{(m)}(B:I)$ are on the unit circle, it is assumed that the process starts at a fixed time point with known or given starting values. The white noise process $\varepsilon_t$ has mean zero and variance $\sigma^2$. Applying least squares theory, we have the following results:

(i) ADDITIVE OUTLIER CASE: The estimated magnitude of the outlier, $\hat D_{T(r,m)}$, is $P^{(1)}_{T(r,m)}$, with variance $\pi(B)^{-1}\sigma^2_{T(r,m)}$.

(ii) INNOVATIVE OUTLIER CASE: The estimate of the outlier magnitude is $P^{(2)}_{T(r,m)} = e_{t(r,m)}$, the residual of the fitted model, with variance $\sigma^2_{T(r,m)}$.

(iii) LEVEL SHIFT CASE: Given that the weight is $(1-B)^{-1}$, the magnitude of the outlier is estimated with variance $(1-B)^{-1}\pi(B)^{-1}\sigma^2_{t(r,m)}$.

(iv) TRANSITORY CHANGE CASE: The weight coefficient of the outlier is $(1-\delta B)^{-1}$, and the estimate of the outlier has variance $(1-\delta B)^{-1}\pi(B)^{-1}\sigma^2_{t(r,m)}$.

Based on the results in (i)-(iv) we may construct the test statistics for testing the existence of an outlier at time point $t = T$. The null hypothesis is that there is no outlier at time point $t = T$; under the assumption that the time series parameters and the time of occurrence of the outliers are known, the following test statistics are distributed as $N(0,1)$. Chang and Tiao (1983) suggested the critical values 3.0, 3.5 and 4.0, but in practical time series analysis we suggest the use of $P(|\lambda_{(i)T}| > c) = \alpha$ for a specified value of $\alpha$ under the normality assumption (Tsay 1988). Assume that $\lambda_{(\cdot)} \sim N(0,1)$ and suppose that the ARIMA$(p, d, q)$ model parameters, the timing of the outliers and their magnitude are known; then the test statistic for each scenario is as follows:

(i) Existence of an additive outlier:
$$\lambda_{AT} = \frac{\hat D_{T(r,m)}}{\pi(B)^{-1}\,\hat\sigma_{t(r,m)}} \qquad (3)$$

(ii) Existence of an innovative outlier:
$$\lambda_{IT} = \frac{\hat D_{T(r,m)}}{\hat\sigma_{t(r,m)}} \qquad (4)$$

(iii) Existence of a level shift:
$$\lambda_{LT} = \frac{\hat D_{T(r,m)}}{(1-B)^{-1}\,\pi(B)^{-1}\,\hat\sigma_{t(r,m)}} \qquad (5)$$

(iv) Existence of a transitory change:
$$\lambda_{TT} = \frac{\hat D_{T(r,m)}}{(1-\delta B)^{-1}\,\pi(B)^{-1}\,\hat\sigma_{t(r,m)}} \qquad (6)$$

2.1. Outliers Detection Algorithms (ODA)

The criteria proposed in (i) through (iv) are useful in detecting outliers in periodic processes. The magnitudes of the outliers are estimated with the Outlier Detection Algorithm (ODA) in Figure 1, starting from a particular time point $T$; outlier-free points are assumed to have zero magnitude, and the test statistics are used to check whether this holds. The result in this section is only used in the intermediate steps of the outlier detection procedure. The final estimates of the outliers come from the model incorporating all the outliers, in which all parameters of the ARIMA$(p, d, q)$ model are estimated. The following flow chart demonstrates how automatic outlier detection works. Let $\hat D_t$ be the magnitude of the outliers and $t = T$ the time at which an outlier occurs.
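As a rough illustration of how statistics of this kind are computed, the sketch below uses the ordinary (non-periodic) least squares forms for additive and innovative outliers in the style of Chang and Tiao (1983). This is an assumption made for illustration, not the periodic statistics (3)-(6) themselves; the function names `pi_weights` and `ao_io_statistics`, the AR(1) coefficient and the residual values are all hypothetical.

```python
import math


def pi_weights(phi, theta, n):
    """Coefficients c_0..c_{n-1} of pi(B) = phi(B) / theta(B).

    Assumed convention: phi(B) = 1 - phi_1 B - ... and
    theta(B) = 1 - theta_1 B - ..., so that theta(B) * pi(B) = phi(B).
    """
    a = [1.0] + [-p for p in phi]      # coefficients of phi(B)
    b = [1.0] + [-q for q in theta]    # coefficients of theta(B)
    c = []
    for j in range(n):
        a_j = a[j] if j < len(a) else 0.0
        acc = sum(b[i] * c[j - i] for i in range(1, min(j, len(b) - 1) + 1))
        c.append(a_j - acc)
    return c


def ao_io_statistics(resid, c, T, sigma):
    """Standardized statistics for an additive (AO) and an innovative (IO)
    outlier at index T, given residuals of the fitted model."""
    tail = resid[T:]
    w = c[:len(tail)]
    ss = sum(x * x for x in w)
    # AO: regress the residuals from T onward on the pi-weights.
    omega_ao = sum(e * x for e, x in zip(tail, w)) / ss
    lam_ao = omega_ao * math.sqrt(ss) / sigma
    # IO: the residual at T is itself the estimate.
    omega_io = resid[T]
    lam_io = omega_io / sigma
    return {"AO": (omega_ao, lam_ao), "IO": (omega_io, lam_io)}


# Hypothetical AR(1) fit with phi_1 = 0.5 and made-up residuals.
weights = pi_weights([0.5], [], 6)
result = ao_io_statistics([0.0, 0.0, 0.0, 2.0, -1.0, 0.0], weights, T=3, sigma=1.0)
print(result)
```

Each statistic is an estimated magnitude divided by its standard error, which is the same ratio that the algorithm in Section 3 computes from $\hat D_T$ and its variance.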

Figure 1: Outlier Detection Algorithm

3. Empirical Illustration

The computation follows the algorithm below for detecting outliers using the methodology described above. The Outlier Detection Algorithm (ODA) follows these steps:

(i) Read the periodic observations $Y_t$, $t = 1, \ldots, m$.
(ii) Estimate the parameters of the ARMA process using the SPSS or SYSTAT program.
(iii) Obtain $\pi(B) = \phi(B)\,\theta(B)^{-1}$ from the estimated ARMA processes in step (ii).
(iv) Read the values of $\hat D_T$ to compute $\hat\sigma_T$.
(v) Read the critical values $C$ as 1.645 (10%) and 1.96 (5%).

The program for computing the existence of outliers using the $\lambda_{(i)}$'s is:

(i) Do:
    Calculate $\hat\sigma_T$ from the values of $\hat D_T$.
    Calculate $\lambda_{it} = \hat D_{T_i(r,m)} / \sqrt{\mathrm{Var}(\hat D_{T_i(r,m)})}$ for each outlier type $i$.
    If $|\lambda_{it}| \ge C$, display $\lambda_{it}$; otherwise there is no outlier, then stop.
    If [...] then display [...]; otherwise recheck using a different [...].
(ii) End Do.
(iii) Go and read new periodic observations from the file and perform the algorithm.

We have estimated the time series ARMA parameters from the data collected on Maun Airport precipitation and concentrated on three outlier-free periods. The parameters shown in Table 1 are used for estimating $\pi(B)$ and $\sigma_t$ in equation (2) for all the outlier models considered. Under the null hypothesis that there is no outlier, the statistics $\lambda_{AT}$, $\lambda_{IT}$, $\lambda_{LT}$ and $\lambda_{TT}$ have the standard normal distribution. This makes the statistics $\lambda_{(\cdot)T}$ readily useful in practical modeling (Tsay 1986). For the practical purpose of detecting the existence and significance of an outlier at time point $T$: if $\lambda_{(\cdot)T}$ is significantly greater than the chosen critical value and $\lambda_{AT}$ is greater than each of $\lambda_{IT}$, $\lambda_{LT}$ and $\lambda_{TT}$, then the additive outlier is more pronounced at this point than any other outlier type. We assume the standard normal distribution, with a critical Z value of 1.96 at the 5% significance level.

In Table 1, $\lambda_{AT}$ is significantly greater than the critical value at time points 11, 16, 18, 24, 26 and 30. $\lambda_{IT}$ is statistically significant at time points 11 and 30. Both $\lambda_{LT}$ and $\lambda_{TT}$ are significant at time points 11, 16, 18, 24, 26 and 30. The presence of level shift and transitory change outliers is prominent at these time points because the $\lambda$-values of $\lambda_{LT}$ and $\lambda_{TT}$ are greater than those of $\lambda_{AT}$ and $\lambda_{IT}$. For this January period the level shift and transitory change outlier models confirmed 60% of the weights of the outliers injected into the series.

In Table 2, $\lambda_{AT}$ is significantly greater than the critical value at time points 16, 26 and 30. $\lambda_{IT}$ is statistically significant only at time points 18 and 24. While $\lambda_{LT}$ is significant at time points 11, 16, 26 and 30, $\lambda_{TT}$ is significant at time points 16, 26 and 30. The presence of the level shift outlier is prominent at these time points: $\lambda_{AT}$ and $\lambda_{TT}$ are each significant at one time point fewer than $\lambda_{LT}$. For this February period, the additive and transitory change outlier models each confirmed 30% of the weights of the outliers injected into the series, while the level shift model confirmed 40% of the injected weights, showing a more powerful feat in capturing the outliers.

In Table 3, $\lambda_{IT}$ is significantly greater than the critical value at time points 2 and 6. $\lambda_{AT}$, $\lambda_{LT}$ and $\lambda_{TT}$ are statistically significant at time point 2 only. The obvious reason is that the regime behaves differently in the presence of outliers, as all the models capture the outlier at time point 2. The presence of the innovative outlier is prominent at time points 2 and 6 because the $\lambda$-value of $\lambda_{IT}$ at time point 6 is greater than those of the other statistics. For the October period only the innovative outlier model confirmed 20% of the weights of the outliers injected into the series.

Table 1: The Values of Likelihood Ratio Test Statistic for January

TIMING   Additive                Innovative              Level shift             Transitory change
         D̂          λ_AT         D̂          λ_IT         D̂          λ_LT         D̂          λ_TT
2        3.1        0.4633344    3.1        0.489        3.131      0.488487     3.1        0.4881966
6        40.5       0.813395     40.5       0.785461     40.39035   0.8585       40.35      0.85347
11       133.       .6716945     133.       .137311      133.333    .815356      133.       .8159087
16       104.475    .0955351     104.475    1.440965     104.5795   .08158       104.475    .107953
18       11.65      .595074      16.533     -1.8171      11.767     .3810051     11.65      .389591
21       5.5        0.511473     5.5        -1.877       5.555      0.5389759    5.5        0.541991
24       109.875    .038471      111.734    1.89186      110.06     .339371      109.977    .347078
26       104.475    .0955351     3.31       -0.11        104.5795   .08158       104.475    .103044
27       4.075      0.8439305    50.4       -0.97134     4.11708    0.889310     4.17948    0.8936314
30       167.05     3.3501484    167.05     3.117334     167.191    3.53094      167.05     3.5308067

Table 2: The Values of Likelihood Ratio Test Statistic for February

TIMING   Additive                Innovative              Level shift             Transitory change
         D̂          λ_AT         D̂          λ_IT         D̂          λ_LT         D̂          λ_TT
2        1.75       0.187740     1.75       0.106609     1.7675     0.8350018    1.75       0.1933979
6        5.5        0.77304787   5.5        0.4994937    5.555      1.167353469  5.5        0.79651665
11       100.45     1.47873014   100.45     0.8430651    100.554    .3979867     100.45     1.5404896
16       195.45     .87757867    195.45     1.3834307    195.6455   4.34589131   195.45     .96574179
18       13.95      0.0540986    -177.      -.5676378    13.96395   0.31018493   13.95      0.1455869
21       77.85      1.14631956   77.85      -1.115875    77.9755    1.731011766  77.85      1.1810453
24       75.975     1.1187107    75.975     .378617      76.05098   1.68937346   75.975     1.15357435
26       185.35     .7885898     185.35     1.0476535    185.5103   4.10757191   185.35     .8117351
27       33.55      0.4936468    33.55853   -0.348448    33.55853   0.745438684  33.71033   0.5141314
30       136.5      .0099446     136.5      -0.4199967   136.6365   3.0351190    136.5      .0709518
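The decision rule used to read these tables (compare each statistic with a standard normal critical value, then report the outlier type with the largest absolute statistic) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `critical_value` and `classify`, the labels AO, IO, LS and TC, and the example statistics are hypothetical and are not taken from the tables.

```python
from statistics import NormalDist


def critical_value(alpha):
    """Two-sided standard normal critical value for significance level alpha."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def classify(stats, alpha=0.05):
    """Return (type, statistic) for the most pronounced outlier type at a
    time point, or None when no statistic reaches the critical value."""
    c = critical_value(alpha)
    name, value = max(stats.items(), key=lambda kv: abs(kv[1]))
    return (name, value) if abs(value) >= c else None


# Made-up statistics for two time points.
print(round(critical_value(0.05), 2))
print(classify({"AO": 2.1, "IO": 1.2, "LS": 2.8, "TC": 2.7}))
print(classify({"AO": 0.4, "IO": -0.3, "LS": 0.5, "TC": 0.45}))
```

Under this two-sided convention, 1.96 corresponds to the 5% level and 1.645 to the 10% level.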

Table 3: The Values of Likelihood Ratio Test Statistic for October

TIMING   Additive                Innovative              Level shift             Transitory change
         D̂          λ_AT         D̂          λ_IT         D̂          λ_LT         D̂          λ_TT
2        75.85      3.903016     75.85      3.57088      75.901     3.91153      75.85      3.903016
6        37.        1.61433      37.        .05393       37.37      1.6146689    37.        1.617533
11       1.375      0.975331     1.375      -1.58691     1.396      0.97773      1.375      0.991473
16       .65        0.1139076    .65        -1.09745     .678       0.116133     .65        0.1148351
18       1.75       0.055365     -0.95      -0.79319     1.76       0.055398     1.75       0.0554404
21       16.75      0.75754      16.75      0.618397     16.74      0.759658     16.75      0.758093
24       3.15       0.1366891    3.36       0.78199      3.153      0.13670      3.1515     0.1374799
26       15.15      0.6574094    1.513      -0.04419     15.165     0.657584     15.15      0.657546
27       16.8       0.790085     18.91      0.767088     16.817     0.79179      16.815     0.7303168
30       0.15       0.006509     0.15       -0.33009     0.15       0.0065043    0.15       0.007387

4. Conclusion

The periodic likelihood ratio test statistic was utilized to assess the evidence that any estimated outlier for a given period is an outlier. The conditions are that the timing $t = T$ of the occurrence of an outlier is known or assumed and that the magnitude of the outlier has been estimated or specified. For the Maun Airport data it is observed that for the January period the level shift and transitory change outlier models confirm 60% of the weights of the outliers injected into the series, while the innovative outlier model detects only two time points. In the February period the additive and transitory change outlier models each confirmed 30% of the weights of the outliers injected into the series, whereas the level shift model confirmed 40% of the injected weights, displaying a more powerful feat in capturing the time points of outliers. In the October period only the innovative outlier model confirmed 20% of the weights of the outliers injected into the series. Our findings reveal that the level shift and transitory change models perform better in terms of their ability to capture the timing of outliers.

References

[1] Bianco A. M., Garcia B. M., Martinez E. J. & Yohai V. J. Outliers detection in regression models with ARIMA errors using robust estimations. Journal of Forecasting, 20 (2001): 565-579.
[2] Bruce A. G. and Martin R. D. Leave-k-out diagnostics for Time Series. J. R. Statist. Soc. B, 51 (1989): 363-424.
[3] Chang I. and Tiao G. C. Estimation of Time Series Parameters in the presence of outliers. Technical Report 8, University of Chicago, Statistics Research Center, 1983.
[4] Cook R. D. Detection of Influence Observations in Linear Regression. Technometrics, 19 (1977): 15-18.
[5] Fox A. J. Outliers in time series. Journal of Royal Stat. Soc., 34 (1972): 350-363.
[6] Pena D. Influence Observations in Time Series. Technical Report 178, Mathematics Research Center, University of Wisconsin, Madison, 1984.
[7] Tsay R. S. Time Series Model Specification in the presence of Outliers. J.A.S.A., 81 (1986): 132-141.
[8] Lasisi T. A., Shangodoyin D. K. and Moeng S. R. T. Specification of Periodic Autocovariance Structures in the Presence of Outliers. Studies in Mathematical Sciences, 6(2) (2013): 83-95.