On the Influential Points in the Functional Circular Relationship Models

Department of Mathematics, Faculty of Science
Al-Azhar University-Gaza, Gaza, Palestine
alzad33@yahoo.com

Abstract

If the interest is to calibrate two instruments, then functional relationship models are more appropriate than regression models. Fitting a straight line when both variables are circular and subject to errors has not received much attention. In this paper, we consider the problem of detecting influential points in two functional relationship models for circular variables. The first is developed based on the simple circular regression model, denoted by (SC), while the second is derived from the complex linear regression model and denoted by (CL). The covariance matrices are derived and then the COVRATIO statistics are formulated for both models. The cut-off points are obtained and the power of performance is examined via simulation studies. The performance of the COVRATIO statistics depends on the concentration of error, sample size and level of contamination. In the case of a linear relationship between two circular variables, the COVRATIO statistic of the (SC) model performs better than that of the (CL) model. Furthermore, a novel diagram, the so-called spoke plot, is utilized to detect possible influential points. For illustration purposes, the proposed procedures are applied to real data of wind directions measured by two different instruments. The COVRATIO statistic and the spoke plot were able to identify two observations as influential points.

Keywords: Correlation, Radar, Errors, Estimation, Wind.

1. Introduction

In practice, calibrating two or more instruments producing angular measurements can be handled statistically via circular functional relationship models (see Caires and Wyatt, 2003; Hassan et al., 2010), taking into account that the compared variables are subject to errors. Thus, the existence of one or more influential points in a data set is likely to affect the efficiency of a suggested model.
To date, only two published papers consider the problem of influential points in circular functional relationship models. The first studies the linear functional relationship model for circular variables (SC) proposed by Caires and Wyatt (2003), for which the COVRATIO statistic was derived to detect possible influential points by Hussin et al. (2010). The second paper treats the circular variables as complex numbers; the complex linear functional relationship model (CL) was developed and its COVRATIO statistic was derived by Hussin and Abuzaid (2012). Caires and Wyatt (2003) fixed the slope parameter to be one; later, Hussin et al. (2010) derived the COVRATIO based on the intercept and concentration parameters only.
Thus, introducing the slope parameter into the model will make the covariance matrix more informative. On the other hand, the investigation of the COVRATIO statistic performance for the (CL) model is questionable, since the real and imaginary components were contaminated separately and, as given by Hussin and Abuzaid (2012), the power of the COVRATIO statistic was more than 0.5 for the contamination-free case. In this paper, we compare the performances of the COVRATIO statistics in detecting influential points in the (SC) and the (CL) models by considering two issues: the first is to introduce the slope parameter for the (SC) model, and the second is to contaminate both models consistently.

The rest of the paper is organized as follows: the following section presents the two considered functional relationship models of circular data. Section 3 discusses the derivation of the COVRATIO statistics, the calculation of the cut-off points and the power of performance. A real data set is presented and analyzed in Section 4.

2. Functional Relationship Models of Circular Variables

Fitting a straight line when both variables are circular and subject to errors has not received much attention. In the following two subsections we consider two functional relationship models for circular variables and derive their corresponding COVRATIO statistics.

2.1 Linear Functional Relationship Model for Circular Variables (SC)

For any two circular random variables X and Y measured with errors, Caires and Wyatt (2003) proposed the following model:

$$x_i = X_i + \delta_i \quad \text{and} \quad y_i = Y_i + \varepsilon_i, \quad \text{where } Y_i = \alpha + X_i \pmod{2\pi}, \tag{1}$$

for $i = 1, \ldots, n$, where $\delta_i$ and $\varepsilon_i$ are independently distributed with von Mises distributions, that is, $\delta_i \sim VM(0, \kappa)$ and $\varepsilon_i \sim VM(0, \nu)$. Under the same assumptions, Hussin (2008) extended model (1) to the (SC) model, which is given by:

$$x_i = X_i + \delta_i \quad \text{and} \quad y_i = Y_i + \varepsilon_i, \quad \text{where } Y_i = \alpha + \beta X_i \pmod{2\pi}, \tag{2}$$

for $i = 1, \ldots, n$, where $\beta$ is the slope parameter. There are $(n+4)$ parameters to be estimated, i.e., $\alpha$, $\beta$, $\kappa$, $\nu$ and the incidental parameters $X_1, \ldots, X_n$. The MLE of the intercept parameter is given by:

$$\hat{\alpha} = \begin{cases} \tan^{-1}(S/C) & S > 0,\ C > 0, \\ \tan^{-1}(S/C) + \pi & C < 0, \\ \tan^{-1}(S/C) + 2\pi & S < 0,\ C > 0, \end{cases}$$
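where $S = \sum \sin(y_i - \hat{\beta} x_i)$ and $C = \sum \cos(y_i - \hat{\beta} x_i)$. The slope and $X_i$ estimates are given by

$$\hat{\beta}_1 \approx \hat{\beta}_0 + \frac{\sum \hat{X}_i \sin(y_i - \hat{\alpha} - \hat{\beta}_0 \hat{X}_i)}{\sum \hat{X}_i^2 \cos(y_i - \hat{\alpha} - \hat{\beta}_0 \hat{X}_i)}$$

and

$$\hat{X}_{i1} \approx \hat{X}_{i0} + \Delta_i, \quad \text{where } \Delta_i = \frac{\sin(x_i - \hat{X}_{i0}) + \lambda \hat{\beta} \sin(y_i - \hat{\alpha} - \hat{\beta} \hat{X}_{i0})}{\cos(x_i - \hat{X}_{i0}) + \lambda \hat{\beta}^2 \cos(y_i - \hat{\alpha} - \hat{\beta} \hat{X}_{i0})},$$

where $\hat{\beta}_1$ and $\hat{X}_{i1}$ are improvements of $\hat{\beta}_0$ and $\hat{X}_{i0}$, respectively, and $\lambda = \nu / \kappa$. Then we can find an estimate of $\kappa$ for any value of $\lambda$ from the equation

$$A(\kappa) + \lambda A(\lambda \kappa) = \frac{1}{n} \left\{ \sum \cos(x_i - \hat{X}_i) + \lambda \sum \cos(y_i - \hat{\alpha} - \hat{\beta} \hat{X}_i) \right\}.$$

For the case $\lambda = 1$, the estimate of the concentration parameter, $\hat{\kappa}$, may be obtained by using the approximation given in Fisher (1993):

$$A^{-1}(w) = \begin{cases} 2w + w^3 + 0.833\, w^5 & w < 0.53, \\ -0.4 + 1.39\, w + 0.43 / (1 - w) & 0.53 \le w < 0.85, \\ (w^3 - 4w^2 + 3w)^{-1} & w \ge 0.85. \end{cases}$$

Hence, $\hat{\kappa} = A^{-1}(w)$, where $w = \frac{1}{2n} \left\{ \sum \cos(x_i - \hat{X}_i) + \sum \cos(y_i - \hat{\alpha} - \hat{\beta} \hat{X}_i) \right\}$.

The second partial derivatives of the log-likelihood function with respect to the parameters are obtained, and the Fisher information matrix is then formulated; its inverse gives the covariance matrix. The following results can then be obtained (for the detailed derivation see Hussin, 2008):

$$\mathrm{cov}(\hat{\alpha}, \hat{\kappa}) = \mathrm{cov}(\hat{\beta}, \hat{\kappa}) = 0, \qquad \mathrm{var}(\hat{\alpha}) = \frac{(1 + \hat{\beta}^2) \sum \hat{X}_i^2}{\hat{\kappa} A(\hat{\kappa}) \left[ n \sum \hat{X}_i^2 - (\sum \hat{X}_i)^2 \right]},$$

$$\mathrm{var}(\hat{\beta}) = \frac{n (1 + \hat{\beta}^2)}{\hat{\kappa} A(\hat{\kappa}) \left[ n \sum \hat{X}_i^2 - (\sum \hat{X}_i)^2 \right]}, \qquad \mathrm{var}(\hat{\kappa}) = \frac{\hat{\kappa}}{2n \left[ \hat{\kappa} - \hat{\kappa} A^2(\hat{\kappa}) - A(\hat{\kappa}) \right]},$$

and

$$\mathrm{cov}(\hat{\alpha}, \hat{\beta}) = \frac{-(1 + \hat{\beta}^2) \sum \hat{X}_i}{\hat{\kappa} A(\hat{\kappa}) \left[ n \sum \hat{X}_i^2 - (\sum \hat{X}_i)^2 \right]}.$$

2.2 Complex Linear Functional Relationship Model for Circular Variables (CL)

For any two circular random variables X and Y, Hussin and Abuzaid (2012) proposed the complex linear functional relationship model, which is given by:

$$(\cos x_i + i \sin x_i) = (\cos X_i + i \sin X_i) + \delta_i \quad \text{and} \quad (\cos y_i + i \sin y_i) = (\cos Y_i + i \sin Y_i) + \varepsilon_i, \tag{3}$$

where $(\cos Y_i + i \sin Y_i) = \alpha + \beta (\cos X_i + i \sin X_i)$, for $i = 1, \ldots, n$,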
where $\delta_i$ and $\varepsilon_i$ are independently distributed errors from bivariate complex Gaussian distributions. The MLEs of the model parameters are given by:

$$\hat{\alpha} = \frac{1}{n} \sum (\cos y_i - \hat{\beta} \cos \hat{X}_i),$$

$$\hat{\beta} = \frac{1}{n} \sum (\cos y_i \cos \hat{X}_i + \sin y_i \sin \hat{X}_i - \hat{\alpha} \cos \hat{X}_i),$$

and

$$\hat{X}_i = \tan^{-1} \left\{ \frac{\lambda \sin x_i + \hat{\beta} \sin y_i}{\lambda \cos x_i + \hat{\beta} \cos y_i - \hat{\alpha} \hat{\beta}} \right\},$$

where $\lambda$ denotes the ratio of the error variances, together with the estimate $\hat{\sigma}^2$ of the error variance (see Hussin and Abuzaid, 2012, for its explicit expression). Due to the absence of closed forms for $\hat{\alpha}$, $\hat{\beta}$, $\hat{X}_i$ and $\hat{\sigma}^2$, the estimates may be obtained iteratively.

The asymptotic properties of $\hat{\alpha}$ and $\hat{\beta}$ are obtained from Fisher's information matrix. The variances $\mathrm{var}(\hat{\alpha})$ and $\mathrm{var}(\hat{\beta})$ and the covariance $\mathrm{cov}(\hat{\alpha}, \hat{\beta})$ are ratios with a common denominator, expressed through quantities $a_1$, $a_2$, $b_1$, $b_2$ and $b_3$ that are functions of the data and of the parameter estimates, and $\mathrm{cov}(\hat{\alpha}, \hat{\sigma}^2) = \mathrm{cov}(\hat{\beta}, \hat{\sigma}^2) = 0$. For large values of n, these estimates are normally distributed and can be used to estimate the standard errors of $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\sigma}^2$ (see Hussin and Abuzaid, 2012).

3. COVRATIO Statistic

3.1 COVRATIO Statistic

Many procedures based on the row-deletion approach have been derived to identify influential points in linear regression models (see Belsley et al., 1980). The COVRATIO statistic is defined as the ratio of the determinants of the covariance matrices for the full and the reduced data. The COVRATIO is given by $COVRATIO_{(-i)} = |COV_{(-i)}| / |COV|$, where $COV$ is the covariance
matrix for the full data set and $COV_{(-i)}$ is the covariance matrix obtained by excluding the ith row. If the ratio is close to unity, then there is no significant difference between the covariance matrices, i.e. the ith observation is not influential. Recently, the COVRATIO statistic has been adapted for circular regression models by Abuzaid et al. (2011). The determinant of the coefficients covariance matrix for the (SC) model can be written in the following form:

$$|COV| = \frac{(1 + \hat{\beta}^2)^2}{\hat{\kappa}^2 A^2(\hat{\kappa}) \left[ n \sum \hat{X}_i^2 - (\sum \hat{X}_i)^2 \right]}, \tag{4}$$

and the determinant of the coefficients covariance matrix for the (CL) model, labelled (5), is the reciprocal of the common denominator of $\mathrm{var}(\hat{\alpha})$, $\mathrm{var}(\hat{\beta})$ and $\mathrm{cov}(\hat{\alpha}, \hat{\beta})$, where $a_1$, $a_2$, $b_1$, $b_2$ and $b_3$ are as given in Subsection 2.2. The $|COVRATIO_{(-i)} - 1|$ statistic is a logical formula for detecting a suspected influential point when its value exceeds the cut-off point.

3.2 Cut-off Points of the COVRATIO Statistic

A Monte Carlo simulation study is carried out to obtain the cut-off points of the COVRATIO statistics for the (SC) and the (CL) models. Seven different sample sizes, n = [?]0, 30, 50, 70, 100, 150 and 200, are used. We make use of the fact that the von Mises distribution with a large concentration parameter $\kappa$ tends to the normal distribution with variance $1/\kappa$ (Fisher, 1993; Jammalamadaka and SenGupta, 2001). We generate the variable X of size n from the von Mises distribution $VM(\pi/4, 3)$. Without loss of generality, the parameters of the (SC) and the (CL) models are fixed at $\alpha = 0$ and $\beta = 1$. Then the values of the variable Y are calculated based on models (2) and (3) separately. Assuming that the ratio of concentration parameters is $\lambda = 1$, random errors from the von Mises distribution with mean 0 and concentration parameter $\kappa$ = 5, 10, 15, 20 and 30 are added to obtain the observed variables as given in model (2). Thus, the variances of the random error of the (CL) model are 0.2, 0.1, 0.067, 0.05 and 0.03, respectively. The values of the error concentration parameters are chosen so that the error variation is small compared to that of the modeled variables. The generated circular data are fitted by models (2) and (3) independently.
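The data-generation step for the (SC) model can be sketched as follows. This is a minimal illustration, not the author's code: the function name `simulate_sc_data` is hypothetical, the mean direction π/4 and concentration 3 for X are an assumed reading of the text, and both error terms are given the same concentration (ratio of concentrations equal to one). It relies only on NumPy's `Generator.vonmises`.

```python
import numpy as np

def simulate_sc_data(n, kappa, alpha=0.0, beta=1.0, seed=None):
    """Simulate observed angles (x, y) from a linear circular functional model.

    Assumptions (hypothetical sketch): X ~ VM(pi/4, 3), Y = alpha + beta*X (mod 2*pi),
    and both measurement errors follow VM(0, kappa).
    """
    rng = np.random.default_rng(seed)
    two_pi = 2.0 * np.pi
    # Unobserved true angles X and their images Y under the linear relationship.
    X = rng.vonmises(np.pi / 4.0, 3.0, size=n) % two_pi
    Y = (alpha + beta * X) % two_pi
    # Observed angles: true values plus independent von Mises errors.
    x = (X + rng.vonmises(0.0, kappa, size=n)) % two_pi
    y = (Y + rng.vonmises(0.0, kappa, size=n)) % two_pi
    return x, y

# Example: one simulated sample of size 50 with error concentration 10.
x, y = simulate_sc_data(n=50, kappa=10.0, seed=1)
```

Reducing every angle modulo 2π keeps the simulated values on the same circular scale as the model; a larger `kappa` gives errors that are more tightly concentrated around zero.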
The $|COV|$ for the (SC) and the (CL) models are calculated using expressions (4) and (5). Next, the ith row of the generated data ($i = 1, \ldots, n$) is excluded in turn to compute the values of $|COVRATIO_{(-i)} - 1|$ and to specify the maximum value.
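The row-deletion loop described above can be sketched generically. This is a hedged illustration: the helper names `covratio_minus_one` and `ols_det_cov` are hypothetical, the statistic is assumed to be the absolute departure of the determinant ratio from one, and the determinant function used in the example is an ordinary least-squares stand-in, not expression (4) or (5). A function implementing either of those determinants could be passed in its place.

```python
import numpy as np

def covratio_minus_one(x, y, det_cov):
    """Return |det(COV_(-i)) / det(COV) - 1| for every observation i.

    det_cov(x, y) must return the determinant of the fitted model's
    covariance matrix for the data it is given.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    full = det_cov(x, y)
    stats = np.empty(len(x))
    for i in range(len(x)):
        keep = np.arange(len(x)) != i  # drop the i-th row only
        stats[i] = abs(det_cov(x[keep], y[keep]) / full - 1.0)
    return stats

def ols_det_cov(x, y):
    """Stand-in: determinant of the OLS covariance of (intercept, slope)."""
    n = len(x)
    design = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    s2 = resid @ resid / (n - 2)
    return np.linalg.det(s2 * np.linalg.inv(design.T @ design))

# Example: a straight line with small noise and one shifted observation.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = 0.2 + x + 0.01 * rng.standard_normal(30)
y[7] += 1.0
stats = covratio_minus_one(x, y, ols_det_cov)
suspect = int(np.argmax(stats))  # index of the largest statistic
```

Passing the determinant as a callable keeps the deletion loop independent of the model, so the same loop serves both the (SC) and the (CL) fits once their determinant functions are available.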
The process is repeated [?]000 times for each combination of sample size n and concentration parameter $\kappa$ (and variance values $\sigma^2$). Then the 10%, 5% and 1% upper percentiles of the maximum values of $|COVRATIO_{(-i)} - 1|$ are calculated. The results show that the percentiles are independent of the variation parameters: the standard deviations of the obtained cut-off points across the considered variation values range between 0.00 and 0.09. Table 1 presents the cut-off points, which are the means of the percentiles, with the standard deviations in parentheses, for each sample size n.

Table 1: Cut-off points for the null distribution of the $|COVRATIO_{(-i)} - 1|$ statistic.

Model  Level  n=[?]0         n=30           n=50           n=70           n=100          n=150          n=200
SC     90%    0.7 (0.003)    0.4 (0.007)    0.34 (0.008)   0.09 (0.004)   0.78 (0.0)     0.057 (0.004)  0.0 (0.00)
SC     95%    .337 (0.09)    0.46 (0.00)    0.336 (0.00)   0.57 (0.003)   0.9 (0.00)     0.06 (0.006)   0.00 (0.00)
SC     99%    .857 (0.0)     0.69 (0.007)   0.374 (0.008)  0.30 (0.00)    0.0 (0.005)    0.084 (0.03)   0.04 (0.007)
CL     90%    5.577 (0.06)   0.84 (0.0)     0.68 (0.003)   0.494 (0.03)   0.386 (0.08)   0. (0.003)     0.77 (0.00)
CL     95%    6.695 (0.00)   0.870 (0.008)  0.680 (0.005)  0.503 (0.008)  0.49 (0.03)    0.306 (0.009)  0.9 (0.00)
CL     99%    7.0 (0.007)    0.960 (0.004)  0.74 (0.0)     0.536 (0.003)  0.474 (0.08)   0.48 (0.006)   0.9 (0.003)

For all sample sizes, the cut-off points of the COVRATIO statistic of the (SC) model are less than the corresponding ones of the (CL) model. For the small sample size (n = [?]0), the values of the cut-off points exceed one, reflecting the inappropriateness of the COVRATIO statistics of both models for detecting influential points in small samples. Furthermore, at a given level of significance, the cut-off point is a decreasing function of the sample size, which may reflect the relative effect of one point on the total weight of the sample.

3.3 Power of Performance

The performance of the two statistics is examined numerically via a series of simulation studies by considering five sample sizes, n = [?]0, 30, 50, 70 and 100, and two values of the concentration parameter, $\kappa$ = 10 and 20.
Two different types of association between circular variables are considered. The first is a linear association ($\beta = 1$), and the second is a nonlinear form of association.
We make use of the fact that the bivariate von Mises distribution with a large concentration parameter tends to a bivariate normal distribution whose variance depends on the concentration parameters and on the circular correlation between the two circular random variables (Singh et al., 2002). A procedure similar to that described in Subsection 3.2 is used to generate the data and, for the purpose of comparison between the COVRATIO statistics for the (SC) and the (CL) models, the generated data are contaminated at observation d as follows:

$$y_d^* = y_d + \lambda \pi \pmod{2\pi},$$

where $y_d^*$ is the value of $y_d$ after contamination and $\lambda$ is the level of contamination ($0 \le \lambda \le 1$). In order to generate two observed dependent circular random variables X and Y, for each sample size n a sample from the bivariate von Mises distribution with zero mean directions, concentration parameters $\kappa = (3, 3)$ and third (association) parameter equal to 1 is generated based on the rejection sampling algorithm proposed by Best and Fisher (1979). Thus, the variance of each variable becomes 0.375.

The process is repeated [?]000 times and the power of performance is calculated as the percentage of correct detections of the contaminated observation at position d. The results of the simulation study show that, in all cases, the power of performance is an increasing function of the contamination level. Figure 1 shows that the power of performance is a decreasing function of the sample size n. On the other hand, Figure 2 shows that the COVRATIO statistic for the (SC) model performs more efficiently than that of the (CL) model when the association between the circular random variables is linear.

[Figures 1 and 2: power (%) plotted against the contamination level from 0.0 to 1.0. Figure 1 shows curves for n = [?]0, 30, 50 and 100; Figure 2 shows curves for CL with $\sigma^2$ = 0.1, SC with $\kappa$ = 10, CL with $\sigma^2$ = 0.05 and SC with $\kappa$ = 20.]

Fig. 1: Power of the $|COVRATIO_{(-i)} - 1|$ statistic for the SC model with $\kappa$ = [?]0.
Fig. 2: Power of the $|COVRATIO_{(-i)} - 1|$ statistic for the SC and the CL models for n = 50 and $\kappa$ = 10 and 20 ($\sigma^2$ = 0.1
and 0.05)

Unlike the procedure used by Hussin and Abuzaid (2012), which contaminated the real and imaginary parts in the (CL) model separately, we contaminate the generated data before the transformation into the complex form. Thus, the results we have obtained are more
reasonable and consistent, where the power starts at almost zero when $\lambda = 0$ and approaches 100% when $\lambda$ goes to 1. In contrast with the case when the circular variables are linearly correlated, Figure 3 shows that the COVRATIO statistic for the (SC) model performs less efficiently than that of the (CL) model when the association is nonlinear.

[Figure 3: power (%) plotted against the contamination level from 0.0 to 1.0, with curves for linear CL, linear SC, nonlinear CL and nonlinear SC, all with n = 50.]

Fig. 3: Power of the $|COVRATIO_{(-i)} - 1|$ statistic, with n = 50, for SC and CL models, $\kappa$ = 10 and $\sigma^2$ = 0.1, respectively.
Fig. 4: Spoke plot of wind direction data measured by both techniques.

4. Numerical Example

A real data set consisting of 129 pairs of observations of wind direction was recorded by two different instruments: an HF radar system and an anchored wave buoy. Figure 4 shows the spoke plot of the wind direction data (Zubairi et al., 2008). The inner ring represents the HF radar while the outer ring represents the anchored wave buoy. Since almost none of the lines cross the inner circle, the data are highly correlated, with estimated circular correlation $\hat{r}_c = 0.95$. Furthermore, there are only two lines crossing the inner ring, which are associated with observations number 38 and 111. This indicates that the pairs corresponding to these two observations may be influential.

The data are fitted by the (SC) model, giving $\hat{Y} = 0.5 + 0.989 X \pmod{2\pi}$ with $\hat{\kappa}$ = [?].48, and the (CL) model is given by $\cos \hat{Y} + i \sin \hat{Y} = 7.57 \times 10^{-4} + 0.977 (\cos X + i \sin X)$ with $\hat{\sigma}^2 = 0.48$. The values of $|COVRATIO_{(-i)} - 1|$ for the (SC) and the (CL) models are given in Figure 5 and Figure 6, respectively. The statistics are able to identify two observations as influential points, namely observations 38 and 111. To investigate the effect of these two points, they are deleted and the data are refitted using the (SC) and the (CL) models. The values of the parameter estimates and their standard errors are given in Table 2. It is noticeable that for both models the estimates of intercept and slope become closer to zero
and one, respectively. The standard error of the slope parameter in the (SC) model for the reduced data is less than that for the full data. On the other hand, for the (CL) model the standard error is almost the same. This indicates that the COVRATIO statistic is more efficient for the (SC) model than for the (CL) model in the considered data.

[Figures 5 and 6: values of $|COVRATIO_{(-i)} - 1|$ plotted against the observation index.]

Fig. 5: The $|COVRATIO_{(-i)} - 1|$ statistic of wind data fitted by the (SC) model.
Fig. 6: The $|COVRATIO_{(-i)} - 1|$ statistic of wind data fitted by the (CL) model.

Table 2: Parameter estimates of the (SC) and the (CL) models, with standard errors in parentheses.

Parameter          SC, full data   SC, reduced data   CL, full data                 CL, reduced data
$\hat{\alpha}$     0.5 (0.07)      0.04 (0.059)       7.5 x 10^-5 (.5 x 10^-5)      -0.90 x 10^-5 (.0 x 10^-5)
$\hat{\beta}$      0.989 (0.07)    0.999 (0.04)       0.977 (.88 x 10^...)          .00 (.88 x 10^...)
$\hat{\kappa}$     .468 (0.004)    6.739 (0.00)       -                             -
$\hat{\sigma}^2$   -               -                  0.48 (8.38 x 10^-3)           0.8 (8.7 x 10^-3)

Conclusion

The COVRATIO statistic as an identifier of influential points in functional relationship models of circular variables is derived and tested for two types of models. If two circular variables are linearly correlated, then the COVRATIO statistic of the (SC) model performs better than that of the (CL) model, and vice versa for other types of association. The contamination procedure used here for the (CL) model has shown a reasonable performance compared to the procedure used previously by Hussin and Abuzaid (2012). The application of the proposed statistics for both models to wind data has led to the consistent conclusion that two points are influential. Other functional relationship models for circular data based on nonlinear association between variables need to be studied.
References

1. Caires, S. and Wyatt, L.R. (2003). A linear functional relationship model for circular data with an application to the assessment of ocean wave measurements. J. Agric. Biol. Environ. Stat., 8(2), pp. 153-169.
2. Hassan, S.F., Zubairi, Y.Z. and Hussin, A.G. (2010). Analysis of missing values in simultaneous linear functional relationship model for circular variables. Scientific Research and Essays, 5(12), pp. 1483-1491.
3. Hussin, A.G., Abuzaid, A., Zulkifli, F. and Mohamed, I. (2010). Asymptotic covariance and detection of influential observations in a linear functional relationship model for circular data with application to the measurements of wind directions. ScienceAsia, 36, pp. 249-253.
4. Hussin, A.G. and Abuzaid, A. (2012). Detection of outliers in functional relationship model for circular variables via complex form. Pakistan Journal of Statistics, 28(1), pp. 205-216.
5. Hussin, A.G. (2008). Asymptotic properties of parameters for linear circular functional relationship model. Asian Journal of Mathematics and Statistics, (), pp. 8-5.
6. Fisher, N.I. (1993). Statistical Analysis of Circular Data. Cambridge University Press.
7. Belsley, D.A., Kuh, E. and Welsch, R.E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: John Wiley & Sons.
8. Abuzaid, A., Mohamed, I., Hussin, A.G. and Rambli, A. (2011). COVRATIO statistic for simple circular regression model. Chiang Mai J. Sci., 38(3), pp. 321-330.
9. Jammalamadaka, S.R. and SenGupta, A. (2001). Topics in Circular Statistics. Singapore: World Scientific Press.
10. Singh, H., Hnizdo, V. and Demchuk, E. (2002). Probabilistic model for two dependent circular variables. Biometrika, 89, pp. 719-732.
11. Best, D.J. and Fisher, N.I. (1979). Efficient simulation of the von Mises distribution. Journal of the Royal Statistical Society, Series C, 28(2), pp. 152-157.
12. Zubairi, Y.Z., Hussain, F. and Hussin, A.G. (2008). An Alternative Analysis of Two Circular Variables via Graphical Representation: An Application to the Malaysian Wind Data. Computer and Information Science, 1(4), pp. 3-8.