World Academy of Sciece Egieerig ad echology Iteratioal Joural of Mathematical ad Computatioal Scieces Vol: No:6 008 rimmed Mea as a Adaptive Robust Estimator of a Locatio Parameter for Weibull Distributio Carolia B. Baguio Iteratioal Sciece Idex Mathematical ad Computatioal Scieces Vol: No:6 008 waset.org/publicatio/9953 Abstract Oe of the purposes of the robust method of estimatio is to reduce the ifluece of outliers i the data o the estimates. he outliers arise from gross errors or cotamiatio from distributios with log tails. he trimmed mea is a robust estimate. his meas that it is ot sesitive to violatio of distributioal assumptios of the data. It is called a adaptive estimate whe the trimmig proportio is determied from the data rather tha beig fixed a priori. he mai objective of this study is to fid out the robustess properties of the adaptive trimmed meas i terms of efficiecy high breakdow poit ad ifluece fuctio. Specifically it seeks to fid out the magitude of the trimmig proportio of the adaptive trimmed mea which will yield efficiet ad robust estimates of the parameter for data which follow a modified Weibull distributio with parameter λ / the trimmig proportio is determied by a ratio of two trimmed meas defied as the tail legth. Secodly the asymptotic properties of the tail legth ad the trimmed meas are also ivestigated. Fially a compariso is made o the efficiecy of the adaptive trimmed meas i terms of the stadard deviatio for the trimmig proportios ad whe these were fixed a priori. he asymptotic tail legths defied as the ratio of two trimmed meas ad the asymptotic variaces were computed by usig the formulas derived. While the values of the stadard deviatios for the derived tail legths for data of size 40 simulated from a Weibull distributio were computed for 00 iteratios usig a computer program writte i Pascal laguage. he fidigs of the study revealed that the tail legths of the Weibull distributio icrease i magitudes as the trimmig proportios icrease the measure of the tail legth ad the adaptive trimmed mea are asymptotically idepedet as the umber of observatios becomes very large or approachig ifiity the tail legth is asymptotically distributed as the ratio of two idepedet ormal radom variables ad the asymptotic variaces decrease as the trimmig proportios icrease. he simulatio study revealed empirically that the stadard error of the adaptive trimmed mea usig the ratio of tail legths is relatively smaller for differet values of trimmig proportios tha its couterpart whe the trimmig proportios were fixed a priori. Keywords Adaptive robust estimate asymptotic efficiecy breakdow poit ifluece fuctio L-estimates locatio parameter tail legth Weibull distributio. I I. INRODUCION has bee geerally realized that outliers i the data which do ot appear to come from the ormal distributio but B. Baguio is with MSU-II Iliga City 900 Philippies (e-mail: cbbaguio@yahoo.com). may have arise from Weibull distributio with log (heavy ) tails have uusually large ifluece o the estimates. Robust methods of estimatio have bee developed to reduce the ifluece of outliers i the data o the estimates. Statistics which are represeted as a liear combiatio of order statistics called L-estimates make a proper class of robust estimates for estimatig a locatio parameter. he trimmed mea deoted by is a special class of L-estimates. Adaptive estimators pertai to those which are derived from the distributio of the data. For coveiece i the choice of trimmig proportios for trimmed meas the user or researcher simply cosider the proportio of outlyig data o the left ad right tails which caused the iflatio of the variace which are usually termed as trimmig proportios whe fixed a priori. I may studies it was show theoretically ad empirically that choosig the trimmig proportios based from the distributio of the data proved to be more efficiet tha beig fixed. his approach is presetly called the Exploratory Data Aalysis (EDA) i the structure of the data ca be discered through the graphs of the distributio i order to rule out the presece of outliers as well as to get a visual perceptio o the legth of the tails of the distributio ad the symmetry. I the case of asymmetric distributios the side of the distributios havig loger tails must be trimmed with sufficietly large proportio. he use of the ratio of the legth of tails i the selectio of the trimmig proportios is ot yet popular eve up to these days ad still has to be explored for both the symmetric ad asymmetric distributios with log tails. II. OBJECIVES OF HE SUDY he mai objective of this study is to fid out the robustess properties of the adaptive trimmed meas i terms of efficiecy high breakdow poit ad ifluece fuctio. Specifically it seeks to fid out the magitude of the trimmig proportio of the adaptive trimmed mea which will yield efficiet ad robust estimates of the parameter for data which follows a modified Weibull distributio with parameter λ / the trimmig proportio is determied by a ratio of two trimmed meas defied as the tail legth. Secodly the asymptotic properties of the tail legth ad the trimmed meas are also ivestigated. Fially a compariso is made o the efficiecy adaptive trimmed meas i terms of the stadard deviatio for the trimmig proportios ad whe these were fixed a priori. Iteratioal Scholarly ad Scietific Research & Iovatio (6) 008 35 scholar.waset.org/307-689/9953
World Academy of Sciece Egieerig ad echology Iteratioal Joural of Mathematical ad Computatioal Scieces Vol: No:6 008 Iteratioal Sciece Idex Mathematical ad Computatioal Scieces Vol: No:6 008 waset.org/publicatio/9953 III. HEORY AND CONCEPS A. Weibull Distributio he Weibull distributio for 4 parameter values of λ is show i Fig. below. It is apparet from this figure that the desity fuctios for the four parameter values are log tailed. his implies that the variaces of the distributio are relatively large ad surely will affect the Ordiary Least Squares (OLS) estimates. I order to address this situatio the estimate to be cosidered is the adaptive trimmed mea. However the problem that is outstadig is what is the magitude of the trimmig proportio o the tails of the distributio which ca result to efficiet estimates? he probability distributio of the Weibull distributio is: () k >0 is the shape parameter ad the λ >0 is the scale parameter ad x is oegative real umber. he cumulative distributio fuctio is he mea is. ad the variace is. he modified Weibull distributio used i this study has the probability desity λ x F ( Γ ( + / λ )) e λ is the shape parameter. he variace of this distributio is Γ(3/ λ) Var (F) Γ(/ λ) Fig. Probability Desity Fuctio of Weibull Distributio () for parameters λ.50.0.5 ad 3.0 ad k34 respectively B. Properties of rimmed Mea ( ) A umber of ice properties of have bee cited i the textbook [7]. Refer to this book for short i the followig. he properties are give as follows: () is robust to outliers up to 00 % o the left side ad 00 ( )% o the right side (ii) the asymptotic efficiecy relative to the utrimmed mea is ( ) ad (iii). It is simple to compute ad its stadard error may be estimated from the ( + ) wisorized sample. Moreover is cosistet ad asymptotically ormal whe the uderlyig distributio F is cotiuous at the uique quatiles ξ F ( ) ad ξ F ( ) [8]. Geerally a larger proportio of data poits should be trimmed whe F has a loger tail ad a smaller proportio otherwise. herefore the choice of the trimmig proportios would deped upo a priori kowledge of the tail-legth of F. his iformatio may ot be available. herefore there is a eed to determie the trimmig proportios from the sample itself. he trimmed mea is called adaptive whe the trimmig proportios are determied from the data. [4] cosidered the kurtosis as a measure of the tail-legth ad use the sample kurtosis to determie the trimmig proportios. His estimate of the ceter of a symmetric distributio is give by a combiatio of several trimmed meas associated with differet trimmig proportios. Subsequetly [5] proposed a choice of the trimmig proportios based o the ratio R of two L-estimates. I a recet paper [] have reviewed Hogg s method ad preseted certai theoretical ad empirical results o the applicatio of R as a measure of tail-legth to determie the trimmig proportios of a adaptive trimmed mea. Here we derive the robustess ad asymptotic properties of the estimates ad R. Let F be the empirical distributio derived from a sample of observatios from the modified Weibull distributio with parameter λ ½. Cosider ow the robustess of with respect to the give measures of robustess such as the breakdow poit ad the Ifluece fuctio developed by [6]. Deote the descriptive measure of the trimmed mea by (F) ad the sample estimate (F ) by suppressig for coveiece. Assume that F is a cotiuous distributio with a locatio parameter θ which is to be estimated. If ( ) the (F) is a measure of locatio of F. [7]. By a outlier i a locatio parameter cotext without beig very specific refers to a observatio which is cosiderably larger i absolute value tha the bulk of the sample values. Various specific defiitios of a outlier ca be give. For example a observatio may be desigated as a outlier if it is more tha two or three times the iterquartile rage from the media. [7] defied the breakdow poit ε * of (F) as the miimum proportio ε of outlier cotamiatio at x for which ) is ubouded i x F x ε is give by ( F x ε ( ε) F + ε x. he fiite sample Iteratioal Scholarly ad Scietific Research & Iovatio (6) 008 353 scholar.waset.org/307-689/9953
World Academy of Sciece Egieerig ad echology Iteratioal Joural of Mathematical ad Computatioal Scieces Vol: No:6 008 Iteratioal Sciece Idex Mathematical ad Computatioal Scieces Vol: No:6 008 waset.org/publicatio/9953 breakdow poit ε * is the smallest proportio of the observatios i the sample which ca reder the estimator out of boud. It is easily see that the breakdow poit ε * for the trimmed mea is equal to mi( ). [] gives the derivatio of the ifluece fuctio of ξ ad ξ as the quatiles are uiquely determied. ( ) I F ( x) as show () with ad quatiles of F. hese x ξ W W ξ W x < ξ ξ x ξ ( ) ( F ) + ξ + ( x > ξ W ) ξ (3) deotes the ( ) wisorized mea. Note that () E I F( x) 0 (4) C. Asymptotic Property of the rimmed Mea [] derived the asymptotic property of the trimmed mea as follows ( F ) + I + F ( x) d ( F F ( x) R ( F ) + I F d F ( x) + R ( F ) + / I F ( x + i ) R i P 0 as ad x i R (5) deote the sample values. I fact R O p (/) uder reasoable coditios. herefore / ( ( F ) I F ( x i ) i (6) A applicatio of the Cetral Limit heorem shows that ( F d ( F )) N (0 V ( )) (7) V ( F ) var E ( I I F ( x ) F ( x )). (8) Here x deotes a radom observatio from the distributio F. From () the equatio i (9) was derived. ( ) V ( F ) ( ξ W ) + ( )( ξ W ) + ( x w ξ ξ I compariso the asymptotic variace of the ε th quatile is give by var ( ( x)) ( ) /( f ( )). (0) ς ε ε ξ ε F ε D. Relative Efficiecy Sice the trimmed mea is a special member of the class of L-estimator it is iterestig to compare its asymptotic variace with the asymptotic variace of ay other member of the class such as the utrimmed mea. he descriptive measure of a L- estimate measurig locatio is give by ( F ) F ( t ) d k ( t ) () 0 k is a probability distributio o (0). he trimmed mea ( F ) is obtaied by takig k uiform o ( ). Let ad be two L-estimators determied by k ad k ad let f ad f deote the desities of K ad K respectively. Suppose that >. 0 ( f ( t) / f ( t)) A (9) () A [3] have show that () implies V ( F ) A V ( F ). Here V( F) ad V( F) deote the asymptotic variaces of ad respectively. It follows from the above result that the asymptotic relative efficiecy of relative to the utrimmed mea is bouded below by ( ). E. Estimate of the Variace of Let r [ ] ad s [ ( )] [ x ] deotes the iteger part of x ad let X (I) deote the I th order statistic from the sample. he wisorized sample is give by y y y i x x x ( r + ) ( i) ( s ) r i r < i s s < i the wisorized sample variace deoted by S w is give by (3) ) df ( x). Iteratioal Scholarly ad Scietific Research & Iovatio (6) 008 354 scholar.waset.org/307-689/9953
World Academy of Sciece Egieerig ad echology Iteratioal Joural of Mathematical ad Computatioal Scieces Vol: No:6 008 Iteratioal Sciece Idex Mathematical ad Computatioal Scieces Vol: No:6 008 waset.org/publicatio/9953 S w ( y i y ) (4) i y deotes the wisorized sample mea. From (9) ad the asymptotic relatio (7) a estimate of the variace of was derived give by vˆ ar S /( ( ) w ). (5) E. ail-legth A sample from a distributio with loger tail is likely to cotai a larger umber of outliers compared to a sample from a distributio with shorter tail. herefore for estimatig a locatio parameter usig a trimmed mea we should trim the sample more if the uderlyig distributio has loger tail for the sake of robustess. I order to determie the trimmig proportios we eed to have a measure of the tail-legth which we should be able to estimate with some precisio. A measure of tail-legth which has bee referred to i the itroductio is give as follows. It is assumed throughout the followig that the modified Weibull distributio F with parameter λ ½ is symmetric about a locatio parameter θ ad that the trimmed mea is symmetrized that is the trimmig proportios are equal ( ). I this case the descriptive measure of the trimmed mea is deoted by ξ ( F ) x df ( x ) (6) ξ ad the sample estimate of ( F ) is give by ( F ) m m X i ( m + ) ( i ) (7) m [ ] ad x ( i ) deotes the i th ordered value i the sample. Let 0 < γ < δ < < /. he tail-legth of F is give by Rγ δ ( F ) ( γ ( F ) 0 γ ( F ) / ( δ γ ( F ) γ δ ( F )). (8) otice that the umerator ad deomiator of the right side of (8) are each ivariat with respect to traslatio. herefore R ( ) is ivariat with respect to both traslatio ad γ δ F R γ δ F scale trasformatio. Clearly ( ). It is also clear that a loger value of the tail-legth will iduce a larger value of R γ δ. herefore R γ δ ( F ) is a suitable measure of the tail-legth of F. R he sample estimate of the tail-legth is give by ( F ) ( γ ( F ) 0 γ ( F )) /( δ γ ( F ) γ δ ( F )). (9) A /B say. γ δ he asymptotic property of R γ δ ( F ) is derived from a applicatio of the asymptotic relatio (6). Usig this ad the symmetry of F a simple algebraic computatio shows that the trimmed mea F ) ad A ad B are pair-wise ( ucorrelated whe is large. Moreover they are joitly ormally distributed asymptotically. hus we have that heorem : he modified Weibull distributio F with parameter λ ½ is symmetrically distributed the it follows that R γ δ ( F ) ad ( F ) are asymptotically idepedet as. Moreover R γ δ ( F ) is asymptotically distributed as a ratio of two idepedet ormal radom variables give by A /B. Remarks: he statistical idepedece of the trimmed mea ( F ) R γ δ ( F give by ad the tail-legth ) heorem is a useful result. If the trimmig proportios of the trimmed mea is based o the tail-legth R γ δ ( F ) the result implies that the asymptotic distributio of the adaptive trimmed mea is the same as if was fixed a priori. IV. MEHODS AND MAERIALS By usig the modified form of the stadard Weibull distributio with the formula for the desity fuctio as: λ x ( Γ( + / λ)) e he tail legths R γ δ ( F ) for certai values of γ ad δ ad the correspodig respective asymptotic variaces V (F) for the Weibull distributio were computed for λ ½ usig the formulas show i the previous sectios. he results are show i able I. able I shows that the tail legths icrease as the (γδ) icrease while the variaces of the trimmed meas decrease whe icrease. his reveals the desirability of the tail legth i the light of high breakdow poit ad efficiecy as robustess properties. I able II the selectio rule is preseted give the tail legths which ca be summarized as follow: Selectio Rule Says that Iteratioal Scholarly ad Scietific Research & Iovatio (6) 008 355 scholar.waset.org/307-689/9953
World Academy of Sciece Egieerig ad echology Iteratioal Joural of Mathematical ad Computatioal Scieces Vol: No:6 008 Iteratioal Sciece Idex Mathematical ad Computatioal Scieces Vol: No:6 008 waset.org/publicatio/9953 if the tail legths are x R.. ( F) <.5 or y R. 5.3 ( F) <.0 the use.05. if the tail legths are.5 x <. 0 or.0 y <.5 the use.0. if the tail legths are.0 x <. 5 or.5 y < 3.0 the use.5. if the tail legths are.5 x < 3. 0 or 3.0 y < 3.5 the use.0. if the tail legths are 3.0 x or 3.5 y the use.30. V. EMPIRICAL RESULS From the modified Weibull distributio a radom sample of 40 observatios was geerated. From these samples the adaptive trimmed meas A A ad the trimmed meas for two fixed values for.05 ad.0 were computed. With m 00 iteratios the m values of A A.05.0 were obtaied from which the stadard deviatios were also computed such as S( A ) S( A ) S(.05 ) ad S(.0 ) respectively as show i able III. It is apparet from this table that the stadard deviatios are relatively smaller for the adaptive trimmed meas A ad A tha those trimmed meas i the trimmig proportios were fixed a priori at.05 ad.0. his idicates the relative efficiecy of the adaptive procedure which substatiates the fidigs of []. VI. CONCLUSION. he tail legth of the Weibull distributio icreases i magitudes as the trimmig proportio icreases.. he measure of the tail legth ad the adaptive trimmed mea are asymptotically idepedet as the umber of observatios becomes very large or approachig ifiity. 3. he tail legth is asymptotically distributed as the ratio of two idepedet ormal radom variables. 4. he asymptotic variaces decrease as the trimmig proportios icrease. 5. he stadard error of the adaptive trimmed mea is relatively smaller tha its couterpart i the trimmig proportios were fixed a priori. 6. he proposed adaptive rule may be used with advatage i the absece of ay prior groud for a appropriate choice of the trimmig proportios. VII. FUURE DIRECION he refiemet ad applicability of the selectio rule o the trimmig proportios correspodig to the tail legths R γ δ ( F) specifically for modified Weibull distributio are recommeded. Furthermore ivestigatio o the applicability of the selectio rule with some adjustmets o the degree of skewess for assymetrical distributio are hereby recommeded. he extesio of the procedure ca be explored for quatile regressio aalysis o the choice of the ξ th quatile values that istead of beig symmetrized that is (ξ -ξ) at two tails of the error distributio beig fixed adaptive trimmig proportios based o tail legths ca be a potetial choice. REFERENCES [] Alam K. ad Mitra A. (996). Adaptive Robust Estimate Based o ail-legth.sakhyaidia Joural of Statistics Series B(58) 67-678 [] Baguio Carolia B. (999). A Adaptive Robust Estimator of a Locatio Parameter. A Doctoral Dissertatio. [3] BickelP.J. ad LehmaE.L. (976). Descriptive Statistics for Noparametric Models III A. Stat. (4)39-58. [4] Hogg R.V. (967). Some Observatios i Robust Estimatio. Jour. Amer. Statist. Assoc.(6) 7986. [5] HoggR.V.(974). Adaptive Robust Procedures: A Partial Review ad Some Suggestios for Future Applicatios ad heory. Jour. Amer. Statist. Assoc.(69) 909-93. [6] HuberP.J. (964). Robust Estimatio of a Locatio Parameter. A. Math. Stat. (35) 73-0. [7] Staudte R.G. ad Sheather S.J. (990). Robust Estimatio ad estig. Wiley Series i Probability ad Statistics. [8] Stigler S. M. (969). Liear Fuctios of order Statistics A. Math. Stat. 40. 770-788. Iteratioal Scholarly ad Scietific Research & Iovatio (6) 008 356 scholar.waset.org/307-689/9953
World Academy of Sciece Egieerig ad echology Iteratioal Joural of Mathematical ad Computatioal Scieces Vol: No:6 008 ABLE I AIL-LENGH R γ δ ( F) AND ASYMPOIC VARIANCE OF RIMMED MEAN rim size ail Legth R γ δ ( F ) Asymptotic Variace V (F) (γδ) (..) (..5) (.5.30) (.0.30) 0..5.0.5 Distributio Modified Weibull λ ½ 3.06 3.58 4.7 4.8 0.0 45.4 34.73.97 0.64 Iteratioal Sciece Idex Mathematical ad Computatioal Scieces Vol: No:6 008 waset.org/publicatio/9953 ABLE II SELECION RULE FOR HE VALUES OF AIL LENGHS AND Values of A. Select x R.. ( F) if A. Select y R. 5.3 ( F) if.05 x <.5 y <.0.0.5 x <. 0.0 y <. 5.5.0 x <. 5.5 y < 3. 0.0.5 x < 3. 0 3.0 y < 3. 5.30 3 x 3 y.0 ABLE III SANDARD ERRORS OF A A Distributio S( A ) S( A ) S( ).05.5 S( ).0 Modified Weibull.560.553.759.63 λ ½ Iteratioal Scholarly ad Scietific Research & Iovatio (6) 008 357 scholar.waset.org/307-689/9953