Goodness of Fit Test for The Skew-T Distribution

Journal of Mathematics and Computer Science 14 (2015) 274-283

Article history: Received December 2014, Accepted 6 January 2015, Available online 7 January 2015

Goodness of Fit Test for The Skew-T Distribution

M. Maghami *, M. Bahrami +
Department of Statistics, University of Isfahan, Isfahan, Iran
* magham8@gmal.com + m.bahram@sc.u.ac.r

Abstract

In this manuscript, a goodness-of-fit test is proposed for the Skew-t distribution based on properties of the family of these distributions and the sample correlation coefficient. The critical values for the test can be achieved by the Monte Carlo simulation method for several sample sizes and levels of significance. The power of the proposed test can be specified for different sample sizes and considering diverse alternatives.

Keywords: Sample correlation coefficient; Skew-t; Goodness-of-fit test.

1. Introduction

Let Z be a random variable. We say that Z has the skew-normal distribution, denoted by $Z \sim SN(\lambda)$, if its probability density function is

$$f_Z(z; \lambda) = 2\,\phi(z)\,\Phi(\lambda z)\, I_{(-\infty,\infty)}(z), \qquad (1)$$

where $\phi$ and $\Phi$ denote the density and cumulative distribution function of the standard normal distribution, respectively. The skew-normal distribution was introduced by Azzalini (1985) as a family with the appealing property of strictly including the normal law, as well as a wide variety of skewed densities.

We say that a random variable W has the skew-t distribution with parameters $\lambda \in \mathbb{R}$ and $\nu > 0$ if $W \overset{d}{=} Z/\sqrt{V}$, where Z is the skew-normal variable with pdf (1), $V \sim \chi^2_\nu/\nu$, and Z and V are independent. This variable is denoted by $W \sim St(\lambda, \nu)$. If a random variable Y is defined as

$$Y = \xi + \omega W, \qquad (2)$$

with $\xi \in \mathbb{R}$ and $\omega \in \mathbb{R}^+$, then $Y \sim St(\xi, \omega, \lambda, \nu)$. The skew-Cauchy distribution is obtained simply as a special case of the skew-t with $\nu = 1$ and is denoted by $SC(\lambda)$.
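The stochastic representation above suggests a direct way to simulate skew-t variates. The following is a minimal sketch, not the authors' code: it assumes the well-known additive representation of a skew-normal variate, $Z = \delta |U_0| + \sqrt{1-\delta^2}\, U_1$ with $\delta = \lambda/\sqrt{1+\lambda^2}$ and independent standard normal $U_0, U_1$, and the function names `rskew_normal` and `rskew_t` are hypothetical.

```python
import math
import random


def rskew_normal(lam, rng):
    """One draw from SN(lam) via Z = delta*|U0| + sqrt(1 - delta^2)*U1 (assumed representation)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    u0 = rng.gauss(0.0, 1.0)
    u1 = rng.gauss(0.0, 1.0)
    return delta * abs(u0) + math.sqrt(1.0 - delta * delta) * u1


def rskew_t(lam, nu, rng, xi=0.0, omega=1.0):
    """One draw from the skew-t law: xi + omega * Z / sqrt(V), with V ~ chi2_nu / nu."""
    z = rskew_normal(lam, rng)
    # A chi-square variate with nu degrees of freedom is Gamma(shape=nu/2, scale=2).
    v = rng.gammavariate(nu / 2.0, 2.0) / nu
    return xi + omega * z / math.sqrt(v)


rng = random.Random(2015)  # arbitrary seed, used only for reproducibility
sample = [rskew_t(lam=3.0, nu=5.0, rng=rng) for _ in range(1000)]
print(min(sample), max(sample))
```

Setting `nu=1.0` in this sketch gives skew-Cauchy variates; `xi` and `omega` play the role of the location and scale in Eq. (2).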

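M. Maghami, M. Bahrami / J. Math. Computer Sci. 14 (2015) 274-283

Some well known properties of skew-t variables, which will be useful for constructing the goodness of fit test, are the following (see [3] for details):

(a) If $W \sim St(\lambda, \nu)$, then $-W \sim St(-\lambda, \nu)$.
(b) If $W \sim St(\lambda, \nu)$, then $W^2 \sim F(1, \nu)$.

2. EDF-Based Tests

Perez Rodriguez and Villasenor (2010) developed a goodness of fit test for the skew normal family based on the sample correlation coefficient and showed that their test has greater power than the Empirical Distribution Function (EDF)-based tests against some alternative distributions. We are interested in testing the null hypothesis

$$H_0: Y \text{ is } St(\xi, \omega, \lambda, \nu) \text{ for some } \xi \in \mathbb{R},\ \omega \in \mathbb{R}^+,\ \lambda \in \mathbb{R},\ \nu \in \mathbb{R}^+ \qquad (3)$$

against general alternatives. In this section we discuss general EDF-based goodness-of-fit statistics designed to test the null hypothesis $H_0$. EDF-based test statistics measure the difference between the distribution function $F(\cdot)$ stated in the null hypothesis and the EDF, a step function denoted by $F_n(\cdot)$ and given as

$$F_n(x) = i/n, \qquad x_{(i)} \le x < x_{(i+1)},$$

where $x_{(1)} \le \dots \le x_{(n)}$ are the order statistics of the $x_i$'s. To compare the two distribution functions, several statistics can be used, which Stephens (1986) divides into two families. The Cramér-von Mises family contains the Cramér-von Mises statistic $W^2$, Watson's $U^2$ statistic and the Anderson-Darling statistic $A^2$, defined as:

$$W^2 = n \int_{-\infty}^{\infty} \left(F_n(x) - F(x)\right)^2 dF(x),$$

$$U^2 = n \int_{-\infty}^{\infty} \left\{F_n(x) - F(x) - \int_{-\infty}^{\infty} \left[F_n(t) - F(t)\right] dF(t)\right\}^2 dF(x),$$

$$A^2 = n \int_{-\infty}^{\infty} \frac{\left(F_n(x) - F(x)\right)^2}{F(x)\left(1 - F(x)\right)}\, dF(x).$$

The Kolmogorov-Smirnov family contains the statistics $D^+$ and $D^-$, the Kolmogorov-Smirnov statistic $D$ and the Kuiper statistic $V$, defined as:

$$D^+ = \sup_x \{F_n(x) - F(x)\}, \quad D^- = \sup_x \{F(x) - F_n(x)\}, \quad D = \max(D^+, D^-), \quad V = D^+ + D^-.$$

Stephens (1986) provides the following simple formulae for calculating these statistics:

$$W^2 = \sum_{i=1}^{n} \left(p_{(i)} - \frac{2i-1}{2n}\right)^2 + \frac{1}{12n} \qquad (4)$$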

$$U^2 = W^2 - n\left(\bar{p} - \tfrac{1}{2}\right)^2 \qquad (5)$$

$$A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1)\left[\log p_{(i)} + \log\left(1 - p_{(n+1-i)}\right)\right] \qquad (6)$$

$$D^+ = \max_{1 \le i \le n} \left(\frac{i}{n} - p_{(i)}\right) \qquad (7)$$

$$D^- = \max_{1 \le i \le n} \left(p_{(i)} - \frac{i-1}{n}\right) \qquad (8)$$

$$D = \max(D^+, D^-) \qquad (9)$$

$$V = D^+ + D^- \qquad (10)$$

where $p_{(i)} = F(x_{(i)})$ and $\bar{p} = \sum_{i=1}^{n} p_{(i)}/n$.

Large values of a given statistic indicate significant differences between the empirical and hypothesized distribution functions, and thus that we should reject the null hypothesis. In general, when the parameter values of the hypothesized distribution are completely specified, the sampling distribution of any of these EDF statistics is known exactly and tables of percentage points are available (see Stephens (1986), Table 4.). However, when the values taken by the parameters of the distribution are unknown and have to be estimated from the sample, the sampling distribution of any EDF statistic depends on the distribution being tested, the sample size, the true values of the unknown parameters and the method used to estimate the parameters.

Now we describe the parametric bootstrap technique used to estimate the quantiles of the test statistic T when the hypothesized distribution is skew-t with parameter values estimated from the data. Maximum likelihood methods can be employed to estimate the parameters of the skew-t distribution. Since analytic expressions do not exist for these estimators, numerical methods must be used to compute them. Note that when the unknown parameters are location or scale parameters and they are estimated using location and scale equivariant estimators (as are maximum likelihood estimators), the sampling distributions of the EDF statistics do not depend on the true values of those parameters (see Eastman and Bain (1973)). Therefore the values $\xi = 0$ and $\omega = 1$ were used for simplicity, because the sampling distributions of the statistics are invariant to changes in the location and scale parameters. Since, however, the asymptotic null distribution of the test statistic depends upon the unknown values of $\lambda$ and $\nu$, a parametric bootstrap version of the test is performed:

1. Given the sample $y_1, \dots, y_n$, compute the maximum likelihood estimators $\hat{\lambda}$ and $\hat{\nu}$ of $\lambda$ and $\nu$.
2. Calculate the value of the chosen test statistic T using the appropriate formula(e) from Eqs. (4)-(10), where $F(\cdot)$ denotes the distribution function of $St(\hat{\lambda}, \hat{\nu})$.
(a) Generate a bootstrap sample of size n from $St(\hat{\lambda}, \hat{\nu})$.

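(b) Given the bootstrap sample generated previously, compute the ML estimators of $\lambda$ and $\nu$, say $\hat{\lambda}^*$ and $\hat{\nu}^*$.
(c) Compute the value of the test statistic, say $T^*$, using $\hat{\lambda}^*$, $\hat{\nu}^*$ and the bootstrap sample.
3. Repeat steps (a), (b) and (c) B times to get $T^*_j$, $j = 1, \dots, B$.
4. Obtain $T(.5)$ as $T^*_{(95)}$, where $T^*_{(j)}$, $j = 1, \dots, B$, denotes the ordered $T^*_j$ values.

3. Correlation Goodness-of-Fit Test

In this section we introduce a goodness-of-fit test for the skew-t distribution based on the sample correlation coefficient. The test procedure is based on property (b). From Eq. (2), $Y = \xi + \omega W$, where $Y \sim St(\xi, \omega, \lambda, \nu)$ and $W \sim St(\lambda, \nu)$; then

$$X := (Y - \xi)^2 = \omega^2 W^2. \qquad (11)$$

By property (b):

$$X/\omega^2 = W^2 \sim F(1, \nu). \qquad (12)$$

From (12), the parameter $\lambda$ has been eliminated from the problem.

For fixed $\xi$ and $\nu$, say $\xi_0$ and $\nu_0$, X has a scale distribution,

$$P(X \le x) = G(x/\omega^2),$$

where G is the distribution function of an $F(1, \nu_0)$ random variable. So, given the sample $y_1, \dots, y_n$ and $\xi_0$, calculate $x_1, \dots, x_n$ by using (11). A consistent estimator for $P(X \le x)$ is the empirical distribution function $F_n(x)$; then $G(x/\omega^2) \approx F_n(x)$, and therefore

$$u := G^{-1}(F_n(x)) \approx x/\omega^2. \qquad (13)$$

Since (13) is established, we should expect a strong linear relationship between the $x_i$'s and $u_i$'s under the null hypothesis stated in (3). If $\xi$ and $\nu$ are estimated by consistent estimators, say $\hat{\xi}$ and $\hat{\nu}$, then it is expected that the linear relationship (13) still holds. To test whether there is a strong linear relationship between the $x_i$'s and $u_i$'s, the sample correlation coefficient statistic is used, which is given by

$$C = Corr(X, U) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(U_i - \bar{U})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (U_i - \bar{U})^2}}. \qquad (14)$$

The null hypothesis (3) is rejected at the level of significance $\alpha$ if $C < C_n(\alpha)$, where $C_n(\alpha)$ is such that

$$\alpha = \max_{\lambda} P(\text{Reject } H_0 \mid H_0) = \max_{\lambda} P(C < C_n(\alpha)). \qquad (16)$$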
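As an illustration of the statistic C, here is a minimal sketch for the special case $\nu = 1$, where the $F(1, 1)$ distribution function has the closed form $G(x) = (2/\pi)\arctan(\sqrt{x})$ and hence $G^{-1}(u) = \tan^2(\pi u/2)$. Two things are assumptions rather than the authors' choices: the location $\xi$ is passed in as a known value, and the EDF is evaluated at the plotting positions $(i - 0.5)/n$ so that the largest quantile stays finite. The names `f11_quantile` and `correlation_statistic` are hypothetical.

```python
import math


def f11_quantile(u):
    """Quantile of F(1, 1): G(x) = (2/pi)*atan(sqrt(x)), so G^{-1}(u) = tan(pi*u/2)**2."""
    return math.tan(math.pi * u / 2.0) ** 2


def correlation_statistic(y, xi, quantile=f11_quantile):
    """Sample correlation between the sorted x_i = (y_i - xi)^2 and u_i = G^{-1}(p_i)."""
    n = len(y)
    x = sorted((yi - xi) ** 2 for yi in y)
    # Assumed plotting positions p_i = (i - 0.5)/n, which keep G^{-1}(p_i) finite.
    u = [quantile((i - 0.5) / n) for i in range(1, n + 1)]
    x_bar = sum(x) / n
    u_bar = sum(u) / n
    s_xu = sum((a - x_bar) * (b - u_bar) for a, b in zip(x, u))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_uu = sum((b - u_bar) ** 2 for b in u)
    return s_xu / math.sqrt(s_xx * s_uu)


# Hypothetical data, only to show the call.
y = [-1.3, -0.4, 0.1, 0.2, 0.6, 0.9, 1.4, 2.2, 3.5, 7.0]
print(correlation_statistic(y, xi=0.0))
```

For another value of $\nu$, a different `quantile` function (the $F(1, \nu)$ quantile) would be passed in; the scale $\omega$ never needs to be known because the correlation is scale invariant.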

The distribution of C under the null hypothesis for each fixed value of $n$ and $\nu$ can be obtained by Monte Carlo simulation. Note that C is scale invariant and the distribution of X in (12) does not depend on $\lambda$; therefore we will fix $\xi = 0$ and $\omega = 1$.

If the random sample comes from a distribution function different from the skew-t distribution, for which property (b) does not hold, then (13) does not hold. Therefore the sample correlation coefficient (14) cannot be near 1, hence C should be lower than the critical value, since under $H_0$ the distribution of C will be concentrated close to 1. Therefore we use the following procedure to obtain the critical values:

1. Fix $n$, $\lambda$ and $\nu$.
2. Simulate a sample of size n from $St(0, 1, \lambda, \nu)$.
3. Calculate the maximum likelihood estimator of the parameter $\xi$.
4. Calculate $x_1, \dots, x_n$ using Eq. (11).
5. Sort the $x_i$'s into ascending order.
6. Calculate $u_i := G^{-1}(F_n(x_{(i)}))$, $i = 1, \dots, n$, where $G^{-1}$ is the quantile function of the $F(1, \nu)$ distribution.
7. Calculate C using Eq. (14) and the data $x_{(i)}$, $u_i$ generated in steps 5 and 6.
8. Repeat steps 2-7 B times.

Upon finishing the simulation process we have B realizations of C for a given value of $n$ and $\nu$. Therefore the value of the critical constant $C_n(\alpha)$ is determined from the quantiles of the empirical distribution of C. For example, Fig. 1 presents the graph of $C_n(\alpha)$ as a function of $\lambda$ for ..5.5 and 5, which shows that the distribution of the test statistic C under $H_0$ does not depend on the value of the unknown parameter $\lambda$. Our simulations show that this fact is defensible for arbitrary $\nu$.

Note that we have limited our attention to $Y \sim St(\xi, \omega, \lambda, \nu)$ with $\lambda > 0$, since $-Y \sim St(-\xi, \omega, -\lambda, \nu)$ by property (a). Therefore the distribution of C does not depend on the sign of $\lambda$, hence the critical constant $C_n(\alpha)$ is such that

$$\alpha = \max_{\lambda > 0} P(C < C_n(\alpha)) = \max_{\lambda \in \mathbb{R}} P(C < C_n(\alpha)). \qquad (15)$$

For arbitrary $\nu$, simulations show that the values of the critical constant $C_n(\alpha)$ are determined from the quantiles of the empirical distribution of C obtained by simulation with arbitrary $\lambda$. Fig. 1 shows this fact.
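The simulation procedure above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' R routine: it treats only $\nu = 1$ (so the $F(1, 1)$ quantile $\tan^2(\pi u/2)$ is available in closed form), it uses the true location $\xi = 0$ instead of a maximum likelihood estimate in step 3, and it evaluates the EDF at the assumed plotting positions $(i - 0.5)/n$. The names `c_statistic`, `critical_value` and `n_rep` (playing the role of B) are hypothetical.

```python
import math
import random


def c_statistic(y, xi=0.0):
    """Correlation between sorted (y_i - xi)^2 and F(1, 1) quantiles at (i - 0.5)/n."""
    n = len(y)
    x = sorted((yi - xi) ** 2 for yi in y)
    u = [math.tan(math.pi * (i - 0.5) / (2.0 * n)) ** 2 for i in range(1, n + 1)]
    x_bar, u_bar = sum(x) / n, sum(u) / n
    s_xu = sum((a - x_bar) * (b - u_bar) for a, b in zip(x, u))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_uu = sum((b - u_bar) ** 2 for b in u)
    return s_xu / math.sqrt(s_xx * s_uu)


def critical_value(n, lam, alpha, n_rep, seed=0):
    """Empirical alpha-quantile of C over n_rep samples of size n from St(0, 1, lam, 1)."""
    rng = random.Random(seed)
    delta = lam / math.sqrt(1.0 + lam * lam)
    root = math.sqrt(1.0 - delta * delta)
    stats = []
    for _ in range(n_rep):
        y = []
        for _ in range(n):
            # Skew-normal draw via the assumed additive representation.
            z = delta * abs(rng.gauss(0.0, 1.0)) + root * rng.gauss(0.0, 1.0)
            # nu = 1: V ~ chi2_1, so sqrt(V) is the absolute value of a standard normal.
            y.append(z / abs(rng.gauss(0.0, 1.0)))
        stats.append(c_statistic(y))
    stats.sort()
    # Small values of C lead to rejection, so the lower alpha-quantile is returned.
    return stats[max(0, math.ceil(alpha * n_rep) - 1)]


print(critical_value(n=50, lam=2.0, alpha=0.05, n_rep=500, seed=1))
```

Running `critical_value` over a grid of `lam` values would give a rough analogue of the critical-value curve discussed above, though without the estimation step the numbers are not comparable to the paper's.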

Figure 1. Critical values as a function of $\lambda$ for 5 B 5 for the statistic C.

Given a random sample of data values, the steps necessary to carry out a given test can be summarized as follows:

1. Calculate the MLEs of $\xi$ and $\nu$ using library `sn' (Azzalini (2008)) in R (R Development Core Team (2008)) and denote them by $\hat{\xi}$ and $\hat{\nu}$.
2. Calculate the value of the test statistic C using Eq. (14).
3. For a given significance level $\alpha$, identify the quantile $C_n(\alpha)$ of the test statistic corresponding to $\hat{\nu}$ and $n$.
4. If $C < C_n(\alpha)$, the null hypothesis is rejected at the significance level $\alpha$.

4. Simulation studies

4-1. Tests size

The results of the size estimations of the tests are presented in Tables 1 and 2 and were obtained by simulation for . 5. The selected sample sizes were 5 and , and the values of the parameter $\lambda$ were { 4 5 37 5 7.5 3. }. From Tables 1 and 2 it can be seen that the estimated test sizes are very close to the nominal significance level.

Table 1: Test size estimates using the statistics, obtained by simulation with B Monte Carlo samples of size 5 with . 5.
Statistic: $A^2$, $W^2$, $U^2$, $D$, $V$, $C$
4 5 3 7 5 7.5 3.4.57.4.43..47.47.54.57.39.8.53.35.3.43.37.54.33.53.5.5..37.3.4...36.4.....43.37.5.4.9.38.3.54.56.56.7.4.54.48.7

Table 2: Test size estimates using the statistics, obtained by simulation with B Monte Carlo samples of size with . 5.
Statistic: $A^2$, $W^2$, $U^2$, $D$, $V$, $C$
4 5 37 5 7.5 3.4..7.34.54.7.37...3..8.4.34..34.3.4.53.49.33.4.9.7.45.5.37.47.4.5.38.4.35.7.53.4.6...3.4.54.3.5...49.9

4-2. Tests power

To analyze the behavior of the proposed tests, alternatives different from the skew-t were considered. The distributions selected for this were: skew-slash (SSL), Logistic, Exponential, Chi-squared, Weibull, Gumbel, Log-Normal and Stable (see Nolan (1999)). We also considered some bimodal distributions. The results are shown in Tables 3 and 4, from which it can be seen that the proposed test C shows the highest powers for several of the considered alternatives.

Table 3: Power estimates of the $A^2$, $W^2$, $U^2$, $D$, $V$, $C$ statistics for some alternatives with 5. 5 B 5.

Alternative                  A^2    W^2    U^2    D      V      C
SSL()                        .54    .64    .43    .34    .7     .745
Standard logistic            .4     .34    .7     .48    .3     .47
Standard exp.                .3     .64    .      .54    .347   .65
Chi-squared(4)               .78    .95    .74    .3     .47    .9
Weibull(.75)                 .3     .7     .4     .4     .37    .84
Standard Gumbel              .59    .47    .47    .5     .3     .537
Log-Normal(.5)               .7     .56    .54    .3     .7     .694
Sta(.6.5;)                   .34    .4     .4     .3     .9     .45
.5N(4.5.5)+.5N(-4.5.5)       .87    .9     .54    .67    .49    .93
.9N(4.5.5)+.N(-4.5.5)        .95    .54    .48    .5     .64    .934
.5N((/3))+.5N(-(/3))         .5     .6     .7     .74    .45    .664
.9N((/3))+.N(-(/3))          .4     .574   .3     .7     .36    .9
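The EDF statistics compared in these power studies can be computed from the values $p_{(i)} = F(x_{(i)})$ with the Stephens computing formulas, Eqs. (4)-(10). The sketch below is a minimal illustration, assuming the hypothesized distribution function has already been evaluated at the data; the function name `edf_statistics` is hypothetical, and the standard normal distribution function is used in the example only as a stand-in, since the skew-t distribution function is not available in the Python standard library.

```python
import math
from statistics import NormalDist


def edf_statistics(p):
    """EDF statistics from p_i = F(x_i), following the Stephens computing formulas."""
    p = sorted(p)
    n = len(p)
    # Index i below runs from 0, so (2*i + 1) corresponds to (2i - 1) in the formulas.
    w2 = sum((p[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n)) + 1 / (12 * n)
    p_bar = sum(p) / n
    u2 = w2 - n * (p_bar - 0.5) ** 2
    a2 = -n - sum(
        (2 * i + 1) * (math.log(p[i]) + math.log(1 - p[n - 1 - i])) for i in range(n)
    ) / n
    d_plus = max((i + 1) / n - p[i] for i in range(n))
    d_minus = max(p[i] - i / n for i in range(n))
    return {"W2": w2, "U2": u2, "A2": a2, "D": max(d_plus, d_minus), "V": d_plus + d_minus}


# Hypothetical data tested against a standard normal null (stand-in for the skew-t cdf).
x = [-1.2, -0.5, -0.1, 0.3, 0.4, 0.9, 1.7, 2.1]
stats = edf_statistics([NormalDist().cdf(v) for v in x])
print(stats)
```

In the setting of the paper, the list passed to `edf_statistics` would instead hold the fitted skew-t distribution function evaluated at the sample, and the reference quantiles would come from the parametric bootstrap rather than from standard tables.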

Table 4: Power estimates of the $A^2$, $W^2$, $U^2$, $D$, $V$, $C$ statistics for some alternatives with .5 B 5.

Alternative                  A^2    W^2    U^2    D      V      C
SSL()                        .67    .74    .3     .4     .39    .753
Standard logistic            .5     .4     .3     .3     .345   .58
Standard exp.                .345   .75    .43    .64    .457   .76
Chi-squared(4)               .93    .7     .8     .5     .7     .3
Weibull(.75)                 .47    .69    .396   .48    .67    .867
Standard Gumbel              .746   .64    .68    .69    .47    .77
Log-Normal(.5)               .875   .64    .58    .44    .35    .793
Sta(.6.5;)                   .44    .57    .9     .487   .47    .673
.5N(4.5.5)+.5N(-4.5.5)       .83    .945   .673   .74    .576   .944
.9N(4.5.5)+.N(-4.5.5)        .96    .6     .33    .549   .673   .967
.5N((/3))+.5N(-(/3))         .574   .73    .35    .47    .59    .8
.9N((/3))+.N(-(/3))          .5     .67    .4     .3     .3     .384

6. Numerical example

To illustrate how the test procedure works with real data, we use data collected at the Australian Institute of Sport (AIS) (Cook & Weisberg (1994)), containing the body mass index (BMI) of male athletes. Table 5 reports the maximum likelihood estimates for some skew models, considering the full $St(\xi, \omega, \lambda, \nu)$ model and two special cases: Skew-normal and Skew-Cauchy. The Akaike information criterion (AIC) is used to compare the estimated models (Leroux (1992)). As is well known, a model with a minimum AIC value is to be preferred. Therefore the St fit appears to be preferable. These points are further illustrated in Figure 2, where a histogram of the data is plotted together with the fitted densities.

Table 5: MLE estimates and log-likelihood values.

Model            SN           SC           St
.7978.7597.37376
4.37343.38549.9735
3.69.54.446
                 -            -            5.6387
Log-likelihood   -237.8347    -247.5523    -235.933
AIC              481.6694     501.1046     479.866

$AIC = -2\log(L) + 2k$, where L is the maximized likelihood and k is the number of parameters.

Figure 2: Histogram of BMI of Australian athletes. The lines represent distributions fitted using maximum likelihood estimation (legend: SN, SGN, SCN, SC, S-t).

However, the goodness of fit tests for these data lead to rejection of the Skew-Normal (SN) and Skew-Cauchy (SC) models, but we cannot reject the hypothesis of an underlying skew-t population for this data set (see [8] for details). The results are summarized in Table 6, which gives the critical points, the corresponding values of the test statistics and the ranges of the P values.

Table 6: Critical points and values of the test statistics for the BMI data

Model SN SC St
* r r.9754747 R =.76834 C =.97686
Test statistics .94698
5 3 35
%        .9455      .9738395   .73456     .7389985
.5%      .937466    .979894    .775845    .783385
5%       .9586      .9834587   .8549      .87583
%        .96663     .986867    .864546    .8893
5%       .97494     .9885898   .899       .9979
.5%      .98959     .9936363   .97463     .97486
P value  (.5.5)     (.5.5)     (.5.5)     (.5)

It is important to mention that all the calculations shown in this work were obtained using routines written in R. These routines use the sn package and are freely available upon request.

References

[1] A. Azzalini, A class of distributions which includes the normal ones, Scandinavian Journal of Statistics. 12 (1985) 171-178.
[2] A. Azzalini, R package sn: the skew-normal and skew-t distributions (version 0.4-6), Università di Padova (2008).
[3] A. Azzalini, A. Capitanio, Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution, Journal of the Royal Statistical Society: Series B (Statistical Methodology). 65(2) (2003) 367-389.
[4] R.D. Cook, S. Weisberg, An Introduction to Regression Graphics, New York: Wiley (1994).
[5] J. Eastman, L.J. Bain, A property of maximum likelihood estimators in the presence of location-scale nuisance parameters, Commun. Statist. 2 (1973) 23-28.
[6] B.G. Leroux, Consistent estimation of a mixing distribution, Annals of Statistics. 20(3) (1992) 1350-1360.
[7] J.P. Nolan, Stable distributions, University, Washington DC (1999).
[8] P. Perez Rodriguez, J.A. Villasenor, On testing the skew normal hypothesis, Journal of Statistical Planning and Inference. 140 (2010) 3148-3159.
[9] R Development Core Team, R: a language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria; ISBN 3-900051-07-0 (2008).
[10] M.A. Stephens, Tests based on EDF statistics, in: R.B. D'Agostino, M.A. Stephens (eds.), Goodness-of-Fit Techniques, New York: Marcel Dekker (1986).