COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION


(REFEREED RESEARCH)

Hakan S. Sazak 1,*, Hülya Yılmaz 2
1 Ege University, Department of Statistics, İzmir, Turkey
2 Eskişehir Osmangazi University, Department of Biostatistics and Medical Informatics, Eskişehir, Turkey

Abstract: Many robust estimators have been proposed because the best-known estimators of the location and scale parameters, the sample mean and the sample standard deviation, are not robust to deviations from normality. Some studies in the literature investigate the robustness of various methods through simulation, but they generally focus on the estimators of the location parameter. In this study we compared the performance of two types of Huber's M estimators (w24 and BS82), the modified maximum likelihood (MML) estimators, and the sample median and the scaled median absolute deviation (MAD) with that of the sample mean and the sample standard deviation via simulation under the mixture and outlier models. Based on the simulation results, we suggest the general use of the Huber's M estimators for the location parameter. For the scale parameter, the MML estimator can be used unless both the sample size and the extremity of contamination (k) are large. In such situations the sample standard deviation should be preferred.

Keywords: Modified Maximum Likelihood; Robustness; M Estimators; Mixture Model; Outlier Model

1. INTRODUCTION

The best-known estimators of the location and scale parameters are the sample mean and the sample standard deviation, respectively. They have optimal properties under normality but they are not robust: they lose a considerable amount of efficiency when the data deviate from normality or contain outliers [1-3].
The assumption of normality may not be realistic for many real-life data sets [3]. Ignoring a violation of the normality assumption can result in inefficient estimation of the parameters, which may in turn lead to a wrong analysis and interpretation of the situation. Quite a few robust estimators have been proposed to alleviate this problem. Wilcox [3] gave the definitions and properties of a variety of estimators in detail. Wilcox [3], Özdemir [4], and Wilcox and Özdemir [5] performed simulation studies to compare the efficiencies of several estimators of the location parameter for different distributions and models. Their common finding is that no estimator of the location parameter is best in all situations, but that the sample mean is clearly the least efficient estimator unless the distribution is normal. This is an expected result since the sample mean is known to be very sensitive to deviations from normality [1].

In this study, we first introduce the most popular estimators of the location and scale parameters and then compare their performance through a Monte Carlo simulation study under several situations. In detail, two types of Huber's M estimators (w24 and BS82), the modified maximum likelihood (MML) estimators, and the sample median and the scaled median absolute deviation (MAD) are compared with the sample mean and the sample standard deviation for normal and non-normal data sets of various sample sizes. The non-normal conditions are generated with different mixture and outlier models.

The paper is organized as follows. Section 2 describes the estimators of the location and scale parameters and the mixture and outlier models. Section 3 contains the results of the simulations performed to compare the efficiencies of these methods. The final section gives some concluding remarks and suggestions.

* Corresponding author: hakan.savas.sazak@ege.edu.tr

2. METHODOLOGY

The usual estimators of the location and scale parameters are the sample mean and the sample standard deviation which are, respectively,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \quad \text{and} \quad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}.$$

A. Huber's M Estimators

Let $y_1, y_2, \ldots, y_n$ be a random sample from a distribution of the type $(1/\sigma) f((y-\mu)/\sigma)$. Huber [6] assumed that $f$ is unknown but is a long-tailed symmetric distribution (kurtosis $\mu_4/\mu_2^2 > 3$), and then proposed a new method to estimate the location and scale parameters. Gross [7] investigated 25 estimators of $\mu$ and $\sigma$ out of the 65 estimators discussed by Andrews et al. [8] and recommended three of them, namely the wave estimators w24, the bisquare estimators BS82 and the Hampel estimators H22 [1]. In this study, the w24 and BS82 estimators were used for comparison. Both pairs of estimators start from

$$T_0 = \operatorname{median}(y_i), \quad S_0 = \operatorname{median}(|y_i - T_0|) \quad \text{and} \quad z_i = \frac{y_i - T_0}{h S_0} \quad (1 \le i \le n).$$

For w24,

$$\hat{\mu}_{w24} = T_0 + (hS_0)\tan^{-1}\left[\frac{\sum_i \sin z_i}{\sum_i \cos z_i}\right] \quad \text{and} \quad \hat{\sigma}_{w24} = (hS_0)\frac{\left[n\sum_i \sin^2(z_i)\right]^{1/2}}{\left(\sum_i \cos(z_i)\right)},$$

where h = 2.4.

For BS82,

$$\hat{\mu}_{BS82} = T_0 + (hS_0)\frac{\sum_i \psi(z_i)}{\sum_i \psi'(z_i)} \quad \text{and} \quad \hat{\sigma}_{BS82} = (hS_0)\frac{\left[n\sum_i \psi^2(z_i)\right]^{1/2}}{\left(\sum_i \psi'(z_i)\right)}.$$

Here,

$$\psi(z) = \begin{cases} z(1-z^2)^2, & |z| \le 1 \\ 0, & |z| > 1 \end{cases} \quad \text{and} \quad \psi'(z) = 1 - 6z^2 + 5z^4,$$

where h = 8.2.

Remark: Gross [7] tried various values of the coefficient h for the wave and bisquare estimators and, on the basis of Monte Carlo simulations, recommended h = 2.4 for w24 and h = 8.2 for BS82, since the estimators combine high efficiency with robustness for these values.
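The two pairs of formulas in Section A translate almost line by line into code. The following Python sketch is one possible reading of them, not the authors' implementation (the study itself was run in MATLAB). The function names `w24` and `bs82` and the helper `_start_values` are hypothetical, and treating both ψ and ψ' as zero for |z| > 1 in the bisquare case is an assumption that follows from the definition of ψ. Published descriptions of the wave estimator sometimes restrict the sums to |z_i| < π; that restriction is not stated in the text and is therefore omitted here.

```python
import math
from statistics import median


def _start_values(y, h):
    """Return T0 (median), S0 (median absolute deviation) and z_i = (y_i - T0)/(h*S0)."""
    t0 = median(y)
    s0 = median(abs(v - t0) for v in y)
    if s0 == 0:
        raise ValueError("S0 is zero, so the estimators are undefined")
    return t0, s0, [(v - t0) / (h * s0) for v in y]


def w24(y, h=2.4):
    """Wave estimators of location and scale, following the formulas literally."""
    n = len(y)
    t0, s0, z = _start_values(y, h)
    sum_sin = sum(math.sin(v) for v in z)
    sum_cos = sum(math.cos(v) for v in z)
    sum_sin2 = sum(math.sin(v) ** 2 for v in z)
    mu = t0 + h * s0 * math.atan(sum_sin / sum_cos)
    sigma = h * s0 * math.sqrt(n * sum_sin2) / sum_cos
    return mu, sigma


def bs82(y, h=8.2):
    """Bisquare estimators of location and scale."""
    n = len(y)
    t0, s0, z = _start_values(y, h)
    # Assumption: observations with |z| > 1 contribute 0 to every sum.
    inside = [v for v in z if abs(v) <= 1]
    psi = [v * (1 - v * v) ** 2 for v in inside]
    dpsi = [1 - 6 * v ** 2 + 5 * v ** 4 for v in inside]
    mu = t0 + h * s0 * sum(psi) / sum(dpsi)
    sigma = h * s0 * math.sqrt(n * sum(p * p for p in psi)) / sum(dpsi)
    return mu, sigma
```

For a small sample with one gross outlier, such as `[-2, -1, 0, 1, 2, 100]`, `bs82` gives the outlier zero weight, so its location estimate stays close to the centre of the remaining observations while the sample mean is pulled far to the right.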

B. Modified Maximum Likelihood (MML) Estimators

The normality assumption is too restrictive from an applications point of view; see, for example, Huber [9] and Tiku et al. [10]. Hampel et al. [11] pointed out that many real-life data sets can be approximated by Student's t distribution. Assuming Student's t distribution also provides more robust estimators [12]. For these reasons we assumed an underlying long-tailed symmetric (LTS) distribution, which is a Student's t distribution with 2p-1 degrees of freedom, scaled so that its variance is $\sigma^2$. Another advantage of the LTS distribution is that it covers the normal distribution, to which it reduces for $p = \infty$. Let X be a random variable following the LTS distribution

$$f(x; p) = \frac{1}{\sigma\sqrt{k}\,\beta\!\left(\tfrac{1}{2}, p - \tfrac{1}{2}\right)}\left(1 + \frac{(x-\mu)^2}{k\sigma^2}\right)^{-p}, \quad -\infty < x < \infty,$$

where $k = 2p - 3$ ($p \ge 2$) and $\beta(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$.

To obtain the MML estimators of the location and scale parameters, which originated with Tiku [13], the maximum likelihood (ML) equations are first expressed in terms of the ordered variates $z_{(i)} = (x_{(i)} - \mu)/\sigma$, simply by replacing $z_i = (x_i - \mu)/\sigma$ by $z_{(i)}$ ($1 \le i \le n$). The intractable terms in the likelihood equations are then linearized by using the first two terms of a Taylor series expansion, and the following estimators are obtained for a given value of p:

$$\hat{\mu}_{MML} = \frac{\sum_{i=1}^{n}\beta_i x_{(i)}}{\sum_{i=1}^{n}\beta_i} \quad \text{and} \quad \hat{\sigma}_{MML} = \frac{B + \sqrt{B^2 + 4nC}}{2\sqrt{n(n-1)}},$$

where

$$B = \frac{2p}{k}\sum_{i=1}^{n}\alpha_i\left[x_{(i)} - \hat{\mu}_{MML}\right] \quad \text{and} \quad C = \frac{2p}{k}\sum_{i=1}^{n}\beta_i\left[x_{(i)} - \hat{\mu}_{MML}\right]^2,$$

$$\alpha_i = \frac{(1/k)\,t_{(i)}^3}{\left[1 + (1/k)\,t_{(i)}^2\right]^2} \quad \text{and} \quad \beta_i = \frac{1}{\left[1 + (1/k)\,t_{(i)}^2\right]^2}.$$

Note that $t_{(i)} = E(z_{(i)})$, where $z_{(i)} = (x_{(i)} - \mu)/\sigma$. For $1 \le i \le n$, approximate values of $t_{(i)}$ can be obtained from the equation

$$\int_{-\infty}^{t_{(i)}} f(z)\,dz = \frac{i}{n+1}.$$

In real life, the parameter p of the LTS distribution is not known. In our study we used a calibration technique [14] to estimate p. The likelihood function of the LTS distribution is computed for several values of p with the corresponding MML estimates of $\mu$ and $\sigma$. The value of p that maximizes the likelihood function is then taken as the estimate of p.

C. Median and Median Absolute Deviation (MAD)

The median ($\tilde{\mu}$) is one of the most widely known robust estimators of the location parameter. Let $y_1, y_2, \ldots, y_n$ be a random sample. The median is the middle order statistic when n is odd. When n is even, the median is the average of the order statistics with ranks n/2 and (n/2)+1.

The median absolute deviation (MAD), $\operatorname{median}|y_i - \operatorname{median}(y_i)|$, is a simple measure of the variation in a data set. It was first used in this raw form to estimate the unknown scale parameter. MAD was later scaled by dividing it by 0.6745 to make it a consistent (asymptotically unbiased) estimator of $\sigma$ for the normal distribution:

$$\mathrm{MAD}^* = \frac{\operatorname{median}|y_i - \operatorname{median}(y_i)|}{0.6745}.$$

D. Mixture and Outlier Models of Normal Distribution

In the mixture model, a sample contains subsamples, and each of these subsamples comes from a different population with a specified probability.
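Before the two contamination models are specified in detail, the MML recipe of Section B (order the sample, obtain t_(i), form α_i and β_i, evaluate the closed-form estimators, then calibrate p by the likelihood) can be sketched in Python. This is an illustrative reading of the formulas, not the authors' MATLAB code. The assumptions are: t_(i) is approximated by the i/(n+1) quantile of the standardised LTS distribution, found here by Simpson integration and bisection; the grid of candidate p values is arbitrary; and all function names are hypothetical.

```python
import math


def lts_pdf(z, p):
    """Standardised LTS density (mu = 0, sigma = 1) with k = 2p - 3."""
    k = 2 * p - 3
    log_c = (math.lgamma(p) - math.lgamma(0.5) - math.lgamma(p - 0.5)
             - 0.5 * math.log(k))
    return math.exp(log_c - p * math.log1p(z * z / k))


def lts_cdf(t, p, steps=200):
    """F(t) = 1/2 +/- integral of the density over [0, |t|] (composite Simpson rule)."""
    a = abs(t)
    step = a / steps
    total = lts_pdf(0.0, p) + lts_pdf(a, p)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * lts_pdf(j * step, p)
    area = total * step / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def lts_quantile(q, p):
    """Solve F(t) = q by bisection after expanding a bracket around zero."""
    lo, hi = -1.0, 1.0
    while lts_cdf(lo, p) > q:
        lo *= 2
    while lts_cdf(hi, p) < q:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if lts_cdf(mid, p) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mml(x, p):
    """MML estimates (mu, sigma) for a given shape parameter p >= 2."""
    n = len(x)
    k = 2 * p - 3
    xs = sorted(x)
    # Assumption: t_(i) is taken as the i/(n+1) quantile of the LTS distribution.
    t = [lts_quantile(i / (n + 1), p) for i in range(1, n + 1)]
    denom = [(1 + ti * ti / k) ** 2 for ti in t]
    alpha = [(ti ** 3 / k) / d for ti, d in zip(t, denom)]
    beta = [1 / d for d in denom]
    mu = sum(b * v for b, v in zip(beta, xs)) / sum(beta)
    big_b = (2 * p / k) * sum(a * (v - mu) for a, v in zip(alpha, xs))
    big_c = (2 * p / k) * sum(b * (v - mu) ** 2 for b, v in zip(beta, xs))
    sigma = (big_b + math.sqrt(big_b ** 2 + 4 * n * big_c)) / (2 * math.sqrt(n * (n - 1)))
    return mu, sigma


def mml_calibrated(x, p_grid=(2, 2.5, 3, 3.5, 5, 10)):
    """Pick the p on a (hypothetical) grid that maximises the LTS log-likelihood."""
    best = None
    for p in p_grid:
        mu, sigma = mml(x, p)
        loglik = sum(math.log(lts_pdf((v - mu) / sigma, p)) for v in x)
        loglik -= len(x) * math.log(sigma)
        if best is None or loglik > best[0]:
            best = (loglik, p, mu, sigma)
    return best[1], best[2], best[3]
```

Because the β_i are symmetric in i and decrease towards the extremes, the location estimate is a weighted average of the order statistics that downweights the tails. As p grows, α_i tends to 0 and β_i to 1, so the two estimators approach the sample mean and the sample standard deviation.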

In this study, for the mixture model, we assume that the sample contains two subsamples that come from normal distributions with mean zero but with different scale parameters, with probabilities $\pi$ and $(1-\pi)$, respectively. The mixture model is

$$X \sim \begin{cases} N(0, k^2) & \text{with probability } \pi \\ N(0, 1) & \text{with probability } 1 - \pi. \end{cases}$$

This model has mean 0 and variance $1 - \pi + \pi k^2$.

Now consider a sample of n observations that contains outliers. If we want to model this sample with a theoretical distribution, the outliers and the regular observations must be modeled separately. The model that combines the distributions of the regular and the outlying observations is called an outlier model. In this study, for the outlier model, it is assumed that both regular and outlying observations come from normal distributions with mean zero but with different scale parameters. The outlier model is

$$a \text{ observations} \sim N(0, k^2) \quad \text{and} \quad (n - a) \text{ observations} \sim N(0, 1).$$

This model has mean 0 and variance $1 - (a/n) + (a/n)k^2$. For both the mixture and outlier models, k can be regarded as the extremity of contamination.

Remark: Under regularity conditions, the distributions of the sample mean and the sample standard deviation are approximately normal for large n (see Bain and Engelhardt [15] and Kenney and Keeping [16]). In the same way, under regularity conditions, M estimators are asymptotically normal (see Huber [2]). The MML estimators are also asymptotically normal (like the ML estimators) under very general regularity conditions since they are asymptotically equivalent to the ML estimators (see Tiku and Akkaya [1] for details). Since the sample median is a central order statistic (or the average of two central order statistics for an even sample size), it is also asymptotically normal under certain conditions (see Bain and Engelhardt [15]).
Hall and Welsh [17] showed that MAD is asymptotically normal under only very mild smoothness conditions on the underlying distribution. It is possible to work out the exact distributions of the mentioned estimators in several situations by using approximation methods such as the Edgeworth expansion or saddlepoint techniques, but this can be very cumbersome [2].

3. SIMULATION STUDY

In this study, the performance of the various estimators of the location and scale parameters is investigated through simulation under the standard normal distribution and under different cases of the mixture and outlier models of the normal distribution. (100,000/n) Monte Carlo runs were performed in MATLAB. The simulations were done for the sample sizes n = 20, 50 and 100. For the mixture model, the probability that an observation comes from $N(0, k^2)$ is taken as $\pi$ = 0.05 and 0.1. For the outlier model, the proportion of outliers in a sample is taken as p = 0.05 and 0.1. For both models, the extremity of contamination, k, is taken as 5, 10 and 20. After the data sets have been generated, they are standardized by dividing by the square root of the variance of the model. Thus, for all the data sets, the expected mean is zero and the expected standard deviation is 1. Then the simulated means, biases, variances, mean square errors (mse) and relative efficiencies (eff) are calculated to investigate the efficiency of the estimators.

The estimators of the location parameter, $\hat{\mu}_{w24}$, $\hat{\mu}_{BS82}$, $\hat{\mu}_{MML}$ and $\tilde{\mu}$, are compared according to their relative efficiency with respect to the sample mean $\bar{x}$:

$$\mathrm{eff}(\hat{\theta}_i \mid \bar{x}) = 100 \times \frac{\mathrm{mse}(\bar{x})}{\mathrm{mse}(\hat{\theta}_i)}, \quad i = 1, 2, 3, 4; \quad \hat{\theta}_1 = \hat{\mu}_{w24},\ \hat{\theta}_2 = \hat{\mu}_{BS82},\ \hat{\theta}_3 = \hat{\mu}_{MML},\ \hat{\theta}_4 = \tilde{\mu}.$$

The estimators of the scale parameter, $\hat{\sigma}_{w24}$, $\hat{\sigma}_{BS82}$, $\hat{\sigma}_{MML}$ and MAD*, are compared according to their relative efficiency with respect to the sample standard deviation s:

$$\mathrm{eff}(\hat{\theta}_i \mid s) = 100 \times \frac{\mathrm{mse}(s)}{\mathrm{mse}(\hat{\theta}_i)}, \quad i = 1, 2, 3, 4; \quad \hat{\theta}_1 = \hat{\sigma}_{w24},\ \hat{\theta}_2 = \hat{\sigma}_{BS82},\ \hat{\theta}_3 = \hat{\sigma}_{MML},\ \hat{\theta}_4 = \mathrm{MAD}^*.$$

The simulation results are given in Tables 1-7. The tables include the simulated means, biases, variances, mse's and efficiency values. In Tables 1-6 the values are grouped by $\pi$ (or p) and k. Table 7 contains the results for data sets from the standard normal distribution. In general, the two Huber's M estimators, w24 and BS82, produce very similar results. We therefore comment on the Huber's M estimators without differentiating between them.

Table 1 shows the simulation results for the mixture model when the sample size is 20. As $\pi$ or k increases, the efficiencies of the robust estimators of the location parameter increase. The Huber's M estimators of the location parameter are the most efficient and the sample mean is the worst estimator of the location parameter in this situation. The MML estimator of the location parameter is more efficient than the median for k = 5, whereas it is worse than the median for k = 10 and 20. For the scale parameter, when $\pi$ = 0.05 and k = […], the Huber's M estimators are the best. The MML estimator of the scale parameter takes second place, although there is only a marginal difference between it and the Huber's M estimators. MAD* takes third place and the sample standard deviation is the worst. In the other situations the MML estimator is the best. When $\pi$ = 0.05 and k = […], the Huber's M estimators are second best and MAD* takes third place, only marginally better than the sample standard deviation. In all other situations, the sample standard deviation is second best after the MML estimator of the scale parameter. As $\pi$ or k increases, the bias of the Huber's M estimators of the scale parameter and of MAD* gets larger, and this makes them extremely inefficient in estimating the scale parameter.
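The simulation design of Section 3 (generate contaminated samples, standardize them by the model standard deviation, and compare mean square errors) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB program. The function names are hypothetical, the number of outliers `a` is passed directly (how a proportion p is turned into a count is not stated in the text), and the example compares only the sample median with the sample mean.

```python
import math
import random
from statistics import mean, median


def mixture_sample(n, pi, k, rng):
    """Each observation is N(0, k^2) with probability pi, otherwise N(0, 1)."""
    scale = math.sqrt(1 - pi + pi * k * k)  # model standard deviation
    return [rng.gauss(0, k if rng.random() < pi else 1) / scale for _ in range(n)]


def outlier_sample(n, a, k, rng):
    """Exactly a observations are N(0, k^2); the other n - a are N(0, 1)."""
    scale = math.sqrt(1 - a / n + (a / n) * k * k)
    outliers = [rng.gauss(0, k) / scale for _ in range(a)]
    regular = [rng.gauss(0, 1) / scale for _ in range(n - a)]
    return outliers + regular


def relative_efficiency(estimator, sampler, runs, true_value=0.0):
    """100 * mse(sample mean) / mse(estimator), both estimated over the same runs."""
    se_est = 0.0
    se_mean = 0.0
    for _ in range(runs):
        x = sampler()
        se_est += (estimator(x) - true_value) ** 2
        se_mean += (mean(x) - true_value) ** 2
    return 100 * se_mean / se_est


if __name__ == "__main__":
    rng = random.Random(1)       # fixed seed so the run is repeatable
    n = 20
    runs = 100_000 // n          # the run count quoted in the text, rounded down
    eff = relative_efficiency(median, lambda: mixture_sample(n, 0.1, 10, rng), runs)
    print(f"eff(median | mean) = {eff:.1f}")
```

Evaluating both estimators on the same simulated samples, as done here, keeps the comparison paired, so the efficiency ratio is less noisy than it would be with independent sets of runs. Scale estimators can be compared in the same way by passing `true_value=1.0` and replacing `mean` with the sample standard deviation.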
The simulation results of the mixture model for the sample size n = 50 are given in Table 2. All the robust estimators of the location parameter are more efficient than the sample mean and are ranked in the same order as in Table 1. The MML estimator of the scale parameter is the best for k = […] and for $\pi$ = 0.05 and k = […]. As $\pi$ or k increases, the MML estimator of the scale parameter produces some bias and becomes inefficient relative to the sample standard deviation. For high k values, the sample standard deviation is the best of all the scale estimators. We should especially note that the Huber's M estimators of the scale parameter and MAD* produce huge bias as $\pi$ or k increases. For example, for $\pi$ = 0.1 and k = […], their mean is around 0.17, which gives a bias of approximately -0.83 (the true value being 1) and leads to an efficiency of around 21%.

The last simulation study for the mixture model is done for the sample size n = 100 and is shown in Table 3. The results for the location parameter are very similar to those in Tables 1 and 2. In the estimation of the scale parameter, the sample standard deviation is the best estimator except for k = […], where the MML scale estimator is the best. The Huber's M estimators of the scale parameter and MAD* cannot be used in this situation because of their huge bias and extreme inefficiency.

Table 4 gives the simulation results of the outlier model for the sample size n = 20. Again, the Huber's M estimators produce the most efficient estimates of the location parameter. They are followed by the MML estimator for low p and k values. When k = […], the sample median is better than the MML estimator of the location parameter. It is also better than the MML estimator when p = 0.1 and k = […], whereas it is worse than the MML estimator when p = 0.05 and k = […]. The sample mean is the worst estimator of the location parameter, which is not surprising. The MML estimator of the scale parameter dominates this table. The sample standard deviation takes second place except for p = 0.05 and k = […], where the Huber's M estimators are better. Again, the Huber's M estimators of the scale parameter and MAD* produce huge bias as p and k increase.

Table 5 shows the simulation results of the outlier model for the sample size n = 50. The results for the location parameter are very similar to those of Table 4. In the estimation of the scale parameter, the MML estimator is the best for k = […] and for p = 0.05 and k = […].
In other situations it takes second place behind the sample standard deviation. For high k values, the MML estimator of the scale parameter produces some bias, but the bias produced by the Huber's M estimators of the scale parameter and MAD* is huge. For example, for p = 0.1 and k = […], their mean is around 0.17 and their bias is approximately -0.83 (the true value being 1). This inevitably results in extremely inefficient estimation of the scale parameter.
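The link between a large bias and a poor efficiency follows from the usual decomposition of the mean square error. As a short worked illustration (using only the simulated mean of about 0.17 quoted above and the fact that the standardized data have true scale 1):

```latex
\[
\operatorname{mse}(\hat{\sigma})
  = \operatorname{Var}(\hat{\sigma}) + \bigl[\operatorname{bias}(\hat{\sigma})\bigr]^{2},
\qquad
\operatorname{bias}(\hat{\sigma}) \approx 0.17 - 1 = -0.83
\;\Longrightarrow\;
\operatorname{mse}(\hat{\sigma}) \ge (-0.83)^{2} \approx 0.69 .
\]
```

The squared bias alone therefore puts a floor of roughly 0.69 under the mse of these estimators, however small their variance is, whereas the mse of an approximately unbiased estimator such as s shrinks as n grows. This is why the efficiency of the Huber's M scale estimators and of MAD* deteriorates with increasing sample size.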

Table 6 contains the simulation results of the outlier model for the sample size n = 100. The results for the location parameter are again very similar to those of Tables 4 and 5. The MML estimator of the scale parameter is the best for k = […]. In all other situations the sample standard deviation is the best estimator of the scale parameter and the MML estimator takes second place. The Huber's M estimators of the scale parameter and MAD* have huge bias in all situations of Table 6, and the bias grows as p and k get larger. They clearly cannot be used to estimate the scale parameter in this case.

Finally, Table 7 gives the simulation results for the sample sizes n = 20, 50 and 100 when the underlying distribution is standard normal. As expected, the sample mean and the sample standard deviation are the most efficient estimators in this situation. The MML estimators of the location and scale parameters take second place, although there is only a marginal difference between them and the sample mean and the sample standard deviation. The Huber's M estimators of the location and scale parameters come next. There is no big difference between the Huber's M estimators and the MML estimator of the location parameter, but there is a considerable difference between the Huber's M estimators and the MML estimator of the scale parameter. The sample median and MAD* have very poor efficiencies in this situation.

[Tables 1-6 share one layout. Columns: location estimators $\hat{\mu}_{w24}$, $\hat{\mu}_{BS82}$, $\hat{\mu}_{MML}$, $\tilde{\mu}$, $\bar{x}$; scale estimators $\hat{\sigma}_{w24}$, $\hat{\sigma}_{BS82}$, $\hat{\sigma}_{MML}$, MAD*, s. Rows: mean, bias, variance, mse and eff for each combination of contamination level (0.05 and 0.1) and k. The numerical entries are not preserved in this transcription.]

Table 1 Simulation results of the mixture model for n=20

Table 2 Simulation results of the mixture model for n=50

Table 3 Simulation results of the mixture model for n=100

Table 4 Simulation results of the outlier model for n=20

Table 5 Simulation results of the outlier model for n=50

Table 6 Simulation results of the outlier model for n=100

Table 7 Simulation results for standard normal distribution
[Same columns as Tables 1-6; rows: mean, bias, variance, mse and eff for n = 20, 50 and 100. The numerical entries are not preserved in this transcription.]

4. CONCLUSION

In this paper we carried out a simulation study to compare the performance of some well-known estimators of the location and scale parameters under several conditions, including the mixture and outlier models and the normal distribution.

Similar results are observed for the mixture and outlier models. In the estimation of the location parameter, the Huber's M estimators give the best results. For low k values the MML estimator takes second place, whereas for high k values the sample median is the second best estimator of the location parameter. The worst estimator of the location parameter is the sample mean.

In the estimation of the scale parameter for the sample size n = 20, the MML estimator is always the best except when p or $\pi$ = 0.05 and k = […], where the Huber's M estimators are the best. In other situations the efficiencies of the Huber's M estimators and MAD* are very close to each other, but both are worse than the MML estimator and the sample standard deviation. As the values of $\pi$ or p and k get higher, the Huber's M estimators of the scale parameter and MAD* produce great bias and become very inefficient.

For the sample size n = 50, the MML estimator is still the best for k = 5 and 10, except when p or $\pi$ = 0.1 and k = […], where the sample standard deviation is the best. In other situations the MML estimator takes second place after the sample standard deviation. Thus, the MML estimator of the scale parameter and the sample standard deviation take the first two places in estimating the scale parameter. Again, as the values of $\pi$ or p and k get higher, the Huber's M estimators of the scale parameter and MAD* produce great bias and become very inefficient.

For the sample size n = 100, the MML estimator of the scale parameter is the best only in the case when k = […], where the sample standard deviation is second best. In all other cases the sample standard deviation is the best and the MML estimator is the second best estimator of the scale parameter. In this case both the Huber's M estimators of the scale parameter and MAD* are extremely inefficient relative to the sample standard deviation, because they produce huge bias and the bias gets larger as $\pi$ or p and k get higher.

Finally, in the simulation for the standard normal distribution, the best results are, as expected, produced by the sample mean and the sample standard deviation. The MML estimators of the location and scale parameters take second place, with only a marginal difference between them and the sample mean and the sample standard deviation. The Huber's M estimators take third place. The sample median and MAD* are the worst estimators of the location and the scale parameter, respectively, in this situation. They are extremely inefficient and cannot be used.

As an overall recommendation for the location parameter, we suggest the use of the Huber's M estimators. For the scale parameter, the MML estimator can be used unless both the sample size and the extremity of contamination (k) are large. In such situations the sample standard deviation should be preferred.

REFERENCES

[1] M.L. Tiku and A.D. Akkaya, Robust Estimation and Hypothesis Testing, 2004, New Delhi.
[2] P.J. Huber, Robust Statistics, 1981, Wiley, New York.
[3] R.R. Wilcox, Introduction to Robust Estimation and Hypothesis Testing, Second Edition, 2005, Elsevier Academic Press.

[4] A.F. Özdemir, Comparing measures of location when the underlying distribution has heavier tails than normal, İstatistikçiler Dergisi, 2010, 3.
[5] A.F. Özdemir and R. Wilcox, New results on the small-sample properties of some robust univariate estimators of location, Communications in Statistics - Simulation and Computation, 2012, 41(9).
[6] P.J. Huber, Robust estimation of a location parameter, Annals Math. Stat., 1964, 35.
[7] A.M. Gross, Confidence interval robustness with long-tailed symmetric distributions, J. Amer. Stat. Assoc., 1976, 71.
[8] D.F. Andrews, P.J. Bickel, F.R. Hampel, P.J. Huber, W.H. Rogers and J.W. Tukey, Robust Estimates of Location: Survey and Advances, 1972, Princeton University Press, Princeton, NJ.
[9] P.J. Huber, Robust Statistics, Wiley, New York.
[10] M.L. Tiku, W.Y. Tan and N. Balakrishnan, Robust Inference, 1986, Marcel Dekker, New York.
[11] F.R. Hampel, E.M. Ronchetti and P.J. Rousseeuw, Robust Statistics, 1986, Wiley, New York.
[12] K.L. Lange, R.J.A. Little and J.M.G. Taylor, Robust statistical modeling using the t-distribution, Journal of the American Statistical Association, 1989, 84 (408).
[13] M.L. Tiku, Estimating the mean and standard deviation from a censored normal sample, Biometrika, 1967, 54.
[14] H. Yilmaz and H.S. Sazak, Double-looped maximum likelihood estimation for the parameters of the generalized gamma distribution, Mathematics and Computers in Simulation, 2014, 98.
[15] L.J. Bain and M. Engelhardt, Introduction to Probability and Mathematical Statistics, Second edition, PWS-Kent, Boston.
[16] J.F. Kenney and E.S. Keeping, Mathematics of Statistics, Part 2, Second edition, Princeton.
[17] P. Hall and A.H. Welsh, Limit theorems for the median deviation, Annals of the Institute of Statistical Mathematics, 1985, 37 (1).


More information

Highly Robust Variogram Estimation 1. Marc G. Genton 2

Highly Robust Variogram Estimation 1. Marc G. Genton 2 Mathematical Geology, Vol. 30, No. 2, 1998 Highly Robust Variogram Estimation 1 Marc G. Genton 2 The classical variogram estimator proposed by Matheron is not robust against outliers in the data, nor is

More information

Estimation of Parameters of the Weibull Distribution Based on Progressively Censored Data

Estimation of Parameters of the Weibull Distribution Based on Progressively Censored Data International Mathematical Forum, 2, 2007, no. 41, 2031-2043 Estimation of Parameters of the Weibull Distribution Based on Progressively Censored Data K. S. Sultan 1 Department of Statistics Operations

More information

STOCHASTIC COVARIATES IN BINARY REGRESSION

STOCHASTIC COVARIATES IN BINARY REGRESSION Hacettepe Journal of Mathematics and Statistics Volume 33 2004, 97 109 STOCHASTIC COVARIATES IN BINARY REGRESSION Evrim Oral and Süleyman Günay Received 27 : 04 : 2004 : Accepted 23 : 09 : 2004 Abstract

More information

Fast and robust bootstrap for LTS

Fast and robust bootstrap for LTS Fast and robust bootstrap for LTS Gert Willems a,, Stefan Van Aelst b a Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, B-2020 Antwerp, Belgium b Department of

More information

Measuring robustness

Measuring robustness Measuring robustness 1 Introduction While in the classical approach to statistics one aims at estimates which have desirable properties at an exactly speci ed model, the aim of robust methods is loosely

More information
