Robust regression. Jiří Franc
1 Robust regression Robust estimation of regression coefficients in the linear regression model. Jiří Franc, Czech Technical University, Faculty of Nuclear Sciences and Physical Engineering, Department of Mathematics
2 Outline 1 Introduction Regression analysis Classical regression estimators 2 Robust regression L-estimators M-estimators R-estimators S-estimators Least median of squares (LMS) Least trimmed squares (LTS) 3 Examples 4 Conclusion
3 Linear regression model
Definition. The multiple linear regression model is the model
Y_i = X_{i,1} β⁰_1 + X_{i,2} β⁰_2 + ... + X_{i,p} β⁰_p + e_i = X_i^T β⁰ + e_i,  i = 1, ..., n,
where
n is the sample size,
X_i = (X_{i,1}, X_{i,2}, ..., X_{i,p})^T are called the explanatory variables (regressors), a sequence of deterministic p-dimensional vectors or a sequence of random variables,
Y_i is called the response variable (dependent variable) and is the i-th element of the random sequence of observations,
β⁰ = (β⁰_1, β⁰_2, ..., β⁰_p)^T is the p-dimensional vector of true regression coefficients,
e_i is called the sequence of disturbances (error terms); it represents unexplained variation in the dependent variable and is a sequence of random variables.
4 Linear regression model
Let us rewrite the previous definition with n equations in matrix notation.
Definition.
( Y_1 )   ( X_{1,1} X_{1,2} ... X_{1,p} ) ( β⁰_1 )   ( e_1 )
( Y_2 ) = ( X_{2,1} X_{2,2} ... X_{2,p} ) ( β⁰_2 ) + ( e_2 )
( ... )   (  ...     ...   ...    ...   ) (  ... )   ( ... )
( Y_n )   ( X_{n,1} X_{n,2} ... X_{n,p} ) ( β⁰_p )   ( e_n )
Equivalently, Y = Xβ⁰ + e.
5 Classical regression estimators
Ordinary least squares method (OLS):
β̂_OLS = arg min_{β∈R^p} Σ_{i=1}^n (Y_i − X_i^T β)² = arg min_{β∈R^p} (Y − Xβ)^T (Y − Xβ),
β̂_OLS = (X^T X)^{−1} X^T Y.
Under certain conditions OLS is the best linear unbiased estimator of β⁰.
Under certain conditions OLS is the best estimator among all unbiased estimators (ordinary least squares is optimal for multiple regression when the iid errors are normally distributed).
OLS is not robust and consequently often gives false results for real data (even a single regression outlier can totally offset the OLS estimator).
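The normal-equation formula above and the sensitivity of OLS to a single outlier can be illustrated with a minimal numpy sketch (the slides' own examples use R; the data here are synthetic and the function names are ours):

```python
import numpy as np

# OLS via the normal equations: beta_hat = (X^T X)^{-1} X^T Y
def ols(X, Y):
    return np.linalg.solve(X.T @ X, X.T @ Y)

rng = np.random.default_rng(0)
x = np.arange(20, dtype=float)
X = np.column_stack([np.ones_like(x), x])        # intercept + one regressor
Y = 2.0 + 0.5 * x + rng.normal(0, 0.1, size=20)  # true model: beta = (2, 0.5)

beta_clean = ols(X, Y)

Y_out = Y.copy()
Y_out[-1] += 50.0                                # one outlier in y-direction
beta_out = ols(X, Y_out)

print(beta_clean, beta_out)                      # the slope shifts noticeably
```

A single contaminated response is enough to move the OLS slope far from the true value 0.5, which is exactly the non-robustness the slide describes.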
6 Motivation An example of outliers in y-direction and the OLS estimate. [Figure: Number of International Calls (in tens of millions) from Belgium vs. Year, with the OLS fit.]
7 Motivation An example of outliers in y-direction and the OLS estimate. [Figure: Y vs. X scatter with the OLS fit.]
8 Motivation An example of outliers in x-direction and the OLS estimate. [Figure: Y vs. X scatter with the OLS fit.]
9 Robust regression The main aims of robust statistics: description of the structure best fitting the bulk of the data; identification of deviating data points (outliers) or deviating substructures for further treatment; identification of highly influential data points (leverage points), or at least warning about them; dealing with unsuspected serial correlations.
10 Robust regression Two ways to deal with regression outliers:
Regression diagnostics: certain quantities are computed from the data with the purpose of pinpointing influential points, after which these outliers can be removed or corrected.
Robust regression: tries to devise estimators that are not so strongly affected by outliers.
11 L-estimators
α-trimmed estimator:
β̂_(Lte,α) = (1/(n − 2⌊nα⌋)) Σ_{i=⌊nα⌋+1}^{n−⌊nα⌋} Y_(i),
where (Y_(1), ..., Y_(n)) are the order statistics and α ∈ [0, 1/2) is a fixed trimming proportion.
α-regression quantile (Koenker, Bassett (1978)):
β̂_(Lrq,α) = arg min_{β∈R^p} Σ_{i=1}^n ρ_α(r_i),
where r_i(β) = Y_i − X_i^T β, i = 1, ..., n, and
ρ_α(r_i) = α·r_i if r_i ≥ 0,  (α − 1)·r_i if r_i < 0.
L-estimators are scale equivariant and regression equivariant. However, the breakdown point is still 0%, and for α = 0.5 the regression quantile estimator coincides with the least absolute values estimator.
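Both building blocks of this slide are easy to sketch in Python (the function names are ours; the slides themselves use R): the α-trimmed estimator drops ⌊nα⌋ order statistics at each end, and ρ_α is the asymmetric check function of Koenker and Bassett.

```python
import numpy as np

# alpha-trimmed (location) estimator: average of the middle order statistics
def trimmed_mean(Y, alpha):
    Y = np.sort(np.asarray(Y, dtype=float))   # Y_(1) <= ... <= Y_(n)
    n = len(Y)
    k = int(np.floor(n * alpha))              # trim k points at each end
    return Y[k:n - k].mean()

# Koenker-Bassett check function rho_alpha
def rho_alpha(r, alpha):
    r = np.asarray(r, dtype=float)
    return np.where(r >= 0, alpha * r, (alpha - 1) * r)

Y = [1.0, 2.0, 3.0, 4.0, 100.0]               # one gross outlier
print(trimmed_mean(Y, 0.2))                   # -> 3.0, the outlier is trimmed away
print(rho_alpha([-2.0, 2.0], 0.5))            # symmetric loss for alpha = 0.5
```

For α = 0.5 the check function penalizes positive and negative residuals equally, which is why the 0.5-regression quantile reduces to the least absolute values estimator.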
12 L-estimators
Least absolute values method:
β̂_L1 = arg min_{β∈R^p} Σ_{i=1}^n |Y_i − X_i^T β| = arg min_{β∈R^p} Σ_{i=1}^n |r_i(β)|.
Least maximum deviation method:
β̂_L∞ = arg min_{β∈R^p} max_i |r_i(β)|.
13 M-estimators
M-estimators are based on the idea of replacing the squared residuals used in OLS estimation by another function of the residuals:
β̂_(M) = arg min_{β∈R^p} Σ_{i=1}^n ρ(r_i(β)),
where ρ is a symmetric function with a unique minimum at zero. Differentiating this expression with respect to the regression coefficients yields
Σ_{i=1}^n ψ(r_i) X_i = 0,
where ψ is the derivative of ρ. The M-estimate is obtained by solving this system of p equations.
14 M-estimators
OLS and L1 are also M-estimators, with ψ(t) = t for OLS and ψ(t) = sgn(t) for the L1 estimate.
M-estimators are unfortunately not scale equivariant, even though they are regression equivariant. Hence one necessarily has to studentize the M-estimator by an estimate σ̂ of the scale of the disturbances:
β̂_(M) = arg min_{β∈R^p} Σ_{i=1}^n ρ(r_i(β)/σ̂).
One possibility is to use the median absolute deviation (MAD):
σ̂ = C · median_i |r_i − median_j(r_j)|,
where C is a constant (correction factor) that depends on the distribution. For normally distributed data C ≈ 1.4826.
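The MAD scale estimate from the slide is a one-liner; a sketch with an illustrative residual vector of our choosing shows why it is preferred to the classical standard deviation:

```python
import numpy as np

# MAD scale estimate; C = 1.4826 (approximately 1/Phi^{-1}(3/4)) makes it
# consistent for sigma under normally distributed data
def mad_scale(r, C=1.4826):
    r = np.asarray(r, dtype=float)
    return C * np.median(np.abs(r - np.median(r)))

r = np.array([-1.0, 0.0, 1.0, 2.0, 50.0])      # residuals with one outlier
print(mad_scale(r))                            # barely affected by the 50
print(np.std(r))                               # the classical estimate explodes
```

Because both medians are resistant to a minority of outliers, the MAD keeps its breakdown point of 50%, while the standard deviation is ruined by a single bad residual.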
15 M-estimators
Let the vectors (Y_i, X_i^T)^T be iid with distribution function F(Y, X). If the function ρ has an absolutely continuous derivative ψ, σ = 1 for simplicity, and T(F) is the functional corresponding to the M-estimator, then T(F) is the solution of
∫ ψ(Y − X^T T(F)) X dF(X, Y) = 0.
Define
M(ψ, F) := ∫ ψ'(Y − X^T T(F)) X X^T dF(X, Y).
Then the influence function of T at a distribution F (on R^p × R) is given by
IF(X_0, Y_0; T, F) = M^{−1}(ψ, F) X_0 ψ(Y_0 − X_0^T T(F)).
The influence function can be bounded with respect to Y_0 by the choice of ψ, but the influence function of M-estimators is unbounded with respect to X_0. The breakdown point of M-estimators is 0% due to the vulnerability to leverage points.
16 M-estimators
Maronna and Yohai (1981) showed, under certain conditions, that M-estimators are consistent and asymptotically normal with asymptotic covariance matrix
V(T, F) = E[IF(X, Y; T, F) IF^T(X, Y; T, F)] = ∫ IF(X, Y; T, F) IF^T(X, Y; T, F) dF(X, Y) = M(ψ, F)^{−1} Q(ψ, F) M(ψ, F)^{−1},
where
M(ψ, F) = ∫ ψ'(Y − X^T T(F)) X X^T dF(X, Y),
Q(ψ, F) = ∫ ψ²(Y − X^T T(F)) X X^T dF(X, Y).
If the disturbances are normally distributed (e_i ~ N(0, σ₀²)) and iid, the multivariate M-estimator has the asymptotic covariance matrix
V(ψ, Φ) = (E[X X^T])^{−1} σ² / e,
where e is the asymptotic efficiency defined by
e = (∫ ψ' dΦ)² / ∫ ψ² dΦ.
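The efficiency e can be evaluated numerically for any ψ. As a sketch (not from the slides), we take the Huber ψ with tuning constant b = 1.345, a common choice known to give roughly 95% efficiency at the normal model, and approximate both integrals by Riemann sums:

```python
import numpy as np

# asymptotic efficiency e = (int psi' dPhi)^2 / int psi^2 dPhi for Huber's psi
b = 1.345                                      # common tuning constant (our choice)
t = np.linspace(-10.0, 10.0, 200001)
dt = t[1] - t[0]
phi = np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)   # standard normal density
psi = np.clip(t, -b, b)                        # Huber psi
dpsi = (np.abs(t) < b).astype(float)           # psi' (almost everywhere)

e = (np.sum(dpsi * phi) * dt) ** 2 / (np.sum(psi**2 * phi) * dt)
print(round(e, 3))                             # about 0.95
```

This is how tuning constants are chosen in practice: b trades robustness (smaller b) against efficiency at the normal model (larger b).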
17 M-estimators
Huber minimax M-estimator:
ψ(t) = t if |t| < b,  b·sgn(t) if |t| ≥ b,
where b is a constant.
Andrew M-estimator:
ψ(t) = sin(t) if −π ≤ t < π,  0 otherwise.
Tukey M-estimator:
ψ(t) = t·(1 − (t/c)²)² if |t| < c,  0 otherwise,
where c is a constant.
Descending minimax M-estimator:
ψ(t) = t if |t| < a,  b·sgn(t)·tanh[½ b (c − |t|)] if a ≤ |t| < c,  0 otherwise,
where a, b and c are constants.
Hampel M-estimator:
ψ(t) = t if |t| < a,  a·sgn(t) if a ≤ |t| < b,  a·sgn(t)·(c − |t|)/(c − b) if b ≤ |t| < c,  0 otherwise,
where a, b and c are constants.
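Three of these ψ functions are sketched below in vectorized Python (names and default tuning constants are ours); Huber is monotone, while Tukey and Hampel redescend to zero and so reject gross outliers completely:

```python
import numpy as np

def psi_huber(t, b=1.5):
    # linear in the middle, capped at +/- b
    return np.clip(t, -b, b)

def psi_tukey(t, c=3.0):
    # smooth redescending psi, identically 0 for |t| >= c
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) < c, t * (1 - (t / c) ** 2) ** 2, 0.0)

def psi_hampel(t, a=1.0, b=2.0, c=3.0):
    # piecewise linear: identity, flat, descending, then 0
    t = np.asarray(t, dtype=float)
    at = np.abs(t)
    return np.select(
        [at < a, at < b, at < c],
        [t, a * np.sign(t), a * np.sign(t) * (c - at) / (c - b)],
        default=0.0,
    )

print(psi_huber(np.array([-5.0, 0.5, 5.0])))       # [-1.5, 0.5, 1.5]
print(psi_tukey(np.array([0.0, 3.0, 10.0])))       # redescends to 0
print(psi_hampel(np.array([0.5, 1.5, 2.5, 4.0])))  # one value per branch
```

Evaluating these on a grid reproduces the shapes shown in the plots on the following slides.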
18–22 M-estimators [Figures: plots of ψ(t) for the Huber minimax (b = 1), Hampel (a = 1, b = 2, c = 3), descending minimax (a = 1, b = 1.2, c = 3), Andrew, and Tukey (c = 3) M-estimators.]
23 GM-estimators
Generalized M-estimators are introduced in order to bound the influence function of outlying X_i's by means of some weight function w:
β̂_(GM) = arg min_{β∈R^p} Σ_{i=1}^n w(X_i) ρ(r_i(β)/σ̂).
The definition can be rewritten as
Σ_{i=1}^n w(X_i) ψ(r_i/σ̂) X_i = 0.
Unfortunately, Maronna, Bustos and Yohai (1979) showed that the breakdown point of GM-estimators can be no better than a certain value that decreases as a function of p, where p is the number of regression coefficients.
24 GM-estimators
The algorithm of iteratively reweighted least squares with GM-estimates based on some ψ function is the following:
1. Compute the first elementary estimate β̂_(OLS) of β⁰.
2. Compute the residuals r_i(β̂) = Y_i − Ŷ_i = Y_i − X_i^T β̂, i = 1, ..., n.
3. Compute the estimate σ̂ of σ (e.g. MAD: σ̂ = median_i |r_i − median_j(r_j)|).
4. Compute the weights w_i (e.g. for Andrew's ψ function: w_i = ψ(r_i/σ̂) / (r_i/σ̂)).
5. Update the estimate β̂ by performing weighted least squares with the weights w_i: β̂_(WLS) = (X^T W X)^{−1} X^T W Y.
6. Go back to step 2 and iterate until convergence.
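The six steps above can be sketched directly in Python. This version (a sketch, not the slides' R code) uses Huber weights with b = 1.345 instead of Andrew's ψ, a fixed iteration count instead of a convergence test, and the MAD with its normal-consistency factor:

```python
import numpy as np

def irls_huber(X, Y, b=1.345, n_iter=50):
    beta = np.linalg.lstsq(X, Y, rcond=None)[0]               # step 1: OLS start
    for _ in range(n_iter):
        r = Y - X @ beta                                      # step 2: residuals
        sigma = 1.4826 * np.median(np.abs(r - np.median(r)))  # step 3: MAD scale
        u = r / sigma
        w = np.where(np.abs(u) < b, 1.0, b / np.abs(u))       # step 4: w = psi(u)/u
        W = np.diag(w)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)      # step 5: weighted LS
    return beta                                               # step 6: iterate

rng = np.random.default_rng(1)
x = np.arange(30, dtype=float)
X = np.column_stack([np.ones_like(x), x])
Y = 1.0 + 0.5 * x + rng.normal(0, 0.1, size=30)               # true beta = (1, 0.5)
Y[:3] += 30.0                                                 # three y-outliers

print(np.linalg.lstsq(X, Y, rcond=None)[0])                   # OLS: badly biased
print(irls_huber(X, Y))                                       # close to (1, 0.5)
```

The outliers receive weights far below 1 after the first reweighting, so the fit recovers the bulk of the data; against leverage points, however, this scheme inherits the limitations discussed on the previous slide.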
25 R-estimators
R-estimation is a procedure based on the ranks of the residuals. The idea of using rank statistics was extended to the domain of multiple regression by Jurečková (1971) and Jaeckel (1972).
Let R_i(β) be the rank of Y_i − X_i^T β, and further let a_n(i), i = 1, ..., n, be a nondecreasing scores function satisfying Σ_{i=1}^n a_n(i) = 0.
R-estimator:
β̂_(R,a_n) = arg min_{β∈R^p} Σ_{i=1}^n a_n(R_i(β)) (Y_i − X_i^T β) = arg min_{β∈R^p} Σ_{i=1}^n a_n(R_i) r_i.
26 R-estimators
Some possibilities for the scores function a_n(i) are:
Wilcoxon scores: a_n(i) = i/(n + 1) − 1/2.
Median scores: a_n(i) = sgn(i/(n + 1) − 1/2).
Van der Waerden scores: a_n(i) = Φ^{−1}(i/(n + 1)).
Bounded normal scores: a_n(i) = min{c, max{Φ^{−1}(i/(n + 1)), −c}}, where c is a parameter.
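These score functions are easily computed and checked against the centering condition Σ a_n(i) = 0; a small Python sketch for n = 10 (stdlib `NormalDist` supplies Φ^{−1}, and the bound c = 1 is our choice):

```python
import numpy as np
from statistics import NormalDist

n = 10
i = np.arange(1, n + 1)

wilcoxon = i / (n + 1) - 0.5                    # Wilcoxon scores
median_s = np.sign(i / (n + 1) - 0.5)           # median scores
vdw = np.array([NormalDist().inv_cdf(k / (n + 1)) for k in i])  # van der Waerden
bounded = np.clip(vdw, -1.0, 1.0)               # bounded normal scores, c = 1

# all four satisfy the centering condition sum_i a_n(i) = 0 (up to rounding)
print(wilcoxon.sum(), median_s.sum(), round(vdw.sum(), 12))
```

By symmetry of each score function around the middle rank, the sums vanish, as the definition of a_n requires.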
27 R-estimators
The most widely used R-estimator is the one with the Wilcoxon scores function, because the resulting R-estimator has a relatively simple structure.
R-estimators are automatically scale equivariant.
The breakdown point of R-estimators is still 0% due to the vulnerability to leverage points.
In the case of regression with an intercept, the constant term is estimated separately.
To compute an R-estimate, one has to find the minimum of the objective function by varying the unknown parameters.
28 S-estimators
S-estimators were introduced by Rousseeuw and Yohai (1984). They are derived from a scale statistic in an implicit way, and they are regression, scale and affine equivariant. S-estimators are defined by minimizing the dispersion of the residuals:
β̂_(S,c,K) = arg min_{β∈R^p} s(r_1(β), ..., r_n(β)) = arg min_{β∈R^p} ŝ(β),
where ŝ(β) is an estimator of scale defined as the solution of the equation
(1/n) Σ_{i=1}^n ρ(r_i(β)/ŝ(β)) = K.
29 S-estimators
The function ρ must satisfy the following conditions:
C1: ρ is a symmetric and continuously differentiable function with ρ(0) = 0.
C2: There exists a parameter c > 0 such that ρ is strictly increasing on [0, c] and constant on [c, ∞).
For an S-estimator with breakdown point of 50%, one extra condition has to be added:
C3: K/ρ(c) = 1/2.
An example: Tukey's biweight function
ρ(x) = x²/2 − x⁴/(2c²) + x⁶/(6c⁴) if |x| ≤ c,  c²/6 otherwise,
where c is the parameter from condition C2.
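For a fixed β, the implicit scale equation (1/n) Σ ρ(r_i/ŝ) = K can be solved by bisection, since the left side decreases in ŝ. A Python sketch with Tukey's biweight (function names are ours; c = 1.547 is the usual 50%-breakdown constant, an assumption here):

```python
import numpy as np

def rho_biweight(x, c):
    # Tukey's biweight rho from the slide
    x = np.asarray(x, dtype=float)
    body = x**2 / 2 - x**4 / (2 * c**2) + x**6 / (6 * c**4)
    return np.where(np.abs(x) <= c, body, c**2 / 6)

def m_scale(r, c=1.547, K=None):
    # solve (1/n) sum rho(r_i/s) = K for s by bisection
    r = np.asarray(r, dtype=float)
    K = c**2 / 12 if K is None else K            # K = rho(c)/2 -> 50% breakdown
    lo, hi = 1e-12, 10.0 * (np.max(np.abs(r)) + 1.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if np.mean(rho_biweight(r / mid, c)) > K:
            lo = mid                             # average rho too big -> s too small
        else:
            hi = mid
    return 0.5 * (lo + hi)

r = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 8.0])   # residuals with one outlier
print(m_scale(r))                                # a robust scale estimate near 1
```

Minimizing this ŝ(β) over β then gives the S-estimator itself; the outlier contributes only the bounded value ρ(c) = c²/6 to the defining equation.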
30 S-estimators
Under certain conditions, the behavior of β̂_(S) at the normal model, where (Y_i, x_i) are iid random variables satisfying Y_i = x_i^T β⁰ + e_i with e_i ~ N(0, σ²), is given by
√n (β̂_(S) − β⁰) → N(0, E[X^T X]^{−1} (∫ ψ² dΦ)/(∫ ψ' dΦ)²).
Consequently the asymptotic efficiency is defined by
e := (∫ ψ' dΦ)² / ∫ ψ² dΦ.
The same formulas were derived also for the M-estimator.
31 S-estimators
An example: the S-estimator corresponding to Tukey's function ρ with breakdown point of 50% is obtained from K = E_Φ[ρ] together with condition C3. Writing φ and Φ for the standard normal density and cdf,
E_Φ[ρ] = ∫ ρ(x) φ(x) dx = (c²/6)·2Φ(−c) + ∫_{−c}^{c} (x²/2 − x⁴/(2c²) + x⁶/(6c⁴)) φ(x) dx
= (c²/3) Φ(−c) + ½ [1 − 2Φ(−c) − 2c φ(c)] − (1/(2c²)) [3 − 6Φ(−c) − (2c³ + 6c) φ(c)] + (1/(6c⁴)) [15 − 30Φ(−c) − (2c⁵ + 10c³ + 30c) φ(c)],
and in general K = α·ρ(c) for breakdown point α ∈ (0, 0.5].
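The closed form above can be verified numerically (a sketch; the helper names are ours). With c = 1.547, the usual 50%-breakdown constant, the closed form, a direct numerical integration of ∫ρφ, and the target value ρ(c)/2 = c²/12 all agree:

```python
import numpy as np
from statistics import NormalDist

def e_rho_closed(c):
    # closed form of E_Phi[rho] via truncated normal moments
    Phi = NormalDist().cdf
    phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
    I2 = 1 - 2 * Phi(-c) - 2 * c * phi(c)
    I4 = 3 * (1 - 2 * Phi(-c)) - 2 * (c**3 + 3 * c) * phi(c)
    I6 = 15 * (1 - 2 * Phi(-c)) - 2 * (c**5 + 5 * c**3 + 15 * c) * phi(c)
    return (c**2 / 3) * Phi(-c) + I2 / 2 - I4 / (2 * c**2) + I6 / (6 * c**4)

def e_rho_numeric(c):
    # brute-force Riemann sum of int rho(x) phi(x) dx
    x = np.linspace(-10.0, 10.0, 400001)
    rho = np.where(np.abs(x) <= c,
                   x**2 / 2 - x**4 / (2 * c**2) + x**6 / (6 * c**4),
                   c**2 / 6)
    phi = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
    return np.sum(rho * phi) * (x[1] - x[0])

c = 1.547
print(e_rho_closed(c), e_rho_numeric(c), c**2 / 12)   # all approximately equal
```

The agreement E_Φ[ρ] ≈ ρ(c)/2 at c ≈ 1.547 is precisely condition C3, which is how this tuning constant is determined.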
32 S-estimators
The asymptotic efficiency e of the S-estimator corresponding to Tukey's function ρ for different values of the breakdown point ε (each ε determines the constants c and K):
ε: 50%   45%   40%   35%   30%   25%   20%   15%   10%
e: 28.7% 37.0% 42.6% 56.0% 66.1% 75.9% 84.7% 91.7% 96.6%
MM-estimators: high-breakdown and high-efficiency estimators, where the initial estimate is obtained with an S-estimator, and it is then improved with an M-estimator.
33 LMS-estimator
The LMS is probably the first really applicable 50% breakdown point estimator, introduced by Rousseeuw (1984). The idea of the LMS is to replace the sum operator by a median, which is very robust.
Least median of squares estimator:
β̂_(LMS) = arg min_{β∈R^p} med_i (r_i²(β)).
34 LMS-estimator
There always exists a solution for the LMS estimator.
The LMS estimator is regression equivariant, scale equivariant and affine equivariant.
If p > 1, then the breakdown point of the LMS method is ε := (⌊n/2⌋ − p + 2)/n.
The LMS is not asymptotically normal.
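The LMS objective is usually minimized by resampling: fit exact lines through random pairs of points and keep the one with the smallest median squared residual. A Python sketch for simple regression (data and names are ours; even with 40% contamination the LMS fit recovers the true line):

```python
import numpy as np

def lms_line(x, y, n_trials=500, seed=0):
    # evaluate med_i r_i^2 for exact fits through random pairs of points
    rng = np.random.default_rng(seed)
    best, best_obj = None, np.inf
    for _ in range(n_trials):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue
        b1 = (y[j] - y[i]) / (x[j] - x[i])        # line through the two points
        b0 = y[i] - b1 * x[i]
        obj = np.median((y - b0 - b1 * x) ** 2)   # the LMS objective
        if obj < best_obj:
            best, best_obj = (b0, b1), obj
    return best

x = np.arange(20, dtype=float)
y = 1.0 + 2.0 * x                                 # true line: intercept 1, slope 2
y[:8] = 40.0                                      # 40% of the data contaminated
print(lms_line(x, y))                             # recovers (1, 2)
```

Any line through two clean points leaves more than half of the squared residuals at zero here, so its median squared residual is zero and it beats every contaminated candidate.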
35 LTS-estimator
Least trimmed squares estimator:
β̂_(LTS) = arg min_{β∈R^p} Σ_{i=1}^h r²_(i)(β),
where r²_(1) ≤ ... ≤ r²_(n) are the ordered squared residuals.
36 LTS-estimator
There always exists a solution for the LTS-estimator.
The LTS estimator is regression equivariant, scale equivariant and affine equivariant.
If p > 1 and h = ⌊n/2⌋ + ⌊(p + 1)/2⌋, then the breakdown point of the LTS-estimator is ε := (⌊(n − p)/2⌋ + 1)/n.
The LTS has the same asymptotic efficiency at the normal distribution as the Huber-type M-estimator.
37 LTS-estimator
A drawback of the LTS is that the objective function requires sorting of the squared residuals, which takes O(n log n) operations.
Robust high breakdown point estimators like the LTS can be very sensitive to a very small change of the data or to the deletion of even one point from the data set (i.e. a small change of the data can cause a large change of the estimate).
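A crude LTS sketch in the spirit of the usual resampling algorithms (a simplification, not the FAST-LTS algorithm of the literature): start from exact fits through random pairs, refine each by "concentration" steps that refit least squares on the h points with smallest squared residuals, and keep the fit minimizing the trimmed sum:

```python
import numpy as np

def lts_line(x, y, h, n_trials=500, seed=0):
    X = np.column_stack([np.ones_like(x), x])
    rng = np.random.default_rng(seed)
    best, best_obj = None, np.inf
    for _ in range(n_trials):
        idx = rng.choice(len(x), size=2, replace=False)
        beta = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
        for _ in range(2):                        # concentration steps
            r2 = (y - X @ beta) ** 2
            keep = np.argsort(r2)[:h]             # the h best-fitting points
            beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()   # sum of h smallest r^2
        if obj < best_obj:
            best, best_obj = beta, obj
    return best

rng = np.random.default_rng(2)
x = np.arange(30, dtype=float)
y = 1.0 + 2.0 * x + rng.normal(0, 0.1, size=30)   # true beta = (1, 2)
y[:10] = 5.0                                      # a third of the data replaced
h = 30 // 2 + (2 + 1) // 2                        # h = [n/2] + [(p+1)/2] = 16
print(lts_line(x, y, h))                          # close to (1, 2)
```

Because the trimmed objective simply ignores the n − h largest squared residuals, the third of the data that was replaced has no influence on the final fit.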
38 Examples [Figure: Number of International Calls (in tens of millions) from Belgium vs. Year, with the OLS, M Hampel, M Huber, MM Tukey and LTS fits.]
39 Examples An example of outliers in y-direction. [Figure: Y vs. X scatter with the OLS, M Hampel, M Huber, MM Tukey and LTS fits.]
40 Examples An example of outliers in x-direction. [Figure: Y vs. X scatter with the OLS, M Hampel, M Huber, MM Tukey and LTS fits.]
41 Examples
Robust regression in R:
m.ols     <- lm(nocalls~year, data=datab)
library(MASS)
m.huber   <- rlm(nocalls~year, data=datab, psi = psi.huber)
m.hampel  <- rlm(nocalls~year, data=datab, psi = psi.hampel)
m.tukey   <- rlm(nocalls~year, data=datab, psi = psi.bisquare)
m.mm.bisq <- rlm(nocalls~year, data=datab, method = "MM")
library(robustbase)   # a package of basic robust statistics
m.lts     <- ltsReg(nocalls~year, data=datab)
library(robust)       # a package of robust methods
42 Conclusion and Discussion Thank you for your attention. The discussion is open.
43 References
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. and Stahel, W. A. (1986). Robust Statistics: The Approach Based on Influence Functions. J. Wiley, New York.
Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression and Outlier Detection. J. Wiley, New York.
Jurečková, J. (2001). Robustní statistické metody. Karolinum, Prague.
Víšek, J. Á. (2000). Regression with high breakdown point. Robust 2000, 2001.
More informationRegression diagnostics
Regression diagnostics Kerby Shedden Department of Statistics, University of Michigan November 5, 018 1 / 6 Motivation When working with a linear model with design matrix X, the conventional linear model
More informationInference based on robust estimators Part 2
Inference based on robust estimators Part 2 Matias Salibian-Barrera 1 Department of Statistics University of British Columbia ECARES - Dec 2007 Matias Salibian-Barrera (UBC) Robust inference (2) ECARES
More informationRobust Variable Selection Through MAVE
Robust Variable Selection Through MAVE Weixin Yao and Qin Wang Abstract Dimension reduction and variable selection play important roles in high dimensional data analysis. Wang and Yin (2008) proposed sparse
More informationQuantitative Methods I: Regression diagnostics
Quantitative Methods I: Regression University College Dublin 10 December 2014 1 Assumptions and errors 2 3 4 Outline Assumptions and errors 1 Assumptions and errors 2 3 4 Assumptions: specification Linear
More informationEconometrics II - EXAM Answer each question in separate sheets in three hours
Econometrics II - EXAM Answer each question in separate sheets in three hours. Let u and u be jointly Gaussian and independent of z in all the equations. a Investigate the identification of the following
More informationL-momenty s rušivou regresí
L-momenty s rušivou regresí Jan Picek, Martin Schindler e-mail: jan.picek@tul.cz TECHNICKÁ UNIVERZITA V LIBERCI ROBUST 2016 J. Picek, M. Schindler, TUL L-momenty s rušivou regresí 1/26 Motivation 1 Development
More informationarxiv: v1 [math.st] 2 Aug 2007
The Annals of Statistics 2007, Vol. 35, No. 1, 13 40 DOI: 10.1214/009053606000000975 c Institute of Mathematical Statistics, 2007 arxiv:0708.0390v1 [math.st] 2 Aug 2007 ON THE MAXIMUM BIAS FUNCTIONS OF
More informationThe S-estimator of multivariate location and scatter in Stata
The Stata Journal (yyyy) vv, Number ii, pp. 1 9 The S-estimator of multivariate location and scatter in Stata Vincenzo Verardi University of Namur (FUNDP) Center for Research in the Economics of Development
More informationRobust Weighted LAD Regression
Robust Weighted LAD Regression Avi Giloni, Jeffrey S. Simonoff, and Bhaskar Sengupta February 25, 2005 Abstract The least squares linear regression estimator is well-known to be highly sensitive to unusual
More informationRobustness for dummies
Robustness for dummies CRED WP 2012/06 Vincenzo Verardi, Marjorie Gassner and Darwin Ugarte Center for Research in the Economics of Development University of Namur Robustness for dummies Vincenzo Verardi,
More informationTABLES AND FORMULAS FOR MOORE Basic Practice of Statistics
TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Exploring Data: Distributions Look for overall pattern (shape, center, spread) and deviations (outliers). Mean (use a calculator): x = x 1 + x
More informationAvailable from Deakin Research Online:
This is the published version: Beliakov, Gleb and Yager, Ronald R. 2009, OWA operators in linear regression and detection of outliers, in AGOP 2009 : Proceedings of the Fifth International Summer School
More informationAlternative Biased Estimator Based on Least. Trimmed Squares for Handling Collinear. Leverage Data Points
International Journal of Contemporary Mathematical Sciences Vol. 13, 018, no. 4, 177-189 HIKARI Ltd, www.m-hikari.com https://doi.org/10.1988/ijcms.018.8616 Alternative Biased Estimator Based on Least
More informationRobust Statistics. Frank Klawonn
Robust Statistics Frank Klawonn f.klawonn@fh-wolfenbuettel.de, frank.klawonn@helmholtz-hzi.de Data Analysis and Pattern Recognition Lab Department of Computer Science University of Applied Sciences Braunschweig/Wolfenbüttel,
More informationMultivariate Autoregressive Time Series Using Schweppe Weighted Wilcoxon Estimates
Western Michigan University ScholarWorks at WMU Dissertations Graduate College 4-2014 Multivariate Autoregressive Time Series Using Schweppe Weighted Wilcoxon Estimates Jaime Burgos Western Michigan University,
More informationKANSAS STATE UNIVERSITY
ROBUST MIXTURES OF REGRESSION MODELS by XIUQIN BAI M.S., Kansas State University, USA, 2010 AN ABSTRACT OF A DISSERTATION submitted in partial fulfillment of the requirements for the degree DOCTOR OF PHILOSOPHY
More informationVARIABLE SELECTION IN ROBUST LINEAR MODELS
The Pennsylvania State University The Graduate School Eberly College of Science VARIABLE SELECTION IN ROBUST LINEAR MODELS A Thesis in Statistics by Bo Kai c 2008 Bo Kai Submitted in Partial Fulfillment
More informationOn consistency factors and efficiency of robust S-estimators
TEST DOI.7/s749-4-357-7 ORIGINAL PAPER On consistency factors and efficiency of robust S-estimators Marco Riani Andrea Cerioli Francesca Torti Received: 5 May 3 / Accepted: 6 January 4 Sociedad de Estadística
More informationRobust Estimation of Cronbach s Alpha
Robust Estimation of Cronbach s Alpha A. Christmann University of Dortmund, Fachbereich Statistik, 44421 Dortmund, Germany. S. Van Aelst Ghent University (UGENT), Department of Applied Mathematics and
More informationECON The Simple Regression Model
ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In
More informationCOMPARING ROBUST REGRESSION LINES ASSOCIATED WITH TWO DEPENDENT GROUPS WHEN THERE IS HETEROSCEDASTICITY
COMPARING ROBUST REGRESSION LINES ASSOCIATED WITH TWO DEPENDENT GROUPS WHEN THERE IS HETEROSCEDASTICITY Rand R. Wilcox Dept of Psychology University of Southern California Florence Clark Division of Occupational
More informationIDENTIFYING MULTIPLE OUTLIERS IN LINEAR REGRESSION : ROBUST FIT AND CLUSTERING APPROACH
SESSION X : THEORY OF DEFORMATION ANALYSIS II IDENTIFYING MULTIPLE OUTLIERS IN LINEAR REGRESSION : ROBUST FIT AND CLUSTERING APPROACH Robiah Adnan 2 Halim Setan 3 Mohd Nor Mohamad Faculty of Science, Universiti
More informationLinear Models 1. Isfahan University of Technology Fall Semester, 2014
Linear Models 1 Isfahan University of Technology Fall Semester, 2014 References: [1] G. A. F., Seber and A. J. Lee (2003). Linear Regression Analysis (2nd ed.). Hoboken, NJ: Wiley. [2] A. C. Rencher and
More informationLinear Models in Machine Learning
CS540 Intro to AI Linear Models in Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu We briefly go over two linear models frequently used in machine learning: linear regression for, well, regression,
More informationRegression Review. Statistics 149. Spring Copyright c 2006 by Mark E. Irwin
Regression Review Statistics 149 Spring 2006 Copyright c 2006 by Mark E. Irwin Matrix Approach to Regression Linear Model: Y i = β 0 + β 1 X i1 +... + β p X ip + ɛ i ; ɛ i iid N(0, σ 2 ), i = 1,..., n
More informationNonparametric Bootstrap Estimation for Implicitly Weighted Robust Regression
J. Hlaváčová (Ed.): ITAT 2017 Proceedings, pp. 78 85 CEUR Workshop Proceedings Vol. 1885, ISSN 1613-0073, c 2017 J. Kalina, B. Peštová Nonparametric Bootstrap Estimation for Implicitly Weighted Robust
More informationEXTENDING PARTIAL LEAST SQUARES REGRESSION
EXTENDING PARTIAL LEAST SQUARES REGRESSION ATHANASSIOS KONDYLIS UNIVERSITY OF NEUCHÂTEL 1 Outline Multivariate Calibration in Chemometrics PLS regression (PLSR) and the PLS1 algorithm PLS1 from a statistical
More informationOPTIMAL B-ROBUST ESTIMATORS FOR THE PARAMETERS OF THE GENERALIZED HALF-NORMAL DISTRIBUTION
REVSTAT Statistical Journal Volume 15, Number 3, July 2017, 455 471 OPTIMAL B-ROBUST ESTIMATORS FOR THE PARAMETERS OF THE GENERALIZED HALF-NORMAL DISTRIBUTION Authors: Fatma Zehra Doğru Department of Econometrics,
More informationRobust estimators based on generalization of trimmed mean
Communications in Statistics - Simulation and Computation ISSN: 0361-0918 (Print) 153-4141 (Online) Journal homepage: http://www.tandfonline.com/loi/lssp0 Robust estimators based on generalization of trimmed
More informationReference: Davidson and MacKinnon Ch 2. In particular page
RNy, econ460 autumn 03 Lecture note Reference: Davidson and MacKinnon Ch. In particular page 57-8. Projection matrices The matrix M I X(X X) X () is often called the residual maker. That nickname is easy
More informationA NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL
Discussiones Mathematicae Probability and Statistics 36 206 43 5 doi:0.75/dmps.80 A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Tadeusz Bednarski Wroclaw University e-mail: t.bednarski@prawo.uni.wroc.pl
More informationDetection of outliers in multivariate data:
1 Detection of outliers in multivariate data: a method based on clustering and robust estimators Carla M. Santos-Pereira 1 and Ana M. Pires 2 1 Universidade Portucalense Infante D. Henrique, Oporto, Portugal
More information