Robust regression in R. Eva Cantoni
|
|
- Rosa Wilkerson
- 5 years ago
- Views:
Transcription
1 Robust regression in R Eva Cantoni Research Center for Statistics and Geneva School of Economics and Management, University of Geneva, Switzerland April 4th, 2017
2 1 Robust statistics philosopy 2 Robust regression 3 R ressources 4 Examples 5 Bibliography
3 Against what is robust statistics robust? Robust Statistics aims at producing consistent and possibly efficient estimators and test statistics with stable level when the model is slightly misspecified. Model misspecification encompasses a relatively large set of possibilities, and robust statistics cannot deal with all types of model misspecifications. By slight model misspecification, we suppose that the data generating process lies in a neighborhood of the true (postulated) model, the one that is considered as useful for the problem under investigation.
4 Against what is robust statistics robust? This neighborhood is formalized as F ε = (1 ε)f θ + εg, (1) F θ is the postulated model, θ is a set of parameters of interest, G is an arbitrary distribution and 0 ε 1 captures the amount of model misspecification
5 Against what is robust statistics robust? Inference Classical G 0 << ε < 1 0 < ε << 1 ε = 0 arbitrary?? F θ G = z F ε F ε F θ G = F (θ,θ ) F ε F ε F θ G such that F ε = F (θ,θ ) F (θ,θ ) F (θ,θ ) F θ Robust arbitrary? F θ F θ G = z F ε F θ F θ G = F (θ,θ ) F ε F θ F θ G such that F ε = F (θ,θ ) F (θ,θ ) F θ F θ Table: Models at which inference can be at best done.
6 Robustness measures Robust estimators protect against: bias under contamination breakdown point They imply a trade-off between efficiency and robustness!
7 Linear model and classical estimation For i = 1,..., n consider with ɛ i N (0, σ 2 ). y i = x T i β + ɛ i, The maximum likelihood estimator ˆβ ML minimizes ( n y i xi T σ i=1 ˆβ ML ) 2 = n i=1 r 2 i, or, alternatively, solves ( n y i xi T σ i=1 ˆβ ML ) x i = n r i x i = 0. i=1
8 Outliers x y Classification des points aberrants Point vertical Bon point levier Mauvais point levier
9 Robustness vs diagnostic Masking... ML residuals Index resid(olsmultiple)
10 Robustness vs diagnostic Masking... Data and ML fit x y
11 M and GM-estimation The M-estimator ˆβ M solves ) n (y i xi T ˆβ M ψ c x i = σ i=1 n ψ c (r i )x i = i=1 n w c (r i ) r i x i = 0, i=1 where w c (r i ) = ψ c (r i )/r i, or minimizes ) n (y i xi T ˆβ M ρ c. σ i=1 The Mallows GM-estimator ˆβ GM is an alternative that solves ) n (y i xi T ˆβ GM n ψ c w(x i )x i = w(r i )w(x i )r i x i = 0. σ i=1 i=1
12 ψ c and ρ c functions Huber Tukey or bisquare psi_c(r) psi_c(r) r r ρ H (u) ρ B (u) u u
13 Comparison x y OLS M Huber GM Huber M Tukey
14 S-estimation The S-estimator ˆβ S is an alternative that minimizes ) n (y i xi T ˆβ S ρ c, s i=1 where s is a scale M-estimator that solves 1 n ( ρ (1) yi x T ) i β c = b. n s, i=1
15 MM-estimation The MM-estimator is a two-step estimator constructed as follow: 1. Let s n be the scale estimate from an initial S-estimator. 2. With ρ (2) c ( ) ρ (1) c ( ), the MM-estimator ˆβ MM minimizes ( ) n y i xi T ˆβ MM. i=1 ρ (2) c s n
16 Comparison x y OLS M Huber M Tukey MM S
17 Robust GLM (GM-estimator) For the GLM model (e.g. logistic, Poisson) g(µ i ) = x T i β where E(Y i ) = µ i, Var(Y i ) = v(µ i ) and r i = (y i µ i ), the robust φvµi estimator is defined by n [ 1 ] ψ c (r i )w(x i ) µ i a(β) = 0, (2) φvµi i=1 where µ i = µ i / β = µ i / η i x i and a(β) = 1 n n i=1 E[ψ(r i; c)]w(x i )/ φv µi µ i. The constant a(β) is a correction term to ensure Fisher consistency.
18 R functions for robust linear regression (G)M-estimation MASS: rlm() with method= M (Huber, Tukey, Hampel) Choice for the scale estimator: MAD, Huber Proposal 2 S-estimation robust: lmrob with estim= Initial robustbase: lmrob.s MM-estimation MASS: rlm() with method= MM robust: lmrob (with estim= Final, default) robustbase: lmrob()
19 R functions for other models robustbase: glmrob GM-estimation, Huber (include Gaussian) Negative binomial model: glmrob.nb from From our book webpage: eva-cantoni/books/ robust-methods-in-biostatistics/
20 Coleman Dataset coleman from package robustbase. A data frame with 20 observations on the following 6 variables. salaryp: staff salaries per pupil fatherwc: percent of white-collar fathers sstatus: socioeconomic status composite deviation: means for family size, family intactness, father s education, mother s education, and home items teachersc: mean teacher s verbal test score motherlev: mean mother s educational level, one unit is equal to two school years Y: verbal mean test score (y, all sixth graders)
21 Coleman > summary (m. lm ) C a l l : lm ( formula = Y., data = coleman ) Residuals : Min 1Q Median 3Q Max C o e f f i c i e n t s : Est imat e Std. E r r o r t v a l u e Pr(> t ) ( I n t e r c e p t ) s a l a r y P fatherwc s s t a t u s e 05 t e a c h e r S c motherlev S i g n i f. c o d e s : 0 Äò Äô Äò Äô Äò Äô Äò. Äô 0. 1 Äò Äô 1 Residual standard e r r o r : on 14 degrees of freedom M u l t i p l e R squared : , Adjusted R squared : F s t a t i s t i c : on 5 and 14 DF, p v a l u e : e 07
22 Coleman > r e q u i r e ( r o b u s t b a s e ) > summary (m. lmrob, s e t t i n g = KS2011 ) C a l l : lmrob ( formula = Y., data = coleman ) \ > method = MM Residuals : Min 1Q Median 3Q Max C o e f f i c i e n t s : Est imat e Std. E r r o r t v a l u e Pr(> t ) ( I n t e r c e p t ) s a l a r y P fatherwc e 05 s s t a t u s e 11 teachersc e 08 motherlev S i g n i f. c o d e s : 0 Äò Äô Äò Äô Äò Äô Äò. Äô 0. 1 Äò Äô 1 Robust r e s i d u a l s t a n d a r d e r r o r : M u l t i p l e R squared : , Adjusted R squared : Convergence i n 11 IRWLS i t e r a t i o n s Robustness weights : o b s e r v a t i o n 18 i s an o u t l i e r w i t h w e i g h t = 0 ( < ) ; The r e m a i n i n g 19 ones a r e summarized as Min. 1 st Qu. Median Mean 3 rd Qu. Max
23 Coleman Ctn d: A l g o r i t h m i c parameters : t u n i n g. c h i bb t u n i n g. p s i r e f i n e. t o l r e l. t o l e e e e e 0 s o l v e. t o l eps. o u t l i e r eps. x warn. l i m i t. r e j e c t warn. l i m i t. meanr e e e e e 0 nresample max. i t b e s t. r. s k. f a s t. s k. max maxit. s c a l t r a c e. l e v mts compute. rd f a s t. s. l a r g e. n p s i s u b s a m p l i n g cov compute. o u t l i e r. s t a t s b i s q u a r e n o n s i n g u l a r. vcov. a v a r 1 SM s e e d : i n t ( 0 )
24 Coleman Robustness weights m.lmrob$rweights Index
25 Coleman > summary (m. rlm ) C a l l : rlm ( f o r m u l a = Y., data = coleman ) Residuals : Min 1Q Median 3Q Max C o e f f i c i e n t s : Value Std. E r r o r t v a l u e ( I n t e r c e p t ) s a l a r y P fatherwc s s t a t u s t e a c h e r S c motherlev Residual standard e r r o r : on 14 degrees of freedom
26 Breastfeeding UK study on the decision of pregnant women to breastfeed. 135 expecting mothers asked on their feeding choice (breast= 1 if breastfeeding, try to breastfeed and mixed brest- and bottle-feeding, =0 if exclusive bottle-feeding). Covariates: advancement of the pregnancy (pregnancy, end or beginning), how mothers fed as babies (howfed, some breastfeeding or only bottle-feeding), how mother s friend fed their babies (howfedfriends, some breastfeeding or only bottle-feeding), if had a partner (partner, no or yes), age (age), age at which left full time education (educat), ethnic group (ethnic, white or non white) and if ever smoked (smokebf, no or yes) or if stopped smoking (smokenow, no or yes). The first listed level of each factor is used as the reference (coded 0).
27 Breastfeeding The sample characteristics are as follow: out of the 135 observations, 99 were from mothers that have decided at least to try to breastfeed, 54 mothers were at the beginning of their pregnancy, 77 were themselves breastfed as baby, 85 of the mother s friend had breastfed their babies, 114 mothers had a partner, median age was (with minimum equal 17 and maximum equal 40), median age at the end of education was 17 (minimum=14, maximum=38), 77 mothers were white and 32 mothers were smoking during the pregnancy, whereas 51 had smoked before.
28 Breastfeeding breast i Bernoulli(µ i ), so that E(breast i ) = µ i and Var(breast i ) = µ i (1 µ i ) (Binomial family). Use the logit link. logit(e(breast)) = logit(p(breast)) = = β 0 + β 1 pregnancy + β 2 howfed + β 3 howfedfr + β 4 partner + β 5 ethnic + β 6 smokebf + β 7 smokenow + β 8 age + β 9 educat, where logit(µ i ) = log( µ i 1 µ i ), with µ i /(1 µ i ) being the odds of a success, and µ i = P(breast) is the probability of at least try to breastfeed.
29 Breastfeeding > r e q u i r e ( r o b u s t b a s e ) > breast. glmrobwx=glmrob ( decwhat howfedfr+ethnic+educat+age+grp+howfed+partner+ smokenow+smokebf, family=binomial, weights. on. x= hat, tcc =1.5, data=breast ) > summary ( b r e a s t. glmrobwx ) C a l l : glmrob ( formula = decwhat howfedfr + ethnic + educat + age + grp + howfed + partner + smokenow + smokebf, family = binomial, data = breast, w e i g h t s. on. x = hat, t c c = 1. 5 ) C o e f f i c i e n t s : E s t i m a t e Std. E r r o r z v a l u e Pr(> z ) ( I n t e r c e p t ) h o w f e d f r B r e a s t ethnicnon w h i t e e d u c a t age g r p B e g i n n i n g h o w f e d B r e a s t p a r t n e r P a r t n e r smokenowyes smokebfyes S i g n i f. codes : R o b u s t n e s s w e i g h t s w. r w. x : Min. 1 st Qu. Median Mean 3 rd Qu. Max Number o f o b s e r v a t i o n s : 135 F i t t e d by method Mqle ( i n 12 i t e r a t i o n s ) ( Dispersion parameter f o r binomial family taken to be 1)
30 Breastfeeding Ctn d: No d e v i a n c e v a l u e s a v a i l a b l e A l g o r i t h m i c parameters : acc t c c maxit 50 t e s t. acc c o e f
31 Breastfeeding Robustness weights w(x) Index w(r) Index
32 Mailing list and conferences Dedicated mailing list: ICORS 2017 in Wollongong, Australia: ICORS 2016 in Geneva:
33 References [1] Jean-Jacques Droesbeke, Gilbert Saporta, and Christine Thomas-Agnan. Méthodes robustes en statistique. Editions TECHNIP, [2] Frank R. Hampel, Elvezio M. Ronchetti, Peter J. Rousseeuw, and Werner A. Stahel. Robust Statistics: The Approach Based on Influence Functions. Wiley, New York, [3] S. Heritier, E. Cantoni, S. Copt, and M.-P. Victoria-Feser. Robust Methods in Biostatistics. Wiley-Interscience, [4] Peter J. Huber. Robust Statistics. Wiley, New York, [5] Peter J. Huber and Elvezio M. Ronchetti. Robust Statistics. Wiley, New York, Second Edition. [6] R.A. Maronna, R.D. Martin, and V.J. Yohai. Robust statistics. Wiley New York, 2006.
Introduction to Robust Statistics. Elvezio Ronchetti. Department of Econometrics University of Geneva Switzerland.
Introduction to Robust Statistics Elvezio Ronchetti Department of Econometrics University of Geneva Switzerland Elvezio.Ronchetti@metri.unige.ch http://www.unige.ch/ses/metri/ronchetti/ 1 Outline Introduction
More informationDefinitions of ψ-functions Available in Robustbase
Definitions of ψ-functions Available in Robustbase Manuel Koller and Martin Mächler July 18, 2018 Contents 1 Monotone ψ-functions 2 1.1 Huber.......................................... 3 2 Redescenders
More informationIntroduction Robust regression Examples Conclusion. Robust regression. Jiří Franc
Robust regression Robust estimation of regression coefficients in linear regression model Jiří Franc Czech Technical University Faculty of Nuclear Sciences and Physical Engineering Department of Mathematics
More informationSmall Sample Corrections for LTS and MCD
myjournal manuscript No. (will be inserted by the editor) Small Sample Corrections for LTS and MCD G. Pison, S. Van Aelst, and G. Willems Department of Mathematics and Computer Science, Universitaire Instelling
More informationRobust Statistics with S (R) JWT. Robust Statistics Collaborative Package Development: robustbase. Robust Statistics with R the past II
Robust Statistics with S (R) JWT Robust Statistics Collaborative Package Development: robustbase Martin Maechler 1 Andreas Ruckstuhl 2 1 ETH Zurich 2 ZHW Winterthur Switzerland maechler@r-project.org,
More informationA NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL
Discussiones Mathematicae Probability and Statistics 36 206 43 5 doi:0.75/dmps.80 A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Tadeusz Bednarski Wroclaw University e-mail: t.bednarski@prawo.uni.wroc.pl
More informationROBUST TESTS BASED ON MINIMUM DENSITY POWER DIVERGENCE ESTIMATORS AND SADDLEPOINT APPROXIMATIONS
ROBUST TESTS BASED ON MINIMUM DENSITY POWER DIVERGENCE ESTIMATORS AND SADDLEPOINT APPROXIMATIONS AIDA TOMA The nonrobustness of classical tests for parametric models is a well known problem and various
More informationA Modified M-estimator for the Detection of Outliers
A Modified M-estimator for the Detection of Outliers Asad Ali Department of Statistics, University of Peshawar NWFP, Pakistan Email: asad_yousafzay@yahoo.com Muhammad F. Qadir Department of Statistics,
More informationRegression models. Generalized linear models in R. Normal regression models are not always appropriate. Generalized linear models. Examples.
Regression models Generalized linear models in R Dr Peter K Dunn http://www.usq.edu.au Department of Mathematics and Computing University of Southern Queensland ASC, July 00 The usual linear regression
More informationRobust negative binomial regression
Robust negative binomial regression Michel Amiguet, Alfio Marazzi, Marina S. Valdora, Víctor J. Yohai september 2016 Negative Binomial Regression Poisson regression is the standard method used to model
More informationrobmixglm: An R package for robust analysis using mixtures
robmixglm: An R package for robust analysis using mixtures 1 Introduction 1.1 Model Ken Beath Macquarie University Australia Package robmixglm implements the method of Beath (2017). This assumes that data
More informationRobust Statistics. Frank Klawonn
Robust Statistics Frank Klawonn f.klawonn@fh-wolfenbuettel.de, frank.klawonn@helmholtz-hzi.de Data Analysis and Pattern Recognition Lab Department of Computer Science University of Applied Sciences Braunschweig/Wolfenbüttel,
More informationLinear Regression Models P8111
Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started
More informationOPTIMAL B-ROBUST ESTIMATORS FOR THE PARAMETERS OF THE GENERALIZED HALF-NORMAL DISTRIBUTION
REVSTAT Statistical Journal Volume 15, Number 3, July 2017, 455 471 OPTIMAL B-ROBUST ESTIMATORS FOR THE PARAMETERS OF THE GENERALIZED HALF-NORMAL DISTRIBUTION Authors: Fatma Zehra Doğru Department of Econometrics,
More informationExam Applied Statistical Regression. Good Luck!
Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.
More informationClassification. Chapter Introduction. 6.2 The Bayes classifier
Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode
More informationLecture 12 Robust Estimation
Lecture 12 Robust Estimation Prof. Dr. Svetlozar Rachev Institute for Statistics and Mathematical Economics University of Karlsruhe Financial Econometrics, Summer Semester 2007 Copyright These lecture-notes
More informationComputing Robust Regression Estimators: Developments since Dutter (1977)
AUSTRIAN JOURNAL OF STATISTICS Volume 41 (2012), Number 1, 45 58 Computing Robust Regression Estimators: Developments since Dutter (1977) Moritz Gschwandtner and Peter Filzmoser Vienna University of Technology,
More information6. INFLUENTIAL CASES IN GENERALIZED LINEAR MODELS. The Generalized Linear Model
79 6. INFLUENTIAL CASES IN GENERALIZED LINEAR MODELS The generalized linear model (GLM) extends from the general linear model to accommodate dependent variables that are not normally distributed, including
More informationWEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION
WEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION Michael Amiguet 1, Alfio Marazzi 1, Victor Yohai 2 1 - University of Lausanne, Institute for Social and Preventive Medicine, Lausanne, Switzerland 2 - University
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More informationRobust model selection criteria for robust S and LT S estimators
Hacettepe Journal of Mathematics and Statistics Volume 45 (1) (2016), 153 164 Robust model selection criteria for robust S and LT S estimators Meral Çetin Abstract Outliers and multi-collinearity often
More informationGeneralized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model
Stat 3302 (Spring 2017) Peter F. Craigmile Simple linear logistic regression (part 1) [Dobson and Barnett, 2008, Sections 7.1 7.3] Generalized linear models for binary data Beetles dose-response example
More informationStatistics 203: Introduction to Regression and Analysis of Variance Course review
Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying
More informationROBUST ESTIMATION OF A CORRELATION COEFFICIENT: AN ATTEMPT OF SURVEY
ROBUST ESTIMATION OF A CORRELATION COEFFICIENT: AN ATTEMPT OF SURVEY G.L. Shevlyakov, P.O. Smirnov St. Petersburg State Polytechnic University St.Petersburg, RUSSIA E-mail: Georgy.Shevlyakov@gmail.com
More informationUsing R in 200D Luke Sonnet
Using R in 200D Luke Sonnet Contents Working with data frames 1 Working with variables........................................... 1 Analyzing data............................................... 3 Random
More informationSmall sample corrections for LTS and MCD
Metrika (2002) 55: 111 123 > Springer-Verlag 2002 Small sample corrections for LTS and MCD G. Pison, S. Van Aelst*, and G. Willems Department of Mathematics and Computer Science, Universitaire Instelling
More informationGeneralized linear models
Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models
More informationCOMPARING ROBUST REGRESSION LINES ASSOCIATED WITH TWO DEPENDENT GROUPS WHEN THERE IS HETEROSCEDASTICITY
COMPARING ROBUST REGRESSION LINES ASSOCIATED WITH TWO DEPENDENT GROUPS WHEN THERE IS HETEROSCEDASTICITY Rand R. Wilcox Dept of Psychology University of Southern California Florence Clark Division of Occupational
More informationLogistic Regression - problem 6.14
Logistic Regression - problem 6.14 Let x 1, x 2,, x m be given values of an input variable x and let Y 1,, Y m be independent binomial random variables whose distributions depend on the corresponding values
More informationA Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46
A Generalized Linear Model for Binomial Response Data Copyright c 2017 Dan Nettleton (Iowa State University) Statistics 510 1 / 46 Now suppose that instead of a Bernoulli response, we have a binomial response
More informationMidwest Big Data Summer School: Introduction to Statistics. Kris De Brabanter
Midwest Big Data Summer School: Introduction to Statistics Kris De Brabanter kbrabant@iastate.edu Iowa State University Department of Statistics Department of Computer Science June 20, 2016 1/27 Outline
More informationGeneralized Estimating Equations
Outline Review of Generalized Linear Models (GLM) Generalized Linear Model Exponential Family Components of GLM MLE for GLM, Iterative Weighted Least Squares Measuring Goodness of Fit - Deviance and Pearson
More information9 Generalized Linear Models
9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models
More information9. Robust regression
9. Robust regression Least squares regression........................................................ 2 Problems with LS regression..................................................... 3 Robust regression............................................................
More informationExperimental Design and Statistical Methods. Workshop LOGISTIC REGRESSION. Jesús Piedrafita Arilla.
Experimental Design and Statistical Methods Workshop LOGISTIC REGRESSION Jesús Piedrafita Arilla jesus.piedrafita@uab.cat Departament de Ciència Animal i dels Aliments Items Logistic regression model Logit
More informationMULTIVARIATE TECHNIQUES, ROBUSTNESS
MULTIVARIATE TECHNIQUES, ROBUSTNESS Mia Hubert Associate Professor, Department of Mathematics and L-STAT Katholieke Universiteit Leuven, Belgium mia.hubert@wis.kuleuven.be Peter J. Rousseeuw 1 Senior Researcher,
More informationGLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March Presented by: Tanya D. Havlicek, ACAS, MAAA ANTITRUST Notice The Casualty Actuarial Society is committed
More informationUNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator
UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages
More informationLogistic Regressions. Stat 430
Logistic Regressions Stat 430 Final Project Final Project is, again, team based You will decide on a project - only constraint is: you are supposed to use techniques for a solution that are related to
More informationBreakdown points of Cauchy regression-scale estimators
Breadown points of Cauchy regression-scale estimators Ivan Mizera University of Alberta 1 and Christine H. Müller Carl von Ossietzy University of Oldenburg Abstract. The lower bounds for the explosion
More informationA new robust model selection method in GLM with application to ecological data
Sakate and Kashid Environ Syst Res 2016 5:9 DOI 10.1186/s40068-016-0060-7 RESEARCH Open Access A new robust model selection method in GLM with application to ecological data D. M. Sakate * and D. N. Kashid
More informationGeneralized Linear Models. Kurt Hornik
Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general
More informationRobust Variable Selection Through MAVE
Robust Variable Selection Through MAVE Weixin Yao and Qin Wang Abstract Dimension reduction and variable selection play important roles in high dimensional data analysis. Wang and Yin (2008) proposed sparse
More informationMATH 644: Regression Analysis Methods
MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100
More informationCOMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION
(REFEREED RESEARCH) COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION Hakan S. Sazak 1, *, Hülya Yılmaz 2 1 Ege University, Department
More informationLeverage effects on Robust Regression Estimators
Leverage effects on Robust Regression Estimators David Adedia 1 Atinuke Adebanji 2 Simon Kojo Appiah 2 1. Department of Basic Sciences, School of Basic and Biomedical Sciences, University of Health and
More informationSpatial Variation in Infant Mortality with Geographically Weighted Poisson Regression (GWPR) Approach
Spatial Variation in Infant Mortality with Geographically Weighted Poisson Regression (GWPR) Approach Kristina Pestaria Sinaga, Manuntun Hutahaean 2, Petrus Gea 3 1, 2, 3 University of Sumatera Utara,
More informationLecture 5: LDA and Logistic Regression
Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant
More informationStat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject
More informationSCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models
SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION
More informationIntroduction and Single Predictor Regression. Correlation
Introduction and Single Predictor Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation
More informationGeneralized Additive Models
Generalized Additive Models The Model The GLM is: g( µ) = ß 0 + ß 1 x 1 + ß 2 x 2 +... + ß k x k The generalization to the GAM is: g(µ) = ß 0 + f 1 (x 1 ) + f 2 (x 2 ) +... + f k (x k ) where the functions
More informationPAijpam.eu M ESTIMATION, S ESTIMATION, AND MM ESTIMATION IN ROBUST REGRESSION
International Journal of Pure and Applied Mathematics Volume 91 No. 3 2014, 349-360 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu doi: http://dx.doi.org/10.12732/ijpam.v91i3.7
More informationLogistic Regression 21/05
Logistic Regression 21/05 Recall that we are trying to solve a classification problem in which features x i can be continuous or discrete (coded as 0/1) and the response y is discrete (0/1). Logistic regression
More informationStatistical Methods III Statistics 212. Problem Set 2 - Answer Key
Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423
More informationDetecting and Assessing Data Outliers and Leverage Points
Chapter 9 Detecting and Assessing Data Outliers and Leverage Points Section 9.1 Background Background Because OLS estimators arise due to the minimization of the sum of squared errors, large residuals
More informationIMPROVING THE SMALL-SAMPLE EFFICIENCY OF A ROBUST CORRELATION MATRIX: A NOTE
IMPROVING THE SMALL-SAMPLE EFFICIENCY OF A ROBUST CORRELATION MATRIX: A NOTE Eric Blankmeyer Department of Finance and Economics McCoy College of Business Administration Texas State University San Marcos
More informationVarious Issues in Fitting Contingency Tables
Various Issues in Fitting Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Complete Tables with Zero Entries In contingency tables, it is possible to have zero entries in a
More informationBiostatistics for physicists fall Correlation Linear regression Analysis of variance
Biostatistics for physicists fall 2015 Correlation Linear regression Analysis of variance Correlation Example: Antibody level on 38 newborns and their mothers There is a positive correlation in antibody
More informationRegression Analysis for Data Containing Outliers and High Leverage Points
Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain
More informationGARCH Models Estimation and Inference
GARCH Models Estimation and Inference Eduardo Rossi University of Pavia December 013 Rossi GARCH Financial Econometrics - 013 1 / 1 Likelihood function The procedure most often used in estimating θ 0 in
More informationLinear Methods for Prediction
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationSection 4.6 Simple Linear Regression
Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval
More informationIndian Statistical Institute
Indian Statistical Institute Introductory Computer programming Robust Regression methods with high breakdown point Author: Roll No: MD1701 February 24, 2018 Contents 1 Introduction 2 2 Criteria for evaluating
More informationA Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn
A Handbook of Statistical Analyses Using R Brian S. Everitt and Torsten Hothorn CHAPTER 6 Logistic Regression and Generalised Linear Models: Blood Screening, Women s Role in Society, and Colonic Polyps
More informationRegression so far... Lecture 21 - Logistic Regression. Odds. Recap of what you should know how to do... At this point we have covered: Sta102 / BME102
Background Regression so far... Lecture 21 - Sta102 / BME102 Colin Rundel November 18, 2014 At this point we have covered: Simple linear regression Relationship between numerical response and a numerical
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationGeneralized Linear Models. Last time: Background & motivation for moving beyond linear
Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of
More informationOutlier Robust Nonlinear Mixed Model Estimation
Outlier Robust Nonlinear Mixed Model Estimation 1 James D. Williams, 2 Jeffrey B. Birch and 3 Abdel-Salam G. Abdel-Salam 1 Business Analytics, Dow AgroSciences, 9330 Zionsville Rd. Indianapolis, IN 46268,
More informationMaximum Likelihood Conjoint Measurement in R
Maximum Likelihood Conjoint Measurement in R Kenneth Knoblauch¹, Blaise Tandeau de Marsac¹, Laurence T. Maloney² 1. Inserm, U846 Stem Cell and Brain Research Institute Dept. Integrative Neurosciences Bron,
More informationIntroduction to Linear Regression Rebecca C. Steorts September 15, 2015
Introduction to Linear Regression Rebecca C. Steorts September 15, 2015 Today (Re-)Introduction to linear models and the model space What is linear regression Basic properties of linear regression Using
More informationGeneralized Linear Models. stat 557 Heike Hofmann
Generalized Linear Models stat 557 Heike Hofmann Outline Intro to GLM Exponential Family Likelihood Equations GLM for Binomial Response Generalized Linear Models Three components: random, systematic, link
More informationFAMA-FRENCH 1992 REDUX WITH ROBUST STATISTICS
FAMA-FRENCH 1992 REDUX WITH ROBUST STATISTICS R. Douglas Martin* Professor Emeritus Applied Mathematics and Statistics University of Washington IAQF, NYC December 11, 2018** *Joint work with Christopher
More informationInvestigating Models with Two or Three Categories
Ronald H. Heck and Lynn N. Tabata 1 Investigating Models with Two or Three Categories For the past few weeks we have been working with discriminant analysis. Let s now see what the same sort of model might
More informationA SHORT COURSE ON ROBUST STATISTICS. David E. Tyler Rutgers The State University of New Jersey. Web-Site dtyler/shortcourse.
A SHORT COURSE ON ROBUST STATISTICS David E. Tyler Rutgers The State University of New Jersey Web-Site www.rci.rutgers.edu/ dtyler/shortcourse.pdf References Huber, P.J. (1981). Robust Statistics. Wiley,
More informationLinear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52
Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two
More informationLinear Regression With Special Variables
Linear Regression With Special Variables Junhui Qian December 21, 2014 Outline Standardized Scores Quadratic Terms Interaction Terms Binary Explanatory Variables Binary Choice Models Standardized Scores:
More informationDetecting outliers and/or leverage points: a robust two-stage procedure with bootstrap cut-off points
Detecting outliers and/or leverage points: a robust two-stage procedure with bootstrap cut-off points Ettore Marubini (1), Annalisa Orenti (1) Background: Identification and assessment of outliers, have
More informationIdentifying and accounting for outliers and extreme response patterns in latent variable modelling
Identifying and accounting for outliers and extreme response patterns in latent variable modelling Irini Moustaki Athens University of Economics and Business Outline 1. Define the problem of outliers and
More informationGeneral Regression Model
Scott S. Emerson, M.D., Ph.D. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA January 5, 2015 Abstract Regression analysis can be viewed as an extension of two sample statistical
More informationRegression Methods for Survey Data
Regression Methods for Survey Data Professor Ron Fricker! Naval Postgraduate School! Monterey, California! 3/26/13 Reading:! Lohr chapter 11! 1 Goals for this Lecture! Linear regression! Review of linear
More informationA Robust Strategy for Joint Data Reconciliation and Parameter Estimation
A Robust Strategy for Joint Data Reconciliation and Parameter Estimation Yen Yen Joe 1) 3), David Wang ), Chi Bun Ching 3), Arthur Tay 1), Weng Khuen Ho 1) and Jose Romagnoli ) * 1) Dept. of Electrical
More informationLecture 12: Effect modification, and confounding in logistic regression
Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression
More informationarxiv: v4 [math.st] 16 Nov 2015
Influence Analysis of Robust Wald-type Tests Abhik Ghosh 1, Abhijit Mandal 2, Nirian Martín 3, Leandro Pardo 3 1 Indian Statistical Institute, Kolkata, India 2 Univesity of Minnesota, Minneapolis, USA
More informationRobustness of location estimators under t- distributions: a literature review
IOP Conference Series: Earth and Environmental Science PAPER OPEN ACCESS Robustness of location estimators under t- distributions: a literature review o cite this article: C Sumarni et al 07 IOP Conf.
More informationSemiparametric Generalized Linear Models
Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student
More informationSTA102 Class Notes Chapter Logistic Regression
STA0 Class Notes Chapter 0 0. Logistic Regression We continue to study the relationship between a response variable and one or more eplanatory variables. For SLR and MLR (Chapters 8 and 9), our response
More informationLecture 18: Simple Linear Regression
Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics
More informationInference for Regression
Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu
More informationProjection Estimators for Generalized Linear Models
Projection Estimators for Generalized Linear Models Andrea Bergesio and Victor J. Yohai Andrea Bergesio is Teaching Assistant, Universidad Nacional del Litoral, IMAL, Güemes 3450, S3000GLN Santa Fe, Argentina
More informationTwo Hours. Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER. 26 May :00 16:00
Two Hours MATH38052 Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER GENERALISED LINEAR MODELS 26 May 2016 14:00 16:00 Answer ALL TWO questions in Section
More informationMixed-effects Maximum Likelihood Difference Scaling
Mixed-effects Maximum Likelihood Difference Scaling Kenneth Knoblauch Inserm U 846 Stem Cell and Brain Research Institute Dept. Integrative Neurosciences Bron, France Laurence T. Maloney Department of
More informationRegression Analysis By Example
Regression Analysis By Example Third Edition SAMPRIT CHATTERJEE New York University ALI S. HADI Cornell University BERTRAM PRICE Price Associates, Inc. A Wiley-Interscience Publication JOHN WILEY & SONS,
More informationClass Notes: Week 8. Probit versus Logit Link Functions and Count Data
Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While
More informationSelection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models
Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models Massimiliano Bratti & Alfonso Miranda In many fields of applied work researchers need to model an
More informationSTA6938-Logistic Regression Model
Dr. Ying Zhang STA6938-Logistic Regression Model Topic 2-Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of
More information