A CONNECTION BETWEEN LOCAL AND DELETION INFLUENCE
Sankhyā: The Indian Journal of Statistics, 2000, Volume 62, Series A, Pt. 1, pp.

By M. MERCEDES SUÁREZ RANCEL and MIGUEL A. GONZÁLEZ SIERRA
University of La Laguna, Spain

SUMMARY. In this paper we show that Hadi's (1992) approach to the overall potential influence of a subset of observations is closely related to a variant of Cook's (1986) absolute measure of local influence. This result indicates that the structure of the local influence concept is useful for identifying influential subsets, and provides a further justification for local influence analysis.

1. Introduction

The method of local influence was introduced by Cook (1986) and modified by Billor and Loynes (1993) as a general tool for assessing the influence of local departures from the assumptions underlying a statistical model. The work of several authors (Lawrance (1988), Peña and Yohai (1995)) indicates that one attraction of the local influence concept is that it assesses the effect of joint perturbations of the data cases more easily than global influence does. Consequently, local influence results are frequently free from the masking effects that cause difficulties for individual case-deletion methods.

This article shows that local-influence analysis of perturbations of the variance is similar to the usual regression diagnostic based on Hadi's (1992) measure for detecting influential subsets. Section 2 gives the general idea of local influence. Section 3 describes Hadi's measure. Section 4 establishes the connection between the two approaches. Section 5 provides illustrative examples.

Paper received March 1998; revised January.
AMS (1991) subject classification. 62J20.
Key words and phrases. Case deletion, Hadi's measure, local influence, masking, regression, swamping.
2. Local Influence

Consider the standard linear regression model

$$Y = X\beta + \epsilon, \tag{1}$$

where $\epsilon$ is an $n \times 1$ vector whose elements are assumed to be independent normal random variables with mean zero and known variance $\sigma^2$, $X$ is a known $n \times k$ matrix with full column rank, $\beta$ is a $k \times 1$ vector of parameters and $Y$ is an $n \times 1$ vector of response variables. The $i$-th observation $y_i$ on the response variable, together with the associated values of the explanatory variables, will be referred to as the $i$-th case.

Many measures have been suggested to assess the influence of observations in regression modeling. Chatterjee and Hadi (1986) gave an excellent review of this subject. Cook (1986) considered a general version of Cook's distance

$$D_i = \frac{\lVert \hat{Y} - \hat{Y}_{(i)} \rVert^2}{k\sigma^2}, \tag{2}$$

where $\hat{Y}$ and $\hat{Y}_{(i)}$ are the $n \times 1$ vectors of fitted values based on the full data and on the data without the $i$-th case, respectively, and $k$ is the dimension of $\beta$. He investigated

$$D_i(w) = \frac{\lVert \hat{Y} - \hat{Y}(w) \rVert^2}{k\sigma^2}, \tag{3}$$

where $\hat{Y}(w)$ is the vector of fitted values obtained when the $i$-th case has weight $w$ and the remaining cases have weight 1.

These ideas have been extended to general models. This extension is partially motivated by the following relationship between $D_i(w)$ and the log-likelihood $L(\beta)$ for model (1):

$$k D_i(w) = \frac{\lVert Y - \hat{Y}(w) \rVert^2 - \lVert Y - \hat{Y} \rVert^2}{\sigma^2} = 2\,[L(\hat\beta) - L(\hat\beta_w)],$$

where $\hat\beta = \hat\beta_{w=1}$ and $\hat\beta_w$ is the maximum likelihood estimator of $\beta$ when the $i$-th case has weight $w$. The form of this relationship is a consequence of the statistical structure assumed for the errors in model (1).

The log-likelihoods for the unperturbed and perturbed models are denoted by $L(\theta)$ and $L(\theta \mid w)$, respectively. The likelihood displacement $LD(w)$ is then defined by

$$LD(w) = 2\,[L(\hat\theta) - L(\hat\theta_w)], \tag{4}$$

where $\hat\theta$ and $\hat\theta_w$ are the maximum likelihood estimators of $\theta$ under the unperturbed and perturbed models, respectively. The pairs $(w, LD(w))$ form the surface of interest as $w$ varies over a certain space.

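The relationship between $D_i(w)$ and the log-likelihood stated above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names (`fitted_with_case_weight`, `cook_distance_w`, `log_likelihood`) and the simulated data are assumptions introduced only for illustration, and the error variance is treated as known, as in model (1).

```python
import numpy as np


def fitted_with_case_weight(X, y, i, w):
    """Fitted values X @ beta_w when case i has weight w and all others weight 1."""
    sqrt_w = np.ones(len(y))
    sqrt_w[i] = np.sqrt(w)
    # Weighted least squares, done as ordinary least squares on rescaled rows.
    beta_w, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return X @ beta_w


def cook_distance_w(X, y, i, w, sigma2):
    """D_i(w) from equation (3), with known error variance sigma2."""
    k = X.shape[1]
    y_hat = fitted_with_case_weight(X, y, i, 1.0)
    y_hat_w = fitted_with_case_weight(X, y, i, w)
    return np.sum((y_hat - y_hat_w) ** 2) / (k * sigma2)


def log_likelihood(y, fitted, sigma2):
    """Gaussian log-likelihood of model (1) evaluated at the given fitted values."""
    n = len(y)
    rss = np.sum((y - fitted) ** 2)
    return -0.5 * n * np.log(2 * np.pi * sigma2) - rss / (2 * sigma2)


# Hypothetical simulated data, used only to check the identity numerically.
rng = np.random.default_rng(0)
n, k, sigma2 = 30, 3, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=np.sqrt(sigma2), size=n)

i, w = 4, 0.5
lhs = k * cook_distance_w(X, y, i, w, sigma2)
rhs = 2 * (
    log_likelihood(y, fitted_with_case_weight(X, y, i, 1.0), sigma2)
    - log_likelihood(y, fitted_with_case_weight(X, y, i, w), sigma2)
)
print(np.isclose(lhs, rhs))
```

Setting `w = 0.0` in this sketch gives the case-deletion fit, so `cook_distance_w(X, y, i, 0.0, sigma2)` corresponds to $D_i$ in (2).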
The direction $h_{\max}$ of maximum curvature of the likelihood displacement surface at the postulated model (where $w = w_0$) indicates the greatest local sensitivity to perturbations. The direction of maximum curvature is used as the main diagnostic tool in the local influence method.

Billor and Loynes (1993) point out some practical and theoretical difficulties that arise in Cook's approach: computability of the maximum curvature is restricted to the linear regression model; the curvature is not invariant under reparametrisation of the perturbation scheme; and the parameters lack definition. To avoid these difficulties, Billor and Loynes (1993) suggest an alternative likelihood displacement:

$$LD^*(w) = 2\,[L(\hat\theta) - L(\hat\theta_w \mid w)], \tag{5}$$

where $L(\hat\theta_w \mid w)$ is the log-likelihood of the perturbed model, whereas Cook (1986) uses the perturbation only in the estimation of the parameters. Billor and Loynes (1993) suggest that the first derivative of $LD^*$ provides valuable information about the local behavior of $LD^*$, so they use the direction that produces the maximum increment in $LD^*$, with the following slope:

$$l_{\max} = \lVert \nabla LD^*(w_0) \rVert = 2\,\lVert \nabla L(\hat\theta_w \mid w) \rVert .$$

If we take the (perturbed) model

$$Y = X\beta + \epsilon, \tag{1a}$$

where $\operatorname{var}(\epsilon) = \sigma^2 W^{-1}$ with $W = \operatorname{diag}(1, 1, \ldots, 1 + w_i, 1, \ldots, 1)$, then

$$l_i = l_{\max,i} = \left(1 - \frac{e_i^2}{\sigma^2}\right). \tag{6}$$

3. Hadi's Influence Measure

Hadi (1992) proposes a measure for detecting influential subsets of observations that is resistant to masking and swamping effects. The measure is based on the simple fact that potentially influential observations are outliers in the X-space, the Y-space, or both, which yields the overall influence measure

$$H_i^2 = \frac{k}{1 - p_{ii}}\,\frac{d_i^2}{1 - d_i^2} + \frac{p_{ii}}{1 - p_{ii}}, \qquad i = 1, 2, \ldots, n, \tag{7}$$

where $d_i^2 = e_i^2 / e^{\top} e$ is the square of the $i$-th normalized residual and $p_{ii}$ is the $i$-th diagonal element of $P = X(X^{\top}X)^{-1}X^{\top}$. The diagnostic measure (7) is the sum of two components, each of which has a natural interpretation. A large value of the first term on the right-hand side of (7) indicates that the model fits poorly (a large prediction error), and a large value of the second term indicates the presence of an outlier in the X-space.
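The two components of measure (7) can be computed directly from the leverages and the normalized residuals. The sketch below is one possible implementation under stated assumptions, not code from the paper: the function name `hadi_measure` and the toy data (an exact line with a single shifted response) are hypothetical, and the leverages are obtained from a QR factorization rather than by forming $P$ explicitly.

```python
import numpy as np


def hadi_measure(X, y):
    """Return (H2, residual_term, leverage_term) for each case, following (7)."""
    k = X.shape[1]
    # Orthonormal basis Q of the column space of X, so that P = Q Q^T.
    Q, _ = np.linalg.qr(X)
    p = np.sum(Q ** 2, axis=1)       # leverages p_ii (diagonal of P)
    e = y - Q @ (Q.T @ y)            # ordinary least squares residuals
    d2 = e ** 2 / (e @ e)            # squared normalized residuals d_i^2
    residual_term = (k / (1 - p)) * d2 / (1 - d2)
    leverage_term = p / (1 - p)
    return residual_term + leverage_term, residual_term, leverage_term


# Hypothetical toy data: an exact line with the response at index 4 shifted.
x = np.arange(10, dtype=float)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 3.0 * x
y[4] += 10.0

H2, residual_term, leverage_term = hadi_measure(X, y)
print(int(np.argmax(H2)))
```

In this toy example the shifted case sits near the centre of the X-space, so its large value of $H_i^2$ comes almost entirely from the residual term; inspecting `residual_term` and `leverage_term` separately shows which component drives a flagged case.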
4. A Connection Between Local and Deletion Influence

In this section we show that local-influence analysis of perturbations of the variance is similar to the usual regression diagnostic based on Hadi's measure for detecting influential subsets. To see this, we propose a new measure based on the following likelihood displacement:

$$LD^*_{(i)}(w) = 2\,[L(\hat\theta) - L_{(i)}(\hat\theta_{w_i} \mid w_i)],$$

where $L_{(i)}(\hat\theta_{w_i} \mid w_i)$ is the log-likelihood under the perturbed model when the $i$-th observation is deleted. If we apply this likelihood displacement to the perturbed model (1a) we have

$$L_{(i)}(\hat\theta_{w_i} \mid w_i) = -\frac{1}{2}\ln 2\pi - \frac{1}{2}\ln\!\left[\frac{\hat\sigma^2_{(i)}}{1 + w_i}\right] - \frac{(1 + w_i)\,(y_i - x_i^{\top}\hat\beta_{(i)})^2}{2\hat\sigma^2_{(i)}} .$$

Then the slope of the maximum rate of increase in $LD^*_{(i)}(w_i)$ is given by

$$l_{i(i)} = l_{\max(i)} = 1 - \frac{e_{i(i)}^2}{e_{(i)}^{\top} e_{(i)}} = 1 - \frac{d_i^2}{(1 - p_{ii})(1 - d_i^2)(n - k - 1)}, \tag{8}$$

where $d_i^2 = e_i^2 / e^{\top} e$. To control the influence of the high-leverage observations in (8), we propose a quasi likelihood displacement:

$$LD^{**}_{(i)}(w_i) = 2\,[L(\hat\theta) - L_{(i)}(\hat\theta_{w_i} \mid w_i)] + [\operatorname{var}(\hat{y}_i) - \operatorname{var}(\hat{y}_{w_i})],$$

so that the slope of the maximum increment direction of $LD^{**}_{(i)}(w_i)$ is

$$l^*_{i(i)} = l^*_{\max,i(i)} = 1 - \frac{d_i^2}{(1 - p_{ii})(1 - d_i^2)(n - k - 1)} - \frac{\hat\sigma^2\, p_{ii}}{(1 - p_{ii})^2} .$$

Since $\sum p_{ii} = k$ while $\sum d_i^2 = 1$, multiplying the second term by $k\hat\sigma^2$ prevents the measure from being dominated by its third term:

$$l^*_{i(i)} = l^*_{\max,i(i)} = 1 - e^{\top} e \left[\frac{k\, d_i^2}{(1 - p_{ii})(1 - d_i^2)} + \frac{p_{ii}}{(1 - p_{ii})^2}\right]. \tag{9}$$

Thus measure (9) has an expression similar to Hadi's measure defined in (7), which indicates a relation between local and deletion diagnostics.

5. Example

5.1 Scottish Hill Races Data. As a first numerical illustration, consider the Scottish Hill Races data reported in Atkinson (1986) and Lawrance (1989). The data give the record times (in seconds) of 35 Scottish hill races in 1984, along with two explanatory variables: distance of the race (in miles) and climb (in feet). The following model is fitted to the data:

$$\text{Time} = \beta_0 + \beta_1\,\text{Distance} + \beta_2\,\text{Climb} + \epsilon. \tag{10}$$

The data contain two clear outliers, observation 18 with $r_i = 4.6$ and observation 7 with $r_i = 2.8$. For comparison purposes, we remove these two observations and refit model (10) to the remaining 33 races. As seen in Table 1, the two largest cases based on $H_i^2$ are the same as those based on $l^*_{i(i)}$, while Cook's measure ($C_i^2$) and the Billor and Loynes measure $l_i$ highlight different observations.

Table 1: New York rivers data: Two largest values based on $H_i^2$, $l^*_{i(i)}$, $C_i^2$ and $l_i$.
Case  $H_i^2$  Case  $l^*_{i(i)}$  Case  $C_i^2$  Case  $l_i$

5.2 House Price Data. Brant (1986) considers the house price data on the selling price of houses (Weisberg, 1985) to illustrate the masking effect. We have $n = 27$ and $k = 10$ in these data. We apply $H_i^2$, $l^*_{i(i)}$, $C_i^2$ and $l_i$ to these data and report the four largest values in Table 2. $H_i^2$, $l^*_{i(i)}$ and $C_i^2$ give the same results; $l_i$, however, shows a very different pattern.

Table 2: House Price Data: Four largest values based on $H_i^2$, $l^*_{i(i)}$, $C_i^2$ and $l_i$.
Case  $H_i^2$  Case  $l^*_{i(i)}$  Case  $C_i^2$  Case  $l_i$

Acknowledgments. The authors thank Ali S. Hadi and Robert Loynes for their helpful comments on an earlier version of this manuscript.

References

Atkinson, A.C. (1986). Comment: Aspects of diagnostic regression analysis. A comment on "Influential observations, high leverage points, and outliers in linear regression" by S. Chatterjee and A.S. Hadi. Statistical Science, 1 (3).
Billor, N. and Loynes, R.M. (1993). Local Influence: A New Approach. Comm. Statist. - Theory Meth., 22.
Brant, R. (1986). Finding and understanding influential sets in regression. Technical Report #466, School of Statistics, University of Minnesota.
Chatterjee, S. and Hadi, A.S. (1986). Influential observations, high leverage points, and outliers in linear regression. Statistical Science, 1 (3).
Cook, R.D. (1977). Detection of Influential Observations in Linear Regression. Technometrics, 19.
Cook, R.D. (1986). Assessment of Local Influence (with discussion). Jour. Royal Statist. Soc., Ser. B, 48.
Hadi, A.S. (1992). A New Measure of Overall Potential Influence in Linear Regression. Computational Statistics & Data Analysis, 14.
Lawrance, A.J. (1988). Regression Transformation Diagnostic Using Local Influence. Jour. Amer. Statist. Assoc., 83.
Lawrance, A.J. (1989). Local and deletion influence. Unpublished manuscript.
Peña, D. and Yohai, V.J. (1995). The Detection of Influential Subsets in Linear Regression by Using an Influence Matrix. Jour. Royal Statist. Soc., Ser. B, 57.
Weisberg, S. (1985). Applied Linear Regression, 2nd Ed. Wiley, New York.

M. Mercedes Suárez Rancel and Miguel A. González Sierra
Department of Statistics Research Operation and Computation
Faculty of Mathematics
University of La Laguna
Tenerife, Spain
msuarez@ull.es / magsierr@ull.es
Rochester Institute of Technology RIT Scholar Works Articles 1990 The Effect of a Single Point on Correlation and Slope David L. Farnsworth Rochester Institute of Technology This work is licensed under
More informationRestricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model
Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives
More informationA Simple Plot for Model Assessment
A Simple Plot for Model Assessment David J. Olive Southern Illinois University September 16, 2005 Abstract Regression is the study of the conditional distribution y x of the response y given the predictors
More informationDetection of single influential points in OLS regression model building
Analytica Chimica Acta 439 (2001) 169 191 Tutorial Detection of single influential points in OLS regression model building Milan Meloun a,,jiří Militký b a Department of Analytical Chemistry, Faculty of
More informationIntroduction to Estimation Methods for Time Series models. Lecture 1
Introduction to Estimation Methods for Time Series models Lecture 1 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 1 SNS Pisa 1 / 19 Estimation
More informationGeneral Linear Model: Statistical Inference
Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter 4), least
More informationLAD Regression for Detecting Outliers in Response and Explanatory Variables
journal of multivariate analysis 61, 144158 (1997) article no. MV971666 LAD Regression for Detecting Outliers in Response and Explanatory Variables Yadolah Dodge Statistics Group, University of Neucha^
More informationPOLI 8501 Introduction to Maximum Likelihood Estimation
POLI 8501 Introduction to Maximum Likelihood Estimation Maximum Likelihood Intuition Consider a model that looks like this: Y i N(µ, σ 2 ) So: E(Y ) = µ V ar(y ) = σ 2 Suppose you have some data on Y,
More informationDefault priors and model parametrization
1 / 16 Default priors and model parametrization Nancy Reid O-Bayes09, June 6, 2009 Don Fraser, Elisabeta Marras, Grace Yun-Yi 2 / 16 Well-calibrated priors model f (y; θ), F(y; θ); log-likelihood l(θ)
More informationDIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS
DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS Ivy Liu and Dong Q. Wang School of Mathematics, Statistics and Computer Science Victoria University of Wellington New Zealand Corresponding
More informationRegression Analysis II
Regression Analysis II Measures of Goodness of fit Two measures of Goodness of fit Measure of the absolute fit of the sample points to the sample regression line Standard error of the estimate An index
More informationSimple and Multiple Linear Regression
Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where
More informationInference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3
Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details Section 10.1, 2, 3 Basic components of regression setup Target of inference: linear dependency
More informationML and REML Variance Component Estimation. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 58
ML and REML Variance Component Estimation Copyright c 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 58 Suppose y = Xβ + ε, where ε N(0, Σ) for some positive definite, symmetric matrix Σ.
More informationON D-OPTIMAL DESIGNS FOR ESTIMATING SLOPE
Sankhyā : The Indian Journal of Statistics 999, Volume 6, Series B, Pt. 3, pp. 488 495 ON D-OPTIMAL DESIGNS FOR ESTIMATING SLOPE By S. HUDA and A.A. AL-SHIHA King Saud University, Riyadh, Saudi Arabia
More informationLecture One: A Quick Review/Overview on Regular Linear Regression Models
Lecture One: A Quick Review/Overview on Regular Linear Regression Models Outline The topics to be covered include: Model Specification Estimation(LS estimators and MLEs) Hypothesis Testing and Model Diagnostics
More informationJoint Probability Distributions
Joint Probability Distributions ST 370 In many random experiments, more than one quantity is measured, meaning that there is more than one random variable. Example: Cell phone flash unit A flash unit is
More informationEmpirical Bayes Estimation in Multiple Linear Regression with Multivariate Skew-Normal Distribution as Prior
Journal of Mathematical Extension Vol. 5, No. 2 (2, (2011, 37-50 Empirical Bayes Estimation in Multiple Linear Regression with Multivariate Skew-Normal Distribution as Prior M. Khounsiavash Islamic Azad
More informationStatistics II Exercises Chapter 5
Statistics II Exercises Chapter 5 1. Consider the four datasets provided in the transparencies for Chapter 5 (section 5.1) (a) Check that all four datasets generate exactly the same LS linear regression
More information