Regression diagnostics

1 Regression diagnostics

Kerby Shedden, Department of Statistics, University of Michigan
November 5

2 Motivation

When working with a linear model with design matrix X, the conventional linear model is based on the following conditions:

- E[Y|X] ∈ col(X), and
- var[Y|X] = σ²I.

Least squares point estimates depend on the first condition approximately holding. Least squares inferences depend on both of the above conditions approximately holding. Inferences for small sample sizes may also depend on the distribution of Y − E[Y|X] being approximately multivariate Gaussian, but for moderate or large sample sizes this condition is not critical.

Regression diagnostics for linear models are approaches for assessing how well a particular data set satisfies these two conditions.

3 Residuals

Linear models can be expressed in two equivalent ways:

- Focus only on moments: E[Y|X] ∈ col(X) and var[Y|X] = σ²I.
- Use a generative model, in this case an additive error model of the form y = Xβ + ε, where ε is random with E[ε|X] = 0 and cov[ε|X] = σ²I.

Since the residuals can be viewed as predictions of the errors, regression model diagnostics can often be developed using the residuals. Recall that the residuals can be expressed as R = (I − P)y, where P is the projection onto col(X).

4 Residuals

The residuals have two key mathematical properties, regardless of the correctness of the model specification:

- The residuals sum to zero, since (I − P)1 = 0 and hence 1'R = 1'(I − P)y = 0.
- The residuals and fitted values are orthogonal (they have zero sample covariance): ĉov(R, Ŷ|X) ∝ (R − R̄1)'Ŷ = R'Ŷ = y'(I − P)Py = 0.

These properties hold as long as an intercept is included in the model (so that P1 = 1, where 1 is a vector of 1's).

5 Residuals

If the basic linear model conditions hold, these two properties have population counterparts:

- The expected value of each residual is zero: E[R|X] = (I − P)E[Y|X] = 0 ∈ Rⁿ.
- The population covariance between any residual and any fitted value is zero: cov(R, Ŷ|X) = E[RŶ'] = (I − P)cov(Y|X)P = σ²(I − P)P = 0 ∈ Rⁿˣⁿ.

6 Residuals

If the model is correctly specified, there is a simple formula for the variances and covariances of the residuals:

    cov(R|X) = (I − P)E[yy'](I − P) = (I − P)(Xββ'X' + σ²I)(I − P) = σ²(I − P).

If the model is correctly specified, the standardized residuals

    (y_i − ŷ_i) / σ̂

and the Studentized residuals

    (y_i − ŷ_i) / (σ̂(1 − P_ii)^{1/2})

approximately have mean zero and variance one.
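
These quantities are easy to compute directly. A minimal sketch in plain numpy (the function name is mine; X is assumed to carry an explicit intercept column):

```python
import numpy as np

def internal_residuals(X, y):
    """Raw, standardized, and internally Studentized residuals."""
    n, q = X.shape                          # q = p + 1 columns, intercept included
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ beta                        # raw residuals
    # Leverages P_ii = diag(X (X'X)^{-1} X')
    P_diag = np.einsum('ij,ji->i', X, np.linalg.solve(X.T @ X, X.T))
    sigma2 = r @ r / (n - q)                # usual estimate of sigma^2
    return r, r / np.sqrt(sigma2), r / np.sqrt(sigma2 * (1 - P_diag))
```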

7 External standardization of residuals

Let σ̂²_(i) be the estimate of σ² obtained by fitting a regression model omitting the i-th case. It turns out that we can calculate this value without actually refitting the model:

    σ̂²_(i) = ((n − p − 1)σ̂² − r_i²/(1 − P_ii)) / (n − p − 2),

where r_i is the residual for the model fit to all data. The externally standardized residuals are

    (y_i − ŷ_i) / σ̂_(i),

and the externally Studentized residuals are

    (y_i − ŷ_i) / (σ̂_(i)(1 − P_ii)^{1/2}).
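
Since σ̂²_(i) has a closed form, external Studentization needs no refitting. A sketch under the same assumptions as above (the identity can be checked against a literal leave-one-out refit):

```python
import numpy as np

def externally_studentized(X, y):
    n, q = X.shape                          # q = p + 1
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ beta
    P_diag = np.einsum('ij,ji->i', X, np.linalg.solve(X.T @ X, X.T))
    sigma2 = r @ r / (n - q)
    # Leave-one-out variance estimates, via the identity on this slide
    sigma2_i = ((n - q) * sigma2 - r**2 / (1 - P_diag)) / (n - q - 1)
    return r / np.sqrt(sigma2_i * (1 - P_diag))
```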

8 Outliers and masking

In some settings, residuals can be used to identify outliers. However, in a small data set, a large outlier will inflate the value of σ̂, and hence may mask itself. Externally Studentized residuals solve the problem of a single large outlier masking itself, but masking may still occur if multiple large outliers are present.

9 Outliers and masking

If multiple large outliers may be present, we may use alternative estimates of the scale parameter σ:

- Interquartile range (IQR): the difference between the 75th percentile and the 25th percentile of the distribution or data. The IQR of the standard normal distribution is approximately 1.35, so IQR/1.35 can be used to estimate σ.
- Median absolute deviation (MAD): the median value of the absolute deviations from the median of the distribution or data, i.e. median(|Z − median(Z)|). The MAD of the standard normal distribution is approximately 0.675, so MAD/0.675 can be used to estimate σ.

These alternative estimates of σ can be used in place of the usual σ̂ for standardizing or Studentizing residuals.
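
A sketch of the two robust scale estimates, with the calibration constants given above (the function name is mine; z would typically be the residual vector):

```python
import numpy as np

def robust_sigma(z):
    """Estimate sigma by IQR and by MAD, calibrated to the standard normal."""
    iqr = np.percentile(z, 75) - np.percentile(z, 25)
    mad = np.median(np.abs(z - np.median(z)))
    return iqr / 1.35, mad / 0.675
```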

10 Leverage

Leverage is a measure of how strongly the data for case i determine the fitted value ŷ_i. Since ŷ = Py, and ŷ_i = Σ_j P_ij y_j, it is natural to define the leverage for case i as P_ii, where P is the projection matrix onto col(X).

This is related to the fact that the variance of the i-th residual is σ²(1 − P_ii). Since the residuals have mean zero, when P_ii is close to 1 the residual will likely be close to zero. This means that the fitted line will usually pass close to (x_i, y_i) if it is a high leverage point.

11 Leverage

In a simple linear regression, the fitted values are weighted combinations of the y_i:

    ŷ_k = Σ_i ((S² + n(x_i − x̄)(x_k − x̄)) / (nS²)) y_i,  where S² = Σ_j (x_j − x̄)².

[Figure: the coefficients P_ij plotted against x_j, for a specific value of i.]

12 Leverage

If we use basis functions, the coefficients in each row of P are much more local. [Figure: the coefficients P_ij plotted against x_j.]

13 Leverage

What is a big leverage? The average leverage is trace(P)/n = (p + 1)/n. If the leverage for a particular case is two or more times greater than the average leverage, it may be considered to have high leverage.

In simple linear regression, it is easy to show that

    var(y_i − α̂ − β̂x_i) = (n − 1)σ²/n − σ²(x_i − x̄)² / Σ_j (x_j − x̄)².

This implies that when p = 1,

    P_ii = 1/n + (x_i − x̄)² / Σ_j (x_j − x̄)².

14 Leverage

[Figure: a simple linear regression data set (Y against X), together with the corresponding leverage values plotted against X.]

15 Leverage

[Figure: leverage values in a linear regression with two independent variables, plotted over the (X_1, X_2) plane.]

16 Leverage

In general,

    P_ii = x_i'(X'X)^{-1} x_i = x_i'(X'X/n)^{-1} x_i / n,

where x_i is the i-th row of X (including the intercept). Let x̃_i be row i of X without the intercept, let μ_X be the sample mean of the x̃_i, and let Σ_X be the sample covariance matrix of the x̃_i (scaled by n rather than n − 1). It is a fact that

    x_i'(X'X/n)^{-1} x_i = (x̃_i − μ_X)'Σ_X^{-1}(x̃_i − μ_X) + 1,

and therefore

    P_ii = ((x̃_i − μ_X)'Σ_X^{-1}(x̃_i − μ_X) + 1) / n.

Note that this implies that P_ii ≥ 1/n.

17 Leverage

The expression (x̃_i − μ_X)'Σ_X^{-1}(x̃_i − μ_X) is the squared Mahalanobis distance between x̃_i and μ_X. Thus there is a direct relationship between the Mahalanobis distance of a point relative to the center of the covariate set and its leverage.
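
The identity on the previous slide is easy to verify numerically. A sketch with simulated data (all names are mine; plain numpy):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
Xt = rng.normal(size=(n, p))               # covariates, without the intercept
X = np.column_stack([np.ones(n), Xt])      # design matrix with intercept

# Leverages from the hat matrix
P_diag = np.einsum('ij,ji->i', X, np.linalg.solve(X.T @ X, X.T))

# Leverages from squared Mahalanobis distances, with the n-scaled covariance
mu = Xt.mean(axis=0)
Sigma = (Xt - mu).T @ (Xt - mu) / n
d2 = np.einsum('ij,ji->i', Xt - mu, np.linalg.solve(Sigma, (Xt - mu).T))
assert np.allclose(P_diag, (d2 + 1) / n)   # P_ii = (Mahalanobis^2 + 1) / n
```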

18 Influence

Influence measures the degree to which deletion of a case changes the fitted model. We will see that this is different from leverage: a high leverage point has the potential to be influential, but is not always influential.

The deleted slope for case i is the fitted slope vector obtained upon deleting case i. The following identity allows the deleted slopes to be calculated efficiently:

    β̂_(i) = β̂ − (r_i / (1 − P_ii)) (X'X)^{-1} x_i,

where r_i is the i-th residual and x_i is row i of the design matrix.
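
The identity gives all n deleted coefficient vectors from a single fit. A sketch (the function name is mine), with a commented check against a literal refit:

```python
import numpy as np

def deleted_coefs(X, y):
    """Row i is beta_(i), the coefficient vector with case i deleted."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    r = y - X @ beta
    P_diag = np.einsum('ij,ij->i', X @ XtX_inv, X)   # leverages
    # Row i of X @ XtX_inv is (X'X)^{-1} x_i
    return beta - (r / (1 - P_diag))[:, None] * (X @ XtX_inv)

# Check for case 0, given some design matrix X and response y:
#   b0 = np.linalg.lstsq(np.delete(X, 0, 0), np.delete(y, 0), rcond=None)[0]
#   assert np.allclose(deleted_coefs(X, y)[0], b0)
```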

19 Influence

The vector of all deleted fitted values ŷ_(i) is

    ŷ_(i) = Xβ̂_(i) = ŷ − (r_i / (1 − P_ii)) X(X'X)^{-1} x_i.

Influence can be measured by Cook's distance:

    D_i = (ŷ − ŷ_(i))'(ŷ − ŷ_(i)) / ((p + 1)σ̂²)
        = (r_i² / ((1 − P_ii)²(p + 1)σ̂²)) · x_i'(X'X)^{-1} x_i
        = P_ii (r_i^s)² / ((1 − P_ii)(p + 1)),

where r_i is the residual and r_i^s is the Studentized residual.

20 Influence

Cook's distance approximately captures the average squared change in fitted values due to deleting case i, in error variance units. Cook's distance is large only if both the leverage P_ii is high and the Studentized residual for the i-th case is large. As a general rule, D_i values from 1/2 to 1 are high, and values greater than 1 are considered to be very high.
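
Cook's distances follow directly from the leverages and Studentized residuals using the closed form above. A sketch (names are mine; plain numpy):

```python
import numpy as np

def cooks_distance(X, y):
    n, q = X.shape                          # q = p + 1
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ beta
    P_diag = np.einsum('ij,ji->i', X, np.linalg.solve(X.T @ X, X.T))
    sigma2 = r @ r / (n - q)
    rs2 = r**2 / (sigma2 * (1 - P_diag))    # squared Studentized residuals
    return P_diag * rs2 / ((1 - P_diag) * q)
```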

21 Influence

[Figure: Cook's distances in a simple linear regression, plotted against X.]

22 Influence

[Figure: Cook's distances in a linear regression with two variables, plotted over the (X_1, X_2) plane.]

23 Regression graphics

Quite a few graphical techniques have been proposed to aid in visualizing regression relationships. We will discuss the following plots:

1. Scatterplots of Y against individual X variables.
2. Scatterplots of X variables against each other.
3. Residuals versus fitted values plot.
4. Added variable plots.
5. Partial residual plots.
6. Residual quantile plots.

24 Scatterplots of Y against individual X variables

Simulation: E[Y|X] = X_1 − X_2 + X_3, var[Y|X] = 1, var(X_j) = 1, cor(X_j, X_k) = 0.3.

[Figure: scatterplots of Y against X_1, X_2, X_3, and against X_1 − X_2 + X_3.]

25 Scatterplots of X variables against each other

Simulation: E[Y|X] = X_1 − X_2 + X_3, var[Y|X] = 1, var(X_j) = 1, cor(X_j, X_k) = 0.3.

[Figure: pairwise scatterplots of X_1, X_2, and X_3.]

26 Residuals against fitted values plot

Simulation: E[Y|X] = X_1 − X_2 + X_3, var[Y|X] = 1, var(X_j) = 1, cor(X_j, X_k) = 0.3.

[Figure: residuals plotted against fitted values.]

27 Residuals against fitted values plots

Heteroscedastic errors: E[Y|X] = X_1 + X_3, var[Y|X] = 2 + X_1² + X_3², var(X_j) = 1, cor(X_j, X_k) = 0.3.

[Figure: residuals plotted against fitted values.]

28 Residuals against fitted values plots

Nonlinear mean structure: E[Y|X] = X_1², var[Y|X] = 1, var(X_j) = 1, cor(X_j, X_k) = 0.3.

[Figure: residuals plotted against fitted values.]

29 Added variable plots

Suppose P_(j) is the projection onto the span of all covariates except X_j, and define Ŷ_(j) = P_(j)Y and X̃_j = P_(j)X_j. The added variable plot for covariate j is a scatterplot of Y − Ŷ_(j) against X_j − X̃_j. The squared correlation coefficient of the points in the added variable plot is the partial R² for variable j. Added variable plots are also called partial regression plots.
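
A sketch of the added variable plot coordinates (the function name is mine; plain numpy). A useful fact: the slope of the least squares line through these points equals the coefficient of X_j in the full multiple regression.

```python
import numpy as np

def added_variable(X, y, j):
    """Coordinates for the added variable plot of column j of X."""
    X0 = np.delete(X, j, axis=1)                  # all covariates except X_j
    P0 = X0 @ np.linalg.solve(X0.T @ X0, X0.T)    # projection onto col(X0)
    return X[:, j] - P0 @ X[:, j], y - P0 @ y     # plot second against first
```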

30 Added variable plots

Simulation: E[Y|X] = X_1 − X_2 + X_3, var[Y|X] = 1, var(X_j) = 1, cor(X_j, X_k) = 0.3.

[Figure: added variable plots for X_1, X_2, and X_3.]

31 Partial residual plot

Suppose we fit the model, obtaining fitted values

    Ŷ_i = β̂'X_i = β̂_0 + β̂_1 X_{i1} + ··· + β̂_p X_{ip}.

The partial residual plot for covariate j is a plot of β̂_j X_{ij} + R_i against X_{ij}, where R_i is the residual. The partial residual plot attempts to show how covariate j is related to Y, if we control for the effects of all other covariates.
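
A sketch of the partial residual plot coordinates (the function name is mine; plain numpy):

```python
import numpy as np

def partial_residual(X, y, j):
    """Coordinates for the partial residual plot of column j of X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ beta
    return X[:, j], beta[j] * X[:, j] + r         # plot second against first
```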

32 Partial residual plot

Simulation: E[Y|X] = X_1², var[Y|X] = 1, var(X_j) = 1, cor(X_j, X_k) = 0.3.

[Figure: partial residual plots of β̂_j X_j + R against X_j, for j = 1, 2, 3.]

33 Residual quantile plots

Simulation: E[Y|X] = X_1², var[Y|X] = 1, var(X_j) = 1, cor(X_j, X_k) = 0.3, with t-distributed errors.

[Figure: standardized residual quantiles plotted against standard normal quantiles.]

34 Transformations

As noted above, the linear model imposes two main constraints on the population under study: the conditional mean function should be linear, and the conditional variance function should be constant. If it appears that E[Y|X = x] is not linear in x, or that var[Y|X = x] is not constant in x, it may be possible to continuously transform either y or x so that the linear model becomes more consistent with the data.

35 Variance stabilizing transformations

Many populations encountered in practice exhibit a mean/variance relationship, in which E[Y_i] and var[Y_i] are related. Suppose that var[Y_i] = g(E[Y_i])σ², and let f(·) be a transform to be applied to the Y_i. The goal is to find a transform such that the variances of the transformed responses are constant. Using a Taylor expansion,

    f(Y_i) ≈ f(E[Y_i]) + f'(E[Y_i])(Y_i − E[Y_i]).

36 Variance stabilizing transformations

Therefore

    var[f(Y_i)] ≈ f'(E[Y_i])² var[Y_i] = f'(E[Y_i])² g(E[Y_i])σ².

The goal is to find f such that f' = 1/√g.

Example: Suppose g(z) = z^λ. This includes the Poisson regression case λ = 1, where the variance is proportional to the mean, and the case λ = 2, where the standard deviation is proportional to the mean. When λ = 1, f solves f'(z) = 1/√z, so f is the square root function. When λ = 2, f solves f'(z) = 1/z, so f is the logarithm function.
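
A quick numerical check of the λ = 1 (Poisson) case: on the raw scale the variance grows with the mean, while on the square root scale it is roughly constant (about 1/4, once the mean is not too small). A sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
for mean in [2.0, 10.0, 50.0]:
    y = rng.poisson(mean, size=100_000)
    # Raw variance tracks the mean; sqrt-scale variance stays near 0.25
    print(mean, y.var(), np.sqrt(y).var())
```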

37 Log/log regression

Suppose we fit a simple linear regression of the form

    E[log(Y)|log(X)] = α + β log(X).

Then

    E[log(Y)|log(X) = x + 1] − E[log(Y)|log(X) = x] = β.

Using the crude approximation log E[Y|X] ≈ E[log(Y)|X], we conclude that E[Y|X] is approximately scaled by a factor of e^β when X is scaled by a factor of e. Thus in a log/log model, we may say that an f% change in X is approximately associated with an fβ% change in the expected response.
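
A sketch of fitting the log/log model (the function name is mine; x and y are assumed positive):

```python
import numpy as np

def loglog_fit(x, y):
    """Regress log(y) on log(x); beta is roughly the % change in E[Y]
    associated with a 1% change in X."""
    A = np.column_stack([np.ones(len(x)), np.log(x)])
    alpha, beta = np.linalg.lstsq(A, np.log(y), rcond=None)[0]
    return alpha, beta
```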

38 Maximum likelihood estimation of a data transformation

The Box-Cox family of transforms is

    y → (y^λ − 1) / λ,

which makes sense only when all Y_i are positive. The Box-Cox family includes the identity (λ = 1), all power transformations such as the square root (λ = 1/2) and the reciprocal (λ = −1), and the logarithm in the limiting case λ → 0.

39 Maximum likelihood estimation of a data transformation

Suppose we assume that for some value of λ, the transformed data follow a linear model with Gaussian errors. We can then set out to estimate λ. The joint log-likelihood of the transformed data is

    −(n/2) log(2π) − n log σ − (1/(2σ²)) Σ_i (Y_i^(λ) − X_i'β)².

Next we transform this back to a likelihood in terms of Y_i = g_λ^{-1}(Y_i^(λ)). This joint log-likelihood is

    −(n/2) log(2π) − n log σ − (1/(2σ²)) Σ_i (g_λ(Y_i) − X_i'β)² + Σ_i log J_i,

where the Jacobian is

    log J_i = log g_λ'(Y_i) = (λ − 1) log Y_i.

40 Maximum likelihood estimation of a data transformation

The joint log-likelihood for the Y_i is

    −(n/2) log(2π) − n log σ − (1/(2σ²)) Σ_i (g_λ(Y_i) − X_i'β)² + (λ − 1) Σ_i log Y_i.

This likelihood is maximized with respect to λ, β, and σ to identify the MLE.

41 Maximum likelihood estimation of a data transformation

To do the maximization, let Y^(λ) = g_λ(Y) denote the transformed observed responses, and let Ŷ^(λ) denote the fitted values from regressing Y^(λ) on X. Since σ does not appear in the Jacobian,

    σ̂²_λ = n^{-1} ‖Y^(λ) − Ŷ^(λ)‖²

will be the maximizing value of σ². Therefore the MLEs of β and λ will maximize

    −n log σ̂_λ + (λ − 1) Σ_i log Y_i.
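
A sketch of the resulting profile likelihood search over a grid of λ values (all names are mine; plain numpy):

```python
import numpy as np

def boxcox_profile(X, y, lambdas):
    """Profile log-likelihood in lambda; maximize over the grid to estimate it."""
    logy_sum = np.log(y).sum()
    out = []
    for lam in lambdas:
        # Box-Cox transform, with the log as the lambda -> 0 limit
        z = np.log(y) if abs(lam) < 1e-8 else (y**lam - 1) / lam
        zhat = X @ np.linalg.lstsq(X, z, rcond=None)[0]
        sigma2 = np.mean((z - zhat)**2)                 # sigma_hat_lambda^2
        out.append(-0.5 * len(y) * np.log(sigma2) + (lam - 1) * logy_sum)
    return np.array(out)

# lam_hat = lambdas[np.argmax(boxcox_profile(X, y, lambdas))]
```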

42 Collinearity Diagnostics

Collinearity inflates the sampling variances of covariate effect estimates. To understand the effect of collinearity on var[β̂_j|X], reorder the columns and partition the design matrix X as

    X = ( X_j  X_0 ) = ( (X_j − X̃_j) + X̃_j   X_0 ),

where X_0 is the n × p matrix consisting of all columns in X except X_j, and X̃_j is the projection of X_j onto col(X_0). Therefore

    H ≡ X'X = ( X_j'X_j   X_j'X_0 )
              ( X_0'X_j   X_0'X_0 ).

Since var(β̂_j) = σ² H^{-1}_{11}, we want a simple expression for H^{-1}_{11}.

43 Collinearity Diagnostics

A symmetric block matrix can be inverted using

    ( A   B )^{-1}   ( S^{-1}            −S^{-1}BC^{-1}                 )
    ( B'  C )      = ( −C^{-1}B'S^{-1}   C^{-1} + C^{-1}B'S^{-1}BC^{-1} ),

where S = A − BC^{-1}B'. Therefore

    H^{-1}_{1,1} = 1 / (X_j'X_j − X_j'P_0 X_j),

where P_0 = X_0(X_0'X_0)^{-1}X_0' is the projection matrix onto col(X_0).

44 Collinearity Diagnostics

Since X̃_j ∈ col(X_0), we have X_j'P_0 X_j = X̃_j'X̃_j, so

    H^{-1}_{1,1} = 1 / (X_j'X_j − X̃_j'X̃_j),

and since (X_j − X̃_j)'X̃_j = 0, it follows that

    ‖X_j‖² = ‖(X_j − X̃_j) + X̃_j‖² = ‖X_j − X̃_j‖² + ‖X̃_j‖²,

so

    H^{-1}_{1,1} = 1 / ‖X_j − X̃_j‖².

This makes sense, since smaller values of ‖X_j − X̃_j‖² correspond to greater collinearity.

45 Collinearity Diagnostics

Let R²_{jX} be the coefficient of determination (multiple R²) for the regression of X_j on the other covariates (taking X_j to be centered):

    R²_{jX} = 1 − (X_j − X̃_j)'(X_j − X̃_j) / X_j'X_j = 1 − ‖X_j − X̃_j‖² / ‖X_j‖².

Combining the two equations yields

    H^{-1}_{11} = (1 / X_j'X_j) · (1 / (1 − R²_{jX})).

46 Collinearity Diagnostics

The two factors in the expression

    H^{-1}_{11} = (1 / X_j'X_j) · (1 / (1 − R²_{jX}))

reflect two different sources of variance of β̂_j:

- 1/X_j'X_j = 1/((n − 1) var̂(X_j)) reflects the scaling of X_j.
- The variance inflation factor (VIF) 1/(1 − R²_{jX}) is scale-free. It is always greater than or equal to 1, and is equal to 1 only if X_j is orthogonal to the other covariates. Large values of the VIF indicate that parameter estimation is strongly affected by collinearity.
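
A sketch computing the VIF for each covariate by running the auxiliary regressions directly (the function name is mine; the first column of X is assumed to be the intercept):

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each non-intercept column of X."""
    out = []
    for j in range(1, X.shape[1]):
        X0 = np.delete(X, j, axis=1)          # all other columns, incl. intercept
        xj = X[:, j]
        fit = X0 @ np.linalg.lstsq(X0, xj, rcond=None)[0]
        r2 = 1 - np.sum((xj - fit)**2) / np.sum((xj - xj.mean())**2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)
```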
