Contents. 1 Review of Residuals. 2 Detecting Outliers. 3 Influential Observations. 4 Multicollinearity and its Effects


1 Contents

1 Review of Residuals
2 Detecting Outliers
3 Influential Observations
4 Multicollinearity and its Effects

W. Zhou (Colorado State University), STAT 540, July 6th

2 Model Diagnostics: An Overview

Basic diagnostics, review
Model adequacy for a predictor variable: added-variable plots
Outlying Y observations and studentized/deleted residuals
Outlying X observations and the hat matrix/leverage values
Influential cases
Multicollinearity diagnostics and the variance inflation factor

3 Model Assumptions

Recall the multiple linear regression model: for i = 1, ..., n,

    Y_i = β_0 + Σ_{j=1}^{p-1} β_j X_ij + ɛ_i,   ɛ_i iid N(0, σ²).

Relationship between Y and X: E(Y_i) = β_0 + Σ_{j=1}^{p-1} β_j X_ij.
Homogeneous variance: Var(Y_i) = Var(ɛ_i) = σ².
Independence: Cov(ɛ_i, ɛ_j) = Cov(Y_i, Y_j) = 0 for i ≠ j.
Normal distribution: Y_i ~ N(β_0 + Σ_{j=1}^{p-1} β_j X_ij, σ²).

4 Basic Diagnostics

Exploratory data analysis
Same as before: scatterplots, boxplots, histograms, summaries
New: scatterplot matrices, split boxplots, brush/spin, coplots

Linearity, homoscedasticity, normality
Same as before: (externally studentized) residuals vs. each X, vs. Ŷ, and vs. time (also note: ACF plot); QQ plot
Tests: e.g., F test for lack of fit, Breusch-Pagan, etc. (see Chapter 6.8, KNNL)

Outliers, influence, and correlated predictors
Major focus of this set of notes

5 Outline

1 Review of Residuals
2 Detecting Outliers
    Outlying Response
    Outlying Predictor
3 Influential Observations
4 Multicollinearity and its Effects

6 Residuals Review

Recall that the residuals e = (e_1, ..., e_n)^T = Y - Ŷ = (I - H)Y, where H is the hat/projection matrix.
The residuals sum to zero when the model includes an intercept: e^T 1 = 0, so their mean is 0.
The variance-covariance matrix of the residuals is Var{e} = σ²(I - H), and it is estimated by s²{e} = MSE (I - H).

7 Residuals Review

Denote H = [h_ij], i, j = 1, ..., n. Then the variance of e_i is

    Var{e_i} = σ²(1 - h_ii), estimated by s²{e_i} = MSE (1 - h_ii).

The covariance of e_i and e_j (i ≠ j) is

    Cov{e_i, e_j} = σ²(0 - h_ij) = -σ² h_ij, estimated by s{e_i, e_j} = -MSE h_ij.

8 Studentized Residuals Review

The variance of e_i is not constant, and the covariance of e_i and e_j is not zero.
An observation whose residual is large relative to its standard deviation may be outlying. To compare the n residuals, standardize them so that they are on the same scale.
Studentized residuals (a.k.a. internally studentized) are defined as

    r_i = e_i / s{e_i} = e_i / √(MSE (1 - h_ii)).

If the model is appropriate, the studentized residuals {r_i} have constant variance, while the ordinary residuals {e_i} do not.
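As a concrete illustration, the studentized residuals can be computed by hand for simple linear regression (p = 2), where the leverage has the closed form h_ii = 1/n + (x_i - x̄)²/Σ(x_j - x̄)² used later in these notes. A minimal pure-Python sketch on made-up data:

```python
# Internally studentized residuals for simple linear regression,
# using the closed-form leverage h_ii = 1/n + (x_i - xbar)^2 / Sxx.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]   # illustrative data (made up)
y = [1.2, 1.9, 3.2, 4.1, 4.9, 12.0]
n = len(x)
xbar, ybar = sum(x)/n, sum(y)/n
Sxx = sum((xi - xbar)**2 for xi in x)
Sxy = sum((xi - xbar)*(yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy/Sxx
b0 = ybar - b1*xbar                                  # least-squares fit
e = [yi - (b0 + b1*xi) for xi, yi in zip(x, y)]      # ordinary residuals
MSE = sum(ei**2 for ei in e)/(n - 2)                 # p = 2 here
h = [1/n + (xi - xbar)**2/Sxx for xi in x]           # leverages
r = [ei/(MSE*(1 - hi))**0.5 for ei, hi in zip(e, h)] # studentized residuals
for xi, ei, ri in zip(x, e, r):
    print(f"x={xi:5.1f}  e={ei:+.3f}  r={ri:+.3f}")
```

Note that the r_i divide each e_i by its own estimated standard deviation, so points with high leverage (small 1 - h_ii) get their residuals inflated onto the common scale.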

9 Deleted Residuals Review

Influence: if the ith point is highly influential, it can pull the fitted response surface strongly toward itself. This masks the point's influence.
Strategy: define the residual for the ith point as the prediction error for that point using the model fit to the data omitting that point.
Deleted residuals are defined as d_i = Y_i - Ŷ_{i(i)}.
It can be shown that

    d_i = e_i / (1 - h_ii) = (Y_i - Ŷ_i) / (1 - h_ii).

10 Deleted Residuals Review

Let X_i = (1, X_i1, ..., X_{i,p-1}) (a row vector). Let X_{(i)} and MSE_{(i)} denote the design matrix and the MSE with the ith row (observation) deleted.
Recalling that s²{pred} = MSE (1 + X_h (X^T X)^{-1} X_h^T), one can show

    s²{d_i} = MSE_{(i)} (1 + X_i (X_{(i)}^T X_{(i)})^{-1} X_i^T) = MSE_{(i)} / (1 - h_ii).

11 Studentized Deleted Residuals Review

The studentized deleted residuals (a.k.a. externally studentized) are defined, for i = 1, ..., n, as

    t_i = d_i / s{d_i} = e_i / √(MSE_{(i)} (1 - h_ii)).

Note that (n - p) MSE = (n - p - 1) MSE_{(i)} + e_i²/(1 - h_ii), so

    t_i = e_i √[ (n - p - 1) / (SSE (1 - h_ii) - e_i²) ],

and there is no need to fit n separate regressions.
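The identity above means every t_i comes from a single fit. A quick pure-Python check for simple linear regression (data made up for illustration): compute t_i by literally deleting each point and refitting, and compare with the one-fit shortcut.

```python
# Studentized deleted residuals two ways: literal leave-one-out refits
# vs. the shortcut t_i = e_i * sqrt((n-p-1)/(SSE(1-h_ii) - e_i^2)).
def ols(xs, ys):
    n = len(xs)
    xb, yb = sum(xs)/n, sum(ys)/n
    sxx = sum((v - xb)**2 for v in xs)
    b1 = sum((v - xb)*(w - yb) for v, w in zip(xs, ys))/sxx
    return yb - b1*xb, b1                 # (intercept, slope)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.2, 1.9, 3.2, 4.1, 4.9, 12.0]
n, p = len(x), 2
b0, b1 = ols(x, y)
e = [yi - (b0 + b1*xi) for xi, yi in zip(x, y)]
SSE = sum(ei**2 for ei in e)
xbar = sum(x)/n
Sxx = sum((xi - xbar)**2 for xi in x)
h = [1/n + (xi - xbar)**2/Sxx for xi in x]

t_direct, t_short = [], []
for i in range(n):
    xd, yd = x[:i] + x[i+1:], y[:i] + y[i+1:]
    a0, a1 = ols(xd, yd)                  # fit without point i
    d = y[i] - (a0 + a1*x[i])             # deleted residual d_i
    mse_del = sum((w - (a0 + a1*v))**2 for v, w in zip(xd, yd))/(n - 1 - p)
    t_direct.append(d/(mse_del/(1 - h[i]))**0.5)   # t_i = d_i / s{d_i}
    t_short.append(e[i]*((n - p - 1)/(SSE*(1 - h[i]) - e[i]**2))**0.5)

print(max(abs(a - b) for a, b in zip(t_direct, t_short)))
```

The two columns agree to floating-point precision, which is the point of the algebraic identity on this slide.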

12 Outlying Y Observations

Outlying observations are well separated from the remainder of the data. Consider three types:
1 Outlying in Y|X but not in X: usually not influential.
2 Outlying in X but not in Y|X: usually not influential.
3 Outlying in both X and Y|X: can be very influential.
Goal: identify outlying and influential observations. The task is relatively straightforward with 1-2 predictor variables but becomes more challenging with more than 2 (cf. hidden extrapolation, below).
Basic idea: outlying observations may have large residuals and often have a large impact on the model fit.

13 Identifying Outlying Y Observations

Basic idea: the ith observation is outlying in Y if |t_i| is large.
Under H_0: observation i is not outlying in Y,

    t_i = d_i / s{d_i} ~ t_{n-p-1}.

The decision rule needs a Bonferroni adjustment. Why? Because n comparisons are made: declare observation i outlying if |t_i| > t(1 - α/(2n); n - p - 1).
For most n and p, this cutoff at the α = 5% level is greater than 3. In practice, if |t_i| > 3, observation i is a possible outlier.

14 Hat Matrix and Leverages

Basic idea: use the hat matrix to identify outliers in X.
Recall that H = [h_ij], i, j = 1, ..., n, with h_ii = X_i (X^T X)^{-1} X_i^T. The diagonal elements h_ii are called leverages.
Properties of the leverages h_ii:
1 0 ≤ h_ii ≤ 1 (can you show this?)
2 Σ_{i=1}^n h_ii = p, so the average leverage is h̄ = p/n (show it).
3 h_ii measures the distance between the X values of the ith observation and the means of the X values over all n observations: h_ii = 1/n + (x_i - x̄)^T (X_c^T X_c)^{-1} (x_i - x̄), where X_c is the centered design matrix (show it).
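For a single predictor, property 3 reduces to h_ii = 1/n + (x_i - x̄)²/Σ(x_j - x̄)², which makes properties 1-3 and the 2p/n rule from the next slide easy to verify numerically. A pure-Python sketch on made-up data with one far-out x value:

```python
# Leverages for simple linear regression (p = 2) via the closed form,
# checking sum(h) = p and flagging points with h_ii > 2p/n.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]   # the last x is far from the others
n, p = len(x), 2
xbar = sum(x)/n
Sxx = sum((xi - xbar)**2 for xi in x)
h = [1/n + (xi - xbar)**2/Sxx for xi in x]

print("sum of leverages:", sum(h))     # equals p
cut = 2*p/n
for xi, hi in zip(x, h):
    flag = "outlying in X" if hi > cut else ""
    print(f"x={xi:5.1f}  h={hi:.3f}  {flag}")
```

Only the far-out point exceeds 2p/n here; it also falls in the "high leverage" band h_ii ≥ 0.5.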

15 Identifying Outlying X Observations

Effects of hat values: if the ith data point is outlying in X with a high leverage h_ii, it can influence the fitted response Ŷ_i.
A higher leverage h_ii gives Y_i more weight in determining Ŷ_i (as Ŷ = HY).
A higher leverage h_ii results in a smaller s{e_i}, since Ŷ_i is pulled closer to Y_i.
(There are connections to nonparametric smoothing.)
What is a bad hat value?
1 If h_ii > 2p/n, observation i is considered outlying in X.
2 Moderate leverage if h_ii ∈ [0.2, 0.5) and high leverage if h_ii ∈ [0.5, 1].
3 Draw a histogram, stem-and-leaf, or other plot of the h_ii. Outlying observations tend to have large values, and there tends to be a gap between the outlying group and the other leverage values.

16 Hidden Extrapolation

H can be used to identify hidden extrapolation when p is large.
It is possible for a point X_new to have each component X_new,i (i = 1, ..., p) within the corresponding marginal range of X, yet for the p-dimensional point X_new to lie outside the support region of the empirical joint distribution of X.
This can be very difficult to detect, especially if no 2-way scatterplot or 3-way brush/spin reveals it.
Consider h_new,new = X_new (X^T X)^{-1} X_new^T. If h_new,new ≤ max_i h_ii, then it is fine to make predictions at X_new.
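With a single predictor, extrapolation is visible directly, so nothing is "hidden"; still, the one-predictor case shows the h_new,new computation in its simplest form, where h_new = 1/n + (x_new - x̄)²/Sxx. A pure-Python sketch on made-up data:

```python
# The h_{new,new} <= max_i h_ii check for extrapolation, illustrated
# with p = 2 (one predictor), where h_new = 1/n + (x_new - xbar)^2/Sxx.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]   # made-up predictor values
n = len(x)
xbar = sum(x)/n
Sxx = sum((xi - xbar)**2 for xi in x)
h = [1/n + (xi - xbar)**2/Sxx for xi in x]

def h_new(x_new):
    return 1/n + (x_new - xbar)**2/Sxx

for x_new in (7.0, 14.0):
    ok = h_new(x_new) <= max(h)
    print(f"x_new={x_new}: h_new={h_new(x_new):.3f}  "
          f"{'prediction OK' if ok else 'extrapolation!'}")
```

x_new = 7 sits inside the leverage range of the data; x_new = 14 lies outside it, so prediction there would be extrapolation.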

17 Outline

1 Review of Residuals
2 Detecting Outliers
    Outlying Response
    Outlying Predictor
3 Influential Observations
4 Multicollinearity and its Effects

18 Identifying Influential Observations

An observation is influential if its deletion leads to major changes in the fitted regression. Not all outlying observations are influential.
Main idea: a leave-one-out approach, as with the deleted residuals. Consider 3 measures:
1 DFFITS
2 Cook's distance
3 DFBETAS
No diagnostic identifies all possible problems. For example, leave-one-out methods do not address multiple influential observations acting together. More sophisticated methods are possible: bootstrap, methods for high-dimensional situations.

19 DFFITS

DFFITS measures the effect of the ith case on the fitted value Ŷ_i:

    DFFITS_i = (Ŷ_i - Ŷ_{i(i)}) / √(MSE_{(i)} h_ii),

and we can show that DFFITS_i = t_i √(h_ii / (1 - h_ii)), where t_i is the ith studentized deleted residual.
For small to medium data sets, |DFFITS_i| > 1 suggests that the ith observation may be influential.
For large data sets, |DFFITS_i| > 2√(p/n) suggests that the ith observation may be influential.
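The identity DFFITS_i = t_i √(h_ii/(1 - h_ii)) can be checked against the direct leave-one-out definition. A pure-Python sketch for simple linear regression on made-up data:

```python
# DFFITS two ways: the direct leave-one-out definition vs. the identity
# DFFITS_i = t_i * sqrt(h_ii / (1 - h_ii)). Simple regression, p = 2.
def ols(xs, ys):
    n = len(xs)
    xb, yb = sum(xs)/n, sum(ys)/n
    sxx = sum((v - xb)**2 for v in xs)
    b1 = sum((v - xb)*(w - yb) for v, w in zip(xs, ys))/sxx
    return yb - b1*xb, b1

x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]   # made-up data
y = [1.2, 1.9, 3.2, 4.1, 4.9, 12.0]
n, p = len(x), 2
b0, b1 = ols(x, y)
e = [yi - (b0 + b1*xi) for xi, yi in zip(x, y)]
xbar = sum(x)/n
Sxx = sum((xi - xbar)**2 for xi in x)
h = [1/n + (xi - xbar)**2/Sxx for xi in x]

dffits_direct, dffits_id = [], []
for i in range(n):
    xd, yd = x[:i] + x[i+1:], y[:i] + y[i+1:]
    a0, a1 = ols(xd, yd)                       # fit without point i
    mse_del = sum((w - (a0 + a1*v))**2 for v, w in zip(xd, yd))/(n - 1 - p)
    yhat_full = b0 + b1*x[i]
    yhat_del = a0 + a1*x[i]
    dffits_direct.append((yhat_full - yhat_del)/(mse_del*h[i])**0.5)
    t_i = e[i]/(mse_del*(1 - h[i]))**0.5       # studentized deleted residual
    dffits_id.append(t_i*(h[i]/(1 - h[i]))**0.5)

print([round(v, 3) for v in dffits_id])
```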

20 Cook's Distance

Cook's distance measures the influence of the ith observation on all n fitted values:

    D_i = Σ_{j=1}^n (Ŷ_j - Ŷ_{j(i)})² / (p MSE),

and one can show

    D_i = (r_i² / p) (h_ii / (1 - h_ii)),

where r_i is the studentized residual.

21 Cook's Distance

Cook's D is large when both r_i and h_ii are large.
D_i < F_{p, n-p; 0.2} (the 20th percentile): no concern.
D_i > F_{p, n-p; 0.5} (the median): substantial influence.
What about in between? A crude rule of thumb: if D_i > 1, investigate the ith observation as possibly influential.
What happens when p is large?
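Cook's distance can likewise be verified two ways: summing the squared changes in all n fitted values after each deletion, versus the single-fit identity D_i = (r_i²/p)(h_ii/(1 - h_ii)). A pure-Python sketch on made-up data:

```python
# Cook's distance: sum over all refitted fitted values vs. the identity
# D_i = (r_i^2 / p) * h_ii / (1 - h_ii). Simple regression, p = 2.
def ols(xs, ys):
    n = len(xs)
    xb, yb = sum(xs)/n, sum(ys)/n
    sxx = sum((v - xb)**2 for v in xs)
    b1 = sum((v - xb)*(w - yb) for v, w in zip(xs, ys))/sxx
    return yb - b1*xb, b1

x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]   # made-up data
y = [1.2, 1.9, 3.2, 4.1, 4.9, 12.0]
n, p = len(x), 2
b0, b1 = ols(x, y)
e = [yi - (b0 + b1*xi) for xi, yi in zip(x, y)]
MSE = sum(ei**2 for ei in e)/(n - p)
xbar = sum(x)/n
Sxx = sum((xi - xbar)**2 for xi in x)
h = [1/n + (xi - xbar)**2/Sxx for xi in x]

D_direct, D_id = [], []
for i in range(n):
    a0, a1 = ols(x[:i] + x[i+1:], y[:i] + y[i+1:])
    num = sum(((b0 + b1*xj) - (a0 + a1*xj))**2 for xj in x)
    D_direct.append(num/(p*MSE))
    r_i = e[i]/(MSE*(1 - h[i]))**0.5   # internally studentized residual
    D_id.append((r_i**2/p)*h[i]/(1 - h[i]))

print([round(v, 3) for v in D_direct])
```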

22 DFBETAS

DFBETAS measures the influence of the ith observation on a single coefficient β_k:

    DFBETAS_{k(i)} = (β̂_k - β̂_{k(i)}) / √(MSE_{(i)} c_kk),   where c_kk = [(X^T X)^{-1}]_{kk}.

Recall that Var(β̂) = σ² (X^T X)^{-1}. A larger |DFBETAS_{k(i)}| indicates a larger impact of observation i on β̂_k.
For small to medium data sets, |DFBETAS_{k(i)}| > 1 suggests that the ith observation may be influential.
For large data sets, |DFBETAS_{k(i)}| > 2/√n suggests that the ith observation may be influential.
The sign of DFBETAS_{k(i)} tells whether inclusion of observation i leads to an increase (+) or decrease (-) in β̂_k.
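In simple linear regression the slope has c_kk = 1/Sxx, so DFBETAS for the slope is easy to compute by refitting. The sketch below (pure Python, made-up data) also checks the refit against the standard leave-one-out coefficient update β̂_1 - β̂_{1(i)} = (x_i - x̄) e_i / (Sxx (1 - h_ii)):

```python
# DFBETAS for the slope in simple linear regression: refit without
# point i, then standardize by sqrt(MSE_(i) * c_kk) with c_kk = 1/Sxx.
def ols(xs, ys):
    n = len(xs)
    xb, yb = sum(xs)/n, sum(ys)/n
    sxx = sum((v - xb)**2 for v in xs)
    b1 = sum((v - xb)*(w - yb) for v, w in zip(xs, ys))/sxx
    return yb - b1*xb, b1

x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]   # made-up data
y = [1.2, 1.9, 3.2, 4.1, 4.9, 12.0]
n, p = len(x), 2
b0, b1 = ols(x, y)
e = [yi - (b0 + b1*xi) for xi, yi in zip(x, y)]
xbar = sum(x)/n
Sxx = sum((xi - xbar)**2 for xi in x)
h = [1/n + (xi - xbar)**2/Sxx for xi in x]

dfbetas = []
for i in range(n):
    xd, yd = x[:i] + x[i+1:], y[:i] + y[i+1:]
    a0, a1 = ols(xd, yd)
    mse_del = sum((w - (a0 + a1*v))**2 for v, w in zip(xd, yd))/(n - 1 - p)
    ckk = 1/Sxx                       # [(X^T X)^{-1}]_{kk} for the slope
    dfbetas.append((b1 - a1)/(mse_del*ckk)**0.5)
    # leave-one-out identity: b1 - a1 = (x_i - xbar) e_i / (Sxx (1 - h_ii))
    assert abs((b1 - a1) - (x[i] - xbar)*e[i]/(Sxx*(1 - h[i]))) < 1e-10

print([round(v, 3) for v in dfbetas])
```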

23 Outline

1 Review of Residuals
2 Detecting Outliers
    Outlying Response
    Outlying Predictor
3 Influential Observations
4 Multicollinearity and its Effects

24 Multicollinearity

When the predictor variables are correlated among themselves, multicollinearity is said to exist.
Consider two extreme cases:
Uncorrelated predictor variables.
Perfectly correlated predictor variables.

25 Linearly Independent Predictor Variables

Consider Y = β_0 + β_1 X_1 + β_2 X_2 + ɛ. Suppose X_1 ⊥ X_2, i.e., the sample correlation of X_1 and X_2 is 0. We can show

    β̂_1 = Σ_i (Y_i - Ȳ)(X_i1 - X̄_1) / Σ_i (X_i1 - X̄_1)²,   β̂_2 = Σ_i (Y_i - Ȳ)(X_i2 - X̄_2) / Σ_i (X_i2 - X̄_2)².

The LS estimate of β_1 is not affected by X_2 and vice versa. Also, the order in which the predictor variables enter the model is inconsequential.
Interpretation of the regression coefficients is clear: β_1 is the expected change in Y for a one-unit increase in X_1 with X_2 held constant.
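This decoupling is easy to check numerically: with two predictors that are uncorrelated after centering, the marginal (simple-regression) slopes solve the full normal equations. A pure-Python sketch on a small constructed example (data and coefficients made up, with no noise so the check is exact):

```python
# With sample-uncorrelated predictors, each multiple-regression slope
# equals its simple-regression slope. Constructed example, no noise.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, -1.0, 1.0, -1.0, 1.0]            # uncorrelated with x1 after centering
y  = [3 + 2*a + 5*b for a, b in zip(x1, x2)]  # exact Y = 3 + 2 X1 + 5 X2
n = len(y)

def dev(v):
    m = sum(v)/n
    return [vi - m for vi in v]

d1, d2, dy = dev(x1), dev(x2), dev(y)
S11 = sum(a*a for a in d1); S22 = sum(b*b for b in d2)
S12 = sum(a*b for a, b in zip(d1, d2))
S1y = sum(a*c for a, c in zip(d1, dy)); S2y = sum(b*c for b, c in zip(d2, dy))

# Marginal (simple-regression) slopes:
b1_marg, b2_marg = S1y/S11, S2y/S22
# Full least squares: solve the centered normal equations by Cramer's rule.
det = S11*S22 - S12**2
b1_full = (S1y*S22 - S12*S2y)/det
b2_full = (S11*S2y - S12*S1y)/det

print("S12 =", S12)                          # orthogonality after centering
print(b1_marg, b1_full, b2_marg, b2_full)
```

Because S12 = 0, the 2x2 normal equations decouple and both routes recover β_1 = 2, β_2 = 5.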

26 Perfectly Correlated Predictor Variables

Again, suppose Y = β_0 + β_1 X_1 + β_2 X_2 + ɛ, but now X_2 = 2X_1 + 1.
Suppose β_0 = 3, β_1 = 2, β_2 = 5. Then all of the following models give the same fit for Y:
Y = 3 + 2X_1 + 5X_2 + ɛ.
Y = 8 + 12X_1 + ɛ.
Y = 2 + 6X_2 + ɛ.

27 What is still fine. Prediction of Y is fine within the model/data scope, but unreliable outside it.
What is not. The β's are not unique, because X is rank deficient (why?) and X^T X is not invertible.
Interpreting the effect of the jth predictor "holding all other variables constant" is difficult; a regression coefficient may no longer reflect the effect of its corresponding predictor variable.
Even worse: multicollinearity does not violate any model assumptions!

28 Concerns with Multicollinearity

Multicollinearity can involve 3 or more variables, rather than just a correlated pair; that is harder to detect.
Effects of multicollinearity on inference about the regression coefficients:
Large changes in the fitted β̂_k when another X is added or deleted.
Small changes in the data lead to very large changes in β̂.
Large s{β̂_k}, which makes the β̂_k seem non-significant even though the predictors are jointly significant and R² is large.
More difficulty interpreting β̂_k as the effect of X_k on Y, because the other X's cannot be held constant.
Estimated coefficients may have the wrong sign or implausible magnitudes.

29 Some Diagnostics for Multicollinearity

Multicollinearity is harmless for estimation of the mean response and prediction of a new observation at X_h, assuming no extrapolation!
Diagnosing multicollinearity:
Large changes in the β̂'s when a predictor (or an observation) is added or deleted.
Important predictors are not statistically significant (large p-values) in individual tests.
Wide confidence intervals for the β's of important predictor variables.
The sign of a β̂ is counter-intuitive.
Predictors are highly correlated.

30 Variance Inflation Factor (VIF)

The variance inflation factor for β̂_k is

    VIF_k = 1 / (1 - R_k²),   k = 1, ..., p - 1,

where R_k² is the R² from the regression of X_k on the other predictor variables.
VIF_k measures how much the variance of β̂_k is inflated by the presence of the other variables.
If max_k VIF_k > 10, multicollinearity may have a large impact on the inference.
If Σ_{j=1}^{p-1} VIF_j is much larger than p - 1, there may be serious multicollinearity problems (relevant for large p).
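With just two predictors, R_k² is the squared sample correlation between them, so VIF_1 = VIF_2 = 1/(1 - r²). A pure-Python sketch on made-up, strongly correlated predictors:

```python
# VIF for a two-predictor model: R_k^2 is the squared sample correlation
# between X1 and X2, so VIF_1 = VIF_2 = 1/(1 - r^2). Made-up data.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 8.0, 10.1]            # roughly 2*x1: near-collinear
n = len(x1)
m1, m2 = sum(x1)/n, sum(x2)/n
S11 = sum((a - m1)**2 for a in x1)
S22 = sum((b - m2)**2 for b in x2)
S12 = sum((a - m1)*(b - m2) for a, b in zip(x1, x2))
r = S12/(S11*S22)**0.5                     # sample correlation
vif = 1/(1 - r**2)
print(f"corr = {r:.5f}, VIF = {vif:.1f}")
if vif > 10:
    print("VIF > 10: multicollinearity may have a large impact")
```

Even though the two predictors are not exactly collinear, the near-linear relationship drives the VIF far past the rule-of-thumb cutoff of 10.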

31 Variance Inflation Factor (VIF)

R_k² is the coefficient of multiple determination R² of the model

    X_ik = β_0 + Σ_{j≠k} β_j X_ij + ɛ.

In the standardized (correlation-transformed) model, σ²{β̂_k} = σ² VIF_k = σ²/(1 - R_k²):
1 When R_k² decreases, σ²{β̂_k} decreases.
2 When R_k² increases, σ²{β̂_k} increases.
In fact, VIF_k = (n - 1) [(X_c^T X_c)^{-1}]_{kk}, where X_c is the centered and scaled design matrix. (Can you show this?)

32 Some Remedial Measures for Multicollinearity

Classical methods.
Drop one or more predictor variables from the model (variable selection, a frontier of statistics).
For polynomial or interaction regression models, use centered predictors X_ik - X̄_k to reduce multicollinearity (Gram-Schmidt orthogonalization, why?).
Modern methods.
Create new predictor variables: principal component regression, partial least squares (PLSR), dimension reduction.
Use shrinkage regression such as ridge, LASSO, SCAD, group LASSO, or adaptive LASSO; e.g., the ridge estimator

    β̂_R = (X^T X + λI)^{-1} X^T Y.

Although β̂_R has a smaller variance, it is a biased estimator of β. This moves toward the frontier of statistical machine learning and high-dimensional inference.
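For a single centered predictor, the ridge estimator above reduces to β̂_R = Sxy/(Sxx + λ), which makes the shrinkage (and the bias toward zero) directly visible. A minimal pure-Python sketch on made-up data:

```python
# Ridge shrinkage in the simplest case: one centered predictor, where
# (X^T X + lambda I)^{-1} X^T Y reduces to Sxy / (Sxx + lambda).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]   # made-up data
y = [1.2, 1.9, 3.2, 4.1, 4.9, 12.0]
n = len(x)
xbar, ybar = sum(x)/n, sum(y)/n
Sxx = sum((xi - xbar)**2 for xi in x)
Sxy = sum((xi - xbar)*(yi - ybar) for xi, yi in zip(x, y))

def ridge_slope(lam):
    return Sxy/(Sxx + lam)

for lam in (0.0, 1.0, 10.0, 100.0):   # lambda = 0 recovers least squares
    print(f"lambda={lam:6.1f}  slope={ridge_slope(lam):.4f}")
```

As λ grows, the slope shrinks monotonically toward 0: the variance drops but the estimator becomes biased, which is the trade-off noted on this slide.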


Lecture 10 Multiple Linear Regression

Lecture 10 Multiple Linear Regression Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable

More information

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix

More information

Beam Example: Identifying Influential Observations using the Hat Matrix

Beam Example: Identifying Influential Observations using the Hat Matrix Math 3080. Treibergs Beam Example: Identifying Influential Observations using the Hat Matrix Name: Example March 22, 204 This R c program explores influential observations and their detection using the

More information

Model Selection. Frank Wood. December 10, 2009

Model Selection. Frank Wood. December 10, 2009 Model Selection Frank Wood December 10, 2009 Standard Linear Regression Recipe Identify the explanatory variables Decide the functional forms in which the explanatory variables can enter the model Decide

More information

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1)

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) Authored by: Sarah Burke, PhD Version 1: 31 July 2017 Version 1.1: 24 October 2017 The goal of the STAT T&E COE

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

STATISTICS 479 Exam II (100 points)

STATISTICS 479 Exam II (100 points) Name STATISTICS 79 Exam II (1 points) 1. A SAS data set was created using the following input statement: Answer parts(a) to (e) below. input State $ City $ Pop199 Income Housing Electric; (a) () Give the

More information

Need for Several Predictor Variables

Need for Several Predictor Variables Multiple regression One of the most widely used tools in statistical analysis Matrix expressions for multiple regression are the same as for simple linear regression Need for Several Predictor Variables

More information

Lecture 1: Linear Models and Applications

Lecture 1: Linear Models and Applications Lecture 1: Linear Models and Applications Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction to linear models Exploratory data analysis (EDA) Estimation

More information

Ridge Regression. Summary. Sample StatFolio: ridge reg.sgp. STATGRAPHICS Rev. 10/1/2014

Ridge Regression. Summary. Sample StatFolio: ridge reg.sgp. STATGRAPHICS Rev. 10/1/2014 Ridge Regression Summary... 1 Data Input... 4 Analysis Summary... 5 Analysis Options... 6 Ridge Trace... 7 Regression Coefficients... 8 Standardized Regression Coefficients... 9 Observed versus Predicted...

More information

Labor Economics with STATA. Introduction to Regression Diagnostics

Labor Economics with STATA. Introduction to Regression Diagnostics Labor Economics with STATA Liyousew G. Borga November 4, 2015 Introduction to Regression Diagnostics Liyou Borga Labor Economics with STATA November 4, 2015 64 / 85 Outline 1 Violations of Basic Assumptions

More information

Wiley. Methods and Applications of Linear Models. Regression and the Analysis. of Variance. Third Edition. Ishpeming, Michigan RONALD R.

Wiley. Methods and Applications of Linear Models. Regression and the Analysis. of Variance. Third Edition. Ishpeming, Michigan RONALD R. Methods and Applications of Linear Models Regression and the Analysis of Variance Third Edition RONALD R. HOCKING PenHock Statistical Consultants Ishpeming, Michigan Wiley Contents Preface to the Third

More information

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75

More information

Lecture 12 Inference in MLR

Lecture 12 Inference in MLR Lecture 12 Inference in MLR STAT 512 Spring 2011 Background Reading KNNL: 6.6-6.7 12-1 Topic Overview Review MLR Model Inference about Regression Parameters Estimation of Mean Response Prediction 12-2

More information

SSR = The sum of squared errors measures how much Y varies around the regression line n. It happily turns out that SSR + SSE = SSTO.

SSR = The sum of squared errors measures how much Y varies around the regression line n. It happily turns out that SSR + SSE = SSTO. Analysis of variance approach to regression If x is useless, i.e. β 1 = 0, then E(Y i ) = β 0. In this case β 0 is estimated by Ȳ. The ith deviation about this grand mean can be written: deviation about

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Linear Regression Models

Linear Regression Models Linear Regression Models November 13, 2018 1 / 89 1 Basic framework Model specification and assumptions Parameter estimation: least squares method Coefficient of determination R 2 Properties of the least

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Lecture One: A Quick Review/Overview on Regular Linear Regression Models

Lecture One: A Quick Review/Overview on Regular Linear Regression Models Lecture One: A Quick Review/Overview on Regular Linear Regression Models Outline The topics to be covered include: Model Specification Estimation(LS estimators and MLEs) Hypothesis Testing and Model Diagnostics

More information

Chapter 7. Scatterplots, Association, and Correlation

Chapter 7. Scatterplots, Association, and Correlation Chapter 7 Scatterplots, Association, and Correlation Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 29 Objective In this chapter, we study relationships! Instead, we investigate

More information

Regression Diagnostics

Regression Diagnostics Diag 1 / 78 Regression Diagnostics Paul E. Johnson 1 2 1 Department of Political Science 2 Center for Research Methods and Data Analysis, University of Kansas 2015 Diag 2 / 78 Outline 1 Introduction 2

More information

Simple Linear Regression for the MPG Data

Simple Linear Regression for the MPG Data Simple Linear Regression for the MPG Data 2000 2500 3000 3500 15 20 25 30 35 40 45 Wgt MPG What do we do with the data? y i = MPG of i th car x i = Weight of i th car i =1,...,n n = Sample Size Exploratory

More information

Introduction The framework Bias and variance Approximate computation of leverage Empirical evaluation Discussion of sampling approach in big data

Introduction The framework Bias and variance Approximate computation of leverage Empirical evaluation Discussion of sampling approach in big data Discussion of sampling approach in big data Big data discussion group at MSCS of UIC Outline 1 Introduction 2 The framework 3 Bias and variance 4 Approximate computation of leverage 5 Empirical evaluation

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

Linear Models, Problems

Linear Models, Problems Linear Models, Problems John Fox McMaster University Draft: Please do not quote without permission Revised January 2003 Copyright c 2002, 2003 by John Fox I. The Normal Linear Model: Structure and Assumptions

More information

3. Diagnostics and Remedial Measures

3. Diagnostics and Remedial Measures 3. Diagnostics and Remedial Measures So far, we took data (X i, Y i ) and we assumed where ɛ i iid N(0, σ 2 ), Y i = β 0 + β 1 X i + ɛ i i = 1, 2,..., n, β 0, β 1 and σ 2 are unknown parameters, X i s

More information

Homoskedasticity. Var (u X) = σ 2. (23)

Homoskedasticity. Var (u X) = σ 2. (23) Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This

More information

Lecture 9 SLR in Matrix Form

Lecture 9 SLR in Matrix Form Lecture 9 SLR in Matrix Form STAT 51 Spring 011 Background Reading KNNL: Chapter 5 9-1 Topic Overview Matrix Equations for SLR Don t focus so much on the matrix arithmetic as on the form of the equations.

More information

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46 BIO5312 Biostatistics Lecture 10:Regression and Correlation Methods Dr. Junchao Xia Center of Biophysics and Computational Biology Fall 2016 11/1/2016 1/46 Outline In this lecture, we will discuss topics

More information

Simultaneous Inference: An Overview

Simultaneous Inference: An Overview Simultaneous Inference: An Overview Topics to be covered: Joint estimation of β 0 and β 1. Simultaneous estimation of mean responses. Simultaneous prediction intervals. W. Zhou (Colorado State University)

More information

Linear Models in Machine Learning

Linear Models in Machine Learning CS540 Intro to AI Linear Models in Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu We briefly go over two linear models frequently used in machine learning: linear regression for, well, regression,

More information

L7: Multicollinearity

L7: Multicollinearity L7: Multicollinearity Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Introduction ï Example Whats wrong with it? Assume we have this data Y

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima Applied Statistics Lecturer: Serena Arima Hypothesis testing for the linear model Under the Gauss-Markov assumptions and the normality of the error terms, we saw that β N(β, σ 2 (X X ) 1 ) and hence s

More information

Diagnostics for Linear Models With Functional Responses

Diagnostics for Linear Models With Functional Responses Diagnostics for Linear Models With Functional Responses Qing Shen Edmunds.com Inc. 2401 Colorado Ave., Suite 250 Santa Monica, CA 90404 (shenqing26@hotmail.com) Hongquan Xu Department of Statistics University

More information

Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses

Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses ISQS 5349 Final Spring 2011 Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses 1. (10) What is the definition of a regression model that we have used throughout

More information

Chapter 5 Matrix Approach to Simple Linear Regression

Chapter 5 Matrix Approach to Simple Linear Regression STAT 525 SPRING 2018 Chapter 5 Matrix Approach to Simple Linear Regression Professor Min Zhang Matrix Collection of elements arranged in rows and columns Elements will be numbers or symbols For example:

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Regression Analysis for Data Containing Outliers and High Leverage Points

Regression Analysis for Data Containing Outliers and High Leverage Points Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain

More information

Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model "Checking"/Diagnostics

Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model Checking/Diagnostics Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model "Checking"/Diagnostics The session is a continuation of a version of Section 11.3 of MMD&S. It concerns

More information

Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model "Checking"/Diagnostics

Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model Checking/Diagnostics Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model "Checking"/Diagnostics The session is a continuation of a version of Section 11.3 of MMD&S. It concerns

More information

Regression Analysis. Regression: Methodology for studying the relationship among two or more variables

Regression Analysis. Regression: Methodology for studying the relationship among two or more variables Regression Analysis Regression: Methodology for studying the relationship among two or more variables Two major aims: Determine an appropriate model for the relationship between the variables Predict the

More information

Regression Steven F. Arnold Professor of Statistics Penn State University

Regression Steven F. Arnold Professor of Statistics Penn State University Regression Steven F. Arnold Professor of Statistics Penn State University Regression is the most commonly used statistical technique. It is primarily concerned with fitting models to data. It is often

More information

Final Overview. Introduction to ML. Marek Petrik 4/25/2017

Final Overview. Introduction to ML. Marek Petrik 4/25/2017 Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,

More information

LINEAR REGRESSION. Copyright 2013, SAS Institute Inc. All rights reserved.

LINEAR REGRESSION. Copyright 2013, SAS Institute Inc. All rights reserved. LINEAR REGRESSION LINEAR REGRESSION REGRESSION AND OTHER MODELS Type of Response Type of Predictors Categorical Continuous Continuous and Categorical Continuous Analysis of Variance (ANOVA) Ordinary Least

More information

Prepared by: Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti

Prepared by: Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti Prepared by: Prof Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti Putra Malaysia Serdang M L Regression is an extension to

More information