Lecture 19 Multiple (Linear) Regression


Thais Paiva
STA 111 - Summer 2013 Term II
August 1, 2013

Lecture Plan

1. Multiple regression
2. OLS estimates of $\beta$ and $\alpha$
3. Interpretation

Linear regression

A study on depression. The response variable is Depression, the score on a self-report depression inventory.

Predictors:
- Simplicity: a score indicating a subject's need to see the world in black and white.
- Fatalism: a score indicating the subject's belief about the ability to control one's own destiny.

Depression is thought to be related to simplicity and fatalism.

Linear regression

Patient  Depression  Simplicity  Fatalism
      1        0.42        0.76      0.11
      2        0.52        0.73      1.00
      3        0.71        0.62      0.04
      4        0.66        0.84      0.42
      5        0.54        0.48      0.81
      6        0.34        0.41      1.23
      7        0.42        0.85      0.30
      8        1.08        1.50      1.20
      9        0.36        0.31      0.66
     10        0.92        1.41      0.85
     11        0.33        0.43      0.42
     12        0.41        0.53      0.07
     13        0.83        1.17      0.30
     14        0.65        0.42      1.09
     15        0.80        0.76      1.13

Depression data

[Figure: 3D scatterplot of Depression against Simplicity and Fatalism]

Depression data

[Figure: the same 3D scatterplot of the depression data, shown again]

Depression data - residuals

[Figure: 3D scatterplot of Depression against Simplicity and Fatalism, with residuals indicated]

Assumptions for multiple linear regression

$Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_p X_{pi} + \varepsilon_i$

Just as with simple linear regression, the following have to hold:

1. Constant variance (also called homoscedasticity): $V(\varepsilon_i) = \sigma^2$ for all $i = 1, \ldots, n$, for some $\sigma^2$
2. Linearity
3. Independence: $\varepsilon_i \perp \varepsilon_j$ for all $i, j = 1, \ldots, n$, $i \neq j$
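As a concrete illustration, here is a minimal R sketch that simulates data satisfying these assumptions and fits the model; the names and coefficient values are illustrative, not from the depression study.

```r
# Simulate data obeying the multiple regression assumptions, then fit by OLS
set.seed(1)
n <- 100
x1 <- runif(n, 0, 3)                  # two predictors
x2 <- runif(n, 0, 2)
eps <- rnorm(n, mean = 0, sd = 0.3)   # independent errors, constant variance
y <- 0.2 + 0.4 * x1 + 0.3 * x2 + eps  # linear mean function
fit <- lm(y ~ x1 + x2)
summary(fit)
```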

Interpretation of the $\beta$'s

$Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_p X_{pi} + \varepsilon_i$

$\beta_j$ is the average effect on $Y$ of increasing $X_j$ by one unit, with all $X_k$, $k \neq j$, held constant. This is sometimes referred to as the effect of $X_j$ after controlling for the $X_k$, $k \neq j$. So $\beta_{\text{simplicity}}$ is the average effect of simplicity on depression after controlling for fatalism.
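To see this interpretation numerically, reusing the simulated fit above: predictions at two covariate settings that differ by one unit in x1, with x2 held fixed, differ by exactly the x1 coefficient.

```r
# A one-unit change in x1 (x2 fixed) shifts the fitted value by coef(fit)["x1"]
new <- data.frame(x1 = c(1, 2), x2 = c(1, 1))
diff(predict(fit, newdata = new))
coef(fit)["x1"]  # same number
```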

Always plot residuals

[Figure: residuals $\hat\varepsilon$ plotted against simplicity and against fatalism]

Histogram of residuals

[Figure: histogram of the residuals $\hat\varepsilon$]
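Both diagnostic plots are easy to reproduce; a sketch, continuing with the simulated fit above:

```r
# Residual diagnostics: scatter against each predictor, plus a histogram
res <- resid(fit)
par(mfrow = c(1, 3))
plot(x1, res, xlab = "x1", ylab = "residual"); abline(h = 0, lty = 2)
plot(x2, res, xlab = "x2", ylab = "residual"); abline(h = 0, lty = 2)
hist(res, main = "Histogram of residuals", xlab = "residual")
```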

OLS estimates of $\alpha, \beta_1, \ldots, \beta_p$

$Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$

(This is only really reasonable to write down if $p = 2$.)

$$\hat\beta_1 = \frac{s_Y \,(r_{X_1 Y} - r_{X_1 X_2}\, r_{X_2 Y})}{s_{X_1}\,(1 - r^2_{X_1 X_2})}$$

$$\hat\beta_2 = \frac{s_Y \,(r_{X_2 Y} - r_{X_1 X_2}\, r_{X_1 Y})}{s_{X_2}\,(1 - r^2_{X_1 X_2})}$$

$$\hat\alpha = \bar Y - \hat\beta_1 \bar X_1 - \hat\beta_2 \bar X_2,$$

where, for some variables $A$ and $B$,

$$r_{AB} = \frac{\sum_{i=1}^n (A_i - \bar A)(B_i - \bar B)}{\sqrt{\sum_{i=1}^n (A_i - \bar A)^2 \sum_{i=1}^n (B_i - \bar B)^2}}
\quad \text{and} \quad
s^2_A = \frac{1}{n-1}\sum_{i=1}^n (A_i - \bar A)^2.$$
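These closed-form estimates can be checked directly against lm(); a sketch using the simulated data from above (R's sd() and cor() already use the $n-1$ convention):

```r
# Hand-compute the p = 2 OLS formulas and compare with lm()
b1 <- sd(y) * (cor(x1, y) - cor(x1, x2) * cor(x2, y)) /
      (sd(x1) * (1 - cor(x1, x2)^2))
b2 <- sd(y) * (cor(x2, y) - cor(x1, x2) * cor(x1, y)) /
      (sd(x2) * (1 - cor(x1, x2)^2))
a  <- mean(y) - b1 * mean(x1) - b2 * mean(x2)
c(a, b1, b2)
coef(lm(y ~ x1 + x2))  # should agree
```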

It is easier if you know matrix algebra

$Y = X\beta + \varepsilon$, where

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1p} \\ 1 & x_{21} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{np} \end{pmatrix}, \quad
\beta = \begin{pmatrix} \alpha \\ \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}, \quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$

It is easier if you know matrix algebra

It turns out that the error sum of squares can be written as

$$\hat\varepsilon^T \hat\varepsilon = (Y - X\beta)^T (Y - X\beta).$$

Differentiating with respect to $\beta$ and setting the result to zero,

$$\frac{\partial}{\partial\beta}\,(Y - X\beta)^T (Y - X\beta) = -2 X^T (Y - X\beta) \stackrel{\text{set}}{=} 0$$

$$X^T Y - X^T X \hat\beta = 0 \quad\Longrightarrow\quad X^T Y = X^T X \hat\beta \quad\Longrightarrow\quad \hat\beta = (X^T X)^{-1} X^T Y$$

It is easier if you know matrix algebra

$$\hat\beta = (X^T X)^{-1} X^T Y$$

A couple of things are clear:

1. $\hat\beta$ is linear in $Y$
2. $\hat\beta$ is easy to compute if we have a computer
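A sketch of the matrix computation on the simulated data, solving the normal equations directly rather than forming the inverse explicitly:

```r
# Normal-equations solution computed from the design matrix
X <- cbind(1, x1, x2)                      # n x (p+1) design matrix
beta_hat <- solve(t(X) %*% X, t(X) %*% y)  # solves (X'X) b = X'y
drop(beta_hat)
coef(lm(y ~ x1 + x2))  # same numbers
```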

The coefficient of determination

Similarly to simple linear regression,

$$r^2 = \frac{ESS}{TSS} \quad \text{and} \quad TSS = ESS + RSS,$$

where

$$TSS = \sum_{i=1}^n (Y_i - \bar Y)^2, \quad ESS = \sum_{i=1}^n (\hat Y_i - \bar Y)^2, \quad RSS = \sum_{i=1}^n (Y_i - \hat Y_i)^2$$

SS: Sum of Squares. T: Total. E: Explained. R: Residual.
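The decomposition is easy to verify numerically; a sketch continuing the running example:

```r
# Sum-of-squares decomposition and r^2
yhat <- fitted(fit)
TSS <- sum((y - mean(y))^2)
ESS <- sum((yhat - mean(y))^2)
RSS <- sum((y - yhat)^2)
c(r2 = ESS / TSS, check = (ESS + RSS) / TSS)  # check should be 1
summary(fit)$r.squared                        # agrees with r2
```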

$s^2$ and degrees of freedom

Similarly to simple linear regression,

$$s^2 = \frac{1}{n - p - 1}\sum_{i=1}^n (Y_i - \hat Y_i)^2 = \frac{RSS}{n - p - 1}$$

Note the $n - p - 1$ degrees of freedom. Why? We had to estimate $p + 1$ regression parameters.
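Continuing the sketch, $s$ computed by hand matches the residual standard error that lm() reports:

```r
# Residual variance with n - p - 1 degrees of freedom
p <- 2
s2 <- RSS / (n - p - 1)
c(s_by_hand = sqrt(s2), sigma_lm = sigma(fit))  # should match
```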

Hypothesis tests for $\beta_j$

Suppose we are interested in testing

$$H_0: \beta_j = 0 \quad \text{vs.} \quad H_A: \beta_j \neq 0$$

(or the one-sided version). Assuming $p = 2$ (tractable, but more complicated for $p > 2$), define

$$s^2_{\hat\beta_1} = \frac{s^2}{(n-1)\, s^2_{X_1}\, (1 - r^2_{X_1 X_2})}$$

and similarly for $s^2_{\hat\beta_2}$. Then (even for $p > 2$),

$$t_{\hat\beta_j} = \frac{\hat\beta_j - \beta_j}{s_{\hat\beta_j}} \sim t_{n-p-1}$$
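A sketch of this t test computed by hand for the simulated $p = 2$ example, checked against the lm() summary table:

```r
# t statistic and two-sided p-value for H0: beta_1 = 0
se_b1 <- sqrt(s2 / ((n - 1) * var(x1) * (1 - cor(x1, x2)^2)))
t1 <- unname(coef(fit)["x1"]) / se_b1
pval <- 2 * pt(-abs(t1), df = n - p - 1)
c(t = t1, p = pval)
summary(fit)$coefficients["x1", ]  # same t value and Pr(>|t|)
```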

Hypothesis tests for $\beta_j$

Notice that

$$s^2_{\hat\beta_1} = \frac{s^2}{(n-1)\, s^2_{X_1}\, (1 - r^2_{X_1 X_2})}$$

depends on $r^2_{X_1 X_2}$, which depends on $X_2$. So the test for $\beta_1$ depends on the other predictor variables. What is the interpretation of this test, then? Assuming that the other $\beta_k \neq 0$, $k \neq j$, can we reject the hypothesis that $\beta_j = 0$?

Scatterplot matrix

[Figure: scatterplot matrix of depression, simplicity, and fatalism]

3D Scatterplot and plane

[Figure: 3D scatterplot of the depression data with the fitted regression plane]

Tests for $\beta_{\text{simplicity}}$ and $\beta_{\text{fatalism}}$

$\beta_{\text{simplicity}}$: $t_{\hat\beta_{\text{simplicity}}} = 3.649$, p-value = 0.0005
$\beta_{\text{fatalism}}$: $t_{\hat\beta_{\text{fatalism}}} = 3.829$, p-value = 0.0003

But what if we take fatalism out of the model? Then we get

$\beta_{\text{simplicity}}$: $t_{\hat\beta_{\text{simplicity}}} = 4.175$, p-value = $2 \times 10^{-8}$

Why?
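The comparison comes from fitting nested models. A sketch, assuming the depression data sit in a data frame named dep with columns depression, simplicity, and fatalism (names not given in the slides):

```r
# Full model vs. the model with fatalism dropped (names are illustrative)
full    <- lm(depression ~ simplicity + fatalism, data = dep)
reduced <- lm(depression ~ simplicity, data = dep)
summary(full)$coefficients     # t and p for each slope, given the other
summary(reduced)$coefficients  # t and p for simplicity alone
```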

Scatterplot matrix

[Figure: scatterplot matrix of depression, simplicity, and fatalism, shown again]

1907 Romanian Peasant Rebellion

From Wikipedia: The Romanian Peasants' Revolt took place in March 1907 in Moldavia, and it quickly spread, reaching Wallachia.

Y = intensity of the rebellion, by county
X1 = commercialization of agriculture
X2 = traditionalism
X3 = strength of middle peasantry
X4 = inequality of land tenure

Scatterplot matrix

[Figure: scatterplot matrix of intensity, commerce, tradition, midpeasant, and inequality]

Peasant Rebellion results

With all the predictors in the model:

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -12.32796    5.74640  -2.145   0.0418 *
commerce      0.10055    0.02144   4.690 8.33e-05 ***
tradition     0.10578    0.06161   1.717   0.0984 .
midpeasant    0.09333    0.07466   1.250   0.2229
inequality    0.42198    3.11171   0.136   0.8932
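This output appears to come from a call like the following. The variables match the Chirot data set shipped with the carData package; that identification is an assumption, not something stated in the slides.

```r
# Sketch of the call behind the output above (Chirot data set assumed)
library(carData)
fit_all <- lm(intensity ~ commerce + tradition + midpeasant + inequality,
              data = Chirot)
summary(fit_all)
```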

Peasant Rebellion results

Without commerce, tradition becomes significant at $\alpha = 0.05$:

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -20.03497    7.40287  -2.706   0.0119 *
tradition     0.19705    0.07859   2.507   0.0187 *
midpeasant    0.03480    0.09897   0.352   0.7279
inequality    5.12172    3.96053   1.293   0.2073

Caveats

1. Be careful interpreting the coefficients! Multiple regression is usually applied to observational data.
2. Do not think of the sign of a coefficient as special: it can actually change as other covariates are added to or removed from the model.
3. Similarly, tests about any covariate are only meaningful in the context of the other covariates in the model.
4. Always make sure a linear model is appropriate for all predictors!
5. Always check the residuals for heteroscedasticity and normality.

Caveats

In particular, a special case that you should be careful about is when the predictors are highly correlated. In this situation the coefficient estimates may change erratically in response to small changes in the model or the data. This phenomenon is called multicollinearity. Because of that, the correlation matrix of the predictors is also something to look at (and report) in the analysis.
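A sketch of that check, continuing the Chirot assumption from above; variance inflation factors (vif() in the car package) are a common follow-up:

```r
# Correlation matrix of the predictors; high entries flag possible
# multicollinearity (data set name assumed, as above)
round(cor(Chirot[, c("commerce", "tradition", "midpeasant", "inequality")]), 2)
# car::vif(fit_all)  # variance inflation factors, if car is installed
```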

Summary

1. Multiple linear regression fits the best hyperplane to the data.
2. We can test hypotheses about any of the $\beta_j$'s.
3. Be careful about interpretation.
4. The correlation of the predictors is also important, because of multicollinearity.