Last updated: Oct 18, 2012 LINEAR REGRESSION PSYC 3031 INTERMEDIATE STATISTICS LABORATORY. J. Elder


Acknowledgements 2 Some of these slides have been sourced or modified from slides created by A. Field for Discovering Statistics using R.

Simple Linear Regression

Objectives 4 Understand linear regression with one predictor. Understand how we assess the fit of a regression model: total sum of squares, model sum of squares, residual sum of squares, F, R². Know how to do regression using R. Interpret a regression model.

What is Regression? 5 A way of predicting the value of one variable from another. It is a hypothetical model of the relationship between two variables. We will focus on a linear relationship, in which the outcome variable is predicted by a straight line.

Describing a Straight Line 6 $Y_i = b_0 + b_1 X_i + \varepsilon_i$. Here $b_1$ is the coefficient for the predictor: the gradient (slope) of the regression line, reflecting the direction/strength of the relationship. $b_0$ is the intercept (the value of Y when X = 0): the point at which the regression line crosses the Y-axis (ordinate).

Intercepts and Gradients 7 Same intercept, different slopes Same slope, different intercepts

8 The Method of Least Squares This graph shows a scatterplot of some data with a line representing the general trend. The vertical lines (dotted) represent the differences (or residuals) between the line and the actual data.

Why Least Squares? 9 It can be shown that if the noise is zero-mean, Gaussian, and independent and identically distributed (IID), then the maximum-likelihood linear model is the one that minimizes the sum of squared residuals.

How Good Is the Model? 10 The regression line is only a model based on the data. This model might not reflect reality. We need some way of testing how well the model fits the observed data. How?

Sources of Variability 11 SS_T: total variability (variability between scores and the mean). SS_R: residual/error variability (variability between the regression model and the actual data). SS_M: model variability (difference in variability between the model and the mean). These are related by $SS_T = SS_M + SS_R$.

Sources of Variability: Sums of Squares 12 Let $t_i$ = observed values of the outcome variable and $y_i$ = model predictions for the outcome variable. Then the total variation is $SS_T = \sum_{i=1}^{n} (t_i - \bar{t})^2$, the residual (error) variation is $SS_R = \sum_{i=1}^{n} (t_i - y_i)^2$, and the variation explained by the model is $SS_M = \sum_{i=1}^{n} (y_i - \bar{t})^2$.
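As a minimal sketch (assuming the album1 data frame with sales and adverts used later in these slides), the three sums of squares can be computed directly in R from a fitted model:

albumsales.1 <- lm(sales ~ adverts, data = album1)
t <- album1$sales                # observed outcome values t_i
y <- fitted(albumsales.1)        # model predictions y_i
SS.T <- sum((t - mean(t))^2)     # total variation
SS.R <- sum((t - y)^2)           # residual (error) variation
SS.M <- sum((y - mean(t))^2)     # variation explained by the model
SS.M + SS.R                      # equals SS.T (up to rounding)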

Testing the Model: ANOVA 13 If the model results in better prediction than using the mean, then we expect SS_M to be much greater than SS_R.

Coefficient of Determination: R² 14 R² is the proportion of variance accounted for by the regression model (the Pearson correlation coefficient squared): $R^2 = \frac{SS_M}{SS_T}$.

Mean Squares: Testing Significance 15 Let $t_i$ = observed values of the outcome variable and $y_i$ = model predictions for the outcome variable. Then the unexplained (residual) mean squares is $MS_R = \frac{1}{df_R} SS_R = \frac{1}{n-2} SS_R = \frac{1}{n-2} \sum_{i=1}^{n} (t_i - y_i)^2$, and the model mean squares is $MS_M = \frac{1}{df_M} SS_M = \frac{1}{2-1} SS_M = SS_M = \sum_{i=1}^{n} (y_i - \bar{t})^2$. The test statistic is $F = \frac{MS_M}{MS_R}$.

More on the F-Statistic 16 The F-statistic can be used to compare any two nested models. Let's label the models 1 and 2, where model 2 is an elaboration of model 1. Then we can test whether model 2 significantly increases the explained variance using the statistic $F = \frac{(SS_1 - SS_2)/(df_2 - df_1)}{SS_2/(n - df_2)}$, where the SS refer to the sums of squared residuals from the respective models, the df are the numbers of parameters of the respective models, and n is the total number of data points.

More on the F-Statistic 17 In the case of simple linear regression: Model 1 is the mean (horizontal line), with one degree of freedom. Model 2 is the sloped line, with two degrees of freedom. Thus $F = \frac{(SS_1 - SS_2)/(df_2 - df_1)}{SS_2/(n - df_2)} = \frac{(SS_T - SS_R)/(2 - 1)}{SS_R/(n-2)} = \frac{SS_T - SS_R}{SS_R/(n-2)} = \frac{SS_M}{SS_R/(n-2)} = \frac{MS_M}{MS_R}$.
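Continuing the sketch above (still assuming the album1 data frame), this overall F-statistic can be reproduced from the sums of squares and should agree with the value reported by summary():

n <- nrow(album1)
MS.M <- SS.M / 1          # df_M = 2 - 1 = 1 for simple regression
MS.R <- SS.R / (n - 2)    # df_R = n - 2
MS.M / MS.R               # should match the F-statistic from summary(albumsales.1)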

Regression: An Example 18 A record company boss was interested in predicting record sales from advertising. Data: 200 different album releases. Outcome variable: sales (CDs and downloads) in the week after release. Predictor variable: the amount (in units of £1000) spent promoting the record before release.

19 Doing Simple Regression Using R Commander

Regression in R 20 We run a regression analysis using the lm() function; lm stands for linear model. This function takes the general form: newmodel <- lm(outcome ~ predictor(s), data = dataframe, na.action = an action)

Regression in R 21 albumsales.1 <- lm(album1$sales ~ album1$adverts) Or we can tell R what data frame to use (using data = nameofdataframe), and then specify the variables without the dataframename$ prefix: albumsales.1 <- lm(sales ~ adverts, data = album1)

Output of a Simple Regression 22 We have created an object called albumsales.1 that contains the results of our analysis. We can show the object by executing: summary(albumsales.1)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.341e+02  7.537e+00  17.799   <2e-16 ***
adverts     9.612e-02  9.632e-03    9.979   <2e-16 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 65.99 on 198 degrees of freedom
Multiple R-squared: 0.3346, Adjusted R-squared: 0.3313
F-statistic: 99.59 on 1 and 198 DF, p-value: < 2.2e-16

Adjusted R-Squared 23 Note that if n = 2, R² = 1. In general, when two variables in fact have zero correlation in the population, the expected R² for a sample of size n is 1/(n-1). In other words, the standard sample R² is a biased estimator of the population R². The adjusted R² provides an alternative, approximately unbiased estimator of the population R².
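One common form of this adjustment, and the one R reports as Adjusted R-squared in summary(), is adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where k is the number of predictors. As a quick check against the output on the previous slide:

R2 <- 0.3346; n <- 200; k <- 1
1 - (1 - R2) * (n - 1) / (n - k - 1)   # approximately 0.331, matching Adjusted R-squared above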

Using the Model 24 $\text{Record Sales}_i = b_0 + b_1 \times \text{Advertising Budget}_i = 134.14 + 0.09612 \times \text{Advertising Budget}_i$. For an advertising budget of 100: $\text{Record Sales} = 134.14 + (0.09612 \times 100) = 143.75$.
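The same prediction can be obtained in R with predict(), assuming albumsales.1 was fitted with the data = album1 form shown earlier:

predict(albumsales.1, newdata = data.frame(adverts = 100))
# approximately 143.75, matching the hand calculation above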

Objectives 25 Understand linear regression with one predictor. Understand how we assess the fit of a regression model: total sum of squares, model sum of squares, residual sum of squares, F, R². Know how to do regression using R. Interpret a regression model.

Multiple Regression

Objectives 27 Understand when to use multiple regression. Understand the multiple regression equation and what the betas represent. Understand different methods of regression: Hierarchical, Stepwise, Forced Entry. Understand how to do a multiple regression using R. Understand how to interpret multiple regression. Understand the assumptions of multiple regression and how to test them.

Multiple Regression 28 Multiple regression extends linear regression to allow for 2 or more independent variables. There is still only one dependent (criterion) variable. We can think of the independent variables as predictors of the dependent variable. The main complication in multiple regression arises when the predictors are not statistically independent.

Multiple Regression: An Example 29 A record company boss was interested in predicting record sales from advertising. Data: 200 different album releases. Outcome variable: sales (CDs and downloads) in the week after release. Predictor variables: the amount (in £s) spent promoting the record before release (see last lecture), and the number of plays on the radio (new variable).

The Model with One Predictor 30

Multiple Regression as an Equation 31 With multiple regression the relationship is described using a straightforward generalization of the equation for a straight line: $y_i = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n + \varepsilon_i$.

Degrees of Freedom 32 $df = n - k - 1$, where n = sample size and k = number of predictors.
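For example, with the album sales data used below (n = 200) and k = 2 predictors (advertising and airplay), df = 200 - 2 - 1 = 197, which is the residual df reported by R for that model later in these slides.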

33 b_0 b_0 is the intercept. The intercept is the value of the Y variable when all Xs = 0. This is the point at which the regression plane crosses the Y-axis.

Coefficients 34 b_1 is the regression coefficient for variable 1. b_2 is the regression coefficient for variable 2. b_n is the regression coefficient for the nth variable.

The Model with Two Predictors 35 (figure: regression plane for the two-predictor model, labelled with the coefficients b_Adverts and b_airplay and the intercept b_0.)

Coefficient of Multiple Determination 36 The proportion of variance explained by all of the independent variables together is called the coefficient of multiple determination (R²). R is called the multiple correlation coefficient. R measures the correlation between the predictions and the actual values of the dependent variable. The correlation r_iY of predictor i with the criterion (dependent variable) Y is called the validity of predictor i.

Methods of Multiple Regression 37 Hierarchical: the experimenter decides the order in which variables are entered into the model. Forced Entry: all predictors are entered simultaneously. Stepwise: predictors are selected using their semi-partial correlation with the outcome.

Hierarchical 39 Known predictors (based on past research) are entered into the regression model first. New predictors are then entered in a separate step/block. The experimenter makes the decisions.

Hierarchical 40 It is the best method: it is based on theory testing, and you can see the unique predictive influence of a new variable on the outcome because known predictors are held constant in the model. Bad point: it relies on the experimenter knowing what they're doing!

Forced Entry 41 All variables are entered into the model simultaneously. The results obtained depend on the variables entered into the model. It is important, therefore, to have good theoretical reasons for including a particular variable.

Stepwise 42 Select as the first predictor the variable that yields the largest R². Having selected the first predictor, a second is chosen from the remaining predictors. The semi-partial correlation is used as a criterion for selection.

Stepwise 43 Step 2: having selected the first predictor, a second one is chosen from the remaining predictors. The semi-partial correlation is used as a criterion for selection.

Semi-Partial Correlation 44 Partial correlation: measures the relationship between two variables, controlling for the effect that a third variable has on them both. Semi-partial correlation: measures the relationship between two variables, controlling for the effect that a third variable has on only one of them.

Semipartial Correlations 45 The semipartial correlation measures the correlation between each predictor and the criterion after the other predictors have been partialled out of that predictor. In this way, the effects of correlations between predictors are eliminated. In general, the semipartial correlations are smaller than the pairwise correlations.
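As a minimal sketch (assuming the album2 data frame with sales, adverts and airplay used in the example below), the semi-partial correlation of airplay with sales, controlling adverts out of airplay only, can be computed from residuals:

airplay.resid <- resid(lm(airplay ~ adverts, data = album2))  # part of airplay unrelated to adverts
cor(album2$sales, airplay.resid)                              # semi-partial (part) correlation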

46 Problems with Stepwise Methods They rely on a mathematical criterion. Variable selection may depend upon only slight differences in the semi-partial correlation. These slight numerical differences can lead to major theoretical differences. Stepwise methods should be used only for exploration.

Multicollinearity 47 Multicollinearity occurs when two predictors are strongly correlated. The result is that a family of solutions exists that trade off the regression weights between the correlated predictors. This makes estimation of the regression coefficients b_i unreliable. Also note that in this case, the coefficient of determination R² for the model will be much less than the sum of the R² values for each predictor alone.
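A small simulated sketch (hypothetical data, not the album example) illustrates the problem: when two predictors are nearly identical, their individual coefficients become very unstable even though the model as a whole predicts well. vif() here is assumed to come from the car package, as in the diagnostics later in these slides.

set.seed(1)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100, sd = 0.05)         # x2 is almost a copy of x1
y  <- x1 + rnorm(100)
summary(lm(y ~ x1 + x2))$coefficients    # note the large standard errors on x1 and x2
car::vif(lm(y ~ x1 + x2))                # very large variance inflation factors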

48 Uncorrelated Predictors (figure: Venn diagram of the total variance with non-overlapping regions for the variance explained by assignments, r²_1Y, and the variance explained by the midterm, r²_2Y.) When the predictors are uncorrelated, the total proportion of variance explained is $R^2 = r_{1Y}^2 + r_{2Y}^2$.

49 Correlated Predictors (figure: the same Venn diagram, but with the regions for the variance explained by assignments and by the midterm overlapping.) When the predictors are correlated, the total proportion of variance explained is $R^2 < r_{1Y}^2 + r_{2Y}^2$.

Example 50 Predicting record sales (Y) from advertising (X_1) and airplay (X_2): $Y = b_0 + b_1 X_1 + b_2 X_2 + \varepsilon$. albumsales.2 <- lm(sales ~ adverts + airplay, data = album2)

Coefficients 51 b_1 = 0.087. So, as advertising increases by 1, record sales increase by 0.087 units. b_2 = 3589. So, each time (per week) a song is played on Radio 1, its sales increase by 3589 units.

Constructing a Model 52 $\text{Sales} = b_0 + b_1 X_1 + b_2 X_2 = 41124 + 0.087 \times \text{Adverts} + 3589 \times \text{Plays}$. For an advertising budget of 1 million and 15 plays: $\text{Sales} = 41124 + (0.087 \times 1{,}000{,}000) + (3589 \times 15) = 41124 + 87000 + 53835 = 181959$.

Standardized Coefficients 53 The coefficients b do not directly inform us of the importance of each predictor, since that also depends upon the dispersion of the predictors. To better assess importance, it is useful to transform the regression equation to standardized form: $z_y = \beta_0 + \beta_1 z_1 + \beta_2 z_2 + \dots + \beta_n z_n + \varepsilon_i$, where $z_y$ is the z-score for the outcome variable and $z_i$ is the z-score for the i-th predictor $X_i$.

Standardised Coefficients 54 lm.beta(albumsales.2) β_1 = 0.523. As advertising increases by 1 standard deviation, record sales increase by 0.523 of a standard deviation. β_2 = 0.546. When the number of plays on the radio increases by 1 standard deviation, sales increase by 0.546 standard deviations.
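lm.beta() above presumably comes from an add-on package (e.g. QuantPsyc); an equivalent sketch that needs no extra package is to z-score all the variables before fitting:

albumsales.z <- lm(scale(sales) ~ scale(adverts) + scale(airplay), data = album2)
coef(albumsales.z)   # slopes approximately 0.523 and 0.546, as above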

Comparing Models 55 In standard linear regression we use an F-statistic to determine whether the linear model is significantly better than the mean in predicting the outcome variable. In hierarchical regression, we can use the same method to determine whether the addition of a new predictor leads to a significant improvement in predicting the outcome variable. In R, this can be achieved using the anova() function.

Comparing Models 56 anova(model.1, model.2, ..., model.n) Note that models must be hierarchical (nested): model.(i+1) includes all predictors of model.i, plus 1 or more additional predictors. Example: anova(albumsales.1, albumsales.2)

Model 1: sales ~ adverts
Model 2: sales ~ adverts + airplay
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1    198 862264
2    197 480428  1    381836 156.57 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Generalization 57 When we run regression, we hope to be able to generalize the sample model to the entire population. To do this, several assumptions must be met. Violating these assumptions stops us generalizing conclusions to our target population.

Assumptions 58 Quantitative variables; linear dependence of the outcome variable on the predictors; homoscedasticity; independent, normally distributed errors; limited multicollinearity.

Standardized Residuals 59 If the errors are normally distributed: ~95% of standardized residuals should lie between ±2, and ~99% of standardized residuals should lie between ±2.5.
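A minimal sketch of this check in R, assuming a fitted model object such as albumsales.2:

z <- rstandard(albumsales.2)   # standardized residuals
mean(abs(z) > 2)               # should be roughly 0.05 or less
mean(abs(z) > 2.5)             # should be roughly 0.01 or less
hist(z)                        # histogram check, as on the next slide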

Normality of Errors: Histograms 60 (figure: example histograms of standardized residuals, one labelled Good and one labelled Bad.)

Testing Independence 61 The Durbin-Watson test looks for statistical correlations between residuals of neighbouring cases. The statistic should be close to 2 if cases are independent. Example:
> dwt(model3)
 lag Autocorrelation D-W Statistic p-value
   1       0.0026951      1.949819   0.716
 Alternative hypothesis: rho != 0

Testing for Multicollinearity 62 We can use the vif() function. Values less than 10 are OK. Example:
> vif(model3)
 adverts  airplay  attract
1.014593 1.042504 1.038455

Objectives 63 Understand when to use multiple regression. Understand the multiple regression equation and what the betas represent. Understand different methods of regression: Hierarchical, Stepwise, Forced Entry. Understand how to do a multiple regression using R. Understand how to interpret multiple regression. Understand the assumptions of multiple regression and how to test them.