Multiple Regression Methods


Chapter 12: Multiple Regression Methods
Hildebrand, Ott and Gray, Basic Statistical Ideas for Managers, Second Edition

Learning Objectives for Ch. 12
The multiple linear regression model
How to interpret a slope coefficient in the multiple regression model
The reason for using the adjusted coefficient of determination in multiple regression
The meaning of multicollinearity and how to detect it
How to test the overall utility of the predictors
How to test the additional value of a single predictor
How to test the significance of a subset of predictors in multiple regression
The meaning of extrapolation

Section 12.1 The Multiple Regression Model

12.1 The Multiple Regression Model

Example (with two predictors):
Y = sales revenue per region (tens of thousands of dollars)
x1 = advertising expenditures (thousands of dollars)
x2 = median household income (thousands of dollars)

The data are:

Region  Sales  Adv Exp  Income
A       1      1        32
B       1      2        38
C       2      1        42
D       2      3        35
E       3      2        41
F       3      4        43
G       4      3        46
H       4      5        44
I       5      5        48
J       5      6        45

[3D scatterplot of Sales vs. Income vs. Adv Exp]

Objective: Fit a plane through the points.
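
The least-squares plane can be reproduced outside Minitab. A minimal sketch in Python (not part of the original slides; numpy only), using the ten-region data from the table above:

```python
import numpy as np

# Ten-region data from the table above.
sales  = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=float)
adv    = np.array([1, 2, 1, 3, 2, 4, 3, 5, 5, 6], dtype=float)
income = np.array([32, 38, 42, 35, 41, 43, 46, 44, 48, 45], dtype=float)

# Design matrix: a column of ones for the intercept, then the two predictors.
X = np.column_stack([np.ones_like(sales), adv, income])

# Least-squares fit of the plane E(Y) = b0 + b1*x1 + b2*x2.
b, *_ = np.linalg.lstsq(X, sales, rcond=None)
print(b)  # approx. [-5.091, 0.416, 0.163], matching the Minitab output below
```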

12.1 The Multiple Regression Model

Population model:
E(Y) = β0 + β1 x1 + ... + βk xk, or Y = β0 + β1 x1 + ... + βk xk + ε, where ε is the error term.

Interpretation of any βj: the change in E(Y) per unit change in xj, when all other independent variables are held constant. βj is called the partial slope, j = 1, 2, ..., k.

First-order model: no higher-order terms or interaction terms. An interaction term is the product of two predictors, e.g., x1 x2; with an interaction term in the model, the change in E(Y) per unit change in x1 depends on the value of x2.

Section 12.2 Estimating Multiple Regression

12.2 Estimating Multiple Regression

Criterion used to estimate the β's: the method of least squares, which minimizes the sum of squared residuals. Symbolically: min Σ(yi − ŷi)². We will use software to do the calculations.

Example (Y = Sales, x1 = Advertising Expenditures, x2 = Median Household Income): The Minitab output follows.

Regression Analysis: Sales versus Adv Exp and Income
The regression equation is
Sales = - 5.09 + 0.416 Adv Exp + 0.163 Income

Predictor  Coef     SE Coef  T      P      VIF
Constant   -5.091   1.720    -2.96  0.021
Adv Exp    0.4158   0.1367   3.04   0.019  1.8
Income     0.16330  0.04752  3.44   0.011  1.8

S = 0.54114  R-Sq = 89.8%  R-Sq(adj) = 86.8%

The fitted model is Ŷ = −5.09 + 0.416 x1 + 0.163 x2, versus the two simple linear regressions Ŷ = 0.681 + 0.725 x1 and Ŷ = −7.69 + 0.258 x2.

"The coefficient of an independent variable xj in a multiple regression equation does not, in general, equal the coefficient that would apply to that variable in a simple linear regression. In multiple regression, the coefficient refers to the effect of changing that xj variable while other independent variables stay constant. In simple linear regression, all other potential independent variables are ignored." (Hildebrand, Ott and Gray)
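
The quoted point is easy to check directly: the coefficient on a predictor changes when the other predictor is dropped. A short sketch (same data as above; a hypothetical helper `ls` just prepends an intercept column):

```python
import numpy as np

sales  = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=float)
adv    = np.array([1, 2, 1, 3, 2, 4, 3, 5, 5, 6], dtype=float)
income = np.array([32, 38, 42, 35, 41, 43, 46, 44, 48, 45], dtype=float)

def ls(y, *cols):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones_like(y)] + list(cols))
    return np.linalg.lstsq(X, y, rcond=None)[0]

print(ls(sales, adv, income))  # multiple: [-5.091, 0.416, 0.163]
print(ls(sales, adv))          # simple:   [ 0.681, 0.725]
print(ls(sales, income))       # simple:   [-7.694, 0.258]
```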

12.2 Estimating Multiple Regression

Interpretation of 0.416: an additional unit (an increase of $1,000) of Advertising Expenditures leads to a 0.416 increase in Sales when Median Household Income is held fixed, i.e., regardless of whether x2 is 32 or 48. Does this seem reasonable? If Advertising Expenditures are increased by 1 unit, do you expect Sales to increase by 0.416 units regardless of whether the region has income of $32,000 or $48,000?

The output also gives the estimate of σε, both directly and indirectly. Indirectly, σε² can be estimated as sε² = (Sum of Squared Residuals)/df(Error) = MS(Residual Error), where df(Error) = n − (k + 1) = n − k − 1. The estimate of σε can also be read directly from S on the output.

Example (Sales vs. Adv Exp and Income): The Minitab output follows.

Regression Analysis: Sales versus Adv Exp and Income
The regression equation is
Sales = - 5.09 + 0.416 Adv Exp + 0.163 Income

S = 0.54114  R-Sq = 89.8%  R-Sq(adj) = 86.8%

Analysis of Variance
Source          DF  SS       MS      F      P
Regression      2   17.9502  8.9751  30.65  0.000
Residual Error  7   2.0498   0.2928
Total           9   20.0000

12.2 Estimating Multiple Regression

Use the output to locate the estimate of σε. From the output, sε = 0.5411. Or, sε = √MS(Error) = √0.2928 = 0.541.

Coefficient of Determination, R² (written R²(y·x1 x2 ... xk) when the predictors need to be named). Concept: "we define the coefficient of determination as the proportional reduction in the squared error of Y, which we obtain by knowing the values of x1, x2, ..., xk." (Hildebrand, Ott and Gray) As in simple regression, R² = SSR/SST = 1 − SSE/SST.

Example (Sales vs. Adv Exp and Income): From the output, R-Sq = 89.8%. Interpretation: 89.8% of the variation in Sales is explained by a multiple regression model with Adv Exp and Income as predictors.
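
Both summary statistics follow directly from the residuals. A sketch of the arithmetic (same data as above):

```python
import numpy as np

sales  = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=float)
adv    = np.array([1, 2, 1, 3, 2, 4, 3, 5, 5, 6], dtype=float)
income = np.array([32, 38, 42, 35, 41, 43, 46, 44, 48, 45], dtype=float)

X = np.column_stack([np.ones_like(sales), adv, income])
b = np.linalg.lstsq(X, sales, rcond=None)[0]

n, k = len(sales), 2
sse = np.sum((sales - X @ b) ** 2)         # 2.0498
sst = np.sum((sales - sales.mean()) ** 2)  # 20.0
s_eps = np.sqrt(sse / (n - k - 1))         # 0.54114, the S on the output
r_sq = 1 - sse / sst                       # 0.898
print(s_eps, r_sq)
```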

12.2 Estimating Multiple Regression

Adjusted Coefficient of Determination:
Ra² = 1 − [SSE/(n − (k + 1))] / [SST/(n − 1)] = 1 − [(n − 1)/(n − (k + 1))] (SSE/SST)

SSE and SST are each divided by their degrees of freedom. Since (n − 1)/(n − (k + 1)) > 1, Ra² < R².

Why use Ra²? SST is fixed, regardless of the number of predictors. SSE decreases when more predictors are used, so R² increases when more predictors are used. However, Ra² can decrease when another predictor is added to the fitted model, even though R² increases. Why? The decrease in SSE is offset by the loss of a degree of freedom in [n − (k + 1)] for SSE.

The following example illustrates this. Example: for a fitted model with 10 observations, suppose SST = 50. When k = 2, SSE = 5. When k = 3, SSE = 4.5.

k = 2: R² = 1 − 5/50 = .90;  Ra² = 1 − (9/7)(5/50) = .871
k = 3: R² = 1 − 4.5/50 = .91;  Ra² = 1 − (9/6)(4.5/50) = .865

Even though there has been a modest increase in R², Ra² has decreased.
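
The arithmetic in this toy example is easy to verify; a small sketch (not from the slides):

```python
# Toy example from the slide: n = 10 observations, SST = 50.
def r_sq_pair(sse, sst, n, k):
    """Plain and adjusted coefficients of determination."""
    r2 = 1 - sse / sst
    r2_adj = 1 - (n - 1) / (n - (k + 1)) * (sse / sst)
    return r2, r2_adj

print(r_sq_pair(5.0, 50.0, n=10, k=2))   # (0.90, 0.871)
print(r_sq_pair(4.5, 50.0, n=10, k=3))   # (0.91, 0.865)
```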

12.2 Estimating Multiple Regression

Sequential Sums of Squares (Seq SS). Concept: the incremental contributions to SS(Regression) when the predictors enter the model in the order specified by the user.

Example (Sales vs. Adv Exp and Income): The Minitab output follows when Adv Exp is entered first.

Analysis of Variance
Source      DF  SS       MS      F      P
Regression  2   17.9502  8.9751  30.65  0.000

Source   DF  Seq SS
Adv Exp  1   14.4928
Income   1   3.4574

SS(Regression using x1 and x2): SSR(x1, x2) = 17.9502
SS(Regression using x1 only): SSR(x1) = 14.4928
SS(Regression for x2 when x1 is already in the model): SSR(x2 | x1) = 3.4574

Example (Sales vs. Adv Exp and Income): The Minitab output follows when Income is entered first.

Analysis of Variance
Source      DF  SS       MS      F      P
Regression  2   17.9502  8.9751  30.65  0.000

Source   DF  Seq SS
Income   1   15.2408
Adv Exp  1   2.7093
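
The sequential sums of squares can be reproduced by fitting the nested models in order and differencing the regression sums of squares. A sketch (same data as above):

```python
import numpy as np

sales  = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=float)
adv    = np.array([1, 2, 1, 3, 2, 4, 3, 5, 5, 6], dtype=float)
income = np.array([32, 38, 42, 35, 41, 43, 46, 44, 48, 45], dtype=float)

def ssr(y, *cols):
    """Regression sum of squares for a model with an intercept."""
    X = np.column_stack([np.ones_like(y)] + list(cols))
    fit = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return np.sum((fit - y.mean()) ** 2)

both = ssr(sales, adv, income)                        # 17.9502
print(ssr(sales, adv), both - ssr(sales, adv))        # 14.4928, 3.4574 (Adv Exp first)
print(ssr(sales, income), both - ssr(sales, income))  # 15.2408, 2.709  (Income first)
```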

12.2 Estimating Multiple Regression

SS(Regression using x1 and x2): SSR(x1, x2) = 17.9502 (unchanged)
SS(Regression using x2 only): SSR(x2) = 15.2408
SS(Regression for x1 when x2 is already in the model): SSR(x1 | x2) = 2.7093

Regardless of which predictor is entered first, the sequential sums of squares, when added, equal SS(Regression).

Section 12.3 Inferences in Multiple Regression

12.3 Inferences in Multiple Regression

Objective: build a parsimonious model, with as few predictors as necessary. We must now assume the errors in the population model are normally distributed.

F-test for the overall model:
H0: β1 = β2 = ... = βk = 0  vs.  Ha: at least one βj ≠ 0

12.3 Inferences in Multiple Regression

Test statistic: F = MS(Regression)/MS(Residual Error). Concept: "If SS(Regression) is large relative to SS(Residual), the indication is that there is real predictive value in [some of] the independent variables x1, x2, ..., xk." (Hildebrand, Ott and Gray)

Decision rule: reject H0 if F > F(α; k, n − k − 1), or reject H0 if p-value < α.

Example (Sales vs. Adv Exp and Income): The Minitab output follows.

Analysis of Variance
Source          DF  SS       MS      F      P
Regression      2   17.9502  8.9751  30.65  0.000
Residual Error  7   2.0498   0.2928
Total           9   20.0000

Test H0: β1 = β2 = 0 vs. Ha: at least one βj ≠ 0, at the 5% level. Since F = 30.65 > F(.05; 2, 7) = 4.74, reject H0: β1 = β2 = 0 at the 5% level. Or, since p-value = 0.000 < .05, reject H0: β1 = β2 = 0 at the 5% level. Implication: at least one of the x's has some predictive power.
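
The F statistic, its p-value, and the critical value can be reproduced from the ANOVA quantities above; a sketch using scipy for the F distribution:

```python
from scipy import stats

df_reg, df_err = 2, 7
ms_reg, ms_err = 8.9751, 0.2928   # MS values from the ANOVA table above

F = ms_reg / ms_err                       # 30.65
p = stats.f.sf(F, df_reg, df_err)         # well below 0.001, printed as 0.000
crit = stats.f.ppf(0.95, df_reg, df_err)  # 4.74
print(F, p, crit)
```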

12.3 Inferences in Multiple Regression

t-test for significance of an individual predictor:
H0: βj = 0 vs. Ha: βj ≠ 0, j = 1, 2, ..., k
H0 implies that xj has no additional predictive value as the last predictor into a model that contains all the other predictors.

Test statistic: t = β̂j / s(β̂j), where s(β̂j) is the estimated standard error of β̂j. In Minitab notation, T = (Coef)/(SE Coef).

Decision rule: reject H0 if |t| > t(α/2, n − k − 1), or reject H0 if p-value < α.

Warning: limit the number of t-tests to avoid a high overall Type I error rate.

Example (Sales vs. Adv Exp and Income): The Minitab output follows.

Predictor  Coef     SE Coef  T      P      VIF
Constant   -5.091   1.720    -2.96  0.021
Adv Exp    0.4158   0.1367   3.04   0.019  1.8
Income     0.16330  0.04752  3.44   0.011  1.8

Test H0: β1 = 0 vs. Ha: β1 ≠ 0 at the 5% level.
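
The coefficient table can be reproduced from s²(X'X)⁻¹; a sketch (same data as above):

```python
import numpy as np
from scipy import stats

sales  = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=float)
adv    = np.array([1, 2, 1, 3, 2, 4, 3, 5, 5, 6], dtype=float)
income = np.array([32, 38, 42, 35, 41, 43, 46, 44, 48, 45], dtype=float)

X = np.column_stack([np.ones_like(sales), adv, income])
b = np.linalg.lstsq(X, sales, rcond=None)[0]

df = len(sales) - X.shape[1]                          # n - (k+1) = 7
mse = np.sum((sales - X @ b) ** 2) / df               # 0.2928
se = np.sqrt(mse * np.diag(np.linalg.inv(X.T @ X)))   # SE Coef column
t = b / se                                            # T column: -2.96, 3.04, 3.44
p = 2 * stats.t.sf(np.abs(t), df)                     # P column: 0.021, 0.019, 0.011
print(np.round(np.column_stack([b, se, t, p]), 4))
```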

12.3 Inferences in Multiple Regression

Since t = 3.04 > t(.025, 7) = 2.365, reject H0: β1 = 0 at the 5% level. Or, since p-value = 0.019 < .05, reject H0: β1 = 0 at the 5% level. Implication: Advertising Expenditures provides additional predictive value in a model that already has Income as a predictor.

Multicollinearity. Concept: high correlation between at least one pair of predictors, e.g., x1 and x2. Correlated x's provide no new information. Example: in predicting the heights of adults using the length of the right leg, the length of the left leg would be of little additional value.

Symptoms of multicollinearity:
Wrong signs for the β̂'s.
A t-test isn't significant even though you believe the predictor is useful and should be in the fitted model.

Detection of multicollinearity: R²(xj · x1 ... xj−1 xj+1 ... xk) is the coefficient of determination obtained by regressing xj on the remaining (k − 1) predictors, denoted by R²j. If R²j > .9, this is a signal that multicollinearity is present. This criterion can be expressed in a different way.

12.3 Inferences in Multiple Regression

Let VIFj denote the Variance Inflation Factor of the jth predictor:
VIFj = 1 / (1 − R²j)
If VIFj > 10, this is a signal that multicollinearity is present.

Why is VIFj called the variance inflation factor for the jth predictor? The estimated standard error of β̂j in a multiple regression is:
s(β̂j) = sε / √[ Σ(xij − x̄j)² (1 − R²j) ],  or equivalently  s(β̂j) = sε √[ VIFj / Σ(xij − x̄j)² ]

If VIFj is large, so is s(β̂j), which leads to a t-test that is not statistically significant. "The VIF measures how much the variance (square of the standard error) of a coefficient is increased because of collinearity." (Hildebrand, Ott and Gray)
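
With only two predictors, R²j is simply the R² from regressing one predictor on the other, so the VIF is easy to verify; a sketch:

```python
import numpy as np

adv    = np.array([1, 2, 1, 3, 2, 4, 3, 5, 5, 6], dtype=float)
income = np.array([32, 38, 42, 35, 41, 43, 46, 44, 48, 45], dtype=float)

# R^2 from regressing Adv Exp on Income; the squared correlation works here
# because there is a single right-hand-side predictor.
r2_j = np.corrcoef(adv, income)[0, 1] ** 2   # 0.432
vif = 1.0 / (1.0 - r2_j)                     # 1.8
print(r2_j, vif)
```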

12.3 Inferences in Multiple Regression

Example (Sales vs. Adv Exp and Income): The Minitab output follows.

The regression equation is
Sales = - 5.09 + 0.416 Adv Exp + 0.163 Income

Predictor  Coef     SE Coef  T      P      VIF
Constant   -5.091   1.720    -2.96  0.021
Adv Exp    0.4158   0.1367   3.04   0.019  1.8
Income     0.16330  0.04752  3.44   0.011  1.8

Since both VIFs = 1.8 < 10, multicollinearity between Advertising Expenditures and Median Household Income is not a problem.

The Minitab output for regressing Adv Exp on Income follows:

The regression equation is
Adv Exp = - 6.26 + 0.229 Income
S = 1.39955  R-Sq = 43.2%

Since R² = .432, VIF = 1/(1 − .432) = 1.8, as shown.

To illustrate multicollinearity, consider Exercise 12.19. Exercise 12.19: A study of demand for imported subcompact cars consists of data from 12 metropolitan areas. The variables are:
Demand: imported subcompact car sales as a percentage of total sales
Educ: average number of years of schooling completed by adults
Income: per capita income
Popn: area population
Famsize: average size of intact families

12.3 Inferences in Multiple Regression

The Minitab output follows:

The regression equation is
Demand = - 1.3 + 5.55 Educ + 0.89 Income + 1.92 Popn - 11.4 Famsize

Predictor  Coef     SE Coef  T      P      VIF
Constant   -1.3     57.98    -0.02  0.98
Educ       5.55     2.702    2.05   0.079  8.8
Income     0.885    1.308    0.68   0.52   4.0
Popn       1.925    1.371    1.40   0.203  1.6
Famsize    -11.389  6.669    -1.71  0.131  12.3

S = 0.6868  R-Sq = 96.2%  R-Sq(adj) = 94.1%

Is there a multicollinearity (MC) problem? Since VIF = 12.3 > 10 for the variable Famsize, there is a MC problem. Note that the p-value for the F test = 0.000, indicating that at least one of the x's has predictive value. However, the smallest p-value for any t-test is 0.079, indicating that no single one of the x's shows significant predictive value as the last predictor in. What is the source of the MC problem? A matrix plot, in conjunction with the correlations, could be useful. This exercise will be revisited in Section 13.1.

Remedies if multicollinearity is a problem:
Eliminate one or more of the collinear predictors.
Form a new predictor that is a surrogate for the collinear predictors.
Multicollinearity can occur if one of the predictors is x². It can often be eliminated by using (x − x̄)² as the predictor instead.
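
The centering remedy on the last slide is easy to demonstrate: x and x² are typically highly correlated, while (x − x̄) and (x − x̄)² are much less so. A sketch with made-up x values (illustrative only, not from the exercise):

```python
import numpy as np

x = np.arange(1.0, 11.0)   # hypothetical predictor values 1..10
xc = x - x.mean()          # centered predictor

print(np.corrcoef(x, x ** 2)[0, 1])     # about 0.97: x and x^2 nearly collinear
print(np.corrcoef(xc, xc ** 2)[0, 1])   # 0.0 here: symmetry removes the correlation
```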

Section 12.4 Testing a Subset of the Regression

12.4 Testing a Subset of the Regression

To illustrate the concept, consider Exercise 13.55. Exercise 13.55: A bank that offers charge cards to customers studies the yearly purchase amount (in thousands of dollars) on the card as related to the age, income (in thousands of dollars), home ownership, and years of education of the cardholder. The variable Owner equals 1 if the cardholder owns a home and 0 if the cardholder rents a home. The other variables are self-explanatory. The original data set has information on 160 cardholders. Upon further examination of the data, you decide to remove the data for cardholder 19 because this is an older individual who has a high income from having saved early in life and having invested successfully. This cardholder travels extensively and frequently uses her/his charge card.

Problem to be investigated: the Income and Education predictors measure the economic well-being of a cardholder. Do these predictors have any predictive value given the Age and home ownership variables? The null hypothesis is that the β's corresponding to these predictors are simultaneously equal to 0.

12.4 Testing a Subset of the Regression

General case. Complete model:
E(Y) = β0 + β1 x1 + ... + βg xg + βg+1 xg+1 + ... + βk xk
Null hypothesis: H0: βg+1 = ... = βk = 0
Reduced model:
E(Y) = β0 + β1 x1 + ... + βg xg

Exercise 13.55. Complete model:
E(Y) = β0 + β1(Age) + β2(Owner) + β3(Income) + β4(Educn)
Null hypothesis: H0: βIncome = βEducn = 0
Reduced model: E(Y) = β0 + β1(Age) + β2(Owner)

The test statistic is called the Partial F statistic:
Partial F = { [SSE(reduced) − SSE(complete)] / [df(reduced) − df(complete)] } / { SSE(complete)/df(complete) }

Rationale: SSE decreases as new terms are added to the model. If the x's from (g + 1) to k have predictive ability, then SSE(complete) should be much smaller than SSE(reduced), and their difference [SSE(reduced) − SSE(complete)] should be large.

12.4 Testing a Subset of the Regression

Note: df(reduced) − df(complete) = k − g; df(complete) = n − (k + 1).
Note: SSE(complete)/df(complete) = MSE(complete).
Note: other versions of the partial F-test are in H, O & G.

Decision criterion: reject H0 if Partial F > F(α; k − g, n − k − 1).

Exercise 13.55: H0: βIncome = βEducn = 0

F = { [SSE(reduced) − SSE(complete)] / [df(reduced) − df(complete)] } / { SSE(complete)/df(complete) }
  = [(1.2880 − 1.1937)/(156 − 154)] / 0.0078 = 6.045

(SSE and df values are taken from the Minitab output that follows.) Since 6.045 > F(.05; 2, 154) = 3.055, reject H0. Either Income or Education adds predictive value to a model that contains Age and Owner.

Regression Analysis: Purch_1 versus Age_1, Income_1, Owner_1, Educn_1
The regression equation is
Purch_1 = - .797 + .336 Age_1 + .97 Income_1 + .11 Owner_1 + .98 Educn_1

S = 0.08804  R-Sq = 95.0%  R-Sq(adj) = 94.8%

Analysis of Variance
Source          DF   SS       MS      F      P
Regression      4    22.4678  5.6170  724.6  0.000
Residual Error  154  1.1937   0.0078
Total           158  23.6616
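
The partial F arithmetic can be reproduced from the two ANOVA tables (the complete-model table above and the reduced-model table below); a sketch:

```python
from scipy import stats

sse_red, df_red = 1.2880, 156   # reduced model: Age, Owner
sse_com, df_com = 1.1937, 154   # complete model: Age, Owner, Income, Educn

F = ((sse_red - sse_com) / (df_red - df_com)) / (sse_com / df_com)
crit = stats.f.ppf(0.95, df_red - df_com, df_com)
# F is about 6.08 with the unrounded MSE (the slide's 6.045 uses MSE = 0.0078);
# crit is about 3.06. Either way, reject H0 at the 5% level.
print(F, crit)
```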

12.4 Testing a Subset of the Regression

Regression Analysis: PURCH_1 versus AGE_1, OWNER_1
The regression equation is
PURCH_1 = - .6 + .4 AGE_1 + . OWNER_1

S = 0.09085  R-Sq = 94.6%  R-Sq(adj) = 94.5%

Analysis of Variance
Source          DF   SS      MS      F        P
Regression      2    22.374  11.187  1355.48  0.000
Residual Error  156  1.2880  0.008
Total           158  23.662

Section 12.5 Forecasting Using Multiple Regression

12.5 Forecasting Using Multiple Regression

A major purpose of regression is to make predictions using the fitted model. In simple regression, we could obtain a confidence interval for E(Y) or a prediction interval for an individual Y. In both cases, the danger of extrapolation must be considered. Extrapolation occurs when using values of x far outside the range of x-values used to build the fitted model.

12.5 Forecasting Using Multiple Regression

In regressing Sales on Advertising Expenditures, Advertising Expenditures ranged from 1 to 6. It would be incorrect to obtain a confidence interval for E(Y) or a prediction interval for Y far outside this range; we don't know whether the fitted model is valid outside this range. In multiple regression, one must consider not only the range of each predictor but the set of values of the predictors taken together.

Consider the following example. Example:
Y = sales revenue per region (tens of thousands of dollars)
x1 = advertising expenditures (thousands of dollars)
x2 = median household income (thousands of dollars)

The values for x1 and x2 are:

Region  A   B   C   D   E   F   G   H   I   J
x1      1   2   1   3   2   4   3   5   5   6
x2      32  38  42  35  41  43  46  44  48  45

The scatterplot of x1 vs. x2 follows. [Scatterplot of Adv Exp vs. Income]
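
Looking ahead to the prediction output on the next slide, the fit and both intervals for a new observation can be reproduced from the fitted model; a sketch (the SE Fit comes from x0'(X'X)⁻¹x0, and the prediction interval adds MSE to its square):

```python
import numpy as np
from scipy import stats

sales  = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=float)
adv    = np.array([1, 2, 1, 3, 2, 4, 3, 5, 5, 6], dtype=float)
income = np.array([32, 38, 42, 35, 41, 43, 46, 44, 48, 45], dtype=float)

X = np.column_stack([np.ones_like(sales), adv, income])
b = np.linalg.lstsq(X, sales, rcond=None)[0]
df = len(sales) - X.shape[1]
mse = np.sum((sales - X @ b) ** 2) / df

x0 = np.array([1.0, 5.0, 35.0])     # new observation: Adv Exp = 5, Income = 35
fit = x0 @ b                        # 2.703
se_fit = np.sqrt(mse * x0 @ np.linalg.inv(X.T @ X) @ x0)   # 0.530
t = stats.t.ppf(0.975, df)          # 2.365

print(fit - t * se_fit, fit + t * se_fit)  # 95% CI: (1.451, 3.956)
se_pred = np.sqrt(mse + se_fit ** 2)       # prediction-error standard deviation
print(fit - t * se_pred, fit + t * se_pred)  # 95% PI: (0.913, 4.494)
```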

12.5 Forecasting Using Multiple Regression

Extrapolation occurs when using the fitted model to predict outside the elbow-shaped region of the scatterplot. This would occur, for example, when Advertising Expenditures is 5 and Income is 35. The Minitab output follows.

Regression Analysis: Sales versus Adv Exp, Income
The regression equation is
Sales = - 5.09 + 0.416 Adv Exp + 0.163 Income

Predicted Values for New Observations
New Obs  Fit    SE Fit  95% CI          95% PI
1        2.703  0.530   (1.451, 3.956)  (0.913, 4.494) X

X denotes a point that is an outlier in the predictors.

Values of Predictors for New Observations
New Obs  Adv Exp  Income
1        5.00     35.0

Minitab indicates that this set of values for x1 and x2 is an outlier.

Keywords: Chapter 12
Multiple regression model
Partial slopes
First-order model
Adjusted coefficient of determination, Ra²
Multicollinearity
Variance inflation factor
Overall F test
t-test
Complete model
Reduced model
Partial F test
Extrapolation

Summary of Chapter 12
The multiple linear regression model
Interpreting the slope coefficient of a single predictor in a multiple regression model
Understanding the difference between the coefficient of determination (R²) and the adjusted coefficient of determination (Ra²)
The detection of multicollinearity and its impact
Using the F statistic to test the overall utility of the predictors
Using the t-test to test the additional value of a single predictor
Using the partial F test to assess the significance of a subset of predictors
The meaning of extrapolation in multiple regression