Multiple Regression Methods
- Malcolm Davidson
Chapter 1: Multiple Regression Methods
Hildebrand, Ott and Gray, Basic Statistical Ideas for Managers, Second Edition

Learning Objectives for Ch. 1
- The multiple linear regression model
- How to interpret a slope coefficient in the multiple regression model
- The reason for using the adjusted coefficient of determination in multiple regression
- The meaning of multicollinearity and how to detect it
- How to test the overall utility of the predictors
- How to test the additional value of a single predictor
- How to test the significance of a subset of predictors in multiple regression
- The meaning of extrapolation

Section 1.1 The Multiple Regression Model
1.1 The Multiple Regression Model

Example (with two predictors):
Y = sales revenue per region (tens of thousands of dollars)
x1 = advertising expenditures (thousands of dollars)
x2 = median household income (thousands of dollars)

The data: a table of Sales, Adv Exp, and Income for ten regions, A through J.

[3D scatterplot of Sales vs. Income vs. Adv Exp]

Objective: fit a plane through the points.
1.1 The Multiple Regression Model

Population model:
E(Y) = β0 + β1 x1 + ... + βk xk
or
Y = β0 + β1 x1 + ... + βk xk + ε
where ε is the error term.

Interpretation of any βj: the change in E(Y) per unit change in xj, when all other independent variables are held constant. βj is called the partial slope, j = 1, 2, ..., k.

First-order model: a model with no higher-order terms or interaction terms. An interaction term is the product of two predictors, x1 x2; when an interaction term is present, the change in E(Y) per unit change in x1 depends on the value of x2.

Section 1.2 Estimating Multiple Regression
1.2 Estimating Multiple Regression

Criterion used to estimate the β's: the method of least squares, which minimizes the sum of squared residuals. Symbolically: minimize Σ(yi − ŷi)². We will use software to do the calculations.

Example (Y = Sales, x1 = Advertising Expenditures, x2 = Median Household Income): the Minitab output follows.

Regression Analysis: Sales versus Adv Exp and Income
The regression equation is
Sales = β̂0 + 0.416 Adv Exp + β̂2 Income

Predictor   Coef    SE Coef   T   P   VIF
Constant
Adv Exp     0.416
Income

S = 0.5411   R-Sq = 89.8%   R-Sq(adj) = 86.8%

The fitted multiple regression model is not the same as the pair of simple regressions of Y on x1 alone and of Y on x2 alone. "The coefficient of an independent variable xj in a multiple regression equation does not, in general, equal the coefficient that would apply to that variable in a simple linear regression. In multiple regression, the coefficient refers to the effect of changing that xj variable while other independent variables stay constant. In simple linear regression, all other potential independent variables are ignored." (Hildebrand, Ott and Gray)
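As a numerical sketch of the least-squares criterion, the plane can be fitted with numpy. The data below are hypothetical stand-ins: the textbook's Sales/Adv Exp/Income values are not reproduced in these notes.

```python
import numpy as np

# Hypothetical stand-in data for the 10-region Sales example; these are NOT
# the textbook's values.
adv    = np.array([1., 1., 2., 3., 3., 4., 4., 5., 5., 6.])              # x1
income = np.array([30., 34., 32., 35., 41., 38., 44., 40., 46., 48.])   # x2
sales  = np.array([2.1, 2.4, 2.6, 3.0, 3.5, 3.4, 4.0, 3.9, 4.4, 4.7])   # Y

# Least squares chooses b to minimize the sum of squared residuals sum((y - Xb)^2).
X = np.column_stack([np.ones(len(sales)), adv, income])  # intercept + 2 predictors
b, *_ = np.linalg.lstsq(X, sales, rcond=None)

fitted = sales_hat = X @ b
resid = sales - fitted
print("b0, b1, b2 =", b)
# With an intercept in the model, the residuals sum to (numerically) zero.
print("sum of residuals:", resid.sum())
```

The normal equations X'(y − Xb) = 0 characterize the minimizing b, which is why the residuals are orthogonal to every column of X.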
1.2 Estimating Multiple Regression

Interpretation of 0.416: an additional unit (an increase of $1,000) of Advertising Expenditures leads to a 0.416-unit increase in Sales when Median Household Income is held fixed, i.e., regardless of whether x2 is 30 or 48. Does this seem reasonable? If Advertising Expenditures are increased by 1 unit, do you expect Sales to increase by 0.416 units regardless of whether the region has median income of $30,000 or $48,000?

The output also gives the estimate of σε, both directly and indirectly. Indirectly, σε can be estimated as
sε = sqrt[(Sum of Squared Residuals)/dfError] = sqrt[MS(Residual Error)]
where dfError = n − (k + 1) = n − k − 1. The estimate of σε can also be read directly from S on the output.

Example (Sales vs. Adv Exp and Income): the Minitab output follows.

Regression Analysis: Sales versus Adv Exp and Income
The regression equation is
Sales = β̂0 + 0.416 Adv Exp + β̂2 Income

S = 0.5411   R-Sq = 89.8%   R-Sq(adj) = 86.8%

Analysis of Variance
Source           DF   SS   MS       F   P
Regression        2
Residual Error    7        0.2928
Total             9
1.2 Estimating Multiple Regression

Use the output to locate the estimate of σε. From the output, sε = 0.5411. Or, sε = sqrt[MS(Error)] = sqrt(0.2928) = 0.5411.

Coefficient of Determination, R² (or R²_{y·x1 x2 ... xk}). Concept: "we define the coefficient of determination as the proportional reduction in the squared error of Y, which we obtain by knowing the values of x1, x2, ..., xk." (Hildebrand, Ott and Gray)

As in simple regression, R² = SSR/SST = 1 − SSE/SST.

Example (Sales vs. Adv Exp and Income): from the output, R-Sq = 89.8%. Interpretation: 89.8% of the variation in Sales is explained by a multiple regression model with Adv Exp and Income as predictors.
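The quantities above (SSE, SST, R², sε) can be computed directly from a fit. A sketch on hypothetical data (so the numbers will not match the 89.8% and 0.5411 in the output):

```python
import numpy as np

# Hypothetical data (not the textbook's); same structure as the Sales example.
adv    = np.array([1., 1., 2., 3., 3., 4., 4., 5., 5., 6.])
income = np.array([30., 34., 32., 35., 41., 38., 44., 40., 46., 48.])
y      = np.array([2.1, 2.4, 2.6, 3.0, 3.5, 3.4, 4.0, 3.9, 4.4, 4.7])

n, k = len(y), 2
X = np.column_stack([np.ones(n), adv, income])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b

SSE = float(resid @ resid)                # sum of squared residuals
SST = float(((y - y.mean()) ** 2).sum())  # total sum of squares
SSR = SST - SSE                           # regression sum of squares

R2  = 1 - SSE / SST                       # R-Sq = SSR/SST = 1 - SSE/SST
s_e = (SSE / (n - (k + 1))) ** 0.5        # s = sqrt(MS(Residual Error)), df = n-k-1
print(f"R-Sq = {R2:.1%}, s = {s_e:.4f}")
```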
1.2 Estimating Multiple Regression

Adjusted Coefficient of Determination:
R²a = 1 − [SSE/(n − (k + 1))] / [SST/(n − 1)]
    = 1 − [(n − 1)/(n − (k + 1))] (SSE/SST)
SSE and SST are each divided by their degrees of freedom. Since (n − 1)/(n − (k + 1)) > 1, R²a < R².

Why use R²a? SST is fixed, regardless of the number of predictors. SSE decreases when more predictors are used, so R² increases when more predictors are used. However, R²a can decrease when another predictor is added to the fitted model, even though R² increases. Why? The decrease in SSE is offset by the loss of a degree of freedom in [n − (k + 1)] for SSE.

The following example illustrates this. Example: for a fitted model with 10 observations, suppose SST = 50. When k = 2, SSE = 5; when k = 3, SSE = 4.5.
k = 2: R² = 1 − 5/50 = .90,   R²a = 1 − (9/7)(5/50) = .871
k = 3: R² = 1 − 4.5/50 = .91, R²a = 1 − (9/6)(4.5/50) = .865
Even though there has been a modest increase in R², R²a has decreased.
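The illustration can be checked with a few lines of arithmetic, assuming the example's values are n = 10, SST = 50, and SSE = 5 (k = 2) vs. 4.5 (k = 3):

```python
def r2_and_adjusted(sse, sst, n, k):
    """Return (R-squared, adjusted R-squared) for a model with k predictors."""
    r2 = 1 - sse / sst
    r2_adj = 1 - (n - 1) / (n - (k + 1)) * (sse / sst)
    return r2, r2_adj

# Assumed illustration values: n = 10 observations, SST = 50.
r2_k2, adj_k2 = r2_and_adjusted(sse=5.0, sst=50.0, n=10, k=2)
r2_k3, adj_k3 = r2_and_adjusted(sse=4.5, sst=50.0, n=10, k=3)

print(round(r2_k2, 3), round(adj_k2, 3))   # 0.9   0.871
print(round(r2_k3, 3), round(adj_k3, 3))   # 0.91  0.865
# R-squared rose when the third predictor was added, but adjusted R-squared fell.
```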
1.2 Estimating Multiple Regression

Sequential Sum of Squares (Seq SS). Concept: the incremental contributions to SS(Regression) when the predictors enter the model in the order specified by the user.

Example (Sales vs. Adv Exp and Income): when Adv Exp is entered first, Minitab's Analysis of Variance output reports a Seq SS line for Adv Exp, then one for Income. In symbols:
SSR(x1, x2) = SS(Regression using x1 and x2)
SSR(x1)     = SS(Regression using x1 only)
SSR(x2 | x1) = SS(Regression for x2 when x1 is already in the model)
and SSR(x1, x2) = SSR(x1) + SSR(x2 | x1).

Example (Sales vs. Adv Exp and Income): a second run, with Income entered first, reports the Seq SS for Income, then for Adv Exp.
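The order-dependence of Seq SS, and the fact that the pieces always add back to SS(Regression), can be demonstrated numerically (hypothetical data, not the textbook's):

```python
import numpy as np

# Hypothetical data (not the textbook's values).
x1 = np.array([1., 1., 2., 3., 3., 4., 4., 5., 5., 6.])
x2 = np.array([30., 34., 32., 35., 41., 38., 44., 40., 46., 48.])
y  = np.array([2.1, 2.4, 2.6, 3.0, 3.5, 3.4, 4.0, 3.9, 4.4, 4.7])

def ssr(y, *predictors):
    """SS(Regression) for a least-squares fit of y on the given predictors."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    sst = ((y - y.mean()) ** 2).sum()
    return float(sst - resid @ resid)

ssr_both = ssr(y, x1, x2)
# Order 1: x1 first, then x2.  Seq SS: SSR(x1), then SSR(x2 | x1).
seq1 = [ssr(y, x1), ssr_both - ssr(y, x1)]
# Order 2: x2 first, then x1.  Seq SS: SSR(x2), then SSR(x1 | x2).
seq2 = [ssr(y, x2), ssr_both - ssr(y, x2)]

# The individual Seq SS entries differ by order, but either way they
# add up to SS(Regression).
print(seq1, seq2, ssr_both)
```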
1.2 Estimating Multiple Regression

With Income entered first:
SSR(x1, x2) = SS(Regression using x1 and x2)  {unchanged}
SSR(x2)     = SS(Regression using x2 only)
SSR(x1 | x2) = SS(Regression for x1 when x2 is already in the model) = .793
Regardless of which predictor is entered first, the sequential sums of squares, when added, equal SS(Regression).

Section 1.3 Inferences in Multiple Regression

Objective: build a parsimonious model, with as few predictors as necessary. We must now assume that the errors in the population model are normally distributed.

F-test for the overall model:
H0: β1 = β2 = ... = βk = 0  vs.  Ha: at least one βj ≠ 0
1.3 Inferences in Multiple Regression

Test statistic: F = MS(Regression)/MS(Residual Error). Concept: "If SS(Regression) is large relative to SS(Residual), the indication is that there is real predictive value in [some of] the independent variables x1, x2, ..., xk." (Hildebrand, Ott and Gray)

Decision rule: reject H0 if F > F(α; k, n − k − 1), or reject H0 if p-value < α.

Example (Sales vs. Adv Exp and Income): the Minitab output follows.

Analysis of Variance
Source           DF   SS   MS       F       P
Regression        2                 30.65   0.000
Residual Error    7        0.2928
Total             9

Test H0: β1 = β2 = 0 vs. Ha: at least one βj ≠ 0 at the 5% level. Since F = 30.65 > F(.05; 2, 7) = 4.74, reject H0 at the 5% level. Or, since p-value = .000 < .05, reject H0 at the 5% level. Implication: at least one of the x's has some predictive power.
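The overall F statistic is just a ratio of mean squares; a sketch on hypothetical data (the 4.74 critical value is the tabled F(.05; 2, 7) used in the example):

```python
import numpy as np

# Hypothetical data (not the textbook's); k = 2 predictors, n = 10.
x1 = np.array([1., 1., 2., 3., 3., 4., 4., 5., 5., 6.])
x2 = np.array([30., 34., 32., 35., 41., 38., 44., 40., 46., 48.])
y  = np.array([2.1, 2.4, 2.6, 3.0, 3.5, 3.4, 4.0, 3.9, 4.4, 4.7])

n, k = len(y), 2
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b

SSE = float(resid @ resid)
SST = float(((y - y.mean()) ** 2).sum())
SSR = SST - SSE

MSR = SSR / k              # MS(Regression), df = k
MSE = SSE / (n - k - 1)    # MS(Residual Error), df = n - k - 1
F = MSR / MSE
print(f"F = {F:.2f}")
# Reject H0: beta1 = beta2 = 0 when F exceeds the tabled F(.05; k, n-k-1),
# here F(.05; 2, 7) = 4.74.
print("reject H0:", F > 4.74)
```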
1.3 Inferences in Multiple Regression

t-test for the significance of an individual predictor:
H0: βj = 0 vs. Ha: βj ≠ 0, j = 1, 2, ..., k
H0 implies that xj has no additional predictive value when added, as the last predictor in, to a model that contains all the other predictors.

Test statistic: t = β̂j / s_{β̂j}, where s_{β̂j} is the estimated standard error of β̂j. In Minitab notation, T = (Coef)/(SE Coef).

Decision rule: reject H0 if |t| > t(α/2; n − k − 1), or reject H0 if p-value < α.

Warning: limit the number of t-tests to avoid a high overall Type I error rate.

Example (Sales vs. Adv Exp and Income): the Minitab output follows.

Predictor   Coef    SE Coef   T   P   VIF
Constant
Adv Exp     0.416
Income

Test H0: β1 = 0 vs. Ha: β1 ≠ 0 at the 5% level.
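The T column Minitab reports is just Coef divided by SE Coef. A sketch of where those standard errors come from, on hypothetical data: the estimated coefficient variances are the diagonal of s²(X'X)⁻¹.

```python
import numpy as np

# Hypothetical data (not the textbook's); k = 2 predictors, n = 10.
x1 = np.array([1., 1., 2., 3., 3., 4., 4., 5., 5., 6.])
x2 = np.array([30., 34., 32., 35., 41., 38., 44., 40., 46., 48.])
y  = np.array([2.1, 2.4, 2.6, 3.0, 3.5, 3.4, 4.0, 3.9, 4.4, 4.7])

n, k = len(y), 2
X = np.column_stack([np.ones(n), x1, x2])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                    # least-squares coefficients
resid = y - X @ b

s2 = float(resid @ resid) / (n - k - 1)  # MS(Residual Error)
se = np.sqrt(s2 * np.diag(XtX_inv))      # SE Coef for each coefficient
t = b / se                               # T = Coef / SE Coef
print(np.round(t, 2))
# Reject H0: beta_j = 0 at the 5% level when |t| > t(.025; n-k-1) = t(.025; 7) = 2.365.
```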
1.3 Inferences in Multiple Regression

Since the computed t exceeds t(.025; 7) = 2.365, reject H0: β1 = 0 at the 5% level. Or, since p-value = .019 < .05, reject H0: β1 = 0 at the 5% level. Implication: Advertising Expenditures provides additional predictive value to a model that already has Income as a predictor.

Multicollinearity. Concept: high correlation between at least one pair of predictors, e.g., x1 and x2. Correlated x's provide no new information. Example: in predicting the heights of adults using the length of the right leg, the length of the left leg would be of little additional value.

Symptoms of multicollinearity:
- Wrong signs for the β̂'s
- A t-test that isn't significant even though you believe the predictor is useful and should be in the fitted model

Detection of multicollinearity: let R²j = R²_{xj · x1 ... x(j−1) x(j+1) ... xk}, the coefficient of determination obtained by regressing xj on the remaining (k − 1) predictors. If R²j > .9, this is a signal that multicollinearity is present. This criterion can be expressed in a different way.
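The detection rule above can be coded directly: regress each xj on the other predictors, get R²j, and convert it to a VIF. The predictors below are hypothetical; x3 is deliberately built as a near-copy of x1 + x2 so that its R²j is close to 1.

```python
import numpy as np

def vif(xj, others):
    """VIF_j = 1/(1 - R^2_j), where R^2_j comes from regressing x_j
    on the remaining predictors (with an intercept)."""
    X = np.column_stack([np.ones(len(xj))] + list(others))
    b, *_ = np.linalg.lstsq(X, xj, rcond=None)
    resid = xj - X @ b
    r2_j = 1 - float(resid @ resid) / float(((xj - xj.mean()) ** 2).sum())
    return 1.0 / (1.0 - r2_j)

# Hypothetical predictors (not the textbook's values).
x1 = np.array([1., 1., 2., 3., 3., 4., 4., 5., 5., 6.])
x2 = np.array([30., 34., 32., 35., 41., 38., 44., 40., 46., 48.])
x3 = x1 + x2 + np.array([.1, -.1, .2, 0., -.2, .1, 0., -.1, .2, -.2])  # ~ x1 + x2

print(round(vif(x1, [x2]), 2))        # below the VIF > 10 flag
print(round(vif(x3, [x1, x2]), 2))    # huge: x3 is almost x1 + x2
```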
1.3 Inferences in Multiple Regression

Let VIFj denote the Variance Inflation Factor of the jth predictor:
VIFj = 1/(1 − R²j)
If VIFj > 10, this is a signal that multicollinearity is present. (R²j > .9 is equivalent to VIFj > 10.)

Why is VIFj called the variance inflation factor for the jth predictor? The estimated variance of β̂j in a multiple regression is
s²_{β̂j} = s²ε / [Σ(xij − x̄j)² (1 − R²j)] = VIFj × s²ε / Σ(xij − x̄j)²

If VIFj is large, so is s_{β̂j}, which leads to a t-test that is not statistically significant. "The VIF measures how much the variance (square of the standard error) of a coefficient is increased because of collinearity." (Hildebrand, Ott and Gray)
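The "inflation" is easy to see numerically: fit the same response with and without a nearly duplicate predictor and compare the standard error of β̂1. All data here are made up for the demonstration.

```python
import numpy as np

def coef_se(y, predictors):
    """Least-squares coefficient standard errors: sqrt of s^2 * diag((X'X)^-1)."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    s2 = float(resid @ resid) / (len(y) - X.shape[1])
    return np.sqrt(s2 * np.diag(XtX_inv))

# Hypothetical data; x2 is nearly a copy of x1, so the two are collinear.
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, 30)
x2 = x1 + rng.normal(0, 0.1, 30)          # almost identical to x1 -> huge VIF
y = 1.0 + 0.5 * x1 + rng.normal(0, 1, 30)

se_alone = coef_se(y, [x1])[1]            # SE of b1 with x1 only
se_with = coef_se(y, [x1, x2])[1]         # SE of b1 after adding the collinear x2
print(f"SE(b1) alone: {se_alone:.3f}, with collinear x2: {se_with:.3f}")
# The variance of b1 is inflated by roughly VIF_1 = 1/(1 - R^2_1).
```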
1.3 Inferences in Multiple Regression

Example (Sales vs. Adv Exp and Income): the Minitab output follows.

The regression equation is
Sales = β̂0 + 0.416 Adv Exp + β̂2 Income

Predictor   Coef    SE Coef   T   P   VIF
Constant
Adv Exp     0.416                     1.8
Income                                1.8

Since both VIFs = 1.8 < 10, multicollinearity between Advertising Expenditures and Median Household Income is not a problem.

Regressing Adv Exp on Income gives R-Sq = 43%. Since R² = .43, VIF = 1/(1 − .43) = 1.8, as shown.

To illustrate multicollinearity, consider Exercise 1.19. Exercise 1.19: a study of demand for imported subcompact cars uses data from 10 metropolitan areas. The variables are:
- Demand: imported subcompact car sales as a percentage of total sales
- Educ: average number of years of schooling completed by adults
- Income: per capita income
- Popn: area population
- Famsize: average size of intact families
1.3 Inferences in Multiple Regression

The Minitab output follows:

The regression equation is
Demand = β̂0 + β̂1 Educ + .89 Income + β̂3 Popn + β̂4 Famsize

S = .6868   R-Sq = 96%   R-Sq(adj) = 94.1%

Is there a multicollinearity (MC) problem? Since the VIF for the variable Famsize exceeds 10, there is an MC problem. Note that the p-value for the overall F-test is below .05, indicating that at least one of the x's has predictive value. However, the smallest p-value for any t-test is .079, indicating that no individual x, entered last, has significant predictive value — the classic MC symptom.

What is the source of the MC problem? A matrix plot, in conjunction with the correlations, could be useful. This exercise will be revisited in a later section.

Remedies if multicollinearity is a problem:
- Eliminate one or more of the collinear predictors.
- Form a new predictor that is a surrogate for the collinear predictors.
- Multicollinearity can occur when one of the predictors is x²; this can be eliminated by using (x − x̄)² as the predictor instead.
Section 1.4 Testing a Subset of the Regression

To illustrate the concept, consider Exercise 13.55. Exercise 13.55: a bank that offers charge cards to customers studies the yearly purchase amount (in thousands of dollars) on the card as related to the age, income (in thousands of dollars), home ownership, and years of education of the cardholder. The variable Owner equals 1 if the cardholder owns a home and 0 if the cardholder rents; the other variables are self-explanatory. The original data set has information on 160 cardholders. Upon further examination of the data, you decide to remove the data for cardholder 19, because this is an older individual who has a high income from having saved early in life and invested successfully; this cardholder travels extensively and frequently uses her/his charge card.

Problem to be investigated: the Income and Education predictors measure the economic well-being of a cardholder. Do these predictors have any predictive value given the Age and home-ownership variables? The null hypothesis is that the β's corresponding to these predictors are simultaneously equal to 0.
1.4 Testing a Subset of the Regression

General case.
Complete model: E(Y) = β0 + β1 x1 + ... + βg xg + β(g+1) x(g+1) + ... + βk xk
Null hypothesis: H0: β(g+1) = ... = βk = 0
Reduced model: E(Y) = β0 + β1 x1 + ... + βg xg

Exercise 13.55.
Complete model: E(Y) = β0 + β1(Age) + β2(Owner) + β3(Income) + β4(Educn)
Null hypothesis: H0: βIncome = βEducn = 0
Reduced model: E(Y) = β0 + β1(Age) + β2(Owner)

The test statistic is called the partial F statistic:
F = [(SSEreduced − SSEcomplete)/(dfreduced − dfcomplete)] / [SSEcomplete/dfcomplete]

Rationale: SSE decreases as new terms are added to the model. If the x's from (g + 1) to k have predictive ability, then SSEcomplete should be much smaller than SSEreduced, so their difference [SSEreduced − SSEcomplete] should be large.
1.4 Testing a Subset of the Regression

Note: dfreduced − dfcomplete = k − g; dfcomplete = n − (k + 1).
Note: SSEcomplete/dfcomplete = MSEcomplete.
Note: other versions of the partial F-test are in H, O & G.

Decision criterion: reject H0 if partial F > F(α; k − g, n − (k + 1)).

Exercise 13.55: H0: βIncome = βEducn = 0. From the Minitab output that follows,
F = [(SSEreduced − SSEcomplete)/2] / [SSEcomplete/154] = 6.45
Since 6.45 exceeds the tabled F(.05; 2, 154) ≈ 3.06, reject H0. Either Income or Education adds predictive value to a model that contains Age and Owner.

Regression Analysis: Purch_1 versus Age_1, Income_1, Owner_1, Educn_1
The regression equation is
Purch_1 = β̂0 + β̂1 Age_1 + β̂2 Income_1 + β̂3 Owner_1 + β̂4 Educn_1
S = .884   R-Sq = 95%   R-Sq(adj) = 94.8%

Analysis of Variance
Source           DF    SS   MS   F   P
Regression        4
Residual Error  154
Total           158
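The partial F computation can be sketched end to end. The cardholder-style data below are simulated for illustration; they are not the records from Exercise 13.55.

```python
import numpy as np

def sse(y, predictors):
    """Residual sum of squares from regressing y on the given predictors."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return float(resid @ resid)

# Simulated cardholder-style data (NOT Exercise 13.55's actual values).
rng = np.random.default_rng(1)
n = 40
age    = rng.uniform(20, 70, n)
owner  = rng.integers(0, 2, n).astype(float)
income = rng.uniform(20, 90, n)
educn  = rng.uniform(10, 20, n)
purch  = 0.5 + 0.02 * age + 0.4 * owner + 0.03 * income + rng.normal(0, 0.3, n)

k, g = 4, 2                                  # complete: 4 predictors; reduced: 2
sse_complete = sse(purch, [age, owner, income, educn])
sse_reduced  = sse(purch, [age, owner])

df_complete = n - (k + 1)
partial_F = ((sse_reduced - sse_complete) / (k - g)) / (sse_complete / df_complete)
print(f"partial F = {partial_F:.2f} on ({k - g}, {df_complete}) df")
# Reject H0: beta_Income = beta_Educn = 0 when partial F exceeds F(.05; k-g, n-k-1).
```

Note that SSE can only go down as predictors are added, so the numerator difference is always nonnegative.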
1.4 Testing a Subset of the Regression

Regression Analysis: PURCH_1 versus AGE_1, OWNER_1
The regression equation is
PURCH_1 = β̂0 + β̂1 AGE_1 + β̂2 OWNER_1
S = .985   R-Sq = 94.6%   R-Sq(adj) = 94.5%

Analysis of Variance
Source           DF    SS   MS   F   P
Regression        2
Residual Error  156
Total           158

Section 1.5 Forecasting Using Multiple Regression

A major purpose of regression is to make predictions using the fitted model. In simple regression, we could obtain a confidence interval for E(Y) or a prediction interval for an individual Y. In both cases, the danger of extrapolation must be considered. Extrapolation occurs when using values of x far outside the range of x-values used to build the fitted model.
1.5 Forecasting Using Multiple Regression

In regressing Sales on Advertising Expenditures, Advertising Expenditures ranged from 1 to 6. It would be incorrect to obtain a confidence interval for E(Y) or a prediction interval for Y far outside this range: we don't know whether the fitted model is valid there. In multiple regression, one must consider not only the range of each predictor but the set of values of the predictors taken together.

Consider the following example:
Y = sales revenue per region (tens of thousands of dollars)
x1 = advertising expenditures (thousands of dollars)
x2 = median household income (thousands of dollars)
The values of x1 and x2 for regions A through J are those of the earlier table.

[scatterplot of x1 vs. x2]
1.5 Forecasting Using Multiple Regression

Extrapolation occurs when using the fitted model to predict outside the elbow-shaped region of observed (x1, x2) pairs. This would occur, for example, when Advertising Expenditures is 5 and Income is 35. The Minitab output follows.

Regression Analysis: Sales versus Adv Exp, Income
The regression equation is
Sales = β̂0 + β̂1 Adv Exp + β̂2 Income

Predicted Values for New Observations
New Obs   Fit   SE Fit   95% CI           95% PI
                         (1.451, 3.956)   (.913, 4.494) X
X denotes a point that is an outlier in the predictors.

Values of Predictors for New Observations
New Obs   Adv Exp   Income

Minitab indicates that this set of values for x1 and x2 is an outlier in the predictors.

Keywords: Chapter 1 — multiple regression model; partial slopes; first-order model; adjusted coefficient of determination, R²a; multicollinearity; variance inflation factor; overall F-test; t-test; complete model; reduced model; partial F-test; extrapolation.
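Minitab's "outlier in the predictors" flag can be approximated with a leverage check: compute h = x0'(X'X)⁻¹x0 for the new point and compare it with the leverages of the fitted points. The (x1, x2) values below are hypothetical stand-ins, not the textbook's data.

```python
import numpy as np

# Hypothetical predictor values (not the textbook's).
x1 = np.array([1., 1., 2., 3., 3., 4., 4., 5., 5., 6.])
x2 = np.array([30., 34., 32., 35., 41., 38., 44., 40., 46., 48.])
X = np.column_stack([np.ones(len(x1)), x1, x2])
XtX_inv = np.linalg.inv(X.T @ X)

def leverage(adv, income):
    """h = x0' (X'X)^-1 x0 for a new point; large h means the point sits far
    from the cloud of predictor values used to fit the model."""
    x0 = np.array([1.0, adv, income])
    return float(x0 @ XtX_inv @ x0)

h_train = np.array([leverage(a, i) for a, i in zip(x1, x2)])
h_inside  = leverage(3.0, 38.0)   # near the middle of the data
h_outside = leverage(5.0, 30.0)   # high advertising with low income: extrapolation
print(f"max training leverage = {h_train.max():.3f}")
print(f"inside: {h_inside:.3f}, outside: {h_outside:.3f}")
# A common flag: treat a new point as an extrapolation when its leverage
# exceeds the largest leverage among the fitted points.
```

Note that (5, 30) is inside the range of each predictor separately, yet far from the joint pattern — exactly the multiple-regression pitfall described above.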
Summary of Chapter 1
- The multiple linear regression model
- Interpreting the slope coefficient of a single predictor in a multiple regression model
- Understanding the difference between the coefficient of determination (R²) and the adjusted coefficient of determination (R²a)
- The detection of multicollinearity and its impact
- Using the F statistic to test the overall utility of the predictors
- Using the t-test to test the additional value of a single predictor
- Using the partial F-test to assess the significance of a subset of predictors
- The meaning of extrapolation in multiple regression
Chapter 14 Student Lecture Notes 14-1 Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Multiple Regression QMIS 0 Dr. Mohammad Zainal Chapter Goals After completing
More informationBusiness Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal
Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing
More informationSMAM 319 Exam 1 Name. 1.Pick the best choice for the multiple choice questions below (10 points 2 each)
SMAM 319 Exam 1 Name 1.Pick the best choice for the multiple choice questions below (10 points 2 each) A b In Metropolis there are some houses for sale. Superman and Lois Lane are interested in the average
More informationHistogram of Residuals. Residual Normal Probability Plot. Reg. Analysis Check Model Utility. (con t) Check Model Utility. Inference.
Steps for Regression Simple Linear Regression Make a Scatter plot Does it make sense to plot a line? Check Residual Plot (Residuals vs. X) Are there any patterns? Check Histogram of Residuals Is it Normal?
More informationSimple Linear Regression. Steps for Regression. Example. Make a Scatter plot. Check Residual Plot (Residuals vs. X)
Simple Linear Regression 1 Steps for Regression Make a Scatter plot Does it make sense to plot a line? Check Residual Plot (Residuals vs. X) Are there any patterns? Check Histogram of Residuals Is it Normal?
More informationRegression Analysis. BUS 735: Business Decision Making and Research
Regression Analysis BUS 735: Business Decision Making and Research 1 Goals and Agenda Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn
More informationBayesian Analysis LEARNING OBJECTIVES. Calculating Revised Probabilities. Calculating Revised Probabilities. Calculating Revised Probabilities
Valua%on and pricing (November 5, 2013) LEARNING OBJECTIVES Lecture 7 Decision making (part 3) Regression theory Olivier J. de Jong, LL.M., MM., MBA, CFD, CFFA, AA www.olivierdejong.com 1. List the steps
More informationConcordia University (5+5)Q 1.
(5+5)Q 1. Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/1 40 Examination Date Time Pages Mid Term Test May 26, 2004 Two Hours 3 Instructor Course Examiner
More informationInference for Regression
Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu
More informationMATH 644: Regression Analysis Methods
MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100
More informationUNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test, October 2013
UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test, October 2013 STAC67H3 Regression Analysis Duration: One hour and fifty minutes Last Name: First Name: Student
More informationMultiple Regression: Chapter 13. July 24, 2015
Multiple Regression: Chapter 13 July 24, 2015 Multiple Regression (MR) Response Variable: Y - only one response variable (quantitative) Several Predictor Variables: X 1, X 2, X 3,..., X p (p = # predictors)
More informationCh 3: Multiple Linear Regression
Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery
More informationBusiness Statistics. Lecture 10: Course Review
Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,
More informationFormal Statement of Simple Linear Regression Model
Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor
More informationHomework 2: Simple Linear Regression
STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA
More informationLinear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).
Linear Regression Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). A dependent variable is a random variable whose variation
More informationSTA121: Applied Regression Analysis
STA121: Applied Regression Analysis Linear Regression Analysis - Chapters 3 and 4 in Dielman Artin Department of Statistical Science September 15, 2009 Outline 1 Simple Linear Regression Analysis 2 Using
More informationTMA4255 Applied Statistics V2016 (5)
TMA4255 Applied Statistics V2016 (5) Part 2: Regression Simple linear regression [11.1-11.4] Sum of squares [11.5] Anna Marie Holand To be lectured: January 26, 2016 wiki.math.ntnu.no/tma4255/2016v/start
More informationSteps for Regression. Simple Linear Regression. Data. Example. Residuals vs. X. Scatterplot. Make a Scatter plot Does it make sense to plot a line?
Steps for Regression Simple Linear Regression Make a Scatter plot Does it make sense to plot a line? Check Residual Plot (Residuals vs. X) Are there any patterns? Check Histogram of Residuals Is it Normal?
More informationRegression Analysis II
Regression Analysis II Measures of Goodness of fit Two measures of Goodness of fit Measure of the absolute fit of the sample points to the sample regression line Standard error of the estimate An index
More informationLecture 10 Multiple Linear Regression
Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable
More informationFinding Relationships Among Variables
Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis
More informationSMAM 314 Exam 42 Name
SMAM 314 Exam 42 Name Mark the following statements True (T) or False (F) (10 points) 1. F A. The line that best fits points whose X and Y values are negatively correlated should have a positive slope.
More informationBNAD 276 Lecture 10 Simple Linear Regression Model
1 / 27 BNAD 276 Lecture 10 Simple Linear Regression Model Phuong Ho May 30, 2017 2 / 27 Outline 1 Introduction 2 3 / 27 Outline 1 Introduction 2 4 / 27 Simple Linear Regression Model Managerial decisions
More informationEcon 3790: Business and Economics Statistics. Instructor: Yogesh Uppal
Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu Sampling Distribution of b 1 Expected value of b 1 : Variance of b 1 : E(b 1 ) = 1 Var(b 1 ) = σ 2 /SS x Estimate of
More informationCHAPTER EIGHT Linear Regression
7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following
More informationBusiness Statistics. Lecture 10: Correlation and Linear Regression
Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form
More informationMultiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company
Multiple Regression Inference for Multiple Regression and A Case Study IPS Chapters 11.1 and 11.2 2009 W.H. Freeman and Company Objectives (IPS Chapters 11.1 and 11.2) Multiple regression Data for multiple
More informationLecture 18: Simple Linear Regression
Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength
More informationSMAM 314 Practice Final Examination Winter 2003
SMAM 314 Practice Final Examination Winter 2003 You may use your textbook, one page of notes and a calculator. Please hand in the notes with your exam. 1. Mark the following statements True T or False
More informationSTA 4210 Practise set 2a
STA 410 Practise set a For all significance tests, use = 0.05 significance level. S.1. A multiple linear regression model is fit, relating household weekly food expenditures (Y, in $100s) to weekly income
More informationInference for Regression Simple Linear Regression
Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating
More informationA discussion on multiple regression models
A discussion on multiple regression models In our previous discussion of simple linear regression, we focused on a model in which one independent or explanatory variable X was used to predict the value
More information2.4.3 Estimatingσ Coefficient of Determination 2.4. ASSESSING THE MODEL 23
2.4. ASSESSING THE MODEL 23 2.4.3 Estimatingσ 2 Note that the sums of squares are functions of the conditional random variables Y i = (Y X = x i ). Hence, the sums of squares are random variables as well.
More informationMultiple Linear Regression. Chapter 12
13 Multiple Linear Regression Chapter 12 Multiple Regression Analysis Definition The multiple regression model equation is Y = b 0 + b 1 x 1 + b 2 x 2 +... + b p x p + ε where E(ε) = 0 and Var(ε) = s 2.
More informationDisadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means
Stat 529 (Winter 2011) Analysis of Variance (ANOVA) Reading: Sections 5.1 5.3. Introduction and notation Birthweight example Disadvantages of using many pooled t procedures The analysis of variance procedure
More informationPredict y from (possibly) many predictors x. Model Criticism Study the importance of columns
Lecture Week Multiple Linear Regression Predict y from (possibly) many predictors x Including extra derived variables Model Criticism Study the importance of columns Draw on Scientific framework Experiment;
More informationSingle and multiple linear regression analysis
Single and multiple linear regression analysis Marike Cockeran 2017 Introduction Outline of the session Simple linear regression analysis SPSS example of simple linear regression analysis Additional topics
More informationCorrelation and Regression
Correlation and Regression Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University 1 Learning Objectives Upon successful completion of this module, the student should
More information1 Introduction to One-way ANOVA
Review Source: Chapter 10 - Analysis of Variance (ANOVA). Example Data Source: Example problem 10.1 (dataset: exp10-1.mtw) Link to Data: http://www.auburn.edu/~carpedm/courses/stat3610/textbookdata/minitab/
More informationMULTICOLLINEARITY AND VARIANCE INFLATION FACTORS. F. Chiaromonte 1
MULTICOLLINEARITY AND VARIANCE INFLATION FACTORS F. Chiaromonte 1 Pool of available predictors/terms from them in the data set. Related to model selection, are the questions: What is the relative importance
More informationANOVA: Analysis of Variation
ANOVA: Analysis of Variation The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative variables depend on which group (given by categorical
More information