Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Size: px
Start display at page:

Download "Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore"

Transcription

1

2 What is Multiple Linear Regression Several independent variables may influence the change in response variable we are trying to study. When several independent variables are included in the equation, the regression is called multiple linear regression.

3 Multiple Regression Modeling Steps 1. Start with a hypothesis or belief. 2. Estimate unknown model parameters ( 0, 1, 2. ) 3. Specify probability distribution of random error term - Assumed to be a normal distribution 4. Check the assumptions of regression (normality, heteroscedasticity and multi-collinearity) 5. Evaluate Model 6. Use Model for Prediction & Estimation and other purposes.

4 Multiple Linear Regression Model Relationship between 1 dependent & 2 or more independent variables is a linear function. Population Y-intercept Population slopes Random error Y i 0 1 X 1i 2 X 2i k X ki i Dependent (response) variable Independent (explanatory) variables

5 Prediction Model The prediction equation obtained from sample data is: ki k i i X X X Y ˆ

6 Multiple Regression Model Y Y i = X 1i + 2 X 2i + i (Observed Y) Y i = Treatment Cost X 1i = Body Weight Response Plane 0 i X 2i = HR Pulse X 2 X 1 (X 1i,X 2i ) YX = X 1i + 2 X 2i

7

8 Estimating and Interpreting the -Parameters Belief Y x k x k Fitted Model ˆ ˆ ˆ yˆ 0 1x1... kxk Using least squares method Minimize ˆ 2 SSE y y

9 Estimation of Parameters in Multiple Regression The least squares function is given by The least squares estimates must satisfy n i k j ij j i n i i x y SSE j xij n i k j xij j yi j SSE n i k j xij j yi SSE and

10 Estimation of Parameters in Multiple Regression The least squares equations are The solution to the above equations are the least squares estimators of the regression coefficients.

11 DAD Example Total Hospital Cost β β1 Body weight β2 Body height β3 HR Pulse β4 0 RR

12 MODEL Treatment Cost Body weight HR Pulse RR Body height

13 Interpretation of Coefficients ( 1 ) Partial regression coefficient Total cost of treatment increases by INR for every one kg increase in body weight while other parameters like body height, HR pulse and RR are controlled (kept constant). Treatment Cost Body weight Body height HR Pulse RR

14 Interpretation of Coefficients ( 3 ) Partial regression coefficient Total cost of treatment increases by INR for every one unit increase in HR pulse while other parameters like body weight, Body height and RR are controlled. Treatment Cost Body weight Body height HR Pulse RR

15 Standardized Beta (Beta Weights) The beta weights are the regression coefficients for standardized data. Standardized Beta is the average increase in the dependent variable when the independent variable is increased by one standard deviation and other independent variables are held constant.

16 Beta Weights S Standardized Beta i S S S x y Standard deviation of X Standard deviation of Y x y

17

18 Partial Correlation Partial correlation coefficient measures the relationship between two variables (say Y and X 1 ) when the influence of all other variables (say X 2, X 3,, X k ) connected with these two variables (Y and X 1 ) are removed. Partial correlation is the correlation between residualized response and residualized predictor.

19 Semi-Partial (Part) Correlation Semi-partial (or part correlation) coefficient measures the relationship between two variables (say Y and X 1 ) when the influence of all other variables (say X 2, X 3,, X k ) connected with these two variables (Y and X 1 ) are removed from one of the variables (X 1 ). Only the predictor is residualized. The denominator of the coefficient (the total variance of the response variable, Y) remains the same no matter which predictor is being examined.

20 Y D Variance in Y = A + B + C + D A C B Variance in Y explained by the model = A + B + C Variance in Y not explained by the model = E F E G X 1 X 2 Semi partial (part) correlation for X 1 = A /(A+B+C+D) Semi partial (part) correlation for X 2 = B /(A+B+C+D) Partial correlation for X 1 = A / (A+D) Partial correlation for X 2 = B / (B + D) IMPORTANT In part correlation we do not residualize the response variable

21 r r 12,3 13,2 r 23,1 r 12 (1 r r 13 (1 r r (1 r 2 12 Partial Correlation r 13 r 23 )(1 r r 12 r 23 )(1 r r 12 r 13 )(1 r ) ) ) Correlation between y1 and x2, when the influence of x3 is removed from both y1 and x2. Correlation between y1 and x3, when the influence of x2 is removed from both y1 and x3. Correlation between x2 and x3, when the influence of y1 is removed from both x2 and x3. For Proof: Theory of Econometrics by A Koutsoyiannis

22 Semi-partial correlation (Part Correlation) Semi-partial (or part) correlation sr 12,3 is the correlation between y 1 and x 2 when influence of x 3 is partialled out of x 2 (not on y 1 ). sr 12,3 r 12 r 13 1 r r Part Correlation Partial Correlation (1 r 2 13 )

23 Semi-partial (part) correlation Square of Semi-partial (or part correlation) gives the increase in the R 2 value (coefficient of multiple determination) when an explanatory variable is added to the model.

24 Y = x Body weight + 2 x HR Pulse = (0.228) 2 = 0.173

25 Change in R 2 when a new variable is added Y 1 A C D B A X 1 X 2 Variance specific to X1 (part A) Variance common to X1 and X2 (part C) Variance specific to X2 (part B) Left over R R 2

26

27 Multiple Regression Model Diagnostics Test for overall model fitness (R-Square and Adjusted R-Square) Test for overall model statistical significance (F test test) Test for portions of the model (Partial F-test) Test for statistical significance of individual explanatory variables (t test) Test for Normality and Homoscedasticity of residuals Test for Multi-collinearity and Auto Correlation

28 Co-efficient of multiple determination R 2 SSR SST 1 SSE SST i i ( y ( y i i y) y) R Multiple coefficient of multiple determination R x 1 2, is x the percentage of 2,..., x k the variation of y explained by

29 Y = x Body Weight + 2 x HR Pulse R 2 is % of variation in Y is explained by Bodyweight and HR Pulse

30 Y = x Body Weight x HR Pulse

31 Co-efficient of determination in Multiple Regression Coefficient of determination increases as the number of explanatory variables increases. In SSR/SST, the numerator, SSR, increases as the number of explanatory variables increases, whereas the denominator, SST, remains constant. Increase in R 2 can deceptive, since more number of explanatory variables may over-fit the data.

32

33 Adjusted R 2 Inclusion of additional explanatory variable will increase the R 2 value. By introducing an additional explanatory variable, we increase the numerator of the expression for R 2 while the denominator remains the same. To correct this defect, we adjust the R 2 by taking into account the degrees of freedom.

34 Adjusted R 2 R R n k 2 A 2 A 1 SSE/(n SST/(n Adjusted R - Square number of number of k 1) 1) observations explanatory variables

35 Y = x Body Weight + 2 x HR Pulse Adjusted R 2 is 0.167

36

37 Testing For Overall Significance of Model F Test H0 : β1 β2... βk 0 H :Not allβ values are zero A Test for overall significance of multiple regression model. Checks if there is a statistically significant relationship between Y and any of the explanatory variables (X 1, X 2,, X k ).

38 F Statistic F = MSR/MSE Relationship between F and R 2 F (1 R 2 R 2 / k ) /( n k 1)

39 DAD CASE - ANOVA Table F = x /

40 Test for Individual Variables

41 Testing for Significance of Individual Parameters H H 0 A : β :β i i 0 0 t S e i ( ) i T- test: By rejecting the null hypothesis, we can claim that there is a statistically significant relationship between the response variable Y and explanatory variable X i.

42 SPSS Output for Coefficients

43

44 Testing Model Portions Partial F Test Examines the contribution of a set of explanatory variables to the regression model. Used for selecting explanatory variables in regression model building strategies such as stepwise regression.

45 Testing Model Portions Partial F Test Full Model: Y X X k X k Reduced Model (r < k): Y X X r X r Test H 0 : r+1 =.. = k = 0 Partial F (SSE reduced SSE MSE full full )/(k r) k = Number of variables in the full model r = Number of variables in the reduced model

46 Partial F-test Example (DAD) Full Model: Treatment cost RR 4 0 Bodyweight 1 Bodyheight 2 HRpulse 3 Reduced Model: Treatment cost 0 1 Bodyweight 2 HRpulse F ( SSE SSE ) /(4 reduced MSE full full 2)

47 ( SSE F reduced p value SSE MSE full full ) /(4 2) ( ) /

48

49 Categorical Variables in Regression Many regression models are likely to have qualitative (categorical) variables as explanatory variable. Qualitative variables (categorical variables) in regression are replaced with dummy variables (or indicator variables) in regression model. A categorical variable with n levels are replaced with (n-1) dummy variables. The category for which no dummy variable assigned is known as Base Category.

50 Dummy variable When there are more than one qualitative variable, it is advisable to use (n-1) dummy variables for both qualitative variables along with the intercept. multi- Use of n dummy variables along with intercept will result in collinearity, known as dummy variable trap.

51 Dummy variables in Regression The intercept, 0, is the mean value of the base category. The coefficients attached to dummy variables are called differential intercept coefficients.

52 Qualitative Variables DAD Case 1. Gender (2 levels) 2. Marital Status (2 levels) 3. Key Complaints (6 levels)

53 Model with Dummy Variable Y = β 0 + β 1 x Female Y for Male = Y for Female = =

54 Y = β 0 + β 1 x Female + β 2 x Body Weight

55 Impact of Dummy Variable Y Y 0 2 X 2 ( X 0 1) (M) 2 2 (F)

56 Qualitative Variable with Multiple Levels Key Complaints 1. ACHD 2. CAD 3.NONE 4. OS-ASD 5. OTHER 6.RHD

57

58 Regression Model Building Derived variables may help to explain the variation in the response variable. For example, consider a credit rating problem, which can be used for approving housing loan. A bank may use factors such as loan amount and value of the property etc. Customer Loan Amount 250, , ,000

59 Derived Variables Customer Loan Amount 250, , ,000 Value of Property 300, ,000 1,000,000 Loan / Value

60 Interaction Variables Consider a regression model of type: Y = β 0 + β 1 X 1 + β 2 X 2 + β 3 X 1 X 2 Interaction variable The interaction variable is usually an interaction between a quantitative variable and qualitative variable

61 One of the popular applications of regression is to predict gender discrimination. Consider a regression model with salary as response variable Y: Y = β 0 + β 1 Gender + β 2 Work Experience + β 3 Gender x Work Experience

62 Interaction Variable Let Gender = 1 implies Female: Then Y for Female is: Y = β 0 + β 1 + (β 2 + β 3 ) Work Experience Y for Male is: Y = β 0 + β 2 x Work Experience

63 Conditional relationship in Interaction Variables

64 DAD Example Interaction Variable Y = β 0 + β 1 Marital Status + β 2 Body Weight + β 3 Marital status x Bodyweight

65

66 Multicollinearity High correlation between explanatory variables is called multi-collinearity. Multi-collinearity leads to unstable coefficients. Always exists; matter of degree.

67 Multi-collinearity Y = β 0 + β 1 Marital Status + β 2 Body Weight + β 3 Marital status x Bodyweight

68 Symptoms of Multi-collinearity High R 2 but few significant t ratios. F-test rejects the null hypothesis, but none of the individual t-tests are rejected. Correlations between pairs of X variables are more than with Y variables.

69 Effects of Multi-collinearity The variances of regression coefficient estimators are inflated. The magnitudes of regression coefficient estimates may be different. Adding and removing variables produce large changes in the coefficient estimates. Regression coefficient may have opposite sign.

70 Identifying Multi-collinearity Variance Inflation factor The variance inflation factor (VIF) is a relative measure of the increase in the variance in standard error of beta coefficient because of collinearity. A VIF greater than 10 indicates that collinearity is very high. A VIF value of more than 4 is not acceptable.

71 Variance of Beta Estimates ) (1 ) ( ) (1 ) ( r SS S Var r SS S Var X X Y X e X e

72 Variance inflation factor Variance inflation factor associated with introducing a new variable X j is given by: VIF(X ) j 1 1 R 2 j 2 R j is the coefficient of determination for the regression of X j asdependent variable The standard error of the corresponding Beta is inflated by VIF

73 Impact of VIF on Se() R j Tolerance (1-R 2 ) VIF Impact on Se() VIF

74 DAD CASELET For Body Weight Actual t x Corresponding p - value 0.006

75 Remedies for Multi-collinearity Drop a variable (may result in specification bias). Principal component analysis (PCA) Partial Least Squares (PLS)

76

77 Forward Selection Method In forward selection procedure in which variables are sequentially entered into the model. The first variable considered for entry is the one with smallest p-value based on F-test.

78 Backward elimination method A variable selection procedure in which all variables are entered into the equation and then sequentially removed starting with the most nonsignificant variable. At each step, the largest probability of F is removed.

79 Stepwise Regression The first variable considered for entry into the equation is the one with the largest F value or smallest p-value based on F-test. At each step, the independent variable not in the equation that has the smallest probability of F is entered. Variables already in the regression equation are removed if their probability of F becomes sufficiently large.

80

81

82 Y = CAD RHD HR PULSE MARRIED OTHERS

Multiple linear regression S6

Multiple linear regression S6 Basic medical statistics for clinical and experimental research Multiple linear regression S6 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/42 Introduction Two main motivations for doing multiple

More information

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu Sampling Distribution of b 1 Expected value of b 1 : Variance of b 1 : E(b 1 ) = 1 Var(b 1 ) = σ 2 /SS x Estimate of

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

Single and multiple linear regression analysis

Single and multiple linear regression analysis Single and multiple linear regression analysis Marike Cockeran 2017 Introduction Outline of the session Simple linear regression analysis SPSS example of simple linear regression analysis Additional topics

More information

Chapter 14 Student Lecture Notes 14-1

Chapter 14 Student Lecture Notes 14-1 Chapter 14 Student Lecture Notes 14-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter 14 Multiple Regression Analysis and Model Building Chap 14-1 Chapter Goals After completing this

More information

Chapter 4. Regression Models. Learning Objectives

Chapter 4. Regression Models. Learning Objectives Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing

More information

Chapter 7 Student Lecture Notes 7-1

Chapter 7 Student Lecture Notes 7-1 Chapter 7 Student Lecture Notes 7- Chapter Goals QM353: Business Statistics Chapter 7 Multiple Regression Analysis and Model Building After completing this chapter, you should be able to: Explain model

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Regression Models. Chapter 4. Introduction. Introduction. Introduction Chapter 4 Regression Models Quantitative Analysis for Management, Tenth Edition, by Render, Stair, and Hanna 008 Prentice-Hall, Inc. Introduction Regression analysis is a very valuable tool for a manager

More information

Categorical Predictor Variables

Categorical Predictor Variables Categorical Predictor Variables We often wish to use categorical (or qualitative) variables as covariates in a regression model. For binary variables (taking on only 2 values, e.g. sex), it is relatively

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

x3,..., Multiple Regression β q α, β 1, β 2, β 3,..., β q in the model can all be estimated by least square estimators

x3,..., Multiple Regression β q α, β 1, β 2, β 3,..., β q in the model can all be estimated by least square estimators Multiple Regression Relating a response (dependent, input) y to a set of explanatory (independent, output, predictor) variables x, x 2, x 3,, x q. A technique for modeling the relationship between variables.

More information

Multiple linear regression

Multiple linear regression Multiple linear regression Course MF 930: Introduction to statistics June 0 Tron Anders Moger Department of biostatistics, IMB University of Oslo Aims for this lecture: Continue where we left off. Repeat

More information

LI EAR REGRESSIO A D CORRELATIO

LI EAR REGRESSIO A D CORRELATIO CHAPTER 6 LI EAR REGRESSIO A D CORRELATIO Page Contents 6.1 Introduction 10 6. Curve Fitting 10 6.3 Fitting a Simple Linear Regression Line 103 6.4 Linear Correlation Analysis 107 6.5 Spearman s Rank Correlation

More information

y response variable x 1, x 2,, x k -- a set of explanatory variables

y response variable x 1, x 2,, x k -- a set of explanatory variables 11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate

More information

holding all other predictors constant

holding all other predictors constant Multiple Regression Numeric Response variable (y) p Numeric predictor variables (p < n) Model: Y = b 0 + b 1 x 1 + + b p x p + e Partial Regression Coefficients: b i effect (on the mean response) of increasing

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 4 4- Basic Business Statistics th Edition Chapter 4 Introduction to Multiple Regression Basic Business Statistics, e 9 Prentice-Hall, Inc. Chap 4- Learning Objectives In this chapter, you learn:

More information

Chapter 13. Multiple Regression and Model Building

Chapter 13. Multiple Regression and Model Building Chapter 13 Multiple Regression and Model Building Multiple Regression Models The General Multiple Regression Model y x x x 0 1 1 2 2... k k y is the dependent variable x, x,..., x 1 2 k the model are the

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Multiple Regression Methods

Multiple Regression Methods Chapter 1: Multiple Regression Methods Hildebrand, Ott and Gray Basic Statistical Ideas for Managers Second Edition 1 Learning Objectives for Ch. 1 The Multiple Linear Regression Model How to interpret

More information

Dr. Maddah ENMG 617 EM Statistics 11/28/12. Multiple Regression (3) (Chapter 15, Hines)

Dr. Maddah ENMG 617 EM Statistics 11/28/12. Multiple Regression (3) (Chapter 15, Hines) Dr. Maddah ENMG 617 EM Statistics 11/28/12 Multiple Regression (3) (Chapter 15, Hines) Problems in multiple regression: Multicollinearity This arises when the independent variables x 1, x 2,, x k, are

More information

Data Analysis 1 LINEAR REGRESSION. Chapter 03

Data Analysis 1 LINEAR REGRESSION. Chapter 03 Data Analysis 1 LINEAR REGRESSION Chapter 03 Data Analysis 2 Outline The Linear Regression Model Least Squares Fit Measures of Fit Inference in Regression Other Considerations in Regression Model Qualitative

More information

The Multiple Regression Model

The Multiple Regression Model Multiple Regression The Multiple Regression Model Idea: Examine the linear relationship between 1 dependent (Y) & or more independent variables (X i ) Multiple Regression Model with k Independent Variables:

More information

Multiple Regression. Peerapat Wongchaiwat, Ph.D.

Multiple Regression. Peerapat Wongchaiwat, Ph.D. Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com The Multiple Regression Model Examine the linear relationship between 1 dependent (Y) & 2 or more independent variables (X i ) Multiple Regression Model

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 13 Simple Linear Regression 13-1 Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of

More information

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression Chapter 14 Student Lecture Notes 14-1 Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Multiple Regression QMIS 0 Dr. Mohammad Zainal Chapter Goals After completing

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

Prepared by: Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti

Prepared by: Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti Prepared by: Prof Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti Putra Malaysia Serdang M L Regression is an extension to

More information

Econ 3790: Statistics Business and Economics. Instructor: Yogesh Uppal

Econ 3790: Statistics Business and Economics. Instructor: Yogesh Uppal Econ 3790: Statistics Business and Economics Instructor: Yogesh Uppal Email: yuppal@ysu.edu Chapter 14 Covariance and Simple Correlation Coefficient Simple Linear Regression Covariance Covariance between

More information

Chapter 14 Simple Linear Regression (A)

Chapter 14 Simple Linear Regression (A) Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables

More information

Analysing data: regression and correlation S6 and S7

Analysing data: regression and correlation S6 and S7 Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association

More information

Model Building Chap 5 p251

Model Building Chap 5 p251 Model Building Chap 5 p251 Models with one qualitative variable, 5.7 p277 Example 4 Colours : Blue, Green, Lemon Yellow and white Row Blue Green Lemon Insects trapped 1 0 0 1 45 2 0 0 1 59 3 0 0 1 48 4

More information

Multiple Regression and Model Building Lecture 20 1 May 2006 R. Ryznar

Multiple Regression and Model Building Lecture 20 1 May 2006 R. Ryznar Multiple Regression and Model Building 11.220 Lecture 20 1 May 2006 R. Ryznar Building Models: Making Sure the Assumptions Hold 1. There is a linear relationship between the explanatory (independent) variable(s)

More information

STA121: Applied Regression Analysis

STA121: Applied Regression Analysis STA121: Applied Regression Analysis Linear Regression Analysis - Chapters 3 and 4 in Dielman Artin Department of Statistical Science September 15, 2009 Outline 1 Simple Linear Regression Analysis 2 Using

More information

Sociology Research Statistics I Final Exam Answer Key December 15, 1993

Sociology Research Statistics I Final Exam Answer Key December 15, 1993 Sociology 592 - Research Statistics I Final Exam Answer Key December 15, 1993 Where appropriate, show your work - partial credit may be given. (On the other hand, don't waste a lot of time on excess verbiage.)

More information

Bayesian Analysis LEARNING OBJECTIVES. Calculating Revised Probabilities. Calculating Revised Probabilities. Calculating Revised Probabilities

Bayesian Analysis LEARNING OBJECTIVES. Calculating Revised Probabilities. Calculating Revised Probabilities. Calculating Revised Probabilities Valua%on and pricing (November 5, 2013) LEARNING OBJECTIVES Lecture 7 Decision making (part 3) Regression theory Olivier J. de Jong, LL.M., MM., MBA, CFD, CFFA, AA www.olivierdejong.com 1. List the steps

More information

The simple linear regression model discussed in Chapter 13 was written as

The simple linear regression model discussed in Chapter 13 was written as 1519T_c14 03/27/2006 07:28 AM Page 614 Chapter Jose Luis Pelaez Inc/Blend Images/Getty Images, Inc./Getty Images, Inc. 14 Multiple Regression 14.1 Multiple Regression Analysis 14.2 Assumptions of the Multiple

More information

Multiple regression: Model building. Topics. Correlation Matrix. CQMS 202 Business Statistics II Prepared by Moez Hababou

Multiple regression: Model building. Topics. Correlation Matrix. CQMS 202 Business Statistics II Prepared by Moez Hababou Multiple regression: Model building CQMS 202 Business Statistics II Prepared by Moez Hababou Topics Forward versus backward model building approach Using the correlation matrix Testing for multicolinearity

More information

Final Review. Yang Feng. Yang Feng (Columbia University) Final Review 1 / 58

Final Review. Yang Feng.   Yang Feng (Columbia University) Final Review 1 / 58 Final Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Final Review 1 / 58 Outline 1 Multiple Linear Regression (Estimation, Inference) 2 Special Topics for Multiple

More information

CS 5014: Research Methods in Computer Science

CS 5014: Research Methods in Computer Science Computer Science Clifford A. Shaffer Department of Computer Science Virginia Tech Blacksburg, Virginia Fall 2010 Copyright c 2010 by Clifford A. Shaffer Computer Science Fall 2010 1 / 207 Correlation and

More information

Unit 11: Multiple Linear Regression

Unit 11: Multiple Linear Regression Unit 11: Multiple Linear Regression Statistics 571: Statistical Methods Ramón V. León 7/13/2004 Unit 11 - Stat 571 - Ramón V. León 1 Main Application of Multiple Regression Isolating the effect of a variable

More information

Sociology 593 Exam 1 Answer Key February 17, 1995

Sociology 593 Exam 1 Answer Key February 17, 1995 Sociology 593 Exam 1 Answer Key February 17, 1995 I. True-False. (5 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. A researcher regressed Y on. When

More information

Basic Business Statistics 6 th Edition

Basic Business Statistics 6 th Edition Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based

More information

General linear models. One and Two-way ANOVA in SPSS Repeated measures ANOVA Multiple linear regression

General linear models. One and Two-way ANOVA in SPSS Repeated measures ANOVA Multiple linear regression General linear models One and Two-way ANOVA in SPSS Repeated measures ANOVA Multiple linear regression 2-way ANOVA in SPSS Example 14.1 2 3 2-way ANOVA in SPSS Click Add 4 Repeated measures The stroop

More information

MULTIPLE REGRESSION ANALYSIS AND OTHER ISSUES. Business Statistics

MULTIPLE REGRESSION ANALYSIS AND OTHER ISSUES. Business Statistics MULTIPLE REGRESSION ANALYSIS AND OTHER ISSUES Business Statistics CONTENTS Multiple regression Dummy regressors Assumptions of regression analysis Predicting with regression analysis Old exam question

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

Multiple Linear Regression CIVL 7012/8012

Multiple Linear Regression CIVL 7012/8012 Multiple Linear Regression CIVL 7012/8012 2 Multiple Regression Analysis (MLR) Allows us to explicitly control for many factors those simultaneously affect the dependent variable This is important for

More information

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises LINEAR REGRESSION ANALYSIS MODULE XVI Lecture - 44 Exercises Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Exercise 1 The following data has been obtained on

More information

Correlation & Simple Regression

Correlation & Simple Regression Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.

More information

Chapter 14. Multiple Regression Models. Multiple Regression Models. Multiple Regression Models

Chapter 14. Multiple Regression Models. Multiple Regression Models. Multiple Regression Models Chapter 14 Multiple Regression Models 1 Multiple Regression Models A general additive multiple regression model, which relates a dependent variable y to k predictor variables,,, is given by the model equation

More information

Ch14. Multiple Regression Analysis

Ch14. Multiple Regression Analysis Ch14. Multiple Regression Analysis 1 Goals : multiple regression analysis Model Building and Estimating More than 1 independent variables Quantitative( 量 ) independent variables Qualitative( ) independent

More information

Econometrics -- Final Exam (Sample)

Econometrics -- Final Exam (Sample) Econometrics -- Final Exam (Sample) 1) The sample regression line estimated by OLS A) has an intercept that is equal to zero. B) is the same as the population regression line. C) cannot have negative and

More information

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is.

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is. Linear regression We have that the estimated mean in linear regression is The standard error of ˆµ Y X=x is where x = 1 n s.e.(ˆµ Y X=x ) = σ ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. 1 n + (x x)2 i (x i x) 2 i x i. The

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Regression Models REVISED TEACHING SUGGESTIONS ALTERNATIVE EXAMPLES

Regression Models REVISED TEACHING SUGGESTIONS ALTERNATIVE EXAMPLES M04_REND6289_10_IM_C04.QXD 5/7/08 2:49 PM Page 46 4 C H A P T E R Regression Models TEACHING SUGGESTIONS Teaching Suggestion 4.1: Which Is the Independent Variable? We find that students are often confused

More information

Simple Linear Regression

Simple Linear Regression 9-1 l Chapter 9 l Simple Linear Regression 9.1 Simple Linear Regression 9.2 Scatter Diagram 9.3 Graphical Method for Determining Regression 9.4 Least Square Method 9.5 Correlation Coefficient and Coefficient

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Lecture 10 Multiple Linear Regression

Lecture 10 Multiple Linear Regression Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable

More information

Making sense of Econometrics: Basics

Making sense of Econometrics: Basics Making sense of Econometrics: Basics Lecture 4: Qualitative influences and Heteroskedasticity Egypt Scholars Economic Society November 1, 2014 Assignment & feedback enter classroom at http://b.socrative.com/login/student/

More information

Lecture 9: Linear Regression

Lecture 9: Linear Regression Lecture 9: Linear Regression Goals Develop basic concepts of linear regression from a probabilistic framework Estimating parameters and hypothesis testing with linear models Linear regression in R Regression

More information

FinQuiz Notes

FinQuiz Notes Reading 10 Multiple Regression and Issues in Regression Analysis 2. MULTIPLE LINEAR REGRESSION Multiple linear regression is a method used to model the linear relationship between a dependent variable

More information

ANOVA CIVL 7012/8012

ANOVA CIVL 7012/8012 ANOVA CIVL 7012/8012 ANOVA ANOVA = Analysis of Variance A statistical method used to compare means among various datasets (2 or more samples) Can provide summary of any regression analysis in a table called

More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

Making sense of Econometrics: Basics

Making sense of Econometrics: Basics Making sense of Econometrics: Basics Lecture 7: Multicollinearity Egypt Scholars Economic Society November 22, 2014 Assignment & feedback Multicollinearity enter classroom at room name c28efb78 http://b.socrative.com/login/student/

More information

Sociology 593 Exam 1 February 14, 1997

Sociology 593 Exam 1 February 14, 1997 Sociology 9 Exam February, 997 I. True-False. ( points) Indicate whether the following statements are true or false. If false, briefly explain why.. There are IVs in a multiple regression model. If the

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

MATH ASSIGNMENT 2: SOLUTIONS

MATH ASSIGNMENT 2: SOLUTIONS MATH 204 - ASSIGNMENT 2: SOLUTIONS (a) Fitting the simple linear regression model to each of the variables in turn yields the following results: we look at t-tests for the individual coefficients, and

More information

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables. Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate

More information

STAT Chapter 10: Analysis of Variance

STAT Chapter 10: Analysis of Variance STAT 515 -- Chapter 10: Analysis of Variance Designed Experiment A study in which the researcher controls the levels of one or more variables to determine their effect on the variable of interest (called

More information

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define

More information

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima Applied Statistics Lecturer: Serena Arima Hypothesis testing for the linear model Under the Gauss-Markov assumptions and the normality of the error terms, we saw that β N(β, σ 2 (X X ) 1 ) and hence s

More information

Lab 10 - Binary Variables

Lab 10 - Binary Variables Lab 10 - Binary Variables Spring 2017 Contents 1 Introduction 1 2 SLR on a Dummy 2 3 MLR with binary independent variables 3 3.1 MLR with a Dummy: different intercepts, same slope................. 4 3.2

More information

Statistics and Quantitative Analysis U4320

Statistics and Quantitative Analysis U4320 Statistics and Quantitative Analysis U3 Lecture 13: Explaining Variation Prof. Sharyn O Halloran Explaining Variation: Adjusted R (cont) Definition of Adjusted R So we'd like a measure like R, but one

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

B. Weaver (24-Mar-2005) Multiple Regression Chapter 5: Multiple Regression Y ) (5.1) Deviation score = (Y i

B. Weaver (24-Mar-2005) Multiple Regression Chapter 5: Multiple Regression Y ) (5.1) Deviation score = (Y i B. Weaver (24-Mar-2005) Multiple Regression... 1 Chapter 5: Multiple Regression 5.1 Partial and semi-partial correlation Before starting on multiple regression per se, we need to consider the concepts

More information

SIMPLE REGRESSION ANALYSIS. Business Statistics

SIMPLE REGRESSION ANALYSIS. Business Statistics SIMPLE REGRESSION ANALYSIS Business Statistics CONTENTS Ordinary least squares (recap for some) Statistical formulation of the regression model Assessing the regression model Testing the regression coefficients

More information

CIVL 7012/8012. Simple Linear Regression. Lecture 3

CIVL 7012/8012. Simple Linear Regression. Lecture 3 CIVL 7012/8012 Simple Linear Regression Lecture 3 OLS assumptions - 1 Model of population Sample estimation (best-fit line) y = β 0 + β 1 x + ε y = b 0 + b 1 x We want E b 1 = β 1 ---> (1) Meaning we want

More information

Introduction to Regression

Introduction to Regression Regression Introduction to Regression If two variables covary, we should be able to predict the value of one variable from another. Correlation only tells us how much two variables covary. In regression,

More information

Psychology Seminar Psych 406 Dr. Jeffrey Leitzel

Psychology Seminar Psych 406 Dr. Jeffrey Leitzel Psychology Seminar Psych 406 Dr. Jeffrey Leitzel Structural Equation Modeling Topic 1: Correlation / Linear Regression Outline/Overview Correlations (r, pr, sr) Linear regression Multiple regression interpreting

More information

5.1 Model Specification and Data 5.2 Estimating the Parameters of the Multiple Regression Model 5.3 Sampling Properties of the Least Squares

5.1 Model Specification and Data 5.2 Estimating the Parameters of the Multiple Regression Model 5.3 Sampling Properties of the Least Squares 5.1 Model Specification and Data 5. Estimating the Parameters of the Multiple Regression Model 5.3 Sampling Properties of the Least Squares Estimator 5.4 Interval Estimation 5.5 Hypothesis Testing for

More information

ECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information

ECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information 1. Qualitative Information Qualitative Information Up to now, we assume that all the variables has quantitative meaning. But often in empirical work, we must incorporate qualitative factor into regression

More information

ST430 Exam 2 Solutions

ST430 Exam 2 Solutions ST430 Exam 2 Solutions Date: November 9, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textbook are permitted but you may use a calculator. Giving

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

Daniel Boduszek University of Huddersfield

Daniel Boduszek University of Huddersfield Daniel Boduszek University of Huddersfield d.boduszek@hud.ac.uk Introduction to moderator effects Hierarchical Regression analysis with continuous moderator Hierarchical Regression analysis with categorical

More information

A discussion on multiple regression models

A discussion on multiple regression models A discussion on multiple regression models In our previous discussion of simple linear regression, we focused on a model in which one independent or explanatory variable X was used to predict the value

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

PubH 7405: REGRESSION ANALYSIS. MLR: INFERENCES, Part I

PubH 7405: REGRESSION ANALYSIS. MLR: INFERENCES, Part I PubH 7405: REGRESSION ANALYSIS MLR: INFERENCES, Part I TESTING HYPOTHESES Once we have fitted a multiple linear regression model and obtained estimates for the various parameters of interest, we want to

More information

Classification & Regression. Multicollinearity Intro to Nominal Data

Classification & Regression. Multicollinearity Intro to Nominal Data Multicollinearity Intro to Nominal Let s Start With A Question y = β 0 + β 1 x 1 +β 2 x 2 y = Anxiety Level x 1 = heart rate x 2 = recorded pulse Since we can all agree heart rate and pulse are related,

More information

Finding Relationships Among Variables

Finding Relationships Among Variables Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis

More information

Correlation and Regression Bangkok, 14-18, Sept. 2015

Correlation and Regression Bangkok, 14-18, Sept. 2015 Analysing and Understanding Learning Assessment for Evidence-based Policy Making Correlation and Regression Bangkok, 14-18, Sept. 2015 Australian Council for Educational Research Correlation The strength

More information

Review of Multiple Regression

Review of Multiple Regression Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate

More information

ECON 4230 Intermediate Econometric Theory Exam

ECON 4230 Intermediate Econometric Theory Exam ECON 4230 Intermediate Econometric Theory Exam Multiple Choice (20 pts). Circle the best answer. 1. The Classical assumption of mean zero errors is satisfied if the regression model a) is linear in the

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line

Inference for Regression Inference about the Regression Model and Using the Regression Line Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis Where as simple linear regression has 2 variables (1 dependent, 1 independent): y ˆ = a + bx Multiple linear regression has >2 variables (1 dependent, many independent): ˆ

More information

Remedial Measures, Brown-Forsythe test, F test

Remedial Measures, Brown-Forsythe test, F test Remedial Measures, Brown-Forsythe test, F test Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 7, Slide 1 Remedial Measures How do we know that the regression function

More information

Lecture 6: Linear Regression (continued)

Lecture 6: Linear Regression (continued) Lecture 6: Linear Regression (continued) Reading: Sections 3.1-3.3 STATS 202: Data mining and analysis October 6, 2017 1 / 23 Multiple linear regression Y = β 0 + β 1 X 1 + + β p X p + ε Y ε N (0, σ) i.i.d.

More information

Chapter 14 Multiple Regression Analysis

Chapter 14 Multiple Regression Analysis Chapter 14 Multiple Regression Analysis 1. a. Multiple regression equation b. the Y-intercept c. $374,748 found by Y ˆ = 64,1 +.394(796,) + 9.6(694) 11,6(6.) (LO 1) 2. a. Multiple regression equation b.

More information

Day 4: Shrinkage Estimators

Day 4: Shrinkage Estimators Day 4: Shrinkage Estimators Kenneth Benoit Data Mining and Statistical Learning March 9, 2015 n versus p (aka k) Classical regression framework: n > p. Without this inequality, the OLS coefficients have

More information

Use of Dummy (Indicator) Variables in Applied Econometrics

Use of Dummy (Indicator) Variables in Applied Econometrics Chapter 5 Use of Dummy (Indicator) Variables in Applied Econometrics Section 5.1 Introduction Use of Dummy (Indicator) Variables Model specifications in applied econometrics often necessitate the use of

More information

Review of Statistics

Review of Statistics Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and

More information