Multiple regression: Model building. CQMS 202 Business Statistics II. Prepared by Moez Hababou


Topics
Forward versus backward model building approach
Using the correlation matrix
Testing for multicollinearity: variance inflationary factor
Complete model versus reduced model
Non-linear models
Polynomial models: quadratic or cubic models
Exponential or logarithmic regression models

Correlation Matrix
Shows the simple correlation between every possible pair of variables.
Useful for understanding how each explanatory variable is related to the dependent variable.
Useful for showing the correlation among the explanatory variables.

Example
The data for this example were derived from an air pollution study in forty cities. The variables are defined as follows:
TMR: total mortality rate
SMIN, SMEAN, SMAX: biweekly sulphate reading; smallest annual, average annual and largest annual, respectively
PMIN, PMEAN, PMAX: biweekly suspended particulate reading; smallest annual, average annual and largest annual, respectively
GE65: percent of population at least 65, multiplied by 10
NONPOOR: percent of families above poverty level
PERWH: percent of whites in population
LPOP: logarithm of population
PM2: population density

Example: correlation matrix
[Table: correlation matrix of TMR, SMIN, SMEAN, SMAX, PMIN, PMEAN, PMAX, PM2, GE65, PERWH, NONPOOR and LPOP; numeric entries not preserved.]
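A correlation matrix like the one described above can be computed in a few lines. The sketch below is only an illustration: the data are synthetic, the variable names TMR, GE65 and SMIN are borrowed as labels, and NumPy is an assumed tool rather than the software used in the course.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40  # same number of cities as the example, but the values are synthetic

# Synthetic stand-ins for three of the variables (NOT the study data).
ge65 = rng.normal(80, 15, n)
smin = rng.normal(5, 2, n)
tmr = 500 + 5 * ge65 + 10 * smin + rng.normal(0, 40, n)

names = ["TMR", "GE65", "SMIN"]
data = np.column_stack([tmr, ge65, smin])

# rowvar=False: each column is a variable, each row is an observation.
corr = np.corrcoef(data, rowvar=False)

# Print the matrix with a row label for each variable.
for name, row in zip(names, corr):
    print(f"{name:>5}", " ".join(f"{r:7.3f}" for r in row))
```

The first row (or column) shows how each explanatory variable correlates with the dependent variable; the remaining off-diagonal entries show the correlation among the explanatory variables.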

Variance inflationary factor
A good indicator of the correlation between ONE independent variable $X_j$ and all the other independent variables:
$VIF_j = \frac{1}{1 - R_j^2}$
where $R_j^2$ is the coefficient of determination for the regression model
$X_j = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{j-1} x_{j-1} + \beta_{j+1} x_{j+1} + \dots + \beta_k x_k + u$

If $VIF_j$ is close to 1, variable $X_j$ is not strongly correlated with the other independent variables in the model and can be kept in.
If $VIF_j$ is higher than 10, variable $X_j$ is strongly correlated with the other independent variables in the model and should be dropped.

Reduced Models
Here we study reduced models, which contain only a subset of the set of possible explanatory variables.
To compare models in which one is a subset of another, we require a statistical test that indicates whether explanatory power has been lost by reducing the number of explanatory variables.
The partial F-test is outlined below for this purpose.
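The VIF definition above translates directly into code: regress each explanatory variable on all the others and plug the resulting $R_j^2$ into $1/(1 - R_j^2)$. The following is a minimal sketch on synthetic data; the helper names `r_squared` and `vif` are hypothetical, and NumPy least squares is an assumed substitute for whatever package produced the course output.

```python
import numpy as np

def r_squared(y, X):
    """R^2 from an ordinary least squares fit of y on X (intercept added here)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - np.sum(resid ** 2) / sst

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on the rest."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        out.append(1.0 / (1.0 - r_squared(X[:, j], others)))
    return np.array(out)

# Synthetic illustration: x3 is nearly a copy of x1, x2 is unrelated to both.
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)

vifs = vif(np.column_stack([x1, x2, x3]))
print(np.round(vifs, 2))
```

In this setup x1 and x3 should show VIFs far above the threshold of 10, while x2 should stay close to 1.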

Example
[Table: data columns TMR, SMIN, PMAX, GE65, PERWH and NONPOOR; values not preserved.]

Reduced Models
The two models being compared are called the full model and the reduced model.
The full model contains all k explanatory variables and is given by:
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_q x_q + \beta_{q+1} x_{q+1} + \dots + \beta_k x_k + u$
The reduced model eliminates the first q explanatory variables and is given by:
$y = \beta_0 + \beta_{q+1} x_{q+1} + \dots + \beta_k x_k + u$
k = number of independent variables in the complete model
k - q = number of independent variables in the reduced model

We wish to test the null hypothesis that the reduced model is as good as the full model for explaining the variation in y. Hence we have
$H_0$: Reduced model as good as Full model
Equivalently, we are testing that the coefficients of the first q explanatory variables are zero. Hence
$H_0: \beta_1 = \beta_2 = \dots = \beta_q = 0$
Note that the order of the variables in the model is arbitrary, so we can assume that the variables being tested are the first q.

Comparing Full and Reduced Models
Denote the sums of squares and coefficient of multiple determination for the full model by SST, SSR, SSE and $R^2$.
Denote the reduced model sums of squares and coefficient of multiple determination by $SSR_R$, $SSE_R$ and $R_R^2$.
Note that the total sum of squares remains fixed.

Test Statistic For Comparing Full and Reduced Models
The test statistic is given by:
$F = \frac{(SSR - SSR_R)/q}{SSE/(n-k-1)} = \frac{(R^2 - R_R^2)/q}{(1 - R^2)/(n-k-1)}$
F follows an F distribution with q and (n-k-1) d.f.

[Table: regression output (Coefficients, Standard Error, t Stat, P-value, 95% LL, 95% UL, VIF) for Intercept, SMIN, PMAX, GE65, PERWH and NONPOOR; values not preserved.]

[Table: regression statistics (Multiple R, R Square, Adjusted R Square, Standard Error; Observations = 40) and ANOVA table (df, SS, MS, F, Significance F for Regression, Residual and Total); values not preserved.]

$H_0$: Reduced model as good as Full model, or the extra variables are superfluous
$H_A$: Full model superior, or at least one of the 6 variables is important

[Table: df and SS for Regression, Residual and Total in the full model and in the reduced model; values not preserved.]

With q = 6 and n - k - 1 = 28:
$F = \frac{(SSR - SSR_R)/6}{SSE/28}$
Also, using the R Square values of the full and reduced models:
$F = \frac{(R^2 - R_R^2)/6}{(1 - R^2)/28}$
(The numeric values of both calculations are not preserved.)

The mean of an F distribution is usually near 1. Since F is much less than 1, we obviously cannot reject $H_0$.
Note: $F_{0.05,6,28} = 2.45$, $F_{0.01,6,28} = 3.53$, $F_{0.10,6,28} = 2.00$
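The partial F statistic can be checked with simple arithmetic. The sketch below uses the same n = 40, k = 11 and q = 6 as the example, but the sums of squares are made-up numbers (the actual values are not available here), and the function names are hypothetical.

```python
def partial_f(ssr_full, ssr_reduced, sse_full, n, k, q):
    """Partial F from sums of squares: [(SSR - SSR_R)/q] / [SSE/(n-k-1)]."""
    numerator = (ssr_full - ssr_reduced) / q
    denominator = sse_full / (n - k - 1)
    return numerator / denominator

def partial_f_from_r2(r2_full, r2_reduced, n, k, q):
    """Equivalent form using R^2: [(R^2 - R_R^2)/q] / [(1 - R^2)/(n-k-1)]."""
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k - 1)
    return numerator / denominator

# Hypothetical sums of squares (NOT the values from the air pollution study).
sst = 1000.0
ssr_full, sse_full = 800.0, 200.0
ssr_reduced = 780.0

n, k, q = 40, 11, 6  # 40 cities, 11 variables in the full model, 6 dropped

f_ss = partial_f(ssr_full, ssr_reduced, sse_full, n, k, q)
f_r2 = partial_f_from_r2(ssr_full / sst, ssr_reduced / sst, n, k, q)

critical_05 = 2.45  # F(0.05; 6, 28) as quoted in the notes
print(round(f_ss, 4), round(f_r2, 4), f_ss > critical_05)
```

Both forms give the same value because dividing numerator and denominator by the fixed SST turns sums of squares into R squared terms. With these made-up numbers F is below the 5% critical value, so the reduced model would not be rejected.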

Variable Selection Methods
The mix of variables in a regression model affects the significance of each explanatory variable. This raises the question of how to select a subset of variables so that R squared is as large as possible while the number of explanatory variables, all of them significant, is as small as possible.
Techniques that attempt to accomplish this are called variable selection methods or stepwise methods.
Two simple methods of variable selection are outlined below.

Forward Selection
The simplest form of variable selection.
Variables are entered one at a time, beginning with no variables in the model.
The variable added at each step is the one that maximizes the increase in R squared.
Because R squared cannot decrease when variables are added, it is advisable to focus on the increment in the ADJUSTED R squared.
Additions are made as long as the change in the ADJUSTED R squared is considered significant.

Because this method never removes a variable once it has been added, and because some variables may never be chosen for entry, it is recommended that variables be added far beyond conventional significance levels so that alternative models can be seen.
Recall that the significance level of some variables can change as a result of the addition of other variables.
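The forward selection procedure described above can be sketched as a greedy loop. This is an illustration on synthetic data, not the procedure used to produce the course output: the helper names `fit_r2` and `forward_select` are hypothetical, and the stopping rule (a minimum gain in adjusted R squared, `min_gain`) is an assumed stand-in for "considered significant".

```python
import numpy as np

def fit_r2(y, cols):
    """Return (R^2, adjusted R^2) for OLS of y on the given columns plus an intercept."""
    n, k = len(y), len(cols)
    A = np.column_stack([np.ones(n)] + cols)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse = np.sum((y - A @ coef) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - sse / sst
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return r2, adj

def forward_select(y, X, min_gain=0.01):
    """Add one column at a time; return column indices in order of entry."""
    remaining = list(range(X.shape[1]))
    chosen = []
    best_adj = 0.0  # adjusted R^2 of the intercept-only model
    while remaining:
        trials = []
        for j in remaining:
            r2, adj = fit_r2(y, [X[:, c] for c in chosen + [j]])
            trials.append((r2, adj, j))
        # Pick the candidate giving the largest R^2 at this step.
        r2, adj, j = max(trials, key=lambda t: t[0])
        if adj - best_adj <= min_gain:
            break  # assumed stopping rule: gain in adjusted R^2 too small
        chosen.append(j)
        remaining.remove(j)
        best_adj = adj
    return chosen

# Synthetic data: only columns 0 and 2 truly affect y.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] + 1.0 * X[:, 2] + rng.normal(size=100)

chosen = forward_select(y, X)
print(chosen)
```

Lowering `min_gain` lets more variables enter, which corresponds to the advice above to keep adding variables beyond conventional significance levels so that alternative models can be seen.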

A Useful Analogy
Consider an employer who wishes to hire college graduates to solve various problems.
The first graduate hired is the one who can solve the most problems.
The second graduate hired is the one who can solve the most problems that remain after the first graduate has been hired.
The second person hired is not necessarily the one who can solve the second largest number of problems.

Forward Stepwise Regression
[Table: Variable, R Squared and P(Partial F) at each step; rows listed in the order GE65, PERWH, PMEAN, NONPOOR, SMIN, PMIN, SMAX, SMEAN, PM2, LPOP, PMAX; values not preserved.]

Backward Elimination
This method is the opposite of the forward method in that the process begins with all variables included in the model.
The variables are removed in such a way that the reduction in R squared is minimized at each step.
Note that R squared cannot increase as variables are removed.

Backward Stepwise Regression
[Table: Variable, R Squared and P(Partial F) at each step; rows listed in the order PMAX, LPOP, PM2, PMIN, SMEAN, SMAX, SMIN, NONPOOR, PMEAN, PERWH, GE65; values not preserved.]

A Variable Selection Caution
Variable selection methods are not a good way of studying relationships among explanatory variables and a dependent variable. They are simply a fishing expedition.
Keep in mind that because you are maximizing R squared, there is a tendency to over-fit, or fit noise, in the sense that a second sample might produce some very different results.

If you do want to obtain a good predictive model, you should use cross validation methods to verify the quality of the model.
Cross validation involves fitting the model on one portion of the data set and then testing it on another portion of the data set.
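The cross validation idea above (fit on one portion, test on another) can be sketched with a single train/test split. The data are synthetic, the 40/20 split and the helper names `ols_fit` and `r2_score` are assumptions made for illustration, and the model is deliberately over-fitted with many irrelevant predictors.

```python
import numpy as np

def ols_fit(X, y):
    """Least squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r2_score(X, y, coef):
    """R^2 of the fitted coefficients evaluated on (X, y)."""
    A = np.column_stack([np.ones(len(y)), X])
    sse = np.sum((y - A @ coef) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - sse / sst

rng = np.random.default_rng(3)
n, p = 60, 20
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + rng.normal(size=n)  # only the first column matters

# Random split: 40 observations to fit, 20 held out to test.
idx = rng.permutation(n)
train, test = idx[:40], idx[40:]

coef = ols_fit(X[train], y[train])
r2_train = r2_score(X[train], y[train], coef)
r2_test = r2_score(X[test], y[test], coef)
print(round(r2_train, 3), round(r2_test, 3))
```

A large gap between the fitting-sample R squared and the held-out R squared is the symptom of over-fitting that the caution warns about.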

A Variable Selection Caution
In cases where you have over-fitted, typically the same variables will not be significant in both sets of data.
Another approach is to take variables that are mutually highly correlated, combine them into factors (e.g. a simple average), and then use the factors as explanatory variables.

A more sophisticated version of this is called factor analysis, which is heavily used in the social sciences, or principal components analysis, which is heavily used in the physical and biological sciences.

Nonlinear Models
The nonlinear models we study in this section are called intrinsically linear in that they can be fitted using linear regression methods.
Although the relationship between the dependent variable and the explanatory variable is nonlinear, the model can be expressed as a linear combination of explanatory variables.

Nonlinear Models
Polynomial models: quadratic and cubic models
Models with interaction
Growth models
Models with dummy variables
Grafted models

Quadratic Models
The quadratic linear model is given by
$y = \beta_0 + \beta_1 X + \beta_2 X^2 + u$
It can be fitted using multiple regression.
It is a curvilinear model which has one change in direction.
Also referred to as a parabola.
Often in practice there is no change in direction within the range of the data, and hence only one branch of the parabola is used.

Quadratic Model
[Figure: example of a quadratic model, Y plotted against X.]
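Because the quadratic model is linear in its coefficients, it can be fitted by treating X and X squared as two explanatory variables. The sketch below uses synthetic data with made-up coefficients; NumPy least squares is an assumed tool, not the course software.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0, 10, 50)
# Synthetic parabola with noise; the true coefficients are 1.0, 2.0 and -0.15.
y = 1.0 + 2.0 * x - 0.15 * x**2 + rng.normal(0, 0.2, x.size)

# Design matrix: intercept, X, and X squared as separate columns.
A = np.column_stack([np.ones_like(x), x, x**2])
b0, b1, b2 = np.linalg.lstsq(A, y, rcond=None)[0]

# The single change in direction occurs where the slope b1 + 2*b2*X is zero.
turning_point = -b1 / (2 * b2)
print(round(b0, 3), round(b1, 3), round(b2, 3), round(turning_point, 2))
```

If the turning point lies outside the observed range of X, only one branch of the parabola is actually being used, as noted above.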

This example data set relates bank employee salaries to age level and years of education. A variety of fitted models is illustrated below. Because of the natural skewness in the salary variable, the logarithm of salary will be used as the dependent variable.

[Table: data columns LCURRENT, EDUC, EDAGE, AGE, AGE2 and a further EDUC column (presumably the squared term); values not preserved.]

[Table: regression output with 100 observations (Regression Statistics, ANOVA, and Coefficients, Standard Error, t Stat, P-value, 95% LL, 95% UL, VIF) for Intercept and EDUC; values not preserved.]

[Table: regression output with 100 observations for Intercept, EDUC and a second EDUC term (presumably EDUC squared); values not preserved.]

[Table: regression output with 100 observations for Intercept and AGE; values not preserved.]

[Table: regression output with 100 observations for Intercept, AGE and a second AGE term (presumably AGE2); values not preserved.]

Models With Interaction
Given a regression model with two explanatory variables X and Z, an interaction variable can be defined by taking the product of X and Z, written XZ.
Adding this interaction variable to the regression model yields a model which defines a nonlinear surface in three dimensional space, called a quadratic surface.

Fitting a Model With Interaction
The presence of the interaction variable allows the response of Y to X to depend on the level of Z and, similarly, the response of Y to Z to depend on the level of X.
This can be seen by looking at the model below:
$Y = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ + u$
$\;\;= \beta_0 + [\beta_1 + \beta_3 Z] X + \beta_2 Z + u$
or
$\;\;= \beta_0 + \beta_1 X + [\beta_2 + \beta_3 X] Z + u$
Since this interaction model is linear in the three terms X, Z and XZ, linear regression methods can still be applied.
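The rearranged form of the interaction model says that the slope of Y in X is $\beta_1 + \beta_3 Z$. A short sketch on synthetic data illustrates this; the coefficients, ranges and the helper `slope_in_x` are made up for the example and do not come from the bank employee data.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
x = rng.uniform(0, 10, n)
z = rng.uniform(0, 5, n)
# Synthetic data with a true interaction coefficient of 0.3.
y = 1.0 + 0.5 * x + 2.0 * z + 0.3 * x * z + rng.normal(0, 0.5, n)

# The product X*Z enters the design matrix as just another column.
A = np.column_stack([np.ones(n), x, z, x * z])
b0, b1, b2, b3 = np.linalg.lstsq(A, y, rcond=None)[0]

def slope_in_x(z_level):
    """Estimated response of Y to a one-unit change in X at a given level of Z."""
    return b1 + b3 * z_level

for z_level in (1.0, 4.0):
    print(z_level, round(slope_in_x(z_level), 3))
```

The two printed slopes differ by three times the estimated interaction coefficient, which is the sense in which the response to X depends on the level of Z.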

Continuing with the bank employee data used above, we study the use of an interaction variable between EDUC and AGE.

[Table: regression output with 100 observations (Regression Statistics, ANOVA, and Coefficients, Standard Error, t Stat, P-value, 95% LL, 95% UL, VIF) for Intercept, EDUC, EDAGE and AGE; values not preserved.]

[Table: regression output with 100 observations for Intercept, EDUC, EDAGE, AGE, a second AGE term and a second EDUC term (presumably the squared terms); values not preserved.]

[Figure: LNSALARY vs EDUC and AGE; LNSALARY plotted against EDUC, one line per AGE level.]

[Figure: LNSALARY vs AGE and EDUC; LNSALARY plotted against AGE, one line per EDUC level (including EDUC=16 and EDUC=20).]

Growth Models
In business analysis, growth projections and analysis are an integral part of planning and performance evaluation.
It is therefore of interest to be able to fit a constant growth model to a time series such as revenue, costs or profit.
Fitting a linear model to a time series results in a series which grows by a constant amount each year, whereas a constant yearly growth rate requires an exponential model.
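The difference between growing by a constant amount and growing at a constant rate can be seen with a few lines of arithmetic. The starting value, yearly amount and yearly rate below are made-up numbers chosen only for illustration.

```python
import math

start = 100.0   # hypothetical starting revenue
amount = 10.0   # linear model: grows by 10 units per year
rate = 0.10     # exponential model: grows by 10% per year
years = 5

linear = [start + amount * t for t in range(years + 1)]
exponential = [start * (1 + rate) ** t for t in range(years + 1)]

# Year-on-year differences of the linear series are constant.
linear_steps = [linear[t + 1] - linear[t] for t in range(years)]

# Year-on-year differences of the LOG of the exponential series are constant,
# which is why a linear model is fitted to the logarithm of the series.
logs = [math.log(v) for v in exponential]
log_steps = [logs[t + 1] - logs[t] for t in range(years)]

print(linear_steps)
print([round(s, 5) for s in log_steps])
```

Each log step equals ln(1 + rate), so the slope of a straight line fitted to the logged series recovers the constant growth rate.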

Fitting a Seasonal Time Series Using Dummy Variables And Slope Shifters
A linear model which allows for variation over the seasons of the year can be constructed using dummy variables and slope shifters.
This technique will be combined with the constant growth model to study a time series.
As an example, the monthly retail trade series for chain stores in Canada for the period [years not preserved] will be studied.

A linear model is fitted to the logarithm of retail trade in order to fit a constant growth model.
Eleven dummy variables and eleven slope shifter variables are used to represent the eleven months other than December.
Since December is by far the month with the largest monthly retail sales in any given year, it provides a convenient base case.

[Table: data columns LNSALES, TIME, D1 to D11 and TD1 to TD11; values not preserved.]

See ChartSales and ChartLogsales for the time series behavior of SALES. Sheet4 has the data for these charts.

[Figure: Retail Sales vs Time; SALES plotted against TIME.]

[Figure: LNSALES vs Time; LNSALES plotted against TIME.]

See the sheet labeled REGRESSION1 for the fitting of the constant growth model with monthly dummies and slope shifters (red table).
This model permits different sales levels for each month as well as different growth rates for each month.

Retail Sales Data: Regression on Dummies and Slope Shifters
[Table: regression statistics (Multiple R, R Square, Adjusted R Square, Standard Error; Observations = 216) and ANOVA table (df, SS, MS, F, Significance F); values not preserved.]

[Table: Coefficients, Standard Error, t Stat and P-value for Intercept, TIME, D1 to D11 and the TD slope shifters; values not preserved.]

[Table, continued: Coefficients, Standard Error, t Stat and P-value for the remaining dummies and the TD slope shifters; values not preserved.]

See the sheet labeled REGRESSION2 for the fitting of the constant growth model with monthly dummies but no slope shifters.
This model permits different sales levels each month but assumes the same growth rate for all months.
The red table shows details of the fit, and the green table shows fitted values and residuals.

Retail Sales Data: Regression on Dummies Only
[Table: regression statistics (Multiple R, R Square, Adjusted R Square, Standard Error; Observations = 216) and ANOVA table (df, SS, MS, F, Significance F); values not preserved.]

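[Table: Coefficients, Standard Error, t Stat and P-value for Intercept, TIME and D1 to D11; values not preserved.]

In the equations below, $b_0$, $b_1$ and $d_j$ denote the fitted intercept, the fitted TIME coefficient and the fitted coefficient of $D_j$; their numeric values are not preserved.
$\ln SALES = b_0 + b_1\,TIME + d_1 D_1 + d_2 D_2 + \dots + d_{11} D_{11}$

December (all dummies equal to zero):
$\ln SALES = b_0 + b_1\,TIME$
$SALES = e^{b_0} \times e^{b_1 TIME} = 1{,}846{,}865\,(e^{b_1})^{TIME}$

January ($D_1 = 1$):
$\ln SALES = (b_0 + d_1) + b_1\,TIME$
$SALES = e^{b_0 + d_1}\,(e^{b_1})^{TIME}$

February ($D_2 = 1$):
$\ln SALES = (b_0 + d_2) + b_1\,TIME$
$SALES = e^{b_0 + d_2}\,(e^{b_1})^{TIME}$

Note that the monthly growth rate is the same for every month, so the annual growth factor is $(e^{b_1})^{12} = 1.0977$, or an average rate of growth of 9.77% per year.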

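To compare monthly sales
Here $b_0$, $b_1$ and $d_1$ denote the fitted intercept, the fitted TIME coefficient and the fitted coefficient of the January dummy; their numeric values are not preserved.
December: $\ln SALES = b_0 + b_1\,TIME^*$, where $TIME^*$ is a particular December
January: $\ln SALES = b_0 + d_1 + b_1 (TIME^* + 1) = (b_0 + d_1 + b_1) + b_1\,TIME^*$
Difference: January - December $= d_1 + b_1$

$\ln\frac{Jan}{Dec} = \ln Jan - \ln Dec = d_1 + b_1$
$\frac{Jan}{Dec} = e^{d_1 + b_1} = 0.564$, so $Jan = 0.564 \times Dec$
So January retail sales are expected to be 56.4% of the prior December retail sales.

See ChartFittedSeasonality to observe the fitted constant growth model compared to the actual data.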
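The constant growth model with monthly dummies, and the two calculations derived from it (annual growth rate and the January-to-December ratio), can be sketched as follows. Everything numeric here is synthetic: the series, the "true" coefficients, and the assumption that the series starts in January are all made up for illustration, so the printed figures will not match the 9.77% and 56.4% reported for the Canadian data.

```python
import numpy as np

rng = np.random.default_rng(6)
n_months = 216
time = np.arange(1, n_months + 1)
month = (time - 1) % 12 + 1  # 1 = January ... 12 = December (assumed start in January)

# Made-up coefficients for the synthetic log-sales series.
true_b0, true_b1 = 14.0, 0.008
true_season = np.zeros(13)                       # index 12 (December) stays 0: base case
true_season[1:12] = np.linspace(-0.6, -0.3, 11)  # January through November effects
lnsales = true_b0 + true_b1 * time + true_season[month] + rng.normal(0, 0.02, n_months)

# Eleven dummies D1..D11 for January..November; December is the base case.
D = np.column_stack([(month == m).astype(float) for m in range(1, 12)])
A = np.column_stack([np.ones(n_months), time, D])
coef = np.linalg.lstsq(A, lnsales, rcond=None)[0]
b0, b1, d = coef[0], coef[1], coef[2:]

# Monthly growth factor is exp(b1); compounding over 12 months gives annual growth.
annual_growth = np.exp(12 * b1) - 1

# January vs the preceding December: one extra month of growth plus the January dummy.
jan_over_dec = np.exp(d[0] + b1)

print(round(annual_growth, 4), round(jan_over_dec, 4))
```

Adding slope shifters would mean appending the products of `time` with each dummy column to the design matrix, which lets the growth rate differ by month.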

[Figure: Predicted Sales vs Time; Sales plotted against Time.]


Univariate analysis. Simple and Multiple Regression. Univariate analysis. Simple Regression How best to summarise the data? Univariate analysis Example - linear regression equation: y = ax + c Least squares criteria ( yobs ycalc ) = yobs ( ax + c) = minimum Simple and + = xa xc xy xa + nc = y Solve for a and c Univariate analysis

More information

Correlation and Regression

Correlation and Regression Correlation and Regression Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University 1 Learning Objectives Upon successful completion of this module, the student should

More information

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define

More information

Time-Series Analysis. Dr. Seetha Bandara Dept. of Economics MA_ECON

Time-Series Analysis. Dr. Seetha Bandara Dept. of Economics MA_ECON Time-Series Analysis Dr. Seetha Bandara Dept. of Economics MA_ECON Time Series Patterns A time series is a sequence of observations on a variable measured at successive points in time or over successive

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

Variance Decomposition and Goodness of Fit

Variance Decomposition and Goodness of Fit Variance Decomposition and Goodness of Fit 1. Example: Monthly Earnings and Years of Education In this tutorial, we will focus on an example that explores the relationship between total monthly earnings

More information

Python 데이터분석 보충자료. 윤형기

Python 데이터분석 보충자료. 윤형기 Python 데이터분석 보충자료 윤형기 (hky@openwith.net) 단순 / 다중회귀분석 Logistic Regression 회귀분석 REGRESSION Regression 개요 single numeric D.V. (value to be predicted) 과 one or more numeric I.V. (predictors) 간의관계식. "regression"

More information

SIMPLE REGRESSION ANALYSIS. Business Statistics

SIMPLE REGRESSION ANALYSIS. Business Statistics SIMPLE REGRESSION ANALYSIS Business Statistics CONTENTS Ordinary least squares (recap for some) Statistical formulation of the regression model Assessing the regression model Testing the regression coefficients

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Chapter 26 Multiple Regression, Logistic Regression, and Indicator Variables

Chapter 26 Multiple Regression, Logistic Regression, and Indicator Variables Chapter 26 Multiple Regression, Logistic Regression, and Indicator Variables 26.1 S 4 /IEE Application Examples: Multiple Regression An S 4 /IEE project was created to improve the 30,000-footlevel metric

More information

ST430 Exam 2 Solutions

ST430 Exam 2 Solutions ST430 Exam 2 Solutions Date: November 9, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textbook are permitted but you may use a calculator. Giving

More information

FinQuiz Notes

FinQuiz Notes Reading 10 Multiple Regression and Issues in Regression Analysis 2. MULTIPLE LINEAR REGRESSION Multiple linear regression is a method used to model the linear relationship between a dependent variable

More information

CS 5014: Research Methods in Computer Science

CS 5014: Research Methods in Computer Science Computer Science Clifford A. Shaffer Department of Computer Science Virginia Tech Blacksburg, Virginia Fall 2010 Copyright c 2010 by Clifford A. Shaffer Computer Science Fall 2010 1 / 207 Correlation and

More information

Bayesian Analysis LEARNING OBJECTIVES. Calculating Revised Probabilities. Calculating Revised Probabilities. Calculating Revised Probabilities

Bayesian Analysis LEARNING OBJECTIVES. Calculating Revised Probabilities. Calculating Revised Probabilities. Calculating Revised Probabilities Valua%on and pricing (November 5, 2013) LEARNING OBJECTIVES Lecture 7 Decision making (part 3) Regression theory Olivier J. de Jong, LL.M., MM., MBA, CFD, CFFA, AA www.olivierdejong.com 1. List the steps

More information

CHAPTER 4 CRITICAL GROWTH SEASONS AND THE CRITICAL INFLOW PERIOD. The numbers of trawl and by bag seine samples collected by year over the study

CHAPTER 4 CRITICAL GROWTH SEASONS AND THE CRITICAL INFLOW PERIOD. The numbers of trawl and by bag seine samples collected by year over the study CHAPTER 4 CRITICAL GROWTH SEASONS AND THE CRITICAL INFLOW PERIOD The numbers of trawl and by bag seine samples collected by year over the study period are shown in table 4. Over the 18-year study period,

More information

2. Regression Review

2. Regression Review 2. Regression Review 2.1 The Regression Model The general form of the regression model y t = f(x t, β) + ε t where x t = (x t1,, x tp ), β = (β 1,..., β m ). ε t is a random variable, Eε t = 0, Var(ε t

More information

Ch 7: Dummy (binary, indicator) variables

Ch 7: Dummy (binary, indicator) variables Ch 7: Dummy (binary, indicator) variables :Examples Dummy variable are used to indicate the presence or absence of a characteristic. For example, define female i 1 if obs i is female 0 otherwise or male

More information

Correlation & Simple Regression

Correlation & Simple Regression Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e Economics 102: Analysis of Economic Data Cameron Spring 2016 Department of Economics, U.C.-Davis Final Exam (A) Tuesday June 7 Compulsory. Closed book. Total of 58 points and worth 45% of course grade.

More information

28. SIMPLE LINEAR REGRESSION III

28. SIMPLE LINEAR REGRESSION III 28. SIMPLE LINEAR REGRESSION III Fitted Values and Residuals To each observed x i, there corresponds a y-value on the fitted line, y = βˆ + βˆ x. The are called fitted values. ŷ i They are the values of

More information

STATISTICS 110/201 PRACTICE FINAL EXAM

STATISTICS 110/201 PRACTICE FINAL EXAM STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable

More information

Statistics and Quantitative Analysis U4320

Statistics and Quantitative Analysis U4320 Statistics and Quantitative Analysis U3 Lecture 13: Explaining Variation Prof. Sharyn O Halloran Explaining Variation: Adjusted R (cont) Definition of Adjusted R So we'd like a measure like R, but one

More information

4. Nonlinear regression functions

4. Nonlinear regression functions 4. Nonlinear regression functions Up to now: Population regression function was assumed to be linear The slope(s) of the population regression function is (are) constant The effect on Y of a unit-change

More information

1. (Problem 3.4 in OLRT)

1. (Problem 3.4 in OLRT) STAT:5201 Homework 5 Solutions 1. (Problem 3.4 in OLRT) The relationship of the untransformed data is shown below. There does appear to be a decrease in adenine with increased caffeine intake. This is

More information

STAT Chapter 11: Regression

STAT Chapter 11: Regression STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship

More information

School of Mathematical Sciences. Question 1

School of Mathematical Sciences. Question 1 School of Mathematical Sciences MTH5120 Statistical Modelling I Practical 8 and Assignment 7 Solutions Question 1 Figure 1: The residual plots do not contradict the model assumptions of normality, constant

More information

1.) Fit the full model, i.e., allow for separate regression lines (different slopes and intercepts) for each species

1.) Fit the full model, i.e., allow for separate regression lines (different slopes and intercepts) for each species Lecture notes 2/22/2000 Dummy variables and extra SS F-test Page 1 Crab claw size and closing force. Problem 7.25, 10.9, and 10.10 Regression for all species at once, i.e., include dummy variables for

More information

MGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu

MGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu Last Name (Print): Solution First Name (Print): Student Number: MGECHY L Introduction to Regression Analysis Term Test Friday July, PM Instructor: Victor Yu Aids allowed: Time allowed: Calculator and one

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Single and multiple linear regression analysis

Single and multiple linear regression analysis Single and multiple linear regression analysis Marike Cockeran 2017 Introduction Outline of the session Simple linear regression analysis SPSS example of simple linear regression analysis Additional topics

More information

Systems of Equations and Inequalities. College Algebra

Systems of Equations and Inequalities. College Algebra Systems of Equations and Inequalities College Algebra System of Linear Equations There are three types of systems of linear equations in two variables, and three types of solutions. 1. An independent system

More information

Lecture 9: Linear Regression

Lecture 9: Linear Regression Lecture 9: Linear Regression Goals Develop basic concepts of linear regression from a probabilistic framework Estimating parameters and hypothesis testing with linear models Linear regression in R Regression

More information

Ch 3: Multiple Linear Regression

Ch 3: Multiple Linear Regression Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

More information

STA121: Applied Regression Analysis

STA121: Applied Regression Analysis STA121: Applied Regression Analysis Linear Regression Analysis - Chapters 3 and 4 in Dielman Artin Department of Statistical Science September 15, 2009 Outline 1 Simple Linear Regression Analysis 2 Using

More information

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X.04) =.8508. For z < 0 subtract the value from,

More information

CHAPTER 5 FUNCTIONAL FORMS OF REGRESSION MODELS

CHAPTER 5 FUNCTIONAL FORMS OF REGRESSION MODELS CHAPTER 5 FUNCTIONAL FORMS OF REGRESSION MODELS QUESTIONS 5.1. (a) In a log-log model the dependent and all explanatory variables are in the logarithmic form. (b) In the log-lin model the dependent variable

More information

Multiple linear regression S6

Multiple linear regression S6 Basic medical statistics for clinical and experimental research Multiple linear regression S6 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/42 Introduction Two main motivations for doing multiple

More information

Observations Homework Checkpoint quizzes Chapter assessments (Possibly Projects) Blocks of Algebra

Observations Homework Checkpoint quizzes Chapter assessments (Possibly Projects) Blocks of Algebra September The Building Blocks of Algebra Rates, Patterns and Problem Solving Variables and Expressions The Commutative and Associative Properties The Distributive Property Equivalent Expressions Seeing

More information

STATISTICAL DATA ANALYSIS IN EXCEL

STATISTICAL DATA ANALYSIS IN EXCEL Microarra Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 5 Linear Regression dr. Petr Nazarov 14-1-213 petr.nazarov@crp-sante.lu Statistical data analsis in Ecel. 5. Linear regression OUTLINE Lecture

More information

Lecture 48 Sections Mon, Nov 16, 2009

Lecture 48 Sections Mon, Nov 16, 2009 and and Lecture 48 Sections 13.4-13.5 Hampden-Sydney College Mon, Nov 16, 2009 Outline and 1 2 3 4 5 6 Outline and 1 2 3 4 5 6 and Exercise 13.4, page 821. The following data represent trends in cigarette

More information

Dr. Maddah ENMG 617 EM Statistics 11/28/12. Multiple Regression (3) (Chapter 15, Hines)

Dr. Maddah ENMG 617 EM Statistics 11/28/12. Multiple Regression (3) (Chapter 15, Hines) Dr. Maddah ENMG 617 EM Statistics 11/28/12 Multiple Regression (3) (Chapter 15, Hines) Problems in multiple regression: Multicollinearity This arises when the independent variables x 1, x 2,, x k, are

More information

1 Introduction to Minitab

1 Introduction to Minitab 1 Introduction to Minitab Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at my.calpoly.edu. When you

More information

STA 101 Final Review

STA 101 Final Review STA 101 Final Review Statistics 101 Thomas Leininger June 24, 2013 Announcements All work (besides projects) should be returned to you and should be entered on Sakai. Office Hour: 2 3pm today (Old Chem

More information

STAT Checking Model Assumptions

STAT Checking Model Assumptions STAT 704 --- Checking Model Assumptions Recall we assumed the following in our model: (1) The regression relationship between the response and the predictor(s) specified in the model is appropriate (2)

More information

PubH 7405: REGRESSION ANALYSIS. MLR: INFERENCES, Part I

PubH 7405: REGRESSION ANALYSIS. MLR: INFERENCES, Part I PubH 7405: REGRESSION ANALYSIS MLR: INFERENCES, Part I TESTING HYPOTHESES Once we have fitted a multiple linear regression model and obtained estimates for the various parameters of interest, we want to

More information

Regression Model Building

Regression Model Building Regression Model Building Setting: Possibly a large set of predictor variables (including interactions). Goal: Fit a parsimonious model that explains variation in Y with a small set of predictors Automated

More information

Ch 13 & 14 - Regression Analysis

Ch 13 & 14 - Regression Analysis Ch 3 & 4 - Regression Analysis Simple Regression Model I. Multiple Choice:. A simple regression is a regression model that contains a. only one independent variable b. only one dependent variable c. more

More information