The simple linear regression model discussed in Chapter 13 was written as


Chapter 14  Multiple Regression

14.1 Multiple Regression Analysis
14.2 Assumptions of the Multiple Regression Model
14.3 Standard Deviation of Errors
14.4 Coefficient of Multiple Determination
14.5 Computer Solution of Multiple Regression

In Chapter 13, we discussed simple linear regression and linear correlation. A simple regression model includes one independent and one dependent variable, and it presents a very simplified scenario of real-world situations. In the real world, a dependent variable is usually influenced by a number of independent variables. For example, the sales of a company's product may be determined by the price of that product, the quality of the product, and the advertising expenditure incurred by the company to promote that product. Therefore, it makes more sense to use a regression model that includes more than one independent variable. Such a model is called a multiple regression model. In this chapter we will discuss multiple regression models.

14.1 Multiple Regression Analysis

The simple linear regression model discussed in Chapter 13 was written as

y = A + Bx + ε

This model includes one independent variable, which is denoted by x, and one dependent variable, which is denoted by y. As we know from Chapter 13, the term ε in the above model is called the random error. Usually a dependent variable is affected by more than one independent variable. When we include two or more independent variables in a regression model, it is called a multiple regression model. Remember, whether it is a simple or a multiple regression model, it always includes one and only one dependent variable.

A multiple regression model with y as a dependent variable and x1, x2, x3, …, xk as independent variables is written as

y = A + B1x1 + B2x2 + B3x3 + … + Bkxk + ε   (1)

where A represents the constant term, B1, B2, B3, …, Bk are the regression coefficients of the independent variables x1, x2, x3, …, xk, respectively, and ε represents the random error term. This model contains k independent variables x1, x2, x3, …, and xk.

From model (1), it would seem that multiple regression models can be used only when the relationship between the dependent variable and each independent variable is linear. Furthermore, it also appears as if there can be no interaction between two or more of the independent variables. This is far from the truth. In the real world, a multiple regression model can be much more complex. Discussion of such models is outside the scope of this book. When each term contains a single independent variable raised to the first power, as in model (1), we call it a first-order multiple regression model. This is the only type of multiple regression model we will discuss in this chapter.

In regression model (1), A represents the constant term, which gives the value of y when all independent variables assume zero values. The coefficients B1, B2, B3, …, and Bk are called the partial regression coefficients. For example, B1 is the partial regression coefficient of x1. It gives the change in y due to a one-unit change in x1 when all other independent variables included in the model are held constant. In other words, if we change x1 by one unit but keep x2, x3, …, and xk unchanged, then the resulting change in y is measured by B1. Similarly, the value of B2 gives the change in y due to a one-unit change in x2 when all other independent variables are held constant. In model (1), A, B1, B2, B3, …, and Bk are called the true regression coefficients or population parameters.
A positive value for a particular Bi in model (1) indicates a positive relationship between y and the corresponding xi variable; a negative value for a particular Bi indicates a negative relationship. Remember that in a first-order regression model such as model (1), the relationship between each x and y is a straight-line relationship.

In model (1), A + B1x1 + B2x2 + B3x3 + … + Bkxk is called the deterministic portion, and ε is the stochastic portion of the model.

When we use the t distribution to make inferences about a single parameter of a multiple regression model, the degrees of freedom are calculated as

df = n − k − 1

where n represents the sample size and k is the number of independent variables in the model.

Definition
Multiple Regression Model  A regression model that includes two or more independent variables is called a multiple regression model. It is written as

y = A + B1x1 + B2x2 + B3x3 + … + Bkxk + ε

where y is the dependent variable, x1, x2, x3, …, xk are the k independent variables, and ε is the random error term. When each of the xi terms represents a single variable raised to the first power, as in the above model, this model is referred to as a first-order multiple regression model. For such a model with a sample size of n and k independent variables, the degrees of freedom are

df = n − k − 1

When a multiple regression model includes only two independent variables (with k = 2), model (1) reduces to

y = A + B1x1 + B2x2 + ε

A multiple regression model with three independent variables (with k = 3) is written as

y = A + B1x1 + B2x2 + B3x3 + ε
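The chapter performs all of its computations in MINITAB. Purely as an illustrative sketch (not from the text), the first-order model (1) with k = 2 and its degrees of freedom can be expressed in Python; every numeric value below is invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate the first-order model y = A + B1*x1 + B2*x2 + eps with two
# independent variables (k = 2); all numeric values here are invented.
A, B1, B2 = 110.0, -2.75, 16.11
n, k = 12, 2

x1 = rng.uniform(0, 25, size=n)      # first independent variable
x2 = rng.integers(0, 5, size=n)      # second independent variable
eps = rng.normal(0, 10, size=n)      # random error term, mean zero
y = A + B1 * x1 + B2 * x2 + eps      # dependent variable

df = n - k - 1                       # df for t inferences about one B_i
print(df)
```

With n = 12 observations and k = 2 independent variables, the degrees of freedom for inferences about a single coefficient are 12 − 2 − 1 = 9, matching the formula above.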

If model (1) is estimated using sample data, which is usually the case, the estimated regression equation is written as

ŷ = a + b1x1 + b2x2 + b3x3 + … + bkxk   (2)

In equation (2), a, b1, b2, b3, …, and bk are the sample statistics, which are the point estimators of the population parameters A, B1, B2, B3, …, and Bk, respectively. In model (1), y denotes the actual values of the dependent variable for members of the sample. In the estimated equation (2), ŷ denotes the predicted or estimated values of the dependent variable. The difference between any pair of y and ŷ values gives the error of prediction. For a multiple regression model,

SSE = Σ(y − ŷ)²

where SSE stands for the error sum of squares. As in Chapter 13, the estimated regression equation (2) is obtained by minimizing the sum of squared errors, that is,

Minimize Σ(y − ŷ)²

The estimated equation (2) obtained by minimizing the sum of squared errors is called the least squares regression equation.

Usually the calculations in a multiple regression analysis are made by using statistical software packages, such as MINITAB, instead of applying the formulas manually. Even for a multiple regression equation with two independent variables, the formulas are complex and manual calculations are time consuming. In this chapter we will perform the multiple regression analysis using MINITAB. The solutions obtained by using other statistical software packages, such as JMP, SAS, S-Plus, or SPSS, can be interpreted the same way. The TI-84 and Excel do not have built-in procedures for the multiple regression model.

14.2 Assumptions of the Multiple Regression Model

Like a simple linear regression model, a multiple (linear) regression model is based on certain assumptions. The following are the major assumptions for the multiple regression model (1).
Assumption 1: The mean of the probability distribution of ε is zero, that is,

E(ε) = 0

If we calculate the errors for all measurements for a given set of values of the independent variables for a population data set, the mean of these errors will be zero. In other words, while individual predictions will have some amount of error, on average our predictions will be correct. Under this assumption, the mean value of y is given by the deterministic part of regression model (1). Thus,

E(y) = A + B1x1 + B2x2 + B3x3 + … + Bkxk

where E(y) is the expected or mean value of y for the population. This mean value of y is also denoted by μ_y|x1, x2, …, xk.

Assumption 2: The errors associated with different sets of values of the independent variables are independent. Furthermore, these errors are normally distributed and have a constant standard deviation, which is denoted by σ.

Assumption 3: The independent variables are not linearly related. However, they can have a nonlinear relationship. When independent variables are highly linearly correlated, it is referred to as multicollinearity. This assumption is about the nonexistence of the multicollinearity problem. For example, consider the following multiple regression model:

y = A + B1x1 + B2x2 + B3x3 + ε

All of the following linear relationships (and other such linear relationships) between x1, x2, and x3 should be invalid for this model:

x1 = x2 + 4x3    x2 = 5x1 − 2x3    x1 = 3.5x2

If any such linear relationship exists, we can substitute one variable for another, which will reduce the number of independent variables to two. However, nonlinear relationships between x1, x2, and x3, such as x1 = 4x2² and x2 = 2x1² + 6x3, are permissible. In practice, multicollinearity is a major issue. Examining the correlation for each pair of independent variables is a good way to determine whether multicollinearity exists.

Assumption 4: There is no linear association between the random error term ε and each independent variable xi.

14.3 Standard Deviation of Errors

The standard deviation of errors (also called the standard error of the estimate) for the multiple regression model (1) is denoted by σ, and it is a measure of variation among the errors. However, when sample data are used to estimate multiple regression model (1), the standard deviation of errors is denoted by s_e. The formula to calculate s_e is as follows:

s_e = √(SSE / (n − k − 1))   where SSE = Σ(y − ŷ)²

Note that here SSE is the error sum of squares. We will not use this formula to calculate s_e manually; rather, we will obtain it from the computer solution. Note that many software packages label s_e as Root MSE, where MSE stands for mean square error.

14.4 Coefficient of Multiple Determination

In Chapter 13, we denoted the coefficient of determination for a simple linear regression model by r² and defined it as the proportion of the total sum of squares SST that is explained by the regression model. The coefficient of determination for the multiple regression model, usually called the coefficient of multiple determination, is denoted by R² and is defined as the proportion of the total sum of squares SST that is explained by the multiple regression model.
It tells us how good the multiple regression model is and how well the independent variables included in the model explain the dependent variable. Like r², the value of the coefficient of multiple determination R² always lies in the range 0 to 1, that is,

0 ≤ R² ≤ 1

Just as in the case of the simple linear regression model, SST is the total sum of squares, SSR is the regression sum of squares, and SSE is the error sum of squares. SST is always equal to the sum of SSR and SSE. They are calculated as follows:

SSE = Σe² = Σ(y − ŷ)²
SST = SS_yy = Σ(y − ȳ)²
SSR = Σ(ŷ − ȳ)²

SSR is the portion of SST that is explained by the use of the regression model, and SSE is the portion of SST that is not explained by the use of the regression model. The coefficient of multiple determination is given by the ratio of SSR and SST as follows:

R² = SSR / SST
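The sums of squares above are easy to check numerically. The following Python sketch (with small invented data, not the chapter's) fits a two-variable model by least squares, verifies the decomposition SST = SSR + SSE, and also computes the s_e of Section 14.3.

```python
import numpy as np

# Least-squares fit on small invented data, then the decomposition
# SST = SSR + SSE, R-squared = SSR/SST, and s_e = sqrt(SSE/(n - k - 1)).
X = np.array([[1.0, 1.0], [2.0, 1.0], [3.0, 2.0], [4.0, 2.0], [5.0, 3.0]])
y = np.array([2.0, 3.0, 5.0, 6.0, 9.0])
n, k = len(y), X.shape[1]

Xd = np.column_stack([np.ones(n), X])            # add the intercept column
coef, *_ = np.linalg.lstsq(Xd, y, rcond=None)    # minimizes the SSE
y_hat = Xd @ coef                                # fitted values

sse = np.sum((y - y_hat) ** 2)         # error sum of squares
sst = np.sum((y - y.mean()) ** 2)      # total sum of squares (SS_yy)
ssr = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares
r2 = ssr / sst                         # coefficient of multiple determination
s_e = np.sqrt(sse / (n - k - 1))       # standard deviation of errors

print(abs(sst - (ssr + sse)) < 1e-9)   # SST = SSR + SSE holds
print(0.0 <= r2 <= 1.0)
```

The identity SST = SSR + SSE holds exactly (up to floating-point error) because the fitted values come from a least-squares fit that includes an intercept.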

The coefficient of multiple determination has one major shortcoming. The value of R² generally increases as we add more and more explanatory variables to the regression model (even if they do not belong in the model). Just because we can increase the value of R² does not imply that the regression equation with a higher value of R² does a better job of predicting the dependent variable. Such a value of R² will be misleading, and it will not represent the true explanatory power of the regression model. To eliminate this shortcoming of R², it is preferable to use the adjusted coefficient of multiple determination, which is denoted by R̄². Note that R̄² is the coefficient of multiple determination adjusted for degrees of freedom. The value of R̄² may increase, decrease, or stay the same as we add more explanatory variables to our regression model. If a new variable added to the regression model contributes significantly to explaining the variation in y, then R̄² increases; otherwise it decreases. The value of R̄² is calculated as follows:

R̄² = 1 − [SSE/(n − k − 1)] / [SST/(n − 1)]   or, equivalently,   R̄² = 1 − (1 − R²)(n − 1)/(n − k − 1)

Thus, if we know R², we can find the value of R̄². Almost all statistical software packages give the values of both R² and R̄² for a regression model. Another property of R̄² to remember is that whereas R² can never be negative, R̄² can be negative.

While a general rule of thumb is that a higher value of R² implies that a specific set of independent variables does a better job of predicting a specific dependent variable, it is important to recognize that some dependent variables have a great deal more variability than others. Therefore, R² = .30 could imply that a specific model is not a very strong model, but it could be the best possible model in a certain scenario.
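As a quick numerical check of the adjusted-R² formula, the short Python sketch below plugs in the values from this chapter's insurance example (R² = .931, n = 12, k = 2) and reproduces the R-Sq(adj) value reported by MINITAB.

```python
# Adjusted R-squared: 1 - (1 - R2)(n - 1)/(n - k - 1), which is
# algebraically the same as 1 - [SSE/(n - k - 1)] / [SST/(n - 1)].
def adjusted_r2(r2, n, k):
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# R2 = .931, n = 12, k = 2 are the values from the chapter's example.
print(round(adjusted_r2(0.931, 12, 2), 3))
```

The result, 0.916, agrees with the R-Sq(adj) = 91.6% shown in the MINITAB output.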
Many good financial models have values of R² below .30.

14.5 Computer Solution of Multiple Regression

In this section, we take an example of a multiple regression model, solve it using MINITAB, interpret the solution, and make inferences about the population parameters of the regression model.

EXAMPLE 14-1  (Using MINITAB to find a multiple regression equation.)

A researcher wanted to find the effect of driving experience and the number of driving violations on auto insurance premiums. A random sample of 12 drivers insured with the same company and having similar auto insurance policies was selected from a large city. Table 14.1 lists

Table 14.1
Monthly Premium (dollars)   Driving Experience (years)   Number of Driving Violations (past 3 years)

the monthly auto insurance premiums (in dollars) paid by these drivers, their driving experiences (in years), and the numbers of driving violations committed by them during the past three years. Using MINITAB, find the regression equation of monthly premiums paid by drivers on the driving experiences and the numbers of driving violations.

Solution  Let

y = the monthly auto insurance premium (in dollars) paid by a driver
x1 = the driving experience (in years) of a driver
x2 = the number of driving violations committed by a driver during the past three years

We are to estimate the regression model

y = A + B1x1 + B2x2 + ε   (3)

The first step is to enter the data of Table 14.1 into the MINITAB spreadsheet, as shown in Screen 14.1. Here we have entered the given data in columns C1, C2, and C3 and named them Monthly Premium, Driving Experience, and Driving Violations, respectively.

Screen 14.1

To obtain the estimated regression equation, select Stat > Regression > Regression. In the dialog box you obtain, enter Monthly Premium in the Response box, and Driving Experience and Driving Violations in the Predictors box, as shown in Screen 14.2. Note that you can enter the column names C1, C2, and C3 instead of the variable names in these boxes. Click OK to obtain the output, which is shown in Screen 14.3. From the output given in Screen 14.3, the estimated regression equation is

ŷ = 110 − 2.75x1 + 16.11x2

Screen 14.2

Screen 14.3

Estimated Multiple Regression Model

Example 14-2 describes, among other things, how the coefficients of the multiple regression model are interpreted.

EXAMPLE 14-2  (Interpreting parts of the MINITAB solution of multiple regression.)

Refer to Example 14-1 and the MINITAB solution given in Screen 14.3.
(a) Explain the meaning of the estimated regression coefficients.
(b) What are the values of the standard deviation of errors, the coefficient of multiple determination, and the adjusted coefficient of multiple determination?
(c) What is the predicted auto insurance premium paid per month by a driver with seven years of driving experience and three driving violations committed in the past three years?
(d) What is the point estimate of the expected (or mean) auto insurance premium paid per month by all drivers with 12 years of driving experience and 4 driving violations committed in the past three years?

Solution
(a) From the portion of the MINITAB solution that is marked I in Screen 14.3, the estimated regression equation is

ŷ = 110 − 2.75x1 + 16.11x2   (4)

From this equation, a = 110, b1 = −2.75, and b2 = 16.11. We can also read the values of these coefficients from the column labeled Coef in the portion of the output marked II in the MINITAB solution of Screen 14.3. Notice that in this column the coefficients of the regression equation appear with more digits after the decimal point (for example, b1 = −2.7473). With these coefficient values, we can write the estimated regression equation as

ŷ = a + b1x1 + b2x2, with a, b1, and b2 read from the Coef column   (5)

The value of a in the estimated regression equation gives the value of ŷ for x1 = 0 and x2 = 0. Thus, a driver with no driving experience and no driving violations committed in the past three years is expected to pay an auto insurance premium of about $110 per month. Again, this is the technical interpretation of a. In reality, that may not be true because none of the drivers in our sample has both zero experience and zero driving violations.
As all of us know, some of the highest premiums are paid by teenagers just after obtaining their driver's licenses.

The value of b1 in the estimated regression equation gives the change in ŷ for a one-unit change in x1 when x2 is held constant. Thus, we can state that a driver with one extra year of experience but the same number of driving violations is expected to pay $2.75 less per month in auto insurance premium. Note that because b1 is negative, an increase in driving experience decreases the premium paid. In other words, y and x1 have a negative relationship.

The value of b2 in the estimated regression equation gives the change in ŷ for a one-unit change in x2 when x1 is held constant. Thus, a driver with one extra driving violation during the past three years but with the same years of driving experience is expected to pay $16.11 more per month in auto insurance premium.

(b) The values of the standard deviation of errors, the coefficient of multiple determination, and the adjusted coefficient of multiple determination are given in part III of the MINITAB solution of Screen 14.3. From this part of the solution, R² = 93.1% and R̄² = 91.6%; the standard deviation of errors s_e is the value labeled S in the output. The value of R² = 93.1% tells us that the two independent variables, years of driving experience and the number of driving violations, explain 93.1% of the variation in the auto insurance premiums. The value of R̄² = 91.6% is the value of the coefficient of multiple determination adjusted for degrees of freedom. It states that when adjusted for degrees of freedom, the two independent variables explain 91.6% of the variation in the dependent variable.

(c) To find the predicted auto insurance premium paid per month by a driver with seven years of driving experience and three driving violations during the past three years, we substitute x1 = 7 and x2 = 3 into the estimated regression equation. Using the rounded coefficients of equation (4),

ŷ = 110 − 2.75(7) + 16.11(3) = $139.08

Note that this value of ŷ is a point estimate of the predicted value of y, which is denoted by y_p. The concept of the predicted value of y is the same as that for a simple linear regression model discussed in Chapter 13.

(d) To obtain the point estimate of the expected (mean) auto insurance premium paid per month by all drivers with 12 years of driving experience and four driving violations during the past three years, we substitute x1 = 12 and x2 = 4 into the estimated regression equation:

ŷ = 110 − 2.75(12) + 16.11(4) = $141.44

This value of ŷ is a point estimate of the mean value of y, which is denoted by E(y) or μ_y|x1, x2.
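The substitutions in parts (c) and (d) can be sketched in Python using the rounded MINITAB coefficients (ŷ = 110 − 2.75x1 + 16.11x2); the function name below is invented for the illustration.

```python
# Point prediction from the estimated equation of Example 14-1,
# y_hat = 110 - 2.75*x1 + 16.11*x2 (rounded MINITAB coefficients).
def predict_premium(x1, x2):
    return 110.0 - 2.75 * x1 + 16.11 * x2

print(round(predict_premium(7, 3), 2))    # part (c): x1 = 7, x2 = 3
print(round(predict_premium(12, 4), 2))   # part (d): x1 = 12, x2 = 4
```

The two printed values, 139.08 and 141.44, match the hand substitutions with the rounded coefficients.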
The concept of the mean value of y is the same as that for a simple linear regression model discussed in Chapter 13.

Confidence Interval for an Individual Coefficient

The values of a, b1, b2, b3, …, and bk obtained by estimating model (1) using sample data give the point estimates of A, B1, B2, B3, …, and Bk, respectively, which are the population parameters. Using the values of the sample statistics a, b1, b2, b3, …, and bk, we can make confidence intervals for the corresponding population parameters A, B1, B2, B3, …, and Bk, respectively.

Because of the assumption that the errors are normally distributed, the sampling distribution of each bi is normal with its mean equal to Bi and standard deviation equal to σ_bi. For example, the sampling distribution of b1 is normal with its mean equal to B1 and standard deviation equal to σ_b1. However, usually σ is not known and, hence, we cannot find σ_bi. Consequently, we use s_bi as an estimator of σ_bi and use the t distribution to determine a confidence interval for Bi.

The formula to obtain a confidence interval for a population parameter Bi is given below. This is the same formula we used to make a confidence interval for B in Chapter 13. The only difference is that to make a confidence interval for a particular Bi of a multiple regression model, the degrees of freedom are n − k − 1.

Confidence Interval for Bi  The (1 − α)100% confidence interval for Bi is given by

bi ± t s_bi

The value of t that is used in this formula is obtained from the t distribution table for α/2 area in the right tail of the t distribution curve and (n − k − 1) degrees of freedom. The values of bi and s_bi are obtained from the computer solution.
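The interval b_i ± t·s_bi takes one line to compute once b_i, s_bi, and the table value of t are in hand. Here is a Python sketch using illustrative values consistent with the chapter's insurance example (b1 = −2.7473, s_b1 = .9770, and t = 2.262 for .025 in the right tail with 9 df).

```python
# Confidence interval b_i +/- t * s_bi with df = n - k - 1; the
# critical value 2.262 is read from the t table (.025 tail, 9 df).
def coef_ci(b, s_b, t_crit):
    margin = t_crit * s_b
    return b - margin, b + margin

# b1 and s_b1 as they appear in the chapter's insurance example.
lo, hi = coef_ci(-2.7473, 0.9770, 2.262)
print(round(lo, 4), round(hi, 4))
```

The printed limits, −4.9573 and −0.5373, round to the −4.96 to −.54 interval of Example 14-3.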

Example 14-3 describes the procedure to make a confidence interval for an individual regression coefficient Bi.

EXAMPLE 14-3  (Making a confidence interval for an individual coefficient of a multiple regression model.)

Determine a 95% confidence interval for B1 (the coefficient of experience) for the multiple regression of auto insurance premium on driving experience and the number of driving violations. Use the MINITAB solution of Screen 14.3.

Solution  To make a confidence interval for B1, we use the portion marked II in the MINITAB solution of Screen 14.3. From that portion of the MINITAB solution,

b1 = −2.7473 and s_b1 = .9770

Note that the value of the standard deviation of b1, s_b1 = .9770, is given in the column labeled SE Coef in part II of the MINITAB solution. The confidence level is 95%. The sample size is 12, which gives n = 12. Because there are two independent variables, k = 2. Therefore,

Area in each tail of the t distribution = α/2 = .025
Degrees of freedom = n − k − 1 = 12 − 2 − 1 = 9

From the t distribution table (Table V of Appendix C), the value of t for .025 area in the right tail of the t distribution curve and 9 degrees of freedom is 2.262. Then, the 95% confidence interval for B1 is

b1 ± t s_b1 = −2.7473 ± 2.262(.9770) = −2.7473 ± 2.2100 = −4.9573 to −.5373

Thus, the 95% confidence interval for B1 is −4.96 to −.54. That is, we can state with 95% confidence that for one extra year of driving experience, the monthly auto insurance premium changes by an amount between −$4.96 and −$.54. Note that since both limits of the confidence interval are negative, we can also state that for each extra year of driving experience, the monthly auto insurance premium decreases by an amount between $.54 and $4.96.

By applying the procedure used in Example 14-3, we can make a confidence interval for any of the coefficients (including the constant term) of a multiple regression model, such as A and B2 in model (3).
For example, the 95% confidence intervals for A and B2 are obtained as a ± t s_a and b2 ± t s_b2, respectively, with the values of a, b2, s_a, and s_b2 taken from the Coef and SE Coef columns of the MINITAB solution.

Testing a Hypothesis about an Individual Coefficient

We can perform a test of hypothesis about any of the coefficients Bi of the regression model (1) using the same procedure that we used to make a test of hypothesis about B for a simple regression model in Chapter 13. The only difference is that the degrees of freedom are equal to n − k − 1 for a multiple regression model. Again, because of the assumption that the errors are normally distributed, the sampling distribution of each bi is normal with its mean equal to Bi and standard deviation equal to σ_bi. However, usually σ is not known and, hence, we cannot find σ_bi. Consequently, we use s_bi as an estimator of σ_bi and use the t distribution to perform the test.

Test Statistic for bi  The value of the test statistic t for bi is calculated as

t = (bi − Bi) / s_bi

The value of Bi is substituted from the null hypothesis. Usually, but not always, the null hypothesis is H0: Bi = 0. The MINITAB solution contains this value of the t statistic.

Example 14-4 illustrates the procedure for testing a hypothesis about a single coefficient.

EXAMPLE 14-4  (Testing a hypothesis about a coefficient of a multiple regression model.)

Using the 2.5% significance level, can you conclude that the coefficient of the number of years of driving experience in regression model (3) is negative? Use the MINITAB output obtained in Example 14-1 and shown in Screen 14.3 to perform this test.

Solution  From Example 14-1, our multiple regression model (3) is

y = A + B1x1 + B2x2 + ε

where y is the monthly auto insurance premium (in dollars) paid by a driver, x1 is the driving experience (in years), and x2 is the number of driving violations committed during the past three years. From the MINITAB solution, the estimated regression equation is

ŷ = 110 − 2.75x1 + 16.11x2

To conduct a test of hypothesis about B1, we use the portion marked II in the MINITAB solution given in Screen 14.3. From that portion of the MINITAB solution,

b1 = −2.7473 and s_b1 = .9770

Note that the value of the standard deviation of b1, s_b1 = .9770, is given in the column labeled SE Coef in part II of the MINITAB solution. To make a test of hypothesis about B1, we perform the following five steps.

Step 1. State the null and alternative hypotheses. We are to test whether or not the coefficient of the number of years of driving experience in regression model (3) is negative, that is, whether or not B1 is negative. The two hypotheses are

H0: B1 = 0    H1: B1 < 0

Note that we can also write the null hypothesis as H0: B1 ≥ 0, which states that the coefficient of the number of years of driving experience in regression model (3) is either zero or positive.

Step 2. Select the distribution to use.
The sample size is small (n = 12), and σ is not known. The sampling distribution of b1 is normal because the errors are assumed to be normally distributed. Hence, we use the t distribution to make a test of hypothesis about B1.

Step 3. Determine the rejection and nonrejection regions. The significance level is .025. The < sign in the alternative hypothesis indicates that the test is left-tailed. Therefore, the area in the left tail of the t distribution curve is .025. The degrees of freedom are

df = n − k − 1 = 12 − 2 − 1 = 9

From the t distribution table (Table V in Appendix C), the critical value of t for 9 degrees of freedom and .025 area in the left tail of the t distribution curve is −2.262, as shown in Figure 14.1.

Figure 14.1  The critical value of t is −2.262; H0 is rejected if the observed value of t falls to its left.

Step 4. Calculate the value of the test statistic and the p-value. The value of the test statistic t for b1 can be obtained from the MINITAB solution given in Screen 14.3. This value is given in the column labeled T and the row named Driving Experience in the portion marked II of that MINITAB solution. Thus, the observed value of t is −2.81. Also, in the same portion of the MINITAB solution, the p-value for this test is given in the column labeled P and the row named Driving Experience. This p-value is .020. However, MINITAB always gives the p-value for a two-tailed test. Because our test is one-tailed, the p-value for our test is .020/2 = .010.

Step 5. Make a decision. The value of the test statistic, t = −2.81, is less than the critical value of t = −2.262, and it falls in the rejection region. Consequently, we reject the null hypothesis and conclude that the coefficient of x1 in regression model (3) is negative. That is, an increase in driving experience decreases the auto insurance premium. Also, the p-value for the test is .010, which is less than the significance level of α = .025. Hence, based on this p-value also, we reject the null hypothesis and conclude that B1 is negative.

Note that the observed value of t in Step 4 of Example 14-4 is obtained from the MINITAB solution only if the null hypothesis is H0: B1 = 0. However, if the null hypothesis is that B1 is equal to a number other than zero, then the t value given in the MINITAB solution is no longer valid. For example, suppose the null hypothesis in Example 14-4 had specified a nonzero value, say H0: B1 = −2 with H1: B1 < −2. In this case the observed value of t would be calculated as

t = (b1 − B1) / s_b1

To calculate this value of t, the values of b1 and s_b1 are obtained from the MINITAB solution of Screen 14.3, and the value of B1 is substituted from H0.
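The five steps above reduce to one line of arithmetic for the test statistic. A Python sketch with the chapter's values (b1 = −2.7473, s_b1 = .9770, H0: B1 = 0, critical value −2.262):

```python
# Test statistic t = (b1 - B1) / s_b1 for a single coefficient,
# using b1 = -2.7473 and s_b1 = .9770 from the chapter's output.
def t_statistic(b, b_null, s_b):
    return (b - b_null) / s_b

t_obs = t_statistic(-2.7473, 0.0, 0.9770)
print(round(t_obs, 2))          # observed value of t
print(t_obs < -2.262)           # falls in the left-tail rejection region
```

The observed value rounds to −2.81, matching the T column of the MINITAB output, and lies below the critical value −2.262, so H0 is rejected. Replacing the 0.0 with a nonzero hypothesized value of B1 handles the non-standard null hypothesis described above.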
EXERCISES

CONCEPTS AND PROCEDURES

14.1 How are the coefficients of independent variables in a multiple regression model interpreted? Explain.

14.2 What are the degrees of freedom for a multiple regression model to make inferences about individual parameters?

Correlations: Monthly Prem, Driving Exper, No. of violations
(Pearson correlations with p-values for each pair of variables)
Cell Contents: Pearson correlation
               P-Value

Regression Analysis: Monthly Prem versus Driving Exper, No. of violations

The regression equation is
Monthly Prem = 110 - 2.75 Driving Exper + 16.11 No. of violations

Predictor            Coef     SE Coef        T        P      VIF
Constant
Driving Exper     -2.7473      0.9770    -2.81    0.020
No. of violations

S =          R-Sq = 93.1%     R-Sq(adj) = 91.6%

Analysis of Variance

Source            DF    SS    MS    F    P
Regression
Residual Error
Total

Source            DF    Seq SS
Driving Exper
No. of violations

Predicted Values for New Observations

New Obs    Fit    SE Fit    95% CI           95% PI
                            (106.98,   )     (86.39,   )

Values of Predictors for New Observations

New Obs    Driving Exper    No. of violations



Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

Multiple Regression Examples

Multiple Regression Examples Multiple Regression Examples Example: Tree data. we have seen that a simple linear regression of usable volume on diameter at chest height is not suitable, but that a quadratic model y = β 0 + β 1 x +

More information

Chapter 4. Regression Models. Learning Objectives

Chapter 4. Regression Models. Learning Objectives Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing

More information

Chapter 15 Multiple Regression

Chapter 15 Multiple Regression Multiple Regression Learning Objectives 1. Understand how multiple regression analysis can be used to develop relationships involving one dependent variable and several independent variables. 2. Be able

More information

The Multiple Regression Model

The Multiple Regression Model Multiple Regression The Multiple Regression Model Idea: Examine the linear relationship between 1 dependent (Y) & or more independent variables (X i ) Multiple Regression Model with k Independent Variables:

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

Inference with Simple Regression

Inference with Simple Regression 1 Introduction Inference with Simple Regression Alan B. Gelder 06E:071, The University of Iowa 1 Moving to infinite means: In this course we have seen one-mean problems, twomean problems, and problems

More information

Regression Analysis II

Regression Analysis II Regression Analysis II Measures of Goodness of fit Two measures of Goodness of fit Measure of the absolute fit of the sample points to the sample regression line Standard error of the estimate An index

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 13 Simple Linear Regression 13-1 Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of

More information

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 16. Simple Linear Regression and dcorrelation Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Regression Models. Chapter 4. Introduction. Introduction. Introduction Chapter 4 Regression Models Quantitative Analysis for Management, Tenth Edition, by Render, Stair, and Hanna 008 Prentice-Hall, Inc. Introduction Regression analysis is a very valuable tool for a manager

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line

Inference for Regression Inference about the Regression Model and Using the Regression Line Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about

More information

Chapter 14 Simple Linear Regression (A)

Chapter 14 Simple Linear Regression (A) Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables

More information

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6 STA 8 Applied Linear Models: Regression Analysis Spring 011 Solution for Homework #6 6. a) = 11 1 31 41 51 1 3 4 5 11 1 31 41 51 β = β1 β β 3 b) = 1 1 1 1 1 11 1 31 41 51 1 3 4 5 β = β 0 β1 β 6.15 a) Stem-and-leaf

More information

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics Chapter 13 Student Lecture Notes 13-1 Department of Quantitative Methods & Information Sstems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analsis QMIS 0 Dr. Mohammad

More information

STAT 212 Business Statistics II 1

STAT 212 Business Statistics II 1 STAT 1 Business Statistics II 1 KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA STAT 1: BUSINESS STATISTICS II Semester 091 Final Exam Thursday Feb

More information

LI EAR REGRESSIO A D CORRELATIO

LI EAR REGRESSIO A D CORRELATIO CHAPTER 6 LI EAR REGRESSIO A D CORRELATIO Page Contents 6.1 Introduction 10 6. Curve Fitting 10 6.3 Fitting a Simple Linear Regression Line 103 6.4 Linear Correlation Analysis 107 6.5 Spearman s Rank Correlation

More information

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression Chapter 14 Student Lecture Notes 14-1 Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Multiple Regression QMIS 0 Dr. Mohammad Zainal Chapter Goals After completing

More information

Six Sigma Black Belt Study Guides

Six Sigma Black Belt Study Guides Six Sigma Black Belt Study Guides 1 www.pmtutor.org Powered by POeT Solvers Limited. Analyze Correlation and Regression Analysis 2 www.pmtutor.org Powered by POeT Solvers Limited. Variables and relationships

More information

STATISTICS 110/201 PRACTICE FINAL EXAM

STATISTICS 110/201 PRACTICE FINAL EXAM STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Confidence Interval for the mean response

Confidence Interval for the mean response Week 3: Prediction and Confidence Intervals at specified x. Testing lack of fit with replicates at some x's. Inference for the correlation. Introduction to regression with several explanatory variables.

More information

Correlation & Simple Regression

Correlation & Simple Regression Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.

More information

Chapter 16. Simple Linear Regression and Correlation

Chapter 16. Simple Linear Regression and Correlation Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Model Building Chap 5 p251

Model Building Chap 5 p251 Model Building Chap 5 p251 Models with one qualitative variable, 5.7 p277 Example 4 Colours : Blue, Green, Lemon Yellow and white Row Blue Green Lemon Insects trapped 1 0 0 1 45 2 0 0 1 59 3 0 0 1 48 4

More information

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables. Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate

More information

28. SIMPLE LINEAR REGRESSION III

28. SIMPLE LINEAR REGRESSION III 28. SIMPLE LINEAR REGRESSION III Fitted Values and Residuals To each observed x i, there corresponds a y-value on the fitted line, y = βˆ + βˆ x. The are called fitted values. ŷ i They are the values of

More information

School of Mathematical Sciences. Question 1. Best Subsets Regression

School of Mathematical Sciences. Question 1. Best Subsets Regression School of Mathematical Sciences MTH5120 Statistical Modelling I Practical 9 and Assignment 8 Solutions Question 1 Best Subsets Regression Response is Crime I n W c e I P a n A E P U U l e Mallows g E P

More information

STA121: Applied Regression Analysis

STA121: Applied Regression Analysis STA121: Applied Regression Analysis Linear Regression Analysis - Chapters 3 and 4 in Dielman Artin Department of Statistical Science September 15, 2009 Outline 1 Simple Linear Regression Analysis 2 Using

More information

SMAM 314 Practice Final Examination Winter 2003

SMAM 314 Practice Final Examination Winter 2003 SMAM 314 Practice Final Examination Winter 2003 You may use your textbook, one page of notes and a calculator. Please hand in the notes with your exam. 1. Mark the following statements True T or False

More information

Chapter 14 Multiple Regression Analysis

Chapter 14 Multiple Regression Analysis Chapter 14 Multiple Regression Analysis 1. a. Multiple regression equation b. the Y-intercept c. $374,748 found by Y ˆ = 64,1 +.394(796,) + 9.6(694) 11,6(6.) (LO 1) 2. a. Multiple regression equation b.

More information

Finding Relationships Among Variables

Finding Relationships Among Variables Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis

More information

TMA4255 Applied Statistics V2016 (5)

TMA4255 Applied Statistics V2016 (5) TMA4255 Applied Statistics V2016 (5) Part 2: Regression Simple linear regression [11.1-11.4] Sum of squares [11.5] Anna Marie Holand To be lectured: January 26, 2016 wiki.math.ntnu.no/tma4255/2016v/start

More information

Simple Linear Regression

Simple Linear Regression 9-1 l Chapter 9 l Simple Linear Regression 9.1 Simple Linear Regression 9.2 Scatter Diagram 9.3 Graphical Method for Determining Regression 9.4 Least Square Method 9.5 Correlation Coefficient and Coefficient

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company Multiple Regression Inference for Multiple Regression and A Case Study IPS Chapters 11.1 and 11.2 2009 W.H. Freeman and Company Objectives (IPS Chapters 11.1 and 11.2) Multiple regression Data for multiple

More information

y response variable x 1, x 2,, x k -- a set of explanatory variables

y response variable x 1, x 2,, x k -- a set of explanatory variables 11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate

More information

A discussion on multiple regression models

A discussion on multiple regression models A discussion on multiple regression models In our previous discussion of simple linear regression, we focused on a model in which one independent or explanatory variable X was used to predict the value

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

School of Mathematical Sciences. Question 1

School of Mathematical Sciences. Question 1 School of Mathematical Sciences MTH5120 Statistical Modelling I Practical 8 and Assignment 7 Solutions Question 1 Figure 1: The residual plots do not contradict the model assumptions of normality, constant

More information

STAT Chapter 11: Regression

STAT Chapter 11: Regression STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship

More information

1 Introduction to Minitab

1 Introduction to Minitab 1 Introduction to Minitab Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at my.calpoly.edu. When you

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

MBA Statistics COURSE #4

MBA Statistics COURSE #4 MBA Statistics 51-651-00 COURSE #4 Simple and multiple linear regression What should be the sales of ice cream? Example: Before beginning building a movie theater, one must estimate the daily number of

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

23. Inference for regression

23. Inference for regression 23. Inference for regression The Practice of Statistics in the Life Sciences Third Edition 2014 W. H. Freeman and Company Objectives (PSLS Chapter 23) Inference for regression The regression model Confidence

More information

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises LINEAR REGRESSION ANALYSIS MODULE XVI Lecture - 44 Exercises Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Exercise 1 The following data has been obtained on

More information

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapte The McGraw-Hill Companies, Inc. All rights reserved. 12er12 Chapte Bivariate i Regression (Part 1) Bivariate Regression Visual Displays Begin the analysis of bivariate data (i.e., two variables) with a scatter plot. A scatter plot - displays each observed

More information

Regression Analysis. BUS 735: Business Decision Making and Research

Regression Analysis. BUS 735: Business Decision Making and Research Regression Analysis BUS 735: Business Decision Making and Research 1 Goals and Agenda Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn

More information

Concordia University (5+5)Q 1.

Concordia University (5+5)Q 1. (5+5)Q 1. Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/1 40 Examination Date Time Pages Mid Term Test May 26, 2004 Two Hours 3 Instructor Course Examiner

More information

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu Sampling Distribution of b 1 Expected value of b 1 : Variance of b 1 : E(b 1 ) = 1 Var(b 1 ) = σ 2 /SS x Estimate of

More information

Simple Linear Regression: A Model for the Mean. Chap 7

Simple Linear Regression: A Model for the Mean. Chap 7 Simple Linear Regression: A Model for the Mean Chap 7 An Intermediate Model (if the groups are defined by values of a numeric variable) Separate Means Model Means fall on a straight line function of the

More information

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS STAT 512 MidTerm I (2/21/2013) Spring 2013 Name: Key INSTRUCTIONS 1. This exam is open book/open notes. All papers (but no electronic devices except for calculators) are allowed. 2. There are 5 pages in

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Business Statistics. Lecture 10: Correlation and Linear Regression

Business Statistics. Lecture 10: Correlation and Linear Regression Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

More information

PART I. (a) Describe all the assumptions for a normal error regression model with one predictor variable,

PART I. (a) Describe all the assumptions for a normal error regression model with one predictor variable, Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/2 01 Examination Date Time Pages Final December 2002 3 hours 6 Instructors Course Examiner Marks Y.P.

More information

(4) 1. Create dummy variables for Town. Name these dummy variables A and B. These 0,1 variables now indicate the location of the house.

(4) 1. Create dummy variables for Town. Name these dummy variables A and B. These 0,1 variables now indicate the location of the house. Exam 3 Resource Economics 312 Introductory Econometrics Please complete all questions on this exam. The data in the spreadsheet: Exam 3- Home Prices.xls are to be used for all analyses. These data are

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Regression Analysis IV... More MLR and Model Building

Regression Analysis IV... More MLR and Model Building Regression Analysis IV... More MLR and Model Building This session finishes up presenting the formal methods of inference based on the MLR model and then begins discussion of "model building" (use of regression

More information

[4+3+3] Q 1. (a) Describe the normal regression model through origin. Show that the least square estimator of the regression parameter is given by

[4+3+3] Q 1. (a) Describe the normal regression model through origin. Show that the least square estimator of the regression parameter is given by Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/1 40 Examination Date Time Pages Final June 2004 3 hours 7 Instructors Course Examiner Marks Y.P. Chaubey

More information

What is a Hypothesis?

What is a Hypothesis? What is a Hypothesis? A hypothesis is a claim (assumption) about a population parameter: population mean Example: The mean monthly cell phone bill in this city is μ = $42 population proportion Example:

More information

ECO220Y Simple Regression: Testing the Slope

ECO220Y Simple Regression: Testing the Slope ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x

More information

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006 Chapter 17 Simple Linear Regression and Correlation 17.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

CHAPTER EIGHT Linear Regression

CHAPTER EIGHT Linear Regression 7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following

More information

Conditions for Regression Inference:

Conditions for Regression Inference: AP Statistics Chapter Notes. Inference for Linear Regression We can fit a least-squares line to any data relating two quantitative variables, but the results are useful only if the scatterplot shows a

More information

F-tests and Nested Models

F-tests and Nested Models F-tests and Nested Models Nested Models: A core concept in statistics is comparing nested s. Consider the Y = β 0 + β 1 x 1 + β 2 x 2 + ǫ. (1) The following reduced s are special cases (nested within)

More information

Simple Linear Regression

Simple Linear Regression Chapter 2 Simple Linear Regression Linear Regression with One Independent Variable 2.1 Introduction In Chapter 1 we introduced the linear model as an alternative for making inferences on means of one or

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

PubH 7405: REGRESSION ANALYSIS. MLR: INFERENCES, Part I

PubH 7405: REGRESSION ANALYSIS. MLR: INFERENCES, Part I PubH 7405: REGRESSION ANALYSIS MLR: INFERENCES, Part I TESTING HYPOTHESES Once we have fitted a multiple linear regression model and obtained estimates for the various parameters of interest, we want to

More information

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments. Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a

More information

Regression used to predict or estimate the value of one variable corresponding to a given value of another variable.

Regression used to predict or estimate the value of one variable corresponding to a given value of another variable. CHAPTER 9 Simple Linear Regression and Correlation Regression used to predict or estimate the value of one variable corresponding to a given value of another variable. X = independent variable. Y = dependent

More information

2.4.3 Estimatingσ Coefficient of Determination 2.4. ASSESSING THE MODEL 23

2.4.3 Estimatingσ Coefficient of Determination 2.4. ASSESSING THE MODEL 23 2.4. ASSESSING THE MODEL 23 2.4.3 Estimatingσ 2 Note that the sums of squares are functions of the conditional random variables Y i = (Y X = x i ). Hence, the sums of squares are random variables as well.

More information

1 Correlation and Inference from Regression

1 Correlation and Inference from Regression 1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore What is Multiple Linear Regression Several independent variables may influence the change in response variable we are trying to study. When several independent variables are included in the equation, the

More information

16.3 One-Way ANOVA: The Procedure

16.3 One-Way ANOVA: The Procedure 16.3 One-Way ANOVA: The Procedure Tom Lewis Fall Term 2009 Tom Lewis () 16.3 One-Way ANOVA: The Procedure Fall Term 2009 1 / 10 Outline 1 The background 2 Computing formulas 3 The ANOVA Identity 4 Tom

More information

STA 4210 Practise set 2a

STA 4210 Practise set 2a STA 410 Practise set a For all significance tests, use = 0.05 significance level. S.1. A multiple linear regression model is fit, relating household weekly food expenditures (Y, in $100s) to weekly income

More information

Variance Decomposition and Goodness of Fit

Variance Decomposition and Goodness of Fit Variance Decomposition and Goodness of Fit 1. Example: Monthly Earnings and Years of Education In this tutorial, we will focus on an example that explores the relationship between total monthly earnings

More information

Applied Regression Analysis. Section 2: Multiple Linear Regression

Applied Regression Analysis. Section 2: Multiple Linear Regression Applied Regression Analysis Section 2: Multiple Linear Regression 1 The Multiple Regression Model Many problems involve more than one independent variable or factor which affects the dependent or response

More information

Multiple Linear Regression

Multiple Linear Regression Andrew Lonardelli December 20, 2013 Multiple Linear Regression 1 Table Of Contents Introduction: p.3 Multiple Linear Regression Model: p.3 Least Squares Estimation of the Parameters: p.4-5 The matrix approach

More information

Multiple Regression. Peerapat Wongchaiwat, Ph.D.

Multiple Regression. Peerapat Wongchaiwat, Ph.D. Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com The Multiple Regression Model Examine the linear relationship between 1 dependent (Y) & 2 or more independent variables (X i ) Multiple Regression Model

More information

Sociology 593 Exam 1 Answer Key February 17, 1995

Sociology 593 Exam 1 Answer Key February 17, 1995 Sociology 593 Exam 1 Answer Key February 17, 1995 I. True-False. (5 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. A researcher regressed Y on. When

More information

SIMPLE REGRESSION ANALYSIS. Business Statistics

SIMPLE REGRESSION ANALYSIS. Business Statistics SIMPLE REGRESSION ANALYSIS Business Statistics CONTENTS Ordinary least squares (recap for some) Statistical formulation of the regression model Assessing the regression model Testing the regression coefficients

More information

Ordinary Least Squares Regression Explained: Vartanian

Ordinary Least Squares Regression Explained: Vartanian Ordinary Least Squares Regression Explained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent

More information

Econ 3790: Statistics Business and Economics. Instructor: Yogesh Uppal

Econ 3790: Statistics Business and Economics. Instructor: Yogesh Uppal Econ 3790: Statistics Business and Economics Instructor: Yogesh Uppal Email: yuppal@ysu.edu Chapter 14 Covariance and Simple Correlation Coefficient Simple Linear Regression Covariance Covariance between

More information