Chapter 14 Simple Linear Regression (A)

1. Characteristics
Managerial decisions often are based on the relationship between two or more variables.
Regression analysis can be used to develop an equation showing how the variables are related.
The variable being predicted is called the dependent variable and is denoted by y.
The variables being used to predict the value of the dependent variable are called the independent variables and are denoted by x.
Simple linear regression involves one independent variable and one dependent variable. The relationship between the two variables is approximated by a straight line.
Regression analysis involving two or more independent variables is called multiple regression analysis.

2. Simple Linear Regression Model

2.1. Regression Model
The equation that describes how y is related to x and an error term is called the regression model.
The simple linear regression model is:
  $y = \beta_0 + \beta_1 x + \varepsilon$
where
  $\beta_0$ and $\beta_1$ are called the parameters of the model
  $\varepsilon$ is a random variable called the error term

2.2. Simple Linear Regression Equation
The equation that describes how the expected value of y, denoted by E(y), is related to x:
  $E(y) = \beta_0 + \beta_1 x$
The graph of the regression equation is a straight line.
$\beta_0$ is the y-intercept of the regression line.
$\beta_1$ is the slope of the regression line.
E(y) is the expected value of y for a given x value.
Possible regression lines (figure panels): positive relationship ($\beta_1 > 0$), negative relationship ($\beta_1 < 0$), no relationship ($\beta_1 = 0$).

2.3. Estimated Simple Linear Regression Equation
  $\hat{y} = b_0 + b_1 x$
The graph is called the estimated regression line.
$b_0$ is the y-intercept of the line.
$b_1$ is the slope of the line.
$\hat{y}$ is the estimated value of y (and of E(y) as well) for a given x value.
Figure: Estimation process
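
The difference between the regression model (Section 2.1) and the regression equation (Section 2.2) can be seen in a small simulation. The sketch below is only an illustration: the parameter values, the function names, and the choice of a normally distributed error term are assumptions made for the example, not values from these notes.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
BETA0, BETA1, SIGMA = 10.0, 5.0, 2.0

def simulate_y(x, rng):
    """Draw one y from the model y = beta0 + beta1*x + eps.

    The error term eps is assumed here to be normal with mean 0
    and standard deviation SIGMA.
    """
    eps = rng.gauss(0.0, SIGMA)
    return BETA0 + BETA1 * x + eps

def expected_y(x):
    """Regression equation: E(y) = beta0 + beta1*x (no error term)."""
    return BETA0 + BETA1 * x

rng = random.Random(0)          # fixed seed so the run is repeatable
x = 3.0
draws = [simulate_y(x, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

# Individual y values scatter around the line; their average is close to E(y).
print("E(y) at x = 3:", expected_y(x))
print("mean of simulated y:", round(sample_mean, 2))
```

Each simulated y differs from E(y) by its own error term, but the average of many draws at the same x settles near the value given by the regression equation.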

3. Least Squares Method

3.1. Least Squares Criterion
Minimize the sum of the squares of the deviations between the observed values of the dependent variable and the estimated values of the dependent variable:
  $\min \sum (y_i - \hat{y}_i)^2$
where
  $y_i$ = observed value of the dependent variable for the ith observation
  $\hat{y}_i$ = estimated value of the dependent variable for the ith observation

3.2. Slope and y-intercept for the Estimated Regression Equation
Slope:
  $b_1 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
where
  $x_i$ = value of the independent variable for the ith observation
  $y_i$ = value of the dependent variable for the ith observation
  $\bar{x}$ = mean value for the independent variable
  $\bar{y}$ = mean value for the dependent variable
y-intercept:
  $b_0 = \bar{y} - b_1 \bar{x}$
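
As a minimal sketch, the slope and y-intercept formulas of Section 3.2 can be computed directly. The function name `least_squares` and the five data points are hypothetical, chosen for illustration; they are not the data of any exercise in these notes.

```python
def least_squares(x, y):
    """Return (b0, b1) for the estimated regression equation y_hat = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical sample of 5 observations (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares(x, y)
print(f"y_hat = {b0:.2f} + {b1:.2f} x")   # prints: y_hat = 2.20 + 0.60 x
```

For this sample, $\bar{x} = 3$, $\bar{y} = 4$, the numerator is 6 and the denominator is 10, so $b_1 = 0.6$ and $b_0 = 4 - 0.6(3) = 2.2$.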

Exercise 1. Reed Auto Sales
Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below.
Find the estimated simple linear regression equation.

4. Coefficient of Determination

4.1. Sum of Squares due to Error (SSE)
The ith residual is $y_i - \hat{y}_i$: the error in using $\hat{y}_i$ to estimate $y_i$.
  $\text{SSE} = \sum (y_i - \hat{y}_i)^2$

4.2. Total Sum of Squares (SST)
A measure of the error involved in using $\bar{y}$ to estimate $y_i$.
  $\text{SST} = \sum (y_i - \bar{y})^2$

4.3. Sum of Squares due to Regression (SSR)
A measure of how much the $\hat{y}_i$ values on the estimated regression line deviate from $\bar{y}$.
  $\text{SSR} = \sum (\hat{y}_i - \bar{y})^2$

4.4. Relationship among SST, SSR, SSE
  $\text{SST} = \text{SSR} + \text{SSE}$

4.5. The Coefficient of Determination
  $r^2 = \text{SSR} / \text{SST}$, with $0 \le r^2 \le 1$
It evaluates the goodness of fit of the estimated regression equation.
Interpretation:
  When the fit is perfect, SSE = 0 and $r^2 = 1$.
  When the fit is the poorest, SSR = 0 and $r^2 = 0$.
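
The three sums of squares and the coefficient of determination can be checked numerically. This is a sketch on a hypothetical five-point sample (not exercise data); the variable names are chosen for illustration.

```python
# Hypothetical sample (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares estimates (Section 3.2).
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Estimated values on the regression line.
y_hat = [b0 + b1 * xi for xi in x]

sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # error
sst = sum((yi - y_bar) ** 2 for yi in y)                # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # regression

r2 = ssr / sst

print(f"SSE = {sse:.2f}, SST = {sst:.2f}, SSR = {ssr:.2f}")
print(f"SSR + SSE = {ssr + sse:.2f}")
print(f"r^2 = {r2:.2f}")
```

For this sample SSE = 2.4, SST = 6 and SSR = 3.6, so $r^2 = 0.6$: the estimated regression equation accounts for 60% of the total sum of squares.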

4.6. Sample Correlation Coefficient
  $r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}} = (\text{sign of } b_1)\sqrt{r^2}$
The sample correlation coefficient is restricted to a linear relationship between two variables.

Exercise 2. Reed Auto Sales
Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below.
(1) Find the coefficient of determination.
(2) How can you interpret this result?
(3) Find the sample correlation coefficient.
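
The formula of Section 4.6 can be cross-checked against the direct definition of the sample correlation coefficient. The sketch below again uses a hypothetical sample rather than the exercise data, and the direct formula $r_{xy} = S_{xy} / \sqrt{S_{xx} S_{yy}}$ is added here for comparison only.

```python
import math

# Hypothetical sample (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)    # same quantity as SST

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Coefficient of determination r^2 = SSR / SST.
ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
r2 = ssr / syy

# Section 4.6: r_xy = (sign of b1) * sqrt(r^2).
r_from_r2 = math.copysign(1.0, b1) * math.sqrt(r2)

# Direct definition, for comparison.
r_direct = sxy / math.sqrt(sxx * syy)

print(round(r_from_r2, 4), round(r_direct, 4))
```

Both routes give the same value; the sign of $b_1$ is needed because $\sqrt{r^2}$ alone cannot tell a positive relationship from a negative one.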

5. Testing for Significance: t Test

5.1. Assumptions about the Error Term $\varepsilon$
The error $\varepsilon$ is a random variable with a mean of zero: $E(\varepsilon) = 0$.
  Implication: $E(y) = \beta_0 + \beta_1 x$
The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
The values of $\varepsilon$ are independent.
  Implication: the value of y for a particular value of x is not related to the value of y for any other value of x.
The error $\varepsilon$ is a normally distributed random variable.
  Implication: y is also a normally distributed random variable.

5.2. Testing whether $\beta_1 = 0$
To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.
Two tests are commonly used: the t test and the F test.
Both the t test and the F test require an estimate of $\sigma^2$, the variance of $\varepsilon$ in the regression model.
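
5.3. An Estimate of $\sigma^2$
The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used:
  $s^2 = \text{MSE} = \dfrac{\text{SSE}}{n - 2}$
where
  $\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$
To estimate $\sigma$ we take the square root of $s^2$. The resulting value, s, is referred to as the standard error of the estimate:
  $s = \sqrt{\text{MSE}} = \sqrt{\dfrac{\text{SSE}}{n - 2}}$

5.4. t Test
Hypotheses:
  $H_0: \beta_1 = 0$
  $H_a: \beta_1 \ne 0$
Test statistic:
  $t = \dfrac{b_1}{s_{b_1}}$
where
  $s_{b_1} = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$
Rejection rule:
  Reject $H_0$ if p-value < $\alpha$, or if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$,
  where $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom.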
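
The steps of Sections 5.3 and 5.4 can be sketched as follows. The sample is hypothetical (not exercise data), and the critical value 3.182 is the standard t-table entry for an upper-tail area of .025 with 3 degrees of freedom, hard-coded here so the example needs only the standard library.

```python
import math

# Hypothetical sample (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Section 5.3: estimate of sigma^2 and the standard error of the estimate.
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)
s = math.sqrt(mse)

# Section 5.4: estimated standard deviation of b1 and the test statistic.
s_b1 = s / math.sqrt(sxx)
t_stat = b1 / s_b1

# t-table value for alpha = .05 (two-tailed), n - 2 = 3 degrees of freedom.
T_CRIT = 3.182

reject_h0 = t_stat < -T_CRIT or t_stat > T_CRIT
print(f"MSE = {mse:.3f}, s = {s:.3f}, s_b1 = {s_b1:.3f}, t = {t_stat:.3f}")
print("Reject H0:", reject_h0)
```

Here t is about 2.12, which lies between -3.182 and 3.182, so for this small hypothetical sample $H_0: \beta_1 = 0$ is not rejected at $\alpha = .05$.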

5.5. Confidence Interval for $\beta_1$
The form of a confidence interval for $\beta_1$ is:
  $b_1 \pm t_{\alpha/2}\, s_{b_1}$
where
  $b_1$ is the point estimator
  $t_{\alpha/2}\, s_{b_1}$ is the margin of error
  $t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 2 degrees of freedom
Rejection rule:
  Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.
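
A short sketch of the confidence interval and its rejection rule follows. As before, the sample is hypothetical, and 3.182 is the t-table value for an upper-tail area of .025 with 3 degrees of freedom (a 95% interval with n = 5).

```python
import math

# Hypothetical sample (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Standard error of b1 (Sections 5.3 and 5.4).
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
s_b1 = s / math.sqrt(sxx)

# t-table value: upper-tail area .025, n - 2 = 3 degrees of freedom.
T_CRIT = 3.182

margin = T_CRIT * s_b1            # margin of error
lower, upper = b1 - margin, b1 + margin

# Rejection rule: reject H0 if 0 is not inside the interval.
reject_h0 = not (lower <= 0.0 <= upper)
print(f"95% CI for beta1: ({lower:.3f}, {upper:.3f})")
print("Reject H0:", reject_h0)
```

The interval runs from about -0.3 to about 1.5 and contains 0, so $H_0$ is not rejected, which agrees with the t test on the same hypothetical sample.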

Exercise 3. Reed Auto Sales
Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below.
Conduct a t test for significance in the simple linear regression at $\alpha = .05$.
(1) What is the test statistic?
(2) What is the critical value?
(3) What is the confidence interval for $\beta_1$?
(4) What is the conclusion?