Chapter 14 Simple Linear Regression


Managerial decisions often are based on the relationship between two or more variables. Regression analysis can be used to develop an equation showing how the variables are related.

The variable being predicted is called the dependent variable and is denoted by $y$. The variables being used to predict the value of the dependent variable are called the independent variables and are denoted by $x$.

Simple linear regression involves one independent variable and one dependent variable, and the relationship between the two variables is approximated by a straight line. Regression analysis involving two or more independent variables is called multiple regression.

Simple Linear Regression Model

The equation that describes how $y$ is related to $x$ and an error term is called the regression model. The simple linear regression model is:

$$y = \beta_0 + \beta_1 x + \epsilon$$

$\beta_0$ and $\beta_1$ are called parameters of the model; $\epsilon$ is a random variable called the error term.

Simple Linear Regression Equation

The simple linear regression equation is:

$$E(y) = \beta_0 + \beta_1 x$$

The graph of the regression equation is a straight line. $\beta_0$ is the $y$-intercept of the regression line, $\beta_1$ is the slope of the regression line, and $E(y)$ is the expected value of $y$ for a given $x$ value.

[Figure: Positive linear relationship. Regression line of $E(y)$ against $x$ with intercept $\beta_0$; slope $\beta_1$ is positive.]

[Figure: Negative linear relationship. Regression line of $E(y)$ against $x$ with intercept $\beta_0$; slope $\beta_1$ is negative.]

[Figure: No relationship.]

Estimated Simple Linear Regression Equation

The estimated simple linear regression equation is:

$$\hat{y} = b_0 + b_1 x$$

The graph is called the estimated regression line. $b_0$ is the $y$-intercept of the line, $b_1$ is the slope of the line, and $\hat{y}$ is the estimated value of $y$ for a given $x$ value.

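Estimation Process

[Diagram: Regression model $y = \beta_0 + \beta_1 x + \epsilon$, regression equation $E(y) = \beta_0 + \beta_1 x$, unknown parameters $\beta_0, \beta_1$. Sample data $(x_1, y_1), \ldots, (x_n, y_n)$. Estimated regression equation $\hat{y} = b_0 + b_1 x$, sample statistics $b_0, b_1$. $b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.]

Least Squares Method

Least squares criterion:

$$\min \sum (y_i - \hat{y}_i)^2 = \min \sum \left(y_i - (b_0 + b_1 x_i)\right)^2$$

where
$y_i$ = observed value of the dependent variable for the $i$th observation
$\hat{y}_i$ = estimated value of the dependent variable for the $i$th observation

[Figure: Simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, showing the observed value of $y$ for $x_i$, the predicted value of $y$ for $x_i$, the random error $\epsilon_i$ for this $x_i$ value, intercept $= \beta_0$, and slope $= \beta_1$.]

Slope for the estimated regression equation:

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$

where
$x_i$ = value of the independent variable for the $i$th observation
$y_i$ = value of the dependent variable for the $i$th observation
$\bar{x}$ = mean value for the independent variable
$\bar{y}$ = mean value for the dependent variable

$y$-intercept for the estimated regression equation:

$$b_0 = \bar{y} - b_1 \bar{x}$$

Simple Linear Regression Example: Reed Auto Sales

Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown here.

Number of TV Ads (x)   Number of Cars Sold (y)   x - x̄   y - ȳ
        1                       14                 -1      -6
        3                       24                  1       4
        2                       18                  0      -2
        1                       17                 -1      -3
        3                       27                  1       7
   Σx = 10                  Σy = 100
   x̄ = 2                    ȳ = 20

$$\sum (x_i - \bar{x})(y_i - \bar{y}) = 20, \qquad \sum (x_i - \bar{x})^2 = 4$$

Estimated Regression Equation

Slope for the estimated regression equation:

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{20}{4} = 5$$

$y$-intercept for the estimated regression equation:

$$b_0 = \bar{y} - b_1 \bar{x} = 20 - 5(2) = 10$$

Estimated regression equation:

$$\hat{y} = 10 + 5x$$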
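The least squares arithmetic for the Reed Auto example can be checked with a few lines of Python. This is a minimal sketch rather than part of the original slides; the function name `least_squares` and the variable names are illustrative choices, and only the standard library is used.

```python
# Reed Auto Sales sample (values taken from the example above).
tv_ads = [1, 3, 2, 1, 3]          # x: number of TV ads
cars_sold = [14, 24, 18, 17, 27]  # y: number of cars sold


def least_squares(x, y):
    """Return (b0, b1) for the estimated regression line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(tv_ads, cars_sold)
print(f"y_hat = {b0:g} + {b1:g}x")  # prints "y_hat = 10 + 5x"
```

The same function applies to any paired sample with at least two distinct x values; with a single distinct x value the denominator is zero and the slope is undefined.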

Using Excel's Chart Tools for Scatter Diagram & Estimated Regression Equation

[Figure: Reed Auto Sales estimated regression line.]

Measures of Variation

Total variation is made up of two parts:

$$SST = SSR + SSE$$

$$SST = \sum (y_i - \bar{y})^2 \qquad SSR = \sum (\hat{y}_i - \bar{y})^2 \qquad SSE = \sum (y_i - \hat{y}_i)^2$$

where
$\bar{y}$ = mean value of the dependent variable
$y_i$ = observed value of the dependent variable
$\hat{y}_i$ = predicted value of $y$ for the given $x_i$ value

SST = total sum of squares (total variation). Measures the variation of the $y_i$ values around their mean $\bar{y}$.

SSR = regression sum of squares (explained variation). Variation attributable to the relationship between $x$ and $y$.

SSE = error sum of squares (unexplained variation). Variation in $y$ attributable to factors other than $x$.

[Figure: For a single observation $(x_i, y_i)$, the deviation $y_i - \bar{y}$ (SST) split into $\hat{y}_i - \bar{y}$ (SSR) and $y_i - \hat{y}_i$ (SSE).]

Relationship Among SST, SSR, SSE

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

Coefficient of Determination, $r^2$ or $R^2$

$$r^2 = SSR/SST = 100/114 = .8772$$

The regression relationship is very strong; 87.72% of the variability in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.

Sample Correlation Coefficient

We learned in Chapter 3:

$$r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}} = (\text{sign of } b_1)\sqrt{r^2}$$

where $b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$.
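The decomposition of the total variation can be verified numerically for the Reed Auto data. The following is a small sketch under the assumption that the fitted line is $\hat{y} = 10 + 5x$, as derived earlier; the variable names are illustrative.

```python
import math

# Reed Auto Sales data and the fitted line y_hat = 10 + 5x from the example.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation

r_squared = ssr / sst
# The correlation coefficient takes the sign of the slope b1.
r_xy = math.copysign(math.sqrt(r_squared), b1)

print(sst, ssr, sse)         # 114.0 100.0 14.0
print(round(r_squared, 4))   # 0.8772
print(round(r_xy, 4))        # 0.9366
```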

Sample Correlation Coefficient

$$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$$

The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".

$$r_{xy} = +\sqrt{.8772} = +.9366$$

Note: This only holds for simple regression.

Examples of Approximate $r^2$ (or $R^2$) Values

$r^2 = 1$: Perfect linear relationship between $x$ and $y$. 100% of the variation in $y$ is explained by variation in $x$.

$0 < r^2 < 1$: Weaker linear relationships between $x$ and $y$. Some but not all of the variation in $y$ is explained by variation in $x$.

$r^2 = 0$: No linear relationship between $x$ and $y$. The value of $y$ does not depend on $x$. (None of the variation in $y$ is explained by variation in $x$.)

[Figure: Different values of the correlation coefficient once again.]

SUMMARY OUTPUT: Armand's Pizza (Excel file), $\hat{y} = 60 + 5x$

Regression Statistics
Multiple R            0.9501
R Square              0.9027
Adjusted R Square     0.8906
Standard Error       13.8293
Observations         10

ANOVA
                    df    SS       MS        F         Significance F
Regression (SSR)     1    14200    14200     74.2484   2.55E-05
Residual (SSE)       8     1530    191.25
Total (SST)          9    15730

                Coefficients   Standard Error   t Stat   P-value    Lower 95%   Upper 95%
Intercept            60            9.2260       6.5033   0.0002     38.7247     81.2753
X Variable 1          5            0.5803       8.6167   2.55E-05    3.6619      6.3381
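The regression statistics in the Armand's Pizza output can be rebuilt from the ANOVA sums of squares alone. The sketch below assumes the standard definitions of the mean squares and of adjusted $R^2$ for one independent variable, $1 - (1 - R^2)(n - 1)/(n - 2)$; that adjusted $R^2$ formula is not stated in the slides and is used here as an assumption that reproduces the printed value.

```python
import math

# Sums of squares and sample size from the Armand's Pizza ANOVA table.
n = 10
ssr = 14200.0   # regression sum of squares
sse = 1530.0    # error (residual) sum of squares
sst = ssr + sse

df_regression = 1      # one independent variable
df_residual = n - 2

msr = ssr / df_regression
mse = sse / df_residual

r_squared = ssr / sst
multiple_r = math.sqrt(r_squared)  # absolute value of the correlation
# Assumed standard formula for adjusted R^2 with one predictor.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)
standard_error = math.sqrt(mse)
f_stat = msr / mse

print(f"Multiple R        {multiple_r:.4f}")
print(f"R Square          {r_squared:.4f}")
print(f"Adjusted R Square {adjusted_r_squared:.4f}")
print(f"Standard Error    {standard_error:.4f}")
print(f"F                 {f_stat:.4f}")
```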

[Figure: Armand's Pizza. Scatter plot of sales against population with the predicted linear trend line $\hat{y} = 60 + 5x$, $R^2 = 0.9027$.]

Reed Auto Sales Estimated Regression Line Once Again

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.9366
R Square              0.8772
Adjusted R Square     0.8363
Standard Error        2.1602
Observations          5

ANOVA
              df    SS     MS       F         Significance F
Regression     1    100    100      21.4286   0.0190
Residual       3     14    4.6667
Total          4    114

             Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept         10            2.3664       4.2258   0.0242     2.4690     17.5310
Ads                5            1.0801       4.6291   0.0190     1.5626      8.4374

Point Estimation

If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be:

$$\hat{y} = 10 + 5(3) = 25 \text{ cars}$$

Looking at Regression in More Detail

Assumptions About the Error Term $\epsilon$

1. The error $\epsilon$ is a random variable with mean of zero.
2. The variance of $\epsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
3. The values of $\epsilon$ are independent.
4. The error $\epsilon$ is a normally distributed random variable.

Testing for Significance

To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero. Two tests are commonly used: the t test and the F test. Both the t test and the F test require an estimate of $\sigma^2$, the variance of $\epsilon$ in the regression model.

An Estimate of $\sigma^2$

The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.

$$s^2 = MSE = SSE/(n - 2)$$

where

$$SSE = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$$

An Estimate of $\sigma$

To estimate $\sigma$ we take the square root of $s^2$. The resulting $s$ is called the standard error of the estimate.

$$s = \sqrt{MSE} = \sqrt{\frac{SSE}{n - 2}}$$
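As a quick check on the point estimate and on the estimates of $\sigma^2$ and $\sigma$, the Reed Auto numbers can be recomputed directly. This is an illustrative sketch; the helper name `predict` is hypothetical and the coefficients are taken from the fitted line $\hat{y} = 10 + 5x$.

```python
import math

# Reed Auto Sales data and fitted coefficients from the example.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0
n = len(x)


def predict(x_value):
    """Point estimate of the mean of y for a given x value."""
    return b0 + b1 * x_value


sse = sum((yi - predict(xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)   # s^2, the estimate of sigma^2
s = math.sqrt(mse)    # standard error of the estimate

print(predict(3))      # 25.0 cars expected when 3 TV ads are run
print(round(mse, 4))   # 4.6667
print(round(s, 4))     # 2.1602
```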

Testing for Significance: t Test

Hypotheses:

$$H_0: \beta_1 = 0 \qquad H_a: \beta_1 \neq 0$$

Test statistic:

$$t = \frac{b_1}{s_{b_1}} \qquad \text{where} \qquad s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$$

Rejection rule: Reject $H_0$ if p-value $< \alpha$, or $t < -t_{\alpha/2}$, or $t > t_{\alpha/2}$, where $t_{\alpha/2}$ is based on a t distribution with $n - 2$ degrees of freedom.

For the Reed Auto Sales example:

1. Determine the hypotheses. $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
2. Specify the level of significance. $\alpha = .05$
3. Select the test statistic. $t = b_1 / s_{b_1}$
4. State the rejection rule. Reject $H_0$ if p-value $< .05$ or $|t| > 3.182$ (with 3 degrees of freedom).
5. Compute the value of the test statistic. $t = b_1 / s_{b_1} = 5/1.08 = 4.63$
6. Determine whether to reject $H_0$. $t = 4.541$ provides an area of .01 in the upper tail. Hence, the p-value is less than .02. (Also, $t = 4.63 > 3.182$.) We can reject $H_0$.

Confidence Interval for $\beta_1$

The form of a confidence interval for $\beta_1$ is:

$$b_1 \pm t_{\alpha/2}\, s_{b_1}$$

where $b_1$ is the point estimator, $t_{\alpha/2}\, s_{b_1}$ is the margin of error, and $t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with $n - 2$ degrees of freedom.

Rejection rule: Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.

95% confidence interval for $\beta_1$:

$$b_1 \pm t_{\alpha/2}\, s_{b_1} = 5 \pm 3.182(1.08) = 5 \pm 3.44$$

or 1.56 to 8.44.

Conclusion: 0 is not included in the confidence interval. Reject $H_0$.

Testing for Significance: F Test

Hypotheses:

$$H_0: \beta_1 = 0 \qquad H_a: \beta_1 \neq 0$$

Test statistic:

$$F = MSR/MSE$$

Rejection rule: Reject $H_0$ if p-value $< \alpha$ or $F > F_\alpha$, where $F_\alpha$ is based on an F distribution with 1 degree of freedom in the numerator and $n - 2$ degrees of freedom in the denominator.
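The t test and the confidence interval for $\beta_1$ can be traced step by step in code. This sketch uses only the standard library, so the critical value 3.182 is hard-coded from the table value quoted in the text rather than computed; the variable names are illustrative.

```python
import math

# Reed Auto Sales data.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate and estimated standard deviation of b1.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
s_b1 = s / math.sqrt(sxx)

t_stat = b1 / s_b1

# t value with area .025 in the upper tail and n - 2 = 3 degrees of freedom,
# copied from the text (an assumption of this sketch, not computed here).
T_CRITICAL = 3.182

margin_of_error = T_CRITICAL * s_b1
confidence_interval = (b1 - margin_of_error, b1 + margin_of_error)
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"s_b1 = {s_b1:.4f}, t = {t_stat:.4f}")
print(f"95% CI for beta_1: {confidence_interval[0]:.2f} to {confidence_interval[1]:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Both decision rules agree, as they must: the interval excludes 0 exactly when $|t|$ exceeds the critical value.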

[Figure: Mechanics of the F test graphically.]

Testing for Significance: F Test

1. Determine the hypotheses. $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
2. Specify the level of significance. $\alpha = .05$
3. Select the test statistic. $F = MSR/MSE$
4. State the rejection rule. Reject $H_0$ if p-value $< .05$ or $F > 10.13$ (with 1 d.f. in the numerator and 3 d.f. in the denominator).
5. Compute the value of the test statistic. $F = MSR/MSE = 100/4.667 = 21.43$
6. Determine whether to reject $H_0$. $F = 17.44$ provides an area of .025 in the upper tail. Thus, the p-value corresponding to $F = 21.43$ is less than .025, which is below $\alpha = .05$. Hence, we reject $H_0$.

The statistical evidence is sufficient to conclude that we have a significant relationship between the number of TV ads aired and the number of cars sold.

Some Cautions about the Interpretation of Significance Tests

Rejecting $H_0: \beta_1 = 0$ and concluding that the relationship between $x$ and $y$ is significant does not enable us to conclude that a cause-and-effect relationship is present between $x$ and $y$.

Just because we are able to reject $H_0: \beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between $x$ and $y$.

Residual Analysis

If the assumptions about the error term $\epsilon$ appear questionable, the hypothesis tests about the significance of the regression relationship and the interval estimation results may not be valid. The residuals provide the best information about $\epsilon$.

Residual for observation $i$:

$$y_i - \hat{y}_i$$

Much of the residual analysis is based on an examination of graphical plots.

Residual Plot Against $x$

If the assumption that the variance of $\epsilon$ is the same for all values of $x$ is valid, and the assumed regression model is an adequate representation of the relationship between the variables, then the residual plot should give an overall impression of a horizontal band of points. The residuals should be unbiased: they should have an average value of zero in any thin vertical strip.

[Figure: Residual plot against $x$.]
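The F test for the Reed Auto example reduces to one ratio and one comparison. In the sketch below the critical value 10.13 is copied from the rejection rule in the text (it is not computed), and the slope standard error 1.0801 is taken from the Reed Auto Excel output. The final check relies on a general property of simple linear regression that the slides do not state: the F statistic equals the square of the t statistic for the slope.

```python
# Sums of squares and sample size for the Reed Auto Sales example.
n = 5
ssr = 100.0
sse = 14.0

msr = ssr / 1         # 1 degree of freedom in the numerator
mse = sse / (n - 2)   # n - 2 degrees of freedom in the denominator
f_stat = msr / mse

# F value with area .05 in the upper tail for (1, 3) degrees of freedom,
# as quoted in the rejection rule (hard-coded assumption of this sketch).
F_CRITICAL = 10.13
reject_h0 = f_stat > F_CRITICAL

# Slope estimate and its standard error from the Excel output.
b1, s_b1 = 5.0, 1.0801
t_stat = b1 / s_b1

print(f"F = {f_stat:.2f}")        # F = 21.43
print(f"t^2 = {t_stat ** 2:.2f}") # t^2 = 21.43
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With only one independent variable the two tests always lead to the same conclusion.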

Residual Plot Against $x$

Example: Armand's Pizza Parlors

Student Population (x)   Sales (y)   Predicted sales ŷ = 60 + 5x   Residuals (y - ŷ)
          2                  58                  70                      -12
          6                 105                  90                       15
          8                  88                 100                      -12
          8                 118                 100                       18
         12                 117                 120                       -3
         16                 137                 140                       -3
         20                 157                 160                       -3
         20                 169                 160                        9
         22                 149                 170                      -21
         26                 202                 190                       12

[Figure: Residual plot against $x$.]

Using Excel to Produce a Residual Plot

When the Regression dialog box appears, we must also select the Residual Plot option. The output will include two new items:
A plot of the residuals against the independent variable, and
A list of predicted values of $y$ and the corresponding residual values.

Standardized Residual Plot

The standardized residual plot can provide insight about the assumption that the error term $\epsilon$ has a normal distribution. If this assumption is satisfied, the distribution of the standardized residuals should appear to come from a standard normal probability distribution.

Example: Armand's Pizza Parlors

Observation   Predicted sales ŷ = 60 + 5x   Residuals (y - ŷ)   Standardized Residual
     1                  70                       -12                 -1.0792
     2                  90                        15                  1.2224
     3                 100                       -12                  -.9487
     4                 100                        18                  1.4230
     5                 120                        -3                  -.2296
     6                 140                        -3                  -.2296
     7                 160                        -3                  -.2372
     8                 160                         9                   .7115
     9                 170                       -21                 -1.7114
    10                 190                        12                  1.0792

Independence Assumption

The independence assumption is most likely to be violated when the data are time-series data.
If the data are not time series, they can be reordered without affecting the data.
For time-series data, the time-ordered error terms can be autocorrelated.
Positive autocorrelation is when a positive error term in time period $i$ tends to be followed by another positive value in period $i + k$.
Negative autocorrelation is when a positive error term tends to be followed by a negative value.

[Figure: Independence assumption visually, positive autocorrelation.]

[Figure: Independence assumption visually, negative autocorrelation.]
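The residuals and standardized residuals for Armand's Pizza Parlors can be reproduced from the raw data. The slides do not state how the standardized residuals were computed; the sketch below assumes the leverage-based form $(y_i - \hat{y}_i) / (s\sqrt{1 - h_i})$ with $h_i = 1/n + (x_i - \bar{x})^2 / \sum (x_i - \bar{x})^2$, which matches the tabulated values. The helper name `leverage` is illustrative.

```python
import math

# Armand's Pizza Parlors data from the residual table.
population = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]          # x
sales = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]   # y
b0, b1 = 60.0, 5.0  # estimated regression equation y_hat = 60 + 5x

n = len(population)
x_bar = sum(population) / n
sxx = sum((xi - x_bar) ** 2 for xi in population)

predicted = [b0 + b1 * xi for xi in population]
residuals = [yi - yh for yi, yh in zip(sales, predicted)]

# Standard error of the estimate, s = sqrt(SSE / (n - 2)).
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))


def leverage(xi):
    """Leverage h_i of an observation with independent-variable value xi."""
    return 1 / n + (xi - x_bar) ** 2 / sxx


# Assumed formula: residual divided by its estimated standard deviation.
standardized = [
    e / (s * math.sqrt(1 - leverage(xi)))
    for e, xi in zip(residuals, population)
]

for i, (yh, e, z) in enumerate(zip(predicted, residuals, standardized), start=1):
    print(f"{i:2d}  {yh:5.0f}  {e:5.0f}  {z:8.4f}")
```

Under the normality assumption, nearly all standardized residuals should fall between -2 and +2, which is the case for all ten observations here.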