Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation


Statistics for Managers Using Microsoft Excel/SPSS
Chapter 13: The Simple Linear Regression Model and Correlation
1999 Prentice-Hall, Inc. Chap. 13-1

Chapter Topics
Types of Regression Models
Determining the Simple Linear Regression Equation
Measures of Variation in Regression and Correlation
Assumptions of Regression and Correlation
Residual Analysis and the Durbin-Watson Statistic
Estimation of Predicted Values
Correlation: Measuring the Strength of the Association

Purpose of Regression and Correlation Analysis
Regression analysis is used primarily for prediction: a statistical model is used to predict the values of a dependent (response) variable based on the values of at least one independent (explanatory) variable.
Correlation analysis is used to measure the strength of the association between numerical variables.

The Scatter Diagram
Plot of all $(X_i, Y_i)$ pairs.
[Figure: scatter plot of Y against X]

Types of Regression Models
[Figure: four scatter plots showing a positive linear relationship, a negative linear relationship, a relationship that is not linear, and no relationship]

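Simple Linear Regression Model
The relationship between the variables is a linear function: the straight line that best fits the data.
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
where $Y_i$ is the dependent (response) variable, $\beta_0$ is the Y intercept, $\beta_1$ is the slope, $X_i$ is the independent (explanatory) variable, and $\varepsilon_i$ is the random error.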

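Population Linear Regression Model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $Y_i$ is the observed value and $\varepsilon_i$ is the random error.
$\mu_{YX} = \beta_0 + \beta_1 X_i$ is the mean of Y at $X_i$.
[Figure: observed values scattered around the line $\mu_{YX} = \beta_0 + \beta_1 X_i$]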

Sample Linear Regression Model
$\hat{Y}_i = b_0 + b_1 X_i$
$\hat{Y}_i$ = predicted value of Y for observation i
$X_i$ = value of X for observation i
$b_0$ = sample Y intercept, used as an estimate of the population $\beta_0$
$b_1$ = sample slope, used as an estimate of the population $\beta_1$

Simple Linear Regression Equation: Example
You wish to examine the relationship between the square footage of produce stores and their annual sales. Sample data for 7 stores were obtained. Find the equation of the straight line that best fits the data.

Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

Scatter Diagram Example
[Figure: Excel output, scatter plot of Annual Sales ($000) against Square Feet]

Equation for the Best Straight Line
$\hat{Y}_i = b_0 + b_1 X_i = 1636.415 + 1.487 X_i$
From the Excel printout:

               Coefficients
Intercept      1636.414726
X Variable 1   1.486633657
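The coefficients in the printout can be reproduced by hand. The following is a minimal sketch, assuming the usual least-squares formulas $b_1 = \sum (X_i - \bar{X})(Y_i - \bar{Y}) / \sum (X_i - \bar{X})^2$ and $b_0 = \bar{Y} - b_1 \bar{X}$; the slides show only the Excel output, so the function name `fit_line` and the variable names are illustrative.

```python
# Produce-store sample from the slides: square feet (X) and annual sales in $000 (Y).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]


def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y-hat = b0 + b1 * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squared deviations of X.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(square_feet, annual_sales)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

Rounded to three decimals, the result agrees with the intercept and slope in the Excel printout.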

Graph of the Best Straight Line
[Figure: fitted line drawn through the scatter plot of Annual Sales ($000) against Square Feet]

Interpreting the Results
$\hat{Y}_i = 1636.415 + 1.487 X_i$
The slope of 1.487 means that for each increase of one unit in X, Y is estimated to increase by 1.487 units.
For each increase of 1 square foot in the size of the store, the model predicts that expected annual sales increase by $1,487.

Measures of Variation: The Sum of Squares
SST = Total Sum of Squares: measures the variation of the $Y_i$ values around their mean $\bar{Y}$
SSR = Regression Sum of Squares: explained variation attributable to the relationship between X and Y
SSE = Error Sum of Squares: variation attributable to factors other than the relationship between X and Y

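Measures of Variation: The Sum of Squares
$SST = \sum (Y_i - \bar{Y})^2$
$SSE = \sum (Y_i - \hat{Y}_i)^2$
$SSR = \sum (\hat{Y}_i - \bar{Y})^2$
[Figure: one observation $Y_i$ at $X_i$, with its deviations from the regression line and from $\bar{Y}$ marked]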

Measures of Variation: The Sum of Squares: Example
Excel output for produce stores:

             df   SS
Regression   1    30380456.12   (SSR)
Residual     5    1871199.595   (SSE)
Total        6    32251655.71   (SST)

The Coefficient of Determination
$r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
Measures the proportion of variation that is explained by the independent variable in the regression model.
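The three sums of squares and $r^2$ can be checked directly against the produce-store data. This is a minimal sketch under the definitions above; the variable names are illustrative, and the least-squares fit is recomputed inline so that the block is self-contained.

```python
# Produce-store sample from the slides: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # unexplained variation
r_squared = ssr / sst

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"r^2 = {r_squared:.4f}")
```

The printed sums should agree with the SS column of the Excel output, and SSR and SSE add up to SST.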

Coefficients of Determination ($r^2$) and Correlation ($r$)
[Figure: four scatter plots, each with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$, labeled $r^2 = 1, r = +1$; $r^2 = 1, r = -1$; $r^2 = .8, r = +0.9$; and $r^2 = 0, r = 0$]

Standard Error of Estimate
$S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}}$
The standard deviation of the variation of observations around the regression line.
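As a quick numeric check of this formula, the sketch below plugs in the residual sum of squares and sample size reported in the Excel output for the produce stores; the variable names are illustrative.

```python
import math

sse = 1871199.595  # Residual SS from the Excel output for the produce stores
n = 7              # number of stores in the sample

# Standard error of the estimate: square root of SSE over its degrees of freedom.
s_yx = math.sqrt(sse / (n - 2))
print(f"S_YX = {s_yx:.2f}")
```

The result matches the "Standard Error" entry in Excel's regression statistics.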

Measures of Variation: Example
Excel output for produce stores:

Regression Statistics
Multiple R          0.9705572
R Square            0.94198129
Adjusted R Square   0.93037754
Standard Error      611.751517   ($S_{YX}$)
Observations        7

$r^2 = .94$: 94% of the variation in annual sales can be explained by the variability in the size of the store as measured by square footage.

Linear Regression Assumptions (for linear models)
1. Normality: the Y values are normally distributed for each X; the probability distribution of the error is normal.
2. Homoscedasticity (constant variance)
3. Independence of errors

Variation of Errors Around the Regression Line
The Y values are normally distributed around the regression line.
For each X value, the spread (variance) around the regression line is the same.
[Figure: error distributions f(e) centered on the regression line at $X_1$ and $X_2$]

Residual Analysis
Purposes: examine linearity; evaluate violations of assumptions.
Graphical analysis of residuals: plot the residuals against the $X_i$ values.
Residual: the difference between the actual $Y_i$ and the predicted $\hat{Y}_i$.
Studentized residuals allow consideration of the magnitude of the residuals.

Residual Analysis for Linearity
[Figure: two plots of the residuals e against X, one labeled "Not Linear" and one labeled "Linear"]

Residual Analysis for Homoscedasticity
Using standardized residuals (SR).
[Figure: two plots of SR against X, one labeled "Heteroscedasticity" and one labeled "Homoscedasticity"]

Residual Analysis: Computer Output Example
Produce stores, Excel output:

Observation   Predicted Y    Residuals
1             4202.344417    -521.3444173
2             3928.803824    -533.8038245
3             5822.775103     830.2248971
4             9894.664688    -351.6646882
5             3557.14541     -239.1454103
6             4918.90184      644.0981603
7             3588.364717     171.6352829

[Figure: residual plot of the residuals against Square Feet]
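The predicted values and residuals in this output follow from the fitted line. A minimal sketch, assuming the coefficients are taken from the Excel printout rather than refitted; the variable names are illustrative.

```python
# Produce-store sample from the slides: square feet (X) and annual sales in $000 (Y).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

# Intercept and slope as reported in the Excel printout.
b0, b1 = 1636.414726, 1.486633657

predicted = [b0 + b1 * x for x in square_feet]
# Residual = actual Y minus predicted Y.
residuals = [y - y_hat for y, y_hat in zip(annual_sales, predicted)]

for store, (y_hat, e) in enumerate(zip(predicted, residuals), start=1):
    print(f"{store}  {y_hat:12.4f}  {e:12.4f}")
```

Because the line was fitted by least squares, the residuals sum to zero up to the rounding of the coefficients.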

The Durbin-Watson Statistic
Used when data are collected over time to detect autocorrelation (residuals in one time period are related to residuals in another period).
Measures violation of the independence assumption.
$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$
D should be close to 2. If not, examine the model for autocorrelation.
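A minimal sketch of the statistic as defined above. The function name `durbin_watson` and both residual series are hypothetical, made up for illustration; they do not come from the slides, whose produce-store data are not a time series.

```python
def durbin_watson(residuals):
    """D = sum over i=2..n of (e_i - e_{i-1})^2, divided by sum over i=1..n of e_i^2."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals listed in time order (illustrative only).
drifting = [2.0, 1.5, 1.0, -0.5, -1.5, -2.5]       # neighbors tend to be alike
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]    # neighbors tend to be opposite

print(durbin_watson(drifting))
print(durbin_watson(alternating))
```

The drifting series gives a value well below 2, the pattern associated with positive autocorrelation, while the alternating series gives a value well above 2, the pattern associated with negative autocorrelation.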

Residual Analysis for Independence
[Figure: two plots of standardized residuals (SR) against X, one labeled "Not Independent" and one labeled "Independent"]

Inferences about the Slope: t Test
t test for a population slope: is there a linear relationship between X and Y?
Null and alternative hypotheses:
$H_0: \beta_1 = 0$ (no linear relationship)
$H_1: \beta_1 \neq 0$ (linear relationship)
Test statistic: $t = \frac{b_1 - \beta_1}{S_{b_1}}$ with df = n - 2, where $S_{b_1} = \frac{S_{YX}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$

Example: Produce Stores
Data for 7 stores:

Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

Regression model obtained: $\hat{Y}_i = 1636.415 + 1.487 X_i$
The slope of this model is 1.487. Is there a linear relationship between the square footage of a store and its annual sales?

Inferences about the Slope: t Test Example
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$
$\alpha = .05$
df = 7 - 2 = 5
Critical values: ±2.5706 (rejection regions of .025 in each tail)
Test statistic, from the Excel printout:

               t Stat      P-value
Intercept      3.6244333   0.0151488
X Variable 1   9.009944    0.0002812

Decision: reject $H_0$, since t = 9.0099 exceeds 2.5706.
Conclusion: there is evidence of a linear relationship.
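The t statistic for the slope can be rebuilt from the raw data using $S_{b_1} = S_{YX} / \sqrt{\sum (X_i - \bar{X})^2}$. A minimal sketch with illustrative variable names; the critical value 2.5706 is taken from the slide as a constant rather than computed, which keeps the example free of third-party libraries.

```python
import math

# Produce-store sample from the slides: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Least-squares fit.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate and standard error of the slope.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = math.sqrt(sse / (n - 2))
s_b1 = s_yx / math.sqrt(sxx)

# Test H0: beta_1 = 0 against H1: beta_1 != 0.
t_stat = (b1 - 0) / s_b1
t_crit = 2.5706  # two-tailed critical value for alpha = .05, df = 5 (from the slide)
reject_h0 = abs(t_stat) > t_crit

print(f"S_b1 = {s_b1:.4f}, t = {t_stat:.4f}, reject H0: {reject_h0}")
```

The computed t agrees with the t Stat reported for X Variable 1 in the Excel printout.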

Inferences about the Slope: Confidence Interval Example
Confidence interval estimate of the slope: $b_1 \pm t_{n-2} S_{b_1}$
Excel printout for produce stores:

               Lower 95%    Upper 95%
Intercept      475.810926   2797.01853
X Variable 1   1.06249037   1.91077694

At the 95% level of confidence, the confidence interval for the slope is (1.062, 1.911), which does not include 0.
Conclusion: there is a significant linear relationship between annual sales and the size of the store.
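The interval for the slope can be approximated from figures already shown in the Excel printouts. This sketch assumes $S_{b_1}$ is recovered as $b_1 / t$, which holds because the reported t Stat tests $\beta_1 = 0$; the standard error column itself does not appear in the slides, and the variable names are illustrative.

```python
b1 = 1.486633657    # slope from the Excel printout
t_stat = 9.009944   # t Stat for the slope from the Excel printout
t_crit = 2.5706     # t value for 95% confidence with df = 5, as quoted on the slides

# Under H0: beta_1 = 0 the statistic is t = b1 / S_b1, so S_b1 = b1 / t.
s_b1 = b1 / t_stat

lower = b1 - t_crit * s_b1
upper = b1 + t_crit * s_b1
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```

The bounds agree with Excel's Lower 95% and Upper 95% to about five decimal places; the small remainder comes from rounding in the critical value.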

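Estimation of Predicted Values
Confidence interval estimate for $\mu_{YX}$, the mean of Y given a particular $X_i$:
$\hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$
$S_{YX}$ is the standard error of the estimate.
$t_{n-2}$ is the t value from the table with df = n - 2.
The size of the interval varies according to the distance of $X_i$ from the mean $\bar{X}$.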

Estimation of Predicted Values
Confidence interval estimate for an individual response $Y_i$ at a particular $X_i$:
$\hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$
The addition of the 1 under the square root makes this interval wider than the interval for the mean of Y.

Interval Estimates for Different Values of X
[Figure: bands around the regression line showing the confidence interval for an individual $Y_i$ and the confidence interval for the mean of Y, with $\bar{X}$ and a given X marked on the horizontal axis]

Example: Produce Stores
Data for 7 stores:

Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

Predict the annual sales for a store with 2,000 square feet.
Regression model obtained: $\hat{Y}_i = 1636.415 + 1.487 X_i$

Estimation of Predicted Values: Example
Confidence interval estimate for $\mu_{YX}$
Find the 95% confidence interval for the average annual sales of stores with 2,000 square feet.
Predicted sales: $\hat{Y}_i = 1636.415 + 1.487(2000) = 4610.415$ ($000)
$\bar{X} = 2350.29$, $S_{YX} = 611.75$, $t_{n-2} = t_5 = 2.5706$
$\hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \approx 4610.42 \pm 612.7$
This is the confidence interval for the mean of Y.

Estimation of Predicted Values: Example
Confidence interval estimate for an individual Y
Find the 95% confidence interval for the annual sales of one particular store with 2,000 square feet.
Predicted sales: $\hat{Y}_i = 1636.415 + 1.487(2000) = 4610.415$ ($000)
$\bar{X} = 2350.29$, $S_{YX} = 611.75$, $t_{n-2} = t_5 = 2.5706$
$\hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \approx 4610.42 \pm 1687.7$
This is the confidence interval for an individual Y.
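Both interval formulas can be evaluated at X = 2,000 directly from the seven data points. This is a minimal sketch; the function name `interval_estimates` is illustrative, and the t value is taken from the slide as a constant. Because the code uses unrounded coefficients, its point prediction is about 4609.68, slightly below the figure obtained from the rounded equation.

```python
import math

# Produce-store sample from the slides: square feet (X) and annual sales in $000 (Y).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]
T_CRIT = 2.5706  # t value for df = n - 2 = 5 at 95% confidence, as quoted on the slides


def interval_estimates(x, y, x_new, t_crit):
    """Return (y_hat, half-width for the mean of Y, half-width for an individual Y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))
    # Term shared by both intervals: 1/n plus the scaled squared distance from X-bar.
    h = 1 / n + (x_new - x_bar) ** 2 / sxx
    y_hat = b0 + b1 * x_new
    half_mean = t_crit * s_yx * math.sqrt(h)
    half_individual = t_crit * s_yx * math.sqrt(1 + h)
    return y_hat, half_mean, half_individual


y_hat, half_mean, half_individual = interval_estimates(
    square_feet, annual_sales, 2000, T_CRIT
)
print(f"Predicted sales: {y_hat:.2f}")
print(f"Mean of Y:       +/- {half_mean:.2f}")
print(f"Individual Y:    +/- {half_individual:.2f}")
```

Computed this way, the half-widths come out near 612.7 for the mean of Y and near 1687.7 for an individual store; the second is wider because of the extra 1 under the square root.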

Correlation: Measuring the Strength of Association
Answers the question: how strong is the linear relationship between 2 variables?
Coefficient of correlation used:
  Population correlation coefficient denoted $\rho$ (rho)
  Values range from -1 to +1
  Measures degree of association
  Is the square root of the coefficient of determination, with the sign of the slope

Test of Coefficient of Correlation
Tests whether there is a linear relationship between 2 numerical variables.
Same conclusion as testing the population slope $\beta_1$.
Hypotheses:
$H_0: \rho = 0$ (no correlation)
$H_1: \rho \neq 0$ (correlation)
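The equivalence with the slope test can be seen numerically. The slide does not give a test statistic for the correlation, so this sketch assumes the standard form $t = r \sqrt{n-2} / \sqrt{1 - r^2}$ with df = n - 2, and takes r from the "Multiple R" entry of the Excel output (positive here because the slope is positive).

```python
import math

r = 0.9705572  # sample correlation: Multiple R from the Excel output
n = 7          # number of stores

# Assumed standard test statistic for H0: rho = 0 (not shown on the slides).
t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t_slope = 9.009944  # t Stat for the slope from the Excel printout
print(f"t from r = {t_r:.4f}, t from slope = {t_slope:.4f}")
```

The two statistics coincide up to rounding, so the test on the correlation leads to the same decision as the test on the slope.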

Chapter Summary
Described types of regression models
Determined the simple linear regression equation
Provided measures of variation in regression and correlation
Stated assumptions of regression and correlation
Described residual analysis and the Durbin-Watson statistic
Provided estimation of predicted values
Discussed correlation: measuring the strength of the association