Learning Objectives for Chapter 11

Chapter 11: Linear Regression and Correlation Methods

Hildebrand, Ott and Gray, Basic Statistical Ideas for Managers, Second Edition

Learning Objectives for Chapter 11
- Using the scatterplot in regression analysis
- Using the method of least squares for finding the best-fitting line
- Understanding the underlying assumptions in regression analysis
- Determining whether or not any observations are high leverage points, y-outliers or high influence points
- Using the t-test for testing the significance of the slope coefficient
- Using the F-test for testing the slope coefficient
- Understanding the difference between the correlation coefficient and the coefficient of determination

Sections 11.1 and 11.2: The Linear Regression Model; Estimating Model Parameters

Objective: Model the relationship between a response or dependent variable (Y) and one predictor or independent variable (x).

Examples:
- For consumer purchase decisions, let Y = market share and x = consumers' degree of top-of-mind brand awareness (% of consumers who name this brand first).
- In beta analysis, let Y = return on a security (IBM) over a period of time and x = return on the market (DJIA).
- For a particular corporation, let Y = sales revenue for a region at the year's end and x = advertising expenditures for the year. Y is recorded in tens of thousands of dollars and x is recorded in thousands of dollars.

In all three examples, can x be used to predict Y?

Consider the example where Y = sales revenue and x = advertising expenditures. The data follow:

    Region  Sales  Adv Exp
    A       1      1
    B       1      2
    C       2      1
    D       2      3
    E       3      2
    F       3      4
    G       4      3
    H       4      5
    I       5      5
    J       5      6

a) Is there a linear relationship between Sales and Advertising Expenditures?

[Figure: Scatterplot for Sales vs. Advertising Expenditures; vertical axis Sales, horizontal axis Adv Exp.]

The scatterplot is used to assess whether or not there is a linear relationship between Sales and Advertising Expenditures.

Is a linear relationship feasible? From the scatterplot, it appears that as Advertising Expenditures increase, Sales increase linearly.

The relationship between Sales and Advertising Expenditures is an example of a statistical relationship between variables. If a straight line is fit to the points, not all of the points will fall on the line, because factors other than Advertising Expenditures also affect Sales. An example of a deterministic relationship is: Total Costs = Fixed Costs + Variable Costs.

Fitted Model

The general expression for the line to be fit is:

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$

Residuals are prediction errors in the sample. The residual for an observation $(x_i, y_i)$ is the difference between the actual value of sales and the predicted value:

$\text{Residual}_i = y_i - \hat{y}_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)$

How should the fitted line be determined? Possible criteria for fitting a line through the points $(x_i, y_i)$:

1. Make the sum of the residuals, $\sum_{i=1}^{n} (y_i - \hat{y}_i)$, as small as possible in magnitude (zero). Deficiency: an infinite number of lines satisfy this criterion, since every line through $(\bar{x}, \bar{y})$ has residuals that sum to zero.
2. Minimize the sum of the absolute values of the residuals, $\min \sum_{i=1}^{n} |y_i - \hat{y}_i|$. Deficiency: the procedure is not available in most statistical software.
3. Minimize the sum of the squared residuals, $\min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$.

Criterion (3) is known as the Method of Least Squares.

Procedure to obtain the Intercept and Slope of the Fitted Model

When the Least-Squares criterion is used, $\hat{\beta}_0$ and $\hat{\beta}_1$ are found from the following two expressions:

$\hat{\beta}_1 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$

To facilitate the arithmetic, note that

$\sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - n\bar{x}\bar{y}$ and $\sum (x_i - \bar{x})^2 = \sum x_i^2 - n\bar{x}^2$

Example (Sales vs. Advertising Expenditures):
b) Find the least-squares regression line. Do the calculations by hand.

    Region  Sales (y)  Adv Exp (x)  (x)(y)  x^2
    A       1          1            1       1
    B       1          2            2       4
    C       2          1            2       1
    D       2          3            6       9
    E       3          2            6       4
    F       3          4            12      16
    G       4          3            12      9
    H       4          5            20      25
    I       5          5            25      25
    J       5          6            30      36
    Sum     30         32           116     130

From the table, $\sum x_i y_i = 116$, $\sum x_i^2 = 130$, $\bar{x} = 3.2$, $\bar{y} = 3.0$.

$\sum x_i y_i - n\bar{x}\bar{y} = 116 - (10)(3.2)(3.0) = 20$

$\sum x_i^2 - n\bar{x}^2 = 130 - (10)(3.2)^2 = 27.6$

$\hat{\beta}_1 = [\sum x_i y_i - n\bar{x}\bar{y}] / [\sum x_i^2 - n\bar{x}^2] = 20 / 27.6 = 0.725$

$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 3.0 - (0.725)(3.2) = 0.68$
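The hand calculation can be checked with a few lines of code. The following is a minimal Python sketch, not part of the original slides. The data values are an assumption inferred from the sums tabulated in the example (sum of x = 32, sum of y = 30, sum of xy = 116, sum of x^2 = 130), and the function name `least_squares` is purely illustrative.

```python
# Sales vs. Advertising Expenditures, regions A-J.
# Assumption: values inferred from the sums shown in the worked example.
adv_exp = [1, 2, 1, 3, 2, 4, 3, 5, 5, 6]   # x, thousands of dollars
sales = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]     # y, tens of thousands of dollars


def least_squares(x, y):
    """Return (intercept, slope) using the shortcut sums from the slide."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar
    s_xx = sum(a * a for a in x) - n * x_bar ** 2
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = least_squares(adv_exp, sales)
print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
```

Carrying full precision gives an intercept near 0.681 and a slope near 0.725, in line with the rounded hand calculation.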

The least-squares regression equation is:

$\hat{Y} = 0.68 + 0.725x$

where 0.68 is the y-intercept and 0.725 is the slope coefficient. Notice the similarity to the slope-intercept formula for a straight line, y = mx + b. In regression, the order of the terms is reversed: the intercept is written first.

Example (Sales vs. Advertising Expenditures):
c) State the equation of the least-squares regression line by using Minitab. The Minitab output follows:

    Regression Analysis: Sales versus Adv Exp

    The regression equation is
    Sales = 0.681 + 0.725 Adv Exp

    Predictor    Coef  SE Coef     T      P
    Constant   0.6812   0.5694  1.20  0.266
    Adv Exp    0.7246   0.1579  4.59  0.002

    S = 0.829702   R-Sq = 72.5%   R-Sq(adj) = 69.0%

The equation of the fitted line is: predicted Sales = 0.681 + 0.725 Adv Exp.

d) Obtain the fitted line plot by using Minitab.

[Figure: Fitted Regression Line for Sales vs. Adv Exp; Sales = 0.6812 + 0.7246 Adv Exp; S = 0.829702, R-Sq = 72.5%, R-Sq(adj) = 69.0%; vertical axis Sales, horizontal axis Adv Exp.]

The fitted line plot reinforces the appropriateness of using a linear model.

Example (Sales vs. Advertising Expenditures):
e) Interpret the slope of the fitted line in the context of this problem.

Predicted Sales increase by 0.725 units for each one-unit increase in Advertising Expenditures. In dollar terms, each additional $1,000 of advertising is associated with about $7,250 more in predicted sales.

f) Predict sales for a region that has advertising expenditures of 3 units.

$\hat{y} = 0.681 + 0.725(3.0) = 2.856$ units

g) Determine the residual for Region A.

Residual = (Actual Sales) - (Predicted Sales) = 1.0 - [0.681 + (0.725)(1.0)] = 1.0 - 1.406 = -0.406

The fitted values and residuals for all regions follow.

    Region  Adv Exp  Sales    Fit    Residual
    A       1.00     1.000    1.406  -0.406
    B       2.00     1.000    2.130  -1.130
    C       1.00     2.000    1.406   0.594
    D       3.00     2.000    2.855  -0.855
    E       2.00     3.000    2.130   0.870
    F       4.00     3.000    3.580  -0.580
    G       3.00     4.000    2.855   1.145
    H       5.00     4.000    4.304  -0.304
    I       5.00     5.000    4.304   0.696
    J       6.00     5.000    5.029  -0.029

h) On the fitted line plot, specify the residual for Region G.

[Figure: Fitted Regression Line for Sales vs. Adv Exp; Sales = 0.6812 + 0.7246 Adv Exp; S = 0.829702, R-Sq = 72.5%, R-Sq(adj) = 69.0%; the residual for Region G is marked as the vertical distance from the point to the line.]
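The fitted values and residuals in parts (f) through (h) can be reproduced as follows. This is an illustrative Python sketch under the assumption that the ten data points are the ones listed in the comments (inferred from the tabulated fits and residuals); `predict` is a hypothetical helper name.

```python
regions = "ABCDEFGHIJ"
# Assumption: data inferred from the fits and residuals in the slide's table.
adv_exp = [1, 2, 1, 3, 2, 4, 3, 5, 5, 6]   # x
sales = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]     # y

n = len(adv_exp)
x_bar = sum(adv_exp) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in adv_exp)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(adv_exp, sales))
b1 = s_xy / s_xx            # slope
b0 = y_bar - b1 * x_bar     # intercept


def predict(x):
    """Fitted value on the least-squares line."""
    return b0 + b1 * x


# Residual = actual sales minus predicted sales.
residuals = [y - predict(x) for x, y in zip(adv_exp, sales)]

for r, x, y, e in zip(regions, adv_exp, sales, residuals):
    print(f"{r}  x={x}  y={y}  fit={predict(x):.3f}  residual={e:+.3f}")
print(f"prediction at x = 3: {predict(3):.3f}")
```

With unrounded coefficients the prediction at x = 3 comes out as 2.855 rather than 2.856; the difference is only the rounding of the coefficients in the hand calculation.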

Corresponding to the fitted model is the population model.

[Figure: the unknown population regression line $E(Y) = \beta_0 + \beta_1 x$, with the probability distribution of Y drawn at a fixed value of x and the errors shown as vertical deviations from the line.]

Properties of the population model
- At each value of x, there is a probability distribution of Y values.
- The means, E(Y), of these probability distributions lie on a straight line, where $\beta_0$ is the intercept and $\beta_1$ is the slope.
- The expression for E(Y) is $E(Y) = \beta_0 + \beta_1 x$, or equivalently $Y = \beta_0 + \beta_1 x + \varepsilon$, where $\varepsilon$ is the error, the difference between Y and E(Y).

Assumptions
1. The relation is in fact linear, so that the errors all have expected value zero: $E(\varepsilon_i) = 0$ for all i.
2. The errors all have the same variance: $\mathrm{Var}(\varepsilon_i) = \sigma_\varepsilon^2$ for all i.
3. The errors are independent of each other.

The fitted line or model is an estimate of the population model.

$\sigma_\varepsilon$ is also unknown and needs to be estimated. Since residuals estimate errors, use the variation in the residuals to estimate the variation in the errors.

There are two constraints on the residuals:

$\sum \text{residual}_i = 0$ and $\sum (x_i)(\text{residual}_i) = 0$

For the Sales vs. Advertising Expenditures example, these constraints are shown in the table below. Because of the two constraints, the residuals have (n - 2) degrees of freedom.

The variation in the residuals is:

$s_\varepsilon^2 = \left[\sum \text{residual}_i^2\right] / (n - 2)$

Use $s_\varepsilon^2$ to estimate $\sigma_\varepsilon^2$.

    Region  Sales  Adv Exp  Fit    Res     (AdvExp)(Res)
    A       1      1        1.406  -0.406  -0.406
    B       1      2        2.130  -1.130  -2.261
    C       2      1        1.406   0.594   0.594
    D       2      3        2.855  -0.855  -2.565
    E       3      2        2.130   0.870   1.739
    F       3      4        3.580  -0.580  -2.319
    G       4      3        2.855   1.145   3.435
    H       4      5        4.304  -0.304  -1.522
    I       5      5        4.304   0.696   3.478
    J       5      6        5.029  -0.029  -0.174
    Sum                             0.000   0.000

Terminology for $s_\varepsilon$:
- Sample standard deviation around the regression line
- Standard error of estimate
- Residual standard deviation
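The two constraints and the estimate of the error standard deviation can be verified numerically. The sketch below is an illustration in Python, assuming the same ten data points as in the example (the values are inferred from the slide's table, not quoted from it).

```python
import math

# Assumption: data inferred from the residual table in the example.
adv_exp = [1, 2, 1, 3, 2, 4, 3, 5, 5, 6]   # x
sales = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]     # y

n = len(adv_exp)
x_bar = sum(adv_exp) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in adv_exp)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(adv_exp, sales))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

residuals = [y - (b0 + b1 * x) for x, y in zip(adv_exp, sales)]

# The two least-squares constraints: both sums are zero up to rounding error.
sum_res = sum(residuals)
sum_x_res = sum(x * e for x, e in zip(adv_exp, residuals))

# Residual standard deviation with n - 2 degrees of freedom.
sse = sum(e * e for e in residuals)
s_eps = math.sqrt(sse / (n - 2))

print(f"sum of residuals     = {sum_res:.6f}")
print(f"sum of x * residuals = {sum_x_res:.6f}")
print(f"s_eps                = {s_eps:.4f}")
```

The divisor n - 2 reflects the two constraints: once eight residuals are known, the remaining two are determined.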

Example (Sales vs. Advertising Expenditures):
i) Identify the value of the sample standard deviation about the regression line from the Minitab output.

The relevant line of the output for Regression Analysis: Sales versus Adv Exp is:

    S = 0.829702   R-Sq = 72.5%   R-Sq(adj) = 69.0%

The value of the sample standard deviation about the regression line is s = 0.829702.

How is this useful? Like any other standard deviation, the residual standard deviation may be interpreted by the Empirical Rule. About 95% of the prediction errors will fall within +/- 2 standard deviations of the mean error; the mean error is always 0 in the least-squares regression model. Therefore, a residual standard deviation of 0.83 means that about 95% of the prediction errors will be within +/- 2(0.83) = +/- 1.66. [Hildebrand, Ott and Gray]

11.2 Estimating Model Parameters

A high-leverage point is one for which the x-value is, in some sense, far away from most of the x-values.

[Figure: scatterplot in which most of the x-values are clustered together and one point, a high leverage point, lies far to the right on the x axis.]

MINITAB flags a high leverage point with an X symbol. The determination is made by looking at the leverage, denoted by $h_i$, for each observation:

$h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}$

The numerator of the second term is the squared deviation of a particular x; the denominator is the variation in all x's relative to $\bar{x}$.

Some limits are built in: $1/n \le h_i \le 1$; $\sum h_i = 2$; $\bar{h} = 2/n$.

If $h_i > 6/n$, the observation is flagged with an X symbol. Why 6/n? Because $6/n = 3\bar{h} = 3(2/n)$.

Example (Sales vs. Advertising Expenditures):
j) Find the leverage for region J, where the point is (6, 5).

$h_J = \dfrac{1}{10} + \dfrac{(6 - 3.2)^2}{27.6} = 0.384$

Is this a high leverage point? No, since $h_{10} = 0.384 < 0.6$. A point with a large x-value is not necessarily a high leverage point.

The standardized residuals (SRES) and leverages (HI) for all 10 observations follow:

    Region  Sales  Adv Exp  SRES    HI
    A       1      1        -0.575  0.275
    B       1      2        -1.480  0.152
    C       2      1         0.841  0.275
    D       2      3        -1.087  0.101
    E       3      2         1.138  0.152
    F       3      4        -0.746  0.123
    G       4      3         1.456  0.101
    H       4      5        -0.415  0.217
    I       5      5         0.948  0.217
    J       5      6        -0.045  0.384

All leverages are less than 6/n = 0.6, so there are no high leverage points.
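The leverage formula depends only on the x-values, which makes it easy to compute directly. The following Python sketch is illustrative; the x-values are assumed to be those of the Sales vs. Advertising example, and `leverages` is a hypothetical helper name.

```python
regions = "ABCDEFGHIJ"
# Assumption: advertising expenditures inferred from the example's tables.
adv_exp = [1, 2, 1, 3, 2, 4, 3, 5, 5, 6]


def leverages(x):
    """h_i = 1/n + (x_i - x_bar)^2 / sum_j (x_j - x_bar)^2."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((v - x_bar) ** 2 for v in x)
    return [1 / n + (v - x_bar) ** 2 / s_xx for v in x]


h = leverages(adv_exp)
cutoff = 6 / len(adv_exp)   # the 6/n flagging rule described above

for r, x, hi in zip(regions, adv_exp, h):
    flag = "X" if hi > cutoff else ""
    print(f"{r}  x={x}  h={hi:.3f} {flag}")
print(f"sum of leverages = {sum(h):.3f}, cutoff = {cutoff:.2f}")
```

The leverages sum to 2, matching the built-in limit, and none exceeds the 0.6 cutoff for these data.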

The fitted line plot follows. The graph suggests there are no high leverage points.

[Figure: Fitted Regression Line for Sales vs. Adv Exp; Sales = 0.6812 + 0.7246 Adv Exp; S = 0.829702, R-Sq = 72.5%, R-Sq(adj) = 69.0%.]

Example (Sales vs. Advertising Expenditures): Region G has just sent an email stating that they reported incorrect values for sales and advertising expenditures. The bad news is that they actually spent $10,000 (10 units) on Advertising. The good news is that Sales were actually $80,000 (8 units). Is the revised data point for Region G a high leverage point?

$h_G = \dfrac{1}{10} + \dfrac{(10 - 3.9)^2}{68.9} = 0.64 > 0.60$

The point (10, 8) is a high leverage point. The regression output for the revised data follows.

    Regression Analysis: Sales vs Adv Exp
    [Revised Data: Change (3,4) to (10,8)]

    The regression equation is
    Sales = 0.491 + 0.746 Adv Exp

    Predictor    Coef  SE Coef     T      P
    Constant   0.4906   0.4032  1.22  0.258
    Adv Exp    0.7460   0.0858  8.70  0.000

    S = 0.711965   R-Sq = 90.4%   R-Sq(adj) = 89.2%

    Unusual Observations
    Obs  AdvExp  Sales  Fit    SE Fit  Residual  St Resid
    7    10.0    8.000  7.951  0.570   0.049     0.12 X

    X denotes an observation whose X value gives it large influence.

The standardized residuals (SRES) and leverages (HI) for all of the observations in the revised data follow.

    Region  Sales  Adv Exp  SRES    HI
    A       1      1        -0.377  0.222
    B       1      2        -1.499  0.152
    C       2      1         1.216  0.222
    D       2      3        -1.086  0.112
    E       3      2         1.552  0.152
    F       3      4        -0.703  0.100
    G       8      10        0.116  0.640
    H       4      5        -0.330  0.118
    I       5      5         1.165  0.118
    J       5      6         0.051  0.164

The fitted line plot reinforces that Region G is a high leverage point.

[Figure: Fitted Regression Line for Sales vs. Adv Exp (revised data); Sales = 0.4906 + 0.7460 Adv Exp; S = 0.711965, R-Sq = 90.4%, R-Sq(adj) = 89.2%; Region G lies far to the right of the other points.]

A high leverage point is not necessarily bad. A high leverage point has the potential to alter the fitted line. For the example, the potential was not realized: for the original data the slope was 0.725, and for the revised data the slope is 0.746.
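The claim that the high leverage point did not materially change the line can be checked by refitting. This is a hedged Python sketch: the data values are assumed from the example, Region G is moved from (3, 4) to (10, 8) as in the revision described above, and the helper names `fit` and `leverage` are illustrative.

```python
# Assumption: original data inferred from the example's tables.
x_orig = [1, 2, 1, 3, 2, 4, 3, 5, 5, 6]
y_orig = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
G = 6   # index of Region G (the 7th observation)


def fit(x, y):
    """Least-squares (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def leverage(x, i):
    """Leverage of observation i."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return 1 / n + (x[i] - x_bar) ** 2 / s_xx


# Revised data: Region G changes from (3, 4) to (10, 8).
x_rev, y_rev = x_orig.copy(), y_orig.copy()
x_rev[G], y_rev[G] = 10, 8

slope_orig = fit(x_orig, y_orig)[1]
slope_rev = fit(x_rev, y_rev)[1]
h_g = leverage(x_rev, G)

print(f"original slope = {slope_orig:.4f}")
print(f"revised slope  = {slope_rev:.4f}")
print(f"leverage of G  = {h_g:.3f} (cutoff {6 / len(x_rev):.2f})")
```

The leverage of G exceeds the 6/n cutoff, yet the slope moves only slightly, because the revised point falls close to the line implied by the other nine regions.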

Standardized Residuals

A residual is standardized as follows:

$SR_i = \dfrac{\text{residual}_i}{\text{std. dev. of residual}_i} = \dfrac{\text{residual}_i}{s_\varepsilon \sqrt{1 - h_i}}$

A Standardized Residual (SR) depends on both the residual and the leverage.

A requirement is that the errors in the population regression model be normally distributed. Since the residuals estimate the errors, the residuals should also be approximately normally distributed, and Standardized Residuals can be viewed as values of a standard normal random variable. A Standardized Residual is considered large if $|SR| > 2$.

Example (Sales vs. Advertising Expenditures): Region G has just sent another email stating that they reported incorrect values again for sales and advertising expenditures. The good news is that they only spent $3,000 (3 units) on Advertising. The better news is that Sales were actually $50,000 (5 units). Does this result in a y-outlier for Region G?

$SR_G = \dfrac{\text{residual}_G}{s_\varepsilon \sqrt{1 - h_G}} = \dfrac{2.043}{(1.043)\sqrt{1 - 0.101}} = 2.067 > 2.0$

Region G is now a y-outlier.
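The standardized residual for the revised Region G can be recomputed from the formula. The sketch below is illustrative Python under the assumption that the other nine data points are unchanged from the example and that Region G is now (3, 5); the function name `standardized_residuals` is hypothetical.

```python
import math

regions = "ABCDEFGHIJ"
# Assumption: example data with Region G revised from (3, 4) to (3, 5).
adv_exp = [1, 2, 1, 3, 2, 4, 3, 5, 5, 6]
sales = [1, 1, 2, 2, 3, 3, 5, 4, 5, 5]


def standardized_residuals(x, y):
    """SR_i = residual_i / (s_eps * sqrt(1 - h_i))."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    res = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    s_eps = math.sqrt(sum(e * e for e in res) / (n - 2))
    h = [1 / n + (a - x_bar) ** 2 / s_xx for a in x]
    return [e / (s_eps * math.sqrt(1 - hi)) for e, hi in zip(res, h)]


sr = standardized_residuals(adv_exp, sales)
large = [r for r, v in zip(regions, sr) if abs(v) > 2]   # the |SR| > 2 rule

for r, v in zip(regions, sr):
    print(f"{r}  SR = {v:+.3f}")
print("large standardized residuals:", large)
```

Only Region G crosses the |SR| > 2 threshold in this revision.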

The Minitab output for the revised data follows.

    Regression Analysis: Sales versus Adv Exp
    [Revised Data: Change (3,4) to (3,5)]

    The regression equation is
    Sales = 0.804 + 0.717 Adv Exp

    Predictor    Coef  SE Coef     T      P
    Constant   0.8043   0.7155  1.12   0.29
    Adv Exp    0.7174   0.1985  3.61  0.007

    S = 1.04257   R-Sq = 62.0%   R-Sq(adj) = 57.3%

    Unusual Observations
    Obs  AdvExp  Sales  Fit    SE Fit  Residual  St Resid
    7    3.00    5.000  2.957  0.332   2.043     2.07R

    R denotes an observation with a large standardized residual.

The fitted line plot reinforces that Region G is a y-outlier.

[Figure: Fitted Regression Line for Sales vs. Adv Exp (revised data); Sales = 0.8043 + 0.7174 Adv Exp; S = 1.04257, R-Sq = 62.0%, R-Sq(adj) = 57.3%; Region G is marked as a y-outlier well above the line.]

High Influence Points

A high leverage point that also corresponds to a y-outlier is a high influence point.

Example (Sales vs. Advertising Expenditures): Region G just can't get their act together. Region G has just sent another email stating that they reported incorrect values again for sales and advertising expenditures. The bad news is that they spent $10,000 (10 units) on Advertising. The really bad news is that Sales were still only $40,000 (4 units). Does this result in a high influence point for Region G?

From the Minitab output that follows, the answer is yes. The potential of the high leverage point to alter the regression line was realized.

    Regression Analysis: Sales versus Adv Exp
    [Revised Data: Change (3,4) to (10,4)]

    The regression equation is
    Sales = 1.47 + 0.392 Adv Exp

    Predictor    Coef  SE Coef     T      P
    Constant   1.4717   0.6145  2.39   0.04
    Adv Exp    0.3919   0.1307  3.00  0.017

    S = 1.08509   R-Sq = 52.9%   R-Sq(adj) = 47.0%

    Unusual Observations
    Obs  AdvExp  Sales  Fit    SE Fit  Residual  St Resid
    7    10.0    4.000  5.390  0.868   -1.390    -2.14RX

    R denotes an observation with a large standardized residual.
    X denotes an observation whose X value gives it large influence.

Section 11.3 Inferences about Regression Parameters

Model: $E(Y) = \beta_0 + \beta_1 x$

Model Parsimony: Is x useful in predicting Y?

Hypotheses: $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \ne 0$

The sampling distribution of $\hat{\beta}_1$ is needed to test $H_0: \beta_1 = 0$.

Additional assumption for the population model: the errors ($\varepsilon_i$) are normally distributed.

Sampling distribution of $\hat{\beta}_1$

The sampling distribution of $\hat{\beta}_1$ is the probability distribution of the different values of $\hat{\beta}_1$ that would be obtained with repeated sampling, when the values of the independent variable x are held constant for the repeated samples.

$\hat{\beta}_1$ is normally distributed with:

$E(\hat{\beta}_1) = \beta_1$ and $\mathrm{Var}(\hat{\beta}_1) = \dfrac{\sigma_\varepsilon^2}{\sum (x_i - \bar{x})^2}$

Substitute $s_\varepsilon$ for $\sigma_\varepsilon$ to estimate $\mathrm{Var}(\hat{\beta}_1)$.

The estimated standard error of $\hat{\beta}_1$ is:

$\dfrac{s_\varepsilon}{\sqrt{\sum (x_i - \bar{x})^2}}$

Minitab denotes this as SE Coef.

The distribution of

$\dfrac{\hat{\beta}_1 - \beta_1}{[\text{Estimated Standard Error of } \hat{\beta}_1]}$

is $t_{n-2}$. To test $H_0: \beta_1 = 0$, use

$t = \dfrac{\hat{\beta}_1 - 0}{[\text{Estimated Standard Error of } \hat{\beta}_1]}$

The rejection region depends on the research hypothesis:

    Research Hypothesis    Rejection Region
    Ha: beta_1 != 0        Reject H0 if t > t_{alpha/2, n-2} or t < -t_{alpha/2, n-2}
    Ha: beta_1 > 0         Reject H0 if t > t_{alpha, n-2}
    Ha: beta_1 < 0         Reject H0 if t < -t_{alpha, n-2}

Example (Sales vs. Advertising Expenditures): Test $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \ne 0$ at the 5% significance level by using the Minitab output.

    Regression Analysis: Sales versus Adv Exp

    Predictor    Coef  SE Coef     T      P
    Constant   0.6812   0.5694  1.20  0.266
    Adv Exp    0.7246   0.1579  4.59  0.002

    S = 0.829702   R-Sq = 72.5%   R-Sq(adj) = 69.0%

The test statistic is

$t = \dfrac{\hat{\beta}_1 - 0}{\text{SE Coef}} = \dfrac{0.7246 - 0}{0.1579} = 4.59$

Since $t = 4.59 > t_{.025,8} = 2.306$, reject $H_0: \beta_1 = 0$. Equivalently, since the p-value = .002 is less than .05, reject $H_0: \beta_1 = 0$.

Advertising Expenditures is a significant predictor of Sales.

The F statistic can also be used to test $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \ne 0$. The F test is equivalent to the t-test in simple regression. The F test is presented in the slides for Section 11.5.

Section 11.4 Predicting New Y Values Using Regression
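The t statistic for the slope can be rebuilt from the raw data rather than read off the output. The following Python sketch is an illustration only: the data values are assumed from the example, and the critical value 2.306 is hard-coded from the t table figure quoted above rather than computed.

```python
import math

# Assumption: data inferred from the example's tables.
adv_exp = [1, 2, 1, 3, 2, 4, 3, 5, 5, 6]
sales = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

n = len(adv_exp)
x_bar = sum(adv_exp) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in adv_exp)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(adv_exp, sales))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual standard deviation (n - 2 degrees of freedom).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(adv_exp, sales))
s_eps = math.sqrt(sse / (n - 2))

# Estimated standard error of the slope ("SE Coef") and the t statistic.
se_b1 = s_eps / math.sqrt(s_xx)
t_stat = (b1 - 0) / se_b1

T_CRIT = 2.306   # t_{.025, 8}, taken from a t table
reject_h0 = abs(t_stat) > T_CRIT

print(f"SE Coef = {se_b1:.4f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Because the alternative is two-sided, the comparison uses the absolute value of t against the upper 2.5% point.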

Scenario 1: Find a confidence interval for E(Y) at $x_{n+1}$. A new value of x, denoted by $x_{n+1}$, is specified.

[Figure: regression line with the mean E(Y) marked at $x_{n+1}$; vertical axis Y, horizontal axis X.]

Point Estimate: $\hat{y}_{n+1} = \hat{\beta}_0 + \hat{\beta}_1 x_{n+1}$

Interval Estimate:

$\hat{y}_{n+1} \pm t_{\alpha/2,\,n-2}\; s_\varepsilon \sqrt{\dfrac{1}{n} + \dfrac{(x_{n+1} - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$

- The further that $x_{n+1}$ is from $\bar{x}$, the wider the interval.
- The larger the range of x-values, the narrower the interval.
- The larger the number of data points, the narrower the interval.

Example (Sales vs. Advertising Expenditures): With 95% confidence, within what limits is the average value for sales when advertising expenditures are $4,000?

Point Estimate: $\hat{Y} = 0.681 + 0.725(4) = 3.58$; $s_\varepsilon = 0.830$; $\sum (x_i - \bar{x})^2 = 27.6$

Interval Estimate:

$3.58 \pm (2.306)(0.830)\sqrt{\dfrac{1}{10} + \dfrac{(4 - 3.2)^2}{27.6}}$ or [2.91, 4.25]

This is a confidence interval estimate for the population average of sales for regions with advertising expenditures of $4,000.

Scenario 2: Find a prediction interval for $Y_{n+1}$ at $x_{n+1}$, that is, predict a specific value of Y at a given value of x.

[Figure: regression line with both the individual value $y_{n+1}$ and the mean E(Y) marked at $x_{n+1}$; vertical axis Y, horizontal axis X.]

Point Estimate: $\hat{y}_{n+1} = \hat{\beta}_0 + \hat{\beta}_1 x_{n+1}$

Interval Estimate:

$\hat{y}_{n+1} \pm t_{\alpha/2,\,n-2}\; s_\varepsilon \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_{n+1} - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$

A prediction interval for $Y_{n+1}$ is wider than a confidence interval for $E(Y_{n+1})$.

Example (Sales vs. Advertising Expenditures): Suppose a new region is to be allowed advertising expenditures of $4,000. What sales revenue can be anticipated? Obtain a 95% prediction interval.

Point Estimate: $\hat{y} = 3.58$; $s_\varepsilon = 0.830$; $\sum (x_i - \bar{x})^2 = 27.6$

Interval Estimate:

$3.58 \pm (2.306)(0.830)\sqrt{1 + \dfrac{1}{10} + \dfrac{(4 - 3.2)^2}{27.6}}$ or [1.55, 5.61]

This is a 95% prediction interval for the sales of an individual region when advertising expenditures are $4,000.
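Both interval formulas differ only by the extra "1 +" under the square root, which a small function makes explicit. This is a hedged Python sketch: the data are assumed from the example, the critical value 2.306 is hard-coded from the t table, and `interval` is an illustrative name.

```python
import math

# Assumption: data inferred from the example's tables.
adv_exp = [1, 2, 1, 3, 2, 4, 3, 5, 5, 6]
sales = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

n = len(adv_exp)
x_bar = sum(adv_exp) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in adv_exp)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(adv_exp, sales))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(adv_exp, sales))
s_eps = math.sqrt(sse / (n - 2))

T_CRIT = 2.306   # t_{.025, 8}, taken from a t table


def interval(x_new, prediction=False):
    """95% interval at x_new: for E(Y) by default, for a new Y if prediction=True."""
    extra = 1.0 if prediction else 0.0
    half_width = T_CRIT * s_eps * math.sqrt(
        extra + 1 / n + (x_new - x_bar) ** 2 / s_xx
    )
    fit = b0 + b1 * x_new
    return fit - half_width, fit + half_width


conf_int = interval(4)
pred_int = interval(4, prediction=True)
print(f"95% CI for E(Y) at x = 4: ({conf_int[0]:.3f}, {conf_int[1]:.3f})")
print(f"95% PI for Y at x = 4:    ({pred_int[0]:.3f}, {pred_int[1]:.3f})")
```

The prediction interval is always the wider of the two, since it adds the variability of an individual observation to the uncertainty in the estimated mean.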

The Minitab output follows for the confidence and prediction intervals:

    Values of Predictors for New Observations
    New Obs  Adv Exp
    1        4.00

    Predicted Values for New Observations
    New Obs  Fit    SE Fit  95% CI          95% PI
    1        3.580  0.291   (2.908, 4.251)  (1.552, 5.607)

The confidence and prediction intervals are shown in the following graph:

[Figure: Fitted Regression Line for Sales vs. Adv Exp; Sales = 0.6812 + 0.7246 Adv Exp; regression line with 95% CI and 95% PI bands; S = 0.829702, R-Sq = 72.5%, R-Sq(adj) = 69.0%.]

Section 11.5 Correlation

Coefficient of Determination ($r^2_{yx}$ or $R^2$)

The coefficient of determination is based on explained and unexplained deviation:

$(y_i - \bar{y}) = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)$

Total deviation = Explained deviation + Unexplained deviation

Of the total deviation, how much is explained by fitting the regression line and how much is left over?

Example (Sales vs. Advertising Expenditures): For region I (observation 9), x = 5 and y = 5.

$\hat{y}_9 = 0.6812 + 0.7246(5) = 4.304$

Total deviation = $(y_9 - \bar{y}) = (5 - 3) = 2$
Explained deviation = $(\hat{y}_9 - \bar{y}) = (4.304 - 3) = 1.304$
Unexplained deviation = $(y_9 - \hat{y}_9) = (5 - 4.304) = 0.696$

This is shown on the fitted line plot:

[Figure: Fitted Regression Line for Sales vs. Adv Exp; Sales = 0.6812 + 0.7246 Adv Exp; S = 0.829702, R-Sq = 72.5%, R-Sq(adj) = 69.0%; for Region I the total, explained and unexplained deviations are marked relative to the horizontal line at $\bar{y}$.]

Square both sides to account for negative deviations:

$(y_i - \bar{y})^2 = (\hat{y}_i - \bar{y})^2 + (y_i - \hat{y}_i)^2 + [\text{cross-product term}]$

Sum over all observations. For the least-squares line the cross-product terms sum to zero, leaving:

$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$

Total Sum of Squares (SST) = Sum of Squares due to Regression (SSR) + Sum of Squares due to Error (SSE)

Each Sum of Squares has degrees of freedom (df) associated with it:

SST = SSR + SSE
(n - 1) = 1 + (n - 2)

Mean Square = Sum of Squares / df

MSR = SSR/df = SSR/1 = SSR
MSE = SSE/df = SSE/(n - 2)

This leads to another test statistic, the F statistic, for testing $H_0: \beta_1 = 0$:

F = MSR/MSE

Rationale: If SSR is large relative to SSE, this indicates that the independent variable x has real predictive value.

The F test is one-tailed. Rejection Region: Reject $H_0: \beta_1 = 0$ if F > $F_{\alpha,1,n-2}$. Or, reject $H_0$ if p-value < $\alpha$.

Example: Sales and Advertising Expenditures. Locate the value of the F statistic and the associated p-value, and state your conclusion.

    Regression Analysis: Sales versus Adv Exp

    The regression equation is
    Sales = 0.681 + 0.725 Adv Exp

    Predictor    Coef  SE Coef     T      P
    Constant   0.6812   0.5694  1.20  0.266
    Adv Exp    0.7246   0.1579  4.59  0.002

    S = 0.829702   R-Sq = 72.5%   R-Sq(adj) = 69.0%

    Analysis of Variance
    Source          DF      SS      MS      F      P
    Regression       1  14.493  14.493  21.05  0.002
    Residual Error   8   5.507   0.688
    Total            9  20.000

F = MSR/MSE = 14.493/0.688 = 21.05 (computed from the unrounded mean squares)

Since 21.05 > $F_{.05,1,8}$ = 5.32, reject $H_0: \beta_1 = 0$. Percentage points of the F distribution are in Table 6. Equivalently, since the p-value = .002, reject $H_0: \beta_1 = 0$.
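The analysis of variance table can be reproduced directly from the decomposition SST = SSR + SSE. The following Python sketch is illustrative and assumes the example's data values as listed in the comments.

```python
# Assumption: data inferred from the example's tables.
adv_exp = [1, 2, 1, 3, 2, 4, 3, 5, 5, 6]
sales = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

n = len(adv_exp)
x_bar = sum(adv_exp) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in adv_exp)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(adv_exp, sales))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fits = [b0 + b1 * x for x in adv_exp]

# Sums of squares: total, regression (explained), error (unexplained).
sst = sum((y - y_bar) ** 2 for y in sales)
ssr = sum((f - y_bar) ** 2 for f in fits)
sse = sum((y - f) ** 2 for y, f in zip(sales, fits))

# Mean squares and the F statistic with 1 and n - 2 degrees of freedom.
msr = ssr / 1
mse = sse / (n - 2)
f_stat = msr / mse
r_squared = ssr / sst

print(f"SST = {sst:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"MSR = {msr:.3f}, MSE = {mse:.3f}, F = {f_stat:.2f}")
print(f"R-Sq = {100 * r_squared:.1f}%")
```

The same sums of squares also give the coefficient of determination, since R-Sq is SSR divided by SST.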

In simple regression, $t^2 = F$.
Example: Sales and Advertising Expenditures: $t^2 = (4.59)^2 = 21.07$, which equals F = 21.05 up to rounding of t.

In simple regression, the p-values for the F test and t test are equal.
Example: Sales and Advertising Expenditures: p-value for F test = .002 = p-value for t-test.

The Coefficient of Determination, denoted by $r^2_{yx}$ or $R^2$, is

$R^2 = \dfrac{SSR}{SST}$

$R^2$ specifies how much of the total variation is explained by the fitted line.

Example (Sales vs. Advertising Expenditures): Use the Minitab output to find the value of the coefficient of determination and interpret the value in the context of this problem.

    Regression Analysis: Sales versus Adv Exp

    The regression equation is
    Sales = 0.681 + 0.725 Adv Exp

    S = 0.829702   R-Sq = 72.5%   R-Sq(adj) = 69.0%

From the output, $R^2$ = 72.5%.

Interpretation: 72.5% of the variation in Sales is explained by the regression model with Advertising Expenditures as the predictor.

The coefficient of determination, $R^2$, ranges from 0 to 1.

The coefficient of correlation, denoted by r, is obtained from $R^2$:

$r = \pm\sqrt{R^2}$

where the sign (+, -) of r is the same as the sign of $\hat{\beta}_1$.

r takes on values from -1 to +1. Correlation measures the strength of the linear relationship between x and Y. The coefficient of determination ($R^2$) and the coefficient of correlation (r) have very different interpretations.

In correlation, both variables are on an equal footing. It does not matter which is labeled x and which is labeled Y. The objective is to measure the association between x and Y. This is in contrast to regression analysis, where the objective is to use x to predict Y.

Example (Sales vs. Advertising Expenditures):

$r = +\sqrt{R^2} = +\sqrt{0.725} = +0.85$

r has a positive sign because the slope $\hat{\beta}_1$ is positive: $\hat{\beta}_1 = +0.725$.

Warning: Correlation does not imply causation.

r denotes the sample correlation coefficient; r estimates the population correlation ($\rho$). Hypothesis testing on $\rho$ requires certain assumptions.
- In regression analysis, the values of x are predetermined constants.
- In correlation analysis, the values of x are randomly selected.
- In correlation analysis, the x values also come from a normal distribution. More precisely, the random sample of (x, y) values is drawn from a bivariate normal distribution.

To test $H_0: \rho = 0$, the test statistic is

$t = r\,\dfrac{\sqrt{n - 2}}{\sqrt{1 - r^2}}$

where t has (n - 2) d.f.

The rejection region depends on the form of $H_a$:
- If $H_a: \rho > 0$, reject $H_0$ if $t > t_{\alpha,n-2}$
- If $H_a: \rho < 0$, reject $H_0$ if $t < -t_{\alpha,n-2}$
- If $H_a: \rho \ne 0$, reject $H_0$ if $|t| > t_{\alpha/2,n-2}$
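For the Sales vs. Advertising example, the sample correlation and its t statistic can be computed from the same sums used for the regression. This is a minimal Python sketch under the assumption that the data are the ten points listed in the comments; it is an illustration, not output from the original slides.

```python
import math

# Assumption: data inferred from the example's tables.
adv_exp = [1, 2, 1, 3, 2, 4, 3, 5, 5, 6]
sales = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

n = len(adv_exp)
x_bar = sum(adv_exp) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in adv_exp)
s_yy = sum((y - y_bar) ** 2 for y in sales)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(adv_exp, sales))

# Sample correlation coefficient; its sign follows the sign of s_xy (and the slope).
r = s_xy / math.sqrt(s_xx * s_yy)

# Test statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"r = {r:.3f}, r^2 = {r * r:.3f}, t = {t_stat:.2f}")
```

The t statistic obtained this way matches the t statistic for the slope in the regression output, which is why the two tests lead to the same conclusion in simple regression.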

Exercise .7: A survey of recent M.B.A. graduates of a business school obtained data on first-year salary and years of prior work experience. [The data are in Exercise .7. Assume that the students were randomly selected.] The Minitab output follows:

    Pearson correlation of SALARY and EXPER = 0.70
    P-Value = 0.000

$t = r\,\dfrac{\sqrt{n - 2}}{\sqrt{1 - r^2}} = 6.9$

For such a large t-value, the p-value is 0.000. Conclusion: Reject $H_0: \rho = 0$.

Keywords: Chapter 11
- Regression analysis
- Independent variable
- Predictor variable
- Dependent variable
- Response variable
- Scatterplot
- Simple regression
- Slope
- y-intercept
- Least-squares method
- High leverage points
- Y-outliers
- High influence points
- Standard error of the estimate
- t-test on slope
- F-test on slope
- Coefficient of determination
- Coefficient of correlation

Summary of Chapter 11
- Understanding the role of the scatterplot
- Understanding the rationale of the least squares method for finding the best-fitting line
- Understanding the impact of high leverage points, y-outliers and high influence points on the fitted line
- Understanding variability around the regression line
- Testing the slope coefficient using the t-test
- Testing the slope coefficient using the F-test
- Understanding the difference between the correlation coefficient and the coefficient of determination