Statistics for Economics & Business

Simple Linear Regression

Learning Objectives
In this chapter, you learn:
- How to use regression analysis to predict the value of a dependent variable based on an independent variable
- The meaning of the regression coefficients \(b_0\) and \(b_1\)
- How to evaluate the assumptions of regression analysis and know what to do if the assumptions are violated
- To make inferences about the slope and correlation coefficient
- To estimate mean values and predict individual values

Correlation vs. Regression
- A scatter plot can be used to show the relationship between two variables
- Correlation analysis is used to measure the strength of the association (linear relationship) between two variables
- Correlation is only concerned with the strength of the relationship
- No causal effect is implied with correlation
- Scatter plots were first presented in Ch. 2
- Correlation was first presented in Ch. 3

Introduction to Regression Analysis
Regression analysis is used to:
- Predict the value of a dependent variable based on the value of at least one independent variable
- Explain the impact of changes in an independent variable on the dependent variable

Dependent variable: the variable we wish to predict or explain
Independent variable: the variable used to predict or explain the dependent variable

Simple Linear Regression Model
- Only one independent variable, X
- The relationship between X and Y is described by a linear function
- Changes in Y are assumed to be related to changes in X

Types of Relationships
- Linear relationships
- Curvilinear relationships

Types of Relationships (continued)
- Strong relationships
- Weak relationships

Types of Relationships (continued)
- No relationship

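Simple Linear Regression Model

\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]

where:
- \(Y_i\) = dependent variable
- \(\beta_0\) = population Y intercept
- \(\beta_1\) = population slope coefficient
- \(X_i\) = independent variable
- \(\varepsilon_i\) = random error term

\(\beta_0 + \beta_1 X_i\) is the linear component and \(\varepsilon_i\) is the random error component.

Simple Linear Regression Model (continued)
[Figure: the line \(Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i\) plotted against X, with intercept \(\beta_0\) and slope \(\beta_1\). For a given \(X_i\), the observed value of Y differs from the predicted value of Y on the line by \(\varepsilon_i\), the random error for this \(X_i\) value.]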

Simple Linear Regression Equation (Prediction Line)
The simple linear regression equation provides an estimate of the population regression line:

\[ \hat{Y}_i = b_0 + b_1 X_i \]

where:
- \(\hat{Y}_i\) = estimated (or predicted) Y value for observation i
- \(b_0\) = estimate of the regression intercept
- \(b_1\) = estimate of the regression slope
- \(X_i\) = value of X for observation i

The Least Squares Method
\(b_0\) and \(b_1\) are obtained by finding the values that minimize the sum of the squared differences between \(Y_i\) and \(\hat{Y}_i\):

\[ \min \sum (Y_i - \hat{Y}_i)^2 = \min \sum \bigl(Y_i - (b_0 + b_1 X_i)\bigr)^2 \]
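For readers who want to see where the minimizing values come from, the standard closed-form solution can be sketched as follows. The slides do not show this derivation; SSXY is a name assumed here for the sum of cross-products, chosen to parallel the SSX used later in the chapter.

```latex
% Set both partial derivatives of the sum of squares to zero
\frac{\partial}{\partial b_0} \sum \bigl(Y_i - b_0 - b_1 X_i\bigr)^2 = 0,
\qquad
\frac{\partial}{\partial b_1} \sum \bigl(Y_i - b_0 - b_1 X_i\bigr)^2 = 0

% Solving the two equations gives
b_1 = \frac{SSXY}{SSX}
    = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2},
\qquad
b_0 = \bar{Y} - b_1 \bar{X}
```

The second formula shows that the fitted line always passes through the point \((\bar{X}, \bar{Y})\).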

Finding the Least Squares Equation
- The coefficients \(b_0\) and \(b_1\), and other regression results in this chapter, will be found using Excel or Minitab
- Formulas are shown in the text for those who are interested

Interpretation of the Slope and the Intercept
- \(b_0\) is the estimated mean value of Y when the value of X is zero
- \(b_1\) is the estimated change in the mean value of Y as a result of a one-unit increase in X

Simple Linear Regression Example
- A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet)
- A random sample of 10 houses is selected
- Dependent variable (Y) = house price in $1000s
- Independent variable (X) = square feet

Simple Linear Regression Example: Data

House Price in $1000s (Y)    Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700

Simple Linear Regression Example: Scatter Plot
[Figure: house price model scatter plot, House Price ($1000s) against Square Feet]

Simple Linear Regression Example: Output

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

The regression equation is: house price = 98.24833 + 0.10977 (square feet)

ANOVA
              df    SS            MS            F          Significance F
Regression    1     18934.9348    18934.9348    11.0848    0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000

               Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept      98.24833        58.03348          1.69296    0.12892    -35.57720    232.07386
Square Feet    0.10977         0.03297           3.32938    0.01039    0.03374      0.18580
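The slides obtain the coefficients from Excel or Minitab. As a cross-check, the following minimal sketch computes them directly from the ten data pairs using the usual least-squares formulas. The function and variable names are illustrative, not part of the original material.

```python
# House data as read from the example's data table
# (price in $1000s, size in square feet).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]


def least_squares(x, y):
    """Return (b0, b1) for the line y = b0 + b1 * x that minimizes
    the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squares of X and sum of cross-products of X and Y.
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(sqft, prices)
print(f"house price = {b0:.5f} + {b1:.5f} (square feet)")
```

The printed coefficients should agree with the Intercept and Square Feet rows of the output to the precision shown there.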

Simple Linear Regression Example: Graphical Representation
[Figure: house price model scatter plot and prediction line, House Price ($1000s) against Square Feet, annotated with intercept = 98.248 and slope = 0.10977]

house price = 98.24833 + 0.10977 (square feet)

Simple Linear Regression Example: Interpretation of \(b_0\)
house price = 98.24833 + 0.10977 (square feet)
- \(b_0\) is the estimated mean value of Y when the value of X is zero (if X = 0 is in the range of observed X values)
- Because a house cannot have a square footage of 0, \(b_0\) has no practical application

Simple Linear Regression Example: Interpreting \(b_1\)
house price = 98.24833 + 0.10977 (square feet)
- \(b_1\) estimates the change in the mean value of Y as a result of a one-unit increase in X
- Here, \(b_1 = 0.10977\) tells us that the mean value of a house increases by 0.10977($1000) = $109.77, on average, for each additional one square foot of size

Simple Linear Regression Example: Making Predictions
Predict the price for a house with 2000 square feet:

house price = 98.24833 + 0.10977 (sq.ft.)
            = 98.24833 + 0.10977(2000)
            = 317.78

The predicted price for a house with 2000 square feet is 317.78($1,000s) = $317,780
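The prediction step can be sketched as a small function. The coefficients are the rounded values from the regression output, and the range check (1100 to 2450 square feet, the smallest and largest houses in the sample) is an addition made here to reflect the advice about the relevant range. `predict_price` is a hypothetical name.

```python
# Coefficients as reported in the example's regression output (rounded).
B0 = 98.24833   # intercept, in $1000s
B1 = 0.10977    # slope, in $1000s per square foot

# Smallest and largest square footage in the sample (the relevant range).
X_MIN, X_MAX = 1100, 2450


def predict_price(square_feet):
    """Predicted price in $1000s; refuses to extrapolate outside the sample range."""
    if not X_MIN <= square_feet <= X_MAX:
        raise ValueError(f"{square_feet} sq ft is outside the relevant range")
    return B0 + B1 * square_feet


price = predict_price(2000)
# With these rounded coefficients the result is about 317.79; the slides'
# 317.78 is what the unrounded coefficients give.
print(f"predicted price: {price:.2f} ($1000s)")
```

Raising an error for out-of-range inputs is one possible design. A warning would serve equally well if extrapolated values are sometimes wanted.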

Simple Linear Regression Example: Making Predictions
- When using a regression model for prediction, only predict within the relevant range of data
- Do not try to extrapolate beyond the range of observed X's
[Figure: house price scatter plot with prediction line, House Price ($1000s) against Square Feet, with the relevant range for interpolation marked]

Measures of Variation
Total variation is made up of two parts:

\[ SST = SSR + SSE \]

- Total sum of squares: \( SST = \sum (Y_i - \bar{Y})^2 \)
- Regression sum of squares: \( SSR = \sum (\hat{Y}_i - \bar{Y})^2 \)
- Error sum of squares: \( SSE = \sum (Y_i - \hat{Y}_i)^2 \)

where:
- \(\bar{Y}\) = mean value of the dependent variable
- \(Y_i\) = observed value of the dependent variable
- \(\hat{Y}_i\) = predicted value of Y for the given \(X_i\) value

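Measures of Variation (continued)
- SST = total sum of squares (Total Variation): measures the variation of the \(Y_i\) values around their mean \(\bar{Y}\)
- SSR = regression sum of squares (Explained Variation): variation attributable to the relationship between X and Y
- SSE = error sum of squares (Unexplained Variation): variation in Y attributable to factors other than X

Measures of Variation (continued)
[Figure: at a given \(X_i\), the distance from \(Y_i\) to \(\bar{Y}\) is split by the regression line, illustrating \(SST = \sum (Y_i - \bar{Y})^2\), \(SSE = \sum (Y_i - \hat{Y}_i)^2\), and \(SSR = \sum (\hat{Y}_i - \bar{Y})^2\)]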

Coefficient of Determination, \(r^2\)
- The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
- The coefficient of determination is also called r-squared and is denoted as \(r^2\)

\[ r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}} \]

note: \( 0 \le r^2 \le 1 \)

Examples of Approximate \(r^2\) Values
\(r^2 = 1\): perfect linear relationship between X and Y; 100% of the variation in Y is explained by variation in X

Examples of Approximate \(r^2\) Values
\(0 < r^2 < 1\): weaker linear relationships between X and Y; some but not all of the variation in Y is explained by variation in X

Examples of Approximate \(r^2\) Values
\(r^2 = 0\): no linear relationship between X and Y; the value of Y does not depend on X (none of the variation in Y is explained by variation in X)

Simple Linear Regression Example: Coefficient of Determination, \(r^2\)

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
              df    SS            MS            F          Significance F
Regression    1     18934.9348    18934.9348    11.0848    0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000

\[ r^2 = \frac{SSR}{SST} = \frac{18934.9348}{32600.5000} = 0.58082 \]

58.08% of the variation in house prices is explained by variation in square feet

Standard Error of Estimate
The standard deviation of the variation of observations around the regression line is estimated by

\[ S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}} \]

where:
- SSE = error sum of squares
- n = sample size
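The sums of squares, \(r^2\), and the standard error of the estimate can all be recomputed from the raw data. This is a minimal sketch under that assumption; the variable names are illustrative.

```python
import math

# House data from the example (price in $1000s, size in square feet).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(prices)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n

# Least-squares slope and intercept.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices))
      / sum((x - x_bar) ** 2 for x in sqft))
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in sqft]

sst = sum((y - y_bar) ** 2 for y in prices)               # total variation
ssr = sum((f - y_bar) ** 2 for f in fitted)               # explained variation
sse = sum((y - f) ** 2 for y, f in zip(prices, fitted))   # unexplained variation

r2 = ssr / sst                    # coefficient of determination
s_yx = math.sqrt(sse / (n - 2))   # standard error of the estimate

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"r^2 = {r2:.5f}, S_YX = {s_yx:.5f}")
```

Because SST = SSR + SSE, computing any two of the three sums is enough. All three are computed here so the identity can be checked.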

Simple Linear Regression Example: Standard Error of Estimate

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

\[ S_{YX} = 41.33032 \]

Comparing Standard Errors
- \(S_{YX}\) is a measure of the variation of observed Y values from the regression line
- The magnitude of \(S_{YX}\) should always be judged relative to the size of the Y values in the sample data
- i.e., \(S_{YX}\) = $41.33K is moderately small relative to house prices in the $200K - $400K range
[Figure: two scatter plots around a regression line, one with small \(S_{YX}\) and one with large \(S_{YX}\)]

Assumptions of Regression: L.I.N.E.
- Linearity: the relationship between X and Y is linear
- Independence of Errors: error values are statistically independent
- Normality of Error: error values are normally distributed for any given value of X
- Equal Variance (also called homoscedasticity): the probability distribution of the errors has constant variance

Residual Analysis

\[ e_i = Y_i - \hat{Y}_i \]

- The residual for observation i, \(e_i\), is the difference between its observed and predicted value
- Check the assumptions of regression by examining the residuals:
  - Examine for linearity assumption
  - Evaluate independence assumption
  - Evaluate normal distribution assumption
  - Examine for constant variance for all levels of X (homoscedasticity)
- Graphical analysis of residuals: can plot residuals vs. X

Residual Analysis for Linearity
[Figure: plots of Y vs. x and of residuals vs. x for two cases, labeled "Not Linear" and "Linear"]

Residual Analysis for Independence
[Figure: residual plots labeled "Not Independent" and "Independent"]

Checking for Normality
- Examine the Stem-and-Leaf Display of the Residuals
- Examine the Boxplot of the Residuals
- Examine the Histogram of the Residuals
- Construct a Normal Probability Plot of the Residuals

Residual Analysis for Normality
When using a normal probability plot, normal errors will approximately display in a straight line
[Figure: normal probability plot, Percent against Residual]

Residual Analysis for Equal Variance
[Figure: plots of Y vs. x and of residuals vs. x for two cases, labeled "Non-constant variance" and "Constant variance"]

Simple Linear Regression Example: Residual Output

RESIDUAL OUTPUT
Observation    Predicted House Price    Residuals
1              251.92316                -6.923162
2              273.87671                38.12329
3              284.85348                -5.853484
4              304.06284                3.937162
5              218.99284                -19.99284
6              268.38832                -49.38832
7              356.20251                48.79749
8              367.17929                -43.17929
9              254.6674                 64.33264
10             284.85348                -29.85348

[Figure: house price model residual plot, Residuals against Square Feet]

Does not appear to violate any regression assumptions
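The residual output can be reproduced from the data with a few lines. This is a sketch with illustrative names, and it only tabulates the residuals; judging the plots is still a visual step.

```python
# House data from the example (price in $1000s, size in square feet).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(prices)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices))
      / sum((x - x_bar) ** 2 for x in sqft))
b0 = y_bar - b1 * x_bar

# Residual e_i = observed Y_i minus predicted Y_i.
fitted = [b0 + b1 * x for x in sqft]
residuals = [y - f for y, f in zip(prices, fitted)]

print("obs  predicted   residual")
for i, (f, e) in enumerate(zip(fitted, residuals), start=1):
    print(f"{i:3d}  {f:9.5f}  {e:9.5f}")

# Least-squares residuals sum to zero (up to floating-point error).
print(f"sum of residuals = {sum(residuals):.10f}")
```

Plotting `residuals` against `sqft` with any charting tool gives the residual plot described above.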

Simple Linear Regression Example: Residual Output
[Figure: residual plots for House Price (Y) in four panels: Normal Probability Plot (Percent vs. Residual), Versus Fits (Residual vs. Fitted Value), Histogram (Frequency vs. Residual), and Versus Order (Residual vs. Observation Order)]

Measuring Autocorrelation: The Durbin-Watson Statistic
- Used when data are collected over time to detect if autocorrelation is present
- Autocorrelation exists if residuals in one time period are related to residuals in another period

Autocorrelation
- Autocorrelation is correlation of the errors (residuals) over time
- Here, residuals show a cyclic pattern (not random). Cyclical patterns are a sign of positive autocorrelation
- Violates the regression assumption that residuals are random and independent
[Figure: Time (t) Residual Plot, Residuals against Time (t)]

The Durbin-Watson Statistic
The Durbin-Watson statistic is used to test for autocorrelation
\(H_0\): residuals are not correlated
\(H_1\): positive autocorrelation is present

\[ D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} \]

- The possible range is \(0 \le D \le 4\)
- D should be close to 2 if \(H_0\) is true
- D less than 2 may signal positive autocorrelation; D greater than 2 may signal negative autocorrelation

Testing for Positive Autocorrelation
\(H_0\): positive autocorrelation does not exist
\(H_1\): positive autocorrelation is present
- Calculate the Durbin-Watson test statistic = D (the Durbin-Watson statistic can be found using Excel or Minitab)
- Find the values \(d_L\) and \(d_U\) from the Durbin-Watson table (for sample size n and number of independent variables k)
- Decision rule: reject \(H_0\) if \(D < d_L\); the test is inconclusive if D lies between \(d_L\) and \(d_U\); do not reject \(H_0\) if \(D > d_U\)

Testing for Positive Autocorrelation (continued)
Suppose we have the following time series data:
[Figure: Sales against Time, with fitted trend line y = 30.65 + 4.7038x and \(R^2\) = 0.8976]
Is there autocorrelation?

Testing for Positive Autocorrelation (continued)
Example with n = 25:

Excel/PHStat output:
Durbin-Watson Calculations
Sum of Squared Difference of Residuals    3296.18
Sum of Squared Residuals                  3279.98
Durbin-Watson Statistic                   1.00494

\[ D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} = \frac{3296.18}{3279.98} = 1.00494 \]

Testing for Positive Autocorrelation (continued)
- Here, n = 25 and there is k = 1 independent variable
- Using the Durbin-Watson table, \(d_L = 1.29\) and \(d_U = 1.45\)
- \(D = 1.00494 < d_L = 1.29\), so reject \(H_0\) and conclude that significant positive autocorrelation exists
- Decision: reject \(H_0\) since \(D = 1.00494 < d_L\)
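The statistic and the three-way decision rule can be sketched in a few lines. The function names are hypothetical; the sums 3296.18 and 3279.98 and the table bounds 1.29 and 1.45 are the ones quoted in the example, since the underlying sales series is not given.

```python
def durbin_watson(residuals):
    """D = sum of squared successive differences / sum of squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def positive_autocorrelation_decision(d, d_lower, d_upper):
    """Apply the decision rule for H1: positive autocorrelation is present."""
    if d < d_lower:
        return "reject H0"
    if d > d_upper:
        return "do not reject H0"
    return "inconclusive"


# Example from the slides: only the two sums are reported, so D is
# formed from them directly rather than from the residuals.
d_example = 3296.18 / 3279.98
print(round(d_example, 5),
      positive_autocorrelation_decision(d_example, 1.29, 1.45))

# A perfectly alternating residual series pushes D toward its upper range,
# which points to negative rather than positive autocorrelation.
print(durbin_watson([1, -1, 1, -1]))
```

The bounds \(d_L\) and \(d_U\) still have to be looked up in a Durbin-Watson table for the given n and k; the sketch does not compute them.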

Inferences About the Slope
The standard error of the regression slope coefficient (\(b_1\)) is estimated by

\[ S_{b_1} = \frac{S_{YX}}{\sqrt{SSX}} = \frac{S_{YX}}{\sqrt{\sum (X_i - \bar{X})^2}} \]

where:
- \(S_{b_1}\) = estimate of the standard error of the slope
- \(S_{YX} = \sqrt{\frac{SSE}{n-2}}\) = standard error of the estimate

Inferences About the Slope: t Test
- t test for a population slope: is there a linear relationship between X and Y?
- Null and alternative hypotheses:
  \(H_0: \beta_1 = 0\) (no linear relationship)
  \(H_1: \beta_1 \ne 0\) (linear relationship does exist)
- Test statistic:

\[ t_{STAT} = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad d.f. = n - 2 \]

where:
- \(b_1\) = regression slope coefficient
- \(\beta_1\) = hypothesized slope
- \(S_{b_1}\) = standard error of the slope

Inferences About the Slope: t Test Example

House Price in $1000s (y)    Square Feet (x)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700

Estimated regression equation: house price = 98.25 + 0.1098 (sq.ft.)
The slope of this model is 0.1098
Is there a relationship between the square footage of the house and its sales price?

Inferences About the Slope: t Test Example
\(H_0: \beta_1 = 0\)
\(H_1: \beta_1 \ne 0\)

               Coefficients    Standard Error    t Stat     P-value
Intercept      98.24833        58.03348          1.69296    0.12892
Square Feet    0.10977         0.03297           3.32938    0.01039

In the Square Feet row, the Coefficients column is \(b_1\), the Standard Error column is \(S_{b_1}\), and the t Stat column is \(t_{STAT}\):

\[ t_{STAT} = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938 \]

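Inferences About the Slope: t Test Example
Test statistic: \(t_{STAT} = 3.329\)
\(H_0: \beta_1 = 0\)
\(H_1: \beta_1 \ne 0\)
d.f. = 10 - 2 = 8
\(\alpha/2 = 0.025\) in each tail, so the critical values are \(\pm t_{\alpha/2} = \pm 2.3060\)
[Figure: t distribution with "Reject \(H_0\)" regions below -2.3060 and above 2.3060 and "Do not reject \(H_0\)" between them; the test statistic 3.329 falls in the upper rejection region]
Decision: Reject \(H_0\)
There is sufficient evidence that square footage affects house price

Inferences About the Slope: t Test Example
\(H_0: \beta_1 = 0\)
\(H_1: \beta_1 \ne 0\)

From Excel output:
               Coefficients    Standard Error    t Stat     P-value
Intercept      98.24833        58.03348          1.69296    0.12892
Square Feet    0.10977         0.03297           3.32938    0.01039

From Minitab output:
Predictor      Coef       SE Coef    T       P
Constant       98.25      58.03      1.69    0.129
Square Feet    0.10977    0.03297    3.33    0.010

The p-value is the last entry in the Square Feet row.
Decision: Reject \(H_0\), since p-value < \(\alpha\)
There is sufficient evidence that square footage affects house price.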
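The slope test can be reproduced from the data as a check on the software output. In this minimal sketch the critical value 2.3060 is taken from the example rather than computed, and the variable names are illustrative.

```python
import math

# House data from the example (price in $1000s, size in square feet).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

T_CRIT = 2.3060  # two-tail critical value for alpha = 0.05, 8 d.f. (from the example)

n = len(prices)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n
ssx = sum((x - x_bar) ** 2 for x in sqft)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices)) / ssx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate, then standard error of the slope.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, prices))
s_yx = math.sqrt(sse / (n - 2))
s_b1 = s_yx / math.sqrt(ssx)

# Test H0: beta_1 = 0 against H1: beta_1 != 0.
t_stat = (b1 - 0) / s_b1
reject_h0 = abs(t_stat) > T_CRIT

print(f"S_b1 = {s_b1:.5f}, t_STAT = {t_stat:.5f}, reject H0: {reject_h0}")
```

A statistics library could supply the critical value or the p-value directly; the constant is used here only to keep the sketch dependency-free.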

F Test for Significance
F test statistic:

\[ F_{STAT} = \frac{MSR}{MSE} \]

where

\[ MSR = \frac{SSR}{k}, \qquad MSE = \frac{SSE}{n - k - 1} \]

\(F_{STAT}\) follows an F distribution with k numerator and (n - k - 1) denominator degrees of freedom (k = the number of independent variables in the regression model)

F Test for Significance: Output

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
              df    SS            MS            F          Significance F
Regression    1     18934.9348    18934.9348    11.0848    0.01039
Residual      8     13665.5652    1708.1957
Total         9     32600.5000

\[ F_{STAT} = \frac{MSR}{MSE} = \frac{18934.9348}{1708.1957} = 11.0848 \]

With 1 and 8 degrees of freedom. The Significance F value, 0.01039, is the p-value for the F test.

F Test for Significance (continued)
\(H_0: \beta_1 = 0\)
\(H_1: \beta_1 \ne 0\)
\(\alpha = 0.05\), \(df_1 = 1\), \(df_2 = 8\)
Critical value: \(F_{0.05} = 5.32\)
Test statistic: \(F_{STAT} = \frac{MSR}{MSE} = 11.08\)
[Figure: F distribution with "Do not reject \(H_0\)" below \(F_{0.05} = 5.32\) and "Reject \(H_0\)" (area \(\alpha = 0.05\)) above it]
Decision: Reject \(H_0\) at \(\alpha = 0.05\)
Conclusion: There is sufficient evidence that house size affects selling price

Confidence Interval Estimate for the Slope
Confidence interval estimate of the slope:

\[ b_1 \pm t_{\alpha/2} S_{b_1}, \qquad d.f. = n - 2 \]

Excel printout for house prices:
               Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept      98.24833        58.03348          1.69296    0.12892    -35.57720    232.07386
Square Feet    0.10977         0.03297           3.32938    0.01039    0.03374      0.18580

At the 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858)
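Both calculations on these slides need only the summary figures in the output. The sketch below starts from the reported SSR, SSE, slope, and standard error, and takes the critical values 5.32 and 2.3060 from the example; the variable names are illustrative.

```python
# Summary figures as reported in the example's regression output.
SSR = 18934.9348
SSE = 13665.5652
N = 10           # observations
K = 1            # independent variables
B1 = 0.10977     # slope
S_B1 = 0.03297   # standard error of the slope

F_CRIT = 5.32    # F critical value, alpha = 0.05, 1 and 8 d.f. (from the example)
T_CRIT = 2.3060  # t critical value, alpha/2 = 0.025, 8 d.f. (from the example)

# F test for overall significance.
msr = SSR / K
mse = SSE / (N - K - 1)
f_stat = msr / mse
print(f"F_STAT = {f_stat:.4f}, reject H0: {f_stat > F_CRIT}")

# 95% confidence interval for the slope: b1 +/- t * S_b1.
half_width = T_CRIT * S_B1
lower, upper = B1 - half_width, B1 + half_width
print(f"95% CI for slope: ({lower:.5f}, {upper:.5f})")

# Convert from $1000s per square foot to dollars per square foot.
print(f"in dollars: (${lower * 1000:.2f}, ${upper * 1000:.2f})")
```

With a single independent variable the F statistic equals the square of the slope's t statistic, so the two tests always reach the same decision.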

Confidence Interval Estimate for the Slope (continued)

               Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept      98.24833        58.03348          1.69296    0.12892    -35.57720    232.07386
Square Feet    0.10977         0.03297           3.32938    0.01039    0.03374      0.18580

- Since the units of the house price variable are $1000s, we are 95% confident that the average impact on sales price is between $33.74 and $185.80 per square foot of house size
- This 95% confidence interval does not include 0
- Conclusion: There is a significant relationship between house price and square feet at the 0.05 level of significance

t Test for a Correlation Coefficient
Hypotheses:
\(H_0: \rho = 0\) (no correlation between X and Y)
\(H_1: \rho \ne 0\) (correlation exists)

Test statistic (with n - 2 degrees of freedom):

\[ t_{STAT} = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \]

where
\( r = +\sqrt{r^2} \) if \( b_1 > 0 \)
\( r = -\sqrt{r^2} \) if \( b_1 < 0 \)

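t Test for a Correlation Coefficient (continued)
Is there evidence of a linear relationship between square feet and house price at the 0.05 level of significance?
\(H_0: \rho = 0\) (no correlation)
\(H_1: \rho \ne 0\) (correlation exists)
\(\alpha = 0.05\), df = 10 - 2 = 8

\[ t_{STAT} = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{0.762 - 0}{\sqrt{\dfrac{1 - 0.762^2}{10 - 2}}} = 3.329 \]

t Test for a Correlation Coefficient (continued)
d.f. = 10 - 2 = 8
\(\alpha/2 = 0.025\) in each tail, so the critical values are \(\pm t_{\alpha/2} = \pm 2.3060\)
[Figure: t distribution with "Reject \(H_0\)" regions below -2.3060 and above 2.3060 and "Do not reject \(H_0\)" between them; the test statistic 3.329 falls in the upper rejection region]
Decision: Reject \(H_0\)
Conclusion: There is evidence of a linear association at the 5% level of significance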
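The correlation test needs only \(r^2\), the sign of the slope, and n. A minimal sketch, using the R Square value from the regression output and the critical value from the example; the function name is hypothetical.

```python
import math


def correlation_t_stat(r_squared, slope, n, rho=0.0):
    """t statistic for H0: rho = 0, with r given the sign of the slope."""
    r = math.copysign(math.sqrt(r_squared), slope)
    return (r - rho) / math.sqrt((1 - r_squared) / (n - 2))


T_CRIT = 2.3060  # two-tail critical value, alpha = 0.05, 8 d.f. (from the example)

t_stat = correlation_t_stat(r_squared=0.58082, slope=0.10977, n=10)
print(f"t_STAT = {t_stat:.3f}, reject H0: {abs(t_stat) > T_CRIT}")
```

In simple linear regression this statistic coincides with the t statistic for the slope, which is why both tests report 3.329 here.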

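Estimating Mean Values and Predicting Individual Values
Goal: form intervals around Y to express uncertainty about the value of Y for a given \(X_i\)
[Figure: prediction line \(\hat{Y} = b_0 + b_1 X_i\) with a confidence interval for the mean of Y, given \(X_i\), and a prediction interval for an individual Y, given \(X_i\)]

Confidence Interval for the Average Y, Given X
Confidence interval estimate for the mean value of Y given a particular \(X_i\):

\[ \text{Confidence interval for } \mu_{Y|X=X_i}: \quad \hat{Y}_i \pm t_{\alpha/2} \, S_{YX} \sqrt{h_i} \]

The size of the interval varies according to the distance away from the mean, \(\bar{X}\):

\[ h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{SSX} = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2} \]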

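Prediction Interval for an Individual Y, Given X
Prediction interval estimate for an individual value of Y given a particular \(X_i\):

\[ \text{Prediction interval for } Y_{X=X_i}: \quad \hat{Y}_i \pm t_{\alpha/2} \, S_{YX} \sqrt{1 + h_i} \]

The extra term (the 1 under the square root) adds to the interval width to reflect the added uncertainty for an individual case

Estimation of Mean Values: Example
Confidence interval estimate for \(\mu_{Y|X=X_i}\)
Find the 95% confidence interval for the mean price of 2,000 square-foot houses
Predicted price \(\hat{Y}_i\) = 317.78 ($1,000s)

\[ \hat{Y}_i \pm t_{0.025} \, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}} = 317.78 \pm 37.12 \]

The confidence interval endpoints are 280.66 and 354.90, or from $280,660 to $354,900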
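The two interval formulas differ only by the extra 1 under the square root, which a short computation makes concrete. This sketch recomputes everything from the house data and takes the critical value 2.3060 from the earlier slope test; the function name is hypothetical.

```python
import math

# House data from the example (price in $1000s, size in square feet).
prices = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

T_CRIT = 2.3060  # t critical value, alpha/2 = 0.025, 8 d.f. (from the example)

n = len(prices)
x_bar = sum(sqft) / n
y_bar = sum(prices) / n
ssx = sum((x - x_bar) ** 2 for x in sqft)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, prices)) / ssx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, prices))
s_yx = math.sqrt(sse / (n - 2))


def interval_half_widths(x_new):
    """Return (y_hat, half-width for the mean, half-width for an individual)."""
    y_hat = b0 + b1 * x_new
    h = 1 / n + (x_new - x_bar) ** 2 / ssx
    half_mean = T_CRIT * s_yx * math.sqrt(h)      # confidence interval
    half_pred = T_CRIT * s_yx * math.sqrt(1 + h)  # prediction interval
    return y_hat, half_mean, half_pred


y_hat, half_mean, half_pred = interval_half_widths(2000)
print(f"mean price:       {y_hat:.2f} +/- {half_mean:.2f}")
print(f"individual price: {y_hat:.2f} +/- {half_pred:.2f}")
```

Evaluating the function at values of X farther from the sample mean shows both intervals widening, as the formula for \(h_i\) implies.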

Estimation of Individual Values: Example
Prediction interval estimate for \(Y_{X=X_i}\)
Find the 95% prediction interval for an individual house with 2,000 square feet
Predicted price \(\hat{Y}_i\) = 317.78 ($1,000s)

\[ \hat{Y}_i \pm t_{0.025} \, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}} = 317.78 \pm 102.28 \]

The prediction interval endpoints are 215.50 and 420.07, or from $215,500 to $420,070

Pitfalls of Regression Analysis
- Lacking an awareness of the assumptions underlying least-squares regression
- Not knowing how to evaluate the assumptions
- Not knowing the alternatives to least-squares regression if a particular assumption is violated
- Using a regression model without knowledge of the subject matter
- Extrapolating outside the relevant range

Strategies for Avoiding the Pitfalls of Regression
- Start with a scatter plot of X vs. Y to observe possible relationship
- Perform residual analysis to check the assumptions:
  - Plot the residuals vs. X to check for violations of assumptions such as homoscedasticity
  - Use a histogram, stem-and-leaf display, boxplot, or normal probability plot of the residuals to uncover possible non-normality

Strategies for Avoiding the Pitfalls of Regression (continued)
- If there is violation of any assumption, use alternative methods or models
- If there is no evidence of assumption violation, then test for the significance of the regression coefficients and construct confidence intervals and prediction intervals
- Avoid making predictions or forecasts outside the relevant range

Summary
- Introduced types of regression models
- Reviewed assumptions of regression and correlation
- Discussed determining the simple linear regression equation
- Described measures of variation
- Discussed residual analysis
- Addressed measuring autocorrelation

Summary (continued)
- Described inference about the slope
- Discussed correlation: measuring the strength of the association
- Addressed estimation of mean values and prediction of individual values
- Discussed possible pitfalls in regression and recommended strategies to avoid them