Statistics for Business and Economics


Statistics for Business and Economics. Chapter 11: Simple Regression. Copyright 2010 Pearson Education, Inc., publishing as Prentice Hall.

11.1 Overview of Linear Models. An equation can be fit to show the best linear relationship between two variables: Y = β0 + β1X, where Y is the dependent variable and X is the independent variable, β0 is the Y-intercept, and β1 is the slope.

Least Squares Regression. Estimates for the coefficients β0 and β1 are found using a least squares regression technique. The least-squares regression line, based on sample data, is ŷ = b0 + b1x, where b1 is the slope of the line and b0 is the y-intercept: b1 = Cov(x, y) / s_x² and b0 = ȳ - b1x̄.

Introduction to Regression Analysis. Regression analysis is used to predict the value of a dependent variable based on the value of at least one independent variable, and to explain the impact of changes in an independent variable on the dependent variable. Dependent variable: the variable we wish to explain (also called the endogenous variable). Independent variable: the variable used to explain the dependent variable (also called the exogenous variable).

11.2 Linear Regression Model. The relationship between X and Y is described by a linear function, and changes in Y are assumed to be caused by changes in X. The linear regression population equation model is Y_i = β0 + β1x_i + ε_i, where β0 and β1 are the population model coefficients and ε is a random error term.

Simple Linear Regression Model. The population regression model: Y_i = β0 + β1X_i + ε_i, where Y_i is the dependent variable, β0 is the population Y-intercept, β1 is the population slope coefficient, X_i is the independent variable, and ε_i is the random error term. β0 + β1X_i is the linear component and ε_i is the random error component.

Simple Linear Regression Model (continued). [Graph: for a given X_i, the observed value Y_i = β0 + β1X_i + ε_i lies off the population line by the random error ε_i for that X value; the predicted value of Y lies on the line, which has intercept β0 and slope β1.]

Simple Linear Regression Equation. The simple linear regression equation provides an estimate of the population regression line: ŷ_i = b0 + b1x_i, where ŷ_i is the estimated (or predicted) y value for observation i, b0 is the estimate of the regression intercept, b1 is the estimate of the regression slope, and x_i is the value of x for observation i. The individual random error terms e_i have a mean of zero: e_i = (y_i - ŷ_i) = y_i - (b0 + b1x_i).

11.3 Least Squares Estimators. b0 and b1 are obtained by finding the values of b0 and b1 that minimize the sum of the squared differences between y_i and ŷ_i: min SSE = min Σe_i² = min Σ(y_i - ŷ_i)² = min Σ[y_i - (b0 + b1x_i)]². Differential calculus is used to obtain the coefficient estimators b0 and b1 that minimize SSE.

Least Squares Estimators (continued). The slope coefficient estimator is b1 = Σ(x_i - x̄)(y_i - ȳ) / Σ(x_i - x̄)² = Cov(x, y) / s_x² = r_xy (s_y / s_x), and the constant or y-intercept is b0 = ȳ - b1x̄. The regression line always goes through the mean (x̄, ȳ).

Finding the Least Squares Equation. The coefficients b0 and b1, and other regression results in this chapter, will be found using a computer; hand calculations are tedious. Statistical routines are built into Excel, and other statistical analysis software can be used.
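As an illustration (this sketch is not part of the original slides), the least squares formulas b1 = Cov(x, y) / s_x² and b0 = ȳ - b1x̄ can be applied in Python to the house-price data used later in this chapter; NumPy is assumed to be available:

    import numpy as np

    # House price ($1000s) and size (square feet) from the chapter example
    y = np.array([245, 312, 279, 308, 199, 219, 405, 324, 319, 255])
    x = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700])

    b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # slope: Cov(x, y) / s_x^2
    b0 = y.mean() - b1 * x.mean()                        # intercept: y-bar - b1 * x-bar

    print(b0, b1)  # roughly 98.248 and 0.10977

This reproduces, up to rounding, the fitted equation shown on the example slides below.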

Linear Regression Model Assumptions. The true relationship form is linear (Y is a linear function of X, plus random error). The error terms ε_i are independent of the x values. The error terms are random variables with mean 0 and constant variance σ² (the constant variance property is called homoscedasticity): E[ε_i] = 0 and E[ε_i²] = σ² for i = 1, …, n. The random error terms ε_i are not correlated with one another, so that E[ε_i ε_j] = 0 for all i ≠ j.

Interpretation of the Slope and the Intercept. b0 is the estimated average value of y when the value of x is zero (if x = 0 is in the range of observed x values). b1 is the estimated change in the average value of y as a result of a one-unit change in x.

Simple Linear Regression Example. A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet). A random sample of 10 houses is selected. Dependent variable (Y) = house price in $1000s; independent variable (X) = square feet.

Sample Data for House Price Model

House Price in $1000s (Y)    Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700

Graphical Presentation. House price model: scatter plot of house price ($1000s) against square feet.

Graphical Presentation. House price model: scatter plot and regression line, with intercept = 98.248 and slope = 0.10977: house price = 98.24833 + 0.10977 (square feet).

Interpretation of the Intercept, b0. house price = 98.24833 + 0.10977 (square feet). b0 is the estimated average value of Y when the value of X is zero (if X = 0 is in the range of observed X values). Here no house in the sample has zero square feet, so b0 has no practical interpretation by itself.

Interpretation of the Slope Coefficient, b1. house price = 98.24833 + 0.10977 (square feet). b1 measures the estimated change in the average value of Y as a result of a one-unit change in X. Here, b1 = 0.10977 tells us that the value of a house increases by 0.10977($1000) = $109.77, on average, for each additional square foot of size.
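As a quick worked check (not on the original slides), a 2,000-square-foot house, which lies within the observed range of X, has predicted price ŷ = 98.24833 + 0.10977(2000) ≈ 317.8, i.e. about $317,800.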

11.4 Measures of Variation. Total variation is made up of two parts: SST = SSR + SSE (Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares), where SST = Σ(y_i - ȳ)², SSR = Σ(ŷ_i - ȳ)², and SSE = Σ(y_i - ŷ_i)², with ȳ = average value of the dependent variable, y_i = observed values of the dependent variable, and ŷ_i = predicted value of y for the given x_i value.

Measures of Variation (continued). SST = total sum of squares: measures the variation of the y_i values around their mean, ȳ. SSR = regression sum of squares: explained variation attributable to the linear relationship between x and y. SSE = error sum of squares: variation attributable to factors other than the linear relationship between x and y.

Measures of Variation (continued). [Graph: for an observation (x_i, y_i), the deviation y_i - ȳ (whose squared sum is SST) splits into y_i - ŷ_i (contributing to SSE) plus ŷ_i - ȳ (contributing to SSR).]

Coefficient of Determination, R². The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable. The coefficient of determination is also called R-squared and is denoted as R²: R² = SSR / SST = regression sum of squares / total sum of squares. Note: 0 ≤ R² ≤ 1.
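As another illustrative Python sketch (not part of the original slides), the three sums of squares and R² for the house-price data can be computed directly; np.polyfit is used only as a convenient least squares routine:

    import numpy as np

    y = np.array([245, 312, 279, 308, 199, 219, 405, 324, 319, 255])          # price, $1000s
    x = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700])  # square feet

    b1, b0 = np.polyfit(x, y, 1)               # least squares slope and intercept
    y_hat = b0 + b1 * x                        # predicted values

    sst = ((y - y.mean()) ** 2).sum()          # total sum of squares, ~32600.5
    sse = ((y - y_hat) ** 2).sum()             # error sum of squares, ~13666
    ssr = sst - sse                            # regression sum of squares, ~18935
    r_sq = ssr / sst                           # coefficient of determination, ~0.58

So roughly 58% of the variation in house price is explained by variation in square footage.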

Examples of Approximate r² Values. [Two plots with r² = 1:] Perfect linear relationship between X and Y: 100% of the variation in Y is explained by variation in X.

Examples of Approximate r² Values. [Plots with 0 < r² < 1:] Weaker linear relationships between X and Y: some but not all of the variation in Y is explained by variation in X.

Examples of Approximate r² Values. [Plot with r² = 0:] No linear relationship between X and Y: the value of Y does not depend on X (none of the variation in Y is explained by variation in X).

Correlation and R². The coefficient of determination, R², for a simple regression is equal to the simple correlation squared: R² = r_xy².

Estimation of Model Error Variance. An estimator for the variance of the population model error is σ̂² = s_e² = Σe_i² / (n - 2) = SSE / (n - 2). Division by n - 2 instead of n - 1 is because the simple regression model uses two estimated parameters, b0 and b1, instead of one. s_e = √(s_e²) is called the standard error of the estimate.
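Continuing the sketch above (reusing the sse value computed there, with n = 10 observations), the standard error of the estimate is:

    s_e = (sse / (10 - 2)) ** 0.5   # sqrt of SSE/(n-2), ~41.33

which matches the s_e = $41.33K figure quoted on the next slide.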

Comparing Standard Errors. s_e is a measure of the variation of observed y values from the regression line. [Two plots: one with small s_e, one with large s_e.] The magnitude of s_e should always be judged relative to the size of the y values in the sample data; e.g., s_e = $41.33K is moderately small relative to house prices in the $200K-$300K range.

11.5 Inferences About the Regression Model. The variance of the regression slope coefficient (b1) is estimated by s_b1² = s_e² / Σ(x_i - x̄)² = s_e² / [(n - 1)s_x²], where s_b1 is the estimate of the standard error of the least squares slope and s_e = √(SSE / (n - 2)) is the standard error of the estimate.
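Again continuing the illustrative sketch (reusing x and s_e from the earlier snippets), the estimated standard error of the slope follows from the formula above:

    import numpy as np

    s_b1 = s_e / np.sqrt(((x - x.mean()) ** 2).sum())   # s_e / sqrt(sum of (x_i - x-bar)^2), ~0.03297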

Comparing Standard Errors of the Slope. s_b1 is a measure of the variation in the slope of regression lines from different possible samples. [Two plots: one with small s_b1, one with large s_b1.]

Inference about the Slope: t Test. The t test for a population slope asks: is there a linear relationship between X and Y? Null and alternative hypotheses: H0: β1 = 0 (no linear relationship); H1: β1 ≠ 0 (linear relationship does exist). Test statistic: t = (b1 - β1) / s_b1, with d.f. = n - 2, where b1 = regression slope coefficient, β1 = hypothesized slope, and s_b1 = standard error of the slope.
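A brief sketch of the test statistic and its two-tailed p-value, continuing the earlier Python snippets (b1 and s_b1 as computed there) and assuming SciPy is available:

    from scipy import stats

    t_stat = (b1 - 0) / s_b1                          # hypothesized slope is 0, ~3.329
    p_value = 2 * stats.t.sf(abs(t_stat), df=10 - 2)  # two-tailed p-value, ~0.0104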

Inference about the Slope: t Test (continued)

House Price in $1000s (y)    Square Feet (x)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700

Estimated regression equation: house price = 98.25 + 0.1098 (sq. ft.). The slope of this model is 0.1098. Does square footage of the house affect its sales price?

Inferences about the Slope: t Test Example. H0: β1 = 0; H1: β1 ≠ 0.

              Coefficients    Standard Error    t Stat     P-value
Intercept     98.24833        58.03348          1.69296    0.12892
Square Feet   0.10977         0.03297           3.32938    0.01039

t = (b1 - β1) / s_b1 = (0.10977 - 0) / 0.03297 = 3.329

Inferences about the Slope: t Test Example (continued). H0: β1 = 0; H1: β1 ≠ 0. d.f. = 10 - 2 = 8 and α/2 = .025, so t(8, .025) = 2.3060. Test statistic: t = 3.329. [Rejection-region sketch: reject H0 if t < -2.3060 or t > 2.3060, do not reject otherwise; the test statistic 3.329 falls in the upper rejection region.] Decision: reject H0. Conclusion: there is sufficient evidence that square footage affects house price.

Inferences about the Slope: t Test Example (continued). H0: β1 = 0; H1: β1 ≠ 0. P-value = 0.01039. This is a two-tail test, so the p-value is P(t > 3.329) + P(t < -3.329) = 0.01039 (for 8 d.f.). Decision: the p-value is less than α, so reject H0. Conclusion: there is sufficient evidence that square footage affects house price.

Confidence Interval Estimate for the Slope. Confidence interval estimate of the slope: b1 - t(n-2, α/2) s_b1 < β1 < b1 + t(n-2, α/2) s_b1, with d.f. = n - 2.

              Coefficients   Standard Error   t Stat    P-value    Lower 95%    Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892    -35.57720    232.07386
Square Feet   0.10977        0.03297          3.32938   0.01039    0.03374      0.18580

At the 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858).
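The same interval can be checked with a short continuation of the Python sketch (b1 and s_b1 as computed earlier, SciPy assumed):

    from scipy import stats

    t_crit = stats.t.ppf(0.975, df=10 - 2)           # t(8, .025), ~2.306
    ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)    # ~ (0.0337, 0.1858)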

Confidence Interval Estimate for the Slope (continued). Since the units of the house price variable are $1000s, we are 95% confident that the average impact on sales price is between $33.74 and $185.80 per additional square foot of house size. This 95% confidence interval does not include 0. Conclusion: there is a significant relationship between house price and square feet at the .05 level of significance.