Chapter 12 - Lecture 2 Inferences about regression coefficient

April 19th, 2010

Outline: Facts about slope, Test statistic, Confidence interval, Hypothesis testing, Test using ANOVA table

Facts about slope

In previous lectures we saw that the regression coefficient $\beta_1$ is a parameter that can be estimated from a sample. In previous chapters we saw that a sample can be used to make statistical inferences about a parameter. This means we can use the fitted regression line to make inferences about the regression slope, which is the topic of this lecture.

Slope facts

$$E(\hat\beta_1) = \beta_1, \qquad \mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{S_{xx}}$$

What can we say about the distribution of $\hat\beta_1$ when $n$ is large? Using this fact, we can build a test statistic to make inferences about the slope of the regression line. What test statistic can we use? What is a problem with that test statistic?

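Test statistic

The test statistic is the following:

$$T = \frac{\hat\beta_1 - \beta_1}{S / \sqrt{S_{xx}}} = \frac{\hat\beta_1 - \beta_1}{S_{\hat\beta_1}}$$

Here $S$ is the sample estimate of $\sigma$, and $S_{\hat\beta_1} = S / \sqrt{S_{xx}}$ is the estimated standard error of $\hat\beta_1$. Can you find the distribution of $T$?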

Constructing a confidence interval

Starting from the fact that

$$P\left(-t_{n-2,\alpha/2} < \frac{\hat\beta_1 - \beta_1}{S_{\hat\beta_1}} < t_{n-2,\alpha/2}\right) = 1 - \alpha,$$

we get the following $(1-\alpha)100\%$ confidence interval for $\beta_1$:

$$\hat\beta_1 \pm t_{n-2,\alpha/2}\, S_{\hat\beta_1}$$
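As a rough illustration of the interval above, the following sketch computes $\hat\beta_1$, $S_{\hat\beta_1}$ and the confidence limits for the five midterm score pairs used in this lecture's exercises. The function name is an illustrative choice, and the critical value 3.182 is the t-table value $t_{3,0.025}$, typed in by hand rather than computed.

```python
import math

def slope_confidence_interval(x, y, t_crit):
    """Return (b1, se_b1, (lower, upper)) for the slope of a simple linear regression.

    t_crit is the table value t_{n-2, alpha/2}, supplied by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = s_xy / s_xx                   # least-squares estimate of the slope
    sse = s_yy - s_xy ** 2 / s_xx      # error sum of squares
    s = math.sqrt(sse / (n - 2))       # estimate of sigma
    se_b1 = s / math.sqrt(s_xx)        # estimated standard error of the slope

    half_width = t_crit * se_b1
    return b1, se_b1, (b1 - half_width, b1 + half_width)

# Midterm scores from the lecture's exercise.
midterm1 = [50, 70, 75, 80, 95]
midterm2 = [40, 65, 95, 90, 100]

T_CRIT_95 = 3.182  # t_{3, 0.025}, read from a t table (assumed input)
b1, se_b1, ci = slope_confidence_interval(midterm1, midterm2, T_CRIT_95)
print(f"slope = {b1:.4f}, standard error = {se_b1:.4f}")
print(f"95% CI for the slope: ({ci[0]:.3f}, {ci[1]:.3f})")
```

With only $n - 2 = 3$ degrees of freedom the critical value is large, so the interval is wide even though the fitted slope is well above zero.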

Hypothesis test

Null hypothesis: $H_0: \beta_1 = \beta_{10}$

Test statistic:

$$t = \frac{\hat\beta_1 - \beta_{10}}{s_{\hat\beta_1}} \sim t_{n-2} \quad \text{under } H_0$$

Rejection regions:

$t \ge t_{n-2,\alpha}$ if $H_A: \beta_1 > \beta_{10}$
$t \le -t_{n-2,\alpha}$ if $H_A: \beta_1 < \beta_{10}$
$t \le -t_{n-2,\alpha/2}$ or $t \ge t_{n-2,\alpha/2}$ if $H_A: \beta_1 \ne \beta_{10}$
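The three rejection regions can be written as one small decision function. This is only a sketch: the function name and the string labels for the alternatives are illustrative choices, the statistic 2.5 is a made-up value, and the caller is assumed to pass the right table value ($t_{n-2,\alpha}$ for a one-sided test, $t_{n-2,\alpha/2}$ for a two-sided test).

```python
def reject_null(t_stat, t_crit, alternative):
    """Apply the rejection regions for H0: beta_1 = beta_10.

    t_crit must be t_{n-2, alpha} for one-sided alternatives and
    t_{n-2, alpha/2} for the two-sided alternative.
    """
    if alternative == "greater":      # H_A: beta_1 > beta_10
        return t_stat >= t_crit
    if alternative == "less":         # H_A: beta_1 < beta_10
        return t_stat <= -t_crit
    if alternative == "two-sided":    # H_A: beta_1 != beta_10
        return abs(t_stat) >= t_crit
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Hypothetical statistic with 3 degrees of freedom and alpha = 0.05.
t_stat = 2.5
upper_tail = reject_null(t_stat, 2.353, "greater")    # 2.353 = t_{3, 0.05}
two_sided = reject_null(t_stat, 3.182, "two-sided")   # 3.182 = t_{3, 0.025}
print(upper_tail, two_sided)
```

The same statistic can be significant against a one-sided alternative and not against the two-sided one, because the two-sided test splits $\alpha$ between the two tails.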

Hypothesis test using ANOVA

In Chapter 6 we saw that if a random variable $U \sim t_v$, then $U^2 \sim F_{1,v}$. Last lecture I showed how to use SSR and SSE to construct an ANOVA table. The F test statistic in that table (see also next slide) is the square of a special case of the t statistic from the previous slide. So the ANOVA table is another way to carry out the test, but only when $\beta_{10} = 0$, that is, when the null hypothesis is $H_0: \beta_1 = 0$. The test with $\beta_{10} = 0$ is considered the most useful one and is also called the model utility test. Why do you think this case is so important?

ANOVA Table

Table: ANOVA table

Source of variation   df      Sum of Squares   Mean Square         F
Regression            1       SSR              SSR                 SSR/s^2
Error                 n - 2   SSE              s^2 = SSE/(n - 2)
Total                 n - 1   SSTo
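The relation between the F statistic in the ANOVA table and the t statistic for $H_0: \beta_1 = 0$ can be checked numerically. The sketch below fills in the table entries for the midterm scores from this lecture's exercises; the function name and the dictionary keys are illustrative choices, not notation from the course.

```python
import math

def anova_simple_regression(x, y):
    """Compute the ANOVA table entries and the slope t statistic for H0: beta_1 = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    ssto = s_yy                    # total sum of squares, n - 1 df
    ssr = s_xy ** 2 / s_xx         # regression sum of squares, 1 df
    sse = ssto - ssr               # error sum of squares, n - 2 df
    s2 = sse / (n - 2)             # mean square error
    f_stat = ssr / s2              # F statistic from the table

    b1 = s_xy / s_xx
    t_stat = b1 / math.sqrt(s2 / s_xx)   # t statistic with beta_10 = 0
    return {"SSR": ssr, "SSE": sse, "SSTo": ssto, "s2": s2, "F": f_stat, "t": t_stat}

midterm1 = [50, 70, 75, 80, 95]
midterm2 = [40, 65, 95, 90, 100]
table = anova_simple_regression(midterm1, midterm2)
print(f"F = {table['F']:.3f}, t = {table['t']:.3f}, t^2 = {table['t'] ** 2:.3f}")
```

Because $F = t^2$, comparing $F$ with $F_{1,n-2,\alpha}$ leads to the same decision as the two-sided t-test at level $\alpha$.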

I want to find the regression line that relates the scores on the two midterms in Stat 319. I randomly select five students. Their Midterm 1 scores are 50, 70, 75, 80, 95 and, in the same order, their Midterm 2 scores are 40, 65, 95, 90, 100.

Find a 95% confidence interval for the regression slope.
Use a t-test to see whether there is a relationship between the scores on the two midterms at significance level 0.02.
Use an F-test to see whether there is a relationship between the two scores at significance level 0.02.

Correlation between two random variables

In Stat 318, we defined the correlation coefficient $\rho$ as a measure of how strongly two random variables $X$ and $Y$ are related. The formula was:

$$\rho = \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$$

$\rho$ takes values between -1 and 1. The closer the value is to 1, the stronger the positive relationship. The closer the value is to -1, the stronger the negative relationship. The closer it is to 0, the weaker the relationship.

Estimating correlation from a sample

Let's assume we want to know the correlation between the height and weight of male students at PSU. That would mean asking all 25000 male students their height and weight, finding the covariance of the two random variables and their variances, and then calculating the correlation. It is much easier to take a sample and estimate the correlation. In other words, $\rho$ as we learned it in Chapter 5 is a population parameter. To estimate it from a sample, we use the formula:

$$\hat\rho = r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$

The square of this estimator, $r^2$, equals the coefficient of determination we saw last lecture, so $r$ is its square root taken with the sign of the estimated slope.
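The formula for $r$ and its link to the coefficient of determination can be checked on the midterm scores from this lecture's exercises. This is a minimal sketch; the function name is an illustrative choice, and SSR is computed as $S_{xy}^2 / S_{xx}$, which is the usual identity for simple linear regression.

```python
import math

def sample_correlation(x, y):
    """Return (r, S_xx, S_yy, S_xy) for paired samples x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = s_xy / math.sqrt(s_xx * s_yy)
    return r, s_xx, s_yy, s_xy

midterm1 = [50, 70, 75, 80, 95]
midterm2 = [40, 65, 95, 90, 100]
r, s_xx, s_yy, s_xy = sample_correlation(midterm1, midterm2)

# Coefficient of determination: SSR / SSTo with SSR = S_xy^2 / S_xx and SSTo = S_yy.
coefficient_of_determination = (s_xy ** 2 / s_xx) / s_yy
print(f"r = {r:.4f}, r^2 = {r ** 2:.4f}, SSR/SSTo = {coefficient_of_determination:.4f}")
```

Here the slope is positive, so $r$ is the positive square root of the coefficient of determination.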

Hypothesis testing

The following test is valid only for the null hypothesis $H_0: \rho = 0$.

Test statistic:

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} \sim t_{n-2}$$

Rejection regions:

$t \ge t_{n-2,\alpha}$ if $H_A: \rho > 0$
$t \le -t_{n-2,\alpha}$ if $H_A: \rho < 0$
$t \le -t_{n-2,\alpha/2}$ or $t \ge t_{n-2,\alpha/2}$ if $H_A: \rho \ne 0$
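A short sketch of this test, applied to the midterm scores from this lecture's exercises with the one-sided alternative $H_A: \rho > 0$ at level 0.05. The function names are illustrative, and 2.353 is the t-table value $t_{3,0.05}$ entered by hand.

```python
import math

def sample_correlation(x, y):
    """Pearson sample correlation r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)

def correlation_t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), valid only for H0: rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

midterm1 = [50, 70, 75, 80, 95]
midterm2 = [40, 65, 95, 90, 100]
n = len(midterm1)

r = sample_correlation(midterm1, midterm2)
t_stat = correlation_t_statistic(r, n)

T_CRIT = 2.353                        # t_{3, 0.05}, read from a t table (assumed input)
reject_h0 = t_stat >= T_CRIT          # rejection region for H_A: rho > 0
print(f"r = {r:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

The value of $t$ obtained this way matches the t statistic for testing $H_0: \beta_1 = 0$ on the same data, so the two tests agree.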

I want to find the regression line that relates the scores on the two midterms in Stat 319. I randomly select five students. Their Midterm 1 scores are 50, 70, 75, 80, 95 and, in the same order, their Midterm 2 scores are 40, 65, 95, 90, 100. Perform a hypothesis test to see whether there is significant evidence of a positive relationship between the two scores at significance level 0.05.

Extending the test to more cases

The last test we saw for $\rho$ can be used only for the null hypothesis $H_0: \rho = 0$. What happens if we want to test $H_0: \rho = \rho_0$ when $\rho_0 \ne 0$? We use the Fisher transformation and the random variable:

$$V = \frac{1}{2}\log\left(\frac{1+R}{1-R}\right)$$

Distribution

The random variable $V$ defined on the previous slide approximately follows a normal distribution:

$$V \sim N\left(\mu_V = \frac{1}{2}\log\left(\frac{1+\rho}{1-\rho}\right),\ \sigma_V^2 = \frac{1}{n-3}\right)$$

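Hypothesis testing

Null hypothesis: $H_0: \rho = \rho_0$

Test statistic:

$$z = \frac{\frac{1}{2}\log\left(\frac{1+r}{1-r}\right) - \frac{1}{2}\log\left(\frac{1+\rho_0}{1-\rho_0}\right)}{1/\sqrt{n-3}} \sim N(0, 1)$$

Rejection regions:

$z \ge z_{\alpha}$ if $H_A: \rho > \rho_0$
$z \le -z_{\alpha}$ if $H_A: \rho < \rho_0$
$z \le -z_{\alpha/2}$ or $z \ge z_{\alpha/2}$ if $H_A: \rho \ne \rho_0$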
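The Fisher-transformation test can be sketched as follows for the midterm scores from this lecture's exercises, with $\rho_0 = 0.5$ and a two-sided alternative at level 0.05. The function names are illustrative; the normal quantile comes from Python's standard `statistics.NormalDist`. With only five observations the normal approximation is rough, so the numbers are best read as an illustration of the mechanics.

```python
import math
from statistics import NormalDist

def sample_correlation(x, y):
    """Pearson sample correlation r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)

def fisher_z_statistic(r, rho0, n):
    """z = (V - mu_V under H0) / (1 / sqrt(n - 3)) with V the Fisher transform of r."""
    v = 0.5 * math.log((1 + r) / (1 - r))
    mu_v0 = 0.5 * math.log((1 + rho0) / (1 - rho0))
    return (v - mu_v0) * math.sqrt(n - 3)

midterm1 = [50, 70, 75, 80, 95]
midterm2 = [40, 65, 95, 90, 100]
n = len(midterm1)
alpha = 0.05

r = sample_correlation(midterm1, midterm2)
z = fisher_z_statistic(r, rho0=0.5, n=n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # z_{alpha/2}
reject_h0 = abs(z) >= z_crit                       # two-sided rejection region
p_value = 2 * (1 - NormalDist().cdf(abs(z)))       # two-sided p-value
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

Even though $r$ is far from 0.5, the standard deviation $1/\sqrt{n-3}$ is large for $n = 5$, which is why the statistic stays inside the acceptance region here.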

Confidence interval for $\mu_V$

Based on the previous results it is easy to create a confidence interval for $\mu_V$. A $(1-\alpha)100\%$ confidence interval for $\mu_V$ is given by:

$$V \pm \frac{z_{\alpha/2}}{\sqrt{n-3}}$$

Confidence interval for $\rho$

Our objective is not a confidence interval for $\mu_V$ but a confidence interval for $\rho$. A $(1-\alpha)100\%$ confidence interval for $\rho$ is given by:

$$\left(\frac{e^{2c_1} - 1}{e^{2c_1} + 1},\ \frac{e^{2c_2} - 1}{e^{2c_2} + 1}\right)$$

$c_1$ is the lower endpoint of the interval for $\mu_V$.
$c_2$ is the upper endpoint of the interval for $\mu_V$.
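The two steps (interval for $\mu_V$, then back-transformation to $\rho$) might be coded as below. This is a sketch under stated assumptions: the function name is illustrative, and the sums $S_{xy} = 1515$, $S_{xx} = 1070$, $S_{yy} = 2530$ were computed by hand from the midterm scores in this lecture's exercises.

```python
import math
from statistics import NormalDist

def fisher_interval_for_rho(r, n, confidence):
    """Return ((c1, c2), (rho_lower, rho_upper)) using the Fisher transformation."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
    v = 0.5 * math.log((1 + r) / (1 - r))           # observed value of V
    half_width = z_crit / math.sqrt(n - 3)
    c1, c2 = v - half_width, v + half_width         # interval for mu_V

    def back_transform(c):
        # Inverse of the Fisher transformation; equal to tanh(c).
        return (math.exp(2 * c) - 1) / (math.exp(2 * c) + 1)

    return (c1, c2), (back_transform(c1), back_transform(c2))

# Sums for the midterm data, computed by hand: S_xy = 1515, S_xx = 1070, S_yy = 2530.
r = 1515 / math.sqrt(1070 * 2530)
(c1, c2), (rho_lower, rho_upper) = fisher_interval_for_rho(r, n=5, confidence=0.99)
print(f"99% CI for mu_V: ({c1:.3f}, {c2:.3f})")
print(f"99% CI for rho:  ({rho_lower:.3f}, {rho_upper:.3f})")
```

The interval for $\rho$ is not symmetric around $r$, and it always stays inside $(-1, 1)$ because the back-transformation is bounded.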

I want to find the regression line that relates the scores on the two midterms in Stat 319. I randomly select five students. Their Midterm 1 scores are 50, 70, 75, 80, 95 and, in the same order, their Midterm 2 scores are 40, 65, 95, 90, 100.

Carry out a hypothesis test at significance level 0.05 to see whether there is significant evidence that the correlation coefficient is different from 0.5.
Find a 99% confidence interval for $\rho$.

Section 12.3, page 609: 31, 32, 33, 34, 35, 36, 37, 38, 41
Section 12.5, page 623: 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67