
Chapter 1 Simple Linear Regression (Part 3)

1 Write an Estimated Model

Statisticians and econometricians usually write an estimated model together with some inference statistics. The following are common formats for reporting a fitted model:

(i) standard errors in parentheses beneath the estimates:

$$\hat{Y} = b_0 + b_1 X$$
(S.E.)      $(s(b_0))$   $(s(b_1))$

$\hat{\sigma}^2$ (or MSE) $= \ldots$, $R^2 = \ldots$, F-statistic $= \ldots$ (and others)

(ii) t-values in parentheses beneath the estimates:

$$\hat{Y} = b_0 + b_1 X$$
(t-values)   $(t_0)$   $(t_1)$

$\hat{\sigma}^2$ (or MSE) $= \ldots$, $R^2 = \ldots$, F-statistic $= \ldots$ (and others)

(iii) p-values in parentheses beneath the estimates:

$$\hat{Y} = b_0 + b_1 X$$
(p-values)   $(p_0)$   $(p_1)$

$\hat{\sigma}^2$ (or MSE) $= \ldots$, $R^2 = \ldots$, F-statistic $= \ldots$ (and others)

For a simple linear regression model, a plot showing the fitted regression line is also necessary.

Example 1.1 The Toluca Company manufactures refrigeration equipment as well as many replacement parts. In the past, one of the replacement parts has been produced periodically in lots of varying sizes. When a cost improvement program was undertaken, company officials wished to determine the optimum lot size for producing this part.

The production of this part involves setting up the production process (which must be done no matter what the lot size is) and machining and assembly operations. One key input to the model for ascertaining the optimum lot size was the relationship between lot size and the labour hours required to produce the lot. To determine this relationship, data on lot size and work hours for 25 recent production runs were used. The production conditions were stable during the six-month period in which the 25 runs were made and were expected to remain the same during the next three years, the planning period for which the cost improvement program was being conducted. The data are listed in (data010301.dat).

Let X denote the lot size and Y the work hours. Based on the problem, we consider the following regression model for the n = 25 observations:

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1, 2, \ldots, 25.$$

The estimated model is (see (R code))

$$\hat{Y} = 62.366 + 3.570\,X$$
(S.E.)   (26.177)   (0.347)

MSE = 2383.392, $R^2$ = 0.8215, F-statistic = 105.9.
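These numbers can be reproduced in R with lm(). Below is a minimal sketch; it assumes data010301.dat is a plain-text file with a header row and columns named X (lot size) and Y (work hours), which is an assumption about the file, not something confirmed here.

    # Fit the simple linear regression of work hours on lot size
    toluca <- read.table("data010301.dat", header = TRUE)  # columns X, Y assumed
    fit <- lm(Y ~ X, data = toluca)

    summary(fit)   # coefficients with standard errors, t-values, p-values,
                   # R-squared, and the F-statistic
    coef(fit)                             # b0 and b1
    sum(resid(fit)^2) / fit$df.residual   # MSE = SSE/(n - 2)

    plot(Y ~ X, data = toluca)            # scatter plot with the fitted line
    abline(fit)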

2 Interval Prediction (Estimation) of E(Y) (also called narrow intervals)

Recall that the model is

$$Y = \underbrace{\beta_0 + \beta_1 X}_{\text{predictable}} + \underbrace{\varepsilon}_{\text{unpredictable}}$$

Note that $EY = \beta_0 + \beta_1 X$. In other words, only the mean of Y can be predicted. Based on the n observations $(X_1, Y_1), \ldots, (X_n, Y_n)$, we have the estimators

$$b_1 = \frac{\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^n (X_i - \bar{X})^2}, \qquad b_0 = \bar{Y} - b_1 \bar{X}.$$

The fitted model $\hat{Y} = b_0 + b_1 X$ is actually a prediction of the mean E(Y). We call it the point prediction (estimation) of EY (for a new X), but non-statisticians also call it the prediction of Y.

Sampling distribution of $\hat{Y}$: For the simple linear regression model with independent, identically distributed normal errors (i.e. model (1) in Part 2 of Chapter 1), we have

$$\hat{Y} \sim N\!\left(EY,\ \sigma^2\left[\frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]\right)$$

[Proof: $\hat{Y}$ is a linear combination of IID normal random variables, so it is normally distributed; we only need to check its mean and variance. We know

$$E\hat{Y} = \beta_0 + \beta_1 X = EY.$$

Let

$$k_i = \frac{X_i - \bar{X}}{\sum_{j=1}^n (X_j - \bar{X})^2};$$

then $\sum_i k_i = 0$, $\sum_i k_i^2 = 1/\sum_i (X_i - \bar{X})^2$, and

$$b_1 = \beta_1 + \sum_{i=1}^n k_i \varepsilon_i.$$

Thus

$$\hat{Y} = \bar{Y} - b_1\bar{X} + b_1 X = \beta_0 + \beta_1\bar{X} + \frac{1}{n}\sum_{i=1}^n \varepsilon_i + (X - \bar{X})b_1$$
$$= \beta_0 + \beta_1\bar{X} + \frac{1}{n}\sum_{i=1}^n \varepsilon_i + (X - \bar{X})\Big(\beta_1 + \sum_{i=1}^n k_i\varepsilon_i\Big) = \beta_0 + \beta_1 X + \sum_{i=1}^n \Big[\frac{1}{n} + (X - \bar{X})k_i\Big]\varepsilon_i,$$

and

$$\mathrm{Var}(\hat{Y}) = \mathrm{Var}\Big(\sum_{i=1}^n \Big[\frac{1}{n} + (X - \bar{X})k_i\Big]\varepsilon_i\Big) = \sum_{i=1}^n \Big[\frac{1}{n} + (X - \bar{X})k_i\Big]^2 \sigma^2$$
$$= \sum_{i=1}^n \Big[\frac{1}{n^2} + \frac{2}{n}(X - \bar{X})k_i + (X - \bar{X})^2 k_i^2\Big]\sigma^2 = \Big[\frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\Big]\sigma^2,$$

where the last equality uses $\sum_i k_i = 0$ and $\sum_i k_i^2 = 1/\sum_i (X_i - \bar{X})^2$.]
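The variance formula can be checked numerically by Monte Carlo: fix the design points, repeatedly generate new errors, refit the line, and compare the empirical variance of $\hat{Y}$ at a new X with the theoretical value. A small sketch in R; the values of beta0, beta1, sigma, the design grid, and x0 below are illustrative choices, not the Toluca data.

    set.seed(1)
    n <- 25; beta0 <- 60; beta1 <- 3.5; sigma <- 50   # illustrative parameters
    x <- seq(20, 120, length.out = n)                 # fixed design points
    x0 <- 65                                          # new X at which to predict

    # Simulate 10,000 datasets from the model and record Yhat at x0 each time
    yhat <- replicate(10000, {
      y <- beta0 + beta1 * x + rnorm(n, sd = sigma)
      b <- coef(lm(y ~ x))
      unname(b[1] + b[2] * x0)
    })

    var(yhat)                                                  # empirical variance
    sigma^2 * (1/n + (x0 - mean(x))^2 / sum((x - mean(x))^2))  # theoretical variance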

Since $\sigma^2$ is unknown, we need to replace $\sigma^2$ by MSE and define

$$s^2(\hat{Y}) = \mathrm{MSE}\left[\frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right].$$

Sampling distribution of $(\hat{Y} - E\hat{Y})/s(\hat{Y})$: For the simple linear regression model with independent, identically distributed normal errors (i.e. model (1) in Part 2 of Chapter 1), we have

$$\frac{\hat{Y} - EY}{s(\hat{Y})} \sim t(n-2).$$

Confidence interval for E(Y) with confidence level $1 - \alpha$ (also called a narrow interval):

$$\hat{Y} \pm t(1 - \alpha/2,\, n-2)\, s(\hat{Y}).$$

Prediction interval for Y at a new X (called a wide interval): Although we cannot predict the value of Y itself, we can find its plausible range. Consider the distribution of $\hat{Y} - Y$. Write

$$\hat{Y} - Y = \hat{Y} - (\beta_0 + \beta_1 X + \varepsilon) = [\hat{Y} - EY] - \varepsilon.$$

It is easy to see that this follows a normal distribution. Its mean is

$$E(\hat{Y} - Y) = [E\hat{Y} - EY] - E\varepsilon = 0.$$

Note that Y is a new observation, $Y = \beta_0 + \beta_1 X + \varepsilon$, with $\varepsilon$ uncorrelated with $\varepsilon_1, \ldots, \varepsilon_n$ and $\mathrm{Var}(\varepsilon) = \sigma^2$. The variance is therefore

$$\mathrm{Var}(\hat{Y} - Y) = \mathrm{Var}(\hat{Y} - EY - \varepsilon) = \mathrm{Var}(\hat{Y}) + \sigma^2 = \left[\frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]\sigma^2 + \sigma^2.$$
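As a sanity check, $s(\hat{Y})$ and the narrow interval can be computed directly from these formulas and compared with R's built-in predict(). A sketch continuing from the toluca data frame and fit object sketched after Example 1.1; x0 is an arbitrary illustrative lot size.

    x0 <- 65                                    # illustrative new lot size
    n  <- nrow(toluca)
    mse <- sum(resid(fit)^2) / (n - 2)

    # s(Yhat) from the formula MSE * [1/n + (x0 - Xbar)^2 / Sxx]
    se_mean <- sqrt(mse * (1/n + (x0 - mean(toluca$X))^2 /
                                 sum((toluca$X - mean(toluca$X))^2)))
    yhat0 <- predict(fit, newdata = data.frame(X = x0))
    tc <- qt(0.975, df = n - 2)                 # t(1 - alpha/2, n - 2), alpha = 0.05

    c(yhat0 - tc * se_mean, yhat0 + tc * se_mean)   # narrow interval by hand
    predict(fit, newdata = data.frame(X = x0),      # the same interval via predict()
            interval = "confidence", level = 0.95)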

Thus

$$\hat{Y} - Y \sim N\!\left(0,\ \left[1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right]\sigma^2\right).$$

Again, we need to replace $\sigma^2$ by MSE and define

$$s^2(\mathrm{pred}) = \mathrm{MSE}\left[1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2}\right],$$

and then

$$\frac{\hat{Y} - Y}{s(\mathrm{pred})} \sim t(n-2). \qquad (1)$$

With confidence level $1 - \alpha$, the prediction interval is

$$\hat{Y} \pm t(1 - \alpha/2,\, n-2)\, s(\mathrm{pred}).$$

R code:

    predict(object, newdata,
            interval = "none" / "confidence" / "prediction",  # select one
            level = 0.95)

Pointwise confidence bands: Letting X go through a range of values, the intervals

$$\hat{Y} \pm t(1 - \alpha/2,\, n-2)\, s(\hat{Y}) \quad \text{(narrow confidence band)}$$

or

$$\hat{Y} \pm t(1 - \alpha/2,\, n-2)\, s(\mathrm{pred}) \quad \text{(wide confidence band)}$$

form a band, called a confidence band.

Example 2.1 For Example 1.1, with confidence level 0.95:

(a) For a new point X = 4, predict the expected work hours E(Y) and give a confidence interval.

(b) For a new point X = 4, calculate the prediction interval for Y.

(c) Draw the (narrow) pointwise confidence band.

Solution: (See (R code))
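A sketch of the solution in R, continuing from the toluca data frame and fit object sketched after Example 1.1; the grid below simply spans the observed lot sizes.

    new <- data.frame(X = 4)
    predict(fit, new, interval = "confidence", level = 0.95)  # (a) narrow interval
    predict(fit, new, interval = "prediction", level = 0.95)  # (b) wide interval

    # (c) the narrow pointwise confidence band over a grid of X values
    grid <- data.frame(X = seq(20, 120, by = 1))
    band <- predict(fit, grid, interval = "confidence", level = 0.95)
    plot(Y ~ X, data = toluca,
         main = "Regression analysis for the Toluca Company data")
    abline(fit)
    lines(grid$X, band[, "lwr"], lty = 2)
    lines(grid$X, band[, "upr"], lty = 2)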

(a) $\hat{Y} = 76.64667$; the 95% confidence interval for E(Y) is [25.14723, 128.1461].

(b) The 95% prediction interval for Y is [-36.72411, 190.0174].

(c) See Figure 1.

[Figure 1: The 95% narrow pointwise confidence band for EY, from the regression analysis of the Toluca Company data; work hours Y plotted against lot size X.]

Global confidence band [not our interest]