Simple and Multiple Linear Regression


Sta. 113: Chapters 12 and 13 of Devore. March 12, 2010

Table of contents
1. Simple Linear Regression
2. Multiple Linear Regression

Model (Simple Linear Regression)

A simple linear regression model is given by
$$Y = \beta_0 + \beta_1 x + \epsilon$$
where
- $Y$ is the response,
- $x$ is the predictor,
- $\beta_0$ is the unknown intercept of the line,
- $\beta_1$ is the unknown slope of the line,
- $\epsilon \sim N(0, \sigma^2)$ is the noise with unknown variance $\sigma^2$.

Model (Simple Linear Regression)

Notice that $Y$ is a random quantity due to $\epsilon$ only:
$$E(Y) = \beta_0 + \beta_1 x, \qquad V(Y) = \sigma^2,$$
so $Y \sim N(\beta_0 + \beta_1 x, \sigma^2)$.

Assumptions (Simple Linear Regression)

The model rests on four assumptions:
- a linear underlying relationship between the response and the predictor,
- normality of the random noise,
- constant variance of the random noise throughout the data,
- independence of the random noise terms.

Least Squares

Find the line passing through the data points such that the sum of squared vertical distances from this line to the data points is minimized:
$$\min_{b_0, b_1} \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2.$$
Since this is a minimization problem, taking the derivatives with respect to $b_0$ and $b_1$ and setting them equal to zero yields two equations, called the normal equations:
$$n b_0 + \Big(\sum x_i\Big) b_1 = \sum y_i$$
$$\Big(\sum x_i\Big) b_0 + \Big(\sum x_i^2\Big) b_1 = \sum x_i y_i$$

Least Squares

If we solve this system we obtain
$$b_1 = \hat\beta_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad b_0 = \hat\beta_0 = \bar{y} - b_1 \bar{x}.$$
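As a minimal sketch of how these closed-form estimators translate into code (the data here are fabricated purely for illustration; NumPy is assumed available):

```python
import numpy as np

def least_squares_fit(x, y):
    """Return (b0, b1) minimizing the sum of squared vertical distances."""
    xbar, ybar = x.mean(), y.mean()
    b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    b0 = ybar - b1 * xbar
    return b0, b1

# Illustrative made-up data (not from the lecture's examples)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b0, b1 = least_squares_fit(x, y)
```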

How does LSE relate to MLE?

Notice that there is nothing probabilistic about least squares estimation. It's merely an optimization problem in which the sum of squared vertical distances from the actual points to a line is minimized. There is no underlying distributional assumption; in fact, nothing is treated as random. We just have a cloud of points and we pass a line through them. In the beginning, however, we made certain assumptions about the response: we said $Y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$. Assuming that the responses are normally distributed with mean $\beta_0 + \beta_1 x_i$ and variance $\sigma^2$ yields a likelihood over the unknown model parameters $\beta_0$, $\beta_1$, and $\sigma^2$; maximizing this likelihood yields the MLE. It turns out that under the assumptions made earlier, the maximum likelihood estimators for $\beta_0$ and $\beta_1$ are identical to the least squares estimators.

Estimating the error variance

The maximum likelihood estimator for the error variance $\sigma^2$ is easily obtained as
$$\hat\sigma^2 = \frac{\sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2}{n}.$$
Recall that this is a biased estimator for $\sigma^2$. To correct for the bias, we subtract from $n$ the number of mean parameters estimated before estimating $\sigma^2$ (here two: $\beta_0$ and $\beta_1$). Thus the unbiased estimator is
$$s^2 = \frac{\sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2}{n - 2}.$$
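Continuing the sketch above (reusing x, y, b0, b1 from the previous block), the two estimators differ only in the divisor:

```python
import numpy as np

def error_variance_estimates(x, y, b0, b1):
    """Return (mle, s2): the biased MLE (divide by n) and the
    unbiased estimator (divide by n - 2)."""
    resid = y - (b0 + b1 * x)   # residuals from the fitted line
    sse = np.sum(resid ** 2)
    n = len(y)
    return sse / n, sse / (n - 2)
```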

Example: Murder rate vs. unemployment percentage (plots not reproduced in this transcription).

The coefficient of determination, $R^2$

The coefficient of determination, denoted by $R^2$, is given by
$$R^2 = 1 - \frac{SSE}{SST} = 1 - \frac{\sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}.$$
It is interpreted as the proportion of observed variation in $y$ that is explained by the simple linear regression model.
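In the same running sketch, $R^2$ is a one-liner once the fitted values are in hand:

```python
import numpy as np

def r_squared(x, y, b0, b1):
    """Proportion of observed variation in y explained by the fitted line."""
    sse = np.sum((y - (b0 + b1 * x)) ** 2)  # unexplained variation
    sst = np.sum((y - y.mean()) ** 2)       # total variation
    return 1.0 - sse / sst
```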

Inferences about $\beta_1$

It can be shown that $b_1 = \hat\beta_1$ is normally distributed with mean $E(b_1) = \beta_1$ and variance
$$V(b_1) = \frac{\sigma^2}{S_{xx}}, \qquad \text{where } S_{xx} = \sum (x_i - \bar{x})^2.$$
Thus the quantity
$$z = \frac{b_1 - \beta_1}{\sigma / \sqrt{S_{xx}}}$$
would be standard normally distributed. Since we don't know $\sigma^2$, replacing it by its estimator $s^2$ gives
$$t = \frac{b_1 - \beta_1}{s / \sqrt{S_{xx}}},$$
which has a $t$ distribution with $n - 2$ degrees of freedom.

Confidence interval and hypothesis test for $\beta_1$

A $100(1-\alpha)\%$ CI for the slope $\beta_1$ of the true regression line is given by
$$b_1 \pm t_{\alpha/2,\, n-2} \frac{s}{\sqrt{S_{xx}}}.$$
We usually test the null hypothesis $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$, where the test statistic is
$$t = \frac{b_1}{s / \sqrt{S_{xx}}}.$$
Since under the null hypothesis this statistic has a $t$ distribution with $n - 2$ degrees of freedom, the null hypothesis is rejected if $t \geq t_{\alpha/2,\, n-2}$ or $t \leq -t_{\alpha/2,\, n-2}$. This can easily be turned into a one-sided test.
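A sketch of this interval and test, assuming SciPy is available for the $t$ quantiles (again reusing x, y, b0, b1 from the earlier blocks):

```python
import numpy as np
from scipy import stats

def slope_inference(x, y, b0, b1, alpha=0.05):
    """100(1-alpha)% CI for beta1 and two-sided test of H0: beta1 = 0."""
    n = len(y)
    s = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))
    sxx = np.sum((x - x.mean()) ** 2)
    se_b1 = s / np.sqrt(sxx)                      # estimated SE of b1
    tcrit = stats.t.ppf(1 - alpha / 2, df=n - 2)  # t_{alpha/2, n-2}
    ci = (b1 - tcrit * se_b1, b1 + tcrit * se_b1)
    tstat = b1 / se_b1
    pval = 2 * stats.t.sf(abs(tstat), df=n - 2)   # two-sided p-value
    return ci, tstat, pval
```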

Inferences on $\mu_{Y \cdot x^*}$ and the prediction of future $Y$ values

Notice that once $b_0$ and $b_1$ are calculated, $b_0 + b_1 x^*$ is a point estimate of $\mu_{Y \cdot x^*}$ (the expected or true average value of $Y$ when $x = x^*$). The point estimate or prediction by itself gives no information concerning how precisely $\mu_{Y \cdot x^*}$ has been estimated or $Y$ predicted. This can be remedied by developing a CI for $\mu_{Y \cdot x^*}$ and a prediction interval (PI) for a single $Y$ value.

Inferences on $\mu_{Y \cdot x^*}$ and the prediction of future $Y$ values

A $100(1-\alpha)\%$ CI for $\mu_{Y \cdot x^*}$, the expected value of $Y$ when $x = x^*$, is given by
$$b_0 + b_1 x^* \pm t_{\alpha/2,\, n-2} \, s \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}.$$
A $100(1-\alpha)\%$ PI for a future $Y$ observation to be made when $x = x^*$ is given by
$$b_0 + b_1 x^* \pm t_{\alpha/2,\, n-2} \, s \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}.$$
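The same quantities in the running sketch; note that the PI differs from the CI only by the extra 1 under the square root, which accounts for the noise in a single future observation:

```python
import numpy as np
from scipy import stats

def mean_ci_and_pi(x, y, b0, b1, xstar, alpha=0.05):
    """CI for E(Y | x = xstar) and PI for a single future Y at xstar."""
    n = len(y)
    s = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))
    sxx = np.sum((x - x.mean()) ** 2)
    tcrit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    fit = b0 + b1 * xstar                         # point estimate at xstar
    half_ci = tcrit * s * np.sqrt(1 / n + (xstar - x.mean()) ** 2 / sxx)
    half_pi = tcrit * s * np.sqrt(1 + 1 / n + (xstar - x.mean()) ** 2 / sxx)
    return (fit - half_ci, fit + half_ci), (fit - half_pi, fit + half_pi)
```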

Model (Multiple Linear Regression)

A multiple linear regression model is given by
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \epsilon$$
where
- $Y$ is the response,
- $x_1, x_2, \dots, x_p$ are the predictors,
- $\beta_0, \beta_1, \dots, \beta_p$ are unknown regression coefficients,
- $\epsilon \sim N(0, \sigma^2)$ is the noise with unknown variance $\sigma^2$.

Model (Multiple Linear Regression)

When we have $n$ observations from such a model, i.e. $y = (y_1, y_2, \dots, y_n)$, we define the design matrix $X$ as
$$X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1p} \\ 1 & x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}.$$

Least Squares

The least squares solution $\hat\beta = (\hat\beta_0, \hat\beta_1, \hat\beta_2, \dots, \hat\beta_p)$ is given by
$$\hat\beta = (X^\top X)^{-1} X^\top y.$$
Just as in the simple linear regression case, this is equivalent to the MLE under the aforementioned assumptions.
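A minimal sketch of this matrix form (NumPy assumed; np.linalg.lstsq is used rather than forming the explicit inverse, which solves the same normal equations more stably):

```python
import numpy as np

def ols_multiple(X_raw, y):
    """OLS for multiple regression: prepend an intercept column and
    compute beta_hat = (X'X)^{-1} X'y via a least squares solver."""
    n = X_raw.shape[0]
    X = np.column_stack([np.ones(n), X_raw])  # design matrix with 1s column
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_hat
```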

Example: Cirrhosis data (plots and output not reproduced in this transcription).