Lecture 2 Simple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: Chapter 1

Topic Overview In this topic we will cover: regression terminology, and simple linear regression with a single predictor variable.

Relationships Among Variables Functional Relationships: The value of the dependent variable Y can be computed exactly if we know the value of the independent variable X (e.g., Y = X). Statistical Relationships: Not a perfect or exact relationship. The expected value of the response variable Y is a function of the explanatory or predictor variable X; the observed value of Y is the expected value plus a random deviation.

Simple Linear Regression

Uses of SLR Why Use Simple Linear Regression? Descriptive/exploratory purposes (explore the strength of known cause/effect relationships). Administrative control (often the response variable is $$$). Prediction of outcomes (predict future needs; often overlaps with cost control).

Statistical Relationships vs. Causality Statistical relationships do not imply causality!!! Example: A Lafayette ice cream shop does more business on days when attendance at an Indianapolis swimming pool is high.

Data for Simple Linear Regression Observe pairs of variables; each pair $(X_i, Y_i)$ is called a case or a data point. $Y_i$ is the $i$th value of the response variable; $X_i$ is the $i$th value of the explanatory (or predictor) variable; in practice the value of $X_i$ is a known constant.

Simple Linear Regression Model Statement of Model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $i = 1, 2, \ldots, n$ and $\varepsilon_i \sim N(0, \sigma^2)$. Model parameters (unknown): $\beta_0$ = intercept (may not have meaning); $\beta_1$ = slope ($\beta_1 = 0$ if no relationship between X and Y); $\sigma^2$ is the error variance.

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, so the mean response is $E(Y_i) = \beta_0 + \beta_1 X_i$.
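The model above can be illustrated with a short simulation. This is a minimal sketch; the parameter values below are arbitrary choices for illustration, not values from the lecture. Each simulated $Y_i$ is its mean $\beta_0 + \beta_1 X_i$ plus a normal random deviation:

```python
import random

# Illustrative parameter values (hypothetical, not from the lecture)
beta0, beta1, sigma = 2.0, 0.5, 1.0
n = 100

random.seed(512)
x = [i / 10 for i in range(n)]  # X values treated as known constants
# Observed Y = expected value E(Y_i) plus a random N(0, sigma^2) deviation
y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]

mean_x = sum(x) / n
mean_y = sum(y) / n
# The sample mean of Y should land near the mean function at the average X
print(round(mean_x, 2), round(mean_y, 2))
```

With many observations, the random deviations average out, which is exactly the sense in which a statistical relationship is "the expected value plus noise."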

Interpretation of the Regression Coefficients $\beta_0$ is the expected value of the response variable when $X = 0$. $\beta_1$ represents the increase (or decrease, if negative) in the mean response for a 1-unit increase in the value of X.

Features of SLR Model Errors are independent, identically distributed normal random variables: $\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$. This implies that the $Y_i$ are independent with $Y_i \sim N(\beta_0 + \beta_1 X_i, \sigma^2)$. (See A.36, p. 1303, for the proof.)

Fitted Regression Equation The parameters $\beta_0$, $\beta_1$, $\sigma^2$ must be estimated from the data. Estimates are denoted $b_0$, $b_1$, $s^2$. The fitted (or estimated) regression line is $\hat{Y}_i = b_0 + b_1 X_i$. The hat symbol is used to differentiate the fitted value $\hat{Y}_i$ from the actual observed value $Y_i$.

Residuals The deviations (or errors) from the true regression line, $\varepsilon_i = Y_i - \beta_0 - \beta_1 X_i$, cannot be known since the regression parameters $\beta_0$ and $\beta_1$ are unknown. We may estimate them by the residuals: $e_i$ = Observed − Predicted $= Y_i - \hat{Y}_i = Y_i - b_0 - b_1 X_i$.

Error Terms vs. Residuals

Assumptions The model assumes that the error terms are independent, normal, and have constant variance. Residuals may be used to explore the legitimacy of these assumptions. More on this topic later.

Least Squares Estimation We want to find the best estimates $b_0$, $b_1$ for $\beta_0$, $\beta_1$. The best estimates minimize the sum of the squared residuals: $SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2$. To do this, use calculus (see pages 17, 18 of KNNL).

Least Squares Solution The LS estimate for $\beta_1$ can be written in terms of the sums of squares: $b_1 = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \dfrac{SS_{XY}}{SS_X}$. The LS estimate for $\beta_0$ is $b_0 = \bar{Y} - b_1 \bar{X}$.
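These formulas are easy to verify with a small from-scratch computation; the toy data below are made up for illustration (they are not the diamond data used later):

```python
# Toy data, made up for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Sums of squares from the slide
ss_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
ss_x = sum((xi - xbar) ** 2 for xi in x)

b1 = ss_xy / ss_x      # slope estimate
b0 = ybar - b1 * xbar  # intercept estimate

fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print(round(b0, 4), round(b1, 4))  # 0.05 1.99
```

A useful by-product of the calculus is that least squares residuals always sum to zero; `sum(residuals)` here is zero up to floating-point error.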

About the LS Estimates They are also the maximum likelihood estimates (see KNNL, pages 27-32). These are the best estimates because they are unbiased (their expectation is the parameter that they are estimating) and they have minimum variance among all such estimators. Big picture: we wouldn't want to use any other estimates, because we can do no better.

Mean Square Error We also need to estimate $\sigma^2$. This estimate is developed from the sum of the squared residuals (SSE) and the available degrees of freedom: $s^2 = MSE = \dfrac{SSE}{df_E} = \dfrac{\sum e_i^2}{n - 2}$. The error degrees of freedom are $n - 2$ because we have $n$ observations and have already estimated the two parameters $\beta_0$ and $\beta_1$.

Variance Notation $s^2 = MSE$ will always be the estimate for $\sigma^2$. This can be confusing, because there will be estimated variances for other quantities, and these will be denoted e.g. $s^2\{b_1\}$, $s^2\{b_0\}$, etc. These are not products, but single variance quantities. To avoid confusion, I will generally write MSE whenever referring to the estimate for $\sigma^2$.
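Continuing the same made-up toy data, a minimal sketch of estimating $\sigma^2$ by $MSE = SSE/(n-2)$:

```python
import math

# Same illustrative toy data as before (not the diamond data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
      / sum((xi - xbar) ** 2 for xi in x))
b0 = ybar - b1 * xbar

# SSE is the sum of squared residuals; dividing by n - 2 accounts for
# the two parameters (b0, b1) already estimated from the data
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)
s = math.sqrt(mse)  # estimated standard deviation, in the units of Y
print(round(sse, 4), round(mse, 4), round(s, 4))
```

Note that $s = \sqrt{MSE}$ is back on the scale of $Y$, which is why the Root MSE in regression output has a direct interpretation as a standard deviation.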

EXAMPLE: Diamond Rings Variables: Response variable ~ price in Singapore dollars (Y). Explanatory variable ~ weight of diamond in carats (X). Associated SAS file: diamonds.sas

SAS Regression Procedure
PROC REG data=diamonds;
  model price=weight;
RUN;

Output (1)
                 Sum of         Mean
Source     DF    Squares        Square
Model       1    2098596        2098596
Error      46      46636     1013.81886
Total      47    2145232

Root MSE = 31.8405

Output (2)
                     Parameter     Standard
Variable      DF      Estimate        Error
Intercept      1    -259.62591     17.31886
weight         1    3721.02485     81.78588

Output Summary From the output, we see that $b_0 = -259.6$, $b_1 = 3721.0$, $MSE = 1014$, $\sqrt{MSE} = 31.8$. Note that the Root MSE has a direct interpretation as the estimated standard deviation (in $$).
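As a sanity check on the output, the ANOVA entries are internally consistent: $MSE = SSE/df_E$ and Root MSE $= \sqrt{MSE}$, using the SSE of 46636 and error df of 46 printed above (SSE is rounded in the printout, so the last digits differ slightly):

```python
import math

# Error sum of squares and error df as printed in the SAS ANOVA table
sse, df_error = 46636, 46
mse = sse / df_error
root_mse = math.sqrt(mse)
# Close to the printed 1013.81886 and 31.8405 (SSE is rounded in the table)
print(round(mse, 2), round(root_mse, 4))
```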

Interpretations It doesn't really make sense to talk about a 1-carat increase, but we can change this to a 0.01-carat increase by dividing by 100. From $b_1$ we see that a 0.01-carat increase in the weight of a diamond leads to a $37.21 increase in the mean price. The interpretation of $b_0$ would be that one would actually be paid $260 to simply take a 0-carat diamond ring. Why doesn't this make sense?

Scope of Model The scope of a regression model is the range of X-values over which we actually have data. Using a model to look at X-values outside the scope of the model (extrapolation) is quite dangerous.


Prediction for 0.43 Carats Does this make sense in light of the previous discussion? Suppose we assume that it does. Then the mean price for a 0.43-carat ring can be computed as follows: $\hat{Y} = -260 + 3721(0.43) \approx 1340$. How confident would you be in this estimate?
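The point prediction is just plug-in arithmetic; a quick check using the rounded estimates $b_0 \approx -260$ and $b_1 \approx 3721$ from the diamond fit:

```python
# Rounded coefficient estimates from the diamond-ring fit
b0, b1 = -260.0, 3721.0

def predict(weight):
    """Plug-in point prediction of mean price (Singapore $) at a given carat weight."""
    return b0 + b1 * weight

print(round(predict(0.43), 2))  # about $1340, matching the slide
```

Note that `predict` will happily return a number for any weight at all; using it far outside the observed range of carat weights is exactly the extrapolation danger described under Scope of Model.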

Upcoming in Lecture 3... We will discuss more about inference concerning the regression coefficients. Background Reading: 2.1-2.6