Simple Linear Regression Analysis

LINEAR REGRESSION ANALYSIS
MODULE II, Lecture 6: Simple Linear Regression Analysis
Dr. Shalabh, Department of Mathematics and Statistics, Indian Institute of Technology Kanpur

Prediction of values of study variable

An important use of linear regression modeling is to predict the average and actual values of the study variable. "Prediction of the value of the study variable" means estimating $E(y)$ (the average value) or $y$ itself (the actual value) at a given value of the explanatory variable. We consider both cases.

Case 1: Prediction of average value

Under the linear regression model $y = \beta_0 + \beta_1 x + \varepsilon$, the fitted model is $\hat{y} = b_0 + b_1 x$, where $b_0$ and $b_1$ are the OLS estimators of $\beta_0$ and $\beta_1$, respectively. Suppose we want to predict the value of $E(y)$ at a given value $x = x_0$. Then the predictor is

$$\hat{\mu} = \widehat{E(y \mid x_0)} = b_0 + b_1 x_0.$$

Predictive bias

The prediction error is given as

$$\hat{\mu} - E(y) = b_0 + b_1 x_0 - E(\beta_0 + \beta_1 x_0 + \varepsilon) = b_0 + b_1 x_0 - (\beta_0 + \beta_1 x_0) = (b_0 - \beta_0) + (b_1 - \beta_1)\, x_0.$$

Then

$$E\left[\hat{\mu} - E(y)\right] = E(b_0 - \beta_0) + E(b_1 - \beta_1)\, x_0 = 0 + 0 = 0.$$

Thus the predictor $\hat{\mu}$ is an unbiased predictor of $E(y)$.
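As a concrete illustration (not part of the original lecture), the following minimal Python sketch fits the model by OLS and computes the average-value predictor $\hat{\mu} = b_0 + b_1 x_0$; the simulated data, the true parameter values, and the point $x_0$ are all assumptions made for the example.

```python
# Minimal sketch: OLS fit and the average-value predictor mu_hat = b0 + b1*x0.
# The data-generating values (beta0 = 2, beta1 = 0.5, n = 30) are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n = 30
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=n)

xbar, ybar = x.mean(), y.mean()
sxx = np.sum((x - xbar) ** 2)               # corrected sum of squares of x
b1 = np.sum((x - xbar) * (y - ybar)) / sxx  # OLS slope estimator
b0 = ybar - b1 * xbar                       # OLS intercept estimator

x0 = 4.0                                    # point at which E(y) is predicted
mu_hat = b0 + b1 * x0                       # unbiased predictor of E(y | x0)
print(f"mu_hat at x0={x0}: {mu_hat:.3f}")
```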

Predictive variance

The predictive variance of $\hat{\mu}$ is

$$PV(\hat{\mu}) = \mathrm{Var}(b_0 + b_1 x_0) = \mathrm{Var}\left[\bar{y} + b_1 (x_0 - \bar{x})\right]$$

$$= \mathrm{Var}(\bar{y}) + (x_0 - \bar{x})^2\, \mathrm{Var}(b_1) + 2 (x_0 - \bar{x})\, \mathrm{Cov}(\bar{y}, b_1)$$

$$= \frac{\sigma^2}{n} + \frac{\sigma^2 (x_0 - \bar{x})^2}{s_{xx}} + 0 = \sigma^2 \left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right],$$

using $b_0 = \bar{y} - b_1 \bar{x}$, $\mathrm{Var}(\bar{y}) = \sigma^2/n$, $\mathrm{Var}(b_1) = \sigma^2/s_{xx}$ and $\mathrm{Cov}(\bar{y}, b_1) = 0$.

Estimate of predictive variance

The predictive variance can be estimated by substituting $\sigma^2$ by $\hat{\sigma}^2 = MSE$ as

$$\widehat{PV}(\hat{\mu}) = \hat{\sigma}^2 \left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right] = MSE \left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right].$$
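Continuing the sketch above (reusing x, y, b0, b1, xbar, sxx, n, and x0 from the earlier block), the plug-in estimate of this predictive variance could be computed as:

```python
# Continuing the sketch: estimate PV(mu_hat) by substituting sigma^2 with MSE.
residuals = y - (b0 + b1 * x)
mse = np.sum(residuals ** 2) / (n - 2)           # unbiased estimator of sigma^2
pv_mu = mse * (1.0 / n + (x0 - xbar) ** 2 / sxx)
print(f"estimated PV(mu_hat): {pv_mu:.4f}")
```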

Prediction interval estimation

The $100(1-\alpha)\%$ prediction interval for $E(y \mid x_0)$ is obtained as follows. The predictor $\hat{\mu}$ is a linear combination of normally distributed random variables, so it is also normally distributed:

$$\hat{\mu} \sim N\left(\beta_0 + \beta_1 x_0,\; PV(\hat{\mu})\right).$$

So if $\sigma^2$ is known, the distribution of

$$\frac{\hat{\mu} - E(y \mid x_0)}{\sqrt{PV(\hat{\mu})}}$$

is $N(0, 1)$, and the $100(1-\alpha)\%$ prediction interval is obtained from

$$P\left(-z_{\alpha/2} \le \frac{\hat{\mu} - E(y \mid x_0)}{\sqrt{PV(\hat{\mu})}} \le z_{\alpha/2}\right) = 1 - \alpha,$$

which gives the prediction interval for $E(y \mid x_0)$ as

$$\left[\hat{\mu} - z_{\alpha/2}\, \sigma \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}},\;\; \hat{\mu} + z_{\alpha/2}\, \sigma \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}}\right].$$

When $\sigma^2$ is unknown, it is replaced by $\hat{\sigma}^2 = MSE$, and in this case the sampling distribution of

$$\frac{\hat{\mu} - E(y \mid x_0)}{\sqrt{MSE\left(\dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{s_{xx}}\right)}}$$

is the $t$-distribution with $(n - 2)$ degrees of freedom, i.e., $t_{n-2}$. The $100(1-\alpha)\%$ prediction interval in this case follows from

$$P\left(-t_{\alpha/2,\, n-2} \le \frac{\hat{\mu} - E(y \mid x_0)}{\sqrt{MSE\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right)}} \le t_{\alpha/2,\, n-2}\right) = 1 - \alpha,$$

which gives the prediction interval as

$$\left[\hat{\mu} - t_{\alpha/2,\, n-2} \sqrt{MSE\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right)},\;\; \hat{\mu} + t_{\alpha/2,\, n-2} \sqrt{MSE\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right)}\right].$$

Note that the width of the prediction interval for $E(y \mid x_0)$ is a function of $x_0$. The interval width is minimum at $x_0 = \bar{x}$ and widens as $|x_0 - \bar{x}|$ increases. This is expected, since the best estimates of $y$ are made at $x$-values lying near the center of the data, and the precision of estimation deteriorates as we move toward the boundary of the $x$-space.
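Still within the same illustrative sketch (reusing mu_hat and pv_mu), the $t$-based interval for $E(y \mid x_0)$ might be computed as follows; scipy.stats is used only to obtain the $t$ quantile.

```python
# Continuing the sketch: 100(1-alpha)% interval for E(y | x0), sigma^2 unknown.
from scipy import stats

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)  # t_{alpha/2, n-2} quantile
half_width = t_crit * np.sqrt(pv_mu)
print(f"interval for E(y|x0): ({mu_hat - half_width:.3f}, {mu_hat + half_width:.3f})")
```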

Case 2: Prediction of actual value

If $x_0$ is the given value of the explanatory variable, then the predictor of the actual value of $y$ is

$$\hat{y} = b_0 + b_1 x_0.$$

Note that the form of this predictor is the same as that of the average-value predictor, but its prediction error and other properties are different. This is the dual nature of the predictor.

Predictive bias

The prediction error of $\hat{y}$ is given as

$$\hat{y} - y = b_0 + b_1 x_0 - (\beta_0 + \beta_1 x_0 + \varepsilon) = (b_0 - \beta_0) + (b_1 - \beta_1)\, x_0 - \varepsilon.$$

Thus, we find that

$$E(\hat{y} - y) = E(b_0 - \beta_0) + E(b_1 - \beta_1)\, x_0 - E(\varepsilon) = 0 + 0 + 0 = 0,$$

which implies that $\hat{y}$ is an unbiased predictor of $y$.

Predictive variance

Because the future observation $y$ is independent of $\hat{y}$, the predictive variance of $\hat{y}$ is

$$PV(\hat{y}) = E(\hat{y} - y)^2 = E\left[(b_0 - \beta_0) + (x_0 - \bar{x})(b_1 - \beta_1) + \bar{x}(b_1 - \beta_1) - \varepsilon\right]^2$$

$$= \mathrm{Var}(b_0) + (x_0 - \bar{x})^2\, \mathrm{Var}(b_1) + \bar{x}^2\, \mathrm{Var}(b_1) + \mathrm{Var}(\varepsilon) + 2 (x_0 - \bar{x})\, \mathrm{Cov}(b_0, b_1) + 2 \bar{x}\, \mathrm{Cov}(b_0, b_1) + 2 (x_0 - \bar{x})\, \bar{x}\, \mathrm{Var}(b_1)$$

[the remaining cross terms are 0, assuming the independence of $\varepsilon$ with $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$]

$$= \mathrm{Var}(b_0) + \left[(x_0 - \bar{x})^2 + \bar{x}^2 + 2 (x_0 - \bar{x})\, \bar{x}\right] \mathrm{Var}(b_1) + \mathrm{Var}(\varepsilon) + 2 \left[(x_0 - \bar{x}) + \bar{x}\right] \mathrm{Cov}(b_0, b_1)$$

$$= \mathrm{Var}(b_0) + x_0^2\, \mathrm{Var}(b_1) + \mathrm{Var}(\varepsilon) + 2 x_0\, \mathrm{Cov}(b_0, b_1)$$

$$= \sigma^2 \left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right) + \frac{x_0^2\, \sigma^2}{s_{xx}} + \sigma^2 - \frac{2 x_0\, \bar{x}\, \sigma^2}{s_{xx}}$$

$$= \sigma^2 \left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right].$$

Estimate of predictive variance

The estimate of the predictive variance can be obtained by replacing $\sigma^2$ by its estimate $\hat{\sigma}^2 = MSE$ as

$$\widehat{PV}(\hat{y}) = \hat{\sigma}^2 \left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right] = MSE \left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right].$$
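In the running sketch, the corresponding estimate differs from pv_mu only by the extra MSE term accounting for the noise in the future observation:

```python
# Continuing the sketch: estimated predictive variance for the actual value.
# Note the extra "1 +" term relative to pv_mu, reflecting Var(epsilon).
pv_y = mse * (1.0 + 1.0 / n + (x0 - xbar) ** 2 / sxx)
print(f"estimated PV(y_hat): {pv_y:.4f}  (> {pv_mu:.4f} = PV(mu_hat))")
```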

Prediction interval

If $\sigma^2$ is known, then the distribution of

$$\frac{\hat{y} - y}{\sqrt{PV(\hat{y})}}$$

is $N(0, 1)$. So the $100(1-\alpha)\%$ prediction interval is obtained from

$$P\left(-z_{\alpha/2} \le \frac{\hat{y} - y}{\sqrt{PV(\hat{y})}} \le z_{\alpha/2}\right) = 1 - \alpha,$$

which gives the prediction interval for $y$ as

$$\left[\hat{y} - z_{\alpha/2}\, \sigma \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}},\;\; \hat{y} + z_{\alpha/2}\, \sigma \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}}\right].$$

When $\sigma^2$ is unknown, then

$$\frac{\hat{y} - y}{\sqrt{\widehat{PV}(\hat{y})}}$$

follows a $t$-distribution with $(n - 2)$ degrees of freedom.

The $100(1-\alpha)\%$ prediction interval for $y$ in this case is obtained from

$$P\left(-t_{\alpha/2,\, n-2} \le \frac{\hat{y} - y}{\sqrt{\widehat{PV}(\hat{y})}} \le t_{\alpha/2,\, n-2}\right) = 1 - \alpha,$$

which gives the prediction interval

$$\left[\hat{y} - t_{\alpha/2,\, n-2} \sqrt{MSE\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right)},\;\; \hat{y} + t_{\alpha/2,\, n-2} \sqrt{MSE\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}}\right)}\right].$$

The prediction interval is of minimum width at $x_0 = \bar{x}$ and widens as $|x_0 - \bar{x}|$ increases.

The prediction interval for $\hat{y}$ is wider than the prediction interval for $\hat{\mu}$ because the prediction interval for $\hat{y}$ depends on both the error from the fitted model and the error associated with the future observation.
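Completing the running sketch, the actual-value interval uses the same point predictor as $\hat{\mu}$ but the larger variance estimate pv_y, so it comes out strictly wider:

```python
# Continuing the sketch: 100(1-alpha)% prediction interval for the actual y at x0.
y_hat = b0 + b1 * x0                  # same point value as mu_hat (dual nature)
half_width_y = t_crit * np.sqrt(pv_y)
print(f"interval for y at x0: ({y_hat - half_width_y:.3f}, {y_hat + half_width_y:.3f})")
print(f"wider than the E(y|x0) interval: {half_width_y > half_width}")  # True
```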

Reverse regression method

The reverse (or inverse) regression approach minimizes the sum of squares of horizontal distances between the observed data points and the fitted line in the scatter diagram to obtain the estimates of the regression parameters.

Reverse regression has been advocated in the analysis of sex (or race) discrimination in salaries. For example, suppose $y$ denotes salary and $x$ denotes qualifications, and we are interested in determining whether there is sex discrimination in salaries. We can ask: do men and women with the same qualifications (value of $x$) get the same salaries (value of $y$)? This question is answered by direct regression. Alternatively, we can ask: do men and women with the same salaries (value of $y$) have the same qualifications (value of $x$)? This question is answered by reverse regression, i.e., the regression of $x$ on $y$.

The regression equation in the case of reverse regression can be written as

$$x_i = \beta_0^* + \beta_1^* y_i + \delta_i, \qquad i = 1, 2, \ldots, n,$$

where the $\delta_i$'s are the associated random error components and satisfy the same assumptions as in the usual simple linear regression model.

The reverse regression estimates $\hat{\beta}_{0R}$ of $\beta_0^*$ and $\hat{\beta}_{1R}$ of $\beta_1^*$ for this model are obtained by interchanging $x$ and $y$ in the direct regression estimators of $\beta_0$ and $\beta_1$. The estimates are

$$\hat{\beta}_{0R} = \bar{x} - \hat{\beta}_{1R}\, \bar{y} \qquad \text{and} \qquad \hat{\beta}_{1R} = \frac{s_{xy}}{s_{yy}}$$

for $\beta_0^*$ and $\beta_1^*$, respectively. The residual sum of squares in this case is

$$SS^*_{res} = s_{xx} - \frac{s_{xy}^2}{s_{yy}}.$$

Note that

$$\hat{\beta}_{1R}\, b_1 = \frac{s_{xy}}{s_{yy}} \cdot \frac{s_{xy}}{s_{xx}} = r_{xy}^2,$$

where $b_1$ is the direct regression estimator of the slope parameter and $r_{xy}$ is the correlation coefficient between $x$ and $y$. Hence if $r_{xy}^2$ is close to 1, the two regression lines will be close to each other.

An important application of the reverse regression method is in solving the calibration problem.
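The identity $\hat{\beta}_{1R}\, b_1 = r_{xy}^2$ is easy to verify numerically. The following continuation of the earlier sketch fits the reverse regression of x on y on the same simulated data and checks it.

```python
# Continuing the sketch: reverse regression of x on y, and the identity
# b1R * b1 = r_xy^2 (so the two fitted lines coincide as |r_xy| -> 1).
syy = np.sum((y - ybar) ** 2)
sxy = np.sum((x - xbar) * (y - ybar))
b1R = sxy / syy                       # reverse-regression slope estimate
b0R = xbar - b1R * ybar               # reverse-regression intercept estimate
ss_res_star = sxx - sxy ** 2 / syy    # residual sum of squares, reverse fit
r2 = sxy ** 2 / (sxx * syy)           # squared correlation coefficient
print(np.isclose(b1R * b1, r2))       # True: b1R * b1 equals r_xy^2
```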