Regression Models - Introduction

Regression Models - Introduction

In regression models there are two types of variables that are studied: a dependent variable, Y, also called the response variable, which is modeled as random; and an independent variable, X, also called the predictor or explanatory variable, which is sometimes modeled as random and sometimes has a fixed value for each observation. In regression models we fit a statistical model to data. We generally use regression to predict the value of one variable given the value of others. (STA261, week 13)

Simple Linear Regression - Introduction

Simple linear regression studies the relationship between a quantitative response variable Y and a single explanatory variable X. The idea of a statistical model: actual observed value of Y = systematic (model) part + random error. Box (a well-known statistician) claimed: "All models are wrong, but some are useful." Useful means that they describe the data well and can be used for predictions and inferences. Recall: parameters are constants in a statistical model which we usually don't know but will use data to estimate.

Simple Linear Regression Models

The statistical model for simple linear regression is a straight-line model of the form Y = β_0 + β_1·X. For particular points, Y_i = β_0 + β_1·X_i + ε_i, i = 1, ..., n. We expect that different values of X will produce different mean responses. In particular, for each value of X, the possible values of Y follow a distribution whose mean is β_0 + β_1·X. Formally, it means that E(Y | X) = β_0 + β_1·X.

Estimation - Least Squares Method

Estimates of the unknown parameters β_0 and β_1 based on our observed data are usually denoted by b_0 and b_1. For each observed value x_i of X the fitted value of Y is ŷ_i = b_0 + b_1·x_i. This is the equation of a straight line. The deviations from the line in the vertical direction are the errors in the prediction of Y and are called residuals; they are defined as e_i = y_i − ŷ_i. The estimates b_0 and b_1 are found by the method of least squares, which is based on minimizing the sum of squares of the residuals. Note, the least-squares estimates are found without making any statistical assumptions about the data.

Derivation of Least-Squares Estimates

Let S = Σ_{i=1}^{n} (y_i − b_0 − b_1·x_i)². We want to find the b_0 and b_1 that minimize S. Use calculus.
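Setting the two partial derivatives of S to zero and solving the resulting normal equations gives the familiar closed forms b_1 = S_XY/S_XX and b_0 = ȳ − b_1·x̄. A minimal sketch of this computation (the helper name `least_squares_fit` is ours, not from the slides):

```python
import numpy as np

def least_squares_fit(x, y):
    """Closed-form least-squares estimates b0, b1 for simple linear regression."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    x_bar, y_bar = x.mean(), y.mean()
    s_xx = np.sum((x - x_bar) ** 2)           # S_XX
    s_xy = np.sum((x - x_bar) * (y - y_bar))  # S_XY
    b1 = s_xy / s_xx                          # slope estimate
    b0 = y_bar - b1 * x_bar                   # intercept estimate
    return b0, b1

# On exactly collinear data the line y = 1 + 2x is recovered exactly:
b0, b1 = least_squares_fit([1, 2, 3, 4], [3, 5, 7, 9])
```

Note that, as the slides emphasize, no statistical assumptions are used here: the formulas are a pure minimization result.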

Statistical Assumptions for SLR

Recall, the simple linear regression model is Y_i = β_0 + β_1·X_i + ε_i, where i = 1, ..., n. The assumptions for the simple linear regression model are:
1) E(ε_i) = 0
2) Var(ε_i) = σ²
3) the ε_i's are uncorrelated.
These assumptions are also called the Gauss-Markov conditions. The above assumptions can also be stated in terms of the Y_i's.

Possible Violations of Assumptions

- The straight-line model is inappropriate.
- Var(Y_i) increases with X_i.
- The linear model is not appropriate for all the data.

Properties of Least Squares Estimates

The least-squares estimates b_0 and b_1 are linear in the Y_i's. That is, there exist constants c_i, d_i such that b_0 = Σ c_i·Y_i and b_1 = Σ d_i·Y_i. Proof: exercise. The least-squares estimates are unbiased estimators of β_0 and β_1.
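The linearity claim can be checked numerically: writing b_1 = Σ(x_i − x̄)(y_i − ȳ)/S_XX and noting that Σ(x_i − x̄)·ȳ = 0 gives d_i = (x_i − x̄)/S_XX. A small sketch verifying that this linear combination reproduces b_1 (the data here are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 4.0, 7.0])
y = rng.normal(size=4)  # arbitrary responses

s_xx = np.sum((x - x.mean()) ** 2)
d = (x - x.mean()) / s_xx  # the constants d_i from the slide

# b1 computed the usual way, and as the linear combination sum_i d_i * Y_i:
b1_direct = np.sum((x - x.mean()) * (y - y.mean())) / s_xx
b1_linear = d @ y

# The two agree (up to floating point), demonstrating b1 is linear in the Y_i's.
```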

Gauss-Markov Theorem

The least-squares estimates are BLUE (Best Linear Unbiased Estimators): of all the possible linear, unbiased estimators of β_0 and β_1, the least-squares estimates have the smallest variance. The variances of the least-squares estimates are Var(b_0) = σ²(1/n + X̄²/S_XX) and Var(b_1) = σ²/S_XX.

Estimation of the Error Term Variance σ²

The variance σ² of the error terms ε_i needs to be estimated to obtain an indication of the variability of the probability distribution of Y. Further, a variety of inferences concerning the regression function and the prediction of Y require an estimate of σ². Recall, for a random variable Z the estimates of the mean and variance of Z based on n realizations of Z are the sample mean and sample variance. Similarly, the estimate of σ² is

s² = (1/(n − 2)) Σ_{i=1}^{n} e_i².

S² is called the MSE (Mean Square Error); it is an unbiased estimator of σ².
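A minimal sketch of this estimate (the helper name `mse` is ours): fit the line, form the residuals e_i = y_i − ŷ_i, and divide the residual sum of squares by n − 2, since two parameters were estimated.

```python
import numpy as np

def mse(x, y):
    """Unbiased estimate s^2 = SSE / (n - 2) of the error variance sigma^2."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)          # residuals e_i = y_i - yhat_i
    return np.sum(resid ** 2) / (n - 2)  # n - 2: two parameters estimated
```

For perfectly collinear data the residuals are all zero, so the estimate is 0; otherwise it scales the scatter about the fitted line.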

Normal Error Regression Model

In order to make inferences we need one more assumption about the ε_i's: we assume that the ε_i's have a Normal distribution, that is, ε_i ~ N(0, σ²). The Normality assumption implies that the errors ε_i are independent (since they are uncorrelated). Under the Normality assumption on the errors, the least-squares estimates of β_0 and β_1 are equivalent to their maximum likelihood estimators. This yields the additional nice properties of MLEs: they are consistent, sufficient, and MVUE.

Inference about the Slope and Intercept

Recall, we have established that the least-squares estimates b_0 and b_1 are linear combinations of the Y_i's. Further, we have shown that they are unbiased and have the following variances:

Var(b_0) = σ²(1/n + X̄²/S_XX) and Var(b_1) = σ²/S_XX.

In order to make inferences we assume that the ε_i's have a Normal distribution, that is, ε_i ~ N(0, σ²). This in turn means that the Y_i's are normally distributed. Since both b_0 and b_1 are linear combinations of the Y_i's, they also have a Normal distribution.

Inference for β_1 in the Normal Error Regression Model

The least-squares estimate of β_1 is b_1; because it is a linear combination of normally distributed random variables (the Y_i's) we have the following result:

b_1 ~ N(β_1, σ²/S_XX).

We estimate the variance of b_1 by S²/S_XX, where S² is the MSE, which has n − 2 df. Claim: the distribution of (b_1 − β_1)/(S/√S_XX) is t with n − 2 df. Proof:

Tests and CIs for β_1

The hypothesis of interest about the slope in a Normal linear regression model is H_0: β_1 = 0. The test statistic for this hypothesis is

t_stat = b_1 / (S/√S_XX) = b_1 / SE(b_1).

We compare the above test statistic to a t distribution with n − 2 df to obtain the P-value. Further, a 100(1 − α)% CI for β_1 is

b_1 ± t_{n−2; α/2} · S/√S_XX, i.e., b_1 ± t_{n−2; α/2} · SE(b_1).
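The test and interval above can be sketched end to end. This is an illustrative helper of ours (assuming SciPy is available for the t distribution), not code from the course:

```python
import numpy as np
from scipy import stats

def slope_inference(x, y, alpha=0.05):
    """t statistic, two-sided P-value, and 100(1-alpha)% CI for the slope."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    s_xx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / s_xx
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)
    s = np.sqrt(np.sum(resid ** 2) / (n - 2))  # sqrt of the MSE
    se_b1 = s / np.sqrt(s_xx)                  # SE(b1)
    t_stat = b1 / se_b1                        # tests H0: beta1 = 0
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return b1, t_stat, p_value, ci

# Example on a small made-up data set:
b1, t_stat, p, ci = slope_inference([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

With only n − 2 = 3 degrees of freedom here, the interval is wide; as usual, the CI excludes 0 exactly when the two-sided test rejects H_0 at level α.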

Important Comment

Similar results can be obtained for the intercept in a Normal linear regression model. However, in many cases the intercept does not have any practical meaning, and then it is not necessary to make inferences about it.