Weighted Least Squares

Weighted Least Squares

The standard linear model assumes that $\mathrm{Var}(\varepsilon_i) = \sigma^2$ for $i = 1, \ldots, n$. As we have seen, however, there are instances where
$$\mathrm{Var}(Y \mid X = x_i) = \mathrm{Var}(\varepsilon_i) = \frac{\sigma^2}{w_i}.$$
Here $w_1, \ldots, w_n$ are known positive constants. Weighted least squares is an estimation technique which weights each observation in proportion to the reciprocal of the error variance for that observation, and so overcomes the issue of non-constant variance.

Weighted Least Squares Solution

Let $W$ be a diagonal matrix with diagonal elements equal to $w_1, \ldots, w_n$. Then the weighted residual sum of squares is defined by
$$S_w(\beta) = \sum_{i=1}^{n} w_i (y_i - x_i^t \beta)^2 = (Y - X\beta)^t W (Y - X\beta).$$
Weighted least squares finds estimates of $\beta$ by minimizing this weighted sum of squares. The general solution is
$$\hat{\beta} = (X^t W X)^{-1} X^t W Y.$$
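
As a small illustration, the closed-form solution above can be computed directly with NumPy. This is a minimal sketch on simulated data; the sample size, coefficients, and weights below are invented purely for the example.

```python
import numpy as np

# Minimal sketch of the weighted least squares solution
# beta_hat = (X^t W X)^{-1} X^t W Y on simulated data (all values invented).
rng = np.random.default_rng(0)
n = 50
x = rng.uniform(1, 10, size=n)
X = np.column_stack([np.ones(n), x])                  # design matrix with an intercept
w = 1.0 / x**2                                        # assumed known weights
y = X @ np.array([2.0, 0.5]) + rng.normal(scale=x)    # Var(eps_i) = x_i^2 * sigma^2

W = np.diag(w)
beta_hat = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)  # (X^t W X)^{-1} X^t W Y
print(beta_hat)
```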

Weighted Least Squares as a Transformation

Recall from the previous chapter the model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ where $\mathrm{Var}(\varepsilon_i) = x_i^2 \sigma^2$. We can transform this into a regular least squares problem by taking
$$y_i' = \frac{y_i}{x_i}, \qquad x_i' = \frac{1}{x_i}, \qquad \varepsilon_i' = \frac{\varepsilon_i}{x_i}.$$
Then the model is $y_i' = \beta_1 + \beta_0 x_i' + \varepsilon_i'$ where $\mathrm{Var}(\varepsilon_i') = \sigma^2$.

Weighted Least Squares as a Transformation

The residual sum of squares for the transformed model is
$$S_1(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i' - \beta_1 - \beta_0 x_i')^2 = \sum_{i=1}^{n} \left( \frac{y_i}{x_i} - \beta_1 - \beta_0 \frac{1}{x_i} \right)^2 = \sum_{i=1}^{n} \frac{1}{x_i^2} (y_i - \beta_0 - \beta_1 x_i)^2.$$
This is the weighted residual sum of squares with $w_i = 1/x_i^2$. Hence the weighted least squares solution is the same as the regular least squares solution for the transformed model.
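
The equivalence can be checked numerically. The sketch below, with invented data, fits ordinary least squares to the transformed variables (note the swapped roles of intercept and slope) and compares the result with weighted least squares on the original variables using $w_i = 1/x_i^2$.

```python
import numpy as np

# Sketch: OLS on the transformed variables equals WLS on the original data.
rng = np.random.default_rng(1)
x = rng.uniform(1, 10, size=100)
y = 2.0 + 0.5 * x + rng.normal(scale=x)        # Var(eps_i) = x_i^2 * sigma^2

# Transformed model y' = beta_1 + beta_0 x' + eps': intercept and slope swap.
y_t, x_t = y / x, 1.0 / x
b1_hat, b0_hat = np.linalg.lstsq(np.column_stack([np.ones_like(x_t), x_t]),
                                 y_t, rcond=None)[0]

# Weighted least squares on the original variables with w_i = 1/x_i^2.
X = np.column_stack([np.ones_like(x), x])
W = np.diag(1.0 / x**2)
b0_w, b1_w = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print((b0_hat, b1_hat), (b0_w, b1_w))          # the two pairs agree
```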

Weighted Least Squares as a Transformation

In general, suppose we have the linear model $Y = X\beta + \varepsilon$ where $\mathrm{Var}(\varepsilon) = W^{-1}\sigma^2$. Let $W^{1/2}$ be a diagonal matrix with diagonal entries equal to $\sqrt{w_i}$. Then we have
$$\mathrm{Var}(W^{1/2}\varepsilon) = \sigma^2 I_n.$$

Weighted Least Squares as a Transformation

Hence we consider the transformation
$$Y^* = W^{1/2} Y, \qquad X^* = W^{1/2} X, \qquad \varepsilon^* = W^{1/2} \varepsilon.$$
This gives rise to the usual least squares model
$$Y^* = X^* \beta + \varepsilon^*.$$
Using the results for regular least squares we then get the solution
$$\hat{\beta} = (X^{*t} X^*)^{-1} X^{*t} Y^* = (X^t W X)^{-1} X^t W Y.$$
Hence this is the weighted least squares solution.
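
The same equivalence holds with several predictors. Below is a hedged sketch: $X$, $y$, and $w$ are simulated placeholders, and the final check confirms that ordinary least squares on the $W^{1/2}$-transformed data reproduces $(X^t W X)^{-1} X^t W Y$.

```python
import numpy as np

# Sketch: premultiplying by W^{1/2} and running OLS reproduces the WLS solution.
rng = np.random.default_rng(2)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
w = rng.uniform(0.5, 2.0, size=n)                        # known positive weights
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=1.0 / np.sqrt(w))

W_half = np.diag(np.sqrt(w))
X_star, y_star = W_half @ X, W_half @ y                  # transformed data
beta_star = np.linalg.lstsq(X_star, y_star, rcond=None)[0]

W = np.diag(w)
beta_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print(np.allclose(beta_star, beta_wls))                  # True
```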

The Weights

To apply weighted least squares, we need to know the weights $w_1, \ldots, w_n$. There are some instances where these are known. We may have a probabilistic model for $\mathrm{Var}(Y \mid X = x_i)$, in which case we would use this model to find the $w_i$. Weighted least squares may also be used to downweight or omit some observations from the fitted model.

The Weights

Another common case is where each observation is not a single measurement but an average of $n_i$ actual measurements, and the original measurements each have variance $\sigma^2$. In that case, standard results tell us that
$$\mathrm{Var}(\varepsilon_i) = \mathrm{Var}(Y_i \mid X = x_i) = \frac{\sigma^2}{n_i}.$$
Thus we would use weighted least squares with weights $w_i = n_i$. This situation often occurs in cluster surveys.
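
For instance, if each response is the average of $n_i$ raw measurements, the fit can use $w_i = n_i$ directly. The sketch below uses invented replicate counts and coefficients.

```python
import numpy as np

# Sketch: responses are averages of n_i raw measurements with common variance
# sigma^2, so Var(y_i) = sigma^2 / n_i and the weights are w_i = n_i.
rng = np.random.default_rng(3)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n_i = np.array([3, 10, 2, 8, 5])               # replicates behind each average
sigma = 1.5
y_bar = 2.0 + 0.5 * x + rng.normal(scale=sigma / np.sqrt(n_i))

X = np.column_stack([np.ones_like(x), x])
W = np.diag(n_i.astype(float))                 # weights w_i = n_i
beta_hat = np.linalg.solve(X.T @ W @ X, X.T @ W @ y_bar)
print(beta_hat)
```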

Unknown Weights

In many real-life situations the weights are not known a priori. In such cases we need to estimate the weights in order to use weighted least squares. One way to do this is possible when there are multiple repeated observations at each value of the covariate vector. That is often the case in designed experiments, in which a number of replicates are observed for each set value of the covariate vector. We can then estimate the variance of $Y$ for each fixed covariate vector and use this to estimate the weights.

Pure Error

Suppose that we have $n_j$ observations at $x = x_j$, $j = 1, \ldots, k$. Then a fitted model could be
$$y_{ij} = \beta_0 + \beta_1 x_j + \varepsilon_{ij}, \qquad i = 1, \ldots, n_j; \; j = 1, \ldots, k.$$
The $(i, j)$th residual can then be written as
$$e_{ij} = y_{ij} - \hat{y}_{ij} = (y_{ij} - \bar{y}_j) + (\bar{y}_j - \hat{y}_{ij}).$$
The first term is referred to as the pure error.

Pure Error

Note that the pure error term does not depend on the mean model at all. We can use the pure error mean squares to estimate the variance at each value of $x$:
$$s_j^2 = \frac{1}{n_j - 1} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2, \qquad j = 1, \ldots, k.$$
Then we can use $w_{ij} = 1/s_j^2$ ($i = 1, \ldots, n_j$) as the weights in a weighted least squares regression model.
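
A sketch of this replicate-based estimation, on invented data with six replicates at each of four $x$ values: the sample variance at each level supplies $s_j^2$, and its reciprocal supplies the weight for every observation at that level.

```python
import numpy as np

# Sketch: pure-error variance estimates from replicates used as WLS weights.
rng = np.random.default_rng(4)
x_levels = np.array([1.0, 2.0, 3.0, 4.0])
n_reps = 6
x = np.repeat(x_levels, n_reps)
y = 2.0 + 0.5 * x + rng.normal(scale=0.5 * x)          # variance grows with x

# s_j^2 at each level of x, then w_ij = 1 / s_j^2 for every replicate at x_j.
s2 = np.array([y[x == xj].var(ddof=1) for xj in x_levels])
w = 1.0 / np.repeat(s2, n_reps)

X = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.solve(X.T @ np.diag(w) @ X, X.T @ np.diag(w) @ y)
print(beta_hat)
```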

Unknown Weights

For observational studies, however, this is generally not possible. This is particularly true when there are multiple covariates. In such instances we may nonetheless be willing to assume that the variance of the observations is the same within each level of a categorical variable but possibly different between levels. In that case we can base the weights assigned to observations at a given level on an estimate of the variance for that level of the categorical variable. This leads to a two-stage method of estimation.

Two-Stage Estimation

In the two-stage estimation procedure we first fit a regular least squares regression to the data. If there is some evidence of non-homogeneous variance, then we examine plots of the residuals against a categorical variable which we suspect is the culprit for this problem. Note that we still need some a priori knowledge of a categorical variable likely to affect the variance. This categorical variable may, or may not, be included in the mean model.

Two-Stage Estimation

Let $Z$ be the categorical variable and assume that there are $n_j$ observations with $Z = j$ ($j = 1, \ldots, k$). If the error variability does vary with the levels of this categorical variable, then we can use
$$\hat{\sigma}_j^2 = \frac{1}{n_j} \sum_{i : z_i = j} e_i^2$$
as an estimate of the variability when $Z = j$. We could then use the reciprocals of these estimates as the weights in a weighted least squares regression in the second stage.
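
A compact sketch of the two stages, with a hypothetical three-level categorical variable $Z$ and simulated data: fit ordinary least squares, estimate $\hat{\sigma}_j^2$ from the squared residuals within each level, then refit with weights $1/\hat{\sigma}_j^2$.

```python
import numpy as np

# Sketch of two-stage estimation with a categorical variance factor Z.
rng = np.random.default_rng(5)
n = 90
x = rng.uniform(0, 10, size=n)
z = rng.integers(0, 3, size=n)                  # categorical variable, 3 levels
y = 1.0 + 0.8 * x + rng.normal(scale=np.array([0.5, 1.5, 3.0])[z])
X = np.column_stack([np.ones(n), x])

# Stage 1: ordinary least squares and its residuals.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta_ols

# Stage 2: sigma_hat_j^2 = average squared residual within level j,
# then weighted least squares with w_i = 1 / sigma_hat_{z_i}^2.
sigma2_hat = np.array([np.mean(e[z == j] ** 2) for j in range(3)])
w = 1.0 / sigma2_hat[z]
W = np.diag(w)
beta_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print(beta_ols, beta_wls)
```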