Estimating Estimable Functions of β

Copyright © 2012 Dan Nettleton (Iowa State University)


The Response Depends on β Only through Xβ

In the Gauss-Markov or Normal Theory Gauss-Markov Linear Model, the distribution of y depends on β only through Xβ, i.e.,

\[ y \sim (X\beta, \sigma^2 I) \quad \text{or} \quad y \sim N(X\beta, \sigma^2 I). \]

If X is not of full column rank, there are infinitely many vectors in the set {b : Xb = Xβ} for any fixed value of β. Thus, no matter what the value of E(y), there will be infinitely many vectors b such that Xb = E(y) when X is not of full column rank.

The response vector y can help us learn about E(y) = Xβ, but when X is not of full column rank, there is no hope of learning about β alone unless additional information about β is available.
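To make the non-uniqueness concrete, here is a minimal numpy sketch (my addition, not part of the original slides). The design matrix X is the one from the treatment effects example on the next slide; adding any multiple of a null-space vector of X to β leaves Xβ unchanged.

```python
import numpy as np

# Design matrix for the two-treatment example below:
# columns are intercept, treatment 1 indicator, treatment 2 indicator.
X = np.array([[1, 1, 0], [1, 1, 0], [1, 1, 0],
              [1, 0, 1], [1, 0, 1], [1, 0, 1]], dtype=float)

beta = np.array([5.0, -1.0, 1.0])   # one arbitrary choice of beta
n = np.array([1.0, -1.0, -1.0])     # spans the null space of X: X @ n = 0

print(X @ n)                        # [0. 0. 0. 0. 0. 0.]
for t in (1.0, -2.5, 100.0):
    print(X @ (beta + t * n))       # always [4. 4. 4. 6. 6. 6.]
```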

Treatment Effects Model

Researchers randomly assigned a total of six experimental units to two treatments and measured a response of interest:

\[ y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad i = 1, 2; \;\; j = 1, 2, 3. \]

\[
\begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{21} \\ y_{22} \\ y_{23} \end{bmatrix}
=
\begin{bmatrix} \mu+\tau_1 \\ \mu+\tau_1 \\ \mu+\tau_1 \\ \mu+\tau_2 \\ \mu+\tau_2 \\ \mu+\tau_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu \\ \tau_1 \\ \tau_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \end{bmatrix}
\]

In this case, it makes no sense to estimate β = [µ, τ1, τ2]′ because there are multiple (infinitely many, in fact) choices of β that define the same mean for y. For example,

\[
\begin{bmatrix}\mu\\\tau_1\\\tau_2\end{bmatrix}
= \begin{bmatrix}5\\-1\\1\end{bmatrix},\;\;
\begin{bmatrix}0\\4\\6\end{bmatrix},\;\; \text{or} \;\;
\begin{bmatrix}999\\-995\\-993\end{bmatrix}
\]

all yield the same Xβ = E(y). When multiple values for β define the same E(y), we say that β is non-estimable.
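A quick numeric check of this claim, reusing X from the sketch above; the three vectors are the ones displayed in the example:

```python
import numpy as np

X = np.array([[1, 1, 0], [1, 1, 0], [1, 1, 0],
              [1, 0, 1], [1, 0, 1], [1, 0, 1]], dtype=float)

# Three different parameter vectors that define the same mean for y.
betas = [np.array([5.0, -1.0, 1.0]),
         np.array([0.0, 4.0, 6.0]),
         np.array([999.0, -995.0, -993.0])]

for b in betas:
    print(X @ b)   # each prints [4. 4. 4. 6. 6. 6.]
```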

Estimable Functions of β

A linear function of β, Cβ, is said to be estimable if there is a linear function of y, Ay, that is an unbiased estimator of Cβ. Otherwise, Cβ is said to be non-estimable.

Note that Ay is an unbiased estimator of Cβ if and only if

\[
E(Ay) = C\beta \;\; \forall\, \beta \in \mathbb{R}^p
\iff AX\beta = C\beta \;\; \forall\, \beta \in \mathbb{R}^p
\iff AX = C.
\]

This says that we can estimate Cβ as long as Cβ = AXβ = AE(y) for some A, i.e., as long as Cβ is a linear function of E(y). The bottom line is that we can always estimate E(y) and all linear functions of E(y); all other linear functions of β are non-estimable.
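Because Cβ is estimable exactly when C = AX for some A, i.e., when each row of C lies in the row space of X, estimability can be checked numerically. A small sketch of my own, again assuming the treatment effects X; the test works because pinv(X) @ X is the projector onto the row space of X:

```python
import numpy as np

X = np.array([[1, 1, 0], [1, 1, 0], [1, 1, 0],
              [1, 0, 1], [1, 0, 1], [1, 0, 1]], dtype=float)

def is_estimable(C, X):
    """Cb is estimable iff each row of C lies in the row space of X,
    i.e., iff C is unchanged by the projector pinv(X) @ X."""
    C = np.atleast_2d(C)
    return np.allclose(C @ np.linalg.pinv(X) @ X, C)

print(is_estimable([1, 1, 0], X))   # True:  mu + tau1 is estimable
print(is_estimable([0, 1, -1], X))  # True:  tau1 - tau2 is estimable
print(is_estimable([1, 0, 0], X))   # False: mu alone is not estimable
print(is_estimable([0, 1, 0], X))   # False: tau1 alone is not estimable
```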

\[
E(y) = X\beta
= \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu \\ \tau_1 \\ \tau_2 \end{bmatrix}
= \begin{bmatrix} \mu+\tau_1 \\ \mu+\tau_1 \\ \mu+\tau_1 \\ \mu+\tau_2 \\ \mu+\tau_2 \\ \mu+\tau_2 \end{bmatrix}
\]

⟹

\[
\begin{aligned}
[1, 0, 0, 0, 0, 0]\,X\beta &= [1, 1, 0]\,\beta = \mu + \tau_1, \\
[0, 0, 0, 1, 0, 0]\,X\beta &= [1, 0, 1]\,\beta = \mu + \tau_2, \text{ and} \\
[1, 0, 0, -1, 0, 0]\,X\beta &= [0, 1, -1]\,\beta = \tau_1 - \tau_2
\end{aligned}
\]

are estimable functions of β.

Estimating Estimable Functions of β

If Cβ is estimable, then there exists a matrix A such that C = AX and Cβ = AXβ = AE(y) for any β ∈ ℝᵖ. It makes sense to estimate Cβ = AXβ = AE(y) by

\[
A\hat{E}(y) = A\hat{y} = AP_X y = AX(X'X)^- X'y = AX(X'X)^- X'X\hat{\beta} = AP_X X\hat{\beta} = AX\hat{\beta} = C\hat{\beta}.
\]

Cβ̂ is called the Ordinary Least Squares (OLS) estimator of Cβ. Note that although the hat is on β, it is Cβ that we are estimating.
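The following sketch (my own illustration, with made-up response values) computes the OLS estimator of τ1 − τ2 both as Aŷ and as Cβ̂, using the Moore-Penrose pseudoinverse as one convenient generalized inverse:

```python
import numpy as np

X = np.array([[1, 1, 0], [1, 1, 0], [1, 1, 0],
              [1, 0, 1], [1, 0, 1], [1, 0, 1]], dtype=float)
y = np.array([4.1, 3.8, 4.3, 6.2, 5.9, 6.1])      # made-up data

A = np.array([[1, 0, 0, -1, 0, 0]], dtype=float)  # one choice of A with AX = C
C = A @ X                                          # C = [0, 1, -1], so Cb = tau1 - tau2

P_X = X @ np.linalg.pinv(X.T @ X) @ X.T            # projection onto the column space of X
y_hat = P_X @ y                                    # OLS estimate of E(y)
beta_hat = np.linalg.pinv(X.T @ X) @ X.T @ y       # one solution to the normal equations

print(A @ y_hat)                        # A y-hat
print(C @ beta_hat)                     # C beta-hat: identical value
print(np.mean(y[:3]) - np.mean(y[3:]))  # ybar1. - ybar2., the same number again
```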

Invariance of Cβ̂ to the Choice of β̂

Although there are infinitely many solutions to the normal equations when X is not of full column rank, Cβ̂ is the same for all normal-equation solutions β̂ whenever Cβ is estimable.

To see this, suppose β̂₁ and β̂₂ are any two solutions to the normal equations. Then

\[
C\hat{\beta}_1 = AX\hat{\beta}_1 = AP_X X\hat{\beta}_1 = AX(X'X)^- X'X\hat{\beta}_1 = AX(X'X)^- X'y = AX(X'X)^- X'X\hat{\beta}_2 = AP_X X\hat{\beta}_2 = AX\hat{\beta}_2 = C\hat{\beta}_2.
\]

Suppose our aim is to estimate τ1 − τ2. As noted before,

\[
X\beta
= \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu \\ \tau_1 \\ \tau_2 \end{bmatrix}
= \begin{bmatrix} \mu+\tau_1 \\ \mu+\tau_1 \\ \mu+\tau_1 \\ \mu+\tau_2 \\ \mu+\tau_2 \\ \mu+\tau_2 \end{bmatrix}
\quad\Longrightarrow\quad
[1, 0, 0, -1, 0, 0]\,X\beta = [0, 1, -1]\,\beta = \tau_1 - \tau_2.
\]

Thus, we can compute the OLS estimator of τ1 − τ2 as [1, 0, 0, −1, 0, 0]ŷ = [0, 1, −1]β̂, where ŷ = X(X′X)⁻X′y and β̂ is any solution to the normal equations.

The normal equations X′Xb = X′y in this case are

\[
\begin{bmatrix} 6 & 3 & 3 \\ 3 & 3 & 0 \\ 3 & 0 & 3 \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
= X'\begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{21} \\ y_{22} \\ y_{23} \end{bmatrix}
= \begin{bmatrix} y_{\cdot\cdot} \\ y_{1\cdot} \\ y_{2\cdot} \end{bmatrix},
\]

where y·· is the sum of all six responses and yᵢ· is the sum of the three responses under treatment i.
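A quick numeric check of the normal equations (made-up y values, as before):

```python
import numpy as np

X = np.array([[1, 1, 0], [1, 1, 0], [1, 1, 0],
              [1, 0, 1], [1, 0, 1], [1, 0, 1]], dtype=float)
y = np.array([4.1, 3.8, 4.3, 6.2, 5.9, 6.1])   # made-up data

print(X.T @ X)   # [[6. 3. 3.] [3. 3. 0.] [3. 0. 3.]]
print(X.T @ y)   # [y.., y1., y2.] = [30.4, 12.2, 18.2]
```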

\[
\hat{\beta}_1 = \begin{bmatrix} \bar{y}_{\cdot\cdot} \\ \bar{y}_{1\cdot} - \bar{y}_{\cdot\cdot} \\ \bar{y}_{2\cdot} - \bar{y}_{\cdot\cdot} \end{bmatrix}
\quad\text{and}\quad
\hat{\beta}_2 = \begin{bmatrix} 0 \\ \bar{y}_{1\cdot} \\ \bar{y}_{2\cdot} \end{bmatrix}
\]

are each solutions to the normal equations because

\[
\begin{bmatrix} 6 & 3 & 3 \\ 3 & 3 & 0 \\ 3 & 0 & 3 \end{bmatrix}
\begin{bmatrix} \bar{y}_{\cdot\cdot} \\ \bar{y}_{1\cdot} - \bar{y}_{\cdot\cdot} \\ \bar{y}_{2\cdot} - \bar{y}_{\cdot\cdot} \end{bmatrix}
= \begin{bmatrix} y_{\cdot\cdot} \\ y_{1\cdot} \\ y_{2\cdot} \end{bmatrix}
= \begin{bmatrix} 6 & 3 & 3 \\ 3 & 3 & 0 \\ 3 & 0 & 3 \end{bmatrix}
\begin{bmatrix} 0 \\ \bar{y}_{1\cdot} \\ \bar{y}_{2\cdot} \end{bmatrix}.
\]

Thus, the OLS estimator of Cβ = [0, 1, −1]β = τ1 − τ2 is

\[
C\hat{\beta}_1
= [0, 1, -1]\begin{bmatrix} \bar{y}_{\cdot\cdot} \\ \bar{y}_{1\cdot} - \bar{y}_{\cdot\cdot} \\ \bar{y}_{2\cdot} - \bar{y}_{\cdot\cdot} \end{bmatrix}
= \bar{y}_{1\cdot} - \bar{y}_{2\cdot}
= [0, 1, -1]\begin{bmatrix} 0 \\ \bar{y}_{1\cdot} \\ \bar{y}_{2\cdot} \end{bmatrix}
= C\hat{\beta}_2.
\]

Let

\[
(X'X)^-_1 = \begin{bmatrix} 1/6 & 0 & 0 \\ 0 & 1/6 & -1/6 \\ 0 & -1/6 & 1/6 \end{bmatrix}
\quad\text{and}\quad
(X'X)^-_2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1/3 & 0 \\ 0 & 0 & 1/3 \end{bmatrix}.
\]

It is straightforward to verify that (X′X)₁⁻ and (X′X)₂⁻ are each generalized inverses of X′X. It is also easy to show that β̂₁ = (X′X)₁⁻X′y and β̂₂ = (X′X)₂⁻X′y.
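These claims are easy to check numerically. The sketch below (mine, with the same made-up y as before) verifies the generalized-inverse property (X′X)G(X′X) = X′X for both matrices and confirms the invariance result: the two solutions differ, but they give the same value of Cβ̂.

```python
import numpy as np

X = np.array([[1, 1, 0], [1, 1, 0], [1, 1, 0],
              [1, 0, 1], [1, 0, 1], [1, 0, 1]], dtype=float)
y = np.array([4.1, 3.8, 4.3, 6.2, 5.9, 6.1])   # made-up data
XtX, Xty = X.T @ X, X.T @ y

G1 = np.array([[1/6,    0,    0],
               [0,    1/6, -1/6],
               [0,   -1/6,  1/6]])
G2 = np.array([[0,   0,   0],
               [0, 1/3,   0],
               [0,   0, 1/3]])

for G in (G1, G2):
    print(np.allclose(XtX @ G @ XtX, XtX))   # True: G is a generalized inverse

beta1, beta2 = G1 @ Xty, G2 @ Xty
print(beta1, beta2)                # two different solutions to the normal equations
C = np.array([0.0, 1.0, -1.0])
print(C @ beta1, C @ beta2)        # same estimate of tau1 - tau2 (both -2.0)
```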

\[
P_X = X(X'X)^-_2 X'
= \begin{bmatrix} 0 & 1/3 & 0 \\ 0 & 1/3 & 0 \\ 0 & 1/3 & 0 \\ 0 & 0 & 1/3 \\ 0 & 0 & 1/3 \\ 0 & 0 & 1/3 \end{bmatrix} X'
= \begin{bmatrix}
1/3 & 1/3 & 1/3 & 0 & 0 & 0 \\
1/3 & 1/3 & 1/3 & 0 & 0 & 0 \\
1/3 & 1/3 & 1/3 & 0 & 0 & 0 \\
0 & 0 & 0 & 1/3 & 1/3 & 1/3 \\
0 & 0 & 0 & 1/3 & 1/3 & 1/3 \\
0 & 0 & 0 & 1/3 & 1/3 & 1/3
\end{bmatrix}.
\]
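As a side check (my addition), P_X does not depend on which generalized inverse is used; under the same setup as before:

```python
import numpy as np

X = np.array([[1, 1, 0], [1, 1, 0], [1, 1, 0],
              [1, 0, 1], [1, 0, 1], [1, 0, 1]], dtype=float)
G1 = np.array([[1/6, 0, 0], [0, 1/6, -1/6], [0, -1/6, 1/6]])
G2 = np.array([[0, 0, 0], [0, 1/3, 0], [0, 0, 1/3]])

P1 = X @ G1 @ X.T
P2 = X @ G2 @ X.T
print(np.allclose(P1, P2))   # True: P_X is invariant to the choice of g-inverse
print(P2[:3, :3])            # upper-left block: all entries 1/3
```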

Thus,

\[
\hat{E}(y) = \hat{y} = P_X y
= P_X \begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{21} \\ y_{22} \\ y_{23} \end{bmatrix}
= \begin{bmatrix} \bar{y}_{1\cdot} \\ \bar{y}_{1\cdot} \\ \bar{y}_{1\cdot} \\ \bar{y}_{2\cdot} \\ \bar{y}_{2\cdot} \\ \bar{y}_{2\cdot} \end{bmatrix}
\]

is our OLS estimator of

\[
E(y) = X\beta
= \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu \\ \tau_1 \\ \tau_2 \end{bmatrix}
= \begin{bmatrix} \mu+\tau_1 \\ \mu+\tau_1 \\ \mu+\tau_1 \\ \mu+\tau_2 \\ \mu+\tau_2 \\ \mu+\tau_2 \end{bmatrix}.
\]

Also, we can see that the OLS estimator of

\[
\tau_1 - \tau_2
= [0, 1, -1]\begin{bmatrix} \mu \\ \tau_1 \\ \tau_2 \end{bmatrix}
= [1, 0, 0, -1, 0, 0]\begin{bmatrix} \mu+\tau_1 \\ \mu+\tau_1 \\ \mu+\tau_1 \\ \mu+\tau_2 \\ \mu+\tau_2 \\ \mu+\tau_2 \end{bmatrix}
= [1, 0, 0, -1, 0, 0]\,X\beta
= [1, 0, 0, -1, 0, 0]\,E(y)
\]

is

\[
[1, 0, 0, -1, 0, 0]\,\hat{E}(y)
= [1, 0, 0, -1, 0, 0]\,\hat{y}
= [1, 0, 0, -1, 0, 0]\begin{bmatrix} \bar{y}_{1\cdot} \\ \bar{y}_{1\cdot} \\ \bar{y}_{1\cdot} \\ \bar{y}_{2\cdot} \\ \bar{y}_{2\cdot} \\ \bar{y}_{2\cdot} \end{bmatrix}
= \bar{y}_{1\cdot} - \bar{y}_{2\cdot}.
\]

The Gauss-Markov Theorem

Under the Gauss-Markov Linear Model, the OLS estimator c′β̂ of an estimable linear function c′β is the unique Best Linear Unbiased Estimator (BLUE) in the sense that Var(c′β̂) is strictly less than the variance of any other linear unbiased estimator of c′β, for all β ∈ ℝᵖ and all σ² ∈ ℝ⁺.

The Gauss-Markov Theorem says that if we want to estimate an estimable linear function c′β using a linear estimator that is unbiased, we should always use the OLS estimator.

In our simple example of the treatment effects model, we could have used y11 − y21 to estimate τ1 − τ2. It is easy to see that y11 − y21 is a linear estimator that is unbiased for τ1 − τ2, but its variance is clearly larger than the variance of the OLS estimator ȳ1· − ȳ2· (as guaranteed by the Gauss-Markov Theorem).
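For the record, the variance comparison in the last paragraph is a one-line calculation (my addition, using the model's assumption of uncorrelated errors with common variance σ²):

\[
\operatorname{Var}(y_{11} - y_{21}) = \sigma^2 + \sigma^2 = 2\sigma^2,
\qquad
\operatorname{Var}(\bar{y}_{1\cdot} - \bar{y}_{2\cdot}) = \frac{\sigma^2}{3} + \frac{\sigma^2}{3} = \frac{2\sigma^2}{3},
\]

so the OLS estimator has one third the variance of y11 − y21.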