Classical Inference for Gaussian Linear Models

1 Classical Inference for Gaussian Linear Models
Surya Tokdar

2 Gaussian linear model
Data: $(z_i, Y_i)$, $i = 1, \dots, n$, with $\dim(z_i) = p$
Model: $Y_i = z_i^T \beta + \epsilon_i$, $\epsilon_i \overset{\text{IID}}{\sim} N(0, \sigma^2)$
Parameters: $\beta \in \mathbb{R}^p$, $\sigma^2 > 0$
Inference needed on $\eta = a^T \beta$
Useful model for many types of analyses

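3 Matrix-vector notation
Response vector, design matrix and error vector:
$$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad Z = \begin{pmatrix} z_1^T \\ z_2^T \\ \vdots \\ z_n^T \end{pmatrix}, \quad \epsilon = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}$$
Model: $Y = Z\beta + \epsilon$, $\epsilon \sim N_n(0, \sigma^2 I_n)$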

4 Example 1
Food expenditures $Y_1, \dots, Y_n$
Population average $\mu$ and variability $\sigma^2$
Model: $Y_i \overset{\text{IID}}{\sim} N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma^2 > 0$
This is a Gaussian linear model with $p = 1$, $z_i = 1$ and $\beta = \mu$

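5 Example 2
Body weight gains of $n_1$ rats on high protein: $H_1, \dots, H_{n_1}$
Same for $n_2$ rats on low protein: $L_1, \dots, L_{n_2}$
Model: $H_i \overset{\text{IID}}{\sim} N(\mu_1, \sigma^2)$, $L_j \overset{\text{IID}}{\sim} N(\mu_2, \sigma^2)$, with the $H_i$'s and $L_j$'s independent; $\mu_1, \mu_2 \in \mathbb{R}$, $\sigma^2 > 0$
Gaussian linear model with $p = 2$, $n = n_1 + n_2$, $\beta = (\mu_1, \mu_2)^T$ and
$$Y = \begin{pmatrix} H_1 \\ \vdots \\ H_{n_1} \\ L_1 \\ \vdots \\ L_{n_2} \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ \vdots & \vdots \\ 1 & 0 \\ 0 & 1 \\ \vdots & \vdots \\ 0 & 1 \end{pmatrix}$$
where the first $n_1$ rows of $Z$ are $(1, 0)$ and the last $n_2$ rows are $(0, 1)$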
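To make the two-group design concrete, here is a minimal sketch that builds $Y$ and $Z$ for Example 2 and evaluates $(Z^T Z)^{-1} Z^T y$, the least-squares formula that appears later in these slides. The weight gains are made-up illustrative numbers, not data from the slides, and the variable names are assumptions.

```python
# Hypothetical weight gains (illustrative numbers, not from the slides).
high = [134.0, 146.0, 104.0, 119.0]   # H_1, ..., H_{n1}
low = [70.0, 118.0, 101.0]            # L_1, ..., L_{n2}

y = high + low
# Rows (1, 0) for high-protein rats, rows (0, 1) for low-protein rats.
Z = [[1.0, 0.0]] * len(high) + [[0.0, 1.0]] * len(low)

# For this design Z^T Z = diag(n1, n2), so (Z^T Z)^{-1} Z^T y can be
# computed coordinate by coordinate without a general linear solver.
ztz_diag = [sum(row[j] ** 2 for row in Z) for j in range(2)]
zty = [sum(row[j] * yi for row, yi in zip(Z, y)) for j in range(2)]
beta_hat = [zty[j] / ztz_diag[j] for j in range(2)]

print(beta_hat)  # coincides with the two group sample means
```

With an indicator design like this one, the estimate of $\beta = (\mu_1, \mu_2)^T$ reduces to the pair of group averages.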

6 Example 3
$n$ subjects, males and females, randomly assigned to treatment (drug) or control (placebo)
$Y_i$ = improvement in condition (sleep hours) of subject $i$

7 Example 3 as Gaussian linear model
Use model $Y_i = z_i^T \beta + \epsilon_i$, $\epsilon_i \overset{\text{IID}}{\sim} N(0, \sigma^2)$, where $p = 4$ and $z_i = (z_{i1}, z_{i2}, z_{i3}, z_{i4})^T$ with
$z_{i1} = I(\text{i-th subject is F and gets T})$
$z_{i2} = I(\text{i-th subject is F and gets C})$
$z_{i3} = I(\text{i-th subject is M and gets T})$
$z_{i4} = I(\text{i-th subject is M and gets C})$
Let $n_{FT}$ be the number of subjects who are F and get T
Similarly define $n_{FC}$, $n_{MT}$ and $n_{MC}$

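8 Example 3 design matrix
With subjects ordered by group (F/T, F/C, M/T, M/C), the design matrix consists of four blocks of identical rows:
$$Z = \begin{pmatrix} \mathbf{1}_{n_{FT}} & 0 & 0 & 0 \\ 0 & \mathbf{1}_{n_{FC}} & 0 & 0 \\ 0 & 0 & \mathbf{1}_{n_{MT}} & 0 \\ 0 & 0 & 0 & \mathbf{1}_{n_{MC}} \end{pmatrix}$$
where $\mathbf{1}_m$ denotes a column of $m$ ones, so the first $n_{FT}$ rows are $(1, 0, 0, 0)$, the next $n_{FC}$ rows are $(0, 1, 0, 0)$, the next $n_{MT}$ rows are $(0, 0, 1, 0)$ and the last $n_{MC}$ rows are $(0, 0, 0, 1)$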

9 Example 3 treatment effects
Treatment effect for females: $\eta_F = \beta_1 - \beta_2$
Treatment effect for males: $\eta_M = \beta_3 - \beta_4$
Treatment effect difference: $\eta = \eta_F - \eta_M = \beta_1 - \beta_2 - \beta_3 + \beta_4$
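Each of these effects is a linear combination of the coefficients, so it fits the inference target $\eta = a^T \beta$ from the model slide. Reading off the coefficients gives the vectors below; the labels $a_F$ and $a_M$ are introduced here only for illustration.

$$
\begin{aligned}
\eta_F &= a_F^T \beta, & a_F &= (1, -1, 0, 0)^T \\
\eta_M &= a_M^T \beta, & a_M &= (0, 0, 1, -1)^T \\
\eta &= \eta_F - \eta_M = a^T \beta, & a &= a_F - a_M = (1, -1, -1, 1)^T
\end{aligned}
$$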

10 Example 4
$n$ subjects
Randomly assigned to treatment or control
$Y_i$ = improvement in condition for subject $i$
Likely to depend on subject's age

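11 Example 4 and Gaussian linear model
$\text{treat}_i = 1$ for treatment and $0$ for control
Model: $Y_i = \beta_1 + \beta_2\, \text{treat}_i + \beta_3\, \text{age}_i + \beta_4\, \text{treat}_i \cdot \text{age}_i + \epsilon_i$, $\epsilon_i \overset{\text{IID}}{\sim} N(0, \sigma^2)$
i.e., $p = 4$, $z_i = (1, \text{treat}_i, \text{age}_i, \text{treat}_i \cdot \text{age}_i)^T$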

12 Example 4 and quantities of interest
1. Expected improvement at age 30, receiving treatment: $a = (1, 1, 30, 30)^T$
2. Treatment effect at age 30, i.e., expected additional improvement due to treatment at age 30: $a = (0, 1, 0, 30)^T$
3. Difference in treatment effects between age 20 and 30: $a = (0, 0, 0, 10)^T$, which corresponds to the effect at age 30 minus the effect at age 20
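The three vectors can be recovered mechanically from the covariate vector $z = (1, \text{treat}, \text{age}, \text{treat} \cdot \text{age})$: an expected response uses $z$ itself, and an effect is a difference of two such vectors. A small sketch, where the helper names `z_row` and `diff` are assumptions made for illustration:

```python
def z_row(treat, age):
    """Covariate vector z = (1, treat, age, treat * age) from the model."""
    return [1.0, float(treat), float(age), float(treat * age)]


def diff(u, v):
    """Componentwise difference u - v."""
    return [ui - vi for ui, vi in zip(u, v)]


# 1. Expected improvement at age 30 under treatment.
a1 = z_row(1, 30)
# 2. Treatment effect at age 30: treated minus control at the same age.
a2 = diff(z_row(1, 30), z_row(0, 30))
# 3. Treatment effect at age 30 minus treatment effect at age 20.
a3 = diff(a2, diff(z_row(1, 20), z_row(0, 20)))

print(a1, a2, a3)
```

Swapping the order of the two ages in the third quantity flips the sign of the last entry.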

13 ML theory: the likelihood function
Model: $Y \sim N_n(Z\beta, \sigma^2 I)$
Observe $Y = y$ with $y = (y_1, \dots, y_n)^T$
Log-likelihood:
$$\ell_y(\beta, \sigma^2) = \text{const} - \frac{n}{2} \log \sigma^2 - \frac{(y - Z\beta)^T (y - Z\beta)}{2\sigma^2}$$

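14 MLE
First-order conditions:
$$0 = \frac{\partial}{\partial \beta} \ell_y(\beta, \sigma^2) = \frac{Z^T (y - Z\beta)}{\sigma^2}$$
$$0 = \frac{\partial}{\partial \sigma^2} \ell_y(\beta, \sigma^2) = -\frac{n}{2\sigma^2} + \frac{(y - Z\beta)^T (y - Z\beta)}{2\sigma^4}$$
Solutions:
$$\hat\beta_{MLE} = (Z^T Z)^{-1} Z^T y =: \hat\beta_{LS}, \qquad \hat\sigma^2_{MLE} = \frac{(y - Z\hat\beta_{LS})^T (y - Z\hat\beta_{LS})}{n}$$
Notation:
$$s^2_{y|z} = \frac{(y - Z\hat\beta_{LS})^T (y - Z\hat\beta_{LS})}{n - p}, \quad \text{i.e.,} \quad \hat\sigma^2_{MLE} = \frac{n - p}{n}\, s^2_{y|z}$$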
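The estimates on this slide can be computed directly from the normal equations $Z^T Z \beta = Z^T y$. The following is a minimal pure-Python sketch, assuming $Z^T Z$ is invertible; the helper names `solve` and `fit` and the four data points are hypothetical, chosen only to illustrate the formulas.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit(Z, y):
    """Return (beta_LS, sigma2_MLE, s2) for the Gaussian linear model."""
    n, p = len(Z), len(Z[0])
    ztz = [[sum(r[i] * r[j] for r in Z) for j in range(p)] for i in range(p)]
    zty = [sum(r[i] * yi for r, yi in zip(Z, y)) for i in range(p)]
    beta = solve(ztz, zty)                      # (Z^T Z)^{-1} Z^T y
    rss = sum((yi - sum(b * z for b, z in zip(beta, r))) ** 2
              for r, yi in zip(Z, y))           # residual sum of squares
    return beta, rss / n, rss / (n - p)


# Hypothetical data: intercept plus one covariate, n = 4, p = 2.
Z = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 4.0, 6.0]
beta_hat, sigma2_mle, s2 = fit(Z, y)
print(beta_hat, sigma2_mle, s2)
```

The two variance estimates differ only in the divisor ($n$ versus $n - p$), which is the relation $\hat\sigma^2_{MLE} = \frac{n-p}{n} s^2_{y|z}$ stated above.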

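15 Least squares interpretation
For any $\beta$,
$$
\begin{aligned}
(y - Z\beta)^T (y - Z\beta) = \|y - Z\beta\|^2 &= \|y - Z\hat\beta_{LS}\|^2 + \|Z\hat\beta_{LS} - Z\beta\|^2 + 2 (y - Z\hat\beta_{LS})^T (Z\hat\beta_{LS} - Z\beta) \\
&= \|y - Z\hat\beta_{LS}\|^2 + \|Z\hat\beta_{LS} - Z\beta\|^2
\end{aligned}
$$
The cross term vanishes because $Z^T (y - Z\hat\beta_{LS}) = Z^T y - Z^T Z (Z^T Z)^{-1} Z^T y = 0$
Hence $\|y - Z\beta\|^2 \ge \|y - Z\hat\beta_{LS}\|^2$ for every $\beta$, so $\hat\beta_{LS}$ is the least-squares estimate of $\beta$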

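16 Profile log-likelihood of β
For any $\beta$, $\ell_y(\beta, \sigma^2)$ is maximized in $\sigma^2$ at
$$\hat\sigma^2(\beta) = \frac{(y - Z\beta)^T (y - Z\beta)}{n} = \frac{(n - p)\, s^2_{y|z} + (\beta - \hat\beta_{LS})^T (Z^T Z) (\beta - \hat\beta_{LS})}{n}$$
So the profile log-likelihood in $\beta$ is
$$
\begin{aligned}
\ell_y(\beta) = \ell_y(\beta, \hat\sigma^2(\beta)) &= \text{const} - \frac{n}{2} \log \hat\sigma^2(\beta) - \frac{n}{2} \\
&= \text{const} - \frac{n}{2} \log \left\{ 1 + \frac{(\beta - \hat\beta_{LS})^T (Z^T Z) (\beta - \hat\beta_{LS})}{(n - p)\, s^2_{y|z}} \right\}
\end{aligned}
$$
where the constant changes from line to line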

17 ML intervals for η = a^T β
Additional calculations show the profile log-likelihood in $\eta$ is
$$\ell_y(\eta) = \text{const} - \frac{n}{2} \log \left\{ 1 + \frac{1}{(n - p)\, s^2_{y|z}} \cdot \frac{(\eta - a^T \hat\beta_{LS})^2}{a^T (Z^T Z)^{-1} a} \right\}$$
So ML intervals for $\eta$ are of the form
$$a^T \hat\beta_{LS} \pm c_n \frac{s_{y|z}}{\sqrt{n_a}}$$
where $n_a = 1 / \{a^T (Z^T Z)^{-1} a\}$, with thresholds $c_n > 0$
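One way to fill in the "additional calculations" is a Lagrange-multiplier argument, sketched here under the assumption that $Z^T Z$ is invertible. Because the profile log-likelihood of $\beta$ is decreasing in the quadratic form $(\beta - \hat\beta_{LS})^T (Z^T Z) (\beta - \hat\beta_{LS})$, profiling over $\{\beta : a^T \beta = \eta\}$ means minimizing that quadratic form under the constraint. The multiplier $\lambda$ is notation introduced only for this sketch.

$$
\begin{aligned}
2 Z^T Z (\beta - \hat\beta_{LS}) = \lambda a
&\;\Rightarrow\; \beta - \hat\beta_{LS} = \tfrac{\lambda}{2} (Z^T Z)^{-1} a \\
a^T \beta = \eta
&\;\Rightarrow\; \tfrac{\lambda}{2} = \frac{\eta - a^T \hat\beta_{LS}}{a^T (Z^T Z)^{-1} a} \\
\min_{\beta :\, a^T \beta = \eta} (\beta - \hat\beta_{LS})^T (Z^T Z) (\beta - \hat\beta_{LS})
&= \left(\tfrac{\lambda}{2}\right)^2 a^T (Z^T Z)^{-1} a
= \frac{(\eta - a^T \hat\beta_{LS})^2}{a^T (Z^T Z)^{-1} a}
= n_a (\eta - a^T \hat\beta_{LS})^2
\end{aligned}
$$

Since $\ell_y(\eta)$ depends on $\eta$ only through $(\eta - a^T \hat\beta_{LS})^2$ and decreases in it, every set of the form $\{\eta : \ell_y(\eta) \ge \text{threshold}\}$ is an interval centered at $a^T \hat\beta_{LS}$ with half-width proportional to $s_{y|z} / \sqrt{n_a}$.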

18 ML confidence interval for η = a^T β
Let $F_k$ denote the CDF of the $t(k)$ distribution
Notation: $z_k(\alpha) = F_k^{-1}(1 - \alpha/2)$
The $100(1 - \alpha)\%$ ML confidence interval for $\eta = a^T \beta$ is
$$a^T \hat\beta_{LS} \pm z_{n-p}(\alpha) \frac{s_{y|z}}{\sqrt{n_a}}$$
This is due to the following fundamental theorem

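19 A fundamental result
Theorem. Let $Y \sim N_n(Z\beta, \sigma^2 I)$. Define $H = Z (Z^T Z)^{-1} Z^T$ and $\hat\epsilon = Y - Z\hat\beta_{LS} = (I_n - H) Y$. Then
1. $\hat\beta_{LS} \sim N_p(\beta, \sigma^2 (Z^T Z)^{-1})$
2. $\hat\epsilon \sim N_n(0, \sigma^2 (I_n - H))$
3. $\hat\beta_{LS}$ and $\hat\epsilon$ are independent
4. $\frac{1}{\sigma^2} \hat\epsilon^T \hat\epsilon \sim \chi^2(n - p)$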
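The theorem is what produces a $t$ distribution for the standardized estimate of $\eta = a^T \beta$. A short sketch of that step, using $n_a = 1 / \{a^T (Z^T Z)^{-1} a\}$ and $s^2_{y|z} = \hat\epsilon^T \hat\epsilon / (n - p)$ from the earlier slides; the labels $U$ and $V$ are introduced here only for illustration.

$$
\begin{aligned}
U &= \frac{a^T \hat\beta_{LS} - a^T \beta}{\sigma / \sqrt{n_a}} \sim N(0, 1)
&& \text{by part 1, since } \operatorname{Var}(a^T \hat\beta_{LS}) = \sigma^2 a^T (Z^T Z)^{-1} a = \sigma^2 / n_a \\
V &= \frac{\hat\epsilon^T \hat\epsilon}{\sigma^2} = \frac{(n - p)\, s^2_{y|z}}{\sigma^2} \sim \chi^2(n - p)
&& \text{by part 4, independent of } U \text{ by part 3} \\
T &= \frac{U}{\sqrt{V / (n - p)}} = \frac{a^T \hat\beta_{LS} - a^T \beta}{s_{y|z} / \sqrt{n_a}} \sim t(n - p)
\end{aligned}
$$

The unknown $\sigma$ cancels in the ratio, so $T$ is a pivot: its distribution does not depend on $(\beta, \sigma^2)$.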

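20 Coverage calculation
Notation: $C_c(y) = a^T \hat\beta_{LS} \pm c\, s_{y|z} / \sqrt{n_a}$
$$
\begin{aligned}
\gamma((\beta, \sigma^2), C_c) &= P_{[Y | \beta, \sigma^2]}(a^T \beta \in C_c(Y)) \\
&= P_{[Y | \beta, \sigma^2]}\left( -c \le \frac{a^T \hat\beta_{LS} - a^T \beta}{s_{y|z} / \sqrt{n_a}} \le c \right) \\
&= P_{[Y | \beta, \sigma^2]}(-c \le T \le c)
\end{aligned}
$$
By the theorem, $T \sim t(n - p)$ when $Y \sim N_n(Z\beta, \sigma^2 I_n)$
And so $\gamma((\beta, \sigma^2), C_c) = 2 F_{n-p}(c) - 1$
For $c = z_{n-p}(\alpha)$ this number equals $1 - \alpha$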
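The coverage formula can be checked by simulation. The sketch below uses the simplest case from Example 1 ($p = 1$, $z_i = 1$, $a = 1$), where $a^T \hat\beta_{LS}$ is the sample mean, $s_{y|z}$ is the sample standard deviation and $n_a = n$. The values of $\mu$, $\sigma$, $n$, the seed and the number of replications are arbitrary choices for illustration, and the critical value 2.228 is the tabulated 0.975 quantile of $t(10)$ rather than something computed here.

```python
import random
import statistics


def covers(sample, mu, c):
    """True if the interval mean +/- c * s / sqrt(n) contains mu."""
    n = len(sample)
    center = statistics.fmean(sample)
    half_width = c * statistics.stdev(sample) / n ** 0.5  # stdev divides by n - 1
    return center - half_width <= mu <= center + half_width


random.seed(0)                 # arbitrary seed, for reproducibility only
mu, sigma, n = 5.0, 2.0, 11    # hypothetical truth; n - p = 10
c = 2.228                      # tabulated z_{10}(0.05) = F_10^{-1}(0.975)
reps = 20000

hits = sum(
    covers([random.gauss(mu, sigma) for _ in range(n)], mu, c)
    for _ in range(reps)
)
coverage = hits / reps
print(coverage)  # should be close to the nominal 0.95
```

Changing `mu` or `sigma` should leave the estimated coverage essentially unchanged, which reflects the fact that $\gamma((\beta, \sigma^2), C_c)$ does not depend on the parameters.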

21 ML testing
$H_0 : a^T \beta = \eta_0$, where $\eta_0$ is a fixed number
ML test $\delta_c(y)$: reject $H_0$ if $\eta_0 \notin C_c(y)$
Null set: $\Theta_0 = \{(\beta, \sigma^2) : a^T \beta = \eta_0\}$; note this is a set, not a single point
Size of $\delta_c$ is $1 - \gamma(C_c) = 2(1 - F_{n-p}(c))$ [Prove in HW]
In particular, $\delta_{z_{n-p}(\alpha)}$ has size $\alpha$

22 One sided hypothesis
$H_0 : a^T \beta \le \eta_0$, where $\eta_0$ is a fixed number
ML test $\delta_c(y)$: reject $H_0$ if $(-\infty, \eta_0] \cap C_c(y) = \emptyset$
Same as rejecting $H_0$ when $\eta_0 < a^T \hat\beta_{LS} - c\, s_{y|z} / \sqrt{n_a}$
Size of $\delta_c$ is $1 - F_{n-p}(c)$
$\delta_{z_{n-p}(\alpha)}$ has size $\alpha/2$
Can do the same for the other one-sided case: $H_0 : a^T \beta \ge \eta_0$
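Both the two-sided and the one-sided rejection rules can be phrased through the statistic $(a^T \hat\beta_{LS} - \eta_0) / (s_{y|z} / \sqrt{n_a})$. A minimal sketch, where the function names and the numerical inputs are hypothetical and the threshold $c$ is supplied by the caller (for example from a $t$ table):

```python
def t_stat(estimate, eta0, s, n_a):
    """(a^T beta_hat - eta0) / (s / sqrt(n_a))."""
    return (estimate - eta0) / (s / n_a ** 0.5)


def reject_two_sided(estimate, eta0, s, n_a, c):
    """Reject H0: a^T beta = eta0 iff eta0 lies outside C_c(y)."""
    return abs(t_stat(estimate, eta0, s, n_a)) > c


def reject_one_sided(estimate, eta0, s, n_a, c):
    """Reject H0: a^T beta <= eta0 iff eta0 < estimate - c * s / sqrt(n_a)."""
    return t_stat(estimate, eta0, s, n_a) > c


# Hypothetical inputs: s / sqrt(n_a) = 2 / 4 = 0.5, so the statistic is +/- 3.
c = 2.228  # tabulated 0.975 quantile of t(10)
print(reject_two_sided(1.5, 0.0, 2.0, 16.0, c))
print(reject_one_sided(1.5, 0.0, 2.0, 16.0, c))
print(reject_two_sided(-1.5, 0.0, 2.0, 16.0, c))
print(reject_one_sided(-1.5, 0.0, 2.0, 16.0, c))
```

With the same threshold $c$, the one-sided test rejects only for large positive values of the statistic, which is why its size is half that of the two-sided test.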

23 Example: Chick Weight
50 chicks assigned to one of 4 protein diets
One body weight measurement from each chick between 1 and 21 days after birth
Data on (log) body weight, diet and time of measurement
Model: $\text{weight} = \beta_1 + \beta_2\, \text{Diet}_2 + \beta_3\, \text{Diet}_3 + \beta_4\, \text{Diet}_4 + \beta_5\, \text{Time} + \epsilon$
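The model can be read as a Gaussian linear model with $p = 5$ once each chick's diet and time are encoded as a covariate vector. The sketch below assumes that $\text{Diet}_k$ is the indicator of diet $k$ and that diet 1 is the baseline absorbed into $\beta_1$; the slides do not spell this out, and the helper name `chick_row` is hypothetical.

```python
def chick_row(diet, time):
    """z = (1, I(diet = 2), I(diet = 3), I(diet = 4), time); diet 1 is the baseline."""
    if diet not in (1, 2, 3, 4):
        raise ValueError("diet must be 1, 2, 3 or 4")
    indicators = [1.0 if diet == d else 0.0 for d in (2, 3, 4)]
    return [1.0] + indicators + [float(time)]


# Contrast vector a for "diet 3 versus diet 2 at the same time point":
# the intercept and the time entries cancel in the difference.
a = [u - v for u, v in zip(chick_row(3, 10), chick_row(2, 10))]

print(chick_row(1, 7))
print(chick_row(4, 21))
print(a)
```

Under this encoding, $\beta_2$, $\beta_3$ and $\beta_4$ are differences from diet 1, and comparisons between two non-baseline diets are contrasts $a^T \beta$ such as the one computed above.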
