TMA4267 Linear Statistical Models V2017 (L10)


1 TMA4267 Linear Statistical Models V2017 (L10)
Part 2: Linear regression: Parameter estimation [F:3.2]. Properties of residuals and distribution of the estimator for the error variance. Confidence interval and hypothesis test for one regression coefficient.
Mette Langaas, Department of Mathematical Sciences, NTNU
To be lectured: February 17

2 Today
1. Properties of residuals (from the hat matrix), leading to properties of $\hat\sigma^2$.
2. Then, confidence interval and hypothesis test for a regression coefficient.

3 The classical linear model
The model $Y = X\beta + \varepsilon$ is called a classical linear model if the following is true:
1. $E(\varepsilon) = 0$.
2. $\operatorname{Cov}(\varepsilon) = E(\varepsilon\varepsilon^T) = \sigma^2 I$.
3. The design matrix has full rank, $\operatorname{rank}(X) = k + 1 = p$.
The classical normal linear regression model is obtained if additionally
4. $\varepsilon \sim N_n(0, \sigma^2 I)$
holds. For random covariates these assumptions are to be understood conditionally on $X$.

4 Results so far
Least squares and maximum likelihood estimator for $\beta$: $\hat\beta = (X^T X)^{-1} X^T Y$, with mean $E(\hat\beta) = \beta$ and $\operatorname{Cov}(\hat\beta) = \sigma^2 (X^T X)^{-1}$.
Restricted maximum likelihood estimator for $\sigma^2$: $\hat\sigma^2 = \frac{1}{n-p} (Y - X\hat\beta)^T (Y - X\hat\beta) = \frac{\mathrm{SSE}}{n-p}$.
Projection matrices (idempotent and symmetric, that is, orthogonal projections):
$H = X (X^T X)^{-1} X^T$ projects onto the column space of $X$.
$I - H = I - X (X^T X)^{-1} X^T$ projects onto the space orthogonal to the column space of $X$.
Important connection: predictions $\hat Y = HY$ and residuals $\hat\varepsilon = (I - H)Y$.
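The matrix expressions above can be checked numerically. The following R sketch uses simulated data (the sample size, coefficients, seed and variable names are illustrative assumptions, not the acid rain data) and compares the hand-computed quantities with what lm() reports.

```r
# Simulated data; n, the coefficients and the seed are illustrative assumptions.
set.seed(1)
n <- 30
x1 <- rnorm(n)
x2 <- runif(n)
y <- 1 + 2 * x1 - x2 + rnorm(n, sd = 0.5)

X <- cbind(1, x1, x2)                      # design matrix with intercept column
p <- ncol(X)

XtX_inv  <- solve(crossprod(X))            # (X^T X)^{-1}
beta_hat <- XtX_inv %*% crossprod(X, y)    # (X^T X)^{-1} X^T y
H        <- X %*% XtX_inv %*% t(X)         # hat matrix
y_hat    <- H %*% y                        # predictions HY
res      <- (diag(n) - H) %*% y            # residuals (I - H)Y
SSE      <- sum(res^2)
sigma2_hat <- SSE / (n - p)                # SSE / (n - p)

# Compare with lm()
fit <- lm(y ~ x1 + x2)
all.equal(unname(drop(beta_hat)), unname(coef(fit)))
all.equal(sigma2_hat, summary(fit)$sigma^2)

# H is idempotent and symmetric, and its trace equals p
all.equal(H %*% H, H)
all.equal(H, t(H))
sum(diag(H))
```

Each all.equal() call should return TRUE, and the trace of H should equal the number of columns of X.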

5 [Figure 8.3: Projecting $y$ onto $C(1 : x)$. The figure shows $\hat y = Hy = \hat\beta_0 1 + \hat\beta_1 x$ in $C(1 : x)$, $\bar y = Jy = \bar y 1 = JHy$ in $C(1)$, and the residual $e = (I - H)y = y - \hat y$.]
Source: Puntanen, Styan and Isotalo: Matrix Tricks for Linear Statistical Models: Our Personal Top Twenty.

6 Quadratic forms [F:B3.3, Theorem B.2]
Let $X$ be a random vector with mean $\mu$ and covariance matrix $\Sigma$, and let $A$ be a symmetric constant matrix. Quadratic form: $X^T A X$.
The "trace formula": $E(X^T A X) = \operatorname{tr}(A\Sigma) + \mu^T A \mu$.
Then, let $X \sim N_p(0, I)$ and let $R$ be a symmetric and idempotent matrix with rank $r$. Then $X^T R X \sim \chi^2_r$.
Now let $S$ also be a symmetric and idempotent matrix, with rank $s$, and $RS = 0$. Then $\frac{s\, X^T R X}{r\, X^T S X} \sim F_{r,s}$.
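As a rough illustration of the trace formula, the R sketch below compares $\operatorname{tr}(A\Sigma) + \mu^T A \mu$ with a Monte Carlo average of $X^T A X$. The particular $\mu$, $\Sigma$, $A$, the seed and the number of draws are arbitrary choices made for this example, and normal draws are used only for convenience (the formula itself does not require normality).

```r
# Illustrative mean, covariance and symmetric matrix (assumed values).
set.seed(4)
mu <- c(1, 2, 0)
Sigma <- matrix(c(2,   0.5, 0,
                  0.5, 1,   0.3,
                  0,   0.3, 1.5), nrow = 3, byrow = TRUE)
A <- matrix(c(1,   0.2, 0,
              0.2, 2,   0.1,
              0,   0.1, 1), nrow = 3, byrow = TRUE)

# Trace formula: tr(A Sigma) + mu^T A mu
theory <- sum(diag(A %*% Sigma)) + drop(t(mu) %*% A %*% mu)

# Monte Carlo: each row of Xs is one draw from N(mu, Sigma).
N  <- 1e5
Z  <- matrix(rnorm(N * 3), ncol = 3)
Xs <- sweep(Z %*% chol(Sigma), 2, mu, "+")   # chol() gives U with U^T U = Sigma
qf <- rowSums((Xs %*% A) * Xs)               # x^T A x for every row x
mc <- mean(qf)

c(theory = theory, monte_carlo = mc)
```

The two numbers should agree up to simulation error, which shrinks as N grows.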

7 Properties: $\hat\beta$ and $\hat\sigma^2$
Least squares and maximum likelihood estimator for $\beta$: $\hat\beta = (X^T X)^{-1} X^T Y$ has mean $E(\hat\beta) = \beta$ and $\operatorname{Cov}(\hat\beta) = \sigma^2 (X^T X)^{-1}$.
In addition, $\hat\beta$ is the best linear unbiased estimator (BLUE), that is, among all linear unbiased estimators it has minimum variance in each component. (More in TMA4295 Statistical Inference.)
For the normal model: $\hat\beta \sim N_p(\beta, \sigma^2 (X^T X)^{-1})$.
Restricted maximum likelihood estimator for $\sigma^2$: $\hat\sigma^2 = \frac{1}{n-p} (Y - X\hat\beta)^T (Y - X\hat\beta) = \frac{\mathrm{SSE}}{n-p}$.
For the normal model: $\frac{(n-p)\hat\sigma^2}{\sigma^2} \sim \chi^2_{n-p}$.
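The divisor $n - p$ in $\hat\sigma^2$ can be motivated with the trace formula for quadratic forms. A short derivation, using only that $Y$ has mean $X\beta$ and covariance $\sigma^2 I$, and that $I - H$ is symmetric and idempotent with $(I - H)X = 0$:

```latex
\begin{align*}
\mathrm{SSE} &= Y^T (I - H) Y, \\
E(\mathrm{SSE}) &= \operatorname{tr}\big((I - H)\,\sigma^2 I\big) + (X\beta)^T (I - H)\, X\beta \\
  &= \sigma^2 \big(\operatorname{tr}(I_n) - \operatorname{tr}(H)\big) + 0 \\
  &= \sigma^2 (n - p),
\end{align*}
% since tr(H) = tr(X (X^T X)^{-1} X^T) = tr((X^T X)^{-1} X^T X) = tr(I_p) = p.
```

Hence $E(\hat\sigma^2) = E(\mathrm{SSE})/(n - p) = \sigma^2$, so $\hat\sigma^2$ is unbiased. This step needs no normality assumption.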

8 Acid rain in Norwegian lakes
Measured pH in Norwegian lakes explained by content of
x1: SO4: sulfate (the salt of sulfuric acid),
x2: NO3: nitrate (the conjugate base of nitric acid),
x3: Ca: calcium,
x4: latent Al: aluminium,
x5: organic substance,
x6: area of lake,
x7: position of lake (Telemark or Trøndelag).
Random sample of n = 26 lakes.

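9 Output from fitting the full model in R
> fit=lm(y~.,data=ds)
> summary(fit)
Coefficients: columns Estimate, Std. Error, t value, Pr(>|t|); rows (Intercept) and x1 to x7. Most numeric entries are not preserved. The intercept has Pr(>|t|) < 2e-16 (***), and two covariates are marked *** with p-values of order e-05 and e-06.
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: (value not preserved) on 18 degrees of freedom
Multiple R-squared: 0.93, Adjusted R-squared: (value not preserved)
F-statistic: (value not preserved) on 7 and 18 DF, p-value: 3.904e-09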

10 W. S. Gosset alias Student

11 Historical: the Student t-distribution
W.S. Gosset was employed by the Guinness Brewing Company of Dublin. Sample sizes available for experimentation in brewing were necessarily small, and Gosset knew that a correct way of dealing with small samples was needed. He consulted Karl Pearson of University College in London about the problem. Pearson told him the current state of knowledge was unsatisfactory. The following year Gosset undertook a course of study under Pearson. An outcome of his study was the publication in 1908 of Gosset's paper on "The Probable Error of a Mean," which introduced a form of what later became known as Student's t-distribution. Gosset's paper was published under the pseudonym "Student." The modern form of Student's t-distribution was derived and first published by R.A. Fisher.

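12 t-distribution
[Figure: density of the standard normal distribution together with t-distribution densities for several degrees of freedom, including df = 19 and df = 5.]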

13 DEF: t-distribution
Let $Z$ be a standard normal random variable and $V$ a chi-squared random variable with parameter $\nu$ (degrees of freedom). If $Z$ and $V$ are independent, the random variable
$T = \frac{Z}{\sqrt{V/\nu}}$
has probability density function
$h(t) = \frac{\Gamma[(\nu + 1)/2]}{\Gamma(\nu/2)\sqrt{\pi\nu}} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}$ for $-\infty < t < \infty$.
This distribution is called the (Student) t-distribution with $\nu$ degrees of freedom.
$E(T) = 0$ if $\nu \ge 2$. $\operatorname{Var}(T) = \frac{\nu}{\nu - 2}$ if $\nu \ge 3$.
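As a sanity check of the density, the R sketch below codes the t density from the definition and compares it with R's built-in dt(). The function name t_density and the evaluation grid are choices made for this example only.

```r
# Density of the t-distribution with nu degrees of freedom, written out
# from the definition (t_density is a name chosen for this sketch).
t_density <- function(t, nu) {
  gamma((nu + 1) / 2) / (gamma(nu / 2) * sqrt(pi * nu)) *
    (1 + t^2 / nu)^(-(nu + 1) / 2)
}

t_grid <- seq(-4, 4, by = 0.5)

# Compare with the built-in density for two choices of degrees of freedom.
all.equal(t_density(t_grid, nu = 5),  dt(t_grid, df = 5))
all.equal(t_density(t_grid, nu = 19), dt(t_grid, df = 19))

# For large nu the density approaches the standard normal density.
max(abs(t_density(t_grid, nu = 1000) - dnorm(t_grid)))
```

The last line gives the largest gap to the standard normal density on the grid, which should be small for large degrees of freedom.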

14 Are $\hat\beta$ and SSE independent?
Independence from Part 1: Let $X$ ($p \times 1$) be a random vector from $N_p(\mu, \Sigma)$. Then $AX$ and $BX$ are independent iff $A\Sigma B^T = 0$.
We have $Y \sim N_n(X\beta, \sigma^2 I)$, $AY = \hat\beta = (X^T X)^{-1} X^T Y$, and $BY = (I - H)Y$.
Now $A\,\sigma^2 I\,B^T = \sigma^2 A B^T = \sigma^2 (X^T X)^{-1} X^T (I - H) = 0$, since $I - H$ is symmetric and $X^T (I - H) = X^T - X^T H = X^T - X^T = 0$.
We conclude that $\hat\beta$ is independent of $(I - H)Y$. Since SSE is a function of $(I - H)Y$, namely $\mathrm{SSE} = Y^T (I - H) Y$, it follows that $\hat\beta$ and SSE are independent.
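The key matrix identity in this argument, $AB^T = 0$ with $A = (X^T X)^{-1} X^T$ and $B = I - H$, can be verified numerically. The R sketch below does so for a simulated design matrix (the dimensions and seed are arbitrary assumptions).

```r
# Simulated full-rank design matrix; n and the seed are illustrative.
set.seed(2)
n <- 20
X <- cbind(1, rnorm(n), rnorm(n))

A <- solve(crossprod(X)) %*% t(X)   # beta_hat = A Y
H <- X %*% A                        # hat matrix
B <- diag(n) - H                    # residuals = B Y

# Both should be zero up to floating point rounding.
max(abs(A %*% t(B)))                # A B^T = 0
max(abs(t(X) %*% B))                # X^T (I - H) = 0
```

Both maxima should be of the order of machine precision rather than exactly zero.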

15 Quantiles and critical values: N and t
[Figure: densities of the standard normal distribution and of t-distributions (including df = 19), with the tail area α/2 indicated.]

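16 Critical values of the t-distribution
$P(T > t_{\alpha,\nu}) = \alpha$
[Table of critical values $t_{\alpha,\nu}$ indexed by $\nu$ (rows) and $\alpha$ (columns); the entries are not preserved.]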
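Critical values $t_{\alpha,\nu}$ satisfying $P(T > t_{\alpha,\nu}) = \alpha$ can be computed in R with qt() instead of being read from a printed table. The R sketch below builds a small table; the chosen values of $\alpha$ and $\nu$ are illustrative and not taken from the slide.

```r
# Upper-tail critical values t_{alpha, nu}: P(T > t_{alpha, nu}) = alpha.
alpha <- c(0.10, 0.05, 0.025, 0.01)   # illustrative tail probabilities
nu    <- c(5, 18, 19)                 # illustrative degrees of freedom

crit <- outer(nu, alpha, function(v, a) qt(a, df = v, lower.tail = FALSE))
dimnames(crit) <- list(nu = as.character(nu), alpha = as.character(alpha))
round(crit, 3)

# Corresponding standard normal critical values, for comparison.
round(qnorm(alpha, lower.tail = FALSE), 3)
```

For every $\alpha$ the t critical values exceed the normal ones and move towards them as $\nu$ grows, which reflects the heavier tails of the t-distribution.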

17 Acid rain in R
ds=read.table(" TMA4267/2017v/acidrain.txt",header=TRUE)
fit=lm(y~.,data=ds)
> confint(fit)
Columns 2.5 % and 97.5 %; rows (Intercept) and x1 to x7. The numeric entries are not preserved.
P-values: /02/nerdekort.jpg
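What confint() computes for a linear model can be reproduced by hand from the estimates, their standard errors and a t quantile with $n - p$ degrees of freedom. Since the acid rain data file is not reproduced here, the R sketch below uses simulated data; the data frame ds_sim, its coefficients and the seed are assumptions made for illustration only.

```r
# Simulated stand-in data (NOT the acid rain data); n = 26 as on the slide.
set.seed(3)
n <- 26
ds_sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
ds_sim$y <- 5 - 0.3 * ds_sim$x1 + 0.1 * ds_sim$x2 + rnorm(n, sd = 0.2)

fit_sim <- lm(y ~ ., data = ds_sim)

est    <- coef(fit_sim)
se     <- sqrt(diag(vcov(fit_sim)))   # sigma_hat * sqrt(diag((X^T X)^{-1}))
df     <- df.residual(fit_sim)        # n - p
t_crit <- qt(0.975, df)               # t_{0.025, n - p}

# 95% confidence intervals: estimate -/+ t_crit * standard error
ci_manual <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)

# Two-sided p-values for H0: beta_j = 0
t_stat <- est / se
p_val  <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

all.equal(unname(ci_manual), unname(confint(fit_sim)))
all.equal(unname(p_val), unname(summary(fit_sim)$coefficients[, 4]))
```

Both comparisons should return TRUE, showing that the interval and the p-value in the standard output rest on the same t-distribution with $n - p$ degrees of freedom.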

18 Today
The distribution of $\mathrm{SSE}/\sigma^2$ is chi-squared with $n - p$ degrees of freedom.
Independence of $\hat\beta$ and SSE.
Inference about the components of $\beta$ can be performed using the t-distribution.
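These three results combine into the t-statistic for a single regression coefficient. In the sketch below, $c_{jj}$ denotes the $j$th diagonal element of $(X^T X)^{-1}$; this symbol is introduced here for convenience and is not taken from the slides.

```latex
\begin{align*}
\hat\beta_j &\sim N\big(\beta_j,\ \sigma^2 c_{jj}\big),
  \qquad c_{jj} = \big[(X^T X)^{-1}\big]_{jj}, \\
T_j &= \frac{(\hat\beta_j - \beta_j)\big/\big(\sigma\sqrt{c_{jj}}\big)}
            {\sqrt{\dfrac{(n-p)\hat\sigma^2}{\sigma^2}\Big/(n-p)}}
     = \frac{\hat\beta_j - \beta_j}{\hat\sigma\sqrt{c_{jj}}} \sim t_{n-p}, \\
1 - \alpha &= P\Big(\hat\beta_j - t_{\alpha/2,\,n-p}\,\hat\sigma\sqrt{c_{jj}}
   \le \beta_j \le
   \hat\beta_j + t_{\alpha/2,\,n-p}\,\hat\sigma\sqrt{c_{jj}}\Big).
\end{align*}
```

The numerator of $T_j$ is standard normal, the quantity under the square root is a $\chi^2_{n-p}$ variable divided by its degrees of freedom, and the two are independent because $\hat\beta$ and SSE are independent. This matches the definition of the t-distribution. Setting $\beta_j = 0$ in $T_j$ gives the test statistic for $H_0: \beta_j = 0$.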
