SB1a Applied Statistics Lectures 9-10

Size: px

Start display at page:

Download "SB1a Applied Statistics Lectures 9-10"

Jean Scott
5 years ago
Views:

1 SB1a Applied Statistics Lectures 9-10 Dr Geoff Nicholls Week 5 MT15

2 - Natural or canonical) exponential families - Generalised Linear Models for data - Fitting GLM s to data MLE s Iteratively Re-weighted Least Squares

3 Natural exponential families Restrict observation model to Y fy θ) ) yθ κθ) fy θ) = exp +cy;φ), y Ω φ φ = 1 natural exponential family φ > 0 natural exponential dispersion family. ) yθ expκ/φ) = exp Ω φ +cy) dy θ {θ; κθ) < } y natural observation θ natural parameter φ dispersion parameter Note: discussion here restricted to natural exponential families with a scalar parameter. For general characterisation of EF s see recommended reading.

4 Example: Normal Y Nµ,σ 2 ) is NEDF fy θ) = 1 ) 2πσ 2 exp y µ)2 2σ 2 yµ = exp σ 2 µ2 2σ 2 log2πσ2 ) 2 ) yθ κθ) = exp +cy;φ) φ θ = µ, y R, φ = σ 2, κ = µ 2 /2 cy;φ) = 1/2)log2πφ) y 2 /2φ ) y2 2σ 2 Student s-t: Y tν) is not in this class fy ν) 1+y 2 /ν) ν+1 2 cant factorise y and ν in exponent. )

5 κ is the cummulant generating function e κ/φ ) yθ = exp Ω φ +cy) dy κθ) cummulant GF at φ = 1). κ ) y yθ φ eκ/φ = exp Ω φ φ +cy) κ = EY) κ ) 2 ) φ 2 + k e κ/φ ) 2 y = exp φ Ω φ κ ) 2 ) φ 2 + k = EY 2 )/φ 2 φ vary) = EY 2 ) EY) 2 ) = κ ) 2 +φκ κ ) 2 = φκ. Let µ i = EY i ), vary i ) = φvµ i ). dy ) yθ φ +cy) dy Exercise dµ i dθ i = Vµ i ) so µ i increases with θ i.

6 Modeling with GLM s Suppose we have a response y and some explanatory) variables x. Let η = x T β. In a NLM the response y Nηβ),σ 2 ) depends on the parameters β only through the linear combination η. Also, as η gets bigger so does EY), since EY) = η. This makes it relatively easy to interpret the effect of the parameters β. If x i > 0 then positive increasing) β i is evidence for a positive increasing) effect on the response. Let s generalise this but try and keep the interpretability.

7 Modeling with GLM s - model outline Allow the response Y fy θ) to have a distribution given by a function in an exponential family. Y could be Poisson, Binomial, Gamma,... lots. Insist that the distribution of Y depend on β only through the linear combination ηβ), that is θ = θηβ)). But, let the mean response EY) be some smooth monotone function of the linear predictor which we can specify. This defines a large but still tractable and interpretable class of models called GLM s. In this course we will look at a few of the simplest and most useful examples dimθ) = 1, mostly). However, much of what we will learn carries across to other GLM s.

8 Modeling with GLM s - definitions three decisions fix the model. Distribution: Y i fy i θ i ), iid for i = 1,2,...,n. given the EV, how is the response distributed? Linear predictor: η i = x i,1 β 1 +x i,2 β x i,p β p what variables x 1,x 2,...,x p should we use? Link function gµ i ) = η i increasing, continuous, differentiable function) How does mean response increase with LP? η i µ i θ i dbn of Y i. All these steps work because the key relations are invertible) smooth monotone functions. NLM y i = x i β +ǫ i, ǫ i N0,σ 2 ) Stochastic, Y i Nµ i,σ 2 ) jointly indpendent Deterministic, η i = x T i β, η = Xβ Link, choice gµ i ) = µ i gives NLM EY i ) = x i β.

9 Modeling with GLM s - Log-Likelihood lβ;y) = n i=1 y i θ i κθ i ) φ +cy i ;φ), with θ i = θx i β) since µ i = κ θ i ) and gµ i ) = η i both invertible Canonical link function important special case) If we choose link function gµ i ) = κ 1 µ i ) then θ i = η i since gµ i ) = θ i. Log-Likelihood canonical link, φ = 1 or known) lβ;y) = n i=1 y i η i κη i ) +cy,φ), η i = x i β. φ

10 Example - GLM for a Poisson response Y i Poissonλ i ), independent EY i ) = λ i. exp λ) λy y! = expylogλ) λ logy!)) ) yθ κθ) = exp +cy;φ) φ θ = logλ), φ = 1, κθ) = expθ). Check EY): µ = κ θ) = expθ) = λ. Check vary): φvµ) = κ θ) = λ = µ. Link function gµ) = η where η = xβ: Canonical link gµ) = κ 1 µ) = logµ) Log-likelihood lβ;y) = n i=1 y i x i β e x iβ

11 y~poissonmu) mu=exp1+2x) y x

12 Example - GLM for Y i Binomialm,p i ) fy θ) = C m y py 1 p) m y θ = log = expylog ) p 1 p p 1 p ) +mlog1 p)+logc m y )) log odds), φ = 1 κθ) = mlog1 p) = mlog1+expθ)) Check EY) = mp: κ θ) = mexpθ)/1+expθ)). Exercise Check variance φv = mp1 p). Canonical link gµ) = κ 1 µ) µ/m gµ) = log 1 µ/m Log-likelihood lβ;y) = n i=1 ) y i x i β mlog1+e x iβ )

13 Inference with GLM s - MLE s The MLEs β = ˆβ for the GLM satisfy score equations lβ; y) β i β R p,lβ;y) : R p R. = 0, i = 1,2,...,p. In terms of the gradient operator l l β = l,,..., l β 1 β 2 β p l β = 0 at β = ˆβ. ) T Hessian operator is β β T = β 2 1 β 2 β 1 β p β 1 β 1 β 2... β 1 β p β 2 β p.. β p β... 2 β 2 p

14 Observed information Expected information Compute I Jy) = / β β T I = E/ β β T ) ˆβ D Nβ,I 1 ) with n. l β i = n k=1 l η k η k β i = l η η T β i l β T = l η η T β T = l η TX ββ T = ηt β η l β T = X T 2 l η η TX

15 Let W = E so that I = X T WX and 2 ) l η η T ˆβ D Nβ,X T WX) 1 ). W is diagonal since l = kl k with l k = l k η k ). 2 ) l W kk = E = E ηk 2 [ dθk dη k η k = gκ θ k )) ] 2 θ 2 k 1 = dθ k g µ k )κ θ k ) dη k dθ k 1 = dη k g µ k )Vµ k ) = κ /φ θ 2 k W kk = 1 g µ k ) 2 φvµ k )

16 Example: Poisson, canonical link φ = 1, κθ) = expθ), Vµ) = µ. ηβ) = Xβ. Link gµ) = η, canonical link logµ) = η. Log-likelihood and score equations l/ β = 0: lβ;y) = n i=1 y i η i e η i X T y = X T µ, where µ = µβ), solve for β W kk = E = e η k = µ k = η 2 k 1 ) g µ k ) 2 φvµ k ) I = J = X T diagµ 1,...,µ n )X

17 Inference with GLM s - Estimation via Iteratively Re-weighted Least Squares Newton-Raphson iteration z : fz ) = 0 Given z : fz) 0, improve estimate via to z : fz ) 0 fz ) fz)+df/dz)z z) z = z df/dz) 1 fz) and iterate. For stationary point z : f z ) = 0 z = z d 2 f/dz 2 ) 1 f z), i = 0,1,2,... Multivariate case, expand log-likelihood at β l β β=β and solving for LHS=0, l β + 2 l β β Tβ β). β = β β β T ) 1 l β.

18 Scoring β = β +X T WX) 1 l β l l = XT β η = X T θt η l θ β = X T WX) 1 X T W l θ θ T η y µ) = φ = diag Xβ +W 1 θt η 1 g µ 1 )Vµ 1 )... 1 g µ n )Vµ n ) W 1 = diag g µ 1 ) 2 φvµ 1 )...g µ n ) 2 φvµ n ) ) The IRLS iteration is z = Xβ +g µ)y µ) β = X T WX) 1 X T Wz ) l θ )

19 Example: Poisson, canonical link φ = 1, κθ) = expθ), Vµ) = µ. Link gµ) = η, canonical link logµ) = η. W = diagµ 1,...,µ n ) z = Xβ + y µ) µ β = X T WX) 1 X T Wz µ = expxβ ) z = Xβ + y µ ) µ

20 y~poissonmu) mu=exp1+2x) y x

21 > #IRLS - initialise > mu<-y+1; > eta<-logmu); > z<-eta > W<-diagmu) y~poissonmu) mu=exp1+2x) y x

22 > #10 iterations of IRLS > its<-10; > for i in 1:its) { + beta<-solvetx)%*%w%*%x)%*%tx)%*%w%*%z; + eta<-x%*%beta; + mu<-expeta); + W<-diagcmu)) + z<-x%*%beta+y-mu)/mu); + } y~poissonmu) mu=exp1+2x) y x

23 > #MLE if converged) > beta [,1] > #inverse Fisher Information I=X^TWX > Iinv<-solvetX)%*%W%*%X) > #standard errors > sqrtdiagiinv)) > #compare glm) > summaryglmy~1+x,family= poisson ))... Coefficients: Estimate Std. Error z value Pr> z ) Intercept) e-12 x e AIC: Number of Fisher Scoring iterations: 4

24 Iteratively Re-weighted Least Squares Algorithm µ 0) = y so Xβ 0) = η 0) = gµ 0) ) = gy) z 0) = gy) and W 0) = diagg y) 2 φvy)) 1. For i = 0,1,2,..., a) set β i+1) = X T W i) X) 1 X T W i) z i) Regress z i) on X with weights diagw). b) η i+1) = Xβ i+1), µ i+1) = g 1 η i+1) ), z i+1) = η i+1) +g µ i+1) )y µ i+1) ) and W i+1) = diag 1 g µ i+1) ) 2 φvµ i+1) ) ).

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two