Best Linear Unbiased Prediction (BLUP) of Random Effects in the Normal Linear Mixed Effects Model. *Modified notes from Dr. Dan Nettleton from ISU

1 Best Linear Unbiased Prediction (BLUP) of Random Effects in the Normal Linear Mixed Effects Model *Modified notes from Dr. Dan Nettleton from ISU

2 Suppose intelligence quotients (IQs) for a population of students are normally distributed with mean $\mu$ and variance $\sigma_u^2$: $\text{IQ} \sim N(\mu, \sigma_u^2)$.

3 Suppose an IQ test was given to an i.i.d. sample of such students. Suppose that, given the IQ of a student (something hard to measure), the test score for that student is normally distributed with mean equal to the student's IQ and variance $\sigma^2$, and is independent of the test score of any other student: $\text{score} \mid \text{IQ} \sim N(\text{IQ}, \sigma^2)$.

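4 Consider our linear mixed effects model $Y = X\beta + Zu + e$, where
$\begin{bmatrix} u \\ e \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} G & 0 \\ 0 & R \end{bmatrix} \right)$.
Note that this model coincides with $u \sim N(0, G)$ and $e \sim N(0, R)$, independent of each other.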

5 Given the data $y$, what is our best guess for the unobserved vector $u$ (the random student effects)? Because $u$ is a random vector rather than a fixed parameter, we talk about predicting $u$ rather than estimating $u$. We seek a Best Linear Unbiased Predictor (BLUP) for $u$, which we will denote by $\hat{u}$.

6 To be a BLUP, we require
1. $\hat{u}$ to be a linear function of $y$,
2. $\hat{u}$ to be unbiased for $u$, so that $E(\hat{u} - u) = 0$, and
3. $\text{Var}(\hat{u} - u)$ to be no larger than $\text{Var}(v - u)$, where $v$ is any other linear and unbiased predictor.

7 The BLUP of $u$ is
$\hat{u} = GZ'\Sigma^{-1}(y - X\hat{\beta}_{\Sigma})$.
For the usual case in which $G$ and $\Sigma = ZGZ' + R$ are unknown, we replace the matrices by estimates and approximate the BLUP of $u$ by
$\hat{u} = \hat{G}Z'\hat{\Sigma}^{-1}(y - X\hat{\beta}_{\hat{\Sigma}})$.
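The slide does not define the estimator of $\beta$ that appears in the BLUP formula. Under the usual reading of that notation (an assumption here), it is the generalized least squares estimator, and the leading factor of the BLUP is the covariance of $u$ with $y$ times the inverse variance of $y$. A short sketch, assuming $X$ has full column rank:

```latex
\hat{\beta}_{\Sigma} = (X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1} y,
\qquad
\operatorname{Cov}(u, y) = \operatorname{Cov}(u,\, X\beta + Zu + e) = GZ',
\qquad
\operatorname{Var}(y) = ZGZ' + R = \Sigma .
```

With $\beta$ known, $GZ'\Sigma^{-1}(y - X\beta)$ is the conditional mean $E(u \mid y)$ under joint normality; the BLUP substitutes the generalized least squares estimate for $\beta$.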

8 Let's return to the IQ example... Suppose it is known that $\sigma_u^2 / \sigma^2 = 9$. If we sample 100 students and their sample mean test score was 100, what is the best prediction of the IQ of a student who scored 130 on the test?

9 We will assume $u_1, \ldots, u_{100} \overset{iid}{\sim} N(0, \sigma_u^2)$, independent of $e_1, \ldots, e_{100} \overset{iid}{\sim} N(0, \sigma^2)$. If we let $\mu + u_i$ denote the IQ of student $i$, then the IQs of the students are $N(\mu, \sigma_u^2)$, as stated at the beginning. If we let $y_i = \mu + u_i + e_i$ denote the test score of student $i$, then $y_i \mid (\mu + u_i) \sim N(\mu + u_i, \sigma^2)$, as stated at the beginning.

10 For this case, we have $n = 100$ and $Y = X\beta + Zu + e$, where $X = 1_n$, $\beta = \mu$, $Z = I_n$, $G = \sigma_u^2 I_n$, and $R = \sigma^2 I_n$. Then
$\Sigma = ZGZ' + R = (\sigma_u^2 + \sigma^2) I_n$
and
$GZ'\Sigma^{-1} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2} I_n$.

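11 And the BLUP for $u$ is
$\hat{u} = GZ'\Sigma^{-1}(y - X\hat{\beta}_{\Sigma}) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2}(y - 1\bar{y}_{\cdot})$,
since $\Sigma$ is proportional to $I_n$ here, so $\hat{\beta}_{\Sigma} = \hat{\mu}$ is the sample mean score $\bar{y}_{\cdot}$. The $i$th element of this vector is
$\hat{u}_i = \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2}(y_i - \bar{y}_{\cdot})$.
Thus, the BLUP for $\mu + u_i$ (the IQ of student $i$) is
$\hat{\mu} + \hat{u}_i = \bar{y}_{\cdot} + \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2}(y_i - \bar{y}_{\cdot}) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2} y_i + \frac{\sigma^2}{\sigma_u^2 + \sigma^2} \bar{y}_{\cdot}$.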

12 Note that the BLUP is a weighted average of the individual score and the overall mean score:
$\frac{\sigma_u^2}{\sigma_u^2 + \sigma^2} y_i + \frac{\sigma^2}{\sigma_u^2 + \sigma^2} \bar{y}_{\cdot}$.
If there is relatively high variability among students (large $\sigma_u^2$) compared to the variability within a student (small $\sigma^2$), then more weight is put on the individual score.
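The weighted-average form can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the variances are treated as known, the values $\mu = 100$, $\sigma_u^2 = 225$, $\sigma^2 = 25$ (ratio 9) and the sample size are hypothetical, and the function names `blup_iq` and `simulate` are made up for this example. It compares the mean squared prediction error of the raw score with that of the shrunken prediction.

```python
import random


def blup_iq(scores, var_u, var_e):
    """Shrink each score toward the sample mean: ybar + k * (y_i - ybar)."""
    ybar = sum(scores) / len(scores)
    k = var_u / (var_u + var_e)  # weight placed on the individual score
    return [ybar + k * (y - ybar) for y in scores]


def simulate(n=5000, mu=100.0, var_u=225.0, var_e=25.0, seed=1):
    """Simulate IQs and test scores; return (MSE of raw score, MSE of BLUP)."""
    rng = random.Random(seed)
    # True IQ of student i is mu + u_i; the test score adds measurement error e_i.
    iq = [mu + rng.gauss(0.0, var_u ** 0.5) for _ in range(n)]
    scores = [q + rng.gauss(0.0, var_e ** 0.5) for q in iq]
    preds = blup_iq(scores, var_u, var_e)
    mse_raw = sum((y - q) ** 2 for y, q in zip(scores, iq)) / n
    mse_blup = sum((p - q) ** 2 for p, q in zip(preds, iq)) / n
    return mse_raw, mse_blup


if __name__ == "__main__":
    mse_raw, mse_blup = simulate()
    print(f"mean squared error, raw score: {mse_raw:.2f}")
    print(f"mean squared error, BLUP:      {mse_blup:.2f}")
```

With these settings the shrunken prediction should show the smaller error, which is what criterion 3 of the BLUP definition leads one to expect.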

13 Let's return to the IQ example... Suppose it is known that $\sigma_u^2 / \sigma^2 = 9$. If we sample 100 students and their sample mean test score was 100, what is the best prediction of the IQ of a student who scored 130 on the test?
$\frac{\sigma_u^2}{\sigma_u^2 + \sigma^2} = \frac{\sigma_u^2 / \sigma^2}{\sigma_u^2 / \sigma^2 + 1} = \frac{9}{9 + 1} = 0.9$
We would predict the IQ of a student who scored 130 on the test to be somewhat shrunk toward the mean: $0.9(130) + 0.1(100) = 127$.
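The arithmetic on this slide needs only the variance ratio, not the two variances separately. A minimal sketch, where the function name `predict_iq` and its parameter names are hypothetical:

```python
def predict_iq(score, mean_score, variance_ratio):
    """BLUP of a student's IQ when only sigma_u^2 / sigma^2 is known."""
    weight = variance_ratio / (variance_ratio + 1.0)  # weight on the student's own score
    return weight * score + (1.0 - weight) * mean_score


if __name__ == "__main__":
    # Slide example: ratio 9, sample mean score 100, observed score 130.
    print(f"{predict_iq(130, 100, 9):.1f}")
    # A ratio of 1 weights the score and the mean equally.
    print(f"{predict_iq(130, 100, 1):.1f}")
```

The first call reproduces the prediction of 127; the second shows that a smaller ratio pulls the prediction further toward the mean (115 in that case).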

14 Example: Gene Expression
Earlier in the semester, we introduced random effects using a gene expression example where there were 10 randomly chosen lines and 3 replicates within each line for a given gene:
$Y_{ij} = \mu + L_i + \epsilon_{ij}$ for $i = 1, 2, \ldots, 10$ and $j = 1, 2, 3$,
with $L_i \overset{iid}{\sim} N(0, \sigma_L^2)$ and $\epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$.

15 Example: Gene Expression
Fit the random effects model for gene 1 and save the BLUPs in a data set using the ODS OUTPUT statement.

    ods output SolutionR=blups;
    proc mixed data=gene1;
      class Line;
      model Expression=;
      random Line / solution;   /* <---- SOLUTION requests the BLUPs */
    run;
    ods output close;

16 Example: Gene Expression
The grand mean is

    data blups;
      set blups;
      LineBlup = Estimate;      /* copy the BLUP into LineBlup */
      keep Line LineBlup;
    run;
    proc print data=blups;
    run;

(Output listing columns: Obs, Line, LineBlup.)

17 Example: Gene Expression
Get the line means and compare to BLUPs.

    ods output summary=means;
    proc means data=gene1;
      by Line;                  /* assumes gene1 is sorted by Line */
      var Expression;
    run;
    ods output close;

    data means;
      set means;
      keep Line Expression_Mean Expression_N;
    run;

    data both;
      merge means blups;        /* one-to-one merge; relies on matching row order */
    run;

    proc print data=both;
    run;

18 Example: Gene Expression
(Output listing columns: Obs, Line, Expression_N, Expression_Mean, LineBlup.)
Line means that are above the overall mean $\bar{Y}_{\cdot\cdot} = 4.10$ have BLUPs that are brought down a bit (those that are below the overall mean have BLUPs that are brought up a bit). This is shrinkage toward the mean.
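The amount of shrinkage can be written explicitly for this design. The sketch below assumes the balanced layout described earlier (3 replicates per line) and treats the variance components as known; PROC MIXED substitutes estimates, so the printed BLUPs follow this form only with the estimated components plugged in. Applying the general BLUP formula with $G = \sigma_L^2 I_{10}$ and $R = \sigma^2 I_{30}$ gives:

```latex
\hat{L}_i
  = \frac{3\sigma_L^2}{3\sigma_L^2 + \sigma^2}
    \left( \bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot} \right)
  = \frac{\sigma_L^2}{\sigma_L^2 + \sigma^2/3}
    \left( \bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot} \right)
```

The multiplier lies between 0 and 1, so each BLUP has the same sign as the deviation of its line mean from the overall mean but a smaller magnitude. This is the same weighted-average structure as in the IQ example, with $\sigma^2/3$ (the variance of a mean of 3 replicates) in place of $\sigma^2$.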

19 Example: Gene Expression

    proc sgplot data=both;
      scatter x=expression_mean y=lineblup;
      lineparm x=0 y=0 slope=1;
      refline / axis=x;         /* REFLINE expects a value before the slash; none is given here */
      refline / axis=y;
    run;

20 Example: Gene Expression
We usually check the normality of the residuals (i.e., given the BLUPs, or conditioning on the BLUPs), but we could also check the normality of the random $L_i$ effects using the BLUPs, though I don't think this is done in practice very often.

21 Example: Gene Expression

    proc rank data=blups normal=blom out=diag;
      var LineBlup;
      ranks rankvalue;          /* Blom normal scores for the BLUPs */
    run;

    proc sgplot data=diag;
      scatter x=rankvalue y=lineblup;
      xaxis label="Normal Quantiles";
    run;
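The NORMAL=BLOM option replaces each value by the normal score $\Phi^{-1}\left((r_i - 3/8)/(n + 1/4)\right)$, where $r_i$ is the rank of the value and $n$ is the number of values. A minimal sketch of that computation follows; the function name `blom_scores` and the five input values are hypothetical (they are not the BLUPs from the gene expression fit), and the sketch assumes there are no tied values.

```python
from statistics import NormalDist


def blom_scores(values):
    """Blom normal scores: Phi^{-1}((r - 3/8) / (n + 1/4)), r = rank of each value.

    Assumes no ties among the values.
    """
    n = len(values)
    # Indices of the values in increasing order; position in this list gives the rank.
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    std_normal = NormalDist()
    for rank, idx in enumerate(order, start=1):
        scores[idx] = std_normal.inv_cdf((rank - 0.375) / (n + 0.25))
    return scores


if __name__ == "__main__":
    blups = [0.3, -0.1, 0.5, -0.4, 0.0]  # hypothetical values for illustration
    for b, s in zip(blups, blom_scores(blups)):
        print(f"{b:6.2f}  {s:7.3f}")
```

Plotting the values against these scores gives the normal quantile plot; points close to a straight line are consistent with normally distributed random effects.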
