ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA
Volume 14, 2010

A note on the equality of the BLUPs for new observations under two linear models

Stephen J. Haslett and Simo Puntanen

Abstract. We consider two linear models, $\mathcal{L}_1$ and $\mathcal{L}_2$, say, with new unobserved future observations. We give necessary and sufficient conditions, in the general situation without any rank assumptions, for the best linear unbiased predictor (BLUP) of the new observations under the model $\mathcal{L}_1$ to continue to be the BLUP also under the model $\mathcal{L}_2$.

1. Introduction

In the literature the invariance of the best linear unbiased estimator (BLUE) of fixed effects under the general linear model
$$ y = X\beta + \varepsilon, \tag{1.1} $$
where $E(y) = X\beta$, $E(\varepsilon) = 0$, and $\operatorname{cov}(y) = \operatorname{cov}(\varepsilon) = V_{11}$, has received much attention; see, for example, the papers by Rao (1967, 1971, 1973), and Mitra and Moore (1973). In particular, the connection between the ordinary least squares estimator and $\mathrm{BLUE}(X\beta)$ has been studied extensively; see, e.g., Rao (1967), Zyskind (1967), and the review papers by Puntanen and Styan (1989), and Baksalary, Puntanen and Styan (1990a).

To our knowledge, the equality of the best linear unbiased predictors (BLUPs) in two models $\mathcal{L}_1$ and $\mathcal{L}_2$, defined below, has received very little attention in the literature. Haslett and Puntanen (2010a,b) considered the equality of the BLUPs of the random factor under two mixed models. In their (2010a) paper they gave, without a proof, necessary and sufficient conditions in the general situation for the BLUP of the new observation under the model $\mathcal{L}_1$ to continue to be the BLUP also under the model $\mathcal{L}_2$. The purpose of this paper is to provide a complete proof of this result.

Received March 8, 2010.
2010 Mathematics Subject Classification. 15A42, 62J05, 62H12, 62H20.
Key words and phrases. Best linear unbiased estimator (BLUE), best linear unbiased predictor (BLUP), generalized inverse, linear fixed effects model, orthogonal projector.
Let us start formally by considering the general linear model (1.1), which can be represented as a triplet
$$ \mathcal{F} = \{y,\, X\beta,\, V_{11}\}. \tag{1.2} $$
The vector $y$ is an $n \times 1$ observable random vector, $\varepsilon$ is an $n \times 1$ random error vector, $X$ is a known $n \times p$ model matrix, $\beta$ is a $p \times 1$ vector of fixed but unknown parameters, and $V_{11}$ is a known $n \times n$ nonnegative definite matrix.

Let $y_f$ denote the $q \times 1$ unobservable random vector containing new future observations. The new observations are assumed to follow the linear model
$$ y_f = X_f\beta + \varepsilon_f, $$
where $X_f$ is a known $q \times p$ model matrix associated with the new observations, $\beta$ is the same vector of unknown parameters as in (1.2), and $\varepsilon_f$ is a $q \times 1$ random error vector associated with the new observations. The expectation and the covariance matrix are
$$ E\begin{pmatrix} y \\ y_f \end{pmatrix} = \begin{pmatrix} X\beta \\ X_f\beta \end{pmatrix} = \begin{pmatrix} X \\ X_f \end{pmatrix}\beta, \qquad \operatorname{cov}\begin{pmatrix} y \\ y_f \end{pmatrix} = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} = V, $$
where the entire covariance matrix $V$ is assumed to be known. For brevity, we denote
$$ \mathcal{L}_1 = \left\{ \begin{pmatrix} y \\ y_f \end{pmatrix},\ \begin{pmatrix} X \\ X_f \end{pmatrix}\beta,\ \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} \right\}. $$

Before proceeding, we introduce the notation used in this paper. We denote by $\mathbb{R}^{m \times n}$ the set of $m \times n$ real matrices, and $\mathbb{R}^m = \mathbb{R}^{m \times 1}$. We use the symbols $A'$, $A^-$, $A^+$, $\mathscr{C}(A)$, $\mathscr{C}(A)^\perp$, and $\mathscr{N}(A)$ to denote the transpose, a generalized inverse, the Moore–Penrose inverse, the column space, the orthogonal complement of the column space, and the null space of the matrix $A$, respectively. By $(A : B)$ we denote the partitioned matrix with $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{m \times k}$ as submatrices. By $A^\perp$ we denote any matrix satisfying $\mathscr{C}(A^\perp) = \mathscr{N}(A') = \mathscr{C}(A)^\perp$. Furthermore, we write
$$ P_A = AA^+ = A(A'A)^-A' $$
to denote the orthogonal projector (with respect to the standard inner product) onto $\mathscr{C}(A)$. In particular, we denote $H = P_X$ and $M = I_n - P_X$. Notice that one choice for $X^\perp$ is of course $M$.

We assume the model $\mathcal{L}_1$ to be consistent in the sense that
$$ y \in \mathscr{C}(X : V_{11}) = \mathscr{C}(X : V_{11}M), \tag{1.3} $$
i.e., the observed value of $y$ lies in $\mathscr{C}(X : V_{11})$ with probability 1. The corresponding consistency is assumed in all models that we will consider.
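The column-space identity $\mathscr{C}(X : V_{11}) = \mathscr{C}(X : V_{11}M)$ used in (1.3) can be checked numerically. The following sketch is ours, not part of the paper; it uses NumPy, with the Moore–Penrose pseudoinverse in place of an arbitrary generalized inverse, and a hypothetical simulated $X$ and singular $V_{11}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 2

# Hypothetical data: X of full column rank, V11 nonnegative definite but singular.
X = rng.standard_normal((n, p))
A = rng.standard_normal((n, 3))
V11 = A @ A.T                         # rank 3 < n

# Projectors H = P_X and M = I_n - P_X used throughout the paper.
H = X @ np.linalg.pinv(X)             # P_X = X X^+
M = np.eye(n) - H

# C(X : V11) = C(X : V11 M): the two stacked matrices have equal rank,
# and C(X : V11 M) is contained in C(X : V11), so the spaces coincide.
r1 = np.linalg.matrix_rank(np.hstack([X, V11]))
r2 = np.linalg.matrix_rank(np.hstack([X, V11 @ M]))
assert r1 == r2
```

Since the inclusion $\mathscr{C}(X : V_{11}M) \subseteq \mathscr{C}(X : V_{11})$ is immediate, equal ranks suffice for equality of the column spaces.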
A linear predictor $Gy$ is said to be unbiased for $y_f$ if the expected prediction error is zero:
$$ E(y_f - Gy) = 0 \quad \text{for all } \beta \in \mathbb{R}^p. $$
This is equivalent to $GX = X_f$, i.e., $X_f' = X'G'$. The requirement $\mathscr{C}(X_f') \subseteq \mathscr{C}(X')$ means that $X_f\beta$ is estimable under $\mathcal{F} = \{y, X\beta, V_{11}\}$. Now an unbiased
linear predictor $Gy$ is the best linear unbiased predictor, BLUP, for $y_f$, if the Löwner ordering
$$ \operatorname{cov}(Gy - y_f) \leq_L \operatorname{cov}(Fy - y_f) $$
holds for all $F$ such that $Fy$ is an unbiased linear predictor for $y_f$. The following lemma characterizes the BLUP; for the proof, see, e.g., Christensen (2002, p. 283), and Isotalo and Puntanen (2006, p. 1015).

Lemma 1.1. Consider the linear model $\mathcal{L}_1$ (with new unobserved future observations), where $X_f\beta$ is a given vector of estimable parametric functions. The linear predictor $Gy$ is the best linear unbiased predictor (BLUP) for $y_f$ if and only if $G$ satisfies the equation
$$ G(X : V_{11}X^\perp) = (X_f : V_{21}X^\perp) \tag{1.4} $$
for any given choice of $X^\perp$.

We can obtain, for example, the following matrices $G_i$ such that $G_iy$ equals $\mathrm{BLUP}(y_f)$:
$$ G_1 = X_f(X'W^-X)^-X'W^- + V_{21}W^-[I_n - X(X'W^-X)^-X'W^-], $$
$$ G_2 = X_f(X'W^-X)^-X'W^- + V_{21}M(MV_{11}M)^-M, $$
$$ G_3 = X_f(X'X)^-X' + [V_{21} - X_f(X'X)^-X'V_{11}]M(MV_{11}M)^-M, $$
where $W$ (and the related $U$) is any matrix such that
$$ W = V_{11} + XUX', \qquad \mathscr{C}(W) = \mathscr{C}(X : V_{11}). $$
Notice that equation (1.4) has a unique solution if and only if $\mathscr{C}(X : V_{11}M) = \mathbb{R}^n$. According to Rao and Mitra (1971, p. 24) the general solution to (1.4) can be written, for example, as
$$ G = G_i + F(I_n - P_{(X : V_{11}M)}), $$
where the matrix $F$ is free to vary and $P_{(X : V_{11}M)}$ denotes the orthogonal projector onto the column space of the matrix $(X : V_{11}M)$. Even though the multiplier $G$ may not be unique, the observed value $Gy$ of the BLUP is unique with probability 1; this is due to the consistency requirement (1.3).

Consider now another linear model $\mathcal{L}_2$, which may differ from $\mathcal{L}_1$ through its covariance matrix and model matrix, i.e.,
$$ \mathcal{L}_2 = \left\{ \begin{pmatrix} y \\ y_f \end{pmatrix},\ \begin{pmatrix} \underline{X} \\ \underline{X}_f \end{pmatrix}\beta,\ \begin{pmatrix} \underline{V}_{11} & \underline{V}_{12} \\ \underline{V}_{21} & \underline{V}_{22} \end{pmatrix} \right\}. $$
In the next section we consider the conditions under which the BLUP for $y_f$ under $\mathcal{L}_1$ continues to be the BLUP under $\mathcal{L}_2$. Naturally, as pointed out by an anonymous referee, $\underline{X}_f\beta$ must be estimable under $\{y, \underline{X}\beta, \underline{V}_{11}\}$; otherwise $y_f$ does not have a BLUP under $\mathcal{L}_2$.
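Before turning to the main results, the representations $G_2$ and $G_3$ and the fundamental equation (1.4) can be verified numerically. The sketch below is ours, not from the paper: it assumes NumPy, replaces generalized inverses by Moore–Penrose pseudoinverses, simulates a positive definite $V$, and takes $W = V_{11}$, which is admissible here because then $\mathscr{C}(W) = \mathbb{R}^n \supseteq \mathscr{C}(X : V_{11})$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 7, 2, 1

# Hypothetical joint model for (y; y_f): X, Xf known, V positive definite.
X  = rng.standard_normal((n, p))
Xf = rng.standard_normal((q, p))
B  = rng.standard_normal((n + q, n + q))
V  = B @ B.T
V11, V21 = V[:n, :n], V[n:, :n]

M = np.eye(n) - X @ np.linalg.pinv(X)        # M = I_n - P_X
Winv = np.linalg.inv(V11)                    # W = V11, p.d. in this simulation
MVM = np.linalg.pinv(M @ V11 @ M)            # (M V11 M)^- as a pseudoinverse

# Representations G_2 and G_3 of the BLUP multiplier:
G2 = Xf @ np.linalg.pinv(X.T @ Winv @ X) @ X.T @ Winv + V21 @ M @ MVM @ M
G3 = (Xf @ np.linalg.pinv(X.T @ X) @ X.T
      + (V21 - Xf @ np.linalg.pinv(X.T @ X) @ X.T @ V11) @ M @ MVM @ M)

# Both satisfy the fundamental BLUP equation (1.4) with the choice X^perp = M:
assert np.allclose(G2 @ X, Xf) and np.allclose(G2 @ V11 @ M, V21 @ M)
assert np.allclose(G3 @ X, Xf) and np.allclose(G3 @ V11 @ M, V21 @ M)

# Here C(X : V11 M) = R^n, so the solution of (1.4) is unique and G2 = G3.
assert np.allclose(G2, G3)
```

With a singular $V_{11}$ the multiplier would no longer be unique, but $G_2y = G_3y$ would still hold with probability 1 by the consistency requirement (1.3).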
2. Main results

Theorem 2.1. Consider the models $\mathcal{L}_1$ and $\mathcal{L}_2$ (with new unobserved future observations), where $\mathscr{C}(X_f') \subseteq \mathscr{C}(X')$ and $\mathscr{C}(\underline{X}_f') \subseteq \mathscr{C}(\underline{X}')$. Then every representation of the BLUP for $y_f$ under the model $\mathcal{L}_1$ is also a BLUP for $y_f$ under the model $\mathcal{L}_2$ if and only if
$$ \mathscr{C}\begin{pmatrix} \underline{X} & \underline{V}_{11}\underline{M} \\ \underline{X}_f & \underline{V}_{21}\underline{M} \end{pmatrix} \subseteq \mathscr{C}\begin{pmatrix} X & V_{11}M \\ X_f & V_{21}M \end{pmatrix}. \tag{2.1} $$

Proof. For the proof it is convenient to observe that (2.1) holds if and only if
$$ \mathscr{C}\begin{pmatrix} \underline{V}_{11}\underline{M} \\ \underline{V}_{21}\underline{M} \end{pmatrix} \subseteq \mathscr{C}\begin{pmatrix} X & V_{11}M \\ X_f & V_{21}M \end{pmatrix}, \tag{2.2a} $$
and
$$ \mathscr{C}\begin{pmatrix} \underline{X} \\ \underline{X}_f \end{pmatrix} \subseteq \mathscr{C}\begin{pmatrix} X & V_{11}M \\ X_f & V_{21}M \end{pmatrix}. \tag{2.2b} $$
Let us first assume that every representation of the BLUP for $y_f$ under $\mathcal{L}_1$ continues to be the BLUP under $\mathcal{L}_2$. Let $G_0$ be a general solution to (1.4):
$$ G_0 = X_f(X'W^-X)^-X'W^- + V_{21}M(MV_{11}M)^-M + F(I_n - P_{(X : V_{11}M)}), $$
where $F$ is free to vary. Then $G_0$ has to satisfy also the other fundamental BLUP equation:
$$ G_0(\underline{X} : \underline{V}_{11}\underline{M}) = (\underline{X}_f : \underline{V}_{21}\underline{M}), \tag{2.3} $$
where $\underline{M} = I_n - P_{\underline{X}}$. The $\underline{X}$-part of the condition (2.3) is
$$ X_f(X'W^-X)^-X'W^-\underline{X} + V_{21}M(MV_{11}M)^-M\underline{X} + F(I_n - P_{(X : V_{11}M)})\underline{X} = \underline{X}_f. \tag{2.4} $$
Because (2.4) must hold for all matrices $F$, we necessarily have $F(I_n - P_{(X : V_{11}M)})\underline{X} = 0$ for all $F \in \mathbb{R}^{n \times n}$, which further implies $\mathscr{C}(\underline{X}) \subseteq \mathscr{C}(X : V_{11}M)$, i.e.,
$$ \underline{X} = XK_1 + V_{11}MK_2 \quad \text{for some } K_1 \in \mathbb{R}^{p \times p} \text{ and } K_2 \in \mathbb{R}^{n \times p}, \tag{2.5} $$
and hence
$$ \begin{aligned} X_f(X'W^-X)^-X'W^-\underline{X} &= X_f(X'W^-X)^-X'W^-(XK_1 + V_{11}MK_2) \\ &= X_f(X'W^-X)^-X'W^-XK_1 + X_f(X'W^-X)^-X'W^-V_{11}MK_2 \\ &= X_fK_1 + X_f(X'W^-X)^-X'W^-WMK_2 \\ &= X_fK_1, \end{aligned} \tag{2.6} $$
where we have used $V_{11}M = WM$ and $X'W^-W = X'$. Note also that the assumption $\mathscr{C}(X_f') \subseteq \mathscr{C}(X')$ implies $X_f(X'W^-X)^-X'W^-X = X_f$; for properties of matrices of the type $(X'W^-X)^-$, see, e.g., Baksalary and Mathew (1990, Th. 2) and Baksalary, Puntanen and Styan (1990b, Th. 2).

In view of (2.5) we have
$$ V_{21}M(MV_{11}M)^-M\underline{X} = V_{21}M(MV_{11}M)^-MV_{11}MK_2. $$
Because $V$ is nonnegative definite, we have $\mathscr{C}(V_{12}) \subseteq \mathscr{C}(V_{11})$, and thereby
$$ \mathscr{C}(MV_{12}) \subseteq \mathscr{C}(MV_{11}) = \mathscr{C}(MV_{11}M), $$
and so
$$ V_{21}M(MV_{11}M)^-M\underline{X} = V_{21}MK_2. \tag{2.7} $$
Combining (2.6), (2.7) and (2.4) shows that $\underline{X}_f = X_fK_1 + V_{21}MK_2$, which together with (2.5) yields the following:
$$ \begin{pmatrix} \underline{X} \\ \underline{X}_f \end{pmatrix} = \begin{pmatrix} X & V_{11}M \\ X_f & V_{21}M \end{pmatrix}\begin{pmatrix} K_1 \\ K_2 \end{pmatrix}, \tag{2.8} $$
i.e., (2.2b) holds.

The right-hand part of the condition (2.3) is
$$ \begin{aligned} &\ X_f(X'W^-X)^-X'W^-\underline{V}_{11}\underline{M} & \text{(2.9a)} \\ &+ V_{21}M(MV_{11}M)^-M\underline{V}_{11}\underline{M} & \text{(2.9b)} \\ &+ F(I_n - P_{(X : V_{11}M)})\underline{V}_{11}\underline{M} & \text{(2.9c)} \\ &= \underline{V}_{21}\underline{M}. & \text{(2.9d)} \end{aligned} $$
Again, because (2.9) must hold for all matrices $F$, we necessarily have $F(I_n - P_{(X : V_{11}M)})\underline{V}_{11}\underline{M} = 0$ for all $F \in \mathbb{R}^{n \times n}$, which further implies $\mathscr{C}(\underline{V}_{11}\underline{M}) \subseteq \mathscr{C}(X : V_{11}M)$, and hence
$$ \underline{V}_{11}\underline{M} = XL_1 + V_{11}ML_2 \quad \text{for some } L_1 \in \mathbb{R}^{p \times n} \text{ and } L_2 \in \mathbb{R}^{n \times n}. \tag{2.10} $$
Substituting (2.10) into (2.9a) gives (proceeding as was done above with the $\underline{X}$-part)
$$ X_f(X'W^-X)^-X'W^-\underline{V}_{11}\underline{M} = X_fL_1, $$
while the term (2.9b) becomes
$$ V_{21}M(MV_{11}M)^-MV_{11}ML_2 = V_{21}ML_2, $$
and hence (2.9) gives
$$ X_fL_1 + V_{21}ML_2 = \underline{V}_{21}\underline{M}. \tag{2.11} $$
Combining (2.10) and (2.11) yields
$$ \begin{pmatrix} \underline{V}_{11}\underline{M} \\ \underline{V}_{21}\underline{M} \end{pmatrix} = \begin{pmatrix} X & V_{11}M \\ X_f & V_{21}M \end{pmatrix}\begin{pmatrix} L_1 \\ L_2 \end{pmatrix}, \tag{2.12} $$
i.e., (2.2a) holds.

To go the other way, suppose that (2.2a) and (2.2b) hold and that $K_1$ and $K_2$, and $L_1$ and $L_2$, are defined as in (2.8) and in (2.12), respectively. Moreover, assume that $Gy$ is the BLUP of $y_f$ under $\mathcal{L}_1$, i.e.,
$$ G(X : V_{11}M) = (X_f : V_{21}M). \tag{2.13} $$
Postmultiplying (2.13) by $\begin{pmatrix} K_1 & L_1 \\ K_2 & L_2 \end{pmatrix}$ yields
$$ G(\underline{X} : \underline{V}_{11}\underline{M}) = (\underline{X}_f : \underline{V}_{21}\underline{M}), $$
which confirms that $Gy$ is also the BLUP of $y_f$ under $\mathcal{L}_2$. Thus the proof is completed. □

If both models $\mathcal{L}_1$ and $\mathcal{L}_2$ have the same model matrix part, we get the following corollary.

Corollary 2.1. Consider the same situation as in Theorem 2.1, but suppose that the two models $\mathcal{L}_1$ and $\mathcal{L}_2$ have the same model matrix part $\begin{pmatrix} X \\ X_f \end{pmatrix}$. Then every representation of the BLUP for $y_f$ under the model $\mathcal{L}_1$ is also a BLUP for $y_f$ under the model $\mathcal{L}_2$ if and only if
$$ \mathscr{C}\begin{pmatrix} \underline{V}_{11}M \\ \underline{V}_{21}M \end{pmatrix} \subseteq \mathscr{C}\begin{pmatrix} X & V_{11}M \\ X_f & V_{21}M \end{pmatrix}. $$
Moreover, the sets of all representations of the BLUPs for $y_f$ under $\mathcal{L}_1$ and $\mathcal{L}_2$ are identical if and only if
$$ \mathscr{C}\begin{pmatrix} X & \underline{V}_{11}M \\ X_f & \underline{V}_{21}M \end{pmatrix} = \mathscr{C}\begin{pmatrix} X & V_{11}M \\ X_f & V_{21}M \end{pmatrix}. $$

Acknowledgments. Thanks go to the anonymous referees for constructive remarks.

References

Baksalary, J. K. and Mathew, T. (1990). Rank invariance criterion and its application to the unified theory of least squares. Linear Algebra Appl. 127, 393–401.

Baksalary, J. K., Puntanen, S. and Styan, G. P. H. (1990a). On T. W. Anderson's contributions to solving the problem of when the ordinary least-squares estimator is best linear unbiased and to characterizing rank additivity of matrices. In: The Collected Papers of T. W. Anderson: 1943–1985 (G. P. H. Styan, ed.), Wiley, New York, pp. 1579–1591.

Baksalary, J. K., Puntanen, S. and Styan, G. P. H. (1990b). A property of the dispersion matrix of the best linear unbiased estimator in the general Gauss–Markov model. Sankhyā Ser. A 52, 279–296.

Christensen, R. (2002). Plane Answers to Complex Questions: The Theory of Linear Models, Third Edition. Springer, New York.
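To illustrate Corollary 2.1 numerically (a sketch of ours, not from the paper), take $\mathcal{L}_2$ with the same model matrix part and the covariance matrix simply rescaled, $\underline{V} = 3V$; the column-space inclusion then holds trivially, and any BLUP representation under $\mathcal{L}_1$ remains a BLUP under $\mathcal{L}_2$. The inclusion test via ranks and the pseudoinverse substitution are implementation choices, assuming NumPy.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 7, 2, 1
X  = rng.standard_normal((n, p))
Xf = rng.standard_normal((q, p))
B  = rng.standard_normal((n + q, n + q))
V  = B @ B.T
V11, V21 = V[:n, :n], V[n:, :n]
M = np.eye(n) - X @ np.linalg.pinv(X)        # M = I_n - P_X

# Second model: same (X; Xf), covariance rescaled to 3V.
V11_2, V21_2 = 3 * V11, 3 * V21

def contained(A, Bmat):
    # C(A) is a subset of C(Bmat)  iff  rank(Bmat : A) == rank(Bmat)
    return (np.linalg.matrix_rank(np.hstack([Bmat, A]))
            == np.linalg.matrix_rank(Bmat))

big = np.block([[X, V11 @ M], [Xf, V21 @ M]])    # (X, V11 M; Xf, V21 M)
new = np.vstack([V11_2 @ M, V21_2 @ M])          # (V11_2 M; V21_2 M)
assert contained(new, big)                        # condition of Corollary 2.1

# A BLUP representation under L1 (G_3 of Section 1, with pseudoinverses) ...
P = np.linalg.pinv(X.T @ X)
G = (Xf @ P @ X.T
     + (V21 - Xf @ P @ X.T @ V11) @ M @ np.linalg.pinv(M @ V11 @ M) @ M)

# ... also solves the fundamental BLUP equation under L2:
assert np.allclose(G @ X, Xf)
assert np.allclose(G @ V11_2 @ M, V21_2 @ M)
```

The rescaled covariance is only the simplest case; the corollary covers any $\underline{V}$ whose blocks satisfy the displayed inclusion.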
Haslett, S. J. and Puntanen, S. (2010a). On the equality of the BLUPs under two linear mixed models. Metrika, in press, available online.

Haslett, S. J. and Puntanen, S. (2010b). Equality of BLUEs or BLUPs under two linear models using stochastic restrictions. Statist. Papers, in press, available online.

Isotalo, J. and Puntanen, S. (2006). Linear prediction sufficiency for new observations in the general Gauss–Markov model. Comm. Statist. Theory Methods 35, 1011–1023.

Mitra, S. K. and Moore, B. J. (1973). Gauss–Markov estimation with an incorrect dispersion matrix. Sankhyā Ser. A 35, 139–152.

Puntanen, S. and Styan, G. P. H. (1989). The equality of the ordinary least squares estimator and the best linear unbiased estimator (with discussion). Amer. Statist. 43, 151–161. [Commented by Oscar Kempthorne on pp. 161–162 and by Shayle R. Searle on pp. 162–163; reply by the authors on p. 164.]

Rao, C. R. (1967). Least squares theory using an estimated dispersion matrix and its application to measurement of signals. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, California, 1965/1966, Vol. 1 (L. M. Le Cam and J. Neyman, eds.), Univ. of California Press, Berkeley, pp. 355–372.

Rao, C. R. (1971). Unified theory of linear estimation. Sankhyā Ser. A 33, 371–394. [Corrigendum (1972), 34, p. 194 and p. 477.]

Rao, C. R. (1973). Representations of best linear unbiased estimators in the Gauss–Markoff model with a singular dispersion matrix. J. Multivariate Anal. 3, 276–292.

Rao, C. R. and Mitra, S. K. (1971). Generalized Inverse of Matrices and Its Applications. Wiley, New York.

Zyskind, G. (1967). On canonical forms, non-negative covariance matrices and best and simple least squares linear estimators in linear models. Ann. Math. Statist. 38, 1092–1109.

Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand
E-mail address: sjhaslett@massey.ac.nz

Department of Mathematics and Statistics, FI-33014 University of Tampere, Tampere, Finland
E-mail address:
simo.puntanen@uta.fi