VARIANCE COMPONENT ESTIMATION & BEST LINEAR UNBIASED PREDICTION (BLUP)


V.K. Bhatia
I.A.S.R.I., Library Avenue, New Delhi

Introduction

Variance components are commonly used in formulating appropriate designs, establishing quality control procedures and, in statistical genetics, estimating heritabilities and genetic correlations. Traditionally the estimators used most often have been the analysis of variance (ANOVA) estimators, which are obtained by equating observed and expected mean squares from an analysis of variance and solving the resulting equations. If the data are balanced, the ANOVA estimators have many appealing properties. In unbalanced situations these properties rarely hold true, which creates a number of problems in arriving at correct decisions. Since in practice variance components are mostly estimated from unbalanced data, these problems are the rule rather than the exception. For unbalanced data two general classes of estimators have sparked considerable interest: maximum likelihood and restricted maximum likelihood (ML and REML), and minimum norm and minimum variance quadratic unbiased estimation (MINQUE and MIVQUE). The links between these methods are also important. In addition to the estimation problems of the unbalanced case, the notion of robust estimation, which takes care of the influence of outliers and of departures from the underlying statistical assumptions, is also of interest.

The classical least squares model contains only one random element, the random error; all other effects are assumed to be fixed constants. For this class of models, the assumption of independence of the $e_i$ implies independence of the $y_i$; that is, if $\operatorname{Var}(e) = I\sigma^2$, then $\operatorname{Var}(y) = I\sigma^2$ also. Such models are called fixed effects models or, more simply, fixed models. There are situations where there is more than one random term. The classical variance components problem, in which the purpose is to estimate components of variance rather than specific treatment effects, is one example. In these cases the treatment effects are assumed to be a random sample from a population of such effects, and the goal of the study is to estimate the variance among these effects in the population. The individual effects that happen to be observed in the study are not of any particular interest except for the information they provide on the variance component. Models in which all effects are assumed to be random are called random models. Models that contain both fixed and random effects are called mixed models.
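As a minimal SAS illustration of the distinction (the data set name oneway and the variables a and y are hypothetical), the same one-way layout can be analysed under a fixed model, where the individual effects of a are estimated, or under a random model, where only their variance is estimated:

proc glm data=oneway;                 /* fixed model: estimates the A effects */
  class a;
  model y = a;
run;

proc varcomp data=oneway method=reml; /* random model: estimates Var(A) */
  class a;
  model y = a;
run;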

Analysis of Variance Approach

The conventional least squares approach to the mixed model, sometimes called the analysis of variance approach, is to assume initially that all effects, other than the term that assigns a unique random element to each observation, are fixed effects. Least squares is applied to this fixed model to obtain the relevant partitions of the sums of squares. Then the model containing the random effects is reinstated and expectations of the mean squares are derived. The mean square expectations determine how tests of significance are to be made and how variance components are to be estimated. Adjustments to tests of significance are made by constructing an error mean square that has the proper expectation with respect to the random elements. This requires the expectations of the mean squares under the random model. For balanced data the mean square expectations are easily obtained. Each expectation is expressed as a linear function of the variance components for the random effects plus a general statement of the classes of fixed effects involved in the quadratic function.

Henderson's Methods I, II & III

Henderson described three methods of estimating variance components that are just three different ways of using the general ANOVA method. They differ only in the quadratics (not always sums of squares) used for the vector of linearly independent quadratic forms of the observations. All three also suffer from the demerits of the general ANOVA method: for unbalanced data no optimal application of the method is known, the methods can yield negative estimates, and the distributional properties of the estimators are not known.

Method I

In Method I the quadratics used are analogous to the sums of squares used for balanced data, the analogy being such that certain sums of squares in balanced data become, for unbalanced data, quadratic forms that are not non-negative definite, and so are not sums of squares. Thus, for example, for the 2-way crossed classification with $n$ observations per cell, the sum of squares

$$bn\sum_i(\bar y_{i..}-\bar y_{...})^2 = bn\sum_i \bar y_{i..}^2 - abn\,\bar y_{...}^2$$

becomes, for unbalanced data,

$$\sum_i n_{i.}(\bar y_{i..}-\bar y_{...})^2 = \sum_i n_{i.}\bar y_{i..}^2 - n_{..}\,\bar y_{...}^2.$$

This method is easy to compute, even for large data sets, and for random models it yields estimators that are unbiased. It cannot be used for mixed models. It can only be adapted to a mixed model by altering that model and treating the fixed effects either as non-existent or as random, in which case the estimated variance components for the true random effects will be biased.
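For illustration, in the balanced one-way random model $y_{ij} = \mu + \alpha_i + e_{ij}$, with $i = 1,\dots,a$ classes and $j = 1,\dots,n$ observations per class, equating the observed mean squares to their expectations gives the ANOVA estimators directly:

$$E(MSA) = \sigma_e^2 + n\sigma_\alpha^2,\qquad E(MSE) = \sigma_e^2 \quad\Longrightarrow\quad \hat\sigma_e^2 = MSE,\qquad \hat\sigma_\alpha^2 = \frac{MSA - MSE}{n}.$$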

Method II

This is designed to capitalize on the easy computability of Method I and to broaden its use by removing the limitation that Method I cannot be used for mixed models. The method has two parts. First, make the temporary assumption that the random effects are fixed and, for the model $y = X\beta + Zu + e$, solve the normal equations

$$\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z \end{bmatrix}\begin{bmatrix} \beta^0 \\ u^0 \end{bmatrix} = \begin{bmatrix} X'y \\ Z'y \end{bmatrix}.$$

Then consider the vector of data adjusted for $\beta^0$, namely $z = y - X\beta^0$; the model for $z$ is

$$z = 1\mu + Zu + Ke,$$

where $\mu$ differs from $\beta$ and where $K$ is known. This can then easily be analysed by Method I. Method II is relatively easy to compute, especially when the number of fixed effects is not too large. And although it can be used for a wide variety of mixed models, it cannot be used for those mixed models that have interactions between fixed and random factors, whether those interactions are defined as random effects (the usual case) or as fixed effects.

Method III

This uses sums of squares that arise in fitting an overparameterised model and submodels thereof. It can be used for any mixed model and yields estimators that are unbiased. Although the method uses sums of squares that are known (at least in some cases) to be useful in certain fixed effects models, no analytic evidence is available that these sums of squares have optimal properties for estimating variances. The main disadvantage of the method is that, with its confinement to sums of squares from fitting overparameterised models, too many sums of squares are available. For example, consider the 2-way crossed classification overparameterised model with equation

$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijk}.$$

Suppose all effects are random. There are then four variance components to estimate: $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_\gamma^2$ and $\sigma_e^2$. But for that model there are five different reductions in sums of squares, $R(\alpha\,|\,\mu)$, $R(\beta\,|\,\mu)$, $R(\alpha\,|\,\mu,\beta)$, $R(\beta\,|\,\mu,\alpha)$ and $R(\gamma\,|\,\mu,\alpha,\beta)$, as well as SSE, which can be used. From these, at least three sets suggest themselves as possible candidates for Method III estimation:

(a) $R(\alpha\,|\,\mu)$, $R(\beta\,|\,\mu,\alpha)$, $R(\gamma\,|\,\mu,\alpha,\beta)$, SSE
(b) $R(\beta\,|\,\mu)$, $R(\alpha\,|\,\mu,\beta)$, $R(\gamma\,|\,\mu,\alpha,\beta)$, SSE
(c) $R(\alpha\,|\,\mu,\beta)$, $R(\beta\,|\,\mu,\alpha)$, $R(\gamma\,|\,\mu,\alpha,\beta)$, SSE

All three sets yield the same estimators of $\sigma_\gamma^2$ and $\sigma_e^2$. Two different estimators each of $\sigma_\alpha^2$ and $\sigma_\beta^2$ arise, and it is difficult to conclude which set of estimators is to be preferred.

ML (Maximum Likelihood)

The method is based on maximizing the likelihood function. For the mixed model, under the assumption of normality of the error terms and random effects, we have

$$y = X\beta + Zu + e \sim N(X\beta,\, V)\qquad\text{with}\qquad V = \sum_{i=1}^{r}\sigma_i^2 Z_i Z_i' + \sigma_e^2 I_N = \sum_{i=0}^{r}\sigma_i^2 Z_i Z_i',$$

where $Z_0 = I_N$ and $\sigma_0^2 = \sigma_e^2$. The likelihood function is then

$$L = (2\pi)^{-N/2}\,|V|^{-1/2}\exp\{-\tfrac12 (y - X\beta)' V^{-1}(y - X\beta)\}.$$

Maximizing $L$ with respect to the elements of $\beta$ and the variance components (the $\sigma_i^2$'s that occur in $V$) leads to equations that have to be solved to yield ML estimators of $\beta$ and of the $\sigma_i^2$. These equations can be written in a variety of ways and can be solved iteratively. Despite the numerical difficulties involved in solving these equations, ML is preferred over the ANOVA method: it is well defined, and the resulting estimators have attractive, well-known large-sample properties, being normally distributed with known sampling variances.
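As a sketch of what is being maximized, the following PROC IML module evaluates $-2\log L$ for trial values of the variance components in a model with a single random factor. The matrices y, X and Z are assumed to have been defined already (all names here are illustrative), and $\beta$ is profiled out via its GLS estimate for the given $V$:

proc iml;
  /* -2 log-likelihood of y = X*beta + Z*u + e for trial values s2u, s2e;
     beta replaced by its GLS estimate; y, X, Z assumed already defined */
  start neg2logL(s2u, s2e) global(y, X, Z);
    n = nrow(y);
    V = s2u*(Z*Z`) + s2e*i(n);                  /* V = s2u ZZ' + s2e I   */
    b = solve(X`*solve(V, X), X`*solve(V, y));  /* GLS estimate of beta  */
    r = y - X*b;
    return( n*log(2*constant('pi')) + log(det(V)) + r`*solve(V, r) );
  finish;
  /* after defining y, X, Z:  val = neg2logL(1.0, 2.0);  */
quit;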

REML (Restricted Maximum Likelihood)

REML estimators are obtained by maximizing that part of the likelihood which is invariant to the location parameter; i.e., in terms of the mixed model $y = X\beta + Zu + e$, invariant to $X\beta$. Another way of looking at it is that REML maximizes the likelihood of a vector of linear combinations of the observations that are invariant to $X\beta$. Suppose $L'y$ is such a vector. Then

$$L'y = L'X\beta + L'Zu + L'e$$

is invariant to $X\beta$ if and only if $L'X = 0$. The computational problems of obtaining solutions are the same as for ML. The REML procedure does not, however, include estimating $\beta$. On the other hand, with balanced data the REML equations provide solutions that are identical to the ANOVA estimators, which are unbiased and have attractive minimum variance properties. In this sense REML is said to take account of the degrees of freedom involved in estimating the fixed effects, whereas ML does not. The easiest example is a simple sample of $n$ observations from a $N(\mu, \sigma^2)$ distribution. The two estimators of $\sigma^2$ are

$$\hat\sigma^2_{ML} = \sum_i (x_i - \bar x)^2 / n \qquad\text{and}\qquad \hat\sigma^2_{REML} = \sum_i (x_i - \bar x)^2 /(n-1).$$

The REML estimator has taken account of the one degree of freedom required for estimating $\mu$, whereas the ML estimator has not. The REML estimator is also unbiased, but the ML estimator is not. In the general case of unbalanced data, neither the ML nor the REML estimators are unbiased.
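A minimal SAS check of the two divisors, using a small hypothetical sample (the data values below are illustrative only); CSS computes the corrected sum of squares $\sum_i (x_i - \bar x)^2$:

/* hypothetical sample: ML uses divisor n, REML uses n-1 */
data demo;
  input x @@;
  datalines;
52 49 55 50 48 53
;

proc sql;
  select css(x)/count(x)     as ml_variance,
         css(x)/(count(x)-1) as reml_variance
  from demo;
quit;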

MINQUE (Minimum Norm Quadratic Unbiased Estimation)

The method is based on the requirement that the estimator minimize a (Euclidean) norm, be a quadratic form of the observations and be unbiased. Its development involves extensive algebra. More importantly, its concept demands the use of pre-assigned weights that effectively play the part of a priori values for the unknown variance components. The method has two advantages: unlike ML and REML, it involves no normality assumptions, and the equations that yield the estimator do not have to be solved iteratively. The solution, however, depends on the pre-assigned values; different pre-assigned values can give different estimators from the same data set. One must therefore talk about a MINQUE estimator and not the MINQUE estimator. This appears to be a troublesome feature of the MINQUE procedure. Also, its estimators can be negative, and they are unbiased only if the pre-assigned value is indeed the true, unknown value of $\sigma^2$. There is also a close relationship between REML and MINQUE: a MINQUE solution is a first iterate of REML.

MIVQUE (Minimum Variance Quadratic Unbiased Estimation)

MINQUE demands no assumptions about the form of the distribution of $y$. But if the usual normality assumptions are invoked, the MINQUE solution has the property of being that unbiased quadratic form of the observations which has minimum variance; i.e., it is a minimum variance quadratic unbiased estimator, MIVQUE.

I-MINQUE (Iterative MINQUE)

As already pointed out, the MINQUE procedure demands a vector of pre-assigned values for $\sigma^2$; no iteration is involved. But having obtained a solution, $\hat\sigma^2_{(1)}$ say, its existence prompts the idea of using it as a new pre-assigned value for getting a new estimate of $\sigma^2$, say $\hat\sigma^2_{(2)}$. This leads to using the MINQUE equations iteratively to yield iterative MINQUE, or I-MINQUE, estimators. If one iterates to convergence, these are the same as REML estimators; hence I-MINQUE = REML. Even in the absence of normality assumptions on $y$, the I-MINQUE solutions have large-sample normality properties.

Negative Variance Component Estimates

Variance components should always be positive because they represent the variance of a random variable. But some of the existing methods, such as ANOVA and MIVQUE, do give rise to negative estimates. These negative estimates may arise for a variety of reasons. The variability in the data may be large enough to produce a negative estimate even though the true value of the variance component is positive. The data may contain outliers that exhibit unusually large variability. A different model for interpreting the data may be appropriate; under some statistical models for variance components analysis, negative estimates are an indication that observations in the data are negatively correlated.

Robust Estimation

Outliers may occur with respect to any of the random components in a mixed-model analysis of variance. There is an extensive literature on robust estimation in the case of a single error component; there is, however, only a small body of literature on robust estimation in the variance-components model.

Computational Problems

The computational problems of estimating variance components involve the application of iterative procedures such as the Newton-Raphson and Marquardt methods, the method of scoring, quasi-Newton methods, the EM algorithm and the method of successive approximations.

Evaluation of Algorithms

Several recent research papers evaluate algorithms for variance component estimation. While there is no consensus on the best method, some general conclusions are as follows (a sketch of the EM iteration for the simplest case follows the list):

1. The Newton-Raphson method often converges in the fewest iterations, followed by the scoring method and the EM algorithm. In some cases the EM algorithm requires a very large number of iterations. The individual iterations tend to be slightly shorter for the EM algorithm, but this depends greatly on the details of the programming.
2. The robustness of the methods to their starting values (the ability to converge given poor starting values) is the reverse of the rate of convergence: the EM algorithm is better than Newton-Raphson.
3. The EM algorithm automatically takes care of the inequality constraints imposed by the parameter space. Other algorithms need specialized programming to incorporate constraints.
4. Newton-Raphson and scoring generate an estimated asymptotic variance-covariance matrix for the estimates as part of their calculations. At the end of the EM iterations, special programming (perhaps a single step of Newton-Raphson) is needed to calculate asymptotic standard errors.
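The following PROC IML sketch shows the EM iteration for the simplest case, ML estimation in the balanced one-way random model $y_{ij} = \mu + a_i + e_{ij}$. The 4 x 3 data matrix is purely hypothetical, a fixed number of iterations is used and no convergence test is included:

proc iml;
  /* EM for ML in the balanced one-way random model (hypothetical data) */
  y = {10 12 11, 14 13 15,  9  8 10, 12 12 13};   /* q groups x n obs   */
  q = nrow(y);  n = ncol(y);  N = q*n;
  mu = y[:];  s2a = 1;  s2e = 1;                  /* starting values    */
  do iter = 1 to 200;
    v  = s2a*s2e/(n*s2a + s2e);                   /* Var(a_i | y)       */
    a  = (n*s2a/(n*s2a + s2e)) * (y[,:] - mu);    /* E(a_i | y)         */
    mu = (y - a*j(1, n, 1))[:];                   /* M-step: update mu  */
    r  = y - mu - a*j(1, n, 1);                   /* residuals          */
    s2a = a[##]/q + v;                            /* M-step: sigma_a^2  */
    s2e = (r[##] + N*v)/N;                        /* M-step: sigma_e^2  */
  end;
  print mu s2a s2e;
quit;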

Computational Methods Available in SAS

Four methods are available in SAS PROC VARCOMP, selected with the METHOD= option.

The Type 1 Method
This method (METHOD=TYPE1) computes the Type 1 sum of squares for each effect, equates each mean square involving only random effects to its expected value and solves the resulting system of equations.

The MIVQUE0 Method
The MIVQUE0 method (METHOD=MIVQUE0) produces unbiased estimates that are invariant with respect to the fixed effects of the model and are locally best quadratic unbiased estimates given that the true ratio of each component to the residual error component is zero. The technique is similar to Type 1 except that the random effects are adjusted only for the fixed effects. This is the default method in PROC VARCOMP.

The Maximum Likelihood Method
The ML method (METHOD=ML) computes maximum likelihood estimates of the variance components.

The Restricted Maximum Likelihood Method
The restricted maximum likelihood method (METHOD=REML) is similar to the ML method, but it first separates the likelihood into two parts, one that contains the fixed effects and one that does not. It is an iterated version of MIVQUE0.

Specification for using PROC VARCOMP in SAS

The following statements are used in the VARCOMP procedure.

Required, in this order:
PROC VARCOMP <options>;
CLASS variables;
MODEL dependents = effects </option>;

Optional:
BY variables;

Only one MODEL statement is allowed. The BY, CLASS and MODEL statements are described after the PROC VARCOMP statement.

PROC VARCOMP statement
PROC VARCOMP <options>;
DATA = SAS-data-set (if omitted, the most recently created SAS data set is used)
EPSILON = number (convergence criterion; default 1E-8)
MAXITER = number (maximum number of iterations; default 50)
METHOD = TYPE1 | MIVQUE0 | ML | REML (default MIVQUE0)
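Putting the statements and options together, a complete call might look as follows (the data set trial and the variables block, treat and y are hypothetical):

proc varcomp data=trial method=reml maxiter=100 epsilon=1e-8;
  class block treat;
  model y = block treat / fixed=1;   /* BLOCK is fixed; TREAT is random */
run;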

BY statement
BY variables;
A BY statement can be used with PROC VARCOMP to obtain separate analyses on groups of observations determined by the BY variables.

CLASS statement
The CLASS statement specifies the classification variables to be used in the analysis.

MODEL statement
MODEL dependents = effects </option>;
The MODEL statement names the dependent variables and the independent effects. If more than one dependent variable is specified, a separate analysis is performed for each one. Only one MODEL statement is allowed. Only one option is available in the MODEL statement:

FIXED = n
Tells VARCOMP that the first n effects in the MODEL statement are fixed effects; the remaining effects are assumed to be random. By default PROC VARCOMP assumes that all effects are random. For example, in MODEL Y = A|B / FIXED=1 (i.e., effects A, B and A*B), A is fixed while B and A*B are treated as random effects.

Example: In this example, A and B are classification variables and Y is the dependent variable. A is declared fixed, and B and A*B are random.

data a;
  input a b y;
  cards;
  (data lines omitted)
;

proc varcomp method=type1;
  class a b;
  model y = a|b / fixed=1;
run;

proc varcomp method=mivque0;
  class a b;
  model y = a|b / fixed=1;
run;

proc varcomp method=ml;
  class a b;
  model y = a|b / fixed=1;
run;

proc varcomp method=reml;
  class a b;
  model y = a|b / fixed=1;
run;

Exercise: The data given below are the first-month milk yields of 28 daughters of 4 sires in 3 herds. (The herd-by-sire table of yields is not recoverable from this copy.)

Case (i): Assume herd and sire as random components.
Case (ii): Assume only sire as a random component.
Obtain the different variance components by all four methods.
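A sketch of the runs for the exercise, assuming the data have been read into a data set milk with variables herd, sire and yield (repeat each case with method=type1, mivque0, ml and reml to cover all four methods):

/* Case (i): herd and sire both random (all effects random by default) */
proc varcomp data=milk method=reml;
  class herd sire;
  model yield = herd sire;
run;

/* Case (ii): herd fixed, sire random */
proc varcomp data=milk method=reml;
  class herd sire;
  model yield = herd sire / fixed=1;
run;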

Best Linear Unbiased Prediction (BLUP)

A problem that occurs frequently in animal and plant breeding applications, and probably in many other fields as well, is that, given a sample data vector from a mixed model, the experimenter wishes to predict some set of linear functions of an unobserved random vector. This problem of predicting a random vector in a mixed linear model takes different forms under different situations:

(a) Best Prediction (BP)
(i) The form of the joint distribution of the records and of the random vector to be predicted is known.
(ii) The parameters of the distribution are known.
(iii) It has been proved that the conditional mean of the genetic values given the records has optimum properties.

(b) Best Linear Prediction (BLP)
(i) The form of the distribution is not known, or certain parameters are not known.
(ii) We do know the means of the records and of the genetic values, and the variances and covariances (second moments) are known.
(iii) This involves finding the linear function of the records which minimizes the average of squared errors of prediction.
(iv) In the case of the normal distribution, BLP is BP.

(c) Best Linear Unbiased Prediction (BLUP)
(i) The problem is the same as for BLP, but now we do not know the means.
(ii) Only the variances and covariances of the random vectors are known.
(iii) We find the linear function of the records which has the same expectation as the genetic values to be predicted and which, in the class of such functions, minimizes the average of squared errors.

(d) Neither first nor second moments are known, and it is still desired to use linear prediction methods
(i) We never really know parameters, but we may have good prior estimates of them, and the procedure will be (1) BP when we have good estimates of all parameters, (2) BLP when we have good estimates of the first and second moments, and (3) BLUP when we have good estimates of the second central moments.
(ii) If we have no prior estimates of either the first or the second moments, we need to estimate them from the same data that are used for prediction.
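For reference, in matrix notation (anticipating the mixed model defined below), with $C = \operatorname{Cov}(y, u')$ and $V = \operatorname{Var}(y)$, BLP is the familiar selection-index regression, and BLUP replaces the unknown mean of $y$ by its generalized least squares estimate:

$$\hat u_{BLP} = E(u) + C'V^{-1}\bigl(y - E(y)\bigr),\qquad \hat u_{BLUP} = C'V^{-1}\bigl(y - X\hat\beta\bigr),\quad \hat\beta = (X'V^{-1}X)^{-}X'V^{-1}y.$$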

In practical situations the problems are mostly of the type in which we assume that the variance-covariance matrix of the random variables is known and, further, that the records follow a mixed model. Two methods have been used most frequently. In the first, a regular least squares solution is obtained by treating all random variables, except an error vector with variance $I\sigma^2$, as fixed; the predictor is then a linear function of the least squares solution. In the second method, estimates of the fixed effects of the model are obtained by some method, possibly by regular least squares as in the first method; the data are adjusted for the fixed effects; and selection index methods are then applied to the adjusted data as though the adjustment had been made with known values of the fixed effects. Henderson (1963) suggested a combination of these methods and described a mixed model method which yields simultaneously the best linear unbiased estimators of estimable linear functions of the fixed elements of the model and the best linear unbiased predictors of the random elements of the model.

Consider the general linear model

$$y = X\beta + Zu + e,$$

where $y$ is an $n\times 1$ vector of observations, $X$ is a known $n\times p$ matrix, $\beta$ is a $p\times 1$ vector of fixed effects, $u$ is a $q\times 1$ non-observable vector of random effects and $e$ is an $n\times 1$ vector of random errors, with

$$E\begin{bmatrix} y\\ u\\ e\end{bmatrix}=\begin{bmatrix} X\beta\\ 0\\ 0\end{bmatrix}\qquad\text{and}\qquad \operatorname{Var}\begin{bmatrix} y\\ u\\ e\end{bmatrix}=\begin{bmatrix} V & ZG & R\\ GZ' & G & 0\\ R & 0 & R\end{bmatrix},\qquad V = ZGZ' + R.$$

No assumptions are made concerning the distribution of the random variables; however, $G$ and $R$ are assumed known without error and nonsingular. The general problem to be solved is to predict a function $K'\beta + M'u$ ($\beta$ generally fixed, $u$ generally random), the predictand, by a linear function of the observations, $L'y$, the predictor, such that the prediction error variances for the predictors of each element of $K'\beta + M'u$ are minimized and such that the expected value of the predictor equals the expected value of the predictand. The function $K'\beta$ must be an estimable function.

The prediction error is $K'\beta + M'u - L'y$, and the variance-covariance matrix of this function is the matrix of interest, since we wish to minimize each individual diagonal element. To define this matrix algebraically, note that $V(K'\beta) = 0$ and all covariances involving $K'\beta$ are zero, so that

$$V(K'\beta + M'u - L'y) = V(M'u) + V(L'y) - \operatorname{Cov}(M'u, y'L) - \operatorname{Cov}(L'y, u'M) = M'GM + L'VL - M'GZ'L - L'ZGM.$$

To ensure that the predictor is unbiased, i.e. has the same expected value as the predictand, we add a Lagrange multiplier term to the variance-covariance matrix of prediction errors prior to minimizing the function.

We know that

$$E(K'\beta + M'u) = K'\beta \qquad\text{and}\qquad E(L'y) = L'X\beta.$$

Thus, in order that $L'X\beta = K'\beta$ for all possible vectors $\beta$, we must have $L'X - K' = 0$. Hence the Lagrange multiplier term is $(L'X - K')\theta$. Adding it to $V(K'\beta + M'u - L'y)$ gives the function $F$:

$$F = M'GM + L'VL - M'GZ'L - L'ZGM + (L'X - K')\theta.$$

The function $F$ is differentiated with respect to the unknowns, $L$ and $\theta$, and the derivatives are equated to zero (null matrices):

$$\frac{\partial F}{\partial L} = 2VL - 2ZGM + X\theta = 0 \qquad\text{and}\qquad \frac{\partial F}{\partial \theta} = X'L - K = 0.$$

Note that the second derivative provides the condition which must hold for the prediction to be unbiased. These results can be rearranged in matrix form. Recalling that $V = ZGZ' + R$ and letting $\phi = \tfrac12\theta$,

$$\begin{bmatrix} ZGZ' + R & X\\ X' & 0\end{bmatrix}\begin{bmatrix} L\\ \phi\end{bmatrix} = \begin{bmatrix} ZGM\\ K\end{bmatrix}.$$

From the first line, $RL + ZG(Z'L - M) + X\phi = 0$. Let $S = G(Z'L - M)$ and note that

$$G^{-1}S = Z'L - M \qquad\text{and}\qquad M = Z'L - G^{-1}S.$$

Now we can write the following equations:

$$\begin{bmatrix} R & Z & X\\ Z' & -G^{-1} & 0\\ X' & 0 & 0\end{bmatrix}\begin{bmatrix} L\\ S\\ \phi\end{bmatrix} = \begin{bmatrix} 0\\ M\\ K\end{bmatrix}.$$

Absorbing the $L$ equation into the other two gives

$$-\begin{bmatrix} Z'R^{-1}Z + G^{-1} & Z'R^{-1}X\\ X'R^{-1}Z & X'R^{-1}X\end{bmatrix}\begin{bmatrix} S\\ \phi\end{bmatrix} = \begin{bmatrix} M\\ K\end{bmatrix}.$$

Multiply both sides by $-1$ and let

$$\begin{bmatrix} C_{11} & C_{12}\\ C_{12}' & C_{22}\end{bmatrix} = \begin{bmatrix} Z'R^{-1}Z + G^{-1} & Z'R^{-1}X\\ X'R^{-1}Z & X'R^{-1}X\end{bmatrix}^{-1}.$$

Then

$$\begin{bmatrix} S\\ \phi\end{bmatrix} = -\begin{bmatrix} C_{11} & C_{12}\\ C_{12}' & C_{22}\end{bmatrix}\begin{bmatrix} M\\ K\end{bmatrix}$$

and, from $RL = -ZS - X\phi$,

$$L = R^{-1}\begin{bmatrix} Z & X\end{bmatrix}\begin{bmatrix} C_{11} & C_{12}\\ C_{12}' & C_{22}\end{bmatrix}\begin{bmatrix} M\\ K\end{bmatrix},$$

so that

$$L'y = \begin{bmatrix} M' & K'\end{bmatrix}\begin{bmatrix} C_{11} & C_{12}\\ C_{12}' & C_{22}\end{bmatrix}\begin{bmatrix} Z'R^{-1}y\\ X'R^{-1}y\end{bmatrix} = M'\hat u + K'\hat\beta,$$

where

$$\begin{bmatrix} \hat u\\ \hat\beta\end{bmatrix} = \begin{bmatrix} Z'R^{-1}Z + G^{-1} & Z'R^{-1}X\\ X'R^{-1}Z & X'R^{-1}X\end{bmatrix}^{-1}\begin{bmatrix} Z'R^{-1}y\\ X'R^{-1}y\end{bmatrix}$$

or, rearranging rows and columns,

$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z\\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1}\end{bmatrix}\begin{bmatrix} \hat\beta\\ \hat u\end{bmatrix} = \begin{bmatrix} X'R^{-1}y\\ Z'R^{-1}y\end{bmatrix}.$$

These equations are commonly referred to as Henderson's mixed model equations, and they provide predictors with the smallest prediction error variances among all linear unbiased predictors. This methodology can be extended to various situations, such as the individual animal model and the model for related sires described below.
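A minimal PROC IML sketch of forming and solving the mixed model equations; y, X, Z, G and R are assumed to have been defined already, and X is assumed to have full column rank (otherwise a generalized inverse such as ginv would be used in place of solve):

proc iml;
  /* Henderson's mixed model equations, assuming y, X, Z, G, R defined */
  Ri  = inv(R);
  lhs = (X`*Ri*X || X`*Ri*Z) //
        (Z`*Ri*X || Z`*Ri*Z + inv(G));
  rhs = (X`*Ri*y) //
        (Z`*Ri*y);
  sol = solve(lhs, rhs);   /* first ncol(X) rows: beta-hat; rest: u-hat */
  print sol;
quit;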

Animal Additive Genetic Model

The model for an individual record is

$$y_i = x_i'\beta + z_i'u + a_i + e_i,$$

where $\beta$ represents fixed effects, with $x_i$ relating the record on the $i$-th animal to this vector; $u$ represents random effects other than breeding values, with $z_i$ relating this vector to $y_i$; $a_i$ is the additive genetic value of the $i$-th animal; and $e_i$ is a random error associated with the individual record. The vector representation of the entire set of records is

$$y = X\beta + Zu + Z_a a + e.$$

If $a$ represents only those animals with records, $Z_a = I$; otherwise it is an identity matrix with the rows deleted that correspond to animals without records. Further,

$$\operatorname{Var}(u) = G,\qquad \operatorname{Var}(a) = A\sigma_a^2,\qquad \operatorname{Var}(e) = R\ (\text{usually } I\sigma_e^2),$$
$$\operatorname{Cov}(u, a') = 0,\qquad \operatorname{Cov}(u, e') = 0,\qquad \operatorname{Cov}(a, e') = 0.$$

If $Z_a \neq I$, the mixed model equations are

$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z & X'R^{-1}Z_a\\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} & Z'R^{-1}Z_a\\ Z_a'R^{-1}X & Z_a'R^{-1}Z & Z_a'R^{-1}Z_a + A^{-1}/\sigma_a^2\end{bmatrix}\begin{bmatrix} \beta^o\\ \hat u\\ \hat a\end{bmatrix} = \begin{bmatrix} X'R^{-1}y\\ Z'R^{-1}y\\ Z_a'R^{-1}y\end{bmatrix}.$$

If $Z_a = I$, this simplifies to

$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z & X'R^{-1}\\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} & Z'R^{-1}\\ R^{-1}X & R^{-1}Z & R^{-1} + A^{-1}/\sigma_a^2\end{bmatrix}\begin{bmatrix} \beta^o\\ \hat u\\ \hat a\end{bmatrix} = \begin{bmatrix} X'R^{-1}y\\ Z'R^{-1}y\\ R^{-1}y\end{bmatrix}.$$

If $R = I\sigma_e^2$, it simplifies further to

$$\begin{bmatrix} X'X & X'Z & X'\\ Z'X & Z'Z + G^{-1}\sigma_e^2 & Z'\\ X & Z & I + A^{-1}\sigma_e^2/\sigma_a^2\end{bmatrix}\begin{bmatrix} \beta^o\\ \hat u\\ \hat a\end{bmatrix} = \begin{bmatrix} X'y\\ Z'y\\ y\end{bmatrix}.$$

If the number of animals is large one should, of course, use Henderson's method (1976) for computing $A^{-1}$. Since this method requires a base population of non-inbred, unrelated animals, some of these animals probably do not have records; we may also wish to evaluate progeny that have not yet made a record. Both of these circumstances result in $Z_a \neq I$, but $\hat a$ will then contain predicted breeding values of the animals without records.
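The following PROC IML sketch applies Henderson's rules for building $A^{-1}$ directly from a pedigree with no inbreeding; the five-animal pedigree is hypothetical (columns: animal, sire, dam, with 0 denoting an unknown parent, and parents coded before their offspring):

proc iml;
  /* Henderson's (1976) rules for A-inverse, no inbreeding assumed */
  ped = {1 0 0, 2 0 0, 3 1 2, 4 1 0, 5 4 2};   /* hypothetical pedigree */
  n = nrow(ped);
  Ainv = j(n, n, 0);
  do k = 1 to n;
    i = ped[k,1]; s = ped[k,2]; d = ped[k,3];
    if s > 0 & d > 0 then do;                  /* both parents known */
      Ainv[i,i] = Ainv[i,i] + 2;
      Ainv[i,s] = Ainv[i,s] - 1;  Ainv[s,i] = Ainv[s,i] - 1;
      Ainv[i,d] = Ainv[i,d] - 1;  Ainv[d,i] = Ainv[d,i] - 1;
      Ainv[s,s] = Ainv[s,s] + 0.5;  Ainv[d,d] = Ainv[d,d] + 0.5;
      Ainv[s,d] = Ainv[s,d] + 0.5;  Ainv[d,s] = Ainv[d,s] + 0.5;
    end;
    else if s > 0 | d > 0 then do;             /* one parent known */
      p = max(s, d);
      Ainv[i,i] = Ainv[i,i] + 4/3;
      Ainv[i,p] = Ainv[i,p] - 2/3;  Ainv[p,i] = Ainv[p,i] - 2/3;
      Ainv[p,p] = Ainv[p,p] + 1/3;
    end;
    else Ainv[i,i] = Ainv[i,i] + 1;            /* no parents known */
  end;
  print Ainv;
  A = inv(Ainv);   /* for this small example, reproduces the A matrix */
  print A;
quit;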

Sire Model with Additive Genetic Effects

The model in which related sires are mated to a random sample of unrelated dams, no dam has more than one progeny with a record, and each progeny produces one record, is

$$y_{ij} = x_{ij}'\beta + s_i + z_{ij}'u + e_{ij},$$

where $\beta$ represents fixed effects, with $x_{ij}$ relating the $j$-th progeny of the $i$-th sire to these effects; $s_i$ is the sire effect on the progeny record; $u$ represents other random factors, with $z_{ij}$ relating these to the $ij$-th progeny record; and $e_{ij}$ is a random error. The vector representation is

$$y = X\beta + Z_s s + Zu + e,$$

with $\operatorname{Var}(s) = A\sigma_s^2$, where $A$ is the numerator relationship matrix of the sires and $\sigma_s^2$ is the sire variance in the base population. If the sires comprise a random sample from this population, $\sigma_s^2 = \tfrac14$ of the additive genetic variance. Some columns of $Z_s$ will be null if $s$ contains sires with no progeny, as will usually be the case if the simple method for computing $A^{-1}$, which requires the base population animals, is used. Further, $\operatorname{Var}(u) = G$, $\operatorname{Cov}(s, u') = 0$, $\operatorname{Var}(e) = R$ (usually $I\sigma_e^2$), $\operatorname{Cov}(s, e') = 0$ and $\operatorname{Cov}(u, e') = 0$. If sires and dams are truly random,

$$I\sigma_e^2 = 0.75\,I\,(\text{additive genetic variance}) + I\,(\text{environmental variance}).$$

With this model the mixed model equations are

$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z_s & X'R^{-1}Z\\ Z_s'R^{-1}X & Z_s'R^{-1}Z_s + A^{-1}/\sigma_s^2 & Z_s'R^{-1}Z\\ Z'R^{-1}X & Z'R^{-1}Z_s & Z'R^{-1}Z + G^{-1}\end{bmatrix}\begin{bmatrix} \beta^o\\ \hat s\\ \hat u\end{bmatrix} = \begin{bmatrix} X'R^{-1}y\\ Z_s'R^{-1}y\\ Z'R^{-1}y\end{bmatrix}.$$

If $R = I\sigma_e^2$, this simplifies to

$$\begin{bmatrix} X'X & X'Z_s & X'Z\\ Z_s'X & Z_s'Z_s + A^{-1}\sigma_e^2/\sigma_s^2 & Z_s'Z\\ Z'X & Z'Z_s & Z'Z + \sigma_e^2 G^{-1}\end{bmatrix}\begin{bmatrix} \beta^o\\ \hat s\\ \hat u\end{bmatrix} = \begin{bmatrix} X'y\\ Z_s'y\\ Z'y\end{bmatrix}.$$

Illustration: Suppose we consider seven sires, S0 to S6, with known relationships. (The pedigree diagram of the original is not recoverable from this copy; from the discussion below, sire 3 is a son of sire 1, and sire 1 is a son of the base sire S0.) There are no progeny on S0. Each sire has two progeny in each of two contemporary groups that differ by 100 kg, and the calves of each sire within a contemporary group differ by 100 kg. There are a total of 24 calves. Results obtained under the different models are as under:

Table 1: Summary of solutions from different models
(The numerical solutions for the group effects g1 and g2 and the sires S0 to S6 under the four models are not recoverable from this copy.)

Table 2: Rank of sires under different models

Model                            Rank
Sires with progeny data          3, 5, 4-1, 6, 2
Groups + sires with progeny      3, 5, 4, 6, 1, 2
Sires related                    3, 1, 5, 4, 0, 6, 2
Groups + sires related           3, 1, 5, 0, 4, 6, 2

The changes in the sire evaluations are best described by the changes in rank shown in Table 2. Based on progeny data alone, no distinction can be made between sires 1 and 4. The addition of groups changes the ranks of 1, 4 and 6. The addition of relationships creates more rank changes. Clearly the relationships are the leading contributing factor to the correct ranking, since sire 3 is the best bull under all models, and he is the son of sire 1, who is in turn the son of the base sire.

The addition of the relationship matrix does two things for the prediction procedure:

1) It provides relationship ties among the animals in different contemporary groups. Relationship ties do the same thing as a reference sire having progeny in many different contemporary groups. This is an important aspect of including $A^{-1}$.

2) It also gives predictions that include the parental half-sib information that is available. The lower the heritability of the trait, the more important this aspect of including the relationship inverse becomes. This second aspect is equivalent to the selection index theory approach, which combines sources of information into one predicted value.

References

Dempfle, L. (1977). Comparison of several sire evaluation methods in dairy cattle breeding. Livestock Production Science.

Goldberger, A.S. (1962). Best linear unbiased prediction in the generalised linear regression model. JASA, 57.

Harville, D.A. (1976). Extension of the Gauss-Markov theorem to include the estimation of random effects. Ann. Statist., 4.

Harville, D.A. (1990). BLUP (Best Linear Unbiased Prediction) and beyond. In Advances in Statistical Methods for Genetic Improvement of Livestock (D. Gianola and K. Hammond, eds.). Springer, New York.

Henderson, C.R. (1963). Selection index and expected genetic advance. In Statistical Genetics and Plant Breeding. Nat. Acad. Sci., Nat. Res. Council Publication 982, Washington, DC.

Henderson, C.R. (1975). Best linear unbiased estimation and prediction under a selection model. Biometrics, 31.

Henderson, C.R. (1976). A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics, 32.

Henderson, C.R. (1984). Applications of Linear Models in Animal Breeding. University of Guelph.

Lindley, D.V. and Smith, A.F.M. (1972). Bayes estimates for the linear model (with discussion). JRSS, Ser. B, 34.

Robinson, G.K. (1991). That BLUP is a good thing: the estimation of random effects. Statistical Science, 6(1).


Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science. Texts in Statistical Science Generalized Linear Mixed Models Modern Concepts, Methods and Applications Walter W. Stroup CRC Press Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint

More information

Next is material on matrix rank. Please see the handout

Next is material on matrix rank. Please see the handout B90.330 / C.005 NOTES for Wednesday 0.APR.7 Suppose that the model is β + ε, but ε does not have the desired variance matrix. Say that ε is normal, but Var(ε) σ W. The form of W is W w 0 0 0 0 0 0 w 0

More information

Multivariate Regression (Chapter 10)

Multivariate Regression (Chapter 10) Multivariate Regression (Chapter 10) This week we ll cover multivariate regression and maybe a bit of canonical correlation. Today we ll mostly review univariate multivariate regression. With multivariate

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

ASPECTS OF SELECTION FOR PERFORMANCE IN SEVERAL ENVIRONMENTS WITH HETEROGENEOUS VARIANCES

ASPECTS OF SELECTION FOR PERFORMANCE IN SEVERAL ENVIRONMENTS WITH HETEROGENEOUS VARIANCES University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Faculty Papers and Publications in Animal Science Animal Science Department 2-3-1987 ASPECTS OF SELECTION FOR PERFORMANCE

More information

Hypothesis Testing for Var-Cov Components

Hypothesis Testing for Var-Cov Components Hypothesis Testing for Var-Cov Components When the specification of coefficients as fixed, random or non-randomly varying is considered, a null hypothesis of the form is considered, where Additional output

More information

Lecture 28: BLUP and Genomic Selection. Bruce Walsh lecture notes Synbreed course version 11 July 2013

Lecture 28: BLUP and Genomic Selection. Bruce Walsh lecture notes Synbreed course version 11 July 2013 Lecture 28: BLUP and Genomic Selection Bruce Walsh lecture notes Synbreed course version 11 July 2013 1 BLUP Selection The idea behind BLUP selection is very straightforward: An appropriate mixed-model

More information

POLI 8501 Introduction to Maximum Likelihood Estimation

POLI 8501 Introduction to Maximum Likelihood Estimation POLI 8501 Introduction to Maximum Likelihood Estimation Maximum Likelihood Intuition Consider a model that looks like this: Y i N(µ, σ 2 ) So: E(Y ) = µ V ar(y ) = σ 2 Suppose you have some data on Y,

More information

Lecture 7 Correlated Characters

Lecture 7 Correlated Characters Lecture 7 Correlated Characters Bruce Walsh. Sept 2007. Summer Institute on Statistical Genetics, Liège Genetic and Environmental Correlations Many characters are positively or negatively correlated at

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

Non-polynomial Least-squares fitting

Non-polynomial Least-squares fitting Applied Math 205 Last time: piecewise polynomial interpolation, least-squares fitting Today: underdetermined least squares, nonlinear least squares Homework 1 (and subsequent homeworks) have several parts

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

Estimating Breeding Values

Estimating Breeding Values Estimating Breeding Values Principle how is it estimated? Properties Accuracy Variance Prediction Error Selection Response select on EBV GENE422/522 Lecture 2 Observed Phen. Dev. Genetic Value Env. Effects

More information

AN INTRODUCTION TO GENERALIZED LINEAR MIXED MODELS. Stephen D. Kachman Department of Biometry, University of Nebraska Lincoln

AN INTRODUCTION TO GENERALIZED LINEAR MIXED MODELS. Stephen D. Kachman Department of Biometry, University of Nebraska Lincoln AN INTRODUCTION TO GENERALIZED LINEAR MIXED MODELS Stephen D. Kachman Department of Biometry, University of Nebraska Lincoln Abstract Linear mixed models provide a powerful means of predicting breeding

More information

Lecture 5: LDA and Logistic Regression

Lecture 5: LDA and Logistic Regression Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant

More information

14 Multiple Linear Regression

14 Multiple Linear Regression B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 14 Multiple Linear Regression 14.1 The multiple linear regression model In simple linear regression, the response variable y is expressed in

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information