Estimation of Variances and Covariances
1 Variables and Distributions

Random variables are samples from a population with a given set of population parameters. Random variables can be discrete, having a limited number of distinct possible values, or continuous.

Continuous Random Variables

The cumulative distribution function of a random variable is

   F(y) = Pr(Y ≤ y), for −∞ < y < ∞.

As y approaches −∞, then F(y) approaches 0. As y approaches +∞, then F(y) approaches 1. F(y) is a nondecreasing function of y: if a < b, then F(a) ≤ F(b).

The density function is

   p(y) = dF(y)/dy, wherever the derivative exists,

with

   ∫ p(y) dy = 1  (over −∞ < y < ∞), and F(t) = ∫_{−∞}^{t} p(y) dy.

Expectations and the variance are

   E(y) = ∫ y p(y) dy,
   E(g(y)) = ∫ g(y) p(y) dy,
   Var(y) = E(y²) − [E(y)]².

Normal Random Variables

Random variables in animal breeding problems are typically assumed to be samples from normal distributions, where

   p(y) = (2π)^{−.5} σ^{−1} exp(−.5 (y − µ)²/σ²), for −∞ < y < +∞,

where σ² is the variance of y and µ is the expected value of y.
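The density formula and the identity Var(y) = E(y²) − [E(y)]² can be checked numerically. A minimal Python sketch (the parameter values, grid limits, and step size are arbitrary choices, not from the notes):

```python
import math

def normal_pdf(y, mu, sigma2):
    """N(mu, sigma2) density: (2*pi)^(-1/2) * sigma^(-1) * exp(-0.5*(y-mu)^2/sigma2)."""
    return (2 * math.pi) ** -0.5 * sigma2 ** -0.5 * math.exp(-0.5 * (y - mu) ** 2 / sigma2)

# Crude numeric integration on a wide grid around the mean.
mu, sigma2 = 3.0, 4.0
h = 0.001
grid = [-20 + i * h for i in range(50_000)]   # covers mu plus/minus many sd

total = sum(normal_pdf(y, mu, sigma2) for y in grid) * h          # should be ~1
Ey    = sum(y * normal_pdf(y, mu, sigma2) for y in grid) * h      # ~mu
Ey2   = sum(y * y * normal_pdf(y, mu, sigma2) for y in grid) * h  # ~mu^2 + sigma2
var   = Ey2 - Ey ** 2    # Var(y) = E(y^2) - [E(y)]^2
```

The same check works for any density; only the grid limits need to be wide enough that the tails are negligible.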
For a random vector variable, y, the multivariate normal density function is

   p(y) = (2π)^{−n/2} |V|^{−1/2} exp(−.5 (y − µ)'V⁻¹(y − µ)),

denoted as y ~ N(µ, V), where V is the variance-covariance matrix of y. The determinant of V must be positive, otherwise the density function is undefined.

Chi-Square Random Variables

If y ~ N(0, I), then y'y ~ χ²_n, where χ²_n is a central chi-square distribution with n degrees of freedom and n is the length of the random vector variable y. The mean is n and the variance is 2n. If s = y'y > 0, then

   p(s | n) = s^{(n/2)−1} exp(−.5 s) / [2^{.5n} Γ(.5n)].

If y ~ N(µ, I), then y'y ~ χ²_{n,λ}, where λ is the noncentrality parameter, which is equal to .5 µ'µ. The mean of a noncentral chi-square distribution is n + 2λ and the variance is 2n + 8λ.

If y ~ N(µ, V), then y'Qy has a noncentral chi-square distribution only if QV is idempotent, i.e. QVQV = QV. The noncentrality parameter is λ = .5 µ'QVQµ, and the mean and variance of the distribution are tr(QV) + 2λ and 2 tr(QV) + 8λ, respectively.

If there are two quadratic forms of y, say y'Qy and y'Py, and both quadratic forms have chi-square distributions, then the two quadratic forms are independent if QVP = 0.

The Wishart Distribution

The Wishart distribution is similar to a multivariate chi-square distribution. An entire matrix is envisioned, of which the diagonals have a chi-square distribution, and the off-diagonals have a built-in correlation structure. The resulting matrix is positive definite. This distribution is needed when estimating covariance matrices, such as in multiple-trait models, maternal genetic effect models, or random regression models.

The F-distribution

The F-distribution is used for hypothesis testing and is built upon two independent chi-square random variables. Let s ~ χ²_n and v ~ χ²_m, with s and v being independent; then

   (s/n) / (v/m) ~ F_{n,m}.
The mean of the F-distribution is m/(m − 2), and the variance is

   2m²(n + m − 2) / [n(m − 2)²(m − 4)].

3 Expectations of Random Vectors

Let y be a random vector variable of length n; then

   E(y) = µ = (µ₁ µ₂ ... µ_n)'.

If c is a scalar constant, then

   E(cy) = cµ.

Similarly, if C is a matrix of constants, then

   E(Cy) = Cµ.

Let y₂ be another random vector variable of the same length as y₁; then

   E(y₁ + y₂) = E(y₁) + E(y₂) = µ₁ + µ₂.

4 Variance-Covariance Matrices

Let y be a random vector variable of length n; then the variance-covariance matrix of y is

   Var(y) = E(yy') − E(y)E(y')

          = [ σ²_{y1}     σ_{y1,y2}   ...  σ_{y1,yn}
              σ_{y1,y2}   σ²_{y2}     ...  σ_{y2,yn}
              ...
              σ_{y1,yn}   σ_{y2,yn}   ...  σ²_{yn}  ]  = V.

A variance-covariance (VCV) matrix is square, symmetric, and should always be positive definite, i.e. all of the eigenvalues must be positive.
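The chi-square moments given earlier (central mean n and variance 2n; noncentral mean n + 2λ and variance 2n + 8λ with λ = .5 µ'µ) can be checked by simulating the quadratic form y'y. A hedged Python sketch (the sample size and the choice of µ are arbitrary):

```python
import random

random.seed(1)
n = 5
mu = [1.0] * n                       # mean vector of y ~ N(mu, I)
lam = 0.5 * sum(m * m for m in mu)   # noncentrality: lambda = 0.5 * mu'mu

N = 200_000
draws = []
for _ in range(N):
    y = [random.gauss(m, 1.0) for m in mu]
    draws.append(sum(v * v for v in y))     # quadratic form y'y

mean = sum(draws) / N
var = sum((d - mean) ** 2 for d in draws) / N

# Theory for the noncentral chi-square:
theory_mean = n + 2 * lam      # = 10 for this choice of mu
theory_var = 2 * n + 8 * lam   # = 30 for this choice of mu
```

With mu set to the zero vector the same code checks the central case (mean n, variance 2n).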
Another name for a VCV matrix is a dispersion matrix or (co)variance matrix.

Let C be a matrix of constants conformable for multiplication with the vector y; then

   Var(Cy) = E(Cyy'C') − E(Cy)E(y'C')
           = CE(yy')C' − CE(y)E(y')C'
           = C(E(yy') − E(y)E(y'))C'
           = C Var(y) C' = CVC'.

If there are two sets of functions of y, say C₁y and C₂y, then

   Cov(C₁y, C₂y) = C₁VC₂'.

If y and z represent two different random vectors, possibly of different orders, and if the (co)variance matrix between these two vectors is W, then

   Cov(C₁y, C₂z) = C₁WC₂'.

5 Quadratic Forms

Variances are estimated using sums of squares of various normally distributed variables, and these are known as quadratic forms. The general quadratic form is y'Qy, where y is a random vector variable and Q is a regulator matrix. Usually Q is a symmetric matrix, but not necessarily positive definite. Examples of different Q matrices are as follows:

1. Q = I, then y'Qy = y'y, which is the total sum of squares of the elements in y.

2. Q = J(1/n), then y'Qy = y'Jy(1/n), where n is the length of y. Note that J = 11', so that y'Jy = (y'1)(1'y), and (1'y) is the sum of the elements in y.

3. Q = (I − J(1/n))/(n − 1), then y'Qy gives the variance of the elements in y, σ̂²_y.

The expected value of a quadratic form is

   E(y'Qy) = E(tr(y'Qy)) = E(tr(Qyy')) = tr(QE(yy')),

and the covariance matrix is

   Var(y) = E(yy') − E(y)E(y'),
so that

   E(yy') = Var(y) + E(y)E(y');

then

   E(y'Qy) = tr(Q(Var(y) + E(y)E(y'))).

Let Var(y) = V and E(y) = µ; then

   E(y'Qy) = tr(Q(V + µµ')) = tr(QV) + tr(Qµµ') = tr(QV) + µ'Qµ.

The expectation of a quadratic form was derived without knowing the distribution of y. However, the variance of a quadratic form requires that y follows a multivariate normal distribution. Without showing the derivation, the variance of a quadratic form, assuming y has a multivariate normal distribution, is

   Var(y'Qy) = 2 tr(QVQV) + 4µ'QVQµ.

The quadratic form y'Qy has a chi-square distribution if

   tr(QVQV) = tr(QV), and µ'QVQµ = µ'Qµ,

or the single condition that QV is idempotent. Then if m = tr(QV) and λ = .5 µ'Qµ, the expected value of y'Qy is m + 2λ and the variance is 2m + 8λ, which are the usual results for a noncentral chi-square variable.

The covariance between two quadratic forms, say y'Qy and y'Py, is

   Cov(y'Qy, y'Py) = 2 tr(QVPV) + 4µ'QVPµ.

The covariance is zero if QVP = 0, and then the two quadratic forms are said to be independent.

6 Basic Model for Variance Components

The general linear model is described as

   y = Xb + Zu + e,

where E(y) = Xb, E(u) = 0, and E(e) = 0.
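The quadratic-form moments above, E(y'Qy) = tr(QV) + µ'Qµ and Var(y'Qy) = 2 tr(QVQV) + 4µ'QVQµ, can be verified by simulation. A NumPy sketch with an arbitrary µ, V, and symmetric Q (none of these values come from the notes):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 4
mu = np.array([1.0, 0.0, -1.0, 2.0])
A = rng.normal(size=(n, n))
V = A @ A.T + n * np.eye(n)           # a positive definite covariance matrix
Q = np.diag([1.0, 2.0, 0.5, 1.5])     # a symmetric "regulator" matrix

# Theoretical mean and variance of y'Qy
E_theory = np.trace(Q @ V) + mu @ Q @ mu
Var_theory = 2 * np.trace(Q @ V @ Q @ V) + 4 * mu @ Q @ V @ Q @ mu

# Monte Carlo check
Y = rng.multivariate_normal(mu, V, size=400_000)
qf = np.einsum('ij,jk,ik->i', Y, Q, Y)   # y'Qy for each sampled y
```

The sample mean and variance of `qf` should sit close to `E_theory` and `Var_theory`.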
Often u is partitioned into s factors as

   u' = (u₁'  u₂'  ...  u_s').

The (co)variance matrices are defined as

   G = Var(u) = diag{ G₁σ²₁, G₂σ²₂, ..., G_sσ²_s }

and

   R = Var(e) = Iσ²₀.

Then

   Var(y) = V = ZGZ' + R,

and if Z is partitioned corresponding to u, as Z = [Z₁ Z₂ ... Z_s], then

   ZGZ' = Σ_{i=1}^{s} Z_iG_iZ_i'σ²_i.

Let V_i = Z_iG_iZ_i' and V₀ = I; then

   V = Σ_{i=0}^{s} V_iσ²_i.

6.1 Mixed Model Equations

Henderson's mixed model equations (MME) are written as

   [ X'R⁻¹X    X'R⁻¹Z₁                X'R⁻¹Z₂                ...  X'R⁻¹Z_s
     Z₁'R⁻¹X   Z₁'R⁻¹Z₁ + G₁⁻¹σ₁⁻²    Z₁'R⁻¹Z₂               ...  Z₁'R⁻¹Z_s
     Z₂'R⁻¹X   Z₂'R⁻¹Z₁               Z₂'R⁻¹Z₂ + G₂⁻¹σ₂⁻²    ...  Z₂'R⁻¹Z_s
     ...
     Z_s'R⁻¹X  Z_s'R⁻¹Z₁              Z_s'R⁻¹Z₂              ...  Z_s'R⁻¹Z_s + G_s⁻¹σ_s⁻² ]

   [ b̂; û₁; û₂; ...; û_s ] = [ X'R⁻¹y; Z₁'R⁻¹y; Z₂'R⁻¹y; ...; Z_s'R⁻¹y ].
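The MME above can be assembled directly for a tiny data set. The following NumPy sketch uses a made-up design (two fixed levels, one random factor with three levels, G₁ = I, R = Iσ²₀); the numbers are illustrative only, not the example from the notes:

```python
import numpy as np

# Hypothetical design: 6 records, 2 fixed-effect levels, 3 random-effect levels.
X = np.array([[1,0],[1,0],[1,0],[0,1],[0,1],[0,1]], dtype=float)
Z = np.array([[1,0,0],[1,0,0],[0,1,0],[0,1,0],[0,0,1],[0,0,1]], dtype=float)
y = np.array([5.0, 7.0, 6.0, 9.0, 8.0, 4.0])

sigma2_0, sigma2_1 = 1.0, 0.5       # residual and random-factor variances
Rinv = np.eye(6) / sigma2_0         # R = I * sigma2_0
Ginv = np.eye(3) / sigma2_1         # G1 = I * sigma2_1

# Henderson's MME:
# [X'R^-1X  X'R^-1Z; Z'R^-1X  Z'R^-1Z + G^-1] [b; u] = [X'R^-1y; Z'R^-1y]
LHS = np.block([[X.T @ Rinv @ X, X.T @ Rinv @ Z],
                [Z.T @ Rinv @ X, Z.T @ Rinv @ Z + Ginv]])
RHS = np.concatenate([X.T @ Rinv @ y, Z.T @ Rinv @ y])
sol = np.linalg.solve(LHS, RHS)
b_hat, u_hat = sol[:2], sol[2:]     # fixed-effect solutions and predicted u
```

By Henderson's results, `b_hat` equals the GLS estimate (X'V⁻¹X)⁻¹X'V⁻¹y with V = ZGZ' + R, and `u_hat` equals GZ'V⁻¹(y − Xb̂), which makes a convenient cross-check.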
7 Unbiased Estimation of Variances

Assume that all G_i are equal to I for this example, so that Z_iG_iZ_i' simplifies to Z_iZ_i'. Let X = 1 (a vector of ones), let Z₁ and Z₂ be the incidence matrices of the two random factors, and let y be the vector of observations. Then

   V₀ = I, V₁ = Z₁Z₁', and V₂ = Z₂Z₂'.

7.1 Define the Necessary Quadratic Forms

At least three quadratic forms are needed in order to estimate the three variances. Three arbitrary Q-matrices, Q₁, Q₂, and Q₃, were chosen such that Q_kX = 0. The numeric values of the quadratic forms are

   y'Q₁y = 657, y'Q₂y = 306, and y'Q₃y = 88.
7.2 The Expectations of the Quadratic Forms

The expectations of the quadratic forms are

   E(y'Q₁y) = tr(Q₁V₀)σ²₀ + tr(Q₁V₁)σ²₁ + tr(Q₁V₂)σ²₂ = 4σ²₀ + σ²₁ + σ²₂,
   E(y'Q₂y) = 4σ²₀ + 4σ²₁ + σ²₂,
   E(y'Q₃y) = 6σ²₀ + 4σ²₁ + 4σ²₂.

7.3 Equate Expected Values to Numerical Values

Equate the numeric values of the quadratic forms to their corresponding expected values, which gives a system of equations to be solved, say Fσ = w. The solution is ˆσ = F⁻¹w, and the resulting estimates are unbiased.

7.4 Variances of Quadratic Forms

The variance of a quadratic form is

   Var(y'Qy) = 2 tr(QVQV) + 4b'X'QVQXb.

Only translation-invariant quadratic forms are typically considered in variance component estimation, which means that b'X'QVQXb = 0. Thus only 2 tr(QVQV) needs to be calculated. Remember that V can be written as the sum of s + 1 matrices, V_iσ²_i; then

   2 tr(QVQV) = 2 tr[ Q (Σ_{i=0}^{s} V_iσ²_i) Q (Σ_{j=0}^{s} V_jσ²_j) ]
              = 2 Σ_{i=0}^{s} Σ_{j=0}^{s} tr(QV_iQV_j) σ²_i σ²_j.

For example, if s = 2, then

   2 tr(QVQV) = 2 tr(QV₀QV₀)σ⁴₀ + 4 tr(QV₀QV₁)σ²₀σ²₁ + 4 tr(QV₀QV₂)σ²₀σ²₂
              + 2 tr(QV₁QV₁)σ⁴₁ + 4 tr(QV₁QV₂)σ²₁σ²₂ + 2 tr(QV₂QV₂)σ⁴₂.

The sampling variances depend on

1. the true magnitude of the individual components,
2. the matrices Q_k, which depend on the method of estimation and the model, and
3. the structure and amount of the data through X and Z.

For small examples, the calculation of sampling variances can be easily demonstrated. In this case,

   Var(ˆσ) = Var(F⁻¹w) = F⁻¹ Var(w) (F⁻¹)',

a function of the variance-covariance matrix of the quadratic forms. Using the small example of the previous section, Var(w) is a 3×3 matrix. The (1,1) element is the variance of y'Q₁y, which is

   Var(y'Q₁y) = 2 tr(Q₁VQ₁V)
              = 2 tr(Q₁V₀Q₁V₀)σ⁴₀ + 4 tr(Q₁V₀Q₁V₁)σ²₀σ²₁ + 4 tr(Q₁V₀Q₁V₂)σ²₀σ²₂
              + 2 tr(Q₁V₁Q₁V₁)σ⁴₁ + 4 tr(Q₁V₁Q₁V₂)σ²₁σ²₂ + 2 tr(Q₁V₂Q₁V₂)σ⁴₂.

The (1,2) element is the covariance between the first and second quadratic forms,

   Cov(y'Q₁y, y'Q₂y) = 2 tr(Q₁VQ₂V),

and similarly for the other terms. All of the results can be summarized in a table whose rows are Var(w₁), Cov(w₁, w₂), Cov(w₁, w₃), Var(w₂), Cov(w₂, w₃), and Var(w₃), and whose columns give the coefficients of σ⁴₀, σ²₀σ²₁, σ²₀σ²₂, σ⁴₁, σ²₁σ²₂, and σ⁴₂ in each of those variances and covariances.
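The estimation step, equating the quadratic forms to their expectations and solving Fσ = w, is one line of linear algebra. A sketch with illustrative coefficients for F (the matrix entries of the worked example did not survive cleanly in this copy, so treat them as hypothetical) and the reported quadratic-form values for w:

```python
import numpy as np

# Hypothetical coefficient matrix F: row k holds the coefficients of
# (sigma2_0, sigma2_1, sigma2_2) in E(y'Q_k y).
F = np.array([[4.0, 1.0, 1.0],
              [4.0, 4.0, 1.0],
              [6.0, 4.0, 4.0]])
w = np.array([657.0, 306.0, 88.0])   # observed values of the quadratic forms

sigma_hat = np.linalg.solve(F, w)    # unbiased estimates: sigma_hat = F^-1 w
```

Note that nothing constrains `sigma_hat` to be nonnegative; negative estimates can and do occur, as the ratio example later in this section shows.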
To get numeric values for these variances, the true components need to be known. Assume that the true values are σ²₀ = 50, σ²₁ = 20, and σ²₂ = 80. Substituting these into the expansion of Var(y'Q₁y) gives its numeric value, and the remaining elements of the 3×3 matrix Var(w) follow in the same way. The variance-covariance matrix of the estimated variances (assuming the above true values) would then be

   Var(ˆσ) = F⁻¹ Var(w) (F⁻¹)' = C,

where, for example, Var(σ̂²₀) = 405,700, Var(σ̂²₂) = 93,500, and Cov(σ̂²₀, σ̂²₂) = −40,700.

7.5 Variance of A Ratio of Variance Estimates

Often estimates of ratios of functions of the variances are needed for animal breeding work, such as heritabilities, repeatabilities, and variance ratios. Let such a ratio be denoted as a/c, where

   a = σ̂²₂ = (0 0 1)ˆσ = 7

and

   c = σ̂²₀ + σ̂²₁ + σ̂²₂ = (1 1 1)ˆσ = 88.

(NOTE: the negative estimate for σ̂²₁ was set to zero before calculating c.) From Osborne and Patterson (1952) and Rao (1968), an approximation to the variance of a ratio is given by

   Var(a/c) = (c² Var(a) + a² Var(c) − 2ac Cov(a, c))/c⁴.

Now note that

   Var(a) = (0 0 1)C(0 0 1)' = 93,500,
   Var(c) = (1 1 1)C(1 1 1)',
   Cov(a, c) = (0 0 1)C(1 1 1)' = 94,750.

Then

   Var(a/c) = [(88)²(93,500) + (7)² Var(c) − 2(7)(88)(94,750)]/(88)⁴.
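The Osborne-Patterson/Rao approximation is easy to wrap as a function. In the sketch below, a, c, Var(a), and Cov(a, c) follow the worked example, while the Var(c) value is a made-up placeholder (its value did not survive in this copy of the notes):

```python
def ratio_variance(a, c, var_a, var_c, cov_ac):
    """Approximate Var(a/c) by the delta-method formula
       (c^2 Var(a) + a^2 Var(c) - 2*a*c*Cov(a,c)) / c^4."""
    return (c**2 * var_a + a**2 * var_c - 2 * a * c * cov_ac) / c**4

a, c = 7.0, 88.0
var_a, cov_ac = 93_500.0, 94_750.0
var_c = 320_000.0                      # hypothetical placeholder value

v_full = ratio_variance(a, c, var_a, var_c, cov_ac)
# Simplified version that treats the denominator as a known constant:
v_const_denom = ratio_variance(a, c, var_a, 0.0, 0.0)   # = Var(a) / c^2
```

Setting Var(c) and Cov(a, c) to zero recovers the second approximation from the notes, Var(a/c) = Var(a)/c².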
This result is very large, but could be expected from only 3 observations.

Another approximation method assumes that the denominator has been estimated accurately, so that it is considered to be a constant, such as an estimate of σ²_e. Then

   Var(a/c) = Var(a)/c².

For the example problem, this gives

   Var(a/c) = 93,500/(88)²,

which is slightly larger than the previous approximation. The second approximation would not be suitable for a ratio of the residual variance to the variance of one of the other components. Suppose a = σ̂²₀ and c = σ̂²₂; then, with the first method,

   Var(a/c) = [c²(405,700) + a²(93,500) − 2ac(−40,700)]/c⁴,

and with the second method,

   Var(a/c) = 405,700/c².

The first method is probably more realistic in this situation, but both variances are very large.

8 Useful Derivatives of Quantities

The following results are necessary for the derivation of methods of variance component estimation based on the multivariate normal distribution.

1. The (co)variance matrix of y is

   V = Σ_{i=1}^{s} Z_iG_iZ_i'σ²_i + Rσ²₀ = ZGZ' + R.

Usually each G_i is assumed to be I for most random factors, but for animal models G_i might be equal to A, the additive genetic relationship matrix. Thus, G_i does not always have to be diagonal, and will not be an identity in animal model analyses.
2. The inverse of V is

   V⁻¹ = R⁻¹ − R⁻¹Z(Z'R⁻¹Z + G⁻¹)⁻¹Z'R⁻¹.

To prove, show that VV⁻¹ = I. Let T = Z'R⁻¹Z + G⁻¹; then

   VV⁻¹ = (ZGZ' + R)[R⁻¹ − R⁻¹ZT⁻¹Z'R⁻¹]
        = ZGZ'R⁻¹ − ZGZ'R⁻¹ZT⁻¹Z'R⁻¹ + I − ZT⁻¹Z'R⁻¹
        = I + [ZGT − ZGZ'R⁻¹Z − Z](T⁻¹Z'R⁻¹)
        = I + [ZG(Z'R⁻¹Z + G⁻¹) − ZGZ'R⁻¹Z − Z](T⁻¹Z'R⁻¹)
        = I + [ZGZ'R⁻¹Z + Z − ZGZ'R⁻¹Z − Z](T⁻¹Z'R⁻¹)
        = I + [0](T⁻¹Z'R⁻¹)
        = I.

3. If k is a scalar constant and A is any square matrix of order m, then

   |Ak| = k^m |A|.

4. For general square matrices of the same order, say M and U,

   |MU| = |M| |U|.

5. For the general partitioned matrix below, with A and D being square and non-singular (i.e. the inverse of each exists),

   | A   Q |
   | −B  D |  = |A| |D + BA⁻¹Q| = |D| |A + QD⁻¹B|.

Then if A = I and D = I,

   |I + QB| = |I + BQ| = |I + B'Q'| = |I + Q'B'|.
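Both the inverse identity in (2) and the determinant identity in (5) are easy to confirm numerically. A NumPy sketch with arbitrary dimensions and simple diagonal choices for G and R (not values from the notes):

```python
import numpy as np

rng = np.random.default_rng(7)
n, q = 6, 3
Z = rng.normal(size=(n, q))
G = 2.0 * np.eye(q)          # G = I * sigma2_u, a simple choice
R = 1.5 * np.eye(n)          # R = I * sigma2_e

V = Z @ G @ Z.T + R
Rinv = np.linalg.inv(R)
T = Z.T @ Rinv @ Z + np.linalg.inv(G)       # T = Z'R^-1 Z + G^-1
# Identity (2): V^-1 = R^-1 - R^-1 Z T^-1 Z' R^-1
Vinv_id = Rinv - Rinv @ Z @ np.linalg.inv(T) @ Z.T @ Rinv

# Identity (5) with A = I, D = I: |I + QB| = |I + BQ| for rectangular Q, B
Q = rng.normal(size=(n, q))
B = rng.normal(size=(q, n))
d1 = np.linalg.det(np.eye(n) + Q @ B)
d2 = np.linalg.det(np.eye(q) + B @ Q)
```

The second identity is useful precisely because the two determinants have different orders (n versus q), so the cheaper one can be evaluated.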
6. Using the results in (4) and (5), then

   |V| = |R + ZGZ'|
       = |R| |I + R⁻¹ZGZ'|
       = |R| |I + Z'R⁻¹ZG|
       = |R| |(G⁻¹ + Z'R⁻¹Z)G|
       = |R| |G⁻¹ + Z'R⁻¹Z| |G|.

7. The mixed model coefficient matrix of Henderson can be denoted by

   C = [ X'R⁻¹X    X'R⁻¹Z
         Z'R⁻¹X    Z'R⁻¹Z + G⁻¹ ];

then the determinant of C can be derived as

   |C| = |X'R⁻¹X| |G⁻¹ + Z'(R⁻¹ − R⁻¹X(X'R⁻¹X)⁻¹X'R⁻¹)Z|
       = |Z'R⁻¹Z + G⁻¹| |X'(R⁻¹ − R⁻¹Z(Z'R⁻¹Z + G⁻¹)⁻¹Z'R⁻¹)X|.

Now let

   S = R⁻¹ − R⁻¹X(X'R⁻¹X)⁻¹X'R⁻¹;

then

   |C| = |X'R⁻¹X| |G⁻¹ + Z'SZ| = |Z'R⁻¹Z + G⁻¹| |X'V⁻¹X|.

8. A projection matrix, P, is defined as

   P = V⁻¹ − V⁻¹X(X'V⁻¹X)⁻¹X'V⁻¹.

Properties of P:

   PX = 0,
   Py = V⁻¹(y − Xb̂), where b̂ = (X'V⁻¹X)⁻¹X'V⁻¹y.

Therefore,

   y'PZ_iG_iZ_i'Py = (y − Xb̂)'V⁻¹Z_iG_iZ_i'V⁻¹(y − Xb̂).
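The determinant factorization in (6) and the projection-matrix properties in (8) can likewise be checked on a small random example. A NumPy sketch (all dimensions and values are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n, q = 7, 3
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # full-rank fixed design
Z = rng.normal(size=(n, q))
G = 2.0 * np.eye(q)
R = 1.5 * np.eye(n)
V = Z @ G @ Z.T + R
Vinv = np.linalg.inv(V)

# (6): |V| = |R| * |G^-1 + Z'R^-1Z| * |G|
det_lhs = np.linalg.det(V)
det_rhs = (np.linalg.det(R)
           * np.linalg.det(np.linalg.inv(G) + Z.T @ np.linalg.inv(R) @ Z)
           * np.linalg.det(G))

# (8): P = V^-1 - V^-1 X (X'V^-1X)^-1 X'V^-1, with PX = 0 and Py = V^-1(y - X b_hat)
P = Vinv - Vinv @ X @ np.linalg.inv(X.T @ Vinv @ X) @ X.T @ Vinv
y = rng.normal(size=n)
b_hat = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
```

PX = 0 is the property that makes quadratic forms in P translation invariant, which is why P appears throughout likelihood-based variance component estimation.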
9. The derivative of V⁻¹ is

   ∂V⁻¹/∂σ²_i = −V⁻¹ (∂V/∂σ²_i) V⁻¹.

10. The derivative of ln|V| is

   ∂ ln|V| / ∂σ²_i = tr(V⁻¹ ∂V/∂σ²_i).

11. The derivative of P is

   ∂P/∂σ²_i = −P (∂V/∂σ²_i) P.

12. The derivative of V is

   ∂V/∂σ²_i = Z_iG_iZ_i'.

13. The derivative of ln|X'V⁻¹X| is

   ∂ ln|X'V⁻¹X| / ∂σ²_i = −tr[(X'V⁻¹X)⁻¹ X'V⁻¹ (∂V/∂σ²_i) V⁻¹X].

9 Random Number Generators and R

R has some very good random number generators built into it. These functions are very useful for applying Gibbs sampling methods in Bayesian estimation. Generators for the uniform, normal, chi-square, and Wishart distributions are necessary. Below are some examples of the various functions in R. The package called MCMCpack should be obtained from the CRAN website.
require(MCMCpack)

# Uniform distribution generator
#   n   = number of variates to generate
#   min = minimum number in range
#   max = maximum number in range
x = runif(100, min = 5, max = 10)

# Normal distribution generator
#   n     = number of deviates to generate
#   xmean = mean of the distribution you want
#   xsd   = standard deviation of deviates you want
w = rnorm(100, -2, 6.3)

# Chi-square generator
#   n   = number of deviates to generate
#   df  = degrees of freedom
#   ncp = non-centrality parameter, usually 0
w = rchisq(25, 4, 0)

# Inverted Wishart matrix generator
#   df = degrees of freedom
#   SS = matrix of sums of squares and crossproducts
U = riwish(df, SS)
# New covariance matrix is the inverse of U
V = ginv(U)

A chi-square variate with m degrees of freedom is the sum of squares of m standard normal deviates. The random number generator, however, makes use of a gamma distribution, which with the appropriate parameters is a chi-square distribution. The uniform distribution is the key distribution for all other distribution generators. R uses the Mersenne Twister (Matsumoto and Nishimura, 1997), with a period of 2^19937 − 1; the Twister is based on a Mersenne prime number.
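For readers not working in R, the uniform, normal, and chi-square generators have direct counterparts in NumPy's `numpy.random`; a hedged sketch mirroring the (reconstructed) R calls above, with the same caveat that the numeric arguments are illustrative (NumPy has no built-in inverted-Wishart generator, so `riwish` has no one-line equivalent here):

```python
import numpy as np

rng = np.random.default_rng(2004)

x = rng.uniform(low=5, high=10, size=100)       # cf. runif(100, min=5, max=10)
w = rng.normal(loc=-2.0, scale=6.3, size=100)   # cf. rnorm(100, -2, 6.3)
s = rng.chisquare(df=4, size=25)                # cf. rchisq(25, 4, 0)
nc = rng.noncentral_chisquare(df=4, nonc=3.0, size=25)  # ncp > 0 case
```

NumPy's default generator is no longer the Mersenne Twister but PCG64; the Twister remains available through `np.random.Generator(np.random.MT19937(seed))` if bitwise reproduction of Twister streams is needed.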
10 Positive Definite Matrices

A covariance matrix should be positive definite. To check a matrix, compute the eigenvalues and eigenvectors of the matrix. All eigenvalues should be positive. If they are not positive, then they can be modified, and a new covariance matrix constructed from the eigenvectors and the modified set of eigenvalues. The procedure is shown in the following R statements.

# Compute eigenvalues and eigenvectors
GE = eigen(G)
nre = length(GE$values)
for(i in 1:nre) {
   qp = GE$values[i]
   if(qp < 0) qp = (qp*qp)/10000
   GE$values[i] = qp
}
# Re-form new matrix
Gh = GE$vectors
GG = Gh %*% diag(GE$values) %*% t(Gh)

If the eigenvalues are all positive, then the new matrix, GG, will be the same as the input matrix, G.

EXERCISES

1. This is an example of Henderson's Method 1 of unbiased estimation of variance components. Let

   y = 1µ + Z₁u₁ + Z₂u₂ + e,

where u₁ = (u₁₁ u₁₂)' and u₂ = (u₂₁ u₂₂ u₂₃ u₂₄)'. Also,

   V = Z₁Z₁'σ²₁ + Z₂Z₂'σ²₂ + Iσ²₀.

Calculate the following:

(a) M = 1(1'1)⁻¹1'
(b) A = Z₁(Z₁'Z₁)⁻¹Z₁'

(c) B = Z₂(Z₂'Z₂)⁻¹Z₂'

(d) Q₀ = I − M

(e) Q₁ = A − M

(f) Q₂ = B − M

(g) y'Q₀y

(h) y'Q₁y

(i) y'Q₂y

(j) E(y'Q₀y) = tr(Q₀V₀)σ²₀ + tr(Q₀V₁)σ²₁ + tr(Q₀V₂)σ²₂

(k) E(y'Q₁y)

(l) E(y'Q₂y)

(m) Estimate the variances.

(n) Compute the variances of the estimated variances.

2. Check a given matrix R for positive definiteness, and create a new modified matrix from it that is positive definite (if it is not already positive definite).
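The eigenvalue-modification procedure of Section 10 translates directly out of R. A Python sketch (the replacement rule q → q²/10000 mirrors the R loop above; the 2×2 test matrix is an arbitrary indefinite example):

```python
import numpy as np

def make_positive_definite(G, shrink=1e-4):
    """Replace each negative eigenvalue q with the small positive value
    q*q*shrink, then rebuild the matrix from the eigenvectors --
    mirroring the R loop that uses (qp*qp)/10000."""
    vals, vecs = np.linalg.eigh(G)
    vals = np.where(vals < 0, vals * vals * shrink, vals)
    return vecs @ np.diag(vals) @ vecs.T

G = np.array([[2.0, 3.0],
              [3.0, 1.0]])     # indefinite: one eigenvalue is negative
GG = make_positive_definite(G)
```

As the notes state for the R version, a matrix whose eigenvalues are already all positive comes back unchanged.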
From: Estimation of Variance Components in Animal Breeding, L. R. Schaeffer, July 19-23, 2004, Iowa State University Short Course.
More informationLecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011
Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector
More informationChapter 5 Matrix Approach to Simple Linear Regression
STAT 525 SPRING 2018 Chapter 5 Matrix Approach to Simple Linear Regression Professor Min Zhang Matrix Collection of elements arranged in rows and columns Elements will be numbers or symbols For example:
More informationSTT 843 Key to Homework 1 Spring 2018
STT 843 Key to Homework Spring 208 Due date: Feb 4, 208 42 (a Because σ = 2, σ 22 = and ρ 2 = 05, we have σ 2 = ρ 2 σ σ22 = 2/2 Then, the mean and covariance of the bivariate normal is µ = ( 0 2 and Σ
More information4 Multiple Linear Regression
4 Multiple Linear Regression 4. The Model Definition 4.. random variable Y fits a Multiple Linear Regression Model, iff there exist β, β,..., β k R so that for all (x, x 2,..., x k ) R k where ε N (, σ
More informationOn testing the equality of mean vectors in high dimension
ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 17, Number 1, June 2013 Available online at www.math.ut.ee/acta/ On testing the equality of mean vectors in high dimension Muni S.
More informationLinear Models in Statistics
Linear Models in Statistics ALVIN C. RENCHER Department of Statistics Brigham Young University Provo, Utah A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane
More informationx. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).
.8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7
MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is
More informationMEI Exam Review. June 7, 2002
MEI Exam Review June 7, 2002 1 Final Exam Revision Notes 1.1 Random Rules and Formulas Linear transformations of random variables. f y (Y ) = f x (X) dx. dg Inverse Proof. (AB)(AB) 1 = I. (B 1 A 1 )(AB)(AB)
More information5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.
88 Chapter 5 Distribution Theory In this chapter, we summarize the distributions related to the normal distribution that occur in linear models. Before turning to this general problem that assumes normal
More information" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2
Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the
More informationLet X and Y denote two random variables. The joint distribution of these random
EE385 Class Notes 9/7/0 John Stensby Chapter 3: Multiple Random Variables Let X and Y denote two random variables. The joint distribution of these random variables is defined as F XY(x,y) = [X x,y y] P.
More informationSTAT 135 Lab 13 (Review) Linear Regression, Multivariate Random Variables, Prediction, Logistic Regression and the δ-method.
STAT 135 Lab 13 (Review) Linear Regression, Multivariate Random Variables, Prediction, Logistic Regression and the δ-method. Rebecca Barter May 5, 2015 Linear Regression Review Linear Regression Review
More informationLecture 15: Multivariate normal distributions
Lecture 15: Multivariate normal distributions Normal distributions with singular covariance matrices Consider an n-dimensional X N(µ,Σ) with a positive definite Σ and a fixed k n matrix A that is not of
More informationLecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012
Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini April 27, 2018 1 / 1 Table of Contents 2 / 1 Linear Algebra Review Read 3.1 and 3.2 from text. 1. Fundamental subspace (rank-nullity, etc.) Im(X ) = ker(x T ) R
More informationProbability Theory and Statistics. Peter Jochumzen
Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................
More informationPrediction. is a weighted least squares estimate since it minimizes. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark
Prediction Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark March 22, 2017 WLS and BLUE (prelude to BLUP) Suppose that Y has mean β and known covariance matrix V (but Y need not
More informationExercises and Answers to Chapter 1
Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean
More informationCourse information: Instructor: Tim Hanson, Leconte 219C, phone Office hours: Tuesday/Thursday 11-12, Wednesday 10-12, and by appointment.
Course information: Instructor: Tim Hanson, Leconte 219C, phone 777-3859. Office hours: Tuesday/Thursday 11-12, Wednesday 10-12, and by appointment. Text: Applied Linear Statistical Models (5th Edition),
More informationMultivariate Distributions
Chapter 2 Multivariate Distributions and Transformations 2.1 Joint, Marginal and Conditional Distributions Often there are n random variables Y 1,..., Y n that are of interest. For example, age, blood
More informationIntroduction to Normal Distribution
Introduction to Normal Distribution Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 17-Jan-2017 Nathaniel E. Helwig (U of Minnesota) Introduction
More informationLarge Sample Properties of Estimators in the Classical Linear Regression Model
Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in
More informationMultivariate Regression
Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More informationReference: Davidson and MacKinnon Ch 2. In particular page
RNy, econ460 autumn 03 Lecture note Reference: Davidson and MacKinnon Ch. In particular page 57-8. Projection matrices The matrix M I X(X X) X () is often called the residual maker. That nickname is easy
More informationThe Multivariate Gaussian Distribution
The Multivariate Gaussian Distribution Chuong B. Do October, 8 A vector-valued random variable X = T X X n is said to have a multivariate normal or Gaussian) distribution with mean µ R n and covariance
More informationLecture 11. Multivariate Normal theory
10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances
More informationvariability of the model, represented by σ 2 and not accounted for by Xβ
Posterior Predictive Distribution Suppose we have observed a new set of explanatory variables X and we want to predict the outcomes ỹ using the regression model. Components of uncertainty in p(ỹ y) variability
More informationUniversity of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout 2:. The Multivariate Gaussian & Decision Boundaries
University of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout :. The Multivariate Gaussian & Decision Boundaries..15.1.5 1 8 6 6 8 1 Mark Gales mjfg@eng.cam.ac.uk Lent
More informationBASICS OF PROBABILITY
October 10, 2018 BASICS OF PROBABILITY Randomness, sample space and probability Probability is concerned with random experiments. That is, an experiment, the outcome of which cannot be predicted with certainty,
More information01 Probability Theory and Statistics Review
NAVARCH/EECS 568, ROB 530 - Winter 2018 01 Probability Theory and Statistics Review Maani Ghaffari January 08, 2018 Last Time: Bayes Filters Given: Stream of observations z 1:t and action data u 1:t Sensor/measurement
More informationProperties of Matrices and Operations on Matrices
Properties of Matrices and Operations on Matrices A common data structure for statistical analysis is a rectangular array or matris. Rows represent individual observational units, or just observations,
More informationRestricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model
Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives
More informationSOME THEOREMS ON QUADRATIC FORMS AND NORMAL VARIABLES. (2π) 1 2 (σ 2 ) πσ
SOME THEOREMS ON QUADRATIC FORMS AND NORMAL VARIABLES 1. THE MULTIVARIATE NORMAL DISTRIBUTION The n 1 vector of random variables, y, is said to be distributed as a multivariate normal with mean vector
More informationSo far our focus has been on estimation of the parameter vector β in the. y = Xβ + u
Interval estimation and hypothesis tests So far our focus has been on estimation of the parameter vector β in the linear model y i = β 1 x 1i + β 2 x 2i +... + β K x Ki + u i = x iβ + u i for i = 1, 2,...,
More informationcomponent risk analysis
273: Urban Systems Modeling Lec. 3 component risk analysis instructor: Matteo Pozzi 273: Urban Systems Modeling Lec. 3 component reliability outline risk analysis for components uncertain demand and uncertain
More information