Solving Large Test-Day Models by Iteration on Data and Preconditioned Conjugate Gradient


M. LIDAUER, I. STRANDÉN, E. A. MÄNTYSAARI, J. PÖSÖ, and A. KETTUNEN
Animal Production Research, Agricultural Research Centre, Jokioinen, Finland

ABSTRACT

A preconditioned conjugate gradient method was implemented into an iteration on data program for the estimation of breeding values, and its convergence characteristics were studied. An algorithm was used as a reference in which one fixed effect was solved by the Gauss-Seidel method and all other effects were solved by a second-order Jacobi method. Implementation of the preconditioned conjugate gradient required storing four vectors (each of a size equal to the number of unknowns in the mixed model equations) in random access memory and reading the data at each round of iteration. The preconditioner comprised diagonal blocks of the coefficient matrix. Comparison of the algorithms was based on solutions of mixed model equations obtained with a single-trait animal model and with a single-trait, random regression test-day model. Data sets for both models used milk yield records of primiparous Finnish dairy cows. The animal model data comprised 665,629 lactation milk yields, and the random regression test-day model data 6,732,765 test-day milk yields. Both models included pedigree information on 1,099,622 animals. The animal model {random regression test-day model} required 122 {305} rounds of iteration to converge with the reference algorithm, but only 88 {149} were required with the preconditioned conjugate gradient. Solving the random regression test-day model with the preconditioned conjugate gradient required 237 megabytes of random access memory and took 14% of the computation time needed by the reference algorithm.
(Key words: iteration on data, preconditioned conjugate gradient, test-day model)

Abbreviation key: GSSJ = Gauss-Seidel second-order Jacobi, LSC = least significant change in indices, MME = mixed model equations, PCG = preconditioned conjugate gradient, RRM = random regression test-day model, STM = single-trait animal model.

Received January 5. Accepted June 4.

INTRODUCTION

Recently, more accurate and realistic statistical models have been introduced for dairy cattle breeding value estimation (7, 13). In particular, implementation of test-day models into routine national evaluations of breeding values has been reported in several studies (6, 7, 10, 15, 21). One practical difficulty in the utilization of test-day models is the heavy computing requirement arising from a dramatic increase in the number of unknowns to be solved. For example, in Canada a multiple-trait, random regression test-day model (RRM) with 27 equations per animal with records led to mixed model equations (MME) with over 87 million unknowns (7). Under Finnish conditions, replacement of the current single-trait repeatability animal model with a multiple-trait RRM would increase the number of unknowns in the MME from 3.5 million to about 50 million. MME of such size can be solved only by powerful iterative methods. The most common iteration algorithms for the estimation of breeding values (e.g., second-order Jacobi, Gauss-Seidel, and successive overrelaxation) belong to the family of linear stationary iterative methods (4). Consecutive solutions obtained by these algorithms converge by a geometric process. Solutions approach the true solutions rapidly during the early stage of iteration but slowly at the later stages (1). Theoretically, obtaining the true solutions would require an infinite number of iterations. Consequently, the stopping criterion of the iteration process is a compromise between the accuracy of the solutions and the cost of computation (6, 15).
A proper stopping criterion may often be difficult to find because the accuracy of intermediate solutions is unknown, and the formulas that give a good approximation of the relative error require considerably more computation (4). To overcome this problem, quasi-true solutions, obtained by performing many iterations, are often used to assess the iteration round at which the desired accuracy of the solutions has been reached (12, 15, 19, 22). A low rate of convergence may require too many iterations to obtain quasi-true solutions. Thus, for very large MME, empirical investigation of the stopping criterion might be

1999 J Dairy Sci 82:2788–2796

impossible. Moreover, it is questionable whether a stopping criterion validated on a subset of the data applies to the complete data, which might behave differently (pedigree length and connectedness). If so, uncertainty exists as to whether the solutions have converged at a given round of iteration when Jacobi, Gauss-Seidel, or related iterative methods are used to solve large MME. Methods based on conjugate gradients (5) have become dominant for solving linear systems of equations in the field of numerical analysis. These methods give the true solutions in a finite number of iteration steps (4). Furthermore, tuned parameters, like the relaxation factors in second-order Jacobi or in successive overrelaxation, are not necessarily required. In animal breeding, only a few studies have investigated the potential of conjugate gradient methods for solving large linear models. In solving a multiple-trait sire model (20), the conjugate gradient method was found to be less efficient than the successive overrelaxation method with an optimum relaxation factor. The conjugate gradient method was 55% more efficient than successive overrelaxation when the diagonal of the coefficient matrix was used as a preconditioner (1). Also, Carabaño et al. (2) found the preconditioned conjugate gradient (PCG) method to be superior to Gauss-Seidel-related algorithms but observed that the method was not stable in certain cases. In all of these studies, the size of the MME was less than 1,000 equations, which left unresolved whether the conjugate gradient method is favorable for solving large MME. The objective of this study was to implement the PCG method into an iteration on data BLUP program. The convergence characteristics of PCG were compared with those of a typical iteration on data algorithm, in which one fixed effect was solved by the Gauss-Seidel method and the other effects were solved by a second-order Jacobi method.
Algorithms were tested with a single-trait animal model (STM) and with an RRM for the same data set.

MATERIALS AND METHODS

Data

The data were from primiparous Finnish Ayrshires, Holstein-Friesians, and Finncattle calving between January 1988 and October. The 305-d lactation milk yields of 665,629 cows and 6,732,765 test-day milk yields of 674,397 cows were used. Test-day observations were restricted to within 4 to 350 DIM; herds with fewer than 20 test-day measurements were discarded. Pedigree data for both analyses comprised 7,795 bulls and 1,091,827 cows from the three breeds. Breed differences and genetic differences between base animals of different ages were described by 108 phantom parent groups.

Statistical Models

Model 1. The STM, as given by Pösö et al. (14), was

y_ijklmn = herd_i + cmy_j + ad_k + hcy_l + a_m + e_ijklmn

where y_ijklmn is the 305-d milk yield, herd_i is the herd effect, cmy_j is the calving month × calving year effect, ad_k is the calving age × days open effect, hcy_l is the random effect of calving year within herd, a_m is the random additive genetic effect, and e_ijklmn is the random residual. There were 4,468 herds, 103 calving month × calving year classes, 49 calving age × days open classes, and 170,353 calving year within herd levels. In matrix notation the model can be written as

y = Hh + Xf + Tc + Za + e

where h contains the herd effects, f includes all other fixed effects, c is the random effect of calving year within herd, a is the additive genetic effect, and e contains the random residuals. H, X, T, and Z are incidence matrices. It was assumed that var(c) = Iσ²_c and var(a) = Aσ²_a, where A is the numerator relationship matrix, and var(e) = R = Iσ²_e. The MME can be written

[ H'R⁻¹H  H'R⁻¹X  H'R⁻¹T               H'R⁻¹Z               ] [ĥ]   [H'R⁻¹y]
[ X'R⁻¹H  X'R⁻¹X  X'R⁻¹T               X'R⁻¹Z               ] [f̂] = [X'R⁻¹y]
[ T'R⁻¹H  T'R⁻¹X  T'R⁻¹T + I(σ²_c)⁻¹   T'R⁻¹Z               ] [ĉ]   [T'R⁻¹y]
[ Z'R⁻¹H  Z'R⁻¹X  Z'R⁻¹T               Z'R⁻¹Z + A⁻¹(σ²_a)⁻¹ ] [â]   [Z'R⁻¹y]

The variance components were the same as given in Pösö et al. (14): σ²_c/σ²_e = 0.185, σ²_a/σ²_e = 0.617, and σ²_e = 431,10 kg².

Model 2.
The RRM, based on covariance functions, was

y_ijklmnopq = herd_i + ym_j + Σ(r = 1..5) s_kr v_r + age_l + dcc_m + htm_n + φ'_o(p) a_p + φ'_o(p) p_p + e_ijklmnopq

where y_ijklmnopq is the test-day milk yield; herd_i is the herd effect; ym_j is the test-year × test-month effect; the s_kr are five regression coefficients of test-day milk yield on DIM, which describe the shape of the lactation curves within calving season class k; v = [1 c c² d d²]', where c and c² are the linear and quadratic Legendre polynomials (9) for DIM, and d = ln(DIM); age_l is the calving age effect; dcc_m is the days carried calf effect; htm_n is the random effect of test-month within herd; a_p is a vector of three random regression coefficients describing the breeding value of animal p; φ_o(p) is a vector of the first three Legendre polynomials (9) for the DIM of observation o of animal p; p_p is a vector of the first three random regression coefficients for nonhereditary animal effects describing the environmental covariances among measurements along the lactation of animal p; and e_ijklmnopq is the random residual. There were 4,31 herds, 106 test-year × test-month classes, 8 calving age classes, 5 days carried calf classes, and 1,933,641 test-month within herd levels. The fixed regression coefficients were estimated within three calving season classes (October to February; March to June; July to September). Similarly to the STM, the RRM can be written in matrix notation as

y = Hh + Xf + Tc + Za + Wp + e

where h contains the herd effects; f includes all other fixed effects; c comprises the random test-month within herd effect; a' = [a'_1, ..., a'_n] and p' = [p'_1, ..., p'_m], where n is the number of animals and m is the number of cows with records; and e contains the random residuals. H, X, T, Z, and W are the incidence and covariate matrices. For each animal with observations, Z and W contain the appropriate Φ; for animal i with n_i observations, Φ_i = [φ_i1, ..., φ_in_i]'. Note that H, X, T, and Z, as well as the corresponding vectors h, f, c, and a, have different meanings in the RRM than in the STM.
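As a concrete illustration of the covariates φ, the following sketch evaluates the first three Legendre polynomials at a standardized DIM. The DIM range and the linear mapping onto [−1, 1] are assumptions made for the example; the paper does not spell out its exact standardization or normalization constants.

```python
import numpy as np

def legendre_covariates(dim, dim_min=4.0, dim_max=350.0):
    """First three Legendre polynomials evaluated at a standardized DIM.

    Assumes DIM is mapped linearly onto [-1, 1]; the polynomials'
    normalization constants are omitted (illustrative only).
    """
    t = 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0
    return np.array([1.0, t, 0.5 * (3.0 * t * t - 1.0)])

phi = legendre_covariates(177.0)  # mid-lactation day: t = 0, phi = [1, 0, -0.5]
```

Stacking such rows for an animal's observation days gives the matrix Φ_i used in Z and W above.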
It was assumed that

var[c; a; p; e] = diag(Iσ²_c, A ⊗ K_a, I ⊗ K_p, R)

where A is the numerator relationship matrix, K_a and K_p are the variance-covariance matrices of the additive genetic and nonhereditary animal effects, and R = Iσ²_e. Then, the MME become

[ H'R⁻¹H  H'R⁻¹X  H'R⁻¹T               H'R⁻¹Z                H'R⁻¹W             ] [ĥ]   [H'R⁻¹y]
[ X'R⁻¹H  X'R⁻¹X  X'R⁻¹T               X'R⁻¹Z                X'R⁻¹W             ] [f̂]   [X'R⁻¹y]
[ T'R⁻¹H  T'R⁻¹X  T'R⁻¹T + I(σ²_c)⁻¹   T'R⁻¹Z                T'R⁻¹W             ] [ĉ] = [T'R⁻¹y]
[ Z'R⁻¹H  Z'R⁻¹X  Z'R⁻¹T               Z'R⁻¹Z + A⁻¹ ⊗ K_a⁻¹  Z'R⁻¹W             ] [â]   [Z'R⁻¹y]
[ W'R⁻¹H  W'R⁻¹X  W'R⁻¹T               W'R⁻¹Z                W'R⁻¹W + I ⊗ K_p⁻¹ ] [p̂]   [W'R⁻¹y]

The variance-covariance components (Table 1) for the RRM were derived from multiple-trait REML variance components using the continuous covariance function approach described by Kirkpatrick et al. (9). Note that the additive genetic variance-covariance matrix for the first 305 test days can be obtained by the multiplication G = Φ K_a Φ', where Φ = [φ_1, ..., φ_305]'. The heritability for a particular test day j is h²_j = φ'_j K_a φ_j / (φ'_j K_a φ_j + φ'_j K_p φ_j + σ²_e) (Table 2). For all analyses, variance-covariance components and observations were scaled to units of residual standard deviation.

Algorithms

The MME for the STM and the RRM contained 1,294,694 and 7,280,477 equations, respectively. Because of the size of the RRM (Table 3), the iteration on data technique (11, 16, 18) was employed in the algorithms when solving for the unknowns. Iteration on data avoids forming the MME explicitly. It allows solving the MME even though they cannot be stored in memory, but the cost is that of reading the data at each round of iteration. Let C be the coefficient matrix of the MME, x the vector of unknowns, and b the right-hand side (i.e., Cx = b). Following Ducrocq (3), we rewrite the equation as

[M_0 + (C − M_0)]x = b    [1]

and then the functional iterative procedure for several iterative algorithms can be outlined as

x^(k+1) = M_0⁻¹(b − Cx^(k)) + x^(k).    [2]

Let L be the strictly lower triangular part of C, and D the diagonal of C. Then, if M_0 = D, Equation [2] defines

the Jacobi iteration. If M_0 = L + D, Equation [2] gives the Gauss-Seidel iteration. Extending Jacobi to the second-order Jacobi method increases the rate of convergence (11). Following the notation of [2], second-order Jacobi can be written as

x^(k+1) = M_0⁻¹(b − Cx^(k)) + x^(k) + γ(x^(k) − x^(k−1))    [3]

where M_0⁻¹ is D⁻¹, and γ is the relaxation factor.

Gauss-Seidel second-order Jacobi algorithm. The Gauss-Seidel second-order Jacobi (GSSJ) algorithm was used as the reference in this study. The algorithm is a hybrid of the iterative methods given above and solves the fixed effect of herd (h) by Gauss-Seidel and the other effects by second-order Jacobi (8, 10, 11). The GSSJ algorithm was implemented to utilize the block structure in the MME. The diagonal block for the equations pertaining to f was treated as a single block. For the STM, the matrix M_0 in Equation [3] becomes

[ H'R⁻¹H  0       0                                  0                                  ]
[ X'R⁻¹H  X'R⁻¹X  0                                  0                                  ]
[ T'R⁻¹H  0       diag_{s×s}{T'R⁻¹T + I(σ²_c)⁻¹}     0                                  ]
[ Z'R⁻¹H  0       0                                  diag_{t×t}{Z'R⁻¹Z + A⁻¹(σ²_a)⁻¹}   ]

where s = t = 1 and, correspondingly, for the RRM, M_0 was

[ H'R⁻¹H  0       0                                  0                                  0                                ]
[ X'R⁻¹H  X'R⁻¹X  0                                  0                                  0                                ]
[ T'R⁻¹H  0       diag_{s×s}{T'R⁻¹T + I(σ²_c)⁻¹}     0                                  0                                ]
[ Z'R⁻¹H  0       0                                  diag_{t×t}{Z'R⁻¹Z + A⁻¹ ⊗ K_a⁻¹}   0                                ]
[ W'R⁻¹H  0       0                                  0                                  diag_{t×t}{W'R⁻¹W + I ⊗ K_p⁻¹}   ]

where s = 1 and t = 3. For a particular animal i with observations, diag_{t×t}{Z'R⁻¹Z + A⁻¹ ⊗ K_a⁻¹}_i is the diagonal block Φ'_i R⁻¹Φ_i + a^ii K_a⁻¹, where a^ii is diagonal element i of A⁻¹, and diag_{t×t}{W'R⁻¹W + I ⊗ K_p⁻¹}_i is the diagonal block Φ'_i R⁻¹Φ_i + K_p⁻¹. For the effects solved by second-order Jacobi, the corresponding diagonal blocks of M_0 were inverted and stored on disk. The relaxation factor γ for the STM was 0.9, as suggested by Strandén and Mäntysaari (19). For the RRM, two relaxation factors, γ = 0.8 and γ = 0.9, were investigated. For the herd solutions (h), the relaxation factor in [3] was zero, leading to Gauss-Seidel for this effect.
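The recursion in Equation [3] can be sketched as follows with M_0 = D. This is a minimal in-memory illustration only: the paper never forms C explicitly but accumulates Cx by iteration on data, and the matrix, right-hand side, and relaxation factor below are made up for the example.

```python
import numpy as np

def second_order_jacobi(C, b, gamma, rounds):
    """Second-order Jacobi iteration (Equation [3]) with M0 = diag(C).

    gamma is the relaxation factor; gamma = 0 gives plain Jacobi.
    """
    d_inv = 1.0 / np.diag(C)                  # M0^-1 for M0 = D
    x_prev = np.zeros_like(b)
    x = np.zeros_like(b)
    for _ in range(rounds):
        x_next = d_inv * (b - C @ x) + x + gamma * (x - x_prev)
        x_prev, x = x, x_next
    return x

C = np.array([[4.0, 1.0], [1.0, 3.0]])        # small SPD example system
b = np.array([1.0, 2.0])
x = second_order_jacobi(C, b, gamma=0.1, rounds=500)
```

For a poorly chosen γ the momentum term can slow or destroy convergence, which is the sensitivity to the relaxation factor discussed later in the Results.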
The equations for the first level of the calving age × days open effect in the STM, and for the first levels of the test-year × test-month, calving age, and days carried calf effects in the RRM, were removed to ensure that X'R⁻¹X was of full rank.

Preconditioned conjugate gradient algorithm. Implementation of the PCG iterative method required storing four vectors (each of a size equal to the number of unknowns in the MME) in random access memory: a vector of residuals (r), a search-direction vector (d), the solution vector (x), and a work vector (v). Each round of iteration required one pass through the data to calculate the product Cd. The preconditioner matrix M was a block diagonal matrix formed from the M_0 matrix of GSSJ but without the off-diagonal blocks (X'R⁻¹H, T'R⁻¹H, Z'R⁻¹H, and W'R⁻¹H). The inverse of the preconditioner matrix (M⁻¹) was stored on disk and read at each round of iteration. The starting values were x^(0) = 0, r^(0) = b − Cx^(0) = b, and d^(0) = M⁻¹r^(0) = M⁻¹b. At every iteration step (k + 1), the following calculations were performed:

v = Cd^(k),
α = r^(k)'M⁻¹r^(k) / d^(k)'v,
x^(k+1) = x^(k) + αd^(k),
r^(k+1) = r^(k) − αv,
v = M⁻¹r^(k+1),
β = r^(k+1)'v / r^(k)'M⁻¹r^(k), and
d^(k+1) = v + βd^(k)    [4]

where α and β are step sizes in the PCG method. Restrictions were imposed on the same equations as for GSSJ, either on both C and M or on M only.
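The recursions in [4] can be sketched as follows for a small in-memory system. Here M⁻¹ is a dense matrix for brevity, whereas in the paper M is block diagonal, its inverse is stored on disk, and Cd is accumulated by a pass through the data; the example matrices are made up.

```python
import numpy as np

def pcg(C, b, M_inv, rounds=100, tol=1e-12):
    """Preconditioned conjugate gradient following recursions [4]."""
    x = np.zeros_like(b)
    r = b - C @ x                  # r(0) = b because x(0) = 0
    d = M_inv @ r                  # d(0) = M^-1 r(0)
    rMr = r @ (M_inv @ r)
    for _ in range(rounds):
        v = C @ d
        alpha = rMr / (d @ v)      # step size alpha
        x = x + alpha * d
        r = r - alpha * v
        if np.linalg.norm(r) / np.linalg.norm(b) < tol:
            break
        v = M_inv @ r
        rMr_next = r @ v
        beta = rMr_next / rMr      # step size beta
        d = v + beta * d
        rMr = rMr_next
    return x

C = np.array([[4.0, 1.0], [1.0, 3.0]])   # small SPD example system
b = np.array([1.0, 2.0])
M_inv = np.diag(1.0 / np.diag(C))        # diagonal preconditioner
x = pcg(C, b, M_inv)
```

Note that the only quantities kept between rounds are x, r, d, and the scalar r'M⁻¹r, which matches the four-vector memory footprint described above (v is the work vector).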

TABLE 1. Variance-covariance components for the test-month within herd effect (σ²_c), the additive genetic (K_a) and nonhereditary animal (K_p) effects, each with three regression coefficients, and the residual effect (σ²_e) when estimating breeding values for milk yield with the random regression test-day model.¹

         K_a                               K_p
σ²_c     Linear  Quadratic  Cubic          Linear  Quadratic  Cubic          σ²_e
         (−0.04) (−0.67)    (0.04)         (−0.19) (0.8)      (0.5)

¹Variance-covariance components are scaled by the residual standard deviation. The correlations between regression coefficients are in parentheses.

Investigation of Convergence

For both algorithms, the stage of convergence was monitored after each round of iteration. Two convergence indicators were used: the relative difference between consecutive solutions,

c_d^(n) = ‖x^(n+1) − x^(n)‖ / ‖x^(n+1)‖

and the relative average difference between the right-hand and left-hand sides (1),

c_r^(n) = ‖b − Cx^(n+1)‖ / ‖b‖

where ‖y‖ = sqrt(Σ_i y_i²). To allow comparisons between the methods, we first investigated how small the values of c_r and c_d needed to be for the accuracy of the solutions to be sufficient for practical breeding work. Therefore, quasi-true EBV were obtained by performing PCG iterations until c_r became smaller than 10⁻⁶, which corresponded to a standard deviation of the values in r more than 10⁷ times smaller than the residual standard deviation. This required 301 and 681 rounds of iteration for the STM and the RRM, respectively. In the case of the RRM, the breeding values for the 305-d lactation were calculated from the animal EBV coefficients â_i: EBV_i = Σ(Φâ_i). Intermediate EBV for various c_r values were obtained from the corresponding solutions of the MME. The EBV were standardized before comparing them. In Finland, the published indices are formed by dividing EBV by 1/10 of the standard deviation of the active sires' EBV and rounding to the nearest full integer.
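Both indicators are inexpensive once Cx^(n+1) is available; a sketch, with made-up vectors, is:

```python
import numpy as np

def convergence_indicators(C, b, x_new, x_old):
    """c_d: relative difference between consecutive solutions;
    c_r: relative difference between right-hand and left-hand sides."""
    c_d = np.linalg.norm(x_new - x_old) / np.linalg.norm(x_new)
    c_r = np.linalg.norm(b - C @ x_new) / np.linalg.norm(b)
    return c_d, c_r

C = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_exact = np.linalg.solve(C, b)
c_d, c_r = convergence_indicators(C, b, x_exact, x_exact)  # both ~0 at the solution
```

The practical difference between the two is that c_d needs only the solution vectors, whereas c_r needs the matrix-vector product Cx, which PCG computes anyway but GSSJ does not.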
Thus, a difference of one index point in the published index was equal to 43.3 kg of milk in EBV. For each investigated c_r value, the correlation between the intermediate and the quasi-true indices was calculated. Furthermore, the percentage of indices differing from the quasi-true indices by one or more index points was recorded. Solutions were considered converged when less than 1% of the indices deviated, by at most one index point, from the quasi-true indices. This least significant change in the indices (LSC) was used as the convergence criterion. To avoid a reduction in selection intensity caused by inaccurate solutions of the MME, LSC was a minimum requirement. The convergence of the indices was analyzed in three animal groups: young cows, evaluated sires, and young sires. The group of young cows included all cows having their first lactation in 1995; evaluated sires consisted of bulls born in 1984 and 1985; and young sires comprised progeny-tested bulls born in 1991 and 1992. There were 8,109; 651; and 318 animals in the young cow, evaluated sire, and young sire groups, respectively.

TABLE 2. Heritability (diagonal) and genetic correlations for daily milk yield at different DIM for the random regression test-day model.

TABLE 3. Number of equations, nonzeros in the corresponding mixed model equations, and memory requirements for the preconditioned conjugate gradient (PCG) and Gauss-Seidel second-order Jacobi (GSSJ) methods when solving a single-trait animal model (STM) and a random regression test-day model (RRM). Memory requirements are given in megabytes.

Model   Number of equations   Number of nonzero elements in C¹   Size of iteration data files² (PCG, GSSJ)   Random access memory (PCG, GSSJ)
STM     1,294,697             17,71,
RRM     7,280,477             403,117,019

¹Memory requirement for storing the nonzero elements of the lower triangle and diagonal of the coefficient matrix (C) of the mixed model equations as a linked list.
²Covariables, to account for the shape of the lactation curve, were stored in a table rather than read from the iteration files.

RESULTS AND DISCUSSION

For the STM, PCG required 88 rounds of iteration to meet the convergence criterion LSC, whereas GSSJ needed 122 rounds (Table 4). This result was in agreement with the findings of Berger et al. (1), who reported 83 rounds of iteration with PCG versus 169 rounds for successive overrelaxation when solving a reduced animal model. For the RRM, the difference between the methods was even more apparent. Convergence was reached after 149 rounds of iteration with PCG but not before 305 rounds with GSSJ (Table 5). For GSSJ, the rate of convergence decreased considerably at the later stages of iteration, whereas for PCG it remained almost unchanged (Figure 1). This finding reflected the weakness of Gauss-Seidel and second-order Jacobi related methods, which required many iterations to gain an additional increase in accuracy toward the end of the iteration process. If the relaxation

TABLE 4.
Different convergence indicators for the preconditioned conjugate gradient (PCG) and Gauss-Seidel second-order Jacobi (GSSJ) methods when solving a single-trait animal model with 1,294,694 unknowns in the mixed model equations.

                                                   Cows                 Evaluated sires      Young sires
Iteration method   c_r¹   c_d²   Iteration rounds  1 pt³  ≥2 pt⁴  r_I,It⁵  1 pt³  ≥2 pt⁴  r_I,It⁵  1 pt³  ≥2 pt⁴  r_I,It⁵
PCG
GSSJ (γ⁶ = 0.9)

¹Relative difference between the right-hand and left-hand sides.
²Relative difference between consecutive solutions.
³Percentage of indices that deviate one index point from their quasi-true indices.
⁴Percentage of indices that deviate two or more index points from their quasi-true indices.
⁵Correlation between intermediate indices obtained by PCG or GSSJ and quasi-true indices obtained after the PCG iteration process reached a c_r value below 10⁻⁶.
⁶Relaxation factor for second-order Jacobi in GSSJ.

factor is not optimal, this problem can be even more severe. For instance, satisfying the convergence criterion LSC required over 600 rounds of iteration when the relaxation factor for GSSJ was 0.8 (Table 5). Carabaño et al. (2) observed in all their analyses two distinct iteration phases for PCG: an unstable starting phase, in which solutions converged and diverged alternately, was followed by a phase with a very high rate of convergence. We observed the same behavior in PCG whenever we imposed restrictions on the fixed effect equations in both the coefficient matrix and the preconditioner matrix. Note that our implementation required restrictions in the X'R⁻¹X block of the preconditioner to enable matrix inversion. When constraints were applied only to the preconditioner matrix, a high rate of convergence was realized during the entire iteration process (Figure 1; Tables 4 and 5). With constraints in both matrices, 3 and 19 additional rounds of iteration were required to reach convergence for the STM and the RRM, respectively. This result was converse to the findings of Berger et al. (1), who reported a 50% reduction in the number of iteration rounds when restrictions were imposed on the fixed effect equations. Their result was based on a sire model in which the herd-year-season effect was absorbed, and the remaining 890 equations consisted of five fixed birth year groups and 885 sires. The restriction was performed by deleting the first birth year group. According to the theory in the literature (4, 17), the PCG method guarantees convergence to the true solutions for symmetric and positive definite coefficient matrices. Without restrictions, the coefficient matrix was not of full rank and, hence, was only positive semidefinite.
Because the rate of convergence clearly improved without restrictions, and because the numerical values of all estimable functions do not change (1), it seems beneficial to leave the coefficient matrix unrestricted when the PCG method is used. From a practical point of view, comparison of the algorithms with respect to execution time is more useful. For the RRM, the PCG method required 59 CPU seconds per round of iteration, and convergence was reached after 2.5 CPU hours of computation. In contrast, the GSSJ algorithm needed 203 CPU seconds per round (without calculation of c_r), and convergence was reached after 17.2 CPU hours. Both analyses were performed on a Cycle SPARCengine Ultra AXmp (300 MHz) workstation of the Finnish Agricultural Data

TABLE 5. Different convergence indicators for the preconditioned conjugate gradient (PCG) and Gauss-Seidel second-order Jacobi (GSSJ) methods when solving a random regression test-day model with 7,280,477 unknowns in the mixed model equations.

                                                   Cows                 Evaluated sires      Young sires
Iteration method   c_r¹   c_d²   Iteration rounds  1 pt³  ≥2 pt⁴  r_I,It⁵  1 pt³  ≥2 pt⁴  r_I,It⁵  1 pt³  ≥2 pt⁴  r_I,It⁵
PCG
GSSJ (γ⁶ = 0.8)
GSSJ (γ⁶ = 0.9)

¹Relative difference between the right-hand and left-hand sides.
²Relative difference between consecutive solutions.
³Percentage of indices that deviate one index point from their quasi-true indices.
⁴Percentage of indices that deviate two or more index points from their quasi-true indices.
⁵Correlation between intermediate indices obtained by PCG or GSSJ and quasi-true indices obtained after the PCG iteration process reached a c_r value below 10⁻⁶.
⁶Relaxation factor for second-order Jacobi in GSSJ.

Figure 1. Relative average difference between the left-hand and right-hand sides (c_r) for the Gauss-Seidel second-order Jacobi method with two different relaxation factors, (a) γ = 0.8 and (b) γ = 0.9, and for (c) the preconditioned conjugate gradient, when solving a random regression test-day model with 7,280,477 unknowns in the mixed model equations.

Processing Centre. All data files were kept in random access memory during the iteration process to keep CPU time unaffected by input/output operations. Two reasons existed for the large difference in execution time between the algorithms. First, the implementation of PCG enabled more efficient program code than an algorithm employing Gauss-Seidel. Both algorithms required reading the data at each round of iteration, but additional computing time was required by GSSJ to store the contributions to the MME of each herd and to reread them to adjust the right-hand sides with the new Gauss-Seidel solutions for the herd effect. For the same reason, GSSJ does not allow the method of residual updating (18), but PCG does. Second, Strandén and Lidauer (18) introduced a new technique for iteration on data. Iteration on data requires a fixed number of calculations for each record, say p (multiplications and additions), to compute the record's contribution to the matrix products Cx in [3] and Cd in [4]. With the standard iteration on data technique, p is a quadratic function of the number of effects in the statistical model. The PCG method allows a reordering of the multiplications such that p is a linear function of the number of effects in the statistical model (18). Consequently, for the RRM, p was 573 for GSSJ but 66 for PCG. This reduction explained most of the difference in computing time per round of iteration.
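The reordering idea can be sketched as follows for the data part of Cx. For brevity, all covariates are taken as 1 and R = I, so each record contributes a block of ones to C over its k effect equations; summing the current solution values once and then spreading the sum costs O(k) per record rather than O(k²). The record layout below is made up for the example.

```python
import numpy as np

def data_matvec(records, x, n_eq):
    """Accumulate the data part of C @ x by one pass over the records.

    Each record lists the (distinct) equation addresses of its effects;
    with covariates of 1 and R = I, a record with addresses addr
    contributes sum(x[addr]) to every equation in addr.
    """
    Cx = np.zeros(n_eq)
    for addr in records:
        s = x[addr].sum()    # one pass over the record's effects
        Cx[addr] += s        # spread the sum: linear, not quadratic, cost
    return Cx

records = [np.array([0, 2]), np.array([1, 2])]   # two records, two effects each
x = np.array([1.0, 2.0, 3.0])
Cx = data_matvec(records, x, n_eq=3)             # equals W'W @ x for incidence W
```

The same two-pass pattern carries over to real covariates, where the first pass forms the record's fitted value and the second distributes the weighted residual contribution.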
In fact, computation of the product Cd in [4] with the new iteration on data technique required fewer multiplications and additions than if the sparse matrix of coefficients (403,117,019 nonzero elements) had been used. A disadvantage of PCG, in comparison with the GSSJ method, was its greater demand for random access memory, which may limit its use in large applications. One way to circumvent this problem is to store the solution vector on disk and to make the work vector unnecessary by reading the data twice at each round of iteration; the expense of these modifications is increased computing time. The most common convergence indicator in animal breeding applications is c_d because it is easy to obtain. However, it has been demonstrated (12) that evaluating convergence solely from c_d may be inappropriate because the real accuracy of the solutions can be much lower than indicated by c_d. Our results supported this conclusion. When c_d was applied to the solutions obtained by GSSJ, the indicator suggested that the accuracy of the solutions from round 300, with γ = 0.8, was higher than that from round 174, with γ = 0.9 (Table 5). However, the indices with a one-point deviation from the quasi-true indices, and the correlation between intermediate and quasi-true indices, proved the opposite (Table 5). This finding was also supported by the approximated accuracy of the solutions as derived by Misztal et al. (12), which was lower for the solutions from round 300 with γ = 0.8 than for those from round 174 with γ = 0.9. The convergence indicator c_r was regarded as more reliable (20), but for the second-order Jacobi and Gauss-Seidel methods, calculation of c_r is expensive. In PCG, all components of c_r are readily available. Estimation of breeding values with the RRM required greater accuracy in the solutions of the MME than with the STM. This was because a breeding value of an animal in the RRM is a function of breeding value coefficients (â_p) rather than a single solution from the MME.
For both models, a high correlation was observed between the quasi-true indices and the indices that fulfilled the convergence criterion LSC. The correlations between the converged indices from the STM and the RRM were 0.967, 0.990, and 0.988 for young cows, evaluated sires, and young sires, respectively. However, the percentages of indices differing by two or more index points between the two models were 49.5, 8.4, and 44.6 for young cows, evaluated sires, and young sires, respectively. This finding indicated a significant change in the ranking of animals when estimating EBV with the STM versus the RRM.

CONCLUSIONS

The PCG method seems an attractive alternative for solving large MME. Solving the MME of the RRM was accomplished in only 14% of the computation time needed by GSSJ. The implementation of PCG was straightforward and required no tuned parameters (e.g., relaxation factors), which is another advantage over second-order Jacobi related methods. We observed that PCG performed better when no restrictions were imposed on the coefficient matrix; thus, convergence was not impaired by the coefficient matrix being positive semidefinite. Estimation of breeding values with the RRM required greater accuracy in the solutions of the MME than with the STM. This finding favored PCG in particular, because an additional increase in the accuracy of the solutions was computationally less costly than for GSSJ, owing to the high rate of convergence during the later stages of iteration.

REFERENCES

1 Berger, P. J., G. R. Luecke, and A. Hoekstra. Iterative algorithms for solving mixed model equations. J. Dairy Sci. 72:
2 Carabaño, M. J., S. Najari, and J. J. Jurado. Solving iteratively the M.M.E. Genetic and numerical criteria. Pages in Book Abstr. 43rd Annu. Mtg. Eur. Assoc. Anim. Prod., Madrid, Spain. Wageningen Pers, Wageningen, The Netherlands.
3 Ducrocq, V. Solving animal model equations through an approximate incomplete Cholesky decomposition. Genet. Sel. Evol. 24:
4 Hageman, L. A., and D. M. Young. Applied Iterative Methods. Academic Press, Inc., San Diego, CA.
5 Hestenes, M. R., and E. L. Stiefel. Methods of conjugate gradients for solving linear systems. Natl. Bur. Std. J. Res. 49:
6 Jamrozik, J., L. R. Schaeffer, and J.C.M. Dekkers. Genetic evaluation of dairy cattle using test day yields and random regression model. J. Dairy Sci. 80:
7 Jamrozik, J., L. R. Schaeffer, Z. Liu, and G. Jansen. Multiple trait random regression test day model for production traits. Pages in Bull. No. 16, INTERBULL Annu. Mtg., Vienna, Austria. Int. Bull Eval. Serv., Uppsala, Sweden.
8 Jensen, J., and P. Madsen DMU: A package for the analy- sis of multivariate mixed models. Proc. 5th World Congr. Genet. Appl. Livest. Prod., Guelph, ON, Canada XXII: Kirkpatrick, M., W. G. Hill, and R. Thompson Estimating the covariance structure of traits during growth and aging, illustrated with lactation in dairy cattle. Genet. Res. Camb. 64: Lidauer, M., E. A. Mäntysaari, I. Strandén, A. Kettunen, and J. Pösö DMUIOD: A multitrait BLUP program suitable for random regression testday models. Proc. 6th World Congr. Genet. Appl. Livest. Prod., Armidale, NSW, Australia XXVII: Misztal, I., and D. Gianola Indirect solution of mixed model equations. J. Dairy Sci. 70: Misztal, I., D. Gianola, and L. R. Schaeffer Extrapolation and convergence criteria with Jacobi and Gauss-Seidel iteration in animal models. J. Dairy Sci. 70: Misztal, I., L. Varona, M. Culbertson, N. Gengler, J. K. Bertrand, J. Mabry, T. J. Lawlor, and C. P. Van Tassell Studies of the values of incorporating effect of dominance in genetic evaluations of dairy cattle, beef cattle, and swine. Proc. 6th World Congr. Genet. Appl. Livest. Prod., Armidale, NSW, Australia XXV: Pösö, J., E. A. Mäntysaari, M. Lidauer, I. Strandén, and A. Kettunen Empirical bias in the pedigree indices of heifers evaluation using test day models. Proc. 6th World Congr. Genet. Appl. Livest. Prod., Armidale, NSW, Australia XXIII: Reents, R., J.C.M. Dekkers, and L. R. Schaeffer Genetic evaluation for somatic cell score with a test day model for multiple lactations. J. Dairy Sci. 78: Schaeffer, L. R., and B. W. Kennedy Computing strategies for solving mixed model equations. J. Dairy Sci. 69: Shewchuk, J. R An introduction to the conjugate gradient method without the agonizing pain. School of Computer Sci., Carnegie Mellon Univ. Pittsburgh, PA. 18 Strandén, I., and M. Lidauer Solving large mixed linear models using preconditioned conjugate gradient iteration. J. Dairy Sci. 8: Strand en, I., and E. A. 
Mäntysaari Animal model evaluation in Finland: experience with two algorithms. J. Dairy Sci. 75: Van Vleck, L. D., and D. J. Dwyer Successive overrelaxation, block iteration, and method of conjugate gradients for solving equations for multiple trait evaluation of sires. J. Dairy Sci. 68: Wiggans, G. R., and M. E. Goddard A computationally feasible test day model for genetic evaluation of yield traits in the United States. J. Dairy Sci. 80: Wiggans, G. R., I. Misztal, and L. D. Van Vleck Animal model evaluation of Ayrshire milk yield with all lactations, herdsire interaction, and groups based on unknown parents. J. Dairy Sci. 71:
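The conclusions above rest on the standard PCG recurrence, which by construction needs no relaxation factor: the step length and the update of the search direction are computed from inner products at each round. As a minimal illustration only, the sketch below applies a diagonal (Jacobi) preconditioner to a toy symmetric positive definite system; it is not the paper's iteration-on-data, block-diagonal-preconditioned implementation, and the function name and toy matrix are hypothetical.

```python
import numpy as np

def pcg(A, b, M_inv_diag, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient with a diagonal (Jacobi)
    preconditioner. Illustrative sketch: the paper's implementation
    keeps four work vectors in memory and reads the data each round;
    here clarity is favored over memory economy."""
    x = np.zeros_like(b)
    r = b - A @ x                  # residual
    z = M_inv_diag * r             # preconditioned residual
    d = z.copy()                   # search direction
    rz = r @ z
    for it in range(max_iter):
        Ad = A @ d
        alpha = rz / (d @ Ad)      # step length: computed, not a tuned relaxation factor
        x += alpha * d
        r -= alpha * Ad
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, it + 1
        z = M_inv_diag * r
        rz_new = r @ z
        d = z + (rz_new / rz) * d  # new search direction, conjugate to the previous ones
        rz = rz_new
    return x, max_iter

# Toy SPD system standing in for the mixed model equations.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x, n_iter = pcg(A, b, 1.0 / np.diag(A))
```

In exact arithmetic CG terminates in at most n steps for an n-by-n system, which mirrors the paper's observation that most of the work of PCG pays off in the later, high-accuracy stages of iteration.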

Impact of Using Reduced Rank Random Regression Test-Day Model on Genetic Evaluation

Impact of Using Reduced Rank Random Regression Test-Day Model on Genetic Evaluation Impact of Using Reduced Rank Random Regression Test-Day on Genetic Evaluation H. Leclerc 1, I. Nagy 2 and V. Ducrocq 2 1 Institut de l Elevage, Département Génétique, Bât 211, 78 352 Jouy-en-Josas, France

More information

Genetic Parameter Estimation for Milk Yield over Multiple Parities and Various Lengths of Lactation in Danish Jerseys by Random Regression Models

Genetic Parameter Estimation for Milk Yield over Multiple Parities and Various Lengths of Lactation in Danish Jerseys by Random Regression Models J. Dairy Sci. 85:1596 1606 American Dairy Science Association, 2002. Genetic Parameter Estimation for Milk Yield over Multiple Parities and Various Lengths of Lactation in Danish Jerseys by Random Regression

More information

Prediction of Future Milk Yield with Random Regression Model Using Test-day Records in Holstein Cows

Prediction of Future Milk Yield with Random Regression Model Using Test-day Records in Holstein Cows 9 ` Asian-Aust. J. Anim. Sci. Vol. 19, No. 7 : 9-921 July 26 www.ajas.info Prediction of Future Milk Yield with Random Regression Model Using Test-day Records in Holstein Cows Byoungho Park and Deukhwan

More information

Genetic parameters for female fertility in Nordic dairy cattle

Genetic parameters for female fertility in Nordic dairy cattle Genetic parameters for female fertility in Nordic dairy cattle K.Muuttoranta 1, A-M. Tyrisevä 1, E.A. Mäntysaari 1, J.Pösö 2, G.P. Aamand 3, J-Å. Eriksson 4, U.S. Nielsen 5, and M. Lidauer 1 1 Natural

More information

Genetic Parameters for Stillbirth in the Netherlands

Genetic Parameters for Stillbirth in the Netherlands Genetic Parameters for Stillbirth in the Netherlands Arnold Harbers, Linda Segeren and Gerben de Jong CR Delta, P.O. Box 454, 68 AL Arnhem, The Netherlands Harbers.A@CR-Delta.nl 1. Introduction Stillbirth

More information

Simulation Study on Heterogeneous Variance Adjustment for Observations with Different Measurement Error Variance

Simulation Study on Heterogeneous Variance Adjustment for Observations with Different Measurement Error Variance Simulation Study on Heterogeneous Variance Adjustment for Observations with Different Measurement Error Variance Pitkänen, T. 1, Mäntysaari, E. A. 1, Nielsen, U. S., Aamand, G. P 3., Madsen 4, P. and Lidauer,

More information

Procedure 2 of Section 2 of ICAR Guidelines Computing of Accumulated Lactation Yield. Computing Lactation Yield

Procedure 2 of Section 2 of ICAR Guidelines Computing of Accumulated Lactation Yield. Computing Lactation Yield of ICAR Guidelines Computing of Accumulated Lactation Yield Table of Contents 1 The Test Interval Method (TIM) (Sargent, 1968)... 4 2 Interpolation using Standard Lactation Curves (ISLC) (Wilmink, 1987)...

More information

Maternal Genetic Models

Maternal Genetic Models Maternal Genetic Models In mammalian species of livestock such as beef cattle sheep or swine the female provides an environment for its offspring to survive and grow in terms of protection and nourishment

More information

Evaluation of Autoregressive Covariance Structures for Test-Day Records of Holstein Cows: Estimates of Parameters

Evaluation of Autoregressive Covariance Structures for Test-Day Records of Holstein Cows: Estimates of Parameters J. Dairy Sci. 88:2632 2642 American Dairy Science Association, 2005. Evaluation of Autoregressive Covariance Structures for Test-Day Records of Holstein Cows: Estimates of Parameters R. M. Sawalha, 1 J.

More information

RESTRICTED M A X I M U M LIKELIHOOD TO E S T I M A T E GENETIC P A R A M E T E R S - IN PRACTICE

RESTRICTED M A X I M U M LIKELIHOOD TO E S T I M A T E GENETIC P A R A M E T E R S - IN PRACTICE RESTRICTED M A X I M U M LIKELIHOOD TO E S T I M A T E GENETIC P A R A M E T E R S - IN PRACTICE K. M e y e r Institute of Animal Genetics, Edinburgh University, W e s t M a i n s Road, Edinburgh EH9 3JN,

More information

Genetic parameters for various random regression models to describe total sperm cells per ejaculate over the reproductive lifetime of boars

Genetic parameters for various random regression models to describe total sperm cells per ejaculate over the reproductive lifetime of boars Published December 8, 2014 Genetic parameters for various random regression models to describe total sperm cells per ejaculate over the reproductive lifetime of boars S. H. Oh,* M. T. See,* 1 T. E. Long,

More information

Animal Models. Sheep are scanned at maturity by ultrasound(us) to determine the amount of fat surrounding the muscle. A model (equation) might be

Animal Models. Sheep are scanned at maturity by ultrasound(us) to determine the amount of fat surrounding the muscle. A model (equation) might be Animal Models 1 Introduction An animal model is one in which there are one or more observations per animal, and all factors affecting those observations are described including an animal additive genetic

More information

Contrasting Models for Lactation Curve Analysis

Contrasting Models for Lactation Curve Analysis J. Dairy Sci. 85:968 975 American Dairy Science Association, 2002. Contrasting Models for Lactation Curve Analysis F. Jaffrezic,*, I. M. S. White,* R. Thompson, and P. M. Visscher* *Institute of Cell,

More information

Repeated Records Animal Model

Repeated Records Animal Model Repeated Records Animal Model 1 Introduction Animals are observed more than once for some traits, such as Fleece weight of sheep in different years. Calf records of a beef cow over time. Test day records

More information

Animal Model. 2. The association of alleles from the two parents is assumed to be at random.

Animal Model. 2. The association of alleles from the two parents is assumed to be at random. Animal Model 1 Introduction In animal genetics, measurements are taken on individual animals, and thus, the model of analysis should include the animal additive genetic effect. The remaining items in the

More information

Effects of inbreeding on milk production, fertility, and somatic cell count in Norwegian Red

Effects of inbreeding on milk production, fertility, and somatic cell count in Norwegian Red NORWEGIAN UNIVERSITY OF LIFE SCIENCES Effects of inbreeding on milk production, fertility, and somatic cell count in Norwegian Red K. Hov Martinsen 1, E. Sehested 2 and B. Heringstad* 1,2 1, Norwegian

More information

Multiple-Trait Across-Country Evaluations Using Singular (Co)Variance Matrix and Random Regression Model

Multiple-Trait Across-Country Evaluations Using Singular (Co)Variance Matrix and Random Regression Model Multiple-rait Across-Country Evaluations Using Singular (Co)Variance Matrix and Random Regression Model Esa A. Mäntysaari M Agrifood Research Finland, Animal Production, SF 31600 Jokioinen 1. Introduction

More information

Lecture 9 Multi-Trait Models, Binary and Count Traits

Lecture 9 Multi-Trait Models, Binary and Count Traits Lecture 9 Multi-Trait Models, Binary and Count Traits Guilherme J. M. Rosa University of Wisconsin-Madison Mixed Models in Quantitative Genetics SISG, Seattle 18 0 September 018 OUTLINE Multiple-trait

More information

Best unbiased linear Prediction: Sire and Animal models

Best unbiased linear Prediction: Sire and Animal models Best unbiased linear Prediction: Sire and Animal models Raphael Mrode Training in quantitative genetics and genomics 3 th May to th June 26 ILRI, Nairobi Partner Logo Partner Logo BLUP The MME of provided

More information

Longitudinal random effects models for genetic analysis of binary data with application to mastitis in dairy cattle

Longitudinal random effects models for genetic analysis of binary data with application to mastitis in dairy cattle Genet. Sel. Evol. 35 (2003) 457 468 457 INRA, EDP Sciences, 2003 DOI: 10.1051/gse:2003034 Original article Longitudinal random effects models for genetic analysis of binary data with application to mastitis

More information

Should genetic groups be fitted in BLUP evaluation? Practical answer for the French AI beef sire evaluation

Should genetic groups be fitted in BLUP evaluation? Practical answer for the French AI beef sire evaluation Genet. Sel. Evol. 36 (2004) 325 345 325 c INRA, EDP Sciences, 2004 DOI: 10.1051/gse:2004004 Original article Should genetic groups be fitted in BLUP evaluation? Practical answer for the French AI beef

More information

Alternative implementations of Monte Carlo EM algorithms for likelihood inferences

Alternative implementations of Monte Carlo EM algorithms for likelihood inferences Genet. Sel. Evol. 33 001) 443 45 443 INRA, EDP Sciences, 001 Alternative implementations of Monte Carlo EM algorithms for likelihood inferences Louis Alberto GARCÍA-CORTÉS a, Daniel SORENSEN b, Note a

More information

Variance component estimates applying random regression models for test-day milk yield in Caracu heifers (Bos taurus Artiodactyla, Bovidae)

Variance component estimates applying random regression models for test-day milk yield in Caracu heifers (Bos taurus Artiodactyla, Bovidae) Research Article Genetics and Molecular Biology, 31, 3, 665-673 (2008) Copyright 2008, Sociedade Brasileira de Genética. Printed in Brazil www.sbg.org.br Variance component estimates applying random regression

More information

Genetic relationships and trait comparisons between and within lines of local dual purpose cattle

Genetic relationships and trait comparisons between and within lines of local dual purpose cattle 67 th Annual meeting of the European Association for Animal Production Belfast, 2016 Genetic relationships and trait comparisons between and within lines of local dual purpose cattle M. Jaeger, K. Brügemann,

More information

Single and multitrait estimates of breeding values for survival using sire and animal models

Single and multitrait estimates of breeding values for survival using sire and animal models Animal Science 00, 75: 15-4 1357-798/0/11300015$0 00 00 British Society of Animal Science Single and multitrait estimates of breeding values for survival using sire and animal models T. H. E. Meuwissen

More information

Estimates of genetic parameters for total milk yield over multiple ages in Brazilian Murrah buffaloes using different models

Estimates of genetic parameters for total milk yield over multiple ages in Brazilian Murrah buffaloes using different models Estimates of genetic parameters for total milk yield over multiple ages in Brazilian Murrah buffaloes using different models R.C. Sesana 1, F. Baldi 1, R.R.A. Borquis 1, A.B. Bignardi 1, N.A. Hurtado-Lugo

More information

Genotyping strategy and reference population

Genotyping strategy and reference population GS cattle workshop Genotyping strategy and reference population Effect of size of reference group (Esa Mäntysaari, MTT) Effect of adding females to the reference population (Minna Koivula, MTT) Value of

More information

Summary INTRODUCTION. Running head : AI-REML FOR EQUAL DESIGN MATRICES. K. Meyer

Summary INTRODUCTION. Running head : AI-REML FOR EQUAL DESIGN MATRICES. K. Meyer Running head : AI-REML FOR EQUAL DESIGN MATRICES An average information Restricted Maximum Likelihood algorithm for estimating reduced rank genetic covariance matrices or covariance functions for animal

More information

MIXED MODELS THE GENERAL MIXED MODEL

MIXED MODELS THE GENERAL MIXED MODEL MIXED MODELS This chapter introduces best linear unbiased prediction (BLUP), a general method for predicting random effects, while Chapter 27 is concerned with the estimation of variances by restricted

More information

Comparison of computing properties of derivative and derivative-free algorithms in variance component estimation by REML.

Comparison of computing properties of derivative and derivative-free algorithms in variance component estimation by REML. April, 000 Comparison of computing properties of derivative and derivative-free algorithms in variance component estimation by REML By Ignacy Misztal University of Illinois, Urbana, Illinois 0, USA 0 0

More information

NONLINEAR VS. LINEAR REGRESSION MODELS IN LACTATION CURVE PREDICTION

NONLINEAR VS. LINEAR REGRESSION MODELS IN LACTATION CURVE PREDICTION 794 Bulgarian Journal of Agricultural Science, 16 (No 6) 2010, 794-800 Agricultural Academy NONLINEAR VS. LINEAR REGRESSION MODELS IN LACTATION CURVE PREDICTION V. GANTNER 1, S. JOVANOVAC 1, N. RAGUZ 1,

More information

G-BLUP without inverting the genomic relationship matrix

G-BLUP without inverting the genomic relationship matrix G-BLUP without inverting the genomic relationship matrix Per Madsen 1 and Jørgen Ødegård 2 1 Center for Quantitative Genetics and Genomics Department of Molecular Biology and Genetics, Aarhus University

More information

Lecture 32: Infinite-dimensional/Functionvalued. Functions and Random Regressions. Bruce Walsh lecture notes Synbreed course version 11 July 2013

Lecture 32: Infinite-dimensional/Functionvalued. Functions and Random Regressions. Bruce Walsh lecture notes Synbreed course version 11 July 2013 Lecture 32: Infinite-dimensional/Functionvalued Traits: Covariance Functions and Random Regressions Bruce Walsh lecture notes Synbreed course version 11 July 2013 1 Longitudinal traits Many classic quantitative

More information

Linear Models for the Prediction of Animal Breeding Values

Linear Models for the Prediction of Animal Breeding Values Linear Models for the Prediction of Animal Breeding Values R.A. Mrode, PhD Animal Data Centre Fox Talbot House Greenways Business Park Bellinger Close Chippenham Wilts, UK CAB INTERNATIONAL Preface ix

More information

BLUP without (inverse) relationship matrix

BLUP without (inverse) relationship matrix Proceedings of the World Congress on Genetics Applied to Livestock Production, 11, 5 BLUP without (inverse relationship matrix E. Groeneveld (1 and A. Neumaier ( (1 Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut,

More information

Multiple Trait Evaluation of Bulls for Calving Ease

Multiple Trait Evaluation of Bulls for Calving Ease University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Faculty Papers and Publications in Animal Science Animal Science Department February 1984 Multiple Trait Evaluation of Bulls

More information

3. Properties of the relationship matrix

3. Properties of the relationship matrix 3. Properties of the relationship matrix 3.1 Partitioning of the relationship matrix The additive relationship matrix, A, can be written as the product of a lower triangular matrix, T, a diagonal matrix,

More information

Chapter 19. Analysis of longitudinal data -Random Regression Analysis

Chapter 19. Analysis of longitudinal data -Random Regression Analysis Chapter 19 Analysis of longitudinal data -Random Regression Analysis Julius van der Werf 1 Introduction In univariate analysis the basic assumption is that a single measurement arises from a single unit

More information

Covariance functions and random regression models for cow weight in beef cattle

Covariance functions and random regression models for cow weight in beef cattle University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Faculty Papers and Publications in Animal Science Animal Science Department January 2004 Covariance functions and random

More information

VARIANCE COMPONENT ESTIMATION & BEST LINEAR UNBIASED PREDICTION (BLUP)

VARIANCE COMPONENT ESTIMATION & BEST LINEAR UNBIASED PREDICTION (BLUP) VARIANCE COMPONENT ESTIMATION & BEST LINEAR UNBIASED PREDICTION (BLUP) V.K. Bhatia I.A.S.R.I., Library Avenue, New Delhi- 11 0012 vkbhatia@iasri.res.in Introduction Variance components are commonly used

More information

The Conjugate Gradient Method

The Conjugate Gradient Method The Conjugate Gradient Method Classical Iterations We have a problem, We assume that the matrix comes from a discretization of a PDE. The best and most popular model problem is, The matrix will be as large

More information

Contemporary Groups for Genetic Evaluations

Contemporary Groups for Genetic Evaluations University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Faculty Papers and Publications in Animal Science Animal Science Department January 1987 Contemporary Groups for Genetic

More information

RANDOM REGRESSION IN ANIMAL BREEDING

RANDOM REGRESSION IN ANIMAL BREEDING RANDOM REGRESSION IN ANIMAL BREEDING Course Notes Jaboticabal, SP Brazil November 2001 Julius van der Werf University of New England Armidale, Australia 1 Introduction...2 2 Exploring correlation patterns

More information

Quantitative characters - exercises

Quantitative characters - exercises Quantitative characters - exercises 1. a) Calculate the genetic covariance between half sibs, expressed in the ij notation (Cockerham's notation), when up to loci are considered. b) Calculate the genetic

More information

Use of sparse matrix absorption in animal breeding

Use of sparse matrix absorption in animal breeding Original article Use of sparse matrix absorption in animal breeding B. Tier S.P. Smith University of New England, Anirreal Genetics and Breeding Unit, Ar!nidale, NSW 2351, Australia (received 1 March 1988;

More information

A relationship matrix including full pedigree and genomic information

A relationship matrix including full pedigree and genomic information J Dairy Sci 9 :4656 4663 doi: 103168/jds009-061 American Dairy Science Association, 009 A relationship matrix including full pedigree and genomic information A Legarra,* 1 I Aguilar, and I Misztal * INRA,

More information

Heterogeneity of variances by herd production level and its effect on dairy cow and sire evaluation

Heterogeneity of variances by herd production level and its effect on dairy cow and sire evaluation Retrospective Theses and Dissertations 1989 Heterogeneity of variances by herd production level and its effect on dairy cow and sire evaluation Keith George Boldman Iowa State University Follow this and

More information

6.4 Krylov Subspaces and Conjugate Gradients

6.4 Krylov Subspaces and Conjugate Gradients 6.4 Krylov Subspaces and Conjugate Gradients Our original equation is Ax = b. The preconditioned equation is P Ax = P b. When we write P, we never intend that an inverse will be explicitly computed. P

More information

Reduced Animal Models

Reduced Animal Models Reduced Animal Models 1 Introduction In situations where many offspring can be generated from one mating as in fish poultry or swine or where only a few animals are retained for breeding the genetic evaluation

More information

Heritability, Reliability of Genetic Evaluations and Response to Selection in Proportional Hazard Models

Heritability, Reliability of Genetic Evaluations and Response to Selection in Proportional Hazard Models J. Dairy Sci. 85:1563 1577 American Dairy Science Association, 00. Heritability, Reliability of Genetic Evaluations and Response to Selection in Proportional Hazard Models M. H. Yazdi,* P. M. Visscher,*

More information

Reaction Norms for the Study of Genotype by Environment Interaction in Animal Breeding Rebecka Kolmodin

Reaction Norms for the Study of Genotype by Environment Interaction in Animal Breeding Rebecka Kolmodin Reaction Norms for the Study of Genotype by Environment Interaction in Animal Breeding Rebecka Kolmodin Department of Animal Breeding and Genetics Uppsala Doctoral thesis Swedish University of Agricultural

More information

An indirect approach to the extensive calculation of relationship coefficients

An indirect approach to the extensive calculation of relationship coefficients Genet. Sel. Evol. 34 (2002) 409 421 409 INRA, EDP Sciences, 2002 DOI: 10.1051/gse:2002015 Original article An indirect approach to the extensive calculation of relationship coefficients Jean-Jacques COLLEAU

More information

ASPECTS OF SELECTION FOR PERFORMANCE IN SEVERAL ENVIRONMENTS WITH HETEROGENEOUS VARIANCES

ASPECTS OF SELECTION FOR PERFORMANCE IN SEVERAL ENVIRONMENTS WITH HETEROGENEOUS VARIANCES University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Faculty Papers and Publications in Animal Science Animal Science Department 2-3-1987 ASPECTS OF SELECTION FOR PERFORMANCE

More information

Extension of single-step ssgblup to many genotyped individuals. Ignacy Misztal University of Georgia

Extension of single-step ssgblup to many genotyped individuals. Ignacy Misztal University of Georgia Extension of single-step ssgblup to many genotyped individuals Ignacy Misztal University of Georgia Genomic selection and single-step H -1 =A -1 + 0 0 0 G -1-1 A 22 Aguilar et al., 2010 Christensen and

More information

Conjugate Gradients: Idea

Conjugate Gradients: Idea Overview Steepest Descent often takes steps in the same direction as earlier steps Wouldn t it be better every time we take a step to get it exactly right the first time? Again, in general we choose a

More information

Iterative methods for Linear System

Iterative methods for Linear System Iterative methods for Linear System JASS 2009 Student: Rishi Patil Advisor: Prof. Thomas Huckle Outline Basics: Matrices and their properties Eigenvalues, Condition Number Iterative Methods Direct and

More information

On a multivariate implementation of the Gibbs sampler

On a multivariate implementation of the Gibbs sampler Note On a multivariate implementation of the Gibbs sampler LA García-Cortés, D Sorensen* National Institute of Animal Science, Research Center Foulum, PB 39, DK-8830 Tjele, Denmark (Received 2 August 1995;

More information

Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

More information

Likelihood Methods. 1 Likelihood Functions. The multivariate normal distribution likelihood function is

Likelihood Methods. 1 Likelihood Functions. The multivariate normal distribution likelihood function is Likelihood Methods 1 Likelihood Functions The multivariate normal distribution likelihood function is The log of the likelihood, say L 1 is Ly = π.5n V.5 exp.5y Xb V 1 y Xb. L 1 = 0.5[N lnπ + ln V +y Xb

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) Lecture 19: Computing the SVD; Sparse Linear Systems Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical

More information

Numerical Methods I Non-Square and Sparse Linear Systems

Numerical Methods I Non-Square and Sparse Linear Systems Numerical Methods I Non-Square and Sparse Linear Systems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 September 25th, 2014 A. Donev (Courant

More information

Modification of negative eigenvalues to create positive definite matrices and approximation of standard errors of correlation estimates

Modification of negative eigenvalues to create positive definite matrices and approximation of standard errors of correlation estimates Modification of negative eigenvalues to create positive definite matrices and approximation of standard errors of correlation estimates L. R. Schaeffer Centre for Genetic Improvement of Livestock Department

More information

Crosses. Computation APY Sherman-Woodbury «hybrid» model. Unknown parent groups Need to modify H to include them (Misztal et al., 2013) Metafounders

Crosses. Computation APY Sherman-Woodbury «hybrid» model. Unknown parent groups Need to modify H to include them (Misztal et al., 2013) Metafounders Details in ssgblup Details in SSGBLUP Storage Inbreeding G is not invertible («blending») G might not explain all genetic variance («blending») Compatibility of G and A22 Assumption p(u 2 )=N(0,G) If there

More information

GENERALIZED LINEAR MIXED MODELS: AN APPLICATION

GENERALIZED LINEAR MIXED MODELS: AN APPLICATION Libraries Conference on Applied Statistics in Agriculture 1994-6th Annual Conference Proceedings GENERALIZED LINEAR MIXED MODELS: AN APPLICATION Stephen D. Kachman Walter W. Stroup Follow this and additional

More information

The concept of breeding value. Gene251/351 Lecture 5

The concept of breeding value. Gene251/351 Lecture 5 The concept of breeding value Gene251/351 Lecture 5 Key terms Estimated breeding value (EB) Heritability Contemporary groups Reading: No prescribed reading from Simm s book. Revision: Quantitative traits

More information

MULTIBREED ANIMAL EVALUATION AND ITS APPLICATION TO THE THAI ENVIRONMENT. Numbers of Sires. Multibreed Population. Numbers of Calves.

MULTIBREED ANIMAL EVALUATION AND ITS APPLICATION TO THE THAI ENVIRONMENT. Numbers of Sires. Multibreed Population. Numbers of Calves. MULTIBREED ANIMAL EVALUATION AND ITS APPLICATION TO THE THAI ENVIRONMENT M. A. Elzo University of Florida Multibreed Populations Genetic and Environmental Effects Modeling Strategies Multibreed Model Covariance

More information

INTRODUCTION TO ANIMAL BREEDING. Lecture Nr 3. The genetic evaluation (for a single trait) The Estimated Breeding Values (EBV) The accuracy of EBVs

INTRODUCTION TO ANIMAL BREEDING. Lecture Nr 3. The genetic evaluation (for a single trait) The Estimated Breeding Values (EBV) The accuracy of EBVs INTRODUCTION TO ANIMAL BREEDING Lecture Nr 3 The genetic evaluation (for a single trait) The Estimated Breeding Values (EBV) The accuracy of EBVs Etienne Verrier INA Paris-Grignon, Animal Sciences Department

More information

INTRODUCTION TO ANIMAL BREEDING. Lecture Nr 4. The efficiency of selection The selection programmes

INTRODUCTION TO ANIMAL BREEDING. Lecture Nr 4. The efficiency of selection The selection programmes INTRODUCTION TO ANIMAL BREEDING Lecture Nr 4 The efficiency of selection The selection programmes Etienne Verrier INA Paris-Grignon, Animal Sciences Department Verrier@inapg.fr The genetic gain and its

More information

Estimation of Parameters in Random. Effect Models with Incidence Matrix. Uncertainty

Estimation of Parameters in Random. Effect Models with Incidence Matrix. Uncertainty Estimation of Parameters in Random Effect Models with Incidence Matrix Uncertainty Xia Shen 1,2 and Lars Rönnegård 2,3 1 The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, Sweden; 2 School

More information

5. Best Linear Unbiased Prediction

5. Best Linear Unbiased Prediction 5. Best Linear Unbiased Prediction Julius van der Werf Lecture 1: Best linear unbiased prediction Learning objectives On completion of Lecture 1 you should be able to: Understand the principle of mixed

More information

Direct and Incomplete Cholesky Factorizations with Static Supernodes

Direct and Incomplete Cholesky Factorizations with Static Supernodes Direct and Incomplete Cholesky Factorizations with Static Supernodes AMSC 661 Term Project Report Yuancheng Luo 2010-05-14 Introduction Incomplete factorizations of sparse symmetric positive definite (SSPD)

More information

Distinctive aspects of non-parametric fitting

Distinctive aspects of non-parametric fitting 5. Introduction to nonparametric curve fitting: Loess, kernel regression, reproducing kernel methods, neural networks Distinctive aspects of non-parametric fitting Objectives: investigate patterns free

More information

Lecture Note 7: Iterative methods for solving linear systems. Xiaoqun Zhang Shanghai Jiao Tong University

Lecture Note 7: Iterative methods for solving linear systems. Xiaoqun Zhang Shanghai Jiao Tong University Lecture Note 7: Iterative methods for solving linear systems Xiaoqun Zhang Shanghai Jiao Tong University Last updated: December 24, 2014 1.1 Review on linear algebra Norms of vectors and matrices vector

More information

Best linear unbiased prediction when error vector is correlated with other random vectors in the model

Best linear unbiased prediction when error vector is correlated with other random vectors in the model Best linear unbiased prediction when error vector is correlated with other random vectors in the model L.R. Schaeffer, C.R. Henderson To cite this version: L.R. Schaeffer, C.R. Henderson. Best linear unbiased

More information

Large scale genomic prediction using singular value decomposition of the genotype matrix

Large scale genomic prediction using singular value decomposition of the genotype matrix https://doi.org/0.86/s27-08-0373-2 Genetics Selection Evolution RESEARCH ARTICLE Open Access Large scale genomic prediction using singular value decomposition of the genotype matrix Jørgen Ødegård *, Ulf

More information

6. Iterative Methods for Linear Systems. The stepwise approach to the solution...

6. Iterative Methods for Linear Systems. The stepwise approach to the solution... 6 Iterative Methods for Linear Systems The stepwise approach to the solution Miriam Mehl: 6 Iterative Methods for Linear Systems The stepwise approach to the solution, January 18, 2013 1 61 Large Sparse

More information

Univariate and multivariate parameter estimates for milk production

Univariate and multivariate parameter estimates for milk production . Roslin Original article Univariate and multivariate parameter estimates for milk production traits using an animal model. II. Efficiency of selection when using simplified covariance structures PM Visscher

More information

Iterative techniques in matrix algebra

Iterative techniques in matrix algebra Iterative techniques in matrix algebra Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan September 12, 2015 Outline 1 Norms of vectors and matrices 2 Eigenvalues and

More information

Jae Heon Yun and Yu Du Han

Jae Heon Yun and Yu Du Han Bull. Korean Math. Soc. 39 (2002), No. 3, pp. 495 509 MODIFIED INCOMPLETE CHOLESKY FACTORIZATION PRECONDITIONERS FOR A SYMMETRIC POSITIVE DEFINITE MATRIX Jae Heon Yun and Yu Du Han Abstract. We propose

More information

Iterative methods for Linear System of Equations. Joint Advanced Student School (JASS-2009)

Iterative methods for Linear System of Equations. Joint Advanced Student School (JASS-2009) Iterative methods for Linear System of Equations Joint Advanced Student School (JASS-2009) Course #2: Numerical Simulation - from Models to Software Introduction In numerical simulation, Partial Differential

More information

Mixed-Model Estimation of genetic variances. Bruce Walsh lecture notes Uppsala EQG 2012 course version 28 Jan 2012

Mixed-Model Estimation of genetic variances. Bruce Walsh lecture notes Uppsala EQG 2012 course version 28 Jan 2012 Mixed-Model Estimation of genetic variances Bruce Walsh lecture notes Uppsala EQG 01 course version 8 Jan 01 Estimation of Var(A) and Breeding Values in General Pedigrees The above designs (ANOVA, P-O

More information

Lecture 18: Classical Iterative Methods. MIT 18.335J / 6.337J, Introduction to Numerical Methods, Per-Olof Persson, November 14, 2006. Iterative methods for linear systems Ax = b.

Comparative efficiency of lactation curve models using Irish experimental dairy farms data. Fan Zhang and Michael D. Murphy, Department of Process, Energy and Transport, Cork Institute of Technology, Ireland.

Genotype by environment interaction and genetic correlations among parities for somatic cell count and milk yield. Banos, G. and Shook, G.E., 1990. Edinburgh Research Explorer.

Approximation of sampling variances and confidence intervals for maximum likelihood estimates of variance components. K. Meyer and W.G. Hill.

Chapter 12: REML and ML estimation. C.R. Henderson, 1984, Guelph. The restricted maximum likelihood estimator (REML) of Patterson and Thompson (1971) can be obtained by iterating on MIVQUE.

A mathematical model for the lactation curve of the rabbit does. Casado C., Piquer O., Cervera C., Pascual J.J. Unidad de Alimentación Animal, Departamento de Ciencia Animal.

Math 411 preliminaries: a list of preliminary vocabulary and concepts, including Newton's method, Taylor series expansion (for single and multiple variables), eigenvalues, and eigenvectors.

Selection on selected records. B. Goffinet, I.N.R.A., Laboratoire de Biométrie, Centre de Recherches de Toulouse, chemin de Borde-Rouge, F-31320 Castanet-Tolosan.

Response to mass selection when the genotype by environment interaction is modelled as a linear reaction norm. Genet. Sel. Evol. 36 (2004) 435-454. INRA, EDP Sciences. DOI: 10.1051/gse:2004010.

Numerical methods. King Saud University. Introduces numerical methods, error analysis, and sources of errors.

Prediction of breeding values for unmeasured traits from measured traits. Kristin L. Barkhouse. Annual Conference on Applied Statistics in Agriculture, 6th Annual Conference Proceedings, 1994.

Linear algebraic equations and LU decomposition. Numerical Methods, Fall 2011, Lecture 8. Prof. Jinbo Bi, CSE, UConn. Gaussian elimination works well for solving linear systems of the form AX = B.

Estimating breeding values. GENE422/522, Lecture 2. Covers the principle of estimation, properties, accuracy, prediction error variance, and selection response when selecting on EBV.

An introduction to generalized linear mixed models. Stephen D. Kachman, Department of Biometry, University of Nebraska-Lincoln. Linear mixed models provide a powerful means of predicting breeding values.

Linear Models for the Prediction of Animal Breeding Values, Second Edition. R.A. Mrode, PhD, Scottish Agricultural College.

Conjugate gradient method (Hestenes and Stiefel, 1952): a descent method that uses conjugate search directions. For an N x N symmetric positive definite matrix A, it solves the system in at most N steps in exact arithmetic; in finite-precision arithmetic stopping is not guaranteed, but the method often converges in far fewer than N steps.

Genotype by environment interaction for 450-day weight of Nelore cattle analyzed by reaction norm models. Research article, Genetics and Molecular Biology, 2009. Sociedade Brasileira de Genética, Brazil.

Introduction, Chapter One. The aim of this book is to describe and explain the mathematical relationships between matrices, moments, orthogonal polynomials, quadrature rules, and the Lanczos algorithm.

Stochastic analogues to deterministic optimizers. Vivak Patel, presented by Mihai Anitescu. ISMP 2018, Bordeaux, France, July 6, 2018.
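Since the conjugate gradient entry above touches on the method at the heart of this paper, a minimal sketch may help fix ideas. This is not the authors' iteration-on-data implementation: it uses a simple diagonal (Jacobi) preconditioner on a dense NumPy matrix, whereas the paper applies a block-diagonal preconditioner and reads the data at each round; the function name `pcg` and its parameters are illustrative only.

```python
import numpy as np

def pcg(A, b, M_inv_diag, tol=1e-10, max_iter=None):
    """Preconditioned conjugate gradient for a symmetric positive definite A.

    M_inv_diag holds the inverse of a diagonal preconditioner; the paper
    itself uses diagonal blocks of the coefficient matrix instead.
    """
    n = len(b)
    if max_iter is None:
        max_iter = n              # exact arithmetic needs at most n steps
    x = np.zeros(n)
    r = b - A @ x                 # residual
    z = M_inv_diag * r            # preconditioned residual
    p = z.copy()                  # first search direction
    rz = r @ z
    for k in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)     # step length along p
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, k + 1
        z = M_inv_diag * r
        rz_new = r @ z
        beta = rz_new / rz        # keeps new direction A-conjugate to p
        p = z + beta * p
        rz = rz_new
    return x, max_iter
```

In the paper's setting, the matrix-vector product `A @ p` is never formed from an explicit A; it is accumulated while reading the data, which is what keeps the memory footprint at a handful of vectors.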