NATURAL SELECTION FOR WITHIN-GENERATION VARIANCE IN OFFSPRING NUMBER JOHN H. GILLESPIE Department of Biology, University of Penmyluania, Philadelphia, Pennsyluania 19174 Manuscript received September 17, 1973 ABSTRACT In this paper it is shown that natural selection can act on the withingeneration variance in offspring number. The fitness of a genotype will increase as its variance in offspring number decreases. The intensity of selection on the variance component is inversely proportional to population size, although the fixation probability of a gene which differs from its allele only in the variance in its offspring number is independent of population size. The concept of effective population size is shown to be of limited use when there is genetic variation in the variance in offspring number. HE probability that the nest of an ant bird in tropical America escapes pre- Tdation is about.l. If, due to some biological constraint, ant birds can lay no more than M eggs in a breeding season, it is reasonable to ask: should all M eggs be put in one basket, or should they be evenly distributed into k baskets? The mean number of successful fledglings will be the same in both cases, but the variance in the numbers of fledglings will be M2 (.09) in the first case, M2 (.09)/k in the second. The second strategy seems preferable, yet the existing mathematiccal theory of natural selection does not seem to have treated this situation. In this paper the consequences of selection on the variance in offspring number will be explored mathematically. At the outset it is important to note that the variance in the numbers of offspring of a genotype has two components, the withingeneration component resulting from different individuals of the same genotype having different numbers of offspring, and a between-generation component due to the effects of a changing environment. This latter component has been recently investigated (GILLESPIE 1973); the former will be explored in this paper. 1. Haploid selectiom Consider a haploid population made up of two genotypes, A, and A,, whose numbers in the tth generation, t = 1, 2, 3,.,., are z1 (5) and x2(t), respectively. Let 1 + Yij (t) be a random variable representing the number of offspring of the jth individual of the ith genotype in the tth generation. If the collection of random variables {Yij(t), i = 1,2; i = 1,2,... xi (t)} are independent, the process (z, (t), x2 ( t )) is the two-dimensional branching process introduced as a genetic model by FELLER (1951). As pointed out by FELLER, this process may be approximated by a diffusion process whose forward equation may be written where 1 +pi is the mean number of offspring of genotype Ai and U: is the variance in the number of offspring of genotype Ai. The density of (zl(t), x2 (t)) is given by +(x,, z2, t). This Genetics 76: 601-606 March, 1974.
602 J. H. GILLESPIE equation may be transformed into an equation in p, the frequency of allele Ai, and n, the total population size by yielding a new forward equation p=- 21 X1+%, n=x,+x, If we regard this equation as approximating the behavior of a density-regulated population, n may be viewed as a fixed constant instead of a Markovian variable, and we will be left with a one-dimension process in p with drift coefficient and diffusion coefficient V(p) =-- n (5) Unlike the equivalent coefficients used in the stochastic models pioneered by WRIGHT (1946) and KIMURA (lw), these coefficients explicitly incorporate the variances in offspring numbers of the two genotypes. Their exclusion from the WRIGHT-KIMURA theory is due to the postulated infinite gametic pool (CROW and KIMURA 1970, p. 327) which, through the law of large numbers, removes the possibility of variance effects being important. In the present model, one which is certainly closer to reality for many species, the gametic pool will be essentially the same size as the final population. Examination of M(p) shows immediately the two main properties of the action of selection on the within-generation component of variance in offspring number: (1) Lowering the variance in offspring number will increase the fitness of a genotype. (2) The strength of selection for the variance component is inversely proportional to population size. Selection on variance in offspring number can be studied in isolation from mean effects if we set p1 = pz. In this situation the probability of fixation of allele A,, given an initial frequency p, is where (See CROW and KIMURA 1970 for the theory which allows the calculation of u(p)). This probability is independent of population size, contrary to all other forms of natural selection (KIMURA 1962). The reason for this is that both the intensity of selection and the magnitude of the stochastic effects are inversely proportional to population size. It is also interesting to note that the probability, u(p), depends on the ratio of variances rather than their absolute values, this, again, being unlike other forms of selection. Examination of u(p) shows that the allele with the smaller variance in offspring number always has a better chance of fixation than a neutral allele with the same initial frequency. This further strengthens the statement that selection favors smaller variances in offspring number. A seemingly paradoxical situation arises if we consider the limiting deterministic model which is obtained by letting n + 00. At the limit the rate of change of p is zero, yet, since u(p> is independent of n, there remains a non-zero probability of the allele with the smaller variance
VARIANCE IN OFFSPRING NUMBER 603 becoming fixed. To resolve this, examine the conditional mean time to fixation for allele, A,, given that it does, in fact, become fixed. Using the theory of EWENS (1972), this mean time may be shown to equal or, apprroximately if p/a is small: The mean time is, therefore, proportional to the population size. In large populations, selection for variance in offspring numbers would proceed very slowly; at the limit it would take, on the average, an infinite time to reach fixation. Thus, the paradox is resolved. It is possible to make a somewhat stronger statement about the behavior of the conditional waiting time to fixation than simply noting that its mean time approaches infinity. For the conditional process the drift coefficient is U'(P) M*b) = M(p) + V(P) - (PI * The diffusion coefficient is the same as in the unconditional process (EWENS 1972). The Laplace transform of the waiting time to fixation density g(h;p,n), satisfies an equation of the form (10) vag" + M* (p) g' - Xg= 0. (11) 2 Since both V(p) and M(p) have factors of l/n, multiplying through this equation by n causes n to appear only as a factor of A. The Laplace transform for populations with successively larger and larger population sizes will, therefore, be simple compressions of the smaller ones toward zero. This implies that the ( 12) liliwg(h; p,n) = 0, A > 0 : Thus, the entire probability mass of the waiting time density approaches infinity. Turning now to the more complicated case where the two alleles differ in both their mean and variance in offspring numbers, we arrive at the probability of fixation of A, as where U(P) =- - 2ns 1 - (1 + p/.) 022 - U12-2ns 1 - ( 1 + 1/a) u22 - u12 +I + I (13) This formula illustrates the way that increasing population size decreases the role of variance selection. In fact, as n+ 00-2n (41- PJ P 1 -e u12 P(P) -+- (15) -2n (a - Pz 1 I-e UI2 which is of the same form as the fixation probability applicable to the WRIGHT-KIMURA model (KIMURA 1962). KIMURA'S formula may also be obtained from (15) by allowing the variances in offspring number of the two genotypes to approach each other. Examination of the second derivative of u(p) shows it to be convex or concave depending on the sign of s. If
604 J. H. GILLESPIE then A, has a better chance of becoming fixed than a neutral allele with the same initial frequency. For this reason the best measure of the fitness of genotype Ai is 1 p. - - u.2 n a a quantity which depends on the population size. With this definition comical situations will sometimes arise. For example, if p, > p2 and u12 < u22, there exists a population size for which the two alleles are neutral (M(p) =O). The accuracy of these formulae may be checked by various methods. In Table 1 the results of computations of fixation probabilities when the numbers of offspring per individual are binomially distributed and the population process is generated by the direct-product branching process model of KARLIN and MCGRFGOR (19W) are compared to the diffusion approximations. The agreement is quite good, even for small population sizes. This justifies, in part, the use of n as a constant rather than a Markovian variable. TABLE 1 A comparison of the exact values for the fixation probabilities with formula (6) P N=10 U = 1.33 20.1.1367.1374,1429.1510.la8.I 629.5.5981.5987.moo.6330.m3.6366.9.9335.9336.9310.9w,9464.9* 2. Diploid selectiom The diploid model is treated in an analagous fashion. If there are three genotypes, A,A,, A,A,, and A,A2, with means and variances in offspring numbers given by 1 + mi and 1 + ui2, i = 1, 2, 3 respectively, it can be shown (see APPENDIX) that the drift coefficient for the diffusion approximation is M(P) =p(l--p) t-p(~l-~2)-(l-7j) (m,-m,) and the diffusion coefficient is The drift coefficient is an obvious extension of the haploid case. The diffusion coefficient is somewhat more interesting. It should be viewed as the sum of two components, the first, is a consequence of the randomness introduced by mendelian segregation. The second, comes from the randomness due to variable offspring number. If the variances of all three genotypes are equal, and if m, = 0, we obtain the coefficients of the WRIGHT-KIMURA theory. Examination of the drift coefficient shows that selection again has the ability to work on variances in offspring numbers. Of particular interest is the possibility of heterozygote advantage resulting from a lower variance in offspring number in the heterozygote. The strength of this selection is, again, inversely proportional to population size, and for very large ppulations this would be a negligible force for maintaining variations. The probability of fixation of allele A, may be written down for the diploid case, although the expression is rather cumbersome and yields little additional insight into the process.
VARIANCE IN OFFSPRING NUMBER 605 DISCUSSION The most remarkable aspects of selection for within-generation variance in offspring number are (1) the necessity of including the population size in the definition of fitness and (2) the appearance of p in the coefficient of p(1-p) in V(p). This latter property requires some comments since, by convention, the effective population size is precisely this coefficient, and is independent of p in the WRIGHT-KIMURA model. In the haploid model the effective size is and in the diploid model is = n pu; + ( 1 -p) U; 2n ne(p) = 1 + m* + U," + 2p(l-p) (U; + U; - 2,;). In both cases the most common allele is the least important in determining the effective population size. In the diploid model, when either allele is rare, the heterozygote will always be the genotype of importance in determining the effective population size. By continuity, if one allele is eliminated from the population, the effective size should be determined by this now-absent allele. This rather uncomfortable conclusion suggests that the concept of effective population size loses a good deal of its value in the context of the present model. It remains useful, however, as a description of the gross reproductive pattern of the population. Selection for the within-generation component of variance in off spring number shares certain properties with selection for the between-generation component. The most striking similarity is in the respective measures of fitness. For the between-generation component, the best measure of fitness of genotype Ai is 1 pi -- a? 2 ' (GILLESPIE 1973). The similarity of this expression to (16) suggests that a similar phenomenon is operating in both models. One expression of this phenomenon is that the advantage which a genotype gains thro,ugh producing many offspring in a good year does not balance the disadvantage from producing few offspring in a bad year. Thus a lowering in the variance in the offspring number, be it the within- or between-generation components, can only raise the probability of leaving offspring behind. I would like to thank BILL DRITSCHILO for computing the exact results given in Table 1. DRS. JAMES CROW and WARREN EWENS provided many helpful suggestions which significantly improved the final version of the paper. LITERATURE CITED CROW, J. F. and M. KIMURA, 1970 An Introduction to Population Genetics Theory. Harper and Row, New York. EWENS, W. J., 1972 Concepts of substitutional load in finite populations. Theoret. Population Biol. 3 : 153-161.
606 J. H. GILLESPIE FELLER, W., 1951 Diffusion processes in genetics. Proc. 2nd Berkeley Symp. on Math Statistics and Probability 227-246. GILLESPIE, J. H., 1973 Natural selection with varying selection coefficients-a haploid model. Genet. Res. 21: 115-120. KARLIN, S. and J. MCGREGOR, 1964 Direct product branching processes and related Markov chains. Proc. Natl. Acad. Sci. U.S. 51 : 598-602. KIMURA, M., 1954 Processes leading to quasi-fixation of genes in natural populations due to random fluctuation in selection intensities. Genetics 39: 280-295. -, 1962 On the probability of fixation of mutant genes in a population. Genetics 47: 713-719. -, 1964 Diffusion models in population genetics. J. Appl. Probability 1 : 177-232. WRIGHT, S., 1945 The differential equation of the distribution of gene frequencies. Proc. Natl. Acad. Sci. US. 31: 382-389. Corresponding editor: R. LEWONTIN APPENDIX We will begin by deriving the drift and diffusion coefficients for the diploid forward equation corresponding to (1). If, in generation t there are z, (t) A, alleles and z,(t) A, alleles, the num- bers of these three genotypes are assumed to be npz, n2pq, and nq2 where 2n = 2, + x2, p = z,/n, and q = 1-p. The mean and variance of the numbers of A, and A, alleles produced by these genotypes, under the assumption of random mating, may be derived by combining the contributions of the three genotypes considered separately. The mean number of A, and A, alleles produced will be np2 (1 + m,) + pq (1 + m,) ne (1 + m3) + P4 (1 + 7%) so the expected changes in the number of A, and A, alleles are M, (XI, z2) = n(p"m,+ P P, ) M, (z,, 22) = n(pqm2 + q2m,). These correspond to p,z, and p,z, in (1). They have been notated using n and p for convenience, although it should be kept in mind that the process being described is (z,, z2) and not, at this point, (n,p)* The variance in the number of A, alleles produced may also be calculated by considering the contributions of the three genotypes taken separately. For A,, the contribution from A,A, is n.pzu12 and from A,A, is npqu,2/2 + npg (1 f mz)/2. The contribution from the heterozygote may be calculated from the generating function of the number of A, alleles produced per heterozygote, F(1/2+ 1/2s), where F(s) is the generating function of the number of offspring per heterozygote. This, and the analogous calculations for A,, lead to the diffusion coefficients. Finally, there is a covariance term which comes from the fact that each gamete produced by a heterozygote is either A, or A,. The covariance in the number of A, and A, alleles is readily shown, again using generating functions, to be Mi, Vi, and C define a diffusion process in z, and z,. Using the transformation (2) this is converted into one in p and n. Holding n constant yields (1 7) and (18).