
A CHARACTERIZATION OF ANCESTRAL LIMIT PROCESSES ARISING IN HAPLOID POPULATION GENETICS MODELS

M. Möhle$^1$, Johannes Gutenberg-University, Mainz
and S. Sagitov$^2$, Chalmers University of Technology, Göteborg

Abstract

The classical $n$-coalescent introduced by Kingman is not the only possible limit process arising in ancestral population genetics. Under smooth conditions other limit processes do appear, in which multiple mergers of ancestral lines happen with positive probability. These processes are characterized by a probability measure on the interval $[0,1]$. The classical $n$-coalescent is the special case in which this measure is the point measure at $0$; in this case only pairwise coalescence events appear. Simple necessary and sufficient conditions on the underlying haploid population model are presented which ensure that the ancestral process converges weakly to such a limit process as the population size tends to infinity.

ANCESTRAL PROCESS; COALESCENT; EXCHANGEABILITY; NEUTRALITY; POPULATION GENETICS; WEAK CONVERGENCE

AMS 1991 SUBJECT CLASSIFICATION: Primary 92D25, 60J70; Secondary 92D15, 60F17

1 Introduction

Consider the haploid population models with non-overlapping generations $r \in \mathbb{N}_0 := \{0, 1, 2, \ldots\}$ and fixed population size $N \in \mathbb{N} := \{1, 2, \ldots\}$ introduced by Cannings [1, 2]. These models are characterized by exchangeable random variables $\nu_1, \ldots, \nu_N$ with values in $\{0, \ldots, N\}$, where $\nu_i$ denotes the number of offspring of the $i$-th individual. As the population size is fixed, the condition

\[ \nu_1 + \cdots + \nu_N = N \tag{1} \]

has to be satisfied. Fix $n \le N$ and sample $n$ individuals at random from the $0$-th generation. For $r \in \mathbb{N}_0$ let $R_r$ denote the equivalence relation which contains the pair $(i,j)$ if and only if the $i$-th and the $j$-th individual of this sample have a common ancestor in the $r$-th generation backwards in time. The process $(R_r)_{r \in \mathbb{N}_0}$ is a time-homogeneous Markov chain with state space $E_n$, the set of all equivalence relations on $\{1, \ldots, n\}$, and initial value $R_0 = \Delta := \{(i,i) \mid i \in \{1, \ldots, n\}\}$.
$^1$ Postal address: Johannes Gutenberg-University Mainz, Department of Mathematics, Saarstraße 21, 55099 Mainz, Germany; e-mail: moehle@mathematik.uni-mainz.de
$^2$ Postal address: Chalmers University of Technology, Department of Mathematical Statistics, 41296 Göteborg, Sweden; e-mail: serik@math.chalmers.se

Obviously, for $\xi, \eta \in E_n$ the transition probability $p_{\xi\eta} := P(R_r = \eta \mid R_{r-1} = \xi)$ is equal to zero for $\xi \not\subseteq \eta$. Assume now that $\xi \subseteq \eta$. In analogy to Kingman [4, 5, 6] let $C_1, \ldots, C_a$ denote the equivalence classes of $\eta$ and let $C_{\alpha\beta}$, $\alpha \in \{1, \ldots, a\}$, $\beta \in \{1, \ldots, b_\alpha\}$, denote the equivalence classes of $\xi$ such that $C_\alpha = \bigcup_{\beta=1}^{b_\alpha} C_{\alpha\beta}$. The transition probability is given by

\[ p_{\xi\eta} = \frac{1}{(N)_b} \sum_{\substack{i_1, \ldots, i_a \\ \text{all distinct}}} E\big((\nu_{i_1})_{b_1} \cdots (\nu_{i_a})_{b_a}\big) = \frac{(N)_a}{(N)_b}\, E\big((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}\big), \tag{2} \]

where $b := |\xi| = b_1 + \cdots + b_a$ is the number of equivalence classes of $\xi$ and the notation $(x)_b := x(x-1) \cdots (x-b+1)$ is used. Let $c_N$ denote the probability that two individuals, chosen randomly without replacement from some generation, have a common ancestor one generation backwards in time, i.e.

\[ c_N := \frac{1}{(N)_2} \sum_{i=1}^{N} E((\nu_i)_2) = \frac{E((\nu_1)_2)}{N-1} = \frac{\mathrm{Var}(\nu_1)}{N-1} = 1 - E(\nu_1 \nu_2). \tag{3} \]

This probability, called the coalescence probability, is of fundamental interest in coalescent theory, as $c_N^{-1}$ is the proper time scale to get convergence to the coalescent. For technical reasons it is assumed that $c_N > 0$ for all $N$. Note that $c_N = 0$ if and only if $\nu_1 = 1$ almost surely. The coalescence probability is also important as it is directly connected via $c_N = 1 - \lambda_2$ to the eigenvalue $\lambda_2 := E(\nu_1 \nu_2)$ of the transition matrix of the descendant process, i.e. the genealogical process looking forwards in time (see [1]). Note that for $N \ge 2$ the correlation coefficient between $\nu_1$ and $\nu_2$ is given by

\[ \rho(\nu_1, \nu_2) = \frac{E(\nu_1 \nu_2) - 1}{\mathrm{Var}(\nu_1)} = -\frac{1}{N-1} < 0. \tag{4} \]

For a large class of models, for example for the Moran model and the Wright-Fisher model (Kingman [5]), it is well known that the finite-dimensional distributions of the time-scaled ancestral process $(R_{[t/c_N]})_{t \ge 0}$ converge to those of the (classical) $n$-coalescent. The $n$-coalescent is a time-continuous Markov process with state space $E_n$, initial state $\Delta$ and infinitesimal generator $Q = (q_{\xi\eta})_{\xi,\eta \in E_n}$ given by

\[ q_{\xi\eta} = \begin{cases} -|\xi|(|\xi|-1)/2 & \text{if } \xi = \eta, \\ 1 & \text{if } \xi \prec \eta, \\ 0 & \text{otherwise,} \end{cases} \tag{5} \]

where $\xi \prec \eta$ if and only if $\xi \subseteq \eta$ and $|\xi| = |\eta| + 1$, i.e. $\eta$ is obtained by merging two equivalence classes of $\xi$. These convergence results are based on an expansion of the transition probabilities of the form

\[ p_{\xi\eta} = \delta_{\xi\eta} + c_N q_{\xi\eta} + o(c_N), \qquad \xi, \eta \in E_n, \]

which is often written in matrix notation

\[ P_N = I + c_N Q + o(c_N), \]

where $P_N := (p_{\xi\eta})_{\xi,\eta \in E_n}$ denotes the transition matrix of the ancestral process. For a large class of models this expansion is valid ([8]). Recently (see [10]) this expansion has been extended to a wider class of generators $Q$. In order to describe this class of generators the following definition is needed.

Definition. A pair $(\xi, \eta)$, $\xi, \eta \in E_n$, is called a $k$-merger, $k \in \{2, \ldots, |\xi|\}$, if $\xi \subset \eta$ and if $b_1 = k$ and $b_2 = \cdots = b_a = 1$, i.e. if $\eta$ has exactly one equivalence class which is a union of $k$ classes of $\xi$ and all the other classes of $\eta$ are also classes of $\xi$. Use the notation $\xi \prec_k \eta$ if $(\xi, \eta)$ is a $k$-merger.

Remarks. If $\xi \prec_k \eta$ then $|\xi| = |\eta| + k - 1$. If $\Theta := \{(i,j) \mid i, j \in \{1, \ldots, n\}\}$ denotes the full relation then for each $\xi \in E_n$ the pair $(\xi, \Theta)$ is a $|\xi|$-merger. If $n \le 3$ then for all $\xi, \eta \in E_n$ with $\xi \subset \eta$ the pair $(\xi, \eta)$ is a $(|\xi| - |\eta| + 1)$-merger.

The class of generators $Q$ is characterized by an arbitrary probability measure $\Lambda$ on the interval $[0,1]$:

\[ q_{\xi\eta} = \begin{cases} \displaystyle -\int_{[0,1]} \frac{1 - (1-x)^{b-1}(1 - x + bx)}{x^2}\, \Lambda(dx) & \text{if } \xi = \eta, \\[2ex] \displaystyle \int_{[0,1]} x^{k-2} (1-x)^{b-k}\, \Lambda(dx) & \text{if } \xi \prec_k \eta, \\[2ex] 0 & \text{otherwise,} \end{cases} \tag{6} \]

where $b := |\xi|$. For $\Lambda = \delta_0$ the entries (6) are identical to the entries (5) of the standard $n$-coalescent. Another example is given in [8]. The one-parameter class $\Lambda([0,x]) = x^{\alpha}$ is considered in more detail in [10].

In the next section it is shown that under smooth conditions, which depend only on the common distribution of the variables $\nu_1$ and $\nu_2$, the generator $Q = \lim_{N \to \infty} (P_N - I)/c_N$ exists and has the form (6). The conditions are similar (and certainly equivalent) to the conditions given in [10], but they have a simpler form.

2 The convergence theorem

In this section we first present the "if"-part of the main convergence result (Theorem 2.1); most of the section deals with the proof of this theorem. At the end of the section the "only-if"-part is considered (Theorem 2.6), which is very easy to verify.

Theorem 2.1 Assume that the following two conditions are satisfied:

(I) $\displaystyle \phi_k := \lim_{N \to \infty} \frac{N^{1-k}\, E((\nu_1)_k)}{c_N}$ exists for all $k \ge 2$.

(II) $\displaystyle \lim_{N \to \infty} \frac{E((\nu_1)_2 (\nu_2)_2)}{N^2 c_N} = 0$.

Then for each sample size $n \in \mathbb{N}$ the limit

\[ Q = (q_{\xi\eta})_{\xi,\eta \in E_n} := \lim_{N \to \infty} \frac{P_N - I}{c_N} \]

exists and the entries of $Q$ have the form (6), where the probability measure $\Lambda$ on $[0,1]$ is uniquely determined via its moments $\int_{[0,1]} x^k\, \Lambda(dx) = \phi_{k+2}$, $k \in \mathbb{N}_0$.

Remarks.

1. From $E((\nu_1)_2) = (N-1) c_N$ it follows that $\phi_2 = 1$. For $k \ge 3$ it follows from $(\nu_1)_k \le N^{k-2} (\nu_1)_2$ that

\[ \frac{N^{1-k}\, E((\nu_1)_k)}{c_N} \le \frac{N^{1-k} N^{k-2}\, E((\nu_1)_2)}{c_N} = \frac{N-1}{N} \le 1, \]

i.e. the sequence $(N^{1-k} c_N^{-1} E((\nu_1)_k))_{N \in \mathbb{N}}$ is bounded. Hence there exists a subsequence $(N_l)_{l \in \mathbb{N}}$ with $\lim_{l \to \infty} N_l = \infty$ such that

\[ \phi_k = \lim_{l \to \infty} \frac{N_l^{1-k}\, E((\nu_1)_k)}{c_{N_l}} \]

exists. Thus the condition (I) is not as strong as it seems to be at first glance. We will see in Lemma 2.3 that the condition (I) ensures the existence of the measure $\Lambda$. This measure is uniquely determined by the numbers $\phi_k$, i.e. it depends only on the distribution of the offspring variable $\nu_1$. The condition (I) is in fact equivalent ([10], Equation (3)) to the following tail condition: there exists a probability measure $\Lambda$ on the interval $[0,1]$ such that the convergence

\[ \lim_{N \to \infty} \frac{N}{c_N}\, P(\nu_1 > Nx) = \int_{[x,1]} y^{-2}\, \Lambda(dy) \tag{7} \]

holds at all points $x \in (0,1)$ where the limit function is continuous.

2. The condition (II) is for example satisfied if $\lim_{N \to \infty} c_N = 0$ and if the random variables $(\nu_1)_2$ and $(\nu_2)_2$ are not positively correlated, i.e.

\[ E((\nu_1)_2 (\nu_2)_2) \le \big(E((\nu_1)_2)\big)^2 = ((N-1) c_N)^2 \le N^2 c_N^2. \tag{8} \]

It is shown in the next lemma that the condition (II) is equivalent to

\[ \lim_{N \to \infty} \frac{E((\nu_1 - 1)^2 (\nu_2 - 1)^2)}{N^2 c_N} = 0, \]

which corresponds to the condition (4) in [10] for $a = 2$.
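As an illustration of conditions (I) and (II), consider the Wright-Fisher model mentioned in the Introduction. The following Python sketch assumes that the offspring vector of this model is symmetric multinomial, so that the joint factorial moments are $E((\nu_1)_{k_1} \cdots (\nu_a)_{k_a}) = (N)_k N^{-k}$ with $k = k_1 + \cdots + k_a$. The function names are hypothetical and not part of the paper. The sketch evaluates the ratios appearing in (I) and (II) exactly.

```python
from fractions import Fraction


def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    out = 1
    for j in range(k):
        out *= x - j
    return out


def wf_factorial_moment(N, ks):
    """E[(nu_1)_{k_1} ... (nu_a)_{k_a}] under the assumed symmetric
    multinomial (Wright-Fisher) offspring law: (N)_k / N^k, k = sum(ks)."""
    k = sum(ks)
    return Fraction(falling(N, k), N ** k)


def coalescence_probability(N):
    """c_N = E((nu_1)_2) / (N - 1), see equation (3)."""
    return wf_factorial_moment(N, [2]) / (N - 1)


def condition_I_ratio(N, k):
    """N^(1-k) E((nu_1)_k) / c_N, the quantity whose limit is phi_k."""
    c_N = coalescence_probability(N)
    return wf_factorial_moment(N, [k]) / (N ** (k - 1) * c_N)


def condition_II_ratio(N):
    """E((nu_1)_2 (nu_2)_2) / (N^2 c_N), which must tend to zero."""
    c_N = coalescence_probability(N)
    return wf_factorial_moment(N, [2, 2]) / (N ** 2 * c_N)


if __name__ == "__main__":
    for N in (10, 100, 1000):
        print(N,
              float(condition_I_ratio(N, 2)),
              float(condition_I_ratio(N, 3)),
              float(condition_II_ratio(N)))
```

Under this assumption $c_N = 1/N$, the ratio in (I) tends to $1$ for $k = 2$ and to $0$ for $k \ge 3$, and the ratio in (II) tends to $0$. This is consistent with $\Lambda = \delta_0$, i.e. with the classical $n$-coalescent.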

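The proof of Theorem 2.1 is split up into several parts. First it is shown (Lemma 2.2) that condition (II) is equivalent to $q_{\xi\eta} = 0$ for all $\xi, \eta \in E_n$ with $\xi \subset \eta$ such that $(\xi, \eta)$ is not a $k$-merger. In Lemma 2.3 the existence of the probability measure $\Lambda$ is derived and afterwards (Lemma 2.4 and Corollary 2.5) the formula for $q_{\xi\eta}$ is shown if $(\xi, \eta)$ is a $k$-merger. These proofs differ from the proofs given in [10]. Finally the case $\xi = \eta$ is considered.

Lemma 2.2 The following conditions are equivalent:

(i) $\displaystyle \lim_{N \to \infty} \frac{E((\nu_1)_2 (\nu_2)_2)}{N^2 c_N} = 0$.

(ii) $\displaystyle \lim_{N \to \infty} \frac{E((\nu_1 - 1)^2 (\nu_2 - 1)^2)}{N^2 c_N} = 0$.

(iii) $\displaystyle \lim_{N \to \infty} N^{a-b} c_N^{-1}\, E((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}) = 0$ for all $a \ge 2$, $b_1, b_2 \ge 2$ and $b_3, \ldots, b_a \ge 1$, where $b := b_1 + \cdots + b_a$.

(iv) $\displaystyle q_{\xi\eta} := \lim_{N \to \infty} \frac{p_{\xi\eta}}{c_N} = 0$ for all $n \in \mathbb{N}$ and all $\xi, \eta \in E_n$ with $\xi \subset \eta$ such that the pair $(\xi, \eta)$ is not a $k$-merger.

Remark. Note that the condition (i) is exactly the condition (II) of the convergence Theorem 2.1.

Proof. (i) $\Leftrightarrow$ (ii): Note first that

\[ (\nu_1 - 1)^2 (\nu_2 - 1)^2 = \big((\nu_1)_2 - (\nu_1 - 1)\big)\big((\nu_2)_2 - (\nu_2 - 1)\big) = (\nu_1)_2 (\nu_2)_2 - (\nu_1 - 1)(\nu_2)_2 - (\nu_2 - 1)(\nu_1)_2 + (\nu_1 - 1)(\nu_2 - 1). \]

Using $E((\nu_1 - 1)(\nu_2 - 1)) = E(\nu_1 \nu_2) - 1 = -c_N$ and exchangeability it follows that

\[ \begin{aligned} E((\nu_1 - 1)^2 (\nu_2 - 1)^2) - E((\nu_1)_2 (\nu_2)_2) &= -2 E((\nu_1)_2 (\nu_2 - 1)) - c_N \\ &= -2 E((\nu_1)_2 \nu_2) + 2 E((\nu_1)_2) - c_N \\ &= -2 E((\nu_1)_2 \nu_2) + 2(N-1) c_N - c_N \\ &= -2 E((\nu_1)_2 \nu_2) + (2N - 3) c_N. \end{aligned} \]

Now $(N-1) E((\nu_1)_2 \nu_2) = E((\nu_1)_2 (\nu_2 + \cdots + \nu_N)) \le N E((\nu_1)_2) = N(N-1) c_N$, i.e. $E((\nu_1)_2 \nu_2) = O(N c_N)$. Thus $E((\nu_1 - 1)^2 (\nu_2 - 1)^2) - E((\nu_1)_2 (\nu_2)_2) = O(N c_N)$ and, after division by $N^2 c_N$, the equivalence of (i) and (ii) follows immediately.

(i) $\Rightarrow$ (iii): Since $\nu_i \le N$ for all $i$ and $\nu_1 + \cdots + \nu_N = N$,

\[ \begin{aligned} S := \sum_{\substack{i_1, \ldots, i_a \\ \text{all distinct}}} (\nu_{i_1})_{b_1} \cdots (\nu_{i_a})_{b_a} &\le \sum_{\substack{i_1, i_2 \\ i_1 \ne i_2}} (\nu_{i_1})_2\, \nu_{i_1}^{b_1 - 2}\, (\nu_{i_2})_2\, \nu_{i_2}^{b_2 - 2} \sum_{i_3, \ldots, i_a} \nu_{i_3}^{b_3} \cdots \nu_{i_a}^{b_a} \\ &\le \sum_{\substack{i, j \\ i \ne j}} (\nu_i)_2\, N^{b_1 - 2}\, (\nu_j)_2\, N^{b_2 - 2}\, (\nu_1 + \cdots + \nu_N)^{b_3 + \cdots + b_a} \\ &= N^{b-4} \sum_{\substack{i, j \\ i \ne j}} (\nu_i)_2 (\nu_j)_2 \end{aligned} \]

and hence

\[ E(S) \le N^{b-4} \sum_{\substack{i, j \\ i \ne j}} E((\nu_i)_2 (\nu_j)_2) \le N^{b-2}\, E((\nu_1)_2 (\nu_2)_2). \]

Thus, writing $\sim$ for asymptotic equivalence as $N \to \infty$,

\[ N^{a-b} c_N^{-1}\, E((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}) \sim \frac{(N)_a\, E((\nu_1)_{b_1} \cdots (\nu_a)_{b_a})}{(N)_b\, c_N} = \frac{E(S)}{(N)_b\, c_N} \le \frac{N^{b-2}\, E((\nu_1)_2 (\nu_2)_2)}{(N)_b\, c_N} \sim \frac{E((\nu_1)_2 (\nu_2)_2)}{N^2 c_N}, \]

which converges to zero by assumption.

(iii) $\Rightarrow$ (i): Choose $a := b_1 := b_2 := 2$ in (iii) and (i) follows immediately.

(iii) $\Rightarrow$ (iv): Let $\xi, \eta \in E_n$ be such that $\xi \subset \eta$ and $(\xi, \eta)$ is not a $k$-merger. Then $a \ge 2$ and (without loss of generality) $b_1, b_2 \ge 2$, and hence

\[ \frac{p_{\xi\eta}}{c_N} = \frac{(N)_a}{(N)_b\, c_N}\, E((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}) \sim N^{a-b} c_N^{-1}\, E((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}), \]

which converges to zero as $N \to \infty$.

(iv) $\Rightarrow$ (i): Choose $n := 4$, $\xi := \Delta$ and let $\eta \in E_n$ denote the equivalence relation with classes $\{1, 2\}$ and $\{3, 4\}$. Then

\[ p_{\xi\eta} = \frac{(N)_2}{(N)_4}\, E((\nu_1)_2 (\nu_2)_2) \]

and (i) follows from $\lim_{N \to \infty} p_{\xi\eta}/c_N = 0$, as $(\xi, \eta)$ is not a $k$-merger. □

Lemma 2.3 Let the condition (I) be satisfied. Then there exists a probability measure $\Lambda$ on $[0,1]$ such that

\[ \phi_k = \lim_{N \to \infty} \frac{N^{1-k}\, E((\nu_1)_k)}{c_N} = \int_{[0,1]} x^{k-2}\, \Lambda(dx) \]

for all $k \ge 2$.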
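The statement of Lemma 2.3 can be checked numerically for a concrete offspring law. The following Python sketch is an illustration under stated assumptions, not part of the original argument: it takes the marginal law $P(\nu_1 = N/L) = L/N$, $P(\nu_1 = 0) = 1 - L/N$ (the model treated in Example 2 of Section 3), and compares the ratio in condition (I) with the moments of the size-biased variable $X_N = Y_N/N$, where $P(Y_N = i)$ is proportional to $(i)_2\, P(\nu_1 = i)$. All function names are hypothetical.

```python
from fractions import Fraction


def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    out = 1
    for j in range(k):
        out *= x - j
    return out


def marginal_example(N, L):
    """Assumed marginal law of nu_1: value N/L with probability L/N, else 0."""
    assert N % L == 0, "N must be a multiple of L"
    return {N // L: Fraction(L, N), 0: 1 - Fraction(L, N)}


def factorial_moment(pmf, k):
    """E((nu_1)_k) for a marginal law given as {value: probability}."""
    return sum(p * falling(i, k) for i, p in pmf.items())


def phi_ratio(pmf, N, k):
    """N^(1-k) E((nu_1)_k) / c_N with c_N = E((nu_1)_2) / (N - 1)."""
    c_N = factorial_moment(pmf, 2) / (N - 1)
    return factorial_moment(pmf, k) / (N ** (k - 1) * c_N)


def size_biased_moment(pmf, N, k):
    """E(X_N^k) for X_N = Y_N / N, P(Y_N = i) = (i)_2 P(nu_1 = i) / E((nu_1)_2)."""
    norm = factorial_moment(pmf, 2)
    total = sum(Fraction(i, N) ** k * falling(i, 2) * p for i, p in pmf.items())
    return total / norm


if __name__ == "__main__":
    L = 3
    for N in (12, 120, 1200):
        pmf = marginal_example(N, L)
        print(N, float(phi_ratio(pmf, N, 4)), float(size_biased_moment(pmf, N, 2)))
```

For this law $Y_N = N/L$ almost surely, so $E(X_N^k) = L^{-k}$ for every $N$, while the ratio in (I) with index $k + 2$ approaches the same value as $N$ grows.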

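Proof. Let $Y_N$ be a random variable with distribution

\[ P(Y_N = i) := \frac{(i)_2}{E((\nu_1)_2)}\, P(\nu_1 = i) = \frac{i(i-1)}{(N-1) c_N}\, P(\nu_1 = i), \]

where $i \in \{0, \ldots, N\}$. The $k$-th moment of $X_N := Y_N / N$ is then given by

\[ E(X_N^k) = \sum_{i=0}^{N} \Big(\frac{i}{N}\Big)^k P(Y_N = i) = \sum_{i=0}^{N} \Big(\frac{i}{N}\Big)^k \frac{i(i-1)}{(N-1) c_N}\, P(\nu_1 = i) = \frac{N^{-k}}{(N-1) c_N}\, E(\nu_1^{k+2} - \nu_1^{k+1}). \]

Using the equation $t^k = \sum_{l=1}^{k} S_{kl}\, (t)_l$ for all $t \in \mathbb{R}$ and all $k \ge 1$, where $S_{kl}$ are the Stirling numbers of the second kind, and noting that $S_{k+2,1} = S_{k+1,1} = 1$, this leads to

\[ \begin{aligned} E(X_N^k) &= \frac{N^{-k}}{(N-1) c_N}\, E\Big( \sum_{l=1}^{k+2} S_{k+2,l}\, (\nu_1)_l - \sum_{l=1}^{k+1} S_{k+1,l}\, (\nu_1)_l \Big) \\ &= \frac{N^{-k}}{(N-1) c_N}\, E\Big( (\nu_1)_{k+2} + \sum_{l=2}^{k+1} (S_{k+2,l} - S_{k+1,l})\, (\nu_1)_l \Big) \\ &= \frac{N^{-k}}{(N-1) c_N}\, E((\nu_1)_{k+2}) + \sum_{l=2}^{k+1} (S_{k+2,l} - S_{k+1,l})\, \frac{N^{-k}\, E((\nu_1)_l)}{(N-1) c_N}, \end{aligned} \]

which converges to $\phi_{k+2}$ under condition (I). Note that $0 \le X_N \le 1$ almost surely. It is known (see [3], Chapter 8, Section 1) that for a sequence of probability measures on $[0,1]$ the convergence of the moments is equivalent to the weak convergence of the sequence $(X_N)_{N \in \mathbb{N}}$ to a limit $X$, where the distribution $\Lambda := P_X$ of $X$ is uniquely determined by its $k$-th moments $E(X^k) = \lim_{N \to \infty} E(X_N^k) = \phi_{k+2}$. Thus

\[ \phi_{k+2} = E(X^k) = \int_{[0,1]} x^k\, \Lambda(dx) \]

for all $k \in \mathbb{N}_0$ and the lemma is established. □

Lemma 2.4 If the two conditions (I) and (II) of Theorem 2.1 are satisfied then

\[ \lim_{N \to \infty} \frac{N^{1-k}\, E((\nu_1)_k\, \nu_2 \cdots \nu_a)}{c_N} = \int_{[0,1]} x^{k-2} (1-x)^{a-1}\, \Lambda(dx) \]

for all $k \ge 2$ and all $a \ge 1$.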

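Proof. By induction on $a$. For $a = 1$ this follows from Lemma 2.3. Assume that the formula is valid for some $a \ge 1$ and all $k \ge 2$. Then, using the exchangeability of the offspring variables $\nu_1, \ldots, \nu_N$ and (1), it follows that

\[ \begin{aligned} (N - a)\, E((\nu_1)_k\, \nu_2 \cdots \nu_{a+1}) &= \sum_{i=a+1}^{N} E((\nu_1)_k\, \nu_2 \cdots \nu_a\, \nu_i) \\ &= E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\, (\nu_{a+1} + \cdots + \nu_N)\big) \\ &= E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\, (N - \nu_1 - \cdots - \nu_a)\big) \\ &= E\Big((\nu_1)_k\, \nu_2 \cdots \nu_a\, \Big(N - k - a + 1 - (\nu_1 - k) - \sum_{i=2}^{a} (\nu_i - 1)\Big)\Big) \\ &= (N - k - a + 1)\, E((\nu_1)_k\, \nu_2 \cdots \nu_a) - E((\nu_1)_{k+1}\, \nu_2 \cdots \nu_a) - \sum_{i=2}^{a} E((\nu_1)_k\, \nu_2 \cdots \nu_a\, (\nu_i - 1)) \\ &= (N - k - a + 1)\, E((\nu_1)_k\, \nu_2 \cdots \nu_a) - E((\nu_1)_{k+1}\, \nu_2 \cdots \nu_a) - (a-1)\, E((\nu_1)_k\, (\nu_2)_2\, \nu_3 \cdots \nu_a). \end{aligned} \]

Multiplying this equation by $N^{-k}/c_N$ and taking the limit $N \to \infty$ leads to

\[ \lim_{N \to \infty} \frac{N^{1-k}\, E((\nu_1)_k\, \nu_2 \cdots \nu_{a+1})}{c_N} = \lim_{N \to \infty} \frac{N^{1-k}\, E((\nu_1)_k\, \nu_2 \cdots \nu_a)}{c_N} - \lim_{N \to \infty} \frac{N^{-k}\, E((\nu_1)_{k+1}\, \nu_2 \cdots \nu_a)}{c_N} - (a-1) \lim_{N \to \infty} \frac{N^{-k}\, E((\nu_1)_k\, (\nu_2)_2\, \nu_3 \cdots \nu_a)}{c_N}. \]

The last limit is equal to zero by Lemma 2.2 (iii). Thus, applying the induction hypothesis with $k$ and with $k+1$,

\[ \begin{aligned} \lim_{N \to \infty} \frac{N^{1-k}\, E((\nu_1)_k\, \nu_2 \cdots \nu_{a+1})}{c_N} &= \int_{[0,1]} x^{k-2} (1-x)^{a-1}\, \Lambda(dx) - \int_{[0,1]} x^{k-1} (1-x)^{a-1}\, \Lambda(dx) \\ &= \int_{[0,1]} x^{k-2} (1-x)^{a}\, \Lambda(dx). \end{aligned} \]

Thus the formula is valid for $a + 1$ and the induction is complete. □

Corollary 2.5 Let the conditions (I) and (II) be satisfied. If $(\xi, \eta)$ is a $k$-merger ($k \ge 2$) then

\[ q_{\xi\eta} := \lim_{N \to \infty} \frac{p_{\xi\eta}}{c_N} = \int_{[0,1]} x^{k-2} (1-x)^{a-1}\, \Lambda(dx), \]

where $a := |\eta|$.

Proof. Let $(\xi, \eta)$ be a $k$-merger ($k \ge 2$). With the notation $b := |\xi|$ and $a := |\eta|$ it follows from (2) that

\[ q_{\xi\eta} = \lim_{N \to \infty} \frac{p_{\xi\eta}}{c_N} = \lim_{N \to \infty} \frac{(N)_a\, E((\nu_1)_k\, \nu_2 \cdots \nu_a)}{(N)_b\, c_N} = \lim_{N \to \infty} \frac{N^{a-b}\, E((\nu_1)_k\, \nu_2 \cdots \nu_a)}{c_N} = \lim_{N \to \infty} \frac{N^{1-k}\, E((\nu_1)_k\, \nu_2 \cdots \nu_a)}{c_N}, \tag{9} \]

as $a = b - k + 1$. The corollary now follows directly from Lemma 2.4. □

In order to finish the proof of Theorem 2.1 it remains to consider the diagonal entries $q_{\xi\xi}$. For $\xi \in E_n$ with $b := |\xi|$ there are exactly $\binom{b}{k}$ relations $\eta$ such that $(\xi, \eta)$ is a $k$-merger, so it follows from $\sum_{\eta \in E_n} p_{\xi\eta} = 1$ that

\[ \begin{aligned} q_{\xi\xi} = -\sum_{\substack{\eta \in E_n \\ \eta \ne \xi}} q_{\xi\eta} &= -\sum_{k=2}^{b}\ \sum_{\substack{\eta \in E_n \\ (\xi,\eta) \text{ is a } k\text{-merger}}} q_{\xi\eta} \\ &= -\sum_{k=2}^{b} \binom{b}{k} \int_{[0,1]} x^{k-2} (1-x)^{b-k}\, \Lambda(dx) \\ &= -\int_{[0,1]} x^{-2} \sum_{k=2}^{b} \binom{b}{k} x^{k} (1-x)^{b-k}\, \Lambda(dx) \\ &= -\int_{[0,1]} x^{-2} \big(1 - (1-x)^{b} - bx(1-x)^{b-1}\big)\, \Lambda(dx) \end{aligned} \]

and the proof of Theorem 2.1 is finished.

At the end of this section the "only-if"-part, i.e. the converse of Theorem 2.1, is presented, which is simple to prove.

Theorem 2.6 Assume that for each sample size $n \in \mathbb{N}$ the limit $Q := (q_{\xi\eta})_{\xi,\eta \in E_n} := \lim_{N \to \infty} (P_N - I)/c_N$ exists such that $q_{\xi\eta} = 0$ if $\xi \subset \eta$ and the pair $(\xi, \eta)$ is not a $k$-merger. Then the conditions (I) and (II) are satisfied and $Q$ has the form (6), where the probability measure $\Lambda$ on the interval $[0,1]$ is uniquely determined via its moments $\int_{[0,1]} x^{k-2}\, \Lambda(dx) = \lim_{N \to \infty} N^{1-k} c_N^{-1} E((\nu_1)_k)$, $k \ge 2$.

Proof. Fix $k \ge 2$, $n \ge k$ and choose $\xi \in E_n$ such that $|\xi| = k$. Using the equation (9) for $a = 1$, i.e. for $\eta = \Theta$, it follows that the condition (I) is satisfied with $\phi_k := q_{\xi\Theta} = \lim_{N \to \infty} N^{1-k} c_N^{-1} E((\nu_1)_k)$. The implication from (iv) to (i) in Lemma 2.2 ensures that the condition (II) is also satisfied. Now apply Theorem 2.1. □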
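The generator entries of the form (6) and the computation of the diagonal entries above can be checked numerically. The following Python sketch makes a simplifying assumption that is not in the paper: the measure $\Lambda$ is discrete and given as a list of (atom, weight) pairs. It computes the off-diagonal rate of a $k$-merger, sums these rates over the $\binom{b}{k}$ possible $k$-mergers, and compares the result with the closed-form integrand of the diagonal entry. At an atom at $x = 0$ the integrand is replaced by its limit $b(b-1)/2$. All names are hypothetical.

```python
from fractions import Fraction
from math import comb


def merger_rate(b, k, atoms):
    """Rate q of one k-merger out of b classes for a discrete Lambda:
    sum over atoms of weight * x^(k-2) * (1-x)^(b-k)."""
    return sum(w * x ** (k - 2) * (1 - x) ** (b - k) for x, w in atoms)


def total_rate(b, atoms):
    """-q_{xi xi}: sum of the rates of all k-mergers, k = 2, ..., b."""
    return sum(comb(b, k) * merger_rate(b, k, atoms) for k in range(2, b + 1))


def diagonal_closed_form(b, atoms):
    """-q_{xi xi} from the closed form (1 - (1-x)^(b-1) (1 - x + b x)) / x^2."""
    total = Fraction(0)
    for x, w in atoms:
        if x == 0:
            # Limit of the integrand as x -> 0 (Kingman part of Lambda).
            total += w * Fraction(b * (b - 1), 2)
        else:
            total += w * (1 - (1 - x) ** (b - 1) * (1 - x + b * x)) / x ** 2
    return total


if __name__ == "__main__":
    kingman = [(Fraction(0), Fraction(1))]        # Lambda = delta_0
    half = [(Fraction(1, 2), Fraction(1))]        # Lambda = delta_{1/2}
    for name, atoms in (("delta_0", kingman), ("delta_1/2", half)):
        for b in (2, 3, 4, 5):
            print(name, b, total_rate(b, atoms), diagonal_closed_form(b, atoms))
```

For $\Lambda = \delta_0$ only $2$-mergers have positive rate and the total rate is $b(b-1)/2$, in agreement with (5). Exact rational arithmetic is used so that the two expressions for the diagonal entry can be compared for equality rather than up to rounding.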

3 Further convergence results and open problems

Theorem 3.1 Assume that the conditions (I) and (II) of Theorem 2.1 are satisfied and assume further that $c := \lim_{N \to \infty} c_N$ exists. If $c > 0$ then the ancestral process $(R_r)_{r \in \mathbb{N}_0}$ converges weakly as $N \to \infty$ to a discrete-time Markov process $(R_r)_{r \in \mathbb{N}_0}$ with initial state $R_0 = \Delta$ and transition matrix $I + cQ$, where $Q$ is given as in Theorem 2.1. If $c = 0$ then the time-scaled ancestral process $(R_{[t/c_N]})_{t \ge 0}$ converges weakly as $N \to \infty$ in $D_{E_n}([0,\infty))$ to a continuous-time Markov process $(R_t)_{t \ge 0}$ with initial state $R_0 = \Delta$ and transition matrix $e^{tQ}$, where $Q$ is again given as in Theorem 2.1.

Proof. From Theorem 2.1 it follows that $Q := \lim_{N \to \infty} (P_N - I)/c_N$ is of the form (6), where the measure $\Lambda$ is uniquely determined by its moments $\int_{[0,1]} x^k\, \Lambda(dx) = \phi_{k+2}$, $k \in \mathbb{N}_0$. For $c > 0$ the theorem follows from $P_N = I + c_N Q + o(c_N)$, which converges to $I + cQ$ as $N$ tends to infinity. Assume now that $c = 0$. Then for $t \ge 0$ it follows that

\[ \big\| P_N^{[t/c_N]} - (I + c_N Q)^{[t/c_N]} \big\| \le [t/c_N]\, \big\| P_N - (I + c_N Q) \big\| \]

converges to zero as $N$ tends to infinity, i.e.

\[ \lim_{N \to \infty} P_N^{[t/c_N]} = \lim_{N \to \infty} (I + c_N Q)^{[t/c_N]} = e^{tQ} \qquad \forall\, t \ge 0. \]

This is equivalent to the convergence of the finite-dimensional distributions of the process $(R_{[t/c_N]})_{t \ge 0}$ to those of a process $(R_t)_{t \ge 0}$ with transition matrix $e^{tQ}$. The convergence in $D_{E_n}([0,\infty))$ can be shown using the method described in [9]. □

Theorem 3.2 There exists a subsequence $(N_l)_{l \in \mathbb{N}}$ such that $c := \lim_{l \to \infty} c_{N_l}$ exists and such that

\[ Q = (q_{\xi\eta})_{\xi,\eta \in E_n} := \lim_{l \to \infty} \frac{P_{N_l} - I}{c_{N_l}} \tag{10} \]

exists. Further $q_{\xi\eta} = 0$ for $\xi \not\subseteq \eta$, $q_{\xi\eta} \in [0,1]$ for $\xi \subset \eta$ and $\sum_{\eta \in E_n} q_{\xi\eta} = 0$ for all $\xi \in E_n$. If $c > 0$ then the ancestral process $(R_r)_{r \in \mathbb{N}_0}$ converges weakly as $l \to \infty$ to a discrete-time Markov process $(R_r)_{r \in \mathbb{N}_0}$ with initial state $R_0 = \Delta$ and transition matrix $I + cQ$. If $c = 0$ then the time-scaled ancestral process $(R_{[t/c_{N_l}]})_{t \ge 0}$ converges weakly as $l \to \infty$ in $D_{E_n}([0,\infty))$ to a continuous-time Markov process $(R_t)_{t \ge 0}$ with initial state $R_0 = \Delta$ and transition matrix $e^{tQ}$.

Proof. Consider $\xi, \eta \in E_n$ with $\xi \subset \eta$, i.e. $a < b = b_1 + \cdots + b_a$. Then not all the $b_i$ are equal to one.
Without loss of generality assume $b_1 \ge 2$ and consider the sum

\[ \begin{aligned} S := \sum_{\substack{i_1, \ldots, i_a \\ \text{all distinct}}} (\nu_{i_1})_{b_1} \cdots (\nu_{i_a})_{b_a} &\le \sum_{i_1} (\nu_{i_1})_2\, \nu_{i_1}^{b_1 - 2} \sum_{i_2, \ldots, i_a} \nu_{i_2}^{b_2} \cdots \nu_{i_a}^{b_a} \\ &\le \sum_{i} (\nu_i)_2\, N^{b_1 - 2}\, (\nu_1 + \cdots + \nu_N)^{b_2 + \cdots + b_a} = N^{b-2} \sum_{i} (\nu_i)_2. \end{aligned} \]

Thus

\[ \frac{p_{\xi\eta}}{c_N} = \frac{E(S)}{(N)_b\, c_N} \le \frac{N^{b-2}}{(N)_b\, c_N} \sum_{i} E((\nu_i)_2) = \frac{N^{b-2} (N)_2}{(N)_b} \le \frac{N^b}{(N)_b} \le \frac{n^n}{(n)_n}, \tag{11} \]

i.e. the sequence $(p_{\xi\eta}/c_N)_{N \in \mathbb{N}}$ is bounded, where the bound depends only on the sample size $n$. Now make use of $p_{\xi\xi} - 1 = -\sum_{\eta \ne \xi} p_{\xi\eta}$ to verify that the sequence $((p_{\xi\xi} - 1)/c_N)_{N \in \mathbb{N}}$ is also bounded, where the bound again depends only on $n$. Hence the matrix sequence $((P_N - I)/c_N)_{N \in \mathbb{N}}$ is bounded. Further $c_N \in [0,1]$ for all $N \in \mathbb{N}$, i.e. there exists a subsequence $(N_l)_{l \in \mathbb{N}}$ with the required properties. For $\xi \subset \eta$ the inequality $q_{\xi\eta} \le 1$ follows from $p_{\xi\eta}/c_N \le N^b/(N)_b$, whose right-hand side converges to $1$. The weak convergence results are derived as in the proof of Theorem 3.1. □

Final remarks. The generator $Q$ depends on the population model considered, i.e. $Q$ depends on the distribution of the offspring vectors $(\nu_1, \ldots, \nu_N)$ for large $N$. It would be desirable to characterize the class of all generators arising as a limit in (10). In order to do this assume without loss of generality that the subsequence $(N_l)_{l \in \mathbb{N}}$ is the full sequence of all integers. Thus we assume that $c := \lim_{N \to \infty} c_N$ exists and that

\[ Q = \lim_{N \to \infty} \frac{P_N - I}{c_N} \]

exists. Choosing the full relation $\eta := \Theta$, i.e. $a = |\eta| = 1$, in (9) it follows that the condition (I) of Theorem 2.1 is satisfied with $\phi_k := q_{\xi\Theta}$ as long as $|\xi| = k$. If the condition (II) is also satisfied it follows from Theorem 2.1 that $Q$ is of the form (6). On the other hand, if the condition (II) is not satisfied then Lemma 2.2 ensures that there exists a pair $(\xi, \eta)$ with $\xi \subset \eta$ which is not a $k$-merger such that $q_{\xi\eta} > 0$. Hence in this case $Q$ cannot be of the form (6). It is an open problem to characterize the generators $Q$ for the case when the condition (II) is not satisfied.

Finally, examples are presented where the condition (II) is not satisfied. From (8) it follows that for such examples the variables $(\nu_1)_2$ and $(\nu_2)_2$ are positively correlated or that $\lim_{N \to \infty} c_N \ne 0$.

Example 1: Assume for technical reasons that $N$ is even. Consider the model with $P(\nu_i = \nu_j = N/2) := p_N$ for all $i \ne j$ and with $P(\nu_1 = \cdots = \nu_N = 1) := q_N$, where the parameters $p_N, q_N > 0$ are such that $\binom{N}{2} p_N + q_N = 1$. Obviously $E((\nu_1)_2) = (N-1)\, p_N\, (N/2)_2 = (N)_3\, p_N / 4$

and hence the coalescence probability is given by $c_N = N(N-2)\, p_N / 4$. Further

\[ E((\nu_1)_2 (\nu_2)_2) = \big((N/2)_2\big)^2\, p_N = \Big(\frac{N(N-2)}{4}\Big)^2 p_N. \]

No matter how the parameter $p_N$ is chosen, the condition (II) is not satisfied, as

\[ \frac{E((\nu_1)_2 (\nu_2)_2)}{N^2 c_N} = \frac{N-2}{4N} \to \frac{1}{4}. \]

The variables $(\nu_1)_2$ and $(\nu_2)_2$ are positively correlated for $0 < p_N < (N-1)^{-2}$ and negatively correlated for $(N-1)^{-2} < p_N \le 2/(N(N-1))$. If $p_N$ is of order $N^{-2}$ then $c := \lim_{N \to \infty} c_N \ne 0$; if $p_N$ is of smaller order then $c = 0$.

    $p_N$             $c := \lim_{N \to \infty} c_N$    correlation between $(\nu_1)_2$ and $(\nu_2)_2$
    $N^{-3}$          $0$                               positive
    $N^{-2}$          $1/4$                             positive
    $2/(N(N-1))$      $1/2$                             negative

Note that the condition (I) is satisfied with $\phi_{k+2} = (1/2)^k$, which corresponds to the measure $\Lambda = \delta_{1/2}$ (see Lemma 2.3).

Example 2: Fix a constant $L \in \mathbb{N}$ and assume for technical reasons that $N$ is a multiple of $L$. Consider the population model where in each generation $L$ individuals each have $N/L$ offspring and all the other individuals have no offspring, i.e. $P(\nu_1 = \cdots = \nu_L = N/L,\ \nu_{L+1} = \cdots = \nu_N = 0) = \binom{N}{L}^{-1}$. For this model it follows that $E((\nu_1)_2) = (L/N)\,(N/L)_2 = (N-L)/L$, i.e. the coalescence probability is given by

\[ c_N = \frac{N-L}{L(N-1)} \to \frac{1}{L} =: c > 0. \]

Further

\[ E((\nu_1)_2 (\nu_2)_2) = \frac{(L-1)\, N\, (N-L)^2}{L^3 (N-1)} \]

and hence

\[ \frac{E((\nu_1)_2 (\nu_2)_2)}{N^2 c_N} = \frac{(L-1)(N-L)}{L^2 N} \to \frac{L-1}{L^2}, \]

i.e. the condition (II) is not satisfied for $L \ge 2$. From $L \le N$ it follows that the variables $(\nu_1)_2$ and $(\nu_2)_2$ are not positively correlated. Note that the condition (I) is satisfied with $\phi_{k+2} = (1/L)^k$, which corresponds to the measure $\Lambda = \delta_{1/L}$.

Acknowledgement. The authors wish to thank the organizers of the fifth international conference on mathematical population dynamics for steering such an interesting, interdisciplinary meeting, especially Ellen Baake for organizing the population genetics and evolutionary dynamics sessions. Further many thanks

to Adam Bobrowski, Ali Falahati, Marek Kimmel, and to all participants of the conference for sharing their scientific experiences. The research of the second author is supported by the Bank of Sweden Tercentenary Foundation project "Dependence and Interaction in Stochastic Population Dynamics".

References

[1] Cannings, C.: The latent roots of certain Markov chains arising in genetics: a new approach, I. Haploid models. Adv. Appl. Prob. 6, 260-290 (1974).
[2] Cannings, C.: The latent roots of certain Markov chains arising in genetics: a new approach, II. Further haploid models. Adv. Appl. Prob. 7, 264-282 (1975).
[3] Feller, W.: An Introduction to Probability Theory and Its Applications. Volume II, Second Edition, Wiley (1971).
[4] Kingman, J.F.C.: On the Genealogy of Large Populations. J. Appl. Prob. 19A, 27-43 (1982).
[5] Kingman, J.F.C.: Exchangeability and the Evolution of Large Populations. In: Koch, G. and Spizzichino, F.: Exchangeability in Probability and Statistics, North-Holland Publishing Company, pp. 97-112 (1982).
[6] Kingman, J.F.C.: The Coalescent. Stoch. Process. Appl. 13, 235-248 (1982).
[7] Möhle, M.: Robustness Results for the Coalescent. J. Appl. Prob. 35, 438-447 (1998).
[8] Möhle, M.: Ancestral Processes in Population Genetics - The Coalescent. Journal of Theoretical Biology (submitted 1998).
[9] Möhle, M.: Weak convergence to the coalescent in neutral population models. J. Appl. Prob. (to appear June 1999).
[10] Sagitov, S.: The General Coalescent with Asynchronous Mergers of Ancestral Lines. J. Appl. Prob. (submitted 1998).