A CHARACTERIZATION OF ANCESTRAL LIMIT PROCESSES ARISING IN HAPLOID POPULATION GENETICS MODELS

M. Möhle¹, Johannes Gutenberg-University, Mainz, and S. Sagitov², Chalmers University of Technology, Göteborg

¹ Postal address: Johannes Gutenberg-University Mainz, Department of Mathematics, Saarstraße 21, 55099 Mainz, Germany; e-mail: moehle@mathematik.uni-mainz.de
² Postal address: Chalmers University of Technology, Department of Mathematical Statistics, 41296 Göteborg, Sweden; e-mail: serik@math.chalmers.se

Abstract

The classical n-coalescent introduced by Kingman is not the only possible limit process arising in ancestral population genetics. Under smooth conditions other limit processes do appear, where multiple mergers of ancestral lines happen with positive probability. These processes are characterized by a probability measure $\Lambda$ on the interval $[0,1]$. The classical n-coalescent is the special case when this measure is the point measure in 0, and in this case only pairwise coalescent events appear. Simple necessary and sufficient conditions on the underlying haploid population model are presented which ensure that the ancestral process converges weakly to such a limit process if the population size tends to infinity.

ANCESTRAL PROCESS; COALESCENT; EXCHANGEABILITY; NEUTRALITY; POPULATION GENETICS; WEAK CONVERGENCE

AMS 1991 SUBJECT CLASSIFICATION: Primary 92D25, 60J70; Secondary 92D15, 60F17

1 Introduction

Consider the haploid population models with non-overlapping generations $r \in \mathbb{N}_0 := \{0, 1, 2, \dots\}$ and fixed population size $N \in \mathbb{N} := \{1, 2, \dots\}$ introduced by Cannings [1, 2]. These models are characterized by exchangeable random variables $\nu_1, \dots, \nu_N$ with values in $\{0, \dots, N\}$, where $\nu_i$ denotes the number of offspring of the i-th individual. As the population size is fixed, the condition

$$\nu_1 + \cdots + \nu_N = N \tag{1}$$

has to be satisfied. Fix $n \le N$ and sample n individuals at random from the 0-th generation. For $r \in \mathbb{N}_0$ let $R_r$ denote the equivalence relation which contains the pair $(i,j)$ if and only if the i-th and the j-th individual of this sample have a common ancestor in the r-th generation backwards in time. The process $(R_r)_{r \in \mathbb{N}_0}$ is a time homogeneous Markov chain with state space $E_n$, the set of all equivalence relations on $\{1, \dots, n\}$, and initial value $R_0 = \Delta := \{(i,i) \mid i \in \{1, \dots, n\}\}$.

Obviously, for $\xi, \eta \in E_n$ the transition probability $p_{\xi\eta} := P(R_r = \eta \mid R_{r-1} = \xi)$ is equal to zero for $\xi \not\subseteq \eta$. Assume now that $\xi \subseteq \eta$. In analogy to Kingman [4, 5, 6] let $C_1, \dots, C_a$ denote the equivalence classes of $\eta$ and let $C_{\alpha\beta}$, $\alpha \in \{1, \dots, a\}$, $\beta \in \{1, \dots, b_\alpha\}$, denote the equivalence classes of $\xi$ such that $C_\alpha = \bigcup_{\beta=1}^{b_\alpha} C_{\alpha\beta}$. The transition probability is given by

$$p_{\xi\eta} = \frac{1}{(N)_b} \sum_{\substack{i_1, \dots, i_a \\ \text{all distinct}}} E\big((\nu_{i_1})_{b_1} \cdots (\nu_{i_a})_{b_a}\big) = \frac{(N)_a}{(N)_b}\, E\big((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}\big), \tag{2}$$

where $b := |\xi| = b_1 + \cdots + b_a$ is the number of equivalence classes of $\xi$ and the notation $(x)_b := x(x-1)\cdots(x-b+1)$ is used.

Let $c_N$ denote the probability that two individuals, chosen randomly without replacement from some generation, have a common ancestor one generation backwards in time, i.e.

$$c_N := \frac{1}{(N)_2} \sum_{i=1}^{N} E\big((\nu_i)_2\big) = \frac{E\big((\nu_1)_2\big)}{N-1} = \frac{\mathrm{Var}(\nu_1)}{N-1} = 1 - E(\nu_1\nu_2). \tag{3}$$

This probability, called the coalescence probability, is of fundamental interest in coalescent theory, as $c_N^{-1}$ is the proper time scale to get convergence to the coalescent. For technical reasons it is assumed that $c_N > 0$ for all N. Note that $c_N = 0$ if and only if $\nu_1 = 1$ almost surely. The coalescence probability is also important as it is directly connected via $c_N = 1 - \lambda_2$ to the eigenvalue $\lambda_2 := E(\nu_1\nu_2)$ of the transition matrix of the descendant process, i.e. the genealogical process looking forwards in time (see [1]). Note that for $N \ge 2$ the correlation coefficient between $\nu_1$ and $\nu_2$ is given by

$$\rho(\nu_1, \nu_2) = \frac{E(\nu_1\nu_2) - 1}{\mathrm{Var}(\nu_1)} = -\frac{1}{N-1} < 0. \tag{4}$$

For a large class of models, for example for the Moran model and the Wright-Fisher model (Kingman [5]), it is well known that the finite-dimensional distributions of the time-scaled ancestral process $(R_{[t/c_N]})_{t \ge 0}$ converge to those of the (classical) n-coalescent. The n-coalescent is a time continuous Markov process with state space $E_n$, initial state $\Delta$ and infinitesimal generator $Q = (q_{\xi\eta})_{\xi,\eta \in E_n}$ given by

$$q_{\xi\eta} = \begin{cases} -|\xi|(|\xi|-1)/2 & \text{if } \xi = \eta, \\ 1 & \text{if } \xi \prec \eta, \\ 0 & \text{otherwise,} \end{cases} \tag{5}$$

where $\xi \prec \eta$ if and only if $\xi \subseteq \eta$ and $|\xi| = |\eta| + 1$, i.e. $\eta$ is obtained by merging two equivalence classes of $\xi$. These convergence results are based on an expansion of the transition probabilities of the form

$$p_{\xi\eta} = \delta_{\xi\eta} + c_N q_{\xi\eta} + o(c_N), \qquad \xi, \eta \in E_n,$$

which is often written in matrix notation as

$$P_N = I + c_N Q + o(c_N).$$
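As a quick sanity check of (3), the coalescence probability can be evaluated exactly for the two classical models mentioned above. The following Python sketch is illustrative only: the helper names (`coalescence_probability`, `wright_fisher_pmf`, `moran_pmf`) are not from the paper, and it assumes the standard marginal offspring laws, namely $\nu_1 \sim \mathrm{Binomial}(N, 1/N)$ for the Wright-Fisher model and $P(\nu_1 = 0) = P(\nu_1 = 2) = 1/N$ for the Moran model.

```python
from fractions import Fraction
from math import comb


def coalescence_probability(pmf, N):
    """c_N = E((nu_1)_2) / (N - 1), computed from a marginal pmf {i: P(nu_1 = i)}."""
    second_factorial_moment = sum(Fraction(i * (i - 1)) * p for i, p in pmf.items())
    return second_factorial_moment / (N - 1)


def wright_fisher_pmf(N):
    """Assumed marginal law of nu_1 in the Wright-Fisher model: Binomial(N, 1/N)."""
    p = Fraction(1, N)
    return {i: comb(N, i) * p**i * (1 - p) ** (N - i) for i in range(N + 1)}


def moran_pmf(N):
    """Assumed marginal law of nu_1 in the Moran model (one birth, one death per step)."""
    return {0: Fraction(1, N), 1: Fraction(N - 2, N), 2: Fraction(1, N)}


if __name__ == "__main__":
    for N in (10, 100):
        print(N, coalescence_probability(wright_fisher_pmf(N), N),
              coalescence_probability(moran_pmf(N), N))
```

Under these assumptions the sketch returns $c_N = 1/N$ for the Wright-Fisher model and $c_N = 2/(N(N-1))$ for the Moran model, so the natural time scale $c_N^{-1}$ is of order $N$ and $N^2$ generations respectively.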

Here $P_N := (p_{\xi\eta})_{\xi,\eta \in E_n}$ denotes the transition matrix of the ancestral process. For a large class of models this expansion is valid ([8]). Recently (see [10]) this expansion has been extended to a wider class of generators Q. In order to describe this class of generators the following definition is needed:

Definition. A pair $(\xi, \eta)$, $\xi, \eta \in E_n$, is called a k-merger, $k \in \{2, \dots, |\xi|\}$, if $\xi \subset \eta$ and if $b_1 = k$ and $b_2 = \cdots = b_a = 1$, i.e. if $\eta$ has exactly one equivalence class which is a union of k classes of $\xi$ and all the other classes of $\eta$ are also classes of $\xi$. Use the notation $\xi \prec_k \eta$ if $(\xi, \eta)$ is a k-merger.

Remarks. If $\xi \prec_k \eta$ then $|\xi| = |\eta| + k - 1$. If $\Theta := \{(i,j) \mid i, j \in \{1, \dots, n\}\}$ denotes the full relation, then for each $\xi \in E_n$ with $\xi \ne \Theta$ the pair $(\xi, \Theta)$ is a $|\xi|$-merger. If $n \le 3$ then for all $\xi, \eta \in E_n$ with $\xi \subset \eta$ the pair $(\xi, \eta)$ is a $(|\xi| - |\eta| + 1)$-merger.

The class of generators Q is characterized by an arbitrary probability measure $\Lambda$ on the interval $[0,1]$:

$$q_{\xi\eta} = \begin{cases} -\displaystyle\int_{[0,1]} \frac{1 - (1-x)^{b-1}(1 - x + bx)}{x^2}\, \Lambda(dx) & \text{if } \xi = \eta, \\[2ex] \displaystyle\int_{[0,1]} x^{k-2}(1-x)^{b-k}\, \Lambda(dx) & \text{if } \xi \prec_k \eta, \\[2ex] 0 & \text{otherwise,} \end{cases} \tag{6}$$

where $b := |\xi|$. For $\Lambda = \delta_0$ the entries (6) are identical to the entries (5) of the standard n-coalescent. Another example is given in [8]. The class $\Lambda([0,x]) = x^{1-\alpha}$, $\alpha \in (0,1)$, is considered in more detail in [10].

In the next section it is shown that under smooth conditions, which depend only on the common distribution of the variables $\nu_1$ and $\nu_2$, the generator $Q := \lim_{N\to\infty} (P_N - I)/c_N$ exists and has the form (6). The conditions are similar (and certainly equivalent) to the conditions given in [10], but they have a simpler form.

2 The convergence theorem

In this section we first present the "if"-part of the main convergence result (Theorem 2.1), and most of the section deals with the proof of this theorem. At the end of the section the "only-if"-part is considered (Theorem 2.6), which is very easy to verify.

Theorem 2.1 Assume that the following two conditions are satisfied:

(I) $\displaystyle \varphi_k := \lim_{N\to\infty} \frac{N^{1-k}}{c_N}\, E\big((\nu_1)_k\big)$ exists for all $k \ge 2$.

(II) $\displaystyle \lim_{N\to\infty} \frac{1}{N^2 c_N}\, E\big((\nu_1)_2 (\nu_2)_2\big) = 0$.

Then for each sample size $n \in \mathbb{N}$ the limit

$$Q = (q_{\xi\eta})_{\xi,\eta \in E_n} := \lim_{N\to\infty} \frac{P_N - I}{c_N}$$

exists and the entries of Q have the form (6), where the probability measure $\Lambda$ on $[0,1]$ is uniquely determined via its moments $\int_{[0,1]} x^k\, \Lambda(dx) = \varphi_{k+2}$, $k \in \mathbb{N}_0$.

Remarks.

1. From $E((\nu_1)_2) = (N-1)c_N$ it follows that $\varphi_2 = 1$. For $k \ge 3$ it follows that

$$\frac{N^{1-k}}{c_N}\, E\big((\nu_1)_k\big) \le \frac{N^{1-k} N^{k-2}}{c_N}\, E\big((\nu_1)_2\big) = \frac{N-1}{N} \le 1,$$

i.e. the sequence $(N^{1-k} c_N^{-1} E((\nu_1)_k))_{N \in \mathbb{N}}$ is bounded. Hence there exists a subsequence $(N_l)_{l \in \mathbb{N}}$ with $\lim_{l\to\infty} N_l = \infty$ such that

$$\varphi_k = \lim_{l\to\infty} \frac{N_l^{1-k}}{c_{N_l}}\, E\big((\nu_1)_k\big)$$

exists. Thus the condition (I) is not as strong as it seems to be at first glance. We will see in Lemma 2.3 that the condition (I) ensures the existence of the measure $\Lambda$. This measure is uniquely determined by the numbers $\varphi_k$, i.e. it depends only on the distribution of the offspring variable $\nu_1$. The condition (I) is in fact equivalent ([10], Equation (3)) to the following tail condition: There exists a probability measure $\Lambda$ on the interval $[0,1]$ such that the convergence

$$\lim_{N\to\infty} \frac{N}{c_N}\, P(\nu_1 > Nx) = \int_{[x,1]} y^{-2}\, \Lambda(dy) \tag{7}$$

holds at all points $x \in (0,1)$ where the limit function is continuous.

2. The condition (II) is for example satisfied if $\lim_{N\to\infty} c_N = 0$ and if the random variables $(\nu_1)_2$ and $(\nu_2)_2$ are not positively correlated, i.e.

$$E\big((\nu_1)_2 (\nu_2)_2\big) \le \big(E((\nu_1)_2)\big)^2 = \big((N-1)c_N\big)^2 \le N^2 c_N^2. \tag{8}$$

It is shown in the next lemma that the condition (II) is equivalent to

$$\lim_{N\to\infty} \frac{1}{N^2 c_N}\, E\big((\nu_1 - 1)^2 (\nu_2 - 1)^2\big) = 0,$$

which corresponds to the condition (4) in [10] for $a = 2$.
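To see the two conditions of Theorem 2.1 at work, the following Python sketch evaluates the pre-limit quantities in (I) and (II) exactly for the Wright-Fisher model. It is a sketch under stated assumptions, not part of the paper: the function names are hypothetical, and it relies on the standard formula for multinomial factorial moments, $E((\nu_1)_{k_1}(\nu_2)_{k_2}) = (N)_{k_1+k_2}/N^{k_1+k_2}$, for offspring vectors that are multinomial with N trials and equal cell probabilities $1/N$.

```python
from fractions import Fraction


def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    out = 1
    for j in range(k):
        out *= x - j
    return out


def wf_factorial_moment(N, *ks):
    """E((nu_1)_{k_1} (nu_2)_{k_2} ...) for the Wright-Fisher (multinomial) model."""
    k = sum(ks)
    return Fraction(falling(N, k), N**k)


def c_N(N):
    """Coalescence probability c_N = E((nu_1)_2) / (N - 1)."""
    return wf_factorial_moment(N, 2) / (N - 1)


def phi(N, k):
    """Pre-limit quantity of condition (I): N^(1-k) E((nu_1)_k) / c_N."""
    return Fraction(1, N ** (k - 1)) * wf_factorial_moment(N, k) / c_N(N)


def condition_two(N):
    """Pre-limit quantity of condition (II): E((nu_1)_2 (nu_2)_2) / (N^2 c_N)."""
    return wf_factorial_moment(N, 2, 2) / (N**2 * c_N(N))


if __name__ == "__main__":
    for N in (10, 100, 1000):
        print(N, float(phi(N, 2)), float(phi(N, 3)), float(condition_two(N)))
```

In this model $\varphi_2(N) = (N-1)/N$ tends to 1, while the quantities for $k \ge 3$ and the quantity in (II) are of order $1/N$, which is consistent with $\Lambda = \delta_0$, i.e. with the classical n-coalescent.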

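The proof of Theorem 2.1 is split up into several parts. First it is shown (Lemma 2.2) that condition (II) is equivalent to $q_{\xi\eta} = 0$ for all $\xi, \eta \in E_n$ with $\xi \subset \eta$ such that $(\xi, \eta)$ is not a k-merger. In Lemma 2.3 the existence of the probability measure $\Lambda$ is derived, and afterwards (Lemma 2.4 and Corollary 2.5) the formula for $q_{\xi\eta}$ is shown if $(\xi, \eta)$ is a k-merger. These proofs differ from the proofs given in [10]. Finally the case $\xi = \eta$ is considered.

Lemma 2.2 The following conditions are equivalent:

(i) $\displaystyle \lim_{N\to\infty} \frac{1}{N^2 c_N}\, E\big((\nu_1)_2 (\nu_2)_2\big) = 0$.

(ii) $\displaystyle \lim_{N\to\infty} \frac{1}{N^2 c_N}\, E\big((\nu_1 - 1)^2 (\nu_2 - 1)^2\big) = 0$.

(iii) $\displaystyle \lim_{N\to\infty} \frac{N^{a-b}}{c_N}\, E\big((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}\big) = 0$ for all $a \ge 2$, $b_1, b_2 \ge 2$ and $b_3, \dots, b_a \ge 1$, where $b := b_1 + \cdots + b_a$.

(iv) $\displaystyle q_{\xi\eta} := \lim_{N\to\infty} \frac{p_{\xi\eta}}{c_N} = 0$ for all $n \in \mathbb{N}$ and all $\xi, \eta \in E_n$ with $\xi \subset \eta$ such that the pair $(\xi, \eta)$ is not a k-merger.

Remark. Note that the condition (i) is exactly the condition (II) of the convergence Theorem 2.1.

Proof. (i) $\Leftrightarrow$ (ii): Note first that

$$(\nu_1 - 1)^2 (\nu_2 - 1)^2 = \big((\nu_1)_2 - (\nu_1 - 1)\big)\big((\nu_2)_2 - (\nu_2 - 1)\big) = (\nu_1)_2 (\nu_2)_2 - (\nu_1 - 1)(\nu_2)_2 - (\nu_2 - 1)(\nu_1)_2 + (\nu_1 - 1)(\nu_2 - 1).$$

Using $E((\nu_1 - 1)(\nu_2 - 1)) = E(\nu_1\nu_2) - 1 = -c_N$ and exchangeability it follows that

$$\begin{aligned} E\big((\nu_1 - 1)^2 (\nu_2 - 1)^2\big) - E\big((\nu_1)_2 (\nu_2)_2\big) &= -2E\big((\nu_1)_2 (\nu_2 - 1)\big) - c_N \\ &= -2E\big((\nu_1)_2 \nu_2\big) + 2E\big((\nu_1)_2\big) - c_N \\ &= -2E\big((\nu_1)_2 \nu_2\big) + 2(N-1)c_N - c_N \\ &= -2E\big((\nu_1)_2 \nu_2\big) + (2N-3)c_N. \end{aligned}$$

Now $(N-1)E((\nu_1)_2 \nu_2) = E((\nu_1)_2 (\nu_2 + \cdots + \nu_N)) \le N E((\nu_1)_2) = N(N-1)c_N$, i.e. $E((\nu_1)_2 \nu_2) = O(N c_N)$. Thus $E((\nu_1 - 1)^2 (\nu_2 - 1)^2) - E((\nu_1)_2 (\nu_2)_2) = O(N c_N)$ and the equivalence of (i) and (ii) follows immediately.

(i) $\Rightarrow$ (iii): Consider the sum

$$\begin{aligned} S := \sum_{\substack{i_1, \dots, i_a \\ \text{all distinct}}} (\nu_{i_1})_{b_1} \cdots (\nu_{i_a})_{b_a} &\le \sum_{\substack{i_1, i_2 \\ i_1 \ne i_2}} (\nu_{i_1})_2\, \nu_{i_1}^{b_1 - 2}\, (\nu_{i_2})_2\, \nu_{i_2}^{b_2 - 2} \sum_{i_3, \dots, i_a} \nu_{i_3}^{b_3} \cdots \nu_{i_a}^{b_a} \\ &\le \sum_{\substack{i, j \\ i \ne j}} (\nu_i)_2\, N^{b_1 - 2}\, (\nu_j)_2\, N^{b_2 - 2}\, (\nu_1 + \cdots + \nu_N)^{b_3 + \cdots + b_a} \\ &= N^{b-4} \sum_{\substack{i, j \\ i \ne j}} (\nu_i)_2 (\nu_j)_2, \end{aligned}$$

and hence

$$E(S) \le N^{b-4} \sum_{\substack{i, j \\ i \ne j}} E\big((\nu_i)_2 (\nu_j)_2\big) \le N^{b-2}\, E\big((\nu_1)_2 (\nu_2)_2\big).$$

Thus

$$\frac{N^{a-b}}{c_N}\, E\big((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}\big) \sim \frac{(N)_a}{(N)_b\, c_N}\, E\big((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}\big) = \frac{E(S)}{(N)_b\, c_N} \le \frac{N^{b-2}}{(N)_b\, c_N}\, E\big((\nu_1)_2 (\nu_2)_2\big) \sim \frac{E\big((\nu_1)_2 (\nu_2)_2\big)}{N^2 c_N},$$

which converges to zero by assumption.

(iii) $\Rightarrow$ (i): Choose $a := b_1 := b_2 := 2$ in (iii) and (i) follows immediately.

(iii) $\Rightarrow$ (iv): Let $\xi, \eta \in E_n$ be such that $\xi \subset \eta$ and that $(\xi, \eta)$ is not a k-merger. Then $a \ge 2$ and (without loss of generality) $b_1, b_2 \ge 2$, and hence

$$\frac{p_{\xi\eta}}{c_N} = \frac{(N)_a}{(N)_b\, c_N}\, E\big((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}\big) \sim \frac{N^{a-b}}{c_N}\, E\big((\nu_1)_{b_1} \cdots (\nu_a)_{b_a}\big),$$

which converges to zero as $N \to \infty$.

(iv) $\Rightarrow$ (i): Choose $n = 4$, $\xi := \Delta$, and let $\eta \in E_n$ denote the equivalence relation with classes $\{1, 2\}$ and $\{3, 4\}$. Then

$$p_{\xi\eta} = \frac{(N)_2}{(N)_4}\, E\big((\nu_1)_2 (\nu_2)_2\big)$$

and (i) follows from $\lim_{N\to\infty} p_{\xi\eta}/c_N = 0$, as $(\xi, \eta)$ is not a k-merger. $\Box$

Lemma 2.3 Let the condition (I) be satisfied. Then there exists a probability measure $\Lambda$ on $[0,1]$ such that

$$\varphi_k = \lim_{N\to\infty} \frac{N^{1-k}}{c_N}\, E\big((\nu_1)_k\big) = \int_{[0,1]} x^{k-2}\, \Lambda(dx)$$

for all $k \ge 2$.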

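Proof. Let $Y_N$ be a random variable with distribution

$$P(Y_N = i) := \frac{(i)_2}{E\big((\nu_1)_2\big)}\, P(\nu_1 = i) = \frac{i(i-1)}{(N-1)c_N}\, P(\nu_1 = i),$$

where $i \in \{0, \dots, N\}$. The k-th moment of $X_N := Y_N/N$ is then given by

$$E(X_N^k) = \sum_{i=0}^{N} \Big(\frac{i}{N}\Big)^k P(Y_N = i) = \sum_{i=0}^{N} \Big(\frac{i}{N}\Big)^k \frac{i(i-1)}{(N-1)c_N}\, P(\nu_1 = i) = \frac{N^{-k}}{(N-1)c_N}\, E\big(\nu_1^{k+2} - \nu_1^{k+1}\big).$$

Using the equation $t^k = \sum_{l=1}^{k} S_{kl}\, (t)_l$ for all $t \in \mathbb{R}$ and all $k \in \mathbb{N}$, where $S_{kl}$ are the Stirling numbers of the second kind, this leads to

$$\begin{aligned} E(X_N^k) &= \frac{N^{-k}}{(N-1)c_N}\, E\Big(\sum_{l=1}^{k+2} S_{k+2,l}\, (\nu_1)_l - \sum_{l=1}^{k+1} S_{k+1,l}\, (\nu_1)_l\Big) \\ &= \frac{N^{-k}}{(N-1)c_N}\, E\Big((\nu_1)_{k+2} + \sum_{l=2}^{k+1} (S_{k+2,l} - S_{k+1,l})\, (\nu_1)_l\Big) \\ &= \frac{N^{-k}}{(N-1)c_N}\, E\big((\nu_1)_{k+2}\big) + \sum_{l=2}^{k+1} (S_{k+2,l} - S_{k+1,l})\, \frac{N^{-k}}{(N-1)c_N}\, E\big((\nu_1)_l\big), \end{aligned}$$

which converges to $\varphi_{k+2}$ under condition (I). Note that $0 \le X_N \le 1$ almost surely. It is known (see [3], Chapter 8, Section 1) that for a sequence of probability measures on $[0,1]$ the convergence of the moments is equivalent to the weak convergence of the sequence $(X_N)_{N \in \mathbb{N}}$ to a limit X, where the distribution $\Lambda := P_X$ of X is uniquely determined by its k-th moments $E(X^k) = \lim_{N\to\infty} E(X_N^k) = \varphi_{k+2}$. Thus

$$\varphi_{k+2} = E(X^k) = \int_{[0,1]} x^k\, \Lambda(dx)$$

for all $k \in \mathbb{N}_0$ and the lemma is established. $\Box$

Lemma 2.4 If the two conditions (I) and (II) of Theorem 2.1 are satisfied then

$$\lim_{N\to\infty} \frac{N^{1-k}}{c_N}\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\big) = \int_{[0,1]} x^{k-2}(1-x)^{a-1}\, \Lambda(dx)$$

for all $k \ge 2$ and all $a \ge 1$.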

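Proof. By induction over a. For $a = 1$ this follows from Lemma 2.3. Assume that the formula is valid for some $a \ge 1$ and all $k \ge 2$. Then using the exchangeability of the offspring variables $\nu_1, \dots, \nu_N$ and (1) it follows that

$$\begin{aligned} (N - a)\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_{a+1}\big) &= \sum_{i=a+1}^{N} E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\, \nu_i\big) \\ &= E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\, (\nu_{a+1} + \cdots + \nu_N)\big) \\ &= E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\, (N - \nu_1 - \cdots - \nu_a)\big) \\ &= E\Big((\nu_1)_k\, \nu_2 \cdots \nu_a\, \Big(N - k - a + 1 - (\nu_1 - k) - \sum_{i=2}^{a} (\nu_i - 1)\Big)\Big) \\ &= (N - k - a + 1)\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\big) - E\big((\nu_1)_{k+1}\, \nu_2 \cdots \nu_a\big) - \sum_{i=2}^{a} E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\, (\nu_i - 1)\big) \\ &= (N - k - a + 1)\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\big) - E\big((\nu_1)_{k+1}\, \nu_2 \cdots \nu_a\big) - (a-1)\, E\big((\nu_1)_k\, (\nu_2)_2\, \nu_3 \cdots \nu_a\big). \end{aligned}$$

Multiplying this equation by $N^{-k}/c_N$ and taking the limit $N \to \infty$ leads to

$$\lim_{N\to\infty} \frac{N^{1-k}}{c_N}\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_{a+1}\big) = \lim_{N\to\infty} \frac{N^{1-k}}{c_N}\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\big) - \lim_{N\to\infty} \frac{N^{-k}}{c_N}\, E\big((\nu_1)_{k+1}\, \nu_2 \cdots \nu_a\big) - (a-1) \lim_{N\to\infty} \frac{N^{-k}}{c_N}\, E\big((\nu_1)_k\, (\nu_2)_2\, \nu_3 \cdots \nu_a\big).$$

The last limit is equal to zero by Lemma 2.2 (iii). Thus

$$\begin{aligned} \lim_{N\to\infty} \frac{N^{1-k}}{c_N}\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_{a+1}\big) &= \lim_{N\to\infty} \frac{N^{1-k}}{c_N}\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\big) - \lim_{N\to\infty} \frac{N^{1-(k+1)}}{c_N}\, E\big((\nu_1)_{k+1}\, \nu_2 \cdots \nu_a\big) \\ &= \int_{[0,1]} x^{k-2}(1-x)^{a-1}\, \Lambda(dx) - \int_{[0,1]} x^{k-1}(1-x)^{a-1}\, \Lambda(dx) \\ &= \int_{[0,1]} x^{k-2}(1-x)^{a}\, \Lambda(dx). \end{aligned}$$

Thus the formula is valid for $a + 1$ and the induction is done. $\Box$

Corollary 2.5 Let the conditions (I) and (II) be satisfied. If $(\xi, \eta)$ is a k-merger ($k \ge 2$) then

$$q_{\xi\eta} := \lim_{N\to\infty} \frac{p_{\xi\eta}}{c_N} = \int_{[0,1]} x^{k-2}(1-x)^{a-1}\, \Lambda(dx),$$

where $a := |\eta|$.

Proof. Let $(\xi, \eta)$ be a k-merger ($k \ge 2$). With the notation $b := |\xi|$ and $a := |\eta|$ it follows that

$$q_{\xi\eta} = \lim_{N\to\infty} \frac{p_{\xi\eta}}{c_N} = \lim_{N\to\infty} \frac{(N)_a}{(N)_b\, c_N}\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\big) = \lim_{N\to\infty} \frac{N^{a-b}}{c_N}\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\big) = \lim_{N\to\infty} \frac{N^{1-k}}{c_N}\, E\big((\nu_1)_k\, \nu_2 \cdots \nu_a\big), \tag{9}$$

as $a = b - k + 1$. The corollary follows now directly from Lemma 2.4. $\Box$

In order to finish the proof of Theorem 2.1 it remains to consider the diagonal entries $q_{\xi\xi}$. For $\xi \in E_n$ with $b := |\xi|$ it follows from $\sum_{\eta \in E_n} p_{\xi\eta} = 1$ that

$$\begin{aligned} q_{\xi\xi} = -\sum_{\substack{\eta \in E_n \\ \eta \ne \xi}} q_{\xi\eta} &= -\sum_{k=2}^{b} \sum_{\substack{\eta \in E_n \\ (\xi,\eta) \text{ is a } k\text{-merger}}} q_{\xi\eta} \\ &= -\sum_{k=2}^{b} \binom{b}{k} \int_{[0,1]} x^{k-2}(1-x)^{b-k}\, \Lambda(dx) \\ &= -\int_{[0,1]} x^{-2} \sum_{k=2}^{b} \binom{b}{k} x^k (1-x)^{b-k}\, \Lambda(dx) \\ &= -\int_{[0,1]} x^{-2} \big(1 - (1-x)^b - bx(1-x)^{b-1}\big)\, \Lambda(dx), \end{aligned}$$

and the proof of Theorem 2.1 is finished.

At the end of this section the "only-if"-part, i.e. the converse of Theorem 2.1, is presented, which is simple to prove.

Theorem 2.6 Assume that for each sample size $n \in \mathbb{N}$ the limit $Q := (q_{\xi\eta})_{\xi,\eta \in E_n} := \lim_{N\to\infty} (P_N - I)/c_N$ exists such that $q_{\xi\eta} = 0$ if $\xi \ne \eta$ and the pair $(\xi, \eta)$ is not a k-merger. Then the conditions (I) and (II) are satisfied and Q has the form (6), where the probability measure $\Lambda$ on the interval $[0,1]$ is uniquely determined via its moments $\int_{[0,1]} x^{k-2}\, \Lambda(dx) = \lim_{N\to\infty} N^{1-k} c_N^{-1} E((\nu_1)_k)$, $k \ge 2$.

Proof. Fix $k \ge 2$, $n \ge k$ and choose $\xi \in E_n$ such that $|\xi| = k$, and let $\eta := \Theta$ be the full relation. Using the equation (9) for $a = 1$ it follows that the condition (I) is satisfied with $\varphi_k := q_{\xi\eta} = \lim_{N\to\infty} N^{1-k} c_N^{-1} E((\nu_1)_k)$. The implication from (iv) to (i) in Lemma 2.2 ensures that the condition (II) is also satisfied. Now apply Theorem 2.1. $\Box$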
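The computation of the diagonal entries rests on the binomial identity $\sum_{k=2}^{b} \binom{b}{k} x^{k-2}(1-x)^{b-k} = x^{-2}\big(1 - (1-x)^b - bx(1-x)^{b-1}\big)$. The Python sketch below checks numerically that the rows of a generator of the form (6) sum to zero. It is only an illustration under an added assumption: $\Lambda$ is taken to be a finite mixture of point masses, represented by a hypothetical list `atoms` of pairs (location, weight), so that all integrals reduce to exact finite sums. The function names are not from the paper.

```python
from fractions import Fraction
from math import comb


def merger_rate(b, k, atoms):
    """Off-diagonal entry of (6) for one specific k-merger out of b classes:
    the integral of x^(k-2) (1-x)^(b-k) against an atomic measure Lambda."""
    return sum(w * x ** (k - 2) * (1 - x) ** (b - k) for x, w in atoms)


def total_rate(b, atoms):
    """Minus the diagonal entry of (6):
    the integral of (1 - (1-x)^b - b x (1-x)^(b-1)) / x^2 against Lambda."""
    total = Fraction(0)
    for x, w in atoms:
        if x == 0:
            # The integrand extends continuously to x = 0 with value b(b-1)/2.
            total += w * Fraction(b * (b - 1), 2)
        else:
            total += w * (1 - (1 - x) ** b - b * x * (1 - x) ** (b - 1)) / x**2
    return total


def row_sum_is_zero(b, atoms):
    """There are C(b, k) possible k-mergers out of a state with b classes."""
    off_diagonal = sum(comb(b, k) * merger_rate(b, k, atoms) for k in range(2, b + 1))
    return off_diagonal == total_rate(b, atoms)


if __name__ == "__main__":
    kingman = [(Fraction(0), Fraction(1))]  # Lambda = delta_0
    mixture = [(Fraction(0), Fraction(1, 3)),
               (Fraction(1, 2), Fraction(1, 3)),
               (Fraction(1), Fraction(1, 3))]
    for b in range(2, 8):
        print(b, total_rate(b, kingman), row_sum_is_zero(b, mixture))
```

For $\Lambda = \delta_0$ only binary mergers have positive rate and the total rate reduces to $b(b-1)/2$, in agreement with (5).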

3 Further convergence results and open problems

Theorem 3.1 Assume that the conditions (I) and (II) of Theorem 2.1 are satisfied and assume further that $c := \lim_{N\to\infty} c_N$ exists. If $c > 0$ then the ancestral process $(R_r)_{r \in \mathbb{N}_0}$ converges as $N \to \infty$ weakly to a discrete time Markov process $(R_r)_{r \in \mathbb{N}_0}$ with initial state $R_0 = \Delta$ and transition matrix $I + cQ$, where Q is given as in Theorem 2.1. If $c = 0$ then the time-scaled ancestral process $(R_{[t/c_N]})_{t \ge 0}$ converges as $N \to \infty$ weakly in $D_{E_n}([0,\infty))$ to a continuous time Markov process $(R_t)_{t \ge 0}$ with initial state $R_0 = \Delta$ and transition matrix $e^{tQ}$, where Q is again given as in Theorem 2.1.

Proof. From Theorem 2.1 it follows that $Q := \lim_{N\to\infty} (P_N - I)/c_N$ is of the form (6), where the measure $\Lambda$ is uniquely determined by its moments $\int_{[0,1]} x^k\, \Lambda(dx) = \varphi_{k+2}$, $k \in \mathbb{N}_0$. For $c > 0$ the theorem follows from $P_N = I + c_N Q + o(c_N)$, which converges to $I + cQ$ as N tends to infinity. Assume now that $c = 0$. Then for $t \ge 0$ it follows that

$$\big\| P_N^{[t/c_N]} - (I + c_N Q)^{[t/c_N]} \big\| \le [t/c_N]\, \big\| P_N - (I + c_N Q) \big\|$$

converges to zero as N tends to infinity, i.e.

$$\lim_{N\to\infty} P_N^{[t/c_N]} = \lim_{N\to\infty} (I + c_N Q)^{[t/c_N]} = e^{tQ} \qquad \forall\, t \ge 0.$$

This is equivalent to the convergence of the finite-dimensional distributions of the process $(R_{[t/c_N]})_{t \ge 0}$ to those of a process $(R_t)_{t \ge 0}$ with transition matrix $e^{tQ}$. The convergence in $D_{E_n}([0,\infty))$ can be shown using the method described in [9]. $\Box$

Theorem 3.2 There exists a subsequence $(N_l)_{l \in \mathbb{N}}$ such that $c := \lim_{l\to\infty} c_{N_l}$ exists and such that

$$Q = (q_{\xi\eta})_{\xi,\eta \in E_n} := \lim_{l\to\infty} \frac{P_{N_l} - I}{c_{N_l}} \tag{10}$$

exists. Further $q_{\xi\eta} = 0$ for $\xi \not\subseteq \eta$, $q_{\xi\eta} \in [0,1]$ for $\xi \subset \eta$ and $\sum_{\eta \in E_n} q_{\xi\eta} = 0$ for all $\xi \in E_n$. If $c > 0$ then the ancestral process $(R_r)_{r \in \mathbb{N}_0}$ converges as $l \to \infty$ weakly to a discrete time Markov process $(R_r)_{r \in \mathbb{N}_0}$ with initial state $R_0 = \Delta$ and transition matrix $I + cQ$. If $c = 0$ then the time-scaled ancestral process $(R_{[t/c_{N_l}]})_{t \ge 0}$ converges as $l \to \infty$ weakly in $D_{E_n}([0,\infty))$ to a continuous time Markov process $(R_t)_{t \ge 0}$ with initial state $R_0 = \Delta$ and transition matrix $e^{tQ}$.

Proof. Consider $\xi, \eta \in E_n$ with $\xi \subset \eta$, i.e. $a < b = b_1 + \cdots + b_a$. Then not all the $b_i$ are equal to one. Without loss of generality assume $b_1 \ge 2$ and consider the sum

$$\begin{aligned} S := \sum_{\substack{i_1, \dots, i_a \\ \text{all distinct}}} (\nu_{i_1})_{b_1} \cdots (\nu_{i_a})_{b_a} &\le \sum_{i_1} (\nu_{i_1})_2\, \nu_{i_1}^{b_1 - 2} \sum_{i_2, \dots, i_a} \nu_{i_2}^{b_2} \cdots \nu_{i_a}^{b_a} \\ &\le \sum_{i} (\nu_i)_2\, N^{b_1 - 2}\, (\nu_1 + \cdots + \nu_N)^{b_2 + \cdots + b_a} = N^{b-2} \sum_{i} (\nu_i)_2. \end{aligned}$$

Thus

$$\frac{p_{\xi\eta}}{c_N} = \frac{E(S)}{(N)_b\, c_N} \le \frac{N^{b-2}}{(N)_b\, c_N} \sum_{i} E\big((\nu_i)_2\big) \le \frac{N^b}{(N)_b} \le \frac{N^n}{(N)_n},$$

i.e. the sequence $(p_{\xi\eta}/c_N)_{N \in \mathbb{N}}$ is bounded, where the bound depends only on the sample size n. Now make use of $p_{\xi\xi} - 1 = -\sum_{\eta \ne \xi} p_{\xi\eta}$ to verify that the sequence $(p_{\xi\xi} - 1)/c_N$ is also bounded, where the bound again depends only on n. Hence the matrix sequence $((P_N - I)/c_N)_{N \in \mathbb{N}}$ is bounded. Further $c_N \in [0,1]$ for all $N \in \mathbb{N}$, i.e. there exists a subsequence $(N_l)_{l \in \mathbb{N}}$ with the required properties. For $\xi \subset \eta$ the inequality $q_{\xi\eta} \le 1$ follows from $p_{\xi\eta}/c_N \le N^b/(N)_b$. The weak convergence results are derived as in the proof of Theorem 3.1. $\Box$

Final remarks. The generator Q depends on the considered population model, i.e. Q depends on the distribution of the offspring vectors $(\nu_1, \dots, \nu_N)$ for large N. It would be nice to characterize the class of all generators arising as a limit in (10). In order to do this assume without loss of generality that the subsequence $(N_l)_{l \in \mathbb{N}}$ is the full sequence of all integers. Thus we assume that $c := \lim_{N\to\infty} c_N$ exists and that

$$Q = \lim_{N\to\infty} \frac{P_N - I}{c_N}$$

exists. Choosing the full relation $\eta := \Theta$, i.e. $a = |\eta| = 1$ in (9), it follows that the condition (I) of Theorem 2.1 is satisfied with $\varphi_k := q_{\xi\eta}$ as long as $|\xi| = k$. If the condition (II) is also satisfied it follows from Theorem 2.1 that Q is of the form (6). On the other hand, if the condition (II) is not satisfied then Lemma 2.2 ensures that there exists a pair $(\xi, \eta)$ with $\xi \subset \eta$ which is not a k-merger such that $q_{\xi\eta} > 0$. Hence in this case Q cannot be of the form (6). It is an open problem to characterize the generators Q for the case when the condition (II) is not satisfied.

Finally examples are presented where the condition (II) is not satisfied. From (8) it follows that for such examples the variables $(\nu_1)_2$ and $(\nu_2)_2$ are positively correlated or that $\lim_{N\to\infty} c_N \ne 0$.

Example 1: Assume for technical reasons that N is even. Consider the model with $P(\nu_i = \nu_j = N/2) := p_N$ for all $i \ne j$ and with $P(\nu_1 = \cdots = \nu_N = 1) := q_N$, where the parameters $p_N, q_N > 0$ are such that $\binom{N}{2} p_N + q_N = 1$. Obviously

$$E\big((\nu_1)_2\big) = (N-1)\, p_N\, (N/2)_2 = \frac{(N)_3}{4}\, p_N$$

and hence the coalescence probability is given by $c_N = N(N-2)\,p_N/4$. Further

$$E\big((\nu_1)_2 (\nu_2)_2\big) = \big((N/2)_2\big)^2\, p_N = \frac{N^2 (N-2)^2}{16}\, p_N.$$

No matter how the parameter $p_N$ is chosen, the condition (II) is not satisfied, as

$$\frac{E\big((\nu_1)_2 (\nu_2)_2\big)}{N^2 c_N} = \frac{N-2}{4N} \to \frac{1}{4}.$$

The variables $(\nu_1)_2$ and $(\nu_2)_2$ are positively correlated for $0 < p_N < (N-1)^{-2}$ and negatively correlated for $(N-1)^{-2} < p_N \le 2/(N(N-1))$. If $p_N$ is of order $N^{-2}$ then $c := \lim_{N\to\infty} c_N \ne 0$; if $p_N$ is of smaller order then $c = 0$.

[Table: choices of $p_N$ with the resulting $c := \lim_{N\to\infty} c_N$ and the sign of the correlation between $(\nu_1)_2$ and $(\nu_2)_2$.]

Note that the condition (I) is satisfied with $\varphi_{k+2} = (1/2)^k$, which corresponds to the measure $\Lambda = \delta_{1/2}$ (see Lemma 2.3).

Example 2: Fix a constant $L \in \mathbb{N}$ and assume for technical reasons that N is a multiple of L. Consider the population model where in each generation L individuals each have $N/L$ offspring and all the other individuals have no offspring, i.e.

$$P(\nu_1 = \cdots = \nu_L = N/L,\ \nu_{L+1} = \cdots = \nu_N = 0) = \binom{N}{L}^{-1}.$$

For this model it follows that $E((\nu_1)_2) = (N-L)/L$, i.e. the coalescence probability is given by

$$c_N = \frac{N-L}{L(N-1)} \to \frac{1}{L} =: c > 0.$$

Further

$$E\big((\nu_1)_2 (\nu_2)_2\big) = \frac{L-1}{L^3} \cdot \frac{N(N-L)^2}{N-1}$$

and hence

$$\frac{E\big((\nu_1)_2 (\nu_2)_2\big)}{N^2 c_N} = \frac{L-1}{L^2} \cdot \frac{N-L}{N} \to \frac{L-1}{L^2},$$

i.e. the condition (II) is not satisfied for $L \ge 2$. From $L \le N$ it follows that the variables $(\nu_1)_2$ and $(\nu_2)_2$ are not positively correlated. Note that the condition (I) is satisfied with $\varphi_{k+2} = (1/L)^k$, which corresponds to the measure $\Lambda = \delta_{1/L}$.

Acknowledgement. The authors wish to thank the organizers of the fifth international conference on mathematical population dynamics for steering such an interesting, interdisciplinary meeting, especially Ellen Baake for organizing the population genetics and evolutionary dynamics sessions. Further many thanks to Adam Bobrowski, Ali Falahati, Marek Kimmel, and to all participants of the conference for sharing their scientific experiences. The research of the second author is supported by the Bank of Sweden Tercentenary Foundation project "Dependence and Interaction in Stochastic Population Dynamics".

References

[1] Cannings, C.: The latent roots of certain Markov chains arising in genetics: a new approach, I. Haploid models. Adv. Appl. Prob. 6, 260–290 (1974).
[2] Cannings, C.: The latent roots of certain Markov chains arising in genetics: a new approach, II. Further haploid models. Adv. Appl. Prob. 7, 264–282 (1975).
[3] Feller, W.: An Introduction to Probability Theory and Its Applications. Volume I, Second Edition, Wiley (1971).
[4] Kingman, J.F.C.: On the Genealogy of Large Populations. J. Appl. Prob. 19A, 27–43 (1982).
[5] Kingman, J.F.C.: Exchangeability and the Evolution of Large Populations. In: Koch, G. and Spizzichino, F.: Exchangeability in Probability and Statistics, North-Holland Publishing Company, pp. 97–112 (1982).
[6] Kingman, J.F.C.: The Coalescent. Stoch. Process. Appl. 13, 235–248 (1982).
[7] Möhle, M.: Robustness Results for the Coalescent. J. Appl. Prob. 35, 438–447 (1998).
[8] Möhle, M.: Ancestral Processes in Population Genetics - The Coalescent. Journal of Theoretical Biology (submitted 1998).
[9] Möhle, M.: Weak convergence to the coalescent in neutral population models. J. Appl. Prob. (to appear June 1999).
[10] Sagitov, S.: The General Coalescent with Asynchronous Mergers of Ancestral Lines. J. Appl. Prob. (submitted 1998).