Crump Mode Jagers processes with neutral Poissonian mutations

Similar documents
From Individual-based Population Models to Lineage-based Models of Phylogenies

The Contour Process of Crump-Mode-Jagers Branching Processes

Dynamics of the evolving Bolthausen-Sznitman coalescent. by Jason Schweinsberg University of California at San Diego.

Random trees and branching processes

arxiv: v1 [math.pr] 29 Jul 2014

Yaglom-type limit theorems for branching Brownian motion with absorption. by Jason Schweinsberg University of California San Diego

Introduction to self-similar growth-fragmentations

The genealogy of branching Brownian motion with absorption. by Jason Schweinsberg University of California at San Diego

The range of tree-indexed random walk

Almost sure asymptotics for the random binary search tree

Linear-fractional branching processes with countably many types

THE x log x CONDITION FOR GENERAL BRANCHING PROCESSES

Splitting trees with neutral Poissonian mutations I: Small families.

Branching Processes II: Convergence of critical branching to Feller s CSB

1 Informal definition of a C-M-J process

Time Reversal Dualities for some Random Forests

BRANCHING PROCESSES AND THEIR APPLICATIONS: Lecture 15: Crump-Mode-Jagers processes and queueing systems with processor sharing

Finite rooted tree. Each vertex v defines a subtree rooted at v. Picking v at random (uniformly) gives the random fringe subtree.

UPPER DEVIATIONS FOR SPLIT TIMES OF BRANCHING PROCESSES

ON THE COMPLETE LIFE CAREER OF POPULATIONS IN ENVIRONMENTS WITH A FINITE CARRYING CAPACITY. P. Jagers

FRINGE TREES, CRUMP MODE JAGERS BRANCHING PROCESSES AND m-ary SEARCH TREES

ON COMPOUND POISSON POPULATION MODELS

Modern Discrete Probability Branching processes

Decomposition of supercritical branching processes with countably many types

The Λ-Fleming-Viot process and a connection with Wright-Fisher diffusion. Bob Griffiths University of Oxford

How robust are the predictions of the W-F Model?

The Combinatorial Interpretation of Formulas in Coalescent Theory

Branching Brownian motion seen from the tip

RW in dynamic random environment generated by the reversal of discrete time contact process

PREPRINT 2007:25. Multitype Galton-Watson processes escaping extinction SERIK SAGITOV MARIA CONCEIÇÃO SERRA

Wald Lecture 2 My Work in Genetics with Jason Schweinsbreg

The nested Kingman coalescent: speed of coming down from infinity. by Jason Schweinsberg (University of California at San Diego)

BRANCHING PROCESSES 1. GALTON-WATSON PROCESSES

Pathwise construction of tree-valued Fleming-Viot processes

On the age of a randomly picked individual in a linear birth and death process

arxiv: v1 [math.pr] 4 Feb 2016

Recovery of a recessive allele in a Mendelian diploid model

WXML Final Report: Chinese Restaurant Process

stochnotes Page 1

Endowed with an Extra Sense : Mathematics and Evolution

Computational Systems Biology: Biology X

Branching within branching: a general model for host-parasite co-evolution

The contour of splitting trees is a Lévy process

an author's

Two viewpoints on measure valued processes

Lecture 06 01/31/ Proofs for emergence of giant component

Final Exam: Probability Theory (ANSWERS)

arxiv: v2 [math.pr] 18 Oct 2018

Adaptive dynamics in an individual-based, multi-resource chemostat model

Erdős-Renyi random graphs basics

A simple branching process approach to the phase transition in G n,p

Problems on Evolutionary dynamics

6 Introduction to Population Genetics

Mean-field dual of cooperative reproduction

Learning Session on Genealogies of Interacting Particle Systems

Evolution in a spatial continuum

The Wright-Fisher Model and Genetic Drift

Resistance Growth of Branching Random Networks

Recent results on branching Brownian motion on the positive real axis. Pascal Maillard (Université Paris-Sud (soon Paris-Saclay))

LogFeller et Ray Knight

Stochastic flows associated to coalescent processes

Continuous-state branching processes, extremal processes and super-individuals

6 Introduction to Population Genetics

process on the hierarchical group

Infinite geodesics in hyperbolic random triangulations

De los ejercicios de abajo (sacados del libro de Georgii, Stochastics) se proponen los siguientes:

First order logic on Galton-Watson trees

Branching processes. Chapter Background Basic definitions

General Branching Processes and Cell Populations. Trinity University

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Lecture 18 : Ewens sampling formula

A representation for the semigroup of a two-level Fleming Viot process in terms of the Kingman nested coalescent

Notes 20 : Tests of neutrality

Itô s excursion theory and random trees

ANCESTRAL PROCESSES WITH SELECTION: BRANCHING AND MORAN MODELS

The mathematical challenge. Evolution in a spatial continuum. The mathematical challenge. Other recruits... The mathematical challenge

Advanced School and Conference on Statistics and Applied Probability in Life Sciences. 24 September - 12 October, 2007

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Mouvement brownien branchant avec sélection

QUEUEING FOR AN INFINITE BUS LINE AND AGING BRANCHING PROCESS

A. Bovier () Branching Brownian motion: extremal process and ergodic theorems

Discrete random structures whose limits are described by a PDE: 3 open problems

SOLUTIONS IEOR 3106: Second Midterm Exam, Chapters 5-6, November 8, 2012

Budding Yeast, Branching Processes, and Generalized Fibonacci Numbers

14 Branching processes

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R.

Critical branching Brownian motion with absorption. by Jason Schweinsberg University of California at San Diego

Frequency Spectra and Inference in Population Genetics

Rare Alleles and Selection

Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin

Homework 6: Solutions Sid Banerjee Problem 1: (The Flajolet-Martin Counter) ORIE 4520: Stochastics at Scale Fall 2015

abstract 1. Introduction

THE STANDARD ADDITIVE COALESCENT 1. By David Aldous and Jim Pitman University of California, Berkeley

A stochastic model for the mitochondrial Eve

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )

Asymptotic Genealogy of a Branching Process and a Model of Macroevolution. Lea Popovic. Hon.B.Sc. (University of Toronto) 1997

Continuous-Time Markov Chain

Convergence exponentielle uniforme vers la distribution quasi-stationnaire en dynamique des populations

arxiv: v1 [math.pr] 6 Mar 2018

Mandelbrot s cascade in a Random Environment

Transcription:

Crump Mode Jagers processes with neutral Poissonian mutations Nicolas Champagnat 1 Amaury Lambert 2 1 INRIA Nancy, équipe TOSCA 2 UPMC Univ Paris 06, Laboratoire de Probabilités et Modèles Aléatoires Paris, le 15 septembre 2011

Branching processes with mutations Yule (1924) = pure-birth process, species and genera Griffiths & Pakes (1988) = Galton Watson tree and independent mutations with fixed probability Taïb (1992) = general branching process, mutation at birth Abraham & Delmas (2007) = continuous-state branching processes, all mutants have the same type Bertoin (2009, 2010, 2011) = Galton Watson, allelic partition of total descendance Sagitov & Serra (2009, 2011) = waiting time to n-th mutation

Splitting tree in forward time (Geiger & Kersting 97) We consider an asexual population where t individuals reproduce independently have i.i.d. lifetime durations during which they give birth at constant rate b The population size process (N t ;t 0) is a non-markovian branching process called (homogeneous, binary) Crump Mode Jagers process.

Representation in backward time (1) Starting from one single individual, the subtree spanned by the individuals alive at time t can be represented as follows... t 0 1 2 3 H 1 H 3 H 4 4 H 2...where the times H 1,H 2,H 3... are called coalescence times.

Representation in backward time (2) One can again represent this subtree... like this......or like that t H 1 H 3 H 4 H 1 H 3 H 4 H 2 H 2...And in each case, the coalescence times characterize the subtree.

Contour of a splitting tree A splitting tree and the (jumping) contour process of its truncation below time t. a) t b) t

First result Theorem (L. (2010)) The (jumping) contour of a (truncated) splitting tree is a strong Markov process. As a consequence, the coalescence times H 1,H 2,H 3... of the splitting tree form a sequence of i.i.d. positive random variables killed at its first value larger than t. There is a positive increasing function W such that W(0) = 1 and P(H > x) = 1 W(x). The sequence (H 1,H 2,...) is called a coalescent point-process.

Coalescent point process Generally, a coalescent point process is the genealogy generated by a sequence of arbitrary i.i.d. positive r.v. as below. In that case, we will define W(x) as 1/P(H > x). 0 1 2 3 4 5 6 7 8 9 10 12 13 14 15

Precision Two objects : splitting trees and coalescent point processes ; Results at fixed times are valid for coalescent point processes (more general than splitting trees) ; Asymptotic (t ) results are valid for supercritical splitting trees.

Assumptions on the mutation scheme Now conditional on the genealogy, point mutations occur randomly. 1 mutations occur at constant rate θ during lifetimes, or equivalently, on branch lengths of the coalescent point process 2 mutations are neutral : they have no effect on the genealogy (birth rate, lifetimes...) 3 each mutation yields a new type, called allele, to its carrier (infinitely-many alleles model) 4 types are inherited by the offspring born after this mutation and before the next one.

Mutation at rate θ N = 9 alive individuals shared out in 6 types : 4 types of abundance 1, 2 types of abundance 2, and 1 type of abundance 3. a a c f c c d e b f e a c d b

Frequency spectrum For a total population of N individuals, we adopt the following condensed notation : A = # distinct types in the population A(k) = # types represented by k individuals so that... A(k) = A and ka(k) = N k 1 k 1 (A(k);k 1) = «frequency spectrum».

Clonal splitting trees (1) the genalogy of clonal individuals is a splitting tree with (birth rate b and) lifetime duration distributed as V θ := min(v,e), where E is an exponential variable with parameter θ independent of V. to a clonal splitting tree is associated a clonal coalescent point process with i.i.d. branch lengths H1 θ,hθ 2,... whose inverse of the tail distribution is denoted by W θ P(H θ > s) =: 1 W θ (s).

Clonal splitting trees (2) Proposition (L. (2009)) In the case of a coalescent point process with branch lengths H 1,H 2,..., we can define H θ as max(h 1,...,H B θ ), where B θ is the index of first virgin lineage (i.e., carrying no mutation since it has split from ancestral lineage 0). In both cases, the scale function W θ associated with clonal trees is related to W via with W θ (0) = 1. W θ (x) = e θx W (x) x 0,

Virgin lineage Below, the index of the first virgin lineage is 8 a 0 c 1 c 2 c 3 b 4 b 5 d 6 d 7 a 8 e a 9 10 c b d e B θ H θ a

Clonal coalescent point process (1) lineage last first virgin lineage i a a a not carrying a y + dy a Goal. Compute the number of alleles of age in (y,y + dy) and carried by k alive individuals at time t, jointly with N t.

Clonal coalescent point process (2) lineage i H θ 1 B θ 1 last first virgin lineage a a a not carrying a H θ 2 B θ 2 B θ 3 H θ 4 y + dy a Goal. Compute the number of alleles of age in (y,y + dy) and carried by k alive individuals at time t, jointly with N t.

Finer result on clonal coalescent point process (1) B θ i = distances between consecutive virgin lineages Hi θ = max of branch lengths between consecutive virgin lineages = (B θ i,hθ i ) are i.i.d. Bθ 1 B θ 2 Bθ 3 Bθ 4 Bθ 5 B θ 6 Bθ 7 a a a a H3 θ a a a H θ 5 a a

Finer result on clonal coalescent point process (2) We are interested in the joint law of H θ and B θ. Set W θ (x,s) := 1 1 E(s Bθ,H θ x) x 0,s [0,1]. In particular, W θ (x,1) = W θ (x). Theorem (Champagnat & L. 2010) We have x W θ (x,s) = e θx W(x,s) x 0, x with W θ (0,γ) = 1, where W(x,s) := In particular, W(x, 1) = W(x). 1 1 sp(h x).

Expected frequency spectrum Recall N t is the population size at time t. Theorem (Champagnat & L. 2010) If A(k,t,dy) denotes the number of alleles of age in (y,y + dy) and carried by k alive individuals at time t, then E ( s N t 1 A(k,t,dy) N t 0 ) = θ dy W(t;s)2 W(t) e θy ( ) 1 k 1 W θ (y;s) 2 1 W θ (y;s)

Random characteristics for any individual i in the population, let σ i be her birth time, and let χ i (t σ i ) be a random characteristic of i examples : number of descendants of i, indicator of alive descendants of i, of alive clonal descendants of i... t units of time after her birth (χ i (t σ i ) = 0 if t < σ i ) the process Zt χ := i χ i (t σ i ) is a branching process counted by random characteristic, as defined by Jagers (74) and Jagers-Nerman (84a, 84b) under suitable conditions, in the supercritical case, ratios of the branching process counted by different random characteristics converge a.s. on the survival event to a deterministic limit.

Almost sure convergence, small families For any k 1 and 0 a < b, b U k (a,b) := θ dy a e θy ( W θ (y) 2 1 1 ) k 1 W θ (y) Thanks to the theory of random characteristics, we get Theorem (Champagnat & L. 2010) If A(k,t;a,b) denotes the number of alleles of age in (a,b) carried by k alive individuals at time t, then on the survival event, A(k,t;a,b) lim = U k (a,b) a.s. t N t

Preliminary remark We consider a supercritical splitting tree with Malthusian parameter α, so that N t increases like e αt. Since θ is an additional death rate for clonal families, Clonal families are supercritical if α > θ critical if α = θ subcritical if α < θ.

We define Notation M t (x;a,b)= number of families of size x and of age in [a,b] b M t (x;a,b) := A(k,t,dy) k x a L t (x)= number of families of size x L t (x) := M t (x;0, ) O t (a)= number of families of age a O t (a) := M t (0;a, ). Goal. Find x t such that E L t (x t ) = O(1) and a t such that E O t (a t ) = O(1), as t.

Case α > θ Assume α > θ. Proposition (Champagnat & L. 2011) For any c > 0 and a < b, ) E M t (ce (α θ)t ;t b,t a = O(1), so that largest families have sizes cn 1 θ/α and are also the oldest ones (born at times O(1)).

Case α < θ : largest families Assume α < θ and set β := θ/(θ α). Proposition (Champagnat & L. 2011) For some other explicit constant b, set x t := b(αt β log(t)). Then for any c ( E L t (x t + c) E M t x t + c;(1 ε) log(t) ) log(t),(1 + ε) = O(1), θ α θ α so that largest families have sizes b(log(n) β log(logn)) + c and they all have age log(t) θ α.

Case α < θ : oldest families Assume (again) α < θ and set γ := α/θ < 1. Proposition (Champagnat & L. 2011) For any a, E O t (γt + b) = O(1), and for any y t, E M t (y t ;γt + a, ) = 0 so that oldest families have ages γt + a and tight sizes.

Convergence in distribution (1) Take the coalescent point process at time t, choose s t such that s t, and set N t s t := number of subtrees (T i ) grafted on branch lengths s t s t T T 2 1 T N t s t N t s t t s t

Set X (k) t Convergence in distribution (2) := size of the k-th largest family in the whole population Y i := size of the largest family in subtree T i. When α θ, we can define { log(t) / (θ α) if α < θ s t := t / 2 if α = θ satisfying N t s t (X (1) t,...,x (k) )= first k order statistics of {Y 1,...,Y N } W.H.P. t s t t P(Y x t + c) = P(L st (x t + c) 1) E(L st (x t + c)).

Convergence in distribution (3) The same results hold with A (k) t := age of the k-th oldest family in the whole population Y i := age of the oldest family in subtree T i, and { s t := αt / θ if α < θ t log(t) / α if α = θ.

Assume α < θ. Convergence in distribution (4) Theorem (Champagnat & L. 2011) There are some explicit constants u,c, such that 1 lim t P(X(1) t < b(αt β log(t)) + k) = 1 + u.c k. More specifically, along some subsequence, (X (k) t b(αt β log(t));k 1) converge (fdd) to the (ranked) atoms of a mixed Poisson point measure with intensity where E is some exponential r.v. E c j δ j, j Z

Convergence in distribution (5) Assume again α < θ. Theorem (Champagnat & L. 2011) There is some explicit constant v > 0 such that 1 lim t P(A(1) t < (αt / θ) + a) = 1 + v.e θa. More specifically, (A (k) t (αt / θ);k 1) converge (fdd) to the (ranked) atoms of a mixed Poisson point measure with intensity where E is some exponential r.v. E e θa da,

Acknowledgements Laboratoire de Probabilités et Modèles Aléatoires UPMC Univ Paris 06 ANR MANEGE (Modèles Aléatoires en Écologie, Génétique, Évolution)...et merci de votre patience.