Evolution in a spatial continuum
Alison Etheridge, University of Oxford
Joint work with Nick Barton (IST Vienna), Jerome Kelleher (Edinburgh) and Amandine Véber (ENS)
Banff, September 2009

Other recruits...
Nathanael Berestycki (Cambridge)
Martin Hutzenthaler (Frankfurt)
Tom Kurtz (Madison)
Habib Saadi (Oxford)
Feng Yu (Bristol)

The mathematical challenge
What is the relative importance of mutation, selection, random drift and population subdivision for standing genetic variation?
Use the pattern of variation in a sample to infer the genealogical relationships between individuals: coalescent models.
We require consistent models: forwards in time for the evolution of the population, backwards in time for the genealogical trees relating individuals in a sample from the population.
Drift (large population limit)
Neutral (haploid) panmictic population of constant size.
Forwards in time:
$E[\Delta p] = 0$ (neutrality)
$E[(\Delta p)^2] = \delta t\, p(1-p)$
$E[(\Delta p)^3] = O((\delta t)^2)$
$dp_t = \sqrt{p_t(1-p_t)}\, dW_t$
Backwards in time (to the MRCA):
Coalescent time: $dp_t = \sqrt{p_t(1-p_t)}\, dW_t$, coalescence rate $\binom{k}{2}$.
Wright-Fisher time: $dp_\tau = \sqrt{\frac{1}{N_e}\, p_\tau(1-p_\tau)}\, dW_\tau$, coalescence rate $\frac{1}{N_e}\binom{k}{2}$.

Spatial structure
Kimura's stepping stone model:
$dp_i = \sum_j m_{ji}(p_j - p_i)\,dt + \sqrt{\frac{1}{N_e}\, p_i(1-p_i)}\, dW_i$
System of interacting W-F diffusions.
The coalescent dual process $n$ evolves as follows:
$n_i \to n_i - 1$, $n_j \to n_j + 1$ at rate $n_i m_{ji}$
$n_i \to n_i - 1$ at rate $\frac{1}{2N_e}\, n_i(n_i - 1)$
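As an illustration of the panmictic coalescence rate $\frac{1}{N_e}\binom{k}{2}$ on the drift slide, here is a minimal Python sketch of the time to the most recent common ancestor (MRCA) of a sample. The function names `expected_tmrca` and `simulate_tmrca` and the parameter `n_e` are illustrative choices, not part of the talk.

```python
import random
from math import comb


def expected_tmrca(n, n_e=1.0):
    """Expected time to the MRCA of n lineages.

    While k lineages remain, the next coalescence arrives at rate
    comb(k, 2) / n_e, so the mean waiting time is n_e / comb(k, 2).
    """
    return sum(n_e / comb(k, 2) for k in range(2, n + 1))


def simulate_tmrca(n, n_e=1.0, rng=random):
    """Draw one time to the MRCA by summing exponential waiting times."""
    t = 0.0
    for k in range(n, 1, -1):
        t += rng.expovariate(comb(k, 2) / n_e)
    return t


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [simulate_tmrca(10, rng=rng) for _ in range(20000)]
    print("simulated mean:", sum(draws) / len(draws))
    print("exact mean:    ", expected_tmrca(10))
```

The exact mean telescopes to $2N_e(1 - 1/n)$, so with $N_e = 1$ it stays below 2 however large the sample: most of the depth of the tree is spent waiting for the last two lineages to merge.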
Evolution in a spatial continuum?
For many biological populations it is more natural to consider a spatial continuum.
Malécot and Wright (almost) solved this problem in the 1940s.
Assume uniform density, independent reproduction.
Identity in state between two genes $x$ apart:
$0 = -2\mu F(x) + \frac{1}{2\rho}\,(1 - F(x))\, G_{2\sigma^2}(x) + \int [F(y) - F(x)]\, G_{2\sigma^2}(y - x)\, dy$
where individuals leave offspring following a Gaussian $G_{\sigma^2}$.
$F(0) = \dfrac{1}{1 + N/\log(1/\sqrt{2\mu})}$, where $N = 4\pi\rho\sigma^2$ is the neighbourhood size.
$F(x) \approx \frac{1}{N}\, K_0(|x|/l)$ for $|x| \gg \sigma$, $l = \sigma/\sqrt{2\mu}$.
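The two derived quantities on the Malécot slide are quick to compute. This sketch evaluates the neighbourhood size $N = 4\pi\rho\sigma^2$ and the decay scale, read here as $l = \sigma/\sqrt{2\mu}$ (the square root is an assumption about the slide's notation). The function names are illustrative.

```python
from math import pi, sqrt


def neighbourhood_size(rho, sigma):
    """Neighbourhood size N = 4 * pi * rho * sigma^2 (rho: density, sigma: dispersal scale)."""
    return 4.0 * pi * rho * sigma ** 2


def decay_scale(sigma, mu):
    """Length scale l = sigma / sqrt(2 * mu) over which identity decays (assumed reading)."""
    return sigma / sqrt(2.0 * mu)


if __name__ == "__main__":
    print(neighbourhood_size(rho=1.0, sigma=1.0))  # about 12.57
    print(decay_scale(sigma=1.0, mu=0.005))        # about 10
```

A small mutation rate makes $l$ much larger than $\sigma$, which is why identity can remain detectable over many dispersal distances.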
Mathematical problems
Felsenstein (1975). The pain in the torus:
Independent reproduction ⇒ clumping;
Local regulation ⇒ correlated reproduction.
In 2D the diffusion limit fails over small scales.
The obvious backwards model fails in 2D.

...but
[Figure: plot of f; vertical axis ticks 0.5 and 1, horizontal axis ticks 1 to 7]
Biological problems
Genetic diversity much lower than expected from census numbers.
Allele frequencies correlated over long distances.
Correlations across loci reflect a shared history.
Demographic history of many species dominated by large scale extinction-recolonisation events.
...in a spatial continuum, neighbourhood size could be small and then pairwise coalescences may not dominate.

Λ-coalescents
Pitman (1999), Sagitov (1999)
If there are currently $n$ ancestral lineages, each transition involving $j$ of them merging happens at rate
$\beta_{n,j} = \int_0^1 u^{j-2}(1-u)^{n-j}\, \Lambda(du)$
$\Lambda$ a finite measure on $[0, 1]$.
Kingman's coalescent: $\Lambda = \delta_0$.
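To make the merger rates $\beta_{n,j}$ concrete, here is a minimal sketch for one convenient family, $\Lambda = \mathrm{Beta}(a, b)$. The Beta family is an assumption made for illustration and is not singled out in the talk. For it the integral has the closed form $\beta_{n,j} = B(j-2+a,\, n-j+b)/B(a, b)$. The names `merger_rate` and `total_rate` are hypothetical.

```python
from math import comb, exp, lgamma


def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def merger_rate(n, j, a=1.0, b=1.0):
    """Rate beta_{n,j} at which a given set of j out of n lineages merges,
    assuming Lambda is the Beta(a, b) distribution on [0, 1]."""
    if not 2 <= j <= n:
        raise ValueError("need 2 <= j <= n")
    return exp(log_beta(j - 2 + a, n - j + b) - log_beta(a, b))


def total_rate(n, a=1.0, b=1.0):
    """Total rate of merger events among n lineages, counting all subsets."""
    return sum(comb(n, j) * merger_rate(n, j, a, b) for j in range(2, n + 1))


if __name__ == "__main__":
    # Uniform Lambda (a = b = 1): beta_{3,2} = beta_{3,3} = 1/2.
    print(merger_rate(3, 2), merger_rate(3, 3), total_rate(3))
```

A useful sanity check is the consistency relation $\beta_{n,j} = \beta_{n+1,j} + \beta_{n+1,j+1}$, which follows from writing $1 = (1-u) + u$ under the integral and holds for any finite measure $\Lambda$.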
Forwards in time
Bertoin & Le Gall (2003)
Suppose there is no Kingman component ($\Lambda(\{0\}) = 0$). The Λ-coalescent describes the genealogy of a sample from a population evolving according to a Λ-Fleming-Viot process.
Poisson point process with intensity $dt \otimes u^{-2}\Lambda(du)$:
an individual is sampled at random from the population;
a proportion $u$ of the population is replaced by offspring of the chosen individual.
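The link between the forwards-in-time Poisson point process and the merger rates of the Λ-coalescent can be spelled out in one line. This is a sketch of the standard argument, reading the event intensity as $dt \otimes u^{-2}\Lambda(du)$ with no Kingman component. At an event with replacement fraction $u$, each of the $n$ sampled lineages independently descends from the chosen parent with probability $u$.

```latex
\begin{align*}
\text{rate at which a given set of } j \text{ of the } n \text{ lineages merges}
  &= \int_0^1 \underbrace{u^{j}(1-u)^{n-j}}_{\text{exactly these } j \text{ are offspring}}
     \; u^{-2}\,\Lambda(du) \\
  &= \int_0^1 u^{j-2}(1-u)^{n-j}\,\Lambda(du) \;=\; \beta_{n,j}.
\end{align*}
```

The factor $u^{-2}$ is what keeps the rates finite for $j \ge 2$: it is exactly cancelled by the probability $u^2$ that at least two given lineages are caught in the same event.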
[Figure: disc $B(x,r)$ with centre $x$ and radius $r$]
Pick $u \sim \nu_r(du)$.
Each individual in $B(x,r)$ dies with probability $u$.
New individuals are born according to a Poisson point process with intensity $\lambda u \mathbf{1}_{B(x,r)}\,dx$.
A continuum limit
If $\lambda$ is sufficiently large, the population survives with positive probability (N. Berestycki, E & Hutzenthaler).
Possibility of empty regions makes formulae cumbersome.
Replace according to a Gaussian density instead of just in a disc.
Let $\lambda \to \infty$. The model retains the signature of finite local population density.
⇒ a spatial Λ-Fleming-Viot process
Genealogy of a sample from the population described by a spatial Λ-coalescent.
Lineages follow coalescing Lévy (actually compound Poisson) processes with multiple mergers.
The spatial Λ-Fleming-Viot process
State $\{\rho(t, x, \cdot) \in \mathcal{M}_1(K),\ x \in \mathbb{R}^2,\ t \ge 0\}$.
$\pi$ Poisson point process with rate $\mu(dr) \otimes dx \otimes dt$.
For each $r > 0$, $\nu_r(du) \in \mathcal{M}_1([0, 1])$.
Dynamics: for each $(t, x, r) \in \pi$,
$u \sim \nu_r(du)$
$z \sim U(B_r(x))$
$k \sim \rho(t-, z, \cdot)$.
For all $y \in B_r(x)$,
$\rho(t, y, \cdot) = (1-u)\,\rho(t-, y, \cdot) + u\,\delta_k$.
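The update rule for a single reproduction event can be sketched in a few lines of Python. This is a toy discretisation, not the construction from the talk: space is replaced by a finite set of sites (an assumption), the parent location is drawn uniformly from the sites in the disc as a stand-in for $z \sim U(B_r(x))$, and the function name `apply_event` and the dictionary layout are illustrative.

```python
import random
from math import hypot


def apply_event(state, centre, r, u, rng=random):
    """Apply one event (centre, r, u) in place and return the parental type k.

    state maps each site (x, y) to a dict {type: frequency} summing to 1,
    a discrete stand-in for rho(t, x, .).
    """
    in_disc = [s for s in state
               if hypot(s[0] - centre[0], s[1] - centre[1]) <= r]
    if not in_disc:
        return None  # the event hits no site in this discretisation
    z = rng.choice(in_disc)                      # parent location
    types, weights = zip(*state[z].items())
    k = rng.choices(types, weights=weights)[0]   # parental type, k ~ rho(t-, z, .)
    for y in in_disc:
        # rho(t, y, .) = (1 - u) * rho(t-, y, .) + u * delta_k
        new = {a: (1.0 - u) * f for a, f in state[y].items()}
        new[k] = new.get(k, 0.0) + u
        state[y] = new
    return k


if __name__ == "__main__":
    grid = {(i, j): {"A": 0.5, "B": 0.5} for i in range(5) for j in range(5)}
    k = apply_event(grid, centre=(2, 2), r=1.0, u=0.3, rng=random.Random(1))
    print("parental type:", k, "| centre after event:", grid[(2, 2)])
```

Every site in the disc receives the same parental type $k$ and the same fraction $u$. That shared update is what produces multiple mergers of ancestral lineages when the process is viewed backwards in time.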
Patterns of allele frequencies
[Figure: patterns of allele frequencies]

Mixing
(Amandine Véber)
Work on a torus $T(L)$ of side $L$ in $\mathbb{R}^2$.
Two types of event:
Small events affecting bounded regions.
Large events affecting regions of diameter $O(L^\alpha)$.
Each ancestral lineage is hit by a small event at rate $O(1)$, but by a large event at rate $O(1/\rho(L))$.
Sample at random from the whole of $T(L)$.
What happens to the genealogy as $L \to \infty$?

Case (i): $\alpha < 1$
On a suitable timescale the genealogy converges to a Kingman coalescent (with an effective parameter).
The effective population size can depend on both large and small scale events.
c.f. Zähle, Cox, Durrett for the classical stepping stone model.
Case (ii): $\alpha = 1$
Three cases:
$\rho(L) \approx L^2$: timescale $\rho(L)$, spatial Λ-coalescent in which lineages follow independent Brownian motions in between coalescence events.
$\rho(L) \approx L^2 \log L$: timescale $\rho(L)$, non-spatial Λ-coalescent.
$\rho(L) \gg L^2 \log L$: timescale $L^2 \log L$, Kingman coalescent.
c.f. Nordborg & Krone (2002)
Detecting large scale events
Two ideas:
Slow decay in probability of identity
Correlations between loci

Malécot again

Adding recombination
(Amandine again)
Small events: Pick two parents, types $ab$ and $AB$, say. Write $r_L$ for the fraction of recombinants.
$\rho(t) = (1-u)\,\rho(t-) + \tfrac{1}{2}\, u (1 - r_L)(\delta_{AB} + \delta_{ab}) + \tfrac{1}{2}\, u\, r_L (\delta_{aB} + \delta_{Ab})$
Large events: ignore recombination.
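The small-event update with recombination can be checked numerically. The sketch below applies the update for parents of types $ab$ and $AB$ at a single location. The recombinant types are taken to be $aB$ and $Ab$, which is the reading assumed here. The function name and the two-letter type encoding are illustrative choices.

```python
def small_event_with_recombination(rho, u, r_l):
    """Return the type distribution after a small event with parents ab and AB.

    rho maps two-locus types to frequencies summing to 1; a fraction u is
    replaced, of which a fraction r_l are recombinants (aB or Ab).
    """
    new = {g: (1.0 - u) * f for g, f in rho.items()}
    for g in ("AB", "ab"):   # non-recombinant offspring
        new[g] = new.get(g, 0.0) + 0.5 * u * (1.0 - r_l)
    for g in ("aB", "Ab"):   # recombinant offspring
        new[g] = new.get(g, 0.0) + 0.5 * u * r_l
    return new


if __name__ == "__main__":
    before = {"AB": 0.5, "ab": 0.5}
    after = small_event_with_recombination(before, u=0.2, r_l=0.1)
    print(after, "total mass:", sum(after.values()))
```

The total mass is preserved because $(1-u) + u(1 - r_L) + u\,r_L = 1$, so the update maps probability measures to probability measures.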
Correlations
Cases... but e.g. $\rho(L) \approx L^{2\alpha}$.
Start $L^\beta$ apart, $\beta > \alpha$.
If [condition: a limit as $L \to \infty$ of a ratio involving $\log\rho(L)$, $\log(1 + r_L \ldots)$, $\rho(L)$ and $2\log(L^{\beta-\alpha})$; expression not recoverable] then genealogies asymptotically independent.
Otherwise genealogies completely correlated up to some time $L^\eta$.
Some work in progress
For $d \ge 2$, Cox, Durrett, Perkins rescaled the voter model to obtain super-Brownian motion. Restrict patch sizes to recover the same result. What about heavy tails? (With Nathanael Berestycki & Amandine Véber)
Adding selection. After rescaling one can recover the classical Fisher wave. Small noise perturbations? Spatial hitchhiking. (With Amandine Véber and Feng Yu)
Instead of replacing a fraction $u$ of the population in a disc, replace according to a distribution (e.g. Gaussian). (With Nick Barton & Jerome Kelleher)
Convergence of genealogies. (With Tom Kurtz)
A framework for modelling
Replace $\mathbb{R}^2$ by an arbitrary Polish space.
Choose a Poisson number of parents at each reproduction event.
Choose spatial position of parents non-uniformly.
Impose spatial motion of individuals not linked directly to the reproduction events.
Spatially varying population density.
...and many more.