Stochastic Demography, Coalescents, and Effective Population Size

Similar documents
Demography April 10, 2015

How robust are the predictions of the W-F Model?

Challenges when applying stochastic models to reconstruct the demographic history of populations.

The Wright-Fisher Model and Genetic Drift

Evolution in a spatial continuum

Mathematical models in population genetics II

Wald Lecture 2 My Work in Genetics with Jason Schweinsbreg

ON COMPOUND POISSON POPULATION MODELS

Frequency Spectra and Inference in Population Genetics

The Combinatorial Interpretation of Formulas in Coalescent Theory

BIRS workshop Sept 6 11, 2009

Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin

Gene Genealogies Coalescence Theory. Annabelle Haudry Glasgow, July 2009

The Λ-Fleming-Viot process and a connection with Wright-Fisher diffusion. Bob Griffiths University of Oxford

6 Introduction to Population Genetics

Coalescent Theory for Seed Bank Models

The mathematical challenge. Evolution in a spatial continuum. The mathematical challenge. Other recruits... The mathematical challenge

6 Introduction to Population Genetics

A comparison of two popular statistical methods for estimating the time to most recent common ancestor (TMRCA) from a sample of DNA sequences

Computing likelihoods under Λ-coalescents

Evolution of cooperation in finite populations. Sabin Lessard Université de Montréal

Expected coalescence times and segregating sites in a model of glacial cycles

Dynamics of the evolving Bolthausen-Sznitman coalescent. by Jason Schweinsberg University of California at San Diego.

Mathematical Biology. Computing likelihoods for coalescents with multiple collisions in the infinitely many sites model. Matthias Birkner Jochen Blath

Introductory seminar on mathematical population genetics

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

Computational Systems Biology: Biology X

Mathematical Population Genetics II

Population Genetics: a tutorial

Mathematical Population Genetics II

Metapopulation models for historical inference

122 9 NEUTRALITY TESTS

Estimating effective population size from samples of sequences: inefficiency of pairwise and segregating sites as compared to phylogenetic estimates

Genetic Variation in Finite Populations

Learning Session on Genealogies of Interacting Particle Systems

Effective population size and patterns of molecular evolution and variation

Population Genetics I. Bio

Q1) Explain how background selection and genetic hitchhiking could explain the positive correlation between genetic diversity and recombination rate.

arxiv: v1 [q-bio.pe] 20 Sep 2018

Endowed with an Extra Sense : Mathematics and Evolution

The Moran Process as a Markov Chain on Leaf-labeled Trees

Selection and Population Genetics

Problems for 3505 (2011)

A CHARACTERIZATION OF ANCESTRAL LIMIT PROCESSES ARISING IN HAPLOID. Abstract. conditions other limit processes do appear, where multiple mergers of

CLOSED-FORM ASYMPTOTIC SAMPLING DISTRIBUTIONS UNDER THE COALESCENT WITH RECOMBINATION FOR AN ARBITRARY NUMBER OF LOCI

Pathwise construction of tree-valued Fleming-Viot processes

Joyce, Krone, and Kurtz

2 Discrete-Time Markov Chains

J. Stat. Mech. (2013) P01006

I of a gene sampled from a randomly mating popdation,

Rare Alleles and Selection

Reflected Brownian Motion

Lecture 14 Chapter 11 Biology 5865 Conservation Biology. Problems of Small Populations Population Viability Analysis

Mathematical Biology. Sabin Lessard

Wright-Fisher Processes

ANCESTRAL PROCESSES WITH SELECTION: BRANCHING AND MORAN MODELS

classes with respect to an ancestral population some time t

Statistical Tests for Detecting Positive Selection by Utilizing High. Frequency SNPs

Probability Distributions

AARMS Homework Exercises

The coalescent process

Genetic Drift in Human Evolution

A phase-type distribution approach to coalescent theory

Evolutionary Theory. Sinauer Associates, Inc. Publishers Sunderland, Massachusetts U.S.A.

ON THE DIFFUSION OPERATOR IN POPULATION GENETICS

Australian bird data set comparison between Arlequin and other programs

arxiv: v2 [q-bio.pe] 26 Sep 2018

Coalescent based demographic inference. Daniel Wegmann University of Fribourg

RW in dynamic random environment generated by the reversal of discrete time contact process

A representation for the semigroup of a two-level Fleming Viot process in terms of the Kingman nested coalescent

DUALITY AND FIXATION IN Ξ-WRIGHT-FISHER PROCESSES WITH FREQUENCY-DEPENDENT SELECTION. By Adrián González Casanova and Dario Spanò

URN MODELS: the Ewens Sampling Lemma

Theoretical Population Biology

Branching Processes II: Convergence of critical branching to Feller s CSB

Concentration inequalities for Feynman-Kac particle models. P. Del Moral. INRIA Bordeaux & IMB & CMAP X. Journées MAS 2012, SMAI Clermond-Ferrand

Two viewpoints on measure valued processes

Convergence Time to the Ewens Sampling Formula

Theoretical Population Biology

Population Structure

EXTINCTION TIMES FOR A GENERAL BIRTH, DEATH AND CATASTROPHE PROCESS

LogFeller et Ray Knight

Problems on Evolutionary dynamics

Theoretical Population Biology

Mean field simulation for Monte Carlo integration. Part II : Feynman-Kac models. P. Del Moral

Contrasts for a within-species comparative method

Bayesian Methods with Monte Carlo Markov Chains II

The Common Ancestor Process for a Wright-Fisher Diffusion

Introduction to self-similar growth-fragmentations

process on the hierarchical group

EXTINCTION TIMES FOR A GENERAL BIRTH, DEATH AND CATASTROPHE PROCESS

Statistics 992 Continuous-time Markov Chains Spring 2004

Central limit theorems for ergodic continuous-time Markov chains with applications to single birth processes

π b = a π a P a,b = Q a,b δ + o(δ) = 1 + Q a,a δ + o(δ) = I 4 + Qδ + o(δ),

Modelling populations under fluctuating selection

Diffusion Models in Population Genetics

The effects of a weak selection pressure in a spatially structured population

THE SPATIAL -FLEMING VIOT PROCESS ON A LARGE TORUS: GENEALOGIES IN THE PRESENCE OF RECOMBINATION

The Impact of Sampling Schemes on the Site Frequency Spectrum in Nonequilibrium Subdivided Populations

Lecture 4a: Continuous-Time Markov Chain Models

Measure-valued diffusions, general coalescents and population genetic inference 1

Transcription:

Demography Stochastic Demography, Coalescents, and Effective Population Size Steve Krone University of Idaho Department of Mathematics & IBEST Demographic effects bottlenecks, expansion, fluctuating population size, population structure) affect polymorphism data Ex) Detecting selective sweeps confounded by demography and structure Effective population size: When is it meaningful? What is effect of demography? Appropriate scaling comes from what is observable in the coalescent i.e., what has an explicit effect on data). Wright Fisher model discrete time generations) constant population size panmictic no selection, no recombination ancestry: each individual chooses haploid) parent at random prob / each) from previous generation Effective population size Other population models reproduction, variable pop size, structure,...) sometimes behave in certain respects like a W-F model with an effective population size e. inbreeding effective size probability of identity by descent) variance effective size variance in reproductive success) eigenvalue effective size leading non-unit eigenvalue for allele frequency transition matrix) coalescent effective size if it exists) supersedes all of these. Exists when scaled ancestral process converges to linear time change of Kingman s coalescent; demographic fluctuations average out. The coalescent P indiv choose same parent) / Takes O) generations to find common ancestor per pair) Measure time in units of generations...[t] A τ) # ancestors τ generations in past A [t]) At)...Kingman coalescent All genetic information about a sample polymorphism data) is embedded in the coalescent. Fu and Li s F statistic F F π, η s,s) π n n )η s c S + c S where n sample size π ave. # pairwise differences influenced by deep branches) η s # singletons influenced by external branches) S # segregating sites

where a n n i i Tajima s D statistic π S a D Dπ, η s,s) n c S + c S Both statistics have mean, variance. Deviations from assumptions neutrality, constant pop size, panmixia,...) produce changes in F and D. Relative time scales Coalescence events have prob O/ ). Events that are faster have prob O/ α ), where α<. Effects appear in coalescent only in average sense. All demographic processes fast coalescent effective size exists.) Events with prob O/ ) areincorporatedinthe coalescent and affect pattern of variation in nonhomogeneous way. o coalescent effective size) Fluctuating population size backward) size process M ),M ),M 3),... Markov chain with state space {,,...} i x i How does this affect the coalescent? Depends on time it takes for large size changes i.e., O)) to occur. Fast size fluctuations linear time change Large pop size changes occur quickly e.g., every generation); size process stationary distribution γ,γ,...); W-F reproduction: P no coalescence in [t] generations) E [ [t] τ i ) ] M τ) γ i x i ) [t] exp{ t γi /x i } Limiting coalescent... linear time change of standard coalescent: A [t]) Act) where c γ i x i... pairwise coalescence rate pairwise coalescence prob γi x i e e c γ i ) i... harmonic mean of sizes This is the coalescent effective size : e /c Intermediate fluctuations stochastic time change What if macroscopic changes in pop. size i.e., O)) occur on coalescent time scale i.e., O) generations)? Pop. size τ generations in past Markov chain): M τ) X τ), where relative size proc. X [t]) M [t]) Xt)... cont-time Markov e.g., diffusion proc. or cont-time jump chain)

Reproduction Ex: Wright Fisher model Let c M τ ),M τ) ) denote prob. that two lineages coalesce when going from gen. τ to gen. τ in past). Assume c k, m) H k, m ), where H k, m ) Hx, y) as k/ x and m/ y. Time change becomes c M τ ),M τ) ) M τ) X τ) So c k, m) m H k, m ), where H x, y) y Hx, y) t HX s,x s )ds. Ex: Cannings model c M τ ),M τ) ) M τ )) M τ) i E [ ν τ) i ) ] ν τ) i... number of offspring produced by ith indiv in gen τ. With exchangeable reproduction, get k ) H, m k k ) ) md yd Hx, y) x Large size changes occur on same time scale as coalescence events; do not average out. Limiting coalescent is of form where the time change A [t]) AY t)), Y t) t HXs),Xs))ds is nonlinear and stochastic coalescence intensity). [WF case: Y t) t ds] Kaj and Krone 3; Donnelly and Kurtz Xs) 999; Griffiths and Tavaré 994.) o coalescent) effective size! Behavior different from any standard W-F model. Effects should show up in polymorphism data. Idea for W-F model Full convergence theorem P no coalescence in [t] generations {M )}) [t] τ [t] τ ) M τ) ) X τ) exp [t] τ ) t exp X τ) Xs) ds) X [t]),a [t])) Xt),AY t )) in D S {,...,n} [, ), whenever X ) X) in S. Transition semigroup T t fx, i) E x,i)[ fxt),ay t )) ] can be decomposed as T t fx, i) i j lj i C l i, j)e x,i)[ fxt),j) e l )Y t ]

Idea of Proof C l i, j) j+ s i ) s j r i,r l r ) l ) l ) )l j j l ) i) l j!l j)!i l), j l i. For i n, x Z and r, transition operator for X,A ): T r fx, i) E x,i)[ fx r),a r)) ], ) Show uniform convergence of semigroups: sup T[t] fx, i) T tfx, i), x,i Lfx, i) d dt T tfx, i) t ) i Lfx, i)+ Hx, x) fx, i ) fx, i) ) Local time interpretation of time change For any t fixed and j i n, as, P k,i) M [t]),a [t])) m, j) ) i C l i, j) lj P ) ) [t] ) l ˆP k, m)+o, L x t t... diffusion local time mdx)... speed measure ds X s E x Lx t mdx) Combinatorial term is same in discrete semigroup, so decomposition implies enough to show uniform convergence of discrete Feynman Kac semigroups to E x,i)[ fxt),j) e l )Y t ] Simulations for fluctuating size sizes, ; equal prob of size change q q q; mutation prob u.;, runs per data pt.; stationary starting size. Plot of Fu and Li s F 3, 5 3, 4 Rule of thumb: q, ) no averaging; too close to coalescent scale.

Dependence on initial size Why do polymorphism data always seem to suggest population expansion, and not population contraction? Top curve: initial size 3 Bottom curve: initial size 5 3, 5 ; q q 4. General Population Structure Many deviations from basic Wright Fisher model can be thought of as examples of population structure. geographic structure age classes diploidy ordborg and Donnelly 997) Population divided into groups that are connected by migration. Fast migration... effects only present in averaged sense. Slow migration... effects explicitly appear in coalescent. Differences in scaling... hierarchical structuring of population males and females a)ocollapse...structuredcoalescent. b) Full collapse to Kingman coalescent. Partial collapse to structured coalescent.

Rates depend on configuration of ancestral lineages While there are r ancestors r,...,n), the configuration process moves among the configurations in level r: S r {x,...,x L ):x + + x L r}. Ex. Ancestral urn for age structure. Starting with sample of size n, the state space for the configuration process is S S S n.forany configuration x,...,x L ) S, specify probabilities of jumping to other configurations due to migration and/or coalescence of ancestors. Proof of convergence Stationary distribution of backward migration process: γ γ,...,γ L ). Stationary distribution for level-r configuration process: π r x) r! x! x L! γx γx L L. Transition matrix for whole configuration process on S S S n : Π I + α B + ) C + o, where I is the identity matrix, B is a block diagonal matrix B B..... B........ B n,n B n,n... from backward migration jumps, Möhle s Lemma 998) C is a block matrix of the form C C C C.............. C n,n C n,n... due to coalescence jumps. Case α : Π [t] A + C + o )) [t] P I + e tg, where P lim k A k and G PCP.

Structured Populations Population of total size, subdivided into L demes, connected by migration. Pop. size in deme k is k a k a + + a L ). Migration on same time scale as coalescence events i.e., migration prob. for lineage b ij β ij / ) limiting coalescent is structured. no averaging, no coalescent effective size) Fast migration i.e., b ij β ij / α, α<), and stationary distribution for locations γ,γ,...,γ L ) averaging occurs w/ coalescent time change c coalescent effective size is L k e c γ k k ) γ k a k. harmonic mean In case of fast migration, structured model can be thought of as panmictic W-F model with pop. size e. Simulations for population subdivision demes, equal size, equal migration rate β b 3 4 I. Kaj, Uppsala Univ. Mathematics) M. ordborg, USC Molecular and Computational Biology) M. Lascoux, Uppsala Univ. Cons. Biology & Genetics) P. Sjödin, Uppsala Univ. Cons. Biology & Genetics) SF DMS--798 IH P RR6448 P. Sjödin,I.Kaj,S.Krone,M.Lascoux,M.ordborg 5) On the meaning and existence of an effective population size. Genetics 69: 6-7. I. Kaj and S. Krone 3) The coalescent process in a population with stochastically varying size. J. Appl. Probab. 4: 33-48. M. ordborg and S. Krone ) Separation of time scales and convergence to the coalescent in structured populations. In Modern Developments in Theoretical Population Genetics. M. Slatkin and M. Veuille eds.).