The Λ-Fleming-Viot process and a connection with Wright-Fisher diffusion. Bob Griffiths University of Oxford


A d-dimensional Λ-Fleming-Viot process $\{X(t)\}_{t \ge 0}$ representing frequencies of d types of individuals in a population has a generator described by
\[ Lg(x) = \int_0^1 \sum_{i=1}^d x_i \bigl( g(x(1-y) + y e_i) - g(x) \bigr) \frac{F(dy)}{y^2}. \]
The population is partitioned at events of change by choosing type $i \in [d]$ to reproduce with probability $x_i$, then rescaling the population with additional offspring $y$ of type $i$ to be $x(1-y) + y e_i$, at rate $y^{-2} F(dy)$.

Examples
Eldon and Wakeley (2006): a model where F has a single point of increase in (0, 1], with a possible atom at 0.
A natural class that arises from discrete models is when F has a Beta(α, β) density, particularly a Beta(2 − α, α) density, which comes from a discrete model where the offspring distribution tails are asymptotic to a power law of index α.
Birkner, Blath, Capaldo, Etheridge, Möhle, Schweinsberg, Wakolbinger (2005) give a connection to stable processes.
Birkner and Blath (2009) describe the Λ-Fleming-Viot process and discrete models whose limit gives rise to it.

If F has a single atom at 0, then $\{X(t)\}_{t \ge 0}$ is the d-dimensional Wright-Fisher diffusion process with generator
\[ L = \frac{1}{2} \sum_{i,j=1}^d x_i(\delta_{ij} - x_j) \frac{\partial^2}{\partial x_i \partial x_j}. \]
$X_1(t)$ is a one-dimensional Wright-Fisher diffusion process with generator
\[ L = \frac{1}{2} x(1-x) \frac{\partial^2}{\partial x^2}. \]

The Λ-coalescent process is a random tree process back in time with multiple mergers. The rate at which a specific set of k lineages coalesces while there are n edges in the tree is
\[ \lambda_{nk} = \int_0^1 x^k (1-x)^{n-k} \frac{F(dx)}{x^2}, \quad k \ge 2. \]
After coalescence there are $n - k + 1$ edges in the tree. This process was introduced by Pitman (1999) and Sagitov (1999) and has been extensively studied. Berestycki (2009) surveys recent results on the Λ-coalescent.
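As an illustration of the merger rates, the sketch below evaluates $\lambda_{nk}$ in closed form for the Beta(2 − α, α) example mentioned earlier, where the integral reduces to a ratio of Beta functions, $\lambda_{nk} = B(k-\alpha, n-k+\alpha)/B(2-\alpha, \alpha)$. The function names are hypothetical and the code assumes F is normalised to a probability measure with 0 < α < 2.

```python
from math import comb, exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_coalescent_rate(n: int, k: int, alpha: float) -> float:
    """lambda_{nk} when F is the Beta(2 - alpha, alpha) distribution (assumed)."""
    if not (2 <= k <= n and 0.0 < alpha < 2.0):
        raise ValueError("need 2 <= k <= n and 0 < alpha < 2")
    return exp(log_beta(k - alpha, n - k + alpha) - log_beta(2 - alpha, alpha))


def total_merger_rate(n: int, alpha: float) -> float:
    """Total jump rate away from n lineages: sum over k of C(n, k) * lambda_{nk}."""
    return sum(comb(n, k) * beta_coalescent_rate(n, k, alpha) for k in range(2, n + 1))
```

The rates satisfy the consistency relation $\lambda_{nk} = \lambda_{n+1,k} + \lambda_{n+1,k+1}$, which gives a quick sanity check on any implementation.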

There is a connection between continuous state branching processes and the Λ-coalescent. The connection is through the Laplace exponent
\[ \psi(q) = \int_0^1 \bigl( e^{-qy} - 1 + qy \bigr) y^{-2} F(dy). \]
Bertoin and Le Gall (2006) showed that the Λ-coalescent comes down from infinity under the same condition that the continuous state branching process becomes extinct in finite time, that is, when
\[ \int_1^\infty \frac{dq}{\psi(q)} < \infty. \]

Some papers on the Λ-coalescent
Berestycki (2009). Recent progress in coalescent theory.
Berestycki, Berestycki and Limic (2012). Asymptotic sampling formulae for Λ-coalescents.
Berestycki, Berestycki and Limic (2012). A small-time coupling between Λ-coalescents and branching processes.
Bertoin and Le Gall (2003). Stochastic flows associated to coalescent processes.
Bertoin and Le Gall (2006). Stochastic flows associated to coalescent processes III: Limit theorems.
Birkner, Blath, Capaldo, Etheridge, Möhle, Schweinsberg and Wakolbinger (2005). Alpha-stable branching and Beta-coalescents.
Birkner and Blath (2009). Measure-valued diffusions, general coalescents and population genetic inference.

A Wright-Fisher generator connection
Theorem. Let L be the Λ-Fleming-Viot generator, V a uniform random variable on [0, 1], U a random variable on [0, 1] with density $2u$, $0 < u < 1$, and $W = YU$, where Y has distribution F and V, U, Y are independent. Denote the second derivatives of a function $g(x)$ by $g_{ij}(x)$. Then
\[ Lg(x) = \frac{1}{2} \sum_{i,j=1}^d x_i(\delta_{ij} - x_j)\, E\bigl[ g_{ij}\bigl( x(1-W) + WV e_i \bigr) \bigr], \]
where the expectation E is taken over V, W.

Wright-Fisher generator:
\[ Lg(x) = \frac{1}{2} \sum_{i,j=1}^d x_i(\delta_{ij} - x_j)\, g_{ij}(x) \]
Λ-Fleming-Viot generator:
\[ Lg(x) = \frac{1}{2} \sum_{i,j=1}^d x_i(\delta_{ij} - x_j)\, E\bigl[ g_{ij}\bigl( x(1-W) + WV e_i \bigr) \bigr] \]

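Method of proof
\[ Lg(x) = \int_0^1 \sum_{i=1}^d x_i \bigl( g(x(1-y) + y e_i) - g(x) \bigr) \frac{F(dy)}{y^2} \]
\[ Lg(x) = \frac{1}{2} \sum_{i,j=1}^d x_i(\delta_{ij} - x_j)\, E\bigl[ g_{ij}\bigl( x(1-W) + WV e_i \bigr) \bigr] \]
Show that the two expressions for the generator give the same answer acting on
\[ g(x) = \exp\Bigl\{ \sum_{i=1}^d \eta_i x_i \Bigr\}, \quad \eta \in \mathbb{R}^d. \]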

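1-dimensional generator
Wright-Fisher diffusion generator:
\[ Lg(x) = \frac{1}{2} x(1-x)\, g''(x) \]
Λ-Fleming-Viot process generator:
\[ Lg(x) = \frac{1}{2} x(1-x)\, E\bigl[ g''\bigl( x(1-W) + WV \bigr) \bigr] \]
or, taking the expectation over V,
\[ Lg(x) = \frac{1}{2} x(1-x)\, E\Bigl[ \frac{g'\bigl( x(1-W) + W \bigr) - g'\bigl( x(1-W) \bigr)}{W} \Bigr]. \]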
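The equivalence of the jump form and the Wright-Fisher-like form of the generator can be checked numerically in one dimension. The sketch below assumes F is a unit mass at a single point y in (0, 1] (the Eldon and Wakeley type example), so that W = Uy with U having density 2u, and compares both forms for a smooth test function. The function names and the Simpson quadrature are illustrative choices, not part of the original talk.

```python
from math import exp


def simpson(f, a: float, b: float, m: int = 200) -> float:
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def jump_generator(g, x: float, y: float) -> float:
    """Two-type jump generator when F is a unit mass at y (assumed)."""
    up = g(x * (1 - y) + y) - g(x)      # type 1 reproduces, probability x
    down = g(x * (1 - y)) - g(x)        # type 2 reproduces, probability 1 - x
    return (x * up + (1 - x) * down) / y**2


def wf_form_generator(dg, x: float, y: float) -> float:
    """(1/2) x (1-x) E[(g'(x(1-W)+W) - g'(x(1-W))) / W] with W = U*y."""
    def integrand(u: float) -> float:
        w = u * y
        # density 2u divided by W = u*y leaves the factor 2 / y
        return 2.0 * (dg(x * (1 - w) + w) - dg(x * (1 - w))) / y
    return 0.5 * x * (1 - x) * simpson(integrand, 0.0, 1.0)
```

For example, with g(x) = exp(0.7 x) the two functions agree to quadrature accuracy at any x in (0, 1).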

The Laplace transform of $VW$ is related to the Laplace exponent by
\[ E\bigl[ e^{-\eta VW} \bigr] = 2 \int_0^1 \frac{e^{-\eta y} - 1 + \eta y}{(y\eta)^2} F(dy) = \frac{2\psi(\eta)}{\eta^2}. \]
$W = UY$ is continuous in (0, 1) with a possible atom at 0, and $P(W = 0) = P(Y = 0)$.

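Adding mutation
The generator has an additional term added of
\[ \frac{\theta}{2} \sum_{i=1}^d \Bigl( \sum_{j=1}^d p_{ji} x_j - x_i \Bigr) \frac{\partial}{\partial x_i}. \]
If mutation is parent independent, $\theta p_{ij} = \theta_j$, not depending on i. Since $\sum_j x_j = 1$, the additional term is then
\[ \frac{1}{2} \sum_{i=1}^d \bigl( \theta_i - \theta x_i \bigr) \frac{\partial}{\partial x_i}. \]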

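Eigenstructure of the Λ-Fleming-Viot process
Theorem. Let $\{\lambda_{\mathbf n}\}$, $\{P_{\mathbf n}(x)\}$ be the eigenvalues and eigenvectors of L, the generator which includes mutation, satisfying
\[ L P_{\mathbf n}(x) = -\lambda_{\mathbf n} P_{\mathbf n}(x). \]
Denote the $d - 1$ eigenvalues of the mutation matrix P which have modulus less than 1 by $\{\phi_k\}_{k=1}^{d-1}$. The eigenvalues of L are
\[ \lambda_{\mathbf n} = \frac{1}{2} n(n-1) E\bigl[ (1-W)^{n-2} \bigr] + \frac{\theta}{2} \sum_{k=1}^{d-1} (1 - \phi_k) n_k, \]
where $\mathbf n = (n_1, \ldots, n_{d-1})$ and $n = n_1 + \cdots + n_{d-1}$.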

Polynomial eigenvectors
Denote the $d - 1$ eigenvalues of P which have modulus less than 1 by $\{\phi_k\}_{k=1}^{d-1}$, corresponding to eigenvectors which are rows of a $(d-1) \times d$ matrix R satisfying
\[ \sum_{i=1}^d r_{ki}\, p_{ji} = \phi_k r_{kj}, \quad k = 1, \ldots, d-1. \]
Define a $(d-1)$-dimensional vector $\xi = Rx$. The polynomials $P_{\mathbf n}(x)$ are polynomials in the $d - 1$ terms in $\xi = (\xi_1, \ldots, \xi_{d-1})$ whose only leading term of degree n is
\[ \prod_{j=1}^{d-1} \xi_j^{n_j}. \]

In the parent independent model of mutation,
\[ \lambda_{\mathbf n} = \lambda_n = \frac{1}{2} n \Bigl\{ (n-1) E\bigl[ (1-W)^{n-2} \bigr] + \theta \Bigr\}, \]
repeated $\binom{n+d-2}{n}$ times, with non-unique polynomial eigenfunctions within the same degree n. The non-unit eigenvalues of the mutation matrix with identical rows are zero.
Wright-Fisher diffusion, general mutation structure:
\[ \lambda_{\mathbf n} = \frac{1}{2} n(n-1) + \frac{\theta}{2} \sum_{k=1}^{d-1} (1 - \phi_k) n_k. \]

Λ-coalescent eigenvalues and rates
\[ \frac{1}{2} n(n-1) E\bigl[ (1-W)^{n-2} \bigr] = \sum_{k=2}^{n} \binom{n}{k} \lambda_{nk}, \]
which is the total jump rate away from n individuals. These are the eigenvalues in the Λ-coalescent tree.
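The identity between the eigenvalue expression and the total merger rate can be verified exactly in a simple case. The sketch below assumes F is a unit mass at a rational point y, so that $\lambda_{nk} = y^{k-2}(1-y)^{n-k}$ and $E[U^j] = 2/(j+2)$ for U with density 2u. Exact rational arithmetic avoids rounding questions. The helper names are hypothetical.

```python
from fractions import Fraction
from math import comb


def moment_one_minus_w(m: int, y: Fraction) -> Fraction:
    """E[(1 - U*y)^m] for U with density 2u on (0, 1), by binomial expansion."""
    return sum(comb(m, j) * (-y) ** j * Fraction(2, j + 2) for j in range(m + 1))


def eigenvalue_side(n: int, y: Fraction) -> Fraction:
    """(1/2) n (n-1) E[(1-W)^(n-2)] with W = U*y."""
    return Fraction(n * (n - 1), 2) * moment_one_minus_w(n - 2, y)


def total_rate_side(n: int, y: Fraction) -> Fraction:
    """Sum over k >= 2 of C(n, k) lambda_{nk} when F is a unit mass at y (assumed)."""
    return sum(comb(n, k) * y ** (k - 2) * (1 - y) ** (n - k) for k in range(2, n + 1))
```

For n = 3 both sides reduce to 3 − 2y, and for n = 4 to 6 − 8y + 3y².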

The individual rates can be expressed as
\[ \binom{n}{k} \lambda_{nk} = \binom{n}{k} \int_0^1 y^k (1-y)^{n-k} \frac{F(dy)}{y^2} = \frac{n}{2} E\Bigl[ \frac{P_k(n, W) - P_{k+1}(n, W)}{W^2} \Bigr], \]
where
\[ P_k(n, w) = \binom{n-1}{k-1} (1-w)^{n-k} w^k \]
is a negative binomial probability of a waiting time of n trials to obtain k successes, where the success probability is w.

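Two types
The generator is specified by
\[ Lg(x) = \frac{1}{2} x(1-x) E\bigl[ g''\bigl( x(1-W) + WV \bigr) \bigr] + \frac{1}{2} (\theta_1 - \theta x) g'(x). \]
The eigenvalues are
\[ \lambda_n = \frac{1}{2} n \Bigl\{ (n-1) E\bigl[ (1-W)^{n-2} \bigr] + \theta \Bigr\} \]
and the eigenvectors are polynomials satisfying
\[ L P_n(x) = -\lambda_n P_n(x), \quad n \ge 1. \]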

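Polynomial eigenvectors
\[ \frac{1}{2} x(1-x) E\Bigl[ \frac{P_n'\bigl( x(1-W) + W \bigr) - P_n'\bigl( x(1-W) \bigr)}{W} \Bigr] + \frac{1}{2} (\theta_1 - \theta x) P_n'(x) = -\frac{1}{2} n \Bigl\{ (n-1) E\bigl[ (1-W)^{n-2} \bigr] + \theta \Bigr\} P_n(x) \]
The monic polynomial $P_n(x)$ is uniquely defined by recursion of its coefficients.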

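Stationary distribution ψ(x)
\[ \int_0^1 Lg(x) \psi(x)\, dx = 0, \qquad \sigma^2(x) = x(1-x), \quad \mu(x) = \theta_1 - \theta x, \]
\[ k(x) = E\bigl[ (1-W)^{-2} g\bigl( x(1-W) + VW \bigr) \bigr]. \]
An equation for the stationary distribution, obtained by integrating by parts (note that $k''(x) = E[g''(x(1-W) + VW)]$):
\[ 0 = \int_0^1 \Bigl[ k(x) \frac{1}{2} \frac{d^2}{dx^2} \bigl[ \sigma^2(x) \psi(x) \bigr] - g(x) \frac{d}{dx} \bigl[ \mu(x) \psi(x) \bigr] \Bigr] dx + \Bigl[ \tfrac{1}{2} \sigma^2(x) \psi(x) k'(x) - k(x) \frac{d}{dx} \bigl[ \tfrac{1}{2} \sigma^2(x) \psi(x) \bigr] + g(x) \mu(x) \psi(x) \Bigr]_0^1 \]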

In a diffusion process $k(x) = g(x)$ and the boundary terms vanish. Then there is a solution found by solving
\[ \frac{1}{2} \frac{d^2}{dx^2} \bigl[ \sigma^2(x) \psi(x) \bigr] - \frac{d}{dx} \bigl[ \mu(x) \psi(x) \bigr] = 0; \]
however, here $k(x) \ne g(x)$, so we do not have an equation like this.

Green's function, γ(x)
Solve, for a given function $g(x)$,
\[ L\gamma(x) = -g(x), \quad \gamma(0) = \gamma(1) = 0. \]
Then
\[ \gamma(x) = \int_0^1 G(x, \xi) g(\xi)\, d\xi. \]
A non-linear equation, equivalent to
\[ \frac{1}{2} x(1-x) E\bigl[ \gamma''\bigl( x(1-W) + VW \bigr) \bigr] = -g(x). \]

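Green's function solution
Define
\[ k(x) = E\bigl[ (1-W)^{-2} \gamma\bigl( x(1-W) + VW \bigr) \bigr]; \]
then
\[ k''(x) = -\frac{2 g(x)}{x(1-x)}, \]
with solution
\[ k(x) = k(0)(1-x) + k(1)x + (1-x) \int_0^x \frac{2 g(\eta)}{1-\eta}\, d\eta + x \int_x^1 \frac{2 g(\eta)}{\eta}\, d\eta. \]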

Mean time to absorption
If $g(x) = 1$, $x \in (0, 1)$, then $\gamma(x)$ is the mean time to absorption at 0 or 1 when $X(0) = x$.
\[ k(x) = k(0)(1-x) + k(1)x + (1-x) \int_0^x \frac{2}{1-\eta}\, d\eta + x \int_x^1 \frac{2}{\eta}\, d\eta \]
There is a non-linear equation to solve of
\[ k(x) = k(0)(1-x) + k(1)x - 2(1-x)\log(1-x) - 2x \log x, \]
where
\[ k(x) = E\bigl[ (1-W)^{-2} \gamma\bigl( x(1-W) + VW \bigr) \bigr]. \]

Stationary distribution, Λ-Fleming-Viot process with mutation
Let $Z$ be a random variable with the size-biassed distribution of $X$, $Z^*$ a size-biassed $Z$ random variable and $Z_*$ a size-biassed random variable with respect to $1 - Z$. Let $B$ be a Bernoulli random variable, independent of the other random variables in the following equation, such that
\[ P(B = 1) = \frac{\theta - \theta_1}{\theta(\theta_1 + 1)}. \]
An interesting distributional identity:
\[ V Z^* \overset{D}{=} (1 - B)\, V Z + B \bigl( Z_*(1-W) + WV \bigr). \]

The frequency spectrum in the infinitely-many-alleles model
Take a limit from a d-allele model with $\theta_i = \theta/d$, $i \in [d]$. The limit is a point process $\{X_i\}_{i=1}^{\infty}$. The 1-dimensional frequency spectrum $h(x)$ is a non-negative measure such that, for suitable functions k on [0, 1], in the stationary distribution
\[ E\Bigl[ \sum_{i=1}^{\infty} k(X_i) \Bigr] = \int_0^1 k(x) h(x)\, dx. \]
Symmetry in the d-allele model shows that
\[ \int_0^1 k(x) h(x)\, dx = \lim_{d \to \infty} d\, E\bigl[ k(X_1) \bigr]. \]
The classical Wright-Fisher diffusion gives rise to the Poisson-Dirichlet process with a frequency spectrum of
\[ h(x) = \theta x^{-1} (1-x)^{\theta - 1}, \quad 0 < x < 1. \]

Λ-Fleming-Viot frequency spectrum
Let $Z$ have a density
\[ f(z) = z h(z), \quad 0 < z < 1. \]
Interesting identity:
\[ V Z^* \overset{D}{=} Z_*(1-W) + WV, \]
where $Z^*$ is size-biassed with respect to $Z$, and $Z_*$ is size-biassed with respect to $1 - Z$. The constant θ appears in the identity through scaling in the size-biassed distributions.
Limit distribution of excess life in a renewal process:
\[ P(V Z^* > \eta) = \frac{1}{E[Z]} \int_\eta^1 P(Z > z)\, dz. \]

Typed dual Λ-coalescent
The Λ-Fleming-Viot process is dual to the system of coalescing lineages $\{L(t)\}_{t \ge 0}$ which takes values in $\mathbb{Z}_+^d$ and for which the transition rates are, for $i, j \in [d]$ and $l \ge 2$,
\[ q_\Lambda\bigl( \xi, \xi - e_i(l-1) \bigr) = \int_{[0,1]} \binom{|\xi|}{l} y^l (1-y)^{|\xi| - l} \frac{F(dy)}{y^2} \cdot \frac{\xi_i + 1 - l}{|\xi| + 1 - l} \cdot \frac{M(\xi - e_i(l-1))}{M(\xi)}, \]
\[ q_\Lambda\bigl( \xi, \xi + e_i - e_j \bigr) = \mu_{ij} (\xi_i + 1 - \delta_{ij}) \frac{M(\xi + e_i - e_j)}{M(\xi)}. \]
The process is constructed as a moment dual from the generator.

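A different dual process
Define a sequence of monic polynomials $\{g_n(x)\}$ by the generator equation
\[ \frac{1}{2} x(1-x) E\bigl[ g_n''\bigl( x(1-W) + VW \bigr) \bigr] + \frac{1}{2} (\theta_1 - \theta x) g_n'(x) = \binom{n}{2} E\bigl[ (1-W)^{n-2} \bigr] \bigl[ g_{n-1}(x) - g_n(x) \bigr] + \frac{n}{2} \bigl[ \theta_1 g_{n-1}(x) - \theta g_n(x) \bigr]. \]
The defining equation mimics the Wright-Fisher diffusion generator acting on test functions $g_n(x) = x^n$:
\[ \frac{1}{2} x(1-x) \frac{d^2}{dx^2} x^n + \frac{1}{2} (\theta_1 - \theta x) \frac{d}{dx} x^n = \binom{n}{2} \bigl( x^{n-1} - x^n \bigr) + \frac{n}{2} \bigl( \theta_1 x^{n-1} - \theta x^n \bigr). \]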

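Jacobi polynomial analogues
The eigenfunctions are polynomials $\{P_n(x)\}$ satisfying
\[ L P_n(x) = -\lambda_n P_n(x). \]
The coefficients are given by
\[ P_n(x) = g_n(x) + \sum_{r=0}^{n-1} c_{nr}\, g_r(x), \qquad c_{nr} = \frac{\lambda^{\circ}_{r+1} \cdots \lambda^{\circ}_{n}}{(\lambda_r - \lambda_n) \cdots (\lambda_{n-1} - \lambda_n)}, \]
where
\[ \lambda_n = \frac{n}{2} \Bigl[ (n-1) E\bigl[ (1-W)^{n-2} \bigr] + \theta \Bigr], \qquad \lambda^{\circ}_n = \frac{n}{2} \Bigl[ (n-1) E\bigl[ (1-W)^{n-2} \bigr] + \theta_1 \Bigr]. \]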

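In the stationary distribution
\[ E\bigl[ g_n(X) \bigr] = \omega_n, \]
where $\omega_n$ is a Beta moment analog
\[ \omega_n = \frac{\prod_{j=1}^{n} \bigl( (j-1) E\bigl[ (1-W)^{j-2} \bigr] + \theta_1 \bigr)}{\prod_{j=1}^{n} \bigl( (j-1) E\bigl[ (1-W)^{j-2} \bigr] + \theta \bigr)}. \]
Let
\[ h_n = \frac{g_n}{\omega_n}, \]
so that $E\bigl[ h_n(X) \bigr] = 1$.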
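To make the phrase "Beta moment analog" concrete, the sketch below computes $\omega_n$ from a user-supplied function returning $E[(1-W)^m]$, and can be compared with the ordinary Beta moments in the Wright-Fisher case, where $E[(1-W)^m] = 1$. The function names are hypothetical, and the j = 1 factor is taken to be $\theta_1/\theta$ (the coefficient j − 1 is zero there), which is an assumption about how that term is read.

```python
from fractions import Fraction


def omega(n: int, theta1: Fraction, theta: Fraction, moment) -> Fraction:
    """omega_n as a product over j = 1..n; moment(m) must return E[(1-W)^m]."""
    value = Fraction(1)
    for j in range(1, n + 1):
        # the (j-1) factor vanishes at j = 1, so moment(-1) is never evaluated
        c = Fraction(0) if j == 1 else (j - 1) * moment(j - 2)
        value *= (c + theta1) / (c + theta)
    return value


def beta_moment(n: int, theta1: Fraction, theta: Fraction) -> Fraction:
    """E[X^n] for X ~ Beta(theta1, theta - theta1)."""
    value = Fraction(1)
    for j in range(n):
        value *= (theta1 + j) / (theta + j)
    return value
```

With `moment = lambda m: Fraction(1)` the two functions return the same value for every n, which is the Wright-Fisher diffusion limit.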

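Dual generator
\[ L h_n = \lambda_n \bigl[ h_{n-1} - h_n \bigr] \]
Dual equation
\[ E_{X(0)=x}\bigl[ h_n(X(t)) \bigr] = E_{N(0)=n}\bigl[ h_{N(t)}(x) \bigr], \]
where $\{N(t), t \ge 0\}$ is a death process with rates
\[ \lambda_n = \frac{1}{2} n \Bigl( (n-1) E\bigl[ (1-W)^{n-2} \bigr] + \theta \Bigr). \]
The process $\{N(t), t \ge 0\}$ comes down from infinity if and only if
\[ \sum_{n=2}^{\infty} \lambda_n^{-1} < \infty, \]
which implies the Λ-coalescent comes down from infinity.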

The transition functions are then
\[ P\bigl( N(t) = j \mid N(0) = i \bigr) = \sum_{k=j}^{i} e^{-\lambda_k t} (-1)^{k-j} \frac{\prod_{\{l:\, j+1 \le l \le i\}} \lambda_l}{\prod_{\{l:\, j \le l \le i,\ l \ne k\}} |\lambda_l - \lambda_k|}. \]
$P\bigl( N(t) = j \mid N(0) = \infty \bigr)$ is well defined if $N(t)$ comes down from infinity. In the Kingman coalescent the death rates are $n(n - 1 + \theta)/2$ and $N(t)$ describes the number of non-mutant lineages at time t back in the population.
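The transition function of a pure death process with distinct rates has a standard closed form as a sum of exponentials, and the sketch below implements that standard form for the rates $\lambda_n = \tfrac12 n((n-1)E[(1-W)^{n-2}] + \theta)$. It is written as an illustration under the assumption that all rates are distinct and that $\lambda_0 = 0$. The names `death_rates` and `transition_prob` are hypothetical.

```python
from math import exp, prod


def death_rates(i: int, theta: float, moment) -> list:
    """[lambda_0, ..., lambda_i]; moment(m) must return E[(1-W)^m] for m >= 0."""
    rates = []
    for n in range(i + 1):
        pair_term = (n - 1) * moment(n - 2) if n >= 2 else 0.0
        rates.append(0.5 * n * (pair_term + theta))
    return rates


def transition_prob(i: int, j: int, t: float, lam: list) -> float:
    """P(N(t) = j | N(0) = i) for a pure death process with distinct rates lam[n]."""
    if not 0 <= j <= i:
        return 0.0
    flow = prod(lam[l] for l in range(j + 1, i + 1))   # rates of the i - j deaths
    total = 0.0
    for k in range(j, i + 1):
        denom = prod(lam[l] - lam[k] for l in range(j, i + 1) if l != k)
        total += exp(-lam[k] * t) / denom
    return flow * total
```

In the Kingman case, `death_rates(5, 1.0, lambda m: 1.0)` gives the rates $n(n-1+\theta)/2$, and the probabilities over j = 0, ..., 5 sum to one for any t.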

Wright-Fisher diffusion. Fixation probability with selection.
Two types. Let $P(x)$ be the probability that the 1st type fixes, starting from an initial frequency of x. $P(x)$ is the solution of
\[ L_\sigma P(x) = \frac{1}{2} x(1-x) P''(x) - \sigma x(1-x) P'(x) = 0 \]
with $P(0) = 0$, $P(1) = 1$. The solution of this differential equation is
\[ P(x) = \frac{e^{2\sigma x} - 1}{e^{2\sigma} - 1}. \]
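As a quick check of the Wright-Fisher fixation probability, the sketch below evaluates $P(x) = (e^{2\sigma x} - 1)/(e^{2\sigma} - 1)$ and measures the residual of the equation $\tfrac12 x(1-x)P'' - \sigma x(1-x)P' = 0$ with central finite differences. The sign convention for the selection term is the one under which this closed form is a solution. The function names and the step size are illustrative assumptions.

```python
from math import expm1


def wf_fixation(x: float, sigma: float) -> float:
    """Fixation probability (e^{2 sigma x} - 1) / (e^{2 sigma} - 1); x when sigma = 0."""
    if sigma == 0.0:
        return x
    return expm1(2.0 * sigma * x) / expm1(2.0 * sigma)


def ode_residual(x: float, sigma: float, h: float = 1e-4) -> float:
    """Finite-difference value of (1/2) x(1-x) P'' - sigma x(1-x) P' at x."""
    p_minus = wf_fixation(x - h, sigma)
    p_mid = wf_fixation(x, sigma)
    p_plus = wf_fixation(x + h, sigma)
    first = (p_plus - p_minus) / (2.0 * h)
    second = (p_plus - 2.0 * p_mid + p_minus) / h**2
    return 0.5 * x * (1.0 - x) * second - sigma * x * (1.0 - x) * first
```

The residual is zero up to discretisation and rounding error for interior x, and the boundary values are 0 and 1 as required.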

Λ-Fleming-Viot process. Fixation probability with selection.
Let $P(x)$ be the probability that the 1st type fixes, starting from an initial frequency of x. $P(x)$ is the solution of
\[ L_\sigma P(x) = \frac{1}{2} x(1-x) E\bigl[ P''\bigl( x(1-W) + WV \bigr) \bigr] - \sigma x(1-x) P'(x) = 0 \]
with $P(0) = 0$, $P(1) = 1$. Der, Epstein and Plotkin (2011, 2012).
For some measures F and σ it is possible that $P(x) = 1$ or 0 for all $x \in (0, 1)$. If $\sigma > 0$, fixation is certain if
\[ \sigma > -\int_0^1 \frac{\log(1-y)}{y^2} F(dy). \]

A computational solution for $P(x)$ when fixation or loss is not certain from $x \in (0, 1)$.
Define a sequence of polynomials $\{h_n(x)\}_{n=0}^{\infty}$ as solutions of
\[ E\Bigl[ \frac{h_n\bigl( x(1-W) + W \bigr) - h_n\bigl( x(1-W) \bigr)}{W} \Bigr] = n h_{n-1}(x), \]
where the leading coefficient in $h_n(x)$ is
\[ \frac{1}{\prod_{j=1}^{n-1} E\bigl[ (1-W)^j \bigr]}, \]
for a choice of pre-specified constants $\{h_n(0)\}$.

Polynomial solution for $P(x)$
\[ P(x) = \bigl( e^{2\sigma} - 1 \bigr)^{-1} \sum_{n=1}^{\infty} \frac{(2\sigma)^n}{n!} H_n(x), \]
where $\{H_n(x)\}$ are polynomials derived from
\[ H_n(x) = \int_0^x n h_{n-1}(\xi)\, d\xi \]
and the constants $\{h_n(0)\}$ are chosen so that
\[ \int_0^1 n h_{n-1}(\xi)\, d\xi = 1. \]
The coefficients of $H_n(x)$ are well defined by a recurrence relationship with the coefficients of $H_{n-1}(x)$.
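The recurrence for the polynomial coefficients can be sketched in code. The example below assumes F is a unit mass at a rational point y (with y = 0 giving the Wright-Fisher diffusion), so that the mixed moments $E[(1-W)^r W^s]$ are available in closed form, and uses exact rational arithmetic to avoid cancellation in the coefficients. It expands the defining equation in powers of x, solves for the coefficients from the top degree down, and truncates the series at `n_max` terms. All function names, the truncation level and the choice of F are assumptions made for illustration. Whether the truncated series is an adequate approximation for a given F and σ is not addressed here.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, exp, factorial


@lru_cache(maxsize=None)
def mixed_moment(r: int, s: int, y: Fraction) -> Fraction:
    """E[(1-W)^r W^s] for W = U*y, U with density 2u on (0, 1)."""
    return sum(comb(r, j) * (-1) ** j * y ** (j + s) * Fraction(2, j + s + 2)
               for j in range(r + 1))


def h_polynomials(n_max: int, y: Fraction) -> list:
    """Coefficient lists a[n][m] with h_n(x) = sum_m a[n][m] x^m, n = 0..n_max."""
    polys = [[Fraction(1)]]                       # h_0 = 1
    for n in range(1, n_max + 1):
        prev = polys[-1]
        a = [Fraction(0)] * (n + 1)
        for r in range(n - 1, -1, -1):            # match coefficients of x^r
            rhs = n * prev[r] - sum(
                a[m] * comb(m, r) * mixed_moment(r, m - r - 1, y)
                for m in range(r + 2, n + 1))
            a[r + 1] = rhs / ((r + 1) * mixed_moment(r, 0, y))
        # h_n(0) is chosen so that the integral of (n+1) h_n over [0, 1] equals 1
        a[0] = Fraction(1, n + 1) - sum(a[m] / (m + 1) for m in range(1, n + 1))
        polys.append(a)
    return polys


def fixation_probability(x: float, sigma: float, y: Fraction, n_max: int = 20) -> float:
    """Series for P(x) truncated at n_max terms (an approximation)."""
    xq = Fraction(x)                              # exact evaluation of H_n(x)
    polys = h_polynomials(n_max - 1, y)
    total = 0.0
    for n in range(1, n_max + 1):
        h_n = sum(n * c / (m + 1) * xq ** (m + 1) for m, c in enumerate(polys[n - 1]))
        total += (2.0 * sigma) ** n / factorial(n) * float(h_n)
    return total / (exp(2.0 * sigma) - 1.0)
```

With `y = Fraction(0)` the recurrence returns $h_n(x) = x^n$, so $H_n(x) = x^n$ and the series reduces to the Wright-Fisher fixation probability $(e^{2\sigma x} - 1)/(e^{2\sigma} - 1)$.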