Statistical population genetics

Similar documents
The Wright-Fisher Model and Genetic Drift

Statistical population genetics

Evolution in a spatial continuum

How robust are the predictions of the W-F Model?

Lecture 18 : Ewens sampling formula

Population Genetics I. Bio

LECTURE NOTES: Discrete time Markov Chains (2/24/03; SG)

Introduction to population genetics & evolution

URN MODELS: the Ewens Sampling Lemma

Neutral Theory of Molecular Evolution

So in terms of conditional probability densities, we have by differentiating this relationship:

Population Structure

Inventory Model (Karlin and Taylor, Sec. 2.3)

Wright-Fisher Models, Approximations, and Minimum Increments of Evolution

Demography April 10, 2015

Population Genetics 7: Genetic Drift

The mathematical challenge. Evolution in a spatial continuum. The mathematical challenge. Other recruits... The mathematical challenge

Diffusion Models in Population Genetics

Frequency Spectra and Inference in Population Genetics

Heterozygosity is variance. How Drift Affects Heterozygosity. Decay of heterozygosity in Buri s two experiments

Population Genetics: a tutorial

14 Branching processes

EVOLUTIONARY DYNAMICS AND THE EVOLUTION OF MULTIPLAYER COOPERATION IN A SUBDIVIDED POPULATION

Computational Systems Biology: Biology X

Evolution & Natural Selection

Drift Inflates Variance among Populations. Geographic Population Structure. Variance among groups increases across generations (Buri 1956)

Stochastic Demography, Coalescents, and Effective Population Size

Mathematical Population Genetics. Introduction to the Stochastic Theory. Lecture Notes. Guanajuato, March Warren J Ewens

Stat 516, Homework 1

Genetic Variation in Finite Populations

Homework 3 posted, due Tuesday, November 29.

Lecture 1 Hardy-Weinberg equilibrium and key forces affecting gene frequency

Genetic hitch-hiking in a subdivided population

Lectures on Probability and Statistical Models

Lecture 24: Multivariate Response: Changes in G. Bruce Walsh lecture notes Synbreed course version 10 July 2013

Elementary Applications of Probability Theory

Theoretical Population Biology

2 Discrete-Time Markov Chains

The genomes of recombinant inbred lines

8. Genetic Diversity

Hitting Probabilities

Problems on Evolutionary dynamics

Endowed with an Extra Sense : Mathematics and Evolution

Contrasts for a within-species comparative method

AARMS Homework Exercises

Modern Discrete Probability Branching processes

Gene Genealogies Coalescence Theory. Annabelle Haudry Glasgow, July 2009

NEUTRAL EVOLUTION IN ONE- AND TWO-LOCUS SYSTEMS

Evolutionary dynamics on graphs

Darwinian Selection. Chapter 6 Natural Selection Basics 3/25/13. v evolution vs. natural selection? v evolution. v natural selection

Finite-Horizon Statistics for Markov chains

Introductory seminar on mathematical population genetics

GENETICS. On the Fixation Process of a Beneficial Mutation in a Variable Environment

6 Introduction to Population Genetics

π b = a π a P a,b = Q a,b δ + o(δ) = 1 + Q a,a δ + o(δ) = I 4 + Qδ + o(δ),

The Wright Fisher Controversy. Charles Goodnight Department of Biology University of Vermont

arxiv: v2 [q-bio.pe] 13 Jul 2011

Mathematical Population Genetics II

Population genetics snippets for genepop

Random Times and Their Properties

Lecture 14 Chapter 11 Biology 5865 Conservation Biology. Problems of Small Populations Population Viability Analysis

Q1) Explain how background selection and genetic hitchhiking could explain the positive correlation between genetic diversity and recombination rate.

Population Genetics. with implications for Linkage Disequilibrium. Chiara Sabatti, Human Genetics 6357a Gonda

Classical Selection, Balancing Selection, and Neutral Mutations

Intensive Course on Population Genetics. Ville Mustonen

STABILIZING SELECTION ON HUMAN BIRTH WEIGHT

Population Genetics & Evolution

Mathematical modelling of Population Genetics: Daniel Bichener

The fate of beneficial mutations in a range expansion

An introduction to mathematical modeling of the genealogical process of genes

Observation: we continue to observe large amounts of genetic variation in natural populations

Invariant Manifolds of models from Population Genetics

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

The neutral theory of molecular evolution

Strong Selection is Necessary for Repeated Evolution of Blindness in Cavefish

(Write your name on every page. One point will be deducted for every page without your name!)

Inférence en génétique des populations IV.

Introduction to Natural Selection. Ryan Hernandez Tim O Connor

Detecting selection from differentiation between populations: the FLK and hapflk approach.

Identify the 6 kingdoms into which all life is classified.

Derivation of Itô SDE and Relationship to ODE and CTMC Models

GCSE Science. Module B3 Life on Earth What you should know. Name: Science Group: Teacher:

Mathematical Population Genetics II

(b) What is the variance of the time until the second customer arrives, starting empty, assuming that we measure time in minutes?

Selection and Population Genetics

6 Introduction to Population Genetics

Optimum selection and OU processes

Population genetics. Key Concepts. Hardy-Weinberg equilibrium 3/21/2019. Chapter 6 The ways of change: drift and selection

Population Genetics and Evolution III

Microevolution Changing Allele Frequencies

Some mathematical models from population genetics

Notes on Population Genetics

THE EVOLUTION OF POPULATIONS THE EVOLUTION OF POPULATIONS

THE genetic consequences of population structure simple migration models with no selection (cited above),

The problem Lineage model Examples. The lineage model

Lecture 4: Random Variables and Distributions

Genetics: Early Online, published on February 26, 2016 as /genetics Admixture, Population Structure and F-statistics

Evolution of quantitative traits

MAT PS4 Solutions

Surfing genes. On the fate of neutral mutations in a spreading population

Transcription:

Statistical population genetics Lecture 2: Wright-Fisher model Xavier Didelot Dept of Statistics, Univ of Oxford didelot@stats.ox.ac.uk Slide 21 of 161

Heterozygosity One measure of the diversity of a population is its heterozygosity. Definition (Heterozygosity). Heterozygosity is the probability that two genes chosen at random from the population have different alleles. In a biallelic WF model, the heterozygosity is equal to: H t = 2 X t M ( 1 X ) t M How does this evolve with time in the WF? Slide 22 of 161

Heterozygosity in the WF Theorem (Heterozygosity under the biallelic WF model). Under the biallelic WF model, the expected heterozygosity decays approximately at rate 1/M when M is large. Slide 23 of 161

Heterozygosity in the WF Proof. E(H t+1 ) = 2 M 2E(X t+1(m X t+1 )) = 2 M 2 { ME(Xt+1 ) E(X 2 t+1) } = 2 { ME(Xt+1 M 2 ) var(x t+1 ) E(X t+1 ) 2} = 2 { } MX M 2 t X t + X2 t M X2 t ( = H t 1 1 ) M By induction on t we get that: E(H t ) = H 0 ( 1 1 M ) t H 0 e t/m Slide 24 of 161

Heterozygosity The decay of the heterozygosity illustrates how genetic drift tends to remove genetic variation from populations. Smaller populations loose variation faster than larger populations. The rate at which heterozygosity decays can be used to estimate the effective population size. Slide 25 of 161

The Buri experiment Once again, the data of Buri (1956) does not fit our expectation whenm = 32 but behaves as ifm = 18. Slide 26 of 161

Fixation X t = 0 and X t = M are absorbing states of the biallelic WF process. Genetic drift leads to either A or a being lost from the population. When this happens, the surviving allele is said to be fixed in the population, and the lost allele is said to be extinct. What is the probability that A will reach fixation rather than a given its initial frequency? Slide 27 of 161

Fixation Theorem (Probability of fixation). The probability that an allele will reach fixation given its initial frequency is equal to its initial frequency. Slide 28 of 161

Slide 29 of 161 Fixation Proof. The result is implied by the fact that E(X t ) remains constant and equal tox 0 : If fixation is reached at timet, then: E(X t ) = P(Afixed) M +P(afixed) 0 so that: P(Afixed) = E(X t )/M = X 0 /M Genealogical approach: eventually all genes in the population will be descended from one unique gene in generation 0, and this gene has probability X 0 /M to be of allele A. Markov Chain approach: let q i be the probability of fixation ofagiven X t = i, solve: q i = M q j P i,j j=0

Examples In the Buri (1956) experiment, 58 of the 107 populations reached fixation: 28 for allele bw 75 and 30 for the other allele. The probability that a new allele appearing in a population through mutation will eventually become fixed is equal to1/m provided no further mutation occurs. What is the expected time before fixation? Slide 30 of 161

Time before fixation Theorem (Time before fixation). Let τ(p) be the expected time before fixation given that X 0 = pm. Then: τ(p) 2M(plog(p)+(1 p)log(1 p)) with the approximation being valid for large populations. Slide 31 of 161

Time before fixation Proof. Ifp = 0 orp = 1, fixation is reached so that τ(0) = 0 and τ(1) = 0. Otherwise, τ(p) is equal to one plus the fixation time in the next step. By summing over all possibilities for the next step, we get: τ(p) = 1+ M P pm,j τ(j/m) j=0 This expresses τ as the solution of a linear equation. Unfortunately, this equation becomes increasingly difficult to solve as M increases. We therefore use an approximation. Slide 32 of 161

Time before fixation Let p t = X t /M. Recall that the variance ofp t+1 about p t is of order 1/M. Thus when M is large, the terms in the sum for whichabs(pm j) is large can be ignored. This suggests a continuous approximation. Let us assume that p is a continuous function in[0,1]. Then we can rewrite as: τ(p) = 1+ ǫ P(p p+ǫ)τ(p+ǫ)dǫ Slide 33 of 161

Time before fixation Sinceǫis small, we can expand τ(p+ǫ) as a Taylor series: τ(p) 1+ P(p p+ǫ)(τ(p)+ǫτ (p)+ǫ 2 τ (p)/2)dǫ ǫ = 1+τ(p)+τ (p) P(p p+ǫ)ǫdǫ ǫ + (τ (p)/2) P(p p+ǫ)ǫ 2 dǫ ǫ = 1+τ(p)+τ (p)e(ǫ)+(τ (p)/2)e(ǫ 2 ) Slide 34 of 161

Time before fixation SinceE(ǫ) = E(p t+1 p t ) = 0 and E(ǫ 2 ) = var(ǫ) = var(p t+1 ) = p(1 p)/m, we have: or τ(p) = 1+τ(p)+τ (p)p(1 p)/(2m) τ (p) = 2M p(1 p) This can be solved with boundary conditions τ(0) = 0 and τ(1) = 0 to give the required result.. Slide 35 of 161

Time before fixation Thus, for the Wright-Fisher model, the expected time to fixation is of order O(M). This is the so-called diffusion approximation to the mean absorption time, although we have not used diffusion theory explicitly here. For example, in the case of a newly appeared mutation, we have p = 1/M and τ(p) 2+2log(M) In the case where p = 1/2, we have τ(p) 1.38M Slide 36 of 161

Summary The pure Wright-Fisher model results in a decay of genetic variation. This is the effect of genetic drift, which is compensated by mutation. It is straightforward to extend the WF model to incorporate mutations. Exact calculations are impossible so that we need to use diffusion approximations as we did to find the time before fixation. This approach was championed in the 50s and 60s by Kimura. This is one of the most sophisticated branches of applied probability. We will avoid these complications! Slide 37 of 161