Introduction to Wright-Fisher Simulations. Ryan Hernandez

Similar documents
Introduction to Natural Selection. Ryan Hernandez Tim O Connor

Darwinian Selection. Chapter 6 Natural Selection Basics 3/25/13. v evolution vs. natural selection? v evolution. v natural selection

Processes of Evolution

Genetics and Natural Selection

Population Genetics I. Bio

The Wright-Fisher Model and Genetic Drift

Life Cycles, Meiosis and Genetic Variability24/02/2015 2:26 PM

Problems for 3505 (2011)

Question: If mating occurs at random in the population, what will the frequencies of A 1 and A 2 be in the next generation?

Outline of lectures 3-6

Population Genetics & Evolution

Microevolution Changing Allele Frequencies

List the five conditions that can disturb genetic equilibrium in a population.(10)

8. Genetic Diversity

NOTES CH 17 Evolution of. Populations

Introduction to population genetics & evolution

Microevolution 2 mutation & migration

How robust are the predictions of the W-F Model?

Outline of lectures 3-6

Genetical theory of natural selection

Mechanisms of Evolution

Natural Selection results in increase in one (or more) genotypes relative to other genotypes.

The Genetics of Natural Selection

Evolution and the Genetics of Structured populations. Charles Goodnight Department of Biology University of Vermont

POPULATIONS. p t+1 = p t (1-u) + q t (v) p t+1 = p t (1-u) + (1-p t ) (v) Phenotypic Evolution: Process HOW DOES MUTATION CHANGE ALLELE FREQUENCIES?

Lecture 14 Chapter 11 Biology 5865 Conservation Biology. Problems of Small Populations Population Viability Analysis

Evolution PCB4674 Midterm exam2 Mar

A. Correct! Genetically a female is XX, and has 22 pairs of autosomes.

Segregation versus mitotic recombination APPENDIX

Mathematical modelling of Population Genetics: Daniel Bichener

Population Genetics. with implications for Linkage Disequilibrium. Chiara Sabatti, Human Genetics 6357a Gonda

Evolution of Populations. Chapter 17

Selection Page 1 sur 11. Atlas of Genetics and Cytogenetics in Oncology and Haematology SELECTION

D. Incorrect! That is what a phylogenetic tree intends to depict.

Tutorial on Theoretical Population Genetics

Big Idea #1: The process of evolution drives the diversity and unity of life

Enduring Understanding: Change in the genetic makeup of a population over time is evolution Pearson Education, Inc.

Topic 09 Evolution. I. Populations A. Evolution is change over time. (change in the frequency of heritable phenotypes & the alleles that govern them)

(Write your name on every page. One point will be deducted for every page without your name!)

Evolution & Natural Selection

Darwinian Selection. Chapter 7 Selection I 12/5/14. v evolution vs. natural selection? v evolution. v natural selection

Chapter 16. Table of Contents. Section 1 Genetic Equilibrium. Section 2 Disruption of Genetic Equilibrium. Section 3 Formation of Species

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics.

AEC 550 Conservation Genetics Lecture #2 Probability, Random mating, HW Expectations, & Genetic Diversity,

The Wright Fisher Controversy. Charles Goodnight Department of Biology University of Vermont

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

19. Genetic Drift. The biological context. There are four basic consequences of genetic drift:

#2 How do organisms grow?

The Mechanisms of Evolution

AP Biology Review Packet 5- Natural Selection and Evolution & Speciation and Phylogeny

Outline of lectures 3-6

STAT 536: Genetic Statistics

Evolution of Populations

Pre-Lab: Aipotu: Evolution

Biology 11 UNIT 1: EVOLUTION LESSON 2: HOW EVOLUTION?? (MICRO-EVOLUTION AND POPULATIONS)

The theory of evolution continues to be refined as scientists learn new information.

1 Errors in mitosis and meiosis can result in chromosomal abnormalities.

Gene Genealogies Coalescence Theory. Annabelle Haudry Glasgow, July 2009

Case Studies in Ecology and Evolution

Darwin s Observations & Conclusions The Struggle for Existence

The genome encodes biology as patterns or motifs. We search the genome for biologically important patterns.

AP Biology Concepts and Connections. Reading Guide. Your Name: ! Chapter 13 How Populations Evolve. Key Terms

mrna Codon Table Mutant Dinosaur Name: Period:

Mutation, Selection, Gene Flow, Genetic Drift, and Nonrandom Mating Results in Evolution

Processes of Evolution

Mechanisms of Evolution Microevolution. Key Concepts. Population Genetics

Lecture 1 Hardy-Weinberg equilibrium and key forces affecting gene frequency

Problems on Evolutionary dynamics

Systems Biology: A Personal View IX. Landscapes. Sitabhra Sinha IMSc Chennai

9 Genetic diversity and adaptation Support. AQA Biology. Genetic diversity and adaptation. Specification reference. Learning objectives.

genome a specific characteristic that varies from one individual to another gene the passing of traits from one generation to the next

overproduction variation adaptation Natural Selection speciation adaptation Natural Selection speciation

UNIT V. Chapter 11 Evolution of Populations. Pre-AP Biology

The neutral theory of molecular evolution

opulation genetics undamentals for SNP datasets

It all depends on barriers that prevent members of two species from producing viable, fertile hybrids.

URN MODELS: the Ewens Sampling Lemma

Selection and Population Genetics

Biology 8 Learning Outcomes

Evolution. Species Changing over time

Study of similarities and differences in body plans of major groups Puzzling patterns:

Quantitative Genetics I: Traits controlled my many loci. Quantitative Genetics: Traits controlled my many loci

Ecology and Evolutionary Biology 2245/2245W Exam 3 April 5, 2012

Chapter 16: Evolutionary Theory

How to Use This Presentation

BIOL Evolution. Lecture 9

Chapter 17: Population Genetics and Speciation

Population Genetics: a tutorial

Functional divergence 1: FFTNS and Shifting balance theory

Is there any difference between adaptation fueled by standing genetic variation and adaptation fueled by new (de novo) mutations?

Linking levels of selection with genetic modifiers

Mechanisms of Evolution. Adaptations. Old Ideas about Evolution. Behavioral. Structural. Biochemical. Physiological

Beaming in your answers

Gene Pool The combined genetic material for all the members of a population. (all the genes in a population)

Evidence of Evolution

STAT 536: Migration. Karin S. Dorman. October 3, Department of Statistics Iowa State University

EXERCISES FOR CHAPTER 3. Exercise 3.2. Why is the random mating theorem so important?

Introductory seminar on mathematical population genetics

Since we re not going to have review this week either

Runaway. demogenetic model for sexual selection. Louise Chevalier. Jacques Labonne

Transcription:

Introduction to Wright-Fisher Simulations Ryan Hernandez 1

Goals Simulate the standard neutral model, demographic effects, and natural selection Start with single sites, and build in multiple sites 2

Hardy-Weinberg Assumptions: rinciple Diploid organism Sexual reproduction Non-overlapping generations Only two alleles Random mating Conclusion 1: Both allele AND genotype frequencies will remain constant at HWE generation after generation... forever! 3 Godfrey H. Hardy: 1877-1947 Wilhelm Weinberg: 1862-1937 Identical frequencies in males/females Infinite population size No migration No mutation No natural selection =p 2 Q=2p(1-p) R=(1-p) 2

Hardy-Weinberg rinciple Imagine a population of diploid individuals A A A a a a =0.81 Q =0.18 R =0.01 frequency AA Aa aa 2 4 6 8 10 4

Hardy-Weinberg rinciple Imagine a population of diploid individuals A A A a a a =0.5 Q =0.1 R =0.4 p = + Q/2 =0.55 p 2 =0.3025 2p(1 p) =0.495 (1 p) 2 =0.2025 frequency AA Aa aa 2 4 6 8 10 5 Conclusion 2: A single round of random mating will return the population to HWE frequencies!

Hardy-Weinberg rinciple Assumptions: Diploid organism Sexual reproduction Non-overlapping generations Only two alleles Random mating Godfrey H. Hardy: 1877-1947 Wilhelm Weinberg: 1862-1937 Identical frequencies in males/females Infinite population size No migration No mutation No natural selection 6

Wright-Fisher Model Sewall Wright: 1889-1988 Sir Ronald Fisher 1890-1962 Suppose a population of N individuals. Let X(t) be the #chromosomes carrying an allele A in generation t: N (X(t + 1) = j X(t) =i) = p j (1 p) N j j j N j N i N i =Bin(j N,i/N) = j N N 7

Wright-Fisher Model A simple R function to simulation genetic drift: Initial pop size Starting frequency number of simulated generations WF=function(N, p, G){ t=array(,dim=g); t[1] = p; for(i in 2:G){ t[i] = rbinom(1,n,t[i-1])/n; } return(t); } Run it in R using: f=wf(100, 0.5, 200) plot(f) 8

Wright-Fisher Model 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 9

Demographic Effects opulation changes size at a given generation 10

Wright-Fisher Model Sewall Wright: 1889-1988 Sir Ronald Fisher 1890-1962 Suppose a population of N individuals. Let X(t) be the #chromosomes carrying an allele A in generation t: N (X(t + 1) = j X(t) =i) = p j (1 p) N j j j N j N i N i =Bin(j N,i/N) = j N N 11

A simple R function to simulation demographic effects: Run it using: WFdemog = function(n, p, G, Gd, v){ t=array(,dim=g); t[1] = p; for(i in 2:G){ if(i == Gd){ N = N*v; } t[i] = rbinom(1,n,t[i-1])/n; } return(t); } f=wfdemog(100, 0.5, 200, 50, 100) plot(f) Wright-Fisher Model Initial pop size Starting frequency s to simulate demographic event happens Magnitude of size change 12

Wright-Fisher Model with Expansion 13

Wright-Fisher Model with Contraction Run it using: WFdemog(100, 0.5, 200, 50, 0.1) 14

Wright-Fisher Model with Contraction Run it using: WFdemog(100, 0.5, 200, 50, 0.1) 14

Hardy-Weinberg rinciple Assumptions: Diploid organism Sexual reproduction Non-overlapping generations Only two alleles Random mating Identical frequencies in males/females Infinite population size No migration No mutation No natural selection What happens when we allow natural selection to occur? Alleles change frequency! 15

Natural Selection Genotype AA Aa aa Frequency p 2 2pq q 2 Fitness 1 1+hs 1+s The expected frequency in the next generation (q ) is then the density of offspring produced by carriers of the derived allele divided by the population fitness: q 0 = q2 (1 + s)+pq(1 + hs) 1+sq(2hp + q) 16

Natural Selection Trajectory of selected allele with various selection coefficients under genic selection (h=0.5) in an infinite population Frequency of A allele 0.5 0.3 0.2 0.1 0.05 0 20 40 60 80 100 17

Hardy-Weinberg rinciple Assumptions: Diploid organism Sexual reproduction Non-overlapping generations Only two alleles Random mating Identical frequencies in males/females Infinite population size No migration No mutation No natural selection What happens with natural selection in a finite population? Directional selection AND drift! 18

Simulating Natural Selection First write an R function for the change in allele frequencies: fitfreq = function(q, h, s){ p=1-q; return((q^2*(1+s) + p*q*(1+h*s))/( 1 + s*q*(2*h*p+q))); } Now use this in an updated WF simulator: initial freq dominance fitness pop size initial freq dominance fitness gens to simulate WF.sel=function(N, q, h, s, G){ t=array(,dim=g); t[1] = N*q; for(i in 2:G){ t[i] = rbinom(1,n,fitfreq(t[i-1], h, s))/n; } return(t); } 19

20 Natural Selection WF.sel(100, 0.01, 0.5, 0.1, 100)

Natural Selection WF.sel(100, 0.01, 0.5, 0.1, 100) 1 simulations 10 simulations Frequency 0.0 0.4 0.8 Frequency 0.0 0.4 0.8 0 20 40 60 80 100 50 simulations 0 20 40 60 80 100 100 simulations Frequency 0.0 0.4 0.8 Frequency 0.0 0.4 0.8 20 0 20 40 60 80 100 0 20 40 60 80 100

Simulating Natural Selection How would you simulate both selection AND demographic effects? Now use this in an updated WF simulator: pop size initial freq dominance fitness gens to simulate Gen demographic event happens Magnitude of size change WF.demsel=function(N, q, h, s, G, Gd, v){ t=array(,dim=g); t[1] = N*q; for(i in 2:G){ if(i == Gd){ N = N*v; } t[i] = rbinom(1,n,fitfreq(t[i-1], h, s))/n; } return(t); } 21

Wright-Fisher Model with Contraction Run it using: WF.demsel(100,0.5,0.5,0.1,100,50,100) 22

Wright-Fisher Model with Contraction Run it using: WF.demsel(100,0.5,0.5,0.1,100,50,100) Frequency 0 20 40 60 80 100 0 20 40 60 80 100 Frequency Frequency Frequency 22 0 20 40 60 80 100 0 20 40 60 80 100

What parameters generated these? Frequency 0 20 40 60 80 100 0 20 40 60 80 100 Frequency Frequency Frequency 0 20 40 60 80 100 23 0 20 40 60 80 100

Article Functional Segregation of Overlapping Genes in HIV Jason D. Fernandes, 1,2 Tyler B. Faust, 1,3 Nicolas B. Strauli, 4,5 Cynthia Smith, 1 David C. Crosby, 1 Robert L. Nakamura, 1 Ryan D. Hernandez, 4 and Alan D. Frankel 1,6, * 1 HIV genes Tat and Rev overlap. At protein level, many overlapping sites are conserved in both, but some sites only conserved in Rev. 24 Fernandez, et al., Cell (2016) Is joint conservation due to dual function or genetic code?

-0.5 Unfit Experimental Fitness 0 Neutral 0.5 Fit 1 10 20 30 40 50 60 70 80 86 Non-Overlap Deep Mutational Scanning M E V D R L E W H G S Q T A C T N C Y C C C F H C Q V C F M T A L G I S Y G R R R Q R R R A H Q N S Q T H Q A S L S Q T S Q S R G D T G E Aromatic Aliphatic olar Charged W F Y M I L V A G C S T Q N D E H R * roline-rich Cysteine-rich Core ARM 1 Vpr 10 20 30 40 Rev 50 60 70 Env 80 86 M E V D N L E W H G S Q T A C T N C Y C C C Y H C Q V C F L G L G I S Y G R R R Q R R R T Q S S D H Q N I S Q L S Q T R G D T G E Overlap atient Conservation Aromatic Aliphatic olar Charged W F Y M I L V A G C S T Q N D E H R * 25 Allele Frequency 0% 5% 100% Functional Segregation of Overlapping Genes in HIV Article Jason D. Fernandes, 1,2 Tyler B. Faust, 1,3 Nicolas B. Strauli, 4,5 Cynthia Smith, 1 David C. Crosby, 1 Robert L. Nakamura, 1 Ryan D. Hernandez, 4 and Alan D. Frankel 1,6, * 1 In patient data, Tat sites that overlap with Rev are highly conserved. HIV can be engineered so that Tat and Rev do not overlap Deep mutational scanning in non-overlap context (all possible codons at each position) shows that many sites lack conservation in cell lines. Is this due to drift (neutral) or selection? Fernandez, et al., Cell (2016)

Functional Segregation of Overlapping Genes in HIV Deep mutational scanning: Create exhaustive libraries with all possible codons at all overlapping positions Allow population mixture to evolve for G generations, then sequence to measure final frequencies of all amino acids Simulate to evaluate significance of allele frequency change Factors you might want to include in your simulation: the overall population growth function the number of generations the starting allele frequency the ending read depth for the experiment and the amino acid identity of the allele 26 Fernandez, et al., Cell (2016)

ositively selected Neutral Negatively selected 27 Fernandez, et al., Cell (2016)

Natural Selection Time-course data from artificial selection/ancient DNA Let s estimate some selection coefficients! Given 2 alleles at a locus with frequencies p0 and q0, and fitnesses w1 and w2 (with w the population-wide fitness). Expected freq. in next generation is: p1=p =p0w1/w. We can then write: p 1 = p 0w 1 /w q 1 q 0 w 2 /w = Using induction, you could prove for any generation t: p0 q 0 w1 w 2 28 p t = p 0w 1 /w q t q 0 w 2 /w = p0 q 0 w1 w 2 t

Natural Selection Taking the natural log of this equation: log pt q t = log w1 w 2 t + log p0 q 0 Which is now a linear function of t, the number of generations. Therefore, the ratio of the fitnesses w1/w2 = e slope 29

Natural Selection Experiment: Set up a population of bacteria in a chemostat, and let them reproduce. Sample roughly every 5 generations. A slope of 0.139 implies: w1=e 0.139 = 1.15 Assume w2=1. Thus, allele p has a 15% fitness advantage over allele q! ln(p/q) 0 1 2 3 4 5 slope = 0.139 (simulated with 20% advantage) 30 0 5 10 15 20 25 30 35

Hidden: Adaptive sites with tiny allele frequency changes Hidden: Neutral sites with large allele frequency changes 31 Fernandez, et al., Cell (2016)

Existing forward simulators SFS_CODE: Hernandez (2008) Command-line flexibility shameless plug! SLIM 2: Haller & Messer (2017) R-like scripting environment that provides control over most aspects of the simulated evolutionary scenarios FWD: Thornton (2014) C++ library of routines intended to facilitate the development of forward-time simulations under 32 arbitrary mutation and fitness models