Introduction to Natural Selection. Ryan Hernandez Tim O Connor

Similar documents
Introduction to Wright-Fisher Simulations. Ryan Hernandez

The Wright-Fisher Model and Genetic Drift

Population Genetics I. Bio

Question: If mating occurs at random in the population, what will the frequencies of A 1 and A 2 be in the next generation?

Genetics and Natural Selection

Life Cycles, Meiosis and Genetic Variability24/02/2015 2:26 PM

Introduction to population genetics & evolution

8. Genetic Diversity

URN MODELS: the Ewens Sampling Lemma

Processes of Evolution

Microevolution Changing Allele Frequencies

Mechanisms of Evolution

Population Genetics. with implications for Linkage Disequilibrium. Chiara Sabatti, Human Genetics 6357a Gonda

#2 How do organisms grow?

19. Genetic Drift. The biological context. There are four basic consequences of genetic drift:

Evolution and the Genetics of Structured populations. Charles Goodnight Department of Biology University of Vermont

Population Genetics 7: Genetic Drift

genome a specific characteristic that varies from one individual to another gene the passing of traits from one generation to the next

Outline of lectures 3-6

The Genetics of Natural Selection

AP Biology Evolution Review Slides

UNIT 8 BIOLOGY: Meiosis and Heredity Page 148

GENETICS - CLUTCH CH.1 INTRODUCTION TO GENETICS.

Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin

A. Correct! Genetically a female is XX, and has 22 pairs of autosomes.

How robust are the predictions of the W-F Model?

Lecture 14 Chapter 11 Biology 5865 Conservation Biology. Problems of Small Populations Population Viability Analysis

(Write your name on every page. One point will be deducted for every page without your name!)

Heredity and Genetics WKSH

Outline of lectures 3-6

STAT 536: Genetic Statistics

Q Expected Coverage Achievement Merit Excellence. Punnett square completed with correct gametes and F2.

NOTES CH 17 Evolution of. Populations

Problems for 3505 (2011)

AEC 550 Conservation Genetics Lecture #2 Probability, Random mating, HW Expectations, & Genetic Diversity,

Microevolution 2 mutation & migration

Population Structure

Darwinian Selection. Chapter 6 Natural Selection Basics 3/25/13. v evolution vs. natural selection? v evolution. v natural selection

Name Class Date. KEY CONCEPT Gametes have half the number of chromosomes that body cells have.

Notes for MCTP Week 2, 2014

Natural Selection results in increase in one (or more) genotypes relative to other genotypes.

Chapter 13 Meiosis and Sexual Reproduction

Big Idea #1: The process of evolution drives the diversity and unity of life

Tutorial on Theoretical Population Genetics

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics.

Segregation versus mitotic recombination APPENDIX

Darwinian Selection. Chapter 7 Selection I 12/5/14. v evolution vs. natural selection? v evolution. v natural selection

Selection Page 1 sur 11. Atlas of Genetics and Cytogenetics in Oncology and Haematology SELECTION

Mutation, Selection, Gene Flow, Genetic Drift, and Nonrandom Mating Results in Evolution

Population Genetics & Evolution

Rebops. Your Rebop Traits Alternative forms. Procedure (work in pairs):

POPULATIONS. p t+1 = p t (1-u) + q t (v) p t+1 = p t (1-u) + (1-p t ) (v) Phenotypic Evolution: Process HOW DOES MUTATION CHANGE ALLELE FREQUENCIES?

The Wright Fisher Controversy. Charles Goodnight Department of Biology University of Vermont

What is a sex cell? How are sex cells made? How does meiosis help explain Mendel s results?

Natural Selection. Population Dynamics. The Origins of Genetic Variation. The Origins of Genetic Variation. Intergenerational Mutation Rate

STAT 536: Migration. Karin S. Dorman. October 3, Department of Statistics Iowa State University

Meiosis. Activity. Procedure Part I:

The theory of evolution continues to be refined as scientists learn new information.

Population Genetics: a tutorial

Evolution of Populations. Chapter 17

Learning Objectives:

Labs 7 and 8: Mitosis, Meiosis, Gametes and Genetics

Name. Diversity of Life

Meiosis and Mendel. Chapter 6

Mathematical modelling of Population Genetics: Daniel Bichener

D. Incorrect! That is what a phylogenetic tree intends to depict.

EXERCISES FOR CHAPTER 3. Exercise 3.2. Why is the random mating theorem so important?

Full file at CHAPTER 2 Genetics

Outline for today s lecture (Ch. 14, Part I)

Science Unit Learning Summary

Science 9 Unit 2 pack: Reproduction

List the five conditions that can disturb genetic equilibrium in a population.(10)

Effective population size and patterns of molecular evolution and variation

Introductory seminar on mathematical population genetics

Sexual and Asexual Reproduction. Cell Reproduction TEST Friday, 11/13

Outline of lectures 3-6

AP Biology Review Packet 5- Natural Selection and Evolution & Speciation and Phylogeny

Chromosome Chr Duplica Duplic t a ion Pixley

Evolution & Natural Selection

Math 345 Intro to Math Biology Lecture 7: Models of System of Nonlinear Difference Equations

Module B Unit 5 Cell Growth and Reproduction. Mr. Mitcheltree

UNIT 3: GENETICS 1. Inheritance and Reproduction Genetics inheritance Heredity parent to offspring chemical code genes specific order traits allele

Cover Requirements: Name of Unit Colored picture representing something in the unit

Introduction to Genetics

2. the variants differ with respect to their expected abilities to survive and reproduce in the present environment (S 0), then

Lesson Overview Meiosis

Gene Pool The combined genetic material for all the members of a population. (all the genes in a population)

Selection and Population Genetics

Discrete probability and the laws of chance

Genetical theory of natural selection

Objective 3.01 (DNA, RNA and Protein Synthesis)

Chapter 16. Table of Contents. Section 1 Genetic Equilibrium. Section 2 Disruption of Genetic Equilibrium. Section 3 Formation of Species

Strong Selection is Necessary for Repeated Evolution of Blindness in Cavefish

The genome encodes biology as patterns or motifs. We search the genome for biologically important patterns.

Chapter 13: Meiosis and Sexual Life Cycles

Enduring Understanding: Change in the genetic makeup of a population over time is evolution Pearson Education, Inc.

Chapter 4 Lesson 1 Heredity Notes

Processes of Evolution

Guided Reading Chapter 1: The Science of Heredity

Evolutionary Genetics Midterm 2008

Transcription:

Introduction to Natural Selection Ryan Hernandez Tim O Connor 1

Goals Learn about the population genetics of natural selection How to write a simple simulation with natural selection 2

Basic Biology genome Functional non-coding mutations far { near { 5 3 micro-rna gene trans-regulatory region (enhancers) Un-Translated Regions (UTRs) cis-regulatory region (promoters: transcription factor binding sites) noncoding RNA: snornas, sirnas, pirnas, long ncrnas Promoter 3 Overall, ~2% of the human genome is protein coding ~5% of genome is obviously functional ~80% of genome has functional activity

Life Cycle Sexual selection Parents Gametic selection Adults Gametes (sperm & eggs) Viability selection Zygotes Compatibility selection 4

Modern Human Genomics: A case for rare variants? 1.1 10-8 6 10 9 = 66 [muts / person] 66 [muts/p] 130M [p/y] 3B [bp] 2.86 muts/bp/yr 5

Sequencing Data Chromosome SNP 1 SNP 2 SNP 3 SNP 4 SNP 5 SNP 6 1 A C A G C C 2 A T G A C T 3 G T G A T T 4 A C G A C T # Pairwise differences 3 4 3 3 3 3! = average pairwise diversity 6

Sequencing Data Chromosome SNP 1 SNP 2 SNP 3 SNP 4 SNP 5 SNP 6 1 A C A G C C 2 A T G A C T 3 G T G A T T 4 A C G A C T # Pairwise differences 3 4 3 3 3 3 # Compared 6 6 6 6 6 6! = average pairwise diversity 7

Sequencing Data Chromosome SNP 1 SNP 2 SNP 3 SNP 4 SNP 5 SNP 6 1 A C A G C C 2 A T G A C T 3 G T G A T T 4 A C G A C T # Pairwise differences 3 4 3 3 3 3 # Compared 6 6 6 6 6 6 Avg. Pairwise Diff 0.5 0.67 0.5 0.5 0.5 0.5 Number of variants: 6 SNPs Diversity (π): 3.1667/L 8

Sequencing Data Chromosome SNP 1 SNP 2 SNP 3 SNP 4 SNP 5 SNP 6 1 A C A G C C 2 A T G A C T 3 G T G A T T 4 A C G A C T Minor Allele A C A G C T MAF 0.25 0.5 0.25 0.25 0.25 0.25 5 MAF 5 1 Number of SNPs 2.5 0 1/4 2/5 9 Minor Allele Frequency

Sequencing Data Chromosome SNP 1 SNP 2 SNP 3 SNP 4 SNP 5 SNP 6 1 A C A G C C 2 A T G A C T 3 G T G A T T 4 A C G A C T Minor Allele A C A G C T MAF 0.25 0.5 0.25 0.25 0.25 0.25 0.9 MAF 5 1 Fraction of SNPs 0.45 0 1/4 2/5 10 Minor Allele Frequency

Sequencing Data Chromosome SNP 1 SNP 2 SNP 3 SNP 4 SNP 5 SNP 6 1 A C A G C C 2 A T G A C T 3 G T G A T T 4 A C G A C T Chimp A C A G C T 11

Sequencing Data Chromosome SNP 1 SNP 2 SNP 3 SNP 4 SNP 5 SNP 6 1 A C A G C C 2 A T G A C T 3 G T G A T T 4 A C G A C T Chimp A C A G C T 12

Sequencing Data Chromosome SNP 1 SNP 2 SNP 3 SNP 4 SNP 5 SNP 6 13 Proportion of SNPs 1 A C A G C C 2 A T G A C T 3 G T G A T T 4 A C G A C T Chimp A C A G C T Derived count 0.5 0.333 0.167 1 2 3 3 1 1 0 1 2 3 Derived frequency in sample Site-Frequency Spectrum (SFS)

Site-Frequency Spectrum * * * * * * * * * * * * * * * * 1 C A T T C G A A G C G A T C A G G C T A T A 2 C A T T T G A G A C G A T C A G G C T A T A 3 C G T T T G A G A C G A T T A G G C C A T A 4 C A T T C G A G A C G A T C A G G C T A T A outgroup T A C C C A G G A G A T A C G C A T T T A T = non-coding = synonymous = nonsynonymous * - Substitution between species 14

Site-Frequency Spectrum The proportion of SNPs at each frequency in a sample of chromosomes. proportion of SNPs 0.0 0.1 0.2 0.3 0.4 0.5 1 2 3 4 5 6 7 8 9 10 11 15 derived count in sample of 12 chrs. 15

Site-Frequency Spectrum SNM AfAm (Human) Ch (RheMac) Rufi (rice) Indica (rice) Japonica (rice) In (RheMac) proportion of SNPs 0.0 0.1 0.2 0.3 0.4 0.5 Characteristic signature of rapid adaptation 1 2 3 4 5 6 7 8 9 10 11 16 derived count in sample of 12 chrs. 16

Population Genetics Imagine a population of diploid individuals P Q R Principles of random mating: Any two individuals are equally likely to mate and reproduce to populate the next generation. 17 Either chromosome is equally likely to be passed on.

Hardy-Weinberg Assumptions: Principle Diploid organism Sexual reproduction Non-overlapping generations Only two alleles Random mating Conclusion 1: Both allele AND genotype frequencies will remain constant at HWE generation after generation... forever! 18 Godfrey H. Hardy: 1877-1947 Wilhelm Weinberg: 1862-1937 Identical frequencies in males/females Infinite population size No migration No mutation No natural selection P=p 2 Q=2p(1-p) R=(1-p) 2

Hardy-Weinberg Principle Imagine a population of diploid individuals A A A a a a P =0.81 Q =0.18 R =0.01 frequency 0.0 0.2 0.4 0.6 0.8 1.0 AA Aa aa 2 4 6 8 10 19

Hardy-Weinberg Principle Imagine a population of diploid individuals A A A a a a P =0.5 Q =0.1 R =0.4 p = P + Q/2 =0.55 p 2 =0.3025 2p(1 p) =0.495 (1 p) 2 =0.2025 frequency 0.0 0.2 0.4 0.6 0.8 1.0 AA Aa aa 2 4 6 8 10 20 Conclusion 2: A single round of random mating will return the population to HWE frequencies!

Hardy-Weinberg Principle Assumptions: Diploid organism Sexual reproduction Non-overlapping generations Only two alleles Random mating Godfrey H. Hardy: 1877-1947 Wilhelm Weinberg: 1862-1937 Identical frequencies in males/females Infinite population size No migration No mutation No natural selection 21

Hardy-Weinberg Equilibrium 22

Hardy-Weinberg Equilibrium 23 Graham Coop

Genetic Drift In finite populations, allele frequencies can and do change over time. In fact, EVERY genetic variant will either be lost from the population (p=0) or fixed in the population (p=1) some time in the future. The most common model for finite populations is the Wright-Fisher model. This model makes explicit use of the binomial distribution. 24

Bernoulli Distribution Jacob Bernoulli 1655-1705 One of the simplest probability distributions A discrete probability distribution Classic example: tossing a coin If a coin toss comes up heads with probability p, it results in tails with probability 1-p. If X is a Bernoulli Random Variable, x is an observation we write: ( p if x =1 f(x p) = 1 p if x =0 The Expected Value is E[X] = p, and the Variance is V[X] = p(1-p). 25

Binomial Distribution We introduced the Bernoulli Distribution, where we imagine a coin flip resulting in heads with probability p. But if we flipped the coin N times, how many heads would we expect? What is the probability that we get heads all N times? The number of successes in a fixed number of trials is described by the Binomial Distribution. Written out, if the probability of each success is p, then the probability we observe j successes in N trials is: P (j N,p) = N p j (1 p) N j N ; j j 26 N! = j!(n j)!

Binomial Mean and Variance The mean of a Binomial Random Variable is: E[J] = Np With variance: V[J] = p(1-p)/n 27

Wright-Fisher Model Sewall Wright: 1889-1988 Sir Ronald Fisher 1890-1962 Suppose a population of N individuals. Let X(t) be the #chromosomes carrying an allele A in generation t: N P (X(t + 1) = j X(t) =i) = p j (1 p) N j j j N j N i N i =Bin(j N,i/N) = j N N 28

Wright-Fisher Model A simple R function to simulation genetic drift: Run it in R using: f=wf(100, 0.5, 200) plot(f) WF=function(N, p, G){ t=array(,dim=g); t[1] = p; for(i in 2:G){ t[i] = rbinom(1,n,t[i-1])/n; } return(t); } 29

Wright-Fisher Model 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 30

Demographic Effects What do you think will happen if a population grows? Or shrinks? 31

Wright-Fisher Model Sewall Wright: 1889-1988 Sir Ronald Fisher 1890-1962 Suppose a population of N individuals. Let X(t) be the #chromosomes carrying an allele A in generation t: N P (X(t + 1) = j X(t) =i) = p j (1 p) N j j j N j N i N i =Bin(j N,i/N) = j N N 32

Wright-Fisher Model A simple R function to simulation genetic drift: WFdemog = function(n, p, G, Gd, v){ t=array(,dim=g); t[1] = p; for(i in 2:G){ if(i == Gd){ N = N*v; } t[i] = rbinom(1,n,t[i-1])/n; } return(t); } 33

Wright-Fisher Model with Expansion 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 34 Run it using: WFdemog(100, 0.5, 200, 50, 100)

Wright-Fisher Model with Contraction 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 35 Run it using: WFdemog(100, 0.5, 200, 50, 0.1)

Hardy-Weinberg Principle Assumptions: Diploid organism Sexual reproduction Non-overlapping generations Only two alleles Random mating Identical frequencies in males/females Infinite population size No migration No mutation No natural selection What happens when we allow natural selection to occur? Alleles change frequency! 36

Natural Selection Usually parameterized in terms of a dominance coefficient (h), and a selection coefficient (s), with wildtype fitness set to 1: Genotype AA Aa aa Frequency p 2 2pq q 2 Fitness 1 1+hs 1+s h=1 is completely dominant h=0 is completely recessive h=0.5 is genic selection, or codominance, or additive fitness 37

Natural Selection Genotype AA Aa aa Frequency p 2 2pq q 2 Fitness 1 1+hs 1+s How do we model the change in allele frequencies? What is fitness?! Refers to the average number of offspring individuals with a particular genotype will have. Wild-type individuals have on average 1 offspring, while homozygous derived individuals have on average 1+s offspring. 38

Natural Selection Genotype AA Aa aa Frequency p 2 2pq q 2 Fitness 1 1+hs 1+s In this case, s is the absolute fitness. If the population size is fixed, then we need to consider relative fitness. That is, how fit is an individual genotype relative to the population. For this, we need to know average population fitness! 39 w = p 2 (1) + 2pq(1 + hs)+q 2 (1 + s) =1+sq(2hp + q)

Natural Selection Genotype AA Aa aa Frequency p 2 2pq q 2 Fitness 1 1+hs 1+s The expected frequency in the next generation (q ) is then the density of offspring produced by carriers of the derived allele divided by the population fitness: q 0 = q2 (1 + s)+pq(1 + hs) 1+sq(2hp + q) 40

Natural Selection Trajectory of selected allele with various selection coefficients under genic selection (h=0.5) in an infinite population Frequency of A allele 0.0 0.2 0.4 0.6 0.8 1.0 0.5 0.3 0.2 0.1 0.05 0 20 40 60 80 100 41

Hardy-Weinberg Principle Assumptions: Diploid organism Sexual reproduction Non-overlapping generations Only two alleles Random mating Identical frequencies in males/females Infinite population size No migration No mutation No natural selection What happens with natural selection in a finite population? Directional selection AND drift! 42

Simulating Natural Selection First write an R function for the change in allele frequencies: fitfreq = function(q, h, s){ p=1-q; return((q^2*(1+s) + p*q*(1+h*s))/( 1 + s*q*(2*h*p+q))); } Now use this in an updated WF simulator: WF.sel=function(N, q, h, s, G){ t=array(,dim=g); t[1] = N*q; for(i in 2:G){ t[i] = rbinom(1,n,fitfreq(t[i-1]/n, h, s)); } return(t); } 43

Natural Selection N=100; s=0.1; h=0.5 1 simulations 10 simulations Frequency 0.0 0.4 0.8 Frequency 0.0 0.4 0.8 0 20 40 60 80 100 50 simulations 0 20 40 60 80 100 100 simulations Frequency 0.0 0.4 0.8 Frequency 0.0 0.4 0.8 44 0 20 40 60 80 100 0 20 40 60 80 100

Natural Selection Fixation Probability 0.0 0.1 0.2 0.3 0.4 0.5 Simulated (1 exp( s))(1 exp( Ns) 45 0.1 0.0 0.1 0.2 0.3 0.4 0.5 selection coefficient (s) Estimating the probability of fixation of a new mutation (p0=1/n) 5000 simulations: N=100; h=0.5 Pr(Fixation s=0, p0) = p0!!

Natural Selection Time-course data from artificial selection/ancient DNA Let s estimate some selection coefficients! Given 2 alleles at a locus with frequencies p0 and q0, and fitnesses w1 and w2 (with w the population-wide fitness). Expected freq. in next generation is: p1=p =p0w1/w. We can then write: p 1 = p 0w 1 /w q 1 q 0 w 2 /w = Using induction, you could prove for any generation t: p0 q 0 w1 w 2 46 p t = p 0w 1 /w q t q 0 w 2 /w = p0 q 0 w1 w 2 t

Natural Selection Taking the natural log of this equation: log pt q t = log w1 w 2 t + log p0 q 0 Which is now a linear function of t, the number of generations. Therefore, the ratio of the fitnesses w1/w2 = e slope 47

Natural Selection Experiment: Set up a population of bacteria in a chemostat, and let them reproduce. Sample roughly every 5 generations. A slope of 0.139 implies: w1=e 0.139 = 1.15 Assume w2=1. Thus, allele p has a 15% fitness advantage over allele q! ln(p/q) 0 1 2 3 4 5 slope = 0.139 (simulated with 20% advantage) 48 0 5 10 15 20 25 30 35

Summary Hardy-Weinberg Equilibrium requires many assumptions, all of which are routinely violated in natural populations. Nevertheless, the vast majority of variants are in HWE. Deviations almost always due to technical artifacts! Simulating Wright-Fisher models is easy! Natural selection changes the expected allele frequency in the next generation. But drift still acts in finite populations! 49