Population Structure

Similar documents
Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin

LECTURE # How does one test whether a population is in the HW equilibrium? (i) try the following example: Genotype Observed AA 50 Aa 0 aa 50

Population genetics snippets for genepop

Observation: we continue to observe large amounts of genetic variation in natural populations

19. Genetic Drift. The biological context. There are four basic consequences of genetic drift:

STAT 536: Migration. Karin S. Dorman. October 3, Department of Statistics Iowa State University

Chapter 6 Linkage Disequilibrium & Gene Mapping (Recombination)

(Write your name on every page. One point will be deducted for every page without your name!)

Lecture 13: Variation Among Populations and Gene Flow. Oct 2, 2006

Population Genetics I. Bio

Genetic Variation in Finite Populations

F SR = (H R H S)/H R. Frequency of A Frequency of a Population Population

The Wright-Fisher Model and Genetic Drift

Lecture 13: Population Structure. October 8, 2012

Breeding Values and Inbreeding. Breeding Values and Inbreeding

How robust are the predictions of the W-F Model?

Notes on Population Genetics

Shane s Simple Guide to F-statistics

Problems for 3505 (2011)

1.5.1 ESTIMATION OF HAPLOTYPE FREQUENCIES:

CONGEN Population structure and evolutionary histories

Lecture 1 Hardy-Weinberg equilibrium and key forces affecting gene frequency

Microevolution 2 mutation & migration

Processes of Evolution

Effective population size and patterns of molecular evolution and variation

Genetic hitch-hiking in a subdivided population

Life Cycles, Meiosis and Genetic Variability24/02/2015 2:26 PM

Evolutionary Theory. Sinauer Associates, Inc. Publishers Sunderland, Massachusetts U.S.A.

Darwinian Selection. Chapter 7 Selection I 12/5/14. v evolution vs. natural selection? v evolution. v natural selection

8. Genetic Diversity

Case Studies in Ecology and Evolution

Q1) Explain how background selection and genetic hitchhiking could explain the positive correlation between genetic diversity and recombination rate.

Population Genetics. with implications for Linkage Disequilibrium. Chiara Sabatti, Human Genetics 6357a Gonda

NOTES CH 17 Evolution of. Populations

Demography April 10, 2015

Lecture 2. Basic Population and Quantitative Genetics

Big Idea #1: The process of evolution drives the diversity and unity of life

Population Genetics & Evolution

Outline of lectures 3-6

EXERCISES FOR CHAPTER 3. Exercise 3.2. Why is the random mating theorem so important?

Question: If mating occurs at random in the population, what will the frequencies of A 1 and A 2 be in the next generation?

Population Genetics 7: Genetic Drift

The genomes of recombinant inbred lines

A. Correct! Genetically a female is XX, and has 22 pairs of autosomes.

Introduction to Natural Selection. Ryan Hernandez Tim O Connor

AEC 550 Conservation Genetics Lecture #2 Probability, Random mating, HW Expectations, & Genetic Diversity,

NEUTRAL EVOLUTION IN ONE- AND TWO-LOCUS SYSTEMS

Neutral Theory of Molecular Evolution

Notes for MCTP Week 2, 2014

Outline of lectures 3-6

1 Errors in mitosis and meiosis can result in chromosomal abnormalities.

Q Expected Coverage Achievement Merit Excellence. Punnett square completed with correct gametes and F2.

Mathematical Population Genetics II

Lecture 3. Introduction on Quantitative Genetics: I. Fisher s Variance Decomposition

Introduction to population genetics & evolution

Week 7.2 Ch 4 Microevolutionary Proceses

Levels of genetic variation for a single gene, multiple genes or an entire genome

Chapter 02 Population Genetics

Mathematical Population Genetics II

Chromosome Chr Duplica Duplic t a ion Pixley

POPULATIONS. p t+1 = p t (1-u) + q t (v) p t+1 = p t (1-u) + (1-p t ) (v) Phenotypic Evolution: Process HOW DOES MUTATION CHANGE ALLELE FREQUENCIES?

Relatedness 1: IBD and coefficients of relatedness

Biology 211 (1) Exam 4! Chapter 12!

A Primer of Population Genetics

Chapter 16. Table of Contents. Section 1 Genetic Equilibrium. Section 2 Disruption of Genetic Equilibrium. Section 3 Formation of Species

Applications of Genetics to Conservation Biology

Genetical theory of natural selection

Evolution and the Genetics of Structured populations. Charles Goodnight Department of Biology University of Vermont

The E-M Algorithm in Genetics. Biostatistics 666 Lecture 8

Outline for today s lecture (Ch. 14, Part I)

Name Class Date. KEY CONCEPT Gametes have half the number of chromosomes that body cells have.

There are 3 parts to this exam. Use your time efficiently and be sure to put your name on the top of each page.

Gene Pool The combined genetic material for all the members of a population. (all the genes in a population)

overproduction variation adaptation Natural Selection speciation adaptation Natural Selection speciation

Population Genetics: a tutorial

Statistical Genetics I: STAT/BIOST 550 Spring Quarter, 2014

I of a gene sampled from a randomly mating popdation,

UNIT 8 BIOLOGY: Meiosis and Heredity Page 148

Lecture 2: Introduction to Quantitative Genetics

Linkage and Linkage Disequilibrium

URN MODELS: the Ewens Sampling Lemma

Estimating effective population size from samples of sequences: inefficiency of pairwise and segregating sites as compared to phylogenetic estimates

Chapter 8: Evolution and Natural Selection

Genetics: Early Online, published on February 26, 2016 as /genetics Admixture, Population Structure and F-statistics

CHAPTER 23 THE EVOLUTIONS OF POPULATIONS. Section C: Genetic Variation, the Substrate for Natural Selection

Introductory seminar on mathematical population genetics

Match probabilities in a finite, subdivided population

A SIMPLE METHOD TO ACCOUNT FOR NATURAL SELECTION WHEN

Functional divergence 1: FFTNS and Shifting balance theory

Introduction to Advanced Population Genetics

Outline of lectures 3-6

There are 3 parts to this exam. Take your time and be sure to put your name on the top of each page.

Meiosis and Mendel. Chapter 6

ACCORDING to current estimates of spontaneous deleterious

Conservation Genetics. Outline

Linear Regression (1/1/17)

Lecture 18 : Ewens sampling formula

The theory of evolution continues to be refined as scientists learn new information.

A consideration of the chi-square test of Hardy-Weinberg equilibrium in a non-multinomial situation

Gene Flow, Genetic Drift Natural Selection

Transcription:

Ch 4: Population Subdivision Population Structure v most natural populations exist across a landscape (or seascape) that is more or less divided into areas of suitable habitat v to the extent that populations are isolated, they will become genetically differentiated due to genetic drift, selection, and eventually mutation v genetic differentiation among populations is relevant to conservation biology as well as fundamental questions about how adaptive evolution proceeds 1

Definitions v panmixia v population structure v subpopulation v gene flow v isolation by distance v vicariance (vicariant event) Structure Results in Inbreeding v given finite population size, autozygosity gradually increases because the members of a population share common ancestors ² even when there is no close inbreeding 2

Identical by Descent v what is the probability that two randomly sampled alleles are identical by descent (i.e., replicas of a gene present in a previous generation )? ² Wright s fixation index F v at the start of the process (time 0), declare all alleles in the population to be unique or unrelated, F t = 0 at t = 0 v in the next generation, the probability of two randomly sampled alleles being copies of the same allele from a single parent = 1/(2N), so 3

Identical by Descent F t = 1 2N + # 1 1 & % ( F $ 2N ' t 1 = probability that alleles are copies of the same gene from the immediately preceding generation plus the probability that the alleles are copies of the same gene from an earlier generation or # F t =1-1 1 & % ( $ 2N ' t assuming F 0 = 0 compare to: mean time to fixation for new mutant = ~4N 4

v Suppose multiple subpopulations: Overall average allele frequency stays the same but heterozygosity declines Predicted distributions of allele frequencies in replicate populations of N = 16 same process as in this figure 5

Population Structure v F t for a single population is essentially the same thing as F ST ² a measure of genetic differentiation among populations based on the reduction in heterozygosity v due to increasing autozygosity, structured populations have lower heterozygosity than expected if all were combined into a single random breeding population Aa = 0 6

F ST v measures the deficiency of heterozygotes in the total population relative to the expected level (assuming HWE) v in the simplest case, one can calculate F ST for a comparison of two populations F ST = H T H S H T Two population, two allele F ST Frequency of "A" Popula2on 1 Popula2on 2 H T H S F ST 0.5 0.5 0.5 0.5 0 0.4 0.6 0.5 0.48 0.04 0.3 0.7 0.5 0.42 0.16 0.2 0.8 0.5 0.32 0.36 0.1 0.9 0.5 0.18 0.64 0.0 1.0 0.5 0 1 0.3 0.35 0.43875 0.4375 0.002849 0.65 0.95 0.32 0.275 0.140625 7

F ST - Whalund Effect v Whalund principle - reduction in homozygosity that results from combining differentiated populations Frequency of heterozygotes in the combined population is higher than the average of the separate populations (0.42 > 0.40) Heterozygosity 0.6 0.5 0.4 0.3 0.2 F ST = H H T S 0.42 0.40 = = 0.0476 H T 0.42 F ST = var(p) pq = 0.01 0.21 = 0.0476 0.1 0 0 0.2 0.4 0.6 0.8 1 Allele Frequency 8

F ST - Whalund Effect v Whalund principle - reduction in homozygosity due to combining differentiated populations ² R = frequency of homozygous recessive genotype R separate R fused = q 2 2 1 + q 2 q 2 2 = 1 ( 2 q q 1 ) 2 + 1 ( 2 q q 2 ) 2 = σ q 2 F ST - Whalund Effect (Nielsen & Slatkin) f A = 2N 1 f A1 + 2N 2 f A2 2N 1 + 2N 2 f A = f A1 + f A2 2 ( ) + 2 f A2 ( 1 f A2 ) H S = 2 f 1 f A1 A1 2 # H T = 2 f A1 + f A2 &# % ( 1 f A1 + f A2 & % ( = f A1 1 f A1 $ 2 ' $ 2 ' where δ = f A1 f A2 = f A1 ( 1 f A1 ) + f A2 ( 1 f A2 ) ( ) + f A2 ( 1 f A2 ) + δ 2 2 9

F ST over time w/ no migration F t = 1 2N + " 1 1 % $ # 2N & " F t =1 1 1 % $ ' # 2N & F ST 1 e 1 2 N t t 'F t 1 1 e 1 2 N t F ST increases with time due to genetic drift in exactly the same way as F t 10

Migration v migration between populations results in gene flow, which counters the effects of genetic drift (and selection) and tends to homogenize allele frequencies v what level of migration is sufficient to counter the effects of genetic drift? ² Nm ~ 1 v what level of migration is sufficient to counter the effects of selection? ² m > s The Island Model assumptions: v equal population sizes v equal migration rates in all directions 11

Equilibrium value of F ST v change in F t with migration " F t = 1 % $ ' 1 m # 2N & " ( ) 2 + $ 1 1 # % ' 1 m 2N & ( ) 2 F t 1 setting ˆ F = F t = F t 1 some algebra + ignoring terms in m 2 and m/n... ˆF 1 1+ 4Nm Equilibrium value of F ST ˆF 1 1+ 4Nm Nm =1 Fig. 4.5, pg. 69 12

Migration rate vs. Number of migrants v migration rates yielding Nm = 1 ² N e = 100, m = 0.01 ² N e = 1000, m = 0.001 ² N e = 10000, m = 0.0001 ² N e = 100000, m = 0.00001 Equilibrium value of F ST mtdna or y-chromosome 13

F ST over time w/ no migration 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 mtdna or y-chromosome autosomal loci 0 5000 10000 15000 Nm = 1 corresponds to F ST = 0.2 v Wright (1978) ² F ST = 0.05 to 0.15 - moderate differentiation ² F ST = 0.15 to 0.25 - great genetic differentiation ² F ST > 0.25 - very great genetic differentiation Nm =1 14

Nm = 1 corresponds to F ST = 0.2 v Wright (1978) ² F ST = 0.05 to 0.15 - moderate differentiation ² F ST = 0.15 to 0.25 - great genetic differentiation ² F ST > 0.25 - very great genetic differentiation v populations of most mammalian species range from F ST = 0.1 to 0.8 v humans: ² among European groups: 0 to 0.025 ² Among Asians, Africans & Europeans: 0.05 to 0.2 F ST v theoretical maximum is 1 if two populations are fixed for different alleles v but, there are some issues v fixation index developed by Wright in 1921 when we knew essentially nothing about molecular genetics ² two alleles at a locus (with or w/o mutation between them) was the model 15

F ST versus G ST v F ST derived by Wright as a function of the variance in allele frequencies v G ST derived by Nei as a function of within and among population heterozygosities G ST = H T H S H T F ST = var(p) pq " =1 H % S $ ' # & H T G ST with multiple alleles v microsatellite loci, for example, may have many alleles in all subpopulations v F ST can not exceed the average level of homozygosity (1 minus heterozygosity) G ST =1 H S H T <1 H S 16

G ST = H T H S H T G ST ~ 0.12 Balloux et al. 2000 Evolution Hedrick (2005) Evolution v a standardized genetic distance measure for k populations: G ST v where: ' G ST ( ) ( )( 1 H S ) = G ST = G ST k 1+ H S G ST (Max) k 1 G ST (Max) = H T (Max) H S H T (Max) and H T (Max) =1 1 k 2 p ij 2 i j 17

H S H T H T(max) G ST(max) G' ST Subpopulation Subpopulation Allele 1 2 1 2 1 0.1 0.1 2 0.2 0.2 3 0.2 0.2 0.1 4 0.2 0.2 0.2 5 0.2 0.2 0.2 6 0.1 0.1 0.2 7 0.1 0.2 8 0.2 0.1 9 0.2 10 0.2 11 0.2 12 0.1 H S H T 0.820 0.910 F ST (G ST ) 0.099 F ST (G ST ) 0.910 H T(max) G ST(max) G' ST 0.820 0.850 0.035 0.910 0.099 0.099 1 0.357 18

Coalescent-based Measures v Slatkin (1995) Genetics F ST = T T W T v where T and T W are the mean coalescence times for all alleles and alleles within subpopulations T W = 2N e d T B = 2N e d + d 1 2m 19

R ST for microsatellites v under a stepwise mutation model for microsatellites, the difference in repeat number is correlated with time to coalescence R ST = S - S W S v where S and S W are the average squared difference in repeat number for all alleles and alleles within subpopulations v violations of the stepwise mutation model are a potential problem Φ ST for DNA sequences v the number of pairwise differences between two sequences provides an estimate of time to coalescence v method of Excoffier et al. (1992) takes into account the number of differences between haplotypes v Arelquin (software for AMOVA analyses) calculates both F ST and Φ ST for DNA sequence data ² important to specify which one is calculated 20