Population Genetics II (Selection + Haplotype analyses)


26th Oct 2015. Population Genetics II (Selection + Haplotype analyses). Gurinder Singh Mickey Atwal, Center for Quantitative Biology

Natural Selection Model (Molecular Evolution)
[Diagram: allele frequency in embryos, then selection, then allele frequency in adults, all within one generation]

Example of natural selection in mice (Hu et al., 2007)
[Figure: implantation sites at day 5 after fertilization of the egg, for p53+/+, p53-/-, and p53-/- with LIF injection]
Genotype of C57BL/6J mice:
Male   Female   LIF injection   Implantation sites (average ± SE)   Recovered blastocysts (average ± SE)   n
+/+    +/+      -               8.4 ± 0.5                           0                                      5
-/-    -/-      -               2.7 ± 0.8                           3.2 ± 0.6                              6
-/-    -/-      +               7 ± 0.8                             0.6 ± 0.6                              3

Hardy-Weinberg Law
Consider 2 alleles (A, a) with frequencies:
Allele frequency of A = $p$
Allele frequency of a = $q = 1 - p$
Randomly-mating large diploid population with no mutation, migration, selection or drift.
Genotype                     AA       Aa       aa
Hardy-Weinberg frequency     $p^2$    $2pq$    $q^2$
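A minimal sketch of the Hardy-Weinberg genotype frequencies, assuming the two-allele setup above. The function name `hardy_weinberg` and the example value p = 0.3 are illustrative, not part of the lecture.

```python
def hardy_weinberg(p):
    """Return (AA, Aa, aa) genotype frequencies for allele frequency p of A."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    q = 1.0 - p                      # frequency of allele a
    return p * p, 2.0 * p * q, q * q


# Illustrative value only: p = 0.3
freq_AA, freq_Aa, freq_aa = hardy_weinberg(0.3)
print(freq_AA, freq_Aa, freq_aa)
```

The three frequencies always sum to 1, since $(p + q)^2 = 1$.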

Fitness
Genotype                     AA                       Aa                           aa
Newborn frequency            $p^2$                    $2pq$                        $q^2$
Fitness                      $w_{AA}$                 $w_{Aa}$                     $w_{aa}$
Relative fitness             $w_{AA}/w_{AA} = 1$      $w_{Aa}/w_{AA} = 1 - hs$     $w_{aa}/w_{AA} = 1 - s$
Frequency after selection    $p^2/\bar{w}$            $2pq(1 - hs)/\bar{w}$        $q^2(1 - s)/\bar{w}$
$s$ = selection coefficient (relative viability of AA over aa)
$h$ = heterozygous effect
$\bar{w}$ = mean relative fitness

Mean Relative Fitness of Population
mean fitness $= p^2 w_{AA} + 2pq\,w_{Aa} + q^2 w_{aa}$
mean relative fitness $= \bar{w} = (p^2 w_{AA} + 2pq\,w_{Aa} + q^2 w_{aa}) / w_{AA} = 1 - 2pqhs - q^2 s$
$\bar{w} \le 1$
Genetic Load $= L = 1 - \bar{w}$
$0 \le L \le 1$
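One generation of viability selection can be sketched as follows, assuming the relative fitnesses 1, 1 - hs and 1 - s for AA, Aa and aa. The function name `selection_step` and the numerical values are illustrative assumptions.

```python
def selection_step(p, s, h):
    """Apply one round of selection to newborns at Hardy-Weinberg frequencies.

    Returns (p_next, w_bar, load):
      p_next - frequency of allele A among the surviving adults
      w_bar  - mean relative fitness, 1 - 2pqhs - q^2 s
      load   - genetic load, 1 - w_bar
    """
    q = 1.0 - p
    w_bar = 1.0 - 2.0 * p * q * h * s - q * q * s
    # Allele A is carried by all AA survivors and by half of the Aa survivors.
    p_next = (p * p + p * q * (1.0 - h * s)) / w_bar
    return p_next, w_bar, 1.0 - w_bar


# Illustrative values only: p = 0.5, s = 0.1, h = 0.5
p_next, w_bar, load = selection_step(0.5, 0.1, 0.5)
print(p_next, w_bar, load)
```

With these values the mean relative fitness is 0.95, the load is 0.05, and the frequency of A rises slightly above 0.5.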

Heterozygous advantage
$h = 0$: A dominant, a recessive
$h = 1$: A recessive, a dominant
$0 < h < 1$: incomplete dominance
$h < 0$: overdominance
$h > 1$: underdominance
$h$ determines the equilibrium allele frequency
$s$ determines how fast the equilibrium is achieved

Fundamental Theorem of Natural Selection
R. Fisher, 1958
Change of mean fitness is proportional to additive genetic variance
$\bar{w}' - \bar{w} = \dfrac{pq s^2}{2\bar{w}}$
$\bar{w}'$ = fitness in next generation
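As a numerical sanity check, the sketch below compares the change in mean relative fitness over one generation with $pq s^2 / (2\bar{w})$. It assumes the additive case h = 1/2 and the relative fitnesses 1, 1 - hs, 1 - s; the function names and parameter values are illustrative.

```python
def mean_relative_fitness(p, s, h):
    """Mean relative fitness 1 - 2pqhs - q^2 s."""
    q = 1.0 - p
    return 1.0 - 2.0 * p * q * h * s - q * q * s


def next_allele_frequency(p, s, h):
    """Frequency of A after one generation of selection."""
    q = 1.0 - p
    return (p * p + p * q * (1.0 - h * s)) / mean_relative_fitness(p, s, h)


# Illustrative values; h = 0.5 is the additive case assumed here.
p, s, h = 0.3, 0.2, 0.5
w_now = mean_relative_fitness(p, s, h)
w_next = mean_relative_fitness(next_allele_frequency(p, s, h), s, h)

observed_change = w_next - w_now
predicted_change = p * (1.0 - p) * s ** 2 / (2.0 * w_now)
print(observed_change, predicted_change)
```

For h = 1/2 the two quantities agree to rounding error; for other values of h the additive variance takes a different form and this simple expression should not be expected to match.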

Types of selection
Directional selection ($0 < h < 1$)
- causes $p$ to go to 1
- conventional Darwinian natural selection
Balancing selection ($h < 0$)
- causes $p$ to go to some equilibrium value $p_e$
- e.g. heterozygous variant of HBB gene confers resistance to malaria pathogen (Plasmodium falciparum)
Disruptive selection ($h > 1$)
- if $p < p_e$ then $p$ goes to 0
- if $p > p_e$ then $p$ goes to 1
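The three regimes can be illustrated by iterating the selection recursion. This is a sketch under the relative fitnesses 1, 1 - hs, 1 - s; the interior equilibrium $p_e = (h - 1)/(2h - 1)$ is obtained here by equating the marginal fitnesses of A and a and is not stated on the slide. Function names and parameter values are illustrative.

```python
def next_p(p, s, h):
    """Frequency of A after one generation of selection."""
    q = 1.0 - p
    w_bar = 1.0 - 2.0 * p * q * h * s - q * q * s
    return (p * p + p * q * (1.0 - h * s)) / w_bar


def iterate(p0, s, h, generations):
    """Apply next_p repeatedly, starting from p0."""
    p = p0
    for _ in range(generations):
        p = next_p(p, s, h)
    return p


def interior_equilibrium(h):
    """Interior equilibrium p_e; meaningful only for h < 0 or h > 1."""
    return (h - 1.0) / (2.0 * h - 1.0)


s = 0.1                                        # illustrative selection coefficient
directional = iterate(0.1, s, 0.5, 2000)       # 0 < h < 1: p goes to 1
balancing = iterate(0.1, s, -0.5, 5000)        # h < 0: p goes to p_e
disruptive_low = iterate(0.2, s, 2.0, 5000)    # h > 1, start below p_e: p goes to 0
disruptive_high = iterate(0.5, s, 2.0, 5000)   # h > 1, start above p_e: p goes to 1
print(directional, balancing, disruptive_low, disruptive_high)
```

With h = -0.5 the equilibrium is $p_e = 0.75$; with h = 2 it is $p_e = 1/3$, which separates the two outcomes of disruptive selection.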

Example of human directional selection
P C Sabeti et al. Science 2006;312:1614-1620
The FY*O allele in the promoter of the Duffy antigen gene, which confers resistance to Plasmodium vivax malaria, is prevalent and even fixed in many African populations

What about drift?
Very important in small populations. Depends on the relative sizes of $s$ and $1/(2N)$.
e.g. allele A has a selective advantage over allele a with selection coefficient $s$: $w_{aa}/w_{AA} = 1 - s$
In an initial population entirely consisting of aa genotypes, probability of new mutant A fixing
$= \dfrac{1 - e^{-s}}{1 - e^{-2Ns}}$
In an initial population entirely consisting of AA genotypes, probability of new mutant a fixing
$= \dfrac{1 - e^{s}}{1 - e^{2Ns}} > 0$
Therefore, even deleterious alleles can fixate in a small population!
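A small sketch of the two fixation probabilities, assuming the form $(1 - e^{-s})/(1 - e^{-2Ns})$ for a new advantageous mutant and the same expression with $s$ replaced by $-s$ for a new deleterious mutant. The function name and the population sizes are illustrative; the neutral limit $1/(2N)$ is the $s \to 0$ limit of the same expression.

```python
import math


def fixation_probability(s, N):
    """Fixation probability of a single new mutant in a population of size N.

    s > 0: advantageous mutant, (1 - e^{-s}) / (1 - e^{-2Ns})
    s < 0: deleterious mutant,  (1 - e^{|s|}) / (1 - e^{2N|s|})
    s = 0: neutral limit, 1 / (2N)
    """
    if s == 0.0:
        return 1.0 / (2.0 * N)
    # math.expm1(x) computes e^x - 1 accurately for small x;
    # the minus signs of numerator and denominator cancel.
    return math.expm1(-s) / math.expm1(-2.0 * N * s)


# Illustrative values only
advantageous_large = fixation_probability(0.01, 10000)   # close to s
deleterious_small = fixation_probability(-0.01, 50)      # not negligible
deleterious_large = fixation_probability(-0.01, 10000)   # vanishingly small
print(advantageous_large, deleterious_small, deleterious_large)
```

In the small population (N = 50) the deleterious mutant fixes with a probability of the same order as the neutral value 1/(2N) = 0.01, whereas in the large population its chance is effectively zero. Very large values of 2N|s| with s < 0 would overflow the exponential; this sketch does not guard against that.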

Detecting Natural Selection in the Human Genome
e.g. McDonald-Kreitman test
e.g. Tajima's D test
P C Sabeti et al. Science 2006;312:1614-1620
Choice of selection test depends on the time scale of evolution

HAPLOTYPE STUDIES

Haplotype
- Sequence of contiguous SNP alleles on a chromatid
- Hard to determine directly across whole genome
- Usually only the genotypes are provided, giving ambiguous haplotypes
- Haplotypes usually inferred ("phased") by statistical computation
- Newer experimental methods can directly phase haplotypes, but are costly
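To see why genotypes alone give ambiguous haplotypes, the sketch below enumerates every pair of haplotypes compatible with an unphased genotype. The function name and the three-SNP genotype are hypothetical, chosen only for illustration; this is a brute-force enumeration, not a statistical phasing method.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype: list of (allele, allele) tuples, one per SNP.
    Returns a set of frozensets, each holding the two haplotype strings
    (or a single string when both haplotypes are identical).
    """
    het_sites = [i for i, (x, y) in enumerate(genotype) if x != y]
    pairs = set()
    # Each heterozygous site can be assigned to the two chromosomes in two ways.
    for flips in product([False, True], repeat=len(het_sites)):
        flip_at = dict(zip(het_sites, flips))
        hap1, hap2 = [], []
        for i, (x, y) in enumerate(genotype):
            if flip_at.get(i, False):
                x, y = y, x
            hap1.append(x)
            hap2.append(y)
        pairs.add(frozenset(["".join(hap1), "".join(hap2)]))
    return pairs


# Hypothetical genotype: heterozygous at SNPs 1 and 2, homozygous at SNP 3
pairs = compatible_haplotype_pairs([("C", "T"), ("A", "G"), ("A", "A")])
for pair in pairs:
    print(sorted(pair))
```

A genotype with k heterozygous sites (k at least 1) is compatible with 2^(k-1) haplotype pairs, so an individual heterozygous at 10 SNPs has 512 possible phasings.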

Typical Results of Genotype Assays
Rows: cell lines / patients. Columns: SNPs 1 to 10.
ID     1     2     3     4       5     6     7       8     9     10
6023   T/T   G/G   A/A   C/C     A/A   A/A   C/C     G/G   C/C   G/G
6031   T/T   G/G   A/A   C/C     A/A   A/A   C/G     G/G   C/C   G/G
6032   C/C   A/A   C/C   T/T     C/C   C/C   C/G     A/A   G/G   A/A
6033   C/T   A/G   A/C   C/T     A/C   A/C   C/G     A/G   C/G   A/G
6034   T/T   G/G   A/A   C/C     A/A   A/A   C/G     G/G   C/C   G/G
6046   T/T   G/G   A/A   C/C     A/A   A/A   C/G     G/G   C/C   G/G
6047   C/T   A/G   A/C   C/T     A/C   A/C   C/C     A/G   C/G   A/G
6048   C/T   A/G   A/C   C/T     A/C   A/C   C/G     A/G   C/G   A/G
6053   C/C   A/A   A/A   T/T     C/C   C/C   C/G     A/A   G/G   A/A
6054   T/T   G/G   A/A   C/C     A/A   A/A   C/G     G/G   C/C   G/G
6055   C/T   A/G   A/C   failed  A/C   A/C   C/G     A/G   C/G   A/G
6056   C/T   A/G   A/C   C/T     A/C   A/C   C/G     A/G   C/G   A/G
6057   C/T   A/G   A/C   C/T     A/C   A/C   C/G     A/G   C/G   A/G
6060   C/T   A/G   A/C   C/T     A/C   A/C   failed  A/G   C/G   A/G
6061   C/C   A/A   C/C   T/T     C/C   C/C   C/G     A/A   G/G   A/A
6067   T/T   G/G   A/A   C/C     A/A   A/A   C/C     G/G   C/C   G/G

Linkage Disequilibrium
- Linkage Disequilibrium (LD) = correlation of nucleotide alleles at different loci across the population
  - On average, there is strong LD between nearby alleles on the same chromosome
- Linkage Equilibrium = random association (independence) of alleles at different loci across the population
- LD reflects many factors of population history
- LD permits us to use proxy SNPs as diagnostic biomarkers for disease-causing mutations

Population history and SNP correlations
[Figure: genealogy of present-day chromosomes, running from past to present time, with mutations (neutral and disease) occurring at various times of population history]

New haplotypes generated by mutations
[Figure: an ancestral chromosome with two loci shown (locus 1 and locus 2); a mutation at locus 1 creates a second haplotype; a mutation at locus 2 on the ancestral chromosome creates a third]

Intra-chromosomal recombination
[Figure: haplotypes 1, 2 and 3 before recombination, and the resulting haplotypes after recombination]
Recombination between haplotypes 2 and 3 generates a new haplotype from existing mutations

Quantifying linkage disequilibrium
- From the population haplotype frequencies we can calculate the correlations between SNPs.
- Commonly used LD summaries
  - D
  - Lewontin's D'
  - $r^2$

Haplotype frequencies
Haplotype with 2 SNPs, A/a and B/b
                       LOCUS 2
                       Allele B     Allele b     Totals
LOCUS 1   Allele A     $p_{AB}$     $p_{Ab}$     $p_A$
          Allele a     $p_{aB}$     $p_{ab}$     $p_a$
          Totals       $p_B$        $p_b$        1.0

Linkage Equilibrium definition
$p_{AB} = p_A p_B$
$p_{Ab} = p_A (1 - p_B)$
$p_{aB} = (1 - p_A)\, p_B$
$p_{ab} = (1 - p_A)(1 - p_B)$
Random association of alleles
Expected for SNPs at distant loci

Linkage Disequilibrium definition
$p_{AB} \neq p_A p_B$
$p_{Ab} \neq p_A (1 - p_B)$
$p_{aB} \neq (1 - p_A)\, p_B$
$p_{ab} \neq (1 - p_A)(1 - p_B)$
Non-random association of alleles
Expected for SNPs at nearby loci

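LD measure: D
Deviation from linkage equilibrium
$D = p_{AB} - p_A p_B$
Thus it can be shown that all 4 of the 2-SNP haplotype frequencies can be expressed in terms of $D$, $p_A$ and $p_B$ only, i.e.,
$p_{AB} = p_A p_B + D$
$p_{Ab} = p_A (1 - p_B) - D$
$p_{aB} = (1 - p_A)\, p_B - D$
$p_{ab} = (1 - p_A)(1 - p_B) + D$
Note also, $D = p_{AB}\, p_{ab} - p_{Ab}\, p_{aB}$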

LD measure: Lewontin's D'
Normalized version of D: $D' = D / D_{\max}$, where $D_{\max}$ is given by
$D_{\max} = \min[\,p_A p_b,\; p_a p_B\,]$ if $D > 0$
$D_{\max} = \min[\,p_A p_B,\; p_a p_b\,]$ if $D < 0$
- D' ranges between -1 and 1
- directly related to recombination fraction
- $D' = 0$ if linkage equilibrium
- $|D'| = 1$ if only 2 or 3 haplotypes are present out of the possible 4
- D' upwardly biased in small samples

LD measure: $r^2$
Square of the correlation coefficient
$r^2 = \dfrac{D^2}{p_A\, p_a\, p_B\, p_b}$
- ranges between 0 and 1
- useful in association mapping
- $r^2 = 0$ if linkage equilibrium
- $r^2 = 1$ if only 2 haplotypes are present
- proportional to mutual information between 2 loci when D small
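The three LD summaries can be computed directly from the four haplotype frequencies. The sketch below is a minimal illustration: the function name `ld_measures` and the example frequencies are assumptions, and D' and r² are returned as 0 when they are undefined (D = 0 or a monomorphic locus), which is a convention chosen here.

```python
def ld_measures(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, D', r^2) for two biallelic loci from haplotype frequencies."""
    if abs(p_AB + p_Ab + p_aB + p_ab - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = p_AB + p_Ab
    p_B = p_AB + p_aB
    p_a = 1.0 - p_A
    p_b = 1.0 - p_B

    D = p_AB - p_A * p_B

    # D_max depends on the sign of D.
    if D > 0:
        d_max = min(p_A * p_b, p_a * p_B)
    elif D < 0:
        d_max = min(p_A * p_B, p_a * p_b)
    else:
        d_max = 0.0
    d_prime = D / d_max if d_max > 0 else 0.0

    denom = p_A * p_a * p_B * p_b
    r2 = D * D / denom if denom > 0 else 0.0
    return D, d_prime, r2


# Illustrative haplotype frequencies (AB, Ab, aB, ab)
two_haplotypes = ld_measures(0.5, 0.0, 0.0, 0.5)        # only AB and ab present
three_haplotypes = ld_measures(0.5, 0.25, 0.0, 0.25)    # aB missing
equilibrium = ld_measures(0.25, 0.25, 0.25, 0.25)
print(two_haplotypes, three_haplotypes, equilibrium)
```

With only two haplotypes both D' and r² equal 1; with three haplotypes D' is still 1 but r² drops to 1/3, which shows how the two measures differ.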

Factors affecting Linkage Disequilibrium
Increases LD (decreases number or variability of haplotypes):
- Finite sampling (drift)
- Demographic bottleneck
- Selection
- Emigration
Decreases LD (increases number or variability of haplotypes):
- Immigration
- Recombination

How does LD decay over time?
- Recombination reduces correlation between SNPs
[Diagram: haplotype frequencies at time t: $p_{AB}$, $p_{Ab}$, $p_{aB}$, $p_{ab}$]

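Decay of linkage disequilibrium in large population
- The frequency of AB in the new generation (time $t+1$) will depend on the frequencies of AB, aB, and Ab in the old generation (time $t$) and also on the recombination rate, $c$
$p_{AB}^{(t+1)} = (1 - c)\, p_{AB}^{(t)} + c\, p_A^{(t)} p_B^{(t)}$
$\phantom{p_{AB}^{(t+1)}} = (1 - c)\, p_{AB}^{(t)} + c\, \bigl(p_{AB}^{(t)} - D_t\bigr)$
$\phantom{p_{AB}^{(t+1)}} = p_{AB}^{(t)} - c\, D_t$
Therefore,
$D_{t+1} = D_t (1 - c)$
$D_{t+n} = D_t (1 - c)^n \approx D_t \exp(-cn)$ (at large times)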
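The geometric decay can be checked numerically. The sketch below, with illustrative function names and parameter values, updates the four haplotype frequencies under recombination at rate c (no selection or drift) and compares $(1 - c)^n$ with $\exp(-cn)$.

```python
import math


def recombine(p_AB, p_Ab, p_aB, p_ab, c):
    """One generation of recombination in a large random-mating population."""
    D = p_AB * p_ab - p_Ab * p_aB
    # Coupling haplotypes lose c*D, repulsion haplotypes gain c*D.
    return p_AB - c * D, p_Ab + c * D, p_aB + c * D, p_ab - c * D


def ld_after(d0, c, n):
    """D after n generations: D_0 (1 - c)^n."""
    return d0 * (1.0 - c) ** n


def generations_to_fraction(c, fraction):
    """Generations needed for D to fall to the given fraction of its start."""
    return math.log(fraction) / math.log(1.0 - c)


# Illustrative values only
new_freqs = recombine(0.5, 0.0, 0.0, 0.5, 0.1)
new_D = new_freqs[0] * new_freqs[3] - new_freqs[1] * new_freqs[2]

exact = ld_after(0.25, 0.01, 100)
approx = 0.25 * math.exp(-0.01 * 100)
half_life = generations_to_fraction(0.01, 0.5)
print(new_D, exact, approx, half_life)
```

With c = 0.01, D halves roughly every 69 generations, and the exponential approximation is already within about 1% of the exact value after 100 generations.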

Different populations exhibit characteristic LD decay across the genome
[Plot: mean D' (0 to 1) against distance (0 to 150,000 bp) for Caucasian, African-American, Asian and Yoruban samples; Gabriel et al., 2002]

Finite population size: Recombination-Drift Equilibrium
- Rate of decay of LD by recombination is cancelled out by rate of increase of LD by drift
$r^2 \approx \dfrac{1}{1 + 4 N_e c d}$
$N_e$ = effective population size (~10,000 for humans)
$c$ = recombination rate (per base-pair)
$d$ = distance across genome (base-pairs)
$\dfrac{1}{N_e} = \dfrac{1}{T} \left( \dfrac{1}{N_1} + \dfrac{1}{N_2} + \dots + \dfrac{1}{N_T} \right)$
Note that $N_e$ will be dominated by the times when population sizes are reduced (population bottleneck)
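Both formulas are simple to evaluate. In the sketch below the function names, the per-base-pair recombination rate of 1e-8 per generation, and the population-size history are illustrative assumptions rather than values from the lecture.

```python
def expected_r2(n_e, c, d):
    """Approximate r^2 at recombination-drift equilibrium: 1 / (1 + 4 N_e c d)."""
    return 1.0 / (1.0 + 4.0 * n_e * c * d)


def effective_size(sizes):
    """Harmonic mean of per-generation population sizes N_1 ... N_T."""
    return len(sizes) / sum(1.0 / n for n in sizes)


# N_e ~ 10,000 as on the slide; c = 1e-8 per bp is an assumed illustrative rate.
r2_100kb = expected_r2(10000, 1e-8, 100000)

# Hypothetical history: nine generations at 10,000 and one bottleneck at 100.
n_e_bottleneck = effective_size([10000] * 9 + [100])
print(r2_100kb, n_e_bottleneck)
```

A single generation at size 100 pulls the effective size below 1,000 even though the population is at 10,000 for the other nine generations, which is the sense in which bottlenecks dominate $N_e$.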