Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium. November 12, 2012


Last Time: Sequence data and quantification of variation; infinite sites model; nucleotide diversity (π); sequence-based tests of neutrality: Tajima's D, Hudson-Kreitman-Aguadé (HKA), synonymous versus nonsynonymous substitutions, McDonald-Kreitman.

Today: Signatures of selection based on synonymous and nonsynonymous substitutions; multiple loci and independent segregation; estimating linkage disequilibrium.

Using Synonymous Substitutions to Control for Factors Other Than Selection: $d_N/d_S$ or $K_a/K_s$ Ratios

Types of Mutations (Polymorphisms)

Synonymous versus Nonsynonymous SNPs: A SNP at the first or second codon position often changes the amino acid. A SNP at the third position is often synonymous: UCA, UCU, UCG, and UCC all code for serine. The majority of positions are nonsynonymous. Not all amino acid changes affect fitness (for example, allozymes).
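The claims on this slide can be checked against the standard genetic code. The sketch below is not part of the lecture: the names `CODON_TABLE`, `translate`, and `synonymous_fraction_by_position` are made up for illustration, and it assumes that all single-nucleotide changes are equally likely and that a change to a stop codon counts as nonsynonymous.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(codon):
    """Translate one DNA or RNA codon to a one-letter amino acid ('*' = stop)."""
    return CODON_TABLE[codon.upper().replace("U", "T")]

def synonymous_fraction_by_position():
    """Fraction of single-base changes that are synonymous at codon positions 1, 2, 3.

    Assumption: every base change is equally likely; stop codons are skipped as
    starting points, and a change that creates a stop counts as nonsynonymous.
    """
    syn = [0, 0, 0]
    total = [0, 0, 0]
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total[pos] += 1
                if CODON_TABLE[mutant] == aa:
                    syn[pos] += 1
    return [s / t for s, t in zip(syn, total)]

fractions = synonymous_fraction_by_position()
for position, fraction in enumerate(fractions, start=1):
    print(f"codon position {position}: {fraction:.1%} of changes are synonymous")
```

Under these assumptions, most third-position changes are synonymous, no second-position change is, and only a few first-position changes are (some leucine and arginine codons).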

Synonymous & Nonsynonymous Substitutions: The synonymous substitution rate can be used to set the neutral expectation for the nonsynonymous rate. $d_S$ is the rate of synonymous substitutions per synonymous site; $d_N$ is the rate of nonsynonymous substitutions per nonsynonymous site; $\omega = d_N/d_S$. If $\omega = 1$, neutral evolution (no selection). If $\omega < 1$, purifying selection. If $\omega > 1$, positive Darwinian selection. For human genes, $\omega \approx 0.1$.
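As a rough illustration of how $\omega$ is obtained from counts, the sketch below divides observed differences by the number of sites of each class and applies a Jukes-Cantor correction for multiple hits. This simple counting approach is an assumption made for the example, not the method used in the lecture, and the counts and function names are hypothetical.

```python
import math

def jukes_cantor(p):
    """Convert a proportion of differing sites into substitutions per site."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75) for the correction")
    return -0.75 * math.log(1 - 4 * p / 3)

def omega(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """Return (d_N, d_S, omega) from difference counts and site counts."""
    d_s = jukes_cantor(syn_diffs / syn_sites)
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    if d_s == 0:
        raise ValueError("d_S is zero; omega is undefined")
    return d_n, d_s, d_n / d_s

def interpret(w):
    """Map omega to the three cases listed on the slide."""
    if math.isclose(w, 1.0):
        return "neutral"
    return "purifying" if w < 1 else "positive"

# Hypothetical gene: 30 differences at 300 synonymous sites,
# 7 differences at 700 nonsynonymous sites.
d_n, d_s, w = omega(30, 300, 7, 700)
print(f"dN = {d_n:.4f}, dS = {d_s:.4f}, omega = {w:.3f} ({interpret(w)})")
```

Real estimates also have to decide how many sites of each class a sequence contains, which is where much of the complexity of the methods lies.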

Complications in Estimating $d_N/d_S$: Multiple mutations in a codon give multiple possible paths. For CGT(Arg)->AGA(Arg), the paths are CGT(Arg)->AGT(Ser)->AGA(Arg) and CGT(Arg)->CGA(Arg)->AGA(Arg). The two types of nucleotide base substitutions resulting in SNPs, transitions and transversions, are not equally likely (http://www.mun.ca/biology/scarr/transitions_vs_transversions.html). Back-mutations are invisible. Complex evolutionary models using likelihood and Bayesian approaches must be used to estimate $d_N/d_S$ (also called $K_A/K_S$ or $K_N/K_S$ depending on method), e.g. with the PAML package.

$d_N/d_S$ ratios for 363 mouse-rat comparisons: Most genes show purifying selection ($d_N/d_S < 1$). There is some evidence of positive selection, especially in genes related to the immune system, e.g. interleukin-3 (mast cells and bone marrow cells in the immune system). (Hartl and Clark 2007)
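The path problem for CGT -> AGA can be made concrete by enumerating every order in which the differing positions could have changed and counting synonymous and nonsynonymous steps along each path. This is only a sketch with hypothetical names; it treats all paths alike, whereas real methods weight paths and exclude those passing through stop codons.

```python
from itertools import permutations, product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def mutation_paths(start, end):
    """List single-step paths from start to end as (codons, n_syn, n_nonsyn)."""
    differing = [i for i in range(3) if start[i] != end[i]]
    paths = []
    for order in permutations(differing):
        codon = start
        steps = [codon]
        syn = nonsyn = 0
        for pos in order:
            nxt = codon[:pos] + end[pos] + codon[pos + 1:]
            if CODON_TABLE[nxt] == CODON_TABLE[codon]:
                syn += 1
            else:
                nonsyn += 1
            codon = nxt
            steps.append(codon)
        paths.append((steps, syn, nonsyn))
    return paths

paths = mutation_paths("CGT", "AGA")
for steps, syn, nonsyn in paths:
    labelled = " -> ".join(f"{c}({CODON_TABLE[c]})" for c in steps)
    print(f"{labelled}: {syn} synonymous, {nonsyn} nonsynonymous")
```

For this pair one path implies two nonsynonymous changes and the other two synonymous changes, so the counts depend entirely on which path is assumed.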

McDonald-Kreitman Test: Conceptually similar to the HKA test, but uses only one gene. Contrasts the ratio of synonymous divergence to synonymous polymorphism with the ratio of nonsynonymous divergence to nonsynonymous polymorphism. The gene provides an internal control for evolutionary rates and demography.
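The McDonald-Kreitman comparison reduces to a 2x2 table of counts (nonsynonymous versus synonymous, fixed between species versus polymorphic within species). The sketch below compares the two ratios and applies a two-sided Fisher's exact test written with the standard library only. The counts are illustrative values chosen for the example, and the function names are hypothetical.

```python
from math import comb

def mk_ratios(d_n, d_s, p_n, p_s):
    """Return (Dn/Ds, Pn/Ps): nonsynonymous/synonymous ratios for divergence and polymorphism."""
    return d_n / d_s, p_n / p_s

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))

# Illustrative counts: fixed differences (Dn, Ds) and polymorphisms (Pn, Ps).
d_n, d_s, p_n, p_s = 7, 17, 2, 42
divergence_ratio, polymorphism_ratio = mk_ratios(d_n, d_s, p_n, p_s)
p_value = fisher_exact_two_sided(d_n, d_s, p_n, p_s)
print(f"Dn/Ds = {divergence_ratio:.3f}, Pn/Ps = {polymorphism_ratio:.3f}, p = {p_value:.4f}")
```

An excess of nonsynonymous divergence relative to polymorphism, as in these counts, is the pattern usually read as evidence of positive selection; the reverse pattern points to purifying selection.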

Application of McDonald-Kreitman Test: Aligned 11,624 gene sequences between human and chimp. Calculated synonymous and nonsynonymous substitutions between species (divergence) and within humans (SNPs). Identified 304 genes showing evidence of positive selection (blue) and 814 genes showing purifying selection (red) in humans. Positive selection: defense/immunity, apoptosis, sensory perception, and transcription factors. Purifying selection: structural and housekeeping genes. (Bustamante et al. 2005. Nature 437, 1153-1157)

Genes showing purifying (red) or positive (blue) selection in the human genome based on the McDonald-Kreitman test (Bustamante et al. 2005. Nature 437, 1153-1157)

How can you differentiate between effects of selection and demographic effects on sequence variation? Will this work for organellar DNA?

Extending to Multiple Loci: So far, we have only considered the dynamics of alleles at single loci. Loci occur on chromosomes, linked to other loci! "The fitness of a single locus ripped from its interactive context is about as relevant to real problems of evolutionary genetics as the study of the psychology of individuals isolated from their social context is to an understanding of man's sociopolitical evolution" (Richard Lewontin, quoted in Hedrick 2005). The size of the region that must be considered depends on linkage disequilibrium.

Gametic (Linkage) Disequilibrium (LD): Nonrandom association of alleles at different loci into gametes. Haplotype: genotype of a group of closely linked loci. LD is a major factor in evolution. LD itself provides insights into population history. Estimation of LD is critical for ALL population genetic data.

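Nomenclature and concepts: Two loci, two alleles. Locus 1 has alleles $A_1$ and $A_2$ with frequencies $p_1$ and $p_2$; locus 2 has alleles $B_1$ and $B_2$ with frequencies $q_1$ and $q_2$. The frequency of allele $i$ at locus 1 is $p_i$ and the frequency of allele $i$ at locus 2 is $q_i$, so that
$$\sum_{i=1}^{n} p_i = \sum_{i=1}^{n} q_i = 1$$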

Nomenclature and concepts: The genotype is written as $A_1B_1/A_2B_2$. $A_1$ and $B_1$ are in coupling phase; $A_1$ and $B_2$ are in repulsion phase.

Gametic Disequilibrium: Easiest to think about for physically linked loci, but physical linkage is not required. What are the expected frequencies of gametes in a population under independent assortment? Meiosis in $A_1B_1/A_2B_2$ individuals yields four gametes with expected frequencies $A_1B_1$: $p_1q_1$; $A_1B_2$: $p_1q_2$; $A_2B_1$: $p_2q_1$; $A_2B_2$: $p_2q_2$.

What are the expected frequencies of gametes with complete linkage? With allele frequencies $p_1$, $p_2$ (for $A_1$, $A_2$) and $q_1$, $q_2$ (for $B_1$, $B_2$), the frequencies of gametes $A_1B_1$, $A_1B_2$, $A_2B_1$, $A_2B_2$ are denoted $x_{11}$, $x_{12}$, $x_{21}$, $x_{22}$.

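Linkage disequilibrium measure, D:
Independent assortment: $x_{11} = p_1q_1$, $x_{12} = p_1q_2$, $x_{21} = p_2q_1$, $x_{22} = p_2q_2$.
With LD: $x_{11} = p_1q_1 + D$, $x_{12} = p_1q_2 - D$, $x_{21} = p_2q_1 - D$, $x_{22} = p_2q_2 + D$.
Substituting from the gamete frequencies above:
$$D = x_{11}x_{22} - x_{12}x_{21}$$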
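A small numerical check of the definition $D = x_{11}x_{22} - x_{12}x_{21}$: the sketch below computes D from four gamete frequencies and confirms that it equals the excess of $A_1B_1$ gametes over the independent-assortment expectation, $x_{11} - p_1q_1$. The function name and the frequencies are hypothetical.

```python
import math

def linkage_disequilibrium(x11, x12, x21, x22):
    """Return (p1, q1, D) from the frequencies of gametes A1B1, A1B2, A2B1, A2B2."""
    if not math.isclose(x11 + x12 + x21 + x22, 1.0, abs_tol=1e-9):
        raise ValueError("gamete frequencies must sum to 1")
    p1 = x11 + x12  # frequency of allele A1
    q1 = x11 + x21  # frequency of allele B1
    d = x11 * x22 - x12 * x21
    return p1, q1, d

# Hypothetical population with an excess of coupling gametes.
p1, q1, d = linkage_disequilibrium(0.4, 0.1, 0.1, 0.4)
print(f"p1 = {p1:.2f}, q1 = {q1:.2f}, D = {d:.3f}")
print(f"x11 - p1*q1 = {0.4 - p1 * q1:.3f}")
```

With no association between loci (for example all four frequencies equal to 0.25), the same function returns D = 0.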

Problem: D is sensitive to allele frequencies. Gamete frequencies cannot be negative, so the maximum D is set by the allele frequencies. Example, if D is positive: $p_1 = 0.5$, $q_2 = 0.5$ gives $D_{max} = 0.25$, but $p_1 = 0.1$, $q_2 = 0.9$ gives $D_{max} = 0.09$. Solution: $D' = D/D_{max}$, which ranges from -1 to 1. $D_{max}$ calculation: if D is positive, $D_{max}$ is the lesser of $p_1q_2$ or $p_2q_1$; if D is negative, $D_{max}$ is the lesser of $p_1q_1$ or $p_2q_2$.
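The $D_{max}$ rule translates directly into code. The sketch below (function names are hypothetical) reproduces the two $D_{max}$ values from the slide and standardizes an example D; it takes the allele frequencies $p_1$ and $q_1$ as inputs and derives $p_2$ and $q_2$ from them.

```python
import math

def d_max(d, p1, q1):
    """Largest possible |D| given allele frequencies and the sign of D."""
    p2, q2 = 1 - p1, 1 - q1
    if d >= 0:
        return min(p1 * q2, p2 * q1)
    return min(p1 * q1, p2 * q2)

def d_prime(d, p1, q1):
    """Standardized disequilibrium D' = D / D_max, on a -1 to 1 scale."""
    limit = d_max(d, p1, q1)
    if limit == 0:
        raise ValueError("D' is undefined when a locus is fixed")
    return d / limit

# Slide examples for positive D: p1 = 0.5, q2 = 0.5 and p1 = 0.1, q2 = 0.9.
print(d_max(1, 0.5, 0.5))
print(d_max(1, 0.1, 0.1))
# Example D values with p1 = q1 = 0.5.
print(d_prime(0.15, 0.5, 0.5), d_prime(-0.15, 0.5, 0.5))
```

The same raw D of 0.09 would be complete disequilibrium (D' = 1) in the second population but well short of it in the first, which is why D values are not comparable across loci with different allele frequencies.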

LD can also be estimated as the correlation between alleles:
$$r = \frac{D}{\sqrt{p_1 p_2 q_1 q_2}}$$
r can also be standardized to a -1 to 1 scale. It is equivalent to $D'$ in this case:
$$r' = \frac{D / \sqrt{p_1 p_2 q_1 q_2}}{D_{max} / \sqrt{p_1 p_2 q_1 q_2}} = \frac{D}{D_{max}} = D'$$
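To see the correlation measure and its standardization side by side, the sketch below computes $r = D/\sqrt{p_1p_2q_1q_2}$ and then divides by the largest value r could take, which cancels the square-root term and leaves $D/D_{max}$. Function names and input values are hypothetical.

```python
import math

def ld_r(d, p1, q1):
    """Correlation between alleles at two loci."""
    return d / math.sqrt(p1 * (1 - p1) * q1 * (1 - q1))

def ld_r_standardized(d, p1, q1):
    """r divided by its maximum possible magnitude; algebraically D / D_max."""
    p2, q2 = 1 - p1, 1 - q1
    limit = min(p1 * q2, p2 * q1) if d >= 0 else min(p1 * q1, p2 * q2)
    return ld_r(d, p1, q1) / ld_r(limit, p1, q1)

# Hypothetical values: D = 0.05 with unequal allele frequencies.
d, p1, q1 = 0.05, 0.2, 0.6
r = ld_r(d, p1, q1)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, standardized r = {ld_r_standardized(d, p1, q1):.3f}")
```

Here $D_{max}$ is 0.08, so the standardized value is 0.625 even though r itself is much smaller.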

Recombination: Shuffling of parental alleles during meiosis: an $A_1B_1/A_2B_2$ parent produces gametes $A_1B_1$, $A_1B_2$, $A_2B_2$, and $A_2B_1$. Occurs for unlinked loci and linked loci. The rate of recombination for linked markers is partially a function of physical distance.

What is the expected recombination rate for unlinked loci? Meiosis in an $A_1B_1/A_2B_2$ individual produces coupling-phase gametes ($A_1B_1$, $A_2B_2$) and repulsion-phase gametes ($A_1B_2$, $A_2B_1$). The recombination rate is
$$c = \frac{n_r}{n_r + n_c}$$
where $n_r$ is the number of repulsion-phase gametes and $n_c$ is the number of coupling-phase gametes.

LD is partially a function of recombination rate. Table: expected proportions of gametes produced by various genotypes over two generations (first generation and second generation), where c is the recombination rate and $D_0$ is the initial amount of LD.
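The recombination rate is a simple proportion of gamete counts. The sketch below uses hypothetical counts from a coupling-phase double heterozygote ($A_1B_1/A_2B_2$), first for tightly linked loci and then for loci that assort independently, where all four gamete types are equally frequent.

```python
def recombination_rate(n_coupling, n_repulsion):
    """c = n_r / (n_r + n_c) for gametes from an A1B1/A2B2 parent."""
    total = n_coupling + n_repulsion
    if total == 0:
        raise ValueError("no gametes counted")
    return n_repulsion / total

# Hypothetical linked loci: 45 A1B1 + 45 A2B2 (coupling), 5 A1B2 + 5 A2B1 (repulsion).
linked = recombination_rate(n_coupling=45 + 45, n_repulsion=5 + 5)
# Independent assortment: the four gamete types are equally frequent.
unlinked = recombination_rate(n_coupling=25 + 25, n_repulsion=25 + 25)
print(f"linked: c = {linked}, unlinked: c = {unlinked}")
```

The unlinked case gives c = 0.5, the value associated with independent segregation.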

Recombination degrades LD over time:
$$D_1 = x'_{11}x'_{22} - x'_{12}x'_{21} = (x_{11} - cD_0)(x_{22} - cD_0) - (x_{12} + cD_0)(x_{21} + cD_0)$$
$$D_1 = (1 - c)D_0$$
$$D_t = (1 - c)^t D_0$$
$$D_t \approx e^{-ct} D_0$$
where t is time (in generations) and e is the base of the natural log (2.718).

Effects of recombination rate on LD: Figure: decline in LD over time with different theoretical recombination rates (c). Even with independent segregation (c = 0.5), multiple generations are required to break up allelic associations. Genome-wide linkage disequilibrium can be caused by demographic factors (more later).
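The decay of LD can be traced numerically. The sketch below compares the exact recursion $D_t = (1-c)^t D_0$ with the exponential form $e^{-ct}D_0$, which is close only when c is small, and computes how many generations are needed to reduce LD to a given fraction of its starting value. Function names and parameter values are hypothetical.

```python
import math

def ld_after(d0, c, t):
    """Exact LD after t generations of random mating with recombination rate c."""
    return d0 * (1 - c) ** t

def ld_after_approx(d0, c, t):
    """Exponential approximation, reasonable when c is small."""
    return d0 * math.exp(-c * t)

def generations_to_fraction(c, fraction):
    """Generations until D_t / D_0 falls to `fraction` under the exact model."""
    return math.log(fraction) / math.log(1 - c)

d0 = 0.25
for c in (0.5, 0.1, 0.01):
    exact = ld_after(d0, c, 10)
    approx = ld_after_approx(d0, c, 10)
    print(f"c = {c}: D_10 = {exact:.5f} (exact), {approx:.5f} (approx), "
          f"{generations_to_fraction(c, 0.01):.1f} generations to reach 1% of D_0")
```

Even at c = 0.5, LD only halves each generation, so several generations pass before an association is effectively gone.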