Evolution, Population Genetics, and Natural Selection. Computational Genomics. Seyoung Kim

Evolution, Population Genetics, and Natural Selection. 02-710 Computational Genomics. Seyoung Kim

Phylogeny of Mammals

Phylogenetics vs. Population Genetics
Phylogenetics
  Assumes a single correct species phylogeny that holds across genomes
  Reduces the entire population of a species to a single individual
  Ignores variation among individuals of the same species, or assumes negligible variability within species
Population genetics
  Usually concerned with within-species variation in genomes
  Individuals within a species are related by genealogies
Siepel, A. Genome Res. 19(11):1929-41. 2009. Phylogenomics of primates and their ancestral populations.

Wright-Fisher Model
Stochastic model for gene frequencies in finite populations
Assume the same population size N in each generation; thus, 2N copies of each gene (no recombination)
p, q: frequencies of the two alleles (p + q = 1)
The probability of having k copies of one allele (with frequency p in the current generation) in the next generation is:
$$P(k) = \binom{2N}{k} p^{k} q^{2N-k}$$
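The binomial sampling step of the Wright-Fisher model can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function names `wf_transition_prob` and `wf_simulate`, the `seed` argument, and the example parameter values are all assumptions made here.

```python
import random
from math import comb


def wf_transition_prob(k, two_n, p):
    """P(k copies in the next generation | current allele frequency p).

    This is the binomial formula with 2N = two_n gene copies.
    """
    q = 1.0 - p
    return comb(two_n, k) * p**k * q ** (two_n - k)


def wf_simulate(two_n, p0, generations, seed=0):
    """Simulate one allele-frequency trajectory (hypothetical helper).

    Each generation, every one of the 2N gene copies independently
    inherits the allele with probability equal to its current frequency.
    Returns the list of frequencies, starting with p0.
    """
    rng = random.Random(seed)
    p = p0
    freqs = [p]
    for _ in range(generations):
        k = sum(1 for _ in range(two_n) if rng.random() < p)
        p = k / two_n
        freqs.append(p)
    return freqs


if __name__ == "__main__":
    # Drift alone moves the frequency; 0 and 1 are absorbing states.
    print(wf_simulate(two_n=100, p0=0.5, generations=20))
```

Because frequencies 0 and 1 are absorbing, a long enough run of `wf_simulate` ends with the allele either lost or fixed, which is the drift behavior the model is meant to capture.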

Wright-Fisher Model
An individual at generation t randomly samples two individuals at generation t-1 as parents
We can trace the ancestry of each individual in the current generation backward
In a finite number of generations, all individuals will coalesce into a single individual

Phylogeny of Mammals

Population-aware Phylogenetics
Primate species
  Divergence time is short relative to ancestral population sizes
  Phylogenetics assumptions do not hold
  Non-negligible population genetic effects
Interspecies comparison, taking into account selective forces within species

Population Genetics Interpretation of Speciation
τ: speciation time; T: coalescent time; $N_e$: population size
$T \sim \mathrm{Exp}(2N_e)$
$\tau \gg T$ ($\tau \gg N_e$): the phylogenetics assumption holds
  Divergence between individual chromosomes serves as an estimate of speciation time
$\tau \ll T$ ($\tau \ll N_e$): equivalent to the coalescent in population genetics
  Coalescent time dominates
$\tau \sim T$ ($\tau \sim N_e$): both ancestral population dynamics and interspecies divergence must be considered
  Population-aware phylogenetics
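To see where the 2N_e time scale comes from, the sketch below traces two gene copies backward in a Wright-Fisher population until they pick the same parental copy. It assumes a constant population of `two_n` gene copies; the function names and the replicate count are illustrative choices, not part of the lecture.

```python
import random


def pairwise_coalescence_time(two_n, rng):
    """Generations back until two gene copies share a parental copy.

    Each generation, both lineages pick a parent copy uniformly at
    random, so they coalesce with probability 1 / two_n per generation.
    """
    t = 1
    while rng.randrange(two_n) != rng.randrange(two_n):
        t += 1
    return t


def mean_coalescence_time(two_n, replicates=20000, seed=1):
    """Monte Carlo estimate of the mean pairwise coalescence time."""
    rng = random.Random(seed)
    total = sum(pairwise_coalescence_time(two_n, rng) for _ in range(replicates))
    return total / replicates


if __name__ == "__main__":
    # With 2N = 50 gene copies the mean should be close to 50 generations.
    print(mean_coalescence_time(50))
```

The waiting time is geometric with mean 2N generations, which for large N is well approximated by an exponential distribution. Comparing this mean with the speciation time τ gives a rough sense of which of the three regimes applies.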

Three-Species Phylogeny
τ: speciation time; T: coalescent time
Gray phylogeny: concordant with the phylogeny among the three species
Black phylogeny: discordant with the phylogeny among the three species
Species shown: Human, Chimpanzee, Gorilla
ILS: incomplete lineage sorting with deep coalescence

Three-Species Phylogeny
τ: speciation time; T: coalescent time
When $N_{xy}$ and $N_{xyz}$ are small, $\tau_{xy}$ and $\tau_{xyz}$ approximate the divergence times well
Otherwise, the coalescent times $T_{xy}$ and $T_{xyz}$ need to be taken into account
Species shown: Human, Chimpanzee, Gorilla
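How often ILS produces a discordant gene tree can be estimated with a standard result for three species under the coalescent. The slides do not state this formula; it is brought in here as an assumption. If the two lineages from the sister species fail to coalesce during the interval between the two speciation events, with probability exp(-Δτ / 2N), the three possible gene-tree topologies become equally likely. The names `delta_tau` (for τ_xyz − τ_xy, in generations) and `n_xy` (the ancestral population size N_xy) are chosen for this sketch.

```python
from math import exp


def gene_tree_probs(delta_tau, n_xy):
    """Gene-tree topology probabilities for three species (sketch).

    delta_tau: generations between the two speciation events.
    n_xy: diploid population size of the ancestor of the sister species.
    """
    # Probability that the two sister lineages do NOT coalesce
    # within the ancestral population of the sister species.
    p_deep = exp(-delta_tau / (2.0 * n_xy))
    return {
        # Concordant either by early coalescence or by chance (1/3).
        "concordant": 1.0 - (2.0 / 3.0) * p_deep,
        # Each of the two discordant topologies.
        "discordant_each": p_deep / 3.0,
        # Probability that coalescence is pushed into the deeper ancestor.
        "deep_coalescence": p_deep,
    }


if __name__ == "__main__":
    # Short internal branch relative to 2N: substantial discordance.
    print(gene_tree_probs(delta_tau=20000, n_xy=50000))
    # Long internal branch: gene trees almost always match the species tree.
    print(gene_tree_probs(delta_tau=2000000, n_xy=50000))
```

The example parameter values are arbitrary. The point is only that discordance shrinks toward zero as Δτ grows relative to 2N, matching the statement that small ancestral population sizes let τ approximate the divergence times well.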

What if We Ignore Incomplete Lineage Sorting?
Aligned human (Hom), chimpanzee (Pan), gorilla (Gor), orangutan (Pon) sequences
Two different estimated lineages
Without consideration of ILS, substitution rates are overestimated

Coalescent-HMM (Hobolth et al., 2009, 2011)
Four states corresponding to different phylogenies with ILS
Transitions to other states correspond to recombinations

Coalescent-HMM
HC1 state (with no ILS) explains only ~50% of the loci
Remaining states explain the other 50%, proportioned roughly equally

Summary
For a large population with a relatively short population history, the concepts of population genetics and evolution for speciation are closely related.
Incomplete lineage sorting is observed among species when divergence happens in a relatively short period of time compared to the population size and coalescent time

Natural Selection

Divergence and Selection
Genetic drift: the stochastic change in the population frequency of a mutation due to the sampling process that is inherent in reproduction.
Selective sweep: the process by which new favorable mutations become fixed so quickly that physically linked alleles also become fixed
Fixation: the state where a mutation has reached a frequency of 100% in a natural population
Hitchhiking of neighboring polymorphisms

Complete vs Incomplete Sweep
Complete sweep: the favored allele reaches fixation
  Local variation is removed, except for variants that arise by mutation and recombination during the selective sweep
Incomplete sweep: positively selected alleles are currently on the rise, but have not reached a frequency of 100%

Selective Sweep

Different Types of Natural Selection
Directional selection
  Positive selection
    Reduces genetic diversity
    The small region around the selected variant is over-represented
    Hitchhiking of neighboring alleles
  Negative selection (purifying selection)
    Can lead to background selection, where linked variants are also removed
    Reduces genetic diversity, but with no skew in allele frequencies

Different Types of Natural Selection
Balancing selection: selection for the maintenance of two or more alleles at a single locus in a population
  Heterozygote advantage over homozygotes
  G6PD and HBB (sickle-cell) mutations
    G6PD enzyme deficiency or sickle-cell anemia in the homozygous state
    Partial protection against malaria in the heterozygous state
  CFTR mutation
    Causes cystic fibrosis in the homozygous state
    Protects against asthma in the heterozygous state
Disruptive (diversifying) selection: extreme values of a trait are favored over intermediate values
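Heterozygote advantage maintains both alleles because the allele frequency is pulled toward an interior equilibrium. The sketch below uses the textbook one-locus model with genotype fitnesses 1 - s (AA), 1 (Aa), and 1 - t (aa), for which the equilibrium frequency of A is t / (s + t). The model, the selection coefficients `s` and `t`, and the function names are assumptions made for illustration and do not come from the slides.

```python
def next_freq(p, s, t):
    """One generation of selection with heterozygote advantage.

    p: frequency of allele A; fitnesses are AA = 1 - s, Aa = 1, aa = 1 - t.
    """
    q = 1.0 - p
    w_aa_hom, w_het, w_bb_hom = 1.0 - s, 1.0, 1.0 - t
    mean_w = p * p * w_aa_hom + 2.0 * p * q * w_het + q * q * w_bb_hom
    # Frequency of A after selection: its share of the mean fitness.
    return p * (p * w_aa_hom + q * w_het) / mean_w


def iterate(p0, s, t, generations):
    """Apply next_freq repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, s, t)
    return p


if __name__ == "__main__":
    s, t = 0.1, 0.3
    # Both starting points converge toward t / (s + t) = 0.75.
    print(iterate(0.01, s, t, 2000), iterate(0.99, s, t, 2000))
```

Starting from either a rare or a common allele, the frequency converges to the same interior value, so neither allele is lost. This is the sense in which balancing selection maintains variation at the locus.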

Time Scales for the Signatures of Selection
How do we detect the loci under natural selection from genome data?

1. Reduction in Genetic Diversity
Hitchhiking of neutral mutations near beneficial mutations
Selective sweep lowers the genetic diversity
Subsequent mutations restore genetic diversity
  Rare alleles at low frequency
  ~1 million years for a rare allele to drift to high frequency

Low Diversity and Many Rare Alleles
Speed of positive selection
  Rapid selection: larger selected region, easier detection, but harder to distinguish between hitchhiking and beneficial variants
Recombination rate
  Lower recombination rate leads to a larger selected region
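One common way to quantify "low diversity and many rare alleles" is Tajima's D, which compares the average number of pairwise differences (π) with the diversity expected from the number of segregating sites (Watterson's estimator). The slides do not name this statistic, so its use here is an assumption; the input encoding (strings of "0" for ancestral and "1" for derived alleles) and the function name are also illustrative.

```python
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length 0/1 haplotype strings.

    Returns None when the statistic is undefined in this sketch
    (fewer than 4 haplotypes or no segregating sites).
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    counts = [sum(1 for h in haplotypes if h[i] == "1") for i in range(n_sites)]
    seg_counts = [k for k in counts if 0 < k < n]
    s = len(seg_counts)
    if n < 4 or s == 0:
        return None
    # Average number of pairwise differences.
    pi = sum(2.0 * k * (n - k) for k in seg_counts) / (n * (n - 1))
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Numerator: pi minus Watterson's estimator S / a1.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Every variant is a singleton: an excess of rare alleles gives D < 0.
    print(tajimas_d(["10000", "01000", "00100", "00010", "00001"]))
```

A negative value indicates an excess of rare alleles relative to neutral expectations, as after a sweep. As noted under the difficulties slide, population expansion can produce the same pattern, so a negative D alone is not evidence of selection.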

Low Diversity and Many Rare Alleles

2. High-frequency Derived Alleles
Derived vs ancestral alleles
  Use alleles in closely related species to distinguish them
Derived alleles typically have lower frequency, but under selective pressure the frequency rises
Many high-frequency derived alleles as a signature of positive selection
High-frequency derived alleles rapidly drift to near fixation, so this signature persists for a shorter period of time.
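Polarizing alleles with a closely related species can be sketched as follows: the outgroup base at each site is taken as the ancestral state, and any other base in the sample counts as derived. This is a simplified illustration. The function names, the 0.8 cutoff for "high frequency", and the choice to skip sites that are monomorphic, multi-allelic, or not polarizable are assumptions made here.

```python
def derived_allele_frequencies(sample, outgroup):
    """Derived allele frequency per site, or None if not polarizable.

    sample: list of equal-length aligned sequences from one population.
    outgroup: aligned sequence from a closely related species, whose
              base is ASSUMED to be the ancestral state.
    """
    n = len(sample)
    freqs = []
    for i, ancestral in enumerate(outgroup):
        column = [seq[i] for seq in sample]
        alleles = set(column)
        if len(alleles) != 2 or ancestral not in alleles:
            # Monomorphic, multi-allelic, or outgroup base not observed.
            freqs.append(None)
            continue
        derived = sum(1 for base in column if base != ancestral)
        freqs.append(derived / n)
    return freqs


def fraction_high_frequency(freqs, threshold=0.8):
    """Share of polarizable sites whose derived allele exceeds threshold."""
    usable = [f for f in freqs if f is not None]
    if not usable:
        return None
    return sum(1 for f in usable if f > threshold) / len(usable)


if __name__ == "__main__":
    sample = ["ACGT", "ACGA", "TCGA", "TCGA"]
    freqs = derived_allele_frequencies(sample, outgroup="ACGT")
    print(freqs, fraction_high_frequency(freqs, threshold=0.7))
```

A region in which an unusually large share of polarizable sites carry high-frequency derived alleles would be a candidate for the signature described above.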

High-frequency Derived Alleles
Duffy red cell antigen in Africans: selection for resistance to P. vivax malaria (red: derived, grey: ancestral)

3. Population Differences
Geographically separated populations and different selective pressure on different populations.
Relatively large differences in allele frequencies between populations as a signature of positive selection.
Useful only for positive selection after population differentiation (e.g., human migration out of Africa 50,000-75,000 years ago)
$F_{ST}$ can be used as a test statistic
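For a single biallelic site, F_ST can be computed from the allele frequencies in each population as the fraction of total heterozygosity that lies between populations. The sketch below uses the simple definition F_ST = (H_T - H_S) / H_T with equally weighted populations. The slides do not specify an estimator, so this choice, the function name, and the example frequencies are assumptions.

```python
def fst_biallelic(pop_freqs):
    """F_ST at one biallelic site from per-population allele frequencies.

    pop_freqs: list of frequencies of the same allele, one per population.
    Populations are weighted equally (an assumption of this sketch).
    """
    k = len(pop_freqs)
    p_bar = sum(pop_freqs) / k
    # Expected heterozygosity in the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in pop_freqs) / k
    if h_total == 0.0:
        return 0.0  # Site is monomorphic overall; no differentiation.
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    print(fst_biallelic([0.5, 0.5]))  # identical frequencies
    print(fst_biallelic([0.2, 0.8]))  # moderate differentiation
    print(fst_biallelic([0.0, 1.0]))  # fixed difference
```

In a genome scan, sites whose F_ST is extreme relative to the genome-wide distribution would be the candidates. Estimators used in practice also correct for sample size, which this sketch omits.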

Extreme Population Differences
FY*O allele for resistance to P. vivax malaria

4. Long Haplotypes
Alleles under recent positive selection have undergone little breakdown by recombination, giving long haplotypes with high-frequency alleles
This signature persists for only a short period of time before recombination breaks down the haplotype.
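Haplotype length around a candidate allele is often summarized by extended haplotype homozygosity (EHH): the probability that two randomly chosen chromosomes carrying the core allele are identical over a given interval. The slides do not name this statistic, so it is introduced here as an assumption. The sketch also simplifies the usual definition by using a symmetric window of `distance` sites on both sides of the core, and the function name and toy data are illustrative.

```python
from collections import Counter


def ehh(haplotypes, core_index, core_allele, distance):
    """EHH among carriers of core_allele within +/- distance sites.

    haplotypes: list of equal-length strings (one per chromosome).
    Returns None if fewer than two chromosomes carry the core allele.
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        return None
    lo = max(0, core_index - distance)
    hi = min(len(haplotypes[0]), core_index + distance + 1)
    # Group carriers by their extended haplotype over the window.
    classes = Counter(h[lo:hi] for h in carriers)
    identical_pairs = sum(c * (c - 1) // 2 for c in classes.values())
    return identical_pairs / (n * (n - 1) // 2)


if __name__ == "__main__":
    haps = ["00100", "00100", "00101", "10110", "01000"]
    for d in range(3):
        # EHH starts at 1.0 at the core and decays with distance.
        print(d, ehh(haps, core_index=2, core_allele="1", distance=d))
```

For an allele under recent positive selection, EHH would be expected to decay more slowly with distance than for alleles of similar frequency elsewhere in the genome, because recombination has had little time to break down the haplotype.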

Long Haplotypes
LCT allele for lactase persistence (high frequency ~77% in European populations but long haplotypes)

Determining Function of the Candidate Loci
What is the associated trait that is being selected, given the candidate locus for selection?
  Examine the annotations of the genes
  Look for associations with phenotypes in a population
  Use comparative genomics
    e.g., the SLC24A5 gene is positively selected in the human population. The zebrafish homolog of this gene determines a pigmentation phenotype.
It is harder to identify function for variants that have reached fixation in the human population
  e.g., speech-related genes

Natural Selection and Diseases
Positive selection
  Response to pathogens, environmental conditions, diet.
  Beneficial mutations for infectious diseases can be selected rapidly
Negative selection
  Deleterious mutations with only small or moderate effects can reach a low frequency under relatively low selective pressure

Difficulties in Detecting Natural Selection
Confounding effects of demography
  Population bottleneck and expansion can leave signatures that look like positive selection
Ascertainment bias for SNPs
  Regions where many sequences were used for ascertainment may appear to have more segregating alleles at low frequencies, with more haplotypes.
Recombination rate
  Signatures of selection are stronger in regions with low recombination rates