Detecting selection from differentiation between populations: the FLK and hapflk approach.
1 Detecting selection from differentiation between populations: the FLK and hapflk approach.
Bertrand Servin
Maria-Ines Fariello, Simon Boitard, Claude Chevalet, Magali SanCristobal, Maxime Bonhomme
INRA Animal Genetics, Toulouse, France
June 18, 2013
2 Introduction
We consider a set of populations differentiated through the effect of drift. As selection modifies allele frequencies within a population, it amplifies differentiation between populations at selected loci.
Differentiation-based tests for selection are characterized by:
- their model for background differentiation (the "neutral" demographic model);
- how they capture outliers / model selective effects.
FLK (and hapflk):
- Pure drift model, with population splits (tree demography, "phylogeny").
- Outlier approach: look for genome regions where the neutral (null) model does not fit well (goodness-of-fit statistic).
3 Outline
1 Theoretical Background
- Neutral model for SNP data in multiple populations
- Single SNP statistics for detecting selection: FLK
- Incorporating haplotype information: hapflk
2 Genome Scanning with hapflk
6 Single Population: evolution of allele frequency under drift
Consider a biallelic locus (SNP) in a population of size $N$ evolving under pure drift, starting at a frequency $p_0$. Let $F_t$ be the fixation index of the population after $t$ generations:
$F_t = 1 - \left(1 - \frac{1}{2N}\right)^t \approx \frac{t}{2N}$
Provided $F_t$ is small, we can model (see Nicholson et al., 2002):
$p(t) \sim N\left(p_0,\; F_t\, p_0 (1 - p_0)\right)$
7 Single Population: evolution of allele frequency under drift
[Figure: simulated allele frequency trajectories compared with the Normal approximation]
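The drift model above can be checked with a small simulation. The following is a minimal sketch, not part of the original slides: it assumes a Wright-Fisher population (binomial resampling of 2N gene copies each generation), and the function names and parameter values are illustrative only. It compares the variance of simulated final frequencies with $F_t\, p_0 (1 - p_0)$.

```python
import random
import statistics


def fixation_index(t, n):
    """F_t = 1 - (1 - 1/(2N))^t for a diploid population of size N."""
    return 1.0 - (1.0 - 1.0 / (2 * n)) ** t


def simulate_drift(p0, n, t, rng):
    """Final allele frequency of one Wright-Fisher trajectory (assumed model)."""
    p = p0
    for _ in range(t):
        # Each of the 2N gene copies of the next generation is drawn
        # independently and carries the allele with probability p.
        copies = sum(1 for _ in range(2 * n) if rng.random() < p)
        p = copies / (2 * n)
    return p


rng = random.Random(1)          # fixed seed so the run is reproducible
p0, n, t = 0.5, 100, 20         # illustrative values, not from the slides
finals = [simulate_drift(p0, n, t, rng) for _ in range(1000)]
f_t = fixation_index(t, n)

print(f"F_t = {f_t:.4f}   t/2N = {t / (2 * n):.4f}")
print(f"simulated variance  = {statistics.pvariance(finals):.4f}")
print(f"F_t * p0 * (1 - p0) = {f_t * p0 * (1 - p0):.4f}")
```

The simulated variance should be close to the model variance; the Normal shape itself is only an approximation that degrades as $F_t$ grows or as $p_0$ approaches 0 or 1.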
8 Multiple populations: star-like evolution
Consider an ancestral population that splits at time $t_0$ into multiple populations evolving in parallel, i.e. a star-like population tree. Assuming no mutation after the split ($F_t$ is small), for each population $i$:
$p_i \sim N\left(p_0,\; F_i\, p_0 (1 - p_0)\right)$
NB: if we were to assume the same $F_i$ for each population, then $F_i = F_{ST}$.
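9 Multiple populations: tree-like evolution
Under the star-like model, conditional on $p_0$, all populations are independent ($\mathrm{Cov}(p_i, p_j) = 0$). If we allow $\mathrm{Cov}(p_i, p_j) \neq 0$, we obtain a population tree. Example with three populations, where populations 1 and 2 share an ancestral branch:
$F_3 = 1 - \left(1 - \frac{1}{2N_3}\right)^{t}$, $\quad f_{12} = 1 - \left(1 - \frac{1}{2N_{12}}\right)^{t_{12}}$
$\mathrm{Var}(p_i) = F_i\, p_0 (1 - p_0)$, $\quad \mathrm{Cov}(p_i, p_j) = f_{ij}\, p_0 (1 - p_0)$
Kinship matrix:
$\mathbf{F} = \begin{pmatrix} F_1 & f_{12} & 0 \\ f_{12} & F_2 & 0 \\ 0 & 0 & F_3 \end{pmatrix}$, $\quad \mathrm{Var}(\mathbf{p}) = \mathbf{F}\, p_0 (1 - p_0)$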
10 Estimation of the neutral model
This evolutionary model (population tree, pure drift) has two parameters: $p_0$ and $\mathbf{F}$. Suppose $\mathbf{F}$ is known; then a natural estimator of $p_0$ is the generalized least squares estimator:
$\hat{p}_0 = \frac{\mathbf{1}^T \mathbf{F}^{-1} \mathbf{p}}{\mathbf{1}^T \mathbf{F}^{-1} \mathbf{1}}$
Estimating $\mathbf{F}$ means reconstructing the population tree, with branch lengths expressed in units of fixation indices.
11 Estimation of the population kinship matrix
Branch lengths of the tree are measured in units of drift ($\approx t/2N$). For each pair of populations $i$ and $j$, the Reynolds genetic distance $D$ (Reynolds, Weir and Cockerham, 1983) has expectation (see Laval et al., 2002):
$E(D_{ij}) = \frac{F_i + F_j}{2}$
The population tree can be built using the neighbour-joining algorithm on the matrix of Reynolds distances, computed over many (on the order of $10^4$) SNPs. This assumes that the majority of them are neutral. Rooting the tree requires an outgroup; if none is available, midpoint rooting is used.
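As an illustration of the distance used above, here is a minimal sketch of a Reynolds-type distance for biallelic SNPs. It assumes the simple large-sample form (squared frequency differences summed over SNPs, divided by the summed between-population heterozygosity) and ignores sample-size corrections; it is not taken from the hapflk source, and the function name and frequencies are hypothetical.

```python
def reynolds_distance(freqs_i, freqs_j):
    """Reynolds-type distance between two populations from SNP allele frequencies.

    Assumed form for biallelic loci (no sample-size correction):
        D = sum_l (p_il - p_jl)^2 / sum_l (p_il + p_jl - 2 * p_il * p_jl)
    Under pure drift from a common ancestor, numerator and denominator have
    expectations (F_i + F_j) * p0 * (1 - p0) and 2 * p0 * (1 - p0) per SNP.
    """
    numerator = sum((a - b) ** 2 for a, b in zip(freqs_i, freqs_j))
    denominator = sum(a + b - 2 * a * b for a, b in zip(freqs_i, freqs_j))
    return numerator / denominator


# Hypothetical allele frequencies at five SNPs in two populations.
pop_a = [0.20, 0.55, 0.70, 0.35, 0.90]
pop_b = [0.40, 0.50, 0.60, 0.45, 0.80]
print(f"D = {reynolds_distance(pop_a, pop_b):.4f}")
```

Computing this for every pair of populations gives the distance matrix that a neighbour-joining routine would take as input.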
12 Conclusions on the neutral model
We have described a neutral model for population allele frequencies at a SNP. We can estimate the model parameters:
- $p_0$: ancestral allele frequency, locus specific;
- $\mathbf{F}$: population kinship matrix, constant across loci.
Note that other procedures could be used to estimate $\mathbf{F}$: the hapflk software accepts any kinship matrix.
In our context, detecting selection means identifying loci for which the neutral model is not a good fit.
14 The FLK statistic: goodness-of-fit of the neutral model
We can think of our neutral model as a linear model:
$\mathbf{p} = \mathbf{1} p_0 + \mathbf{r}$, with $\mathbf{r} \sim N(\mathbf{0}, \mathbf{V})$ and $\mathbf{V} = \mathbf{F}\, p_0 (1 - p_0)$.
A goodness-of-fit statistic for this model, estimated at a particular locus, is the deviance
$(\mathbf{p} - \mathbf{1}\hat{p}_0)^T\, \mathbf{V}^{-1}\, (\mathbf{p} - \mathbf{1}\hat{p}_0)$,
named the FLK statistic (Bonhomme et al., 2010). Under $H_0$ (neutral model), for $n$ populations, FLK follows a $\chi^2(n-1)$ distribution.
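The GLS estimate of $p_0$ and the FLK statistic can be computed directly from these formulas. The sketch below is an illustration under stated assumptions rather than the hapflk implementation: $\mathbf{V}$ is evaluated by plugging in $\hat{p}_0$, the kinship matrix and frequencies are made-up numbers, and the small linear solver is a helper written for this example.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def flk(p, kinship):
    """Return (p0_hat, FLK) for population frequencies p and kinship matrix F."""
    weights = solve(kinship, [1.0] * len(p))            # F^{-1} 1
    # GLS estimate: 1' F^{-1} p / 1' F^{-1} 1 (F is symmetric).
    p0_hat = sum(w * pi for w, pi in zip(weights, p)) / sum(weights)
    resid = [pi - p0_hat for pi in p]                   # p - 1 * p0_hat
    f_inv_resid = solve(kinship, resid)                 # F^{-1} (p - 1 * p0_hat)
    quad = sum(r * z for r, z in zip(resid, f_inv_resid))
    # Assumption: V = F * p0_hat * (1 - p0_hat), i.e. p0 replaced by its estimate.
    return p0_hat, quad / (p0_hat * (1.0 - p0_hat))


# Hypothetical 3-population tree: populations 1 and 2 share a branch.
F = [[0.10, 0.04, 0.00],
     [0.04, 0.10, 0.00],
     [0.00, 0.00, 0.12]]
p0_hat, stat = flk([0.30, 0.35, 0.60], F)
print(f"p0_hat = {p0_hat:.4f}, FLK = {stat:.4f} (compare with chi-square, df = 2)")
```

With a diagonal kinship matrix with equal entries, `p0_hat` reduces to the plain mean of the population frequencies, which is the star-like special case.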
16 Relationship with other statistics
If we were to assume a star-like population tree with equal branch lengths:
- $\mathbf{F} = \mathbf{I}_n \bar{F}_{ST}$, where $\bar{F}_{ST}$ is the mean $F_{ST}$ over loci (genome-wide $F_{ST}$);
- $\hat{p}_0 = \bar{p}$;
- $\mathrm{FLK} = (n-1)\, F_{ST} / \bar{F}_{ST}$.
The Lewontin and Krakauer (1973) statistic (LK) is:
$\mathrm{LK} = (n-1)\, F_{ST} / \bar{F}_{ST}$
- The LK statistic gives the same ranking as $F_{ST}$.
- LK (or $F_{ST}$) scans for selection assume a very particular evolutionary model for populations.
- Outliers of LK (or $F_{ST}$) indicate a bad fit of this model, which might not be due to selection but to a wrong evolutionary model under $H_0$.
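The reduction of FLK to LK can be written out step by step. The derivation below is a sketch that assumes the single-locus $F_{ST}$ is defined as in Lewontin and Krakauer, i.e. the sample variance of population frequencies $s_p^2$ divided by $\bar{p}(1-\bar{p})$; the symbol $s_p^2$ is introduced here for illustration.

```latex
% Star-like tree with equal branch lengths: F = I_n \bar{F}_{ST}
\hat{p}_0 = \frac{\mathbf{1}^T \mathbf{p}}{\mathbf{1}^T \mathbf{1}}
          = \frac{1}{n}\sum_{i=1}^{n} p_i = \bar{p},
\qquad
\mathbf{V} = \mathbf{I}_n\, \bar{F}_{ST}\, \bar{p}(1-\bar{p})

\mathrm{FLK}
  = (\mathbf{p} - \mathbf{1}\bar{p})^T \mathbf{V}^{-1} (\mathbf{p} - \mathbf{1}\bar{p})
  = \frac{\sum_{i=1}^{n} (p_i - \bar{p})^2}{\bar{F}_{ST}\, \bar{p}(1-\bar{p})}
  = \frac{(n-1)\, s_p^2}{\bar{F}_{ST}\, \bar{p}(1-\bar{p})}
  = (n-1)\, \frac{F_{ST}}{\bar{F}_{ST}}
  = \mathrm{LK},

\text{where } s_p^2 = \frac{1}{n-1}\sum_{i=1}^{n} (p_i - \bar{p})^2
\text{ and } F_{ST} = \frac{s_p^2}{\bar{p}(1-\bar{p})}.
```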
18 Principle
1 Incorporate a haplotype diversity model within the FLK framework.
2 Considering haplotypes as multi-allelic markers, use a multiallelic version of FLK (Bonhomme et al., 2010).
However, haplotypes are not ancestral alleles (recombination happens). A modified multiallelic version is therefore used: the hapflk statistic (Fariello et al., 2013). Its distribution is unknown.
19 The Scheet and Stephens (aka fastPHASE) model
- Models the local similarity between haplotypes via a reduction of dimension: local clustering of haplotypes.
- The underlying clusters can be considered as local haplotypes. Their definition changes along the chromosome.
- The model is a Hidden Markov Model whose hidden states are the clusters.
- As for all mixture models, the number of components K needs to be specified.
20 Using LD models for FLK
Transform SNP genotypes into multiallelic genotypes, based on the posterior probability $P(Z_{il} = k \mid G_i)$, where:
- $Z_{il}$: underlying cluster for individual $i$ at SNP $l$;
- $G_i$: observed SNP genotype (multilocus).
Consider haplotype clusters as alleles. The frequency of a cluster within a population is:
$p_{kl} = \frac{1}{N} \sum_{i} P(Z_{il} = k \mid G_i)$
Advantages:
- No need for sliding windows.
- The model can be estimated on unphased genotype data.
- Can incorporate missing data (e.g. mixture of dense and sparse data...).
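The cluster frequency formula above is an average of posterior probabilities over the individuals of a population. A minimal sketch, assuming the posteriors at one SNP are already available as one probability vector per individual (the values and the function name below are hypothetical, not output of fastPHASE or hapflk):

```python
def cluster_frequencies(posteriors):
    """Population cluster frequencies p_kl at one SNP l.

    posteriors[i][k] holds P(Z_il = k | G_i) for individual i and cluster k;
    the result averages these probabilities over the N individuals.
    """
    n_individuals = len(posteriors)
    n_clusters = len(posteriors[0])
    return [
        sum(individual[k] for individual in posteriors) / n_individuals
        for k in range(n_clusters)
    ]


# Hypothetical posteriors for N = 4 individuals and K = 3 clusters at one SNP.
posteriors = [
    [0.90, 0.10, 0.00],
    [0.50, 0.50, 0.00],
    [0.00, 0.20, 0.80],
    [0.60, 0.00, 0.40],
]
freqs = cluster_frequencies(posteriors)
print([round(f, 3) for f in freqs])
```

Because each individual's posterior vector sums to one, the resulting cluster frequencies also sum to one and can be treated like multiallelic marker frequencies.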
22 Example data: sheep from Northern Europe
Kijas et al. (2012) PLoS Biology.
6 populations + outgroup (Soay), 388 individuals, 49K SNPs.
Available at
23 Before diving in...
Remember the assumptions underlying the neutral model:
- Population tree
- Pure drift model (no mutations, no admixture)
- Small $F_i$ (say < 0.2)
This means:
- Discard strongly bottlenecked or admixed populations.
- Consider that low-frequency variants are more likely to have appeared after the population split.
Perform a diversity analysis beforehand:
- Population structure (STRUCTURE, PCA, treemix...)
- Within-population kinship between individuals, to identify a set of unrelated individuals
24 Get the software :) Available for Linux 64bits and MacOSX For estimation of the kinship matrix, needs R with ape and phangorn packages.
25 Run single SNP analysis
hapflk reads PLINK files (ped/map or bed/bim/fam); the first column (FID) must give the population name.
    hapflk --bfile NorthernSheep --outgroup Soay
    1. [ 00:00:00 ] Reading Input Files
    2. [ 00:00:58 ] Computing Allele Frequencies NewZealandRomney
    3. [ 00:02:21 ] Computing Reynolds distances
    4. [ 00:02:21 ] Computing Kinship Matrix
    Loading required package: ape
    5. [ 00:02:21 ] Computing FLK tests
    6. [ 00:02:32 ] Writing down results
    7. [ 00:02:36 ] The End
NB: single SNP analysis is fast.
26 Output files
- hapflk_reynolds.txt: Reynolds distance matrix
- hapflk_poptree.pdf: population tree figure
- hapflk_kinship.r: R code for estimating the kinship matrix
- hapflk_fij.txt: kinship matrix
- hapflk.frq: allele frequencies
- hapflk-snp-reynolds.txt: Reynolds distances in the region (more later)
- hapflk.flk: FLK results
27 Population Tree
[Figure: population tree with tips IrishSuffolk, NewZealandRomney, Galway, GermanTexel, ScottishTexel, NewZealandTexel]
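28 Fit of the χ² distribution (1)
    flk = read.table("hapflk.flk", header = TRUE)
    # keep SNPs with an intermediate estimated ancestral frequency
    mysnps = flk$pzero > 0.05 & flk$pzero < 0.95
    hist(flk$flk[mysnps], n = 50, freq = FALSE, xlab = "FLK", main = "")
    # xx is not defined on the slide; this grid is an assumption
    xx = seq(0, 30, by = 0.1)
    lines(xx, dchisq(xx, df = 5), lwd = 2)
[Figure: histogram of FLK (x-axis: FLK, y-axis: density) with the χ²(5) density overlaid]
Good overall fit, with slightly fewer high values than expected under a χ²(5). Relatively high drift ($F_i \approx 0.16$).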
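29 Fit of the χ² distribution (2)
    hist(flk$flk[!mysnps], n = 50, freq = FALSE, xlab = "FLK", main = "")
[Figure: histogram of FLK (x-axis: FLK, y-axis: density) for the remaining SNPs]
No fit: for these SNPs, our neutral model is clearly wrong (it assumes no mutation in the tree). Proceed with caution for SNPs with low or high $\hat{p}_0$.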
30 FLK Manhattan plot
[Figure: Manhattan plot of FLK p-values; y-axis: -log10(p)]
Large drift affects the power of single SNP tests.
31 Let's go the haplotype way
Wait..., what K? Use the fastPHASE cross-validation routine on your favorite chromosome (the big one). Hint:
    plink --chr 1 --recode-fastphase...
Then run:
    hapflk2 --bfile NorthernSheep --outgroup Soay --kinship hapflk_fij.txt --chr 1 -K 40 --ncpu 7 -p OAR1
One run for each chromosome.
Note: if the data are phased, or come from inbred lines, this can be specified with --phased or --inbred, which makes fitting the LD model much faster.
32 hapflk output files
- hapflk.hapflk: hapflk results
- hapflk.kfrq.fit_{n}.bz2: haplotype cluster frequencies
The fastPHASE model is estimated several (T) times (20 by default), and the hapflk statistic is averaged over these T fits. For each fit, the haplotype cluster frequencies are given.
33 Distribution of the hapflk statistic
In this particular case, the distribution is close to normal, plus outliers. Robust estimation of the Normal distribution parameters:
    require(MASS)                 # provides rlm()
    mod = rlm(hapflk ~ 1)         # robust fit of an intercept-only model
    mu = mod$coefficients[1]      # robust location estimate
    ss = mod$s                    # robust scale estimate
    pvalue = 1 - pnorm(hapflk, mean = mu, sd = ss)
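The same idea (fit a Normal null robustly, then take upper-tail p-values) can be sketched without R. Note that this example swaps the estimator: instead of the M-estimator fitted by rlm, it uses the median and the scaled median absolute deviation (MAD), which is a simpler robust alternative and will not reproduce rlm's numbers exactly. The input values are made up.

```python
from statistics import NormalDist, median


def robust_normal_pvalues(values):
    """Upper-tail p-values under a Normal null fitted with median and MAD.

    This is a stand-in for the rlm-based fit, not the same estimator.
    The factor 1.4826 makes the MAD consistent with the standard deviation
    for Normal data. A MAD of zero (more than half of the values identical)
    makes NormalDist.cdf raise an error, so the input needs some spread.
    """
    mu = median(values)
    mad = median(abs(v - mu) for v in values)
    null = NormalDist(mu, 1.4826 * mad)
    return [1.0 - null.cdf(v) for v in values]


# Hypothetical hapflk-like scores: a bulk of values plus one clear outlier.
scores = [-1.0, -0.5, 0.0, 0.5, 1.0, 8.0]
pvalues = robust_normal_pvalues(scores)
for score, pvalue in zip(scores, pvalues):
    print(f"{score:5.1f}  p = {pvalue:.3g}")
```

Because the median and MAD are barely affected by a few extreme values, the outlier does not inflate the fitted scale and keeps a very small p-value.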
34 Manhattan plot: hapflk
[Figure: Manhattan plot of the hapflk statistic]
hapflk reveals clear outlying regions.
35 Looking at a particular region
Once outlying regions are found, we want to know which population(s) experienced a selection event:
- Look at local allele frequencies.
- Build local population trees to find which branch(es) have been affected.
- (Eigen decomposition of hapflk)
36 Local SNP and Cluster frequencies
chr 2: selection in Texel breeds, GDF8 (MSTN) mutation.
An R script for haplotype cluster plots is provided on the hapflk webpage.
37 Local SNP and Cluster frequencies chr 14 : less obvious
38 Local population trees Script for making these trees to be released...
39 Example on the 1000 Bull genomes data
Differentiation-based tests can find causal mutations: example of coat color mutations (MC1R) in the 1000 Bull genomes dataset.
[Figure: test statistic along the region; x-axis: position (Kbp)]
40 References
- Nicholson et al. (2002) J. Roy. Stat. Soc. B, 64(4).
- Laval et al. (2002) Genetics Selection Evolution, 34(4).
- Bonhomme et al. (2010) Genetics, 186(1).
- Fariello et al. (2013) Genetics, 193(3).
Theoretical and computational aspects of association tests: application in case-control genome-wide association studies Mathieu Emily November 18, 2014 Caen mathieu.emily@agrocampus-ouest.fr - Agrocampus
More informationBreeding Values and Inbreeding. Breeding Values and Inbreeding
Breeding Values and Inbreeding Genotypic Values For the bi-allelic single locus case, we previously defined the mean genotypic (or equivalently the mean phenotypic values) to be a if genotype is A 2 A
More informationGenetics: Early Online, published on February 26, 2016 as /genetics Admixture, Population Structure and F-statistics
Genetics: Early Online, published on February 26, 2016 as 10.1534/genetics.115.183913 GENETICS INVESTIGATION Admixture, Population Structure and F-statistics Benjamin M Peter 1 1 Department of Human Genetics,
More informationLinkage disequilibrium and the genetic distance in livestock populations: the impact of inbreeding
Genet. Sel. Evol. 36 (2004) 281 296 281 c INRA, EDP Sciences, 2004 DOI: 10.1051/gse:2004002 Original article Linkage disequilibrium and the genetic distance in livestock populations: the impact of inbreeding
More informationUse of hidden Markov models for QTL mapping
Use of hidden Markov models for QTL mapping Karl W Broman Department of Biostatistics, Johns Hopkins University December 5, 2006 An important aspect of the QTL mapping problem is the treatment of missing
More informationY-STR: Haplotype Frequency Estimation and Evidence Calculation
Calculating evidence Further work Questions and Evidence Mikkel, MSc Student Supervised by associate professor Poul Svante Eriksen Department of Mathematical Sciences Aalborg University, Denmark June 16
More informationModelling Genetic Variations with Fragmentation-Coagulation Processes
Modelling Genetic Variations with Fragmentation-Coagulation Processes Yee Whye Teh, Charles Blundell, Lloyd Elliott Gatsby Computational Neuroscience Unit, UCL Genetic Variations in Populations Inferring
More information, Helen K. Pigage 1, Peter J. Wettstein 2, Stephanie A. Prosser 1 and Jon C. Pigage 1ˆ. Jeremy M. Bono 1*
Bono et al. BMC Evolutionary Biology (2018) 18:139 https://doi.org/10.1186/s12862-018-1248-4 RESEARCH ARTICLE Genome-wide markers reveal a complex evolutionary history involving divergence and introgression
More informationUsing haplotypes for the prediction of allelic identity to fine-map QTL: characterization and properties
Using haplotypes for the prediction of allelic identity to fine-map QTL: characterization and properties Laval Jacquin 1,2,3 Corresponding author Email: Julien.Jacquin@toulouse.inra.fr Jean-Michel Elsen
More informationI N N O V A T I O N L E C T U R E S (I N N O l E C) Petr Kuzmič, Ph.D. BioKin, Ltd. WATERTOWN, MASSACHUSETTS, U.S.A.
I N N O V A T I O N L E C T U R E S (I N N O l E C) Binding and Kinetics for Experimental Biologists Lecture 2 Evolutionary Computing: Initial Estimate Problem Petr Kuzmič, Ph.D. BioKin, Ltd. WATERTOWN,
More informationThe Quantitative TDT
The Quantitative TDT (Quantitative Transmission Disequilibrium Test) Warren J. Ewens NUS, Singapore 10 June, 2009 The initial aim of the (QUALITATIVE) TDT was to test for linkage between a marker locus
More informationDemographic Inference with Coalescent Hidden Markov Model
Demographic Inference with Coalescent Hidden Markov Model Jade Y. Cheng Thomas Mailund Bioinformatics Research Centre Aarhus University Denmark The Thirteenth Asia Pacific Bioinformatics Conference HsinChu,
More informationopulation genetics undamentals for SNP datasets
opulation genetics undamentals for SNP datasets with crocodiles) Sam Banks Charles Darwin University sam.banks@cdu.edu.au I ve got a SNP genotype dataset, now what? Do my data meet the requirements of
More informationTutorial Session 2. MCMC for the analysis of genetic data on pedigrees:
MCMC for the analysis of genetic data on pedigrees: Tutorial Session 2 Elizabeth Thompson University of Washington Genetic mapping and linkage lod scores Monte Carlo likelihood and likelihood ratio estimation
More informationSpatial localization of recent ancestors for admixed individuals
G3: Genes Genomes Genetics Early Online, published on November 3, 2014 as doi:10.1534/g3.114.014274 Spatial localization of recent ancestors for admixed individuals Wen-Yun Yang 1, Alexander Platt 2, Charleston
More informationMapping QTL to a phylogenetic tree
Mapping QTL to a phylogenetic tree Karl W Broman Department of Biostatistics & Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman Human vs mouse www.daviddeen.com 3 Intercross
More informationPopulation Genetics: a tutorial
: a tutorial Institute for Science and Technology Austria ThRaSh 2014 provides the basic mathematical foundation of evolutionary theory allows a better understanding of experiments allows the development
More informationPhasing via the Expectation Maximization (EM) Algorithm
Computing Haplotype Frequencies and Haplotype Phasing via the Expectation Maximization (EM) Algorithm Department of Computer Science Brown University, Providence sorin@cs.brown.edu September 14, 2010 Outline
More informationBustamante et al., Supplementary Nature Manuscript # 1 out of 9 Information #
Bustamante et al., Supplementary Nature Manuscript # 1 out of 9 Details of PRF Methodology In the Poisson Random Field PRF) model, it is assumed that non-synonymous mutations at a given gene are either
More informationLies, damn lies, and. genomics
Lies, damn lies, and. genomics you, your data, your perceptions and reality Christopher West Wheat Goal of this lecture Present a critical view of ecological genomics Make you uncomfortable by sharing
More information