Competing selective sweeps


Sebastian Bossert

Dissertation for the award of the doctoral degree of the Faculty of Mathematics and Physics of the Albert-Ludwigs-Universität Freiburg im Breisgau


Abstract

The principle of natural selection, characterised by Darwin and Wallace in 1858, completely and radically changed the way of thinking about the development of species. They were the first to point out that certain traits improve an individual's chance to survive and reproduce and are therefore inherited more frequently. Scientific progress and technical developments have since led to the identification of the underlying molecular mechanisms and to the decoding of DNA, the biopolymer that stores the genetic information of an individual. This progress enables scientists today to search for beneficial traits at the molecular level. To do so, they apply detection tools that make use of the existing DNA variation in a population. Such tools are often developed using theoretical models, and they search for special DNA patterns which indicate that selection might have acted in the respective DNA region. A pioneering theoretical approach that contributed to the development of the first detection tools was presented by Maynard Smith and Haigh in 1974. They analysed, in a theoretical framework, the process by which a new, strongly selected advantageous mutation becomes fixed in a population. Under the assumption of an otherwise neutral panmictic population, such a (single hard) selective sweep leads to a reduction of diversity around the selected locus. In the following years other scientists identified further properties that characterise such an evolution, e.g. an increased number of low- and high-frequency variants. In addition, researchers addressed the question to what extent these characteristics still hold when certain assumptions are modified. Such approaches and extensions are necessary in order to identify different kinds of selective mechanisms and evolutions. In this work a situation is examined in which two selective sweeps within a narrow genomic region overlap in a sexually evolving population.
For such a competing sweeps situation, a mathematical model based on reasonable biological assumptions is first set up to identify what kinds of evolution can occur. Of particular interest is the probability of fixation of both beneficial alleles in cases where these alleles are not initially linked. To handle this question a graphical tool, the ancestral selection recombination graph, is utilized, which is based on a genealogical view of the population. This approach provides a limit result (for large selection coefficients) for the probability that both beneficial mutations will eventually fix, and it enables us to analyse the role of selection, recombination and the population size. In particular, we establish that under certain starting conditions the fixation probability depends heavily on the population size. Although we restrict ourselves mainly to panmictic populations, we argue heuristically how the presented results can be extended to island models. The analytical examination is complemented by a simulation study. On the one hand, we analyse to what extent the derived limit formulas are suited for large finite populations. On the other hand, simulations are conducted to identify possible signatures of the considered scenario. These simulations show that fixation of both beneficial alleles leads, in the region between the two selected loci, to patterns that differ from the single sweep case. The theoretical results suggest that this is attributable to the fact that in a competing sweeps setting different founders can appear and contribute to the genomic variation at fixation time. Altogether, the theoretical examination as well as the simulation study indicate that competing selective sweeps might be an explanation for strong haplotype patterns found in SNP data of Drosophila melanogaster.

Declaration of contributions as a co-author

This dissertation presents the results of my doctoral research from August 2012 until February. The research has been conducted in collaboration with other scientists. The thesis consists of two main parts, an analytical part and a corresponding simulation study. The first topic, entitled "Fixation probabilities in a competing sweeps scenario", was supervised by Prof. Dr. Peter Pfaffelhuber. It presents theoretical limit results developed together with Prof. Dr. Peter Pfaffelhuber. I wrote the manuscript and proved the theorems. The second part builds on the analytical results and presents simulation results. Under the supervision of Prof. Dr. Peter Pfaffelhuber, I wrote the program to simulate data for the Wright-Fisher population. In this part additional simulation programs and statistics from other authors were used. The use of these programs is clearly indicated. I evaluated the obtained data and wrote the manuscript.

Publication of another topic

Besides the work on the dissertation, the following publication was developed in collaboration with Prof. Dr. Peter Pfaffelhuber: Bossert, Sebastian and Peter Pfaffelhuber. The Yule approximation for the site frequency spectrum after a selective sweep. PLoS ONE, 9(1). This publication grew out of my Diploma thesis entitled "Das Frequenzspektrum nach einem Selective Sweep" ("The frequency spectrum after a selective sweep") and is not presented in this thesis.



Contents

1 General introduction
2 Introduction to competing selective sweeps
    Models
    Heuristics on the event of fixation
3 Fixation probabilities in a competing sweeps scenario
    Main result
    Proof
    Diffusion to ASRG
    Inside the ASRG
    Supporting lemmata
4 Simulation results
    Fixation probabilities
    Methods
    Comparison results
    Extensions
    Signatures for the fixation of the double beneficial type
    Methods
    Statistical results
5 Discussion of results
    Accuracy and validity of the limit result
    Fixation signatures
6 Conclusions and Outlook
A Appendix
    A.1 Addenda to the proof of Theorem
    A.2 Additional figures
    A.3 Extensions of the Wright-Fisher model
    A.4 Additional statistical results
    A.5 Contents of the supplemental CD
List of Figures
List of Tables
Bibliography


Chapter 1 General introduction

It is by now well known that the blueprint of every individual is stored in its DNA. During reproduction the DNA is copied, and modifications can occur in the process. Some of these modifications can improve an individual's chances to survive and reproduce. While Darwin, one of the first scientists to describe this process, could only see the outcome, and only if the modifications led to a change in the visible structure, nowadays (nearly) all molecular DNA information is available. O'Connell described the impact of Darwin's discovery metaphorically with the following words: "One of the most exciting detective stories in modern times began with Darwin's theory of evolution." (O'Connell in Donnelly and Tavaré (1997)) And indeed one can interpret the progress in genetics as a more and more successful search for evidence of evolution and the mechanisms that underlie it. For the analysis only data from today (or from some years ago) are available, while the causes lie far back. So, figuratively speaking, the detective work consists in analysing and interpreting the present information to draw inferences about the past evolution that underlies the examined data. At first glance this investigation seems manageable, since there are essentially only four forces (mutation, selection, recombination, genetic drift) affecting a population's evolution. On closer inspection, however, one notices that these factors come in a huge diversity of forms. Furthermore, they strongly influence each other. These characteristics impede evolutionary research on the one hand, but on the other hand they contribute to the unique attractiveness of evolutionary biology. The tricky cases make for a fascinating detective story. Since evolutionary processes take place over hundreds or thousands of generations, mathematical models have turned out to be helpful for deriving knowledge about evolutionary events and their likelihood.
Such models were first developed by Haldane (1927), Fisher (1930), and Wright (1931) and helped to resolve the discrepancies between Darwin's theory of evolution and Mendelian inheritance. With their work they made a decisive contribution to the standard model of evolution, the modern evolutionary synthesis (see e.g. Dobzhansky, 1937; Huxley, 1942; Mayr, 1942). During the past decades both the technical and the analytical tools in genetics have improved enormously. Crucial stages of the technological development were the first DNA sequencing by Sanger et al. (1977) and the complete sequencing of the human genome by the Human Genome Project (International Human Genome Sequencing Consortium, 2004). These days next-generation sequencing techniques (e.g. Illumina sequencing) are typically used, which have accelerated DNA sequencing once again while at the same time reducing its costs. On the analytical side, primarily the Moran model (Moran, 1958) and the introduction of the genealogical view,

first recorded in Kingman's coalescent (Kingman, 1982), are to be mentioned. When based on reasonable assumptions, such mathematical models can yield helpful statistical methods, which deepen the understanding of the complex interrelationships between the evolutionary forces. This link between the theoretical and the practical side was put in a nutshell by John H. Gillespie (Gillespie, 2010): "Population geneticists spend most of their time doing one of two things: describing the genetic structure of populations or theorizing on the evolutionary forces acting on populations. On a good day, these two activities mesh and true insights emerge." Selection represents the most important evolutionary force for shifts of allele frequencies in large populations and leads to a better adaptation to the environment. Therefore one major challenge in evolutionary theory is to identify genomic regions that appear to be affected by selection. Here one can distinguish between macro- and micro-evolutionary levels. While macro-evolutionary studies focus on selective events that happened in the deep past, based on comparisons of different species, micro-evolutionary studies refer to smaller, more recent evolutionary changes within a population. Such micro-evolutionary studies are mainly based on statistical measures derived from Single Nucleotide Polymorphism (SNP) variation data. Here a SNP is a DNA sequence variation in which a single nucleotide differs between individuals of a population. In a follow-up study the functional consequences induced by the selected allele need to be characterized. By now there is a multitude of statistical methods for the identification of regions possibly under selection, which are reviewed e.g. in Vitti et al. (2013) or Wollstein and Stephan (2015). Some of these statistics will be treated in more detail in Section 4.2 and Section 5.2.
One reason for the large variety of statistics is that selection itself acts in a considerable variety of ways. An overview of the different forms of selection is given in Hartl and Clark (1997). In the following we will concentrate on one type of selection, namely strong directional selection, and will first review the theoretical achievements in this field. When a single beneficial mutation arises and rapidly sweeps to fixation, this process is called a selective sweep. In a seminal paper Maynard Smith and Haigh (1974) determined the consequences of such a process for the genomic region surrounding the selected locus. Under the assumption of an otherwise neutral population, a population-wide reduction in genetic diversity can be detected in the considered region. In the past 40 years many mathematical publications have built on this work and supplemented the findings. Two standard techniques which help to study this fixation process are diffusion processes and coalescent processes (see e.g. Ewens, 2004, Chap. 5 or Patwa and Wahl, 2008). Using such approaches, Kaplan et al. (1989), Stephan et al. (1992), Braverman et al. (1995), and Etheridge et al. (2006) have analysed the whole fixation process. Besides the reduction in genetic diversity, the neighbouring region of the selected locus has further characteristics. Fay and Wu (2000) were the first to report a U-shape of the frequency spectrum as a consequence of the selective sweep. A third feature is an elevated linkage disequilibrium score (LD score) on both sides of the selected locus, but not across the selected site (Stephan et al., 2006; Pfaffelhuber et al., 2008). LD is the non-random association of alleles at different loci and can be quantified by different scores. Based on these theoretical findings, statistical methods that are sensitive to this kind of signal were identified and developed (e.g. SweepFinder, SweeD, and OmegaPlus).
These methods have been applied successfully in numerous studies and again link theory and practice. In doing so, signals of selective sweeps could be identified, for example, in Drosophila (Voigt et al., 2015), Arabidopsis (Huber et al., 2014), humans (Sabeti et al., 2007), and influenza (Strelkowa and Lässig, 2012).

At this point a crucial aspect of interpreting the statistical results must be emphasized. In the basic theoretical selection model a special situation is assumed in which evolutionary forces like population structure or further beneficial (or deleterious) mutations are excluded. However, such forces almost certainly act in real populations, and they influence the evolutionary process and thereby the variation pattern. Hence, when the assumptions of the special (hard) selective sweep situation are not fulfilled, the statistics around a beneficial locus can change and the considered approach must be adapted. In order to take this issue into account, many authors have extended the basic model in various directions to obtain predictions for sequence diversity in these cases. For example, Pennings and Hermisson (2006a; 2006b) have analysed the selective process under altered conditions. They replaced the assumption that exactly one mutant starts the sweep by the assumption that the sweep is based on standing genetic variation. Santiago and Caballero (2005) examined the effect of genetic hitchhiking in subdivided populations, and Kim and Stephan (2000) investigated the joint effect of background selection and genetic hitchhiking. Authors like Teshima and Przeworski (2006) or Ewing et al. (2011) have taken directional selection with different forms of dominance, like recessive beneficial alleles, into account. The list of extensions is virtually endless. Nevertheless, the models often consist of only small adjustments and expansions of the basic model. So one might ask why only such small changes in the assumptions are applied, and not a combination of different changes that would give a realistic scenario. The answer is that even such small changes in the assumptions are hard enough to handle and analyse theoretically.
Hence more realistic scenarios have to be approached step by step, and at best every model extension and derived theoretical result helps to improve the understanding of the complex interaction of the evolutionary forces. In this thesis the following extension of the basic sweep model is considered. A situation is analysed in which two selective sweeps overlap. This means that a further beneficial mutation is assumed to arise at a different locus while the sweep of the first beneficial mutant is still in progress. Here the interaction between the different selected types and their loci must be incorporated, which leads to a considerable increase in complexity. Since more than one locus is under consideration, differences between asexual and sexual populations must be taken into account. In a sexual population the allelic types at two different loci can be inherited from different individuals (namely one from each parent) due to recombination events. Fisher (1930) and Muller (1932) were the first to point out that such recombination events can bring beneficial mutations onto the same genetic background. This can result in an evolutionary advantage compared to asexually evolving populations. In asexual populations, when different beneficial alleles appear on different backgrounds, only one of these alleles can fix in the whole population. Later on Hill and Robertson (1966) gave a theoretical reasoning for this evolutionary advantage (including simulations). They pointed out that selectively beneficial alleles occurring at linked loci interfere with each other and that recombination is able to affect the fixation of the alleles by bringing together alleles which were not initially linked. Hence in such a situation the recombination probability between the beneficial loci plays a major role for the fixation probability.
While the verbal arguments for the role of recombination in a situation with interfering selected alleles are easily given, a concrete analysis has to struggle with some difficulties (Taylor, 2007). Generally such a situation can be described as follows: A beneficial mutation, called A, with selective advantage $s_1$ is undergoing a selective sweep when a second beneficial mutation B (with selective advantage $s_2$) arises at a linked locus (on the wild-type background). An evolutionary model with two beneficial alleles has already been analysed by different authors, such as Stephan (1995), Barton (1995), Otto and Barton (1997), Chevin et al. (2008), Yu and Etheridge (2010), and Cuthbertson et al. (2012). Stephan (1995) investigates a two-locus, two-allele model with additive, directional selection and recombination, formulated in terms of

a four-dimensional ordinary differential equation. Results on stochastic models for a panmictic population focus on fixation probabilities. Here panmictic means a population without any spatial structure and therefore with completely random mating. Barton (1995) and Otto and Barton (1997) consider the case where the selective advantage $s_1$ of the first mutant A is larger than the advantage $s_2$ of the second positively selected mutant B and use a stochastic model to analyse the fixation probability. In that case the probability of fixation of the double mutant depends on the ratio $s_1/s_2$ of the selection coefficients and on the ratio between the recombination parameter $r$ and the selective advantage $s_1$, but only weakly on the effective population size. Yu and Etheridge (2010) and Cuthbertson et al. (2012) complement this with an analysis of the fixation probability when the first selection coefficient is smaller than the second ($s_2 > s_1$). They used various comparisons between the development of the different alleles and deterministic logistic growth curves. Furthermore, they used an ordinary differential equation for the spread of AB alleles (individuals with both beneficial alleles) to obtain an approximation for the fixation probability. Overall they observe a rather different behaviour in that case: now the fixation probability depends on the population size. The two situations $s_2 > s_1$ and $s_1 > s_2$ need different approaches to obtain good approximation results. The reason is that the development of the first beneficial allele A can be analysed independently of the development of the B allele in the situation $s_1 > s_2$, while it depends heavily on this development in the situation $s_2 > s_1$. Therefore the approximation steps of Barton and of Cuthbertson et al. are each appropriate in only one of the two situations.
In the last decades a new tool, which can be placed between practical real data and mathematical calculation, has entered the mainstream of population genetics. Simulation studies are model based and provide results where theoretical calculations are (so far) too challenging. Such studies have helped to gain further insight into evolutionary processes. We have already mentioned an early elementary simulation study by Hill and Robertson in 1966. Thanks to the technical advances in hardware and software, the simulation tools used today are much more mature and can simulate whole chromosomes over a multitude of generations. On the one hand, such simulations can be used for comparisons with real data to detect e.g. the demographic history of a population. On the other hand, they allow the detailed analysis of special evolutionary scenarios and their influence on genetic patterns. For a review of current simulation tools for selection and their features we refer to Bank et al. (2014). The competing selective sweep situation was treated in a large simulation study by Chevin et al. (2008), which revealed interesting outcomes. These authors analysed different statistics, like Tajima's D and the frequency spectrum, at the moment of fixation of the double beneficial mutant. They pointed out that these statistics can look very different compared to a single selective sweep scenario. Furthermore, they argue that the case $s_2 > s_1$ is more likely to be encountered in real data exhibiting fixation at both selected loci, since in such a situation a smaller recombination probability is sufficient for a chance of fixation. In this work the competing sweeps situation is treated comprehensively. In doing so, limit results for the fixation probability in the case $s_2 > s_1$ are given based on novel analytical methods. So basically a situation similar to that in Cuthbertson et al.
(2012) is studied, leading to comparable conclusions about the importance and influence of the selection and recombination coefficients. In contrast to their approach, however, we start with a diffusion model and use a combination of the ancestral selection graph and the ancestral recombination graph as a key tool for the proof of an analytical formula for the fixation probability of the double beneficial type AB. Besides these mathematical calculations, extensive simulations of the competing sweeps situation with different parameter choices for selection, recombination etc. were carried out.

These simulations are used to illustrate to what extent the limit results are suited for large finite populations. In addition, whole chromosomes of a model population were simulated, conditioned on the fixation of the double beneficial type, to analyse typical descriptive and inductive statistics at the moment of fixation. This study supplements the simulation results of Chevin et al. (2008) by linkage disequilibrium based statistics and up-to-date outlier tests for selection. The thesis is organized as follows: Chapter 2 introduces the well-known population genetics models (Moran model and Wright-Fisher model) in a competing sweeps situation and the corresponding diffusion system. In its second part heuristic arguments for possible evolutions are given based on these models. In the course of this analysis the results of Otto and Barton (1997) and Cuthbertson et al. (2012) are reviewed in more detail. Chapter 3 presents the mathematical limit result for the fixation probability of the double beneficial type based on the diffusion system. Here special starting conditions and parameter ratios are assumed in order to obtain non-trivial results. The results highlight the influence of the different parameters, like selection, recombination or population size, on the fixation probability. Furthermore, the graphical construction gives insight into the genealogy of the different types and the expected number of recombinations leading to the double beneficial type. Chapter 4 contains the simulations. First the simulations to check the validity of the limit result are presented. After this comparison some simulation extensions are described and discussed. In the second part the simulation software SLiM (Messer, 2013) is utilized for a whole chromosome study of the competing sweeps situation.
This simulation serves mainly to identify possible signatures of such a scenario and to examine whether standard statistics like Tajima's D are able to detect selection in such a scenario. Chapter 5 discusses both the mathematical and the simulation results. One important issue is whether the considered scenario is possible and conceivable in natural populations. The other important question concerns possible signatures of competing selective sweeps. To what extent do common statistics differ in a competing sweeps scenario compared to a classical selective sweep? How do the scores change in specific regions? Chapter 6 deals with further prospects and possible extensions of the considered scenario. For example, the analytical challenges of more general models are highlighted, potential add-ons to the simulation study are discussed, and possible applications to natural populations are debated.

Chapter 2 Introduction to competing selective sweeps

For a single beneficial mutant in a large diploid (otherwise neutral) population, the first (and still useful) approximation for the fixation probability by Haldane (1927) is more than 80 years old. The situation is completely different when selective sweeps are assumed to overlap. Here the interaction between the different selected types and their loci complicates the calculation of fixation probabilities. It therefore took until the 1990s for workable approximation results in such an overlap scenario to be published. Before we comment on these results, models for the competing sweeps situation and the underlying assumptions are presented. We will use particular settings of the models, which we specify in the following. The differences to the models used for previously published approximation results will be discussed in Section 5.1. In this thesis two of the most popular models in population genetics will be used, the Moran model and the Wright-Fisher model. We will assume a fixed population size N in both models. The main difference between these two models is that populations evolve in discrete time in the Wright-Fisher model and in continuous time in the Moran model. In the Wright-Fisher model each individual of the new generation picks a parent from the previous generation according to a certain probability distribution. Since the whole population is replaced in one step, changes in the population are fast, which explains the popular use of this model for simulations. On the other hand, the calculation of analytical results is often difficult in the Wright-Fisher model and easier to handle in a continuous context. In the Moran model reproduction events happen according to certain rates. At each event only one of two actions may happen: either one individual reproduces and replaces another one, or two individuals recombine and replace another one.
Under proper rescaling of the parameters both models converge for $N \to \infty$ to a similar limit process. We will formulate our main result for this limit diffusion. The finite population size models are introduced for several reasons. For instance, the reproduction process of the Moran model plays an important role in the proof of the main result. Hence, the Moran model serves as a visualization, which helps to understand the evolution of the population viewed both forward and backward in time. The presented Wright-Fisher model will be used for simulations to check the applicability of the limit results to large finite populations. Besides, the presentation of both models simplifies the comparison with other approximation results.
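
As an illustration of the single-sweep case mentioned at the beginning of this chapter: Haldane's approximation is commonly stated as a fixation probability of roughly $2s$ for a beneficial mutant with small selective advantage $s$. The following Python sketch recovers this value numerically under a branching-process assumption that is not spelled out in the text above, namely that each individual of the mutant lineage leaves a Poisson number of offspring with mean $1+s$, so that the survival probability $p$ solves $1 - p = \exp(-(1+s)p)$. The function name, the bracketing interval and the tolerance are choices made for this example only.

```python
import math


def survival_probability(s, tol=1e-12):
    """Positive root p of 1 - p = exp(-(1 + s) * p), found by bisection.

    Assumption of this sketch: Poisson offspring with mean 1 + s.
    """
    if s <= 0:
        # In this approximation a neutral or deleterious lineage dies out.
        return 0.0

    def f(p):
        # f(p) = 1 - exp(-(1 + s) * p) - p, using expm1 for accuracy near 0.
        return -math.expm1(-(1.0 + s) * p) - p

    # f is positive just right of 0 and negative at 1, so the positive
    # root lies in between; min(s, 0.5) is a convenient lower bracket.
    lo, hi = min(s, 0.5), 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Compare the numerical root with the first-order value 2s.
for s in (0.01, 0.05, 0.1):
    print(s, survival_probability(s), 2 * s)
```

For small $s$ the computed root stays close to $2s$, and the gap widens as $s$ grows, which is why $2s$ is only a first-order approximation.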

2.1 Models

In contrast to asexually reproducing populations, in sexually reproducing ones chromosomes are generally not passed down as intact units. Through meiosis every parent passes on one chromosome of its diploid chromosome pair. These two haploid chromosomes are fused to form a new diploid pair in the offspring. During this process recombination acts, which can have the result that the allelic types at two distinct loci on one chromosome are inherited one from each parent. In particular, individuals with completely new allele combinations can be formed, and beneficial (or adverse) alleles can be brought together on one chromosome through this mechanism. As is customary, this complex recombination process is integrated into the models in a simplified way (see e.g. Ewens, 2004). In our context, where only two loci matter, we only need to specify how likely it is that both loci are inherited from the same parent or one from each parent, respectively. Hence, when assuming that this probability is constant over time, only one recombination parameter $r$ is needed in the theoretical model. Moreover, it is common to consider a haploid model of size 2N instead of a diploid model of size N when mainly the development of allelic frequencies is analysed. This simplification is appropriate when a dominance coefficient of 1/2 is assumed for the beneficial alleles and large populations are under consideration. We will also make use of this conversion, since we will concentrate on this standard case throughout the whole thesis. The justification for this simplification is that the differences between the two models become negligible when the population size tends to infinity. More detailed mathematical arguments can be found in standard population genetics literature, like Durrett (2008). In order to keep the notation simple we will even replace the haploid population size 2N by N.
Consequently our models correspond to diploid models of size N/2, and we implicitly assume that N is even. In this context it should be noted that in the following we will speak, somewhat inaccurately, of N individuals in the haploid models, although this technically refers to only one part of the diploid genome of an individual. This imprecision shortens notation. Each individual in the (haploid) population carries an allele combination with either allele a or A and either allele b or B. Only this allele combination matters for the fitness of an individual. Here capital letters represent beneficial alleles and lower-case letters the wild-type alleles. It is assumed that individuals which possess both beneficial alleles also have a fitness advantage. So altogether we have to distinguish between four different types, the complete wild-type ab and the beneficial types Ab, aB, and AB. These beneficial types have selective advantages $s_{Ab}$, $s_{aB}$, and $s_{AB}$ with $s_{Ab}, s_{aB}, s_{AB} > 0$. To facilitate the differentiation we assign a number to every type: Ab is denoted by 1, aB by 2, and AB by 3. The wild-type has a special role and is labelled 0, since this type has no selective advantage ($s_{ab} = 0$). After this preparation the discrete Wright-Fisher model with two selected loci can be introduced.

Definition 2.1 (Wright-Fisher model). We consider a panmictic population of N (haploid) individuals, which evolves in discrete generations. The four possible types of the individuals are ab, Ab, aB and AB. The selective advantages are given by $s_{ab} = 0$ and $s_{Ab}, s_{aB}, s_{AB} > 0$. In order to obtain the (t+1)-st generation from the t-th, the following steps are performed.

Reproduction: Assume that in generation t, $n^t_{ab}$ individuals are of type ab, $n^t_{Ab}$ individuals are of type Ab, $n^t_{aB}$ individuals are of type aB and $n^t_{AB}$ individuals are of type AB. Then a parent of type ij with $i \in \{a,A\}$ and $j \in \{b,B\}$ is chosen with probability

$$\frac{n^t_{ij}\,(1+s_{ij})}{\sum_{i' \in \{a,A\},\; j' \in \{b,B\}} n^t_{i'j'}\,(1+s_{i'j'})}. \tag{2.1}$$

Recombination: With probability $1-r$ no recombination happens and the offspring is of the same type $ij$ as the parent. Otherwise (with probability $r$) a second individual of type $kl$ with $k \in \{a, A\}$ and $l \in \{b, B\}$ is chosen according to the probabilities of Eq. (2.1) and the offspring is generated by a combination of these two individuals. In this case the descendant is of type $il$ with probability 1/2 and of type $kj$ with probability 1/2.

Remark 2.2. i) There are various other possibilities to define the generational transition in the Wright-Fisher model. One example is to interchange the reproduction and the recombination step, whereby in the case of a recombination event the individuals are chosen according to the frequencies in generation $t$. As long as the selection coefficients and the recombination probability are small, this modification leads only to negligible differences in the evolution of the population (cf. Ewens, 2004). In particular, the differences vanish in the limit $N \to \infty$ under the assumptions that the product of recombination probability and population size, $r_N N$, and the products of the selection coefficients and population size, $N s^N_i$ for $i \in \{1,2,3\}$, converge to constants.
ii) We have deliberately not included a starting value in the definition. This topic will be discussed later.

In the following we will assume large population sizes $N$, small positive recombination probabilities of order $O(1/N)$ and strongly selected mutations. So random drift plays only a minor role and selection is the driving force for considerable frequency changes. The case of only weakly selected alleles was analysed e.g. by McVean and Charlesworth (2000) and Gillespie (2001). The special feature of the recombination step is that it can form one of the four types even if this type did not exist in the generation before. This is of course only possible if different alleles are available in the population.
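The two steps of Definition 2.1 can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the thesis: the function names, the bit encoding of the types (index 0 = ab, 1 = Ab, 2 = aB, 3 = AB, so that bit 0 stores the a-locus and bit 1 the b-locus) and the use of a seeded `random.Random` instance are assumptions made here for convenience.

```python
import random

TYPES = ("ab", "Ab", "aB", "AB")  # indices 0..3, matching the numbering in the text

def recombine(first, second, a_from_first):
    """Combine two type indices; bit 0 encodes the a-locus, bit 1 the b-locus."""
    if a_from_first:
        return (first & 1) | (second & 2)   # type il: a-locus of first, b-locus of second
    return (second & 1) | (first & 2)       # type kj: a-locus of second, b-locus of first

def wright_fisher_step(counts, s, r, rng):
    """Return the type counts of generation t+1 given those of generation t.

    counts: [n_ab, n_Ab, n_aB, n_AB], s: [0, s_Ab, s_aB, s_AB], r: recombination probability.
    """
    n = sum(counts)
    # Unnormalised parent weights; rng.choices normalises them as in Eq. (2.1).
    weights = [c * (1.0 + si) for c, si in zip(counts, s)]
    new_counts = [0, 0, 0, 0]
    for _ in range(n):
        child = rng.choices(range(4), weights=weights)[0]
        if rng.random() < r:  # recombination: draw a second parent
            mate = rng.choices(range(4), weights=weights)[0]
            child = recombine(child, mate, rng.random() < 0.5)
        new_counts[child] += 1
    return new_counts

if __name__ == "__main__":
    rng = random.Random(2024)  # hypothetical seed, for reproducibility only
    print(wright_fisher_step([90, 6, 4, 0], [0.0, 0.05, 0.1, 0.2], 0.01, rng))
```

With this encoding a recombination between Ab and aB yields AB or ab, each with probability 1/2, which is exactly the mechanism by which the initially absent type AB can arise.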
Once only one type is present in the population, meaning all $N$ individuals are of the same type, no further changes are possible in the following generations. We call this event fixation, and the first time at which only one type exists is called the fixation time. We are interested in the fixation probability of type AB under certain starting conditions. These starting conditions include that type AB is not present in the population at the beginning. Hence (at least) one recombination event is needed so that fixation of AB can occur. Furthermore, the fixation probability is only non-trivial as long as type AB has the highest fitness advantage. Otherwise AB is dominated by one of the other selected types and its fixation is extremely unlikely. So we assume that $s_{AB} > \max(s_{Ab}, s_{aB})$. Without loss of generality we also choose $\max(s_{Ab}, s_{aB}) = s_{aB}$. Thereby the fitness order of the different types is determined.

Next the Moran model is defined. In this model basic reproduction events and selective reproduction events are distinguished. Here the number coding of the selective advantages is used ($s_{Ab} = s_1$, $s_{aB} = s_2$ and $s_{AB} = s_3$).

Definition 2.3 (Moran model). We consider a haploid panmictic population of size $N$. Each individual is of one of the types ab, Ab, aB or AB. For $t \in \mathbb{R}_+$ denote by $n^t_{ab}$ the number of individuals of type ab, by $n^t_{Ab}$ the number of individuals of type Ab, by $n^t_{aB}$ the number of individuals of type aB, and by $n^t_{AB}$ the number of individuals of type AB at time $t$. The following events change the state of the process.

Reproduction: Every individual reproduces at rate 1/2. In such a case a second individual (it might be the same individual as the first) is randomly chosen from the population. This second individual dies and the first one splits into two.

Selective reproduction: Individuals of type Ab reproduce additionally at rate $s_1$, individuals of type aB reproduce additionally at rate $s_2$, and individuals of type AB additionally at rate $s_3$. Here again a second individual is chosen randomly and gets replaced.

Recombination: Every individual initiates a recombination event at rate $r$. Then a second individual is randomly chosen and replaced by a new individual. When the first individual is of type $ij$ and the second one of type $kl$ with $i,k \in \{a, A\}$ and $j,l \in \{b, B\}$, then the new individual is of type $il$ with probability 1/2 and of type $kj$ otherwise.

Such a model is best visualized by a graphical representation (see Fig. 2.1). Time is running from the bottom to the top. The unnumbered arrows denote basic reproduction events. Every individual sends such an arrow at rate 1/2. All lines (including the line itself) are equally likely to be chosen as the tip of this arrow. Then the individual at the tip is replaced by an offspring of the individual at the tail. The additional selective reproduction events can be integrated in the graph as follows. Every line sends selection arrows at rate $s_3$, where again the tip is placed randomly. Such an arrow gets the label 1 with probability $s_1/s_3$, the label 2 with probability $(s_2 - s_1)/s_3$ and the label 3 otherwise. Only individuals whose type number is equal to or higher than the label of the arrow can use such an arrow and place an offspring at its tip. In doing so, the selective reproduction rate of all individuals is as described in Def. 2.3: a type with number $k$ can use the arrows with labels $1, \dots, k$, which arrive at total rate $s_k$. In Fig. 2.1 the arrow labelled with 1 at the bottom (in the middle) cannot be used by type ab since this type has no fitness advantage, for which reason this arrow is dashed. The arrow labelled with 2, in contrast, can be used by type aB since its position in the fitness ranking is 2. The arrows labelled with a and b represent recombination events. Every line sends such a recombination arrow at rate $r$ and the tip is again chosen randomly.
Then the individual present on the line at the tip is replaced by an individual with an allele combination from the individuals at the tail and the tip. If the arrow is labelled with a (which happens with probability 1/2), then the new individual gets its a locus from the individual at the tail and its b locus from the individual at the tip. Hence in the given example the arrow labelled with a leads to an individual of type AB on the third line. If the arrow is labelled with b, then it is the other way round and the b locus comes from the individual at the tail. In keeping with these rules the types at the top can be identified when the types at the bottom are given.

Since the fixation probability is under examination, we are only interested in the number of individuals of the different types and not in the complex connection between the different types. For the given Moran model this development can be described by a (multidimensional) Markov jump process. Let $(N_{ij}(t))_{t \ge 0}$ be the number of lines of type $ij$, with $i \in \{a, A\}$ and $j \in \{b, B\}$, and let $r^+_{ij}$ and $r^-_{ij}$ be the rates at which type $ij$ increases and decreases by 1. Then, given $(N_{ab}(t), N_{Ab}(t), N_{aB}(t), N_{AB}(t)) = (n_{ab}, n_{Ab}, n_{aB}, n_{AB})$, the transition rates at time $t$ are given by

$$
\begin{aligned}
r^+_{ab} &= \frac{n_{ab}(N-n_{ab})}{2N} + \frac{r}{2N}\,(n_{ab}n_{Ab} + n_{ab}n_{aB} + 2n_{Ab}n_{aB}) \\
r^-_{ab} &= \frac{n_{ab}(N-n_{ab})}{2N} + \frac{s_1 n_{ab}n_{Ab}}{N} + \frac{s_2 n_{ab}n_{aB}}{N} + \frac{s_3 n_{ab}n_{AB}}{N} + \frac{r}{2N}\,n_{ab}(n_{Ab} + n_{aB} + 2n_{AB}) \\
r^+_{Ab} &= \frac{n_{Ab}(N-n_{Ab})}{2N} + \frac{s_1 n_{Ab}(N-n_{Ab})}{N} + \frac{r}{2N}\,(n_{Ab}n_{ab} + n_{Ab}n_{AB} + 2n_{AB}n_{ab}) \\
r^-_{Ab} &= \frac{n_{Ab}(N-n_{Ab})}{2N} + \frac{s_2 n_{Ab}n_{aB}}{N} + \frac{s_3 n_{Ab}n_{AB}}{N} + \frac{r}{2N}\,(n_{Ab}n_{ab} + n_{Ab}n_{AB} + 2n_{Ab}n_{aB}) \\
r^+_{aB} &= \frac{n_{aB}(N-n_{aB})}{2N} + \frac{s_2 n_{aB}(N-n_{aB})}{N} + \frac{r}{2N}\,(n_{aB}n_{ab} + n_{aB}n_{AB} + 2n_{AB}n_{ab}) \\
r^-_{aB} &= \frac{n_{aB}(N-n_{aB})}{2N} + \frac{s_1 n_{Ab}n_{aB}}{N} + \frac{s_3 n_{aB}n_{AB}}{N} + \frac{r}{2N}\,(n_{aB}n_{ab} + n_{aB}n_{AB} + 2n_{Ab}n_{aB}) \\
r^+_{AB} &= \frac{n_{AB}(N-n_{AB})}{2N} + \frac{s_3 n_{AB}(N-n_{AB})}{N} + \frac{r}{2N}\,(n_{aB}n_{AB} + n_{Ab}n_{AB} + 2n_{Ab}n_{aB}) \\
r^-_{AB} &= \frac{n_{AB}(N-n_{AB})}{2N} + \frac{s_1 n_{Ab}n_{AB}}{N} + \frac{s_2 n_{aB}n_{AB}}{N} + \frac{r}{2N}\,(n_{AB}n_{Ab} + n_{AB}n_{aB} + 2n_{ab}n_{AB}).
\end{aligned}
\tag{2.2}
$$

Figure 2.1: Graphical representation of the Moran model (time $t$ runs from bottom to top): The unnumbered arrows characterise resampling events, while the numbered arrows specify selection events. The arrows labelled with a or b refer to recombination events. Dashed arrows indicate that the type at the tail cannot use this arrow (because its fitness rank is too small).

Remark 2.4. i) In the Moran model a genome can have zero or two offspring (due to reproduction). Hence the Moran model can be viewed as a birth-death process. This feature makes this model analytically more tractable compared to the Wright-Fisher model, where a genome can theoretically have from 0 up to 2N descendants.
ii) In Cuthbertson et al. (2012) and Yu and Etheridge (2010) the dynamics of the system are slightly different. There the selective advantage is included in the standard resampling events via the probability of replacing another individual. Nevertheless both models lead to the same margins $r^+_{ij} - r^-_{ij}$, if one bears in mind that in Cuthbertson et al. (2012) and Yu and Etheridge (2010) the total population size is $2N$ instead of $N$.

As already mentioned above, after proper rescaling both models, the Wright-Fisher model and the Moran model, lead to the same limit process for $N \to \infty$. This property holds very generally and is explained in detail in many fundamental books about population genetics like Ethier and Kurtz (1986), Ewens (2004) or Durrett (2008). Since a case with both selection and recombination is treated rather rarely, the connection in this special case is illustrated by some calculations.
Here we will concentrate on the Moran model, where the verification is more illustrative.
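As a plausibility check, the transition rates of Eq. (2.2) can be written out as a small Python function. The following is a sketch under stated assumptions rather than an implementation from the thesis: the function name `moran_rates` and the argument layout are hypothetical, and the rates are derived directly from the three event types of Definition 2.3 (neutral reproduction at rate 1/2, selective reproduction at rates s_1, s_2, s_3, recombination at rate r).

```python
def moran_rates(n, s, r):
    """Up and down rates (r^+, r^-) of the four type counts, following Eq. (2.2).

    n = (n_ab, n_Ab, n_aB, n_AB), s = (s_1, s_2, s_3), r = recombination rate.
    Returns two lists indexed 0..3 in the order ab, Ab, aB, AB.
    """
    n0, n1, n2, n3 = n
    s1, s2, s3 = s
    N = n0 + n1 + n2 + n3
    # Neutral resampling contributes the same amount to r^+ and r^-.
    neutral = [k * (N - k) / (2 * N) for k in n]
    c = r / (2 * N)
    up = [
        neutral[0] + c * (n0 * n1 + n0 * n2 + 2 * n1 * n2),
        neutral[1] + s1 * n1 * (N - n1) / N + c * (n1 * n0 + n1 * n3 + 2 * n3 * n0),
        neutral[2] + s2 * n2 * (N - n2) / N + c * (n2 * n0 + n2 * n3 + 2 * n3 * n0),
        neutral[3] + s3 * n3 * (N - n3) / N + c * (n3 * n1 + n3 * n2 + 2 * n1 * n2),
    ]
    down = [
        neutral[0] + (s1 * n0 * n1 + s2 * n0 * n2 + s3 * n0 * n3) / N
        + c * n0 * (n1 + n2 + 2 * n3),
        neutral[1] + (s2 * n1 * n2 + s3 * n1 * n3) / N
        + c * (n1 * n0 + n1 * n3 + 2 * n1 * n2),
        neutral[2] + (s1 * n1 * n2 + s3 * n2 * n3) / N
        + c * (n2 * n0 + n2 * n3 + 2 * n1 * n2),
        neutral[3] + (s1 * n1 * n3 + s2 * n2 * n3) / N
        + c * (n3 * n1 + n3 * n2 + 2 * n0 * n3),
    ]
    return up, down

if __name__ == "__main__":
    up, down = moran_rates((90, 6, 4, 0), (0.05, 0.1, 0.2), 0.01)
    # Net rate of type AB when it is absent: only recombination between Ab and aB.
    print(up[3] - down[3])
```

Two properties are easy to check with this sketch: the sum of all up rates equals the sum of all down rates, because every event replaces exactly one individual by another, and for n_AB = 0 the net rate of type AB reduces to r n_Ab n_aB / N, the recombination term that reappears as ρ x_Ab x_aB in the diffusion limit.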

To formulate the limit result, we have to index the recombination and selection parameters with the population size $N$. Furthermore, we will consider the frequencies of the different types instead of the total numbers. For this purpose the process

$$
(Y^N_{ab}(t), Y^N_{Ab}(t), Y^N_{aB}(t), Y^N_{AB}(t)) := (N_{ab}(t), N_{Ab}(t), N_{aB}(t), N_{AB}(t))/N
$$

is defined. Besides the rescaling of space, a rescaling of time is also needed. Denote

$$
(X^N_{ab}(t), X^N_{Ab}(t), X^N_{aB}(t), X^N_{AB}(t)) := (Y^N_{ab}(tN), Y^N_{Ab}(tN), Y^N_{aB}(tN), Y^N_{AB}(tN)).
$$

Under the assumptions $\lim_{N\to\infty} s^N_i N = \alpha_i$ for $i \in \{1,2,3\}$ and $\lim_{N\to\infty} N r_N = \rho$ one arrives at a multidimensional diffusion for $N \to \infty$.

Proposition 2.5 (Convergence to diffusion). Let $(X^N_{ab}(t), X^N_{Ab}(t), X^N_{aB}(t), X^N_{AB}(t))$ be given (described by Eq. (2.2)), with $(X^N_{ab}(0), X^N_{Ab}(0), X^N_{aB}(0), X^N_{AB}(0)) \xrightarrow{N\to\infty} (x_{ab}, x_{Ab}, x_{aB}, x_{AB})$ in distribution and $\lim_{N\to\infty} N s^N_i = \alpha_i$ for $i \in \{1,2,3\}$ and $\lim_{N\to\infty} N r_N = \rho$. Then for $N \to \infty$ this system converges in distribution in $D(\mathbb{R}_+, [0,1]^4)$ with the Skorohod topology towards $(X_{ab}, X_{Ab}, X_{aB}, X_{AB})$, where $(X_{ab}, X_{Ab}, X_{aB}, X_{AB})$ is the solution of the stochastic differential equation

$$
\begin{aligned}
dX_{ab} &= [-\alpha_1 X_{ab}X_{Ab} - \alpha_2 X_{ab}X_{aB} - \alpha_3 X_{ab}X_{AB} + \rho(X_{Ab}X_{aB} - X_{ab}X_{AB})]\,dt \\
&\quad - \sqrt{X_{ab}X_{Ab}}\,dW_{01} - \sqrt{X_{ab}X_{aB}}\,dW_{02} - \sqrt{X_{ab}X_{AB}}\,dW_{03} \\
dX_{Ab} &= [\alpha_1 X_{Ab}(1-X_{Ab}) - \alpha_2 X_{Ab}X_{aB} - \alpha_3 X_{Ab}X_{AB} + \rho(X_{AB}X_{ab} - X_{Ab}X_{aB})]\,dt \\
&\quad - \sqrt{X_{Ab}X_{ab}}\,dW_{10} - \sqrt{X_{Ab}X_{aB}}\,dW_{12} - \sqrt{X_{Ab}X_{AB}}\,dW_{13} \\
dX_{aB} &= [\alpha_2 X_{aB}(1-X_{aB}) - \alpha_1 X_{Ab}X_{aB} - \alpha_3 X_{aB}X_{AB} + \rho(X_{AB}X_{ab} - X_{Ab}X_{aB})]\,dt \\
&\quad - \sqrt{X_{aB}X_{ab}}\,dW_{20} - \sqrt{X_{aB}X_{Ab}}\,dW_{21} - \sqrt{X_{aB}X_{AB}}\,dW_{23} \\
dX_{AB} &= [\alpha_3 X_{AB}(1-X_{AB}) - \alpha_1 X_{Ab}X_{AB} - \alpha_2 X_{aB}X_{AB} + \rho(X_{Ab}X_{aB} - X_{ab}X_{AB})]\,dt \\
&\quad - \sqrt{X_{AB}X_{ab}}\,dW_{30} - \sqrt{X_{AB}X_{Ab}}\,dW_{31} - \sqrt{X_{AB}X_{aB}}\,dW_{32},
\end{aligned}
\tag{2.3}
$$

where $(W_{kl})_{k>l}$ is a family of independent Brownian motions with $W_{kl} = -W_{lk}$ for $k \in \{1,2,3\}$ and $l \in \{0,1,2,3\}$, $X_{ab} + X_{Ab} + X_{aB} + X_{AB} = 1$, started in $(X_{ab}(0), X_{Ab}(0), X_{aB}(0), X_{AB}(0)) := x = (x_{ab}, x_{Ab}, x_{aB}, x_{AB})$.

Proof. The convergence of the starting value holds by assumption.
By classical results about convergence in distribution towards diffusion processes (Ethier and Kurtz, 1986), the proof relies on the computation of the infinitesimal parameters and their convergence. We will not present all calculations in detail and concentrate on some characteristic rates. We analyse the drift and the covariance given $(X^N_{ab}(t), X^N_{Ab}(t), X^N_{aB}(t), X^N_{AB}(t)) = (x_{ab}, x_{Ab}, x_{aB}, x_{AB})$. The drift calculation for type Ab is given by

$$
\begin{aligned}
\frac{d}{dt} E[X^N_{Ab}(t)] &= \frac{1}{N}\,(r^+_{Ab} - r^-_{Ab})\,N \\
&= s^N_1 N x_{Ab}(1-x_{Ab}) - s^N_2 N x_{Ab}x_{aB} - s^N_3 N x_{Ab}x_{AB} + r_N N\,(x_{AB}x_{ab} - x_{Ab}x_{aB}) \\
&\xrightarrow{N\to\infty} \alpha_1 x_{Ab}(1-x_{Ab}) - \alpha_2 x_{Ab}x_{aB} - \alpha_3 x_{Ab}x_{AB} + \rho\,(x_{AB}x_{ab} - x_{Ab}x_{aB}).
\end{aligned}
$$

For the covariance terms between the different types, the convergence follows for instance for the types ab and Ab since

$$
\frac{d}{dt} E[(X^N_{ab}(t) - x_{ab})(X^N_{Ab}(t) - x_{Ab})] = -\frac{1}{N^2}\left(N^2 x_{ab}x_{Ab} + s^N_1 N^2 x_{ab}x_{Ab} + r_N N^2\left(x_{Ab}x_{ab} + \frac{x_{Ab}x_{aB}}{2} + \frac{x_{AB}x_{ab}}{2}\right)\right).
$$

Because of $r_N = O(1/N)$ and $s^N_i = O(1/N)$ this leads to

$$
\frac{d}{dt} E[(X^N_{ab}(t) - x_{ab})(X^N_{Ab}(t) - x_{Ab})] = -x_{ab}x_{Ab} + O(1/N) \xrightarrow{N\to\infty} -x_{ab}x_{Ab}.
$$

The calculations for the other types and events are quite similar. Altogether the convergence follows using standard theory (see e.g. Karlin and Taylor, 1981, Chap. 15).

Remark 2.6. The drift terms of the diffusion system (2.3) consist of two parts, a selection part (composed of the terms with an $\alpha_i$ component) and a recombination part (composed of the terms with a $\rho$ component). The direction of the selection part of a type depends on the configuration of the whole system. For example, conditioned on $X_{aB} = X_{AB} = 0$ the selection drift of $X_{Ab}$ is given by $\alpha_1 x_{Ab}(1 - x_{Ab})$ and is strictly positive for $x_{Ab} \in (0,1)$, while for high values of $X_{AB}$ the direction changes because of $\alpha_3 > \alpha_1$. The recombination drift shows a different dynamic, with its highest positive values when the considered type does not exist. If for example $X_{AB} = 0$, then the recombination drift of type AB is given by $\rho\, x_{Ab}x_{aB}$. This rate represents recombination events between type Ab and aB, leading to an increase of type AB, and therefore plays an important role in the analysis of the fixation probability of type AB. For example, in the extreme case $\rho = 0$, a situation without recombination, there is no chance for type AB to escape from 0.

Naturally many features of the Moran model transfer to the diffusion system. When only one type is present in the population, then no further changes happen afterwards. Formally expressed, when $\max(X_{ab}(t), X_{Ab}(t), X_{aB}(t), X_{AB}(t)) = 1$, then we get for all times $s \ge t \ge 0$

$$
(X_{ab}(s), X_{Ab}(s), X_{aB}(s), X_{AB}(s)) = (X_{ab}(t), X_{Ab}(t), X_{aB}(t), X_{AB}(t)).
$$

So the process stays in one of the four states $(1,0,0,0)$, $(0,1,0,0)$, $(0,0,1,0)$, or $(0,0,0,1)$, respectively. We are interested in cases with fixation of type AB and want to calculate their likelihood.
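The decomposition of the drift described in Remark 2.6 can be made concrete with a short Python sketch. It is meant only as an illustration under assumptions: the function name `drift` is hypothetical, and the selection part is written in the algebraically equivalent form x_k (α_k − ᾱ) with ᾱ = α_1 x_Ab + α_2 x_aB + α_3 x_AB and α_0 = 0, which is a rewriting chosen here and not notation used in the text.

```python
def drift(x, alpha, rho):
    """Selection and recombination parts of the drift in system (2.3).

    x = (x_ab, x_Ab, x_aB, x_AB) are type frequencies, alpha = (alpha_1, alpha_2, alpha_3)
    are the scaled selection coefficients and rho is the scaled recombination rate.
    """
    x0, x1, x2, x3 = x
    a = (0.0,) + tuple(alpha)          # the wild-type ab has no selective advantage
    mean_fitness = sum(ak * xk for ak, xk in zip(a, x))
    # Selection part: x_k * (alpha_k - mean), e.g. for Ab this equals
    # alpha_1 x_Ab (1 - x_Ab) - alpha_2 x_Ab x_aB - alpha_3 x_Ab x_AB.
    selection = [xk * (ak - mean_fitness) for ak, xk in zip(a, x)]
    # Recombination part: +rho*d for ab and AB, -rho*d for Ab and aB.
    d = x1 * x2 - x0 * x3
    recombination = [rho * d, -rho * d, -rho * d, rho * d]
    return selection, recombination

if __name__ == "__main__":
    sel, rec = drift((0.5, 0.3, 0.2, 0.0), (10.0, 20.0, 30.0), 2.0)
    print(sel, rec)
```

In this form the two observations of the remark are visible directly: for x_aB = x_AB = 0 the selection drift of Ab is α_1 x_Ab (1 − x_Ab), and for x_AB = 0 the recombination drift of AB is ρ x_Ab x_aB. Both parts sum to zero over the four types whenever the frequencies sum to one.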
As explained, in such a situation it holds that $X_{AB}(\infty) = 1$. The probability of reaching a certain fixation state depends heavily on the starting situation of the system. Before this topic is discussed in detail, we note that the convergence result of Proposition 2.5 carries over to the fixation probabilities.

Corollary 2.7. The probability of fixation of type $ij$ with $i \in \{a, A\}$ and $j \in \{b, B\}$ in the rescaled Moran model converges for $N \to \infty$ to the fixation probability of this type in the diffusion system (2.3), when the starting value $X^N_0$ of the Moran model converges in distribution to the starting value $x$ of the diffusion. As a formula, using the notation $p^N_{ij}(\infty)$ for the event of fixation of type $ij$ in the Moran model,

$$
\lim_{N\to\infty} P_{X^N_0}\big(p^N_{ij}(\infty)\big) = P_x\big(X_{ij}(\infty) = 1\big).
$$

Proof. Since the evolution of the Moran model converges in distribution to the diffusion process, according to Prop. 2.5, and the starting value of the Moran model converges, the convergence of the fixation probability is straightforward (see e.g. Ethier and Kurtz, 1986, Chap. 10, Cor. 2.7).

At this point we go back to the biological situation that shall be analysed with the presented models. This step back is taken to clarify the choice of the starting conditions. We want to understand the interaction of two strongly beneficial, partly linked alleles in a sexually reproducing population. These beneficial alleles appear by single mutation events. The case of recurrent beneficial mutation events is not treated here. We are interested in scenarios

where the beneficial alleles are not connected at first.¹ This means they appear on different backgrounds at the beginning and only recombination can bring them together. When we neglect the biologically unrealistic case that the two mutations happen in exactly the same generation, then there is a time interval in which only one beneficial type is present. During this phase the evolution is comparable with the evolution in a single sweep case. The beneficial type can survive or die out by chance. The latter is not interesting and we concentrate on survival situations. Furthermore, we are only interested in situations with large selection coefficients. At some time point the second beneficial allele appears on a wild-type and the interesting part of the evolution starts. We want to start our analysis at this moment and try to calculate the fixation probability of both beneficial alleles. Hence in a Moran model the frequency of the second beneficial allele is $1/N$ at the beginning. Since this starting situation translates to a starting frequency of $\delta \to 0$ of type aB in the diffusion limit, the considered fixation probability has to be multiplied by a compensation term to get a proper limit result. This problem as well as other required technical assumptions will be treated later, before the main result about the fixation probability is calculated.

Here one further biologically reasonable property of the starting situation is discussed. If we treat the arrival time of the second beneficial mutation as uniformly distributed over the time course of the sweep of the first mutation, we can assume that the second mutation happens while the frequency of the first is below $\epsilon$ or above $1-\epsilon$. The phase of a selective sweep between $\epsilon$ and $1-\epsilon$ is very short (for large selection coefficients) and can therefore be neglected. When the frequency of the first mutant is above $1-\epsilon$, the probability that the second falls on the wild-type is smaller than $\epsilon$.
Hence this case is negligible as well. Therefore it is reasonable to assume that the frequency of the first mutant is below $\epsilon$ when the second beneficial allele appears.

In the next section heuristic arguments are presented to get an intuition for the impact of the different parameters and for the possible scenarios. For that purpose the stochastic differential system (2.3) is analysed based on the introduced starting conditions, assuming that it describes the evolution of a large panmictic population of size $N$. In doing so we also comment on the published approximation results in the different situations.

2.2 Heuristics on the event of fixation

We start this section by summing up the properties and orders of magnitude of the different parameters presented in detail above. The selection coefficients of the different types are ordered by $s_{AB} > s_{aB} > s_{Ab} > s_{ab} = 0$. The recombination probability $r$ is rather small, so that there is a mid-size constant $G$ which bounds the product $rN < G$. In this section we use the stochastic differential equation system (2.3) somewhat imprecisely and assume that it describes the evolution of a large panmictic population of size $N$. Combining the large population size with the precondition of strongly selected beneficial alleles leads to large $\alpha_i$ coefficients in Eq. (2.3). So once a type reaches a certain frequency, its evolution is dominated by the deterministic tendency according to Eq. (2.3) and the stochastic effects play only a minor role. As described in Remark 2.6, this tendency of the different types depends on the configuration of the whole system. Due to these parameter assumptions the evolution can be split into different phases with different possible outcomes, and it ends with the fixation of one type.

¹ The other case, when the second beneficial mutation happens on an individual which already carries the first beneficial allele, can be handled using classical results about sweeps. Only the selection coefficients of the different types are needed.


More information

The Evolution of Sex Chromosomes through the. Baldwin Effect

The Evolution of Sex Chromosomes through the. Baldwin Effect The Evolution of Sex Chromosomes through the Baldwin Effect Larry Bull Computer Science Research Centre Department of Computer Science & Creative Technologies University of the West of England, Bristol

More information

Chapter 6 Linkage Disequilibrium & Gene Mapping (Recombination)

Chapter 6 Linkage Disequilibrium & Gene Mapping (Recombination) 12/5/14 Chapter 6 Linkage Disequilibrium & Gene Mapping (Recombination) Linkage Disequilibrium Genealogical Interpretation of LD Association Mapping 1 Linkage and Recombination v linkage equilibrium ²

More information

4. Identify one bird that would most likely compete for food with the large tree finch. Support your answer. [1]

4. Identify one bird that would most likely compete for food with the large tree finch. Support your answer. [1] Name: Topic 5B 1. A hawk has a genetic trait that gives it much better eyesight than other hawks of the same species in the same area. Explain how this could lead to evolutionary change within this species

More information

Genetic Variation in Finite Populations

Genetic Variation in Finite Populations Genetic Variation in Finite Populations The amount of genetic variation found in a population is influenced by two opposing forces: mutation and genetic drift. 1 Mutation tends to increase variation. 2

More information

Chapter 13 Meiosis and Sexual Reproduction

Chapter 13 Meiosis and Sexual Reproduction Biology 110 Sec. 11 J. Greg Doheny Chapter 13 Meiosis and Sexual Reproduction Quiz Questions: 1. What word do you use to describe a chromosome or gene allele that we inherit from our Mother? From our Father?

More information

First go to

First go to Name Date Block Evolution Webquest Directions: In this webquest you will be exploring evolution and the mechanisms that drive evolution. You will use three websites to answer the following questions and

More information

Ch. 13 Meiosis & Sexual Life Cycles

Ch. 13 Meiosis & Sexual Life Cycles Introduction Ch. 13 Meiosis & Sexual Life Cycles 2004-05 Living organisms are distinguished by their ability to reproduce their own kind. -Offspring resemble their parents more than they do less closely

More information

arxiv: v2 [q-bio.pe] 26 May 2011

arxiv: v2 [q-bio.pe] 26 May 2011 The Structure of Genealogies in the Presence of Purifying Selection: A Fitness-Class Coalescent arxiv:1010.2479v2 [q-bio.pe] 26 May 2011 Aleksandra M. Walczak 1,, Lauren E. Nicolaisen 2,, Joshua B. Plotkin

More information

Selection 10: Theory of Natural Selection

Selection 10: Theory of Natural Selection Selection 10: Theory of Natural Selection Darwin began his voyage thinking that species could not change His experience during the five-year journey altered his thinking Variation of similar species among

More information

Stationary Distribution of the Linkage Disequilibrium Coefficient r 2

Stationary Distribution of the Linkage Disequilibrium Coefficient r 2 Stationary Distribution of the Linkage Disequilibrium Coefficient r 2 Wei Zhang, Jing Liu, Rachel Fewster and Jesse Goodman Department of Statistics, The University of Auckland December 1, 2015 Overview

More information

Q1) Explain how background selection and genetic hitchhiking could explain the positive correlation between genetic diversity and recombination rate.

Q1) Explain how background selection and genetic hitchhiking could explain the positive correlation between genetic diversity and recombination rate. OEB 242 Exam Practice Problems Answer Key Q1) Explain how background selection and genetic hitchhiking could explain the positive correlation between genetic diversity and recombination rate. First, recall

More information

Frequency Spectra and Inference in Population Genetics

Frequency Spectra and Inference in Population Genetics Frequency Spectra and Inference in Population Genetics Although coalescent models have come to play a central role in population genetics, there are some situations where genealogies may not lead to efficient

More information

I. Short Answer Questions DO ALL QUESTIONS

I. Short Answer Questions DO ALL QUESTIONS EVOLUTION 313 FINAL EXAM Part 1 Saturday, 7 May 2005 page 1 I. Short Answer Questions DO ALL QUESTIONS SAQ #1. Please state and BRIEFLY explain the major objectives of this course in evolution. Recall

More information

(Write your name on every page. One point will be deducted for every page without your name!)

(Write your name on every page. One point will be deducted for every page without your name!) POPULATION GENETICS AND MICROEVOLUTIONARY THEORY FINAL EXAMINATION (Write your name on every page. One point will be deducted for every page without your name!) 1. Briefly define (5 points each): a) Average

More information

GENE genealogies under neutral evolution are commonly

GENE genealogies under neutral evolution are commonly Copyright Ó 2008 by the Genetics Society of America DOI: 10.1534/genetics.107.076018 An Accurate Model for Genetic Hitchhiking Anders Eriksson,* Pontus Fernström, Bernhard Mehlig,1 and Serik Sagitov *Department

More information

Evolutionary Genetics Midterm 2008

Evolutionary Genetics Midterm 2008 Student # Signature The Rules: (1) Before you start, make sure you ve got all six pages of the exam, and write your name legibly on each page. P1: /10 P2: /10 P3: /12 P4: /18 P5: /23 P6: /12 TOT: /85 (2)

More information

Evolution and the Genetics of Structured populations. Charles Goodnight Department of Biology University of Vermont

Evolution and the Genetics of Structured populations. Charles Goodnight Department of Biology University of Vermont Evolution and the Genetics of Structured populations Charles Goodnight Department of Biology University of Vermont Outline What is Evolution Evolution and the Reductionist Approach Fisher/Wright Controversy

More information

MS-LS3-1 Heredity: Inheritance and Variation of Traits

MS-LS3-1 Heredity: Inheritance and Variation of Traits MS-LS3-1 Heredity: Inheritance and Variation of Traits MS-LS3-1. Develop and use a model to describe why structural changes to genes (mutations) located on chromosomes may affect proteins and may result

More information

DARWIN: WHICH MATHEMATICS?

DARWIN: WHICH MATHEMATICS? 200 ANNI DI DARWIN Facoltà di Scienze Matemtiche Fisiche e Naturali Università del Salento 12 Febbraio 2009 DARWIN: WHICH MATHEMATICS? Deborah Lacitignola Department of Mathematics University of Salento,,

More information

Evolutionary change. Evolution and Diversity. Two British naturalists, one revolutionary idea. Darwin observed organisms in many environments

Evolutionary change. Evolution and Diversity. Two British naturalists, one revolutionary idea. Darwin observed organisms in many environments Evolutionary change Evolution and Diversity Ch 13 How populations evolve Organisms change over time In baby steps Species (including humans) are descended from other species Two British naturalists, one

More information

Evolution & Natural Selection

Evolution & Natural Selection Evolution & Natural Selection Learning Objectives Know what biological evolution is and understand the driving force behind biological evolution. know the major mechanisms that change allele frequencies

More information

Classical Selection, Balancing Selection, and Neutral Mutations

Classical Selection, Balancing Selection, and Neutral Mutations Classical Selection, Balancing Selection, and Neutral Mutations Classical Selection Perspective of the Fate of Mutations All mutations are EITHER beneficial or deleterious o Beneficial mutations are selected

More information

Evolution of Populations. Chapter 17

Evolution of Populations. Chapter 17 Evolution of Populations Chapter 17 17.1 Genes and Variation i. Introduction: Remember from previous units. Genes- Units of Heredity Variation- Genetic differences among individuals in a population. New

More information

Mathematical Population Genetics II

Mathematical Population Genetics II Mathematical Population Genetics II Lecture Notes Joachim Hermisson March 20, 2015 University of Vienna Mathematics Department Oskar-Morgenstern-Platz 1 1090 Vienna, Austria Copyright (c) 2013/14/15 Joachim

More information

EvolutionIntro.notebook. May 13, Do Now LE 1: Copy Now. May 13 12:28 PM. Apr 21 6:33 AM. May 13 7:22 AM. May 13 7:00 AM.

EvolutionIntro.notebook. May 13, Do Now LE 1: Copy Now. May 13 12:28 PM. Apr 21 6:33 AM. May 13 7:22 AM. May 13 7:00 AM. Different interpretations of cetacean evolutionary history 4/19/10 Aim: What is Evolution by Natural Selection Do Now: How do we know all life on earth is related? Homework Read pp. 375 379 p. 379 # 1,2,3

More information

Surfing genes. On the fate of neutral mutations in a spreading population

Surfing genes. On the fate of neutral mutations in a spreading population Surfing genes On the fate of neutral mutations in a spreading population Oskar Hallatschek David Nelson Harvard University ohallats@physics.harvard.edu Genetic impact of range expansions Population expansions

More information

Introduction to Digital Evolution Handout Answers

Introduction to Digital Evolution Handout Answers Introduction to Digital Evolution Handout Answers Note to teacher: The questions in this handout and the suggested answers (in red, below) are meant to guide discussion, not be an assessment. It is recommended

More information

The neutral theory of molecular evolution

The neutral theory of molecular evolution The neutral theory of molecular evolution Introduction I didn t make a big deal of it in what we just went over, but in deriving the Jukes-Cantor equation I used the phrase substitution rate instead of

More information

The Genetics of Natural Selection

The Genetics of Natural Selection The Genetics of Natural Selection Introduction So far in this course, we ve focused on describing the pattern of variation within and among populations. We ve talked about inbreeding, which causes genotype

More information

Challenges when applying stochastic models to reconstruct the demographic history of populations.

Challenges when applying stochastic models to reconstruct the demographic history of populations. Challenges when applying stochastic models to reconstruct the demographic history of populations. Willy Rodríguez Institut de Mathématiques de Toulouse October 11, 2017 Outline 1 Introduction 2 Inverse

More information

NOTES Ch 17: Genes and. Variation

NOTES Ch 17: Genes and. Variation NOTES Ch 17: Genes and Vocabulary Fitness Genetic Drift Punctuated Equilibrium Gene flow Adaptive radiation Divergent evolution Convergent evolution Gradualism Variation 17.1 Genes & Variation Darwin developed

More information

Structures and Functions of Living Organisms (LS1)

Structures and Functions of Living Organisms (LS1) EALR 4: Big Idea: Core Content: Life Science Structures and Functions of Living Organisms (LS1) Processes Within Cells In prior grades students learned that all living systems are composed of cells which

More information

Evolutionary dynamics on graphs

Evolutionary dynamics on graphs Evolutionary dynamics on graphs Laura Hindersin May 4th 2015 Max-Planck-Institut für Evolutionsbiologie, Plön Evolutionary dynamics Main ingredients: Fitness: The ability to survive and reproduce. Selection

More information

Chapter 22: Descent with Modification: A Darwinian View of Life

Chapter 22: Descent with Modification: A Darwinian View of Life Chapter 22: Descent with Modification Name Period Chapter 22: Descent with Modification: A Darwinian View of Life As you study this chapter, read several paragraphs at a time to catch the flow of ideas

More information

Darwin s Theory of Evolution. The Puzzle of Life s Diversity

Darwin s Theory of Evolution. The Puzzle of Life s Diversity Darwin s Theory of Evolution The Puzzle of Life s Diversity Evolutionary Theory A scientific explanation that can illustrate the diversity of life on Earth Theory A well-supported, testable explanation

More information

Formalizing the gene centered view of evolution

Formalizing the gene centered view of evolution Chapter 1 Formalizing the gene centered view of evolution Yaneer Bar-Yam and Hiroki Sayama New England Complex Systems Institute 24 Mt. Auburn St., Cambridge, MA 02138, USA yaneer@necsi.org / sayama@necsi.org

More information

The Wright Fisher Controversy. Charles Goodnight Department of Biology University of Vermont

The Wright Fisher Controversy. Charles Goodnight Department of Biology University of Vermont The Wright Fisher Controversy Charles Goodnight Department of Biology University of Vermont Outline Evolution and the Reductionist Approach Adding complexity to Evolution Implications Williams Principle

More information

Mathematical modelling of Population Genetics: Daniel Bichener

Mathematical modelling of Population Genetics: Daniel Bichener Mathematical modelling of Population Genetics: Daniel Bichener Contents 1 Introduction 3 2 Haploid Genetics 4 2.1 Allele Frequencies......................... 4 2.2 Natural Selection in Discrete Time...............

More information

The Structure of Genealogies in the Presence of Purifying Selection: a "Fitness-Class Coalescent"

The Structure of Genealogies in the Presence of Purifying Selection: a Fitness-Class Coalescent The Structure of Genealogies in the Presence of Purifying Selection: a "Fitness-Class Coalescent" The Harvard community has made this article openly available. Please share how this access benefits you.

More information

Neutral Theory of Molecular Evolution

Neutral Theory of Molecular Evolution Neutral Theory of Molecular Evolution Kimura Nature (968) 7:64-66 King and Jukes Science (969) 64:788-798 (Non-Darwinian Evolution) Neutral Theory of Molecular Evolution Describes the source of variation

More information

Family Trees for all grades. Learning Objectives. Materials, Resources, and Preparation

Family Trees for all grades. Learning Objectives. Materials, Resources, and Preparation page 2 Page 2 2 Introduction Family Trees for all grades Goals Discover Darwin all over Pittsburgh in 2009 with Darwin 2009: Exploration is Never Extinct. Lesson plans, including this one, are available

More information

EVOLUTION change in populations over time

EVOLUTION change in populations over time EVOLUTION change in populations over time HISTORY ideas that shaped the current theory James Hutton & Charles Lyell proposes that Earth is shaped by geological forces that took place over extremely long

More information

Science Unit Learning Summary

Science Unit Learning Summary Learning Summary Inheritance, variation and evolution Content Sexual and asexual reproduction. Meiosis leads to non-identical cells being formed while mitosis leads to identical cells being formed. In

More information

Linear Regression (1/1/17)

Linear Regression (1/1/17) STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression

More information

p(d g A,g B )p(g B ), g B

p(d g A,g B )p(g B ), g B Supplementary Note Marginal effects for two-locus models Here we derive the marginal effect size of the three models given in Figure 1 of the main text. For each model we assume the two loci (A and B)

More information

URN MODELS: the Ewens Sampling Lemma

URN MODELS: the Ewens Sampling Lemma Department of Computer Science Brown University, Providence sorin@cs.brown.edu October 3, 2014 1 2 3 4 Mutation Mutation: typical values for parameters Equilibrium Probability of fixation 5 6 Ewens Sampling

More information

EVOLUTION. HISTORY: Ideas that shaped the current evolutionary theory. Evolution change in populations over time.

EVOLUTION. HISTORY: Ideas that shaped the current evolutionary theory. Evolution change in populations over time. EVOLUTION HISTORY: Ideas that shaped the current evolutionary theory. Evolution change in populations over time. James Hutton & Charles Lyell proposes that Earth is shaped by geological forces that took

More information

98 Washington State K-12 Science Learning Standards Version 1.2

98 Washington State K-12 Science Learning Standards Version 1.2 EALR 4: Big Idea: Core Content: Life Science Structures and Functions of Living Organisms (LS1) Processes Within Cells In prior grades students learned that all living systems are composed of cells which

More information

Reproduction- passing genetic information to the next generation

Reproduction- passing genetic information to the next generation 166 166 Essential Question: How has biological evolution led to the diversity of life? B-5 Natural Selection Traits that make an organism more or less likely to survive in an environment and reproduce

More information

The Mechanisms of Evolution

The Mechanisms of Evolution The Mechanisms of Evolution Figure.1 Darwin and the Voyage of the Beagle (Part 1) 2/8/2006 Dr. Michod Intro Biology 182 (PP 3) 4 The Mechanisms of Evolution Charles Darwin s Theory of Evolution Genetic

More information

Interest Grabber. Analyzing Inheritance

Interest Grabber. Analyzing Inheritance Interest Grabber Section 11-1 Analyzing Inheritance Offspring resemble their parents. Offspring inherit genes for characteristics from their parents. To learn about inheritance, scientists have experimented

More information

Family Trees for all grades. Learning Objectives. Materials, Resources, and Preparation

Family Trees for all grades. Learning Objectives. Materials, Resources, and Preparation page 2 Page 2 2 Introduction Family Trees for all grades Goals Discover Darwin all over Pittsburgh in 2009 with Darwin 2009: Exploration is Never Extinct. Lesson plans, including this one, are available

More information

Modelling populations under fluctuating selection

Modelling populations under fluctuating selection Modelling populations under fluctuating selection Alison Etheridge With Aleksander Klimek (Oxford) and Niloy Biswas (Harvard) The simplest imaginable model of inheritance A population of fixed size, N,

More information

Runaway. demogenetic model for sexual selection. Louise Chevalier. Jacques Labonne

Runaway. demogenetic model for sexual selection. Louise Chevalier. Jacques Labonne Runaway demogenetic model for sexual selection Louise Chevalier Master 2 thesis UPMC, Specialization Oceanography and Marine Environments Jacques Labonne UMR Ecobiop INRA - National Institute for Agronomic

More information

9 Genetic diversity and adaptation Support. AQA Biology. Genetic diversity and adaptation. Specification reference. Learning objectives.

9 Genetic diversity and adaptation Support. AQA Biology. Genetic diversity and adaptation. Specification reference. Learning objectives. Genetic diversity and adaptation Specification reference 3.4.3 3.4.4 Learning objectives After completing this worksheet you should be able to: understand how meiosis produces haploid gametes know how

More information

EVOLUTION change in populations over time

EVOLUTION change in populations over time EVOLUTION change in populations over time HISTORY ideas that shaped the current theory James Hutton (1785) proposes that Earth is shaped by geological forces that took place over extremely long periods

More information

Topic 7: Evolution. 1. The graph below represents the populations of two different species in an ecosystem over a period of several years.

Topic 7: Evolution. 1. The graph below represents the populations of two different species in an ecosystem over a period of several years. 1. The graph below represents the populations of two different species in an ecosystem over a period of several years. Which statement is a possible explanation for the changes shown? (1) Species A is

More information