Formalizing the gene centered view of evolution

Similar documents
arxiv:physics/ v1 [physics.bio-ph] 8 Feb 2000 Formalizing the gene centered view of evolution

Spontaneous pattern formation and genetic invasion in locally mating and competing populations

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics.

Reproduction- passing genetic information to the next generation

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

Processes of Evolution

Evolution. Just a few points

The theory of evolution continues to be refined as scientists learn new information.

The Evolution of Gene Dominance through the. Baldwin Effect

overproduction variation adaptation Natural Selection speciation adaptation Natural Selection speciation

Evolution by Natural Selection

Chapter 17: Population Genetics and Speciation

STUDY GUIDE SECTION 16-1 Genetic Equilibrium

Evolution (Chapters 15 & 16)

Heaving Toward Speciation

The Wright Fisher Controversy. Charles Goodnight Department of Biology University of Vermont

Darwinian Selection. Chapter 6 Natural Selection Basics 3/25/13. v evolution vs. natural selection? v evolution. v natural selection

Big Idea #1: The process of evolution drives the diversity and unity of life

Chapter 2 Section 1 discussed the effect of the environment on the phenotype of individuals light, population ratio, type of soil, temperature )

Natural selection, Game Theory and Genetic Diversity

Population Genetics & Evolution

Mechanisms of Evolution Microevolution. Key Concepts. Population Genetics

Natural Selection: Genetics of Families and Populations

Chapter 5 Evolution of Biodiversity

Ch. 16 Evolution of Populations

Chapter 16. Table of Contents. Section 1 Genetic Equilibrium. Section 2 Disruption of Genetic Equilibrium. Section 3 Formation of Species

Reproduction and Evolution Practice Exam

Text of objective. Investigate and describe the structure and functions of cells including: Cell organelles

Name Date Class CHAPTER 15. In your textbook, read about developing the theory of natural selection. For each statement below, write true or false.

Evolution Unit: What is Evolution?

The Mechanisms of Evolution

NOTES Ch 17: Genes and. Variation

1. T/F: Genetic variation leads to evolution. 2. What is genetic equilibrium? 3. What is speciation? How does it occur?

EVOLUTION change in populations over time

A Simple Haploid-Diploid Evolutionary Algorithm

Ohio Tutorials are designed specifically for the Ohio Learning Standards to prepare students for the Ohio State Tests and end-ofcourse

Theory a well supported testable explanation of phenomenon occurring in the natural world.

Quantitative Genetics & Evolutionary Genetics

EVOLUTION change in populations over time

UNIT V. Chapter 11 Evolution of Populations. Pre-AP Biology

Name Date Class. Patterns of Evolution

EVOLUTION UNIT. 3. Unlike his predecessors, Darwin proposed a mechanism by which evolution could occur called.

Genes and DNA. 1) Natural Selection. 2) Mutations. Darwin knew this

5/31/17. Week 10; Monday MEMORIAL DAY NO CLASS. Page 88

UON, CAS, DBSC, General Biology II (BIOL102) Dr. Mustafa. A. Mansi. The Origin of Species

Name Class Date. KEY CONCEPT Gametes have half the number of chromosomes that body cells have.

Microevolutionary changes show us how populations change over time. When do we know that distinctly new species have evolved?

Linking levels of selection with genetic modifiers

Speciation and Patterns of Evolution

How robust are the predictions of the W-F Model?

Study of similarities and differences in body plans of major groups Puzzling patterns:

Concepts of Evolution

The concept of adaptive phenotypic polymorphism in evolutionary theory

NOTES CH 17 Evolution of. Populations

Mutation, Selection, Gene Flow, Genetic Drift, and Nonrandom Mating Results in Evolution

Evolution. Chapters 16 & 17

Biology II : Embedded Inquiry

Exercise 3 Exploring Fitness and Population Change under Selection

The Evolution of Sex Chromosomes through the. Baldwin Effect

Curriculum Map. Biology, Quarter 1 Big Ideas: From Molecules to Organisms: Structures and Processes (BIO1.LS1)

Introduction to Digital Evolution Handout Answers

Evolutionary quantitative genetics and one-locus population genetics

Evolution and the Genetics of Structured populations. Charles Goodnight Department of Biology University of Vermont

Evolution of Populations. Chapter 17

Evolution and Natural Selection (16-18)

Evolution AP Biology

The Origin of Species

MS-LS3-1 Heredity: Inheritance and Variation of Traits

Evolution - Unifying Theme of Biology Microevolution Chapters 13 &14

Introduction to population genetics & evolution

The Origin of Species

AP Biology Notes Outline Enduring Understanding 1.C. Big Idea 1: The process of evolution drives the diversity and unity of life.

Chapter 6 Meiosis and Mendel

EVOLUTION. HISTORY: Ideas that shaped the current evolutionary theory. Evolution change in populations over time.

Evolution Test Review

I. Short Answer Questions DO ALL QUESTIONS

Statistical Models in Evolutionary Biology An Introductory Discussion

UNIT 8 BIOLOGY: Meiosis and Heredity Page 148

Chapter 5. Evolution of Biodiversity

Conceptually, we define species as evolutionary units :

It all depends on barriers that prevent members of two species from producing viable, fertile hybrids.

A A A A B B1

Essential knowledge 1.A.2: Natural selection

Environmental Influences on Adaptation

Enduring Understanding: Change in the genetic makeup of a population over time is evolution Pearson Education, Inc.

Slide 1. Slide 2. Slide 3. Concepts of Evolution. Isn t Evolution Just A Theory? Evolution

A. Incorrect! Form is a characteristic used in the morphological species concept.

CH 16: Evolution of Population

Big Idea 1: The process of evolution drives the diversity and unity of life.

Lecture WS Evolutionary Genetics Part I 1

Biology Eighth Edition Neil Campbell and Jane Reece

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

AP Curriculum Framework with Learning Objectives

Biology 11 UNIT 1: EVOLUTION LESSON 2: HOW EVOLUTION?? (MICRO-EVOLUTION AND POPULATIONS)

e.g. population: 500, two alleles: Red (R) and White (r). Total: 1000 genes for flower color in the population

The Origin of Species

Runaway. demogenetic model for sexual selection. Louise Chevalier. Jacques Labonne

Enduring understanding 1.A: Change in the genetic makeup of a population over time is evolution.

AGENDA Go Over DUT; offer REDO opportunity Notes on Intro to Evolution Cartoon Activity

Chapter 8: Evolution and Natural Selection

Transcription:

Chapter 1 Formalizing the gene centered view of evolution Yaneer Bar-Yam and Hiroki Sayama New England Complex Systems Institute 24 Mt. Auburn St., Cambridge, MA 02138, USA yaneer@necsi.org / sayama@necsi.org A historical dispute in the conceptual underpinnings of evolution is the validity of the gene centered view of evolution. We transcend this debate by formalizing the gene centered view and establishing the limits on its applicability. We show that the genecentered view is a dynamic version of the well known mean field approximation. It breaks down for trait divergence which corresponds to symmetry breaking in evolving populations. 1.1 Introduction A basic formulation of evolution requires reproduction (trait heredity) with variation and selection with competition. At a particular time, there are a number of organisms which differ from each other in traits that affect their ability to survive and reproduce. Differential reproduction over generations leads one organism s offsprings to progressively dominate over others and changes the composition of the population of organisms. Variation during reproduction allows offspring to differ from the parent and an ongoing process of change over multiple generations is possible. One of the difficulties with this conventional view of evolution is that many organisms reproduce sexually, and thus the offspring of an organism may be quite different from the parent. A partial solution to this problem is recognizing that it is sufficient for offspring traits to be correlated to parental traits for the principles of evolution to apply.

2 Formalizing the gene centered view of evolution However, the gene centered view[2] gives a more simplified perspective for addressing this problem. In the gene centered view there are assumed to be indivisible elementary units of the genome (thought of as individual genes) that are preserved from generation to generation. Different versions of the gene (alleles) compete and mutate rather than the organism as a whole. Thus the subject of evolution is the allele, and, in effect, the selection is of alleles rather than organisms. This simple picture was strongly advocated by some evolutionary biologists, while others maintained more elaborate pictures which, for example, differentiate between vehicles of selection (the organisms) and replicators (the genes). However, a direct analysis of the gene centered view to reveal its domain of applicability has not yet been discussed. In this article we will review the mathematics of some standard conceptual models of evolution to clarify the relationship between gene centered and organism based notions of evolution. We will show that the gene centered view is of limited validity and is equivalent to a mean field approximation where correlations between the different genes are ignored, i.e. each gene evolves in an average environment (mean field) within a sexually reproducing population. By showing this we can recognize why the gene centered view is useful, and also when it is invalid when correlations are relevant. Correlations between genes arise when the presence of one allele in one place in the genome affects the probability of another allele appearing in another place in the genome, which is technically called linkage disequilibrium. One of the confusing points about the gene centered theory is that there are two stages in which the dynamic introduction of correlations must be considered: selection and sexual reproduction (gene mixing). Correlations occur in selection when the probability of survival favors certain combinations of alleles, rather than being determined by a product of terms given by each allele separately. Correlations occur in reproduction when parents are more likely to mate if they have certain combinations of alleles. If correlations only occur in selection and not in reproduction, the mean field approximation continues to be at least partially valid. However, if there are correlations in both selection and sexual reproduction then the mean field approximation and the gene centered view break down. Indeed, there are cases for which it is sufficient for there to be very weak correlations in sexual reproduction for the breakdown to occur. For example, populations of organisms distributed over space and an assumption that reproductive coupling is biased toward organisms that are born closer to each other can self-consistently generate allelic correlations in sexual reproduction by symmetry breaking. This is thus particularly relevant to considering trait divergence of subpopulations. Simulations of models that illustrate trait divergence through symmetry breaking can be found elsewhere[3, 4]. 1.2 Formalizing the gene centered view To clarify how standard models of evolution are related to the picture described above, it must be recognized that the assumptions used to describe the effect of

Formalizing the gene centered view of evolution 3 sexual reproduction are as important as the assumptions that are made about selection. A standard first model of sexual reproduction assumes that recombination of the genes during sexual reproduction results in a complete mixing of the possible alleles not just in each pair of mating organisms but rather throughout the species the group of organisms that is mating and reproducing. Offspring are assumed to be selected from the ensemble which represents all possible combinations of the genomes from reproducing organisms (panmixia). If we further simplify the model by assuming that each gene controls a particular phenotypic trait for which selection occurs independent of other gene-related traits, then each gene would evolve independently; a selected allele reproduces itself and its presence within an organism is irrelevant. Without this further assumption, selection should be considered to operate on the genome of organism, which may induce correlations in the allele populations in the surviving (reproducing) organisms. As the frequency of one allele in the population changes due to evolution over generations, the fitness of another allele at a different gene will be affected. However, due to the assumption of complete mixing in sexual reproduction, the correlations disappear in the offspring and only the average effect (mean field) of one gene on another is relevant. From the point of view of a particular allele at a particular gene, the complete mixing means that at all other genes alleles will be present in the same proportion that they appear in the population. Thus the assumption of complete mixing in sexual reproduction is equivalent to a gene based mean field approximation. The mean field approximation is widely used in statistical physics as a zeroth order approximation to understanding the properties of systems. There are many cases where it provides important insight to some aspects of a system (e.g. the Ising model of magnets) and others where is essentially valid (conventional BCS superconductivity). The application of the mean field approximation to a problem involves assuming an element (or small part of the system) can be treated in the average environment that it finds throughout the system. This is equivalent to assuming that the probability distribution of the states of the elements factor. 1 This qualitative discussion of standard models of evolution and their relationship to the mean field approximation can be shown formally. In the mean field approximation, the probability of appearance of a particular state of the system s (e.g. a particular genome) is considered as the product of probabilities of the components a i (e.g. its alleles): P (s) =P (a 1,...,a n )= i P i (a i ) (1.1) In the usual application of this approximation, it can be shown to be equivalent to allowing each of the components to be placed in an environment which is an 1 Systematic strategies for improving the study of systems beyond the mean field approximation both analytically and through simulations allow the inclusions of correlations between element behavior. An introduction to the mean field approximation and a variety of applications can be found in Bar-Yam[1].

4 Formalizing the gene centered view of evolution average over the possible environments formed by the other components of the system, hence the term mean field approximation. The key to applying this in the context of evolution is to consider carefully the effect of the reproduction step, not just the selection step. The two steps of reproduction and selection can be written quite generally as: {N(s, t +1)} = R[{N (s, t)}] (1.2) {N (s, t)} = D[{N(s, t)}] (1.3) The first equation describes reproduction. The number of offspring N(s, t + 1) having a particular genome s is written as a function of the reproducing organisms N (s, t) from the previous generation. The second equation describes selection. The reproducing population N (s, t) is written as a function of the same generation at birth N(s, t). The brackets on the left indicate that each equation represents a set of equations for each value of the genome. The brackets within the functions indicate, for example, that each of the offspring populations depends on the entire parent population. The proportion of alleles can be written as the number of organisms which have a particular allele a i at gene i divided by the total number of organisms: P i (a i,t)= 1 N 0 (t) N (s, t) (1.4) a j,j i where s =(a 1,...,a n ) represents the genome in terms of alleles a i. 2 The sum is over all alleles of genes j except gene i that is fixed to allele a i. N 0(t) is the total reproducing population at time t. Using the assumption of complete allelic mixing by sexual reproduction, the frequency of allele a i in the offspring is determined by only the proportion of a i in the parent population. Then, the same offspring would be achieved by an averaged population with a number of reproducing organisms given by Ñ (s, t) =N 0(t) P i (a i,t) (1.5) i since this Ñ (s, t) has the same allelic proportions as N (s, t) in (1.4). Thus complete reproductive mixing assumes that: R[{Ñ (s, t)}] R[{N (s, t)}] (1.6) The form of (1.5) indicates that the effective probability of a particular genome can be considered as a product of the probabilities of the individual genes as if they were independent. It follows that a complete step including both reproduction and selection can also be written in terms of the allele probabilities in the whole population. Given the above equations the update of an allele probability is: P i (a i,t+1) N 0 1 (t +1) a j,j i 2 This expression applies generally to haploid, diploid, or other cases. D s [R[{Ñ (s, t)}]] (1.7)

Formalizing the gene centered view of evolution 5 where D s is a function which satisfies N (s, t) =D s [{N(s, t)}]. Given the form of (1.5) and the additional assumption that the relative dynamics of change of genome proportions is not affected by the absolute population size N 0, we could write this as an effective one-step update P i (a i,t+1)= D[{P i (a i,t)}] (1.8) which describes the allele population change from one generation to the next of offspring. Since this equation describes the behavior of a single allele it corresponds to the gene centered view. There is still a difficulty pointed out by Sober and Lewontin[5]. The effective fitness of each allele depends on the distribution of alleles in the population. Thus, the fitness of an allele is coupled to the evolution of other alleles. This is apparent in (1.8) which, as indicated by the brackets, is a function of all the allele populations. It corresponds, as in other mean field approximations, to placing an allele in an average environment formed from the other alleles. This problem with fitness assignment would not be present if each allele separately coded for an organism trait. While this is a partial violation of the simplest conceptual view of evolution, however, the applicability of a gene centered view can still be justified, as long as the contextual assignment of fitness is included. When the fitness of organism phenotype is dependent on the relative frequency of phenotypes in a population of organisms it is known as frequency dependent selection, which is a concept that is being applied to genes in this context. A more serious breakdown of the mean field approximation occurs when the assumption of complete mixing during reproduction does not hold. This corresponds to symmetry breaking. 1.3 Breakdown of the gene centered view We can provide a specific example of breakdown of the mean field approximation using a simple example. We start by using a simple model for population growth, where an organism that reproduces at a rate of λ offspring per individual per generation has a population growth described by an iterative equation: N(t +1)=λN(t) (1.9) We obtain a standard model for fitness and selection by taking two equations of the form (1.9) for two populations N 1 (t) and N 2 (t) with λ 1 and λ 2 respectively, and normalize the population at every step so that the total number of organisms remains fixed at N 0. We have that: N 1 (t +1) = N 2 (t +1) = λ 1 N 1 (t) λ 1 N 1 (t)+λ 2 N 2 (t) N 0 λ 2 N 2 (t) λ 1 N 1 (t)+λ 2 N 2 (t) N 0 (1.10) The normalization does not change the relative dynamics of the two populations, thus the faster-growing population will dominate the slower-growing one

6 Formalizing the gene centered view of evolution according to their relative reproduction rates. If we call λ i the fitness of the ith organism we see that according to this model the organism populations grow at a rate that is determined by the ratio of their fitness to the average fitness of the population. Consider now sexual reproduction where we have multiple genes. In particular, consider two nonhomologue genes with selection in favor of a particular combination of alleles on genes. Specifically, after selection, when allele A 1 appears in one gene, allele B 1 must appear on the second gene, and when allele A 1 appears on the first gene allele B 1 must appear on the second gene. We can write these high fitness organisms with the notation (1, 1) and ( 1, 1), and the organisms with lower fitness (for simplicity, λ =0)as(1, 1) and ( 1, 1). When correlations in reproduction are neglected there are two stable states of the population with all organisms (1, 1) or all organisms ( 1, 1). If we start with exactly 50% of each allele, then there is an unstable steady state in which 50% of the organisms reproduce and 50% do not in every generation. Any small bias in the proportion of one or the other will cause there to be progressively more of one type over the other, and the population will eventually have only one set of alleles. We can solve this example explicitly for the change in population in each generation when correlations in reproduction are neglected. It simplifies matters to realize that the reproducing parent population (either (1, 1) or ( 1, 1)) must contain the same proportion of the correlated alleles (A 1 and B 1 ) so that: P 1,1 (t)+p 1, 1 (t) = P 1,1 (t)+p 1,1 (t) = p(t) P 1,1 (t)+p 1, 1 (t) = P 1, 1 (t)+p 1, 1 (t) = 1 p(t) (1.11) where p is a proportion of allele A 1 or B 1. The reproduction equations are: P 1,1 (t +1) = p(t) 2 P 1, 1 (t +1)=P 1,1 (t +1) = p(t)(1 p(t)) (1.12) P 1, 1 (t +1) = (1 p(t)) 2 The proportion of the alleles in the generation t is given by the selected organisms: p(t) =P 1,1(t)+P 1, 1(t) =P 1,1(t)+P 1,1(t) (1.13) Since the less fit organisms (1, 1) and ( 1, 1) do not reproduce this is described by: p(t) =P 1,1(t) = This gives the update equation p(t +1)= P 1,1 (t) P 1,1 (t)+p 1, 1 (t) (1.14) p(t) 2 p(t) 2 +(1 p(t)) 2 (1.15) which has the behavior described above and shown in Fig. 1.1. This problem is reminiscent of an Ising ferromagnet at very low temperature. Starting from

Formalizing the gene centered view of evolution 7 a nearly random state with a slight bias in the number of up and down spins, the spins align becoming either all up or all down. p 1 0.9 0.8 0.8 0.7 0.6 0.55 0.51 0.6 0.5 0.5 0.4 0.3 0.2 0.4 0.45 0.49 0.1 0.2 0 0 1 2 3 4 5 6 7 8 9 10 t Figure 1.1: Behavior of p in (1.15) with several different initial values. Since we can define the proportion of a gene in generation t and in generation t + 1 we can always write an expression for allele evolution in the form P i (a i,t+1)= λ ai a i λ ai P i (a i,t) P i(a i,t) (1.16) so that we have evolution that can be described in terms of gene rather than organism behavior. The fitness coefficient λ 1 for allele A 1 or B 1 is seen from (1.15) to be λ 1 (t) =p(t) (1.17) with the corresponding λ 1 =1 λ 1. The assignment of a fitness to an allele reflects the gene centered view. The explicit dependence on the population composition has been objected to on grounds of biological appropriateness[5]. For our purposes, we recognize this dependence as the natural outcome of a mean field approximation. It is interesting to consider when this picture breaks down more severely due to a breakdown in the assumption of complete reproductive mixing. In this example, if there is a spatial distribution in the organism population with mating correlated by spatial location and fluctuations so that the starting population has more of the alleles represented by 1 in one region and more of the alleles represented by 1 in another region, then patches of organisms that have predominantly (1, 1) or ( 1, 1) will form after several generations. This symmetry breaking, like in a ferromagnet, is the usual breakdown of the mean field approximation. Here it creates correlations in the genetic makeup of the population. When the correlations become significant then the whole population becomes to contain a number of types. The formation of organism types depends on the existence of correlations in reproduction that are, in effect, a partial form of speciation. For an example of such symmetry breaking and pattern formation see reference[3, 4].

8 Formalizing the gene centered view of evolution Thus we see that the most dramatic breakdown of the mean field approximation / gene centered view occurs when multiple organism types form. This is consistent with our understanding of ergodicity breaking, phase transitions and the mean field approximation. Interdependence at the genetic level is echoed in the population through the development of subpopulations. We should emphasize again that this symmetry breaking required both selection and reproduction to be coupled to gene correlations. 1.4 Conclusion The gene centered view can be applied directly in populations where sexual reproduction causes complete allelic mixing, and only so long as effective fitnesses are understood to be relative to the prevailing gene pool. However, structured populations (e.g. species with demes local mating neighborhoods) are unlikely to conform to the mean field approximation / gene centered view. Moreover, it does not apply to considering the consequences of trait divergence, which can occur when such correlations in organism mating occur. These issues are important in understanding problems that lie at scales traditionaly between the problems of population biology and those of evolutionary theory: e.g. the understanding of ecological diversity and sympatric speciation[3, 4]. Bibliography [1] Bar-Yam, Y., Dynamics of Complex Systems, Perseus Books Cambridge, MA (1997). [2] Dawkins, Richard, The Selfish Gene 2nd ed., Oxford University Press Oxford (1989). [3] Sayama, H., L. Kaufman, and Y. Bar-Yam, The role of spontaneous pattern formation in the creation and maintenance of biological diversity, Interjournal (http://www.interjournal.org/) (2000), submitted. [4] Sayama, H., L. Kaufman, and Y. Bar-Yam, Symmetry breaking and coarsening in spatially distributed evolutionary processes including sexual reproduction and disruptive selection, Physical Review E 62 (2000), 7065 7069. [5] Sober, E., and R. C. Lewontin, Artifact, cause and genic selection, Philosophy of Science 49 (1982), 157 180.