Self-Assemblage of Gene Nets in Evolution via Recruiting of New Netters


Alexander V. Spirov
I.M. Sechenov Institute of Evolutionary Physiology and Biochemistry, 44, Thorez Prospect, St. Petersburg, 194223, Russia
e-mail: spirov@iephb.ru; WWW http://avs.iephb.ru

Abstract. The fundamental dynamical processes of evolution are connected to processes based on sequences: the genetic messages coded by DNA. In biological evolution we can discern stages at which novel features emerge. Nature apparently exploits some as yet unknown mechanisms for the complexification of nets of replicating strings. It is known that genetic changes are not directly manifested in phenotypic changes. Rather, a complex developmental machinery mediates between genetic information and phenotypic characteristics. It provides a certain robustness by filtering out genetic changes. This degree of freedom allows a species to accumulate appropriate mutations without interrupting development. When the volume of heritable changes reaches a critical threshold, it can force development onto a new, higher-level trajectory. Here I review some findings from the search for, and exploitation of, rules of evolutionary complexification. I hope these algorithms could find applications in the representation problem of evolutionary computation.

1. Introduction

Understanding the conditions under which mutation and selection can lead to rising levels of organisation is important for evolutionary biology as well as for evolutionary computation. The fundamental dynamical processes of evolution are connected to processes based on sequences: the genetic messages coded by DNA. Considering evolutionary pathways in detail, we can discern stages at which novel features emerge as well as stages of slow change of existing ones. To model the origin of new features, we require a mechanism by which complex, higher-level behavior emerges from low-level interactions (Kaneko, 1994).

Nature apparently exploits some as yet unknown mechanisms for the evolutionary complexification of nets of replicating strings. According to the classical point of view, the evolutionary complexification of living beings requires the co-ordinated change of several phenotypic characters to produce adaptive variants. This, however, requires simultaneous mutations in several genes, which is, of course, effectively impossible. On the other hand, genetic changes are not directly manifested in phenotypic changes. Rather, a complex developmental machinery mediates between genetic information and phenotypic characteristics. It provides a certain robustness by filtering out genetic changes: some genetic changes make little or no difference to the final phenotype. In other words, robustness is a way to escape the requirement of simultaneous mutations in several genes for evolutionary complexification. This degree of freedom allows a species to accumulate appropriate mutations without interrupting development. When the volume of heritable changes reaches a critical threshold, it can force development onto a new, higher-level trajectory.

Our computer simulations of the evolution of gene networks governing the morphogenesis of early embryos show that self-organisation, or self-assemblage (outgrowth), of gene networks is possible during evolution. This self-assemblage proceeds by recruiting a new gene through the closing of new cascades of interactions between new and old members of the network. (The new genes could arise by duplication of members of this net or of another one.) The recruiting of new netters does not need to be forced by selection; stabilising selection is quite enough. However, if the newly formed gene systems prove capable of raising the morphological or functional level of organisation, then these hopeful monsters, in Goldschmidt's sense (Wallace, 1985), can be picked up by driving selection.

Following Dellaert and Beer (1994), Kitano (1994), and Wagner and Altenberg (1995), we study the evolution of the genotype-to-phenotype map via the creation of new genes from the point of view of self-organisation. The fundamental dynamical processes of evolution are connected to dynamical processes based on sequences.
Nature apparently exploits some as yet unknown mechanisms for the self-complexification of nets of strings. In this paper we present our findings from the search for, and exploitation of, algorithms for the self-complexification of regulatory nets of replicating strings. We think these complexification algorithms are a non-trivial result that could find applications in the representation problem of evolutionary computation.

2. Networks of Genes-Controllers and Their Evolution

The identification of controller genes has been a significant recent finding in developmental biology. Networks and cascades of controller genes orchestrate the expression of the genome during embryonic development. We now know a great deal about the mechanisms by which patterns of controller-gene expression appear and are maintained. Viewing the activity of these networks as a kind of self-organising mechanism of morphogenesis allows the conditions under which selection is effective to be better understood. Stabilising and driving selection applied to these self-organising mechanisms can dramatically accelerate the evolutionary complexification of developmental processes. Accordingly, self-organisation becomes apparent in evolution as the self-assemblage of gene nets and cascades.

My approach exploits the following essential characteristics of gene networks. Each gene in the network encodes a peptide product whose function is to activate or repress another gene (its so-called target gene). These transcription factors recognise specific DNA sequences in the regulatory domains of the target genes and bind tightly to the target sites. A given member of the network produces transcription factors that specifically activate or repress other netters, and vice versa. A complex network of autoregulatory and cross-regulatory actions functionally connects the genes, thereby forming gene networks and cascades. The result of the action of a network or cascade is a particular spatial pattern of activity (expression) of the network members. Members activated downstream, in turn, switch on structural genes at the appropriate time and place. These genes then produce enzymes for the subsequent differentiation and morphogenesis of embryo rudiments. We assume by preference a concentration-dependent action of the peptide regulator on its target gene: at low concentrations it acts as an activator, while at high concentrations it acts as a repressor (see Jackle et al., 1992).

Consider the space of all possible DNA sequences of a given length composed of the four types of bases. Our genotypes consist of a set of N sequences of length L, forming a certain region in the sequence space. The majority of possible sequences are forbidden in biological reality, leaving only a subset of N allowed ones to participate in evolution (see Asselmeyer et al., 1995). Now consider a mutation that takes place in a sequence. To measure the size of the change we can use a metric on the sequence space given by the Hamming distance.
The Hamming distance between two sequences is defined as the number of positions at which they differ. Two sequences at Hamming distance one are neighbours.

When discussing the complexification of living beings, we really have the following picture in mind. The neighbourhood structures in sequence space of an ancestor and of its advanced offspring are generally two unconnected graphs, as in Fig. 1a. An evolutionary pathway from a lower to a higher level of organisation needs a connection between the graphs. It is impossible to jump from the actual ancestor graph to the potential offspring graph: all the sequences that would formally connect the two graphs are forbidden because they give abnormal phenotypes.
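The Hamming metric and the picture of two unconnected neighbourhood graphs can be illustrated with a minimal Python sketch. The sequences used here, including the "offspring" set and the bridging sequence, are hypothetical examples chosen for illustration; only the definitions of distance and neighbourhood come from the text.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def components(allowed):
    """Connected components of the neighbour graph over the allowed sequences.

    Two sequences are joined by an edge when their Hamming distance is one.
    """
    allowed = list(allowed)
    adjacent = {s: set() for s in allowed}
    for a, b in combinations(allowed, 2):
        if hamming(a, b) == 1:
            adjacent[a].add(b)
            adjacent[b].add(a)
    seen, result = set(), []
    for s in allowed:
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:  # depth-first walk over neighbours
            cur = stack.pop()
            if cur in comp:
                continue
            comp.add(cur)
            stack.extend(adjacent[cur] - comp)
        seen |= comp
        result.append(comp)
    return result


# Hypothetical allowed sets: an "ancestor" graph and a "potential offspring" graph.
ancestor = {"CATAAT", "CATAAC"}
offspring = {"CAGAGT", "CAGAGC"}
bridge = "CAGAAT"  # hypothetical, previously forbidden intermediate

print(len(components(ancestor | offspring)))             # two separate graphs
print(len(components(ancestor | offspring | {bridge})))  # joined once the bridge is allowed
```

In this toy case, allowing a single previously forbidden sequence is enough to join the two graphs.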

Fig. 1. Complicated topology of the neighbourhood structure of the sequence space (a) and its change in evolution (b).

This is a strict formulation of the above-mentioned scheme of evolutionary complexification via simultaneous mutations at several points of the DNA string. To reach the higher level, the evolutionary process must have a chance to cross the gap. In principle, we can imagine several ways to achieve this. The simplest is a decrease in the adaptive role of the traits under consideration, which broadens the variability of those traits. In that case, previously forbidden sequences that determine phenotypes with deviations get a chance to reproduce. As a result, a connection forms between the actual and the potential graph, as in Fig. 1b.

3. Mechanisms for Jumping Over the Gaps: Conflicting Gene Systems

The complicated topology of the neighbourhood structure of the sequence space (Fig. 1a) is ensured by the intrinsic potential of string regulatory networks for complexification. Meanwhile, connections between the isolated actual and potential graphs may be facilitated by the robustness of developmental mechanisms. Such connections between diverged levels of organisation are assumed to arise through exploitation of the canalization schemes of development.

In our simulations of the emergence of novel features, the host genome initially consists of a functionally coupled pair of gene regulatory elements (O-gene + A-gene): O → A (the O-gene product activates the A-gene). The A-gene contains CATAAT sequences, which belong to the target sequences for binding of the O-product.

As an oversimplified and indirect implementation of the driving force, I assume the appearance of a virus. The virus is randomly transmitted from carriers to healthy genomes. By definition, the virus is successfully transmitted if the host genome has an O-binding site in the A-gene. I assume that it inserts into the A-gene by cutting the O-binding site and then becomes silent. With a predetermined probability the virus wakes up and gradually decreases the host's reproductive potential, finally killing the host and thus eliminating the affected genome from evolution.

Point mutations in the O-binding sites lead to insensitivity to the virus. However, the wild type of our genome cannot lack the O-binding sites in the A-gene, because it would then lack the normal phenotype (the A-product concentration profile). Hence a genome that lacks the wild-type O-binding site in the A-gene but retains a normal pattern of A-gene expression is a prospective mutant that will be insensitive to the virus and gain a selective advantage. In time, the mutant genome will displace the wild type. This is, of course, a model example of host-parasite evolution.

I must emphasize that such selection by parasite pressure is effective only if the design of the wild-type genome allows appropriate reorganization in principle. The virus acts as a catalyst of the process: it does not produce new forms but catches them. If the wild-type genome does not have an appropriate potential for reorganization (testable in calculations), a virus will not help. Here I imply a broad interpretation of "virus"; it may be a plasmid or a transposable element (Daniels et al., 1990).

4. Regulatory Nets with High Potential for Complexification

Evolutionary outgrowth of gene networks is possible only if the design of the wild-type genome allows appropriate reorganizations in principle. If the wild-type genome does not have an appropriate potential for complexification (testable in calculations), selection will not help. The impressive outgrowth of the Drosophila pair-rule network on the way from primitive to higher insects inspired us to search for simple regulatory networks with a high potential for complexification.

Recruiting of new netters depends upon the plasticity of regulatory pathways inside the network. First, each transcription factor can recognize not a unique DNA sequence (say, CATAAT) but a family of similar sequences (say, CATAAT, CATAAC, AATAAT, AATAAC). Second, the sequence families bound by different transcription factors can be very similar or can overlap. In other words, the same target sequence can be recognized by two factors, especially if they have antagonistic actions on the same target gene. These two characteristics are necessary and sufficient to ensure the recruiting phenomenon.
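The two properties just listed, recognition of a family of sites and overlap between the families of different factors, can be sketched in Python as follows. The O-family is the example given in the text; the second family, the regulatory string and the function names are hypothetical and serve only to show how a site recognised by two factors might be detected.

```python
# Family recognised by the O-product (example family from the text).
O_FAMILY = frozenset({"CATAAT", "CATAAC", "AATAAT", "AATAAC"})
# Hypothetical family for a second factor; it overlaps the O-family in CATAAT.
B_FAMILY = frozenset({"CAGAAT", "CATAAT"})


def binding_sites(region, family, site_len=6):
    """Start positions of every window of the region that belongs to the family."""
    return [
        i
        for i in range(len(region) - site_len + 1)
        if region[i:i + site_len] in family
    ]


def doubly_recognised(region, family_1, family_2, site_len=6):
    """Positions bound by both factors, i.e. candidates for double regulation."""
    first = set(binding_sites(region, family_1, site_len))
    second = set(binding_sites(region, family_2, site_len))
    return sorted(first & second)


# Hypothetical regulatory region; "n" marks a spacer.
region = "CATAATnAATAACnCAGAATn"

print(binding_sites(region, O_FAMILY))                # sites seen by the O-product
print(binding_sites(region, B_FAMILY))                # sites seen by the second factor
print(doubly_recognised(region, O_FAMILY, B_FAMILY))  # sites seen by both
```

Treating a family as an explicit set of strings keeps the overlap test a simple set intersection; a real model might instead score similarity to a consensus sequence.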

Consider the following simple gene cascade, which has a potential for complexification. Each genome initially consists of a functionally coupled pair of gene regulatory elements (O-gene + A-gene): M → O → A. M is a site-specific transcription factor whose target is the O-gene. It forms an exponentially decaying gradient like Drosophila's bicoid morphogen. M has affinity for the CATAAT-like family of sequences, and the regulatory region of the O-gene has two such sites. The morphogen gradient, exponential in distance, activates the O-gene in a concentration-dependent manner. In its turn, the O-product activates the A-gene, also in a concentration-dependent manner. The concentration profiles of the O- and A-products for the wild type have a simple form:

[Figure: concentration profiles of the O- and A-products]

and correspond to an early embryo phenotype with two bands of A-gene expression:

[Figure: early embryo phenotype with two bands of A-gene expression]

The wild-type genomes are two-string variables. Initially the A-gene string is CATAATnCATAATnCATAATn, where A, T, G, C is the four-letter DNA code and n is a spacer. The O-gene string has a similar form. A fixed probability of point mutation, that is, of substitution of one symbol by another, is prescribed before the first run.

When the computations begin, each genome is tested for its ability to govern development. This genotype-phenotype representation is achieved by translating the strings into a set of coupled ordinary differential equations. The overall form of the equation set depends not only on the number of gene strings in a given genome but also on the sequences of the target sites. A special subroutine (ODE_Set) analyses the sequences and chooses equations of the appropriate structure (see next section).
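As a rough illustration of how an exponential M-gradient acting through concentration-dependent regulation could yield two bands of A-gene expression, the following Python sketch replaces the coupled differential equations with a static response curve. This is an assumption made for brevity, not the ODE_Set procedure of the paper. The bell-shaped response (activation at low concentration, repression at high), the function names and all parameter values are hypothetical.

```python
import math


def morphogen_gradient(n_points=50, decay=0.05):
    """Exponentially decaying M-gradient sampled at n_points positions (assumed decay rate)."""
    return [math.exp(-decay * i) for i in range(n_points)]


def response(conc, optimum, width=0.15):
    """Assumed bell-shaped response: activation below the optimum, repression above it."""
    return math.exp(-((conc - optimum) / width) ** 2)


def cascade_profiles(n_points=50):
    """Static stand-in for the M -> O -> A cascade; returns the three profiles."""
    m = morphogen_gradient(n_points)
    o = [response(x, optimum=0.3) for x in m]  # O-gene driven by M
    a = [response(x, optimum=0.5) for x in o]  # A-gene driven by the O-product
    return m, o, a


def bands(profile, level=0.5):
    """(start, end) index pairs of contiguous runs where the profile exceeds the level."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > level and start is None:
            start = i
        elif value <= level and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


m_profile, o_profile, a_profile = cascade_profiles()
print("O bands:", bands(o_profile))
print("A bands:", bands(a_profile))
```

With these assumed parameters the O-profile forms a single band and the A-profile forms two bands on its flanks, which is the qualitative wild-type pattern described above.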

After evaluating the set of equations, the program finds the phenotype of the tested species. Namely, the values of the A-product (and of the products of other genes) are calculated at each of 50 points along the exponentially decaying M-gradient. The results are used for graphical presentation of the phenotypes and for the Scoring procedure. The Scoring procedure compares the calculated set of A-values with the prescribed canonical A-pattern: the sum of squared deviations over the 50 points of the M-gradient is calculated. If the sum is above a threshold, the genome is eliminated.

Now assume that two new genes, O' and A', appear in the genome as a result of duplication of the initial O + A pair: O → A (O' → A'). The initial duplication of the O + A gene pair is followed by multiple point mutations in the regulatory sites of the O' and A' genes. The functionally useless duplicates O' and A' are lost over time with a prescribed probability. Before this happens, however, the silent genes accumulate point mutations. In time there appears, by chance, the first example of a genome consisting of four genes that includes a new gene pair, B + C, suitable for recruitment: O → A (B' → C'). The silent extra copies O' and A' accumulate point mutations, so their target-site specificity can shift relative to the wild-type O and A pair. Only unique combinations of nucleotide substitutions in the silent pair of duplicates allow the subsequent recruiting of the newly modified genes into the growing cascade:

[Figure: regulatory scheme of the growing cascade]

The recruiting of new genes via the closing of new regulatory pathways in fact involves a step-by-step handing over of control from an old gene regulator to a new one. Intermediate mutant genes regulated by both regulators represent a bottleneck for the evolution of the net. A rare combination of kinetic parameters of gene activation facilitates passage through the bottleneck to a new structure of the net. In our computations, for example, the A-gene has three O-binding sites: CCTAAT CATAAT AATAAT.
In the example discussed, the O-product recognizes the CATAAT sequence family. Hence a change of the recognition specificity of the B-gene product to the CAGAAT sequence would be appropriate for subsequent evolution. However, this must also coincide with a shift of the C-gene product specificity to the AGAAT sequence. I assume that the waking up of the appropriate B + C pair coincides with appropriate point mutations in the A- and C-genes. There is a very low but finite probability of this triple coincidence. Thus the first intermediate mutant has the following A-gene sequence (two old binding sites for the O-product and one site for B-binding that overlaps with a C-binding site): CCTAAT CAGAAT AATAAT.

The intermediate forms with a doubly regulated A-gene apparently have reduced fitness and will be eliminated by selection. However, if the mutant has a selective advantage, say partial tolerance to the virus, then the intermediates will accumulate in the population. Subsequently the number of new mutants will grow, and eventually a complete mutant that lacks the O-binding sites and is tolerant to the virus will appear. In our case it will have the A-gene CAGAAT CAGAAT CAGAAT. The 4-gene cascades escape infection pressure but still carry out morphogenesis successfully (the A-product concentration pattern produces the morphology of the wild type). This hopeful monster shows four additional bands of C-expression compared with the wild-type early embryo:

[Figure: expression pattern of the four-gene cascade, with four bands of C-expression]

Now the main idea becomes clear: the combined action of the activator B-product and the repressor C-product mimics the action of O. Namely, the B-product gives a two-wave gradient similar to the A-concentration profile. The B-gradient activates the A-gene by a pure activation mechanism, giving rise to a similar two-wave A-gradient. However, this pure activation mechanism gives bands of A-gene expression that are too broad. Meanwhile the C-product is activated by B in a concentration-dependent manner, yielding a four-wave C-concentration profile. It is essential that each pair of C-product bands sets the boundaries of an A-band, narrowing it. Finally we obtain the same two-wave A-concentration profile as in the case of the wild-type genome. Hence we achieve the recruitment of a pair of new members into a simple cascade. As we can see, the same pattern of A-gene expression is achieved by the action of a more complicated gene network. This is admittedly a kind of evolutionary game with strings, but it retains some essential features of the organization of the Drosophila segmentation network.

5. Computer Evolution

The overall organisation of the program is similar to that of many known simple programs in the genetic algorithm (GA) approach. Depending upon RAM volume, up to 12,000 string genomes can be handled in this example of computational evolution. There are Mutation, Scoring and Reproduction subprograms. The Mutation subprogram includes a Point_Mutation subroutine as well as a Crossover one. The Scoring subprogram begins its treatment of each genome by reconstructing the ODE set that describes the genotype-to-phenotype transformation. The calculated profiles of gene expression are then compared with a canonical picture. Finally, the Reproduction subprogram reproduces the winning genomes in accordance with a truncation or proportional strategy.

The results of each Mutation-Scoring-Reproduction round are displayed as a horizontal multicoloured line. Each pixel of the line corresponds to one (or 2 or 4) genomes, and the genomes are arranged by score. Adding line after line gives, with time, a live tree of the computer evolution.

When the program starts, a population of wild-type genomes is created. The probabilities of random point nucleotide substitutions and of duplication of the O + A gene pair are set before the first run. The genes encode patterns of their expression, and these patterns are scored. I prefer to use a truncation strategy of stabilizing selection: genomes that pass the scoring threshold are preferentially reproduced for the next generation, with the mutation operators applied, and the losers are eliminated. I performed simulations both with and without recombination. In this case, recombination turned out not to have a major influence on computational evolution.

Our computations reveal simple but impressive examples of the rise of structural and functional redundancy in evolutionary computation. Namely, the pair of wild-type genes O and A and the four-gene cascade really perform the same task: maintenance of the two-wave pattern of A-gene expression.
Such redundancy is ensured by known features of the kinetics of gene expression.

Drosophila establishes its body plan rapidly in comparison with primitive insects. The segments in long germ-band insect embryos, such as the fruit fly, are all determined at the syncytial blastoderm stage. This is in contrast to short germ-band insects (such as the grasshopper), which show an early determination of only the anterior head segments, whereas the more posterior thoracic and abdominal segments are added sequentially after formation of a primary germ anlage (Patel et al., 1992). Segment formation in Drosophila involves the pair-rule gene network, which defines double-segmental periodicities and which has been considered to represent a special adaptation to the long germ-band type of development. My computational experiments allow the evolutionary appearance of the Drosophila segmentation mechanism to be simulated and tested. I suggest that the evolutionarily fast formation of the fly segmentation scheme is facilitated by self-enlargement of the initial segmentation cascade via the recruiting of new gene members.
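One Mutation-Scoring-Reproduction round with truncation selection, as described in this section, might be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the original program. The phenotype function is a hypothetical stand-in for the ODE-based genotype-to-phenotype translation, the canonical pattern and threshold are invented for illustration, and crossover and gene duplication are omitted.

```python
import random

BASES = "ATGC"
WILD_TYPE = "CATAATnCATAATnCATAATn"  # A-gene string given in the text
FAMILY = {"CATAAT", "CATAAC", "AATAAT", "AATAAC"}  # example O-binding family
# Hypothetical canonical A-pattern: two bands over 50 points of the M-gradient.
CANONICAL = [1.0 if 17 <= i <= 18 or 33 <= i <= 37 else 0.0 for i in range(50)]


def mutate(genome, rate, rng):
    """Point mutation: each base is redrawn with probability `rate`; spacers are kept.

    The redraw may return the same base, so the effective substitution rate is lower.
    """
    return "".join(
        rng.choice(BASES) if ch != "n" and rng.random() < rate else ch
        for ch in genome
    )


def phenotype(genome):
    """Stand-in phenotype (assumption): band height scales with intact binding sites."""
    sites = sum(genome[i:i + 6] in FAMILY for i in range(0, len(genome), 7))
    return [sites / 3 * c for c in CANONICAL]


def score(profile):
    """Sum of squared deviations from the canonical pattern (lower is better)."""
    return sum((p - c) ** 2 for p, c in zip(profile, CANONICAL))


def next_generation(population, rate, threshold, rng):
    """Mutate, eliminate genomes whose deviation exceeds the threshold, refill by copying."""
    mutants = [mutate(g, rate, rng) for g in population]
    survivors = [g for g in mutants if score(phenotype(g)) <= threshold]
    if not survivors:
        return []  # the population has gone extinct
    return [rng.choice(survivors) for _ in population]


rng = random.Random(0)
population = [WILD_TYPE] * 20
for _ in range(10):
    population = next_generation(population, rate=0.01, threshold=0.5, rng=rng)
print(len(population), "genomes,", len(set(population)), "distinct")
```

Because the threshold acts only on the phenotype, neutral substitutions within the binding family can accumulate under this stabilising selection, which is the kind of hidden variation the recruiting scenario relies on.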

Acknowledgements. Supported by the Russian Foundation for Basic Research (Grant No 96-04-49350). I thank Richard Gordon, Denis Thieffry and anonymous reviewers for critical comments.

References

1. Asselmeyer, T., W. Ebeling and H. Rose (1995). Smoothing representation of fitness landscapes - the genotype-phenotype map of evolution. BioSystems.
2. Daniels, S.B., K.R. Peterson, L.D. Strausbaugh, M.G. Kidwell and A. Chovnick (1990). Evidence for horizontal transmission of the P transposable element between Drosophila species. Genetics 124, 339-55.
3. Dellaert, F. and R.D. Beer (1994). Toward an evolvable model of development for autonomous agent synthesis. In: Artificial Life IV: Proceedings of the Fourth International Workshop on the Synthesis and Simulation of Living Systems. R. Brooks and P. Maes (Eds.). MIT Press.
4. Jackle, H., M. Hoch, M.J. Pankratz, N. Gerwin, F. Sauer and G. Bronner (1992). Transcriptional control by Drosophila gap genes. J Cell Sci Suppl 16, 39-51.
5. Kaneko, K. (1994). Chaos as source of complexity and diversity in evolution. ALife 1, 163-177.
6. Kitano, H. (1994). Evolution of Metabolism for Morphogenesis. In: Artificial Life IV: Proceedings of the Fourth International Workshop on the Synthesis and Simulation of Living Systems. R. Brooks and P. Maes (Eds.). MIT Press.
7. Patel, N.H., E.E. Ball and C.S. Goodman (1992). Changing role of even-skipped during the evolution of insect pattern formation. Nature 357, 339-342.
8. Wagner, G.P. and L. Altenberg (1995). Complex Adaptations and the Evolution of Evolvability. WWW http://peaplant.biology.yale.edu:8001/papers/CompAdapt/compadapt.html
9. Wallace, B. (1985). Reflections on the still hopeful monster. Quart. Rev. Biol. 60, 31-42.