Neutral Networks of RNA Genotypes and RNA Evolution in silico

Similar documents
Evolution of Biomolecular Structure 2006 and RNA Secondary Structures in the Years to Come. Peter Schuster

How Nature Circumvents Low Probabilities: The Molecular Basis of Information and Complexity. Peter Schuster

Is the Concept of Error Catastrophy Relevant for Viruses? Peter Schuster

RNA From Mathematical Models to Real Molecules

RNA Bioinformatics Beyond the One Sequence-One Structure Paradigm. Peter Schuster

Error thresholds on realistic fitness landscapes

Self-Organization and Evolution

Tracing the Sources of Complexity in Evolution. Peter Schuster

Evolution on simple and realistic landscapes

Von der Thermodynamik zu Selbstorganisation, Evolution und Information

Evolution and Molecules

Complexity in Evolutionary Processes

Mathematische Probleme aus den Life-Sciences

Designing RNA Structures

RNA From Mathematical Models to Real Molecules

Chemistry and Evolution at the Origin of Life. Visions and Reality

Mathematical Modeling of Evolution

Different kinds of robustness in genetic and metabolic networks

How computation has changed research in chemistry and biology

Evolution on Realistic Landscapes

The Advantage of Using Mathematics in Biology

The Role of Topology in the Study of Evolution

COMP598: Advanced Computational Biology Methods and Research

Origin of life and early evolution in the light of present day molecular biology. Peter Schuster

Mechanisms of molecular cooperation

Chemistry on the Early Earth

Chance and necessity in evolution: lessons from RNA

Problem solving by inverse methods in systems biology

Replication and Mutation on Neutral Networks: Updated Version 2000

Evolution of Genotype-Phenotype mapping in a von Neumann Self-reproduction within the Platform of Tierra

RNA evolution and Genotype to Phenotype maps

Evolutionary Dynamics and Optimization. Neutral Networks as Model-Landscapes. for. RNA Secondary-Structure Folding-Landscapes

Molecular evolution - Part 1. Pawan Dhar BII

Chapter 17. From Gene to Protein. Biology Kevin Dees

Systems biology and complexity research

When to use bit-wise neutrality

RNA folding at elementary step resolution

arxiv:physics/ v1 [physics.bio-ph] 27 Jun 2001

(Lys), resulting in translation of a polypeptide without the Lys amino acid. resulting in translation of a polypeptide without the Lys amino acid.

EVOLUTIONARY DYNAMICS ON RANDOM STRUCTURES SIMON M. FRASER CHRISTIAN M, REIDYS DISCLAIMER. United States Government or any agency thereof.

UNIT 5. Protein Synthesis 11/22/16

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics.

Plasticity, Evolvability, and Modularity in RNA

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME

APES C4L2 HOW DOES THE EARTH S LIFE CHANGE OVER TIME? Textbook pages 85 88

IIASA. Continuity in Evolution: On the Nature of Transitions INTERIM REPORT. IR / April

Molecular Modelling. part of Bioinformatik von RNA- und Proteinstrukturen. Sonja Prohaska. Leipzig, SS Computational EvoDevo University Leipzig

Bridging from Chemistry to the Life Sciences Evolution seen with the Glasses of a Physicist a

Evolution and Design

Curriculum Links. AQA GCE Biology. AS level

Genetical theory of natural selection

1. Contains the sugar ribose instead of deoxyribose. 2. Single-stranded instead of double stranded. 3. Contains uracil in place of thymine.

The Mathematics of Darwinian Systems

Evolutionary dynamics of populations with genotype-phenotype map

Lecture 7: Simple genetic circuits I

From gene to protein. Premedical biology

Full file at CHAPTER 2 Genetics

CHAPTER 23 THE EVOLUTIONS OF POPULATIONS. Section C: Genetic Variation, the Substrate for Natural Selection

Critical Mutation Rate Has an Exponential Dependence on Population Size

Types of RNA. 1. Messenger RNA(mRNA): 1. Represents only 5% of the total RNA in the cell.

RNA & PROTEIN SYNTHESIS. Making Proteins Using Directions From DNA

The Origin of Life on Earth

Chapter 8: Introduction to Evolutionary Computation

EVOLUTIONARY DYNAMICS AND THE EVOLUTION OF MULTIPLAYER COOPERATION IN A SUBDIVIDED POPULATION

SECOND PUBLIC EXAMINATION. Honour School of Physics Part C: 4 Year Course. Honour School of Physics and Philosophy Part C C7: BIOLOGICAL PHYSICS

Neutral Evolution of Mutational Robustness

Quantifying slow evolutionary dynamics in RNA fitness landscapes

Exercise 3 Exploring Fitness and Population Change under Selection

arxiv: v1 [q-bio.bm] 31 Oct 2017

Quasispecies distribution of Eigen model

Complete Suboptimal Folding of RNA and the Stability of Secondary Structures

GENE ACTIVITY Gene structure Transcription Transcript processing mrna transport mrna stability Translation Posttranslational modifications

The Topology of the Possible: Formal Spaces Underlying Patterns of Evolutionary Change

Cell Growth and Genetics

Why EvoSysBio? Combine the rigor from two powerful quantitative modeling traditions: Molecular Systems Biology. Evolutionary Biology

Peter Schuster, Institut für Theoretische Chemie der Universität Wien, Austria Darwin s Optimization in the looking glasses of physics and chemistry

Problems on Evolutionary dynamics

The Phenotype-Genotype- Phenotype (PGP) Map. Nayely Velez-Cruz and Dr. Manfred Laubichler

Reading Assignments. A. Genes and the Synthesis of Polypeptides. Lecture Series 7 From DNA to Protein: Genotype to Phenotype

GCD3033:Cell Biology. Transcription

Objective 3.01 (DNA, RNA and Protein Synthesis)

QUANTIFYING SLOW EVOLUTIONARY DYNAMICS IN RNA FITNESS LANDSCAPES

SI Appendix. 1. A detailed description of the five model systems

Computational Biology: Basics & Interesting Problems

Genetics 304 Lecture 6

Mutation, Selection, Gene Flow, Genetic Drift, and Nonrandom Mating Results in Evolution

Systems Biology: A Personal View IX. Landscapes. Sitabhra Sinha IMSc Chennai

Casting Polymer Nets To Optimize Molecular Codes

CelloS: a Multi-level Approach to Evolutionary Dynamics

The role of form in molecular biology. Conceptual parallelism with developmental and evolutionary biology

Metastable Evolutionary Dynamics: Crossing Fitness Barriers or Escaping via Neutral Paths?

The Amoeba-Flagellate Transformation

arxiv:physics/ v1 [physics.bio-ph] 19 Apr 2000

Haploid & diploid recombination and their evolutionary impact

Multiple Choice Review- Eukaryotic Gene Expression

Genetic Algorithms. Donald Richards Penn State University

Vom Modell zur Steuerung

RIBOSOME: THE ENGINE OF THE LIVING VON NEUMANN S CONSTRUCTOR

The RNA World, Fitness Landscapes, and all that

Darwin's theory of natural selection, its rivals, and cells. Week 3 (finish ch 2 and start ch 3)

Transcription:

Neutral Networks of RNA Genotypes and RNA Evolution in silico Peter Schuster Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien RNA Secondary Structures in Dijon Dijon, 24. 26.06.2002

No new principle will declare itself from below a heap of facts. Sir Peter Medawar, 1985

The genotypes or genomes of individuals and species, being reproductively related ensembles of individuals, are DNA or RNA sequences. They are changing from generation to generation through mutation and recombination. Genotypes unfold into phenotypes or organisms, which are the targets of the evolutionary selection process. Point mutations are single nucleotide exchanges. The Hamming distance of two sequences is the minimal number of single nucleotide exchanges that mutually converts the two sequence into each other.

A A A A A U U U U U U C C C C C C C C G G G G G G G G A U C G = adenylate = uridylate = cytidylate = guanylate 5 - -3 Genotype: The sequence of an RNA molecule consisting of monomers chosen from four classes.

Phenotype: Three-dimensional structure of phenylalanyl transfer-rna

Sequence 5'-End 3'-End GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYA UCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA 3'-End 5'-End 70 Secondary Structure 10 60 50 20 30 40 Symbolic Notation 5'-End 3'-End Definition and formation of the secondary structure of phenylalanyl-trna

CGTCGTTACAATTTAGGTTATGTGCGAATTCACAAATTGAAAATACAAGAG... CGTCGTTACAATTTAAGTTATGTGCGAATTCCCAAATTAAAAACACAAGAG... Hamming distance d (S,S ) = H 1 2 4 (i) (ii) (iii) d (S,S) = 0 H 1 1 d (S,S) = d (S,S) H 1 2 H 2 1 d (S,S) d (S,S) + d (S,S) H 1 3 H 1 2 H 2 3 The Hamming distance induces a metric in sequence space

Hydrogen bonds Hydrogen bonding between nucleotide bases is the principle of template action of RNA and DNA.

5' 3' Plus Strand G C C C G Synthesis 5' 3' Plus Strand G C C C G C 3' G Synthesis 5' 3' Plus Strand G C C C G Minus Strand C G G G C 3' 5' Complex Dissociation 5' 3' Plus Strand G C C C G 5' Minus Strand C G G + G C 3' Complementary replication as the simplest copying mechanism of RNA

(A) + I 1 f 1 I 1 + I 1 dx / dt = f x - x j j j j Φ =( f j - Φ) xj (A) + I 2 f 2 I 2 + I 2 Φ =Σ i f i x i ; Σ x = 1 ; i,j i i =1,2,...,n [A] = a = constant f m = max { f; j=1,2,...,n} j (A) + I j f j Ij + Ij x (t) 1 for t m (A) + I m f m I m + I m s = (f m+1 -f m )/f m succession of temporarily fittest variants: m m+1... (A) + I n f n I n + I n Selection of the fittest or fastest replicating species I m

Stock Solution [A] = a 0 Reaction Mixture: A; I, k=1,2,... k k 1 A + I 2 I 1 1 d 1 k 2 A + I 2 I 2 2 d 2 k 3 A + I 2 I 3 3 d 3 k 4 A + I 2 I 4 4 d 4 k 5 A + I 2 I 5 5 d 5 Replication in the flow reactor P.Schuster & K.Sigmund, Dynamics of evolutionary optimization, Ber.Bunsenges.Phys.Chem. 89: 668-682 (1985)

Concentration of stock solution a 0 A + I + I 1 2 A + I 1 A k 1 A + I 2 I 1 1 d 1 k 2 A + I 2 I 2 2 d 2 k 3 A + I 2 I 3 3 d 3 k 4 A + I 2 I 4 4 d 4 k 5 A + I 2 I 5 5 d 5 k > k > k > k > k 1 2 3 4 5 Flow rate r = R -1 Selection in the flow reactor: Reversible replication reactions

Concentration of stock solution a 0 A + I 1 A f 1 A + I 2 I 1 1 f 2 A + I 2 I 2 2 f 3 A + I 2 I 3 3 f 4 A + I 2 I 4 4 f 5 A + I 2 I 5 5 f > f > f > f > f 1 2 3 4 5 Flow rate r = R -1 Selection in the flow reactor: Irreversible replication reactions

1 Fraction of advantageous variant 0.8 0.6 0.4 0.2 s = 0.1 s = 0.02 s = 0.01 0 0 200 400 600 800 1000 Time [Generations] Selection of advantageous mutants in populations of N = 10 000 individuals

Σ dx / dt = f x - x j i iq ji i j Φ I 1 + Ij Φ = Σ f x i i i ; Σ x = 1 ; i i Σ Q i ij = 1 f j Q 1j I 2 + Ij Q = (1-p) p ij n-d(i,j) d(i,j) p... Error rate per digit f j Q 2j d(i,j)... Hamming distance between I i and Ij (A) + I j f j Q jj Ij + Ij [A] = a = constant f j Q nj I n + Ij Chemical kinetics of replication and mutation

Concentration Master sequence Mutant cloud Sequence space The molecular quasispecies in sequence space

The RNA model considers RNA sequences as genotypes and simplified RNA structures, called secondary structures, as phenotypes. The mapping from genotypes into phenotypes is many-to-one. Hence, it is redundant and not invertible. Genotypes, i.e. RNA sequences, which are mapped onto the same phenotype, i.e. the same RNA secondary structure, form neutral networks. Neutral networks are represented by graphs in sequence space.

S k = ψ( I. ) f k = fs ( k ) Sequence space Phenotype space Non-negative numbers Mapping from sequence space into phenotype space and into fitness values

A multi-component neutral network Giant Component

A connected neutral network

Optimization of RNA molecules in silico W.Fontana, P.Schuster, A computer model of evolutionary optimization. Biophysical Chemistry 26 (1987), 123-147 W.Fontana, W.Schnabl, P.Schuster, Physical aspects of evolutionary optimization and adaptation. Phys.Rev.A 40 (1989), 3301-3321 M.A.Huynen, W.Fontana, P.F.Stadler, Smoothness within ruggedness. The role of neutrality in adaptation. Proc.Natl.Acad.Sci.USA 93 (1996), 397-401 W.Fontana, P.Schuster, Continuity in evolution. On the nature of transitions. Science 280 (1998), 1451-1455 W.Fontana, P.Schuster, Shaping space. The possible and the attainable in RNA genotypephenotype mapping. J.Theor.Biol. 194 (1998), 491-515

Evolution in the Flow Reactor: The RNA Model Sequence-structure map Structure-function map Environment ψ : f Ω(t) h s { I; d } { S; d }. ij ij { s S; } R. : d ij + Mapping into fitness values f k ( S, Ω( t) ) = f ( ( I, Ω( t)), Ω( )) ( t) = f ψ t k k dx dt k ( Q f ( t) Φ( t) ) + Q f ( t) x + η ( x, t) ω ( t), k = 1,..., n = xk kk k n j= 1, j k kj j j k k Wiener process dw ( t) = ω ( t) dt

Genotype-Phenotype Mapping I { S { = ( I { ) S { f = ƒ( S ) { { Evaluation of the Phenotype Mutation Q {j f 2 I 2 f 1 I1 I n I 1 f n I 2 f 2 f 1 I n+1 f n+1 f { Q f 3 I 3 Q f 3 I 3 f 4 I 4 I { f { I 4 I 5 f 4 f 5 f 5 I 5 Evolutionary dynamics including molecular phenotypes

Concentration Master sequence Mutant cloud Off-the-cloud mutations Sequence space The molecular quasispecies and mutations producing new variants

Stock Solution Reaction Mixture The flowreactor as a device for studies of evolution in vitro and in silico

50 S Average structure distance to target d 40 30 20 10 Evolutionary trajectory 0 0 250 500 750 1000 1250 Time (arbitrary units) In silico optimization in the flow reactor: Trajectory

36 Relay steps Number of relay step 38 Evolutionary trajectory 1250 44 10 0 40 42 44 Time Average structure distance to target d S Endconformation of optimization

36 Relay steps Number of relay step 38 Evolutionary trajectory 1250 44 43 10 0 40 42 44 Time Average structure distance to target d S Reconstruction of the last step 43 44

36 Relay steps Number of relay step 44 43 Average structure distance to target d S 42 Evolutionary trajectory 1250 10 0 38 40 42 44 Reconstruction of last-but-one step 42 43 ( 44) Time

36 Relay steps Number of relay step 44 43 Average structure distance to target d S 41 42 Evolutionary trajectory 1250 10 0 38 40 42 44 Reconstruction of step 41 42 ( 43 44) Time

36 Relay steps Number of relay step 44 43 Average structure distance to target d S 40 41 42 Evolutionary trajectory 1250 10 0 38 40 42 44 Reconstruction of step 40 41 ( 42 43 44) Time

Average structure distance to target d S 10 Relay steps 36 38 40 42 44 Number of relay step Evolutionary trajectory 0 1250 Time Evolutionary process 39 40 41 42 43 44 Reconstruction Reconstruction of the relay series

Transition inducing point mutations Neutral point mutations Change in RNA sequences during the final five relay steps 39 44

50 Relay steps S Average structure distance to target d 40 30 20 10 Evolutionary trajectory 0 0 250 500 750 1000 1250 Time (arbitrary units) In silico optimization in the flow reactor: Trajectory and relay steps

50 Relay steps S Average structure distance to target d 40 30 20 10 Uninterrupted presence Evolutionary trajectory 0 0 250 500 750 1000 1250 Time (arbitrary units) In silico optimization in the flow reactor: Uninterrupted presence

Number of relay step Average structure distance to target d S 20 10 0 Uninterrupted presence Evolutionary trajectory 250 500 08 10 12 14 Time (arbitrary units) Transition inducing point mutations Neutral point mutations Neutral genotype evolution during phenotypic stasis

Number of relay step Average structure distance to target d S 30 20 10 Uninterrupted presence Evolutionary trajectory 750 1000 1250 20 25 30 35 Time (arbitrary units) 18 20 19 21 26 28 29 31 A random sequence of minor or continuous transitions in the relay series

18 20 19 21 26 28 29 31 A random sequence of minor or continuous transitions in the relay series

Shortening of Stacks Elongation of Stacks Multiloop Minor or continuous transitions: Occur frequently on single point mutations Opening of Constrained Stacks

50 Relay steps S Average structure distance to target d 40 30 20 10 Uninterrupted presence Evolutionary trajectory 0 0 250 500 750 1000 1250 Time (arbitrary units) In silico optimization in the flow reactor: Uninterrupted presence

Main transition leading to clover leaf 36 37 38 Average structure distance to target d S 10 Relay steps Evolutionary trajectory 36 38 40 42 44 Number of relay step 0 1250 Time Reconstruction of a main transitions 36 37 ( 38)

50 Relay steps Main transitions Average structure distance to target d S 40 30 20 10 Evolutionary trajectory 0 0 250 500 750 1000 1250 Time (arbitrary units) In silico optimization in the flow reactor: Main transitions

Shift Roll-Over α α a Flip a α a b Double Flip a α b β β Main or discontinuous transitions: Structural innovations, occur rarely on single point mutations Closing of Constrained Stacks Multiloop

50 Relay steps Main transitions Average structure distance to target d S 40 30 20 10 Uninterrupted presence Evolutionary trajectory 0 0 250 500 750 1000 1250 Time (arbitrary units) In silico optimization in the flow reactor

Variation in genotype space during optimization of phenotypes

Statistics of evolutionary trajectories Population size N Number of replications < n > rep Number of transitions < n > tr Number of main transitions < n > dtr The number of main transitions or evolutionary innovations is constant.

00 09 31 44 Three important steps in the formation of the trna clover leaf from a randomly chosen initial structure corresponding to three main transitions.

Main results of computer simulations of molecular evolution No trajectory was reproducible in detail. Sequences of target structures were different. Nevertheless solutions of comparable or the same quality are almost always achieved. Transitions between molecular phenotypes represented by RNA structures can be classified with respect to the induced structural changes. Highly probable minor transitions are opposed by main transitions with low probability of occurrence. Main transitions represent important innovations in the course of evolution. The number of minor transitions decreases with increasing population size. The number of main transitions or evolutionary innovations is approximately constant for given start and stop structures. Not all structures are accessible through evolution in the flow reactor. An example is the trna clover leaf for GC-only sequences.

...Variations neither useful not injurious would not be affected by natural selection, and would be left either a fluctuating element, as perhaps we see in certain polymorphic species, or would ultimately become fixed, owing to the nature of the organism and the nature of the conditions.... Charles Darwin, Origin of species (1859)

Adaptive Periods End of Walk Fitness Random Drift Periods Start of Walk Genotype Space Evolution in genotype space sketched as a non-descending walk in a fitness landscape

Coworkers Walter Fontana, Santa Fe Institute, NM Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM Peter Stadler, Universität Leipzig, GE Ivo L.Hofacker, Christoph Flamm, Universität Wien, AT Bärbel Stadler, Andreas Wernitznig, Universität Wien, AT Michael Kospach, Ulrike Langhammer, Ulrike Mückstein, Stefanie Widder Jan Cupal, Kurt Grünberger, Andreas Svrček-Seiler, Stefan Wuchty Ulrike Göbel, Institut für Molekulare Biotechnologie, Jena, GE Walter Grüner, Stefan Kopp, Jaqueline Weber