RNA From Mathematical Models to Real Molecules

Similar documents
Neutral Networks of RNA Genotypes and RNA Evolution in silico

How Nature Circumvents Low Probabilities: The Molecular Basis of Information and Complexity. Peter Schuster

Is the Concept of Error Catastrophy Relevant for Viruses? Peter Schuster

Evolution of Biomolecular Structure 2006 and RNA Secondary Structures in the Years to Come. Peter Schuster

Error thresholds on realistic fitness landscapes

Different kinds of robustness in genetic and metabolic networks

RNA Bioinformatics Beyond the One Sequence-One Structure Paradigm. Peter Schuster

Designing RNA Structures

Tracing the Sources of Complexity in Evolution. Peter Schuster

RNA From Mathematical Models to Real Molecules

Von der Thermodynamik zu Selbstorganisation, Evolution und Information

Evolution on simple and realistic landscapes

Self-Organization and Evolution

Chemistry and Evolution at the Origin of Life. Visions and Reality

Complexity in Evolutionary Processes

Evolution and Molecules

Mathematische Probleme aus den Life-Sciences

Evolution on Realistic Landscapes

Mathematical Modeling of Evolution

The Advantage of Using Mathematics in Biology

How computation has changed research in chemistry and biology

Chemistry on the Early Earth

Origin of life and early evolution in the light of present day molecular biology. Peter Schuster

arxiv:physics/ v1 [physics.bio-ph] 27 Jun 2001

A Method for Estimating Mean First-passage Time in Genetic Algorithms

Mechanisms of molecular cooperation

The Role of Topology in the Study of Evolution

COMP598: Advanced Computational Biology Methods and Research

Replication and Mutation on Neutral Networks: Updated Version 2000

Linear Algebra of Eigen s Quasispecies Model

Molecular evolution - Part 1. Pawan Dhar BII

Problem solving by inverse methods in systems biology

Evolutionary Dynamics and Optimization. Neutral Networks as Model-Landscapes. for. RNA Secondary-Structure Folding-Landscapes

Linear Algebra of the Quasispecies Model

The Mathematics of Darwinian Systems

Bridging from Chemistry to the Life Sciences Evolution seen with the Glasses of a Physicist a

Chance and necessity in evolution: lessons from RNA

Systems biology and complexity research

Error Propagation in the Hypercycle

Evolution and Design

The RNA World, Fitness Landscapes, and all that

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME

I N N O V A T I O N L E C T U R E S (I N N O l E C) Petr Kuzmič, Ph.D. BioKin, Ltd. WATERTOWN, MASSACHUSETTS, U.S.A.

Convergence of a Moran model to Eigen s quasispecies model

RNA folding at elementary step resolution

UNIT 5. Protein Synthesis 11/22/16

Systems Biology: A Personal View IX. Landscapes. Sitabhra Sinha IMSc Chennai

(Lys), resulting in translation of a polypeptide without the Lys amino acid. resulting in translation of a polypeptide without the Lys amino acid.

Effects of Neutral Selection on the Evolution of Molecular Species

Quasispecies distribution of Eigen model

Problems on Evolutionary dynamics

Critical Mutation Rate Has an Exponential Dependence on Population Size

EVOLUTIONARY DYNAMICS ON RANDOM STRUCTURES SIMON M. FRASER CHRISTIAN M, REIDYS DISCLAIMER. United States Government or any agency thereof.

EVOLUTIONARY DISTANCES

Lecture 7: Simple genetic circuits I

SI Appendix. 1. A detailed description of the five model systems

Stochastic Processes around Central Dogma

Networks in Molecular Evolution

Record dynamics in spin glassses, superconductors and biological evolution.

The role of form in molecular biology. Conceptual parallelism with developmental and evolutionary biology

CHAPTERS 24-25: Evidence for Evolution and Phylogeny

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics.

Evolving Antibiotic Resistance in Bacteria by Controlling Mutation Rate

Biology Semester 1 Study Guide

Institut für Mathematik, Universität Wien, Strudlhofgasse 4, A-1090 Wien, Austria.

Peter Schuster, Institut für Theoretische Chemie der Universität Wien, Austria Darwin s Optimization in the looking glasses of physics and chemistry

IIASA. Continuity in Evolution: On the Nature of Transitions INTERIM REPORT. IR / April

Evolutionary dynamics on graphs

The Amoeba-Flagellate Transformation

The tree of life: Darwinian chemistry as the evolutionary force from cyanic acid to living molecules and cells

When to use bit-wise neutrality

Population Genetics and Evolution III

Grundlagen der Systembiologie und der Modellierung epigenetischer Prozesse

Why EvoSysBio? Combine the rigor from two powerful quantitative modeling traditions: Molecular Systems Biology. Evolutionary Biology

Formative/Summative Assessments (Tests, Quizzes, reflective writing, Journals, Presentations)

Population Structure

Population Genetics: a tutorial

Neutral Evolution of Mutational Robustness

Foundations in Microbiology Seventh Edition

Chemical Organization Theory

Blank for whiteboard 1

RNA evolution and Genotype to Phenotype maps

Invariant Manifold Reductions for Markovian Ion Channel Dynamics

Relaxation Property and Stability Analysis of the Quasispecies Models

BIOLOGY STANDARDS BASED RUBRIC

From gene to protein. Premedical biology

Optimum selection and OU processes

Plasticity, Evolvability, and Modularity in RNA

Introduction to Polymer Physics

APES C4L2 HOW DOES THE EARTH S LIFE CHANGE OVER TIME? Textbook pages 85 88

Complete Suboptimal Folding of RNA and the Stability of Secondary Structures

Self Similar (Scale Free, Power Law) Networks (I)

Prediction of Locally Stable RNA Secondary Structures for Genome-Wide Surveys

Genotypes and Phenotypes in the Evolution of Molecules*

Scope and Sequence. Course / Grade Title: Biology

Evolutionary dynamics of populations with genotype-phenotype map

QUANTIFYING SLOW EVOLUTIONARY DYNAMICS IN RNA FITNESS LANDSCAPES

EVOLUTIONARY DYNAMICS AND THE EVOLUTION OF MULTIPLAYER COOPERATION IN A SUBDIVIDED POPULATION

Scientists have been measuring organisms metabolic rate per gram as a way of

Error threshold in simple landscapes

Transcription:

RNA From Mathematical Models to Real Molecules 3. Optimization and Evolution of RNA Molecules Peter Schuster Institut für Theoretische hemie und Molekulare Strukturbiologie der Universität Wien IMPA enoma School Valdivia, 12. 16.01.2004

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

RNA molecules Bacteria Higher multicelluar organisms eneration time 10 000 generations 10 6 generations 10 7 generations 10 sec 27.8 h = 1.16 d 115.7 d 3.17 a 1 min 6.94 d 1.90 a 19.01 a 20 min 138.9 d 38.03 a 380 a 10 h 11.40 a 1 140 a 11 408 a 10 d 274 a 27 380 a 273 800 a 20 a 20 000 a 2 10 7 a 2 10 8 a Time scales of evolutionary change

5' 3' Plus Strand Synthesis 5' 3' Plus Strand 3' Synthesis 5' 3' Plus Strand Minus Strand James Watson and Francis rick, 1953 3' 5' 5' Plus Strand 5' Minus Strand omplex Dissociation + 3' 3' omplementary replication as the simplest copying mechanism of RNA omplementarity is determined by Watson-rick base pairs: and A=U

(A) + I 1 f 1 I 2 + I 1 dx 1 / dt = f2 x 2 - x 1 Φ dx 2 / dt = f 1 x 1 - x 2 Φ (A) + I 2 f 2 I 1 + I 2 Φ =Σ i f x i ; Σ x = 1 ; i =1,2 i i i omplementary replication as the simplest molecular mechanism of reproduction

Equation for complementary replication: [I i ] = x i 0, f i > 0 ; i=1,2 Solutions are obtained by integrating factor transformation f x f x f x x f dt dx x x f dt dx = + = = = 2 2 1 1 2 1 1 2 1 2 2 1,, φ φ φ () ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) 2 1 2 2 1 1 2 2 2 1 1 1 1 2 1 1 2 1 2 1 2,1 1,2 (0), (0) (0) (0), (0) (0) exp 0 ) ( exp 0 ) ( exp 0 exp 0 f f f x f x f x f x f t f f f t f f f t f t f f t x = = + = + + = γ γ γ γ γ γ 0 ) exp( as ) ( and ) ( 2 1 1 2 2 1 2 1 + + ft f f f t x f f f t x

5' 3' Plus Strand Minus Strand Plus Strand Minus Strand 5' 3' 3' 5' Plus Strand 3' 5' + 5' 3' Minus Strand 3' 5' Direct replication of DNA is a higly complex copying mechanism involving more than ten different protein molecules. omplementarity is determined by Watson-rick base pairs: and A=T

(A) + I 1 f 1 I 1 + I 1 (A) + I 2 f 2 I 2 + I 2 dx / dt = f x - x i i i i Φ = x (f i - Φ i ) Φ =Σ j f j x j ; Σ x = 1 ; i,j =1,2,...,n j j (A) + I i f i I i + I i [I i] = x i 0 ; i =1,2,...,n ; [A] = a = constant f m = max { f j ; j=1,2,...,n} (A) + I m f m I m + I m x m (t) 1 for t (A) + I n f n I n + I n Reproduction of organisms or replication of molecules as the basis of selection

Selection equation: [I i ] = x i 0, f i > 0 n n ( f φ ), i = 1,2, L, n; x = 1; φ = f x f dx i = xi i i i j j j dt = = 1 = 1 Mean fitness or dilution flux, φ (t), is a non-decreasing function of time, n dφ = dt i= 1 f i dx dt i = f 2 ( ) 2 f = var{ f } 0 Solutions are obtained by integrating factor transformation ( 0) exp( f t) xi i xi () t = ; i = 1,2, L, n n x ( ) ( f jt) j = j 0 exp 1

s = ( f 2 -f 1 ) / f 1 ; f 2 > f 1 ; x 1 (0) = 1-1/N ; x 2 (0) = 1/N 1 Fraction of advantageous variant 0.8 0.6 0.4 0.2 s = 0.1 s = 0.02 s = 0.01 0 0 200 400 600 800 1000 Time [enerations] Selection of advantageous mutants in populations of N = 10 000 individuals

hanges in RNA sequences originate from replication errors called mutations. Mutations occur uncorrelated to their consequences in the selection process and are, therefore, commonly characterized as random elements of evolution.

5' 3' Plus Strand 5' 3' AAUAA AAUUAA Plus Strand Insertion 3' 5' 3' Minus Strand AAUAA AAUA Plus Strand 5' 3' Deletion Point Mutation The origins of changes in RNA sequences are replication errors called mutations.

Theory of molecular evolution M.Eigen, Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften 58 (1971), 465-526.J. Thompson, J.L. McBride, On Eigen's theory of the self-organization of matter and the evolution of biological macromolecules. Math. Biosci. 21 (1974), 127-142 B.L. Jones, R.H. Enns, S.S. Rangnekar, On the theory of selection of coupled macromolecular systems. Bull.Math.Biol. 38 (1976), 15-28 M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part A: Emergence of the hypercycle. Naturwissenschaften 58 (1977), 465-526 M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part B: The abstract hypercycle. Naturwissenschaften 65 (1978), 7-41 M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part : The realistic hypercycle. Naturwissenschaften 65 (1978), 341-369 J. Swetina, P. Schuster, Self-replication with errors - A model for polynucleotide replication. Biophys.hem. 16 (1982), 329-345 J.S. Mcaskill, A localization threshold for macromolecular quasispecies from continuously distributed replication rates. J.hem.Phys. 80 (1984), 5194-5202 M.Eigen, J.Mcaskill, P.Schuster, The molecular quasispecies. Adv.hem.Phys. 75 (1989), 149-263. Reidys,.Forst, P.Schuster, Replication and mutation on neutral networks. Bull.Math.Biol. 63 (2001), 57-94

hemical kinetics of molecular evolution M. Eigen, P. Schuster, `The Hypercycle, Springer-Verlag, Berlin 1979

I 1 + I j f j Q j1 I 2 + I j Σ dx / dt = f x - x i j jq ji j i Φ Φ = Σ f x j j i ; Σ x = 1 ; j j Σ Q i ij = 1 f j Q j2 f j Q ji I i + I j [I i ] = x i 0 ; i =1,2,...,n ; [A] = a = constant (A) + I j f j Q jj I j + I j Q = (1-) p ij l-d(i,j) p d(i,j) p... Error rate per digit f j Q jn I n + I j l... hain length of the polynucleotide d(i,j)... Hamming distance between I i and I j hemical kinetics of replication and mutation as parallel reactions

... AU... d =2 d =1 H... A U...... UU... H d =1 H... UU... ity-block distance in sequence space 2D Sketch of sequence space Single point mutations as moves in sequence space

Mutation-selection equation: [I i ] = x i 0, f i > 0, Q ij 0 dx i n n n = f Q x x i n x f x j j ji j i φ, = 1,2, L, ; i i = 1; φ = j j j dt = = 1 = 1 = 1 f Solutions are obtained after integrating factor transformation by means of an eigenvalue problem x i () t = n 1 k n j= 1 ( 0) exp( λkt) c ( 0) exp( λ t) l = 0 ik ck ; i = 1,2, L, n; c (0) = n 1 k l k= 0 jk k k n i= 1 h ki x i (0) W 1 { f Q ; i, j= 1,2, L, n} ; L = { l ; i, j= 1,2, L, n} ; L = H = { h ; i, j= 1,2, L, n} i ij ij ij { λ ; k = 0,1,, n 1} 1 L W L = Λ = k L

oncentration Master sequence Mutant cloud Sequence space The molecular quasispecies in sequence space

Quasispecies as a function of the replication accuracy q

In evolution variation occurs on genotypes but selection operates on the phenotype. Mappings from genotypes into phenotypes are highly complex objects. The only computationally accessible case is in the evolution of RNA molecules. The mapping from RNA sequences into secondary structures and function, sequence structure function, is used as a model for the complex relations between genotypes and phenotypes. Fertile progeny measured in terms of fitness in population biology is determined quantitatively by replication rate constants of RNA molecules. Population biology Molecular genetics Evolution of RNA molecules enotype enome RNA sequence Phenotype Organism RNA structure and function Fitness Reproductive success Replication rate constant The RNA model

Optimized element: RNA structure

Hamming distance d (S,S ) = H 1 2 4 (i) (ii) (iii) d (S,S) = 0 H 1 1 d (S,S ) = d (S,S ) H 1 2 H 2 1 d (S,S) d (S,S) + d (S,S) H 1 3 H 1 2 H 2 3 The Hamming distance between structures in parentheses notation forms a metric in structure space

Replication rate constant: f k = / [ + d S (k) ] d S (k) = d H (S k,s ) f 6 f 7 f5 f 0 f f 4 f 1 f 2 f 3 Evaluation of RNA secondary structures yields replication rate constants

Stock Solution Reaction Mixture Replication rate constant: f k = / [ + d S (k) ] d S (k) = d H (S k,s ) Selection constraint: # RNA molecules is controlled by the flow N ( t) N ± N The flowreactor as a device for studies of evolution in vitro and in silico

3'-End 5'-End 70 60 10 50 20 30 40 Randomly chosen initial structure Phenylalanyl-tRNA as target structure

oncentration Master sequence Mutant cloud Off-the-cloud mutations Sequence space The molecular quasispecies in sequence space

enotype-phenotype Mapping I { S { = ( I { ) S { f = ƒ( S ) { { Evaluation of the Phenotype Mutation Q {j f 2 I 2 f 1 I1 I n I 1 f n I 2 f 2 f 1 I n+1 f n+1 f { Q f 3 I 3 Q f 3 I 3 f 4 I 4 I { f { I 4 I 5 f 4 f 5 f 5 I 5 Evolutionary dynamics including molecular phenotypes

50 S Average distance from initial structure 50 - d 40 30 20 10 Evolutionary trajectory 0 0 250 500 750 1000 1250 Time (arbitrary units) In silico optimization in the flow reactor: Trajectory (biologists view)

50 S Average structure distance to target d 40 30 20 10 Evolutionary trajectory 0 0 250 500 750 1000 1250 Time (arbitrary units) In silico optimization in the flow reactor: Trajectory (physicists view)

36 Relay steps Number of relay step 38 Evolutionary trajectory 1250 44 10 0 40 42 44 Time Average structure distance to target d S Endconformation of optimization

36 Relay steps Number of relay step 38 Evolutionary trajectory 1250 44 43 10 0 40 42 44 Time Average structure distance to target d S Reconstruction of the last step 43 44

36 Relay steps Number of relay step 44 43 Average structure distance to target d S 42 Evolutionary trajectory 1250 10 0 38 40 42 44 Reconstruction of last-but-one step 42 43 ( 44) Time

36 Relay steps Number of relay step 44 43 Average structure distance to target d S 41 42 Evolutionary trajectory 1250 10 0 38 40 42 44 Reconstruction of step 41 42 ( 43 44) Time

36 Relay steps Number of relay step 44 43 Average structure distance to target d S 40 41 42 Evolutionary trajectory 1250 10 0 38 40 42 44 Reconstruction of step 40 41 ( 42 43 44) Time

Average structure distance to target d S 10 Relay steps 36 38 40 42 44 Number of relay step Evolutionary trajectory 0 1250 Time Evolutionary process 39 40 41 42 43 44 Reconstruction Reconstruction of the relay series

Transition inducing point mutations Neutral point mutations hange in RNA sequences during the final five relay steps 39 44

50 Relay steps S Average structure distance to target d 40 30 20 10 Evolutionary trajectory 0 0 250 500 750 1000 1250 Time (arbitrary units) In silico optimization in the flow reactor: Trajectory and relay steps

50 Relay steps Main transitions Average structure distance to target d S 40 30 20 10 Evolutionary trajectory 0 0 250 500 750 1000 1250 Time (arbitrary units) In silico optimization in the flow reactor: Main transitions

00 09 31 44 Three important steps in the formation of the trna clover leaf from a randomly chosen initial structure corresponding to three main transitions.

10-1 3'-End Frequency of occurrence 10-2 10-3 10-4 2 5 10 20 5'-End 10 30 40 70 50 60 10-5 Frequent neighbors Rare neighbors Minor transitions Main transitions 10-6 10 0 10 1 10 2 10 3 10 4 10 5 Rank Probability of occurrence of different structures in the mutational neighborhood of trna phe

Definition of an -neighborhood of structure S k Y(S k )... set of all structures occurring in the Hamming distance one neighborhood of the neutral network k of S k jk... number of contacts between the two neutral networks j and k jk = kj Probability of occurrence: ρ (S j ; S k ) = γ jk n( κ 1) k ; ρ (S k ; S j ) ρ (S j ; S k ) { S Υ(S ) ρ(s ;S ) ε } ε neighborhood of S : Ψε (S ) = > k k j k j k

AU Movies of optimization trajectories over the AU and the alphabet

0.2 0.15 Frequency 0.1 0.05 0 0 1000 2000 3000 4000 5000 Runtime of trajectories Statistics of the lengths of trajectories from initial structure to target (AU-sequences)

0.3 0.25 Main transitions 0.2 Frequency 0.15 0.1 All transitions 0.05 0 0 20 40 60 80 100 Number of transitions Statistics of the numbers of transitions from initial structure to target (AU-sequences)

Alphabet Runtime Transitions Main transitions No. of runs AU 385.6 22.5 12.6 1017 U 448.9 30.5 16.5 611 2188.3 40.0 20.6 107 Statistics of trajectories and relay series (mean values of log-normal distributions)

Number of relay step 28 neutral point mutations during a long quasi-stationary epoch Average structure distance to target d S 20 10 0 Uninterrupted presence Evolutionary trajectory 250 500 08 10 12 14 Time (arbitrary units) Transition inducing point mutations Neutral point mutations Neutral genotype evolution during phenotypic stasis

Variation in genotype space during optimization of phenotypes Mean Hamming distance within the population and drift velocity of the population center in sequence space.

Spread of population in sequence space during a quasistationary epoch: t = 150

Spread of population in sequence space during a quasistationary epoch: t = 170

Spread of population in sequence space during a quasistationary epoch: t = 200

Spread of population in sequence space during a quasistationary epoch: t = 350

Spread of population in sequence space during a quasistationary epoch: t = 500

Spread of population in sequence space during a quasistationary epoch: t = 650

Spread of population in sequence space during a quasistationary epoch: t = 820

Spread of population in sequence space during a quasistationary epoch: t = 825

Spread of population in sequence space during a quasistationary epoch: t = 830

Spread of population in sequence space during a quasistationary epoch: t = 835

Spread of population in sequence space during a quasistationary epoch: t = 840

Spread of population in sequence space during a quasistationary epoch: t = 845

Spread of population in sequence space during a quasistationary epoch: t = 850

Spread of population in sequence space during a quasistationary epoch: t = 855

Massif entral Examples of smooth landscapes on Earth Mount Fuji

Dolomites Examples of rugged landscapes on Earth Bryce anyon

Fitness End of Walk Start of Walk enotype Space Evolutionary optimization in absence of neutral paths in sequence space

Adaptive Periods End of Walk Fitness Random Drift Periods Start of Walk enotype Space Evolutionary optimization including neutral paths in sequence space

rand anyon Example of a landscape on Earth with neutral ridges and plateaus

Neutral ridges and plateus

Acknowledgement of support Fonds zur Förderung der wissenschaftlichen Forschung (FWF) Projects No. 09942, 10578, 11065, 13093 13887, and 14898 Universität Wien Jubiläumsfonds der Österreichischen Nationalbank Project No. Nat-7813 European ommission: Project No. EU-980189 Siemens A, Austria The Santa Fe Institute and the Universität Wien The software for producing RNA movies was developed by Robert iegerich and coworkers at the Universität Bielefeld

oworkers Walter Fontana, Santa Fe Institute, NM Universität Wien hristian Reidys, hristian Forst, Los Alamos National Laboratory, NM Peter Stadler, Bärbel Stadler, Universität Leipzig, E Ivo L.Hofacker, hristoph Flamm, Universität Wien, AT Andreas Wernitznig, Michael Kospach, Universität Wien, AT Ulrike Langhammer, Ulrike Mückstein, Stefanie Widder Jan upal, Kurt rünberger, Andreas Svrček-Seiler, Stefan Wuchty Ulrike öbel, Institut für Molekulare Biotechnologie, Jena, E Walter rüner, Stefan Kopp, Jaqueline Weber

Web-Page for further information: http://www.tbi.univie.ac.at/~pks