How Nature Circumvents Low Probabilities: The Molecular Basis of Information and Complexity. Peter Schuster

How Nature Circumvents Low Probabilities: The Molecular Basis of Information and Complexity
Peter Schuster, Institut für Theoretische Chemie, Universität Wien, Austria
Nonlinearity, Fluctuations, and Complexity, Brussels, 16.-19.03.2005

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

Protein folding: Levinthal's paradox. How can Nature find the native conformation in the folding process?
Evolution: Wigner's paradox. How can Nature find the optimal sequence of a protein in the evolutionary optimization process?
Lysozyme, a small protein molecule with n = 130 amino acid residues:
\[ 6^{130} = 1.44 \times 10^{101} \ \text{conformations}, \qquad 20^{130} = 1.36 \times 10^{169} \ \text{sequences} \]
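
The two counts quoted for lysozyme can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `power_as_sci` is made up here, and the figures of 6 conformations per residue and 20 amino acids per position are taken from the slide.

```python
import math

def power_as_sci(base, n):
    """Return (mantissa, exponent) such that base**n = mantissa * 10**exponent."""
    log10_value = n * math.log10(base)
    exponent = math.floor(log10_value)
    mantissa = 10 ** (log10_value - exponent)
    return mantissa, exponent

n = 130                              # residues in lysozyme (from the slide)
conformations = power_as_sci(6, n)   # 6 conformations per residue
sequences = power_as_sci(20, n)      # 20 amino acids per position

print(f"conformations: {conformations[0]:.2f} x 10^{conformations[1]}")
print(f"sequences:     {sequences[0]:.2f} x 10^{sequences[1]}")
```

Working with logarithms avoids formatting a 170-digit integer and makes the order of magnitude explicit.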

1. Solutions to Levinthal's paradox
2. Selection as solution to low probabilities in evolution
3. Origin of information by mutation and selection
4. Evolution of RNA phenotypes
5. The role of neutrality in molecular evolution
6. An experiment with RNA molecules
7. Summary

1. Solutions to Levinthal's paradox

The golf course landscape: Levinthal's paradox. K.A. Dill, H.S. Chan, Nature Struct. Biol. 4:10-19, 1997

The pathway landscape: the pathway solution to Levinthal's paradox. K.A. Dill, H.S. Chan, Nature Struct. Biol. 4:10-19, 1997

The folding funnel: the answer to Levinthal's paradox. K.A. Dill, H.S. Chan, Nature Struct. Biol. 4:10-19, 1997

A more realistic folding funnel: the answer to Levinthal's paradox. K.A. Dill, H.S. Chan, Nature Struct. Biol. 4:10-19, 1997

An "all (or many) paths lead to Rome" situation (N = native conformation). A reconstructed free energy surface for lysozyme folding: C.M. Dobson, A. Šali, and M. Karplus, Angew. Chem. Internat. Ed. 37:868-893, 1998

2. Selection as solution to low probabilities in evolution

Charles Robert Darwin, 1809-1882, and Alfred Russel Wallace, 1823-1913: the two competitors in the formulation of evolution by natural selection. Figure: earlier abstract of the Origin of Species.

Reproduction of organisms or replication of molecules as the basis of selection

Reaction scheme, with [A] = a = constant and [I_i] = x_i ≥ 0:
\[ (A) + I_i \xrightarrow{\; f_i \;} I_i + I_i , \qquad i = 1, 2, \ldots, n \]
\[ \frac{dx_i}{dt} = f_i x_i - x_i \Phi = x_i (f_i - \Phi), \qquad \Phi = \sum_{j=1}^{n} f_j x_j , \qquad \sum_{j=1}^{n} x_j = 1 \]
\[ f_m = \max\{ f_j ;\ j = 1, 2, \ldots, n \} \quad \Longrightarrow \quad x_m(t) \to 1 \ \text{for} \ t \to \infty \]

Selection equation: [I_i] = x_i ≥ 0, f_i > 0
\[ \frac{dx_i}{dt} = x_i (f_i - \phi), \quad i = 1, 2, \ldots, n; \qquad \sum_{i=1}^{n} x_i = 1; \qquad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f} \]
Mean fitness or dilution flux, φ(t), is a non-decreasing function of time:
\[ \frac{d\phi}{dt} = \sum_{i=1}^{n} f_i \frac{dx_i}{dt} = \overline{f^2} - \left( \bar{f} \right)^2 = \operatorname{var}\{ f \} \ge 0 \]
Solutions are obtained by integrating factor transformation:
\[ x_i(t) = \frac{x_i(0) \exp(f_i t)}{\sum_{j=1}^{n} x_j(0) \exp(f_j t)} , \quad i = 1, 2, \ldots, n \]
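
The closed-form solution and the monotonic increase of the mean fitness can be illustrated numerically. The sketch below assumes three hypothetical variants with made-up fitness values and initial frequencies; only the formulas come from the slide.

```python
import math

def selection_solution(x0, f, t):
    """Closed form: x_i(t) = x_i(0) exp(f_i t) / sum_j x_j(0) exp(f_j t)."""
    weights = [xi * math.exp(fi * t) for xi, fi in zip(x0, f)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_fitness(x, f):
    """phi = sum_j f_j x_j."""
    return sum(xi * fi for xi, fi in zip(x, f))

def fitness_variance(x, f):
    """var{f} = mean(f^2) - mean(f)^2, taken over the population x."""
    return sum(xi * fi * fi for xi, fi in zip(x, f)) - mean_fitness(x, f) ** 2

# Hypothetical example values (not from the talk).
f = [1.0, 1.5, 2.0]
x0 = [0.90, 0.09, 0.01]
times = [0.0, 2.0, 5.0, 10.0, 20.0]

phis = [mean_fitness(selection_solution(x0, f, t), f) for t in times]
for t, phi in zip(times, phis):
    x = selection_solution(x0, f, t)
    print(f"t = {t:5.1f}  phi = {phi:.4f}  x = {[round(v, 4) for v in x]}")
```

In this example the fittest variant, which starts at 1 % of the population, takes over almost completely, and φ(t) rises from its initial value towards f_m = 2.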

Selection of advantageous mutants in populations of N = 10 000 individuals
\[ s = \frac{f_2 - f_1}{f_1}, \quad f_2 > f_1; \qquad x_1(0) = 1 - \frac{1}{N}, \qquad x_2(0) = \frac{1}{N} \]
Figure: fraction of the advantageous variant (0 to 1) versus time in generations (0 to 1000), with curves for s = 0.1, s = 0.02, and s = 0.01.
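
For two variants the selection equation has the solution x_2(t) = 1 / (1 + (N - 1) exp(-s f_1 t)) with the initial condition x_2(0) = 1/N. The sketch below evaluates it for the three selection coefficients in the figure. It assumes f_1 = 1 per generation, which is a choice made here for illustration and is not stated on the slide.

```python
import math

def fraction_advantageous(t, s, N, f1=1.0):
    """x_2(t) for two competing variants, starting from x_2(0) = 1/N."""
    return 1.0 / (1.0 + (N - 1) * math.exp(-s * f1 * t))

def half_time(s, N, f1=1.0):
    """Time at which the advantageous variant reaches a fraction of 1/2."""
    return math.log(N - 1) / (s * f1)

N = 10_000
for s in (0.1, 0.02, 0.01):
    t_half = half_time(s, N)
    print(f"s = {s:<5} half-time = {t_half:7.1f} generations, "
          f"x2(1000) = {fraction_advantageous(1000, s, N):.3f}")
```

Under this assumption the half-time scales as ln(N - 1) / s, so even a 1 % advantage fixes a single mutant in a population of 10 000 on the time scale of about a thousand generations.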

3. Origin of information by mutation and selection

Chemical kinetics of replication and mutation as parallel reactions

Reaction scheme, with [A] = a = constant and [I_i] = x_i ≥ 0:
\[ (A) + I_j \xrightarrow{\; f_j Q_{ji} \;} I_i + I_j , \qquad i = 1, 2, \ldots, n \]
\[ \frac{dx_i}{dt} = \sum_{j} f_j Q_{ji} x_j - x_i \Phi , \qquad \Phi = \sum_{j} f_j x_j , \qquad \sum_{j} x_j = 1 , \qquad \sum_{i} Q_{ij} = 1 \]
\[ Q_{ij} = (1 - p)^{\, l - d(i,j)} \, p^{\, d(i,j)} \]
p ... error rate per digit
l ... chain length of the polynucleotide
d(i,j) ... Hamming distance between I_i and I_j
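
As a small illustration of the uniform error model, the following sketch builds the matrix Q_ij = (1 - p)^(l - d(i,j)) p^d(i,j) for all binary sequences of a short chain length and checks that every row sums to one. The binary alphabet and the values l = 4 and p = 0.05 are assumptions chosen to keep the example small.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def mutation_matrix(l, p):
    """Return (sequences, Q) for all binary sequences of chain length l."""
    seqs = list(product((0, 1), repeat=l))
    Q = [[(1 - p) ** (l - hamming(a, b)) * p ** hamming(a, b) for b in seqs]
         for a in seqs]
    return seqs, Q

seqs, Q = mutation_matrix(l=4, p=0.05)   # hypothetical example values
print("number of sequences:", len(seqs))
print("probability of error-free copying:", round(Q[0][0], 4))
print("row sums:", {round(sum(row), 12) for row in Q})
```

Because the matrix depends only on the Hamming distance it is symmetric, so row and column sums coincide.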

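Mutation-selection equation: [I_i] = x_i ≥ 0, f_i > 0, Q_ij ≥ 0
\[ \frac{dx_i}{dt} = \sum_{j=1}^{n} f_j Q_{ji} x_j - x_i \phi , \quad i = 1, 2, \ldots, n; \qquad \sum_{i=1}^{n} x_i = 1; \qquad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f} \]
Solutions are obtained after integrating factor transformation by means of an eigenvalue problem:
\[ x_i(t) = \frac{\sum_{k=0}^{n-1} l_{ik} \, c_k(0) \exp(\lambda_k t)}{\sum_{j=1}^{n} \sum_{k=0}^{n-1} l_{jk} \, c_k(0) \exp(\lambda_k t)} , \quad i = 1, 2, \ldots, n; \qquad c_k(0) = \sum_{i=1}^{n} h_{ki} \, x_i(0) \]
\[ W \doteq \{ f_i Q_{ij} ;\ i, j = 1, 2, \ldots, n \}; \qquad L = \{ l_{ij} ;\ i, j = 1, 2, \ldots, n \}; \qquad L^{-1} = H = \{ h_{ij} ;\ i, j = 1, 2, \ldots, n \} \]
\[ L^{-1} \cdot W \cdot L = \Lambda = \{ \lambda_k ;\ k = 0, 1, \ldots, n - 1 \} \]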
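
In the long-time limit the solution is dominated by the largest eigenvalue, so the stationary distribution (the quasispecies) is the normalized dominant eigenvector of the matrix with entries f_j Q_ji. The sketch below finds it by power iteration instead of a full eigendecomposition. The single-peak fitness landscape (master sequence f = 10, all others f = 1), the binary alphabet, and the chain length l = 5 are assumptions made only for this illustration.

```python
from itertools import product

def quasispecies(l, p, f_master=10.0, f_other=1.0, iterations=300):
    """Stationary mutant distribution on a hypothetical single-peak landscape."""
    seqs = list(product((0, 1), repeat=l))
    n = len(seqs)
    # The all-zero sequence plays the role of the master sequence.
    f = [f_master if sum(s) == 0 else f_other for s in seqs]

    def q(a, b):
        d = sum(x != y for x, y in zip(a, b))
        return (1 - p) ** (l - d) * p ** d

    # W[i][j] = f_j * Q_ji: rate at which copying I_j produces I_i.
    W = [[f[j] * q(seqs[j], seqs[i]) for j in range(n)] for i in range(n)]

    x = [1.0 / n] * n
    for _ in range(iterations):          # power iteration with renormalization
        y = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [v / total for v in y]
    return seqs, x

for p in (0.01, 0.05, 0.1, 0.3, 0.5):
    _, x = quasispecies(l=5, p=p)
    print(f"p = {p:<4}  master fraction = {x[0]:.4f}")
```

The printed master fraction decreases as the error rate grows, and at p = 0.5 every sequence is equally frequent, which corresponds to the uniform distribution.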

The quasispecies on the concentration simplex
\[ S_3 = \Big\{ x_i \ge 0,\ i = 1, 2, 3;\ \sum_{i=1}^{3} x_i = 1 \Big\} \]

Quasispecies as a function of the replication accuracy q. Figure: stationary distribution versus error rate p = 1 - q (axis from 0.00 to 0.10), with the quasispecies at low error rates and the uniform distribution at high error rates.

The molecular quasispecies in sequence space. Figure: concentration plotted over sequence space, showing the master sequence, the mutant cloud, and off-the-cloud mutations.

4. Evolution of RNA phenotypes

Computer simulation of RNA optimization
Walter Fontana and Peter Schuster, Biophysical Chemistry 26:123-147, 1987
Walter Fontana, Wolfgang Schnabl, and Peter Schuster, Phys. Rev. A 40:3301-3321, 1989

Evolution in silico W. Fontana, P. Schuster, Science 280 (1998), 1451-1455

Mapping from sequence space into structure space and into function

Neutral networks are sets of sequences forming the same object in a phenotype space. The neutral network G_k is, for example, the preimage of the structure S_k in sequence space:
\[ G_k = \psi^{-1}(S_k) \doteq \{ I_j \mid \psi(I_j) = S_k \} \]
The set is converted into a graph by connecting all sequences of Hamming distance one.
Neutral networks of small biomolecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length n. Their number, N = 4^n, grows so fast with chain length that exhaustive folding soon becomes prohibitive for numerical computation.
Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the preimage, i.e. the number of neutral sequences, matches the neutral network to be studied.
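
The construction (enumerate a complete sequence space, collect the preimage of one phenotype, connect sequences at Hamming distance one) can be sketched on a toy example. The map `toy_fold` below is a hypothetical stand-in for a real RNA folding algorithm: it only asks whether the first and last nucleotide can form a Watson-Crick pair. It is not the folding map used in the talk.

```python
from itertools import product

ALPHABET = "AUGC"
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def toy_fold(seq):
    """Hypothetical phenotype: 'paired' if the two ends can base-pair."""
    return "paired" if (seq[0], seq[-1]) in PAIRS else "open"

def neutral_network(n, target):
    """Preimage of `target`: all sequences of length n that map onto it."""
    return {s for s in map("".join, product(ALPHABET, repeat=n))
            if toy_fold(s) == target}

def neighbours(seq):
    """All sequences at Hamming distance one."""
    for i, old in enumerate(seq):
        for new in ALPHABET:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]

def component_sizes(nodes):
    """Sizes of the connected components of the neutral network graph."""
    unseen = set(nodes)
    sizes = []
    while unseen:
        stack = [unseen.pop()]
        size = 0
        while stack:
            current = stack.pop()
            size += 1
            for other in neighbours(current):
                if other in unseen:
                    unseen.remove(other)
                    stack.append(other)
        sizes.append(size)
    return sorted(sizes, reverse=True)

network = neutral_network(4, "paired")
print(len(network), "of", 4 ** 4, "sequences fold into 'paired'")
print("component sizes:", component_sizes(network))
```

In this toy map the network of the 'paired' phenotype splits into four components, because switching between two base pairs requires two simultaneous point mutations. This illustrates why the connectivity of a neutral network is not automatic.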

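Mean degree of neutrality and connectivity of neutral networks
\[ G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \} \]
\[ \lambda_j = 12 / 27 = 0.444 , \qquad \bar{\lambda}_k = \frac{\sum_{j \in G_k} \lambda_j(k)}{|G_k|} \]
Connectivity threshold, with κ the alphabet size (AUGC: κ = 4):
\[ \lambda_{cr} = 1 - \kappa^{-1/(\kappa - 1)} \]
\[ \bar{\lambda}_k > \lambda_{cr} : \ \text{network } G_k \text{ is connected}; \qquad \bar{\lambda}_k < \lambda_{cr} : \ \text{network } G_k \text{ is not connected} \]

κ    λ_cr     Alphabet
2    0.5      GC, AU
3    0.423    GUC, AUG
4    0.370    AUGC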
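
The threshold values in the table follow directly from λ_cr = 1 - κ^(-1/(κ-1)). A short check, with the function name `connectivity_threshold` chosen here for illustration:

```python
def connectivity_threshold(kappa):
    """lambda_cr = 1 - kappa**(-1/(kappa - 1)) for an alphabet of size kappa."""
    return 1.0 - kappa ** (-1.0 / (kappa - 1))

for kappa, alphabet in [(2, "GC, AU"), (3, "GUC, AUG"), (4, "AUGC")]:
    print(f"kappa = {kappa}  lambda_cr = {connectivity_threshold(kappa):.3f}  ({alphabet})")

# The example value from the slide: 12 neutral neighbours out of 27.
degree_of_neutrality = 12 / 27
print("connected for AUGC:", degree_of_neutrality > connectivity_threshold(4))
```

If the mean degree of neutrality of a network equalled the slide's example value of 0.444, it would lie above the four-letter threshold of 0.370.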

A connected neutral network formed by a common structure

A multi-component neutral network formed by a rare structure (figure label: giant component)

Properties of RNA sequence to secondary structure mapping
1. More sequences than structures
2. Few common versus many rare structures (figure: RNA secondary structures and Zipf's law; n = 100, stem-loop structures; n = 30)
3. Shape space covering of common structures
4. Neutral networks of common structures are connected

Evolutionary dynamics including molecular phenotypes. Figure: genotype-phenotype mapping (sequence I to structure S), evaluation of the phenotype (structure S to replication rate constant f = f(S)), and mutation (matrix Q) connecting the sequences I_1, I_2, ..., I_n, I_{n+1} with rate constants f_1, f_2, ..., f_n, f_{n+1}.

The flow reactor as a device for studies of evolution in vitro and in silico
Replication rate constant:
\[ f_k = \frac{\gamma}{\alpha + \Delta d_S^{(k)}} , \qquad \Delta d_S^{(k)} = d_H(S_k, S_\tau) \]
Selection constraint: the population size, N = number of RNA molecules, is controlled by the flow:
\[ N(t) \approx \bar{N} \pm \sqrt{\bar{N}} \]
Mutation rate: p = 0.001 per site and replication

Evaluation of RNA secondary structures yields replication rate constants
\[ f_k = \frac{\gamma}{\alpha + \Delta d_S^{(k)}} , \qquad \Delta d_S^{(k)} = d_H(S_k, S_\tau) \]
Figure: structures with their replication rate constants f_0, f_1, ..., f_7.
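
How a structure is turned into a replication rate constant can be sketched as follows. The sketch assumes a hyperbolic form f = gamma / (alpha + d), where d is the Hamming distance between the dot-bracket strings of a structure and the target. The constants `alpha` and `gamma`, their default values, and the two short structures are hypothetical and chosen only for illustration.

```python
def structure_distance(structure, target):
    """Hamming distance between two dot-bracket strings of equal length."""
    if len(structure) != len(target):
        raise ValueError("structures must have the same length")
    return sum(a != b for a, b in zip(structure, target))

def replication_rate(structure, target, alpha=1.0, gamma=1.0):
    """Assumed form f = gamma / (alpha + d); alpha and gamma are placeholders."""
    return gamma / (alpha + structure_distance(structure, target))

target = "((((...))))"                      # hypothetical target structure
for candidate in ("((((...))))", "(((.....)))", "..........."):
    d = structure_distance(candidate, target)
    print(f"{candidate}  d = {d}  f = {replication_rate(candidate, target):.3f}")
```

With this form the rate constant is largest for the target itself and decreases monotonically with the distance to it, which is what drives the population towards the target structure.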

Figure: a randomly chosen initial structure, and phenylalanyl-tRNA as target structure (with 5'-end and 3'-end marked).

In silico optimization in the flow reactor: evolutionary trajectory. Figure: average structure distance to target versus time (arbitrary units).

Neutral genotype evolution during phenotypic stasis: 28 neutral point mutations during a long quasi-stationary epoch. Figure: average structure distance to target, uninterrupted presence, and number of relay steps versus time (arbitrary units), with transition-inducing point mutations and neutral point mutations marked.

5. The role of neutrality in molecular evolution

Figure panels: evolutionary trajectory; spreading of the population through diffusion on the neutral network; velocity of the population center in sequence space.

Spread of population in sequence space during a quasistationary epoch. Snapshots at t = 150, 170, 200, 350, 500, 650, 820, 825, 830, 835, 840, 845, 850, and 855.

6. An experiment with RNA molecules

A ribozyme switch E.A.Schultes, D.B.Bartel, Science 289 (2000), 448-452

Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)

The sequence at the intersection: an RNA molecule that is 88 nucleotides long and can form both structures

Two neutral walks through sequence space with conservation of structure and catalytic activity

7. Summary

Example of a smooth landscape on Earth Mount Fuji

Dolomites Bryce Canyon Examples of rugged landscapes on Earth

Evolutionary optimization in absence of neutral paths in sequence space. Figure: fitness versus genotype space, with start of walk and end of walk marked.

Evolutionary optimization including neutral paths in sequence space. Figure: fitness versus genotype space, with start of walk, adaptive periods, random drift periods, and end of walk marked.

Example of a landscape on Earth with neutral ridges and plateaus Grand Canyon

Conformational and mutational landscapes of biomolecules as well as fitness landscapes of evolutionary biology are rugged. Adaptive or non-descending walks on rugged landscapes commonly end at one of the low-lying local maxima.

Selective neutrality in the form of neutral networks plays an active role in evolutionary optimization and enables populations to reach high local maxima or even the global optimum.

Acknowledgement of support
Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
European Commission: Project No. EU-980189
Austrian Genome Research Program GEN-AU
Siemens AG, Austria
Universität Wien and the Santa Fe Institute

Coworkers
Walter Fontana, Harvard Medical School, MA
Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
Peter Stadler, Bärbel Stadler, Universität Leipzig, GE
Jord Nagel, Kees Pleij, Universiteit Leiden, NL
Ivo L. Hofacker, Christoph Flamm, Universität Wien, AT
Andreas Wernitznig, Michael Kospach, Universität Wien, AT
Ulrike Langhammer, Ulrike Mückstein, Stefanie Widder
Jan Cupal, Kurt Grünberger, Andreas Svrcek-Seiler, Stefan Wuchty
Stefan Bernhart, Lukas Endler
Ulrike Göbel, Institut für Molekulare Biotechnologie, Jena, GE
Walter Grüner, Stefan Kopp, Jaqueline Weber
