Molecular models of biological evolution have attracted

Similar documents
arxiv:physics/ v1 [physics.bio-ph] 27 Jun 2001

Quasispecies distribution of Eigen model

Relaxation Property and Stability Analysis of the Quasispecies Models

Linear Algebra of the Quasispecies Model

Linear Algebra of Eigen s Quasispecies Model

Critical Mutation Rate Has an Exponential Dependence on Population Size

Error threshold in simple landscapes

The 6-vertex model of hydrogen-bonded crystals with bond defects

Error thresholds on realistic fitness landscapes

arxiv: v1 [cond-mat.stat-mech] 6 Mar 2008

Complex Systems Theory and Evolution

A Modified Earthquake Model Based on Generalized Barabási Albert Scale-Free

A Method for Estimating Mean First-passage Time in Genetic Algorithms

Convergence of a Moran model to Eigen s quasispecies model

Numerical evaluation of the upper critical dimension of percolation in scale-free networks

Field Theory of a Reaction- Diffusion Model of Quasispecies Dynamics

Selective Pressures on Genomes in Molecular Evolution

NATURAL SELECTION FOR WITHIN-GENERATION VARIANCE IN OFFSPRING NUMBER JOHN H. GILLESPIE. Manuscript received September 17, 1973 ABSTRACT

Tracing the Sources of Complexity in Evolution. Peter Schuster

Is the Concept of Error Catastrophy Relevant for Viruses? Peter Schuster

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution

IN THIS PAPER, we consider a class of continuous-time recurrent

Evolution on simple and realistic landscapes

arxiv:cond-mat/ v4 [cond-mat.stat-mech] 19 Jun 2007

adap-org/ Jan 1994

Phase transition phenomena of statistical mechanical models of the integer factorization problem (submitted to JPSJ, now in review process)

Influence of noise on the synchronization of the stochastic Kuramoto model

Phase Transitions in Spin Glasses

Lee Yang zeros and the Ising model on the Sierpinski gasket

Magnetic ordering of local moments

Quantum annealing by ferromagnetic interaction with the mean-field scheme

Predicting Fitness Effects of Beneficial Mutations in Digital Organisms

Population Genetics: a tutorial

Neutral Networks of RNA Genotypes and RNA Evolution in silico

Problems on Evolutionary dynamics

A Realization of Yangian and Its Applications to the Bi-spin System in an External Magnetic Field

Nonlinear Dynamical Behavior in BS Evolution Model Based on Small-World Network Added with Nonlinear Preference

From time series to superstatistics

An Explanation on Negative Mass-Square of Neutrinos

Synchronization and Bifurcation Analysis in Coupled Networks of Discrete-Time Systems

Population Genetics and Evolution III

Stability and topology of scale-free networks under attack and defense strategies

(PP) rrap3ht/c 61 PCBM (Fig. 2e) MEHPPV/C 61 PCBM (Fig. 2f) Supplementary Table (1) device/figure a HF (mt) J (mt) (CTE) 4 2 >1 0.

Self-organized Criticality in a Modified Evolution Model on Generalized Barabási Albert Scale-Free Networks

The Metal-Insulator Transition in Correlated Disordered Systems

V. Evolutionary Computing. Read Flake, ch. 20. Assumptions. Genetic Algorithms. Fitness-Biased Selection. Outline of Simplified GA

V. Evolutionary Computing. Read Flake, ch. 20. Genetic Algorithms. Part 5A: Genetic Algorithms 4/10/17. A. Genetic Algorithms

Quantification of Entanglement Entropies for Doubly Excited States in Helium

arxiv: v1 [q-bio.qm] 7 Aug 2017

Extending the Tools of Chemical Reaction Engineering to the Molecular Scale

Sara C. Madeira. Universidade da Beira Interior. (Thanks to Ana Teresa Freitas, IST for useful resources on this subject)

Non-Adaptive Evolvability. Jeff Clune

Critical Properties of Complex Fitness Landscapes

Evolution on Realistic Landscapes

Spin glasses and Adiabatic Quantum Computing

Dependence of conductance on percolation backbone mass

Chapter 2 Ensemble Theory in Statistical Physics: Free Energy Potential

Error Propagation in the Hypercycle

Classical and quantum simulation of dissipative quantum many-body systems

Monte Carlo Study of Planar Rotator Model with Weak Dzyaloshinsky Moriya Interaction

Lattice Bhatnagar Gross Krook model for the Lorenz attractor

Lecture Notes: Markov chains

Molecular evolution - Part 1. Pawan Dhar BII

The phenomenon of biological evolution: 19th century misconception

Quantum annealing for problems with ground-state degeneracy

Application of Mean-Field Jordan Wigner Transformation to Antiferromagnet System

Segregation versus mitotic recombination APPENDIX

Physics 127b: Statistical Mechanics. Renormalization Group: 1d Ising Model. Perturbation expansion

Effects of Interactive Function Forms in a Self-Organized Critical Model Based on Neural Networks

MULTIPLE SOLUTIONS FOR AN INDEFINITE KIRCHHOFF-TYPE EQUATION WITH SIGN-CHANGING POTENTIAL

Density Functional Theory for Electrons in Materials

Multiobjective Optimization of an Extremal Evolution Model

Numerical Analysis of 2-D Ising Model. Ishita Agarwal Masters in Physics (University of Bonn) 17 th March 2011

IV. Evolutionary Computing. Read Flake, ch. 20. Assumptions. Genetic Algorithms. Fitness-Biased Selection. Outline of Simplified GA

Neutral Evolution of Mutational Robustness

Applications of Diagonalization

COMPARATIVE ANALYSIS ON TURING MACHINE AND QUANTUM TURING MACHINE

Structures of (ΩΩ) 0 + and (ΞΩ) 1 + in Extended Chiral SU(3) Quark Model

Local Search & Optimization

Mathematical Structures of Statistical Mechanics: from equilibrium to nonequilibrium and beyond Hao Ge

SI Appendix. 1. A detailed description of the five model systems

Trapping and survival probability in two dimensions

Self-organized Criticality and Synchronization in a Pulse-coupled Integrate-and-Fire Neuron Model Based on Small World Networks

Phase transitions and critical phenomena

EXPLODING BOSE-EINSTEIN CONDENSATES AND COLLAPSING NEUTRON STARS DRIVEN BY CRITICAL MAGNETIC FIELDS

Phase Transitions in Nonequilibrium Steady States and Power Laws and Scaling Functions of Surface Growth Processes

Exact solution of site and bond percolation. on small-world networks. Abstract

Giant Enhancement of Quantum Decoherence by Frustrated Environments

Dynamical principles in biological processes

An Application of Catalan Numbers on Cayley Tree of Order 2: Single Polygon Counting

Oscillatory Turing Patterns in a Simple Reaction-Diffusion System

7 Rate-Based Recurrent Networks of Threshold Neurons: Basis for Associative Memory

Herd Behavior and Phase Transition in Financial Market

I of a gene sampled from a randomly mating popdation,

Ising Model with Competing Interactions on Cayley Tree of Order 4: An Analytic Solution

Tensor network simulations of strongly correlated quantum systems

Escape Problems & Stochastic Resonance

Phase Transition & Approximate Partition Function In Ising Model and Percolation In Two Dimension: Specifically For Square Lattices

Critical Dynamics of Two-Replica Cluster Algorithms

Resistance distribution in the hopping percolation model

Transcription:

Exact solution of the Eigen model with general fitness functions and degradation rates David B. Saakian* and Chin-Kun Hu* *Institute of Physics, Academia Sinica, ankang, Taipei 11529, Taiwan; and Yerevan Physics Institute, Alikhanian Brothers Street 2, Yerevan 375036, Armenia Edited by H. Eugene Stanley, Boston University, Boston, MA, and approved January 18, 2006 (received for review June 13, 2005) We present an exact solution of Eigen s quasispecies model with a general degradation rate and fitness functions, including a square root decrease of fitness with increasing Hamming distance from the wild type. The found behavior of the model with a degradation rate is analogous to a viral quasispecies under attack by the immune system of the host. Our exact solutions also revise the known results of neutral networks in quasispecies theory. To explain the existence of mutants with large Hamming distances from the wild type, we propose three different modifications of the Eigen model: mutation landscape, multiple adjacent mutations, and frequency-dependent fitness in which the steady-state solution shows a multicenter behavior. quasispecies virus evolution error threshold Molecular models of biological evolution have attracted much attention in recent decades (1 15). Among them, Eigen s concept of quasispecies plays a fundamental role (1, 2). It describes the evolution of a population consisting of a wild type accompanied by a large number of mutant types in sequence space by a large system of ordinary differential equations. The Eigen model has been found to describe quite well the evolution of viral populations (3) and has deeply changed our view of the process of evolution: adaptation does not wait for better adapted mutants to arise but starts with the selection of the better adapted mutants and then explores by mutation the surrounding sequence space for even better mutants. When the mutation rate surpasses an error threshold, the population gets genetically unstable, and it could be shown that indeed virus populations can be driven to extinction when the error rate is artificially raised beyond the error threshold. To describe the population precisely, we should know the fitness value of each type and the mutation rates to go from one type to another. The experimental efforts to do so are immense. During the last three decades, the model has been investigated numerically as well as analytically for a simple fitness function. Although this sort of data reduction does allow a view on a large population, the fitness functions chosen are too simplistic to explain realistic cases such as a population of RA virus. In this work, we solve the system of differential equations exactly, assuming uniform degradation rates and fitness functions including a square root decrease of fitness with increasing Hamming distance (HD) from the wild type. Our exact solutions also revise the known theoretical results of neutral networks in quasispecies theory (2). To explain biological systems more realistically (16), we propose three different modifications of the Eigen model: mutation landscape, multiple adjacent mutations, and frequency-dependent fitness in which the steady-state solution shows a multicenter behavior. Model Several excellent reviews (2 5) emphasize the merits of the quasispecies model for the interpretation of virological studies. Let us give a brief description of the quasispecies model as we use it in this work: a sequence type of length is specified by a sequence of spin values s k 1, 1 k (1, 2). In reality the spins can take four values corresponding to the natural nucleotide types, but a two-value spin model already catches the essential features and can be studied more easily. Two-value spin models also have been used to study long-range correlations in DA sequences (17) and DA unzipping (18, 19), and valuable results have been obtained. A generalization of our results for the four-value spin case is presented in the Supporting Text, which is published as supporting information on the PAS website. However, such results include more cumbersome formula, and from now on we will only consider the two-value spin model. Let s j 1 represent purines (R) and s j 1 pyrimidines (Y). Type i is then specified by S i (s 1 (i), s 2 (i),..., s (i) ). The model describes replicating molecules under control of variation and natural selection with Eq. 1, to be defined below. The rate coefficients of replication and mutation are assumed to be independent of the concentration of the types. The model describes the exponential growth phase of virus evolution, in which there are enough nutrients and low virus concentration. The multistep cross-catalytic reactions are replaced by an autocatalytic one. Here the evolution picture is rather simple, compared with the linear growth phase in the case of strong saturation effects. Selection is on a genotype level: fitness is a function of S i. The variation is assumed to be produced only by point mutations. Eigen made a deterministic approach with kinetic rate equations that requires an infinite population, whereas classical population genetics uses probabilistic equations. We denote the probability for the appearance of S i at time t by p Si p i (t) and define fitness r i of S i as the average number of offsprings produced per unit time and degradation rate D i of S i as an inverse mean longevity. The chosen r i and D i are functions in genome sequence space S i, i.e., r i f(s i ) and D i D(S i ). The mutation matrix element Q ij is the probability that an offspring produced by state j changes to state i, and the evolution is given by the set of equations for 2 probabilities p i (2, 6) dp i dt 2 Q ijr j ijd j p j p i 2 r j D j p j. [1] 2 Here p i satisfies i1 p i 1 and Q ij q d(i,j) (1 q) d(i,j) with q being the mean nucleotide incorporation fidelity, and d(i, j) ( l1 s (i) l s (j) l )2 being the HD between S i and S j (1, 2); d(i, j) represents the total number of different spin values in S i and S j. In ref. 1 the concept of an error threshold is introduced, and the error threshold has been quantified by a formula. For the calculation of the error threshold, the selection values and mutation rates of all types would be required, which is still not Conflict of interest statement: o conflicts declared. This paper was submitted directly (Track II) to the PAS office. Abbreviations: AS, antiselective; FM, ferromagnetic; HD, Hamming distance; PM, paramagnetic. To whom correspondence should be sent at the * address. E-mail: huck@phys.sinica. edu.tw. 2006 by The ational Academy of Sciences of the USA www.pnas.orgcgidoi10.1073pnas.0504924103 PAS March 28, 2006 vol. 103 no. 13 4935 4939

feasible. For data reduction, several fitness functions (landscapes) have been considered (2). The simplest one has a single peak (2) in a flat landscape. Without loss of generality we set the peak to S 1 (1, 1,..., 1), i.e., the state with all spin up, and have fs 1 A 1, and fs i 1 for S i S 1, [2] for which Eigen error threshold formula (equation II-45 in ref. 1) gives an exact result Ae 1, [3] for successful selection (p. 180 in ref. 2). The parameter (1 q) describes the mutation efficiency. When the inequality is satisfied, a mutant distribution is built around the peak configuration in the steady state. Otherwise the distribution is flat in the infinite genome length limit, i.e., no sequence is preferred [error catastrophe (1, 2)]. We consider general fitness function f(s i ) and degradation D(S i ), which depend on the HD (the number of nucleotide differences between two sequences) from the peak configuration S 1. In statistical physics this case corresponds to mean-field-like interaction, which is exactly solvable (20). We can write f(s i ) and D(S i ) as (6) fs i f 0 i l s l, DS i d 0 l s l, [4] where f 0 (k) and d 0 (k) with k l s (i) l are polynomials. It is easy to show that k 1 2d 1, where d 1 is the HD between S i and S 1. Following ref. 13, in Supporting Text we exactly reformulate the solution of the system (Eq. 1) as a problem of statistical mechanics of quantum spins with some non-hermitian Hamiltonian H. Error threshold corresponds to singularity in the phase structure of Eq. 1. The same singularity exists also in the partition function Z Tr exp[h] at the limit 3 ; here, the inverse temperature in statistical mechanics model, corresponds to the time in Eq. 1. InSupporting Text, we show that, when 3, the dominant contributions to come from spin configurations with a particular k resulting in e11k2 f 0 k d 0 k. [5] The mean growth rate i (r i D i )p i is given as the maximum of Eq. 5 at 1 k 1, where nonzero k means a successful selection. We also show that for the steady-state distribution p i 2 the surplus production s i1 p i l1 s i l satisfy the equation e 11k2 f 0 k d 0 k f 0 s d 0 s, [6] where k gives the maximum of Eq. 5. The surplus s has a direct biological meaning about how the population is grouped around the peak configuration. The difference between s and k has been discussed carefully in ref. 12. Error Thresholds ow we use Eq. 5 to study error thresholds for the following cases. Case 1: Single Peak Fitness Function Without Degradation. Let us first disregard the degradation (d 0 (k) 0), and take f 0 (k) 1 (A 1)k p at the limit p 3. There is a simple nonselective paramagnetic (PM) phase with k 0 and. At high values of A there is a selective ferromagnetic (FM) phase with k 1 and Ae. The system chooses the phase with the highest Z, and the Eigen error threshold formula of Eq. 3 is i obtained. Eigen introduced Ae as a selective value of the peak (equation II-31 in ref. 1). In case of several isolated peaks the system chooses the one with the maximal selective value. Case 2: Single Peak Fitness Function with Degradation. Consider nonzero degradation d 0 (k) c k, where the positive number 0 is the degradation parameter, and the same fitness function as in case 1. Physically, d 0 (k) should be always positive for any value of k and we take c. Because of symmetry of Eigen s equations under the transformation D i 3 D i C with C being a constant, we can simply choose d 0 (k) k, which will be used in the following discussion. ow we have the PM phase with k 0 and 1 W PM, the FM phase with k 1 and Ae W FM k. [7] Besides these phases, we also have the antiselective (AS) phase, which can be located by finding the maximum of e11k2 k W AS k, [8] in the interval 1 k 1. To find the maximum we should take the value of Eq. 8 either at the border with k 1 or at the local maximum point. The derivative of W AS (k) with respect to k gives The solution of W AS k ke11k2. 1 k 2 W AS k 0 0, [9] gives a negative value of k 0, which means that most spin values of the spin state with k k 0 are different from those of the peak configuration S 1 ; hence, we call the spin state AS phase. At k 0, W AS (k) of Eq. 8 coincides with W PM and W AS (k) 0; W AS (k) decreases in the interval [0, 1]. As k moves from 0 to the negative k 0 of Eq. 9, W AS (k) moves from to 0. Thus, W AS (k) monotonically decreases at the interval [k 0, 0], and it is larger than W PM at the internal [k 0, 0). In case of several solutions of Eq. 9, we should choose the one that gives the maximal W AS and compare such W AS with the border value e at k 1to determine the maximal weight W M AS for the AS phase W M AS maxe 11k 2 0 k 0, e. [10] There are two possible stable phases, AS and FM, in the model. For given, A, and, we can compare W FM and W M AS to determine which phase dominates. The error threshold (phase boundary) is given by Ae W M AS. [11] For a nonzero degradation rate, ref. 2 gives the formula: Ae 1 D 0 D i (see equations III.4 and III.5 in ref. 2), where D 0 is the degradation rate of the master type and D i is the average degradation rate of all other types. The majority of types groups around k 0, therefore D i 0 and the error threshold condition of ref. 2 is Ae 1. [12] This error threshold deviates from Eq. 11 because the AS phase was not considered in ref. 2. This AS phase can represent a virus quasispecies under attack by the immune system (21). 4936 www.pnas.orgcgidoi10.1073pnas.0504924103 Saakian and Hu

Case 3: Flat Fitness Functions. During the last decade the importance of extended or flat fitness landscapes for the biological evolution (22 29) has been recognized. We first consider a simple mesa-type mount in the fitness landscape f 0 k A, k 0 k 1, f 0 k 1, 0 k k 0, [13] where k is the overlap of the points on the mesa with the configuration with all up spins S 1 ;1 k 0 defines the broadness of the mesa. For the fitness function of Eq. 13 and zero degradation, again a PM solution with k 0 is found. In the FM phase Eq. 5 takes a maximum value at the border k k 0, and we derive an error threshold Ae 11k 2 0 1. [14] If there are several mesa mounts with different A and k 0, the system chooses the one with the maximal left expression of Eq. 14, and we call this expression the selective value of the mesa. We see that increasing the extension of the mesa increases its selective ability. Case 4: Fitness Decreases Slowly with the HD d from the Peak. We now consider a landscape with a mount that has shallow slopes from the peak till d m fd A1 a4d, 0 d d m, [15] and f(d) 1, for d d m. We consider the case d m. Eqs. 5 and 15 with d 0 0 and small (1 k) 2d gives an optimization problem for Ae [1 ( a) 4d]. For a the population groups closely around the master type, whereas for the distribution becomes wider. Therefore, such a simple mechanism could describe an extensive genomic change. As in ref. 30, the population landscape has a narrow steep peak at low mutation rates but a wide shallow hill at high mutation rates. eutral Selective Value When flat fitness function of Eq. 13 exists only for a fraction (0 1) of spins, then fitness f 0 (k 1, k 2 ) is a function of two group of spins with and (1 ) spins, respectively f 0 k 1,1 A for k 0 k 1 1, f 0 k 1, k 2 1, otherwise, [16] where k 1 i1 s i () corresponds to flat fitness spin group, and k 2 i1 s i [(1 )] the remainder spins. In Section II of Supporting Text, for the partition Z Tr exp[h] we obtain e1m f 0 k 1, k 2 hm x 1 k 1 1 x 2 k 2 h 2 x 1 2 1 h 2 x 2 2, [17] where m is a transverse magnetization (related to quantum spin operator x ) for both groups of spins, and k 1, k 2 are longitudinal magnetization (related to quantum spin operator z ) for the first and second groups; h, x 1, and x 2 are Lagrange multipliers corresponding to m, k 1, and k 2, respectively. Setting the derivatives of with respect to h, x 1, and x 2 to 0, we obtain e1m f 0 k 1, k 2, [18] with m 1 k 1 2 (1 ) 1 k 2 2 and 1 k 1 1, 1 k 2 1. The maximum is at k 2 1 and k 1 k 0. The error threshold condition is Ae 11k 2 0 1. [19] The right-hand side is the selective value in the considered case. The neutral selective value has a factor 1 k 2 0 in the exponent, which is a product of two factors: the width and the length 1 k 2 0. In refs. 25 (equation 4) and 27 (equation 2), Ae (1) has been considered as a neutral selective value, and a similar expression has been given in ref. 29. They defined as the average number of neutral neighbors. If we consider small HD from the peak, (1 k 0 ), then the mean neutrality coincides with in our fitness landscape. We see that in the formula of refs. 25 and 27 the factor 1 k 2 0 (length of neutrality) is missed. To define neutral selective value, we should take exact copying probability q plus the probability of all neutral mutations. We should differentiate neutrality for single mutation from neutrality for multiple ones; the latter makes major contribution for selective ability, which is the origin of small cutoff factor 1 k 2 0. Even if there are neutral paths with a large HD (31) of length 40, their contributions to the mean fitness and error threshold might still be negligible due to small (width of neutrality). Steady-State Distribution ow we solve the steady-state distribution of the Eigen model with a single peak fitness and d 0 (k) 0. Because of the symmetry of the problem, the probabilities p i of Eq. 1 depend only on the HD from the peak configuration S 1 whose probability to appear is p 1. We denote the representatives of other classes as configurations C 2,..., C (1), with corresponding probabilities p 2,...,p 1. The total probability of the class C l is: pˆl p l ( l1 ). We suggest an ansatz for p l : p 1 1 and p l () l1. From Eq. 1 with dp i dt 0, we can derive an expression for p 1 and write p n1 for n 1 in terms of p 1,...,p n : p 1 e A 1 A 1, [20] p n1 1 A 1p 1 A n n!p l1 nl n l!l! np n [21]. Eq. 20 implies L (A j p j r j )A (1 e ), which is a generalization of Haldane result for the mutational load (28, 32) L. Let us consider the crude picture of the Eigen model s steady state. There is a central cluster around the peak configuration (with maximal density). The majority of population is concentrated within the HD (1 s)2, where the surplus s is defined above by Eq. 6. For the further classes the density decreases times with every step. The case when fitness is a function of distances from several peaks, again, could be solved and is consistent with the described picture. Mutants with a Large HD from the Wild Type Rohde, Daum, and Biebricher (16) observed many different mutants at large HDs (up to 9) from the wild type. Eq. 21 implies that mutants with an HD d from the wild type will appear with a probability of the order of () d ; thus the data in ref. 16 could Saakian and Hu PAS March 28, 2006 vol. 103 no. 13 4937

not be described by the original Eigen model in any way. Here we propose three possible modifications of the Eigen model to explain the experimental data. The first way is to introduce a nontrivial mutation landscape (33), which perhaps is the case for retroviruses. In this case q j depends on the site and the mutation matrix Q ij q j d(i,j) (1 q j ) d(i,j) is nonsymmetric. When q j is a function of the HD from the wild configuration, Eq. 5 is modified: 3 (k), where (k) (1 q j ) for the configuration S j having the overlap k with the wild one. Mutants at large HDs distances can appear, if the mutation rate for those genotypes is small enough to have a selective value A exp[(k)] close to the wild one. The second way is to introduce the frequency-dependent fitness (even in exponentially growth phase); such a proposal is supported by experimental data for RA viruses (34). For example, in Eq. 1 with D j 0 we replace r j by r j (1 cp j ), where c is a small coefficient, and have dp i dt 2 Q ijr j 1 cp j p j p i 2 r j 1 cp j p j. [22] We consider the case of fitness r 1 r 2 1 and r 2 r 1 (1 c), 1 c 0. For other configurations again r j 1. Configuration S 2 is located in some HD d(1, 2) 1 from the peak configuration S 1. When e Q 1, where Q 1 1r 1, there is no selection in the system; the steady-state distribution is flat. When Q 1 e Q 2, there is macroscopic distribution only around first configuration (Q 2 will be defined later by Eq. 24). When e Q 2, there are macroscopic probabilities 1 only at configurations S 1 and S 2 with p 1 x and p 2 y. For other configurations p i 1 d, where d is the shortest HD of the given configuration from S 1 or S 2. Then we immediately derive a system of two equations for x and y r 1 Q1 cxx xr 1 1 cxx r 2 1 cyy 1 x y, [23] r 2 Q1 cyy yr 1 1 cxx r 2 1 cyy 1 x y, where Q q. The system is easy to solve. From the condition x 0 and y 0, we derive an equation for the threshold Q 2 Q 2 r 2 1r 1 r 2 r 1 c cr 1 r 2. [24] When x 0 and y 0, the distribution is a sum of two distributions, decreasing quickly with the HDs from the two peak configurations. We have a recurrence formula for the probabilities p n1 x n1 of the first configuration s neighbors at the HD n r 1 1 cxx x n1 1 n n!x l1 nl l!n l! r 1 1 cx 1 nx n, [25] where x 1 x. Let us consider an experimental mutant spectrum with RA replicated by Q replicase (16). In this work, mutations in 86 nucleotides ( 86) with mutation rate per replication 0.03 have been carefully studied. Figure 4 in ref. 16 shows a central cluster around the peak (the top one labeled by I24 at the left-hand side of the figure), as well as mutants at HD 1 4 from the central cluster, having relative frequencies (compared with the peak one p 1 ) between 66 1.0 (for I28 and I20 with HD 1 and 2, respectively) and 16 0.166...(for I1 and I4 with HD 4, I11 with HD 2, etc.). ow let us take I1 with HD 4asan example to compare the results from the traditional approach and the frequency-dependent fitness approach. Extending the method to derive Eq. 21 to the present case with r 1 r 2 1 (but p i still satisfy Eq. 1), we have an estimate for the p 5 with HD 4 p 5 p 1 4 r 1 r 1 r 2 10 13, [26] with r 1 (r 1 r 2 ) 10. Of course the last formula could be modified because of finite population effects, but no rugged fitness landscape could increase the effect by 10 13 times (10 7,in case 1). The frequency-dependent Eigen model could predict p 5 p 1 with a correct order of magnitude as follows. Using r 1 10, r 2 9, Q 0.851, and c 0.19 in Eq. 23, we have x 0.67 and y 0.16; the latter is consistent with the relative frequency 16 for I1 mentioned above. We see that the population of the second peak y is independent of the HD d from the main peak, and y is finite at small mutations, whereas in the case of the Eigen model (c 0) y disappears as () d(1,2). The third possible explanation could be multiple (duplet, triplet...) adjacent mutations (35). We can suggest a simple explanation by assuming triplet adjacent mutations (36). Assuming the existence (besides the simple point mutations) of triplet mutations with per genome per replication mutation probability t 1, r 1 (r 1 r 2 ) 10, and missing the nonlinearity, we can get a reasonable estimate p i t p 1 r 1 r 1 r 2 0.1, [27] which is of the same order as p I112 p 1 with p I112 being the probability for the appearance of I112 in figure 4 in ref. 16 (I112 has adjacent triplet mutations). Discussion More than 30 years after the first analytical result by Eigen (1), we presented exact formulas (Eqs. 5 and 6) for the mean fitness and surplus production in Eigen s quasispecies model, using simplified landscapes with fitness values and degradation rates that are functions of the HD from the master type (Eqs. 1 and 4). In addition to the selective and nonselective phases, an AS phase is described at higher degradation rates. Depending on the landscape chosen, the threshold condition reported in ref. 2 had to be modified. The Eigen model immediately produces the notion of selective value (vs. fitness value on Wright s fitness landscape in zero mutation case); the evolving system is choosing the peak with maximal selective value. For an isolated peak, the selective value is the product of fitness and copying fidelity (ref. 1). We derived a selective value for a general case (Eq. 5), including the neutral phenomenon (mesa with neutral spins) Eq. 19. The fitness function, decreasing from the master type with the square root of the HD, plays a special role. Such fitness allows a radical rearrangement of population. The model with nonzero degradation maybe realistic for the host parasite interaction (21) and could be useful for the problems of immunology as well (37). We have proposed three possible modifications of Eigen s quasispecies model to describe the existence of mutants at large HDs from the wild type. All three mechanisms exist in experimental data (16, 33, 34). From the theoretical point of view the adjacent site mutations are especially interesting, because they can give rapid relaxation in changing environments in realistic case of finite population. Consider the situation, when the adjacent triplet mutations have the same rate as the single-site one, and new wild configuration in changing environments at the 4938 www.pnas.orgcgidoi10.1073pnas.0504924103 Saakian and Hu

HD d from the current configuration could be achieved by triplet mutations. Let us denote by 1 and 3 the probabilities of single and triplet mutations per genome per replication. In ref. 38 it has been calculated that finite population relaxation period (fitness barrier crossing time) in case of single mutations scales as t 1 ( 1 ) d. A similar estimate for triplet mutations gives t 3 ( 3 ) d/3. If a system needs 1 million replication cycles (t 1 )to reach the steady state in case of only single-point mutations; a similar system needs only 100 replication cycles (t 3 ) to reach approximately the same steady state through adjacent site triplet mutations. Thus, the latter can relax more quickly. Although we propose to modify the quasispecies model to explain the experimental data, we agree with the most fundamental idea of the quasispecies theory: the whole collection of the viruses (quasispecies) acts as a target of selection and mutation. Further work should deal with more realistic or interesting situations, such as finite population problems (39, 40), random fitness landscapes, and the role of neutral networks in evolution. We thank C. Biebricher, M. Deem, and H. Wagner for useful discussions. This work was supported by ational Science Council of the Republic of China (Taiwan) Grants SC 93-2112-M-001-027, 94-2811-M-001-014, and SC 94-2119-M-002-001; Academia Sinica (Taiwan) Grant AS-92- TP-A09; and U.S. Civilian Research and Development Foundation Grant ARP2-2647-Ye-05. 1. Eigen, M. (1971) aturwissenschaften 58, 465 523. 2. Eigen, M., McCaskill, J. & Schuster, P. (1989) Adv. Chem. Phys. 75, 149 263. 3. Dominigo, E., Webster, R. & Holland, J. J. (1999) Origin and Evolution of Viruses (Academic, San Diego). 4. Dominigo, E., Biebricher, C., Eigen, M. & Holland, J. J. (2001) Quasispecies and RA Virus Evolution: Principles and Consequences, (Landes Bioscience, Austin, TX). 5. Eigen, M. & Biebricher, C. (2005) Virus Res. 107, 117 127. 6. Swetina, J. & Schuster, P. (1982) Biophys. Chem. 16, 329 345. 7. Leuthausser, I. (1987) J. Stat. Phys. 48, 343 360. 8. Tarazona, P. (1992) Phys. Rev. A 45, 6038 6050. 9. Franz, S. & Peliti, L. (1997) J. Phys. A. 30, 4481 4487. 10. Crow, J. F. & Kimura, M. (1970) An Introduction to Population Genetics Theory (Harper Row, ew York). 11. Baake, E., Baake, M. & Wagner, H. (1997) Phys. Rev. Lett. 78, 559 562. 12. Baake, E. & Wagner, H. (2001) Genet. Res. 78, 93 117. 13. Saakian, D. B. & Hu, C.-K. (2004) Phys. Rev. E 69, 021913. 14. Saakian, D. B. & Hu, C.-K. (2004) Phys. Rev. E 69, 046121. 15. Saakian, D. B., Hu, C.-K. & Khachatryan, H. (2004) Phys. Rev. E 70, 041908. 16. Rohde,., Daum, H. & Biebricher, C. (1995) J. Mol. Biol. 249, 754 762. 17. Peng, C. K., Buldyrev, S. V., Goldberger, A. L., Havlin, S., Sciortino, F., Simons, M. & Stanley, H. E. (1992) ature 356, 168 170. 18. Lubensky, D. K. & elson, D. R. (2000) Phys. Rev. Lett. 85, 1572 1575. 19. Allahverdyan, A. E., Gevorkian, S., Hu, C. K. & Wu, M. C. (2004) Phys. Rev. E 69, 061908. 20. Baxter, R. J. (1982) Exactly Solvable Models in Statistical Mechanics (Academic, London). 21. Kamp, C. & Bornholdt, S. (2002) Phys. Rev. Lett. 88, 068104. 22. Huynen, M., Stadler, P. F. & Fontana, W. (1996) Proc. atl. Acad. Sci. USA 93, 397 401. 23. imwegen, E. V., Crutchfield, J. P. & Huynen, M. (1999) Proc. atl. Acad. Sci. USA 96, 9716 9720. 24. Wilke, C. O., Wang, J., Ofria, L. C., Lenski, R. E. & Adami, C. (2001) ature 412, 331 333. 25. Wilke, C. O. (2001) Evolution 55, 2412 2420. 26. Reydys, C., Forst, C. V. & Schuster, P. (2001) Bull. Math. Biol. 63, 57 94. 27. Ofria, C. O., Adami, C. & Collier, T. C. (2003) J. Theor. Biol. 222, 477 483. 28. Wilke, C. O. (2001) Bull. Math. Biol. 63, 715 730. 29. Schuster, P. (2001) The Origin of Life 1, 329 346. 30. Schuster, P. & Swetina, J. (1988) Bull. Math. Biol. 50, 635 660. 31. Schultes, E. A. & Bartel, D. P. (2000) Science 289, 448 452. 32. Kimura, M. & Maruyama, T. (1966) Genetics 54, 1337 1351. 33. Sasaki, A. & owak, M. A. (2003) J. Theor. Biol. 224, 241 246. 34. Elena, S. F., Mirales, R. & Moya, A. (1997) Evolution 51, 984 987. 35. Averof, M., Rokas, A., Wolfe, K. H. & Sharp, P. M. (2000) Science 287, 1283 1286. 36. Whelan, S. & Goldman,. (2004) Genetics 167, 2027 2043. 37. Deem, M. W. & Lee, H. Y. (2003) Phys. Rev. Lett. 91, 068101. 38. Van imwegen, E. & Crutchfield, J. P. (2000) Bull. Math. Biol. 62, 799 848. 39. Wilke, C. O. (2005) BMC Evol. Biol. 5, 44. 40. Forster, R., Adami, C. & Wilke, C. O. (2005) arxiv:q-bio.pe0509002. Saakian and Hu PAS March 28, 2006 vol. 103 no. 13 4939