Text S1: Detailed proofs for The time scale of evolutionary innovation


Krishnendu Chatterjee, Andreas Pavlogiannis, Ben Adlam, Martin A. Nowak

1. Overview and Organization

We present detailed proofs of all our results. In this section we give an overview of the proof structure and the organization of our results.

1. In Section 2 we present relevant lower and upper bounds on hitting times for Markov chains on a one-dimensional grid. The results of this section are technical and form the basis for the results of the following sections. However, a reader does not need to understand the technical proofs of this section to follow the later sections: we only use Lemma S3 and Lemma S4 (and their consequence Corollary S1), and Lemma S5 (and its implication) in the following subsections. We present the results in the most general form for Markov chains, as they might be useful in other contexts as well, and then present simple applications of these general results to the discovery time of evolutionary processes.

2. In Section 3 we introduce evolutionary processes, and for simplicity we introduce them for the evolutionary adaptation of bit strings. For mathematically elegant proofs we first introduce the Fermi evolutionary process in this section, and consider the Moran process later.

3. In Section 4 we present our results for the Fermi evolutionary process with neutral fitness landscapes and a broad peak of targets.

4. In Section 5 we present our results for constant selection in the Fermi evolutionary process with a broad peak of targets.

5. In Section 6 we show how the results of Section 4 and Section 5 imply all the desired results for the Moran evolutionary process.

6. In Section 7 we show how the results of Sections 4, 5, and 6 extend from bit strings to strings over an alphabet of any size (and obtain the results for the four-letter alphabet as a special case).

7. In Section 8 we present the results for multiple independent searches, and in Section 9 we discuss some cases of distributed targets.

8. In Section 10 we discuss the results for a mechanism that enables evolution to work in polynomial time. Finally, in Section 11 we present details of some numerical calculations used in the main article.

9. In Section 12 we discuss and compare our results with relevant related work, and end with additional simulation results in Section 13.

While for simplicity the main article only presents our results for neutral fitness landscapes, Sections 5, 6 and 7 deal with the more general case of selection acting on multiplicative fitness landscapes.

2. Bounds on hitting times of Markov chains on a line

In this section we present our basic lower and upper bounds on hitting times of Markov chains on a line. The results of this section will be used repeatedly in later sections to provide lower and upper bounds on the discovery time for several evolutionary processes. We start with the definition of Markov chains, and then define the special case of Markov chains on a line.

Definition S1 (Markov chains). A finite-state Markov chain MC_L = (S, δ) consists of a finite set S of states, with S = {0, 1, ..., L} (i.e., the set of states is a finite subset of the natural numbers starting from 0), and a stochastic transition matrix δ that specifies the transition probabilities; i.e., δ(i, j) denotes the probability of a transition from state i to state j (in other words, for all 0 ≤ i, j ≤ L we have 0 ≤ δ(i, j) ≤ 1, and for all 0 ≤ i ≤ L we have Σ_{j=0}^{L} δ(i, j) = 1).

We now introduce Markov chains on a line. Intuitively, a Markov chain on a line is a special case of a Markov chain in which, from every state, the allowed transitions are either self-loops, or to the left, or to the right. The formal definition is as follows.

Definition S2 (Markov chains on a line). A Markov chain on a line, denoted M_L, is a finite-state Markov chain (S, δ) where S = {0, 1, ..., L} and for all 0 ≤ i, j ≤ L, if δ(i, j) > 0, then |i − j| ≤ 1; i.e., the only allowed transitions are self-loops, transitions to the left, and transitions to the right (see Supplementary Figure 1).

Supplementary Figure 1: Markov chain on a line. Pictorial illustration of a Markov chain on a line with states 0, 1, ..., L, self-loop probabilities δ(i, i), left-transition probabilities δ(i, i−1), and right-transition probabilities δ(i, i+1).

We now define the notion of hitting times for Markov chains on a line.

Definition S3 (Hitting time). Given a Markov chain on a line M_L and two states n_1 and n_2 (i.e., 0 ≤ n_1, n_2 ≤ L), we denote by H(n_1, n_2) the expected hitting time of the target state n_1 from the starting state n_2, i.e., the expected number of transitions required to reach the target state n_1 starting from the state n_2.

The recurrence relation for hitting time. Given a Markov chain on a line M_L = (S, δ) and a state n_1 (i.e., 0 ≤ n_1 ≤ L), the following recurrence relation holds:

1. H(n_1, n_1) = 0;
2. H(n_1, i) = 1 + δ(i, i+1)·H(n_1, i+1) + δ(i, i−1)·H(n_1, i−1) + δ(i, i)·H(n_1, i), for all n_1 < i < L; and
3. H(n_1, L) = 1 + δ(L, L−1)·H(n_1, L−1) + δ(L, L)·H(n_1, L).

The argument is as follows: (a) Case 1 is trivial. (b) For case 2, since i ≠ n_1, at least one transition must be taken to a neighbor state j of i, from which the hitting time is H(n_1, j). With probability δ(i, i+1) the neighbor j is state i+1, while with probability δ(i, i−1) the neighbor j is state i−1. With probability δ(i, i) the self-loop transition is taken, and the expected hitting time remains the same. (c) Case 3 is a degenerate version of case 2, where the only possible transitions from the state L are either to the state L−1, taken with probability δ(L, L−1), or the self-loop, taken with probability δ(L, L). Also note that in case 3 we have δ(L, L−1) = 1 − δ(L, L).
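The recurrence above is simply a linear system in the unknowns H(n_1, ·). As an illustration (not part of the original proofs), the following minimal Python sketch solves that system directly for an arbitrary Markov chain on a line; the unbiased example chain at the end is a hypothetical placeholder.

    import numpy as np

    def hitting_times(delta, n1):
        """Expected hitting times H(n1, i) for a Markov chain on a line.

        delta is an (L+1) x (L+1) transition matrix with nonzero entries only on
        the diagonal and the two adjacent diagonals. We solve the linear system
        given by the recurrence:
            H(n1, n1) = 0
            H(n1, i)  = 1 + sum_j delta[i, j] * H(n1, j)   for i != n1.
        """
        L = delta.shape[0] - 1
        A = np.eye(L + 1) - delta       # rows i != n1 encode (I - delta) H = 1
        b = np.ones(L + 1)
        A[n1, :] = 0.0                  # boundary condition H(n1, n1) = 0
        A[n1, n1] = 1.0
        b[n1] = 0.0
        return np.linalg.solve(A, b)

    # Hypothetical example: unbiased chain on {0, ..., 10} with reflecting ends, no self-loops.
    L = 10
    delta = np.zeros((L + 1, L + 1))
    delta[0, 1] = delta[L, L - 1] = 1.0
    for i in range(1, L):
        delta[i, i - 1] = delta[i, i + 1] = 0.5
    print(hitting_times(delta, 0))      # H(0, i) equals i * (2L - i) for this chain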

In the following lemma we show that, using the recurrence relation, the hitting time can be expressed as the sum of a sequence of numbers.

Lemma S1. Consider a Markov chain on a line M_L with a target state n_1, such that for all n_1 < i ≤ L we have δ(i, i−1) > 0. Then for all n_1 ≤ i ≤ L we have

    H(n_1, i) = Σ_{j=L−i}^{L−n_1−1} b_j,

where b_j is the sequence defined as: (1) b_0 = 1/δ(L, L−1); (2) b_j = (1 + δ(L−j, L−j+1)·b_{j−1}) / δ(L−j, L−j−1) for j > 0.

Proof. We consider the recurrence relation for the hitting time and first show that for all 0 ≤ i < L − n_1 we can write H(n_1, L−i) as H(n_1, L−i) = b_i + H(n_1, L−i−1) for the desired sequence b_i.

(Base case). For i = 0 we have

    H(n_1, L) = 1 + δ(L, L−1)·H(n_1, L−1) + δ(L, L)·H(n_1, L) = 1 + δ(L, L−1)·H(n_1, L−1) + (1 − δ(L, L−1))·H(n_1, L),

and hence H(n_1, L) = 1/δ(L, L−1) + H(n_1, L−1); thus the statement holds with b_0 = 1/δ(L, L−1).

(Inductive case). Assume that the statement holds for some i (inductive hypothesis); we show that it also holds for i+1. Let y = δ(L−i−1, L−i) and x = δ(L−i−1, L−i−2). We establish the following equality:

    H(n_1, L−i−1) = 1 + y·H(n_1, L−i) + x·H(n_1, L−i−2) + (1 − x − y)·H(n_1, L−i−1)
                  = 1 + y·(b_i + H(n_1, L−i−1)) + x·H(n_1, L−i−2) + (1 − x − y)·H(n_1, L−i−1)
                  = (1 + y·b_i)/x + H(n_1, L−i−2).

The first equality follows from the recurrence relation (case 2) by substituting i with L−i−1; the second equality follows by substitution using the inductive hypothesis; the third equality is simple rewriting, since x ≠ 0. Thus we have H(n_1, L−i−1) = b_{i+1} + H(n_1, L−i−2), where (1) b_0 = 1/δ(L, L−1); (2) b_j = (1 + δ(L−j, L−j+1)·b_{j−1}) / δ(L−j, L−j−1) for j > 0. Hence H(n_1, L−i) = Σ_{j=i}^{L−n_1−1} b_j, and by substituting L−i with i we obtain H(n_1, i) = Σ_{j=L−i}^{L−n_1−1} b_j. The desired result follows.
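Lemma S1 also gives a direct way to accumulate hitting times without solving a full linear system, provided transitions to the left are always available. The following Python sketch (an illustration under the same assumptions as the previous sketch, not part of the original text) computes the sequence b_j and the corresponding partial sum H(n_1, i); it can be cross-checked against the linear-system solver above.

    def hitting_time_via_b(delta, n1, i):
        """H(n1, i) computed as the partial sums of the sequence b_j (Lemma S1).

        Requires delta[k][k-1] > 0 for all n1 < k <= L, and n1 <= i <= L.
        """
        L = len(delta) - 1
        b = [1.0 / delta[L][L - 1]]                        # b_0 = 1 / delta(L, L-1)
        for j in range(1, L - n1):
            k = L - j
            b.append((1.0 + delta[k][k + 1] * b[j - 1]) / delta[k][k - 1])
        return sum(b[L - i : L - n1])                      # H(n1, i) = sum_{j=L-i}^{L-n1-1} b_j

    # Same hypothetical unbiased example as above (as nested lists); H(0, 5) = 5 * (2*10 - 5) = 75.
    L = 10
    delta = [[0.0] * (L + 1) for _ in range(L + 1)]
    delta[0][1] = delta[L][L - 1] = 1.0
    for k in range(1, L):
        delta[k][k - 1] = delta[k][k + 1] = 0.5
    print(hitting_time_via_b(delta, 0, 5))                 # 75.0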

Definition S4. For positive real-valued constants A and B, we define the sequence a_i(A, B) as follows: (1) a_0(A, B) = 1/B; (2) a_i(A, B) = (1 + A·a_{i−1}(A, B))/B for i > 0.

Lemma S2. For positive real-valued constants A and B, the following assertions hold for the sequence a_i(A, B):

- If A > B and B ≤ 1, then a_i(A, B) ≥ (A/B)^i, with A/B > 1.
- If A ≤ B, then a_i(A, B) = O(i/B).

Proof. The result is as follows:

- Case A > B: Then we have a_i(A, B) = (1 + A·a_{i−1}(A, B))/B > (A/B)·a_{i−1}(A, B) (by simply ignoring the 1 in the numerator), and hence a_i(A, B) > (A/B)^i·a_0(A, B) = (A/B)^i·(1/B) ≥ (A/B)^i, since B ≤ 1.
- Case A ≤ B: Then A/B ≤ 1 and a_i(A, B) = (1 + A·a_{i−1}(A, B))/B ≤ 1/B + a_{i−1}(A, B) ≤ (i+1)/B = O(i/B).

The desired result follows.

Exponential lower bound. We use the following standard convention in this paper: a function t(L) is lower bounded by an exponential function if there exist constants c > 1, l > 0 and L_0 ∈ N such that for all L ≥ L_0 we have t(L) ≥ c^{l·L} = 2^{c'·l·L}, where c' = log_2 c > 0; i.e., t is lower bounded by a function that is linear in L in the exponent.

Exponential lower bound on hitting times for Markov chains on a line. In the following lemma we show an exponential lower bound on the hitting time. We consider a Markov chain on a line M_L such that there exist two states x and y = x + k, for k > 0, such that in the whole contiguous segment between x and y, the ratio of the probability of drifting to the right to the probability of drifting to the left is at least 1 + A, for a constant A > 0 (strictly bounded away from 1). Then the expected hitting time from any starting point to the right of x to a target to the left of x is at least (1 + A)^{k−1}.

Lemma S3 (Lower bound). Consider a Markov chain on a line M_L. If there exist two states x, y ≤ L with y = x + k, for k > 0, and a constant A > 0 such that for all x ≤ i < y we have δ(i, i+1)/δ(i, i−1) ≥ 1 + A, then for all n_1, n_2 ≤ L such that n_1 ≤ x < n_2 we have H(n_1, n_2) ≥ (1 + A)^{k−1}.

Proof. From Lemma S1 we have

    H(n_1, n_2) = Σ_{j=L−n_2}^{L−n_1−1} b_j ≥ b_{L−x−1},

since L−n_2 ≤ L−x−1 ≤ L−n_1−1 (as n_1 ≤ x < n_2). We have δ(i, i+1)/δ(i, i−1) ≥ 1 + A for all x ≤ i < y by the condition of the lemma. We show by induction that for all j between L−y and L−x−1 (i.e., L−y ≤ j ≤ L−x−1) we have b_j ≥ a_{j−L+y}(1+A, 1).

1. (Base case). We have b_{L−y} ≥ 1 = a_0(1+A, 1), since every b_j ≥ 1/δ(L−j, L−j−1) ≥ 1.
2. (Inductive case). By the inductive hypothesis we have b_{j−1} ≥ a_{j−1−L+y}(1+A, 1), and then we have

    b_j = (1 + δ(L−j, L−j+1)·b_{j−1}) / δ(L−j, L−j−1) ≥ 1 + (1+A)·a_{j−1−L+y}(1+A, 1) = a_{j−L+y}(1+A, 1),

since δ(L−j, L−j+1)/δ(L−j, L−j−1) ≥ 1+A and δ(L−j, L−j−1) ≤ 1. Thus b_j ≥ a_{j−L+y}(1+A, 1).

Thus for all L−y ≤ j ≤ L−x−1 we have b_j ≥ a_{j−L+y}(1+A, 1). Hence H(n_1, n_2) ≥ b_{L−x−1} ≥ a_{y−x−1}(1+A, 1) = a_{k−1}(1+A, 1) ≥ (1+A)^{k−1} (from Lemma S2, since 1+A > 1).

Lemma S4 (Upper bound). Given a Markov chain on a line M_L and 0 ≤ n_1 < n_2 ≤ L, if for all n_1 < i < L we have δ(i, i−1) ≥ δ(i, i+1), then H(n_1, n_2) = O(L²/B), where B = min_{n_1 < i ≤ L} (1 − δ(i, i)).

Proof. From Lemma S1 we have that H(n_1, n_2) = Σ_{j=L−n_2}^{L−n_1−1} b_j. Let B' = min_{n_1 < i ≤ L} δ(i, i−1). We show by induction that for all 0 ≤ j ≤ L−n_1−1 we have b_j ≤ a_j(1, B').

1. (Base case). We have b_0 = 1/δ(L, L−1) ≤ 1/B' = a_0(1, B') (by our choice of B' we have B' ≤ δ(L, L−1)).
2. (Inductive case). By the inductive hypothesis we have b_{j−1} ≤ a_{j−1}(1, B'). Then

    b_j = (1 + δ(L−j, L−j+1)·b_{j−1}) / δ(L−j, L−j−1) ≤ 1/δ(L−j, L−j−1) + (δ(L−j, L−j+1)/δ(L−j, L−j−1))·a_{j−1}(1, B') ≤ 1/B' + a_{j−1}(1, B') ≤ (1 + a_{j−1}(1, B'))/B' = a_j(1, B'),

since δ(L−j, L−j−1) ≥ B', δ(L−j, L−j+1) ≤ δ(L−j, L−j−1), and B' ≤ 1. Thus b_j ≤ a_j(1, B').

It follows that for all L−n_2 ≤ j ≤ L−n_1−1 we have b_j ≤ a_j(1, B'), and thus b_j = O(j/B') from Lemma S2. Then H(n_1, n_2) = Σ_{j=L−n_2}^{L−n_1−1} b_j = O((n_2 − n_1)·(L − n_1)/B') = O(L²/B'). Let j* = arg min_{n_1 < i ≤ L} δ(i, i−1). We have B' = δ(j*, j*−1) ≥ (1/2)·(δ(j*, j*−1) + δ(j*, j*+1)) = (1/2)·(1 − δ(j*, j*)) ≥ B/2, because δ(j*, j*−1) ≥ δ(j*, j*+1) and 1 − δ(j*, j*) ≥ B. We conclude that H(n_1, n_2) = O(L²/B).

Markov chains on a line without self-loops. A special case of the above lemma is obtained for Markov chains on a line with no self-loops in states other than state 0, i.e., for all 0 < i ≤ L we have δ(i, i) = 0 and hence 1 − δ(i, i) = 1 = B. We consider a Markov chain on a line without self-loops M_L such that there exist two states x and y = x + k, for k > 0, such that in the whole contiguous segment between x and y the probability of drifting to the right is at least a constant A > 1/2 (strictly bounded away from 1/2). We also assume A < 1, since otherwise transitions to the left are never taken. Then the expected hitting time from any starting point to the right of x to a target to the left of x is at least c_A^{k−1}, where c_A = A/(1−A) > 1 (see Supplementary Figure 2).

Corollary S1. Given a Markov chain on a line M_L such that for all 0 < i ≤ L we have δ(i, i) = 0, the following assertions hold:

1. Lower bound: If there exist two states x, y ≤ L with y = x + k, for k > 0, and a constant A > 1/2 such that for all x ≤ i < y we have δ(i, i+1) ≥ A > 1/2, then for all n_1, n_2 ≤ L such that n_1 ≤ x < n_2 we have H(n_1, n_2) ≥ c_A^{k−1}, for c_A = A/(1−A) > 1.
2. Upper bound: For 0 ≤ n_1 < n_2 ≤ L, if for all n_1 < i < L we have δ(i, i−1) ≥ 1/2, then H(n_1, n_2) = O(L²).

Proof. Since δ(i, i) = 0, we have that δ(i, i+1) ≥ A implies δ(i, i+1)/δ(i, i−1) ≥ A/(1−A), and then the first item is an easy consequence of Lemma S3. For item (2), δ(i, i−1) ≥ 1/2 implies δ(i, i−1) ≥ δ(i, i+1), and hence the result follows from Lemma S4 with B = 1, since δ(j, j) = 0 for all n_1 < j < L.
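The two regimes of Corollary S1 are easy to observe numerically. The following Python sketch (an illustration, not part of the original proofs) computes exact hitting times for two hypothetical chains without self-loops: one with a rightward-drift window of length k, and one with leftward drift at least 1/2 everywhere.

    def hitting_time_no_loops(p_right, n1):
        """H(n1, L) for a chain without self-loops; p_right[i-1] = delta(i, i+1) for 0 < i < L."""
        L = len(p_right) + 1
        b = [1.0]                                            # b_0 = 1 / delta(L, L-1) = 1
        for j in range(1, L - n1):
            k = L - j
            b.append((1.0 + p_right[k - 1] * b[j - 1]) / (1.0 - p_right[k - 1]))
        return sum(b)

    L, A, k = 40, 0.8, 15
    drift = [A if 10 < i <= 10 + k else 0.5 for i in range(1, L)]    # rightward window above state 10
    print(hitting_time_no_loops(drift, 0), (A / (1 - A)) ** (k - 1))  # exact value vs c_A^(k-1)
    print(hitting_time_no_loops([0.5] * (L - 1), 0))                  # unbiased: O(L^2), here L^2 = 1600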

Supplementary Figure 2: Lower and upper bounds on hitting times for a Markov chain on a line. Panel (a) shows a Markov chain on a line without self-loops where, for a segment of length k between x and y, the transition probabilities to the right are at least a constant A > 1/2; the hitting time from any starting point n_2 to the right of x to a target n_1 to the left of x is then at least exponential in the length k. Panel (b) shows a Markov chain on a line without self-loops where all the transition probabilities to the left, up to the target n_1, are at least 1/2; the hitting time from any starting point to the right of the target n_1 to the target is then at most O(L²). Panel (c) shows the exponential lower bound (red) and the polynomial upper bound (green) on the hitting times H(n_1, n_2) on a log scale.

Unloop variant of Markov chains on a line. We now show how, given a Markov chain on a line with self-loops, we can create a variant without self-loops, and establish a relation between the hitting time of the original Markov chain and that of its variant without self-loops.

Definition S5 (Unloop variant of a Markov chain on a line). Given a Markov chain on a line M_L = (S, δ), we call its unloop variant the Markov chain on a line M'_L = (S, δ') with the following properties:

- δ'(0, 1) = 1;
- for all 0 < i < L, we have δ'(i, i−1) = δ(i, i−1)/(δ(i, i−1) + δ(i, i+1)) and δ'(i, i+1) = δ(i, i+1)/(δ(i, i−1) + δ(i, i+1)), i.e., the probabilities of transitions to the right and to the left are normalized so that they sum to 1; and
- δ'(L, L−1) = 1.

We now show the following: (1) the hitting time of the original Markov chain on a line M_L is always at least the hitting time of the unloop variant; and (2) the hitting time of the original Markov chain is at most z times the hitting time of the unloop variant, where z is the maximum of the inverse of one minus the self-loop transition probabilities.

Lemma S5. Consider a Markov chain on a line M_L = (S, δ) and its unloop variant M'_L = (S, δ'). Let 0 ≤ n_1, n_2 ≤ L with n_1 < n_2, and let H(n_1, n_2) denote the hitting time to state n_1 from state n_2 in M_L, and H'(n_1, n_2) the corresponding hitting time in M'_L. The following assertions hold: (i) H(n_1, n_2) ≥ H'(n_1, n_2). (ii) H(n_1, n_2) ≤ z·H'(n_1, n_2), where z = max_{0 < i ≤ L} 1/(1 − δ(i, i)).

Proof. From Lemma S1 we have that for all 0 ≤ i ≤ L we can write H(n_1, i) = Σ_{j=L−i}^{L−n_1−1} b_j and H'(n_1, i) = Σ_{j=L−i}^{L−n_1−1} b'_j, where b_j and b'_j are the sequences defined as:

(1) b_0 = 1/δ(L, L−1); (2) b_j = (1 + δ(L−j, L−j+1)·b_{j−1}) / δ(L−j, L−j−1) for j > 0;

and

(1) b'_0 = 1; (2) b'_j = (1 + δ'(L−j, L−j+1)·b'_{j−1}) / δ'(L−j, L−j−1) for j > 0.

(i) We prove inductively that for all 0 ≤ j ≤ L−n_1−1 we have b'_j ≤ b_j.

1. (Base case). b'_0 = 1 ≤ 1/δ(L, L−1) = b_0.
2. (Inductive step). The inductive hypothesis guarantees that b'_{j−1} ≤ b_{j−1}. Observe that δ'(L−j, L−j+1)/δ'(L−j, L−j−1) = δ(L−j, L−j+1)/δ(L−j, L−j−1) = R. Then

    b'_j = (1 + δ'(L−j, L−j+1)·b'_{j−1}) / δ'(L−j, L−j−1) = 1/δ'(L−j, L−j−1) + R·b'_{j−1} ≤ 1/δ(L−j, L−j−1) + R·b_{j−1} = (1 + δ(L−j, L−j+1)·b_{j−1}) / δ(L−j, L−j−1) = b_j,

because of the inductive hypothesis and δ'(L−j, L−j−1) ≥ δ(L−j, L−j−1). Thus for all such j we have b'_j ≤ b_j, and H'(n_1, n_2) = Σ_{j=L−n_2}^{L−n_1−1} b'_j ≤ Σ_{j=L−n_2}^{L−n_1−1} b_j = H(n_1, n_2).

(ii) We prove inductively that for all 0 ≤ j ≤ L−n_1−1 we have b_j ≤ z·b'_j.

1. (Base case). b_0 = 1/δ(L, L−1) = 1/(1 − δ(L, L)) ≤ z = z·b'_0.
2. (Inductive step). The inductive hypothesis guarantees that b_{j−1} ≤ z·b'_{j−1}. Observe that δ'(L−j, L−j+1)/δ'(L−j, L−j−1) = δ(L−j, L−j+1)/δ(L−j, L−j−1) = R. Moreover, let x = δ(L−j, L−j−1) and y = δ(L−j, L−j+1); then we have:

Thus z δ(l j, L j) = z x + y = z (x + y) = z x + y x x. b j = + y b j x = x +R b j z x + y x +R z b j = z + δ(l j, L j + ) b j = z b j δ(l j, L j ) snce x+y x = δ(l j,l j ). Thus for all 0 < j < L, we have b j z b j, and hence H(n, n 2 ) = L n j=l n 2 b j L n j=l n 2 z b j = z H(n, n 2 ). Ths completes the proof. Implcaton of Lemma S5. The man mplcaton of Lemma S5 s as follows: any lower bound on the httng tme on the unloop varant s a lower bound on the httng tme of the orgnal Markov chan; and an upper bound on the httng tme on the unloop varant multpled by z gves an upper bound on the httng tme of the orgnal Markov chan. 3. Evolutonary Process In ths secton we consder a smple model of evolutonary process, where organsms/genotypes are represented as strngs of length L, and vew evoluton as a dscrete tme process. For smplcty, we wll frst consder the case of bt strngs and present all our results wth bt strngs because all the key proof deas are llustrated there. We wll then generalze our results to strngs for any alphabet sze n Secton 7. For a bt strng s, at any tme pont a random mutaton can appear wth probablty u, whch wll nvert a sngle bt of the strng s. Such mutatons can be vewed as transtons between genotypes whch form a random walk n the L-dmensonal genotypc space of all 2 L strngs. Notatons. For L N, we denote by B(L) the set of all L-bt strngs. Gven a strng s B(L), the neghborhood Nh(s) of s s the set of strngs that dffer from s by only one bt,.e., Nh(s) = {s B(L) : s, s dffer n exactly one poston}. In order to model natural selecton, we wll consder a constant selecton ntensty β R and each strng s wll be assocated wth a ftness accordng to a ftness functon f(s) R. The selecton ntensty and the ftness functon wll determne the transton probabltes between s and ts neghbors. Transton probablty between strngs. Gven a strng s and s Nh(s), the transton probablty (s, s ) from s to s depends () on the ftness of s and the ftness of the neghbors n Nh(s), and () the selecton ntensty. For all s Nh(s), let df (s, s ) = (f (s ) f (s)) denote the dfference n ftness of s and s, and let g(s, s ) = +e β df (s,s ). Then the transton probablty s defned as follows: (s, s g(s, s ) ) = u s Nh(s) g(s, s ) () The ntutve descrpton of the transton probablty (whch s refered as Ferm process) s as follows: the term u represents the probablty of a mutaton occurrng n s, whle the choce of the neghbor s s based on a normalzed weghted sum, wth each sgmod term +e β df (s,s ) beng determned by the ftness dfference between s, s and the selecton ntensty. The selecton ntensty acts lke the temperature functon. The hgh values of the selecton ntensty wll favor those transtons to neghbors that have hgher ftness, whle 8

Discovery time. Given a string space B(L), a fitness function f and a selection intensity β, for two strings s_1, s_2 ∈ B(L) we denote by T(s_1, s_2, f, β) the expected discovery time of the target string s_1 from the starting string s_2, i.e., the average number of steps necessary to transform s_2 into s_1 under the fitness landscape f and selection intensity β. Given a start string s_2 and a target set U of strings, we denote by T(U, s_2, f, β) the expected discovery time of the target set U starting from the string s_2, i.e., the average number of steps necessary to transform s_2 into some string in U. In the following sections we present several lower and upper bounds on the discovery time, depending on the fitness function and the selection intensity.

Moran evolutionary process. The evolutionary process described above is the Fermi process, where the transition probabilities are chosen according to the Fermi function and the fitness difference. For mathematically elegant proofs we first present lower and upper bounds for the Fermi evolutionary process, and then argue how the bounds transfer easily to the Moran evolutionary process.

4. Neutral Selection

In this section we consider the case of neutral selection, where the transition probabilities are independent of the fitness function. Since β = 0, for all strings s the transition probability equation (Eqn. 1) simplifies to π(s, s') = u/L for all s' ∈ Nh(s). We present an exponential lower bound on the discovery time of a set of targets concentrated around the sequence 0^L; we refer to this case as a broad peak. For a constant 0 < c < 1, let U_c^L denote the set of all strings in which at most c·L bits are ones (i.e., at least (1−c)·L bits are zeros). In other words, U_c^L is the set of strings that have Hamming distance at most c·L from 0^L. We consider the set U_c^L as the target set. Because selection is neutral, the fitness landscape is immaterial, and for the rest of this section we drop the last two arguments of T(·, ·, f, β), since β = 0 and the discovery time is independent of f.

We model the evolutionary process as a Markov chain on a line, M_{L,0} = (S, δ_0) (the subscript 0 stands for neutral), obtained as follows: by symmetry, all strings that have exactly i ones and L−i zeros form an equivalence class, represented as state i of the Markov chain. The transition probabilities from state i are as follows: (i) for 0 < i < L we have δ_0(i, i−1) = u·i/L and δ_0(i, i+1) = u·(L−i)/L; (ii) δ_0(0, 1) = u; and (iii) δ_0(L, L−1) = u. Then we have the following equivalence: for a string s in B(L) \ U_c^L, the discovery time T(U_c^L, s) from s to the set U_c^L under neutral selection is the same as the hitting time H(cL, i) in the Markov chain on a line M_{L,0}, where s has exactly i ones. Each state has a self-loop with probability 1−u, and we ignore the self-loop probabilities (i.e., set u = 1), because by Lemma S5 all lower bounds on the hitting time for the unloop variant are valid for the original Markov chain, and all upper bounds on the hitting time for the unloop variant need to be multiplied by 1/u to obtain upper bounds for the original Markov chain. In other words, we consider the following transition probabilities: (i) for 0 < i < L we have δ_0(i, i−1) = i/L and δ_0(i, i+1) = (L−i)/L; (ii) δ_0(0, 1) = 1; and (iii) δ_0(L, L−1) = 1.

Theorem S1. For all constants c < 1/2, for all string spaces B(L) with L ≥ 4/(1−2c), and for all s ∈ B(L) \ U_c^L, we have T(U_c^L, s) ≥ c_A^{l·L−1}, where A = (3−2c)/4 = 1/2 + (1−2c)/4 > 1/2, c_A = A/(1−A) > 1, and l = (1−2c)/4 > 0.

Proof. We consider the Markov chain M_{L,0} for L ≥ 4/(1−2c), and let i_1 be the midpoint between cL and L/2, i.e., i_1 = (1+2c)/4·L. Such a midpoint exists since L ≥ 4/(1−2c). Then for all j such that cL ≤ j ≤ i_1 we have δ_0(j, j+1) = (L−j)/L ≥ (L−i_1)/L = (3−2c)/4 = A > 1/2. The first inequality holds since j ≤ i_1, while the second inequality is due to c < 1/2. We now use Corollary S1 (item 1) for M_{L,0} with n_1 = x = cL, y = i_1, and k = ((1+2c)/4 − c)·L = l·L, and vary n_2 from x+1 to L, to obtain that H(n_1, n_2) ≥ c_A^{l·L−1}; hence for all s ∈ B(L) \ U_c^L we have T(U_c^L, s) ≥ c_A^{l·L−1}.
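To make the neutral-selection bound concrete, here is a small Python sketch (an illustration, not part of the original proofs) that builds the unlooped chain M_{L,0}, computes the hitting time H(cL, L) exactly via the recurrence of Lemma S1, and compares it with the lower bound c_A^{lL−1} of Theorem S1.

    from fractions import Fraction

    def neutral_hitting_time(L, c):
        """Exact H(cL, L) for the unlooped neutral chain M_{L,0} (via Lemma S1)."""
        n1 = int(c * L)
        # delta0(i, i-1) = i/L, delta0(i, i+1) = (L-i)/L, delta0(L, L-1) = 1
        b = [Fraction(1)]                                  # b_0 = 1 / delta0(L, L-1)
        for j in range(1, L - n1):
            k = L - j
            right, left = Fraction(L - k, L), Fraction(k, L)
            b.append((1 + right * b[j - 1]) / left)
        return float(sum(b))                               # H(n1, L) = sum of all b_j

    c = 0.25
    A = (3 - 2 * c) / 4                                    # drift bound of Theorem S1
    cA, l = A / (1 - A), (1 - 2 * c) / 4
    for L in (20, 40, 60):
        print(L, neutral_hitting_time(L, c), cA ** (l * L - 1))
    # The exact hitting time grows exponentially with L and stays above the bound.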

Four-letter alphabet. As in typical cases in biology the alphabet size is four (e.g., DNA, RNA), we state the analog of Theorem S1 for a four-letter alphabet. Later in this document, Theorem S4 states the general case for an arbitrary alphabet. Consider the alphabet {0, 1, 2, 3}. We can again consider a Markov chain on a line M_{L,0}, where its i-th state encodes all the strings in B(L) which differ from the target string t in exactly i positions. Consider a string s that corresponds to the i-th state of M_{L,0}, for 0 < i < L. Then we have the following cases:

- There are exactly i neighbors of s in state i−1, since in each position among the i positions where s does not agree with t, there is exactly one mutation that will make s and t match in that position.
- There are exactly 3·(L−i) neighbors of s in state i+1, since in each position among the L−i positions in which s agrees with t, there are 3 mutations that will make s disagree with t in that position.
- There are exactly 2·i neighbors of s in state i, since in each position among the i positions where s does not agree with t, there are 2 mutations that preserve this disagreement.

Based on the above analysis and Equation 1, the following holds for the transition probabilities of M_{L,0}:

    δ_0(i, i+1) / δ_0(i, i−1) = 3·(L−i)/i,

while for δ_0(i, i) we have:

    δ_0(i, i) = 2i / (i + 3·(L−i) + 2i) = 2i/(3L) = (2/3)·(i/L),

which is maximized when i = L, giving δ_0(L, L) = 2/3, a constant.

Theorem S2. For a four-letter alphabet the following assertions hold:

1. If c < 3/4, then there exists L_0 ∈ N such that for all string spaces B(L) with L ≥ L_0, for all s ∈ B(L) \ U_c^L, we have T(U_c^L, s) ≥ 2^{((3−4c)·L/16)·log_2(6/(4+3c))}; and
2. If c ≥ 3/4, then for all string spaces B(L), for all s ∈ B(L) \ U_c^L, we have T(U_c^L, s) = O(L²).

Proof. We prove each item separately.

1. We consider the Markov chain M_{L,0} for L ≥ L_0 = 16/(3−4c). Consider the midpoint i_1 between cL and 3L/4, i.e., i_1 = (3+4c)/8·L (such a midpoint exists because L ≥ L_0 and the choice of c). For all cL ≤ j ≤ i_1 we have:

    δ_0(j, j+1)/δ_0(j, j−1) = 3·(L−j)/j ≥ 3·(L−i_1)/i_1 = 3·(5−4c)/(3+4c) ≥ 6/(4c+3) > 1,

since c < 3/4. We now use Lemma S3 for M_{L,0} with n_1 = x = cL, y = i_1, and k = ((3−4c)/8)·L = l·L, and vary n_2 from x+1 to L, to obtain that H(n_1, n_2) ≥ (6/(4c+3))^{l·L−1}; hence for all s ∈ B(L) \ U_c^L we have T(U_c^L, s) ≥ 2^{((3−4c)·L/8 − 1)·log_2(6/(4+3c))} ≥ 2^{((3−4c)·L/16)·log_2(6/(4+3c))}.

2. We consider the Markov chain M_{L,0}. For every cL < j < L we have:

    δ_0(j, j+1)/δ_0(j, j−1) = 3·(L−j)/j ≤ 3·(L−cL)/(cL) = 3·(1−c)/c ≤ 1,

since c ≥ 3/4. Thus for all cL < j < L we have δ_0(j, j−1) ≥ δ_0(j, j+1), and 1 − δ_0(j, j) ≥ 1/3. Then, by Lemma S4, we have that H(cL, n_2) = O(L²) for all n_2 > cL. We conclude that T(U_c^L, s) = O(L²) for all s ∈ B(L) \ U_c^L.
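The neighbor counts used above (i distance-decreasing, 3(L−i) distance-increasing, and 2i lateral mutations) are easy to verify by brute force. The following Python sketch (an illustration, not from the original text) enumerates all one-letter mutants of a random string over {0, 1, 2, 3} and tallies how the Hamming distance to the target changes.

    import random
    from collections import Counter

    def mutation_tally(s, t):
        """Counts one-letter mutants of s by the change in Hamming distance to t."""
        dist = lambda a, b: sum(x != y for x, y in zip(a, b))
        d0 = dist(s, t)
        tally = Counter()
        for pos in range(len(s)):
            for letter in "0123":
                if letter != s[pos]:
                    mutant = s[:pos] + letter + s[pos + 1:]
                    tally[dist(mutant, t) - d0] += 1
        return tally

    L = 12
    t = "0" * L
    s = "".join(random.choice("0123") for _ in range(L))
    i = sum(x != "0" for x in s)
    print(mutation_tally(s, t))          # {-1: i, +1: 3*(L-i), 0: 2*i}
    print(i, 3 * (L - i), 2 * i)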

Supplementary Figure 3: Neutral selection with broad peaks. The figure shows that when the target set is U_c^L, the set of strings with at most c·L ones (blue in panel (a)), for c < 1/2, there is a region of length l·L, from c·L to the midpoint between c·L and L/2, in which the transition probability to the right is at least a constant A = 1/2 + (1−2c)/4 > 1/2; this region contributes the exponential hitting time to the target set. Panel (b) shows the comparison of the exponential times for multiple targets and for a single target under neutral selection.

l L Mdpont between cl and L +e β 0 cl cl + L 2 L A > 2 L +e β (a) 2 2 2 2 0 cl L 2 L L +e β (b) Supplementary Fgure 4: Constant selecton wth broad peaks. The fgure shows the llustraton of the dchotomy theorem. The blue regon represents the states that correspond to targets, whle the green regon depcts the states where the transton probablty to the left s greater than 2. Intutvely gven a selecton ntensty β, the selecton ntensty allows to reach the regon L n polynomal tme. +e β In fgure (a), there exsts a regon between the blue and green, of length l L, where the probablty of transtonng to the rght s a constant, greater than 2. In other words, when the blue and green regon do not overlap, n the md-regon between the blue and green regon the transton probablty to the rght s at least A > 2, and hence the httng tme s exponental. When β and c are large enough so that the two regons overlap (fgure (b)), then all transtons to the left tll the target set s at least 2, and hence the httng tme s polynomal. 2

5. Constant Fitness Difference Function

In this section we consider the case where the selection intensity β > 0 is positive and the fitness function is linear. For a string s, let h(s) denote the number of ones in s, i.e., the Hamming distance from the string 0^L. We consider a linear fitness function f such that for two strings s and s' ∈ Nh(s) we have df(s, s') = f(s') − f(s) = −(h(s') − h(s)): the difference in fitness is constant and depends negatively on the Hamming distance. In other words, strings closer to 0^L have greater fitness, and the fitness changes linearly with coefficient −1. We call the fitness function with constant difference the linear fitness function. Again we consider a broad peak of targets U_c^L, for some constant 0 < c < 1/2. Since we consider all strings in U_c^L as the target set, it follows that for all strings s ∈ B(L) \ U_c^L, the difference between the Hamming distances of s and of s' ∈ Nh(s) from 0^L equals the difference of their Hamming distances from the target set U_c^L. As in the neutral case, due to the symmetry of the linear fitness function f, we construct an equivalent Markov chain on a line, denoted M_{L,β} = (S, δ_β), as follows: state i of the Markov chain represents strings with exactly i ones, and we have the following transition function: (i) δ_β(0, 1) = 1; (ii) δ_β(L, L−1) = 1; and (iii) for 0 < i < L,

    δ_β(i, i+1) = 1 / (1 + e^{β}·i/(L−i));    δ_β(i, i−1) = 1 / (1 + e^{−β}·(L−i)/i)

(see also the technical appendix for the derivation of the above probabilities). Again the discovery time corresponds to the hitting time in the Markov chain M_{L,β}. Note that again we have ignored the self-loops of probability 1−u: by Lemma S5, all lower bounds on the hitting time for the unloop variant are valid for the original Markov chain, and all upper bounds on the hitting time for the unloop variant need to be multiplied by 1/u to obtain upper bounds for the original Markov chain.

We will present a dichotomy result: the first result shows that if c·(1 + e^β) < 1, for selection intensity β > 0, then the discovery time is exponential, while the second result shows that if c·(1 + e^β) ≥ 1, then the discovery time is polynomial. We first present the two lemmas.

Lemma S6. For the linear fitness function f, for all selection intensities β > 0 and all constants c ≤ 1/2 such that c·v < 1, where v = 1 + e^β, there exists L_0 ∈ N such that for all string spaces B(L) with L ≥ L_0, for all s ∈ B(L) \ U_c^L, we have T(U_c^L, s, f, β) ≥ c_A^{l·L−1}, where A = 1/2 + v·(1−cv)/(2·(v·(cv+2−2c) + v − 2)) > 1/2, c_A = A/(1−A) > 1, and l = (1−cv)/(2v) > 0.

Proof. We consider the Markov chain M_{L,β} for L ≥ L_0 = 2v/(1−cv). Consider the midpoint i_1 between cL and L/v, i.e., i_1 = (1+cv)/(2v)·L (such a midpoint exists because L ≥ L_0 and the choice of c). For all cL ≤ j ≤ i_1 we have:

    δ_β(j, j+1) = 1/(1 + e^β·j/(L−j)) ≥ 1/(1 + e^β·i_1/(L−i_1)) = 1/(1 + (v−1)·(1+cv)/(2v−1−cv)) = 1/2 + v·(1−cv)/(2·(v·(cv+2−2c) + v − 2)) = A > 1/2.

The first inequality holds since j ≤ i_1; the second equality is obtained since e^β = v−1 and by substituting i_1 with its value (1+cv)/(2v)·L; the remaining equalities are simple calculation. The final inequality holds because (i) cv < 1 implies that the numerator v·(1−cv) is strictly positive, and (ii) c ≤ 1/2 and cv ≥ 0 imply cv+2−2c ≥ 1, and since v > 2, it follows that v·(cv+2−2c) + v − 2 > 2 > 0; this establishes that the term added to 1/2 in A is strictly positive. We now use Corollary S1 (item 1) for M_{L,β} with n_1 = x = cL, y = i_1, and k = ((1−cv)/(2v))·L = l·L, and vary n_2 from x+1 to L, to obtain that H(n_1, n_2) ≥ c_A^{l·L−1}; hence for all s ∈ B(L) \ U_c^L we have T(U_c^L, s, f, β) ≥ c_A^{l·L−1}.
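As a numerical illustration of the dichotomy (not part of the original proofs), the following Python sketch builds the unlooped chain M_{L,β}, computes the exact hitting time H(cL, L) with the recurrence of Lemma S1, and prints it for a sub-threshold and a super-threshold choice of c·(1+e^β); the parameter values are hypothetical.

    import math

    def fermi_line_hitting_time(L, c, beta):
        """Exact H(cL, L) in the unlooped chain M_{L,beta} for the linear fitness function."""
        n1 = int(c * L)
        eb = math.exp(beta)
        b = [1.0]                                            # b_0 = 1 / delta_beta(L, L-1) = 1
        for j in range(1, L - n1):
            k = L - j
            right = 1.0 / (1.0 + eb * k / (L - k))           # delta_beta(k, k+1)
            left = 1.0 - right                               # no self-loops for bit strings
            b.append((1.0 + right * b[j - 1]) / left)
        return sum(b)

    L, beta = 200, 1.0
    for c in (0.05, 0.4):                                    # c*(1+e^beta): about 0.19 vs 1.49
        print(c, c * (1 + math.exp(beta)), fermi_line_hitting_time(L, c, beta))
    # Below the threshold the hitting time is larger by many orders of magnitude;
    # above the threshold it is only O(L^2).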

Lemma S7. For all string spaces B(L), for all c < 1/2 and the linear fitness function f, for all selection intensities β > 0 with c·(1 + e^β) ≥ 1, for all s ∈ B(L) \ U_c^L we have T(U_c^L, s, f, β) = O(L²).

Proof. We consider the Markov chain M_{L,β}, where β is such that c ≥ 1/(1+e^β). For every cL < j < L we have:

    δ_β(j, j−1) = 1/(1 + e^{−β}·(L−j)/j) ≥ 1/(1 + e^{−β}·(L−cL)/(cL)) = 1/(1 + e^{−β}·(1−c)/c) ≥ 1/2.

The first inequality holds because (L−j)/j ≤ (L−cL)/(cL) since cL < j; the second inequality holds since c·(1+e^β) ≥ 1, which implies e^β ≥ (1−c)/c, and hence 1 + e^{−β}·(1−c)/c ≤ 2. Thus for all cL < j < L we have δ_β(j, j−1) ≥ 1/2, and by Corollary S1 (item 2) we have that H(cL, n_2) = O(L²) for all n_2 > cL. Thus we conclude that T(U_c^L, s, f, β) = O(L²) for all s ∈ B(L) \ U_c^L. The desired result follows.

Lemmas S6 and S7 yield the following dichotomy.

Theorem S3. For the linear fitness function f, selection intensity β > 0, and constant c ≤ 1/2, the following assertions hold:

1. If c·(1 + e^β) < 1, then there exists L_0 ∈ N such that for all string spaces B(L) with L ≥ L_0, for all s ∈ B(L) \ U_c^L we have T(U_c^L, s, f, β) ≥ c_A^{l·L−1}, where A = 1/2 + v·(1−cv)/(2·(v·(cv+2−2c) + v − 2)) > 1/2, c_A = A/(1−A) > 1 and l = (1−cv)/(2v) > 0, with v = 1 + e^β.
2. If c·(1 + e^β) ≥ 1, then for all string spaces B(L), for all s ∈ B(L) \ U_c^L, we have T(U_c^L, s, f, β) = O(L²).

6. Moran Process Model

In the previous section we considered constant selection intensity with the Fermi process. We now discuss how, from the results of the previous section, we can obtain similar results for the Moran process of evolution.

Basic Moran process description. A population of N individuals mutates with probability u in each round, i.e., mutants are introduced at rate N·u. Consider that the population is currently in state i (which represents all bit strings with exactly i ones): the probability that the next state is i−1 is the rate at which an (i−1)-mutant is introduced, times the fixation probability of the mutant in the population. Formally, the transition probability matrix δ_M (M for Moran process) for the Markov chain on a line under the Moran process is as follows: (1) δ_M(i, i−1) = N·u·(i/L)·ρ_{i,i−1}; (2) δ_M(i, i+1) = N·u·((L−i)/L)·ρ_{i,i+1}; (3) δ_M(i, i) = 1 − δ_M(i, i−1) − δ_M(i, i+1). We assume that N·u < 1, and ρ_{i,j} is the fixation probability of a j-mutant in a population of N individuals of type i. In particular,

    ρ_{i,j} = (1 − f_i/f_j) / (1 − (f_i/f_j)^N),

and ρ_{i,j} ∈ (0, 1) for positive fitnesses f_i and f_j, where f_i (resp. f_j) denotes the fitness of strings with exactly i (resp. j) ones. We first show a bound for the self-loop probabilities δ_M(i, i): since strings closer to the target have greater fitness, we have f_{i−1} ≥ f_i, and hence the probability of fixation of an (i−1)-mutant in a population of type i is at least 1/N. Thus we have

    δ_M(i, i−1) = N·u·(i/L)·ρ_{i,i−1} ≥ N·u·(i/L)·(1/N) = u·i/L ≥ u/L.

Then δ_M(i, i) ≤ 1 − δ_M(i, i−1) ≤ 1 − u/L, and 1/(1 − δ_M(i, i)) ≤ L/u. Hence we consider the unloop variant of the Markov chain, and by Lemma S5 all lower bounds on the discovery time for the unloop variant hold for the original Markov chain, while the upper bounds for the unloop variant need to be multiplied by L/u to obtain

the upper bounds for the original Markov chain. Hence, if we consider the unloop variant of the Markov chain on a line obtained from the Moran process, we have:

    δ'_M(i, i−1) = δ_M(i, i−1) / (δ_M(i, i−1) + δ_M(i, i+1)) = 1 / (1 + ((L−i)/i)·(ρ_{i,i+1}/ρ_{i,i−1})),

and δ'_M(i, i+1) = 1 − δ'_M(i, i−1). We now consider the case of a multiplicative fitness function.

Multiplicative fitness rates. We consider the case of a multiplicative fitness function, where f_{i−1}/f_i = r_i ≥ 1 for all i, i.e., the fitness increases as we move closer to the target. Then

    ρ_{i,i+1}/ρ_{i,i−1} = [(1 − r_{i+1})/(1 − r_{i+1}^N)] · [(1 − r_i^{−N})/(1 − r_i^{−1})],

and δ'_M(i, i−1) = 1/(1 + ((L−i)/i)·(ρ_{i,i+1}/ρ_{i,i−1})). For a constant factor r_i = r for all i, we obtain

    ρ_{i,i+1}/ρ_{i,i−1} = [(1−r)·(1 − r^{−N})] / [(1 − r^N)·(1 − r^{−1})] = r^{−(N−1)},

and hence

    δ'_{M,r}(i, i−1) = 1 / (1 + ((L−i)/i)·r^{−(N−1)}).

Let us denote by δ'_{M,r} the transition function of the unloop variant of the Markov chain on a line for the Moran process with multiplicative constant r. Then we have the following cases:

1. (Neutral case). In the neutral case we have r = 1, and the Markov chain with transition probabilities δ'_{M,1} coincides with the Markov chain M_{L,0} with transition probabilities δ_0 of Section 4 for neutral selection.
2. (Constant multiplicative fitness r). The transition probabilities δ'_{M,r}(i, i−1) have the same form as the transition probabilities of the Markov chain M_{L,β} under positive selection intensity and the linear fitness function of Section 5. In particular, for e^β = r^{N−1}, the transition function δ_β of M_{L,β} is the same as δ'_{M,r}, and thus from the results of Section 5 we obtain similar results for the Moran process.

Summary of results for the Moran process with multiplicative fitness landscape. From the results of Section 4 and Section 5, and the equivalence of the transition probabilities of the Markov chains in Sections 4 and 5 with those of the Moran process, we obtain the following results for the Moran process of evolution under a constant multiplicative fitness landscape r:

1. (Single target). For a single target, for all constants r and population sizes N, the discovery time from any non-target string to the target is exponential in the length of the bit strings.
2. (Broad peaks). For broad peaks with a constant fraction c of clustered targets, with c ≤ 1/2: if c·(1 + r^{N−1}) < 1, then the discovery time from any non-target string to the target set is at least exponential in the length L of the bit strings; and if c·(1 + r^{N−1}) ≥ 1, then the discovery time from any non-target string to the target set is at most O(L³/u) (i.e., polynomial).

The polynomial discovery time for a broad peak surrounded by a fitness slope requires the slope to extend to a Hamming distance greater than 3L/4. What happens, then, if the slope only extends to a certain maximum distance less than 3L/4? Suppose the fitness gain only arises if the sequence differs from the specific sequence in not more than a fraction s of positions. Formally, we can consider any fitness function f that assigns zero fitness to sequences that are at a Hamming distance of at least sL from the specific sequence.
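The reduction from the Moran process to the Fermi chain rests on the identity ρ_{i,i+1}/ρ_{i,i−1} = r^{−(N−1)} for a constant multiplicative fitness factor r. The following Python sketch (an illustration, not from the original text; the parameter values are hypothetical) computes the unlooped Moran transition probabilities from the fixation probabilities and checks numerically that they coincide with δ_β for e^β = r^{N−1}.

    def rho(f_res, f_mut, N):
        """Fixation probability of a single mutant of fitness f_mut in a resident
        population of N individuals of fitness f_res (Moran process)."""
        g = f_res / f_mut
        return (1 - g) / (1 - g ** N)

    def moran_unlooped_left(i, L, r, N):
        """delta'_{M,r}(i, i-1): unlooped probability of moving one step closer to the target."""
        f = lambda k: r ** (L - k)          # multiplicative fitness, larger closer to the target
        down = (i / L) * rho(f(i), f(i - 1), N)     # the common factor N*u cancels below
        up = ((L - i) / L) * rho(f(i), f(i + 1), N)
        return down / (down + up)

    def fermi_left(i, L, e_beta):
        """delta_beta(i, i-1) of the Fermi chain M_{L,beta}."""
        return 1.0 / (1.0 + (1.0 / e_beta) * (L - i) / i)

    L, r, N = 30, 1.05, 50
    for i in (5, 15, 25):
        print(moran_unlooped_left(i, L, r, N), fermi_left(i, L, r ** (N - 1)))
    # The two columns agree, illustrating the correspondence e^beta = r^(N-1).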

Supplementary Figure 5: Broad peak with different fitness landscapes. For the broad peak there is a specific sequence, and all sequences that are within Hamming distance cL of it are part of the target set. (A) If the fitness landscape is flat outside the broad peak and c < 3/4, then the discovery time is exponential in the sequence length L. (B) If the broad peak is surrounded by a multiplicative fitness landscape whose slope extends over the whole sequence space, then the discovery time is either polynomial or exponential in L, depending on whether or not c·(1 + r^{N−1}/3) ≥ 1. (C) If the fitness slope extends only to a Hamming distance less than 3L/4, then the discovery time is exponential in L. (D) Numerical calculations for broad peaks surrounded by flat fitness landscapes. We observe exponential discovery times for c = 1/3 and c = 1/2. (E) Numerical calculations for broad peaks surrounded by multiplicative fitness landscapes. The broad peak extends to c = 1/6 and the slope of the fitness landscape to s = 1/2. The discovery time is exponential, because s < 3/4. The fitness gain is r = 1.01 and the population size is as indicated. As the population size N increases, the discovery time converges to that of a broad peak with c = 1/2 embedded in a flat fitness landscape.

Now our previous result for neutral drift with a broad peak applies. Since we must rely on neutral drift until the fitness gain arises, the discovery time in this fitness landscape is at least as long as the discovery time for neutral drift with a broad peak of size c = s. If s < 3/4, then the expected discovery time starting from any sequence outside the fitness gain region is exponential in L. Tables 1 and 2 illustrate how setting c = s under-approximates the expected discovery time.

    r = 1.01            N = 10^2           N = 5·10^2         N = 10^3           N = 10^4           N = ∞
    s = 1/3, c = 1/2    2.872592×10^337    6.49382×10^70      5.893335×10^70     5.89566×10^70      5.89566×10^70
    s = 1/3, c = 1/6    5.962263×10^260    6.49382×10^70      5.893335×10^70     5.89566×10^70      5.89566×10^70
    s = 1/2, c = 1/2    3.28507×10^264     1.307607×10^65     1.285938×10^65     1.285790×10^65     1.285790×10^65
    s = 1/2, c = 1/6    1.396805×10^88     1.307607×10^65     1.285938×10^65     1.285790×10^65     1.285790×10^65

Supplementary Table 1: Numerical data for the discovery time of broad peaks embedded in multiplicative fitness landscapes. The width of the broad peak is either c = 1/2 or c = 1/6, and L = 1000. The fitness slope extends to s = 1/3 or s = 1/2. The data are extrapolated from numbers obtained for small values of L. For population sizes N = 1000 and greater, there is no difference between the discovery times for c = 1/6 and c = 1/2. For N → ∞, the discovery time for a particular s converges to the discovery time for a broad peak with c = s embedded in a flat fitness landscape.

Figure 5 gives a pictorial illustration of all the above scenarios.

7. General Alphabet

In the previous sections we presented our results for L-bit strings. In this section we consider the case of a general alphabet, where every sequence consists of letters from a finite alphabet Σ. Thus B(L) is the space of all L-tuple strings in Σ^L. We fix a letter σ ∈ Σ and consider a target set U_c^L consisting of all the L-tuple strings such that every s ∈ U_c^L differs from the target string t = σ^L (the string of all σ's) in at most cL positions (i.e., has Hamming distance at most c·L from the target string t). We prove a dichotomy result that generalizes Theorem S3. We can again consider a Markov chain on a line M_{L,β}, where its i-th state encodes all the strings in B(L) which differ from t in exactly i positions. Consider a string s that corresponds to the i-th state of M_{L,β}, for 0 < i < L. Then we have the following cases:

- There are exactly i neighbors of s in state i−1, since in each position among the i positions where s does not agree with t, there is exactly one mutation that will make s and t match in that position.
- There are exactly (L−i)·(|Σ|−1) neighbors of s in state i+1, since in each position among the L−i positions in which s agrees with t, there are |Σ|−1 mutations that will make s disagree with t in that position.
- There are exactly i·(|Σ|−2) neighbors of s in state i, since in each position among the i positions where s does not agree with t, there are |Σ|−2 mutations that preserve this disagreement.

Let us denote |Σ| = 1 + κ, where κ ≥ 1. Based on the above analysis and Equation 1, the following holds for the transition probabilities of M_{L,β}:

    δ_β(i, i+1)/δ_β(i, i−1) = [κ·(L−i)/(1+e^β)] / [i/(1+e^{−β})] = κ·(L−i)/(i·e^β),

while for δ_β(i, i) we have two cases:

- (κ = 1): Then δ_β(i, i) = 0, since every mutation changes the distance from t.
- (κ > 1): Then by Equation 1, for 0 < i < L:

    δ_β(i, i) = 1 / (1 + 2/((κ−1)·(1+e^{−β})) + 2·(L−i)·κ/(i·(κ−1)·(1+e^β))),

which is maximized when i = L, giving δ_β(L, L) = 1/(1 + 2/((κ−1)·(1+e^{−β}))), a constant for a fixed alphabet Σ.

Lemma S8. For the linear fitness function f, for all selection intensities β ≥ 0 and all constants c ≤ κ/(κ+1) such that c·v < 1, where v = 1 + e^β/κ, there exists L_0 ∈ N such that for all string spaces B(L) with L ≥ L_0, for all s ∈ B(L) \ U_c^L, we have T(U_c^L, s, f, β) ≥ A^{l·L−1}, where A = κ·(2v−1−cv)/((1+cv)·e^β) > 1 and l = (1−cv)/(2v).

Proof. We consider the Markov chain M_{L,β} for L ≥ L_0 = 2v/(1−cv). Consider the midpoint i_1 between cL and L/v, i.e., i_1 = (1+cv)/(2v)·L (such a midpoint exists because L ≥ L_0 and the choice of c, as i_1 > cL). For all cL ≤ j ≤ i_1 we have:

    δ_β(j, j+1)/δ_β(j, j−1) = κ·(L−j)/(j·e^β) ≥ κ·(L−i_1)/(i_1·e^β) = κ·(2v−1−cv)/((1+cv)·e^β) = A > 1.

The first inequality holds because j ≤ i_1 and thus (L−j)/j ≥ (L−i_1)/i_1. The equalities follow by simple rewriting, while A > κ·(2v−2)/(2·e^β) = κ·(v−1)/e^β = 1, since cv < 1. We now use Lemma S3 for M_{L,β} with n_1 = x = cL, y = i_1, and k = ((1−cv)/(2v))·L = l·L, and vary n_2 from x+1 to L, to obtain that H(n_1, n_2) ≥ A^{l·L−1}; hence for all s ∈ B(L) \ U_c^L we have T(U_c^L, s, f, β) ≥ A^{l·L−1}.

Lemma S9. For all string spaces B(L), for all c ≤ κ/(κ+1) and the linear fitness function f, for all selection intensities β ≥ 0 with c·v ≥ 1, where v = 1 + e^β/κ, for all s ∈ B(L) \ U_c^L we have T(U_c^L, s, f, β) = O(L²/M), where M = min_{0 < i ≤ L} (1 − δ_β(i, i)) = 2/(2 + (κ−1)·(1+e^{−β})).

Proof. We consider the Markov chain M_{L,β}, where β is such that c·v ≥ 1. For every cL < j < L we have:

    δ_β(j, j−1)/δ_β(j, j+1) = j·e^β/(κ·(L−j)) ≥ cL·e^β/(κ·(L−cL)) = c·e^β/(κ·(1−c)) ≥ 1.

The first inequality holds because cL < j; the second inequality holds because c·(1 + e^β/κ) ≥ 1, and thus c·e^β/κ ≥ 1−c. Thus for all cL < j < L we have δ_β(j, j−1) ≥ δ_β(j, j+1), while M = min_{0 < i ≤ L} (1 − δ_β(i, i)) = 2/(2 + (κ−1)·(1+e^{−β})). Then, by Lemma S4, we have that H(cL, n_2) = O(L²/M) for all n_2 > cL. We conclude that T(U_c^L, s, f, β) = O(L²/M) for all s ∈ B(L) \ U_c^L. The desired result follows.

Lemmas S8 and S9 yield the following dichotomy (recall that |Σ| = 1 + κ):

Theorem S4. For alphabet size |Σ| = 1 + κ, for the linear fitness function f, selection intensity β ≥ 0, and constant c ≤ κ/(κ+1), the following assertions hold:

1. If c·(1 + e^β/κ) < 1, then there exists L_0 ∈ N such that for all string spaces B(L) with L ≥ L_0, for all s ∈ B(L) \ U_c^L, we have T(U_c^L, s, f, β) ≥ A^{l·L−1}, where A = κ·(2v−1−cv)/((1+cv)·e^β) > 1 and l = (1−cv)/(2v), with v = (e^β + κ)/κ; and

2. If c·(1 + e^β/κ) ≥ 1, then for all string spaces B(L), for all s ∈ B(L) \ U_c^L, we have T(U_c^L, s, f, β) = O(L²/M), where M = 2/(2 + (κ−1)·(1+e^{−β})).

Note that Theorem S4 with the special case |Σ| = 2 and κ = 1 gives us Theorem S3.

Corollary S2. For alphabet size |Σ| = 1 + κ, consider the Moran process with a multiplicative fitness landscape with constant factor r, population size N, and mutation rate u. Let c ≤ κ/(κ+1). The following assertions hold:

1. If c·(1 + r^{N−1}/κ) < 1, then there exists L_0 ∈ N such that for all string spaces B(L) with L ≥ L_0, for all s ∈ B(L) \ U_c^L, the discovery time from s to some string in U_c^L is at least A^{l·L−1}, where A = κ·(2v−1−cv)/((1+cv)·r^{N−1}) > 1 and l = (1−cv)/(2v), with v = 1 + r^{N−1}/κ.
2. If c·(1 + r^{N−1}/κ) ≥ 1, then for all string spaces B(L), for all s ∈ B(L) \ U_c^L, the discovery time from s to some string in U_c^L is at most O(L³/(M·u)), where M = 2/(2 + (κ−1)·(1 + r^{−(N−1)})) is a constant.

Explicit bounds for the four-letter alphabet. We now present the explicit calculation of L_0, l and A of Corollary S2 for the four-letter alphabet. For the four-letter alphabet we have κ = 3, and for the exponential lower bound we require c·v < 1. In this case we have c ≤ κ/(κ+1) = 3/4, and

    v = 1 + r^{N−1}/3 = (3 + r^{N−1})/3,    l = (1−cv)/(2v) = (3·(1−c) − c·r^{N−1}) / (6 + 2·r^{N−1}).

Since cv < 1 we have L_0 = 2v/(1−cv) = (6 + 2·r^{N−1})/(3·(1−c) − c·r^{N−1}), and

    A = κ·(2v−1−cv)/((1+cv)·r^{N−1}) = 3·(3 + 2·r^{N−1} − 3c − c·r^{N−1}) / ((3 + 3c + c·r^{N−1})·r^{N−1}) ≥ 6/(3·(1+c) + c·r^{N−1}),

where the last inequality uses cv < 1. By changing the exponential lower bound to base 2, we have that the discovery time is at least 2^{(l·L−1)·log_2 A}. Thus we have the following two cases:

- Selection: With selection (i.e., r > 1), the exponential lower bound on the discovery time when c·v < 1 is at least

    2^{( ((3·(1−c) − c·r^{N−1})/(6 + 2·r^{N−1}))·L − 1 )·log_2( 6/(3·(1+c) + c·r^{N−1}) )},

  for all L ≥ L_0 = (6 + 2·r^{N−1})/(3·(1−c) − c·r^{N−1}).

- Neutral case: Specializing the above result to the neutral case (i.e., r = 1), we obtain that the exponential lower bound on the discovery time when c·v < 1 is at least

    2^{( ((3−4c)/8)·L − 1 )·log_2( 6/(4c+3) )},

  for all L ≥ L_0 = 8/(3−4c). Ignoring the −1 factor as compared to L, the discovery time is at least approximately 2^{((3−4c)/8·L)·log_2(6/(4c+3))}.
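To get a feel for these explicit constants, the following Python sketch (an illustration, not from the original text; the parameter choices are hypothetical) evaluates the four-letter lower bound 2^{(lL−1)·log_2 A} of Corollary S2, and reports when the dichotomy condition c·(1 + r^(N−1)/3) ≥ 1 places the parameters in the polynomial regime instead.

    import math

    def four_letter_lower_bound(c, r, N, L):
        """log10 of the exponential lower bound of Corollary S2 for kappa = 3, or None
        if c*(1 + r^(N-1)/3) >= 1 (polynomial regime) or L is below L_0."""
        rN = r ** (N - 1)
        v = 1 + rN / 3
        if c * v >= 1:
            return None                                   # polynomial regime: O(L^3 / (M u))
        l = (3 * (1 - c) - c * rN) / (6 + 2 * rN)
        A = 6 / (3 * (1 + c) + c * rN)                    # weakened base, still > 1 when c v < 1
        L0 = (6 + 2 * rN) / (3 * (1 - c) - c * rN)
        if L < L0:
            return None
        return (l * L - 1) * math.log10(A)

    for c, r, N in [(1/6, 1.0, 100), (1/6, 1.01, 100), (1/2, 1.01, 1000)]:
        print(c, r, N, four_letter_lower_bound(c, r, N, L=1000))
    # Prints the guaranteed number of decimal digits of the discovery time, or None
    # when c*(1 + r^(N-1)/3) >= 1 and the dichotomy puts the case in the polynomial regime.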

Discussion of the implications of the results. We now discuss the implications of Corollary S2.

1. First, the corollary implies that for a single target (which intuitively corresponds to c = 0), even with a multiplicative fitness landscape (an exponentially increasing fitness landscape), the discovery time is exponential.

2. The discovery time is polynomial if c·(1 + r^{N−1}/κ) ≥ 1; however, this requires that the slope of the fitness gain extends over the whole sequence space (at least up to Hamming distance (κ/(κ+1))·L).

3. Consider the case where the fitness gain arises only when the sequence differs from the target in not more than a fraction s of positions, i.e., the slope of the fitness function extends only up to a Hamming distance of s·L. Now our result for neutral drift with a broad peak applies. Since we must rely on neutral drift until the fitness gain arises, the discovery time of this process is at least as long as the discovery time for neutral drift with a broad peak of size c = s. If r = 1 (neutral drift), then the discovery time is polynomial if c·(1 + 1/κ) ≥ 1, and otherwise it is exponential. Hence, if the fitness gain arises from Hamming distance s·L and s < κ/(κ+1), then the expected discovery time starting from any sequence outside the fitness gain region is exponential in L. Moreover, there are two further implications of this exponential lower bound. First, note that if r = 1, then r^{N−1} is independent of N, and thus the exponential lower bound is independent of N. Second, note that if the fitness gain arises from Hamming distance s·L, and the process is neutral until the fitness gain region is reached, then the exponential lower bound for s < κ/(κ+1) is also independent of the shape of the fitness landscape after the fitness gain arises. Formally, if we consider any fitness function f that assigns zero fitness to strings that are at Hamming distance at least s·L from the ideal sequence, and any nonnegative fitness value to other strings, then the process is neutral until the fitness gain arises, and the exponential lower bound holds for that fitness landscape, independently of the population size. For a four-letter alphabet (as in the case of RNA and DNA), the critical threshold is thus s = 3/4.

Remark S1. Note that we have shown that all results for bit strings easily extend to any finite alphabet by appropriately changing the constants. For simplicity, in the following sections we present our results for strings over the four-letter alphabet; they again extend easily to any finite alphabet by appropriately changing the constants.

Remark S2. We have established several lower bounds on the expected discovery time. All the lower bounds are obtained from hitting times on Markov chains, and in Markov chains the hitting times are closely concentrated around the expectation. In other words, whenever we establish that the expected discovery time is exponential, it follows that the discovery time is exponential with high probability.

8. Multiple Independent Searches

In this section we consider multiple independent searches. For simplicity we consider strings over the four-letter alphabet; as shown in Section 7, the results extend easily to strings over alphabets of any size.

8.1. Polynomially many independent searches. We will show that if there are polynomially many independent searches starting from a Hamming distance of at least 3L/4, then the probability of reaching the target in polynomially many steps is negligibly small (smaller than the inverse of any polynomial function). We present our results for Markov chains on a line, which imply the results for the evolutionary processes. In all the following lemmas we consider the Markov chain on a line for a four-letter alphabet. Before the formal proof we present informal arguments and the intuition behind the proof.

The basic intuition and steps of the proof. The basic intuition and steps of our proof are as follows:

1. First, we show that in the Markov chain on a line, from any point n_2 ≥ 3L/4, the probability of not reaching 3L/4 within polynomially many steps is very (exponentially) small. The key reason is that we have already shown that the expected hitting time from n_2 to 3L/4 is at most L²; hence the probability of reaching 3L/4 from n_2 within L⁵ steps is very high, and thus the probability of not reaching it within L⁵ steps is very small (see Lemma S10).

2. Second, we show that the contribution to the expected hitting time of the steps beyond L²·2^{L·log L} is at most a constant. The informal reasoning is that, beyond the expected hitting time, the probability of not having reached the target drops exponentially with the number of additional steps. Hence we obtain a geometric series whose sum is bounded by a constant (see Lemma S11).

3. Then we show that if the expected hitting time is exponential, then for all polynomials p_1(·) and p_2(·), the probability of reaching the target within p_1(L) steps is smaller than 1/p_2(L). The key argument is to combine the previous two results to show that if the probability of reaching the target within p_1(L) steps were more than 1/p_2(L), then the expected hitting time would be polynomial, contradicting that the expected hitting time is exponential.

The formal arguments of the above steps yield Theorem S5. We present the formal proof below.

Lemma S10. From any point n_2 ≥ 3L/4, the probability that 3L/4 is not reached within L⁵ steps is exponentially small in L (i.e., at most e^{−L}).

Proof. We have already established that the expected hitting time from n_2 to 3L/4 is at most L². Hence the probability of reaching 3L/4 within L³ steps must be at least 1/L (otherwise the expectation would exceed L²). Since from all states n_2 ≥ 3L/4 the probability of reaching 3L/4 within L³ steps is at least 1/L, the probability that 3L/4 is not reached within k·L³ steps is at most (1 − 1/L)^k. Hence the probability that 3L/4 is not reached within L⁵ steps is at most

    (1 − 1/L)^{L²} ≤ e^{−L}.

The desired result follows.

Lemma S11. The contribution to the expected hitting time of the steps after L²·2^{L·log L} is at most a constant (i.e., O(1)).

Proof. From any starting point, the probability of reaching the target within 2^{L·log L} steps is at least 1 − 1/e (the expected hitting time from any state of the chain is much smaller than 2^{L·log L}). Hence the probability of not reaching the target within k·2^{L·log L} steps is at most e^{−k}, and in particular the probability of not having reached the target after l·L²·2^{L·log L} steps is at most e^{−l·L²}. Thus the contribution to the expectation from the steps beyond L²·2^{L·log L} is at most

    Σ_{l=1}^{∞} (l+1)·L²·2^{L·log L}·e^{−l·L²} = 2^{2·log L + L·log L}·Σ_{l=1}^{∞} (l+1)·e^{−l·L²} ≤ (2^{2·log L + L·log L}/e^{L²})·Σ_{l=1}^{∞} (l+1)·e^{−(l−1)} = O(1),

since the series converges and 2^{2·log L + L·log L} = o(e^{L²}). The desired result follows.

Lemma S12. In all cases where the lower bound on the expected hitting time is exponential, for all polynomials p_1(·) and p_2(·), the probability of reaching the target set from any state n_2 with n_2 ≥ 3L/4 within the first p_1(L) steps is at most 1/p_2(L).

Proof. We first observe that from any start point n_2 ≥ 3L/4, the expected time to reach 3L/4 is at most L², and the probability that 3L/4 is not reached within L⁵ steps is exponentially small (Lemma S10). Hence, if the probability of reaching the target set from 3L/4 within p_1(L) steps were at least 1/p_2(L), then from all states the probability of reaching the target set within L⁵·p_1(L) steps would be at least 1/(L·p_2(L)). In other words, from any state the probability that the target set is not reached within L⁵·p_1(L) steps would be at most 1 − 1/(L·p_2(L)). Hence, from any state, the probability that the target set is not reached within k·L⁵·p_1(L) steps would be at most (1 − 1/(L·p_2(L)))^k. Thus, from any state, the probability that the target set is not reached within L³·p_2(L)·L⁵·p_1(L) steps would be at most

    (1 − 1/(L·p_2(L)))^{L·p_2(L)·L²} ≤ e^{−L²}.

Hence the probability of reaching the target within L⁸·p_1(L)·p_2(L) steps would be at least 1 − e^{−L²}. By Lemma S11, the contribution to the expectation from the steps beyond L²·2^{L·log L} is constant (O(1)).

Hence we would obtain an upper bound on the expected hitting time of

    L^8·p_1(L)·p_2(L) / (1 - e^{-L^2}) + L^2·2^{L log L}·e^{-L^2} + O(1) ≤ L^9·p_1(L)·p_2(L).

Note that the above bound is obtained without assuming that p_1(·) and p_2(·) are polynomial functions. However, if p_1(·) and p_2(·) are polynomial, then we obtain a polynomial upper bound on the hitting time, which contradicts the exponential lower bound. The desired result follows.

Corollary S13. In all cases where the lower bound on the expected hitting time is exponential, let h denote the expected hitting time. Given numbers t_1 and t_2, the probability to reach the target set from any state n_2 such that n_2 ≥ 3L/4 within the first t_1 = h/(L^9·t_2) steps is at most 1/t_2.

Proof. In the proof of Lemma S12 we first established that if the probability to reach the target set within p_1(L) steps is at least 1/p_2(L), then the expected hitting time is at most L^9·p_1(L)·p_2(L) (without assuming that p_1 and p_2 are polynomial). By interpreting t_1 as p_1(L) and t_2 as p_2(L) we obtain that h ≤ L^9·t_1·t_2. The desired result follows.

Theorem S5. In all cases where the lower bound on the expected hitting time is exponential, for all polynomials p_1(·), p_2(·) and p_3(·), for p_3(L) independent multiple searches, the probability to reach the target set from any state n_2 such that n_2 ≥ 3L/4 within the first p_1(L) steps, for any of the searches, is at most 1/p_2(L).

Proof. Consider the polynomial q_2(L) = 2·p_3(L)·p_2(L). Then by Lemma S12, for a single search the probability to reach the target within p_1(L) steps is at most 1/q_2(L). Hence the probability that none of the searches reaches the target in p_1(L) steps is at least

    (1 - 1/q_2(L))^{p_3(L)} ≥ e^{-2·p_3(L)/q_2(L)} = e^{-1/p_2(L)} ≥ 1 - 1/p_2(L),

since e^{-2x} ≤ 1 - x for 0 ≤ x ≤ 1/2. The desired result follows.

Remark S3. Observe that in Theorem S5 the independent searches could start at different starting points and the result still holds, because in all cases where we established an exponential lower bound, the lower bound holds for all starting points outside the target region.

8.2. Probability of hitting in a given number of steps. We now present a simple (and informal) argument to approximate the probability that none of M independent searches succeeds in discovering the target in a given number b of steps, where the expected discovery time for a single search is d, for b ≪ d. The steps of the argument are as follows:

1. First we observe that the expected discovery time is the expected hitting time in a Markov chain, and the probability distribution of the hitting time in a Markov chain is largely concentrated around the expected hitting time d, when the expected hitting time is exponential and the starting state is far away from the target set. The sharp concentration around the expected hitting time is a generalization of the classical Chernoff bound for sums of independent variables: the generalization for Markov chains is obtained by considering the Azuma-Hoeffding inequality for bounded martingales [1, Chapter 7], which shows exponential concentration around the expected hitting time. Note that for Markov chains, the martingale for the expected hitting time is bounded by 1 (as with every step the hitting time changes by at most 1).

2. Given that the probability distribution is concentrated around the mean, the probability that a single search succeeds in b steps is approximately at most b/d, for b ≪ d. (A minimal simulation sketch of steps 1 and 2, with purely illustrative parameters, is given below, before step 3.)
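As a rough illustration of steps 1 and 2 (and not of the actual evolutionary dynamics), the following sketch uses a stand-in hitting time, namely a sum of independent geometric waiting times, which is sharply concentrated around its mean, and checks that the empirical probability of succeeding within b ≪ d steps stays far below the conservative estimate b/d. The helper name and all parameter values are hypothetical choices for this sketch.

    import random

    def stand_in_hitting_time(stages, p, rng):
        # A concentrated stand-in for a discovery time: the sum of `stages`
        # independent geometric waiting times, each with success probability p.
        t = 0
        for _ in range(stages):
            while True:
                t += 1
                if rng.random() < p:
                    break
        return t

    rng = random.Random(1)
    stages, p, runs = 50, 0.02, 1000
    d = stages / p                    # expected discovery time of the stand-in
    b = int(d / 10)                   # time budget b << d
    samples = [stand_in_hitting_time(stages, p, rng) for _ in range(runs)]
    print("empirical P[T <= b]:", sum(t <= b for t in samples) / runs)
    print("conservative estimate b/d:", b / d)

Under such concentration the estimate b/d is generous; the informal argument only needs that the single-search success probability within b steps is of this order or smaller.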

3. Given independent events E_1, E_2, ..., E_M with success probabilities a_1, a_2, ..., a_M, respectively, by independence (which allows us to multiply probabilities) the probability that none of the events succeeds is (1 - a_1)·(1 - a_2)···(1 - a_M). Hence for M independent searches the probability that none of the searches succeeds in b steps, when the expected hitting time for a single search is d, is at least

    (1 - b/d)^M ≈ e^{-M·b/d}.

The above reasoning gives an informal argument for an upper bound on the probability of success of M independent searches in b steps.

9. Distributed Targets

We now discuss several cases of distributed targets for which the exponential lower bounds can be obtained from our results. We discuss the results for a four-letter alphabet.

1. Consider the example of distributed targets where the letters in a given number L_0 of positions are immaterial (e.g., the first four positions, the tenth position and the last four positions are immaterial, and hence L_0 = 9 in this case). Then we can simply apply our results ignoring the positions which are immaterial, i.e., consider the string space of the remaining L - L_0 positions, and apply all results with effective length L - L_0.

2. Consider the example where the target set is as follows: instead of the single target of all σ's (i.e., t = σ^L), the target set contains all sequences that have a segment of σ's of length at least α·L, for α > 1/2. Then all the targets share an overlapping segment of (2α - 1)·L σ's, from position (1 - α)·L to α·L. We can then obtain a lower bound on the discovery time of these targets by considering as target set the superset containing all sequences with σ's in that region. In other words, we can apply our results with a single target but with effective length (2α - 1)·L.

A pictorial illustration of the above two cases is shown in Supplementary Figure 6.

3. We now consider the case of distributed targets that are chosen uniformly at random and independently, and let m ≪ 4^L be the number of distributed targets. Let the selection gradient extend up to a distance of c·L from a target, for c < 3/4. Formally, we consider any fitness landscape f that assigns zero fitness to a string whose Hamming distance exceeds c·L from every target. We consider a starting sequence for the search and argue about the estimate on the expected discovery time. First we consider the Markov chain M defined on B(L), where every string s in B(L) is a state of the Markov chain. The transition probability from a string s to a neighboring string at Hamming distance 1 in Nh(s) is 1/|Nh(s)|. The Markov chain M has the following two properties: it is (i) irreducible, i.e., the whole Markov chain M is a single recurrent class; and (ii) reversible, i.e., if there is a positive transition probability from s to s', there is also a positive transition probability from s' to s. Since M is irreducible and reversible, and due to its symmetric nature, it has a very fast mixing time (the number of steps required to converge to the stationary distribution). In particular, the stationary distribution, which is the uniform distribution over B(L), is reached within O(L log L) steps [2]. Since c < 3/4, the expected time to reach a string from where the selection gradient to a specific target is felt is exponential (by Corollary S2). Thus, given m ≪ 4^L and c < 3/4, a string from where the selection gradient to any target is felt is reached within the first O(L log L) steps with low probability.
Since any string from where the selection gradient to a target is felt is reached within the first O(L log L) steps with low probability, and after O(L log L) steps M converges to the uniform distribution, a lower bound on the expected discovery time can be obtained as follows: consider the probabilistic process that in every step chooses a string in B(L) uniformly at random, and the process succeeds if the chosen string has Hamming distance at most c·L from any of the target sequences. The expected number of steps required for the success of the probabilistic process is a

Supplementary Figure 6: Distributed target examples. Figure (a) shows that if there are positions of the string that are immaterial, then the effective length decreases. Figure (b) considers the case when the evolutionary process searches for a string of length α·L, and it shows that it searches for a single string of length at least (2α - 1)·L.
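As a back-of-the-envelope companion to case 3 above, the following sketch evaluates the union-bound estimate for the per-sample success probability of the uniform-sampling process: the probability that a uniformly random 4-letter string of length L lies within Hamming distance c·L of at least one of m targets. The reciprocal of this probability estimates the expected number of steps of that process. The parameter values L, c and m, and the function name, are purely illustrative assumptions of this sketch.

    from math import comb

    def per_sample_success_bound(L, c, m):
        # Union bound: m times the fraction of the 4^L strings that lie in a
        # Hamming ball of radius floor(c*L) around one fixed target.
        radius = int(c * L)
        ball = sum(comb(L, j) * 3**j for j in range(radius + 1))
        return min(1.0, m * ball / 4**L)

    L, c, m = 100, 0.5, 1000          # illustrative values with c < 3/4
    q = per_sample_success_bound(L, c, m)
    print("per-sample success probability is at most", q)
    print("so on average at least about", 1 / q, "uniform samples are needed")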