NP Completeness of Kauffman's N-k Model, a Tuneably Rugged Fitness Landscape


Edward D. Weinberger

SFI WORKING PAPER:

SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent the views of the Santa Fe Institute. We accept papers intended for publication in peer-reviewed journals or proceedings volumes, but not papers that have already appeared in print. Except for papers by our external faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or funded by an SFI grant.

NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure timely distribution of the scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the author(s). It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may be reposted only with the explicit permission of the copyright holder.

SANTA FE INSTITUTE

Edward D. Weinberger
Max-Planck-Institut für biophysikalische Chemie
Postfach 2841, Am Fassberg
D-3400 Göttingen-Nikolausberg
Federal Republic of Germany

The concept of a "fitness landscape", a picturesque term for a mapping of the vertices of a finite graph to the real numbers, has arisen in several fields, including evolutionary theory. The computational complexity of two qualitatively similar versions of a particularly simple fitness landscape is shown to differ considerably. In one version, the question "Is the global optimum greater than a given value V?" is shown to be answerable in polynomial time by presenting an efficient algorithm that actually computes the optimum. The corresponding problem for the other version of the landscape is shown to be NP complete. The NP completeness of the latter problem leads to some speculations on why $P \neq NP$.

Introduction

The notion of an adaptive "landscape" representing the abstract "fitness" of various kinds of organisms in various contexts has been a fixture of evolutionary biology ever since it was proposed by Sewall Wright (1932). Although there are problems with a notion of "fitness" that is a property of an individual, independent of other individuals and the environment, the discoveries of molecular biology have significantly reinforced the power of this idea. We now understand, for example, the role of a discrete genomic "blueprint" in specifying the chemical constituents of enzymes, and we have a glimmering of how sensitive the fitness of the organism can be to variations in enzyme chemistry, so that it makes sense to identify the specific sequence of nucleotide bases in the genome as the argument of the fitness function. It has also become increasingly clear that the "design" of organisms involves a host of complex trade-offs, implying that there must inevitably be large numbers of local optima in such fitness landscapes.

It was thus only a matter of time before the analogy between optimization (i.e. selection) on landscapes and combinatorial optimization problems appeared in the biological literature. One paper that discussed this analogy (Kauffman & Weinberger, 1989) also proposed a simple statistical model of a fitness landscape, the N-k model, that could be used as an aid to the qualitative understanding of more complex, if more realistic, models. The purpose of the present paper is to consider the computational complexity of optimization on landscapes generated with this simple model. We show, in particular, that the optimization problem for one version of the model can be solved in polynomial time, but that another, qualitatively very similar version of the model is NP complete.

This result makes the N-k model an ideal candidate for investigating the nature of NP completeness. One might think that optimization on N-k landscapes is NP complete because a knowledge

of the fitness of a given configuration says little about the fitness of distant configurations, thus leading to the need for an exhaustive search over a large number of configurations. Part of this conjecture turns out to be true: we can identify a set of pairwise distant configurations whose number grows faster than any polynomial in $N$. However, when we compute the correlation in fitness between pairs of sites in this set, the correlation is actually larger for the NP complete version of the model than for the polynomial time version, thus providing strong evidence that the rest of the conjecture must be false.

We conclude the paper by noting differences between the two models in "tree width". Dress (1987) has shown that the global optimum in models where this quantity is $O(1)$ can be found in polynomial time. Our result that the tree width of an NP complete problem is $O(N)$ suggests that the notions of NP completeness and tree width are more generally related.

Definition of the N-k Model

In nature, the argument of the fitness function is the set of all possible sequences of the four nucleotide bases. Nature "evaluates" this function by translating these nucleic sequences into sequences of amino acids. It is only amino acid sequences (the enzymes and structural proteins of an organism) that have biological significance. Therefore, the simplest version of the N-k model ignores genetics and assigns a fitness to the sequences of amino acids directly. The N-k model makes two further simplifications: the first, that the amino acid sequence has a fixed length of $N$ sites, and the second, that there are only two possible amino acids, rather than the full complement of twenty, that can occupy a given site of the sequence. This last assumption is justified by the fact that almost all of the properties of such sequences are determined by their three dimensional, folded structures, which, in turn, are determined by the chemical properties of the constituent amino acids.

The most important of these properties is polarity: those amino acids in the sequence that are polar get pulled to the outside of the folded structure by chemical attraction to surrounding water molecules, and non-polar amino acids get pushed into the interior of the structure. The assumption of two amino acids per site also dramatically simplifies the modelling task by reducing the argument of the fitness function to a bit string, which we denote $b$.

The N-k model assigns a real valued "fitness" to $b$ by first assigning a real valued "fitness contribution", $f_i$, to the $i$th bit, $b_i$, in $b$. Each such assignment depends, not just on $i$ and the value of $b_i$, but also on $0 \le k < N$ other bits, which we call its "neighbors". The fitness contribution of each site is a random function, $f_i(s_i)$, of the substring, $s_i$, formed by the $i$th bit and its $k$ neighbors. $f_i(s_i)$ is assigned by selecting an independent random variable from some distribution $p(x)$, such as the uniform or Gaussian distributions, for each of the $2^{k+1}$ possible values of $s_i$, thus generating a "fitness table" for the $i$th site. There is a different, independently generated table for each of the $N$ sites. Then, given any string of $N$ bits, the total fitness of the string, $F$, is defined as the average of the fitness contributions of each site; that is,

$$F(b) = \frac{1}{N} \sum_{i=1}^{N} f_i(s_i).$$

The use of a probability distribution in assigning the fitness contributions can be interpreted either as an admission of ignorance of the true nature of the complex couplings between the bits or as an attempt to capture the typical statistical properties of a wide class of landscapes with $k$ interconnections per bit.

One other aspect of the N-k model must be specified, namely, the way in which the substrings, $s_i$, are chosen. The simplest but not the only way of choosing neighbors, at least for even $k$, is to use the $k$ sites adjacent to site $i$; that is, the bits at sites $i - k/2$

thru $i + k/2$. As in the original formulation of the model, we introduce periodic boundary conditions to assign neighbors to sites $i$ with $i \le k/2$ and $i > N - k/2$. In other words, we assume that the sites are arranged in a circle, such that site $N$ is next to site 1. Under this assumption, if $k = 2$, site $N$ has neighbors $N - 1$ and 1, site 1 has neighbors $N$ and 2, and, more generally, site $i$ has neighbors $(N + i - k/2) \bmod N, \ldots, (N + i + k/2) \bmod N$ for any even integer $k$. This assignment of the neighbors gives rise to a class of short range spin glasses.

Alternatively, we could assign the neighbors by randomly selecting, for each site $i$, $k$ other sites on the string to be used in forming the index to the $i$th fitness table, a definition that also makes sense for odd $k$. This assignment of neighbors makes the model similar to a long range, dilute spin glass. Rather surprisingly, local features of the landscape (the height of local optima, and the length of typical "up hill" walks through a series of fitter one mutant variants to these local optima) were remarkably insensitive to the details of how the $k + 1$ bit substrings were chosen in computer simulations (Kauffman, Weinberger, & Perelson, 1988; Kauffman & Weinberger, 1989).

The N-k model affords a "tuneably rugged" fitness landscape, since tuning $k$ alters the ruggedness of the landscape. For $k = 0$, each site is independent of all other sites. Either the bit value 0 or the bit value 1 is almost surely "fitter" than the other; hence, a single specific sequence comprised of the fitter bit value in each position is almost surely the single, global optimum in the fitness landscape. Any other string is sub-optimal, and lies on a connected walk via 1-mutant fitter variants to the global optimum by flipping bits from less fit to more fit values. The length of the walk is just the Hamming distance from the initial string to the global optimum. For a randomly chosen initial string, half of the bits will be in their less fit state, hence the expected walk length is just $N/2$.
A transition to a one mutant neighbor (i.e. the flip of a single bit) typically alters fitness by an amount $O(1/N)$. In contrast, the fully connected N-k model yields a completely random fitness
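The definition above can be illustrated with a minimal Python sketch. It is not taken from the original paper: the function names `make_nk_landscape` and `fitness`, the 0-based site indices, and the choice of the uniform distribution on [0, 1) for $p(x)$ are assumptions made for illustration only.

```python
import random
from itertools import product


def make_nk_landscape(n, k, neighborhood="adjacent", seed=0):
    """Return (neighbors, tables) for one random instance of the N-k model.

    Hypothetical helper: sites are numbered 0..n-1 rather than 1..N.
    """
    rng = random.Random(seed)
    if neighborhood == "adjacent":
        if k % 2 != 0 or k >= n:
            raise ValueError("adjacent neighborhoods need even k < n")
        half = k // 2
        # Periodic boundary conditions: site indices wrap around the circle.
        neighbors = [[(i + off) % n for off in range(-half, half + 1) if off != 0]
                     for i in range(n)]
    elif neighborhood == "random":
        # k distinct other sites, chosen independently for each site.
        neighbors = [rng.sample([j for j in range(n) if j != i], k)
                     for i in range(n)]
    else:
        raise ValueError("neighborhood must be 'adjacent' or 'random'")
    # One independent fitness table per site, with 2**(k+1) entries drawn
    # from p(x); here p(x) is assumed to be uniform on [0, 1).
    tables = [{s: rng.random() for s in product((0, 1), repeat=k + 1)}
              for _ in range(n)]
    return neighbors, tables


def fitness(bits, neighbors, tables):
    """Total fitness F(b): the average of the site fitness contributions."""
    total = 0.0
    for i, table in enumerate(tables):
        # s_i is the i-th bit followed by the bits of its k neighbors.
        s = (bits[i],) + tuple(bits[j] for j in neighbors[i])
        total += table[s]
    return total / len(bits)
```

For example, `neighbors, tables = make_nk_landscape(10, 2)` followed by `fitness([0, 1] * 5, neighbors, tables)` evaluates one string on an adjacent neighbor landscape; passing `neighborhood="random"` gives the random neighbor version with the same table format.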

landscape. For this $k = N - 1$ case, the fitness contribution of each site depends on all of the other sites because the "context" of each of the $N - 1$ other bits is changed when even a single bit is flipped. In this case, therefore, the fitness of each bit string is statistically independent of its neighbors. As was shown in Kauffman & Levin (1987), Weinberger (1988), Macken & Perelson (1989) and Weinberger (1991a), such random landscapes have very many local optima ($2^N/(N + 1)$, on average), walks to optima are short ($O(\ln N)$, on average), and only a small fraction of local optima are accessible from any initial string. Thus adaptive walks vary dramatically as the ruggedness of the landscape varies.

The N-k decision problem

Having defined the model, we are now in a position to consider the following decision problem: Is the global optimum, $F_{MAX}$, in a given instance of an N-k landscape greater than some specified value $V$?

In view of the work of Dress (1987), it is not surprising that this question can be answered for adjacent neighborhoods with periodic boundary conditions by the following simple dynamic programming algorithm that actually finds the globally maximal fitness. For simplicity, we present the algorithm for $k = 2$, leaving the trivial generalization to arbitrary $k$ to the reader. Also, $N$ is to be added or subtracted, as appropriate, to subscripts outside the range $1, 2, \ldots, N$ so that they assume values within that range.

Let $f_i^{b_{i-1} b_i b_{i+1}}$ be the site fitness of the $i$th site, given the values of the bits $b_{i-1}$, $b_i$, and $b_{i+1}$, and let $F_i^{b_N b_1 | b_{i+1} b_{i+2}}$ be the maximum value of the sum

$$\sum_{j=1}^{i+1} f_j^{b_{j-1} b_j b_{j+1}}$$

over the values of $b_2, b_3, \ldots, b_i$, given the values of the bits $b_N$, $b_1$, $b_{i+1}$, and $b_{i+2}$. The algorithm then consists of the three phases

Initialization:

$$F_1^{b_N b_1 | b_2 b_3} = f_1^{b_N b_1 b_2} + f_2^{b_1 b_2 b_3}.$$

Continuation:

$$F_i^{b_N b_1 | b_{i+1} b_{i+2}} = \max_{b_i} \left[ F_{i-1}^{b_N b_1 | b_i b_{i+1}} + f_{i+1}^{b_i b_{i+1} b_{i+2}} \right]$$

for $2 \le i \le N - 1$, which implicitly specifies a value for $b_i$ for $2 \le i \le N - 1$ and each choice of $b_N$, $b_1$, $b_{i+1}$, and $b_{i+2}$. Use the value of $b_i$ thus specified for subsequent calculations. Note that the formula still makes sense when $i = N - 2$ or $i = N - 1$, but there will be only 8 $F_{N-2}^{b_N b_1 | b_{N-1} b_N}$ values and only 4 $F_{N-1}^{b_N b_1 | b_N b_1}$ values instead of the 16 $F_i^{b_N b_1 | b_{i+1} b_{i+2}}$ values defined for $1 \le i < N - 2$.

Termination:

$$F_{MAX} = \max_{b_N, b_1} F_{N-1}^{b_N b_1 | b_N b_1}.$$

We have therefore proven

Theorem 1. The N-k decision problem with adjacent neighborhoods is solvable in $O(2^k N)$ steps, and is thus in P.

However, the situation for random neighborhoods is quite different, as is shown by

Theorem 2. The N-k decision problem with random neighborhoods is NP complete for $k \ge 3$.

Proof: We note first that the single integer $N$ characterizes the size of the problem completely if $k$ is fixed: $2^{k+1} N$ real numbers are required to specify the fitness tables, and $k$ integers per site, or $kN$ integers per instance of the problem, to specify the neighborhoods. Furthermore, it is merely a matter of a table lookup followed by an addition for each site, or $O(N)$ total work, to check that a proposed solution does, indeed, have a fitness greater than a given value. We conclude that the N-k decision problem is in NP.
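The dynamic programming algorithm behind Theorem 1 can be sketched in Python for the $k = 2$ case. This is a variant written for illustration, not the paper's exact bookkeeping: it loops over the four choices of the first two bits instead of carrying them as extra indices of $F$, it uses 0-based site indices, and all function names, the table layout keyed by `(left, self, right)` bit triples, and the uniform random tables are assumptions.

```python
import random
from itertools import product


def random_tables(n, seed=0):
    """Hypothetical test data: one table per site keyed by (left, self, right)."""
    rng = random.Random(seed)
    return [{s: rng.random() for s in product((0, 1), repeat=3)}
            for _ in range(n)]


def max_fitness_adjacent_k2(tables):
    """Global maximum of the average fitness for k = 2 on a ring of sites."""
    n = len(tables)
    if n < 3:
        raise ValueError("need at least 3 sites")
    best = float("-inf")
    for b0, b1 in product((0, 1), repeat=2):  # fix the first two bits
        # value[(b_i, b_{i+1})] = max over b_2..b_{i-1} of sum_{j=1..i} f_j
        value = {(b1, b2): tables[1][(b0, b1, b2)] for b2 in (0, 1)}
        for i in range(2, n - 1):
            value = {
                (cur, nxt): max(value[(prev, cur)] + tables[i][(prev, cur, nxt)]
                                for prev in (0, 1) if (prev, cur) in value)
                for cur, nxt in product((0, 1), repeat=2)
            }
        # Close the ring: add the last site and site 0, which see b0 and b1.
        for (b_pen, b_last), v in value.items():
            total = (v + tables[n - 1][(b_pen, b_last, b0)]
                     + tables[0][(b_last, b0, b1)])
            best = max(best, total)
    return best / n


def brute_force_max(tables):
    """Exhaustive search over all 2**n strings, for checking small cases."""
    n = len(tables)
    return max(
        sum(tables[i][(b[i - 1], b[i], b[(i + 1) % n])] for i in range(n)) / n
        for b in product((0, 1), repeat=n)
    )


def exceeds_threshold(tables, v):
    """The decision problem: is the global optimum greater than v?"""
    return max_fitness_adjacent_k2(tables) > v
```

The inner loop does a constant amount of work per site, so the running time grows linearly in the number of sites, in line with the $O(2^k N)$ bound for fixed $k$. The brute force routine is included only so that the sketch can be checked against exhaustive search on small rings.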

We demonstrate NP completeness by showing that the N-k problem is polynomially equivalent to one of the best known NP complete problems, the 3SAT problem (Garey & Johnson, 1979). Given $N$ boolean variables $b = (b_1, b_2, \ldots, b_N)$ and a list of $M$ expressions, $E_i(b_{p_i}, b_{q_i}, b_{r_i})$, involving arbitrary triples of these variables and the operators AND, OR, and NOT, the 3SAT problem is to determine whether values exist for the $b_i$'s such that all of the expressions are satisfied (i.e. all evaluate to TRUE). If we can show that every such problem can be mapped into an N-k decision problem in polynomial time, we can conclude that the latter problem is "at least as hard" as the 3SAT problem, because our mapping, together with a polynomial time solution to the N-k decision problem, would provide a polynomial time solution to the 3SAT problem.

It is easiest to generate the N-k decision problem that corresponds to a given 3SAT problem when $N = M$ and $k = 3$. We first transform each expression $E_i(b_{p_i}, b_{q_i}, b_{r_i})$ into the equivalent expression

$$E'_i(b_i, b_{p_i}, b_{q_i}, b_{r_i}) = E_i(b_{p_i}, b_{q_i}, b_{r_i}) \text{ AND } (b_i \equiv b_i)$$

where the expression $a \equiv b$ is TRUE if and only if $a = b$. We then identify $b_i$ with the $i$th bit in the N-k configuration, and the other variables appearing in the $i$th expression with $b_i$'s "neighbors". We assign the fitness tables associated with $b_i$ as follows: a fitness table entry is assigned the value 1 if the corresponding $E'$ is TRUE for the specified values of the $b$'s, and 0 if it is FALSE. Clearly, the corresponding 3SAT problem is solved if we can determine whether the global maximum of the N-k landscape thus generated, taken as the sum of the site fitnesses, is greater than or equal to $N$.

A trivial variation of the above mapping suffices when $M < N$. As before, we identify each 3SAT variable, $b_i$, with the $i$th bit in the N-k configuration, modify the $M$ expressions such that $b_i$ appears in the $i$th expression for $1 \le i \le M$ and assign the fitness tables as

described above. Sites $M + 1, M + 2, \ldots, N$ are assigned fitness tables in which every entry is given the value 1. Once more, the given 3SAT problem is solved if and only if the global maximum of the corresponding N-k problem is at least $N$. The $M > N$ case can be handled by introducing additional variables $b_{N+1}, b_{N+2}, \ldots, b_M$, and proceeding as above, provided we lengthen the bit string in the N-k model to $M$ bits, and assign the neighborhoods as before.

The N-k decision problem for $k > 3$ is a fortiori NP complete because every 3SAT problem can be embedded in a $k > 3$ decision problem by defining the expressions

$$E''_i(b_i, \ldots, b_j, b_{p_i}, b_{q_i}, b_{r_i}) = E_i(b_{p_i}, b_{q_i}, b_{r_i}) \text{ AND } (b_i \equiv b_i) \text{ AND } \ldots \text{ AND } (b_j \equiv b_j)$$

where the bits $b_i, \ldots, b_j$ include the bits at the $i$th site and some arbitrary collection of $k - 3$ neighbors. The previously given mapping to the N-k problem can then be applied.

Remark 1: So far as we know, the question of whether the $k = 2$ random model is NP complete is open.

Remark 2: An almost identical argument shows that the "P-T model," proposed by Pedro Tarazona, in which the fitness table entries for site $i$ do not depend on the $i$th bit, but only on its $k$ neighbors, is also NP complete for $k \ge 3$. The computational complexity of the $k = 2$ P-T model is also an open question, but we conjecture that it is polynomially equivalent to the polynomial time 2SAT problem, a variant of the 3SAT problem in which the $M$ boolean expressions involve arbitrary pairs, rather than triples, of the boolean variables.

NP Completeness and Correlation

The foregoing is a clear example of Garey and Johnson's observation that seemingly trivial

modifications to a combinatorial optimization problem can render it intractable. Previous analytical and numerical work suggests that many of the statistical properties of N-k landscapes for large, but fixed $k$ are quite similar for adjacent and random neighborhoods: Weinberger (1991a) shows that the mean number of local optima, distances between optima and the expected fitness of a local optimum are, asymptotically for large $k$, the same in both cases. In Appendix I, we compute the exact correlation $R(d)$ between pairs of points separated by a Hamming distance $d$, from which we can deduce the limiting behavior for $d, k \ll N$. For the random landscape, we obtain

$$R(d) = 1 - \frac{d(k+1)}{N} + \frac{d(d-1)k(k+2)}{2N^2} + O\left(\left(\frac{dk}{N}\right)^3\right);$$

for the adjacent neighbor landscape, we obtain the rather similar expression

$$R(d) = 1 - \frac{d(k+1)}{N} + \frac{d(d-1)k(k+1)}{2N^2} + O\left(\left(\frac{dk}{N}\right)^3\right).$$

These correlations, along with the common mean and variance of the fitnesses, completely characterize the multivariate Gaussian distribution to which the distribution of fitnesses converges as $k \to \infty$.

We might conjecture that a problem is intractable because the correlation between pairs of distant configurations decays so rapidly as their distance increases that their fitnesses are effectively independent. We then have the obvious, but suggestive

Lemma A. Every algorithm for finding the maximum (or minimum) of $M$ completely arbitrary real numbers requires at least $O(M)$ steps.

Proof: If such an algorithm required merely $o(M)$ steps, some of the numbers must necessarily be ignored by the algorithm. The answer given must then be independent of which (if any) of the ignored numbers is the maximum. Clearly, such an algorithm cannot be guaranteed to find the maximum in all cases.
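The adjacent neighbor correlation discussed in this section can be checked numerically with a short Python sketch. It relies on the Appendix result that $R(d)$ equals the probability that a site fitness is unchanged. The closed form $\binom{N-k-1}{d} / \binom{N}{d}$ used below is our own assumption, obtained by averaging uniformly over which $d$ bits are flipped; it is not quoted from the paper, and all function names are hypothetical.

```python
from itertools import combinations
from math import comb


def unchanged_fraction_adjacent(n, k, flipped):
    """Fraction of sites whose (k+1)-bit window contains no flipped bit."""
    half = k // 2
    flipped = set(flipped)
    unchanged = 0
    for i in range(n):
        window = {(i + off) % n for off in range(-half, half + 1)}
        if not (window & flipped):
            unchanged += 1
    return unchanged / n


def correlation_adjacent_enumerated(n, k, d):
    """Average the unchanged fraction over all C(n, d) sets of d flipped bits."""
    total = sum(unchanged_fraction_adjacent(n, k, s)
                for s in combinations(range(n), d))
    return total / comb(n, d)


def correlation_adjacent_exact(n, k, d):
    """Assumed closed form: no flipped bit falls inside a window of k+1 sites."""
    return comb(n - k - 1, d) / comb(n, d)


def correlation_adjacent_series(n, k, d):
    """Expansion through second order in 1/n, as in the text."""
    return 1 - d * (k + 1) / n + d * (d - 1) * k * (k + 1) / (2 * n * n)
```

Under these assumptions the enumerated and closed-form values should agree for small rings, and the closed form and the series should agree closely once $d$ and $k$ are small compared with $N$, for instance with $N = 1000$, $k = 4$ and $d = 3$.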

Any hope of finding the maximum of the $2^N$ fitnesses assigned to the $2^N$ vertices in polynomial time must be based on the ability to rule out whole classes of fitnesses with a single operation, which, from the above lemma, is impossible if the vertices are assigned completely independently and without any a priori knowledge about the assignment. On the face of things, it would seem that such problems cannot possibly lie in P; however, we show that a similar problem is embedded in the problem of optimization on a random neighbor N-k landscape, for which the correlation between fitnesses of points separated by a Hamming distance of $N/4$ is

$$\frac{3}{4} e^{-k/4} \left[ 1 + O(k/N) \right]$$

(see Appendix I). Clearly, this quantity can be made as small as desired by choosing $k$ sufficiently large and letting $N$ tend to infinity. (In fact, the N-k decision problem remains in NP for $k = O(\log N)$.) Because the fitnesses are asymptotically jointly Gaussian, an asymptotically vanishing correlation between them implies that they are asymptotically statistically independent, and therefore a knowledge of one fitness tells us, asymptotically, nothing about the other. We now prove

Lemma B. For arbitrarily large $N$, there exists a set, $\Lambda_N$, of $2N^{(\log_2 N + 1)/2}$ bit strings of length $N$ such that the Hamming distance between any pair of strings is at least $N/4$.

The proof of Lemma B is based on

Lemma C. For arbitrarily large $N$, there exists a set, $\Gamma_N$, of $2N$ bit strings of length $N$ such that the Hamming distance between any pair of strings is at least $N/2$.

Proof of Lemma C: We start by recursively constructing, for $N = 2^n$, $n \ge 2$, a set, $\Gamma_N$, of $2N$ strings of length $N$ whose pairwise Hamming distances are at least $N/2$. Clearly,

$$\Gamma_4 = \{0000, 1111, 0011, 1100, 0101, 1010, 0110, 1001\}.$$

The 16 strings of length $N = 8$ in $\Gamma_8$ are obtained as follows: form 4 complementary pairs, $(t_i, \bar{t}_i)$, from the strings $t_i \in \Gamma_4$. Because the distance between distinct elements of $\Gamma_4$ is

at least 2, the requisite $N = 8$ strings are $t_i \| t_i$, $t_i \| \bar{t}_i$, $\bar{t}_i \| t_i$, and $\bar{t}_i \| \bar{t}_i$, where the symbol "$\|$" denotes concatenation (simple juxtaposition). The pairwise distance between each of the 16 strings thus formed is at least $N/2 = 4$, either because the strings are formed from different $t$'s, or because half of one string is the complement of the corresponding half of the other string. The resulting set is also a set of 8 pairs of complementary strings, so that the iteration can be repeated once again, and, in fact, arbitrarily often, each time doubling the number of strings. The resulting set of $2N$ strings is obviously not unique, because another set can be generated by complementing the bits at a fixed position in each of the strings in the first set.

Remark: In the course of refereeing this paper, Prof. Andreas Dress proved my conjecture that $\Gamma_N$ has maximal size, in that, for every bit string not in $\Gamma_N$, at least one member of $\Gamma_N$ is fewer than $N/2$ bit flips away. His proof begins by identifying the successive bits $b_i$, for $1 \le i \le N$, of the string $S$ as the image of a mapping from the index, $i$, to the appropriate bit value. He then notes the equivalence of the integer $i - 1$ with its binary representation, the $n = \log_2 N$ bit string, which we write explicitly as $\beta_1 \beta_2 \ldots \beta_n$. The construction above shows that $\Gamma_N$ consists of precisely those bit strings $S$ whose $i$th bit, $b_i$, is given by

$$b_i = \alpha_0 + \sum_{j=1}^{n} \alpha_j \beta_j$$

for some choice of the bits $\alpha_0, \alpha_1, \ldots, \alpha_n$, and for all $i$ between 1 and $N$. The strings thus generated are precisely the set of affine mappings from the $n$ bit strings $\beta_1 \beta_2 \ldots \beta_n$ to single bits. We can thus re-establish that the strings in $\Gamma_N$ have pairwise Hamming distances of at least $N/2$: if the corresponding $\alpha$'s for two mappings differ at some position $p > 0$, the strings they produce will differ at indices $i$ whose binary representations have the property that $\sum_p \beta_p = 1$ (the sum being taken modulo 2 over the positions $p$ at which the $\alpha$'s differ), and there are exactly $N/2$ such indices. If the $\alpha$'s differ only at position $p = 0$, the

corresponding strings will be complementary. The maximality of $\Gamma_N$ now follows from character theory. Identify $\Gamma_N$ with the irreducible characters of $\mathbb{F}_2^n$, considered as a group, via the mapping

$$\chi_i = (-1)^{b_i} = (-1)^{\alpha_0 + \sum_{j=1}^{n} \alpha_j \beta_j}$$

where $\chi_i$ is the $i$th component of the vector $\chi$ indexed by the elements of $\mathbb{F}_2^n$. For a given bit string $S$, denote this mapping as $\chi(S)$. Now consider the inner product

$$\langle \chi(S), \chi(S') \rangle = \frac{1}{2^n} \sum_i \chi_i \chi'_i = \frac{1}{2^n} \left[ d(S, \bar{S}') - d(S, S') \right]$$

where $d(S, S')$ is the Hamming distance between $S$ and $S'$. The fact that $\Gamma_N$ is closed under complementation guarantees that the Hamming distance between it and any bit string $S$ is at most $N/2$. If $S$ could be exactly $N/2$ bit flips from every member of $\Gamma_N$, the difference $d(S, \bar{S}') - d(S, S')$ would be exactly zero for all $S' \in \Gamma_N$, and thus the inner product of $\chi(S)$ with all of the irreducible characters of $\mathbb{F}_2^n$ would be zero. As is well known, this last condition is satisfied if and only if every component of $\chi(S)$ is zero, which is clearly absurd.

We now return to the

Proof of Lemma B: Clearly, the 16 strings of length 4 whose pairwise Hamming distances are at least 1 are the binary representations of the integers 0 thru 15. For larger values of $N$, we form the strings of $\Lambda_N$ from two $N/2$ bit substrings, a prefix $\pi$ and a suffix $\sigma$. $\pi$ is an arbitrary member of $\Lambda_{N/2}$; $\sigma$ is constructed from an arbitrary member $\tau \in \Gamma_{N/2}$ by complementing those bit positions marked by 1's in $\pi$ (i.e. $\sigma = \pi + \tau$, where the addition is taken modulo 2). To check that this construction does, indeed, meet the requirements stated above, we partition $\Lambda_N$ into subsets of strings with the same prefix. Clearly, there is no problem with

the strings in the same subset, because their suffixes, corresponding to different members of $\Gamma_{N/2}$, are mutually separated by at least $N/4$ bits. There is also no problem with strings in different subsets whose suffixes were derived from the same $\tau$: their prefixes, $\pi$ and $\pi'$, must differ in at least $N/8$ bit positions, so that their suffixes, $\sigma = \pi + \tau$ and $\sigma' = \pi' + \tau$, must also differ in at least $N/8$ bit positions.

There remains the case in which two strings in $\Lambda_N$ have both different prefixes and different suffixes. Because they have different prefixes, the above construction guarantees that their prefixes differ in at least $N/8$ positions; we now show that the same applies to the suffixes, $\sigma$ and $\sigma'$. We have

$$0 < d(\sigma, \sigma') = d(\pi + \tau, \pi' + \tau').$$

For each $N$, $\Gamma_N \subset \Lambda_N$, so that, in particular, $\tau, \tau' \in \Lambda_{N/2}$. It follows from the fact (proven below) that $\Lambda_{N/2}$ is closed under "+" and the positive distance between $\pi + \tau$ and $\pi' + \tau'$ that they are distinct members of $\Lambda_{N/2}$, and therefore differ in at least $N/8$ bit positions.

We now verify that $\Lambda_N$ is closed under "+" by induction. For $N = 4$, closure obtains trivially. For larger $N$, we assume closure for $\Lambda_{N/2}$, and establish it for $\Lambda_N$. Given two elements $S_1, S_2 \in \Lambda_N$, which we write $(\pi_1 \| \pi_1 + \tau_1)$ and $(\pi_2 \| \pi_2 + \tau_2)$, where $\pi_1, \pi_2 \in \Lambda_{N/2}$ and $\tau_1, \tau_2 \in \Gamma_{N/2}$, we have

$$S_1 + S_2 = (\pi_1 + \pi_2) \| [(\pi_1 + \tau_1) + (\pi_2 + \tau_2)].$$

Using the induction hypothesis, $\pi_1 + \pi_2 \in \Lambda_{N/2}$; using the commutativity and the associativity of "+", we write the suffix string as $(\pi_1 + \pi_2) + (\tau_1 + \tau_2)$. We now observe that $\Gamma_{N/2}$ is also closed under "+": as is clear from their construction, all elements $\tau \in \Gamma_{N/2}$ are $N/2$ bit strings formed by choosing a single element $t \in \Gamma_4$ and forming arbitrary concatenations of $t$ and its complement, $\bar{t}$. Thus, $\tau_1 + \tau_2$ is the arbitrary concatenation of the four bit strings $t_1 + t_2 = \bar{t}_1 + \bar{t}_2$ and $\bar{t}_1 + t_2 = t_1 + \bar{t}_2$. From the fact that these last

two strings form a complementary pair in $\Gamma_4$, we conclude that $\tau_1 + \tau_2$ is also in $\Gamma_{N/2}$, allowing us to write $S_1 + S_2 = \pi_3 \| (\pi_3 + \tau_3)$ for some $\pi_3 \in \Lambda_{N/2}$ and some $\tau_3 \in \Gamma_{N/2}$, as required.

We now count the number of distinct strings, $D_N$, in $\Lambda_N$. Because each string in $\Lambda_N$ is generated by choosing one of the $D_{N/2}$ members of $\Lambda_{N/2}$ and one of the $N$ members of $\Gamma_{N/2}$, we have the recursion relation

$$D_N = N D_{N/2}.$$

Writing $N = 2^n$, and $G_n = D_{2^n}$, this relation becomes

$$G_n = 2^n G_{n-1}.$$

Given the initial condition $D_{2^2} = G_2 = 16$ that was computed "by hand" above, we see that $G_n = 2^{n(n+1)/2 + 1}$, so that $D_N = 2N^{(\log_2 N + 1)/2}$, as claimed.

Remark: We conjecture that $\Lambda_N$ is also maximal.

We state the conclusions of this section in the following

Summary. The NP complete random neighbor N-k optimization problem seems to be at least as hard as the problem of optimization over a sample of $D_N = 2N^{(\log_2 N + 1)/2}$ Gaussian random variables whose pairwise correlations can be made as small as desired, so that they are "arbitrarily close" to being pairwise statistically independent. Because $D_N$ grows faster than any polynomial in $N$, this result seems to suggest that $P \neq NP$.

Unfortunately, small correlations between fitnesses of distant points on the landscape are not, in themselves, sufficient to render a problem intractable. However small the correlations between the fitnesses of strings in $\Lambda_N$ can be made in the random neighbor landscape,

they can be made even smaller in the adjacent neighbor landscape, at least for sufficiently large landscapes. In fact, if we ignore the $O(1/N)$ correction terms in the correlation functions, and write $\delta = d/N$ for the fraction of bits that differ, we have the relation

$$R_{adj}(\delta) \approx (1 - \delta)^{k+1} < R_{rand}(\delta) \approx (1 - \delta) e^{-k\delta}$$

which follows immediately from the inequality $1 - \delta < e^{-\delta}$, for $0 < \delta < 1$.

Discussion

The crucial difference between the adjacent and random landscapes seems to be the number of bits upon which each site fitness could depend, the "tree width" discussed by Dress (1987). For the adjacent neighbor case, we know a priori that each site depends only on the bit at that site and the $k$ bits at adjacent sites, so that the tree width is $k + 1$. It is this a priori knowledge that makes possible the solution of the corresponding decision problem with a polynomial time dynamic programming algorithm. In contrast, which $k + 1$ of the $N$ bits each site fitness depends on will vary from one instance of the random neighbor model to the other, so that the tree width is no longer $k + 1$, but $O(N)$. It appears, therefore, that the random neighbor problem is intractable, not only because of the small correlations between distant points on the landscape, but also because there is no effective way to partition the problem into smaller problems and thus use information available from nearby points to infer relationships between more distant points.

In Weinberger (1990), Weinberger (1991a) and Weinberger (1991b), we have argued for the importance of the N-k model and, more generally, the notion of statistically isotropic "AR(1) landscapes". In general, statistical isotropy implies that the correlation, $R(d)$, between the fitnesses of pairs of points depends only on the distance $d$ between them; in the specific case of the AR(1) landscape, this correlation function assumes the form

$R(d) = e^{-d/T}$, for some choice of the "correlation length", $T$. The exact calculations of the pair correlations for the random neighbor and adjacent neighbor N-k models presented in the Appendix show some disparity from the precise "AR(1)-ness" of the P-T model. However, this disparity is relatively minor when $d < T$, and it is $R(d)$ values for $d < T$ that determine the local properties of the landscape (see the above cited papers for details). From these observations, we make two conjectures: that, in general, approximately AR(1) landscapes with unbounded tree width are NP-complete, but these problems have statistical properties similar to the properties of "easy" optimization problems. In view of the ubiquity of AR(1) landscapes in optimization problems in computer design (Sorkin, 1988) and in RNA folding landscapes (Fontana et al., 1991), this conjecture deserves further study.

Finally, our results suggest that nature is solving a non-trivial optimization problem in the "design" of individual enzymes, at least if the relevant fitness landscape resembles an N-k or P-T landscape. As we saw above, the fitness of an enzyme depends crucially on its three dimensional structure, so that the random neighbor N-k landscape (i.e. the NP-complete one) is clearly more biologically accurate than the adjacent neighbor model. This observation is further evidence that there is much to be learned about optimization from the study of evolutionary strategies.

Acknowledgements

The author gratefully acknowledges the support of ONR Grant K-0258 for the time period in which this work was begun and the support of a Max Planck Stipendium during its gestation. A discussion with Andreas Dress convinced the author that the optimization problem on adjacent landscapes could be solved in polynomial time via dynamic programming, and his subsequent encouragement resulted in the completion of this work,

including the correction of some mistakes in a previous version and the proof that $\Gamma_N$ is maximal. The author would also like to acknowledge useful discussions with Stuart Kauffman on the general subject of evolution as a combinatorial optimization problem, and Pedro Tarazona for suggesting the P-T model and for the computation of both its correlation function and the correlation function of the random neighbor N-k model.

References

Dress, A. (1987). "On the Computational Complexity of Composite Systems," Lecture Notes in Physics, Vol. 268, Fluctuations and Stochastic Phenomena in Condensed Matter, L. Garrido (ed.), Springer, Berlin.

Fontana, W., Griesmacher, T., Schnabl, W., Stadler, P., and Schuster, P. (1991). "Statistics of Landscapes Based on Free Energies, Replication and Degradation Rate Constants of RNA Secondary Structures," Monatshefte für Chemie, in press.

Garey, M. and Johnson, D. (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, San Francisco.

Kauffman, S. & Levin, S. (1987). "Towards a General Theory of Adaptive Walks on Rugged Landscapes," Journal of Theoretical Biology.

Kauffman, S., Weinberger, E., and Perelson, A. (1988). "Maturation of the Immune Response Via Adaptive Walks On Affinity Landscapes," Theoretical Immunology, Part I, Santa Fe Institute Studies in the Sciences of Complexity, A. S. Perelson (ed.), Addison-Wesley, Reading, Ma.

Kauffman, S. and Weinberger, E. (1989). "The N-k Model of Rugged Fitness Landscapes and Its Application to Maturation of the Immune Response," Journal of Theoretical Biology 141.

Macken, C. and Perelson, A. (1989). "Protein Evolution on Rugged Landscapes," Proceedings of the National Academy of Sciences.

Sorkin, G. (1988). "Combinatorial Optimization, Simulated Annealing, and Fractals," IBM Research Report RC13674.

Weinberger, E. (1988). "A More Rigorous Derivation of Some Results on Rugged Fitness Landscapes," Journal of Theoretical Biology 134, No. 1.

Weinberger, E. (1990). "Correlated and Uncorrelated Fitness Landscapes and How to Tell the Difference," Biological Cybernetics 63, No. 5.

Weinberger, E. (1991a). "Local Properties of Kauffman's N-k Model, a Tuneably Rugged Energy Landscape," Physical Review A 44, No. 10.

Weinberger, E. (1991b). "Fourier and Taylor Series on Fitness Landscapes," Biological Cybernetics 65.

Wright, S. (1932). "The Roles of Mutation, Inbreeding, Crossbreeding and Selection in Evolution," in Proceedings 6th Congress on Genetics 1.

Appendix

Calculation of the Correlation between Pairs of Fitnesses in the N-k Model

We want to compute

$$R(d) = \frac{E[f(a)f(b)] - \mu^2}{\sigma^2},$$

where $a$ and $b$ are bit strings separated by Hamming distance $d$, $f(a)$ and $f(b)$ are their respective fitnesses, $\mu$ is the common mean of these fitnesses, and $\sigma^2$ is their common variance. The expectation, $E$, is taken over the joint distribution of the random variables $f(a)$ and $f(b)$. Without loss of generality, we choose a distribution for the "site fitnesses" that has mean zero and variance 1. We then have $\mu = 0$, $\sigma^2 = 1/N$, and

$$R(d) = \frac{1}{N}\, E\left[\left(\sum_{j=1}^{N} f_j(a)\right)\left(\sum_{j \notin C} f_j(a) + \sum_{j \in C} f_j(b)\right)\right],$$

where the notation is chosen to reflect the fact that a certain subset, $C$, of the site fitnesses change when bit string $a$ is changed to bit string $b$, but the site fitnesses $f_j$ with $j \notin C$, i.e. all of the others, remain the same. For pairs of sites $i \neq j$, the definition of the N-k model guarantees that site fitnesses $f_i(a)$ and $f_j(b)$ are independent, and thus uncorrelated. Similarly, $f_j(a)$ and $f_j(b)$ are either identical (because bit $j$ and its neighbors are identical in both $a$ and $b$) or they are independent. In any case, we conclude that

$$R(d) = \frac{1}{N}\, E\left[\sum_{j \notin C} f_j^2(a)\right] = \Pr\{j \notin C\}.$$

In other words, the correlation between the fitnesses of two bit strings in the N-k model is exactly the probability that a randomly chosen site fitness is the same in the computation of $f(a)$ and $f(b)$.

For the random neighbor model, the required probability is easily obtained: a site fitness is unchanged only if the bit at that site is not one of the $d$ bits that has been flipped and if it is not one of the $k$ neighbors of any flipped bit. The probability that a site satisfies the first condition is $1 - d/N$; the probability that a site satisfies the statistically independent second condition is $[1 - k/(N-1)]^d$. Thus, for the random neighbor model,

$$R(d) = \left(1 - \frac{d}{N}\right)\left(1 - \frac{k}{N-1}\right)^{d}.$$

For $kd \ll N$, we have

$$R(d) = 1 - \frac{d(k+1)}{N} + \frac{(d-1)\,d\,k(k+2)}{2N^2} + O\!\left(\left(\frac{dk}{N}\right)^{3}\right).$$

If $d = \delta N$, where $\delta = O(1)$, and $k \ll N$,

$$R(\delta) = (1 - \delta)\left(1 - \frac{k}{N-1}\right)^{\delta N} = (1 - \delta)\, e^{-k\delta}\left[1 + O(k/N)\right].$$

For the P -T model, in which neighborhoods are defined completely at random, the probability that a site is not affected by flipping a random bit is $1 - k/N$, independent of whether other bits have been flipped previously, provided they have not affected the given site. The probability that a site remains unaffected by $d$ such flips is thus $(1 - k/N)^d$. For purposes of comparison with the other models, we give the small and large $d$ approximations to $R(d)$ for the P -T model with $k + 1$, rather than $k$, neighbors. The first of these approximations is

$$R(d) = 1 - \frac{d(k+1)}{N} + \frac{d(d-1)(k+1)^2}{2N^2} + O\!\left(\left(\frac{dk}{N}\right)^{3}\right)$$

for $dk \ll N$, as is clear from a binomial expansion of $\left[1 - (k+1)/N\right]^d$. If $d = \delta N$, where $\delta = O(1)$, and $k \ll N$,

$$R(d) = e^{-(k+1)\delta}\left[1 + O\!\left(\frac{k^2}{N}\right)\right].$$

The same derivation for the adjacent neighbor model begins by imagining that the site fitnesses are arranged in a circle, and that the flipped bits are represented by the integer vector $(n_1, n_2, \ldots, n_i, \ldots, n_d)$, where $1 \le n_i \le N$. Without loss of generality, we choose $n_1 = 1$ and $n_2 < n_3 < \cdots < n_d$. Because we must make $d - 1$ choices among the $N - 1$ remaining bits, and because we cannot make the same choice twice, there are $\binom{N-1}{d-1}$ ways to make the required choices. However, if we constrain $n_{i+1} - n_i = l$, we need only make $d - 2$ choices from $N - l - 1$ bits. The number of ways of making this second set of choices is $\binom{N-l-1}{d-2}$. The probability, $\pi_l$, that $n_{i+1} - n_i = l$ is then given by

$$\pi_l = \frac{\binom{N-l-1}{d-2}}{\binom{N-1}{d-1}},$$

independent of $i$. If $l \le k$, then $l$ site fitnesses change as a result of flipping bit $n_{i+1}$; otherwise, $k + 1$ site fitnesses change. It follows that the expected number, $E[|C|]$, of changed site fitnesses after moving a distance $d$ from the starting point is

$$\begin{aligned}
E[|C|] &= d\left\{\sum_{l=1}^{k} l\,\pi_l + (k+1)\left[1 - \sum_{l=1}^{k} \pi_l\right]\right\} \\
&= d(k+1) - \frac{d}{\binom{N-1}{d-1}} \sum_{l=1}^{k} \binom{N-l-1}{d-2}(k+1-l).
\end{aligned}$$

The probability that a randomly chosen site fitness doesn't change is then $1 - E[|C|]/N$, and

$$R(d) = 1 - \frac{E[|C|]}{N} = 1 - \frac{d(k+1)}{N} + \frac{d}{N\binom{N-1}{d-1}} \sum_{l=1}^{k} \binom{N-l-1}{d-2}(k+1-l).$$

For $dk \ll N$, this expression can be written as

$$\begin{aligned}
R(d) &= 1 - \frac{d(k+1)}{N} + \frac{d(d-1)}{N^2} \sum_{l=1}^{k} (k+1-l)\, \frac{\left(1 - \frac{l+1}{N}\right)\left(1 - \frac{l+2}{N}\right)\cdots\left(1 - \frac{d+l-2}{N}\right)}{\left(1 - \frac{1}{N}\right)\cdots\left(1 - \frac{d-1}{N}\right)} \\
&= 1 - \frac{d(k+1)}{N} + \frac{d(d-1)\,k(k+1)}{2N^2} + O\!\left(\left(\frac{dk}{N}\right)^{3}\right).
\end{aligned}$$
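As a sanity check on the exact adjacent neighbor expression for $R(d)$, the following Python sketch compares it with a brute-force enumeration over all sets of $d$ flipped bits on a small circle. The function names are hypothetical, and the convention that site $j$ depends on bits $j, j+1, \ldots, j+k$ (mod $N$) is an assumption made for illustration; the derivation only requires that each site fitness depend on $k+1$ adjacent bits, with $k + 1 \le N$.

```python
from itertools import combinations
from math import comb


def r_adjacent_closed_form(N, k, d):
    """Closed form for the adjacent neighbor model:
    R(d) = 1 - d(k+1)/N + d / (N * C(N-1, d-1)) * sum_{l=1}^{k} C(N-l-1, d-2) * (k+1-l).
    """
    if d == 0:
        return 1.0
    if d == 1:
        # A single flipped bit leaves no gap l <= k, so the sum is empty.
        return 1.0 - (k + 1) / N
    s = sum(comb(N - l - 1, d - 2) * (k + 1 - l) for l in range(1, k + 1))
    return 1.0 - d * (k + 1) / N + d * s / (N * comb(N - 1, d - 1))


def r_adjacent_bruteforce(N, k, d):
    """Average fraction of unchanged site fitnesses over all d-subsets of flipped bits.

    Assumed convention (illustrative): site j reads bits j, j+1, ..., j+k (mod N),
    so flipping bit b changes sites b, b-1, ..., b-k (mod N).
    """
    total_changed = 0
    n_subsets = 0
    for flipped in combinations(range(N), d):
        changed = set()
        for b in flipped:
            for offset in range(k + 1):
                changed.add((b - offset) % N)
        total_changed += len(changed)
        n_subsets += 1
    return 1.0 - total_changed / (n_subsets * N)


if __name__ == "__main__":
    N, k = 10, 2
    for d in range(1, N + 1):
        print(d, r_adjacent_closed_form(N, k, d), r_adjacent_bruteforce(N, k, d))
```

The enumeration visits all $\binom{N}{d}$ subsets, so it is only practical for small $N$; it is meant as a check on the formula, not as a way to compute $R(d)$.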

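When $d = \delta N$ and $k \ll N$,

$$\begin{aligned}
R(d) &= 1 - \frac{d(k+1)}{N} + \frac{d(d-1)}{N^2} \sum_{l=1}^{k} (k+1-l)\, \frac{\left(1 - \frac{d}{N}\right)\left(1 - \frac{d+1}{N}\right)\cdots\left(1 - \frac{d+l-2}{N}\right)}{\left(1 - \frac{1}{N}\right)\cdots\left(1 - \frac{l}{N}\right)} \\
&= 1 - (k+1)\delta + \delta^2 \sum_{l=1}^{k} (k+1-l)(1-\delta)^{l-1} + O(k^3/N) \\
&= (1-\delta)^{k+1} + O(k^3/N).
\end{aligned}$$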
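To compare the three models numerically, the following Python sketch evaluates the exact expressions for $R(d)$ derived in this Appendix next to their large-$d$ forms $(1-\delta)e^{-k\delta}$, $e^{-(k+1)\delta}$, and $(1-\delta)^{k+1}$. The function names and the parameter values $N = 10000$, $k = 3$, $\delta = 0.1$ are illustrative assumptions, not values taken from the text.

```python
import math
from math import comb


def r_random_neighbor(N, k, d):
    """Random neighbor model: R(d) = (1 - d/N) * (1 - k/(N-1))**d."""
    return (1 - d / N) * (1 - k / (N - 1)) ** d


def r_pt(N, k, d):
    """P -T model with k+1 neighbors: R(d) = (1 - (k+1)/N)**d."""
    return (1 - (k + 1) / N) ** d


def r_adjacent_neighbor(N, k, d):
    """Adjacent neighbor model, exact closed form from the Appendix."""
    if d < 2:
        # For d = 0 or d = 1 the sum over gaps l <= k is empty.
        return 1 - d * (k + 1) / N
    s = sum(comb(N - l - 1, d - 2) * (k + 1 - l) for l in range(1, k + 1))
    # Integer arithmetic is kept exact until the final division.
    return 1 - d * (k + 1) / N + (d * s) / (N * comb(N - 1, d - 1))


def large_d_forms(k, delta):
    """Leading-order forms for d = delta * N with k << N."""
    return {
        "random_neighbor": (1 - delta) * math.exp(-k * delta),
        "pt": math.exp(-(k + 1) * delta),
        "adjacent_neighbor": (1 - delta) ** (k + 1),
    }


if __name__ == "__main__":
    N, k, delta = 10_000, 3, 0.1  # illustrative values only
    d = round(delta * N)
    exact = {
        "random_neighbor": r_random_neighbor(N, k, d),
        "pt": r_pt(N, k, d),
        "adjacent_neighbor": r_adjacent_neighbor(N, k, d),
    }
    limits = large_d_forms(k, delta)
    for name in exact:
        print(f"{name:18s} exact={exact[name]:.6f} large-d form={limits[name]:.6f}")
```

All three expansions share the first-order term $1 - d(k+1)/N$, so the models differ only from second order in $d/N$ onward.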
