Energy Minimization of Protein Tertiary Structure by Parallel Simulated Annealing using Genetic Crossover

Similar documents
Computer simulations of protein folding with a small number of distance restraints

Peptides And Proteins

What makes a good graphene-binding peptide? Adsorption of amino acids and peptides at aqueous graphene interfaces: Electronic Supplementary

Packing of Secondary Structures

Molecular Structure Prediction by Global Optimization

Sequential resonance assignments in (small) proteins: homonuclear method 2º structure determination

Secondary Structure. Bioch/BIMS 503 Lecture 2. Structure and Function of Proteins. Further Reading. Φ, Ψ angles alone determine protein structure

arxiv: v1 [physics.bio-ph] 26 Nov 2007

Supplemental Materials for. Structural Diversity of Protein Segments Follows a Power-law Distribution

Parallel Tempering Algorithm for. Conformational Studies of Biological Molecules

Using Higher Calculus to Study Biologically Important Molecules Julie C. Mitchell

Supporting Information

Introduction to Comparative Protein Modeling. Chapter 4 Part I

Physiochemical Properties of Residues

Other Methods for Generating Ions 1. MALDI matrix assisted laser desorption ionization MS 2. Spray ionization techniques 3. Fast atom bombardment 4.

Section Week 3. Junaid Malek, M.D.

Properties of amino acids in proteins

Resonance assignments in proteins. Christina Redfield

Course Notes: Topics in Computational. Structural Biology.

Model Mélange. Physical Models of Peptides and Proteins

Monte Carlo Simulations of Protein Folding using Lattice Models

Central Dogma. modifications genome transcriptome proteome

Read more about Pauling and more scientists at: Profiles in Science, The National Library of Medicine, profiles.nlm.nih.gov

B O C 4 H 2 O O. NOTE: The reaction proceeds with a carbonium ion stabilized on the C 1 of sugar A.

7.012 Problem Set 1. i) What are two main differences between prokaryotic cells and eukaryotic cells?

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution

Aggregation of the Amyloid-β Protein: Monte Carlo Optimization Study

Amino Acid Side Chain Induced Selectivity in the Hydrolysis of Peptides Catalyzed by a Zr(IV)-Substituted Wells-Dawson Type Polyoxometalate

Supporting Information

NMR parameters intensity chemical shift coupling constants 1D 1 H spectra of nucleic acids and proteins

PROTEIN SECONDARY STRUCTURE PREDICTION: AN APPLICATION OF CHOU-FASMAN ALGORITHM IN A HYPOTHETICAL PROTEIN OF SARS VIRUS

Lecture 9 Evolutionary Computation: Genetic algorithms

Chemistry Chapter 22

Computational statistics

Folding of Polypeptide Chains in Proteins: A Proposed Mechanism for Folding

Protein structure. Protein structure. Amino acid residue. Cell communication channel. Bioinformatics Methods

Reproduction- passing genetic information to the next generation

Chapter 8: Introduction to Evolutionary Computation

Unraveling the degradation of artificial amide bonds in Nylon oligomer hydrolase: From induced-fit to acylation processes

Chapter 4: Amino Acids

BIRKBECK COLLEGE (University of London)

Supporting information to: Time-resolved observation of protein allosteric communication. Sebastian Buchenberg, Florian Sittel and Gerhard Stock 1

April, The energy functions include:

UNIT TWELVE. a, I _,o "' I I I. I I.P. l'o. H-c-c. I ~o I ~ I / H HI oh H...- I II I II 'oh. HO\HO~ I "-oh

New Optimization Method for Conformational Energy Calculations on Polypeptides: Conformational Space Annealing

Proteins: Characteristics and Properties of Amino Acids

CSC 4510 Machine Learning

Protein Structures: Experiments and Modeling. Patrice Koehl

Computational Protein Design

Lecture 15: Realities of Genome Assembly Protein Sequencing

NMR study of complexes between low molecular mass inhibitors and the West Nile virus NS2B-NS3 protease

CHEM J-9 June 2014

CHMI 2227 EL. Biochemistry I. Test January Prof : Eric R. Gauthier, Ph.D.

12/6/12. Dr. Sanjeeva Srivastava IIT Bombay. Primary Structure. Secondary Structure. Tertiary Structure. Quaternary Structure.

Meeting the challenges of representing large, modified biopolymers

Major Types of Association of Proteins with Cell Membranes. From Alberts et al

Bacterial protease uses distinct thermodynamic signatures for substrate recognition

Protein Fragment Search Program ver Overview: Contents:

arxiv:cond-mat/ v1 [cond-mat.stat-mech] 2 Nov 2006

The translation machinery of the cell works with triples of types of RNA bases. Any triple of RNA bases is known as a codon. The set of codons is

PROTEIN ORIGAMI - A PROGRAM FOR THE CREATION OF 3D PAPER MODELS OF FOLDED PEPTIDES

Structure and evolution of the spliceosomal peptidyl-prolyl cistrans isomerase Cwc27

Evolutionary Functional Link Interval Type-2 Fuzzy Neural System for Exchange Rate Prediction

Translation. A ribosome, mrna, and trna.

Ramachandran Plot. 4ysz Phi (degrees) Plot statistics

THE UNIVERSITY OF MANITOBA. PAPER NO: 409 LOCATION: Fr. Kennedy Gold Gym PAGE NO: 1 of 6 DEPARTMENT & COURSE NO: CHEM 4630 TIME: 3 HOURS

Exam I Answer Key: Summer 2006, Semester C

Supersecondary Structures (structural motifs)

Ranjit P. Bahadur Assistant Professor Department of Biotechnology Indian Institute of Technology Kharagpur, India. 1 st November, 2013

Computational protein design

Any protein that can be labelled by both procedures must be a transmembrane protein.

Biological Macromolecules

Peptide Syntheses. Illustrative Protection: BOC/ t Bu. A. Introduction. do not acid

Dominant Paths in Protein Folding

HSQC spectra for three proteins

Programme Last week s quiz results + Summary Fold recognition Break Exercise: Modelling remote homologues

Clustering and Model Integration under the Wasserstein Metric. Jia Li Department of Statistics Penn State University

Bahnson Biochemistry Cume, April 8, 2006 The Structural Biology of Signal Transduction

Details of Protein Structure

Similarity or Identity? When are molecules similar?

A prevalent intraresidue hydrogen bond stabilizes proteins

Advanced Topics in RNA and DNA. DNA Microarrays Aptamers

Local Search & Optimization

Molecular Mechanics. I. Quantum mechanical treatment of molecular systems

X-ray crystallography NMR Cryoelectron microscopy

Amino Acids and Peptides

Viewing and Analyzing Proteins, Ligands and their Complexes 2

Supplementary Information. Broad Spectrum Anti-Influenza Agents by Inhibiting Self- Association of Matrix Protein 1

Biophysical Journal, Volume 96. Supporting Material

Lecture 2 and 3: Review of forces (ctd.) and elementary statistical mechanics. Contributions to protein stability

114 Grundlagen der Bioinformatik, SS 09, D. Huson, July 6, 2009

DETECTING THE FAULT FROM SPECTROGRAMS BY USING GENETIC ALGORITHM TECHNIQUES

Geometrical Concept-reduction in conformational space.and his Φ-ψ Map. G. N. Ramachandran

Advanced Molecular Dynamics

TOPOLOGY STRUCTURAL OPTIMIZATION USING A HYBRID OF GA AND ESO METHODS

THE TANGO ALGORITHM: SECONDARY STRUCTURE PROPENSITIES, STATISTICAL MECHANICS APPROXIMATION

Supplementary Figure 3 a. Structural comparison between the two determined structures for the IL 23:MA12 complex. The overall RMSD between the two

Protein Struktur. Biologen und Chemiker dürfen mit Handys spielen (leise) go home, go to sleep. wake up at slide 39

Overview. The peptide bond. Page 1

BCH 4053 Exam I Review Spring 2017

Transcription:

Minimization of Protein Tertiary Structure by Parallel Simulated Annealing using Genetic Crossover Tomoyuki Hiroyasu, Mitsunori Miki, Shinya Ogura, Keiko Aoi, Takeshi Yoshida, Yuko Okamoto Jack Dongarra Department of Engineering, Doshisha University@ Graduate School of Engineering, Doshisha University@ Department of Theoretical Studies, Institute for Molecular Science@ Department of Computer Science, University of Tennessee@ From our recent research, it has been determined that Parallel Simulated Annealing using Genetic Crossover (/GAc) has a high searching capability on an energy minimization of a small protein called Met-enkephalin. /GAc performs genetic crossover (one of the operations of Genetic Algorithm (GA)) among the Parallel SA () to exchange their information. In this paper, /GAc is applied to the energy minimization of C-peptide and Parathyroid Hormone Fragment(1-34). Also among the parameters of /GAc, crossover interval and total number of searching steps, which are supposed to have greater influence on the searching capability, are modified to some values in order to examine and study their influences. The results show that /GAc provides lower energy of the target proteins than Parallel SA. Furthermore, higher frequency of crossover and longer searching steps are proven to derive lower energies. From the results we conclude that /GAc is also effective on the predictions of larger proteins. keywords: Prediction of Protein Tertiary Structure, Simulated Annealing, Genetic Algorithm, Optimization 1 Introduction Protein is a substance directly connected to the biological phenomena and its biological functions are said to be derived from its tertiary structure. Thus, clarification of its tertiary structure leads to an explanation of the biological phenomena process. Tertiary structure of the protein is believed to correspond to the conformation with the lowest residual potential energy. Therefore, as one of the methods to predict the tertiary structure of proteins, minimization of the energy function has been studied. Simulated Annealing (SA) has often been employed as the optimization method to predict the tertiary structure of proteins by the minimization of the energy 4). However, the energy function that determines the tertiary structure is complicated with large numbers of global and local minima. Therefore in order to acquire a fast convergence to a global optimal, a hybrid method that combines Sequential SA (SSA) with a mechanism to improve the searching capability has been developed. Thus, as a combination method of SA with an operator of Genetic Algorithm (GA), Parallel Simulated Annealing using Genetic Crossover (/GAc) is what we propose 3). GA is another optimization method that imitates an evolution of living things. GA is said to be superior to SA at searching for the optimum in a wider area of the searching domain. Our previous research showsthat/gac has higher searching capability than SSA on the energy minimization of a small protein called Met-enkephalin 3). In this paper, /GAc is applied to the energy minimization of the proteins larger than Met-enkephalin; C-peptide and Parathyroid Hormone Fragment (1-34). Also among the parameters of /GAc, crossover interval and total number of searching steps, which are supposed to have greater influence on the searching capability, are modified to some values to examine and study their influences. The results of /GAc are compared to those of Parallel SA (). 1

2 SA and GA 2.1 Simulated Annealing The term annealing derives from the physical process of heating and then slowly cooling a substance to obtain a strong crystalline structure. Simulated Annealing (SA) is the optimization method to stochastically simulate this physical process of annealing on the computers. In SA, the simulation proceeds by randomly generating a solution and then determining its acceptance in certain probability. A temperature parameter is used to determine this probability. In the basic algorithm of SA, three operations bear an important role; generate, accept, and cool. The generation operation modifies a current solution x and generates a next solution x using a probability distribution G(x, x ). The accept operation is a judgment to decide whether to accept the modification or not. The acceptance of the modification is determined from a difference E(= E E) of a current energy E = f(x) and a modified energy E = f(x ), and the temperature parameter T. Metropolis et al. 6) introduced a simple algorithm to provide an efficient simulation. That is, if E 0, the modification is accepted. In case E >0, the modification is accepted at certain probability. This algorithm can be defined as: { 1 if E 0 P ACCEPT = exp( E T ) otherwise (1) The temperature T is the important parameter to control the acceptance of the modified solution. At the beginning of the simulation, both the temperature and the acceptance levels are high. As the simulation proceeds and the temperature decreases, solutions that have the bigger fitness values. Cooling is the operation to generate the temperature of the next state T k+1 from the temperature of the current state T k. The simulation begins with the initial solution x with energy of E and the initial temperature T. The solution is then randomly modified to x with energy of E. The acceptance of the modification is calculated from the difference of energy, E(= E E), and the temperature T k. If accepted, the solution x becomes the starting point of the next step. These operations are repeated long enough at each temperature for the system to reach a steady state, or equilibrium. When reaching the steady state at T k,thetemperature is cooled to T k+1 and the simulation is repeated until reaching steady state again. Users define the terminal criterion, such as the number of evaluations. The simulation is concluded when the temperature becomes low enough and a terminal criterion is met. 2.2 Genetic Algorithm Genetic Algorithm (GA) is the optimization algorithm that imitates the evolution of living creatures. In nature, inadaptable creatures to an environment meet extinction, and only adapted creatures can survive and reproduce. A repetition of this natural selection spreads the superior genes to conspecifics and then the species prospers. GAmodelsthisprocessofnatureoncomputers. GA can be applied to several types of optimization problems by encoding design variables of individuals. Searching for the solution proceeds by performing the three genetic operations on the individuals; selection, crossover, and mutation, which play an important role in GA. Selection is an operation that imitates the survival of the fittest in nature. The individuals are selected for the next generation according to their fitness. Crossover is an operation that imitates the reproduction of living creatures. The crossover exchanges the information of the chromosomes among individuals. Mutation is an operation that imitates the failure that occurs when copying the information of DNA. Mutating the individuals in a proper probability maintains the diversity of the population. 3 Parallel SA using Genetic Crossover Parallel Simulated Annealing using Genetic Crossover (/GAc) is the optimization method to exchange the informations of solutions among SAs running in parallel by the genetic 2

crossover 2). In this algorithm, the total number of SAs running in parallel is defined as population size and each SA is defined as individual 2). Procedure of /GAc is stated below: step 1 SAs start running in parallel with initial solutions. step 2 When number of annealing reaches a certain number d, two solutions (individuals) from the parallel SAs are randomly paired. step 3 For each pair of individuals, the crossover operation is performed and two new individuals are generated. The crossover operation used in /GAc is one-point crossover between the design variables. step 4 Among the two parents and the two offsprings, the two individuals with the better fitness are selected as new searching points. step 5 A certain number d of annealing is performed from the new searching points. step 6 Each pair of individuals performs step 3 to step 5. step 7 Step 2 to step 6 are repeated until the terminal criterion is met. A concept of /GAc is illustrated in Figure 1. SA SA SA d SA high individual temperature crossover d crossover d crossover... d end low temperature d : crossover interval Figure 1: Concept of Parallel Simulated Annealing using Genetic Crossover Figure 2 describes the concept of steps 3 and 4. The figure describes the case of three design variables where the randomly selected boundary between the first and the second variables becomes a crossover point. After evaluating the individuals, the individuals with higher evaluations (parent 2 and offspring 2) are chosen as the next searching points. parent 1 parent 2 offspring 1 offspring 2 crossover evaluation rank -2.0-1.1-2.3-0.8 3 2 4 1 parent 2 offspring 2 next individuals Figure 2: Concept of crossover and selection In case the optimal values of some design variables are obtained, the crossover operation works to transfer the values to other individuals. Thus, the convergence of the annealing is precipitated. The former experiments of applying /GAc on some continuous minimization problems show that /GAc has a better convergence property than SSA, and on the problems difficult for GA to solve, /GAc has also been confirmed to be effective 2). 4 Minimization of Proteins by /GAc In this research, /GAc is applied to the energy minimization of larger proteins to verify its performance and also to study some of its parameters. 4.1 Target Proteins The target proteins of this research are C-peptide and Parathyroid Hormone Fragment(1-34). C- peptide consists of 13 amino residues, has 13 pairs of dihedral angles (φ, ψ) in the backbone, and 38 dihedral angles in the side chains. PTH(1-34) consists of 34 amino residues, has 34 pairs of dihedral angles in the backbone, and 1 dihedral angles in the side chains. The energy minimization of these proteins is gas-phase simulation based on the energy parameters of ECEPP/2 7, 8, 12). The dihedral angles of backbone and side chains are applied as design variables. Values of the dihedral angles are in the range of [-180, 180 ]. Each dihedral angle is generated and given the accept criterion sequentially, and then the temperature is cooled. In this paper, this series of operations are defined as a Monte Carlo Sweep (). Thus, 64 3

Metropolis criterions are performed in one in the case of C-peptide and 178 in the case of PTH(1-34). Okamoto s experiments, using the energy function that adopts ECEPP/2 energy parameters, show that C-peptide has the lowest energy conformation when eight residues (4-11) form an α- helix. The energy of this lowest energy conformation was -42.0 kcal/mol 9). Also, the lowest energy conformation of C-peptide obtained in this experiment corresponds with the structure deduced from the X-ray crystallography experiment 1). It is known by the NMR experiment that PTH(1-34) has two α-helices 5). The lowest energy conformation of PTH(1-34) deduced from Okamoto s experiment also had two α-helices and its energy is -2.0 kcal/mol 11). 4.2 Minimization of C-peptide In the energy minimization of C-peptide, Okamoto made 20 runs of SSA with each run consisting of,000 s 9). In order to equalize the total number of s with Okamoto s experiment, the parameters of /GAc were set to 24 individuals 4,165 s and two runs were made. All the experiments were started from random initial conformations. Next, states of the dihedral angles are generated stochastically from the neighborhood range using uniform distribution. Neighborhood range [max, min] is determined as: max = 180 180 0.7 #ṣweep Total # sweeps min = max (2) Other parameters are shown in Table 1. The initial and the final temperature are the same values as Okamoto s ). The result of the experiment signified that the lowest energy conformation was obtained when the crossover interval was set to 32. Table 2 lists the main-chain dihedral angles of the conformation with the lowest energy, which was - 53.9 kcal/mol. The conformation is considered as α-helical when the dihedral angles (φ, ψ) fall in the range ( 60 ± 45, 50 ± 45 ) 9), therefore, the lowest energy conformation exhibits α-helices in nine residues. The α-helical residues are indicated with a in Table 2. Figure 3 shows the lowest energy conformation of C-peptide obtained in the experiment. The lowest energy conformation derived from the experiment had a lower energy than that of Okamoto and the α-helices were in accord with the known conformation. Consequently, /GAc has high convergence property on the energy minimization of C-peptide. Table 1: Parameters of /GAc Parameter Value Initial Temperature 2.0 (00K) Final Temperature 0.1 (50K) Crossover Interval 8, 16, 32, 64 Table 2: Amino acid sequence of C-peptide and the dihedral angles of the lowest energy conformation 5 Sequence Ly+ Gl- Thr Ala Ala 23-79 -76-67 -66-66 2 86-27 -32 - - - a a Ala Ly+ Phe Glu Ar+ -79-62 -65-65 -63-43 -41-41 -41 a a a a a 13 Gln Hi+ Met -72-69 -82-30 5 a a - = 53.9 kcal/mol Figure 3: Lowest energy conformation of C- peptide obtained by /GAc 4

4.3 Minimization of PTH(1-34) In the energy minimization of PTH(1-34), Okamoto made 20 runs of SSA with each run consisting of,000 s 11). In order to equalize the total number of s with Okamoto s experiment, the parameters of /GAc was set to 24 individuals 4,165 s, and two runs were made. All the experiments were started from random initial conformations. Other parameters are shown in Table 1. The result of the experiment signified that the lowest energy conformation was obtained when the crossover interval was set to eight. Table 3 lists the main-chain dihedral angles of the conformation with lowest energy, which was -236.6 kcal/mol. The lowest energy conformation exhibits α-helices in residues 2-8 and 15-18. Figure 4 shows the lowest energy conformation of PTH(1-34) obtained in the experiment. Table 3: Amino acid sequence of PTH(1-34) and the dihedral angles of the lowest energy conformation 5 Sequence Ser Val Ser Glu Ile 93-65 -68-76 -71 157-26 -36-37 -35 - a a a a Gln Leu Met His Asn -64-70 -55-64 -112-41 -44-43 -96 145 a a a - - 15 Leu Gly Lys His Leu -5 76-95 -90-73 8-90 165-1 -22 - - - - a 20 Asn Ser Met Glu Ar+ -85-85 -74-88 -142-20 -37-30 64-60 a a a - - 25 Val Glu Trp Leu Ar+ -134-85 -64-65 -155-58 73 131 8 112 - - - - - 30 Lys Lys Leu Gln Asp -113-5 -148-64 -65 63-32 148-55 145 - - - a - 34 Val His Asn Phe 48-55 -60-84 71-46 153 118 - a - - = 236.6 kcal/mol Figure 4: Lowest energy conformation of PTH(1-34) obtained by /GAc As stated in section 4.1, PTH(1-34) is known to have two α-helices from both NMR and Okamoto s experiment. The lowest energy conformation obtained by /GAc also exhibited two α-helices and retained lower energy than the Okamoto s. Consequently, /GAc has high convergence property on the energy minimization of PTH(1-34). 4.4 Study on the Number of s and Crossover 4.4.1 Abstract of the Experiment /GAc performs genetic crossover, one of the operations of GA, to transfer the solutions among SAs running in parallel. In case the optimal values of some design variables are obtained, the crossover operation works to transfer the values to other individuals. Thus, the convergence of the annealing can be precipitated. Therefore the frequency of the crossover operation gives significant influence on the performance of /GAc. 5

0 - -20-30 0 - -20-30 0 - -20-30 -50 0 00 2000 3000 4000 5000 (a) 16 individuals 6,000-50 0 500 00 1500 2000 2500 (b) 32 individuals 3,000-50 0 200 400 600 800 00 1200 1400 (c) 64 individuals 1,500 Figure 5: history of C-peptide. crossover intervals ranging from 16 to 64. Each figure presents the result of and /GAc with -24-28 -32-36 -44-48 -52-56 0 20 30 40 50-24 -28-32 -36-44 -48-52 -56 0 20 30 40 50-24 -28-32 -36-44 -48-52 -56 0 20 30 40 50 (a) 16 individuals 6,000 (b) 32 individuals 3,000 (c) 64 individuals 1,500 Figure 6: Lowest energies of C-peptide obtained in each run are plotted in the descending order. Each figure presents the result of and /GAc with crossover intervals ranging from 16 to 64. Thus, in this section, the influence of the total number of s and the frequency of the crossover on the convergence property of /GAc are studied. In the experiment, parameters of /GAc are set to 16 individuals 6,000 s, 32 individuals 3,000 s, and 64 individuals 1,500 s with crossover intervals of 16, 32, and 64. Among /GAcs, Parallel SA () is also applied. is the method to fetch the best solution from plural SSAs running in parallel. Total number of evaluations in every run of the experiment is approximately united. 4.4.2 Result of C-peptide The results of the energy minimization of C- peptide by /GAc with parameters of 16 individuals 6,000, 32 individuals 3,000, and 64 individuals 1,500 are summarized in Figure 5(a), Figure 5(b), and Figure 5(c), respectively whichshowthebestof the individuals as the number of s proceeds. The figures show the average of 50 runs. Each figure also shows the result with crossover intervals of 16, 32, and 64. The horizontal and the vertical axes each represent the number of proceeded s and the energy of protein (kcal/mol). Figure 6(a), Figure 6(b), and Figure 6(c) present the lowest energies obtained in each run of and /GAc with crossover intervals of 16, 32, and 64, respectively. Each energy is plotted in the descending order. The horizontal and the vertical axes each represent the number of runs and the energy of proteins(kcal/mol). 6

-0-120 -140-160 -180 0 00 2000 3000 4000 5000 (a) 16 individuals 6,000-0 -120-140 -160-180 0 500 00 1500 2000 2500 (b) 32 individuals 3,000-0 -120-140 -160-180 0 200 400 600 800 00 1200 1400 (c) 64 individuals 1,500 Figure 7: history of PTH(1-34). Each figure presents the result of and /GAc with crossover intervals ranging from 16 to 64. -2-2 -2-230 -230-230 -250-260 -250-260 -250-260 0 5 15 20 0 5 15 20 0 5 15 20 (a) 16 individuals 6,000 (b) 32 individuals 3,000 (c) 64 individuals 1,500 Figure 8: Lowest energies of PTH(1-34) obtained in each run are plotted in descending order. Each figure presents the result of and /GAc with crossover intervals ranging from 16 to 64. 4.4.3 Result of PTH(1-34) The result for the energy minimization of PTH(1-34) by /GAc with parameters of 16 individuals 6,000, 32 individuals 3,000, and 64 individuals 1,500 are summarized in Figure 7(a), Figure 7(b), and Figure 7(c), respectively whichshowthebestof the individuals as the number of s proceeds. The figures show the average of 20 runs. Each figure also shows the result with crossover intervals of 16, 32, and 64. The horizontal and the vertical axes each represent the number of proceeded s and the energy of proteins (kcal/mol). Figure 8(a), Figure 8(b), and Figure 8(c) present the lowest energies obtained in each run of and /GAc with crossover interval of 16, 32, and 64. Each energy is plotted in the descending order. The horizontal and the vertical axes each represent the number of runs and the energy of proteins (kcal/mol). 4.4.4 Result of Met-enkephalin Results for the energy minimization of small protein consisted of five residues called Metenkephalin are shown below for comparison. Figure 9(a) presents the transition of energy obtained by /GAc with the parameters of 16 individuals 6,000 s, respectively. The transition is an average of 50 runs. Figure 9(b) and Figure 9(c) each present the lowest energy obtained in 50 runs of /GAc with the parameters of 16 individuals 6,400 s and 64 individuals 1,500 s. The energies are plotted in the descending order. 7

-2-4 -6-8 - -12 0 00 2000 3000 4000 5000 6000-7.5-8.0-8.5-9.0-9.5 -.0 -.5-11.0-11.5-12.0-12.5 0 20 30 40 50-7.5-8.0-8.5-9.0-9.5 -.0 -.5-11.0-11.5-12.0-12.5 0 20 30 40 50 (a) Transition of energies obtained by /GAc with 16 individuals 6,000 s (b) The lowest energies of each runs obtained by /GAc with 16 individuals 6,000 s (c) The lowest energies of each runs obtained by /GAc with 64 individuals 1,500 s Figure 9: Transition of the energy and the lowest energies obtained in the energy minimization of Metenkephalin. 4.4.5 Discussion of the Results The following observations can be made from the results of both C-peptide and PTH(1-34). Firstly, we made a comparison of /GAc with the large and small total number of s. In each case /GAc with the larger number of s acquired better convergence. This is because the total number of s is linked with the cooling schedule; thus, /GAc with longer s could search in both global and local areas of the searching domain sufficiently. On the other hand, Figure 9(c) reveals that /GAc with 64 individuals 1,500 s has the best convergence property in the energy minimization of Met-enkephalin. Because Met-enkephalin is a small protein with 19 dihedral angles, it can be considered to be simple as an optimization problem. Therefore, the small number of s and the short crossover interval are sufficient for the search on the energy minimization of Metenkephalin. On the other hand, for the energy minimization of larger proteins such as C-peptide and PHT(1-34), larger number of s is required to obtain enough convergence. Secondly, we focused on the effect of the crossover operation. By comparison the results of /GAc with those of that does not have the cross over operation, it can be said that /GAc derived better solutions. Thus, crossover operation is working effectively on the energy minimization of these proteins. Comparison of the crossover interval also revealed that the convergence property of /GAc is improved by the crossover operation. The smaller the crossover interval is (the higher frequency of the crossover operation is) in either results the faster the convergence becomes. When the total number of s is large, this trend is prominent. In the case of the energy minimization of Metenkephalin, it is obvious from Figure 9(a) that the crossover interval has almost no influence on the convergence property of /GAc in spite of the total number of s. Figure 9(b) indicates that the performance of /GAc with a crossover interval of 16 is inferior to /GAc with other crossover intervals. That is, when the total number of s for the energy minimization of small proteins is unnecessarily large, setting a short crossover interval yields a decrement in the diversity of individuals, and therefore, the performance decreases. From the discussion, it can be speculated that /GAc and the crossover operation are especially effective for the energy minimization of large proteins that require the large number of s. 5 Conclusions In this paper, Parallel Simulated Annealing using Genetic Crossover (/GAc) has confirmed its effectiveness on the energy minimization of the 8

small protein called Met-enkephalin and applied to the energy minimization of larger proteins; C- peptide and PTH(1-34). Firstly, to compare the performance of /GAc with Okamoto s Sequential SA (SSA), /GAc was applied to the energy minimization of C-peptide and PTH(1-34). The result demonstrated that /GAc has a better convergence property than SSA on the energy minimization of both proteins. Secondly, /GAc and Parallel SA () were applied to the energy minimization of above proteins to make a comparison of the performances. (Number of individuals) (total number of s) was set to constant (total number of evaluation is approximately equal) to study the influence of total numbers of s and the crossover operation. The result revealed that /GAc obtains lower energy than. In addition, the result also indicates that the shorter the crossover interval is, the more frequently the lower energies were obtained. Studying the above results, we can say that the crossover operation of /GAs is working effectively on the energy minimization of proteins. Additionally, it can be concluded that /GAc is also effective on the energy minimization of larger proteins. References 1) Ulrich H. E. Hansmann and Yuko Okamoto. Tertiary Structure Prediction of C-Peptide of Ribonuclease A by Multicanonical Algorithm. J. Phys. Chem. B, 2(4):653 656, 1998. 2) Tomoyuki Hiroyasu, Mitsunori Miki, and Maki Ogura. Parallel simulated annealing using genetic crossover. Proceedings of the 44th Institute of Systems, Control and Information Engineers, pages 113 114, 2000. 4) Hikaru Kawai, Takeshi Kikuchi, and Yuko Okamoto. A prediction of tertiary structures of peptide by the Monte Carlo simulated annealing method. Protein Engineering, 3(2):85 94, 1989. 5) W. Klaus, T. Dieckmann, V. Wray, D. Schomburg, E. Wingender, and H. Mayer. Biochemistry, 30:6936 6942, 1991. 6) N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. Equation of State Calculations by Fast Computing Machines. The Jurnal of Chemical Physics, 21(6):87 92, 1953. 7) F.A. Momany, R.F. McGuire, A.W. Burgess, and H.A. Scheraga. J. Phys. Chem., 79:2361 2381, 1975. 8) G. Nemethy, M.S. Pottle, and H.A. Scheraga. J. Phys. Chem., 87:1883 1887, 1983. 9) Yuko Okamoto, Masataka Fukugita, Takashi Nakazawa, and Hikaru Kawai. α-helix folding by Monte Carlo simulated annealing in isolated C-peptide of ribonuclease A. Protein Engineering, 4(6):639 647, 1991. ) Yuko Okamoto, Takeshi Kikuchi, and Hikaru Kawai. Prediction of Low- Structures of Met-Enkephalin by Monte Carlo Simulated Annealing. CHEMISTRY LETTERS, pages 1275 1278, 1992. 11) Yuko Okamoto, Takeshi Kikuchi, Takashi Nakazawa, and Hikaru Kawai. α-helix structure of parathyroid hormone fragment (1-34) predicted by Monte Carlo simulated annealing. INTERNATIONAL JOURNAL OF PEPTIDE & PROTEIN RESEARCH, 42:300 303, 1993. 12) M.J. Sippl, G. Nemethy, and H.A. Scheraga. J. Phys. Chem., 88:6231 6233, 1984. 3) Tomoyuki HIROYASU, Mitsunori MIKI, Maki OGURA, and Yuko OKAMOTO. Examination of Parallel Simulated Annealing using Genetic Crossover. Journal of Information Processing Society of Japan, 43(SIG7(TOM6)):70 79, 2002. 9