The impact of hydrodynamic interactions on protein folding rates depends on temperature

Fabio C. Zegarra,1,2 Dirar Homouz,1,2,3 Yossi Eliaz,1,2 Andrei G. Gasic,1,2 and Margaret S. Cheung1,2,*
1 Department of Physics, University of Houston, Houston, Texas 77204, USA
2 Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
3 Khalifa University of Science and Technology, Department of Physics, P.O. Box 127788, Abu Dhabi, United Arab Emirates
*Corresponding author: mscheung@uh.edu

ABSTRACT
We investigated the impact of hydrodynamic interactions (HI) on protein folding using a coarse-grained model. The extent of the impact of hydrodynamic interactions, whether it accelerates, retards, or has no effect on protein folding, has been controversial. Together with a theoretical framework of the energy landscape theory (ELT) for protein folding that describes the dynamics of the collective motion with a single reaction coordinate across a folding barrier, we compared the kinetic effects of HI on the folding rates of two protein models that use a chain of single beads with distinctive topologies: a 64-residue α/β chymotrypsin inhibitor 2 (CI2) protein, and a 57-residue β-barrel α-spectrin src-homology 3 domain (SH3) protein. When comparing the protein folding kinetics simulated with Brownian dynamics in the presence of HI to that in the absence of HI, we find that the effect of HI on protein folding shows a crossover behavior about the folding temperature. That is, at a temperature greater than the folding temperature, the enhanced friction from the hydrodynamic solvent between the beads in an unfolded configuration results in a lowered folding rate; conversely, at a temperature lower than the folding temperature, HI accelerates folding by the backflow of solvent toward the native folded state. Additionally, the extent of acceleration depends on the topology of a protein: for a protein like CI2, where the folding nucleus is rather diffuse in the transition state, HI channels the formation of contacts by favoring a major folding pathway in a complex free energy landscape, thus accelerating folding. For a protein like SH3, where the folding nucleus is already specific and less diffuse, HI matters less at a temperature lower than the folding temperature. Our findings provide further theoretical insight into protein folding kinetic experiments and simulations.

I. INTRODUCTION
Aqueous solvent plays an active role in the dynamics of proteins by inducing the hydrophobic collapse of the chain and helping in the search for the specific three-dimensional structure needed to perform their biological function [1]. However, the motions of the solute particles of the protein are not independent; they are intimately coupled through the solvent. As solute particles move, they induce a flow in the solvent, which, in turn, affects the motion of neighboring solute particles. These long-range interactions between solute particles and solvent, known as hydrodynamic interactions (HI), have been studied extensively in polymers both analytically [2,3] and numerically [4]. HI generally accelerates the speed of collapse [5-8] when a polymer is quenched from good to poor solvent at θ temperature.
Unlike a homopolymer, a heteropolymeric protein is made up of 20 different amino acids that interact through electrostatics or van der Waals forces to various extents. These interactions are long-range in nature, which complicates the analysis of HI in protein folding. Several groups have employed computer simulations [9-12] to investigate HI effects on protein folding with Langevin dynamics. To date, the outcome of coarse-grained protein folding simulations on whether HI accelerates or deters protein folding varies among research groups. The Cieplak group and the Elcock group showed that HI moderately accelerates the folding kinetic rates by a factor of 1.2 to 3.6 [9,10]. A recent study by the Scheraga group argued that HI reduces the folding kinetic rates [11]. Furthermore, Kikuchi et al. [12] claimed that HI accelerates kinetic rates, albeit by a small amount. Notably, little work has been done on the temperature dependence of these findings.
The discussion about temperature is necessary because protein folding from an unfolded configuration to a natively compact one requires imperfect cancellation of configurational entropy loss and enthalpy gain during the course of collapse, which gives rise to a temperature-dependent activation barrier [13,14]. Without a comprehensive investigation over a wide range of temperatures, it is challenging to delineate the real impact of HI on protein folding rates. Despite the confounding results from the groups mentioned above, they all might be correct within their own specific temperature range. Our motivation is to reconcile the differences in reported influences of HI on protein folding over a wide range of temperatures from the viewpoint of the folding energy landscape theory [14,15], particularly with a funnel-shaped energy landscape [16]. We used a computer protein model that is guaranteed to fold into the native state from any unfolded conformation [17]. We tracked its collective motion on a single reaction coordinate, the fraction of native contact formation Q, either on a thermodynamic free energy barrier or by kinetic trajectories.
We studied the effects of HI on the folding of two well-studied model proteins with distinctive topologies: one is the 64-residue α/β protein chymotrypsin inhibitor 2 (CI2) [18] shown in Fig. 1(a), and the other is the 57-residue β-barrel α-spectrin Src-homology 3 (SH3) domain [19] shown in Fig. 1(b). The two proteins fold and unfold in a two-state manner and have been used for studying folding mechanisms in other computational studies [20-24]. We simulated the Brownian dynamics of particles including HI by implementing the algorithm developed by Ermak and McCammon [25]. The effects of HI are approximated through a configuration-dependent diffusion tensor $\mathbf{D}$ used in the Brownian equation of motion.
Our study shows that HI can both accelerate protein folding at a temperature lower than the folding temperature and retard protein folding at a temperature higher than the folding temperature, in comparison with the folding dynamics without HI. Since HI affects the kinetic ordering of contact formation, for a protein with multiple viable folding pathways like CI2, HI will favor a particular folding route in a complex folding energy landscape. In that sense, ELT falls short of fully predicting folding rates. From Secs. III B to III E, we investigate the cause of this temperature dependence of the effect of HI on folding rates and the implications for energy landscape theory. We also suggest a possible experimental design to probe the impact of HI on folding based on a temperature-dependent ϕ-value analysis.

II. MODELS AND METHODS
A. Coarse-grained protein model
We used a coarse-grained, structure-based model [17] for two well-studied model proteins: the 64-residue α/β protein chymotrypsin inhibitor 2 (CI2) (PDB ID: 1YPA) [18] in Fig. 1(a), and the 57-residue β-barrel α-spectrin Src-homology 3 (SH3) domain (PDB ID: 1SHG) [19] in Fig. 1(b). A structure-based model is a toy model that provides a single global basin of attraction corresponding to an experimentally determined configuration and smooths out the ruggedness on the funneled energy landscape [26]. This allows us to study the ideal energy landscape of a protein. In this coarse-grained model, each residue is represented by one bead placed at its α-carbon position, creating a string of beads that represents the entire protein [see Figs. 1(c) and 1(d) for CI2 and SH3, respectively]. The Hamiltonian of our system depends on the experimentally determined configuration (also known as the native state) and consists of backbone terms, attractive interactions between beads in close proximity to each other in the native state, and excluded volume. It has the following form, taken from the model developed by Clementi et al. [17]:
$$
\mathcal{H}(\Gamma, \Gamma^0) = \sum_{i<j} k_r \left(r_{ij} - r_{ij}^0\right)^2 \delta_{j,i+1} + \sum_{i \in \text{angles}} k_\theta \left(\theta_i - \theta_i^0\right)^2 + \sum_{i \in \text{dihedral}} k_\phi \left\{ \left[1 - \cos\left(\phi_i - \phi_i^0\right)\right] + \frac{1}{2}\left[1 - \cos 3\left(\phi_i - \phi_i^0\right)\right] \right\} + \sum_{\substack{|i-j|>3 \\ i,j \in \text{native}}} \varepsilon \left[ 5\left(\frac{r_{ij}^0}{r_{ij}}\right)^{12} - 6\left(\frac{r_{ij}^0}{r_{ij}}\right)^{10} \right] + \sum_{\substack{|i-j|>3 \\ i,j \notin \text{native}}} \varepsilon \left(\frac{\sigma}{r_{ij}}\right)^{12}, \tag{1}
$$
where $\Gamma$ is a configuration of the set $r$, $\theta$, $\phi$. The $r_{ij}$ term is the distance between the $i$th and $j$th residues, $\theta$ is the angle defined by three consecutive beads, and $\phi$ is the dihedral angle between four consecutive beads. We define $\varepsilon = 0.6$ kcal/mol as the solvent-mediated interaction, and $k_r = 100\varepsilon$, $k_\theta = 20\varepsilon$, $k_\phi = \varepsilon$ [17]. $\delta$ is the Kronecker delta function. The native state values $r^0$, $\theta^0$, and $\phi^0$ for both proteins were obtained from their crystal structures $\Gamma^0$ [18,19], where $\Gamma^0 = \{\{r^0\},\{\theta^0\},\{\phi^0\}\}$. The non-bonded terms consist of a Lennard-Jones interaction between native pairs and an excluded volume interaction between non-native pairs. The native contact pairs were chosen using the CSU program [27]. $\sigma$ for non-native pairs is 4 Å [17,28].
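As an illustration of the non-bonded part of Eq. (1), the following is a minimal Python sketch, not the authors' code. The function names and the dictionary layout for native pairs are hypothetical choices made for this example; the values of ε and σ are the ones quoted in the text.

```python
import math

EPSILON = 0.6  # kcal/mol, solvent-mediated interaction strength quoted in the text
SIGMA = 4.0    # Angstrom, excluded-volume distance for non-native pairs


def native_pair_energy(r, r0, eps=EPSILON):
    """12-10 attraction for a native pair; its minimum is -eps at r = r0."""
    ratio = r0 / r
    return eps * (5.0 * ratio ** 12 - 6.0 * ratio ** 10)


def nonnative_pair_energy(r, sigma=SIGMA, eps=EPSILON):
    """Purely repulsive excluded-volume term for a non-native pair."""
    return eps * (sigma / r) ** 12


def nonbonded_energy(coords, native_pairs, eps=EPSILON, sigma=SIGMA):
    """Sum of the last two terms of Eq. (1).

    coords: list of (x, y, z) bead positions in Angstrom.
    native_pairs: hypothetical layout {(i, j): r0} with i < j.
    Only pairs with |i - j| > 3 contribute.
    """
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 4, n):
            r = math.dist(coords[i], coords[j])
            if (i, j) in native_pairs:
                total += native_pair_energy(r, native_pairs[(i, j)], eps)
            else:
                total += nonnative_pair_energy(r, sigma, eps)
    return total
```

Because the 12-10 form reaches its minimum of -ε exactly at the native distance, a structure with all native pairs at their native separations sits at the bottom of the single basin that the structure-based model is designed to provide.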

FIG. 1. Representations of protein models of chymotrypsin inhibitor 2 (CI2) (top row) and the α-spectrin Src-homology 3 (SH3) domain (bottom row). The protein models are in [(a), (b)] a cartoon representation, [(c), (d)] a Cα-only representation, and [(e), (f)] a protein topology cartoon created with Pro-origami [29]. Arrows are β-strands and cylinders are helices. Structures from [(a), (b), (c), (d)] were created with VMD [30]. The secondary structures were assigned using DSSP [31]. The key residues for the hydrophobic core (A16, L49, and I57) and the mini-core (L32, V38, and F50) are represented with green and orange beads, respectively, for CI2 in (a). Residues of the diverging turn (M20, K21, K22, G23, and D24) and distal loop (N42 and D43) are represented with green and orange for SH3 in (b).

A short description of both proteins is necessary for later results. CI2 has one α-helix packed against three β-strands (from β1 to β3), and a $3_{10}$-helix as shown in Fig. 1(e). The two key cores are the hydrophobic core and the mini-core [the key residues are shown in Fig. 1(a)] [32,33]. SH3 has five β-strands (from β1 to β5) and a $3_{10}$-helix as shown in Fig. 1(f). The β-strands are arranged antiparallel with respect to each other. The diverging turn and distal loop are key to the formation of the transition state configurations [see Fig. 1(b)] [34,35].

B. Brownian dynamics with or without HI
Our protein folding simulations utilized a Brownian dynamics with HI (BDHI) method developed by Ermak and McCammon [25]. We used the software HIBD developed by the Skolnick group [36] using only the far-field hydrodynamic interactions. The equation of motion is given by

$$\mathbf{x}_i(t+dt) = \mathbf{x}_i(t) + \sum_j \frac{\mathbf{D}_{ij}\mathbf{F}_j}{k_B T}\,dt + \mathbf{G}_i(dt), \tag{2}$$
where $\mathbf{x}_i(t+dt)$ is the position vector of the $i$th Cα bead at time $t+dt$. $\mathbf{F}_j$ is the total force acting on the $j$th Cα bead. The diffusion tensor $\mathbf{D}$ is a supermatrix of size $3N \times 3N$, where $N$ is the number of beads. $\mathbf{D}_{ij}$ is the $3 \times 3$ submatrix in the $i$th row and $j$th column of the diffusion tensor. In the absence of HI, the off-diagonal submatrices are zero. The diagonal terms were calculated from the Stokes-Einstein relation shown in Eq. (3), where $\eta$ is the viscosity of the aqueous solvent at temperature $T$ [36], $k_B$ is the Boltzmann constant, $a$ represents the hydrodynamic radius of the beads (5.3 Å for each Cα residue, taken from [10]), and $\mathbf{I}_3$ is a $3 \times 3$ identity matrix. In the presence of HI, the elements of the submatrices $\mathbf{D}_{ij}$ were obtained from the equations developed by Rotne and Prager [37] from the solutions of the Navier-Stokes equation at low Reynolds number; Yamakawa [38] extended the expression to pairs whose separation is less than the size of a bead. The complete set of formulas to compute the $\mathbf{D}_{ij}$ terms is shown below in Eqs. (3)-(5). For an equation of motion with HI,
$$\mathbf{D}_{ii} = \frac{k_B T}{6\pi\eta a}\,\mathbf{I}_3, \tag{3}$$
$$\mathbf{D}_{ij} = \frac{k_B T}{8\pi\eta r_{ij}} \left[ \left(1 + \frac{2a^2}{3r_{ij}^2}\right)\mathbf{I}_3 + \left(1 - \frac{2a^2}{r_{ij}^2}\right) \frac{\mathbf{r}_{ij} \otimes \mathbf{r}_{ij}}{r_{ij}^2} \right], \quad r_{ij} \geq 2a, \tag{4}$$
$$\mathbf{D}_{ij} = \frac{k_B T}{6\pi\eta a} \left[ \left(1 - \frac{9}{32}\frac{r_{ij}}{a}\right)\mathbf{I}_3 + \frac{3}{32}\frac{\mathbf{r}_{ij} \otimes \mathbf{r}_{ij}}{a\,r_{ij}} \right], \quad r_{ij} < 2a, \tag{5}$$
where $\otimes$ represents a tensor product between two vectors. For the simulations using Brownian dynamics in the absence of HI (BD), the diffusion matrix reduces to Eqs. (6) and (7):
$$\mathbf{D}_{ii} = \frac{k_B T}{6\pi\eta a}\,\mathbf{I}_3, \tag{6}$$
$$\mathbf{D}_{ij} = \mathbf{0}_3, \quad i \neq j. \tag{7}$$
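To make the structure of the diffusion tensor concrete, here is a small pure-Python sketch of Eqs. (3)-(7). It is an illustration under stated assumptions rather than the HIBD implementation: the function names `rpy_block` and `diffusion_tensor` are hypothetical, units are left to the caller (any consistent set works), and distinct beads are assumed never to sit at exactly the same position.

```python
import math


def rpy_block(ri, rj, a, kT, eta):
    """3x3 block D_ij of Eqs. (4)-(5) for two distinct beads of radius a."""
    d = [ri[k] - rj[k] for k in range(3)]
    r = math.sqrt(sum(c * c for c in d))  # assumed > 0 for distinct beads
    if r >= 2.0 * a:
        # Rotne-Prager form for non-overlapping beads, Eq. (4)
        pref = kT / (8.0 * math.pi * eta * r)
        c_iso = 1.0 + 2.0 * a * a / (3.0 * r * r)
        c_dyad = (1.0 - 2.0 * a * a / (r * r)) / (r * r)
    else:
        # Yamakawa correction for overlapping beads, Eq. (5)
        pref = kT / (6.0 * math.pi * eta * a)
        c_iso = 1.0 - 9.0 * r / (32.0 * a)
        c_dyad = 3.0 / (32.0 * a * r)
    return [[pref * (c_iso * (1.0 if p == q else 0.0) + c_dyad * d[p] * d[q])
             for q in range(3)] for p in range(3)]


def diffusion_tensor(coords, a, kT, eta, with_hi=True):
    """Assemble the 3N x 3N tensor; with_hi=False gives Eqs. (6)-(7)."""
    n = len(coords)
    d0 = kT / (6.0 * math.pi * eta * a)  # Stokes-Einstein diagonal, Eq. (3)
    D = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        for p in range(3):
            D[3 * i + p][3 * i + p] = d0
        if not with_hi:
            continue
        for j in range(i + 1, n):
            block = rpy_block(coords[i], coords[j], a, kT, eta)
            for p in range(3):
                for q in range(3):
                    # each block is symmetric, so D_ji equals D_ij
                    D[3 * i + p][3 * j + q] = block[p][q]
                    D[3 * j + p][3 * i + q] = block[p][q]
    return D
```

One property worth noting is that the two branches agree at $r_{ij} = 2a$, so the off-diagonal blocks vary continuously as beads move in and out of overlap.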

$\mathbf{G}_i(dt)$ in Eq. (2) is the random displacement that mimics the stochastic behavior on a Cα bead from the implicit solvent. The relation between the random displacement and the diffusion tensor is given by Eq. (8), which ensures that the fluctuation-dissipation theorem is satisfied; $\langle \cdot \rangle$ represents the ensemble average.
$$\langle \mathbf{G}_i(dt)\,\mathbf{G}_j(dt) \rangle = 6\,\mathbf{D}_{ij}\,dt \quad \text{and} \quad \langle \mathbf{G}_i(dt) \rangle = 0. \tag{8}$$

C. Equilibrium thermodynamic simulations
To evaluate the thermodynamic properties of CI2 and SH3, we utilized molecular dynamics simulation with BD and BDHI. To sample the conformational space of the proteins efficiently, we used the replica exchange method (REM) [39] for enhanced sampling. The initial structures were chosen from an ensemble of unfolded structures that were annealed progressively until they reached the target temperature. For each protein, 20 temperatures were chosen for a set of REM simulations. The integration time step $dt$ is $10^{-3}\tau$, where $\tau = \sqrt{m\sigma_\alpha^2/\varepsilon}$, $\sigma_\alpha$ is the average distance between two consecutive Cα beads (3.8 Å), and $m$ is 100 u, representing the mass of a bead. The sampling interval was longer than the correlation time. The acceptance or rejection of each exchange between replicas follows the Metropolis criterion [40], $\min\left(1, \exp\{[\beta_i - \beta_j][\mathcal{H}(\Gamma_i) - \mathcal{H}(\Gamma_j)]\}\right)$, where $i$ and $j$ are two consecutive replicas, $\beta = 1/k_B T$, $k_B$ is the Boltzmann constant, $T$ is the temperature, and $\mathcal{H}$ is the potential energy of the system. The number of samples was determined according to the convergence of the potential energy for all temperatures. Ensembles of $2.4 \times 10^5$ and $1.6 \times 10^5$ statistically significant conformations were obtained for each replica of CI2 and SH3, respectively. We computed thermodynamic properties and errors from the simulations with the weighted histogram analysis method (WHAM) [41].

D. Non-equilibrium kinetic simulations
1. Generation of the non-Arrhenius plot without HI

We simulated folding kinetics using Brownian dynamics in the absence of HI (BD) for CI2 and SH3 over wide ranges of temperatures. We represent the temperature used for each protein in units of its corresponding folding temperature $T_f$, defined as the temperature at which the free energy (FE), with respect to the fraction of native contact formation Q, of the unfolded state is equal to the FE of the folded state; i.e., $\Delta FE = FE(Q_u) - FE(Q_f) = 0$, where the basins of the unfolded state $Q_u$ and the folded state $Q_f$ are equal in free energy. Thus, temperatures are represented in units of $T_f^{CI2}$ and $T_f^{SH3}$ for CI2 and SH3, respectively. We explored the range of temperatures from 0.1 $T_f$ to 1.3 $T_f$ for CI2 and from 0.1 $T_f$ to 1.5 $T_f$ for SH3. A folding kinetic simulation started from an unfolded configuration, chosen randomly from an ensemble at high temperature, and was performed until it reached the folded state for the first time (first passage time). We considered a protein folded when the non-bonded potential, the sum of the last two terms of Eq. (1), is less than or equal to 0.9 of the native-state non-bonded potential for CI2 and SH3. The number of trajectories depends on the convergence of the mean first passage time (MFPT) at each temperature. The average folding time $t_{fold}$ is the MFPT. The maximum simulation time $t_{max}$ is $9 \times 10^6\,\tau$ and was chosen as the folding time for trajectories that did not reach the folded state.

2. The impact of HI on protein folding kinetics
We selected another range of temperatures for simulations with HI for each protein, from 0.95 $T_f$ to 1.06 $T_f$ for CI2 and from 0.91 $T_f$ to 1.03 $T_f$ for SH3. The number of trajectories from the unfolded to the folded state depends on the convergence of the MFPT at each condition. The maximum simulation time $t_{max}$ is $9 \times 10^6\,\tau$ and was chosen as the folding time for trajectories that did not reach the folded state. For the analysis where the kinetic trajectories were projected on a two-dimensional energy landscape (see Sec. III E), the number of trajectories was increased to 1500 and 500 to reduce statistical error for CI2 and SH3 trajectories, respectively.

E. Effective diffusion coefficient of a reaction coordinate

We expect changes in the diffusion coefficient of a reaction coordinate to reflect the changes in the predictions of folding rates from the energy landscape theory (ELT) because HI is a kinetic effect that will not alter the overall free energy profiles [i.e., the Hamiltonian is the same with or without HI, see Eq. (1)]. Thus, instead of comparing the analytically predicted folding rates, we computed an effective diffusion coefficient along a reaction coordinate, the fraction of native contact formation Q. We used the following expression to fit $D^{eff}$ from the mean square displacement (MSD) of Q over a lag time $t'$ [42]:
$$D^{eff} = \lim_{t' \to t_D} \frac{1}{2 d\, t'} \left\langle \left[ Q(t_0 + t') - Q(t_0) \right]^2 \right\rangle_{t_0,\Omega}, \tag{9}$$
where $\langle \cdot \rangle_{t_0,\Omega}$ represents the average over all simulated times $t_0$ separated by a lag time $t'$ and over all trajectories $\Omega$. $d$ is the dimension, which is one. $t_D$ is the timescale where the MSD shows a linear behavior with time. We followed Whitford's method to compute $D^{eff}$ [42] by fitting the diffusion coefficient in the regime where the MSD varies linearly with lag time. The ensemble average of the MSD is performed over a selected number of kinetic trajectories. The MSD is measured from the unfolded basin ($Q_u = 0.2$ and $Q_u = 0.1$ for CI2 and SH3, respectively) to slightly above the top of the barrier ($Q = 0.6$ and $Q = 0.5$ for CI2 and SH3, respectively).

F. Data analysis
1. Differences in the probability of secondary structure formation
The probability of secondary structure formation as a function of time, $P(t)$, taken over all kinetic trajectories, is estimated in the presence or absence of HI for both proteins. The difference in the probability of secondary structure formation between BDHI and BD, $\Delta P(t)$, is defined as:
$$\Delta P(t) = P^{BDHI}(t) - P^{BD}(t). \tag{10}$$

When $\Delta P(t)$ is positive, the average probability of secondary structure formation from BDHI is greater than that from BD, and vice versa when $\Delta P(t)$ is negative.

2. Displacement correlation
We calculated the displacement vector of a residue $k$, defined as $\mathbf{s}_k(t) = \mathbf{x}_k(t) - \mathbf{x}_k(t - dt)$, throughout the kinetic simulation. Then, we investigated the displacement correlation $C_{ij}(t)$ by taking the cosine of the angle formed between the two displacement unit vectors $\hat{\mathbf{s}}(t)$ of a pair of residues $i$ and $j$ ($i \neq j$) at time $t$ and averaging over all trajectories $\Omega$ as:
$$C_{ij}(t) = \left\langle \hat{\mathbf{s}}_i(t) \cdot \hat{\mathbf{s}}_j(t) \right\rangle_\Omega. \tag{11}$$
The translation and rotation of the center of mass of each configuration were removed before calculating the displacement correlation.

3. Chance of occurrence
The chance of occurrence $\mathrm{CoO}(|i-j|)$ is defined as the ratio of the number of residue pairs with a sequence separation $|i-j| > 0$ whose displacement correlation is above a selected threshold to the total number of residue pairs at that sequence separation:
$$\mathrm{CoO}(|i-j|) = \left\langle \frac{\sum_{k>l} \Theta(C_{kl} - \mu)\,\delta_{|k-l|,|i-j|}}{\sum_{k>l} \delta_{|k-l|,|i-j|}} \right\rangle_\Omega, \tag{12}$$
where the sums run over residue pairs $k > l$, $\Theta$ is the Heaviside step function, and $\delta$ is the Kronecker delta function. The chosen threshold $\mu$ is the average positive displacement correlation from the ensemble:
$$\mu = \frac{\sum_{i>j} C_{ij}\,\Theta(C_{ij})}{\sum_{i>j} \Theta(C_{ij})}. \tag{13}$$
Negative displacement correlation is ignored because the signal is not as strong.

III. RESULTS

A. The impact of BDHI on the folding time depends on temperature
We explored the folding kinetics of CI2 and SH3 by comparing the mean first passage time (MFPT) with BD over a broad range of temperatures. Both proteins exhibit non-Arrhenius [43] behavior against temperature as shown in Figs. 2(a) and 2(b). At high temperatures, the MFPT increases because the thermal fluctuations exceed the stability of the protein, and at low temperatures, the MFPT increases because the protein is trapped in a local energy minimum [15,43,44]. The shortest MFPT occurs at 0.95 $T_f$ and 0.91 $T_f$ for CI2 and SH3, respectively.
We computed the MFPT for the proteins with BDHI over a narrow range of temperatures around $T_f$, shown as dashed lines in Figs. 2(c) and 2(d). Our study shows that the impact of HI on the MFPT is small (within an order of magnitude) but statistically significant. What is most interesting is that HI either increases or decreases the folding time depending on whether the temperature is higher or lower than $T_f$. This distinctive crossover behavior occurs in the proximity of the folding temperature of CI2 (1.03 $T_f$) and SH3 (0.98 $T_f$). Thus, the impact of HI on protein folding kinetics is temperature dependent. However, the acceleration of folding is more prominent for CI2 than for SH3 at $T < T_f$. Therefore, HI effects also depend on the topology of a protein. We will further investigate the role of topology in the extent of the impact of HI on protein folding in the following subsections at two temperatures for each protein: below $T_f$ (0.95 $T_f$ for CI2 and 0.91 $T_f$ for SH3) and above $T_f$ (1.06 $T_f$ for CI2 and 1.03 $T_f$ for SH3).
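The MFPT values compared here are averages of first passage times in which, as described in the methods, a trajectory that never folds is assigned the maximum simulation time. The sketch below shows one way such an average and a jackknife error bar (the error estimate named in the caption of Fig. 2) could be computed. The function names and the use of `None` to mark unfolded trajectories are assumptions made for this example, not details from the paper.

```python
import math


def first_passage_times(fpts, t_max):
    """Replace missing (None) or overlong first passage times by t_max."""
    return [t_max if t is None else min(t, t_max) for t in fpts]


def mean_first_passage_time(fpts, t_max):
    """MFPT with unfolded trajectories counted as t_max."""
    times = first_passage_times(fpts, t_max)
    return sum(times) / len(times)


def jackknife_error(times):
    """Leave-one-out jackknife standard error of the mean."""
    n = len(times)
    total = sum(times)
    # mean of the sample with observation i removed
    loo_means = [(total - t) / (n - 1) for t in times]
    center = sum(loo_means) / n
    variance = (n - 1) / n * sum((m - center) ** 2 for m in loo_means)
    return math.sqrt(variance)
```

For a plain mean, the jackknife estimate coincides with the usual standard error of the mean. Note also that assigning `t_max` to unfolded trajectories makes the reported MFPT a lower bound whenever some trajectories fail to fold.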

FIG. 2. The mean first passage time (MFPT) with respect to temperature for CI2 and SH3 in the presence or absence of hydrodynamic interactions (HI). Panels (a) and (b) show the MFPT over a broad range of temperatures using Brownian dynamics without HI (BD) for CI2 and SH3, respectively. The temperature for each protein is expressed in units of its corresponding folding temperature $T_f$. Note the U-shaped dependence of the folding time (non-Arrhenius behavior). The MFPT using BD with HI (BDHI) is compared to the MFPT using only BD in panel (c) for CI2 and (d) for SH3. The crossover occurs where the two curves intersect. Error bars are calculated using the jackknife method.

B. The effective diffusion coefficient of Q partly accounts for the crossover behavior of the folding kinetics
Can we capture this folding behavior using a global order parameter? A theoretical estimation of the folding kinetic rate $k$ (the rate is the inverse of the folding time $t_{fold}$) depends on the shape of the free energy surface and the effective diffusion coefficient $D^{eff}$ of an order parameter on the free energy surface [43-45] as such,

$$k = \frac{1}{t_{fold}} = \left(\frac{\beta}{2\pi}\right)^{1/2} D^{eff}\,\omega_u\,\omega_b \exp(-\beta \Delta F), \tag{14}$$
where $\omega_u$ and $\omega_b$ are the curvatures of the unfolded state free energy well and the barrier, respectively, $\beta$ is the inverse temperature, and $\Delta F$ is the free energy barrier height with respect to the unfolded state free energy. However, since the Hamiltonians for BD and BDHI are identical, rendering the same free energy profiles (see Fig. 3), the change in the folding kinetic rates should be explained by the change in the diffusion of the order parameter. Here, the order parameter is the fraction of native contact formation Q.
The mean square displacement (MSD) of Q is obtained as a function of time, as shown in Fig. 4, to estimate the effective diffusion coefficients. The trajectories longer than $10^5\,\tau$ at 0.95 $T_f$ and $8 \times 10^5\,\tau$ at 1.06 $T_f$ are considered for CI2, and those longer than $1.6 \times 10^5\,\tau$ at 0.91 $T_f$ and $4 \times 10^5\,\tau$ at 1.03 $T_f$ for SH3. The MSD is calculated for a time shorter than the average folding time for each condition. The initial phase is a transitional subdiffusive process characterized by MSD $\sim t^\alpha$ with $\alpha < 1$. After the memory from the initial state dissipates [46,47], the MSD reaches a normal diffusive regime. $D^{eff}$ of Q is estimated from the linear region of the MSD of Q as a function of lag time, as shown in the insets of Fig. 4. The smaller diffusion coefficient at $T > T_f$ than at $T < T_f$ follows simply from a thermal argument.

FIG. 3. Free energy (FE) with respect to the fraction of native contact formation Q without or with HI (BD and BDHI, respectively) for (a) CI2 and (b) SH3 at a temperature below the folding temperature ($T_f$), at $T_f$, and above $T_f$. Error bars are included.
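The estimate of the effective diffusion coefficient from the MSD of Q can be sketched in a few lines of Python. This is a hedged illustration, not the analysis code used in the paper: the function names are hypothetical, trajectories are assumed to be plain lists of Q values sampled at a fixed interval, and the slope is taken from an ordinary least-squares line with a free intercept over a window the caller has already judged to be linear.

```python
def msd_of_q(trajectories, max_lag):
    """MSD of Q for lags 1..max_lag (in frames).

    Averages over every time origin t0 and every trajectory, in the spirit
    of the double average over t0 and trajectories in Eq. (9).
    """
    sums = [0.0] * (max_lag + 1)
    counts = [0] * (max_lag + 1)
    for q in trajectories:
        for lag in range(1, max_lag + 1):
            for t0 in range(len(q) - lag):
                diff = q[t0 + lag] - q[t0]
                sums[lag] += diff * diff
                counts[lag] += 1
    return [sums[lag] / counts[lag] if counts[lag] else float("nan")
            for lag in range(1, max_lag + 1)]


def effective_diffusion(lag_times, msd, dim=1):
    """Least-squares slope of MSD versus lag time, divided by 2*dim."""
    n = len(lag_times)
    mean_x = sum(lag_times) / n
    mean_y = sum(msd) / n
    sxx = sum((x - mean_x) ** 2 for x in lag_times)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lag_times, msd))
    return sxy / sxx / (2.0 * dim)
```

Since Q is dimensionless, the result carries units of inverse time. Restricting the fit to the window after the initial subdiffusive phase is what keeps the slope meaningful, so the choice of window deserves as much care as the fit itself.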

FIG. 4. The mean square displacement (MSD) of the fraction of native contact formation $Q$ as a function of lag time in the absence or presence of HI (BD and BDHI, respectively) at $T<T_f$ [(a) $0.95T_f$ and (b) $0.91T_f$ for CI2 and SH3, respectively], and $T>T_f$ [(c) $1.06T_f$ and (d) $1.03T_f$ for CI2 and SH3, respectively]. The shaded width of the solid lines represents the error. The inset zooms in to the range of time used for a linear fit (solid line) of data at every 100 ($T<T_f$) or 200 ($T>T_f$) points in open circles. The unit for the effective diffusion coefficient is $1/\tau$ since $Q$ is dimensionless.

If $Q$ is a good reaction coordinate that captures the dynamics over the entire free energy landscape, the ratio of the folding rates $k_{BDHI}/k_{BD}$ computed from kinetic simulations should equal the same ratio computed from MSD analysis, because all factors in Eq. (14) other than $D^{\mathrm{eff}}$ are shared by BD and BDHI. A comparison of the ratios of the folding rates $k_{BDHI}/k_{BD}$ computed from kinetic simulations with the same ratio computed from MSD analysis is shown in Tables I and II for CI2 and SH3, respectively. The ratio $D^{\mathrm{eff}}_{BDHI}/D^{\mathrm{eff}}_{BD}$, however, is not equal to the ratio $t^{BD}_{\mathrm{fold}}/t^{BDHI}_{\mathrm{fold}}$, although it shows the right trend of the crossover behavior. For CI2, the ratio of folding rates from BDHI and BD kinetic simulations is $k_{BDHI}/k_{BD}=1.37$ at $0.95T_f$ and $k_{BDHI}/k_{BD}=0.83$ at $1.06T_f$, whereas the ratio $k_{BDHI}/k_{BD}$ from fitting the MSD is 1.14 at $0.95T_f$ and 0.15 at $1.06T_f$. A similar trend of crossover behavior is observed for SH3. $D^{\mathrm{eff}}_{BDHI}/D^{\mathrm{eff}}_{BD}$ for either CI2 or SH3 is merely around 1 at $T<T_f$, while the $t^{BD}_{\mathrm{fold}}/t^{BDHI}_{\mathrm{fold}}$ computed from the folding simulations is above 1.18. The analysis of the diffusivity shows retarded dynamics due to HI at $T>T_f$. The results from kinetic simulations also show retarded dynamics, but less so. We speculate that a mean-field description of overall folding with the collective order parameter $Q$ along an energy landscape may not fully grasp the kinetic principle of HI on folding. We will investigate the influence of HI on folding by the formation of the local contacts or secondary structures in the next subsection.

TABLE I. Folding time from kinetic simulations ($t_{\mathrm{fold}}$) and the effective diffusion coefficient ($D^{\mathrm{eff}}$) of $Q$ for CI2 using BD or BDHI.

| $T$ (in units of $T_f$) | $t^{BD}_{\mathrm{fold}}$ ($10^6\tau$) | $t^{BDHI}_{\mathrm{fold}}$ ($10^6\tau$) | From kinetic simulations: $k_{BDHI}/k_{BD}=t^{BD}_{\mathrm{fold}}/t^{BDHI}_{\mathrm{fold}}$ | $D^{\mathrm{eff}}_{BD}$ ($10^{-9}\,1/\tau$) | $D^{\mathrm{eff}}_{BDHI}$ ($10^{-9}\,1/\tau$) | From MSD analysis: $k_{BDHI}/k_{BD}=D^{\mathrm{eff}}_{BDHI}/D^{\mathrm{eff}}_{BD}$ |
| 0.95 | 0.52 ± 0.02 | 0.38 ± 0.01 | 1.37 ± 0.06 | 89.16 ± 0.08 | 101. ± 0.09 | 1.14 ± 0.00 |
| 1.06 | 3.64 ± 0.13 | 4.39 ± 0.14 | 0.83 ± 0.04 | 2.29 ± 0.01 | 0.35 ± 0.00 | 0.15 ± 0.00 |

TABLE II. Folding time from kinetic simulations ($t_{\mathrm{fold}}$) and the effective diffusion coefficient ($D^{\mathrm{eff}}$) of $Q$ for SH3 using BD or BDHI.

| $T$ (in units of $T_f$) | $t^{BD}_{\mathrm{fold}}$ ($10^6\tau$) | $t^{BDHI}_{\mathrm{fold}}$ ($10^6\tau$) | From kinetic simulations: $k_{BDHI}/k_{BD}=t^{BD}_{\mathrm{fold}}/t^{BDHI}_{\mathrm{fold}}$ | $D^{\mathrm{eff}}_{BD}$ ($10^{-9}\,1/\tau$) | $D^{\mathrm{eff}}_{BDHI}$ ($10^{-9}\,1/\tau$) | From MSD analysis: $k_{BDHI}/k_{BD}=D^{\mathrm{eff}}_{BDHI}/D^{\mathrm{eff}}_{BD}$ |
| 0.91 | 0.19 ± 0.01 | 0.16 ± 0.01 | 1.19 ± 0.10 | 17.7 ± 0.30 | 36.71 ± 0.4 | 1.09 ± 0.00 |
| 1.03 | 1.08 ± 0.05 | 1.65 ± 0.07 | 0.65 ± 0.04 | 6.39 ± 0.01 | 2.56 ± 0.01 | 0.40 ± 0.00 |

C. HI facilitates the ordering of key structural regions at $T<T_f$

In the previous subsection, we showed how HI impacts folding globally to explain the crossover behavior of the folding rates; however, HI also impacts local secondary structure formation. We investigated the temperature dependence of HI and the crossover behavior by analyzing the ordering of secondary structures of the proteins along a time that is normalized by the maximum time $t_{\max}$. For the selected temperatures, we calculated the differences in the probability of secondary structure formation $\Delta P(t)$ between BDHI and BD as a function of normalized time (Fig. 5). For each protein, one temperature is slightly below $T_f$ [Figs. 5(a) and 5(b) for CI2 and SH3, respectively] and the other is slightly above $T_f$ [Figs. 5(c) and 5(d) for CI2 and SH3, respectively]. At $T<T_f$ the folding time with BDHI decreases with respect to BD, and vice versa at $T>T_f$. We are interested in analyzing structure formation at a time before the proteins reach the transition state at the top of the folding barrier. The transition state region is shown in the grey shaded area for each protein in Fig. 3, and it is a key part of the folding process. This stage of folding occurs before the dashed vertical lines in each panel of Fig. 5 (see Fig. 6 for a complete temporal evolution of $\langle Q\rangle_\Omega$ along normalized time, where $\langle\cdot\rangle_\Omega$ represents the average over all kinetic trajectories).

The two temperatures used to investigate CI2's crossover behavior are $0.95T_f$ and $1.06T_f$. We grouped the native contact pairs into secondary structure segments to get a structural view of the impact of HI. Figure 5(a) shows that at $T<T_f$, the most positive $\Delta P(t)$ is observed for β1-β2 and β1-seg4, which form the mini-core, and for seg3-β2, seg3-seg4, and seg4-β2, which are in the neighborhood of the mini-core. It shows a modest positive $\Delta P(t)$ for $3_{10}$-α (native pairs close to the N-terminus) and β2-β3 (native pairs close to the C-terminus). This suggests that BDHI enhances the formation of the secondary structures within the mini-core and has less impact at the termini. In Fig. 5(c), at $T>T_f$, the native contacts that are more affected by BDHI than by BD [negative $\Delta P(t)$] are the segments involving seg3-seg4, seg4-β2, the long-range contacts $3_{10}$-seg5, and the contacts in the C-terminus β2-β3. This implies that the impact of HI has a longer range at $T>T_f$ than at $T<T_f$.

Turning to SH3, the two temperatures used to investigate SH3's crossover behavior are $0.91T_f$ and $1.03T_f$. At $T<T_f$ [Fig. 5(b)], the most positive $\Delta P(t)$ is observed in the region around the RT loop (RT), the diverging turn (DT), and the distal loop (DL). The secondary structure pairs involved are DT-β4, DT-β5, DT-$3_{10}$, RT-β3, RT-β4, and N-src-β5. BDHI promotes specific contacts in SH3 that are known for their obligatory role in the formation of the transition state. In addition, the contacts in the neighborhood of the distal loop and N-src loop are also mildly enhanced with BDHI (β2-β3, β2-β4, β3-β5, and DT-β3). On the other hand, at $T>T_f$ [Fig. 5(d)] the formation of the pairs mentioned above is negatively affected by BDHI, as indicated by a negative $\Delta P(t)$. However, the impact of HI on the folding time of SH3 is less than that of CI2. We will discuss the difference in the impact from the viewpoint of protein topology in the following subsections.
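The quantity $\Delta P(t)$ used in this subsection, together with the block smoothing over 500 data points mentioned in the caption of Fig. 5, can be illustrated with a short sketch. The array layout (one boolean row per kinetic trajectory, marking frames in which a given pair of secondary structure segments is formed), the function names, and the toy inputs are assumptions made for illustration; the criterion used in the paper to decide that a segment pair is formed is not reproduced here.

```python
import numpy as np

def block_average(x, block):
    """Average consecutive, non-overlapping blocks of length `block`."""
    x = np.asarray(x, dtype=float)
    n = (len(x) // block) * block          # drop the incomplete tail block
    return x[:n].reshape(-1, block).mean(axis=1)

def delta_p(formed_bdhi, formed_bd, block=500):
    """Difference in formation probability, BDHI minus BD, per time block.

    formed_* are boolean arrays of shape (n_trajectories, n_frames); the
    probability at each frame is the fraction of trajectories in which the
    segment pair is formed (a hypothetical layout for this sketch).
    """
    p_bdhi = np.mean(formed_bdhi, axis=0)
    p_bd = np.mean(formed_bd, axis=0)
    return block_average(p_bdhi - p_bd, block)

# Toy data: the pair is always formed with BDHI, and in half of the BD runs.
toy_bdhi = np.ones((4, 1000), dtype=bool)
toy_bd = np.zeros((4, 1000), dtype=bool)
toy_bd[:2, :] = True

dp = delta_p(toy_bdhi, toy_bd, block=500)
print(dp)  # positive values mean the pair forms more often with BDHI
```

A positive entry corresponds to a segment pair whose formation is enhanced by HI at that normalized time, and a negative entry to one that is suppressed.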

FIG. 5. Impact of HI on key pairs of secondary structure interactions $\Delta P(t)$ at $T<T_f$ [(a) $0.95T_f$ and (b) $0.91T_f$ for CI2 and SH3, respectively], and $T>T_f$ [(c) $1.06T_f$ and (d) $1.03T_f$ for CI2 and SH3, respectively]. Time is normalized with respect to the maximum simulation time ($9\times10^6\tau$) and shown in log scale. The dashed vertical lines correspond to the time when $\langle Q\rangle_\Omega=0.4$ for CI2 and SH3, which is the $Q$ value in the transition state. The curves were smoothed by averaging every 500 data points. Errors are included but are too small to be visible. At the top of the panels, the location of secondary structure elements and unstructured regions along the sequence of CI2 (left) and SH3 (right) is indicated as visual guidance.

FIG. 6. Evolution of the average of the fraction of native contact formation $\langle Q\rangle_\Omega$ over all trajectories as a function of normalized time $t/t_{\max}$ at $T<T_f$ and $T>T_f$ for (a) CI2 and (b) SH3. Time is normalized with respect to the maximum simulation time ($9\times10^6\tau$). The shaded width of the lines represents the error, which is calculated using the jackknife method.

D. Hydrodynamic coupling of mid-range and long-range contacts and their opposing impact on the general ordering of contact formation at $T<T_f$ and $T>T_f$

Armed with the knowledge of the kinetics of specific structural formation from the previous subsection, we hypothesize that HI influences the self-organization of a protein configuration during the course of folding from an unfolded state to the formation of a transition state. Consequently, HI creates an opposing effect on folding time at a watershed of $T_f$, which is the crossover behavior in the folding kinetics. The addition of HI to the equation of motion introduces many-body coupling between all beads, where their motions are correlated inversely with their spatial separation $r$. To test this hypothesis, we compare the configurations in terms of the probability of contact formation of each native pair $Q_{ij}(t)$ to the displacement correlation between a pair of contacts $C_{ij}(t)$ at a particular moment in the folding kinetics. We chose this moment to be a time where the average probability of contact formation $\langle Q\rangle_\Omega$ is 0.4 (see Fig. 6 for a temporal evolution of $\langle Q\rangle_\Omega$), as shown in Figs. 7 and 8 for CI2 and SH3, respectively. The values of the displacement correlation $C_{ij}(t)$ at that time are shown in the lower triangles in Figs. 7 and 8 for CI2 and SH3, respectively. The comparison of $C_{ij}(t)$ with $Q_{ij}(t)$ (upper triangles in Figs. 7 and 8) establishes a causal relationship between the dispersity of $Q_{ij}(t)$ and the spatial pattern of $C_{ij}(t)$. At $T<T_f$, the probability of contact formation among mid- to long-range pairs ($|i-j|\ge10$) is disperse [upper triangle in Figs. 7(a) and 8(a)]. Their Fano factors (variance over mean) are 0.04 for CI2 [Fig. 7(b)] and 0.06 for SH3 [Fig. 8(b)]. At $T>T_f$, in contrast, the probability of contact formation for pairs with $|i-j|\ge10$ has a narrower distribution [upper triangle in Figs. 7(c) and 8(c)], evident from a lower Fano factor: 0.01 for CI2 [Fig. 7(d)] and 0.02 for SH3 [Fig. 8(d)]. A rather high Fano factor implies the existence of certain localized contact formation when a protein folds from an unfolded state at $T<T_f$. A narrow dispersion [Figs. 7(d) and 8(d)] shows that contact formation is quite random as the protein folds and unfolds at $T>T_f$.

1. CI2

For CI2, the localized contacts are around the mini-core (highlighted in purple boxes). As shown in the lower triangle of Figs. 7(a) and 7(c) (at $T<T_f$ and $T>T_f$, respectively), HI alters the pattern of motions for long-range contacts (black boxes) and thus impacts the dynamics of mid-range contacts (green boxes). At $T>T_f$, the paired residues move cooperatively in the same direction, thus adversely affecting the formation of the mid-range contacts around the mini-core. Noticeably, in addition to the native pairs, the surrounding non-native pairs are also correlated, which is not observed in the simulations with BD (Fig. 9). Several research groups have shown the importance of non-native pairs in dictating protein kinetics [48-50].
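The Fano factor quoted above (variance over mean of the contact formation probabilities of mid- to long-range native pairs) can be computed as in the sketch below. The function name, the use of the population variance, the cutoff $|i-j|\ge10$ as the selection rule, and the toy inputs are assumptions for illustration only.

```python
import numpy as np

def fano_factor(q_ij, pairs, min_sep=10):
    """Variance over mean of contact probabilities for pairs with |i - j| >= min_sep.

    q_ij  : contact formation probabilities, one per native pair
    pairs : (i, j) residue indices of the same native pairs
    The population variance (ddof=0) is used, which is an assumption here.
    """
    selected = np.array(
        [q for q, (i, j) in zip(q_ij, pairs) if abs(i - j) >= min_sep],
        dtype=float,
    )
    return selected.var() / selected.mean()

# Toy contact map: one short-range pair (ignored) and three longer-range pairs.
toy_pairs = [(1, 5), (1, 20), (3, 30), (10, 40)]
toy_q = [0.9, 0.2, 0.4, 0.6]

fano = fano_factor(toy_q, toy_pairs)
print(f"Fano factor = {fano:.4f}")
```

A broad spread of $Q_{ij}$ values, with some pairs mostly formed and others mostly broken, gives a larger Fano factor than a set of pairs that all form with similar probability.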
To better visualize the whereabouts of the mid- to long-range contact pairs, involving both native and non-native pairs, we projected the pairs with sequence separation $|i-j|>8$ on the native structure of CI2 in Fig. 7(e) with colored edges. The contact pairs with a similar range of positive correlation at both temperatures ($0.95T_f$ and $1.06T_f$) are shown with green edges. Most of them are located in the two regions formed by β1, β2, and β3, and by the C-terminus with the connecting loop of β2 and β3. The pairs in which the magnitude of the correlation is greater at $0.95T_f$ than at $1.06T_f$ are shown with blue edges; they are located mostly between the α-helix and the N-terminus, and between the connecting loop of β2 and β3 and the C-terminus. The pairs in which the magnitude of the correlation is greater at $1.06T_f$ than at $0.95T_f$ are shown with red edges. They are present between the N-terminus and the C-terminus, between the α-helix and the following loop, and in the region of the mini-core. As hinted in the previous paragraph, we speculate that the sequence separations between contacts, involving both native and non-native contacts, play a significant role in the crossover behavior in the presence of HI. To extend our analysis and justify our speculation, we plotted the chance of occurrence $\mathrm{CoO}(|i-j|)$ (see definition in Sec. II F) in Fig. 7(f) along the sequence separation $|i-j|$. There is a stronger signal at mid-range contacts ($10<|i-j|<30$) at $T<T_f$ than at $T>T_f$. Most noticeably, there is a strong signal at $|i-j|\approx60$, which shows that long-range contacts are indeed correlated at $T>T_f$.

FIG. 7. (a) The probability of contact formation of each native pair $Q_{ij}(t)$ (upper triangle) and the displacement correlation $C_{ij}(t)$ (lower triangle) at $0.95T_f$ for CI2. (b) The distribution of native pairs from panel (a) with $|i-j|\ge10$. (c) $Q_{ij}(t)$ and $C_{ij}(t)$ at $1.06T_f$. (d) The distribution of native pairs from panel (c) with $|i-j|\ge10$. Panels (a) and (c) correspond to a normalized time where $\langle Q\rangle_\Omega=0.4$. The arrows and rectangles along (a) and (c) represent β-strands and helices, respectively. The orange and purple dashed boxes highlight the long-range contacts between the N-terminus and C-terminus, and the contacts in the neighborhood of the mini-core, respectively. The black and green dashed boxes highlight the displacement correlation between the N-terminus and C-terminus, and between the N-terminus and the α-helix, respectively. (e) The displacement correlations at $0.95T_f$ and $1.06T_f$ for all pairs are classified into three sets based on the magnitude of positive correlations. If the magnitude of the correlation is similar at both temperatures, the pair is colored with a green edge on the left structure. If the magnitude of the correlation is greater at $0.95T_f$ than at $1.06T_f$, the pair is colored with a blue edge on the right structure. If the magnitude of the correlation is greater at $1.06T_f$ than at $0.95T_f$, the pair is colored with a red edge on the right structure. Only the pairs with sequence separation greater than 8 residues and magnitude of displacement correlation above the threshold $\mu=0.061$ are considered for this representation. The key residues for the hydrophobic core (A16, L49, and I57) and the mini-core (L32, V38, and F50) are illustrated with green and orange beads, respectively. (f) $\mathrm{CoO}(|i-j|)$ for all pairs whose magnitude of the displacement correlation is above $\mu$, organized according to the sequence separation $|i-j|$.

2. SH3

We found that HI affects SH3 (Fig. 8) in a similar way to CI2; however, the effect is not as strong. This is evident from the data collected at the time that corresponds to the transition state ($\langle Q\rangle_\Omega=0.4$). The contact formation is localized at mid-range contacts between the diverging turn (DT) and the distal loop (DL) (in purple boxes), which are known experimentally to be critical to the formation of the transition state ensemble [34,35]. Similar to Fig. 7(e), we projected the pairs with displacement correlations greater than the average positive correlation on the native structure in Fig. 8(e). Pairs with sequence separation $|i-j|>7$ are grouped by colored edges. The green ones correspond to a similar magnitude of pair correlation at the temperatures above and below $T_f$. The pairs in which the magnitude of the correlation is greater at $0.91T_f$ than at $1.03T_f$ are shown with blue edges; these are the pairs in the region of the RT and DT loops, the $3_{10}$-helix, and β3. Furthermore, the pairs in which the magnitude of the correlation is greater at $1.03T_f$ than at $0.91T_f$ are shown with red edges; these are the pairs between β2 and the N-terminus (from seg1 to DT), and the long-range pairs between seg1 and seg2. Again, we plotted $\mathrm{CoO}(|i-j|)$ in Fig. 8(f) along the sequence separation for SH3. Similar to CI2, there is a stronger signal at mid-range contacts ($10<|i-j|<20$) at $T<T_f$ than at $T>T_f$. At $|i-j|\approx56$, there are long-range contacts that are correlated at $1.03T_f$.

The previous analysis is compared to the same plots without HI (BD) in Fig. 9. Although the contact maps are similar to their corresponding $\langle Q\rangle_\Omega$ for both proteins, there is no clear pattern in the displacement correlation map for BD. The displacement correlation fluctuates randomly around zero. The $\mathrm{CoO}(|i-j|)$ shown in Figs. 7(f) and 8(f) suggests that the crossover behavior in the presence of HI is due to the displacement correlation between the mid-range contacts and the long-range contacts. The mid-range contacts for CI2 are the ones that form the mini-core, and for SH3 they are the ones between the diverging turn and the distal loop. Although we employed a structure-based model in which the native pairs are energetically attractive, we identified the importance of the dynamic correlation between residues that form a native pair and their neighboring non-native pairs, particularly between the α-helix and the N-terminus for CI2, and between both DT and β2 and the N-terminus for SH3, for the retardation of the folding time at $T>T_f$.

FIG. 8. (a) The probability of contact formation of each native pair $Q_{ij}(t)$ (upper triangle) and the displacement correlation $C_{ij}(t)$ (lower triangle) at $0.91T_f$ for SH3. (b) The distribution of native pairs from panel (a) with $|i-j|\ge10$. (c) $Q_{ij}(t)$ and $C_{ij}(t)$ at $1.03T_f$. (d) The distribution of native pairs from panel (c) with $|i-j|\ge10$. Panels (a) and (c) correspond to a normalized time where $\langle Q\rangle_\Omega=0.4$. The arrows and rectangles along (a) and (c) represent β-strands and helices, respectively. The orange and purple dashed boxes highlight the long-range contacts between the N-terminus and C-terminus, and the contacts between the diverging turn (DT) and the distal loop (DL), respectively. The black and green dashed boxes highlight the displacement correlation between the N-terminus and C-terminus, and between the N-terminus and both DT and β2, respectively. (e) The displacement correlations at $0.91T_f$ and $1.03T_f$ for all pairs are classified into three sets based on the magnitude of positive correlations. The coloring rules are the same as in Fig. 7(e), applied to pairs with sequence separation greater than 7 residues and magnitude of displacement correlation above the threshold $\mu=0.095$. (f) $\mathrm{CoO}(|i-j|)$ for all pairs whose magnitude of the displacement correlation is above $\mu$, organized according to the sequence separation $|i-j|$.

FIG. 9. The Brownian motion of residues without HI (BD) shows small and random displacement correlation for native and non-native pairs of CI2 and SH3. Upper and lower triangles represent the probability of contact formation for each native pair $Q_{ij}(t)$ and the displacement correlation $C_{ij}(t)$, respectively. The panels are plotted at $T<T_f$ [(a) $0.95T_f$ and (b) $0.91T_f$ for CI2 and SH3, respectively], and $T>T_f$ [(c) $1.06T_f$ and (d) $1.03T_f$ for CI2 and SH3, respectively] at a normalized time where $\langle Q\rangle_\Omega=0.4$. The arrows and rectangles represent β-strands and helices, respectively.

E. HI can kinetically alter folding routes from multiple pathways

To further investigate the molecular underpinning of the crossover behavior, which cannot be simply explained by the ratio of the effective diffusion coefficients from Sec. III B, we explored the possible changes in the pathways due to HI by projecting the kinetic trajectories on a two-dimensional free energy landscape. An additional reaction coordinate $Q_T$, involving a selected group of mid-range contacts from Figs. 7 and 8 (for CI2 and SH3, respectively), is employed to describe the folding process because we speculate the presence of hidden pathways that are not visible with a global parameter $Q$ [8].

1. CI2

For CI2, $Q_T$ is defined as a set of the native contacts that are located in the neighborhood of the mini-core (contacts enclosed in the purple dashed rectangle of Fig. 7). Figure 10 reveals two distinct paths: one involves a high $Q_T$ (0.8) and the other involves a low $Q_T$ (0.2), both at about $Q\approx0.5$. We projected two representative kinetic trajectories onto the two-dimensional free energy surface as a function of $Q$ and $Q_T$. In Fig. 10(a), the kinetic trajectory, named route I, began from an unstructured chain. As time increases, the α-helix and most of the contacts in $Q_T$ are formed before reaching $Q\approx0.5$, which is the top of the barrier of the one-dimensional free energy profile as a function of $Q$. This involves the formation of the mini-core and contacts in the C-terminus before the formation of the hydrophobic core. After crossing the top of the barrier, the hydrophobic core starts to form. Figure 10(b) illustrates another kinetic trajectory, called route II, started from another unfolded structure. As time increases, the contacts of $Q_T$ have not completely formed at $Q\approx0.5$, while the hydrophobic core is formed before the mini-core. Then the contacts of the mini-core start to form to reach the folded state. Table III shows the number of trajectories that visit routes I and II, and their corresponding average folding times. Route II is slower than route I for both BD and BDHI. BDHI accelerates the folding of both routes, and it reduces the fraction of trajectories that visit route II from 10.53% to 7.26%. BDHI not only reduces effective diffusivity, it also alters the folding route to favor a faster one over a slower one.

FIG. 10. Two representative kinetic pathways are projected on a two-dimensional free energy landscape of the fraction of native contact formation $Q$ and the fraction of native contact formation in the region of the mini-core $Q_T$ for CI2 at $0.95T_f$. Panel (a) shows a major pathway of folding kinetics (route I), representing a fast route, in the presence of HI. Panel (b) shows a minor pathway (route II). The folding free energy is colored in grayscale in units of $k_BT$. The kinetic trajectories are colored by normalized time (time divided by $t_{\mathrm{fold}}$ of each trajectory) and projected on the folding free energy. Key conformations were selected for visual guidance. The significant residues for the hydrophobic core (Ala16, Leu49, and Ile57) and mini-core (Leu32, Val38, and Phe50) are illustrated with green and orange beads, respectively. The mini-core forms before the hydrophobic core in panel (a), whereas the opposite occurs in panel (b). Structures were created with VMD [30].

TABLE III. Number of trajectories and their folding time ($t_{\mathrm{fold}}$) from a set of 1500 kinetic simulations that visit routes I and II for CI2 at $0.95T_f$.

| Route | BD: number of trajectories | BD: $t_{\mathrm{fold}}$ ($10^6\tau$) | BDHI: number of trajectories | BDHI: $t_{\mathrm{fold}}$ ($10^6\tau$) | $k_{BDHI}/k_{BD}=t^{BD}_{\mathrm{fold}}/t^{BDHI}_{\mathrm{fold}}$ |
| I | 1342 (89.47%) | 0.51 ± 0.01 | 1391 (92.73%) | 0.38 ± 0.01 | 1.34 ± 0.04 |
| II | 158 (10.53%) | 0.76 ± 0.05 | 109 (7.26%) | 0.51 ± 0.04 | 1.49 ± 0.15 |
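The assignment of kinetic trajectories to route I or route II summarized in Table III can be illustrated with a simple rule. This is a hypothetical sketch, not the classification procedure used in the paper: it labels a trajectory by the value of $Q_T$ at the first frame where $Q$ reaches the barrier top, and the cutoff of 0.5 on $Q_T$ (midway between the high and low $Q_T$ values of the two routes) and all function names are assumptions.

```python
import numpy as np

def classify_route(q, q_t, q_barrier=0.5, q_t_cut=0.5):
    """Label one trajectory as route 'I' or 'II'.

    The label is 'I' if Q_T is at or above q_t_cut when Q first reaches
    q_barrier (mini-core formed early) and 'II' otherwise. Returns None if
    the trajectory never reaches q_barrier. Both cutoffs are assumptions.
    """
    q = np.asarray(q, dtype=float)
    q_t = np.asarray(q_t, dtype=float)
    reached = np.nonzero(q >= q_barrier)[0]
    if reached.size == 0:
        return None
    first = reached[0]
    return "I" if q_t[first] >= q_t_cut else "II"

def route_fractions(labels):
    """Fraction of classified trajectories on each route."""
    kept = [label for label in labels if label is not None]
    return {route: kept.count(route) / len(kept) for route in ("I", "II")}

# Two toy trajectories: mini-core contacts form early (a) or late (b).
label_a = classify_route([0.1, 0.3, 0.5, 0.9], [0.1, 0.6, 0.8, 1.0])
label_b = classify_route([0.1, 0.4, 0.55, 1.0], [0.0, 0.1, 0.2, 0.9])
fractions = route_fractions([label_a, label_b])
print(label_a, label_b, fractions)
```

Real trajectories recross the barrier region many times, so a first-passage rule like this one would likely need smoothing or a committed-crossing criterion before its counts could be compared with the table.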

2. SH3

For SH3, $Q_T$ is defined as a set of the native contacts that are located in the neighborhood of the diverging turn (DT) and the distal loop (DL) (contacts enclosed in the purple dashed rectangle in Fig. 8). We created a two-dimensional free energy landscape as a function of $Q_T$ and $Q$ in Fig. 11. There is one dominant folding path. In fact, we checked whether a rare second path occurs by raising the number of folding trajectories to 500. We projected a representative kinetic folding trajectory on the landscape. Figure 11 shows the pathway where the contacts between DL (orange beads) and DT (green beads) are formed before reaching $Q\approx0.4$, which is the top of the barrier of the one-dimensional free energy as a function of $Q$. Then $Q_T$ increases along $Q$ and the rest of the protein forms to achieve the folded state. The formation of the contacts of DL and DT characterizes the selectivity of the transition state for SH3. This topological constraint may reduce the effect of HI on the folding kinetic rates at $T<T_f$.

FIG. 11. One representative kinetic pathway is projected on a two-dimensional free energy landscape of the fraction of native contact formation $Q$ and the fraction of native contact formation in the region formed by the diverging turn (DT) and the distal loop (DL) $Q_T$ for SH3 at $0.91T_f$. It shows a dominant pathway of folding kinetics. The folding free energy is colored in grayscale in units of $k_BT$. The kinetic trajectory is colored by normalized time (time divided by $t_{\mathrm{fold}}$ of the trajectory) and projected on the folding free energy. Key conformations were selected for visual guidance. The residues of the diverging turn (M20, K21, K22, G23, and D24) and distal loop (N42 and D43) are illustrated with green and orange beads, respectively. The native contacts between DT and DL are formed before reaching the top of the barrier. Structures were created with VMD [30].

IV. DISCUSSION AND CONCLUSION

A. Crossover behavior of folding kinetics on the non-Arrhenius curve

It has been shown extensively that protein-folding rates are temperature dependent. Folding rates plotted against temperature render a U-shaped, non-Arrhenius curve where the rates are low at both low and high temperatures [43], and folding is fastest in a narrow range of temperature near $T_f$. Here, we show that HI affects folding non-trivially: it accelerates folding relative to BD without HI at $T<T_f$, whereas it retards folding relative to BD without HI at $T>T_f$. To our knowledge, this crossover behavior of the folding times, shown in Figs. 2(c) and 2(d), has never been observed or theoretically predicted.

The temperature dependence of the effect of HI on folding and the crossover behavior might explain the mixed results on the influence of HI on protein folding rates from several computational studies in the literature. In these previous studies, it was not clear whether the temperatures used were higher or lower than $T_f$. Rather, most simulation temperatures were justified by matching the experimentally measured diffusivity of a protein model. We discuss previous work below.

Several groups used similar coarse-grained molecular simulations with a structure-based model for probing the impact of HI on protein folding. Their results vary: Kikuchi et al. [12] found that there is no clear difference in the folding kinetics with or without HI of a protein and two secondary structures, an α-helix and a β-hairpin. Their $T_f$ values were not reported in their study, which makes it difficult to place their results in a temperature regime where HI can accelerate or retard folding dynamics. Frembgen-Kesner and Elcock [10] studied 11 small proteins and also two secondary structures, an α-helix and a β-hairpin.
The folding time decreased with HI for all studied proteins, but HI had the opposite effect for the secondary structures. It is inferred that they launched simulations at room temperature for all their systems. It may be that the folding temperatures of all proteins are greater than the room temperature used in the simulations, but this may not be the case for the secondary structures. Another study, performed by Cieplak & Niewieczerzał [9], showed the folding time of three proteins (1CRN, 1BBA, and 1L2Y) over a range of temperatures. Although the difference in folding time between their BD models with and without HI decreases at high temperature (above room temperature), there is no indication of a crossover behavior in their study. We speculate that their simulation temperatures are not close to $T_f$ because, for a structure-based model that folds and unfolds in a two-state manner, the free energy barrier of protein folding is typically a few $k_BT$ at $T_f$; the folding time at $T_f$ is exponentially longer than the fastest folding time with a minimal free energy barrier. In addition, there is no clear evidence that the protein models remained thermally unfolded at the maximum temperature studied with HI. Our work shows that justifying the simulation temperature against the folding temperature is a criterion for assessing the impact of HI on protein folding dynamics.

Additionally, Lipska et al. [11] argued that a structure-based model with only favorable attraction between native contacts is the reason why the studies mentioned above [9,10,12] did not observe retarded dynamics under HI in their simulations. They argued that the presence of intermediate states is key to retarded dynamics by studying the effects of HI on two proteins (1BDD and 1EOL) at distinct temperatures with a coarse-grained molecular simulation using the UNRES force field. Indeed, HI can alter kinetic paths to favor non-productive intermediates at a temperature lower than the collapse temperature $T_\theta$, as asserted by Tanaka [4]. Lipska's work did not investigate folding at $T>T_f$; thus, their conjecture is compatible with our finding that HI can retard the folding dynamics at $T>T_f$.

B. Underlying kinetic principles of the crossover behavior

The impact of HI on protein folding that gives rise to the crossover behavior is subtle over a wide range of temperatures because HI affects the folding mechanism in three thrusts that are not necessarily of equal prominence: (1) the dynamics of crossing over an activation barrier, (2) the choice of folding pathways, and (3) the motions between beads in viscous solvents. HI is a kinetic effect that arises from a diffusion tensor in the equation of motion. It does not shape a folding energy landscape, but it governs the ordering of contact pairs across a complex folding energy landscape, particularly when more than one pathway from the unfolded state to the folded state exists. We argued that at $T<T_f$, the first two factors dominate the kinetic principle that HI