FlexPepDock in a nutshell

All tutorial files are located at http://bit.ly/mxtakv

FlexPepDock refinement
  Step 1
  Step 3 - Refinement
  Step 4 - Selection of models
  Measure of fit
FlexPepDock Ab-initio
  Step 1 - Preparing the structure for modeling
  Step 3 - Ab-initio run
  Step 4 - Selection of models
FlexPepBind
  Step 1 - Preparation of the model
  Step 3 - Threading the peptide sequence
  Step 4 - FlexPepDock runs
  Step 5 - Specificity prediction & analyzing results
Glossary
  FlexPepBind
  Fit measures
Options (aka flags)
  Refinement
  Ab-initio

FlexPepDock refinement

Raveh B, London N, Schueler-Furman O. Sub-angstrom modeling of complexes between flexible peptides and globular proteins. Proteins. 2010;78(9):2029-40.

The Refinement protocol is intended for cases where an approximate, coarse-grained model of the interaction is available. The protocol iteratively optimizes the peptide backbone and its rigid-body orientation relative to the receptor protein, in addition to on-the-fly side-chain
optimization. The final result is a high-resolution model of the peptide-receptor complex, with the side chains of binding motifs modeled at nearly atomic accuracy. The protocol can account for a considerable diversity of peptide conformations within a given binding site. However, it is not intended for cases where the peptide backbone is unknown (the ab-initio protocol is better suited to these cases). The example here succeeds even though it starts from an extended conformation, but the protocol is intended for cases where the peptide's backbone is roughly in the correct position and needs to be brought to high-resolution accuracy.

Let's start with a refinement run to model a high-resolution structure of the serine-threonine kinase Chk1 (bound complex PDB 1nvr, free receptor PDB 2x8d). Our starting structure is the bound receptor with the extended peptide placed in its binding site. We want to use the receptor's side chains in our run: they are an important part of the input starting structure, since many receptors undergo only minor changes upon peptide docking. Usually, if a native structure is not supplied, the starting structure is used as the reference instead.

Step 1

First, edit the file called paths to set the paths needed to run Rosetta:
ROSETTA_DB - the path to the Rosetta database
ROSETTA_BIN - the path to the bin directory in the Rosetta installation

A preliminary step in our protocol is the packing of the side chains in each monomer, to remove internal clashes that are not related to inter-molecular interactions. This prepacking stage guarantees a uniform conformational background in non-interface regions prior to docking. The main flags in the prepacking step are:
-unboundrot - takes extra rotamers from the unbound structure.
It is worth noting that although our example passes a file named native.pdb to -unboundrot, that file is actually the native unbound structure and not the bound receptor.
-flexpep_prepack - the prepacking flag

To prepack the starting structure, simply run prepack_example. For further details on the flags used in this run, see the prepack_flags file. The resulting structure is start.ppk.pdb. You may want to open it in PyMOL and compare it with start.pdb to see what has changed. This will be the starting structure for the next step.

Step 3 - Refinement

This is the main part of the protocol. In this step, the peptide backbone and its rigid-body orientation are optimized relative to the receptor protein using a Monte Carlo with minimization approach, in addition to on-the-fly side-chain optimization. An optional low-resolution (centroid) pre-optimization may improve performance further. When using the FlexPepDock server, the results sent to the user are based on a combination of 100 decoys created using the low-resolution
pre-optimization and 100 that are created without using it. For this reason, we provide two flag files for this run, which let you produce results similar to those provided by the server. For this step, use these two commands:
run_example_nolowresolution
run_example_withlowresolution

Notice that the only difference between the provided flag files is the flag -flexpepdocking:lowres_preoptimize. Also notice the unique flags such as -flexpep_score_only (rescores the input PDB structures and outputs elaborate statistics about them in the score file). For a production run it is recommended to produce 200 decoys. In this demonstration you will create only 4, but we provide a folder with 200 decoys resulting from the runs described here.

Step 4 - Selection of models

A typical analysis of a Rosetta simulation involves plotting RMSD vs. total score. Attached are a plot of total_score vs. all-atom RMSD and a plot of reweighted_score vs. all-atom RMSD. Here are some additional scores for analyzing the results:
1. Interface score (I_sc) - the score computed over interface residues only.
2. Peptide score (pep_sc) - the score computed over the peptide and the interface residues.
3. Reweighted score (reweighted_sc) - experience has shown that a reweighted score (= I_sc + pep_sc + total_score) works better than the general score12 function for flexible peptides docking onto their receptors.

Measure of fit

1. rms(ca,bb,all)_if - RMSD between the output model and the native structure, over peptide interface atoms (C-alpha/backbone/all heavy atoms).
2. startrms(all,ca,bb) - RMSD between the start and native structures, over peptide atoms (all heavy/C-alpha/backbone atoms).

FlexPepDock Ab-initio

Raveh B, London N, Zimmerman L, Schueler-Furman O. Rosetta FlexPepDock ab-initio: simultaneous folding, docking and refinement of peptides onto their receptors. PLoS ONE. 2011;6(4):e18934.
The Ab-initio protocol extends FlexPepDock by incorporating fragment-based backbone sampling. For this protocol, no initial information about the peptide backbone is required, only the placement of an arbitrary peptide starting structure within the approximate binding pocket. In the preliminary phase of the protocol, a library of trimer, pentamer and nonamer (where available) backbone fragments is generated using the general fragment picker protocol. At the end of the fragment picking process, the library should contain 500 fragments from each
category of secondary structure type, i.e., helix, extended beta strand and coil/loop (a total of 1,500 fragments for a given query peptide).

As in the refinement protocol, the input is an initial model of the peptide-protein complex. It is assumed that the receptor backbone is approximately correct and that the peptide is initially positioned close to the correct binding site, albeit with an arbitrary backbone conformation. Most of our studies, as well as this example, start from extended peptide backbone conformations superimposed on a randomly selected anchor residue, but the protocol is designed to work for any arbitrary peptide starting conformation.

Step 1 - Preparing the structure for modeling

First, extract the file abinitio.tar.gz to obtain the files needed for the tutorial. Since the ab-initio protocol doesn't require the approximate structure of the peptide, only the approximate binding site and the sequence, we start with a peptide-receptor complex in which the peptide is extended. Specifically, in this example we use the unbound receptor structure corresponding to 1NVR (PDB ID 2x8d) with its original substrate peptide extended (the 5th residue serves as an anchor point).

Next, edit the file called paths to set the paths needed to run Rosetta:
ROSETTA_DB - the path to the Rosetta database
ROSETTA_BIN - the path to the bin directory in the Rosetta installation

As mentioned above, the first step in the Ab-initio protocol is the generation of fragment libraries. Since this step requires several additional applications (such as blastpgp, psipred, etc.) that aren't usually installed on conventional laptops, we provide these libraries for you. If you do want to generate the fragment libraries yourself, run scripts/prep_abinitio.sh from the root directory of the tutorial to automate the fragment creation.
Notice that the pre-generated fragment libraries are located in the frag directory.

Prepacking is the same as in the refinement protocol: to prepack the starting structure, simply run prepack_example. The output of this prepacking step serves as the input to the main Ab-initio protocol.

Step 3 - Ab-initio run

A typical ab-initio run includes:
1. Fast, low-resolution modeling - the peptide is folded and docked over the surface of the receptor protein using a low-resolution representation of the complex. This results in a coarse-grained model of the peptide-protein complex.
2. High-resolution optimization of the coarse-grained models with the Rosetta FlexPepDock refinement protocol you've already seen.

The recommended amount of sampling for an ab-initio run is ~50,000 decoys. This protocol is computationally intensive and could take several days on a typical cluster. If you still
wish to experience the joy of running the simulation (even with fewer decoys), you can run the run_example script (in the flags file, change the number of structures you would like to generate). The main flags used in the ab-initio protocol are:
-flexpepdocking:lowres_abinitio - the ab-initio protocol flag
-flexpepdocking:pep_refine - include this flag to add a refinement step after the low-resolution modeling
-flexpepdocking:flexpep_score_only - include additional scoring terms unique to FlexPepDock

Step 4 - Selection of models

A typical analysis of a Rosetta simulation involves plotting RMSD vs. total score. In addition to the scoring terms mentioned above, FlexPepDock ab-initio reports several more; one of them is bestrms_5mer_all, the RMSD of the 5-mer fragment that was closest to the native.

FlexPepBind

London N, Gulla SV, Keating AE, Schueler-Furman O. In silico and in vitro elucidation of BH3 binding specificity towards Bcl-2. Biochemistry. 2012;

Rosetta FlexPepBind is a protocol for the prediction of peptide binding specificity. It evaluates the binding potential of different peptides based on structural models of the corresponding peptide-receptor complexes. For a given peptide, we thread the desired sequence onto the peptide in the native structure, while keeping the side chains of the receptor fixed. We then use Rosetta FlexPepDock in full-atom mode (i.e. without centroid pre-optimization; see above) to refine the structure of the complex with the threaded target peptide (all of the peptide's side chains, as well as the receptor's interface side chains, are flexible). We create 100-1,000 models and score the sequence (see below) to evaluate the binding ability of the specific peptide sequence. As an example, we will use the interaction between a Bcl-2 protein and a BH3-only peptide (see London N, Gulla S, Keating A, Schueler-Furman O (2012).
In silico and in vitro elucidation of BH3 binding specificity towards Bcl-2. Biochemistry, doi:10.1021/bi3003567). The Bcl-2 protein family contains key regulators of programmed cell death, tumorigenesis and cellular responses to anti-cancer therapy.

Note that this protocol might be simplified for specific systems. For the calculation of Farnesyltransferase specificity, for example, minimizing only the peptide backbone instead of running FlexPepDock in full was also able to distinguish peptide substrates (see London et al. (2011) PLoS CB 7:e1002170 for more details).

Step 1 - Preparation of the model

The first input of the protocol is a peptide-protein complex. We will use the supplied template for Mcl-1 (PDB: 2pqk) as our example.
Check that the following paths are set correctly (as described in the ab-initio section):
ROSETTA_DB - the path to the Rosetta database
ROSETTA_BIN - the path to the bin directory in the Rosetta installation

As described before, the second preliminary step in the protocol is the packing of the side chains in each monomer, to remove internal clashes that are not related to inter-molecular interactions. This prepacking stage guarantees a uniform conformational background in non-interface regions prior to docking. The main flags in the prepacking step are:
-unboundrot - takes extra rotamers from the receptor structure and biases towards these side-chain conformations (note: in this case the "unbound" structure, i.e. the template for side-chain conformations, is the receptor structure itself)
-flexpep_prepack - the prepacking flag

To prepack the starting structure, simply run prepack_example, also supplied in scripts:
$ROSETTA_BIN/FlexPepDocking.linuxgccrelease -s 2pqk.pdb -native 2pqk.pdb -database $ROSETTA_DB -ex1 -ex2aro -unboundrot 2pqk.pdb -flexpep_prepack -nstruct 1

The output of this step will be used as input for step 4 in the protocol. An example output can be found in the input files: 2pqk.prepacked.pdb.

Step 3 - Threading the peptide sequence

In this step, the peptide sequences whose binding affinity we wish to examine are threaded onto the model. The desired peptide sequence can be supplied in a text file and is threaded onto the structure using design. The supplied script makeresfile_example can be used to create a resfile for any desired sequence. For the purpose of this example, however, we will use the supplied resfile resfile.mcl1, which designs one specific sequence: the peptide NOXA. For this step use design_example, also supplied in scripts:
$ROSETTA_BIN/fixbb.linuxgccrelease -database $ROSETTA_DB -s 2pqk.pdb -resfile resfile.mcl1 -nstruct 1

The output of this step will also be used for the next step in the protocol.
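The threading step depends on a resfile that fixes the amino-acid identity at each peptide position. As a rough illustration only (this is not the supplied makeresfile_example script, whose contents are not shown here), the Python sketch below writes such a resfile. It assumes the usual Rosetta resfile layout: a NATRO default that leaves all other residues untouched, a start line, then one PIKAA line per threaded residue. The function name make_resfile, the chain ID "B", the starting residue number 1 and the sequence "AELEV" are hypothetical placeholders and are not taken from 2pqk or resfile.mcl1.

```python
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 canonical one-letter codes


def make_resfile(sequence, chain, first_resnum=1):
    """Return resfile text forcing `sequence` onto consecutive residues of `chain`.

    Hypothetical helper for illustration; residue numbering is assumed to be
    consecutive PDB numbering starting at `first_resnum`.
    """
    sequence = sequence.upper()
    bad = sorted(set(sequence) - VALID_AA)
    if bad:
        raise ValueError("unsupported residue code(s): " + ", ".join(bad))

    # NATRO as the default keeps every residue not listed below fixed,
    # matching the requirement that receptor side chains stay unchanged.
    lines = ["NATRO", "start"]
    for offset, aa in enumerate(sequence):
        # PIKAA restricts the position to exactly the listed amino acid.
        lines.append(f"{first_resnum + offset} {chain} PIKAA {aa}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder sequence and chain, for demonstration only.
    print(make_resfile("AELEV", chain="B", first_resnum=1), end="")
```

The resulting text could be saved to a file and passed to fixbb through -resfile, as in the design_example command above; check the chain ID and numbering against your own template before using it.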
An example output of this step can be found in the input files: 2pqk.designed.pdb. However, the designed peptide must be cut and pasted into the prepacked receptor, which is obtained by extracting the receptor from the output of step 2. To save time, the prepacked receptor and the designed peptide have already been extracted and combined into the starting structure for the next step. It can be found in the input files under the name 2pqk.start.pdb, and will be used as input for step 4, running FlexPepDock.

Step 4 - FlexPepDock runs

In this step, FlexPepDock is run on each of the threaded peptide sequences. For this step use FlexPepDock_example, also supplied in scripts:
$ROSETTA_BIN/FlexPepDocking.linuxgccrelease -database $ROSETTA_DB -rbmcm -torsionsmcm -ex1 -ex2aro -s 2pqk.start.pdb -native 2pqk.designed.pdb -unboundrot 2pqk.designed.pdb -nstruct 1000

The recommended amount of sampling for a FlexPepBind run is anywhere between 100 and 1,000 decoys, depending on the case at hand. In this case, 1,000 decoys were used for thorough sampling. Of course, it is possible to change the number of structures you would like
to generate. Since this run takes a while, the top 5 decoys of this run are supplied in the output files, along with a .pse file (PyMOL session) showing them. Note: the top 5 decoys supplied here are the output of the runs made in London et al. (2012). Biochemistry, doi:10.1021/bi3003567.

Step 5 - Specificity prediction & analyzing results

FlexPepBind can be used for the straightforward purpose of assessing the binding affinities of a set of peptides to a receptor. An interesting extension of this protocol is the assessment of binding specificity. For example, the NOXA BH3 peptide binds to Mcl-1 but not Bcl-xL, while the BAD peptide binds to Bcl-xL but not Mcl-1 (see Table 3 in London et al., Biochemistry...). The protocol described here can therefore be applied to different receptors with different peptides, to determine binding specificity in addition to binding affinity. We've performed it with Mcl-1 and NOXA, and it can be performed in the same way for Bcl-xL and BAD. The template used for Bcl-xL is supplied and named 3io8. The template used for Mcl-1 is, as mentioned, 2pqk. The resfiles used to thread the peptides NOXA and BAD are supplied and named resfile.mcl1 and resfile.bclxl, respectively.

It is worth mentioning that every receptor-peptide complex has its own range of energy scores. Therefore, when comparing two different receptor-peptide complexes, the score threshold that distinguishes binders from non-binders is likely to differ between them. That is, a specific peptide might get a score x in complex with receptor A, which in the context of that complex indicates a binder, and get a better score y in complex with receptor B, yet still be considered a non-binder there, depending on the score ranges of complexes A and B. For the relevant thresholds of Bcl-xL and Mcl-1, see Table 2 in London et al. (2012). Biochemistry, doi:10.1021/bi3003567.
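The point about receptor-specific thresholds can be made concrete with a small Python sketch. It assumes that lower (more negative) scores are better, as is usual for Rosetta energies, and that one cutoff per receptor has been calibrated on peptides with known binding. The receptor labels, the cutoff values and the function name is_predicted_binder are all hypothetical placeholders; they are not the thresholds reported in London et al. (2012).

```python
# Hypothetical per-receptor cutoffs. Real values must come from calibration
# against peptides with known binding ability for each receptor.
THRESHOLDS = {
    "receptor_A": -10.0,
    "receptor_B": -20.0,
}


def is_predicted_binder(receptor, score, thresholds=THRESHOLDS):
    """Call a peptide a binder if its score is at or below the receptor's cutoff.

    Assumes lower scores are better; a score is only meaningful relative to
    the threshold of the same receptor.
    """
    return score <= thresholds[receptor]


if __name__ == "__main__":
    x = -12.0  # score of one peptide in complex with receptor A
    y = -15.0  # score of the same peptide in complex with receptor B
    print("A:", is_predicted_binder("receptor_A", x))
    print("B:", is_predicted_binder("receptor_B", y))
```

With these placeholder numbers, y is a better raw score than x, yet only the complex with receptor A passes its own cutoff, which is exactly why raw scores should not be compared across receptors.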
The scores used to analyze the results are described below. To evaluate the binding ability of different peptides to a given receptor, the resulting scores can be compared with the scores obtained for a range of peptides with known binding ability (see Figure 2 in London et al. (2012). Biochemistry, doi:10.1021/bi3003567). The main scoring functions used are detailed in the Glossary below. The scoring function was slightly adjusted for each system.

Bcl-2 receptor - BH3 peptide interactions: two minor adjustments to score12 were made (to obtain results as in London et al. (2012). Biochemistry, doi:10.1021/bi3003567, these adjustments need to be included):
1. The penalty for the burial of the carboxyl oxygen atom of the aspartate side chain is increased (the Gfree parameter of the Lazaridis-Karplus solvation potential was modified from -10 to -13.5 for this atom type). This was added to better account for a group of sequences that received poor experimental binding values in the TRAIN set but good peptide scores. Inspection of the structural models detected a buried Asp that explains the discrepancy.
2. A weak short-range Coulombic electrostatic energy term was added with weight 0.5 (added option []).
In addition, we assessed each model by its reweighted score (see Glossary below).
Farnesylation peptide substrates: Peptide_score_NoRef - the same as the peptide score (see Glossary below), minus a constant reference energy for each amino acid. This scoring function was found to perform well in our studies of Farnesyltransferase peptide substrate specificity (London et al. 2011. Identification of a novel class of Farnesylation targets by structure-based modeling of binding specificity. PLoS CB 7:e1002170).

Glossary

Interface score (I_sc) - the score computed over interface residues only, i.e. the energy of pair-wise interactions across the peptide-protein interface.
Peptide score (pep_sc) - the score of the peptide (including internal peptide energy as well as interactions with interface residues).
Reweighted score (reweighted_sc) - experience has shown that a reweighted score (= I_sc + pep_sc + total_score) works better than the general score12 function for flexible peptides docking onto their receptors.

Fit measures

rms(ca,bb,all)_if - RMSD between the output model and the native structure, over peptide interface atoms (C-alpha/backbone/all heavy atoms).
startrms(all,ca,bb) - RMSD between the start and native structures, over peptide atoms (all heavy/C-alpha/backbone atoms).

Options (aka flags)

Refinement
-flexpepdocking:lowres_preoptimize - low-resolution pre-optimization
-flexpep_score_only - rescores the input PDB structures and outputs elaborate statistics about them in the score file

Ab-initio
-flexpepdocking:lowres_abinitio - the ab-initio protocol flag
-flexpepdocking:pep_refine - include this flag to add a refinement step after the low-resolution modeling
-flexpepdocking:flexpep_score_only - include additional scoring terms unique to FlexPepDock
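For the model-selection steps, the reweighted score defined in the Glossary can be recomputed from a score file and used to rank decoys. The Python sketch below is one possible way to do this, under stated assumptions: every data line starts with "SCORE:", the first such line is the column header, and the columns total_score, I_sc, pep_sc and description are present. The sample rows, their numbers and the model names are invented for illustration and are not results from this tutorial; check the column names in your own score file before relying on them.

```python
# Made-up score-file excerpt; values are illustrative only.
SAMPLE_SCOREFILE = """\
SCORE: total_score I_sc pep_sc rmsALL_if description
SCORE: -250.0 -20.0 -15.0 1.2 model_0001
SCORE: -245.0 -28.0 -18.0 0.9 model_0002
SCORE: -255.0 -10.0 -5.0 4.5 model_0003
"""


def parse_scorefile(text):
    """Return one dict per decoy, keyed by the column names in the header line."""
    header = None
    rows = []
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue  # skip anything that is not a score line
        fields = line.split()[1:]
        if header is None:
            header = fields  # first SCORE: line carries the column names
            continue
        rows.append(dict(zip(header, fields)))
    return rows


def rank_by_reweighted(rows, top_n=10):
    """Sort decoys by I_sc + pep_sc + total_score, lowest (best) first."""
    scored = []
    for row in rows:
        reweighted = (float(row["I_sc"]) + float(row["pep_sc"])
                      + float(row["total_score"]))
        scored.append((reweighted, row["description"]))
    scored.sort()
    return scored[:top_n]


if __name__ == "__main__":
    for score, name in rank_by_reweighted(parse_scorefile(SAMPLE_SCOREFILE)):
        print(f"{name}\t{score:.1f}")
```

In this invented sample, the decoy with the best total_score (model_0003) ranks last by reweighted score because its interface and peptide terms are weak, which illustrates why the reweighted score is preferred for selecting peptide docking models.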