Novel Monte Carlo Methods for Protein Structure Modeling. Jinfeng Zhang Department of Statistics Harvard University


Introduction: Machines of Life
Proteins play crucial roles in virtually all biological processes.
Examples (structures from the Protein Data Bank, PDB): myosin, rhodopsin, hemoglobin, pepsin.

Introduction: From Sequence to Structure to Function
The function of a protein is governed by its three-dimensional structure.
Structure is determined by sequence.
Diagram labels: Genome Sequencing Projects, Function.

Structural Genomics
The next step beyond the Human Genome Project.
X-ray, NMR: at least one structure in each protein family (~8000).
Computational prediction: more than 3,000,000 proteins.

Outline
Markov chain Monte Carlo (MCMC) method for protein structure prediction
Sequential Monte Carlo (SMC) method for characterizing ensembles of protein conformations
Summary and discussion

Protein Folding
$p(x) = \frac{e^{-E(x)/k_B T}}{Z(T)}, \qquad Z(T) = \sum_i e^{-E(x_i)/k_B T}$
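As a small illustration of the Boltzmann distribution on this slide, the sketch below normalizes a finite list of energies into probabilities. The function name `boltzmann_probabilities`, the single parameter `kT` (standing for k_B T), and the three example energies are assumptions made for illustration; they are not from the talk.

```python
import math

def boltzmann_probabilities(energies, kT=1.0):
    """Return p(x_i) = exp(-E_i / kT) / Z(T) for a finite list of energies."""
    # Subtracting the minimum energy avoids overflow and cancels in the ratio.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)  # partition function, up to the common factor exp(-e_min / kT)
    return [w / z for w in weights]

# Hypothetical three-state system: lower energy gets higher probability.
probs = boltzmann_probabilities([-2.0, -1.0, 0.0], kT=1.0)
```

Lowering `kT` concentrates the probability on the minimum-energy state, which is the idea simulated annealing relies on.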

Simplified Protein Folding Model: HP Model
Protein sequence: hydrophobic (H) and polar (P) residues, e.g. HHHPHHPHHHHPHHP.
Protein conformation: a self-avoiding chain on a 2D square or 3D cubic lattice.
Energy: E = -n, where n is the number of hydrophobic contacts.
Assumption: the native conformation is the one with minimum energy.
Finding the conformation with minimum energy is an NP-complete problem.
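The energy E = -n can be made concrete with a short sketch for the 2D square lattice. It assumes the usual HP convention that a contact is a pair of H residues on neighbouring lattice sites that are not adjacent in the chain; the function name `hp_energy` and the coordinate-list representation are illustrative choices, not the author's code.

```python
def hp_energy(sequence, coords):
    """HP energy E = -n on a 2D square lattice.

    sequence: string of 'H' and 'P'; coords: list of (x, y) integer tuples.
    n counts H-H pairs on neighbouring sites that are not chain neighbours.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    index_at = {pos: i for i, pos in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts

# A 4-residue chain folded into a unit square: the two end H residues touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
energy = hp_energy("HPPH", square)
```

The same counting extends to the 3D cubic lattice by using (x, y, z) tuples and three forward neighbours instead of two.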

A Sequence of 64 Residues
HHHHHHHHHHHHPHPHPPHHPPHHPPHPPHHPPHHPPHPPHHPPHHPPHPHPHHHHHHHHHHHH
E = -42

HP Model and Lattice Polymers
Lattice models: a brief review
Used in polymer science since the 1950s.
HP model, 1985, by Dill: simple, with the basic features of protein folding.
Used for studying protein thermodynamics, folding principles, designability, and protein evolution.
Studies on folding HP sequences: HZ, CHCC, CI, GA, SA, GTS, EMC, GMC, GSA, PERM, SISPER, ACO, EES, npermis, npermh.
Still an unsolved problem for sequences of medium length.

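Exploring the Energy Landscape: Metropolis-Hastings Algorithm
Start with a conformation x.
Draw y from the proposal distribution q(x, y) = q(y | x).
Accept the new conformation with probability
$a(x, y) = \min\left\{1, \frac{\pi(y)\, q(y, x)}{\pi(x)\, q(x, y)}\right\} = \min\left\{1, e^{[E(x) - E(y)]/k_B T}\right\}$
The second form holds when the proposal is symmetric, q(x, y) = q(y, x), and the target is the Boltzmann distribution, $\pi(x) \propto e^{-E(x)/k_B T}$.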
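The acceptance step on this slide can be sketched in a few lines. This is a minimal version that assumes a symmetric proposal, so only the energy difference enters; `metropolis_accept`, `kT` (standing for k_B T) and the optional `rng` argument are hypothetical names chosen for the example.

```python
import math
import random

def metropolis_accept(e_old, e_new, kT, rng=random):
    """Accept with probability min{1, exp(-(e_new - e_old) / kT)}.

    Assumes a symmetric proposal, so the q terms of Metropolis-Hastings cancel.
    """
    if e_new <= e_old:
        return True  # downhill or equal-energy moves are always accepted
    return rng.random() < math.exp(-(e_new - e_old) / kT)

# Downhill move: always accepted.
accepted_downhill = metropolis_accept(0.0, -1.0, kT=1.0)
```

Uphill moves are accepted only occasionally, which lets the chain escape local minima of the energy landscape.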

Moves on Lattice
Pivot move
Local moves: end move, corner move, crankshaft move

Fragment Re-growth via Energy-guided Sequential Sampling (FRESS)
1. Sample a fragment length l from L_min to L_max. For example, p(l) ~ U[L_min = 2, L_max = 12], and l = 5.
2. Select and delete a random fragment of length l.
3. Sample the fragment sequentially, with $p_j \propto \exp(-E_j / T)$.
4. Accept or reject the newly sampled conformation using the Metropolis criterion:
$p = \min\{1, e^{-(E_{t+1} - E_t)/T}\}$
$p = \min\left\{1, \frac{w_t}{w_{t+1}}\, e^{-(E_{t+1} - E_t)/T}\right\}$
5. Simulated annealing.

FRESS Movie

2D Sequences
Seq. code   L     Sequence
2D50        50    HHPHPHPHPHHHHPHPPPHPPPHPPPPHPPPHPPPHPHHHHPHPHPHPHH
2D60        60    PPHHHPHHHHHHHHPPPHHHHHHHHHHPHPPPHHHHHHHHHHHHPPPPHHHHHHPHHPHP
2D64        64    HHHHHHHHHHHHPHPHPPHHPPHHPPHPPHHPPHHPPHPPHHPPHHPPHPHPHHHHHHHHHHHH
2D85        85    HHHHPPPPHHHHHHHHHHHHPPPPPPHHHHHHHHHHHHPPPHHHHHHHHHHHHPPPHHHHHHHHHHHHPPPHPPHHPPHHPPHPH
2D100a      100   PPPPPPHPHHPPPPPHHHPHHHHHPHHPPPPHHPPHHPHHHHHPHHHHHHHHHHPHHPHHHHHHHPPPPPPPPPPPHHHHHHHPPHPHHHPPPPPPHPHH
2D100b      100   PPPHHPPHHHHPPHHHPHHPHHPHHHHPPPPPPPPHHHHHHPPHHHHHHPPPPPPPPPHPHHPHHHHHHHHHHHPPHHHPHHPHPPHPHHHPPPPPPHHH

Comparison on Folding Long HP Sequences
2D seq.   EMC   SISPER   GSA   EES   npermis   FRESS
2D50      -21   -21      NA    -21   NA        -21
2D60      -35   -36      -36   -36   -36       -36
2D64      -39   -39      -42   -42   -42       -42
2D85      NA    -52      -52   -53   -53       -53
2D100a    NA    -48      -48   -48   -48       -48
2D100b    NA    -49      -50   -49   -50       -50

Minimum Energy Conformations of 2D64

Minimum Energy Conformations of 2D100

3D Sequences
Seq. code   L     Sequence
3D58        58    PHPHHHPHHHPPHHPHPHHPHHHPHPHPHHPPHHHPPHPHPPPPHPPHPPHHPPHPPH
3D64        64    PHHPHHPHHHPPHPHPPHPHPPHHHPHHPHHPPHHPHHPHHHPPHPHPPHPHPPHHHPHHPHHP
3D67        67    PHPHHPHHPHPPHHHPPPHPHHPHHPHPPHHHPPPHPHHPHHPHPPHHHPPPHPHHPHHPHPPHHHP
3D88        88    PHPHHPHHPHPPHHPPHPPHPPHPPHPPHPPHHPPHHHPPHHHPPHHHPPHHHPPHPHHPHHPHPPHPPHPPHHPPHPPHPPHHPPHP
3D103       103   PPHHPPPPPHHPPHHPHPPHPPPPPPPHPPPHHPHHPPPPPPHPPHPHPPHPPPPPHHHPPPPHHPHHPPPPPHHPPPPHHHHPHPPPPPPPPHHHHHPPHPP
3D124       124   PPPHHHPHPPPPHPPPPPHHPPPPHHPPHHPPPPHPPPPHPPHPPHHPPPHHPHPHHHPPPPHHHPPPPPPHHPPHPPHPHPPHPPPPPPPHPPHHHPPPPHPPPHHHHHPPPPHHPHPHPHPH
3D136       136   HPPPPPHPPPPHPHHPHHPPPPHPHHHPPPPHPHPHHHHPPPPPPPPPPPHPPHPPPHPHHPPPHHPPHPPHPHPHPPPPPPPPHPPPHHHHHHPPPHHPPHHHPPPHHPHHHHHPPPPPPPPPHPPPPHPHPPPP

Comparison on 3D HP Sequences
3D seq.   CI (1996)   npermis (2003)   npermh (2005)   FRESS (2006)
3D58      -42         -44 (0.19*)      -44 (1.10)      -44 (0.09)
3D64      NA          -56 (0.45)       -56 (0.47)      -56 (0.53)
3D67      NA          -56 (1.10)       -56 (0.33)      -56 (1.41)
3D88      NA          -69 (NA)         -69 (0.45)      -72 (5.03)
3D103     -49         -54 (3.12)       -55 (0.25)      -57 (4.47)
3D124     -58         -71 (12.3)       -71 (1.19)      -75 (280)
3D136     -65         -80 (110)        NA              -83 (350)
* Times are in CPU hours. CPU: npermis 667 MHz, npermh 1.84 GHz, FRESS 1.4 GHz.
For 3D124, -74 was found in 4.8 hours, and for 3D136, -82 was found in 6.4 hours.

Minimum Energy Conformations
Sequence 3D88: PHPHHPHHPHPPHHPPHPPHPPHPPHPPHPPHHPPHHHPPHHHPPHHHPPHHHPPHPHHPHHPHPPHPPHPPHHPPHPPHPPHHPPHP
E = -72; the previous minimum energy is -69.

Minimum Energy Conformations
Sequence 3D103: E = -57; the previous minimum energy is -55.
Sequence 3D124: E = -75; the previous minimum energy is -71.

Minimum Energy Conformations
Sequence 3D136: HPPPPPHPPPPHPHHPHHPPPPHPHHHPPPPHPHPHHHHPPPPPPPPPPPHPPHPPPHPHHPPPHHPPHPPHPHPHPPPPPPPPHPPPHHHHHHPPPHHPPHHHPPPHHPHHHHHPPPPPPPPPHPPPPHPHPPPP
E = -83; the previous minimum energy is -80.

Characterizing Ensembles of Protein Conformations by the Sequential Monte Carlo Method

X-ray Structures

Proteins are Dynamic Molecules

Y. J. Huang and G. T. Montelione, Nature 438 (2005), 36-37.

N. Furnham, T. Blundell, M. DePristo, Nature Structural & Molecular Biology (2006) 13: 184-185.
A more suitable representation of a macromolecular crystal structure would be an ensemble of models. The range of structures in the ensemble would be considered by any user of the structural information.

Characterizing Ensemble Conformations
Backbone: ensemble of structures with different backbone conformations. J. Zhang et al. (2007), Proteins, 66: 61-68.
Side-chain: ensemble of structures with the same backbone but different side-chain conformations. J. Zhang & J. S. Liu (2006), PLoS Comp Biol, 2(12): e168.

Ensemble of Side-chain Conformations

Ensemble of Side-chain Conformations
Number of side-chain conformations: N_sc.
Side-chain conformational entropy: $S_{sc} = k_B \ln(N_{sc})$.
Protein stability: $\Delta G = \Delta H - T \Delta S$.
http://wishart.biology.ualberta.ca/moviemaker

Side-chain Modeling
All heavy atoms are explicitly modeled.
Side-chain flexibility: rotamer library by D. Richardson.
Excluded volume effect: a pair of atoms i and j is considered a hard clash if
$r_{ij} < a\,(r_0(i) + r_0(j))$
where r_ij is the distance between the atoms, r_0(i) and r_0(j) are the van der Waals radii of the two atoms, and a is a scaling coefficient.
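The hard-clash criterion on this slide translates directly into a distance test. The sketch below is illustrative only: the function name `is_hard_clash`, the default `a=0.8`, and the carbon radius of 1.7 Å used in the example are assumptions, not values confirmed by the talk for this step.

```python
import math

def is_hard_clash(pos_i, pos_j, radius_i, radius_j, a=0.8):
    """Return True if r_ij < a * (r0(i) + r0(j)).

    pos_i, pos_j: (x, y, z) coordinates; radius_i, radius_j: van der Waals
    radii in the same length unit; a: scaling coefficient (assumed default).
    """
    r_ij = math.dist(pos_i, pos_j)  # Euclidean distance (Python 3.8+)
    return r_ij < a * (radius_i + radius_j)

# Two hypothetical carbon atoms (radius 1.7 each) at two separations.
close_pair = is_hard_clash((0.0, 0.0, 0.0), (2.5, 0.0, 0.0), 1.7, 1.7)
far_pair = is_hard_clash((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), 1.7, 1.7)
```

With a = 0.8 the clash threshold for this pair is 2.72, so the first pair clashes and the second does not. A smaller a tolerates more overlap and therefore admits more side-chain conformations.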

Sequential Monte Carlo (SMC)
$S_n = (r_1, \ldots, r_j, \ldots, r_n)$, with $r_j \in R_j = \{1, \ldots, M_j\}$.
SMC: sample the side chains (r) one at a time, keeping fixed the residues that are already sampled.
Quantity of interest: $\sum_{S_n \in \Omega_n} h(S_n)$.
For each sample i, there is an associated weight $w^{(i)}$.
At step t, a residue $r_t$ is picked, and a rotamer k is sampled from a given distribution with probability $p_k$.
Update the weight by $w_{t+1}^{(i)} = w_t^{(i)} / p_k$.
Estimate: $\sum_{S_n \in \Omega_n} h(S_n) \approx \frac{1}{m} \sum_{i=1}^{m} w_n^{(i)}\, h(S_n^{(i)})$.
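The weight update w ← w / p_k and the final average can be illustrated on a toy counting problem. The sketch below does not model side chains: it swaps in self-avoiding walks on the 2D square lattice, grows each walk one step at a time with uniform choice among free sites, and takes h = 1 so that the weighted average estimates the number of walks. There is no resampling step, and all names (`grow_walk`, `estimate_count`) are hypothetical.

```python
import random

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def grow_walk(n_steps, rng):
    """Grow one self-avoiding walk; return its weight (0.0 if it gets trapped)."""
    pos = (0, 0)
    visited = {pos}
    weight = 1.0
    for _ in range(n_steps):
        free = [(pos[0] + dx, pos[1] + dy) for dx, dy in STEPS
                if (pos[0] + dx, pos[1] + dy) not in visited]
        if not free:
            return 0.0              # dead end: contributes zero to the estimate
        pos = rng.choice(free)      # sampled with probability p_k = 1 / len(free)
        weight *= len(free)         # w <- w / p_k
        visited.add(pos)
    return weight

def estimate_count(n_steps, n_samples, seed=0):
    """Estimate the number of n-step self-avoiding walks as the mean weight."""
    rng = random.Random(seed)
    return sum(grow_walk(n_steps, rng) for _ in range(n_samples)) / n_samples

# Known exact counts on the square lattice: 36 walks of 3 steps, 100 of 4 steps.
count_3 = estimate_count(3, 50)
count_4 = estimate_count(4, 20000)
```

For three steps every walk has weight 4 × 3 × 3, so the estimate is exact; for four steps the weights vary and the average converges to 100 as the sample size grows. For side chains, the rotamers of each residue would play the role of the candidate lattice sites.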

Performance
[Figure, panel a: S_sc / k_B versus fragment length for 2ovo and 3ebx, enumeration compared with SMC. Panel b: standard deviation versus sample size (100, 1000, 2000) for 2erl (40 aa, 48.1), 4rnt (104 aa, 109.3), 1fi2a (201 aa, 250.0), and 1uzba (516 aa, 672.5).]
The total number of self-avoiding side-chain conformations for the fragment of 3ebx, residues 1-17, is 396,325,923,840 ≈ 3.96 × 10^11; the SMC estimate is 4.01 × 10^11 with a sample size of 1000.

Sequential Sampling of Side Chains

Effect of Side-chain Entropy (SCE) on Protein Stability
Native and decoy structures from the Decoys 'R' Us database.

Native & Decoy Structures
[Figure: S_sc / k_B versus contact number for the native structure and decoys of 1ctf.]
S_sc can differ by more than 20 in units of k_B, which corresponds to -11.9 kcal/mol of free energy at 300 K. The stability of a protein is around -5 to -20 kcal/mol.
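The conversion quoted on this slide is the term -TΔS evaluated with the gas constant in kcal/(mol K). A minimal check, with the function name `entropy_to_free_energy` chosen only for this example:

```python
# Gas constant in kcal / (mol K); equivalent to k_B per mole of molecules.
R_KCAL_PER_MOL_K = 1.987204e-3

def entropy_to_free_energy(delta_s_in_kb, temperature_k):
    """Return -T * dS in kcal/mol for an entropy difference given in units of k_B."""
    return -temperature_k * R_KCAL_PER_MOL_K * delta_s_in_kb

# An entropy difference of 20 k_B at 300 K.
delta_g = entropy_to_free_energy(20, 300.0)
```

The result rounds to -11.9 kcal/mol, which is comparable to the -5 to -20 kcal/mol stability range quoted on the slide.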

Incorporating SCE in Energy Function
$\Delta G = \Delta H$, where ΔH is the residue contact potential.
$\Delta G = \Delta H - T \Delta S_{sc}$, where ΔH is the residue contact potential, ΔS_sc is the side-chain entropy, and T = 1.

ΔH vs. ΔH - ΔS_sc
Protein ID           ΔH     ΔH - ΔS_sc
1ctf (A, 630)*       6      1
1r69 (A, 675)        24     5
1sn3 (A, 660)        86     10
2cro (A, 674)        63     5
3icb (A, 653)        19     25
4pti (A, 687)        143    83
4rxn (A, 677)        14     7
1fc2 (B, 500)        7      5
1hdd-C (B, 500)      10     5
2cro (B, 500)        47     17
4icb (B, 500)        1      1
1bl0 (C, 971)        851    4
1eh2 (C, 2413)       995    3
1jwe (C, 1407)       288    1
smd3 (C, 1200)       266    1
1beo (D, 2000)       67     2
1ctf (D, 2000)       10     1
1dkt-A (D, 2000)     588    5
1fca (D, 2000)       136    10
1nkl (D, 2000)       217    3
1pgb (D, 1572)       12     1
1b0n-B (E, 497)      114    104
1ctf (E, 497)        13     4
1dtk (E, 215)        1      1
1fc2 (E, 500)        32     3
1igd (E, 500)        159    6
1shf-A (E, 437)      2      2
2cro (E, 500)        1      1
2ovo (E, 347)        19     2
4pti (E, 343)        1      1
* A: 4state_reduced, B: fisa, C: fisa_casp3, D: lattice_ssfit, E: lmds.

Protein Interactions and SCE
http://wishart.biology.ualberta.ca/moviemaker

Native & Decoy Structures of Protein Complexes
[Figure: S_sc / k_B versus contact number for the native structure and decoys of the complexes 1spb and 1brc.]

X-ray & NMR Structures
X-ray: protein in crystal. NMR: protein in solution.

SCE vs. R_g of X-ray and NMR Structures
23 proteins with both X-ray and NMR structures.
[Figure: average ΔS_sc / k_B versus average ΔR_g.]
$\Delta S_{sc} = S_{sc,\mathrm{X\text{-}ray}} - S_{sc,\mathrm{NMR}}$, $\Delta R_g = R_{g,\mathrm{X\text{-}ray}} - R_{g,\mathrm{NMR}}$

SCE in Protein Design
[Figure: ΔS_sc versus contact number for 1bth and 2ptc.]
$\Delta S_{sc} = S_{sc,\mathrm{complex}} - S_{sc,\mathrm{protein\_1}} - S_{sc,\mathrm{protein\_2}}$

Summary
Sampling method: a new MC method for protein structure prediction.
  Global optimization and sampling.
  Simple and effective.
Characterizing ensemble conformations:
  SMC for estimating entropy and free energy.
  SCE is important for protein folding and structure modeling.

Acknowledgement
Prof. Jun Liu, Department of Statistics, Harvard University
Prof. Jie Liang, Bioengineering Department, University of Illinois at Chicago
Prof. Rong Chen, Department of Information and Decision Science, University of Illinois at Chicago
Prof. Sam Kou, Department of Statistics, Harvard University
NIH and NSF for financial support.

Future Work
Extend FRESS to real protein simulation.
Apply FRESS to other optimization and sampling problems.
Apply side-chain modeling to protein structure prediction, protein interaction, and protein design.
SMC for other statistical and computational problems.

10 Benchmark Sequences of Length 48
Name   Sequence                                           E
s48a   HPHHPPHHHHPHHHPPHHPPHPHHHPHPHHPPHHPPPHPPPPPPPPHH   -32
s48b   HHHHPHHPHHHHHPPHPPHHPPHPPPPPPHPPHPPPHPPHHPPHHHPH   -34
s48c   PHPHHPHHHHHHPPHPHPPHPHHPHPHPPPHPPHHPPHHPPHPHPPHP   -34
s48d   PHPHHPPHPHHHPPHHPHHPPPHHHHHPPHPHHPHPHPPPPHPPHPHP   -33
s48e   PPHPPPHPHHHHPPHHHHPHHPHHHPPHPHPHPPHPPPPPPHHPHHPH   -32
s48f   HHHPPPHHPHPHHPHHPHHPHPPPPPPPHPHPPHPPPHPPHHHHHHPH   -32
s48g   PHPPPPHPHHHPHPHHHHPHHPHHPPPHPHPPPHHHPPHHPPHHPPPH   -32
s48h   PHHPHHHPHHHHPPHHHPPPPPPHPHHPPHHPHPPPHHPHPHPHHPPP   -31
s48i   PHPHPPPPHPHPHPPHPHHHHHHPPHHHPHPPHPHHPPHPHHHPPPPH   -34
s48j   PHHPPPPPPHHPPPHHHPHPPHPHHPPHPPHPPHHPPHHHHHHHPPHH   -33
Yue, K., et al. Proc. Natl. Acad. Sci. U.S.A. 92:325, 1995.

Comparison on Benchmark Sequences
Method   # Seq   Avg. Time*
FRESS    10      0.77
LM       2       3.54
npermh   10      4.28
ACO      10      296
CG       9       73.6
MA       8       260
* CPU time in minutes. CPUs: FRESS 1.4 GHz, npermh 1.84 GHz, ACO 2.4 GHz.

The Key Ingredients of FRESS
Method      # Seq   Avg. Time
Optimal     10      0.77
NE          3       1.37
L_max = 4   5       1.94
L = 12      8       1.42
Optimal condition: starting temperature T_h = 3.5, minimum temperature T_l = 0.1, temperature decreased geometrically by a factor of 0.98; L_min = 2, L_max = 12, with 5 × 10^4 moves at each temperature.

The Strategy of FRESS
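The annealing schedule described under the optimal condition can be written out as a short sketch. It reads the slide as a geometric schedule (each temperature is 0.98 times the previous one) with 5 × 10^4 moves per temperature; the function name `annealing_schedule` and the stopping rule (stop once the next temperature would fall below T_l) are assumptions.

```python
def annealing_schedule(t_high=3.5, t_low=0.1, factor=0.98):
    """Return the list of temperatures from t_high down to (at least) t_low."""
    temperatures = []
    t = t_high
    while t >= t_low:
        temperatures.append(t)
        t *= factor  # geometric cooling
    return temperatures

MOVES_PER_TEMPERATURE = 5 * 10**4  # as read from the slide

schedule = annealing_schedule()
total_moves = len(schedule) * MOVES_PER_TEMPERATURE
```

`total_moves` gives the length of one annealing run under these settings, which is useful when comparing against the CPU times in the table.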

3D Sequences
3D seq.   CI    npermis        npermh        FRESS (2006)   E*
3D58      -42   -44 (0.19*)    -44 (1.10)    -44 (0.09)     -47
3D64      NA    -56 (0.45)     -56 (0.47)    -56 (0.53)     -59
3D67      NA    -56 (1.10)     -56 (0.33)    -56 (1.41)     -59
3D88      NA    -69 (NA)       -69 (0.45)    -72 (5.03)     -72
3D103     -49   -54 (3.12)     -55 (0.25)    -57 (4.47)     -59
3D124     -58   -71 (12.3)     -71 (1.19)    -75 (280)      -82
3D136     -65   -80 (110)      NA            -83 (350)      -85

Large Scale Study of SCE
[Figure, panel a: S_sc / k_B versus length for α = 0.8 and α = 0.6. Panel b: S_sc,buried / S_sc versus length for α = 0.8 and α = 0.6.]
J. Zhang, J. S. Liu (2006) PLoS Comp Biol, 2(12): e168.
