Protein-protein Interaction Prediction using Desolvation Energies and Interface Properties

2010 IEEE International Conference on Bioinformatics and Biomedicine

Luis Rueda, Sridip Banerjee, Md. Mominul Aziz, Mohammad Raza
School of Computer Science, University of Windsor, 401 Sunset Ave., Windsor, ON, N9B 3P4, Canada

Abstract

An important aspect in understanding and classifying protein-protein interactions (PPI) is to analyze their interfaces in order to distinguish between transient and obligate complexes. We propose a classification approach to discriminate between these two types of complexes. Our approach has two important aspects. First, we use desolvation energies, computed for the amino acid and atom types of the residues present in the interface, as the input features of the classifiers. Principal components of the data are found, and the classification is then performed via linear dimensionality reduction (LDR) methods. Second, we investigate various interface properties of these interactions. From the analysis of protein quaternary structures, physicochemical properties are treated as the input features of the classifiers. Various features are extracted from each complex, and the classification is performed via different LDR methods. The results on standard benchmarks of transient and obligate protein complexes show that (i) desolvation energies are better discriminants than solvent accessibility and conservation properties, among others, and (ii) the proposed approach outperforms previous solvent accessible area based approaches that use support vector machines.

Keywords: protein-protein interaction; classification; linear dimensionality reduction; desolvation energy; interface properties

I. INTRODUCTION

In the field of proteomics, one of the current goals is to map the protein interaction networks of different organisms [1].
In the complex web of interacting proteins, defining a protein by its position requires protein-protein interaction information. Knowledge of this information greatly helps biological research and makes the discovery of novel drug targets much easier. Traditionally, the detection of protein-protein interactions was limited to labor-intensive experimental techniques such as co-immunoprecipitation or affinity chromatography. However, these methods may not be generally applicable to all proteins in all organisms, and may also be prone to systematic errors. Recently, various complementary computational approaches have been developed for the large-scale prediction of protein-protein interactions, based on protein sequence, structure and evolutionary relationships in complete genomes.

Some of the studies in PPI consider the characterization of the geometry [2], physicochemical properties [3], the preference of residues to appear on the surface [4], and the role of hydrogen bonds and salt bridges, and of hydrophobic and polar interactions, on the protein surfaces [5]. Other studies include the analysis of the loss of solvent accessible surface as a result of the interaction [6], and the analysis of the conservation of residues in the interaction surface [7]. At a higher level, the amino acid composition of protein-protein interfaces has been studied to infer the composition of the residues at the interface, which is generally different from that of the rest of the surface. In [8], six types of interfaces were studied: intra and inter domains, homo and hetero-oligomers, and obligate and transient complexes. That study concluded that the amino acid compositions of these surfaces are different, as there is only .5% similarity between the internal and external surfaces, and 0.2% similarity between hetero surfaces belonging to obligate homo complexes and transient homo complexes.
To study the behavior of transient and obligate interactions, a classification of these two types of interactions was proposed in [9], where interactions are classified based on the lifetime of the complex. Obligate interactions are usually more stable, while transient interactions are less stable and hence more difficult to discriminate and understand, due to their short life [10]. Protomers from obligate complexes do not exist as stable structures in vivo, whereas protomers of non-obligate complexes may dissociate from each other and remain stable, functional units. For these reasons, it is of prime importance in proteomics to distinguish between obligate and transient complexes. Additionally, in [11], it was proposed that interfaces in obligate complexes are inherently hydrophobic. In [12], three different types of interactions were studied, namely crystal packing, obligate and non-obligate interactions. That study is based on solvent accessible surface area, conservation scores, and shapes of the interfaces.

The interfaces of some transient complexes were also found to contain clusters of hydrophobic residues [13]. Moreover, transient complexes are rich in aromatic residues and arginine but depleted in other charged residues [14]. However, hydrophobicity at the interfaces of transient complexes is not as distinguishable from the remainder of the surface as hydrophobicity at the interfaces of obligate complexes [14]. As a result, it is difficult to make an accurate prediction of the interfaces of transient complexes using a single parameter of residue interface propensity. In [15], a study on protein-protein interactions was conducted in which each interaction is analyzed in terms of physical interaction, co-complex relationship and co-membership in a pathway (i.e., enzymes involved in the same enzymatic or metabolic pathway). Although interfaces have been the main subject of study to predict protein-protein interactions, an accuracy of 70% has been independently achieved by several different groups [16]-[19]. These approaches have been carried out by analyzing a wide range of parameters, including desolvation energies, amino acid composition, conservation, electrostatic energies, and hydrophobicity.

In a recent work, prediction of four different PPI types has been performed, including transient enzyme inhibitor/non enzyme inhibitor and permanent homo/hetero obligate complexes [20]. That work uses association rules to understand and characterize the diverse kinds of interactions, and carries out experiments on 47 pre-classified complexes, a smaller set than the one used in [21], which is also used here. In this paper, we propose a classification approach that uses desolvation energy and interface properties to discriminate between transient and obligate interactions, achieving 88.68% and 80.27% accuracy for the datasets of [12] and [21], respectively.

II. THE PREDICTION METHODS

In order to classify complexes on the basis of desolvation energies (400 features for amino acid type and 324 features for atom type), we first use principal component analysis (PCA) as a pre-processing step. PCA, though an unsupervised method, is applied to avoid the ill-conditioned matrices involved in the linear dimensionality reduction (LDR) techniques. To select the principal components, we have used different threshold values, and selected the threshold that leads to the highest accuracy. After obtaining those principal components, we classify complexes with LDR methods, including the well-known Fisher's discriminant analysis and two heteroscedastic approaches. For classification on the basis of different numbers of physicochemical interface properties, we have also compared the LDR methods with the well-known support vector machine (SVM).

The basic idea of LDR is to represent an object of dimension $n$ as a lower-dimensional vector of dimension $d$, achieving this by performing a linear transformation. We consider two classes, $\omega_1$ and $\omega_2$, represented by two normally distributed random vectors $x_1 \sim N(m_1, S_1)$ and $x_2 \sim N(m_2, S_2)$, respectively, with $p_1$ and $p_2$ the a priori probabilities. After the LDR is applied, two new random vectors $y_1 = A x_1$ and $y_2 = A x_2$ are obtained, where $y_1 \sim N(A m_1, A S_1 A^t)$ and $y_2 \sim N(A m_2, A S_2 A^t)$, with $m_i$ and $S_i$ being the mean vectors and covariance matrices in the original space, respectively. The aim of LDR is to find a linear transformation matrix $A$ in such a way that the new classes ($y_i = A x_i$) are as separable as possible. Let $S_W = p_1 S_1 + p_2 S_2$ and $S_E = (m_1 - m_2)(m_1 - m_2)^t$ be the within-class and between-class scatter matrices, respectively. Various criteria have been proposed to measure this separability [22]. We consider three LDR methods:

(a) The well-known Fisher's discriminant analysis (FDA) [23], [24], where the criterion to optimize is

$$J_{FDA}(A) = \mathrm{tr}\left\{ (A S_W A^t)^{-1} (A S_E A^t) \right\}. \qquad (1)$$

The matrix $A$ is found by considering the eigenvector corresponding to the largest eigenvalue of $S_{FDA} = S_W^{-1} S_E$.

(b) The heteroscedastic discriminant analysis (HDA) approach [25], which aims to obtain the matrix $A$ that maximizes the function

$$J_{HDA}(A) = \mathrm{tr}\left\{ (A S_W A^t)^{-1} \left[ A S_E A^t - A S_W^{1/2} \, \frac{p_1 \log(S_W^{-1/2} S_1 S_W^{-1/2}) + p_2 \log(S_W^{-1/2} S_2 S_W^{-1/2})}{p_1 p_2} \, S_W^{1/2} A^t \right] \right\}. \qquad (2)$$

This criterion is maximized by obtaining the eigenvectors, corresponding to the largest eigenvalues, of the matrix

$$S_{HDA} = S_W^{-1} \left[ S_E - S_W^{1/2} \, \frac{p_1 \log(S_W^{-1/2} S_1 S_W^{-1/2}) + p_2 \log(S_W^{-1/2} S_2 S_W^{-1/2})}{p_1 p_2} \, S_W^{1/2} \right]. \qquad (3)$$

(c) The Chernoff discriminant analysis (CDA) approach [22], which aims to maximize the following function:

$$J_{CDA}(A) = \mathrm{tr}\left\{ p_1 p_2 \, A S_E A^t (A S_W A^t)^{-1} + \log(A S_W A^t) - p_1 \log(A S_1 A^t) - p_2 \log(A S_2 A^t) \right\}. \qquad (4)$$

In [22], a gradient-based algorithm was proposed, which maximizes the function in an iterative way. For this gradient algorithm, a learning rate, $\alpha_k$, needs to be computed. In order to ensure that the gradient algorithm converges, $\alpha_k$ is maximized by the secant method. One of the keys in this algorithm is the initialization of the matrix $A$; in this work, we have performed ten different initializations and then chosen the solution for $A$ that gives the maximum Chernoff distance.
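As an illustration of the PCA pre-processing step and of criterion (1), the following is a minimal NumPy sketch, not the authors' implementation. The function names `pca_reduce` and `fda_direction`, the explained-variance form of the PCA threshold, and the use of class frequencies as the priors $p_1$ and $p_2$ are all assumptions made for the example.

```python
import numpy as np

def pca_reduce(X, var_threshold=0.95):
    """Project X (samples x features) onto its leading principal components.

    Assumption: the threshold is a fraction of explained variance; the paper
    only says that several threshold values were tried.
    """
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(explained, var_threshold)) + 1
    return Xc @ Vt[:k].T

def fda_direction(X1, X2):
    """Two-class FDA: top eigenvector of S_W^{-1} S_E, returned with unit norm."""
    n1, n2 = len(X1), len(X2)
    p1, p2 = n1 / (n1 + n2), n2 / (n1 + n2)  # priors estimated from class sizes
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = np.atleast_2d(np.cov(X1, rowvar=False))
    S2 = np.atleast_2d(np.cov(X2, rowvar=False))
    S_W = p1 * S1 + p2 * S2                   # within-class scatter
    diff = (m1 - m2)[:, None]
    S_E = diff @ diff.T                       # between-class scatter (rank one)
    evals, evecs = np.linalg.eig(np.linalg.solve(S_W, S_E))
    a = np.real(evecs[:, np.argmax(np.real(evals))])
    return a / np.linalg.norm(a)
```

Because $S_E$ has rank one in the two-class case, the direction returned here is proportional to $S_W^{-1}(m_1 - m_2)$, which is a convenient sanity check. The HDA and CDA criteria would need matrix logarithms and, for CDA, an iterative optimizer; they are not covered by this sketch.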

III. THE FEATURES

In our approach, we introduce the use of desolvation energies as physicochemical properties to discriminate between transient and obligate complexes. We have also used other interface and non-interface properties, which include solvent accessibility, among others.

A. Desolvation Energies

Different approaches have been developed to group different types of proteins based on their properties. Among these properties, desolvation energies are very effective for classification, as shown later in the paper. The desolvation free energy is a knowledge-based contact potential that accounts for hydrophobic interactions, the self-energy change upon desolvation of charged and polar atom groups, and side-chain entropy loss. In [26], the binding free energy, $G_{bind}$, is defined by the following equation:

$$G_{bind} = E_{elec} + G_{des}, \qquad (5)$$

where $E_{elec}$ is the total electrostatic energy and $G_{des}$ is the total desolvation energy, which for a protein is defined as follows:

$$G_{des} = g(r) \sum_i \sum_j e_{ij}. \qquad (6)$$

If we consider the interaction between the $i$th atom of a ligand and the $j$th atom of a receptor, then $e_{ij}$ is the atomic contact potential (ACP) [27] between them and $g(r)$ is a smooth function based on their distance. For simplicity, we consider the smooth function to be linear. We also adopt the criterion that, for a successful interaction, atoms should be within 7 Å of each other. Between 5 and 7 Å, the value of $g(r)$ varies between 0 and 1 according to the smooth function. The value of $g(r)$ is 1 for atoms that are less than 5 Å apart [26].

To create the datasets for classification, two pre-classified datasets of protein complexes were obtained from the studies of [21] and [12]. The first set of proteins, the Mintseris et al. dataset, contains complexes of two classes: 209 transient complexes and 115 obligate complexes. The second dataset, the Zhu et al. dataset, contains 62 transient complexes and 75 obligate complexes.
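The distance weighting and the double sum of Eq. (6) can be sketched as follows. This is an illustrative reading rather than the authors' code: it assumes that $g$ is evaluated at the distance of each ligand-receptor atom pair, that the linear ramp falls from 1 at 5 Å to 0 at 7 Å, and that the ACP values are available as a square table `acp` indexed by atom type. The names `g`, `desolvation_energy` and `acp` are hypothetical.

```python
import numpy as np

def g(r, r_in=5.0, r_out=7.0):
    """Linear switching function: 1 below r_in, 0 beyond r_out, linear in between."""
    r = np.asarray(r, dtype=float)
    return np.clip((r_out - r) / (r_out - r_in), 0.0, 1.0)

def desolvation_energy(lig_xyz, rec_xyz, lig_types, rec_types, acp):
    """Sum of g(r_ij) * e_ij over all ligand atoms i and receptor atoms j.

    lig_xyz, rec_xyz: (n, 3) arrays of atomic coordinates in angstroms.
    lig_types, rec_types: integer atom-type indices into the ACP table.
    acp: (T, T) array of atomic contact potentials (hypothetical input).
    """
    # Pairwise distances between every ligand atom and every receptor atom.
    d = np.linalg.norm(lig_xyz[:, None, :] - rec_xyz[None, :, :], axis=-1)
    # e_ij looked up from the ACP table for each pair of atom types.
    e = acp[np.ix_(lig_types, rec_types)]
    return float(np.sum(g(d) * e))
```

With this weighting, atom pairs farther apart than 7 Å contribute nothing, so only contacts near the interface affect the sum.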
We collected the structural information about the protein complexes from the Protein Data Bank (PDB) [28]. From [27], we obtained 18 different atom types. For each pair of atom types, we obtained the cumulative sum of desolvation energies computed using Eq. (6), obtaining $18^2$ different values for each complex, and hence 324 features. We also considered pairs of amino acids; for this, we computed $20^2$ values using Eq. (6), one for each pair, obtaining 400 different features. We then created two data subsets from each of the datasets of Mintseris et al. [21] and Zhu et al. [12]. Additionally, we computed the solvent accessible surface area (SASA) using the NACCESS program [29], and weighted our four prepared data subsets with the SASA values to account for the effective surface taking part in the interactions. In the Mintseris et al. dataset, many proteins have multiple chains, and hence we calculate the desolvation energy values either between pairs of chains or between multiple chains; we call these one against one and all against all comparisons, respectively. Finally, we obtained 12 datasets to test our classification methods. In all of these datasets, some features are zero for most of the feature vectors; these were filtered out by applying PCA.

B. Interface Properties

We have also considered other properties, mainly for those atoms and amino acids in the interface. A residue is defined as being part of the interface if its SASA decreases by more than 1 Å² upon the formation of the complex. SASA values for the residues were calculated using NACCESS [29] with a probe sphere of radius 1.4 Å. Other features that can be derived from this SASA value, (a) interface area and (b) interface area ratio, were calculated in the same way as in the NOXclass method [12]. We have considered 40 features, as opposed to NOXclass, which considers six features. These features are the number-based amino acid composition and the area-based amino acid composition, as described below.
(a) Number-based amino acid composition: The number-based amino acid composition, $v_n$, is defined as the frequency of each of the 20 amino acid types in the protein-protein interface. After determining which residues are in the interface, we obtained the frequency of each of the 20 standard amino acid types among those residues.

(b) Area-based amino acid composition: By weighting each residue with its SASA, the area-based amino acid composition, $v_a$, is computed as

$$v_{a,i} = \frac{1}{2 \cdot \text{Interface Area}} \sum_{r:\, type(r) = i} SASA(r), \quad i = 1, \ldots, 20. \qquad (7)$$

(c) Amino acid composition of the interface: This feature was computed as in [12].

(d) Correlation between amino acid composition of interface and protein surface: These two features were also calculated as per the method described in [12].

(e) Gap volume index: This feature was computed with the SURFNET program [30], as in [12].

(f) Conservation scores for residues in the interface: This feature was computed, as in [12], by the ConSurf method [31].

We describe the datasets used in terms of the features included, where $n$ is the number of features. We first classified using the four primary features (Table III, n = 4). These features are (a) interface area, (b) interface area ratio, (c) amino acid composition of the interface and (d) correlation between amino acid composition of interface and protein surface. We then added two more features: (e) gap volume index and (f) conservation score of the interface (Table III, n = 6). For the analysis, we used a larger dataset, obtained by adding another feature, (g) area-based amino acid composition (Table III, n = 26). Finally, we added another feature: (h) number-based amino acid composition (Table III, n = 46). We have classified with all these feature dimensions (n = 4, n = 6, n = 26, n = 46; a total of eight properties) and compared the significance and importance of these properties and features.

IV. CLASSIFICATION

In order to classify each complex, a linear algebraic operation $y = Ax$ is first applied to the $n$-dimensional vector, obtaining $y$, a $d$-dimensional vector, where $d$ is ideally much smaller than $n$. The linear transformation matrix $A$ corresponds to the one obtained by one of the LDR methods, namely FDA, HDA or CDA. The resulting vector $y$ is then passed through a quadratic Bayesian (QB) classifier [23], which is the optimal classifier for normal distributions. For additional tests, a linear Bayesian (LB) classifier is considered, obtained by deriving a Bayesian classifier with a common covariance matrix: $S = S_1 + S_2$.

To study the performance of the classifiers, a 10-fold cross-validation procedure was carried out, and then the average accuracy was computed, where the accuracy for each individual fold was computed as $acc = (TP + TN)/N_f$, where $TP$ and $TN$ are the true positive (obligate) and true negative (transient) counts, respectively, and $N_f$ is the total number of complexes in the test set of the corresponding fold. For the LDR schemes, the classifiers implemented and evaluated are the combinations of the three LDR criteria, FDA, HDA and CDA, each with a QB or LB classifier. For each of these classifiers, reductions to dimensions $d = 1, \ldots, 20$ were performed, followed by QB and LB. In the subsequent tables, each column reports the highest average accuracy among all possible reduced dimensions. Since the classification problem is two-class, FDA always leads to reducing to dimension one.
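The evaluation protocol just described can be sketched as follows, assuming the complexes have already been projected with $y = Ax$. This is a hedged illustration rather than the authors' code: the function names are hypothetical, class priors are estimated from class frequencies, and in a faithful protocol the matrix $A$ would presumably be re-estimated on the training part of every fold.

```python
import numpy as np

def fit_gaussians(Y, labels):
    """Estimate mean, covariance and prior of each class from projected data."""
    params = {}
    for c in np.unique(labels):
        Yc = Y[labels == c]
        cov = np.atleast_2d(np.cov(Yc, rowvar=False))
        params[c] = (Yc.mean(axis=0), cov, len(Yc) / len(Y))
    return params

def qb_predict(Y, params):
    """Quadratic Bayesian rule: pick the class with the largest log posterior."""
    classes = sorted(params)
    scores = []
    for c in classes:
        m, S, p = params[c]
        diff = Y - m
        # Squared Mahalanobis distance of every sample to the class mean.
        maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(S), diff)
        _, logdet = np.linalg.slogdet(S)
        scores.append(-0.5 * maha - 0.5 * logdet + np.log(p))
    return np.array(classes)[np.argmax(scores, axis=0)]

def cross_val_accuracy(Y, labels, k=10, seed=0):
    """Average over k folds of acc = (TP + TN) / N_f."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(Y))
    accs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        params = fit_gaussians(Y[train], labels[train])
        pred = qb_predict(Y[fold], params)
        accs.append(np.mean(pred == labels[fold]))  # correct predictions / N_f
    return float(np.mean(accs))
```

A linear Bayesian variant in the spirit of the LB classifier could be obtained by giving every class the same covariance matrix in `fit_gaussians`.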
The best accuracy for each method and each dataset is underlined to indicate the classifier that performed best of all for that dataset. For comparison purposes, we have also trained and tested a support vector machine (SVM) with a radial basis function kernel, and optimized the parameters by performing a grid search.

V. EXPERIMENTS AND DISCUSSION

To present and discuss the results, the following acronyms are used when referring to the different datasets:
MAS = Mintseris et al. dataset, all against all, with SASA
MA = Mintseris et al. dataset, all against all, without SASA
MOS = Mintseris et al. dataset, one against one, with SASA
MO = Mintseris et al. dataset, one against one, without SASA
ZS = Zhu et al. dataset with SASA
Z = Zhu et al. dataset without SASA

Table I. CLASSIFICATION RESULTS FOR DESOLVATION PROPERTIES, ATOM TYPE.
[Columns: FDA, HDA and CDA with the quadratic classifier; FDA, HDA and CDA with the linear classifier. Rows: MAS, MA, MOS and MO (Mintseris et al.); ZS and Z (Zhu et al.). Numeric entries not available.]

Table II. CLASSIFICATION RESULTS FOR DESOLVATION PROPERTIES, AMINO ACID TYPE.
[Columns: FDA, HDA and CDA with the quadratic classifier; FDA, HDA and CDA with the linear classifier. Rows: MAS, MA, MOS and MO (Mintseris et al.); ZS and Z (Zhu et al.). Numeric entries not available.]

The results for the atom type properties are given in Table I, while the results for the amino acid type properties are shown in Table II. For the Mintseris et al. dataset, the best performance was clearly achieved when using atom type features. Among all the atom type feature sets, the one against one dataset weighted with SASA values performed best with the LDR methods combined with the QB classifier. Among the LDR criteria, HDA achieves the best performance, with an accuracy of 80.27%. Across all the feature sets of the Mintseris et al. dataset, desolvation energies for atom type features lead to the best classification accuracies, followed by interface properties and desolvation energies for amino acid type features.
This suggests that desolvation energies are more important at the atom type level in classifying transient and obligate complexes. Additionally, classification on the basis of interface property features yields 79.25% accuracy, which is less than 2% below the best accuracy achieved by desolvation energies at the atom type level. The results for the interface properties are shown in Table III. The classification accuracies in the table show that the LDR methods achieve better performance than the SVM in most of the cases. Looking at the interface property features (Table III), we observe that after adding the 20 area-based amino acid composition features to our primary four-feature datasets, the classification accuracy decreases. Thus, we infer that area-based amino acid composition features do not contribute to the classification of transient and obligate complexes. Then, we added the number-based amino acid composition features to the 24 dataset, and the accuracy increases to 79.25%. From this, we conclude that number-based amino acid composition features are good discriminators of obligate and transient complexes.

Table III. CLASSIFICATION RESULTS FOR INTERFACE PROPERTIES.
[Columns: n; SVM; FDA, HDA and CDA with the quadratic classifier; FDA, HDA and CDA with the linear classifier. Row groups: Mintseris et al.; Zhu et al. Numeric entries not available.]

Desolvation energies for amino acid type features perform slightly worse than atom type features, and as well as interface property features, for all types of chain combinations and both with and without SASA weighting. All the best performances achieved on the amino acid type feature datasets (for all types of chain combinations, with and without SASA weighting) were achieved by LDR methods combined with the QB classifier. Of these, the LDR criterion that achieves the best performance is CDA in all four kinds of datasets.

For the Zhu et al. dataset, we observe that the best performance is achieved, again, using desolvation energies for atom type features. Since in the Zhu et al. dataset there are only two interacting chains in each protein complex, there is no option here to divide it into one against one or all against all combinations. For the desolvation energies for atom type without SASA, 88.68% accuracy was achieved by the HDA criterion combined with a linear classifier. Across all the feature sets of the Zhu et al. dataset, desolvation energies for atom type features achieve the best classification accuracies, followed by interface properties and desolvation energies for amino acid type features. This suggests that desolvation energies for atom type are more important in classifying transient and obligate complexes. For the interface properties (Table III), we again observe superior classification accuracies for the LDR methods, which indicates that the LDR methods perform better than the SVM on these features.
Looking at the interface property features (Table III), we observe that after adding the 20 area-based amino acid composition features to our primary six-feature dataset (accuracy = %), the classification accuracy decreases (accuracy = 78.09%). Thus, we infer that area-based amino acid composition features do not contribute to the classification of transient and obligate complexes. Then, we added the number-based amino acid composition features to the 24 datasets, and the accuracy increases to 81.83%. From this, we conclude that number-based amino acid composition features are good discriminators of obligate and transient complexes. In this dataset, the accuracy obtained when using desolvation energies for atom types is much better than the accuracies obtained with interface properties. We clearly observe from this that the desolvation energies for atom type features lead to better performance in both datasets than the interface property features (interface area, interface area ratio, area-based amino acid composition, number-based amino acid composition, correlation between amino acid compositions of interface and protein surface, gap volume index, and conservation score of the interface).

To conclude the discussion, and as a matter of comparison, we emphasize the following aspects of the proposed approach with respect to previous ones. The proposed method outperformed NOXclass [12] by using additional features that include the area-based and number-based amino acid compositions. The LDR methods outperform the SVM, even when the latter is optimized for the kernel and its parameters. The proposed method reveals that desolvation energies for atom type properties are the best discriminants for transient and obligate complexes on two well-known datasets.

VI. CONCLUSION

We have proposed a classification approach that uses desolvation energy properties to distinguish between transient and obligate protein complexes.
Our classifiers are based on linear dimensionality reduction (LDR) methods that involve homoscedastic and heteroscedastic criteria coupled with quadratic and linear Bayesian classifiers. The results on two datasets of pre-classified complexes show that the LDR schemes coupled with quadratic and linear Bayesian classifiers achieve better classification performance than an SVM with an RBF kernel, and far better performance than previous classification approaches for distinguishing obligate and transient interactions [12] (an increase from 75.2% to 88.68%). Our results also clearly demonstrate that desolvation energies are quite important in distinguishing transient and obligate complexes. Our future work involves the use of this approach in different protein-protein interaction classification problems, including intra and inter domains and homo and hetero-oligomers, and the use of other features such as residual vicinity, shape of the structure of the interface, secondary structure, planarity, physicochemical features, hydrophobicity and others.

ACKNOWLEDGMENTS

This research work has been partially supported by NSERC, the Natural Sciences and Engineering Research Council of Canada, grant No. RGPIN 26360, and the University of Windsor, internal Start-up and VP research equipment grants.

REFERENCES

[1] A. Mendelsohn and R. Brent, "Protein interaction methods--toward an endgame," Science, vol. 284, no. 5422.

[2] M. C. Lawrence and P. M. Colman, "Shape complementarity at protein/protein interfaces," J Mol Biol, vol. 234, no. 4, 1993.
[3] P. Chakrabarti and J. Janin, "Dissecting protein-protein recognition sites," Proteins, vol. 47, no. 3.
[4] A. L. Gnatt, P. Cramer, J. Fu, D. A. Bushnell, and R. D. Kornberg, "Structural basis of transcription: an RNA polymerase II elongation complex at 3.3 A resolution," Science, vol. 292, no. 5523, 2001.
[5] D. Xu, C. Tsai, and R. Nussinov, "Hydrogen bonds and salt bridges across protein-protein interfaces," Protein Eng, vol. 10, no. 9, 1997.
[6] H. Shanahan and J. Thornton, "Amino acid architecture and the distribution of polar atoms on the surfaces of proteins," Biopolymers, vol. 78, no. 6.
[7] B. Ma, T. Elkayam, H. Wolfson, and R. Nussinov, "Protein-protein interactions: structurally conserved residues distinguish between binding sites and exposed protein surfaces," Proc Natl Acad Sci USA, vol. 100, no. 10.
[8] Y. Ofran and B. Rost, "Analysing six types of protein-protein interfaces," J Mol Biol, vol. 325, no. 2.
[9] I. Nooren and J. Thornton, "Diversity of protein-protein interactions," EMBO Journal, vol. 22, no. 14.
[10] S. Jones and J. M. Thornton, "Principles of protein-protein interactions," Proc Natl Acad Sci USA, vol. 93, no. 1, pp. 13-20, 1996.
[11] F. Glaser, D. M. Steinberg, I. A. Vakser, and N. Ben-Tal, "Residue frequencies and pairing preferences at protein-protein interfaces," Proteins, vol. 43, no. 2, 2001.
[12] H. Zhu, F. Domingues, I. Sommer, and T. Lengauer, "NOXclass: Prediction of protein-protein interaction types," BMC Bioinformatics, vol. 7, no. 27, doi:10.1186/
[13] J. Young, "A role for surface hydrophobicity in protein-protein recognition," Protein Sci, vol. 3, 1994.
[14] L. LoConte, C. Chothia, and J. Janin, "The atomic structure of protein-protein recognition sites," J Mol Biol, vol. 285, no. 5, 1999.
[15] Y. Qi, Z. Bar-Joseph, and J. Klein-Seetharaman, "Evaluation of different biological data and computational classification methods for use in protein interaction prediction," Proteins, vol. 63, no. 3.
[16] A. J. Bordner and R. Abagyan, "Statistical analysis and prediction of protein-protein interfaces," Proteins, vol. 60, no. 3.
[17] H. J. Caffrey and S. Somaroo, "Are protein-protein interfaces more conserved in sequence than the rest of the protein surface?" Protein Science, vol. 13.
[18] S. Neuvirth and R. Raz, "ProMate: a structure based prediction program to identify the location of protein-protein binding sites," J Mol Biol, vol. 338, pp. 181-199.
[19] H. Zhou and Y. Shan, "Prediction of protein interaction sites from sequence profile and residue neighbor list," Proteins, vol. 44, no. 3, 2001.
[20] S. H. Park, J. Reyes, D. Gilbert, J. W. Kim, and S. Kim, "Prediction of protein-protein interaction types using association rule based classification," BMC Bioinformatics, vol. 10, no. 36, 2009, doi:10.1186/
[21] J. Mintseris and Z. Weng, "Structure, function, and evolution of transient and obligate protein-protein interactions," Proc Natl Acad Sci USA, vol. 102, no. 31.
[22] L. Rueda and M. Herrera, "Linear Dimensionality Reduction by Maximizing the Chernoff Distance in the Transformed Space," Pattern Recognition, vol. 41, no. 10.
[23] R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed. New York, NY: John Wiley and Sons, Inc.
[24] R. Fisher, "The Use of Multiple Measurements in Taxonomic Problems," Annals of Eugenics, vol. 7, 1936.
[25] M. Loog and P. Duin, "Linear Dimensionality Reduction via a Heteroscedastic Extension of LDA: The Chernoff Criterion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 6.
[26] C. Camacho and C. Zhang, "FastContact: rapid estimate of contact and binding free energies," Bioinformatics, vol. 21, no. 10.
[27] C. Zhang, G. Vasmatzis, J. L. Cornette, and C. DeLisi, "Determination of atomic desolvation energies from the structures of crystallized proteins," J. Mol. Biol., vol. 267, 1997.
[28] H. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. Bhat, H. Weissig, I. Shindyalov, and P. Bourne, "The Protein Data Bank," Nucleic Acids Research, vol. 28.
[29] S. Hubbard and J. Thornton, "NACCESS," computer program, 1993.
[30] R. Laskowski, "SURFNET: a program for visualizing molecular surfaces, cavities and intermolecular interactions," J Mol Graph, vol. 13, no. 5, pp. 323-330, 1995.
[31] A. Armon, D. Graur, and N. Ben-Tal, "ConSurf: an algorithmic tool for the identification of functional regions in proteins by surface mapping of phylogenetic information," J Mol Biol, vol. 307, 2001.
