SUPPLEMENTAL DATA - 1. This file contains: Supplemental methods. Supplemental results. Supplemental tables S1 and S2. Supplemental figures S1 to S4

Protein Disulfide Isomerase is Required for Platelet-Derived Growth Factor-Induced Vascular Smooth Muscle Cell Migration, Nox1 Expression and RhoGTPase Activation

Luciana A. Pescatore 1, Diego Bonatto 2, Fábio L. Forti 3, Amine Sadok 4, Hervé Kovacic 4, Francisco R.M. Laurindo 1

1 Vascular Biology Laboratory, Heart Institute (InCor), University of São Paulo School of Medicine, São Paulo, Brazil, 05403-000
2 Biotechnology Center, Molecular and Cellular Biology Department, Federal University of Rio Grande do Sul, Brazil, 15005
3 Chemistry Institute, University of São Paulo, Brazil, 05508-000
4 INSERM UMR911, Université de la Mediterranée, Marseille, France, 13385

Running title: PDI requirement for redox cell migration and GTPase activity

To whom correspondence should be addressed: Francisco R. M. Laurindo, Heart Institute (InCor), University of São Paulo School of Medicine, Vascular Biology Laboratory. Av. Eneas Carvalho Aguiar, 44, Annex II, 9th floor, CEP 05403-000, São Paulo, Brazil. Phone: 55 (11) 26615968; Fax: 55 (11) 26615920; e-mail: francisco.laurindo@incor.usp.br

SUPPLEMENTAL METHODS

Physical protein-protein interaction (PPPI) network design and global topological analysis

Interactomic data for human PDI proteins were used to identify their potential interactions with other proteins in the context of physical protein-protein interaction (PPPI) networks in Homo sapiens. Data mining and design of a major PPPI network (Figure S2) were performed using Cytoscape software, version 2.6.3 (1). For this purpose, we used the H. sapiens PPPI data available in the STRING 9 database (http://string.embl.de) with the following parameters: all active prediction methods enabled except text mining; no more than 50 interactions; medium or standard confidence score (0.700 or 0.400, respectively); and network depth equal to 1, with addition of new nodes until saturation of the network. The PDI-associated PPPI networks obtained from this first screening were then combined into a single PPPI network using the union function of the Cytoscape core plugin Merge Networks (Figure S2). The major PPPI network was then analyzed with Molecular Complex Detection (MCODE) software (2), a Cytoscape plugin available at http://www.cytoscape.org/plugins2.php, in order to detect subnetworks or clusters of proteins that could represent distinct biological processes. The parameters used for MCODE to generate the subnetworks were: loops included; degree cutoff 2; deletion of singly connected nodes from clusters (haircut option enabled); expansion of clusters by one neighbor shell allowed (fluff option enabled); node density cutoff 0.1; node score cutoff 0.2; k-core 2; and maximum network depth 100.

Network centralities and local topological analyses

A major network centrality (bottleneck nodes) was computed from the major PPPI network using the Cytoscape plugin CytoHubba (3).
A PPPI subnetwork containing the 50 nodes with the highest bottleneck scores was drawn using the Cytoscape plugin CytoHubba (3), available at http://hub.iis.sinica.edu.tw/cytohubba.

Gene ontology analysis

Gene ontology (GO) clustering analysis was performed using Biological Network Gene Ontology (BiNGO) software (4), a Cytoscape plugin available at http://chianti.ucsd.edu/cyto_web/plugins/index.php. The degree of functional enrichment for a given cluster and category was quantitatively assessed (p value) by the hypergeometric distribution (5), and a multiple testing correction was applied using the false discovery rate (FDR) algorithm (6), as fully implemented in BiNGO. Overrepresented biological process categories were identified after FDR correction at a significance level of 0.05.

SUPPLEMENTAL RESULTS

The interactomic data on human PDI proteins obtained from different databases prompted us to ask how PDIs interact with different proteins associated with oxidative stress. We therefore searched for proteins and/or mechanisms, and their associated biological processes, that are potentially affected by PDIs. To this end, different PPPI networks based on Homo sapiens data were retrieved from the STRING database. Shared proteins and subnetworks present in the major PPPI network (Figure S2) were identified and retrieved using the Cytoscape plugin MCODE and subjected to Gene Ontology (GO) analysis in order to obtain information about the nature and number of subgraphs belonging to the network and their associated biological processes. The MCODE and GO results show that the final PPPI network (Figure S2) contains 159 nodes and 565 connectors and is composed of two interconnected clusters, each comprising different biological processes (Figures S3 and S4). GO analyses showed that, for cluster 1, these biological processes can be classified into: (i) intracellular signal transduction, (ii) reactive oxygen species metabolism, (iii) small GTPase-mediated signal transduction, (iv) circulatory and blood system processes, and (v) ER stress response (Table S1 and supplemental data 2).
In turn, cluster 2 comprises biological processes such as: (i) hemopoiesis and leukocyte differentiation, (ii) positive regulation of the JAK-STAT cascade, and (iii) the ER-nucleus signaling pathway (Table S2 and supplemental data 2). As expected, proteins that could not be classified into any cluster were also identified in the network (Figure S2).

Taking into account the data gathered from this initial systems biology analysis, we sought more information about the major nodes of the major PDI-associated PPPI network using network centralities. Network centralities allow us to identify nodes (and the corresponding biological processes) that have a relevant position in the overall network architecture (7), and many network centralities have been developed to evaluate the importance of a node for a given network, e.g., node degree, betweenness, and eigenvector measures (7). Centralities have been applied to quantify the centrality and prestige of actors in social networks (7) and to understand the structure and properties of complex biological, technological and infrastructural networks (6,7). Many of the nodes with elevated centrality values in a given network are important points of vulnerability, indicating that any attack on these nodes could introduce strong perturbations in the network. This graph principle has been exploited to identify proteins that are essential for an organism or that occupy a central position in a biological process, such as bottleneck nodes (7-9). Bottleneck is a local topological measure defined as all nodes with high betweenness values and different node degrees, indicating that those nodes are central points that control the communication between other nodes within the network. Bottlenecks also include all nodes that lie between highly interconnected subgraph clusters, and removing a bottleneck could divide a network (10-12). Bottleneck nodes correspond to highly central proteins that connect several complexes or are peripheral members of central complexes, being important communication points between two complexes (10). Most bottleneck nodes tend to be essential proteins in a network (10). The centrality analysis of the major PDI-associated PPPI network (Figure S2) indicated the presence of 50 bottleneck nodes with different scores (Figure 4), corresponding to approximately 31% of all nodes present in the PPPI network.
Interestingly, the bottlenecks with high scores correspond to proteins associated with small GTPase signaling processes, NADPH oxidases, intracellular signaling, and PDIA2 (Figure 4).
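The role of betweenness in the bottleneck definition above can be illustrated with a minimal sketch. It computes plain betweenness centrality with Brandes' algorithm on an undirected, unweighted graph; it is not the scoring implemented in CytoHubba, and the function name `betweenness` and the toy network (two triangles joined through a single node "X") are hypothetical and not taken from the PPPI data.

```python
from collections import deque

def betweenness(adj):
    """Betweenness centrality of every node in an undirected, unweighted graph.

    adj maps each node to the set of its neighbours (Brandes' algorithm).
    """
    score = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search from source, counting shortest paths.
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        n_paths = {v: 0 for v in adj}    # number of shortest paths from source
        dist = {v: -1 for v in adj}      # -1 marks "not reached yet"
        n_paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair of nodes was counted from both endpoints.
    return {v: s / 2.0 for v, s in score.items()}

# Hypothetical toy network: two triangles joined through node "X".
toy = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "X"},
    "X": {"C", "D"},
    "D": {"X", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}
scores = betweenness(toy)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], scores[ranked[0]])  # prints "X 9.0"
```

In this toy network, node "X" has only two neighbours but lies on every shortest path between the two triangles, so it obtains the highest betweenness. Removing it disconnects the graph, which is the behaviour attributed to bottleneck nodes in the text.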

SUPPLEMENTAL REFERENCES

1. Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003) Genome research 13, 2498-2504
2. Bader, G. D., and Hogue, C. W. (2003) BMC bioinformatics 4, 2
3. Lin, C. Y., Chin, C. H., Wu, H. H., Chen, S. H., Ho, C. W., and Ko, M. T. (2008) Nucleic acids research 36, W438-443
4. Maere, S., Heymans, K., and Kuiper, M. (2005) Bioinformatics 21, 3448-3449
5. Rivals, I., Personnaz, L., Taing, L., and Potier, M. C. (2007) Bioinformatics 23, 401-407
6. Benjamini, Y., and Hochberg, Y. (1995) J Roy Stat Soc B Met 57, 289-300
7. Borgatti, S. P. (2005) Soc Networks 27, 55-71
8. Estrada, E. (2006) Proteomics 6, 35-40
9. Estrada, E., and Hatano, N. (2010) Physica A 389, 3648-3660
10. Yu, H., Kim, P. M., Sprecher, E., Trifonov, V., and Gerstein, M. (2007) PLoS computational biology 3, e59
11. Newman, M. E. J. (2005) Soc Networks 27, 39-54
12. Girvan, M., and Newman, M. E. J. (2002) Proceedings of the National Academy of Sciences of the United States of America 99, 7821-7826

SUPPLEMENTAL TABLES

Table S1. Specific gene ontology (GO) classes derived from the physical protein-protein interaction (PPPI) network observed in cluster 1.

Biological process                                    GO class  p value (a)     Corrected p value (b)  k (c)  f (d)
Intracellular signal transduction                     35556     1.47 × 10^-18   2.53 × 10^-15          35     833
Superoxide metabolic process                          6801      3.62 × 10^-15   7.8 × 10^-13           10     26
Oxygen and reactive oxygen species metabolic process  6800      1.15 × 10^-13   1.64 × 10^-11          12     65
Small GTPase mediated signal transduction             7264      1.69 × 10^-13   2.23 × 10^-11          20     292
Ras protein signal transduction                       7265      9.18 × 10^-11   6.88 × 10^-9           12     112
MAPKKK cascade                                        165       2.74 × 10^-10   1.82 × 10^-8           14     186
Rho protein signal transduction                       7266      8.48 × 10^-7    2.35 × 10^-5           6      41
Response to hydrogen peroxide                         42542     9.52 × 10^-7    2.56 × 10^-5           7      66
Circulatory system process                            3013      1.52 × 10^-5    2.91 × 10^-4           9      181
Blood circulation                                     8015      1.52 × 10^-5    2.91 × 10^-4           9      181
ER stress response                                    6983      7.97 × 10^-5    1.18 × 10^-3           3      11
Response to endoplasmic reticulum stress              34976     1.71 × 10^-4    2.22 × 10^-3           4      35

(a) p values calculated by the hypergeometric distribution of one ontology class visualized in the network.
(b) Calculated values based on p values obtained after FDR was applied.
(c) Total number of proteins found in the network which belong to a gene ontology.
(d) Total number of proteins that belong to a specific gene ontology.
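The hypergeometric p value described in the table footnotes can be sketched as an upper-tail probability. In this minimal example, `k` and `f` follow the table footnotes, while `n` (the number of proteins in the tested cluster) and `N` (the number of annotated proteins in the reference set) are assumed parameters that the tables do not report. The function name and the numbers in the usage line are hypothetical, and the sketch is not BiNGO's own code.

```python
from math import comb

def enrichment_p_value(k, n, f, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k: proteins in the tested set annotated to the GO class
    n: size of the tested set (assumed; not reported in the tables)
    f: proteins in the reference set annotated to the GO class
    N: size of the reference set (assumed; not reported in the tables)
    """
    largest = min(n, f)  # X cannot exceed either the set size or f
    favourable = sum(comb(f, i) * comb(N - f, n - i)
                     for i in range(k, largest + 1))
    return favourable / comb(N, n)

# Hypothetical toy numbers: 2 of 3 sampled proteins carry an annotation
# that 4 of 10 reference proteins have.
p = enrichment_p_value(k=2, n=3, f=4, N=10)
print(p)
```

For the toy numbers the tail is (6·6 + 4·1)/120 = 1/3. Python integers are exact, so the same function also works for reference sets with thousands of proteins.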

Table S2. Specific gene ontology (GO) classes derived from the physical protein-protein interaction (PPPI) subnetwork of unclustered proteins.

Biological process                         GO class  p value (a)    Corrected p value (b)  k (c)  f (d)
Hemopoiesis                                30097     1.86 × 10^-7   7.19 × 10^-5           6      233
Leukocyte differentiation                  2521      2.89 × 10^-7   7.19 × 10^-5           5      127
Hemopoietic or lymphoid organ development  48534     3.48 × 10^-7   7.19 × 10^-5           6      259
Positive regulation of JAK-STAT cascade    46427     4.58 × 10^-4   5.66 × 10^-3           2      27
ER-nucleus signaling pathway               6984      8.18 × 10^-4   8.07 × 10^-3           2      36

(a) p values calculated by the hypergeometric distribution of one ontology class visualized in the network.
(b) Calculated values based on p values obtained after FDR was applied.
(c) Total number of proteins found in the network which belong to a gene ontology.
(d) Total number of proteins that belong to a specific gene ontology.
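The FDR correction behind the corrected p values can be sketched with the Benjamini-Hochberg step-up procedure (6). This is a minimal illustration that treats the supplied list as the complete set of tests, which is an assumption: the tables list only selected GO classes, so the sketch does not reproduce the tabulated corrected values. The function name and the input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    # Indices of the p values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Go from the largest p value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p values for four GO classes.
raw = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(raw)
significant = [q <= 0.05 for q in corrected]
print(corrected, significant)
```

The running minimum is what lets several classes share one corrected value. Tied entries such as the three 7.19 × 10^-5 values in Table S2 are the kind of result this step may produce.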

SUPPLEMENTAL FIGURE LEGENDS

Figure S1. Effects of diphenyleneiodonium on VSMC migration and ROS production. A) VSMC migration, analyzed by Boyden chamber assay, in VSMC exposed or not to PDGF (100 ng/ml); B) Whole-cell production of superoxide (followed by the 2-hydroxyethidium signal, EOH) and other oxidants (followed by the ethidium signal, E) at baseline or after PDGF (100 ng/ml; 2 h).

Figure S2. A physical protein-protein interaction (PPPI) network obtained from human PDIA2 interactomic data. The subnetworks and unclustered nodes that compose the PPPI network are indicated by nodes with different shapes (inset). Bottleneck nodes are represented in the PPPI network by a color scale that indicates their bottleneck score (from the highest to the lowest value; see figure inset and supplementary material for additional information).

Figure S3. Subnetwork of proteins (cluster 1) associated with intracellular signal transduction, reactive oxygen species metabolism, small GTPase-mediated signal transduction, circulatory and blood system processes, and ER stress response. Bottleneck nodes are represented in cluster 1 by a color scale that indicates their bottleneck score (from the highest to the lowest value; see figure inset and supplementary material 2 for additional information).

Figure S4. Subnetwork of proteins (cluster 2) associated with hemopoiesis and leukocyte differentiation, positive regulation of the JAK-STAT cascade, and the ER-nucleus signaling pathway. Bottleneck nodes are represented in cluster 2 by a color scale that indicates their bottleneck score (from the highest to the lowest value; see figure inset and supplementary material 2 for additional information).

SUPPLEMENTAL FIGURES

Supplemental Figure S1
Supplemental Figure S2
Supplemental Figure S3
Supplemental Figure S4