Evolutionary Analysis of Functional Modules in Dynamic PPI Networks

Nan Du, Computer Science and Engineering Department
Jing Gao, Computer Science and Engineering Department
Stanley A. Schwartz, Department of Medicine
Yuan Zhang, College of Electronic Information and Control Engineering, Beijing University of Technology, Beijing, China, zhangyuan@ s.bjut.edu.cn
Kang Li, Computer Science and Engineering Department, kli22@buffalo.edu
Supriya D Mahajan, Department of Medicine, smahajan@buffalo.edu
Aidong Zhang, Computer Science and Engineering Department, azhang@buffalo.edu
Bindukumar B Nair, Department of Medicine, bnair@buffalo.edu

ABSTRACT

Functional module detection in Protein-Protein Interaction (PPI) networks is essential to understanding the organization, evolution and interaction of cellular systems. In recent years, most research has focused on detecting functional modules in static PPI networks. However, the structure of a PPI network can change in response to stimuli, which changes both the composition and the functionality of these modules. These changes occur gradually and can be thought of as an evolution of the functional modules. In our opinion, the evolutionary analysis of functional modules is key to gaining insight into their underlying behavior, particularly when targeting complex living systems. In this paper, we propose a novel computational framework which integrates a PPI network with multiple dynamic gene coexpression networks to categorize and track the evolutionary patterns of functional modules over consecutive timestamps. We first propose a method to construct dynamic PPI networks, and then design a new functional influence based algorithm to detect the functional modules in these dynamic PPI networks. Based on the results of this approach, we provide a simple but effective method to characterize and track the evolutionary patterns of dynamic modules, which involves detecting evolutionary events between modules found at consecutive timestamps. Extensive experiments on a fermentation process dataset of S. cerevisiae show that the proposed framework not only outperforms previous functional module detection methods, but also efficiently tracks the evolutionary patterns of functional modules.

Categories and Subject Descriptors
J.3 [Life And Medical Sciences]: Biology and Genetics

General Terms
ALGORITHMS

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ACM-BCB 12, October 7-10, 2012, Orlando, FL, USA Copyright 2012 ACM /12/10...$

1. INTRODUCTION

Protein-Protein Interaction (PPI) networks help us systematically analyze the structure of a large living system and understand principles such as essentiality, protein interactions, functional modules and cellular pathways. The identification of functional modules in PPI networks is of great interest because it often reveals unknown functional ties between proteins and thus helps in predicting the functions of unknown genes. However, traditional functional module detection approaches treat the PPI network as a static graph, derived either from data fixed at a certain timestamp or from data aggregated over a period. These approaches ignore the temporal evolution of the functional modules, which can offer biologists valuable insights. Without capturing the inherent dynamic characteristics of the PPI networks, one may miss the opportunity to capture the evolutionary patterns of functional modules.

Protein-protein interactions are often subject to external stimuli, which change the structure of the network during development. These dynamically varying interactions, sometimes referred to as transient interactions, are caused by stimuli that may be either reactive (caused by exogenous factors, such as a response to an environmental stimulus) or programmed (due to endogenous signals, such as cell-cycle dynamics or developmental processes) [23]. The functional modules detected at each timestamp may likewise evolve as the protein interactions change over time. Specifically, detecting functional module evolution, that is, how a module's functions change over time, provides insight into the underlying behavior of the molecular system. For example, network dynamics can describe how cells respond to environmental cues or how an interaction network changes during development. The temporal evolution of functional modules is also useful for monitoring the development and outcome of chronic and genetic diseases. We therefore believe that tracking the evolution of functional modules and proteins in dynamic PPI networks is promising.

In this paper, we propose a framework to categorize and track the evolutionary patterns of functional modules over consecutive timestamps. We begin by constructing a series of dynamic PPI networks from both the PPI network and the dynamic gene coexpression networks at the various timestamps. We then solve the functional module detection problem with a novel functional influence based algorithm which quantifies the influence of one biological component on another.
In addition, the proposed functional module detection method maintains a certain level of module equivalence between consecutive timestamps; the detailed definition is discussed in Section 2.2. Finally, we capture complex evolutionary patterns of functional modules over time by analyzing the key evolutionary events among modules at consecutive timestamps.

In summary, our paper makes three main contributions: (i) we propose a novel method to construct dynamic PPI networks by integrating the static PPI network with dynamic gene coexpression networks; (ii) we propose a new functional influence based functional module detection algorithm in which the detected modules are allowed to overlap and do not change dramatically over a short time; (iii) we provide a model for tracking the evolutionary process of functional modules over time. To the best of our knowledge, this is the first work to analyze the evolutionary patterns of functional modules over consecutive timestamps.

The rest of the paper is organized as follows. The proposed approach is presented in Section 2. Extensive experimental results are shown in Section 3. Finally, we conclude our work in Section 4.

2. METHOD

We begin by introducing the method for constructing the dynamic PPI networks in Section 2.1. In Section 2.2, we present the functional influence based algorithm used for detecting the functional modules. Finally, the model we use for tracking the evolution of the functional modules is presented in Section 2.3.

2.1 Dynamic PPI Network Construction

Several researchers have worked on integrating static data with dynamic data to discover the temporal evolution of protein interaction networks. Han et al. integrated PPI networks with gene expression data and suggested that some modules are active at specific times and locations [8]. Qi et al. further noted that the integration of a variety of datasets, including binary interactions, protein complexes and expression profiles, enables the identification of subnetworks that are active under certain conditions [17]. In order to discover the temporal evolution of functional modules, we integrate the static PPI network with a series of dynamic gene coexpression networks.

Given a PPI network $P = (V, E)$, where $V$ is a set of proteins and $E$ is a set of interactions between these proteins, let $M_1, M_2, \dots, M_T$ be a set of $|V| \times n$ gene expression matrices, where $T$ is the number of timestamps and $n$ is the number of samples (replicates) in the experiments. Our goal is to construct $T$ dynamic PPI networks $D_1, D_2, \dots, D_T$, each of which is a $|V| \times |V|$ matrix. Note that each gene expression matrix $M_i$ and dynamic PPI network $D_i$ ($1 \le i \le T$) corresponds to a specific timestamp $i$.

Before constructing the dynamic PPI networks, we first need to construct a series of gene coexpression networks $G_1, G_2, \dots, G_T$. Gene coexpression networks have been used to demonstrate that functionally related genes are frequently coexpressed across multiple datasets and across different organisms [10], and to estimate the underlying regulatory relationships between genes under various experimental conditions [1]. By constructing a specific gene coexpression network at each timestamp, e.g., at the early, intermediate and terminal stages of a certain disease, it is possible to identify disease-mediated changes in the network connectivity patterns. For each gene pair, the absolute Pearson correlation coefficient of their expression profiles across samples is calculated, and the output is a $|V| \times |V|$ correlation matrix which represents the expression similarity between each gene pair. From these correlation matrices we construct the gene coexpression networks, where each node is a gene and each edge indicates that the correlation between two genes is greater than a cutoff threshold.
This cutoff threshold is used to remove all but the most likely biologically significant relationships, and we choose it based on the average correlation similarity of each correlation matrix. Combining the static PPI network with time course gene expression data leads to a better understanding of protein or gene function and reveals global changes in network topology that hint at higher level cellular organizational principles and functions [16]. Furthermore, by integrating the static PPI network with time course gene expression data, we can regulate the changes in protein relationships and also track the evolutionary process of the functional modules.

After we obtain the gene coexpression networks $G_1, G_2, \dots, G_T$, we integrate them with the PPI network $P$ by the following rule: if an interaction exists in both the PPI network $P$ and the $i$-th dynamic gene coexpression network $G_i$, it is added to the $i$-th dynamic PPI network $D_i$. Otherwise, we consider that there is no interaction between this protein pair at this timestamp. An example of constructing dynamic PPI networks is presented in Figure 1.
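The construction in Section 2.1 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names `coexpression_network` and `dynamic_ppi` are hypothetical, the PPI network is assumed to be a symmetric boolean adjacency matrix, and the cutoff is assumed to be exactly the mean of the off-diagonal absolute correlations (the text only says the cutoff is based on the average correlation similarity).

```python
import numpy as np

def coexpression_network(expr):
    """Boolean |V| x |V| coexpression adjacency from a |V| x n expression matrix."""
    corr = np.abs(np.corrcoef(expr))           # absolute Pearson correlation between gene rows
    np.fill_diagonal(corr, 0.0)                # ignore self-correlation
    off_diag = ~np.eye(len(corr), dtype=bool)
    cutoff = corr[off_diag].mean()             # assumption: cutoff = mean pairwise correlation
    return corr > cutoff

def dynamic_ppi(ppi, expr_by_time):
    """Keep a PPI edge at timestamp i only if it is also a coexpression edge in G_i."""
    return [ppi & coexpression_network(m) for m in expr_by_time]
```

Calling `dynamic_ppi(ppi, [M_1, ..., M_T])` would return one boolean matrix per timestamp, playing the role of $D_1, \dots, D_T$.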

Figure 1: An example of constructing dynamic PPI networks at five timestamps.

2.2 Functional Influence based Functional Module Detection

In recent years, many methods have been developed to detect functional modules in a PPI network, such as Markov Clustering (MCL) [5], a fast stochastic flow based graph clustering algorithm, hierarchical clustering [7] and spectral clustering [24]. Two of our previous algorithms based on functional influence have also been proposed; they efficiently analyze large, complex PPI networks [3, 20]. The functional influence algorithm was first proposed by Nabieva et al. [13]. Its basic idea is that influence is propagated from the source proteins to their surrounding neighborhoods, and this process is repeated for each protein until each protein in the graph has an influence score. This influence score represents the amount of functional influence received by the protein for a given function. However, since these approaches are not designed for clustering dynamic graphs, they do not consider the temporal characteristics of dynamic PPI networks, where the interactions between proteins continuously evolve. Therefore, we design a novel functional influence based method which can effectively identify protein functional modules that reflect the temporal evolution over consecutive timestamps. Our method also allows modules to overlap and can automatically estimate the optimal number of modules at each timestamp.

The Principle of Module Equivalence. Since living systems are subject to external stimuli, the interactions between proteins also evolve with time, which raises a new challenge for traditional clustering algorithms. In our case the clusters evolve continuously, which differs from the setting that traditional clustering algorithms usually handle, so some new considerations are needed.
On one hand, we expect to detect functional modules that depend on the current PPI network; on the other hand, we also expect that the detected functional modules do not deviate too dramatically from the PPI network of the previous timestamp. Similar principles have also been used in [2]. In other words, since a living system is more likely to change gradually than dramatically, we expect a certain level of module equivalence between the functional modules detected at consecutive timestamps. Moreover, in many cases a dramatic change of functional modules over a short time could be due to noise, which may come from sample contamination, experimental design or the clustering method. Fulfilling module equivalence can also help generate more robust results that are not sensitive to noise; this is validated in our experiments.

Figure 2: An example illustrating module equivalence. (a) The clustering results evolve gradually; (b) the clustering results change dramatically.

Consider the simple example shown in Figure 2. There are two clustering results, (a) and (b), of 7 proteins over 3 timestamps, where each node is a protein and the nodes enclosed together denote a cluster. The proteins partitioned into the same cluster are stable in result (a), where each cluster changes gradually over time. In contrast, the proteins partitioned together in result (b) change dramatically. Therefore, according to the principle of module equivalence, (a) should be preferred. It is also easier and more reasonable to track the evolutionary patterns of functional modules in (a) than in (b).

To achieve a certain level of module equivalence between functional modules at consecutive timestamps, we propose a method to construct a series of weighted dynamic PPI networks, which takes the PPI network from the previous timestamp into account and guarantees that the modules change smoothly over consecutive timestamps. Given the $T$ unweighted dynamic PPI networks $D_1, D_2, \dots, D_T$ introduced in Section 2.1, we aim at constructing $T$ weighted dynamic PPI networks $WD_1, WD_2, \dots, WD_T$, where each dynamic PPI network can be represented as $WD_i = (V_i, E_i)$. The weight between proteins $u$ and $v$ in $WD_i$ is defined as:

$$WD^{i}_{uv} = \begin{cases} \alpha, & \text{if } D^{i-1}_{uv} = 1 \ \text{xor}\ D^{i}_{uv} = 1, \\ \beta, & \text{if } D^{i-1}_{uv} = 1 \ \text{and}\ D^{i}_{uv} = 1, \\ 0, & \text{otherwise,} \end{cases} \tag{1}$$

where $\alpha$ and $\beta$ are pre-set weights with $0 \le \alpha < \beta \le 1$. The assumption is that the weight of an interaction between proteins $u$ and $v$ at the $i$-th timestamp is based on both unweighted dynamic PPI networks $D_{i-1}$ and $D_i$. If a particular interaction exists at both of these consecutive timestamps, we have high confidence that it is reliable and stable, and thus it is assigned a high weight $\beta$. If the interaction exists at only one of the two consecutive timestamps, we are less confident that it does not come from noise, and thus it is assigned a relatively low weight $\alpha$. In other words, we use the previous PPI network as evidence to weigh the current network. When $i = 1$ there is no previous timestamp, so $WD^{1}_{uv} = \alpha$ if there is an interaction between proteins $u$ and $v$ in $D_1$. In our experiments, we set $\alpha = 0.1$ and $\beta = 0.2$.

Functional Flow Model. Based on the weighted dynamic PPI networks $WD_i$ ($1 \le i \le T$), we design a modified influence based functional module detection algorithm. We first select, based on the weighted degrees of the proteins, a source protein set $S$ whose members are the starting points for propagating the influence. Previous research [9] has observed that the connectivity of nodes in biological networks plays a crucial role in cellular functions. The weighted degree of protein $u$, denoted $d(u)$, is the sum of the weights between $u$ and its neighbors, as shown in Eq. 2, where $N(u)$ is the set of neighbors of protein $u$ and $w_{uv}$ is the weight of the edge between proteins $u$ and $v$:

$$d(u) = \sum_{v \in N(u)} w_{uv}. \tag{2}$$

Secondly, we assign an initial influence weight to each source protein $s$ ($s \in S$) and propagate the weight to its neighbors $x$. The initial flow $f(s \to x)$ from $s$ to $x$ is computed as:

$$f(s \to x) = \frac{w_{sx}}{\sum_{z \in N(s)} w_{sz}} \, F(s), \tag{3}$$

where $F(s)$ is the initial influence score of the source protein, which we set to the constant value 1, and the fraction is the weight of the edge between $s$ and $x$ normalized over all edges incident to $s$. The influence score of $x$ is then updated by summing all incoming flows from its neighbors, as shown in Eq. 4, where the subscript $s$ marks the source protein from which the flow originates:

$$F(x) = \sum_{u \in N(x)} f_s(u \to x). \tag{4}$$

After updating its influence weight, $x$ propagates it to its neighbors:

$$f(x \to y) = \frac{w_{xy}}{\sum_{z \in N(x)} w_{xz}} \, F(x). \tag{5}$$

The flow $f(x \to y)$ is removed if it is less than a threshold $\theta_{flow}$. Eq. 4 and Eq. 5 are repeated until there is no more flow in the network. At the end of the flow simulation we obtain a flow pattern, which is an $|S| \times |V|$ matrix in which each row holds the cumulative functional influence received by every protein from a particular source protein $s$. The functional influence profile of a protein is a vector in which each item reflects the functional influence received from one source protein in the network. In the flow pattern, all the proteins whose functional influence score for a source is higher than the threshold $\theta_{flow}$ are grouped into one functional module.

Merging Preliminary Modules. The preliminary modules extracted from the flow pattern typically overlap, since a protein may receive high functional influence from multiple source proteins. However, the quality of these preliminary modules depends mainly on the source protein selection. By merging similar preliminary modules which share a large fraction of their members, we obtain final modules with higher accuracy.
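The edge weighting of Eq. 1 and the flow simulation of Eqs. 2 to 5 can be sketched as follows. This is only one possible reading, not the authors' code: the function names are hypothetical, networks are assumed to be symmetric NumPy matrices, and, because the text does not say how the simulation is prevented from bouncing flow back and forth, the sketch assumes that a protein forwards flow only to proteins that have not yet been reached from the same source.

```python
import numpy as np

def weighted_network(prev, cur, alpha=0.1, beta=0.2):
    """Eq. 1: weight the current boolean network using the previous one."""
    if prev is None:                           # first timestamp: no previous network
        return np.where(cur, alpha, 0.0)
    return np.where(prev & cur, beta, np.where(prev ^ cur, alpha, 0.0))

def flow_from_source(W, s, theta=0.02):
    """Cumulative influence each protein receives from source s (Eqs. 3-5)."""
    strength = W.sum(axis=1)                   # weighted degree d(u), Eq. 2
    total = np.zeros(len(W))
    total[s] = 1.0                             # F(s) = 1
    frontier, visited = {s: 1.0}, {s}
    while frontier:
        incoming = {}
        for x, fx in frontier.items():
            for y in map(int, np.flatnonzero(W[x])):
                if y in visited:               # assumption: no flow back to reached proteins
                    continue
                f = W[x, y] / strength[x] * fx
                if f >= theta:                 # flows below theta_flow are removed
                    incoming[y] = incoming.get(y, 0.0) + f
        visited.update(incoming)
        for y, f in incoming.items():
            total[y] += f
        frontier = incoming
    return total

def preliminary_module(W, s, theta=0.02):
    """Proteins whose influence from source s exceeds theta_flow."""
    return set(map(int, np.flatnonzero(flow_from_source(W, s, theta) > theta)))
```

Stacking `flow_from_source(W, s)` for every source in $S$ would give the $|S| \times |V|$ flow pattern described above; how many sources to pick by weighted degree is left open here because the text does not specify it.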
Merging the similar preliminary modules is an important step in generating the final modules [6]. Since the final modules are merged from overlapping preliminary modules, they also overlap. Real functional modules are likely to overlap, since a molecule may generally take part in different biological processes or functions in different environments [26]. In our work, we set $\theta_{flow} = 0.02$. We use a hierarchical clustering algorithm to merge the preliminary modules based on the Jaccard index between modules [25]. However, one difficult issue in functional module detection is determining the number of clusters. Classic hierarchical clustering algorithms suffer from the limitation that the number of clusters must be specified by the user. It is impractical to expect that we have sufficient domain knowledge to determine the number of modules for each timestamp, and it is unreasonable to assume that the number of clusters is the same at each timestamp. Therefore, we use the L-curve method of [19], which exploits the knee shape of the evaluation graph to automatically estimate the appropriate number of functional modules. In our method the number of clusters is thus unbounded, and an optimal number is determined automatically.

2.3 Evolutionary Events

Recently, a few approaches have been proposed to characterize the evolution of clusters over consecutive timestamps in social networks. Takaffoli et al. [22] described an event-based framework to track the transitions between clusters at consecutive timestamps, and in a later work improved the event formulae to track the entire observation time [21]. All these works use a two-stage approach in which the clusters are first detected independently at each timestamp and then matched to determine the critical evolutionary events.
As mentioned before, each set of functional modules we detect is influenced by two consecutive timestamps simultaneously, which makes our framework different. We believe that analyzing the evolutionary patterns of the functional modules detected at each timestamp, including form, dissolve, continue, merge and split, can help us discover underlying evolutionary trends or behaviors of different diseases or species.

We state the problem of characterizing the evolutionary patterns of functional modules in dynamic PPI networks as follows. At a particular timestamp $i$, we detect $k_i$ functional modules from the weighted dynamic PPI network $WD_i$ described in the previous section, denoted $C^{i} = \{C^{i}_{1}, C^{i}_{2}, \dots, C^{i}_{k_i}\}$. Note that the modules generated by our method may overlap. The evolutionary pattern of functional modules can be represented as a sequence of key evolutionary events (changes) between consecutive timestamps. These key evolutionary events cover the evolution of functional modules and can be formulated as a set of rules. We use the definition of transitionary events from [21], but we focus on tracking the informative events between consecutive timestamps rather than over the entire observation period. Given a module $C^{i}_{x}$ from the $i$-th timestamp, the metric which finds the module with the highest similarity to $C^{i}_{x}$ at the $(i+1)$-th timestamp is defined as:

$$\mathrm{track}(C^{i}_{x}, i+1) = C^{i+1}_{y} \iff C^{i+1}_{y} = \arg\max_{C^{i+1}_{z} \in C^{i+1}} \left\{ \frac{|V^{i}_{x} \cap V^{i+1}_{z}|}{\max(|V^{i}_{x}|, |V^{i+1}_{z}|)} \ge \alpha \right\}, \tag{6}$$

where $V^{i}_{x}$ is the set of proteins of $C^{i}_{x}$, and the overlap threshold $\alpha$ (not to be confused with the edge weight $\alpha$ of Section 2.2) defines whether two modules are matched; it is also used in the definitions of the evolutionary events below. So $\mathrm{track}(C^{i}_{x}, i+1)$ denotes the optimal matching module for $C^{i}_{x}$ at the $(i+1)$-th timestamp. If none of the modules in $C^{i+1}$ reaches the overlap ratio $\alpha$, then $\mathrm{track}(C^{i}_{x}, i+1) = \emptyset$ ($\emptyset$ denotes an empty matching result). This metric can also be used in the reverse direction with a simple revision.
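The matching metric of Eq. 6 translates almost directly into code. The sketch below is illustrative only: modules are assumed to be non-empty Python sets of protein identifiers, the names `overlap_ratio` and `track` are chosen for this example, and the default threshold of 0.6 is simply one of the values explored in the experiments.

```python
def overlap_ratio(a, b):
    """|a ∩ b| / max(|a|, |b|), the similarity used in Eq. 6 (non-empty sets assumed)."""
    return len(a & b) / max(len(a), len(b))

def track(module, candidates, alpha=0.6):
    """Best-matching module among `candidates`, or None if no overlap reaches alpha."""
    best = max(candidates, key=lambda c: overlap_ratio(module, c), default=None)
    if best is None or overlap_ratio(module, best) < alpha:
        return None                            # plays the role of the empty match
    return best
```

Passing the modules of timestamp $i-1$ as `candidates` gives the reverse-direction variant mentioned above.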
The five evolutionary events are formally defined as follows.

Form. A particular functional module $C^{i}_{x}$ is marked as form if it did not exist in the previous timestamp. More specifically, a form indicates that it is the first time a set of proteins is grouped together to perform some function; examples are modules $C^{1}_{1}$, $C^{1}_{2}$ and $C^{2}_{4}$ in Figure 3. Thus module $C^{i}_{x}$ is formed in the $i$-th timestamp iff:

$$\mathrm{track}(C^{i}_{x}, i-1) = \emptyset. \tag{7}$$

Dissolve. A dissolve occurs for a particular functional module $C^{i}_{x}$ if no similar module exists in the next timestamp. Specifically, a dissolve indicates that it is the last time a set of proteins is grouped together to perform some function; an example is module $C^{1}_{3}$ in Figure 3. Formally, a module $C^{i}_{x}$ in the $i$-th timestamp is defined as dissolve iff:

$$\mathrm{track}(C^{i}_{x}, i+1) = \emptyset. \tag{8}$$

Continue. A continue occurs if a functional module $C^{i+1}_{y}$ detected at timestamp $i+1$ is close to a module $C^{i}_{x}$ at the previous timestamp $i$. We then say $C^{i+1}_{y}$ is the continuation of $C^{i}_{x}$ in the next timestamp. It can also be considered a module which continues its existence over consecutive timestamps. Note that we do not require the two modules to be identical. In Figure 3, module $C^{2}_{3}$ is the continuation of module $C^{1}_{2}$. Formally, a module $C^{i}_{x}$ in the $i$-th timestamp continues its existence to the $(i+1)$-th timestamp iff:

$$\exists\, C^{i+1}_{y} \in C^{i+1} : \mathrm{track}(C^{i}_{x}, i+1) = C^{i+1}_{y}. \tag{9}$$

Split. If a particular functional module $C^{i}_{x}$ in the $i$-th timestamp is matched to a set of modules $C^{i+1}_{*} = \{C^{i+1}_{1}, C^{i+1}_{2}, \dots, C^{i+1}_{k}\}$ in the following $(i+1)$-th timestamp, then we say $C^{i}_{x}$ is split into $C^{i+1}_{1}, C^{i+1}_{2}, \dots, C^{i+1}_{k}$; note that $C^{i+1}_{*} \subseteq C^{i+1}$. For example, in Figure 3, module $C^{1}_{1}$ is split into two modules, $C^{2}_{1}$ and $C^{2}_{2}$, in the next timestamp. Formally, a module $C^{i}_{x}$ in the $i$-th timestamp is split into a set of modules $C^{i+1}_{1}, C^{i+1}_{2}, \dots, C^{i+1}_{k}$ in the $(i+1)$-th timestamp iff:

$$\exists\, C^{i+1}_{*} = \{C^{i+1}_{1}, C^{i+1}_{2}, \dots, C^{i+1}_{k}\} \subseteq C^{i+1} : \forall\, C^{i+1}_{y} \in C^{i+1}_{*} : \frac{|V^{i}_{x} \cap V^{i+1}_{y}|}{|V^{i+1}_{y}|} \ge \alpha. \tag{10}$$

Merge. If a particular functional module $C^{i+1}_{x}$ in the $(i+1)$-th timestamp is matched to a set of modules $C^{i}_{*} = \{C^{i}_{1}, C^{i}_{2}, \dots, C^{i}_{k}\}$ in the previous $i$-th timestamp, then we say $C^{i+1}_{x}$ is merged from $C^{i}_{1}, C^{i}_{2}, \dots, C^{i}_{k}$, with $C^{i}_{*} \subseteq C^{i}$. For example, in Figure 3, module $C^{3}_{2}$ is merged from three modules, $C^{2}_{2}$, $C^{2}_{3}$ and $C^{2}_{4}$, in the previous timestamp. Formally, a set of modules $C^{i}_{1}, C^{i}_{2}, \dots, C^{i}_{k}$ in the $i$-th timestamp is merged into a module $C^{i+1}_{x}$ in the $(i+1)$-th timestamp iff:

$$\exists\, C^{i}_{*} = \{C^{i}_{1}, C^{i}_{2}, \dots, C^{i}_{k}\} \subseteq C^{i} : \forall\, C^{i}_{y} \in C^{i}_{*} : \frac{|V^{i}_{y} \cap V^{i+1}_{x}|}{|V^{i}_{y}|} \ge \alpha. \tag{11}$$

Figure 3: An example of functional module evolution over three timestamps, which includes the five evolutionary events: form, dissolve, continue, split and merge.

3. EXPERIMENTS

In this section, we show the experimental results of our proposed framework.

3.1 Dataset

To construct the dynamic PPI networks, we used two data sources: the static PPI network and the time course gene expression data.

Time Course Gene Expression Data. We use a time course gene expression dataset which captures the response of S. cerevisiae during a 15-day wine fermentation, the process in which S. cerevisiae turns the sugar of crushed grapes into alcohol. The dataset consists of seven timestamps (0, 12, 24, 48, 60, 120 and 340 hours, which correspond to different ethanol concentrations), and one gene expression matrix is created at each timestamp. In order to have a high coverage of the PPI network, we used the 1285 genes which have the most known interactions in DIP's PPI dataset (as listed at nandu/genenames.docx). For each of the 1285 genes, the primary data consist of three independent biological samples at each of the seven timestamps. The raw microarray data were published on Apr. 17, 2008 and are available in the National Center for Biotechnology Information (NCBI) database under accession number GSE8536 [12]. In our experiments, we set the cutoff thresholds for the correlation matrices of the seven timestamps to 0.76, 0.76, 0.83, 0.79, 0.73, 0.76 and 0.70, respectively, corresponding to their average correlation similarity.

PPI Network. We used the S. cerevisiae data from the Database of Interacting Proteins (DIP), which was updated on Feb. 28. The S. cerevisiae PPI dataset contains 22,418 interactions in total.

3.2 Similarity between Functional Modules over Timestamps

As mentioned before, a real cellular system evolves gradually over time; thus we believe that the functional modules detected at each timestamp should change smoothly rather than dramatically. We assessed the similarity of the functional modules across timestamps by comparing the proposed method with several classical clustering methods: K-means, hierarchical clustering, fuzzy c-means clustering (FCM) and spectral clustering. Since these baseline algorithms require the number of clusters K to be preset, we tested each algorithm with both K = 15 and K = 30. Among these baselines, K-means, hierarchical clustering and spectral clustering are non-overlapping clustering algorithms, whereas fuzzy c-means is an overlapping clustering algorithm in which each node has a membership value for each cluster. In our experiments, if the membership value of a node $x$ for a cluster $C^{i}_{j}$ is larger than 0.1, we assign $x$ to $C^{i}_{j}$. We also report the performance of our proposed method without considering module equivalence across consecutive timestamps.

To measure the similarity between functional modules, we use the Jaccard index, which lies between 0 and 1 and is defined as:

$$J(C^{i}_{x}, C^{i+1}_{y}) = \frac{|V^{i}_{x} \cap V^{i+1}_{y}|}{|V^{i}_{x} \cup V^{i+1}_{y}|}. \tag{12}$$

For each module at a given timestamp we take its maximal Jaccard value against the modules of the next timestamp, and we average these maxima to obtain the final result. A high value indicates that the modules detected at the two timestamps are similar, and a low value that they are dissimilar. The results of all the methods are shown in Table 1. Our proposed method shows higher module similarity over all timestamps than the other methods, since the baseline algorithms only consider the PPI network at the current timestamp. This demonstrates that our proposed framework properly handles the smooth evolution of the functional modules.

3.3 Functional Module Identification

To evaluate the effectiveness of our proposed framework, we used FunCat from the MIPS database as the functional annotation. The MIPS Functional Catalogue (FunCat) [18] is an annotation scheme for the functional description of proteins of prokaryotic and eukaryotic origin, and we used the top four levels of FunCat for validation.
For statistical evaluation of the detected modules, we used the p-value from the hypergeometric distribution, which is defined as:

$$p = 1 - \sum_{i=0}^{m-1} \frac{\binom{X}{i}\binom{|V| - X}{n - i}}{\binom{|V|}{n}}, \tag{13}$$

where $|V|$ is the number of proteins in the PPI network, $X$ is the number of proteins in a reference function, $n$ is the size of the module, and $m$ is the number of proteins shared by the function and the module. It is the probability that at least $m$ proteins in a module of size $n$ are included in a reference function of size $X$. A low p-value indicates that the module closely corresponds to the function, since it is unlikely that the network would produce the module by chance. As before, we assessed the performance of the proposed algorithm by comparing it with the baseline algorithms described in Section 3.2. The results are shown in Table 2. Our proposed framework outperforms the baseline algorithms at every timestamp. This result indicates two things: 1) by following the principle of module equivalence, our functional influence based method provides more robust functional modules which are not sensitive to noise; and 2) our functional influence based overlapping functional module detection algorithm is more effective.

3.4 Informative Module Identification

In this part, we used the evolutionary events defined in Section 2.3 to track the informative behavioral patterns in the evolving graph. We define a core-module as the intersection of a series of modules which are linked into a connected graph by the evolutionary events at different timestamps; it represents the evolution of its constituent modules, ordered by time, over the entire observation period. More specifically, the core-module is denoted $M = \{C^{t_1}_{k_1} \cap C^{t_2}_{k_2} \cap \dots \cap C^{t_m}_{k_m}\}$, where $t_1 < t_2 < \dots < t_m$. By tracking the critical evolutionary events between timestamps, we found some interesting results.
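As a small illustration of the core-module definition, the sketch below intersects a time-ordered chain of modules. It assumes that the chain has already been extracted by following evolutionary events between consecutive timestamps and that each module is a Python set of protein identifiers; the function name `core_module` and the toy identifiers are hypothetical.

```python
from functools import reduce

def core_module(chain):
    """Proteins shared by every module in a time-ordered chain of linked modules."""
    return reduce(lambda a, b: a & b, chain)

# Toy chain of three linked modules at timestamps t1 < t2 < t3
chain = [{"p1", "p2", "p3"}, {"p1", "p2", "p4"}, {"p1", "p2"}]
core = core_module(chain)
```

Here `core` contains only the proteins that stay grouped together at every timestamp of the chain.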
Figure 4 shows the evolving graphs for four values of $\alpha$: 0.6, 0.7, 0.8 and 0.9. In the evolving graph, each node is a functional module detected at a particular timestamp and each edge is an event linking modules at two consecutive timestamps. As Figure 4 shows, the number of detected evolutionary events decreases as $\alpha$ increases, and the backbone of the evolution becomes clearer. Finally, when $\alpha = 0.9$, we can detect a module which is consistent over all timestamps. To make this clearer, we extracted this module and drew it with dashed lines in Figure 4(d). The core-module is $M = \{C^{1}_{1} \cap C^{2}_{2} \cap C^{3}_{2} \cap C^{4}_{1} \cap C^{5}_{1} \cap C^{6}_{3} \cap C^{7}_{1}\}$, which includes 25 core proteins: POL30, RAD1, PIN3, RAD23, HRT1, YOL087C, RAD7, UBA1, MET30, MGT1, RVS167, HSE1, CDC48, SAN1, PRP8, RPL40A, SNF1, CLB2, KSS1, SWD1, RPL40B, MUS81, SWI5, GRR1 and GPA1. This consistency shows that the proteins in the core-module interact strongly over the entire observation period. This is not surprising, since this functional module is essentially involved in cell growth and cell death, as well as in the response to changing ethanol concentrations. Such consistency in the evolutionary pattern of this module may provide clues about how proteins respond to external stimuli as the wine fermentation progresses. The top 10 biological process annotations of the core-module $M$, all with very low p-values, are shown in Table 3; they were computed with BiNGO [11]. Functional key words such as protein ubiquitination, protein conjugation, post-translational modification, response to stimulus and catabolic process have been shown to play an important role in the process of S. cerevisiae fermentation [15, 14, 4].

Table 1: Comparison of module similarity across timestamps. Columns: t=0-12, t=12-24, t=24-48, t=48-60, t=60-120, t=120-340, Ave. Rows: Evolution Flow; Evolution Flow (Without Smoothness); K-means (K=15); K-means (K=30); FCM (K=15); FCM (K=30); Spectral Clustering (K=15); Spectral Clustering (K=30).

Table 2: Comparison of log(p-value). Columns: t=0, t=12, t=24, t=48, t=60, t=120, t=340, Ave. Rows: Evolution Flow; K-means (K=15); K-means (K=30); FCM (K=15); FCM (K=30); Spectral Clustering (K=15); Spectral Clustering (K=30).

4. CONCLUSIONS

In this paper, we proposed a framework for analyzing the evolutionary patterns of functional modules in dynamic PPI networks.
Because this framework takes into account the inherent dynamics of PPI networks, it may provide novel insights into the underlying behavior of the molecular system. To the best of our knowledge, this is the first evolutionary analysis of functional modules in dynamic PPI networks. Using a dataset of S. cerevisiae wine fermentation over consecutive timestamps, we demonstrated with high confidence the gene annotation enrichment of the identified functional modules, i.e., the sets of proteins that participate in the same biological function. The results of the experiment in Section 3.4 further show that the proposed framework can effectively categorize and track the evolutionary events of the functional modules, and that it obtains an informative functional module which plays an important role over the entire observation time. By analyzing in depth the gene annotations of functional modules whose evolutionary patterns are distinctive, we may gain important insights into various diseases or organisms.

Figure 4: Plot of evolving graph with varying α values.

Table 3: Top 10 biological process annotations for the core-module M
GO-ID    p-value    Description
         E-10       protein ubiquitination
         E-09       protein modification by small protein conjugation
         E-08       protein modification by small protein conjugation or removal
         E-07       post-translational protein modification
         E-07       cellular response to stimulus
         E-06       macromolecule modification
         E-06       protein ubiquitination involved in ubiquitin-dependent protein catabolic process
         E-06       response to DNA damage stimulus
         E-06       protein modification process
         E-06       response to stimulus

5. REFERENCES

[1] K. Basso et al. Reverse engineering of regulatory networks in human B cells. Nature Genetics, 37(4).
[2] Y. Chi et al. On evolutionary spectral clustering. ACM Transactions on Knowledge Discovery from Data, 3(4):1-30.
[3] Y.-R. Cho, L. Shi, and A. Zhang. flownet: Flow-based approach for efficient analysis of complex biological networks. Ninth IEEE International Conference on Data Mining.
[4] J. Ding et al. Tolerance and stress response to ethanol in the yeast Saccharomyces cerevisiae. Applied Microbiology and Biotechnology, 74(2).
[5] A. J. Enright, S. Van Dongen, and C. A. Ouzounis. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Research, 30(7).
[6] L. Getoor and C. P. Diehl. Link mining: a survey. SIGKDD Explor. Newsl., 7(2):3-12, Dec.
[7] M. Girvan and M. E. J. Newman. Community structure in social and biological networks. PNAS, pages 1-9.
[8] J.-D. J. Han et al. Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature, 430(6995):88-93.
[9] H. Jeong, S. P. Mason, A.-L. Barabási, and Z. N. Oltvai. Lethality and centrality in protein networks. Nature, 411(6833):41-42.
[10] H. K. Lee et al. Coexpression analysis of human genes across many microarray data sets. Genome Research, 14(6).
[11] S. Maere, K. Heymans, and M. Kuiper. BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics, 21(16).
[12] V. Marks et al. Dynamics of the yeast transcriptome during wine fermentation reveals a novel fermentation stress response. FEMS Yeast Research, 8(1):35-52.
[13] E. Nabieva et al. Whole-proteome prediction of protein function via graph-theoretic analysis of interaction maps. Bioinformatics, 21 Suppl 1.
[14] S. Ostergaard, L. Olsson, and J. Nielsen. Metabolic engineering of Saccharomyces cerevisiae. Microbiology and Molecular Biology Reviews, 64(1):34-50.
[15] N. Piggott, M. Cook, M. Tyers, and V. Measday. Genome-wide fitness profiles reveal a requirement for autophagy during yeast fermentation. G3 (Bethesda), 1(5):353-67.
[16] T. M. Przytycka, M. Singh, and D. K. Slonim. Toward the dynamic interactome: it's about time. Access, 11(1).
[17] Y. Qi and H. Ge. Modularity and dynamics of cellular networks. PLoS Computational Biology, 2(12):9.
[18] A. Ruepp et al. The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes. Nucleic Acids Research, 32(18).
[19] S. Salvador and P. Chan. Determining the number of clusters/segments in hierarchical clustering/segmentation algorithms. 16th IEEE International Conference on Tools with Artificial Intelligence, 1(ICTAI).
[20] L. Shi, Y.-R. Cho, and A. Zhang. Functional flow simulation based analysis of protein interaction network. BIBE '10.
[21] M. Takaffoli, F. Sangi, J. Fagnan, and O. R. Za. Modec - modeling and detecting evolutions of communities. Artificial Intelligence.
[22] M. Takaffoli, F. Sangi, J. Fagnan, and O. R. Zaiane. A framework for analyzing dynamic social networks. Science.
[23] X. Tang, J. Wang, B. Liu, M. Li, G. Chen, and Y. Pan. A comparison of the functional modules identified from time course and static PPI network data. BMC Bioinformatics, 12(1):339.
[24] S. White and P. Smyth. A spectral clustering approach to finding communities in graphs. Proceedings of the Fifth SIAM International Conference on Data Mining, 119:274.
[25] A. Zhang. Protein Interaction Networks: Computational Analysis.
[26] S. Zhang, H.-W. Liu, X.-M. Ning, and X.-S. Zhang. A hybrid graph-theoretic method for mining overlapping functional modules in large sparse protein interaction networks. International Journal of Data Mining and Bioinformatics, 3(1):68-84.
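
As an illustration of the evolving graph discussed in Section 3.4, the following Python sketch links modules at consecutive timestamps whenever their similarity reaches α, and then looks for chains of modules that span every timestamp. It is a minimal sketch under stated assumptions, not the implementation used in this paper: Jaccard overlap is assumed as a stand-in for the module similarity, the function names (jaccard, build_evolving_graph, persistent_chains, core_proteins) are hypothetical, and the protein labels P1 to P12 are toy data rather than results.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard overlap of two protein sets (an assumed stand-in similarity)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_evolving_graph(snapshots, alpha):
    """Link module i at timestamp t to module j at t + 1 when similarity >= alpha.

    snapshots[t] is the list of modules (protein sets) detected at timestamp t.
    Returns a list of edges ((t, i), (t + 1, j), similarity).
    """
    edges = []
    for t in range(len(snapshots) - 1):
        pairs = product(enumerate(snapshots[t]), enumerate(snapshots[t + 1]))
        for (i, m), (j, n) in pairs:
            sim = jaccard(m, n)
            if sim >= alpha:
                edges.append(((t, i), (t + 1, j), sim))
    return edges


def persistent_chains(snapshots, edges):
    """Enumerate paths in the evolving graph that cover every timestamp."""
    successors = {}
    for u, v, _ in edges:
        successors.setdefault(u, []).append(v)
    last = len(snapshots) - 1
    chains = []

    def extend(path):
        node = path[-1]
        if node[0] == last:
            chains.append(path)
            return
        for nxt in successors.get(node, []):
            extend(path + [nxt])

    for i in range(len(snapshots[0])):
        extend([(0, i)])
    return chains


def core_proteins(snapshots, chain):
    """Proteins shared by every module along one chain."""
    return set.intersection(*(set(snapshots[t][i]) for t, i in chain))


# Toy data (hypothetical): three timestamps, three modules each.
snapshots = [
    [{"P1", "P2", "P3", "P4"}, {"P5", "P6", "P7"}, {"P9", "P10", "P11"}],
    [{"P1", "P2", "P3", "P4"}, {"P5", "P6", "P7", "P8"}, {"P9", "P10"}],
    [{"P1", "P2", "P3", "P4"}, {"P5", "P6", "P8"}, {"P9", "P12"}],
]

for alpha in (0.6, 0.7, 0.8, 0.9):
    edges = build_evolving_graph(snapshots, alpha)
    chains = persistent_chains(snapshots, edges)
    print(alpha, len(edges), len(chains))
```

On this toy input the number of edges shrinks as α grows, and at α = 0.9 only one chain remains, whose shared proteins are P1 to P4. This mirrors, in miniature, the qualitative behavior reported for Figure 4; a different similarity measure would change which events survive a given α.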

Towards Detecting Protein Complexes from Protein Interaction Data

Towards Detecting Protein Complexes from Protein Interaction Data Towards Detecting Protein Complexes from Protein Interaction Data Pengjun Pei 1 and Aidong Zhang 1 Department of Computer Science and Engineering State University of New York at Buffalo Buffalo NY 14260,

More information

Network Biology: Understanding the cell s functional organization. Albert-László Barabási Zoltán N. Oltvai

Network Biology: Understanding the cell s functional organization. Albert-László Barabási Zoltán N. Oltvai Network Biology: Understanding the cell s functional organization Albert-László Barabási Zoltán N. Oltvai Outline: Evolutionary origin of scale-free networks Motifs, modules and hierarchical networks Network

More information

Evidence for dynamically organized modularity in the yeast protein-protein interaction network

Evidence for dynamically organized modularity in the yeast protein-protein interaction network Evidence for dynamically organized modularity in the yeast protein-protein interaction network Sari Bombino Helsinki 27.3.2007 UNIVERSITY OF HELSINKI Department of Computer Science Seminar on Computational

More information

Predicting Protein Functions and Domain Interactions from Protein Interactions

Predicting Protein Functions and Domain Interactions from Protein Interactions Predicting Protein Functions and Domain Interactions from Protein Interactions Fengzhu Sun, PhD Center for Computational and Experimental Genomics University of Southern California Outline High-throughput

More information

Protein function prediction via analysis of interactomes

Protein function prediction via analysis of interactomes Protein function prediction via analysis of interactomes Elena Nabieva Mona Singh Department of Computer Science & Lewis-Sigler Institute for Integrative Genomics January 22, 2008 1 Introduction Genome

More information

Networks & pathways. Hedi Peterson MTAT Bioinformatics

Networks & pathways. Hedi Peterson MTAT Bioinformatics Networks & pathways Hedi Peterson (peterson@quretec.com) MTAT.03.239 Bioinformatics 03.11.2010 Networks are graphs Nodes Edges Edges Directed, undirected, weighted Nodes Genes Proteins Metabolites Enzymes

More information

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008 MIT OpenCourseWare http://ocw.mit.edu 6.047 / 6.878 Computational Biology: Genomes, Networks, Evolution Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Detecting temporal protein complexes from dynamic protein-protein interaction networks

Detecting temporal protein complexes from dynamic protein-protein interaction networks Detecting temporal protein complexes from dynamic protein-protein interaction networks Le Ou-Yang, Dao-Qing Dai, Xiao-Li Li, Min Wu, Xiao-Fei Zhang and Peng Yang 1 Supplementary Table Table S1: Comparative

More information

Fuzzy Clustering of Gene Expression Data

Fuzzy Clustering of Gene Expression Data Fuzzy Clustering of Gene Data Matthias E. Futschik and Nikola K. Kasabov Department of Information Science, University of Otago P.O. Box 56, Dunedin, New Zealand email: mfutschik@infoscience.otago.ac.nz,

More information

Network by Weighted Graph Mining

Network by Weighted Graph Mining 2012 4th International Conference on Bioinformatics and Biomedical Technology IPCBEE vol.29 (2012) (2012) IACSIT Press, Singapore + Prediction of Protein Function from Protein-Protein Interaction Network

More information

Introduction to Bioinformatics

Introduction to Bioinformatics CSCI8980: Applied Machine Learning in Computational Biology Introduction to Bioinformatics Rui Kuang Department of Computer Science and Engineering University of Minnesota kuang@cs.umn.edu History of Bioinformatics

More information

An Efficient Algorithm for Protein-Protein Interaction Network Analysis to Discover Overlapping Functional Modules

An Efficient Algorithm for Protein-Protein Interaction Network Analysis to Discover Overlapping Functional Modules An Efficient Algorithm for Protein-Protein Interaction Network Analysis to Discover Overlapping Functional Modules Ying Liu 1 Department of Computer Science, Mathematics and Science, College of Professional

More information

Differential Modeling for Cancer Microarray Data

Differential Modeling for Cancer Microarray Data Differential Modeling for Cancer Microarray Data Omar Odibat Department of Computer Science Feb, 01, 2011 1 Outline Introduction Cancer Microarray data Problem Definition Differential analysis Existing

More information

Written Exam 15 December Course name: Introduction to Systems Biology Course no

Written Exam 15 December Course name: Introduction to Systems Biology Course no Technical University of Denmark Written Exam 15 December 2008 Course name: Introduction to Systems Biology Course no. 27041 Aids allowed: Open book exam Provide your answers and calculations on separate

More information

Types of biological networks. I. Intra-cellurar networks

Types of biological networks. I. Intra-cellurar networks Types of biological networks I. Intra-cellurar networks 1 Some intra-cellular networks: 1. Metabolic networks 2. Transcriptional regulation networks 3. Cell signalling networks 4. Protein-protein interaction

More information

A Multiobjective GO based Approach to Protein Complex Detection

A Multiobjective GO based Approach to Protein Complex Detection Available online at www.sciencedirect.com Procedia Technology 4 (2012 ) 555 560 C3IT-2012 A Multiobjective GO based Approach to Protein Complex Detection Sumanta Ray a, Moumita De b, Anirban Mukhopadhyay

More information

Association Analysis-based Transformations for Protein Interaction Networks: A Function Prediction Case Study

Association Analysis-based Transformations for Protein Interaction Networks: A Function Prediction Case Study Association Analysis-based Transformations for Protein Interaction Networks: A Function Prediction Case Study Gaurav Pandey Dept of Comp Sc & Engg Univ of Minnesota, Twin Cities Minneapolis, MN, USA gaurav@cs.umn.edu

More information

Cell biology traditionally identifies proteins based on their individual actions as catalysts, signaling

Cell biology traditionally identifies proteins based on their individual actions as catalysts, signaling Lethality and centrality in protein networks Cell biology traditionally identifies proteins based on their individual actions as catalysts, signaling molecules, or building blocks of cells and microorganisms.

More information

A general co-expression network-based approach to gene expression analysis: comparison and applications

A general co-expression network-based approach to gene expression analysis: comparison and applications BMC Systems Biology This Provisional PDF corresponds to the article as it appeared upon acceptance. Fully formatted PDF and full text (HTML) versions will be made available soon. A general co-expression

More information

Robust Community Detection Methods with Resolution Parameter for Complex Detection in Protein Protein Interaction Networks

Robust Community Detection Methods with Resolution Parameter for Complex Detection in Protein Protein Interaction Networks Robust Community Detection Methods with Resolution Parameter for Complex Detection in Protein Protein Interaction Networks Twan van Laarhoven and Elena Marchiori Institute for Computing and Information

More information

PNmerger: a Cytoscape plugin to merge biological pathways and protein interaction networks

PNmerger: a Cytoscape plugin to merge biological pathways and protein interaction networks PNmerger: a Cytoscape plugin to merge biological pathways and protein interaction networks http://www.hupo.org.cn/pnmerger Fuchu He E-mail: hefc@nic.bmi.ac.cn Tel: 86-10-68171208 FAX: 86-10-68214653 Yunping

More information

Bioinformatics 2. Yeast two hybrid. Proteomics. Proteomics

Bioinformatics 2. Yeast two hybrid. Proteomics. Proteomics GENOME Bioinformatics 2 Proteomics protein-gene PROTEOME protein-protein METABOLISM Slide from http://www.nd.edu/~networks/ Citrate Cycle Bio-chemical reactions What is it? Proteomics Reveal protein Protein

More information

In order to compare the proteins of the phylogenomic matrix, we needed a similarity

In order to compare the proteins of the phylogenomic matrix, we needed a similarity Similarity Matrix Generation In order to compare the proteins of the phylogenomic matrix, we needed a similarity measure. Hamming distances between phylogenetic profiles require the use of thresholds for

More information

hsnim: Hyper Scalable Network Inference Machine for Scale-Free Protein-Protein Interaction Networks Inference

hsnim: Hyper Scalable Network Inference Machine for Scale-Free Protein-Protein Interaction Networks Inference CS 229 Project Report (TR# MSB2010) Submitted 12/10/2010 hsnim: Hyper Scalable Network Inference Machine for Scale-Free Protein-Protein Interaction Networks Inference Muhammad Shoaib Sehgal Computer Science

More information

MTopGO: a tool for module identification in PPI Networks

MTopGO: a tool for module identification in PPI Networks MTopGO: a tool for module identification in PPI Networks Danila Vella 1,2, Simone Marini 3,4, Francesca Vitali 5,6,7, Riccardo Bellazzi 1,4 1 Clinical Scientific Institute Maugeri, Pavia, Italy, 2 Department

More information

EFFICIENT AND ROBUST PREDICTION ALGORITHMS FOR PROTEIN COMPLEXES USING GOMORY-HU TREES

EFFICIENT AND ROBUST PREDICTION ALGORITHMS FOR PROTEIN COMPLEXES USING GOMORY-HU TREES EFFICIENT AND ROBUST PREDICTION ALGORITHMS FOR PROTEIN COMPLEXES USING GOMORY-HU TREES A. MITROFANOVA*, M. FARACH-COLTON**, AND B. MISHRA* *New York University, Department of Computer Science, New York,

More information

SUPPLEMENTAL DATA - 1. This file contains: Supplemental methods. Supplemental results. Supplemental tables S1 and S2. Supplemental figures S1 to S4

SUPPLEMENTAL DATA - 1. This file contains: Supplemental methods. Supplemental results. Supplemental tables S1 and S2. Supplemental figures S1 to S4 Protein Disulfide Isomerase is Required for Platelet-Derived Growth Factor-Induced Vascular Smooth Muscle Cell Migration, Nox1 Expression and RhoGTPase Activation Luciana A. Pescatore 1, Diego Bonatto

More information

Discovering molecular pathways from protein interaction and ge

Discovering molecular pathways from protein interaction and ge Discovering molecular pathways from protein interaction and gene expression data 9-4-2008 Aim To have a mechanism for inferring pathways from gene expression and protein interaction data. Motivation Why

More information

Systems biology and biological networks

Systems biology and biological networks Systems Biology Workshop Systems biology and biological networks Center for Biological Sequence Analysis Networks in electronics Radio kindly provided by Lazebnik, Cancer Cell, 2002 Systems Biology Workshop,

More information

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor Biological Networks:,, and via Relative Description Length By: Tamir Tuller & Benny Chor Presented by: Noga Grebla Content of the presentation Presenting the goals of the research Reviewing basic terms

More information

Comparative Network Analysis

Comparative Network Analysis Comparative Network Analysis BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2016 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by

More information

An Improved Ant Colony Optimization Algorithm for Clustering Proteins in Protein Interaction Network

An Improved Ant Colony Optimization Algorithm for Clustering Proteins in Protein Interaction Network An Improved Ant Colony Optimization Algorithm for Clustering Proteins in Protein Interaction Network Jamaludin Sallim 1, Rosni Abdullah 2, Ahamad Tajudin Khader 3 1,2,3 School of Computer Sciences, Universiti

More information

Proteomics. Yeast two hybrid. Proteomics - PAGE techniques. Data obtained. What is it?

Proteomics. Yeast two hybrid. Proteomics - PAGE techniques. Data obtained. What is it? Proteomics What is it? Reveal protein interactions Protein profiling in a sample Yeast two hybrid screening High throughput 2D PAGE Automatic analysis of 2D Page Yeast two hybrid Use two mating strains

More information

Small RNA in rice genome

Small RNA in rice genome Vol. 45 No. 5 SCIENCE IN CHINA (Series C) October 2002 Small RNA in rice genome WANG Kai ( 1, ZHU Xiaopeng ( 2, ZHONG Lan ( 1,3 & CHEN Runsheng ( 1,2 1. Beijing Genomics Institute/Center of Genomics and

More information

Ensemble Non-negative Matrix Factorization Methods for Clustering Protein-Protein Interactions

Ensemble Non-negative Matrix Factorization Methods for Clustering Protein-Protein Interactions Belfield Campus Map Ensemble Non-negative Matrix Factorization Methods for Clustering Protein-Protein Interactions

More information

Analysis and visualization of protein-protein interactions. Olga Vitek Assistant Professor Statistics and Computer Science

Analysis and visualization of protein-protein interactions. Olga Vitek Assistant Professor Statistics and Computer Science 1 Analysis and visualization of protein-protein interactions Olga Vitek Assistant Professor Statistics and Computer Science 2 Outline 1. Protein-protein interactions 2. Using graph structures to study

More information

CISC 636 Computational Biology & Bioinformatics (Fall 2016)

CISC 636 Computational Biology & Bioinformatics (Fall 2016) CISC 636 Computational Biology & Bioinformatics (Fall 2016) Predicting Protein-Protein Interactions CISC636, F16, Lec22, Liao 1 Background Proteins do not function as isolated entities. Protein-Protein

More information

Self Similar (Scale Free, Power Law) Networks (I)

Self Similar (Scale Free, Power Law) Networks (I) Self Similar (Scale Free, Power Law) Networks (I) E6083: lecture 4 Prof. Predrag R. Jelenković Dept. of Electrical Engineering Columbia University, NY 10027, USA {predrag}@ee.columbia.edu February 7, 2007

More information

Learning in Bayesian Networks

Learning in Bayesian Networks Learning in Bayesian Networks Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Berlin: 20.06.2002 1 Overview 1. Bayesian Networks Stochastic Networks

More information

Computational Network Biology Biostatistics & Medical Informatics 826 Fall 2018

Computational Network Biology Biostatistics & Medical Informatics 826 Fall 2018 Computational Network Biology Biostatistics & Medical Informatics 826 Fall 2018 Sushmita Roy sroy@biostat.wisc.edu https://compnetbiocourse.discovery.wisc.edu Sep 6 th 2018 Goals for today Administrivia

More information

Lecture Notes for Fall Network Modeling. Ernest Fraenkel

Lecture Notes for Fall Network Modeling. Ernest Fraenkel Lecture Notes for 20.320 Fall 2012 Network Modeling Ernest Fraenkel In this lecture we will explore ways in which network models can help us to understand better biological data. We will explore how networks

More information

BMD645. Integration of Omics

BMD645. Integration of Omics BMD645 Integration of Omics Shu-Jen Chen, Chang Gung University Dec. 11, 2009 1 Traditional Biology vs. Systems Biology Traditional biology : Single genes or proteins Systems biology: Simultaneously study

More information

Interaction Network Analysis

Interaction Network Analysis CSI/BIF 5330 Interaction etwork Analsis Young-Rae Cho Associate Professor Department of Computer Science Balor Universit Biological etworks Definition Maps of biochemical reactions, interactions, regulations

More information

identifiers matched to homologous genes. Probeset annotation files for each array platform were used to

identifiers matched to homologous genes. Probeset annotation files for each array platform were used to SUPPLEMENTARY METHODS Data combination and normalization Prior to data analysis we first had to appropriately combine all 1617 arrays such that probeset identifiers matched to homologous genes. Probeset

More information

Introduction to clustering methods for gene expression data analysis

Introduction to clustering methods for gene expression data analysis Introduction to clustering methods for gene expression data analysis Giorgio Valentini e-mail: valentini@dsi.unimi.it Outline Levels of analysis of DNA microarray data Clustering methods for functional

More information

Clustering and Network

Clustering and Network Clustering and Network Jing-Dong Jackie Han jdhan@picb.ac.cn http://www.picb.ac.cn/~jdhan Copy Right: Jing-Dong Jackie Han What is clustering? A way of grouping together data samples that are similar in

More information

Application of random matrix theory to microarray data for discovering functional gene modules

Application of random matrix theory to microarray data for discovering functional gene modules Application of random matrix theory to microarray data for discovering functional gene modules Feng Luo, 1 Jianxin Zhong, 2,3, * Yunfeng Yang, 4 and Jizhong Zhou 4,5, 1 Department of Computer Science,

More information

Inferring Transcriptional Regulatory Networks from Gene Expression Data II

Inferring Transcriptional Regulatory Networks from Gene Expression Data II Inferring Transcriptional Regulatory Networks from Gene Expression Data II Lectures 9 Oct 26, 2011 CSE 527 Computational Biology, Fall 2011 Instructor: Su-In Lee TA: Christopher Miles Monday & Wednesday

More information

Integration of functional genomics data

Integration of functional genomics data Integration of functional genomics data Laboratoire Bordelais de Recherche en Informatique (UMR) Centre de Bioinformatique de Bordeaux (Plateforme) Rennes Oct. 2006 1 Observations and motivations Genomics

More information

Protein Complex Identification by Supervised Graph Clustering

Protein Complex Identification by Supervised Graph Clustering Protein Complex Identification by Supervised Graph Clustering Yanjun Qi 1, Fernanda Balem 2, Christos Faloutsos 1, Judith Klein- Seetharaman 1,2, Ziv Bar-Joseph 1 1 School of Computer Science, Carnegie

More information

Biological Systems: Open Access

Biological Systems: Open Access Biological Systems: Open Access Biological Systems: Open Access Liu and Zheng, 2016, 5:1 http://dx.doi.org/10.4172/2329-6577.1000153 ISSN: 2329-6577 Research Article ariant Maps to Identify Coding and

More information

Bioinformatics I. CPBS 7711 October 29, 2015 Protein interaction networks. Debra Goldberg

Bioinformatics I. CPBS 7711 October 29, 2015 Protein interaction networks. Debra Goldberg Bioinformatics I CPBS 7711 October 29, 2015 Protein interaction networks Debra Goldberg debra@colorado.edu Overview Networks, protein interaction networks (PINs) Network models What can we learn from PINs

More information

Overview. Overview. Social networks. What is a network? 10/29/14. Bioinformatics I. Networks are everywhere! Introduction to Networks

Overview. Overview. Social networks. What is a network? 10/29/14. Bioinformatics I. Networks are everywhere! Introduction to Networks Bioinformatics I Overview CPBS 7711 October 29, 2014 Protein interaction networks Debra Goldberg debra@colorado.edu Networks, protein interaction networks (PINs) Network models What can we learn from PINs

More information

An Approach to Classification Based on Fuzzy Association Rules

An Approach to Classification Based on Fuzzy Association Rules An Approach to Classification Based on Fuzzy Association Rules Zuoliang Chen, Guoqing Chen School of Economics and Management, Tsinghua University, Beijing 100084, P. R. China Abstract Classification based

More information

Computational methods for predicting protein-protein interactions

Computational methods for predicting protein-protein interactions Computational methods for predicting protein-protein interactions Tomi Peltola T-61.6070 Special course in bioinformatics I 3.4.2008 Outline Biological background Protein-protein interactions Computational

More information

Bioinformatics. Dept. of Computational Biology & Bioinformatics

Bioinformatics. Dept. of Computational Biology & Bioinformatics Bioinformatics Dept. of Computational Biology & Bioinformatics 3 Bioinformatics - play with sequences & structures Dept. of Computational Biology & Bioinformatics 4 ORGANIZATION OF LIFE ROLE OF BIOINFORMATICS

More information

Understanding Science Through the Lens of Computation. Richard M. Karp Nov. 3, 2007

Understanding Science Through the Lens of Computation. Richard M. Karp Nov. 3, 2007 Understanding Science Through the Lens of Computation Richard M. Karp Nov. 3, 2007 The Computational Lens Exposes the computational nature of natural processes and provides a language for their description.

More information

Phylogenetic Analysis of Molecular Interaction Networks 1

Phylogenetic Analysis of Molecular Interaction Networks 1 Phylogenetic Analysis of Molecular Interaction Networks 1 Mehmet Koyutürk Case Western Reserve University Electrical Engineering & Computer Science 1 Joint work with Sinan Erten, Xin Li, Gurkan Bebek,

More information

Computational Genomics. Systems biology. Putting it together: Data integration using graphical models

Computational Genomics. Systems biology. Putting it together: Data integration using graphical models 02-710 Computational Genomics Systems biology Putting it together: Data integration using graphical models High throughput data So far in this class we discussed several different types of high throughput

More information

Introduction to clustering methods for gene expression data analysis

Introduction to clustering methods for gene expression data analysis Introduction to clustering methods for gene expression data analysis Giorgio Valentini e-mail: valentini@dsi.unimi.it Outline Levels of analysis of DNA microarray data Clustering methods for functional

More information

networks in molecular biology Wolfgang Huber

networks in molecular biology Wolfgang Huber networks in molecular biology Wolfgang Huber networks in molecular biology Regulatory networks: components = gene products interactions = regulation of transcription, translation, phosphorylation... Metabolic

More information

Supplementary online material

Supplementary online material Supplementary online material A probabilistic functional network of yeast genes Insuk Lee, Shailesh V. Date, Alex T. Adai & Edward M. Marcotte DATA SETS Saccharomyces cerevisiae genome This study is based

More information

Computational Systems Biology

Computational Systems Biology Computational Systems Biology Vasant Honavar Artificial Intelligence Research Laboratory Bioinformatics and Computational Biology Graduate Program Center for Computational Intelligence, Learning, & Discovery

More information

Gene Ontology and Functional Enrichment. Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein

Gene Ontology and Functional Enrichment. Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Gene Ontology and Functional Enrichment Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein The parsimony principle: A quick review Find the tree that requires the fewest

More information

Identification of protein complexes from multi-relationship protein interaction networks

Identification of protein complexes from multi-relationship protein interaction networks Li et al. Human Genomics 2016, 10(Suppl 2):17 DOI 10.1186/s40246-016-0069-z RESEARCH Identification of protein complexes from multi-relationship protein interaction networks Xueyong Li 1,2, Jianxin Wang

More information

V 5 Robustness and Modularity

V 5 Robustness and Modularity Bioinformatics 3 V 5 Robustness and Modularity Mon, Oct 29, 2012 Network Robustness Network = set of connections Failure events: loss of edges loss of nodes (together with their edges) loss of connectivity

More information

Functional Characterization and Topological Modularity of Molecular Interaction Networks

Functional Characterization and Topological Modularity of Molecular Interaction Networks Functional Characterization and Topological Modularity of Molecular Interaction Networks Jayesh Pandey 1 Mehmet Koyutürk 2 Ananth Grama 1 1 Department of Computer Science Purdue University 2 Department

More information

Iteration Method for Predicting Essential Proteins Based on Orthology and Protein-protein Interaction Networks

Iteration Method for Predicting Essential Proteins Based on Orthology and Protein-protein Interaction Networks Georgia State University ScholarWorks @ Georgia State University Computer Science Faculty Publications Department of Computer Science 2012 Iteration Method for Predicting Essential Proteins Based on Orthology

More information

Supplementary Figure 3

Supplementary Figure 3 Supplementary Figure 3 a 1 (i) (ii) (iii) (iv) (v) log P gene Q group, % ~ ε nominal 2 1 1 8 6 5 A B C D D' G J L M P R U + + ε~ A C B D D G JL M P R U -1 1 ε~ (vi) Z group 2 1 1 (vii) (viii) Z module

More information

Weighted gene co-expression analysis. Yuehua Cui June 7, 2013

Weighted gene co-expression analysis. Yuehua Cui June 7, 2013 Weighted gene co-expression analysis Yuehua Cui June 7, 2013 Weighted gene co-expression network (WGCNA) A type of scale-free network: A scale-free network is a network whose degree distribution follows

More information

Supplementary Information

Supplementary Information Supplementary Information For the article"comparable system-level organization of Archaea and ukaryotes" by J. Podani, Z. N. Oltvai, H. Jeong, B. Tombor, A.-L. Barabási, and. Szathmáry (reference numbers

More information

Network Biology-part II

Network Biology-part II Network Biology-part II Jun Zhu, Ph. D. Professor of Genomics and Genetic Sciences Icahn Institute of Genomics and Multi-scale Biology The Tisch Cancer Institute Icahn Medical School at Mount Sinai New

More information
