Dividing Protein Interaction Networks by Growing Orthologous Articulations


Pavol Jancura, Jaap Heringa, Elena Marchiori
Department of Computer Science, IBIVU, Vrije Universiteit Amsterdam, The Netherlands
November 5, 2007

Abstract

The increasing growth of data on protein-protein interaction (PPI) networks has boosted research on their comparative analysis. In particular, recent studies proposed models and algorithms for performing network alignment, the comparison of networks across species for discovering conserved modules. Common approaches for this task construct a merged representation of the considered networks, called alignment graph, and search the alignment graph for conserved networks of interest using greedy techniques. In this paper we propose an alternative approach for this task. First, each network to be compared is divided into small subnets which are likely to contain conserved modules. Next, network alignment is performed on pairs of resulting subnets from different species, using an exact search algorithm. We develop an algorithm for dividing PPI networks that combines a graph theoretical property (articulation) with a biological one (orthology). Results of experiments show the ability of this approach to discover accurate conserved modules, and substantiate the importance of the notions of orthology and articulation for performing comparative network analysis in a modular fashion.

1 Introduction

With the exponential increase of data on protein interactions obtained from advanced technologies, data on thousands of interactions in human and most model species have become available (e.g. [2, 23]). PPI networks offer a powerful representation for better understanding the modular organization of cells, for predicting biological functions and for providing insight into a variety of biochemical processes.
Recent studies consider a comparative approach for the analysis of PPI networks from different species in order to discover common protein groups which are likely to be related to shared relevant functional modules [8, 17, 19]. This problem is also known as pairwise network alignment. Algorithms for this task typically construct a merged graph representation of the networks to be compared, called alignment (or orthology) graph, and model the problem as an optimization problem on this graph. Due to the computational intractability of this optimization problem, greedy algorithms are commonly used [9, 16].

Footnote: Corresponding author.

1.1 Problem Statement

Conserved modules, discovered by computational techniques such as [9], are in general small compared to the PPI network they belong to. Moreover, as indicated by recent studies, hubs whose removal disconnects the PPI network (articulation hubs) are likely to appear in conserved interaction patterns [10, 13]. These observations motivate the focus of this paper on the problem of dividing a pair of PPI networks into small subnets which are expected to cover conserved modules, with the goal of performing modular network alignment.

1.2 Contributions

We introduce a heuristic algorithm for dividing a PPI network into subnets, which combines biological (orthology) and graph theoretical (articulation) information. The algorithm starts by identifying groups of orthologous articulations, called centrums, which are expanded into subsets consisting of orthologous nodes. The algorithm automatically determines the number of subsets and has the desirable property of being parameterless.

We use this algorithm for performing network alignment in a modular way, by merging pairs of resulting subnets from different species and applying exact optimization for searching conserved modules across species. In order to test the performance of this approach, we consider an instance of the method that uses a state-of-the-art evolution-based alignment graph model [9]. Results of experiments show the effectiveness of the proposed approach, which is capable of detecting accurate conserved complexes. Furthermore, we show that improved performance can be achieved by merging modules detected with our algorithm with those identified by the algorithm of Koyutürk et al. [9]. In general, these results substantiate the important role of the notions of orthology and articulation in modular comparative PPI network analysis.

1.3 Related Work

Recent overviews of approaches and issues in comparative biological network analysis are presented in [17, 19].
Based on the general formulation of network alignment proposed in [8], a number of techniques for (local and global) network alignment have been introduced [9, 16, 5, 18]. Techniques for local network alignment commonly construct an orthology graph, which provides a merged representation of the given PPI networks, and search for conserved subnets using greedy techniques [9, 16, 5]. In particular, in [5], d-clusters are defined for searching efficiently between a pair of networks, where a d-cluster consists of d proteins that are close together in the network, and d is a user-given parameter. Another parameter is used for identifying pairs of d-clusters, one from each network, called seeds, which provide starting regions of the alignment graph to be expanded. The algorithm searches for modules conserved across species by expanding these seeds using a greedy technique similar to that used in [9, 16].

Both d-clusters and centrums are initial groups of proteins used for network alignment; however, their definition and use are different. In [5], each node is expanded into a d-cluster, while the number and size of centrums are computed automatically. Moreover, while d-clusters are used for generating starting regions of the alignment graph to be expanded, centrums are used for dividing each network prior to alignment.

While the above algorithms focus on network alignment, we focus on modular network alignment. To the best of our knowledge, we propose the first algorithm which directly tackles the modularity issue in network alignment.

Many papers have investigated the importance of hubs in PPI networks and functional groups [13, 14, 15, 7, 4, 21]. In particular, it has been shown that hubs with a central role in the network architecture are three times more likely to be essential than proteins with only a small number of links to other proteins [7]. Moreover, amongst all functional groups in PPI networks, cellular organization proteins have the largest presence in hubs whose removal disconnects the network [13].

Computational techniques for identifying functional modules in PPI networks generally search for clusters of proteins forming dense components [3, 11]. The scale-free topology of PPI networks makes it difficult to isolate modules hidden inside the central core [24]. In [1] several multi-level graph partitioning algorithms are described which address the difficulty of partitioning scale-free graphs.

The approach we propose differs from the above mentioned works because it does not (directly) address the problem of identifying functional modules in a PPI network, but uses homology information and articulations for dividing PPI networks into subnets in order to perform network alignment in a modular fashion.

2 Graph Theoretic Background

Given a graph G = (V, E), nodes joined by an edge are called adjacent. A neighbor of a node u is a node adjacent to u. The degree of u is the number of elements in E containing the vertex u. Let G(V, E) be a connected undirected graph. A vertex v ∈ V is called an articulation if the graph obtained by removing this vertex and all its edges from G is not connected. A tree is a connected graph which is no longer connected if any edge is removed from it.
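To make the articulation definition above concrete, the following is a minimal Python sketch of a depth-first search that returns the articulation vertices of an undirected graph. It follows the usual discovery-time / low-point idea behind the linear-time algorithms cited later in the paper, but it is only an illustration: the function name `articulation_points` and the adjacency-dictionary input format are assumptions made here, and the recursive form is suitable for small graphs only.

```python
def articulation_points(adj):
    """Return the set of articulation vertices of an undirected simple graph.

    adj maps every vertex to the set of its neighbors (hypothetical input
    format chosen for this sketch). Recursion depth grows with the graph,
    so large networks would need an iterative variant.
    """
    disc = {}      # discovery time of each visited vertex
    low = {}       # lowest discovery time reachable from the DFS subtree
    result = set()
    counter = 0

    def dfs(u, parent):
        nonlocal counter
        disc[u] = low[u] = counter
        counter += 1
        children = 0
        for w in adj[u]:
            if w not in disc:
                children += 1
                dfs(w, u)
                low[u] = min(low[u], low[w])
                # A non-root u separates w's subtree if that subtree
                # cannot reach a vertex discovered before u.
                if parent is not None and low[w] >= disc[u]:
                    result.add(u)
            elif w != parent:
                low[u] = min(low[u], disc[w])
        # The DFS root is an articulation iff it has several DFS children.
        if parent is None and children > 1:
            result.add(u)

    for v in adj:
        if v not in disc:
            dfs(v, None)
    return result
```

For a triangle a, b, c with an extra vertex d attached to c, removing c isolates d, so c is the only articulation.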
A tree is called a rooted tree if one vertex of the tree has been designated as the root, in which case the edges have a natural orientation, towards or away from the root. A tree without any designated root is called a free tree. Given a rooted tree T(V, E), the depth of a vertex v ∈ V is the number of edges from the root of the tree to v. Leaves of the tree are vertices which do not have any neighbor in the tree with higher depth. The depth of a tree is the highest depth of its leaves. A spanning tree T(U, F) of a connected undirected graph G(V, E) is a tree where U = V and F ⊆ E.

Suppose a connected undirected graph G(V, E) is given. A hyper-root of G is a set of vertices forming a free subtree of G. A spanning tree constructed from a hyper-root, called a hyper-tree, can be generated as follows. Add all neighbors of vertices in the hyper-root which are not in the hyper-root and connect them to the hyper-root. This generates the first depth of the tree. Continue the tree construction by adding the nodes of each successive depth iteratively, where the 0-depth of the hyper-tree is the hyper-root, and the i-th depth of the hyper-tree consists of all neighbors of the (i−1)-th depth which are not yet in any lower depth of the hyper-tree. Since G is connected, the above procedure eventually covers all vertices of G, hence we obtain a spanning tree of G with the considered hyper-root. Figure 1 shows examples of hyper-trees of a given graph.
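The layer-by-layer construction just described is a breadth-first search started from all hyper-root vertices at once. The sketch below is one possible reading of it in Python; the function name `hyper_tree`, the adjacency-dictionary format and the rule for choosing which previous-depth neighbor a new vertex is attached to are assumptions of this example, and the code does not check that the hyper-root really forms a free subtree.

```python
def hyper_tree(adj, hyper_root):
    """Layer a connected graph around a hyper-root.

    adj maps each vertex to the set of its neighbors; vertices are assumed
    to be sortable (e.g. strings) so that the result is deterministic.
    Returns (depth, parent): depth[v] is the hyper-tree depth of v, and
    parent[v] is the vertex of the previous depth that v was attached to.
    Hyper-root vertices have depth 0 and no parent entry.
    """
    depth = {v: 0 for v in hyper_root}
    parent = {}
    frontier = sorted(hyper_root)
    d = 0
    while frontier:
        d += 1
        next_frontier = []
        for u in frontier:
            for w in sorted(adj[u]):
                if w not in depth:          # not yet in any lower depth
                    depth[w] = d
                    parent[w] = u           # one tree edge per new vertex
                    next_frontier.append(w)
        frontier = next_frontier
    return depth, parent
```

Because every new vertex receives exactly one parent edge, the parent map together with the edges inside the hyper-root describes a spanning tree whenever the graph is connected.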

If the hyper-root contains one vertex such that all other vertices of the hyper-root form the first depth of the subtree, then we call such a hyper-root a centrum.

Figure 1: Examples of hyper-trees in the same graph (panels: "Hyper tree" and "Hyper tree with centrum"). The red nodes and edges form the hyper-root. The remaining nodes and the blue edges form the other depths of the hyper-tree. In the second hyper-tree the hyper-root is a centrum.

3 From Orthologous Articulations through Centrums to Hyper-Trees

A PPI network is represented by an undirected graph G(U, E). U denotes the set of proteins and E denotes the set of edges, where an edge uu′ ∈ E represents the interaction between u ∈ U and u′ ∈ U.

Given a PPI network G, we generate hyper-roots from orthologous articulations, and expand them into (hyper-)subtrees containing only orthologous proteins. The resulting algorithm, called Divide, is sketched in pseudo-code in Algorithm 1, and described in more detail below. An orthology path is a path containing only orthologous proteins.

Computing articulations (Line 1). Articulations can be computed in linear time by using, e.g., Tarjan's algorithm described in [20] or [6]. The degree (in G) of all orthologous articulations is then used for selecting seeds for the construction of centrums.

Constructing centrums (Lines 3-10). Networks with scale-free topology appear to have edges between hubs systematically suppressed, while those between a hub and a low-connected protein seem favored [12]. Guided by this result, we construct centrums by joining one orthologous articulation hub with its orthologous articulation neighbors, which are more likely to have low degree. Specifically, let A be the set of orthologous articulations of G. The first centrum is the hyper-root consisting of the element of A with highest degree and all its neighbors in A.
The other centrums are hyper-roots generated iteratively by considering, at each iteration, the element of A with highest degree among those which do not occur in any of the centrums constructed so far, together with all its neighbors in A which do not already occur in any other centrum. The process terminates when all elements of A are in at least one centrum. Then an unambiguous label is assigned to each centrum. Observe that by construction the resulting hyper-roots are centrums.

Figure 2: Examples of centrums of hyper-trees (left figure) and of one-depth hyper-trees (right figure). Seeds of centrums are inside square markers. Magenta nodes are the remaining nodes of centrums, connected to a seed by red edges. Green nodes are orthologous proteins which are not articulations. Black nodes are non-orthologous proteins. In the second (right) graph green edges connect nodes of centrums (zero depth hyper-trees) with nodes of the first depth hyper-trees. Nodes on the yellow background indicate the overlap among hyper-trees.

Constructing First Depth Hyper-Trees (Lines 11-16). By construction, centrums cover all orthologous articulations. Articulation hubs are often present in conserved subnets detected by means of comparative methods such as [9]. Therefore, assuming that the majority of the remaining (that is, non-articulation) nodes in the alignment graph are neighbors of articulation hubs, we add to each centrum all its neighboring orthologous proteins, regardless of whether they are articulations. We mark these newly added proteins with the label of the centrums to which they have been added. These newly added proteins form the first depth hyper-trees. Observe that there may be a non-empty overlap between first depth hyper-trees, of at most two nodes (as illustrated in the right part of Figure 2).

Expanding Hyper-Trees (Lines 17-27). Successive depths of hyper-trees are generated by expanding all nodes with only one label which occur in the last depth of each (actual) hyper-tree. We add to the corresponding hyper-trees all orthologous neighbors of these nodes which are not yet labelled. Then we assign to the newly added nodes the labels of the hyper-trees they belong to.
This process is repeated until no more unlabelled orthologous proteins can be added to hyper-trees. Observe that each iteration yields at most a 1-node overlap between newly created depths.

Assigning Remaining Nodes to Hyper-Trees (Lines 28-42). The remaining orthologous nodes, that is, those not yet labelled, are processed as follows. First, unlabelled nodes which are neighbors of multi-labelled nodes are added to the corresponding hyper-trees. Then the newly added nodes are marked with these labels. This process is iterated until there are no unlabelled neighbors of multi-labelled nodes.
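The phases of Divide can be sketched in Python as follows. This is an illustrative reading of Algorithm 1, not the authors' implementation: the function name `divide`, the adjacency-dictionary input, the use of list indices as labels and the tie-breaking rule among equal-degree articulations are all assumptions of this example. The articulation set is taken as an input computed elsewhere, and the last loop corresponds to Lines 37-42 of Algorithm 1, which group the orthologs that are still unlabelled along orthology paths.

```python
def divide(adj, orthologs, articulations):
    """Split the orthologs of one PPI network into overlapping node sets.

    adj: dict mapping each protein to the set of its interaction partners.
    orthologs: proteins with an ortholog in the other species (the set O).
    articulations: articulation vertices of the network, computed elsewhere.
    Returns a list of sets; the list index plays the role of the label.
    """
    O = set(orthologs)
    A = set(articulations) & O
    subsets, labels, depth = [], {}, {}

    # Lines 3-10: centrums, highest-degree uncovered articulation first.
    covered = set()
    for root in sorted(A, key=lambda v: (-len(adj[v]), str(v))):
        if root not in covered:
            centrum = {root} | {w for w in adj[root] if w in A and w not in covered}
            covered |= centrum
            subsets.append(centrum)
    for lab, centrum in enumerate(subsets):
        for v in centrum:
            labels[v] = {lab}
            depth[v] = 0

    # Lines 11-16: add every ortholog neighbor of a centrum (first depth).
    for lab, centrum in enumerate([set(s) for s in subsets]):
        for w in {w for v in centrum for w in adj[v] if w in O} - centrum:
            subsets[lab].add(w)
            labels.setdefault(w, set()).add(lab)
            depth.setdefault(w, 1)

    def attach(new_labels, new_depth=None):
        # Record the labels of newly reached nodes and add them to the sets.
        for w, labs in new_labels.items():
            labels[w] = set(labs)
            for lab in labs:
                subsets[lab].add(w)
            if new_depth is not None:
                depth[w] = new_depth

    # Lines 17-27: grow depth by depth from nodes carrying exactly one label.
    d = 1
    while True:
        new = {}
        for v in [v for v, dv in depth.items() if dv == d and len(labels[v]) == 1]:
            for w in adj[v]:
                if w in O and w not in labels:
                    new.setdefault(w, set()).update(labels[v])
        if not new:
            break
        d += 1
        attach(new, d)

    # Lines 28-36: unlabelled orthologs next to a multi-labelled node.
    while True:
        new = {}
        for r in O - set(labels):
            near = [labels[w] for w in adj[r] if w in labels]
            if any(len(labs) > 1 for labs in near):
                new[r] = set().union(*near)
        if not new:
            break
        attach(new)

    # Lines 37-42: remaining orthologs, grouped along orthology paths.
    for u in sorted(O - set(labels), key=str):
        if u in labels:
            continue
        component, stack = {u}, [u]
        while stack:
            for w in adj[stack.pop()]:
                if w in O and w not in labels and w not in component:
                    component.add(w)
                    stack.append(w)
        for v in component:
            labels[v] = {len(subsets)}
        subsets.append(component)
    return subsets
```

Sorting the articulations once by decreasing degree and skipping those already covered gives the same centrums as repeatedly picking the highest-degree uncovered articulation, since the covered set only grows.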

Figure 3: Examples of a 1-node overlap (panel "One node overlap") and a 2-node overlap (panel "Two node overlap"). Green nodes form the overlap. In the two-node overlap, a node marked with the name of a tree is a leaf of that tree. A green edge is an edge in the intersection of the two trees.

Figure 4 (left) illustrates unlabelled orthologous proteins that can be reached only through multi-labelled nodes.

Figure 4: The left figure shows an example of orthologous proteins that are not reachable from proteins with one label. Red nodes denote multi-labelled orthologous proteins, blue ones one-labelled orthologous proteins. Green nodes denote unlabelled orthologous proteins. Green edges indicate orthology paths. The right figure shows an example of orthologous proteins (green ones) which are not reachable from any hyper-tree. Red nodes and edges represent actual hyper-trees constructed from centrums. Green nodes together with green edges form the new hyper-trees. Black nodes in both graphs indicate non-orthologous proteins.

Nodes which are not neighbors of any labelled protein are still unlabelled. We assume that they may be part of conserved complexes which do not contain articulations. So we create new hyper-trees by joining together all unlabelled orthologous proteins that are neighbors of each other (see the right part of Figure 4).

Algorithm 1 Divide algorithm
Input: G: PPI network, O: orthologous nodes of G
Output: S: list of subsets of O
 1: A = {orthologous articulations of G}
 2: S = <>
 3: repeat {Construction of centrums}
 4:   root = element of A with highest degree not already occurring in S
 5:   s = {root} ∪ {neighbors of root in A not already occurring in S}
 6:   S = <s, S>
 7: until all members of A occur in S
 8: d = 0
 9: Assign depth d to all elements of S
10: Assign label l_s to each s in S and to all its elements
11: for s in S do
12:   s = s ∪ {all neighbors of s in O}
13:   Assign label l_s to all neighbors of s in O
14: end for
15: d = 1
16: Assign depth d to all elements of S having yet no depth assigned
17: repeat {Expand one depth hyper-trees from nodes with one label}
18:   N = {unlabelled neighbors in O of elements in s of depth d having only one label}
19:   for n in N do
20:     Assign to n all labels of its neighbors of depth d having only one label
21:     for each label l_s of n do
22:       s = s ∪ {n}
23:     end for
24:   end for
25:   d = d + 1
26:   Assign depth d to all elements of S having yet no depth assigned
27: until S does not change
28: repeat {Expand hyper-trees from nodes with multiple labels}
29:   R = {unlabelled proteins in O with at least one multi-labelled protein as neighbor}
30:   for r in R do
31:     Assign to r all labels of its neighbors
32:     for each label l_s of r do
33:       s = s ∪ {r}
34:     end for
35:   end for
36: until S does not change
37: repeat
38:   choose an unlabelled element u of O
39:   t = {u} ∪ {all elements of O which can be reached along an orthology path from u}
40:   Assign label l_t to t and to all its elements
41:   S = <t, S>
42: until O does not contain any unlabelled node

4 Divide and Align Algorithm

The Divide algorithm divides the orthologous proteins of a PPI network into hyper-trees. Pairs of hyper-trees from distinct species can be merged into orthology graphs in order to search for alignments corresponding to protein complexes conserved across species. To this aim we use a common approach, based on the construction of a weighted metagraph between two PPI networks of different species. In this metagraph each node corresponds to a homologous pair of proteins, one from each of the two PPI networks. The metagraph is called alignment or orthology graph. Weights are assigned either to edges, as in [9], or to nodes, as in [16], of the alignment graph using a scoring function. The function transforms conservation information, and possibly also evolutionary information, into one real value for each edge or node. Induced subgraphs with total weight greater than a given threshold are considered relevant alignments. Each discovered subgraph thus yields two subsets of proteins, one per species, and each such subset provides a conserved complex of proteins.

In these methods, the problem of finding induced subgraphs with weight greater than a given threshold is reduced to the problem of finding a maximum-weight induced subgraph. Because the maximum-weight induced subgraph problem is NP-complete (cf. [9]), an approximate greedy algorithm based on local search is then used.

We align pairs of hyper-trees from different species having more than one orthologous pair, yielding orthology graphs with more than one node. Because of the small size of the resulting subnets, we use exact optimization [22] for searching in each of these graphs, instead of the greedy techniques employed in common approaches. We call the resulting algorithm DivA (Divide and Align).

Finally, redundant alignments are filtered out as done in, e.g., [9]. A subgraph G1 is said to be redundant if there exists another subgraph G2 which contains r% of its nodes, where r is a threshold value that determines the extent of allowed overlap between discovered protein complexes. In such a case we say that G1 is redundant for G2.
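The alignment step on one pair of hyper-trees can be illustrated with a small Python sketch. It is deliberately simplified and rests on several assumptions: edge weights are taken as given (the scoring function of [9] is not reproduced), weights sit on edges only, and the exact search is a plain exhaustive enumeration standing in for the integer-programming optimization of [22]. The names `alignment_nodes`, `induced_weight` and `best_alignment` are hypothetical. Exhaustive enumeration takes time exponential in the number of nodes, so it is only practical for very small orthology graphs.

```python
from itertools import combinations


def alignment_nodes(tree_a, tree_b, ortholog_pairs):
    """Keep the ortholog pairs (u, v) with u in tree_a and v in tree_b.

    Each surviving pair is one node of the orthology graph built from
    the two hyper-trees.
    """
    return [(u, v) for (u, v) in ortholog_pairs if u in tree_a and v in tree_b]


def induced_weight(chosen, edge_weight):
    """Total weight of the edges whose two endpoints are both chosen."""
    return sum(w for (x, y), w in edge_weight.items() if x in chosen and y in chosen)


def best_alignment(nodes, edge_weight, threshold=0.0):
    """Exhaustively search for the heaviest induced subgraph.

    nodes: list of alignment-graph nodes.
    edge_weight: dict mapping an unordered edge, stored once as (x, y),
    to its real-valued score (assumed to be precomputed).
    Returns (node_set, weight); if no subgraph with at least two nodes
    scores above the threshold, returns (frozenset(), threshold).
    """
    best, best_w = frozenset(), threshold
    for k in range(2, len(nodes) + 1):
        for subset in combinations(nodes, k):
            w = induced_weight(set(subset), edge_weight)
            if w > best_w:            # strict: earlier, smaller sets win ties
                best, best_w = frozenset(subset), w
    return best, best_w
```

With a zero threshold, as used in the experiments below, any induced subgraph of positive total weight counts as a candidate alignment, and the search returns the heaviest one.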
5 Experimental Results

In order to assess the performance of our approach, we use the state-of-the-art framework for comparative network analysis proposed in [9], called MaWish (see footnote 1). The two following PPI networks, already compared in [9], are considered: Saccharomyces cerevisiae and Caenorhabditis elegans, which were obtained from the BIND [2] and DIP [23] molecular interaction databases. The corresponding networks consist of 5157 proteins and […] interactions, and 3345 proteins and 5988 interactions, respectively. ParseBlast is used for measuring orthology and paralogy. Potential orthologous pairs created by 792 proteins in S. cerevisiae and 633 proteins in C. elegans are identified.

5.1 Divide phase

Results of the application of the Divide algorithm to these networks are summarized as follows.

For Saccharomyces cerevisiae, 697 articulations, of which 151 are orthologs, and 83 centrums are identified. Expansion of these centrums into hyper-trees results in 639 covered orthologs. The algorithm assigns the remaining 153 orthologous proteins to 152 new hyper-trees.

For Caenorhabditis elegans, 586 articulations, of which 158 are orthologs, are computed, and 112 centrums are constructed from them. Expansion of these centrums into hyper-trees results in 339 covered orthologs. The algorithm assigns the remaining 294 orthologous proteins to 288 new hyper-trees.

Footnote 1: Available at

We observe that the last remaining orthologs assigned to hyper-trees are isolated nodes, in the sense that they are rather distant from each other and not reachable by orthology paths stemming from centrums.

5.2 Alignment phase

We obtain 235 hyper-trees for Saccharomyces cerevisiae and 400 hyper-trees for Caenorhabditis elegans. By constructing alignment graphs between each two hyper-trees containing more than one ortholog pair, we obtain 884 alignment graphs, where the biggest one consists of only 31 nodes. For each of these alignment graphs, the maximum weighted induced subgraph is computed by exact optimization. A weight threshold of zero is used for considering an induced subgraph a legal alignment. Redundant graphs are filtered using r = 80% as the threshold for redundancy. In this way DivA discovers 72 alignments.

5.3 Comparison between DivA and MaWish

We performed network alignment with MaWish using parameter values as reported in [10]. The algorithm discovered 83 conserved subnets.

Figure 5: Left figure ("DivA vs MaWish redundant alignments between S. cerevisiae and C. elegans"): distribution of pairs of weights of paired redundant alignments, one obtained from MaWish and one from DivA. Weights of alignments found by DivA are on the x-axis, those found by MaWish on the y-axis. Right figure ("DivA vs MaWish nonredundant alignments between S. cerevisiae and C. elegans"): interval weight distributions of non-redundant alignments discovered by MaWish (blue bars) and DivA (green bars). The x-axis shows the weight intervals (0.0, 0.5], (0.5, 1.0], (1.0, 1.5], (1.5, 2.0] and 2.0 < weight, the y-axis the number of alignments in each interval.

A paired redundant alignment is a pair (G1, G2) of alignments, with G1 discovered by DivA and G2 discovered by MaWish, such that either G1 is redundant for G2 or vice versa. For a paired redundant alignment (G1, G2) we say that G1 refines G2 if the weight of G1 is bigger than the weight of G2.
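The redundancy test and the pairing of DivA and MaWish alignments can be sketched as below. This is a minimal illustration under stated assumptions: an alignment is represented as a (node set, weight) tuple, "contains r% of its nodes" is read as "at least r%", and the names `is_redundant` and `paired_redundant` are hypothetical. The comparison uses integer arithmetic on percentages to avoid floating-point rounding at the threshold.

```python
def is_redundant(g1, g2, r_percent=80):
    """True if g2 contains at least r_percent of the nodes of g1.

    g1 and g2 are sets of alignment-graph nodes.
    """
    return 100 * len(g1 & g2) >= r_percent * len(g1)


def paired_redundant(diva, mawish, r_percent=80):
    """List the paired redundant alignments and compare their weights.

    diva, mawish: lists of (node_set, weight) tuples.
    Returns tuples (i, j, relation), where i indexes diva, j indexes
    mawish, and relation is 'diva_better' (the DivA alignment refines the
    MaWish one), 'mawish_better' or 'equal'.
    """
    result = []
    for i, (nodes1, w1) in enumerate(diva):
        for j, (nodes2, w2) in enumerate(mawish):
            if is_redundant(nodes1, nodes2, r_percent) or is_redundant(nodes2, nodes1, r_percent):
                if w1 > w2:
                    relation = "diva_better"
                elif w1 < w2:
                    relation = "mawish_better"
                else:
                    relation = "equal"
                result.append((i, j, relation))
    return result
```

Note that the relation is not symmetric: a small alignment can be redundant for a larger one that contains it, while the larger one is not redundant for the smaller.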
DivA finds 14 new alignments not detected by MaWish. Figure 6 shows the best new alignment found by DivA (left) and the alignment of DivA which best refines an alignment of MaWish (right). There are 58 paired redundant alignments, whose weights are plotted in the left part of Figure 5. Among these, 40 (55.6%) are equal (red crosses on the diagonal) and 18 (25%) are different: 5 (6.9%) have a better DivA alignment weight (green crosses below the diagonal) and 12 (16.7%) have a better MaWish alignment weight (blue crosses above the diagonal); for 1 pair the comparison is undecidable because of rounding errors during computation. The right plot of Figure 5 shows the binned distribution of weights of the 14 alignments (19.4%) found by DivA but not by MaWish, and of the 28 found by MaWish but not by DivA. The overall weight average of the former (1.197) is greater than the overall average of the latter (0.7501), indicating the ability of DivA to find high-scoring subnets, possibly due to the exact search strategy used.

Figure 6: Left: the best new alignment. Right: the refined alignment with the greatest weight. In both panels, red dashed lines mark orthologous pairs and green lines are protein-protein interactions. A loop on a protein means duplication.

Of the 14 new alignments detected by DivA, 8 have an intersection with a true MIPS complex (cf. Table 1 in the Appendix). Three of these alignments coincide with a (sub)module of their true S. cerevisiae complex. By considering all alignments of MaWish and DivA and filtering out the redundant ones, the performance of MaWish is improved by 26%. In particular, conserved modules of three new true MIPS classes are detected: replication fork complexes, mRNA splicing and SCF-MET30 complex. Moreover, the alignment by MaWish which covers 50% of the true complex Casein kinase II (this complex consists of 4 proteins) is refined by DivA in such a way that the entire true complex is covered (all four proteins). These results show that DivA can be successfully applied to refine state-of-the-art algorithms for network alignment.
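The counts reported in this section can be cross-checked with a few lines of Python. The sketch below assumes that the percentages are taken relative to the 72 alignments discovered by DivA; the text does not state the denominator explicitly, so this is an inference from the numbers. All variable names are chosen here for illustration.

```python
# Counts quoted in Sections 5.1 to 5.3.
diva_alignments = 72
paired_redundant_count = 58
equal, different = 40, 18
diva_better, mawish_better, undecided = 5, 12, 1
new_in_diva = 14


def share(count, total=diva_alignments):
    """Percentage of `total`, rounded to one decimal as in the text."""
    return round(100 * count / total, 1)


# Percentages as printed in the text, keyed by the corresponding count.
reported = {equal: 55.6, different: 25.0, diva_better: 6.9,
            mawish_better: 16.7, new_in_diva: 19.4}
recomputed = {count: share(count) for count in reported}

# Hyper-tree and ortholog totals per species: (from centrums, new, total).
hyper_trees = {"S. cerevisiae": (83, 152, 235), "C. elegans": (112, 288, 400)}
orthologs = {"S. cerevisiae": (639, 153, 792), "C. elegans": (339, 294, 633)}
totals_consistent = all(a + b == total
                        for table in (hyper_trees, orthologs)
                        for (a, b, total) in table.values())
```

Under this assumption the recomputed percentages match the printed ones, the paired redundant and new alignments add up to the 72 DivA alignments, and the centrum-based and newly created hyper-trees add up to the 235 and 400 hyper-trees of Section 5.2.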
The selection of centrums is based on orthology information, but this can be replaced by another property. Hence the Divide algorithm can be applied to perform modular network alignment of other types of networks.

Finally, we considered here an instance of our approach based on the evolutionary comparative model by Koyutürk et al. [10]. We intend to analyze instances of our approach based on other methods, such as [16].

Acknowledgments

We would like to thank Mehmet Koyutürk for discussion on the MaWish code.

7 Appendix

Of the 14 new alignments detected by DivA, 8 have an intersection with a true MIPS complex (cf. Table 1). Three of these alignments (6., 12. and 14.) coincide with a (sub)module of their true S. cerevisiae complex.

Align.  Score  Size  M  MIPS category               Intersection  log(HG)
                        proteasome                  100%
                        19/22S regulator            100%
                        19/22S regulator            50%
                        proteasome                  100%
                        Replication fork complexes  100%
                        19/22S regulator            100%
                        19/22S regulator            50%
                        19/22S regulator            50%           1.71

Table 1: HG = hypergeometric, Size = number of alignment nodes of an alignment, N = number of proteins of alignment nodes which are annotated in the best (according to hypergeometric score) true S. cerevisiae MIPS complex of the alignment, M = number of proteins of alignment nodes in S. cerevisiae, Intersection = N / M.

Of the refined alignments, three have an intersection with a true MIPS complex.

Align.  Score  Size  M  MIPS category     Intersection  log(HG)
                        Cdc28p complexes  10%
                        Casein kinase II  100%
                        SNF1 complex      50%           2.16

Table 2: True complexes associated to MaWish refined alignments.

Align.  Score  Size  M  MIPS category     Intersection  log(HG)
                        Cdc28p complexes  9%
                        Casein kinase II  100%
                        SNF1 complex      50%           2.16

Table 3: True complexes associated to DivA refined alignments.

Note that alignments 1. and 3. in both Table 2 and Table 3 have equal hypergeometric score, showing that the coverage, that is, the number of proteins of an alignment contained in its best true MIPS module, does not change. Alignment 2. in Table 2 covers 50% of the true complex, while its refinement in Table 3 covers the entire true complex (Casein kinase II, consisting of 4 proteins).

References

[1] Amine Abou-Rjeili and George Karypis. Multilevel algorithms for partitioning power-law graphs. In 20th International Parallel and Distributed Processing Symposium (IPDPS).
[2] Gary D. Bader, Ian Donaldson, Cheryl Wolting, B. F. Francis Ouellette, Tony Pawson, and Christopher W. V. Hogue. BIND: the biomolecular interaction network database. Nucleic Acids Res, 29(1), January.
[3] Gary D. Bader, Michael Lässig, and Andreas Wagner. Structure and evolution of protein interaction networks: a statistical model for link dynamics and gene duplications. BMC Evolutionary Biology, 4(51).
[4] Diana Ekman, Sara Light, Åsa K. Björklund, and Arne Elofsson. What properties characterize the hub proteins of the protein-protein interaction network of Saccharomyces cerevisiae? Genome Biology, 7(6):R45.
[5] Jason Flannick, Antal Novak, Balaji S. Srinivasan, Harley H. McAdams, and Serafim Batzoglou. Graemlin: General and robust alignment of multiple large interaction networks. Genome Res., 16(9).
[6] John Hopcroft and Robert Tarjan. Algorithm 447: efficient algorithms for graph manipulation. Commun. ACM, 16(6).
[7] Hawoong Jeong, Sean P. Mason, Albert-László Barabási, and Zoltán N. Oltvai. Lethality and centrality in protein networks. Nature, 411:41.
[8] B. P. Kelley, R. Sharan, R. M. Karp, T. Sittler, D. E. Root, B. R. Stockwell, and T. Ideker. Conserved pathways within bacteria and yeast as revealed by global protein network alignment. Proceedings of the National Academy of Science, 100, September.
[9] M. Koyutürk, A. Grama, and W. Szpankowski. Pairwise local alignment of protein interaction networks guided by models of evolution. In RECOMB, volume 3500 of Lecture Notes in Bioinformatics. Springer Berlin / Heidelberg, May.
[10] M. Koyutürk, Y. Kim, U. Topkara, S. Subramaniam, A. Grama, and W. Szpankowski. Pairwise alignment of protein interaction networks. Journal of Computational Biology, 13(2).
[11] Xiao-Li Li, Soon-Heng Tan, Chuan-Sheng Foo, and See-Kiong Ng. Interaction graph mining for protein complexes using local clique merging. Genome Informatics, 16(2).
[12] Sergei Maslov and Kim Sneppen. Specificity and stability in topology of protein networks. Science, 296.
[13] N. Pržulj. Knowledge Discovery in Proteomics: Graph Theory Analysis of Protein-Protein Interactions. CRC Press.

[14] N. Pržulj, D. Wigle, and I. Jurisica. Functional topology in a network of protein interactions. Bioinformatics, 20(3).
[15] A. J. Rathod and Ch. Fukami. Mathematical properties of networks of protein interactions. CS374 Fall 2005 Lecture 9, Computer Science Department, Stanford University.
[16] R. Sharan, T. Ideker, B. P. Kelley, R. Shamir, and R. M. Karp. Identification of protein complexes by comparative analysis of yeast and bacterial protein interaction data. Journal of Computational Biology, 12(6).
[17] Roded Sharan and Trey Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology, 24(4), April.
[18] Rohit Singh, Jinbo Xu, and Bonnie Berger. Global alignment of multiple protein interaction networks. In RECOMB.
[19] Balaji S. Srinivasan, Nigam H. Shah, Jason Flannick, Eduardo Abeliuk, Antal Novak, and Serafim Batzoglou. Current Progress in Network Research: toward Reference Networks for Key Model Organisms. Brief. in Bioinformatics, Advance access.
[20] Robert Tarjan. Depth-first search and linear graph algorithms. SIAM Journal on Computing, 1(2).
[21] Duygu Ucar, Sitaram Asur, Umit Catalyurek, and Srinivasan Parthasarathy. Improving functional modularity in protein-protein interactions graphs using hub-induced subgraphs. In 10th European Conference on Principle and Practice of Knowledge Discovery in Database (PKDD), Berlin, Germany, September.
[22] Laurence A. Wolsey. Integer Programming. Wiley-Interscience, 1 edition, September.
[23] Ioannis Xenarios, Lukasz Salwínski, Xiaoqun Joyce Duan, Patrick Higney, Sul-Min Kim, and David Eisenberg. DIP, the database of interacting proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Research, 30(1), January.
[24] Soon-Hyung Yook, Zoltán N. Oltvai, and Albert-László Barabási. Functional and topological characterization of protein interaction networks. Proteomics, 4.
