Dividing Protein Interaction Networks by Growing Orthologous Articulations

Size: px
Start display at page:

Download "Dividing Protein Interaction Networks by Growing Orthologous Articulations"

Transcription

1 Dividing Protein Interaction Networks by Growing Orthologous Articulations Pavol Jancura, Jaap Heringa, Elena Marchiori Department of Computer cience, IBIVU, Vrije Universiteit Amsterdam he Netherlands November 5, 2007 Abstract he increasing growth of data on protein-protein interaction (PPI) networks has boosted research on their comparative analysis. In particular, recent studies proposed models and algorithms for performing network alignment, the comparison of networks across species for discovering conserved modules. Common approaches for this task construct a merged representation of the considered networks, called alignment graph, and search the alignment graph for conserved networks of interest using greedy techniques. In this paper we propose an alternative approach for this task. First, each network to be compared is divided into small subnets which are likely to contain conserved modules. Next, network alignment is performed on pairs of resulting subnets from different species, by using an exact algorithm for searching. We develop an algorithm for dividing PPI networks that combines a graph theoretical property (articulation) with a biological one (orthology). Results of experiments show the ability of this approach to discover accurate conserved modules, and substantiate the importance of the notions of orthology and articulation for performing comparative network analysis in a modular fashion. 1 Introduction With the exponential increase of data on protein interactions obtained from advanced technologies, data on thousands of interactions in human and most model species have become available (e.g. [2, 23]). PPI networks offer a powerful representation for better understanding modular organization of cells, for predicting biological functions and for providing insight into a variety of biochemical processes. Recent studies consider a comparative approach for the analysis of PPI networks from different species in order to discover common protein groups which are likely to be related to shared relevant functional modules [8, 17, 19]. his problem is also known as pairwise network alignment. Algorithms for this task typically construct a merged graph representation of the networks to be compared, called alignment (or orthology) graph, and model the problem as an optimization problem on such graph. Due to the computational intractability of such optimization problem, greedy algorithms are commonly used [9, 16]. Corresponding author. 1

2 1.1 Problem tatement Conserved modules, discovered by computational techniques such as [9], have in general small size compared to the size of the PPI network they belong to. Moreover, as indicated by recent studies, hubs whose removal disconnects the PPI network (articulation hubs) are likely to appear in conserved interaction patterns [10, 13]. hese observations motivate the focus of this paper on the problem of dividing a pair of PPI networks into small subnets which are expected to cover conserved modules, with the goal of performing modular network alignment. 1.2 Contributions We introduce an heuristic algorithm for dividing a PPI network into subnets, which combines biological (orthology) and graph theoretical (articulation) information. he algorithm starts by identifying groups of orthologous articulations, called centrums, which are expanded into subsets consisting of orthologous nodes. he algorithm automatically determines the number of subsets and has the desirable property of being parameterless. We use this algorithm for performing network alginment in a modular way, by merging pairs of resulting subnets from different species, and applying exact optimization for searching conserved modules across species. In order to test the performance of this approach, we consider an instance of the method that uses a state-of-the-art evolution-based alignment graph model [9]. Results of experiments show effectiveness of the proposed approach, which is capable of detecting accurate conserved complexes. Furthermore, we show that improved performance can be achieved by merging modules detected with our algorithm with those identified by Koyuturk et al. algorithm [9]. In general, these results substantiate the important role of the notions of orthology and articulation in modular comparative PPI network analysis. 1.3 Related Work Recent overviews of approaches and issues in comparative biological networks analysis are presented in [17, 19]. Based on the general formulation of network alignment proposed in [8], a number of techniques for (local and global) network alignment have been introduced ([9, 16, 5, 18]). echniques for local network alignment commonly construct an orthology graph, which provides a merged representation of the given PPI networks, and search for conserved subnets using greedy techniques ([9],[16],[5]). In particular, in [5], d-clusters are defined for searching efficiently between a pair of networks, where a d-cluster consists of d proteins that are close together in the network, and d is a user-given parameter. Another parameter is used for identifying pairs of d-clusters, one from each network, called seeds, which provide starting regions of the alignment graph to be expanded. he algorithm searches for modules conserved across species by expanding these seeds using a greedy technique similar that used in [9],[16]. Both d-clusters and centrums are initial groups of proteins used for network alignment, however their definition and use is different. In [5], each node is expanded into a d-cluster, while the number and size of centrums is automatically computed. Moreover, while d-clusters are used for generating starting regions of the alignment graph to be expanded, centrums are used for dividing each network prior to alignment. 2

3 While the above algorithms focus on network alignment, we focus on modular network alignment. o the best of our knowledge, we propose the first algorithm which directly tackles the modularity issue in network alignment. Many papers have investigated the importance of hubs in PPI networks and functional groups [13, 14, 15, 7, 4, 21]. In particular, it has been shown that hubs with a central role in the network architecture are three times more likely to be essential than proteins with only a small number of links to other proteins [7]. Moreover, if we take functional groups in PPI networks, then, amongst all functional groups, cellular organization proteins have the largest presence in hubs whose removal disconnects the network [13]. Computational techniques for identifying functional modules in PPI networks generally search for clusters of proteins forming dense components [3, 11]. he scale-free topology of PPI networks makes difficult to isolate modules hidden inside the central core [24]. In [1] several multi-level graph partitioning algorithms are described addressing the difficulty of partitioning scale-free graphs. he approach we propose differs from the above mentioned works because it does not address (directly) the problem of identifying functional modules in a PPI network, but uses homology information and articulations for dividing PPI networks into subnets in order to perform network alignment in a modular fashion. 2 Graph heoretic Background Given a graph G = (V, E), nodes joined by an edge are called adjacent. A neighbor of a node u is a node adjacent to u. he degree of u is the number of elements in E containing the vertex u. Let G(V, E) be a connected undirected graph. A vertex v V is called articulation if the graph resulting by removing this vertex from G and all its edges, is not connected. A tree is a connected graph which is not connected anymore if any edge is removed from it. A tree is called rooted tree if one vertex of has been designated as the root, in which case the edges have a natural orientation, towards or away from the root. A tree without any designated root is called free tree. Given a rooted tree (V, E), the depth of a vertex v V is the number of edges from the root of tree to v. Leafs of the tree are vertices which do not have any neighbor in the tree with higher depth. he depth of a tree is the highest depth of its leafs. A spanning tree (U, F ) of a connected undirected graph G(V, E) is a tree where U = V and F E. uppose a connected undirected graph G(V, E) is given. An hyper-root of a connected undirected graph G(V, E) is a set of vertices forming a free subtree of G. A spanning tree constructed from an hyper-root, called hyper-tree, can be generated as follows. Add all neighbors of vertices in the hyper-root which are not in the hyper-root and connect them to the hyper-root. his generates the first depth of the tree. Continue the tree construction by adding nodes of each successive depth iteratively, where the 0-depth of the hyper-tree is the hyper-root, and the i-th depth of the hyper-tree consists of all neighbors of (i 1)-th depth which are not yet in any lower depth of the hyper-tree yet. ince G is connected, the above procedure eventually covers all vertices of G, hence we obtain a spanning tree of G with the considered hyper-root. Figure 1 shows examples of hyper-trees of a given graph. 3

4 If the hyper-root contains one vertex such that all other vertices of the hyper-root create the first depth of the subtree then we call such hyper-root centrum. Hyper tree Hyper tree with centrum Figure 1: Examples of hyper-trees in the same graph. he red nodes and edges create the hyper-root. he rest of nodes and blue edges create other depths of hyper-tree. In the second hyper-tree one can observe the centrum as a hyper-root. 3 From Orthologous Articulations through Centrums to Hyper- rees A PPI network is represented by an undirected graph G(U, E). U denotes the set of proteins and E denotes set of edges, where an edge uu E represents the interaction between u U and u U. Given a PPI network G, we generate hyper-roots from orthologous articulations, and expand them into (hyper-)subtrees containing only orthologous proteins. he resulting algorithm, called Divide, is sketched in pseudo-code in Algorithm 1, and described in more detail below. An orthology path is a path containing only orthologous proteins. Computing articulations (Line 1). Computation of articulations can be performed in linear time by using, e.g., arjan s algorithm described in [20] or [6]. he degree (in G) of all orthologous articulations is then used for selecting seeds for the construction of centrums. Constructing centrums (Lines 3-10). Networks with scale-free topology appear to have edges between hubs systematically suppressed, while those between a hub and a lowconnected protein seem favored [12]. Guided by this result, we construct centrums by joining one orthologous articulation hub with its orthologous articulation neighbors, which will more likely have low degree. pecifically, let A be the set of orthologous articulations of G. he first centrum is the hyper-root consisting of the element of A with highest degree and all its neighbors in A. he other centrums are hyper-roots generated iteratively by considering, at each iteration, the element of A with highest degree among those which do not occur in any of the centrums constructed so far, together with all its neighbors in A which do not already occur in any 4

5 other centrum. he process terminates when all elements of A are in at least one centrum. hen an unambiguous label is assigned to each centrum. Observe that by construction the resulting hyper-roots are centrums. Figure 2: Examples of centrums of hyper-trees (left figure) and of one-depth hyper-trees (right figure). eeds of centrums are inside square markers. Magenta nodes are the rest of centrums connected to a seed by red edges. Green nodes are orthologous proteins which are not articulations. Black nodes are non-orthologous proteins. In the second (right) graph green edges connect nodes of centrums (zero depth hyper-trees) with nodes of the first depth hyper-trees. Nodes on the yellow background indicate the overlap among hyper-trees. Constructing First Depth Hyper-rees (Lines 11-16). By construction, centrums cover all orthologous articulations. Articulation hubs are often present in conserved subnets detected by means of comparative methods such as [9]. herefore, assuming that the majority of the remaining (that is non-articulation) nodes in the alignment graph are neighbors of articulation hubs, we add to each centrum all its neighboring ortholog proteins, regardless whether they are or not articulations. We mark these new added proteins with the label of the centrums to which they have been added. hese new added proteins form the first depth hyper-trees. Observe that there may be a non-empty overlap between first depth hyper-trees, with a maximal 2-nodes overlap (as illustrated in the right part of Figure 2). Expanding Hyper-rees (Lines 17-27) uccessive depths of hyper-trees are generated by expanding all nodes with only one label which occur in the last depth of each (actual) hyper-tree. We add to the corresponding hyper-trees all orthologous neighbors of these nodes which are not yet labelled. hen we assign to the newly added nodes the labels of the hypertrees they belong to. his process is repeated until it is possible to add unlabelled orthologous proteins to hyper-trees. Observe that each iteration yields at most a 1-node overlap between newly created depths. Assigning Remaining Nodes to Hyper-rees (Lines 28-42). he remaining orthologous nodes, that is, those not yet labelled, are processed as follows. First, unlabelled nodes which are neighbors of multi-labelled nodes are added to the corresponding hyper-trees. hen the newly added nodes are marked with these labels. his process is iterated until there are no unlabelled neighbors of multi-labelled nodes. 5

6 One node overlap wo node overlap Figure 3: Examples of an 1-node and 2-node overlap. Green nodes create the overlap. In the two node overlap the marking with or of a node means that the node is a leaf of tree or tree, respectively. A green edge is the edge of the intersection of trees and. Figure 4 illustrates an example of this situation. Figure 4: he left figure shows an example of non-reachable orthologous proteins from proteins with one label. Red nodes denote multi-labelled orthologous proteins, blue ones one-labelled orthologous proteins. Green nodes denote unlabelled orthologous proteins. Green edges indicate ortholog paths. he right figure shows an example of orthologous proteins (green ones) which are not reachable from any hyper-tree. Red nodes and edges represent actual hyper-trees constructed from centrums. Green nodes together with green edges form the new hyper-trees. Black nodes in both graphs indicate non-orthologous proteins. Nodes which are not neighbors of any labelled protein are still unlabelled. We assume that they may possibly be part of conserved complexes which do not contain articulations. o we create new hyper-trees by joining together all unlabelled orthologous neighbor proteins (see the right part of Figure 4). 4 Divide and Align Algorithm he Divide algorithm divides orthologous proteins of the PPI network into hyper-trees. uch pairs of hyper-trees from distinct species can be merged into orthology graphs in order to 6

7 Algorithm 1 Divide algorithm Input: G: PPI network, O: orthologous nodes of G Output: : list of subsets of O 1: A = { orthologous articulations of G} 2: =<> 3: repeat {Construction of centrums} 4: root = element of A with highest degree not already occurring in 5: s = {root} { neighbors of root in A not already occurring in } 6: =< s, > 7: until all members of A occur in 8: d = 0 9: Assign depth d to all elements of 10: Assign label l s to each s in and to all its elements 11: for s in do 12: s = s { all neighbors of s in O} 13: Assign label l s to all neighbors of s in O 14: end for 15: d = 1 16: Assign depth d to all elements of having yet no depth assigned 17: repeat {Expand one depth hyper-trees from nodes with one label} 18: N = { unlabelled neighbors in O of elements in s of depth d having only one label } 19: for n in N do 20: Assign to n all labels of its neighbors of detph d having only one label 21: for l s n do 22: s = s {n} 23: end for 24: end for 25: d = d : Assign depth d to all elements of having yet no depth assigned 27: until does not change 28: repeat {Expand hyper-trees from nodes multiple labels} 29: R = { unlabelled proteins in O with at least one multi-labelled protein as neighbor } 30: for r in R do 31: Assign to r all labels of its neighbors 32: for l s r do 33: s = s {r} 34: end for 35: end for 36: until does not change 37: repeat 38: choose an unlabelled element u of O 39: t = {u} {all elements of O which can be reached alongside an orthology path from u} 40: Assign label l t to t and to all its elements 41: =< t, > 42: until O does not contain any unlabelled node 7

8 search for alignments corresponding to protein complexes conserved across species. o this aim we use a common approach, based on the construction of a weighted metagraph between two PPI networks of different species. In this metagraph each node corresponds to an homologous pair of proteins, one from each of the two PPI networks. he metagraph is called alignment or orthology graph. Weights are assigned either to edges, like in [9], or to nodes, like in [16], of the alignment graph using a scoring function. he function transforms conservation and eventually also evolution information to one real value for each edge or node. Induced subgraphs with total weight greater than a given threshold are considered as relevant alignments. In this way one gets two subsets of proteins from each discovered subgraph from the two species, and each such subset provides a conserved complex of proteins. he problem of finding induced subgraphs with weight greater than a given threshold is reduced in these methods to the problem of finding a maximal induced subgraph. hen an approximation greedy algorithm based on local search is used because the maximum induced subgraph problem is NP-complete (cf. [9]). We align pairs of hyper-trees from different species having more than one orthologous pair, yielding orthology graphs with more than one node. Because of the small size of the resulting subnets, we use exact optimization [22] for searching in each of such graphs, instead of greedy techniques employed in common approaches. We call the resulting algorithm DivA (Divide and Align). Finally, redundant alignments are filtered out as done in, e.g., [9]. A subgraph G 1 is said to be redundant if there exists another subgraph G 2 which contains r% of its nodes, where r is a threshold value that determines the extent of allowed overlap between discovered protein complexes. In such a case we say that G 1 is redundant for G 2. 5 Experimental Results In order to assess the performance of our approach, we use the state-of-the-art framework for comparative network analysis proposed in [9], called MaWish. 1. he two following PPI networks, already compared in [9], are considered: accharomyces cerevisiae and Caenorhabditis elegans, which were obtained from BIND [2] and DIP [23] molecular interaction databases. he corresponding networks consist of 5157 proteins and interactions, and 3345 proteins and 5988 interactions, respectively. ParseBlast is used for measuring orthology and paralogy potential orthologous pairs created by 792 proteins in. cerevisiae and 633 proteins in C. elegans are identified. 5.1 Divide phase Results of application of the Divide algorithm to these networks are summarized as follows. For accharomyces cerevisiae, 697 articulations, of which 151 orthologs, and 83 centrums are identified. Expansion of these centrums into hyper-tree results in 639 covered orthologs. he algorithm assigns the remaining 153 orthologous proteins to 152 new hyper-trees. For Caenorhabditis elegans, 586 articulations, of which 158 orthologs, are computed, and 112 centrums are constructed from them. Expansion of these centrums into hyper-tree results in 339 covered orthologs. he algorithm assigns the remaining orthologous 294 proteins to 288 new hyper-trees. 1 Available at 8

9 We observe that the last remaining orthologs assigned to hyper-trees are isolated nodes, in the sense that they are rather distant from each other and not reachable from ortholog paths stemming from centrums. 5.2 Alignment phase We obtain 235 hyper-trees for accharomyces cerevisiae and 400 hyper-trees of Caenorhabditis elegans. By constructing alignment graphs between each two hyper-trees containing more than one ortholog pair, we obtain 884 alignment graphs, where the biggest one consists of only 31 nodes. For each of such alignment graphs, the maximum weighted induced subgraph is computed by exact optimization. Zero weight threshold is used for considering an induced subgraph a legal alignment. Redundant graphs are filtered using r = 80% as the treshold for redundancy. In this way DivA discovers 72 alignments. 5.3 Comparison between DivA and MaWish We performed network alginment with MaWish using parameter values as reported in [10]. he algorithm discovered 83 conserved subnets. DivA vs MaWish redundant alignments between. cerevisiae and C. elegans weight weight number of alignments DivA vs MaWish nonredundant alignments between. cerevisiae and C. elegans (0.0, 0.5] (0.5, 1.0] (1.0, 1.5] (1.5, 2.0] 2.0 < weight Figure 5: Left figure: Distribution of pairs of weights of paired redundant alignments, one obtained from MaWish and one from DivA. Weights of alignments found by DivA are on the x-axis, those found by MaWish on the y-axis. Right figure: Interval weight distributions of non-redundant alignments discovered by MaWish (blue bars) and DivA (green bars). he x-axis show weight intervals, the y-axis the number of alignments in each interval. A paired redundant alignment is a pair (G 1, G 2 ) of alignments, with G 1 discovered by DivA and G 2 discovered by MaWish, such that either G 1 is redundant for G 2 or vice versa. For a paired redundant alignment (G 1, G 2 ) we say that G 1 refines G 2 if the weigth of G 1 is bigger than the weight of G 2. DivA finds 14 new alignments not detected by MaWish. Figure 6 shows the best new alignment found by DivA (left) and the alignment of DivA which best refines an alignment of MaWish. here are 58 paired redundant alignments, whose weigts are plotted in the left part of Figure 5. Among these, 40 (55.6%) are equal (red crosses in the diagonal), and 18 (25%) different. 5 (6.9%) (green crosses below the diagonal) with better DivA alignment weight, 9

10 and 12 (16.7%) (blue crosses above the diagonal) with better MaWish alignment weight (for 1 pair it is undecidable because of rounding errors during computation). he right plot of Figure 5 shows the binned distribution of weights of the 14 (19.4%) found by DivA but not MaWish, and 28 found by MaWish and not by DivA. he overall weight average of the DivA ones (1.197) is greater than the overall average of the MaWish ones (0.7501), indicating the ability of DivA to find high score subnets, possibly due to the exact search strategy used. Figure 6: Left: he best new alignment. Red dash lines mark orthologous pairs. Green line is protein-protein interaction. Right: he refined alignment with the greatest weight. Red dash lines mark orthologous pairs. Green line is protein-protein interaction. A loop on a protein means duplication. Of the 14 new alignments detected by DivA, 8 of them have a intersection with a true MIP complex (cf. able 1 in Appendix). hree of these alignments have equal (sub)module in their true. cerevisae complex. By considering all alignments of MaWish and DivA and filter out the redundant ones, performance of MaWish is improved by 26%. In particular, conserved modules of three new true MIP classes are detected: replication fork complexes, mrna splicing, CF-ME30 complex. Moreover, the alignment by MaWish which covers 50% of the true complex Casein kinase II (this complex consists of 4 proteins) is refined by DivA in such a way that the entire true complex is covered (all four proteins). hese results show that DivA can be successfully applied to refine state-of-the-art algorithms for network alignment. 6 Conclusion he comparative experimental analysis with MaWish indicates that DivA is able to discover new alignments which are on average more conserved than those discovered by MaWish but not by DivA. Improved performance is shown to be achieved by combining results of MaWish and DivA, yielding new and refined alignments. he selection of centrums is biased on the orthology information but it can be changed for another property. Hence the divide algorithm can be applied to perform modular network alignment of other type of networks. Finally, we considered here an instance of our approach based on the evolutionary comparative model by Koyuturk et al. [10]. We intend to analyze instances of our approach based 10

11 on other methods, such as [16]. Acknowledgments We would like to thank Memhet Koyuturk for discussion on the MaWish code. 7 Appendix Of the 14 new alignments detected by DivA, 8 of them have a intersection with a true MIP complex (cf. able 1 in Appendix). hree of these alignments (6., 12. and 14.) have equal (sub)module in their true. cerevisae complex. Align. core ize M MIP category Intersection log(hg) proteasome 100(%) /22 regulator 100(%) /22 regulator 50(%) proteasome 100(%) Replication fork complexes 100(%) /22 regulator 100(%) /22 regulator 50(%) /22 regulator 50(%) 1.71 able 1: HG= hypergeometric, ize = number of alignment nodes of an alignment, N = number of proteins of alignment nodes which are annotated in the best (according to hypergeometric score) true. cerevisae s MIP complex of the alignment. M = number of proteins of alignment nodes in. cerevisae. Intersection = N / M From the refined alignments, three of them have intersection with a true MIP complex. Align. core ize M MIP category Intersection log(hg) Cdc28p complexes 10(%) Casein kinase II 100(%) NF1 complex 50(%) 2.16 able 2: rue complexes associated to MaWish refined alignments. Align. core ize M MIP category Intersection log(hg) Cdc28p complexes 9(%) Casein kinase II 100(%) NF1 complex 50(%) 2.16 able 3: rue complexes associated to DivA refined alignments. Note that alignments 1. and 3. in both able 2 and 3 have equal hypergeometric score, showing that the coverage, that is, number of proteins of an alignment contained in its best true MIP module, does not change. Alignment 2. in able 2 covers 50% of the true complex, while its refinement in able 3 covers the entire true complex (Casein kinase II, consisting of 4 proteins). 11

12 References [1] Amine Abou-Rjeili and George Karypis. Multilevel algorithms for partitioning power-law graphs. In 20th International Parallel and Distributed Processing ymposium (IPDP), [2] Gary D. Bader, Ian Donaldson, Cheryl Wolting, B. F. Francis Ouellette, ony Pawson, and Christopher W. V. Hogue. Bind the biomolecular interaction network database. Nucleic Acids Res, 29(1): , January [3] Gary D. Bader, Michael Lssig, and Andreas Wagner. tructure and evolution of protein interaction networks: a statistical model for link dynamics and gene duplications. BMC Evolutionary Biology, 4(51), [4] Diana Ekman, ara Light, Åsa K Björklund, and Arne Elofsson. What properties characterize the hub proteins of the protein-protein interaction network of saccharomyces cerevisiae? Genome Biology, 7(6):R45, [5] Jason Flannick, Antal Novak, Balaji. rinivasan, Harley H. McAdams, and erafim Batzoglou. Graemlin: General and robust alignment of multiple large interaction networks. Genome Res., 16(9): , [6] John Hopcroft and Robert arjan. Algorithm 447: efficient algorithms for graph manipulation. Commun. ACM, 16(6): , [7] Hawoong Jeong, ean P. Mason, Albert-Laszlo Barabasi, and Zoltan N. Oltvai. Lethality and centrality in protein networks. NAURE v, 411:41, [8] B. P. Kelley, R. haran, R. M. Karp,. ittler, D. E. Root, B. R. tockwell, and. Ideker. Conserved pathways within bacteria and yeast as revealed by global protein network alignment. Proceedings of the National Academy of cience, 100: , eptember [9] M. Koyutürk, A. Grama, and W. zpankowski. Pairwise local alignment of protein interaction networks guided by models of evolution. In RECOMB, volume 3500 of Lecture Notes in Bioinformatics, pages pringer Berlin / Heidelberg, May [10] M. Koyutürk, Y. Kim, U. opkara,. ubramaniam, A. Grama, and W. zpankowski. Pairwise alignment of protein interaction networks. Journal of Computional Biology, 13(2): , [11] Xiao-Li Li, oon-heng an, Chuan-heng Foo, and ee-kiong Ng. Interaction graph mining for protein complexes using local clique merging. Genome Informatics, 16(2): , [12] ergei Maslov and Kim neppen. pecificity and stability in topology of protein networks. cience, 296: , [13] N. Pržulj. Knowledge Discovery in Proteomics: Graph heory Analysis of Protein- Protein Interactions. CRC Press,

13 [14] N. Pržulj, D. Wigle, and I. Jurisica. Functional topology in a network of protein interactions. Bioinformatics, 20(3): , [15] A. J. Rathod and Ch. Fukami. Mathematical properties of networks of protein interactions. C374 Fall 2005 Lecture 9, Computer cience Department, tanford University, [16] R. haran,. Ideker, B. P. Kelley, R. hamir, and R. M. Karp. Identification of protein complexes by comparative analysis of yeast and bacterial protein interaction data. Journal of Computional Biology, 12(6): , [17] Roded haran and rey Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology, 24(4): , April [18] Rohit ingh, Jinbo Xu, and Bonnie Berger. Global alignment of multiple protein interaction networks. In RECOMB, [19] Balaji. rinivasan, Nigam H. hah, Jason Flannick, Eduardo Abeliuk, Antal Novak, and erafim Batzoglou. Current Progress in Network Research: toward Reference Networks for kkey Model Organisms. Brief. in Bioinformatics, Advance access. [20] Robert arjan. Depth-first search and linear graph algorithms. IAM Journal on Computing, 1(2): , [21] Duygu Ucar, itaram Asur, Umit Catalyurek, and rinivasan Parthasarathy. Improving functional modularity in protein-protein interactions graphs using hub-induced subgraphs. In 10th European Conference on Principle and Practice of Knowledge Discovery in Database (PKDD), Berlin, Germany, eptember [22] Laurence A. Wolsey. Integer Programming. Wiley-Interscience, 1 edition, eptember [23] Ioannis Xenarios, alwínski, Xiaoqun Joyce Duan, Patrick Higney, ul-min Kim, and David Eisenberg. Dip, the database of interacting proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Research, 30(1): , January [24] oon-hyung Yook, Zoltn N. Oltvai, and Albert-Lszl Barabsi. Functional and topological characterization of protein interaction networks. PROEOMIC, 4: ,

Network alignment and querying

Network alignment and querying Network biology minicourse (part 4) Algorithmic challenges in genomics Network alignment and querying Roded Sharan School of Computer Science, Tel Aviv University Multiple Species PPI Data Rapid growth

More information

Bioinformatics: Network Analysis

Bioinformatics: Network Analysis Bioinformatics: Network Analysis Comparative Network Analysis COMP 572 (BIOS 572 / BIOE 564) - Fall 2013 Luay Nakhleh, Rice University 1 Biomolecular Network Components 2 Accumulation of Network Components

More information

Towards Detecting Protein Complexes from Protein Interaction Data

Towards Detecting Protein Complexes from Protein Interaction Data Towards Detecting Protein Complexes from Protein Interaction Data Pengjun Pei 1 and Aidong Zhang 1 Department of Computer Science and Engineering State University of New York at Buffalo Buffalo NY 14260,

More information

Network Biology: Understanding the cell s functional organization. Albert-László Barabási Zoltán N. Oltvai

Network Biology: Understanding the cell s functional organization. Albert-László Barabási Zoltán N. Oltvai Network Biology: Understanding the cell s functional organization Albert-László Barabási Zoltán N. Oltvai Outline: Evolutionary origin of scale-free networks Motifs, modules and hierarchical networks Network

More information

Analysis of Biological Networks: Network Robustness and Evolution

Analysis of Biological Networks: Network Robustness and Evolution Analysis of Biological Networks: Network Robustness and Evolution Lecturer: Roded Sharan Scribers: Sasha Medvedovsky and Eitan Hirsh Lecture 14, February 2, 2006 1 Introduction The chapter is divided into

More information

V 5 Robustness and Modularity

V 5 Robustness and Modularity Bioinformatics 3 V 5 Robustness and Modularity Mon, Oct 29, 2012 Network Robustness Network = set of connections Failure events: loss of edges loss of nodes (together with their edges) loss of connectivity

More information

Protein function prediction via analysis of interactomes

Protein function prediction via analysis of interactomes Protein function prediction via analysis of interactomes Elena Nabieva Mona Singh Department of Computer Science & Lewis-Sigler Institute for Integrative Genomics January 22, 2008 1 Introduction Genome

More information

Protein Complex Identification by Supervised Graph Clustering

Protein Complex Identification by Supervised Graph Clustering Protein Complex Identification by Supervised Graph Clustering Yanjun Qi 1, Fernanda Balem 2, Christos Faloutsos 1, Judith Klein- Seetharaman 1,2, Ziv Bar-Joseph 1 1 School of Computer Science, Carnegie

More information

Phylogenetic Analysis of Molecular Interaction Networks 1

Phylogenetic Analysis of Molecular Interaction Networks 1 Phylogenetic Analysis of Molecular Interaction Networks 1 Mehmet Koyutürk Case Western Reserve University Electrical Engineering & Computer Science 1 Joint work with Sinan Erten, Xin Li, Gurkan Bebek,

More information

Interaction Network Analysis

Interaction Network Analysis CSI/BIF 5330 Interaction etwork Analsis Young-Rae Cho Associate Professor Department of Computer Science Balor Universit Biological etworks Definition Maps of biochemical reactions, interactions, regulations

More information

Comparative Network Analysis

Comparative Network Analysis Comparative Network Analysis BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2016 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by

More information

10-810: Advanced Algorithms and Models for Computational Biology. microrna and Whole Genome Comparison

10-810: Advanced Algorithms and Models for Computational Biology. microrna and Whole Genome Comparison 10-810: Advanced Algorithms and Models for Computational Biology microrna and Whole Genome Comparison Central Dogma: 90s Transcription factors DNA transcription mrna translation Proteins Central Dogma:

More information

Robust Community Detection Methods with Resolution Parameter for Complex Detection in Protein Protein Interaction Networks

Robust Community Detection Methods with Resolution Parameter for Complex Detection in Protein Protein Interaction Networks Robust Community Detection Methods with Resolution Parameter for Complex Detection in Protein Protein Interaction Networks Twan van Laarhoven and Elena Marchiori Institute for Computing and Information

More information

An Efficient Algorithm for Protein-Protein Interaction Network Analysis to Discover Overlapping Functional Modules

An Efficient Algorithm for Protein-Protein Interaction Network Analysis to Discover Overlapping Functional Modules An Efficient Algorithm for Protein-Protein Interaction Network Analysis to Discover Overlapping Functional Modules Ying Liu 1 Department of Computer Science, Mathematics and Science, College of Professional

More information

17 Non-collinear alignment Motivation A B C A B C A B C A B C D A C. This exposition is based on:

17 Non-collinear alignment Motivation A B C A B C A B C A B C D A C. This exposition is based on: 17 Non-collinear alignment This exposition is based on: 1. Darling, A.E., Mau, B., Perna, N.T. (2010) progressivemauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One 5(6):e11147.

More information

Department of Computing, Imperial College London. Introduction to Bioinformatics: Biological Networks. Spring 2010

Department of Computing, Imperial College London. Introduction to Bioinformatics: Biological Networks. Spring 2010 Department of Computing, Imperial College London Introduction to Bioinformatics: Biological Networks Spring 2010 Lecturer: Nataša Pržulj Office: 407A Huxley E-mail: natasha@imperial.ac.uk Lectures: Time

More information

Cell biology traditionally identifies proteins based on their individual actions as catalysts, signaling

Cell biology traditionally identifies proteins based on their individual actions as catalysts, signaling Lethality and centrality in protein networks Cell biology traditionally identifies proteins based on their individual actions as catalysts, signaling molecules, or building blocks of cells and microorganisms.

More information

networks in molecular biology Wolfgang Huber

networks in molecular biology Wolfgang Huber networks in molecular biology Wolfgang Huber networks in molecular biology Regulatory networks: components = gene products interactions = regulation of transcription, translation, phosphorylation... Metabolic

More information

Evidence for dynamically organized modularity in the yeast protein-protein interaction network

Evidence for dynamically organized modularity in the yeast protein-protein interaction network Evidence for dynamically organized modularity in the yeast protein-protein interaction network Sari Bombino Helsinki 27.3.2007 UNIVERSITY OF HELSINKI Department of Computer Science Seminar on Computational

More information

A Multiobjective GO based Approach to Protein Complex Detection

A Multiobjective GO based Approach to Protein Complex Detection Available online at www.sciencedirect.com Procedia Technology 4 (2012 ) 555 560 C3IT-2012 A Multiobjective GO based Approach to Protein Complex Detection Sumanta Ray a, Moumita De b, Anirban Mukhopadhyay

More information

Algorithms to Detect Multiprotein Modularity Conserved During Evolution

Algorithms to Detect Multiprotein Modularity Conserved During Evolution Algorithms to Detect Multiprotein Modularity Conserved During Evolution Luqman Hodgkinson and Richard M. Karp Computer Science Division, University of California, Berkeley, Center for Computational Biology,

More information

MTopGO: a tool for module identification in PPI Networks

MTopGO: a tool for module identification in PPI Networks MTopGO: a tool for module identification in PPI Networks Danila Vella 1,2, Simone Marini 3,4, Francesca Vitali 5,6,7, Riccardo Bellazzi 1,4 1 Clinical Scientific Institute Maugeri, Pavia, Italy, 2 Department

More information

EFFICIENT AND ROBUST PREDICTION ALGORITHMS FOR PROTEIN COMPLEXES USING GOMORY-HU TREES

EFFICIENT AND ROBUST PREDICTION ALGORITHMS FOR PROTEIN COMPLEXES USING GOMORY-HU TREES EFFICIENT AND ROBUST PREDICTION ALGORITHMS FOR PROTEIN COMPLEXES USING GOMORY-HU TREES A. MITROFANOVA*, M. FARACH-COLTON**, AND B. MISHRA* *New York University, Department of Computer Science, New York,

More information

Protein-protein Interaction: Network Alignment

Protein-protein Interaction: Network Alignment Protein-protein Interaction: Network Alignment Lecturer: Roded Sharan Scribers: Amiram Wingarten and Stas Levin Lecture 7, May 6, 2009 1 Introduction In the last few years the amount of available data

More information

Lecture 10: May 19, High-Throughput technologies for measuring proteinprotein

Lecture 10: May 19, High-Throughput technologies for measuring proteinprotein Analysis of Gene Expression Data Spring Semester, 2005 Lecture 10: May 19, 2005 Lecturer: Roded Sharan Scribe: Daniela Raijman and Igor Ulitsky 10.1 Protein Interaction Networks In the past we have discussed

More information

Comparative Analysis of Modularity in Biological Systems

Comparative Analysis of Modularity in Biological Systems 2009 Ohio Collaborative Conference on Bioinformatics Comparative Analysis of Modularity in Biological Systems Xin Li 1, Sinan Erten 1, Gurkan Bebek,MehmetKoyutürk,JingLi 2 Email: xxl41@caseedu, msinane@gmailcom,

More information

Comparative Genomics: Sequence, Structure, and Networks. Bonnie Berger MIT

Comparative Genomics: Sequence, Structure, and Networks. Bonnie Berger MIT Comparative Genomics: Sequence, Structure, and Networks Bonnie Berger MIT Comparative Genomics Look at the same kind of data across species with the hope that areas of high correlation correspond to functional

More information

MULE: An Efficient Algorithm for Detecting Frequent Subgraphs in Biological Networks

MULE: An Efficient Algorithm for Detecting Frequent Subgraphs in Biological Networks MULE: An Efficient Algorithm for Detecting Frequent Subgraphs in Biological Networks Mehmet Koyutürk, Ananth Grama, & Wojciech Szpankowski http://www.cs.purdue.edu/homes/koyuturk/pathway/ Purdue University

More information

Bioinformatics 2. Yeast two hybrid. Proteomics. Proteomics

Bioinformatics 2. Yeast two hybrid. Proteomics. Proteomics GENOME Bioinformatics 2 Proteomics protein-gene PROTEOME protein-protein METABOLISM Slide from http://www.nd.edu/~networks/ Citrate Cycle Bio-chemical reactions What is it? Proteomics Reveal protein Protein

More information

GRAPH-THEORETICAL COMPARISON REVEALS STRUCTURAL DIVERGENCE OF HUMAN PROTEIN INTERACTION NETWORKS

GRAPH-THEORETICAL COMPARISON REVEALS STRUCTURAL DIVERGENCE OF HUMAN PROTEIN INTERACTION NETWORKS 141 GRAPH-THEORETICAL COMPARISON REVEALS STRUCTURAL DIVERGENCE OF HUMAN PROTEIN INTERACTION NETWORKS MATTHIAS E. FUTSCHIK 1 ANNA TSCHAUT 2 m.futschik@staff.hu-berlin.de tschaut@zedat.fu-berlin.de GAUTAM

More information

Proteomics. Yeast two hybrid. Proteomics - PAGE techniques. Data obtained. What is it?

Proteomics. Yeast two hybrid. Proteomics - PAGE techniques. Data obtained. What is it? Proteomics What is it? Reveal protein interactions Protein profiling in a sample Yeast two hybrid screening High throughput 2D PAGE Automatic analysis of 2D Page Yeast two hybrid Use two mating strains

More information

BIOINFORMATICS. Improved Network-based Identification of Protein Orthologs. Nir Yosef a,, Roded Sharan a and William Stafford Noble b

BIOINFORMATICS. Improved Network-based Identification of Protein Orthologs. Nir Yosef a,, Roded Sharan a and William Stafford Noble b BIOINFORMATICS Vol. no. 28 Pages 7 Improved Network-based Identification of Protein Orthologs Nir Yosef a,, Roded Sharan a and William Stafford Noble b a School of Computer Science, Tel-Aviv University,

More information

Algorithms and tools for the alignment of multiple protein networks

Algorithms and tools for the alignment of multiple protein networks Tel-Aviv University School of Computer Science Algorithms and tools for the alignment of multiple protein networks This thesis is submitted in partial fulfillment of the requirements toward the M.Sc. degree

More information

Constraint-based Subspace Clustering

Constraint-based Subspace Clustering Constraint-based Subspace Clustering Elisa Fromont 1, Adriana Prado 2 and Céline Robardet 1 1 Université de Lyon, France 2 Universiteit Antwerpen, Belgium Thursday, April 30 Traditional Clustering Partitions

More information

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor Biological Networks:,, and via Relative Description Length By: Tamir Tuller & Benny Chor Presented by: Noga Grebla Content of the presentation Presenting the goals of the research Reviewing basic terms

More information

A MAX-FLOW BASED APPROACH TO THE IDENTIFICATION OF PROTEIN COMPLEXES USING PROTEIN INTERACTION AND MICROARRAY DATA (EXTENDED ABSTRACT)

A MAX-FLOW BASED APPROACH TO THE IDENTIFICATION OF PROTEIN COMPLEXES USING PROTEIN INTERACTION AND MICROARRAY DATA (EXTENDED ABSTRACT) 1 A MAX-FLOW BASED APPROACH TO THE IDENTIFICATION OF PROTEIN COMPLEXES USING PROTEIN INTERACTION AND MICROARRAY DATA (EXTENDED ABSTRACT) JIANXING FENG Department of Computer Science and Technology, Tsinghua

More information

Bioinformatics I. CPBS 7711 October 29, 2015 Protein interaction networks. Debra Goldberg

Bioinformatics I. CPBS 7711 October 29, 2015 Protein interaction networks. Debra Goldberg Bioinformatics I CPBS 7711 October 29, 2015 Protein interaction networks Debra Goldberg debra@colorado.edu Overview Networks, protein interaction networks (PINs) Network models What can we learn from PINs

More information

Overview. Overview. Social networks. What is a network? 10/29/14. Bioinformatics I. Networks are everywhere! Introduction to Networks

Overview. Overview. Social networks. What is a network? 10/29/14. Bioinformatics I. Networks are everywhere! Introduction to Networks Bioinformatics I Overview CPBS 7711 October 29, 2014 Protein interaction networks Debra Goldberg debra@colorado.edu Networks, protein interaction networks (PINs) Network models What can we learn from PINs

More information

Networks. Can (John) Bruce Keck Founda7on Biotechnology Lab Bioinforma7cs Resource

Networks. Can (John) Bruce Keck Founda7on Biotechnology Lab Bioinforma7cs Resource Networks Can (John) Bruce Keck Founda7on Biotechnology Lab Bioinforma7cs Resource Networks in biology Protein-Protein Interaction Network of Yeast Transcriptional regulatory network of E.coli Experimental

More information

arxiv: v1 [q-bio.mn] 5 Feb 2008

arxiv: v1 [q-bio.mn] 5 Feb 2008 Uncovering Biological Network Function via Graphlet Degree Signatures Tijana Milenković and Nataša Pržulj Department of Computer Science, University of California, Irvine, CA 92697-3435, USA Technical

More information

Multiple Whole Genome Alignment

Multiple Whole Genome Alignment Multiple Whole Genome Alignment BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 206 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by

More information

Research Proposal. Title: Multiple Sequence Alignment used to investigate the co-evolving positions in OxyR Protein family.

Research Proposal. Title: Multiple Sequence Alignment used to investigate the co-evolving positions in OxyR Protein family. Research Proposal Title: Multiple Sequence Alignment used to investigate the co-evolving positions in OxyR Protein family. Name: Minjal Pancholi Howard University Washington, DC. June 19, 2009 Research

More information

V 6 Network analysis

V 6 Network analysis V 6 Network analysis - Dijkstra algorithm: compute shortest pathways - Graph layout - Network robustness - Graph modularity Fri, May 4, 2018 Bioinformatics 3 SS 18 V 6 1 The Shortest Path Problem Problem:

More information

The architecture of complexity: the structure and dynamics of complex networks.

The architecture of complexity: the structure and dynamics of complex networks. SMR.1656-36 School and Workshop on Structure and Function of Complex Networks 16-28 May 2005 ------------------------------------------------------------------------------------------------------------------------

More information

Network by Weighted Graph Mining

Network by Weighted Graph Mining 2012 4th International Conference on Bioinformatics and Biomedical Technology IPCBEE vol.29 (2012) (2012) IACSIT Press, Singapore + Prediction of Protein Function from Protein-Protein Interaction Network

More information

Assessing Significance of Connectivity and. Conservation in Protein Interaction Networks

Assessing Significance of Connectivity and. Conservation in Protein Interaction Networks Assessing Significance of Connectivity and Conservation in Protein Interaction Networks Mehmet Koyutürk, Wojciech Szpankowski, and Ananth Grama Department of Computer Sciences, Purdue University, West

More information

Network Alignment 858L

Network Alignment 858L Network Alignment 858L Terms & Questions A homologous h Interolog = B h Species 1 Species 2 Are there conserved pathways? What is the minimum set of pathways required for life? Can we compare networks

More information

COMPARATIVE ANALYSIS OF BIOLOGICAL NETWORKS. A Thesis. Submitted to the Faculty. Purdue University. Mehmet Koyutürk. In Partial Fulfillment of the

COMPARATIVE ANALYSIS OF BIOLOGICAL NETWORKS. A Thesis. Submitted to the Faculty. Purdue University. Mehmet Koyutürk. In Partial Fulfillment of the COMPARATIVE ANALYSIS OF BIOLOGICAL NETWORKS A Thesis Submitted to the Faculty of Purdue University by Mehmet Koyutürk In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy December

More information

Sequence Alignment Techniques and Their Uses

Sequence Alignment Techniques and Their Uses Sequence Alignment Techniques and Their Uses Sarah Fiorentino Since rapid sequencing technology and whole genomes sequencing, the amount of sequence information has grown exponentially. With all of this

More information

RODED SHARAN, 1 TREY IDEKER, 2 BRIAN KELLEY, 3 RON SHAMIR, 1 and RICHARD M. KARP 4 ABSTRACT

RODED SHARAN, 1 TREY IDEKER, 2 BRIAN KELLEY, 3 RON SHAMIR, 1 and RICHARD M. KARP 4 ABSTRACT JOURNAL OF COMPUTATIONAL BIOLOGY Volume 12, Number 6, 2005 Mary Ann Liebert, Inc. Pp. 835 846 Identification of Protein Complexes by Comparative Analysis of Yeast and Bacterial Protein Interaction Data

More information

BMI/CS 776 Lecture #20 Alignment of whole genomes. Colin Dewey (with slides adapted from those by Mark Craven)

BMI/CS 776 Lecture #20 Alignment of whole genomes. Colin Dewey (with slides adapted from those by Mark Craven) BMI/CS 776 Lecture #20 Alignment of whole genomes Colin Dewey (with slides adapted from those by Mark Craven) 2007.03.29 1 Multiple whole genome alignment Input set of whole genome sequences genomes diverged

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary information S3 (box) Methods Methods Genome weighting The currently available collection of archaeal and bacterial genomes has a highly biased distribution of isolates across taxa. For example,

More information

Computational methods for predicting protein-protein interactions

Computational methods for predicting protein-protein interactions Computational methods for predicting protein-protein interactions Tomi Peltola T-61.6070 Special course in bioinformatics I 3.4.2008 Outline Biological background Protein-protein interactions Computational

More information

Learning in Bayesian Networks

Learning in Bayesian Networks Learning in Bayesian Networks Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Berlin: 20.06.2002 1 Overview 1. Bayesian Networks Stochastic Networks

More information

Systems biology and biological networks

Systems biology and biological networks Systems Biology Workshop Systems biology and biological networks Center for Biological Sequence Analysis Networks in electronics Radio kindly provided by Lazebnik, Cancer Cell, 2002 Systems Biology Workshop,

More information

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche

Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche Protein Structure Prediction II Lecturer: Serafim Batzoglou Scribe: Samy Hamdouche The molecular structure of a protein can be broken down hierarchically. The primary structure of a protein is simply its

More information

Phylogenetic Networks, Trees, and Clusters

Phylogenetic Networks, Trees, and Clusters Phylogenetic Networks, Trees, and Clusters Luay Nakhleh 1 and Li-San Wang 2 1 Department of Computer Science Rice University Houston, TX 77005, USA nakhleh@cs.rice.edu 2 Department of Biology University

More information

Comparison of Protein-Protein Interaction Confidence Assignment Schemes

Comparison of Protein-Protein Interaction Confidence Assignment Schemes Comparison of Protein-Protein Interaction Confidence Assignment Schemes Silpa Suthram 1, Tomer Shlomi 2, Eytan Ruppin 2, Roded Sharan 2, and Trey Ideker 1 1 Department of Bioengineering, University of

More information

Detecting Conserved Interaction Patterns in Biological Networks

Detecting Conserved Interaction Patterns in Biological Networks Detecting Conserved Interaction Patterns in Biological Networks Mehmet Koyutürk 1, Yohan Kim 2, Shankar Subramaniam 2,3, Wojciech Szpankowski 1 and Ananth Grama 1 1 Department of Computer Sciences, Purdue

More information

Divergence Pattern of Duplicate Genes in Protein-Protein Interactions Follows the Power Law

Divergence Pattern of Duplicate Genes in Protein-Protein Interactions Follows the Power Law Divergence Pattern of Duplicate Genes in Protein-Protein Interactions Follows the Power Law Ze Zhang,* Z. W. Luo,* Hirohisa Kishino,à and Mike J. Kearsey *School of Biosciences, University of Birmingham,

More information

Predicting Protein Functions and Domain Interactions from Protein Interactions

Predicting Protein Functions and Domain Interactions from Protein Interactions Predicting Protein Functions and Domain Interactions from Protein Interactions Fengzhu Sun, PhD Center for Computational and Experimental Genomics University of Southern California Outline High-throughput

More information

Differential Modeling for Cancer Microarray Data

Differential Modeling for Cancer Microarray Data Differential Modeling for Cancer Microarray Data Omar Odibat Department of Computer Science Feb, 01, 2011 1 Outline Introduction Cancer Microarray data Problem Definition Differential analysis Existing

More information

Networks & pathways. Hedi Peterson MTAT Bioinformatics

Networks & pathways. Hedi Peterson MTAT Bioinformatics Networks & pathways Hedi Peterson (peterson@quretec.com) MTAT.03.239 Bioinformatics 03.11.2010 Networks are graphs Nodes Edges Edges Directed, undirected, weighted Nodes Genes Proteins Metabolites Enzymes

More information

Example of Function Prediction

Example of Function Prediction Find similar genes Example of Function Prediction Suggesting functions of newly identified genes It was known that mutations of NF1 are associated with inherited disease neurofibromatosis 1; but little

More information

Phylogenetic Tree Reconstruction

Phylogenetic Tree Reconstruction I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven

More information

The Role of Network Science in Biology and Medicine. Tiffany J. Callahan Computational Bioscience Program Hunter/Kahn Labs

The Role of Network Science in Biology and Medicine. Tiffany J. Callahan Computational Bioscience Program Hunter/Kahn Labs The Role of Network Science in Biology and Medicine Tiffany J. Callahan Computational Bioscience Program Hunter/Kahn Labs Network Analysis Working Group 09.28.2017 Network-Enabled Wisdom (NEW) empirically

More information

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree) I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by

More information

Modular Algorithms for Biomolecular Network Alignment

Modular Algorithms for Biomolecular Network Alignment Graduate Theses and Dissertations Graduate College 2011 Modular Algorithms for Biomolecular Network Alignment Fadi George Towfic Iowa State University Follow this and additional works at: http://lib.dr.iastate.edu/etd

More information

Interaction Network Topologies

Interaction Network Topologies Proceedings of International Joint Conference on Neural Networks, Montreal, Canada, July 31 - August 4, 2005 Inferrng Protein-Protein Interactions Using Interaction Network Topologies Alberto Paccanarot*,

More information

Discovering Binding Motif Pairs from Interacting Protein Groups

Discovering Binding Motif Pairs from Interacting Protein Groups Discovering Binding Motif Pairs from Interacting Protein Groups Limsoon Wong Institute for Infocomm Research Singapore Copyright 2005 by Limsoon Wong Plan Motivation from biology & problem statement Recasting

More information

Peeling the yeast protein network

Peeling the yeast protein network 444 DOI 10.1002/pmic.200400962 Proteomics 2005, 5, 444 449 REGULAR ARTICLE Peeling the yeast protein network Stefan Wuchty and Eivind Almaas Department of Physics, University of Notre Dame, Notre Dame,

More information

Comparative Analysis of Molecular Interaction Networks

Comparative Analysis of Molecular Interaction Networks Comparative Analysis of Molecular Interaction Networks Jayesh Pandey 1, Mehmet Koyuturk 2, Shankar Subramaniam 3, Ananth Grama 1 1 Purdue University, 2 Case Western Reserve University 3 University of California,

More information

Computational approaches for functional genomics

Computational approaches for functional genomics Computational approaches for functional genomics Kalin Vetsigian October 31, 2001 The rapidly increasing number of completely sequenced genomes have stimulated the development of new methods for finding

More information

Biological Networks. Gavin Conant 163B ASRC

Biological Networks. Gavin Conant 163B ASRC Biological Networks Gavin Conant 163B ASRC conantg@missouri.edu 882-2931 Types of Network Regulatory Protein-interaction Metabolic Signaling Co-expressing General principle Relationship between genes Gene/protein/enzyme

More information

Computational Network Biology Biostatistics & Medical Informatics 826 Fall 2018

Computational Network Biology Biostatistics & Medical Informatics 826 Fall 2018 Computational Network Biology Biostatistics & Medical Informatics 826 Fall 2018 Sushmita Roy sroy@biostat.wisc.edu https://compnetbiocourse.discovery.wisc.edu Sep 6 th 2018 Goals for today Administrivia

More information

Iteration Method for Predicting Essential Proteins Based on Orthology and Protein-protein Interaction Networks

Iteration Method for Predicting Essential Proteins Based on Orthology and Protein-protein Interaction Networks Georgia State University ScholarWorks @ Georgia State University Computer Science Faculty Publications Department of Computer Science 2012 Iteration Method for Predicting Essential Proteins Based on Orthology

More information

Self Similar (Scale Free, Power Law) Networks (I)

Self Similar (Scale Free, Power Law) Networks (I) Self Similar (Scale Free, Power Law) Networks (I) E6083: lecture 4 Prof. Predrag R. Jelenković Dept. of Electrical Engineering Columbia University, NY 10027, USA {predrag}@ee.columbia.edu February 7, 2007

More information

A Max-Flow Based Approach to the. Identification of Protein Complexes Using Protein Interaction and Microarray Data

A Max-Flow Based Approach to the. Identification of Protein Complexes Using Protein Interaction and Microarray Data A Max-Flow Based Approach to the 1 Identification of Protein Complexes Using Protein Interaction and Microarray Data Jianxing Feng, Rui Jiang, and Tao Jiang Abstract The emergence of high-throughput technologies

More information

Graph Alignment and Biological Networks

Graph Alignment and Biological Networks Graph Alignment and Biological Networks Johannes Berg http://www.uni-koeln.de/ berg Institute for Theoretical Physics University of Cologne Germany p.1/12 Networks in molecular biology New large-scale

More information

Increasing availability of experimental data relating to biological sequences, coupled with efficient

Increasing availability of experimental data relating to biological sequences, coupled with efficient JOURNAL OF COMPUTATIONAL BIOLOGY Volume 13, Number 7, 2006 Mary Ann Liebert, Inc. Pp. 1299 1322 Detecting Conserved Interaction Patterns in Biological Networks MEHMET KOYUTÜRK, 1 YOHAN KIM, 2 SHANKAR SUBRAMANIAM,

More information

Graph Theory and Networks in Biology

Graph Theory and Networks in Biology Graph Theory and Networks in Biology Oliver Mason and Mark Verwoerd Hamilton Institute, National University of Ireland Maynooth, Co. Kildare, Ireland {oliver.mason, mark.verwoerd}@nuim.ie January 17, 2007

More information

Tools and Algorithms in Bioinformatics

Tools and Algorithms in Bioinformatics Tools and Algorithms in Bioinformatics GCBA815, Fall 2015 Week-4 BLAST Algorithm Continued Multiple Sequence Alignment Babu Guda, Ph.D. Department of Genetics, Cell Biology & Anatomy Bioinformatics and

More information

BIOINFORMATICS. Integrative Network Alignment Reveals Large Regions of Global Network Similarity in Yeast and Human

BIOINFORMATICS. Integrative Network Alignment Reveals Large Regions of Global Network Similarity in Yeast and Human BIOINFORMATICS Vol. 00 no. 00 2010 Pages 1 7 Integrative Network Alignment Reveals Large Regions of Global Network Similarity in Yeast and Human Oleksii Kuchaiev 2 and Nataša Pržulj 1 1 Department of Computing,

More information

Mining Molecular Fragments: Finding Relevant Substructures of Molecules

Mining Molecular Fragments: Finding Relevant Substructures of Molecules Mining Molecular Fragments: Finding Relevant Substructures of Molecules Christian Borgelt, Michael R. Berthold Proc. IEEE International Conference on Data Mining, 2002. ICDM 2002. Lecturers: Carlo Cagli

More information

The Strong Largeur d Arborescence

The Strong Largeur d Arborescence The Strong Largeur d Arborescence Rik Steenkamp (5887321) November 12, 2013 Master Thesis Supervisor: prof.dr. Monique Laurent Local Supervisor: prof.dr. Alexander Schrijver KdV Institute for Mathematics

More information

Functional Characterization and Topological Modularity of Molecular Interaction Networks

Functional Characterization and Topological Modularity of Molecular Interaction Networks Functional Characterization and Topological Modularity of Molecular Interaction Networks Jayesh Pandey 1 Mehmet Koyutürk 2 Ananth Grama 1 1 Department of Computer Science Purdue University 2 Department

More information

Chapter 8: The Topology of Biological Networks. Overview

Chapter 8: The Topology of Biological Networks. Overview Chapter 8: The Topology of Biological Networks 8.1 Introduction & survey of network topology Prof. Yechiam Yemini (YY) Computer Science Department Columbia University A gallery of networks Small-world

More information

Part V. Matchings. Matching. 19 Augmenting Paths for Matchings. 18 Bipartite Matching via Flows

Part V. Matchings. Matching. 19 Augmenting Paths for Matchings. 18 Bipartite Matching via Flows Matching Input: undirected graph G = (V, E). M E is a matching if each node appears in at most one Part V edge in M. Maximum Matching: find a matching of maximum cardinality Matchings Ernst Mayr, Harald

More information

Tree Decompositions and Tree-Width

Tree Decompositions and Tree-Width Tree Decompositions and Tree-Width CS 511 Iowa State University December 6, 2010 CS 511 (Iowa State University) Tree Decompositions and Tree-Width December 6, 2010 1 / 15 Tree Decompositions Definition

More information

Written Exam 15 December Course name: Introduction to Systems Biology Course no

Written Exam 15 December Course name: Introduction to Systems Biology Course no Technical University of Denmark Written Exam 15 December 2008 Course name: Introduction to Systems Biology Course no. 27041 Aids allowed: Open book exam Provide your answers and calculations on separate

More information

Types of biological networks. I. Intra-cellurar networks

Types of biological networks. I. Intra-cellurar networks Types of biological networks I. Intra-cellurar networks 1 Some intra-cellular networks: 1. Metabolic networks 2. Transcriptional regulation networks 3. Cell signalling networks 4. Protein-protein interaction

More information

FROM UNCERTAIN PROTEIN INTERACTION NETWORKS TO SIGNALING PATHWAYS THROUGH INTENSIVE COLOR CODING

FROM UNCERTAIN PROTEIN INTERACTION NETWORKS TO SIGNALING PATHWAYS THROUGH INTENSIVE COLOR CODING FROM UNCERTAIN PROTEIN INTERACTION NETWORKS TO SIGNALING PATHWAYS THROUGH INTENSIVE COLOR CODING Haitham Gabr, Alin Dobra and Tamer Kahveci CISE Department, University of Florida, Gainesville, FL 32611,

More information

COMPARATIVE PATHWAY ANNOTATION WITH PROTEIN-DNA INTERACTION AND OPERON INFORMATION VIA GRAPH TREE DECOMPOSITION

COMPARATIVE PATHWAY ANNOTATION WITH PROTEIN-DNA INTERACTION AND OPERON INFORMATION VIA GRAPH TREE DECOMPOSITION COMPARATIVE PATHWAY ANNOTATION WITH PROTEIN-DNA INTERACTION AND OPERON INFORMATION VIA GRAPH TREE DECOMPOSITION JIZHEN ZHAO, DONGSHENG CHE AND LIMING CAI Department of Computer Science, University of Georgia,

More information

Fine-scale dissection of functional protein network. organization by dynamic neighborhood analysis

Fine-scale dissection of functional protein network. organization by dynamic neighborhood analysis Fine-scale dissection of functional protein network organization by dynamic neighborhood analysis Kakajan Komurov 1, Mehmet H. Gunes 2, Michael A. White 1 1 Department of Cell Biology, University of Texas

More information

GOSAP: Gene Ontology Based Semantic Alignment of Biological Pathways

GOSAP: Gene Ontology Based Semantic Alignment of Biological Pathways GOSAP: Gene Ontology Based Semantic Alignment of Biological Pathways Jonas Gamalielsson and Björn Olsson Systems Biology Group, Skövde University, Box 407, Skövde, 54128, Sweden, [jonas.gamalielsson][bjorn.olsson]@his.se,

More information

Erzsébet Ravasz Advisor: Albert-László Barabási

Erzsébet Ravasz Advisor: Albert-László Barabási Hierarchical Networks Erzsébet Ravasz Advisor: Albert-László Barabási Introduction to networks How to model complex networks? Clustering and hierarchy Hierarchical organization of cellular metabolism The

More information

CISC 636 Computational Biology & Bioinformatics (Fall 2016)

CISC 636 Computational Biology & Bioinformatics (Fall 2016) CISC 636 Computational Biology & Bioinformatics (Fall 2016) Predicting Protein-Protein Interactions CISC636, F16, Lec22, Liao 1 Background Proteins do not function as isolated entities. Protein-Protein

More information

Generating p-extremal graphs

Generating p-extremal graphs Generating p-extremal graphs Derrick Stolee Department of Mathematics Department of Computer Science University of Nebraska Lincoln s-dstolee1@math.unl.edu August 2, 2011 Abstract Let f(n, p be the maximum

More information

Graph Theory and Networks in Biology arxiv:q-bio/ v1 [q-bio.mn] 6 Apr 2006

Graph Theory and Networks in Biology arxiv:q-bio/ v1 [q-bio.mn] 6 Apr 2006 Graph Theory and Networks in Biology arxiv:q-bio/0604006v1 [q-bio.mn] 6 Apr 2006 Oliver Mason and Mark Verwoerd February 4, 2008 Abstract In this paper, we present a survey of the use of graph theoretical

More information

Genome Annotation. Bioinformatics and Computational Biology. Genome sequencing Assembly. Gene prediction. Protein targeting.

Genome Annotation. Bioinformatics and Computational Biology. Genome sequencing Assembly. Gene prediction. Protein targeting. Genome Annotation Bioinformatics and Computational Biology Genome Annotation Frank Oliver Glöckner 1 Genome Analysis Roadmap Genome sequencing Assembly Gene prediction Protein targeting trna prediction

More information