Functional Characterization and Topological Modularity of Molecular Interaction Networks

Similar documents
Functional Characterization of Molecular Interaction Networks

Phylogenetic Analysis of Molecular Interaction Networks 1

The Role of Network Science in Biology and Medicine. Tiffany J. Callahan Computational Bioscience Program Hunter/Kahn Labs

Francisco M. Couto Mário J. Silva Pedro Coutinho

Protein function prediction via analysis of interactomes

86 Part 4 SUMMARY INTRODUCTION

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor

ANNOTATING PATHWAYS IN INTERACTION NETWORKS

A Multiobjective GO based Approach to Protein Complex Detection

MULE: An Efficient Algorithm for Detecting Frequent Subgraphs in Biological Networks

Discovering molecular pathways from protein interaction and ge

COMPARATIVE ANALYSIS OF BIOLOGICAL NETWORKS. A Thesis. Submitted to the Faculty. Purdue University. Mehmet Koyutürk. In Partial Fulfillment of the

V 5 Robustness and Modularity

Bioinformatics 2. Yeast two hybrid. Proteomics. Proteomics

Analysis and visualization of protein-protein interactions. Olga Vitek Assistant Professor Statistics and Computer Science

GOSAP: Gene Ontology Based Semantic Alignment of Biological Pathways

Basic modeling approaches for biological systems. Mahesh Bule

Networks & pathways. Hedi Peterson MTAT Bioinformatics

Bayesian Hierarchical Classification. Seminar on Predicting Structured Data Jukka Kohonen

Computational Genomics. Systems biology. Putting it together: Data integration using graphical models

Proteomics Systems Biology

Proteomics. Yeast two hybrid. Proteomics - PAGE techniques. Data obtained. What is it?

2 GENE FUNCTIONAL SIMILARITY. 2.1 Semantic values of GO terms

Interaction Network Analysis

Lecture 8: Temporal programs and the global structure of transcription networks. Chap 5 of Alon. 5.1 Introduction

Types of biological networks. I. Intra-cellurar networks

BIOLOGY 111. CHAPTER 1: An Introduction to the Science of Life

A Study of Correlations between the Definition and Application of the Gene Ontology

EFFICIENT AND ROBUST PREDICTION ALGORITHMS FOR PROTEIN COMPLEXES USING GOMORY-HU TREES

CISC 636 Computational Biology & Bioinformatics (Fall 2016)

Biological networks CS449 BIOINFORMATICS

Comparative Analysis of Molecular Interaction Networks

Learning in Bayesian Networks

Lecture 4: Yeast as a model organism for functional and evolutionary genomics. Part II

56:198:582 Biological Networks Lecture 10

Identifying Signaling Pathways

Network alignment and querying

Introduction to Bioinformatics

Design and characterization of chemical space networks

Context dependent visualization of protein function

CSCE555 Bioinformatics. Protein Function Annotation

Systems biology and biological networks

Graph Theory and Networks in Biology

Gene Ontology and overrepresentation analysis

Self Similar (Scale Free, Power Law) Networks (I)

BioControl - Week 6, Lecture 1

MATHEMATICAL AND EXPERIMENTAL INVESTIGATION OF ONTOLOGICAL SIMILARITY MEASURES AND THEIR USE IN BIOMEDICAL DOMAINS

Detecting Conserved Interaction Patterns in Biological Networks

Bioinformatics: Network Analysis

Network Biology: Understanding the cell s functional organization. Albert-László Barabási Zoltán N. Oltvai

Erzsébet Ravasz Advisor: Albert-László Barabási

Integration of functional genomics data

Written Exam 15 December Course name: Introduction to Systems Biology Course no

The architecture of complexity: the structure and dynamics of complex networks.

BMD645. Integration of Omics

Link Prediction. Eman Badr Mohammed Saquib Akmal Khan

An Improved Ant Colony Optimization Algorithm for Clustering Proteins in Protein Interaction Network

BIOINFORMATICS CS4742 BIOLOGICAL NETWORKS

NETWORK BIOLOGY: UNDERSTANDING THE CELL S FUNCTIONAL ORGANIZATION

Nonparametric Bayesian Matrix Factorization for Assortative Networks

Networks in systems biology

In post-genomic biology, the nature and scale of data. Algorithmic and analytical methods in network biology. Mehmet Koyutürk 1,2

Grundlagen der Systembiologie und der Modellierung epigenetischer Prozesse

Clustering and Network

Comparison of Protein-Protein Interaction Confidence Assignment Schemes

Clustering of Pathogenic Genes in Human Co-regulatory Network. Michael Colavita Mentor: Soheil Feizi Fifth Annual MIT PRIMES Conference May 17, 2015

FUNDAMENTALS of SYSTEMS BIOLOGY From Synthetic Circuits to Whole-cell Models

Robust Community Detection Methods with Resolution Parameter for Complex Detection in Protein Protein Interaction Networks

Dynamic optimisation identifies optimal programs for pathway regulation in prokaryotes. - Supplementary Information -

A general co-expression network-based approach to gene expression analysis: comparison and applications

Increasing availability of experimental data relating to biological sequences, coupled with efficient

A new approach for biomarker detection using fusion networks

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Gene Ontology. Shifra Ben-Dor. Weizmann Institute of Science

Biological Pathways Representation by Petri Nets and extension

Comparative Network Analysis

Outline. Terminologies and Ontologies. Communication and Computation. Communication. Outline. Terminologies and Vocabularies.

SPA for quantitative analysis: Lecture 6 Modelling Biological Processes

identifiers matched to homologous genes. Probeset annotation files for each array platform were used to

Analysis of Biological Networks: Network Robustness and Evolution

Towards Detecting Protein Complexes from Protein Interaction Data

Finding Common Modules in a Time-Varying Network with Application to the Drosophila Melanogaster Gene Regulation Network

Markov Chains, Random Walks on Graphs, and the Laplacian

Biological Networks Analysis

Assessing Significance of Connectivity and. Conservation in Protein Interaction Networks

Evidence for dynamically organized modularity in the yeast protein-protein interaction network

Introduction to Bioinformatics

Review Article From Ontology to Semantic Similarity: Calculation of Ontology-Based Semantic Similarity

Physical network models and multi-source data integration

LAPLACIAN MATRIX AND APPLICATIONS

arxiv: v1 [q-bio.mn] 30 Dec 2008

FCModeler: Dynamic Graph Display and Fuzzy Modeling of Regulatory and Metabolic Maps

networks in molecular biology Wolfgang Huber

Networks. Can (John) Bruce Keck Founda7on Biotechnology Lab Bioinforma7cs Resource

MTopGO: a tool for module identification in PPI Networks

Network motifs in the transcriptional regulation network (of Escherichia coli):

Systems Biology Across Scales: A Personal View XIV. Intra-cellular systems IV: Signal-transduction and networks. Sitabhra Sinha IMSc Chennai

Detecting temporal protein complexes from dynamic protein-protein interaction networks

Modular organization of protein interaction networks

Computational methods for predicting protein-protein interactions

Transcription:

Functional Characterization and Topological Modularity of Molecular Interaction Networks Jayesh Pandey 1 Mehmet Koyutürk 2 Ananth Grama 1 1 Department of Computer Science Purdue University 2 Department of Electrical Engineering & Computer Science Case Western Reserve University Asia Pacific Bioinformatics Conference, 2010

Molecular Interaction Networks Provides a high level description of cellular organization Directed and undirected graph representation Nodes represent cellular components Protein, gene, enzyme, metabolite Edges represent reactions or interactions Binding, regulation, modification, complex membership, substrate-product relationship S.cerevisiae Protein-Protein Interaction (PPI) Network

Function : Gene Ontology Molecular annotation provides a unified understanding of the underlying principles Gene Ontology: A controlled vocabulary of molecular functions, biological processes, and cellular components Terms (concepts) related by is-a, part-of relationships If a molecule is annotated by a term, then it is also annotated by terms on the paths towards root.

Function & Topology in Molecular Networks How does function relate to network topology?

Functional Coherence in Networks Modularity manifests itself in terms of high connectivity in the network Functional association (similarity) is correlated with network proximity A measure for annotation proximity of nodes (semantic similarity) A measure for network distance Sharan et al., MSB, 2007

Assessing Functional Similarity Gene Ontology (GO) provides a hierarchical taxonomy of biological process, molecular function and cellular component Assessment of semantic similarity between concepts in a hierarchical taxonomy is well studied (Resnik, IJCAI, 1995)

Semantic Similarity of GO Terms Resnik s measure based on information content I(c) = log 2 ( G c / G r ) δ I (c i, c j ) = max c A i A j I(c) G c : Set of molecules that are associated with term c, r: Root term A i : Ancestors of term c i in the hierarchy λ(c i, c j ) = argmax c Ai A j I(c): Lowest common ancestor of c i and c j Resnik(c 3, c 4 ) = Max(IC(c 1 ), IC(c 2 ))

Functional Similarity of Molecules Each molecule (protein or domain) is associated with multiple GO terms Average (Lord et al., Bioinformatics, 2003) ρ A (S i, S j ) = 1 S i S j c k S i c l S j δ(c k, c l ) Generalize the concept of lowest common ancestor to sets of terms (Pandey et al., ECCB, 2008) Λ(S i, S j ) = λ(c k, c l ) c k S i,c l S j ( ) GΛ(Si,S j ) ρ I (S i, S j ) = I(Λ(S i, S j )) = log 2 G r G Λ(Si,S j ) = is the set of molecules that are c k Λ(S i,s j ) G ck associated Pandey, with Koyutürk, all Grama terms in the MCA set

Functional Coherence of Module A set of molecules that participates in the same biological processes or functions sub-network with dense intra-connections and sparse interconnections Each module is associated with set of molecular entities, and each molecule associated with set of terms. S 1 = {c 4 }, S 2 = {c 4 }, S 3 = {c 4, c 6 }, S 4 = {c 1, c 6 }, S 5 = {c 1 }, S 6 = {c 6 } Sets: R 1 = {S 1, S 2, S 3, S 4 } R 2 = {S 1, S 2, S 3 } R 3 = {S 3, S 4 }

Existing Measure Average (Pu et al., Proteomics, 2007) σ A (R) = 1 n(n 1)/2 Example: σ A (S 1, S 2, S 3, S 4 ) = 1 i<j n ρ(s i, S j ). 1 6 (3 σ A(S 1, S 2, S 3 ) + ρ(s 3, S 4 ) + ρ(s 1, S 4 ) + ρ(s 2, S 4 ))

Generalized Information Content Extend the notion of the minimum common ancestor of pairs of terms to tuples of terms λ(c i1,...,c in ) = argmax c n k=1 A ik I(c) σ I (R) = I(Λ(S 1,..., S n )) = log 2 ( GΛ(Si,...,S j ) G r ). where Λ(S 1, S 2,..., S n ) = λ(c i1, c i2,...,c in ) c ij S j,1 j n Example: σ I (S 1, S 2, S 3, S 4 ) = I(r) = 0, no common ancestor!

Weighted Information Content Weigh the information content of shared functionality by the number of molecules that contribute to the shared functionality I(c) σ W (R) = 1 1 i n c A i 1 i n c A i I(c) σ W (S 1, S 2, S 3, S 4 ) = 0.86 σ W (S 1, S 2, S 3 ) = 0.75

Accounting for Multiple Paths Is "shortest path" a good measure of network proximity? Multiple alternate paths might indicate stronger functional association In well-studied pathways, redundancy is shown to play an important role in robustness & adaptation (e.g., genetic buffering)

Random walks with restarts Consider a random walker that starts on a source node s. At every tick, the walker chooses randomly among available edges or goes back to node s with probability c.

Proximity Based On Random Walks Simulate an infinite random walk with random restarts at protein i Proximity between proteins i and j is given by the relative amount of time spent at protein j Φ(0) = I, Φ(t + 1) = (1 c)aφ(t) + ci, Φ = lim t Φ(t) Φ(i, j): Network proximity between protein i and protein j A: Stochastic matrix derived from the adjacency matrix of the network I: Identity matrix c: Restart probability Define proximity between proteins i and j as {Φ(i, j) + Φ(j, i)}/2

Network Proximity & Functional Similarity Correlation between functional similarity and network proximity

normalized average semantic similarity 5 4 3 2 1 0 Topological Proximity and Functional Similarity ST DDI Comp-2 DDI S. pombe PPI S. cere PPI -1 x >1/2 4 1/2 4 > x >1/2 8 1/2 8 > x >1/2 12 1/2 12 > x >0 random walk proximity (a) normalized average semantic similarity 5 4 3 2 1 0-1 ST DDI Comp-2 DDI S. pombe PPI S. cere PPI 1 1.5 2 2.5 3 3.5 4 network distance (b) Comparison of the DDI and PPI networks with respect to the relation between semantic similarity vs proximity and network distance

Comparison of Coherence Meaures 5 Avg. pairwise term sim. Avg. pairwise molecule sim. generalized IC weighted IC pvalue < 0.05 index of detectability 4 3 2 1 4 5 6 7 8 9 10 11 complex set size Index of Dectectability vs. complex sizes mean d(σ) = t T (σ(t)) mean t C (σ(t)) ((stdt T (σ(t))) 2 +(std t C (σ(t))) 2 )/2

Conclusion & Ongoing work Random walk based measures of topological proximity are better suited to existing interaction data Measures that quantify coherence among entire sets are superior to aggregares of known pair-wise measures Future work : Using proximity measure to identify disease implicated genes in networks