Network Biology-part II

Network Biology-part II. Jun Zhu, Ph.D., Professor of Genomics and Genetic Sciences, Icahn Institute of Genomics and Multi-scale Biology, The Tisch Cancer Institute, Icahn Medical School at Mount Sinai, New York, NY. http://research.mssm.edu/integrative-network-biology/ Email: jun.zhu@mssm.edu @IcahnInstitute

Why is it so hard to model biological systems? The more we learn, the more complicated it becomes! Epigenetic regulation (heritable changes in gene function that cannot be explained by changes in DNA sequence): DNA methylation, chromatin structure, junk DNA? Post-transcriptional regulation: splicing (1981), RNA editing (1986), miRNA-mediated regulation (1993). It is not one gene to one protein anymore! Post-translational regulation: phosphorylation, glycosylation, acetylation.

How to model biological systems: types of network models. Time-dependent networks: discrete or continuous; differential equations. Prediction vs. explanation: phenomenologically predictive networks (correlation-based, dependency nets) vs. explanatory, pictorial networks. Deterministic vs. stochastic (probabilistic). Concentrations vs. Bayesian nets for up/down/no change (nc).

Biological networks/pathways: observation -> description -> explanation -> prediction. Model types: gene sets, association networks, probabilistic causal networks, mechanism-based models. (Figure axes: data required to train models; biological details revealed.)

Theory of network biology: how are biological processes regulated? Observation -> description -> explanation -> prediction. (Figure labels: transcription factor; multiple genes.)

Microarray: revolutionized the way we query a biological system. 1995: Patrick Brown reported a proof-of-concept study.

Microarray: two-channel system vs. one-channel system. Short-term cost, long-term cost, accuracy.

Microarray: what are the assumptions and limitations? In the late 1990s, many EST libraries were sequenced and the human and mouse genomes were close to finished. Assumptions: all gene transcripts were known; all gene isoforms were known; there were no SNPs within probes. Outdated and will be replaced by RNA-seq: mRNAs (isoforms, allele-specific expression), miRNAs, long non-coding RNAs.

Gene set enrichment analysis. (Figure labels: transcription factor; multiple genes.)

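Gene set enrichment: Fisher's exact test (hypergeometric distribution CDF)
Background: M genes. Foreground (gene set): K genes. Observed signature: N genes. Overlap between signature and gene set: x genes.
$$F(x-1 \mid M, K, N) = \sum_{i=0}^{x-1} \frac{\binom{K}{i}\binom{M-K}{N-i}}{\binom{M}{N}}$$
Enrichment p-value (probability of an overlap of at least x genes): $$p = 1 - F(x-1 \mid M, K, N)$$
Problem: have to define a cutoff.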
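A minimal sketch of the hypergeometric enrichment test, using only the Python standard library. It assumes M is the number of background genes, K the size of the gene set, N the size of the observed signature, and x the overlap; the function name `enrichment_pvalue` is hypothetical and chosen for illustration. The upper tail is computed with exact integer arithmetic so that very small p-values are not lost to rounding.

```python
from math import comb


def enrichment_pvalue(x, M, K, N):
    """Upper-tail hypergeometric p-value: P(overlap >= x).

    M: background genes, K: genes in the gene set,
    N: genes in the observed signature, x: observed overlap.
    """
    total = comb(M, N)  # all ways to draw the signature from the background
    # Number of draws with overlap <= x - 1 (the CDF numerator).
    lower = sum(
        comb(K, i) * comb(M - K, N - i)
        for i in range(0, min(x - 1, K, N) + 1)
    )
    # 1 - F(x - 1 | M, K, N), done in integers before the final division.
    return (total - lower) / total


# Example with made-up numbers: 10 background genes, a 5-gene pathway,
# a 5-gene signature, and all 5 signature genes in the pathway.
p = enrichment_pvalue(5, 10, 5, 5)
```

The cutoff problem remains: the signature of N genes has to be defined by some threshold before this test can be applied.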

Gene set enrichment: a non-parametric test. Kolmogorov-Smirnov test: order genes by fold change or p-value, then test whether the genes involved in a pathway are randomly distributed along the ordered list. (Figure labels: pathway; background; observed pathway.)

Gene set enrichment analysis (GSEA). Difference from the Kolmogorov-Smirnov test: uses a weighted sum. Subramanian et al., PNAS, 2005
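The difference between the unweighted (Kolmogorov-Smirnov-like) statistic and the weighted sum can be sketched as a running sum over the ranked gene list. This is an illustrative sketch, not the published GSEA implementation: the function name `enrichment_score` and the toy gene names are assumptions, and significance testing by permutation is not shown.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Largest deviation from zero of a running sum over a ranked gene list.

    weight=0 steps up by the same amount at every pathway gene
    (Kolmogorov-Smirnov-like); weight=1 steps up in proportion to |score|,
    so hits at the extremes of the ranking count more.
    """
    in_set = [g in gene_set for g in ranked_genes]
    hit_weights = [
        abs(s) ** weight if hit else 0.0 for s, hit in zip(scores, in_set)
    ]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("need weighted hits and at least one non-pathway gene")

    running, best = 0.0, 0.0
    for w, hit in zip(hit_weights, in_set):
        # Step up at pathway genes, step down at all other genes.
        running += w / total_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking (made-up values), ordered by decreasing fold change.
genes = ["a", "b", "c", "d", "e", "f"]
fold = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_unweighted = enrichment_score(genes, fold, {"a", "d"}, weight=0.0)
es_weighted = enrichment_score(genes, fold, {"a", "d"}, weight=1.0)
```

With the weighted sum, the top-ranked pathway gene contributes more than the one in the middle of the list, so the weighted score is larger for the same gene set.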

Gene set enrichment analysis: what are the assumptions and limitations? Can only analyze known pathways. Don't know how the genes involved in a pathway are regulated. Which are direct interactions and which are secondary interactions? Don't know how multiple pathways interact with each other when multiple pathways are involved. (Figure labels: transcription factor; multiple genes.)

How to define association? Association of two genes is context dependent: protein-protein interaction by Y2H experiments; co-citation in the literature; protein-DNA interaction (ChIP-on-chip, ChIP-seq); correlation of mRNA expression levels.

Yeast-2-hybrid system. (Figure labels: gene fusion; gene fusion; reporter gene.) Limitations: high false-positive and false-negative rates; only for soluble proteins; not in a physiological condition. Lodish, et al., Molecular Cell Biology

Protein-protein interaction networks (Stelzl et al., Cell, 2005). Genes in a pathway interact with each other; use this to discover new members of the pathway.

Are all genes equally important?
Degree distribution: how many connections does a gene have? The majority of genes connect with a small number of genes, while a smaller number of genes connect to a large number of genes.
Scale-free (log-log linear): $p(k) \sim k^{-\gamma}$, so $\log p(k) \sim -\gamma \log k$
Clustering coefficient: $CC_p = \frac{2 n_p}{k_p (k_p - 1)}$, where $k_p$ is the degree of node $p$ and $n_p$ is the number of links among its neighbors.
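Both quantities can be computed directly from an edge list. The sketch below assumes an undirected network without self-loops; the toy network and its node names ("TF", "A" to "D") are made up for illustration, and the function names are hypothetical.

```python
from collections import Counter


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (node, node) pairs."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_distribution(adj):
    """p(k): fraction of nodes that have exactly k connections."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}


def clustering_coefficient(adj, node):
    """2 * (links among neighbors) / (k * (k - 1)) for one node."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Each neighbor-neighbor link is seen from both ends, hence // 2.
    links = sum(len(adj[u] & nbrs) for u in nbrs) // 2
    return 2.0 * links / (k * (k - 1))


# Toy network: one hub connected to four genes, two of which interact.
edges = [("TF", "A"), ("TF", "B"), ("TF", "C"), ("TF", "D"), ("A", "B")]
adj = build_adjacency(edges)
p_k = degree_distribution(adj)
cc_hub = clustering_coefficient(adj, "TF")
```

On a real network, plotting log p(k) against log k and checking for a roughly straight line with negative slope is the usual informal check for scale-free behavior.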

Different types of complex networks: degree distribution, clustering coefficient. Barabasi and Oltvai, 2004

Protein-DNA interaction: chromatin immunoprecipitation (ChIP): To find transcription factor binding targets

Association by gene expression correlation. How strong should the correlation of mRNA expression levels be? Two ways to set the cutoff: the p-value for correlation, which assumes the two expression levels are independent; or the FDR (false discovery rate) by permutation, which makes no explicit assumption and is data-set specific. FDR = total false positives / positives detected.

Selecting a threshold for gene-gene correlation (GGC) of 25,000 genes on a microarray chip

p-value <    total positives (from data)    false positives (from permuted data)    FDR
1e-10        40245988                       1079                                    2.68e-5
1e-15        22475531                       192                                     8.54e-6
1e-20        13755681                       38                                      2.76e-6

At p-value < 1e-20 there are only 38 false positives, so no module was detected in the permuted data. p-value < 1e-20 was chosen as the threshold.
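The permutation procedure behind these numbers can be sketched as follows. This is a simplified illustration, not the analysis used for the slide: it thresholds on the absolute Pearson correlation rather than on a p-value, assumes no gene has constant expression, and all function names are hypothetical. The FDR is the count of gene pairs passing the cutoff in permuted data divided by the count passing in the real data.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists (no constant vectors)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def count_correlated_pairs(expr, r_cutoff):
    """Number of gene pairs with |r| >= r_cutoff; expr maps gene -> values."""
    genes = list(expr)
    return sum(
        1
        for i in range(len(genes))
        for j in range(i + 1, len(genes))
        if abs(pearson(expr[genes[i]], expr[genes[j]])) >= r_cutoff
    )


def permuted(expr, rng):
    """Shuffle each gene's samples independently to break real correlations."""
    return {g: rng.sample(v, len(v)) for g, v in expr.items()}


def empirical_fdr(false_positives, total_positives):
    """FDR = false positives (permuted data) / positives detected (real data)."""
    return false_positives / total_positives if total_positives else 0.0


# Reproduce the first table row from its two counts.
fdr_1e10 = empirical_fdr(1079, 40245988)

# Toy expression data (made-up values) to show the counting step.
toy = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, -1, 1, -1]}
n_real = count_correlated_pairs(toy, 0.9)
n_perm = count_correlated_pairs(permuted(toy, random.Random(0)), 0.9)
```

In practice the permuted count is usually averaged over many permutations before dividing by the count from the real data.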

Association by gene expression correlation. Can two expression levels correlate because they both correlate with noise? Guilt by association is noise prone (Stuart et al. 2003). Two gene expression levels may also correlate because they respond to a common perturbation: in an F2 intercross setting, this shows up as QTL overlap.

Scale-free property is robust

Genetics filter makes the network closer to scale-free Chen*, Zhu*, et al. Nature, 2008

Genetics: eQTL overlap enhances correlation signals. Problem: a cutoff threshold is needed. Chen Y*, Zhu J* et al. Nature 452:429-435 (2008)

Causality vs. association Smoking and disease risks

Causality vs. association Fat dogs and fat masters Stephen Friend

A simple biological question: are there causal/reactive relationships?

A Bayesian network approach: Best model
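One way to make "best model" concrete is to score competing network structures by their likelihood and keep the highest-scoring one. The sketch below is an assumption-laden illustration, not the method from the slides: it assumes a genotype L at a shared QTL acts as a causal anchor, that relationships are linear with Gaussian noise, and it compares three hypothetical structures for a gene expression trait G and a clinical trait C. All three have the same number of parameters, so comparing log-likelihoods is equivalent to comparing BIC scores. The term p(L) is common to all models and is omitted.

```python
import math
import random


def gaussian_loglik(y, x):
    """Maximized log-likelihood of y under a linear-Gaussian fit y = a + b*x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    # Maximum-likelihood estimate of the residual variance.
    var = sum((b - my - slope * (a - mx)) ** 2 for a, b in zip(x, y)) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def score_models(L, G, C):
    """Log-likelihood of each candidate structure given genotype L."""
    return {
        "causal": gaussian_loglik(G, L) + gaussian_loglik(C, G),       # L -> G -> C
        "reactive": gaussian_loglik(C, L) + gaussian_loglik(G, C),     # L -> C -> G
        "independent": gaussian_loglik(G, L) + gaussian_loglik(C, L),  # G <- L -> C
    }


# Simulated data (made up) in which the causal chain L -> G -> C is true.
rng = random.Random(0)
L = [rng.choice([0, 1, 2]) for _ in range(1000)]
G = [l + rng.gauss(0, 1) for l in L]
C = [g + rng.gauss(0, 1) for g in G]

scores = score_models(L, G, C)
best_model = max(scores, key=scores.get)
```

Without the genotype anchor, the models G -> C and C -> G fit correlation data equally well, which is why association alone cannot settle the direction.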

Differential equations (for dg/dt and dc/dt, with terms v(g, c) and u(g, c)) explain why different correlations can be observed. Chen*, Zhu*, et al., Nature (2008)

Model a signaling pathway

Acknowledgements Zhu lab Seungyeul Yoo Eunjee Lee Li Wang Luan Lin Quan Long Supported by: Mount Sinai Genomics Institute Eric Schadt Bin Zhang Zhidong Tu Charles Powell Patrizia Casaccia Boston University Avrum Spira Joshua Campbell U Washington Roger Bumgarner Berkeley Rachel Brem Princeton Leonid Kruglyak Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai Janssen Canary Foundation Prostate Cancer Foundation NIH NCI