Computational Genomics. Reconstructing dynamic regulatory networks in multiple species

02-710 Computational Genomics Reconstructing dynamic regulatory networks in multiple species

Methods for reconstructing networks in cells
[Figure: example reconstructed networks; visible node labels include CRH1, SLT2, SLR3, YPS3, YPS1]
Amit et al., Science 2009; Pe'er et al., RECOMB 2001; Segal et al., Nature Genetics 2003; Gerstein et al., Science 2010

Key problem: Most high-throughput data is static
- Time-series measurements
- Static data sources
[Figure labels: DNA motif, ChIP-chip, microarray, PPI; axis: Time]

Method: Integrating time-series expression and static protein-DNA interaction data

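[Figure, four panels: (a) time-series expression data (expression level vs. time); (b) static TF-DNA binding data for TF A, TF B, TF C, TF D; (c) model structure (expression level vs. time); (d) IOHMM model with transition probabilities on the edges (values shown: 0.1, 1, 0.95, 0.9, 0.05, 1)]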

Things are a bit more complicated: Real data

A Hidden Markov Model

\[
L(O, H; \theta) = \prod_{i=1}^{n} \left[ \prod_{t=1}^{T} p\big(O_t(i) \mid H_t(i)\big) \prod_{t=2}^{T} p\big(H_t(i) \mid H_{t-1}(i)\big) \right]
\]

[Figure: chain of hidden states H_0, H_1, H_2, H_3, each emitting an observed output (expression level) O_0, O_1, O_2, O_3 at t=0, 1, 2, 3]
Schliep et al., Bioinformatics 2003
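The joint likelihood on this slide is a product over genes of emission terms p(O_t(i) | H_t(i)) and transition terms p(H_t(i) | H_{t-1}(i)). A minimal Python sketch of that product, evaluated in log space, is given here. The Gaussian emission model, the two-state toy setup, and all names (`joint_log_likelihood`, `toy_emission`, `STATE_MEANS`, `TRANSITION`) are assumptions made for illustration, not part of the slides.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution, used here as an assumed emission model."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def joint_log_likelihood(observations, paths, emission, transition):
    """Log of the product over genes i of
    prod_t p(O_t(i) | H_t(i)) * prod_{t>=2} p(H_t(i) | H_{t-1}(i)).

    observations[i][t] : expression value of gene i at time point t
    paths[i][t]        : hidden state index of gene i at time point t
    emission(o, h)     : p(o | state h)
    transition[a][b]   : p(next state b | current state a)
    """
    total = 0.0
    for obs, path in zip(observations, paths):
        for t, (o, h) in enumerate(zip(obs, path)):
            total += math.log(emission(o, h))
            if t > 0:  # transition terms start at the second time point
                total += math.log(transition[path[t - 1]][h])
    return total


# Toy setup (hypothetical): state 0 = "low" (mean -1), state 1 = "high" (mean +1).
STATE_MEANS = [-1.0, 1.0]
TRANSITION = [[0.9, 0.1], [0.2, 0.8]]


def toy_emission(o, h):
    return gaussian_pdf(o, STATE_MEANS[h], 1.0)


observations = [[-1.1, -0.8, 0.9], [1.2, 1.0, 0.7]]
paths = [[0, 0, 1], [1, 1, 1]]
print(joint_log_likelihood(observations, paths, toy_emission, TRANSITION))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied across genes and time points.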

Input Output Hidden Markov Model (Bengio and Frasconi, NIPS 1995)
- Input I_g: static transcription factor-gene interactions
- Hidden state variables H_0, H_1, H_2, H_3 (t=0, 1, 2, 3): we constrain transitions between states to form a tree structure
- Output state variables O_0, O_1, O_2, O_3: Gaussian distribution for expression values

Log likelihood:
- Sum over all genes
- Sum over all paths Q
- Product over all Gaussian emission density values on the path
- Product over all transition probabilities on the path
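The log likelihood described on this slide (sum over genes, sum over paths Q, product of Gaussian emission densities and transition probabilities along each path) can be sketched in Python as a recursion over the state tree. This is an illustrative sketch under stated assumptions: the tree is stored as nested dictionaries, each split is at most binary, and the split probability is a logistic function of the gene's static input vector. The dictionary layout and the names `child_probs`, `path_sum`, and `iohmm_log_likelihood` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def child_probs(node, inputs):
    """Transition probabilities from `node` to its children for one gene's inputs."""
    if len(node["children"]) == 1:
        return [1.0]
    # Binary split: assumed logistic function of the static TF-gene inputs.
    score = node["bias"] + sum(w * x for w, x in zip(node["weights"], inputs))
    p_first = sigmoid(score)
    return [p_first, 1.0 - p_first]


def path_sum(node, t, obs, inputs):
    """Sum, over all paths below `node`, of emission densities times transition
    probabilities. The tree depth must equal the number of time points."""
    emit = gaussian_pdf(obs[t], node["mean"], node["std"])
    if not node["children"]:
        return emit
    probs = child_probs(node, inputs)
    return emit * sum(p * path_sum(child, t + 1, obs, inputs)
                      for p, child in zip(probs, node["children"]))


def iohmm_log_likelihood(root, expression, tf_inputs):
    """Sum over genes of the log of the path-summed likelihood."""
    return sum(math.log(path_sum(root, 0, obs, inp))
               for obs, inp in zip(expression, tf_inputs))


# Toy tree with two time points: the root splits into an "up" and a "down" state.
up = {"mean": 1.0, "std": 1.0, "children": []}
down = {"mean": -1.0, "std": 1.0, "children": []}
root = {"mean": 0.0, "std": 1.0, "children": [up, down],
        "weights": [2.0], "bias": 0.0}

expression = [[0.0, 1.2], [0.1, -0.9]]  # two genes, two time points
tf_inputs = [[1], [0]]                  # gene 1 is bound by the TF, gene 2 is not
print(iohmm_log_likelihood(root, expression, tf_inputs))
```

Because the states form a tree, the sum over paths factorizes along the branches, so the recursion visits each state once per gene.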

Results: Yeast response pathways

Application to AA starvation in yeast with condition-specific data
Expression data: Gasch et al., Mol. Biol. Cell 2000; ChIP-chip data: Harbison et al., Nature 2004
Ernst et al., Nature MSB 2007

Application to AA starvation in yeast with condition-specific data

Category labels on the figure and the values shown next to them:

Category                           | Value
Nitrogen compound metabolism       | 5 x 10^-23
Nucleotide biosynthesis            | 2 x 10^-12
Cellular carbohydrate metabolism   | 4 x 10^-13
Protein biosynthesis               | 3 x 10^-72
Ribosome biogenesis and assembly   | 4 x 10^-21
Amino acid transport               | 10^-9

Application to AA starvation in yeast with general binding and motif data: new predictions

Validating predicted interactions for Ino4
[Figure: bar chart "Ino4 Occupancy in Gene Promoter Region at 0h and 4h"; y-axis: intensity (0 to 12); x-axis: gene (YDR497C, YNL169C, YGR196C, YHR123W); series: SCD, AA starvation]

Validating predicted interactions for Ino4
[Figure: scatter plot "AA vs. SCD Binding for Ino4 Bound Genes on Response Path (Repeat 1)"; x-axis: SCD repeat 1, -log10 p-value (0 to 5); y-axis: AA repeat 1, -log10 p-value (0 to 5)]
- Ino4 regulates phospholipid biosynthesis.
- Many genes in the path are known lipid metabolism genes (GO p-value 6 x 10^-5).
- This may be connected to the need for membrane in the autophagocytosis process, which regulates the equilibrium between proteins and the diminishing set of amino acids.

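[Figure: overview of application areas around a model with nodes numbered 1 to 9]
- Stress and hormone response
- Stem cell differentiation
- Immune response
- Fly development
Labels and citations in the order they appear on the slide: MSB 2007, PLoS Comp. Bio. 2008, eLife 2013; MSB 2011; Science 2010; IRF7; Genome Research 2010, PLoS ONE 2011, Nature Immunology 2013, ISMB 2013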

microRNAs

miRNA-target detection using expression data
[Figure: expression vs. time panels for a miRNA and an mRNA (mir1, mrna1; intervals time 1-2, time 2-3, time 3-4) illustrating global association, no association, and time-specific association]

Existing methods:
- Pearson correlation (Cheng et al. 2008, Huang et al. 2011)
- Regression with graphical models, GenMiR++ (Huang et al. 2007)

Disadvantages:
- Methods often assume linear correlation
- Time is not explicitly modeled
- No prediction of time-specific regulation
- Joint modeling of TF regulation is seldom possible
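To make the contrast between global and time-specific association concrete, here is a small Python sketch. `pearson` is the standard Pearson correlation coefficient over the whole time series. `opposite_intervals` is a hypothetical helper, not taken from any of the cited methods, that simply lists the consecutive time intervals in which the miRNA and the mRNA move in opposite directions. The expression values are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def opposite_intervals(mirna, mrna):
    """Intervals (t, t+1) where the miRNA and mRNA change in opposite directions."""
    return [(t, t + 1)
            for t in range(len(mirna) - 1)
            if (mirna[t + 1] - mirna[t]) * (mrna[t + 1] - mrna[t]) < 0]


# Made-up series: the miRNA only rises sharply (and the mRNA only drops)
# in the last interval.
mirna = [1.0, 1.0, 1.1, 3.0]
mrna = [2.0, 2.5, 2.6, 0.5]

print("global Pearson correlation:", pearson(mirna, mrna))
print("intervals with opposite changes:", opposite_intervals(mirna, mrna))
```

A single correlation value summarizes the whole series and cannot say when the association holds, which is the limitation the slide lists under "no prediction of time-specific regulation".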

Incorporating miRNAs into DREM
- Incorporate miRNA expression ratios with a logistic function to generate a dynamic input map for miRNAs
- Enforce positivity constraints on the miRNA coefficients in the logistic regression model (convex optimization, still a global optimum)
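A sign-constrained logistic regression of the kind mentioned on this slide can be sketched with projected gradient ascent: take an ordinary gradient step on the log-likelihood, then clip the constrained coefficients back to zero if they turn negative. The code below is only an illustrative sketch, not the optimizer used by the authors. The function name, the toy data, and the encoding (outcome 1 for the path the regulator is assumed to push genes toward) are assumptions. Because the objective is concave and the constraint set is convex, the constrained problem still has a global optimum.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_constrained_logistic(X, y, nonneg, lr=0.1, steps=500):
    """Projected gradient ascent on the logistic log-likelihood.

    nonneg[j] is True when coefficient j must stay >= 0; such coefficients
    are clipped to zero after every gradient step.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = yi - sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b += lr * grad_b / len(X)
        for j in range(n_features):
            w[j] += lr * grad_w[j] / len(X)
            if nonneg[j]:
                w[j] = max(0.0, w[j])  # projection onto the constraint set
    return w, b


# Toy data (made up): feature 0 is associated with outcome 1, while
# feature 1 is associated with outcome 0 and would get a negative weight
# in an unconstrained fit.
X = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 0]]
y = [1, 1, 0, 0, 1, 0]

weights, bias = fit_constrained_logistic(X, y, nonneg=[True, True])
print(weights, bias)
```

In this toy run the coefficient of feature 1 stays at zero, so a regulator whose data would imply the disallowed sign contributes nothing to the split.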

Dynamic regulatory models for lung development in mice
[Figure legend: down-regulated miRNA; up-regulated miRNA]

Validating miRNAs controlling the Week 1 to Week 2 transition

miRNA        | Sign. paths, opposite direction | Corrected enrichment p-values
mir-125a-5p  | A, B, C+E                       | 0.0108, 0.00008, 0.03872
mir-337-5p   | B, C+E                          | 0.0332, 0.00008
mir-467c     | D                               | < 10^-6
mir-466a-3p  | D                               | 0.05152
mir-466d-3p  | D                               | 0.03904
mir-30d      | H                               | 0.01456
mir-30a      | -                               | -
mir-23b      | -                               | -

From development to disease
Analysis of significant miRNAs in patients with Idiopathic Pulmonary Fibrosis (IPF)

miRNA        | 10 control vs 10 IPF (Tissue Bank) | 28 control vs 33 IPF (Tissue Bank) | 142 control vs 162 IPF (LGRC) | Bleomycin at 14d (3 replicates; Liu et al. data, J. Exp. Med. 2010)
mir-125a     | down | -    | -    | down
mir-30a      | down | down | down | down
mir-30d      | down | down | down | down
mir-467c     | -    | -    | -    | -
mir-337      | -    | up   | up   | -
mir-466a-3p  | -    | -    | -    | down
mir-466d-3p  | -    | -    | -    | down

[Slide annotations: Development, Disease]
Schulz et al., PNAS 2013

E. coli

Escherichia coli Regulatory Network
- Comprehensive genome-wide binding data is currently not available for most transcription factors
- Key transcription factors activate some genes and repress others
- Direction of interaction should also be part of the input to DREM
- 25% of genes have at least one known regulator based on curated small-scale experiments
- No confirmed negative data
- Expression and motif data also available
[Image: E. coli, from Wikipedia]

Using unlabeled data helps supervised learning
Consider the setting:
- Set X of instances drawn from an unknown distribution P(X)
- Wish to learn a target function f: X -> Y
Given:
- iid labeled examples
- iid unlabeled examples
Determine:

SEmi-supervised REgulatory Network Discoverer (SEREND)

Input (one row per gene):

        | Motif | Exp 1 | Exp 2 | ... | Exp p | Label
Gene 1  | 8.0   | 1.2   | -0.5  | ... | 0.4   | activated
Gene 2  | 6.2   | -0.4  | 1.0   | ... | 2.0   | repressed
Gene 3  | 7.0   | -0.8  | 1.2   | ... | 3.2   | unknown
...
Gene N  | 2.2   | 0.4   | 1.4   | ... | -1.4  | unknown

Classifiers:
- Binary logistic regression classifier: P(regulated)
- 3-way logistic regression classifier: P(activated) + P(repressed)
- Meta classifier (binary logistic regression classifier): P(regulated)

Self-training rule to change unknown labels to either activated or repressed:
P(regulated) > 2 x (#activated + #repressed)/n

Ernst et al., PLoS Comput Biol, 2008
Expression data from Faith et al., PLoS Biology 2007; curated interactions from EcoCyc; motifs from RegulonDB
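The self-training rule on this slide, relabeling an unknown gene when P(regulated) > 2 x (#activated + #repressed)/n, can be sketched in Python as follows. The classifier probabilities are passed in as plain lists, since the classifiers themselves are not reproduced here. Choosing between "activated" and "repressed" by the larger of the two class probabilities is an assumption of this sketch, as are the function name `self_train` and the example numbers.

```python
def self_train(labels, p_regulated, p_activated, p_repressed):
    """Apply one pass of the self-training rule.

    labels[g] is "activated", "repressed", or "unknown".
    An unknown gene is relabeled when its P(regulated) exceeds
    2 * (#activated + #repressed) / n.
    """
    n = len(labels)
    n_known = sum(1 for lab in labels if lab in ("activated", "repressed"))
    threshold = 2.0 * n_known / n
    new_labels = list(labels)
    for g, lab in enumerate(labels):
        if lab == "unknown" and p_regulated[g] > threshold:
            # Assumption: pick whichever direction is more probable.
            if p_activated[g] >= p_repressed[g]:
                new_labels[g] = "activated"
            else:
                new_labels[g] = "repressed"
    return new_labels, threshold


# Made-up example: 10 genes, 2 with curated labels, so the threshold is 0.4.
labels = ["activated", "repressed"] + ["unknown"] * 8
p_regulated = [0.9, 0.8, 0.7, 0.1, 0.5, 0.2, 0.3, 0.05, 0.39, 0.0]
p_activated = [0.8, 0.1, 0.6, 0.05, 0.1, 0.1, 0.2, 0.02, 0.2, 0.0]
p_repressed = [0.1, 0.7, 0.1, 0.05, 0.4, 0.1, 0.1, 0.03, 0.19, 0.0]

new_labels, threshold = self_train(labels, p_regulated, p_activated, p_repressed)
print(threshold)
print(new_labels)
```

Tying the threshold to the fraction of genes with curated labels means that the fewer known targets a transcription factor has, the lower the probability needed to promote an unknown gene.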

The Self-Training Step
[Figure: grids of gene labels. Legend: + = regulated, no symbol = unknown, 0 = unregulated. Without self-training: input labels, then labels for final classification. With self-training: input labels, then labels after self-training, then labels for final classification.]

Aerobic-anaerobic shift response in E. coli
[Figure: expression level (log base 10) vs. time, with paths numbered 1 to 9; annotations: 1 = activator, -1 = repressor]
Beg et al., PNAS 2007; Ernst et al., PLoS Comp. Bio. 2008

DREM is useful, but several questions remain
- Who controls the master regulators?