GLOBEX Bioinformatics (Summer 2015) Genetic networks and gene expression data

Similar documents
3.B.1 Gene Regulation. Gene regulation results in differential gene expression, leading to cell specialization.

Introduction to Bioinformatics

Computational Genomics. Systems biology. Putting it together: Data integration using graphical models

REVIEW SESSION. Wednesday, September 15 5:30 PM SHANTZ 242 E

Topic 4 - #14 The Lactose Operon

Chapter 16 Lecture. Concepts Of Genetics. Tenth Edition. Regulation of Gene Expression in Prokaryotes

REGULATION OF GENE EXPRESSION. Bacterial Genetics Lac and Trp Operon

Regulation of Gene Expression

Bayesian Networks BY: MOHAMAD ALSABBAGH

56:198:582 Biological Networks Lecture 8

Introduction. Gene expression is the combined process of :

Complete all warm up questions Focus on operon functioning we will be creating operon models on Monday

Boolean models of gene regulatory networks. Matthew Macauley Math 4500: Mathematical Modeling Clemson University Spring 2016

Warm-Up. Explain how a secondary messenger is activated, and how this affects gene expression. (LO 3.22)

Prokaryotic Gene Expression (Learning Objectives)

13.4 Gene Regulation and Expression

Chapter 15 Active Reading Guide Regulation of Gene Expression

CHAPTER : Prokaryotic Genetics

the noisy gene Biology of the Universidad Autónoma de Madrid Jan 2008 Juan F. Poyatos Spanish National Biotechnology Centre (CNB)

Multistability in the lactose utilization network of Escherichia coli

CHAPTER 13 PROKARYOTE GENES: E. COLI LAC OPERON

Intrinsic Noise in Nonlinear Gene Regulation Inference

Introduction to Bioinformatics

Biology. Biology. Slide 1 of 26. End Show. Copyright Pearson Prentice Hall

Learning in Bayesian Networks

Regulation of Gene Expression

Bi 1x Spring 2014: LacI Titration

Gene Regulation and Expression

The Monte Carlo Method: Bayesian Networks

Name: SBI 4U. Gene Expression Quiz. Overall Expectation:

Gene regulation I Biochemistry 302. Bob Kelm February 25, 2005

12-5 Gene Regulation

Chapter 18: Control of Gene Expression

GENE REGULATION AND PROBLEMS OF DEVELOPMENT

Lecture 18 June 2 nd, Gene Expression Regulation Mutations

THE EDIBLE OPERON David O. Freier Lynchburg College [BIOL 220W Cellular Diversity]

Inferring Protein-Signaling Networks

Initiation of translation in eukaryotic cells:connecting the head and tail

Inferring Protein-Signaling Networks II

Control of Gene Expression in Prokaryotes

Lecture 4: Transcription networks basic concepts

Written Exam 15 December Course name: Introduction to Systems Biology Course no

Random Boolean Networks

WELCOME BAYESIAN NETWORK

Controlling Gene Expression

Prokaryotic Regulation

Prokaryotic Gene Expression (Learning Objectives)

Bayesian Networks. Machine Learning, Fall Slides based on material from the Russell and Norvig AI Book, Ch. 14

MCMC: Markov Chain Monte Carlo

UNIT 6 PART 3 *REGULATION USING OPERONS* Hillis Textbook, CH 11

Biological Pathways Representation by Petri Nets and extension

Unit 3: Control and regulation Higher Biology

32 Gene regulation, continued Lecture Outline 11/21/05

APGRU6L2. Control of Prokaryotic (Bacterial) Genes

Bacterial Genetics & Operons

Name Period The Control of Gene Expression in Prokaryotes Notes

AP Bio Module 16: Bacterial Genetics and Operons, Student Learning Guide

Predicting Protein Functions and Domain Interactions from Protein Interactions

Undirected Graphical Models

Week 10! !

Bioinformatics 2 - Lecture 4

Control of Prokaryotic (Bacterial) Gene Expression. AP Biology

Big Idea 3: Living systems store, retrieve, transmit and respond to information essential to life processes. Tuesday, December 27, 16

Chapter 7: Regulatory Networks

2. Mathematical descriptions. (i) the master equation (ii) Langevin theory. 3. Single cell measurements

CISC 636 Computational Biology & Bioinformatics (Fall 2016)

Translation - Prokaryotes

CS-E5880 Modeling biological networks Gene regulatory networks

Inferring gene regulatory networks

Measuring TF-DNA interactions

2. What was the Avery-MacLeod-McCarty experiment and why was it significant? 3. What was the Hershey-Chase experiment and why was it significant?

Multistability in the lactose utilization network of E. coli. Lauren Nakonechny, Katherine Smith, Michael Volk, Robert Wallace Mentor: J.

Honors Biology Reading Guide Chapter 11

Physical network models and multi-source data integration

Graph Alignment and Biological Networks

Bi 8 Lecture 11. Quantitative aspects of transcription factor binding and gene regulatory circuit design. Ellen Rothenberg 9 February 2016

Latent Variable models for GWAs

Villa et al. (2005) Structural dynamics of the lac repressor-dna complex revealed by a multiscale simulation. PNAS 102:

Optimal State Estimation for Boolean Dynamical Systems using a Boolean Kalman Smoother

Inferring Transcriptional Regulatory Networks from High-throughput Data

Stochastic simulations

Lesson Overview. Gene Regulation and Expression. Lesson Overview Gene Regulation and Expression

Regulation of gene expression. Premedical - Biology

The Gene The gene; Genes Genes Allele;

Regulation of Gene Expression at the level of Transcription

Lecture 7: Simple genetic circuits I

Regulation of Gene Expression in Bacteria and Their Viruses

Computational Cell Biology Lecture 4

A Problog Model For Analyzing Gene Regulatory Networks

Translation and Operons

L3.1: Circuits: Introduction to Transcription Networks. Cellular Design Principles Prof. Jenna Rickus

Topic 4: Equilibrium binding and chemical kinetics

Index. FOURTH PROOFS n98-book 2009/11/4 page 261

BioControl - Week 6, Lecture 1

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

16 CONTROL OF GENE EXPRESSION

A&S 320: Mathematical Modeling in Biology

Computational Biology: Basics & Interesting Problems

Network motifs in the transcriptional regulation network (of Escherichia coli):

Systems Biology. Edda Klipp, Wolfram Liebermeister, Christoph Wierling, Axel Kowald, Hans Lehrach, and Ralf Herwig. A Textbook

Transcription:

GLOBEX Bioinformatics (Summer 2015) Genetic networks and gene expression data 1

Gene Networks Definition: A gene network is a set of molecular components, such as genes and proteins, and interactions between them that collectively carry out some cellular function. A genetic regulatory network refers to the network of controls that turn on/off gene transcription. Motivation: Using a known structure of such networks, it is sometimes possible to describe behavior of cellular processes, reveal their function and the role of specific genes and proteins Experiments DNA microarray : observe the expression of many genes simultaneously and monitor gene expression at the level of mrna abundance. Protein chips: the rapid identification of proteins and their abundance is becoming possible through methods such as 2D polyacrylamide gel electrophoresis. 2-hybrid systems: identify protein-protein interactions (Stan Fields lab http://depts.washington.edu/sfields/) 2

Regulation Genes (DNA) Message (RNA) Proteins Function/ Environment Other Cells Regulation 3

Operon + + + + P1 laci P2 lacz lacy laca lac operon on E. coli RNA polymerase laci protein P1 laci P2 lacz lacy laca LAC RNA polymerase laci suppressed LAC LAC + + + P1 laci P2 lacz lacy laca Repressor protein coded by laci, bind to P2 preventing transcription of lacz, lacy and laca Lactose binds with laci, allowing RNA polymerase to bind to P2 and transcribe the structural genes 4

Genetic Network Models Linear Model: expression level of a node in a network depends on linear combination of the expression levels of its neighbors. Boolean Model: The most promising technique to date is based on the view of gene systems as a logical network of nodes that influence each other's expression levels. It assumes only two distinct levels of expression: 0 and 1. According to this model a value of a node at the next step is boolean function of the values of its neighbors. Bayesian Model: attempts to give a more accurate model of network behavior, based on Bayesian probabilities for expression levels. 5

Evaluation of Models Inferential power Predictive power Robustness Consistency Stability Computational cost 6

Boolean Networks: An example 1: induced 0: suppressed -: forced low +: forced high Interpreting data Reverse Engineering 7

Boolean networks: A Predictor/Chooser scheme Gene Expression Profile Predictor (may give multiple networks) Chooser (suggest new perturbation experiments) New Gene Expression Profile 8

Predictor A population of cells containing a target genetic network T is monitored in the steady state over a series of M experimental perturbations. In each perturbation p m (0 m < M) any number of nodes may be forced to a low or high level. Wild-type state -: forced low +: forced high 9

Step 1. For each gene x n, find all pairs of rows (i, j) in E in which the expression level of x n differs, excluding rows in which x n was forced to a high or low value. For x 3, we find: (p0, p1), (p0, p3), (p1, p2), (p2, p3) 10

Step 2. For each pair (i,j), S ij contains all other genes whose expression levels also differ between experiments i and j. Find the minimum cover set S min, which contains at least one node from each set S ij Step 1: (p0,p1), (p0, p3), (p1,p2), (p2,p3) Step 2: (p0, p1)->s 01 ={x 0, x 2 } (p0, p3)->s 03 ={x 2 } (p1, p2)-> S 12 ={x 0, x 1 } (p2, p3)->s 23 ={x 1 ) So, now the S min is {x 1, x 2 } 11

Step 3. use the nodes in S min as input, x n as output, build truth table to find out f n (In this example, n=3) Now the S min is {x 1, x 2 } x 1 1 0 1 0 x 2 1 1 0 0 x 3 0 * 1 0 So f 3 = 0 * 1 0 * cannot be determined 12

Chooser For L hypothetical equiprobable networks generated by the predictor, choose perturbation p that would best discriminate between the L networks, by maximizing entropy H p as defined below. H p = - s=1 S (l s /L) log 2 (l s /L) where l s is the number of networks giving the state s Note: (1 s S), and (1 S L) 13

Result and Evaluation Evaluation of Predictor construct a target network T: size = N, and maximum in-degree = k (where the in-degree of a node is its number of incoming edges) sensitivity is defined as the percentage of edges in the target network that were also present in the inferred network, and specificity is defined as the percentage of edges in the inferred network that were also present in the target network. 14

Discussions Incorporate pre-existing information Boolean to multi-levels Cyclic networks Noise tolerance References Ideker, Thorsson, and Karp, PSB 2000, 5: 302-313. 15

Bayesian Networks Biological processes are stochastic - Data can be noisy as well. 16

conditions Join probability Query/Inference: P(X1 X6, X7)? e.g., P(+,+,-,,+) = 0.003 P(-,+,+,, -) = 0.00015 2 N How many combinations? 17

Conditional Probability and Conditional Independence 18

Bayesian Network as an efficient way to factorize the Joint Probability Factorization of join probability Example: # of parameters = 2 N -1 # of parameters = 1 + 2 + 4 + 8 + 16 = 31 Conditional independence Assuming max in-degree k, the number of parameters is reduced to 2 k N # of parameters = 1 + 1 + 4 + 1 + 1 = 10 19

A greedy Algorithm to Learn Bayesian Network from the data 20

Parameter Estimation -Maximum Likelihood -Bayesian approach - Dirichlet priors are used for model parameters. Structure evaluation 21

Model Averaging Feature f: edge X Y is in the network. f(g) = 1, if G has the feature = 0, otherwise. How to compute P[f(G) D]? -Enumerate all high scored networks - Sampling (MCMC) - Bootstrap 22

Bootstrap 23

Inference Given P(X 1,, X N ) as a BN, calculate P(X i evidence), where evidence is a subset of nodes that we know their values. e.g., P(X 2 X 3, X 4 ) =? Exact inference is NP-hard (Cooper, 1990) P(X i evidence) = Y V {Xi, evidence} P(X i Y) BINF694, S15, Liao 24

Inference by Sampling Direct sampling, e.g. P(X 5 ) Rejection sampling Weighted (likelihood) sampling Gibbs sampling P(X 1 =1 X 3 =1, X 4 =2) = ¾ P(X 2 =1 X 3, X 3 ) = ¼ P(X 5 =1 X 3, X 3 ) = 2/4 Learning Bayesian Networks, Neapolitan, 2004 BINF694, S15, Liao 25

Likelihood Weighting BINF694, S15, Liao 26

Gibbs sampler Russell & Norvig, AI Modern Approach, 2ed. BINF694, S15, Liao 27

Markov blanket Russell & Norvig, AI Modern Approach, 2ed. BINF694, S15, Liao 28

Model Intervention for causality - Bayesian networks <= causal networks - Bayesian networks + intervention => causality External interventions are needed to infer the causal direction for edges in a Bayesian network: Genetic mutations sirna small chemical interventions as inhibitors or activators BINF694, S15, Liao 29

Intervention on X - do(x=x) for node X to take value x. - Incoming edges to X are cut off. - X becomes a root node, and P(X=x) =1. BINF694, S15, Liao 30

Example Two Bayesian networks: X Y and Y X - Equivalent based on observed data - do(x=x) for node X to take value x. - If X Y is the causal network, then P(Y do(x=x)) = P(Y X=x) - If Y X is the causal network, then P(Y do(x=x)) = P(Y) We will get a different conditional distribution in the observational and inhibited (intervened) samples. Whereas we cannot distinguish between the two models with the use of observations alone, we can differentiate between them with the use of interventional data. BINF694, S15, Liao 31

Dynamics of gene expression regulation Linear model: Non-linear model:

Discretization: Stochastic -- adding noise

Loops in DBN

BINF694, S15, Liao 38

BINF694, S15, Liao 39

BINF694, S15, Liao 40

Bayesian networks for cellular signaling BINF694, S15, Liao 41

42

43