Written Exam, 15 December: Introduction to Systems Biology

Technical University of Denmark
Written Exam, 15 December 2008
Course name: Introduction to Systems Biology
Course no.: 27041
Aids allowed: Open book exam

Provide your answers and calculations on separate paper. Remember to provide the required information on each of these pages, i.e. name, student number and signature on the final page. The contribution to the overall score for each question and section is indicated in bold (out of 100 points).

Section I: Short questions (42 points total)

1. Although it is a broadly defined field, Systems Biology does have a number of common principles and approaches. Please indicate TRUE or FALSE whether the following statements characterize Systems Biology. (2 points)
A. Systems Biology endeavours to understand the global behaviour of a cell through modelling its components and their functional interactions.
B. Systems Biology models are often defined from sequence analysis of the genome alone.
Answer: A) TRUE, B) FALSE

2. Which of the 3 Gene Ontologies (Biological Process, Molecular Function, Cellular Component) will contain the following annotation terms? Assign each term to its ontology. (3 points)
A. regulation of glycolysis
B. peroxisomal membrane
C. DNA binding
Answer: A) Biological Process, B) Cellular Component, C) Molecular Function

3. You have five nodes in a network. What is the minimum number of interactions needed if the network is to be drawn as one connected component? Draw the resulting network. (2 points)
Answer: 4 interactions, e.g. the chain O-O-O-O-O

4. Name 3 types of relationships between components of the cell (e.g. genes, DNA, RNA, protein, metabolites) that are commonly represented as networks. (3 points)
Answer: Transcriptional regulatory interactions (TF-DNA), protein-protein interactions (complexes), metabolic reactions (enzyme-metabolite), genetic interactions (gene-gene), signalling cascades (protein kinase-protein target)

5. A regulation program (see the figure) for a module of genes has been identified using the module networks procedure by Segal et al. The regulation program consists of two transcription factors, X and Y. X is a repressor and Y is an activator, and both have similar regulatory strength. Determine for each of the three contexts (A, B, C) whether the genes in the module are up-regulated, down-regulated or expressed at basal level. (3 points)
Answer: Context A: up-regulated, Context B: down-regulated, Context C: basal

6. Given the node degree distributions for two different scale-free networks, Graph A and Graph B (see figure), which network is likely to have the larger network diameter (i.e. the longest shortest path) and why? (2 points)

Answer: Graph B, because it has fewer hubs (high-degree nodes). A larger number of high-degree nodes (as seen in Graph A) will likely result in a network of smaller diameter.

7. In a research paper, the authors have identified a protein complex composed of five proteins based on a mass spec pull-down experiment. In the article, they have drawn the protein complex using the matrix representation with no self-interacting proteins. If you were to verify each interaction in the drawn complex with Y2H experiments, how many experiments would you have to perform? (3 points)
Answer: 10 Y2H experiments

8. List one advantage and one disadvantage of the bow-tie architecture of metabolic networks. (2 points)
Answer:
Advantages: i) Flexibility ii) Evolvability iii) Error tolerance
Disadvantages: i) Prone to targeted attacks on hubs ii) Spatial distribution of currency metabolites is crucial for functioning

9. Based on the network below, which node has the highest in-degree? Which node has the highest out-degree? (2 points)
Answer: B has the highest out-degree and E has the highest in-degree.
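The count in Question 7 can be checked by enumeration. The matrix representation links every protein in the pull-down to every other one, so with no self-interactions the number of Y2H tests is the number of unordered pairs. The following is a minimal sketch; the protein labels P1 to P5 are hypothetical placeholders, not names from the exam.

```python
from itertools import combinations


def pairwise_tests(proteins):
    """Return every unordered pair of distinct proteins (no self-interactions)."""
    return list(combinations(proteins, 2))


# Hypothetical labels for the five proteins in the complex.
complex_members = ["P1", "P2", "P3", "P4", "P5"]
pairs = pairwise_tests(complex_members)

# n * (n - 1) / 2 pairs for n proteins
print(len(pairs))
```

For n proteins the same function returns n(n-1)/2 pairs, which grows quadratically with the size of the complex.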

10. Draw a network with five nodes, where at least one node has a clustering coefficient of 1. Indicate in your drawn network the node with a clustering coefficient of 1. (3 points)

11. Draw all incoherent regulatory feed-forward circuits of 3 regulators (nodes) and 3 regulatory interactions (edges). (2 points)

12. In a simple feed-forward circuit (3 regulators, 3 interactions) the promoter of the target gene (i.e. terminal node) must integrate the input from 2 regulators. What regulatory function(s) must be assumed (e.g. AND, OR, NAND, NOR, XOR, XNOR) in order to implement a delayed induction circuit and a delayed repression circuit? Briefly explain why and show your 2 circuits. (4 points)
Answer: AND for both induction and repression, as both upstream regulators need to be present/absent before induction/repression can happen. NOR would be required for a de-repression delay.

13. A node from a scale-free, hierarchical network has a high degree. Is it most likely to have a high or a low clustering coefficient, and why? (2 points)
Answer: Most likely a low clustering coefficient, because in a hierarchical network the clustering coefficient decreases with node degree following a power law.

14. Neighbour enzymes of metabolites are often found to be co-regulated. List two possible (biological) reasons behind this observation. (2 points)
Answer: 1. Maintain homeostasis. 2. Adjust the level of the metabolite so that the new flux distribution around that metabolite is thermodynamically feasible.

15. What statistical methods are used to assess the enrichment of Gene Ontology (GO) terms in a set of genes? Describe at least one. (2 points)
Answer: Test based on the hypergeometric distribution (Fisher's exact test), chi-square test, test based on the binomial distribution.

16. Based on the Uri Alon book, which of the simple regulatory circuits was described as having regulatory memory? (2 points)
Answer: Positive regulatory feedback loop

17. In the repressilator work of Elowitz & Leibler (Nature, 403(20) 2000), 6 coupled first-order differential equations were used to describe the concentrations of mRNA (m_i) and protein (p_i) for the 3 repressors. Given that α is the production rate of mRNA and β is defined as the ratio of the protein to mRNA degradation (decay) rates, which parameter did they have to control in the engineered E. coli in order to achieve oscillatory behaviour, and why? (Note that these variable definitions are different from the Alon book.) (3 points)
Answer: β. Because the natural protein decay rates were far too low ever to achieve oscillatory behaviour, they had to engineer degradation signals into the protein sequence (the numerator of β).

Section II: Multipart questions (58 points)

18. Respiratory capacity in the yeast Saccharomyces cerevisiae is regulated by a transcription factor called Hap4. Expression of Hap4 is induced by oxygen in the presence of ethanol and/or acetate, and is repressed by high glucose concentration. (6 points for all parts)
A. Draw a logic gate representation of the above-described model, assuming that all variables are binary. (4 points)
B. Provide the truth table for ethanol, acetate, oxygen, glucose and Hap4. (2 points)

A) Logic gate diagram (figure not reproduced). The truth table in B corresponds to Hap4 = O2 AND (Ethanol OR Acetate) AND NOT Glucose.
B) Truth table:

Glucose  O2  Ethanol  Acetate  Hap4
0        0   0        0        0
0        0   0        1        0
0        0   1        0        0
0        0   1        1        0
0        1   0        0        0
0        1   0        1        1
0        1   1        0        1
0        1   1        1        1
1        0   0        0        0
1        0   0        1        0
1        0   1        0        0
1        0   1        1        0
1        1   0        0        0
1        1   0        1        0
1        1   1        0        0
1        1   1        1        0

19. Protein production from mRNA can be approximated by a simple model under a certain set of simplifying assumptions, as shown in the following figure. (6 points for all parts)
A. Assume that the stimulation of protein synthesis by mRNA (S) is the rate-limiting factor for the production of the protein (P). Write the differential equation that describes the concentration of P as a function of the concentration of S, and derive the relation between P and S at steady state. (4 points)
B. Would you recommend modelling this system by using a Boolean model? If not, illustrate a signal-response curve that may be suitable for making a binary behaviour assumption. (2 points)

A) At steady state the time derivative can be set equal to zero. Thus we get:
B) Since the signal-response relationship for this case is linear, a Boolean approximation is not advisable. An example of a signal-response curve where the Boolean approximation will be useful is given below.
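The equation for Question 19, part A, did not survive in this copy. A plausible reconstruction, assuming first-order synthesis from S and first-order loss of P, is sketched here. The rate constants k_s (synthesis) and k_d (degradation) are assumed names; the original figure may label them differently.

```latex
\frac{dP}{dt} = k_s S - k_d P
\quad\Longrightarrow\quad
0 = k_s S - k_d P_{ss}
\quad\Longrightarrow\quad
P_{ss} = \frac{k_s}{k_d}\, S
```

Under these assumptions the steady-state protein level is proportional to S, which is consistent with the linear signal-response relationship stated in part B.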

20. Protein interactions: A series of Y2H experiments have been performed in order to determine the composition of a potential protein complex. The results are summarized in the table below. (10 points for all parts)
A. Draw the network when you only include interactions with a reliability score above -0.7. Even though all experiments are done in replicates (as both prey and bait), they should only count as one in your drawing and calculations. (6 points)
B. To clean up the resulting complex, it is decided to include proteins if their clustering coefficient is defined (i.e. having at least two neighbours) and above 0.5. Which proteins are members of the resulting, cleaned protein complex? (4 points)

A) A-B: -0.301, A-C: -0.301, A-D: -0.60, A-F: -0.90, B-C: 0, B-D: -0.301, C-D: -0.301, D-E: -0.90, E-F: -0.6
B) A, B, C, D, which all have a clustering coefficient of 1

21. Regulatory network: You find an article in which it is stated that proteins E and F are co-factors for two transcription factors X and Y, respectively. The two transcription factors regulate the expression of proteins A, B, C and D. The regulatory network is depicted in the figure below. (8 points for all parts)
The regulatory influence of the transcription factors and of E and F is currently unknown, i.e. each could have either a repressing or an activating effect. To support this claim, a series of knock-out expression experiments are performed, the results of which can be found in the table below. The results are given as relative expression compared to a wild-type strain.
A. From the expression data and the regulatory network, determine whether X has an activating, inhibiting or no regulatory influence on each of its four targets (A, B, C and D). Is E inhibiting or activating the transcription factor X? (3 points)
B. From the expression data and the regulatory network, determine whether Y has an activating, inhibiting or no regulatory influence on each of its four targets (A, B, C and D). Is F inhibiting or activating the transcription factor Y? (3 points)
C. What experiment would you perform in order to determine if there is a genetic interaction between E and F? (2 points)
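The filtering and clean-up steps of Question 20 can be sketched in a few lines. The scores below are the ones listed in the answer to part A; the function names and the dictionary-of-sets graph representation are illustrative choices, not part of the exam.

```python
from itertools import combinations

# Reliability scores as listed in the answer to Question 20, part A.
scores = {
    ("A", "B"): -0.301, ("A", "C"): -0.301, ("A", "D"): -0.60,
    ("A", "F"): -0.90, ("B", "C"): 0.0, ("B", "D"): -0.301,
    ("C", "D"): -0.301, ("D", "E"): -0.90, ("E", "F"): -0.6,
}
THRESHOLD = -0.7  # keep interactions scoring above this value


def build_graph(scores, threshold):
    """Undirected adjacency sets; every tested protein is kept as a node."""
    graph = {}
    for (u, v), score in scores.items():
        graph.setdefault(u, set())
        graph.setdefault(v, set())
        if score > threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def clustering_coefficient(graph, node):
    """Fraction of neighbour pairs that are linked; None if fewer than 2 neighbours."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return None
    links = sum(1 for a, b in combinations(sorted(neighbours), 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


graph = build_graph(scores, THRESHOLD)
coefficients = {n: clustering_coefficient(graph, n) for n in sorted(graph)}
members = [n for n, c in coefficients.items() if c is not None and c > 0.5]
print(coefficients)
print(members)
```

E and F keep only their mutual interaction after filtering, so their clustering coefficient is undefined and they drop out of the cleaned complex.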

A) X is activating A and C and inhibiting B and D. E is activating X.
B) Y is not inhibiting any of the proteins; it is activating C and D. F is inhibiting Y.
C) A double knock-out, in which both E and F are knocked out, checking whether the expression changes are unexpected compared to the individual knock-outs.

22. Regulatory cascade: Consider a cascade of three repressors, X ⊣ Y ⊣ Z (X represses Y, and Y represses Z). Protein X is initially present in the cell in its inactive form. The input signal of X, S_x, appears at time t = 0. As a result, X rapidly (instantaneously) becomes active and binds the promoter of gene Y, so that production of protein Y is repressed. When Y levels fall below a threshold K_y, gene Z begins to be transcribed. (10 points for all parts)
Assumptions you can make:
- All proteins have the same degradation and dilution rate α = 0.9.
- X is always present at sufficient concentrations to perfectly repress Y when in its activated form, X*.
- Y is produced in its active form Y* (i.e. Y := Y*).
- The promoter of Z is constitutively active when not repressed.
A. Plot the concentrations of Y and Z qualitatively, by hand, given that K_y = ½ Y_st, where Y_st is the steady-state level of Y. (4 points)
B. What is the response time of Z (i.e. the time for Z to reach half of its steady state) given the definition of K_y in part A? Remember to show your work. (4 points)
C. What is the total response time of Z, from when X is activated, if the de-repression of Z occurs at K_y = ¼ Y_st, i.e. Y is effective at repressing Z down to ¼ of its maximal levels? (2 points)
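Parts B and C of Question 22 combine two first-order processes: the decay of Y from Y_st down to the threshold K_y, followed by the rise of Z to half of its steady state. The sketch below assumes that Y production stops completely at t = 0 and that Z production switches on as a step once Y crosses K_y; the function names are illustrative.

```python
import math

ALPHA = 0.9  # degradation and dilution rate given in the question


def decay_time(fraction, alpha=ALPHA):
    """Time for Y(t) = Y_st * exp(-alpha * t) to fall to `fraction` of Y_st."""
    return math.log(1.0 / fraction) / alpha


def rise_half_time(alpha=ALPHA):
    """Time for Z(t) = Z_st * (1 - exp(-alpha * t)) to reach Z_st / 2."""
    return math.log(2.0) / alpha


# Part B: threshold at half of Y_st, then Z rises to half of Z_st.
t_b = decay_time(0.5) + rise_half_time()
# Part C: threshold at a quarter of Y_st, then the same rise of Z.
t_c = decay_time(0.25) + rise_half_time()
print(round(t_b, 2), round(t_c, 2))
```

Lowering the threshold from ½ Y_st to ¼ Y_st adds exactly one more half-life, ln(2)/α, to the total delay.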

B) This is the time for Y to decay to ½ Y_st plus the response time of Z measured from T_Ky (the time at which Y reaches K_y): 2·ln(2)/α = 1.54
C) ln(4)/α + ln(2)/α = 2.31

23. Metabolic network: The following diagram shows a simplified metabolic network for a certain mammalian cell line. The maximum substrate uptake rate is estimated to be 3 mmol/gDW/h. X denotes biomass, while C and D are by-products. Assume steady-state behaviour unless otherwise stated. (18 points for all parts)

A. Identify all essential reactions and a synthetically lethal set of reactions. (2 points)
B. Identify all sets of fully coupled reactions. (2 points)
C. List two sets of directionally flux-coupled reactions that do not involve flux v1. (2 points)
D. What will be the FBA-predicted optimal growth rate for this model? Will the FBA solution be unique? Justify your answer. (2 points)
E. For both products, C and D, prepare a plot showing the relation between the biomass formation rate and productivity (in this case, product formation rate * biomass formation rate). What is the specific biomass production rate (i.e. growth rate) at which the maximum productivity will be achieved? (4 points)
F. To understand the transcriptional regulation in the metabolism, DNA microarray analysis was performed comparing two different conditions. The results from the analysis are summarized in the following table, in terms of the p-values (and Z-scores) calculated by using a statistical significance test. Using this data, identify the top-scoring reporter metabolite (i.e. the metabolite around which the most significant collective transcriptional changes are observed). The background distribution of Z-scores at the whole-genome level is given by the following equations,

where n is the number of neighbours. (6 points)

a) 2 pts. Essential reactions: v1, v11. Synthetically lethal sets: {v2, v3}, {v2, v6}, {v2, v7}
b) 2 pts. Fully coupled sets: {v8, v10}, {v6, v7}
c) 2 pts. Directionally flux-coupled reactions: {v4 v3}, {v9 v2}, {v8 v2}
d) 2 pts. The FBA-predicted optimal growth rate for this model will be 3 hr^-1. This will be the case when all of the substrate goes to biomass and none to by-products. The FBA solution in this case will not be unique, since any distribution of flux at the branch points following the metabolites A and B will result in the same optimal growth rate.
e) 4 pts. The relationship between the productivity and the biomass formation rate is shown in the above plot. The maximum productivity for C occurs at a growth rate of 1.5, i.e. at the point where the substrate is equally distributed between biomass and product. In the case of D, no product formation is feasible due to the imbalance of ATP/ADP in the pathway, and hence there is no optimal productivity.
f) 6 pts. The calculation of the corrected Z-scores for all metabolites is summarized in the following table. Based on these results it can be seen that the highest-scoring reporter metabolite is ATP (/ADP).
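The optimum quoted in answer e) for product C can be reproduced with a short scan. Because the network diagram is not reproduced here, this is a sketch under a stated assumption: one unit of substrate yields either one unit of biomass or one unit of C, so product rate = uptake - growth rate. That assumption is consistent with answers d) and e) but is not taken from the original figure.

```python
UPTAKE = 3.0  # maximum substrate uptake rate, mmol/gDW/h (from the question)


def productivity(growth_rate, uptake=UPTAKE):
    """Product formation rate times biomass formation rate, assuming 1:1 yields."""
    product_rate = uptake - growth_rate
    return product_rate * growth_rate


# Scan growth rates from 0 to the uptake limit in steps of 0.01.
grid = [i / 100 for i in range(0, 301)]
best = max(grid, key=productivity)
print(best, productivity(best))
```

Under the 1:1 assumption the curve is the parabola μ(3 - μ), which is zero when all substrate goes to biomass or all to product and peaks midway between the two.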