CS-E5880 Modeling biological networks Gene regulatory networks

Jukka Intosalmi (based on slides by Harri Lähdesmäki), Department of Computer Science, Aalto University, January 12, 2018

Outline
Modeling gene expression
Reading (see references at the end): this lecture closely follows Section 7 of (Ingalls, 2013).

Gene expression and gene regulatory networks
Cellular functions are driven by proteins.
Proteins are produced in a process called gene expression.
The abundance of a protein is primarily controlled by its production rate.
These production rates are regulated by transcription factor proteins.
Collectively, genes and transcription factors form a gene regulatory network.

Unregulated gene expression Figure: Figure 7.2a from (Ingalls, 2013)

Regulated gene expression Figure: Figures 7.2b-c from (Ingalls, 2013)

Gene expression
Gene expression is a two-step process: transcription followed by translation.
Figure: Figure 7.1 from (Ingalls, 2013)

Modeling gene expression
Transcription and translation are complex processes that involve biochemical reactions which, in turn, involve many elementary reactions.
The majority of these biochemical reactions are not fully characterized.
Transcription factors and other molecules involved in gene regulation are typically present in small amounts.
Mass-action kinetics is therefore applied in a more abstract sense.
Differential equation models describe the average behaviour over a population of cells.

Modeling gene expression Simplified models may be feasible if we can assume that the activities of background cellular machinery (e.g. nucleic acids, RNA polymerases, amino acids, ribosomes) are fixed. Level of abstraction depends on the application.

Unregulated gene expression
Unregulated gene expression can be modelled by
$$\frac{d}{dt} m(t) = k_0 - \delta_m m(t)$$
$$\frac{d}{dt} p(t) = k_1 m(t) - \delta_p p(t),$$
where $m(t)$ and $p(t)$ are the mRNA and protein concentrations.
The transcription rate parameter $k_0$ depends on gene copy number, abundance of polymerase, promoter affinity, and availability of nucleotide building blocks.
The translation rate parameter $k_1$ depends on availability of ribosomes, strength of the mRNA's ribosome binding, and availability of transfer RNA and amino acids.
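
The two equations above can be integrated numerically to see how mRNA and protein levels build up from zero. The following is a minimal sketch using forward-Euler steps in plain Python; the function name `simulate_unregulated` and all parameter values are illustrative assumptions, not values from the lecture.

```python
def simulate_unregulated(k0, k1, delta_m, delta_p, t_end, dt=0.01, m0=0.0, p0=0.0):
    """Forward-Euler integration of the unregulated gene expression model.

    dm/dt = k0 - delta_m * m
    dp/dt = k1 * m - delta_p * p
    Returns lists of time points, mRNA levels and protein levels.
    """
    times, ms, ps = [0.0], [m0], [p0]
    m, p = m0, p0
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        dm = k0 - delta_m * m
        dp = k1 * m - delta_p * p
        m, p = m + dt * dm, p + dt * dp
        times.append(step * dt)
        ms.append(m)
        ps.append(p)
    return times, ms, ps


# Assumed, purely illustrative parameter values (arbitrary units).
K0, K1, DELTA_M, DELTA_P = 2.0, 5.0, 1.0, 0.1
times, ms, ps = simulate_unregulated(K0, K1, DELTA_M, DELTA_P, t_end=200.0)
print(f"m(200) = {ms[-1]:.3f}, p(200) = {ps[-1]:.3f}")
```

With these assumed values the mRNA level settles much sooner than the protein level, because $\delta_m$ is ten times larger than $\delta_p$.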

Unregulated gene expression
Transcription and translation are balanced by the degradation rates $\delta_m$ and $\delta_p$.
The parameters $\delta_m$ and $\delta_p$ can also be used to describe dilution of mRNA and protein due to increasing cell size.

Unregulated gene expression
In steady state we have
$$m_{ss} = \frac{k_0}{\delta_m}, \qquad p_{ss} = \frac{k_0 k_1}{\delta_m \delta_p}.$$
The unregulated gene expression model can be simplified by assuming that mRNA decay is much faster than protein decay (minutes vs. hours).
As before, separation of time scales motivates the use of the quasi-steady state $m(t) \approx k_0/\delta_m$ for the mRNA level, thus
$$\frac{d}{dt} p(t) = \frac{k_0 k_1}{\delta_m} - \delta_p p(t).$$
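
To see when the quasi-steady-state reduction is reasonable, one can integrate the full two-variable model and the reduced protein-only model side by side and record the largest gap between the two protein trajectories. This is a rough sketch with forward-Euler steps; the helper name `max_relative_gap` and the parameter values are assumptions chosen for illustration only.

```python
def max_relative_gap(k0, k1, delta_m, delta_p, t_end, dt=0.001):
    """Largest |p_full - p_reduced| / p_ss over the simulated time window.

    Full model:    dm/dt = k0 - delta_m*m,  dp/dt = k1*m - delta_p*p
    Reduced model: dp/dt = k0*k1/delta_m - delta_p*p
    Both start from m = p = 0.
    """
    p_ss = k0 * k1 / (delta_m * delta_p)
    m, p_full, p_red = 0.0, 0.0, 0.0
    gap = 0.0
    for _ in range(int(round(t_end / dt))):
        dm = k0 - delta_m * m
        dp_full = k1 * m - delta_p * p_full
        dp_red = k0 * k1 / delta_m - delta_p * p_red
        m += dt * dm
        p_full += dt * dp_full
        p_red += dt * dp_red
        gap = max(gap, abs(p_full - p_red) / p_ss)
    return gap


# Assumed values: fast mRNA decay (delta_m >> delta_p) versus slow mRNA decay.
gap_fast = max_relative_gap(k0=10.0, k1=1.0, delta_m=10.0, delta_p=0.1, t_end=50.0)
gap_slow = max_relative_gap(k0=10.0, k1=1.0, delta_m=0.2, delta_p=0.1, t_end=50.0)
print(f"fast mRNA decay: {gap_fast:.4f}, slow mRNA decay: {gap_slow:.4f}")
```

The gap is small only when $\delta_m$ is much larger than $\delta_p$, which is the time-scale separation the reduction relies on.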

Regulated gene expression
The majority of gene expression regulation takes place at the initiation of transcription.
In prokaryotes, this is regulated via association of RNA polymerase with the gene promoter.
Proteins called transcription factors (TFs) can affect polymerase association.
TF binding sites, also called operator sites, are typically located close to promoters.
In eukaryotes, TF binding sites can be farther away.
TFs can act as activators or repressors.
Additionally, gene expression can be regulated at many other stages: RNA polymerase binding, RNA polymerase elongation, translation initiation and elongation, mRNA and protein degradation, epigenetic state of chromatin, post-translational modifications of proteins/TFs, miRNA binding to mRNA, etc.

Regulated gene expression An illustration of activator and repressor TFs Figure: Figures 7.2b-c from (Ingalls, 2013)

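TF binding
TF binding to a binding/operator site can be modeled by
$$P + O \underset{d}{\overset{a}{\rightleftharpoons}} OP,$$
where P and O are the TF and operator site, and $a$ and $d$ are the association and dissociation rate constants.
If binding/dissociation occurs much faster than gene expression, we can assume steady state, $a[O][P] = d[OP]$, so that $[OP] = [O][P]/K$ with $K = d/a$, and write
$$\text{fraction of bound sites} = \frac{[OP]}{[O] + [OP]} = \frac{[O][P]/K}{[O] + [O][P]/K} = \frac{[P]/K}{1 + [P]/K} = \frac{[P]}{K + [P]}.$$
This defines the promoter occupancy.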

Rates of regulated transcription
The rate of transcription depends on promoter occupancy.
Activation:
$$\text{rate of activated transcription} = \alpha \frac{[P]/K}{1 + [P]/K},$$
where $K$ and $\alpha$ are the half-saturating constant and limiting rate as before.
The above model implies no transcription if the TF is not bound.
Often genes can be transcribed weakly even in the absence of TF binding:
$$\text{rate of activated transcription} = \alpha_0 + \alpha \frac{[P]/K}{1 + [P]/K}.$$

Rates of regulated transcription
Repression can be quantified by following the fractional saturation
$$\text{fraction of unbound sites} = \frac{[O]}{[O] + [OP]} = \frac{1}{1 + [P]/K},$$
which corresponds to the following leaky transcription rate:
$$\text{rate of repressed transcription} = \alpha_0 + \alpha \frac{1}{1 + [P]/K}.$$
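
The occupancy-based rate laws for activation and repression can be written as small functions, which makes it easy to check limiting cases such as $[P] = 0$ and $[P] = K$. This is a minimal sketch; the function names are hypothetical and the numbers in the example calls are arbitrary.

```python
def occupancy(P, K):
    """Fraction of operator sites bound by the TF: ([P]/K) / (1 + [P]/K)."""
    return (P / K) / (1.0 + P / K)


def activated_rate(P, K, alpha, alpha0=0.0):
    """Leaky activated transcription: alpha0 + alpha * (fraction of bound sites)."""
    return alpha0 + alpha * occupancy(P, K)


def repressed_rate(P, K, alpha, alpha0=0.0):
    """Leaky repressed transcription: alpha0 + alpha * (fraction of unbound sites)."""
    return alpha0 + alpha * (1.0 - occupancy(P, K))


# Arbitrary example values: at [P] = K half of the sites are bound.
print(occupancy(2.0, 2.0))
print(activated_rate(0.0, K=1.0, alpha=10.0, alpha0=0.5))
print(repressed_rate(3.0, K=1.0, alpha=8.0))
```

Writing the unbound fraction as one minus the occupancy keeps the two rate laws consistent with each other by construction.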

Regulation by multiple TFs
Genes are commonly regulated by several TFs.
Assume the promoter of a gene contains two non-overlapping operator sites: $O_A$ for TF A and $O_B$ for TF B.
The promoter has four different states:
O: A and B unbound
OA: A bound at $O_A$, B unbound
OB: B bound at $O_B$, A unbound
OAB: A bound at $O_A$, B bound at $O_B$

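Regulation by multiple TFs
If the binding events at $O_A$ and $O_B$ are independent, then
$$O + A \underset{d_1}{\overset{a_1}{\rightleftharpoons}} OA, \qquad OB + A \underset{d_1}{\overset{a_1}{\rightleftharpoons}} OAB,$$
$$O + B \underset{d_2}{\overset{a_2}{\rightleftharpoons}} OB, \qquad OA + B \underset{d_2}{\overset{a_2}{\rightleftharpoons}} OAB.$$
For example, the fraction of cells in state OAB is
$$\text{fraction of cells in state OAB} = \frac{[OAB]}{[O] + [OA] + [OB] + [OAB]}.$$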

Regulation by multiple TFs
Assuming steady state for the binding events and using the fractional saturation, one gets
$$\text{fraction of cells in state O} = \frac{1}{1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B}}$$
$$\text{fraction of cells in state OA} = \frac{\frac{[A]}{K_A}}{1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B}}$$
$$\text{fraction of cells in state OB} = \frac{\frac{[B]}{K_B}}{1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B}}$$
$$\text{fraction of cells in state OAB} = \frac{\frac{[A][B]}{K_A K_B}}{1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B}},$$
where $K_A = d_1/a_1$ and $K_B = d_2/a_2$.

Regulation by multiple TFs
The rate of transcription depends on the type of multivariate transcriptional regulation.
For example, if both A and B are repressors and either one is enough to block transcription, then gene expression happens only from state O and
$$\text{transcription rate} = \frac{\alpha}{1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B}}.$$
Alternatively, if both repressors are needed for inhibition, gene expression happens from states O, OA and OB, and
$$\text{transcription rate} = \frac{\alpha \left(1 + \frac{[A]}{K_A} + \frac{[B]}{K_B}\right)}{1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B}}.$$
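
The four promoter states and the two repression schemes can be computed directly from the TF concentrations. The sketch below assumes independent binding as in the slides; the function names and the example concentrations are hypothetical.

```python
def state_fractions(A, B, K_A, K_B):
    """Steady-state fractions of the four promoter states for independent binding."""
    a, b = A / K_A, B / K_B
    denom = 1.0 + a + b + a * b
    return {"O": 1.0 / denom, "OA": a / denom, "OB": b / denom, "OAB": a * b / denom}


def rate_either_repressor_blocks(A, B, K_A, K_B, alpha):
    """Either repressor alone blocks transcription: expression only from state O."""
    return alpha * state_fractions(A, B, K_A, K_B)["O"]


def rate_both_repressors_needed(A, B, K_A, K_B, alpha):
    """Both repressors are needed to block: expression from states O, OA and OB."""
    f = state_fractions(A, B, K_A, K_B)
    return alpha * (f["O"] + f["OA"] + f["OB"])


# Arbitrary example: both TFs at their dissociation constants.
fractions = state_fractions(A=1.0, B=2.0, K_A=1.0, K_B=2.0)
print(fractions)
print(rate_either_repressor_blocks(1.0, 2.0, 1.0, 2.0, alpha=8.0))
print(rate_both_repressors_needed(1.0, 2.0, 1.0, 2.0, alpha=8.0))
```

Other regulation schemes (for instance two activators) would follow the same pattern: sum the fractions of the states from which transcription is assumed to occur.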

Cooperativity in TF binding
Recall cooperativity in the binding of multiple ligands to a protein.
Cooperativity among multiple TFs binding operator sites in a promoter can also occur and can be modelled similarly.
Consider the same model for two TFs regulating a gene via operator sites $O_A$ and $O_B$ as before, but additionally assume that the two TFs interact after binding DNA.
Assume the dissociation constant of the second TF binding event is reduced by a factor $K_Q$ (with $K_Q < 1$).
The dissociation constant for A dissociating from OAB reduces from $K_A$ to $K_A K_Q$ ($K_A > K_A K_Q$).
The dissociation constant for B dissociating from OAB reduces from $K_B$ to $K_B K_Q$ ($K_B > K_B K_Q$).

Cooperativity in TF binding
The steady-state distribution changes to
$$\text{fraction of cells in state O} = \frac{1}{1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B K_Q}}$$
$$\text{fraction of cells in state OA} = \frac{\frac{[A]}{K_A}}{1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B K_Q}}$$
$$\text{fraction of cells in state OB} = \frac{\frac{[B]}{K_B}}{1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B K_Q}}$$
$$\text{fraction of cells in state OAB} = \frac{\frac{[A][B]}{K_A K_B K_Q}}{1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B K_Q}}.$$
In the case of strong cooperativity ($K_Q \ll 1$), the second operator site will almost always be bound once the first TF has bound.

Cooperativity in TF binding
In the case of strong cooperativity ($K_Q \ll 1$), the second operator site will almost always be bound once the first TF has bound.
This implies that states OA and OB will be negligible:
$$1 + \frac{[A]}{K_A} + \frac{[B]}{K_B} + \frac{[A][B]}{K_A K_B K_Q} \approx 1 + \frac{[A][B]}{K_A K_B K_Q}$$
and
$$\text{fraction of cells in state O} = \frac{1}{1 + \frac{[A][B]}{K_A K_B K_Q}}$$
$$\text{fraction of cells in state OAB} = \frac{\frac{[A][B]}{K_A K_B K_Q}}{1 + \frac{[A][B]}{K_A K_B K_Q}}.$$
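
A quick numerical check shows how small the singly bound states become when $K_Q$ is small. The sketch below evaluates the cooperative steady-state distribution and the strong-cooperativity approximation; the function names and the chosen numbers ($[A]/K_A = [B]/K_B = 0.01$, $K_Q = 10^{-4}$) are assumptions picked only to make the effect visible.

```python
def cooperative_fractions(A, B, K_A, K_B, K_Q):
    """Steady-state fractions of states O, OA, OB, OAB with cooperativity factor K_Q."""
    a, b = A / K_A, B / K_B
    ab = a * b / K_Q
    denom = 1.0 + a + b + ab
    return {"O": 1.0 / denom, "OA": a / denom, "OB": b / denom, "OAB": ab / denom}


def strong_cooperativity_approx(A, B, K_A, K_B, K_Q):
    """Approximation that neglects the singly bound states OA and OB."""
    ab = (A / K_A) * (B / K_B) / K_Q
    return {"O": 1.0 / (1.0 + ab), "OAB": ab / (1.0 + ab)}


exact = cooperative_fractions(A=0.01, B=0.01, K_A=1.0, K_B=1.0, K_Q=1e-4)
approx = strong_cooperativity_approx(A=0.01, B=0.01, K_A=1.0, K_B=1.0, K_Q=1e-4)
print("singly bound fraction:", exact["OA"] + exact["OB"])
print("exact O:", exact["O"], "approximate O:", approx["O"])
```

For these assumed numbers the promoter is split almost evenly between O and OAB, while OA and OB together account for about one percent.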

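Cooperativity in TF binding
In the case where the two TFs are the same, A = B = P, the distribution becomes
$$\text{fraction of cells in state O} = \frac{1}{1 + \frac{[P]^2}{K_P^2 K_Q}}$$
$$\text{fraction of cells in state OPP} = \frac{\frac{[P]^2}{K_P^2 K_Q}}{1 + \frac{[P]^2}{K_P^2 K_Q}}.$$
In general, with $N$ TFs, this takes the familiar form of a Hill function
$$\frac{\left(\frac{[P]}{K_P K_Q}\right)^N}{1 + \left(\frac{[P]}{K_P K_Q}\right)^N} = \frac{[P]^N}{K^N + [P]^N},$$
where $K$ is the effective half-saturating constant.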
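
The effect of the Hill coefficient $N$ on the steepness of the occupancy curve is easy to see numerically. A minimal sketch follows; the function name `hill` and the sampled concentrations are illustrative choices.

```python
def hill(P, K, N):
    """Hill function [P]^N / (K^N + [P]^N)."""
    return P ** N / (K ** N + P ** N)


# Occupancy at half and twice the half-saturating constant for several N.
for N in (1, 2, 4):
    low, high = hill(0.5, 1.0, N), hill(2.0, 1.0, N)
    print(f"N={N}: occupancy at 0.5K = {low:.3f}, at 2K = {high:.3f}")
```

Every curve passes through one half at $[P] = K$, and larger $N$ makes the transition around $K$ more switch-like.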

Autoregulatory gene circuits
Figure: Figure 7.3 from (Ingalls, 2013)

Autoinhibition
In equilibrium the transcription rate of an inhibited gene is
$$\text{rate of repressed transcription} = \frac{\alpha}{1 + [P]/K}.$$
The model for the autoinhibition loop is thus
$$\frac{d}{dt} p(t) = \frac{\alpha}{1 + p(t)/K} - \delta_p p(t).$$
The autoinhibition mechanism provides reduced sensitivity to perturbations.
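
One way to make the reduced sensitivity concrete is to compare how the steady-state protein level responds when the limiting rate $\alpha$ is doubled. Setting the right-hand side of the autoinhibition model to zero gives a quadratic in $p$ whose positive root is used below. As a point of comparison, the sketch assumes an unregulated gene with constant production rate $\alpha$ and the same decay rate; that comparison model, the function names, and the numbers are assumptions for illustration.

```python
import math


def autoinhibited_steady_state(alpha, K, delta_p):
    """Positive root of alpha / (1 + p/K) = delta_p * p."""
    return 0.5 * K * (-1.0 + math.sqrt(1.0 + 4.0 * alpha / (delta_p * K)))


def unregulated_steady_state(alpha, delta_p):
    """Steady state of dp/dt = alpha - delta_p * p (assumed comparison model)."""
    return alpha / delta_p


# Arbitrary example: double alpha and compare the fold change in steady state.
fold_auto = autoinhibited_steady_state(20.0, 1.0, 1.0) / autoinhibited_steady_state(10.0, 1.0, 1.0)
fold_unreg = unregulated_steady_state(20.0, 1.0) / unregulated_steady_state(10.0, 1.0)
print(f"autoinhibited fold change: {fold_auto:.3f}, unregulated fold change: {fold_unreg:.3f}")
```

The autoinhibited gene changes its steady state by less than a factor of two, whereas the unregulated comparison changes by exactly two.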

Autoactivation
An autoactivation loop can be constructed similarly, assuming steady state for the mRNA level:
$$\text{rate of activated transcription} = \frac{\alpha [P]/K}{1 + [P]/K}.$$
The model for the autoactivation loop is thus
$$\frac{d}{dt} p(t) = \frac{\alpha\, p(t)/K}{1 + p(t)/K} - \delta_p p(t).$$
Quick convergence to a state in which the gene is expressed at a high rate.
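
The convergence to a high-expression state can be checked with a short simulation. Setting the right-hand side of the autoactivation model to zero gives $p = 0$ and, when $\alpha > \delta_p K$, the nonzero steady state $p = \alpha/\delta_p - K$. The sketch below uses forward-Euler steps; the function name and parameter values are assumptions.

```python
def simulate_autoactivation(p_init, alpha, K, delta_p, t_end=50.0, dt=0.01):
    """Forward-Euler integration of dp/dt = alpha*(p/K)/(1 + p/K) - delta_p*p."""
    p = p_init
    for _ in range(int(round(t_end / dt))):
        occupancy = (p / K) / (1.0 + p / K)
        p += dt * (alpha * occupancy - delta_p * p)
    return p


# Assumed illustrative values: alpha = 10, K = 1, delta_p = 1.
p_final = simulate_autoactivation(p_init=0.1, alpha=10.0, K=1.0, delta_p=1.0)
p_off = simulate_autoactivation(p_init=0.0, alpha=10.0, K=1.0, delta_p=1.0)
print(f"from p=0.1: {p_final:.4f}, from p=0: {p_off:.4f}, alpha/delta_p - K = {10.0 / 1.0 - 1.0}")
```

In this leak-free form of the model a gene that starts with exactly zero protein stays off, while any small initial amount is amplified towards the high state.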

Genetic switches
The so-called Collins toggle switch is one of the first genetic toggle switches.
It was constructed by re-wiring components of an existing gene regulatory network.
The network consists of a mutual repression scheme.
The toggle switch can be flipped by intervening on the TFs.
Figure: Figure 7.12 from (Ingalls, 2013)

The Collins toggle switch
The Collins toggle switch contains a reporter gene.
Green fluorescent protein (GFP) emits green light when exposed to blue light.
GFP is attached downstream of one of the TFs.

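The Collins toggle switch
A simple model without the mRNA level has been developed for the switch:
$$\frac{d}{dt} p_1(t) = \frac{\alpha_1}{1 + \left(\frac{p_2(t)}{1 + i_2}\right)^{\beta}} - p_1(t)$$
$$\frac{d}{dt} p_2(t) = \frac{\alpha_2}{1 + \left(\frac{p_1(t)}{1 + i_1}\right)^{\gamma}} - p_2(t),$$
where
Protein concentrations: $p_1(t)$ and $p_2(t)$
Inducer concentrations: $i_1$ and $i_2$
Limiting rates: $\alpha_1$ and $\alpha_2$
Degrees of nonlinearity (cooperativity): $\beta$ and $\gamma$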
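
The bistability of the toggle switch model and the effect of an inducer can be explored with a short simulation. This is a minimal forward-Euler sketch; the function name `simulate_toggle`, the symmetric parameter choice ($\alpha_1 = \alpha_2 = 3$, $\beta = \gamma = 2$) and the inducer level are assumptions for illustration, not values taken from the slides.

```python
def simulate_toggle(p1, p2, alpha1=3.0, alpha2=3.0, beta=2.0, gamma=2.0,
                    i1=0.0, i2=0.0, t_end=50.0, dt=0.01):
    """Forward-Euler integration of the two-protein toggle switch model.

    dp1/dt = alpha1 / (1 + (p2 / (1 + i2))**beta)  - p1
    dp2/dt = alpha2 / (1 + (p1 / (1 + i1))**gamma) - p2
    Returns the final (p1, p2).
    """
    for _ in range(int(round(t_end / dt))):
        dp1 = alpha1 / (1.0 + (p2 / (1.0 + i2)) ** beta) - p1
        dp2 = alpha2 / (1.0 + (p1 / (1.0 + i1)) ** gamma) - p2
        p1, p2 = p1 + dt * dp1, p2 + dt * dp2
    return p1, p2


# Two different initial conditions settle in two different states (no inducers).
high1 = simulate_toggle(2.0, 0.1)
high2 = simulate_toggle(0.1, 2.0)

# Start in the p2-high state, apply inducer 2 for a while, then remove it.
induced = simulate_toggle(*high2, i2=10.0)
flipped = simulate_toggle(*induced)

print("p1-high state:", high1)
print("p2-high state:", high2)
print("after transient induction:", flipped)
```

In this sketch inducer 2 weakens the repression exerted by protein 2, so protein 1 rises and the switch remains in the p1-high state after the inducer is removed.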

The Collins toggle switch
An illustration of a numerical simulation of the model.
The inducers have the desired effect.
Figure: Figure 7.13 from (Ingalls, 2013)

Oscillatory gene regulatory networks
An autoinhibitory gene can generate persistent oscillations.
We will look at a specific model that produces a circadian rhythm in the fruit fly.
Circadian rhythms regulate our sleep-wake cycles and are disturbed by travel across time zones.
Studies have shown that these internal clocks have a period of approx. 24 hours.
In mammals, the primary pacemaker is a group of several thousand neurons in the suprachiasmatic nucleus, which have a direct connection to the retina.
We will consider an early circadian oscillator model for the fruit fly by Albert Goldbeter.

Oscillatory gene regulatory networks
Goldbeter's model was initially suggested by molecular studies:
per mRNA, protein and phosphoprotein levels all oscillate with a 24-hour period.
The peak in mRNA precedes that of the protein by 4 hours.
If PER cannot enter the nucleus, oscillations do not occur.

Oscillatory gene regulatory networks
Illustration of Goldbeter's model
The basic structure: autoinhibition with delay
Figure: Figure 7.18 from (Ingalls, 2013)

Oscillatory gene regulatory networks
Goldbeter's model is formulated by using first-order kinetics for transport and Michaelis-Menten kinetics for degradation and (de)phosphorylation:
$$\frac{d}{dt} m(t) = \frac{v_s}{1 + (p_N(t)/K_I)^n} - \frac{v_m m(t)}{K_m + m(t)}$$
$$\frac{d}{dt} p_0(t) = k_s m(t) - \frac{V_1 p_0(t)}{K_1 + p_0(t)} + \frac{V_2 p_1(t)}{K_2 + p_1(t)}$$
$$\frac{d}{dt} p_1(t) = \frac{V_1 p_0(t)}{K_1 + p_0(t)} - \frac{V_2 p_1(t)}{K_2 + p_1(t)} - \frac{V_3 p_1(t)}{K_3 + p_1(t)} + \frac{V_4 p_2(t)}{K_4 + p_2(t)}$$
$$\frac{d}{dt} p_2(t) = \frac{V_3 p_1(t)}{K_3 + p_1(t)} - \frac{V_4 p_2(t)}{K_4 + p_2(t)} - k_1 p_2(t) + k_2 p_N(t) - \frac{V_d p_2(t)}{K_d + p_2(t)}$$
$$\frac{d}{dt} p_N(t) = k_1 p_2(t) - k_2 p_N(t)$$
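
The five equations can be coded as a right-hand-side function and integrated with a fixed-step fourth-order Runge-Kutta scheme. This is a sketch only: the dictionary keys, the initial state, and in particular the parameter values are assumptions (they are in the range commonly quoted for this model, but should be checked against the original source before any quantitative use).

```python
def goldbeter_rhs(state, par):
    """Right-hand side of the five-variable model: state = [m, p0, p1, p2, pN]."""
    m, p0, p1, p2, pN = state
    phos1 = par["V1"] * p0 / (par["K1"] + p0)    # p0 -> p1
    dephos1 = par["V2"] * p1 / (par["K2"] + p1)  # p1 -> p0
    phos2 = par["V3"] * p1 / (par["K3"] + p1)    # p1 -> p2
    dephos2 = par["V4"] * p2 / (par["K4"] + p2)  # p2 -> p1
    dm = par["vs"] / (1.0 + (pN / par["KI"]) ** par["n"]) - par["vm"] * m / (par["Km"] + m)
    dp0 = par["ks"] * m - phos1 + dephos1
    dp1 = phos1 - dephos1 - phos2 + dephos2
    dp2 = (phos2 - dephos2 - par["k1"] * p2 + par["k2"] * pN
           - par["Vd"] * p2 / (par["Kd"] + p2))
    dpN = par["k1"] * p2 - par["k2"] * pN
    return [dm, dp0, dp1, dp2, dpN]


def rk4_step(f, state, par, dt):
    """One classical fourth-order Runge-Kutta step."""
    s1 = f(state, par)
    s2 = f([x + 0.5 * dt * k for x, k in zip(state, s1)], par)
    s3 = f([x + 0.5 * dt * k for x, k in zip(state, s2)], par)
    s4 = f([x + dt * k for x, k in zip(state, s3)], par)
    return [x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for x, a, b, c, d in zip(state, s1, s2, s3, s4)]


# Assumed parameter values (time in hours); verify against the original source.
PARAMS = {"vs": 0.76, "vm": 0.65, "Km": 0.5, "ks": 0.38, "Vd": 0.95, "Kd": 0.2,
          "k1": 1.9, "k2": 1.3, "KI": 1.0, "n": 4,
          "K1": 2.0, "K2": 2.0, "K3": 2.0, "K4": 2.0,
          "V1": 3.2, "V2": 1.58, "V3": 5.0, "V4": 2.5}

DT, N_STEPS = 0.01, 7200                 # 72 simulated hours
state = [0.1, 0.25, 0.25, 0.25, 0.25]    # assumed initial state [m, p0, p1, p2, pN]
trajectory = [state]
for _ in range(N_STEPS):
    state = rk4_step(goldbeter_rhs, state, PARAMS, DT)
    trajectory.append(state)

total_protein = [sum(s[1:]) for s in trajectory]   # p0 + p1 + p2 + pN
print("final state:", [round(x, 4) for x in state])
```

Plotting `total_protein` and the nuclear protein level (the last entry of each state) against time shows whether the chosen parameters lie in the oscillatory regime. A fixed-step integrator is used here only to keep the sketch dependency-free; an adaptive ODE solver would be the more usual choice.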

Oscillatory gene regulatory networks
Illustration of Goldbeter's model
Total protein level: $p_T = p_0 + p_1 + p_2 + p_N$
Nuclear protein level: $p_N$
Figure: Figure 7.19 from (Ingalls, 2013)

References Ingalls BP, Mathematical Modeling in Systems Biology: An Introduction, MIT Press, 2013