An analytical model of the effect of plasmid copy number on transcriptional noise strength

Similar documents
the noisy gene Biology of the Universidad Autónoma de Madrid Jan 2008 Juan F. Poyatos Spanish National Biotechnology Centre (CNB)

2. Mathematical descriptions. (i) the master equation (ii) Langevin theory. 3. Single cell measurements

Stochastic simulations

Statistical Physics approach to Gene Regulatory Networks

Boulder 07: Modelling Stochastic Gene Expression

Gene Expression as a Stochastic Process: From Gene Number Distributions to Protein Statistics and Back

A synthetic oscillatory network of transcriptional regulators

arxiv: v2 [q-bio.mn] 20 Nov 2017

Stochastic simulations!

arxiv: v2 [q-bio.qm] 12 Jan 2017

Bi 1x Spring 2014: LacI Titration

Modelling Stochastic Gene Expression

Lecture 2: Analysis of Biomolecular Circuits

Biomolecular Feedback Systems

A Hilbert Space for Random Processes

BE 150: Design Principles of Genetic Circuits

Supplementary Materials for

Testing the transition state theory in stochastic dynamics of a. genetic switch

4. Why not make all enzymes all the time (even if not needed)? Enzyme synthesis uses a lot of energy.

Stochastic model of mrna production

Colored extrinsic fluctuations and stochastic gene expression: Supplementary information

Expression arrays, normalization, and error models

Engineering of Repressilator

Stochastic Processes around Central Dogma

Cosmic Statistics of Statistics: N-point Correlations

Counting Statistics and Error Prediction

Biomolecular reaction networks: gene regulation & the Repressilator Phys 682/ CIS 629: Computational Methods for Nonlinear Systems

Genomic Medicine HT 512. Data representation, transformation & modeling in genomics

Modelling residual wind farm variability using HMMs

2 Two Random Variables

A Synthetic Oscillatory Network of Transcriptional Regulators

A Probability Primer. A random walk down a probabilistic path leading to some stochastic thoughts on chance events and uncertain outcomes.

The Wright-Fisher Model and Genetic Drift

Package noise. July 29, 2016

DEGseq: an R package for identifying differentially expressed genes from RNA-seq data

Lecture 7: Simple genetic circuits I

Physics Sep Example A Spin System

Design of microarray experiments

1 Introduction. Maria Luisa Guerriero. Jane Hillston

Q = (c) Assuming that Ricoh has been working continuously for 7 days, what is the probability that it will remain working at least 8 more days?

CSEP 590A Summer Lecture 4 MLE, EM, RE, Expression

CSEP 590A Summer Tonight MLE. FYI, re HW #2: Hemoglobin History. Lecture 4 MLE, EM, RE, Expression. Maximum Likelihood Estimators

Stochastic Structural Dynamics Prof. Dr. C. S. Manohar Department of Civil Engineering Indian Institute of Science, Bangalore

Recall the Basics of Hypothesis Testing

LAB 2. HYPOTHESIS TESTING IN THE BIOLOGICAL SCIENCES- Part 2

Conditions for successful data assimilation

Part 3: Introduction to Master Equation and Complex Initial Conditions in Lattice Microbes

Combined Model of Intrinsic and Extrinsic Variability for Computational Network Design with Application to Synthetic Biology

Every step in the process of gene expression includes some

Wide-field Infrared Survey Explorer

Stem Cell Reprogramming

Gillespie s Algorithm and its Approximations. Des Higham Department of Mathematics and Statistics University of Strathclyde

Statistics Random Variables

Lecture 7: Chapter 7. Sums of Random Variables and Long-Term Averages

Modeling the impact of periodic bottlenecks, unidirectional mutation, and observational error in experimental evolution

Monte Carlo Integration I [RC] Chapter 3

Frequency Spectra and Inference in Population Genetics

Stochastic Processes around Central Dogma

2DI90 Probability & Statistics. 2DI90 Chapter 4 of MR

Modeling and Systems Analysis of Gene Regulatory Networks

Chapter 3: Birth and Death processes

Stochastic Processes

Predicting causal effects in large-scale systems from observational data

Probability and Stochastic Processes

CELL BIOLOGY. by the numbers. Ron Milo. Rob Phillips. illustrated by. Nigel Orme

One-shot Learning of Poisson Distributions Information Theory of Audic-Claverie Statistic for Analyzing cdna Arrays

Numerical Methods. Equations and Partial Fractions. Jaesung Lee

Modeling Multiple Steady States in Genetic Regulatory Networks. Khang Tran. problem.

What is the neural code? Sekuler lab, Brandeis

Reaction Kinetics in a Tight Spot

Lecture 5: Processes and Timescales: Rates for the fundamental processes 5.1

Confidence Intervals 1

SUPPLEMENTARY INFORMATION

Protein production. What can we do with ODEs

Probability and Stochastic Processes: A Friendly Introduction for Electrical and Computer Engineers Roy D. Yates and David J.

Moments of the maximum of the Gaussian random walk

PILCO: A Model-Based and Data-Efficient Approach to Policy Search

Confidence Intervals. Confidence interval for sample mean. Confidence interval for sample mean. Confidence interval for sample mean

Optimal Feedback Strength for Noise Suppression in Autoregulatory Gene Networks

Tutorial 1 : Probabilities

MICROPIPETTE CALIBRATIONS

ECE-340, Spring 2015 Review Questions

Population Genetics: a tutorial

Efficient Leaping Methods for Stochastic Chemical Systems

Supplement A. The relationship of the proportion of populations that replaced themselves

Regression for finding binding energies

STAT 516 Midterm Exam 3 Friday, April 18, 2008

The Amoeba-Flagellate Transformation

56:198:582 Biological Networks Lecture 9

Image Gradients and Gradient Filtering Computer Vision

Stochastic dynamics of small gene regulation networks. Lev Tsimring BioCircuits Institute University of California, San Diego

Expressions for the covariance matrix of covariance data

Mode-locked laser pulse fluctuations

A traffic lights approach to PD validation

Stochastic Processes: I. consider bowl of worms model for oscilloscope experiment:

F denotes cumulative density. denotes probability density function; (.)

A Visual Process Calculus for Biology

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Chapter 15 Active Reading Guide Regulation of Gene Expression

On the declaration of the measurement uncertainty of airborne sound insulation of noise barriers

Transcription:

An analytical model of the effect of plasmid copy number on transcriptional noise strength William and Mary igem 2015 1 Introduction In the early phases of our experimental design, we considered incorporating our two fluorescent reporters into a single plasmid. However, under this method the fluorescence readout for each channel from each cell would represent the cumulative output of many identical but independent noisy functions. The interactions between these independent signals would lead to a damping of the noise in our measurements it is a well-known result that for identical signals with with an uncorrelated zero-mean Gaussian noise term, the signal-tonoise ratio scales proportionally to 1/ n where n is the number of independent copies of the signal [1]. On the other hand, fluctuations of plasmid copy number around its steadystate value would be an added source of noise in our measurements. Although the dual-reporter system filters out extrinsic noise by considering the correlation between the two fluorescence signals, the derivation of the equation which Elowitz et al 2002 used to decompose the observed noise into intrinsic and extrinsic components explicitly accounted for the gene copy number within a cell [2][3]. Without a reliable way to separate the additional copy number fluctuation-induced noise effects from the other noise terms, we would be unable to use this pre-established theory to analyze our measurements for noise. Hence we found the need to develop a model that explicitly describes the simultaneously reducing and enhancing impacts of plasmid copy number on our fluorescence signals. 2 The Basis for the Model We found through a search of the literature that Jones et al 2014 derived a simple equation that accounts for the effect of chromosome replication on transcription rate in E. coli [4] : Var[m] m = m2 1 m 2 1 f(1 f) + m 1 1 + f m 1 (1) where the Fano Factor, Var[m]/ m, is their measure of transcriptional noise. Here m is a random variable which indicates the number of mrna copies present in the cell, and Var[ ] and are used to denote the variance and mean, respectively, across the population. m i is the expected value of m given that there exist i copies of the gene in the cell. f represents the fraction of time that a given cell contains two chromosomes, and hence two copies of the gene. 1

This result is a powerful one, stating that the fluctuations in gene copy number always confer an additive term onto the Fano Factor of transcripts in the cell (notice that the first term is equivalent to the Fano Factor of mrna copy number in a cell that always has one gene copy). The Fano Factor, though it is not the way we define noise in our data analysis, can be interpreted as noise strength or the fold change of the noise of the process compared to that of an equivalent Poisson process [4][5]. Unfortunately, Jones et al s derivation only accounts for cells that contain either 1 or 2 gene copies. Given that the copy number of psb1c3 can range from 100-300 per cell [6], we needed to expand their model to accommodate any number of gene copies. 3 Expansion to n Gene Copies We begin by defining some new random variables to ease the generalization of Jones et al s work: let M be a random variable which represents the mrna copy number in the cell, R a random variable which represents the net mrna production of one gene copy at a given time, and G a random variable which represents the number of additional gene copies in the cell (ie. the number of gene copies 1). M is distributed for any integer n, and R, G = 0 R (2), G = 1 M =, R (n), G = n 1 R (i) = with the R j identical and mutually independent and having mean and variance equal to E[R] and Var[R], respectively. Then the n-gene copy equation is Var[M] E[M] = Var[R] E[R] R j + Var[G] E[R] (2) 1 + E[G] where E[ ] denotes the expected value of the variable over the population. In particular, 1 + E[G] is the expected value of the actual number of plasmids in the cell. Proof. First, recall from the properties of mutually independent random variables that E[R (i) ] = E = E[R j ] = E[R] = ie[r] R j and Var[R (i) ] = Var = R j Var[R j ] = Var[R] = ivar[r]. 2

Now, by the Law of Total Expectation, E[M] = E G [E[M G]] = E G 2E[R], G = 1 ne[r], G = n 1 = [1 p G (1) p G (2) p G (n 1)]E[R] + [2p G (1)E[R] + 3p G (2)E[R] + + np G (n 1)]E[R] = [1 + p G (1) + 2p G (2) + + (n 1)p G (n 1)]E[R] = 1 + gp G (g) E[R] g G where G is the support of G. = (1 + E[G])E[R] (3) Using the Law of Total Variance, we can also obtain Var[M] = E G [Var[M G]] + Var G [E[M G]] Var[R], G = 0 = E G 2Var[R], G = 1 + Var G nvar[r], G = n 1 2E[R], G = 1 ne[r], G = n 1 = [1 + p G (1) + 2p G (2) + + (n 1)p G (n 1)]Var[R] + 2 E G 2E[R], G = 1 E 2E[R], G = 1 G ne[r], G = n 1 ne[r], G = n 1 = (1 + E[G])Var[R] + E G 4E[R], G = 1 n 2 E[R], G = n 1 (1 + p G (1) + + (n 1)p G (n 1)) 2 E[R] 2 = (1 + E[G])Var[R] + [1 + 3p G (1) + 8p G (2) + + (n 2) 2 p G (n 1)]E[R] 2 (1 + E[G]) 2 E[R] 2 = (1 + E[G])Var[R] + 1 + 2gp G (g) + g 2 p G (g) E[R] 2 g G g G (1 + 2E[G] + E[G] 2 )E[R] 2 = (1 + E[G])Var[R] + [1 + 2E[G] + E[G 2 ] (1 + 2E[G] + E[G] 2 )]E[R] 2 = (1 + E[G])Var[R] + (E[G 2 ] E[G] 2 )E[R] 2 = (1 + E[G])Var[R] + Var[G]E[R] 2. (4) 2 3

Substituting (3) and (4) into the definition of the Fano Factor, we obtain Var[M] E[M] = (1 + E[G])Var[R] + Var[G]E[R]2 (1 + E[G])E[R] = Var[R] E[R] + Var[G] 1 + E[G] E[R] which is equivalent to (2), our n-gene copy equation. Note that letting G Bernoulli(f) in (2) allows us to exactly reproduce (1). 4 Properties of the Model We can run a number of checks to bolster our confidence in the model. First, fluctuations in copy number at low copy numbers should have a larger impact on the noise than fluctuations at high copy numbers a change from 1 to 2 plasmids is a 100 percent increase in gene copy, whereas a change from 100 to 101 is only a 1 percent increase. If we set G Discrete Uniform(a, b) for simplicity, we find that plasmids which are allowed to fluctuate between 0 and 10 extra copies confer a much higher value to the noise strength (Figure 1) than plasmids which are allowed to fluctuate between 10 and 20 extra copies (Figure 2). This dampening is consistent across a variety of common distribution choices. Figure 1: Fano Factor contributions from plasmid fluctuations for various distributions of the family G Discrete Uniform(a, b), ranging through a, b [0, 15]. E[R] = 10. 4

Figure 2: Fano Factor contributions from plasmid fluctuations for various distributions of the family G Discrete Uniform(a, b), ranging through a, b [10, 25]. E[R] = 10. We can also examine the extent to which the choice of the distribution of G affects the noise conferred by the plasmid fluctuations. If we set G Bernoulli(f), we see for various values of E[R] that the largest contribution to noise strength occurs near f 0.5, ie. when fluctuations in gene copy number happen more frequently (Figure 3). 5

Figure 3: Fano Factor contributions from plasmid fluctuations for various distributions of the family G Bernoulli(f), ranging through f [0, 1]. Values of E[R] chosen to correspond with Figure S6 of [4]. Generalizing the distribution of G, however, we find that this effect persists if G is normally distributed so that the average plasmid copy number is 200, the Fano Factor contribution increases monotincally as the distribution widens. If we allow σ = 49 so that the any values in the published 100-300 copies-percell range fall within 2 standard deviations of the mean, the additional noise strength is so high that it questions the validity of the assumption (Figure 4). 6

Figure 4: Fano Factor contributions from plasmid fluctuations for various distributions of the family G N(199, σ), ranging through σ [1, 50]. E[R] values chosen to correspond to Figure 3. Black diamonds correspond to distributions in Figure 5. Figure 5: A portion of the distributions corresponding to black diamonds in Figure 4. The σ = 5 distribution of G induces much less noise than the σ = 49 distribution. 7

5 Conclusions From our model it is clear that the additive effect of gene copy number fluctuations on noise strength that Jones et al observed for up to 2 gene copies persists for any number of possible gene copy numbers. However, we can see that if we want to analyze the magnitude of this additive term for a given mean transcription level, it is sufficient, but also necessary, to know the first two moments of the distribution of G. Although the distributions of steady-state copy number are considered to be narrow Gaussian around the mean in most wild-type plasmids [7], the distributions for the psb1x3-type plasmids are not well-described. While we felt it would be relatively simple to disentangle the effect of signal damping on the noise values we would obtain from our measurements, the confounding factors arising from the fluctuations of the plasmids themselves would require an explicit input of the distribution of G into our model. Because our model revealed that the choice of distribution so heavily determines the strength of the additional noise, we decided that we could not justifiably assume any distribution for G without additional information which was beyond the scope of our project. To ensure the validity of our experiment s results, we chose not to proceed with the plasmid-based reporter system. References [1] https://en.wikipedia.org/wiki/signal_averaging [2] Elowitz, M.B., Levine, A.J., Siggia, E.D., and Swian, P.S. Stochastic Gene Expression in a Single Cell. (2002) Science 297, 1183-1186. [3] Swain, P.S., Elowitz, M.B., and Siggia, E.D. Intrinsic and extrinsic contributions to stochasticity in gene expression. (2002) PNAS 99, 12795-12800. [4] Jones, D.L., Brewster, R.C., and Phillips, R. Promoter architecture dictates cell-to-cell variability in gene expression. (2014) Science 346, 1533-1536. [5] Arriaga, E.A. Determining biological noise via single cell analysis. (2009) Anal Bioanal Chem 393, 73-80. [6] http://parts.igem.org/part:psb1c3 [7] del Solar, G. and Espinosa, M. Plasmid copy number control: an evergrowing story. (2000) Molecular Microbiology 37, 492-500. 8