State-Feedback Control of Partially-Observed Boolean Dynamical Systems Using RNA-Seq Time Series Data


Mahdi Imani and Ulisses Braga-Neto
Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA
Email: m.imani88@tamu.edu, ulisses@ece.tamu.edu

Abstract: External control of a gene regulatory network is used for the purpose of avoiding undesirable states, such as those associated with disease. This paper proposes a strategy for state-feedback infinite-horizon control of Partially-Observed Boolean Dynamical Systems (POBDS) using a single time series of Next-Generation Sequencing (NGS) RNA-seq data. A separation principle is assumed, whereby the optimal stationary policy is first obtained offline by solving Bellman's equation, and an optimal MMSE observer, the Boolean Kalman Filter, is then employed for online implementation of the policy using the RNA-seq observations of the evolving system. Performance is investigated using a Boolean network model of the mutated mammalian cell cycle and simulated RNA-seq observations.

Index Terms: State-Feedback Control, Boolean Dynamical Systems, Boolean Kalman Filter, Next-Generation Sequencing.

I. INTRODUCTION

A fundamental problem in genomic signal processing is to design intervention strategies for gene regulatory networks that beneficially alter network dynamics. This task is usually undertaken to reduce the steady-state mass of undesirable states, such as cell-proliferation states, which may be associated with cancer [1]. Boolean networks [2] have emerged as an effective model of the dynamical behavior of gene regulatory networks: genes are in activated or inactivated states, and the relationships among them are governed by logical functions updated at discrete time intervals [3], [4]. To date, different modeling approaches, such as Probabilistic Boolean Networks (PBNs) [3], S-systems [5], and Bayesian networks [6], have been proposed in the literature to mathematically capture the behavior of genetic regulatory networks. In addition, various intervention approaches [1], [7]-[9] have been developed. These all assume that the system states are directly measurable, and most are formulated in the framework of PBNs.

In this paper, we consider a signal model with distinct state and observation processes; thus, the transcriptional state of the genes is not assumed to be observable directly, but only through indirect RNA-seq measurements [4], [10]-[12]. First, we apply the theory of infinite-horizon optimal stochastic control to find the stationary policy that optimally controls the evolution of a Boolean dynamical system. Next, we apply the optimal MMSE observer for the proposed signal model, known as the Boolean Kalman Filter [4], to estimate the state of the system, to which the previously obtained control policy is applied. Performance is investigated using a Boolean network model of the mutated mammalian cell cycle and simulated RNA-seq observations.

II. PARTIALLY-OBSERVED BOOLEAN DYNAMICAL SYSTEMS

Deterministic Boolean network models are unable to cope with (1) uncertainty in state transitions due to system noise and the effect of unmodeled variables, and (2) the fact that the Boolean states of a system are never observed directly; this calls for a stochastic approach. We describe below the stochastic signal model for partially-observed Boolean dynamical systems, first proposed in [4], which will be employed here.
A. State Model

We assume that the system is described by a state process $\{\mathbf{X}_k;\, k = 0, 1, \ldots\}$, where $\mathbf{X}_k \in \{0,1\}^d$ is a Boolean vector of size $d$; in the case of a gene regulatory network, the components of $\mathbf{X}_k$ represent the activation/inactivation state of the genes at time $k$. The state is affected by a sequence of control inputs $\{\mathbf{u}_k;\, k = 0, 1, \ldots\}$, where $\mathbf{u}_k \in U$ represents a purposeful intervention into the system state; in the biological example, this might model drug applications. The state evolution is thus specified by the following discrete-time nonlinear signal model:

$$\mathbf{X}_k = \mathbf{f}(\mathbf{X}_{k-1}, \mathbf{u}_{k-1}) \oplus \mathbf{n}_k \qquad (1)$$

for $k = 1, 2, \ldots$, where $\mathbf{f}: \{0,1\}^d \times U \rightarrow \{0,1\}^d$ is a network function, $\{\mathbf{n}_k;\, k = 1, 2, \ldots\}$ is a white state noise process with $\mathbf{n}_k \in \{0,1\}^d$, and $\oplus$ indicates component-wise modulo-2 addition. The noise is white in the sense that it is uncorrelated, i.e., $\mathbf{n}_k$ and $\mathbf{n}_l$ are uncorrelated for $k \neq l$. In addition, the noise process is assumed to be uncorrelated with the state process and the control input.
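As an illustration, here is a minimal Python sketch of one step of the state model (1) under independent Bernoulli perturbation noise; the toy network function, gene wiring, and parameter values are placeholders for illustration only, not the paper's implementation.

```python
import numpy as np

def state_update(x, u, f, p, rng):
    """One step of Eq. (1): X_k = f(X_{k-1}, u_{k-1}) XOR n_k.

    x   : Boolean state vector of length d
    u   : control input passed to the network function
    f   : network function mapping (x, u) -> next Boolean state
    p   : Bernoulli intensity of the white perturbation noise n_k
    rng : numpy random generator
    """
    n_k = rng.random(x.size) < p                 # white Bernoulli(p) noise
    return np.logical_xor(f(x, u), n_k).astype(int)

# Toy 2-gene network: gene 1 represses gene 0; gene 0 activates gene 1.
# The control input u = 1 flips the control gene (gene 0) before the logic.
def toy_network(x, u):
    x = x.copy()
    if u == 1:
        x[0] ^= 1                                # intervention: flip control gene
    return np.array([1 - x[1], x[0]])

rng = np.random.default_rng(0)
x = np.array([1, 0])
x = state_update(x, u=0, f=toy_network, p=0.01, rng=rng)
```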

B. Observation Model

In most real-world applications, the system state is only partially observable, and distortion is introduced into the observations by sensor noise. Let $\{\mathbf{Y}_k;\, k = 1, 2, \ldots\}$ be the observation process, which is the actual data obtained from the experiment. The observation $\mathbf{Y}_k$ is formed from the state $\mathbf{X}_k$ through the equation:

$$\mathbf{Y}_k = \mathbf{h}(\mathbf{X}_k, \mathbf{v}_k) \qquad (2)$$

for $k = 1, 2, \ldots$, where $\{\mathbf{v}_k;\, k = 1, 2, \ldots\}$ is a white observation noise process, which is uncorrelated with the state process. We will consider here the specific case of RNA-seq transcriptomic data, in which case $\mathbf{Y}_k = (Y_{k1}, \ldots, Y_{kd})$ contains the RNA-seq data at time $k$; for a single-lane NGS platform, $Y_{kj}$ is the read count corresponding to transcript $j$ in the single lane, for $j = 1, \ldots, d$. In this paper, we choose to use a Negative Binomial model for the number of reads for each transcript:

$$P(Y_{kj} = m \mid X_{kj}) = \frac{\Gamma(m + \phi_j)}{m!\,\Gamma(\phi_j)} \left(\frac{\lambda_{kj}}{\lambda_{kj} + \phi_j}\right)^{m} \left(\frac{\phi_j}{\lambda_{kj} + \phi_j}\right)^{\phi_j} \qquad (3)$$

for $m = 0, 1, \ldots$, where $\Gamma$ denotes the Gamma function, and $\phi_j, \lambda_{kj} > 0$ are the real-valued inverse dispersion parameter and mean read count of transcript $j$ at time $k$, respectively, for $j = 1, \ldots, d$. The inverse dispersion parameter models observation noise: the smaller it is, the more variable the measurements are. Now recall that, according to the Boolean state model, there are two possible states for the abundance of transcript $j$: high, if $X_{kj} = 1$, and low, if $X_{kj} = 0$. Accordingly, we model the parameter $\lambda_{kj}$ in log space as:

$$\log \lambda_{kj} = \log s + \mu + \delta_j X_{kj} \qquad (4)$$

where the parameter $s$ is the sequencing depth (which is instrument-dependent), $\mu > 0$ is the baseline level of expression in the inactivated transcriptional state, and $\delta_j > 0$ expresses the effect on the observed RNA-seq read count as gene $j$ goes from the inactivated to the activated state, for $j = 1, \ldots, d$. Typical values for all these parameters are given in Section VI.
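To make the observation model concrete, here is a minimal sketch of sampling read counts from (3)-(4) with numpy; the conversion below maps the mean/inverse-dispersion form into numpy's $(n, p)$ convention for the Negative Binomial, and the parameter values are taken from Section VI purely for illustration (the state vector is a toy example).

```python
import numpy as np

def sample_rnaseq(x, s, mu, delta, phi, rng):
    """Sample RNA-seq read counts from Eqs. (3)-(4) given Boolean state x.

    Mean counts: lambda_j = s * exp(mu + delta_j * x_j)   (Eq. 4).
    numpy's negative_binomial(n, p) with n = phi_j and success
    probability p = phi_j / (phi_j + lambda_j) reproduces the PMF (3).
    """
    lam = s * np.exp(mu + delta * x)
    p = phi / (phi + lam)
    return rng.negative_binomial(phi, p)

rng = np.random.default_rng(1)
x = np.array([1, 0, 1])                       # toy Boolean state, d = 3
y = sample_rnaseq(x, s=2.875, mu=0.01,        # values as in Section VI
                  delta=np.array([2.0, 2.0, 2.0]),
                  phi=np.array([5.0, 5.0, 5.0]), rng=rng)
```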
III. BOOLEAN KALMAN FILTER

The optimal filtering problem consists of, given the history of observations $\mathbf{Y}_1, \ldots, \mathbf{Y}_k$ up to the present time $k$, finding an estimator $\hat{\mathbf{X}}_k(\mathbf{Y}_1, \ldots, \mathbf{Y}_k)$ of the state $\mathbf{X}_k$ that optimizes a given performance criterion. The criterion considered here is the (conditional) mean square error:

$$\mathrm{MSE}(\mathbf{Y}_1, \ldots, \mathbf{Y}_k) = E\left[\|\hat{\mathbf{X}}_k - \mathbf{X}_k\|^2 \mid \mathbf{Y}_k, \ldots, \mathbf{Y}_1\right] \qquad (5)$$

The minimum mean-square error (MMSE) state estimator for the model described in the previous sections is the Boolean Kalman Filter (BKF). A recursive algorithm for the exact computation of the BKF for a general signal model was given in [4]. Briefly, let $(\mathbf{x}^1, \ldots, \mathbf{x}^{2^d})$ be an arbitrary enumeration of the possible state vectors. For each time $k = 1, 2, \ldots$, define the posterior distribution vectors (PDV) $\Pi_{k|k}$ and $\Pi_{k|k-1}$, of length $2^d$, by

$$(\Pi_{k|k})_i = P(\mathbf{X}_k = \mathbf{x}^i \mid \mathbf{Y}_k, \ldots, \mathbf{Y}_1) \qquad (6)$$

$$(\Pi_{k|k-1})_i = P(\mathbf{X}_k = \mathbf{x}^i \mid \mathbf{Y}_{k-1}, \ldots, \mathbf{Y}_1) \qquad (7)$$

for $i = 1, \ldots, 2^d$. Let the prediction matrix $M_k$, of size $2^d \times 2^d$, be the transition matrix of the controlled Markov chain corresponding to the state process:

$$(M_k)_{ij} = P(\mathbf{X}_k = \mathbf{x}^i \mid \mathbf{X}_{k-1} = \mathbf{x}^j, \mathbf{u}_{k-1} = \mathbf{u}) = P(\mathbf{n}_k = \mathbf{x}^i \oplus \mathbf{f}(\mathbf{x}^j, \mathbf{u})) \qquad (8)$$

for $i, j = 1, \ldots, 2^d$. On the other hand, assuming conditional independence among the measurements given the state, the update matrix $T_k$, also of size $2^d \times 2^d$, is a diagonal matrix defined by:

$$(T_k(\mathbf{Y}_k))_{ii} = P(\mathbf{Y}_k \mid \mathbf{X}_k = \mathbf{x}^i) = \prod_{j=1}^{d} \frac{\Gamma(Y_{kj} + \phi_j)}{Y_{kj}!\,\Gamma(\phi_j)} \left(\frac{s\exp(\mu + \delta_j (\mathbf{x}^i)_j)}{s\exp(\mu + \delta_j (\mathbf{x}^i)_j) + \phi_j}\right)^{Y_{kj}} \left(\frac{\phi_j}{s\exp(\mu + \delta_j (\mathbf{x}^i)_j) + \phi_j}\right)^{\phi_j} \qquad (9)$$

for $i = 1, \ldots, 2^d$.
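A minimal sketch of how the prediction and update matrices (8)-(9) might be assembled, assuming independent Bernoulli($p$) perturbation noise and the Negative Binomial likelihood of Section II-B; the state enumeration order and all names are illustrative choices, not prescribed by the paper.

```python
import numpy as np
from itertools import product
from scipy.stats import nbinom

def enumerate_states(d):
    """All 2^d Boolean states; column i is the state vector x^i."""
    return np.array(list(product([0, 1], repeat=d))).T   # shape (d, 2^d)

def prediction_matrix(f, u, p, d):
    """Eq. (8): (M)_ij = P(n_k = x^i XOR f(x^j, u)), Bernoulli(p) noise."""
    X = enumerate_states(d)
    M = np.empty((2**d, 2**d))
    for j in range(2**d):
        fx = f(X[:, j], u)
        for i in range(2**d):
            flips = np.logical_xor(X[:, i], fx).sum()    # bits flipped by noise
            M[i, j] = p**flips * (1 - p)**(d - flips)
    return M

def update_vector(y, s, mu, delta, phi, d):
    """Diagonal of Eq. (9): likelihood of read counts y under each state."""
    X = enumerate_states(d)
    lam = s * np.exp(mu + delta[:, None] * X)            # (d, 2^d) mean counts
    pr = phi[:, None] / (phi[:, None] + lam)             # NB success probability
    return np.prod(nbinom.pmf(y[:, None], phi[:, None], pr), axis=0)
```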

For a vector $\mathbf{v} \in [0,1]^d$, define the binarized vector $\bar{\mathbf{v}}$ by $\bar{v}(i) = I_{v(i) > 1/2}$, the complement vector $\mathbf{v}^c$ by $v^c(i) = 1 - v(i)$, for $i = 1, \ldots, d$, and the $L_1$-norm $\|\mathbf{v}\|_1 = \sum_{i=1}^{d} |v(i)|$. Finally, define the matrix $A$, of size $d \times 2^d$, via $A = [\mathbf{x}^1 \cdots \mathbf{x}^{2^d}]$. Then it can be shown that the optimal MMSE estimator $\hat{\mathbf{X}}_k$ can be computed by Algorithm 1.

Algorithm 1: Boolean Kalman Filter

1: Initialization Step: The initial PDV is given by $(\Pi_{0|0})_i = P(\mathbf{X}_0 = \mathbf{x}^i)$, for $i = 1, \ldots, 2^d$. For $k = 1, 2, \ldots$, do:

2: Prediction Step: Given the previous PDV $\Pi_{k-1|k-1}$, the predicted PDV is given by $\Pi_{k|k-1} = M_k\, \Pi_{k-1|k-1}$.

3: Update Step: Given the current observation $\mathbf{Y}_k = \mathbf{y}_k$, let $\beta_k = T_k(\mathbf{y}_k)\, \Pi_{k|k-1}$. The updated PDV $\Pi_{k|k}$ is obtained by normalizing $\beta_k$ to obtain a probability measure: $\Pi_{k|k} = \beta_k / \|\beta_k\|_1$.

4: MMSE Estimator Computation Step: The MMSE estimator is given by $\hat{\mathbf{X}}_k = \overline{A\, \Pi_{k|k}}$, with optimal conditional MSE $\mathrm{MSE}(\mathbf{Y}_1, \ldots, \mathbf{Y}_k) = \|\min\{A\, \Pi_{k|k},\, (A\, \Pi_{k|k})^c\}\|_1$.
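A compact Python sketch of one cycle of Algorithm 1, reusing the matrix constructors sketched above; function and variable names are illustrative.

```python
import numpy as np

def bkf_step(pdv, M, t_diag, A):
    """One prediction/update/estimation cycle of Algorithm 1.

    pdv    : posterior distribution vector Pi_{k-1|k-1}, length 2^d
    M      : prediction matrix (Eq. 8) for the control input applied at k-1
    t_diag : diagonal of the update matrix T_k(y_k) (Eq. 9)
    A      : d x 2^d matrix whose columns are the enumerated states
    """
    pred = M @ pdv                        # prediction step: Pi_{k|k-1}
    beta = t_diag * pred                  # update step (unnormalized)
    post = beta / beta.sum()              # Pi_{k|k}
    soft = A @ post                       # posterior activation probabilities
    x_hat = (soft > 0.5).astype(int)      # binarized MMSE estimate
    mse = np.minimum(soft, 1 - soft).sum()
    return post, x_hat, mse
```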

IV. DISCOUNTED-COST INFINITE-HORIZON CONTROL

In this section, the infinite-horizon control problem when the states of the system are directly observed is formulated. We will state the problem and give its solution without proof; for more details, the reader is referred to [7], [13]. The goal of control is to select the appropriate external input $\mathbf{u}^{\mathrm{ext}}_k \in U_{\mathrm{ext}}$ at each time $k$ to make the network spend the least amount of time, on average, in undesirable states, e.g., states associated with cell proliferation, which may be associated with cancer [1]. In formal terms, assuming a bounded cost of control $g(\mathbf{X}_k, \mathbf{u}^{\mathrm{ext}}_k)$, which is a function of the state and the intervention at time $k$, our goal is to find a stationary control policy $\mu: \{0,1\}^d \rightarrow U_{\mathrm{ext}}$ that minimizes the infinite-horizon cost (for a given initial state $\mathbf{X}_0 = \mathbf{x}^j$):

$$J_\mu(j) = \lim_{m \to \infty} E\left[\sum_{k=0}^{m} \gamma^k\, g(\mathbf{X}_k, \mu(\mathbf{X}_k)) \,\Big|\, \mathbf{X}_0 = \mathbf{x}^j\right] \qquad (10)$$

for $j = 1, \ldots, 2^d$, where $0 < \gamma < 1$ is a discounting factor that ensures that the limit of the finite sums converges as the horizon length $m$ goes to infinity. The discount factor places a premium on minimizing the costs of early interventions as opposed to later ones, which is sensible from a medical perspective [7]. Let $\Pi$ be the set of all admissible control policies. We assume in this paper that the set of external inputs $U_{\mathrm{ext}}$ is finite, so that $\Pi$ is also finite. The optimal infinite-horizon cost is defined by

$$J^*(j) = \min_{\mu \in \Pi} J_\mu(j) \qquad (11)$$

for $j = 1, \ldots, 2^d$, where $J_\mu(j)$ is given by (10). The following theorem, which follows from classical results proved in [13], provides necessary and sufficient conditions for optimality of a stationary control policy. Here we assume that the transition matrix $M_k$ of the controlled Markov chain defined in (8) may depend on $k$ only through the input; the dependency on the input is made explicit by writing $M(\mathbf{u})$, while dropping the index $k$.

Theorem 1. (Bellman's Equation) The optimal infinite-horizon cost $J^*$ satisfies:

$$J^*(j) = \min_{\mathbf{u} \in U_{\mathrm{ext}}} \left[ g(\mathbf{x}^j, \mathbf{u}) + \gamma \sum_{i=1}^{2^d} (M(\mathbf{u}))_{ij}\, J^*(i) \right] \qquad (12)$$

for $j = 1, \ldots, 2^d$. Equivalently, $J^*$ is the unique fixed point of the functional $T[J]$, defined by

$$T[J](j) = \min_{\mathbf{u} \in U_{\mathrm{ext}}} \left[ g(\mathbf{x}^j, \mathbf{u}) + \gamma \sum_{i=1}^{2^d} (M(\mathbf{u}))_{ij}\, J(i) \right] \qquad (13)$$

for $j = 1, \ldots, 2^d$, where $J: \{0,1\}^d \rightarrow \mathbb{R}$ is an arbitrary cost function. Furthermore, a stationary policy $\mu^*$ is optimal if and only if it attains the minimum in Bellman's equation (12) for each initial state $\mathbf{x}^j$, that is,

$$T[J^*](j) = \min_{\mathbf{u} \in U_{\mathrm{ext}}} \left[ g(\mathbf{x}^j, \mathbf{u}) + \gamma \sum_{i=1}^{2^d} (M(\mathbf{u}))_{ij}\, J^*(i) \right] = g(\mathbf{x}^j, \mu^*(\mathbf{x}^j)) + \gamma \sum_{i=1}^{2^d} (M(\mu^*(\mathbf{x}^j)))_{ij}\, J^*(i) \qquad (14)$$

for $j = 1, \ldots, 2^d$.

The previous theorem suggests a procedure to determine the optimal stationary control policy. Starting with an arbitrary initial cost function $J_0: \{0,1\}^d \rightarrow \mathbb{R}$, run the iteration

$$J_t = T[J_{t-1}] \qquad (15)$$

until a fixed point is obtained; it can be shown that the iteration will indeed converge to a fixed point [13]. This fixed point is the optimal cost $J^*$, and the corresponding policy defined by (14) is the optimal stationary control policy. In practice, a stopping criterion has to be defined to stop the iteration when the change between consecutive $J_{t-1}$ and $J_t$ is small enough. The procedure is summarized in Algorithm 2; a code sketch of this iteration is given at the end of this section's setup below.

Algorithm 2: Optimal Infinite-Horizon Control

1: Initialization Step: Initialize $\tilde{J}(j)$ arbitrarily, e.g., $\tilde{J}(j) = \max_{i,\mathbf{u}}\, g(\mathbf{x}^i, \mathbf{u})$, for $j = 1, \ldots, 2^d$. Choose a small number $\beta$ and discounting factor $\gamma$.

2: Let $J(j) = \tilde{J}(j)$, for $j = 1, \ldots, 2^d$.

3: Let $\tilde{J}(j) = \min_{\mathbf{u} \in U_{\mathrm{ext}}} \left[ g(\mathbf{x}^j, \mathbf{u}) + \gamma \sum_{i=1}^{2^d} (M(\mathbf{u}))_{ij}\, J(i) \right]$, for $j = 1, \ldots, 2^d$.

4: Compute $\Delta = \max_{j=1,\ldots,2^d} |\tilde{J}(j) - J(j)|$.

5: If $\Delta > \beta$, go to step 2.

6: The optimal stationary control policy is $\mu^*(\mathbf{x}^j) = \arg\min_{\mathbf{u} \in U_{\mathrm{ext}}} \left[ g(\mathbf{x}^j, \mathbf{u}) + \gamma \sum_{i=1}^{2^d} (M(\mathbf{u}))_{ij}\, \tilde{J}(i) \right]$, for $j = 1, \ldots, 2^d$.

V. OPTIMAL INTERVENTION STRATEGY

To apply the optimal stationary policy, the states of the system must be known at each time step. However, in most real applications, the states of the system are not directly observable. Therefore, we propose to use the Boolean Kalman Filter to estimate the states for application of the policy. While the optimal policy is determined offline, the BKF is applied online, and the control input dictated by the optimal control policy is applied to the state estimate at each time $k$ (the control policy is therefore suboptimal on the space of observed data). The proposed approach is illustrated in Figure 1.

Fig. 1: The proposed scheme for intervention applied to data observed through noise (closed loop of System, BKF, and Optimal Policy).

VI. NUMERICAL EXPERIMENT

In this section, we present results of a numerical experiment using a Boolean network based on the mutated mammalian cell-cycle network. Mammalian cell division is coordinated with the overall growth of the organism through extracellular signals that control the activation of CycD in the cell. The mutation considered here introduces a situation in which both CycD and Rb may be inactive, in which case the cell cycles (i.e., divides) in the absence of any growth factor [14]. This suggests considering the logical states in which both Rb and CycD are downregulated as undesirable proliferative states. Table I presents the Boolean functions of the mutated cell-cycle network.

TABLE I: Mutated Boolean functions of the mammalian cell cycle.

Gene    | Predictor
CycD    | Input
Rb      | ¬CycD ∧ ¬CycE ∧ ¬CycA ∧ ¬CycB
E2F     | ¬Rb ∧ ¬CycA ∧ ¬CycB
CycE    | E2F ∧ ¬Rb
CycA    | (E2F ∧ ¬Rb ∧ ¬Cdc20 ∧ ¬(Cdh1 ∧ UbcH10)) ∨ (CycA ∧ ¬Rb ∧ ¬Cdc20 ∧ ¬(Cdh1 ∧ UbcH10))
Cdc20   | CycB
Cdh1    | (¬CycA ∧ ¬CycB) ∨ Cdc20
UbcH10  | ¬Cdh1 ∨ (Cdh1 ∧ UbcH10 ∧ (Cdc20 ∨ CycA ∨ CycB))
CycB    | ¬Cdc20 ∧ ¬Cdh1

The control input consists of flipping or not flipping the state of a control gene. The cost function is defined as follows:

$$g(\mathbf{x}^j, \mathbf{u}) = \begin{cases} 5 + O(\mathbf{u}), & \text{if } (\mathrm{CycD}, \mathrm{Rb}) = (0, 0) \\ O(\mathbf{u}), & \text{if } (\mathrm{CycD}, \mathrm{Rb}) \neq (0, 0) \end{cases}$$

where $O(\mathbf{u})$ is 0 if the state of the control gene is not flipped, and 1 if it is flipped. The process noise is assumed to have independent components distributed as Bernoulli($p$) with intensity $p = 0.01$, so that all genes are perturbed with a small probability, except the transducer gene CycD, which is perturbed by noise with intensity $p = 0.5$, to simulate the presence or absence of extracellular signals sensed by this gene.
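Given the transition matrices $M(\mathbf{u})$ from Eq. (8) and a cost array for $g$ as just defined, a minimal sketch of the value iteration in Algorithm 2 follows; the names and stopping tolerance are illustrative.

```python
import numpy as np

def value_iteration(M_list, g, gamma=0.95, beta=1e-6):
    """Algorithm 2: value iteration over the 2^d Boolean states.

    M_list : list of 2^d x 2^d transition matrices, one per control input u,
             with (M)_ij = P(next state = x^i | current state = x^j, u)
    g      : cost array of shape (2^d, n_controls), g[j, u] = g(x^j, u)
    Returns the optimal cost J* and the stationary policy mu*(x^j).
    """
    n = g.shape[0]
    J = np.full(n, g.max())                      # arbitrary initialization
    while True:
        # Q[j, u] = g(x^j, u) + gamma * sum_i M(u)_ij * J(i)
        Q = np.stack([g[:, u] + gamma * (M.T @ J)
                      for u, M in enumerate(M_list)], axis=1)
        J_new = Q.min(axis=1)
        if np.max(np.abs(J_new - J)) <= beta:    # stopping criterion
            return J_new, Q.argmin(axis=1)       # J*, mu*
        J = J_new
```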

The parameters of the RNA-seq model are assumed as follows: the sequencing depth $s$ is equal to 2.875, which corresponds to 50K-100K reads; the baseline expression $\mu$ is set to 0.01; the parameter $\delta_i$ is assumed to be 2 for all transcripts ($i = 1, \ldots, 8$); and, finally, two values of the inverse dispersion parameter, $\phi = 0.5$ and $\phi = 5$, are used to model highly and lowly dispersed measurements, respectively.

The simulation proceeds as follows: first, the optimal stationary policy is found by Algorithm 2, with threshold parameter $\beta = 10^{-6}$ and $\gamma = 0.95$. Then an initial state and corresponding observed RNA-seq measurement are generated. Based on the state estimated by the BKF and the optimal stationary policy obtained initially, the appropriate control input is applied to the system. Based on the current intervention, the next data point is generated. The process continues in an online, closed-loop fashion (see the code sketch at the end of this section).

The sum of the state values of the CycD and Rb genes for the system under control of the Rb gene and without control is presented in Figure 2 over 80 time steps. It can be seen that the simultaneous inactivation of both CycD and Rb occurred several times in the uncontrolled system; however, this undesirable condition is visited less often in the system under control of the Rb gene. Furthermore, better control can be seen in the presence of lowly dispersed data. The reason is that state estimation performance is worse for highly dispersed data, leading to lower control performance as well.

Fig. 2: Sum of state values of CycD and Rb genes over 80 time steps for the system under control of the Rb gene and without control: (a) low-dispersion data ($\phi = 5$); (b) high-dispersion data ($\phi = 0.5$).

Next, the results of the proposed method under control of the Rb gene, observed through simulated RNA-seq data, are compared with the results of the value iteration method with directly observed states. The fractions of observed desirable states over 5000 time steps, for data with high and low dispersion and different process noise, are shown in Figure 3. It should be noted that, since the stationary policy depends on the process noise, a different policy is obtained for each value of the process noise. As expected, the value iteration method with directly observed states has higher performance than the system with partially-observed states. In addition, the fractions of observed desirable states for systems observed through low-dispersion data are very close to those of the value iteration method; however, with high-dispersion data, the accuracy of state estimation by the BKF is smaller, which degrades control performance.

Fig. 3: Fraction of observed desirable states for the system with directly observed states (VI), the proposed method with observations through RNA-seq data (VI-BKF), and the system without control (low dispersion: $\phi = 5$; high dispersion: $\phi = 0.5$).

The fraction of observed desirable states using different genes as control input over 5000 time steps is shown in Figure 4. It is clear that the Rb gene is the best control gene for this system.
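A minimal sketch of the closed-loop simulation just described, wiring together the pieces sketched earlier (state_update, sample_rnaseq, update_vector, bkf_step, and the policy returned by value_iteration); all names come from those earlier sketches and are illustrative assumptions, not the paper's code.

```python
import numpy as np

def closed_loop(f, M_list, policy, A, obs_params, T, d, p, rng):
    """Online closed-loop intervention: BKF estimate -> policy -> system.

    policy     : array mapping state index j -> control input u (Algorithm 2)
    A          : d x 2^d matrix of enumerated states (columns x^1..x^{2^d})
    obs_params : (s, mu, delta, phi) of the RNA-seq model
    """
    s, mu, delta, phi = obs_params
    x = rng.integers(0, 2, size=d)             # random initial (hidden) state
    pdv = np.full(2**d, 1.0 / 2**d)            # uniform initial PDV
    u = 0                                      # start with no intervention
    trajectory = []
    for k in range(T):
        x = state_update(x, u, f, p, rng)      # hidden state evolves under u
        y = sample_rnaseq(x, s, mu, delta, phi, rng)   # RNA-seq observation
        t_diag = update_vector(y, s, mu, delta, phi, d)
        pdv, x_hat, _ = bkf_step(pdv, M_list[u], t_diag, A)
        j_hat = int(np.dot(2**np.arange(d)[::-1], x_hat))  # index of estimate
        u = int(policy[j_hat])                 # control dictated by the policy
        trajectory.append(x.copy())
    return np.array(trajectory)
```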

Fig. 4: Fraction of observed desirable states using different control genes (CycD, Rb, E2F, CycE, CycA, Cdc20, Cdh1, UbcH10, CycB, and no control), for process noise $p = 0.01$ and $\phi_j = 5$, for all $j$.

VII. CONCLUSION

In this paper, an intervention strategy for partially-observed Boolean dynamical systems was presented. The method consists of two main steps: in the first step, the optimal control policy for the system with directly observed states is obtained; in the second step, the obtained control policy is applied to the system based on the states estimated by the Boolean Kalman Filter. The performance of the method was evaluated using a Boolean network model of the mutated mammalian cell cycle and simulated RNA-seq observations.

ACKNOWLEDGMENT

The authors acknowledge the support of the National Science Foundation through NSF award CCF-1320884.

REFERENCES

[1] A. Datta, A. Choudhary, M. L. Bittner, and E. R. Dougherty, "External control in Markovian genetic regulatory networks," Machine Learning, vol. 52, no. 1-2, pp. 169-191, 2003.
[2] S. A. Kauffman, "Metabolic stability and epigenesis in randomly constructed genetic nets," Journal of Theoretical Biology, vol. 22, no. 3, pp. 437-467, 1969.
[3] I. Shmulevich, E. R. Dougherty, and W. Zhang, "From Boolean to probabilistic Boolean networks as models of genetic regulatory networks," Proceedings of the IEEE, vol. 90, no. 11, pp. 1778-1792, 2002.
[4] U. Braga-Neto, "Optimal state estimation for Boolean dynamical systems," in 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp. 1050-1054, IEEE, 2011.
[5] E. O. Voit and J. Almeida, "Decoupling dynamical systems for pathway identification from metabolic profiles," Bioinformatics, vol. 20, no. 11, pp. 1670-1681, 2004.
[6] N. Friedman, M. Linial, I. Nachman, and D. Pe'er, "Using Bayesian networks to analyze expression data," Journal of Computational Biology, vol. 7, no. 3-4, pp. 601-620, 2000.
[7] R. Pal, A. Datta, and E. R. Dougherty, "Optimal infinite-horizon control for probabilistic Boolean networks," IEEE Transactions on Signal Processing, vol. 54, no. 6, pp. 2375-2387, 2006.
[8] Á. Halász, M. S. Sakar, H. Rubin, V. Kumar, G. J. Pappas, et al., "Stochastic modeling and control of biological systems: the lactose regulation system of Escherichia coli," IEEE Transactions on Automatic Control, vol. 53, no. Special Issue, pp. 51-65, 2008.
[9] G. Vahedi, B. Faryabi, J.-F. Chamberland, A. Datta, and E. R. Dougherty, "Optimal intervention strategies for cyclic therapeutic methods," IEEE Transactions on Biomedical Engineering, vol. 56, no. 2, pp. 281-291, 2009.
[10] M. Imani and U. Braga-Neto, "Optimal state estimation for Boolean dynamical systems using a Boolean Kalman smoother," in 2015 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 972-976, IEEE, 2015.
[11] M. Imani and U. Braga-Neto, "Optimal gene regulatory network inference using the Boolean Kalman filter and multiple model adaptive estimation," in 2015 49th Asilomar Conference on Signals, Systems and Computers, pp. 423-427, IEEE, 2015.
[12] A. Bahadorinejad and U. Braga-Neto, "Optimal fault detection and diagnosis in transcriptional circuits using next-generation sequencing," IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2015.
[13] D. P. Bertsekas, Dynamic Programming and Optimal Control, vol. 1. Athena Scientific, Belmont, MA, 1995.
[14] M. R. Yousefi, A. Datta, and E. R. Dougherty, "Optimal intervention strategies for therapeutic methods with fixed-length duration of drug effectiveness," IEEE Transactions on Signal Processing, vol. 60, no. 9, pp. 4930-4944, 2012.