arxiv: v2 [q-bio.pe] 27 May 2017


Stochastic and Information-thermodynamic Structure in Adaptation

Stochastic and Information-thermodynamic Structures of Population Dynamics in Fluctuating Environment

Tetsuya J. Kobayashi 1,2,a) and Yuki Sughiyama 1
1) Institute of Industrial Science, the University of Tokyo.
2) PRESTO, JST.
(Dated: 30 May 2017)

Adaptation in a fluctuating environment is a process of fueling environmental information to gain fitness. Living systems have gradually developed strategies for adaptation from random and passive diversification of the phenotype to more proactive decision making, in which environmental information is sensed and exploited more actively and effectively. Understanding the fundamental relation between fitness and information is therefore crucial to clarify the limits and universal properties of adaptation. In this work, we elucidate the underlying stochastic and information-thermodynamic structure in this process, by deriving causal fluctuation relations (FRs) of fitness and information. Combined with a duality between phenotypic and environmental dynamics, the FRs reveal the limit of fitness gain, the relation of time reversibility with the achievability of the limit, and the possibility and condition for gaining excess fitness due to environmental fluctuation. The loss of fitness due to causal constraints and the limited capacity of real organisms is shown to be the difference between time-forward and time-backward path probabilities of phenotypic and environmental dynamics. Furthermore, the FRs generalize the concept of evolutionary stable state (ESS) for fluctuating environment by giving the probability that the optimal strategy on average can be invaded by a suboptimal one owing to rare environmental fluctuation. These results clarify the information-thermodynamic structures in adaptation and evolution.
Keywords: Fluctuation theorem; Evolution; Decision making; Bet-hedging; Fitness; Variational structure

a) Electronic mail: tetsuya@mail.crmind.net

I. INTRODUCTION

A. Adaptation in fluctuating environment

Adaptation is fundamental to all organisms for their survival and evolutionary success in a changing environment. In the course of evolution, living systems have gradually attained and developed more active and efficient strategies for adaptation, which generally accompany more effective use of environmental information. Understanding how the use of information is linked to the efficiency of adaptation is crucial to clarify the fundamental limits and universal properties of biological adaptations [1,2].

The most primitive strategy for adaptation is to randomly generate genetic and phenotypic heterogeneity in a population [3–6]. Provided that a sufficiently large heterogeneity is constantly generated in the population, a fraction of organisms can, by chance, have the types adaptive to the upcoming environmental state and circumvent extinction of the population at the cost of others with non-adaptive types [7,8]. Such a strategy is known as bet-hedging or phenotypic diversification and works even if the organisms are completely blind to the environment, without any a priori knowledge of its dynamics. Bet-hedging is a passive and a posteriori adaptation in the sense that the adaptation is achieved extrinsically by and after the impact of environmental selection [9]. The evolutionary advantage of the bet-hedging strategy is demonstrated by the persistence of bacteria, pathogens, and cancer cells under antibiotic or anticancer drug treatments [5,6]. The gain of fitness by bet-hedging can be optimized if the population evolves to generate an appropriate pattern of heterogeneity by learning the environmental statistics [13].

Nevertheless, the gain of fitness by bet-hedging is fundamentally limited because of the passive and a posteriori nature of the strategy, in which the individual organisms have no access to the microscopic information of which environmental states will actually be realized. With any access to such information, the loss can be reduced further by decision making: directly sensing the current environmental state, predicting the upcoming state, and switching into the phenotypic state that is adaptive to that state. The strategy of adaptation via sensing is active and a priori in the sense that adaptation is intrinsically achieved by the predictive actions of the organisms [9]. In biologically relevant situations, both passive and active aspects of adaptation are intermingled because perfect sensing and prediction of the environment are impossible with the limited capacities of biological systems.

B. Notions of information and analogy with physics in biological adaptation

At an analogical level, the problem of the fundamental law and the limits of adaptation and evolution shares several aspects with physics, especially with thermodynamics, which drove the long-lasting attempts to establish the thermodynamics of biological adaptation and evolution. Among other areas, the fundamental limit of fitness in a changing environment and the value of environmental information have been a major focus in evolutionary biology [1,2,7]. Haccou and Iwasa may have been the first to link, albeit implicitly, environmental information with the gain of fitness in a stochastic environment [28]. Bergstrom and Lachmann pursued the fitness value of information by directly incorporating mutual information. Others also pointed out some quantitative relations between fitness and information measures such as relative entropy and Jeffreys divergence. More recently, Rivoire and Leibler conducted a comprehensive analysis by employing an analogy between bet-hedging of organisms and horse race gambling [36], the link of which to information theory was revealed in the seminal work by Kelly.

However, all previous works either imposed certain restrictions on their models to derive the information-theoretic measures of fitness value [32,36,38] or had to introduce phenomenological measures for the value of information to accommodate more general situations [36,38], because they lacked an appropriate method to handle the mixture of the passive and active aspects in adaptation. We recently resolved this problem [9] by combining a path integral formulation of population dynamics [39–42], a retrospective characterization of the selected population [41,43,44], and a variational structure in population dynamics [42,45]. The results we obtained generalized the limits of fitness gain by sensing and revealed that the gain satisfies fluctuation relations (FRs) that fundamentally constrain not only its average but also its fluctuation.
These relations, alongside a previous work in the line of neutral theory [24], imply that fitness in a fluctuating environment shares, at least mathematically, similar structures to those of stochastic and information thermodynamics [46,47].

In our FRs, the fluctuation of fitness of a given population is evaluated by the difference from the fitness that achieves the maximum average fitness over all possible phenotypic histories of organisms. Conceptually, this means that we postulate a Darwinian demon, an imaginary organism that can exhibit any type of behavior, free from constraints not only on biological capacity but also on the causality of dynamics. The FRs characterize the loss of fitness of a realistic organism relative to such an idealized organism. Thus, understanding the properties of the Darwinian demon and the deviation from it by a realistic organism is central to a deeper understanding of the behavior of populations in a changing environment. However, the implicit definition of the demon as the maximizer of the average fitness hampers the explicit characterization of the demon and obscures the formal link to stochastic thermodynamics, in which a variational characterization is not common [46,47]. More practically, without an explicit characterization, we are unable to simulate possible behaviors of the demon even numerically.

C. Outline of main results

In this paper, we resolve these problems by deriving FRs of fitness without using the variational approach. To this end, we first formulate and generalize the problem of adaptation in a changing environment so that individual organisms can change not only their strategy of switching phenotypic states but also the strategy of allocating metabolic resources to each phenotypic state (Sec. II). Combined with the path integral formulation of population dynamics, this generalization enables us to obtain a decomposition of fitness with a combination of time-forward (chronological) and time-backward (retrospective) path probabilities (Sec. III). The decomposition naturally spells out an explicit representation of the upper bound of the average fitness, which was implicitly defined in our previous work [9].

For the bet-hedging problem without sensing of the environment (Sec. IV), the decomposition directly leads to FRs of the fitness loss, which have a form very similar to that of the FRs of entropy production in stochastic thermodynamics [46]. After numerically verifying the derived FRs (Sec. IV B), we investigate the biological meanings and achievability of the FRs (Sec. V). The average FR is related to the evolutionary stable state (ESS), under which the strategy with maximal average fitness cannot be invaded by any other strategies. The detailed and integral FRs generalize the ESS by giving the probability that a suboptimal strategy outperforms the optimal one within a finite time interval owing to rare fluctuation of the environment [48].
By using a dualistic relation between phenotypic and environmental dynamics, the detailed FR is shown to be represented as the ratio of the path probability of the actual environment and that of the conjugate environment under which a given strategy of the organisms becomes optimal. The duality also clarifies that the average loss of fitness is directly related to the imperfectness of the adaptive behavior of the organisms, originating both from physical constraints and from the suboptimality of the behaviors.

The introduction of a sensing signal extends the FRs to accommodate the mutual information between the environment and the signal as the gain of fitness by sensing, in the same manner as mutual information bounds the negative gain of entropy production in information thermodynamics [47] (Sec. VI). Although the extended FRs cover very general situations, the FRs are not tight, and therefore the mutual information overestimates the value of fitness by sensing. By explicitly assuming a causal relation between the environment and the signal, the FRs are further modified to involve the directed information as a tighter bound of fitness gain (Sec. VII). This modification clarifies how the loss of fitness from the upper bound is related to the causality, the inaccessibility to perfect information of the environment, and the imperfect implementation of information processing. Finally, three quantities are introduced to account for the fitness loss of inappropriate sensing and the imperfectness of metabolic allocation and phenotypic switching strategies in general situations (Sec. VII C). The summary and future directions are described in Discussion (Sec. VIII).

II. MODELING ADAPTATION OF POPULATION IN CHANGING ENVIRONMENT

Let $x_t \in S_x$, $y_t \in S_y$, and $z_t \in S_z$ be the phenotype of a living organism, the state of the environment, and the state of the sensing signal at time $t$, respectively (Fig. 1 (A)). For simplicity, possible phenotypic, environmental, and signal states are assumed to be discrete, as in references [9,30,34,36]. The paths (histories) of the states up to time $t$ are defined as $X_t := \{x_\tau \mid \tau \in [0,t]\} \in S_X := S_x^{(t+1)}$, $Y_t := \{y_\tau \mid \tau \in [0,t]\} \in S_Y := S_y^{(t+1)}$, and $Z_t := \{z_\tau \mid \tau \in [0,t]\} \in S_Z := S_z^{(t+1)}$, respectively. Time is also treated as discrete in this work.

A. Modeling phenotype switching

The phenotype of an organism, in general, switches stochastically over time, depending on its past phenotypic state and the sensed signal (Fig. 1 (A) and (B)). The switching dynamics is modeled, for example, by a Markov transition probability $T_F(x_{t+1}|x_t,z_{t+1})$, which satisfies $\sum_{x_{t+1}} T_F(x_{t+1}|x_t,z_{t+1}) = 1$ for all $x_t$ and $z_{t+1}$. Although we mainly focus on Markov switching, our result can be extended to causal switching $T_F(x_{t+1}|X_t,Z_{t+1})$, in which the next phenotypic state $x_{t+1}$ depends on both the past phenotypic history $X_t$ and the signal history $Z_{t+1}$.

B. Modeling metabolic resource allocation

Next, we model the strategy of metabolic resource allocation of the organisms. Each organism is assumed to duplicate asexually to produce $e^{k}$ daughter organisms on average. For each state of the environment $y$, the organisms have a maximum replication rate, $e^{k_{\max}(y)}$, that can be achieved only when the organism allocates all its metabolic resources to adapting to that environmental state alone. Because the environment changes, however, the organism usually distributes its resources over different environmental conditions [49]. To represent this situation, we introduce a conditional probability $T_K(y|x)$ that quantitatively represents the fraction of resources allocated to the environmental state $y$ in a phenotype $x$. An instantaneous replication rate $k(x,y)$ of the phenotype $x$ under the environmental state $y$ is then assumed to be represented as

\[
e^{k(x,y)} = e^{k_{\max}(y)}\, T_K(y|x), \tag{1}
\]

(Fig. 1 (C)). Note that such a decomposition of $e^{k(x,y)}$ can date back to Haccou and Iwasa [28], at least. This relation between the resource allocation strategy and the replication rate may appear to be restrictive because of the linear relation between the allocation strategy $T_K(y|x)$ and the actual replication rate $e^{k(x,y)}$.
Nevertheless, this decomposition of $k(x,y)$ is general enough because, for a given $k(x,y)$, we can find a pair of $e^{k_{\max}(y)}$ and $T_K(y|x)$ as long as the possible phenotypic states are fewer than those of the environment (refer to Appendix A). Such a situation is biologically plausible because the environment is usually more complex than the phenotype of an organism. Moreover, our setting includes the special situation that has been intensively investigated in previous works [32,36]. For example, when the numbers of phenotypic and environmental states are equal, $\#S_x = \#S_y$, the choice $T_K(y|x) = \delta_{x,y}$ corresponds to Kelly's horse race gambling [36,37], in which each phenotypic state allocates all metabolic resources to a certain environmental state and, as a result, can survive and grow only when the phenotypic state matches the realized environmental state.

C. Modeling sensing processes

We finally model the sensing process. We consider the case in which organisms can obtain a sensing signal $z(t)$, which correlates with the environment $y(t)$. The sensed signal $z(t)$ is assumed to be common to all the organisms in the population (Fig. 1 (A)). A biologically relevant situation is that $z(t)$ is a vector of concentrations of extracellular signaling molecules that cannot be consumed as metabolites but correlate with the available metabolites. Another situation is that $z(t)$ is a subset of $y(t)$ for which the organisms have sensors. In either situation, the sensing noise should be negligibly small because all the organisms receive the same sensing signal $z(t)$. Even though sensing of a common signal cannot cover all biologically realistic situations, such as individual sensing with noisy receptors [14–17,50,51], the common signal has been investigated in various works [9,28,36]. In this work, we mainly focus on the common sensing problem and touch on individual sensing in Discussion.
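The allocation model of eq. (1) and its Kelly-type special case can be sketched numerically as follows. This is only an illustration: the state labels, the maximum rates in `K_MAX`, and the allocation tables `T_K` and `KELLY` are hypothetical values chosen for the example, not taken from the paper.

```python
import math

# Hypothetical illustration values (assumptions, not from the paper).
K_MAX = {"y1": math.log(3.0), "y2": math.log(2.0)}        # k_max(y)
T_K = {"x1": {"y1": 0.8, "y2": 0.2},                      # T_K(y|x): rows sum to 1
       "x2": {"y1": 0.3, "y2": 0.7}}
# Kelly-type special case T_K(y|x) = delta_{x,y}: all resources on one state.
KELLY = {"x1": {"y1": 1.0, "y2": 0.0},
         "x2": {"y1": 0.0, "y2": 1.0}}


def replication_rate(x, y, t_k=T_K, k_max=K_MAX):
    """Return k(x, y) from eq. (1): e^{k(x,y)} = e^{k_max(y)} * T_K(y|x)."""
    fraction = t_k[x][y]
    if fraction == 0.0:
        return float("-inf")      # no resources allocated: no offspring
    return k_max[y] + math.log(fraction)


# Phenotype x1 in environment y1 produces 3.0 * 0.8 daughters on average.
daughters = math.exp(replication_rate("x1", "y1"))
# Under the Kelly-type allocation a mismatched phenotype cannot replicate.
mismatch = replication_rate("x1", "y2", t_k=KELLY)
```

Working with $k(x,y)$ in log form keeps the later path-wise sums additive, at the cost of representing a zero allocation as negative infinity.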

[Figure 1 panels: (A) environmental, signal, and phenotypic histories with time-forward sampling and time-backward (retrospective) sampling; (B) type switch from $x_t$ to $x_{t+1}$ under signal $z_{t+1}$; (C) replication of $x_t$ under environment $y_t$.]

FIG. 1. Schematic diagrams for population dynamics of organisms with multiple types and sensing in a changing environment. (A) A tree representation of a growing population in a changing environment. Colors indicate different phenotypic states, environmental states, and signal over time. By tracking the phenotypic history from $t = 0$ to $t$ in a time-forward manner, we can effectively obtain a sample $X_t$ with probability $P_F[X_t \| Z_t]$. By tracking the phenotypic history retrospectively from $t$ to $t = 0$ in a time-backward manner, we can obtain a sample $X_t$ with probability $P^s_B[X_t | Y_t, Z_t]$. Note that we assume that the size of the population is sufficiently large when we introduce the population dynamics of the organisms, eq. (2). (B) Phenotypic switching from $x_t$ to $x_{t+1}$ in response to environmental signal $z_{t+1}$. The switching probability is represented by $T_F(x_{t+1}|x_t,z_{t+1})$. (C) Replication of organisms with phenotypic state $x_t$ under environmental state $y_t$. Each organism generates $e^{k(x_t,y_t)}$ descendants on average.

D. Population dynamics of organisms

By combining the phenotype switching, the metabolic allocation, and the sensing strategies, we can explicitly derive the dynamics of the population of the organisms. Because both environmental and sensing histories are external factors of the organisms, the population dynamics of the organisms is described for a given pair of environmental and signaling histories, $Y_t$ and $Z_t$. Let $N^{Y,Z}_t(x_t) \in \mathbb{R}_{\ge 0}$ be the number of organisms whose phenotypic state is $x_t$ at time $t$ under the realization of environmental and signal histories $Y_t$ and $Z_t$.
The population size of the organisms is assumed to be sufficiently large so that $N^{Y,Z}_t(x_t)$ can be well approximated as a continuous variable. Because the large population size enables us to effectively ignore the demographic fluctuation due to the finite number of organisms, we can obtain the update rule [9] of $N^{Y,Z}_t(x_t)$ as

\[
N^{Y,Z}_{t+1}(x_{t+1}) = e^{k(x_{t+1},y_{t+1})} \sum_{x_t \in S_x} T_F(x_{t+1}|x_t,z_{t+1})\, N^{Y,Z}_t(x_t). \tag{2}
\]

If the population is not sufficiently large, it must instead be modeled, e.g., by a branching process. The statistical properties of the environmental and signal histories are generally characterized by a joint path probability $Q[Y_t, Z_t]$.

III. PATH-WISE FORMULATION AND FITNESS DECOMPOSITION

By using $N^{Y,Z}_t(x_t)$ from the previous section, we define the fitness of the population and derive its path integral formulation. The cumulative fitness $\Psi^s[Y_t,Z_t]$ of the population at $t$ under environmental and signal histories $Y_t$ and $Z_t$ is defined through the exponential expansion of the total population size as follows:

\[
\Psi^s[Y_t,Z_t] := \ln \frac{\sum_{x_t} N^{Y,Z}_t(x_t)}{\sum_{x_0} N^{Y,Z}_0(x_0)} = \ln \frac{N^{Y,Z}_t}{N^{Y,Z}_0}. \tag{3}
\]

We here define $N^{Y,Z}_t := \sum_{x_t} N^{Y,Z}_t(x_t)$. The ensemble average of the cumulative fitness over different realizations of the environmental and signal histories is represented as

\[
\langle \Psi^s_t \rangle_Q := \left\langle \Psi^s[Y_t,Z_t] \right\rangle_{Q[Y_t,Z_t]}. \tag{4}
\]

A. Path-wise and retrospective formulation

As derived in [9,39,42], the cumulative fitness at time $t$ can be represented with a path-wise (path integral) formulation. Let us first define the time-forward path probability of the phenotype switching as

\[
P_F[X_t \| Z_t] := \left[ \prod_{\tau=0}^{t-1} T_F(x_{\tau+1}|x_\tau,z_{\tau+1}) \right] p(x_0), \tag{5}
\]

where $p(x_0) := N^{Y,Z}_0(x_0) / \sum_{x_0} N^{Y,Z}_0(x_0)$. We here use Kramer's causal conditioning $\|$ rather than the usual conditioning $|$ in order to indicate that the path probability $P_F[X_t \| Z_t]$ is causally generated by the Markov transition matrix $T_F(x_{t+1}|x_t,z_{t+1})$, which depends only on the past phenotypic state $x_t$ and the signal $z_{t+1}$ [52,53].
The single bar is also used for the normal conditioning of a path probability that does not necessarily satisfy the causal relation between conditioning and conditioned histories. We also define the path-wise (historical) fitness of a phenotypic history $X_t$ under an environmental history $Y_t$ as

\[
K[X_t,Y_t] := \sum_{\tau=0}^{t-1} k(x_{\tau+1},y_{\tau+1}), \tag{6}
\]

where $K[X_t,Y_t]$ is defined over all $\{X_t,Y_t\} \in S_X \times S_Y$. With these path-wise quantities [9], we obtain the number of organisms at time $t$ whose past phenotypic history is $X_t$ as

\[
N^{Y,Z}_t[X_t] = e^{K[X_t,Y_t]}\, P_F[X_t \| Z_t]\, N^{Y,Z}_0.
\]

Because $N^{Y,Z}_t = \sum_{X_t} N^{Y,Z}_t[X_t]$, the cumulative fitness with sensing $\Psi^s[Y_t,Z_t]$ is explicitly described as

\[
\Psi^s[Y_t,Z_t] = \ln \frac{N^{Y,Z}_t}{N^{Y,Z}_0} = \ln \left\langle e^{K[X_t,Y_t]} \right\rangle_{P_F[X_t \| Z_t]}. \tag{7}
\]

From this representation, the fitness can be described variationally [9,42] as

\[
\Psi^s[Y_t,Z_t] = \max_{P[X_t]} \left[ \left\langle K[X_t,Y_t] \right\rangle_{P[X_t]} - D\!\left[ P[X_t] \,\big\|\, P_F[X_t \| Z_t] \right] \right], \tag{8}
\]

where $D[P \| P'] := \sum_{X_t} P[X_t] \ln \left( P[X_t]/P'[X_t] \right)$ is the Kullback-Leibler divergence (relative entropy) [31,54] between two path measures $P$ and $P'$. It should be noted that both the KL divergence and Kramer's causal conditioning use the double bar, but their meanings are different. The maximization is achieved with the time-backward retrospective path probability defined by

\[
P^s_B[X_t | Y_t,Z_t] := e^{K[X_t,Y_t] - \Psi^s[Y_t,Z_t]}\, P_F[X_t \| Z_t] = \frac{N^{Y,Z}_t[X_t]}{N^{Y,Z}_t}. \tag{9}
\]

If the phenotypic switching does not depend on the sensing signal, that is, $P_F[X_t \| Z_t] = P_F[X_t]$, which corresponds to bet-hedging by random phenotypic switching, $P^s_B$ is reduced to

\[
P^b_B[X_t | Y_t] := e^{K[X_t,Y_t] - \Psi^b[Y_t]}\, P_F[X_t] = \frac{N^{Y}_t[X_t]}{N^{Y}_t}, \tag{10}
\]

where the superscript $b$ denotes bet-hedging and $\Psi^b[Y_t] := \ln \left\langle e^{K[X_t,Y_t]} \right\rangle_{P_F[X_t]}$.

Because $N^{Y,Z}_t[X_t]$ is the number of organisms with phenotypic history $X_t$ in the population at time $t$, the second equality in eq. (9) indicates that $P^s_B[X_t | Y_t,Z_t]$ is the fraction of the organisms that have phenotypic history $X_t$. This property leads to an interpretation of $P^s_B[X_t | Y_t,Z_t]$ as the probability of observing a certain phenotypic history $X_t$ under a realization of the environmental and signal histories $Y_t$ and $Z_t$ when we sample an organism in the population at time $t$ and track its phenotypic history in a time-backward manner, retrospectively [9]. Because the organisms grow more if their phenotypic histories are more adaptive than others for the given environmental realization $Y_t$, the chance to observe a certain phenotypic history $X_t$ after selection under $Y_t$ is biased to $P^s_B[X_t | Y_t,Z_t]$ from the probability $P_F[X_t \| Z_t]$ with which the same phenotypic history is intrinsically generated.
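The equivalence between the update rule (2) and the path-wise expressions (7)–(9) can be checked by brute-force enumeration on a small system. The sketch below is only an illustration under stated assumptions: the two-state model, the numbers in `E_K`, `T_F`, and `P0`, and the particular histories `ys` and `zs` are hypothetical, and the initial population is normalized so that $N_0 = 1$.

```python
import math
from itertools import product

# Hypothetical toy model (assumed values, for illustration only):
# 2 phenotypes, 2 environmental states, 2 signal states.
E_K = [[2.4, 0.4], [0.9, 1.4]]        # E_K[x][y] = e^{k(x, y)}
T_F = [[[0.9, 0.1], [0.6, 0.4]],      # T_F[z][x_prev][x_next]
       [[0.3, 0.7], [0.1, 0.9]]]
P0 = [0.5, 0.5]                       # p(x_0)


def fitness_recursive(ys, zs):
    """Psi^s from update rule (2) and definition (3); ys, zs hold y_1..y_t, z_1..z_t."""
    n = P0[:]                         # N_0(x_0), normalized so that N_0 = 1
    for y, z in zip(ys, zs):
        n = [E_K[x1][y] * sum(T_F[z][x0][x1] * n[x0] for x0 in range(2))
             for x1 in range(2)]
    return math.log(sum(n))


def path_terms(ys, zs):
    """Map every phenotypic path X_t to (K[X_t, Y_t], P_F[X_t || Z_t])."""
    terms = {}
    for xs in product(range(2), repeat=len(ys) + 1):
        k_path, p_forward = 0.0, P0[xs[0]]
        for tau, (y, z) in enumerate(zip(ys, zs)):
            p_forward *= T_F[z][xs[tau]][xs[tau + 1]]
            k_path += math.log(E_K[xs[tau + 1]][y])
        terms[xs] = (k_path, p_forward)
    return terms


def fitness_pathwise(ys, zs):
    """Psi^s from eq. (7): log of the P_F-average of exp(K)."""
    return math.log(sum(math.exp(k) * p for k, p in path_terms(ys, zs).values()))


def retrospective(ys, zs):
    """Time-backward path probability P_B^s[X_t | Y_t, Z_t] from eq. (9)."""
    psi = fitness_pathwise(ys, zs)
    return {xs: math.exp(k - psi) * p for xs, (k, p) in path_terms(ys, zs).items()}


def variational_value(trial, ys, zs):
    """<K>_P - D[P || P_F], the functional maximized in eq. (8)."""
    terms = path_terms(ys, zs)
    return sum(p * (terms[xs][0] - math.log(p / terms[xs][1]))
               for xs, p in trial.items() if p > 0)


ys, zs = [0, 1, 1, 0], [0, 1, 0, 0]   # hypothetical histories y_1..y_4, z_1..z_4
psi_s = fitness_recursive(ys, zs)
p_back = retrospective(ys, zs)
p_forw = {xs: p for xs, (_, p) in path_terms(ys, zs).items()}
```

Here `variational_value(p_back, ys, zs)` reproduces `psi_s`, whereas the forward measure `p_forw` gives a strictly smaller value, as expected from Jensen's inequality. The enumeration grows exponentially with path length, so it is meant only as a consistency check on short histories.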
Because the selected phenotypic histories strongly depend on the actual realization of the environmental history, $P^s_B[X_t | Y_t,Z_t]$ is conditional on $Y_t$. It should be noted that $P^s_B[X_t | Y_t,Z_t]$ is not necessarily causal, which is why we use the normal conditioning [9,42].

B. Decomposition of fitness

In order to understand the relation between fitness and information obtained by sensing, we decompose the cumulative fitnesses into biologically relevant components. To obtain the decompositions, we first define a constant $\phi_0$ and a probability distribution $q_0(y)$ by using $k_{\max}(y)$ as $\phi_0 := -\ln \sum_y e^{-k_{\max}(y)}$ and $q_0(y) := e^{\phi_0} e^{-k_{\max}(y)}$. From these definitions, $k_{\max}(y)$ can be described as $e^{k_{\max}(y)} = e^{\phi_0}/q_0(y)$. By defining $Q_0[Y_t] := \prod_{\tau=0}^{t-1} q_0(y_{\tau+1})$, $P_K[Y_t | X_t] := \prod_{\tau=0}^{t-1} T_K(y_{\tau+1}|x_{\tau+1})$, and $\Phi_0 := t\phi_0$, we obtain the following decomposition of $K$:

\[
K[X_t,Y_t] = \Phi_0 + \ln \frac{P_K[Y_t | X_t]}{Q_0[Y_t]}, \tag{11}
\]

where we use eq. (1). By defining $K_{\max}[Y_t] := \sum_{\tau=0}^{t-1} k_{\max}(y_{\tau+1}) = \Phi_0 - \ln Q_0[Y_t]$, $K[X_t,Y_t]$ can also be described as

\[
K[X_t,Y_t] = K_{\max}[Y_t] + \ln P_K[Y_t | X_t]. \tag{12}
\]
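As a numerical illustration of the single-step version of eq. (11), the sketch below computes $\phi_0$ and $q_0(y)$ from the maximum growth factors and allocation fractions quoted in the caption of Fig. 2, and checks that $k(x,y) = \phi_0 + \ln[T_K(y|x)/q_0(y)]$. The sign convention $\phi_0 = -\ln \sum_y e^{-k_{\max}(y)}$ is the one required for $q_0$ to be normalized and for $e^{k_{\max}(y)} = e^{\phi_0}/q_0(y)$ to hold; the variable and function names are hypothetical.

```python
import math

# Values quoted in the caption of Fig. 2; the names are illustrative.
E_KMAX = [3.2, 3.2, 0.4]                   # e^{k_max(y)} for the three environments
T_K = [[0.7, 0.1, 0.2],                    # T_K[x][y]: allocation of phenotype 1
       [0.1, 0.7, 0.2]]                    # allocation of phenotype 2

# phi_0 := -ln sum_y e^{-k_max(y)},  q_0(y) := e^{phi_0} e^{-k_max(y)}
phi0 = -math.log(sum(1.0 / growth for growth in E_KMAX))
q0 = [math.exp(phi0) / growth for growth in E_KMAX]


def k_direct(x, y):
    """k(x, y) from eq. (1)."""
    return math.log(E_KMAX[y] * T_K[x][y])


def k_decomposed(x, y):
    """Single-step form of eq. (11): phi_0 + ln(T_K(y|x) / q_0(y))."""
    return phi0 + math.log(T_K[x][y] / q0[y])


max_error = max(abs(k_direct(x, y) - k_decomposed(x, y))
                for x in range(2) for y in range(3))
```

With these numbers $q_0$ puts most of its weight on the nutrient-poor state, which is consistent with the later interpretation of $Q_0$ as the worst-case environment.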

With eqs. (10) and (11), for the bet-hedging problem, we obtain a decomposition of the fitness

\[
\Psi^b[Y_t] = \Phi_0 - \ln Q_0[Y_t] - \ln \frac{P^b_B[X_t | Y_t]}{P_K[Y_t | X_t]\, P_F[X_t]}. \tag{13}
\]

For a given environmental statistics $Q[Y_t]$, the fitness is represented by the ratio of the time-forward and time-backward path probabilities as

\[
\Psi^b[Y_t] = \Psi_0[Y_t] - \ln \frac{P^b_B[X_t,Y_t]}{P_K[Y_t | X_t]\, P_F[X_t]}, \tag{14}
\]

where $P^b_B[X_t,Y_t] := P^b_B[X_t | Y_t]\, Q[Y_t]$ is the time-backward joint probability of the phenotypic and environmental histories $X_t$ and $Y_t$. Here we also define

\[
\Psi_0[Y_t] := \Phi_0 + \ln \frac{Q[Y_t]}{Q_0[Y_t]} = K_{\max}[Y_t] + \ln Q[Y_t]. \tag{15}
\]

If the organisms can perfectly foresee that the environmental state at time $\tau$ becomes $y_\tau$, and if they can choose the phenotype that allocates all metabolic resources to the environmental state $y_\tau$, the maximum replication rate at time $\tau$ reaches $e^{k_{\max}(y_\tau)}$. Therefore, $K_{\max}[Y_t]$ is interpreted as the maximum replication over an environmental path $Y_t$ that can be achieved only when the organisms perfectly foresee in advance which environmental history will be realized. In contrast, $\ln Q[Y_t]$ is the entropic loss of fitness due to the lack of knowledge of which environmental history will be realized [32,36]. Therefore, $\Psi_0[Y_t]$ is the maximum replication when the organisms cannot know which environmental state will be realized but know the statistics of the future environmental state. The relevance of this interpretation and the biological meaning of some quantities such as $\Phi_0$ and $Q_0[Y_t]$ are explicitly shown by using the FRs derived in the following sections.
For the case with the sensing signal, we can similarly obtain a decomposition of $\Psi^s[Y_t,Z_t]$ as

\[
\Psi^s[Y_t,Z_t] = \Psi_0[Y_t] - \ln \frac{P^s_B[X_t,Y_t,Z_t]}{P_K[Y_t | X_t]\, P_F[X_t \| Z_t]\, Q[Z_t | Y_t]}, \tag{16}
\]

where $P^s_B[X_t,Y_t,Z_t] := P^s_B[X_t | Y_t,Z_t]\, Q[Y_t,Z_t]$ is the time-backward joint path probability among $X_t$, $Y_t$, and $Z_t$. It should be noted that $P_K[Y_t | X_t]\, P_F[X_t \| Z_t]\, Q[Z_t | Y_t]$ is not a joint path probability because of the circular noncausal dependency among $X_t$, $Y_t$, and $Z_t$.

Finally, by using the decomposition in eq. (11), the time-backward conditional path probabilities, eq. (10) and eq. (9), are reduced to

\[
P^b_B[X_t | Y_t] = \frac{P_K[Y_t | X_t]\, P_F[X_t]}{P_{K,F}[Y_t]}, \tag{17}
\]

\[
P^s_B[X_t | Y_t,Z_t] = \frac{P_K[Y_t | X_t]\, P_F[X_t \| Z_t]}{P_{K,F}[Y_t | Z_t]}, \tag{18}
\]

where the normalization factors are

\[
P_{K,F}[Y_t] := \sum_{X_t} P_K[Y_t | X_t]\, P_F[X_t], \qquad
P_{K,F}[Y_t | Z_t] := \sum_{X_t} P_K[Y_t | X_t]\, P_F[X_t \| Z_t].
\]

Because $P_F[X_t]$ is the probability to intrinsically generate the repertoire of phenotypic histories in the population and $P_K[Y_t | X_t]$ is the probability that an organism with a phenotypic history $X_t$ allocates resources to each history of environment $Y_t$, the normalization factor $P_{K,F}[Y_t]$ can be interpreted as the marginal resource allocation to the environmental history $Y_t$ at the population level. $P_{K,F}[Y_t | Z_t]$ can similarly be interpreted as the population-level resource allocation to $Y_t$ when signal history $Z_t$ is received. By using FRs in the next section, we clarify that $P_{K,F}[Y_t]$ and $P_{K,F}[Y_t | Z_t]$ also have meaning as the conjugate environment under which the given strategy $\{T_F, T_K\}$ becomes optimal.
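For the bet-hedging case, the agreement between the two expressions for the retrospective probability, eq. (10) and eq. (17), can be checked on a small example. The sketch below is an illustration under assumptions: the growth factors and allocation fractions are those quoted in the caption of Fig. 2, whereas the switching matrix `T_F`, the initial distribution `P0`, and the environmental history `ys` are hypothetical. The normalization factor $P_{K,F}[Y_t]$ is computed with a forward recursion over phenotypic states rather than by summing over all paths.

```python
import math
from itertools import product

E_KMAX = [3.2, 3.2, 0.4]                   # e^{k_max(y)}, from the caption of Fig. 2
T_K = [[0.7, 0.1, 0.2], [0.1, 0.7, 0.2]]   # T_K[x][y], from the caption of Fig. 2
T_F = [[0.9, 0.1], [0.1, 0.9]]             # hypothetical T_F[x_prev][x_next]
P0 = [0.5, 0.5]                            # hypothetical p(x_0)


def p_forward(xs):
    """Time-forward path probability P_F[X_t] without sensing."""
    p = P0[xs[0]]
    for a, b in zip(xs, xs[1:]):
        p *= T_F[a][b]
    return p


def p_alloc(ys, xs):
    """P_K[Y_t | X_t] = prod_tau T_K(y_{tau+1} | x_{tau+1}); ys holds y_1..y_t."""
    return math.prod(T_K[x][y] for x, y in zip(xs[1:], ys))


def p_kf(ys):
    """P_{K,F}[Y_t] = sum_X P_K[Y|X] P_F[X], via a forward recursion."""
    alpha = P0[:]
    for y in ys:
        alpha = [T_K[x1][y] * sum(T_F[x0][x1] * alpha[x0] for x0 in range(2))
                 for x1 in range(2)]
    return sum(alpha)


def backward_eq17(xs, ys):
    """Retrospective probability from eq. (17)."""
    return p_alloc(ys, xs) * p_forward(xs) / p_kf(ys)


def backward_eq10(xs, ys):
    """Retrospective probability from eq. (10): exp(K - Psi^b) * P_F."""
    def k_path(path):
        return sum(math.log(E_KMAX[y] * T_K[x][y]) for x, y in zip(path[1:], ys))
    every_path = product(range(2), repeat=len(ys) + 1)
    psi_b = math.log(sum(math.exp(k_path(p)) * p_forward(p) for p in every_path))
    return math.exp(k_path(xs) - psi_b) * p_forward(xs)


ys = [0, 0, 2, 1]                          # hypothetical history y_1..y_4
paths = list(product(range(2), repeat=len(ys) + 1))
total_allocation = sum(p_kf(env) for env in product(range(3), repeat=len(ys)))
```

Because $P_{K,F}[Y_t]$ sums to one over all environmental histories (`total_allocation`), it can itself be read as a path probability of an environment, which is what allows the conjugate-environment interpretation mentioned in the text.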

IV. CAUSAL FRS FOR BET-HEDGING STRATEGY

By rearranging the decomposition of $\Psi^b[Y_t]$ in eq. (14), we can immediately obtain a detailed causal FR for the fitness difference $\Psi_0[Y_t] - \Psi^b[Y_t]$ as

\[
e^{-(\Psi_0[Y_t] - \Psi^b[Y_t])} = \frac{P_K[Y_t | X_t]\, P_F[X_t]}{P^b_B[X_t,Y_t]} = \frac{P_{K,F}[Y_t]}{Q[Y_t]}, \tag{19}
\]

where we use eq. (17) to obtain the last equality. The first equality means that the fitness difference is the log ratio of the time-forward and time-backward path probabilities $P_K[Y_t | X_t]\, P_F[X_t]$ and $P^b_B[X_t,Y_t]$. $P_K[Y_t | X_t]\, P_F[X_t]$ is the time-forward probability, obtained by tracking the histories forward in time before selection, of observing an organism that takes phenotypic history $X_t$ and then allocates the fraction $P_K[Y_t | X_t]$ of its metabolic resources to $Y_t$. $P^b_B[X_t,Y_t]$ is the time-backward probability, obtained by tracking the histories backward in time after selection, of observing the realization of environmental history $Y_t$ together with the phenotypic history $X_t$. The second equality also indicates that the fitness difference is the log ratio of the fraction of resources allocated to environmental history $Y_t$ at the population level and the probability of observing environmental history $Y_t$.

By averaging eq. (19) with respect to $P^b_B[X_t,Y_t]$ or $Q[Y_t]$, we can derive an integral FR as

\[
\left\langle e^{-(\Psi_0[Y_t] - \Psi^b[Y_t])} \right\rangle_{Q[Y_t]} = 1. \tag{20}
\]

If we average eq. (19) after taking the logarithm of both sides, we obtain an average FR as

\[
\langle \Psi^b \rangle_Q = \langle \Psi_0 \rangle_Q - D^b_{\mathrm{loss}}, \tag{21}
\]

where $\langle \Psi^b \rangle_Q := \langle \Psi^b[Y_t] \rangle_{Q[Y_t]}$, $\langle \Psi_0 \rangle_Q := \langle \Psi_0[Y_t] \rangle_{Q[Y_t]}$, and

\[
D^b_{\mathrm{loss}} := D\!\left[ P^b_B[X_t,Y_t] \,\big\|\, P_K[Y_t | X_t]\, P_F[X_t] \right] \tag{22}
\]

\[
\phantom{D^b_{\mathrm{loss}} :}= D\!\left[ Q[Y_t] \,\big\|\, P_{K,F}[Y_t] \right]. \tag{23}
\]

Because of the non-negativity of the relative entropy $D^b_{\mathrm{loss}}$, we can easily see that $\langle \Psi_0 \rangle_Q$ is an upper bound of the average fitness $\langle \Psi^b \rangle_Q$ of a bet-hedging strategy:

\[
\langle \Psi_0 \rangle_Q \ge \max_{\{T_F, T_K\}} \langle \Psi^b \rangle_Q, \tag{24}
\]

where we use the fact that $\Psi_0[Y_t]$ depends neither on the phenotype switching strategy $T_F$ nor on the metabolic allocation strategy $T_K$. These FRs are basically the same as those we derived in our previous work by using a variational approach. It should also be noted that the FRs derived by Mustonen and Lässig [24] are different from ours because their relations are based on a model describing the dynamics of an ensemble of populations, whereas ours describes the dynamics of a single population.

A. Biological meaning of $\Psi_0$ and $\Phi_0$

From eq. (15), the upper bound of the average fitness, $\langle \Psi_0 \rangle_Q$, admits two different representations:

\[
\langle \Psi_0 \rangle_Q = \langle K_{\max} \rangle_Q - S[Q] = \Phi_0 + D\!\left[ Q \,\|\, Q_0 \right], \tag{25}
\]

FIG. 2. Schematic representations of environmental dynamics, phenotypic switching, and sensing used for numerical verification of FRs in Figs. 3 and 5. (A) Characteristics of the environment. The environment has three states: $y_t \in S_y = \{s^y_1, s^y_2, s^y_3\}$. $s^y_1$ and $s^y_2$ are nutrient-rich states that have different sources of nutrients. $s^y_3$ is a nutrient-poor state under which replication of the organisms is severely restricted. The maximum growth under each environment is $e^{k_{\max}(s^y_1)} = 3.2$, $e^{k_{\max}(s^y_2)} = 3.2$, and $e^{k_{\max}(s^y_3)} = 0.4$. The transition rates $T^E_F(y_{t+1}|y_t)$ between the states are shown on the arrows. The environmental states usually fluctuate between $s^y_1$ and $s^y_2$ but occasionally flip to $s^y_3$ with a 5% chance. (B) Characteristics of phenotypic states without sensing. Organisms have two phenotypic states: $x_t \in S_x = \{s^x_1, s^x_2\}$. The transition rates $T_F(x_{t+1}|x_t)$ between the states are shown on the arrows. $s^x_1$ is the phenotypic state allocating more resources to the environmental state $s^y_1$ (70%) than to $s^y_2$ (10%), whereas $s^x_2$ allocates more to $s^y_2$ (70%) than to $s^y_1$ (10%). Both states allocate 20% of the resources to the starving environmental state $s^y_3$. (C) Characteristics of the sensing signal. The sensing signal has two states, $z_t \in S_z = \{s^z_1, s^z_2\}$. When the environmental state is $s^y_1$ ($s^y_2$), the organisms obtain $s^z_1$ ($s^z_2$) as the sensing signal with 90% accuracy. If the environment is in $s^y_3$, the sensing signal produces $s^z_1$ or $s^z_2$ with equal probability. (D) Characteristics of phenotypic switching with sensing. When the organism obtains the sensing signal $s^z_1$ ($s^z_2$), it switches its phenotypic state into $s^x_1$ ($s^x_2$) 95% of the time.
where S[Q] := lnq[y t ] Q[Yt] is the entropy of Q[Y t] and K max Q is the average fitness under environmental statistics Q[Y t ] that is attained only when organisms have perfect knowledge of the future environment. Therefore, the first equality means that the randomness of the environment quantified by the entropy S[Q] works as the inevitable loss of fitness due to the lack of knowledge of which environmental history will be realized in the future. If the environment fluctuates more unpredictably, we have a higher S[Q] and a lower upper bound of the average fitness. Note that these properties of Ψ 0 Q have been pointed out previously and repeatedly 28,32,36. The meaning of Φ 0 and Q 0 [Y t ] in the second equality becomes explicit by considering the

minimization of $\langle \Psi_0 \rangle_Q$ with respect to $Q[Y_t]$ as follows:

$$\Phi_0 = \min_{Q} \langle \Psi_0[Y_t] \rangle_{Q[Y_t]} \ge \min_{Q} \max_{\{T_F, T_K\}} \langle \Psi^b[Y_t] \rangle_{Q[Y_t]}, \tag{26}$$

where we use

$$\min_{Q} \langle \Psi_0[Y_t] \rangle_{Q[Y_t]} = \Phi_0 + \min_{Q} D[Q \| Q_0] = \Phi_0. \tag{27}$$

This relation indicates that $\Phi_0$ is the minimum of the maximum average fitness and that $Q_0[Y_t]$ is the worst environment for the organisms, under which the maximum average fitness is minimized. This min-max characterization of $\Phi_0$ and $Q_0[Y_t]$ has been clarified in the context of game theory with a matrix formulation [34].

B. Verification of FRs for fitness

Equations (19)-(21) indicate that, under quite general situations, the fitness difference $\Psi_0[Y_t] - \Psi^b[Y_t]$ satisfies the FRs [9], as the entropy production does in stochastic thermodynamics [46,47]. To demonstrate the relations, we consider an organism with two phenotypic states growing in a Markovian environment with three states, as depicted in Fig. 2(A) and (B). The three environmental states, $s^y_1$, $s^y_2$, and $s^y_3$, describe nutrient-A-rich, nutrient-B-rich, and nutrient-poor conditions, respectively (Fig. 2(A)). The two phenotypic states, $s^x_1$ and $s^x_2$, employ strategies that allocate 70% of the metabolic resources to $s^y_1$ and $s^y_2$, respectively. Both states allocate 10% of the resources to the remaining two states (see Appendix B for more details). This setting abstractly and simply represents the fact that organisms generally have far fewer phenotypic and sensing states than possible environmental states because of the limited physical complexity of the organisms [35]. Figures 3(A) and (B) show the population dynamics of the organisms with the phenotypic states $s^x_1$ and $s^x_2$ under two different realizations of the environmental history, alongside $\Psi^b[Y_t]$ and $\Psi_0[Y_t]$. Depending on the actual realization of the environment, the relative relations among $N^Y_t(s^x_1)$, $N^Y_t(s^x_2)$, $e^{\Psi^b[Y_t]} = N^Y_t$, and $e^{\Psi_0[Y_t]}$ change over time stochastically.
In Fig. 3(A), $e^{\Psi_0[Y_t]}$ is mostly greater than $N^Y_t = e^{\Psi^b[Y_t]}$, which reflects the average relation $\langle \Psi_0 \rangle_Q \ge \langle \Psi^b \rangle_Q$. In contrast, $N^Y_t$ frequently becomes greater than $e^{\Psi_0[Y_t]}$ in Fig. 3(B). Figures 3(C) and (D) show that the environmental fluctuation induces a large fluctuation in both $\Psi^b[Y_t]$ and $\Psi_0[Y_t]$. As shown in Fig. 3(E), although most environmental fluctuations result in positive fitness differences, that is, $\Psi_0[Y_t] - \Psi^b[Y_t] > 0$, rare environmental fluctuations lead to a negative fitness difference in a finite time interval, meaning that the fitness $\Psi^b[Y_t]$ of the suboptimal strategy outperforms the average upper bound $\Psi_0[Y_t]$. This is analogous to the reversed heat flow in a small thermal system [55]. Such rare events balance out so that the integral FR in eq. (20) is satisfied, as verified numerically in Fig. 3(F).
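As an illustration of how such a check can be carried out, the following minimal Python sketch enumerates all environmental histories of a short length and evaluates the integral and average FRs exactly. It assumes the detailed FR in the form $e^{-(\Psi_0[Y_t]-\Psi^b[Y_t])} = P_{K,F}[Y_t]/Q[Y_t]$. The transition probabilities, initial distributions, history length, time-index convention, and function names are hypothetical choices for illustration and are not the values used for Fig. 3; only the 70%/10%/20% allocation follows the caption of Fig. 2(B).

```python
import itertools
import math

# Hypothetical parameters (NOT the values behind Fig. 3).
T_E = [[0.50, 0.45, 0.05],   # T_E[y][y_next]: environmental transition probabilities
       [0.45, 0.50, 0.05],
       [0.45, 0.45, 0.10]]
q0 = [0.45, 0.45, 0.10]      # initial distribution of the environment
T_F = [[0.9, 0.1],           # T_F[x][x_next]: phenotypic switching probabilities
       [0.1, 0.9]]
p0 = [0.5, 0.5]              # initial phenotype distribution
T_K = [[0.7, 0.1, 0.2],      # T_K[x][y]: fraction of resources phenotype x allocates to y
       [0.1, 0.7, 0.2]]


def path_prob_env(Y):
    """Q[Y_t]: probability of the environmental history Y under the Markov chain."""
    p = q0[Y[0]]
    for y, y_next in zip(Y, Y[1:]):
        p *= T_E[y][y_next]
    return p


def marginal_allocation(Y):
    """P_{K,F}[Y_t]: sum over all phenotypic histories, via a forward recursion."""
    w = [p0[x] * T_K[x][Y[0]] for x in range(2)]
    for y in Y[1:]:
        w = [T_K[x2][y] * sum(w[x1] * T_F[x1][x2] for x1 in range(2))
             for x2 in range(2)]
    return sum(w)


def check_frs(t):
    """Enumerate all histories (y_0, ..., y_t) and accumulate three path averages."""
    integral, loss, violation = 0.0, 0.0, 0.0
    for Y in itertools.product(range(3), repeat=t + 1):
        Q, P = path_prob_env(Y), marginal_allocation(Y)
        integral += Q * (P / Q)        # <exp(-(Psi_0 - Psi^b))>_Q
        loss += Q * math.log(Q / P)    # <Psi_0 - Psi^b>_Q = D[Q || P_{K,F}]
        if P > Q:                      # Psi^b exceeds Psi_0 on this history
            violation += Q
    return integral, loss, violation


integral, loss, violation = check_frs(t=6)
print(f"integral FR: {integral:.6f}, D_loss: {loss:.4f}, P(violation): {violation:.4f}")
```

Because $T_K$ is normalized over the environmental states, the first printed value equals 1 up to rounding, while the average loss is positive and the probability of a violation lies strictly between 0 and 1 whenever $Q[Y_t] \neq P_{K,F}[Y_t]$.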

FIG. 3. Numerical simulation of the population dynamics without sensing defined in Fig. 2 (panels (A)-(F); horizontal axes: time). (A, B) Two sample histories of the environment $Y_t$ and the dynamics of $N^Y_t(s^x_1)$, $N^Y_t(s^x_2)$, $N^Y_t = e^{\Psi^b[Y_t]}$, and $e^{\Psi_0[Y_t]}$ under $Y_t$ obtained by solving eq. (2). We set $N^Y_0 = 1$. The color bars on the graphs represent the state of the environment at each time point. The correspondence between the colors and environmental states is as shown in Fig. 2(A). Cyan and yellow lines represent $N^Y_t(s^x_1)$ and $N^Y_t(s^x_2)$, respectively. The dashed blue line with grey fill is $N^Y_t$. The red line is $e^{\Psi_0[Y_t]}$. (C) Stochastic behaviors of the population fitness $\Psi^b[Y_t]$ for 100 independent samples of the environmental histories. Each colored thin line represents $e^{\Psi^b[Y_t]} = N^Y_t/N^Y_0$ for one realization of the environmental history. The thick black line is $e^{\langle \Psi^b[Y_t] \rangle_{Q[Y_t]}}$. (D) Stochastic behaviors of $\Psi_0[Y_t]$ for the same 100 samples of environmental histories as those in (C). Each colored thin line represents $e^{\Psi_0[Y_t]}$ for the corresponding realization in (C). (E) Stochastic behaviors of $e^{-(\Psi_0[Y_t] - \Psi^b[Y_t])}$ for the same 100 samples of environmental histories as those in (C). Each colored thin line represents $e^{-(\Psi_0[Y_t] - \Psi^b[Y_t])}$ for the corresponding realization in (C). (F) $\langle e^{-(\Psi_0[Y_t] - \Psi^b[Y_t])} \rangle_{Q[Y_t]}$ calculated empirically from numerically simulated sample paths of $e^{-(\Psi_0[Y_t] - \Psi^b[Y_t])}$. The thick line is the average of $10^8$ independent samples of environmental histories. To illustrate the fluctuation, the $10^8$ histories are divided into 100 groups of histories, each containing $10^5$ histories. Each thin line is obtained as the average of the $10^5$ histories in each group.
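The group-averaging procedure described for panel (F) can be sketched as follows. This is a minimal Monte Carlo illustration with far fewer samples and a much shorter history than in the figure. All parameter values, the seed, and the function names are hypothetical, and the ratio $P_{K,F}[Y_t]/Q[Y_t]$ is used for $e^{-(\Psi_0[Y_t]-\Psi^b[Y_t])}$ as an assumption taken from the detailed FR.

```python
import random

# Hypothetical parameters for illustration (not the values behind Fig. 3).
T_E = [[0.50, 0.45, 0.05],   # T_E[y][y_next]: environmental transition probabilities
       [0.45, 0.50, 0.05],
       [0.45, 0.45, 0.10]]
q0 = [0.45, 0.45, 0.10]      # initial distribution of the environment
T_F = [[0.9, 0.1], [0.1, 0.9]]            # T_F[x][x_next]: phenotypic switching
p0 = [0.5, 0.5]                           # initial phenotype distribution
T_K = [[0.7, 0.1, 0.2], [0.1, 0.7, 0.2]]  # T_K[x][y]: resource allocation


def sample_history(rng, t):
    """Draw one environmental history (y_0, ..., y_t) from the Markov chain Q."""
    y = rng.choices(range(3), weights=q0)[0]
    history = [y]
    for _ in range(t):
        y = rng.choices(range(3), weights=T_E[y])[0]
        history.append(y)
    return history


def fr_ratio(history):
    """Return P_{K,F}[Y_t] / Q[Y_t], used here for exp(-(Psi_0 - Psi^b))."""
    q = q0[history[0]]
    w = [p0[x] * T_K[x][history[0]] for x in range(2)]
    for y_prev, y in zip(history, history[1:]):
        q *= T_E[y_prev][y]
        w = [T_K[x2][y] * sum(w[x1] * T_F[x1][x2] for x1 in range(2))
             for x2 in range(2)]
    return sum(w) / q


def group_averages(n_groups, group_size, t, seed=0):
    """Average the ratio within each group of independently sampled histories."""
    rng = random.Random(seed)
    return [sum(fr_ratio(sample_history(rng, t)) for _ in range(group_size)) / group_size
            for _ in range(n_groups)]


averages = group_averages(n_groups=50, group_size=2000, t=4)
overall = sum(averages) / len(averages)
print(f"group averages in [{min(averages):.3f}, {max(averages):.3f}], overall {overall:.3f}")
```

The individual group averages scatter around 1, and the scatter shrinks as the group size grows; rare histories with a large ratio dominate the fluctuation, which is why many samples are needed for longer histories.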

V. BIOLOGICAL MEANING OF THE FITNESS FRS

In the original detailed FR over paths [47], the entropy production is the log ratio of the path probability of a system's trajectory and that of its time reversal. The average entropy production attains its minimum 0 only when the time reversibility of the system holds in the sense that the probabilities of observing the time-forward and the time-reversed trajectories are equal. Thus, the FRs are related to the extent of the time reversibility of the system. In contrast, the fitness difference in the detailed FR (eq. (19)) is the log ratio between the probability $Q[Y_t]$ of observing the environmental history and the percentage $P_{K,F}[Y_t]$ of the marginal resource allocation to $Y_t$, or that between the time-backward path probability $P^b_B[X_t, Y_t]$ and the time-forward path probability $P_K[Y_t|X_t]P_F[X_t]$. By investigating the FRs, we clarify a dualistic structure, a conjugacy of these quantities, and the time-reversal condition for the equality attained in eq. (24).

A. Dualistic relation between strategy and environment

The average FR in eq. (21) implies that the maximization of $\langle \Psi^b \rangle_Q$ with respect to the strategies is dual to the minimization of the relative entropy $D^b_{\rm loss}$ as follows:

$$\{T^\dagger_F, T^\dagger_K\} := \arg\max_{\{T_F,T_K\}} \langle \Psi^b \rangle_Q = \arg\min_{\{T_F,T_K\}} D^b_{\rm loss}, \tag{28}$$

because $\Psi_0[Y_t]$ is independent of $P_F$ and $P_K$. This duality indicates that maximizing the average fitness by choosing the best strategy is equivalent to the organisms implicitly learning and preparing for the environmental statistics $Q[Y_t]$ so that the marginal resource allocation $P_{K,F}[Y_t]$ to $Y_t$ becomes close to $Q[Y_t]$, because $D^b_{\rm loss} = D[Q[Y_t] \| P_{K,F}[Y_t]]$. The upper bound of the average fitness in eq.
(24) is achieved as $\Psi^b[Y_t] = \Psi_0[Y_t]$ if and only if $\{T^\dagger_F, T^\dagger_K\}$ satisfies $Q[Y_t] = P^\dagger_{K,F}[Y_t]$, where $P^\dagger_{K,F}[Y_t] := \sum_{X_t} P^\dagger_K[Y_t|X_t] P^\dagger_F[X_t]$, meaning that the environmental statistics and the marginal resource allocation match perfectly.

B. Meaning of $P_{K,F}[Y_t]$ as conjugate environment

For a given environment, the strategy that achieves the bound may not always exist. In contrast, for a given pair of strategies $\{T_F, T_K\}$, there always exists an environment $Q^*[Y_t]$ under which the pair achieves the bound,

$$\langle \Psi_0 \rangle_{Q^*} = \langle \Psi^b[Y_t] \rangle_{Q^*[Y_t]} \ge \max_{\{T'_F, T'_K\}} \langle \Psi'^b[Y_t] \rangle_{Q^*[Y_t]}, \tag{29}$$

and

$$\{T_F, T_K\} = \arg\max_{\{T'_F, T'_K\}} \langle \Psi'^b[Y_t] \rangle_{Q^*[Y_t]} \tag{30}$$

is satisfied. Because of the duality shown in eq. (28), $Q^*[Y_t]$ is explicitly obtained as $Q^*[Y_t] = P_{K,F}[Y_t]$. Therefore, $P_{K,F}[Y_t]$ can also be regarded as the conjugate environment to the strategy $\{T_F, T_K\}$ or $\{P_F, P_K\}$, under which the strategy is optimal. Accordingly, the fitness difference in eq. (19) is the log ratio of the actual environmental statistics $Q[Y_t]$ and the statistics of the conjugate environment $Q^*[Y_t] = P_{K,F}[Y_t]$ of the given strategy. $\Psi^b[Y_t]$ is bounded by $\Psi_0[Y_t]$ on average, and therefore, the optimal strategy that attains $\Psi_0[Y_t]$ cannot be invaded by any other strategy if we consider an infinitely large population and the asymptotic dynamics. Thus, the optimal strategy is a version of ESS in a fluctuating environment. Within a finite time interval, however, $\Psi^b[Y_t]$ becomes greater than $\Psi_0[Y_t]$ under certain realizations of the environment $Y_t$ that satisfy $Q[Y_t] < Q^*[Y_t]$. In stochastic thermodynamics, such realizations correspond to the transient reversed heat flow in a small

thermal system [55]. In biology, they temporarily violate ESS because a suboptimal strategy $\{T_F, T_K\}$ under $Q$ can outperform the optimal strategy or near-optimal strategies with the aid of the environmental fluctuation within a finite time interval. The integral FR in eq. (20) tells us that the violation of ESS can always occur with a small but finite probability in a finite time interval. If the population size of the optimal strategy is finite, such a violation can lead to extinction of the optimal population with a finite probability [48]. Moreover, the detailed FR in eq. (19) implies that a greater violation can occur under a realization of the environmental history $Y_t$ if $Q[Y_t]$ is small but $Q^*[Y_t]$ is large. This fact can be intuitively understood as follows: the greater violation is induced by an environmental history $Y_t$ that rarely occurs in the actual environmental statistics $Q[Y_t]$ but is adaptive and advantageous for the given strategy. A crucial fact is that this intuitive understanding is supported by a quantitative relation, eq. (19). Furthermore, the detailed FR suggests that a greater violation can occur for specific suboptimal strategies than for others if $Q[Y_t]$ contains many rare environmental histories. In contrast, if $Q[Y_t]$ is perfectly random, that is, $Q[Y_t] = \mathrm{const.}$, the extent of the violation is limited and all the suboptimal strategies have only an even chance of violation. This implies that structured environmental fluctuation promotes impactful violation by some specific suboptimal strategies.

C. Time reversibility and optimality

Moreover, if $\Psi^b[Y_t] = \Psi_0[Y_t]$ holds, the time-backward path probability $P^{b,\dagger}_B[X_t, Y_t]$ and the time-forward path probability $P^\dagger_K[Y_t|X_t] P^\dagger_F[X_t]$ become equal:

$$P^{b,\dagger}_B[X_t, Y_t] = P^{b,\dagger}_B[X_t|Y_t]\, Q[Y_t] = P^\dagger_K[Y_t|X_t]\, P^\dagger_F[X_t], \tag{31}$$

where we use the first equality in eq. (23).
Marginalization of this equality with respect to $Y_t$ indicates that the marginalized time-backward path probability $P^{b,\dagger}_B[X_t]$ satisfies the consistency condition $P^{b,\dagger}_B[X_t] = P^\dagger_F[X_t]$, as shown previously [9]. Thus, the optimal strategies that achieve the bound have time reversibility in the sense that the ensembles of the time-forward and time-backward phenotypic histories are indistinguishable without knowing the actual environmental history that the population experienced. While the definition of time reversibility is different, this result is closely related to the fact that the average entropy production attains 0 when the time reversibility is satisfied. Biologically, this result is quite important because we can evaluate the optimality of an organism in a changing environment by just observing its phenotypic dynamics, without directly measuring the environment that the organisms experience. It should be noted, however, that $P^{b,\dagger}_B[X_t] = P^\dagger_F[X_t]$ is a necessary but not sufficient condition for $\Psi^b[Y_t] = \Psi_0[Y_t]$.

D. Achievability of the fitness upper bound

The duality in eq. (28) indicates that the achievability of the bound in eq. (24) depends on the actual properties of the environmental statistics and on biological constraints on the selectable strategies. If the environment is a time-homogeneous Markov process, $Q[Y_t] = \prod_{\tau=0}^{t-1} T^F_E(y_{\tau+1}|y_\tau)\, q(y_0)$, and if the number of possible phenotypic states is the same as that of the environmental states, that is, $\#S_X = \#S_Y$, the bound can be achieved by the pair of strategies

$$\{T^\dagger_F(x'|x),\, T^\dagger_K(y|x)\} = \{T^F_E(x'|x),\, \delta_{x,y}\}, \tag{32}$$

where $T^F_E(x'|x) := T^F_E(y_{\tau+1}|y_\tau)|_{y_{\tau+1}=x',\, y_\tau=x}$ and $\delta_{x,y}$ is the Kronecker delta. This pair is equivalent to the optimal bet-hedging strategy in Kelly's horse race gambling because $T_K(y|x) = \delta_{x,y}$ means that organisms can survive only when their phenotypic state matches the current environmental state; they die out otherwise. To be specific, we call $T_K(y|x) =$

$\delta_{x,y}$ Kelly's strategy of metabolic allocation. Under biologically realistic constraints, however, Kelly's strategy cannot be the optimal strategy because the possible phenotypic states are usually much fewer than those of the environment, as in Figs. 2 and 3. Allocating all resources to a specific environment easily leads to extinction if the phenotypic states cannot cover all the possible environmental states. With a limited capacity in possible phenotypic states, $D^b_{\rm loss} = 0$ can be attained only if the environmental fluctuation has a hidden structure whose dimensionality is sufficiently low. This is manifested by the fact that the conjugate environment $Q^*$ is the environment with a hidden dynamics $P_F[X_t]$ that generates the actual environmental history as $Q^*[Y_t] = \sum_{X_t} P_K[Y_t|X_t] P_F[X_t]$. Because such a low-dimensional structure may not always exist, however, $D^{b,\dagger}_{\rm loss} := \min_{\{T_F,T_K\}} D^b_{\rm loss}$ is generally not zero but finite and positive under the biological constraint that the possible phenotypic states are fewer than the environmental ones. Therefore, $\langle \Psi_0 \rangle_Q$ is generally attained only by a Darwinian demon that cannot perfectly foresee the future environment but has sufficient capacity in its phenotypic properties to perfectly learn and prepare for the environmental fluctuation $Q[Y_t]$. Even when the bound is not achieved, $D^{b,\dagger}_{\rm loss}$ has an explicit meaning as the component of the environmental fluctuation that cannot be learned or represented by the dynamics of the cell's strategy.
For example, when $T_F$ is memoryless and $Q[Y_t]$ is a stationary Markov process with the stationary probability $q(y)$, that is, $Q[Y_t] = \prod_{\tau=0}^{t-1} T^F_E(y_{\tau+1}|y_\tau)\, q(y_0)$, then

$$D^{b,\dagger}_{\rm loss} = \min_{\{T_F,T_K\}} D^b_{\rm loss} = \min_{\{T_F,T_K\}} \left\langle \ln \frac{T^F_E(y'|y)}{\langle T_K(y'|x) \rangle_{T_F(x)}} \right\rangle_{T^F_E(y'|y)\, q(y)} = \min_{\{T_F,T_K\}} \left[ I^{y_t;y_{t+1}} + D\!\left[ q(y) \,\Big\|\, \langle T_K(y|x) \rangle_{T_F(x)} \right] \right] = I^{y_t;y_{t+1}},$$

where $I^{y_t;y_{t+1}} := D\left[ T^F_E(y'|y)\, q(y) \,\|\, q(y)\, q(y') \right]$ is the mutual information that measures the correlation of environmental states between consecutive time points, which cannot be learned or mimicked by the memoryless phenotypic switching.

VI. FRS WITH SIGNAL SENSING

Next, we consider the case in which the organisms can exploit the information obtained from the sensing signal. The fitness decomposition in eq. (16) can be rearranged as

$$\Psi^s[Y_t, Z_t] = \Psi_0[Y_t] + i[Y_t; Z_t] - \ln \frac{P^s_B[X_t, Y_t, Z_t]}{P_K[Y_t|X_t]\, P_F[X_t|Z_t]\, Q[Z_t]} \tag{33}$$

$$= \Psi_0[Y_t] + i[Y_t; Z_t] - \ln \frac{Q[Y_t, Z_t]}{P_{K,F}[Y_t|Z_t]\, Q[Z_t]}, \tag{34}$$

where $i[Y_t; Z_t] := \ln \left( Q[Y_t, Z_t] / Q[Y_t] Q[Z_t] \right)$ is the bare mutual information between $Y_t$ and $Z_t$, and we use eq. (18) to derive the last equality. From this, we can similarly obtain detailed, integral, and average FRs with information as follows:

$$e^{-(\Psi_0[Y_t] + i[Y_t;Z_t] - \Psi^s[Y_t,Z_t])} = \frac{P_{K,F}[Y_t|Z_t]\, Q[Z_t]}{Q[Y_t, Z_t]}, \tag{35}$$

and

$$\left\langle e^{-(\Psi_0[Y_t] + i[Y_t;Z_t] - \Psi^s[Y_t,Z_t])} \right\rangle_{Q[Y_t,Z_t]} = 1, \tag{36}$$

$$\langle \Psi^s \rangle_Q = \langle \Psi_0 \rangle_Q + I^{Y;Z} - D^s_{\rm loss}, \tag{37}$$

where $I^{Y;Z} := \langle i[Y_t; Z_t] \rangle_{Q[Y_t,Z_t]}$ is the path-wise mutual information between $Y_t$ and $Z_t$, and

$$D^s_{\rm loss} := D\left[ P^s_B[X_t, Y_t, Z_t] \,\|\, P_K[Y_t|X_t]\, P_F[X_t|Z_t]\, Q[Z_t] \right] = D\left[ Q[Y_t, Z_t] \,\|\, P_{K,F}[Y_t|Z_t]\, Q[Z_t] \right]. \tag{38}$$

The information terms enter Eqs. (35)-(37) in the same way as in the Sagawa-Ueda relations, which involve Maxwell's demon and feedback regulation [47]. Because of the non-negativity of $D^s_{\rm loss}$, $\langle \Psi^s \rangle_Q$ is upper bounded by $\langle \Psi_0 \rangle_Q + I^{Y;Z}$:

$$\langle \Psi^s \rangle_Q \le \langle \Psi_0 \rangle_Q + I^{Y;Z} = \langle K_{\max} \rangle_Q - S^{Y|Z}[Q], \tag{39}$$

where $S^{Y|Z}[Q] := -\langle \ln Q[Y_t|Z_t] \rangle_{Q[Y_t,Z_t]}$ is the conditional entropy of $Q[Y_t, Z_t]$. If the history of the signal $Z_t$ has perfect information on the history of $Y_t$, the upper bound reaches $\langle K_{\max} \rangle_Q$. As in the bet-hedging situation, maximization of the average fitness $\langle \Psi^s \rangle_Q$ with sensing is also dual to the minimization of the relative entropy $D^s_{\rm loss}$:

$$\{T^\ddagger_F, T^\ddagger_K\} := \arg\max_{\{T_F,T_K\}} \langle \Psi^s \rangle_Q = \arg\min_{\{T_F,T_K\}} D^s_{\rm loss}, \tag{40}$$

where we use $\ddagger$ to denote the optimal $T_F$ and $T_K$ with sensing. The duality indicates that $\langle \Psi^s \rangle_Q$ achieves the bound $\langle \Psi_0 \rangle_Q + I^{Y;Z}$ only when $P^\ddagger_{K,F}[Y_t|Z_t] = Q[Y_t|Z_t]$ holds. As in bet-hedging, if $P^\ddagger_{K,F}[Y_t|Z_t] = Q[Y_t|Z_t]$ holds, the time-backward path probability $P^{s,\ddagger}_B[X_t, Y_t, Z_t]$ equals the time-forward path probability:

$$P^{s,\ddagger}_B[X_t, Y_t, Z_t] = P^\ddagger_K[Y_t|X_t]\, P^\ddagger_F[X_t|Z_t]\, Q[Z_t]. \tag{41}$$

Marginalization of this equation leads to the consistency condition $P^\ddagger_F[X_t|Z_t] = P^{s,\ddagger}_B[X_t|Z_t]$ $\left(:= \sum_{Y_t} P^{s,\ddagger}_B[X_t|Y_t, Z_t]\, Q[Y_t|Z_t]\right)$ derived in reference 9. Moreover, $P_{K,F}[Y_t|Z_t]\, Q[Z_t]$ is the conjugate environment and signal of a given pair of strategies $\{T_F, T_K\}$, under which the pair achieves the bound. From the detailed FR in eq.
(35), we also see that the fitness of a given strategy can exceed the bound by chance, $\Psi^s[Y_t, Z_t] > \Psi_0[Y_t] + i[Y_t; Z_t]$, when the realized pair of environmental and signal histories $\{Y_t, Z_t\}$ appears more frequently in the conjugate environment than in the actual environment, that is, $P_{K,F}[Y_t|Z_t]\, Q[Z_t] > Q[Y_t, Z_t]$.

A. Achievability of the bound and causality

The necessary and sufficient condition $P_{K,F}[Y_t|Z_t] = Q[Y_t|Z_t]$ for achieving the bound means that the optimal metabolic allocation and phenotype switching strategies together implement the Bayesian computation of the posterior distribution $Q[Y_t|Z_t]$ of $Y_t$ given the history of the sensed signal $Z_t$. Under the constraint that $P_{K,F}[Y_t|Z_t]$ satisfies a causal relation, $P_{K,F}[Y_t|Z_t] = \sum_{X_t} P_K[Y_t|X_t]\, P_F[X_t|Z_t]$, the condition $P_{K,F}[Y_t|Z_t] = Q[Y_t|Z_t]$ cannot be realized in general because $Q[Y_t|Z_t]$ does not necessarily satisfy the causality relation between $Y_t$ and $Z_t$. If, for example, the metabolic allocation strategy is of Kelly's type, $T_K(y|x) = \delta_{y,x}$, the phenotypic switching strategy must satisfy $P_F[X_t|Z_t] = Q[X_t|Z_t]$ to achieve the bound, where $Q[X_t|Z_t] := Q[Y_t|Z_t]|_{Y_t=X_t}$. By definition, $P_F[X_t|Z_t]$ should satisfy the causal relation that $x(t)$ depends only on the past and/or current states of $z(t)$. However, $Q[Y_t|Z_t]$ may not necessarily be causal because the conditioning by histories biases the past states of the

Inference and estimation in probabilistic time series models

Inference and estimation in probabilistic time series models 1 Inference and estimation in probabilistic time series models David Barber, A Taylan Cemgil and Silvia Chiappa 11 Time series The term time series refers to data that can be represented as a sequence

More information

Information in Biology

Information in Biology Lecture 3: Information in Biology Tsvi Tlusty, tsvi@unist.ac.kr Living information is carried by molecular channels Living systems I. Self-replicating information processors Environment II. III. Evolve

More information

Information in Biology

Information in Biology Information in Biology CRI - Centre de Recherches Interdisciplinaires, Paris May 2012 Information processing is an essential part of Life. Thinking about it in quantitative terms may is useful. 1 Living

More information

Second law, entropy production, and reversibility in thermodynamics of information

Second law, entropy production, and reversibility in thermodynamics of information Second law, entropy production, and reversibility in thermodynamics of information Takahiro Sagawa arxiv:1712.06858v1 [cond-mat.stat-mech] 19 Dec 2017 Abstract We present a pedagogical review of the fundamental

More information

Chapter 2 Review of Classical Information Theory

Chapter 2 Review of Classical Information Theory Chapter 2 Review of Classical Information Theory Abstract This chapter presents a review of the classical information theory which plays a crucial role in this thesis. We introduce the various types of

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Physics Department Statistical Physics I Spring Term 2013 Notes on the Microcanonical Ensemble

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Physics Department Statistical Physics I Spring Term 2013 Notes on the Microcanonical Ensemble MASSACHUSETTS INSTITUTE OF TECHNOLOGY Physics Department 8.044 Statistical Physics I Spring Term 2013 Notes on the Microcanonical Ensemble The object of this endeavor is to impose a simple probability

More information

Biology as Information Dynamics

Biology as Information Dynamics Biology as Information Dynamics John Baez Stanford Complexity Group April 20, 2017 What is life? Self-replicating information! Information about what? How to self-replicate! It is clear that biology has

More information

Machine Learning Lecture Notes

Machine Learning Lecture Notes Machine Learning Lecture Notes Predrag Radivojac January 25, 205 Basic Principles of Parameter Estimation In probabilistic modeling, we are typically presented with a set of observations and the objective

More information

Endogenous Information Choice

Endogenous Information Choice Endogenous Information Choice Lecture 7 February 11, 2015 An optimizing trader will process those prices of most importance to his decision problem most frequently and carefully, those of less importance

More information

Bayesian Machine Learning - Lecture 7

Bayesian Machine Learning - Lecture 7 Bayesian Machine Learning - Lecture 7 Guido Sanguinetti Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh gsanguin@inf.ed.ac.uk March 4, 2015 Today s lecture 1

More information

Expectation propagation for signal detection in flat-fading channels

Expectation propagation for signal detection in flat-fading channels Expectation propagation for signal detection in flat-fading channels Yuan Qi MIT Media Lab Cambridge, MA, 02139 USA yuanqi@media.mit.edu Thomas Minka CMU Statistics Department Pittsburgh, PA 15213 USA

More information

Bioinformatics: Biology X

Bioinformatics: Biology X Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA Model Building/Checking, Reverse Engineering, Causality Outline 1 Bayesian Interpretation of Probabilities 2 Where (or of what)

More information

6.1 Main properties of Shannon entropy. Let X be a random variable taking values x in some alphabet with probabilities.

6.1 Main properties of Shannon entropy. Let X be a random variable taking values x in some alphabet with probabilities. Chapter 6 Quantum entropy There is a notion of entropy which quantifies the amount of uncertainty contained in an ensemble of Qbits. This is the von Neumann entropy that we introduce in this chapter. In

More information

How to Quantitate a Markov Chain? Stochostic project 1

How to Quantitate a Markov Chain? Stochostic project 1 How to Quantitate a Markov Chain? Stochostic project 1 Chi-Ning,Chou Wei-chang,Lee PROFESSOR RAOUL NORMAND April 18, 2015 Abstract In this project, we want to quantitatively evaluate a Markov chain. In

More information

Decentralized Stochastic Control with Partial Sharing Information Structures: A Common Information Approach

Decentralized Stochastic Control with Partial Sharing Information Structures: A Common Information Approach Decentralized Stochastic Control with Partial Sharing Information Structures: A Common Information Approach 1 Ashutosh Nayyar, Aditya Mahajan and Demosthenis Teneketzis Abstract A general model of decentralized

More information

QUANTIFYING CAUSAL INFLUENCES

QUANTIFYING CAUSAL INFLUENCES Submitted to the Annals of Statistics arxiv: arxiv:/1203.6502 QUANTIFYING CAUSAL INFLUENCES By Dominik Janzing, David Balduzzi, Moritz Grosse-Wentrup, and Bernhard Schölkopf Max Planck Institute for Intelligent

More information

arxiv: v1 [q-bio.pe] 28 Mar 2017

arxiv: v1 [q-bio.pe] 28 Mar 2017 Transitions in optimal adaptive strategies for populations in fluctuating environments arxiv:1703.09780v1 [q-bio.pe] 28 Mar 2017 Andreas Mayer, 1 Thierry Mora, 2 Olivier Rivoire, 3 and Aleksandra M. Walczak

More information

Expectation Maximization

Expectation Maximization Expectation Maximization Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr 1 /

More information

Foundations of Nonparametric Bayesian Methods

Foundations of Nonparametric Bayesian Methods 1 / 27 Foundations of Nonparametric Bayesian Methods Part II: Models on the Simplex Peter Orbanz http://mlg.eng.cam.ac.uk/porbanz/npb-tutorial.html 2 / 27 Tutorial Overview Part I: Basics Part II: Models

More information

Variational Inference (11/04/13)

Variational Inference (11/04/13) STA561: Probabilistic machine learning Variational Inference (11/04/13) Lecturer: Barbara Engelhardt Scribes: Matt Dickenson, Alireza Samany, Tracy Schifeling 1 Introduction In this lecture we will further

More information

Chapter 2: Entropy and Mutual Information. University of Illinois at Chicago ECE 534, Natasha Devroye

Chapter 2: Entropy and Mutual Information. University of Illinois at Chicago ECE 534, Natasha Devroye Chapter 2: Entropy and Mutual Information Chapter 2 outline Definitions Entropy Joint entropy, conditional entropy Relative entropy, mutual information Chain rules Jensen s inequality Log-sum inequality

More information

Information and Physics Landauer Principle and Beyond

Information and Physics Landauer Principle and Beyond Information and Physics Landauer Principle and Beyond Ryoichi Kawai Department of Physics University of Alabama at Birmingham Maxwell Demon Lerner, 975 Landauer principle Ralf Landauer (929-999) Computational

More information

Lecture 5 Channel Coding over Continuous Channels

Lecture 5 Channel Coding over Continuous Channels Lecture 5 Channel Coding over Continuous Channels I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw November 14, 2014 1 / 34 I-Hsiang Wang NIT Lecture 5 From

More information

Ergodicity and Non-Ergodicity in Economics

Ergodicity and Non-Ergodicity in Economics Abstract An stochastic system is called ergodic if it tends in probability to a limiting form that is independent of the initial conditions. Breakdown of ergodicity gives rise to path dependence. We illustrate

More information

5 Mutual Information and Channel Capacity

5 Mutual Information and Channel Capacity 5 Mutual Information and Channel Capacity In Section 2, we have seen the use of a quantity called entropy to measure the amount of randomness in a random variable. In this section, we introduce several

More information

Variational inference

Variational inference Simon Leglaive Télécom ParisTech, CNRS LTCI, Université Paris Saclay November 18, 2016, Télécom ParisTech, Paris, France. Outline Introduction Probabilistic model Problem Log-likelihood decomposition EM

More information

Evolutionary Dynamics and Extensive Form Games by Ross Cressman. Reviewed by William H. Sandholm *

Evolutionary Dynamics and Extensive Form Games by Ross Cressman. Reviewed by William H. Sandholm * Evolutionary Dynamics and Extensive Form Games by Ross Cressman Reviewed by William H. Sandholm * Noncooperative game theory is one of a handful of fundamental frameworks used for economic modeling. It

More information

DETECTING PROCESS STATE CHANGES BY NONLINEAR BLIND SOURCE SEPARATION. Alexandre Iline, Harri Valpola and Erkki Oja

DETECTING PROCESS STATE CHANGES BY NONLINEAR BLIND SOURCE SEPARATION. Alexandre Iline, Harri Valpola and Erkki Oja DETECTING PROCESS STATE CHANGES BY NONLINEAR BLIND SOURCE SEPARATION Alexandre Iline, Harri Valpola and Erkki Oja Laboratory of Computer and Information Science Helsinki University of Technology P.O.Box

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

Computational Systems Biology: Biology X

Computational Systems Biology: Biology X Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA L#8:(November-08-2010) Cancer and Signals Outline 1 Bayesian Interpretation of Probabilities Information Theory Outline Bayesian

More information

Machine Learning and Bayesian Inference. Unsupervised learning. Can we find regularity in data without the aid of labels?

Machine Learning and Bayesian Inference. Unsupervised learning. Can we find regularity in data without the aid of labels? Machine Learning and Bayesian Inference Dr Sean Holden Computer Laboratory, Room FC6 Telephone extension 6372 Email: sbh11@cl.cam.ac.uk www.cl.cam.ac.uk/ sbh11/ Unsupervised learning Can we find regularity

More information

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Gabriel Y. Weintraub, Lanier Benkard, and Benjamin Van Roy Stanford University {gweintra,lanierb,bvr}@stanford.edu Abstract

More information

Probabilistic and Bayesian Machine Learning

Probabilistic and Bayesian Machine Learning Probabilistic and Bayesian Machine Learning Lecture 1: Introduction to Probabilistic Modelling Yee Whye Teh ywteh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit University College London Why a

More information

Introduction to Information Entropy Adapted from Papoulis (1991)

Introduction to Information Entropy Adapted from Papoulis (1991) Introduction to Information Entropy Adapted from Papoulis (1991) Federico Lombardo Papoulis, A., Probability, Random Variables and Stochastic Processes, 3rd edition, McGraw ill, 1991. 1 1. INTRODUCTION

More information

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision
