Process-Based Risk Measures for Observable and Partially Observable Discrete-Time Controlled Systems


Jingnan Fan    Andrzej Ruszczyński

November 5, 2014; revised April 15, 2015

Abstract

For controlled discrete-time stochastic processes we introduce a new class of dynamic risk measures, which we call process-based. Their main feature is that they measure the risk of processes that are functions of the history of the base process. We introduce a new concept of conditional stochastic time consistency and derive the structure of process-based risk measures enjoying this property. We show that they can be equivalently represented by a collection of static law-invariant risk measures on the space of functions of the state of the base process. We apply this result to controlled Markov processes and derive dynamic programming equations. Next, we consider partially observable processes and derive the structure of stochastically conditionally time-consistent risk measures in this case. We prove that they can be represented by a sequence of law-invariant risk measures on the space of functions of the observable part of the state. We also prove the corresponding dynamic programming equations.

Keywords: Dynamic Risk Measures; Time Consistency; Partially Observable Markov Processes

1 Introduction

The main objective of this paper is to provide theoretical foundations of the theory of dynamic risk measures for controlled stochastic processes, in particular Markov processes, with only partial observation of the state. The theory of dynamic risk measures in discrete time has been intensively developed in the last 10 years (see [12, 43, 37, 39, 20, 10, 42, 2, 35, 26, 25, 11] and the references therein). The basic setting is the following: we have a probability space (Ω, F, P), a filtration {F_t}_{t=1,...,T} with a trivial F_1, and we define appropriate spaces Z_t of F_t-measurable random variables, t = 1,...,T.
For each t = 1,...,T, a mapping ρ_{t,T} : Z_T → Z_t is called a conditional risk measure. The central role in the theory is played by the concept of time consistency, which regulates the relations between the mappings ρ_{t,T} and ρ_{s,T} for different s and t.

Rutgers University, Department of Management Science and Information Systems, Piscataway, NJ 08854, USA, jf546@rutgers.edu
Rutgers University, Department of Management Science and Information Systems, Piscataway, NJ 08854, USA, rusz@rutgers.edu

One definition employed in the literature is the following: for all Z, W ∈ Z_T, if ρ_{t,T}(Z) ≤ ρ_{t,T}(W), then ρ_{s,T}(Z) ≤ ρ_{s,T}(W) for all s < t. This can be used to derive the recursive relation ρ_{t,T}(Z) = ρ_t(ρ_{t+1,T}(Z)), with simpler one-step conditional risk mappings ρ_t : Z_{t+1} → Z_t, t = 1,...,T−1. Much effort has been devoted to deriving dual representations of the conditional risk mappings and to studying their evolution in various settings.

When applied to controlled processes, and in particular to Markov decision problems, the theory of dynamic measures of risk encounters difficulties. The spaces Z_t become larger as t grows, and each one-step mapping ρ_t has different domain and range spaces. There is no easy way to exploit stationarity of the process, nor does a clear path exist to extend the theory to infinite-horizon problems. Moreover, no satisfactory theory of law-invariant dynamic risk measures exists which would be suitable for Markov control problems (the restrictive definitions of law invariance employed in [27] and [44] lead to conclusions of limited practical usefulness, while the translation of the approach of [46] to the Markov case appears to be difficult).

Motivated by these issues, in [41] we introduced a specific class of dynamic risk measures which is well suited for Markov problems. We postulated that the one-step conditional risk mappings ρ_t have a special form, which allows for their representation in terms of risk measures on a space of functions defined on the state space of the Markov process. This restriction allowed for the development of dynamic programming equations and corresponding solution methods, which generalize the well-known results for expected-value problems. Our ideas were successfully extended in [8, 7, 30, 45]. In this paper, we introduce and analyze a general class of risk measures, which we call process-based.
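The one-step recursion just described can be made concrete numerically. Below is a minimal sketch (ours, not from the paper) assuming a finite state space, Markov transition matrices, and deterministic stage costs; `one_step` plays the role of the one-step conditional risk mapping ρ_t, instantiated here by the risk-neutral expectation and by an entropic mapping.

```python
import math

def evaluate_dynamic_risk(Z, P, one_step):
    """Backward evaluation of rho_{t,T}(Z) = Z_t + rho_t(rho_{t+1,T}(Z)).

    Z        : list of T lists, Z[t][i] = cost at time t in state i
    P        : list of T-1 row-stochastic matrices, P[t][i][j] = prob of state j given state i
    one_step : (row of probabilities, next-stage values) -> scalar risk
    Returns rho_{1,T}(Z) as a list indexed by the initial state.
    """
    T = len(Z)
    rho = list(Z[T - 1])                       # rho_{T,T} = Z_T
    for t in range(T - 2, -1, -1):             # backward in time
        rho = [Z[t][i] + one_step(P[t][i], rho) for i in range(len(Z[t]))]
    return rho

# Two admissible one-step mappings: risk-neutral expectation and entropic.
def expectation(q, v):
    return sum(qi * vi for qi, vi in zip(q, v))

def entropic(q, v, gamma=1.0):
    return math.log(sum(qi * math.exp(gamma * vi) for qi, vi in zip(q, v))) / gamma

Z = [[1.0, 2.0], [0.0, 3.0]]                   # two stages, two states
P = [[[0.5, 0.5], [0.2, 0.8]]]
print(evaluate_dynamic_risk(Z, P, expectation))  # approximately [2.5, 4.4]
print(evaluate_dynamic_risk(Z, P, entropic))     # pointwise >= the risk-neutral value
```

Swapping `one_step` changes the risk preference while the recursion itself stays fixed, which is exactly the decomposition exploited later in Theorem 2.7.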
We consider a controlled process {X_t}_{t=1,...,T} taking values in a Polish space X (the state space), whose conditional distributions are described by controlled transition kernels Q_t : X^t × U → P(X), t = 1,...,T−1, where U is a certain control space. Any history-dependent (measurable) control u_t = π_t(x_1,...,x_t) is allowed. In this setting, we are only interested in measuring the risk of stochastic processes of the form Z_t = c_t(x_t, u_t), t = 1,...,T, where c_t : X × U → R can be any bounded measurable function. This restriction of the class of stochastic processes for which risk needs to be measured is one of the two cornerstones of our approach. The other cornerstone is our new concept of stochastic conditional time-consistency. It is more restrictive than the usual time consistency, because it involves conditional distributions and uses stochastic dominance rather than the pointwise order. These two foundations allow for the development of a theory of dynamic risk measures which can be fully described by a sequence of law-invariant risk measures on a space V of measurable functions on the state space X. In the special case of controlled Markov processes, we derive the structure postulated in [41], thus providing its solid theoretical foundations. We also derive dynamic programming equations in a much more general setting than that of [41].

In the extant literature, three basic approaches to introducing risk aversion in Markov decision processes have been employed: utility functions (see, e.g., [23, 24, 15]), mean-variance models (see, e.g., [47, 19, 31, 1]), and entropic (exponential) models (see, e.g., [22, 32, 13, 17, 29, 5]). Our approach generalizes the utility and exponential models; the mean-variance models do not satisfy, in general, the monotonicity and time-consistency conditions, which are relevant for our theory. However, a version satisfying these conditions has recently been proposed in [9].
In the second part of the paper, we extend our theory to partially observable controlled Markov processes. In the expected-value case, this classical topic is covered in many monographs (see, e.g., [21, 6, 4] and the references therein). The standard approach is to consider the belief state space, involving the space of probability distributions of the unobserved part of the state. The recent report [18] provides the state-of-the-art setting. The risk-averse case has so far been dealt with by the use of the entropic risk measure (see [36, 5]). Our main result in the

risk-averse case is that the conditional risk mappings can be equivalently modeled by risk measures on the space of functions defined on the observable part of the state only. We derive this observation in two parallel but equivalent ways: by considering a history-dependent description of the evolution of the observable part, and by considering belief states, that is, posterior distributions of the unobservable part. We also derive risk-averse dynamic programming equations for partially observable Markov models.

The paper is organized as follows. In sections 2.1–2.3, we formalize the basic model and review the concepts of risk measures and their time consistency. The first set of original results is presented in section 2.4: we introduce a new concept of stochastic conditional time consistency and characterize the structure of dynamic risk measures enjoying this property (Theorem 2.14). In section 3, we extend these ideas to the case of controlled processes, and we prove Theorem 3.3 on the structure of measures of risk in this case. These results are further specialized to controlled Markov processes in section 4. We introduce the concept of a Markov risk measure and derive its structure (Theorem 4.4). In section 4.2, we prove an analog of dynamic programming equations in this case. Section 5 is devoted to the analysis of the structure of risk measures for partially observable controlled Markov processes. We attack the problem in two different ways. In section 5.3, we focus on the observable part of the process and its history-dependent dynamics. The key result is Theorem 5.6, which simplifies the modeling of conditional risk measures in this case. In section 5.5, we consider the Markov model in the extended state space, involving the posterior distribution of the unobservable part (the belief state). In Theorem 5.8 we derive an equivalent representation of risk measures, which also reduces to law-invariant risk models on the space of functions of the next observation.
Finally, in section 5.6, we derive an analog of dynamic programming equations for risk-averse partially observable models.

2 Risk Measures Based on Observable Processes

In this section we introduce fundamental concepts and properties of dynamic risk measures for uncontrolled stochastic processes in discrete time. In subsection 2.1 we set up our probabilistic framework, and in subsections 2.2 and 2.3 we revisit some important concepts existing in the literature. Beginning from subsection 2.4, we introduce the new notions of conditional time consistency and stochastic conditional time-consistency, which are strictly stronger requirements on the dynamic risk measure than the standard concept of time consistency, and which are particularly useful for controlled stochastic processes. Based on these two concepts, we derive the structure of dynamic risk measures, which is based on so-called transition risk mappings.

2.1 Preliminaries

In all subsequent considerations, we work with a Borel subset (X, B(X)) of a Polish space and the canonical measurable space (X^T, B(X)^T), where T is a natural number and B(X)^T is the product σ-algebra. We use {X_t}_{t=1,...,T} to denote the discrete-time process of canonical projections. We also define H_t = X^t to be the space of possible histories up to time t, and we use h_t for a generic element of H_t: a specific history up to time t. The random vector (X_1,...,X_t) will be denoted by H_t. We assume that for all t = 1,...,T−1, the transition kernels, which describe the conditional distribution of X_{t+1} given X_1,...,X_t, are measurable functions

    Q_t : X^t → P(X),  t = 1,...,T−1,   (1)

where P(X) is the set of probability measures on (X, B(X)). These kernels, along with the initial distribution of X_1, define a unique probability measure P on the product space X^T with the product σ-algebra.

For the stochastic system described above, we consider a sequence of random variables {Z_t}_{t=1,...,T} taking values in R; we assume that lower values of Z_t are preferred (e.g., Z_t represents a cost at time t). We require that {Z_t}_{t=1,...,T} is bounded and adapted to {F_t}_{t=1,...,T}, the natural filtration generated by the process. In order to facilitate our discussion, we introduce the following spaces:

    Z_t = { Z : X^T → R : Z is F_t-measurable and bounded },  t = 1,...,T.   (2)

It is then equivalent to say that Z_t ∈ Z_t. We also introduce the spaces Z_{t,T} = Z_t × ··· × Z_T, t = 1,...,T. Since Z_t is F_t-measurable, a measurable function φ_t : X^t → R exists such that Z_t = φ_t(X_1,...,X_t). With a slight abuse of notation, we still use Z_t to denote this function.

2.2 Dynamic risk measures

In this subsection, we quickly review some definitions related to risk measures. Note that all relations (e.g., equality, inequality) between random variables are understood in the everywhere sense.

Definition 2.1. A mapping ρ_{t,T} : Z_{t,T} → Z_t, where 1 ≤ t ≤ T, is called a conditional risk measure if it satisfies the monotonicity property: for all (Z_t,...,Z_T) and (W_t,...,W_T) in Z_{t,T}, if Z_s ≤ W_s for all s = t,...,T, then ρ_{t,T}(Z_t,...,Z_T) ≤ ρ_{t,T}(W_t,...,W_T).

Definition 2.2. A conditional risk measure ρ_{t,T} : Z_{t,T} → Z_t
i) is normalized if ρ_{t,T}(0,...,0) = 0;
ii) is translation-invariant if for all (Z_t,...,Z_T) ∈ Z_{t,T}, ρ_{t,T}(Z_t,...,Z_T) = Z_t + ρ_{t,T}(0, Z_{t+1},...,Z_T).

Throughout the paper, we assume all conditional risk measures to be at least normalized. Translation-invariance is a fundamental property, which will also be frequently used; under normalization, it implies that ρ_{t,T}(Z_t, 0,...,0) = Z_t.

Definition 2.3.
A conditional risk measure ρ_{t,T} has the local property if for any (Z_t,...,Z_T) ∈ Z_{t,T} and for any event A ∈ F_t, we have 1_A ρ_{t,T}(Z_t,...,Z_T) = ρ_{t,T}(1_A Z_t,...,1_A Z_T).

The local property is important because it means that the conditional risk measure at time t, restricted to any F_t-event A, is not influenced by the values that Z_t,...,Z_T take on A^c.

Definition 2.4. A dynamic risk measure ρ = {ρ_{t,T}}_{t=1,...,T} is a sequence of conditional risk measures ρ_{t,T} : Z_{t,T} → Z_t. We say that ρ is translation-invariant or has the local property if all ρ_{t,T}, t = 1,...,T, satisfy the respective conditions of Definitions 2.2 or 2.3, respectively.

2.3 Time Consistency

The notion of time-consistency can be formulated in different ways, with weaker or stronger assumptions, but the key idea is that if one sequence of costs, compared to another sequence, has the same current cost and lower risk in the future, then it should have lower current risk. In this and the next subsection, we discuss three formulations of time-consistency, from the weakest to the strongest one. We also show how the tower property (the recursive relation between ρ_{t,T} and ρ_{t+1,T} that time-consistency implies) improves with more refined time-consistency concepts. The following definition of time consistency was employed in [41].

Definition 2.5. A dynamic risk measure {ρ_{t,T}}_{t=1,...,T} is time-consistent if and only if for any 1 ≤ τ < θ ≤ T and for any (Z_τ,...,Z_T) and (W_τ,...,W_T) in Z_{τ,T}, the conditions

    Z_t = W_t,  t = τ,...,θ−1,
    ρ_{θ,T}(Z_θ,...,Z_T) ≤ ρ_{θ,T}(W_θ,...,W_T),

imply ρ_{τ,T}(Z_τ,...,Z_T) ≤ ρ_{τ,T}(W_τ,...,W_T).

The following proposition simplifies the definition of a time-consistent measure, which will be useful in further derivations. The proof is omitted; it can be found in [41].

Proposition 2.6. A dynamic risk measure {ρ_{t,T}}_{t=1,...,T} is time-consistent if for any 1 ≤ t < T and for all (Z_t,...,Z_T), (W_t,...,W_T) ∈ Z_{t,T}, the conditions

    Z_t = W_t,
    ρ_{t+1,T}(Z_{t+1},...,Z_T) ≤ ρ_{t+1,T}(W_{t+1},...,W_T),

imply that ρ_{t,T}(Z_t,...,Z_T) ≤ ρ_{t,T}(W_t,...,W_T).

It turns out that a translation-invariant and time-consistent dynamic risk measure can be decomposed into, and then reconstructed from, so-called one-step conditional risk mappings.

Theorem 2.7 ([41]). A dynamic risk measure {ρ_{t,T}}_{t=1,...,T} is translation-invariant and time-consistent if and only if there exist monotonic mappings ρ_t : Z_{t+1} → Z_t with ρ_t(0) = 0, t = 1,...,T−1, called one-step conditional risk mappings, such that for all t = 1,...,T−1,

    ρ_{t,T}(Z_t,...,Z_T) = Z_t + ρ_t(ρ_{t+1,T}(Z_{t+1},...,Z_T)).   (3)
As an immediate comment, we would like to point out that translation invariance and time consistency do not imply the local property, as is demonstrated by the following example.

Example 2.8. An example of a dynamic risk measure which is translation-invariant and time-consistent but fails to have the local property: T = 2, X = {1, 2},

    ρ_{1,2}(Z_1, Z_2)(x_1) = Z_1(x_1) + E[Z_2 | X_1 = x_1],  x_1 ∈ {1, 2},
    ρ_{2,2}(Z_2)(x_1, x_2) = Z_2(x_1, x_2),  (x_1, x_2) ∈ {1, 2}^2.

Conceptually, the one-step conditional risk mappings play a role similar to one-step conditional expectations, and will be very useful when the tower property is involved. At this stage, without further refinement of assumptions, it remains a fairly abstract and general object that is

hard to characterize. In [41], a special variational form of these one-step conditional risk mappings is imposed, which is well suited for Markovian applications, while it is unclear whether other forms of such mappings exist. In order to gain a deeper understanding of these concepts, in the following we introduce two stronger notions of time consistency, and we argue that any one-step conditional risk mapping is of the variational form postulated in [41]. To this end, we use the particular structure of the space (X^T, B(X)^T) and the way a probability measure is defined on this space.

2.4 Stochastic Conditional Time-Consistency and Transition Risk Mappings

We now refine the concept of time-consistency for process-based risk measures.

Definition 2.9. A dynamic risk measure {ρ_{t,T}}_{t=1,...,T} is conditionally time-consistent if for any 1 ≤ t ≤ T−1, for any h_t ∈ X^t, and for all (Z_t,...,Z_T), (W_t,...,W_T) ∈ Z_{t,T}, the conditions

    Z_t(h_t) = W_t(h_t),   (4)
    ρ_{t+1,T}(Z_{t+1},...,Z_T)(h_t, ·) ≤ ρ_{t+1,T}(W_{t+1},...,W_T)(h_t, ·),

imply ρ_{t,T}(Z_t,...,Z_T)(h_t) ≤ ρ_{t,T}(W_t,...,W_T)(h_t).

Proposition 2.10. Conditional time-consistency implies time-consistency and the local property.

Proof. If {ρ_{t,T}}_{t=1,...,T} is conditionally time-consistent, then it satisfies the conditions of Proposition 2.6 and is thus time-consistent. Let us prove by induction on t, from T down to 1, that the ρ_{t,T} satisfy the local property. Clearly, ρ_{T,T} does: if A ∈ F_T, then 1_A ρ_{T,T}(Z_T) = 1_A Z_T = ρ_{T,T}(1_A Z_T). Suppose ρ_{t+1,T} satisfies the local property for some 1 ≤ t < T, and consider any A ∈ F_t, any h_t ∈ X^t, and any (Z_t,...,Z_T) ∈ Z_{t,T}. Two cases may occur. If 1_A(h_t) = 0, then [1_A Z_t](h_t) = 0, and the local property for t+1 yields

    [ρ_{t+1,T}(1_A Z_{t+1},...,1_A Z_T)](h_t, ·) = [1_A ρ_{t+1,T}(Z_{t+1},...,Z_T)](h_t, ·) = 0.

By conditional time-consistency, ρ_{t,T}(1_A Z_t, 1_A Z_{t+1},...,1_A Z_T)(h_t) = ρ_{t,T}(0,...,0)(h_t) = 0.
If 1_A(h_t) = 1, then [1_A Z_t](h_t) = Z_t(h_t), and the local property for t+1 implies that

    [ρ_{t+1,T}(1_A Z_{t+1},...,1_A Z_T)](h_t, ·) = [1_A ρ_{t+1,T}(Z_{t+1},...,Z_T)](h_t, ·) = ρ_{t+1,T}(Z_{t+1},...,Z_T)(h_t, ·).

By conditional time-consistency, ρ_{t,T}(1_A Z_t,...,1_A Z_T)(h_t) = ρ_{t,T}(Z_t,...,Z_T)(h_t). In both cases, ρ_{t,T}(1_A Z_t,...,1_A Z_T)(h_t) = [1_A ρ_{t,T}(Z_t,...,Z_T)](h_t).

Conditional time-consistency is thus a stronger property than time-consistency; it allows us to exclude risk measures which do not satisfy the local property, like Example 2.8.

Generally, the definition of dynamic risk measures in Section 2.2 and the definition of the time-consistency property in Section 2.3 are valid for any filtration on an underlying probability space (Ω, F, P), not only for the process-generated filtration. However, from Section 2.4 to the end of the paper, we consider only dynamic risk measures defined with the filtration generated by a specific process, because we are interested in those risk measures which can be evaluated on each specific history path. That is why we call these risk measures process-based. The following theorem shows that conditional time-consistency also yields a refined version of the one-step risk mapping. Throughout the paper, we define V to be the set of all bounded measurable functions on X; it turns out that such mappings can be equivalently represented by risk measures on V.

Theorem 2.11. A dynamic risk measure {ρ_{t,T}}_{t=1,...,T} is translation-invariant and conditionally time-consistent if and only if there exist functionals σ_t : X^t × V → R, t = 1,...,T−1, called transition risk mappings, such that

i) for all t = 1,...,T−1 and all h_t ∈ X^t, the functional σ_t(h_t, ·) is a normalized and monotonic risk measure on V;

ii) for all t = 1,...,T−1, for all (Z_t,...,Z_T) ∈ Z_{t,T}, and for all h_t ∈ X^t, we have the recursive relation

    ρ_{t,T}(Z_t,...,Z_T)(h_t) = Z_t(h_t) + σ_t(h_t, ρ_{t+1,T}(Z_{t+1},...,Z_T)(h_t, ·)).   (5)

Moreover, for all t = 1,...,T−1 the functional σ_t is uniquely determined by ρ_{t,T} as follows: for every h_t ∈ X^t and every v ∈ V,

    σ_t(h_t, v) = ρ_{t,T}(0, V, 0,...,0)(h_t),   (6)

where V ∈ Z_{t+1} satisfies the equation V(h_t, ·) = v(·), and can be arbitrary elsewhere.

Proof. Assume {ρ_{t,T}}_{t=1,...,T} is translation-invariant and conditionally time-consistent. For any h_t ∈ X^t, formula (6) defines a normalized and monotonic risk measure on the space V.
Setting v(x) = ρ_{t+1,T}(Z_{t+1},...,Z_T)(h_t, x), x ∈ X, and

    V(h_{t+1}) = { v(x), if h_{t+1} = (h_t, x);  0, otherwise, }

we obtain, by conditional time-consistency,

    ρ_{t+1,T}(V, 0,...,0)(h_t, ·) = ρ_{t+1,T}(Z_{t+1},...,Z_T)(h_t, ·).

Thus, by the translation property,

    ρ_{t,T}(Z_t,...,Z_T)(h_t) = Z_t(h_t) + ρ_{t,T}(0, Z_{t+1},...,Z_T)(h_t)
                             = Z_t(h_t) + ρ_{t,T}(0, V, 0,...,0)(h_t)
                             = Z_t(h_t) + σ_t(h_t, v).

This chain of relations also proves the uniqueness of σ_t.

On the other hand, if such transition risk mappings exist, then {ρ_{t,T}}_{t=1,...,T} is conditionally time-consistent by the monotonicity of σ_t(h_t, ·). We can then use (5) to obtain, for any t = 1,...,T−1 and for all h_t ∈ X^t, the identity

    ρ_{t,T}(0, Z_{t+1},...,Z_T)(h_t) = σ_t(h_t, ρ_{t+1,T}(Z_{t+1},...,Z_T)(h_t, ·)) = ρ_{t,T}(Z_t,...,Z_T)(h_t) − Z_t(h_t),

which is the translation invariance of ρ_{t,T}.

A further refinement of the concept of time-consistency can be obtained by the use of the transition kernels Q_t and stochastic, rather than pointwise, orders. We stress that, unlike the previous notions of time-consistency, the new one depends on the underlying kernels Q_t.

Definition 2.12. A dynamic risk measure {ρ_{t,T}}_{t=1,...,T} is stochastically conditionally time-consistent with respect to {Q_t}_{t=1,...,T−1} if for any 1 ≤ t ≤ T−1, for any h_t ∈ X^t, and for all (Z_t,...,Z_T), (W_t,...,W_T) ∈ Z_{t,T}, the conditions

    Z_t(h_t) = W_t(h_t),   (7)
    ( ρ_{t+1,T}(Z_{t+1},...,Z_T) | H_t = h_t ) ≤_st ( ρ_{t+1,T}(W_{t+1},...,W_T) | H_t = h_t ),

imply

    ρ_{t,T}(Z_t,...,Z_T)(h_t) ≤ ρ_{t,T}(W_t,...,W_T)(h_t),   (8)

where the relation ≤_st stands for the conditional stochastic order, understood as follows:

    Q_t(h_t)({ x ∈ X : ρ_{t+1,T}(Z_{t+1},...,Z_T)(h_t, x) > η }) ≤ Q_t(h_t)({ x ∈ X : ρ_{t+1,T}(W_{t+1},...,W_T)(h_t, x) > η }),  for all η ∈ R.

When the choice of the underlying transition kernels is clear from the context, we will simply say that the dynamic risk measure is stochastically conditionally time-consistent. We first note that this new notion of time-consistency is even stronger than the conditional time-consistency introduced in Definition 2.9.

Proposition 2.13. Stochastic conditional time-consistency implies conditional time-consistency.

Proof. Suppose {ρ_{t,T}}_{t=1,...,T} is stochastically conditionally time-consistent. Fix t = 1,...,T−1, and let h_t and (Z_t,...,Z_T), (W_t,...,W_T) satisfy the conditions (4) of Definition 2.9.
Then for all η ∈ R we obtain

    { x ∈ X : ρ_{t+1,T}(Z_{t+1},...,Z_T)(h_t, x) > η } ⊆ { x ∈ X : ρ_{t+1,T}(W_{t+1},...,W_T)(h_t, x) > η },

and thus (8) follows.

It comes as no surprise that with this stronger notion of time-consistency, the mapping σ_t appearing in Theorem 2.11 has a more refined characterization. The following theorem proves that this extra characterization is the law-invariance property.

Theorem 2.14. A process-based dynamic risk measure {ρ_{t,T}}_{t=1,...,T} is translation-invariant and stochastically conditionally time-consistent if and only if there exist functionals σ_t : graph(Q_t) × V → R, t = 1,...,T−1, such that

i) for all t = 1,...,T−1 and all h_t ∈ X^t, the functional σ_t(h_t, Q_t(h_t), ·) is a normalized, monotonic, and law-invariant risk measure on V with respect to the distribution Q_t(h_t);

ii) for all t = 1,...,T−1, for all (Z_t,...,Z_T) ∈ Z_{t,T}, and for all h_t ∈ X^t,

    ρ_{t,T}(Z_t,...,Z_T)(h_t) = Z_t(h_t) + σ_t(h_t, Q_t(h_t), ρ_{t+1,T}(Z_{t+1},...,Z_T)(h_t, ·)).   (9)

Moreover, for all t = 1,...,T−1, σ_t is uniquely determined by ρ_{t,T} as follows: for every h_t ∈ X^t and every v ∈ V,

    σ_t(h_t, Q_t(h_t), v) = ρ_{t,T}(0, V, 0,...,0)(h_t),   (10)

where V ∈ Z_{t+1} satisfies the equation V(h_t, ·) = v(·), and can be arbitrary elsewhere.

Proof. On the one hand, if {ρ_{t,T}}_{t=1,...,T} is translation-invariant and stochastically conditionally time-consistent, then by Proposition 2.13 and Theorem 2.11 there exist normalized and monotonic risk measures σ_t(h_t, Q_t(h_t), ·), h_t ∈ X^t, t = 1,...,T−1, that satisfy (9) and (6). We need only verify the postulated law invariance of σ_t(h_t, Q_t(h_t), ·). If V, V′ ∈ Z_{t+1} have the same conditional distribution given h_t, then Definition 2.12 implies that ρ_{t,T}(0, V, 0,...,0)(h_t) = ρ_{t,T}(0, V′, 0,...,0)(h_t), and law invariance follows from (6). On the other hand, if such {σ_t}_{t=1,...,T−1} exist, the law-invariance of σ_t(h_t, Q_t(h_t), ·) implies the stochastic conditional time-consistency.

Remark 2.15. With a slight abuse of notation, we included the distribution Q_t(h_t) as an argument of the transition risk mapping, in view of the application to controlled processes.

Example 2.16. In the theory of risk-sensitive Markov decision processes, the following family of entropic risk measures is employed (see [22, 36, 32, 14, 40, 13, 17, 29, 5]):

    ρ_{t,T}(Z_t,...,Z_T) = (1/γ) ln( E[ exp( γ Σ_{s=t}^T Z_s ) | F_t ] ),  t = 1,...,T,  γ > 0.
It is stochastically conditionally time-consistent, and corresponds to the transition risk mapping

    σ_t(h_t, q, v) = (1/γ) ln( E_q[ e^{γv} ] ) = (1/γ) ln( ∫ e^{γ v(x)} q(dx) ),  γ > 0.   (11)

We could also make γ in (11) dependent on the time t, the current state x_t, or even the entire history h_t, and still obtain a stochastically conditionally time-consistent dynamic risk measure.

Example 2.17. The following transition risk mapping satisfies the conditions of Theorem 2.14 and corresponds to a stochastically conditionally time-consistent dynamic risk measure:

    σ_t(h_t, q, v) = ∫ v(s) q(ds) + ϰ_t(h_t) ( ∫ ( v(s) − ∫ v(s′) q(ds′) )_+^p q(ds) )^{1/p},   (12)

where ϰ_t : X^t → [0, 1] is a measurable function and p ∈ [1, +∞). It is an analogue of the static mean-semideviation measure of risk, whose consistency with stochastic dominance is well known [33, 34]. In the construction of a dynamic risk measure, we use q = Q_t(h_t). If ϰ_t depends on x_t only, the mapping (12) corresponds to a Markov risk measure discussed in sections 4.1 and 4.2.
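Both transition risk mappings, (11) and (12), are easy to evaluate when q has finite support. The sketch below is our illustration (the function names are not from the paper); it computes both mappings for a fixed discrete q, with the history argument h_t suppressed.

```python
import math

def entropic(q, v, gamma=1.0):
    # Formula (11): (1/gamma) * ln( E_q[ exp(gamma * v) ] ).
    return math.log(sum(qi * math.exp(gamma * vi) for qi, vi in zip(q, v))) / gamma

def mean_semideviation(q, v, kappa=0.5, p=1):
    # Formula (12): E_q[v] + kappa * ( E_q[ ((v - E_q[v])_+)^p ] )^(1/p).
    mean = sum(qi * vi for qi, vi in zip(q, v))
    semidev = sum(qi * max(vi - mean, 0.0) ** p for qi, vi in zip(q, v))
    return mean + kappa * semidev ** (1.0 / p)

q = [0.25, 0.25, 0.5]     # a finite-support distribution
v = [1.0, 2.0, 4.0]       # a cost function on three states; E_q[v] = 2.75
print(mean_semideviation(q, v))   # 2.75 + 0.5 * 0.625 = 3.0625
print(entropic(q, v))             # exceeds the mean 2.75 (risk aversion)
```

Both functions depend on v only through the distribution (v; q), which is exactly the law invariance required by Theorem 2.14.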

Example 2.18. Our use of the stochastic dominance relation in the definition of stochastic conditional time consistency rules out some candidates for transition risk mappings. Suppose σ_t(h_t, q, v) = v(x_1), where x_1 is a selected state. Such a mapping is a coherent measure of risk as a function of the last argument, and may be law invariant. In particular, it is law invariant with X = {x_1, x_2}, q(x_1) = 1/3, q(x_2) = 2/3, v(x_1) = 3, v(x_2) = 1, w(x_1) = 2, w(x_2) = 4. For this mapping, we have v ≤_st w under q, but σ_t(h_t, q, v) > σ_t(h_t, q, w), and thus the condition of stochastic conditional time consistency is violated. We consciously exclude such cases because, in controlled systems, which will be discussed in the next section, the second argument (q) is the only one that depends on our decisions. It should be included in the definition of our preferences if practically meaningful results are to be obtained.

3 Risk Measures for Controlled Stochastic Processes

We now extend the setting of Section 2 by making the kernels (1) depend on some control variables u_t.

3.1 The Model

We still work with the process {X_t}_{t=1,...,T} on the space X^T and introduce a measurable control space (U, B(U)). At each time t, we observe the state x_t and then apply a control u_t ∈ U. We assume that the admissible control sets and the transition kernels (conditional distributions of the next state) depend on all currently-known state and control values. More precisely, we make the following assumptions:

1. For all t = 1,...,T, we require that u_t ∈ U_t(x_1, u_1,...,x_{t−1}, u_{t−1}, x_t), where U_t : G_t ⇉ U is a measurable multifunction, and G_1,...,G_T are the sets of histories of all currently-known state and control values before applying each control:

    G_1 = X,  G_{t+1} = graph(U_t) × X,  t = 1,...,T−1;

2.
For all t = 1,...,T−1, the control-dependent transition kernels

    Q_t : graph(U_t) → P(X),  t = 1,...,T−1,   (13)

are measurable, and for all t = 1,...,T−1 and all (x_1, u_1,...,x_t, u_t) ∈ graph(U_t), Q_t(x_1, u_1,...,x_t, u_t) describes the conditional distribution of X_{t+1}, given the currently-known states and controls.

The reason why we allow U_t and Q_t to depend on all known state and control values is that in partially observable Markov decision problems, which we study in Section 5, the conditional distribution of the next observable part of the state does depend on all past controls. We plan to apply the results of the present section to the partially observable case.

For this controlled process, a (deterministic) history-dependent admissible policy π = (π_1,...,π_T) is a sequence of measurable selectors, called decision rules, π_t : G_t → U such that π_t(g_t) ∈ U_t(g_t) for all g_t ∈ G_t. We can easily prove by induction on t that for such an admissible policy π, each π_t reduces to a measurable function on X^t, because u_s = π_s(h_s) for all s = 1,...,t−1. We still use π_s to denote the decision rule, although formally it is a different function; this will not lead to any misunderstanding. The set of admissible policies is

    Π := { π = (π_1,...,π_T) : for all t and all (x_1,...,x_t) ∈ X^t, π_t(x_1,...,x_t) ∈ U_t( x_1, π_1(x_1),...,x_{t−1}, π_{t−1}(x_1,...,x_{t−1}), x_t ) }.   (14)

For any fixed policy π ∈ Π, the transition kernels can be rewritten as measurable functions from X^t to P(X):

    Q_t^π : (x_1,...,x_t) ↦ Q_t( x_1, π_1(x_1),...,x_t, π_t(x_1,...,x_t) ),  t = 1,...,T−1,   (15)

just like the transition kernels of the uncontrolled case given in (1), but indexed by π. Thus, for any policy π ∈ Π, we can consider {X_t}_{t=1,...,T} as an uncontrolled process on the probability space (X^T, B(X)^T, P^π), with P^π defined by {Q_t^π}_{t=1,...,T−1}, which is adapted to the policy-independent filtration {F_t}_{t=1,...,T}. As before and throughout this paper, h_t ∈ X^t will stand for (x_1,...,x_t). We still use the same spaces Z_t, t = 1,...,T, as defined in (2), for the costs incurred at each stage; these spaces also allow us to consider control-dependent costs as collections of policy-indexed costs in Z_{1,T}. Thus, we are able to classify (time-consistent) dynamic risk measures ρ^π for each fixed π ∈ Π, as in Section 2. Note that the ρ^π are defined on the same spaces, independently of π, because the filtration and the spaces Z_t, t = 1,...,T, do not depend on π; however, we do need to index the measures of risk by the policy π, because the transition kernels and, consequently, the probability measure on the space X^T, depend on π.

3.2 Stochastic Conditional Time-Consistency and Transition Risk Mappings

In the previous subsection we discussed a family of dynamic risk measures (ρ^π)_{π∈Π} such that for each fixed π ∈ Π, ρ^π is a dynamic risk measure. In future considerations, we will compare risk levels among different policies, so a meaningful order on (ρ^π)_{π∈Π} is needed. It turns out that our concept of stochastic conditional time-consistency can be extended to this setting and helps us achieve this goal.

Definition 3.1.
A family of process-based dynamic risk measures {ρ_{t,T}^π : π ∈ Π, t = 1,...,T} is stochastically conditionally time-consistent if for any π, π′ ∈ Π, for any 1 ≤ t < T, for all h_t ∈ X^t, for all (Z_t,...,Z_T) ∈ Z_{t,T} and all (W_t,...,W_T) ∈ Z_{t,T}, the conditions

    Z_t(h_t) = W_t(h_t),
    ( ρ_{t+1,T}^π(Z_{t+1},...,Z_T) | H_t = h_t ) ≤_st ( ρ_{t+1,T}^{π′}(W_{t+1},...,W_T) | H_t = h_t ),

imply

    ρ_{t,T}^π(Z_t,...,Z_T)(h_t) ≤ ρ_{t,T}^{π′}(W_t,...,W_T)(h_t).

Remark 3.2. As in Definition 2.12, the conditional stochastic order ≤_st should be understood as follows: for all η ∈ R we have

    Q_t^π(h_t)({ x ∈ X : ρ_{t+1,T}^π(Z_{t+1},...,Z_T)(h_t, x) > η }) ≤ Q_t^{π′}(h_t)({ x ∈ X : ρ_{t+1,T}^{π′}(W_{t+1},...,W_T)(h_t, x) > η }).

The above definition of stochastic conditional time-consistency of a family of dynamic risk measures (ρ^π)_{π∈Π} plays a pivotal role in building an intrinsic connection among them, as we explain below. If a family of process-based dynamic risk measures {ρ_{t,T}^π : π ∈ Π, t = 1,...,T} is

stochastically conditionally time-consistent, then for each fixed π ∈ Π the process-based dynamic risk measure {ρ_{t,T}^π}_{t=1,...,T} is stochastically conditionally time-consistent, as defined in Definition 2.12. By virtue of Theorem 2.14, for each π ∈ Π there exist functionals σ_t^π : graph(Q_t^π) × V → R, t = 1,...,T−1, such that for all t = 1,...,T−1 and all h_t ∈ X^t, the functional σ_t^π(h_t, Q_t^π(h_t), ·) is a law-invariant risk measure on V with respect to the distribution Q_t^π(h_t), and

    ρ_{t,T}^π(Z_t,...,Z_T)(h_t) = Z_t(h_t) + σ_t^π( h_t, Q_t^π(h_t), ρ_{t+1,T}^π(Z_{t+1},...,Z_T)(h_t, ·) ),  h_t ∈ X^t.

Consider any π, π′ ∈ Π, h_t ∈ X^t, and (Z_t,...,Z_T) ∈ Z_{t,T}, (W_t,...,W_T) ∈ Z_{t,T} such that

    Z_t(h_t) = W_t(h_t),
    Q_t^π(h_t) = Q_t^{π′}(h_t),
    ρ_{t+1,T}^π(Z_{t+1},...,Z_T)(h_t, ·) = ρ_{t+1,T}^{π′}(W_{t+1},...,W_T)(h_t, ·).

Then we have

    ( ρ_{t+1,T}^π(Z_{t+1},...,Z_T) | H_t = h_t ) =_st ( ρ_{t+1,T}^{π′}(W_{t+1},...,W_T) | H_t = h_t ),

where the relation =_st means that both ≤_st and ≥_st are true; in other words, equality in law. Because of the stochastic conditional time-consistency,

    ρ_{t,T}^π(Z_t,...,Z_T)(h_t) = ρ_{t,T}^{π′}(W_t,...,W_T)(h_t),

whence

    σ_t^π( h_t, Q_t^π(h_t), ρ_{t+1,T}^π(Z_{t+1},...,Z_T)(h_t, ·) ) = σ_t^{π′}( h_t, Q_t^{π′}(h_t), ρ_{t+1,T}^{π′}(W_{t+1},...,W_T)(h_t, ·) ).

This proves that in fact σ^π does not depend on π, and all dependence on π is carried by the controlled kernel Q_t^π. This is a highly desirable property when we apply dynamic risk measures to a control problem. We summarize this important observation in the following theorem, which extends Theorem 2.14 to the case of controlled processes.

Theorem 3.3. A family of process-based dynamic risk measures {ρ_{t,T}^π : π ∈ Π, t = 1,...,T} is translation-invariant and stochastically conditionally time-consistent if and only if there exist functionals

    σ_t : ⋃_{π∈Π} graph(Q_t^π) × V → R,  t = 1,...,T−1,
T 1, π Π such that i) For all t = 1,..., T 1 and all h t t, σ t h t,, ) is normalized and satisfies the following strong monotonicity with respect to stochastic dominance: q 1, q 2 { Q π t h t) : π Π }, v 1, v 2 V, v 1 ; q 1 ) st v 2 ; q 2 ) σ t h t, q 1, v 1 ) σ t h t, q 2, v 2 ), where v; q) = q v 1 means the distribution of v under q; 12

ii) For all $\pi\in\Pi$, for all $t=1,\dots,T-1$, for all $(Z_t,\dots,Z_T)\in\mathcal Z_{t,T}$, and for all $h_t\in\mathcal H_t$,
\[
\rho^\pi_{t,T}(Z_t,\dots,Z_T)(h_t)=Z_t(h_t)+\sigma_t\big(h_t,Q^\pi_t(h_t),\rho^\pi_{t+1,T}(Z_{t+1},\dots,Z_T)(h_t,\cdot)\big).\tag{16}
\]
Moreover, for all $t=1,\dots,T-1$, $\sigma_t$ is uniquely determined by $\rho_{t,T}$ as follows: for every $h_t\in\mathcal H_t$, for every $q\in\{Q^\pi_t(h_t):\pi\in\Pi\}$, and for every $v\in\mathcal V$,
\[
\sigma_t(h_t,q,v)=\rho^\pi_{t,T}(0,V,0,\dots,0)(h_t),\tag{17}
\]
where $\pi$ is any admissible policy such that $q=Q^\pi_t(h_t)$, and $V\in\mathcal Z_{t+1}$ satisfies the equation $V(h_t,\cdot)=v(\cdot)$ and can be arbitrary elsewhere.

Proof. We have shown the existence of $\{\sigma_t\}_{t=1,\dots,T-1}$ satisfying (16) and (17) in the discussion preceding the theorem. We can verify the strong law-invariance by (17) and the definitions of Section 2.

3.1 Strong monotonicity with respect to stochastic dominance

In Theorem 3.3 we introduced the notion of strong monotonicity with respect to stochastic dominance, and in this subsection we shed more light on this notion. For a measurable space $\mathcal X$, we use $\mathcal P(\mathcal X)$ to denote the space of probability measures on $\mathcal X$, and we use $\mathcal S$ to denote a subset of $\mathcal P(\mathcal X)$. Recall that $\mathcal V$ stands for the set of bounded measurable functions on $\mathcal X$. We say that a function $\sigma:\mathcal S\times\mathcal V\to\mathbb R$ is *strongly monotonic with respect to stochastic dominance on $\mathcal S$* if
\[
\forall\,q_1,q_2\in\mathcal S,\ \forall\,v_1,v_2\in\mathcal V:\quad (v_1;q_1)\preceq_{\rm st}(v_2;q_2)\ \Longrightarrow\ \sigma(q_1,v_1)\le\sigma(q_2,v_2).\tag{18}
\]
The property of strong monotonicity with respect to stochastic dominance is strictly stronger than the usual monotonicity and law-invariance of $\sigma(q,\cdot)$ with respect to $q$ for all $q\in\mathcal S$. We remind the reader that the usual law-invariance means that for all $v_1$ and $v_2$ having the same distribution under $q$, we must have $\sigma(q,v_1)=\sigma(q,v_2)$. Furthermore, a mapping $\sigma$ which is strongly monotonic with respect to stochastic dominance is nothing else but a function that evaluates the distributions of $v$ under $q$, for all $v\in\mathcal V$ and $q\in\mathcal S$, and preserves the (first-order) stochastic dominance.
Thus we can rewrite it as a function of distributions, or a function of cumulative distribution functions,
\[
F_{v;q}(\alpha)=q\big(\{x\in\mathcal X: v(x)\le\alpha\}\big),\qquad \alpha\in\mathbb R.
\]

Proposition 3.4. A function $\sigma:\mathcal S\times\mathcal V\to\mathbb R$ is strongly monotonic with respect to stochastic dominance on $\mathcal S$ if and only if there exists $\tilde\sigma:\{F_{v;q}: v\in\mathcal V,\ q\in\mathcal S\}\to\mathbb R$ such that $\sigma(q,v)=\tilde\sigma(F_{v;q})$ for all $v\in\mathcal V$ and $q\in\mathcal S$, and for all $F_1,F_2\in\{F_{v;q}: v\in\mathcal V,\ q\in\mathcal S\}$,
\[
F_1\ge F_2\ \Longrightarrow\ \tilde\sigma(F_1)\le\tilde\sigma(F_2).\tag{19}
\]

Consequently, to construct risk measures which are strongly monotonic with respect to stochastic dominance, one possibility is to take any function of distributions satisfying (19) and the normalization condition. Such functions include the expected value, the mean-upper-semideviations, as in Example 2.18, or, more generally, coherent risk measures given by the Kusuoka representation:
\[
\sigma(q,v)=\sup_{\mu\in\mathcal M}\int_0^1 \mathrm{AVaR}_{1-\alpha}(v;q)\,\mu(d\alpha),\qquad q\in\mathcal P(\mathcal X),\ v\in\mathcal V,\tag{20}
\]

where $\mathcal M$ is a convex subset of $\mathcal P((0,1])$, and $\mathrm{AVaR}$ is defined by
\[
\mathrm{AVaR}_{1-\alpha}(v;q)=\frac{1}{1-\alpha}\int_\alpha^1 F^{-1}_{v;q}(s)\,ds,
\]
with $F^{-1}_{v;q}$ being the quantile function of $v$ under $q$. Even more general mappings are monotonic and normalized risk measures on the space of bounded quantile functions, as discussed in [16]. We can also include all non-decreasing transformations of such measures, provided that they satisfy the normalization condition, such as the entropic risk measure discussed in Section 2. Note that $\{F_{v;q}: v\in\mathcal V,\ q\in\mathcal S\}$ does not always span all bounded distributions on $\mathbb R$, especially when $\mathcal X$ is a discrete space; thus, we also allow different forms for each $q$, as shown in the following example.

Example 3.5. Let $\mathcal X=\{1,2,3\}$, $\mathcal S=\big\{q^1=(0.1,0.1,0.8);\ q^2=(\tfrac13,\tfrac13,\tfrac13)\big\}$, and
\[
\sigma(q,v)=\begin{cases} \mathbb E_{q^1}(v)+\kappa\,\mathbb E_{q^1}\big([v-\mathbb E_{q^1}(v)]_+\big), & \text{if } q=q^1,\\[3pt] 0.4\max\{v(1),v(2),v(3)\}+0.6\,\mathbb E_{q^2}(v), & \text{if } q=q^2,\end{cases}
\]
with any $0\le\kappa\le1$. Then $\sigma(\cdot,\cdot)$ is strongly monotonic with respect to stochastic dominance on $\mathcal S$. Indeed, each $\sigma(q^i,\cdot)$ is a coherent risk measure on $\mathcal V$. We fix any $v_2\in\mathcal V$ and denote $a=\min\{v_2(1),v_2(2),v_2(3)\}$ and $c=\max\{v_2(1),v_2(2),v_2(3)\}$. Then we have
\[
0.4c+0.2(a+a+c)\ \le\ \sigma(q^2,v_2)\ \le\ 0.4c+0.2(a+c+c),
\]
and

i) for any $v_1,v_2\in\mathcal V$ such that $(v_1;q^1)\preceq_{\rm st}(v_2;q^2)$, we have $(v_1;q^1)\preceq_{\rm st}(w;q^1)$ with $w:=(c,c,a)$, and thus
\[
\sigma(q^1,v_1)\ \le\ \mathbb E_{q^1}(w)+\kappa\,\mathbb E_{q^1}\big([w-\mathbb E_{q^1}(w)]_+\big)=0.2c+0.8a+0.16\,\kappa(c-a)\ \le\ 0.6c+0.4a\ \le\ \sigma(q^2,v_2);
\]

ii) for any $v_1,v_2\in\mathcal V$ such that $(v_1;q^1)\succeq_{\rm st}(v_2;q^2)$, we have $(v_1;q^1)\succeq_{\rm st}(w';q^1)$ with $w':=(a,a,c)$, and thus
\[
\sigma(q^1,v_1)\ \ge\ \mathbb E_{q^1}(w')+\kappa\,\mathbb E_{q^1}\big([w'-\mathbb E_{q^1}(w')]_+\big)=0.8c+0.2a+0.16\,\kappa(c-a)\ \ge\ 0.8c+0.2a\ \ge\ \sigma(q^2,v_2).
\]
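The constructions above can be sketched numerically. The following illustrative code (names are ours, not the paper's) computes the AVaR building block of the Kusuoka representation (20) for a discrete distribution by integrating the quantile function over $(\alpha,1]$, and implements the two measures of Example 3.5 with the arbitrary admissible choice $\kappa=0.5$:

```python
# Sketch, not from the paper: AVaR for a discrete law, and the two risk
# measures of Example 3.5 (kappa = 0.5 is an arbitrary choice in [0, 1]).

def avar(values, probs, alpha):
    """AVaR_{1-alpha}(v;q) = (1/(1-alpha)) * integral_alpha^1 F^{-1}(s) ds."""
    pairs = sorted(zip(values, probs))   # quantile function is piecewise constant
    integral, level = 0.0, 0.0           # level = cumulative probability reached
    for v, p in pairs:
        lo, hi = level, level + p        # v is the quantile on (lo, hi]
        integral += v * max(0.0, min(hi, 1.0) - max(lo, alpha))
        level = hi
    return integral / (1.0 - alpha)

q1 = [0.1, 0.1, 0.8]
q2 = [1.0 / 3, 1.0 / 3, 1.0 / 3]
kappa = 0.5

def sigma_q1(v):                         # mean-upper-semideviation under q1
    m = sum(p * x for p, x in zip(q1, v))
    return m + kappa * sum(p * max(x - m, 0.0) for p, x in zip(q1, v))

def sigma_q2(v):                         # 0.4 * max + 0.6 * expectation under q2
    return 0.4 * max(v) + 0.6 * sum(p * x for p, x in zip(q2, v))

print(avar([0.0, 10.0], [0.5, 0.5], 0.5))   # 10.0 (upper half of the quantiles)
v = [1.0, 2.0, 3.0]
print(sigma_q1(v))   # E_{q1}(v) = 2.7, semideviation 0.24, so about 2.82
print(sigma_q2(v))   # 0.4*3 + 0.6*2, i.e. 2.4
```

Note how $\sigma(q^2,\cdot)$ evaluates the function $v$, not a distribution with respect to $q^2$ alone, which is exactly why different functional forms are allowed for each $q$.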
4 Application to Controlled Markov Systems

Our results can be further specialized to the case when $\{X_t\}$ is a controlled Markov system, in which we assume the following conditions:

i) The admissible control sets are measurable multifunctions of the current state: $U_t:\mathcal X\rightrightarrows\mathcal U$, $t=1,\dots,T$;

ii) The dependence of the transition kernels (13) on the history is carried only through the last state and control: $Q_t:\operatorname{graph}(U_t)\to\mathcal P(\mathcal X)$, $t=1,\dots,T-1$;

iii) The step-wise costs depend only on the current state and control: $Z_t=c_t(x_t,u_t)$, $t=1,\dots,T$, where $c_t:\operatorname{graph}(U_t)\to\mathbb R$, $t=1,\dots,T$, are measurable bounded functions.

Let $\Pi$ be the set of admissible history-dependent policies:
\[
\Pi:=\big\{\pi=(\pi_1,\dots,\pi_T):\ \forall t,\ \pi_t(x_1,\dots,x_t)\in U_t(x_t)\big\}.
\]
To alleviate notation, for all $\pi\in\Pi$ and for all measurable $c=(c_1,\dots,c_T)$, we write
\[
v^{c,\pi}_t(h_t):=\rho^\pi_{t,T}\big(c_t(X_t,\pi_t(H_t)),\dots,c_T(X_T,\pi_T(H_T))\big)(h_t).
\]
The following result is the direct translation of Theorem 3.3 to the Markovian case.

Corollary 4.1. For a controlled Markov system, a family of process-based dynamic risk measures $\{\rho^\pi_{t,T}\}_{\pi\in\Pi,\;t=1,\dots,T}$ is translation-invariant and stochastically conditionally time-consistent if and only if there exist functionals
\[
\sigma_t:\big\{(h_t,Q_t(x_t,u)):\ h_t\in\mathcal H_t,\ u\in U_t(x_t)\big\}\times\mathcal V\to\mathbb R,\qquad t=1,\dots,T-1,
\]
such that

i) For all $t=1,\dots,T-1$ and all $h_t\in\mathcal H_t$, $\sigma_t(h_t,\cdot,\cdot)$ is normalized and strongly monotonic with respect to stochastic dominance on $\{Q_t(x_t,u):\ u\in U_t(x_t)\}$;

ii) For all $\pi\in\Pi$, for all bounded measurable $c$, for all $t=1,\dots,T-1$, and for all $h_t\in\mathcal H_t$,
\[
v^{c,\pi}_t(h_t)=c_t(x_t,\pi_t(h_t))+\sigma_t\big(h_t,Q_t(x_t,\pi_t(h_t)),v^{c,\pi}_{t+1}(h_t,\cdot)\big).\tag{21}
\]

Proof. To verify the "if and only if" statement, we can show that (17) is true if $\sigma_t$ satisfies (21) for all measurable bounded $c$.

4.1 Markov Risk Measures

Because of the Markov property of the transition kernels, for a fixed Markov policy¹ $\pi$, the future evolution of the process $\{X_\tau\}_{\tau=t,\dots,T}$ is solely dependent on the current state $x_t$, and so is the distribution of the future costs $c_\tau(X_\tau,\pi_\tau(X_\tau))$, $\tau=t,\dots,T$. Therefore, it is reasonable to assume that the dependence of the conditional risk measure on the history is also carried by the current state only.

Definition 4.2. A family of process-based dynamic risk measures $\{\rho^\pi_{t,T}\}_{\pi\in\Pi,\;t=1,\dots,T}$ for a controlled Markov system is *Markov* if for all Markov policies $\pi\in\Pi$, for all measurable $c=(c_1,\dots,c_T)$, and for all $h_t=(x_1,\dots,x_t)$ and $h'_t=(x'_1,\dots,x'_t)$ in $\mathcal H_t$ such that $x_t=x'_t$, we have $v^{c,\pi}_t(h_t)=v^{c,\pi}_t(h'_t)$.

Proposition 4.3. Under the translation invariance and the stochastic conditional time consistency, $\{\rho^\pi_{t,T}\}_{\pi\in\Pi,\;t=1,\dots,T}$ is Markov if and only if the dependence of $\sigma_t$ on $h_t$ is carried only by $x_t$, for all $t=1,\dots,T-1$.

Proof. Suppose the family is Markov. For all $t=1,\dots,T-1$, all $h_t,h'_t\in\mathcal H_t$ such that $x_t=x'_t$, all $u\in U_t(x_t)$, and all $v\in\mathcal V$, there exists a Markov policy $\pi\in\Pi$ such that $\pi_t(x_t)=u$. By setting $c=(0,\dots,0,c_{t+1},0,\dots,0)$ with $c_{t+1}:(x',u')\mapsto v(x')$, we obtain
\[
\sigma_t\big(h_t,Q_t(x_t,u),v\big)=v^{c,\pi}_t(h_t)=v^{c,\pi}_t(h'_t)=\sigma_t\big(h'_t,Q_t(x_t,u),v\big).
\]

¹ A Markov policy $\pi$ is composed of state-dependent measurable decision rules $\pi_t:\mathcal X\to\mathcal U$, $t=1,\dots,T$.

Therefore, $\sigma_t$ is indeed memoryless, that is, its dependence on $h_t$ is carried by $x_t$ only. Conversely, if $\sigma_t$, $t=1,\dots,T-1$, are all memoryless, we can prove by induction backward in time that for all $t=T,\dots,1$, $v^{c,\pi}_t(h_t)=v^{c,\pi}_t(h'_t)$ for all Markov $\pi$ and all $h_t,h'_t\in\mathcal H_t$ such that $x_t=x'_t$.

Theorem 4.4. For a controlled Markov system, a family of process-based dynamic risk measures $\{\rho^\pi_{t,T}\}_{\pi\in\Pi,\;t=1,\dots,T}$ is translation-invariant, stochastically conditionally time-consistent, and Markov if and only if there exist functionals
\[
\sigma_t:\big\{(x,Q_t(x,u)):\ x\in\mathcal X,\ u\in U_t(x)\big\}\times\mathcal V\to\mathbb R,\qquad t=1,\dots,T-1,
\]
where $\mathcal V$ is the set of bounded measurable functions on $\mathcal X$, such that:

i) For all $t=1,\dots,T-1$ and all $x\in\mathcal X$, $\sigma_t(x,\cdot,\cdot)$ is normalized and strongly monotonic with respect to stochastic dominance on $\{Q_t(x,u):\ u\in U_t(x)\}$;

ii) For all $\pi\in\Pi$, for all measurable bounded $c$, for all $t=1,\dots,T-1$, and for all $h_t\in\mathcal H_t$,
\[
v^{c,\pi}_t(h_t)=c_t(x_t,\pi_t(h_t))+\sigma_t\big(x_t,Q_t(x_t,\pi_t(h_t)),v^{c,\pi}_{t+1}(h_t,\cdot)\big).\tag{22}
\]

Remark 4.5. In [41], formula (22) was postulated as a definition of a Markov risk measure.

Theorem 4.4 provides us with a simple recursive formula (22) for the evaluation of the risk of a Markov policy $\pi$:
\[
v^{c,\pi}_T(x)=c_T(x,\pi_T(x)),\qquad x\in\mathcal X,
\]
\[
v^{c,\pi}_t(x)=c_t(x,\pi_t(x))+\sigma_t\big(x,Q_t(x,\pi_t(x)),v^{c,\pi}_{t+1}\big),\qquad x\in\mathcal X,\quad t=T-1,\dots,1.
\]
It involves the calculation of the values of the functions $v^{c,\pi}_t(\cdot)$ on the state space $\mathcal X$.

4.2 Dynamic Programming

In this section, we fix the cost functions $c_1,\dots,c_T$ and consider a family of dynamic risk measures $\{\rho^\pi_{t,T}\}_{\pi\in\Pi,\;t=1,\dots,T}$ which is translation-invariant, stochastically conditionally time-consistent, and Markov. Our objective is to analyze the risk minimization problem
\[
\min_{\pi\in\Pi}\ v^\pi_1(x_1),\qquad x_1\in\mathcal X.
\]
For this purpose, we introduce the family of value functions
\[
v^*_t(h_t)=\inf_{\pi\in\Pi_{t,T}} v^\pi_t(h_t),\qquad t=1,\dots,T,\quad h_t\in\mathcal H_t,\tag{23}
\]
where $\Pi_{t,T}$ is the set of feasible deterministic policies $\pi=\{\pi_t,\dots,\pi_T\}$.
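The recursive evaluation of a Markov policy described after Remark 4.5 can be sketched on a toy two-state chain. All data below are invented for illustration; the mean-upper-semideviation mapping (with $\kappa=1$) stands in for $\sigma_t$, and the policy has already been substituted into the costs and kernels:

```python
# Illustrative sketch (not from the paper): backward evaluation of the risk of
# a fixed Markov policy, v_t(x) = c_t(x) + sigma(Q_t(x), v_{t+1}), on a chain
# with two states and horizon T = 3. c[t][x] and Q[t][x] are hypothetical.

def sigma(q, v, kappa=1.0):
    """Mean-upper-semideviation risk mapping (p = 1)."""
    m = sum(p * w for p, w in zip(q, v))
    return m + kappa * sum(p * max(w - m, 0.0) for p, w in zip(q, v))

T = 3
c = [[1.0, 2.0], [1.0, 2.0], [0.0, 5.0]]   # c[t][x], t = 0, ..., T-1
Q = [[[0.5, 0.5], [0.2, 0.8]],             # Q[t][x] = next-state distribution
     [[0.9, 0.1], [0.2, 0.8]]]             # (no kernel needed at the last stage)

v = c[T - 1][:]                            # v_T(x) = terminal cost
for t in range(T - 2, -1, -1):             # t = T-2, ..., 0
    v = [c[t][x] + sigma(Q[t][x], v) for x in range(2)]
print(v)                                   # risk of the policy from each state
```

Each backward step evaluates one transition risk mapping per state, so the whole evaluation costs $O(T\,|\mathcal X|^2)$ operations here, just as risk-neutral policy evaluation would.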
As stated in Theorem 4.4, transition risk mappings $\{\sigma_t\}_{t=1,\dots,T-1}$ exist such that
\[
v^\pi_t(h_t)=c_t(x_t,\pi_t(h_t))+\sigma_t\big(x_t,Q_t(x_t,\pi_t(h_t)),v^\pi_{t+1}(h_t,\cdot)\big),\qquad t=1,\dots,T-1,\quad \pi\in\Pi,\quad h_t\in\mathcal H_t.\tag{24}
\]
Our intention is to prove that the value functions $v^*_t(\cdot)$ are memoryless, that is, for all $h_t=(x_1,\dots,x_t)$ and $h'_t=(x'_1,\dots,x'_t)$ such that $x_t=x'_t$, we have $v^*_t(h_t)=v^*_t(h'_t)$. In this case, with a slight abuse of notation, we shall simply write $v^*_t(x_t)$. In order to formulate the main result of this subsection, we equip the space $\mathcal P(\mathcal X)$ of probability measures on $\mathcal X$ with the topology of weak convergence.

Theorem 4.6. We assume in addition that the following conditions are satisfied:

i) The transition kernels $Q_t(\cdot,\cdot)$, $t=1,\dots,T$, are continuous;

ii) For every lower semicontinuous $v\in\mathcal V$, the transition risk mappings $\sigma_t(\cdot,\cdot,v)$, $t=1,\dots,T$, are lower semicontinuous;

iii) The functions $c_t(\cdot,\cdot)$, $t=1,\dots,T$, are lower semicontinuous;

iv) The multifunctions $U_t(\cdot)$, $t=1,\dots,T$, are measurable, compact-valued, and upper semicontinuous.

Then the functions $v^*_t$, $t=1,\dots,T$, are measurable, memoryless, lower semicontinuous, and satisfy the following dynamic programming equations:
\[
v^*_T(x)=\min_{u\in U_T(x)} c_T(x,u),\qquad x\in\mathcal X,\tag{25}
\]
\[
v^*_t(x)=\min_{u\in U_t(x)}\big\{c_t(x,u)+\sigma_t\big(x,Q_t(x,u),v^*_{t+1}\big)\big\},\qquad x\in\mathcal X,\quad t=T-1,\dots,1.\tag{26}
\]
Moreover, an optimal Markov policy $\hat\pi$ exists and satisfies the equations:
\[
\hat\pi_T(x)\in\operatorname*{argmin}_{u\in U_T(x)} c_T(x,u),\qquad x\in\mathcal X,\tag{27}
\]
\[
\hat\pi_t(x)\in\operatorname*{argmin}_{u\in U_t(x)}\big\{c_t(x,u)+\sigma_t\big(x,Q_t(x,u),v^*_{t+1}\big)\big\},\qquad x\in\mathcal X,\quad t=T-1,\dots,1.\tag{28}
\]

Proof. We prove the memoryless property of $v^*_t(\cdot)$ and construct the optimal Markov policy by induction backward in time. For all $h_T\in\mathcal H_T$ we have
\[
v^*_T(h_T)=\inf_{\pi\in\Pi} c_T(x_T,\pi_T(h_T))=\inf_{u\in U_T(x_T)} c_T(x_T,u).\tag{29}
\]
Since $c_T(\cdot,\cdot)$ is lower semicontinuous, it is a normal integrand, that is, its epigraphical mapping $x\mapsto\{(u,\alpha)\in\mathcal U\times\mathbb R: c_T(x,u)\le\alpha\}$ is a closed-valued and measurable multifunction [38]. Due to assumption iv), the mapping
\[
\bar c_T(x,u)=\begin{cases} c_T(x,u) & \text{if } u\in U_T(x),\\ +\infty & \text{otherwise},\end{cases}
\]
is a normal integrand as well. By virtue of [38], the infimum in (29) is attained and is a measurable function of $x_T$. Hence, $v^*_T(\cdot)$ is measurable and memoryless. By assumptions iii) and iv) and the Berge theorem, it is also lower semicontinuous (see, e.g., [3]). Moreover, the optimal solution mapping $\Psi_T(x)=\{u\in U_T(x): c_T(x,u)=v^*_T(x)\}$ is measurable and has nonempty and closed values. Therefore, a measurable selector $\hat\pi_T$ of $\Psi_T$ exists [28], [3].
Suppose $v^*_{t+1}(\cdot)$ is measurable, memoryless, and lower semicontinuous, and Markov decision rules $\{\hat\pi_{t+1},\dots,\hat\pi_T\}$ exist such that
\[
v^*_{t+1}(x_{t+1})=v^{\{\hat\pi_{t+1},\dots,\hat\pi_T\}}_{t+1}(x_{t+1}),\qquad h_{t+1}\in\mathcal H_{t+1}.
\]
Then for any $h_t\in\mathcal H_t$ we have
\[
v^*_t(h_t)=\inf_{\pi\in\Pi} v^\pi_t(h_t)=\inf_{\pi\in\Pi}\big\{c_t(x_t,\pi_t(h_t))+\sigma_t\big(x_t,Q_t(x_t,\pi_t(h_t)),v^\pi_{t+1}(h_t,\cdot)\big)\big\}.
\]

On the one hand, since $v^\pi_{t+1}(h_t,\cdot)\ge v^*_{t+1}(\cdot)$ and $\sigma_t$ is non-decreasing with respect to the last argument, we obtain
\[
v^*_t(h_t)\ \ge\ \inf_{\pi\in\Pi}\big\{c_t(x_t,\pi_t(h_t))+\sigma_t\big(x_t,Q_t(x_t,\pi_t(h_t)),v^*_{t+1}\big)\big\}
=\inf_{u\in U_t(x_t)}\big\{c_t(x_t,u)+\sigma_t\big(x_t,Q_t(x_t,u),v^*_{t+1}\big)\big\}.\tag{30}
\]
By assumptions i)-iii), the mapping $(x,u)\mapsto c_t(x,u)+\sigma_t(x,Q_t(x,u),v^*_{t+1})$ is lower semicontinuous. Invoking [38] and assumption iv) again, exactly as in the case of $t=T$, we conclude that the optimal solution mapping
\[
\Psi_t(x)=\Big\{u\in U_t(x):\ c_t(x,u)+\sigma_t\big(x,Q_t(x,u),v^*_{t+1}\big)=\inf_{u'\in U_t(x)}\big\{c_t(x,u')+\sigma_t\big(x,Q_t(x,u'),v^*_{t+1}\big)\big\}\Big\}
\]
is measurable and has nonempty and closed values; hence, a measurable selector $\hat\pi_t$ of $\Psi_t$ exists [3]. Substituting this selector into (30), we obtain
\[
v^*_t(h_t)\ \ge\ c_t(x_t,\hat\pi_t(x_t))+\sigma_t\big(x_t,Q_t(x_t,\hat\pi_t(x_t)),v^{\{\hat\pi_{t+1},\dots,\hat\pi_T\}}_{t+1}\big)=v^{\{\hat\pi_t,\dots,\hat\pi_T\}}_t(x_t).
\]
In the last equation, we used (24) and the fact that the decision rules $\hat\pi_t,\dots,\hat\pi_T$ are Markov. On the other hand,
\[
v^*_t(h_t)=\inf_{\pi\in\Pi} v^\pi_t(h_t)\ \le\ v^{\{\hat\pi_t,\dots,\hat\pi_T\}}_t(x_t).
\]
Therefore, $v^*_t(h_t)=v^{\{\hat\pi_t,\dots,\hat\pi_T\}}_t(x_t)$ is measurable and memoryless, and
\[
v^*_t(x_t)=\min_{u\in U_t(x_t)}\big\{c_t(x_t,u)+\sigma_t\big(x_t,Q_t(x_t,u),v^*_{t+1}\big)\big\}
=c_t(x_t,\hat\pi_t(x_t))+\sigma_t\big(x_t,Q_t(x_t,\hat\pi_t(x_t)),v^*_{t+1}\big).
\]
By assumptions ii), iii), iv), and the Berge theorem, $v^*_t(\cdot)$ is lower semicontinuous (see, e.g., [3]). This completes the induction step.

Let us verify the weak lower semicontinuity assumption ii) for the mean-semideviation transition risk mapping of Example 2.18. To make the mapping Markovian, we assume that the parameter $\varkappa$ depends on $x$ only, that is,
\[
\sigma(x,q,v)=\int v(s)\,q(ds)+\varkappa(x)\bigg(\int\Big[v(s)-\int v(s')\,q(ds')\Big]_+^p\,q(ds)\bigg)^{1/p}.\tag{31}
\]
As before, $p\in[1,\infty)$. For simplicity, we skip the subscript $t$ of $\sigma$ and $\varkappa$.

Lemma 4.7. Suppose $\varkappa(\cdot)$ is continuous. Then for every lower semicontinuous function $v$, the mapping $(x,q)\mapsto\sigma(x,q,v)$ is lower semicontinuous.

Proof. Let $q_k\to q$ weakly and $x_k\to x$.
For all $s\in\mathcal X$ we have the inequality
\[
0\ \le\ \Big[v(s)-\int v(s')\,q(ds')\Big]_+\ \le\ \Big[v(s)-\int v(s')\,q_k(ds')\Big]_+\ +\ \Big|\int v(s')\,q_k(ds')-\int v(s')\,q(ds')\Big|.
\]

By the triangle inequality for the norm in $\mathcal L_p\big(\mathcal X,\mathcal B(\mathcal X),q_k\big)$,
\[
\bigg(\int\Big[v(s)-\int v(s')\,q(ds')\Big]_+^p q_k(ds)\bigg)^{1/p}
\le\bigg(\int\Big[v(s)-\int v(s')\,q_k(ds')\Big]_+^p q_k(ds)\bigg)^{1/p}
+\Big|\int v(s')\,q_k(ds')-\int v(s')\,q(ds')\Big|.
\]
Adding $\int v(s)\,q(ds)$ to both sides and using the identity $\int v\,dq+\big|\int v\,dq_k-\int v\,dq\big|=\max\big\{\int v\,dq,\int v\,dq_k\big\}+\big[\int v\,dq-\int v\,dq_k\big]_+$, we obtain
\[
\int v(s)\,q(ds)+\bigg(\int\Big[v(s)-\int v(s')\,q(ds')\Big]_+^p q_k(ds)\bigg)^{1/p}
\le\bigg(\int\Big[v(s)-\int v(s')\,q_k(ds')\Big]_+^p q_k(ds)\bigg)^{1/p}
+\max\Big\{\int v(s)\,q(ds),\int v(s)\,q_k(ds)\Big\}+\Big[\int v(s)\,q(ds)-\int v(s)\,q_k(ds)\Big]_+.
\]
By the lower semicontinuity of $v$ and the weak convergence of $q_k$ to $q$, we have
\[
\int v(s)\,q(ds)\ \le\ \liminf_{k\to\infty}\int v(s)\,q_k(ds),
\]
that is, for every $\varepsilon>0$ we can find $k_\varepsilon$ such that for all $k\ge k_\varepsilon$,
\[
\int v(s)\,q_k(ds)\ \ge\ \int v(s)\,q(ds)-\varepsilon/2.
\]
Therefore, for these $k$ we obtain
\[
\int v(s)\,q(ds)+\bigg(\int\Big[v(s)-\int v(s')\,q(ds')\Big]_+^p q_k(ds)\bigg)^{1/p}
\le\int v(s)\,q_k(ds)+\bigg(\int\Big[v(s)-\int v(s')\,q_k(ds')\Big]_+^p q_k(ds)\bigg)^{1/p}+\varepsilon.
\]
Taking the $\liminf$ of both sides, and using the weak convergence of $q_k$ to $q$ and the lower semicontinuity of the functions integrated on the left-hand side, we conclude that
\[
\int v(s)\,q(ds)+\bigg(\int\Big[v(s)-\int v(s')\,q(ds')\Big]_+^p q(ds)\bigg)^{1/p}
\le\liminf_{k\to\infty}\bigg\{\int v(s)\,q_k(ds)+\bigg(\int\Big[v(s)-\int v(s')\,q_k(ds')\Big]_+^p q_k(ds)\bigg)^{1/p}\bigg\}+\varepsilon.
\]
As $\varepsilon>0$ was arbitrary, the last relation proves the lower semicontinuity of $\sigma$ in the case when $\varkappa(x)\equiv 1$. The case of a continuous $\varkappa(x)\in[0,1]$ can now be easily analyzed by noticing that $\sigma(x,q,v)$ is a convex combination of the expected value and the risk measure of the last displayed relation:
\[
\sigma(x,q,v)=\big(1-\varkappa(x)\big)\int v(s)\,q(ds)
+\varkappa(x)\bigg\{\int v(s)\,q(ds)+\bigg(\int\Big[v(s)-\int v(s')\,q(ds')\Big]_+^p q(ds)\bigg)^{1/p}\bigg\}.
\]
As both components are lower semicontinuous in $(x,q)$, so is their sum.
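To close Section 4, here is a numerical sketch of the dynamic programming equations (25)-(26), using the mean-semideviation mapping (31) with $p=1$ and a constant $\varkappa=0.5$. States, controls, kernels, and costs are invented for illustration:

```python
# Illustrative sketch (not from the paper): risk-averse backward induction
# v*_t(x) = min_u { c_t(x,u) + sigma(x, Q_t(x,u), v*_{t+1}) }, equations
# (25)-(26), together with a minimizing decision rule as in (27)-(28).

def sigma(q, v, kappa=0.5):
    """Mean-upper-semideviation mapping (31) with p = 1, constant kappa."""
    m = sum(p * w for p, w in zip(q, v))
    return m + kappa * sum(p * max(w - m, 0.0) for p, w in zip(q, v))

T = 2
states, controls = range(2), range(2)
c = [[[1.0, 3.0], [2.0, 0.5]],        # c[t][x][u]
     [[0.0, 0.0], [4.0, 4.0]]]        # terminal costs (control-independent here)
Q = [[[[0.8, 0.2], [0.1, 0.9]],       # Q[t][x][u] = distribution of next state
      [[0.5, 0.5], [0.9, 0.1]]]]

v = [min(c[T - 1][x][u] for u in controls) for x in states]     # equation (25)
policy = []
for t in range(T - 2, -1, -1):                                  # equation (26)
    q_values = [[c[t][x][u] + sigma(Q[t][x][u], v) for u in controls]
                for x in states]
    policy.insert(0, [min(controls, key=lambda u: q_values[x][u])
                      for x in states])
    v = [min(q_values[x]) for x in states]
print(v, policy)    # risk-averse value function at t = 1 and a minimizing rule
```

Because $\sigma$ is strongly monotonic, replacing the expectation of risk-neutral dynamic programming by $\sigma$ changes only the one-step evaluation; the backward structure of the algorithm is unchanged.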

5 Application to Partially-Observable Markov Decision Processes

In this section, we apply our previous results to a partially-observable Markov decision process $\{(X_t,Y_t)\}_{t=1,\dots,T}$, in which $\{X_t\}_{t=1,\dots,T}$ is observable and $\{Y_t\}_{t=1,\dots,T}$ is not. We use the term "partially-observable Markov decision process" in a more general way than the extant literature, because we consider dynamic risk measures as the objective of control, rather than just the expected value of the cost.

5.1 The Model

In order to develop our subsequent theory, it is essential to define our partially-observable Markov decision process (POMDP) in a clear and rigorous way. This subsection mostly follows Chapter 5 of [4], and readers are encouraged to consult [4] for more details. The state space of the model is defined as $\mathcal X\times\mathcal Y$, where $(\mathcal X,\mathcal B(\mathcal X))$ and $(\mathcal Y,\mathcal B(\mathcal Y))$ are two Polish spaces. From the modeling perspective, $x\in\mathcal X$ is the part of the state that we can observe at each step, while $y\in\mathcal Y$ is unobservable. The measurable space that we will work with is then given by $(\mathcal X\times\mathcal Y)^T$ endowed with the canonical product $\sigma$-field, and we use $x_t$ and $y_t$ to denote the canonical projections at time $t$. The control space is still given by $\mathcal U$, and since only $\mathcal X$ is observable, the set of admissible controls at step $t$ is still given by a multifunction $U_t:\mathcal X\rightrightarrows\mathcal U$. The transition kernel at time $t$ is now given by $Q_t:\operatorname{graph}(U_t)\times\mathcal Y\to\mathcal P(\mathcal X\times\mathcal Y)$; in other words, given that at time $t$ the state is $(x,y)$ and we apply control $u$, the distribution of the next state is given by $Q_t(x,y,u)$. Finally, the cost incurred at each stage $t$ is given by the functional $c_t:\operatorname{graph}(U_t)\to\mathbb R$, which is assumed to be measurable and bounded.

We would also like to comment on the notion of history in the current POMDP setting. At time $t$, all the information available for making a decision is given by $g_t=(x_1,u_1,\dots,x_t)$, because $y_t$ is unobservable. In the following, the notion of a history will always correspond to this observable history.
5.2 Bayes Operator

In a POMDP, the Bayes operator provides a way to update from a prior belief to a posterior belief. More precisely, assume that our current state observation is $x$, our action is $u$, and our current best guess of the distribution of the unobservable state is $\xi$. Given a new observation $x'$, we can find a formula to determine the posterior distribution of the unobservable state. Let us start with a fairly general construction of the Bayes operator. Assuming the above setup, for given $(x,\xi,u)\in\mathcal X\times\mathcal P(\mathcal Y)\times\mathcal U$, define a new measure $m_t(x,\xi,u)$ on $\mathcal X\times\mathcal Y$, initially on all measurable rectangles $A\times B$, as
\[
m_t(x,\xi,u)(A\times B)=\int_{\mathcal Y} Q_t(A\times B\mid x,y,u)\,\xi(dy).
\]
We verify readily that this uniquely defines a probability measure on $\mathcal X\times\mathcal Y$. If the measurable space $(\mathcal Y,\mathcal B(\mathcal Y))$ is standard, i.e., isomorphic to a Borel subspace of $\mathbb R$, we can disintegrate
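When both $\mathcal X$ and $\mathcal Y$ are finite, the measure $m_t$ above is a finite sum and the disintegration reduces to ordinary Bayesian conditioning. The following sketch (all names and numbers hypothetical) forms $m_t(x,\xi,u)$ and the resulting posterior over $y'$ given an observed $x'$:

```python
# Illustrative sketch (finite-space specialization, not the paper's general
# construction): m_t(x, xi, u)(A x B) = sum_y Q_t(A x B | x, y, u) xi(y),
# followed by conditioning on the observed component x'.

def joint_next(kernel, xi):
    """kernel[y][(x1, y1)] = Q_t(x1, y1 | x, y, u) for a fixed (x, u);
    xi[y] = current belief over y. Returns m_t as a dict over (x1, y1)."""
    m = {}
    for y, prob_y in enumerate(xi):
        for xy_next, p in kernel[y].items():
            m[xy_next] = m.get(xy_next, 0.0) + prob_y * p
    return m

def bayes_update(kernel, xi, x_obs):
    """Posterior over y' after observing x' = x_obs (the Bayes operator)."""
    m = joint_next(kernel, xi)
    marginal = sum(p for (x1, _), p in m.items() if x1 == x_obs)
    return {y1: p / marginal for (x1, y1), p in m.items() if x1 == x_obs}

# Two observable and two unobservable states; kernel for one fixed (x, u).
kernel = [{(0, 0): 0.7, (1, 1): 0.3},    # transitions from y = 0
          {(0, 1): 0.2, (1, 0): 0.8}]    # transitions from y = 1
xi = [0.5, 0.5]
print(bayes_update(kernel, xi, 0))       # posterior over y' given x' = 0
```

Dividing by the marginal probability of the observed $x'$ is exactly the disintegration step; it is well defined whenever that marginal is positive.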


More information

Lecture 17 Brownian motion as a Markov process

Lecture 17 Brownian motion as a Markov process Lecture 17: Brownian motion as a Markov process 1 of 14 Course: Theory of Probability II Term: Spring 2015 Instructor: Gordan Zitkovic Lecture 17 Brownian motion as a Markov process Brownian motion is

More information

On Kusuoka representation of law invariant risk measures

On Kusuoka representation of law invariant risk measures On Kusuoka representation of law invariant risk measures Alexander Shapiro School of Industrial & Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive, Atlanta, GA 3332. Abstract. In this

More information

Asymptotics of minimax stochastic programs

Asymptotics of minimax stochastic programs Asymptotics of minimax stochastic programs Alexander Shapiro Abstract. We discuss in this paper asymptotics of the sample average approximation (SAA) of the optimal value of a minimax stochastic programming

More information

A Note on Robust Representations of Law-Invariant Quasiconvex Functions

A Note on Robust Representations of Law-Invariant Quasiconvex Functions A Note on Robust Representations of Law-Invariant Quasiconvex Functions Samuel Drapeau Michael Kupper Ranja Reda October 6, 21 We give robust representations of law-invariant monotone quasiconvex functions.

More information

Observer design for a general class of triangular systems

Observer design for a general class of triangular systems 1st International Symposium on Mathematical Theory of Networks and Systems July 7-11, 014. Observer design for a general class of triangular systems Dimitris Boskos 1 John Tsinias Abstract The paper deals

More information

4. Conditional risk measures and their robust representation

4. Conditional risk measures and their robust representation 4. Conditional risk measures and their robust representation We consider a discrete-time information structure given by a filtration (F t ) t=0,...,t on our probability space (Ω, F, P ). The time horizon

More information

arxiv: v2 [math.ag] 24 Jun 2015

arxiv: v2 [math.ag] 24 Jun 2015 TRIANGULATIONS OF MONOTONE FAMILIES I: TWO-DIMENSIONAL FAMILIES arxiv:1402.0460v2 [math.ag] 24 Jun 2015 SAUGATA BASU, ANDREI GABRIELOV, AND NICOLAI VOROBJOV Abstract. Let K R n be a compact definable set

More information

A Concise Course on Stochastic Partial Differential Equations

A Concise Course on Stochastic Partial Differential Equations A Concise Course on Stochastic Partial Differential Equations Michael Röckner Reference: C. Prevot, M. Röckner: Springer LN in Math. 1905, Berlin (2007) And see the references therein for the original

More information

The small ball property in Banach spaces (quantitative results)

The small ball property in Banach spaces (quantitative results) The small ball property in Banach spaces (quantitative results) Ehrhard Behrends Abstract A metric space (M, d) is said to have the small ball property (sbp) if for every ε 0 > 0 there exists a sequence

More information

Relaxations of linear programming problems with first order stochastic dominance constraints

Relaxations of linear programming problems with first order stochastic dominance constraints Operations Research Letters 34 (2006) 653 659 Operations Research Letters www.elsevier.com/locate/orl Relaxations of linear programming problems with first order stochastic dominance constraints Nilay

More information

Propagating terraces and the dynamics of front-like solutions of reaction-diffusion equations on R

Propagating terraces and the dynamics of front-like solutions of reaction-diffusion equations on R Propagating terraces and the dynamics of front-like solutions of reaction-diffusion equations on R P. Poláčik School of Mathematics, University of Minnesota Minneapolis, MN 55455 Abstract We consider semilinear

More information

The Asymptotic Theory of Transaction Costs

The Asymptotic Theory of Transaction Costs The Asymptotic Theory of Transaction Costs Lecture Notes by Walter Schachermayer Nachdiplom-Vorlesung, ETH Zürich, WS 15/16 1 Models on Finite Probability Spaces In this section we consider a stock price

More information

Stochastic Dynamic Programming: The One Sector Growth Model

Stochastic Dynamic Programming: The One Sector Growth Model Stochastic Dynamic Programming: The One Sector Growth Model Esteban Rossi-Hansberg Princeton University March 26, 2012 Esteban Rossi-Hansberg () Stochastic Dynamic Programming March 26, 2012 1 / 31 References

More information

A SET OF LECTURE NOTES ON CONVEX OPTIMIZATION WITH SOME APPLICATIONS TO PROBABILITY THEORY INCOMPLETE DRAFT. MAY 06

A SET OF LECTURE NOTES ON CONVEX OPTIMIZATION WITH SOME APPLICATIONS TO PROBABILITY THEORY INCOMPLETE DRAFT. MAY 06 A SET OF LECTURE NOTES ON CONVEX OPTIMIZATION WITH SOME APPLICATIONS TO PROBABILITY THEORY INCOMPLETE DRAFT. MAY 06 CHRISTIAN LÉONARD Contents Preliminaries 1 1. Convexity without topology 1 2. Convexity

More information

Deterministic Dynamic Programming

Deterministic Dynamic Programming Deterministic Dynamic Programming 1 Value Function Consider the following optimal control problem in Mayer s form: V (t 0, x 0 ) = inf u U J(t 1, x(t 1 )) (1) subject to ẋ(t) = f(t, x(t), u(t)), x(t 0

More information

THE DAUGAVETIAN INDEX OF A BANACH SPACE 1. INTRODUCTION

THE DAUGAVETIAN INDEX OF A BANACH SPACE 1. INTRODUCTION THE DAUGAVETIAN INDEX OF A BANACH SPACE MIGUEL MARTÍN ABSTRACT. Given an infinite-dimensional Banach space X, we introduce the daugavetian index of X, daug(x), as the greatest constant m 0 such that Id

More information

2 Statement of the problem and assumptions

2 Statement of the problem and assumptions Mathematical Notes, 25, vol. 78, no. 4, pp. 466 48. Existence Theorem for Optimal Control Problems on an Infinite Time Interval A.V. Dmitruk and N.V. Kuz kina We consider an optimal control problem on

More information

Valid Inequalities and Restrictions for Stochastic Programming Problems with First Order Stochastic Dominance Constraints

Valid Inequalities and Restrictions for Stochastic Programming Problems with First Order Stochastic Dominance Constraints Valid Inequalities and Restrictions for Stochastic Programming Problems with First Order Stochastic Dominance Constraints Nilay Noyan Andrzej Ruszczyński March 21, 2006 Abstract Stochastic dominance relations

More information

c 2012 Ozlem Cavus ALL RIGHTS RESERVED

c 2012 Ozlem Cavus ALL RIGHTS RESERVED c 2012 Ozlem Cavus ALL RIGHTS RESERVED RISK-AVERSE CONTROL OF UNDISCOUNTED TRANSIENT MARKOV MODELS by OZLEM CAVUS A dissertation submitted to the Graduate School New Brunswick Rutgers, The State University

More information

Optimal Stopping under Adverse Nonlinear Expectation and Related Games

Optimal Stopping under Adverse Nonlinear Expectation and Related Games Optimal Stopping under Adverse Nonlinear Expectation and Related Games Marcel Nutz Jianfeng Zhang First version: December 7, 2012. This version: July 21, 2014 Abstract We study the existence of optimal

More information

Bayesian Persuasion Online Appendix

Bayesian Persuasion Online Appendix Bayesian Persuasion Online Appendix Emir Kamenica and Matthew Gentzkow University of Chicago June 2010 1 Persuasion mechanisms In this paper we study a particular game where Sender chooses a signal π whose

More information

Lecture 21 Representations of Martingales

Lecture 21 Representations of Martingales Lecture 21: Representations of Martingales 1 of 11 Course: Theory of Probability II Term: Spring 215 Instructor: Gordan Zitkovic Lecture 21 Representations of Martingales Right-continuous inverses Let

More information

SUMMARY OF RESULTS ON PATH SPACES AND CONVERGENCE IN DISTRIBUTION FOR STOCHASTIC PROCESSES

SUMMARY OF RESULTS ON PATH SPACES AND CONVERGENCE IN DISTRIBUTION FOR STOCHASTIC PROCESSES SUMMARY OF RESULTS ON PATH SPACES AND CONVERGENCE IN DISTRIBUTION FOR STOCHASTIC PROCESSES RUTH J. WILLIAMS October 2, 2017 Department of Mathematics, University of California, San Diego, 9500 Gilman Drive,

More information

HOPF S DECOMPOSITION AND RECURRENT SEMIGROUPS. Josef Teichmann

HOPF S DECOMPOSITION AND RECURRENT SEMIGROUPS. Josef Teichmann HOPF S DECOMPOSITION AND RECURRENT SEMIGROUPS Josef Teichmann Abstract. Some results of ergodic theory are generalized in the setting of Banach lattices, namely Hopf s maximal ergodic inequality and the

More information

On Semicontinuity of Convex-valued Multifunctions and Cesari s Property (Q)

On Semicontinuity of Convex-valued Multifunctions and Cesari s Property (Q) On Semicontinuity of Convex-valued Multifunctions and Cesari s Property (Q) Andreas Löhne May 2, 2005 (last update: November 22, 2005) Abstract We investigate two types of semicontinuity for set-valued

More information

arxiv: v1 [math.ds] 4 Jan 2019

arxiv: v1 [math.ds] 4 Jan 2019 EPLODING MARKOV OPERATORS BARTOSZ FREJ ariv:1901.01329v1 [math.ds] 4 Jan 2019 Abstract. A special class of doubly stochastic (Markov operators is constructed. These operators come from measure preserving

More information

Equilibria with a nontrivial nodal set and the dynamics of parabolic equations on symmetric domains

Equilibria with a nontrivial nodal set and the dynamics of parabolic equations on symmetric domains Equilibria with a nontrivial nodal set and the dynamics of parabolic equations on symmetric domains J. Földes Department of Mathematics, Univerité Libre de Bruxelles 1050 Brussels, Belgium P. Poláčik School

More information

CLASSES OF STRICTLY SINGULAR OPERATORS AND THEIR PRODUCTS

CLASSES OF STRICTLY SINGULAR OPERATORS AND THEIR PRODUCTS CLASSES OF STRICTLY SINGULAR OPERATORS AND THEIR PRODUCTS GEORGE ANDROULAKIS, PANDELIS DODOS, GLEB SIROTKIN AND VLADIMIR G. TROITSKY Abstract. Milman proved in [18] that the product of two strictly singular

More information

arxiv:math/ v4 [math.pr] 12 Apr 2007

arxiv:math/ v4 [math.pr] 12 Apr 2007 arxiv:math/612224v4 [math.pr] 12 Apr 27 LARGE CLOSED QUEUEING NETWORKS IN SEMI-MARKOV ENVIRONMENT AND ITS APPLICATION VYACHESLAV M. ABRAMOV Abstract. The paper studies closed queueing networks containing

More information

Competitive Equilibria in a Comonotone Market

Competitive Equilibria in a Comonotone Market Competitive Equilibria in a Comonotone Market 1/51 Competitive Equilibria in a Comonotone Market Ruodu Wang http://sas.uwaterloo.ca/ wang Department of Statistics and Actuarial Science University of Waterloo

More information

The Canonical Gaussian Measure on R

The Canonical Gaussian Measure on R The Canonical Gaussian Measure on R 1. Introduction The main goal of this course is to study Gaussian measures. The simplest example of a Gaussian measure is the canonical Gaussian measure P on R where

More information

Notes on Measure Theory and Markov Processes

Notes on Measure Theory and Markov Processes Notes on Measure Theory and Markov Processes Diego Daruich March 28, 2014 1 Preliminaries 1.1 Motivation The objective of these notes will be to develop tools from measure theory and probability to allow

More information

Exponential Moving Average Based Multiagent Reinforcement Learning Algorithms

Exponential Moving Average Based Multiagent Reinforcement Learning Algorithms Exponential Moving Average Based Multiagent Reinforcement Learning Algorithms Mostafa D. Awheda Department of Systems and Computer Engineering Carleton University Ottawa, Canada KS 5B6 Email: mawheda@sce.carleton.ca

More information

Some SDEs with distributional drift Part I : General calculus. Flandoli, Franco; Russo, Francesco; Wolf, Jochen

Some SDEs with distributional drift Part I : General calculus. Flandoli, Franco; Russo, Francesco; Wolf, Jochen Title Author(s) Some SDEs with distributional drift Part I : General calculus Flandoli, Franco; Russo, Francesco; Wolf, Jochen Citation Osaka Journal of Mathematics. 4() P.493-P.54 Issue Date 3-6 Text

More information

Banach-Alaoglu theorems

Banach-Alaoglu theorems Banach-Alaoglu theorems László Erdős Jan 23, 2007 1 Compactness revisited In a topological space a fundamental property is the compactness (Kompaktheit). We recall the definition: Definition 1.1 A subset

More information

Risk Measures in non-dominated Models

Risk Measures in non-dominated Models Purpose: Study Risk Measures taking into account the model uncertainty in mathematical finance. Plan 1 Non-dominated Models Model Uncertainty Fundamental topological Properties 2 Risk Measures on L p (c)

More information

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries Chapter 1 Measure Spaces 1.1 Algebras and σ algebras of sets 1.1.1 Notation and preliminaries We shall denote by X a nonempty set, by P(X) the set of all parts (i.e., subsets) of X, and by the empty set.

More information

The Subdifferential of Convex Deviation Measures and Risk Functions

The Subdifferential of Convex Deviation Measures and Risk Functions The Subdifferential of Convex Deviation Measures and Risk Functions Nicole Lorenz Gert Wanka In this paper we give subdifferential formulas of some convex deviation measures using their conjugate functions

More information

Normed Vector Spaces and Double Duals

Normed Vector Spaces and Double Duals Normed Vector Spaces and Double Duals Mathematics 481/525 In this note we look at a number of infinite-dimensional R-vector spaces that arise in analysis, and we consider their dual and double dual spaces

More information

Nonlinear representation, backward SDEs, and application to the Principal-Agent problem

Nonlinear representation, backward SDEs, and application to the Principal-Agent problem Nonlinear representation, backward SDEs, and application to the Principal-Agent problem Ecole Polytechnique, France April 4, 218 Outline The Principal-Agent problem Formulation 1 The Principal-Agent problem

More information

Some Background Math Notes on Limsups, Sets, and Convexity

Some Background Math Notes on Limsups, Sets, and Convexity EE599 STOCHASTIC NETWORK OPTIMIZATION, MICHAEL J. NEELY, FALL 2008 1 Some Background Math Notes on Limsups, Sets, and Convexity I. LIMITS Let f(t) be a real valued function of time. Suppose f(t) converges

More information

Packing-Dimension Profiles and Fractional Brownian Motion

Packing-Dimension Profiles and Fractional Brownian Motion Under consideration for publication in Math. Proc. Camb. Phil. Soc. 1 Packing-Dimension Profiles and Fractional Brownian Motion By DAVAR KHOSHNEVISAN Department of Mathematics, 155 S. 1400 E., JWB 233,

More information

Noncooperative continuous-time Markov games

Noncooperative continuous-time Markov games Morfismos, Vol. 9, No. 1, 2005, pp. 39 54 Noncooperative continuous-time Markov games Héctor Jasso-Fuentes Abstract This work concerns noncooperative continuous-time Markov games with Polish state and

More information

Differential Games II. Marc Quincampoix Université de Bretagne Occidentale ( Brest-France) SADCO, London, September 2011

Differential Games II. Marc Quincampoix Université de Bretagne Occidentale ( Brest-France) SADCO, London, September 2011 Differential Games II Marc Quincampoix Université de Bretagne Occidentale ( Brest-France) SADCO, London, September 2011 Contents 1. I Introduction: A Pursuit Game and Isaacs Theory 2. II Strategies 3.

More information

Mean-field dual of cooperative reproduction

Mean-field dual of cooperative reproduction The mean-field dual of systems with cooperative reproduction joint with Tibor Mach (Prague) A. Sturm (Göttingen) Friday, July 6th, 2018 Poisson construction of Markov processes Let (X t ) t 0 be a continuous-time

More information

An introduction to some aspects of functional analysis

An introduction to some aspects of functional analysis An introduction to some aspects of functional analysis Stephen Semmes Rice University Abstract These informal notes deal with some very basic objects in functional analysis, including norms and seminorms

More information

On the distributional divergence of vector fields vanishing at infinity

On the distributional divergence of vector fields vanishing at infinity Proceedings of the Royal Society of Edinburgh, 141A, 65 76, 2011 On the distributional divergence of vector fields vanishing at infinity Thierry De Pauw Institut de Recherches en Mathématiques et Physique,

More information

Inverse Stochastic Dominance Constraints Duality and Methods

Inverse Stochastic Dominance Constraints Duality and Methods Duality and Methods Darinka Dentcheva 1 Andrzej Ruszczyński 2 1 Stevens Institute of Technology Hoboken, New Jersey, USA 2 Rutgers University Piscataway, New Jersey, USA Research supported by NSF awards

More information

Weak convergence and Brownian Motion. (telegram style notes) P.J.C. Spreij

Weak convergence and Brownian Motion. (telegram style notes) P.J.C. Spreij Weak convergence and Brownian Motion (telegram style notes) P.J.C. Spreij this version: December 8, 2006 1 The space C[0, ) In this section we summarize some facts concerning the space C[0, ) of real

More information

Alternative Characterizations of Markov Processes

Alternative Characterizations of Markov Processes Chapter 10 Alternative Characterizations of Markov Processes This lecture introduces two ways of characterizing Markov processes other than through their transition probabilities. Section 10.1 describes

More information

1. Stochastic Processes and filtrations

1. Stochastic Processes and filtrations 1. Stochastic Processes and 1. Stoch. pr., A stochastic process (X t ) t T is a collection of random variables on (Ω, F) with values in a measurable space (S, S), i.e., for all t, In our case X t : Ω S

More information

Boundedly complete weak-cauchy basic sequences in Banach spaces with the PCP

Boundedly complete weak-cauchy basic sequences in Banach spaces with the PCP Journal of Functional Analysis 253 (2007) 772 781 www.elsevier.com/locate/jfa Note Boundedly complete weak-cauchy basic sequences in Banach spaces with the PCP Haskell Rosenthal Department of Mathematics,

More information

Risk neutral and risk averse approaches to multistage stochastic programming.

Risk neutral and risk averse approaches to multistage stochastic programming. Risk neutral and risk averse approaches to multistage stochastic programming. A. Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA

More information

Generalized Hypothesis Testing and Maximizing the Success Probability in Financial Markets

Generalized Hypothesis Testing and Maximizing the Success Probability in Financial Markets Generalized Hypothesis Testing and Maximizing the Success Probability in Financial Markets Tim Leung 1, Qingshuo Song 2, and Jie Yang 3 1 Columbia University, New York, USA; leung@ieor.columbia.edu 2 City

More information

Lecture Notes 1: Decisions and Data. In these notes, I describe some basic ideas in decision theory. theory is constructed from

Lecture Notes 1: Decisions and Data. In these notes, I describe some basic ideas in decision theory. theory is constructed from Topics in Data Analysis Steven N. Durlauf University of Wisconsin Lecture Notes : Decisions and Data In these notes, I describe some basic ideas in decision theory. theory is constructed from The Data:

More information

Notes on uniform convergence

Notes on uniform convergence Notes on uniform convergence Erik Wahlén erik.wahlen@math.lu.se January 17, 2012 1 Numerical sequences We begin by recalling some properties of numerical sequences. By a numerical sequence we simply mean

More information

Optimality of Walrand-Varaiya Type Policies and. Approximation Results for Zero-Delay Coding of. Markov Sources. Richard G. Wood

Optimality of Walrand-Varaiya Type Policies and. Approximation Results for Zero-Delay Coding of. Markov Sources. Richard G. Wood Optimality of Walrand-Varaiya Type Policies and Approximation Results for Zero-Delay Coding of Markov Sources by Richard G. Wood A thesis submitted to the Department of Mathematics & Statistics in conformity

More information

Dynamic Games with Asymmetric Information: Common Information Based Perfect Bayesian Equilibria and Sequential Decomposition

Dynamic Games with Asymmetric Information: Common Information Based Perfect Bayesian Equilibria and Sequential Decomposition Dynamic Games with Asymmetric Information: Common Information Based Perfect Bayesian Equilibria and Sequential Decomposition 1 arxiv:1510.07001v1 [cs.gt] 23 Oct 2015 Yi Ouyang, Hamidreza Tavafoghi and

More information