FINITE HORIZON DECISION TIMING WITH PARTIALLY OBSERVABLE POISSON PROCESSES

MICHAEL LUDKOVSKI AND SEMIH O. SEZER

Abstract. We study decision timing problems on finite horizon with Poissonian information arrivals. In our model, a decision maker wishes to optimally time her action in order to maximize her expected reward. The reward depends on an unobservable Markovian environment, and information about the environment is collected through a (compound) Poisson observation process. Examples of such systems arise in investment timing, reliability theory, Bayesian regime detection and technology adoption models. We solve the problem by studying an optimal stopping problem for a piecewise-deterministic process, which gives the posterior likelihoods of the unobservable environment. Our method lends itself to simple numerical implementation, and we present several illustrative numerical examples.

2000 Mathematics Subject Classification. Primary 62L10; Secondary 62L15, 62C10, 60G40.
Key words and phrases. Markov-modulated Poisson processes, Bayesian sequential analysis, optimal stopping, decision making.

1. Introduction

Decision timing under uncertainty is one of the fundamental problems in Operations Research. In a typical setting, an economic agent (called the decision-maker, or DM) has a set of possible actions A, where each action has a (random) reward associated with it. The objective of the DM is to select a single action and time it so as to maximize her expected reward. More precisely, the DM picks a stopping time τ and an action k from the set A at τ. The reward H that the DM receives is a function of the pair (τ, k), as well as of some stochastic state variable Y. In classical examples (e.g. investment timing, American option pricing, natural resource management), Y is an observable stochastic process (e.g. asset prices, market demand), and the DM's objective is a standard optimal stopping problem. More complicated stopping problems involving unobserved system states have also been considered in the literature; see, for example, Bather 1973, Monahan 1980, Jensen 1982, McCardle 1985, Mazziotto 1986, Jensen and Hsu 1993, Stadje 1997, Schöttl 1998, Fuh 2003, Décamps et al. 2005, Dayanik and Goulding 2009. Such models are especially natural when one wishes to capture the inherent conflict between gathering information (which makes waiting valuable) and the time-value of money (which makes waiting costly). Indeed, most realistic settings involve a DM who is only partially aware of the environment and must collect data before making a decision.

In a multi-period setting, it is natural to capture this uncertainty in the environment through an unobservable stochastic process M := {M_t}_{t≥0}, where M_t represents the state of the world at time t. The DM starts with an initial guess about M, collects information via relevant news, and updates her beliefs. At the time of decision she then receives a reward that depends on the present environment, H = H(τ, k, M_τ). In such problems, a common approach is to postulate that the process M is a partially observable (Markov decision) process (POMDP), in which case we have a hidden Markov model (HMM). Such models have been studied extensively both in discrete and continuous time.

The reader may refer to Bertsekas 1976, Monahan 1982, Elliott et al. 1995 and Cappé et al. 2005 for a comprehensive treatment of discrete-time models and also for many other references on the subject. For continuous-time models and applications, Bensoussan 1992, Liptser and Shiryaev 2001, Mamon and Elliott 2007 and Elliott et al. 1995, and the references therein, can be consulted.

In continuous-time models, if news (such as changes in asset prices) arrives in infinitesimal amounts, then it is intuitive to have a continuum of information, which is typically captured by the filtration of an observed diffusion process. However, in many instances, a more realistic representation is to use discrete information amounts. Corporate developments, engineering failures, insurance claims, and economic surveys are all discrete events, and the corresponding news arrive in chunks. Note that discreteness of information is distinct from discreteness of time. The model is still in continuous time, since the events may take place at any instant. However, each event carries a strictly positive amount of information. Moreover, "no news" is still informative and affects the beliefs of the DM.

Mathematically, discrete information in continuous time may be represented by the filtration of an observed marked point process (MPP). In such a model, the instantaneous arrival intensity and the distribution of the marks typically depend on the current state of the process M. That is, the observable point process encodes information about the hidden environment M via its arrival times and/or marks. Filtering with continuous-time point process observations has been considered in Brémaud 1981, Arjas et al. 1992, Elliott and Malcolm 2005, and it is known that the dynamics of the conditional probabilities of M are of the piecewise-deterministic process (PDP) type. In other words, the DM's beliefs evolve deterministically between arrivals of new information, and experience random jumps at event times. From the control perspective, various aspects of optimal stopping of PDPs have been studied by Lenhart and Liao 1985, Gugerli 1986 and Costa and Davis 1988.

In this paper, we study a class of finite-horizon decision-making problems within the PDP framework by considering a general partially-observable regime-switching model with Poisson information arrivals. More precisely, we consider a setting where the observations of the DM come from a compound Poisson process X with arrival rate λ and mark/jump distribution ν. The local characteristics (λ, ν) of X are modulated by the current state of an unobservable finite-state Markov process M. In this setting, the DM can stop at any time τ less than some horizon T < ∞ and select an action from a set A := {1, ..., a}. Action k ∈ A yields a terminal reward (cost) equal to ∑_{i} µ_{k,i} 1_{M_τ = i}, as a function of the unobservable state of M. Here, µ_{k,i} is a given finite number, which can also be interpreted as the expected value of an independent random variable Φ_{k,i} representing the uncertain payoff of taking action k when M_t = i. The DM may alternatively delay her decision and continue to observe the process X in order to collect more information, or in order to stop later when M appears to be in a better state. Delaying the decision carries penalties (rewards) due to the cost of observation or lost opportunity (or operating revenues). We allow these terms to depend on M, and we assume that an amount with present value ∫_0^τ e^{−ρt} ( ∑_{i} c_i 1_{M_t = i} ) dt is accumulated until the decision time τ.
Here ρ ≥ 0 is the discount factor, and c_i is the instantaneous cost or revenue of running the system when M is in state i ∈ E. Also, we allow ρ to be zero; this makes the formulation suitable for non-financial applications where the quality of the decision is more important than its timing. In this setup, the objective of the DM is to find an admissible pair (τ, d) that maximizes her total expected reward and resolves the trade-off between exploring (getting more observations) and exploiting (engaging in an action).

Since the DM collects information by observing X, τ must be a stopping time of the filtration F^X generated by X. Also, the decision d should be measurable with respect to the information F_τ^X revealed until τ. Let π = (π_1, ..., π_n) := (P(M_0 = 1), ..., P(M_0 = n)) denote the prior beliefs of the DM about the initial state of M, and let P^π be the corresponding probability measure. Then the objective of the DM is to compute

(1.1)   U(T, π) := sup_{τ ≤ T, d} E^π [ ∫_0^τ e^{−ρt} ( ∑_{i∈E} c_i 1_{M_t = i} ) dt + e^{−ρτ} ∑_{k∈A} 1_{d = k} ∑_{i∈E} µ_{k,i} 1_{M_τ = i} ],

and, if it exists, find a pair (τ, d) attaining this value.

In our paper, we solve the problem in (1.1) in its general form without any restrictive assumption. We give a full characterization of the value function with a direct proof of the dynamic programming principle. We also identify optimal and ε-optimal policies for the DM. Moreover, we study the qualitative properties of the solution structure and provide a numerical approach that can be readily implemented; see Sections 4 and 5.

Special cases of this optimal stopping problem have been considered by Jensen and Hsu 1993 in connection with system reliability studies, Jensen 1997 and Schöttl 1998 in the context of insurance premium re-pricing, and Peskir and Shiryaev 2000, Gapeev 2002, Bayraktar et al. 2006, Dayanik et al. 2008a for classical Poisson disorder and regime detection problems. This line of work links together the MPP filtering literature with the PDP results of Lenhart and Liao 1985, Gugerli 1986, Costa and Davis 1988. Here, we extend these works in three major directions. First, we consider a general continuous-time finite-state Markov chain for the environment M, and impose no restrictions on the arrival rate and mark distribution of the observed MPP X. Second, we consider a general discount/cost structure that can be used to encode a variety of economic objectives. So far, all the aforementioned papers have dealt only with special cases by imposing additional assumptions on either X, or M, or the c_i, ρ, µ_{k,i}'s. Finally, we work in the context of finite horizon, where value functions are time-inhomogeneous. The introduction of time-to-maturity as a state variable adds analytical complexity and leads to the appearance of novel effects that are not possible with infinite-horizon stationary models.

In Section 5, we illustrate the strength and wide applicability of our approach on two key applications. In Section 5.1, we revisit the machine replacement problem of Jensen and Hsu 1993 in the finite horizon setting and without their assumptions. Next, in Section 5.2, we give the solution of the finite horizon formulation of the hypothesis-testing problem studied in Peskir and Shiryaev 2000, Gapeev 2002 and Dayanik et al. 2008a. In the Bayesian sequential analysis literature, continuous-time change-detection and hypothesis-testing problems have attracted considerable interest, especially for Poisson and Wiener processes; see for example Dayanik et al. 2008a,b and the references therein. Earlier works in this field study these two problems on the infinite horizon. One exception is Gapeev and Peskir 2004, which solves the finite horizon change-detection problem for the Wiener process, long after its infinite horizon formulation was solved by Shiryaev. Our analysis in Sections 3 and 4 gives the solutions of these problems for the compound Poisson process as an immediate corollary, and this is another contribution of our paper.

This paper is organized as follows.
Below in Section 2, we describe the formal setting of our model and then show that the problem in (1.1) is equivalent to an optimal stopping problem in terms of the conditional probability process, which is a piecewise-deterministic process. In Section 3, we describe how the value function of this stopping problem can be computed via a sequential procedure. The results of Section 3 are used in Section 4 in order to identify an optimal strategy and study its properties. Following this, in Section 5 we give examples illustrating our results.

Finally, the Appendices at the end include supplementary proofs and additional remarks. Appendix A extends our model to the case of discrete costs incurred at each event time, and Appendix B comments on the relationship between finite- and infinite-horizon problems and optimal controls.

2. Problem Statement

2.1. Model. Let (Ω, H, P) be a probability space hosting a continuous-time Markov process M taking values in E := {1, ..., n}, for n ∈ N, with infinitesimal generator Q = (q_{ij})_{i,j∈E}. Also, we have a collection of independent compound Poisson processes X^{(1)}, ..., X^{(n)} with local parameters (λ_1, ν_1), ..., (λ_n, ν_n), respectively. In terms of these independent processes, we define the observation process

(2.1)   X_t := X_0 + ∑_{i∈E} ∫_{(0,t]} 1_{M_s = i} dX_s^{(i)},   t ≥ 0,

which is a Markov-modulated Poisson process, also called a Cox process (see Cox and Isham 1980 and Grandell 1976). In the remainder, we let σ_0, σ_1, ... denote the arrival times of the process X,

σ_m := inf{ t > σ_{m−1} : X_t ≠ X_{t−} },   m ≥ 1,

with σ_0 := 0, and the variables Y_1, Y_2, ... denote the R^d-valued marks observed at these arrival times: Y_m := X_{σ_m} − X_{σ_m−}, m ≥ 1. Finally, to compute relative likelihoods of different marks, we introduce the measure ν defined as ν := ν_1 + ... + ν_n, and we let f_i(·) denote the density of ν_i with respect to ν.

2.2. Conditional probability process. For a point π in D := { π ∈ R_+^n : π_1 + ... + π_n = 1 }, let P^π denote the probability measure (with expectation operator E^π) under which M has initial distribution π. Moreover, let F := {F_t^X}_{t≥0} be the filtration of the process X in (2.1). With this notation, we define the D-valued conditional probability process Π_t := (Π_t^{(1)}, ..., Π_t^{(n)}) such that

(2.2)   Π_t^{(i)} := P^π{ M_t = i | F_t^X },   for i ∈ E and t ≥ 0.

The process Π is clearly adapted to F, and each component gives the conditional probability that the current state of M is i given the information generated by X until the current time t. Moreover, using standard arguments as in Shiryaev 1978 and Dayanik et al. 2008a, Proof of Proposition 2.1, it can be shown that the problem in (1.1) is equivalent to a fully observed optimal stopping problem with the process Π as the new hyperstate. More precisely, the value function U in (1.1) can be written as

(2.3)   U(T, π) = V(T, π) := sup_{τ ≤ T} E^π [ ∫_0^τ e^{−ρt} C(Π_t) dt + e^{−ρτ} H(Π_τ) ],

in terms of the functions

(2.4)   C(π) := ∑_{i∈E} c_i π_i   and   H(π) := max_{k∈A} H_k(π),   where H_k(π) := ∑_{i∈E} µ_{k,i} π_i.

If there is a stopping time τ attaining the supremum in (2.3), then the admissible strategy (τ, d(τ)) is an optimal rule for the problem in (1.1) if we define

(2.5)   d(τ) ∈ arg max_{k∈A} H_k(Π_τ).
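To make the observation model concrete, the following minimal sketch simulates the Markov-modulated process X of (2.1) by time discretization. It is our illustration, not the paper's: the two-state generator, rates, and standard normal marks standing in for the ν_i are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state example: Q and lam are illustrative, and N(0, 1)
# marks stand in for the nu_i.  Over one time step dt, the regime switches
# with probability -Q[m, m] * dt and an arrival occurs with probability
# lam[m] * dt.
Q = np.array([[-0.5, 0.5],
              [ 0.2, -0.2]])
lam = np.array([1.0, 4.0])

def simulate_mmpp(T=10.0, dt=1e-3, m0=0):
    """Return the list of (arrival time sigma_k, mark Y_k) of X on (0, T]."""
    t, m, arrivals = 0.0, m0, []
    while t < T:
        t += dt
        if rng.random() < -Q[m, m] * dt:        # M switches regime
            m = 1 - m                           # two states: jump to the other
        if rng.random() < lam[m] * dt:          # arrival of X at rate lam_m
            arrivals.append((t, rng.normal()))  # record (sigma_k, Y_k)
    return arrivals

print(len(simulate_mmpp()))                     # number of arrivals N_T
```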

2.3. Sample paths of Π. Let us take a sample path of the observation process X in which m arrivals are observed on (0, t], and let (t_k)_{k≤m} denote those arrival times. If we knew that the process M stayed at state i without any transition, then the (conditional) likelihood of this path would be

P^π{ σ_k ∈ dt_k, Y_k ∈ dy_k; k ≤ m | M_s = i, s ≤ t } = λ_i e^{−λ_i t_1} dt_1 ⋯ λ_i e^{−λ_i (t_m − t_{m−1})} dt_m · e^{−λ_i (t − t_m)} ∏_{k=1}^m f_i(y_k) ν(dy_k) = e^{−λ_i t} ∏_{k=1}^m λ_i dt_k f_i(y_k) ν(dy_k).

By construction, the observation process X has independent increments conditionally on M = {M_t}_{t≥0}. Therefore, we have

(2.6)   1_{M_t = i} P^π{ σ_k ∈ dt_k, Y_k ∈ dy_k; k ≤ m | M_s; s ≤ t } = 1_{M_t = i} exp( −∫_0^t ∑_{j∈E} λ_j 1_{M_s = j} ds ) ∏_{k=1}^m ( ∑_{j∈E} 1_{M_{t_k} = j} λ_j f_j(y_k) ) dt_k ν(dy_k).

By taking expectations of the expressions above, we obtain the unconditional likelihoods, in terms of which we give an explicit representation for the process Π in Lemma 2.1 below.

Lemma 2.1. For i ∈ E, let us define

(2.7)   L_i^π( t, m; (t_k, y_k), k ≤ m ) := E^π [ 1_{M_t = i} e^{−I(t)} ∏_{k=1}^m l(t_k, y_k) ],

where

(2.8)   I(t) := ∫_0^t ∑_{i=1}^n λ_i 1_{M_s = i} ds   and   l(t, y) := ∑_{j∈E} 1_{M_t = j} λ_j f_j(y).

Also, let L^π( t, m; (t_k, y_k), k ≤ m ) := ∑_{j∈E} L_j^π( t, m; (t_k, y_k), k ≤ m ). Then we have

(2.9)   Π_t^{(i)} = L_i^π( t, N_t; (σ_k, Y_k), k ≤ N_t ) / L^π( t, N_t; (σ_k, Y_k), k ≤ N_t ),   P^π-a.s.,

for all t ≥ 0 and i ∈ E; that is, the ratio L_i^π/L^π evaluated at m = N_t and (t_k, y_k) = (σ_k, Y_k), k ≤ m.

Lemma 2.1 indicates that the conditional probability of M_t being in state i is simply the relative likelihood of the observed path until t on the event {M_t = i}. Using the explicit form in (2.9), we describe the behavior of the sample paths of Π in Remark 2.1 below.

Remark 2.1. The process Π has piecewise-deterministic sample paths: between two arrival times of X it moves deterministically, and at an arrival time it jumps from one point to another depending on the observed mark size (see Figure 1). In precise terms, the sample paths have the characterization

(2.10)   Π(t) = x( t − σ_m, Π(σ_m) ) for σ_m ≤ t < σ_{m+1}, and Π(σ_m) = R( Π(σ_m−), Y_m ),   m ∈ N,

where x(t, π) := (x_1(t, π), ..., x_n(t, π)) is defined as

(2.11)   x_i(t, π) := E^π[ 1_{M_t = i} e^{−I(t)} ] / E^π[ e^{−I(t)} ] = P^π{ σ_1 > t, M_t = i } / P^π{ σ_1 > t },   for i ∈ E,

and R(π, y) is defined by

(2.12)   R(π, y) := ( λ_1 π_1 f_1(y) / ∑_{j∈E} λ_j π_j f_j(y), ..., λ_n π_n f_n(y) / ∑_{j∈E} λ_j π_j f_j(y) ).

Note that the paths t ↦ x(t, π) have the semigroup property x(t + u, π) = x(u, x(t, π)) for t, u ≥ 0. The i-th component of the vector flow x_i(·, ·) indicates how likely it is to have a period (0, t] without any arrival on the event {M_t = i}. Moreover, by analysis similar to Dayanik et al. 2008a, Section 2, Π is a (P^π, F)-Markov process for every π ∈ D.

Figure 1. Sample paths of the process Π for four different examples. Solid lines represent actual sample paths. Dashed lines in panels (c) and (d) are the deterministic parts in (2.11). In panels (a) and (b) there are two hidden states, and in panels (c) and (d) there are three. The examples use generators Q_a, Q_b, Q_c, Q_d with arrival rates λ_a = (1, 2), λ_b = (1, 4), λ_c = (1, 2, 3), λ_d = (1, 3, 5). In each example, jumps of the process X are always of unit size.
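The flow x(·, ·) and the jump operator R(·, ·) are straightforward to evaluate numerically. The sketch below is our illustration (the two-state parameters and the Gaussian mark densities are hypothetical): it computes x(t, π) through the exponential formula m(t, π) = π e^{t(Q−Λ)} of Corollary 2.1 below, and applies R(π, y) of (2.12).

```python
import numpy as np
from scipy.linalg import expm
from scipy.stats import norm

# Hypothetical two-state example.  Raw Gaussian densities stand in for the
# f_i: the common normalization by nu cancels inside R(pi, y), so using the
# densities of nu_1, nu_2 directly gives the same posterior update.
Q = np.array([[-0.5, 0.5], [0.2, -0.2]])
lam = np.array([1.0, 4.0])
Lam = np.diag(lam)
f = [norm(0.0, 1.0).pdf, norm(1.0, 1.0).pdf]

def flow(t, pi):
    """Deterministic part x(t, pi) = m(t, pi) / sum_j m_j(t, pi)."""
    m = pi @ expm(t * (Q - Lam))
    return m / m.sum()

def jump(pi, y):
    """Jump operator R(pi, y) of (2.12) applied at an arrival with mark y."""
    w = np.array([lam[i] * pi[i] * f[i](y) for i in range(len(pi))])
    return w / w.sum()

pi = np.array([0.5, 0.5])
pi = flow(0.3, pi)    # no arrival over a period of length 0.3
pi = jump(pi, 0.7)    # then an arrival with mark y = 0.7
print(pi)
```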

Corollary 2.1. Using an infinitesimal last-step analysis, it can be shown (see, for example, Darroch and Morris 1968, page 416, and Karlin and Taylor 1998, Chapter 6.7) that the vector

(2.13)   m(t, π) := ( m_1(t, π), ..., m_n(t, π) ) := ( E^π[ 1_{M_t = 1} e^{−I(t)} ], ..., E^π[ 1_{M_t = n} e^{−I(t)} ] )

has the form m(t, π) = π e^{t(Q − Λ)} in terms of the n × n diagonal matrix Λ with Λ_{i,i} := λ_i, and the components of m(t, π) satisfy dm_i(t, π)/dt = −λ_i m_i(t, π) + ∑_{j∈E} m_j(t, π) q_{j,i}. Then, thanks to the chain rule and (2.11), we have

(2.14)   dx_i(t, π)/dt = ∑_{j=1}^n q_{j,i} x_j(t, π) − λ_i x_i(t, π) + x_i(t, π) ∑_{j=1}^n λ_j x_j(t, π).

Hence, the process Π in (2.10) has the dynamics

(2.15)   dΠ_t^{(i)} = [ ∑_{j=1}^n q_{j,i} Π_t^{(j)} − λ_i Π_t^{(i)} + Π_t^{(i)} ∑_{j=1}^n λ_j Π_t^{(j)} ] dt + ∫_{R^d} [ λ_i f_i(y) Π_{t−}^{(i)} / ∑_{j∈E} λ_j f_j(y) Π_{t−}^{(j)} − Π_{t−}^{(i)} ] p(dt, dy),   i ∈ E,

where p(·, ·) is the point process given by p( (0, t] × B ) := ∑_{i∈N} 1_{(0,t]×B}(σ_i, Y_i), for every Borel set B ∈ B(R^d) and t ≥ 0.

3. Constructing the Value Function

The characterization of the sample paths in (2.15) and the general theory of optimal stopping (see, for example, Bensoussan 1992, Lenhart and Liao 1985) imply that the free-boundary problem associated with the optimal stopping problem in (2.3) has the form

(3.1)   max{ (−ρ + L) f(s, π) + C(π) ; H(π) − f(s, π) } = 0,

in terms of the infinitesimal generator

L f(s, π) = −∂f(s, π)/∂s + ∑_{i∈E} ( ∑_{j∈E} q_{j,i} π_j − λ_i π_i + π_i ∑_{j∈E} λ_j π_j ) ∂f(s, π)/∂π_i + ∑_{i∈E} π_i λ_i ∫_{y∈R^d} ( f(s, R(π, y)) − f(s, π) ) ν_i(dy),

acting on (smooth) functions f(·, ·) on [0, T] × D. Studying the equation (−ρ + L)f(s, π) + C(π) = 0 and determining the stopping regions is not easy even when n = 2; see, for example, Peskir and Shiryaev 2000, which solves a free-boundary problem similar to (3.1) for an infinite horizon problem with n = 2. Moreover, it is known that the value function of such a stopping problem may not be differentiable at every point of its domain, as illustrated in Dayanik and Sezer 2005, in which case the equation (3.1) should be considered in the viscosity sense. Instead of studying the problem in (3.1), we will employ a sequential approximation technique to compute the value function, following Gugerli 1986 and Davis 1993, Chapter 5. A similar approach is taken in Bayraktar et al. 2006 and Dayanik et al. 2008a for the disorder-detection and hypothesis-testing problems, respectively, on infinite horizon. Below, we tailor this method to the finite-horizon setting. We focus on the non-trivial modifications that arise due to time-dependent operators and the more general form of M, and otherwise refer to the results of Dayanik et al. 2008a. All the proofs are delegated to the Appendix.

3.1. A sequential approximation. Let us first define the sequence of functions

(3.2)   V(s, π) := sup_{τ ≤ s} E^π [ ∫_0^τ e^{−ρt} C(Π_t) dt + e^{−ρτ} H(Π_τ) ],   and
        V_m(s, π) := sup_{τ ≤ s} E^π [ ∫_0^{τ∧σ_m} e^{−ρt} C(Π_t) dt + e^{−ρ(τ∧σ_m)} H(Π_{τ∧σ_m}) ],   for m ∈ N,

on the domain [0, T] × D, where the first argument s should be considered as the remaining time to maturity. Proposition 3.1 below shows that the V_m's converge to V uniformly; see also the proof of Davis 1993, Theorem 53.4 and Dayanik et al. 2008a, Proposition 3.1 for related results. Proposition 3.1 generalizes these results to the finite horizon case.

Proposition 3.1. The sequence {V_m}_{m≥1} converges to V uniformly on [0, T] × D. More precisely, we have

(3.3)   V_m(s, π) ≤ V(s, π) ≤ V_m(s, π) + ( T‖C‖ + 2‖H‖ ) ( λ̄T/(m−1) )^{1/2} ( λ̄/(2ρ + λ̄) )^{m/2},

for all (s, π) ∈ [0, T] × D and m ∈ N, where ‖C‖ := max_{π∈D} |C(π)|, ‖H‖ := max_{π∈D} |H(π)| and λ̄ := max_{i∈E} λ_i.

Let us consider the second problem in (3.2) for fixed m ∈ N, and let τ ≤ s be an F-stopping time. Note that the first arrival time σ_1 is a regeneration time of the Markov process Π; therefore, on the event {τ ≥ σ_1}, the maximal expected reward that the DM can achieve after σ_1 should be V_{m−1}(s − σ_1, Π_{σ_1}). Define the operator

(3.4)   Jw(τ, s, π) := E^π [ ∫_0^{τ∧σ_1} e^{−ρt} C(Π_t) dt + 1_{τ<σ_1} e^{−ρτ} H(Π_τ) + 1_{σ_1≤τ} e^{−ρσ_1} w(s − σ_1, Π_{σ_1}) ].

Then, the dynamic programming intuition suggests that V_m(·) should solve the equation V_m(s, π) = 𝒥V_{m−1}(s, π), where the operator 𝒥 is defined as

(3.5)   𝒥w(s, π) := sup_{τ ≤ s} Jw(τ, s, π) = sup_{t∈[0,s]} Jw(t, s, π)

for a bounded function w : [0, T] × D → R. The second equality in (3.5) is due to the characterization of F-stopping times (Davis 1993, Lemma A2.3, p. 261), whereby for every m ∈ N there exists an F_{σ_m}^X-measurable random variable R_m such that τ ∧ σ_{m+1} = (σ_m + R_m) ∧ σ_{m+1}, P-a.s. on {τ ≥ σ_m}.

Note that, with the notation in (2.13), we have P^π{σ_1 > u} = E^π[ e^{−I(u)} ] and P^π{ σ_1 ∈ du, M_u = i } = E^π[ λ_i 1_{M_u = i} e^{−I(u)} ] du = λ_i m_i(u, π) du, and using the characterization of the paths in (2.10) and (2.14), the operator J in (3.4) can be rewritten as

(3.6)   Jw(t, s, π) = ∑_{i∈E} m_i(t, π) e^{−ρt} H( x(t, π) ) + ∫_0^t e^{−ρu} ∑_{i∈E} m_i(u, π) [ C( x(u, π) ) + λ_i S_i w( s − u, x(u, π) ) ] du,

in terms of the operators (see (2.12))

(3.7)   S_i w(t, π) := ∫_{R^d} w( t, R(π, y) ) f_i(y) ν(dy),   for i ∈ E.
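For concreteness, here is a deliberately crude numerical sketch of one application of 𝒥 evaluated through (3.6)-(3.7), for a hypothetical two-state model observed through a simple Poisson process with unit marks, in which case the densities f_i cancel and S_i w(t, π) = w(t, R(π)) with R(π) := (λ_1π_1, λ_2π_2)/∑_j λ_jπ_j. Repeated application starting from v_0 = H implements the sequential approximation (see (3.8) below). The parameters, grids and nearest-node interpolation are illustrative choices, not the paper's.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical two-state model observed through a simple Poisson process
# with unit marks; all numerical values below are illustrative.
Q = np.array([[-1.0, 1.0], [0.0, 0.0]])        # generator of M, state 2 absorbing
lam = np.array([2.0, 4.0])                     # arrival rates
c = np.array([0.0, -1.0])                      # running rates c_i
mu = np.array([[1.0, -1.0], [-1.0, 1.0]])      # terminal rewards mu_{k,i}
rho, T = 0.1, 1.0

H = lambda pi: (mu @ pi).max()                 # H(pi) = max_k sum_i mu_{k,i} pi_i
C = lambda pi: c @ pi                          # C(pi) = sum_i c_i pi_i
R = lambda pi: lam * pi / (lam @ pi)           # jump operator for unit marks

_expm_cache = {}
def m_vec(t, pi):                              # m(t, pi) = pi exp(t (Q - Lam))
    if t not in _expm_cache:
        _expm_cache[t] = expm(t * (Q - np.diag(lam)))
    return pi @ _expm_cache[t]

s_grid = np.linspace(0.0, T, 21)               # remaining time to maturity
p_grid = np.linspace(0.0, 1.0, 41)             # pi_1 coordinate of the simplex

def apply_J(w, du=0.01):
    """One application of the operator J of (3.5), evaluated through (3.6)."""
    def w_at(s, pi):                           # crude nearest-node lookup in s
        a = np.abs(s_grid - s).argmin()
        return np.interp(pi[0], p_grid, w[a])
    Jw = np.empty_like(w)
    for a, s in enumerate(s_grid):
        for b, p1 in enumerate(p_grid):
            pi = np.array([p1, 1.0 - p1])
            best, integral = H(pi), 0.0        # the candidate t = 0 gives H(pi)
            for t in np.arange(du, s + du / 2, du):
                m = m_vec(t, pi)
                x = m / m.sum()                # deterministic flow x(t, pi)
                # integrand of (3.6): e^{-rho u} sum_i m_i [C(x) + lam_i S_i w]
                integral += du * np.exp(-rho * t) * (
                    m.sum() * C(x) + (m * lam).sum() * w_at(s - t, R(x)))
                best = max(best, m.sum() * np.exp(-rho * t) * H(x) + integral)
            Jw[a, b] = best
    return Jw

V = np.array([[H(np.array([p, 1.0 - p])) for p in p_grid] for _ in s_grid])
for _ in range(6):                             # v_0 = H and v_{m+1} = J v_m
    V = apply_J(V)
```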

The following lemma provides basic properties of the operator 𝒥.

Lemma 3.1. If w(·, ·) is a bounded continuous function on [0, T] × D, then so is 𝒥w(·, ·). Also, if w_1(·, ·) ≤ w_2(·, ·), then 𝒥w_1(·, ·) ≤ 𝒥w_2(·, ·). Moreover, if the mapping π ↦ w(s, π) is convex for each s ∈ [0, T], then so is π ↦ 𝒥w(s, π) for each s ∈ [0, T].

Let us now define the sequence

(3.8)   v_0(s, π) := H(π),   and   v_{m+1}(s, π) := 𝒥v_m(s, π),   for m ≥ 0,

on [0, T] × D. Thanks to Lemma 3.1, we immediately see that the sequence {v_m(·, ·)}_{m∈N} is non-decreasing; hence the pointwise limit v(·, ·) := sup_{m∈N} v_m(·, ·) is well defined on [0, T] × D. Moreover, again by Lemma 3.1, each v_m(·, ·) is bounded and continuous on [0, T] × D, and the mapping π ↦ v_m(s, π) is convex for each s ∈ [0, T].

Proposition 3.2. The sequences defined in (3.2) and (3.8) coincide. That is, we have v_m(·, ·) = V_m(·, ·) for every m ∈ N.

Corollary 3.1. Each V_m is continuous, and hence their uniform limit (see Proposition 3.1) V(·, ·) is also continuous on [0, T] × D. As the upper envelope of the convex mappings π ↦ v_m(s, π) = V_m(s, π), the mapping π ↦ V(s, π) is again convex for each s ∈ [0, T].

Proposition 3.3 below is the dynamic programming equation for V(·, ·), characterizing the value function as a fixed point of the operator 𝒥 defined in (3.5).

Proposition 3.3. The value function satisfies V(s, π) = 𝒥V(s, π), and it is the smallest bounded solution of this equation greater than H(·).

4. An Optimal Strategy

Recall that the process Π has right-continuous paths (with left limits), and the functions V(·, ·) and H(·) are continuous due to Corollary 3.1. Hence the paths of the process V(s − t, Π_t) − H(Π_t) are also right-continuous and have left limits. Therefore, for ε ≥ 0 the random time

(4.1)   U_ε(s, π) := inf{ t ∈ [0, s] : V(s − t, Π_t) ≤ ε + H(Π_t) }

is a well-defined F-stopping time. We also have U_ε(s, π) ∧ σ_1 = r_ε(s, π) ∧ σ_1, where

(4.2)   r_ε(s, π) := inf{ t ∈ [0, s] : V(s − t, x(t, π)) ≤ ε + H(x(t, π)) },

which can be considered as the deterministic counterpart of (4.1).

Proposition 4.1. The stopping time U_ε(s, π) defined in (4.1) is an ε-optimal stopping time for the problem in (2.3); i.e.,

(4.3)   E^π [ ∫_0^{U_ε(s,π)} e^{−ρt} C(Π_t) dt + e^{−ρ U_ε(s,π)} H( Π_{U_ε(s,π)} ) ] ≥ V(s, π) − ε,

for all ε ≥ 0 and (s, π) ∈ [0, T] × D.
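In practice, once V (or an approximation V_m) has been tabulated, the rule (4.1) is applied path by path: monitor the gap V(s − t, Π_t) − H(Π_t) along the realized posterior trajectory and stop when it first drops to ε or below. A minimal sketch, where V_fn and H_fn are assumed callables (for instance, interpolants built from the iteration of Section 3) and (times, pi_path) is any discretized trajectory of Π:

```python
def first_entry_time(V_fn, H_fn, s, times, pi_path, eps=0.0):
    """U_eps(s, pi_0) of (4.1) along a discretized posterior path.

    Stops at the horizon s at the latest, since V(0, pi) = H(pi)."""
    for t, pi in zip(times, pi_path):
        if t > s:
            break
        if V_fn(s - t, pi) <= eps + H_fn(pi):
            return t
    return s
```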

Before proceeding with the proof of Proposition 4.1, we first state an immediate consequence of this result.

Corollary 4.1. The pair ( U_0(T, π), d(U_0(T, π)) ) is an optimal admissible strategy for the problem in (1.1).

Proof of Proposition 4.1. Let us define

(4.4)   Z_t := ∫_0^t e^{−ρu} C(Π_u) du + e^{−ρt} V(s − t, Π_t),   t ∈ [0, s],

which is a bounded process on t ∈ [0, s], s ≤ T. The ε-optimality of U_ε(s, π) follows easily once we establish

(4.5)   E^π[ Z_{U_ε(s,π)} ] = Z_0,

since this equality would imply

(4.6)   V(s, π) = E^π[ Z_{U_ε(s,π)} ] = E^π [ ∫_0^{U_ε(s,π)} e^{−ρt} C(Π_t) dt + e^{−ρ U_ε(s,π)} V( s − U_ε(s,π), Π_{U_ε(s,π)} ) ] ≤ E^π [ ∫_0^{U_ε(s,π)} e^{−ρt} C(Π_t) dt + e^{−ρ U_ε(s,π)} H( Π_{U_ε(s,π)} ) ] + ε,

due to the regularity of the paths t ↦ V(s − t, Π_t) − H(Π_t). In the remainder of the proof we show (4.5) by establishing E^π[ Z_{U_ε(s,π) ∧ σ_m} ] = Z_0, for m = 1, 2, ..., inductively. After taking the limit as m → ∞ in this equality, we obtain (4.5) due to the bounded convergence theorem. For typographical convenience we write r_ε := r_ε(s, π) and U_ε := U_ε(s, π).

First, we consider m = 1. Recall that U_ε(s, π) ∧ σ_1 = r_ε ∧ σ_1. Then

(4.7)   E^π[ Z_{U_ε∧σ_1} ] = E^π[ Z_{r_ε∧σ_1} ] = E^π [ ∫_0^{r_ε∧σ_1} e^{−ρt} C(Π_t) dt + 1_{σ_1≤r_ε} e^{−ρσ_1} V(s − σ_1, Π_{σ_1}) + 1_{σ_1>r_ε} e^{−ρr_ε} H(Π_{r_ε}) + 1_{σ_1>r_ε} e^{−ρr_ε} ( V(s − r_ε, Π_{r_ε}) − H(Π_{r_ε}) ) ] = JV(r_ε, s, π) + e^{−ρr_ε} P^π{σ_1 > r_ε} ( V(s − r_ε, x(r_ε, π)) − H(x(r_ε, π)) ),

where we used Proposition 3.3. Analogously to Dayanik et al. 2008a, Lemma 3.8, we have that for deterministic times u ≤ t ≤ s and for a bounded function w(·, ·),

(4.8)   Jw(t, s, π) = Jw(u, s, π) + P^π{σ_1 > u} e^{−ρu} ( Jw(t − u, s − u, x(u, π)) − H(x(u, π)) ).

For t < r_ε(s, π), we have V(s − t, x(t, π)) − H(x(t, π)) > ε. Then (4.8) yields

JV(t, s, π) ≤ sup_{u∈[t,s]} JV(u, s, π) − ε P^π{σ_1 > t} e^{−ρt} ≤ sup_{u∈[0,s]} JV(u, s, π) − ε P^π{σ_1 > t} e^{−ρt} < sup_{u∈[0,s]} JV(u, s, π).

Therefore, the supremum in sup_{t∈[0,s]} JV(t, s, π) must be achieved on [r_ε(s, π), s], and combining (4.8) with (4.7) we get

E^π[ Z_{U_ε∧σ_1} ] = sup_{u∈[r_ε,s]} JV(u, s, π) = 𝒥V(s, π) = V(s, π) = Z_0.

Now suppose by induction that E^π[ Z_{U_ε(s,π) ∧ σ_m} ] = Z_0 for m ≥ 1, and consider the equality

(4.9)   E^π[ Z_{U_ε∧σ_{m+1}} ] = E^π[ 1_{U_ε<σ_1} Z_{U_ε} + 1_{U_ε≥σ_1} Z_{U_ε∧σ_{m+1}} ] = E^π [ 1_{U_ε<σ_1} ( ∫_0^{U_ε} e^{−ρt} C(Π_t) dt + e^{−ρU_ε} V(s − U_ε, Π_{U_ε}) ) + 1_{U_ε≥σ_1} ( ∫_0^{U_ε∧σ_{m+1}} e^{−ρt} C(Π_t) dt + e^{−ρ(U_ε∧σ_{m+1})} V( s − U_ε∧σ_{m+1}, Π_{U_ε∧σ_{m+1}} ) ) ].

On the event {U_ε ≥ σ_1}, we have U_ε ∧ σ_{m+1} = σ_1 + (U_ε ∧ σ_m) ∘ θ_{σ_1}, where θ is the time-shift operator on Ω; i.e., X_t ∘ θ_s = X_{t+s}. Using the strong Markov property of Π, equation (4.9) becomes

(4.10)   E^π[ Z_{U_ε∧σ_{m+1}} ] = E^π [ 1_{U_ε<σ_1} ( ∫_0^{U_ε} e^{−ρt} C(Π_t) dt + e^{−ρU_ε} V(s − U_ε, Π_{U_ε}) ) + 1_{U_ε≥σ_1} ( ∫_0^{σ_1} e^{−ρt} C(Π_t) dt + e^{−ρσ_1} η(s − σ_1, Π_{σ_1}) ) ],

where

(4.11)   η(u, π) := E^π [ ∫_0^{U_ε∧σ_m} e^{−ρt} C(Π_t) dt + e^{−ρ(U_ε∧σ_m)} V( u − U_ε∧σ_m, Π_{U_ε∧σ_m} ) ] = V(u, π),

thanks to the induction hypothesis for m. Combining (4.10) and (4.11) and the definition of Z in (4.4), we get

E^π[ Z_{U_ε∧σ_{m+1}} ] = E^π[ 1_{U_ε<σ_1} Z_{U_ε} + 1_{U_ε≥σ_1} Z_{σ_1} ] = E^π[ Z_{U_ε∧σ_1} ] = Z_0,

where the last equality follows from our result for m = 1. This completes the induction step.

4.1. A nearly-optimal strategy. On a practical level, one cannot compute V directly, but instead computes the approximate value functions V_m defined in (3.2) and employs the corresponding nearly-optimal strategies (see (4.12)). It is therefore important to know the error associated with this approximation. For a given error level ε > 0, let us fix

m := inf{ k ∈ N : ( T‖C‖ + 2‖H‖ ) ( λ̄T/(k−1) )^{1/2} ( λ̄/(2ρ + λ̄) )^{k/2} ≤ ε/2 },

so that ‖V_m − V‖ ≤ ε/2 on [0, T] × D via (3.3). Next, let us define the stopping times

(4.12)   U^{(m)}_{ε/2}(s, π) := inf{ t ∈ [0, s] : V_m(s − t, Π_t) ≤ ε/2 + H(Π_t) }.

The regularity of the paths t ↦ Π_t implies that V( s − U^{(m)}_{ε/2}(s, π), Π_{U^{(m)}_{ε/2}(s,π)} ) − H( Π_{U^{(m)}_{ε/2}(s,π)} ) ≤ ε. Then the arguments in the proof of Proposition 4.1 (see (4.4), (4.5) and (4.6)) can easily be modified to show that

(4.13)   V(s, π) = E^π [ ∫_0^{U^{(m)}_{ε/2}(s,π)} e^{−ρt} C(Π_t) dt + e^{−ρ U^{(m)}_{ε/2}(s,π)} V( s − U^{(m)}_{ε/2}(s,π), Π_{U^{(m)}_{ε/2}(s,π)} ) ] ≤ E^π [ ∫_0^{U^{(m)}_{ε/2}(s,π)} e^{−ρt} C(Π_t) dt + e^{−ρ U^{(m)}_{ε/2}(s,π)} H( Π_{U^{(m)}_{ε/2}(s,π)} ) ] + ε.

Hence, if we apply the admissible strategy ( U^{(m)}_{ε/2}(T, π), d(U^{(m)}_{ε/2}(T, π)) ), which requires computing (3.2) only up to the index m defined above, the resulting error is no more than ε.
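The index m above is easy to compute mechanically. A small helper, using the bound (3.3) in the form reconstructed above (the bound is the paper's; the helper and the sample values are our illustration):

```python
import math

def min_iterations(T, C_norm, H_norm, lam_bar, rho, eps):
    """Smallest m with (T*||C|| + 2*||H||) * sqrt(lam_bar*T/(m-1))
    * (lam_bar/(2*rho + lam_bar))**(m/2) <= eps/2, per (3.3)."""
    m = 2
    while ((T * C_norm + 2 * H_norm)
           * math.sqrt(lam_bar * T / (m - 1))
           * (lam_bar / (2 * rho + lam_bar)) ** (m / 2)) > eps / 2:
        m += 1
    return m

# Illustrative inputs: with rho > 0 the second factor decays geometrically,
# and with rho = 0 the first factor still forces termination.
print(min_iterations(T=1.0, C_norm=1.0, H_norm=2.0, lam_bar=4.0, rho=0.1, eps=0.05))
```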

4.2. Stopping and continuation regions. Let

(4.14)   C_T := { (s, π) ∈ [0, T] × D : V(s, π) > H(π) },   Γ_T := { (s, π) ∈ [0, T] × D : V(s, π) = H(π) }

denote the continuation and stopping regions, respectively. The stopping region can further be decomposed as the union ∪_{k∈A} Γ_{T,k} of the regions

(4.15)   Γ_{T,k} := { (s, π) ∈ [0, T] × D : V(s, π) = H_k(π) },   k ∈ A,

where H_k is defined in (2.4). Corollary 4.1 states that in the optimal solution ( U_0(T, π), d(U_0(T, π)) ), one observes the process Π until U_0(T, π), when it enters the region Γ_T. At this time, if Π is in the set Γ_{T,k}, we take d(U_0(T, π)) = k.

Remark 4.1. The definition of the value function V in (2.3) implies that the mapping s ↦ V(s, π) is non-decreasing. Therefore, if (s, π) ∈ Γ_{T,k} for some (s, π) ∈ [0, T] × D, then we have (t, π) ∈ Γ_{T,k} for all t ≤ s.

Remark 4.2. For fixed s ≤ T, let (s, π_1) and (s, π_2) be two points in the region Γ_{T,k}, and let α ∈ (0, 1). As the upper envelope of the convex mappings π ↦ v_m(s, π) (see Corollary 3.1), the mapping π ↦ V(s, π) is convex for each s ∈ [0, T]. Using this property we obtain

H_k( απ_1 + (1−α)π_2 ) ≤ V( s, απ_1 + (1−α)π_2 ) ≤ α V(s, π_1) + (1−α) V(s, π_2) = α H_k(π_1) + (1−α) H_k(π_2) = H_k( απ_1 + (1−α)π_2 ),

which implies that ( s, απ_1 + (1−α)π_2 ) ∈ Γ_{T,k}, and the region Γ_{T,k} ∩ ({s} × D) is convex for each fixed s ≤ T and k ∈ A.

Remark 4.3. Note that Γ_T ⊇ { (0, π) : π ∈ D }. The region { (s, π) ∈ Γ_T : s > 0 } may, however, be empty: in an example where min_i c_i > 0 and the µ_{k,i}'s are all the same, it is never optimal to stop prior to the terminal time T. Moreover, the region { (s, π) ∈ Γ_T : s > 0 } may be non-empty but have an empty interior. For example, in the hypothesis-testing problem discussed in Section 5.2, all the states of the unobservable Markov process are absorbing, and each component Π_t^{(i)} is a martingale. Since the terminal cost function of the corresponding minimization problem (see (2.4)), H(·) = min_{k∈A} H_k(·), is concave, the process H(Π_t) is a super-martingale on [0, T]. If we select ρ = 0 and c_i = 0 for all i ∈ E, there is no penalty associated with a delay in the decision, and it is therefore never optimal to stop early in the interior of D: τ = T is optimal unless π is at a corner of the simplex D.

Lemma 4.1. For i ∈ E, let A(i) := { k ∈ A : µ_{k,i} = max_{j∈A} µ_{j,i} }. If the inequality

c_i − ρµ_{k,i} + ∑_{j≠i} ( µ_{k,j} − µ_{k,i} ) q_{i,j} > 0

holds for all k ∈ A(i), then there exists π_i^c < 1, independent of T, such that { (s, π) ∈ (0, T] × D : π_i ≥ π_i^c } ⊆ C_T.

If the hidden process M is known to be in state i ∈ E, then the expression ρµ_{k,i} is the instantaneous decay of the payoff from selecting action k ∈ A immediately, and c_i is the instantaneous cost (or revenue) of waiting. Moreover, under action k ∈ A, the term ∑_{j≠i} ( µ_{k,j} − µ_{k,i} ) q_{i,j} is the marginal rate of return from waiting for the hidden process M to jump to another state. Therefore, the sum in Lemma 4.1 is the instantaneous net return enjoyed by the DM under action k ∈ A. Lemma 4.1 indicates that if there is strong a posteriori evidence that M is in state i, and if the instantaneous net return is positive under all favorable actions around the i-th corner of D, the decision maker should not stop (unless T = 0).
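The condition of Lemma 4.1 is a finite set of linear inequalities in the model primitives, so it can be checked mechanically. A small sketch (the function itself is our illustration):

```python
import numpy as np

def lemma_4_1_holds(i, Q, c, mu, rho):
    """Check c_i - rho*mu_{k,i} + sum_{j != i} (mu_{k,j} - mu_{k,i}) q_{i,j} > 0
    for every k in A(i) = argmax_k mu_{k,i}; mu has shape (actions, states)."""
    n = Q.shape[0]
    for k in np.flatnonzero(np.isclose(mu[:, i], mu[:, i].max())):
        net = c[i] - rho * mu[k, i] + sum(
            (mu[k, j] - mu[k, i]) * Q[i, j] for j in range(n) if j != i)
        if net <= 0:
            return False
    return True
```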

4.3. Stopping regions for reward maximization with running cost. Here, we consider the problem in (2.3) under the assumptions c_i ≤ 0 (running costs) for i ∈ E, and µ := max_{k,i} µ_{k,i} > 0 (terminal rewards). The second condition is not restrictive if ρ = 0, since we can always add (and subtract) the same constant to (and from) the terminal reward function. Let us define

(4.16)   I := { i ∈ E : max_{k∈A} µ_{k,i} = µ },

which is the set of states of M at which the DM can get the highest terminal reward. Since c_i ≤ 0 for all i ∈ E, we obviously have ∪_{i∈I} { (s, π) : s ∈ [0, T], π_i = 1 } ⊆ Γ_T. In general, if there is a penalty associated with waiting, we expect that it is optimal to stop at the points (s, π) for which the best component π_i, i ∈ I, is sufficiently high, for any s > 0. Lemma 4.2 provides a sufficient condition for this to be true.

Lemma 4.2. Let i ∈ I. If ρ > 0 or c_i < 0, then there exists a constant π_i^s < 1, independent of T, such that Γ_T ⊇ { (s, π) ∈ [0, T] × D : π_i ≥ π_i^s }.

Remark 4.4. If H(·) ≥ 0, the statement of the stopping problem in (2.3) implies that the value function V is non-increasing as a function of the discount factor ρ. If we denote the dependence of the stopping region on ρ by Γ_T(ρ), then we have Γ_T(ρ_1) ⊇ Γ_T(ρ_2) whenever ρ_1 ≥ ρ_2. Moreover, the dynamics of the process Π are independent of ρ, and U_0(s, π) is the hitting time of Π to Γ_T. Therefore, the time that the DM can afford for observing the process X in the presence of a lower discount factor is no less than that spent under heavier discounting. A similar claim also holds for the dependence of U_0(s, π) and Γ_T on the running costs c_i: namely, an observer with lower (in absolute value) running costs stops no sooner than one with heavier running costs.

5. Examples

Below, we revisit the well-known Bayesian regime detection problem and the machine replacement problem of Jensen and Hsu 1993 in our finite horizon setting. For both problems, we also provide numerical solutions, which are obtained by discretizing the domain [0, T] × D of V(·, ·) and solving the fixed point equation V(·, ·) = 𝒥V(·, ·) recursively. We set the number of iterations m ∈ N such that the error ‖V_m(·) − V(·)‖ is negligible (see (3.3)). Our model is applicable in many other settings that have been considered elsewhere, including launch of insurance products (Schöttl 1998), technology adoption (Ulu and Smith 2007) and various disorder detection problems; see Ludkovski and Sezer 2007 for further details and examples.

5.1. Optimal replacement of a system. Here, we consider the reliability problem in Jensen and Hsu 1993, where the aim is to find the best time to replace a machine in order to maximize its lifetime net earnings. The objective is to compute

(5.1)   sup_{τ ≤ T} E^π [ ∫_0^τ ∑_{i∈E} c_i 1_{M_t = i} dt + ∑_{i∈E} µ_i 1_{M_τ = i} ].

In this setting, the observations come from a simple Poisson process representing the number of defective items produced by the machine, and the process M represents the current productivity level. The n-th state (the defective state) is absorbing, while all others are transient. Related models have appeared in Makis and Jiang 2003 and Stadje 1994, and go all the way back to the classical POMDP work of Smallwood and Sondik 1973.

Figure 2. Value function V(T, π) of the reliability example of Section 5.1. The shaded regions represent the stopping regions { π ∈ D : V(T, π) = H(π) }. Left and right panels are for the values T = 1.5 and T = 0.2, respectively. The shaded regions are the same in both panels; note however the different z-scales. The panels also show the line 3.5π_1 + 1.5π_2 − π_3 = 0, which is the stopping boundary of the ILA rule.

Assumption 1. In Jensen and Hsu 1993, it is assumed that (i) q_i > 0 for i = 1, ..., n−1, with q_n = 0; (ii) r_1 ≥ r_2 ≥ ... ≥ r_n = c_n, with c_n < 0; (iii) 0 < λ_1 ≤ ... ≤ λ_n; (iv) q_{i,n} > λ_n − λ_i for i = 1, ..., n−1.

These assumptions ensure that the infinitesimal look-ahead (ILA) rule τ_ILA := inf{ t ≥ 0 : ∑_i r_i Π_t^{(i)} < 0 } is optimal, where r_i := c_i + ∑_{j≠i} ( µ_j − µ_i ) q_{i,j} (cf. Lemma 4.1). It follows as a corollary to Jensen 1989, Theorem 3.1 that τ_ILA ∧ T is an optimal stopping rule for the finite horizon problem, and the region { π ∈ D : V(T, π) = H(π) } does not depend on T. This occurs because the instantaneous revenue rates r_i completely summarize the relative worth of different machine states, and the sum ∑_i r_i Π_t^{(i)} is monotonically non-increasing over time, P^π-a.s. for all π ∈ D (see Jensen and Hsu 1993, Theorem 2). Thus, T only plays a role insofar as allowing the DM to collect profits before the machine deteriorates. We illustrate this degeneracy in Figure 2.

In this example, we select the parameters to fit the framework of Jensen and Hsu 1993. We have a machine that moves through three regimes E = {1, 2, 3} with a given transition rate matrix Q. At the different states, the running profit from operating the machine is c = (1, 0, −1), and shutting down the machine involves a cost of µ = (−1, −1, 0). In each state, the breakdowns occur according to independent Poisson processes with intensities λ = (2, 3, 4). In this setting, we have r = (3.5, 1.5, −1), so that τ_ILA = inf{ t : 3.5Π_t^{(1)} + 1.5Π_t^{(2)} − Π_t^{(3)} < 0 }.

The left and right panels of Figure 2 show the functions V(T, π) and the regions { π ∈ D : (T, π) ∈ Γ_T } for T = 1.5 and T = 0.2, respectively. We see that V(0.2, π) < V(1.5, π), but the regions { π ∈ D : V(T, π) = H(π) } for T = 0.2 and T = 1.5 coincide with the region { π ∈ D : 3.5π_1 + 1.5π_2 − π_3 ≤ 0 }, at least modulo the D-discretization necessary for numerical implementation. This degenerate structure would disappear if one removes some of the assumptions in Jensen and Hsu 1993. Nevertheless, the sequential construction of Section 3 can still safely be employed with the optimal stopping rule given in Corollary 4.1.
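The quantities driving this example are easy to reproduce. The sketch below computes r_i = c_i + ∑_{j≠i}(µ_j − µ_i) q_{i,j} and evaluates the ILA boundary; the generator used is hypothetical, chosen so that the resulting r matches the text's r = (3.5, 1.5, −1), since the paper's actual matrix Q is not reproduced here.

```python
import numpy as np

# Hypothetical generator chosen so that r comes out to (3.5, 1.5, -1);
# the paper's actual Q is not reproduced here.
Q = np.array([[-3.0, 0.5, 2.5],
              [ 0.0, -1.5, 1.5],
              [ 0.0,  0.0, 0.0]])         # state 3 (defective) is absorbing
c = np.array([1.0, 0.0, -1.0])            # running profits c_i
mu = np.array([-1.0, -1.0, 0.0])          # terminal rewards mu_i

# r_i = c_i + sum_{j != i} (mu_j - mu_i) q_{i,j}  (cf. Lemma 4.1)
r = np.array([c[i] + sum((mu[j] - mu[i]) * Q[i, j] for j in range(3) if j != i)
              for i in range(3)])
print(r)                                  # [ 3.5  1.5 -1. ]

def ila_stop(pi):
    """tau_ILA fires as soon as sum_i r_i pi_i drops below zero."""
    return float(r @ pi) < 0.0
```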

Figure 3. The second example for the reliability problem of Section 5.1 with the new parameters in (5.2). In the left panel T = 2, in the middle T = 0.5, and in the right panel T = 0.1. In each picture, the function V(T, π) is plotted on D. Shaded regions are the sets { π ∈ D : V(T, π) = H(π) }.

We give an example in Figure 3 where

(5.2)   q_{1,3} = q_{2,3} = 0.5 (state 3 remains absorbing)   and   λ = (λ_1, λ_2, λ_3) = (1, 4, 7).

We keep the other parameters the same as in the previous example. Now, the instantaneous net gain ∑_i r_i Π_t^{(i)} = 1.5Π_t^{(1)} + 0.5Π_t^{(2)} − Π_t^{(3)} is no longer monotonically non-increasing P^π-a.s. for all π ∈ D. Figure 3 shows that the stopping region is now time-dependent and expands as time to maturity decreases. For this choice of c and µ, one can modify the proof of Lemma 4.2 to show that there always exists a stopping region around the absorbing state (for all T ≥ 0). Furthermore, Lemma 4.1 implies that it is never optimal to stop around the corners of the simplex D corresponding to non-absorbing states. Also note that the transition rates of M are now lower; therefore, the DM can obtain a positive net gain when M starts from state 1 and there is enough time to operate the system. Indeed, the first panel in Figure 3 shows that for T = 2 the value function is positive around the corner {1}.

5.2. Sequential hypothesis-testing. In this problem, a compound Poisson process X = {X_t}_{t≥0} is observed starting from t = 0. The arrival rate λ and mark distribution ν of X are not known precisely. Rather, they depend on the static regime of a Markov process M with n absorbing states (i.e., M_t = M_0 for all t ≥ 0). Each state corresponds to the realization of one of the n simple hypotheses

(5.3)   A_1 : (λ, ν) = (λ_1, ν_1),   ...,   A_n : (λ, ν) = (λ_n, ν_n),

with given prior likelihoods π_i, for i = 1, ..., n. The objective of the DM is to identify the current regime as quickly as possible, with minimal probability of wrong decision. In earlier work on this problem, the trade-off between observing and stopping is generally modeled via the Bayes risk

(5.4)   E^π [ τ + ∑_{k,i=1}^n µ_{k,i} 1_{d = k, M_0 = i} ],

where τ is the decision time, d ∈ {1, ..., n} represents the hypothesis selected, and µ_{k,i} is the cost of selecting the wrong hypothesis A_k when the correct one is A_i. The DM then needs to minimize (5.4) and find a pair (τ, d), if one exists, that attains this infimum. The infinite horizon version of (5.4) was solved for the first time by Peskir and Shiryaev 2000 for a simple Poisson process with n = 2. Later, Gapeev 2002 provided the solution (again with n = 2) where the jump size is exponentially distributed under each hypothesis, and the mean of the exponential distribution is the same as the proposed arrival rate. The solution for any jump distribution and for n ∈ N was recently provided by Dayanik et al. 2008a. Below we treat the finite horizon version of that problem, where a decision must be made before the horizon T < ∞.

Remark 5.1. Let V(∞, π) denote the value function of this minimization problem on infinite horizon, and for 1 ≤ k ≤ n let Γ_{∞,k} := { π ∈ D : V(∞, π) = H_k(π) } in terms of the functions H_k(π) := ∑_i µ_{k,i} π_i. Dayanik et al. 2008a showed that each region Γ_{∞,k} is closed and convex, with a non-empty interior around the k-th corner of the simplex D. This structure also extends to the finite-horizon problem. Since V(∞, π) ≤ V(T, π), we have Γ_{∞,k} ⊆ Γ_{T,k}, for k ∈ E and T < ∞. Then, Remarks 4.1 and 4.2 and Corollary 4.1 imply that there are time-dependent closed convex sets (with non-empty interiors) around the corners of D such that it is optimal to stop the first time the process Π enters one of these sets. At this time, if the conditional likelihood process Π is around the k-th corner, we select hypothesis A_k.

In Figure 4, we illustrate the time-dependence of the solution structure using a simple example with two hypotheses, A_1 : Λ = λ_1 and A_2 : Λ = λ_2, on the arrival rate only. This problem was solved in Peskir and Shiryaev 2000 on infinite horizon, and the authors show that immediate stopping is optimal if and only if µ_{2,1} µ_{1,2} (λ_2 − λ_1) ≤ µ_{2,1} + µ_{1,2}. Hence, the inequality µ_{2,1} µ_{1,2} (λ_2 − λ_1) > µ_{2,1} + µ_{1,2} has to be satisfied in any finite-horizon problem with a non-trivial solution.

In Figure 4, the arrival rates are λ_1 = 1 and λ_2 = 5. For the Bayes risk given in (5.4), we select µ_{1,2} = µ_{2,1} = 2 for the penalty costs. This numerical example matches Peskir and Shiryaev 2000, Figures 2-3. The left panel of Figure 4 shows the value functions V(T, ·) with horizons T = 0.1, T = 0.2, T = 0.4 and T = 2, respectively, together with the terminal cost H(π) = min{ µ_{1,2} π_2 ; µ_{2,1} (1 − π_2) }, as functions of π_2 ∈ [0, 1]. We see that as T increases, the value function decreases, as expected. The right panel of Figure 4 shows that the continuation region widens as time to maturity increases. We also observe that the boundary curves approach the solution structure of the problem with infinite horizon: Peskir and Shiryaev 2000 obtained a continuation region of (0.22, 0.7), very close to ours of (0.23, 0.75) for T > 1.

Let us define the lower boundary curve T ↦ b_1(T) := sup{ π_2 ∈ (0, 1) : V(T, π) = 2π_2 }. Clearly b_1(0) = 0.5. In the right panel, we remarkably observe that the lower boundary curve b_1(·) has a discontinuity at T = 0 and then remains constant until about T = 0.2. Note that the point π = (π_1, π_2) = (0.5, 0.5) is the global maximum of the terminal cost function H(π). Starting at the point (0.5 + ε, 0.5 − ε), for ε ≥ 0 small, as long as there is no jump, the conditional likelihood process Π drifts (quickly) toward π = (π_1, π_2) = (1, 0) and away from this maximum.
Intuitively speaking, for very small values of T, the probability of observing a jump is low, and thus it is optimal to continue; therefore, the lower curve in Figure 4 is discontinuous around T = 0. The drift of the process Π toward (1, 0) decreases as π_2 decreases and the process approaches (1, 0) (see (2.14)). As a result, at points π where π_2 is small, the effect of the waiting cost becomes dominant, and it is optimal to stop even if T is small.
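This behavior is easy to visualize by simulating the posterior Π^{(2)} directly. In the two-hypothesis setting Q = 0, so the flow and the jump operator specialize to the closed forms used below. The rates λ = (1, 5) are the example's; the simulation itself (time step, seed, true arrival rate) is our illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
lam = np.array([1.0, 5.0])    # lambda_1, lambda_2 from the example

def drift(dt, p2):
    """Flow of pi_2 between arrivals (Q = 0): reweight by e^{-lam_i dt}."""
    w = np.array([1.0 - p2, p2]) * np.exp(-lam * dt)
    return w[1] / w.sum()

def jump(p2):
    """Jump operator R for arrival-rate-only hypotheses: pi_2 moves up."""
    w = lam * np.array([1.0 - p2, p2])
    return w[1] / w.sum()

def posterior_path(p2, T, dt=1e-3, true_rate=1.0):
    """Simulate Pi^{(2)} when arrivals actually come at rate true_rate."""
    path = [p2]
    for _ in range(int(T / dt)):
        p2 = drift(dt, p2)                  # drifts toward pi_2 = 0
        if rng.random() < true_rate * dt:   # each arrival pushes pi_2 up
            p2 = jump(p2)
        path.append(p2)
    return np.array(path)

print(posterior_path(0.5, T=2.0)[::500].round(3))
```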

Figure 4. Bayesian regime detection example of Section 5.2. The left panel shows the value functions V(T, π) for various time horizons T. The right panel shows the stopping regions Γ_{T,k} (namely Γ_{T,1} below the lower curve and Γ_{T,2} above the upper curve) for T ∈ [0, 2].

The following proposition summarizes our discussion of this example and states that this behavior of the lower boundary curve around T = 0 holds for any set of parameters λ_2 > λ_1, µ_{1,2}, µ_{2,1}.

Proposition 5.1. Consider the hypothesis-testing problem in (5.4) with two simple hypotheses on the arrival rate, A_1 : Λ = λ_1 and A_2 : Λ = λ_2 (with λ_2 > λ_1). The continuation region C_T is non-empty (for T > 0) if and only if µ_{2,1} µ_{1,2} (λ_2 − λ_1) > µ_{2,1} + µ_{1,2}. The boundary curve T ↦ b_1(T) := sup{ π_2 ∈ (0, 1) : V(T, π) = µ_{1,2} π_2 } is discontinuous at T = 0, and there is an interval around T = 0 on which b_1(·) is constant.

Remark 5.2. As a final note, we would like to add that our analysis in Sections 3 and 4 can also be applied easily to solve the finite horizon change-detection problem. In this problem, the local parameters (λ, ν) of an observed compound Poisson process change at some unobservable time θ when the process M hits one of its absorbing states, and the objective is to find the best time τ that minimizes E[(τ − θ)^+] + c P(τ < θ), which is another special case of (1.1). In the infinite horizon setting, Dayanik and Sezer 2005 and Bayraktar and Sezer 2009 show that the stopping region consists of closed convex regions (again with non-empty interiors) around the corners of D corresponding to absorbing states. In the finite-horizon formulation, Remarks 4.1 and 4.2 and Lemma 4.2 imply that there are time-dependent closed convex sets around these corners, and the hitting time of Π to those regions is an optimal alarm time (thanks to Corollary 4.1). Moreover, Lemma 4.1 implies that it is never optimal to stop around the remaining corners of D corresponding to non-absorbing states.

Acknowledgments. The authors would like to thank the editors and anonymous referees for many helpful comments and remarks that improved the presentation of the paper.

Appendices

Appendix A. Discrete information costs

The objective function in (1.1) is applicable to a variety of economic settings, and this has allowed us to provide a unified treatment of many disparate models. Returning to the economic interpretation of the running costs appearing in the first term in (1.1), in a typical setting they represent information acquisition expenses or opportunity costs. Alternatively, observation costs may be discrete and be incurred only when new information arrives. This, for example, happens if new information corresponds to opportunities lost (e.g. deals signed by competitors), leading to a cost structure of the form ∑_{j=1}^{N_τ} e^{−ρσ_j} K(Y_j). Here, N_τ is the number of arrivals by time τ, (σ_j, Y_j) are the arrival times and marks respectively, and K(Y_j) is the cost incurred upon an arrival of size Y_j, with K : R^d → R satisfying ν_i K^+ := ∫_{R^d} K^+(y) ν_i(dy) < ∞, i ∈ E. In this case, one deals with the objective function

(A.1)   Û(T, π) := sup_{τ ≤ T, d ∈ F_τ^X} E^π [ ∑_{j=1}^{N_τ} e^{−ρσ_j} K(Y_j) + e^{−ρτ} ∑_{k=1}^a 1_{d = k} ∑_{i∈E} µ_{k,i} 1_{M_τ = i} ],

by solving the equivalent stopping problem V̂(T, π) := sup_{τ ≤ T} E^π [ ∑_{j=1}^{N_τ} e^{−ρσ_j} K(Y_j) + e^{−ρτ} H(Π_τ) ], as in (2.3). In this case, one can verify that the sequential approximation method of Section 3 holds for the value function V̂. Namely, if we define the sequence of functions {V̂_m(·, ·)}_{m≥0}, where

V̂_m(s, π) := sup_{τ ≤ s} E^π [ ∑_{j=1}^{N_τ ∧ m} e^{−ρσ_j} K(Y_j) + e^{−ρ(τ∧σ_m)} H(Π_{τ∧σ_m}) ],

it can be shown (see Proposition 3.2) that we have V̂_{m+1}(s, π) = Ĵ V̂_m(s, π), where the operator Ĵ is defined as

Ĵw(s, π) := sup_{t∈[0,s]} { E^π[ e^{−I(t)} ] e^{−ρt} H( x(t, π) ) + ∫_0^t e^{−ρu} ∑_{i∈E} m_i(u, π) λ_i ( ∫_{R^d} K(y) ν_i(dy) + S_i w( s − u, x(u, π) ) ) du },

for a bounded function w : [0, T] × D → R.

Clearly {V̂_m}_{m≥0} is an increasing sequence. Using the inequality E[ ∑_{j=1}^{N_T} K^+(Y_j) ] ≤ (max_i λ_i) T (max_i ν_i K^+) and the truncation arguments in the proof of Proposition 3.1, one can show that the sequence converges to V̂ uniformly with the error bound

‖V̂ − V̂_m‖ ≤ ( (max_i λ_i) T (max_i ν_i K^+) + 2‖H‖ ) ( λ̄T/(m−1) )^{1/2} ( λ̄/(2ρ + λ̄) )^{m/2}.

The arguments in Sections 3 and 4 can then be replicated to conclude that

E^π [ ∑_{j=1}^{N_{Û_ε(s,π)}} e^{−ρσ_j} K(Y_j) + e^{−ρ Û_ε(s,π)} H( Π_{Û_ε(s,π)} ) ] ≥ V̂(s, π) − ε,

for the stopping time Û_ε(s, π) := inf{ t ∈ [0, s] : V̂(s − t, Π_t) ≤ ε + H(Π_t) }. Hence, the admissible strategy ( Û_ε(s, π), d(Û_ε(s, π)) ) is an ε-optimal strategy for the problem in (A.1), as expected. Furthermore, the other results of Section 4 can be adjusted for this new objective function. Below, we summarize these results in a remark, and we conclude our discussion here.

Remark A.1. Let ν_j K := ∫_{R^d} K(y) ν_j(dy), for j ∈ E.

(i) For a given index i ∈ E, define A(i) := { k ∈ A : µ_{k,i} = max_{j∈A} µ_{j,i} } as in Lemma 4.1. If −ρµ_{k,i} + λ_i ν_i K + ∑_{j≠i} ( µ_{k,j} − µ_{k,i} ) q_{i,j} > 0 holds for all k ∈ A(i), then there exists some π̂_i^c < 1 (for all T > 0) such that it is optimal to continue on the region { (s, π) ∈ (0, T] × D : π_i ≥ π̂_i^c }.

(ii) Assume ν_j K ≤ 0 for all j ∈ E and µ := max_{k,i} µ_{k,i} > 0, and let I be as in (4.16). For i ∈ I, if ν_i K < 0 or ρ > 0, there exists a number π̂_i^s < 1 (free of T) such that it is optimal to stop at the points π for which π_i ≥ π̂_i^s. That is, Γ_{T,i} ⊇ { (s, π) ∈ [0, T] × D : π_i ≥ π̂_i^s } for all T ≥ 0.

(iii) In the case where ν_j K ≤ 0 for all j ∈ E and H(·) ≥ 0, the stopping region is monotone in ρ and in ν_j K, j ∈ E. Namely, if we increase one of these factors in absolute terms (keeping everything else fixed), the stopping region expands, and the DM is forced to make a decision sooner.

(iv) For a given ε > 0, let m ∈ N be such that ‖V̂(T, ·) − V̂_m(T, ·)‖ ≤ ε/2. Then the stopping time Û^{(m)}_{ε/2}(s, π) := inf{ t ∈ [0, s] : V̂_m(s − t, Π_t) ≤ ε/2 + H(Π_t) } gives an ε-optimal strategy.

(v) If ρ > 0, or K(·) ≤ 0 with max_i ν_i K < 0, then V̂(T, ·) → V̂(∞, ·) uniformly as in (B.2) if we redefine

Err(T) := e^{−ρT} ( (max_i λ_i)(max_i |ν_i K|)/ρ + 2‖H‖ ), if ρ > 0, and
Err(T) := 2‖H‖ ( max_{k,i} µ_{k,i} − min_{k,i} µ_{k,i} ) / ( T (min_i λ_i) |max_i ν_i K| ), if ρ = 0, K(·) ≤ 0 and max_i ν_i K < 0.

Appendix B. Remarks on the infinite horizon problem

In general, if there is a strict penalty for waiting, it is likely that the DM will make a decision prior to the final time T for moderate or large values of T. In this case, the constraint τ ≤ T in (2.3) is of less importance, and one essentially faces an infinite horizon stopping problem. Solving the infinite horizon problem can be computationally more appealing, since we eliminate the time-dimension of the state space [0, T] × D. Below, we show that the value function of the finite-horizon problem converges uniformly to that of the infinite horizon under the assumption

(B.1)   either ρ > 0 or max_{i∈E} c_i < 0.

The infinite horizon problem is defined (as in (2.3) and (1.1)) by removing the constraint τ ≤ T. With the notation in (2.3), let V(∞, π) be the value function of this stopping problem.

Lemma B.1. As T → ∞, the function V(T, π) converges to V(∞, π) uniformly on D, and we have

(B.2)   V(T, π) ≤ V(∞, π) ≤ V(T, π) + Err(T), for all π ∈ D and T ≥ 0,

where

Err(T) := e^{−ρT} ( ‖C‖/ρ + 2‖H‖ ), if ρ > 0, and
Err(T) := 2‖H‖ ( max_{k,i} µ_{k,i} − min_{k,i} µ_{k,i} ) / ( T |max_{i∈E} c_i| ), if ρ = 0 and max_{i∈E} c_i < 0.


Filtrations, Markov Processes and Martingales. Lectures on Lévy Processes and Stochastic Calculus, Braunschweig, Lecture 3: The Lévy-Itô Decomposition Filtrations, Markov Processes and Martingales Lectures on Lévy Processes and Stochastic Calculus, Braunschweig, Lecture 3: The Lévy-Itô Decomposition David pplebaum Probability and Statistics Department,

More information

Multi-dimensional Stochastic Singular Control Via Dynkin Game and Dirichlet Form

Multi-dimensional Stochastic Singular Control Via Dynkin Game and Dirichlet Form Multi-dimensional Stochastic Singular Control Via Dynkin Game and Dirichlet Form Yipeng Yang * Under the supervision of Dr. Michael Taksar Department of Mathematics University of Missouri-Columbia Oct

More information

Finding the Value of Information About a State Variable in a Markov Decision Process 1

Finding the Value of Information About a State Variable in a Markov Decision Process 1 05/25/04 1 Finding the Value of Information About a State Variable in a Markov Decision Process 1 Gilvan C. Souza The Robert H. Smith School of usiness, The University of Maryland, College Park, MD, 20742

More information

Metric Spaces and Topology

Metric Spaces and Topology Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies

More information

A COLLOCATION METHOD FOR THE SEQUENTIAL TESTING OF A GAMMA PROCESS

A COLLOCATION METHOD FOR THE SEQUENTIAL TESTING OF A GAMMA PROCESS Statistica Sinica 25 2015), 1527-1546 doi:http://d.doi.org/10.5705/ss.2013.155 A COLLOCATION METHOD FOR THE SEQUENTIAL TESTING OF A GAMMA PROCESS B. Buonaguidi and P. Muliere Bocconi University Abstract:

More information

Sequential Decision Problems

Sequential Decision Problems Sequential Decision Problems Michael A. Goodrich November 10, 2006 If I make changes to these notes after they are posted and if these changes are important (beyond cosmetic), the changes will highlighted

More information

OPTIMAL STOPPING OF A BROWNIAN BRIDGE

OPTIMAL STOPPING OF A BROWNIAN BRIDGE OPTIMAL STOPPING OF A BROWNIAN BRIDGE ERIK EKSTRÖM AND HENRIK WANNTORP Abstract. We study several optimal stopping problems in which the gains process is a Brownian bridge or a functional of a Brownian

More information

A MODEL FOR THE LONG-TERM OPTIMAL CAPACITY LEVEL OF AN INVESTMENT PROJECT

A MODEL FOR THE LONG-TERM OPTIMAL CAPACITY LEVEL OF AN INVESTMENT PROJECT A MODEL FOR HE LONG-ERM OPIMAL CAPACIY LEVEL OF AN INVESMEN PROJEC ARNE LØKKA AND MIHAIL ZERVOS Abstract. We consider an investment project that produces a single commodity. he project s operation yields

More information

Applications of Optimal Stopping and Stochastic Control

Applications of Optimal Stopping and Stochastic Control Applications of and Stochastic Control YRM Warwick 15 April, 2011 Applications of and Some problems Some technology Some problems The secretary problem Bayesian sequential hypothesis testing the multi-armed

More information

CROSS-VALIDATION OF CONTROLLED DYNAMIC MODELS: BAYESIAN APPROACH

CROSS-VALIDATION OF CONTROLLED DYNAMIC MODELS: BAYESIAN APPROACH CROSS-VALIDATION OF CONTROLLED DYNAMIC MODELS: BAYESIAN APPROACH Miroslav Kárný, Petr Nedoma,Václav Šmídl Institute of Information Theory and Automation AV ČR, P.O.Box 18, 180 00 Praha 8, Czech Republic

More information

Notes from Week 9: Multi-Armed Bandit Problems II. 1 Information-theoretic lower bounds for multiarmed

Notes from Week 9: Multi-Armed Bandit Problems II. 1 Information-theoretic lower bounds for multiarmed CS 683 Learning, Games, and Electronic Markets Spring 007 Notes from Week 9: Multi-Armed Bandit Problems II Instructor: Robert Kleinberg 6-30 Mar 007 1 Information-theoretic lower bounds for multiarmed

More information

Economics 2010c: Lectures 9-10 Bellman Equation in Continuous Time

Economics 2010c: Lectures 9-10 Bellman Equation in Continuous Time Economics 2010c: Lectures 9-10 Bellman Equation in Continuous Time David Laibson 9/30/2014 Outline Lectures 9-10: 9.1 Continuous-time Bellman Equation 9.2 Application: Merton s Problem 9.3 Application:

More information

Simplex Algorithm for Countable-state Discounted Markov Decision Processes

Simplex Algorithm for Countable-state Discounted Markov Decision Processes Simplex Algorithm for Countable-state Discounted Markov Decision Processes Ilbin Lee Marina A. Epelman H. Edwin Romeijn Robert L. Smith November 16, 2014 Abstract We consider discounted Markov Decision

More information

Lecture 22 Girsanov s Theorem

Lecture 22 Girsanov s Theorem Lecture 22: Girsanov s Theorem of 8 Course: Theory of Probability II Term: Spring 25 Instructor: Gordan Zitkovic Lecture 22 Girsanov s Theorem An example Consider a finite Gaussian random walk X n = n

More information

Chapter 2 Event-Triggered Sampling

Chapter 2 Event-Triggered Sampling Chapter Event-Triggered Sampling In this chapter, some general ideas and basic results on event-triggered sampling are introduced. The process considered is described by a first-order stochastic differential

More information

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539 Brownian motion Samy Tindel Purdue University Probability Theory 2 - MA 539 Mostly taken from Brownian Motion and Stochastic Calculus by I. Karatzas and S. Shreve Samy T. Brownian motion Probability Theory

More information

University of Warwick, EC9A0 Maths for Economists Lecture Notes 10: Dynamic Programming

University of Warwick, EC9A0 Maths for Economists Lecture Notes 10: Dynamic Programming University of Warwick, EC9A0 Maths for Economists 1 of 63 University of Warwick, EC9A0 Maths for Economists Lecture Notes 10: Dynamic Programming Peter J. Hammond Autumn 2013, revised 2014 University of

More information

Lecture 17 Brownian motion as a Markov process

Lecture 17 Brownian motion as a Markov process Lecture 17: Brownian motion as a Markov process 1 of 14 Course: Theory of Probability II Term: Spring 2015 Instructor: Gordan Zitkovic Lecture 17 Brownian motion as a Markov process Brownian motion is

More information

Monitoring actuarial assumptions in life insurance

Monitoring actuarial assumptions in life insurance Monitoring actuarial assumptions in life insurance Stéphane Loisel ISFA, Univ. Lyon 1 Joint work with N. El Karoui & Y. Salhi IAALS Colloquium, Barcelona, 17 LoLitA Typical paths with change of regime

More information

MAXIMAL COUPLING OF EUCLIDEAN BROWNIAN MOTIONS

MAXIMAL COUPLING OF EUCLIDEAN BROWNIAN MOTIONS MAXIMAL COUPLING OF EUCLIDEAN BOWNIAN MOTIONS ELTON P. HSU AND KAL-THEODO STUM ABSTACT. We prove that the mirror coupling is the unique maximal Markovian coupling of two Euclidean Brownian motions starting

More information

Proving the Regularity of the Minimal Probability of Ruin via a Game of Stopping and Control

Proving the Regularity of the Minimal Probability of Ruin via a Game of Stopping and Control Proving the Regularity of the Minimal Probability of Ruin via a Game of Stopping and Control Erhan Bayraktar University of Michigan joint work with Virginia R. Young, University of Michigan K αρλoβασi,

More information

A Rothschild-Stiglitz approach to Bayesian persuasion

A Rothschild-Stiglitz approach to Bayesian persuasion A Rothschild-Stiglitz approach to Bayesian persuasion Matthew Gentzkow and Emir Kamenica Stanford University and University of Chicago December 2015 Abstract Rothschild and Stiglitz (1970) represent random

More information

STOCHASTIC PERRON S METHOD AND VERIFICATION WITHOUT SMOOTHNESS USING VISCOSITY COMPARISON: OBSTACLE PROBLEMS AND DYNKIN GAMES

STOCHASTIC PERRON S METHOD AND VERIFICATION WITHOUT SMOOTHNESS USING VISCOSITY COMPARISON: OBSTACLE PROBLEMS AND DYNKIN GAMES STOCHASTIC PERRON S METHOD AND VERIFICATION WITHOUT SMOOTHNESS USING VISCOSITY COMPARISON: OBSTACLE PROBLEMS AND DYNKIN GAMES ERHAN BAYRAKTAR AND MIHAI SÎRBU Abstract. We adapt the Stochastic Perron s

More information

Decentralized Stochastic Control with Partial Sharing Information Structures: A Common Information Approach

Decentralized Stochastic Control with Partial Sharing Information Structures: A Common Information Approach Decentralized Stochastic Control with Partial Sharing Information Structures: A Common Information Approach 1 Ashutosh Nayyar, Aditya Mahajan and Demosthenis Teneketzis Abstract A general model of decentralized

More information

Online Appendix for. Breakthroughs, Deadlines, and Self-Reported Progress: Contracting for Multistage Projects. American Economic Review, forthcoming

Online Appendix for. Breakthroughs, Deadlines, and Self-Reported Progress: Contracting for Multistage Projects. American Economic Review, forthcoming Online Appendix for Breakthroughs, Deadlines, and Self-Reported Progress: Contracting for Multistage Projects American Economic Review, forthcoming by Brett Green and Curtis R. Taylor Overview This supplemental

More information

Optimal stopping for non-linear expectations Part I

Optimal stopping for non-linear expectations Part I Stochastic Processes and their Applications 121 (2011) 185 211 www.elsevier.com/locate/spa Optimal stopping for non-linear expectations Part I Erhan Bayraktar, Song Yao Department of Mathematics, University

More information

Worst Case Portfolio Optimization and HJB-Systems

Worst Case Portfolio Optimization and HJB-Systems Worst Case Portfolio Optimization and HJB-Systems Ralf Korn and Mogens Steffensen Abstract We formulate a portfolio optimization problem as a game where the investor chooses a portfolio and his opponent,

More information

A Barrier Version of the Russian Option

A Barrier Version of the Russian Option A Barrier Version of the Russian Option L. A. Shepp, A. N. Shiryaev, A. Sulem Rutgers University; shepp@stat.rutgers.edu Steklov Mathematical Institute; shiryaev@mi.ras.ru INRIA- Rocquencourt; agnes.sulem@inria.fr

More information

Uniformly Uniformly-ergodic Markov chains and BSDEs

Uniformly Uniformly-ergodic Markov chains and BSDEs Uniformly Uniformly-ergodic Markov chains and BSDEs Samuel N. Cohen Mathematical Institute, University of Oxford (Based on joint work with Ying Hu, Robert Elliott, Lukas Szpruch) Centre Henri Lebesgue,

More information

Chapter 2 SOME ANALYTICAL TOOLS USED IN THE THESIS

Chapter 2 SOME ANALYTICAL TOOLS USED IN THE THESIS Chapter 2 SOME ANALYTICAL TOOLS USED IN THE THESIS 63 2.1 Introduction In this chapter we describe the analytical tools used in this thesis. They are Markov Decision Processes(MDP), Markov Renewal process

More information

Wars of Attrition with Budget Constraints

Wars of Attrition with Budget Constraints Wars of Attrition with Budget Constraints Gagan Ghosh Bingchao Huangfu Heng Liu October 19, 2017 (PRELIMINARY AND INCOMPLETE: COMMENTS WELCOME) Abstract We study wars of attrition between two bidders who

More information

Maximum Process Problems in Optimal Control Theory

Maximum Process Problems in Optimal Control Theory J. Appl. Math. Stochastic Anal. Vol. 25, No., 25, (77-88) Research Report No. 423, 2, Dept. Theoret. Statist. Aarhus (2 pp) Maximum Process Problems in Optimal Control Theory GORAN PESKIR 3 Given a standard

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Learning March May, 2013 Schedule Update Introduction 03/13/2015 (10:15-12:15) Sala conferenze MDPs 03/18/2015 (10:15-12:15) Sala conferenze Solving MDPs 03/20/2015 (10:15-12:15) Aula Alpha

More information

ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME

ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME SAUL D. JACKA AND ALEKSANDAR MIJATOVIĆ Abstract. We develop a general approach to the Policy Improvement Algorithm (PIA) for stochastic control problems

More information

Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning

Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning Christos Dimitrakakis Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands

More information

Change-point models and performance measures for sequential change detection

Change-point models and performance measures for sequential change detection Change-point models and performance measures for sequential change detection Department of Electrical and Computer Engineering, University of Patras, 26500 Rion, Greece moustaki@upatras.gr George V. Moustakides

More information

Scheduling Markovian PERT networks to maximize the net present value: new results

Scheduling Markovian PERT networks to maximize the net present value: new results Scheduling Markovian PERT networks to maximize the net present value: new results Hermans B, Leus R. KBI_1709 Scheduling Markovian PERT networks to maximize the net present value: New results Ben Hermans,a

More information

Quickest Detection With Post-Change Distribution Uncertainty

Quickest Detection With Post-Change Distribution Uncertainty Quickest Detection With Post-Change Distribution Uncertainty Heng Yang City University of New York, Graduate Center Olympia Hadjiliadis City University of New York, Brooklyn College and Graduate Center

More information

A Rothschild-Stiglitz approach to Bayesian persuasion

A Rothschild-Stiglitz approach to Bayesian persuasion A Rothschild-Stiglitz approach to Bayesian persuasion Matthew Gentzkow and Emir Kamenica Stanford University and University of Chicago September 2015 Abstract Rothschild and Stiglitz (1970) introduce a

More information

Optimal exit strategies for investment projects. 7th AMaMeF and Swissquote Conference

Optimal exit strategies for investment projects. 7th AMaMeF and Swissquote Conference Optimal exit strategies for investment projects Simone Scotti Université Paris Diderot Laboratoire de Probabilité et Modèles Aléatories Joint work with : Etienne Chevalier, Université d Evry Vathana Ly

More information

Liquidity risk and optimal dividend/investment strategies

Liquidity risk and optimal dividend/investment strategies Liquidity risk and optimal dividend/investment strategies Vathana LY VATH Laboratoire de Mathématiques et Modélisation d Evry ENSIIE and Université d Evry Joint work with E. Chevalier and M. Gaigi ICASQF,

More information

MDP Preliminaries. Nan Jiang. February 10, 2019

MDP Preliminaries. Nan Jiang. February 10, 2019 MDP Preliminaries Nan Jiang February 10, 2019 1 Markov Decision Processes In reinforcement learning, the interactions between the agent and the environment are often described by a Markov Decision Process

More information

Optimal Stopping and Applications

Optimal Stopping and Applications Optimal Stopping and Applications Alex Cox March 16, 2009 Abstract These notes are intended to accompany a Graduate course on Optimal stopping, and in places are a bit brief. They follow the book Optimal

More information

Optimal Control of an Inventory System with Joint Production and Pricing Decisions

Optimal Control of an Inventory System with Joint Production and Pricing Decisions Optimal Control of an Inventory System with Joint Production and Pricing Decisions Ping Cao, Jingui Xie Abstract In this study, we consider a stochastic inventory system in which the objective of the manufacturer

More information

Technical Appendix to "Sequential Exporting"

Technical Appendix to Sequential Exporting Not for publication Technical ppendix to "Sequential Exporting" acundo lbornoz University of irmingham Héctor. Calvo Pardo University of Southampton Gregory Corcos NHH Emanuel Ornelas London School of

More information

Affine Processes. Econometric specifications. Eduardo Rossi. University of Pavia. March 17, 2009

Affine Processes. Econometric specifications. Eduardo Rossi. University of Pavia. March 17, 2009 Affine Processes Econometric specifications Eduardo Rossi University of Pavia March 17, 2009 Eduardo Rossi (University of Pavia) Affine Processes March 17, 2009 1 / 40 Outline 1 Affine Processes 2 Affine

More information

Reflected Brownian Motion

Reflected Brownian Motion Chapter 6 Reflected Brownian Motion Often we encounter Diffusions in regions with boundary. If the process can reach the boundary from the interior in finite time with positive probability we need to decide

More information

Online Appendix Durable Goods Monopoly with Stochastic Costs

Online Appendix Durable Goods Monopoly with Stochastic Costs Online Appendix Durable Goods Monopoly with Stochastic Costs Juan Ortner Boston University March 2, 2016 OA1 Online Appendix OA1.1 Proof of Theorem 2 The proof of Theorem 2 is organized as follows. First,

More information

Multi-armed bandit models: a tutorial

Multi-armed bandit models: a tutorial Multi-armed bandit models: a tutorial CERMICS seminar, March 30th, 2016 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions)

More information

Optimal Stopping Games for Markov Processes

Optimal Stopping Games for Markov Processes SIAM J. Control Optim. Vol. 47, No. 2, 2008, (684-702) Research Report No. 15, 2006, Probab. Statist. Group Manchester (21 pp) Optimal Stopping Games for Markov Processes E. Ekström & G. Peskir Let X =

More information

Stochastic Processes II/ Wahrscheinlichkeitstheorie III. Lecture Notes

Stochastic Processes II/ Wahrscheinlichkeitstheorie III. Lecture Notes BMS Basic Course Stochastic Processes II/ Wahrscheinlichkeitstheorie III Michael Scheutzow Lecture Notes Technische Universität Berlin Sommersemester 218 preliminary version October 12th 218 Contents

More information

Some Fixed-Point Results for the Dynamic Assignment Problem

Some Fixed-Point Results for the Dynamic Assignment Problem Some Fixed-Point Results for the Dynamic Assignment Problem Michael Z. Spivey Department of Mathematics and Computer Science Samford University, Birmingham, AL 35229 Warren B. Powell Department of Operations

More information

Deterministic Dynamic Programming

Deterministic Dynamic Programming Deterministic Dynamic Programming 1 Value Function Consider the following optimal control problem in Mayer s form: V (t 0, x 0 ) = inf u U J(t 1, x(t 1 )) (1) subject to ẋ(t) = f(t, x(t), u(t)), x(t 0

More information

Sample of Ph.D. Advisory Exam For MathFinance

Sample of Ph.D. Advisory Exam For MathFinance Sample of Ph.D. Advisory Exam For MathFinance Students who wish to enter the Ph.D. program of Mathematics of Finance are required to take the advisory exam. This exam consists of three major parts. The

More information

INDEX POLICIES FOR DISCOUNTED BANDIT PROBLEMS WITH AVAILABILITY CONSTRAINTS

INDEX POLICIES FOR DISCOUNTED BANDIT PROBLEMS WITH AVAILABILITY CONSTRAINTS Applied Probability Trust (4 February 2008) INDEX POLICIES FOR DISCOUNTED BANDIT PROBLEMS WITH AVAILABILITY CONSTRAINTS SAVAS DAYANIK, Princeton University WARREN POWELL, Princeton University KAZUTOSHI

More information

U n iversity o f H ei delberg. Informativeness of Experiments for MEU A Recursive Definition

U n iversity o f H ei delberg. Informativeness of Experiments for MEU A Recursive Definition U n iversity o f H ei delberg Department of Economics Discussion Paper Series No. 572 482482 Informativeness of Experiments for MEU A Recursive Definition Daniel Heyen and Boris R. Wiesenfarth October

More information

Properties of an infinite dimensional EDS system : the Muller s ratchet

Properties of an infinite dimensional EDS system : the Muller s ratchet Properties of an infinite dimensional EDS system : the Muller s ratchet LATP June 5, 2011 A ratchet source : wikipedia Plan 1 Introduction : The model of Haigh 2 3 Hypothesis (Biological) : The population

More information

Value and Policy Iteration

Value and Policy Iteration Chapter 7 Value and Policy Iteration 1 For infinite horizon problems, we need to replace our basic computational tool, the DP algorithm, which we used to compute the optimal cost and policy for finite

More information

HJB equations. Seminar in Stochastic Modelling in Economics and Finance January 10, 2011

HJB equations. Seminar in Stochastic Modelling in Economics and Finance January 10, 2011 Department of Probability and Mathematical Statistics Faculty of Mathematics and Physics, Charles University in Prague petrasek@karlin.mff.cuni.cz Seminar in Stochastic Modelling in Economics and Finance

More information

of space-time diffusions

of space-time diffusions Optimal investment for all time horizons and Martin boundary of space-time diffusions Sergey Nadtochiy and Michael Tehranchi October 5, 2012 Abstract This paper is concerned with the axiomatic foundation

More information

Preliminary Results on Social Learning with Partial Observations

Preliminary Results on Social Learning with Partial Observations Preliminary Results on Social Learning with Partial Observations Ilan Lobel, Daron Acemoglu, Munther Dahleh and Asuman Ozdaglar ABSTRACT We study a model of social learning with partial observations from

More information

Random Times and Their Properties

Random Times and Their Properties Chapter 6 Random Times and Their Properties Section 6.1 recalls the definition of a filtration (a growing collection of σ-fields) and of stopping times (basically, measurable random times). Section 6.2

More information

Information obfuscation in a game of strategic experimentation

Information obfuscation in a game of strategic experimentation MANAGEMENT SCIENCE Vol. 00, No. 0, Xxxxx 0000, pp. 000 000 issn 0025-1909 eissn 1526-5501 00 0000 0001 INFORMS doi 10.1287/xxxx.0000.0000 c 0000 INFORMS Authors are encouraged to submit new papers to INFORMS

More information

Poisson random measure: motivation

Poisson random measure: motivation : motivation The Lévy measure provides the expected number of jumps by time unit, i.e. in a time interval of the form: [t, t + 1], and of a certain size Example: ν([1, )) is the expected number of jumps

More information

An iterative procedure for constructing subsolutions of discrete-time optimal control problems

An iterative procedure for constructing subsolutions of discrete-time optimal control problems An iterative procedure for constructing subsolutions of discrete-time optimal control problems Markus Fischer version of November, 2011 Abstract An iterative procedure for constructing subsolutions of

More information

CHAPTER 9 MAINTENANCE AND REPLACEMENT. Chapter9 p. 1/66

CHAPTER 9 MAINTENANCE AND REPLACEMENT. Chapter9 p. 1/66 CHAPTER 9 MAINTENANCE AND REPLACEMENT Chapter9 p. 1/66 MAINTENANCE AND REPLACEMENT The problem of determining the lifetime of an asset or an activity simultaneously with its management during that lifetime

More information

A Review of the E 3 Algorithm: Near-Optimal Reinforcement Learning in Polynomial Time

A Review of the E 3 Algorithm: Near-Optimal Reinforcement Learning in Polynomial Time A Review of the E 3 Algorithm: Near-Optimal Reinforcement Learning in Polynomial Time April 16, 2016 Abstract In this exposition we study the E 3 algorithm proposed by Kearns and Singh for reinforcement

More information

arxiv: v1 [q-fin.tr] 15 Feb 2009

arxiv: v1 [q-fin.tr] 15 Feb 2009 OPTIMAL TRADE EXECUTION IN ILLIQUID MARKETS ERHAN BAYRAKTAR AND MICHAEL LUDKOVSKI arxiv:0902.2516v1 [q-fin.tr] 15 Feb 2009 Abstract. We study optimal trade execution strategies in financial markets with

More information

Generalized Hypothesis Testing and Maximizing the Success Probability in Financial Markets

Generalized Hypothesis Testing and Maximizing the Success Probability in Financial Markets Generalized Hypothesis Testing and Maximizing the Success Probability in Financial Markets Tim Leung 1, Qingshuo Song 2, and Jie Yang 3 1 Columbia University, New York, USA; leung@ieor.columbia.edu 2 City

More information

Lecture Notes - Dynamic Moral Hazard

Lecture Notes - Dynamic Moral Hazard Lecture Notes - Dynamic Moral Hazard Simon Board and Moritz Meyer-ter-Vehn October 27, 2011 1 Marginal Cost of Providing Utility is Martingale (Rogerson 85) 1.1 Setup Two periods, no discounting Actions

More information

Stochastic Dynamic Programming. Jesus Fernandez-Villaverde University of Pennsylvania

Stochastic Dynamic Programming. Jesus Fernandez-Villaverde University of Pennsylvania Stochastic Dynamic Programming Jesus Fernande-Villaverde University of Pennsylvania 1 Introducing Uncertainty in Dynamic Programming Stochastic dynamic programming presents a very exible framework to handle

More information

Prof. Erhan Bayraktar (University of Michigan)

Prof. Erhan Bayraktar (University of Michigan) September 17, 2012 KAP 414 2:15 PM- 3:15 PM Prof. (University of Michigan) Abstract: We consider a zero-sum stochastic differential controller-and-stopper game in which the state process is a controlled

More information