© 2011 Peter A. Maginnis

VARIANCE REDUCTION FOR POISSON AND MARKOV JUMP PROCESSES

BY PETER A. MAGINNIS

THESIS

Submitted in partial fulfillment of the requirements for the degree of Master of Science in Mechanical Engineering in the Graduate College of the University of Illinois at Urbana-Champaign, 2011

Urbana, Illinois

Advisers: Assistant Professor Matthew West, Professor Geir Dullerud

ABSTRACT

This thesis develops new variance reduction algorithms for the simulation and estimation of stochastic dynamic models. It provides particular application to particle dynamics models including an emissions process and radioactive decay. These algorithms apply several variance reduction techniques to the generation of Poisson variates in the tau-leaping time-stepping method for Markov processes. Both antithetical and stratified sampling variance-reduction techniques are considered for Poisson mean estimation, and a hybridization of them is developed that has lower variance than either for every value of the Poisson parameter. Several analytical characterizations of estimator variance are proven for different Poisson parameter regimes. By applying these variance-reduced Poisson mean estimation techniques in an appropriate dynamic fashion to the tau-leaping method, variance-reduced pathwise mean estimators are generated for stochastic Markov processes. It is numerically demonstrated that stepwise variance reduction produces pathwise variance reduction in estimators of systems of physical interest.

To my girlfriend and family, for their love and support. To my advisers, for their patient guidance and interesting conversations.

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
CHAPTER 2 PRELIMINARIES
  2.1 Probability
  2.2 Well known distributions
  2.3 The strong law of large numbers
CHAPTER 3 POISSON MEAN ESTIMATORS
  3.1 Naive Monte Carlo
  3.2 Antithetical
  3.3 Stratified
  3.4 Hybrid
CHAPTER 4 ANALYTICAL RESULTS
  4.1 Estimator Variance for Small Parameter Values
  4.2 Proof of Large Parameter Bound for Antithetical Estimator Variance
  4.3 Global Bounds on Hybrid Estimator Variance
CHAPTER 5 PATHWISE MEAN ESTIMATORS
  5.1 Particle Emissions
  5.2 Radioactive Decay
  5.3 Pathwise Comparison and Error Quantification
CHAPTER 6 NUMERICAL RESULTS
  6.1 Poisson Mean Estimation
  6.2 Emissions Pathwise Mean Estimation
  6.3 Decay Pathwise Mean Estimation
REFERENCES

CHAPTER 1
INTRODUCTION

The roads that lead advanced research toward stochastic processes are numerous. Many deterministic systems exhibit features too complex or high-dimensional to treat using traditional analytical or numerical solution techniques. As a result, scientists and engineers often use stochastic models to describe a wide variety of systems. Physical examples are readily available, including topics as diverse as atomistic-scale materials [7], complex fluid/aerosol mixtures [4], granular materials [], and biological and nanoscale environments [9]. Stochastic systems can provide cheap, accurate models of extremely complex dynamics (e.g., particle emissions, Section 5.1), or define inherently stochastic systems (e.g., radioactive decay, Section 5.2). Whether stochastic processes are being studied intrinsically or to approximate a deterministic counterpart, their relevance to almost all fields of modern scientific research is considerable. The importance of stochastic systems, however, does not imply their ease of analysis. Even relatively simple stochastic systems can defy analytical solution. Given a dearth of closed-form solutions, simulation of stochastic systems is often the only reasonable line of research. One canonical problem in the study of stochastic systems is the determination of the expected behavior of the model. Here, many independent sample paths of the system can be produced, and, when aggregated, reveal the underlying mean behavior of the system. Such Monte Carlo methods are particularly effective in models with non-linear or highly multi-dimensional characteristics that render other numerical methods ineffective or computationally infeasible. The primary cost of Monte Carlo simulation is the expense of drawing large numbers of samples. While convergence may be sure, it may also be slow, usually on the order of $1/\sqrt{n}$ in expected mean error, where $n$ is the number of samples used. Thus, achieving high resolution of a particular system can easily become costly. The source of this cost is the variance of the Monte Carlo estimate.
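To make the $1/\sqrt{n}$ behaviour concrete, the following short Python sketch (an illustration with arbitrary parameter choices, not one of the thesis experiments) measures the empirical root-mean-square error of a naive Monte Carlo mean estimate of a Poisson variable as the sample size grows:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, reps = 10.0, 2_000

for n in (100, 400, 1_600, 6_400):
    # reps independent naive Monte Carlo estimates, each built from n samples
    estimates = rng.poisson(lam, size=(reps, n)).mean(axis=1)
    rmse = np.sqrt(np.mean((estimates - lam) ** 2))
    print(n, rmse)   # quadrupling n roughly halves the error
```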

Since such an estimate is an aggregation of random objects, it too is a random object. Thus, any given iteration of the estimator may show significant error from the true mean to be estimated. Of course, the law of large numbers applied to a consistent estimator ensures us that this variance, and hence our expected error, will converge to zero eventually, but as in most engineering contexts the primary question is: what precision can you buy with a given computational budget? This is where variance reduction techniques become indispensable. Given an unbiased estimator, decreasing its variance leads directly to smaller sample sizes needed to achieve the same precision. In fact, for scalar systems, the variance of an unbiased estimator is precisely its expected mean-square error. In higher-dimensional systems, the two quantities are still closely related. As a result, variance-reduced Monte Carlo estimation can produce equivalently precise results for a reduced computational cost. Many techniques are available to reduce the computational cost of stochastic simulation. A common and useful class of stochastic systems is the Markov process. Under general conditions, stochastic systems whose transition distributions depend only on their current state and not on past history are Markov processes. Exact methods for simulation of Markov processes exist, such as the method of Gillespie [7] developed to stochastically simulate a coalescence model for cloud droplet growth [6]. However, this exact simulation can become expensive as events occur more frequently. This phenomenon becomes particularly damaging when events of interest are relatively infrequent, while other less important but still necessary events occur often. Note that in this situation the cost of generating useful samples grows rapidly. One method, known as tau-leaping, to mitigate this difficulty was developed by Gillespie [8]. Tau-leaping exploits the structure of discrete-event Markov processes, namely exponentially distributed event times, to approximately simulate Markov systems using a time discretization and the sampling of Poisson random variables. The convergence and stability of this technique relative to exact simulation have been demonstrated by Rathinam et al. [3], and much progress has been made to further reduce computational cost. Significant advances have been achieved, including adaptive step size selection by Cao et al. [3] and an implicit tau-leaping method by Rathinam et al. []. Variance reduction techniques could be developed for application to tau-leaping to further reduce the cost of simulation. In service of this goal, we implement and analyze three techniques for variance reduction on the sampling of Poisson random variables.

These are the well established techniques of antithetic sampling [5, pg. 43] and stratified sampling [5, pg. 55], as well as a method hybridizing the two. These techniques are well known and widely applied, for example used in work from production cost modeling [0] to estimating Fourier transform integrals []. Here we apply them to a construction of the underlying sample space of Poisson random variables. Furthermore, we approximately simulate a pair of stochastic systems using tau-leaping, then we apply variance reduction techniques stepwise to the algorithms, and show improvement in pathwise variance. The first of these two systems we model is the particle emissions process. Here, particles are randomly emitted into a particle population at a rate prescribed by a time-inhomogeneous rate function $\lambda(t)$. The simple time-varying dynamics provide a base case for the implementation of stepwise variance reduction techniques in service of pathwise variance reduction. The second system to be examined is radioactive particle decay. While still a relatively simple stochastic system, radioactive decay introduces an important feature: state feedback. The stochastic rate of decrease of the state of the system decreases with the state. As becomes clear in the development of this system, state feedback requires that we modify the variance-reduced Poisson sampling techniques, and the changes necessary are defined and implemented. Another primary thrust of exploration is the dependence of the variance reduction algorithms on the parameter $\lambda$ of the Poisson distribution. This distribution has support on all of $\mathbb{Z}_+$, and exhibits inherent asymmetry. However, as this parameter becomes large, the Poisson distribution begins to develop symmetry. In fact, under a suitable linear transform the Poisson distribution converges uniformly to the unit normal distribution as $\lambda$ becomes large [5], and the antithetic variance reduction technique is particularly suited to exploit this asymptotic behavior. As we will demonstrate and prove, the other standard technique applied, stratified sampling, is better suited to small and intermediate values of $\lambda$. Furthermore, for $\lambda$ taking values in certain regions of $\mathbb{R}_+$, mathematical analysis of these algorithms is feasible, and we postulate and prove a few analytical results. Chapter 2 provides a short review of several necessary mathematical topics. In Chapter 3, variance-reduced algorithms for mean estimation of Poisson random variables are developed and we define notation for their analysis. In Chapter 4 we state and prove several analytical results.

First, we prove two small results quantifying the variance reduction provided by the antithetical and stratified Poisson mean estimators. Next, we prove an asymptotic bound for the variance of the antithetical Poisson mean estimator for all sufficiently large values of $\lambda$. In the last section of Chapter 4, we prove two global results comparing the variance of the hybrid Poisson mean estimator to the stratified and antithetical estimators. In Chapter 5, we apply tau-leaping to the particle emissions and radioactive decay stochastic systems. We adapt the single-step variance reduction techniques to the Poisson sampling steps in the simulation of each system. Also, a metric is defined to quantify and estimate pathwise error. Chapter 6 collects the numerical results of simulation of the processes outlined in Chapters 3 and 5. We examine the relationship between the variance of each Poisson mean estimator and the Poisson parameter. We compare the estimated and analytical variance of both the antithetical and stratified estimators. We successfully demonstrate pathwise variance reduction in the particle emissions model using the antithetical and stratified schemes. Finally, we show that antithetic sampling reduces pathwise variance in estimation of the radioactive decay model.

CHAPTER 2
PRELIMINARIES

Before details of the research are presented, we provide a summary of important mathematical concepts used in this thesis. While some experience with probability and statistics is recommended to gain full value from this work, the ideas contained in this chapter present a brief review and should provide enough detail to make the thesis comprehensible to readers with experience outside of probability. The main points of this chapter are a short compilation of probability theory and notation, a few useful named classes of distributions, and a statement of the strong law of large numbers.

2.1 Probability

A probability space $(\Omega, \mathcal{F}, P)$ is composed of a set $\Omega$ with elements $\omega$, a $\sigma$-algebra $\mathcal{F}$ on $\Omega$, and a non-negative measure $P$ on $\mathcal{F}$ such that $P(\Omega) = 1$. Objects related to probability spaces that are typically of greatest practical interest are random variables. A random variable is an $\mathcal{F}$-measurable function, say $X(\omega)$, from $\Omega$ to another space, say $\mathcal{X}$. We may think of $\Omega$ as the set of all possible outcomes of a random experiment and $\mathcal{F}$ as a collection of all sets of outcomes that can be differentiated from other sets of outcomes; these sets are called events. Likewise, $X(\omega)$ is an observable measurement, and $P$ measures the likelihood of a given set of outcomes occurring. For example, $P\{\omega : X(\omega) \in \mathcal{X}\} = 1$, since, for every $\omega \in \Omega$, $X(\omega) \in \mathcal{X}$. We may also say for short that the probability that $X$ is in $\mathcal{X}$ is one. By convention, $X(\omega)$ is often simply denoted $X$, and $\{\omega : X(\omega) \in A \subseteq \mathcal{X}\} \in \mathcal{F}$ is more commonly abbreviated $\{X \in A\}$. As it is merely a measurable function, a given random variable $X$ may take on many forms. One way to describe a random variable, short of supplying its specific functional form, is its distribution. A distribution may be expressed in several ways.

The law $\mu$ of a random variable $X$ is defined as
$$\mu(A) = P\{X \in A\}, \qquad (2.1)$$
where $A \subseteq \mathcal{X}$ is such that $\{X \in A\} \in \mathcal{F}$. Two random variables have the same distribution if their laws are equal except on sets of measure 0. Another characterization of the distribution of a random variable is its cumulative distribution function (CDF). Suppose that $\mathcal{X} = \mathbb{R}$ and $X$ is a random variable taking values in $\mathcal{X}$. The cumulative distribution function $F$ of $X$ is defined as
$$F : \mathbb{R} \to [0, 1], \qquad F : x \mapsto P\{X \leq x\}, \qquad (2.2)$$
a nondecreasing function taken to be right continuous. Note that for any $X$ taking values in $\mathbb{R}$, $\lim_{x \to \infty} F(x) = 1$ and $\lim_{x \to -\infty} F(x) = 0$. A collection of random variables $\{X_1, X_2, \ldots, X_n\}$ has a joint distribution function
$$F(x_1, x_2, \ldots, x_n) := P\{X_1 < x_1, X_2 < x_2, \ldots, X_n < x_n\}. \qquad (2.3)$$
The collection of random variables is said to be independent if
$$F(x_1, x_2, \ldots, x_n) = F(x_1) F(x_2) \cdots F(x_n). \qquad (2.4)$$
When $X$ takes continuous values, if there exists a function $f : \mathbb{R} \to \mathbb{R}_+$ such that $F(x) = \int_{-\infty}^{x} f(t)\, dt$, then $f$ is called the probability density function of $X$. In this case,
$$\mu(A) = \int_A f(x)\, dx. \qquad (2.5)$$
If $X$ is a discrete random variable, then its probability mass function is defined
$$f(m) = \mu\{X = m\} = P\{X = m\}. \qquad (2.6)$$
There are a few important functionals of a random variable that are frequently used.

The first, and most important, functional is the expected value $E$ of a random variable $X$. It is defined
$$E[X] := \int_\Omega X(\omega)\, d\omega. \qquad (2.7)$$
Note that the expectation is, by definition, a linear functional. If $X$ is a continuous-valued random variable in $\mathbb{R}$ with probability density function $f$, then the expected value of $X$ is
$$E[X] = \int_{\mathbb{R}} x f(x)\, dx. \qquad (2.8)$$
Equivalently, if $X$ is a discrete random variable, the expected value of $X$ is given by
$$E[X] = \sum_{m \in \mathcal{X}} m f(m). \qquad (2.9)$$
Since any measurable function $g$ composed with $X$ is a random variable, we may extend the definition of expectation to include
$$E[g(X)] := \int_\Omega g(X(\omega))\, d\omega. \qquad (2.10)$$
This definition admits the natural extensions to probability density and mass functions as above, i.e.,
$$E[g(X)] = \int_{\mathbb{R}} g(x) f(x)\, dx, \qquad (2.11)$$
if $X$ is a continuous-valued random variable in $\mathbb{R}$ with probability density function $f$, and
$$E[g(X)] = \sum_{m \in \mathcal{X}} g(m) f(m), \qquad (2.12)$$
if $X$ is a discrete random variable with probability mass function $f$. Another important functional of two random variables $X$ and $Y$ is the covariance $\mathrm{Cov}$, defined
$$\mathrm{Cov}(X, Y) := E\left[(X - E[X])(Y - E[Y])\right]. \qquad (2.13)$$
Note that since $E$ is a linear functional of a random variable, $\mathrm{Cov}$ is bilinear, that is, it is linear in each of its arguments. A particularly common and interesting case is the covariance of a random variable with itself, i.e., $\mathrm{Cov}(X, X)$.

Such a form is a functional of a single random variable and is almost universally referred to as the variance of $X$. Note that, in particular,
$$\mathrm{Var}(X) := E\left[(X - E[X])^2\right] = E[X^2] - E[X]^2, \qquad (2.14)$$
where the last equality follows from the fact that $E$ is a linear functional and $E[X]$ is a fixed number. Qualitatively, the expected value may be thought of as the average or typical value of a random variable, and the variance may be thought of as a measure of the tendency of a random variable to take values away from its mean. In other words, the variance measures the typical dispersion of a distribution. Note that the expectation of a constant is itself and the variance of a constant is 0. One final equivalent representation of expectation can be defined using the CDF of $X$, $F$. If $F$ is invertible such that for $u \in [0, 1]$, except perhaps on a set of measure zero, $F(F^{-1}(u)) = u$, then we may consider, for any random variable $X(\omega)$, another random variable with the same distribution, $X(u) := F^{-1}(u)$. In this case, we may consider $[0, 1]$ to be the sample space of $X(u)$, with Lebesgue measure as its probability measure, and thus
$$E[X(u)] = \int_0^1 F^{-1}(u)\, du. \qquad (2.15)$$
If these requirements hold, then $E[X] = E[X(u)]$. If $F$ is not invertible, say for example not strictly monotone, the same conclusion holds for
$$F^{-1}(u) := \inf\{x : F(x) \geq u\}. \qquad (2.16)$$

2.2 Well known distributions

While there are an uncountable number of possible random variables and distributions, several important parameterized classes are known and their properties well studied. Four important classes are the uniform, exponential, normal (or Gaussian), and Poisson distributions. The uniform distribution refers to two different classes of distributions, one discrete and one continuous, each taking two parameters $a$ and $b$.

The continuous uniform distribution is more relevant to the development here. A random variable $X$ with uniform$(a, b)$ distribution, denoted $X \sim \mathrm{Unif}(a, b)$, is real valued and takes values in the interval $[a, b]$ with equal probability. It has CDF
$$F(x) = \begin{cases} 0 & \text{if } x < a, \\ \frac{x - a}{b - a} & \text{if } a \leq x < b, \\ 1 & \text{else.} \end{cases} \qquad (2.17)$$
It has mean $E[X] = \frac{a + b}{2}$ and variance $\mathrm{Var}(X) = \frac{(b - a)^2}{12}$. Most numerical random sampling is performed using approximately $\mathrm{Unif}(0, 1)$ pseudorandom numbers transformed into other distributions. The next important class of probability distributions is the exponential distribution. It takes a single parameter $\lambda > 0$. The exponential distribution can express the time until the next event if events occur in continuous time with constant rate and their time of arrival is independent of the time of the last arrival. If $X \sim \mathrm{Exp}(\lambda)$, it has CDF
$$F(x) = \begin{cases} 1 - \exp(-\lambda x) & \text{if } x \geq 0, \\ 0 & \text{else.} \end{cases} \qquad (2.18)$$
It has mean $E[X] = \frac{1}{\lambda}$ and variance $\mathrm{Var}(X) = \frac{1}{\lambda^2}$. The normal or Gaussian distribution has support on all of $\mathbb{R}$. It takes two parameters, $\mu$ and $\sigma^2$, which are the mean and variance of the distribution. Its probability density function is the well-known bell curve
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right). \qquad (2.19)$$
If a random variable $X$ has normal distribution, we write $X \sim \mathcal{N}(\mu, \sigma^2)$. The normal distribution enjoys the property that for such a random variable, $X = \mu + \sigma Z$, where $Z \sim \mathcal{N}(0, 1)$, the standard unit normal distribution. The CDF of the unit normal distribution is often denoted
$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{t^2}{2}\right) dt = \frac{1}{2}\,\mathrm{erfc}\left(-\frac{x}{\sqrt{2}}\right). \qquad (2.20)$$

Note that $\Phi(x)$ is symmetric about $(0, \frac{1}{2})$, namely
$$\Phi(-x) = 1 - \Phi(x). \qquad (2.21)$$
The last important distribution named here is the Poisson distribution. Like the exponential distribution, it takes a single parameter $\lambda > 0$. It can express the number of events that occur in a fixed amount of time if events may occur at exponential rate $\lambda$. The Poisson distribution is a discrete distribution taking values on $\mathbb{Z}_+$. Its probability mass function is
$$f(m) = \begin{cases} \frac{\lambda^m e^{-\lambda}}{m!} & \text{if } m \in \mathbb{Z}_+, \\ 0 & \text{else,} \end{cases} \qquad (2.22)$$
and thus it has CDF
$$F(m) = \begin{cases} \sum_{k=0}^{m} \frac{\lambda^k e^{-\lambda}}{k!} & \text{if } m \in \mathbb{Z}_+, \\ 0 & \text{else.} \end{cases} \qquad (2.23)$$
If $X \sim \mathrm{Pois}(\lambda)$, then $E[X] = \lambda$ and $\mathrm{Var}(X) = \lambda$. Independent Poisson distributed random variables $X \sim \mathrm{Pois}(\lambda_1)$ and $Y \sim \mathrm{Pois}(\lambda_2)$ enjoy the property that $X + Y \sim \mathrm{Pois}(\lambda_1 + \lambda_2)$.

2.3 The strong law of large numbers

Suppose $X_1, X_2, \ldots$ is a sequence of independent random variables, each with the same distribution (henceforward abbreviated i.i.d., for independent identically distributed), and suppose further that $E[|X_n|] < \infty$ and $E[X_n] = \mu$ for every $n$. Define
$$S_n := \sum_{i=1}^{n} X_i.$$
Then,
$$P\left\{ \lim_{n \to \infty} \frac{S_n}{n} = \mu \right\} = 1. \qquad (2.24)$$
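As a concrete illustration of the generalized inverse (2.16), the inverse-CDF representation of expectation (2.15), and the strong law (2.24), the following Python sketch inverts the Poisson CDF at uniform variates and checks that the sample mean approaches $\lambda$; the helper name and the parameter values are illustrative only.

```python
import numpy as np

def poisson_inv_cdf(u, lam):
    """Generalized inverse F^{-1}(u) = inf{m : F(m) >= u} for Pois(lam),
    computed by accumulating the probability mass function from m = 0."""
    m, pmf = 0, np.exp(-lam)
    cdf = pmf
    while cdf < u:
        m += 1
        pmf *= lam / m          # p(m) = p(m - 1) * lam / m
        cdf += pmf
    return m

rng = np.random.default_rng(0)
lam, n = 3.0, 100_000
samples = np.array([poisson_inv_cdf(u, lam) for u in rng.uniform(size=n)])

# E[X] = integral_0^1 F^{-1}(u) du, approximated by the sample mean, cf. (2.15) and (2.24)
print(samples.mean(), "vs", lam)
```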

CHAPTER 3
POISSON MEAN ESTIMATORS

We begin by constructing several estimators for the mean of a Poisson random variable. The estimators considered here are consistent, meaning they converge to the mean of the Poisson distribution from which they are sampled as their indices approach infinity. Furthermore, the estimators are unbiased, meaning that the expected value of each estimator is the same as the true mean of the Poisson distribution from which they are drawn, for every value of their indices. The typical application for these estimators is parameter estimation, in this case estimating the value of $\lambda$. The first estimator constructed is the naive Monte Carlo mean estimator. The estimator with index $n$ is simply the sample mean of $n$ independent, identically distributed random samples from the Poisson distribution. The next two estimators each use well-known variance reduction techniques. The first is the antithetical estimator, which again draws Poisson random variables and averages them, but instead of drawing these samples independently, it introduces negative correlation between pairs of samples. This negative correlation helps to reduce the variance of the estimator. The second well known variance reduction technique is implemented in the stratified mean estimator. The primary idea of stratified variance reduction is that instead of drawing many samples from the whole distribution, the distribution is partitioned into some number of strata and some independent samples are drawn from each stratum. This sampling is performed in such a way that, while each sample does not have the overall distribution, their average still converges to the expected value of the distribution. The imposed spacing of the samples throughout the support of the distribution reduces the variance inherent to the naive estimator. The last estimator constructed uses a hybridization of antithetical and stratified estimation. The domain is divided into an even number of strata, and samples are drawn from each stratum so that negative correlation exists between pairs of samples.

In essence, the hybrid estimator is in particular both stratified and antithetical. A pair of analytical results making this intuition more precise is proven in Chapter 4.

3.1 Naive Monte Carlo

Denote the naive sample mean estimator for the expected value of a Poisson random variable by
$$\delta^n := \frac{1}{n} \sum_{i=1}^{n} X^i, \qquad (3.1)$$
where the $X^i$ are i.i.d. $\mathrm{Pois}(\lambda)$. By the strong law of large numbers, $\delta^n \to E[X^i] = \lambda$ almost surely, and by the central limit theorem, under very weak assumptions we know that this convergence is at least $O(1/\sqrt{n})$. In order to simulate Poisson random variables, the inverse of the CDF $F(m)$ must be formally defined:
$$F^{-1} : [0, 1) \to \mathbb{Z}_+, \qquad F^{-1} : u \mapsto \inf\{m : F(m) > u\}, \qquad (3.2)$$
where $F$ is the CDF of $X^i \sim \mathrm{Pois}(\lambda)$. This is necessary because many numerical routines only have access to the simulation of pseudorandom numbers with approximately uniform$(0, 1)$ distribution. In order to sample other distributions, the algorithms used must implement a way to transform uniform$(0, 1)$ random numbers into the desired distribution. The inverse CDF provides such a transform. If $u \sim \mathrm{Unif}(0, 1)$, then $F^{-1}(u) \sim \mathrm{Pois}(\lambda)$. This inversion is performed by searching from 0 to find the first non-negative integer $m$ such that $F(m) > u$. In general, the CDF inversion step used in these estimators can become computationally expensive, particularly when the value of $\lambda$ is large and hence the typical number of steps taken from 0 to find the infimum in (3.2) is large. To mitigate this growth in cost, the following algorithm is implemented, exploiting the uniform convergence of the Poisson distribution to the $\mathcal{N}(\lambda, \lambda)$ distribution proven by Curtiss [5].

To generate a Poisson random variable, first generate $Z \sim \mathcal{N}(0, 1)$. This is done extremely efficiently via the Box-Muller method []. Set $u = \Phi(Z)$. Then $u \sim \mathrm{Unif}(0, 1)$. Invert the Poisson CDF by performing a linear search on $\mathbb{Z}_+$ as follows: initialize the guess at $m_0 = \max\{\lfloor \lambda + \sqrt{\lambda}\, Z \rfloor, 0\}$. If $F(m_0) < u$, search up the integers in $m$ from $m_0$ until the first $m$ such that $F(m) > u$. Return $m$. If $F(m_0) > u$, search down the integers in $m$ starting at $m_0$ until the first $m$ such that $F(m) < u$. Return $m + 1$. This algorithm is of constant computational order in $\lambda$, and will be particularly useful in the implementation of the next estimator.

3.2 Antithetical

To sample the antithetical estimator, draw a uniform variate and invert it and its antithetic pair:
$$v^i \overset{\text{i.i.d.}}{\sim} \mathrm{Unif}(0, 1),$$
$$Y^i_1 := F^{-1}(v^i), \qquad (3.3)$$
$$Y^i_2 := F^{-1}(1 - v^i), \qquad (3.4)$$
so that $Y^i_1, Y^i_2 \sim \mathrm{Pois}(\lambda)$. This inversion is performed using the same algorithm as before, except that two inversions of an antithetic pair must be performed. Perform the first: set $v^i = \Phi(Z)$ and calculate $F^{-1}(v^i)$, as above. Now, note that by the antisymmetry of the normal CDF, if $v^i = \Phi(Z)$, then $\Phi(-Z) = 1 - v^i$. Thus to perform the second inversion, use the same algorithm except set $m_0 = \max\{\lfloor \lambda - \sqrt{\lambda}\, Z \rfloor, 0\}$ and, instead of comparing evaluations of the CDF to $v^i = \Phi(Z)$, compare them to $1 - v^i = \Phi(-Z)$. Now, define
$$Y^i := \frac{Y^i_1 + Y^i_2}{2} \qquad (3.5)$$
and define the antithetical estimator of the mean as
$$\delta^n_A := \frac{1}{n} \sum_{i=1}^{n} Y^i. \qquad (3.6)$$
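The inversion routine just described and the antithetical estimator (3.6) might be realized as in the following Python sketch; it relies on SciPy's `poisson.cdf` and `norm.cdf` in place of hand-rolled CDF evaluations, so it should be read as one possible implementation rather than the thesis code.

```python
import numpy as np
from scipy.stats import poisson, norm

def poisson_from_normal(z, lam):
    """Invert the Pois(lam) CDF at u = Phi(z), starting the linear search
    near lam + sqrt(lam) * z (the N(lam, lam) approximation)."""
    u = norm.cdf(z)
    m = max(int(lam + np.sqrt(lam) * z), 0)
    if poisson.cdf(m, lam) < u:
        while poisson.cdf(m, lam) < u:                 # search upward
            m += 1
        return m
    while m > 0 and poisson.cdf(m - 1, lam) >= u:      # search downward
        m -= 1
    return m

def antithetic_estimate(lam, n, rng):
    """Antithetical Poisson mean estimator (3.6): pairs F^-1(Phi(Z)), F^-1(Phi(-Z))."""
    z = rng.standard_normal(n)
    y1 = np.array([poisson_from_normal(zi, lam) for zi in z])
    y2 = np.array([poisson_from_normal(-zi, lam) for zi in z])
    return np.mean(0.5 * (y1 + y2))

rng = np.random.default_rng(1)
print(antithetic_estimate(lam=10.0, n=10_000, rng=rng))   # close to 10
```

Replacing the second inversion with an independent draw recovers the naive estimator (3.1), which makes side-by-side variance comparisons straightforward.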

3.3 Stratified

Next, the stratified estimator is constructed. Let $\{A_j\}_{j=1}^{4}$ partition $[0, 1)$ such that $A_j = \left[\frac{j-1}{4}, \frac{j}{4}\right)$. For each $j$, draw $u^i_j \sim \mathrm{Unif}(A_j)$, independent in $j$ and i.i.d. in $i$. Let $Z^i_j := F^{-1}(u^i_j)$ and
$$Z^i := \frac{1}{4} \sum_{j=1}^{4} Z^i_j. \qquad (3.7)$$
Thus let the stratified estimator of the mean be defined as
$$\delta^{4n}_{S,4} := \frac{1}{n} \sum_{i=1}^{n} Z^i, \qquad (3.8)$$
where the $Z^i$ are i.i.d. samples. In this definition, the strata were chosen to partition $[0, 1)$, which we can easily fix here to be the underlying sample space $\Omega$. Unlike more traditional strata, which are chosen to partition the state space, these strata can correspond to shared states. That is, two points in different strata may map to the same state under the mapping $F^{-1}$. This distinction, however, is of little consequence as the strata are still nonintersecting and preserve the correct distribution under $F^{-1}$. Note here that this stratified estimator serves strictly as a point of reference. The choices made in its construction are simple and effective, but certainly not optimal. The proportional allocation scheme used here is often the best choice if no information is used about the variances within strata. Indeed, variances within strata can be difficult to compute explicitly in general, and in practice they are often pre-estimated numerically in order to fix the stratified scheme. Also note that the choice of four strata, each with equal probability, is largely a matter of convenience in calculations. One easy extension is to $M$ equally probable strata. In this case, define $\{A_j\}_{j=1}^{M} := \left\{\left[\frac{j-1}{M}, \frac{j}{M}\right)\right\}$. For each $j$, let $Z^i_j := F^{-1}(u^i_j)$, where $u^i_j \overset{\text{i.i.d.}}{\sim} \mathrm{Unif}(A_j)$, and
$$Z^i := \frac{1}{M} \sum_{j=1}^{M} Z^i_j. \qquad (3.9)$$
Then we may define the stratified estimator over $M$ uniform probability strata as

$$\delta^{Mn}_{S,M} := \frac{1}{n} \sum_{i=1}^{n} Z^i, \qquad (3.10)$$
where the $Z^i$ are sampled i.i.d.

3.4 Hybrid

The constructions of the antithetical and stratified mean estimators suggest the possibility of a hybridized algorithm that shares in the variance reduction properties of either. We construct the hybrid mean estimator as follows. Draw two uniform variates $v^i_1 \sim \mathrm{Unif}(A_1)$ and $v^i_2 \sim \mathrm{Unif}(A_2)$, where the probability strata $A_j$ are defined as above for the stratified estimator, such that $1 - v^i_1 \sim \mathrm{Unif}(A_4)$ and $1 - v^i_2 \sim \mathrm{Unif}(A_3)$ almost surely. Set
$$H^i_1 := F^{-1}(v^i_1), \qquad (3.11)$$
$$H^i_2 := F^{-1}(v^i_2), \qquad (3.12)$$
$$H^i_3 := F^{-1}(1 - v^i_2), \qquad (3.13)$$
$$H^i_4 := F^{-1}(1 - v^i_1), \qquad (3.14)$$
so that each $H^i_j$ has the same marginal distribution as each $Z^i_j$ from the stratified estimator. At the same time, there exists correlation between $H^i_1$ and $H^i_4$ and between $H^i_2$ and $H^i_3$, analogously to the correlation between $Y^i_1$ and $Y^i_2$ in the antithetical estimator. Set
$$H^i := \frac{1}{4} \sum_{j=1}^{4} H^i_j, \qquad (3.15)$$
and now define the hybrid Poisson mean estimator
$$\delta^{4n}_{H,4} := \frac{1}{n} \sum_{i=1}^{n} H^i, \qquad (3.16)$$
where the $H^i$ are sampled i.i.d. Note also that extension of the hybrid Poisson mean estimator to any even number $M$ of the uniform strata $\{A_j\}_{j=1}^{M}$ is simple. For $j \in \{1, \ldots, \frac{M}{2}\}$, draw $v^i_j \overset{\text{i.i.d.}}{\sim} \mathrm{Unif}(A_j)$. For $j \in \{\frac{M}{2} + 1, \ldots, M\},$

set $v^i_j = 1 - v^i_{M+1-j}$. Define
$$H^i := \frac{1}{M} \sum_{j=1}^{M} H^i_j, \qquad (3.17)$$
where $H^i_j := F^{-1}(v^i_j)$, and define
$$\delta^{Mn}_{H,M} := \frac{1}{n} \sum_{i=1}^{n} H^i, \qquad (3.18)$$
where the $H^i$ are sampled i.i.d.
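A compact sketch of the $M$-stratum stratified estimator (3.10) and hybrid estimator (3.18) follows, using SciPy's `poisson.ppf` as the generalized inverse $F^{-1}$; the function names and parameter values are ours and purely illustrative.

```python
import numpy as np
from scipy.stats import poisson

def stratified_estimate(lam, n, M, rng):
    """Stratified estimator (3.9)-(3.10): one Unif(A_j) draw per stratum
    A_j = [(j-1)/M, j/M), inverted through the Poisson CDF and averaged."""
    u = (np.arange(M) + rng.uniform(size=(n, M))) / M
    return poisson.ppf(u, lam).mean()

def hybrid_estimate(lam, n, M, rng):
    """Hybrid estimator (3.17)-(3.18): stratified draws in the first M/2 strata,
    reflected antithetically (v -> 1 - v) into the last M/2 strata."""
    assert M % 2 == 0
    v_low = (np.arange(M // 2) + rng.uniform(size=(n, M // 2))) / M
    v = np.concatenate([v_low, 1.0 - v_low[:, ::-1]], axis=1)
    return poisson.ppf(v, lam).mean()

rng = np.random.default_rng(2)
print(stratified_estimate(5.0, 2_000, 4, rng), hybrid_estimate(5.0, 2_000, 4, rng))
```

Removing the reflection step (drawing all $M$ strata independently) recovers the purely stratified estimator, which is the sense in which the two are compared in Chapter 4.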

CHAPTER 4
ANALYTICAL RESULTS

Due primarily to the analytical unwieldiness of the Poisson distribution function, exact expressions for the reduced variance of the estimators detailed above are difficult to obtain. Calculation of the covariance between two antithetically sampled Poisson variables, for example, becomes combinatorially complex for most values of $\lambda$. Until complete analytical solution of the problem becomes tractable, one course of action is to consider special cases of the parameter $\lambda$. We prove several results along this line of thought. First, two short results calculating the exact variance of the antithetical and stratified mean estimators for $\lambda$ below certain thresholds are given in Lemmas 1 and 2, respectively. In these cases, the estimator variances reduce to simple polynomial expressions. Here the variance reduction from naive Monte Carlo is made explicit. Furthermore, these results are later confirmed by numerical experiment for the four sample point estimate case in Figure 6.3. Proven in Section 4.2, Theorem 1 provides another result for a specific region of $\lambda$. It provides an upper bound for the variance of the antithetical Poisson mean estimator for all sufficiently large values of $\lambda$. While the bound obtained does grow without limit in $\lambda$, it nevertheless provides analytical proof that for large values, the antithetical mean estimator has much lower variance than the naive Monte Carlo estimator, which is known to have variance linear in $\lambda$ for all values of $\lambda$. Numerical results shown in Figure 6. indicate that this bound is highly conservative, but it may be possible to tighten the bound given refinement of the proof. Two global results are obtained. Theorems 2 and 3 prove that for every value of $\lambda$, the hybrid Poisson mean estimator has variance at least as small as the variance of both the stratified and antithetical mean estimators, respectively. This result shows that if given the choice between implementing antithetic or uniformly stratified variance reduction, one may simply implement an algorithm that globally enjoys the benefits of both and is not significantly more computationally expensive than either. Again, Figure 6. provides numerical support for these results.
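As a rough numerical companion to the results previewed above (and to the figures of Chapter 6), the following Python sketch compares the empirical variances of the four estimators of Chapter 3 across a few values of $\lambda$, again using SciPy's `poisson.ppf` as a stand-in for the inversion routine; the sample sizes are arbitrary.

```python
import numpy as np
from scipy.stats import poisson

def estimates(lam, n_rep, M, rng, scheme):
    """Return n_rep replicates of an M-point Poisson mean estimate."""
    if scheme == "naive":
        u = rng.uniform(size=(n_rep, M))
    elif scheme == "antithetic":
        half = rng.uniform(size=(n_rep, M // 2))
        u = np.concatenate([half, 1.0 - half], axis=1)
    elif scheme == "stratified":
        u = (np.arange(M) + rng.uniform(size=(n_rep, M))) / M
    elif scheme == "hybrid":
        low = (np.arange(M // 2) + rng.uniform(size=(n_rep, M // 2))) / M
        u = np.concatenate([low, 1.0 - low[:, ::-1]], axis=1)
    return poisson.ppf(u, lam).mean(axis=1)

rng = np.random.default_rng(3)
for lam in (0.5, 5.0, 50.0):
    print(lam, {s: round(estimates(lam, 20_000, 4, rng, s).var(), 4)
                for s in ("naive", "antithetic", "stratified", "hybrid")})
```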

4.1 Estimator Variance for Small Parameter Values

Lemma 1. Let $\delta^n_A$ be the antithetical mean estimator of a Poisson distribution, where $Y^i_1, Y^i_2 \sim \mathrm{Pois}(\lambda)$ are antithetically paired, such that
$$\delta^n_A := \frac{1}{n} \sum_{i=1}^{n} \frac{Y^i_1 + Y^i_2}{2}.$$
If $\lambda < \ln 2$, then
$$\mathrm{Var}(\delta^n_A) = \frac{\lambda - \lambda^2}{2n}. \qquad (4.1)$$

Proof. Suppose $\lambda < \ln 2$. Then $e^{-\lambda} > \frac{1}{2}$, and so $F(0) > \frac{1}{2}$. Thus $F^{-1}(u)\, F^{-1}(1 - u) = 0$ for every $u \in (0, 1)$, by the definition of $F^{-1}$. So for any $i \in \mathbb{N}$,
$$\mathrm{Cov}(Y^i_1, Y^i_2) = E[Y^i_1 Y^i_2] - E[Y^i_1]\, E[Y^i_2] = \int_0^1 F^{-1}(u)\, F^{-1}(1 - u)\, du - \lambda^2 = -\lambda^2.$$
Thus
$$\mathrm{Var}(\delta^n_A) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} Y^i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}\left(\frac{Y^i_1 + Y^i_2}{2}\right) = \frac{1}{n} \mathrm{Var}\left(\frac{Y_1 + Y_2}{2}\right)$$
$$= \frac{1}{4n}\left(\mathrm{Var}(Y_1) + \mathrm{Var}(Y_2) + 2\,\mathrm{Cov}(Y_1, Y_2)\right) = \frac{2\lambda - 2\lambda^2}{4n} = \frac{\lambda - \lambda^2}{2n}.$$

Lemma 2. Let $\delta^{4n}_{S,4}$ denote the stratified mean estimator of a Poisson distribution with four uniform strata. Let $Z^i_1, Z^i_2, Z^i_3, Z^i_4$ be samples, i.i.d. in $i$, from the probability strata $A_1 = [0, \frac{1}{4})$, $A_2 = [\frac{1}{4}, \frac{1}{2})$, $A_3 = [\frac{1}{2}, \frac{3}{4})$, $A_4 = [\frac{3}{4}, 1)$, respectively. Namely,
$$\delta^{4n}_{S,4} := \frac{1}{n} \sum_{i=1}^{n} \frac{Z^i_1 + Z^i_2 + Z^i_3 + Z^i_4}{4}.$$
If $\lambda < \ln \frac{4}{3}$, then
$$\mathrm{Var}(\delta^{4n}_{S,4}) = \frac{\lambda - 3\lambda^2}{4n}. \qquad (4.2)$$

Proof. Suppose $\lambda < \ln \frac{4}{3}$. Then $e^{-\lambda} > \frac{3}{4}$ and $P(Z^i_1 = 0) = P(Z^i_2 = 0) = P(Z^i_3 = 0) = 1$ for every $i \in \mathbb{N}$. So each of these random variables has zero mean and zero variance. The first two moments of $Z^i_4$ are
$$E[Z^i_4] = \sum_{m=0}^{\infty} m\, P(Z^i_4 = m) = 0 \cdot P(Z^i_4 = 0) + \sum_{m=1}^{\infty} 4 m\, P(X^i = m) = 4 \sum_{m=0}^{\infty} m\, P(X^i = m) = 4 E[X^i], \qquad (4.3)$$
$$E[(Z^i_4)^2] = \sum_{m=0}^{\infty} m^2\, P(Z^i_4 = m) = 0 \cdot P(Z^i_4 = 0) + \sum_{m=1}^{\infty} 4 m^2\, P(X^i = m) = 4 \sum_{m=0}^{\infty} m^2\, P(X^i = m) = 4 E[(X^i)^2]. \qquad (4.4)$$
So
$$\mathrm{Var}(\delta^{4n}_{S,4}) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} \frac{Z^i_1 + Z^i_2 + Z^i_3 + Z^i_4}{4}\right) = \frac{1}{16n}\left(\mathrm{Var}(Z_1) + \mathrm{Var}(Z_2) + \mathrm{Var}(Z_3) + \mathrm{Var}(Z_4)\right) = \frac{1}{16n} \mathrm{Var}(Z_4)$$
$$= \frac{E[Z_4^2] - E[Z_4]^2}{16n} = \frac{4 E[(X^i)^2] - 16 E[X^i]^2}{16n} = \frac{E[(X^i)^2] - 4 E[X^i]^2}{4n} = \frac{\mathrm{Var}(X^i) - 3 E[X^i]^2}{4n} = \frac{\lambda - 3\lambda^2}{4n}.$$

4.2 Proof of Large Parameter Bound for Antithetical Estimator Variance

Theorem 1. For any $\epsilon > 0$, there exist $\Lambda > 0$ and $K > 0$ such that for any $\lambda > \Lambda$,
$$\mathrm{Var}(\delta^n_A) < \frac{K \lambda^{3/4 + \epsilon}}{n}. \qquad (4.5)$$

To prove the theorem, we require the development of several lemmas in order to asymptotically relate the Poisson and normal distributions.

Lemma 3. Let $f, g : [a, b] \to [0, 1]$ be such that $f$ is nondecreasing, $g \in C^1([a, b])$, and there is a $c > 0$ such that $g'(x) \geq c$ for every $x \in [a, b]$. Then, if $|f(x) - g(x)| < \delta$

26 for every x [a, b] then f u g u < δ c 4.6 for all u U, where U := [ga, gb] [fa, fb], and f u := inf{x : fx u}. Proof. Let x := δ. Observe that the claim holds trivially if x b a, since c f u, g u [a, b] for all u U. Otherwise, choose any u U. We proceed by showing f u < g u + x f u > g u x. 4.7a 4.7b To prove 4.7a, first consider the case when u gb x. Then by the mean value theorem, there is a z g u, g u + x such that g z x = gg u + x gg u c x gg u + x u u + δ gg u + x. Applying the hypothesis and the definition of the inverse of f, u < fg u + x f u < g u + x. Now suppose u > gb x. Then g u > b x,

27 and since u U, u fb = f u b < g u + x. The proof of 4.7b is similar. Suppose u ga + x. Then by the mean value theorem, there is a z g u x, g u such that g z x = gg u gg u x c x u gg u x gg u x u δ fg u x < u g u x < f u. If u < ga + x, then g u < a + x, and since u U, u fa = f u a > g u x. Since, for u U chosen arbitrarily, 4.7a and 4.7b hold, we have for every u U. f u g u < x := δ c Let F denote the cumulative distribution function CDF of X Pois, > 0. Define X := X +, and let F x denote the CDF of X. That is, F x = F + x. 4.8 Take Φx to be the CDF of the standard unit normal distribution, namely Φx = π x e t dt. 4.9

28 By a result from Cheng [4, Theorem I], F x Φx 6 π x e x + δ 4.0 for every x R and > 0, where the magnitude of δ is bounded above by some function of, namely δ = This result is easily adapted to produce a uniform bound on the error between the Poisson and Normal CDFs that depends only on the parameter. Indeed F x Φx = 6 π x e x + δ 6 x e x + δ π π Fix any 0 < ɛ <. For any > π ɛ, set =: δ. 4. γ := δ + Φ ɛ ln ln π. 4.3 Note that γ 0 as, so there exists Λ 0 > π ɛ such that > Λ 0 = γ <. The rate of convergence of γ to 0 is discussed in Lemma 4. Lemma 4. For any 0 < ɛ <, there is a Λ > π ɛ Λ, where l ɛ := such that, for any γ < ɛ l ɛ, 4.4 ɛ ln ln π. Proof. By definition of γ and the Gaussian CDF, γ = δ + Φ l ɛ = δ + erfcl ɛ Using an asymptotic expansion of erfcx for large x, there exists K > 0 3

29 and x > 0 such that for any x x, e x erfcx x π < K e x x 3 = erfcx < e x x π + K e x x Thus for any such that l ɛ x, namely Λ where Λ := max{ πe x ɛ, π ɛ }, 4.6 erfcl ɛ < exp l ɛ l ɛ exp l ɛ + K π l ɛ 3 π ɛ π ɛ = l ɛ π + K l ɛ 3 = γ < δ + ɛ π ɛ l ɛ + K l ɛ 3 [ = ɛ l ɛ 6 l ɛ π l ɛ l ɛ ɛ ɛ 3 ɛ + 0.3l ɛ + ] π + K ɛ l ɛ. 4.7 Since the bracketed term above converges to < as, there exists Λ > Λ such that for any Λ, γ < ɛ l ɛ. Lemma 5. There exists Λ > Λ 0, such that for any Λ, i γ γ F γ u F u du Φ uφ u du γ < ɛ 3 π 4.8 4

30 ii γ γ [ F γ ] u [ du Φ u ] du < γ 3 π ɛ. 4.9 Proof. As shown in the above extension of Cheng [4, Theorem ], for any positive > 0 F x Φx < δ for any x R. For any > Λ 0, define a := Φ γ + δ = Φ γ δ. Thus Φa = γ + δ = F a > γ Φ a = γ δ = F a < γ Define c := Φ a. Note that c = Φ a by symmetry of Φ x = π e x, and that c Φ x for every x [ a, a ]. Thus by Lemma 3, F u Φ u < δ c 4.0 for every u [ F a, F a ] [Φ a, Φa ] [γ, γ]. To show i, observe that γ γ F = γ u F u du Φ uφ u du γ γ γ γ + F F F γ u F u Φ uφ u du u F u F uφ u uφ u Φ uφ u du 5

31 < γ γ F u F + Φ u F γ γ δ c F γ γ + δ c u Φ u u Φ u du u Φ u + Φ u δ c + Φ u δ c du F γ γ u Φ u du + δ c Φ u du < δ δ γ + c c Iγ δ < π c + δ c, γ γ Φ u du where Iγ := γ Φ u γ du = Φ u du 4. γ Φ u du = E[ N ] = π γ 4. 0 for all 0 < γ <, where the random variable N N 0,. Now observe that c = Φ a = Φ Φ γ δ = Φ l ɛ = π exp l ɛ = exp ɛ ln + ln π π = ɛ. 6

32 So the ratio δ c = π ɛ ɛ 3 ɛ ɛ = [ ] π ɛ converges to 0 and the bracketed term converges to < as. Thus there exists a Λ > Λ 0 such that for > Λ, the bracketed term is less than and Thus, for > Λ, δ c <. 4.4 π γ γ F u F u du Φ uφ u du γ γ δ < π c + δ c = δ c π + δ c = 3 π < 3 π ɛ ɛ [ To show ii, take Λ as above and take > Λ. Then = γ γ γ γ γ γ [ F γ ] u [ du Φ u ] du γ [ ] F u [ Φ u ] du ] π + δ c F u u Φ + Φ u F u u Φ du 7

33 γ γ F u Φ u γ du + < δ δ γ + c c Iγ δ < π c + δ c = δ c π + δ c = 3 π ɛ < 3 π. ɛ γ [ Lemma 6. For every > Λ 0, Φ u F ] π + u Φ u du δ c erfc γ l ɛ α, 4.5 where we define α := ɛ δ. 4.6 Proof. For any > Λ 0, let ρ := δ erfcl ɛ, so that γ = δ + erfcl ɛ = + ρ erfcl ɛ. 4.7 By Taylor s theorem of first order with remainder applied about ρ = 0, for any ρ > 0, there exists 0 < ρ < ρ such that erfc + ρ erfc lɛ π [ = l ɛ exp erfc + ρ erfc lɛ ] erfc l ɛ ρ π [ l ɛ exp erfc erfc l ɛ ] erfc l ɛ ρ π = l ɛ exp l ɛ erfc l ɛ ρ = l ɛ ɛ δ, 8

34 where the inequality follows from the magnitude of the second term decreasing in ρ. Indeed, erfcl ɛ 0, ρ > 0, erfc x decreasing, and expx increasing imply that [ exp erfc + ρ erfc lɛ ] is decreasing in ρ. Lemma 7. There exists a Λ 3 > Λ 0 such that for any > Λ 3, [l ɛ α] exp [l ɛ α] π ɛ l ɛ. 4.8 Proof. By Taylor s theorem of first order with remainder applied about α = 0, for every α > 0, there exists 0 < α < α such that exp [l ɛ α] = exp l ɛ + [l ɛ α] exp [l ɛ α] α exp l ɛ + [l ɛ ] exp [l ɛ α] α. Trivially, since l ɛ, and α 0 as, one can take sufficiently large to make l ɛ α > 0. Then given the above inequality, observe that [l ɛ α] exp [l ɛ α] [l ɛ α] [ exp l ɛ + l ɛ α exp [l ɛ α] ] l ɛ [ exp l ɛ + l ɛ α exp l ɛ exp l ɛ α α ] Since l ɛ α 0 and α 0 as, there exists Λ 3 > Λ 0 large enough so that for any > Λ 3, l ɛ α exp l ɛ α α <. 4.9 Thus, continuing the above inequalities, [l ɛ α] exp [l ɛ α] < l ɛ [exp l ɛ + exp l ɛ ] = πl ɛ ɛ. 9

35 Lemma 8. There exists Λ 4 > Λ 0 such that, for any > Λ 4, γ γ [ Φ u ] du < 5 l ɛ ɛ 4.30 Proof. First, take a := Φ γ = Φ γ > 0. Then, change coordinates and integrate by parts: γ [ Φ u ] a du = x π e x dx γ a = + [ xe x π ] a x= a = + π ae a π π a a a e x dx e x dx 0 π e x 0 dx + 0 π e x dx = + π ae a Φa + = + π ae a γ = γ + π ae a. Observe that a = Φ γ = erfc γ, and thus that γ [ Φ u ] du = γ + π erfc γe erfc γ. 4.3 γ Since l ɛ and α 0 as, select Λ 4 > max{λ, Λ 3} such that for any > Λ 4, l ɛ α. Recall that x exp x is a decreasing function for all x, so by Lemma 6, erfc γ l ɛ α, 30

36 and γ [ Φ u ] du γ + π l ɛ αe lɛ α. 4.3 γ Since > Λ 4, Lemmas 4 and 7 imply that γ γ [ Φ u ] du ɛ l ɛ + 4l ɛ ɛ [ ] = 4l ɛ ɛ l ɛ Since the bracketed term converges to as, there is Λ 4 > Λ 4 such that the bracketed term is less than 5, and thus 4 γ [ Φ u ] du 5lɛ ɛ γ and the proof is complete. Lemma 9. For any 0 < ɛ <, there exist constants K, K, and Λ such that for any Λ, 0 F u F u du Φ uφ u du < K ln + K ɛ ɛ

37 Proof. First, let Q := K := K C := = G := 0 F γ γ γ 0 γ 0 F γ u F u du 4.35 F F u F u du 4.36 u F u du + γ F u F u du 4.37 u F u du 4.38 Φ uφ u du 4.39 N := γ γ [ Φ u ] du 4.40 H := H C := so that = γ γ γ γ 0 γ [ F u ] du 4.4 [ ] F u du + [ F γ [ F u ] du 4.4 ] γ [ ] u du + F u du K + K C = Q 4.44 H + H C = 0 [ F u ] du = E[ X ] =,

38 and by Lemma 5, for > Λ Also, recall by antisymmetry, that K G < 3 π H N < 3 π Φ uφ u du = ɛ ɛ 4.47 [ Φ u ] du = Furthermore, by antisymmetry and Lemma 8, for > Λ 4, N = + G By symmetry, then by Hölder s inequality, 0 K C = γ 0 γ 0 F < 5 l ɛ ɛ u F u du [ F ] u du H C H C = H C = H γ 0 [ F ] u du N + H N Thus, it is easily seen by repeated use of the triangle inequality as well as the inequalities 4.46, 4.47 and 4.49 developed above, that for any 33

39 > Λ := max{λ, Λ 4} 0 F u F u du + = Q + = K G + K C + + G K G + K C + + G < K G + N + H N + N < π + 5l ɛ ɛ ɛ = + 5 π ɛ < π ɛ + 5 ɛ ɛ ln ln π ɛ ln ɛ. Recall the statement of Theorem : For any ɛ > 0, there exists Λ > 0 and K > 0 such that for any > Λ, Var δ n A We now complete the proof of Theorem. Proof. Var δ n A = 4n Var Y + Y = [ ] Var Y n + Cov Y, Y = n = n = n [ + Cov [ + Cov + 0 Y +, Y < K n 3 4 +ɛ. 4.5 ] + Y +, Y + ] F u F u du 34

40 [ = + Φ uφ u du n 0 = n F F u F u du u F u du 0 0 ] Φ uφ u du Φ uφ u du. 4.5 Now, apply Lemma 9 for ɛ = 4 + ɛ to obtain K, K, and Λ such that for > Λ : Var δ n A < n K + K 4 ɛ = K n 3 4 +ɛ + K ln ɛ ln 4 +ɛ Since ln ɛ 0 as, there exists Λ > Λ such that for all > Λ, Var δ n A < K n 3 4 +ɛ. 4.3 Global Bounds on Hybrid Estimator Variance Let the antithetical, stratified and hybrid Poisson mean estimators be defined as above for M an even number of uniform strata. Theorem. For any > 0 and even M, Var δ Mn H,M Var δ Mn S,M

41 Proof. Recall that δ Mn S,M := n n i= Zi, so that Var δ Mn S,M = Var n n Z i i= n = n Var Z i i= = n Var Z i n i= = n Var Z, 4.55 since the Z i are independent and identically distributed. Recalling the definition of Z i, Var Z = Var M = M j= Z j Var Zj, 4.56 j= since Z i j are independent in j. So we have Var δ Mn S,M = n M Var Zj j= By the same logic as above, we have Var δ Mn H,M = Var n n H i i= n = n Var H i i= = n Var H i n i= = n Var H, 4.58 since the H i are independent and identically distributed. that the H j However, recall that compose H are not altogether independent in j. For each 36

42 j, k such that j + k = n +, Cov H j, H k 0. For any other j k, Cov H j, H k = 0. Thus and Var H = Var Hj M j= [ = M Var M Z M j + Cov Hk, HM+ k ], 4.59 j= Var δ Mn H,M = n Thus, the difference M k= [ M Var M Zj + Cov Hk, HM+ k ] j= k= Var δ Mn H,M Var δ Mn S,M = n M Cov Hk, HM+ k, 4.6 k= and all that remains to show is that the right hand side of this equation is negative. This is trivially demonstrated, using the definition of Hj i and a result proven in Schmidt [6, Theorem ]: since fx := F So Cov H k, H M+ k = Cov F = Cov F v k, F v k v k, F v k = Cov f vk, g v k 0, x, gx := F x are non-decreasing functions. Var δ Mn H,M Var δ Mn S,M 0, and the proof that the hybrid algorithm produces estimator variance at least as small as the stratified algorithm is complete. It remains to prove that the hybrid algorithm produces estimator variance at least as small as the stratified algorithm. In order to prove such a theorem, we shall introduce some new notation and prove several lemmas first. Since the proof will involve some operations over indices taking values 37

43 in {,..., M}, we begin by defining several important subsets of pairs of indices and prove a short result. Let M := {,..., M} and M := M M. Partition M as follows. Define A := {i, j M : i + j = M + } 4.6 B := {i, j M : i = j} 4.63 C := M \ A B, 4.64 and note that A = B = M while C = M M. It is trivial to see that for each of these sets, i, j is an element if and only if j, i is an element. Another symmetry within the set C is proven in the following lemma. Lemma 0. For any element in C, it s reflection in a coordinate about M+ is also an element of C, that is, i, j C i, M + j C 4.65 Proof. Select any element i, j C i + j M + and i j i M + j and i + M + j M + i, M + j C. Lemma. For any set of M scalars µ i, i M, µ i µ j + M i,j C i,j A i,j C µ i µ j = µ i µ j µ M+ i µ M+ j Proof. Define C i := {j : i, j C}. Observe that C i = M, since for any i M, i, i / C and i, M + i / C, but for every other j M, 38

44 i, j C. Now, observe that i,j C µ i µ j µ M+ i µ M+ j = i,j C = i,j C µ i µ M+ i i,j C µ i µ M+ i i,j C µ i µ M+ j µ i µ j 4.67 = µ i µ M+ i µ i µ j i M j C i i,j C = M µ i µ M+ i µ i µ j i M i,j C = M µ i µ j µ i µ j, 4.68 i,j A i,j C where 4.67 comes from Lemma 0 and 4.68 comes from the definition of A. Lemma. For any set of M scalars µ j, j M, M M µ j µ j + M j= j= µ j µ M+ j j= Proof. First, observe that M M µ j µ j = µ i µ j j= i,j M j= Indeed, µ i µ j = µ i µ i µ j + µ j i,j M i,j M = µ j µ i µ j i,j M i,j M = M µ j µ j. j M j M 39

45 Now applying 4.70 to the left hand side of 4.69, we see that M M µ j µ j + M µ j µ M+ j j= j= j= = M µ i µ j µ j + M i,j M j= µ j µ M+ j. j= By expanding the square and rewriting in the notation of our partition, we see that M M µ j µ j + M µ j µ M+ j j= j= j= = µ i µ j µ j µ i µ j + M µ i µ j i,j M j M i,j M \B i,j A = µ i µ j µ i + µ j µ i µ j i,j M i,j A i,j M \B + M µ i µ j i,j A = µ i µ j i,j M + M µ i µ j = = i,j M \A i,j C i,j A µ i µ j µ i µ j i,j A i,j M \B µ i µ j µ i µ j + M µ i µ j µ i µ j i,j M \B µ i µ j + M i,j A i,j M \B i,j A µ i µ j, where the last equality follows from the fact that µ i µ j = 0 for every i, j B and C = M \ A \ B. Continuing this string of equations, we see 40

46 that M M µ j µ j + M µ j µ M+ j j= = = = 4 i,j C i,j C i,j C j= j= µ i µ j µ i µ j µ i µ j + M µ i µ j i,j C i,j A µ i µ j µ i µ j + M µ i µ j i,j C i,j A [ µi µ j + µ M+ i µ M+ j ] µ i µ j + M µ i µ j, i,j C i,j A i,j A where the last equality comes from another enumeration of C using Lemma 0 and closure of C under index swapping. Finally, we complete the proof of Lemma. By continuing from above and then applying Lemma, observe that M M µ j µ j + M µ j µ M+ j j= j= j= = [ µi µ j + µ M+ i µ M+ j ] 4 i,j C µ i µ j + M µ i µ j i,j C i,j A = [ µi µ j + µ M+ i µ M+ j ] 4 = 4 i,j C + i,j C i,j C µ i µ j µ M+ i µ M+ j µi µ j + µ M+ i µ M+ j 0. Now, for each stratum A j, j M, define µ j := E [ Z i j ] [ ] = E H i j = M F Aj u du, 4.7 4

47 so that E[X i ] = M M j= µ j. Furthermore, define σj := Var Zj i = Var H i j 4.7 [ ] = E Zj i E [ ] Zj i [ = M F u] du µ j, 4.73 A j since, for each i, j, H i j has the same marginal distribution as Z i j. Theorem 3. For any > 0 and even M, Proof. First, recall that Var Var δ Mn H,M δ Mn H,M Var = n Var H δ Mn A = n Var M j= = nm Var M j= H j H j, and Var δ Mn A = Var δ = Var Mn Mn A = n Var M = nm Var Mn i= M i= M i= Y i Y i + Y i + Y i. Y i So we may proceed by proving that M Var Var M Hj Y i j= i= 4 + Y i. 4.75

48 Now, applying the above notation, observe that Var M j= H j Next, note that Var M i= Y i = = = Var M Hj + Cov Hj, HM+ j j= σj + j= σj + j= + Y i = = = i= j= j= E [ ] [ ] [ ] Hj HM+ j E H j E H M+ j E [ ] M Hj HM+ j µ j µ M+ j j= M Var Y i + Y i i= j= M Var Y i + Var Y i + i= Var Y i = M Var Y + M i= M i= Cov Y i, Y i + M E [ Y Y = M Var [ ] Y + ME Y Y M Cov Y i, Y i ] [ ] [ E Y E Y M j= ] µ j. In order to compare this expression directly to a similar expression for the hybrid estimator, observe that Var Y = E [Y = 0 = M = M ] E [ ] Y [ F u] du M µ j j= [ M F u] du A j j= j= σ j + µ j M M M µ j, j= M µ j j= 43

49 where the last equality comes from Also, observe that E [ ] Y Y = 0 = M = M F uf u du M j= F A j uf u du E [ Hj HM+ j]. j= Applying these two identities, we see that Var M i= + Y i = Y i σj + j= + j= µ j M µ j M j= j= E [ ] Hj HM+ j M M µ j, 4.77 and thus, by subtracting 4.76 from 4.77 and multiplying by M, that M Var = M 0, M i= M + Y i Var Hj Y i j= M µ j µ j + M j= j= by Lemma. So, in conclusion, we see that Var δ Mn A Var δ Mn H,M and the proof is complete. = Var nm 0, M i= Y i j= µ j µ M+ j j= M + Y i Var Hj j= 44

CHAPTER 5
PATHWISE MEAN ESTIMATORS

In this chapter we introduce the two stochastic processes under consideration in the pursuit of achieving stochastic pathwise variance reduction. Both systems are approximated using a tau-leaping algorithm. Then, the algorithms in Chapter 3 are adapted as necessary to the Poisson sampling steps of each simulation. In both the particle emissions and radioactive decay models, such adaptations are necessary to generate valid sample paths affordably.

5.1 Particle Emissions

The particle emissions problem can be modeled very generally as a continuous-time, time-inhomogeneous stochastic arrivals process. The dynamics of the process are governed by a rate profile $\lambda(t)$. We could use exact simulation techniques to sample from this stochastic distribution, as Gillespie [7] develops for another particle model. However, exact sampling can become computationally expensive as the arrival rate of particles increases. An effective numerical technique to approximate this stochastic distribution is the tau-leaping method of Gillespie [8]. Via this technique, time is discretized by a uniform increment $\Delta t$ and each timestep is resolved by simulating some number of events drawn from a $\mathrm{Pois}(\lambda(t)\, \Delta t)$ distribution. This method is particularly suited to processes whose state-space transitions are in some sense uniform and easily resolved in multiples, as is the case in the integer-valued arrival process. Applying this particular numerical method allows for a simple stochastic linear system description for the approximate emissions model, henceforward referred to as the emissions model:

$$X_t \in \mathbb{Z}_+, \qquad X_0 = 0, \qquad X_{t+1} = X_t + P_t, \qquad (5.1)$$
$$P_t \sim \mathrm{Pois}(\lambda(t)\, \Delta t). \qquad (5.2)$$
As with many stochastically sampled models, here accurate, low-cost estimates of the expected behavior of the system are sought. Naively, we could obtain such an estimate by drawing independent samples from the system and taking the desired estimator to be the sample mean. However, this method can become undesirably expensive, especially when very accurate estimates are needed. Instead, we will apply variance reduction techniques to the sampling of the estimator in order to reduce the ensemble size needed to achieve a desired threshold of accuracy. We now return to the emissions model, and the problem of reducing the variance of estimates of its mean behavior. Informed by the techniques applied to the single-step Poisson estimator, we seek algorithms that draw valid sample paths from the model and yet have increased precision via variance reduction. Define the naive mean path estimator $D^N_t$ by
$$D^N_t := \frac{1}{N} \sum_{i=1}^{N} X^i_t, \qquad (5.3)$$
where the $X^i_t$ are i.i.d. sample paths drawn from the emissions model. Note that $D^N_t \to E[X]_t := E[X_t]$ as $N \to \infty$ by the law of large numbers. To generate antithetical sample path pairs $(X^{A}_{1,t}, X^{A}_{2,t})$, we can simply substitute an antithetic pair into the emissions model:
$$X^{A}_{1,0} = 0, \qquad X^{A}_{1,t+1} = X^{A}_{1,t} + Y_{1,t}, \qquad (5.4)$$
$$X^{A}_{2,0} = 0, \qquad X^{A}_{2,t+1} = X^{A}_{2,t} + Y_{2,t}, \qquad (5.5)$$
where $Y_{1,t}, Y_{2,t} \sim \mathrm{Pois}(\lambda(t)\, \Delta t)$ are antithetically paired as in Chapter 3.
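A minimal Python sketch of this tau-leaping update with antithetically coupled paths is given below; the rate profile, step size, and ensemble size are placeholder choices, and SciPy's `poisson.ppf` stands in for the inversion scheme of Chapter 3.

```python
import numpy as np
from scipy.stats import poisson

def antithetic_emission_paths(rate, dt, n_steps, rng):
    """Tau-leap the emissions model twice, driving the two paths with the
    antithetic uniforms u and 1 - u at every step, as in (5.4)-(5.5)."""
    x1, x2 = np.zeros(n_steps + 1), np.zeros(n_steps + 1)
    for t in range(n_steps):
        lam = rate(t * dt) * dt
        u = rng.uniform()
        x1[t + 1] = x1[t] + poisson.ppf(u, lam)
        x2[t + 1] = x2[t] + poisson.ppf(1.0 - u, lam)
    return x1, x2

rate = lambda t: 5.0 + 4.0 * np.sin(t)        # placeholder rate profile lambda(t)
rng = np.random.default_rng(4)
pairs = [antithetic_emission_paths(rate, dt=0.1, n_steps=100, rng=rng) for _ in range(200)]
mean_path = np.mean([(a + b) / 2 for a, b in pairs], axis=0)   # cf. (5.6)
```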

Figure 5.1: A sample antithetic path pair, compared to the expected path $E[X]_t$ (particles emitted $X_t$ versus time $t$).

An illustration is shown in Figure 5.1. We can then define the antithetical mean path estimator $D^N_{A,t}$ by
$$D^N_{A,t} := \frac{1}{N} \sum_{i=1}^{N} \frac{X^{A,i}_{1,t} + X^{A,i}_{2,t}}{2}, \qquad (5.6)$$
where the pairs $(X^{A,i}_{1,t}, X^{A,i}_{2,t})$ are drawn i.i.d. as outlined above. The sampling of paths utilizing stratified sampling is somewhat less trivial. Each of the samples $Z_j$ used to make the mean estimate is drawn only from its respective stratum, and thus $Z_j \not\sim \mathrm{Pois}(\lambda(t)\, \Delta t)$. Therefore the stratified samples, unlike the antithetic samples, cannot be simply input into the linear emissions model to produce valid sample paths. Our solution to this problem is to construct four sample paths at a time via
$$\begin{bmatrix} X^{S_1}_{t+1} \\ X^{S_2}_{t+1} \\ X^{S_3}_{t+1} \\ X^{S_4}_{t+1} \end{bmatrix} = \begin{bmatrix} X^{S_1}_{t} \\ X^{S_2}_{t} \\ X^{S_3}_{t} \\ X^{S_4}_{t} \end{bmatrix} + \Pi_t \begin{bmatrix} Z_{1,t} \\ Z_{2,t} \\ Z_{3,t} \\ Z_{4,t} \end{bmatrix}, \qquad (5.7)$$
where $\Pi_t$ is a random $4 \times 4$ permutation matrix, the $Z_{j,t}$ are stratified samples of $\mathrm{Pois}(\lambda(t)\, \Delta t)$, and the model is subject to the initial condition $X^{S}_0 =$
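A Python sketch of the permuted stratified update (5.7) follows, taking the initial condition to be zero as in (5.1); the rate profile and the use of SciPy's `poisson.ppf` for stratum-wise inversion are placeholder choices rather than the thesis's exact implementation.

```python
import numpy as np
from scipy.stats import poisson

def stratified_emission_paths(rate, dt, n_steps, rng, M=4):
    """Advance M coupled emission paths; at each step the M stratified draws
    are randomly permuted across the paths, as in (5.7)."""
    x = np.zeros((n_steps + 1, M))
    strata = np.arange(M)
    for t in range(n_steps):
        lam = rate(t * dt) * dt
        u = (strata + rng.uniform(size=M)) / M     # one Unif(A_j) draw per stratum
        z = poisson.ppf(u, lam)                    # stratified increments Z_{j,t}
        x[t + 1] = x[t] + rng.permutation(z)       # random permutation Pi_t
    return x

rate = lambda t: 5.0 + 4.0 * np.sin(t)             # placeholder lambda(t)
rng = np.random.default_rng(5)
ensemble = np.stack([stratified_emission_paths(rate, 0.1, 100, rng) for _ in range(200)])
mean_path = ensemble.mean(axis=(0, 2))             # average over replicates and the M paths
```

The random permutation ensures that each individual path receives increments with the correct $\mathrm{Pois}(\lambda(t)\, \Delta t)$ marginal distribution, while the four paths remain coupled within each step.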

Exact Simulation of Continuous Time Markov Jump Processes with Anticorrelated Variance Reduced Monte Carlo Estimation

Exact Simulation of Continuous Time Markov Jump Processes with Anticorrelated Variance Reduced Monte Carlo Estimation 53rd I onference on Decision and ontrol December 5-7,. Los Angeles, alifornia, USA xact Simulation of ontinuous Time Markov Jump Processes with Anticorrelated Variance Reduced Monte arlo stimation Peter

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions

More information

ELEMENTS OF PROBABILITY THEORY

ELEMENTS OF PROBABILITY THEORY ELEMENTS OF PROBABILITY THEORY Elements of Probability Theory A collection of subsets of a set Ω is called a σ algebra if it contains Ω and is closed under the operations of taking complements and countable

More information

Lecture 1: August 28

Lecture 1: August 28 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 1: August 28 Our broad goal for the first few lectures is to try to understand the behaviour of sums of independent random

More information

Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan

Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan Monte-Carlo MMD-MA, Université Paris-Dauphine Xiaolu Tan tan@ceremade.dauphine.fr Septembre 2015 Contents 1 Introduction 1 1.1 The principle.................................. 1 1.2 The error analysis

More information

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample

More information

Probability and Distributions

Probability and Distributions Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated

More information

Structural Reliability

Structural Reliability Structural Reliability Thuong Van DANG May 28, 2018 1 / 41 2 / 41 Introduction to Structural Reliability Concept of Limit State and Reliability Review of Probability Theory First Order Second Moment Method

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

2 Random Variable Generation

2 Random Variable Generation 2 Random Variable Generation Most Monte Carlo computations require, as a starting point, a sequence of i.i.d. random variables with given marginal distribution. We describe here some of the basic methods

More information

Northwestern University Department of Electrical Engineering and Computer Science

Northwestern University Department of Electrical Engineering and Computer Science Northwestern University Department of Electrical Engineering and Computer Science EECS 454: Modeling and Analysis of Communication Networks Spring 2008 Probability Review As discussed in Lecture 1, probability

More information

. Find E(V ) and var(v ).

. Find E(V ) and var(v ). Math 6382/6383: Probability Models and Mathematical Statistics Sample Preliminary Exam Questions 1. A person tosses a fair coin until she obtains 2 heads in a row. She then tosses a fair die the same number

More information

Uncertainty Quantification in Computational Science

Uncertainty Quantification in Computational Science DTU 2010 - Lecture I Uncertainty Quantification in Computational Science Jan S Hesthaven Brown University Jan.Hesthaven@Brown.edu Objective of lectures The main objective of these lectures are To offer

More information

Deterministic. Deterministic data are those can be described by an explicit mathematical relationship

Deterministic. Deterministic data are those can be described by an explicit mathematical relationship Random data Deterministic Deterministic data are those can be described by an explicit mathematical relationship Deterministic x(t) =X cos r! k m t Non deterministic There is no way to predict an exact

More information

Review of Probability Theory

Review of Probability Theory Review of Probability Theory Arian Maleki and Tom Do Stanford University Probability theory is the study of uncertainty Through this class, we will be relying on concepts from probability theory for deriving

More information

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows. Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage

More information

Lecture 6 Basic Probability

Lecture 6 Basic Probability Lecture 6: Basic Probability 1 of 17 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 6 Basic Probability Probability spaces A mathematical setup behind a probabilistic

More information

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01: Probability and Statistics for Engineers Spring 2013 Contents 1 Joint Probability Distributions 2 1.1 Two Discrete

More information

Continuous Random Variables

Continuous Random Variables 1 / 24 Continuous Random Variables Saravanan Vijayakumaran sarva@ee.iitb.ac.in Department of Electrical Engineering Indian Institute of Technology Bombay February 27, 2013 2 / 24 Continuous Random Variables

More information

Notes 6 : First and second moment methods

Notes 6 : First and second moment methods Notes 6 : First and second moment methods Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Roc, Sections 2.1-2.3]. Recall: THM 6.1 (Markov s inequality) Let X be a non-negative

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 8. For any two events E and F, P (E) = P (E F ) + P (E F c ). Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 Sample space. A sample space consists of a underlying

More information

6 The normal distribution, the central limit theorem and random samples

6 The normal distribution, the central limit theorem and random samples 6 The normal distribution, the central limit theorem and random samples 6.1 The normal distribution We mentioned the normal (or Gaussian) distribution in Chapter 4. It has density f X (x) = 1 σ 1 2π e

More information

Quick Tour of Basic Probability Theory and Linear Algebra

Quick Tour of Basic Probability Theory and Linear Algebra Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions

More information

Lecture 11. Probability Theory: an Overveiw

Lecture 11. Probability Theory: an Overveiw Math 408 - Mathematical Statistics Lecture 11. Probability Theory: an Overveiw February 11, 2013 Konstantin Zuev (USC) Math 408, Lecture 11 February 11, 2013 1 / 24 The starting point in developing the

More information

1 Presessional Probability

1 Presessional Probability 1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional

More information

Lecture 22: Variance and Covariance

Lecture 22: Variance and Covariance EE5110 : Probability Foundations for Electrical Engineers July-November 2015 Lecture 22: Variance and Covariance Lecturer: Dr. Krishna Jagannathan Scribes: R.Ravi Kiran In this lecture we will introduce

More information

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities PCMI 207 - Introduction to Random Matrix Theory Handout #2 06.27.207 REVIEW OF PROBABILITY THEORY Chapter - Events and Their Probabilities.. Events as Sets Definition (σ-field). A collection F of subsets

More information

JUSTIN HARTMANN. F n Σ.

JUSTIN HARTMANN. F n Σ. BROWNIAN MOTION JUSTIN HARTMANN Abstract. This paper begins to explore a rigorous introduction to probability theory using ideas from algebra, measure theory, and other areas. We start with a basic explanation

More information

Formulas for probability theory and linear models SF2941

Formulas for probability theory and linear models SF2941 Formulas for probability theory and linear models SF2941 These pages + Appendix 2 of Gut) are permitted as assistance at the exam. 11 maj 2008 Selected formulae of probability Bivariate probability Transforms

More information

University of Regina. Lecture Notes. Michael Kozdron

University of Regina. Lecture Notes. Michael Kozdron University of Regina Statistics 252 Mathematical Statistics Lecture Notes Winter 2005 Michael Kozdron kozdron@math.uregina.ca www.math.uregina.ca/ kozdron Contents 1 The Basic Idea of Statistics: Estimating

More information

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed

More information

Random Variables and Their Distributions

Random Variables and Their Distributions Chapter 3 Random Variables and Their Distributions A random variable (r.v.) is a function that assigns one and only one numerical value to each simple event in an experiment. We will denote r.vs by capital

More information

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed

More information

CSCI-6971 Lecture Notes: Monte Carlo integration

CSCI-6971 Lecture Notes: Monte Carlo integration CSCI-6971 Lecture otes: Monte Carlo integration Kristopher R. Beevers Department of Computer Science Rensselaer Polytechnic Institute beevek@cs.rpi.edu February 21, 2006 1 Overview Consider the following

More information

STAT2201. Analysis of Engineering & Scientific Data. Unit 3

STAT2201. Analysis of Engineering & Scientific Data. Unit 3 STAT2201 Analysis of Engineering & Scientific Data Unit 3 Slava Vaisman The University of Queensland School of Mathematics and Physics What we learned in Unit 2 (1) We defined a sample space of a random

More information

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539 Brownian motion Samy Tindel Purdue University Probability Theory 2 - MA 539 Mostly taken from Brownian Motion and Stochastic Calculus by I. Karatzas and S. Shreve Samy T. Brownian motion Probability Theory

More information

LIST OF FORMULAS FOR STK1100 AND STK1110

LIST OF FORMULAS FOR STK1100 AND STK1110 LIST OF FORMULAS FOR STK1100 AND STK1110 (Version of 11. November 2015) 1. Probability Let A, B, A 1, A 2,..., B 1, B 2,... be events, that is, subsets of a sample space Ω. a) Axioms: A probability function

More information

Hochdimensionale Integration

Hochdimensionale Integration Oliver Ernst Institut für Numerische Mathematik und Optimierung Hochdimensionale Integration 14-tägige Vorlesung im Wintersemester 2010/11 im Rahmen des Moduls Ausgewählte Kapitel der Numerik Contents

More information

3 Integration and Expectation

3 Integration and Expectation 3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ

More information

Numerical Methods I Monte Carlo Methods

Numerical Methods I Monte Carlo Methods Numerical Methods I Monte Carlo Methods Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course G63.2010.001 / G22.2420-001, Fall 2010 Dec. 9th, 2010 A. Donev (Courant Institute) Lecture

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

Multivariate Distributions

Multivariate Distributions IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 18-27 Review Scott Sheffield MIT Outline Outline It s the coins, stupid Much of what we have done in this course can be motivated by the i.i.d. sequence X i where each X i is

More information

SDS 321: Introduction to Probability and Statistics

SDS 321: Introduction to Probability and Statistics SDS 321: Introduction to Probability and Statistics Lecture 14: Continuous random variables Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin www.cs.cmu.edu/

More information

Introduction to Probability

Introduction to Probability LECTURE NOTES Course 6.041-6.431 M.I.T. FALL 2000 Introduction to Probability Dimitri P. Bertsekas and John N. Tsitsiklis Professors of Electrical Engineering and Computer Science Massachusetts Institute

More information

Probability reminders

Probability reminders CS246 Winter 204 Mining Massive Data Sets Probability reminders Sammy El Ghazzal selghazz@stanfordedu Disclaimer These notes may contain typos, mistakes or confusing points Please contact the author so

More information

1: PROBABILITY REVIEW

1: PROBABILITY REVIEW 1: PROBABILITY REVIEW Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 1: Probability Review 1 / 56 Outline We will review the following

More information

Primer on statistics:

Primer on statistics: Primer on statistics: MLE, Confidence Intervals, and Hypothesis Testing ryan.reece@gmail.com http://rreece.github.io/ Insight Data Science - AI Fellows Workshop Feb 16, 018 Outline 1. Maximum likelihood

More information

THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION

THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION JAINUL VAGHASIA Contents. Introduction. Notations 3. Background in Probability Theory 3.. Expectation and Variance 3.. Convergence

More information

Review: mostly probability and some statistics

Review: mostly probability and some statistics Review: mostly probability and some statistics C2 1 Content robability (should know already) Axioms and properties Conditional probability and independence Law of Total probability and Bayes theorem Random

More information

Lecture 2: Review of Basic Probability Theory

Lecture 2: Review of Basic Probability Theory ECE 830 Fall 2010 Statistical Signal Processing instructor: R. Nowak, scribe: R. Nowak Lecture 2: Review of Basic Probability Theory Probabilistic models will be used throughout the course to represent

More information

Chapter 5. Random Variables (Continuous Case) 5.1 Basic definitions

Chapter 5. Random Variables (Continuous Case) 5.1 Basic definitions Chapter 5 andom Variables (Continuous Case) So far, we have purposely limited our consideration to random variables whose ranges are countable, or discrete. The reason for that is that distributions on

More information

6.1 Moment Generating and Characteristic Functions

6.1 Moment Generating and Characteristic Functions Chapter 6 Limit Theorems The power statistics can mostly be seen when there is a large collection of data points and we are interested in understanding the macro state of the system, e.g., the average,

More information

where r n = dn+1 x(t)

where r n = dn+1 x(t) Random Variables Overview Probability Random variables Transforms of pdfs Moments and cumulants Useful distributions Random vectors Linear transformations of random vectors The multivariate normal distribution

More information

It can be shown that if X 1 ;X 2 ;:::;X n are independent r.v. s with

It can be shown that if X 1 ;X 2 ;:::;X n are independent r.v. s with Example: Alternative calculation of mean and variance of binomial distribution A r.v. X has the Bernoulli distribution if it takes the values 1 ( success ) or 0 ( failure ) with probabilities p and (1

More information

ADDITIONAL MATHEMATICS

ADDITIONAL MATHEMATICS ADDITIONAL MATHEMATICS GCE Ordinary Level (Syllabus 4018) CONTENTS Page NOTES 1 GCE ORDINARY LEVEL ADDITIONAL MATHEMATICS 4018 2 MATHEMATICAL NOTATION 7 4018 ADDITIONAL MATHEMATICS O LEVEL (2009) NOTES

More information

CS145: Probability & Computing

CS145: Probability & Computing CS45: Probability & Computing Lecture 5: Concentration Inequalities, Law of Large Numbers, Central Limit Theorem Instructor: Eli Upfal Brown University Computer Science Figure credits: Bertsekas & Tsitsiklis,

More information

Appendix A : Introduction to Probability and stochastic processes

Appendix A : Introduction to Probability and stochastic processes A-1 Mathematical methods in communication July 5th, 2009 Appendix A : Introduction to Probability and stochastic processes Lecturer: Haim Permuter Scribe: Shai Shapira and Uri Livnat The probability of

More information

3 Continuous Random Variables

3 Continuous Random Variables Jinguo Lian Math437 Notes January 15, 016 3 Continuous Random Variables Remember that discrete random variables can take only a countable number of possible values. On the other hand, a continuous random

More information

2. Variance and Covariance: We will now derive some classic properties of variance and covariance. Assume real-valued random variables X and Y.

2. Variance and Covariance: We will now derive some classic properties of variance and covariance. Assume real-valued random variables X and Y. CS450 Final Review Problems Fall 08 Solutions or worked answers provided Problems -6 are based on the midterm review Identical problems are marked recap] Please consult previous recitations and textbook

More information

Preliminary statistics

Preliminary statistics 1 Preliminary statistics The solution of a geophysical inverse problem can be obtained by a combination of information from observed data, the theoretical relation between data and earth parameters (models),

More information

The Multivariate Gaussian Distribution [DRAFT]

The Multivariate Gaussian Distribution [DRAFT] The Multivariate Gaussian Distribution DRAFT David S. Rosenberg Abstract This is a collection of a few key and standard results about multivariate Gaussian distributions. I have not included many proofs,

More information

Chapter 2. Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables

Chapter 2. Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables Chapter 2 Some Basic Probability Concepts 2.1 Experiments, Outcomes and Random Variables A random variable is a variable whose value is unknown until it is observed. The value of a random variable results

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

Negative Association, Ordering and Convergence of Resampling Methods

Negative Association, Ordering and Convergence of Resampling Methods Negative Association, Ordering and Convergence of Resampling Methods Nicolas Chopin ENSAE, Paristech (Joint work with Mathieu Gerber and Nick Whiteley, University of Bristol) Resampling schemes: Informal

More information

THE N-VALUE GAME OVER Z AND R

THE N-VALUE GAME OVER Z AND R THE N-VALUE GAME OVER Z AND R YIDA GAO, MATT REDMOND, ZACH STEWARD Abstract. The n-value game is an easily described mathematical diversion with deep underpinnings in dynamical systems analysis. We examine

More information

ACE 562 Fall Lecture 2: Probability, Random Variables and Distributions. by Professor Scott H. Irwin

ACE 562 Fall Lecture 2: Probability, Random Variables and Distributions. by Professor Scott H. Irwin ACE 562 Fall 2005 Lecture 2: Probability, Random Variables and Distributions Required Readings: by Professor Scott H. Irwin Griffiths, Hill and Judge. Some Basic Ideas: Statistical Concepts for Economists,

More information

Lecture 2: Review of Probability

Lecture 2: Review of Probability Lecture 2: Review of Probability Zheng Tian Contents 1 Random Variables and Probability Distributions 2 1.1 Defining probabilities and random variables..................... 2 1.2 Probability distributions................................

More information

SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416)

SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) D. ARAPURA This is a summary of the essential material covered so far. The final will be cumulative. I ve also included some review problems

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

Lecture 25: Review. Statistics 104. April 23, Colin Rundel

Lecture 25: Review. Statistics 104. April 23, Colin Rundel Lecture 25: Review Statistics 104 Colin Rundel April 23, 2012 Joint CDF F (x, y) = P [X x, Y y] = P [(X, Y ) lies south-west of the point (x, y)] Y (x,y) X Statistics 104 (Colin Rundel) Lecture 25 April

More information

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Theorems Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Why study probability? Set theory. ECE 6010 Lecture 1 Introduction; Review of Random Variables

Why study probability? Set theory. ECE 6010 Lecture 1 Introduction; Review of Random Variables ECE 6010 Lecture 1 Introduction; Review of Random Variables Readings from G&S: Chapter 1. Section 2.1, Section 2.3, Section 2.4, Section 3.1, Section 3.2, Section 3.5, Section 4.1, Section 4.2, Section

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 MA 575 Linear Models: Cedric E Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 1 Revision: Probability Theory 11 Random Variables A real-valued random variable is

More information

MATH Solutions to Probability Exercises

MATH Solutions to Probability Exercises MATH 5 9 MATH 5 9 Problem. Suppose we flip a fair coin once and observe either T for tails or H for heads. Let X denote the random variable that equals when we observe tails and equals when we observe

More information

Metric Spaces and Topology

Metric Spaces and Topology Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies

More information

Introduction to Proofs in Analysis. updated December 5, By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION

Introduction to Proofs in Analysis. updated December 5, By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION Introduction to Proofs in Analysis updated December 5, 2016 By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION Purpose. These notes intend to introduce four main notions from

More information

2. Suppose (X, Y ) is a pair of random variables uniformly distributed over the triangle with vertices (0, 0), (2, 0), (2, 1).

2. Suppose (X, Y ) is a pair of random variables uniformly distributed over the triangle with vertices (0, 0), (2, 0), (2, 1). Name M362K Final Exam Instructions: Show all of your work. You do not have to simplify your answers. No calculators allowed. There is a table of formulae on the last page. 1. Suppose X 1,..., X 1 are independent

More information

Expectation. DS GA 1002 Probability and Statistics for Data Science. Carlos Fernandez-Granda

Expectation. DS GA 1002 Probability and Statistics for Data Science.   Carlos Fernandez-Granda Expectation DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean,

More information

Spectral representations and ergodic theorems for stationary stochastic processes

Spectral representations and ergodic theorems for stationary stochastic processes AMS 263 Stochastic Processes (Fall 2005) Instructor: Athanasios Kottas Spectral representations and ergodic theorems for stationary stochastic processes Stationary stochastic processes Theory and methods

More information

Lecture 4: September Reminder: convergence of sequences

Lecture 4: September Reminder: convergence of sequences 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 4: September 6 In this lecture we discuss the convergence of random variables. At a high-level, our first few lectures focused

More information

Chapter 2. Discrete Distributions

Chapter 2. Discrete Distributions Chapter. Discrete Distributions Objectives ˆ Basic Concepts & Epectations ˆ Binomial, Poisson, Geometric, Negative Binomial, and Hypergeometric Distributions ˆ Introduction to the Maimum Likelihood Estimation

More information

Scientific Computing: Monte Carlo

Scientific Computing: Monte Carlo Scientific Computing: Monte Carlo Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course MATH-GA.2043 or CSCI-GA.2112, Spring 2012 April 5th and 12th, 2012 A. Donev (Courant Institute)

More information

Continuous Random Variables and Continuous Distributions

Continuous Random Variables and Continuous Distributions Continuous Random Variables and Continuous Distributions Continuous Random Variables and Continuous Distributions Expectation & Variance of Continuous Random Variables ( 5.2) The Uniform Random Variable

More information

THEODORE VORONOV DIFFERENTIABLE MANIFOLDS. Fall Last updated: November 26, (Under construction.)

THEODORE VORONOV DIFFERENTIABLE MANIFOLDS. Fall Last updated: November 26, (Under construction.) 4 Vector fields Last updated: November 26, 2009. (Under construction.) 4.1 Tangent vectors as derivations After we have introduced topological notions, we can come back to analysis on manifolds. Let M

More information

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( ) Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca October 22nd, 2014 E. Tanré (INRIA - Team Tosca) Mathematical

More information

Chapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University

Chapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University Chapter 3, 4 Random Variables ENCS6161 - Probability and Stochastic Processes Concordia University ENCS6161 p.1/47 The Notion of a Random Variable A random variable X is a function that assigns a real

More information

Probability, Random Processes and Inference

Probability, Random Processes and Inference INSTITUTO POLITÉCNICO NACIONAL CENTRO DE INVESTIGACION EN COMPUTACION Laboratorio de Ciberseguridad Probability, Random Processes and Inference Dr. Ponciano Jorge Escamilla Ambrosio pescamilla@cic.ipn.mx

More information

Outline. Scientific Computing: An Introductory Survey. Nonlinear Equations. Nonlinear Equations. Examples: Nonlinear Equations

Outline. Scientific Computing: An Introductory Survey. Nonlinear Equations. Nonlinear Equations. Examples: Nonlinear Equations Methods for Systems of Methods for Systems of Outline Scientific Computing: An Introductory Survey Chapter 5 1 Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign

More information

Lecture 3 - Expectation, inequalities and laws of large numbers

Lecture 3 - Expectation, inequalities and laws of large numbers Lecture 3 - Expectation, inequalities and laws of large numbers Jan Bouda FI MU April 19, 2009 Jan Bouda (FI MU) Lecture 3 - Expectation, inequalities and laws of large numbersapril 19, 2009 1 / 67 Part

More information

ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process

ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process Department of Electrical Engineering University of Arkansas ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process Dr. Jingxian Wu wuj@uark.edu OUTLINE 2 Definition of stochastic process (random

More information

Probability Review. Yutian Li. January 18, Stanford University. Yutian Li (Stanford University) Probability Review January 18, / 27

Probability Review. Yutian Li. January 18, Stanford University. Yutian Li (Stanford University) Probability Review January 18, / 27 Probability Review Yutian Li Stanford University January 18, 2018 Yutian Li (Stanford University) Probability Review January 18, 2018 1 / 27 Outline 1 Elements of probability 2 Random variables 3 Multiple

More information

Multivariate Distribution Models

Multivariate Distribution Models Multivariate Distribution Models Model Description While the probability distribution for an individual random variable is called marginal, the probability distribution for multiple random variables is

More information

2.1 Elementary probability; random sampling

2.1 Elementary probability; random sampling Chapter 2 Probability Theory Chapter 2 outlines the probability theory necessary to understand this text. It is meant as a refresher for students who need review and as a reference for concepts and theorems

More information

Regression Analysis. Ordinary Least Squares. The Linear Model

Regression Analysis. Ordinary Least Squares. The Linear Model Regression Analysis Linear regression is one of the most widely used tools in statistics. Suppose we were jobless college students interested in finding out how big (or small) our salaries would be 20

More information

Nonlife Actuarial Models. Chapter 14 Basic Monte Carlo Methods

Nonlife Actuarial Models. Chapter 14 Basic Monte Carlo Methods Nonlife Actuarial Models Chapter 14 Basic Monte Carlo Methods Learning Objectives 1. Generation of uniform random numbers, mixed congruential method 2. Low discrepancy sequence 3. Inversion transformation

More information