© 2011 Peter A. Maginnis

VARIANCE REDUCTION FOR POISSON AND MARKOV JUMP PROCESSES

BY PETER A. MAGINNIS

THESIS

Submitted in partial fulfillment of the requirements for the degree of Master of Science in Mechanical Engineering in the Graduate College of the University of Illinois at Urbana-Champaign, 2011

Urbana, Illinois

Advisers: Assistant Professor Matthew West, Professor Geir Dullerud

ABSTRACT

This thesis develops new variance reduction algorithms for the simulation and estimation of stochastic dynamic models. It provides particular application to particle dynamics models including an emissions process and radioactive decay. These algorithms apply several variance reduction techniques to the generation of Poisson variates in the tau-leaping time-stepping method for Markov processes. Both antithetical and stratified sampling variance-reduction techniques are considered for Poisson mean estimation, and a hybridization of them is developed that has lower variance than either for every value of the Poisson parameter. Several analytical characterizations of estimator variance are proven for different Poisson parameter regimes. By applying these variance-reduced Poisson mean estimation techniques in an appropriate dynamic fashion to the tau-leaping method, variance-reduced pathwise mean estimators are generated for stochastic Markov processes. It is numerically demonstrated that stepwise variance reduction produces pathwise variance reduction in estimators of systems of physical interest.

To my girlfriend and family, for their love and support. To my advisers, for their patient guidance and interesting conversations.

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
CHAPTER 2 PRELIMINARIES
  2.1 Probability
  2.2 Well known distributions
  2.3 The strong law of large numbers
CHAPTER 3 POISSON MEAN ESTIMATORS
  3.1 Naive Monte Carlo
  3.2 Antithetical
  3.3 Stratified
  3.4 Hybrid
CHAPTER 4 ANALYTICAL RESULTS
  4.1 Estimator Variance for Small Parameter Values
  4.2 Proof of Large Parameter Bound for Antithetical Estimator Variance
  4.3 Global Bounds on Hybrid Estimator Variance
CHAPTER 5 PATHWISE MEAN ESTIMATORS
  5.1 Particle Emissions
  5.2 Radioactive Decay
  5.3 Pathwise Comparison and Error Quantification
CHAPTER 6 NUMERICAL RESULTS
  6.1 Poisson Mean Estimation
  6.2 Emissions Pathwise Mean Estimation
  6.3 Decay Pathwise Mean Estimation
REFERENCES

CHAPTER 1
INTRODUCTION

The roads that lead advanced research toward stochastic processes are numerous. Many deterministic systems exhibit features too complex or high-dimensional to treat using traditional analytical or numerical solution techniques. As a result, scientists and engineers often use stochastic models to describe a wide variety of systems. Physical examples are readily available, including topics as diverse as atomistic-scale materials [7], complex fluid/aerosol mixtures [4], granular materials [], and biological and nanoscale environments [9]. Stochastic systems can provide cheap, accurate models of extremely complex dynamics (e.g., particle emissions, Section 5.1), or define inherently stochastic systems (e.g., radioactive decay, Section 5.2). Whether stochastic processes are being studied intrinsically or to approximate a deterministic counterpart, their relevance to almost all fields of modern scientific research is considerable. The importance of stochastic systems, however, does not imply their ease of analysis. Even relatively simple stochastic systems can defy analytical solution. Given a dearth of closed-form solutions, simulation of stochastic systems is often the only reasonable line of research. One canonical problem in the study of stochastic systems is the determination of the expected behavior of the model. Here, many independent sample paths of the system can be produced, and, when aggregated, reveal the underlying mean behavior of the system. Such Monte Carlo methods are particularly effective in models with non-linear or highly multi-dimensional characteristics that render other numerical methods ineffective or computationally infeasible. The primary cost of Monte Carlo simulation is the expense of drawing large numbers of samples. While convergence may be sure, it may also be slow, usually on the order of $1/\sqrt{n}$ in expected mean error, where $n$ is the number of samples used. Thus, achieving high resolution of a particular system can easily become costly. The source of this cost is the variance of the Monte Carlo estimate.
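To make the $1/\sqrt{n}$ behaviour concrete, the following short Python sketch (an illustration with arbitrary parameter choices, not one of the thesis experiments) measures the empirical root-mean-square error of a naive Monte Carlo mean estimate of a Poisson variable as the sample size grows:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, reps = 10.0, 2_000

for n in (100, 400, 1_600, 6_400):
    # reps independent naive Monte Carlo estimates, each built from n samples
    estimates = rng.poisson(lam, size=(reps, n)).mean(axis=1)
    rmse = np.sqrt(np.mean((estimates - lam) ** 2))
    print(n, rmse)   # quadrupling n roughly halves the error
```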

Since such an estimate is an aggregation of random objects, it too is a random object. Thus, any given iteration of the estimator may show significant error from the true mean to be estimated. Of course, the law of large numbers applied to a consistent estimator ensures us that this variance, and hence our expected error, will converge to zero eventually, but as in most engineering contexts the primary question is: what precision can you buy with a given computational budget? This is where variance reduction techniques become indispensable. Given an unbiased estimator, decreasing its variance leads directly to smaller sample sizes needed to achieve the same precision. In fact, for scalar systems, the variance of an unbiased estimator is precisely its expected mean-square error. In higher-dimensional systems, the two quantities are still closely related. As a result, variance-reduced Monte Carlo estimation can produce equivalently precise results for a reduced computational cost. Many techniques are available to reduce the computational cost of stochastic simulation. A common and useful class of stochastic systems is the Markov process. Under general conditions, stochastic systems whose transition distributions depend only on their current state and not on past history are Markov processes. Exact methods for simulation of Markov processes exist, such as the method of Gillespie [7] developed to stochastically simulate a coalescence model for cloud droplet growth [6]. However, this exact simulation can become expensive as events occur more frequently. This phenomenon becomes particularly damaging when events of interest are relatively infrequent, while other less important but still necessary events occur often. Note that in this situation the cost of generating useful samples grows rapidly. One method, known as tau-leaping, to mitigate this difficulty was developed by Gillespie [8]. Tau-leaping exploits the structure of discrete-event Markov processes, namely exponentially distributed event times, to approximately simulate Markov systems using a time discretization and the sampling of Poisson random variables. The convergence and stability of this technique relative to exact simulation have been demonstrated by Rathinam et al. [3], and much progress has been made to further reduce computational cost. Significant advances have been achieved, including adaptive step size selection by Cao et al. [3] and an implicit tau-leaping method by Rathinam et al. []. Variance reduction techniques could be developed for application to tau-leaping to further reduce the cost of simulation. In service of this goal, we implement and analyze three techniques for variance reduction on the sampling of Poisson random variables.

These are the well established techniques of antithetic sampling [5, pg. 43] and stratified sampling [5, pg. 55], as well as a method hybridizing the two. These techniques are well known and widely applied, for example used in work from production cost modeling [0] to estimating Fourier transform integrals []. Here we apply them to a construction of the underlying sample space of Poisson random variables. Furthermore, we approximately simulate a pair of stochastic systems using tau-leaping, then we apply variance reduction techniques stepwise to the algorithms, and show improvement in pathwise variance. The first of these two systems we model is the particle emissions process. Here, particles are randomly emitted into a particle population at a rate prescribed by a time-inhomogeneous rate function $\lambda(t)$. The simple time-varying dynamics provide a base case for the implementation of stepwise variance reduction techniques in service of pathwise variance reduction. The second system to be examined is radioactive particle decay. While still a relatively simple stochastic system, radioactive decay introduces an important feature: state feedback. The stochastic rate of decrease of the state of the system decreases with the state. As becomes clear in the development of this system, state feedback requires that we modify the variance-reduced Poisson sampling techniques, and the changes necessary are defined and implemented. Another primary thrust of exploration is the dependence of the variance reduction algorithms on the parameter $\lambda$ of the Poisson distribution. This distribution has support on all of $\mathbb{Z}_+$, and exhibits inherent asymmetry. However, as this parameter becomes large, the Poisson distribution begins to develop symmetry. In fact, under a suitable linear transform the Poisson distribution converges uniformly to the unit normal distribution as $\lambda$ becomes large [5], and the antithetic variance reduction technique is particularly suited to exploit this asymptotic behavior. As we will demonstrate and prove, the other standard technique applied, stratified sampling, is better suited to small and intermediate values of $\lambda$. Furthermore, for $\lambda$ taking values in certain regions of $\mathbb{R}_+$, mathematical analysis of these algorithms is feasible, and we postulate and prove a few analytical results. Chapter 2 provides a short review of several necessary mathematical topics. In Chapter 3, variance-reduced algorithms for mean estimation of Poisson random variables are developed and we define notation for their analysis. In Chapter 4 we state and prove several analytical results.

First, we prove two small results quantifying the variance reduction provided by the antithetical and stratified Poisson mean estimators. Next, we prove an asymptotic bound for the variance of the antithetical Poisson mean estimator for all sufficiently large values of $\lambda$. In the last section of Chapter 4, we prove two global results comparing the variance of the hybrid Poisson mean estimator to the stratified and antithetical estimators. In Chapter 5, we apply tau-leaping to the particle emissions and radioactive decay stochastic systems. We adapt the single-step variance reduction techniques to the Poisson sampling steps in the simulation of each system. Also, a metric is defined to quantify and estimate pathwise error. Chapter 6 collects the numerical results of simulation of the processes outlined in Chapters 3 and 5. We examine the relationship between the variance of each Poisson mean estimator and the Poisson parameter. We compare the estimated and analytical variance of both the antithetical and stratified estimators. We successfully demonstrate pathwise variance reduction in the particle emissions model using the antithetical and stratified schemes. Finally, we show that antithetic sampling reduces pathwise variance in estimation of the radioactive decay model.

CHAPTER 2
PRELIMINARIES

Before details of the research are presented, we provide a summary of important mathematical concepts used in this thesis. While some experience with probability and statistics is recommended to gain full value from this work, the ideas contained in this chapter present a brief review and should provide enough detail to make the thesis comprehensible to readers with experience outside of probability. The main points of this chapter are a short compilation of probability theory and notation, a few useful named classes of distributions, and a statement of the strong law of large numbers.

2.1 Probability

A probability space $(\Omega, \mathcal{F}, P)$ is composed of a set $\Omega$ with elements $\omega$, a $\sigma$-algebra $\mathcal{F}$ on $\Omega$, and a non-negative measure $P$ on $\mathcal{F}$ such that $P(\Omega) = 1$. Objects related to probability spaces that are typically of greatest practical interest are random variables. A random variable is an $\mathcal{F}$-measurable function, say $X(\omega)$, from $\Omega$ to another space, say $\mathcal{X}$. We may think of $\Omega$ as the set of all possible outcomes of a random experiment and $\mathcal{F}$ as a collection of all sets of outcomes that can be differentiated from other sets of outcomes; these sets are called events. Likewise, $X(\omega)$ is an observable measurement, and $P$ measures the likelihood of a given set of outcomes occurring. For example, $P\{\omega : X(\omega) \in \mathcal{X}\} = 1$, since, for every $\omega \in \Omega$, $X(\omega) \in \mathcal{X}$. We may also say for short that the probability that $X$ is in $\mathcal{X}$ is one. By convention, $X(\omega)$ is often simply denoted $X$, and $\{\omega : X(\omega) \in A \subseteq \mathcal{X}\} \in \mathcal{F}$ is more commonly abbreviated $\{X \in A\}$. As it is merely a measurable function, a given random variable $X$ may take on many forms. One way to describe a random variable, short of supplying its specific functional form, is its distribution. A distribution may be expressed in several ways.

The law $\mu$ of a random variable $X$ is defined as
$$\mu(A) = P\{X \in A\}, \qquad (2.1)$$
where $A \subseteq \mathcal{X}$ is such that $\{X \in A\} \in \mathcal{F}$. Two random variables have the same distribution if their laws are equal except on sets of measure 0. Another characterization of the distribution of a random variable is its cumulative distribution function (CDF). Suppose that $\mathcal{X} = \mathbb{R}$ and $X$ is a random variable taking values in $\mathcal{X}$. The cumulative distribution function $F$ of $X$ is defined as
$$F : \mathbb{R} \to [0, 1], \qquad F : x \mapsto P\{X \leq x\}, \qquad (2.2)$$
a nondecreasing function taken to be right continuous. Note that for any $X$ taking values in $\mathbb{R}$, $\lim_{x \to \infty} F(x) = 1$ and $\lim_{x \to -\infty} F(x) = 0$. A collection of random variables $\{X_1, X_2, \ldots, X_n\}$ has a joint distribution function
$$F(x_1, x_2, \ldots, x_n) := P\{X_1 < x_1, X_2 < x_2, \ldots, X_n < x_n\}. \qquad (2.3)$$
The collection of random variables is said to be independent if
$$F(x_1, x_2, \ldots, x_n) = F(x_1) F(x_2) \cdots F(x_n). \qquad (2.4)$$
When $X$ takes continuous values, if there exists a function $f : \mathbb{R} \to \mathbb{R}_+$ such that $F(x) = \int_{-\infty}^{x} f(t)\, dt$, then $f$ is called the probability density function of $X$. In this case,
$$\mu(A) = \int_A f(x)\, dx. \qquad (2.5)$$
If $X$ is a discrete random variable, then its probability mass function is defined
$$f(m) = \mu\{X = m\} = P\{X = m\}. \qquad (2.6)$$
There are a few important functionals of a random variable that are frequently used.

The first, and most important, functional is the expected value $E$ of a random variable $X$. It is defined
$$E[X] := \int_\Omega X(\omega)\, d\omega. \qquad (2.7)$$
Note that the expectation is, by definition, a linear functional. If $X$ is a continuous-valued random variable in $\mathbb{R}$ with probability density function $f$, then the expected value of $X$ is
$$E[X] = \int_{\mathbb{R}} x f(x)\, dx. \qquad (2.8)$$
Equivalently, if $X$ is a discrete random variable, the expected value of $X$ is given by
$$E[X] = \sum_{m \in \mathcal{X}} m f(m). \qquad (2.9)$$
Since any measurable function $g$ composed with $X$ is a random variable, we may extend the definition of expectation to include
$$E[g(X)] := \int_\Omega g(X(\omega))\, d\omega. \qquad (2.10)$$
This definition admits the natural extensions to probability density and mass functions as above, i.e.,
$$E[g(X)] = \int_{\mathbb{R}} g(x) f(x)\, dx, \qquad (2.11)$$
if $X$ is a continuous-valued random variable in $\mathbb{R}$ with probability density function $f$, and
$$E[g(X)] = \sum_{m \in \mathcal{X}} g(m) f(m), \qquad (2.12)$$
if $X$ is a discrete random variable with probability mass function $f$. Another important functional of two random variables $X$ and $Y$ is the covariance $\mathrm{Cov}$, defined
$$\mathrm{Cov}(X, Y) := E\left[(X - E[X])(Y - E[Y])\right]. \qquad (2.13)$$
Note that since $E$ is a linear functional of a random variable, $\mathrm{Cov}$ is bilinear, that is, it is linear in each of its arguments. A particularly common and interesting case is the covariance of a random variable with itself, i.e., $\mathrm{Cov}(X, X)$.

Such a form is a functional of a single random variable and is almost universally referred to as the variance of $X$. Note that, in particular,
$$\mathrm{Var}(X) := E\left[(X - E[X])^2\right] = E[X^2] - E[X]^2, \qquad (2.14)$$
where the last equality follows from the fact that $E$ is a linear functional and $E[X]$ is a fixed number. Qualitatively, the expected value may be thought of as the average or typical value of a random variable, and the variance may be thought of as a measure of the tendency of a random variable to take values away from its mean. In other words, the variance measures the typical dispersion of a distribution. Note that the expectation of a constant is itself and the variance of a constant is 0. One final equivalent representation of expectation can be defined using the CDF of $X$, $F$. If $F$ is invertible such that for $u \in [0, 1]$, except perhaps on a set of measure zero, $F(F^{-1}(u)) = u$, then we may consider, for any random variable $X(\omega)$, another random variable with the same distribution, $X(u) := F^{-1}(u)$. In this case, we may consider $[0, 1]$ to be the sample space of $X(u)$, with Lebesgue measure as its probability measure, and thus
$$E[X(u)] = \int_0^1 F^{-1}(u)\, du. \qquad (2.15)$$
If these requirements hold, then $E[X] = E[X(u)]$. If $F$ is not invertible, say for example not strictly monotone, the same conclusion holds for
$$F^{-1}(u) := \inf\{x : F(x) \geq u\}. \qquad (2.16)$$

2.2 Well known distributions

While there are an uncountable number of possible random variables and distributions, several important parameterized classes are known and their properties well studied. Four important classes are the uniform, exponential, normal (or Gaussian), and Poisson distributions. The uniform distribution refers to two different classes of distributions, one discrete and one continuous, each taking two parameters $a$ and $b$.

The continuous uniform distribution is more relevant to the development here. A random variable $X$ with uniform$(a, b)$ distribution, denoted $X \sim \mathrm{Unif}(a, b)$, is real valued and takes values in the interval $[a, b]$ with equal probability. It has CDF
$$F(x) = \begin{cases} 0 & \text{if } x < a, \\ \frac{x - a}{b - a} & \text{if } a \leq x < b, \\ 1 & \text{else.} \end{cases} \qquad (2.17)$$
It has mean $E[X] = \frac{a + b}{2}$ and variance $\mathrm{Var}(X) = \frac{(b - a)^2}{12}$. Most numerical random sampling is performed using approximately $\mathrm{Unif}(0, 1)$ pseudorandom numbers transformed into other distributions. The next important class of probability distributions is the exponential distribution. It takes a single parameter $\lambda > 0$. The exponential distribution can express the time until the next event if events occur in continuous time with constant rate and their time of arrival is independent of the time of the last arrival. If $X \sim \mathrm{Exp}(\lambda)$, it has CDF
$$F(x) = \begin{cases} 1 - \exp(-\lambda x) & \text{if } x \geq 0, \\ 0 & \text{else.} \end{cases} \qquad (2.18)$$
It has mean $E[X] = \frac{1}{\lambda}$ and variance $\mathrm{Var}(X) = \frac{1}{\lambda^2}$. The normal or Gaussian distribution has support on all of $\mathbb{R}$. It takes two parameters, $\mu$ and $\sigma^2$, which are the mean and variance of the distribution. Its probability density function is the well-known bell curve
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right). \qquad (2.19)$$
If a random variable $X$ has normal distribution, we write $X \sim \mathcal{N}(\mu, \sigma^2)$. The normal distribution enjoys the property that for such a random variable, $X = \mu + \sigma Z$, where $Z \sim \mathcal{N}(0, 1)$, the standard unit normal distribution. The CDF of the unit normal distribution is often denoted
$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{t^2}{2}\right) dt = \frac{1}{2}\,\mathrm{erfc}\left(-\frac{x}{\sqrt{2}}\right). \qquad (2.20)$$

Note that $\Phi(x)$ is symmetric about $(0, \frac{1}{2})$, namely
$$\Phi(-x) = 1 - \Phi(x). \qquad (2.21)$$
The last important distribution named here is the Poisson distribution. Like the exponential distribution, it takes a single parameter $\lambda > 0$. It can express the number of events that occur in a fixed amount of time if events may occur at exponential rate $\lambda$. The Poisson distribution is a discrete distribution taking values on $\mathbb{Z}_+$. Its probability mass function is
$$f(m) = \begin{cases} \frac{\lambda^m e^{-\lambda}}{m!} & \text{if } m \in \mathbb{Z}_+, \\ 0 & \text{else,} \end{cases} \qquad (2.22)$$
and thus it has CDF
$$F(m) = \begin{cases} \sum_{k=0}^{m} \frac{\lambda^k e^{-\lambda}}{k!} & \text{if } m \in \mathbb{Z}_+, \\ 0 & \text{else.} \end{cases} \qquad (2.23)$$
If $X \sim \mathrm{Pois}(\lambda)$, then $E[X] = \lambda$ and $\mathrm{Var}(X) = \lambda$. Independent Poisson distributed random variables $X \sim \mathrm{Pois}(\lambda_1)$ and $Y \sim \mathrm{Pois}(\lambda_2)$ enjoy the property that $X + Y \sim \mathrm{Pois}(\lambda_1 + \lambda_2)$.

2.3 The strong law of large numbers

Suppose $X_1, X_2, \ldots$ is a sequence of independent random variables, each with the same distribution (henceforward abbreviated i.i.d., for independent identically distributed), and suppose further that $E[|X_n|] < \infty$ and $E[X_n] = \mu$ for every $n$. Define
$$S_n := \sum_{i=1}^{n} X_i.$$
Then,
$$P\left\{ \lim_{n \to \infty} \frac{S_n}{n} = \mu \right\} = 1. \qquad (2.24)$$
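As a concrete illustration of the generalized inverse (2.16), the inverse-CDF representation of expectation (2.15), and the strong law (2.24), the following Python sketch inverts the Poisson CDF at uniform variates and checks that the sample mean approaches $\lambda$; the helper name and the parameter values are illustrative only.

```python
import numpy as np

def poisson_inv_cdf(u, lam):
    """Generalized inverse F^{-1}(u) = inf{m : F(m) >= u} for Pois(lam),
    computed by accumulating the probability mass function from m = 0."""
    m, pmf = 0, np.exp(-lam)
    cdf = pmf
    while cdf < u:
        m += 1
        pmf *= lam / m          # p(m) = p(m - 1) * lam / m
        cdf += pmf
    return m

rng = np.random.default_rng(0)
lam, n = 3.0, 100_000
samples = np.array([poisson_inv_cdf(u, lam) for u in rng.uniform(size=n)])

# E[X] = integral_0^1 F^{-1}(u) du, approximated by the sample mean, cf. (2.15) and (2.24)
print(samples.mean(), "vs", lam)
```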

CHAPTER 3
POISSON MEAN ESTIMATORS

We begin by constructing several estimators for the mean of a Poisson random variable. The estimators considered here are consistent, meaning they converge to the mean of the Poisson distribution from which they are sampled as their indices approach infinity. Furthermore, the estimators are unbiased, meaning that the expected value of each estimator is the same as the true mean of the Poisson distribution from which they are drawn, for every value of their indices. The typical application for these estimators is parameter estimation, in this case estimating the value of $\lambda$. The first estimator constructed is the naive Monte Carlo mean estimator. The estimator with index $n$ is simply the sample mean of $n$ independent, identically distributed random samples from the Poisson distribution. The next two estimators each use well-known variance reduction techniques. The first is the antithetical estimator, which again draws Poisson random variables and averages them, but instead of drawing these samples independently, it introduces negative correlation between pairs of samples. This negative correlation helps to reduce the variance of the estimator. The second well known variance reduction technique is implemented in the stratified mean estimator. The primary idea of stratified variance reduction is that instead of drawing many samples from the whole distribution, the distribution is partitioned into some number of strata and some independent samples are drawn from each stratum. This sampling is performed in such a way that, while each sample does not have the overall distribution, their average still converges to the expected value of the distribution. The imposed spacing of the samples throughout the support of the distribution reduces the variance inherent to the naive estimator. The last estimator constructed uses a hybridization of antithetical and stratified estimation. The domain is divided into an even number of strata, and samples are drawn from each stratum so that negative correlation exists between pairs of samples.

In essence, the hybrid estimator is in particular both stratified and antithetical. A pair of analytical results making this intuition more precise is proven in Chapter 4.

3.1 Naive Monte Carlo

Denote the naive sample mean estimator for the expected value of a Poisson random variable by
$$\delta^n := \frac{1}{n} \sum_{i=1}^{n} X^i, \qquad (3.1)$$
where the $X^i$ are i.i.d. $\mathrm{Pois}(\lambda)$. By the strong law of large numbers, $\delta^n \to E[X^i] = \lambda$ almost surely, and by the central limit theorem, under very weak assumptions we know that this convergence is at least $O(1/\sqrt{n})$. In order to simulate Poisson random variables, the inverse of the CDF $F(m)$ must be formally defined:
$$F^{-1} : [0, 1) \to \mathbb{Z}_+, \qquad F^{-1} : u \mapsto \inf\{m : F(m) > u\}, \qquad (3.2)$$
where $F$ is the CDF of $X^i \sim \mathrm{Pois}(\lambda)$. This is necessary because many numerical routines only have access to the simulation of pseudorandom numbers with approximately uniform$(0, 1)$ distribution. In order to sample other distributions, the algorithms used must implement a way to transform uniform$(0, 1)$ random numbers into the desired distribution. The inverse CDF provides such a transform. If $u \sim \mathrm{Unif}(0, 1)$, then $F^{-1}(u) \sim \mathrm{Pois}(\lambda)$. This inversion is performed by searching from 0 to find the first non-negative integer $m$ such that $F(m) > u$. In general, the CDF inversion step used in these estimators can become computationally expensive, particularly when the value of $\lambda$ is large and hence the typical number of steps taken from 0 to find the infimum in (3.2) is large. To mitigate this growth in cost, the following algorithm is implemented, exploiting the uniform convergence of the Poisson distribution to the $\mathcal{N}(\lambda, \lambda)$ distribution proven by Curtiss [5].

To generate a Poisson random variable, first generate $Z \sim \mathcal{N}(0, 1)$. This is done extremely efficiently via the Box-Muller method []. Set $u = \Phi(Z)$. Then $u \sim \mathrm{Unif}(0, 1)$. Invert the Poisson CDF by performing a linear search on $\mathbb{Z}_+$ as follows: initialize the guess at $m_0 = \max\{\lfloor \lambda + \sqrt{\lambda}\, Z \rfloor, 0\}$. If $F(m_0) < u$, search up the integers in $m$ from $m_0$ until the first $m$ such that $F(m) > u$. Return $m$. If $F(m_0) > u$, search down the integers in $m$ starting at $m_0$ until the first $m$ such that $F(m) < u$. Return $m + 1$. This algorithm is of constant computational order in $\lambda$, and will be particularly useful in the implementation of the next estimator.

3.2 Antithetical

To sample the antithetical estimator, draw a uniform variate and invert it and its antithetic pair:
$$v^i \overset{\text{i.i.d.}}{\sim} \mathrm{Unif}(0, 1),$$
$$Y^i_1 := F^{-1}(v^i), \qquad (3.3)$$
$$Y^i_2 := F^{-1}(1 - v^i), \qquad (3.4)$$
so that $Y^i_1, Y^i_2 \sim \mathrm{Pois}(\lambda)$. This inversion is performed using the same algorithm as before, except that two inversions of an antithetic pair must be performed. Perform the first: set $v^i = \Phi(Z)$ and calculate $F^{-1}(v^i)$, as above. Now, note that by the antisymmetry of the normal CDF, if $v^i = \Phi(Z)$, then $\Phi(-Z) = 1 - v^i$. Thus to perform the second inversion, use the same algorithm except set $m_0 = \max\{\lfloor \lambda - \sqrt{\lambda}\, Z \rfloor, 0\}$ and, instead of comparing evaluations of the CDF to $v^i = \Phi(Z)$, compare them to $1 - v^i = \Phi(-Z)$. Now, define
$$Y^i := \frac{Y^i_1 + Y^i_2}{2} \qquad (3.5)$$
and define the antithetical estimator of the mean as
$$\delta^n_A := \frac{1}{n} \sum_{i=1}^{n} Y^i. \qquad (3.6)$$
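The inversion routine just described and the antithetical estimator (3.6) might be realized as in the following Python sketch; it relies on SciPy's `poisson.cdf` and `norm.cdf` in place of hand-rolled CDF evaluations, so it should be read as one possible implementation rather than the thesis code.

```python
import numpy as np
from scipy.stats import poisson, norm

def poisson_from_normal(z, lam):
    """Invert the Pois(lam) CDF at u = Phi(z), starting the linear search
    near lam + sqrt(lam) * z (the N(lam, lam) approximation)."""
    u = norm.cdf(z)
    m = max(int(lam + np.sqrt(lam) * z), 0)
    if poisson.cdf(m, lam) < u:
        while poisson.cdf(m, lam) < u:                 # search upward
            m += 1
        return m
    while m > 0 and poisson.cdf(m - 1, lam) >= u:      # search downward
        m -= 1
    return m

def antithetic_estimate(lam, n, rng):
    """Antithetical Poisson mean estimator (3.6): pairs F^-1(Phi(Z)), F^-1(Phi(-Z))."""
    z = rng.standard_normal(n)
    y1 = np.array([poisson_from_normal(zi, lam) for zi in z])
    y2 = np.array([poisson_from_normal(-zi, lam) for zi in z])
    return np.mean(0.5 * (y1 + y2))

rng = np.random.default_rng(1)
print(antithetic_estimate(lam=10.0, n=10_000, rng=rng))   # close to 10
```

Replacing the second inversion with an independent draw recovers the naive estimator (3.1), which makes side-by-side variance comparisons straightforward.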

3.3 Stratified

Next, the stratified estimator is constructed. Let $\{A_j\}_{j=1}^{4}$ partition $[0, 1)$ such that $A_j = \left[\frac{j-1}{4}, \frac{j}{4}\right)$. For each $j$, draw $u^i_j \sim \mathrm{Unif}(A_j)$, independent in $j$ and i.i.d. in $i$. Let $Z^i_j := F^{-1}(u^i_j)$ and
$$Z^i := \frac{1}{4} \sum_{j=1}^{4} Z^i_j. \qquad (3.7)$$
Thus let the stratified estimator of the mean be defined as
$$\delta^{4n}_{S,4} := \frac{1}{n} \sum_{i=1}^{n} Z^i, \qquad (3.8)$$
where the $Z^i$ are i.i.d. samples. In this definition, the strata were chosen to partition $[0, 1)$, which we can easily fix here to be the underlying sample space $\Omega$. Unlike more traditional strata, which are chosen to partition the state space, these strata can correspond to shared states. That is, two points in different strata may map to the same state under the mapping $F^{-1}$. This distinction, however, is of little consequence as the strata are still nonintersecting and preserve the correct distribution under $F^{-1}$. Note here that this stratified estimator serves strictly as a point of reference. The choices made in its construction are simple and effective, but certainly not optimal. The proportional allocation scheme used here is often the best choice if no information is used about the variances within strata. Indeed, variances within strata can be difficult to compute explicitly in general, and in practice they are often pre-estimated numerically in order to fix the stratified scheme. Also note that the choice of four strata, each with equal probability, is largely a matter of convenience in calculations. One easy extension is to $M$ equally probable strata. In this case, define $\{A_j\}_{j=1}^{M} := \left\{\left[\frac{j-1}{M}, \frac{j}{M}\right)\right\}$. For each $j$, let $Z^i_j := F^{-1}(u^i_j)$, where $u^i_j \overset{\text{i.i.d.}}{\sim} \mathrm{Unif}(A_j)$, and
$$Z^i := \frac{1}{M} \sum_{j=1}^{M} Z^i_j. \qquad (3.9)$$
Then we may define the stratified estimator over $M$ uniform probability strata as

$$\delta^{Mn}_{S,M} := \frac{1}{n} \sum_{i=1}^{n} Z^i, \qquad (3.10)$$
where the $Z^i$ are sampled i.i.d.

3.4 Hybrid

The constructions of the antithetical and stratified mean estimators suggest the possibility of a hybridized algorithm that shares in the variance reduction properties of either. We construct the hybrid mean estimator as follows. Draw two uniform variates $v^i_1 \sim \mathrm{Unif}(A_1)$ and $v^i_2 \sim \mathrm{Unif}(A_2)$, where the probability strata $A_j$ are defined as above for the stratified estimator, such that $1 - v^i_1 \sim \mathrm{Unif}(A_4)$ and $1 - v^i_2 \sim \mathrm{Unif}(A_3)$ almost surely. Set
$$H^i_1 := F^{-1}(v^i_1), \qquad (3.11)$$
$$H^i_2 := F^{-1}(v^i_2), \qquad (3.12)$$
$$H^i_3 := F^{-1}(1 - v^i_2), \qquad (3.13)$$
$$H^i_4 := F^{-1}(1 - v^i_1), \qquad (3.14)$$
so that each $H^i_j$ has the same marginal distribution as each $Z^i_j$ from the stratified estimator. At the same time, there exists correlation between $H^i_1$ and $H^i_4$ and between $H^i_2$ and $H^i_3$, analogously to the correlation between $Y^i_1$ and $Y^i_2$ in the antithetical estimator. Set
$$H^i := \frac{1}{4} \sum_{j=1}^{4} H^i_j, \qquad (3.15)$$
and now define the hybrid Poisson mean estimator
$$\delta^{4n}_{H,4} := \frac{1}{n} \sum_{i=1}^{n} H^i, \qquad (3.16)$$
where the $H^i$ are sampled i.i.d. Note also that extension of the hybrid Poisson mean estimator to any even number $M$ of the uniform strata $\{A_j\}_{j=1}^{M}$ is simple. For $j \in \{1, \ldots, \frac{M}{2}\}$, draw $v^i_j \overset{\text{i.i.d.}}{\sim} \mathrm{Unif}(A_j)$. For $j \in \{\frac{M}{2} + 1, \ldots, M\},$

set $v^i_j = 1 - v^i_{M+1-j}$. Define
$$H^i := \frac{1}{M} \sum_{j=1}^{M} H^i_j, \qquad (3.17)$$
where $H^i_j := F^{-1}(v^i_j)$, and define
$$\delta^{Mn}_{H,M} := \frac{1}{n} \sum_{i=1}^{n} H^i, \qquad (3.18)$$
where the $H^i$ are sampled i.i.d.
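A compact sketch of the $M$-stratum stratified estimator (3.10) and hybrid estimator (3.18) follows, using SciPy's `poisson.ppf` as the generalized inverse $F^{-1}$; the function names and parameter values are ours and purely illustrative.

```python
import numpy as np
from scipy.stats import poisson

def stratified_estimate(lam, n, M, rng):
    """Stratified estimator (3.9)-(3.10): one Unif(A_j) draw per stratum
    A_j = [(j-1)/M, j/M), inverted through the Poisson CDF and averaged."""
    u = (np.arange(M) + rng.uniform(size=(n, M))) / M
    return poisson.ppf(u, lam).mean()

def hybrid_estimate(lam, n, M, rng):
    """Hybrid estimator (3.17)-(3.18): stratified draws in the first M/2 strata,
    reflected antithetically (v -> 1 - v) into the last M/2 strata."""
    assert M % 2 == 0
    v_low = (np.arange(M // 2) + rng.uniform(size=(n, M // 2))) / M
    v = np.concatenate([v_low, 1.0 - v_low[:, ::-1]], axis=1)
    return poisson.ppf(v, lam).mean()

rng = np.random.default_rng(2)
print(stratified_estimate(5.0, 2_000, 4, rng), hybrid_estimate(5.0, 2_000, 4, rng))
```

Removing the reflection step (drawing all $M$ strata independently) recovers the purely stratified estimator, which is the sense in which the two are compared in Chapter 4.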

CHAPTER 4
ANALYTICAL RESULTS

Due primarily to the analytical unwieldiness of the Poisson distribution function, exact expressions for the reduced variance of the estimators detailed above are difficult to obtain. Calculation of the covariance between two antithetically sampled Poisson variables, for example, becomes combinatorially complex for most values of $\lambda$. Until complete analytical solution of the problem becomes tractable, one course of action is to consider special cases of the parameter $\lambda$. We prove several results along this line of thought. First, two short results calculating the exact variance of the antithetical and stratified mean estimators for $\lambda$ below certain thresholds are given in Lemmas 1 and 2, respectively. In these cases, the estimator variances reduce to simple polynomial expressions. Here the variance reduction from naive Monte Carlo is made explicit. Furthermore, these results are later confirmed by numerical experiment for the four sample point estimate case in Figure 6.3. Proven in Section 4.2, Theorem 1 provides another result for a specific region of $\lambda$. It provides an upper bound for the variance of the antithetical Poisson mean estimator for all sufficiently large values of $\lambda$. While the bound obtained does grow without limit in $\lambda$, it nevertheless provides analytical proof that for large values, the antithetical mean estimator has much lower variance than the naive Monte Carlo estimator, which is known to have variance linear in $\lambda$ for all values of $\lambda$. Numerical results shown in Figure 6. indicate that this bound is highly conservative, but it may be possible to tighten the bound given refinement of the proof. Two global results are obtained. Theorems 2 and 3 prove that for every value of $\lambda$, the hybrid Poisson mean estimator has variance at least as small as the variance of both the stratified and antithetical mean estimators, respectively. This result shows that if given the choice between implementing antithetic or uniformly stratified variance reduction, one may simply implement an algorithm that globally enjoys the benefits of both and is not significantly more computationally expensive than either. Again, Figure 6. provides numerical support for these results.
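As a rough numerical companion to the results previewed above (and to the figures of Chapter 6), the following Python sketch compares the empirical variances of the four estimators of Chapter 3 across a few values of $\lambda$, again using SciPy's `poisson.ppf` as a stand-in for the inversion routine; the sample sizes are arbitrary.

```python
import numpy as np
from scipy.stats import poisson

def estimates(lam, n_rep, M, rng, scheme):
    """Return n_rep replicates of an M-point Poisson mean estimate."""
    if scheme == "naive":
        u = rng.uniform(size=(n_rep, M))
    elif scheme == "antithetic":
        half = rng.uniform(size=(n_rep, M // 2))
        u = np.concatenate([half, 1.0 - half], axis=1)
    elif scheme == "stratified":
        u = (np.arange(M) + rng.uniform(size=(n_rep, M))) / M
    elif scheme == "hybrid":
        low = (np.arange(M // 2) + rng.uniform(size=(n_rep, M // 2))) / M
        u = np.concatenate([low, 1.0 - low[:, ::-1]], axis=1)
    return poisson.ppf(u, lam).mean(axis=1)

rng = np.random.default_rng(3)
for lam in (0.5, 5.0, 50.0):
    print(lam, {s: round(estimates(lam, 20_000, 4, rng, s).var(), 4)
                for s in ("naive", "antithetic", "stratified", "hybrid")})
```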

4.1 Estimator Variance for Small Parameter Values

Lemma 1. Let $\delta^n_A$ be the antithetical mean estimator of a Poisson distribution, where $Y^i_1, Y^i_2 \sim \mathrm{Pois}(\lambda)$ are antithetically paired, such that
$$\delta^n_A := \frac{1}{n} \sum_{i=1}^{n} \frac{Y^i_1 + Y^i_2}{2}.$$
If $\lambda < \ln 2$, then
$$\mathrm{Var}(\delta^n_A) = \frac{\lambda - \lambda^2}{2n}. \qquad (4.1)$$

Proof. Suppose $\lambda < \ln 2$. Then $e^{-\lambda} > \frac{1}{2}$, and so $F(0) > \frac{1}{2}$. Thus $F^{-1}(u)\, F^{-1}(1 - u) = 0$ for every $u \in (0, 1)$, by the definition of $F^{-1}$. So for any $i \in \mathbb{N}$,
$$\mathrm{Cov}(Y^i_1, Y^i_2) = E[Y^i_1 Y^i_2] - E[Y^i_1]\, E[Y^i_2] = \int_0^1 F^{-1}(u)\, F^{-1}(1 - u)\, du - \lambda^2 = -\lambda^2.$$
Thus
$$\mathrm{Var}(\delta^n_A) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} Y^i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}\left(\frac{Y^i_1 + Y^i_2}{2}\right) = \frac{1}{n} \mathrm{Var}\left(\frac{Y_1 + Y_2}{2}\right)$$
$$= \frac{1}{4n}\left(\mathrm{Var}(Y_1) + \mathrm{Var}(Y_2) + 2\,\mathrm{Cov}(Y_1, Y_2)\right) = \frac{2\lambda - 2\lambda^2}{4n} = \frac{\lambda - \lambda^2}{2n}.$$

Lemma 2. Let $\delta^{4n}_{S,4}$ denote the stratified mean estimator of a Poisson distribution with four uniform strata. Let $Z^i_1, Z^i_2, Z^i_3, Z^i_4$ be samples, i.i.d. in $i$, from the probability strata $A_1 = [0, \frac{1}{4})$, $A_2 = [\frac{1}{4}, \frac{1}{2})$, $A_3 = [\frac{1}{2}, \frac{3}{4})$, $A_4 = [\frac{3}{4}, 1)$, respectively. Namely,
$$\delta^{4n}_{S,4} := \frac{1}{n} \sum_{i=1}^{n} \frac{Z^i_1 + Z^i_2 + Z^i_3 + Z^i_4}{4}.$$
If $\lambda < \ln \frac{4}{3}$, then
$$\mathrm{Var}(\delta^{4n}_{S,4}) = \frac{\lambda - 3\lambda^2}{4n}. \qquad (4.2)$$

Proof. Suppose $\lambda < \ln \frac{4}{3}$. Then $e^{-\lambda} > \frac{3}{4}$ and $P(Z^i_1 = 0) = P(Z^i_2 = 0) = P(Z^i_3 = 0) = 1$ for every $i \in \mathbb{N}$. So each of these random variables has zero mean and zero variance. The first two moments of $Z^i_4$ are
$$E[Z^i_4] = \sum_{m=0}^{\infty} m\, P(Z^i_4 = m) = 0 \cdot P(Z^i_4 = 0) + \sum_{m=1}^{\infty} 4 m\, P(X^i = m) = 4 \sum_{m=0}^{\infty} m\, P(X^i = m) = 4 E[X^i], \qquad (4.3)$$
$$E[(Z^i_4)^2] = \sum_{m=0}^{\infty} m^2\, P(Z^i_4 = m) = 0 \cdot P(Z^i_4 = 0) + \sum_{m=1}^{\infty} 4 m^2\, P(X^i = m) = 4 \sum_{m=0}^{\infty} m^2\, P(X^i = m) = 4 E[(X^i)^2]. \qquad (4.4)$$
So
$$\mathrm{Var}(\delta^{4n}_{S,4}) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} \frac{Z^i_1 + Z^i_2 + Z^i_3 + Z^i_4}{4}\right) = \frac{1}{16n}\left(\mathrm{Var}(Z_1) + \mathrm{Var}(Z_2) + \mathrm{Var}(Z_3) + \mathrm{Var}(Z_4)\right) = \frac{1}{16n} \mathrm{Var}(Z_4)$$
$$= \frac{E[Z_4^2] - E[Z_4]^2}{16n} = \frac{4 E[(X^i)^2] - 16 E[X^i]^2}{16n} = \frac{E[(X^i)^2] - 4 E[X^i]^2}{4n} = \frac{\mathrm{Var}(X^i) - 3 E[X^i]^2}{4n} = \frac{\lambda - 3\lambda^2}{4n}.$$

4.2 Proof of Large Parameter Bound for Antithetical Estimator Variance

Theorem 1. For any $\epsilon > 0$, there exist $\Lambda > 0$ and $K > 0$ such that for any $\lambda > \Lambda$,
$$\mathrm{Var}(\delta^n_A) < \frac{K \lambda^{3/4 + \epsilon}}{n}. \qquad (4.5)$$

To prove the theorem, we require the development of several lemmas in order to asymptotically relate the Poisson and normal distributions.

Lemma 3. Let $f, g : [a, b] \to [0, 1]$ be such that $f$ is nondecreasing, $g \in C^1([a, b])$, and there is a $c > 0$ such that $g'(x) \geq c$ for every $x \in [a, b]$. Then, if $|f(x) - g(x)| < \delta$

26 for every x [a, b] then f u g u < δ c 4.6 for all u U, where U := [ga, gb] [fa, fb], and f u := inf{x : fx u}. Proof. Let x := δ. Observe that the claim holds trivially if x b a, since c f u, g u [a, b] for all u U. Otherwise, choose any u U. We proceed by showing f u < g u + x f u > g u x. 4.7a 4.7b To prove 4.7a, first consider the case when u gb x. Then by the mean value theorem, there is a z g u, g u + x such that g z x = gg u + x gg u c x gg u + x u u + δ gg u + x. Applying the hypothesis and the definition of the inverse of f, u < fg u + x f u < g u + x. Now suppose u > gb x. Then g u > b x,

27 and since u U, u fb = f u b < g u + x. The proof of 4.7b is similar. Suppose u ga + x. Then by the mean value theorem, there is a z g u x, g u such that g z x = gg u gg u x c x u gg u x gg u x u δ fg u x < u g u x < f u. If u < ga + x, then g u < a + x, and since u U, u fa = f u a > g u x. Since, for u U chosen arbitrarily, 4.7a and 4.7b hold, we have for every u U. f u g u < x := δ c Let F denote the cumulative distribution function CDF of X Pois, > 0. Define X := X +, and let F x denote the CDF of X. That is, F x = F + x. 4.8 Take Φx to be the CDF of the standard unit normal distribution, namely Φx = π x e t dt. 4.9

28 By a result from Cheng [4, Theorem I], F x Φx 6 π x e x + δ 4.0 for every x R and > 0, where the magnitude of δ is bounded above by some function of, namely δ = This result is easily adapted to produce a uniform bound on the error between the Poisson and Normal CDFs that depends only on the parameter. Indeed F x Φx = 6 π x e x + δ 6 x e x + δ π π Fix any 0 < ɛ <. For any > π ɛ, set =: δ. 4. γ := δ + Φ ɛ ln ln π. 4.3 Note that γ 0 as, so there exists Λ 0 > π ɛ such that > Λ 0 = γ <. The rate of convergence of γ to 0 is discussed in Lemma 4. Lemma 4. For any 0 < ɛ <, there is a Λ > π ɛ Λ, where l ɛ := such that, for any γ < ɛ l ɛ, 4.4 ɛ ln ln π. Proof. By definition of γ and the Gaussian CDF, γ = δ + Φ l ɛ = δ + erfcl ɛ Using an asymptotic expansion of erfcx for large x, there exists K > 0 3

29 and x > 0 such that for any x x, e x erfcx x π < K e x x 3 = erfcx < e x x π + K e x x Thus for any such that l ɛ x, namely Λ where Λ := max{ πe x ɛ, π ɛ }, 4.6 erfcl ɛ < exp l ɛ l ɛ exp l ɛ + K π l ɛ 3 π ɛ π ɛ = l ɛ π + K l ɛ 3 = γ < δ + ɛ π ɛ l ɛ + K l ɛ 3 [ = ɛ l ɛ 6 l ɛ π l ɛ l ɛ ɛ ɛ 3 ɛ + 0.3l ɛ + ] π + K ɛ l ɛ. 4.7 Since the bracketed term above converges to < as, there exists Λ > Λ such that for any Λ, γ < ɛ l ɛ. Lemma 5. There exists Λ > Λ 0, such that for any Λ, i γ γ F γ u F u du Φ uφ u du γ < ɛ 3 π 4.8 4

30 ii γ γ [ F γ ] u [ du Φ u ] du < γ 3 π ɛ. 4.9 Proof. As shown in the above extension of Cheng [4, Theorem ], for any positive > 0 F x Φx < δ for any x R. For any > Λ 0, define a := Φ γ + δ = Φ γ δ. Thus Φa = γ + δ = F a > γ Φ a = γ δ = F a < γ Define c := Φ a. Note that c = Φ a by symmetry of Φ x = π e x, and that c Φ x for every x [ a, a ]. Thus by Lemma 3, F u Φ u < δ c 4.0 for every u [ F a, F a ] [Φ a, Φa ] [γ, γ]. To show i, observe that γ γ F = γ u F u du Φ uφ u du γ γ γ γ + F F F γ u F u Φ uφ u du u F u F uφ u uφ u Φ uφ u du 5

31 < γ γ F u F + Φ u F γ γ δ c F γ γ + δ c u Φ u u Φ u du u Φ u + Φ u δ c + Φ u δ c du F γ γ u Φ u du + δ c Φ u du < δ δ γ + c c Iγ δ < π c + δ c, γ γ Φ u du where Iγ := γ Φ u γ du = Φ u du 4. γ Φ u du = E[ N ] = π γ 4. 0 for all 0 < γ <, where the random variable N N 0,. Now observe that c = Φ a = Φ Φ γ δ = Φ l ɛ = π exp l ɛ = exp ɛ ln + ln π π = ɛ. 6

32 So the ratio δ c = π ɛ ɛ 3 ɛ ɛ = [ ] π ɛ converges to 0 and the bracketed term converges to < as. Thus there exists a Λ > Λ 0 such that for > Λ, the bracketed term is less than and Thus, for > Λ, δ c <. 4.4 π γ γ F u F u du Φ uφ u du γ γ δ < π c + δ c = δ c π + δ c = 3 π < 3 π ɛ ɛ [ To show ii, take Λ as above and take > Λ. Then = γ γ γ γ γ γ [ F γ ] u [ du Φ u ] du γ [ ] F u [ Φ u ] du ] π + δ c F u u Φ + Φ u F u u Φ du 7

33 γ γ F u Φ u γ du + < δ δ γ + c c Iγ δ < π c + δ c = δ c π + δ c = 3 π ɛ < 3 π. ɛ γ [ Lemma 6. For every > Λ 0, Φ u F ] π + u Φ u du δ c erfc γ l ɛ α, 4.5 where we define α := ɛ δ. 4.6 Proof. For any > Λ 0, let ρ := δ erfcl ɛ, so that γ = δ + erfcl ɛ = + ρ erfcl ɛ. 4.7 By Taylor s theorem of first order with remainder applied about ρ = 0, for any ρ > 0, there exists 0 < ρ < ρ such that erfc + ρ erfc lɛ π [ = l ɛ exp erfc + ρ erfc lɛ ] erfc l ɛ ρ π [ l ɛ exp erfc erfc l ɛ ] erfc l ɛ ρ π = l ɛ exp l ɛ erfc l ɛ ρ = l ɛ ɛ δ, 8

34 where the inequality follows from the magnitude of the second term decreasing in ρ. Indeed, erfcl ɛ 0, ρ > 0, erfc x decreasing, and expx increasing imply that [ exp erfc + ρ erfc lɛ ] is decreasing in ρ. Lemma 7. There exists a Λ 3 > Λ 0 such that for any > Λ 3, [l ɛ α] exp [l ɛ α] π ɛ l ɛ. 4.8 Proof. By Taylor s theorem of first order with remainder applied about α = 0, for every α > 0, there exists 0 < α < α such that exp [l ɛ α] = exp l ɛ + [l ɛ α] exp [l ɛ α] α exp l ɛ + [l ɛ ] exp [l ɛ α] α. Trivially, since l ɛ, and α 0 as, one can take sufficiently large to make l ɛ α > 0. Then given the above inequality, observe that [l ɛ α] exp [l ɛ α] [l ɛ α] [ exp l ɛ + l ɛ α exp [l ɛ α] ] l ɛ [ exp l ɛ + l ɛ α exp l ɛ exp l ɛ α α ] Since l ɛ α 0 and α 0 as, there exists Λ 3 > Λ 0 large enough so that for any > Λ 3, l ɛ α exp l ɛ α α <. 4.9 Thus, continuing the above inequalities, [l ɛ α] exp [l ɛ α] < l ɛ [exp l ɛ + exp l ɛ ] = πl ɛ ɛ. 9

35 Lemma 8. There exists Λ 4 > Λ 0 such that, for any > Λ 4, γ γ [ Φ u ] du < 5 l ɛ ɛ 4.30 Proof. First, take a := Φ γ = Φ γ > 0. Then, change coordinates and integrate by parts: γ [ Φ u ] a du = x π e x dx γ a = + [ xe x π ] a x= a = + π ae a π π a a a e x dx e x dx 0 π e x 0 dx + 0 π e x dx = + π ae a Φa + = + π ae a γ = γ + π ae a. Observe that a = Φ γ = erfc γ, and thus that γ [ Φ u ] du = γ + π erfc γe erfc γ. 4.3 γ Since l ɛ and α 0 as, select Λ 4 > max{λ, Λ 3} such that for any > Λ 4, l ɛ α. Recall that x exp x is a decreasing function for all x, so by Lemma 6, erfc γ l ɛ α, 30

36 and γ [ Φ u ] du γ + π l ɛ αe lɛ α. 4.3 γ Since > Λ 4, Lemmas 4 and 7 imply that γ γ [ Φ u ] du ɛ l ɛ + 4l ɛ ɛ [ ] = 4l ɛ ɛ l ɛ Since the bracketed term converges to as, there is Λ 4 > Λ 4 such that the bracketed term is less than 5, and thus 4 γ [ Φ u ] du 5lɛ ɛ γ and the proof is complete. Lemma 9. For any 0 < ɛ <, there exist constants K, K, and Λ such that for any Λ, 0 F u F u du Φ uφ u du < K ln + K ɛ ɛ

37 Proof. First, let Q := K := K C := = G := 0 F γ γ γ 0 γ 0 F γ u F u du 4.35 F F u F u du 4.36 u F u du + γ F u F u du 4.37 u F u du 4.38 Φ uφ u du 4.39 N := γ γ [ Φ u ] du 4.40 H := H C := so that = γ γ γ γ 0 γ [ F u ] du 4.4 [ ] F u du + [ F γ [ F u ] du 4.4 ] γ [ ] u du + F u du K + K C = Q 4.44 H + H C = 0 [ F u ] du = E[ X ] =,

38 and by Lemma 5, for > Λ Also, recall by antisymmetry, that K G < 3 π H N < 3 π Φ uφ u du = ɛ ɛ 4.47 [ Φ u ] du = Furthermore, by antisymmetry and Lemma 8, for > Λ 4, N = + G By symmetry, then by Hölder s inequality, 0 K C = γ 0 γ 0 F < 5 l ɛ ɛ u F u du [ F ] u du H C H C = H C = H γ 0 [ F ] u du N + H N Thus, it is easily seen by repeated use of the triangle inequality as well as the inequalities 4.46, 4.47 and 4.49 developed above, that for any 33

39 > Λ := max{λ, Λ 4} 0 F u F u du + = Q + = K G + K C + + G K G + K C + + G < K G + N + H N + N < π + 5l ɛ ɛ ɛ = + 5 π ɛ < π ɛ + 5 ɛ ɛ ln ln π ɛ ln ɛ. Recall the statement of Theorem : For any ɛ > 0, there exists Λ > 0 and K > 0 such that for any > Λ, Var δ n A We now complete the proof of Theorem. Proof. Var δ n A = 4n Var Y + Y = [ ] Var Y n + Cov Y, Y = n = n = n [ + Cov [ + Cov + 0 Y +, Y < K n 3 4 +ɛ. 4.5 ] + Y +, Y + ] F u F u du 34

40 [ = + Φ uφ u du n 0 = n F F u F u du u F u du 0 0 ] Φ uφ u du Φ uφ u du. 4.5 Now, apply Lemma 9 for ɛ = 4 + ɛ to obtain K, K, and Λ such that for > Λ : Var δ n A < n K + K 4 ɛ = K n 3 4 +ɛ + K ln ɛ ln 4 +ɛ Since ln ɛ 0 as, there exists Λ > Λ such that for all > Λ, Var δ n A < K n 3 4 +ɛ. 4.3 Global Bounds on Hybrid Estimator Variance Let the antithetical, stratified and hybrid Poisson mean estimators be defined as above for M an even number of uniform strata. Theorem. For any > 0 and even M, Var δ Mn H,M Var δ Mn S,M

41 Proof. Recall that δ Mn S,M := n n i= Zi, so that Var δ Mn S,M = Var n n Z i i= n = n Var Z i i= = n Var Z i n i= = n Var Z, 4.55 since the Z i are independent and identically distributed. Recalling the definition of Z i, Var Z = Var M = M j= Z j Var Zj, 4.56 j= since Z i j are independent in j. So we have Var δ Mn S,M = n M Var Zj j= By the same logic as above, we have Var δ Mn H,M = Var n n H i i= n = n Var H i i= = n Var H i n i= = n Var H, 4.58 since the H i are independent and identically distributed. that the H j However, recall that compose H are not altogether independent in j. For each 36

42 j, k such that j + k = n +, Cov H j, H k 0. For any other j k, Cov H j, H k = 0. Thus and Var H = Var Hj M j= [ = M Var M Z M j + Cov Hk, HM+ k ], 4.59 j= Var δ Mn H,M = n Thus, the difference M k= [ M Var M Zj + Cov Hk, HM+ k ] j= k= Var δ Mn H,M Var δ Mn S,M = n M Cov Hk, HM+ k, 4.6 k= and all that remains to show is that the right hand side of this equation is negative. This is trivially demonstrated, using the definition of Hj i and a result proven in Schmidt [6, Theorem ]: since fx := F So Cov H k, H M+ k = Cov F = Cov F v k, F v k v k, F v k = Cov f vk, g v k 0, x, gx := F x are non-decreasing functions. Var δ Mn H,M Var δ Mn S,M 0, and the proof that the hybrid algorithm produces estimator variance at least as small as the stratified algorithm is complete. It remains to prove that the hybrid algorithm produces estimator variance at least as small as the stratified algorithm. In order to prove such a theorem, we shall introduce some new notation and prove several lemmas first. Since the proof will involve some operations over indices taking values 37

43 in {,..., M}, we begin by defining several important subsets of pairs of indices and prove a short result. Let M := {,..., M} and M := M M. Partition M as follows. Define A := {i, j M : i + j = M + } 4.6 B := {i, j M : i = j} 4.63 C := M \ A B, 4.64 and note that A = B = M while C = M M. It is trivial to see that for each of these sets, i, j is an element if and only if j, i is an element. Another symmetry within the set C is proven in the following lemma. Lemma 0. For any element in C, it s reflection in a coordinate about M+ is also an element of C, that is, i, j C i, M + j C 4.65 Proof. Select any element i, j C i + j M + and i j i M + j and i + M + j M + i, M + j C. Lemma. For any set of M scalars µ i, i M, µ i µ j + M i,j C i,j A i,j C µ i µ j = µ i µ j µ M+ i µ M+ j Proof. Define C i := {j : i, j C}. Observe that C i = M, since for any i M, i, i / C and i, M + i / C, but for every other j M, 38

44 i, j C. Now, observe that i,j C µ i µ j µ M+ i µ M+ j = i,j C = i,j C µ i µ M+ i i,j C µ i µ M+ i i,j C µ i µ M+ j µ i µ j 4.67 = µ i µ M+ i µ i µ j i M j C i i,j C = M µ i µ M+ i µ i µ j i M i,j C = M µ i µ j µ i µ j, 4.68 i,j A i,j C where 4.67 comes from Lemma 0 and 4.68 comes from the definition of A. Lemma. For any set of M scalars µ j, j M, M M µ j µ j + M j= j= µ j µ M+ j j= Proof. First, observe that M M µ j µ j = µ i µ j j= i,j M j= Indeed, µ i µ j = µ i µ i µ j + µ j i,j M i,j M = µ j µ i µ j i,j M i,j M = M µ j µ j. j M j M 39

45 Now applying 4.70 to the left hand side of 4.69, we see that M M µ j µ j + M µ j µ M+ j j= j= j= = M µ i µ j µ j + M i,j M j= µ j µ M+ j. j= By expanding the square and rewriting in the notation of our partition, we see that M M µ j µ j + M µ j µ M+ j j= j= j= = µ i µ j µ j µ i µ j + M µ i µ j i,j M j M i,j M \B i,j A = µ i µ j µ i + µ j µ i µ j i,j M i,j A i,j M \B + M µ i µ j i,j A = µ i µ j i,j M + M µ i µ j = = i,j M \A i,j C i,j A µ i µ j µ i µ j i,j A i,j M \B µ i µ j µ i µ j + M µ i µ j µ i µ j i,j M \B µ i µ j + M i,j A i,j M \B i,j A µ i µ j, where the last equality follows from the fact that µ i µ j = 0 for every i, j B and C = M \ A \ B. Continuing this string of equations, we see 40

46 that M M µ j µ j + M µ j µ M+ j j= = = = 4 i,j C i,j C i,j C j= j= µ i µ j µ i µ j µ i µ j + M µ i µ j i,j C i,j A µ i µ j µ i µ j + M µ i µ j i,j C i,j A [ µi µ j + µ M+ i µ M+ j ] µ i µ j + M µ i µ j, i,j C i,j A i,j A where the last equality comes from another enumeration of C using Lemma 0 and closure of C under index swapping. Finally, we complete the proof of Lemma. By continuing from above and then applying Lemma, observe that M M µ j µ j + M µ j µ M+ j j= j= j= = [ µi µ j + µ M+ i µ M+ j ] 4 i,j C µ i µ j + M µ i µ j i,j C i,j A = [ µi µ j + µ M+ i µ M+ j ] 4 = 4 i,j C + i,j C i,j C µ i µ j µ M+ i µ M+ j µi µ j + µ M+ i µ M+ j 0. Now, for each stratum A j, j M, define µ j := E [ Z i j ] [ ] = E H i j = M F Aj u du, 4.7 4

47 so that E[X i ] = M M j= µ j. Furthermore, define σj := Var Zj i = Var H i j 4.7 [ ] = E Zj i E [ ] Zj i [ = M F u] du µ j, 4.73 A j since, for each i, j, H i j has the same marginal distribution as Z i j. Theorem 3. For any > 0 and even M, Proof. First, recall that Var Var δ Mn H,M δ Mn H,M Var = n Var H δ Mn A = n Var M j= = nm Var M j= H j H j, and Var δ Mn A = Var δ = Var Mn Mn A = n Var M = nm Var Mn i= M i= M i= Y i Y i + Y i + Y i. Y i So we may proceed by proving that M Var Var M Hj Y i j= i= 4 + Y i. 4.75

48 Now, applying the above notation, observe that Var M j= H j Next, note that Var M i= Y i = = = Var M Hj + Cov Hj, HM+ j j= σj + j= σj + j= + Y i = = = i= j= j= E [ ] [ ] [ ] Hj HM+ j E H j E H M+ j E [ ] M Hj HM+ j µ j µ M+ j j= M Var Y i + Y i i= j= M Var Y i + Var Y i + i= Var Y i = M Var Y + M i= M i= Cov Y i, Y i + M E [ Y Y = M Var [ ] Y + ME Y Y M Cov Y i, Y i ] [ ] [ E Y E Y M j= ] µ j. In order to compare this expression directly to a similar expression for the hybrid estimator, observe that Var Y = E [Y = 0 = M = M ] E [ ] Y [ F u] du M µ j j= [ M F u] du A j j= j= σ j + µ j M M M µ j, j= M µ j j= 43

49 where the last equality comes from Also, observe that E [ ] Y Y = 0 = M = M F uf u du M j= F A j uf u du E [ Hj HM+ j]. j= Applying these two identities, we see that Var M i= + Y i = Y i σj + j= + j= µ j M µ j M j= j= E [ ] Hj HM+ j M M µ j, 4.77 and thus, by subtracting 4.76 from 4.77 and multiplying by M, that M Var = M 0, M i= M + Y i Var Hj Y i j= M µ j µ j + M j= j= by Lemma. So, in conclusion, we see that Var δ Mn A Var δ Mn H,M and the proof is complete. = Var nm 0, M i= Y i j= µ j µ M+ j j= M + Y i Var Hj j= 44

CHAPTER 5
PATHWISE MEAN ESTIMATORS

In this chapter we introduce the two stochastic processes under consideration in the pursuit of achieving stochastic pathwise variance reduction. Both systems are approximated using a tau-leaping algorithm. Then, the algorithms in Chapter 3 are adapted as necessary to the Poisson sampling steps of each simulation. In both the particle emissions and radioactive decay models, such adaptations are necessary to generate valid sample paths affordably.

5.1 Particle Emissions

The particle emissions problem can be modeled very generally as a continuous-time, time-inhomogeneous stochastic arrivals process. The dynamics of the process are governed by a rate profile $\lambda(t)$. We could use exact simulation techniques to sample from this stochastic distribution, as Gillespie [7] develops for another particle model. However, exact sampling can become computationally expensive as the arrival rate of particles increases. An effective numerical technique to approximate this stochastic distribution is the tau-leaping method of Gillespie [8]. Via this technique, time is discretized by a uniform increment $\Delta t$ and each timestep is resolved by simulating some number of events drawn from a $\mathrm{Pois}(\lambda(t)\, \Delta t)$ distribution. This method is particularly suited to processes whose state-space transitions are in some sense uniform and easily resolved in multiples, as is the case in the integer-valued arrival process. Applying this particular numerical method allows for a simple stochastic linear system description for the approximate emissions model, henceforward referred to as the emissions model:

$$X_t \in \mathbb{Z}_+, \qquad X_0 = 0, \qquad X_{t+1} = X_t + P_t, \qquad (5.1)$$
$$P_t \sim \mathrm{Pois}(\lambda(t)\, \Delta t). \qquad (5.2)$$
As with many stochastically sampled models, here accurate, low-cost estimates of the expected behavior of the system are sought. Naively, we could obtain such an estimate by drawing independent samples from the system and taking the desired estimator to be the sample mean. However, this method can become undesirably expensive, especially when very accurate estimates are needed. Instead, we will apply variance reduction techniques to the sampling of the estimator in order to reduce the ensemble size needed to achieve a desired threshold of accuracy. We now return to the emissions model, and the problem of reducing the variance of estimates of its mean behavior. Informed by the techniques applied to the single-step Poisson estimator, we seek algorithms that draw valid sample paths from the model and yet have increased precision via variance reduction. Define the naive mean path estimator $D^N_t$ by
$$D^N_t := \frac{1}{N} \sum_{i=1}^{N} X^i_t, \qquad (5.3)$$
where the $X^i_t$ are i.i.d. sample paths drawn from the emissions model. Note that $D^N_t \to E[X]_t := E[X_t]$ as $N \to \infty$ by the law of large numbers. To generate antithetical sample path pairs $(X^{A}_{1,t}, X^{A}_{2,t})$, we can simply substitute an antithetic pair into the emissions model:
$$X^{A}_{1,0} = 0, \qquad X^{A}_{1,t+1} = X^{A}_{1,t} + Y_{1,t}, \qquad (5.4)$$
$$X^{A}_{2,0} = 0, \qquad X^{A}_{2,t+1} = X^{A}_{2,t} + Y_{2,t}, \qquad (5.5)$$
where $Y_{1,t}, Y_{2,t} \sim \mathrm{Pois}(\lambda(t)\, \Delta t)$ are antithetically paired as in Chapter 3.
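A minimal Python sketch of this tau-leaping update with antithetically coupled paths is given below; the rate profile, step size, and ensemble size are placeholder choices, and SciPy's `poisson.ppf` stands in for the inversion scheme of Chapter 3.

```python
import numpy as np
from scipy.stats import poisson

def antithetic_emission_paths(rate, dt, n_steps, rng):
    """Tau-leap the emissions model twice, driving the two paths with the
    antithetic uniforms u and 1 - u at every step, as in (5.4)-(5.5)."""
    x1, x2 = np.zeros(n_steps + 1), np.zeros(n_steps + 1)
    for t in range(n_steps):
        lam = rate(t * dt) * dt
        u = rng.uniform()
        x1[t + 1] = x1[t] + poisson.ppf(u, lam)
        x2[t + 1] = x2[t] + poisson.ppf(1.0 - u, lam)
    return x1, x2

rate = lambda t: 5.0 + 4.0 * np.sin(t)        # placeholder rate profile lambda(t)
rng = np.random.default_rng(4)
pairs = [antithetic_emission_paths(rate, dt=0.1, n_steps=100, rng=rng) for _ in range(200)]
mean_path = np.mean([(a + b) / 2 for a, b in pairs], axis=0)   # cf. (5.6)
```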

Figure 5.1: A sample antithetic path pair, compared to the expected path $E[X]_t$ (particles emitted $X_t$ versus time $t$).

An illustration is shown in Figure 5.1. We can then define the antithetical mean path estimator $D^N_{A,t}$ by
$$D^N_{A,t} := \frac{1}{N} \sum_{i=1}^{N} \frac{X^{A,i}_{1,t} + X^{A,i}_{2,t}}{2}, \qquad (5.6)$$
where the pairs $(X^{A,i}_{1,t}, X^{A,i}_{2,t})$ are drawn i.i.d. as outlined above. The sampling of paths utilizing stratified sampling is somewhat less trivial. Each of the samples $Z_j$ used to make the mean estimate is drawn only from its respective stratum, and thus $Z_j \not\sim \mathrm{Pois}(\lambda(t)\, \Delta t)$. Therefore the stratified samples, unlike the antithetic samples, cannot be simply input into the linear emissions model to produce valid sample paths. Our solution to this problem is to construct four sample paths at a time via
$$\begin{bmatrix} X^{S_1}_{t+1} \\ X^{S_2}_{t+1} \\ X^{S_3}_{t+1} \\ X^{S_4}_{t+1} \end{bmatrix} = \begin{bmatrix} X^{S_1}_{t} \\ X^{S_2}_{t} \\ X^{S_3}_{t} \\ X^{S_4}_{t} \end{bmatrix} + \Pi_t \begin{bmatrix} Z_{1,t} \\ Z_{2,t} \\ Z_{3,t} \\ Z_{4,t} \end{bmatrix}, \qquad (5.7)$$
where $\Pi_t$ is a random $4 \times 4$ permutation matrix, the $Z_{j,t}$ are stratified samples of $\mathrm{Pois}(\lambda(t)\, \Delta t)$, and the model is subject to the initial condition $X^{S}_0 =$
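A Python sketch of the permuted stratified update (5.7) follows, taking the initial condition to be zero as in (5.1); the rate profile and the use of SciPy's `poisson.ppf` for stratum-wise inversion are placeholder choices rather than the thesis's exact implementation.

```python
import numpy as np
from scipy.stats import poisson

def stratified_emission_paths(rate, dt, n_steps, rng, M=4):
    """Advance M coupled emission paths; at each step the M stratified draws
    are randomly permuted across the paths, as in (5.7)."""
    x = np.zeros((n_steps + 1, M))
    strata = np.arange(M)
    for t in range(n_steps):
        lam = rate(t * dt) * dt
        u = (strata + rng.uniform(size=M)) / M     # one Unif(A_j) draw per stratum
        z = poisson.ppf(u, lam)                    # stratified increments Z_{j,t}
        x[t + 1] = x[t] + rng.permutation(z)       # random permutation Pi_t
    return x

rate = lambda t: 5.0 + 4.0 * np.sin(t)             # placeholder lambda(t)
rng = np.random.default_rng(5)
ensemble = np.stack([stratified_emission_paths(rate, 0.1, 100, rng) for _ in range(200)])
mean_path = ensemble.mean(axis=(0, 2))             # average over replicates and the M paths
```

The random permutation ensures that each individual path receives increments with the correct $\mathrm{Pois}(\lambda(t)\, \Delta t)$ marginal distribution, while the four paths remain coupled within each step.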

Exact Simulation of Continuous Time Markov Jump Processes with Anticorrelated Variance Reduced Monte Carlo Estimation

Exact Simulation of Continuous Time Markov Jump Processes with Anticorrelated Variance Reduced Monte Carlo Estimation 53rd I onference on Decision and ontrol December 5-7,. Los Angeles, alifornia, USA xact Simulation of ontinuous Time Markov Jump Processes with Anticorrelated Variance Reduced Monte arlo stimation Peter

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions

More information

ELEMENTS OF PROBABILITY THEORY

ELEMENTS OF PROBABILITY THEORY ELEMENTS OF PROBABILITY THEORY Elements of Probability Theory A collection of subsets of a set Ω is called a σ algebra if it contains Ω and is closed under the operations of taking complements and countable

More information

Lecture 1: August 28

Lecture 1: August 28 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 1: August 28 Our broad goal for the first few lectures is to try to understand the behaviour of sums of independent random

More information

Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan

Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan Monte-Carlo MMD-MA, Université Paris-Dauphine Xiaolu Tan tan@ceremade.dauphine.fr Septembre 2015 Contents 1 Introduction 1 1.1 The principle.................................. 1 1.2 The error analysis

More information

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample

More information

Probability and Distributions

Probability and Distributions Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated

More information

Structural Reliability

Structural Reliability Structural Reliability Thuong Van DANG May 28, 2018 1 / 41 2 / 41 Introduction to Structural Reliability Concept of Limit State and Reliability Review of Probability Theory First Order Second Moment Method

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

2 Random Variable Generation

2 Random Variable Generation 2 Random Variable Generation Most Monte Carlo computations require, as a starting point, a sequence of i.i.d. random variables with given marginal distribution. We describe here some of the basic methods

More information

Northwestern University Department of Electrical Engineering and Computer Science

Northwestern University Department of Electrical Engineering and Computer Science Northwestern University Department of Electrical Engineering and Computer Science EECS 454: Modeling and Analysis of Communication Networks Spring 2008 Probability Review As discussed in Lecture 1, probability

More information

. Find E(V ) and var(v ).

. Find E(V ) and var(v ). Math 6382/6383: Probability Models and Mathematical Statistics Sample Preliminary Exam Questions 1. A person tosses a fair coin until she obtains 2 heads in a row. She then tosses a fair die the same number

More information

Uncertainty Quantification in Computational Science

Uncertainty Quantification in Computational Science DTU 2010 - Lecture I Uncertainty Quantification in Computational Science Jan S Hesthaven Brown University Jan.Hesthaven@Brown.edu Objective of lectures The main objective of these lectures are To offer

More information

Deterministic. Deterministic data are those can be described by an explicit mathematical relationship

Deterministic. Deterministic data are those can be described by an explicit mathematical relationship Random data Deterministic Deterministic data are those can be described by an explicit mathematical relationship Deterministic x(t) =X cos r! k m t Non deterministic There is no way to predict an exact

More information

Review of Probability Theory

Review of Probability Theory Review of Probability Theory Arian Maleki and Tom Do Stanford University Probability theory is the study of uncertainty Through this class, we will be relying on concepts from probability theory for deriving

More information

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows. Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage

More information

Lecture 6 Basic Probability

Lecture 6 Basic Probability Lecture 6: Basic Probability 1 of 17 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 6 Basic Probability Probability spaces A mathematical setup behind a probabilistic

More information

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01: Probability and Statistics for Engineers Spring 2013 Contents 1 Joint Probability Distributions 2 1.1 Two Discrete

More information

Continuous Random Variables

Continuous Random Variables 1 / 24 Continuous Random Variables Saravanan Vijayakumaran sarva@ee.iitb.ac.in Department of Electrical Engineering Indian Institute of Technology Bombay February 27, 2013 2 / 24 Continuous Random Variables

More information

Notes 6 : First and second moment methods

Notes 6 : First and second moment methods Notes 6 : First and second moment methods Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Roc, Sections 2.1-2.3]. Recall: THM 6.1 (Markov s inequality) Let X be a non-negative

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 8. For any two events E and F, P (E) = P (E F ) + P (E F c ). Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 Sample space. A sample space consists of a underlying

More information

6 The normal distribution, the central limit theorem and random samples

6 The normal distribution, the central limit theorem and random samples 6 The normal distribution, the central limit theorem and random samples 6.1 The normal distribution We mentioned the normal (or Gaussian) distribution in Chapter 4. It has density f X (x) = 1 σ 1 2π e

More information

Quick Tour of Basic Probability Theory and Linear Algebra

Quick Tour of Basic Probability Theory and Linear Algebra Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions

More information

Lecture 11. Probability Theory: an Overveiw

Lecture 11. Probability Theory: an Overveiw Math 408 - Mathematical Statistics Lecture 11. Probability Theory: an Overveiw February 11, 2013 Konstantin Zuev (USC) Math 408, Lecture 11 February 11, 2013 1 / 24 The starting point in developing the

More information

1 Presessional Probability

1 Presessional Probability 1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional

More information

Lecture 22: Variance and Covariance

Lecture 22: Variance and Covariance EE5110 : Probability Foundations for Electrical Engineers July-November 2015 Lecture 22: Variance and Covariance Lecturer: Dr. Krishna Jagannathan Scribes: R.Ravi Kiran In this lecture we will introduce

More information

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities PCMI 207 - Introduction to Random Matrix Theory Handout #2 06.27.207 REVIEW OF PROBABILITY THEORY Chapter - Events and Their Probabilities.. Events as Sets Definition (σ-field). A collection F of subsets

More information

JUSTIN HARTMANN. F n Σ.

JUSTIN HARTMANN. F n Σ. BROWNIAN MOTION JUSTIN HARTMANN Abstract. This paper begins to explore a rigorous introduction to probability theory using ideas from algebra, measure theory, and other areas. We start with a basic explanation

More information

Formulas for probability theory and linear models SF2941

Formulas for probability theory and linear models SF2941 Formulas for probability theory and linear models SF2941 These pages + Appendix 2 of Gut) are permitted as assistance at the exam. 11 maj 2008 Selected formulae of probability Bivariate probability Transforms

More information

University of Regina. Lecture Notes. Michael Kozdron

University of Regina. Lecture Notes. Michael Kozdron University of Regina Statistics 252 Mathematical Statistics Lecture Notes Winter 2005 Michael Kozdron kozdron@math.uregina.ca www.math.uregina.ca/ kozdron Contents 1 The Basic Idea of Statistics: Estimating

More information

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed

More information

Random Variables and Their Distributions

Random Variables and Their Distributions Chapter 3 Random Variables and Their Distributions A random variable (r.v.) is a function that assigns one and only one numerical value to each simple event in an experiment. We will denote r.vs by capital

More information

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed

More information

CSCI-6971 Lecture Notes: Monte Carlo integration

CSCI-6971 Lecture Notes: Monte Carlo integration CSCI-6971 Lecture otes: Monte Carlo integration Kristopher R. Beevers Department of Computer Science Rensselaer Polytechnic Institute beevek@cs.rpi.edu February 21, 2006 1 Overview Consider the following

More information

STAT2201. Analysis of Engineering & Scientific Data. Unit 3

STAT2201. Analysis of Engineering & Scientific Data. Unit 3 STAT2201 Analysis of Engineering & Scientific Data Unit 3 Slava Vaisman The University of Queensland School of Mathematics and Physics What we learned in Unit 2 (1) We defined a sample space of a random

More information

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539 Brownian motion Samy Tindel Purdue University Probability Theory 2 - MA 539 Mostly taken from Brownian Motion and Stochastic Calculus by I. Karatzas and S. Shreve Samy T. Brownian motion Probability Theory

More information

LIST OF FORMULAS FOR STK1100 AND STK1110

LIST OF FORMULAS FOR STK1100 AND STK1110 LIST OF FORMULAS FOR STK1100 AND STK1110 (Version of 11. November 2015) 1. Probability Let A, B, A 1, A 2,..., B 1, B 2,... be events, that is, subsets of a sample space Ω. a) Axioms: A probability function

More information

Hochdimensionale Integration

Hochdimensionale Integration Oliver Ernst Institut für Numerische Mathematik und Optimierung Hochdimensionale Integration 14-tägige Vorlesung im Wintersemester 2010/11 im Rahmen des Moduls Ausgewählte Kapitel der Numerik Contents

More information

3 Integration and Expectation

3 Integration and Expectation 3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ

More information

Numerical Methods I Monte Carlo Methods

Numerical Methods I Monte Carlo Methods Numerical Methods I Monte Carlo Methods Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course G63.2010.001 / G22.2420-001, Fall 2010 Dec. 9th, 2010 A. Donev (Courant Institute) Lecture

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

Multivariate Distributions

Multivariate Distributions IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 18-27 Review Scott Sheffield MIT Outline Outline It s the coins, stupid Much of what we have done in this course can be motivated by the i.i.d. sequence X i where each X i is

More information

SDS 321: Introduction to Probability and Statistics

SDS 321: Introduction to Probability and Statistics SDS 321: Introduction to Probability and Statistics Lecture 14: Continuous random variables Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin www.cs.cmu.edu/

More information

Introduction to Probability

Introduction to Probability LECTURE NOTES Course 6.041-6.431 M.I.T. FALL 2000 Introduction to Probability Dimitri P. Bertsekas and John N. Tsitsiklis Professors of Electrical Engineering and Computer Science Massachusetts Institute

More information

Probability reminders

Probability reminders CS246 Winter 204 Mining Massive Data Sets Probability reminders Sammy El Ghazzal selghazz@stanfordedu Disclaimer These notes may contain typos, mistakes or confusing points Please contact the author so

More information

1: PROBABILITY REVIEW

1: PROBABILITY REVIEW 1: PROBABILITY REVIEW Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 1: Probability Review 1 / 56 Outline We will review the following

More information

Primer on statistics:

Primer on statistics: Primer on statistics: MLE, Confidence Intervals, and Hypothesis Testing ryan.reece@gmail.com http://rreece.github.io/ Insight Data Science - AI Fellows Workshop Feb 16, 018 Outline 1. Maximum likelihood

More information

THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION

THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION JAINUL VAGHASIA Contents. Introduction. Notations 3. Background in Probability Theory 3.. Expectation and Variance 3.. Convergence

More information

Review: mostly probability and some statistics

Review: mostly probability and some statistics Review: mostly probability and some statistics C2 1 Content robability (should know already) Axioms and properties Conditional probability and independence Law of Total probability and Bayes theorem Random

More information

Lecture 2: Review of Basic Probability Theory

Lecture 2: Review of Basic Probability Theory ECE 830 Fall 2010 Statistical Signal Processing instructor: R. Nowak, scribe: R. Nowak Lecture 2: Review of Basic Probability Theory Probabilistic models will be used throughout the course to represent

More information

Chapter 5. Random Variables (Continuous Case) 5.1 Basic definitions

Chapter 5. Random Variables (Continuous Case) 5.1 Basic definitions Chapter 5 andom Variables (Continuous Case) So far, we have purposely limited our consideration to random variables whose ranges are countable, or discrete. The reason for that is that distributions on

More information

6.1 Moment Generating and Characteristic Functions

6.1 Moment Generating and Characteristic Functions Chapter 6 Limit Theorems The power statistics can mostly be seen when there is a large collection of data points and we are interested in understanding the macro state of the system, e.g., the average,

More information

where r n = dn+1 x(t)

where r n = dn+1 x(t) Random Variables Overview Probability Random variables Transforms of pdfs Moments and cumulants Useful distributions Random vectors Linear transformations of random vectors The multivariate normal distribution

More information

It can be shown that if X 1 ;X 2 ;:::;X n are independent r.v. s with

It can be shown that if X 1 ;X 2 ;:::;X n are independent r.v. s with Example: Alternative calculation of mean and variance of binomial distribution A r.v. X has the Bernoulli distribution if it takes the values 1 ( success ) or 0 ( failure ) with probabilities p and (1

More information

ADDITIONAL MATHEMATICS

ADDITIONAL MATHEMATICS ADDITIONAL MATHEMATICS GCE Ordinary Level (Syllabus 4018) CONTENTS Page NOTES 1 GCE ORDINARY LEVEL ADDITIONAL MATHEMATICS 4018 2 MATHEMATICAL NOTATION 7 4018 ADDITIONAL MATHEMATICS O LEVEL (2009) NOTES

More information

CS145: Probability & Computing

CS145: Probability & Computing CS45: Probability & Computing Lecture 5: Concentration Inequalities, Law of Large Numbers, Central Limit Theorem Instructor: Eli Upfal Brown University Computer Science Figure credits: Bertsekas & Tsitsiklis,

More information

Appendix A : Introduction to Probability and stochastic processes

Appendix A : Introduction to Probability and stochastic processes A-1 Mathematical methods in communication July 5th, 2009 Appendix A : Introduction to Probability and stochastic processes Lecturer: Haim Permuter Scribe: Shai Shapira and Uri Livnat The probability of

More information

3 Continuous Random Variables

3 Continuous Random Variables Jinguo Lian Math437 Notes January 15, 016 3 Continuous Random Variables Remember that discrete random variables can take only a countable number of possible values. On the other hand, a continuous random

More information

2. Variance and Covariance: We will now derive some classic properties of variance and covariance. Assume real-valued random variables X and Y.

2. Variance and Covariance: We will now derive some classic properties of variance and covariance. Assume real-valued random variables X and Y. CS450 Final Review Problems Fall 08 Solutions or worked answers provided Problems -6 are based on the midterm review Identical problems are marked recap] Please consult previous recitations and textbook

More information

Preliminary statistics

Preliminary statistics 1 Preliminary statistics The solution of a geophysical inverse problem can be obtained by a combination of information from observed data, the theoretical relation between data and earth parameters (models),

More information

The Multivariate Gaussian Distribution [DRAFT]

The Multivariate Gaussian Distribution [DRAFT] The Multivariate Gaussian Distribution DRAFT David S. Rosenberg Abstract This is a collection of a few key and standard results about multivariate Gaussian distributions. I have not included many proofs,

More information

Chapter 2. Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables

Chapter 2. Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables Chapter 2 Some Basic Probability Concepts 2.1 Experiments, Outcomes and Random Variables A random variable is a variable whose value is unknown until it is observed. The value of a random variable results

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

Negative Association, Ordering and Convergence of Resampling Methods

Negative Association, Ordering and Convergence of Resampling Methods Negative Association, Ordering and Convergence of Resampling Methods Nicolas Chopin ENSAE, Paristech (Joint work with Mathieu Gerber and Nick Whiteley, University of Bristol) Resampling schemes: Informal

More information

THE N-VALUE GAME OVER Z AND R

THE N-VALUE GAME OVER Z AND R THE N-VALUE GAME OVER Z AND R YIDA GAO, MATT REDMOND, ZACH STEWARD Abstract. The n-value game is an easily described mathematical diversion with deep underpinnings in dynamical systems analysis. We examine

More information

ACE 562 Fall Lecture 2: Probability, Random Variables and Distributions. by Professor Scott H. Irwin

ACE 562 Fall Lecture 2: Probability, Random Variables and Distributions. by Professor Scott H. Irwin ACE 562 Fall 2005 Lecture 2: Probability, Random Variables and Distributions Required Readings: by Professor Scott H. Irwin Griffiths, Hill and Judge. Some Basic Ideas: Statistical Concepts for Economists,

More information

Lecture 2: Review of Probability

Lecture 2: Review of Probability Lecture 2: Review of Probability Zheng Tian Contents 1 Random Variables and Probability Distributions 2 1.1 Defining probabilities and random variables..................... 2 1.2 Probability distributions................................

More information

SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416)

SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) D. ARAPURA This is a summary of the essential material covered so far. The final will be cumulative. I ve also included some review problems

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

Lecture 25: Review. Statistics 104. April 23, Colin Rundel

Lecture 25: Review. Statistics 104. April 23, Colin Rundel Lecture 25: Review Statistics 104 Colin Rundel April 23, 2012 Joint CDF F (x, y) = P [X x, Y y] = P [(X, Y ) lies south-west of the point (x, y)] Y (x,y) X Statistics 104 (Colin Rundel) Lecture 25 April

More information

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Theorems Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Why study probability? Set theory. ECE 6010 Lecture 1 Introduction; Review of Random Variables

Why study probability? Set theory. ECE 6010 Lecture 1 Introduction; Review of Random Variables ECE 6010 Lecture 1 Introduction; Review of Random Variables Readings from G&S: Chapter 1. Section 2.1, Section 2.3, Section 2.4, Section 3.1, Section 3.2, Section 3.5, Section 4.1, Section 4.2, Section

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 MA 575 Linear Models: Cedric E Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 1 Revision: Probability Theory 11 Random Variables A real-valued random variable is

More information

MATH Solutions to Probability Exercises

MATH Solutions to Probability Exercises MATH 5 9 MATH 5 9 Problem. Suppose we flip a fair coin once and observe either T for tails or H for heads. Let X denote the random variable that equals when we observe tails and equals when we observe

More information

Metric Spaces and Topology

Metric Spaces and Topology Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies

More information

Introduction to Proofs in Analysis. updated December 5, By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION

Introduction to Proofs in Analysis. updated December 5, By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION Introduction to Proofs in Analysis updated December 5, 2016 By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION Purpose. These notes intend to introduce four main notions from

More information

2. Suppose (X, Y ) is a pair of random variables uniformly distributed over the triangle with vertices (0, 0), (2, 0), (2, 1).

2. Suppose (X, Y ) is a pair of random variables uniformly distributed over the triangle with vertices (0, 0), (2, 0), (2, 1). Name M362K Final Exam Instructions: Show all of your work. You do not have to simplify your answers. No calculators allowed. There is a table of formulae on the last page. 1. Suppose X 1,..., X 1 are independent

More information

Expectation. DS GA 1002 Probability and Statistics for Data Science. Carlos Fernandez-Granda

Expectation. DS GA 1002 Probability and Statistics for Data Science.   Carlos Fernandez-Granda Expectation DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean,

More information

Spectral representations and ergodic theorems for stationary stochastic processes

Spectral representations and ergodic theorems for stationary stochastic processes AMS 263 Stochastic Processes (Fall 2005) Instructor: Athanasios Kottas Spectral representations and ergodic theorems for stationary stochastic processes Stationary stochastic processes Theory and methods

More information

Lecture 4: September Reminder: convergence of sequences

Lecture 4: September Reminder: convergence of sequences 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 4: September 6 In this lecture we discuss the convergence of random variables. At a high-level, our first few lectures focused

More information

Chapter 2. Discrete Distributions

Chapter 2. Discrete Distributions Chapter. Discrete Distributions Objectives ˆ Basic Concepts & Epectations ˆ Binomial, Poisson, Geometric, Negative Binomial, and Hypergeometric Distributions ˆ Introduction to the Maimum Likelihood Estimation

More information

Scientific Computing: Monte Carlo

Scientific Computing: Monte Carlo Scientific Computing: Monte Carlo Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course MATH-GA.2043 or CSCI-GA.2112, Spring 2012 April 5th and 12th, 2012 A. Donev (Courant Institute)

More information

Continuous Random Variables and Continuous Distributions

Continuous Random Variables and Continuous Distributions Continuous Random Variables and Continuous Distributions Continuous Random Variables and Continuous Distributions Expectation & Variance of Continuous Random Variables ( 5.2) The Uniform Random Variable

More information

THEODORE VORONOV DIFFERENTIABLE MANIFOLDS. Fall Last updated: November 26, (Under construction.)

THEODORE VORONOV DIFFERENTIABLE MANIFOLDS. Fall Last updated: November 26, (Under construction.) 4 Vector fields Last updated: November 26, 2009. (Under construction.) 4.1 Tangent vectors as derivations After we have introduced topological notions, we can come back to analysis on manifolds. Let M

More information

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( ) Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca October 22nd, 2014 E. Tanré (INRIA - Team Tosca) Mathematical

More information

Chapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University

Chapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University Chapter 3, 4 Random Variables ENCS6161 - Probability and Stochastic Processes Concordia University ENCS6161 p.1/47 The Notion of a Random Variable A random variable X is a function that assigns a real

More information

Probability, Random Processes and Inference

Probability, Random Processes and Inference INSTITUTO POLITÉCNICO NACIONAL CENTRO DE INVESTIGACION EN COMPUTACION Laboratorio de Ciberseguridad Probability, Random Processes and Inference Dr. Ponciano Jorge Escamilla Ambrosio pescamilla@cic.ipn.mx

More information

Outline. Scientific Computing: An Introductory Survey. Nonlinear Equations. Nonlinear Equations. Examples: Nonlinear Equations

Outline. Scientific Computing: An Introductory Survey. Nonlinear Equations. Nonlinear Equations. Examples: Nonlinear Equations Methods for Systems of Methods for Systems of Outline Scientific Computing: An Introductory Survey Chapter 5 1 Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign

More information

Lecture 3 - Expectation, inequalities and laws of large numbers

Lecture 3 - Expectation, inequalities and laws of large numbers Lecture 3 - Expectation, inequalities and laws of large numbers Jan Bouda FI MU April 19, 2009 Jan Bouda (FI MU) Lecture 3 - Expectation, inequalities and laws of large numbersapril 19, 2009 1 / 67 Part

More information

ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process

ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process Department of Electrical Engineering University of Arkansas ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process Dr. Jingxian Wu wuj@uark.edu OUTLINE 2 Definition of stochastic process (random

More information

Probability Review. Yutian Li. January 18, Stanford University. Yutian Li (Stanford University) Probability Review January 18, / 27

Probability Review. Yutian Li. January 18, Stanford University. Yutian Li (Stanford University) Probability Review January 18, / 27 Probability Review Yutian Li Stanford University January 18, 2018 Yutian Li (Stanford University) Probability Review January 18, 2018 1 / 27 Outline 1 Elements of probability 2 Random variables 3 Multiple

More information

Multivariate Distribution Models

Multivariate Distribution Models Multivariate Distribution Models Model Description While the probability distribution for an individual random variable is called marginal, the probability distribution for multiple random variables is

More information

2.1 Elementary probability; random sampling

2.1 Elementary probability; random sampling Chapter 2 Probability Theory Chapter 2 outlines the probability theory necessary to understand this text. It is meant as a refresher for students who need review and as a reference for concepts and theorems

More information

Regression Analysis. Ordinary Least Squares. The Linear Model

Regression Analysis. Ordinary Least Squares. The Linear Model Regression Analysis Linear regression is one of the most widely used tools in statistics. Suppose we were jobless college students interested in finding out how big (or small) our salaries would be 20

More information

Nonlife Actuarial Models. Chapter 14 Basic Monte Carlo Methods

Nonlife Actuarial Models. Chapter 14 Basic Monte Carlo Methods Nonlife Actuarial Models Chapter 14 Basic Monte Carlo Methods Learning Objectives 1. Generation of uniform random numbers, mixed congruential method 2. Low discrepancy sequence 3. Inversion transformation

More information