Stochastic Simulation: Variance reduction methods. Bo Friis Nielsen


Stochastic Simulation: Variance reduction methods
Bo Friis Nielsen
Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
Email: bfni@dtu.dk

Variance reduction methods

Goal: to obtain better estimates with the same resources, by exploiting analytical knowledge and/or correlation.

Methods:
- Antithetic variables
- Control variates
- Stratified sampling
- Importance sampling

Case: Monte Carlo evaluation of an integral

Consider the integral $\int_0^1 e^x\,dx$. We can interpret this integral as an expectation:

$E(e^U) = \int_0^1 e^x\,dx = \theta, \qquad U \sim U(0,1)$

To estimate the integral: sample the random variable $e^U$ and take the average,

$X_i = e^{U_i}, \qquad \bar X = \frac{1}{n}\sum_{i=1}^n X_i$

This is the crude Monte Carlo estimator, crude because we use no refinements whatsoever.
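A minimal sketch of the crude estimator in Python (NumPy assumed; the sample size $n = 100$ and the seed are arbitrary choices, matching Exercise 5 below):

```python
import numpy as np

rng = np.random.default_rng(1)   # seed chosen arbitrarily, for reproducibility

n = 100                          # sample size, as in Exercise 5
U = rng.uniform(size=n)          # U_i ~ U(0, 1)
X = np.exp(U)                    # X_i = e^{U_i}
theta_hat = X.mean()             # crude Monte Carlo estimate of theta = e - 1

# Normal-approximation 95% confidence interval for theta
half = 1.96 * X.std(ddof=1) / np.sqrt(n)
print(theta_hat, (theta_hat - half, theta_hat + half))
```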

Analytical considerations

It is straightforward to calculate the integral in this case:

$\int_0^1 e^x\,dx = e - 1 \approx 1.72$

For the estimator $X$ based on one observation:

$E(X) = e - 1$

$V(X) = E(X^2) - E(X)^2$

$E(X^2) = \int_0^1 (e^x)^2\,dx = \frac{1}{2}\left(e^2 - 1\right)$

$V(X) = \frac{1}{2}\left(e^2 - 1\right) - (e - 1)^2 = 0.2420$

Antithetic variables

General idea: exploit correlation. If the estimator is positively correlated with $U_i$ (a monotone function of it), use $1 - U_i$ as well:

$Y_i = \frac{e^{U_i} + e^{1-U_i}}{2} = \frac{e^{U_i} + e/e^{U_i}}{2}, \qquad \bar Y = \frac{1}{n}\sum_{i=1}^n Y_i$

The computational effort of calculating $\bar Y$ should be similar to the effort needed to compute $\bar X$: by the latter expression for $Y_i$ we can generate the same number of $Y$'s as $X$'s, with one exponential evaluation per uniform.
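A minimal antithetic sketch under the same assumptions (NumPy, arbitrary seed); each $Y_i$ consumes one uniform and one exponential evaluation, just like each $X_i$:

```python
import numpy as np

rng = np.random.default_rng(1)

n = 100
U = rng.uniform(size=n)
expU = np.exp(U)                 # one exp call per uniform
Y = (expU + np.e / expU) / 2     # Y_i = (e^{U_i} + e^{1-U_i}) / 2
print(Y.mean(), Y.var(ddof=1))   # sample variance should be near 0.0039
```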

Antithetic variables - analytical

We can analyse the example analytically due to its simplicity:

$E(\bar Y) = E(\bar X) = \theta$

To calculate $V(\bar Y)$ we start with $V(Y_i)$:

$V(Y_i) = \frac{1}{4}V\!\left(e^{U_i}\right) + \frac{1}{4}V\!\left(e^{1-U_i}\right) + 2\cdot\frac{1}{4}\,\mathrm{Cov}\!\left(e^{U_i}, e^{1-U_i}\right) = \frac{1}{2}V\!\left(e^{U_i}\right) + \frac{1}{2}\,\mathrm{Cov}\!\left(e^{U_i}, e^{1-U_i}\right)$

$\mathrm{Cov}\!\left(e^{U_i}, e^{1-U_i}\right) = E\!\left(e^{U_i} e^{1-U_i}\right) - E\!\left(e^{U_i}\right)E\!\left(e^{1-U_i}\right) = e - (e-1)^2 = 3e - e^2 - 1 = -0.2342$

$V(Y_i) = \frac{1}{2}\,(0.2420 - 0.2342) = 0.0039$

Comparison: Crude method vs. antithetic

Crude method: $V(X_i) = \frac{1}{2}\left(e^2 - 1\right) - (e-1)^2 = 0.2420$

Antithetic method: $V(Y_i) = \frac{1}{2}\,(0.2420 - 0.2342) = 0.0039$

That is, a reduction of 98%, almost for free. The variance of $\bar X$ and $\bar Y$ scales with $1/n$, the number of samples. Going from the crude to the antithetic method therefore reduces the variance as much as increasing the number of samples by a factor of roughly 60 ($0.2420/0.0039 \approx 62$).

Antithetic variables in more complex models

If $X = h(U_1, \ldots, U_n)$ where $h$ is monotone in each of its coordinates, then we can use antithetic variables to reduce the variance, because with

$Y = h(1 - U_1, \ldots, 1 - U_n)$

we have $\mathrm{Cov}(X, Y) \le 0$ and therefore $V\!\left(\frac{1}{2}(X + Y)\right) \le \frac{1}{2}V(X)$.
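A small numerical check of this claim, with a hypothetical $h$ (a sum of exponentials, increasing in every coordinate) chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def h(u):
    # Any function monotone in each coordinate works; this one is increasing
    return np.exp(u).sum(axis=1)

U = rng.uniform(size=(100, 5))
X, Y = h(U), h(1 - U)                   # antithetic pair
print(np.cov(X, Y)[0, 1])               # Cov(X, Y) <= 0, as claimed
print(X.var(ddof=1) / 2,                # V(X)/2 ...
      ((X + Y) / 2).var(ddof=1))        # ... bounds V((X + Y)/2) from above
```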

Antithetic variables in the queue simulation

Can you devise the queueing model of yesterday so that the number of rejections is a monotone function of the underlying $U_i$'s?

Yes: make sure that we always use either $U_i$ or $1 - U_i$, so that a large $U_i$ implies customers arriving quickly and remaining long.

Control variates

Use a covariate $Y$ with known mean $\mu_Y$:

$Z = X + c\,(Y - \mu_Y), \qquad E(Y) = \mu_Y \ \text{(known)}$

$V(Z) = V(X) + c^2 V(Y) + 2c\,\mathrm{Cov}(X, Y)$

We can minimize $V(Z)$ by choosing

$c = -\frac{\mathrm{Cov}(X, Y)}{V(Y)}$

to get

$V(Z) = V(X) - \frac{\mathrm{Cov}(X, Y)^2}{V(Y)}$

Example

Use $U$ as control variate:

$Z_i = X_i + c\left(U_i - \frac{1}{2}\right), \qquad X_i = e^{U_i}$

The optimal value of $c$ follows from

$\mathrm{Cov}(X, Y) = \mathrm{Cov}\!\left(U, e^U\right) = E\!\left(U e^U\right) - E(U)\,E\!\left(e^U\right) \approx 0.14086$

In practice we would not know this covariance, but estimate it empirically. With $c = -0.14086/(1/12) \approx -1.69$,

$V(Z) = V\!\left(e^U\right) - \frac{\mathrm{Cov}\!\left(e^U, U\right)^2}{V(U)} = 0.2420 - \frac{0.14086^2}{1/12} = 0.0039$
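A minimal sketch of this example, estimating $c$ empirically as suggested (NumPy, arbitrary seed):

```python
import numpy as np

rng = np.random.default_rng(1)

n = 100
U = rng.uniform(size=n)
X = np.exp(U)

c = -np.cov(X, U)[0, 1] / U.var(ddof=1)  # empirical estimate of the optimal c
Z = X + c * (U - 0.5)                    # mu_Y = E(U) = 1/2 is known exactly
print(Z.mean(), Z.var(ddof=1))           # variance near 0.0039
```

Estimating $c$ from the same sample introduces a small bias; a pilot run can be used to estimate $c$ first if this matters.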

Stratified sampling

This is a general survey technique: we sample in predetermined areas, using knowledge of the structure of the sampling space. With ten equally wide strata on $(0,1)$ and one uniform draw per stratum:

$W_i = \frac{1}{10}\left(e^{U_{i,1}/10} + e^{1/10 + U_{i,2}/10} + \cdots + e^{9/10 + U_{i,10}/10}\right)$

What is an appropriate number of strata? (In this case there is a simple answer; for complex problems, not so.)
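A minimal sketch of the ten-stratum estimator (NumPy, arbitrary seed and sample size):

```python
import numpy as np

rng = np.random.default_rng(1)

n, k = 100, 10                       # n stratified estimates, k strata on (0, 1)
U = rng.uniform(size=(n, k))
points = (np.arange(k) + U) / k      # one draw inside each interval (j/k, (j+1)/k)
W = np.exp(points).mean(axis=1)      # W_i as defined above
print(W.mean(), W.var(ddof=1))
```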

Importance sampling

Suppose we want to evaluate

$\theta = E(h(X)) = \int h(x)f(x)\,dx$

For $g(x) > 0$ whenever $f(x) > 0$ this is equivalent to

$\theta = \int \frac{h(x)f(x)}{g(x)}\,g(x)\,dx = E\!\left(\frac{h(Y)f(Y)}{g(Y)}\right)$

where $Y$ is distributed according to $g(y)$. This is an efficient estimator of $\theta$ if we have chosen $g$ such that the variance of $\frac{h(Y)f(Y)}{g(Y)}$ is small. Such a $g$ will lead to more $Y$'s where $h(y)f(y)$ is large: the more important regions will be sampled more often.
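A minimal sketch for the same toy integral. The proposal $g(y) = \frac{2}{3}(1+y)$ on $(0,1)$ is an arbitrary choice made here because it roughly tracks the growth of $e^y$ and has an invertible CDF $G(y) = (2y + y^2)/3$:

```python
import numpy as np

rng = np.random.default_rng(1)

n = 100
u = rng.uniform(size=n)
y = np.sqrt(1 + 3 * u) - 1          # inverse CDF sampling: G(y) = (2y + y^2)/3
w = np.exp(y) / (2 * (1 + y) / 3)   # h(y) f(y) / g(y), with f = 1 on (0, 1)
print(w.mean(), w.var(ddof=1))      # variance roughly 0.03 vs crude 0.2420
```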

Re-using the random numbers

We want to compare two different queueing systems. We can estimate the rejection rate of system $i = 1, 2$ by

$\theta_i = E(g_i(U_1, \ldots, U_n))$

and then rate the two systems according to $\hat\theta_2 - \hat\theta_1$. But typically $g_1(\cdot)$ and $g_2(\cdot)$ are positively correlated: long service times imply many rejections.

Then a more efficient estimator is based on

$\theta_2 - \theta_1 = E(g_2(U_1, \ldots, U_n) - g_1(U_1, \ldots, U_n))$

This amounts to letting the two systems run with the same input sequence of random numbers, i.e. the same arrival and service time for each customer. With some program flows this is easily obtained by re-setting the seed of the RNG. When this is not sufficient, you must store the sequence of arrival and service times, so they can be re-used.
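A minimal sketch of the seed-resetting approach. The simulator below is a hypothetical stand-in for $g_1$ and $g_2$ (here: the same loss system with two different capacities), and all rates are invented for illustration:

```python
import numpy as np

def run_system(capacity, seed, n_customers=1000):
    """Hypothetical loss system: returns the fraction of rejected customers."""
    rng = np.random.default_rng(seed)            # same seed -> same U sequence
    arrivals = rng.exponential(1.0, n_customers).cumsum()
    services = rng.exponential(0.8, n_customers)
    busy_until = np.zeros(capacity)
    rejected = 0
    for t, s in zip(arrivals, services):
        k = busy_until.argmin()                  # earliest-free server
        if busy_until[k] > t:
            rejected += 1                        # all servers busy: reject
        else:
            busy_until[k] = t + s
    return rejected / n_customers

seed = 1                                         # common random numbers
diff = run_system(capacity=3, seed=seed) - run_system(capacity=2, seed=seed)
print(diff)
```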

Exercise 5: Variance reduction methods

1. Estimate the integral $\int_0^1 e^x\,dx$ by simulation (the crude Monte Carlo estimator). Use e.g. an estimator based on 100 samples and present the result as the point estimator and a confidence interval.
2. Estimate the integral $\int_0^1 e^x\,dx$ using antithetic variables, with comparable computer resources.
3. Estimate the integral $\int_0^1 e^x\,dx$ using a control variable, with comparable computer resources.
4. Estimate the integral $\int_0^1 e^x\,dx$ using stratified sampling, with comparable computer resources.
5. Use control variates to reduce the variance of the estimator in exercise 4 (Poisson arrivals).