Monte Carlo method and application to random processes


Monte Carlo method and application to random processes. Lecture 3: Variance reduction techniques (8/3/2017). Lecturer: Ernesto Mordecki, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay. Graduate program in engineering, major in applied mathematics, GIMAS8AA, Mines Nancy, University of Lorraine. March 9, 2017.

Contents

Variance reduction
Antithetic variates
Importance sampling
Control variates
Stratified sampling
Conditional sampling

Variance reduction

As we have seen, a critical issue in the MC method is the quality of the estimation. The question we face is: can we devise a method that produces, with the same number of variates, a more precise estimation? The answer is yes, and the general idea is the following: if we want to estimate µ = E X, find a variable Y such that

µ = E X = E Y,   var Y < var X.

The way to produce a good Y usually departs from the knowledge that we have about X. There are several methods to reduce the variance; however, there does not exist a general method that always produces a gain in variance: each problem has its own good method.

Antithetic variates

The method is simple, and consists in using a symmetrized variate in the cases where this is possible. For instance, if we want to compute µ = ∫₀¹ f(x) dx, we would take, with U uniform in [0, 1],

X = f(U),   Y = (1/2)(f(U) + f(1 − U)).

We have

var Y = (1/2)(var X + cov(f(U), f(1 − U))) ≤ var X.

If cov(f(U), f(1 − U)) < var X we have variance reduction.

Example: Uniform random variables

We compute π = 4 ∫₀¹ √(1 − x²) dx with n = 10⁶ variates. Our results:

                       Estimate    Variance
Classical estimate     3.141379    0.000553
Antithetic variates    3.141536    0.000205
True value             3.141593
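
A minimal sketch of this experiment in Python (NumPy assumed; the seed, names, and the printed variance convention are ours, so the numbers will differ slightly from the table):

import numpy as np

rng = np.random.default_rng(0)
n = 10**6
f = lambda x: 4 * np.sqrt(1 - x**2)

u = rng.random(n)

# Classical estimator: sample mean of f(U).
x = f(u)
print("classical :", x.mean(), x.var() / n)

# Antithetic estimator: pair each U with 1 - U.
y = 0.5 * (f(u) + f(1 - u))
print("antithetic:", y.mean(), y.var() / n)

For a comparison at strictly equal cost one would use n/2 antithetic pairs, since each pair costs two evaluations of f.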

Example: Tail probabilities of normal random variables

We want to compute the probability that a standard normal variable is larger than 3: µ = P(Z > 3). The two estimators are

µ̂ = (1/n) Σ_{k=1}^n 1{Z_k > 3},
µ̂_A = (1/(2n)) Σ_{k=1}^n (1{Z_k > 3} + 1{−Z_k > 3}).

Our results with n = 10⁴ variates:

                       Estimate    Variance
Classical estimate     0.00121     0.00022
Antithetic variates    0.00137     0.00016
True value             0.0013499

Importance sampling

The importance sampling method consists in changing the underlying distribution of the variable used to simulate. It is especially suited to the estimation of small probabilities (rare events). Assuming that X ∼ f and Y ∼ g, it is based on the following identity:

µ = E h(X) = ∫ h(x) f(x) dx = ∫ (h(x) f(x) / g(x)) g(x) dx = E H(Y),

where we define H(x) = h(x) f(x) / g(x). The main idea is to make Y point to the set where h takes large values. If not correctly applied, the method can enlarge the variance.

Example: Tail probabilities of normal random variables

µ = P(Z > 3) = ∫₃^∞ (e^{−x²/2} / √(2π)) dx
             = ∫₃^∞ e^{−3x + 9/2} (e^{−(x−3)²/2} / √(2π)) dx
             = E(e^{−3Y + 9/2} 1{Y > 3}),

where Y ∼ N(3, 1). Our results with n = 10⁴ variates:

                       Estimate    Variance
Classical estimate     0.00121     2.2e-04
Antithetic variates    0.00137     1.6e-04
Importance sampling    0.001340    1.5e-05
True value             0.0013499
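
The three estimators of the table, sketched in Python (NumPy assumed; names are ours):

import numpy as np

rng = np.random.default_rng(0)
n = 10**4

z = rng.standard_normal(n)
classical = (z > 3).astype(float)                      # 1{Z > 3}
antithetic = 0.5 * ((z > 3).astype(float) + (z < -3))  # reuses each Z with its sign flipped

# Importance sampling: draw Y ~ N(3, 1) and weight by the
# likelihood ratio f(y)/g(y) = exp(-3y + 9/2) on {Y > 3}.
y = 3.0 + rng.standard_normal(n)
importance = np.exp(-3.0 * y + 4.5) * (y > 3)

for name, s in [("classical", classical), ("antithetic", antithetic), ("importance", importance)]:
    print(name, s.mean(), s.var())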

Control variates

Given the problem of simulating µ = E h(X), the idea is to control the function h through a function g, as close as possible to h, and such that we know β = E g(X). We can add a constant c for a better adjustment. More concretely, the equation is

µ = E h(X) = E h(X) − c (E g(X) − β) = E(h(X) − c g(X)) + cβ.

The coefficient c can be chosen in order to minimize the variance:

var(h(X) − c g(X)) = var h(X) + c² var g(X) − 2c cov(h(X), g(X)).

This is minimized when

c = cov(h(X), g(X)) / var g(X).

As these quantities are usually unknown, we can first run a pilot MC to estimate c, obtaining the following variance:

var(h(X) − c g(X)) = var h(X) − cov(h(X), g(X))² / var g(X) = (1 − ρ(h(X), g(X))²) var h(X).

As |ρ(h(X), g(X))| ≤ 1, we usually obtain a variance reduction.

Example: The computation of π

We choose g(x) = 1 − x, which is close to √(1 − x²). We first estimate c. In this case we know β = E(1 − U) = 1/2 and var(1 − U) = 1/12. After simulation we obtain ĉ ≈ 0.7.

So we estimate

π = 4 E(√(1 − U²) − 0.7 (1 − U − 1/2)).

Our results with n = 10⁶:

                       Estimate    Variance
Classical estimate     3.141379    0.000553
Antithetic variates    3.141536    0.000205
Control variates       3.141517    0.000215
True value             3.141593
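
A minimal sketch of the pilot-then-main-run scheme in Python (NumPy assumed; the pilot size is our choice):

import numpy as np

rng = np.random.default_rng(0)
h = lambda x: np.sqrt(1 - x**2)
g = lambda x: 1 - x              # control variate; beta = E g(U) = 1/2 is known

# Pilot run: estimate c = cov(h(U), g(U)) / var g(U).
u0 = rng.random(10**4)
C = np.cov(h(u0), g(u0))
c = C[0, 1] / C[1, 1]            # comes out near 0.7

# Main run with the controlled estimator.
u = rng.random(10**6)
est = 4 * (h(u) - c * (g(u) - 0.5))
print(c, est.mean(), est.var() / len(u))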

Stratified sampling²

The idea this method proposes to reduce the variance is to produce a partition of the probability space Ω, and to distribute the sampling effort among the sets of the partition. Suppose we want to estimate µ = E(X), and suppose there is some discrete random variable Y, with possible values y₁, …, y_k, such that, for each i = 1, …, k:

(a) the probability p_i = P(Y = y_i) is known;
(b) we can simulate the value of X conditional on Y = y_i.

The proposal is to estimate

E(X) = Σ_{i=1}^k E(X | Y = y_i) p_i

by estimating the k quantities E(X | Y = y_i), i = 1, …, k.

² Adapted from Simulation, 5th ed., S. M. Ross (2013), Elsevier.

So, rather than generating n independent replications of X, we do n p_i of the simulations conditional on the event Y = y_i, for each i = 1, …, k. If we let X̂_i be the average of the n p_i observed values of X | Y = y_i, then we have the unbiased estimator

µ̂ = Σ_{i=1}^k X̂_i p_i,

which is called a stratified sampling estimator of E(X). To compute its variance, we first have

var(X̂_i) = var(X | Y = y_i) / (n p_i).

Consequently, using the preceding and the fact that the X̂_i are independent, we see that

var(µ̂) = Σ_{i=1}^k p_i² var(X̂_i) = (1/n) Σ_{i=1}^k p_i var(X | Y = y_i) = (1/n) E(var(X | Y)).

Because the variance of the classical estimator is (1/n) var X, we see from the conditional variance formula

var X = E(var(X | Y)) + var(E(X | Y))

that the variance reduction is

(1/n) var X − (1/n) E(var(X | Y)) = (1/n) var(E(X | Y)).

That is, the variance saving per run is var(E(X | Y)), which can be substantial when the value of Y strongly affects the conditional expectation of X. On the contrary, if X and Y are independent, then E(X | Y) = E X and var(E(X | Y)) = 0. Observe that the variance of the stratified sampling estimator can itself be estimated by

var(µ̂) ≈ Σ_{i=1}^k p_i² s_i² / (n p_i) = (1/n) Σ_{i=1}^k p_i s_i²,

where s_i² is the usual variance estimator computed from the sample of X | Y = y_i.

Remark: Simulating n p_i variates for each i is called proportional sampling. Alternatively, one can choose sample sizes n₁, …, n_k with n₁ + ⋯ + n_k = n that minimize the variance.
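
As a sketch under assumed data (the three-stratum normal mixture below is ours, purely for illustration), proportional stratified sampling in Python could look like this:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical strata: Y takes k = 3 values with known probabilities p,
# and X | Y = y_i is N(mu[i], sigma[i]^2), which we can simulate directly.
p = np.array([0.5, 0.3, 0.2])
mu = np.array([0.0, 2.0, 5.0])
sigma = np.array([1.0, 0.5, 2.0])
n = 10**5

mu_hat, s2_sum = 0.0, 0.0
for i in range(len(p)):
    ni = int(n * p[i])                            # proportional allocation
    xi = mu[i] + sigma[i] * rng.standard_normal(ni)
    mu_hat += p[i] * xi.mean()                    # accumulates p_i * Xhat_i
    s2_sum += p[i] * xi.var()                     # accumulates p_i * s_i^2
print(mu_hat, s2_sum / n)                         # estimate and estimated var(mu_hat)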

Example: Integrals in [0, 1]

Suppose that we want to estimate

µ = E(h(U)) = ∫₀¹ h(x) dx.

We put Y = j if (j − 1)/n ≤ U < j/n, for j = 1, …, n. We have

µ = E E(h(U) | Y) = (1/n) Σ_{j=1}^n E(h(U^{(j)})),

where U^{(j)} is uniform in [(j − 1)/n, j/n). In this example we have k = n, and we use n_i = 1 variate for each value of Y. As U^{(j)} ∼ (U + j − 1)/n, the resulting estimator is

µ̂ = (1/n) Σ_{j=1}^n h((U_j + j − 1)/n).

To compute the variance, we have

var(µ̂) = (1/n²) Σ_{j=1}^n var h((U + j − 1)/n) = (1/n) Σ_{j=1}^n ∫_{(j−1)/n}^{j/n} (h(x) − µ_j)² dx,

where µ_j = n ∫_{(j−1)/n}^{j/n} h(x) dx. The reduction is obtained because µ_j is closer to h than µ:

var(µ̂_C) = (1/n) ∫₀¹ (h(x) − µ)² dx,

where µ̂_C stands for the classical MC estimator.

Example: Computation of π

We return to

π = 4 ∫₀¹ √(1 − x²) dx.

Observing that (j − U)/n ∼ (U + j − 1)/n ∼ U^{(j)}, we combine stratified and antithetic sampling:

µ̂ = (1/(2n)) Σ_{j=1}^n (h((U_j + j − 1)/n) + h((j − U_j)/n)).

For n = 10⁵ we obtain the estimation µ̂ = 3.1415926537 (π = 3.14159265358979…), with 10 correct digits.
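
A sketch of this combined estimator in Python (NumPy assumed; here h already includes the factor 4, so the estimator targets π directly):

import numpy as np

rng = np.random.default_rng(0)
n = 10**5
h = lambda x: 4 * np.sqrt(1 - x**2)

j = np.arange(1, n + 1)    # one uniform per stratum [(j-1)/n, j/n)
u = rng.random(n)

# Average each stratified point (U_j + j - 1)/n with its
# antithetic counterpart (j - U_j)/n within the same stratum.
est = (0.5 * (h((u + j - 1) / n) + h((j - u) / n))).mean()
print(est)                 # should agree with pi to many digits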

Conditional sampling

Remember the telescopic (or "tower") property of the conditional expectation:

E(X) = E(E(X | θ)),

where θ is an auxiliary random variable. In case we are able to simulate Y = E(X | θ), we have the following variance reduction:

var Y = var(E(X | θ)) = var X − E(var(X | θ)).

Example: Computing an expectation

Let U be uniform in [0, 1] and Z ∼ N(0, 1). We want to compute µ = E(e^{UZ}). We first compute

E(e^{UZ} | U = u) = ∫_R e^{ux} (1/√(2π)) e^{−x²/2} dx = e^{u²/2},

so Y = E(e^{UZ} | U) = e^{U²/2}, and E Y = ∫₀¹ e^{u²/2} du. Our results for two sample sizes:

              n = 10³           n = 10⁶
Classical     1.2145 ± 0.020    1.1951 ± 0.00060
Conditional   1.1962 ± 0.004    1.1949 ± 0.00012
True          1.194958

Note that the classical method requires 2n samples (one U and one Z per replication).
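
A sketch of the comparison in Python (NumPy assumed; the 95% half-widths are ours, computed from the sample standard deviation):

import numpy as np

rng = np.random.default_rng(0)
n = 10**3

u = rng.random(n)

# Classical: needs both U and Z, i.e. 2n variates in total.
z = rng.standard_normal(n)
classical = np.exp(u * z)

# Conditional: replace e^{UZ} by Y = E(e^{UZ} | U) = e^{U^2 / 2}.
conditional = np.exp(u**2 / 2)

for name, s in [("classical", classical), ("conditional", conditional)]:
    half = 1.96 * s.std() / np.sqrt(n)    # rough 95% confidence half-width
    print(name, s.mean(), "+/-", half)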