
1 STRATIFIED SAMPLING

Stratified Sampling Background: suppose θ = E[X]. A simple MC simulation would use θ ≈ X̄. If the simulation of X depends on some discrete Y with pmf P{Y = y_i} = p_i, i = 1, ..., k, and X can be simulated given Y = y_i, then

E[X] = Σ_{i=1}^k p_i E[X | Y = y_i] ≈ Σ_{i=1}^k p_i X̄_i = Θ.

Θ is the stratified sampling estimator of θ. Note: the y_i's subdivide the sampling space into k "strata".

Analysis: if n p_i samples are used to determine X̄_i, then Var(X̄_i) = Var(X | Y = y_i)/(n p_i), and

Var(Θ) = Σ_{i=1}^k p_i² Var(X̄_i) = (1/n) Σ_{i=1}^k p_i Var(X | Y = y_i) = (1/n) E[Var(X | Y)].

Now Var(X) = E[Var(X | Y)] + Var(E[X | Y]), and Var(X̄) = Var(X)/n, so

Var(Θ) = Var(X̄) − (1/n) Var(E[X | Y]).

Stratified sampling can therefore reduce variance significantly. Note: if S_i²/(n p_i) is the sample variance estimate for X̄_i based on its n p_i samples, then

Var(Θ) = Σ_{i=1}^k p_i² Var(X̄_i) ≈ (1/n) Σ_{i=1}^k p_i S_i².
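The estimator and variance formula above can be sketched in Python (a sketch with a hypothetical helper name, not the lecture's MATLAB code), using proportional allocation n_i = n p_i:

```python
import random

def stratified_estimate(p, sample_stratum, n):
    """Stratified sampling with proportional allocation: n_i = n*p_i samples
    from stratum i.  Returns (theta_hat, var_hat), where
    theta_hat = sum_i p_i * Xbar_i  and  var_hat = sum_i p_i^2 * S_i^2 / n_i."""
    theta_hat, var_hat = 0.0, 0.0
    for i, pi in enumerate(p):
        ni = max(2, round(n * pi))
        xs = [sample_stratum(i) for _ in range(ni)]
        xbar = sum(xs) / ni
        s2 = sum((x - xbar) ** 2 for x in xs) / (ni - 1)  # sample variance S_i^2
        theta_hat += pi * xbar
        var_hat += pi ** 2 * s2 / ni
    return theta_hat, var_hat

# Toy example: X = U^2 with U ~ Uniform(0,1), stratified on which half of
# [0,1] the draw falls in (p_1 = p_2 = 1/2); the exact answer is E[U^2] = 1/3.
random.seed(1)
est, v = stratified_estimate(
    [0.5, 0.5],
    lambda i: (i / 2 + random.random() / 2) ** 2,
    10000)
```

The two strata here are the halves of [0, 1]; any partition of the sampling space with known probabilities p_i and a per-stratum sampler fits the same pattern.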

2 Application: one-dimensional integration, where θ = E[h(U)] = ∫_0^1 h(x) dx. Strata are the subintervals [(i−1)/k, i/k], with p_i = 1/k. For subinterval i, generate U_ij = (i−1)/k + U/k, j = 1, ..., n_i, i = 1, ..., k.

Example: π/4 = E[√(1 − U²)] = ∫_0^1 √(1 − x²) dx ≈ 0.7853982. Try n = 5000.

Simple MC ("raw simulation"):

N = 5000; U = rand(1,N); X = sqrt(1-U.^2);
disp([mean(X) 2*std(X)/sqrt(N)])
    0.7864    0.0063975

With stratified sampling there are different choices for k.

a) k = n, n_i = 1, so use U_ij = U_i = (i−1)/n + U/n, i = 1, ..., n.

US = [0:N-1]/N + U/N; XS = sqrt(1-US.^2);
disp([mean(XS) 2*std(XS)/sqrt(N)])
    0.7854    0.0063134

b) k = 500, n_i = 10, p_i = 1/500, so use U_ij = (i−1)/500 + U/500, j = 1, ..., 10, i = 1, ..., 500, with

Var(Θ) ≈ Σ_{i=1}^{500} (1/500)² S_i²/10 = (1/500) Σ_{i=1}^{500} S_i²/5000.

K = 500; Ni = N/K;
for i = 1:K
    XS = sqrt(1-((i-1+rand(1,Ni))/K).^2);
    XSB(i) = mean(XS); SS(i) = var(XS);
end
SST = mean(SS/N);
disp([mean(XSB) 2*sqrt(SST)])
    0.78541    3.227e-5
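The same one-dimensional comparison can be reproduced in Python (a sketch, not the lecture's MATLAB; N = 5000 samples is an illustrative choice):

```python
import math
import random

random.seed(0)
N = 5000

# Simple MC: average of sqrt(1 - U^2) over N uniform draws.
X = [math.sqrt(1 - random.random() ** 2) for _ in range(N)]
mc = sum(X) / N

# Stratified sampling with k = N strata and one sample per
# subinterval [i/N, (i+1)/N): the "a)" scheme above.
XS = [math.sqrt(1 - ((i + random.random()) / N) ** 2) for i in range(N)]
strat = sum(XS) / N
```

Both estimate π/4 ≈ 0.7853982; the stratified version is far more accurate at the same sample count, since within each narrow stratum the integrand barely varies.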

3 Optimal Distribution of Samples: to reduce Var further, assume n_i samples for stratum i, with Σ_{i=1}^k n_i = n. If X̄_i ≈ E[X | Y = y_i] using n_i samples, let Θ = Σ p_i X̄_i, with Var(Θ) = Σ p_i² Var(X̄_i). Given an estimate s_i² for each conditional variance, so that Var(X̄_i) ≈ s_i²/n_i, choose the n_i's to minimize Σ_{i=1}^k p_i² s_i²/n_i subject to Σ_{i=1}^k n_i = n; the Lagrange multiplier solution is

n_i = n p_i s_i / (Σ_{j=1}^k p_j s_j).

Algorithm: use a small n to estimate the s_i's, then a larger n for Θ; Var(Θ) ≈ Σ p_i² S_i²/n_i is estimated using the sample variances S_i² from the strata.
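The Lagrange solution can be written as a small helper (a Python sketch with a hypothetical function name; the floor of 3 samples per stratum mirrors the max(3, ·) guard used in the examples, and is otherwise an assumption):

```python
def optimal_allocation(p, s, n, n_min=3):
    """Optimal allocation: n_i = n * p_i*s_i / sum_j p_j*s_j, rounded,
    with a small floor so every stratum gets at least a few samples."""
    w = [pi * si for pi, si in zip(p, s)]
    tot = sum(w)
    return [max(n_min, round(n * wi / tot)) for wi in w]

# Two equally likely strata, the second three times as variable:
# it receives three times as many samples.
alloc = optimal_allocation([0.5, 0.5], [1.0, 3.0], 100)  # -> [25, 75]
```

Because of the rounding and the floor, the returned counts may not sum exactly to n; the examples below tolerate this, since the variance estimate divides by the realized n_i's.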

4 Continued Example: π/4 = E[√(1 − U²)] = ∫_0^1 √(1 − x²) dx.

c) k = 500, p_i = 1/500, optimal n_i = 5000 s_i/(Σ_{j=1}^{500} s_j). Use U_ij = (i−1)/500 + U/500, j = 1, ..., n_i, i = 1, ..., 500, with

Var(Θ) ≈ (1/500) Σ_{i=1}^{500} S_i²/(500 n_i).

ON = max(3, round(N*sqrt(SS)/sum(sqrt(SS))));
for i = 1:K
    XS = sqrt(1-((i-1+rand(1,ON(i)))/K).^2);
    XSB(i) = mean(XS); SS(i) = var(XS);
end
SST = mean(SS./(500*ON));
disp([mean(XSB) 2*sqrt(SST)])
    0.78538    1.6446e-5

5 More Stratified Sampling Examples

Estimate P = ∫ cos(t²) e^{−t²/2}/√(2π) dt. For simulation, convert to x ∈ [0, 1] using x = Φ(t): then dx = e^{−t²/2}/√(2π) dt, so

P = ∫_0^1 cos(Φ⁻¹(x)²) dx.

a) Simple MC:

f = @(x)cos(norminv(x).^2);
N = 5000; U = rand(1,N); X = f(U);
disp([mean(X) 2*std(X)/sqrt(N)])
    0.5725    0.016998

b) k = 500, n_i = 10, p_i = 1/500, so use U_ij = (i−1)/500 + U/500, j = 1, ..., 10, i = 1, ..., 500, with

Var(Θ) ≈ Σ_{i=1}^{500} (1/500)² S_i²/10 = (1/500) Σ_{i=1}^{500} S_i²/5000.

K = 500; Ni = N/K;
for i = 1:K
    XS = f((i-1+rand(1,Ni))/K);
    XSB(i) = mean(XS); SS(i) = var(XS);
end
SST = mean(SS/N);
disp([mean(XSB) 2*sqrt(SST)])
    0.5691    0.001471

c) k = 500, p_i = 1/500, optimal n_i = 5000 s_i/(Σ_{j=1}^{500} s_j). Use U_ij = (i−1)/500 + U/500, j = 1, ..., n_i, i = 1, ..., 500, with

Var(Θ) ≈ (1/500) Σ_{i=1}^{500} S_i²/(500 n_i).

ON = max(3, round(N*sqrt(SS)/sum(sqrt(SS))));
for i = 1:K
    XS = f((i-1+rand(1,ON(i)))/K);
    XSB(i) = mean(XS); SS(i) = var(XS);
end
SST = mean(SS./(500*ON));
disp([mean(XSB) 2*sqrt(SST)])
    0.5691    0.0002475
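As a cross-check: Python's standard library exposes Φ⁻¹ through statistics.NormalDist, so the one-sample-per-stratum scheme for P = ∫_0^1 cos(Φ⁻¹(x)²) dx can be sketched as follows (sample size illustrative; the exact value is E[cos(Z²)] = Re (1 − 2i)^{−1/2} ≈ 0.5689 for Z ~ N(0, 1)):

```python
import math
import random
from statistics import NormalDist

def f(x):
    # Integrand after substituting x = Phi(t): cos(Phi^{-1}(x)^2).
    return math.cos(NormalDist().inv_cdf(x) ** 2)

random.seed(2)
n = 5000
# k = n strata: one uniform draw in each subinterval [i/n, (i+1)/n).
est = sum(f((i + random.random()) / n) for i in range(n)) / n
```

Note that inv_cdf requires an argument strictly inside (0, 1); the stratified draws (i + U)/n satisfy this with probability one.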

6 Video Poker Example: 5 cards are dealt, and 1-5 discards are replaced. Let C = 1/(52 choose 5) = 1/2598960, and assume the payoff table is:

Hand                      Payoff   Probability
RF: Royal Flush              800   4C
SF: Straight Flush            50   36C
4K: Four-of-a-kind            25   624C
FH: Full House                 8   3744C
FL: Flush                      5   5108C
ST: Straight                   4   10200C
3K: Three-of-a-kind            3   54912C
2P: Two Pair                   2   123552C
HP: High Pair (> 10)           1   337920C
LP: Low Pair (<= 10)           0   760320C
OT: Other                      0   1 − Σ p_i

Note: 1 − Σ p_i ≈ 0.51527.

Textbook Strategy: no replacement cards are dealt if the hand is better than 3K; otherwise keep pairs and triplets but replace the other cards.

Raw simulation: for each run,
a) use a random permutation of [1:52] (p. 51) to shuffle the cards;
b) pick the top 5 cards, then discard and replace using the text strategy;
c) compute X from the table.
Compute X̄ using many runs.

7 Video Poker Example with stratification: if the payoff is the RV X,

E[X] = E[X | RF]P{RF} + E[X | SF]P{SF} + E[X | 4K]P{4K} + E[X | FH]P{FH} + E[X | FL]P{FL} + E[X | ST]P{ST} + E[X | 3K]P{3K} + E[X | 2P]P{2P} + E[X | HP]P{HP} + E[X | LP]P{LP} + E[X | OT]P{OT}
= 0.51293 + 4.24144397 (0.021128451) + E[X | 2P]P{2P} + E[X | HP]P{HP} + E[X | LP]P{LP} + E[X | OT]P{OT}.

E[X | 2P], E[X | HP], and E[X | LP] can also be computed analytically, so only the OT stratum needs simulation.

Stratified sampling: for each run,
a) use a random permutation of [1:52] (p. 51) to shuffle the cards;
b) pick the top 5 cards; if the hand is LP or better, start a new run;
c) pick the next 5 cards (the strategy replaces all 5 cards of an OT hand); compute X_OT and X using

X = 0.51293 + 4.24144397 (0.021128451) + E[X | 2P]P{2P} + E[X | HP]P{HP} + E[X | LP]P{LP} + P{OT} X_OT.

Compute X̄ using many runs.

8 Multidimensional Integrals: consider the 2-dimensional θ = ∫_0^1 ∫_0^1 g(x, y) dx dy.

Simple stratified sampling: if the subdivision of [0, 1] × [0, 1] is the same for x and y, the strata are the squares s_ij = [(i−1)/k, i/k] × [(j−1)/k, j/k], for i = 1:k, j = 1:k, so

θ = Σ_i Σ_j p_ij E[X | (x, y) ∈ s_ij], with p_ij = 1/k²,

Θ = Σ_{i=1}^k Σ_{j=1}^k p_ij X̄_ij, with X̄_ij ≈ E[X | (x, y) ∈ s_ij],

X̄_ij = (1/n_ij) Σ_{l=1}^{n_ij} g((i − 1 + U_l)/k, (j − 1 + V_l)/k),

Var(Θ) ≈ Σ_{i=1}^k Σ_{j=1}^k p_ij² S_ij²/n_ij.

Note: if n_ij = n/k², then Var(Θ) ≈ (1/k²) Σ_i Σ_j S_ij²/n.

9 Two-dimensional example: θ = ∫_0^1 ∫_0^1 e^{(x+y)²} dx dy.

g = @(x)exp(sum(x).^2);
N = 10000; X = g(rand(2,N)); % Simple MC
disp([mean(X) 2*std(X)/sqrt(N)])
    4.8975    0.1188

K = 20; Nij = N/K^2; % Stratified
for i = 1:K
    for j = 1:K
        XS = g([i-1+rand(1,Nij); j-1+rand(1,Nij)]/K);
        XSB(i,j) = mean(XS); SS(i,j) = var(XS);
    end
end
SST = mean(mean(SS/N));
disp([mean(mean(XSB)) 2*sqrt(SST)])
    4.8967    0.01679

Note: for n-dimensional integrals, the growth in the number of terms k^n in the sum for Θ,

Θ = (1/k^n) Σ_{i_1=1}^k ··· Σ_{i_n=1}^k X̄_{i_1,...,i_n},

results in a method that is infeasible for large n; however, the terms in the sum can themselves be sampled, with

Θ ≈ (1/m) Σ_{l=1}^m X̄_I,

where I = (I_1, I_2, ..., I_n), with each I_j ~ Uniform(1:k). Other strategies, e.g. with nonuniform n_ij's or a different k for each variable, have been considered.
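A Python sketch of the same 2-d stratified scheme (grid size and per-cell sample count are illustrative; the exact value of θ is ≈ 4.91):

```python
import math
import random

def g(x, y):
    return math.exp((x + y) ** 2)

random.seed(3)
K, Nij = 20, 25                       # 20x20 grid of strata, 25 samples per cell
total = 0.0
for i in range(K):
    for j in range(K):
        # One cell s_ij: uniform draws in [(i)/K,(i+1)/K) x [(j)/K,(j+1)/K).
        cell = [g((i + random.random()) / K, (j + random.random()) / K)
                for _ in range(Nij)]
        total += sum(cell) / Nij      # cell mean; each cell has weight 1/K^2
theta_hat = total / K ** 2
```

The estimator is the average of the K² cell means, matching Θ = Σ p_ij X̄_ij with p_ij = 1/K².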

10 Transformation Method for Monotone Functions:

θ = ∫_0^1 ··· ∫_0^1 g(x_1, x_2, ..., x_n) dx_1 ··· dx_n.

If x_1 = e^{−y_n y_1}, x_2 = e^{−y_n(y_2 − y_1)}, ..., x_n = e^{−y_n(1 − y_{n−1})}, then

θ = ∫_0^∞ ∫_0^1 ∫_0^{y_{n−1}} ··· ∫_0^{y_2} g(x_1(y), ..., x_n(y)) y_n^{n−1} e^{−y_n} dy_1 ··· dy_{n−1} dy_n.

The factor y_n^{n−1} e^{−y_n} is, up to a constant, the Gamma(n, 1) pdf. For simulation, first generate Y_n ~ Gamma(n, 1), then U_1, ..., U_{n−1} ~ Uniform(0, 1), and set Y_i = U_(i) (the sorted values). If Y_n = Gamma⁻¹(n, 1, U_n) (inversion), stratification can be used on U_n.

11 Example, 2-d continued: let x_1 = e^{−y_2 y_1}, x_2 = e^{−y_2(1 − y_1)}; then

θ = ∫_0^1 ∫_0^1 e^{(x_1+x_2)²} dx_1 dx_2 = ∫_0^∞ ∫_0^1 e^{(x_1(y_1,y_2) + x_2(y_1,y_2))²} y_2 e^{−y_2} dy_1 dy_2.

N = 10000; U = rand(2,N); % Simple MC
Y(1,:) = U(1,:); Y(2,:) = gaminv(U(2,:),2,1);
X(1,:) = exp(-Y(1,:).*Y(2,:));
X(2,:) = exp(-Y(2,:).*(1-Y(1,:)));
Z = g(X);
disp([mean(Z) 2*std(Z)/sqrt(N)])
    4.977    0.11989

K = 10; Ni = N/K; % Stratified on U(2,:)
for i = 1:K
    Ui = rand(2,Ni);
    Y2 = gaminv((i-1+Ui(2,:))/K,2,1);
    XS = g([exp(-Ui(1,:).*Y2); exp(-Y2.*(1-Ui(1,:)))]);
    XSB(i) = mean(XS); SS(i) = var(XS);
end
SST = mean(SS/N);
disp([mean(XSB) 2*sqrt(SST)])
    4.914    0.01342
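The n = 2 transformation is easy to check in Python, since Gamma(2, 1) can be simulated as the sum of two Exp(1) draws (a plain-MC sketch of the transformed integral; sample size illustrative):

```python
import math
import random

random.seed(4)
N = 100000
total = 0.0
for _ in range(N):
    y1 = random.random()                                    # Y1 ~ Uniform(0,1)
    y2 = random.expovariate(1.0) + random.expovariate(1.0)  # Y2 ~ Gamma(2,1)
    x1 = math.exp(-y2 * y1)                                 # x1 = e^{-y2 y1}
    x2 = math.exp(-y2 * (1 - y1))                           # x2 = e^{-y2(1-y1)}
    total += math.exp((x1 + x2) ** 2)                       # g(x1, x2)
theta_hat = total / N
```

Because dx_1 dx_2 = y_2 e^{−y_2} dy_1 dy_2 under this change of variables, (x_1, x_2) is exactly uniform on the unit square, so theta_hat is an unbiased estimate of θ ≈ 4.91.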

12 Compound RVs: assume iid RVs X_i and an integer RV N; let S_n(X) = Σ_{i=1}^n α_i X_i, and estimate P{S_N(X) > c}. E.g., X_i is insurance claim i and N is the number of claims by time T. In many cases the distribution of N is known, e.g. Poisson(λ), with p_i = e^{−λ} λ^i / i!.

Raw simulation: for K runs, generate N and then N X_i's; count the proportion of runs with S_N(X) > c.

Stratified simulation: given the pmf for N, let g_n(x) = I(S_n(x) > c). Then

θ = Σ_{n=0}^m E[g_n(X)] p_n + E[g_N(X) | N > m] (1 − Σ_{n=0}^m p_n).

Simulation run:
a) generate m* = (N | N > m) from the conditional pmf;
b) generate X_1, ..., X_{m*} and set

Θ = Σ_{n=0}^m g_n(X) p_n + g_{m*}(X) (1 − Σ_{n=0}^m p_n).

Then E[Θ] = P{S_N(X) > c}, so averaging Θ over many runs estimates the tail probability. E.g., Poisson(λ) has conditional pmf

p_{m+i} = e^{−λ} λ^{m+i} / (α (m+i)!), i = 1, 2, ...,

with α = 1 − Σ_{n=0}^m p_n.
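A Python sketch of this stratified scheme (the choices α_i = 1, X_i ~ Exp(1), N ~ Poisson(2), c = 3, m = 4 are illustrative, not from the lecture; for these parameters the tail probability works out analytically to ≈ 0.247):

```python
import math
import random

def poisson_sample(lam):
    # Knuth's product-of-uniforms method for Poisson(lam).
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        p *= random.random()
        k += 1
    return k - 1

def stratified_tail(lam, c, m, runs, sample_x):
    """Estimate P{S_N > c}: strata n <= m carry exact weights p_n,
    and the tail is handled by sampling (N | N > m) via rejection."""
    pn = [math.exp(-lam) * lam ** n / math.factorial(n) for n in range(m + 1)]
    tail_w = 1.0 - sum(pn)            # alpha = P{N > m}
    total = 0.0
    for _ in range(runs):
        n_star = poisson_sample(lam)
        while n_star <= m:            # rejection: keep only N > m
            n_star = poisson_sample(lam)
        partials, s = [0.0], 0.0      # partial sums S_0, S_1, ..., S_{n*}
        for _ in range(n_star):
            s += sample_x()
            partials.append(s)
        run = sum(pn[n] * (partials[n] > c) for n in range(m + 1))
        run += tail_w * (partials[n_star] > c)
        total += run
    return total / runs

random.seed(5)
est = stratified_tail(2.0, 3.0, 4, 20000, lambda: random.expovariate(1.0))
```

Each run reuses one stream of X's for every g_n, exactly as in steps a)-b) above; this does not bias the estimator, since each term has the correct expectation.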