Reduction of Variance. Importance Sampling

Reduction of Variance

As we discussed earlier, the statistical error goes as: error = sqrt(variance / computer time). DEFINE: efficiency $\xi \equiv 1/(vT)$, where $v$ is the squared error (variance) of the mean and $T$ is the total CPU time.

How can you make a simulation more efficient?
- Write a faster code,
- get a faster computer,
- work on reducing the variance,
- or all three.

We will talk about the third option: importance sampling and correlated sampling.

Importance Sampling

Given the integral $I = \int dx\, f(x)$, how should we sample $x$ to maximize the efficiency?

Transform the integral:

$$I = \int dx\, p(x)\,\frac{f(x)}{p(x)} = \left\langle \frac{f(x)}{p(x)} \right\rangle_p$$

The estimator variance is:

$$v = \int dx\, p(x)\left[\frac{f(x)}{p(x)}\right]^2 - I^2 = \left\langle \left[\frac{f(x)}{p(x)}\right]^2 \right\rangle_p - I^2$$

Optimal sampling: solve $\delta v/\delta p(x) = 0$ subject to $p(x) \ge 0$ and $\int dx\, p(x) = 1$.

The mean value of the estimator $I$ is independent of $p(x)$, but the variance $v$ is not! Assume the CPU time per sample is independent of $p(x)$, and vary $p(x)$ to minimize $v$.
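As a concrete illustration (a minimal sketch, not from the original slides), here is the transformed estimator in Python: draw samples from $p$, average $f(x)/p(x)$, and report the error of the mean. The particular $f$ and $p$ in the usage example are hypothetical; note that $p$ is chosen with a heavier tail than $f$ so the variance stays finite.

```python
import numpy as np

rng = np.random.default_rng(0)

def importance_sample(f, p, sample_p, n):
    """Estimate I = integral of f(x) dx by drawing x ~ p(x)
    and averaging the per-sample estimator f(x)/p(x)."""
    x = sample_p(n)
    w = f(x) / p(x)
    mean = w.mean()                     # unbiased for any valid p(x)
    err = w.std(ddof=1) / np.sqrt(n)    # error of the mean = sqrt(v/n)
    return mean, err

# Hypothetical example: I = integral of exp(-x) over (0, inf) = 1,
# sampled with the heavier-tailed p(x) = 0.5*exp(-x/2).
f = lambda x: np.exp(-x)
p = lambda x: 0.5 * np.exp(-x / 2.0)
draw = lambda n: rng.exponential(scale=2.0, size=n)
print(importance_sample(f, p, draw, 100_000))   # ~ (1.0, ~0.002)
```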

Finding the Optimal p*(x) for Sampling

Trick: parameterize $p(x)$ as a positive-definite PDF,

$$p(x) = \frac{q(x)^2}{\int dx\, q(x)^2}$$

Solution:

$$p^*(x) = \frac{|f(x)|}{\int dx\, |f(x)|}$$

Estimator:

$$\frac{f(x)}{p^*(x)} = \mathrm{sign}(f(x)) \int dx\, |f(x)|$$

1. If f(x) is entirely positive or negative, the estimator is constant: the "zero variance principle."
2. We can't generally sample p*(x), because if we could, we would have solved the problem analytically! But the form of p*(x) is a guide to lowering the variance.
3. Importance sampling is a general technique: it works in any dimension.

Example of Importance Sampling

Suppose f(x) is given by

$$f(x) = \frac{e^{-x^2}}{1+x^2}$$

Optimize within the family of Gaussians

$$p(x) = \frac{e^{-x^2/a}}{(\pi a)^{1/2}}$$

The estimated value is independent of $a$, but the CPU time (efficiency) is not.
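A sketch of this example in Python, assuming the reconstructed forms of $f(x)$ and $p(x)$ above: drawing $x \sim N(0, a/2)$ reproduces $p(x) = e^{-x^2/a}/(\pi a)^{1/2}$, and the estimate should agree for all admissible $a$ while the error bar changes.

```python
import numpy as np

rng = np.random.default_rng(1)

f = lambda x: np.exp(-x**2) / (1.0 + x**2)

def estimate(a, n=200_000):
    """Importance-sample I = integral of f(x) dx with the Gaussian
    p(x) = exp(-x^2/a)/sqrt(pi*a), i.e. x ~ N(0, a/2)."""
    x = rng.normal(0.0, np.sqrt(a / 2.0), size=n)
    p = np.exp(-x**2 / a) / np.sqrt(np.pi * a)
    w = f(x) / p
    return w.mean(), w.std(ddof=1) / np.sqrt(n)

for a in (0.6, 1.0, 2.0):
    mean, err = estimate(a)
    print(f"a = {a:3.1f}:  I = {mean:.5f} +/- {err:.5f}")
```

The means should agree within error bars; only the error, and hence the efficiency, depends on $a$.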

With this choice the variance is

$$v = \int dx\, p(x)\left[\frac{f(x)}{p(x)}\right]^2 - I^2 = (\pi a)^{1/2}\int dx\, \frac{e^{-x^2(2-1/a)}}{(1+x^2)^2} - I^2$$

[Figure: the importance sampling functions $p(x)$ and the variance integrand for several values of $a$.]

What are the allowed values of $a$?
- Clearly, for $p(x)$ to exist: $0 < a$.
- For a bounded estimator $f(x)/p(x)$: $1 \le a$.
- For finite variance: $0.5 < a$.
- Obvious value: $a = 1$. Optimal value: $a = 0.6$.
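The variance formula can be checked numerically (a sketch assuming the reconstruction above, using scipy quadrature): $v(a)$ should show a shallow minimum near the quoted $a \approx 0.6$ and grow quickly toward the small-$a$ end, diverging for $a < 0.5$.

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: np.exp(-x**2) / (1.0 + x**2)
I, _ = quad(f, -np.inf, np.inf)

def variance(a):
    """v(a) = sqrt(pi*a) * integral of exp(-x^2(2-1/a))/(1+x^2)^2 dx - I^2;
    the integral converges only for a >= 0.5."""
    g = lambda x: np.exp(-x**2 * (2.0 - 1.0 / a)) / (1.0 + x**2)**2
    val, _ = quad(g, -np.inf, np.inf)
    return np.sqrt(np.pi * a) * val - I**2

for a in (0.55, 0.6, 0.65, 0.8, 1.0, 1.5):
    print(f"a = {a:4.2f}:  v = {variance(a):.4f}")
```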

What Does Infinite Variance Look Like?

- Spikes in the trace of the running estimate.
- Long tails on the distributions of the estimator.

[Figure: running estimates for a near-optimal p(x) and for one with infinite variance.] Why (visually)? Occasional samples land where f(x)/p(x) is enormous.

General Approach to Importance Sampling

- The basic idea of importance sampling is to sample more in the regions where the function is large.
- Find a convenient approximation to f(x).
- Do not under-sample; that can cause infinite variance. Over-sampling gives a loss in efficiency, but not infinite variance.
- Always derive the conditions for finite variance analytically.
- To debug: test that the estimated value is independent of the importance sampling.
- Sign problem: zero variance is not possible for an oscillatory integral. Monte Carlo can add, but not subtract.
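The spikes are easy to reproduce in the running example (a sketch under the same assumed $f$ and $p$): for $a = 0.4 < 0.5$ the variance is infinite, and rare huge values of $f(x)/p(x)$ repeatedly kick the running mean; for $a = 0.6$ it settles smoothly.

```python
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: np.exp(-x**2) / (1.0 + x**2)

def running_mean(a, n=100_000):
    """Cumulative mean of f(x)/p(x) with x ~ N(0, a/2)."""
    x = rng.normal(0.0, np.sqrt(a / 2.0), size=n)
    w = f(x) / (np.exp(-x**2 / a) / np.sqrt(np.pi * a))
    return np.cumsum(w) / np.arange(1, n + 1)

for a in (0.4, 0.6):   # a = 0.4: infinite variance; a = 0.6: near optimal
    trace = running_mean(a)
    print(f"a = {a}: final mean {trace[-1]:.4f}, "
          f"largest single-step jump {np.abs(np.diff(trace)).max():.4f}")
```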

Correlated Sampling

Suppose we want to compute a function $G(F_1, F_2)$ where the $F$'s are integrals:

$$F_k = \int dx\, f_k(x)$$

Suppose we use the same $p(x)$ and the same random numbers to do both integrals. What is the optimal $p(x)$?

$$p^*(x) \propto \left| f_1(x)\frac{\partial G}{\partial F_1} + f_2(x)\frac{\partial G}{\partial F_2} \right|$$

It is a weighted average of the distributions for $F_1$ and $F_2$.

Sampling the Boltzmann Distribution

Suppose we want to calculate a whole set of averages:

$$\langle O_k \rangle = \frac{\int dR\, O_k(R)\, e^{-V(R)/kT}}{\int dR\, e^{-V(R)/kT}}$$

The optimal sampling for a single average is

$$p_k^*(R) \propto e^{-V(R)/kT}\, \left| O_k(R) - \langle O_k \rangle \right|$$

where the first (Boltzmann) factor is constant across observables and the second factor is variable, depending on $k$. We need to sample the first factor because we want lots of properties. Avoid undersampling: the Boltzmann distribution is very highly peaked.
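A sketch of the correlated-sampling idea for a hypothetical $G(F_1, F_2) = F_1 - F_2$ with two similar integrals (my choice of functions, not from the slides): reusing one set of draws for both integrals lets their fluctuations cancel in the difference.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical pair of similar integrals:
#   F1 = integral of exp(-x^2) dx,  F2 = integral of exp(-1.1 x^2) dx.
f1 = lambda x: np.exp(-x**2)
f2 = lambda x: np.exp(-1.1 * x**2)
p = lambda x: np.exp(-x**2 / 1.5) / np.sqrt(1.5 * np.pi)  # shared N(0, 0.75)

def spread_of_G(correlated, n=10_000, trials=400):
    """Std. dev. of the estimate of G = F1 - F2 over independent trials."""
    est = []
    for _ in range(trials):
        x1 = rng.normal(0.0, np.sqrt(0.75), size=n)
        x2 = x1 if correlated else rng.normal(0.0, np.sqrt(0.75), size=n)
        est.append(np.mean(f1(x1) / p(x1)) - np.mean(f2(x2) / p(x2)))
    return np.std(est)

print("independent random numbers:", spread_of_G(False))
print("same random numbers:      ", spread_of_G(True))
```

The second spread is much smaller: the per-sample estimators $f_1/p$ and $f_2/p$ are strongly positively correlated, so their errors cancel in $F_1 - F_2$.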

Independent Sampling for exp(−V/kT)?

Try hit-or-miss MC to get $Z(\beta) = \int dR\, e^{-\beta V(R)}$: sample $R$ uniformly in $(0,L)^{3N}$, i.e. $P(R) = L^{-3N}$, normalized so $\int dR\, P(R) = 1$.

What is the variance of the free energy, and how does it depend on the number of particles?

$$\mathrm{var}(Z) = \int dR\, P(R)\, e^{-2\beta V(R)} - Z(\beta)^2 = Z(2\beta) - Z(\beta)^2$$

Writing $Z(\beta) = e^{-\beta F(\beta)}$ for the free energy $F$:

$$\mathrm{var}(F) \propto \frac{\mathrm{var}(Z)}{Z(\beta)^2} = \frac{Z(2\beta)}{Z(\beta)^2} - 1 = e^{2\beta[F(\beta)-F(2\beta)]} - 1, \qquad \mathrm{error}(F) \sim e^{\beta[F(\beta)-F(2\beta)]}$$

This blows up exponentially fast at large $N$, since $F$ is extensive! The number of sample points must grow exponentially in $N$, just like a grid-based method.

Intuitive explanation

- Throw $N$ points in a box of area $A$; say the probability of no overlap is $q$.
- Throw $mN$ points in a box of area $mA$; the probability of no overlap is $q^m$.
- So the probability of getting a success is $p = e^{m \ln q}$, where a "success" is defined as a reasonable sample of a configuration.

This is a general argument. We need to sample only near the peak of the distribution; to do so, use random walks.
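A toy numerical check (my construction, not from the slides): for $N$ independent coordinates with $V(R) = \sum_i x_i^2$, the uniform-sampling estimate of $Z$ factorizes, so its relative error grows exponentially with $N$.

```python
import numpy as np

rng = np.random.default_rng(4)

def relative_error(N, beta=1.0, L=4.0, M=200_000):
    """Relative error of the uniform-sampling estimate of
    Z = <exp(-beta*V)> with V(R) = sum_i x_i^2, R uniform in (0,L)^N."""
    x = rng.uniform(0.0, L, size=(M, N))
    w = np.exp(-beta * (x**2).sum(axis=1))
    return w.std(ddof=1) / (w.mean() * np.sqrt(M))

for N in (1, 2, 4, 8, 16):
    print(f"N = {N:2d}:  relative error of Z ~ {relative_error(N):.2g}")
```

For large $N$ even this error estimate becomes unreliable, because a handful of samples near the minimum of $V$ dominate the average; that is exactly the point of the slide.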