Today: Fundamentals of Monte Carlo

Today: Fundamentals of Monte Carlo

What is Monte Carlo? Named at Los Alamos in the 1940s after the casino. Any method which uses (pseudo)random numbers as an essential part of the algorithm. Stochastic, not deterministic! A method for doing high-dimensional integrals by sampling the integrand. Often a Markov chain, called Metropolis MC.

Reading: LeSar, Chapter 7

Simple example: Buffon's needle / Monte Carlo determination of π

Consider a circle inscribed in a square, and take one quadrant. By geometry:

  area of 1/4 circle / area of square = (π r^2/4) / r^2 = π/4

Simple MC is like throwing darts at a unit-square target: using a RNG in (0,1), pick a pair of coordinates (x, y). Count the number of points (darts) in the shaded quarter circle versus the total. Points in shaded circle / points in square ≈ π/4.

  hits = 0
  DO n = 1, N {
    x = (random #)
    y = (random #)
    distance2 = x^2 + y^2
    if (distance2 <= 1) hits = hits + 1
  }
  pi = 4 * hits / N
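A minimal runnable version of the dart-throwing estimator above, written as a Python/NumPy sketch (the sample count M and the function name are illustrative choices, not from the slides):

    import numpy as np

    def estimate_pi(M, seed=0):
        rng = np.random.default_rng(seed)
        x = rng.random(M)                              # M darts, uniform in the unit square
        y = rng.random(M)
        hits = np.count_nonzero(x**2 + y**2 <= 1.0)    # darts landing inside the quarter circle
        return 4.0 * hits / M

    print(estimate_pi(10**6))                          # ~3.14; the error shrinks like 1/sqrt(M)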

MC is advantageous for high-dimensional integrals - the best general method

Consider an integral over the unit D-dimensional hypercube:

  I = ∫ dx_1 ... dx_D f(x_1, ..., x_D)

By conventional deterministic methods: lay out a grid with L points in each direction, with spacing h = 1/L. The number of points is N = L^D = h^{-D}, proportional to CPU time.

How does the error scale with CPU time or dimensionality? The error in the trapezoidal rule goes as ε ~ f''(x) h^2, since

  f(x) = f(x_0) + f'(x_0) h + (1/2) f''(x_0) h^2 + ...

so direct CPU time ~ ε^{-D/2} (ε ~ h^2 and CPU ~ h^{-D}). But by sampling we find CPU time ~ ε^{-2} (ε ~ M^{-1/2} and CPU ~ M): to get another decimal place takes 100 times longer. So MC is advantageous for D > 4!
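A small numerical check of the ε ~ M^{-1/2} scaling, sketched in Python with an arbitrary smooth test integrand on a D = 6 hypercube (both the integrand and the dimension are illustrative choices, not from the slides):

    import numpy as np

    rng = np.random.default_rng(1)
    D = 6                                       # dimension of the unit hypercube
    f = lambda x: np.prod(np.cos(x), axis=1)    # test integrand with a known integral
    exact = np.sin(1.0) ** D                    # ∫_0^1 cos(x) dx = sin(1), taken to the D-th power

    for M in (10**2, 10**4, 10**6):
        x = rng.random((M, D))                  # M uniform sample points in [0,1)^D
        est = f(x).mean()                       # Monte Carlo estimate of the integral
        print(M, abs(est - exact))              # error shrinks roughly as M**-0.5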

Improved Numerical Integration

Integration methods in D dimensions:
  Trapezoidal rule: ε ~ f^(2)(x) h^2
  Simpson's rule:   ε ~ f^(4)(x) h^4
  Generally:        ε ~ f^(α)(x) h^α

CPU time scales with the number of sample points (grid size): CPU time ~ h^{-D} (e.g., like L in 1-D and L^2 in 2-D). Time to do the integral: T_int ~ ε^{-D/α}.

By conventional deterministic methods: lay out a grid with L points in each direction, with h = 1/L. The number of points N = L^D = h^{-D} ~ CPU time.

Stochastic integration (Monte Carlo): ε ~ M^{-1/2} (inverse square root of the number of sampled points), so CPU time T ~ M ~ ε^{-2}.

=> In the limit of small ε, MC wins if D > 2α!
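To see the crossover numerically, here is a Python sketch comparing the trapezoidal rule and plain MC at equal numbers of integrand evaluations in D = 6 (which exceeds 2α = 4 for the trapezoidal rule); the separable test integrand exp(x_1 + ... + x_D) is an illustrative choice that lets the full tensor-product grid sum be evaluated as a product of 1-D sums:

    import numpy as np

    D = 6                                       # MC should win here, since D > 2*alpha = 4
    exact = (np.e - 1.0) ** D                   # exact value of ∫_[0,1]^D exp(x_1+...+x_D) d^D x
    rng = np.random.default_rng(9)

    for L in (3, 5, 9):
        M = L ** D                              # grid points = MC samples, i.e., equal CPU cost
        g = np.linspace(0.0, 1.0, L)
        trap = np.trapz(np.exp(g), g) ** D      # tensor-product trapezoid sum (separable integrand)
        mc = np.exp(rng.random((M, D)).sum(axis=1)).mean()   # plain MC with M samples
        print(M, abs(trap - exact), abs(mc - exact))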

[Figure: log(CPU time) versus log(error) for grid integration in 2D, 4D, 6D, and 8D compared with MC; in high dimensions MC reaches a given error at lower cost.]

Other reasons to do Monte Carlo: it is conceptually and practically simple, and it comes with built-in error bars.

"Many methods of integration have been tried, and will be tried in this world of sin and woe. No one pretends that Monte Carlo is perfect or all-wise. Indeed, it has been said that Monte Carlo is the worst method except all those other methods that have been tried from time to time." (after Churchill, 1947)

Probability Distributions

P(x)dx = probability of observing a value in (x, x+dx); P(x) is a probability distribution function (p.d.f.). x can be either a continuous or a discrete variable. Normalization and positivity:

  ∫ dx P(x) = 1,   P(x) ≥ 0

Cumulative distribution: the probability that x < y. Useful for sampling:

  c(y) = ∫_{-∞}^{y} dx P(x),   0 ≤ c(y) ≤ 1

Average or expectation:

  <g(x)> = ∫ dx P(x) g(x)

Moments:  I_n = <x^n> = ∫ dx P(x) x^n
  Zeroth moment: I_0 = 1
  Mean: <x> = I_1
  Variance: <(x - <x>)^2> = I_2 - (I_1)^2
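The remark that the cumulative distribution is useful for sampling refers to inverse-transform sampling: if u is uniform on (0,1), then x = c^{-1}(u) is distributed according to P(x). A Python sketch for an exponential p.d.f. (the choice P(x) = λ e^{-λx} with λ = 2 is illustrative):

    import numpy as np

    rng = np.random.default_rng(2)
    lam = 2.0                          # rate of the exponential p.d.f. P(x) = lam * exp(-lam*x)
    u = rng.random(10**6)              # uniform deviates in (0,1)
    x = -np.log(1.0 - u) / lam         # invert the CDF c(x) = 1 - exp(-lam*x)

    print(x.mean(), 1.0 / lam)         # sample mean vs. the exact mean 1/lam
    print(x.var(), 1.0 / lam**2)       # sample variance vs. the exact variance 1/lam^2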

Mappings of random variables

Let p_x(x)dx be a probability distribution and let y = g(x) be a new variable, e.g., y = g(x) = -ln(x) with 0 < x ≤ 1, so y ≥ 0. What is the p.d.f. of y? With p_y(y)dy = p_x(x)dx,

  p_y(y) = p_x(x) |dx/dy| = p_x(x) |dg(x)/dx|^{-1}

(What happens when g is not monotonic?)

Example: y = g(x) = -ln(x), with p_x uniform on (0,1]:

  p_y(y)dy = |dg(x)/dx|^{-1} dy = x dy = e^{-y} dy

*Distributed exponentially, like the waiting time between Poisson events; the scaled variable t = y/λ has p.d.f. λ e^{-λt}.
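A quick numerical check of this mapping, sketched in Python (the sample size, bin count, and range are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(3)
    x = 1.0 - rng.random(10**6)        # uniform on (0,1]
    y = -np.log(x)                     # mapped variable y = -ln(x)

    # Compare a normalized histogram of y with the predicted p.d.f. e^{-y}
    counts, edges = np.histogram(y, bins=40, range=(0.0, 4.0))
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]
    est_pdf = counts / (len(y) * width)               # empirical density
    print(np.max(np.abs(est_pdf - np.exp(-centers)))) # small (a few times 1e-3)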

Example: Drawing from a Normal (Gaussian) distribution

  p(y_1, y_2, ...) dy_1 dy_2 ... = p(x_1, x_2, ...) |∂(x_1, x_2, ...)/∂(y_1, y_2, ...)| dy_1 dy_2 ...

Box-Muller method: get random deviates with a normal distribution,

  p(y)dy = (1/√(2π)) e^{-y^2/2} dy

Consider uniform deviates in (0,1), x_1 and x_2, and two quantities y_1 and y_2, where

  y_1 = √(-2 ln x_1) cos(2π x_2)
  y_2 = √(-2 ln x_1) sin(2π x_2)

or, equivalently,

  x_1 = exp(-[y_1^2 + y_2^2]/2)
  x_2 = (1/2π) arctan(y_2/y_1)

The Jacobian factorizes,

  |∂(x_1, x_2)/∂(y_1, y_2)| = [(1/√(2π)) e^{-y_1^2/2}] [(1/√(2π)) e^{-y_2^2/2}]

so each y is independently normally distributed.

Better: pick (v_1, v_2) uniformly in the unit disk, with R^2 = v_1^2 + v_2^2, so that x_1 = R^2 and the angle of (v_1, v_2) plays the role of 2π x_2. Advantage: no sine or cosine, since cos(2π x_2) = v_1/√(R^2) and sin(2π x_2) = v_2/√(R^2), and we get two normal deviates per calculation.
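A runnable Python sketch of this polar variant (the helper name gauss_pair and the rejection-loop structure are implementation choices):

    import numpy as np

    def gauss_pair(rng):
        # Polar form of Box-Muller: two normal deviates per accepted point,
        # with no sine or cosine evaluations.
        while True:
            v1, v2 = 2.0 * rng.random(2) - 1.0     # uniform in (-1, 1)
            r2 = v1 * v1 + v2 * v2
            if 0.0 < r2 <= 1.0:                    # keep only points inside the unit disk
                break
        fac = np.sqrt(-2.0 * np.log(r2) / r2)
        return v1 * fac, v2 * fac                  # two independent N(0,1) deviates

    rng = np.random.default_rng(4)
    samples = np.array([gauss_pair(rng) for _ in range(50000)]).ravel()
    print(samples.mean(), samples.var())           # close to 0 and 1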

Reminder: Gauss Central Limit Theorem

Sample N values from p(x)dx, i.e., (X_1, X_2, X_3, ..., X_N). Estimate the mean from y = (1/N) Σ_i x_i. What is the p.d.f. of the mean? Solve by Fourier transforms. If you add together two independent random variables, you multiply together their characteristic functions:

  c_x(k) = <e^{ikx}> = ∫ dx P(x) e^{ikx}   so   c_{x+y}(k) = c_x(k) c_y(k)

Then

  c_{x_1+...+x_N}(k) = [c_x(k)]^N   and   c_y(k) = [c_x(k/N)]^N

Taylor expand:

  ln[c_y(k)] = Σ_{n=1}^∞ (ik)^n/n! κ_n,   where the κ_n are the cumulants.

Cumulants κ_n:

  Mean:     κ_1 = <x>
  Variance: κ_2 = <(x - <x>)^2> = s^2
  Skewness: κ_3/s^3 = <((x - <x>)/s)^3>
  Kurtosis: κ_4/s^4 = <((x - <x>)/s)^4> - 3

What happens to the cumulants of the mean y?

  κ'_n = κ_n N^{1-n}

The n = 1 cumulant remains invariant; the rest get reduced by higher powers of N.

  lim_{N→∞} c_y(k) = exp[ikκ_1 - k^2 κ_2/(2N) - ik^3 κ_3/(6N^2) - ...]

  P(y) = (N/(2πκ_2))^{1/2} exp[-N(y - κ_1)^2/(2κ_2)]

Given enough averaging, almost anything becomes a Gaussian distribution.
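A numerical illustration of this N^{1-n} suppression, sketched in Python with an exponential parent distribution (the parent distribution, the number of trials, and the values of N are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(5)
    for N in (1, 4, 16, 64):
        # 100000 independent sample means, each an average of N exponential deviates
        y = rng.exponential(1.0, size=(100000, N)).mean(axis=1)
        s = y.std()
        skew = np.mean(((y - y.mean()) / s) ** 3)
        print(N, skew)      # the skewness of the mean falls like 1/sqrt(N) (here ~ 2/sqrt(N))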

Approach to normality (Gauss Central Limit Theorem)

For any population distribution, the distribution of the mean will approach a Gaussian.

Conditions on the Central Limit Theorem

We need the first three moments to exist:
  If I_0 is not defined → not a p.d.f.
  If I_1 does not exist → not mathematically well-posed.
  If I_2 does not exist → infinite variance.

It is important to know whether the variance is finite for Monte Carlo. Divergence could come from the tails of the distribution: for

  I_n = <x^n> = ∫ dx P(x) x^n,   in particular   I_2 = <x^2> = ∫ dx P(x) x^2,

to exist we need

  lim_{x→±∞} x^3 P(x) → 0

Divergence could also come from singular behavior of P at finite x, e.g., we need

  lim_{x→0} x P(x) → 0
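As a concrete illustration of why a finite variance matters, a Python sketch comparing sample means from a Gaussian and from a Lorentzian (Cauchy) distribution, whose second moment I_2 diverges (the sample sizes are arbitrary):

    import numpy as np

    rng = np.random.default_rng(6)
    for M in (10**3, 10**5, 10**7):
        g = rng.standard_normal(M)    # finite variance: the mean converges like M**-0.5
        c = rng.standard_cauchy(M)    # infinite variance (undefined mean): no convergence
        print(M, g.mean(), c.mean())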

Multidimensional Generalization

Suppose r is an m-dimensional vector drawn from a multidimensional p.d.f. p(r) d^m r. The mean is defined as before. The variance becomes the covariance, a positive symmetric m × m matrix:

  ν_{ij} = <(x_i - <x_i>)(x_j - <x_j>)>

For sufficiently large N, the estimated mean y will approach the distribution

  P(y) d^m y = [N/(2π)]^{m/2} det(ν)^{-1/2} exp[-(N/2)(y_i - <y_i>)(ν^{-1})_{ij}(y_j - <y_j>)] d^m y

with summation over the repeated indices i, j.
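A quick Monte Carlo check that the covariance of the estimated mean is ν/N, sketched in Python for a 2-d correlated Gaussian (the particular ν, N, and number of trials are illustrative):

    import numpy as np

    rng = np.random.default_rng(7)
    nu = np.array([[1.0, 0.6],
                   [0.6, 2.0]])        # population covariance matrix
    N = 50                             # number of samples averaged into each mean
    means = rng.multivariate_normal([0.0, 0.0], nu, size=(20000, N)).mean(axis=1)

    print(np.cov(means, rowvar=False)) # should be close to nu / N
    print(nu / N)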

2-d histogram of occurrences of means

[Figure: 2-d histogram of sample means in the (x_1, x_2) plane.]

Off-diagonal components of ν_{ij} are called the covariance. Data can be uncorrelated, positively correlated, or negatively correlated, depending on the sign of ν_{ij}. Like a moment-of-inertia tensor, there are 2 principal axes with associated variances; find the axes with diagonalization or singular value decomposition. Individual error bars on x_1 and x_2 can be misleading if they are correlated.
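A short Python sketch of finding those principal axes by diagonalizing an estimated covariance matrix (the synthetic correlated data here stand in for the histogrammed means and are an illustrative choice):

    import numpy as np

    rng = np.random.default_rng(8)
    nu = np.array([[1.0, 0.8],
                   [0.8, 1.5]])                   # correlated covariance (illustrative)
    data = rng.multivariate_normal([0.0, 0.0], nu, size=10000)

    cov = np.cov(data, rowvar=False)              # estimated covariance matrix
    variances, axes = np.linalg.eigh(cov)         # principal variances and axes
    print(variances)                              # variances along the principal axes
    print(axes)                                   # each column is a principal direction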