Monte Carlo Basics

Aiichiro Nakano
Collaboratory for Advanced Computing & Simulations
Department of Computer Science
Department of Physics & Astronomy
Department of Chemical Engineering & Materials Science
Department of Biological Sciences
University of Southern California
Email: anakano@usc.edu

Goals: MC simulation in practice (understand & use); minimal math = probability; Markov-chain (Metropolis) MC
Monte Carlo Method

Monte Carlo (MC) method: a computational method that utilizes random numbers.
Major applications of the MC method:
1. Multidimensional integrations (e.g., statistical mechanics in physics)
2. Simulation of stochastic natural phenomena (e.g., stock price)

$\langle A(\{\vec r_i\})\rangle = \frac{\int d\vec r_1 \cdots d\vec r_N \exp\left(-V(\{\vec r_i\})/k_B T\right) A(\{\vec r_i\})}{\int d\vec r_1 \cdots d\vec r_N \exp\left(-V(\{\vec r_i\})/k_B T\right)}$
Monte Carlo Integration

Numerical integration (grid-based, with M mesh points $x_n$ and spacing $\Delta x$):
$\int_a^b dx\, f(x) \approx \Delta x \sum_{n=1}^{M} f(x_n)$

Monte Carlo integration ($r_n \in [a,b]$: random numbers):
$\int_a^b dx\, f(x) \approx \frac{b-a}{M} \sum_{n=1}^{M} f(r_n)$, error $\sim 1/M^{1/2}$

Multidimensional integration:
$\int_0^1 dx_1 \cdots \int_0^1 dx_N\, f(x_1, \ldots, x_N)$
A grid with C points per dimension needs $O(C^N)$ function evaluations (exponential in N), whereas the MC error stays $\sim 1/M^{1/2}$ independent of the dimension.
Stochastic Model of Stock Prices

Fluctuation in stock price; computational stock-portfolio trading.
Basis of the Black-Scholes analysis of option prices (1997 Nobel Prize in Economics to Myron Scholes):
$dS = \mu S\, dt + \sigma S \varepsilon \sqrt{dt}$

Andrey Omeltchenko (Quantlab)
Goals: Monte Carlo Basics

1. Sample-mean MC:
$\int dx\, f(x)\, p(x) \approx \bar f \pm \sqrt{\left(\overline{f^2} - \bar f^2\right)/(M-1)}$, where $\bar f = \frac{1}{M}\sum_{n=1}^{M} f(r_n)$

2. Visualize probability density by a set of points: nonuniform random-number generation by coordinate transformation:
$P'(\zeta_1,\ldots,\zeta_N) = P(r_1,\ldots,r_N) \,\big/\, \left|\partial(\zeta_1,\ldots,\zeta_N)/\partial(r_1,\ldots,r_N)\right|$

3. Metropolis (Markov-chain) MC algorithm for importance sampling:
random attempt → conditional acceptance with probability $\exp(-\delta V/k_B T)$
Monte Carlo Integration: Hit-&-Miss

Visualize the probability density function. Program hit.c: calculate π.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
  double x, y, pi;
  int hit = 0, try, ntry;
  printf("Input the number of MC trials\n");
  scanf("%d", &ntry);
  srand((unsigned)time(NULL));
  for (try = 0; try < ntry; try++) {
    x = rand()/(double)RAND_MAX;
    y = rand()/(double)RAND_MAX;
    if (x*x + y*y < 1.0) ++hit;  /* point falls inside the quarter circle */
  }
  pi = 4.0*hit/ntry;
  printf("MC estimate for PI = %f\n", pi);
  return 0;
}

hpc-login3.usc.edu> more /usr/include/stdlib.h
...
#define RAND_MAX 2147483647   /* = 2^31 - 1 */

rand() returns integers in [1, RAND_MAX-1]; out of N_try samples, roughly N_try P(x,y) dx dy points fall in the area element dx dy.
Single-Electron Double-Slit Experiment

http://rdg.ext.hitachi.co.jp/rd/moviee/doubleslite-n.mpeg
Akira Tonomura (Hitachi, Ltd.)
Monte Carlo Integration: Sample Mean

Bottom line: MC simulation as integration.
$\int_a^b dx\, f(x) \approx (b-a)\,\frac{1}{M}\sum_{n=1}^{M} f(r_n) = (b-a) \times \left(\text{average of } f(r_n)\right)$

Example:
$\int_0^1 \frac{4\, dx}{1+x^2} \overset{x=\tan\theta}{=} \int_0^{\pi/4} \frac{4}{1+\tan^2\theta}\,\frac{d\theta}{\cos^2\theta} = \int_0^{\pi/4} 4\, d\theta = \pi$
Sample Mean MC Program

Program mean.c: calculate π.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
  double x, pi, sum = 0.0;
  int try, ntry;
  printf("Input the number of MC trials\n");
  scanf("%d", &ntry);
  srand((unsigned)time(NULL));
  for (try = 0; try < ntry; try++) {
    x = rand()/(double)RAND_MAX;  /* uniform random number in [0,1] */
    sum += 4.0/(1.0 + x*x);       /* accumulate f(x) = 4/(1+x^2) */
  }
  pi = sum/ntry;
  printf("MC estimate for PI = %f\n", pi);
  return 0;
}
Multidimensional Integration

Statistical mechanics: $k_B = 1.38 \times 10^{-16}$ erg/K is the Boltzmann constant; $S = k_B \log W$.
Need Monte Carlo!
Probability: Foundation of MC

Minimal probability theory, just enough to understand MC simulation.
Random variable: a variable that takes different values at different measurements.
Probability distribution function: p(x) = (# of samples with result x)/(total # of samples)
Normalization: $\sum_k p(x_k) = 1$
Expectation (weighted sum): $E[x] = \langle x\rangle = \sum_k x_k\, p(x_k)$
Variance (spread): $\mathrm{Var}[x] = \left\langle (x - \langle x\rangle)^2 \right\rangle = \langle x^2\rangle - 2\langle x\rangle\langle x\rangle + \langle x\rangle^2 = \langle x^2\rangle - \langle x\rangle^2$
Standard deviation: $\mathrm{Std}[x] = \sqrt{\mathrm{Var}[x]}$
Continuous Random Variables

Probability density function (note the unit):
p(x)dx = (# of samples in the range [x, x+dx])/(total # of samples)
Normalization: $\int dx\, p(x) = 1$
Expectation: $E[x] = \langle x\rangle = \int dx\, x\, p(x)$
Histogram limit: with bin width $\Delta x$, $\sum_k \Delta x\, p(x_k) \to 1$ as $\Delta x \to 0$; out of $N_{try}$ samples, about $N_{try}\, p(x)dx$ points fall in [x, x+dx]. Visualize!
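The histogram limit above is easy to check numerically. Below is a minimal sketch (not part of the original lecture; the bin count and sample size are illustrative) that bins uniform samples and prints the estimated density, which should approach p(x) = 1 on [0,1]:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NBIN 10         /* number of histogram bins on [0,1] */
#define NSAMPLE 1000000 /* number of random samples */

int main() {
  int hist[NBIN] = {0};
  double dx = 1.0/NBIN;
  srand((unsigned)time(NULL));
  for (int n = 0; n < NSAMPLE; n++) {
    double x = rand()/((double)RAND_MAX + 1.0);  /* uniform in [0,1) */
    hist[(int)(x/dx)]++;                         /* count samples per bin */
  }
  /* density estimate = (counts in bin)/(NSAMPLE*dx); ~1 for uniform samples */
  for (int k = 0; k < NBIN; k++)
    printf("x = %4.2f  p(x) = %f\n", (k + 0.5)*dx, hist[k]/(NSAMPLE*dx));
  return 0;
}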
MC Estimation

Distinguish the infinite (population) mean vs. the sample (~ poll) mean:
$\langle f(x)\rangle = \int dx\, f(x)\, p(x)$ vs. $\bar f = \frac{1}{M}\sum_{n=1}^{M} f(r_n)$ (the sample mean itself takes random values).
Unbiased estimator: a statistical quantity, computed from M samples, that converges to the underlying population value for large M.
The mean of sampled random variables, $\bar f$, converges to its expected value $\langle f(x)\rangle$: $\bar f$ is an unbiased estimator of $\langle f(x)\rangle$.
Central theorem of MC sampling: the error of an MC estimation decreases as $1/M^{1/2}$.
Bottom line:
$\underbrace{\int dx\, f(x)\, p(x)}_{\text{exact}} \approx \underbrace{\bar f \pm \sqrt{\left(\overline{f^2}-\bar f^2\right)/(M-1)}}_{\text{finite sampling}}$
Proof of the Central Theorem

Unbiased estimator of the mean:
$E\{\bar f\} = \frac{1}{M}\sum_{n=1}^{M} E\{f(r_n)\} = \frac{1}{M}\, M \langle f(x)\rangle = \langle f(x)\rangle$

Error estimate = variance of the sample mean:
$\mathrm{Var}\{\bar f\} = \mathrm{Var}\left\{\frac{1}{M}\sum_{n=1}^{M} f(r_n)\right\} = \frac{1}{M^2}\sum_{m=1}^{M}\sum_{n=1}^{M}\left\langle\left[f(r_m)-\langle f(x)\rangle\right]\left[f(r_n)-\langle f(x)\rangle\right]\right\rangle$

For uncorrelated random variables, $\langle AB\rangle = \langle A\rangle\langle B\rangle$, and thus
$\left\langle\left[f(r_m)-\langle f(x)\rangle\right]\left[f(r_n)-\langle f(x)\rangle\right]\right\rangle = \begin{cases}\mathrm{Var}\{f(x)\} & (m = n)\\ \left\langle f(r_m)-\langle f(x)\rangle\right\rangle\left\langle f(r_n)-\langle f(x)\rangle\right\rangle = 0 & (\text{else})\end{cases}$

Therefore
$\mathrm{Var}\{\bar f\} = \frac{1}{M^2}\, M\,\mathrm{Var}\{f(x)\} = \frac{\mathrm{Var}\{f(x)\}}{M}$, i.e., $\mathrm{Std}\{\bar f\} = \frac{\mathrm{Std}\{f(x)\}}{\sqrt{M}}$

The error decreases as $1/M^{1/2}$, but what is the prefactor Var{f(x)}?
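This 1/M scaling of the variance can be verified numerically. The sketch below (illustrative, not course code) repeats the sample mean of $f(x) = 4/(1+x^2)$ over many independent runs and compares the measured variance of $\bar f$ with $\mathrm{Var}\{f\}/M$; analytically, $\mathrm{Var}\{f\} = 4 + 2\pi - \pi^2 \approx 0.4136$ on [0,1]:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define M    1000   /* samples per run */
#define NRUN 10000  /* independent runs of the sample mean */

double f(double x) { return 4.0/(1.0 + x*x); }

int main() {
  double mean = 0.0, var = 0.0;
  srand((unsigned)time(NULL));
  for (int run = 0; run < NRUN; run++) {
    double fbar = 0.0;
    for (int n = 0; n < M; n++)
      fbar += f(rand()/(double)RAND_MAX);
    fbar /= M;           /* one realization of the sample mean */
    mean += fbar;
    var  += fbar*fbar;
  }
  mean /= NRUN;
  var = var/NRUN - mean*mean;  /* measured Var{fbar} across runs */
  printf("Var of sample mean = %g (theory ~ %g)\n", var, 0.4136/M);
  return 0;
}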
An Example of Correlated Random Variables

A spin variable s takes the value +1 (spin up) with probability 0.5 and -1 (spin down) with probability 0.5:
$\langle s\rangle = 1\cdot\frac{1}{2} + (-1)\cdot\frac{1}{2} = 0$

(Uncorrelated spins) Two spins on atoms far apart don't see each other:
$\langle s_1 s_2\rangle = \langle s_1\rangle\langle s_2\rangle = 0$

(Correlated spins) Ferromagnetic spins always align with each other (configurations ↑↑ with probability 0.5 and ↓↓ with probability 0.5):
$\langle s_1 s_2\rangle = 1^2\cdot\frac{1}{2} + (-1)^2\cdot\frac{1}{2} = 1 \neq \langle s_1\rangle\langle s_2\rangle$
Estimate of the Variance of Sample Mean

Unbiased estimator of the variance:
$s = \frac{M}{M-1}\left(\overline{f^2} - \bar f^2\right)$, where $\overline{f^2} = \frac{1}{M}\sum_{n=1}^{M} f^2(r_n)$

Taking the expectation, $E\{\overline{f^2}\} = \langle f^2(x)\rangle$, while
$E\{\bar f^2\} = \frac{1}{M^2}\sum_{m=1}^{M}\sum_{n=1}^{M}\langle f(r_m) f(r_n)\rangle$
For uncorrelated random variables, $\langle f(r_m) f(r_n)\rangle = \langle f(x)\rangle\langle f(x)\rangle$ for $n \neq m$, so
$E\{\bar f^2\} = \frac{1}{M}\langle f^2(x)\rangle + \frac{M-1}{M}\langle f(x)\rangle^2$
and hence
$E\{s\} = \frac{M}{M-1}\cdot\frac{M-1}{M}\left(\langle f^2(x)\rangle - \langle f(x)\rangle^2\right) = \mathrm{Var}\{f(x)\}$

Bottom line:
$\int dx\, f(x)\, p(x) \approx \bar f \pm \mathrm{Std}\{\bar f\} \approx \bar f \pm \sqrt{\frac{s}{M}} = \bar f \pm \sqrt{\frac{\overline{f^2}-\bar f^2}{M-1}}$
Running mean.c

MC estimation of π: unbiased estimators for the mean & its standard deviation. Use the estimators!
$\int_0^1 dx\, f(x) \approx \bar f \pm \sqrt{\left(\overline{f^2}-\bar f^2\right)/(M-1)}$
where $\bar f = \frac{1}{M}\sum_{n=1}^{M} f(r_n)$, $\overline{f^2} = \frac{1}{M}\sum_{n=1}^{M} f^2(r_n)$, $f(r) = \frac{4}{1+r^2}$, and $M = N_{try}$.
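A sketch of how mean.c might be extended to report this error bar (consistent with the estimators above, though not necessarily the official course code; compile with -lm):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>

int main() {
  double x, f, sum = 0.0, sum2 = 0.0, fbar, f2bar, std;
  int try, ntry;
  printf("Input the number of MC trials\n");
  scanf("%d", &ntry);
  srand((unsigned)time(NULL));
  for (try = 0; try < ntry; try++) {
    x = rand()/(double)RAND_MAX;
    f = 4.0/(1.0 + x*x);
    sum  += f;    /* accumulate f   for the sample mean       */
    sum2 += f*f;  /* accumulate f^2 for the variance estimate */
  }
  fbar  = sum/ntry;
  f2bar = sum2/ntry;
  std = sqrt((f2bar - fbar*fbar)/(ntry - 1));  /* unbiased error bar */
  printf("MC estimate for PI = %f +- %f\n", fbar, std);
  return 0;
}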
Error Bar & the Exact Value

Q: Should the exact value always fall within the error bar?
A: No; it does so only ~68% of the time in the case of the Gaussian distribution.
Standard Deviation of mean.c

MC estimation of π: unbiased estimators for the mean & its standard deviation,
$\int_0^1 dx\, f(x) \approx \bar f \pm \sqrt{\left(\overline{f^2}-\bar f^2\right)/(M-1)}$, with $f(x) = 4/(1+x^2)$.

Measured standard deviation over $N_{seed}$ independent runs (each giving an estimate $\pi_i$):
$\sigma = \sqrt{\frac{1}{N_{seed}}\sum_{i=1}^{N_{seed}} \pi_i^2 - \left(\frac{1}{N_{seed}}\sum_{i=1}^{N_{seed}} \pi_i\right)^2}$

Compare the unbiased estimates with the measured values of the standard deviation. Fitting $\sigma = C M^p$, i.e., $\log\sigma = \log C + p\log M$ (a straight line of slope p on a log-log plot), recovers the predicted $p = -1/2$. See the lecture on least-squares fit.
Random Number Generation

Uniform random-number generator: a routine to generate a sequence of random numbers within a specified range (typically [0,1]).
Linear congruential generator: generates a sequence $X_i$ of positive integers in the range [1, M-1] by the recursion
$X_{i+1} = a X_i \bmod M$
and produces a random number in the range [0,1] as $r_i = X_i/M$.

(Example) M = 7
Good multiplier, a = 3 ($X_{i+1} = 3X_i \bmod 7$):
i  : 1 2 3 4 5 6 7
X_i: 1 3 2 6 4 5 1   → period M-1 = 6
Bad multiplier, a = 2 ($X_{i+1} = 2X_i \bmod 7$):
i  : 1 2 3 4
X_i: 1 2 4 1   → period 3 < M-1
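A minimal sketch of this toy generator (function and variable names are illustrative), printing the sequences for both multipliers:

#include <stdio.h>

/* Print the linear congruential sequence X_{i+1} = a*X_i mod m, starting from X_1 = 1 */
void lcg_sequence(int a, int m, int nstep) {
  int x = 1;
  for (int i = 1; i <= nstep; i++) {
    printf("%d ", x);
    x = (a*x) % m;
  }
  printf("\n");
}

int main() {
  lcg_sequence(3, 7, 8);  /* good multiplier: full period m-1 = 6 */
  lcg_sequence(2, 7, 8);  /* bad multiplier: period 3 */
  return 0;
}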
Linear Congruential Generator

Most common generator, cycle ~ 2 billion: $(M, a) = (2^{31}-1 = 2147483647,\ 7^5 = 16807)$.
Overflow-avoidance algorithm (using 32-bit integer arithmetic): let $M = aq + r$, i.e., $q = [M/a]$, $r = M \bmod a$; then
$aZ \bmod M = \begin{cases} a(Z \bmod q) - r[Z/q] & \text{if } a(Z \bmod q) - r[Z/q] \geq 0 \\ a(Z \bmod q) - r[Z/q] + M & \text{otherwise} \end{cases}$

#define IA 16807
#define IM 2147483647
#define AM (1.0/IM)
#define IQ 127773
#define IR 2836

float ran0(int *idum) {
  long k;
  float ans;
  k = (*idum)/IQ;
  *idum = IA*(*idum - k*IQ) - IR*k;  /* a*idum mod IM without overflow */
  if (*idum < 0) *idum += IM;
  ans = AM*(*idum);
  return ans;
}
Gaussian (Normal) Distribution

$\rho(\zeta) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{\zeta^2}{2}\right)$, with $\int_{-\infty}^{\infty} d\zeta\,\rho(\zeta) = 1$ and $\mathrm{Var}\{\zeta\} = \int d\zeta\,\zeta^2\rho(\zeta) = 1$

Let's use coordinate transformation to generate nonuniform random numbers.
Greenland Phenomenon

The spherical coordinate system (latitude θ & longitude φ) is curved: equal coordinate areas do not correspond to equal surface areas, so a rectangular latitude-longitude map distorts sizes (Greenland appears inflated).
Transformation Method

Uniformly generate a population in these distorted maps to generate nonuniform distributions.
(Figures: cartograms of GDP & CO2 emission.)
Box-Muller Transformation

Uniform $r_1, r_2 \in [0,1]$ → non-uniform (Gaussian):
$\zeta_1 = \sqrt{-2\ln r_1}\,\cos(2\pi r_2)$
$\zeta_2 = \sqrt{-2\ln r_1}\,\sin(2\pi r_2)$
$P(r_1, r_2) = 1 \;\longrightarrow\; P'(\zeta_{1/2}) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{\zeta_{1/2}^2}{2}\right)$
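A minimal C sketch of the Box-Muller transformation (illustrative names; uses the library rand() for brevity; compile with -lm):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>

#define PI 3.14159265358979323846

/* Generate a pair of independent standard-normal deviates from two uniform deviates */
void box_muller(double *z1, double *z2) {
  double r1 = (rand() + 1.0)/((double)RAND_MAX + 1.0);  /* in (0,1]: avoids log(0) */
  double r2 = rand()/(double)RAND_MAX;
  *z1 = sqrt(-2.0*log(r1))*cos(2.0*PI*r2);
  *z2 = sqrt(-2.0*log(r1))*sin(2.0*PI*r2);
}

int main() {
  double z1, z2;
  srand((unsigned)time(NULL));
  for (int n = 0; n < 5; n++) {
    box_muller(&z1, &z2);
    printf("%f %f\n", z1, z2);  /* histogram these to visualize the Gaussian */
  }
  return 0;
}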
Nonuniform Random Numbers

Uniform random numbers: $P(r_1, r_2) = 1$. Coordinate transformation method: map $(r_1, r_2) \to (\zeta_1, \zeta_2)$.
The number of samples is conserved under the mapping (N = total sample size):
$N\,\underbrace{P(r_1,r_2)}_{=1\ (\text{density})}\,\underbrace{dr_1 dr_2}_{\text{area}} = N\, P'(\zeta_1,\zeta_2)\, S$
where S is the area of the image of the rectangle $dr_1 dr_2$.
Areal transformation: $S = \left|\frac{\partial(\zeta_1,\zeta_2)}{\partial(r_1,r_2)}\right| dr_1 dr_2$, hence
$P'(\zeta_1,\zeta_2) = \left|\frac{\partial(\zeta_1,\zeta_2)}{\partial(r_1,r_2)}\right|^{-1}\underbrace{P(r_1,r_2)}_{=1} = \left|\frac{\partial(r_1,r_2)}{\partial(\zeta_1,\zeta_2)}\right|$
Jacobian = areal scaling.
Areal Transformation

$S' = \left|\begin{matrix}\partial\xi/\partial x & \partial\xi/\partial y \\ \partial\zeta/\partial x & \partial\zeta/\partial y\end{matrix}\right| dx\, dy = \left|\frac{\partial(\xi,\zeta)}{\partial(x,y)}\right| dx\, dy$

Parallelogram: the area spanned by vectors $\vec k = (a, b)$ and $\vec l = (c, d)$ is
$S = |\vec k||\vec l|\sin\theta = \sqrt{a^2+b^2}\sqrt{c^2+d^2}\sin\theta = |ad - bc|$
Multidimensional Integration

Statistical mechanics: $k_B = 1.38 \times 10^{-16}$ erg/K (Boltzmann constant); $S = k_B \log W$.
Need importance sampling! (Example: harmonic oscillators; a typical Boltzmann weight is $0.2^{100} = 1.3\times10^{-70}$ for 100 oscillators, so uniformly sampled points almost never contribute to the integral.)
Importance Sampling

(Problem) How to generate 3N-dimensional sample points according to the Boltzmann probability distribution?
$\langle A\rangle = \frac{\int d\vec r^N\, A(\vec r^N)\exp\left(-V(\vec r^N)/k_BT\right)}{\int d\vec r^N\, \exp\left(-V(\vec r^N)/k_BT\right)} \approx \frac{\sum_n A(\vec r_n^N)\exp\left(-V(\vec r_n^N)/k_BT\right)/p(\vec r_n^N)}{\sum_n \exp\left(-V(\vec r_n^N)/k_BT\right)/p(\vec r_n^N)}$

Importance sampling: chooses the sequence of sample points $\vec r_n^N$ from a probability density function p(x); if $p \propto \exp(-V/k_BT)$, the weights cancel and the average reduces to a simple mean of A.

Markov chain: a sequence of trials that satisfies
(a) the outcome of each trial belongs to a finite set of outcomes, {Γ_1, Γ_2, ..., Γ_N}, called the state space;
(b) the outcome of each trial depends only on the outcome of the trial that immediately precedes it.

Transition probability matrix: its element $\pi_{mn}$ is the conditional probability that the next state is Γ_m given that the current state is Γ_n, i.e., the probability of the state transition n → m:
$p_m(\tau) = \sum_{n=1}^{N} \pi_{mn}\, p_n(\tau - 1)$
Markov Chain

Normalization (given that the system is in state Γ_n, the next state must be one of the N possible states):
$\sum_{m=1}^{N} \pi_{mn} = 1$

Matrix-vector notation:
$\Pi = \begin{pmatrix}\pi_{11} & \pi_{12} & \cdots & \pi_{1N}\\ \pi_{21} & \pi_{22} & & \\ \vdots & & \ddots & \\ \pi_{N1} & & & \pi_{NN}\end{pmatrix}$, $\vec\rho(\tau) = \begin{pmatrix}p_1(\tau)\\ p_2(\tau)\\ \vdots\\ p_N(\tau)\end{pmatrix}$, $\vec\rho(\tau) = \Pi\vec\rho(\tau-1) = \Pi^2\vec\rho(\tau-2) = \cdots = \Pi^\tau\vec\rho(0)$

(Perron-Frobenius theorem) Eigenvalues $\Pi\vec\rho_\nu = \varepsilon_\nu\vec\rho_\nu$ with $\varepsilon_1 = 1$ and $|\varepsilon_2|, \ldots, |\varepsilon_N| < 1$:
1. Fixed point: $\Pi\vec\rho_1 = \vec\rho_1$
2. Filter: $\Pi^\tau \sum_{\nu=1}^{N} c_\nu\vec\rho_\nu = \sum_{\nu=1}^{N} c_\nu\varepsilon_\nu^\tau\vec\rho_\nu \xrightarrow{\tau\to\infty} c_1\vec\rho_1$

Andrey Markov (1856-1922); Oskar Perron (1880-1975); Ferdinand G. Frobenius (1849-1917)
Example: Two-Level System

$\Pi = \begin{pmatrix}0.6 & 0.3\\ 0.4 & 0.7\end{pmatrix} = \begin{pmatrix}a & 1-b\\ 1-a & b\end{pmatrix}$, i.e., $a = 0.6$, $b = 0.7$

$\begin{pmatrix}p_1(\tau)\\ p_2(\tau)\end{pmatrix} = \begin{pmatrix}0.6 & 0.3\\ 0.4 & 0.7\end{pmatrix}^{\tau}\begin{pmatrix}1\\ 0\end{pmatrix} \xrightarrow{\tau\to\infty} \begin{pmatrix}0.4286\\ 0.5714\end{pmatrix}$
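This convergence is easy to verify by direct iteration; a minimal sketch (not course code):

#include <stdio.h>

int main() {
  /* Two-level transition matrix Pi = {{0.6, 0.3}, {0.4, 0.7}} */
  double p1 = 1.0, p2 = 0.0;  /* initial distribution rho(0) = (1, 0) */
  for (int tau = 1; tau <= 20; tau++) {
    double q1 = 0.6*p1 + 0.3*p2;  /* p_m(tau) = sum_n pi_mn p_n(tau-1) */
    double q2 = 0.4*p1 + 0.7*p2;
    p1 = q1; p2 = q2;
    printf("tau = %2d  p = (%f, %f)\n", tau, p1, p2);
  }
  /* converges to the fixed point (3/7, 4/7) = (0.4286, 0.5714) */
  return 0;
}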
Eigenvalue Problem

Eigensystem: $\Pi\vec w = \lambda\vec w$, $\vec w = \begin{pmatrix}u\\ v\end{pmatrix}$:
$\begin{pmatrix}a & 1-b\\ 1-a & b\end{pmatrix}\begin{pmatrix}u\\ v\end{pmatrix} = \lambda\begin{pmatrix}u\\ v\end{pmatrix}$ or $\begin{pmatrix}\lambda - a & b - 1\\ a - 1 & \lambda - b\end{pmatrix}\begin{pmatrix}u\\ v\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix}$

For a nontrivial solution (other than u = v = 0), $(\Pi - \lambda I)^{-1}$ should not exist:
$|\Pi - \lambda I| = (\lambda - a)(\lambda - b) - (a-1)(b-1) = (\lambda - a - b + 1)(\lambda - 1) = 0$

Eigenvalues: $\lambda_+ = 1$ and $-1 < \lambda_- = a + b - 1 < 1$ for a positive matrix (a, b > 0). Perron-Frobenius!
Eigenvectors:
$\vec w_+ = \begin{pmatrix}u_+\\ v_+\end{pmatrix} = \frac{1}{2-a-b}\begin{pmatrix}1-b\\ 1-a\end{pmatrix}$ and $\vec w_- = \begin{pmatrix}u_-\\ v_-\end{pmatrix} = \begin{pmatrix}1\\ -1\end{pmatrix}$
Telescopic Technique

$\Pi\begin{pmatrix}u_+\\ v_+\end{pmatrix} = \lambda_+\begin{pmatrix}u_+\\ v_+\end{pmatrix}$, $\Pi\begin{pmatrix}u_-\\ v_-\end{pmatrix} = \lambda_-\begin{pmatrix}u_-\\ v_-\end{pmatrix}$, i.e., $\Pi U = U\Lambda$ or $\Pi = U\Lambda U^{-1}$, where $U = \begin{pmatrix}u_+ & u_-\\ v_+ & v_-\end{pmatrix}$, $\Lambda = \begin{pmatrix}\lambda_+ & 0\\ 0 & \lambda_-\end{pmatrix}$

$\begin{pmatrix}p_1(n)\\ p_2(n)\end{pmatrix} = \Pi^n\begin{pmatrix}p_1\\ p_2\end{pmatrix} = \underbrace{U\Lambda U^{-1}\, U\Lambda U^{-1}\cdots U\Lambda U^{-1}}_{n}\begin{pmatrix}p_1\\ p_2\end{pmatrix} = U\Lambda^n U^{-1}\begin{pmatrix}p_1\\ p_2\end{pmatrix}$

$\Lambda^n = \begin{pmatrix}\lambda_+^n & 0\\ 0 & \lambda_-^n\end{pmatrix} \xrightarrow{n\to\infty} \begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}$, hence $\begin{pmatrix}p_1(n)\\ p_2(n)\end{pmatrix} \to \begin{pmatrix}u_+\\ v_+\end{pmatrix} = \begin{pmatrix}\frac{1-b}{2-a-b}\\ \frac{1-a}{2-a-b}\end{pmatrix} = \begin{pmatrix}0.3/0.7 = 0.4286\\ 0.4/0.7 = 0.5714\end{pmatrix}$

Perron-Frobenius theorem!
Detailed Balance Condition

Perron-Frobenius theorem: repeated application of Π converges to its eigenvector ρ with eigenvalue 1,
$\vec\rho(\tau) = \Pi^\tau\vec\rho(\tau{=}0) \xrightarrow{\tau\to\infty} \vec\rho$, where $\Pi\vec\rho = \vec\rho$, i.e., $\sum_{n=1}^{N}\pi_{mn}\rho_n = \rho_m$

(Problem = importance sampling) Find a transition probability matrix Π whose first eigenvector equals a given ρ, e.g.,
$\rho_n = \exp\left(-V_n/k_BT\right)\Big/\sum_{n'}\exp\left(-V_{n'}/k_BT\right)$

(Solution = detailed balance condition) Let Π satisfy, for any pair of states,
$\pi_{mn}\rho_n = \pi_{nm}\rho_m$
Then $\sum_{n=1}^{N}\pi_{mn}\rho_n = \sum_{n=1}^{N}\pi_{nm}\rho_m = \rho_m$: steady population.
(Illustration: populations $\rho_{CA} = 34$ and $\rho_{NV} = 2$ with balanced fluxes $\pi_{NV\leftarrow CA}\,\rho_{CA} = \pi_{CA\leftarrow NV}\,\rho_{NV}$.)
Metropolis Algorithm

(Problem) Find a transition probability matrix Π that satisfies the detailed balance condition for a given ρ: $\pi_{mn}\rho_n = \pi_{nm}\rho_m$
(Solution) Let $\alpha_{mn} = \alpha_{nm}$ be a symmetric attempt matrix; then take
$\pi_{mn} = \begin{cases}\alpha_{mn} & \rho_m \geq \rho_n,\ m\neq n\\ (\rho_m/\rho_n)\,\alpha_{mn} & \rho_m < \rho_n,\ m\neq n\\ 1 - \sum_{m'\neq n}\pi_{m'n} & m = n\end{cases}$

(Example) Statistical mechanics: no need for the absolute weight!
$\frac{P\left(\text{next}\{\vec r_i\}\right)}{P\left(\text{current}\{\vec r_i\}\right)} = \frac{\exp\left(-V\left(\text{next}\{\vec r_i\}\right)/k_BT\right)}{\exp\left(-V\left(\text{current}\{\vec r_i\}\right)/k_BT\right)} = \exp\left(-\delta V\left(\text{next},\text{current}\right)/k_BT\right)$
Metropolis Algorithm

Let X be a state variable (e.g., positions of atoms ∈ R^{3N}) and calculate the statistical average of a function A(X) with probability density P(X) ∝ exp(-V(X)/k_BT).

Sum_A = 0
for n = 1 to Ntrial
  X' ← X + dX               /* random state-transition attempt */
  compute dV = V(X') - V(X)
  if dV < 0 then            /* greedy: always accept a downhill move */
    accept X ← X'
  else if rand() < exp(-dV/k_BT) then  /* occasional uphill move */
    accept X ← X'
  /* else the state remains X */
  endif
  Sum_A = Sum_A + A(X)      /* count X more than once if it remains */
endfor
Avg_A = Sum_A/Ntrial

Note: rand() here is a uniform random number in the range [0,1]; the acceptance probability for an uphill move, exp(-dV/k_BT), lies in [0,1].
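As a concrete illustration of this pseudocode (a sketch under assumed parameters, not the course's code; compile with -lm), the program below samples a 1-D harmonic potential V(x) = x^2/2 at temperature k_BT = 1 and estimates ⟨x^2⟩, whose exact value is 1:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>

#define NTRIAL 1000000
#define DELTA  1.0  /* maximum displacement per attempt */

double V(double x) { return 0.5*x*x; }  /* harmonic potential; units with k_B*T = 1 */

int main() {
  double x = 0.0, sum_x2 = 0.0;
  srand((unsigned)time(NULL));
  for (int n = 0; n < NTRIAL; n++) {
    /* random state-transition attempt (symmetric: alpha_mn = alpha_nm) */
    double xtrial = x + DELTA*(2.0*rand()/(double)RAND_MAX - 1.0);
    double dV = V(xtrial) - V(x);
    /* accept downhill always; uphill with probability exp(-dV/k_B T) */
    if (dV < 0.0 || rand()/(double)RAND_MAX < exp(-dV))
      x = xtrial;
    sum_x2 += x*x;  /* count the current state, whether or not the move was accepted */
  }
  printf("<x^2> = %f (exact: 1)\n", sum_x2/NTRIAL);
  return 0;
}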
Monte Carlo Simulation

Random trial → acceptance by a cost criterion
Summary: Monte Carlo Basics

1. Sample-mean MC:
$\int dx\, f(x)\, p(x) \approx \bar f \pm \sqrt{\left(\overline{f^2} - \bar f^2\right)/(M-1)}$

2. Visualize probability density by a set of points: nonuniform random-number generation by coordinate transformation:
$P'(\zeta_1,\ldots,\zeta_N) = P(r_1,\ldots,r_N) \,\big/\, \left|\partial(\zeta_1,\ldots,\zeta_N)/\partial(r_1,\ldots,r_N)\right|$ (areal magnification factor)

3. Metropolis (Markov-chain) MC algorithm for importance sampling:
random attempt → conditional acceptance with probability $\exp(-\delta V/k_B T)$