Numerical methods for lattice field theory


Numerical methods for lattice field theory
Mike Peardon
Trinity College Dublin
August 9, 2007

Numerical methods - references

- Good introduction to the concepts: Simulation, Sheldon M. Ross, Academic Press. ISBN 0-12-598053-1
- Detail on the theory of Markov chains: Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues, Pierre Brémaud, Springer. ISBN 0-387-98509-3
- A classic: The Art of Computer Programming, Volume 2, Donald E. Knuth, Addison-Wesley. ISBN 0-201-48541-9
- And another: Numerical Recipes: The Art of Scientific Computing (3rd edition), Press, Teukolsky, Vetterling and Flannery, CUP. ISBN 0-521-88068-8
- Applications to field theory: The Monte Carlo method in quantum field theory, Colin Morningstar, arXiv:hep-lat/0702020

An opening comment...

Monte Carlo is the worst method for computing non-perturbative properties of quantum field theories - except for all the other methods that have been tried from time to time.

Churchill said: "Democracy is the worst form of government, except for all those other forms that have been tried from time to time" (in a speech to the House of Commons in 1947, after losing the 1945 general election).

Probability (1)

Outcomes, events and sample spaces

Assume we have a precise description of the outcome, $\omega$, of the observation of a random process. The set of all possible outcomes is called the sample space, $\Omega$. Any subset $A \subseteq \Omega$ can be regarded as representing an event occurring. The notation of set theory is helpful.

A particularly useful collection of events, $A_1, A_2, \dots, A_N$, are those that are mutually incompatible and exhaustive, which implies
$$A_i \cap A_j = \emptyset \ \text{when}\ i \neq j \qquad\text{and}\qquad \bigcup_{k=1}^N A_k = \Omega$$

Probability (2)

A random variable is a function $X : \Omega \to \mathbb{R}$ such that for all $a \in \mathbb{R}$, the event $\{\omega : X(\omega) \leq a\}$ can be assigned a probability. A discrete random variable is a random variable for which $X : \Omega \to E$, where $E$ is a denumerable set.

Axioms of probability

A probability (measure) on $(\Omega, \mathcal{F})$ is a mapping $P : \mathcal{F} \to \mathbb{R}$ such that
1. $0 \leq P(A) \leq 1$
2. $P(\Omega) = 1$
3. $P\bigl(\bigcup_{k=1}^N A_k\bigr) = \sum_{k=1}^N P(A_k)$ if the $A_k$ are mutually incompatible.

($\mathcal{F}$ is the collection of events that are assigned a probability; for technical reasons it is a collection of subsets of $\Omega$ - a $\sigma$-algebra - and not the same thing as $\Omega$ itself.)

Probability (3)

Two events are called independent if $P(A \cap B) = P(A)P(B)$. Similarly, two random variables $X$ and $Y$ are called independent if for all $a, b \in \mathbb{R}$,
$$P(X \leq a, Y \leq b) = P(X \leq a)\,P(Y \leq b)$$

The conditional probability of event $A$ given $B$ is
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

The cumulative distribution function $F_X$ describes the properties of a random variable $X$ and is
$$F_X(x) = P(X \leq x)$$

The probability density function $f_X$ is defined when $F_X$ can be written
$$F_X(x) = \int_{-\infty}^x f_X(z)\,dz$$

Statistics (1)

Expectation

For an absolutely continuous c.d.f. and a function $g$, the expectation $E[g(X)]$ is defined as
$$E[g(X)] = \int_{-\infty}^{\infty} g(z)\, f_X(z)\,dz$$

Mean and variance

The mean $\mu_X$ of $X$ is
$$\mu_X = E[X] = \int_{-\infty}^{\infty} z\, f_X(z)\,dz$$
and the variance $\sigma_X^2$ is
$$\sigma_X^2 = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2$$

Statistics (2)

The strong law of large numbers (Kolmogorov)

If $\bar{X}(n)$ is the sample mean of $n$ independent, identically distributed random variables $\{X_1, X_2, \dots, X_n\}$ with well-defined expected value $\mu_X$ and variance $\sigma_X^2$, then $\bar{X}(n)$ converges almost surely to $\mu_X$:
$$P\Bigl(\lim_{n\to\infty} \bar{X}(n) = \mu_X\Bigr) = 1$$

Bernoulli: "...so simple that even the stupidest man instinctively knows it is true."

The central limit theorem (de Moivre, Laplace, Lyapunov, ...)

$$\lim_{n\to\infty} P\left(\frac{\bar{X}(n) - \mu_X}{\sigma_X/\sqrt{n}} \leq z\right) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^z e^{-y^2/2}\,dy$$

As more statistics are gathered, the sample mean becomes normally distributed.

The central limit theorem

[Figure: probability densities of the sample mean $\bar{X}(n)$ for $n = 2$, $5$ and $50$, narrowing around $\mu_X$ as $n$ grows.]
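The narrowing shown in the figure is easy to check numerically. A minimal sketch, assuming uniform $(0,1)$ variates (for which $\mu = 1/2$ and $\sigma^2 = 1/12$): the spread of the sample means should shrink like $\sigma/\sqrt{n}$.

```python
import random
import statistics

def sample_mean(n):
    """Mean of n i.i.d. uniform(0,1) variates: mu = 1/2, sigma^2 = 1/12."""
    return sum(random.random() for _ in range(n)) / n

random.seed(1)
# Draw many sample means at two different n and compare their spreads:
# the CLT predicts the spread shrinks like sigma/sqrt(n).
small = [sample_mean(10) for _ in range(2000)]
large = [sample_mean(1000) for _ in range(2000)]
spread_small = statistics.stdev(small)   # ~ sqrt(1/12)/sqrt(10)   ~ 0.091
spread_large = statistics.stdev(large)   # ~ sqrt(1/12)/sqrt(1000) ~ 0.0091
```

Increasing $n$ by a factor of 100 shrinks the spread of the sample means by a factor of about 10, as the theorem predicts.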

Generating (pseudo-)random numbers on a computer

A deterministic machine like a computer is incapable of generating truly random numbers. Randomness is faked; algorithms can be devised that generate sequences of numbers that appear random. To appear random implies statistical identities hold for large sets of these numbers.

Testing randomness is done both by looking at statistical correlations (for example, checking that using $d$ adjacent entries in a sequence as the co-ordinates of a $d$-dimensional point does not result in points that preferentially lie on certain planes) and by using a simulation to compute something known analytically. The best-known battery of such tests is Marsaglia's Diehard suite.

The elements of sequences that have passed all the appropriate statistical tests are called pseudo-random. The original example is the linear congruential sequence. For details see e.g. Knuth.
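The "points on planes" failure can be seen algebraically in a historical example (not from these slides): IBM's RANDU generator, $X_{n+1} = 65539\,X_n \bmod 2^{31}$. Because $65539^2 \equiv 6 \cdot 65539 - 9 \pmod{2^{31}}$, every triple of successive outputs satisfies an exact linear relation, so all 3-dimensional points fall on a small family of planes.

```python
def randu(seed):
    """IBM's RANDU: X_{n+1} = 65539 * X_n mod 2^31 (a notoriously bad LCG)."""
    x = seed
    while True:
        x = (65539 * x) % 2**31
        yield x

m = 2**31
g = randu(1)
xs = [next(g) for _ in range(1000)]

# Since 65539^2 = 6*65539 - 9 (mod 2^31), successive triples obey
# x[n+2] - 6*x[n+1] + 9*x[n] = 0 (mod 2^31), so the 3D points all lie
# on a handful of parallel planes.
flat = all((xs[i + 2] - 6 * xs[i + 1] + 9 * xs[i]) % m == 0
           for i in range(len(xs) - 2))
print(flat)   # True
```

A generator passing the spectral test would show no such low-order linear relation among adjacent outputs.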

Generating (pseudo-)random numbers on a computer (2)

Suppose we generate a sequence by applying some deterministic algorithm $A$:
$$X_0 \xrightarrow{A} X_1 \xrightarrow{A} X_2 \xrightarrow{A} \dots$$

If any entry in the sequence can take one of $m$ values, then the sequence must repeat itself in at most $m$ iterations. The number of iterations before a sequence repeats itself is called the cycle, which may depend on the initial state $X_0$. This repetition of the sequence over and over again is definitely not random!

A useful generator must thus have a large cycle - at least as large as the number of random numbers needed in the computer program, preferably much larger. The maximum possible cycle is $m$, independent of the starting point $X_0$.

Generating (pseudo-)random numbers on a computer (3)

The best-known algorithm is:

The linear congruential sequence

The sequence $X_0, X_1, X_2, \dots$ with $0 \leq X_i < m$ is generated by the recursion
$$X_{n+1} = (a X_n + c) \bmod m$$
with
- $X_0$ the seed,
- $a$ the multiplier, $0 \leq a < m$,
- $c$ the increment, $0 \leq c < m$,
- $m$ the modulus, $m > 0$.

The period is maximised by ensuring $c$ is relatively prime to $m$, $b = a - 1$ is a multiple of every prime $p$ dividing $m$, and $b$ is a multiple of 4 if $m$ is a multiple of 4. See Knuth for details.

An example linear congruential sequence

To demonstrate this in action, let's pick parameters according to the prescription above: $m = 90 = 2 \cdot 3^2 \cdot 5$, then $c = 13$ (relatively prime to 90), $b = 2 \cdot 3 \cdot 5 = 30$ so $a = 31$, and seed with $X_0 = 0$.

The sequence is below. No number appears more than once until the full period of 90 is exhausted.

0, 13, 56, 39, 52, 5, 78, 1, 44, 27, 40, 83, 66, 79, 32, 15, 28, 71, 54, 67, 20, 3, 16, 59, 42, 55, 8, 81, 4, 47, 30, 43, 86, 69, 82, 35, 18, 31, 74, 57, 70, 23, 6, 19, 62, 45, 58, 11, 84, 7, 50, 33, 46, 89, 72, 85, 38, 21, 34, 77, 60, 73, 26, 9, 22, 65, 48, 61, 14, 87, 10, 53, 36, 49, 2, 75, 88, 41, 24, 37, 80, 63, 76, 29, 12, 25, 68, 51, 64, 17
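The recursion and the example above take only a few lines to reproduce; a minimal sketch with the slide's parameters:

```python
from itertools import islice

def lcg(m, a, c, seed):
    """Linear congruential generator: X_{n+1} = (a*X_n + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

# The slide's parameters: m = 90, a = 31, c = 13, seed X_0 = 0.
# They satisfy the full-period conditions, so all 90 residues appear
# exactly once before the sequence repeats.
seq = list(islice(lcg(90, 31, 13, 0), 90))
print(seq[:4])         # [13, 56, 39, 52], matching the slide
print(len(set(seq)))   # 90
```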

Generating (pseudo-)random numbers on a computer (4)

Many more algorithms exist now, and are used in real lattice QCD codes:

Lüscher's ranlux
This generator, based on Marsaglia and Zaman's RCARRY algorithm, has a very long period (about $10^{171}$) when set up properly. Its quality is established by considering the dynamics of chaotic systems. (M. Lüscher, A portable high-quality random number generator for lattice field theory simulations, Comput. Phys. Commun. 79 (1994) 100-110, hep-lat/9309020)

Matsumoto and Nishimura's Mersenne Twister
The period length is a Mersenne prime; the most commonly used variant has a period of $2^{19937} - 1 \approx 4 \times 10^{6001}$. (M. Matsumoto and T. Nishimura, Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator, ACM Trans. on Modeling and Computer Simulation, Vol. 8, No. 1, January 1998, 3-30)

Generating more general distributions

We will soon find it useful to generate random numbers with more general distributions. Consider taking a uniform variate $U$ and applying some function $v$ (defined on $[0,1]$; let us also assume it is increasing on that interval). We get a new random variable, $V = v(U)$. What is the probability distribution of $V$?

A simple analysis, shown in the figure (right), shows that equating the probabilities $f_V(v)\,dv = f_U(u)\,du$ gives
$$f_V(v) = \frac{du}{dv} = \left(\frac{dv}{du}\right)^{-1}$$

[Figure: the map $u \mapsto v(u)$ taking $u \in [0,1]$ to $v \in [v_{\min}, v_{\max}]$.]

The transformation method

This observation suggests an algorithm, provided the function $v$ can be found: $v$ is the solution to
$$u = \int_{-\infty}^{v(u)} f_X(z)\,dz$$
Unfortunately, there are few p.d.f.s for which this method is efficient!

Perhaps the most important example for QCD is the Box-Muller algorithm, which generates pairs of normally distributed random numbers.

The Box-Muller algorithm
1. Generate two independent uniform variates $U_1, U_2 \in (0, 1]$.
2. Set $\Theta = 2\pi U_1$ and $R = \sqrt{-2 \ln U_2}$.
3. Return $R \cos\Theta$ or $R \sin\Theta$ (or both; they are independent).
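The three steps above translate directly into code. A minimal sketch, checking that the output has the mean and variance of a standard normal:

```python
import math
import random

def box_muller(u1, u2):
    """Map two independent uniforms in (0,1] to two independent N(0,1) variates."""
    theta = 2.0 * math.pi * u1
    r = math.sqrt(-2.0 * math.log(u2))
    return r * math.cos(theta), r * math.sin(theta)

random.seed(2)
zs = []
for _ in range(50_000):
    # 1 - random.random() lies in (0, 1], keeping log() safe at the endpoint.
    z1, z2 = box_muller(1 - random.random(), 1 - random.random())
    zs.extend([z1, z2])

mean = sum(zs) / len(zs)
var = sum(z * z for z in zs) / len(zs)
# mean ~ 0 and var ~ 1, as expected for standard normal variates
```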

The rejection method

Consider the following experiment, for regions $Q \subseteq F$:
1. Generate a random point $P$ with co-ordinates $(X, Y)$ uniformly inside $F$.
2. If $P$ is also inside $Q$, return $X$.
3. Otherwise, go back to step 1.

[Figure: region $Q$, the area under the curve $q$, enclosed in region $F$ in the $(x, y)$ plane.]

What is the p.d.f. of $X$? Answer: $q(x) / \int q(z)\,dz$, where $q$ bounds the region $Q$. (There is no need to know the normalisation of $q$.)

The rejection method
1. Generate $X$ from p.d.f. $f$.
2. Generate $Y = c\,f(X)\,U$, where $U$ is a uniform variate and $c$ is a constant such that $c\,f(x) \geq q(x)$ for all $x$.
3. If $Y < q(X)$, return $X$; otherwise, go to step 1.
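A minimal sketch of the accept/reject loop. The target $q(x) = 1 - x^2$ on $[-1, 1]$, the uniform proposal $f(x) = 1/2$ and the bound $c = 2$ (so that $c f(x) = 1 \geq q(x)$) are hypothetical choices for illustration, not from the slides.

```python
import random

def rejection_sample(q, f_sample, f_pdf, c):
    """One draw with density proportional to q, using proposal p.d.f. f
    and a constant c with c*f(x) >= q(x) everywhere."""
    while True:
        x = f_sample()
        y = c * f_pdf(x) * random.random()
        if y < q(x):
            return x

# Hypothetical unnormalised target q(x) = 1 - x^2 on [-1, 1]:
# the normalised density is (3/4)(1 - x^2), with mean 0 and variance 1/5.
random.seed(3)
q = lambda x: 1.0 - x * x
xs = [rejection_sample(q, lambda: random.uniform(-1, 1), lambda x: 0.5, 2.0)
      for _ in range(20_000)]
mean = sum(xs) / len(xs)
var = sum(x * x for x in xs) / len(xs)
```

Note that only the ratio $q(X) / (c f(X))$ matters, which is why the normalisation of $q$ is never needed.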

Monte Carlo integration

Consider $X$, a random point in $V$ with probability density function $p_X(x)$. A new stochastic variable $F$ is defined by applying a function $f$ to $X$, so $F = f(X)$. The expected value of $F$ is then
$$E(F) = \int_V f(x)\, p_X(x)\,dx$$

If $N$ independent instances of $X$ are generated, and then $N$ instances of $F$ are derived, the mean of this ensemble is
$$\bar{F}_N = \frac{1}{N}\sum_{i=1}^N F_i$$

The basic result of Monte Carlo:
$$\lim_{N\to\infty} \bar{F}_N = E(F) = \int_V f(x)\, p_X(x)\,dx$$

Monte Carlo integration - an example

Estimate
$$I = \int_0^1 \frac{\sin 10\pi(1-x)}{\sqrt{x}}\,dx$$
using Monte Carlo. Generate an ensemble of uniform variate points $\{U_1, U_2, \dots\}$, then evaluate $F_i = f(U_i)$ and compute the mean of this ensemble.

[Figure: the Monte Carlo estimate $\bar{I}(n)$ against the number of samples $n$ from $10^2$ to $10^8$, settling down as $n$ grows.]
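A minimal sketch of this estimate, taking the integrand to be $f(x) = \sin 10\pi(1-x)/\sqrt{x}$ (one plausible reading of the slide's formula; the $1/\sqrt{x}$ spike is integrable but makes the estimator noisy):

```python
import math
import random

def mc_estimate(f, n):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]."""
    # 1 - random.random() lies in (0, 1], avoiding the singular endpoint x = 0.
    return sum(f(1 - random.random()) for _ in range(n)) / n

# Assumed integrand: an oscillatory function with an integrable spike at x = 0.
f = lambda x: math.sin(10 * math.pi * (1 - x)) / math.sqrt(x)

random.seed(4)
estimates = [mc_estimate(f, 10_000) for _ in range(10)]
spread = max(estimates) - min(estimates)
# Independent runs agree to a few parts in 10^2, tightening like 1/sqrt(n).
```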

Importance sampling

For almost all the physics problems we want to solve, we find just a tiny fraction of configuration space contributes to the integral; variance reduction will be crucial. The most useful variance reduction technique is importance sampling: instead of a flat sampling over the integration volume, the regions that dominate the integral are sampled preferentially.

For any $p(x) > 0$ in $V$, we have
$$I = \int_V f(z)\,Dz = \int_V \frac{f(z)}{p(z)}\, p(z)\,Dz$$

Suppose we generate an ensemble of points $\{X_1, X_2, \dots\}$ in $V$ with p.d.f. $p$ and compute the random variable
$$J_i = \frac{f(X_i)}{p(X_i)}$$

Importance sampling (2)

Then (as seen above) $E[J] = I$, so the sample mean converges to the answer independently of $p$, but... the variance is
$$\sigma_J^2 = E[J^2] - E[J]^2 = \int_V \frac{f(z)^2}{p(z)^2}\, p(z)\,Dz - I^2 = \int_V \frac{f(z)^2}{p(z)}\,Dz - I^2$$
and so this does depend on the choice of $p$. The optimal choice of $p$ would minimise $\sigma_J^2$; a functional variation calculation yields the optimal choice
$$p(x) = \frac{|f(x)|}{\int_V |f(z)|\,Dz}$$

In practice it is usually difficult to draw from this distribution (since if we could, it is likely we could compute $I$ directly!), but it suggests $p$ should be peaked where $f$ is peaked. For QFTs, importance sampling is usually just a synonym for sampling with the Boltzmann weight in the Euclidean path integral (though it doesn't have to be!).

Importance sampling (3)

Another example. Using Monte Carlo, estimate
$$I(a) = 2\int_0^a\!\!\int_0^a e^{-(x^2+y^2)/2}\,dx\,dy$$
Note that $\lim_{a\to\infty} I(a) = \pi$.

Simple Monte Carlo
1. Generate independent uniform variates $X, Y \in [0, a]$.
2. Compute $F = e^{-(X^2+Y^2)/2}$.
3. Repeat $N$ times and average the results, $\bar{F}(N)$; return $2a^2\,\bar{F}(N)$.

Importance sampling
1. Generate $X, Y$ from the normal distribution restricted to $[0, \infty)$.
2. Set $G = 1$ if $X \leq a$ and $Y \leq a$, and $G = 0$ otherwise.
3. Repeat $N$ times and average the results, $\bar{G}(N)$; return $\pi\,\bar{G}(N)$.

As $a \to \infty$, the region contributing almost all of the integral (near the origin) becomes a smaller and smaller fraction of the sampled volume - the variance of the simple MC estimate will diverge.
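Both estimators above can be sketched in a few lines, using $|N(0,1)|$ draws for the half-normal sampling:

```python
import math
import random

def simple_mc(a, n):
    """Uniform sampling of [0,a]^2 for I(a) = 2 * iint exp(-(x^2+y^2)/2) dx dy."""
    total = 0.0
    for _ in range(n):
        x, y = random.uniform(0, a), random.uniform(0, a)
        total += math.exp(-(x * x + y * y) / 2)
    return 2.0 * a * a * total / n

def importance_mc(a, n):
    """Half-normal sampling: count how often both coordinates land in [0, a]."""
    hits = sum(1 for _ in range(n)
               if abs(random.gauss(0, 1)) <= a and abs(random.gauss(0, 1)) <= a)
    return math.pi * hits / n

random.seed(5)
a, n = 10.0, 100_000
est_u = simple_mc(a, n)        # correct on average, but noisy for large a
est_i = importance_mc(a, n)    # essentially exact here: almost every pair hits
```

For $a = 10$ the half-normal estimator is nearly noise-free, while the uniform estimator wastes almost all of its samples far from the origin, where the integrand is negligible.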

Importance sampling (4)

[Figure: the estimated integral $I(a)$ against the box length $a$, from 0 to 20; the importance sampling estimate converges to $\pi$ while the simple estimate's fluctuations grow.]

The variance of the importance sampling estimator goes to zero as $a \to \infty$. Unfortunately, so far we have only developed algorithms to generate random numbers with a general probability distribution function in very low dimensions - for field theory applications, we want to calculate integrals over many thousands of dimensions...

Markov processes

A more general method for generating points in configuration space is provided by considering Markov processes. In 1906, Markov was interested in demonstrating that independence was not necessary to prove the (weak) law of large numbers. He analysed the alternating patterns of vowels and consonants in Pushkin's novel Eugene Onegin.

In a Markov process, a system makes stochastic transitions such that the probability of a transition occurring depends only on the start and end states. The system retains no memory of how it came to be in the current state. The resulting sequence of states of the system is called a Markov chain.

Markov processes (2)

A Markov chain

Let $\{\psi_i\}$ for $i = 0, \dots, n+1$ be a sequence of states generated from a finite state space $\Omega$ by a stochastic process. Let the $\chi_k \in \Omega$ be the distinct states, so that $\chi_i \neq \chi_j$ if $i \neq j$ and $\Omega = \{\chi_1, \chi_2, \chi_3, \dots, \chi_m\}$. If the conditional probability obeys
$$P(\psi_{n+1} = \chi_i \mid \psi_n = \chi_j, \psi_{n-1} = \chi_{j_{n-1}}, \dots, \psi_0 = \chi_{j_0}) = P(\psi_{n+1} = \chi_i \mid \psi_n = \chi_j),$$
then the sequence is called a Markov chain. Moreover, if $P(\psi_{n+1} = \chi_i \mid \psi_n = \chi_j)$ is independent of $n$, the sequence is a homogeneous Markov chain.

From now on, most of the Markov chains we consider will be homogeneous, so we'll drop the label.

Markov processes (3)

The transition probabilities fully describe the system. They can be written in matrix form (usually called the Markov matrix):
$$M_{ij} = P(\psi_{n+1} = \chi_i \mid \psi_n = \chi_j)$$

The probability that the system is in a given state after one application of the process is then
$$P(\psi_{n+1} = \chi_i) = \sum_{j=1}^m P(\psi_{n+1} = \chi_i \mid \psi_n = \chi_j)\, P(\psi_n = \chi_j)$$

Writing the probabilistic state of the system as a vector, application of the process looks like linear algebra:
$$p_i(n+1) = P(\psi_{n+1} = \chi_i) = \sum_j M_{ij}\, p_j(n)$$

Markov processes (4)

An example: Seattle's weather. A resident notices that on a rainy day in Seattle, the probability that tomorrow is rainy is 80%. Similarly, on a sunny day the probability that tomorrow is sunny is 40%. This suggests Seattle's weather can be described by a (homogeneous) Markov process. From this data, can we compute the probability that any given day is sunny or rainy?

For this system, with the columns labelling today's weather (sunny, rainy) and the rows tomorrow's, the Markov matrix is
$$M = \begin{pmatrix} 0.4 & 0.2 \\ 0.6 & 0.8 \end{pmatrix}$$

Markov processes (5)

If today is sunny, then $\psi_0 = (1, 0)^T$, and the state vector for tomorrow is $\psi_1 = (0.4, 0.6)^T$, then $\psi_2 = (0.28, 0.72)^T$, $\psi_3 = (0.256, 0.744)^T$, ...

If today is rainy, then $\psi_0 = (0, 1)^T$, and the state vector for tomorrow is $\psi_1 = (0.2, 0.8)^T$, then $\psi_2 = (0.24, 0.76)^T$, $\psi_3 = (0.248, 0.752)^T$, ...

The vector $\psi$ quickly collapses to a fixed point, which must be $\pi$, the eigenvector of $M$ with eigenvalue 1, normalised such that $\sum_{i=1}^2 \pi_i = 1$. We find $\pi = (0.25, 0.75)^T$. This is the invariant probability distribution of the process; with no prior information, these are the probabilities that any given day is sunny (25%) or rainy (75%).
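The collapse to the fixed point can be sketched by iterating the matrix directly:

```python
# Iterating the Seattle weather matrix (columns = today's state, rows =
# tomorrow's): p_i(n+1) = sum_j M_ij p_j(n).
M = [[0.4, 0.2],
     [0.6, 0.8]]

def step(M, p):
    """One application of the Markov matrix to a probability vector."""
    return [sum(M[i][j] * p[j] for j in range(len(p))) for i in range(len(M))]

p = [1.0, 0.0]              # start from a sunny day
history = [p]
for _ in range(50):
    p = step(M, p)
    history.append(p)
# history[1] is [0.4, 0.6] and history[2] is [0.28, 0.72] (up to rounding),
# as on the slide; p converges to the invariant distribution pi = (0.25, 0.75).
```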

The Markov matrix (1)

The Markov matrix has some elementary properties:
1. Since all its elements are probabilities, $0 \leq M_{ij} \leq 1$.
2. Since the system always ends up somewhere in $\Omega$, $\sum_{i=1}^N M_{ij} = 1$ for every $j$.

From these properties alone, the eigenvalues of $M$ must lie in the unit disk, $|\lambda| \leq 1$, since if $v$ is an eigenvector,
$$\sum_j M_{ij} v_j = \lambda v_i \implies |\lambda|\,|v_i| = \Bigl|\sum_j M_{ij} v_j\Bigr| \leq \sum_j M_{ij}\, |v_j|$$
and summing over $i$,
$$|\lambda| \sum_i |v_i| \leq \sum_j |v_j| \sum_i M_{ij} = \sum_j |v_j| \implies |\lambda| \leq 1$$

The Markov matrix (2)

Also, a Markov matrix must have at least one eigenvalue equal to unity. Considering the vector $v_i = 1$ for all $i$, we see
$$\sum_i v_i M_{ij} = \sum_i M_{ij} = 1 = v_j,$$
and thus $v$ is a left-eigenvector with eigenvalue 1. Similarly, for the right-eigenvectors,
$$\sum_j M_{ij} v_j = \lambda v_i \implies \sum_j v_j = \sum_i \sum_j M_{ij} v_j = \lambda \sum_i v_i$$
and so either $\lambda = 1$ or, if $\lambda \neq 1$, then $\sum_i v_i = 0$.
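Both properties can be checked on the Seattle matrix. For a $2\times 2$ matrix, the eigenvalues come straight from the characteristic polynomial $\lambda^2 - (\operatorname{tr} M)\lambda + \det M = 0$:

```python
import math

# Eigenvalues of the 2x2 Seattle weather matrix via its characteristic polynomial.
M = [[0.4, 0.2],
     [0.6, 0.8]]
tr = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
disc = math.sqrt(tr * tr - 4.0 * det)
eigs = sorted([(tr - disc) / 2.0, (tr + disc) / 2.0])
# eigs is approximately [0.2, 1.0]: one eigenvalue is unity, and the other
# lies strictly inside the unit disk, as the general argument requires.
```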