Monte Carlo Integration I
Digital Image Synthesis, Yung-Yu Chuang, with slides by Pat Hanrahan and Torsten Möller
Introduction
$L_o(p,\omega_o) = L_e(p,\omega_o) + \int_{S^2} f(p,\omega_o,\omega_i)\,L_i(p,\omega_i)\,|\cos\theta_i|\,d\omega_i$
The integral equations generally don't have analytic solutions, so we must turn to numerical methods. Standard methods like trapezoidal integration or Gaussian quadrature are not effective for high-dimensional and discontinuous integrals.
Numerical quadrature
Suppose we want to calculate $I = \int_a^b f(x)\,dx$, but can't solve it analytically. Approximations through quadrature rules have the form
$\hat{I} = \sum_{i=1}^n w_i\,f(x_i)$
which is essentially a weighted sum of samples of the function at various points.
Midpoint rule: $O(n^{-2})$ convergence, assuming f has a continuous second derivative.
Trapezoid rule: $O(n^{-2})$ convergence, assuming f has a continuous second derivative.
Simpson's rule: similar to trapezoid but using a quadratic polynomial approximation; $O(n^{-4})$ convergence, assuming f has a continuous fourth derivative.
Curse of dimensionality and discontinuity
For an s-dimensional function f: if the 1D rule has a convergence rate of $O(n^{-r})$, the s-dimensional rule requires a much larger number $n^s$ of samples to work as well as the 1D one. Thus, the convergence rate is only $O(n^{-r/s})$. If f is discontinuous, convergence is $O(n^{-1/s})$ for an s-dimensional function.
Randomized algorithms: Las Vegas vs. Monte Carlo
Las Vegas: always gives the right answer; randomness only affects the running time.
Monte Carlo: gives the right answer on average. Results depend on the random numbers used, but are statistically likely to be close to the right answer.
Monte Carlo integration
Monte Carlo integration uses sampling to estimate the values of integrals. It only requires the ability to evaluate the integrand at arbitrary points, making it easy to implement and applicable to many problems. If n samples are used, it converges at the rate of $O(n^{-1/2})$; that is, to cut the error in half, it is necessary to evaluate four times as many samples. Images generated by Monte Carlo methods are often noisy, and most current methods try to reduce noise.
Monte Carlo methods
Advantages:
- Easy to implement
- Easy to think about (but be careful of statistical bias)
- Robust when used with complex integrands and domains (shapes, lights, ...)
- Efficient for high-dimensional integrals
Disadvantages:
- Noisy
- Slow (many samples needed for convergence)
Basic concepts
X is a random variable. Applying a function to a random variable gives another random variable, Y = f(X).
CDF (cumulative distribution function): $P(x) = \Pr\{X \le x\}$
PDF (probability density function): $p(x) = \frac{dP(x)}{dx}$; nonnegative and integrates to one.
Canonical uniform random variable ξ: provided by the standard library and easy to transform to other distributions.
Discrete probability distributions
Discrete events $X_i$ with probability $p_i$: $p_i \ge 0$, $\sum_{i=1}^n p_i = 1$.
Cumulative distribution: $P_j = \sum_{i=1}^j p_i$.
Construction of samples: to randomly select an event, select $X_i$ if $P_{i-1} < U \le P_i$, where U is a uniform random variable.
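The selection rule above can be sketched as follows (the helper name and the linear CDF walk are my own, not from the slides):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Select the event index i such that P_{i-1} < u <= P_i, accumulating
// the CDF values P_j on the fly from the probabilities p_i.
// Linear search for clarity; a binary search over a precomputed CDF
// is the usual optimization.
int SampleDiscrete(const std::vector<double>& p, double u) {
    double cdf = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        cdf += p[i];
        if (u <= cdf) return static_cast<int>(i);
    }
    return static_cast<int>(p.size()) - 1;  // guard against round-off
}
```

For example, with probabilities {0.1, 0.4, 0.5}, a uniform draw u = 0.45 falls in the second CDF interval and selects event 1.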
Continuous probability distributions
PDF p(x); uniform distribution: p(x) = 1 for $x \in [0,1]$, 0 otherwise.
CDF: $P(x) = \int_0^x p(t)\,dt = \Pr\{X \le x\}$; for the uniform distribution, P(x) = x.
$\Pr\{\alpha \le X \le \beta\} = \int_\alpha^\beta p(x)\,dx = P(\beta) - P(\alpha)$
Expected values
Average value of a function f(x) over some distribution of values p(x) over its domain D:
$E_p[f(x)] = \int_D f(x)\,p(x)\,dx$
Example: the cos function over $[0, \pi]$, p uniform:
$E_p[f] = \int_0^\pi \cos x \cdot \frac{1}{\pi}\,dx = 0$
Variance
Expected deviation from the expected value; a fundamental concept for quantifying the error in Monte Carlo methods:
$V[f] = E\left[(f - E[f])^2\right]$
Properties
$E[af(x)] = aE[f(x)]$
$E\left[\sum_i f(X_i)\right] = \sum_i E[f(X_i)]$
$V[af(x)] = a^2 V[f(x)]$
$V[f] = E[f^2] - E[f]^2$
Monte Carlo estimator
Assume that we want to evaluate the integral of f(x) over [a,b]. Given uniform random variables $X_i$ over [a,b], the Monte Carlo estimator
$F_N = \frac{b-a}{N}\sum_{i=1}^N f(X_i)$
has an expected value $E[F_N]$ equal to the integral.
General Monte Carlo estimator
Given random variables $X_i$ drawn from an arbitrary PDF p(x), the estimator is
$F_N = \frac{1}{N}\sum_{i=1}^N \frac{f(X_i)}{p(X_i)}$
Although the convergence rate of the MC estimator is $O(N^{-1/2})$, slower than other integration methods, its convergence rate is independent of the dimension, making it the only practical method for high-dimensional integrals.
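A minimal sketch of this estimator (function name and the test integrand are my own): with p uniform on [0,1], $f(X_i)/p(X_i)$ reduces to $f(X_i)$ and $F_N$ is just the sample mean of f.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// F_N = (1/N) * sum_i f(X_i) / p(X_i), with X_i drawn uniformly from
// [0,1] so that p(x) = 1 and the estimator is the sample mean of f.
template <typename F>
double MonteCarloEstimate(F f, int n, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += f(u(rng));  // f(X_i) / p(X_i) with p = 1
    return sum / n;
}
```

For $f(x) = x^2$ the exact integral over [0,1] is 1/3; with N = 100,000 samples the $O(N^{-1/2})$ error is on the order of $10^{-3}$.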
Convergence of Monte Carlo
Chebyshev's inequality: let X be a random variable with expected value μ and variance $\sigma^2$. For any real number k > 0,
$\Pr\{|X - \mu| \ge k\sigma\} \le \frac{1}{k^2}$
For example, for $k = \sqrt{2}$, it shows that at least half of the values lie in the interval $(\mu - \sqrt{2}\sigma,\ \mu + \sqrt{2}\sigma)$.
Let $Y_i = f(X_i)/p(X_i)$; the MC estimate $F_N$ becomes
$F_N = \frac{1}{N}\sum_{i=1}^N Y_i$
Convergence of Monte Carlo
According to Chebyshev's inequality,
$\Pr\left\{|F_N - E[F_N]| \ge \left(\tfrac{V[F_N]}{\delta}\right)^{1/2}\right\} \le \delta$
$V[F_N] = V\left[\frac{1}{N}\sum_{i=1}^N Y_i\right] = \frac{1}{N^2}\sum_{i=1}^N V[Y_i] = \frac{1}{N}V[Y]$
Plugging into Chebyshev's inequality,
$\Pr\left\{|F_N - I| \ge \frac{1}{\sqrt{N}}\left(\tfrac{V[Y]}{\delta}\right)^{1/2}\right\} \le \delta$
So, for a fixed threshold δ, the error decreases at the rate $N^{-1/2}$.
Properties of estimators
An estimator $F_N$ is called unbiased if for all N
$E[F_N] = Q$
That is, the expected value is independent of N. Otherwise, the bias of the estimator is defined as
$\beta = E[F_N] - Q$
If the bias goes to zero as N increases, the estimator is called consistent:
$\lim_{N\to\infty} \beta = 0$, i.e. $\lim_{N\to\infty} E[F_N] = Q$
Example of a biased consistent estimator
Suppose we are doing antialiasing on a 1D pixel. To determine the pixel value, we need to evaluate $I = \int_0^1 w(x)f(x)\,dx$, where w(x) is the filter function with $\int_0^1 w(x)\,dx = 1$.
A common way to evaluate this is
$F_N = \frac{\sum_{i=1}^N w(X_i)f(X_i)}{\sum_{i=1}^N w(X_i)}$
When N = 1, we have
$E[F_1] = E\left[\frac{w(X)f(X)}{w(X)}\right] = E[f(X)] = \int_0^1 f(x)\,dx \ne I$
Example of a biased consistent estimator
When N = 2, we have
$E[F_2] = \int_0^1\!\!\int_0^1 \frac{w(x_1)f(x_1) + w(x_2)f(x_2)}{w(x_1) + w(x_2)}\,dx_1\,dx_2 \ne I$
However, when N is very large, the bias approaches zero:
$\lim_{N\to\infty} E[F_N] = \lim_{N\to\infty} \frac{\frac{1}{N}\sum_{i=1}^N w(X_i)f(X_i)}{\frac{1}{N}\sum_{i=1}^N w(X_i)} = \frac{\int_0^1 w(x)f(x)\,dx}{\int_0^1 w(x)\,dx} = I$
Choosing samples
$F_N = \frac{1}{N}\sum_{i=1}^N \frac{f(X_i)}{p(X_i)}$
Carefully choosing the PDF from which samples are drawn is an important technique for reducing variance: we want f/p to have low variance. Hence, it is necessary to be able to draw samples from the chosen PDF. How do we sample an arbitrary distribution from a uniformly distributed variable? Inversion, rejection, transformation.
Inversion method
Cumulative probability distribution function: $P(x) = \Pr\{X \le x\}$
Construction of samples: solve for $X = P^{-1}(U)$.
Must know: 1. the integral of p(x); 2. the inverse function $P^{-1}(x)$.
Proof for the inversion method
Let U be a uniform random variable with CDF $P_u(x) = x$. We will show that $Y = P^{-1}(U)$ has the CDF P(x):
$\Pr\{Y \le x\} = \Pr\{P^{-1}(U) \le x\} = \Pr\{U \le P(x)\} = P_u(P(x)) = P(x)$
because P is monotonic. Thus, Y's CDF is exactly P(x).
Inversion method
1. Compute the CDF P(x)
2. Compute $P^{-1}(x)$
3. Obtain ξ
4. Compute $X_i = P^{-1}(\xi)$
Example: power function
It is used in sampling Blinn's microfacet model. Assume $p(x) \propto x^n$, i.e. $p(x) = (n+1)x^n$ on [0,1]:
$P(x) = \int_0^x (n+1)t^n\,dt = x^{n+1}$
$X = P^{-1}(U) = U^{1/(n+1)}$
Trick (it only works for sampling the power distribution):
$Y = \max(U_1, U_2, \ldots, U_n, U_{n+1})$
$\Pr\{Y \le x\} = \prod_{i=1}^{n+1}\Pr\{U_i \le x\} = x^{n+1}$
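Both routes can be sketched and cross-checked (the function names are my own): inversion gives $X = U^{1/(n+1)}$, and the max-of-uniforms trick draws from the same distribution. For $p(x) = (n+1)x^n$ the mean is $(n+1)/(n+2)$, i.e. 3/4 for n = 2.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

// Inversion: P(x) = x^(n+1) on [0,1], so X = U^(1/(n+1)).
double SamplePowerInversion(int n, double u) {
    return std::pow(u, 1.0 / (n + 1));
}

// Trick: the maximum of n+1 independent uniforms has
// Pr{max <= x} = x^(n+1), the same CDF, with no pow() call.
double SamplePowerMax(int n, std::mt19937& rng) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double m = 0.0;
    for (int i = 0; i < n + 1; ++i) m = std::max(m, u(rng));
    return m;
}
```

The trick trades one `pow` for n + 1 uniform draws, which is only attractive for small n.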
Example: exponential distribution
$p(x) = ce^{-ax}$, useful for rendering participating media. Normalizing over $[0,\infty)$: $\int_0^\infty ce^{-ax}\,dx = \frac{c}{a} = 1$, so $c = a$.
Compute the CDF: $P(x) = \int_0^x ae^{-as}\,ds = 1 - e^{-ax}$
Compute the inverse: $P^{-1}(x) = -\frac{\ln(1-x)}{a}$
Obtain ξ and compute $X = -\frac{\ln(1-\xi)}{a}$
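The resulting sampling routine is one line (the function name is mine):

```cpp
#include <cassert>
#include <cmath>

// Inversion for p(x) = a * exp(-a*x) on [0, inf):
// P(x) = 1 - exp(-a*x), so X = P^{-1}(xi) = -ln(1 - xi) / a.
double SampleExponential(double a, double xi) {
    return -std::log(1.0 - xi) / a;
}
```

Since 1 − ξ is itself uniformly distributed, $-\ln(\xi)/a$ is an equivalent and common shortcut.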
Rejection method
Sometimes we can't integrate p(x) into a CDF or invert the CDF.
$I = \int_0^1 f(x)\,dx = \int_0^1\!\!\int_0^{f(x)} dy\,dx$
Algorithm: pick $U_1$ and $U_2$; accept $U_1$ if $U_2 < f(U_1)$.
Wasteful? Efficiency = area under f / area of the rectangle.
Rejection method
The rejection method is a dart-throwing method that avoids performing integration and inversion.
1. Find q(x) so that $p(x) < Mq(x)$.
2. Dart throwing:
a. Choose a pair $(X, \xi)$, where X is sampled from q(x) and ξ is uniform.
b. If $\xi < \frac{p(X)}{Mq(X)}$, return X.
Equivalently, we pick a point $(X, \xi Mq(X))$; if it lies beneath p(X), we accept it.
Why it works
For each iteration, we generate $X_i$ from q. The sample is returned if $\xi < \frac{p(X)}{Mq(X)}$, which happens with probability $\frac{p(X)}{Mq(X)}$. So, the probability of returning x is
$q(x)\,\frac{p(x)}{Mq(x)} = \frac{p(x)}{M}$
whose integral over the domain is 1/M. Thus, since a sample is returned with probability 1/M, $X_i$ is distributed according to p(x).
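A concrete sketch of the dart-throwing loop (my own toy example, not from the slides): rejection-sample p(x) = 2x on [0,1] using the uniform proposal q(x) = 1 and bound M = 2, so that $p(x) \le Mq(x)$ everywhere.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Dart throwing for p(x) = 2x with q(x) = 1 and M = 2.
// Accept the candidate X ~ q when xi < p(X)/(M q(X)) = X.
double SampleLinearByRejection(std::mt19937& rng) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    for (;;) {
        double x = u(rng);     // candidate X from q
        double xi = u(rng);
        if (xi < x) return x;  // p(x)/(M q(x)) = 2x/2 = x
    }
}
```

On average 1/M (here one half) of the candidates are accepted, matching the efficiency argument above; the mean of the returned samples approaches $\int_0^1 x \cdot 2x\,dx = 2/3$.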
Example: sampling a unit disk

void RejectionSampleDisk(float *x, float *y) {
    float sx, sy;
    do {
        sx = 1.f - 2.f * RandomFloat();
        sy = 1.f - 2.f * RandomFloat();
    } while (sx*sx + sy*sy > 1.f);
    *x = sx;
    *y = sy;
}

π/4 ≈ 78.5% good samples; gets worse in higher dimensions, for example, for a sphere, π/6 ≈ 52.4%.
Transformation of variables
Given a random variable X from distribution p(x) and a random variable Y = y(X), where y is one-to-one (i.e. monotonic), we want to derive the distribution of Y, $p_y(y)$:
$P_y(y(x)) = \Pr\{Y \le y(x)\} = \Pr\{X \le x\} = P(x)$
PDF: $p_y(y)\frac{dy}{dx} = p(x)$, so $p_y(y) = \left(\frac{dy}{dx}\right)^{-1} p(x)$
Example
$Y = \sin X$: $p_y(y) = \frac{p(x)}{|\cos x|} = \frac{p(x)}{\sqrt{1-y^2}}$
Transformation method
A problem in applying the above method is that we usually have some PDF to sample from, not a given transformation. Given a source random variable X with p(x) and a target distribution $p_y(y)$, try to transform X into another random variable Y so that Y has the distribution $p_y(y)$. We first have to find a transformation y(x) so that $P(x) = P_y(y(x))$. Thus,
$y(x) = P_y^{-1}(P(x))$
Transformation method
Let's prove that the above transform works. We first prove that the random variable Z = P(X) has a uniform distribution. If so, then $P_y^{-1}(Z)$ has the distribution $P_y(y)$ by the inversion method.
$\Pr\{Z \le x\} = \Pr\{P(X) \le x\} = \Pr\{X \le P^{-1}(x)\} = P(P^{-1}(x)) = x$
Thus, Z is uniform and the transformation works. It is an obvious generalization of the inversion method, in which X is uniform and P(x) = x.
Example
Given p(x) = 2x on [0,1], find Y = y(X) so that $p_y(y) = e^{-y}$.
Example
$p(x) = 2x \Rightarrow P(x) = x^2$
$p_y(y) = e^{-y} \Rightarrow P_y(y) = 1 - e^{-y} \Rightarrow P_y^{-1}(y) = -\ln(1-y)$
$y(x) = P_y^{-1}(P(x)) = -\ln(1 - x^2)$
Thus, if X has the distribution p(x) = 2x, then the random variable $Y = -\ln(1 - X^2)$ has the distribution $e^{-y}$.
Multiple dimensions
For a transformation T with Jacobian $J_T$, the densities are related by $p_y(T(x)) = \frac{p(x)}{|J_T(x)|}$, where $|J_T|$ is the absolute value of the determinant. We often need the other way around: the density over the parameters is $|J_T|$ times the density over the transformed values.
Spherical coordinates
The spherical coordinate representation of directions is
$x = r\sin\theta\cos\phi,\quad y = r\sin\theta\sin\phi,\quad z = r\cos\theta$
$|J_T| = r^2\sin\theta$
$p(r,\theta,\phi) = r^2\sin\theta\; p(x,y,z)$
Spherical coordinates
Now, look at the relation between spherical directions and solid angles:
$d\omega = \sin\theta\,d\theta\,d\phi$
$p(\theta,\phi)\,d\theta\,d\phi = p(\omega)\,d\omega$
Hence, the density in terms of θ and φ is $p(\theta,\phi) = \sin\theta\; p(\omega)$.
Multidimensional sampling
Separable case: independently sample X from $p_x$ and Y from $p_y$; $p(x,y) = p_x(x)\,p_y(y)$.
Often, this is not possible. Compute the marginal density function p(x) first:
$p(x) = \int p(x,y)\,dy$
Then, compute the conditional density function:
$p(y|x) = \frac{p(x,y)}{p(x)}$
Use 1D sampling with p(x) and p(y|x).
Sampling a hemisphere
Sample a hemisphere uniformly, i.e. $p(\omega) = c$. Since $\int_{H^2} p(\omega)\,d\omega = 1$, we get $c = \frac{1}{2\pi}$ and $p(\theta,\phi) = \frac{\sin\theta}{2\pi}$.
Sample θ first:
$p(\theta) = \int_0^{2\pi} p(\theta,\phi)\,d\phi = \sin\theta$
Now sample φ:
$p(\phi|\theta) = \frac{p(\theta,\phi)}{p(\theta)} = \frac{1}{2\pi}$
Sampling a hemisphere
Now, we use the inversion technique to sample the PDFs:
$P(\theta) = \int_0^\theta \sin\theta'\,d\theta' = 1 - \cos\theta$
$P(\phi|\theta) = \int_0^\phi \frac{1}{2\pi}\,d\phi' = \frac{\phi}{2\pi}$
Inverting these: $\theta = \cos^{-1}\xi_1$ (using that $1-\xi_1$ is also uniform), $\phi = 2\pi\xi_2$.
Sampling a hemisphere
Convert these to Cartesian coordinates:
$x = \sin\theta\cos\phi = \cos(2\pi\xi_2)\sqrt{1-\xi_1^2}$
$y = \sin\theta\sin\phi = \sin(2\pi\xi_2)\sqrt{1-\xi_1^2}$
$z = \cos\theta = \xi_1$
Similar derivation for a full sphere.
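The mapping above as code (the Vec3 struct and function name are my own):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

struct Vec3 { double x, y, z; };

// Uniform hemisphere sample (z >= 0): z = xi1, phi = 2*pi*xi2,
// x = sqrt(1 - z^2) cos(phi), y = sqrt(1 - z^2) sin(phi).
Vec3 UniformSampleHemisphere(double xi1, double xi2) {
    double z = xi1;
    double r = std::sqrt(std::max(0.0, 1.0 - z * z));  // sin(theta)
    double phi = 2.0 * kPi * xi2;
    return Vec3{r * std::cos(phi), r * std::sin(phi), z};
}
```

For example, $(\xi_1,\xi_2) = (0.5, 0.25)$ gives z = 0.5 and $\phi = \pi/2$, so the direction is $(0, \sqrt{0.75}, 0.5)$ on the unit sphere.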
Sampling a disk
WRONG (not equi-areal, samples cluster near the center): $r = \xi_1$, $\theta = 2\pi\xi_2$
RIGHT (equi-areal): $r = \sqrt{\xi_1}$, $\theta = 2\pi\xi_2$
Sampling a disk
Uniform: $p(x,y) = \frac{1}{\pi}$, so $p(r,\theta) = r\,p(x,y) = \frac{r}{\pi}$.
Sample r first, then θ:
$p(r) = \int_0^{2\pi} p(r,\theta)\,d\theta = 2r,\qquad p(\theta|r) = \frac{p(r,\theta)}{p(r)} = \frac{1}{2\pi}$
Invert the CDFs:
$P(r) = r^2,\qquad P(\theta|r) = \frac{\theta}{2\pi}$
$r = \sqrt{\xi_1},\qquad \theta = 2\pi\xi_2$
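The inverted pair as code (the function name is mine; the pointer-output signature mirrors the rejection version earlier):

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Uniform unit-disk sample by inversion:
// r = sqrt(xi1) (from P(r) = r^2), theta = 2*pi*xi2.
void UniformSampleDisk(double xi1, double xi2, double* x, double* y) {
    double r = std::sqrt(xi1);
    double theta = 2.0 * kPi * xi2;
    *x = r * std::cos(theta);
    *y = r * std::sin(theta);
}
```

Unlike rejection sampling, this consumes exactly two uniforms per sample and never discards any.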
Shirley's mapping (concentric map): for the first wedge, $r = U_1$, $\theta = \frac{\pi}{4}\frac{U_2}{U_1}$
Sampling a triangle
Domain: $u \ge 0$, $v \ge 0$, $u + v \le 1$
$A = \int_0^1\!\!\int_0^{1-u} dv\,du = \frac{1}{2}$
$p(u,v) = \frac{1}{A} = 2$
Sampling a triangle
Here u and v are not independent! p(u,v) = 2.
Marginal: $p(u) = \int_0^{1-u} p(u,v)\,dv = 2(1-u)$, so $P(u) = \int_0^u 2(1-u')\,du' = 2u - u^2$.
Conditional: $p(v|u) = \frac{p(u,v)}{p(u)} = \frac{1}{1-u}$, so $P(v|u) = \frac{v}{1-u}$.
Inverting: $u = 1 - \sqrt{1-\xi_1}$ (equivalently $1 - \sqrt{\xi_1}$), $v = \xi_2(1-u)$.
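As code (the function name is mine; it uses the $\xi_1 \leftrightarrow 1-\xi_1$ simplification):

```cpp
#include <cassert>
#include <cmath>

// Uniformly sample barycentric coordinates (u,v) on the triangle
// u,v >= 0, u + v <= 1:  u = 1 - sqrt(xi1), v = xi2 * sqrt(xi1).
void UniformSampleTriangle(double xi1, double xi2, double* u, double* v) {
    double su1 = std::sqrt(xi1);
    *u = 1.0 - su1;
    *v = xi2 * su1;
}
```

The pair (u,v) always lands inside the triangle, since $u + v = 1 - (1-\xi_2)\sqrt{\xi_1} \le 1$.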
Cosine-weighted hemisphere
$p(\omega) \propto \cos\theta$
$\int_{H^2} p(\omega)\,d\omega = 1:\quad \int_0^{2\pi}\!\!\int_0^{\pi/2} c\cos\theta\sin\theta\,d\theta\,d\phi = c\,2\pi\cdot\frac{1}{2} = 1 \Rightarrow c = \frac{1}{\pi}$
$p(\theta,\phi) = \frac{1}{\pi}\cos\theta\sin\theta$
Cosine-weighted hemisphere
$p(\theta) = \int_0^{2\pi} \frac{1}{\pi}\cos\theta\sin\theta\,d\phi = 2\cos\theta\sin\theta = \sin 2\theta$
$p(\phi|\theta) = \frac{p(\theta,\phi)}{p(\theta)} = \frac{1}{2\pi}$
$P(\theta) = \sin^2\theta = 1 - \cos^2\theta$
Inverting: $\cos\theta = \sqrt{1-\xi_1}$ (equivalently $\sqrt{\xi_1}$), $\phi = 2\pi\xi_2$
Cosine-weighted hemisphere
Malley's method: uniformly generate points on the unit disk, then generate directions by projecting them up to the hemisphere above it.

Vector CosineSampleHemisphere(float u1, float u2) {
    Vector ret;
    ConcentricSampleDisk(u1, u2, &ret.x, &ret.y);
    ret.z = sqrtf(max(0.f, 1.f - ret.x*ret.x - ret.y*ret.y));
    return ret;
}
Cosine-weighted hemisphere
Why does Malley's method work? Unit disk sampling gives $p(r,\phi) = \frac{r}{\pi}$. Projecting to the hemisphere is the transformation $Y = (r,\phi) = T(\theta,\phi)$ with $r = \sin\theta$:
$J_T = \begin{pmatrix} \cos\theta & 0 \\ 0 & 1 \end{pmatrix},\qquad |J_T| = \cos\theta$
$p(\theta,\phi) = |J_T|\,p(r,\phi) = \cos\theta\cdot\frac{r}{\pi} = \frac{1}{\pi}\cos\theta\sin\theta$
Sampling a Phong lobe
$p(\omega) \propto \cos^n\theta$, i.e. $p(\omega) = c\cos^n\theta$:
$\int_0^{2\pi}\!\!\int_0^{\pi/2} c\cos^n\theta\sin\theta\,d\theta\,d\phi = c\,2\pi\cdot\frac{1}{n+1} = 1 \Rightarrow c = \frac{n+1}{2\pi}$
$p(\theta,\phi) = \frac{n+1}{2\pi}\cos^n\theta\sin\theta$
Sampling a Phong lobe
$p(\theta,\phi) = \frac{n+1}{2\pi}\cos^n\theta\sin\theta,\qquad p(\theta) = \int_0^{2\pi} p(\theta,\phi)\,d\phi = (n+1)\cos^n\theta\sin\theta$
$P(\theta) = \int_0^\theta (n+1)\cos^n\theta'\sin\theta'\,d\theta' = 1 - \cos^{n+1}\theta$
$\cos\theta = (1-\xi)^{\frac{1}{n+1}}$ (equivalently $\xi^{\frac{1}{n+1}}$)
Sampling a Phong lobe
$p(\phi|\theta) = \frac{p(\theta,\phi)}{p(\theta)} = \frac{1}{2\pi},\qquad P(\phi|\theta) = \int_0^\phi \frac{1}{2\pi}\,d\phi' = \frac{\phi}{2\pi}$
Sampling a Phong lobe
When n = 1, $p(\theta,\phi) = \frac{1}{\pi}\cos\theta\sin\theta$ and $P(\theta) = 1 - \cos^2\theta$; it is actually equivalent to the cosine-weighted hemisphere: $\cos\theta = \sqrt{\xi_1}$, $\phi = 2\pi\xi_2$.
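The full lobe-sampling routine is then (the function name is mine):

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Sample (theta, phi) for p(omega) proportional to cos^n(theta):
// cos(theta) = xi1^(1/(n+1)), phi = 2*pi*xi2.
void SamplePhongLobe(int n, double xi1, double xi2,
                     double* theta, double* phi) {
    *theta = std::acos(std::pow(xi1, 1.0 / (n + 1)));
    *phi = 2.0 * kPi * xi2;
}
```

For n = 0 this reduces to the uniform hemisphere, and for n = 1 to the cosine-weighted hemisphere, as noted above.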
Piecewise-constant 2D distributions
Sample from discrete 2D distributions; useful for texture maps and environment lights. Consider f(u,v) defined by a set of $n_u \times n_v$ values $f[u_i, v_j]$. Given a continuous (u,v), we will use $(\tilde{u}, \tilde{v})$ to denote the corresponding discrete $(u_i, v_j)$ indices.
Piecewise-constant 2D distributions
Integral: $I_f = \int\!\!\int f(u,v)\,du\,dv = \frac{1}{n_u n_v}\sum_i\sum_j f[u_i, v_j]$
PDF: $p(u,v) = \frac{f(u,v)}{I_f}$
Marginal density: $p(v) = \int p(u,v)\,du = \frac{1}{n_u}\sum_i \frac{f[u_i, \tilde{v}]}{I_f}$
Conditional probability: $p(u|v) = \frac{p(u,v)}{p(v)} = \frac{f[\tilde{u},\tilde{v}]/I_f}{p[\tilde{v}]}$
Piecewise-constant 2D distributions
Figure: f(u,v) (Distribution2D), its low-res marginal density p(v), and the low-res conditional probability p(u|v).
Metropolis sampling
Metropolis sampling can efficiently generate a set of samples from any non-negative function f, requiring only the ability to evaluate f. Disadvantage: successive samples in the sequence are often correlated. It is not possible to ensure that a small number of samples generated by Metropolis is well distributed over the domain; there is no technique like stratified sampling for Metropolis.
Metropolis sampling
Problem: given an arbitrary function $f(x) \ge 0$ (assuming $\int f(x)\,dx$ is finite), generate a set of samples distributed proportionally to f.
Metropolis sampling
Steps: generate an initial sample $x_0$; mutate the current sample $x_i$ to propose $x'$. If it is accepted, $x_{i+1} = x'$; otherwise, $x_{i+1} = x_i$. The acceptance probability guarantees that the stationary distribution of the samples is proportional to f.
Metropolis sampling
Mutations propose $x'$ given $x_i$. $T(x \to x')$ is the tentative transition probability density of proposing $x'$ from x; being able to calculate this tentative transition probability is the only restriction on the choice of mutations. $a(x \to x')$ is the acceptance probability of the transition. By defining a carefully, we ensure the samples reach f's distribution in equilibrium.
Metropolis sampling
Detailed balance yields the stationary distribution:
$f(x)\,T(x\to x')\,a(x\to x') = f(x')\,T(x'\to x)\,a(x'\to x)$
$a(x\to x') = \min\left(1,\ \frac{f(x')\,T(x'\to x)}{f(x)\,T(x\to x')}\right)$
Pseudo code
Pseudo code: expected value
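A minimal runnable sketch of the loop the pseudo-code slides describe (my own toy target f(x) = x² on [0,1] with uniform, independent mutations, so T is constant and the acceptance probability reduces to $a = \min(1, f(x')/f(x))$; this is not the pbrt implementation):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Metropolis sampling of f(x) = x^2 on [0,1].  With uniform mutations
// the tentative transition density T cancels in the acceptance ratio.
std::vector<double> MetropolisSample1D(int n, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    auto f = [](double x) { return x * x; };
    double x = 0.5;                        // initial sample x0
    std::vector<double> samples;
    samples.reserve(n);
    for (int i = 0; i < n; ++i) {
        double xp = u(rng);                // mutate: propose x'
        double a = std::min(1.0, f(xp) / f(x));
        if (u(rng) < a) x = xp;            // accept: x_{i+1} = x'
        samples.push_back(x);              // reject: x_{i+1} = x_i
    }
    return samples;
}
```

The recorded samples are distributed proportionally to f, so their mean approaches E[x] under the normalized density $p(x) = 3x^2$, which is 3/4 (subject to start-up bias from the choice of $x_0$).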
Binary example I
Binary example II
Acceptance probability
Does not affect unbiasedness, just variance. We want transitions to happen, because transitions are often heading where f is large. Maximize the acceptance probability to explore the state space better and reduce correlation.
Mutation strategy
Very free and flexible; the only requirement is to be able to calculate the transition probability. Based on applications and experience. The more mutations, the better; their relative frequency is not so important.
Start-up bias
Using an initial sample not from f's distribution leads to a problem called start-up bias.
Solution #1: run Metropolis sampling for a while and use the current sample as the initial sample to re-start the process. Expensive start-up cost; need to guess when to re-start.
Solution #2: use another available sampling method to start.
1D example
1D example: mutation 1
1D example: mutations 1 and 2, 10,000 iterations
1D example: mutations 1 and 2, 300,000 iterations
1D example: 90% mutation 1 + 10% mutation 2; periodically using uniform mutations increases ergodicity.
2D example: image copy
2D example: image copy with 1 sample per pixel, 8 samples per pixel, and 256 samples per pixel