Sequential Monte Carlo and adaptive numerical integration

VLADIMIR M. IVANOV, MAXIM L. KORENEVSKY
Department of Computer Science
Saint-Petersburg State Polytechnical University
195..., Politekhnicheskaya, 29, Saint-Petersburg
RUSSIA

Abstract: The paper presents the theory of the Sequential Monte Carlo method and its application to the development of adaptive statistical multidimensional integration algorithms. A theorem on the convergence of Sequential Monte Carlo is given. Several conventional variance reduction methods are used to develop adaptive integration methods. These methods accumulate data about the integrand during execution and use it to accelerate the convergence of the integral estimates. Such an approach provides optimal convergence rates for many important function classes while retaining the main merits of conventional Monte Carlo integration methods.

Keywords: Sequential Monte Carlo, adaptive integration, successive bisection.

1 Introduction

Monte Carlo is one of the most popular methods for multidimensional integration. It is very simple to implement, naturally sequential, and allows the accuracy to be checked during the computation. Another important merit of the Monte Carlo method is that its rate of convergence does not depend on the dimension of the integral. In fact, the integration error is of order $O(n^{-0.5})$, where $n$ is the number of sample points at which the integrand is evaluated. This is very different from traditional cubature rules, whose rate of convergence deteriorates exponentially as the dimension increases.

Integration accuracy can be substantially increased by various variance reduction methods, e.g. importance sampling, correlated and control variate sampling, antithetic variates, stratified sampling etc. [1], [2], [3]. Many of them use a priori data about the integrand to attain more rapid convergence. However, data of this kind can also be accumulated during the computation and applied to accelerate the convergence of the integration process, i.e. the algorithm can adapt to the features of the integrand.
The Sequential Monte Carlo method [4] provides a convenient framework for the design and investigation of such adaptive integration methods.

The rest of the paper is organized as follows. Section 2 gives a brief description of Monte Carlo integration and some variance reduction techniques. The Sequential Monte Carlo method is stated and investigated in Section 3. Some general approaches to developing adaptive integration methods are outlined in Section 4. Section 5 describes the successive bisection algorithm and adaptive methods for the integration of smooth functions. Some numerical experiments are discussed in Section 6.

2 Monte Carlo integration

The problem is to evaluate the integral

$$J = \int_\Omega f(x)\,dx \tag{1}$$

over a closed bounded domain $\Omega$ of the $s$-dimensional Euclidean space $R^s$. The simplest (crude) Monte Carlo estimate of $J$ is

$$J_n = \frac{\mu(\Omega)}{n} \sum_{k=1}^{n} f(x_k), \tag{2}$$

where $x_k$ are independent random variables uniformly distributed over $\Omega$ and $\mu(\Omega)$ is the measure of $\Omega$. The estimate $J_n$ is unbiased and its variance is

$$\mathrm{Var}\{J_n\} = \frac{1}{n}\left[\mu(\Omega)\int_\Omega f^2(x)\,dx - J^2\right] = \frac{\sigma^2}{n}. \tag{3}$$

This implies (by the Chebyshev inequality) that the integration error for any predefined confidence level decreases as $O(n^{-0.5})$, and this rate does not depend on the dimension $s$. The integration process can be organized in a sequential manner, and a simple accuracy check based on the sample variance can be used to stop the computation as soon as the required accuracy is attained.

Numerous variance reduction techniques have been developed to reduce the multiplier $\sigma^2$ in (3). The idea of control variate sampling is to integrate the principal part of $f(x)$ analytically and to apply the Monte Carlo estimate to the remainder only. Indeed, let $g(x)$ be an easily integrable approximation of $f(x)$ (a so-called "easy approximation"). Then $J$ can be estimated as follows:

$$J_n = \int_\Omega g(x)\,dx + \frac{\mu(\Omega)}{n} \sum_{k=1}^{n} \bigl(f(x_k) - g(x_k)\bigr). \tag{4}$$

Evidently, $\mathrm{Var}\{J_n\}$ is given by (3) with $f(x)$ replaced by $f(x) - g(x)$, and it tends to zero when $g(x) \to f(x)$. Therefore all the a priori data about $f(x)$ can be used to construct the easy approximation $g(x)$ carefully.

Importance sampling is based on the observation that one needs to sample more points in those parts of $\Omega$ where $f(x)$ is greater in absolute value, i.e. sampling should be controlled by some probability density function (pdf) $p(x) > 0$. This leads to a more general form of the Monte Carlo estimate:

$$J_n = \frac{1}{n} \sum_{k=1}^{n} \frac{f(x_k)}{p(x_k)}, \tag{5}$$

where $x_k$ are independent random variables with pdf $p(x)$. The variance of this estimate is

$$\mathrm{Var}\{J_n\} = \frac{1}{n}\left[\int_\Omega \frac{f^2(x)}{p(x)}\,dx - J^2\right] = \frac{\sigma^2}{n}. \tag{6}$$

It is well known [2] that the least value of $\sigma^2$ is attained when $p(x)$ is proportional to $|f(x)|$. Provided that $f(x)$ is of constant sign, this $p(x)$ even reduces $\sigma^2$ to zero. Although an exact choice of $p(x)$ in this manner is not possible, an approximate choice is often quite acceptable. So, let $g(x)$ be an easily integrable approximation of $f(x)$. Then

$$p(x) = \frac{g(x)}{\int_\Omega g(x)\,dx} = \frac{g(x)}{J_g}$$

gives the importance sampling pdf and

$$J_n = \frac{J_g}{n} \sum_{k=1}^{n} \frac{f(x_k)}{g(x_k)} \tag{7}$$

gives the importance sampling integral estimate. Again, a priori data about the integrand can be used to construct the importance sampling pdf.

Both control variate and importance sampling can be considered as special cases of a general approach proposed by J.H.Halton [4] and called Sequential Monte Carlo.

3 Sequential Monte Carlo

The Sequential Monte Carlo method operates on two types of unbiased integral estimates, primary $S_n$ and secondary $J_n$, related as follows:

$$J_n = \beta_n S_n + (1 - \beta_n) J_{n-1}, \tag{8}$$

where $0 \le \beta_n \le 1$, $\beta_1 = 1$ are some numerical factors.
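As an illustration of recursion (8), the following Python sketch applies it to crude Monte Carlo primary estimates. The integrand $f(x) = x^2$ on $[0, 1]$, the function names, the seed and the choice $\beta_n = 1/n$ are assumptions made for this example only; with that choice the secondary estimate is simply the running sample mean of the primary estimates.

```python
import random


def secondary_estimates(primary, betas):
    """Apply J_n = beta_n * S_n + (1 - beta_n) * J_{n-1}; return every J_n."""
    estimates = []
    j_prev = 0.0  # starting value is irrelevant because beta_1 = 1
    for s_n, beta_n in zip(primary, betas):
        j_prev = beta_n * s_n + (1.0 - beta_n) * j_prev
        estimates.append(j_prev)
    return estimates


def crude_primary(f, n, rng):
    """Primary estimates S_k = mu * f(x_k), x_k uniform on [0, 1], so mu = 1."""
    return [f(rng.random()) for _ in range(n)]


rng = random.Random(12345)  # arbitrary fixed seed for reproducibility
n = 10_000
primary = crude_primary(lambda x: x * x, n, rng)
betas = [1.0 / k for k in range(1, n + 1)]  # assumed choice: beta_n = 1/n
secondary = secondary_estimates(primary, betas)

print("secondary estimate J_n:", secondary[-1])  # exact integral is 1/3
```

Other weight sequences can be passed through `betas` without changing the recursion itself, as long as $\beta_1 = 1$.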
Both primary and secondary estimates depend on random points $x_n$ sampled over $\Omega$ according to some pdfs (which may depend on $n$). The secondary estimates $J_n$ can also be expressed as

$$J_n = \sum_{k=1}^{n} \alpha_k^{(n)} S_k, \qquad \alpha_k^{(n)} = \beta_k \prod_{j=k+1}^{n} (1 - \beta_j).$$

It is easy to see that crude Monte Carlo is the special case of Sequential Monte Carlo corresponding to

$$S_n = \mu(\Omega) f(x_n), \tag{9}$$

while control variate sampling corresponds to

$$S_n = \int_\Omega g(x)\,dx + \mu(\Omega)\bigl(f(x_n) - g(x_n)\bigr) \tag{10}$$

and importance sampling corresponds to

$$S_n = \frac{f(x_n)}{p(x_n)}. \tag{11}$$

In all these cases $S_n$ depends only on $x_n$, so the primary estimates are independent of each other, and $\alpha_k^{(n)} = \beta_n = 1/n$.

One may expect that the use of dependent primary estimates can provide some benefits, and this is indeed true. The most useful results can be obtained for the case of dependent but uncorrelated primary estimates, i.e. $E\{(S_i - J)(S_j - J)\} = 0$ for $i \ne j$. In this case the variance of the secondary estimates is

$$\mathrm{Var}\{J_n\} = \sum_{k=1}^{n} \bigl[\alpha_k^{(n)}\bigr]^2 D_k, \qquad D_k = E\{|S_k - J|^2\}.$$
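The three primary estimates (9), (10) and (11) can be compared on a small example. The sketch below is only an illustration under stated assumptions: the integrand $f(x) = e^x$ on $[0, 1]$, the easy approximation $g(x) = 1 + x$ (with $\int_0^1 g = 1.5$), the inverse-CDF sampler for $p(x) = g(x)/1.5$, and all function names are choices made here, not taken from the paper.

```python
import math
import random
import statistics


def f(x):
    return math.exp(x)  # assumed test integrand, exact integral e - 1


def g(x):
    return 1.0 + x  # assumed easy approximation of f


G_INTEGRAL = 1.5  # integral of g over [0, 1]


def crude(rng):
    """Primary estimate (9) with mu = 1."""
    return f(rng.random())


def control_variate(rng):
    """Primary estimate (10): integral of g plus the sampled remainder."""
    x = rng.random()
    return G_INTEGRAL + (f(x) - g(x))


def importance(rng):
    """Primary estimate (11) with pdf p(x) = g(x) / G_INTEGRAL."""
    u = rng.random()
    # Inverse of the CDF F(x) = (x + x^2 / 2) / 1.5 on [0, 1].
    x = math.sqrt(1.0 + 3.0 * u) - 1.0
    return f(x) * G_INTEGRAL / g(x)


def summarize(primary, n, seed):
    """Sample mean and sample variance of n independent primary estimates."""
    rng = random.Random(seed)
    samples = [primary(rng) for _ in range(n)]
    return statistics.fmean(samples), statistics.variance(samples)


results = {
    "crude": summarize(crude, 20_000, 2024),
    "control variate": summarize(control_variate, 20_000, 2024),
    "importance": summarize(importance, 20_000, 2024),
}
for name, (mean, var) in results.items():
    print(f"{name:16s} mean={mean:.5f} sample variance={var:.5f}")
```

With $\beta_n = 1/n$ the sample mean printed for each method is the secondary estimate $J_n$, and the sample variance estimates the corresponding $\sigma^2$ in (3) or (6).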

Here $D_k$ are the variances of $S_k$ (the expectation is taken over the random variables $x_1, \dots, x_k$ on which $S_k$ depends). The following theorem gives some results on the convergence of Sequential Monte Carlo.

Theorem 1 [5, 6] Let the $S_n$ be uncorrelated, let their variances satisfy $D_n = O(n^{-\gamma} \ln^{\delta} n)$ for some constants $\gamma, \delta \ge 0$, and let the coefficients $\beta_n$ be chosen as follows:

$$\beta_n = \frac{1 + \gamma'}{n + \gamma'} \quad \text{for any } \gamma' > \frac{\gamma - 1}{2}. \tag{12}$$

Then $\mathrm{Var}\{J_n\} = O(n^{-\gamma - 1} \ln^{\delta} n)$ and $J_n$ converges to $J$ with probability 1 for any $\gamma > 0$.

Theorem 1 shows that a suitable choice of the coefficients $\beta_n$ allows the secondary estimates to converge one order more rapidly than the primary ones. Moreover, it can be shown that no other choice of coefficients can improve the order of decrease of $\mathrm{Var}\{J_n\}$. Empirical estimation of $\mathrm{Var}\{J_n\}$ can be organized in parallel with the estimation of the integral $J$, which allows the accuracy of $J_n$ to be checked during the computation and the computation to be stopped as soon as the required accuracy is attained.

4 Adaptive integration methods

The underlying idea of the adaptive methods proposed below is as follows: the algorithm can accumulate its knowledge about the integrand in order to make the current easy approximation successively more precise and to decrease the variances of the integral estimates successively. To make this more formal, assume that along with sampling the random points $x_1, \dots, x_n, \dots$ a sequence of easy integrand approximations $f_1(x), \dots, f_n(x), \dots$ is constructed such that $f_n(x) = f_n(x; x_1, \dots, x_{n-1})$. (Thus, each approximation depends on the values of $f(x)$ at all points sampled earlier.) The primary estimates

$$S_n = \int_\Omega f_n(x)\,dx + \mu(\Omega)\bigl(f(x_n) - f_n(x_n)\bigr) \tag{13}$$

are a direct analogue of (10) used in the conventional control variate sampling method. But as $f_n(x)$ tends to $f(x)$, the variances of $S_n$ tend to zero, and one can expect (by Theorem 1) that the variances of the secondary estimates $J_n$ will decrease more rapidly than $O(n^{-1})$. Equation (13) defines the adaptive control variate sampling method.

Similarly, an adaptive importance sampling method can be introduced. Now assume that the integrand $f(x)$ and all approximations $f_n(x)$ are strictly positive (see footnote 1).
(Footnote 1: This assumption can usually be satisfied easily by adding a sufficiently large positive constant to the integrand.)

The primary estimates

$$S_n = \frac{f(x_n)}{p_n(x_n)}, \qquad p_n(x) = \frac{f_n(x)}{\int_\Omega f_n(x)\,dx}, \tag{14}$$

where $x_n$ is sampled over $\Omega$ according to the pdf $p_n(x)$, are a direct analogue of (11) used in the conventional importance sampling method. Again, as $f_n(x)$ tends to $f(x)$, $p_n(x)$ tends to the optimal pdf and the variances of $S_n$ tend to zero.

The reference to Theorem 1 a few lines above is valid only if the estimates $S_n$ from (13) or (14) are uncorrelated. Fortunately, they are. This follows directly from the relation

$$E_{x_n}\{S_n \mid x_1, \dots, x_{n-1}\} = J,$$

where the expectation is taken over $x_n$ with $x_1, \dots, x_{n-1}$ fixed, and from the fact that $S_n$ depends only on $x_i$, $i \le n$. Thus, Theorem 1 is fully applicable. It can easily be shown [6] that

$$D_n \le E_{x_1, \dots, x_{n-1}}\, \mu(\Omega) \int_\Omega \bigl(f(x) - f_n(x)\bigr)^2 dx$$

for adaptive control variate sampling and

$$D_n \le E_{x_1, \dots, x_{n-1}} \int_\Omega \frac{\bigl(f(x) - f_n(x)\bigr)^2}{p_n(x)}\,dx$$

for adaptive importance sampling, i.e. the variances $D_n$ tend to zero when $f_n(x)$ tends to $f(x)$ in $L_2(\Omega)$.

To use the proposed adaptive methods one should be able to construct the sequence of approximations $f_n(x)$ on the basis of the values $f(x_i)$ at the points already sampled. There are many ways to do this. The first and simplest one was proposed by Kulchitsky and Skrobotov [7] for the one-dimensional problem and is as follows: $f_n(x)$ is chosen as a piecewise-constant function on $\Omega$, with $f_n(x_i) = f(x_i)$ for $i < n$ and $f_n(x)$ constant between the points $x_i$. It was shown for adaptive importance sampling that in this case $\mathrm{Var}\{J_n\} = O(n^{-3})$, i.e. the method is much more rapid than conventional non-adaptive importance sampling. Further [?], this result was generalized to piecewise-polynomial approximations. For one-dimensional integrands of class $C^m(a)$, having continuous derivatives up to order $m$ all bounded by a constant $a$, adaptive integration methods can be constructed for which $\mathrm{Var}\{J_n\} = O(n^{-2m-1})$.

5 Successive bisection

The described approach to constructing approximations in the one-dimensional case can be generalized to multidimensional integration. For simplicity we now consider only integration over a hyperparallelepiped $\Omega$; the more general case is addressed at the end of this section. Consider the $n$-th integration step, when the points $x_1, \dots, x_{n-1}$ have been sampled and the integration domain $\Omega$ is divided into $N_n$ nonoverlapping subdomains $\Omega_j^n$, $j = 1, \dots, N_n$, and assume that $f_n(x)$ is a piecewise approximation of $f(x)$ over this partition of $\Omega$. We say that the sequence $f_n(x)$ approximates $f(x)$ with order $l > 0$ if there exists $C > 0$ such that

$$|f(x) - f_n(x)| \le C \bigl[\mu(\Omega_j^n)\bigr]^{l} \quad \forall x \in \Omega_j^n, \tag{15}$$

for all $n > 0$ and $j = 1, \dots, N_n$. Approximations that satisfy (15) can be constructed for a wide variety of functions. In particular, functions of the $s$-dimensional class $C^m(a)$ can be approximated with order $l$ up to $m/s$. Provided (15) holds, the variances $D_n$ are estimated (see footnote 2) through the quantity

$$M_n^{2l+1} = E_{x_1, \dots, x_{n-1}} \sum_{j=1}^{N_n} \bigl[\mu(\Omega_j^n)\bigr]^{2l+1},$$

which can be called the partition moment of order $(2l + 1)$ relative to $x_1, \dots, x_{n-1}$.

(Footnote 2: For adaptive importance sampling one should additionally assume that the $p_n(x)$ are uniformly bounded from below by some positive constant.)

A good generalization of the one-dimensional algorithm should provide both a rapid decrease of $M_n^{2l+1}$ and a relatively slow increase of $N_n$ (otherwise the computational load would be too large). One possible way to solve this minimax problem is the successive bisection method. Its idea is very simple: the new partition is obtained from the current one by bisecting the subdomain into which the newly sampled point $x_n$ falls. The bisection is carried out along the direction in which the subdomain to be divided is longest. Clearly this provides a very moderate increase of $N_n$ (namely $N_n = n$).
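The successive bisection idea, combined with the adaptive control variate estimate (13) and weights of the form $\beta_n = (1 + \gamma')/(n + \gamma')$ from Theorem 1, can be sketched in one dimension as follows. This is a minimal illustration under assumptions that are not taken from the paper: the domain is $[0, 1]$, $f_n$ is piecewise constant with each cell storing the value of $f$ at its midpoint (which costs two extra evaluations per step), $\gamma' = 2$ is chosen on the expectation that $D_n = O(n^{-2})$ for a smooth integrand, and the function and variable names are hypothetical.

```python
import bisect
import math
import random


def adaptive_control_variate(f, n_steps, gamma_prime=2.0, seed=0):
    """Adaptive control variate sampling on [0, 1] with successive bisection.

    Returns the secondary estimate J_n and the final number of cells.
    """
    rng = random.Random(seed)
    edges = [0.0, 1.0]        # sorted cell boundaries
    vals = [f(0.5)]           # assumed rule: cell value = f at cell midpoint
    approx_integral = vals[0]  # integral of the current approximation f_n
    j_est = 0.0               # overwritten at n = 1 because beta_1 = 1

    for n in range(1, n_steps + 1):
        x = rng.random()
        i = bisect.bisect_right(edges, x) - 1  # index of the cell containing x

        # Primary estimate (13) with mu = 1; f_n uses earlier points only.
        s_n = approx_integral + (f(x) - vals[i])

        # Secondary estimate (8) with weights of the form (12).
        beta = (1.0 + gamma_prime) / (n + gamma_prime)
        j_est = beta * s_n + (1.0 - beta) * j_est

        # Successive bisection: split the cell that received the new point.
        a, b = edges[i], edges[i + 1]
        mid = 0.5 * (a + b)
        left_val = f(0.5 * (a + mid))
        right_val = f(0.5 * (mid + b))
        approx_integral += 0.5 * (b - a) * (left_val + right_val) - (b - a) * vals[i]
        edges.insert(i + 1, mid)
        vals[i:i + 1] = [left_val, right_val]

    return j_est, len(vals)


estimate, n_cells = adaptive_control_variate(math.exp, 2000, seed=1)
print("estimate:", estimate, "exact:", math.e - 1.0, "cells:", n_cells)
```

The sorted list of boundaries stands in for the partition in this one-dimensional sketch; in several dimensions each cell would be a box bisected along its longest edge.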
Moreover, it can be shown [6] that for both adaptive control variate sampling and adaptive importance sampling $M_n^{2l+1} = O(n^{-2l})$. Thus, for functions of class $C^m(a)$ adaptive methods can be constructed for which $\mathrm{Var}\{J_n\} = O(n^{-1-2m/s})$. Bakhvalov [9] showed that this order is the best possible for any nondeterministic integration method that uses only $O(n)$ values of the integrand. Therefore, the proposed adaptive methods are optimal on $C^m(a)$.

Now consider the case when the integration domain is not a hyperparallelepiped. The conventional approach is to embed it into a larger hyperparallelepiped and to set $f(x)$ to zero outside the domain. On the one hand, in this case $f(x)$ loses its smoothness at the boundary points and cannot be approximated with high order over all subdomains of the partition. On the other hand, this is not a serious problem, because the boundary is a manifold of dimension $(s - 1)$ and therefore the fraction of subdomains where the approximation is poor decreases asymptotically as $n$ increases.

The most convenient implementation of successive bisection uses a binary tree of subdomains, each node of which contains the subdomain coordinates and the integral of the current approximation over it.

6 Discussion

Extensive numerical experiments were carried out [6], which confirmed the theoretical estimates of the convergence rate of the adaptive methods. The main conclusion of the experiments is as follows. For small dimensions, cubature rules are the most effective for integration, while for large dimensions the simplest Monte Carlo and quasi-Monte Carlo integration methods are the most effective (mainly due to their simplicity). But when the dimension is moderate (5-15), and especially when there are strong accuracy requirements, adaptive methods are the most useful and convenient. They are sequential, asymptotically more rapid than the simplest Monte Carlo, and allow the accuracy to be checked easily during the computation, in contrast to quasi-Monte Carlo methods and cubature rules.

Methods of dividing the integration domain other than successive bisection can be used. For example, the division can be fully deterministic if one always bisects the largest subdomain. The asymptotic behaviour of the partition moments obtained for successive bisection is still valid.

Adaptive sequential Monte Carlo methods have also been developed for sequences of global integrand approximations [6]. In these methods $f(x)$ was approximated by truncated Fourier series over some orthogonal basis. For the classes of functions expandable into rapidly convergent trigonometric Fourier series and Fourier-Haar series, optimal or almost optimal rates of convergence were obtained.

References

[1] S.M.Ermakov, G.A.Mikhailov, Course of Statistical Modelling, Moscow, Nauka, 1976 (in Russian).
[2] I.M.Sobol, Numerical Monte Carlo Methods, Moscow, Nauka, 1973 (in Russian).
[3] W.H.Press, S.A.Teukolsky, W.T.Vetterling, B.P.Flannery, Numerical Recipes in C, 2nd edition, Cambridge University Press, 1992.
[4] J.H.Halton, Sequential Monte Carlo, Proc. Cambridge Philos. Soc., vol.58, No.1, 1962, pp.57-78.
[5] M.L.Korenevsky, Development of Adaptive-Statistical Methods for Definite Integrals Evaluation, Ph.D. thesis, Saint-Petersburg State Technical University, 2000 (in Russian).
[6] V.M.Ivanov, M.L.Korenevsky, Adaptive-Statistical Methods of Numerical Integration, Saint-Petersburg State Polytechnical University, 2003 (in Russian).
[7] O.Yu.Kulchitsky, S.V.Skrobotov, Adaptive algorithm of Monte Carlo method for computing integral characteristics of complex systems, Automatics and Telemechanics, No.6, 1986, pp.88-95 (in Russian).
[8] V.M.Ivanov, M.L.Korenevskii, O.Yu.Kul'chitskii, Adaptive Schemes for the Monte Carlo Method of an Enhanced Accuracy, Doklady Mathematics (Proceedings of the Russian Academy of Sciences), vol. 60, No.1, 1999, pp.90-93.
[9] N.S.Bakhvalov, On the approximate calculation of multiple integrals, Bulletin of Moscow State University, No.4, 1959, pp.3-18 (in Russian).