Sequential Monte Carlo and adaptive numerical integration

VLADIMIR M. IVANOV, MAXIM L. KORENEVSKY
Department of Computer Science
Saint-Petersburg State Polytechnical University
195..., Politekhnicheskaya, 29, Saint-Petersburg
RUSSIA

Abstract: This paper presents the theory of the Sequential Monte Carlo method and its application to the development of adaptive statistical algorithms for multidimensional integration. A convergence theorem for Sequential Monte Carlo is given. Several conventional variance reduction methods are used to develop adaptive integration methods. These methods accumulate data about the integrand during execution and use it to accelerate the convergence of the integral estimates. Such an approach provides optimal convergence rates for many important function classes while retaining the main merits of conventional Monte Carlo integration methods.

Keywords: Sequential Monte Carlo, adaptive integration, successive bisection.

1 Introduction

Monte Carlo is one of the most popular methods for multidimensional integration. It is very simple to implement, naturally sequential, and allows accuracy to be checked during computation. Another important merit of the Monte Carlo method is that its rate of convergence does not depend on the dimension of the integral. In fact, the integration error is of order $O(n^{-0.5})$, where $n$ is the number of sample points at which the integrand is evaluated. This is very different from traditional cubature rules, whose rate of convergence deteriorates rapidly as the dimension increases.

Integration accuracy can be substantially increased by various variance reduction methods, e.g. importance sampling, correlated and control variate sampling, antithetic variates, stratified sampling, etc. [1], [2], [3]. Many of them use available a priori data about the integrand to attain more rapid convergence. However, data of this kind can also be accumulated during the computation and effectively applied to accelerate the convergence of the integration process, i.e. the algorithm can adapt to the features of the integrand.
The Sequential Monte Carlo method [4] provides a convenient framework for the design and investigation of such adaptive integration methods.

The rest of the paper is organized as follows. Section 2 gives a brief description of Monte Carlo integration and some variance reduction techniques. The Sequential Monte Carlo method is stated and investigated in section 3. Some general approaches to developing adaptive integration methods are outlined in section 4. Section 5 describes the successive bisection algorithm and adaptive methods for the integration of smooth functions. Some numerical experiments are discussed in section 6.

2 Monte Carlo integration

The problem is to evaluate the integral

$$J = \int_{\Omega} f(x)\,dx \tag{1}$$

over a closed bounded domain $\Omega$ of the $s$-dimensional Euclidean space $R^s$. The simplest (crude) Monte Carlo estimate of $J$ is

$$J_n = \frac{\mu(\Omega)}{n} \sum_{k=1}^{n} f(x_k), \tag{2}$$

where the $x_k$ are independent random variables uniformly distributed over $\Omega$ and $\mu(\Omega)$ is the measure of $\Omega$. The estimate $J_n$ is unbiased and its variance is

$$\mathrm{Var}\{J_n\} = \frac{1}{n}\left[\mu(\Omega)\int_{\Omega} f^2(x)\,dx - J^2\right] = \frac{\sigma^2}{n}. \tag{3}$$

This implies (by the Chebyshev inequality) that, for any predefined confidence level, the integration error decreases as $O(n^{-0.5})$, and this rate does not depend on the dimension $s$. The integration process can be organized in a sequential manner, and a simple accuracy check based on the sample variance can be used to stop the computation as soon as the required accuracy is attained.

Numerous variance reduction techniques have been developed to reduce the multiplier $\sigma^2$ in (3). The idea of control variate sampling is to integrate the principal part of $f(x)$ analytically and to apply the Monte Carlo estimate to the remainder only. Indeed, let $g(x)$ be an easily integrable approximation of $f(x)$ (a so-called easy approximation). Then $J$ can be estimated as follows:

$$J_n = \int_{\Omega} g(x)\,dx + \frac{\mu(\Omega)}{n} \sum_{k=1}^{n} \bigl(f(x_k) - g(x_k)\bigr). \tag{4}$$

Evidently, $\mathrm{Var}\{J_n\}$ is given by (3) with $f(x)$ replaced by $f(x) - g(x)$, and it tends to zero as $g(x) \to f(x)$. Therefore all the a priori data about $f(x)$ can be used to construct the easy approximation $g(x)$ carefully.

Importance sampling is based on the observation that one needs to sample more points in those parts of $\Omega$ where $f(x)$ is greater in absolute value, i.e. sampling should be controlled by some probability density function (pdf) $p(x) > 0$. This induces a more general form of the Monte Carlo estimate:

$$J_n = \frac{1}{n} \sum_{k=1}^{n} \frac{f(x_k)}{p(x_k)}, \tag{5}$$

where the $x_k$ are independent random variables with pdf $p(x)$. The variance of this estimate is

$$\mathrm{Var}\{J_n\} = \frac{1}{n}\left[\int_{\Omega} \frac{f^2(x)}{p(x)}\,dx - J^2\right] = \frac{\sigma^2}{n}. \tag{6}$$

It is well known [2] that the least value of $\sigma^2$ is attained when $p(x)$ is proportional to $|f(x)|$. Provided that $f(x)$ is of constant sign, this $p(x)$ even reduces $\sigma^2$ to zero. Although an exact choice of $p(x)$ in this manner is not possible, an approximate choice is often quite acceptable. So, let $g(x)$ be an easily integrable approximation of $|f(x)|$. Then

$$p(x) = \frac{g(x)}{\int_{\Omega} g(x)\,dx} = \frac{g(x)}{J_g}$$

gives the importance sampling pdf and

$$J_n = \frac{J_g}{n} \sum_{k=1}^{n} \frac{f(x_k)}{g(x_k)} \tag{7}$$

gives the importance sampling integral estimate. Again, a priori data about the integrand can be used to construct the importance sampling pdf.

Both control variate sampling and importance sampling can be considered special cases of a general approach proposed by J.H. Halton [4] and called Sequential Monte Carlo.

3 Sequential Monte Carlo

The Sequential Monte Carlo method operates on two types of unbiased integral estimates, primary estimates $S_n$ and secondary estimates $J_n$, related as follows:

$$J_n = \beta_n S_n + (1 - \beta_n) J_{n-1}, \tag{8}$$

where $0 \le \beta_n \le 1$, $\beta_1 = 1$ are some numerical factors.
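As a rough illustration of recursion (8), the following Python sketch combines primary estimates into secondary ones. The integrand $f(x) = x^2$ on $[0, 1]$, the sample size, the seed and all function names are assumptions made for this example only. With $\beta_n = 1/n$ and crude primary estimates $S_n = \mu(\Omega) f(x_n)$, the recursion is just a running mean, so the final value should coincide with the crude estimate (2).

```python
import random


def secondary_estimates(primaries, betas):
    """Yield J_n = beta_n * S_n + (1 - beta_n) * J_{n-1} for n = 1, 2, ..."""
    j = 0.0  # the value of J_0 is irrelevant because beta_1 = 1
    for s_n, beta_n in zip(primaries, betas):
        j = beta_n * s_n + (1.0 - beta_n) * j
        yield j


def f(x):
    # Hypothetical test integrand; its exact integral over [0, 1] is 1/3.
    return x * x


rng = random.Random(0)
n = 20000

# Crude primary estimates S_k = mu * f(x_k) with mu = 1 on the unit interval.
primaries = [f(rng.random()) for _ in range(n)]
betas = [1.0 / k for k in range(1, n + 1)]

estimates = list(secondary_estimates(primaries, betas))
crude_estimate = sum(primaries) / n  # estimate (2) computed directly

print("sequential:", estimates[-1], "crude:", crude_estimate)
```

Because each $J_n$ is available as soon as $S_n$ is computed, a stopping rule based on an empirical variance estimate could be checked inside the same loop.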
Both primary and secondary estimates depend on the random points $x_n$ sampled over $\Omega$ according to some pdfs (which may depend on $n$). The secondary estimates $J_n$ can also be expressed as

$$J_n = \sum_{k=1}^{n} \alpha_k^{(n)} S_k, \qquad \alpha_k^{(n)} = \beta_k \prod_{j=k+1}^{n} (1 - \beta_j).$$

It is easy to see that crude Monte Carlo is the special case of Sequential Monte Carlo corresponding to

$$S_n = \mu(\Omega) f(x_n), \tag{9}$$

while control variate sampling corresponds to

$$S_n = \int_{\Omega} g(x)\,dx + \mu(\Omega)\bigl(f(x_n) - g(x_n)\bigr) \tag{10}$$

and importance sampling corresponds to

$$S_n = \frac{f(x_n)}{p(x_n)}. \tag{11}$$

In all these cases $S_n$ depends only on $x_n$, so the primary estimates are independent of each other, and $\alpha_k^{(n)} = \beta_n = 1/n$.

One may expect that the use of dependent primary estimates provides some benefits, and this is indeed true. The most useful results can be obtained for the case of dependent but uncorrelated primary estimates, i.e. $E\{(S_i - J)(S_j - J)\} = 0$ for $i \ne j$. In this case the variance of the secondary estimates is

$$\mathrm{Var}\{J_n\} = \sum_{k=1}^{n} \bigl[\alpha_k^{(n)}\bigr]^2 D_k, \qquad D_k = E\{|S_k - J|^2\}.$$
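The three special cases (9)-(11) can be compared numerically. The sketch below is only an illustration under assumed inputs: the integrand $f(x) = e^x$ on $[0, 1]$, the easy approximation $g(x) = 1 + x$ (so $\int g = 1.5$), the sample size, the seed and all function names are choices made here, not taken from the paper. It draws independent primary estimates of each kind, averages them (i.e. $\beta_n = 1/n$), and reports the sample variance of the primary estimates as an empirical stand-in for $D_k$.

```python
import math
import random
import statistics

G_INTEGRAL = 1.5  # integral of g(x) = 1 + x over [0, 1]


def f(x):
    return math.exp(x)  # hypothetical integrand, exact integral e - 1


def g(x):
    return 1.0 + x  # assumed easy approximation of f


def primary_crude(rng):
    # Estimate (9) with mu = 1.
    return f(rng.random())


def primary_control_variate(rng):
    # Estimate (10): exact integral of g plus a Monte Carlo correction.
    x = rng.random()
    return G_INTEGRAL + (f(x) - g(x))


def primary_importance(rng):
    # Estimate (11) with p(x) = g(x) / G_INTEGRAL, sampled by inverting
    # the CDF F(x) = (x + x^2 / 2) / 1.5, which gives x = -1 + sqrt(1 + 3u).
    u = rng.random()
    x = -1.0 + math.sqrt(1.0 + 3.0 * u)
    return f(x) * G_INTEGRAL / g(x)


def run(primary, n, seed):
    """Return (mean, sample variance) of n independent primary estimates."""
    rng = random.Random(seed)
    samples = [primary(rng) for _ in range(n)]
    return statistics.mean(samples), statistics.variance(samples)


results = {
    "crude": run(primary_crude, 20000, 1),
    "control variate": run(primary_control_variate, 20000, 2),
    "importance": run(primary_importance, 20000, 3),
}

for name, (mean, var) in results.items():
    print(f"{name:16s} estimate = {mean:.5f}  primary variance = {var:.5f}")
```

For this particular pair of $f$ and $g$, both variance reduction variants should show a noticeably smaller primary variance than the crude estimate, which is the effect described by (3) and (6).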

Here the $D_k$ are the variances of the $S_k$ (the expectation is taken over the random variables $x_1, \dots, x_k$ on which $S_k$ depends). The following theorem gives some results on the convergence of Sequential Monte Carlo.

Theorem 1 [5, 6]. Let the $S_n$ be uncorrelated, let their variances satisfy $D_n = O(n^{-\gamma} \ln^{\delta} n)$ for some constants $\gamma, \delta \ge 0$, and let the coefficients $\beta_n$ be chosen as follows:

$$\beta_n = \frac{1 + \gamma'}{n + \gamma'} \quad \text{for any } \gamma' > \frac{\gamma - 1}{2}. \tag{12}$$

Then $\mathrm{Var}\{J_n\} = O(n^{-\gamma - 1} \ln^{\delta} n)$, and $J_n$ converges to $J$ with probability 1 for any $\gamma > 0$.

Theorem 1 shows that a suitable choice of the coefficients $\beta_n$ allows the secondary estimates to converge one order faster than the primary ones. Moreover, it can be shown that no other choice of coefficients can improve the order of decrease of $\mathrm{Var}\{J_n\}$. Empirical estimation of $\mathrm{Var}\{J_n\}$ can be organized in parallel with the estimation of the integral $J$, which allows the accuracy of $J_n$ to be checked during the computation and the computation to be stopped as soon as the required accuracy is attained.

4 Adaptive integration methods

The underlying idea of the adaptive methods proposed below is as follows: the algorithm can accumulate knowledge about the integrand in order to make the current easy approximation successively more precise and thus successively decrease the variances of the integral estimates.

To make this more formal, assume that along with sampling the random points $x_1, \dots, x_n, \dots$ a sequence of easy approximations of the integrand, $f_1(x), \dots, f_n(x), \dots$, is constructed such that $f_n(x) = f_n(x; x_1, \dots, x_{n-1})$. (Thus, each approximation depends on the values of $f(x)$ at all points sampled earlier.) The primary estimates

$$S_n = \int_{\Omega} f_n(x)\,dx + \mu(\Omega)\bigl(f(x_n) - f_n(x_n)\bigr) \tag{13}$$

are a direct analogue of (10), used for the conventional control variate sampling method. But as $f_n(x)$ tends to $f(x)$, the variances of $S_n$ tend to zero, and one can expect (by Theorem 1) that the variances of the secondary estimates $J_n$ will decrease more rapidly than $O(n^{-1})$. Equation (13) defines the adaptive control variate sampling method.

An adaptive importance sampling method can be introduced similarly. Now assume that the integrand $f(x)$ and all approximations $f_n(x)$ are strictly positive (see footnote 1).
The primary estimates

$$S_n = \frac{f(x_n)}{p_n(x_n)}, \qquad p_n(x) = \frac{f_n(x)}{\int_{\Omega} f_n(x)\,dx}, \tag{14}$$

where $x_n$ is sampled over $\Omega$ according to the pdf $p_n(x)$, are a direct analogue of (11), used for the conventional importance sampling method. Again, as $f_n(x)$ tends to $f(x)$, $p_n(x)$ tends to the optimal pdf and the variances of $S_n$ tend to zero.

Footnote 1: This assumption can usually be satisfied easily by adding a sufficiently large positive constant to the integrand.

The reference to Theorem 1 a few lines above is valid only if the estimates $S_n$ from (13) or (14) are uncorrelated. Fortunately, they are. This follows directly from the relation

$$E_{x_n}\{S_n \mid x_1, \dots, x_{n-1}\} = J,$$

where the expectation is taken over $x_n$ with $x_1, \dots, x_{n-1}$ fixed, and from the fact that $S_n$ depends only on $x_i$, $i \le n$. Thus, Theorem 1 is fully applicable. It can easily be shown [6] that

$$D_n \le E_{x_1, \dots, x_{n-1}}\, \mu(\Omega) \int_{\Omega} \bigl(f(x) - f_n(x)\bigr)^2\,dx$$

for adaptive control variate sampling and

$$D_n \le E_{x_1, \dots, x_{n-1}} \int_{\Omega} \frac{\bigl(f(x) - f_n(x)\bigr)^2}{p_n(x)}\,dx$$

for adaptive importance sampling, i.e. the variances $D_n$ tend to zero when $f_n(x)$ tends to $f(x)$ in $L_2(\Omega)$.

To use the proposed adaptive methods one should be able to construct the sequence of approximations $f_n(x)$ from the values $f(x_i)$ at the points already sampled. There are many ways to do this. The first and simplest one was proposed by Kulchitsky and Skrobotov [7] for the one-dimensional problem and is as follows: $f_n(x)$ is chosen as a piecewise-constant function on $\Omega$ such that $f_n(x_i) = f(x_i)$ for $i < n$ and $f_n(x)$ is constant between the points $x_i$. It was shown for adaptive importance sampling that in this case $\mathrm{Var}\{J_n\} = O(n^{-3})$, i.e. the method is much faster than conventional non-adaptive importance sampling. Further [?], this result was generalized to piecewise-polynomial approximations. For one-dimensional integrands of class $C^m(a)$, i.e. those having continuous derivatives up to order $m$, all bounded by the constant $a$, adaptive integration methods can be constructed for which $\mathrm{Var}\{J_n\} = O(n^{-2m-1})$.

5 Successive bisection

The approach described above for constructing approximations in the one-dimensional case can be generalized to multidimensional integration. For simplicity we now consider only integration over a hyperparallelepiped; the more general case is addressed at the end of this section.

Consider the $n$-th integration step, when the points $x_1, \dots, x_{n-1}$ have been sampled and the integration domain $\Omega$ is divided into $N_n$ nonoverlapping subdomains $\Omega_j$, $j = 1, \dots, N_n$, and assume that $f_n(x)$ is a piecewise approximation of $f(x)$ over this partition of $\Omega$. We say that the sequence $f_n(x)$ approximates $f(x)$ with order $l > 0$ if there exists $C > 0$ such that

$$|f(x) - f_n(x)| \le C\,[\mu(\Omega_j)]^{l} \quad \forall x \in \Omega_j, \tag{15}$$

for all $n > 0$ and $j = 1, \dots, N_n$. Approximations that satisfy (15) can be constructed for a wide variety of functions. In particular, functions of the $s$-dimensional class $C^m(a)$ can be approximated with order $l$ up to $m/s$.

Provided (15) holds, the variances $D_n$ are estimated through the quantity (see footnote 2)

$$M_n^{2l+1} = E_{x_1, \dots, x_{n-1}} \sum_{j=1}^{N_n} [\mu(\Omega_j)]^{2l+1},$$

which can be called the partition moment of order $(2l+1)$ relative to $x_1, \dots, x_{n-1}$ (the superscript of $M_n$ denotes the order of the moment, not a power).

Footnote 2: For adaptive importance sampling one should additionally assume that the $p_n(x)$ are uniformly bounded from below by some positive constant.

Now, a good generalization of the one-dimensional algorithm should provide both a rapid decrease of $M_n^{2l+1}$ and a relatively slow increase of $N_n$ (otherwise the computational load would be too large). One possible way to solve this minimax problem is the successive bisection method. Its idea is very simple: the new partition is obtained from the current one by bisecting the subdomain into which the newly sampled point $x_n$ falls. The bisection is carried out along the direction in which the subdomain to be divided is longest. Clearly this provides a very moderate increase of $N_n$ (namely $N_n = n$).
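The following Python sketch shows one way successive bisection could be combined with the adaptive control variate estimate (13) and the coefficients (12). It is a minimal illustration, not the authors' implementation, and several things in it are assumptions: the domain is the unit cube $[0, 1]^s$ (so $\mu(\Omega) = 1$), the approximation is piecewise constant with $f_1 \equiv 0$, both halves of a bisected box simply take the value $f(x_n)$, the decay exponent is taken as $\gamma = 2/s$ with $\gamma' = \gamma$, and the test integrand and all names are hypothetical.

```python
import math
import random


class Box:
    """Node of the binary tree of subdomains (a leaf until it is bisected)."""

    def __init__(self, lo, hi, value, vol):
        self.lo, self.hi = lo, hi      # opposite corners of the box
        self.value = value             # constant value of f_n on the box
        self.vol = vol                 # measure of the box
        self.axis = self.mid = None    # split axis and position, set by bisect()
        self.left = self.right = None


def find_leaf(root, x):
    """Descend from the root to the leaf box that contains the point x."""
    node = root
    while node.left is not None:
        node = node.left if x[node.axis] < node.mid else node.right
    return node


def bisect(leaf, value):
    """Halve the leaf along its longest side; both halves get the given value."""
    sides = [h - l for l, h in zip(leaf.lo, leaf.hi)]
    axis = sides.index(max(sides))
    mid = 0.5 * (leaf.lo[axis] + leaf.hi[axis])
    hi_left, lo_right = list(leaf.hi), list(leaf.lo)
    hi_left[axis] = lo_right[axis] = mid
    leaf.axis, leaf.mid = axis, mid
    leaf.left = Box(leaf.lo, hi_left, value, 0.5 * leaf.vol)
    leaf.right = Box(lo_right, leaf.hi, value, 0.5 * leaf.vol)


def count_leaves(node):
    """Number of subdomains in the current partition."""
    if node.left is None:
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


def adaptive_control_variate(f, s, n_steps, rng):
    """Return the secondary estimate J_n and the tree after n_steps samples."""
    root = Box([0.0] * s, [1.0] * s, 0.0, 1.0)   # f_1 = 0 on the unit cube
    approx_integral = 0.0                        # integral of the current f_n
    gamma = 2.0 / s                              # assumed exponent, gamma' = gamma
    j = 0.0
    for n in range(1, n_steps + 1):
        x = [rng.random() for _ in range(s)]
        leaf = find_leaf(root, x)
        fx = f(x)
        primary = approx_integral + (fx - leaf.value)    # estimate (13), mu = 1
        beta = (1.0 + gamma) / (n + gamma)               # coefficients (12)
        j = beta * primary + (1.0 - beta) * j            # recursion (8)
        # Only now refine f_n, so that S_n used f_n built from x_1..x_{n-1}.
        approx_integral += (fx - leaf.value) * leaf.vol
        bisect(leaf, fx)
    return j, root


def integrand(x):
    # Hypothetical smooth test function; exact integral over [0,1]^2 is (e-1)^2.
    return math.exp(x[0] + x[1])


estimate, tree = adaptive_control_variate(integrand, 2, 4000, random.Random(1))
exact = (math.e - 1.0) ** 2
print("estimate:", estimate, "exact:", exact, "leaves:", count_leaves(tree))
```

The tree keeps the lookup of the subdomain containing $x_n$ cheap, and the integral of the current approximation is updated incrementally rather than recomputed. Assigning $f(x_n)$ to both halves is the simplest rule that keeps every stored value tied to a point of the parent box; a more careful scheme could keep older values where they are still local.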
Moreover, it can be shown [6] that for both adaptive control variate sampling and adaptive importance sampling $M_n^{2l+1} = O(n^{-2l})$. Thus, by Theorem 1 with $\gamma = 2l$ and $l = m/s$, for functions of class $C^m(a)$ adaptive methods can be constructed for which $\mathrm{Var}\{J_n\} = O(n^{-1-2m/s})$. Bakhvalov [9] showed that this order is the best possible for any nondeterministic integration method that uses only $O(n)$ values of the integrand. Therefore, the proposed adaptive methods are optimal on $C^m(a)$.

What if the integration domain is not a hyperparallelepiped? The conventional approach is to embed it in a larger hyperparallelepiped and set $f(x)$ to zero outside the domain. On the one hand, in this case $f(x)$ loses its smoothness at the boundary points and cannot be approximated with high order over all subdomains of the partition. On the other hand, this is not a serious problem, because the boundary is a manifold of dimension $(s - 1)$, and therefore the fraction of subdomains where the approximation is poor decreases asymptotically as $n$ increases.

The most convenient implementation of successive bisection uses a binary tree of subdomains, each node of which contains the subdomain coordinates and the integral of the current approximation over it.

6 Discussion

Extensive numerical experiments were carried out [6], and they confirmed the theoretical estimates of the convergence rates of the adaptive methods. The main conclusion of the experiments is as follows. For small dimensions cubature rules are the most effective integration methods, while for large dimensions the simplest Monte Carlo and quasi-Monte Carlo methods are the most effective (mainly due to their simplicity). But when the dimension is moderate (5-15), and especially when the accuracy requirements are strict, adaptive methods are the most useful and convenient. They are sequential, asymptotically faster than the simplest Monte Carlo, and allow accuracy to be checked easily during the computation, in contrast to quasi-Monte Carlo methods and cubature rules.

Methods of dividing the integration domain other than successive bisection can also be used. For example, the division can be made fully deterministic by always bisecting the largest subdomain. The asymptotic behaviour of the partition moments obtained for successive bisection remains valid.

Adaptive sequential Monte Carlo methods have also been developed for sequences of global approximations of the integrand [6]. In these methods $f(x)$ was approximated by a truncated Fourier series over some orthogonal basis. For the classes of functions expandable into rapidly convergent trigonometric Fourier series and Fourier-Haar series, optimal or almost optimal rates of convergence were obtained.

References

[1] S.M. Ermakov, G.A. Mikhailov, Course of Statistical Modelling, Moscow, Nauka, 1976 (in Russian).
[2] I.M. Sobol, Numerical Monte Carlo Methods, Moscow, Nauka, 1973 (in Russian).
[3] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C, 2nd edition, Cambridge University Press, 1992.
[4] J.H. Halton, Sequential Monte Carlo, Proc. Cambridge Philos. Soc., vol. 58, No. 1, 1962, pp. 57-78.
[5] M.L. Korenevsky, Development of Adaptive-Statistical Methods for Definite Integrals Evaluation, Ph.D. thesis, Saint-Petersburg State Technical University, 2000 (in Russian).
[6] V.M. Ivanov, M.L. Korenevsky, Adaptive-Statistical Methods of Numerical Integration, Saint-Petersburg State Polytechnical University, 2003 (in Russian).
[7] O.Yu. Kulchitsky, S.V. Skrobotov, Adaptive algorithm of Monte Carlo method for computing integral characteristics of complex systems, Automatics and Telemechanics, No. 6, 1986, pp. 88-95 (in Russian).
[8] V.M. Ivanov, M.L. Korenevskii, O.Yu. Kul'chitskii, Adaptive Schemes for the Monte Carlo Method of an Enhanced Accuracy, Doklady Mathematics (Proceedings of the Russian Academy of Sciences), vol. 60, No. 1, 1999, pp. 90-93.
[9] N.S. Bakhvalov, On the approximate calculation of multiple integrals, Bulletin of Moscow State University, No. 4, 1959, pp. 3-18 (in Russian).