arxiv:math/ v1 [math.pr] 3 Jul 2006


The Annals of Applied Probability, 2006, Vol. 16. © Institute of Mathematical Statistics, 2006.

OPTIMAL SCALING FOR PARTIALLY UPDATING MCMC ALGORITHMS

By Peter Neal and Gareth Roberts

University of Manchester and Lancaster University

In this paper we shall consider optimal scaling problems for high-dimensional Metropolis-Hastings algorithms where updates can be chosen to be lower dimensional than the target density itself. We find that the optimal scaling rule for the Metropolis algorithm, which tunes the overall algorithm acceptance rate to be 0.234, holds for the so-called Metropolis-within-Gibbs algorithm as well. Furthermore, the optimal efficiency obtainable is independent of the dimensionality of the update rule. This has important implications for the MCMC practitioner since high-dimensional updates are generally computationally more demanding, so that lower-dimensional updates are to be preferred. Similar results with rather different conclusions are given for so-called Langevin updates. In this case, it is found that high-dimensional updates are frequently most efficient, even taking into account computing costs.

1. Introduction. There exist large classes of Markov chain Monte Carlo (MCMC) algorithms for exploring high-dimensional (target) distributions. All methods construct Markov chains with invariant distribution given by the target distribution of interest. However, for the purposes of maximizing the efficiency of the algorithm for Monte Carlo use, it is imperative to design algorithms which give rise to Markov chains which mix sufficiently rapidly. Since all Metropolis-Hastings algorithms require the specification of a proposal distribution, these implementational questions can all be phrased in terms of proposal choice. This paper is about two of these choices: the scaling and dimensionality of the proposal.
We shall work throughout with continuous distributions, although it is envisaged that more general distributions might be amenable to similar study.

Received June 2003; revised April 2005. AMS 2000 subject classifications. Primary 60F05; secondary 65C05. Key words and phrases. Metropolis algorithm, Langevin algorithm, Markov chain Monte Carlo, weak convergence, optimal scaling. This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Probability, 2006, Vol. 16. This reprint differs from the original in pagination and typographic detail.

One important decision the MCMC user has to make in a $d$-dimensional problem concerns the dimensionality of the proposed jump. For instance, two extreme types of algorithm are the following: propose a fully $d$-dimensional update of the current state (according to a density with respect to $d$-dimensional Lebesgue measure) and accept or reject according to the Metropolis-Hastings acceptance probabilities; or, for each of the $d$ components in turn, update that component conditional on all the others according to some Markov chain which preserves the appropriate conditional distribution. The most widely used example is the $d$-dimensional Metropolis algorithm, in one extreme, and the Gibbs sampler or some kind of Metropolis-within-Gibbs scheme in the other. In between these two options, there lie many intermediate strategies. An important question is whether any general statements can be made about algorithm choice in this context, leading to practical advice for MCMC practitioners.

In this paper we concentrate on two types of algorithm: Metropolis and Metropolis adjusted Langevin algorithms (MALA). We consider strategies which update a fixed proportion, $c$, of components at each iteration, and consider the efficiency of the algorithms constructed asymptotically as $d \to \infty$. In order to do this, we shall extend the methodology developed in [6, 7] to our context. The analysis produces clear cut results which suggest that, while full-dimensional Langevin updates are worthwhile, full-dimensional Metropolis ones are asymptotically no better than smaller dimensional updating schemes, so that the possible extra computational overhead associated with their implementation always leads to their being suboptimal in practice.

All this is initially done in the context of target densities consisting of independent components, and this leads naturally to the question of whether this simple picture is altered in any way in the presence of dependence.
Although this is difficult to explore in full generality, we do later consider this problem in the context of a class of Gaussian dependent target distributions where explicit results can be shown, and where the conclusions from the independent component case remain valid.

It is now well recognized that highly correlated target distributions lead to slow mixing for updating schemes where $c_d < 1$ (see, e.g., [5, 9]). However, it is also known that spherically symmetric proposal distributions in $d$ dimensions on highly correlated target densities can lead to slow mixing, since the proposal distribution is inappropriately shaped to explore the target (see [8]). So for highly correlated target distributions, both high and small dimensional updating strategies perform poorly. We shall explore these two competing algorithms in a Gaussian context where explicit calculations are possible. Our work shows that, for $c > 0$, for the Metropolis algorithm, these two slowing down effects are the same. In particular, this implies that the commonly used strategy of getting round high correlation problems by

block updating using Metropolis has no justification. In contrast, for MALA, full-dimensional updating, $c = 1$, is shown to be optimal.

The paper is structured as follows. In Section 2 we outline the MCMC setup. In Sections 3 and 4 we tackle the problem of scaling the variance of the proposal distribution for RWM-within-Gibbs (random walk Metropolis-within-Gibbs) and MALA-within-Gibbs (Metropolis adjusted Langevin-within-Gibbs), respectively. The approach taken is similar to that used for the full RWM/MALA algorithms: we obtain weak convergence to an appropriate Langevin diffusion as the dimension of the state space, $d$, converges to infinity. The results of Sections 3 and 4 are proved for a sequence of $d$-dimensional product densities of the form
\[
\pi_d(x^d) = \prod_{i=1}^{d} f(x^d_i) \tag{1.1}
\]
for some suitably smooth probability density $f(\cdot)$. In both Sections 3 and 4, for each fixed, one-dimensional component of $\{X^d; d \ge 1\}$, the one-dimensional process converges weakly to an appropriate Langevin diffusion. The aim therefore is to scale the proposal variances so as to maximize the speed of the limiting Langevin diffusion. Since the components of $\{X^d; d \ge 1\}$ are independent and identically distributed, we shall prove the results for the first component, $\{X^d_1; d \ge 1\}$.

It is at least plausible that the picture will be very different when considering dependent densities. However, through theoretical analysis in a limiting case where results can be obtained, and in simulations for more general cases, we find that the general conclusions which can be derived for densities of the form (1.1) extend some way toward dependent densities. To this end, in Section 5, we consider RWM/MALA-within-Gibbs for the exchangeable normal $X^d \sim N(0, \Sigma^d_\rho)$, where $\sigma^d_{ii} = 1$, $1 \le i \le d$, and $\sigma^d_{ij} = \sigma^d_{ji} = \rho$, $1 \le i < j \le d$. (Throughout the paper, we adopt the notation that $\Sigma$ will be used for variance matrices, while elements of matrices will be denoted by $\sigma$, both conventions using appropriate sub- and superscripts.) All the proofs of the theorems in Sections 3-5 are given in the Appendix.
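Then in Section 6, with the aid of a simulation study, we demonstrate that the asymptotic results are practically useful for finite $d$.

2. Algorithms and preliminaries. For RWM/MALA, we are interested in $(d, \sigma_d^2)$, the dimension of the state space, $d$, and the proposal variance $\sigma_d^2$, where the proposal for the $i$th component is given by
\[
\begin{aligned}
Y^d_i &= x^d_i + \sigma_d Z_i, && 1 \le i \le d, \quad \text{RWM},\\
Y^d_i &= x^d_i + \sigma_d Z_i + \frac{\sigma_d^2}{2} \frac{\partial}{\partial x_i} \log \pi_d(x^d), && 1 \le i \le d, \quad \text{MALA},
\end{aligned}
\]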

and the $\{Z_i\}$'s are independent and identically distributed according to $Z \sim N(0,1)$. For both RWM and MALA, the maximum speed of the diffusion can be obtained by taking the proposal variance to be of the form $\sigma_d^2 = l^2 d^{-s}$ for some $l > 0$ and $s > 0$. (For RWM, $s = 1$ and for MALA, $s = 1/3$.)

Now for RWM/MALA-within-Gibbs, the basic idea is to choose $c_d d$ components at random at each iteration, attempting to update them jointly according to the RWM/MALA mechanism, respectively. We sometimes write $\sigma_d = \sigma_{d,c_d}$, where $c_d$ represents the proportion of components updated at each iteration. Thus, the two algorithms propose new values as follows:
\[
\begin{aligned}
Y^d_i &= x^d_i + \chi^d_i \sigma_{d,c_d} Z_i, && 1 \le i \le d, \quad \text{RWM-within-Gibbs},\\
Y^d_i &= x^d_i + \chi^d_i \Big\{ \sigma_{d,c_d} Z_i + \frac{\sigma^2_{d,c_d}}{2} \frac{\partial}{\partial x_i} \log \pi_d(x^d) \Big\}, && 1 \le i \le d, \quad \text{MALA-within-Gibbs},
\end{aligned} \tag{2.1}
\]
where the $\{Z_i\}$'s are independent and identically distributed according to $Z \sim N(0,1)$ and the $\{\chi^d_i\}$ are chosen as follows. Independently of the $Z_i$'s, we select at random a subset $A$, say, of size $c_d d$ from $\{1, 2, \ldots, d\}$, setting $\chi^d_i = 1$ if $i \in A$, and $\chi^d_i = 0$ otherwise. The proposal $Y^d$ is then accepted according to the usual Metropolis-Hastings acceptance probability:
\[
\alpha^{c_d}_d(x^d, Y^d) = 1 \wedge \frac{\pi_d(Y^d)\, q(Y^d, x^d)}{\pi_d(x^d)\, q(x^d, Y^d)}, \tag{2.2}
\]
where $q(\cdot,\cdot)$ is the proposal density. Otherwise, we set $X^d_m = X^d_{m-1}$.

In both cases, the algorithms simulate Markov chains which are reversible with respect to $\pi_d$, and can be easily shown to be $\pi_d$-irreducible and aperiodic. Therefore, both algorithms will converge in total variation distance to $\pi_d$. However, here we shall investigate optimization of the algorithms for rapid convergence. To find a manageable framework for assessing optimality, Roberts, Gelman and Gilks [6] introduce the notion of the average acceptance rate, which measures the steady state proportion of accepted proposals for the algorithm, and which can be shown to be closely connected with the notion of algorithm efficiency and optimality. Specifically, we define
\[
a^{c_d}_d(l) = E_{\pi_d}\big[\alpha^{c_d}_d(X^d, Y^d)\big] = E_{\pi_d}\Big[ 1 \wedge \frac{\pi_d(Y^d)\, q(Y^d, X^d)}{\pi_d(X^d)\, q(X^d, Y^d)} \Big], \tag{2.3}
\]
where $\sigma^2_{d,c_d} = l^2 d^{-s}$, $X^d \sim \pi_d$ and $Y^d$ represents the subsequent proposal random variable.
Thus, $a^{c_d}_d(l)$ is the $\pi_d$-average acceptance rate of the above algorithms where we update a proportion $c_d$ of the components in each iteration. We adopt the general notational convention that, for any $d$-dimensional stochastic process $W^d_\cdot$, we shall write $W^d_{t,i}$ for the value of its $i$th component at time $t$.
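The RWM-within-Gibbs mechanism of (2.1) and (2.2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `rwm_within_gibbs_step` and `log_pi_iid_normal`, the IID standard normal target and the values of `d`, `c` and `l` are all hypothetical choices made for the example, and the proposal scale is taken as $l/\sqrt{cd}$.

```python
import numpy as np


def rwm_within_gibbs_step(x, log_pi, sigma, c, rng):
    """One RWM-within-Gibbs update of a proportion c of the components.

    Returns the new state and a flag saying whether the proposal was accepted.
    """
    d = x.size
    k = max(1, int(round(c * d)))                  # number of components moved
    subset = rng.choice(d, size=k, replace=False)  # the random subset A
    y = x.copy()
    y[subset] += sigma * rng.standard_normal(k)    # chi_i = 1 only for i in A
    # The random walk proposal is symmetric, so q cancels in (2.2).
    log_alpha = min(0.0, log_pi(y) - log_pi(x))
    if np.log(rng.uniform()) < log_alpha:
        return y, True
    return x, False


def log_pi_iid_normal(x):
    """Log-density, up to a constant, of an IID standard normal target (assumed)."""
    return -0.5 * float(np.dot(x, x))


rng = np.random.default_rng(0)
d, c, l = 20, 0.5, 2.4                             # illustrative values only
x = rng.standard_normal(d)                         # start in stationarity
n_iter, accepted = 5000, 0
for _ in range(n_iter):
    x, ok = rwm_within_gibbs_step(x, log_pi_iid_normal, l / np.sqrt(c * d), c, rng)
    accepted += ok
acceptance_rate = accepted / n_iter                # empirical version of (2.3)
```

The empirical `acceptance_rate` is the finite-sample analogue of the average acceptance rate $a^{c_d}_d(l)$ in (2.3); a MALA-within-Gibbs variant would add the gradient drift term and would no longer have a symmetric proposal, so the ratio of proposal densities would have to be kept.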

Our aim in this paper is to consider the optimization [in $(c_d, \sigma_{d,c_d})$] of the algorithm's speed of convergence. For convenience (although to some extent this assumption can be relaxed), we shall assume that $c_d \to c$ as $d \to \infty$ for some $0 < c \le 1$. It turns out to be both convenient and practical to express many of the optimality solutions in terms of acceptance rate criteria.

3. RWM-within-Gibbs for IID product densities. We shall first consider the RWM algorithm applied initially to a simple IID form target density. This allows us to obtain explicit asymptotic results for optimal high-dimensional algorithms. The results of this section can be seen as an extension of the corresponding theorems of [6], which considers the full-dimensional update case. Let
\[
\pi_d(x^d) = \prod_{i=1}^{d} f(x^d_i) = \prod_{i=1}^{d} \exp\{g(x^d_i)\} \tag{3.1}
\]
be a $d$-dimensional product density with respect to Lebesgue measure. Let the proposal standard deviation be $\sigma_d = l/\sqrt{d-1}$ for some $l > 0$. For $d \ge 1$, let $\mathbf{U}^d_t = (X^d_{[dt],1}, X^d_{[dt],2}, \ldots, X^d_{[dt],d})$, and so $\mathbf{U}^d_{t,i} = X^d_{[dt],i}$, $1 \le i \le d$. Let $U^d_t = \mathbf{U}^d_{t,1}$.

Theorem 3.1. Suppose that $f$ is positive, $C^3$ (a three-times differentiable function with continuous third derivative) and that $(\log f)' = g'$ is Lipschitz. Suppose also that $c_d \to c$, as $d \to \infty$, for some $0 < c \le 1$,
\[
E_f\Big[ \Big( \frac{f'(X)}{f(X)} \Big)^{8} \Big] < \infty \tag{3.2}
\]
and
\[
E_f\Big[ \Big( \frac{f''(X)}{f(X)} \Big)^{4} \Big] < \infty. \tag{3.3}
\]
Let $X^\infty_0 = (X^1_{0,1}, X^2_{0,2}, \ldots)$ be such that all of its components are distributed according to $f$ and assume that $X^j_{0,i} = X^i_{0,i}$ for all $i \le j$. Then, as $d \to \infty$,
\[
U^d \Rightarrow U, \tag{3.4}
\]
where $U_0$ is distributed according to $f$ and $U$ satisfies the Langevin SDE
\[
dU_t = (h_c(l))^{1/2}\, dB_t + \tfrac{1}{2} h_c(l)\, g'(U_t)\, dt \tag{3.5}
\]
and
\[
h_c(l) = 2 c l^2\, \Phi\Big( -\frac{l \sqrt{cI}}{2} \Big),
\]

with $\Phi$ being the standard normal c.d.f. and
\[
I \equiv E_f\Big[ \Big( \frac{f'(X)}{f(X)} \Big)^{2} \Big] \equiv E_g\big[ g'(X)^2 \big].
\]

The following corollary holds.

Corollary 3.2. Let $c_d \to c$, as $d \to \infty$, for some $0 < c \le 1$. Then:

(i) $\lim_{d\to\infty} a^{c_d}_d(l) = a^c(l) \stackrel{\mathrm{def}}{=} 2\Phi\big(-\tfrac{l\sqrt{cI}}{2}\big)$.

(ii) Let $\hat l$ be the unique value of $l$ which maximizes $h_1(l) = 2 l^2 \Phi\big(-\tfrac{l\sqrt{I}}{2}\big)$ on $[0,\infty)$, and let $\hat l_c$ be the unique value of $l$ which maximizes $h_c(l)$ on $[0,\infty)$. Then $\hat l_c = c^{-1/2}\hat l$ and $h_c(\hat l_c) = h_1(\hat l)$.

(iii) For all $0 < c \le 1$, the optimal acceptance rate $a^c(\hat l_c) = 0.234$ (to three decimal places).

Though these results involve fairly technical mathematical statements, they yield a very simple practical conclusion. The optimal efficiency obtainable for a given $c$ does not depend on $c$ at all. Now, in practice, computational overheads associated with one iteration of the algorithm are nondecreasing as a function of $c$, so that smaller values of $c$ should be preferred. Therefore, for RWM, using high-dimensional update steps does not make any sense.

It is, of course, important to see how these conclusions extend to more general target densities and, in particular, ones which exhibit dependence structure. Some theory and related simulation studies in Sections 5 and 6, respectively, will demonstrate that these findings extend considerably beyond the rigorous but restrictive set up of Theorem 3.1.

4. MALA-within-Gibbs for IID product densities. We now turn our attention to MALA-within-Gibbs. We again consider a sequence of probability densities $\pi_d$ of the form given in (3.1). We follow [7] in making the following assumptions. We assume that $X^d_0$ is distributed according to the stationary measure $\pi_d$, $g$ is an eight times continuously differentiable function with derivatives $g^{(i)}$ satisfying
\[
|g(x)|,\ |g^{(i)}(x)| \le C(1 + |x|^K), \qquad 1 \le i \le 8, \tag{4.1}
\]
for some $C, K > 0$, and that
\[
\int_{\mathbb{R}} x^k f(x)\, dx < \infty, \qquad k = 1, 2, \ldots \tag{4.2}
\]
Finally, we assume that $g'$ is Lipschitz. This ensures that $\{X_t\}$ is nonexplosive (see, e.g., [], Chapter V).
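As a numerical aside on Corollary 3.2 above, the speed function $h_c(l) = 2cl^2\Phi(-l\sqrt{cI}/2)$ can be maximized directly to see that the optimal acceptance rate is about 0.234 and that the optimal speed does not depend on $c$. The sketch below is illustrative only: it assumes $I = 1$ (a standard normal $f$), and the helper names `Phi`, `h`, `acceptance` and `argmax_unimodal` are not from the paper.

```python
import math


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def h(l, c=1.0, I=1.0):
    """Limiting diffusion speed h_c(l) = 2 c l^2 Phi(-l sqrt(c I) / 2)."""
    return 2.0 * c * l * l * Phi(-0.5 * l * math.sqrt(c * I))


def acceptance(l, c=1.0, I=1.0):
    """Limiting acceptance rate a^c(l) = 2 Phi(-l sqrt(c I) / 2)."""
    return 2.0 * Phi(-0.5 * l * math.sqrt(c * I))


def argmax_unimodal(f, lo, hi, iters=200):
    """Ternary search for the maximiser of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


l_hat = argmax_unimodal(h, 0.0, 10.0)              # optimiser for c = 1
results = {}
for c in (0.25, 0.5, 1.0):
    l_c = argmax_unimodal(lambda l: h(l, c=c), 0.0, 10.0 / math.sqrt(c))
    results[c] = (l_c, h(l_c, c=c), acceptance(l_c, c=c))
    print(f"c={c}: l_c={l_c:.4f}  speed={results[c][1]:.4f}  acceptance={results[c][2]:.4f}")
```

For each `c`, the computed optimiser should agree with $c^{-1/2}\hat l$ and the optimal speed with $h_1(\hat l)$, which is the content of Corollary 3.2(ii).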

Let $\{J_t\}$ be a Poisson process with rate $d^{1/3}$ and let $\Gamma^d = \{\Gamma^d_t\}_{t \ge 0}$ be the $d$-dimensional jump process defined by $\Gamma^d_t = X^d_{J_t}$, where we take $\sigma_d^2 = l^2 d^{-1/3}$ with $l$ an arbitrary constant. We then have the following two theorems, which are extensions of the corresponding theorems of [7].

Theorem 4.1. Suppose that $c_d \to c$, as $d \to \infty$, for some $0 < c \le 1$. We have that
\[
\lim_{d \to \infty} a^{c_d}_d(l) = a^c(l) = 2\Phi\Big( -\frac{\sqrt{c}\, K l^3}{2} \Big),
\]
with
\[
K = \sqrt{ E\Big[ \frac{5 g'''(X)^2 - 3 g''(X)^3}{48} \Big] } > 0.
\]

Theorem 4.2. Suppose that $c_d \to c$ as $d \to \infty$ for some $0 < c \le 1$. Let $\{U^d_t\}_{t \ge 0}$ be the process corresponding to the first component of $\Gamma^d$. Then, as $d \to \infty$, the process $U^d$ converges weakly (in the Skorokhod topology) to the Langevin diffusion $U$ defined by
\[
dU_t = h_c(l)^{1/2}\, dB_t + \tfrac{1}{2} h_c(l)\, g'(U_t)\, dt,
\]
where $h_c(l) = 2 c l^2 \Phi\big( -\tfrac{\sqrt{c}\, l^3 K}{2} \big)$ is the speed of the limiting diffusion.

The most important consequence of Theorems 4.1 and 4.2 is the following corollary.

Corollary 4.3. Let $c_d \to c$, as $d \to \infty$, for some $0 < c \le 1$. Then:

(i) Let $\hat l$ be the unique value of $l$ which maximizes $h_1(l) = 2 l^2 \Phi\big(-\tfrac{l^3 K}{2}\big)$ on $[0,\infty)$, and let $\hat l_c$ be the unique value of $l$ which maximizes $h_c(l)$ on $[0,\infty)$. Then $\hat l_c = c^{-1/6}\hat l$ and $h_c(\hat l_c) = c^{2/3} h_1(\hat l)$.

(ii) For all $0 < c \le 1$, the optimal acceptance rate $a^c(\hat l_c) = 0.574$ (to three decimal places).

Thus, in stark contrast to the RWM case, it is optimal to update all components at once for MALA. The story is somewhat more complicated in the case where computational overheads are taken into account. For instance, it is common for the computational costs of implementing MALA-within-Gibbs to be approximately $d(a + bc)$ for constants $a$ and $b$. To see this, note that the algorithm's computational cost is often dominated by two operations: the calculation of the various derivatives needed to propose a new value, and the evaluation of $\pi_d$ at the proposed new value. The first of these operations involves a $cd$-dimensional update and typically takes a time which is of order $cd$, while the second involves evaluating a $d$-dimensional function

which we would expect to be at least of order $d$. (Although, in some important special cases, target density ratios might be computed more efficiently than this.) In this case the overall efficiency is obtained by maximizing
\[
\frac{c^{2/3}}{a + bc}.
\]
Setting the derivative in $c$ to zero gives $\tfrac{2}{3}(a + bc) = bc$, so this expression is maximized at $c = 2a/b$. Therefore, it is conceivable for full-dimensional updates to be optimal even when computational costs are taken into account (this happens whenever $2a/b \ge 1$). In any case, the optimal proportion will be some value $c^* \in (0,1]$.

5. RWM/MALA-within-Gibbs on dependent target distributions. We are now interested in the extent to which the results of the last two sections can be extended to the case where the components are dependent. It is difficult to get general results, but certain important special cases can be examined explicitly, yielding interesting results which imply (essentially) that the extent by which the dependence structure affects the mixing properties of the chain (RWM-within-Gibbs or MALA-within-Gibbs) is independent of $c$. The most tractable special case is the Gaussian target distribution. However, in Section 6, we shall also include some simulations in other cases to show that the above statement holds well beyond the cases for which rigorous mathematical results can be proved.

We begin with RWM-within-Gibbs and consider the optimal scaling problem of the variance of the proposal distribution for a target distribution consisting of exchangeable normal components. Specifically, $X^d \sim N(0, \Sigma^d_\rho)$, where $\sigma^d_{ii} = 1$, $1 \le i \le d$, and $\sigma^d_{ij} = \rho$, $i \ne j$, for some $0 < \rho < 1$. Therefore, we have that
\[
\begin{aligned}
\pi_d(x^d) &= (2\pi)^{-d/2} (\det \Sigma^d_\rho)^{-1/2} \exp\Big( -\frac{1}{2(1-\rho)} \Big( \sum_{i=1}^{d} (x^d_i)^2 + \theta_d \sum_{i=1}^{d} \sum_{j=1}^{d} x^d_i x^d_j \Big) \Big) \\
&= (2\pi)^{-d/2} (\det \Sigma^d_\rho)^{-1/2} \exp\big( j_d(x^d) \big), \quad \text{say},
\end{aligned} \tag{5.1}
\]
where
\[
\theta_d = -\frac{\rho}{1 + (d-1)\rho}
\quad \text{and} \quad
j_d(x^d) = -\frac{1}{2(1-\rho)} \Big( \sum_{i=1}^{d} (x^d_i)^2 + \theta_d \sum_{i=1}^{d} \sum_{j=1}^{d} x^d_i x^d_j \Big).
\]
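The closed form of the exponent $j_d$ in (5.1) can be checked numerically against a direct evaluation of $-\tfrac12 x^T (\Sigma^d_\rho)^{-1} x$. The following sketch assumes only NumPy; the function names `exchangeable_cov` and `j_d`, and the values of `d` and `rho`, are illustrative choices rather than anything fixed by the paper.

```python
import numpy as np


def exchangeable_cov(d, rho):
    """Covariance matrix with unit variances and common correlation rho."""
    return (1.0 - rho) * np.eye(d) + rho * np.ones((d, d))


def j_d(x, rho):
    """Exponent j_d(x) of (5.1), with theta_d = -rho / (1 + (d - 1) rho)."""
    d = x.size
    theta = -rho / (1.0 + (d - 1) * rho)
    # The double sum over x_i x_j equals the square of the sum of the x_i.
    return -(np.sum(x ** 2) + theta * np.sum(x) ** 2) / (2.0 * (1.0 - rho))


rng = np.random.default_rng(1)
d, rho = 6, 0.5                                    # illustrative values only
x = rng.standard_normal(d)
sigma = exchangeable_cov(d, rho)

direct = -0.5 * x @ np.linalg.solve(sigma, x)      # -x' Sigma^{-1} x / 2
closed = j_d(x, rho)                               # closed form from (5.1)
det_closed = (1.0 - rho) ** (d - 1) * (1.0 + (d - 1) * rho)
print(direct, closed, np.linalg.det(sigma), det_closed)
```

The determinant identity $\det \Sigma^d_\rho = (1-\rho)^{d-1}(1 + (d-1)\rho)$ used in the last lines is a standard fact about exchangeable covariance matrices and is included only as an extra sanity check.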

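For $d \ge 3$, let $\hat U^d_t = (X^d_{[dt],1}, X^d_{[dt],2}, \ldots, X^d_{[dt],d})$. Let $U^d_t = (U^d_{t,1}, U^d_{t,2}, U^d_{t,3})$ be such that $U^d_{t,1} = \hat U^d_{t,1}$, $U^d_{t,2} = \hat U^d_{t,2}$ and $U^d_{t,3} = \frac{1}{d-2} \sum_{i=3}^{d} \hat U^d_{t,i}$. Now the proposal $Y^d$ is given by $Y^d_i = x^d_i + \sigma_d \chi^d_i Z_i$, $1 \le i \le d$, where the $Z_i$ and $\chi^d_i$ ($1 \le i \le d$) are defined as before and $\sigma_d = l/\sqrt{d-1}$ for some constant $l$. [We use $(d-1)$ rather than $d$ or $(d-2)$ for simplicity in presentation of the results.]

In the dependent case, more care needs to be taken in constructing the sequence $\{X^d_0; d \ge 1\}$. Let $X^1_0 \sim N(0,1)$ [i.e., $X^1_0$ is distributed according to $\pi_1(\cdot)$]. For $d \ge 2$ and $1 \le i \le d-1$, set $X^d_{0,i} = X^i_{0,i}$. Then iteratively define
\[
X^d_{0,d} \sim N\Big( \frac{\rho}{1 + (d-2)\rho} \sum_{i=1}^{d-1} X^d_{0,i},\ \frac{(1 + (d-1)\rho)(1-\rho)}{1 + (d-2)\rho} \Big).
\]
Therefore, $X^d_0$ is distributed according to $\pi_d(\cdot)$ and we can continue this process indefinitely to obtain $X^\infty_0 = (X^1_{0,1}, X^2_{0,2}, \ldots)$.

Theorem 5.1. Suppose that $0 < \rho < 1$ and that $c_d \to c$, as $d \to \infty$, for some $0 < c \le 1$. Let $X^\infty_0 = (X^1_{0,1}, X^2_{0,2}, \ldots)$ be constructed as above. Let
\[
D_1 = \begin{pmatrix} 1 & \rho & \rho \\ \rho & 1 & \rho \\ \rho & \rho & \rho \end{pmatrix}, \qquad
D_2 = \frac{1}{1-\rho} \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ -1 & -1 & \frac{1+\rho}{\rho} \end{pmatrix}, \qquad
D_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix},
\]
so that $D_2 = D_1^{-1}$. Let $f(u)$ denote the probability density function of $N(0, D_1)$. Then, as $d \to \infty$, $U^d \Rightarrow U$, where $U_0$ is distributed according to $f$ and $U$ satisfies the Langevin SDE
\[
dU_t = (h_{c,\rho}(l))^{1/2} D_3\, dB_t + h_{c,\rho}(l)\, D_3 \Big\{ \frac{1}{2} \frac{\partial}{\partial U_t} \Big( -\frac{1}{2} U_t^T D_2 U_t \Big) \Big\}\, dt,
\]
where
\[
h_{c,\rho}(l) = 2 c l^2\, \Phi\Big( -\frac{l}{2} \sqrt{\frac{c}{1-\rho}} \Big).
\]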
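The iterative construction of the starting value described before Theorem 5.1 can be checked by simulation: drawing each new coordinate from its conditional normal law should reproduce the exchangeable covariance $\Sigma^d_\rho$. The sketch below is a hedged illustration; the function name `sequential_start`, the sample size and the choice $d = 4$, $\rho = 0.5$ are assumptions made for the example, and the conditional mean and variance are those of an exchangeable normal vector with unit variances.

```python
import numpy as np


def sequential_start(d, rho, n, rng):
    """Draw n copies of X_0^d by adding one coordinate at a time.

    Coordinate k is drawn from the conditional law of the k-th component of an
    exchangeable normal vector given the first k - 1 components.
    """
    x = np.empty((n, d))
    x[:, 0] = rng.standard_normal(n)               # X_0^1 ~ N(0, 1)
    for k in range(2, d + 1):                      # k is the current dimension
        mean = rho / (1.0 + (k - 2) * rho) * x[:, : k - 1].sum(axis=1)
        var = (1.0 + (k - 1) * rho) * (1.0 - rho) / (1.0 + (k - 2) * rho)
        x[:, k - 1] = mean + np.sqrt(var) * rng.standard_normal(n)
    return x


rng = np.random.default_rng(2)
d, rho, n = 4, 0.5, 50000                          # illustrative values only
samples = sequential_start(d, rho, n, rng)
empirical_cov = np.cov(samples, rowvar=False)
target_cov = (1.0 - rho) * np.eye(d) + rho * np.ones((d, d))
max_error = np.max(np.abs(empirical_cov - target_cov))
```

With this many samples, `max_error` should be small (of the order of the Monte Carlo error), which supports the claim that $X^d_0$ built this way is distributed according to $\pi_d$ for every $d$ while sharing its first coordinates with the lower-dimensional starting values.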

Note that if we define $\tilde I_d = E\big[ \big( \tfrac{\partial}{\partial x_1} j_d(X^d) \big)^2 \big]$ and $\tilde I = \tfrac{1}{1-\rho}$, then $\tilde I_d \to \tilde I$ as $d \to \infty$ and $h_{c,\rho}(l) = 2 c l^2 \Phi\big( -\tfrac{l \sqrt{c \tilde I}}{2} \big)$. Therefore, the speed of the limiting diffusion for the exchangeable normal has the same form as that obtained for the IID product densities considered in Section 3.

As in (2.3), let $a^{c_d,\rho}_d(l)$ be the $\pi_d$-average acceptance rate of the above algorithm where $X^d \sim N(0, \Sigma^d_\rho)$, $\sigma_d = l/\sqrt{d-1}$ and we update a proportion $c_d$ of the components in each iteration. Then we have the following corollary.

Corollary 5.2. Let $c_d \to c$, as $d \to \infty$, for some $0 < c \le 1$. Then, for $0 < \rho < 1$:

(i) $\lim_{d\to\infty} a^{c_d,\rho}_d(l) = a^{c,\rho}(l) \stackrel{\mathrm{def}}{=} 2\Phi\big( -\tfrac{l}{2} \sqrt{\tfrac{c}{1-\rho}} \big)$.

(ii) Let $\hat l$ be the unique value of $l$ which maximizes $h_{1,0}(l) = 2 l^2 \Phi\big(-\tfrac{l}{2}\big)$ on $[0,\infty)$, and let $\hat l_{c,\rho}$ be the unique value of $l$ which maximizes $h_{c,\rho}(l)$ on $[0,\infty)$. Then $\hat l_{c,\rho} = \sqrt{\tfrac{1-\rho}{c}}\, \hat l$ and $h_{c,\rho}(\hat l_{c,\rho}) = (1-\rho)\, h_{1,0}(\hat l)$.

(iii) For all $0 < c \le 1$ and $0 < \rho < 1$, the optimal acceptance rate $a^{c,\rho}(\hat l_{c,\rho}) = 0.234$ (to three decimal places).

Note that Corollary 5.2(ii) states that the cost incurred by having $\sigma^d_{ij} = \rho$, $i \ne j$, rather than $\sigma^d_{ij} = 0$, $i \ne j$, is to slow down the speed of the limiting diffusion by a factor of $1-\rho$, for all $0 < c \le 1$. In other words, the cost incurred by the dependence between the components of $X^d$ is independent of $c$. Furthermore, the optimal acceptance rate $a^{c,\rho}(\hat l_{c,\rho})$ is unaffected by the introduction of dependence. We shall study this further in the simulation study conducted in Section 6.

Note that in Theorem 5.1 the last row of the matrix $D_3$ is a row of zeros. This implies that the mixing time of the component mean $\frac{1}{d}\mathbf{1}^T X^d$ grows more rapidly than $O(d)$ as $d \to \infty$. In [8], heuristic arguments and extensive simulations show that the mixing time of $\frac{1}{d}\mathbf{1}^T X^d$ is in fact $O(d^2)$. Theorem 5.3 below gives a formal statement of this result. (The proof of Theorem 5.3 is similar to the proof of Theorem 5.1 and is, hence, omitted.) For $d \ge 3$, let $\tilde U^d_t = \frac{1}{d-2} \sum_{i=3}^{d} X^d_{[d^2 t],i}$.

Theorem 5.3. Suppose that $0 < \rho < 1$ and that $c_d \to c$, as $d \to \infty$, for some $0 < c \le 1$. Let $X^\infty_0 = (X^1_{0,1}, X^2_{0,2}, \ldots)$ be constructed as in the prelude to Theorem 5.1.
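Then, as $d \to \infty$, $\tilde U^d \Rightarrow \tilde U$, where $\tilde U_0 \sim N(0, \rho)$ and $\tilde U$ satisfies the Langevin SDE
\[
d\tilde U_t = (h_{c,\rho}(l))^{1/2}\, dB_t + h_{c,\rho}(l) \Big\{ -\frac{1}{2\rho} \tilde U_t \Big\}\, dt,
\]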

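where $h_{c,\rho}(l) = 2 c l^2 \Phi\big( -\tfrac{l}{2} \sqrt{\tfrac{c}{1-\rho}} \big)$, as before.

We now turn our attention to MALA-within-Gibbs for the exchangeable normal, so that now the proposal $Y^d$ is given by
\[
Y^d_i = x^d_i + \chi^d_i \Big\{ \sigma_d Z_i - \frac{\sigma_d^2}{2(1-\rho)} \Big( x^d_i + \theta_d \sum_{j=1}^{d} x^d_j \Big) \Big\},
\]
where we take $\sigma_d^2 = l^2 d^{-1/3}$ with $l$ an arbitrary constant. Let $X^\infty_0$ be constructed as outlined above for the RWM-within-Gibbs. Let $\{J_t\}$ be a Poisson process with rate $d^{1/3}$ and let $\Gamma^d = \{\Gamma^d_t\}_{t \ge 0}$ be the $d$-dimensional jump process defined by $\Gamma^d_t = X^d_{J_t}$. Let $U^d_t = (U^d_{t,1}, U^d_{t,2}, U^d_{t,3})$ be such that $U^d_{t,1} = \Gamma^d_{t,1}$, $U^d_{t,2} = \Gamma^d_{t,2}$ and $U^d_{t,3} = \frac{1}{d-2} \sum_{i=3}^{d} \Gamma^d_{t,i}$.

Theorem 5.4. Suppose that $0 < \rho < 1$ and that $c_d \to c$, as $d \to \infty$, for some $0 < c \le 1$. Let $X^\infty_0 = (X^1_{0,1}, X^2_{0,2}, \ldots)$ be constructed as in the prelude to Theorem 5.1. Let $D_1$, $D_2$, $D_3$ and $f$ be as defined in Theorem 5.1. Then, as $d \to \infty$, $U^d \Rightarrow U$, where $U_0$ is distributed according to $f$ and $U$ satisfies the Langevin SDE
\[
dU_t = (h_{c,\rho}(l))^{1/2} D_3\, dB_t + h_{c,\rho}(l)\, D_3 \Big\{ \frac{1}{2} \frac{\partial}{\partial U_t} \Big( -\frac{1}{2} U_t^T D_2 U_t \Big) \Big\}\, dt,
\]
where
\[
h_{c,\rho}(l) = 2 c l^2\, \Phi\Big( -\frac{l^3}{8} \sqrt{\frac{c}{(1-\rho)^3}} \Big)
\]
is the speed of the limiting diffusion.

Note that if we define
\[
K_d = \sqrt{ E\Big[ \frac{1}{48} \Big\{ 5 \Big( \frac{\partial^3}{\partial x_1^3} j_d(X^d) \Big)^2 - 3 \Big( \frac{\partial^2}{\partial x_1^2} j_d(X^d) \Big)^3 \Big\} \Big] }
\quad \text{and} \quad
K = \sqrt{ \frac{1}{16 (1-\rho)^3} },
\]
then $K_d \to K$ as $d \to \infty$ and $h_{c,\rho}(l) = 2 c l^2 \Phi\big( -\tfrac{\sqrt{c}\, l^3 K}{2} \big)$. Therefore, the speed of the limiting diffusion for the exchangeable normal has the same form as that obtained for the IID product densities considered in Section 4.

As in (2.3), let $a^{c_d,\rho}_d(l)$ be the $\pi_d$-average acceptance rate of the above algorithm where $X^d \sim N(0, \Sigma^d_\rho)$, $\sigma_d = l d^{-1/6}$ and we update a proportion $c_d$ of the components in each iteration. Then we have the following corollary.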

Corollary 5.5. Let $c_d \to c$, as $d \to \infty$, for some $0 < c \le 1$. Then, for $0 < \rho < 1$:

(i) $\lim_{d\to\infty} a^{c_d,\rho}_d(l) = a^{c,\rho}(l) \stackrel{\mathrm{def}}{=} 2\Phi\big( -\tfrac{l^3}{8} \sqrt{\tfrac{c}{(1-\rho)^3}} \big)$.

(ii) Let $\hat l$ be the unique value of $l$ which maximizes $h_{1,0}(l) = 2 l^2 \Phi\big(-\tfrac{l^3}{8}\big)$ on $[0,\infty)$, and let $\hat l_{c,\rho}$ be the unique value of $l$ which maximizes $h_{c,\rho}(l)$ on $[0,\infty)$. Then $\hat l_{c,\rho} = \sqrt{1-\rho}\; c^{-1/6}\, \hat l$ and $h_{c,\rho}(\hat l_{c,\rho}) = c^{2/3} (1-\rho)\, h_{1,0}(\hat l)$.

(iii) For all $0 < c \le 1$ and $0 < \rho < 1$, the optimal acceptance rate $a^{c,\rho}(\hat l_{c,\rho}) = 0.574$ (to three decimal places).

Note that Corollary 5.5(ii) states that the cost incurred by having $\sigma^d_{ij} = \rho$, $i \ne j$, rather than $\sigma^d_{ij} = 0$, $i \ne j$, is to slow down the speed of the limiting diffusion by a factor of $1-\rho$, for all $0 < c \le 1$. Therefore, the dependence in the target distribution $\pi_d(\cdot)$ affects convergence of the MALA-within-Gibbs in the same way that it affects the RWM-within-Gibbs. The cost associated with updating only a proportion $c$ rather than all of the components is the same as that observed in Section 4. Furthermore, the optimal acceptance rate $a^{c,\rho}(\hat l_{c,\rho})$ is unaffected by the introduction of dependence.

From Theorem 5.4, we see that the mixing time of $\frac{1}{d}\mathbf{1}^T X^d$ is greater than $O(d^{1/3})$ as $d \to \infty$. In fact, the mixing time of $\frac{1}{d}\mathbf{1}^T X^d$ is $O(d^{4/3})$. Let $\{J_t\}$ be a Poisson process with rate $d^{4/3}$ and, for $d \ge 3$, let $\tilde U^d_t = \frac{1}{d-2} \sum_{i=3}^{d} X^d_{J_t,i}$.

Theorem 5.6. Suppose that $0 < \rho < 1$ and that $c_d \to c$, as $d \to \infty$, for some $0 < c \le 1$. Let $X^\infty_0 = (X^1_{0,1}, X^2_{0,2}, \ldots)$ be constructed as in the prelude to Theorem 5.1. Then, as $d \to \infty$, $\tilde U^d \Rightarrow \tilde U$, where $\tilde U_0 \sim N(0, \rho)$ and $\tilde U$ satisfies the Langevin SDE
\[
d\tilde U_t = (h_{c,\rho}(l))^{1/2}\, dB_t + h_{c,\rho}(l) \Big\{ -\frac{1}{2\rho} \tilde U_t \Big\}\, dt,
\]
where $h_{c,\rho}(l) = 2 c l^2 \Phi\big( -\tfrac{l^3}{8} \sqrt{\tfrac{c}{(1-\rho)^3}} \big)$, as before.

The proofs of Theorems 5.4 and 5.6 are hybrids of those for the results of Section 4, and for Theorems 5.1 and 5.3 above, and are, hence, omitted.

6. A simulation study. The rotational symmetry of the Gaussian distribution effectively allows the dependence problem to be formulated as one of heterogeneity of scale.
Other distributional forms exist for which this may be possible (e.g., the multivariate t-distribution), but it seems difficult to derive results for very general distributional families of target distribution

without resorting to ideas such as this. Therefore, to support the conjecture that the conclusions of Sections 3-5 hold beyond the rigorous, theoretical results, we present the following simulation study. Furthermore, we demonstrate that the asymptotic results are achieved in relatively low-dimensional situations.

Throughout the simulation study we measure speed/efficiency of the algorithm by considering first-order efficiency. That is, for a multidimensional Markov chain $X$ with first component $X^1$, say, the first-order efficiency is defined to be $d\, E[(X^1_{t+1} - X^1_t)^2]$ for RWM and $d^{1/3} E[(X^1_{t+1} - X^1_t)^2]$ for MALA, where $X_t$ is assumed to be stationary.

For each of the target distributions and different choices of $c$ and $d$, we consider 50 different proposal variances, $\sigma^2_{d,c}$. For each choice of proposal variance $\sigma^2_{d,c}$, we started with $X_0$ drawn from the target distribution. We then ran the algorithm for $n$ iterations. We estimate $E[(X^1_{t+1} - X^1_t)^2]$ by $\frac{1}{n} \sum_{i=1}^{n} (X^1_i - X^1_{i-1})^2$ and the acceptance rate is estimated by $\frac{1}{n} \sum_{i=1}^{n} 1_{\{X_i \ne X_{i-1}\}}$. We then plot acceptance rate against $d\, E[(X^1_{t+1} - X^1_t)^2]$ (first-order efficiency).

We begin by considering RWM-within-Gibbs. We shall consider three different target distributions: $\pi_d \sim N(0, \Sigma^d_\rho)$, $\pi_d \sim t_{50}(0, \Sigma^d_\rho)$ and $\pi_d(x^d) = \prod_{i=1}^{d} \tfrac{1}{2} \exp(-|x_i|)$ (double-sided exponential). Note that the distributions $t_{50}(0, \Sigma^d_\rho)$ ($\rho > 0$) and the double-sided exponential are not covered by the asymptotic results of Sections 3 and 5. For the $N(0, \Sigma^d_\rho)$ and $t_{50}(0, \Sigma^d_\rho)$ targets, we plot acceptance rate against the normalized first-order efficiency, $\frac{d}{1-\rho} E[(X^1_{t+1} - X^1_t)^2]$. The normalization is introduced to take account of dependence (see Corollary 5.2).

Figures 1 and 2 give a representative sample of the simulation study we conducted for a whole range of different values of $c$, $d$ and $\rho$. The results are as one would expect. In all cases the estimated optimal acceptance rate is approximately 0.234. As can be seen from Figures 1 and 2, the normalized first-order efficiency curves are virtually indistinguishable from one another for each choice of $c$, $d$ and $\rho$.
Therefore, we have made no attempt to differentiate between the different efficiency curves. (Note that the results in Figure 3 are a representative sample from a much larger simulation study.) Figures 3 and 4 produce results in line with those expected from Sections 3 and 5. This demonstrates that the conclusions of Sections 3 and 5 do extend beyond those target distributions for which rigorous statements have been made.

We now turn our attention to MALA-within-Gibbs. We shall consider in our simulation study only target densities of the form $\pi_d \sim N(0, \Sigma^d_\rho)$. Simulations in Figures 5 and 6 show excellent agreement with Corollaries 4.3 and 5.5. Again, the results demonstrate the usefulness/relevance of the asymptotic results for even fairly small $d$.
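The limiting quantities against which the MALA simulations are compared can themselves be evaluated numerically. The sketch below maximizes the MALA speed function $h_c(l) = 2cl^2\Phi(-\sqrt{c}\,Kl^3/2)$ and the cost-adjusted efficiency $c^{2/3}/(a + bc)$ discussed in Section 4. It is a sketch under stated assumptions: $K = 1/4$ corresponds to an IID standard normal target, the cost constants `a = 1`, `b = 4` are arbitrary illustrative values, and all function names are hypothetical.

```python
import math


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def mala_speed(l, c=1.0, K=0.25):
    """Limiting MALA speed h_c(l) = 2 c l^2 Phi(-sqrt(c) K l^3 / 2)."""
    return 2.0 * c * l * l * Phi(-0.5 * math.sqrt(c) * K * l ** 3)


def mala_acceptance(l, c=1.0, K=0.25):
    """Limiting MALA acceptance rate a^c(l) = 2 Phi(-sqrt(c) K l^3 / 2)."""
    return 2.0 * Phi(-0.5 * math.sqrt(c) * K * l ** 3)


def argmax_unimodal(f, lo, hi, iters=200):
    """Ternary search for the maximiser of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


l_hat = argmax_unimodal(mala_speed, 0.0, 6.0)      # optimiser for c = 1
c = 0.25
l_c = argmax_unimodal(lambda l: mala_speed(l, c=c), 0.0, 6.0)
speed_ratio = mala_speed(l_c, c=c) / mala_speed(l_hat)

# Cost-adjusted efficiency c^(2/3) / (a + b c), maximised over c in (0, 1].
a, b = 1.0, 4.0                                    # illustrative cost constants
c_star = argmax_unimodal(lambda c_: c_ ** (2.0 / 3.0) / (a + b * c_), 1e-9, 1.0)
print(l_hat, mala_acceptance(l_hat), speed_ratio, c_star)
```

Under these assumptions the optimal acceptance rate should come out near 0.574, `speed_ratio` near $c^{2/3}$, and `c_star` near $2a/b$, in line with Corollary 4.3 and the cost discussion of Section 4.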

7. Discussion. A rather surprising property of high-dimensional Metropolis and Langevin algorithms is the robustness of relative efficiency as a function of acceptance rate. In particular, the optimal acceptance rates 0.234 and 0.574 for Metropolis and Langevin, respectively, appear to be robust to many kinds of perturbation of the target density. A remarkable conclusion of this paper is that this apparent robustness of relative efficiency, as a function of acceptance rate, seems to extend quite readily to updating schemes where only a fixed proportion of components are updated at once.

A further unexpected conclusion concerns the issue of optimization in $c$. Here, very clear cut statements appear to be available, with smaller-dimensional updates seeming to be optimal for the Metropolis algorithm (as seen from Theorem 3.1 and Corollary 3.2), whereas higher-dimensional updates are to be preferred (at least before computing time has been taken into consideration) for MALA schemes (see the theorems of Section 4 and Corollary 4.3).

The robustness of these conclusions to dependence in the target density is seen in the results of Section 5 and, supported by the simulation study in Section 6, seems contrary to the general intuition that block updating improves MCMC mixing (at least for the Metropolis results). However, our results show that this intuition is only correct for schemes where the multivariate update step utilizes the structure of the target density (as, e.g., in the Gibbs sampler, or, to a lesser extent, MALA).

We believe that these results should have quite fundamental implications for practical MCMC use, although, of course, they should be treated with care since they are only asymptotic. Our results have been shown in the simulation study to hold approximately in very low-dimensional problems (although the speed at which the infinite-dimensional limit is reached does vary in a complicated way, in particular, in $c$ and measures of dependence in the target density such as $\rho$ in the exchangeable normal examples).

The results for the exchangeable normal example show that certain functions can converge at different rates to others (the component mean $\bar X^d$ converging at rate $d^2$, while $X^d_i - \bar X^d$ converges at rate $d$), and this can cause serious practical problems for the MCMC practitioner. In particular, any one co-ordinate $X_i$ might converge rapidly, in a given time scale, to the wrong target density. Certainly, it would be extremely difficult to detect such problems empirically.

Fig. 1. Normalized first-order efficiency of RWM-within-Gibbs, $\frac{d}{1-\rho} E[(X^1_{t+1} - X^1_t)^2]$, as a function of overall acceptance rates, for fixed $d$ and each combination of $c = 0.25, 0.5, 0.75, 1$ and $\rho = 0, 0.5$, with $\pi_d \sim N(0, \Sigma^d_\rho)$.

Fig. 2. Normalized first-order efficiency of RWM-within-Gibbs, $\frac{d}{1-\rho} E[(X^1_{t+1} - X^1_t)^2]$, as a function of overall acceptance rates, for fixed $c$ and each combination of three values of $d$ (the largest being 50) and $\rho = 0, 0.5$, with $\pi_d \sim N(0, \Sigma^d_\rho)$.

Fig. 3. Normalized first-order efficiency of RWM-within-Gibbs, $\frac{d}{1-\rho} E[(X^1_{t+1} - X^1_t)^2]$, as a function of overall acceptance rates for each combination of (i) fixed $d$, $c = 0.25, 0.5, 0.75, 1$ and $\rho = 0, 0.5$, and (ii) fixed $c$, three values of $d$ (the largest being 50) and $\rho = 0, 0.5$, with $\pi_d \sim t_{50}(0, \Sigma^d_\rho)$.

Fig. 4. Normalized first-order efficiency of RWM-within-Gibbs, $d\, E[(X^1_{t+1} - X^1_t)^2]$, as a function of overall acceptance rates for each combination ($d = 40$; $c = 0.25, 0.5, 0.75, 1$), with $\pi_d(x^d) = \prod_{i=1}^{d} \tfrac{1}{2} \exp(-|x_i|)$.

The results in this paper are given for Metropolis and MALA algorithms. However, the use of these two methods is, in some sense, illustrative, and other algorithms (such as, e.g., higher-order Langevin algorithms using, e.g., the Ozaki discretization [0]) are expected to yield similar conclusions.

APPENDIX

A.1. Proofs of Section 3. Theorem 3.1 implies that the first component acts independently of all others as $d \to \infty$. Intuitively, this occurs because the terms from all the other components contribute expressions to the accept/reject ratio which turn out to obey a strong law of large numbers (SLLN) and, thus, can be replaced by their deterministic limits. To make this idea rigorous, we need to define a set in $\mathbb{R}^d$ on which the first component is well approximated by the appropriate LLN limit.

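Fig. 5. Normalized first-order efficiency of MALA-within-Gibbs, $\frac{d^{1/3}}{c^{2/3}(1-\rho)} E[(X^1_{t+1} - X^1_t)^2]$, as a function of overall acceptance rates, for fixed $d$ and each combination of $c = 0.25, 0.5, 0.75, 1$ and $\rho = 0, 0.5$, with $\pi_d \sim N(0, \Sigma^d_\rho)$.

Fig. 6. Normalized first-order efficiency of MALA-within-Gibbs, $\frac{d^{1/3}}{c^{2/3}(1-\rho)} E[(X^1_{t+1} - X^1_t)^2]$, as a function of overall acceptance rates, for fixed $c$ and each combination of three values of $d$ (the largest being 50) and $\rho = 0, 0.5$, with $\pi_d \sim N(0, \Sigma^d_\rho)$.

Motivated by this idea, we construct sets of tolerances around average values for quantities which will appear in the accept/reject ratio. Thus, we define the sequence of sets $\{F_d \subseteq \mathbb{R}^d, d > 1\}$ by
\[
\begin{aligned}
F_d ={}& \Big\{ x^d ; \Big| \frac{1}{d-1} \sum_{i=2}^{d} g'(x_i)^2 - I \Big| < d^{-1/8} \Big\}
\cap \Big\{ x^d ; \Big| \frac{1}{d-1} \sum_{i=2}^{d} g''(x_i) + I \Big| < d^{-1/8} \Big\} \\
&\cap \Big\{ x^d ; \frac{1}{(d-1)^2} \sum_{i=2}^{d} g'(x_i)^4 < d^{-1/8} \Big\} \\
={}& F_{d,1} \cap F_{d,2} \cap F_{d,3}, \quad \text{say},
\end{aligned}
\]
where $I$ is defined in Theorem 3.1. Let $x^\infty = (x_1, x_2, \ldots)$ and, for $d \ge 1$, let $x^d = (x^d_1, x^d_2, \ldots, x^d_d)$, where, for $1 \le i \le d$, $x^d_i = x_i$. Thus, we shall use $x_i$ and $x^d_i$ interchangeably, as appropriate.

Lemma A.1. For $k = 1, 2, 3$ and $t > 0$,
\[
P(\mathbf{U}^d_s \in F_{d,k},\ 0 \le s \le t) \to 1 \quad \text{as } d \to \infty \tag{A.1}
\]
and, hence,
\[
P(\mathbf{U}^d_s \in F_d,\ 0 \le s \le t) \to 1 \quad \text{as } d \to \infty.
\]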

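Proof. The cases $k = 1$ and $k = 2$ are proved in [6]. The case $k = 3$ is proved similarly using Markov's inequality and (3.2). The lemma then follows.

For any random variable $X$ and for any subset $A \subseteq \mathbb{R}$, let $E^*[X] = E[X \mid \chi^d_1 = 1]$ and $P^*(X \in A) = P(X \in A \mid \chi^d_1 = 1)$.

Let $G_d$ be the (discrete-time) generator of $X^d$, and let $V \in C^\infty_c$ (the space of infinitely differentiable functions with compact support) be an arbitrary test function of the first component only. Thus,
\[
\begin{aligned}
G_d V(x^d) &= d\, E\Big[ (V(Y^d) - V(x^d)) \Big\{ 1 \wedge \frac{\pi_d(Y^d)}{\pi_d(x^d)} \Big\} \Big] \\
&= d\, P(\chi^d_1 = 1)\, E^*\Big[ (V(Y^d_1) - V(x^d_1)) \Big\{ 1 \wedge \frac{\pi_d(Y^d)}{\pi_d(x^d)} \Big\} \Big],
\end{aligned} \tag{A.2}
\]
since $Y^d_1 = x^d_1$ if $\chi^d_1 = 0$. The generator $G$ of the one-dimensional diffusion described in (3.4), for an arbitrary test function $V \in C^\infty_c$, is given by
\[
G V(x_1) = c l^2\, \Phi\Big( -\frac{l \sqrt{cI}}{2} \Big) \big\{ g'(x_1) V'(x_1) + V''(x_1) \big\}. \tag{A.3}
\]
(Note that, under the conditions imposed in Theorem 3.1, $C^\infty_c$ forms a core for the full generator.)

By Lemma A.1, we can restrict attention to $x^d \in F_d$. The aim will therefore be to show that, for all $x^d \in F_d$, $G_d V(x^d) \to G V(x_1)$ as $d \to \infty$. The proof of Theorem 3.1 will then be fairly straightforward. Thus, we begin by giving a Taylor series approximation for $G_d V(x^d)$ in Lemma A.3, for which we will require the following lemma.

Lemma A.2. For any $V \in C^\infty_c$ (the space of infinitely differentiable functions with compact support),
\[
\sup_{x^d \in F_d} \Big| d\, E^*[V(Y^d_1) - V(x^d_1)] - \frac{l^2}{2} V''(x_1) \Big| \to 0 \quad \text{as } d \to \infty \tag{A.4}
\]
and
\[
\sup_{x^d \in F_d} \Big| d\, \sigma_d\, E^*[Z_1 (V(Y^d_1) - V(x^d_1))] - l^2 V'(x_1) \Big| \to 0 \quad \text{as } d \to \infty, \tag{A.5}
\]
with $x^d_1 = x_1$.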

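Proof. For $\chi_1 = 1$,

$$Y_1 - x_1 = \sigma_d Z_1 = \frac{l}{\sqrt{d-1}}\,Z_1.$$

Thus, by Taylor's theorem,

$$V(Y_1) - V(x_1) = V'(x_1)(\sigma_d Z_1) + \frac{1}{2}V''(x_1)(\sigma_d Z_1)^2 + \frac{1}{6}V'''(W_1)(\sigma_d Z_1)^3 \tag{A.6}$$

for some $W_1$ lying between $x_1$ and $Y_1$. The lemma then follows by substituting (A.6) into the left-hand sides of (A.4) and (A.5).

Lemma A.3. Let

$$\tilde{G}_d V(x^d) = \frac{1}{2}c_d l^2 V''(x_1)\,E_1[1 \wedge e^{B_d}] + c_d l^2 V'(x_1) g'(x_1)\,E_1[e^{B_d}; B_d < 0],$$

where $B_d\ (= B_d(x^d)) = \sum_{i=2}^d (g(Y_i) - g(x_i))$. Then, we have that

$$\sup_{x^d \in F_d} |G_d V(x^d) - \tilde{G}_d V(x^d)| \to 0 \quad \text{as } d \to \infty. \tag{A.7}$$

Proof. Decomposing $Y^d$ into $(Y_1, Y^{d-})$ and using independence gives

$$G_d V(x^d) = d\,c_d\,E_{Y_1}\left[(V(Y_1) - V(x_1))\,E_{Y^{d-}}\left[1 \wedge \prod_{i=1}^d \frac{f(Y_i)}{f(x_i)}\right]\right].$$

We shall begin by concentrating on the inner expectation, by recalling the following fact noted in []. Let $h$ be a twice differentiable function on $\mathbb{R}$; then the function $z \mapsto 1 \wedge e^{h(z)}$ is also twice differentiable, except at a countable number of points, with first derivative given Lebesgue almost everywhere by the function

$$\frac{d}{dz}\left(1 \wedge e^{h(z)}\right) = \begin{cases} h'(z)e^{h(z)}, & \text{if } h(z) < 0, \\ 0, & \text{if } h(z) \ge 0. \end{cases}$$

Now take $h(z)\ (= h(z; x^d)) = (g(x_1 + \sigma_d z) - g(x_1)) + B_d$ and let

$$\gamma(z) = E_{Y^{d-}}\left[1 \wedge \prod_{i=1}^d \frac{f(Y_i)}{f(x_i)} \,\Big|\, Z_1 = z\right].$$

Thus, $\gamma(z) = E_{Y^{d-}}[1 \wedge e^{h(z)}]$, and so, for almost every $x^d$ and $Y^{d-}$, there exists $W$ lying between 0 and $z$ such that

$$\gamma(z) = E_{Y^{d-}}[1 \wedge e^{h(0)}] + z\,E_{Y^{d-}}[\sigma_d g'(x_1)e^{h(0)}; h(0) < 0] + \frac{z^2}{2}E_{Y^{d-}}\left[\sigma_d^2\{g''(x_1 + \sigma_d W) + g'(x_1 + \sigma_d W)^2\}e^{h(W)}; h(W) < 0\right]. \tag{A.8}$$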

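The key results to note are that $h(0) = B_d$ and that, conditional upon $\chi_1 = 1$, $Y_1$ and $Y^{d-}$ are independent. Therefore,

$$\begin{aligned} G_d V(x^d) &= d\,c_d\,E_{Y_1}\Big[(V(Y_1) - V(x_1))\Big\{E_{Y^{d-}}[1 \wedge e^{h(0)}] + Z_1 E_{Y^{d-}}[\sigma_d g'(x_1)e^{h(0)}; h(0) < 0] \\ &\qquad + \frac{Z_1^2}{2}E_{Y^{d-}}\big[\sigma_d^2\{g''(x_1 + \sigma_d W) + g'(x_1 + \sigma_d W)^2\}e^{h(W)}; h(W) < 0\big]\Big\}\Big] \\ &= d\,c_d\,E_1[V(Y_1) - V(x_1)]\,E_1[1 \wedge e^{B_d}] + g'(x_1)\,d\,c_d\,\sigma_d\,E_1[(V(Y_1) - V(x_1))Z_1]\,E_1[e^{B_d}; B_d < 0] \\ &\qquad + d\,c_d\,E_{Y_1}\Big[(V(Y_1) - V(x_1))\frac{Z_1^2}{2}E_{Y^{d-}}\big[\sigma_d^2\{g''(x_1 + \sigma_d W) + g'(x_1 + \sigma_d W)^2\}e^{h(W)}; h(W) < 0\big]\Big] \\ &= \hat{G}_d V(x^d) + D(x^d; Z_1; W), \quad \text{say.} \end{aligned} \tag{A.9}$$

Since $E_1[1 \wedge e^{B_d}],\ E_1[e^{B_d}; B_d < 0] \le 1$ and $x_1^d = x_1$, it follows from Lemma A.2 that

$$\sup_{x^d \in F_d} |\hat{G}_d V(x^d) - \tilde{G}_d V(x^d)| \to 0 \quad \text{as } d \to \infty.$$

Thus, to prove the lemma, it is sufficient to show that, for all $x^d \in F_d$, $D(x^d; Z_1; W)$ converges to 0 as $d \to \infty$. By Taylor's theorem, we have that

$$|(V(Y_1) - V(x_1))Z_1^2| \le \sup_{a \in \mathbb{R}}|V'(a)|\,\sigma_d|Z_1|^3$$

and

$$|g'(x_1 + \sigma_d W)| \le |g'(x_1)| + \sigma_d|W|\sup_{a \in \mathbb{R}}|g''(a)| \le |g'(x_1)| + \sigma_d|Z_1|\sup_{a \in \mathbb{R}}|g''(a)|.$$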

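Since $V'$ and $g''$ are bounded functions, it follows that, for all $x^d \in F_d$,

$$|D(x^d; Z_1; W)| \le d\,c_d\,3K\sigma_d^3\{K + (|g'(x_1)| + \sigma_d K)^2\} \to 0 \quad \text{as } d \to \infty,$$

for some $K > 0$, and the lemma is proved.

Lemma A.3 states that, for all $x^d \in F_d$, the generator $G_d$ can be approximated by the generator $\tilde{G}_d$, which resembles the limiting generator $G$. Thus, we now need to consider, for all $x^d \in F_d$, $E_1[1 \wedge e^{B_d}]$ and $E_1[e^{B_d}; B_d < 0]$. The aim is to approximate $B_d$ by a more convenient quantity $A_d$ (to be defined in Lemma A.6) and, hence, show that

$$E_1[1 \wedge e^{B_d}] \to 2\Phi\left(-\frac{l\sqrt{cI}}{2}\right) \quad \text{and} \quad E_1[e^{B_d}; B_d < 0] \to \Phi\left(-\frac{l\sqrt{cI}}{2}\right) \quad \text{as } d \to \infty.$$

This will be done in the following lemmas.

Lemma A.4. Let $\lambda_d\ (= \lambda_d(x^d)) = \frac{1}{d-1}\sum_{i=2}^d \chi_i g'(x_i)^2$. For any $\varepsilon > 0$,

$$\sup_{x^d \in F_d} P_1(|\lambda_d - cI| > \varepsilon) \to 0 \quad \text{as } d \to \infty.$$

Proof. Let $R_d\ (= R_d(x^d)) = \frac{1}{d-1}\sum_{i=2}^d g'(x_i)^2$. Then, for $x^d \in F_d$,

$$|\lambda_d - cI| \le |\lambda_d - E_1[\lambda_d]| + |E_1[\lambda_d] - cR_d| + |cR_d - cI|.$$

Note that $E_1[\lambda_d] = c_d R_d$, and so, by Lemma A.1, we have that

$$|E_1[\lambda_d] - cR_d| + |cR_d - cI| \to 0 \quad \text{as } d \to \infty.$$

Therefore, to prove the lemma, it suffices to show that, for any $\varepsilon > 0$, $P_1(|\lambda_d - E_1[\lambda_d]| > \varepsilon) \to 0$ as $d \to \infty$. Note that

$$\lambda_d^2 = \left(\frac{1}{d-1}\right)^2\sum_{i=2}^d\sum_{j=2}^d \chi_i\chi_j\,g'(x_i)^2 g'(x_j)^2,$$

and so $E_1[\lambda_d^2]$ is a weighted combination of $\left(\frac{1}{d-1}\right)^2\sum_{i=2}^d g'(x_i)^4$ and $\left(\frac{1}{d-1}\right)^2\sum_{i=2}^d\sum_{j \ne i} g'(x_i)^2 g'(x_j)^2$, with weights given by the conditional probabilities that one, respectively two, given components are updated.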

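Collecting terms, $E_1[\lambda_d^2]$ can be written as a multiple of $R_d^2$ plus a multiple of $\left(\frac{1}{d-1}\right)^2\sum_{i=2}^d g'(x_i)^4$, with coefficients determined by $c_d$ and $d$. Then since $\sup_{x^d \in F_d}\left(\frac{1}{d-1}\right)^2\sum_{i=2}^d g'(x_i)^4 \to 0$ and $c_d \to c$ as $d \to \infty$, it follows that, for all $x^d \in F_d$, $E_1[(\lambda_d - E_1[\lambda_d])^2] \to 0$ as $d \to \infty$ and, hence, by Chebyshev's inequality,

$$\sup_{x^d \in F_d} P_1(|\lambda_d - E_1[\lambda_d]| > \varepsilon) \to 0 \quad \text{as } d \to \infty,$$

as required.

Lemma A.5. Let

$$W_d\ (= W_d(x^d)) = \sum_{i=2}^d\left\{\frac{1}{2}g''(x_i)(Y_i - x_i)^2 + \frac{c_d l^2}{2(d-1)}g'(x_i)^2\right\}$$

and $c_d \to c$ as $d \to \infty$. Then, recalling that $\sigma_d^2 = l^2/(d-1)$,

$$\sup_{x^d \in F_d} E_1[|W_d|] \to 0 \quad \text{as } d \to \infty.$$

Proof. First, note that $E_1[|W_d|]^2 \le E_1[W_d^2]$. Then, by direct calculations,

$$E_1[W_d^2] = \sum_{i=2}^d\sum_{j=2}^d E_1\left[\left\{\frac{1}{2}g''(x_i)(Y_i - x_i)^2 + \frac{c_d l^2}{2(d-1)}g'(x_i)^2\right\}\left\{\frac{1}{2}g''(x_j)(Y_j - x_j)^2 + \frac{c_d l^2}{2(d-1)}g'(x_j)^2\right\}\right] = W_{d,1} + W_{d,2}, \quad \text{say},$$

where $W_{d,1}$ collects the diagonal correction, which is of order $\sigma_d^4\sum_{i=2}^d g''(x_i)^2$, and $W_{d,2}$ is the double sum over $(i, j)$ of the expected cross products of the terms $\frac{1}{2}g''(x_i)(Y_i - x_i)^2$ and $\frac{c_d l^2}{2(d-1)}g'(x_j)^2$.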

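Let

$$W_{d,3}\ (= W_{d,3}(x^d)) = \left\{\frac{cl^2}{2(d-1)}\sum_{i=2}^d\left(g''(x_i) + g'(x_i)^2\right)\right\}^2,$$

and since $c_d \to c$ as $d \to \infty$, we have that $\sup_{x^d \in F_d}|W_{d,2} - W_{d,3}| \to 0$ as $d \to \infty$. However, by the definition of $F_d$, $\sup_{x^d \in F_d} W_{d,3} \to 0$, and since $g''$ is bounded, $\sup_{x^d \in F_d} W_{d,1} \to 0$ as $d \to \infty$. The lemma follows immediately.

Lemma A.6. Let

$$A_d\ (= A_d(x^d)) = \sum_{i=2}^d\left\{g'(x_i)(Y_i - x_i) - \frac{c_d l^2}{2(d-1)}g'(x_i)^2\right\}.$$

Then,

$$\sup_{x^d \in F_d}\left|E_1[1 \wedge e^{A_d}] - E_1[1 \wedge e^{B_d}]\right| \to 0 \quad \text{as } d \to \infty \tag{A.10}$$

and

$$\sup_{x^d \in F_d}\left|E_1[e^{A_d}; A_d < 0] - E_1[e^{B_d}; B_d < 0]\right| \to 0 \quad \text{as } d \to \infty. \tag{A.11}$$

Proof. Note that

$$B_d = \sum_{i=2}^d (g(Y_i) - g(x_i)) = \sum_{i=2}^d\left\{g'(x_i)(Y_i - x_i) + \frac{1}{2}g''(x_i)(Y_i - x_i)^2 + \frac{1}{6}g'''(\alpha_i)(Y_i - x_i)^3\right\},$$

for some $\alpha_i$ lying between $x_i$ and $Y_i$. Therefore, by [6], Proposition .,

$$\left|E_1[1 \wedge e^{A_d}] - E_1[1 \wedge e^{B_d}]\right| \le E_1[|W_d|] + \sup_{a \in \mathbb{R}}|g'''(a)|\,\frac{1}{6}E_1\left[\sum_{i=2}^d |Y_i - x_i|^3\right] = E_1[|W_d|] + \sup_{a \in \mathbb{R}}|g'''(a)|\,\frac{c_d d}{6}\sigma_d^3 E[|Z|^3]. \tag{A.12}$$

Now let $\varphi_d = \sup_{x^d \in F_d}\left\{E_1[|W_d|] + \sup_{a \in \mathbb{R}}|g'''(a)|\,\frac{c_d d}{6}\sigma_d^3 E[|Z|^3]\right\}$, where $W_d$ is defined in Lemma A.5. Then, since $g'''$ is a bounded function, it follows from Lemma A.5 that $\varphi_d \to 0$ as $d \to \infty$, and so (A.10) is proved.

Let $J_d\ (= J_d(x^d)) = \left|e^{A_d}\mathbf{1}\{A_d < 0\} - e^{B_d}\mathbf{1}\{B_d < 0\}\right|$ and let $\delta_d = \sqrt{\varphi_d}$. Then we proceed by showing that

$$\sup_{x^d \in F_d} P_1(J_d > \delta_d) \to 0 \quad \text{as } d \to \infty.$$

Note that, if $A_d, B_d > 0$, then $J_d = 0 \le |A_d - B_d|$,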

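and if $A_d, B_d < 0$, then $J_d = |\exp(A_d) - \exp(B_d)| \le |A_d - B_d|$. Therefore, it follows that

$$P_1(J_d > \delta_d) \le P_1(-\delta_d < A_d < \delta_d) + P_1(|A_d - B_d| \ge \delta_d). \tag{A.13}$$

By Markov's inequality,

$$P_1(|A_d - B_d| \ge \delta_d) \le \frac{1}{\delta_d}E_1[|A_d - B_d|] \le \frac{1}{\delta_d}\left\{E_1[|W_d|] + \sup_{a \in \mathbb{R}}|g'''(a)|\,\frac{1}{6}E_1\left[\sum_{i=2}^d |Y_i - x_i|^3\right]\right\} \le \sqrt{\varphi_d}, \tag{A.14}$$

and so, $P_1(|A_d - B_d| > \delta_d) \to 0$ as $d \to \infty$, uniformly for $x^d \in F_d$. Fix $x^d \in F_d$; then for any $\varepsilon > 0$, by Lemma A.4,

$$P_1\left(\left|\pm\frac{\delta_d}{l\sqrt{\lambda_d}} + \frac{c_d l R_d}{2\sqrt{\lambda_d}} - \frac{l\sqrt{cI}}{2}\right| > \varepsilon\right) \to 0 \quad \text{as } d \to \infty. \tag{A.15}$$

Hence,

$$E_1\left[\Phi\left(\pm\frac{\delta_d}{l\sqrt{\lambda_d}} + \frac{c_d l R_d}{2\sqrt{\lambda_d}}\right)\right] \to \Phi\left(\frac{l\sqrt{cI}}{2}\right) \quad \text{as } d \to \infty.$$

Thus,

$$\sup_{x^d \in F_d} P_1(-\delta_d < A_d < \delta_d) \to 0 \quad \text{as } d \to \infty. \tag{A.16}$$

Therefore, by (A.13) to (A.16), $\sup_{x^d \in F_d} P_1(J_d > \delta_d) \to 0$ as $d \to \infty$. Then since $J_d \le 1$, it follows that $\sup_{x^d \in F_d} E_1[J_d] \to 0$ as $d \to \infty$, and so (A.11) is proved.

Lemma A.7.

$$\sup_{x^d \in F_d}\left|E_1[1 \wedge e^{A_d}] - 2\Phi\left(-\frac{l\sqrt{cI}}{2}\right)\right| \to 0 \quad \text{as } d \to \infty \tag{A.17}$$

and

$$\sup_{x^d \in F_d}\left|E_1[e^{A_d}; A_d < 0] - \Phi\left(-\frac{l\sqrt{cI}}{2}\right)\right| \to 0 \quad \text{as } d \to \infty. \tag{A.18}$$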
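The limits in Lemma A.7 rest on a standard Gaussian identity: if $A \sim N(m, s^2)$, then $E[1 \wedge e^A] = \Phi(m/s) + e^{m + s^2/2}\Phi(-s - m/s)$, which reduces to $2\Phi(-s/2)$ when $m = -s^2/2$. The sketch below checks the identity numerically with a simple trapezoidal rule; the function names and the quadrature settings are illustrative choices, not part of the paper.

```python
import math


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def closed_form(m, s):
    """E[min(1, exp(A))] for A ~ N(m, s^2), from the Gaussian identity."""
    return Phi(m / s) + math.exp(m + 0.5 * s * s) * Phi(-s - m / s)


def quadrature(m, s, n=40001, z_max=10.0):
    """Trapezoidal approximation of E[min(1, exp(m + s Z))], Z ~ N(0, 1)."""
    h = 2.0 * z_max / (n - 1)
    total = 0.0
    for k in range(n):
        z = -z_max + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * min(1.0, math.exp(m + s * z)) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


for s in (0.5, 1.0, 2.38):
    m = -0.5 * s * s                      # the case arising in the limit
    print(s, closed_form(m, s), quadrature(m, s), 2.0 * Phi(-0.5 * s))
```

With $s^2 = l^2 cI$ the special case gives the limiting acceptance rate $2\Phi(-l\sqrt{cI}/2)$ appearing in (A.17).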

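Proof. Since, conditional on $\chi$, $A_d \sim N\left(-\frac{1}{2}c_d l^2 R_d,\ l^2\lambda_d\right)$, it follows by [6], Proposition .4, that

$$E_1[1 \wedge e^{A_d}] = E_1\left[\Phi\left(-\frac{c_d l R_d}{2\sqrt{\lambda_d}}\right) + \exp\left(-\frac{l^2}{2}(c_d R_d - \lambda_d)\right)\Phi\left(-l\sqrt{\lambda_d} + \frac{c_d l R_d}{2\sqrt{\lambda_d}}\right)\right]. \tag{A.19}$$

Since for any $x^d \in F_d$ and $\varepsilon > 0$, $P(|R_d - I| > \varepsilon) \to 0$ and $P_1(|\lambda_d - cI| > \varepsilon) \to 0$ as $d \to \infty$, (A.17) follows from (A.19). (A.18) is proved similarly.

We are now in a position to show that, for all $x^d \in F_d$, the generator $G_d$ converges to the generator $G$ as $d \to \infty$.

Theorem A.8. For $V \in C_c^\infty$,

$$\sup_{x^d \in F_d}|G_d V(x^d) - GV(x_1)| \to 0 \quad \text{as } d \to \infty.$$

Proof. By Lemma A.3, $\sup_{x^d \in F_d}|G_d V(x^d) - \tilde{G}_d V(x^d)| \to 0$ as $d \to \infty$, and by Lemmas A.6 and A.7, $\sup_{x^d \in F_d}|\tilde{G}_d V(x^d) - GV(x_1)| \to 0$ as $d \to \infty$. Thus, the theorem is proved.

Proof of Theorem 3.1. The proof is similar to that of [6]. From Lemmas A.1, A.4 and Theorem A.8, we have uniform convergence of $G_d V$ to $GV$ for vectors contained in a set of $\pi$ measure arbitrarily close to 1. Since $C_c^\infty$ separates points (see [4], page 3), the result will follow by [4], Chapter 4, Corollary 8.7 if we can demonstrate the compact containment condition, which in our case follows from the following statement. For all $\varepsilon > 0$, and all real valued $U_0^d = X_{0,1}^d$, we can find $K > 0$ sufficiently large with

$$P(U_t^d \notin (-K, K),\ 0 \le t \le 1) \le \varepsilon, \quad \text{for all } d.$$

We appeal directly to the explicit form of the Metropolis transitions and assume that the Lipschitz constant for $g'$ is termed $b$. Thus, the following estimates are easy to derive by just noting that squared jumping distances are bounded above by that attained by ignoring rejections. Moreover, these estimates are uniform over all $X_n^d$:

$$-b\sigma_d^2 e^{b^2\sigma_d^2/2} \le E[X_{n+1,1}^d - X_{n,1}^d \mid X_n^d] \le b\sigma_d^2 e^{b^2\sigma_d^2/2}$$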

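and

$$E[(X_{n+1,1}^d - X_{n,1}^d)^2 \mid X_n^d] \le E[(Y_{n+1,1}^d - X_{n,1}^d)^2 \mid X_n^d] = \sigma_d^2.$$

Thus, setting $V_n = X_{n,1}^d + nb\sigma_d^2 e^{b^2\sigma_d^2/2}$, $\{V_n, 0 \le n \le [d]\}$ is a submartingale with

$$E[V_{[d]}^2] \le d\left(\sigma_d^2 + b\sigma_d^2 e^{b^2\sigma_d^2/2}\right). \tag{A.20}$$

Since $\sigma_d^2 = l^2/(d-1)$, the right-hand side of (A.20) is uniformly bounded in $d$, so that the upper bound result follows by Doob's inequality. The lower bound follows similarly by considering the supermartingale $X_{n,1}^d - nb\sigma_d^2 e^{b^2\sigma_d^2/2}$.

A.2. Proofs of Section 4. The proofs of Theorems 4.1 and 4.2 are similar to the proofs of Theorems 1 and 2 in [7], respectively. The only complication in the proofs is that we are updating a random set of components at each iteration in the MALA algorithm.

Let $x = (x_1, x_2, \ldots)$ and for $d \ge 1$, let $x^d = (x_1^d, x_2^d, \ldots, x_d^d)$, where, for $1 \le i \le d$, $x_i^d = x_i$. Thus, we shall again use $x_i$ and $x_i^d$ interchangeably as appropriate. Let $G_d$ be the (discrete-time) generator of $X^d$ and let $V \in C_c^\infty$ be an arbitrary test function of the first component only. Thus,

$$G_d V(x^d) = d^{1/3}E\left[(V(Y_1) - V(x_1))\left\{1 \wedge \frac{\pi_d(Y^d)}{\pi_d(x^d)}\right\}\right] = d^{1/3}P(\chi_1 = 1)\,E_1\left[(V(Y_1) - V(x_1))\left\{1 \wedge \frac{\pi_d(Y^d)}{\pi_d(x^d)}\right\}\right], \tag{A.21}$$

where $E_1$ is as defined in Section A.1 (after Lemma A.1). The generator $G$ of the one-dimensional diffusion described in Theorem 4.2, for an arbitrary test function $V$, is given by

$$GV(x_1) = 2cl^2\,\Phi\left(-\frac{\sqrt{c}\,l^3K}{2}\right)\left\{\frac{1}{2}g'(x_1)V'(x_1) + \frac{1}{2}V''(x_1)\right\} = h_c(l)\left\{\frac{1}{2}g'(x_1)V'(x_1) + \frac{1}{2}V''(x_1)\right\}, \tag{A.22}$$

where $K$ and $h_c(l)$ are defined in Section 5. The aim thus, as in Section A.1, is to find a sequence of sets $\{F_d \subseteq \mathbb{R}^d\}$ such that, for all $t > 0$,

$$P(\Gamma_s^d \in F_d, \text{ for all } 0 \le s \le t) \to 1 \quad \text{as } d \to \infty,$$

and, for $V \in C_c^\infty$,

$$\sup_{x^d \in F_d}|G_d V(x^d) - GV(x_1)| \to 0 \quad \text{as } d \to \infty.$$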

The proofs of Theorems 4.1 and 4.2 are then straightforward. The first step is therefore to construct the sets $\{F_d \subseteq \mathbb{R}^d\}$. However, this is much more involved than for the RWM-within-Gibbs in Section A.1. Thus, it will be more convenient to construct the sets $F_d$ through the preliminary lemmas which lead to the proof of Theorems 4.1 and 4.2. The next step will involve a Taylor series expansion of $G_d V(x^d)$ to show that, for large $d$, $GV(x_1)$ is a good approximation for $G_d V(x^d)$. Thus, we begin by studying $\log\left(\frac{\pi_d(Y^d)}{\pi_d(x^d)}\right)$.

Lemma A.9. There exists a sequence of sets $F_{d,1} \subseteq \mathbb{R}^d$, with $\lim_{d \to \infty}\{d^{1/3}\pi_d(F_{d,1}^C)\} = 0$, such that, for $\chi_i = 1$,

$$\log\left\{\frac{f(Y_i)q(Y_i, x_i)}{f(x_i)q(x_i, Y_i)}\right\} = C_3(x_i, Z_i)d^{-1/2} + C_4(x_i, Z_i)d^{-2/3} + C_5(x_i, Z_i)d^{-5/6} + C_6(x_i, Z_i)d^{-1} + C_7(x_i, Z_i, \sigma_d),$$

where

$$C_3(x_i, Z_i) = l^3\left\{-\frac{1}{4}Z_i\,g'(x_i)g''(x_i) - \frac{1}{12}Z_i^3\,g'''(x_i)\right\},$$

and where $C_4(x_i, Z_i)$, $C_5(x_i, Z_i)$ and $C_6(x_i, Z_i)$ are polynomials in $Z_i$ and the derivatives of $g$. Furthermore, if $E_Z$ and $E_X$ denote expectation with $Z \sim N(0, 1)$ and $X$ having density $f(\cdot)$, respectively, then

$$E_X E_Z[C_3(X, Z)] = E_X E_Z[C_4(X, Z)] = E_X E_Z[C_5(X, Z)] = 0, \tag{A.23}$$

whereas

$$E_X E_Z[C_3(X, Z)^2] = l^6K^2 = -2E_X E_Z[C_6(X, Z)]. \tag{A.24}$$

In addition,

$$\sup_{x^d \in F_{d,1}} E_1\left[\left|\sum_{i=2}^d \log\left\{\frac{f(Y_i)q(Y_i, x_i)}{f(x_i)q(x_i, Y_i)}\right\} - \left\{d^{-1/2}\sum_{i=2}^d \chi_i C_3(x_i, Z_i) - \frac{cl^6K^2}{2}\right\}\right|\right] \to 0 \quad \text{as } d \to \infty. \tag{A.25}$$

Proof. With the exception of (A.25) and the exact form of the sets $F_{d,1}$, the lemma is proved in [7], Lemma .

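For $j = 4, 5, 6$ and $x \in \mathbb{R}$, set $c_j(x) = E_Z[C_j(x, Z)]$ and $v_j(x) = \mathrm{var}_Z(C_j(x, Z))$. The set $F_{d,1,j} = \bigcap_{k=1}^3 F_{d,1,j,k}$, where

$$F_{d,1,j,1} = \left\{x^d;\ \left|\sum_{i=1}^d \{c_j(x_i) - E_X[c_j(X)]\}\right| < d^{5/8}\right\},$$

$$F_{d,1,j,2} = \left\{x^d;\ \left|\sum_{i=1}^d \{v_j(x_i) - E_X[v_j(X)]\}\right| < d^{6/5}\right\},$$

$$F_{d,1,j,3} = \left\{x^d;\ \sum_{i=1}^d \{c_j(x_i) - E_X[c_j(X)]\}^2 < d^{6/5}\right\}.$$

Then for $j = 4, 5, 6$ and $k = 1, 2, 3$, it is straightforward, using Markov's inequality and conditions (4.1) and (4.2), to show that $d^{1/3}\pi_d(F_{d,1,j,k}^C) \to 0$ as $d \to \infty$. (Cf. [7], Lemma , where only the cases $k = 1, 2$ are required.) Finally, let $\{F_{d,1,7} \subseteq \mathbb{R}^d\}$ correspond to the sets $\{F_{n,7}\}$ constructed in [7], Lemma , and so, $d^{1/3}\pi_d(F_{d,1}^C) \to 0$ as $d \to \infty$, where $F_{d,1} = \bigcap_{j=4}^7 F_{d,1,j}$. The proof of (A.25) is then essentially the same as the proof of the final expression in [7], Lemma , and, hence, the details are omitted.

The next step is to find a convenient approximation for $G_d V(x^d)$ which effectively allows us to consider separately the first component and the remaining $(d - 1)$ components.

Lemma A.10. Let $\tilde{G}_d V(x^d) = c_d d^{1/3}E_1[V(Y_1) - V(x_1)]\,E_1[1 \wedge e^{B_d}]$, where

$$B_d\ (= B_d(x^d)) = \sum_{i=2}^d\left\{(g(Y_i) - g(x_i)) - \frac{1}{2\sigma_d^2}\left(x_i - Y_i - \frac{\sigma_d^2}{2}g'(Y_i)\right)^2 + \frac{1}{2\sigma_d^2}\left(Y_i - x_i - \frac{\sigma_d^2}{2}g'(x_i)\right)^2\right\}.$$

There exist sets $F_{d,2} \subseteq \mathbb{R}^d$ with $\lim_{d \to \infty} d^{1/3}\pi_d(F_{d,2}^C) = 0$ such that, for any $V \in C_c^\infty$,

$$\sup_{x^d \in F_{d,2}}|G_d V(x^d) - \tilde{G}_d V(x^d)| \to 0 \quad \text{as } d \to \infty.$$

Moreover,

$$\sup_{x^d \in F_{d,2}} E_1\left[\left|\left(1 \wedge \frac{\pi_d(Y^d)q(Y^d, x^d)}{\pi_d(x^d)q(x^d, Y^d)}\right) - (1 \wedge e^{B_d})\right|\right] \to 0 \quad \text{as } d \to \infty. \tag{A.26}$$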

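Proof. Since, conditional upon $\chi$, $Y_1$ and $Y^{d-}$ are independent, it follows that

$$\tilde{G}_d V(x^d) = c_d d^{1/3}E_1[(V(Y_1) - V(x_1))(1 \wedge e^{B_d})].$$

The lemma then follows by identical arguments to those used in [7], Theorem 3, with the sets $\{F_{d,2}\}$ chosen to correspond to the sets $\{S_n\}$ in [7], Theorem 3.

Lemma A.11. Let $F_{d,3} = \{x^d;\ |g'(x_1)| \le d^{1/12}\}$; then $d^{1/3}\pi_d(F_{d,3}^C) \to 0$ as $d \to \infty$ and, for any $V \in C_c^\infty$,

$$\sup_{x^d \in F_{d,3}}\left|d^{1/3}c_d E_1[V(Y_1) - V(x_1)] - cl^2\left\{\frac{1}{2}g'(x_1)V'(x_1) + \frac{1}{2}V''(x_1)\right\}\right| \to 0 \quad \text{as } d \to \infty,$$

with $x_1^d = x_1$.

Proof. The proof is identical to [7], Lemma  and is, hence, omitted.

We now focus on the remaining $(d - 1)$ components. First we introduce the following notation. Let $a(x) = -\frac{1}{4}g'(x)g''(x)$ and $b(x) = -\frac{1}{12}g'''(x)$. Therefore, we have that

$$C_3(x, z) = l^3\{a(x)z + b(x)z^3\}.$$

Set

$$Q_d(x^d, \cdot) = \mathcal{L}\left\{d^{-1/2}\sum_{i=2}^d \chi_i C_3(x_i, Z_i) \,\Big|\, \chi_1 = 1\right\}.$$

Let $\phi_d(x^d, t) = \int_{\mathbb{R}}\exp(itw)\,Q_d(x^d, dw)$ and let $\phi(t) = \exp\left(-\frac{t^2cl^6K^2}{2}\right)$.

Lemma A.12. There exists a sequence of sets $F_{d,4} \subseteq \mathbb{R}^d$ such that:

(a) $\lim_{d \to \infty}\{d^{1/3}\pi_d(F_{d,4}^C)\} = 0$,

(b) for all $t \in \mathbb{R}$, $\sup_{x^d \in F_{d,4}}|\phi_d(x^d; t) - \phi(t)| \to 0$ as $d \to \infty$,

(c) for all bounded continuous functions $r$,

$$\sup_{x^d \in F_{d,4}}\left|\int_{\mathbb{R}} Q_d(x^d, dy)\,r(y) - \frac{1}{\sqrt{2\pi c}\,l^3K}\int_{\mathbb{R}} r(y)\exp\left(-\frac{y^2}{2cl^6K^2}\right)dy\right| \to 0 \quad \text{as } d \to \infty,$$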

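(d)

$$\sup_{x^d \in F_{d,4}}\left|E_1\left[1 \wedge \exp\left\{d^{-1/2}\sum_{i=2}^d \chi_i C_3(x_i, Z_i) - \frac{cl^6K^2}{2}\right\}\right] - 2\Phi\left(-\frac{\sqrt{c}\,l^3K}{2}\right)\right| \to 0 \quad \text{as } d \to \infty.$$

Proof. The sets $F_{d,4}$ are constructed as in the proof of [7], Lemma 3, and so, statement (a) follows. Specifically, we let $F_{d,4}$ be the set of $x^d \in \mathbb{R}^d$ such that

$$\left|\frac{1}{d}\sum_{i=1}^d h(x_i) - \int h(x)f(x)\,dx\right| \le d^{-1/4}, \tag{A.27}$$

$$|h(x_i)| \le d^{3/4}, \quad 1 \le i \le d, \tag{A.28}$$

for each of the functionals $h(x) = a(x)^2,\ b(x)^2,\ a(x)b(x),\ a(x)^4,\ b(x)^4,\ a(x)^3b(x),\ a(x)^2b(x)^2,\ a(x)b(x)^3$. Since statements (c) and (d) follow from statement (b) as outlined in [7], Lemma 3, all that is required is to prove (b).

Let $L = \{j;\ \chi_j = 1,\ 2 \le j \le d\}$ and let

$$\theta_j^d(x_j; t) = E\left[\exp\left(\frac{it}{\sqrt{d}}C_3(x_j, Z_j)\right)\right].$$

Let

$$\phi_d^\Lambda(x^d; t) = E\left[\exp\left\{\frac{it}{\sqrt{d}}\sum_{j \in \Lambda}C_3(x_j, Z_j)\right\} \,\Big|\, L = \Lambda\right].$$

Then since $\{C_3(x_j, Z_j)\}_{j=2}^d$ are independent random variables, it follows that

$$\phi_d^\Lambda(x^d; t) = \prod_{j \in \Lambda}\theta_j^d(x_j; t).$$

Therefore,

$$\phi_d(x^d; t) = E_1\left[\exp\left\{\frac{it}{\sqrt{d}}\sum_{j=2}^d \chi_j C_3(x_j, Z_j)\right\}\right] = E_1\left[\prod_{j \in L}\theta_j^d(x_j; t)\right]$$

and so,

$$\sup_{x^d \in F_{d,4}}\left|\phi_d(x^d; t) - E_1\left[\prod_{j \in L}\left\{1 - \frac{t^2}{2d}v(x_j)\right\}\right]\right| \le \sup_{x^d \in F_{d,4}}E_1\left[\left|\prod_{j \in L}\theta_j^d(x_j; t) - \prod_{j \in L}\left\{1 - \frac{t^2}{2d}v(x_j)\right\}\right|\right], \tag{A.29}$$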

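where $v(x_j) = \mathrm{var}_Z(C_3(x_j, Z)) = l^6\{a(x_j)^2 + 6a(x_j)b(x_j) + 15b(x_j)^2\}$. The right-hand side of (A.29) converges to 0 as $d \to \infty$ by arguments similar to those used in [7], Lemma 3. Hence, the details are omitted.

Now by using a Taylor series expansion for $\exp\left(-\frac{t^2}{2d}\sum_{j=2}^d \chi_j v(x_j)\right)$, it is trivial to show that

$$\sup_{x^d \in F_{d,4}}\left|E_1\left[\prod_{j \in L}\left\{1 - \frac{t^2}{2d}v(x_j)\right\}\right] - E_1\left[\exp\left\{-\frac{t^2}{2d}\sum_{j=2}^d \chi_j v(x_j)\right\}\right]\right| \to 0 \quad \text{as } d \to \infty, \tag{A.30}$$

since for all $x^d \in F_{d,4}$, $\frac{1}{d^2}\sum_{j=2}^d v(x_j)^2 \to 0$ as $d \to \infty$ (cf. [7], Lemma 3). The final step to complete the proof of statement (b) is to show that

$$\sup_{x^d \in F_{d,4}}\left|E_1\left[\exp\left(-\frac{t^2}{2d}\sum_{j=2}^d \chi_j v(x_j)\right)\right] - \exp\left(-\frac{t^2cl^6K^2}{2}\right)\right| \to 0 \quad \text{as } d \to \infty.$$

This follows immediately, since using Chebyshev's inequality, we can show that, for all $\varepsilon > 0$,

$$\sup_{x^d \in F_{d,4}}P_1\left(\left|\frac{t^2}{2d}\sum_{j=2}^d \chi_j v(x_j) - \frac{t^2cl^6K^2}{2}\right| > \varepsilon\right) \to 0 \quad \text{as } d \to \infty.$$

Thus, statement (b) is proved and the lemma follows.

We are now in position to prove Theorems 4.1 and 4.2.

Proof of Theorem 4.1. The theorem follows from (A.25), (A.26) and part (d) of Lemma A.12.

Proof of Theorem 4.2. We take $F_d = F_{d,1} \cap F_{d,2} \cap F_{d,3} \cap F_{d,4}$. Then $d^{1/3}\pi_d(F_d^C) \to 0$ as $d \to \infty$, and so, for fixed $T$,

$$P(\Gamma_t^d \in F_d,\ 0 \le t \le T) \to 1 \quad \text{as } d \to \infty.$$

Also, from Lemmas A.9 to A.12, it follows that

$$\sup_{x^d \in F_d}|G_d V(x^d) - GV(x_1)| \to 0 \quad \text{as } d \to \infty$$

for all $V \in C_c^\infty$ which depend only on the first coordinate. Therefore, the weak convergence follows by [4], Chapter 4, Corollary 8.7, since $C_c^\infty$ separates points and an identical argument to that of Theorem 3.1 can be used to demonstrate compact containment. The maximizing of $h_c(l)$ is straightforward using the proof of [7], Theorem .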
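To make the final maximization concrete, the sketch below maximizes the MALA-within-Gibbs speed numerically, assuming it takes the form $h_c(l) = 2cl^2\Phi(-\sqrt{c}\,l^3K/2)$ suggested by (A.22) and part (d) of Lemma A.12. The function names, the grid search and the value $K = 1$ are illustrative assumptions, and computing cost per iteration is deliberately ignored.

```python
import math


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mala_speed(ell, c, K):
    """Assumed limiting speed h_c(l) = 2 c l^2 Phi(-sqrt(c) l^3 K / 2)."""
    return 2.0 * c * ell ** 2 * Phi(-0.5 * math.sqrt(c) * ell ** 3 * K)


def mala_acceptance(ell, c, K):
    """Limiting acceptance rate 2 Phi(-sqrt(c) l^3 K / 2)."""
    return 2.0 * Phi(-0.5 * math.sqrt(c) * ell ** 3 * K)


def best_mala_scaling(c, K, steps=20000):
    """Grid search over u = sqrt(c) l^3 K in (0, 10], mapped back to l."""
    grid = ((10.0 * (k + 1) / steps / (math.sqrt(c) * K)) ** (1.0 / 3.0)
            for k in range(steps))
    return max(grid, key=lambda ell: mala_speed(ell, c, K))


for c in (0.25, 0.5, 1.0):
    ell = best_mala_scaling(c, K=1.0)
    print(c, round(mala_acceptance(ell, c, 1.0), 3), round(mala_speed(ell, c, 1.0), 4))
```

Substituting $u = \sqrt{c}\,l^3K$ gives $h_c(l) = 2c^{2/3}K^{-2/3}u^{2/3}\Phi(-u/2)$, so the optimal acceptance rate is the same for every $c$ while the maximal speed grows like $c^{2/3}$. This is one way to see why, unlike the RWM case, the proportion of components updated matters for Langevin updates.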

34 34 P. NEAL AND G. ROBERTS A.3. Proofs of Section 5. The proof of Theorem 5. is very similar to the proof of Theorem 3. given in Section A.. First, for x R, let x = x,x,...), x = i=3 x i an let x = lim x, shoul the limit exist. For x R, let x R be such that x = x,x,...,x ) [= x,x,...,x ), say], that is, x comprises the first components of x. Then let G be the iscrete-time) generator of X, an let V Cc be an arbitrary test function of x,x an x only. Thus, [ G V x ) = E V Y ) V x )) π Y ) π x ) The generator G of the three-imensional iffusion escribe in Theorem 5., for an arbitrary test function V of x,x an x, is given by )) GV x ) = cl cl Φ ρ i= ρ x i x) x i V x ) + x i }]. } V x ). We shall efine sets F R ; } such that for PX F C ) 0 as. This is one in Lemma A.3 an, thus, we can restrict attention to x F. Furthermore, Lemma A.3 ensures that, for all x F, lim x exists. Therefore, since we can restrict attention to x F, we aim to show that A.3) G V x ) GV x ) 0 as, x F which is prove in Theorem A.7 an then Theorem 5. follows trivially. Then efine sets F R ; } such that for PX F C ) 0 as. This is one in Lemma A.3 an, thus, we can restrict attention to x F. Lemma A.3. For k 5, efine the sequence of sets F,k R ; } by F, = x ; R x ) ρ) < /8 }, F, = x ; x x < /8 }, } F,3 = x ; max i x i < /8, F,4 = x ; x i ) < }, /8 i=

35 OPTIMAL SCALING FOR MCMC 35 ) F,5 = x 4 } ; ρ x i + θ x j < /8, i= j= where R x ) = i= x i j= x j ) an θ = Let F = 5 k= F,k, then A.3) PX F C ) 0 as. Proof. It is sufficient to show that, for k 5, ρ + )ρ )ρ. PX F,k C ) 0 as. For the cases k =,3,4 an 5, it is straightforwar but teious using Markov s inequality to prove the result. Therefore, the etails are omitte. For the case k =, let X = i=3 Xi 3) an let X = lim X. Therefore, by construction see Section 5), for all 3, ) 0 ) X )) N, + 3)ρ) X 0 ρ. ρ ρ Thus, A.33) ) X X )ρ = x} N + 3)ρ x, ρ ρ). + 3)ρ Therefore, by Markov s inequality, A.34) PX F C, ) = P X X /8 ) E[ X X 4 ], an the result follows trivially from A.33) an A.34). The proceure now iffers slightly from that given in Section A.. We postpone the fining of a suitable Taylor series expansion for G V x ) an first give Lemmas A.4, A.5 an A.6, which mirror Lemmas A.4, A.6 an A.7, respectively. The proofs of the aforementione lemmas are similar to the proofs of the corresponing results in Section A. an, hence, the etails are omitte. Lemma A.4. For k, let λ k = λk x )) = χ i ρ x i + θ xj). i k j= Then for any ε > 0, P k λ k c ) x F ρ > ε 0 as, where, for any ranom variable X an any subset A R, P k X A) = PX A χ k = ) an E k [X] = E[X χ k = ].

36 36 P. NEAL AND G. ROBERTS For z R, let h k z)= h k z;x )) = log π Y ) π x ) } Zk = z}. The role of h k z) is similar to that playe by h z) in Section A., with h k 0) equivalent to B cf. Lemma A.3). Lemma A.5. For an k, let A k = Ak x )) = σ i k χ i ρ x i + θ j= x j )Z i. Then, A.35) an x F E k [ eak ] E k [ e hk 0) ] 0 as, E k ;A k [eak < 0] E k 0) [ehk ;h k 0) < 0] 0 as. x F Lemma A.6. For k, E k [ eak ] Φ l ) c A.36) 0 as ρ an A.37) x F E k [eak ;A k < 0] Φ l ) c 0 as. ρ x F We are now in position to prove A.3). Theorem A.7. G V x ) GV x ) 0 as, x F cl ρ) Proof. Note that, for all 3, we have the following Taylor series expansion, for V : V Y ) V x ) = σ χ Z V x ) + χ x Z V x ) x + σ + χ Z x i=3 ) } χ i Z i V x ) x V x ) + χ Z x V x ) + χ χ Z Z x x V x )

Optimal scaling for partially updating MCMC algorithms. Neal, Peter and Roberts, Gareth. MIMS EPrint: Optimal scaling for partially upating MCMC algorithms Neal, Peter an Roberts, Gareth 006 MIMS EPrint: 007.93 Manchester Institute for Mathematical Sciences School of Mathematics The University of Manchester


Calculus in the AP Physics C Course The Derivative Limits an Derivatives Calculus in the AP Physics C Course The Derivative In physics, the ieas of the rate change of a quantity (along with the slope of a tangent line) an the area uner a curve are essential.

More information

Least-Squares Regression on Sparse Spaces

Least-Squares Regression on Sparse Spaces Least-Squares Regression on Sparse Spaces Yuri Grinberg, Mahi Milani Far, Joelle Pineau School of Computer Science McGill University Montreal, Canaa {ygrinb,mmilan1,jpineau}@cs.mcgill.ca 1 Introuction

More information

A Review of Multiple Try MCMC algorithms for Signal Processing

A Review of Multiple Try MCMC algorithms for Signal Processing A Review of Multiple Try MCMC algorithms for Signal Processing Luca Martino Image Processing Lab., Universitat e València (Spain) Universia Carlos III e Mari, Leganes (Spain) Abstract Many applications

More information

Mark J. Machina CARDINAL PROPERTIES OF "LOCAL UTILITY FUNCTIONS"

Mark J. Machina CARDINAL PROPERTIES OF LOCAL UTILITY FUNCTIONS Mark J. Machina CARDINAL PROPERTIES OF "LOCAL UTILITY FUNCTIONS" This paper outlines the carinal properties of "local utility functions" of the type use by Allen [1985], Chew [1983], Chew an MacCrimmon

More information

Lower bounds on Locality Sensitive Hashing

Lower bounds on Locality Sensitive Hashing Lower bouns on Locality Sensitive Hashing Rajeev Motwani Assaf Naor Rina Panigrahy Abstract Given a metric space (X, X ), c 1, r > 0, an p, q [0, 1], a istribution over mappings H : X N is calle a (r,

More information

Calculus and optimization

Calculus and optimization Calculus an optimization These notes essentially correspon to mathematical appenix 2 in the text. 1 Functions of a single variable Now that we have e ne functions we turn our attention to calculus. A function

More information

Calculus of Variations

Calculus of Variations Calculus of Variations Lagrangian formalism is the main tool of theoretical classical mechanics. Calculus of Variations is a part of Mathematics which Lagrangian formalism is base on. In this section,

More information

Relative Entropy and Score Function: New Information Estimation Relationships through Arbitrary Additive Perturbation

Relative Entropy and Score Function: New Information Estimation Relationships through Arbitrary Additive Perturbation Relative Entropy an Score Function: New Information Estimation Relationships through Arbitrary Aitive Perturbation Dongning Guo Department of Electrical Engineering & Computer Science Northwestern University

More information

On a limit theorem for non-stationary branching processes.

On a limit theorem for non-stationary branching processes. On a limit theorem for non-stationary branching processes. TETSUYA HATTORI an HIROSHI WATANABE 0. Introuction. The purpose of this paper is to give a limit theorem for a certain class of iscrete-time multi-type

More information

APPROXIMATE SOLUTION FOR TRANSIENT HEAT TRANSFER IN STATIC TURBULENT HE II. B. Baudouy. CEA/Saclay, DSM/DAPNIA/STCM Gif-sur-Yvette Cedex, France

APPROXIMATE SOLUTION FOR TRANSIENT HEAT TRANSFER IN STATIC TURBULENT HE II. B. Baudouy. CEA/Saclay, DSM/DAPNIA/STCM Gif-sur-Yvette Cedex, France APPROXIMAE SOLUION FOR RANSIEN HEA RANSFER IN SAIC URBULEN HE II B. Bauouy CEA/Saclay, DSM/DAPNIA/SCM 91191 Gif-sur-Yvette Ceex, France ABSRAC Analytical solution in one imension of the heat iffusion equation

More information

Some Examples. Uniform motion. Poisson processes on the real line

Some Examples. Uniform motion. Poisson processes on the real line Some Examples Our immeiate goal is to see some examples of Lévy processes, an/or infinitely-ivisible laws on. Uniform motion Choose an fix a nonranom an efine X := for all (1) Then, {X } is a [nonranom]

More information

Discrete Mathematics

Discrete Mathematics Discrete Mathematics 309 (009) 86 869 Contents lists available at ScienceDirect Discrete Mathematics journal homepage: wwwelseviercom/locate/isc Profile vectors in the lattice of subspaces Dániel Gerbner

More information

Conservation Laws. Chapter Conservation of Energy

Conservation Laws. Chapter Conservation of Energy 20 Chapter 3 Conservation Laws In orer to check the physical consistency of the above set of equations governing Maxwell-Lorentz electroynamics [(2.10) an (2.12) or (1.65) an (1.68)], we examine the action

More information

Jointly continuous distributions and the multivariate Normal

Jointly continuous distributions and the multivariate Normal Jointly continuous istributions an the multivariate Normal Márton alázs an álint Tóth October 3, 04 This little write-up is part of important founations of probability that were left out of the unit Probability

More information

Lecture 7: Interchange of integration and limit

Lecture 7: Interchange of integration and limit Lecture 7: Interchange of integration an limit Differentiating uner an integral sign To stuy the properties of a chf, we nee some technical result. When can we switch the ifferentiation an integration?

More information

LECTURE NOTES ON DVORETZKY S THEOREM

LECTURE NOTES ON DVORETZKY S THEOREM LECTURE NOTES ON DVORETZKY S THEOREM STEVEN HEILMAN Abstract. We present the first half of the paper [S]. In particular, the results below, unless otherwise state, shoul be attribute to G. Schechtman.

More information

Applications of the Wronskian to ordinary linear differential equations

Applications of the Wronskian to ordinary linear differential equations Physics 116C Fall 2011 Applications of the Wronskian to orinary linear ifferential equations Consier a of n continuous functions y i (x) [i = 1,2,3,...,n], each of which is ifferentiable at least n times.

More information

MA 2232 Lecture 08 - Review of Log and Exponential Functions and Exponential Growth

MA 2232 Lecture 08 - Review of Log and Exponential Functions and Exponential Growth MA 2232 Lecture 08 - Review of Log an Exponential Functions an Exponential Growth Friay, February 2, 2018. Objectives: Review log an exponential functions, their erivative an integration formulas. Exponential

More information

Partial Differential Equations

Partial Differential Equations Chapter Partial Differential Equations. Introuction Have solve orinary ifferential equations, i.e. ones where there is one inepenent an one epenent variable. Only orinary ifferentiation is therefore involve.

More information

II. First variation of functionals

II. First variation of functionals II. First variation of functionals The erivative of a function being zero is a necessary conition for the etremum of that function in orinary calculus. Let us now tackle the question of the equivalent

More information

Diophantine Approximations: Examining the Farey Process and its Method on Producing Best Approximations

Diophantine Approximations: Examining the Farey Process and its Method on Producing Best Approximations Diophantine Approximations: Examining the Farey Process an its Metho on Proucing Best Approximations Kelly Bowen Introuction When a person hears the phrase irrational number, one oes not think of anything

More information

Chapter 6: Energy-Momentum Tensors

Chapter 6: Energy-Momentum Tensors 49 Chapter 6: Energy-Momentum Tensors This chapter outlines the general theory of energy an momentum conservation in terms of energy-momentum tensors, then applies these ieas to the case of Bohm's moel.

More information

Function Spaces. 1 Hilbert Spaces

Function Spaces. 1 Hilbert Spaces Function Spaces A function space is a set of functions F that has some structure. Often a nonparametric regression function or classifier is chosen to lie in some function space, where the assume structure

More information

Lecture 10 Notes, Electromagnetic Theory II Dr. Christopher S. Baird, faculty.uml.edu/cbaird University of Massachusetts Lowell

Lecture 10 Notes, Electromagnetic Theory II Dr. Christopher S. Baird, faculty.uml.edu/cbaird University of Massachusetts Lowell Lecture 10 Notes, Electromagnetic Theory II Dr. Christopher S. Bair, faculty.uml.eu/cbair University of Massachusetts Lowell 1. Pre-Einstein Relativity - Einstein i not invent the concept of relativity,

More information

Modelling and simulation of dependence structures in nonlife insurance with Bernstein copulas

Modelling and simulation of dependence structures in nonlife insurance with Bernstein copulas Moelling an simulation of epenence structures in nonlife insurance with Bernstein copulas Prof. Dr. Dietmar Pfeifer Dept. of Mathematics, University of Olenburg an AON Benfiel, Hamburg Dr. Doreen Straßburger

More information

Notes on Lie Groups, Lie algebras, and the Exponentiation Map Mitchell Faulk

Notes on Lie Groups, Lie algebras, and the Exponentiation Map Mitchell Faulk Notes on Lie Groups, Lie algebras, an the Exponentiation Map Mitchell Faulk 1. Preliminaries. In these notes, we concern ourselves with special objects calle matrix Lie groups an their corresponing Lie

More information

Qubit channels that achieve capacity with two states

Qubit channels that achieve capacity with two states Qubit channels that achieve capacity with two states Dominic W. Berry Department of Physics, The University of Queenslan, Brisbane, Queenslan 4072, Australia Receive 22 December 2004; publishe 22 March

More information

SYNCHRONOUS SEQUENTIAL CIRCUITS

SYNCHRONOUS SEQUENTIAL CIRCUITS CHAPTER SYNCHRONOUS SEUENTIAL CIRCUITS Registers an counters, two very common synchronous sequential circuits, are introuce in this chapter. Register is a igital circuit for storing information. Contents

More information

Tractability results for weighted Banach spaces of smooth functions

Tractability results for weighted Banach spaces of smooth functions Tractability results for weighte Banach spaces of smooth functions Markus Weimar Mathematisches Institut, Universität Jena Ernst-Abbe-Platz 2, 07740 Jena, Germany email: markus.weimar@uni-jena.e March

More information

Generalized Tractability for Multivariate Problems

Generalized Tractability for Multivariate Problems Generalize Tractability for Multivariate Problems Part II: Linear Tensor Prouct Problems, Linear Information, an Unrestricte Tractability Michael Gnewuch Department of Computer Science, University of Kiel,

More information

COUPLING REQUIREMENTS FOR WELL POSED AND STABLE MULTI-PHYSICS PROBLEMS

COUPLING REQUIREMENTS FOR WELL POSED AND STABLE MULTI-PHYSICS PROBLEMS VI International Conference on Computational Methos for Couple Problems in Science an Engineering COUPLED PROBLEMS 15 B. Schrefler, E. Oñate an M. Paparakakis(Es) COUPLING REQUIREMENTS FOR WELL POSED AND

More information

Lectures - Week 10 Introduction to Ordinary Differential Equations (ODES) First Order Linear ODEs

Lectures - Week 10 Introduction to Ordinary Differential Equations (ODES) First Order Linear ODEs Lectures - Week 10 Introuction to Orinary Differential Equations (ODES) First Orer Linear ODEs When stuying ODEs we are consiering functions of one inepenent variable, e.g., f(x), where x is the inepenent

More information

Witten s Proof of Morse Inequalities

Witten s Proof of Morse Inequalities Witten s Proof of Morse Inequalities by Igor Prokhorenkov Let M be a smooth, compact, oriente manifol with imension n. A Morse function is a smooth function f : M R such that all of its critical points

More information

Iterated Point-Line Configurations Grow Doubly-Exponentially

Iterated Point-Line Configurations Grow Doubly-Exponentially Iterate Point-Line Configurations Grow Doubly-Exponentially Joshua Cooper an Mark Walters July 9, 008 Abstract Begin with a set of four points in the real plane in general position. A to this collection

More information

Section 2.7 Derivatives of powers of functions

Section 2.7 Derivatives of powers of functions Section 2.7 Derivatives of powers of functions (3/19/08) Overview: In this section we iscuss the Chain Rule formula for the erivatives of composite functions that are forme by taking powers of other functions.

More information

Proof of SPNs as Mixture of Trees

Proof of SPNs as Mixture of Trees A Proof of SPNs as Mixture of Trees Theorem 1. If T is an inuce SPN from a complete an ecomposable SPN S, then T is a tree that is complete an ecomposable. Proof. Argue by contraiction that T is not a

More information

0.1 Differentiation Rules

0.1 Differentiation Rules 0.1 Differentiation Rules From our previous work we ve seen tat it can be quite a task to calculate te erivative of an arbitrary function. Just working wit a secon-orer polynomial tings get pretty complicate

More information

Sturm-Liouville Theory

Sturm-Liouville Theory LECTURE 5 Sturm-Liouville Theory In the three preceing lectures I emonstrate the utility of Fourier series in solving PDE/BVPs. As we ll now see, Fourier series are just the tip of the iceberg of the theory

More information

Lecture XII. where Φ is called the potential function. Let us introduce spherical coordinates defined through the relations

Lecture XII. where Φ is called the potential function. Let us introduce spherical coordinates defined through the relations Lecture XII Abstract We introuce the Laplace equation in spherical coorinates an apply the metho of separation of variables to solve it. This will generate three linear orinary secon orer ifferential equations:

More information

Calculus of Variations

Calculus of Variations 16.323 Lecture 5 Calculus of Variations Calculus of Variations Most books cover this material well, but Kirk Chapter 4 oes a particularly nice job. x(t) x* x*+ αδx (1) x*- αδx (1) αδx (1) αδx (1) t f t

More information

LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION

LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION The Annals of Statistics 1997, Vol. 25, No. 6, 2313 2327 LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION By Eva Riccomagno, 1 Rainer Schwabe 2 an Henry P. Wynn 1 University of Warwick, Technische

More information

Switching Time Optimization in Discretized Hybrid Dynamical Systems

Switching Time Optimization in Discretized Hybrid Dynamical Systems Switching Time Optimization in Discretize Hybri Dynamical Systems Kathrin Flaßkamp, To Murphey, an Sina Ober-Blöbaum Abstract Switching time optimization (STO) arises in systems that have a finite set

More information

Convergence rates of moment-sum-of-squares hierarchies for optimal control problems

Convergence rates of moment-sum-of-squares hierarchies for optimal control problems Convergence rates of moment-sum-of-squares hierarchies for optimal control problems Milan Kora 1, Diier Henrion 2,3,4, Colin N. Jones 1 Draft of September 8, 2016 Abstract We stuy the convergence rate

More information

QF101: Quantitative Finance September 5, Week 3: Derivatives. Facilitator: Christopher Ting AY 2017/2018. f ( x + ) f(x) f(x) = lim

QF101: Quantitative Finance September 5, Week 3: Derivatives. Facilitator: Christopher Ting AY 2017/2018. f ( x + ) f(x) f(x) = lim QF101: Quantitative Finance September 5, 2017 Week 3: Derivatives Facilitator: Christopher Ting AY 2017/2018 I recoil with ismay an horror at this lamentable plague of functions which o not have erivatives.

More information

REVERSIBILITY FOR DIFFUSIONS VIA QUASI-INVARIANCE. 1. Introduction We look at the problem of reversibility for operators of the form

REVERSIBILITY FOR DIFFUSIONS VIA QUASI-INVARIANCE. 1. Introduction We look at the problem of reversibility for operators of the form REVERSIBILITY FOR DIFFUSIONS VIA QUASI-INVARIANCE OMAR RIVASPLATA, JAN RYCHTÁŘ, AND BYRON SCHMULAND Abstract. Why is the rift coefficient b associate with a reversible iffusion on R given by a graient?

More information

Introduction to variational calculus: Lecture notes 1

Introduction to variational calculus: Lecture notes 1 October 10, 2006 Introuction to variational calculus: Lecture notes 1 Ewin Langmann Mathematical Physics, KTH Physics, AlbaNova, SE-106 91 Stockholm, Sween Abstract I give an informal summary of variational

More information

TMA 4195 Matematisk modellering Exam Tuesday December 16, :00 13:00 Problems and solution with additional comments

TMA 4195 Matematisk modellering Exam Tuesday December 16, :00 13:00 Problems and solution with additional comments Problem F U L W D g m 3 2 s 2 0 0 0 0 2 kg 0 0 0 0 0 0 Table : Dimension matrix TMA 495 Matematisk moellering Exam Tuesay December 6, 2008 09:00 3:00 Problems an solution with aitional comments The necessary

More information

THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE

THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE Journal of Soun an Vibration (1996) 191(3), 397 414 THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE E. M. WEINSTEIN Galaxy Scientific Corporation, 2500 English Creek

More information

Lagrangian and Hamiltonian Mechanics

Lagrangian and Hamiltonian Mechanics Lagrangian an Hamiltonian Mechanics.G. Simpson, Ph.. epartment of Physical Sciences an Engineering Prince George s Community College ecember 5, 007 Introuction In this course we have been stuying classical

More information

1 dx. where is a large constant, i.e., 1, (7.6) and Px is of the order of unity. Indeed, if px is given by (7.5), the inequality (7.

1 dx. where is a large constant, i.e., 1, (7.6) and Px is of the order of unity. Indeed, if px is given by (7.5), the inequality (7. Lectures Nine an Ten The WKB Approximation The WKB metho is a powerful tool to obtain solutions for many physical problems It is generally applicable to problems of wave propagation in which the frequency

More information