On the Containment Condition for Adaptive Markov Chain Monte Carlo Algorithms


Yan Bai, Gareth O. Roberts and Jeffrey S. Rosenthal

July 2008; revised May 2009, Dec

[Note: this version is now OUT OF DATE and was replaced with a new version in July 2010, available from probability.ca/jeff/research.html.]

Department of Statistics, University of Toronto, Toronto, ON M5S 3G3, CA. yanbai@utstat.toronto.edu
Department of Statistics, University of Warwick, Coventry CV4 7AL, UK. gareth.o.roberts@warwick.ac.uk
Department of Statistics, University of Toronto, Toronto, ON M5S 3G3, CA. jeff@math.toronto.edu Supported in part by NSERC of Canada.

Abstract

This paper considers ergodicity properties of certain adaptive Markov chain Monte Carlo (MCMC) algorithms for multidimensional target distributions. It was previously shown in [8] that Diminishing Adaptation and Containment imply ergodicity of adaptive MCMC. We derive various sufficient conditions to ensure Containment, and connect the convergence rates of algorithms with the tail properties of the corresponding target distributions. Two examples are given to show that Diminishing Adaptation alone does not imply ergodicity. We also present a Summable Adaptive Condition which, when satisfied, proves ergodicity more easily.

1 Introduction

Markov chain Monte Carlo algorithms are widely used for approximately sampling from complicated probability distributions. However, it is often necessary to tune the scaling and other parameters before the algorithm will converge efficiently. Adaptive MCMC algorithms modify their transitions on the fly, in an effort to automatically tune the parameters and improve convergence.

Consider a target distribution $\pi(\cdot)$ defined on the state space $\mathcal{X}$ with respect to some $\sigma$-field $\mathcal{B}(\mathcal{X})$ ($\pi(x)$ is also used as the density function). Let $\{P_\gamma : \gamma \in \mathcal{Y}\}$ be the family of transition kernels of time homogeneous Markov chains with the same stationary distribution $\pi$, i.e. $\pi P_\gamma = \pi$ for all $\gamma \in \mathcal{Y}$. An adaptive MCMC algorithm $Z := \{(X_n, \Gamma_n) : n \ge 0\}$ can be regarded as lying in the sample path space $\Omega := (\mathcal{X} \times \mathcal{Y})^\infty$ equipped with a $\sigma$-field $\mathcal{F}$. For each initial state $x \in \mathcal{X}$ and initial parameter $\gamma \in \mathcal{Y}$, there is a probability measure $P_{(x,\gamma)}$ such that the probability of the event $[Z \in A]$ is well-defined for any set $A \in \mathcal{F}$. There is a filtration $\mathcal{G} := \{\mathcal{G}_n : n \ge 0\}$ such that $Z$ is adapted to $\mathcal{G}$.

Some adaptive MCMC methods use regeneration times and other somewhat complicated constructions [see 9, 7]. However, Haario et al. [see 0] proposed an adaptive Metropolis algorithm attempting to optimise the proposal distribution, and proved that a particular version of this algorithm correctly converges strongly to the target distribution. The algorithm can be viewed as a version of the Robbins-Monro stochastic control algorithm [see 2, 5]. The results were then generalized, proving convergence of more general adaptive MCMC algorithms [see 4,, 24, 3, 5].

A framework of adaptive MCMC is defined as:

1. Given an initial state $X_0 := x_0 \in \mathcal{X}$ and a kernel $P_{\Gamma_0}$ with $\Gamma_0 := \gamma_0 \in \mathcal{Y}$. At each iteration $n+1$, $X_{n+1}$ is generated from $P_{\Gamma_n}(X_n, \cdot)$;
2. $\Gamma_{n+1}$ is obtained from some function of $X_0, \dots, X_{n+1}$ and $\Gamma_0, \dots, \Gamma_n$.

For $A \in \mathcal{B}(\mathcal{X})$,

$P_{(x_0,\gamma_0)}(X_{n+1} \in A \mid \mathcal{G}_n) = P_{(x_0,\gamma_0)}(X_{n+1} \in A \mid X_n, \Gamma_n) = P_{\Gamma_n}(X_n, A)$. (1)

In the paper, we study adaptive MCMC with the property Eq. (1). We say that the adaptive MCMC $Z$ is ergodic if for any initial state $x_0 \in \mathcal{X}$ and any kernel index $\gamma_0 \in \mathcal{Y}$, $\|P_{(x_0,\gamma_0)}(X_n \in \cdot) - \pi(\cdot)\|_{TV}$ converges to zero as $n \to \infty$, where $\|\mu\|_{TV} = \sup_{A \in \mathcal{B}(\mathcal{X})} |\mu(A)|$.

Containment is defined as follows: for any $X_0 = x_0$ and $\Gamma_0 = \gamma_0$, and for any $\epsilon > 0$, the stochastic process $\{M_\epsilon(X_n, \Gamma_n) : n \ge 0\}$ is bounded in probability $P_{(x_0,\gamma_0)}$, i.e. for all $\delta > 0$, there is $N \in \mathbb{N}$ such that $P_{(x_0,\gamma_0)}(M_\epsilon(X_n, \Gamma_n) \le N) \ge 1 - \delta$ for all $n \in \mathbb{N}$, where $M_\epsilon(x, \gamma) = \inf\{n \ge 1 : \|P_\gamma^n(x, \cdot) - \pi(\cdot)\|_{TV} \le \epsilon\}$ is the $\epsilon$-convergence time.

Diminishing Adaptation is defined as follows: for any $X_0 = x_0$ and $\Gamma_0 = \gamma_0$, $\lim_{n \to \infty} D_n = 0$ in probability $P_{(x_0,\gamma_0)}$, where $D_n = \sup_{x \in \mathcal{X}} \|P_{\Gamma_{n+1}}(x, \cdot) - P_{\Gamma_n}(x, \cdot)\|_{TV}$ represents the amount of adaptation performed between iterations $n$ and $n+1$.

Theorem 1 [8]. Ergodicity of an adaptive MCMC algorithm is implied by Containment and Diminishing Adaptation.

When designing adaptive algorithms, it is not difficult to ensure that Diminishing Adaptation holds. However, Containment may be more challenging, which raises two questions. First, is Containment really necessary? Second, how can Containment be verified in specific examples? In this paper, we will answer the two questions.

In Section 2, two examples are given that show that:

1. Ergodicity can hold even though neither Containment nor Diminishing Adaptation holds;
2. Diminishing Adaptation alone is not sufficient for ergodicity of adaptive MCMC.
Note that the fact that Containment alone cannot guarantee ergodicity was already discussed in [8, see the One-Two version running example]. We will also study simultaneous geometric ergodicity. A summable adaptive condition is given which can be used to check ergodicity more easily. Some simple conditions for adaptive Metropolis algorithms implying ergodicity are given. In Section 3, the results are applied to two examples. The proofs of Section 2 are given in Section 4.

2 Main Results

2.1 Toy Examples

In this section, two examples are given to show that neither Diminishing Adaptation nor Containment is necessary for ergodicity of adaptive MCMC, and that Diminishing Adaptation alone cannot guarantee ergodicity. The state space $\mathcal{X}$ in Example 1 is finite. The kernel index space $\mathcal{Y}$ in Example 2 is finite.

Example 1. Let the state space $\mathcal{X} = \{1, 2\}$ and the transition kernel

$P_\theta = \begin{bmatrix} 1-\theta & \theta \\ \theta & 1-\theta \end{bmatrix}$.

Obviously, for each $\theta \in (0, 1)$, the stationary distribution is uniform on $\mathcal{X}$.

Proposition 1. For the target distribution and the family of transition kernels in Example 1, consider a state-independent adaptation: at each time $n \ge 1$ choose the transition kernel index $\theta_n = \frac{1}{(n+1)^r}$ for some fixed $r > 0$ ($P_{\theta_0}$ is the initial kernel). Then:
(i) For $r > 0$, Diminishing Adaptation holds but Containment does not;
(ii) For $r > 1$, $\mu_0 P_{\theta_0} P_{\theta_1} \cdots P_{\theta_n} \to \mu$ where $\mu_0 = (1, 0)$ and $\mu = \left(\frac{1+\alpha}{2}, \frac{1-\alpha}{2}\right)$ for some $\alpha \in (0, 1)$;
(iii) For $0 < r \le 1$ and any probability measure $\mu_0$ on $\mathcal{X}$, $\mu_0 P_{\theta_0} P_{\theta_1} \cdots P_{\theta_n} \to \mathrm{Unif}(\mathcal{X})$.

See the proof in Section 4.1.1.

Remark 1. The chain in Proposition 1 is a time inhomogeneous Markov chain. It fits into the framework of adaptive MCMC. Although very simple, it reflects the complexity of adaptive MCMC to some degree.
1. For $r > 1$, the limiting distribution of the chain is not uniform. So it shows that Diminishing Adaptation alone cannot ensure ergodicity.
2. For $0 < r \le 1$, the algorithm is ergodic to the uniform distribution, but Containment does not hold. The reason is that although the $\epsilon$-convergence time goes to infinity (see Eq. (23)), the distance between the chain and the target is decreasing. See another discussion in [5, Section 4].

Proposition 2. For the target distribution and the family of transition kernels in Example 1, consider a state-independent adaptation: for $k = 1, 2, \dots$, at each time $n = 2k-1$ choose the transition kernel index $\theta_n = 1/2$, and at each time $n = 2k$ choose the transition kernel index $\theta_n = 1/n$. Neither Diminishing Adaptation nor Containment holds. The chain nevertheless converges to the target distribution $\mathrm{Unif}(\mathcal{X})$.

See the proof in Section 4.1.1.

Example 2. Let the state space $\mathcal{X} = (0, \infty)$, and the kernel index set $\mathcal{Y} = \{-1, 1\}$. The target density $\pi(x) \propto \frac{I(x > 0)}{1 + x^2}$ is a half-Cauchy distribution on the positive part of $\mathbb{R}$. At each time $n$, run the Metropolis-Hastings algorithm where the proposal value $Y_n$ is generated by

$Y_n^{\Gamma_{n-1}} = X_{n-1}^{\Gamma_{n-1}} + Z_n$ (2)

with i.i.d. standard normal $\{Z_n\}$, i.e. if $\Gamma_{n-1} = 1$ then $Y_n = X_{n-1} + Z_n$, while if $\Gamma_{n-1} = -1$ then $Y_n = 1/(1/X_{n-1} + Z_n)$. The adaptation is defined as

$\Gamma_n = -\Gamma_{n-1} I(X_n^{\Gamma_{n-1}} < 1/n) + \Gamma_{n-1} I(X_n^{\Gamma_{n-1}} \ge 1/n)$, (3)

i.e. we change $\Gamma$ from $1$ to $-1$ when $X_n < 1/n$, and change $\Gamma$ from $-1$ to $1$ when $X_n > n$; otherwise we do not change $\Gamma$.

Proposition 3. The adaptive chain $\{X_n : n \ge 0\}$ defined in Example 2 does not converge weakly to $\pi(\cdot)$. Containment does not hold.

See the proof in Section 4.1.2.
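The dichotomy in Proposition 1 can be checked numerically. The following is a minimal sketch, not part of the original argument: it starts from $\mu_0 = (1, 0)$, applies $P_{\theta_1}, \dots, P_{\theta_n}$ with $\theta_k = 1/(k+1)^r$, and returns the gap $\mu_n^{(1)} - \mu_n^{(2)}$. Skipping the initial kernel $P_{\theta_0}$ and the function name `two_state_gap` are assumptions of this illustration.

```python
def two_state_gap(r, n_steps, mu0=(1.0, 0.0)):
    """Return mu_n^(1) - mu_n^(2) after applying P_{theta_1}, ..., P_{theta_n}.

    theta_k = 1 / (k + 1) ** r as in Proposition 1. The initial kernel
    P_{theta_0} is skipped (an assumption of this sketch).
    """
    m1, m2 = mu0
    for k in range(1, n_steps + 1):
        theta = 1.0 / (k + 1) ** r
        # One step mu <- mu P_theta, with P_theta = [[1-theta, theta], [theta, 1-theta]].
        m1, m2 = m1 * (1 - theta) + m2 * theta, m1 * theta + m2 * (1 - theta)
    return m1 - m2


gap_r_large = two_state_gap(r=2.0, n_steps=5000)  # r > 1: the gap stays away from zero
gap_r_small = two_state_gap(r=0.9, n_steps=5000)  # r <= 1: the gap vanishes
print(gap_r_large, gap_r_small)
```

A gap that stays bounded away from zero means the chain does not converge to the uniform distribution, even though $|\theta_{n+1} - \theta_n| \to 0$ in both runs.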

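2.2 Simultaneous Drift Condition and Summable Adaptive Condition

[8] showed that the simultaneously strongly aperiodically geometrically ergodic condition (SSAGE) implies Containment. If there is $C \in \mathcal{B}(\mathcal{X})$, a function $V : \mathcal{X} \to [1, \infty)$, $\delta > 0$, $\lambda < 1$, and $b < \infty$, such that $\sup_{x \in C} V(x) < \infty$, and
(i) for each $\gamma$, there exists a probability measure $\nu_\gamma(\cdot)$ on $C$ with $P_\gamma(x, \cdot) \ge \delta \nu_\gamma(\cdot)$ for all $x \in C$, and
(ii) $P_\gamma V \le \lambda V + b I_C$,
we say that the family $\{P_\gamma : \gamma \in \mathcal{Y}\}$ is SSAGE.

The idea of utilizing SSAGE to check Containment is that SSAGE guarantees there is a uniform quantitative bound of $\|P_\gamma^n(x, \cdot) - \pi(\cdot)\|_{TV}$ for all $\gamma \in \mathcal{Y}$. However, SSAGE can be generalized a little. First let us review [23, Theorem 5].

Proposition 4. Consider a Markov chain $P(x, dy)$ on the state space $\mathcal{X}$. Let $\{X_n : n \ge 0\}$ and $\{Y_n : n \ge 0\}$ be two realizations of $P(x, dy)$. Suppose there are a set $C \subset \mathcal{X}$, $\delta > 0$, some integer $m > 0$, and a probability measure $\nu_m$ on $\mathcal{X}$ such that $P^m(x, \cdot) \ge \delta \nu_m(\cdot)$ for $x \in C$. Suppose further that there exist $0 < \lambda < 1$, $b > 0$, and a function $h : \mathcal{X} \times \mathcal{X} \to [1, \infty)$ such that

$E[h(X_1, Y_1) \mid X_0 = x, Y_0 = y] \le \lambda h(x, y) + b I_{C \times C}((x, y))$.

Let $A := \sup_{(x,y) \in C \times C} E[h(X_m, Y_m) \mid X_0 = x, Y_0 = y]$, let $\mu := \mathcal{L}(X_0)$ be the initial distribution, and let $\pi$ be the stationary distribution. Then for any integer $j > 0$,

$\|\mathcal{L}(X_n) - \pi\|_{TV} \le (1 - \delta)^{[j/m]} + \lambda^{n - jm + 1} A^{j-1} E_{\mu \times \pi}[h(X_0, Y_0)]$.

To make use of Proposition 4, we consider the simultaneously geometrically ergodic condition (SGE) studied by [22]. If there is $C \in \mathcal{B}(\mathcal{X})$, some integer $m \ge 1$, a function $V : \mathcal{X} \to [1, \infty)$, $\delta > 0$, $\lambda < 1$, and $b < \infty$, such that $\sup_{x \in C} V(x) < \infty$, $\pi(V) < \infty$, and
(i) $C$ is a uniform $\nu_m$-small set, i.e., for each $\gamma$, there exists a probability measure $\nu_\gamma(\cdot)$ on $C$ with $P_\gamma^m(x, \cdot) \ge \delta \nu_\gamma(\cdot)$ for all $x \in C$, and
(ii) $P_\gamma V \le \lambda V + b I_C$,
we say that the family $\{P_\gamma : \gamma \in \mathcal{Y}\}$ is SGE.

Note that the difference between SGE and SSAGE is that a uniform minorization set $C$ for all $P_\gamma$ is assumed in SSAGE, whereas a uniform small set $C$ is assumed in SGE [see the definitions of minorization set and small set in 4, Chapter 5].

Theorem 2. SGE implies Containment.

See the proof in Section 4.2.

Corollary 1. Consider the family $\{P_\gamma : \gamma \in \mathcal{Y}\}$ of Markov chains on $\mathcal{X} \subset \mathbb{R}^d$. Suppose that for any compact set $C \in \mathcal{B}(\mathcal{X})$, there exist some integer $m > 0$, $\delta > 0$ and a measure $\nu_\gamma(\cdot)$ on $C$ for $\gamma \in \mathcal{Y}$ such that $P_\gamma^m(x, \cdot) \ge \delta \nu_\gamma(\cdot)$ for all $x \in C$. Suppose that there is a function $V : \mathcal{X} \to (1, \infty)$ such that for any compact set $C \in \mathcal{B}(\mathcal{X})$, $\sup_{x \in C} V(x) < \infty$, $\pi(V) < \infty$, and

$\limsup_{|x| \to \infty} \sup_{\gamma \in \mathcal{Y}} \frac{P_\gamma V(x)}{V(x)} < 1$. (4)

Then for any adaptive strategy using only $\{P_\gamma : \gamma \in \mathcal{Y}\}$, Containment holds.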
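To get a feeling for the quantitative bound in Proposition 4, it can be evaluated numerically. The sketch below is illustrative only: it picks $j = \lfloor \sqrt{n} \rfloor$ (one admissible choice of $j$, assumed here), treats $E_{\mu \times \pi}[h(X_0, Y_0)]$ as a given number `h0`, and uses made-up constants. The function name `prop4_bound` is hypothetical.

```python
import math


def prop4_bound(n, delta, m, lam, A, h0):
    """Evaluate (1-delta)^[j/m] + lam^(n-j*m+1) * A^(j-1) * h0 with j = floor(sqrt(n))."""
    j = max(1, math.isqrt(n))
    first = (1.0 - delta) ** (j // m)
    # Work on the log scale so that very large or very small powers do not overflow.
    log_second = (n - j * m + 1) * math.log(lam) + (j - 1) * math.log(A) + math.log(h0)
    second = math.exp(log_second) if log_second < 700 else math.inf
    return first + second


# Made-up constants, for illustration only.
bounds = [prop4_bound(n, delta=0.1, m=1, lam=0.9, A=5.0, h0=10.0) for n in (100, 2500, 10000)]
print(bounds)
```

For small $n$ the bound exceeds 1 and says nothing, but it decays to zero as $n$ grows, and it depends on the kernel only through $\delta$, $m$, $\lambda$ and $A$. This is why a drift and minorization condition that is uniform in $\gamma$ gives a bound that is uniform in $\gamma$.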

See the proof in Section 4.2.

Convergence with sub-geometric rates is studied using a sequence of drift conditions in [25]. It was shown by [2] that if there exist a test function $V$, positive constants $c$ and $b$, a petite set $C$ and $0 \le \alpha < 1$ such that

$PV \le V - cV^\alpha + b I_C$, (5)

then the Markov chain converges to the stationary distribution at a polynomial rate. [5] showed that adaptive MCMC in which all Markov transition kernels satisfy a simultaneous polynomial drift condition is ergodic under some conditions. The following proposition is a part of the result.

Proposition 5. Consider an adaptive MCMC algorithm on a state space $\mathcal{X}$. Suppose that there is a set $C \subset \mathcal{X}$ with $\pi(C) > 0$, some constant $\delta > 0$, some integer $m > 0$, and some probability measure $\nu_\gamma(\cdot)$ on $\mathcal{X}$ such that $P_\gamma^m(x, \cdot) \ge \delta I_C(x) \nu_\gamma(\cdot)$ for $\gamma \in \mathcal{Y}$. Suppose that there are some constants $\alpha \in (0, 1)$, $\beta \in (0, 1]$, $b > b > 0$, $c > 0$, and some measurable function $V(x) : \mathcal{X} \to [1, \infty)$ with $cV^\alpha(x) > b$ on $C^c$ and $\sup_{x \in C} V(x) < \infty$ such that

$P_\gamma V \le V - cV^\alpha + b I_C, \quad \forall \gamma \in \mathcal{Y}$. (6)

Then for any adaptive strategy using $\{P_\gamma : \gamma \in \mathcal{Y}\}$, Containment holds.

The idea of the proof is to find a uniform upper bound of $\|P_\gamma^n(x, \cdot) - \pi(\cdot)\|_{TV}$. The bound depends only on $V(x)$, $\delta$, $m$, $\pi(V^\beta)$, and $C$. Since all the transition kernels satisfy the simultaneous polynomial drift condition Eq. (6), it can be shown that $\{V(X_n) : n \ge 0\}$ is bounded in probability. So, Containment holds. [3] study Markovian Adaptation (the joint process $\{(X_n, \Gamma_n) : n \ge 0\}$ is a Markov chain) and give a result similar to the above proposition. But Proposition 5 can be applied to more general adaptive MCMC satisfying Eq. (1) [see details in 5].

In the following result, we use a simple coupling method to show that a summable adaptive condition implies ergodicity of adaptive MCMC.

Proposition 6. Consider an adaptive MCMC $\{X_n : n \ge 0\}$ on the state space $\mathcal{X}$ with the kernel index space $\mathcal{Y}$. Under the following conditions:
(i) $\mathcal{Y}$ is finite. For any $\gamma \in \mathcal{Y}$, $P_\gamma$ is ergodic with the stationary distribution $\pi$;
(ii) At each time $n$, $\Gamma_n$ is a deterministic measurable function of $X_0, \dots, X_n, \Gamma_0, \dots, \Gamma_{n-1}$;
(iii) For any initial state $x_0 \in \mathcal{X}$ and any initial kernel index $\gamma_0 \in \mathcal{Y}$,

$\sum_{n=1}^{\infty} P(\Gamma_n \ne \Gamma_{n-1} \mid X_0 = x_0, \Gamma_0 = \gamma_0) < \infty$, (7)

the adaptive MCMC $\{X_n : n \ge 0\}$ is ergodic with the stationary distribution $\pi$.

See the proof in Section 4.2.

Remark 2. In Example 2, the transition kernel is changed when $X_n^{\Gamma_{n-1}}$ falls below the bound $1/n$. It can be shown that if the boundary is defined as $1/n^r$ with $r > 1$, the adaptive algorithm is ergodic with the half-Cauchy distribution because of Proposition 6. To show it, we only need to adopt the procedure in Lemma 2 to check Eq. (7).
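Remark 2 can be made concrete with a short calculation. The sketch below assumes that the one-step switching bound of Eq. (26), derived in Lemma 2 for the boundary $1/n$, carries over with $1/n$ replaced by $1/n^r$. That substitution is an assumption of this illustration and is not spelled out in the text.

```latex
% Assumed analogue of Eq. (26) for the boundary 1/n^r:
P(\Gamma_n \neq \Gamma_{n-1}) \;\le\; \frac{\varphi(0)}{n^{r}} \;=\; \frac{1}{\sqrt{2\pi}\, n^{r}},
\qquad\text{so}\qquad
\sum_{n=1}^{\infty} P(\Gamma_n \neq \Gamma_{n-1})
\;\le\; \frac{1}{\sqrt{2\pi}} \sum_{n=1}^{\infty} \frac{1}{n^{r}} \;<\; \infty
\quad\text{if and only if the series on the right converges, i.e. } r > 1 .
```

For $r = 1$ the upper bound is a multiple of the harmonic series and diverges, so Eq. (7) cannot be verified this way, which is consistent with Proposition 3.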

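2.3 Adaptive Metropolis algorithm

The target density $\pi(\cdot)$ is defined on the state space $\mathcal{X} \subset \mathbb{R}^d$. In what follows, we shall write $\langle \cdot, \cdot \rangle$ for the usual scalar product on $\mathbb{R}^d$, $|\cdot|$ for the Euclidean and the operator norm, $n(z) := z/|z|$ for the normed vector of $z$, $\nabla$ for the usual differential (gradient) operator, $m(x) := \nabla\pi(x)/|\nabla\pi(x)|$, $B^d(x, r) = \{y \in \mathbb{R}^d : |y - x| < r\}$ for the hyperball on $\mathbb{R}^d$ with the center $x$ and the radius $r$, $\bar{B}^d(x, r)$ for the closure of the hyperball, and $\mathrm{Vol}(A)$ for the volume of the set $A \subset \mathbb{R}^d$.

We say that an adaptive MCMC is an Adaptive Metropolis-Hastings algorithm if each kernel $P_\gamma$ comes from a Metropolis-Hastings algorithm

$P_\gamma(x, dy) = \alpha_\gamma(x, y) Q_\gamma(x, dy) + \left[1 - \int_{\mathcal{X}} \alpha_\gamma(x, z) Q_\gamma(x, dz)\right] \delta_x(dy)$ (8)

where $Q_\gamma(x, dy)$ is the proposal distribution, $\alpha_\gamma(x, y) := \min\left\{1, \frac{\pi(y) q_\gamma(y, x)}{\pi(x) q_\gamma(x, y)}\right\} I(y \in \mathcal{X})$, and $\mu_d$ is Lebesgue measure. We say that an adaptive Metropolis-Hastings algorithm is an Adaptive Metropolis algorithm if each $q_\gamma(x, y)$ is symmetric, i.e. $q_\gamma(x, y) = q_\gamma(x - y) = q_\gamma(y - x)$.

[] give conditions which imply geometric ergodicity of the symmetric random-walk-based Metropolis algorithm on $\mathbb{R}^d$ for target distributions with lighter-than-exponential tails [see other related results in 3, 20]. Here, we extend their result a little to target distributions with exponential tails.

Definition 1 (Lighter-than-exponential tail). The density $\pi(\cdot)$ on $\mathbb{R}^d$ is lighter-than-exponentially tailed if it is positive and has continuous first derivatives such that

$\limsup_{|x| \to \infty} \langle n(x), \nabla \log \pi(x) \rangle = -\infty$. (9)

Remark 3.
1. The definition implies that for any $r > 0$, there exists $R > 0$ such that $\frac{\pi(x + \alpha n(x))}{\pi(x)} \le e^{-\alpha r}$ for $|x| \ge R$, $\alpha > 0$. It means that $\pi(x)$ is exponentially decaying along any ray, but with the rate $r$ tending to infinity as $|x|$ goes to infinity.
2. The normed gradient $m(x)$ will point towards the origin, while the direction $n(x)$ points away from the origin. For Definition 1, $\langle n(x), \nabla \log \pi(x) \rangle = \frac{|\nabla\pi(x)|}{\pi(x)} \langle n(x), m(x) \rangle$. Even if $\limsup_{|x| \to \infty} \langle n(x), m(x) \rangle < 0$, Eq. (9) might not be true. E.g. $\pi(x) \propto \frac{1}{1+x^2}$, $x \in \mathbb{R}$. Here $m(x) = -n(x)$ so that $\langle n(x), m(x) \rangle = -1$, while $\langle n(x), \nabla \log \pi(x) \rangle = -\frac{2|x|}{1+x^2}$, so $\lim_{|x| \to \infty} \langle n(x), \nabla \log \pi(x) \rangle = 0$.

Definition 2 (Exponential tail). The density function $\pi(\cdot)$ on $\mathbb{R}^d$ is exponentially tailed if it is a positive, continuously differentiable function on $\mathbb{R}^d$, and

$\eta_2 := -\limsup_{|x| \to \infty} \langle n(x), \nabla \log \pi(x) \rangle > 0$. (10)

Remark 4. There exists $\beta > 0$ such that for $|x|$ sufficiently large, $\langle n(x), \nabla \log \pi(x) \rangle = \langle n(x), m(x) \rangle |\nabla \log \pi(x)| \le -\beta$. Further, if $0 < -\langle n(x), m(x) \rangle \le 1$, then $|\nabla \log \pi(x)| \ge \beta$.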
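The three tail regimes can be told apart numerically through the quantity $\langle n(x), \nabla \log \pi(x) \rangle$ that appears in Eq. (9) and Eq. (10). The following one-dimensional sketch uses a central finite difference. The three example densities (Cauchy-type, $e^{-2|x|}$ and standard Gaussian) and the helper names are choices made for this illustration.

```python
import math


def radial_log_derivative(log_density, x, h=1e-4):
    """Approximate <n(x), grad log pi(x)> in one dimension by a central difference."""
    direction = 1.0 if x > 0 else -1.0  # n(x) = x / |x|
    grad = (log_density(x + h) - log_density(x - h)) / (2.0 * h)
    return direction * grad


def log_cauchy(x):
    return -math.log(1.0 + x * x)  # pi(x) proportional to 1 / (1 + x^2)


def log_exponential(x):
    return -2.0 * abs(x)  # pi(x) proportional to exp(-2|x|), rate lambda = 2


def log_gauss(x):
    return -0.5 * x * x  # pi(x) proportional to exp(-x^2 / 2)


for x in (10.0, 100.0, 1000.0):
    print(x,
          radial_log_derivative(log_cauchy, x),
          radial_log_derivative(log_exponential, x),
          radial_log_derivative(log_gauss, x))
```

As $|x|$ grows, the first column tends to $0$ (heavier than exponential), the second stays at $-\lambda = -2$ (exponentially tailed with $\eta_2 = 2$), and the third tends to $-\infty$ (lighter than exponential).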

Define the symmetric proposal density family $\mathcal{C} := \{q : q(x, y) = q(x - y) = q(y - x)\}$. Our ergodicity result for adaptive Metropolis algorithms is based on the following assumptions.

Assumption 1 (Target Regularity). The target distribution is absolutely continuous w.r.t. Lebesgue measure $\mu_d$ with a density $\pi$ bounded away from zero and infinity on compact sets, and $\sup_{x \in \mathcal{X}} \pi(x) < \infty$.

Assumption 2 (Target Strongly Decreasing). The target density $\pi$ has continuous first derivatives and satisfies

$\eta_1 := -\limsup_{|x| \to \infty} \langle n(x), m(x) \rangle > 0$. (11)

Assumption 3 (Proposal Uniform Local Positivity). Assume that $\{q_\gamma : \gamma \in \mathcal{Y}\} \subset \mathcal{C}$. There exists $\zeta > 0$ such that

$\iota := \inf_{\gamma \in \mathcal{Y}} \inf_{|z| \le \zeta} q_\gamma(z) > 0$. (12)

Given $0 < p < q \le \infty$, for $u \in S^{d-1}$ ($S^{d-1}$ is the unit hypersphere in $\mathbb{R}^d$) and $\theta > 0$, define

$C_{p,q}(u, \theta) := \{z = a\xi \mid p \le a \le q, \ \xi \in S^{d-1}, \ |\xi - u| < \theta/3\}$. (13)

Assumption 4 (Proposal Moment Condition). Suppose the target density $\pi$ is exponentially tailed and $\{q_\gamma : \gamma \in \mathcal{Y}\} \subset \mathcal{C}$. Under Assumption 2, assume that there are $\epsilon \in (0, \eta_1)$, $\beta \in (0, \eta_2)$, $\delta$, and $\Delta$ with $0 < \frac{3}{\beta\epsilon} \le \delta < \Delta \le \infty$ such that

$\inf_{(u,\gamma) \in S^{d-1} \times \mathcal{Y}} \int_{C_{\delta,\Delta}(u,\epsilon)} |z| q_\gamma(z) \mu_d(dz) > \frac{3(e+1)}{\beta\epsilon(e-1)}$. (14)

Remark 5. Under Assumption 3, let $\tilde{P}(x, dy)$ be the transition kernel of the Metropolis-Hastings algorithm with the proposal distribution $\tilde{Q}(x, \cdot) \sim \mathrm{Unif}(\bar{B}^d(x, \zeta/2))$. For any $\gamma \in \mathcal{Y}$, $P_\gamma(x, dy) \ge \iota \mathrm{Vol}(\bar{B}^d(0, \zeta/2)) \tilde{P}(x, dy)$. Under Assumption 1, by [20, Theorem 2.2], any compact set is a small set for $\tilde{P}$, so that any compact set is a uniform small set for all $P_\gamma$.

Remark 6.
1. Assumption 4 means that the proposal family has a uniform lower bound on the first moment over some local cone around the origin. The condition specifies that the tails of all proposal distributions cannot be too light, and the size of the lower bound is explicit and depends on the tail-decaying rate $\eta_2$ and the strongly decreasing rate $\eta_1$ of the target distribution. Assumptions 1-4 are used to check SGE, which is sufficient for Containment.
2. If the proposal distribution in $\{q_\gamma : \gamma \in \mathcal{Y}\} \subset \mathcal{C}$ is a mixture distribution with one fixed part, then Assumption 4 is relatively easy to check, because the integral in Eq. (14) can be bounded below using the fixed part of the mixture. In particular, for lighter-than-exponentially tailed targets, Assumption 4 can be weakened in this case. We will give a sufficient condition for Assumption 4 which can be applied to more general cases, see Lemma 1.

Now, we consider a particular class of target densities with tails which are heavier than exponential tails. It was previously shown by [8] that the Metropolis algorithm converges at any polynomial rate when the proposal distribution is compactly supported and the log density decreases hyperbolically at infinity, $\log \pi(x) \sim -|x|^s$, for $0 < s < 1$, as $|x| \to \infty$.

Definition 3 (Hyperbolic tail). The density function $\pi(\cdot)$ is hyperbolically tailed if it is twice continuously differentiable, and there exist $0 < m < 1$ and some finite positive constants $d_i$, $D_i$, $i = 0, 1, 2$ such that for large enough $|x|$,

$0 < d_0 |x|^m \le -\log \pi(x) \le D_0 |x|^m$;
$0 < d_1 |x|^{m-1} \le |\nabla \log \pi(x)| \le D_1 |x|^{m-1}$;
$0 < d_2 |x|^{m-2} \le |\nabla^2 \log \pi(x)| \le D_2 |x|^{m-2}$.

Assumption 5 (Proposal's Uniform Compact Support). Under Assumption 3, there exists some $M > \zeta$ such that all $q_\gamma(\cdot)$ with $\gamma \in \mathcal{Y}$ are supported only on $\bar{B}^d(0, M)$.

Theorem 3. An adaptive Metropolis algorithm with Diminishing Adaptation is ergodic under any one of the following conditions:
(i) The target density $\pi$ is lighter-than-exponentially tailed, and Assumptions 1-3 hold;
(ii) The target density $\pi$ is exponentially tailed, and Assumptions 1-4 hold;
(iii) The target density $\pi$ is hyperbolically tailed, and Assumptions 1-3 and 5 hold.

3 Applications

Here we discuss two examples. The first one (Example 3) is from [7], where the proposal density is a fixed mixture of two multivariate normal distributions, one with fixed small variance, the other using the empirical covariance matrix estimated from the history of the chain as its variance. It is a slight variant of the famous adaptive Metropolis algorithm of Haario et al. [0]. In the example, the target density has lighter-than-exponential tails. The second (Example 4) concerns target densities with truly exponential tails.

Example 3. Consider a $d$-dimensional target distribution $\pi$ on $\mathbb{R}^d$ satisfying Assumptions 1-2. We perform a Metropolis algorithm with proposal distribution given at the $n$-th iteration by $Q_n(x, \cdot) = N(x, (0.1)^2 I_d/d)$ for $n \le 2d$; for $n > 2d$,

$Q_n(x, \cdot) = \begin{cases} (1-\theta) N(x, (2.38)^2 \Sigma_n/d) + \theta N(x, (0.1)^2 I_d/d), & \Sigma_n \text{ is positive definite}, \\ N(x, (0.1)^2 I_d/d), & \Sigma_n \text{ is not positive definite}, \end{cases}$ (15)

for some fixed $\theta \in (0, 1)$, where $I_d$ is the $d \times d$ identity matrix, and the empirical covariance matrix

$\Sigma_n = \frac{1}{n} \left( \sum_{i=0}^{n} X_i X_i^\top - (n+1) \bar{X}_n \bar{X}_n^\top \right)$, (16)

where $\bar{X}_n = \frac{1}{n+1} \sum_{i=0}^{n} X_i$, is the current modified empirical estimate of the covariance structure of the target distribution based on the run so far.

Remark 7. The fixed part $N(x, (0.1)^2 I_d/d)$ can be replaced by $\mathrm{Unif}(B^d(x, \tau))$ for some $\tau > 0$. For targets with lighter-than-exponential tails, $\tau$ can be an arbitrary positive value, because Assumption 3 holds. For targets with exponential tails, $\tau$ depends on $\eta_1$ and $\eta_2$.

Remark 8. The proposal $N(x, (2.38)^2 \Sigma/d)$ is optimal in a particular large-dimensional context [see 2, 6]. Thus the proposal $N(x, (2.38)^2 \Sigma_n/d)$ is an effort to approximate this.

Remark 9. Commonly, the iterative form of Eq. (16) is more useful,

$\Sigma_n = \frac{n-1}{n} \Sigma_{n-1} + \frac{1}{n+1} (X_n - \bar{X}_{n-1})(X_n - \bar{X}_{n-1})^\top$. (17)
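The equivalence between the batch formula Eq. (16) and the iterative form Eq. (17) is easy to check numerically. The sketch below uses plain Python lists so that it has no dependencies. The function names and the random test data are assumptions of this illustration.

```python
import random


def batch_cov(xs):
    """Eq. (16): Sigma_n = (1/n) (sum_i x_i x_i^T - (n+1) xbar xbar^T), where n = len(xs) - 1."""
    n = len(xs) - 1
    d = len(xs[0])
    xbar = [sum(x[k] for x in xs) / (n + 1) for k in range(d)]
    s = [[sum(x[a] * x[b] for x in xs) for b in range(d)] for a in range(d)]
    return [[(s[a][b] - (n + 1) * xbar[a] * xbar[b]) / n for b in range(d)] for a in range(d)]


def recursive_cov(xs):
    """Eq. (17) applied for n = 2, 3, ..., starting from Sigma_1 computed with Eq. (16)."""
    d = len(xs[0])
    sigma = batch_cov(xs[:2])
    xbar = [(xs[0][k] + xs[1][k]) / 2.0 for k in range(d)]  # mean of X_0 and X_1
    for n in range(2, len(xs)):
        diff = [xs[n][k] - xbar[k] for k in range(d)]  # X_n - xbar_{n-1}
        sigma = [[(n - 1) / n * sigma[a][b] + diff[a] * diff[b] / (n + 1)
                  for b in range(d)] for a in range(d)]
        xbar = [xbar[k] + diff[k] / (n + 1) for k in range(d)]  # running mean update
    return sigma


rng = random.Random(1)
points = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 2.0)] for _ in range(50)]
max_abs_diff = max(abs(u - v)
                   for row_u, row_v in zip(batch_cov(points), recursive_cov(points))
                   for u, v in zip(row_u, row_v))
print(max_abs_diff)
```

The recursion needs only the previous covariance, the previous mean and the new point, so the cost per iteration does not grow with $n$.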

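Proposition 7. Suppose that the target density $\pi$ is exponentially tailed. Under Assumptions 1-4, $|\bar{X}_n - \bar{X}_{n-1}|$ and $\|\Sigma_n - \Sigma_{n-1}\|_M$ converge to zero in probability, where $\|\cdot\|_M$ is the matrix norm.

Proof: Note that in the proof of Theorem 3, some test function $V(x) = c\pi^{-s}(x)$ for some $s \in (0, 1)$ and some $c > 0$ is found such that SGE holds.

To check Diminishing Adaptation, it is sufficient to check that both $\|\Sigma_n - \Sigma_{n-1}\|_M$ and $|\bar{X}_n - \bar{X}_{n-1}|$ converge to zero in probability. By some algebra (see Eq. (17)),

$\Sigma_n - \Sigma_{n-1} = \frac{1}{n+1} (X_n - \bar{X}_{n-1})(X_n - \bar{X}_{n-1})^\top - \frac{1}{n} \Sigma_{n-1}$.

Hence,

$\|\Sigma_n - \Sigma_{n-1}\|_M \le \frac{1}{n+1} \left( \|X_n X_n^\top\|_M + \|X_n \bar{X}_{n-1}^\top + \bar{X}_{n-1} X_n^\top\|_M + \|\bar{X}_{n-1} \bar{X}_{n-1}^\top\|_M \right) + \frac{1}{n-1} \left\| \frac{1}{n} \sum_{i=0}^{n-1} X_i X_i^\top \right\|_M + \frac{1}{n-1} \|\bar{X}_{n-1} \bar{X}_{n-1}^\top\|_M$.

To prove that $\Sigma_n - \Sigma_{n-1}$ converges to zero in probability, it is sufficient to check that $\|X_n X_n^\top\|_M$, $\left\| \frac{1}{n} \sum_{i=0}^{n-1} X_i X_i^\top \right\|_M$, $\|\bar{X}_n \bar{X}_n^\top\|_M$ and $\|X_n \bar{X}_{n-1}^\top + \bar{X}_{n-1} X_n^\top\|_M$ are bounded in probability.

Since $\limsup_{|x| \to \infty} \langle n(x), \nabla \log \pi(x) \rangle < 0$, there exist some $K > 0$ and some $\beta > 0$ such that

$\sup_{|x| \ge K} \langle n(x), \nabla \log \pi(x) \rangle \le -\beta$.

For $|x| \ge K$, $\frac{\log \pi(y) - \log \pi(x)}{(r-1)|x|} \le -\beta$ where $r > 1$ and $y = rx$, i.e.

$\frac{\pi(y)}{\pi(x)} \le e^{-\beta(r-1)|x|}$. (18)

Taking $x_0 \in \mathbb{R}^d$ with $|x_0| = K$,

$V(x) = c\pi^{-s}(x) \ge c\pi^{-s}(x_0) e^{s\beta(r-1)K} \ge c a e^{s\beta(r-1)K}$,

for $x = r x_0$, $r > 1$, and $a := \inf_{|y| \le K} \pi^{-s}(y) > 0$, because of Assumption 1. If $r \ge 2$ then $\frac{r-1}{r} \ge 0.5$, so that $(r-1)K \ge 0.5|x|$. Therefore, when $|x|$ is extremely large, $V(x) \ge |x|^2$. We know that $\sup_n E[V(X_n)] < \infty$ (see Theorem 8 in [8]).

Since $\|X_n X_n^\top\|_M = \sup_{|u|=1} |X_n X_n^\top u| \le |X_n|^2$, $\|X_n X_n^\top\|_M$ is bounded in probability. Similarly, $\left\| \frac{1}{n} \sum_{i=0}^{n-1} X_i X_i^\top \right\|_M \le \frac{1}{n} \sum_{i=0}^{n-1} |X_i|^2$. Then, for $K > 0$,

$P\left( \left\| \frac{1}{n} \sum_{i=0}^{n-1} X_i X_i^\top \right\|_M > K \right) \le \frac{1}{K} E\left[ \left\| \frac{1}{n} \sum_{i=0}^{n-1} X_i X_i^\top \right\|_M \right] \le \frac{1}{nK} \sum_{i=0}^{n-1} E[|X_i|^2] \le \frac{1}{K} \sup_n E[V(X_n)]$.

Hence, $\left\| \frac{1}{n} \sum_{i=0}^{n-1} X_i X_i^\top \right\|_M$ is bounded in probability. Also,

$P(|\bar{X}_n| > K) \le \frac{1}{K(n+1)} \sum_{i=0}^{n} E[|X_i|] \le \frac{1}{K} \sup_n E[V(X_n)]$.

So $|\bar{X}_n|$ is bounded in probability. Hence, $\|\bar{X}_n \bar{X}_n^\top\|_M$ is bounded in probability. Finally, $\|X_n \bar{X}_{n-1}^\top + \bar{X}_{n-1} X_n^\top\|_M \le 2 |X_n| |\bar{X}_{n-1}|$. Therefore, $\|X_n \bar{X}_{n-1}^\top + \bar{X}_{n-1} X_n^\top\|_M$ is bounded in probability.

Theorem 4. Suppose that the target density $\pi$ in Example 3 is lighter-than-exponentially tailed. Then the algorithm in Example 3 is ergodic.

Proof: Obviously, the proposal densities have a uniform lower bound function. By Theorem 3 and Proposition 7, the adaptive Metropolis algorithm is ergodic.

The following lemma is used to check Assumption 4.

Lemma 1. Suppose that the target density $\pi$ is exponentially tailed and the proposal density family $\{q_\gamma : \gamma \in \mathcal{Y}\} \subset \mathcal{C}$. Suppose further that there is a function $q^-(z) := g(|z|)$, $q^- : \mathbb{R}^d \to \mathbb{R}^+$ and $g : \mathbb{R}^+ \to \mathbb{R}^+$, and some constants $M \ge 0$, $\epsilon \in (0, \eta_1)$, $\beta \in (0, \eta_2)$ and $\frac{3}{\beta\epsilon} < \delta < \Delta$ such that $q_\gamma(z) \ge q^-(z)$ for $|z| \ge M$ and $\gamma \in \mathcal{Y}$, and

$\frac{(d-1)\pi^{\frac{d-1}{2}}}{2\Gamma\left(\frac{d+1}{2}\right)} \mathrm{Be}_{r^2}\left(\frac{d-1}{2}, \frac{1}{2}\right) \int_\delta^\Delta g(t) t^d \, dt > \frac{3(e+1)}{\beta\epsilon(e-1)}$, (19)

where $\eta_1$ is defined in Eq. (11), $\eta_2$ is defined in Eq. (10), $r := \frac{\epsilon}{18}\sqrt{36 - \epsilon^2}$, and the incomplete beta function is $\mathrm{Be}_x(t_1, t_2) := \int_0^x t^{t_1 - 1}(1-t)^{t_2 - 1} \, dt$. Then Assumption 4 holds.

Proof: For $u \in S^{d-1}$,

$\int_{C_{\delta,\Delta}(u,\epsilon)} |z| g(|z|) \mu_d(dz) = \int_\delta^\Delta g(t) t^d \, dt \int_{\{\xi \in S^{d-1} : |\xi - u| < \epsilon/3\}} \omega(d\xi)$,

where $\omega(\cdot)$ denotes the surface measure on $S^{d-1}$. By the symmetry of $u \in S^{d-1}$, let $u = e_d := (0, \dots, 0, 1)$ (with $d-1$ zeros). Then the projection of the piece $\{\xi \in S^{d-1} : |\xi - u| < \epsilon/3\}$ of the hypersphere $S^{d-1}$ onto the subspace $\mathbb{R}^{d-1}$ generated by the first $d-1$ coordinates is the $(d-1)$-dimensional hyperball $B^{d-1}(0, r)$ with the center $0$ and the radius $r = \frac{\epsilon}{18}\sqrt{36 - \epsilon^2}$. Define $f(z) = \sqrt{1 - z_1^2 - \cdots - z_{d-1}^2}$. Hence,

$\omega\left(\{\xi \in S^{d-1} : |\xi - u| < \epsilon/3\}\right) = \int_{B^{d-1}(0,r)} \sqrt{1 + |\nabla f|^2} \, dz_1 \cdots dz_{d-1} = \frac{(d-1)\pi^{\frac{d-1}{2}}}{\Gamma\left(\frac{d+1}{2}\right)} \int_0^r \frac{\rho^{d-2}}{\sqrt{1 - \rho^2}} \, d\rho = \frac{(d-1)\pi^{\frac{d-1}{2}}}{2\Gamma\left(\frac{d+1}{2}\right)} \mathrm{Be}_{r^2}\left(\frac{d-1}{2}, \frac{1}{2}\right)$.

Therefore,

$\int_{C_{\delta,\Delta}(u,\epsilon)} |z| g(|z|) \mu_d(dz) = \frac{(d-1)\pi^{\frac{d-1}{2}}}{2\Gamma\left(\frac{d+1}{2}\right)} \mathrm{Be}_{r^2}\left(\frac{d-1}{2}, \frac{1}{2}\right) \int_\delta^\Delta g(t) t^d \, dt$,

and the result holds.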
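The surface measure of the spherical cap used in the proof of Lemma 1 can be checked numerically. The sketch below implements the cap formula as $\frac{(d-1)\pi^{(d-1)/2}}{2\Gamma((d+1)/2)} \mathrm{Be}_{r^2}\left(\frac{d-1}{2}, \frac{1}{2}\right)$ with $r = \frac{\epsilon}{18}\sqrt{36 - \epsilon^2}$, using a plain midpoint rule for the incomplete beta integral, and compares it for $d = 3$ with the elementary cap area $\pi\epsilon^2/9$. The function names and the quadrature rule are assumptions of this illustration.

```python
import math


def incomplete_beta(x, t1, t2, steps=20000):
    """Midpoint-rule approximation of Be_x(t1, t2) = int_0^x t^(t1-1) (1-t)^(t2-1) dt, 0 <= x < 1."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** (t1 - 1.0) * (1.0 - t) ** (t2 - 1.0)
    return h * total


def cap_measure(d, eps):
    """Surface measure of {xi in S^(d-1) : |xi - u| < eps/3} from the incomplete beta formula."""
    r = eps / 18.0 * math.sqrt(36.0 - eps * eps)
    const = (d - 1) * math.pi ** ((d - 1) / 2.0) / (2.0 * math.gamma((d + 1) / 2.0))
    return const * incomplete_beta(r * r, (d - 1) / 2.0, 0.5)


def moment_threshold(beta, eps):
    """Right-hand side of Eq. (14) and Eq. (19): 3(e+1) / (beta * eps * (e-1))."""
    return 3.0 * (math.e + 1.0) / (beta * eps * (math.e - 1.0))


eps = 0.5
cap_d3 = cap_measure(3, eps)
print(cap_d3, math.pi * eps * eps / 9.0, moment_threshold(beta=1.0, eps=eps))
```

For $d = 3$ the cap $\{|\xi - u| < \epsilon/3\}$ has height $\epsilon^2/18$ on the unit sphere, so its area is $2\pi \cdot \epsilon^2/18 = \pi\epsilon^2/9$, which the formula reproduces.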

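Example 4. Consider the standard multivariate exponential distribution $\pi(x) = c \exp(-\lambda|x|)$ on $\mathbb{R}^d$ where $\lambda > 0$. We perform a Metropolis algorithm with proposal distribution in the family $\{Q_\gamma(\cdot)\}_{\gamma \in \mathcal{Y}}$ at the $n$-th iteration, where

$Q_n(x, \cdot) = \begin{cases} \mathrm{Unif}(B^d(x, \Delta)), & n \le 2d, \text{ or } \Sigma_n \text{ is singular}, \\ (1-\theta) N(x, \Sigma_n/d) + \theta \, \mathrm{Unif}(B^d(x, \Delta)), & n > 2d, \text{ and } \Sigma_n \text{ is nonsingular}, \end{cases}$ (21)

for $\theta \in (0, 1)$, $\mathrm{Unif}(B^d(x, \Delta))$ is the uniform distribution on the hyperball $B^d(x, \Delta)$ with the center $x$ and the radius $\Delta$, and $\Sigma_n$ is as defined in Eq. (16). The problem is: how to choose $\Delta$ such that the adaptive Metropolis algorithm is ergodic?

Proposition 8. There exists a large enough $\Delta > 0$ such that the adaptive Metropolis algorithm of Example 4 is ergodic.

Proof: We compute that $\nabla\pi(x) = -\lambda n(x) \pi(x)$. So, $\langle n(x), \nabla \log \pi(x) \rangle = -\lambda$ and $\langle n(x), m(x) \rangle = -1$. So, the target density is exponentially tailed, and Assumptions 1 and 2 hold. Obviously, each proposal density is locally positive. Now, let us check Assumption 4 by using Lemma 1. Because $\mathrm{Vol}(B^d(x, \Delta)) = \frac{\Delta^d \pi^{d/2}}{\Gamma(d/2 + 1)}$, the function $g(t)$ defined in Lemma 1 is equal to $\frac{1}{\mathrm{Vol}(B^d(x, \Delta))}$ for $t \le \Delta$. The quantities $\eta_2$ defined in Eq. (10) and $\eta_1$ defined in Eq. (11) are respectively $\lambda$ and $1$. Now, fix any $\epsilon \in (0, 1)$ and any $\delta \in \left(\frac{3}{\lambda\epsilon}, \infty\right)$. The left hand side of Eq. (19) is

$\frac{(d-1)\pi^{\frac{d-1}{2}}}{2\Gamma\left(\frac{d+1}{2}\right)} \mathrm{Be}_{r^2}\left(\frac{d-1}{2}, \frac{1}{2}\right) \int_\delta^\Delta g(t) t^d \, dt = \frac{d-1}{2(d+1) \mathrm{Be}\left(\frac{d+1}{2}, \frac{1}{2}\right)} \cdot \mathrm{Be}_{r^2}\left(\frac{d-1}{2}, \frac{1}{2}\right) \cdot \frac{\Delta^{d+1} - \delta^{d+1}}{\Delta^d}$,

where $\mathrm{Be}(x, y)$ and $\mathrm{Be}_r(x, y)$ are the beta function and the incomplete beta function, and $r$ is the function of $\epsilon$ defined in Lemma 1. Once $\epsilon$ and $\delta$ are fixed, the first two factors on the right hand side of the above equation are fixed. Then, as $\Delta$ goes to infinity, the whole expression tends to infinity. So, there exists a large enough $\Delta > 0$ such that Eq. (19) holds. By Lemma 1, Assumption 4 holds. Then, by Proposition 9, Containment holds. By Proposition 7, Diminishing Adaptation holds. By Theorem 1, the adaptive Metropolis algorithm is ergodic.

4 Proofs of the Main Results

4.1 Proofs of Section 2.1

4.1.1 Proofs of Example 1

Proof of Proposition 1: Since the adaptation is state-independent, the stationarity is preserved. So, the adaptive MCMC satisfies $X_n \sim \delta P_{\theta_0} P_{\theta_1} P_{\theta_2} \cdots P_{\theta_n}$ for $n \ge 0$, where $\delta := (\delta^{(1)}, \delta^{(2)})$ is the initial distribution.

Part (i). Consider $\|P_{\theta_{n+1}}(x, \cdot) - P_{\theta_n}(x, \cdot)\|_{TV}$. For any $x \in \mathcal{X}$,

$\|P_{\theta_{n+1}}(x, \cdot) - P_{\theta_n}(x, \cdot)\|_{TV} = |\theta_{n+1} - \theta_n| \to 0$.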

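Thus, for $r > 0$ Diminishing Adaptation holds. By some algebra,

$\|P_\theta^n(x, \cdot) - \pi(\cdot)\|_{TV} = \frac{1}{2} |1 - 2\theta|^n$. (22)

Hence, for any $\epsilon > 0$,

$M_\epsilon(X_n, \theta_n) \ge \frac{\log \epsilon - \log(1/2)}{\log |1 - 2\theta_n|} \to +\infty$ as $n \to \infty$. (23)

Therefore, the stochastic process $\{M_\epsilon(X_n, \theta_n) : n \ge 0\}$ is not bounded in probability.

Parts (ii) and (iii). Let $\mu_n := (\mu_n^{(1)}, \mu_n^{(2)}) := \delta P_{\theta_0} \cdots P_{\theta_n}$. So,

$\mu_{n+1}^{(1)} = \mu_n^{(1)} - \theta_{n+1} (\mu_n^{(1)} - \mu_n^{(2)})$ and $\mu_{n+1}^{(2)} = \mu_n^{(2)} + \theta_{n+1} (\mu_n^{(1)} - \mu_n^{(2)})$.

Hence,

$\mu_{n+1}^{(1)} - \mu_{n+1}^{(2)} = (\delta^{(1)} - \delta^{(2)}) \prod_{k=0}^{n+1} (1 - 2\theta_k)$.

For $r > 1$, $\prod_{k=0}^{n+1} (1 - 2\theta_k)$ converges to some $\alpha \in (0, 1)$ as $n$ goes to infinity, so $\mu_{n+1}^{(1)} - \mu_{n+1}^{(2)} \to (\delta^{(1)} - \delta^{(2)}) \alpha$. For $0 < r \le 1$, $\mu_{n+1}^{(1)} - \mu_{n+1}^{(2)} \to 0$. Therefore, for $r > 1$ ergodicity to the uniform distribution does not hold, and for $0 < r \le 1$ ergodicity holds.

Proof of Proposition 2: From Eq. (22), for $\epsilon > 0$,

$M_\epsilon(X_{2k}, \theta_{2k}) \ge \frac{\log \epsilon - \log(1/2)}{\log(1 - 1/k)} \to \infty$ as $k \to \infty$.

So, Containment does not hold. Moreover,

$\|P_{\theta_{2k}}(x, \cdot) - P_{\theta_{2k-1}}(x, \cdot)\|_{TV} = \frac{1}{2} - \frac{1}{2k} \to \frac{1}{2}$ as $k \to \infty$.

So Diminishing Adaptation does not hold. Let $\delta := (\delta^{(1)}, \delta^{(2)})$ be the initial distribution and $\mu_n := (\mu_n^{(1)}, \mu_n^{(2)}) = \delta P_{\theta_0} \cdots P_{\theta_n}$. Then

$\mu_n^{(1)} - \mu_n^{(2)} = (\delta^{(1)} - \delta^{(2)}) \prod_{k=0}^{n} (1 - 2\theta_k) \to 0$

as $n$ goes to infinity. So ergodicity holds.

4.1.2 Proof of Proposition 3

First, we show that Diminishing Adaptation holds.

Lemma 2. For the adaptive chain $\{X_n : n \ge 0\}$ defined in Example 2, the adaptation is diminishing.

Proof: For $\gamma = 1$, obviously the proposal density is $q_\gamma(x, y) = \varphi(y - x)$ where $\varphi(\cdot)$ is the density function of the standard normal distribution. For $\gamma = -1$, the random variable $1/x + Z_n$ has the density $\varphi(y - 1/x)$, so the random variable $1/(1/x + Z_n)$ has the density $q_\gamma(x, y) = \varphi(1/y - 1/x)/y^2$. The proposal density is therefore

$q_\gamma(x, y) = \begin{cases} \varphi(y - x), & \gamma = 1, \\ \varphi(1/y - 1/x)/y^2, & \gamma = -1. \end{cases}$

For $\gamma = 1$, the acceptance rate is $\min\left\{1, \frac{\pi(y) q_\gamma(y, x)}{\pi(x) q_\gamma(x, y)}\right\} I(y \in \mathcal{X}) = \min\left\{1, \frac{1 + x^2}{1 + y^2}\right\} I(y > 0)$. For $\gamma = -1$, the acceptance rate is

$\min\left\{1, \frac{\pi(y) q_\gamma(y, x)}{\pi(x) q_\gamma(x, y)}\right\} I(y \in \mathcal{X}) = \min\left\{1, \frac{(1 + x^2) \varphi(1/x - 1/y)/x^2}{(1 + y^2) \varphi(1/y - 1/x)/y^2}\right\} I(y > 0) = \min\left\{1, \frac{1 + x^{-2}}{1 + y^{-2}}\right\} I(y > 0)$.

So for $\gamma \in \mathcal{Y}$, the acceptance rate is

$\alpha_\gamma(x, y) := \min\left\{1, \frac{\pi(y) q_\gamma(y, x)}{\pi(x) q_\gamma(x, y)}\right\} I(y \in \mathcal{X}) = \min\left\{1, \frac{1 + x^{2\gamma}}{1 + y^{2\gamma}}\right\} I(y \in \mathcal{X})$. (24)

From Eq. (3), $[\Gamma_n \ne \Gamma_{n-1}] = [X_n^{\Gamma_{n-1}} < 1/n]$. Since the joint process $\{(X_n, \Gamma_n) : n \ge 0\}$ is a time inhomogeneous Markov chain,

$P(\Gamma_n \ne \Gamma_{n-1}) = \int_{\mathcal{X} \times \mathcal{Y}} P(X_n^{\Gamma_{n-1}} < 1/n \mid X_{n-1} = x, \Gamma_{n-1} = \gamma) P(X_{n-1} \in dx, \Gamma_{n-1} \in d\gamma)$
$= \int_{\mathcal{X} \times \mathcal{Y}} P_\gamma(x, [t > 0 : t^\gamma < 1/n]) P(X_{n-1} \in dx, \Gamma_{n-1} \in d\gamma)$
$= \int_{[x^\gamma \ge 1/(n-1)]} P_\gamma(x, [t > 0 : t^\gamma < 1/n]) P(X_{n-1} \in dx, \Gamma_{n-1} \in d\gamma)$,

where the second equality is from Eq. (1), and the last equality is from $P(X_{n-1}^{\Gamma_{n-1}} \ge 1/(n-1)) = 1$, which is implied by Eq. (3). So for any $(x, \gamma) \in \{(t, s) \in \mathcal{X} \times \mathcal{Y} : t^s \ge 1/(n-1)\}$,

$P_\gamma(x, [t > 0 : t^\gamma < 1/n]) = \int_0^\infty I(y^\gamma < 1/n) q_\gamma(x, y) \, dy = \int_{-x^\gamma}^{-x^\gamma + 1/n} \varphi(z) \, dz$.

Since $-x^\gamma + 1/n < 0$, we have that

$\frac{\varphi(-x^\gamma)}{n} \le P_\gamma(x, [t > 0 : t^\gamma < 1/n]) \le \frac{\varphi(0)}{n}$. (25)

Hence

$P(\Gamma_n \ne \Gamma_{n-1}) \le \frac{1}{\sqrt{2\pi} \, n}$. (26)

Therefore, for any $\epsilon > 0$,

$P\left( \sup_{x \in \mathcal{X}} \|P_{\Gamma_n}(x, \cdot) - P_{\Gamma_{n-1}}(x, \cdot)\|_{TV} > \epsilon \right) \le P(\Gamma_n \ne \Gamma_{n-1}) \to 0$.

From Eq. (24), at the $n$-th iteration, the acceptance rate is $\alpha_{\Gamma_{n-1}}(X_{n-1}, Y_n) = \min\left\{1, \frac{1 + X_{n-1}^{2\Gamma_{n-1}}}{1 + Y_n^{2\Gamma_{n-1}}}\right\} I(Y_n > 0)$. Let us denote $\tilde{Y}_n := Y_n^{\Gamma_{n-1}}$ and $\tilde{X}_n := X_n^{\Gamma_n}$. The acceptance rate is equal to

$\min\left\{1, \frac{1 + \tilde{X}_{n-1}^2}{1 + \tilde{Y}_n^2}\right\} I(\tilde{Y}_n > 0)$.

From Eq. (3),

$X_n^{\Gamma_n} = (X_n^{\Gamma_{n-1}})^{-1} I(X_n^{\Gamma_{n-1}} < 1/n) + X_n^{\Gamma_{n-1}} I(X_n^{\Gamma_{n-1}} \ge 1/n)$.

When $Y_n$ is accepted, i.e. $X_n = Y_n$, we have $[\tilde{Y}_n < 1/n] = [X_n^{\Gamma_{n-1}} < 1/n]$ and $X_n^{\Gamma_n} = \tilde{Y}_n^{-1} I(\tilde{Y}_n < 1/n) + \tilde{Y}_n I(\tilde{Y}_n \ge 1/n)$.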
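The adaptive chain of Example 2 is simple to simulate from the proposal of Eq. (2), the acceptance rate of Eq. (24) and the adaptation rule of Eq. (3). The following is a minimal sketch. The starting point $X_0 = 1$, $\Gamma_0 = 1$, the seed and the function name are assumptions of this illustration.

```python
import random


def simulate_example2(n_steps, x0=1.0, gamma0=1, seed=0):
    """Simulate (X_n, Gamma_n) for Example 2 and return the pairs for n = 0, ..., n_steps."""
    rng = random.Random(seed)
    x, gamma = x0, gamma0
    path = [(x, gamma)]
    for n in range(1, n_steps + 1):
        x_t = x ** gamma                 # X_{n-1}^{Gamma_{n-1}}
        y_t = x_t + rng.gauss(0.0, 1.0)  # Eq. (2), in the transformed coordinate
        # Eq. (24): proposals with y_t <= 0 fall outside the state space and are rejected.
        if y_t > 0 and rng.random() < min(1.0, (1.0 + x_t * x_t) / (1.0 + y_t * y_t)):
            x = y_t ** gamma             # map the accepted proposal back to the original coordinate
        # Eq. (3): switch the kernel index when X_n^{Gamma_{n-1}} < 1/n.
        if x ** gamma < 1.0 / n:
            gamma = -gamma
        path.append((x, gamma))
    return path


path = simulate_example2(2000)
switches = sum(1 for (_, g0), (_, g1) in zip(path, path[1:]) if g0 != g1)
print(path[-1], switches)
```

Right after each adaptation step the transformed state satisfies $X_n^{\Gamma_n} \ge 1/n$, which is the property used in the proof of Lemma 2. Counting `switches` over longer runs gives an empirical view of how rarely the kernel changes.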

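On the other hand, from Eq. (2), the conditional distribution $\tilde{Y}_n \mid \tilde{X}_{n-1}$ is $N(\tilde{X}_{n-1}, 1)$. From the above discussion, the chain $\tilde{X} := \{\tilde{X}_n : n \ge 0\}$ can be constructed according to the following procedure. Define the independent random variables $Z_n \overset{iid}{\sim} N(0, 1)$, $U_n \overset{iid}{\sim} \mathrm{Bernoulli}(0.5)$, and $T_n \overset{iid}{\sim} \mathrm{Unif}(0, 1)$. Let $\tilde{X}_0 = X_0^{\Gamma_0}$. At each time $n \ge 1$, define the variable

$\tilde{Y}_n := \tilde{X}_{n-1} - U_n |Z_n| + (1 - U_n) |Z_n|$. (27)

Clearly, $-U_n |Z_n| + (1 - U_n) |Z_n| \overset{d}{=} N(0, 1)$ ($\overset{d}{=}$ means equal in distribution). If $T_n < \min\left\{1, \frac{1 + \tilde{X}_{n-1}^2}{1 + \tilde{Y}_n^2}\right\} I(\tilde{Y}_n > 0)$ then

$\tilde{X}_n = I(\tilde{Y}_n < 1/n) \tilde{Y}_n^{-1} + I(\tilde{Y}_n \ge 1/n) \tilde{Y}_n$; (28)

otherwise $\tilde{X}_n = \tilde{X}_{n-1}$.

Note that:
1. The process $\tilde{X}$ is a time inhomogeneous Markov chain.
2. $P(\tilde{X}_n \ge 1/n) = 1$ for $n \ge 1$.
3. At the time $n$, $U_n$ indicates the proposal direction ($U_n = 0$: try to jump towards infinity; $U_n = 1$: try to jump towards zero). $|Z_n|$ specifies the step size if the proposal value $Y_n$ is accepted. $T_n$ is used to check whether the proposal value $Y_n$ is accepted or not. When $U_n = 1$ and $\tilde{Y}_n > 0$, Eq. (28) is always run.

For two integers $0 \le s \le t$, a process $X$ and a set $A \subset \mathcal{X}$, denote $[X_{s:t} \in A] := [X_s \in A; X_{s+1} \in A; \cdots; X_t \in A]$ and $s : t := \{s, s+1, \cdots, t\}$. For a value $x \in \mathbb{R}$, denote the largest integer less than $x$ by $[x]$.

In the following proofs for the example, we use the notation from the procedure of constructing the process $\tilde{X}$.

Lemma 3. Let $a = \left( \frac{1}{2} - \frac{7\sqrt{2}}{12\sqrt{\pi}} \right)^{-2}$. Given $0 < r < 1$, for $[x] > 12^{\frac{1}{1-r}}$,

$P\left( \exists i \in k+1 : k + [x]^{1+r}, \ \tilde{X}_i < x/2 \mid \tilde{X}_k = x \right) \le \frac{[x]^{1+r}}{\left( \frac{[x]}{2} - \frac{7\sqrt{2}[x]^r}{\sqrt{\pi}} \right)^2} \le \frac{a}{[x]^{1-r}}$.

Proof: The process $\tilde{X}$ is generated through the underlying processes $\{\tilde{Y}_j, Z_j, U_j, T_j : j \ge 1\}$ defined in Eq. (27) and Eq. (28). Conditional on $[\tilde{X}_k = x]$, we can construct an auxiliary chain $B := \{B_j : j \ge k\}$ that behaves like an asymmetric random walk until $\tilde{X}$ falls below $x/2$, and $B$ is always dominated from above by $\tilde{X}$. It is defined as follows: $B_k = \tilde{X}_k$; for $j > k$, if $\tilde{X}_{j-1} < x/2$ then $B_j := \tilde{X}_j$, otherwise
1. If proposing towards zero ($U_j = 1$), then $B$ also jumps in the same direction with the step size $|Z_j|$ (in this case, the acceptance rate $\min\left\{1, \frac{1 + \tilde{X}_{j-1}^2}{1 + \tilde{Y}_j^2}\right\}$ is equal to $1$);
2. If proposing towards infinity ($U_j = 0$), then $B_j$ is assigned the value $B_{j-1} + |Z_j|$ (the jumping direction of $B$ at the time $j$ is the same as that of $\tilde{X}$) with the acceptance rate $\frac{1 + (x/2)^2}{1 + (x/2 + |Z_j|)^2}$, which is independent of $\tilde{X}_{j-1}$,
i.e. for $j > k$,

$B_j := I(\tilde{X}_{j-1} < x/2) \tilde{X}_j + I(\tilde{X}_{j-1} \ge x/2) (B_{j-1} - I_j(x))$ (29)

where

$I_j(x) := U_j |Z_j| - (1 - U_j) |Z_j| \, I\left( T_j < \frac{1 + (x/2)^2}{1 + (x/2 + |Z_j|)^2} \right)$. (30)

Note that
1. $\{Z_j, U_j, T_j : j > k\}$ are independent, so $\{I_j(x) : j > k\}$ are independent.
2. When $\tilde{X}_{j-1} > x/2$ and $U_j = 0$ (proposing towards infinity), the acceptance rate satisfies $\frac{1 + \tilde{X}_{j-1}^2}{1 + \tilde{Y}_j^2} > \frac{1 + (x/2)^2}{1 + (x/2 + |Z_j|)^2}$, so that $\left[ T_j < \frac{1 + (x/2)^2}{1 + (x/2 + |Z_j|)^2} \right] \subset \left[ T_j < \frac{1 + \tilde{X}_{j-1}^2}{1 + \tilde{Y}_j^2} \right]$, which is equivalent to $[B_j - B_{j-1} = |Z_j|] \subset [\tilde{X}_j - \tilde{X}_{j-1} = |Z_j|]$. Therefore, $B$ is always dominated from above by $\tilde{X}$.

Conditional on $[\tilde{X}_k = x]$,

$[\exists i \in k+1 : k + [x]^{1+r}, \ \tilde{X}_i < x/2] \subset [\exists i \in k+1 : k + [x]^{1+r}, \ B_i < x/2]$

and for $i \in k+1 : k + [x]^{1+r}$,

$[B_{k:i-1} \ge x/2; \ B_i < x/2] \subset \left[ B_k \ge x/2; \ B_k - \sum_{l=k+1}^{t} I_l(x) \ge x/2 \text{ for all } t \in k+1 : i-1; \ B_k - \sum_{l=k+1}^{i} I_l(x) < x/2 \right]$.

So,

$P\left( \exists i \in k+1 : k + [x]^{1+r}, \ \tilde{X}_i < x/2 \mid \tilde{X}_k = x \right) \le P\left( \exists i \in k+1 : k + [x]^{1+r}, \ B_k - \sum_{j=k+1}^{i} I_j(x) < x/2 \mid B_k = x \right)$
$\le P\left( \max_{l \in 1:[x]^{1+r}} S_l > x/2 \right) \le P\left( \max_{l \in 1:q} S_l > q^{\frac{1}{1+r}}/2 \right)$

where $S_0 = 0$, $S_l = \sum_{j=1}^{l} I_{k+j}(x)$ and $q = [x]^{1+r}$. $\{I_j(x) : k < j \le k + l\}$ and $B_k$ are independent, so the right hand side of the above equation is independent of $k$.

By some algebra,

$0 \le E[I_i(x)] = \frac{1}{2} E\left[ \frac{|Z_i|^2 (x + |Z_i|)}{1 + (x/2 + |Z_i|)^2} \right] < \frac{7\sqrt{2}}{\sqrt{\pi} \, x}$,

$\mathrm{Var}[I_i(x)] = \frac{1}{2} + \frac{1}{2} E\left[ \frac{|Z_i|^2 (1 + (x/2)^2)}{1 + (x/2 + |Z_i|)^2} \right] - \frac{1}{4} \left( E\left[ \frac{|Z_i|^2 (x + |Z_i|)}{1 + (x/2 + |Z_i|)^2} \right] \right)^2 \in [0, 1]$.

Let $\mu_l = E[S_l]$ and $S'_l = S_l - \mu_l$, and note that $\mu_l$ is increasing as $l$ increases, and $\mu_q \in \left[ 0, \frac{7\sqrt{2} \, q}{\sqrt{\pi} \, x} \right]$. So $\{S'_i : i = 1, \cdots, q\}$ is a martingale. By the Kolmogorov Maximal Inequality,

$P\left( \max_{l \in 1:q} S_l > q^{\frac{1}{1+r}}/2 \right) \le P\left( \max_{l \in 1:q} |S'_l| > q^{\frac{1}{1+r}}/2 - \mu_q \right) \le \frac{q \, \mathrm{Var}[I_k(x)]}{\left( q^{\frac{1}{1+r}}/2 - \mu_q \right)^2} \le \frac{[x]^{1+r}}{\left( \frac{[x]}{2} - \frac{7\sqrt{2}[x]^r}{\sqrt{\pi}} \right)^2} < \frac{a}{[x]^{1-r}}$.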

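The last second inequality is from [x] > 2 r > 4 2 r π implying [x] 2 > 7 2[x] r π.

Assume that $X_n$ converges weakly to $\pi$. Take some $c > 1$ such that, for the set $D = (1/c, c)$, $\pi(D) = 9/10$. Taking an $r \in (0,1)$, there exists $N > 2c \vee 2^{1/r}\bigl(a^{1/r} \vee \exp\bigl(\tfrac{1}{0.8\varphi(c)\,r}\bigr)\bigr)$ ($a$ is defined in Lemma 3) such that for any $n > N + 1$, $P(X_n \in D) > 0.8$. Since $[X_n \in D] = [X_n^{\Gamma_{n-1}} \in D]$ and $X_n^{\Gamma_{n-1}} \overset{d}{=} \tilde X_n$, $P(\tilde X_n \in D) > 0.8$. So $P(\tilde X_n > n/2) < 0.2$ for $n > N$.

Let $m = \exp\bigl(\tfrac{1}{0.8\varphi(c)}\bigr)(n+1) - 1$, which implies $m > n$, $m < n^{1+r}$ (because $n > 2^{1/r}\exp\bigl(\tfrac{1}{0.8\varphi(c)\,r}\bigr)$), and $\log\frac{m+1}{n+1} = \frac{1}{0.8\varphi(c)}$. Then

$$0.2 > P(\tilde X_m > n/2) \ge \sum_{j=n}^{m-1} P\Bigl(\tilde X_j \in D;\ \tilde Y_{j+1} < \tfrac{1}{j+1};\ \tilde X_{j+1:m} > \tfrac{n}{2}\Bigr). \tag{31}$$

From Eq. (27) and Eq. (28), $[\tilde Y_{i+1} < \tfrac{1}{i+1}] = [\tilde X_{i+1} = \tfrac{1}{\tilde Y_{i+1}} > i+1]$ for any $i > 1$. Consider $j \in n : m-1$. Since $\tilde X$ is a time-inhomogeneous Markov chain,

$$P\Bigl(\tilde X_j \in D;\ \tilde Y_{j+1} < \tfrac{1}{j+1};\ \tilde X_{j+1:m} > \tfrac{n}{2}\Bigr) = P(\tilde X_j \in D)\, P\Bigl(\tilde Y_{j+1} < \tfrac{1}{j+1} \,\Big|\, \tilde X_j \in D\Bigr)\, P\Bigl(\tilde X_{j+2:m} > \tfrac{n}{2} \,\Big|\, \tilde X_{j+1} = \tfrac{1}{\tilde Y_{j+1}} > j+1\Bigr)$$

$$= P(\tilde X_j \in D)\, P\Bigl(\tilde X_{j+1} = \tfrac{1}{\tilde Y_{j+1}} > j+1 \,\Big|\, \tilde X_j \in D\Bigr) \Bigl(1 - P\Bigl(\tilde X_t \le \tfrac{n}{2} \text{ for some } t \in j+1 : m \,\Big|\, \tilde X_{j+1} = \tfrac{1}{\tilde Y_{j+1}} > j+1\Bigr)\Bigr).$$

From Eq. (25), for any $x \in D$,

$$P\Bigl(\tilde Y_{j+1} < \tfrac{1}{j+1} \,\Big|\, \tilde X_j = x\Bigr) = P\bigl(x, \{t \in X : t < 1/(j+1)\}\bigr) \in \Bigl[\frac{\varphi(c)}{j+1}, \frac{\varphi(0)}{j+1}\Bigr].$$

So

$$P\Bigl(\tilde Y_{j+1} < \tfrac{1}{j+1} \,\Big|\, \tilde X_j \in D\Bigr) \ge \frac{\varphi(c)}{j+1}.$$

Hence, for $x > j+1$,

$$P\bigl(\tilde X_t \le n/2 \text{ for some } t \in j+1 : m \mid \tilde X_{j+1} = x\bigr) \le P\bigl(\tilde X_t \le x/2 \text{ for some } t \in j+1 : m \mid \tilde X_{j+1} = x\bigr)$$

$$\le P\bigl(\tilde X_t \le x/2 \text{ for some } t \in j+1 : j + [x]^{1+r} \mid \tilde X_{j+1} = x\bigr) \le \frac{a}{[x]^r} \le \frac{a}{n^r},$$

because of $x/2 > n/2$, $m < n^{1+r}$, and Lemma 3. Thus,

$$P\Bigl(\tilde X_t \le \tfrac{n}{2} \text{ for some } t \in j+1 : m \,\Big|\, \tilde X_{j+1} = \tfrac{1}{\tilde Y_{j+1}} > j+1\Bigr) \le \frac{a}{n^r}.$$

Therefore,

$$P(\tilde X_m > n/2) \ge 0.8\,\varphi(c)\Bigl(1 - \frac{a}{n^r}\Bigr)\sum_{j=n}^{m-1}\frac{1}{j+1} \ge 0.8\,\varphi(c)\Bigl(1 - \frac{a}{n^r}\Bigr)\log\frac{m+1}{n+1} = 1 - \frac{a}{n^r} > 0.5.$$

Contradiction! By Lemma 2, Containment does not hold.

4.2 Proofs of Section 2.2

Proof of Theorem 2: Let $\{X_n^\gamma : n \ge 0\}$ and $\{Y_n^\gamma : n \ge 0\}$ be two realizations of $P_\gamma$ for $\gamma \in Y$. Define $h(x,y) := (V(x) + V(y))/2$. From (ii) of SGE,

$$E[h(X_1^\gamma, Y_1^\gamma) \mid X_0^\gamma = x, Y_0^\gamma = y] \le \lambda h(x,y) + b\, I_{C \times C}(x,y).$$

It is not difficult to get $P_\gamma^m V(x) \le \lambda^m V(x) + bm$, so

$$A := \sup_{(x,y) \in C \times C} E[h(X_m^\gamma, Y_m^\gamma) \mid X_0^\gamma = x, Y_0^\gamma = y] \le \lambda^m \sup_C V + bm =: B.$$

Consider $\mathcal{L}(X_0^\gamma) = \delta_x$ and $j := \sqrt{n}$. By Proposition 4,

$$\|P_\gamma^n(x, \cdot) - \pi(\cdot)\|_{TV} \le (1-\delta)^{[\sqrt{n}/m]} + \lambda^{n - \sqrt{n}\,m + 1} B^{\sqrt{n}-1}\,(V(x) + \pi(V))/2. \tag{32}$$

Note that the quantitative bound depends on $x$, $n$, $\delta$, $m$, $C$, $V$ and $\pi$, and is independent of $\gamma$. As $n$ goes to infinity, the uniform quantitative bound of all $\|P_\gamma^n(x,\cdot) - \pi(\cdot)\|_{TV}$ tends to zero for any $x \in X$.

Let $\{X_n : n \ge 0\}$ be the adaptive MCMC satisfying SGE. From (ii) of SGE, $\sup_n E[V(X_n) \mid X_0 = x, \Gamma_0 = \gamma_0] < \infty$, so the process $\{V(X_n) : n \ge 0\}$ is bounded in probability. Therefore, for any $\epsilon > 0$, $\{M_\epsilon(X_n, \Gamma_n) : n \ge 0\}$ is bounded in probability given any $X_0 = x$ and $\Gamma_0 = \gamma_0$.

Proof of Corollary 1: From Eq. (4), letting $\lambda = \limsup_{|x| \to \infty} \sup_{\gamma \in Y} \frac{P_\gamma V(x)}{V(x)} < 1$, there exists some positive constant $K$ such that $\sup_{\gamma \in Y} \frac{P_\gamma V(x)}{V(x)} < \frac{\lambda+1}{2}$ for $|x| > K$. By $V > 1$, $P_\gamma V(x) < \frac{\lambda+1}{2} V(x)$ for $|x| > K$. Hence

$$P_\gamma V(x) \le \frac{\lambda+1}{2} V(x) + b\, I_{\{z \in X : |z| \le K\}}(x), \quad \text{where } b = \sup_{x \in \{z \in X : |z| \le K\}} V(x).$$

Proof of Proposition 6: Fix $x_0 \in X$, $\gamma_0 \in Y$. By the condition (iii) and the Borel-Cantelli Lemma, $\forall \epsilon > 0$, $\exists N_0(x_0, \gamma_0, \epsilon) > 0$ such that $\forall n > N_0$,

$$P_{(x_0,\gamma_0)}(\Gamma_n = \Gamma_{n+1} = \cdots) > 1 - \epsilon/2. \tag{33}$$

Construct a new chain $\{\tilde X_n : n \ge 0\}$ which satisfies that for $n \le N_0$, $\tilde X_n = X_n$, and for $n > N_0$, $\tilde X_n \sim P_{\Gamma_{N_0}}^{\,n-N_0}(X_{N_0}, \cdot)$. So, for any $n > N_0$ and any set $A \in \mathcal{B}(X)$, by the condition (ii),

$$P_{(x_0,\gamma_0)}(X_n \in A,\ \Gamma_{N_0} = \Gamma_{N_0+1} = \cdots = \Gamma_{n-1}) = \int_{X^{N_0} \cap [\gamma_{N_0} = \cdots = \gamma_{n-1}]} P_{\gamma_0}(x_0, dx_1) \cdots P_{\gamma_{N_0-1}}(x_{N_0-1}, dx_{N_0})\, P_{\gamma_{N_0}}^{\,n-N_0}(x_{N_0}, A)$$

and

$$P_{(x_0,\gamma_0)}(\tilde X_n \in A) = \int_{X^{N_0}} P_{\gamma_0}(x_0, dx_1) \cdots P_{\gamma_{N_0-1}}(x_{N_0-1}, dx_{N_0})\, P_{\gamma_{N_0}}^{\,n-N_0}(x_{N_0}, A).$$

So,

$$\bigl|P_{(x_0,\gamma_0)}(X_n \in A,\ \Gamma_{N_0} = \cdots = \Gamma_{n-1}) - P_{(x_0,\gamma_0)}(\tilde X_n \in A)\bigr| \le \epsilon/2.$$

Since the condition (i) holds, suppose that for some $K > 0$, $Y = \{y_1, \ldots, y_K\}$. Denote $\mu_i(\cdot) = P_{(x_0,\gamma_0)}(X_{N_0} \in \cdot \mid \Gamma_{N_0} = y_i)$ for $i = 1, \ldots, K$. Because of the condition (ii), for $n > N_0$,

$$P_{(x_0,\gamma_0)}(\tilde X_n \in A) = \sum_{i=1}^{K} P_{(x_0,\gamma_0)}(\tilde X_n \in A,\ \Gamma_{N_0} = y_i)$$

$$= \sum_{i=1}^{K} \int_{X^{N_0} \cap [\gamma_{N_0} = y_i]} P_{\gamma_0}(x_0, dx_1) \cdots P_{\gamma_{N_0-1}}(x_{N_0-1}, dx_{N_0})\, P_{y_i}^{\,n-N_0}(x_{N_0}, A)$$

$$= \sum_{i=1}^{K} P_{(x_0,\gamma_0)}(\Gamma_{N_0} = y_i)\, \mu_i P_{y_i}^{\,n-N_0}(A).$$

By the condition (i), there exists $N_1(x_0, \gamma_0, \epsilon, N_0) > 0$ such that for $n > N_1$,

$$\sup_{i \in \{1,\ldots,K\}} \|\mu_i P_{y_i}^{\,n} - \pi\|_{TV} < \epsilon/2.$$

So, for any $n > N_0 + N_1$ and any $A \in \mathcal{B}(X)$,

$$|P_{(x_0,\gamma_0)}(X_n \in A) - \pi(A)| \le |P_{(x_0,\gamma_0)}(X_n \in A) - P_{(x_0,\gamma_0)}(\tilde X_n \in A)| + |P_{(x_0,\gamma_0)}(\tilde X_n \in A) - \pi(A)| \le \epsilon/2 + \epsilon/2 + \epsilon/2 = 3\epsilon/2.$$

Therefore, the adaptive MCMC $\{X_n : n \ge 0\}$ is ergodic with the target distribution $\pi$.

4.3 Proof of Theorem 3

Before we show Theorem 3, we state [11, Lemma 4.2].

Lemma 4. Let $x$ and $z$ be two distinct points in $\mathbb{R}^d$, and let $\xi = n(x - z)$. If $\langle \xi, m(y) \rangle \ne 0$ for all $y$ on the line from $x$ to $z$, then $z$ does not belong to $\{y \in \mathbb{R}^d : \pi(y) = \pi(x)\}$.

Consider the test function $V(x) = c\,\pi^{-s}(x)$ for some $c > 0$ and $s \in (0,1)$ such that $V(x) \ge 1$. Note that it is not difficult to check that, for $s \in (0,1)$, $\pi(V) < \infty$ by utilizing Definition 2. By some algebra,

$$\frac{P_\gamma V(x)}{V(x)} = \int_{A(x)-x} \frac{\pi^{s}(x)}{\pi^{s}(x+z)}\, q_\gamma(z)\,\mu_d(dz) + \int_{R(x)-x} \Bigl(1 - \frac{\pi(x+z)}{\pi(x)} + \frac{\pi^{1-s}(x+z)}{\pi^{1-s}(x)}\Bigr) q_\gamma(z)\,\mu_d(dz),$$

where the acceptance region $A(x) := \{y \in X \mid \pi(y) \ge \pi(x)\}$, and the potential rejection region $R(x) := \{y \in X \mid \pi(y) < \pi(x)\}$. From [19, Proposition 3], we have $P_\gamma V(x) \le r(s) V(x)$ where $r(s) := 1 + s(1-s)^{-1+1/s}$.

Proposition 9 (Exponential tail). Suppose that the target density $\pi$ is exponentially tailed. Under Assumptions 1-4, Containment holds.

Proof: Consider $s \in [0, 1/2)$. Under Assumption 4, let

$$h(\alpha, s) = r'(s) + \frac{1}{(1-s)^2} - \alpha \inf_{(u,\gamma) \in S^{d-1} \times Y} \int_{C_{\delta,\Delta}(u,\epsilon)} |z|\,\bigl[e^{-\alpha s |z|} - e^{-\alpha(1-s)|z|}\bigr] q_\gamma(z)\,\mu_d(dz)$$

and

$$H(\alpha, s) = 1 + \int_0^s h(\alpha, t)\,dt,$$

where $\epsilon$, $\beta$, $\delta$, $\Delta$, and $C_{\delta,\Delta}(\cdot,\cdot)$ are defined in Assumption 4. So, $H(\beta\epsilon/3, 0) = 1$ and

$$\frac{\partial H(\beta\epsilon/3, s)}{\partial s}\Big|_{s=0} = h(\beta\epsilon/3, 0) \le e^{-1} + 1 - \frac{\beta\epsilon(1 - e^{-1})}{3} \inf_{(u,\gamma) \in S^{d-1} \times Y} \int_{C_{\delta,\Delta}(u,\epsilon)} |z|\, q_\gamma(z)\,\mu_d(dz) < 0.$$

Therefore, there exists $s_0 \in (0, 1/2)$ such that $H(\beta\epsilon/3, s_0) < 1$.

Denote $C(x) := x - C_{\delta,\Delta}(n(x), \epsilon)$ and $C'(x) := x + C_{\delta,\Delta}(n(x), \epsilon)$. For $|x| \ge 2\Delta$ and $y \in C(x) \cup C'(x)$, $|y| \ge |x| - \Delta$, so $|n(y) - n(x)| < \epsilon/3$.

Since the target density $\pi$ is exponentially tailed and Assumption 2 holds, for sufficiently large $|x| > K_1$ with some $K_1 > 2\Delta$, $\langle n(x), \nabla \log \pi(x) \rangle \le -\beta$ and $\langle n(x), m(x) \rangle \le -\epsilon$. Then there exists some $K_2 > K_1$ such that, for $|x| \ge K_2$, $\langle n(y), m(y) \rangle \le -\epsilon$ for $y \in C(x) \cup C'(x)$. Thus,

$$|\nabla \log \pi(y)| = \frac{\langle n(y), \nabla \log \pi(y) \rangle}{\langle n(y), m(y) \rangle} \ge \beta.$$

Moreover, $y = x \pm a\xi$ for some $\delta \le a \le \Delta$ and $\xi \in S^{d-1}$. So,

$$\langle \xi, m(y) \rangle = \langle \xi - n(x), m(y) \rangle + \langle n(x) - n(y), m(y) \rangle + \langle n(y), m(y) \rangle < -\epsilon/3. \tag{34}$$

Hence, by Lemma 4, for $|x| > K_2$,

$$C(x) \cap \{y \in \mathbb{R}^d : \pi(y) = \pi(x)\} = \emptyset \quad \text{and} \quad C'(x) \cap \{y \in \mathbb{R}^d : \pi(y) = \pi(x)\} = \emptyset.$$

For $y = x + a\xi \in C'(x)$,

$$\pi(y) - \pi(x) = \int_0^a \langle \xi, \nabla \pi(x + t\xi) \rangle\,dt = \int_0^a \langle \xi, m(x + t\xi) \rangle\, |\nabla \pi(x + t\xi)|\,dt < -\frac{\epsilon}{3} \int_0^a |\nabla \pi(x + t\xi)|\,dt \le 0,$$

so that $C'(x) \subset R(x)$. Similarly, $C(x) \subset A(x)$.

Consider the test function $V(x) = c\,\pi^{-s_0}(x)$ for some $c > 0$ such that $V(x) > 1$. By Assumption 1, for any compact set $C \subset \mathbb{R}^d$, $\sup_{x \in C} V(x) < \infty$.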
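The drift ratio $P_\gamma V(x)/V(x)$ for a test function of the form $V = c\,\pi^{-s}$ can be evaluated numerically in a toy case to see the quantities in this proof at work. The sketch below is only an illustration under assumptions that are not from the paper: a one-dimensional target $\pi(x) \propto e^{-|x|}$ and a random-walk proposal with increments uniform on $(-w, w)$. The function names `r_bound` and `drift_ratio` are hypothetical.

```python
import math


def r_bound(s: float) -> float:
    """r(s) = 1 + s (1 - s)^(1/s - 1), the uniform bound on P V / V for 0 < s < 1."""
    return 1.0 + s * (1.0 - s) ** (1.0 / s - 1.0)


def drift_ratio(x: float, s: float, w: float, grid: int = 20000) -> float:
    """Midpoint-rule estimate of P V(x) / V(x) for V = pi^(-s).

    Toy setting (an assumption, not from the paper): pi(x) proportional to
    exp(-|x|) on the real line, proposal increments z ~ Uniform(-w, w).
    """
    def log_pi(y: float) -> float:
        return -abs(y)

    h = 2.0 * w / grid
    total = 0.0
    for k in range(grid):
        z = -w + (k + 0.5) * h
        ratio = math.exp(log_pi(x + z) - log_pi(x))  # pi(x+z) / pi(x)
        if ratio >= 1.0:
            # acceptance region: the move is always accepted
            integrand = ratio ** (-s)
        else:
            # potential rejection region: accept with probability `ratio`
            integrand = 1.0 - ratio + ratio ** (1.0 - s)
        total += integrand * h / (2.0 * w)  # uniform proposal density 1/(2w)
    return total


if __name__ == "__main__":
    s, w = 0.3, 4.0
    for x in (10.0, 50.0, 200.0):
        print(x, drift_ratio(x, s, w), r_bound(s))
```

In this toy case the ratio does not depend on $x$ once $x > w$, and it stays below 1 for the chosen $s$ and $w$, which is the kind of uniform drift that the limsup condition of Corollary 1 asks for.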

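For any sequence $\{x_n : n \ge 0\}$ with $|x_n| \to \infty$, there exists some $N > 0$ such that $\forall n > N$, $|x_n| > K_2$. We have

$$\frac{P_\gamma V(x_n)}{V(x_n)} = \int_{\{C(x_n)-x_n\} \cup \{C'(x_n)-x_n\}} I_{x_n,s_0}(z)\, q_\gamma(z)\,\mu_d(dz) + \int_{\{C(x_n)-x_n\}^c \cap \{C'(x_n)-x_n\}^c} I_{x_n,s_0}(z)\, q_\gamma(z)\,\mu_d(dz),$$

where

$$I_{x,s_0}(z) = \begin{cases} \dfrac{\pi^{s_0}(x)}{\pi^{s_0}(x+z)}, & z \in A(x) - x, \\[2ex] 1 - \dfrac{\pi(x+z)}{\pi(x)} + \dfrac{\pi^{1-s_0}(x+z)}{\pi^{1-s_0}(x)}, & z \in R(x) - x. \end{cases}$$

For $z = a\xi \in C'(x_n) - x_n$ and $t \in (0, |z|)$, by Eq. (34),

$$\langle \xi, \nabla \log \pi(x_n + t\xi) \rangle = \langle \xi, m(x_n + t\xi) \rangle\, |\nabla \log \pi(x_n + t\xi)| < -\epsilon\beta/3.$$

So, by Assumption 4,

$$\frac{\pi(x_n + z)}{\pi(x_n)} = e^{\log \pi(x_n+z) - \log \pi(x_n)} = e^{\int_0^{|z|} \langle \xi, \nabla \log \pi(x_n + t\xi) \rangle\,dt} \le e^{-\beta\epsilon|z|/3} \le e^{-\beta\epsilon\delta/3} \le e^{-1}.$$

Similarly, for $z = -a\xi \in C(x_n) - x_n$,

$$\frac{\pi(x_n)}{\pi(x_n + z)} \le e^{-\beta\epsilon|z|/3} \le e^{-1}.$$

Moreover,

$$1 - t + t^{1-s_0} \le 1 - t + \frac{t^{1-s_0}}{1-s_0}.$$

Since $t \mapsto 1 - t + \frac{t^{1-s_0}}{1-s_0}$ is an increasing function on $[0, 1]$,

$$\int_{\{C(x_n)-x_n\} \cup \{C'(x_n)-x_n\}} I_{x_n,s_0}(z)\, q_\gamma(z)\,\mu_d(dz) \le \int_{C(x_n)-x_n} \frac{e^{-s_0\beta\epsilon|z|/3}}{1-s_0}\, q_\gamma(z)\,\mu_d(dz) + \int_{C'(x_n)-x_n} \Bigl(1 - e^{-\beta\epsilon|z|/3} + \frac{e^{-(1-s_0)\beta\epsilon|z|/3}}{1-s_0}\Bigr) q_\gamma(z)\,\mu_d(dz).$$

On the other hand,

$$\int_{\{C(x_n)-x_n\}^c \cap \{C'(x_n)-x_n\}^c} I_{x_n,s_0}(z)\, q_\gamma(z)\,\mu_d(dz) \le r(s_0)\, Q_\gamma\bigl(\{C(x_n)-x_n\}^c \cap \{C'(x_n)-x_n\}^c\bigr).$$

Define

$$K_{x,\gamma}(t) := \int_{C(x)-x} e^{-t|z|} q_\gamma(z)\,\mu_d(dz) = \int_{C'(x)-x} e^{-t|z|} q_\gamma(z)\,\mu_d(dz),$$

and

$$H_{x,\gamma}(\theta, t) := \frac{K_{x,\gamma}(t\theta)}{1-t} + K_{x,\gamma}(0) - K_{x,\gamma}(\theta) + \frac{K_{x,\gamma}((1-t)\theta)}{1-t} + r(t)\bigl(1 - 2K_{x,\gamma}(0)\bigr).$$

So,

$$\frac{P_\gamma V(x_n)}{V(x_n)} \le H_{x_n,\gamma}(\beta\epsilon/3, s_0).$$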

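Clearly, $K_{x,\gamma}(t) \le 1/2$. For $0 \le t < 1/2$,

$$\frac{\partial H_{x,\gamma}(\theta, t)}{\partial t} = r'(t)\bigl(1 - 2K_{x,\gamma}(0)\bigr) + \frac{K_{x,\gamma}(\theta t) + K_{x,\gamma}(\theta(1-t))}{(1-t)^2} + \frac{\theta}{1-t}\bigl(K'_{x,\gamma}(\theta t) - K'_{x,\gamma}(\theta(1-t))\bigr)$$

$$\le r'(t) + \frac{1}{(1-t)^2} - \frac{\theta}{1-t} \int_{C(x)-x} \bigl(e^{-\theta t|z|} - e^{-\theta(1-t)|z|}\bigr) |z|\, q_\gamma(z)\,\mu_d(dz) \le h(\theta, t).$$

Since $H_{x,\gamma}(\theta, 0) = 1$, $H_{x,\gamma}(\theta, t) \le H(\theta, t)$ for $0 \le t < 1/2$. Thus, $H_{x_n,\gamma}(\beta\epsilon/3, s_0) \le H(\beta\epsilon/3, s_0) < 1$, so $\limsup_{|x| \to \infty} \sup_{\gamma \in Y} \frac{P_\gamma V(x)}{V(x)} < 1$. By Corollary 1, Containment holds.

Proof of Theorem 3: For (ii), by Proposition 9, Containment holds. Ergodicity is then implied by Containment and Diminishing Adaptation.

For (i), from Assumption 3, for any $\epsilon \in (0, \eta_1)$ and any $u \in S^{d-1}$,

$$\int_{C_{\zeta/2,\zeta}(u,\epsilon)} |z|\, q_\gamma(z)\,\mu_d(dz) \ge \frac{\iota\zeta\, \mathrm{Vol}(C_{\zeta/2,\zeta}(u,\epsilon))}{2},$$

where $\iota$ is defined in Eq. (2), $\zeta$ is defined in Assumption 3, and $C_{a,b}(\cdot,\cdot)$ is defined in Eq. (3). The right-hand side of the above inequality is positive and independent of $\gamma$ and $u$. Since the target density is lighter-than-exponentially tailed, $\eta_2 := -\limsup_{|x| \to \infty} \langle n(x), \nabla \log \pi(x) \rangle = +\infty$, so there is some sufficiently large $\beta$ such that Eq. (4) holds. So, Assumption 4 is satisfied.

For (iii), adopting the proof of [8, Theorem 5], we will show that the simultaneous drift condition Eq. (6) holds. Denote $R(g, x, y) := g(y) - g(x) - \langle \nabla g(x), y - x \rangle$. Consider the test function $V(x) := 1 + f^s(x)$, where $f(x) := -\log \pi(x)$, for $\frac{2}{m} - 1 < s < \min\bigl(\frac{2}{m}, \frac{3}{m} - 2\bigr)$, where $m$ is defined in Definition 3. So,

$$P_\gamma V(x) - V(x) = P_\gamma f^s(x) - f^s(x) = \sum_{j=0}^{4} I_j(x, \gamma),$$

where $\Delta$ is defined in Assumption 5 and

$$I_0(x,\gamma) := -s f^{s-1}(x)\, |\nabla f(x)|^2 \int_{\{|z| \le \Delta\} \cap (R(x)-x)} \langle m(x), n(z) \rangle^2 |z|^2\, q_\gamma(z)\,\mu_d(dz),$$

$$I_1(x,\gamma) := \int_{\{|z| \le \Delta\}} R(f^s, x, x+z)\, q_\gamma(z)\,\mu_d(dz),$$

$$I_2(x,\gamma) := \int_{\{|z| \le \Delta\} \cap (R(x)-x)} R(f^s, x, x+z)\, \frac{R(\pi, x, x+z)}{\pi(x)}\, q_\gamma(z)\,\mu_d(dz),$$

$$I_3(x,\gamma) := -\int_{\{|z| \le \Delta\} \cap (R(x)-x)} R(f^s, x, x+z)\, \langle \nabla f(x), z \rangle\, q_\gamma(z)\,\mu_d(dz),$$

$$I_4(x,\gamma) := \int_{\{|z| \le \Delta\} \cap (R(x)-x)} \frac{R(\pi, x, x+z)}{\pi(x)}\, \langle \nabla f^s(x), z \rangle\, q_\gamma(z)\,\mu_d(dz).$$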

By [8, Lemma B.4] and Assumption 5,

$$I_1(x,\gamma) = O\bigl(|x|^{ms-2}\bigr), \quad I_2(x,\gamma) = O\bigl(|x|^{m(s+2)-4}\bigr), \quad I_3(x,\gamma) = O\bigl(|x|^{m(s+1)-3}\bigr), \quad I_4(x,\gamma) = O\bigl(|x|^{m(s+2)-3}\bigr).$$

Note that the $O(\cdot)$ terms in the above equations are independent of $\gamma$. Since $\frac{2}{m} - 1 < s < \min\bigl(\frac{2}{m}, \frac{3}{m} - 2\bigr)$, $I_1(x,\gamma)$, $I_2(x,\gamma)$, $I_3(x,\gamma)$ and $I_4(x,\gamma)$ converge to zero as $|x| \to \infty$.

By Assumption 2, for $\epsilon \in (0, \eta_1)$ (with $\eta_1$ as defined earlier), $\langle n(x), m(x) \rangle < -\epsilon$ when $|x|$ is sufficiently large. By Assumption 3, for sufficiently large $|x|$ and for any $z \in C_{0,\zeta}(n(x), \epsilon)$ ($\zeta$ is defined in Assumption 3, $\iota$ is defined in Eq. (2), and $C_{\cdot,\cdot}(\cdot,\cdot)$ is defined in Eq. (3)),

$$\langle m(x), n(z) \rangle = \langle m(x), n(x) \rangle + \langle m(x), n(z) - n(x) \rangle \le -\epsilon + \epsilon/3.$$

Thus,

$$I_0(x,\gamma) \le -\frac{4\epsilon^2}{9}\, \iota\, s f^{s-1}(x)\, |\nabla f(x)|^2 \int_{C_{0,\zeta}(n(x),\epsilon)} |z|^2\,\mu_d(dz) = -c_1 f^{s-1}(x)\, |\nabla f(x)|^2 \le -c_2 f^{s-(2-m)/m}(x)$$

for some $c_1 > 0$ independent of $x$, where the integral over $C_{0,\zeta}(n(x), \epsilon)$ equals the integral over $C_{0,\zeta}(u, \epsilon)$ for any $u \in S^{d-1}$. So, there exist some $K > 0$ and some $c_3 > 0$ such that $V(x) > 1$ and

$$P_\gamma V(x) - V(x) \le -c_3 V^\alpha(x)$$

for $|x| > K$ and some $\alpha \in (0,1)$. Let $\tilde V(x) := V(x)\, I(|x| > K) + I(|x| \le K)$. So,

$$P_\gamma \tilde V(x) - \tilde V(x) \le -c_3 \tilde V^\alpha(x) + c_3\, I(|x| \le K).$$

By Proposition 5, Containment holds.

5 Conclusions and Discussion

For adaptive Metropolis algorithms (see similar results for adaptive Metropolis-within-Gibbs algorithms in [6]), we provide some conditions related only to properties of the target density and the proposal family. For targets with lighter-than-exponential tails, ergodicity of adaptive Metropolis algorithms is implied by the uniform local positivity of the family of proposal densities. For targets with exponential tails, ergodicity of adaptive Metropolis algorithms is implied by both the uniform local positivity and the uniform lower bound of the first moment of the family of proposals.

Recently, there have also been some results on this topic; see [24]. They show that if the target density is regular, strongly decreasing, and strongly lighter-than-exponentially tailed ($\limsup_{|x| \to \infty} \frac{\langle n(x), \nabla \log \pi(x) \rangle}{|x|^{\rho-1}} = -\infty$ for some $\rho > 1$), which is used to keep the convexity of the outside manifold contour of target densities, then the strong law of large numbers (SLLN) for symmetric random-walk-based adaptive Metropolis algorithms holds.

Compared with their results, although our conditions do not require that the target density is strongly lighter-than-exponentially tailed, one restriction on the proposal density is needed. [11] show that if, under Assumption 2, the target density is lighter-than-exponentially tailed, then random-walk-based Metropolis algorithms are geometrically ergodic. The technique in Proposition 9 can also be applied to MCMC. So, even if the target density is exponentially tailed, under some moment condition similar to Eq. (4), any random-walk-based Metropolis algorithm is still geometrically ergodic. Careful readers may notice that our symmetry assumption $q(x, y) = q(x - y) = q(y - x)$ is a little different from the assumption $q(x, y) = q(|x - y|)$ of [11].
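To make the setting of these conclusions concrete, here is a minimal sketch of an adaptive random-walk Metropolis sampler in one dimension. It is not the algorithm analysed in the paper. It assumes a Gaussian proposal whose log-scale is tuned by a Robbins-Monro step of size $n^{-0.6}$ (so the amount of adaptation diminishes) and is truncated to a compact interval (a simple device for keeping the parameter space bounded). The function name `adaptive_rwm`, the target acceptance rate 0.44 and the truncation bounds are all illustrative assumptions.

```python
import math
import random


def adaptive_rwm(log_target, n_iter=20000, x0=0.0, seed=0,
                 target_acc=0.44, log_sigma_bounds=(-5.0, 5.0)):
    """Adaptive random-walk Metropolis in 1-D (illustrative sketch only).

    log_target: function returning the log of an unnormalised target density.
    Returns the list of states and the final proposal scale.
    """
    rng = random.Random(seed)
    lo, hi = log_sigma_bounds
    x, log_sigma = x0, 0.0
    samples = []
    for n in range(1, n_iter + 1):
        # symmetric Gaussian proposal with the current scale
        y = x + math.exp(log_sigma) * rng.gauss(0.0, 1.0)
        log_alpha = min(0.0, log_target(y) - log_target(x))
        accepted = rng.random() < math.exp(log_alpha)
        if accepted:
            x = y
        # Robbins-Monro update; the step n^(-0.6) shrinks to zero as n grows
        log_sigma += ((1.0 if accepted else 0.0) - target_acc) / n ** 0.6
        # keep the adaptation parameter inside a compact interval
        log_sigma = min(hi, max(lo, log_sigma))
        samples.append(x)
    return samples, math.exp(log_sigma)


if __name__ == "__main__":
    # standard normal target, a lighter-than-exponentially tailed density
    draws, sigma = adaptive_rwm(lambda v: -0.5 * v * v)
    mean = sum(draws) / len(draws)
    print("mean:", mean, "final scale:", sigma)
```

The shrinking step size is one way to obtain Diminishing Adaptation; whether Containment holds still depends on the target tails and the proposal family, which is what the conditions above address.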

Acknowledgements

We thank Matti Vihola for helpful comments.

References

[1] C. Andrieu and E. Moulines. On the ergodicity properties of some adaptive Markov Chain Monte Carlo algorithms. Ann. Appl. Probab., 16(3):1462-1505, 2006.
[2] C. Andrieu and C.P. Robert. Controlled MCMC for optimal sampling. Preprint, 2001.
[3] Y.F. Atchadé and G. Fort. Limit Theorems for some adaptive MCMC algorithms with subgeometric kernels. Preprint.
[4] Y.F. Atchadé and J.S. Rosenthal. On Adaptive Markov Chain Monte Carlo Algorithms. Bernoulli, 11(5):815-828, 2005.
[5] Y. Bai. Simultaneous drift conditions on adaptive Markov Chain Monte Carlo algorithms. yabai/yabai2.pdf, 2009a.
[6] Y. Bai. Convergence of Adaptive Markov Chain Monte Carlo Methods. PhD thesis, Department of Statistics, University of Toronto.
[7] A.E. Brockwell and J.B. Kadane. Identification of regeneration times in MCMC simulation, with application to adaptive schemes. J. Comp. Graph. Stat., 14:436-458, 2005.
[8] G. Fort and E. Moulines. V-Subgeometric ergodicity for a Hastings-Metropolis algorithm. Statist. Prob. Lett., 49:401-410, 2000.
[9] W.R. Gilks, G.O. Roberts, and S.K. Sahu. Adaptive Markov chain Monte Carlo. J. Amer. Statist. Assoc., 93:1045-1054, 1998.
[10] H. Haario, E. Saksman, and J. Tamminen. An adaptive Metropolis algorithm. Bernoulli, 7:223-242, 2001.
[11] S.F. Jarner and E. Hansen. Geometric ergodicity of Metropolis algorithms. Stoch. Process. Appl., 85:341-361, 2000.
[12] S.F. Jarner and G.O. Roberts. Polynomial convergence rates of Markov Chains. Ann. Appl. Probab., 12:224-247, 2002.
[13] K.L. Mengersen and R.L. Tweedie. Rates of convergence of the Hastings and Metropolis algorithms. Ann. Statist., 24:101-121, 1996.
[14] S.P. Meyn and R.L. Tweedie. Markov Chains and Stochastic Stability. London: Springer-Verlag, 1993.
[15] H. Robbins and S. Monro. A stochastic approximation method. Ann. Math. Stat., 22:400-407, 1951.
[16] G.O. Roberts and J.S. Rosenthal. Optimal scaling for various Metropolis-Hastings algorithms. Stat. Sci., 16:351-367, 2001.
[17] G.O. Roberts and J.S. Rosenthal. Examples of Adaptive MCMC. J. Comp. Graph. Stat.
[18] G.O. Roberts and J.S. Rosenthal. Coupling and Ergodicity of adaptive Markov chain Monte Carlo algorithms. J. Appl. Prob., 44:458-475, 2007.
[19] G.O. Roberts and J.S. Rosenthal. Two convergence properties of hybrid samplers. Ann. Appl. Prob., 8:397-407, 1998.
[20] G.O. Roberts and R.L. Tweedie. Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms. Biometrika, 83:95-110, 1996.
[21] G.O. Roberts, A. Gelman, and W.R. Gilks. Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Prob., 7:110-120, 1997.
[22] G.O. Roberts, J.S. Rosenthal, and P.O. Schwartz. Convergence properties of perturbed Markov chains. J. Appl. Prob., 35:1-11, 1998.
[23] J.S. Rosenthal. Minorization Conditions and Convergence Rates for Markov Chain Monte Carlo. J. Amer. Stat. Assoc., 90:558-566, 1995.
[24] E. Saksman and M. Vihola. On the Ergodicity of the Adaptive Metropolis Algorithms on Unbounded Domains. Preprint.
[25] P. Tuominen and R.L. Tweedie. Subgeometric rates of convergence of f-ergodic Markov chains. Adv. Appl. Probab., 26(3):775-798, 1994.


On the Containment Condition for Adaptive Markov Chain Monte Carlo Algorithms Yan Bai, Gareth O. Roberts, and Jeffrey S. Rosenthal Last revised: July, 2010 Abstract This paper considers ergodicity properties