# Geometric Ergodicity and Hybrid Markov Chains

Save this PDF as:

Size: px
Start display at page:

Download "Geometric Ergodicity and Hybrid Markov Chains"

## Transcription

1 Geometric Ergodicity and Hybrid Markov Chains by Gareth O. Roberts* and Jeffrey S. Rosenthal** (August 1, 1996; revised, April 11, 1997.) Abstract. Various notions of geometric ergodicity for Markov chains on general state spaces exist. In this paper, we review certain relations and implications among them. We then apply these results to a collection of chains commonly used in Markov chain Monte Carlo simulation algorithms, the so-called hybrid chains. We prove that under certain conditions, a hybrid chain will inherit the geometric ergodicity of its constituent parts. Acknowledgements. We thank Charlie Geyer for a number of very useful comments regarding spectral theory and central limit theorems. We thank Alison Gibbs, Phil Reiss, Peter Rosenthal, and Richard Tweedie for very helpful discussions. We thank the referee and the editor for many excellent suggestions. 1. Introduction. A question of increasing importance in the Markov chain Monte Carlo literature (Gelfand and Smith, 1990; Smith and Roberts, 1993) is the issue of geometric ergodicity of Markov chains (Tierney, 1994, Section 3.2; Meyn and Tweedie, 1993, Chapters 15 and 16; Roberts and Tweedie, 1996). However, there are a number of different notions of the phrase geometrically ergodic, depending on perspective (total variation distance vs. * Statistical Laboratory, University of Cambridge, Cambridge CB2 1SB, U.K. Internet: ** Department of Statistics, University of Toronto, Toronto, Ontario, Canada M5S 3G3. Internet: 1

2 in L 2 ; with reference to a particular V -function; etc.). One goal of this paper is to review and clarify the relationship between such differing notions. We first discuss a general result (Proposition 1) giving the equivalence of a number of related ergodicity notions, involving total variation distance and V -uniform norms. Some of these equivalences follow from standard treatments of general state space Markov chains (Nummelin, 1984; Asmussen, 1987; Meyn and Tweedie, 1993), though they may not have previously been stated explicitly. Others of these equivalences are related to solidarity properties (Nummelin and Tweedie, 1978; Vere-Jones, 1962), regarding when a geometric rate of convergence can be chosen independently of the starting position. We then turn to L 2 theory, and discuss (Theorem 2) a number of equivalences of geometric ergodicity of reversible Markov chains, on L 1 and L 2. The essence of our analysis is the spectral theorem (e.g. Rudin, 1991; Reed and Simon, 1972; Conway, 1985) for bounded self-adjoint operators on a Hilbert space. Again, we believe that these equivalences are known, though they may not have been explicitly stated in this way. We further show that the conditions of Proposition 1 imply the conditions of Theorem 2. We are unable to establish the converse in complete generality, however we do note that the two sets of conditions are equivalent for most chains which arise in practice. We also argue (Corollary 3) that any of these equivalent conditions are sufficient to establish a functional central limit theorem for empirical averages of L 2 functions. We then turn our attention (Section 3) to the geometric ergodicity of various hybrid Markov chains which have been suggested in the literature (Tierney, 1994, Section 2.4; Chan and Geyer, 1994; Green, 1994). After a few preliminary observations, we prove (Theorem 6) that under suitable conditions, hybrid chains will inherit the geometric ergodicity of their constituent chains. This suggests the possibility of establishing the geometric ergodicity of large and complicated Markov chain algorithms, simply by verifying the geometric ergodicity of the simpler chains which give rise to them. We note that there are various alternatives to considering distributional convergence properties of Markov chains, such as considering the asymptotic variance of empirical estimators (cf. Geyer, 1992; Greenwood et al., 1995). But we do not pursue that here. 2

3 2. Equivalences of geometric ergodicity. We now turn our attention to results for geometric convergence. We consider a φ- irreducible, aperiodic Markov chain P (x, ) on a state space X, with stationary distribution π( ). [Recall that a chain is φ-irreducible if there is a non-zero measure φ on X, such that if φ(a) > 0, then P x (τ A < ) > 0 for all x X.] We shall sometimes assume that the underlying σ-algebra on X is countably generated; this is assumed in much of general state space Markov chain theory (cf. Jain and Jamison, 1967; Meyn and Tweedie, 1993, pp. 107 and 516), and is also necessary to ensure that the function x P (x, ) π( ) var is measurable (see Appendix). We recall that P acts to the right on functions and to the left on measures, so that µp (A) = P (x, A) µ(dx), P f(x) = f(y) P (x, dy). Call a subset S X hyper-small if π(s) > 0 and there is δ S > 0 and k N such that δ S 1 S (x), where d(p k (x, )) dπ is the density with respect to π of the absolutely d(p k (x, )) dπ continuous component of P k (x, ). A fundamental result of Jain and Jamison (1967), see also Orey (1971), states that on a countably generated measure space, every set of positive π-measure contains a hyper-small set. let L V Following Meyn and Tweedie (1993, p. 385), given a positive function V : X R, we f(x) V (x) be the vector space of all functions from X to R for which the norm f V = sup ; x X fdπ = 0}. We denote operator norms as usual, viz. X is finite. We let L V,0 = {f L V T L V f L V T f V ; T L f V =1 V,0 = sup f L V,0 f V =1 T f V. Finally, we write µ(f) for fdµ, and let Π be the operator defined by Π(f)(x) = π(f). X The following proposition borrows heavily from Meyn and Tweedie (1993) and Nummelin and Tweedie (1978). Note that the equivalence of (i) and (i ) below shows that the geometric rate ρ may be chosen independently of the starting point x X, though in general (i.e. for chains that are not uniformly ergodic ), the multiplicative constants C x will depend on x. Most of the remaining conditions concern the existence and properties of a geometric drift function V. 3

4 Proposition 1. The following are equivalent, for a φ-irreducible, aperiodic Markov chain P (x, ) on a countably generated state space X, with stationary distribution π( ). (i) The chain is π-a.e. geometrically ergodic, i.e. there is ρ < 1, and constants C x < for each x X, such that for π-a.e. x X, P n (x, ) π( ) var C x ρ n, n N, where µ( ) var = 1 2 sup µ(f) = sup µ(a) ; f 1 A X (i ) There are constants ρ x < 1 and C x < for each x X, such that for π-a.e. x X, P n (x, ) π( ) var C x ρ n x, n N ; (i ) There is a hyper-small set S X, and constants ρ S < 1 and C S <, such that S π(dy) π(s) P n (y, ) π( ) var C S ρ n S, n N ; (ii) There exists a π-a.e.-finite measurable function V : X [1, ], which may be taken to have π(v j ) < for any fixed j N, such that the chain is V -uniformly ergodic, i.e. for some ρ < 1 and some fixed constant C <, P n (x, ) π( ) V C V (x) ρ n, x X, n N, where µ( ) V = sup µ(f) ; f V (iii) There exists a π-a.e.-finite measurable function V : X [1, ], which may be taken to have π(v j ) < for any fixed j N, and a positive integer n, such that P n Π L V < 1 ; (iv) There exists a π-a.e.-finite measurable function V : X [1, ], which may be taken to have π(v j ) < for any fixed j N, and a positive integer n, such that P n L V,0 < 1. 4

5 Remarks. 1. By the spectral radius formula (e.g. Rudin, 1991, Theorem 10.13), r(t ) = inf n 1 T n 1/n, so that the displayed equations in (iii) and (iv) above can be restated as r(p Π) < 1 and r(p L ) < 1, respectively. V,0 2. We may use the same function V for each of (ii), (iii), and (iv) above. Proof. (i) = (i ) : Immediate. (i ) = (i ): We note that ρ x and C x may not be measurable as functions of x X. However, since X is countably generated, the function P n (x, ) π( ) var is measurable (see Appendix). Thus, we can define r x = lim inf P n (x, ) π( ) 1/n var n π( ) var r n/2 x and K x = sup P n (x, ) n as π-a.e.-finite measurable functions with π(r x < 1) = 1. Then we can find r < 1 and K < so that the set B = {x X ; r x r, K x K} has π(b) > 0. By Jain and Jamison (1967), there is a hyper-small subset S B. For x S, we have P n (x, ) π( ) var Kr n, so therefore π(dy) π(s) P n (y, ) π( ) var Kr n, as required. S (i ) = (i) : This is the content of Theorem 1 of Nummelin and Tweedie (1978), which generalizes the countable state space results of Vere-Jones (1962). (i) (ii) : Obviously (ii) implies (i). That (i) implies (ii) follows from Meyn and Tweedie (1993). Indeed, the existence of such a V follows from their Theorem (i) and Theorem 5.2.2, the finiteness of E π (V ) follows from their Theorem (iii) and Theorem (see also Meyn and Tweedie, 1994, Proposition 4.3 (i)), and the possibility of having E π (V j ) < follows from their Lemma (ii) (iii) : Clearly, (ii) is equivalent to the existence of specified V, ρ, and C, such that or sup f V sup x X (P n f)(x) π(f) V (x) C ρ n, n N, P n Π L V Cρ n, n N. Hence if n > Π L V follows. log C log(1/ρ), then P n Π L V < 1, proving (iii). For the converse, if P n = s < 1, then by sub-multiplicity of operator norms, P nk Π L V 5 sk, and (ii)

6 (iii) (iv) : f V 1, then Clearly (iii) implies (iv). For the converse, if P n L V,0 = ρ < 1, and (P nk Π)f V = P nk (f π(f)) V ρ k f π(f) V ρ k (1 + π(v )). Taking j = 1 so that π(v ) <, we see that this expression will be less than 1 for sufficiently large k, proving (iii). We now turn attention to results involving the spectral theory of bounded self-adjoint operators on Hilbert spaces. Given a Markov operator P with stationary distribution π, a number 1 p <, and a signed measure µ on X, define µ L p (π) by µ p L p (π) = X dµ dπ p dπ, µ << π µ + (X ) + µ (X ), p = 1, otherwise (note that if µ << π and p = 1, the two definitions coincide), set L p (π) = {µ ; µ Lp (π) < }, and set P Lp (π) = sup{ µp Lp (π) ; µ Lp (π) = 1}. It is well known (see e.g. Baxter and Rosenthal, 1995, Lemma 1) that we always have P Lp (π) 1. Recall that P is reversible with respect to π if π(dx)p (x, dy) = π(dy)p (y, dx) as measures on X X ; this is equivalent to P being a self-adjoint operator on the Hilbert space L 2 (π), with inner product given by µ, ν = X If P is reversible w.r.t. π, and dµ dπ = f, then d(µp ) dπ dµ dν dπ dπ dπ. = P f, as is easily checked by writing out the definitions and using reversibility. In particular, the action of P on signed measures µ L 2 (π) is precisely equivalent to the action of P on functions f with π(f 2 ) <. We shall apply the usual spectral theory (e.g. Rudin, 1991; Reed and Simon, 1972; Conway, 1985) to the operator P acting on measures in L 2 (π). 6

7 Theorem 2. Let P be a Markov operator on a state space X, reversible with respect to the probability measure π (so that P is self-adjoint on L 2 (π)). Then the following are equivalent, and furthermore are all implied by any of the equivalent conditions of Proposition 1. (i) P is L 2 (π)-geometrically ergodic, i.e. there is ρ < 1 such that for each probability measure µ L 2 (π), there is C µ < with µp n ( ) π( ) L2 (π) C µ ρ n, n N ; (ii) P has an L 2 (π) spectral gap, i.e. there is ρ < 1 such that for each signed measure µ L 2 (π) with µ(x ) = 0, µp ( ) L 2 (π) ρ µ L 2 (π) ; (iii) P L2 (π) is geometrically ergodic, i.e. there is ρ < 1 such that for each probability measure µ L 2 (π), there is C µ < such that µp n ( ) π( ) var C µ ρ n, n N ; (iv) P L2 (π) is geometrically ergodic, with C µ = 1 2 µ π L 2 (π), i.e. there is ρ < 1 such that for each probability measure µ L 2 (π), µp n ( ) π( ) var 1 2 µ π L 2 (π) ρ n, n N ; Remarks. We may use the same value of ρ in each of these four equivalent conditions. Also note that, by self-adjointness, in condition (ii) above it is equivalent to say that P π L 2 (π) ρ < 1 or that the spectral radius satisfies r(p ) ρ < 1 (cf. Conway, 1985, Proposition VIII.1.11(e)); we make use of this below. Finally, note that for probability measures µ, we have µ π 2 L 2 (π) = µ 2 L 2 (π) 1. 7

8 Proof. (ii) = (i), and (iv) = (iii) : Immediate. (i) = (iii), and (ii) = (iv) : since µ var = 1 2 µ L 1 (π) 1 2 µ L 2 (π). These follow from the Cauchy-Schwarz inequality, (iii) = (ii) : By the triangle inequality, condition (iii) implies that for any signed measure µ L 2 (π), there is C µ < such that µp n ( ) µ(x ) π( ) var C µ ρ n, n N. ( ) Let E( ) be the spectral measure corresponding to P acting on L 2 (π) (see e.g. Rudin, 1991; 1 Reed and Simon, 1972; Conway, 1985), so that µp k = λ k µe(dλ) for all k N and µ L 2 (π). Choose r (ρ, 1). 1 If (ii) does not hold, then either E(( 1, r)) or E((r, 1)) is not the zero operator. Assume (by replacing P by P 2 if necessary) that E((r, 1)) 0. Hence there is a non-zero signed measure ν in the range of E((r, 1)). Since P is self-adjoint, ν(x ) = π, ν = 0 by orthogonality of the spectral projections, hence there is Y X with ν(y) = νe ((r, 1)) (Y) > 0. Define the signed measure m by m(s) = νe(s)(y) for Borel S (r, 1). Let (r, 1) = A + A be the Hahn decomposition for m (see e.g. Royden, 1968, pp ). Since m(r, 1) = m(a + ) + m(a ) = ν(y) > 0, and m(a ) 0, therefore m(a + ) > 0. Set ω = νe(a + ) L 2 (π), so that ω << π and ω(y) = m(a + ) > 0. Again by orthogonality, ω(x ) = 0. On the other hand, using the functional calculus equation given in Conway (1985, p. 265, equation 1.11), and using that for dλ A + we 8

9 have ωe(dλ) = νe(dλ), we see that for k N, 2 ωp k ( ) var = ωp k ( ) L1 (π) = d(ωp k ) (x) dπ π(dx) X d(ωp k ) (x) dπ π(dx) Y Y d(ωp k ) (x)π(dx) dπ = π Y, ωp k = π Y, ω λ k E(dλ) A + = λ k π Y, ωe(dλ) = = A + A + [ λ k Y d(ωe(dλ)) ] (x) π(dx) dπ λ k (ωe(dλ)) (Y) = = A + A + λ k (νe(dλ)) (Y) λ k m(dλ) A + r k m(a + ), which contradicts ( ). Relation to Proposition 1: It remains to show that these conditions are implied by the equivalent conditions of Proposition 1. We show that condition (iii) is implied by Proposition 1 part (ii) (with j = 2). Indeed, for µ L 2 (π), we have (using the triangle inequality and the Cauchy-Schwartz inequality) that µp n ( ) π( ) var µp n ( ) π( ) V P n (x, ) π( ) V µ(dx) X CV (x)ρ n µ(dx) = Cρ n X X V (x) dµ ( (dµ ) ) 2 (x) π(dx) Cρn Eπ (V dπ 2 )E π. dπ 9

10 ( ( ) ) 2 Since E π (V 2 )E dµ π dπ <, condition (iii) follows. Remarks. 1 n 1. For most reversible chains which arise, the conditions of Proposition 1 and Theorem 2 are actually equivalent. For example, this is obviously true if the point masses δ x are all in L 2 (π) (which holds for an irreducible chain on a discrete measure space), or if P (x, ) L 2 (π) for all x. More generally, if the chain is of the form P (x, ) = (1 a x )δ x + a x ν x ( ) where ν x L 2 (π) and a x > 0, then the two sets of conditions are again seen to be equivalent, since if condition (iii) of Theorem 2 holds, then condition (i ) of Proposition 1 also holds. This includes most examples of Metropolis-Hastings algorithms (Metropolis et al., 1953; Hastings, 1970). However, we do not know if these conditions are equivalent in complete generality. 2. We note that a number of other mixing conditions for Markov chains are available, though we do not pursue them here. See for example Rosenblatt (1962, Sections V b and VIII d), and Carmona and Klein (1983). Finally, we consider the existence of central limit theorems for empirical averages n g(x i ) of a function g : X R evaluated at the realized values X 1, X 2,... of a i=1 Markov chain run. Such limit theorems are often considered in Markov chain Monte Carlo applications (cf. Tierney, 1994, Theorem 4). Geyer (1992, Section 2) and Chan and Geyer (1994) point out the usefulness of a certain Markov Functional Central Limit Theorem result (see e.g. Bhattacharya, 1982, Theorem 2.1; Kipnis and Varadhan, 1986, Corollary 1.5), which states the following. For a Markov chain P (, ) on X, reversible with respect to π( ), if π(g) = 0 and the quantity σ 2 g = λ 1 λ e g(dλ) is finite, where e g (S) = g(x) (E(S)g) (x)π(dx), then the continuous-time processes Y n (t) = X nt 1 n j=1 g(x j ) converge weakly to Brownian motion with variance σ 2 g. 10

11 On the other hand, if the Markov operator P has spectral radius ρ < 1, as in condition (ii) of Theorem 2, then it is easily seen (cf. Geyer, 1992, Section 3.5) that for π(g 2 ) <, σ 2 g 1 + ρ 1 ρ π(g2 ) <, so that the above Markov Functional Central Limit Theorem applies. We thus obtain: Corollary 3. Let the reversible Markov operator P satisfy any one equivalent condition of Proposition 1 or of Theorem 2 (e.g., let P be geometrically ergodic). g : X R with π(g) = 0 and π(g 2 ) <. Then σg 2 <, and as n, the processes Y n (t) = 1 nt n g(x j ) converge weakly (in the Skorohod topology, on any finite j=1 interval) to Brownian motion with variance σg. 2 In particular, setting t = 1, the random n variable 1 n g(x i ) converges in distribution to the normal distribution N(0, σg). 2 i=1 This corollary directly generalizes Chan and Geyer (1994, Theorem 2) for reversible chains; in particular, it shows that their condition π ( g 2+ɛ) < is then unnecessary. See Chan and Geyer (1994), and references therein, for some other approaches to central limit theorems for Markov chains, including those involving drift conditions (cf. Meyn and Tweedie, 1993, Theorem ). Remark. We note that the Markov Functional Central Limit Theorem used above applies only to a reversible Markov process with a spectral gap. For more general reversible Markov processes, the less restrictive results of Kipnis and Varadhan (1986) may be very useful. Furthermore, we note that corresponding results for non-reversible processes are an active area of study, for example in the contexts of interacting particle systems and in homogenization theory; see for example Derriennic and Lin (1996), and the recent unpublished works of Papanicolaou, Komorovskii, Carmona, Xu, Lamdin, Olla, Yau, and Sznitman. Let 11

12 3. Geometric ergodicity of hybrid samplers. Given a probability distribution π( ) on the state space X = X 1 X 2... X k, the usual deterministic-scan Gibbs sampler (DUGS) is the Markov kernel P = P 1 P 2... P k, where P i is the Markov kernel which replaces the i th coordinate by a draw from π(dx i x i ), leaving x i fixed (where x i = (x 1,..., x i 1, x i+1,..., x k )). This is a standard Markov chain Monte Carlo technique (Gelfand and Smith, 1990; Smith and Roberts, 1993; Tierney, 1994). The random-scan Gibbs sampler (RSGS), given by P = 1 k i P i, is sometimes used instead. Often the full conditionals π(dx i x i ) may be easily sampled, so that DUGS may be efficiently run on a computer. However, sometimes this is not feasible. Instead, one can define new operators Q i which are easily sampled, such that Q n i converges to P i as n. This is the method of one-variable-at-a-time Metropolis-Hastings or Metropolis within Gibbs (cf. Tierney, 1994, Section 2.4; Chan and Geyer, 1994, Theorem 1; Green, 1994). Such samplers prompt the following definition. Definition. Let C = (Q 1, Q 2,..., Q k ) be a collection of Markov kernels on a state space X. The random-scan hybrid sampler for C is the sampler defined by P C = 1 k (Q Q k ). A common example is the variable-at-a-time Metropolis-Hastings algorithms mentioned above. For another example, if the Q i are themselves Gibbs samplers, then the random-scan hybrid sampler would correspond to building a large Gibbs sampler out of smaller ones. Similarly, if the Q i are themselves Metropolis-Hastings algorithms (perhaps with singular proposals, cf. Tierney, 1995), then the random-scan hybrid sampler is also a Metropolis-Hastings algorithm but with a different proposal distribution. It is important to note that this usage of the word hybrid is distinct from the more specific notion of hybrid Monte Carlo algorithms as studied in the physics literature (cf. Duane et al., 1987; Neal, 1993, Section 5.2). We now discuss some observations about geometric convergence of these samplers. Our first result is a cautionary example. 12

13 Proposition 4. Let X = X 1 X 2, and let π be a probability measure on X. Let Q 1 and Q 2 be two Markov operators on X which fix the second and first coordinates (respectively). Assume that the usual RSGS for π is geometrically ergodic. Assume further that for each fixed y X 2 [resp. x X 1 ], we have that Q 1 [resp. Q 2 ] restricted to X 1 {y} [resp. {x} X 2 ] is geometrically ergodic, with stationary distribution equal to π(dx y) [resp. π(dy x)]. Then it is still possible that the hybrid sampler P C = 1 2 (Q 1 + Q 2 ) fails to be geometrically ergodic on X. Proof. A result of Roberts and Tweedie (1996) states that a Metropolis algorithm is not geometrically ergodic if the infimum of acceptance probabilities is 0. Therefore consider the following example. Let π be the bivariate density of two independent standard normal components. By independence, RSGS is geometrically ergodic (in fact uniformly ergodic with rate 1/2). Now let Q 1 be the following random walk Metropolis algorithm. Given (X n, Y n ) = (x, y), let Y n+1 = y and propose a candidate Z n+1 for X n+1 from a N(x, 1 + y 2 ) distribution, accepting with the probability: 1 π(z n+1 )/π(x n ). Similarly define Q 2 on fixed x by proposing a candidate for Y n+1 according to N(y, 1 + x 2 ). For fixed y, Q 1 is geometrically ergodic; and for fixed x, Q 2 is geometrically ergodic (see for example Roberts and Tweedie, 1996). However it is easy to verify that along any sequence of points (x i, y i ) such that x i and y i, the acceptance probability for each of Q 1 and Q 2 (and hence also for P C ) goes to 0. This provides our required counterexample. To continue, we need some notation. Let π be a probability measure on X = X 1... X k, and let P i be the operator which replaces the i th coordinate of x X by a draw from π(dx i x i ), leaving x i unchanged. Let M(X ) be the space of all σ-finite signed measures on X, and let M 0 = {µ M(X ) ; µ(x ) = 0}. Given a norm N on M(X ), and a Markov operator P on X, set P N = sup N (µp ). Say P is N -geometrically ergodic if πp = π N (µ)=1 and there is a positive integer m with P m M0 N < 1. (Note that, if N is total variation distance, then N -geometric ergodicity is equivalent to uniform ergodicity. On the other hand, if N (µ) = µ L2 (π), then N -geometric ergodicity is equivalent to condition (ii) of 13

14 Theorem 2 above.) Finally, call N π-contracting if it has the property that P N 1 for any Markov operator P satisfying πp = π. Thus, the L p (π) norms are all π-contracting (see e.g. Baxter and Rosenthal, 1995, Lemma 1), though the norm L V Proposition 5. RSGS is also N -geometrically ergodic. is usually not. If DUGS is N -geometrically ergodic, where N is π-contracting, then Proof. Suppose that (P 1... P k ) n M0 N < 1. Then, since ( ) nk 1 k (P P k ) = 1 k nk i 1,...,i nk P i1... P ink includes a term (P 1... P k ) n, it has N -norm < 1 by the triangle inequality. The result follows. We note that, in general, the converse to this last result is false. For example, even if RSGS is geometrically ergodic, DUGS may be periodic. We can now prove Theorem 6. Suppose DUGS or RSGS is N -geometrically ergodic, where N is π- contracting. Suppose further that there are operators Q i such that lim n Qn i P i N = 0, i.e. Q n i converges to P i in N -norm as n. Then the random-scan hybrid sampler P C = 1 k (Q Q k ) is N -geometrically ergodic. Remarks. 1. If furthermore N is such that νt var C ν T N, then we can conclude geometric convergence in the total variation norm (cf. Baxter and Rosenthal, 1995, Theorem 1 (b)). 2. Note that the condition lim n Qn i P i N = 0 implies a certain uniformity of convergence, uniform over choices of x i. 14

15 Proof. If DUGS is N -geometrically ergodic, there is a positive integer m with (P 1... P k ) m M0 N < 1. Also, by assumption, Q n i P i N 0, n, so (Q n 1 Q n 2... Q n k) m (P 1 P 2... P k ) m M0 N 0, n. Hence, there is a positive integer N such that (Q N 1 Q N 2... Q N k ) m M0 N = ρ < 1. But then expanding out (P C ) Nm, and recalling that Q N 1 where Q is any product of Q i s, we see that The N -geometric ergodicity follows. (P C ) Nm M0 N ( 1 ) Nm ( k Nm 1 + ρ ) < 1. k The result for RSGS follows similarly, from noting that ( 1 ) Nm Qi = 1 k k Nm Q N i 1... Q N i m + leftover i 1,...,i m = 1 k N ( 1 ) m Q N n i + leftover where leftover has N -norm 1 k N. Hence (P C ) m M0 N 1 (1 ρ)k N < 1. In particular, this result shows that if a hybrid sampler is constructed out of chains which are L 2 (π)-geometrically ergodic, and if RSGS is L 2 (π)-geometrically ergodic, then the resulting hybrid sampler will also be L 2 (π)-geometrically ergodic. It may be possible to use this result iteratively, to build up larger and larger geometric chains by combining smaller ones. 15

16 Appendix: Measurability of P n (x, ) π( ) var. In this appendix we prove that, on a countably generated state space, the quantity P n (x, ) π( ) var is measurable as a function of x. This fact was needed in the proof of Proposition 1 above. Lemma. Let ν( ) be a finite signed measure on (X, F). Suppose that F 0 is a set algebra generating F. Then for all S F, and for all ɛ > 0, there is S 0 F 0 with ν(s) ν(s 0 ) < ɛ. Proof. Take ν to be a measure (otherwise consider its positive and negative parts separately and use the triangle inequality). It follows from Doob (1994, Chapter IV, Section 3) that we can find S 0 F 0 with ν(s S 0 ) arbitrarily small. The result follows. Proposition. Let ν(x, ) be a bounded signed measure on (X, F), where F is countably generated by the sequence of sets {A 1, A 2,...}. function for all sets A. Then sup ν(x, A) is measurable. A F Proof. Assume that ν(, A) is a measurable The proof proceeds by demonstrating that the supremum can be written as a supremum of a countable collection of measurable functions. Therefore let F n = σ{a 1, A 2,..., A n }. Now fix x X. Let A be the set that achieves sup ν(x, A) (possible by the Hahn decomposition, e.g. Royden, 1968). Notice that F n is A n a set algebra generating F. So for arbitrary ɛ > 0, we can apply the previous lemma to find a set A n F n with ν(a ) ν(a) ν(a ) ɛ. Hence, sup A F ν(x, A) = sup n sup ν(x, A). A F n But sup A F n ν(x, A) is clearly a measurable function of x X. The result follows. claimed. Setting ν(x, A) = P n (x, A) π(a), we see that P n (x, ) π( ) var is measurable, as 16

17 REFERENCES S. Asmussen (1987), Applied probability and queues. John Wiley & Sons, New York. J.R. Baxter and J.S. Rosenthal (1995), Rates of convergence for everywhere-positive Markov chains. Stat. Prob. Lett. 22, R.N. Bhattacharya (1982), On the functional central limit theorem and the law of the iterated logarithm for Markov processes. Z. Wahrscheinlichkeitstheorie Verw. Gebiete 60, R. Carmona and A. Klein (1983), Exponential moments for hitting times of uniformly ergodic Markov processes. Ann. Prob. 11, K.S. Chan and C.J. Geyer (1994), Discussion paper. Ann. Stat. 22, J.B. Conway (1985), A course in functional analysis. Springer-Verlag, New York. Y. Derriennic and M. Lin (1996), Sur le théorème limite central de Kipnis et Varadhan pour les chaînes réversibles ou normales. Comptes Rendus de l Académie des Sciences, Série I, Tome 323, No. 9, J.L. Doob (1994), Measure theory. Springer-Verlag, New York. S. Duane, A.D. Kennedy, B.J. Pendleton, and D. Roweth (1987), Hybrid Monte Carlo. Phys. Lett. B 195, A.E. Gelfand and A.F.M. Smith (1990), Sampling based approaches to calculating marginal densities. J. Amer. Stat. Assoc. 85, C. Geyer (1992), Practical Markov chain Monte Carlo. Stat. Sci., Vol. 7, No. 4, P.J. Green (1994), Reversible jump MCMC computation and Bayesian model determination. Tech. Rep., Dept. Mathematics, U. of Bristol. P.E. Greenwood, I.W. McKeague, and W. Wefelmeyer (1995), Outperforming the Gibbs sampler empirical estimator for nearest neighbour random fields. Tech. Rep., Dept. Mathematics, U. of British Columbia. W.K. Hastings (1970), Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, N. Jain and B. Jamison (1967), Contributions to Doeblin s theory of Markov processes. Z. Wahrsch. Verw. Geb. 8,

18 C. Kipnis and S.R.S. Varadhan (1986), Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Comm. Math. Phys. 104, N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller (1953), Equations of state calculations by fast computing machines. J. Chem. Phys. 21, S.P. Meyn and R.L. Tweedie (1993), Markov chains and stochastic stability. Springer- Verlag, London. S.P. Meyn and R.L. Tweedie (1994), Computable bounds for convergence rates of Markov chains. Ann. Appl. Prob. 4, R.M. Neal (1993), Probabilistic inference using Markov chain Monte Carlo methods. Tech. Rep. CRG-TR-93-1, Dept. Comp. Sci., U. of Toronto. E. Nummelin (1984), General irreducible Markov chains and non-negative operators. Cambridge University Press. E. Nummelin and R.L. Tweedie (1978), Geometric ergodicity and R-positivity for general Markov chains. Ann. Prob. 6, S. Orey (1971), Lecture notes on limit theorems for Markov chain transition probabilities. Van Nostrand Reinhold, London. M. Reed and B. Simon (1972), Methods of modern mathematical physics. Volume I: Functional analysis. Academic Press, New York. G.O. Roberts and R.L. Tweedie (1996), Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms. Biometrika 83, M. Rosenblatt (1962), Random Processes. Oxford University Press, New York. J.S. Rosenthal (1995), Minorization conditions and convergence rates for Markov chain Monte Carlo. J. Amer. Stat. Assoc. 90, R.L. Royden (1968), Real analysis, 2 nd ed. MacMillan, New York. W. Rudin (1991), Functional Analysis, 2 nd ed. McGraw-Hill, New York. A.F.M. Smith and G.O. Roberts (1993), Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods (with discussion). J. Roy. Stat. Soc. Ser. B 55,

19 L. Tierney (1994), Markov chains for exploring posterior distributions (with discussion). Ann. Stat. 22, L. Tierney (1995), A note on Metropolis Hastings kernels for general state spaces. Tech. Rep. 606, School of Statistics, U. of Minnesota. R.L. Tweedie (1978), Geometric ergodicity and R-positivity for general Markov chains. Ann. Prob. 6, D. Vere-Jones (1962), Geometric ergodicity in denumerable Markov chains. Quart. J. Math. Oxford (2) 13,

### ON CONVERGENCE RATES OF GIBBS SAMPLERS FOR UNIFORM DISTRIBUTIONS

The Annals of Applied Probability 1998, Vol. 8, No. 4, 1291 1302 ON CONVERGENCE RATES OF GIBBS SAMPLERS FOR UNIFORM DISTRIBUTIONS By Gareth O. Roberts 1 and Jeffrey S. Rosenthal 2 University of Cambridge

### The simple slice sampler is a specialised type of MCMC auxiliary variable method (Swendsen and Wang, 1987; Edwards and Sokal, 1988; Besag and Green, 1

Recent progress on computable bounds and the simple slice sampler by Gareth O. Roberts* and Jerey S. Rosenthal** (May, 1999.) This paper discusses general quantitative bounds on the convergence rates of

### Analysis of the Gibbs sampler for a model. related to James-Stein estimators. Jeffrey S. Rosenthal*

Analysis of the Gibbs sampler for a model related to James-Stein estimators by Jeffrey S. Rosenthal* Department of Statistics University of Toronto Toronto, Ontario Canada M5S 1A1 Phone: 416 978-4594.

### MATHS 730 FC Lecture Notes March 5, Introduction

1 INTRODUCTION MATHS 730 FC Lecture Notes March 5, 2014 1 Introduction Definition. If A, B are sets and there exists a bijection A B, they have the same cardinality, which we write as A, #A. If there exists

### Lectures on Stochastic Stability. Sergey FOSS. Heriot-Watt University. Lecture 4. Coupling and Harris Processes

Lectures on Stochastic Stability Sergey FOSS Heriot-Watt University Lecture 4 Coupling and Harris Processes 1 A simple example Consider a Markov chain X n in a countable state space S with transition probabilities

### arxiv: v1 [stat.co] 28 Jan 2018

AIR MARKOV CHAIN MONTE CARLO arxiv:1801.09309v1 [stat.co] 28 Jan 2018 Cyril Chimisov, Krzysztof Latuszyński and Gareth O. Roberts Contents Department of Statistics University of Warwick Coventry CV4 7AL

### Markov Chains. Arnoldo Frigessi Bernd Heidergott November 4, 2015

Markov Chains Arnoldo Frigessi Bernd Heidergott November 4, 2015 1 Introduction Markov chains are stochastic models which play an important role in many applications in areas as diverse as biology, finance,

### Lecture 7. µ(x)f(x). When µ is a probability measure, we say µ is a stationary distribution.

Lecture 7 1 Stationary measures of a Markov chain We now study the long time behavior of a Markov Chain: in particular, the existence and uniqueness of stationary measures, and the convergence of the distribution

### Chapter 7. Markov chain background. 7.1 Finite state space

Chapter 7 Markov chain background A stochastic process is a family of random variables {X t } indexed by a varaible t which we will think of as time. Time can be discrete or continuous. We will only consider

### ADJOINTS, ABSOLUTE VALUES AND POLAR DECOMPOSITIONS

J. OPERATOR THEORY 44(2000), 243 254 c Copyright by Theta, 2000 ADJOINTS, ABSOLUTE VALUES AND POLAR DECOMPOSITIONS DOUGLAS BRIDGES, FRED RICHMAN and PETER SCHUSTER Communicated by William B. Arveson Abstract.

Lebesgue-Radon-Nikodym Theorem Matt Rosenzweig 1 Lebesgue-Radon-Nikodym Theorem In what follows, (, A) will denote a measurable space. We begin with a review of signed measures. 1.1 Signed Measures Definition

### The Monte Carlo Computation Error of Transition Probabilities

Zuse Institute Berlin Takustr. 7 495 Berlin Germany ADAM NILSN The Monte Carlo Computation rror of Transition Probabilities ZIB Report 6-37 (July 206) Zuse Institute Berlin Takustr. 7 D-495 Berlin Telefon:

### ELEC633: Graphical Models

ELEC633: Graphical Models Tahira isa Saleem Scribe from 7 October 2008 References: Casella and George Exploring the Gibbs sampler (1992) Chib and Greenberg Understanding the Metropolis-Hastings algorithm

### AMENABLE ACTIONS AND ALMOST INVARIANT SETS

AMENABLE ACTIONS AND ALMOST INVARIANT SETS ALEXANDER S. KECHRIS AND TODOR TSANKOV Abstract. In this paper, we study the connections between properties of the action of a countable group Γ on a countable

### THE VARIANCE CONSTANT FOR THE ACTUAL WAITING TIME OF THE PH/PH/1 QUEUE. By Mogens Bladt National University of Mexico

The Annals of Applied Probability 1996, Vol. 6, No. 3, 766 777 THE VARIANCE CONSTANT FOR THE ACTUAL WAITING TIME OF THE PH/PH/1 QUEUE By Mogens Bladt National University of Mexico In this paper we consider

### L p Functions. Given a measure space (X, µ) and a real number p [1, ), recall that the L p -norm of a measurable function f : X R is defined by

L p Functions Given a measure space (, µ) and a real number p [, ), recall that the L p -norm of a measurable function f : R is defined by f p = ( ) /p f p dµ Note that the L p -norm of a function f may

### Modern Discrete Probability Spectral Techniques

Modern Discrete Probability VI - Spectral Techniques Background Sébastien Roch UW Madison Mathematics December 22, 2014 1 Review 2 3 4 Mixing time I Theorem (Convergence to stationarity) Consider a finite

### Total Expected Discounted Reward MDPs: Existence of Optimal Policies

Total Expected Discounted Reward MDPs: Existence of Optimal Policies Eugene A. Feinberg Department of Applied Mathematics and Statistics State University of New York at Stony Brook Stony Brook, NY 11794-3600

### Block Gibbs sampling for Bayesian random effects models with improper priors: Convergence and regeneration

Block Gibbs sampling for Bayesian random effects models with improper priors: Convergence and regeneration Aixin Tan and James P. Hobert Department of Statistics, University of Florida May 4, 009 Abstract

### 2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due 9/5). Prove that every countable set A is measurable and µ(a) = 0. 2 (Bonus). Let A consist of points (x, y) such that either x or y is

### An introduction to adaptive MCMC

An introduction to adaptive MCMC Gareth Roberts MIRAW Day on Monte Carlo methods March 2011 Mainly joint work with Jeff Rosenthal. http://www2.warwick.ac.uk/fac/sci/statistics/crism/ Conferences and workshops

### Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm

Chapter 13 Radon Measures Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm (13.1) f = sup x X f(x). We want to identify

### 1 Introduction This work follows a paper by P. Shields [1] concerned with a problem of a relation between the entropy rate of a nite-valued stationary

Prexes and the Entropy Rate for Long-Range Sources Ioannis Kontoyiannis Information Systems Laboratory, Electrical Engineering, Stanford University. Yurii M. Suhov Statistical Laboratory, Pure Math. &

### THE SET OF RECURRENT POINTS OF A CONTINUOUS SELF-MAP ON AN INTERVAL AND STRONG CHAOS

J. Appl. Math. & Computing Vol. 4(2004), No. - 2, pp. 277-288 THE SET OF RECURRENT POINTS OF A CONTINUOUS SELF-MAP ON AN INTERVAL AND STRONG CHAOS LIDONG WANG, GONGFU LIAO, ZHENYAN CHU AND XIAODONG DUAN

### Proofs for Large Sample Properties of Generalized Method of Moments Estimators

Proofs for Large Sample Properties of Generalized Method of Moments Estimators Lars Peter Hansen University of Chicago March 8, 2012 1 Introduction Econometrica did not publish many of the proofs in my

### On the Set of Limit Points of Normed Sums of Geometrically Weighted I.I.D. Bounded Random Variables

On the Set of Limit Points of Normed Sums of Geometrically Weighted I.I.D. Bounded Random Variables Deli Li 1, Yongcheng Qi, and Andrew Rosalsky 3 1 Department of Mathematical Sciences, Lakehead University,

### Czechoslovak Mathematical Journal

Czechoslovak Mathematical Journal Oktay Duman; Cihan Orhan µ-statistically convergent function sequences Czechoslovak Mathematical Journal, Vol. 54 (2004), No. 2, 413 422 Persistent URL: http://dml.cz/dmlcz/127899

### is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications.

Stat 811 Lecture Notes The Wald Consistency Theorem Charles J. Geyer April 9, 01 1 Analyticity Assumptions Let { f θ : θ Θ } be a family of subprobability densities 1 with respect to a measure µ on a measurable

### THEOREMS, ETC., FOR MATH 515

THEOREMS, ETC., FOR MATH 515 Proposition 1 (=comment on page 17). If A is an algebra, then any finite union or finite intersection of sets in A is also in A. Proposition 2 (=Proposition 1.1). For every

### Generalized Resolvents and Harris Recurrence of Markov Processes

Generalized Resolvents and Harris Recurrence of Markov Processes Sean P. Meyn R.L. Tweedie March 1992 Abstract In this paper we consider a ϕ-irreducible continuous parameter Markov process Φ whose state

### Real Analysis Problems

Real Analysis Problems Cristian E. Gutiérrez September 14, 29 1 1 CONTINUITY 1 Continuity Problem 1.1 Let r n be the sequence of rational numbers and Prove that f(x) = 1. f is continuous on the irrationals.

### Entropy for zero-temperature limits of Gibbs-equilibrium states for countable-alphabet subshifts of finite type

Entropy for zero-temperature limits of Gibbs-equilibrium states for countable-alphabet subshifts of finite type I. D. Morris August 22, 2006 Abstract Let Σ A be a finitely primitive subshift of finite

### Simulated Annealing for Constrained Global Optimization

Monte Carlo Methods for Computation and Optimization Final Presentation Simulated Annealing for Constrained Global Optimization H. Edwin Romeijn & Robert L.Smith (1994) Presented by Ariel Schwartz Objective

### 1 Stochastic Dynamic Programming

1 Stochastic Dynamic Programming Formally, a stochastic dynamic program has the same components as a deterministic one; the only modification is to the state transition equation. When events in the future

### SELF-ADJOINTNESS OF SCHRÖDINGER-TYPE OPERATORS WITH SINGULAR POTENTIALS ON MANIFOLDS OF BOUNDED GEOMETRY

Electronic Journal of Differential Equations, Vol. 2003(2003), No.??, pp. 1 8. ISSN: 1072-6691. URL: http://ejde.math.swt.edu or http://ejde.math.unt.edu ftp ejde.math.swt.edu (login: ftp) SELF-ADJOINTNESS

### Fundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales

Fundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales Prakash Balachandran Department of Mathematics Duke University April 2, 2008 1 Review of Discrete-Time

### Adaptive Monte Carlo methods

Adaptive Monte Carlo methods Jean-Michel Marin Projet Select, INRIA Futurs, Université Paris-Sud joint with Randal Douc (École Polytechnique), Arnaud Guillin (Université de Marseille) and Christian Robert

### HAIYUN ZHOU, RAVI P. AGARWAL, YEOL JE CHO, AND YONG SOO KIM

Georgian Mathematical Journal Volume 9 (2002), Number 3, 591 600 NONEXPANSIVE MAPPINGS AND ITERATIVE METHODS IN UNIFORMLY CONVEX BANACH SPACES HAIYUN ZHOU, RAVI P. AGARWAL, YEOL JE CHO, AND YONG SOO KIM

### Markov Chains and Stochastic Sampling

Part I Markov Chains and Stochastic Sampling 1 Markov Chains and Random Walks on Graphs 1.1 Structure of Finite Markov Chains We shall only consider Markov chains with a finite, but usually very large,

### Tail inequalities for additive functionals and empirical processes of. Markov chains

Tail inequalities for additive functionals and empirical processes of geometrically ergodic Markov chains University of Warsaw Banff, June 2009 Geometric ergodicity Definition A Markov chain X = (X n )

### ON COMPLEMENTED SUBSPACES OF SUMS AND PRODUCTS OF BANACH SPACES

PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 124, Number 7, July 1996 ON COMPLEMENTED SUBSPACES OF SUMS AND PRODUCTS OF BANACH SPACES M. I. OSTROVSKII (Communicated by Dale Alspach) Abstract.

### arxiv:math/ v1 [math.fa] 26 Oct 1993

arxiv:math/9310217v1 [math.fa] 26 Oct 1993 ON COMPLEMENTED SUBSPACES OF SUMS AND PRODUCTS OF BANACH SPACES M.I.Ostrovskii Abstract. It is proved that there exist complemented subspaces of countable topological

### Notes on Integrable Functions and the Riesz Representation Theorem Math 8445, Winter 06, Professor J. Segert. f(x) = f + (x) + f (x).

References: Notes on Integrable Functions and the Riesz Representation Theorem Math 8445, Winter 06, Professor J. Segert Evans, Partial Differential Equations, Appendix 3 Reed and Simon, Functional Analysis,

### Convergence of Feller Processes

Chapter 15 Convergence of Feller Processes This chapter looks at the convergence of sequences of Feller processes to a iting process. Section 15.1 lays some ground work concerning weak convergence of processes

### 1. INTRODUCTION Propp and Wilson (1996,1998) described a protocol called \coupling from the past" (CFTP) for exact sampling from a distribution using

Ecient Use of Exact Samples by Duncan J. Murdoch* and Jerey S. Rosenthal** Abstract Propp and Wilson (1996,1998) described a protocol called coupling from the past (CFTP) for exact sampling from the steady-state

### Examples of Adaptive MCMC

Examples of Adaptive MCMC by Gareth O. Roberts * and Jeffrey S. Rosenthal ** (September 2006; revised January 2008.) Abstract. We investigate the use of adaptive MCMC algorithms to automatically tune the

### A note on some approximation theorems in measure theory

A note on some approximation theorems in measure theory S. Kesavan and M. T. Nair Department of Mathematics, Indian Institute of Technology, Madras, Chennai - 600 06 email: kesh@iitm.ac.in and mtnair@iitm.ac.in

### General Construction of Irreversible Kernel in Markov Chain Monte Carlo

General Construction of Irreversible Kernel in Markov Chain Monte Carlo Metropolis heat bath Suwa Todo Department of Applied Physics, The University of Tokyo Department of Physics, Boston University (from

### OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS

APPLICATIONES MATHEMATICAE 29,4 (22), pp. 387 398 Mariusz Michta (Zielona Góra) OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS Abstract. A martingale problem approach is used first to analyze

### MS&E 321 Spring Stochastic Systems June 1, 2013 Prof. Peter W. Glynn Page 1 of 10. x n+1 = f(x n ),

MS&E 321 Spring 12-13 Stochastic Systems June 1, 2013 Prof. Peter W. Glynn Page 1 of 10 Section 4: Steady-State Theory Contents 4.1 The Concept of Stochastic Equilibrium.......................... 1 4.2

### APPLICATIONS OF THE KANTOROVICH-RUBINSTEIN MAXIMUM PRINCIPLE IN THE THEORY OF MARKOV OPERATORS

12 th International Workshop for Young Mathematicians Probability Theory and Statistics Kraków, 20-26 September 2009 pp. 43-51 APPLICATIONS OF THE KANTOROVICH-RUBINSTEIN MAIMUM PRINCIPLE IN THE THEORY

### PDF hosted at the Radboud Repository of the Radboud University Nijmegen

PDF hosted at the Radboud Repository of the Radboud University Nijmegen The following full text is a preprint version which may differ from the publisher's version. For additional information about this

### The Kadec-Pe lczynski theorem in L p, 1 p < 2

The Kadec-Pe lczynski theorem in L p, 1 p < 2 I. Berkes and R. Tichy Abstract By a classical result of Kadec and Pe lczynski (1962), every normalized weakly null sequence in L p, p > 2 contains a subsequence

### The Polya-Gamma Gibbs Sampler for Bayesian. Logistic Regression is Uniformly Ergodic

he Polya-Gamma Gibbs Sampler for Bayesian Logistic Regression is Uniformly Ergodic Hee Min Choi and James P. Hobert Department of Statistics University of Florida August 013 Abstract One of the most widely

### Positive definite functions, completely monotone functions, the Bernstein-Widder theorem, and Schoenberg s theorem

Positive definite functions, completely monotone functions, the Bernstein-Widder theorem, and Schoenberg s theorem Jordan Bell jordan.bell@gmail.com Department of Mathematics, University of Toronto June

### Measure and integration

Chapter 5 Measure and integration In calculus you have learned how to calculate the size of different kinds of sets: the length of a curve, the area of a region or a surface, the volume or mass of a solid.

### PROBLEMS. (b) (Polarization Identity) Show that in any inner product space

1 Professor Carl Cowen Math 54600 Fall 09 PROBLEMS 1. (Geometry in Inner Product Spaces) (a) (Parallelogram Law) Show that in any inner product space x + y 2 + x y 2 = 2( x 2 + y 2 ). (b) (Polarization

### Math 209B Homework 2

Math 29B Homework 2 Edward Burkard Note: All vector spaces are over the field F = R or C 4.6. Two Compactness Theorems. 4. Point Set Topology Exercise 6 The product of countably many sequentally compact

### Banach Spaces II: Elementary Banach Space Theory

BS II c Gabriel Nagy Banach Spaces II: Elementary Banach Space Theory Notes from the Functional Analysis Course (Fall 07 - Spring 08) In this section we introduce Banach spaces and examine some of their

### MAGIC010 Ergodic Theory Lecture Entropy

7. Entropy 7. Introduction A natural question in mathematics is the so-called isomorphism problem : when are two mathematical objects of the same class the same (in some appropriately defined sense of

### Semicontinuous functions and convexity

Semicontinuous functions and convexity Jordan Bell jordan.bell@gmail.com Department of Mathematics, University of Toronto April 3, 2014 1 Lattices If (A, ) is a partially ordered set and S is a subset

### X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

14:17 11/16/2 TOPIC. Convergence in distribution and related notions. This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence

### A METRIC SPACE OF A.H. STONE AND AN EXAMPLE CONCERNING σ-minimal BASES

PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 126, Number 7, July 1998, Pages 2191 2196 S 0002-9939(98)04785-6 A METRIC SPACE OF A.H. STONE AND AN EXAMPLE CONCERNING σ-minimal BASES HAROLD R.

### Kazuhiko Kakamu Department of Economics Finance, Institute for Advanced Studies. Abstract

Bayesian Estimation of A Distance Functional Weight Matrix Model Kazuhiko Kakamu Department of Economics Finance, Institute for Advanced Studies Abstract This paper considers the distance functional weight

### Sequences. Chapter 3. n + 1 3n + 2 sin n n. 3. lim (ln(n + 1) ln n) 1. lim. 2. lim. 4. lim (1 + n)1/n. Answers: 1. 1/3; 2. 0; 3. 0; 4. 1.

Chapter 3 Sequences Both the main elements of calculus (differentiation and integration) require the notion of a limit. Sequences will play a central role when we work with limits. Definition 3.. A Sequence

### AN IMPROVED MENSHOV-RADEMACHER THEOREM

PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 124, Number 3, March 1996 AN IMPROVED MENSHOV-RADEMACHER THEOREM FERENC MÓRICZ AND KÁROLY TANDORI (Communicated by J. Marshall Ash) Abstract. We

### KERNEL ESTIMATORS OF ASYMPTOTIC VARIANCE FOR ADAPTIVE MARKOV CHAIN MONTE CARLO. By Yves F. Atchadé University of Michigan

Submitted to the Annals of Statistics arxiv: math.pr/0911.1164 KERNEL ESTIMATORS OF ASYMPTOTIC VARIANCE FOR ADAPTIVE MARKOV CHAIN MONTE CARLO By Yves F. Atchadé University of Michigan We study the asymptotic

### REVIEW OF ESSENTIAL MATH 346 TOPICS

REVIEW OF ESSENTIAL MATH 346 TOPICS 1. AXIOMATIC STRUCTURE OF R Doğan Çömez The real number system is a complete ordered field, i.e., it is a set R which is endowed with addition and multiplication operations

### Lecture Notes in Advanced Calculus 1 (80315) Raz Kupferman Institute of Mathematics The Hebrew University

Lecture Notes in Advanced Calculus 1 (80315) Raz Kupferman Institute of Mathematics The Hebrew University February 7, 2007 2 Contents 1 Metric Spaces 1 1.1 Basic definitions...........................

### An Introduction to Entropy and Subshifts of. Finite Type

An Introduction to Entropy and Subshifts of Finite Type Abby Pekoske Department of Mathematics Oregon State University pekoskea@math.oregonstate.edu August 4, 2015 Abstract This work gives an overview

### Model reversibility of a two dimensional reflecting random walk and its application to queueing network

arxiv:1312.2746v2 [math.pr] 11 Dec 2013 Model reversibility of a two dimensional reflecting random walk and its application to queueing network Masahiro Kobayashi, Masakiyo Miyazawa and Hiroshi Shimizu

### PREDICTABLE REPRESENTATION PROPERTY OF SOME HILBERTIAN MARTINGALES. 1. Introduction.

Acta Math. Univ. Comenianae Vol. LXXVII, 1(28), pp. 123 128 123 PREDICTABLE REPRESENTATION PROPERTY OF SOME HILBERTIAN MARTINGALES M. EL KADIRI Abstract. We prove as for the real case that a martingale

### Examples of Dual Spaces from Measure Theory

Chapter 9 Examples of Dual Spaces from Measure Theory We have seen that L (, A, µ) is a Banach space for any measure space (, A, µ). We will extend that concept in the following section to identify an

### New Fréchet features for random distributions and associated sensitivity indices

New Fréchet features for random distributions and associated sensitivity indices Jean-Claude Fort a and Thierry Klein b March 3, 215 Abstract In this article we define new Fréchet features for random cumulative

### Lecture 20: Reversible Processes and Queues

Lecture 20: Reversible Processes and Queues 1 Examples of reversible processes 11 Birth-death processes We define two non-negative sequences birth and death rates denoted by {λ n : n N 0 } and {µ n : n

### A n (T )f = 1 n 1. T i f. i=0

REVISTA DE LA UNIÓN MATEMÁTICA ARGENTINA Volumen 46, Número 1, 2005, Páginas 47 51 A NOTE ON THE L 1 -MEAN ERGODIC THEOREM Abstract. Let T be a positive contraction on L 1 of a σ-finite measure space.

### The Caratheodory Construction of Measures

Chapter 5 The Caratheodory Construction of Measures Recall how our construction of Lebesgue measure in Chapter 2 proceeded from an initial notion of the size of a very restricted class of subsets of R,

### (x k ) sequence in F, lim x k = x x F. If F : R n R is a function, level sets and sublevel sets of F are any sets of the form (respectively);

STABILITY OF EQUILIBRIA AND LIAPUNOV FUNCTIONS. By topological properties in general we mean qualitative geometric properties (of subsets of R n or of functions in R n ), that is, those that don t depend

### 6. Duals of L p spaces

6 Duals of L p spaces This section deals with the problem if identifying the duals of L p spaces, p [1, ) There are essentially two cases of this problem: (i) p = 1; (ii) 1 < p < The major difference between

### The Gaussian free field, Gibbs measures and NLS on planar domains

The Gaussian free field, Gibbs measures and on planar domains N. Burq, joint with L. Thomann (Nantes) and N. Tzvetkov (Cergy) Université Paris Sud, Laboratoire de Mathématiques d Orsay, CNRS UMR 8628 LAGA,

### CPSC 540: Machine Learning

CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

### AN ELEMENTARY PROOF OF THE TRIANGLE INEQUALITY FOR THE WASSERSTEIN METRIC

PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 136, Number 1, January 2008, Pages 333 339 S 0002-9939(07)09020- Article electronically published on September 27, 2007 AN ELEMENTARY PROOF OF THE

### Lecture 4 Lebesgue spaces and inequalities

Lecture 4: Lebesgue spaces and inequalities 1 of 10 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 4 Lebesgue spaces and inequalities Lebesgue spaces We have seen how

### Soo Hak Sung and Andrei I. Volodin

Bull Korean Math Soc 38 (200), No 4, pp 763 772 ON CONVERGENCE OF SERIES OF INDEENDENT RANDOM VARIABLES Soo Hak Sung and Andrei I Volodin Abstract The rate of convergence for an almost surely convergent

### The Essential Equivalence of Pairwise and Mutual Conditional Independence

The Essential Equivalence of Pairwise and Mutual Conditional Independence Peter J. Hammond and Yeneng Sun Probability Theory and Related Fields, forthcoming Abstract For a large collection of random variables,

### A COMMENT ON FREE GROUP FACTORS

A COMMENT ON FREE GROUP FACTORS NARUTAKA OZAWA Abstract. Let M be a finite von Neumann algebra acting on the standard Hilbert space L 2 (M). We look at the space of those bounded operators on L 2 (M) that

### ( f ^ M _ M 0 )dµ (5.1)

47 5. LEBESGUE INTEGRAL: GENERAL CASE Although the Lebesgue integral defined in the previous chapter is in many ways much better behaved than the Riemann integral, it shares its restriction to bounded

### Exchangeability. Peter Orbanz. Columbia University

Exchangeability Peter Orbanz Columbia University PARAMETERS AND PATTERNS Parameters P(X θ) = Probability[data pattern] 3 2 1 0 1 2 3 5 0 5 Inference idea data = underlying pattern + independent noise Peter

### Max-min (σ-)additive representation of monotone measures

Noname manuscript No. (will be inserted by the editor) Max-min (σ-)additive representation of monotone measures Martin Brüning and Dieter Denneberg FB 3 Universität Bremen, D-28334 Bremen, Germany e-mail:

### AN EFFICIENT ESTIMATOR FOR GIBBS RANDOM FIELDS

K Y B E R N E T I K A V O L U M E 5 0 ( 2 0 1 4, N U M B E R 6, P A G E S 8 8 3 8 9 5 AN EFFICIENT ESTIMATOR FOR GIBBS RANDOM FIELDS Martin Janžura An efficient estimator for the expectation R f dp is

### S. DUTTA AND T. S. S. R. K. RAO

ON WEAK -EXTREME POINTS IN BANACH SPACES S. DUTTA AND T. S. S. R. K. RAO Abstract. We study the extreme points of the unit ball of a Banach space that remain extreme when considered, under canonical embedding,

### Overview of Markovian maps

Overview of Markovian maps Mike Boyle University of Maryland BIRS October 2007 *these slides are online, via Google: Mike Boyle Outline of the talk I. Subshift background 1. Subshifts 2. (Sliding) block

### An introduction to some aspects of functional analysis

An introduction to some aspects of functional analysis Stephen Semmes Rice University Abstract These informal notes deal with some very basic objects in functional analysis, including norms and seminorms

### On the Optimal Scaling of the Modified Metropolis-Hastings algorithm

On the Optimal Scaling of the Modified Metropolis-Hastings algorithm K. M. Zuev & J. L. Beck Division of Engineering and Applied Science California Institute of Technology, MC 4-44, Pasadena, CA 925, USA

### On Markov Chain Monte Carlo

MCMC 0 On Markov Chain Monte Carlo Yevgeniy Kovchegov Oregon State University MCMC 1 Metropolis-Hastings algorithm. Goal: simulating an Ω-valued random variable distributed according to a given probability

### Fall TMA4145 Linear Methods. Solutions to exercise set 9. 1 Let X be a Hilbert space and T a bounded linear operator on X.

TMA445 Linear Methods Fall 26 Norwegian University of Science and Technology Department of Mathematical Sciences Solutions to exercise set 9 Let X be a Hilbert space and T a bounded linear operator on

### A note on a construction of J. F. Feinstein

STUDIA MATHEMATICA 169 (1) (2005) A note on a construction of J. F. Feinstein by M. J. Heath (Nottingham) Abstract. In [6] J. F. Feinstein constructed a compact plane set X such that R(X), the uniform

### 3 Measurable Functions

3 Measurable Functions Notation A pair (X, F) where F is a σ-field of subsets of X is a measurable space. If µ is a measure on F then (X, F, µ) is a measure space. If µ(x) < then (X, F, µ) is a probability