The Monte Carlo Computation Error of Transition Probabilities

Zuse Institute Berlin
Takustr. 7
14195 Berlin
Germany

ADAM NIELSEN

The Monte Carlo Computation Error of Transition Probabilities

ZIB Report 16-37 (July 2016)

Zuse Institute Berlin
Takustr. 7
D-14195 Berlin

bibliothek@zib.de

The Monte Carlo Computation Error of Transition Probabilities

Adam Nielsen

Abstract

In many applications one is interested in computing transition probabilities of a Markov chain. This can be achieved by using Monte Carlo methods with local or global sampling points. In this article, we analyze the error, measured by the difference in the L² norm, between the true transition probabilities and the approximation achieved through a Monte Carlo method. We give a formula for the error for Markov chains with locally computed sampling points. Further, in the case of reversible Markov chains, we deduce a formula for the error when sampling points are computed globally. We will see that in both cases the error itself can be approximated with Monte Carlo methods. As a consequence of the result, we derive surprising properties of reversible Markov chains.

1 Introduction

In many applications, one is interested in approximating the term

P[X_1 ∈ B | X_0 ∈ A]

for a Markov chain (X_n) with stationary measure µ. A solid method to do this is a Monte Carlo method. In this article, we give the exact error between this term and its Monte Carlo approximation. It will turn out that when we approximate the term with N trajectories whose starting points are distributed locally in A, then the squared error in the L² norm is exactly given by

(1/N) ( E_{µ_A}[ P_x[X̃_1 ∈ B]² ] − E_{µ_A}[ P_x[X̃_1 ∈ B] ]² ),

where µ_A(B) := µ(A ∩ B)/µ(A) is the stationary distribution restricted to A and (X̃_n) is the reversed Markov chain. If the Markov chain (X_n) is reversible and we approximate the term with N trajectories with global starting points, then the squared error in the L² norm is given by

(1/N) ( P[X_2 ∈ A, X_1 ∈ B | X_0 ∈ A] / P[X_0 ∈ A] − P[X_1 ∈ B | X_0 ∈ A]² ).

We even give the exact error for a more general term which is of interest whenever one wants to compute a Markov State Model of a Markov operator based on an arbitrary function space, which has applications in computational drug design [10, 11]. Further, we derive from the result some surprising properties of reversible Markov chains. For example, we will show that reversible Markov chains rather return to a set than remain in a set, i.e.

P[X_2 ∈ A | X_0 ∈ A] ≥ P[X_0 ∈ A]

for any measurable set A.
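To fix ideas before the formal setup, the following minimal sketch (not from the paper) shows the plain Monte Carlo estimator for P[X_1 ∈ B | X_0 ∈ A] on a hypothetical two-state chain; the matrix P, the sets A, B and all parameter values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state chain (an assumption, not from the paper):
# transition matrix P and its stationary measure mu, solving mu P = mu.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
mu = np.array([2/3, 1/3])
A, B = [0], [1]

# Draw N starting points from mu restricted to A, advance each one step
# of the chain, and count how many land in B.
N = 100_000
starts = rng.choice(A, size=N, p=mu[A] / mu[A].sum())
ends = np.array([rng.choice(2, p=P[x]) for x in starts])
print("Monte Carlo estimate:", np.isin(ends, B).mean())
print("exact value         :", P[0, 1])

The article quantifies exactly how fast such an estimate converges, depending on whether the starting points are drawn locally (as here) or globally from µ.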

2 Basics

We denote by (E, Σ, µ) and (Ω, A, P) probability spaces on given sets E and Ω. We call a map p: E × Σ → [0, 1] a Markov kernel if A ↦ p(x, A) is almost surely a probability measure on Σ and x ↦ p(x, A) is measurable for all A ∈ Σ. We denote by (X_n)_{n ∈ ℕ}, X_n: Ω → E, a Markov chain on a measurable state space [8, 6], i.e. there exists a Markov kernel p with

P[X_0 ∈ A_0, X_1 ∈ A_1, ..., X_n ∈ A_n] = ∫_{A_0} ... ∫_{A_{n−1}} p(y_{n−1}, A_n) p(y_{n−2}, dy_{n−1}) ... p(y_0, dy_1) µ(dy_0)

for any n ∈ ℕ and A_0, ..., A_n ∈ Σ. In this article, we only need this formula for n ≤ 2, where it simplifies to

P[X_0 ∈ A_0, X_1 ∈ A_1, X_2 ∈ A_2] = ∫_{A_0} ∫_{A_1} p(y, A_2) p(x, dy) µ(dx)

for any A_0, A_1, A_2 ∈ Σ. We assume throughout this paper that µ is a stationary measure, i.e.

P[X_n ∈ A] = µ(A)

for any n ∈ ℕ, A ∈ Σ. We call a Markov chain (X_n) reversible if

∫_A p(x, B) µ(dx) = ∫_B p(x, A) µ(dx)

holds for any A, B ∈ Σ. This is equivalent to

P[X_0 ∈ A, X_1 ∈ B] = P[X_0 ∈ B, X_1 ∈ A]

for any A, B ∈ Σ. In particular, it implies that µ is a stationary measure. We denote by L¹(µ) the space of all µ-integrable functions [1, Page 99]. It is known [3, Theorem 2.1] that there exists a Markov operator P: L¹(µ) → L¹(µ) with ‖Pf‖_{L¹} = ‖f‖_{L¹} and Pf ≥ 0 for all non-negative f ∈ L¹(µ) which is associated to the Markov chain, i.e.

∫ p(x, A) f(x) µ(dx) = ∫_A (Pf)(x) µ(dx)

for all A ∈ Σ, f ∈ L¹(µ). Since µ is a stationary measure, we have P(L²(µ)) ⊆ L²(µ) and we can restrict P onto L²(µ) [2, Lemma 1]. It is known [3] that a reversed Markov chain (X̃_n) exists with

P[X_1 ∈ B, X_0 ∈ A] = P[X̃_1 ∈ A, X̃_0 ∈ B]

and that the Markov operator can be evaluated point-wise as

Pf(x) = E_x[f(X̃_1)].    (1)
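On a finite state space the kernel, the reversed kernel and the Markov operator are plain matrices, so the defining identity of P can be checked directly. A minimal sketch, reusing the hypothetical two-state chain from above; the reversed kernel is p̃(i, j) = µ(j) p(j, i)/µ(i), and since this particular chain happens to satisfy detailed balance, the reversed kernel coincides with P.

import numpy as np

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
mu = np.array([2/3, 1/3])              # stationary: mu @ P == mu

# Reversed kernel p_tilde(i, j) = mu(j) p(j, i) / mu(i); for this chain
# detailed balance holds, so P_tilde equals P.
P_tilde = P.T * mu[None, :] / mu[:, None]

# Markov operator evaluated point-wise, Equation (1): (Pf)(x) = E_x[f(X~_1)].
f = np.array([1.0, 3.0])               # an arbitrary f in L^2(mu)
Pf = P_tilde @ f

# Defining identity: the integral of p(x, A) f(x) mu(dx) over the whole
# space equals the integral of (Pf)(x) mu(dx) over A.
A = [0]
lhs = (P[:, A].sum(axis=1) * f * mu).sum()
rhs = (Pf[A] * mu[A]).sum()
print(lhs, rhs)                        # both 0.8 for this example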

In the case where (X_n) is reversible, we have that the Markov operator P: L²(µ) → L²(µ) is self-adjoint [4, Proposition 1.1] and that X̃_n = X_n, thus the operator can be evaluated point-wise by

Pf(x) = E_x[f(X_1)].    (2)

Throughout the paper we write for any probability measure µ the expectation as

E_µ[f] = ∫ f(x) µ(dx)

and use the shortcut E := E_P. We denote by

⟨f, g⟩_µ = ∫ f(x) g(x) µ(dx)

the scalar product on L²(µ). In this article, we are interested in the quantity

C := ⟨f, Pg⟩_µ / ⟨f, 1⟩_µ,    (3)

where 1 denotes the constant function that is everywhere one and f, g ∈ L²(µ) with ⟨f, 1⟩_µ ≠ 0. For the special case where f(x) = 1_A(x) and g(x) = 1_B(x), we obtain

C = (1/µ(A)) ∫_B p(x, A) µ(dx) = P[X_1 ∈ B | X_0 ∈ A].

There are many applications that involve an approximation of C for different types of functions f, g; they can be found in [10].

3 Computation of the Local Error

To compute the error locally, we rewrite

C = ⟨f, Pg⟩_µ / ⟨f, 1⟩_µ = ∫ Pg(x) µ_i(dx)   with   µ_i(A) = (1/⟨f, 1⟩_µ) ∫_A f(x) µ(dx).

This term can be approximated by Monte Carlo methods [5]. To do so, we need random variables Y_1, Y_2, ..., Y_N: Ω → E distributed according to µ_i. Then we can approximate the term C by

Ĉ := (1/N) Σ_{k=1}^{N} Pg(Y_k).

The error can be stated as

‖C − Ĉ‖²_{L²(P)} = VAR[Pg(Y_1)] / N.

This follows from E[Ĉ] = C and

‖C − Ĉ‖²_{L²(P)} = VAR[Ĉ] = (1/N²) Σ_{i=1}^{N} VAR[Pg(Y_i)] = VAR[Pg(Y_1)] / N.

In this case the variance is simply given as

VAR[Pg(Y_1)] = E[Pg(Y_1)²] − E[Pg(Y_1)]² = E_{µ_i}[ E_x[g(X̃_1)]² ] − E_{µ_i}[ E_x[g(X̃_1)] ]².

In the case where the Markov chain (X_n) is reversible, the variance simplifies to

VAR[Pg(Y_1)] = E_{µ_i}[ E_x[g(X_1)]² ] − E_{µ_i}[ E_x[g(X_1)] ]².

If further g = 1_A, we obtain

VAR[Pg(Y_1)] = E_{µ_i}[ P_x[X_1 ∈ A]² ] − E_{µ_i}[ P_x[X_1 ∈ A] ]².

3.1 Example

Consider the solution (Y_t)_{t ≥ 0} of the stochastic differential equation

dY_t = −∇V(Y_t) dt + σ dB_t

with σ > 0 and V(x) = (x − 2)²(x + 2)². The potential V and a realization of (Y_t) are shown in Figure 1. The solution (Y_t) is known to be reversible, see [7, Proposition 4.5]. For any value τ > 0 the family (X_n)_{n ∈ ℕ} with X_n := Y_{nτ} is a reversible Markov chain. Let us partition [−3, 3] into twenty equidistant sets (A_l)_{l=1,...,20}. We compute a matrix M ∈ ℝ^{20×20} with

M(i, j) = E_{µ_i}[ P_x[X_1 ∈ A_j]² ] − E_{µ_i}[ P_x[X_1 ∈ A_j] ]²

with µ_i(B) = µ(A_i ∩ B)/µ(A_i). In each set A_i we sample points x^i_1, ..., x^i_N with the Metropolis Monte Carlo method, distributed according to µ_i. For each point x^i_j we sample M trajectories starting in x^i_j with endpoints y^i_{k,j} for k = 1, ..., M. We then approximate

E_{µ_i}[ P_x[X_1 ∈ A_j]² ] ≈ (1/N) Σ_{l=1}^{N} ( (1/M) Σ_{k=1}^{M} 1_{A_j}(y^i_{k,l}) )²

and

E_{µ_i}[ P_x[X_1 ∈ A_j] ]² ≈ ( (1/N) Σ_{l=1}^{N} (1/M) Σ_{k=1}^{M} 1_{A_j}(y^i_{k,l}) )².

The local sampled points and the matrix M are shown in Figure 2. It shows that in order to compute the transition probability from 0 to 2 or from 0 to −2, many trajectories are needed.

Figure 1: Potential (left) and trajectory (right).

Figure 2: Local starting points (left) and matrix M (right).
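The following sketch illustrates the construction; it is a simplified stand-in for the computation behind Figure 2, not the author's code. The parameter values (σ, τ, the time step, N, M) are assumptions, the invariant density exp(−2V(x)/σ²) of the SDE is used as the Metropolis target within each cell, and the SDE is discretized with a plain Euler-Maruyama scheme.

import numpy as np

rng = np.random.default_rng(1)

sigma, tau, dt = 1.0, 0.5, 1e-3            # assumed parameters
V     = lambda x: (x - 2)**2 * (x + 2)**2
gradV = lambda x: 2*(x - 2)*(x + 2)**2 + 2*(x - 2)**2*(x + 2)

edges = np.linspace(-3, 3, 21)             # twenty equidistant cells A_1, ..., A_20

def sample_mu_i(i, n, burn=200):
    # Metropolis chain targeting mu_i, i.e. the invariant density
    # exp(-2 V(x) / sigma^2) of the SDE restricted to cell A_i.
    lo, hi = edges[i], edges[i + 1]
    x, out = 0.5*(lo + hi), []
    for k in range(burn + n):
        y = rng.uniform(lo, hi)            # uniform proposal within the cell
        if rng.random() < np.exp(-2.0*(V(y) - V(x))/sigma**2):
            x = y
        if k >= burn:
            out.append(x)
    return np.array(out)

def propagate(x):
    # One lag step X_1 = Y_tau, computed with Euler-Maruyama.
    for _ in range(int(tau/dt)):
        x = x - gradV(x)*dt + sigma*np.sqrt(dt)*rng.normal(size=x.shape)
    return x

def M_entry(i, j, N=100, M=100):
    # M(i, j) = E_{mu_i}[P_x[X_1 in A_j]^2] - E_{mu_i}[P_x[X_1 in A_j]]^2,
    # estimated exactly as in the two displayed formulas above.
    starts = sample_mu_i(i, N)
    ends = propagate(np.repeat(starts, M)).reshape(N, M)
    hits = ((edges[j] <= ends) & (ends < edges[j + 1])).mean(axis=1)
    return (hits**2).mean() - hits.mean()**2

# Example entry: from the cell containing 0 to a cell near +2; the squared
# L^2 error of the corresponding transition probability is then M(i, j)/N.
print(M_entry(9, 16))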

4 Computation of the Global Error

We assume in this section that (X_n) is reversible. If we denote by

φ(x) := Pf(x) g(x) / ⟨f, 1⟩_µ,

then one can rewrite

C = ⟨f, Pg⟩_µ / ⟨f, 1⟩_µ = ⟨Pf, g⟩_µ / ⟨f, 1⟩_µ = ∫ φ(x) µ(dx).

Analogous to the local case, we can approximate C by Monte Carlo methods. However, this time we need random variables Y_1, Y_2, ..., Y_N: Ω → E distributed according to µ. Then we can approximate the term C by

Ĉ := (1/N) Σ_{k=1}^{N} φ(Y_k).

Again, the error can be stated as

‖C − Ĉ‖²_{L²(P)} = VAR[φ(Y_1)] / N.

The main result of the article is the following theorem:

Theorem 1. It holds

VAR[φ(Y_1)] = E_{ν_f}[g²(X_1) f(X_2)] / E[f(X_0)] − E_{ν_f}[g(X_1)]²

with

ν_f(A) = ∫_A f(X_0(ω)) / ( ∫_Ω f(X_0(ω̃)) P(dω̃) ) P(dω).
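Since E_{ν_f}[h] = E[f(X_0) h] / E[f(X_0)], every term in Theorem 1 is an ordinary expectation over the triple (X_0, X_1, X_2), so the error itself can be estimated by Monte Carlo, as claimed in the introduction. A minimal sketch on the hypothetical two-state chain used in the sketches above, with overlapping triples taken from one long stationary trajectory (a consistent, if correlated, estimate):

import numpy as np

rng = np.random.default_rng(2)

P  = np.array([[0.9, 0.1],
               [0.2, 0.8]])               # reversible w.r.t. mu = (2/3, 1/3)
mu = np.array([2/3, 1/3])

# One long stationary trajectory X_0, ..., X_{T+1}.
T = 200_000
X = np.empty(T + 2, dtype=int)
X[0] = rng.choice(2, p=mu)
for t in range(T + 1):
    X[t + 1] = rng.choice(2, p=P[X[t]])

f = np.array([1.0, 0.0])                  # f = 1_A, A = first state
g = np.array([0.0, 1.0])                  # g = 1_B, B = second state

f0, g1, f2 = f[X[:-2]], g[X[1:-1]], f[X[2:]]
Ef0 = f0.mean()

# Theorem 1, rewritten with E_{nu_f}[h] = E[f(X_0) h] / E[f(X_0)]:
var_phi = (f0 * g1**2 * f2).mean()/Ef0**2 - ((f0 * g1).mean()/Ef0)**2
print("Theorem 1, estimated:", var_phi)

# Exact value for indicator functions (see the example below):
# P[X_2 in A, X_1 in B | X_0 in A] / P[X_0 in A] - P[X_1 in B | X_0 in A]^2
exact = P[0, 1]*P[1, 0]/mu[0] - P[0, 1]**2
print("exact               :", exact)     # 0.02 for this chain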

The theorem shows that one can compute the exact error. Before we step into the proof, we first show some applications of the theorem.

4.1 Example

In the case where f(x) = 1_A(x) and g(x) = 1_B(x), the variance simplifies to

VAR[φ(Y_1)] = P[X_2 ∈ A, X_1 ∈ B | X_0 ∈ A] / P[X_0 ∈ A] − P[X_1 ∈ B | X_0 ∈ A]².

We consider again the example from Section 3.1. This time, we compute starting points globally with the Metropolis Monte Carlo method, distributed according to µ, and we compute the matrix

M(i, j) = P[X_2 ∈ A_i, X_1 ∈ A_j | X_0 ∈ A_i] / P[X_0 ∈ A_i] − P[X_1 ∈ A_j | X_0 ∈ A_i]².

The results are shown in Figure 3.

Figure 3: Global starting points (left) and matrix M (right).

One may note that the entries of the global variance matrix are significantly higher than those of the local variance matrix. The reason is the following. In the global case, the matrix M(i, j) indicates how many global trajectories one needs to keep the error of P[X_1 ∈ A_j | X_0 ∈ A_i] small. However, one can only use those trajectories with starting points in A_i. In the local case, we only sample starting points in A_i and thus we do not need to dismiss any trajectories; this implies that we need fewer trajectories, and it becomes clear that M(i, j) is smaller in the local case. Especially, the global scheme has difficulties capturing the transition probability from a transition region (here the area near 0) to a transition region, which is in turn no challenge for the local scheme.

5 Inequalities for Reversible Markov Chains

Since VAR[φ(Y_1)] ≥ 0, we obtain

E_{ν_f}[g²(X_1) f(X_2)] ≥ E[f(X_0)] E_{ν_f}[g(X_1)]²

for any f, g ∈ L²(µ). If we set g(x) = 1, we obtain

E_{ν_f}[f(X_2)] ≥ E[f(X_0)].

For the case where f(x) = 1_A for A ∈ Σ, this simplifies to

P[X_2 ∈ A | X_0 ∈ A] ≥ P[X_0 ∈ A].

In short: reversible Markov chains prefer to return to a set instead of being there. If E = {1, ..., n} is finite and the Markov chain is described by a transition matrix P ∈ ℝ^{n×n} which is reversible according to the probability vector π ∈ ℝ^n, i.e. π_i p_ij = π_j p_ji, then the inequality states

P²(i, i) ≥ π_i.

This shows, by the way, that from

Σ_{i=1}^{n} P²(i, i) = 1

it immediately follows that P²(i, i) = π_i for all i, since each summand dominates π_i and the π_i already sum to one.

Another inequality can be obtained as follows. Fix some sets A, B ∈ Σ with A ∩ B = ∅. Define the function f at point x as the probability that the Markov chain (X_n) next hits set A before hitting set B when starting in x. Since the Markov chain is reversible, this is equal to the probability that one last visited set A instead of set B. This function is also known as the committor function [9]. The term E_{ν_f}[f(X_2)] is then the conditional probability that one came last from A instead of B, moves for two steps, and then next returns to A instead of B. The probability of this event is thus always at least E[f(X_0)], which simply represents the probability that one came last from A instead of B.

Example. Consider the finite reversible Markov chain

P =
( 0    1/2  1/2 )
( 1/2  0    1/2 )
( 1/2  1/2  0   )

visualized in Figure 4, with stationary measure π = (1/3, 1/3, 1/3).

Figure 4: Visualization of the simple Markov chain.

In this case we have

P²(i, i) = 1/2 ≥ 1/3 = π_i.

The probability that one came last from A := {1} instead of B := {2} is given by

E[f(X_0)] = (1/3)·1 + (1/3)·0 + (1/3)·(1/2) = 1/2,

since the committor takes the values f(1) = 1, f(2) = 0 and f(3) = 1/2, the last because from state 3 the chain jumps to states 1 and 2 with equal probability. The conditional probability that one came last from A instead of B, moves for two steps, and then next returns to A instead of B is given by

E_{ν_f}[f(X_2)] = (2/3, 0, 1/3) P² (1, 0, 1/2)^T = 7/12,

where we have used that ν_f({1}) = 2/3, ν_f({2}) = 0 and ν_f({3}) = 1/3. Thus, we have as predicted

7/12 = E_{ν_f}[f(X_2)] ≥ E[f(X_0)] = 1/2.
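The numbers in this example are easy to check mechanically; a minimal sketch, with the matrix displayed above, the committor values f = (1, 0, 1/2) from the text, and 0-indexed states in the code:

import numpy as np

P  = np.array([[0.0, 0.5, 0.5],
               [0.5, 0.0, 0.5],
               [0.5, 0.5, 0.0]])
pi = np.full(3, 1/3)

P2 = P @ P
print(np.diag(P2))                  # [0.5 0.5 0.5] >= pi_i = 1/3, as predicted

# Committor for A = {1}, B = {2}: f(1) = 1, f(2) = 0, f(3) = 1/2.
f    = np.array([1.0, 0.0, 0.5])
nu_f = f * pi / (f * pi).sum()      # nu_f = (2/3, 0, 1/3)

print(pi @ f)                       # E[f(X_0)] = 1/2
print(nu_f @ P2 @ f)                # E_{nu_f}[f(X_2)] = 7/12 >= 1/2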

6 The Proof

We will now give the proof of Theorem 1. First, note that we have

VAR[φ(Y_1)] = E[φ²(Y_1)] − E[φ(Y_1)]².

For sets A, B ∈ Σ, we have

E[1_A(X_0) 1_B(X_1)] = P[X_0 ∈ A, X_1 ∈ B] = ∫∫ 1_A(x) 1_B(y) p(x, dy) µ(dx).

For measurable functions f, g ≥ 0, we then obtain

E[f(X_0) g(X_1)] = ∫∫ f(x) g(y) p(x, dy) µ(dx).    (4)

For φ(x) = Pf(x) g(x) / ⟨f, 1⟩_µ we first compute

E[φ²(Y_1)] = (1/⟨f, 1⟩²_µ) ∫ ( Pf(x) g(x) )² µ(dx)
           = (1/⟨f, 1⟩²_µ) ∫ Pf(x) (g² Pf)(x) µ(dx)
           = (1/⟨f, 1⟩²_µ) ∫ f(x) P(g² Pf)(x) µ(dx)
           = (1/⟨f, 1⟩²_µ) ∫∫ f(x) g²(y) Pf(y) p(x, dy) µ(dx),

where we used that P is self-adjoint, leading to

E[φ²(Y_1)] = (1/⟨f, 1⟩²_µ) ∫∫ f(x) g²(y) E_y[f(X_1)] p(x, dy) µ(dx),

where we have used Equation (2). From Equation (4) we get

E[φ²(Y_1)] = (1/⟨f, 1⟩²_µ) E[ f(X_0) g²(X_1) E_{X_1}[f(X_1)] ].

Further, according to [6, Equation (3.28) in Chapter 3], we have

E_{X_1}[f(X_1)] = E[f(X_2) | F_1],

where F_1 = σ(X_0, X_1). Thus, we obtain

E[ f(X_0) g²(X_1) E_{X_1}[f(X_1)] ] = E[ f(X_0) g²(X_1) E[f(X_2) | F_1] ].

Also note that we have E[1_A E[f(X_2) | F_1]] = E[1_A f(X_2)] for any A ∈ F_1. Since f(X_0) g²(X_1) is measurable with respect to F_1, we obtain

E[ f(X_0) g²(X_1) E[f(X_2) | F_1] ] = E[ f(X_0) g²(X_1) f(X_2) ].

Thus we finally get

E[φ²(Y_1)] = E_{ν_f}[g²(X_1) f(X_2)] / E_µ[f]

with

ν_f(A) = ∫_A f(X_0(ω)) / ( ∫_Ω f(X_0(ω̃)) P(dω̃) ) P(dω)

for all measurable sets A. Finally, we have

E[φ(Y_1)] = (1/E_µ[f]) ∫ Pf(x) g(x) µ(dx)
          = (1/E_µ[f]) ∫ f(x) Pg(x) µ(dx)
          = (1/E_µ[f]) ∫∫ f(x) g(y) p(x, dy) µ(dx)
          = (1/E_µ[f]) E[f(X_0) g(X_1)]
          = E_{ν_f}[g(X_1)].

This gives us the exact formula for the variance.
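As a numerical sanity check of the theorem just proved, one can compare VAR[φ(Y_1)] computed directly from its definition against the right-hand side of Theorem 1; a minimal sketch on the finite example of Section 5, with f = 1_{{1}} and g = 1_{{2}}:

import numpy as np

P  = np.array([[0.0, 0.5, 0.5],
               [0.5, 0.0, 0.5],
               [0.5, 0.5, 0.0]])
pi = np.full(3, 1/3)

f  = np.array([1.0, 0.0, 0.0])      # f = 1_A, A = {1}
g  = np.array([0.0, 1.0, 0.0])      # g = 1_B, B = {2}
Ef = pi @ f                         # E[f(X_0)] = <f, 1>_mu

# Left-hand side: VAR[phi(Y_1)] directly, with phi = (Pf) g / <f, 1>_mu.
# By reversibility, Pf is evaluated point-wise as the matrix product P f.
phi = (P @ f) * g / Ef
lhs = pi @ phi**2 - (pi @ phi)**2

# Right-hand side of Theorem 1, written out as sums over two-step paths:
# E_{nu_f}[g^2(X_1) f(X_2)] / E[f(X_0)] - E_{nu_f}[g(X_1)]^2.
nu_f = f * pi / Ef
rhs = nu_f @ P @ (g**2 * (P @ f)) / Ef - (nu_f @ P @ g)**2
print(lhs, rhs)                     # both 0.5: the two sides agree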

Acknowledgement

The author gratefully thanks the Berlin Mathematical School for financial support.

References

[1] Heinz Bauer. Maß- und Integrationstheorie. Walter de Gruyter, Berlin; New York, 2nd edition, 1992.

[2] John R. Baxter and Jeffrey S. Rosenthal. Rates of convergence for everywhere-positive Markov chains. Statistics & Probability Letters, 22, 1995.

[3] Eberhard Hopf. The general temporally discrete Markoff process. Journal of Rational Mechanics and Analysis, 3:13–45, 1954.

[4] Wilhelm Huisinga. Metastability of Markovian systems. PhD thesis, Freie Universität Berlin, 2001.

[5] David J. C. MacKay. Introduction to Monte Carlo methods. In M. I. Jordan, editor, Learning in Graphical Models, NATO Science Series, pages 175–204. Kluwer Academic Press, 1998.

[6] Sean Meyn and Richard L. Tweedie. Markov Chains and Stochastic Stability. Cambridge University Press, New York, second edition, 2009.

[7] Grigorios A. Pavliotis. Stochastic Processes and Applications. Springer, 2014.

[8] Daniel Revuz. Markov Chains. North-Holland Mathematical Library, 2nd edition, 1984.

[9] Christof Schütte and Marco Sarich. Metastability and Markov State Models in Molecular Dynamics: Modeling, Analysis, Algorithmic Approaches. Courant Lecture Notes. A co-publication of the AMS and the Courant Institute of Mathematical Sciences at New York University, 2013.

[10] Christof Schütte and Marco Sarich. A critical appraisal of Markov state models. The European Physical Journal Special Topics, pages 1–8, 2015.

[11] Marcus Weber. Meshless Methods in Conformation Dynamics. PhD thesis, Freie Universität Berlin, 2006.
