Detailed Proofs of Lemmas, Theorems, and Corollaries
Dahua Lin (CSAIL, MIT) and John Fisher (CSAIL, MIT)

A List of Lemmas, Theorems, and Corollaries

To keep this document self-contained, we first list all the lemmas, theorems, and corollaries in the main paper.

Lemma 1. The joint transition matrix $P_+$ given by Eq.(1) in the main paper has a stationary distribution of the form $(\alpha \mu_X, \beta \mu_Y)$ if and only if
$$\mu_X Q_B = \mu_Y, \quad \text{and} \quad \mu_Y Q_F = \mu_X. \qquad (1)$$
Under this condition, we have $\alpha b = \beta f$. Further, if both $P_X$ and $P_Y$ are reversible, then $P_+$ is also reversible if and only if
$$\mu_X(x)\, Q_B(x, y) = \mu_Y(y)\, Q_F(y, x), \qquad (2)$$
for all $x \in X$ and $y \in Y$.

Lemma 2. If the condition given above holds, then $\mu_X$ and $\mu_Y$ are respectively stationary distributions of $Q_{BF}$ and $Q_{FB}$. Moreover, if $P_+$ is reversible, then both $Q_{BF}$ and $Q_{FB}$ are reversible.

Theorem 1. Given a reversible Markov chain with transition matrix $P$, and $\varepsilon \in (0, 1/2)$, we have
$$\log\big(1/(2\varepsilon)\big)\,(\tau - 1) \;\le\; t_{\mathrm{mix}}(\varepsilon) \;\le\; \log\big(1/(\varepsilon \mu_{\min})\big)\,\tau. \qquad (3)$$
Here, $\tau$ is called the relaxation time, given by $\tau = 1/\gamma(P)$, where $\gamma(P)$ is the spectral gap of $P$.

Theorem 2. Let $\lambda_2$ be the second largest eigenvalue of a reversible transition matrix $P$; then
$$\Phi_*^2(P)/2 \;\le\; 1 - \lambda_2 \;\le\; 2\,\Phi_*(P). \qquad (4)$$

Theorem 3. The reversible transition matrix $P_+$ as given by Eq.(1) in the main paper has
$$\frac{\eta(b, f)}{2} \cdot \frac{\phi}{\phi + 1} \;\le\; \Phi_*(P_+) \;\le\; \max\{b, f\}. \qquad (5)$$
Here, $\eta(b, f) = 2\alpha b = 2\beta f$ is the total probability of cross-space transition, and $\phi = \min\{\Phi_*(Q_{BF}), \Phi_*(Q_{FB})\}$.

Corollary 1. The joint chain $P_+$ is ergodic when the collapsed chains ($Q_{BF}$ and $Q_{FB}$) are both ergodic.

Lemma 3. Let $P$ be a reversible transition matrix over $X$ such that $P(x, x) \ge \xi > 0$ for each $x \in X$; then its smallest eigenvalue $\lambda_n$ satisfies $\lambda_n \ge 2\xi - 1$.

Theorem 4. The hierarchically bridging Markov chain with $0 < b_k < 1$ for $k = 0, \ldots, L - 1$, and $0 < f_k < 1$ for $k = 1, \ldots, L$, is ergodic, where $L$ is the top (root) level of the hierarchy.
If we write the equilibrium distribution in the form $(\alpha \mu_0, \beta_1 \mu_1, \ldots, \beta_L \mu_L)$, then (S1) $\mu_0$ equals the target distribution $\mu$; (S2) for each $k \ge 1$ and $y \in Y_k$, $\mu_k(y)$ is proportional to the total probability of its descendant target states (the target states derived by filling all its placeholders); (S3) $\alpha$, the probability of being at the target level, is given by
$$\alpha = \Big(1 + \sum_{k=1}^{L} \frac{b_0 \cdots b_{k-1}}{f_1 \cdots f_k}\Big)^{-1}.$$

Corollary 2. If $b_{k-1}/f_k \le \kappa < 1$ for each $k = 1, \ldots, L$, then $\alpha > 1 - \kappa$.

B Proofs

Here, we provide the proofs of the lemmas and theorems presented in the paper.

B.1 Proof of Lemma 1

Recall that $P_+$ is given by
$$P_+ = \begin{bmatrix} (1-b)\, P_X & b\, Q_B \\ f\, Q_F & (1-f)\, P_Y \end{bmatrix}.$$
Suppose $P_+$ has a stationary distribution of the form $(\alpha \mu_X, \beta \mu_Y)$. Then
$$\alpha(1-b)\,\mu_X P_X + \beta f\, \mu_Y Q_F = \alpha \mu_X, \qquad \alpha b\, \mu_X Q_B + \beta(1-f)\,\mu_Y P_Y = \beta \mu_Y. \qquad (6)$$
Since $\mu_X$ and $\mu_Y$ are respectively stationary distributions of $P_X$ and $P_Y$, i.e. $\mu_X P_X = \mu_X$ and $\mu_Y P_Y = \mu_Y$, we have
$$\beta f\, \mu_Y Q_F = \alpha b\, \mu_X, \quad \text{and} \quad \alpha b\, \mu_X Q_B = \beta f\, \mu_Y. \qquad (7)$$
Note that $Q_B$ and $Q_F$ were defined with the condition that each of their rows sums to $1$, i.e. $Q_F \mathbf{1}_X = \mathbf{1}_Y$ and $Q_B \mathbf{1}_Y = \mathbf{1}_X$. Multiplying $\mathbf{1}$ to the right of both sides of either equation yields $\alpha b = \beta f$. It immediately follows that
$$\mu_X Q_B = \mu_Y, \quad \text{and} \quad \mu_Y Q_F = \mu_X. \qquad (8)$$
For the other direction, we assume $Q_B$ and $Q_F$ satisfy the conditions above, and $\alpha b = \beta f$. Plugging these conditions into the left-hand sides of the equations in Eq.(6) yields $(\alpha \mu_X, \beta \mu_Y)\, P_+ = (\alpha \mu_X, \beta \mu_Y)$, which implies that $(\alpha \mu_X, \beta \mu_Y)$ is a stationary distribution of $P_+$. This proves the first part.

Next, we show the second part of the lemma, which concerns reversibility. Let $\mu_+ = (\alpha \mu_X, \beta \mu_Y)$. Under the condition that $P_X$ and $P_Y$ are both reversible, we have for each $x, x' \in X$,
$$\mu_+(x)\, P_+(x, x') = \alpha(1-b)\,\mu_X(x)\, P_X(x, x'), \qquad \mu_+(x')\, P_+(x', x) = \alpha(1-b)\,\mu_X(x')\, P_X(x', x), \qquad (9)$$
and thus $\mu_+(x)\, P_+(x, x') = \mu_+(x')\, P_+(x', x)$ (due to the reversibility of $P_X$). Likewise, we get $\mu_+(y)\, P_+(y, y') = \mu_+(y')\, P_+(y', y)$ for $y, y' \in Y$. Hence, $P_+$ is reversible if and only if $\mu_+(x)\, P_+(x, y) = \mu_+(y)\, P_+(y, x)$ for each $x \in X$ and $y \in Y$. This can be expanded as
$$\alpha b\, \mu_X(x)\, Q_B(x, y) = \beta f\, \mu_Y(y)\, Q_F(y, x), \qquad (10)$$
which holds if and only if $\mu_X(x)\, Q_B(x, y) = \mu_Y(y)\, Q_F(y, x)$ (under the condition $\alpha b = \beta f$). The proof is completed.

B.2 Proof of Lemma 2

If $(\alpha \mu_X, \beta \mu_Y)$ is a stationary distribution of $P_+$, then Eq.(8) holds. Thus,
$$\mu_X Q_B Q_F = \mu_Y Q_F = \mu_X, \qquad (11)$$
implying that $\mu_X$ is a stationary distribution of $Q_{BF} = Q_B Q_F$. Similarly, we can show that $\mu_Y$ is a stationary distribution of $Q_{FB} = Q_F Q_B$.

Furthermore, if $P_+$ is reversible, then according to Lemma 1, we have $\mu_X(x)\, Q_B(x, y) = \mu_Y(y)\, Q_F(y, x)$ for each $x \in X$ and $y \in Y$. Then for any $x, x' \in X$,
$$\mu_X(x)\, Q_{BF}(x, x') = \mu_X(x) \sum_{y \in Y} Q_B(x, y)\, Q_F(y, x') = \sum_{y \in Y} \mu_X(x)\, Q_B(x, y)\, Q_F(y, x') = \sum_{y \in Y} \mu_Y(y)\, Q_F(y, x)\, Q_F(y, x').$$
Similarly, we can get
$$\mu_X(x')\, Q_{BF}(x', x) = \sum_{y \in Y} \mu_Y(y)\, Q_F(y, x')\, Q_F(y, x).$$
Hence,
$$\mu_X(x)\, Q_{BF}(x, x') = \mu_X(x')\, Q_{BF}(x', x). \qquad (12)$$
This implies that $Q_{BF}$ is reversible. Likewise, we can show that $Q_{FB}$ is reversible under the condition that $P_+$ is reversible. The proof is completed.

B.3 Proof of Theorem 3

To prove Theorem 3, we first establish a lemma on flow decomposition, and then complete the proof based on that lemma.

B.3.1 A Lemma on Flow Decomposition

For the joint chain $P_+$, we analyze its bottleneck ratio by decomposing the flows.
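On a small instance, all of the objects used in this section can be computed by brute force, which gives a concrete sanity check of Lemma 1 and of the bounds in Theorem 3. The following pure-Python sketch uses hypothetical choices of $\mu_X$, $Q_B$, $b$, and $f$ (none taken from the main paper); $Q_F$ is derived from the detailed-balance condition Eq.(2), and the in-space chains are taken to be the trivially reversible kernels $P(x, x') = \mu(x')$:

```python
from itertools import combinations

# Illustrative check of Lemma 1 and Theorem 3 on a small example.
# All numeric choices below are hypothetical.
mu_X = [0.5, 0.3, 0.2]                      # target distribution on X = {0,1,2}
Q_B = [[0.7, 0.3],                          # backward bridge X -> Y (rows sum to 1)
       [0.4, 0.6],
       [0.2, 0.8]]
mu_Y = [sum(mu_X[x] * Q_B[x][y] for x in range(3)) for y in range(2)]  # mu_Y = mu_X Q_B
# Q_F chosen via the detailed-balance condition Eq.(2); this also gives mu_Y Q_F = mu_X.
Q_F = [[mu_X[x] * Q_B[x][y] / mu_Y[y] for x in range(3)] for y in range(2)]
# In-space chains: P(x, x') = mu(x') is trivially reversible w.r.t. mu.
P_X = [list(mu_X) for _ in range(3)]
P_Y = [list(mu_Y) for _ in range(2)]

b, f = 0.5, 0.25
alpha, beta = f / (b + f), b / (b + f)      # normalized so that alpha*b = beta*f

# Assemble the joint chain P_+ of Eq.(1) over 5 states (X first, then Y).
P = [[0.0] * 5 for _ in range(5)]
for x in range(3):
    for x2 in range(3):
        P[x][x2] = (1 - b) * P_X[x][x2]
    for y in range(2):
        P[x][3 + y] = b * Q_B[x][y]
for y in range(2):
    for x in range(3):
        P[3 + y][x] = f * Q_F[y][x]
    for y2 in range(2):
        P[3 + y][3 + y2] = (1 - f) * P_Y[y][y2]
mu = [alpha * m for m in mu_X] + [beta * m for m in mu_Y]

# Lemma 1: (alpha mu_X, beta mu_Y) is stationary for P_+.
for j in range(5):
    assert abs(sum(mu[i] * P[i][j] for i in range(5)) - mu[j]) < 1e-12

def bottleneck(T, pi):
    """Bottleneck ratio Phi_*: min over S with pi(S) <= 1/2 of F(S, S^c)/pi(S)."""
    n = len(pi)
    best = float('inf')
    for r in range(1, n):
        for S in combinations(range(n), r):
            pi_S = sum(pi[i] for i in S)
            if pi_S > 0.5 + 1e-12:
                continue
            flow = sum(pi[i] * T[i][j] for i in S for j in range(n) if j not in S)
            best = min(best, flow / pi_S)
    return best

# Collapsed chains Q_BF = Q_B Q_F and Q_FB = Q_F Q_B (Lemma 2).
Q_BF = [[sum(Q_B[x][y] * Q_F[y][x2] for y in range(2)) for x2 in range(3)] for x in range(3)]
Q_FB = [[sum(Q_F[y][x] * Q_B[x][y2] for x in range(3)) for y2 in range(2)] for y in range(2)]

eta = 2 * alpha * b                         # total cross-space transition probability
phi = min(bottleneck(Q_BF, mu_X), bottleneck(Q_FB, mu_Y))
Phi_plus = bottleneck(P, mu)

# Theorem 3: (eta/2) * phi/(phi+1) <= Phi_*(P_+) <= max(b, f).
assert (eta / 2) * phi / (phi + 1) <= Phi_plus <= max(b, f) + 1e-12
```

The bottleneck ratio is computed here by exhaustive enumeration of subsets, which is only feasible for toy state spaces.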
Consider a partition of the union space $X \cup Y$ into two parts: $A \cup B$ (with $A \subset X$ and $B \subset Y$) and $A^c \cup B^c$ (with $A^c = X \setminus A$ and $B^c = Y \setminus B$). The flow between them comprises three parts: $F(A, A^c) + F(B, B^c) + (F(A, B^c) + F(B, A^c))$. Here, $F(A, A^c)$ is the flow within $X$, $F(B, B^c)$ is the flow within $Y$, and $F(A, B^c) + F(B, A^c)$ is the flow between $X$ and $Y$. The first two are inherited from the original Markov chains. We focus on the third one, which reflects the effect of bridging. For this part of the flow, we derive the following lemma by decomposing it along multiple paths.

Lemma 4. Given an arbitrary partition of $X \cup Y$ into $A \cup B$ and $A^c \cup B^c$ as described above, we have
$$F(A, B^c) + F(B, A^c) \;\ge\; \alpha b\, \Phi_*(Q_{BF})\, \mu_X(A), \qquad (13)$$
when $\mu_X(A) \le \mu_X(A^c)$, and
$$F(A, B^c) + F(B, A^c) \;\ge\; \beta f\, \Phi_*(Q_{FB})\, \mu_Y(B), \qquad (14)$$
when $\mu_Y(B) \le \mu_Y(B^c)$.

Proof. To analyze the flow $F(A, B^c) + F(B, A^c)$, we further decompose it along multiple paths. Based on Eq.(1) in the main paper, we have
$$F(A, B^c) = \sum_{x \in A} \sum_{y \in B^c} \alpha b\, \mu_X(x)\, Q_B(x, y) = \alpha b \sum_{x \in A} \sum_{y \in B^c} \mu_X(x)\, Q_B(x, y) \sum_{x' \in X} Q_F(y, x'). \qquad (15)$$
In this way, we decompose the flow into a sum of terms of the form $\mu_X(x)\, Q_B(x, y)\, Q_F(y, x')$, which we call the path weight along $x \to y \to x'$, denoted by $\omega(x, y, x')$. We can then rewrite $F(A, B)$ as
$$F(A, B) = \alpha b \sum_{x \in A} \sum_{y \in B} \sum_{x' \in X} \omega(x, y, x'). \qquad (16)$$
For conciseness, we use $\omega(A, B, C)$ to denote the sum of the weights of paths starting in $A$, traveling via $B$, and ending in $C$, i.e. $\sum_{x \in A} \sum_{y \in B} \sum_{x' \in C} \omega(x, y, x')$. Then, we have
$$F(A, B^c) = \alpha b\, \big(\omega(A, B^c, A) + \omega(A, B^c, A^c)\big), \qquad (17)$$
$$F(A^c, B) = \alpha b\, \big(\omega(A^c, B, A) + \omega(A^c, B, A^c)\big). \qquad (18)$$
As $F$ is symmetric for a reversible chain, we have $F(B, A^c) = F(A^c, B)$, and thus
$$F(A, B^c) + F(B, A^c) \;\ge\; \alpha b\, \big(\omega(A, B^c, A^c) + \omega(A^c, B, A)\big) = \alpha b\, \omega(A, Y, A^c), \qquad (19)$$
where the last equality uses the symmetry $\omega(x, y, x') = \omega(x', y, x)$, which follows from Eq.(2). On the other hand, we note that
$$\omega(A, Y, A^c) = \sum_{x \in A} \sum_{y \in Y} \sum_{x' \in A^c} \omega(x, y, x') = \sum_{x \in A} \sum_{x' \in A^c} \mu_X(x) \sum_{y \in Y} Q_B(x, y)\, Q_F(y, x') = \sum_{x \in A} \sum_{x' \in A^c} \mu_X(x)\, Q_{BF}(x, x'). \qquad (20)$$
This is exactly the flow from $A$ to $A^c$ with respect to the collapsed chain $Q_{BF}$, i.e.
$$\omega(A, Y, A^c) = F_{Q_{BF}}(A, A^c). \qquad (21)$$
Assuming $\mu_X(A) \le \mu_X(A^c)$, we have $F_{Q_{BF}}(A, A^c) \ge \Phi_*(Q_{BF})\, \mu_X(A)$, by the definition of the bottleneck ratio. Combining this with Eq.(19) results in
$$F(A, B^c) + F(B, A^c) \;\ge\; \alpha b\, \omega(A, Y, A^c) \;\ge\; \alpha b\, \Phi_*(Q_{BF})\, \mu_X(A). \qquad (22)$$
Likewise, with the assumption $\mu_Y(B) \le \mu_Y(B^c)$, we have
$$F(A, B^c) + F(B, A^c) \;\ge\; \beta f\, \Phi_*(Q_{FB})\, \mu_Y(B). \qquad (23)$$
The proof of the lemma is completed.

B.3.2 The Proof of the Main Theorem

Let $\mu_+ = (\alpha \mu_X, \beta \mu_Y)$ be the stationary distribution of $P_+$. For conciseness, we let $\bar{F}(A, B) = F(A \cup B, A^c \cup B^c)$; when $A$ and $B$ are clear from the context, we simply write $\bar{F}$. Then the bottleneck ratio of $P_+$ is the minimum of the values $\bar{F}(A, B)/\mu_+(A \cup B)$, among all possible choices of $A \subset X$ and $B \subset Y$ such that $\mu_+(A \cup B) \le 1/2$ and $A \cup B \neq \emptyset$. Throughout this proof, we assume $A \subset X$, $B \subset Y$, and $\mu_+(A \cup B) \le 1/2$, i.e. $\alpha\, \mu_X(A) + \beta\, \mu_Y(B) \le 1/2$. Under this assumption, there are three cases, which we discuss in turn.

Case 1. $\mu_X(A) \le 1/2$ and $\mu_Y(B) \le 1/2$. Recall that $\phi = \min\{\Phi_*(Q_{BF}), \Phi_*(Q_{FB})\}$ in the theorem. In addition, $\bar{F} \ge F(A, B^c) + F(B, A^c)$. Combining this with Lemma 4, we get
$$\bar{F} \ge \alpha b\, \phi\, \mu_X(A), \quad \text{and} \quad \bar{F} \ge \beta f\, \phi\, \mu_Y(B). \qquad (24)$$
Note that $\eta/2 = \alpha b = \beta f$ and $\alpha + \beta = 1$. Taking the convex combination of the two bounds in Eq.(24) with weights $\alpha$ and $\beta$ gives $\bar{F} \ge (\eta\phi/2)\big(\alpha\,\mu_X(A) + \beta\,\mu_Y(B)\big)$, and thus
$$\frac{\bar{F}}{\mu_+(A \cup B)} = \frac{\bar{F}}{\alpha\,\mu_X(A) + \beta\,\mu_Y(B)} \;\ge\; \frac{\eta \phi}{2}. \qquad (25)$$

Case 2. $\mu_X(A) < 1/2$ and $\mu_Y(B) > 1/2$. Given arbitrary $\kappa > 2$, there are two possibilities:

Case 2.1. $1/\kappa \le \mu_X(A) < 1/2$ and $\mu_Y(B) > 1/2$. Then
$$\bar{F} \ge \alpha b\, \phi\, \mu_X(A) \ge \frac{\alpha b\, \phi}{\kappa}. \qquad (26)$$
Recall that $\mu_+(A \cup B) \le 1/2$. Thus
$$\frac{\bar{F}}{\mu_+(A \cup B)} \;\ge\; \frac{2\alpha b\, \phi}{\kappa} = \frac{\eta \phi}{\kappa}. \qquad (27)$$
Case 2.2. $\mu_X(A) < 1/\kappa$ and $\mu_Y(B) > 1/2$.
Here, we utilize the following fact: $F(B, A^c) = F(B, X) - F(B, A)$. Then, by the definition of flow, we have
$$F(B, X) = \beta f\, \mu_Y(B) > \beta f / 2, \qquad (28)$$
and by the symmetry of $F$ (due to reversibility),
$$F(B, A) = F(A, B) \le F(A, Y) = \alpha b\, \mu_X(A) < \alpha b / \kappa. \qquad (29)$$
With $\alpha b = \beta f$, combining the results above leads to
$$\bar{F} \ge F(B, A^c) > \beta f/2 - \alpha b/\kappa = \alpha b\, (1/2 - 1/\kappa). \qquad (30)$$
As a result, we get
$$\frac{\bar{F}}{\mu_+(A \cup B)} > 2\alpha b\, (1/2 - 1/\kappa) = \frac{\eta}{2}\,(1 - 2/\kappa). \qquad (31)$$

Case 3. $\mu_X(A) > 1/2$ and $\mu_Y(B) < 1/2$. Following an argument similar to that developed above for Case 2, with the roles of $A$ and $B$ exchanged, we can likewise get, for a given $\kappa > 2$,
$$\frac{\bar{F}}{\mu_+(A \cup B)} \;\ge\; \begin{cases} \dfrac{\eta \phi}{\kappa} & (\mu_Y(B) \ge 1/\kappa), \\[2mm] \dfrac{\eta}{2}\,(1 - 2/\kappa) & (\mu_Y(B) < 1/\kappa). \end{cases} \qquad (32)$$

Note that $\mu_X(A) > 1/2$ and $\mu_Y(B) > 1/2$ cannot hold simultaneously under the assumption $\alpha\,\mu_X(A) + \beta\,\mu_Y(B) \le 1/2$. Integrating the results derived for all cases, we obtain
$$\frac{\bar{F}(A, B)}{\mu_+(A \cup B)} \;\ge\; \frac{\eta}{2} \min\left\{ \frac{2}{\kappa}\,\phi,\; 1 - \frac{2}{\kappa} \right\}, \qquad \forall\, \kappa > 2. \qquad (33)$$
Note that this inequality holds for all $A$ and $B$ with $0 < \mu_+(A \cup B) \le 1/2$. In this way, we get a series of lower bounds of the bottleneck ratio, using different values of $\kappa$, and the supremum of these lower bounds remains a lower bound. It is easy to see that the supremum is attained when $2\phi/\kappa = 1 - 2/\kappa$, i.e. at $\kappa = 2(\phi + 1)$, leading to
$$\sup_{\kappa > 2}\, \min\left\{ \frac{2}{\kappa}\,\phi,\; 1 - \frac{2}{\kappa} \right\} = \frac{\phi}{\phi + 1}. \qquad (34)$$
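The supremum in Eq.(34) is an elementary one-dimensional optimization; it can be double-checked numerically, e.g. with an arbitrary illustrative value of $\phi$:

```python
# Numerical check of Eq.(34): sup over kappa > 2 of min(2*phi/kappa, 1 - 2/kappa)
# equals phi/(phi + 1), attained at kappa = 2*(phi + 1). The value of phi is illustrative.
phi = 0.37

def objective(kappa):
    return min(2 * phi / kappa, 1 - 2 / kappa)

kappa_star = 2 * (phi + 1)
# Sweep kappa over a fine grid in (2, 202] and keep the best value found.
best = max(objective(2 + i * 1e-3) for i in range(1, 200001))

assert abs(objective(kappa_star) - phi / (phi + 1)) < 1e-12   # attained at kappa*
assert best <= phi / (phi + 1) + 1e-9                          # never exceeds the sup
assert best >= phi / (phi + 1) - 1e-3                          # grid gets close to it
```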
It follows that
$$\Phi_*(P_+) \;\ge\; \frac{\eta}{2} \cdot \frac{\phi}{\phi + 1}. \qquad (35)$$
This completes the proof of the lower bound.

Next, we show the upper bound, which is easier. By the definition of the bottleneck ratio, for any given partition of $X \cup Y$, the flow ratio derived from that partition constitutes an upper bound of $\Phi_*(P_+)$. Here, we consider the partition with one part being $X$ and the other being $Y$. Then
$$\bar{F} = \alpha b = \beta f, \qquad (36)$$
and thus the flow ratio is given by
$$\frac{\alpha b}{\min(\alpha, \beta)} = \max(b, f). \qquad (37)$$
This gives an upper bound of $\Phi_*(P_+)$. The proof of the theorem is completed.

B.4 Proof of Corollary 1

When both $Q_{BF}$ and $Q_{FB}$ are ergodic, their bottleneck ratios are positive. According to Theorem 3, the bottleneck ratio $\Phi_*(P_+)$ is thus positive, implying that the spectral gap of $P_+$ is positive (by Theorem 2), and hence $P_+$ is irreducible. Since $Q_{BF}$ is ergodic and thus aperiodic, the greatest common divisor of the loop lengths at each $x \in X$ is $1$. A similar argument applies to each $y \in Y$. Consequently, the joint chain characterized by $P_+$ is aperiodic. Being an irreducible and aperiodic finite Markov chain, the joint chain given by $P_+$ is ergodic. This completes the proof.

B.5 Proof of Lemma 3

Let $\tilde{P} = (P - \xi I)/(1 - \xi)$. Since $P$ has $P(x, x) \ge \xi$ for each $x \in X$, the entries of the matrix $\tilde{P}$ are all non-negative. In addition,
$$\tilde{P}\,\mathbf{1} = \frac{(P - \xi I)\,\mathbf{1}}{1 - \xi} = \frac{(1 - \xi)\,\mathbf{1}}{1 - \xi} = \mathbf{1}. \qquad (38)$$
This implies that $\tilde{P}$ is also a stochastic matrix. Since $P$ is reversible, all its eigenvalues are real numbers. Without loss of generality, we assume they are $\lambda_1 \ge \cdots \ge \lambda_n$. As $P$ is a stochastic matrix, we have $\lambda_1 = 1$ and $\lambda_n \ge -1$. According to the spectral mapping theorem, the eigenvalues of $\tilde{P}$, denoted by $\tilde{\lambda}_1, \ldots, \tilde{\lambda}_n$, are given by $\tilde{\lambda}_i = (\lambda_i - \xi)/(1 - \xi)$ for each $i = 1, \ldots, n$. As $\tilde{P}$ is a stochastic matrix, we have $\tilde{\lambda}_n \ge -1$, and thus $(\lambda_n - \xi)/(1 - \xi) \ge -1$. Therefore, $\lambda_n \ge 2\xi - 1$. The proof is completed.

B.6 Proof of Theorem 4

We show this theorem by progressively proving a series of claims, as follows.

Claim 1. The augmented Markov chain is ergodic.
With $b_k > 0$ for $k = 0, \ldots, L-1$, the root is accessible from each state (including both complete and partial assignments). With $f_k > 0$ for $k = 1, \ldots, L$, each state is accessible from the root. These imply that any two states are accessible from each other via the root; therefore, the chain is irreducible. In addition, $f_L < 1$ makes the chain aperiodic. As this is a finite Markov chain, we can conclude that it is ergodic.

Since the chain is ergodic, it has a unique stationary distribution, i.e. its equilibrium distribution. Therefore, it suffices to show that a distribution $(\alpha \mu_0, \beta_1 \mu_1, \ldots, \beta_L \mu_L)$ satisfying the three statements of the theorem is a stationary distribution.

Claim 2. Let $\mu_0, \ldots, \mu_L$ be vectors respectively over the sets of states at levels $0, \ldots, L$, such that $\mu_0 = \mu$ is a distribution over $X$, and for each $k = 1, \ldots, L$, $\mu_k$ is defined recursively by
$$\mu_k(y) = \frac{1}{L - k + 1} \sum_{x \in \mathrm{Ch}(y)} \mu_{k-1}(x), \qquad \text{for } y \in Y_k. \qquad (39)$$
Then $\mu_k$, for each $k = 1, \ldots, L$, is a distribution over $Y_k$, and
$$\mu_k(y) = \binom{L}{k}^{-1} \sum_{x \in X:\, x \rightsquigarrow y} \mu(x), \qquad y \in Y_k. \qquad (40)$$
Here, $\mathrm{Ch}(y)$ denotes the set of child states of $y$, and $x \rightsquigarrow y$ means that $x$ is a descendant of $y$.

The proof of this claim is done by induction, as follows. When $k = 1$, according to the definition above, we have
$$\mu_1(y) = \frac{1}{L} \sum_{x \in \mathrm{Ch}(y)} \mu_X(x). \qquad (41)$$
This satisfies Eq.(40), as $\binom{L}{1} = L$, and it is clear that $\mu_1$ is non-negative. In addition, we have
$$\sum_{y \in Y_1} \mu_1(y) = \frac{1}{L} \sum_{y \in Y_1} \sum_{x \in \mathrm{Ch}(y)} \mu_X(x) = \frac{1}{L} \sum_{x \in X} |\mathrm{Pa}(x)|\, \mu_X(x) = 1. \qquad (42)$$
Here, $\mathrm{Pa}(x)$ is the set of parent states of $x$. In the derivation above, we use the fact that each $x \in X$ has $L$ parents, i.e. $|\mathrm{Pa}(x)| = L$. The identity above implies that $\mu_1$ is a valid distribution over $Y_1$. Therefore, the claim holds when $k = 1$.

Suppose that the claim holds for $k = 1, \ldots, m$ with $m < L$; we are to show that it also holds for $k = m+1$, so as to complete the induction. Note that $\mu_{m+1}$ is defined as $\mu_{m+1}(y) = \frac{1}{L - m} \sum_{z \in \mathrm{Ch}(y)} \mu_m(z)$, for $y \in Y_{m+1}$. Again, $\mu_{m+1}$ is obviously non-negative, and
$$\sum_{y \in Y_{m+1}} \mu_{m+1}(y) = \frac{1}{L - m} \sum_{y \in Y_{m+1}} \sum_{z \in \mathrm{Ch}(y)} \mu_m(z) = \frac{1}{L - m} \sum_{z \in Y_m} |\mathrm{Pa}(z)|\, \mu_m(z) = 1. \qquad (43)$$
Similar to the derivation for $k = 1$, here we apply the fact that $|\mathrm{Pa}(z)| = L - m$ for each $z \in Y_m$. This shows that $\mu_{m+1}$ is a valid distribution over $Y_{m+1}$. Moreover, we have for each $y \in Y_{m+1}$,
$$\mu_{m+1}(y) = \frac{1}{L - m} \sum_{z \in \mathrm{Ch}(y)} \frac{(L-m)!\, m!}{L!} \sum_{x \in X:\, x \rightsquigarrow z} \mu(x) = \frac{(L-m-1)!\, (m+1)!}{L!} \sum_{x \in X:\, x \rightsquigarrow y} \mu(x) = \binom{L}{m+1}^{-1} \sum_{x \in X:\, x \rightsquigarrow y} \mu(x), \qquad (44)$$
where the middle step uses the fact that each descendant $x$ of $y$ is a descendant of exactly $m + 1$ children of $y$. By induction, we can conclude that the claim holds for each $k = 1, \ldots, L$.

Claim 3. When the construction is done up to the $k$-th level, the distribution $\mu_k^+ = (c_{k,0}\, \mu_0, \ldots, c_{k,k}\, \mu_k)$ is a stationary distribution of the augmented Markov chain. Here, $\mu_0, \ldots, \mu_k$ are given by Claim 2, and the coefficients $c_{k,0}, \ldots, c_{k,k}$ are defined by
$$c_{k,0} = \frac{1}{Z_k}, \qquad c_{k,l} = \frac{1}{Z_k} \cdot \frac{b_0 \cdots b_{l-1}}{f_1 \cdots f_l} \quad (l = 1, \ldots, k), \qquad (45)$$
with
$$Z_k = 1 + \sum_{l=1}^{k} \frac{b_0 \cdots b_{l-1}}{f_1 \cdots f_l}. \qquad (46)$$
We are going to show this claim by induction. Note that $\mu_0 = \mu$ over $X$ is a stationary distribution of $P$, and $\mu_0^+ = c_{0,0}\, \mu_0$ with $c_{0,0} = 1$. It immediately follows that the claim is true for $k = 0$. Suppose the claim holds for $k = 0, \ldots, m$ with $m < L$; we are to show that it also holds for $k = m + 1$. Note from Eq.(45) that
$$c_{m+1, l} = c_{m, l}\, \frac{Z_m}{Z_{m+1}}, \qquad l = 0, \ldots, m. \qquad (47)$$
Hence, showing that the claim holds for $k = m+1$ is equivalent to showing that $(\alpha\, \mu_m^+, \beta\, \mu_{m+1})$ is a stationary distribution of the augmented chain (up to the $(m+1)$-th level), with
$$\alpha = \frac{Z_m}{Z_{m+1}}, \qquad (48)$$
and
$$\beta = \frac{Z_{m+1} - Z_m}{Z_{m+1}} = \frac{1}{Z_{m+1}} \cdot \frac{b_0 \cdots b_m}{f_1 \cdots f_{m+1}}. \qquad (49)$$
According to Lemma 1, it suffices to check that this distribution satisfies the cross-space detailed balance given in Eq.(2), which is not difficult to verify based on the construction described in Section 3.2.

In this proof, Claim 1 proves the ergodicity of the joint chain. Claim 2 constructs a set of vectors $\mu_0, \ldots, \mu_L$, and shows that they are valid distributions over $X, Y_1, \ldots, Y_L$ satisfying the property given in (S2). Claim 3 (with the induction carried up to $k = L$) shows that the distribution constructed in Claim 2 is exactly a stationary distribution of the joint chain. Since the chain is ergodic, this is the equilibrium distribution. As a by-product, Claim 3 also establishes statement (S3) of the theorem. Statement (S1) is automatically established by the construction described in Claim 2. Therefore, we can conclude that the proof of the theorem is completed.

B.7 Proof of Corollary 2

Based on Theorem 4, we have
$$\alpha = \Big(1 + \sum_{k=1}^{L} \frac{b_0 \cdots b_{k-1}}{f_1 \cdots f_k}\Big)^{-1} \ge \Big(1 + \sum_{k=1}^{L} \kappa^k\Big)^{-1} > \Big(\sum_{k=0}^{\infty} \kappa^k\Big)^{-1} = 1 - \kappa. \qquad (50)$$
Hence, $\alpha > 1 - \kappa$. The proof is done.
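As a numerical illustration of Eq.(50), the following sketch evaluates $\alpha$ for hypothetical rates $b_k$, $f_k$ over $L = 6$ levels satisfying $b_{k-1}/f_k \le \kappa < 1$, and confirms $\alpha > 1 - \kappa$:

```python
# Illustrative check of (S3) and Corollary 2 with hypothetical rates:
# alpha = 1 / (1 + sum_{k=1}^{L} (b_0...b_{k-1})/(f_1...f_k)) exceeds 1 - kappa
# whenever b_{k-1}/f_k <= kappa < 1 for all k.
L = 6
b = [0.3, 0.25, 0.2, 0.3, 0.25, 0.2]        # b_0 .. b_{L-1}
f = [0.8, 0.7, 0.9, 0.75, 0.8, 0.7]         # f_1 .. f_L
kappa = max(b[k] / f[k] for k in range(L))  # b[k]/f[k] is b_{k-1}/f_k in 0-based lists
assert kappa < 1

ratio = 1.0
s = 0.0
for k in range(L):
    ratio *= b[k] / f[k]                    # running product (b_0...b_k)/(f_1...f_{k+1})
    s += ratio
alpha = 1.0 / (1.0 + s)

assert alpha > 1 - kappa                    # Corollary 2
```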
More informationMATH 131A: REAL ANALYSIS (BIG IDEAS)
MATH 131A: REAL ANALYSIS (BIG IDEAS) Theorem 1 (The Triangle Inequality). For all x, y R we have x + y x + y. Proposition 2 (The Archimedean property). For each x R there exists an n N such that n > x.
More informationThe Distribution of Mixing Times in Markov Chains
The Distribution of Mixing Times in Markov Chains Jeffrey J. Hunter School of Computing & Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand December 2010 Abstract The distribution
More informationBURGESS INEQUALITY IN F p 2. Mei-Chu Chang
BURGESS INEQUALITY IN F p 2 Mei-Chu Chang Abstract. Let be a nontrivial multiplicative character of F p 2. We obtain the following results.. Given ε > 0, there is δ > 0 such that if ω F p 2\F p and I,
More informationMARKOV CHAIN MONTE CARLO
MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with
More informationLecture 12: Detailed balance and Eigenfunction methods
Miranda Holmes-Cerfon Applied Stochastic Analysis, Spring 2015 Lecture 12: Detailed balance and Eigenfunction methods Readings Recommended: Pavliotis [2014] 4.5-4.7 (eigenfunction methods and reversibility),
More informationMarkov and Gibbs Random Fields
Markov and Gibbs Random Fields Bruno Galerne bruno.galerne@parisdescartes.fr MAP5, Université Paris Descartes Master MVA Cours Méthodes stochastiques pour l analyse d images Lundi 6 mars 2017 Outline The
More informationNotes on Row Reduction
Notes on Row Reduction Francis J. Narcowich Department of Mathematics Texas A&M University September The Row-Reduction Algorithm The row-reduced form of a matrix contains a great deal of information, both
More informationEntropy and Ergodic Theory Lecture 4: Conditional entropy and mutual information
Entropy and Ergodic Theory Lecture 4: Conditional entropy and mutual information 1 Conditional entropy Let (Ω, F, P) be a probability space, let X be a RV taking values in some finite set A. In this lecture
More informationAlternative Characterization of Ergodicity for Doubly Stochastic Chains
Alternative Characterization of Ergodicity for Doubly Stochastic Chains Behrouz Touri and Angelia Nedić Abstract In this paper we discuss the ergodicity of stochastic and doubly stochastic chains. We define
More informationA New Approximation Algorithm for the Asymmetric TSP with Triangle Inequality By Markus Bläser
A New Approximation Algorithm for the Asymmetric TSP with Triangle Inequality By Markus Bläser Presented By: Chris Standish chriss@cs.tamu.edu 23 November 2005 1 Outline Problem Definition Frieze s Generic
More informationMarkov Chains. Andreas Klappenecker by Andreas Klappenecker. All rights reserved. Texas A&M University
Markov Chains Andreas Klappenecker Texas A&M University 208 by Andreas Klappenecker. All rights reserved. / 58 Stochastic Processes A stochastic process X tx ptq: t P T u is a collection of random variables.
More informationNecessary and sufficient conditions for strong R-positivity
Necessary and sufficient conditions for strong R-positivity Wednesday, November 29th, 2017 The Perron-Frobenius theorem Let A = (A(x, y)) x,y S be a nonnegative matrix indexed by a countable set S. We
More informationLecture 4 Lebesgue spaces and inequalities
Lecture 4: Lebesgue spaces and inequalities 1 of 10 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 4 Lebesgue spaces and inequalities Lebesgue spaces We have seen how
More informationFirst Hitting Time Analysis of the Independence Metropolis Sampler
(To appear in the Journal for Theoretical Probability) First Hitting Time Analysis of the Independence Metropolis Sampler Romeo Maciuca and Song-Chun Zhu In this paper, we study a special case of the Metropolis
More information1.3 Convergence of Regular Markov Chains
Markov Chains and Random Walks on Graphs 3 Applying the same argument to A T, which has the same λ 0 as A, yields the row sum bounds Corollary 0 Let P 0 be the transition matrix of a regular Markov chain
More informationUnderstanding MCMC. Marcel Lüthi, University of Basel. Slides based on presentation by Sandro Schönborn
Understanding MCMC Marcel Lüthi, University of Basel Slides based on presentation by Sandro Schönborn 1 The big picture which satisfies detailed balance condition for p(x) an aperiodic and irreducable
More informationLIMITING PROBABILITY TRANSITION MATRIX OF A CONDENSED FIBONACCI TREE
International Journal of Applied Mathematics Volume 31 No. 18, 41-49 ISSN: 1311-178 (printed version); ISSN: 1314-86 (on-line version) doi: http://dx.doi.org/1.173/ijam.v31i.6 LIMITING PROBABILITY TRANSITION
More information#A45 INTEGERS 9 (2009), BALANCED SUBSET SUMS IN DENSE SETS OF INTEGERS. Gyula Károlyi 1. H 1117, Hungary
#A45 INTEGERS 9 (2009, 591-603 BALANCED SUBSET SUMS IN DENSE SETS OF INTEGERS Gyula Károlyi 1 Institute of Mathematics, Eötvös University, Pázmány P. sétány 1/C, Budapest, H 1117, Hungary karolyi@cs.elte.hu
More informationCommon Learning. August 22, Abstract
Common Learning Martin W. Cripps Jeffrey C. Ely George J. Mailath Larry Samuelson August 22, 2006 Abstract Consider two agents who learn the value of an unknown parameter by observing a sequence of private
More informationc i r i i=1 r 1 = [1, 2] r 2 = [0, 1] r 3 = [3, 4].
Lecture Notes: Rank of a Matrix Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong taoyf@cse.cuhk.edu.hk 1 Linear Independence Definition 1. Let r 1, r 2,..., r m
More informationMarkov Chains, Random Walks on Graphs, and the Laplacian
Markov Chains, Random Walks on Graphs, and the Laplacian CMPSCI 791BB: Advanced ML Sridhar Mahadevan Random Walks! There is significant interest in the problem of random walks! Markov chain analysis! Computer
More informationDynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor)
Dynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor) Matija Vidmar February 7, 2018 1 Dynkin and π-systems Some basic
More informationMDP Preliminaries. Nan Jiang. February 10, 2019
MDP Preliminaries Nan Jiang February 10, 2019 1 Markov Decision Processes In reinforcement learning, the interactions between the agent and the environment are often described by a Markov Decision Process
More informationApplied Stochastic Processes
Applied Stochastic Processes Jochen Geiger last update: July 18, 2007) Contents 1 Discrete Markov chains........................................ 1 1.1 Basic properties and examples................................
More informationFinal Exam Review. 2. Let A = {, { }}. What is the cardinality of A? Is
1. Describe the elements of the set (Z Q) R N. Is this set countable or uncountable? Solution: The set is equal to {(x, y) x Z, y N} = Z N. Since the Cartesian product of two denumerable sets is denumerable,
More informationNotes on the second moment method, Erdős multiplication tables
Notes on the second moment method, Erdős multiplication tables January 25, 20 Erdős multiplication table theorem Suppose we form the N N multiplication table, containing all the N 2 products ab, where
More information7x 5 x 2 x + 2. = 7x 5. (x + 1)(x 2). 4 x
Advanced Integration Techniques: Partial Fractions The method of partial fractions can occasionally make it possible to find the integral of a quotient of rational functions. Partial fractions gives us
More informationSection Notes 9. Midterm 2 Review. Applied Math / Engineering Sciences 121. Week of December 3, 2018
Section Notes 9 Midterm 2 Review Applied Math / Engineering Sciences 121 Week of December 3, 2018 The following list of topics is an overview of the material that was covered in the lectures and sections
More informationAN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES
AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES JOEL A. TROPP Abstract. We present an elementary proof that the spectral radius of a matrix A may be obtained using the formula ρ(a) lim
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationComputing with Semigroup Congruences. Michael Torpey
Computing with Semigroup Congruences Michael Torpey 5th September 2014 Abstract In any branch of algebra, congruences are an important topic. An algebraic object s congruences describe its homomorphic
More information5: The Integers (An introduction to Number Theory)
c Oksana Shatalov, Spring 2017 1 5: The Integers (An introduction to Number Theory) The Well Ordering Principle: Every nonempty subset on Z + has a smallest element; that is, if S is a nonempty subset
More informationTheorem 5.3. Let E/F, E = F (u), be a simple field extension. Then u is algebraic if and only if E/F is finite. In this case, [E : F ] = deg f u.
5. Fields 5.1. Field extensions. Let F E be a subfield of the field E. We also describe this situation by saying that E is an extension field of F, and we write E/F to express this fact. If E/F is a field
More informationCTL Model Checking. Prof. P.H. Schmitt. Formal Systems II. Institut für Theoretische Informatik Fakultät für Informatik Universität Karlsruhe (TH)
CTL Model Checking Prof. P.H. Schmitt Institut für Theoretische Informatik Fakultät für Informatik Universität Karlsruhe (TH) Formal Systems II Prof. P.H. Schmitt CTLMC Summer 2009 1 / 26 Fixed Point Theory
More informationMAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9
MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended
More information