Problem set 2 for the course on Markov chains and mixing times


J. Steif, T. Hirscher
Solutions to Problem set 2 for the course on Markov chains and mixing times
February 7, 2014

Exercise 7 (Reversible chains). (i) Assume that we have a Markov chain with transition matrix P, such that there exist a positive function f and a non-negative function g on S with p_ij = f(i) g(j) for all i, j ∈ S. Show that such a Markov chain is reversible and derive the corresponding stationary probability vector. Why is the stationary distribution unique in this case? How many steps will it take to reach equilibrium?

(ii) Can 2-state Markov chains be non-reversible? Exhibit a 3-state Markov chain with p_ij > 0 for all i, j ∈ S which is not reversible.

Solution. (i) Let E := {i ∈ S : g(i) > 0}. Since p_ij > 0 for all i ∈ S, j ∈ E, all states in E are essential; states outside E must be inessential due to p_ij = 0 for all i ∈ S, j ∉ E. In conclusion, E is the subset of all essential states. That P is stochastic yields

    1 = Σ_{j∈S} p_ij = f(i) Σ_{j∈S} g(j).

Hence, for all states i, we get f(i) = 1/c, where c := Σ_{j∈S} g(j). If we understand g as a vector in R^n, every multiple of it will satisfy the detailed balance equations, due to f being constant. Obviously, π = g/c is the right scaling to get a probability distribution. Since p_ij > 0 for all i, j ∈ E, we have only one essential communicating class. By Prop. .6 (which immediately follows from Ex. 5 on the first assignment) the stationary distribution π is unique. Since p_ij does not depend on i, due to f being constant, the distribution after the first step will be π regardless of the initial distribution. So after step 1, equilibrium is reached.
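The structure found in part (i) is easy to check numerically. Below is a minimal sketch (the vectors f and g are arbitrarily chosen illustrations, not part of the solution): a chain with p_ij = f(i) g(j) satisfies detailed balance with π = g/c and reaches equilibrium after one step from any starting distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
g = np.array([0.5, 1.0, 2.0, 0.0])      # non-negative; one inessential state
f = np.full(4, 1.0 / g.sum())           # stochasticity forces f(i) = 1/c

P = np.outer(f, g)                      # P[i, j] = f(i) * g(j)
assert np.allclose(P.sum(axis=1), 1.0)  # P is stochastic

pi = g / g.sum()                        # claimed stationary distribution g/c

# Detailed balance: pi(i) p_ij == pi(j) p_ji for all i, j.
flows = pi[:, None] * P
assert np.allclose(flows, flows.T)

# One-step equilibrium: mu P = pi for an arbitrary starting distribution mu.
mu = rng.dirichlet(np.ones(4))
assert np.allclose(mu @ P, pi)
print("detailed balance and one-step equilibrium verified")
```

The one-step convergence is visible directly in the code: every row of P equals g/c, so mu @ P is independent of mu.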

(ii) In Ex. 3 on the first assignment it was shown that a general 2-state MC with transition probabilities p_12 = p, p_21 = q has the stationary distribution π = (q/(p+q), p/(p+q)), if it is not the case that p = q = 0. Since π satisfies the detailed balance equation (there is just one, having only 2 states) and in the case p = q = 0 both sides equal 0, every 2-state MC is reversible.

As an example for a non-reversible chain on three states with p_ij > 0, we can take the transition matrix

    P = ( 1/4  1/2  1/4 )
        ( 1/4  1/4  1/2 )
        ( 1/2  1/4  1/4 ).

Since P is doubly stochastic, the uniform distribution is stationary. The corresponding Markov chain is obviously irreducible, which tells us that π = (1/3, 1/3, 1/3) is the unique stationary distribution. This allows us to conclude from

    π(1) p_12 = (1/3)(1/2) = 1/6 ≠ 1/12 = (1/3)(1/4) = π(2) p_21

that the chain is non-reversible, since every distribution satisfying the detailed balance equations would be stationary.

For the next exercise you will need the following corollary to the Convergence Theorem (Th. 4.9): A finite Markov chain (X_t)_{t∈N_0} which is irreducible and aperiodic forgets its initial state, and its distribution converges to equilibrium in the sense that

    lim_{t→∞} p_ij^(t) = lim_{t→∞} P[X_t = j | X_0 = i] = π(j)

for all states i, j.

Exercise 8. Let (X_t)_{t∈N_0} be a finite irreducible Markov chain having stationary distribution π, and further let N(i,t) denote the (random) number of visits to state i among X_1, ..., X_t. Without using Proposition .4 or Theorem 4.6, show that E[N(i,t)]/t → π(i) as t → ∞ and that N(i,t)/t → π(i) in probability. Note that Theorem 4.6 actually implies that this latter convergence holds almost surely.
Hint: Write N(i,t) as a sum of indicator variables and bound its variance.

Solution. Let us first consider only aperiodic chains, write q_s := P(X_s = i) and N(i,t) = Σ_{s=1}^t 1_{X_s = i}.
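The 3-state example from Exercise 7(ii) can be checked numerically. The matrix below is the doubly stochastic choice consistent with the computation in the text (p_12 = 1/2, p_21 = 1/4); the remaining entries are one natural completion.

```python
import numpy as np

P = np.array([[1/4, 1/2, 1/4],
              [1/4, 1/4, 1/2],
              [1/2, 1/4, 1/4]])

assert np.allclose(P.sum(axis=1), 1.0)  # stochastic
assert np.allclose(P.sum(axis=0), 1.0)  # doubly stochastic

pi = np.full(3, 1/3)                    # uniform distribution is stationary
assert np.allclose(pi @ P, pi)

# Detailed balance fails: pi(1) p_12 = 1/6 while pi(2) p_21 = 1/12.
assert not np.isclose(pi[0] * P[0, 1], pi[1] * P[1, 0])
print("chain is non-reversible")
```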

It is not hard to show that any sequence (a_s)_{s∈N} converging to some limit a is Cesàro summable and the Cesàro limit equals a, i.e.

    lim_{t→∞} (1/t) Σ_{s=1}^t a_s = a.

Using the convergence of p_ij^(t) as stated above, we can conclude

    q_t = P(X_t = i) = Σ_{j∈S} p_ji^(t) P(X_0 = j) → π(i) as t → ∞.

Combining both facts yields

    E[N(i,t)]/t = (1/t) Σ_{s=1}^t q_s → π(i) as t → ∞.

As to the second statement, Chebyshev's inequality gives for all ε > 0:

    P(|N(i,t)/t − π(i)| > ε) ≤ P(|N(i,t) − E[N(i,t)]| > tε/2) ≤ (4/(t²ε²)) var(N(i,t)),

if t is large enough s.t. |E[N(i,t)]/t − π(i)| < ε/2. In order to prove that this tends to 0, which implies N(i,t)/t → π(i) in probability, it is left to show that var(N(i,t))/t² → 0 as t → ∞. Using linearity of expectation, we can conclude

    var(N(i,t)) = E[N(i,t)²] − (E[N(i,t)])²
                = E[Σ_{r,s=1}^t 1_{X_r=i, X_s=i}] − (E[Σ_{s=1}^t 1_{X_s=i}])²
                = Σ_{s=1}^t q_s + Σ_{r≠s} P(X_r = i, X_s = i) − Σ_{s=1}^t q_s² − Σ_{r≠s} q_r q_s
                ≤ Σ_{s=1}^t q_s + 2 Σ_{r<s} q_r (p_ii^(s−r) − q_s).

The first sum divided by t converges to π(i), so to 0 if divided by t². As to the second sum, we know from above that for δ > 0, there exists T ∈ N such that both |p_ii^(s) − π(i)| and |q_s − π(i)| are at most δ for s ≥ T. For s ≥ 2T, using the triangle inequality, we get

    (1/s) Σ_{r=1}^{s−1} q_r (p_ii^(s−r) − q_s) ≤ (1/s) Σ_{r=1}^{s−T} q_r · 2δ + (1/s) Σ_{r=s−T+1}^{s−1} q_r ≤ 2δ (1/s) Σ_{r=1}^s q_r + T/s → 2δ π(i), as s → ∞.

This implies for t ≥ 2T:

    (1/t²) Σ_{r<s≤t} q_r (p_ii^(s−r) − q_s) = (1/t²) Σ_{s=2}^t Σ_{r=1}^{s−1} q_r (p_ii^(s−r) − q_s) ≤ (1/t²) ( 2T² + Σ_{s=2T}^t ( 2δ Σ_{r=1}^s q_r + T ) ).

Since this last upper bound converges to δ π(i) and δ > 0 was arbitrary, we have shown that var(N(i,t)) = o(t²), which concludes the proof for the aperiodic case.

If the considered Markov chain is not aperiodic, due to irreducibility all states have the same period d. We won't get the convergence of probabilities as stated before the exercise, but if we consider P^d, we can order the states such that the transition matrix has block-diagonal form: Let ∼ define an equivalence relation on S by i ∼ j ⟺ p_ij^(sd) > 0 for some s > 0. Let A_1, ..., A_d denote the corresponding equivalence classes, ordered such that P[X_1 ∈ A_{k+1} | X_0 ∈ A_k] = 1. If we look at the d-step MC, it has the irreducible components A_1, ..., A_d. If π denotes the stationary distribution, πP^d = π and for X_0 ∼ π:

    π(A_{k+1}) = P(X_1 ∈ A_{k+1}) = P(X_0 ∈ A_k) = π(A_k).

Hence π(A_k) = 1/d for all k. The d-step MC restricted to A_k, for fixed k, is irreducible, aperiodic and has stationary distribution (1/π(A_k)) π|_{A_k} = d π|_{A_k}. So by the above we get for i ∈ A_k:

    E[N(i,s)]/s → d π(i) and N(i,s)/s → d π(i) in probability,

where s is the time in the d-step MC. If t denotes the time in the original periodic chain, we get t = sd, hence

    E[N(i,t)]/t → π(i) and N(i,t)/t → π(i) in probability,

just as claimed.

Exercise 9. You are given two probability measures µ and ν on a finite set S. Bob is required to flip a fair coin (which you cannot see the result of); if the coin is heads, he must give you an element of S which has distribution µ, and if the coin is tails, he must give you an element of S which has distribution ν (independently chosen of the coin toss). Based upon what you end up receiving, your job is to try to guess if the coin was heads or tails and to maximize the probability that you are correct. Of course you can be correct with probability 1/2 by just always guessing heads, but you want to do better than this.

(i) Show that if ||µ − ν||_TV ≥ δ, then there is a strategy which gives a probability of being correct which is at least 1/2 + δ/2.

(ii) Show that if ||µ − ν||_TV ≤ δ, then the maximal probability of being correct is at most 1/2 + δ/2.

In conclusion, total variation measures the degree to which you can statistically tell apart two distributions.
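The guessing game is easy to simulate. The sketch below (µ and ν are arbitrary example measures, not from the problem set) uses the strategy of guessing "heads" exactly when the received element lies in B = {i : µ(i) ≥ ν(i)}, whose success probability should be (1 + ||µ − ν||_TV)/2.

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([0.5, 0.3, 0.1, 0.1])
nu = np.array([0.1, 0.2, 0.3, 0.4])
tv = 0.5 * np.abs(mu - nu).sum()        # total variation distance
B = mu >= nu                            # guess "heads" on this set

n = 200_000
heads = rng.random(n) < 0.5             # Bob's fair coin
samples = np.where(heads,
                   rng.choice(4, size=n, p=mu),
                   rng.choice(4, size=n, p=nu))
correct = (B[samples] == heads).mean()  # fraction of correct guesses

print(f"empirical: {correct:.3f}, predicted: {(1 + tv) / 2:.3f}")
```

With these measures ||µ − ν||_TV = 1/2, so the empirical success rate comes out near 3/4, matching part (i).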

Solution. By Prop. 4. we know that

    ||µ − ν||_TV = (1/2) Σ_{i∈S} |µ(i) − ν(i)| = Σ_{i∈B} (µ(i) − ν(i)),

where B := {i ∈ S : µ(i) ≥ ν(i)}. Let X denote the random element of S which we are given. Then for all i ∈ S:

    P(heads, X = i) = µ(i)/2 and P(tails, X = i) = ν(i)/2.

(i) If ||µ − ν||_TV ≥ δ, let us adopt the strategy to guess µ whenever we receive an element of B, and ν otherwise. The probability p of guessing correctly becomes

    p = Σ_{i∈B} P(heads, X = i) + Σ_{i∉B} P(tails, X = i)
      = (1/2) ( Σ_{i∈B} µ(i) + Σ_{i∉B} ν(i) )
      = (1/2) ( 1 + Σ_{i∈B} (µ(i) − ν(i)) ) = (1/2) (1 + ||µ − ν||_TV)
      ≥ 1/2 + δ/2.

(ii) No matter which strategy we adopt, if we are given element i we are incorrect with our guess with probability at least min{P[heads | X = i], P[tails | X = i]}, even for a randomized decision. Hence, if ||µ − ν||_TV ≤ δ:

    p ≤ 1 − Σ_{i∈S} P(X = i) min{P[heads | X = i], P[tails | X = i]}
      = 1 − Σ_{i∈S} min{ µ(i)/2, ν(i)/2 }
      = 1 − (1/2) ( Σ_{i∉B} µ(i) + Σ_{i∈B} ν(i) )
      = 1 − (1/2) ( 1 − ||µ − ν||_TV ) = 1/2 + (1/2) ||µ − ν||_TV,

which implies p ≤ 1/2 + δ/2.

Exercise 10. Let P be the transition matrix of a finite Markov chain (X_t)_{t∈N_0} with stationary distribution π and starting distribution µ = L(X_0). Show that the total variation distance of the distribution of X_t, i.e. µP^t, to π is non-increasing in t, i.e.

    ||µP^t − π||_TV ≥ ||µP^{t+1} − π||_TV for all t ∈ N_0.

Explain how this implies that d(t) = max_{i∈S} ||P^t(i,·) − π||_TV is non-increasing.
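The claim of Exercise 10 can be observed numerically before proving it. A minimal sketch (the random 4-state chain below is an arbitrary illustration): iterate µP^t from a point mass and record the total variation distance to π at each step.

```python
import numpy as np

rng = np.random.default_rng(2)
P = rng.dirichlet(np.ones(4), size=4)   # random 4-state transition matrix

# Stationary distribution: left eigenvector of P for eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()

mu = np.array([1.0, 0.0, 0.0, 0.0])     # start deterministically in state 0
dists = []
for _ in range(30):
    dists.append(0.5 * np.abs(mu - pi).sum())   # ||mu P^t - pi||_TV
    mu = mu @ P

# The distance to pi never increases along the trajectory.
assert all(a >= b - 1e-12 for a, b in zip(dists, dists[1:]))
print("TV distance to pi is non-increasing")
```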

Solution. Let us write µ_t := µP^t. Since πP = π, applying Prop. 4. and the triangle inequality yields

    ||µP^{t+1} − π||_TV = ||µ_t P − πP||_TV
                        = (1/2) Σ_{i∈S} | Σ_{j∈S} µ_t(j) P(j,i) − Σ_{j∈S} π(j) P(j,i) |
                        ≤ (1/2) Σ_{i∈S} Σ_{j∈S} P(j,i) |µ_t(j) − π(j)|
                        = (1/2) Σ_{j∈S} |µ_t(j) − π(j)| Σ_{i∈S} P(j,i)
                        = (1/2) Σ_{j∈S} |µ_t(j) − π(j)| = ||µ_t − π||_TV.

Choosing µ = δ_i shows

    ||P^{t+1}(i,·) − π||_TV ≤ ||P^t(i,·) − π||_TV;

maximizing this over i gives d(t+1) ≤ d(t).

Exercise 11. (i) Consider an aperiodic irreducible finite Markov chain having stationary distribution π and the property that there exist a state i and a set A ⊆ S with π(A) = Σ_{j∈A} π(j) > 1/4, and d(i,A) ≥ t, where d(i,A) is the shortest path distance from i to any node in A in the (directed) Markov chain graph. Show that t_mix ≥ t.

(ii) Use this to obtain lower bounds for the mixing time for a lazy random walk on the hypercube Z_2^d and on the torus Z_m^d, m > 2. How sharp a bound can you get for the hypercube arguing this way? For what combinations of m and d can you prove, using (i), that the chain is not rapidly mixing?

Solution. (i) If m < t, we find P[X_m ∈ A | X_0 = i] = 0. This entails

    d(m) ≥ ||P^m(i,·) − π||_TV ≥ π(A) − Σ_{j∈A} P^m(i,j) = π(A) > 1/4.

Hence t_mix > m, which gives t_mix ≥ t.

(ii) In the case of the hypercube Z_2^d, take i = (0,...,0) and A to be the set of states having at least d/2 coordinates with value 1, giving d(i,A) ≥ d/2. As the graph is regular, π is uniform and we get π(A) ≥ 1/2, which implies t_mix ≥ d/2 by the above. From Chebyshev's inequality, we know that for a Bin(d, 1/2)-distributed random variable Z

    P(|Z − EZ| ≥ c) ≤ var(Z)/c² = d/(4c²).

As the distribution of Z is symmetric around its mean, this immediately implies P(Z − EZ ≥ c) ≤ d/(8c²). Hence, already the set B of vectors having at least d/2 + √d ones is of insufficient size, since

    π(B) = 2^{−d} |B| = P(Z − EZ ≥ √d) ≤ 1/8 < 1/4.

So using the argument above, when it comes to the asymptotically leading term, we won't get anything better than d/2 as a lower bound.

In the case of the torus Z_m^d, take again i = (0,...,0) and this time A to be the set of vectors which have at least d/2 coordinates in {m/4, ..., 3m/4}. Then d(i,A) ≥ (d/2)(m/4) and since π is again uniform, π(A) ≥ 1/2. Consequently, t_mix ≥ (d/2)(m/4) = dm/8.

Note that the size of the state space is |Z_m^d| = m^d. Using the lower bound on t_mix just derived, we can conclude that a sequence of lazy random walks on tori Z_m^d is not rapidly mixing if m is not polynomial in d, i.e. the sequence of (m,d) is such that m/d^k → ∞ for all k ∈ N.

Exercise 12. Show that there exists a finite state Markov chain so that for two of its states i and j,

    lim_{t→∞} ||P^t(i,·) − P^t(j,·)||_TV > 0,

but there exists a coupling (X_t, Y_t)_{t∈N_0} of the Markov chain starting respectively at i and at j so that T := inf{t : X_t = Y_t} is finite with probability 1. Note that this tells you that condition (5.) is very essential in the statement of Theorem 5.

Solution. By the convergence theorem, such a chain cannot be irreducible: In the aperiodic case both P^t(i,·) and P^t(j,·) converge to π; in the periodic case, define ∼ and the corresponding equivalence classes as in the solution to Exercise 8. Then either i ∼ j, which will give convergence of both P^{td}(i,·) and P^{td}(j,·) to (1/π([i])) π|_{[i]} by the same reasoning as above, and thus lim_{t→∞} ||P^t(i,·) − P^t(j,·)||_TV = 0, or i ≁ j, which will make P(T < ∞) = 1 impossible.

A concrete example can be found as Prop. 4. in the paper "A Note on Disagreement Percolation" by Olle Häggström. Note that the coupling described there is not Markovian. If we had a Markovian coupling with P(T < ∞) = 1, we could modify it for t > T to enforce (5.), which then would give lim_{t→∞} ||P^t(i,·) − P^t(j,·)||_TV = 0 by Th. 5.
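The hypercube bound from Exercise 11(ii) can be confirmed by exact computation for a small dimension (d = 8 below is an arbitrary choice for the demo): starting from the all-zeros vertex, the set A of states with at least d/2 ones is unreachable in fewer than d/2 steps, so the distance to stationarity stays above π(A) > 1/4 and hence t_mix ≥ d/2.

```python
import numpy as np
from itertools import product

d = 8
states = list(product((0, 1), repeat=d))
index = {s: k for k, s in enumerate(states)}
n = len(states)

# Lazy random walk: stay put w.p. 1/2, else flip a uniformly chosen coordinate.
P = np.zeros((n, n))
for s, k in index.items():
    P[k, k] += 0.5
    for c in range(d):
        nb = list(s); nb[c] ^= 1
        P[k, index[tuple(nb)]] += 0.5 / d

pi = np.full(n, 1 / n)                       # uniform is stationary (regular graph)
A = np.array([sum(s) >= d / 2 for s in states])

mu = np.zeros(n); mu[index[(0,) * d]] = 1.0  # start at the all-zeros vertex
for t in range(d // 2):
    tv = 0.5 * np.abs(mu - pi).sum()
    assert mu[A].sum() == 0.0                # A unreachable in t < d/2 steps
    assert tv >= pi[A].sum()                 # hence d(t) >= pi(A) > 1/4
    mu = mu @ P

print("t_mix >= d/2 confirmed for d =", d)
```

Here π(A) = P(Bin(8, 1/2) ≥ 4) = 163/256 > 1/4, so the mixing-time threshold of 1/4 is not reached before step d/2.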