Necessary and sufficient conditions for strong R-positivity

Wednesday, November 29th, 2017

The Perron-Frobenius theorem

Let $A = (A(x,y))_{x,y \in S}$ be a nonnegative matrix indexed by a countable set $S$. We say that $A$ is irreducible if for all $x, y \in S$ there exists an $n \geq 0$ such that $A^n(x,y) > 0$. We say that $A$ is aperiodic if the greatest common divisor of $\{n \geq 1 : A^n(x,x) > 0\}$ is one.

[Perron-Frobenius (1912)] Let $A$ be irreducible and let $S$ be finite. Then there exist a unique constant $c > 0$ and a function $h : S \to (0,\infty)$ that is unique up to scalar multiples, such that $Ah = ch$.

How about infinite $S$?

Observation Let $A$ be a nonnegative matrix, $c > 0$ a constant, and $h : S \to (0,\infty)$. Then the following conditions are equivalent.
1. $Ah = ch$.
2. $P(x,y) := c^{-1} h(x)^{-1} A(x,y) h(y)$ defines a probability kernel.
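
A minimal numerical sketch of this observation, using a small made-up matrix $A$ (the data and all variable names below are assumptions for illustration): power iteration recovers $c$ and $h$, and the resulting $h$-transform $P$ has unit row sums.

```python
import numpy as np

# A small irreducible nonnegative matrix (made-up example data).
A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])

# Power iteration: the normalized iterates h -> Ah converge to the Perron
# eigenfunction h > 0, and the total mass of Ah converges to c.
h = np.ones(A.shape[0])
for _ in range(1000):
    h = A @ h
    h /= h.sum()
c = (A @ h).sum()  # Ah ~ c h and h sums to one, so this is c

# The h-transform P(x,y) = c^{-1} h(x)^{-1} A(x,y) h(y).
P = A * h[None, :] / (c * h[:, None])
print("c =", c, " rows of P sum to 1:", np.allclose(P.sum(axis=1), 1.0))
```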

Gibbs measures

Let $\Omega_n$ denote the space of all functions $\omega : \{0,\dots,n\} \to S$. Let $\mu^{A,n}_{x,y}$ be the probability measure(!) on $\Omega_n$ defined by
$$\mu^{A,n}_{x,y}(\omega) := 1_{\{\omega_0 = x,\ \omega_n = y\}} \frac{1}{A^n(x,y)} \prod_{k=1}^n A(\omega_{k-1}, \omega_k).$$
We call this the Gibbs measure on $\Omega_n$ with transfer matrix $A$ and boundary conditions $x, y$.

[Equivalence of transfer matrices] Let $A, B$ be irreducible with $A(x,y) > 0$ iff $B(x,y) > 0$. Then the following conditions are equivalent.
1. $\mu^{A,n}_{x,y} = \mu^{B,n}_{x,y}$ for all $x, y, n$ such that $A^n(x,y) > 0$.
2. There exist a $c > 0$ and an $h : S \to (0,\infty)$ such that $B(x,y) = c^{-1} h(x)^{-1} A(x,y) h(y)$.
Moreover, in 2., the constant $c$ is unique and $h$ is unique up to a multiplicative constant.
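
That $\mu^{A,n}_{x,y}$ really is a probability measure can be checked by brute force on a small example (a sketch reusing the made-up $A$ from above): the normalization $A^n(x,y)$ is exactly the total Gibbs weight.

```python
import itertools
import numpy as np

A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])
n, x, y = 4, 0, 2

# Total Gibbs weight of all paths omega with omega_0 = x, omega_n = y.
total = 0.0
for mid in itertools.product(range(A.shape[0]), repeat=n - 1):
    omega = (x,) + mid + (y,)
    total += np.prod([A[omega[k - 1], omega[k]] for k in range(1, n + 1)])

# The normalizing constant is A^n(x,y), so the measure sums to one.
print(np.isclose(total, np.linalg.matrix_power(A, n)[x, y]))
```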

The infinite-volume limit

Write $A \sim_c B$ or $A \sim B$, and call $A, B$ equivalent, if
$$B(x,y) = c^{-1} h(x)^{-1} A(x,y) h(y) \qquad (x, y \in S)$$
for some $c > 0$ and $h : S \to (0,\infty)$.

Perron-Frobenius: $A$ irreducible and $S$ finite $\Rightarrow$ there exists a unique probability kernel $P$ such that $A \sim P$.

[Infinite-volume limit] The Gibbs measures $\mu^{A,n}_{x,y}$ converge weakly as $n \to \infty$ to the law of the Markov chain with initial state $x$ and transition kernel $P$.

Note $A$ and $P$ determine $c$ uniquely, and $h$ uniquely up to a multiplicative constant.
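
This convergence is easy to observe numerically: under $\mu^{A,n}_{x,y}$, the conditional law of the first step is $A(x,s)A^{n-1}(s,y)/A^n(x,y)$, which tends to $P(x,s)$ as $n \to \infty$. A sketch, reusing the made-up $A$ and its Perron data:

```python
import numpy as np

A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])

# Perron data and h-transform kernel P, as before.
h = np.ones(3)
for _ in range(1000):
    h = A @ h
    h /= h.sum()
c = (A @ h).sum()
P = A * h[None, :] / (c * h[:, None])

x, y = 0, 2
for n in [2, 5, 20, 50]:
    Anm1 = np.linalg.matrix_power(A, n - 1)
    An_xy = np.linalg.matrix_power(A, n)[x, y]
    first_step = A[x, :] * Anm1[:, y] / An_xy  # law of omega_1 under mu^{A,n}_{x,y}
    print(n, np.abs(first_step - P[x, :]).max())  # distance to P(x,.) shrinks
```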

R-recurrence

[David Vere-Jones 1962, 1967] Let $A$ be a countable, irreducible, nonnegative matrix. Then there exists at most one recurrent probability kernel $P$ such that $A \sim P$. If such a $P$ exists, we call $A$ R-recurrent.

Consequence If $A$ is R-recurrent, then there exist a unique $c > 0$ and an $h : S \to (0,\infty)$, unique up to constant multiples, such that $P(x,y) := c^{-1} h(x)^{-1} A(x,y) h(y)$ is a recurrent probability kernel.

Call $A$ R-transient if $A$ is not R-recurrent. Call $A$ R-positive if $P$ is positive recurrent. The result about the infinite-volume limit holds more generally for all R-positive $A$.

The spectral radius

[Kingman 1963] Let $A$ be aperiodic and irreducible. Then the limit
$$\rho(A) := \lim_{n \to \infty} \big( A^n(x,y) \big)^{1/n}$$
exists and does not depend on $x, y \in S$. We call this the spectral radius.

$A \sim_c B$ $\Rightarrow$ $\rho(A) = c\,\rho(B)$; $P$ a recurrent probability kernel $\Rightarrow$ $\rho(P) = 1$.

[David Vere-Jones 1962, 1967] $A$ is R-recurrent if and only if
$$\sum_{n=1}^\infty \rho(A)^{-n} A^n(x,x) = \infty$$
for some, and hence for all, $x \in S$.
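
For a finite matrix, Kingman's limit can be compared directly with the largest-modulus eigenvalue; a quick numerical sanity check with the same made-up $A$:

```python
import numpy as np

A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])

# Kingman: rho(A) = lim_n (A^n(x,x))^{1/n}, independent of x.
for n in [10, 50, 200]:
    An = np.linalg.matrix_power(A, n)
    print(n, An[0, 0] ** (1.0 / n), An[1, 1] ** (1.0 / n))

# For a finite matrix this agrees with the usual spectral radius.
print(max(abs(np.linalg.eigvals(A))))
```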

The spectral radius

Proof of the necessity: If $A \sim_c P$ with $P$ a recurrent probability kernel, then $c = \rho(A)$ and
$$\sum_{n=1}^\infty \rho(A)^{-n} A^n(x,x) = \sum_{n=1}^\infty P^n(x,x) = \infty.$$
Here we used $P^n(x,y) = c^{-n} h(x)^{-1} A^n(x,y) h(y)$ and in particular $P^n(x,x) = c^{-n} A^n(x,x)$.

Similarly: $A$ is R-positive if and only if $\lim_{n \to \infty} \rho(A)^{-n} A^n(x,x) > 0$.

A generalization of Perron-Frobenius

[David Vere-Jones 1962, 1967] If $A$ is irreducible and R-recurrent, then there exists a function $h : S \to (0,\infty)$, unique up to scalar multiples, such that $Ah = \rho(A)h$. If some function $f : S \to [0,\infty)$ satisfies $Af \leq \rho(A)f$, then $f$ is a scalar multiple of $h$.

Consequence If $Af = cf$ for some $f : S \to (0,\infty)$, then $\rho(A) \leq c$. If $\rho(A) < c$, then $P(x,y) := c^{-1} f(x)^{-1} A(x,y) f(y)$ is a transient probability kernel.

Note 1 In the infinite-dimensional case, $\rho(A)$ need not be the largest eigenvalue.

Note 2 The theory of R-recurrence is just one way to generalize the Perron-Frobenius theorem to infinite dimensions. A functional-analytic generalization is the Krein-Rutman theorem (1948).

Geometric ergodicity

Let $(X_k)_{k \geq 0}$ be a Markov chain with irreducible transition kernel $P$. Let $\sigma_x := \inf\{k > 0 : X_k = x\}$ denote the first return time to $x$.

Def $P$ is strongly positive recurrent if for some, and hence for all, $x \in S$ there exists an $\varepsilon > 0$ s.t. $E^x[e^{\varepsilon \sigma_x}] < \infty$.

[Kendall '59, Vere-Jones '62] Let $P$ be irreducible and aperiodic with invariant law $\pi$. Then $P$ is strongly positive recurrent if and only if it is geometrically ergodic, in the sense that there exist $\varepsilon > 0$ and constants $M_{x,y} < \infty$ s.t.
$$|P^n(x,y) - \pi(y)| \leq M_{x,y}\, e^{-\varepsilon n} \qquad (x, y \in S,\ n \geq 0).$$

Def $A$ is strongly R-positive if $A \sim P$ for some (necessarily unique) strongly positive recurrent $P$.
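
On a finite state space every irreducible chain is strongly positive recurrent, and the geometric decay is easy to see numerically. A sketch with a made-up kernel (the decay rate is governed by the second-largest eigenvalue modulus):

```python
import numpy as np

# A small irreducible, aperiodic transition kernel (made-up example data).
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.4, 0.0, 0.6]])

# Invariant law pi: left eigenvector of P for eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi /= pi.sum()

# sup_{x,y} |P^n(x,y) - pi(y)| decays like e^{-eps n}.
for n in [1, 5, 10, 20, 40]:
    Pn = np.linalg.matrix_power(P, n)
    print(n, np.abs(Pn - pi[None, :]).max())
```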

Conditions for R-recurrence

Problem: it is often not easy to check whether
$$\sum_{n=1}^\infty \rho(A)^{-n} A^n(x,x) = \infty.$$
In fact, without using the Perron-Frobenius theorem, it is not even clear how to prove that this always holds for finite matrices.

Intuition: $A$ should be R-recurrent if the Gibbs measures $\mu^{A,n}_{x,y}$ put, for large $n$, most of their mass on walks $\omega$ that stay close to $x$.

Intuition

Recall $\rho(A) := \lim_{n \to \infty} (A^n(x,y))^{1/n}$. The unnormalized measures
$$\nu^{A,n}_{x,y}(\omega) := 1_{\{\omega_0 = x,\ \omega_n = y\}} \prod_{k=1}^n A(\omega_{k-1}, \omega_k)$$
have total mass $A^n(x,y)$, which grows exponentially as $A^n(x,y) = e^{n \log \rho(A) + o(n)}$. We need to know whether most of this mass lies on walks that stay close to $x$, or on paths that wander far away.

Local modifications

[Swart 2017] Let $A \leq B$ be irreducible with $\rho(B) < \infty$. Assume that $A(x,y) > 0$ whenever $B(x,y) > 0$ and that $\{(x,y) : A(x,y) < B(x,y)\}$ is finite and nonempty. Then:
(a) $B$ is strongly R-positive if and only if $\rho(A) < \rho(B)$.
(b) $A$ is R-transient if and only if $\rho(A) = \rho\big(A + \varepsilon(B - A)\big)$ for some $\varepsilon > 0$.

Note: This implies in particular that finite matrices are strongly R-positive.
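
Part (a) can be checked numerically for finite matrices, where $\rho$ is the usual spectral radius: lowering a single entry of an irreducible $B$ (keeping it positive) strictly lowers $\rho$, consistent with finite matrices being strongly R-positive. A sketch with made-up data:

```python
import numpy as np

rho = lambda M: max(abs(np.linalg.eigvals(M)))

B = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])

# A finite local modification: decrease one positive entry, keep it positive.
A = B.copy()
A[1, 2] = 1.0  # A(x,y) < B(x,y) on a finite nonempty set

print(rho(A), rho(B), rho(A) < rho(B))  # strict inequality, as in (a)
```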

Pinning models

Let $Q$ denote the transition kernel of nearest-neighbour random walk on $\mathbb{Z}^d$. Define
$$A_\beta(x,y) := \begin{cases} e^\beta Q(x,y) & \text{if } x = 0,\\ Q(x,y) & \text{otherwise.} \end{cases}$$

[Giacomin, Caravenna, Zambotti 2006] There exists a $-\infty < \beta_c < \infty$ such that:
- $A_\beta$ is R-transient for $\beta < \beta_c$,
- $A_\beta$ is R-null recurrent or weakly R-positive for $\beta = \beta_c$,
- $A_\beta$ is strongly R-positive for $\beta > \beta_c$.

Moreover, $\beta \mapsto \rho(A_\beta)$ is constant on $(-\infty, \beta_c]$ and strictly increasing on $[\beta_c, \infty)$. One has $\beta_c = 0$ in dimensions $d = 1, 2$ and $\beta_c > 0$ in dimensions $d \geq 3$. In fact, $e^{-\beta_c}$ is the return probability of the random walk.
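
A rough numerical illustration for $d = 1$ (where $\beta_c = 0$), under the assumption that truncating the walk to a finite box $\{-N,\dots,N\}$ gives a usable approximation: the leading eigenvalue of the truncated $A_\beta$ approximates $\rho(A_\beta)$, staying near $1$ for $\beta \leq 0$ and growing for $\beta > 0$.

```python
import numpy as np

N = 100
size = 2 * N + 1  # states -N..N; index i corresponds to state i - N

def rho_pinned(beta):
    # Nearest-neighbour walk on the truncated box, killed at the boundary.
    A = np.zeros((size, size))
    for i in range(size - 1):
        A[i, i + 1] = A[i + 1, i] = 0.5
    A[N, :] *= np.exp(beta)  # reward e^beta for steps out of the origin
    return max(abs(np.linalg.eigvals(A)))

for beta in [-1.0, 0.0, 0.5, 1.0]:
    print(beta, rho_pinned(beta))
```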

Bounds on the spectral radius

Sharp upper bounds on $\rho(A)$ can (in principle) be obtained from
$$\rho(A) = \inf\big\{ K < \infty : \exists f : S \to (0,\infty) \text{ s.t. } Af \leq Kf \big\}.$$

Let $(\pi, Q)$ be a pair such that
1. $\pi$ is a probability measure on some finite $S' \subset S$,
2. $Q$ is a transition kernel on $S'$ with invariant law $\pi$.
Define a large deviations rate function
$$I_A(\pi, Q) := \sum_{x,y} \pi(x) Q(x,y) \log\Big( \frac{Q(x,y)}{A(x,y)} \Big).$$
Sharp lower bounds on $\rho(A)$ can be obtained from
$$\rho(A) = \sup_{(\pi,Q)} e^{-I_A(\pi,Q)}.$$
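
As a sanity check of the lower bound, one can plug in what should be the optimal pair: for finite $S$, taking $Q$ to be the $h$-transform kernel $P$ and $\pi$ its invariant law gives $I_A(\pi,Q) = -\log \rho(A)$, so the bound is attained. A numerical sketch with the made-up $A$:

```python
import numpy as np

A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])

# Perron data and the h-transform kernel, used here as the candidate Q.
h = np.ones(3)
for _ in range(1000):
    h = A @ h
    h /= h.sum()
c = (A @ h).sum()
Q = A * h[None, :] / (c * h[:, None])

# Invariant law pi of Q.
vals, vecs = np.linalg.eig(Q.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi /= pi.sum()

# I_A(pi,Q), summing only over the entries with Q(x,y) > 0.
I = sum(pi[x] * Q[x, y] * np.log(Q[x, y] / A[x, y])
        for x in range(3) for y in range(3) if Q[x, y] > 0)
print(np.exp(-I), c)  # both approximately rho(A)
```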

Open problems

Define $\rho_*(A) := \inf\{\rho(B) : B \text{ a finite modification of } A\}$. Then
$$A \text{ strongly R-positive} \iff \rho_*(A) < \rho(A).$$

Open problem Prove that $\rho_*(A^n) = \rho_*(A)^n$.

Open problem Develop the theory for semigroups $(A_t)_{t \geq 0}$ of nonnegative matrices.

Open problem Show that the contact process modulo translations is strongly R-positive for all $\lambda < \lambda_c$.

[Sturm & Swart 2014] The contact process modulo translations is R-positive for all $\lambda < \lambda_c$.

Proof of the main result

The most interesting part of the main theorem is: Let $A$ be irreducible. If there exists a finite modification $B$ of $A$ such that $\rho(B) < \rho(A)$, then $A \sim P$ for some strongly positive recurrent probability kernel $P$.

Sketch of the proof Fix a reference point $z \in S$. Let $\Omega_z$ denote the space of all excursions away from $z$, i.e., functions $\omega : \{0,\dots,n\} \to S$ with $\omega_0 = z = \omega_n$ and $\omega_k \neq z$ for all $0 < k < n$. Let $\ell_\omega := n$ denote the length of $\omega$. Define
$$e^{\psi_z(\lambda)} := \sum_{\omega \in \Omega_z} \nu_\lambda(\omega) \quad\text{with}\quad \nu_\lambda(\omega) := e^{-\lambda \ell_\omega} \prod_{k=1}^{\ell_\omega} A(\omega_{k-1}, \omega_k).$$

STEP I: If there exists some $\lambda$ such that $\psi_z(\lambda) = 0$, then the process that makes i.i.d. excursions away from $z$ with law $\nu_\lambda$ is a recurrent Markov chain with transition kernel $P \sim A$.
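
For a finite chain the excursion sum can be evaluated in closed form: excursions of length $n \geq 2$ away from $z$ have total weight $A(z,\cdot)\,(A_{\hat z})^{n-2}\,A(\cdot,z)$, where $A_{\hat z}$ is $A$ with row and column $z$ removed. A sketch (made-up $A$, variable names assumed) that locates the root $\psi_z(\lambda) = 0$ by bisection and checks it against $\log \rho(A)$:

```python
import numpy as np

A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])
z = 0
rest = [1, 2]
Ahat = A[np.ix_(rest, rest)]        # A restricted to S \ {z}
out, back = A[z, rest], A[rest, z]  # first / last step of an excursion

def psi(lam):
    # phi_z(lam) = sum over excursions of e^{-lam*length} * product of A-weights.
    # Lengths n >= 2 contribute d^n * out . Ahat^{n-2} . back; the geometric
    # series is summed in closed form (valid while d * rho(Ahat) < 1).
    d = np.exp(-lam)
    phi = d * A[z, z] + d**2 * out @ np.linalg.solve(np.eye(2) - d * Ahat, back)
    return np.log(phi)

# psi_z is decreasing in lam; bisect for its zero.
lo = np.log(max(abs(np.linalg.eigvals(Ahat)))) + 0.05  # series converges here
hi = 5.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if psi(mid) > 0 else (lo, mid)
print(0.5 * (lo + hi), np.log(max(abs(np.linalg.eigvals(A)))))  # both log rho(A)
```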

The logarithmic moment generating function

[Figure: four qualitative plots of $\lambda \mapsto \psi_z(\lambda)$, one each for strong R-positivity, weak R-positivity, R-null recurrence, and R-transience, with the critical values $\lambda_*$ and $\lambda_{z,+}$ marked on the $\lambda$-axis.]

Sketch of the proof (continued)

Let $G$ be the directed graph with vertex set $S$ and edge set $\{(x,y) : A(x,y) > 0\}$. For any subgraph $F \subset G$ and vertices $x, y \in F$, define $\psi^F_{x,y}(\lambda) := \log \varphi^F_{x,y}(\lambda)$ with
$$\varphi^F_{x,y}(\lambda) := \sum_{\omega \in \Omega_{x,y}(F)} e^{-\lambda \ell_\omega} \prod_{k=1}^{\ell_\omega} A(\omega_{k-1}, \omega_k),$$
where $\Omega_{x,y}(F)$ denotes the space of excursions away from $F$ starting in $x$ and ending in $y$.

Removal of an edge Let $F' = F \setminus \{e\}$ be obtained from $F$ by the removal of an edge $e$. Then
$$\varphi^{F'}_{x,y}(\lambda) = \begin{cases} \varphi^F_{x,y}(\lambda) + e^{-\lambda} A(x,y) & \text{if } e = (x,y),\\ \varphi^F_{x,y}(\lambda) & \text{otherwise} \end{cases} \qquad (\lambda \in \mathbb{R}).$$

Sketch of the proof

Removal of an isolated vertex Let $F' = F \setminus \{z\}$ be obtained from $F$ by the removal of an isolated vertex $z$. Then
$$\varphi^{F'}_{x,y}(\lambda) = \varphi^F_{x,y}(\lambda) + \varphi^F_{x,z}(\lambda) \sum_{k=0}^\infty \varphi^F_{z,z}(\lambda)^k\, \varphi^F_{z,y}(\lambda).$$

Proof: Set $A(\omega) := \prod_{k=1}^{\ell_\omega} A(\omega_{k-1}, \omega_k)$. Distinguishing excursions away from $F'$ according to how often they visit the vertex $z$, we have
$$\varphi^{F'}_{x,y}(\lambda) = \sum_{\omega_{x,y}} e^{-\lambda \ell_{\omega_{x,y}}} A(\omega_{x,y}) + \sum_{k=0}^\infty \sum_{\omega_{x,z}} \sum_{\omega_{z,y}} \sum_{\omega^1_{z,z}} \cdots \sum_{\omega^k_{z,z}} e^{-\lambda(\ell_{\omega_{x,z}} + \ell_{\omega_{z,y}} + \ell_{\omega^1_{z,z}} + \cdots + \ell_{\omega^k_{z,z}})} A(\omega_{x,z}) A(\omega_{z,y}) A(\omega^1_{z,z}) \cdots A(\omega^k_{z,z}),$$
where we sum over $\omega_{x,y} \in \Omega_{x,y}(F)$ etc.

Sketch of the proof

Rewriting gives
$$\varphi^{F'}_{x,y}(\lambda) = \sum_{\omega_{x,y}} e^{-\lambda \ell_{\omega_{x,y}}} A(\omega_{x,y}) + \Big( \sum_{\omega_{x,z}} e^{-\lambda \ell_{\omega_{x,z}}} A(\omega_{x,z}) \Big) \Big( \sum_{\omega_{z,y}} e^{-\lambda \ell_{\omega_{z,y}}} A(\omega_{z,y}) \Big) \sum_{k=0}^\infty \Big( \sum_{\omega_{z,z}} e^{-\lambda \ell_{\omega_{z,z}}} A(\omega_{z,z}) \Big)^k.$$

Sketch of the proof

Lemma (Exponential moments of excursions) Let $P$ be an irreducible subprobability kernel. Set
$$\lambda^F_{x,y,+} := \sup\{\lambda : \psi^F_{x,y}(\lambda) < \infty\}.$$
Then, if $\lambda^F_{x,y,+} > 0$ for all $x, y \in F$ holds for some finite nonempty subgraph $F$ of $G$, it holds for all such subgraphs.

Proof: By induction, removing edges and isolated vertices.

Sketch of the proof

Lemma Set $\lambda_* := \sup\{\lambda : \psi_z(\lambda) \geq 0\}$. Then $\lambda_* = \log \rho(A)$.

Proof of the theorem: Assume that $A$ is not strongly positive recurrent. Let $A' \sim A$ be a finite modification. We must show that $\rho(A') = \rho(A)$. By a similarity transformation, we may assume w.l.o.g. that $A$ is a subprobability kernel and $\lambda_* = 0$. We need to show that $\lambda'_* = 0$. It suffices to show that for the subgraph $F = \{z\}$, we have $\lambda'_{z,+} = 0$. Since $A$ is not strongly positive, we have $\lambda_{z,+} = 0$. Since $A'$ is a finite modification, we can choose a finite subgraph $F$ such that $\lambda^F_{x,y,+}$ is the same for $A$ and $A'$. Now
$$\lambda_{z,+} = 0 \;\Rightarrow\; \lambda^F_{x,y,+} = 0 \text{ for some } x, y \in F \;\Rightarrow\; \lambda'_{z,+} = 0.$$