Fastest mixing Markov chain on a path

Stephen Boyd    Persi Diaconis    Jun Sun    Lin Xiao

Revised July 2004

Submitted to The American Mathematical Monthly, October 2003. Authors listed in alphabetical order. Information Systems Laboratory, Department of Electrical Engineering, Stanford University, CA 94305-9510; e-mail addresses: boyd@stanford.edu, sunjun@stanford.edu, lxiao@stanford.edu. Department of Statistics and Department of Mathematics, Stanford University, CA 94305-4065.

Abstract. We consider the problem of assigning transition probabilities to the edges of a path, so the resulting Markov chain or random walk mixes as rapidly as possible. In this note we prove that fastest mixing is obtained when each edge has a transition probability of 1/2. Although this result is intuitive (it was conjectured in [7]), and can be found numerically using convex optimization methods [2], we give a self-contained proof.

In [2], the authors consider the problem of assigning transition probabilities to the edges of a connected graph in such a way that the associated Markov chain mixes as rapidly as possible. We show that this problem can be solved, at least numerically, using tools of convex optimization, in particular semidefinite programming [9, 3]. The present note presents a simple, self-contained example where the optimal Markov chain can be identified analytically.

Consider a path with n ≥ 2 nodes, labeled 1, 2, ..., n, with n − 1 edges connecting pairs of adjacent nodes, and a loop at each node, as shown in figure 1. We consider a Markov chain (or random walk) on this path, with transition probability from node i to node j denoted P_ij. The requirement that transitions can only occur along an edge or loop of the path is equivalent to P_ij = 0 for |i − j| > 1, i.e., P is a tridiagonal matrix. Since the P_ij are transition probabilities, we have P_ij ≥ 0 and Σ_j P_ij = 1, i.e., P is a stochastic matrix. This can be expressed as P1 = 1, where 1 is the vector with all components one. We will consider symmetric transition probabilities, i.e., those that satisfy P_ij = P_ji. Thus, P is a symmetric, (doubly) stochastic, tridiagonal matrix. Since P1 = 1, we have (1^T/n)P = 1^T/n, which means that the uniform distribution, given by 1^T/n, is stationary.

Figure 1: A path with loops at each node, with transition probabilities labeled.
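To make the setup concrete, here is a minimal NumPy sketch (our illustration, not part of the paper): it builds the symmetric, doubly stochastic, tridiagonal transition matrix determined by edge probabilities p_1, ..., p_{n−1}, and checks that the uniform distribution is stationary. The helper name path_chain is ours.

    import numpy as np

    def path_chain(p):
        """Symmetric tridiagonal transition matrix with P[i, i+1] = P[i+1, i] = p[i].

        Requires p[i] >= 0 and p[i] + p[i+1] <= 1 so that every entry of P is
        nonnegative; the loop (holding) probability at each node is whatever is left over.
        """
        p = np.asarray(p, dtype=float)
        P = np.diag(p, 1) + np.diag(p, -1)
        P += np.diag(1.0 - P.sum(axis=1))   # holding probabilities on the diagonal
        return P

    n = 5
    P = path_chain([0.3, 0.4, 0.5, 0.2])    # any admissible edge probabilities
    assert np.allclose(P, P.T)              # symmetric
    assert np.allclose(P.sum(axis=1), 1.0)  # stochastic, hence doubly stochastic
    u = np.ones(n) / n
    assert np.allclose(u @ P, u)            # the uniform distribution is stationary

Any admissible choice of edge probabilities gives a chain with uniform stationary distribution; the question addressed next is which choice mixes fastest.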

The eigenvalues of P are real (since it is symmetric), and no more than one in modulus (since it is stochastic). We denote them in nonincreasing order:

    1 = λ_1(P) ≥ λ_2(P) ≥ ··· ≥ λ_n(P) ≥ −1.

The asymptotic rate of convergence of the Markov chain to the stationary distribution, i.e., its mixing rate, depends on the second-largest eigenvalue modulus (SLEM) of P, which we denote µ(P):

    µ(P) = max_{i=2,...,n} |λ_i(P)| = max{λ_2(P), −λ_n(P)}.

The smaller µ(P) is, the faster the Markov chain converges to its stationary distribution. For example, we have the following bound:

    ‖π(t) − 1^T/n‖_TV ≤ (1/2)√n µ(P)^t,

where π(t) = π(0)P^t is the probability distribution at time t, and ‖·‖_TV denotes the total variation norm. (The total variation distance between two probability distributions π and π̂ is the maximum of |prob_π(S) − prob_π̂(S)| over all subsets S ⊆ {1, 2, ..., n}.) For more background, see, e.g., [6, 4, 1, 2] and references therein.

The question we address is: What choice of P minimizes µ(P) among all symmetric stochastic tridiagonal matrices? In other words, what is the fastest mixing (symmetric) Markov chain on a path? We will show that the transition matrix P⋆ with entries

    P⋆_{i,i+1} = P⋆_{i+1,i} = 1/2,   i = 1, ..., n−1,
    P⋆_{11} = P⋆_{nn} = 1/2,                                     (1)
    P⋆_{ij} = 0 otherwise,

achieves the smallest possible value of µ(P), cos(π/n), among all symmetric stochastic tridiagonal matrices. Thus, to obtain the fastest mixing Markov chain on a path, we assign a probability 1/2 of moving left, a probability 1/2 of moving right, and a probability 1/2 of staying at each of the two end nodes. (For the nodes not at either end, the probability of staying at the node is zero.) This optimal Markov chain is shown in figure 2.

Figure 2: Fastest mixing Markov chain on a path.

For n = 2, we have µ(P⋆) = cos(π/2) = 0, which is clearly the optimal solution; in one step the distribution is exactly uniform, for any initial distribution π(0). For n ≥ 3, P⋆ is the transition matrix one would guess yields fastest mixing; indeed, this was conjectured in [7]. But we are not aware of a simpler proof of its optimality than the one we give below.
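As a quick numerical sanity check (ours, not part of the paper), the following NumPy snippet confirms that the SLEM of P⋆ equals cos(π/n), and that randomly chosen admissible edge probabilities never do better.

    import numpy as np

    def path_chain(p):
        """Symmetric tridiagonal transition matrix with edge probabilities p."""
        p = np.asarray(p, dtype=float)
        P = np.diag(p, 1) + np.diag(p, -1)
        P += np.diag(1.0 - P.sum(axis=1))
        return P

    def slem(P):
        """Second-largest eigenvalue modulus of a symmetric stochastic matrix."""
        eigs = np.sort(np.linalg.eigvalsh(P))   # ascending; eigs[-1] = 1
        return max(eigs[-2], -eigs[0])

    n = 8
    P_star = path_chain(np.full(n - 1, 0.5))
    print(slem(P_star), np.cos(np.pi / n))      # both are about 0.9239

    rng = np.random.default_rng(0)
    for _ in range(1000):
        p = rng.uniform(0.0, 0.5, size=n - 1)   # p_i + p_{i+1} <= 1 holds automatically
        assert slem(path_chain(p)) >= np.cos(np.pi / n) - 1e-9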

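The claim can also be reproduced with the convex-optimization approach of [2]: by the first lemma below, minimizing the spectral norm of P − (1/n)11^T over symmetric stochastic tridiagonal matrices minimizes µ(P), and this is a semidefinite program. The sketch below is our own illustration and assumes the CVXPY package (with an SDP-capable solver such as SCS) is available; it is not code from the paper. The optimal value should agree with cos(π/n) up to solver tolerance.

    import cvxpy as cp
    import numpy as np

    n = 9
    P = cp.Variable((n, n), symmetric=True)
    J = np.ones((n, n)) / n

    constraints = [P >= 0, cp.sum(P, axis=1) == 1]
    # transitions only along edges and loops of the path: P is tridiagonal
    constraints += [P[i, j] == 0
                    for i in range(n) for j in range(n) if abs(i - j) > 1]

    problem = cp.Problem(cp.Minimize(cp.sigma_max(P - J)), constraints)
    problem.solve()

    print("optimal SLEM (numerical):", problem.value)        # about cos(pi/9) = 0.9397
    print("cos(pi/n)               :", np.cos(np.pi / n))
    print("edge probabilities found:", np.round(P.value.diagonal(1), 3))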
Before proceeding, we describe another context where the same mathematical problem arises. We imagine that there is a processor at each node of our path, and that each link represents a direct network connection between the adjacent processors. Processor i has a job queue or load q_i(t) (which we approximate as a positive real number) at time t. The goal is to shift jobs across the links, at each step, in such a way as to balance the load. In other words, we would like to have q_i(t) → q̄ as t → ∞, where q̄ = (1/n) Σ_i q_i(0) is the average of the initial queues. We ignore the reduction in the queues due to processing (or equivalently, assume that the load balancing is done before the processing begins).

We use the following simple scheme to balance the load: at each step, we compute the load imbalance, q_{i+1}(t) − q_i(t), across each link. We then transfer a fraction θ_i ∈ [0, 1] of the load imbalance from the more loaded to the less loaded processor. We must have θ_i + θ_{i+1} ≤ 1, to ensure that we are not asked to transfer more than the load on a processor to its neighbors. It can be shown that if the θ_i are positive, and satisfy θ_i + θ_{i+1} ≤ 1, then this iterative scheme achieves asymptotically balanced loads, i.e., q_i(t) → q̄ as t → ∞. The problem is to find the fractions θ_i that result in the fastest possible load balancing.

It turns out that this optimal iterative load balancing problem is identical to the problem of finding the fastest mixing Markov chain on a path, with P_{i,i+1} = θ_i. In particular, the evolution of the loads at the processors is given by q(t) = P^t q(0). The speed of convergence of q(t) to q̄1 is given by the second-largest eigenvalue modulus µ(P). By the basic result in this paper, the fastest possible load balancing is accomplished by shifting one-half of the load imbalance on each edge from the more loaded to the less loaded processor. More discussion of this load balancing problem can be found in [7]; a small simulation of the scheme is sketched below.
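For concreteness, here is a small NumPy simulation (ours) of the diffusive load-balancing scheme with the optimal fractions θ_i = 1/2: the queue vector q(t) = P⋆^t q(0) converges to the constant vector q̄1, at a rate governed by µ(P⋆) = cos(π/n).

    import numpy as np

    n = 6
    theta = np.full(n - 1, 0.5)                  # transfer half the imbalance on each link
    P = np.diag(theta, 1) + np.diag(theta, -1)
    P += np.diag(1.0 - P.sum(axis=1))            # this is the optimal chain P* of (1)

    q = np.array([12.0, 0.0, 3.0, 0.0, 0.0, 0.0])    # arbitrary initial loads
    qbar = q.mean()                                  # 2.5
    for t in range(80):
        q = P @ q                                    # one balancing round: q(t+1) = P q(t)
    print(np.round(q, 4))                            # all entries close to qbar = 2.5
    print("convergence factor per step:", np.cos(np.pi / n))   # about 0.866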

We now proceed to prove the basic result.

Lemma. Let P ∈ R^{n×n} be a symmetric stochastic matrix. Then we have µ(P) = ‖P − (1/n)11^T‖_2, where ‖·‖_2 denotes the spectral norm (maximum singular value).

Proof. To see this, we note that 1 is the eigenvector of P associated with the eigenvalue λ_1 = 1. Therefore the eigenvalues of P − 11^T/n are 0, λ_2, ..., λ_n. Since P − 11^T/n is symmetric, its spectral norm is equal to the maximum magnitude of its eigenvalues, i.e., max{λ_2, −λ_n}, which is µ(P).

Lemma. Let P ∈ R^{n×n} be a symmetric stochastic matrix, and suppose y, z ∈ R^n satisfy

    1^T y = 0,   ‖y‖_2 = 1,                             (2)
    (z_i + z_j)/2 ≤ y_i y_j   for P_ij ≠ 0.             (3)

Then we have µ(P) ≥ 1^T z.

Proof. For any P, y and z that satisfy the assumptions in the lemma, we have

    µ(P) = ‖P − (1/n)11^T‖_2
         ≥ y^T (P − (1/n)11^T) y
         = y^T P y
         = Σ_{i,j} P_ij y_i y_j
         ≥ Σ_{i,j} (1/2)(z_i + z_j) P_ij
         = (1/2)(z^T P 1 + 1^T P z)
         = 1^T z.

The first inequality follows from the assumption ‖y‖_2 = 1 and the first lemma. The second inequality follows from the assumption (3), and P_ij ≥ 0.

Theorem. The matrix P⋆, given in (1), attains the smallest value of µ, cos(π/n), among all symmetric stochastic tridiagonal matrices.

Proof. The result is clear for n = 2. We assume now that n > 2. The eigenvalues and associated orthonormal eigenvectors of P⋆ are

    λ_1 = 1,   v_1 = (1/√n) 1,
    λ_j = cos((j−1)π/n),   v_j(k) = √(2/n) cos((2k−1)(j−1)π/(2n)),   j = 2, ..., n,   k = 1, ..., n.

(See, e.g., [8, §16.3].) Therefore we have µ(P⋆) = λ_2 = −λ_n = cos(π/n). We show that this is the smallest µ possible by constructing a pair y and z that satisfy the assumptions (2) and (3) in the second lemma, for any symmetric tridiagonal stochastic matrix P, with 1^T z = cos(π/n). We take y = v_2, so the assumptions (2) in the second lemma clearly hold. We take z to be

    z_i = (1/n) [ cos(π/n) + cos((2i−1)π/n) / cos(π/n) ],   i = 1, ..., n.

It is easy to verify that 1^T z = cos(π/n). It remains to check that y and z satisfy (3) for any symmetric tridiagonal matrix P. Let's first check the superdiagonal entries. For i = 1, ..., n−1, we have

    (z_i + z_{i+1})/2 = (1/n) [ cos(π/n) + (1/2)(cos((2i−1)π/n) + cos((2i+1)π/n)) / cos(π/n) ]
                      = (1/n) [ cos(π/n) + cos(2iπ/n) ]
                      = (2/n) cos((2i−1)π/(2n)) cos((2i+1)π/(2n))
                      = y_i y_{i+1}.

Therefore equality always holds for the superdiagonal (and subdiagonal) entries. For the diagonal entries, we need to check (z_i + z_i)/2 = z_i ≤ y_i², i.e.,

    (1/n) [ cos(π/n) + cos((2i−1)π/n) / cos(π/n) ] ≤ (2/n) cos²((2i−1)π/(2n)) = (1/n) [ 1 + cos((2i−1)π/n) ]

for i = 1, ..., n, which is equivalent to

    [ 1 − cos(π/n) ] [ 1 − cos((2i−1)π/n) / cos(π/n) ] ≥ 0,   i = 1, ..., n.

But this is certainly true because

    cos((2i−1)π/n) ≤ cos(π/n),   i = 1, ..., n.

This completes the proof.
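As an informal numerical check of this construction (ours, not in the paper), the snippet below forms y = v_2 and z for a given n and verifies conditions (2) and (3) together with 1^T z = cos(π/n).

    import numpy as np

    n = 7
    i = np.arange(1, n + 1)
    y = np.sqrt(2.0 / n) * np.cos((2 * i - 1) * np.pi / (2 * n))      # y = v_2
    z = (np.cos(np.pi / n)
         + np.cos((2 * i - 1) * np.pi / n) / np.cos(np.pi / n)) / n

    assert abs(y.sum()) < 1e-12 and abs(y @ y - 1.0) < 1e-12          # condition (2)
    assert abs(z.sum() - np.cos(np.pi / n)) < 1e-12                   # 1^T z = cos(pi/n)
    assert np.allclose((z[:-1] + z[1:]) / 2, y[:-1] * y[1:])          # superdiagonal: equality
    assert np.all(z <= y**2 + 1e-12)                                  # diagonal: z_i <= y_i^2
    print("conditions (2) and (3) hold, so mu(P) >= cos(pi/n) =", np.cos(np.pi / n))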

References

[1] D. Aldous and J. Fill. Reversible Markov Chains and Random Walks on Graphs. stat-www.berkeley.edu/users/aldous/rwg/book.html, 2003. Forthcoming book.

[2] S. Boyd, P. Diaconis, and L. Xiao. Fastest mixing Markov chain on a graph. To appear in SIAM Review, problems and techniques section, 2004. Available at www.stanford.edu/~boyd/fmmc.html.

[3] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. Available at www.stanford.edu/~boyd/cvxbook.html.

[4] P. Brémaud. Markov Chains, Gibbs Fields, Monte Carlo Simulation and Queues. Texts in Applied Mathematics. Springer-Verlag, Berlin-Heidelberg, 1999.

[5] G. Cobb and Y. Chen. An application of Markov chain Monte Carlo to community ecology. The American Mathematical Monthly, 110(4):265–288, 2003.

[6] P. Diaconis and D. Stroock. Geometric bounds for eigenvalues of Markov chains. The Annals of Applied Probability, 1(1):36–61, 1991.

[7] R. Diekmann, S. Muthukrishnan, and M. V. Nayakkankuppam. Engineering diffusive load balancing algorithms using experiments. In Lecture Notes in Computer Science, volume 1253, pages 111–122. Springer Verlag, 1997.

[8] W. Feller. An Introduction to Probability Theory and Its Applications, volume I. Wiley, New York, 3rd edition, 1968.

[9] L. Vandenberghe and S. Boyd. Semidefinite programming. SIAM Review, 38(1):49–95, 1996.