Stochastic relations of random variables and processes

Similar documents
ABTEKNILLINEN KORKEAKOULU STOCHASTIC RELATIONS OF RANDOM VARIABLES AND PROCESSES

ABTEKNILLINEN KORKEAKOULU COMPUTATIONAL METHODS FOR STOCHASTIC RELATIONS AND MARKOVIAN COUPLINGS

Stochastic comparison of Markov queueing networks using coupling

arxiv: v1 [math.pr] 3 Jun 2011

Existence, Uniqueness and Stability of Invariant Distributions in Continuous-Time Stochastic Models

STABILIZATION OF AN OVERLOADED QUEUEING NETWORK USING MEASUREMENT-BASED ADMISSION CONTROL

Stein s Method for Steady-State Approximations: Error Bounds and Engineering Solutions

LIMITS FOR QUEUES AS THE WAITING ROOM GROWS. Bell Communications Research AT&T Bell Laboratories Red Bank, NJ Murray Hill, NJ 07974

Asymptotic Coupling of an SPDE, with Applications to Many-Server Queues

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R.

A Diffusion Approximation for Stationary Distribution of Many-Server Queueing System In Halfin-Whitt Regime

A generalization of Dobrushin coefficient

Markov Chains. X(t) is a Markov Process if, for arbitrary times t 1 < t 2 <... < t k < t k+1. If X(t) is discrete-valued. If X(t) is continuous-valued

Statistics 150: Spring 2007

THE VARIANCE CONSTANT FOR THE ACTUAL WAITING TIME OF THE PH/PH/1 QUEUE. By Mogens Bladt National University of Mexico

PATH FUNCTIONALS OVER WASSERSTEIN SPACES. Giuseppe Buttazzo. Dipartimento di Matematica Università di Pisa.

Central limit theorems for ergodic continuous-time Markov chains with applications to single birth processes

Geometric ρ-mixing property of the interarrival times of a stationary Markovian Arrival Process

Asymptotic properties of the maximum likelihood estimator for a ballistic random walk in a random environment

Load Balancing in Distributed Service System: A Survey

HOPF S DECOMPOSITION AND RECURRENT SEMIGROUPS. Josef Teichmann

Non-homogeneous random walks on a semi-infinite strip

Overflow Networks: Approximations and Implications to Call-Center Outsourcing

Exact Simulation of the Stationary Distribution of M/G/c Queues

APPLICATIONS OF THE KANTOROVICH-RUBINSTEIN MAXIMUM PRINCIPLE IN THE THEORY OF MARKOV OPERATORS

Simultaneous drift conditions for Adaptive Markov Chain Monte Carlo algorithms

25.1 Ergodicity and Metric Transitivity

Non-Essential Uses of Probability in Analysis Part IV Efficient Markovian Couplings. Krzysztof Burdzy University of Washington

Lecture 10. Theorem 1.1 [Ergodicity and extremality] A probability measure µ on (Ω, F) is ergodic for T if and only if it is an extremal point in M.

Asymptotic Irrelevance of Initial Conditions for Skorohod Reflection Mapping on the Nonnegative Orthant

ITERATED FUNCTION SYSTEMS WITH CONTINUOUS PLACE DEPENDENT PROBABILITIES

Class 11 Non-Parametric Models of a Service System; GI/GI/1, GI/GI/n: Exact & Approximate Analysis.

Markov processes and queueing networks

Random Process Lecture 1. Fundamentals of Probability

Information theoretic perspectives on learning algorithms

1 Continuous-time chains, finite state space

Positive Harris Recurrence and Diffusion Scale Analysis of a Push Pull Queueing Network. Haifa Statistics Seminar May 5, 2008

On Finding Optimal Policies for Markovian Decision Processes Using Simulation

GLUING LEMMAS AND SKOROHOD REPRESENTATIONS

Time Reversibility and Burke s Theorem

The Transition Probability Function P ij (t)

Random Locations of Periodic Stationary Processes

Stability and Rare Events in Stochastic Models Sergey Foss Heriot-Watt University, Edinburgh and Institute of Mathematics, Novosibirsk

On Existence of Limiting Distribution for Time-Nonhomogeneous Countable Markov Process

A regeneration proof of the central limit theorem for uniformly ergodic Markov chains

THE SKOROKHOD OBLIQUE REFLECTION PROBLEM IN A CONVEX POLYHEDRON

Consistency of the maximum likelihood estimator for general hidden Markov models

On a Class of Multidimensional Optimal Transportation Problems

STATIONARY NON EQULIBRIUM STATES F STATES FROM A MICROSCOPIC AND A MACROSCOPIC POINT OF VIEW

Tales of Time Scales. Ward Whitt AT&T Labs Research Florham Park, NJ

Brownian Motion and Conditional Probability

Mean-field dual of cooperative reproduction

Introduction to self-similar growth-fragmentations

Estimation of arrival and service rates for M/M/c queue system

Optimal Transport for Data Analysis

Introduction to asymptotic techniques for stochastic systems with multiple time-scales

Chapter 2 Event-Triggered Sampling

On Reflecting Brownian Motion with Drift

Asymptotic stability of an evolutionary nonlinear Boltzmann-type equation

Intertwining of Markov processes

Nonnegativity of solutions to the basic adjoint relationship for some diffusion processes

Probability and Measure

Supplement to Hidden Rust Models

ECE 4400:693 - Information Theory

University of Toronto Department of Statistics

Spatial Ergodicity of the Harris Flows

Peacocks Parametrised by a Partially Ordered Set

Optimal Control of Partially Observable Piecewise Deterministic Markov Processes

Zero-sum semi-markov games in Borel spaces: discounted and average payoff

Generalized Hypothesis Testing and Maximizing the Success Probability in Financial Markets

Lecture 9: Deterministic Fluid Models and Many-Server Heavy-Traffic Limits. IEOR 4615: Service Engineering Professor Whitt February 19, 2015

Economy of Scale in Multiserver Service Systems: A Retrospective. Ward Whitt. IEOR Department. Columbia University

G(t) := i. G(t) = 1 + e λut (1) u=2

A Functional Central Limit Theorem for an ARMA(p, q) Process with Markov Switching

ANALYSIS OF THE LAW OF THE ITERATED LOGARITHM FOR THE IDLE TIME OF A CUSTOMER IN MULTIPHASE QUEUES

3 hours UNIVERSITY OF MANCHESTER. 22nd May and. Electronic calculators may be used, provided that they cannot store text.

Logarithmic Sobolev Inequalities

Markov Chain Approximation of Pure Jump Processes. Nikola Sandrić (joint work with Ante Mimica and René L. Schilling) University of Zagreb

Chapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued

Conditional moment representations for dependent random variables

A Note on the Central Limit Theorem for a Class of Linear Systems 1

STABILITY AND STRUCTURAL PROPERTIES OF STOCHASTIC STORAGE NETWORKS 1

TCOM 501: Networking Theory & Fundamentals. Lecture 6 February 19, 2003 Prof. Yannis A. Korilis

On hitting times and fastest strong stationary times for skip-free and more general chains

MODELING WEBCHAT SERVICE CENTER WITH MANY LPS SERVERS

Dynamic and Stochastic Brenier Transport via Hopf-Lax formulae on Was

Quality of Real-Time Streaming in Wireless Cellular Networks : Stochastic Modeling and Analysis

Logarithmic Sobolev inequalities in discrete product spaces: proof by a transportation cost distance

Technical Appendix for: When Promotions Meet Operations: Cross-Selling and Its Effect on Call-Center Performance

An elementary proof of the weak convergence of empirical processes

Optimal scaling of average queue sizes in an input-queued switch: an open problem

Some Results on the Ergodicity of Adaptive MCMC Algorithms

On the Resource/Performance Tradeoff in Large Scale Queueing Systems

IEOR 8100: Topics in OR: Asymptotic Methods in Queueing Theory. Fall 2009, Professor Whitt. Class Lecture Notes: Wednesday, September 9.

Faithful couplings of Markov chains: now equals forever

Overview of Markovian maps

Part II Probability and Measure

Advanced Computer Networks Lecture 2. Markov Processes

Walsh Diffusions. Andrey Sarantsev. March 27, University of California, Santa Barbara. Andrey Sarantsev University of Washington, Seattle 1 / 1

Queueing Theory I Summary! Little s Law! Queueing System Notation! Stationary Analysis of Elementary Queueing Systems " M/M/1 " M/M/m " M/M/1/K "

Transcription:

Stochastic relations of random variables and processes Lasse Leskelä Helsinki University of Technology 7th World Congress in Probability and Statistics Singapore, 18 July 2008

Fundamental problem of applied probability Ef (X(t)) =? What if X is complex? Asymptotics Simulation Bounds

Stochastic bounds Let X 1 and X 2 be (irreducible, positive recurrent) Markov processes with stationary distributions µ 1 and µ 2. Problem Can we show that µ 1 st µ 2 without explicitly knowing µ 1 or µ 2? Recall that µ 1 is stochastically less than µ 2, denoted µ 1 st µ 2, if f dµ1 f dµ 2 for all positive increasing f.

Sufficient condition Theorem (Whitt 1986; Massey 1987) A sufficient condition for µ 1 st µ 2 is that the transition rate kernels of X 1 and X 2 satisfy for all x y: Q 1 (x,b) Q 2 (y,b) for all upper sets B such that x,y / B Q 2 (x,b) Q 2 (y,b) for all lower sets B such that x,y / B The above condition is not sharp in general. Can we do any better?

Outline Stochastic relations Preservation of stochastic relations Maximal subrelations

Outline Stochastic relations Preservation of stochastic relations Maximal subrelations

Coupling A coupling of random elements X and Y is a bivariate random element (ˆX,Ŷ) such that: ˆX has the same distribution as X Ŷ has the same distribution as Y A coupling of probability measures µ on S 1 and ν on S 2 is a probability measure λ on S 1 S 2 having marginals µ and ν. Remark (ˆX,Ŷ ) is a coupling of X and Y if and only if P((ˆX,Ŷ) ) is a coupling of P(X ) and P(Y ).

Stochastic relations Any meaningful distributional relation should have a coupling counterpart (Hermann Thorisson).

Stochastic relations Any meaningful distributional relation should have a coupling counterpart (Hermann Thorisson). S 2 R S 1 Denote x y, if (x,y) R X st Y, if there exists a coupling (ˆX,Ŷ ) of X and Y such that ˆX Ŷ almost surely. µ st ν, if there exists a coupling λ of µ and ν such that λ(r) = 1. R st = {(µ,ν) : µ st ν} is the stochastic relation generated by R. For Dirac measures, δ x st δ y if and only if x y.

Functional characterization Theorem (Strassen 1965; L. 2008+) The following are equivalent: (i) µ st ν (ii) µ(b) ν(b ) for all compact B S 1 (iii) S 1 f dµ S 2 f dν for all upper semicontinuous compactly supported f : S 1 R + S 2 B R B = x1 B{x 2 S 2 : x 1 x 2 } B S 1 f (x 2 ) = sup f (x 1 ). x 1 :x 1 x 2

Functional characterization Theorem (Strassen 1965; L. 2008+) The following are equivalent: (i) µ st ν (ii) µ(b) ν(b ) for all compact B S 1 (iii) S 1 f dµ S 2 f dν for all upper semicontinuous compactly supported f : S 1 R + Remark If R is an order (reflexive and transitive) relation on S, then conditions (ii) and (iii) are equivalent to (ii ) µ(b) ν(b) for all measurable upper sets B, (iii ) S f dµ S f dν for all increasing measurable f : S 1 R +. (Strassen 1965; Kamae, Krengel, O Brien 1977)

Examples Stochastic equality. Let = st be the stochastic relation generated by the equality =. Then X = st Y if and only if X and Y have the same distribution. Stochastic ǫ-distance. Define x y by x y ǫ. Two real random variables satisfy X st Y if and only if for all x the corresponding c.d.f. s satisfy F Y (x ǫ) F X (x) F Y (x + ǫ). Stochastic induced order. Define x f,g y by f (x) g(y). Then µ f,g st ν if and only if µ(f 1 ((α, ))) ν(g 1 ((α, ))) for all real numbers α (Doisy 2000).

Outline Stochastic relations Preservation of stochastic relations Maximal subrelations

Monotonicity vs. relation-preservation Order relations monotone functions f : x y = f (x) f (y) General relations relation-preserving pairs of functions (f,g): x y = f (x) g(y) Stochastic relations stochastically relation-preserving pairs of probability kernels (random functions) (F, G): x y = F(x, ) st G(y, )

Preservation of stochastic relations A pair of probability kernels (P 1,P 2 ) stochastically preserves a relation R, if or equivalently, x 1 x 2 = P 1 (x 1, ) st P 2 (x 2, ) µ 1 st µ 2 = µ 1 P 1 st µ 2 P 2.

Preservation of stochastic relations A pair of probability kernels (P 1,P 2 ) stochastically preserves a relation R, if or equivalently, x 1 x 2 = P 1 (x 1, ) st P 2 (x 2, ) µ 1 st µ 2 = µ 1 P 1 st µ 2 P 2. Theorem (Zhang 1998; L. 2008+) A pair (P 1,P 2 ) stochastically preserves R if and only if there exists a probability kernel P on S 1 S 2 such that: (i) P(x, ) couples P 1 (x 1, ) and P 2 (x 2, ) for all x = (x 1,x 2 ). (ii) x R = P(x,R) = 1.

Stochastic relations of Markov processes A pair of Markov processes stochastically preserve a relation R, if x y = X(x,t) st Y (y,t) for all t, or equivalently, µ st ν = X(µ,t) st Y (ν,t) for all t.

Stochastic relations of Markov processes A pair of Markov processes stochastically preserve a relation R, if x y = X(x,t) st Y (y,t) for all t, or equivalently, µ st ν = X(µ,t) st Y (ν,t) for all t. Remark A Markov process is stochastically monotone, if x y = X(x,t) st X(y,t) for all t.

Relation-preserving Markov processes Let X 1 and X 2 be discrete-time Markov processes with transition probability kernels P 1 and P 2. Theorem (L. 2008+) The following are equivalent: (i) X 1 and X 2 stochastically preserve the relation R. (ii) P 1 (x 1,B) P 2 (x 2,B ) for all x 1 x 2 and compact B S 1. (iii) There exists a Markovian coupling of X 1 and X 2 for which R is invariant.

Relation-preserving Markov processes Let X 1 and X 2 be discrete-time Markov processes with transition probability kernels P 1 and P 2. Theorem (L. 2008+) The following are equivalent: (i) X 1 and X 2 stochastically preserve the relation R. (ii) P 1 (x 1,B) P 2 (x 2,B ) for all x 1 x 2 and compact B S 1. (iii) There exists a Markovian coupling of X 1 and X 2 for which R is invariant. Remarks If R is an order, (ii) can be replaced by (ii ) P 1 (x 1, B) P 2 (x 2, B) for all x 1 x 2 and upper sets B (Kamae, Krengel, O Brien 1977). An analogous result holds for nonexplosive Markov jump processes, generalizing the result of Whitt and Massey.

Outline Stochastic relations Preservation of stochastic relations Maximal subrelations

Stochastic subrelations Recall our starting point: Problem Can we show that the stationary distributions µ 1 and µ 2 of Markov processes X 1 and X 2 satisfy µ 1 st µ 2 without explicitly knowing µ 1 or µ 2?

Stochastic subrelations Recall our starting point: Problem Can we show that the stationary distributions µ 1 and µ 2 of Markov processes X 1 and X 2 satisfy µ 1 st µ 2 without explicitly knowing µ 1 or µ 2? The sufficient condition of Whitt and Massey essentially says that X 1 and X 2 stochastically preserve the order relation R = {(x,y) : x y}.

Stochastic subrelations Recall our starting point: Problem Can we show that the stationary distributions µ 1 and µ 2 of Markov processes X 1 and X 2 satisfy µ 1 st µ 2 without explicitly knowing µ 1 or µ 2? The sufficient condition of Whitt and Massey essentially says that X 1 and X 2 stochastically preserve the order relation R = {(x,y) : x y}. A less stringent sufficient condition: Show that X 1 and X 2 stochastically preserve a nontrivial subrelation of R.

Subrelation algorithm Given a closed relation R and continuous probability kernels P 1 and P 2, define a sequence of relations by R (0) = R, { } R (n+1) = (x,y) R (n) : (P 1 (x, ),P 2 (y, )) R (n) st, and let R = n=0 R(n).

Subrelation algorithm Given a closed relation R and continuous probability kernels P 1 and P 2, define a sequence of relations by R (0) = R, { } R (n+1) = (x,y) R (n) : (P 1 (x, ),P 2 (y, )) R (n) st, and let R = n=0 R(n). Theorem (L. 2008+) The relation R is the maximal closed subrelation of R that is stochastically preserved by (P 1,P 2 ). Especially, the pair (P 1,P 2 ) preserves a nontrivial subrelation of R if and only if R. Remark A modified algorithm works for Markov jump processes.

Application: Multilayer loss network Multiclass loss network with M k servers dedicated to class-k jobs (layer 1) N multiclass servers processing the overflow traffic (layer 2) λ 1 M 1 X 1,1 1 N X 2,1,..., X 2,K 1 M λ K K 1 X 1,K

Application: Multilayer loss network Modified system Y = (Y 1,1,...,Y 1,K ;Y 2,1,...,Y 2,K ) One class-1 server replaced by a shared server Can we show that E i,k X i,k E i,k Y i,k in steady state? Define the relation x y by i,k x i,k i,k y i,k. is not an order (different state spaces) X and Y do not preserve st But maybe (X,Y ) preserves some subrelation of st?

Application: Multilayer loss network Example Two customer classes Server configuration: M 1 = 3, M 2 = 2, N = 2 Arrival rates λ 1 = 1, λ 2 = 2 Service rate µ = 1 How many iterations do we need to compute R? X has 72 possible states Y has 90 possible states

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network What if we started with a stricter relation? Redefine x y by 0 i,k y i,k i,k x i,k 1

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network 90 80 70 60 50 y 40 30 20 10 0 0 10 20 30 40 50 60 70 x

Application: Multilayer loss network Theorem (Jonckheere & L. 2008) The processes X and Y stochastically preserve the relation R = {(x,y) : x y }, where = {0, e 2, e 2 e 1,1, 2e 2 e 1,1 }. Especially, the stationary distributions of the processes satisfy and Y 1 st X st Y, X 1,1 st Y 1,1, X 1,k = st Y 1,k for all k 1, X 2,k st Y 2,k. k k

Application: Load balancing λ 1 X 1 (t) X LB 1 (t) λ 1 + λ 2 λ 2 X 2 (t) X LB 2 (t) Common sense: E(X LB 1 (t) + X LB 2 (t)) E(X 1 (t) + X 2 (t))

Application: Load balancing λ 1 X 1 (t) X LB 1 (t) λ 1 + λ 2 λ 2 X 2 (t) X LB 2 (t) Common sense: E(X LB 1 (t) + X LB 2 (t)) E(X 1 (t) + X 2 (t)) The rate kernel pair (Q LB,Q) does not stochastically preserve: R nat = {(x,y) : x 1 y 1, x 2 y 2 } R sum = {(x,y) : x y }, where x = x 1 + x 2 How about a subrelation of R sum?

Application: Load balancing Theorem (L. 2008+) The subrelation algorithm started from R sum yields R (n) = { (x,y) : x y and x 1 x 2 y 1 y 2 + (y 1 y 2 n) +} R = {(x,y) : x y and x 1 x 2 y 1 y 2 }. Especially, (Q LB,Q) stochastically preserves the relation R. Remark R is the weak majorization order on Z 2 + X st Y if and only if Ef (X) Ef (Y ) for all coordinatewise increasing Schur-convex functions f (Marshall & Olkin 1979).

Conclusions Algorithmic probability Computational methods for analytical results Comparison without ordering State space reduction Open problems: Numerical methods for finite Markov chains Subrelations versus dependence orderings Diffusions, Feller processes, martingales,...

Discussion: Coupling vs. mass transportation W φ (µ,ν) = inf φ(x 1,x 2 )λ(dx) λ K(µ,ν) S 1 S 2 K(µ,ν) is the set of couplings of µ and ν φ µ ν W φ is a Wasserstein metric, if φ is a metric. µ st ν if and only if W φ (µ,ν) = 0 for φ(x 1,x 2 ) = 1(x 1 x 2 ). (Monge 1781, Kantorovich 1942, Wasserstein 1969, Chen 2005)

M.-F. Chen. Eigenvalues, Inequalities, and Ergodic Theory. Springer, 2005. M. Doisy. A coupling technique for stochastic comparison of functions of Markov processes. J. Appl. Math. Decis. Sci., 4(1):39 64, 2000. M. Jonckheere and L. Leskelä. Stochastic bounds for two-layer loss systems. To appear in Stoch. Models, arxiv:0708.1927, 2008+. T. Kamae, U. Krengel, and G. L. O Brien. Stochastic inequalities on partially ordered spaces. Ann. Probab., 5(6):899 912, 1977. L. Leskelä. Stochastic relations of random variables and processes. Submitted. Preprint: http://www.iki.fi/lsl/, 2008+. W. A. Massey.

Stochastic orderings for Markov processes on partially ordered spaces. Math. Oper. Res., 12(2):350 367, 1987. V. Strassen. The existence of probability measures with given marginals. Ann. Math. Statist., 36(2):423 439, 1965. H. Thorisson. Coupling, Stationarity, and Regeneration. Springer, 2000. W. Whitt. Stochastic comparisons for non-markov processes. Math. Oper. Res., 11(4):608 618, 1986. S.-Y. Zhang. Existence of ρ-optimal coupling operator for jump processes. Acta Math. Sin. (Chinese series), 41(2):393 398, 1998.