Dynamic Matching Models - PDF Free Download

Dynamic Matching Models Ana Bušić Inria Paris - Rocquencourt CS Department of École normale supérieure joint work with Varun Gupta, Jean Mairesse and Sean Meyn 3rd Workshop on Cognition and Control January 16 & 17, 2015, University of Florida 1 / 29

Bipartite matching model Model description Dynamic Bipartite Matching Model Static model long history in economics Finding Stable Matches 2012 Nobel Prize awarded to L. S. Shapley. Dynamic model introduced by Caldentey, Kaplan, and Weiss (2009) Multiclass queueing model Supply/Demand play symmetric roles Discrete time queueing model with two types of arrival: supply and demand. Arrival of Supply/Demand is i.i.d., with A D (t) = A S (t) for all t Instantaneous matchings according to a bipartite matching graph. Supply/Demand that cannot be matched are stored in a buffer. 2 / 29

Bipartite matching model Model description Matching in Health-care TA L K I N G A B O U T T R A N S P L A N TAT I O N Matching Kidneys and Donors Who can join this program? For recipients: If you are eligible for a kidney transplant and are receiving care at a transplant center in the United States, you can join... You must have a living donor who is willing and medically able to donate his or her kidney... For donors: You must also be willing to take part... U N I T E D N E T W O R K F O R O R G A N S H A R I N G 3 / 29

Bipartite matching model Model description Matching Policies Model specified by 1) Matching graph, 2) Joint probability measure µ for arrivals of Supply/Demand, and 4 / 29

Bipartite matching model Model description Matching Policies Model specified by 1) Matching graph, 2) Joint probability measure µ for arrivals of Supply/Demand, and 3) A matching policy. 4 / 29

Bipartite matching model Model description Matching Policies Model specified by 1) Matching graph, 2) Joint probability measure µ for arrivals of Supply/Demand, and 3) A matching policy. Admissible policies State feedback: Decision U(t) depends only on buffers Q(t) and immediate arrivals A(t), Q(t + 1) = Q(t) U(t) + A(t) Match according to graph, maintaining U D (t) = U S (t) Q D (t) = Q S (t) for all t 4 / 29

Bipartite matching model Necessary conditions Necessary stability conditions For a matching graph (D, S, E) we denote: D(s) = {d D : (d, s) E}, S(d) = {s S : (d, s) E}. Necessary conditions: If the model is stable then the marginals of µ satisfy NCond : { µd (U) < µ S (S(U)), U D µ S (V ) < µ D (D(V )), V S 5 / 29

Bipartite matching model Necessary conditions Proof Proof using network flow arguments: N = ( D S {i, f }, E {(i, d), d D} {(s, f ), s S} ). Capacities: µ D (d) for (i, d), µ S (s) for (s, f ), for (d, s). Lemma. 1 There exists a flow of value 1 in N iff µ satisfies NCond (< replaced by in NCond). 2 There exists a flow T of value 1 such that T (d, s) > 0 for all (d, s) E iff µ satisfies NCond. D S i f 6 / 29

Bipartite matching model Necessary conditions Proof of Lemma 2 Follows easily from connectivity of the matching graph. Fix η such that 0 < η < 1/ E. A strictly positive flow of value E η: η for (x, y) = (d, s) E T η(x, y) = S(d) η for (x, y) = (i, d) D(s) η for (x, y) = (s, f ). Define: µ D (d) = µ D (d) S(d) η 1 E η, µ S (s) = µ S(s) D(s) η 1 E η. For η small enough, µ D, µ S are probability measures satisfying NCond. For µ D, µ S there exists a flow T of value 1. A strictly positive flow of value 1: T = T η + (1 E η) T. 7 / 29

Bipartite matching model Necessary conditions Verification algorithm The pair (µ D, µ S ) satisfies NCond iff the pair ( µ D, µ S ) satisfies NCond for η strictly positive and small enough. Run MaxFlow on the input (N, µ D, µ S ) by considering η as a formal parameter as small as needed. Quantities of type: x + yη for x, y R. Addition: (x 1 + y 1 η) + (x 2 + y 2 η) = (x 1 + x 2 ) + (y 1 + y 2 )η. Comparisons: [ x1 + y 1 η = x 2 + y 2 η ] [ x 1 = x 2, y 1 = y 2 ] [ x1 + y 1 η < x 2 + y 2 η ] [ ] (x 1 < x 2 ) or (x 1 = x 2, y 1 < y 2 ) 8 / 29

Bipartite matching model Necessary conditions Example D 1 2 3 4 1 2 3 4 S 4 3 2 matching graph 1 4 3 2 arrival graph 1 Consider any µ with supp(µ) = F. We have µ S ({1, 2 }) = µ(3, 1 ) + µ(4, 2 ) µ D ({3, 4}), which contradicts NCond for U = {3, 4}. 9 / 29

Bipartite matching model Connectivity properties Connectivity properties Consider a bipartite matching structure (D, S, E, F ). Associated directed graph: the nodes are D S and the arcs are d s, if (d, s) E, s d, if (d, s) F. C 1 2 3 4 1 2 3 4 S 4 3 2 1 matching graph (D, S, E) 4 3 2 1 arrival graph (D, S, F ) associated directed graph 10 / 29

Bipartite matching model Connectivity properties Connectivity properties Thm. For a bipartite matching structure (D, S, E, F ) the following properties are equivalent: 1 There exists µ such that supp(µ) = F, supp(µ D ) = D, supp(µ S ) = S and µ satisfies NCond. 2 The associated directed graph is strongly connected. Thm. If the associated directed graph of (D, S, E, F ) is strongly connected, then any bipartite matching model [(D, S, E, F ), µ, Pol] has a unique strictly connected component with all states leading to it. 11 / 29

Bipartite matching model Sufficient conditions State space decomposition The state space can be decomposed into facets, defined only by the non-empty classes. Def. A facet is an ordered pair (U, V ) such that: U D, V S and U V (D S E). 1 2 3 1 2 3 3 2 1 3 2 1 facet ({3}, {3 }) facet ({2, 3}, {3 }) For a facet F = (U, V ), define: D (F) = U, D (F) = D(V ), D (F) = D (D (F) D (F)) S (F) = V, S (F) = S(U), S (F) = S (S (F) S (F)). 12 / 29

Bipartite matching model Sufficient conditions Sufficient conditions Conditions SCond: µ D (D (F)) + µ S (S (F)) > 1 µ(e D (F) S (F)), F (, ) Prop. (Sufficient conditions) A bipartite model with probability µ satisfying SCond is stable under any admissible matching policy. Proof. Variation of the linear Lyapunov function (number of unmatched customers): D D D S 1 0 0 S 0 0 or 1 1 S 0 1 1 13 / 29

Bipartite matching model Sufficient conditions Sufficient conditions Def. A facet F is called saturated if D (F) = or S (F) =. SCond = NCond (considering only the saturated facets). 1 2 3 1 2 3 3 2 1 non-saturated For the NN graph: SCond = {NCond (µ D (1) + µ S (1 ) > 1 µ(2, 2 ))}. For µ = µ D µ S and µ D = µ S = (x, y, 1 x y): 3 2 1 saturated NCond : { x < 0.5 2x + y > 1 SCond : { NCond 2x + y 2 > 1 14 / 29

Policy-specific results Match the Longest has maximal stability region Match the Longest has maximal stability region Match the Longest (ML) policy: a newly arriving customer of class c is matched to a server in S(c) with the largest buffer (similarly for newly arriving server). Thm. For any bipartite graph, ML has a maximal stability region. Proof: Quadratic Lyapunov function: L(x, y) = d D x 2 d + s S y 2 s. ML minimizes the value of this Lyapunov function at each step. Facet-dependent randomized policy. In a non-zero facet F: the server s S (F) is matched to d D (F) D(s) with probability Psd F. These probabilities can be chosen such that: d D, µ S (s)psd F > µ D (d). (symmetrically for customers) s S(d) For this randomized policy stability can be shown using Foster-Lyapunov criterion. 15 / 29

Policy-specific results Priorities are not always stable Priorities and Match the Shortest are not always stable Prop. NN model with either the MS policy or the PR (priority) policy such that customers of class 1 (resp. servers of class 1 ) give priority to servers of class 2 (resp. to customers of class 2): C 1 2 3 S 3 2 1 For both policies, the stability region is not maximal. Consider µ D = (1/3, 2/5, 4/15), µ S = µ D, and µ = µ D µ S. NCond are satisfied, but the Markov chain is transient (for MS or PR as above). 16 / 29

Policy-specific results Priorities are not always stable Stability region for Match the shortest 17 / 29

Workload Stabilizability Optimization Cost function c on buffer levels. Average-cost: η = lim sup N N 1 1 N t=0 E [ c(q(t)) ] 18 / 29

Workload Stabilizability Optimization Cost function c on buffer levels. Average-cost: η = lim sup N N 1 1 N t=0 E [ c(q(t)) ] Queue dynamics: Q(t + 1) = Q(t) U(t) + A(t), t 0 18 / 29

Workload Stabilizability Optimization Cost function c on buffer levels. Average-cost: η = lim sup N N 1 1 N t=0 E [ c(q(t)) ] Queue dynamics: Q(t + 1) = Q(t) U(t) + A(t), t 0 Input process U represents the sequence of matching activities. Input space: { } U = n e u e : n e Z + e E with u e = 1 i + 1 j for e = (i, j) E. X (t) = Q(t) + A(t) the state process of the MDP model. X (t + 1) = X (t) U(t) + A(t + 1) The state space X = {x Z l + : ξ 0 x = 0} with ξ 0 = (1,..., 1, 1,..., 1). 18 / 29

Workload Stabilizability Workload For any D D, corresponding workload vector ξ D defined so that ξ D x = xi D xj S i D j S(D) 19 / 29

Workload Stabilizability Workload For any D D, corresponding workload vector ξ D defined so that ξ D x = xi D xj S i D j S(D) Necessary and sufficient condition for a stabilizing policy: NCond: δ D := ξ D α > 0 for each D α = E[A(t)] arrival rate vector. Why is this workload? Consistent with routing/scheduling models: Fluid model, d x(t) = u(t) + α dt The minimal time to reach the origin from x(0) = x: T (x) = max D ξ D x δ D 19 / 29

Workload Workload Relaxation Workload Dynamics Fix one workload vector ξ D ; denote (ξ, δ) for (ξ D, δ D ). Workload W (t) = ξ X (t) 20 / 29

Workload Workload Relaxation Workload Dynamics Fix one workload vector ξ D ; denote (ξ, δ) for (ξ D, δ D ). Workload W (t) = ξ X (t) can be positive or negative. 20 / 29

Workload Workload Relaxation Workload Dynamics Fix one workload vector ξ D ; denote (ξ, δ) for (ξ D, δ D ). Workload W (t) = ξ X (t) can be positive or negative. Dynamics as in other queueing models, E[W (t + 1) W (t) X (t), U(t)] δ 20 / 29

Workload Workload Relaxation Relaxations A workload relaxation takes this as the model for control: One Dimensional Workload relaxation, Ŵ (t + 1) = Ŵ (t) δ + I (t) }{{} Idleness 0 + N(t + 1) }{{} Zero mean 21 / 29

Workload Workload Relaxation Relaxations A workload relaxation takes this as the model for control: One Dimensional Workload relaxation, Ŵ (t + 1) = Ŵ (t) δ + I (t) }{{} Idleness 0 Effective cost c : R R + : Given a cost function c for Q, Conclusions c(w) = min{c(x) : ξ x = w} + N(t + 1) }{{} Zero mean piecewise linear if c is linear Control of the relaxation = inventory model of Clark & Scarf Hedging policy, with threshold r: Idling is not permitted unless Ŵ (t) < r 21 / 29

Workload Examples Tracking the Relaxation x D 1 x D 2 x D 3 e 1 e 2 e 3 e 4 e 5 x S 3 x S 2 x S 1 22 / 29

Workload Examples Tracking the Relaxation x D 1 x D 2 x D 3 ξ 1 = (0, 0, 1, 1, 0, 0) T e 1 e 2 e 3 e 4 e 5 x S 3 x S 2 x S 1 22 / 29

Workload Examples Tracking the Relaxation x D 1 x D 2 x D 3 e 1 e 2 e 3 e 4 e 5 x S 3 x S 2 x S 1 ξ 1 = (0, 0, 1, 1, 0, 0) T W (t) = Q D 3 (t) Q S 1 (t) Relaxation: Matching of Supply 1 and Demand 2 allowed only if W (t) < r Example 1 Cost: c(x) = x D 1 + 2x D 2 + 3x D 3 + 3x S 1 + 2x S 2 + x S 3 = Effective Cost: c(w) = 4 w 22 / 29

Workload Examples Tracking the Relaxation x D 1 x D 2 x D 3 e 1 e 2 e 3 e 4 e 5 x S 3 x S 2 x S 1 ξ 1 = (0, 0, 1, 1, 0, 0) T W (t) = Q D 3 (t) Q S 1 (t) Relaxation: Matching of Supply 1 and Demand 2 allowed only if W (t) < r Example 2 Cost: c(x) = 3x D 1 + 2x D 2 + x D 3 + 3x S 1 + 2x S 2 + x S 3 = Effective Cost: c(w) = max(2w, 5w) 22 / 29

Workload Tracking the Relaxation Examples W (t) = Q3D (t) Q1S (t) xd1 Example 1 Cost: c(x) = x1d + 2x2D + 3x3D + 3x1S + 2x2S + x3s = Effective Cost: c (w ) = 4 w e1 xs3 xd2 e2 e3 xs2 xd3 e4 e5 xs1 Matching of Supply 1 and Demand 2 allowed only if W (t) < r 23 / 29

Workload Tracking the Relaxation Examples W (t) = Q3D (t) Q1S (t) xd1 Example 1 Cost: c(x) = x1d + 2x2D + 3x3D + 3x1S + 2x2S + x3s Supply Demand = Effective Cost: c (w ) = 4 w e1 xs3 QD1 QD2 QD3 QS1 QS2 QS3 xd2 e2 e3 xs2 xd3 e4 e5 xs1 Matching of Supply 1 and Demand 2 allowed only if W (t) < r 23 / 29

Workload Tracking the Relaxation Examples W (t) = Q3D (t) Q1S (t) xd1 Example 1 Cost: c(x) = x1d + 2x2D + 3x3D + 3x1S + 2x2S + x3s Demand = Effective Cost: c (w ) = 4 w QD1 QD2 e1 xd2 e2 xs3 QD3 e3 xs2 xd3 e4 e5 xs1 Matching of Supply 1 and Demand 2 allowed only if W (t) < r Supply Workload Relaxation: S Q1 S Q2 S Q3 Q1S (t) = Q2S (t) = 0 if W (t) > 0 Q2D (t) = Q3D (t) = 0 if W (t) < 0 23 / 29

Workload Tracking the Relaxation Examples W (t) = Q3D (t) Q1S (t) xd1 Example 1 Cost: c(x) = x1d + 2x2D + 3x3D + 3x1S + 2x2S + x3s = Effective Cost: c (w ) = 4 w e3 xs2 xd3 e4 e5 xs1 Matching of Supply 1 and Demand 2 allowed only if W (t) < r r = 14.9 74 e2 xs3 Average Cost Estimated in Simulation: 76 e1 xd2 72 Workload Relaxation: 70 68 66 Q1S (t) = Q2S (t) = 0 if W (t) > 0 Q2D (t) = Q3D (t) = 0 if W (t) < 0 64 62 0 5 10 15 20 25 30 r Simulation with r = 14.9 23 / 29

Workload Tracking the Relaxation Examples W (t) = Q3D (t) Q1S (t) xd1 Example 1 Cost: c(x) = x1d + 2x2D + 3x3D + 3x1S + 2x2S + x3s = Effective Cost: c (w ) = 4 w Average Cost Comparisons: e3 xs2 xd3 e4 e5 xs1 Matching of Supply 1 and Demand 2 allowed only if W (t) < r Priority 100 Workload Relaxation: MaxWeight Threshold (15) 50 2 e2 xs3 150 1 e1 xd2 3 4 T 5 6 x 10 Q1S (t) = Q2S (t) = 0 if W (t) > 0 Q2D (t) = Q3D (t) = 0 if W (t) < 0 Simulation with r = 14.9 23 / 29

Workload Tracking the Relaxation Examples W (t) = Q3D (t) Q1S (t) xd1 Example 2 Cost: c(x) = 3x1D + 2x2D + x3d + 3x1S + 2x2S + x3s Supply Demand = Effective Cost: c (w ) = max(2w, 5w ) QD1 QD2 QD3 QS1 QS2 QS3 e1 xs3 xd2 e2 e3 xs2 xd3 e4 e5 xs1 Matching of Supply 1 and Demand 2 allowed only if W (t) < r 24 / 29

Workload Tracking the Relaxation Examples W (t) = Q3D (t) Q1S (t) xd1 Example 2 Cost: c(x) = 3x1D + 2x2D + x3d + 3x1S + 2x2S + x3s Demand = Effective Cost: c (w ) = max(2w, 5w ) QD1 QD2 QD3 e1 xd2 e2 xs3 e3 xs2 xd3 e4 e5 xs1 Matching of Supply 1 and Demand 2 allowed only if W (t) < r Supply Workload Relaxation: S Q1 S Q2 S Q3 Q1S (t) = Q2S (t) = 0 if W (t) > 0 Q1D (t) = Q3D (t) = 0 if W (t) < 0 24 / 29

Workload Tracking the Relaxation Examples W (t) = Q3D (t) Q1S (t) xd1 Example 2 Cost: c(x) = 3x1D + 2x2D + x3d + 3x1S + 2x2S + x3s = Effective Cost: c (w ) = max(2w, 5w ) Average Cost Estimated in Simulation: 45 e1 xd2 e2 xs3 e3 xs2 xd3 e4 e5 xs1 Matching of Supply 1 and Demand 2 allowed only if W (t) < r r = 7.2 44 Workload Relaxation: 43 42 41 40 39 0 2 4 6 8 10 12 14 Q1S (t) = Q2S (t) = 0 if W (t) > 0 Q1D (t) = Q3D (t) = 0 if W (t) < 0 r Simulation with r = 7.2 24 / 29

Workload Tracking the Relaxation Examples W (t) = Q3D (t) Q1S (t) xd1 Example 2 Cost: c(x) = 3x1D + 2x2D + x3d + 3x1S + 2x2S + x3s = Effective Cost: c (w ) = max(2w, 5w ) Average Cost Comparisons: Priority 50 MaxWeight 40 20 0 1 2 3 4 xs3 e3 xs2 e4 e5 xs1 Workload Relaxation: Threshold (7) 30 e2 xd3 Matching of Supply 1 and Demand 2 allowed only if W (t) < r 70 60 e1 xd2 T 5 6 x 10 Q1S (t) = Q2S (t) = 0 if W (t) > 0 Q1D (t) = Q3D (t) = 0 if W (t) < 0 Simulation with r = 7.2 24 / 29

h-maxweight and Approximate Optimality h-maxweight h-maxweight Policy: U(t) = φ MW (Q(t)) φ MW (x) = arg min E[ h (x) (t + 1) X (t) = x, U(t) = u] u where (t + 1) = X (t + 1) X (t) = U(t) + A(t + 1) Average drift: φ MW (x) + α = E[ (t + 1) X (t) = x] = E[ U(t) + A(t + 1) X (t) = x] 25 / 29

h-maxweight and Approximate Optimality h-maxweight h-maxweight Policy: U(t) = φ MW (Q(t)) φ MW (x) = arg min E[ h (x) (t + 1) Q(t) = x, U(t) = u] u where (t + 1) = X (t + 1) X (t) = U(t) + A(t + 1) Hope that h approximates solution to a dynamic programming equation For average cost optimality, this means, E[h(Q(t + 1)) h(q(t)) Q(t) = x] = h (x) [ φ MW (x) + α] }{{} Average drift: φ MW (x) + α = c(x) ] + 1 2 [ (t E + 1) T 2 h ( X ) (t + 1) }{{} bounded E[ (t + 1) X (t) = x] = E[ U(t) + A(t + 1) X (t) = x] 25 / 29

h-maxweight and Approximate Optimality Asymptotic optimality Family of arrival processes {A δ (t)} parameterized by Additional assumptions: (A1) For one set D D we have ξ D α δ = δ, where α δ denotes the mean of A δ (t). Moreover, there is a fixed constant δ > 0 such that ξ D α δ δ for any D D, D D, and δ [0, δ ]. (A2) The distributions are continuous at δ = 0, with linear rate: For some constant b, E[ A δ (t) A 0 (t) ] bδ. (1) (A3) The sets E and F do not depend upon δ, and the graph associated with E is connected. Moreover, there exists i 0 S(D), j 0 D c, and ε I > 0 such that P{A δ i 0 (t) 1 and A δ j 0 (t) 1} ε I, 0 δ δ. (2) 26 / 29

h-maxweight and Approximate Optimality Asymptotic optimality There is a function h such that, under Assumptions (A1) (A3), for sufficiently large κ > 0, β > 0, and sufficiently small δ + > 0 (each independent of δ), the average cost η under the h-maxweight policy satisfies, ˆη η η ˆη + O(1) where η is the optimal average cost for the MDP model, ˆη is the optimal average cost for the workload relaxation, and the constant O(1) does not depend upon δ. The average cost for the relaxation satisfies the uniform bound, ˆη = ˆη + O(1) where ˆη is the optimal cost for the diffusion approx. for the relaxation: ˆη = 1 ( Θ c log 1 + c ) +, where c 1 Θ = 1 σ 2 2 δ. 27 / 29

Final remarks Final remarks/related work Performance bounds? Approximate optimal control for relaxations in higher dimensions? 28 / 29

Final remarks Final remarks/related work Performance bounds? Approximate optimal control for relaxations in higher dimensions? More general arrival assumptions. Admission control? Non-bipartite matching? Networks? 28 / 29

References References Related models Tassiulas, Ephremides, IEEE TAC 1992. McKeown, Mekkitikul, Anantharam, Walrand, IEEE Trans. Comm. 1999. Bipartite matching model Caldentey, Kaplan, Weiss, Adv. Appl. Probab. 2009. Adan & Weiss, Operations Research, 2012. Bušić, Gupta, Mairesse, Stability of the bipartite matching model. Adv. Appl. Probab. 2013. Bušić, Meyn, Optimization of Dynamic Matching Models. ArXiv:1411.1044. 2014. Adan, Bušić, Mairesse, Weiss, Reversibility of the FCFS bipartite matching model. In preparation. Workload relaxations Meyn, Stability and asymptotic optimality of generalized MaxWeight policies. SIAM J. Control Optim., 2009. Meyn, Control Techniques for Complex Networks. Cambridge University Press, 2007. 29 / 29