Polyhedral Approaches to Online Bipartite Matching

Alejandro Toriello, joint with Alfredo Torrico and Shabbir Ahmed
Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology
Industrial and Enterprise Systems Engineering Seminar, University of Illinois, Urbana-Champaign
October 6, 2017

Motivation
Online matching problems in the new economy:
Online search: keyword advertisement (e.g. AdWords).
Web browsing: display advertisement (banners and pop-ups).
Ride-hailing: driver assignment.
Many other revenue management applications.
We are interested here in centrally managed markets.

Problem Definition
[Figure: bipartite graph with impressions 1, 2, 3 on the left and ads a, b on the right, animated over stages t = 3, 2, 1, 0.]

Problem Definition
Bipartite graph with
$V$: right side (ads), $|V| = m$, fixed and known;
$N$: left side (impressions), $|N| = n$, types that may appear.
We sequentially observe $T$ i.i.d. uniform samples drawn from $N$.
An appearing node must be matched or discarded.
For this talk, assume $n = m = T$.
The objective is to maximize the expected number of matches; this can be generalized to the weighted case.
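The arrival process described above can be sketched as a small simulation. This is only an illustrative sketch: the function names `simulate` and `greedy`, the tie-breaking rule, and the three-impression example graph are assumptions made here, not part of the talk.

```python
import random

def simulate(neighbors, ads, T, policy, rng):
    """Run one sample path and return the number of matches made.

    neighbors: dict mapping each impression type to its set of compatible ads.
    policy:    callable (i, compatible_ads, t) -> chosen ad, or None to discard.
    """
    types = sorted(neighbors)
    remaining = set(ads)
    matches = 0
    for t in range(T, 0, -1):           # t counts the stages left, as on the slides
        i = rng.choice(types)           # i.i.d. uniform sample from N
        j = policy(i, remaining & neighbors[i], t)
        if j is not None:               # match i to j; otherwise i is discarded
            remaining.remove(j)
            matches += 1
    return matches

def greedy(i, compatible, t):
    """Hypothetical baseline: match to any compatible ad (smallest label)."""
    return min(compatible) if compatible else None

# Hypothetical instance with three impression types and two ads.
neighbors = {1: {"a"}, 2: {"a", "b"}, 3: {"b"}}
rng = random.Random(0)
runs = 2000
avg = sum(simulate(neighbors, {"a", "b"}, 3, greedy, rng) for _ in range(runs)) / runs
print("average matches under greedy:", avg)
```

Any policy with the same signature can be plugged in, so the same harness can estimate the expected number of matches for the policies discussed later.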

Online Bipartite Matching: brief review of related work
Karp/Vazirani/Vazirani (90): Adversarial model (no distribution). Randomized ranking policy is optimal, with $1 - 1/e$ competitive ratio.
Applications in online search and revenue management renew CS interest. First studied model is i.i.d.
Feldman/Mehta/Mirrokni/Muthukrishnan (09): First algorithm guarantee that beats $1 - 1/e$.
Improvements and extensions: Bahmani/Kapralov (10), Haeupler/Mirrokni/Zadimoghaddam (11), Manshadi/Oveis Gharan/Saberi (12)...
Jaillet/Lu (13): Currently best-known guarantee.
Our goal: Upper bounds via polyhedral relaxations, and their use in policy design.

Outline
Preliminaries
Static Relaxations and Policies
Time-Indexed Relaxations
Conclusions and Ongoing Work

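An Initial Relaxation
Feldman/Mehta/Mirrokni/Muthukrishnan (09)
$z_{ij}$: Probability that the policy ever matches impression $i$ to ad $j$.
$$\max_{z \ge 0} \sum_{ij \in E} z_{ij}$$
subject to
$$\sum_{i \in \Gamma(j)} z_{ij} \le 1 \quad \forall j \in V \quad \text{(can match $j$ once)},$$
$$\sum_{j \in \Gamma(i)} z_{ij} \le T/n = 1 \quad \forall i \in N \quad \text{(expect to see $i$ once)}.$$
This is the bipartite matching polytope for the expected graph.
Simple policy: Solve for a max matching, match only these edges.
Each edge is matched w.p. $1 - (1 - 1/n)^n \ge 1 - 1/e$.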
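The last bound can be checked numerically. The snippet below is a small sketch (the function name `edge_match_probability` is chosen here for illustration) that evaluates the probability that a fixed impression type appears at least once in $n$ uniform draws and compares it with $1 - 1/e$.

```python
import math

def edge_match_probability(n):
    """P(a fixed impression type appears at least once in n uniform draws)."""
    return 1.0 - (1.0 - 1.0 / n) ** n

limit = 1.0 - 1.0 / math.e  # limiting value as n grows

for n in (1, 2, 10, 100, 1000):
    # The probability decreases toward the limit but never drops below it.
    print(n, round(edge_match_probability(n), 4), edge_match_probability(n) >= limit)
```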

Improving the Relaxation
Idea: Add additional valid inequalities satisfied by feasible policies.
Similar to the achievable region approach in applied probability, e.g. Bertsimas/Niño Mora (96), Coffman/Mitrani (80).
First example (Haeupler/Mirrokni/Zadimoghaddam, 11): The upper bounds $z_{ij} \le 1 - (1 - 1/n)^n$ are valid, because impression $i$ never appears w.p. $(1 - 1/n)^n$.

Facial Dimension
Proposition: The upper bound inequalities $z_{ij} \le 1 - (1 - 1/n)^n$ are facet-defining for the polytope of achievable probabilities.
Proof sketch. Consider the following $|E|$ affinely independent policies, one per edge:
$(i, j)$: Match the edge when possible, nothing else.
$(i', j')$: Match either edge when possible, nothing else.
$(i, j')$: Match $i$ to $j$ on first appearance, then to $j'$ on second; nothing else.
$(i', j)$: Match $(i, j)$ when possible; in the last stage match $i'$ instead if it appears.
Yes, it's a full-dimensional polytope.

Right Star Inequalities
Any set $I$ of neighbors of ad $j$ never appears (no impression in $I$ arrives) w.p. $(1 - |I|/n)^n$, thus
$$\sum_{i \in I} z_{ij} \le 1 - (1 - |I|/n)^n$$
is valid.
Separate greedily in polynomial time.
Proposition: If $I \subsetneq N$, the inequality has facial dimension $|E| - |I|$.
Proposition: If $I = \Gamma(j) = N$, the inequality corresponds to $j$'s degree constraint, and is facet-defining.
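One way to read the greedy separation claim: the right-hand side depends on $I$ only through $|I|$, so for each size it suffices to take the impressions with the largest $z_{ij}$. The sketch below follows that reading; the function name, the dictionary input format, and the tolerance are assumptions for illustration, not the authors' implementation.

```python
def separate_right_star(z_col, n, tol=1e-9):
    """Look for a most violated right star inequality for one fixed ad j.

    z_col: dict mapping each neighboring impression i to the value z_ij.
    Returns (I, violation) for the most violated set found, or None.
    """
    # Sort impressions by decreasing z_ij; the best set of each size is a prefix.
    ranked = sorted(z_col, key=z_col.get, reverse=True)
    best = None
    lhs = 0.0
    for size, i in enumerate(ranked, start=1):
        lhs += z_col[i]
        rhs = 1.0 - (1.0 - size / n) ** n   # 1 - (1 - |I|/n)^n
        violation = lhs - rhs
        if violation > tol and (best is None or violation > best[1]):
            best = (ranked[:size], violation)
    return best

# Hypothetical fractional point with n = 3: the pair {1, 2} is violated.
print(separate_right_star({1: 0.5, 2: 0.5, 3: 0.0}, n=3))
```

Sorting dominates the running time, so each ad is handled in O(n log n).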

Left Star Inequalities
For any set $J$ of neighbors of impression $i$,
$$\sum_{j \in J} z_{ij} \le E[\min\{|J|, B(T, 1/n)\}]$$
is valid, where $B$ is a binomial r.v.
Same greedy separation.
Theorem: The inequality is facet-defining if $|J| < n$ or $J = \Gamma(i)$.
Proof sketch. Construct $|J|$ policies using an ordering of $J$. The corresponding sub-matrix is a circulant, thus non-singular. Apply the previous argument to the remaining edges.
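The right-hand side is a truncated binomial expectation and can be evaluated by direct summation. The following sketch (function names chosen here for illustration) also checks two special cases: $|J| = 1$ recovers the single-variable upper bound $1 - (1 - 1/n)^n$, and $|J| = n = T$ gives $E[B] = 1$.

```python
from math import comb

def expected_min_binomial(cap, trials, p):
    """E[min{cap, B(trials, p)}] by summing over the binomial pmf."""
    return sum(
        min(cap, k) * comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    )

def left_star_rhs(size_J, n, T=None):
    """Right-hand side of the left star inequality; T defaults to n as in the talk."""
    T = n if T is None else T
    return expected_min_binomial(size_J, T, 1.0 / n)

n = 5
print(left_star_rhs(1, n), 1 - (1 - 1 / n) ** n)  # the two values agree
print(left_star_rhs(n, n))                        # approximately 1
```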

Complete Subgraph Inequalities
For any $I \subseteq N$, $J \subseteq V$,
$$\sum_{ij \in E \cap (I \times J)} z_{ij} \le E[\min\{|J|, B(T, |I|/n)\}]$$
is valid.
Greedy separation for fixed $I$ or $J$, full separation with a MIP.
The inequalities are not facet-defining except in the cases previously mentioned.

What More Do We Need?
Are 0-1 inequalities enough?
[Figure: example instance with impressions 1, 2, 3 and ads a, b.]
We calculated this instance's convex hull with PORTA.
The polytope has 13 non-trivial facets, only four of which are covered by our previous inequalities.
No other facet has 0-1 structure.

Constructing Policies: DP Formulation
Define $v_t(i, S)$ as the value when impression $i$ has just appeared, ads $S \subseteq V$ remain, and $t - 1$ impressions are left after $i$.
Recursion:
$$v_t(i, S) = \max \begin{cases} \max_{j \in S \cap \Gamma(i)} \{1 + E[v_{t-1}(\eta, S \setminus j)]\} & \text{(match $i$ to $j$)} \\ E[v_{t-1}(\eta, S)] & \text{(discard $i$)} \end{cases}$$
where $\eta$ is a uniform r.v. over $N$.
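For very small instances the recursion can be evaluated directly with memoization. The sketch below is an illustration under the talk's assumptions (uniform arrivals, unit weights); the function names and the two toy graphs are chosen here, and the state space grows exponentially in the number of ads.

```python
from functools import lru_cache

def dp_value(neighbors, ads, T):
    """Return E[v_T(eta, V)], the optimal expected number of matches.

    neighbors: dict mapping each impression type to its set of compatible ads.
    """
    types = sorted(neighbors)
    n = len(types)

    @lru_cache(maxsize=None)
    def expected(t, S):
        """E[v_t(eta, S)] with eta uniform over impression types; zero when t = 0."""
        if t == 0:
            return 0.0
        return sum(value(t, i, S) for i in types) / n

    @lru_cache(maxsize=None)
    def value(t, i, S):
        """v_t(i, S): i has just appeared, ads S remain, t - 1 arrivals follow."""
        best = expected(t - 1, S)                      # discard i
        for j in S & neighbors[i]:                     # match i to j
            best = max(best, 1.0 + expected(t - 1, S - {j}))
        return best

    return expected(T, frozenset(ads))

# Two disjoint edges: the value is the expected number of distinct types seen.
print(dp_value({1: {"a"}, 2: {"b"}}, {"a", "b"}, 2))            # 1.5
# Complete bipartite graph on 2 + 2 nodes: every arrival can be matched.
print(dp_value({1: {"a", "b"}, 2: {"a", "b"}}, {"a", "b"}, 2))  # 2.0
```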

Constructing Policies: DP Formulation
The recursion is equivalent to
$$\min_{v \ge 0} E[v_T(\eta, V)]$$
subject to
$$v_t(i, S \cup j) - E[v_{t-1}(\eta, S)] \ge 1, \quad \forall\, t,\ i,\ j \in \Gamma(i),\ S \subseteq V \setminus j,$$
$$v_t(i, S) - E[v_{t-1}(\eta, S)] \ge 0, \quad \forall\, t,\ i,\ S.$$
This LP's dual is the policy LP, which projects down to the achievable probabilities polytope.
Our focus: Intelligently restrict the feasible region to connect our bounds to policies via approximations of $v$.

Simplest Policy Example: expected graph relaxation
Proposition: Suppose we value
1. each appearance of impression $i$ by $q_i \ge 0$,
2. each ad $j$ by $r_j \ge 0$.
The corresponding value of any DP state is
$$v_t(i, S) \approx q_i + (t - 1) E[q_\eta] + \sum_{j \in S} r_j.$$
Restricting the value function LP feasible region in this manner yields the dual of the expected graph relaxation, with dual variables $q$, $r$.

Simplest Policy Example: expected graph relaxation
[Figure: example graph with impressions 1, 2, 3 labeled by $q_i$ and ads a, b labeled by $r_j$.]
When $i$ appears and ads $S$ remain, choose
$$\arg\min_{j \in S \cap \Gamma(i)} r_j,$$
i.e. if possible choose a non-cover ad.
Ranking Policy: Choose some priority order for ads, match the lowest compatible ad.
This policy is a cover ranking: use any priority order which places cover ads above non-cover ads.
It follows from Goel/Mehta (10) that this policy has approximation guarantee $1 - 1/e$.
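A ranking policy of this kind is short to implement. In the sketch below, the set of cover ads is assumed to be given as input (for example, read off a dual solution of the expected graph relaxation); the function names and the alphabetical tie-breaking are illustrative choices, not taken from the talk.

```python
def cover_ranking(ads, cover):
    """Priority order that places cover ads above non-cover ads.

    Returns a dict ad -> rank, where rank 0 is the lowest priority.
    Ties within each class are broken by label (an arbitrary choice).
    """
    ordered = sorted(ads, key=lambda j: (j in cover, j))  # non-cover ads first
    return {j: position for position, j in enumerate(ordered)}

def ranking_policy(rank):
    """Build a policy that matches the lowest-ranked compatible ad, if any."""
    def policy(i, compatible):
        if not compatible:
            return None                      # nothing available: discard i
        return min(compatible, key=rank.__getitem__)
    return policy

# Hypothetical example: ad "a" is in the cover, ad "b" is not.
rank = cover_ranking({"a", "b"}, cover={"a"})
policy = ranking_policy(rank)
print(policy(1, {"a", "b"}))  # b: the non-cover ad is used first
print(policy(1, {"a"}))       # a: the cover ad is used only when needed
```

Because the rank is fixed in advance, each decision costs one pass over the compatible ads; a time-dependent ranking would simply swap in a different `rank` per stage.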

Improved Policies: using new inequalities
Theorem: The expected graph relaxation with single-variable upper bounds corresponds to a ranking policy.
This ranking has three priority classes, instead of two.
Theorem: The expected graph relaxation with right star inequalities corresponds to a time-dependent ranking policy, i.e. the ranking may vary by epoch $t$.
No such simple policy is known for left star or complete subgraph inequalities.

Computational Experiments
Geometric mean of gap w.r.t. benchmark for different instance types.

| Bound/Policy | 20-Cycle | 200-Cycle | Small | Large Dense | Large Sparse |
|---|---|---|---|---|---|
| EG | 1.27 | 1.27 | 1.32 | 1.00 | 1.22 |
| EG + UB | 1.27 | 1.27 | 1.09 | 1.00 | 1.12 |
| EG + RS | 1.13 | 1.10 | 1.05 | 1.00 | 1.08 |
| EG + LS | 1.16 | 1.14 | 1.06 | 1.00 | 1.10 |
| EG + RS + LS | 1.13 | 1.10 | 1.05 | 1.00 | 1.07 |
| Sim. Bound | 1.05 | 1 | 1.00 | 1 | 1 |
| DP | 1 | - | 1 | - | - |
| Matching | 0.83 | 0.80 | 0.86 | 0.64 | 0.78 |
| 2-Matching | 0.99 | 0.95 | 0.96 | 0.75 | 0.88 |
| Cover ranking | 0.99 | 0.95 | 0.99 | 0.92 | 0.93 |
| UB ranking | 0.99 | 0.95 | 1.00 | 0.92 | 0.94 |
| TD ranking | 0.99 | 0.95 | 1.00 | 0.95 | 0.96 |

Small n = 10, large n = 100.
From Feldman/Mehta/Mirrokni/Muthukrishnan (09)

Computational Experiments: static relaxation takeaways
Significant gap left to close, especially in sparser instances.
Right stars close more of the gap than left stars, in contrast to the facial dimension results.
Ranking-type policies outperform more naïve matching heuristics, yet are still simple to implement.

Adding a Time Index
Assume the graph is complete w.l.o.g. (add edges with zero weight).
$z^t_{ij}$: Probability that the policy matches $i$ to $j$ in stage $t$.
Original variables: $z_{ij} = \sum_t z^t_{ij}$.
Simple example: $\sum_{j \in V} z^t_{ij} \le 1/n$.
Proposition: These inequalities are facet-defining in the space of $z^t_{ij}$.
Add over $t$ to get $\sum_{j \in V} z_{ij} \le 1$.

Adding a Time Index: a more involved example
Matching $i$ to $j$ in $t$ implies two independent events:
$$P(i \text{ matched to } j \text{ in } t) \le P(i \text{ appears in } t)\, P(j \text{ not matched in } [t+1, n]).$$
Equivalent to
$$\sum_{k \in N} \sum_{\tau > t} z^\tau_{kj} + n z^t_{ij} \le 1.$$
Proposition: These inequalities are facet-defining for $t \le n - 1$ and any $i, j$.
Add over $i$ for $t = 1$ to get $\sum_{i \in N} z_{ij} \le 1$.
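The step from the probabilistic statement to the linear inequality can be written out as follows. This is a short derivation sketch that uses only the definitions on these slides: stages are indexed by the number of arrivals remaining, so stages $\tau > t$ precede stage $t$, and the events "ad $j$ is matched to $k$ in stage $\tau$" are disjoint across $(k, \tau)$.

```latex
\begin{align*}
z^t_{ij}
  &= P(i \text{ matched to } j \text{ in } t) \\
  &\le P(i \text{ appears in } t)\, P(j \text{ not matched in } [t+1, n]) \\
  &= \frac{1}{n} \Bigl( 1 - \sum_{k \in N} \sum_{\tau > t} z^\tau_{kj} \Bigr).
\end{align*}
% Multiplying by n and rearranging gives
\[
  \sum_{k \in N} \sum_{\tau > t} z^\tau_{kj} + n z^t_{ij} \le 1 .
\]
% Summing over i in N at t = 1 and dividing by n:
\[
  \sum_{k \in N} \sum_{\tau > 1} z^\tau_{kj} + \sum_{i \in N} z^1_{ij}
  = \sum_{i \in N} z_{ij} \le 1 .
\]
```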

Adding a Time Index: bound dominance
Theorem: The same inequalities also imply the right-star static inequalities.
Therefore, the LP
$$\max_{z \ge 0} \sum_{i,j,t} w_{ij} z^t_{ij}$$
subject to
$$\sum_{j \in V} z^t_{ij} \le 1/n,$$
$$\sum_{k \in N} \sum_{\tau > t} z^\tau_{kj} + n z^t_{ij} \le 1$$
beats the expected graph relaxation even with right-star inequalities.
The LP has $\Theta(n^3)$ variables and constraints, versus $\Theta(n^2)$ and $\Theta(n 2^n)$.

Dynamic Policy
The LP corresponds to the approximation
$$v_t(i, S) \approx q^t_i + \sum_{\tau < t} E_\eta[q^\tau_\eta] + \sum_{j \in S} \Bigl( r^t_{ij} + \sum_{\tau < t} E_\eta[r^\tau_{\eta j}] \Bigr)$$
with appropriate dynamic values $q$, $r$.
In state $(t, i, S)$, the corresponding policy chooses
$$\arg\min_{j \in S \cap \Gamma(i)} \sum_{\tau < t} E_\eta[r^\tau_{\eta j}].$$
Dynamic version of the previous policies.
Structurally similar to dynamic bid price policies in revenue management (e.g. Adelman, 07).

Computational Experiments

| Bound/Policy | Small | Large Dense | Large Sparse |
|---|---|---|---|
| EG + RS | 1.08 | 1.02 | 1.08 |
| TI-LP | 1.05 | 0.98 | 1.03 |
| Sim. Bound | 1.02 | 1 | 1 |
| DP | 1 | - | - |
| Dynamic Policy | 1.00 | 0.93 | 0.97 |
| TD Ranking | 0.97 | 0.88 | 0.92 |

Small n = 10, large n = 50.
New LP and policy improve across the board.
Much longer LP solve times.

Pushing the Probabilistic Argument
For pairs of impressions $i_1, i_2$ and ads $j_1 \ne j_2$,
$$\sum_{k \in N} \sum_{\tau > t+1} z^\tau_{kj} + n \bigl( z^{t+1}_{i_1 j_1} + z^{t+1}_{i_1 j_2} \bigr) + n^2 \bigl( z^t_{i_2 j_1} + z^t_{i_2 j_2} \bigr) \le 1 + n$$
is valid for $t \le n - 2$.
Theorem: These inequalities can be generalized for sets of $h \le n - 1$ ads over $h$ stages, and are valid and facet-defining.
Separation is NP-hard (version of weighted bi-clique problem).
Further generalizations are possible, all facet-defining.
In additional tests, they are not impactful empirically compared to those with $h = 1$.

Conclusions
Polyhedral relaxations offer a new, unexplored way to derive bounds for dynamic matching and resource allocation.
They also offer insight into policy design.
Even for bipartite matching, the structure appears very complex vis-à-vis the deterministic case.
Must combine combinatorial and probabilistic techniques.
atoriello@isye.gatech.edu
www.isye.gatech.edu/~atoriello3