Mixed Robust/Average Submodular Partitioning
Kai Wei¹, Rishabh Iyer¹, Shengjie Wang², Wenruo Bai¹, Jeff Bilmes¹
¹Department of Electrical Engineering, University of Washington
²Department of Computer Science, University of Washington
{kaiwei, rkiyer, wangsj, wrbai, bilmes}@u.washington.edu

Abstract

We investigate two novel mixed robust/average-case submodular data partitioning problems that we collectively call Submodular Partitioning. These problems generalize purely robust instances of the problem, namely max-min submodular fair allocation (SFA) [8] and min-max submodular load balancing (SLB) [15], and also average-case instances, that is, the submodular welfare problem (SWP) [16] and submodular multiway partition (SMP) [2]. While the robust versions have been studied in the theory community [7, 8, 11, 15, 16], existing work has focused on tight approximation guarantees, and the resultant algorithms are not generally scalable to large real-world applications. This is in contrast to the average case, where most of the algorithms are scalable. In the present paper, we bridge this gap by proposing several new algorithms (including greedy and relaxation algorithms) that not only scale to large datasets but also achieve theoretical approximation guarantees comparable to the state-of-the-art. We moreover provide new scalable algorithms that apply to additive combinations of the robust and average-case objectives.

1 Introduction

This paper studies two new mixed robust/average-case partitioning problems of the following form:

Problem 1: max_{π∈Π} [ λ̄ min_i f_i(A_i^π) + (λ/m) Σ_j f_j(A_j^π) ],
Problem 2: min_{π∈Π} [ λ̄ max_i f_i(A_i^π) + (λ/m) Σ_j f_j(A_j^π) ],

where 0 ≤ λ ≤ 1, λ̄ ≜ 1 − λ, the set of sets π = (A_1^π, A_2^π, …, A_m^π) is a partition of a finite set V (i.e., ∪_i A_i^π = V and A_i^π ∩ A_j^π = ∅ for all i ≠ j), and Π refers to the set of all partitions of V into m blocks. The parameter λ controls the objective: λ = 1 is the average case, λ = 0 is the robust case, and 0 < λ < 1 is a mixed case.
In general, Problems 1 and 2 are hopelessly intractable, even to approximate, but we assume that f_1, f_2, …, f_m are all monotone nondecreasing (i.e., f_i(S) ≤ f_i(T) whenever S ⊆ T), normalized (f_i(∅) = 0), and submodular [6] (i.e., for all S, T ⊆ V, f_i(S) + f_i(T) ≥ f_i(S ∪ T) + f_i(S ∩ T)). These assumptions allow us to develop fast, simple, and scalable algorithms that have approximation guarantees, as is done in this paper. These assumptions, moreover, allow us to retain the naturalness and applicability of Problems 1 and 2 to a wide variety of practical problems. Submodularity is a natural property in many real-world ML applications [13, 10, 12, 17]. When minimizing, submodularity naturally models notions of interacting costs and complexity, while when maximizing it readily models notions of diversity, summarization quality, and information. Hence, Problem 1 asks for a partition whose blocks each (and that collectively) are a good, say, summary of the whole. Problem 2, on the other hand, asks for a partition whose blocks each (and that collectively) are internally homogeneous (as is typical in clustering). Taken together, we call Problems 1 and 2 Submodular Partitioning. We further categorize these problems depending on whether the f_i's are identical to each other (homogeneous) or not (heterogeneous).¹

¹Similar sub-categorizations have been called the uniform vs. the non-uniform case in the past [15, 7].
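As a concrete illustration of the two objectives, the following sketch (our own toy example, not from the paper) builds monotone submodular coverage functions and evaluates the mixed objectives of Problems 1 and 2 for a fixed partition; the ground set, coverage sets, and partition below are illustrative assumptions.

```python
# Illustrative sketch: monotone submodular coverage functions f_i and the two
# mixed objectives. All instance data below is assumed for illustration.
from itertools import chain

V = list(range(6))  # ground set V = {0, ..., 5}

# f(S) = number of distinct "concepts" covered by S (monotone, submodular)
covers = {0: {"a"}, 1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"d"}, 5: {"d", "e"}}

def coverage(S):
    return len(set(chain.from_iterable(covers[j] for j in S)))

m = 2
fs = [coverage, coverage]  # homogeneous setting: all f_i identical

def problem1_objective(partition, lam):
    """(1 - lam) * min_i f_i(A_i) + (lam / m) * sum_j f_j(A_j)  (to maximize)."""
    vals = [f(A) for f, A in zip(fs, partition)]
    return (1 - lam) * min(vals) + lam * sum(vals) / m

def problem2_objective(partition, lam):
    """(1 - lam) * max_i f_i(A_i) + (lam / m) * sum_j f_j(A_j)  (to minimize)."""
    vals = [f(A) for f, A in zip(fs, partition)]
    return (1 - lam) * max(vals) + lam * sum(vals) / m

pi = [{0, 1, 2}, {3, 4, 5}]            # a partition of V into m = 2 blocks
print(problem1_objective(pi, 0.0))     # robust case:  min_i f_i(A_i)
print(problem2_objective(pi, 1.0))     # average case: (1/m) sum_i f_i(A_i)
```

Varying `lam` between 0 and 1 trades off the robust and average-case terms exactly as λ does in the problem statements above.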
Table 1: Summary of our contributions and existing work on Problems 1 and 2.² See text for details.

Problem 1 (Max-(Min+Avg)), approximation factor:
- λ = 0, MATCHING [8]: 1/(n − m + 1)
- λ = 0, ELLIPSOID [7]: O(√n · m^{1/4} log n log^{3/2} m)
- λ = 0, BINSRCH [11]: 1/(2m − 1)
- λ = 0, GREEDSAT*: (1/2 − δ, δ/(1/2 + δ))
- λ = 1, GREEDWELFARE [5]: 1/2
- 0 ≤ λ ≤ 1, GENERALGREEDSAT*: λ/2
- λ = 0, Hardness: 1/2 [8]
- λ = 1, Hardness: 1 − 1/e [16]

Problem 2 (Min-(Max+Avg)), approximation factor:
- λ = 0, SAMPLING† [15]: O(√(n/log n))
- λ = 0, ELLIPSOID [7]: O(√n log n)
- λ = 0, LOVÁSZROUND*: m
- λ = 1, GREEDSPLIT† [18, 14]: 2
- λ = 1, RELAX [2]: O(log n)
- 0 ≤ λ ≤ 1, GENERALLOVÁSZROUND*: m
- λ = 0, Hardness*: m
- λ = 1, Hardness: 2 − 2/m [4]

Previous Work: Special cases of Problems 1 and 2 have appeared previously. Problem 1 with λ = 0 is called submodular fair allocation (SFA), and Problem 2 with λ = 0 is called submodular load balancing (SLB), robust optimization problems both of which have been studied previously. For SLB, even in the homogeneous setting, [15] show that the problem is information-theoretically hard to approximate within o(√(n/log n)). They give a sampling-based algorithm achieving an O(√(n/log n)) approximation for the homogeneous setting. However, the sampling-based algorithm is neither practical nor scalable, since it involves solving, in the worst case, O(n³ log n) instances of submodular function minimization. Another approach approximates each submodular function by its ellipsoid approximation (again non-scalable) and reduces SLB to its modular version (minimum makespan scheduling), leading to an approximation factor of O(√n log n) [7]. SFA, on the other hand, has been studied mostly in the heterogeneous setting. When the f_i's are all modular, the tightest algorithm, so far, iteratively rounds an LP solution, achieving an O(1/(√m log³ m)) approximation [1], whereas the problem is NP-hard to approximate within 1/2 + ɛ for any ɛ > 0 [8]. When the f_i's are submodular, [8] gives a matching-based algorithm with a factor 1/(n − m + 1) approximation that performs poorly when m ≪ n.
[11] proposes a binary search algorithm yielding an improved factor of 1/(2m − 1). Similar to SLB, [7] applies the same ellipsoid approximation techniques, leading to a factor of O(√n · m^{1/4} log n log^{3/2} m). These approaches are theoretically interesting, but they do not scale to large problems. Problems 1 and 2 with λ = 1 have also been previously studied. Problem 2 becomes the submodular multiway partition (SMP), for which one can obtain a relaxation-based 2-approximation [2] in the homogeneous case. In the heterogeneous case, the guarantee is O(log n) [3]. Similarly, [18, 14] propose a greedy splitting 2-approximation algorithm for the homogeneous setting. Problem 1 becomes the submodular welfare problem [16], for which a scalable greedy algorithm achieves a 1/2 approximation [5]. Unlike the worst case (λ = 0), many of the algorithms proposed for these problems are scalable. The general case (0 < λ < 1) of Problems 1 and 2 differs from either of these extreme cases, since we wish for both a robust and an average-case partitioning, and controlling λ allows one to trade off between the two.

Our contributions: In contrast to Problems 1 and 2 in the average case (i.e., λ = 1), existing algorithms for the worst case (λ = 0) are not scalable. This paper closes this gap by proposing new classes of algorithmic frameworks to solve SFA and SLB: (1) greedy algorithms; and (2) a Lovász extension based relaxation algorithm. For SFA under the heterogeneous setting, we propose a saturate greedy algorithm (GREEDSAT) that iteratively solves instances of the submodular welfare problem. We show that GREEDSAT has a bi-criterion guarantee of (1/2 − δ, δ/(1/2 + δ)), which ensures that at least ⌈(1/2 − δ)m⌉ blocks receive utility at least δ/(1/2 + δ) · OPT for any 0 < δ < 1/2. For SLB, we first generalize the hardness result in [15] and show that it is hard to approximate better than m for any m = o(√(n/log n)), even in the homogeneous setting.
We then give a Lovász extension based relaxation algorithm (LOVÁSZROUND) yielding a tight factor of m for the heterogeneous setting. As far as we know, this is the first algorithm achieving a factor of m for SLB in this setting. Next we give algorithms that handle the generalizations of SFA and SLB to Problems 1 and 2. In particular, we generalize GREEDSAT, leading to GENERALGREEDSAT, to solve Problem 1. GENERALGREEDSAT provides a guarantee that smoothly interpolates in terms of λ between the bi-criterion factor of GREEDSAT in the case λ = 0 and the constant factor of 1/2 of the greedy algorithm in the case λ = 1. For Problem 2 we introduce GENERALLOVÁSZROUND, which generalizes LOVÁSZROUND and obtains an m-approximation for general λ. The theoretical contributions and the existing work for Problems 1 and 2 are summarized in Table 1.

2 Robust Submodular Partitioning (Problems 1 and 2 when λ = 0)

Notation: we define f(j|S) ≜ f(S ∪ {j}) − f(S) as the gain of j ∈ V in the context of S ⊆ V. We assume w.l.o.g. that the ground set is V = {1, 2, …, n}.

²Results obtained in this paper are marked with *. Methods that apply only to the homogeneous setting are marked with †.
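The gain notation just introduced can be made concrete with a small sketch (our own example, not from the paper): it computes f(j|S) and spot-checks the equivalent diminishing-returns property f(j|S) ≥ f(j|T) for S ⊆ T, using the monotone submodular function f(S) = √|S|.

```python
# Illustrative sketch of the gain f(j|S) = f(S ∪ {j}) − f(S); the function
# f(S) = sqrt(|S|) is an assumed example of a monotone submodular function.
import math

def f(S):
    return math.sqrt(len(S))

def gain(j, S):
    """f(j|S): the gain of adding j in the context of S."""
    return f(S | {j}) - f(S)

S, T = {1}, {1, 2, 3}          # S is a subset of T
print(gain(4, S), gain(4, T))  # the gain shrinks as the context grows
```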
GREEDSAT for SFA: We give a greedy-style algorithm, Saturate Greedy, to solve SFA (GREEDSAT, see Alg. 1). Similar in flavor to the one proposed in [12], GREEDSAT defines an intermediate objective F̄^c(π) = Σ_{i=1}^m f̄_i^c(A_i^π), where f̄_i^c(A) = (1/m) min{f_i(A), c} (Line 2). The parameter c controls the saturation in each block. Each f̄_i^c satisfies submodularity. Unlike SFA, the combinatorial optimization problem max_{π∈Π} F̄^c(π) (Line 6) is much easier and is an instance of the submodular welfare problem (SWP) [16].

Algorithm 1: GREEDSAT
1: Input: {f_i}_{i=1}^m, V, α.
2: Let F̄^c(π) = (1/m) Σ_{i=1}^m min{f_i(A_i^π), c}.
3: Let c_min = 0, c_max = min_i f_i(V).
4: while c_max − c_min ≥ ɛ do
5:   c = (c_max + c_min)/2
6:   π̂^c ∈ argmax_{π∈Π} F̄^c(π)
7:   if F̄^c(π̂^c) < αc then
8:     c_max = c
9:   else
10:    c_min = c; π̂ ← π̂^c
11:  end if
12: end while
13: Output: π̂.

In this work, we solve Line 6 by the efficient greedy algorithm described in [5], which has a factor of 1/2. It should be clear that one can also implement a computationally expensive multi-linear relaxation algorithm, as given in [16], to solve it with a tight factor of (1 − 1/e). Setting the input argument α = 1/2 as the approximation factor for Line 6, the essential idea of GREEDSAT is to perform a binary search over the parameter c to find the largest c such that the returned solution π̂^c for the instance of SWP satisfies F̄^c(π̂^c) ≥ αc. GREEDSAT terminates after solving O(log(min_i f_i(V)/ɛ)) instances of SWP. Theorem 2.1 gives a bi-criterion optimality guarantee.

Theorem 2.1. Given ɛ > 0, 0 ≤ α ≤ 1, and any 0 < δ < α, GREEDSAT finds a partition such that at least ⌈(α − δ)m⌉ blocks receive utility at least δ/(1 − α + δ) (max_{π∈Π} min_i f_i(A_i^π) − ɛ).

For any 0 < δ < α, Theorem 2.1 ensures that the top-valued ⌈(α − δ)m⌉ blocks in the partition returned by GREEDSAT are (δ/(1 − α + δ) − ɛ)-optimal. δ controls the trade-off between the number of top-valued blocks to bound and the performance guarantee attained for these blocks. The smaller δ is, the more top blocks are bounded, but with a weaker guarantee.
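The binary-search structure of Alg. 1 can be sketched as runnable code. Two simplifying assumptions of ours: the submodular welfare step (Line 6) is solved by the simple greedy of [5] (assign each item to the block with the largest marginal gain of the truncated objective), and the toy instance (two identical √|S| valuations over four items) is purely illustrative, not the paper's implementation.

```python
# Sketch of GREEDSAT (Alg. 1) under the assumptions stated in the lead-in.
import math

def greedsat(fs, V, alpha=0.5, eps=1e-3):
    m = len(fs)

    def F_c(blocks, c):
        # Intermediate objective F^c(pi) = (1/m) sum_i min{f_i(A_i), c}
        return sum(min(f(A), c) for f, A in zip(fs, blocks)) / m

    def swp_greedy(c):
        # Greedy 1/2-approximation for max_pi F^c(pi): each item goes to the
        # block whose truncated value it increases the most.
        blocks = [set() for _ in range(m)]
        for j in V:
            gains = [min(fs[i](blocks[i] | {j}), c) - min(fs[i](blocks[i]), c)
                     for i in range(m)]
            blocks[gains.index(max(gains))].add(j)
        return blocks

    # Binary search over the saturation parameter c (Lines 3-12).
    c_min, c_max = 0.0, min(f(set(V)) for f in fs)
    best = swp_greedy(c_max)   # fallback partition (a sketch choice of ours)
    while c_max - c_min >= eps:
        c = (c_max + c_min) / 2
        pi_c = swp_greedy(c)
        if F_c(pi_c, c) < alpha * c:
            c_max = c                # saturation level c is too ambitious
        else:
            c_min, best = c, pi_c    # feasible: keep the partition, raise c
    return best

fs = [lambda S: math.sqrt(len(S))] * 2   # homogeneous toy instance, m = 2
pi = greedsat(fs, list(range(4)))
print([sorted(A) for A in pi])
```

On this symmetric instance the search settles on a balanced partition, as the min-of-saturated-values objective rewards spreading value across blocks.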
We set the input argument α = 1/2 as the worst-case performance guarantee for solving SWP so that the above theoretical analysis follows. However, the worst-case factor is often attained only by very contrived examples of submodular functions. For the ones used in practice, the greedy algorithm often leads to a near-optimal solution [12]. Setting α as the actual performance guarantee for SWP (often very close to 1), the bound can be significantly improved. In practice, we suggest setting α = 1.

LOVÁSZROUND for SLB (Problem 2 with λ = 0): In this section, we investigate the problem of SLB. The existing hardness result in [15] is o(√(n/log n)), which is independent of m and implicitly assumes that m = Θ(√(n/log n)). However, the applications for SLB often depend on m, which is typically much smaller than n. We offer a hardness analysis in terms of m in the following theorem.

Theorem 2.2. For any ɛ > 0, SLB cannot be approximated to a factor of (1 − ɛ)m for any m = o(√(n/log n)) with a polynomial number of queries, even in the homogeneous setting.

For the rest of the paper, we assume m = o(√(n/log n)) for SLB, unless stated otherwise. Theorem 2.2 implies that SLB is hard to approximate better than m. However, an arbitrary partition π ∈ Π already achieves the best approximation factor of m that one can hope for in the homogeneous setting, since max_i f(A_i^π) ≤ f(V) ≤ Σ_i f(A_i^π) ≤ m max_i f(A_i^π) for any π ∈ Π.

LOVÁSZROUND: Therefore we consider only the heterogeneous setting for SLB, for which we propose a tight algorithm, LOVÁSZROUND (see Alg. 2). The algorithm proceeds as follows: (1) apply the Lovász extension of submodular functions to relax SLB to a convex program, which is exactly solved for a fractional solution (Line 2); (2) map the fractional solution to a partition using the θ-rounding technique proposed in [9] (Lines 3-6).

Algorithm 2: LOVÁSZROUND
1: Input: {f_i}_{i=1}^m, {f̃_i}_{i=1}^m, V.
2: Solve for {x_i}_{i=1}^m via convex relaxation.
3: Rounding: Let A_1 = ∅, …, A_m = ∅.
4: for j = 1, …, n do
5:   î ∈ argmax_i x_i(j); A_î = A_î ∪ {j}
6: end for
7: Output {A_i}_{i=1}^m.

The Lovász extension, which naturally connects a submodular function f with its convex relaxation f̃, is defined as follows: given any x ∈ [0,1]^n, we obtain a permutation σ_x by ordering its elements in non-increasing order, and thereby a chain of sets S_0^{σ_x}, …, S_n^{σ_x} with S_j^{σ_x} = {σ_x(1), …, σ_x(j)}
for j = 1, …, n. The Lovász extension f̃ of f is the weighted sum of the ordered entries of x: f̃(x) = Σ_{j=1}^n x(σ_x(j)) (f(S_j^{σ_x}) − f(S_{j−1}^{σ_x})). Given the convexity of the f̃_i's, SLB is relaxed to the following convex program:

min_{x_1,…,x_m ∈ [0,1]^n} max_i f̃_i(x_i),  s.t. Σ_{i=1}^m x_i(j) ≥ 1, for j = 1, …, n   (1)

Denoting the optimal solution of Eqn. 1 as {x̂_1, …, x̂_m}, the θ-rounding step simply maps each item j ∈ V to a block î such that î ∈ argmax_i x̂_i(j). We obtain the bound for LOVÁSZROUND as follows:

Theorem 2.3. LOVÁSZROUND achieves a worst-case approximation factor of m.

We remark that, to the best of our knowledge, LOVÁSZROUND is the first algorithm that is tight and that gives an approximation in terms of m for the heterogeneous setting.

3 General Submodular Partitioning (Problems 1 and 2 when 0 < λ < 1)

Lastly, we study Problem 1 and Problem 2 in the most general case, i.e., 0 < λ < 1. We use the proposed algorithms for SFA (Problem 1 with λ = 0) and SLB (Problem 2 with λ = 0) as the building blocks to design algorithms for the general scenarios (0 < λ < 1). We first generalize the proposed GREEDSAT, leading to GENERALGREEDSAT, for Problem 1. For Problem 2 we generalize LOVÁSZROUND to obtain a Lovász extension based algorithm. Almost the same as GREEDSAT, we define an intermediate objective F̄_λ^c(π) = (1/m) Σ_{i=1}^m min{λ̄ f_i(A_i^π) + (λ/m) Σ_j f_j(A_j^π), c} in GENERALGREEDSAT. Following the same algorithmic design as GREEDSAT, GENERALGREEDSAT differs from GREEDSAT only in Line 6, where the submodular welfare problem is defined on the intermediate objective F̄_λ^c(π). Letting α be the guarantee for solving submodular welfare, GENERALGREEDSAT approximates Problem 1 with the following bounds:

Theorem 3.1. Given ɛ > 0, α, and 0 ≤ λ ≤ 1, GENERALGREEDSAT finds a partition π̂ that satisfies λ̄ min_i f_i(A_i^π̂) + (λ/m) Σ_{i=1}^m f_i(A_i^π̂) ≥ λα(OPT − ɛ), where OPT = max_{π∈Π} [λ̄ min_i f_i(A_i^π) + (λ/m) Σ_{i=1}^m f_i(A_i^π)]. Moreover, let F̄_{λ,i}(π) = λ̄ f_i(A_i^π) + (λ/m) Σ_j f_j(A_j^π). Given any 0 < δ < α, there is a set I ⊆ {1, …, m} such that |I| ≥ ⌈(α − δ)m⌉ and F̄_{λ,i}(π̂) ≥ max{δ/(1 − α + δ), λα}(OPT − ɛ) for all i ∈ I.

Theorem 3.1 generalizes Theorem 2.1: when λ = 0, it recovers the bi-criterion guarantee of Theorem 2.1 for the worst-case scenario. Moreover, Theorem 3.1 implies that the factor of α for the average-case objective can almost be recovered by GENERALGREEDSAT if λ = 1. It also gives an improved guarantee as λ increases, suggesting that Problem 1 becomes easier as the mixed objective weights the average-case objective more. We also point out that the optimality guarantee of GENERALGREEDSAT smoothly interpolates between the two extreme cases in terms of λ.

Next we focus on Problem 2 for general λ. We generalize LOVÁSZROUND, leading to GENERALLOVÁSZROUND. Almost the same as LOVÁSZROUND, GENERALLOVÁSZROUND differs only in Line 2, where Problem 2 is relaxed as the following convex program:

min_{x_1,…,x_m ∈ [0,1]^n} λ̄ max_i f̃_i(x_i) + (λ/m) Σ_{j=1}^m f̃_j(x_j),  s.t. Σ_{i=1}^m x_i(j) ≥ 1, for j = 1, …, n   (2)

After solving for the fractional solution {x̂_i}_{i=1}^m to the convex program, GENERALLOVÁSZROUND then rounds it to a partition using the same rounding technique as LOVÁSZROUND. The following theorem holds:

Theorem 3.2. GENERALLOVÁSZROUND is guaranteed to find a partition π̂ ∈ Π such that λ̄ max_i f_i(A_i^π̂) + (λ/m) Σ_j f_j(A_j^π̂) ≤ m · min_{π∈Π} [λ̄ max_i f_i(A_i^π) + (λ/m) Σ_j f_j(A_j^π)].

Theorem 3.2 generalizes Theorem 2.3 when λ = 0. Moreover, we achieve a factor of m by GENERALLOVÁSZROUND for any λ. Though the approximation guarantee is independent of λ, the algorithm naturally exploits the trade-off between the worst-case and average-case objectives in terms of λ.
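The two computational pieces shared by LOVÁSZROUND and GENERALLOVÁSZROUND can be sketched as code (our own illustration): evaluating the Lovász extension from its definition, and the θ-rounding step î ∈ argmax_i x_i(j). Solving the convex programs (Eqns. 1-2) would require a generic solver, so a fractional solution is taken as given here, and the submodular function is an assumed example.

```python
# Sketch of the Lovász extension and θ-rounding, under the assumptions above.
import math

def lovasz_extension(f, x):
    """f~(x) = sum_j x(sigma_x(j)) * (f(S_j) - f(S_{j-1})),
    where sigma_x orders the entries of x in non-increasing order."""
    sigma = sorted(range(len(x)), key=lambda j: -x[j])
    val, S, f_prev = 0.0, set(), 0.0   # f_prev = f(empty set) = 0 (normalized)
    for j in sigma:
        S.add(j)
        f_cur = f(S)
        val += x[j] * (f_cur - f_prev)
        f_prev = f_cur
    return val

def theta_round(xs):
    """Round a fractional solution x_1..x_m (lists indexed by V) to a
    partition: item j joins a block maximizing x_i(j)."""
    m, n = len(xs), len(xs[0])
    blocks = [set() for _ in range(m)]
    for j in range(n):
        blocks[max(range(m), key=lambda i: xs[i][j])].add(j)
    return blocks

f = lambda S: math.sqrt(len(S))   # assumed monotone submodular example
# On a 0/1 indicator vector, the Lovász extension agrees with f itself:
print(lovasz_extension(f, [1.0, 1.0, 0.0, 0.0]))   # f({0, 1}) = sqrt(2)
print(theta_round([[0.8, 0.2, 0.5, 0.1], [0.2, 0.8, 0.5, 0.9]]))
```

Since the constraint Σ_i x_i(j) ≥ 1 forces max_i x_i(j) ≥ 1/m, rounding by argmax is what yields the factor-m guarantee of Theorem 2.3.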
References

[1] A. Asadpour and A. Saberi. An approximation algorithm for max-min fair allocation of indivisible goods. In SICOMP.
[2] C. Chekuri and A. Ene. Approximation algorithms for submodular multiway partition. In FOCS.
[3] C. Chekuri and A. Ene. Submodular cost allocation problem and applications. In Automata, Languages and Programming. Springer.
[4] A. Ene, J. Vondrák, and Y. Wu. Local distribution and the symmetry gap: Approximability of multiway partitioning problems. In SODA.
[5] M. Fisher, G. Nemhauser, and L. Wolsey. An analysis of approximations for maximizing submodular set functions II. In Polyhedral Combinatorics.
[6] S. Fujishige. Submodular Functions and Optimization, volume 58. Elsevier.
[7] M. Goemans, N. Harvey, S. Iwata, and V. Mirrokni. Approximating submodular functions everywhere. In SODA.
[8] D. Golovin. Max-min fair allocation of indivisible goods. Technical Report CMU-CS.
[9] R. Iyer, S. Jegelka, and J. Bilmes. Monotone closure of relaxed constraints in submodular optimization: Connections between minimization and maximization: Extended version.
[10] S. Jegelka and J. Bilmes. Submodularity beyond submodular energies: coupling edges in graph cuts. In CVPR.
[11] S. Khot and A. Ponnuswami. Approximation algorithms for the max-min allocation problem. In APPROX.
[12] A. Krause, B. McMahan, C. Guestrin, and A. Gupta. Robust submodular observation selection. In JMLR.
[13] M. Li, D. Andersen, and A. Smola. Graph partitioning via parallel submodular approximation to accelerate distributed machine learning. arXiv preprint.
[14] M. Narasimhan, N. Jojic, and J. A. Bilmes. Q-clustering. In NIPS.
[15] Z. Svitkina and L. Fleischer. Submodular approximation: Sampling-based algorithms and lower bounds. In FOCS.
[16] J. Vondrák. Optimal approximation for the submodular welfare problem in the value oracle model. In STOC.
[17] K. Wei, R. Iyer, and J. Bilmes. Submodularity in data subset selection and active learning. In ICML.
[18] L. Zhao, H. Nagamochi, and T. Ibaraki. On generalized greedy splitting algorithms for multiway partition problems. Discrete Applied Mathematics, 143(1).
More informationUsing EM To Estimate A Probablity Density With A Mixture Of Gaussians
Using EM To Estiate A Probablity Density With A Mixture Of Gaussians Aaron A. D Souza adsouza@usc.edu Introduction The proble we are trying to address in this note is siple. Given a set of data points
More informationImproved Guarantees for Agnostic Learning of Disjunctions
Iproved Guarantees for Agnostic Learning of Disjunctions Pranjal Awasthi Carnegie Mellon University pawasthi@cs.cu.edu Avri Blu Carnegie Mellon University avri@cs.cu.edu Or Sheffet Carnegie Mellon University
More informationModel Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon
Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential
More informationBayes Decision Rule and Naïve Bayes Classifier
Bayes Decision Rule and Naïve Bayes Classifier Le Song Machine Learning I CSE 6740, Fall 2013 Gaussian Mixture odel A density odel p(x) ay be ulti-odal: odel it as a ixture of uni-odal distributions (e.g.
More informationRANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION
RANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION GUANGHUI LAN AND YI ZHOU Abstract. In this paper, we consider a class of finite-su convex optiization probles defined over a distributed
More informationRobustness and Regularization of Support Vector Machines
Robustness and Regularization of Support Vector Machines Huan Xu ECE, McGill University Montreal, QC, Canada xuhuan@ci.cgill.ca Constantine Caraanis ECE, The University of Texas at Austin Austin, TX, USA
More informationA Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness
A Note on Scheduling Tall/Sall Multiprocessor Tasks with Unit Processing Tie to Miniize Maxiu Tardiness Philippe Baptiste and Baruch Schieber IBM T.J. Watson Research Center P.O. Box 218, Yorktown Heights,
More informationCombining Classifiers
Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/
More informationA Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay
A Low-Coplexity Congestion Control and Scheduling Algorith for Multihop Wireless Networks with Order-Optial Per-Flow Delay Po-Kai Huang, Xiaojun Lin, and Chih-Chun Wang School of Electrical and Coputer
More informationNyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison
yströ Method vs : A Theoretical and Epirical Coparison Tianbao Yang, Yu-Feng Li, Mehrdad Mahdavi, Rong Jin, Zhi-Hua Zhou Machine Learning Lab, GE Global Research, San Raon, CA 94583 Michigan State University,
More information1 Proof of learning bounds
COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #4 Scribe: Akshay Mittal February 13, 2013 1 Proof of learning bounds For intuition of the following theore, suppose there exists a
More informationNecessity of low effective dimension
Necessity of low effective diension Art B. Owen Stanford University October 2002, Orig: July 2002 Abstract Practitioners have long noticed that quasi-monte Carlo ethods work very well on functions that
More informationExamples are not Enough, Learn to Criticize! Criticism for Interpretability
Exaples are not Enough, Learn to Criticize! Criticis for Interpretability Been Ki Allen Institute for AI beenki@csail.it.edu Rajiv Khanna UT Austin rajivak@utexas.edu Oluwasani Koyejo UIUC sani@illinois.edu
More information3.8 Three Types of Convergence
3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to
More informationPolygonal Designs: Existence and Construction
Polygonal Designs: Existence and Construction John Hegean Departent of Matheatics, Stanford University, Stanford, CA 9405 Jeff Langford Departent of Matheatics, Drake University, Des Moines, IA 5011 G
More informationHandout 7. and Pr [M(x) = χ L (x) M(x) =? ] = 1.
Notes on Coplexity Theory Last updated: October, 2005 Jonathan Katz Handout 7 1 More on Randoized Coplexity Classes Reinder: so far we have seen RP,coRP, and BPP. We introduce two ore tie-bounded randoized
More informationDesign of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding
IEEE TRANSACTIONS ON INFORMATION THEORY (SUBMITTED PAPER) 1 Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding Lai Wei, Student Meber, IEEE, David G. M. Mitchell, Meber, IEEE, Thoas
More informationUniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval
Unifor Approxiation and Bernstein Polynoials with Coefficients in the Unit Interval Weiang Qian and Marc D. Riedel Electrical and Coputer Engineering, University of Minnesota 200 Union St. S.E. Minneapolis,
More informationThe Simplex Method is Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate
The Siplex Method is Strongly Polynoial for the Markov Decision Proble with a Fixed Discount Rate Yinyu Ye April 20, 2010 Abstract In this note we prove that the classic siplex ethod with the ost-negativereduced-cost
More informationGrafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space
Journal of Machine Learning Research 3 (2003) 1333-1356 Subitted 5/02; Published 3/03 Grafting: Fast, Increental Feature Selection by Gradient Descent in Function Space Sion Perkins Space and Reote Sensing
More informationNew upper bound for the B-spline basis condition number II. K. Scherer. Institut fur Angewandte Mathematik, Universitat Bonn, Bonn, Germany.
New upper bound for the B-spline basis condition nuber II. A proof of de Boor's 2 -conjecture K. Scherer Institut fur Angewandte Matheati, Universitat Bonn, 535 Bonn, Gerany and A. Yu. Shadrin Coputing
More informationComputable Shell Decomposition Bounds
Coputable Shell Decoposition Bounds John Langford TTI-Chicago jcl@cs.cu.edu David McAllester TTI-Chicago dac@autoreason.co Editor: Leslie Pack Kaelbling and David Cohn Abstract Haussler, Kearns, Seung
More informationCompression and Predictive Distributions for Large Alphabet i.i.d and Markov models
2014 IEEE International Syposiu on Inforation Theory Copression and Predictive Distributions for Large Alphabet i.i.d and Markov odels Xiao Yang Departent of Statistics Yale University New Haven, CT, 06511
More informationBest Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
Best Ar Identification: A Unified Approach to Fixed Budget and Fixed Confidence Victor Gabillon, Mohaad Ghavazadeh, Alessandro Lazaric To cite this version: Victor Gabillon, Mohaad Ghavazadeh, Alessandro
More informationTransductive Learning of Structural SVMs via Prior Knowledge Constraints
Transductive Learning of Structural SVMs via Prior Knowledge Constraints Chun-Na Yu Dept. of Coputing Science & AICML University of Alberta, AB, Canada chunna@cs.ualberta.ca Abstract Reducing the nuber
More informationSupport Vector Machines. Machine Learning Series Jerry Jeychandra Blohm Lab
Support Vector Machines Machine Learning Series Jerry Jeychandra Bloh Lab Outline Main goal: To understand how support vector achines (SVMs) perfor optial classification for labelled data sets, also a
More informationEfficient Learning of Generalized Linear and Single Index Models with Isotonic Regression
Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression Sha M Kakade Microsoft Research and Wharton, U Penn skakade@icrosoftco Varun Kanade SEAS, Harvard University vkanade@fasharvardedu
More informationSupport Vector Machines. Maximizing the Margin
Support Vector Machines Support vector achines (SVMs) learn a hypothesis: h(x) = b + Σ i= y i α i k(x, x i ) (x, y ),..., (x, y ) are the training exs., y i {, } b is the bias weight. α,..., α are the
More informationarxiv: v1 [cs.ds] 29 Jan 2012
A parallel approxiation algorith for ixed packing covering seidefinite progras arxiv:1201.6090v1 [cs.ds] 29 Jan 2012 Rahul Jain National U. Singapore January 28, 2012 Abstract Penghui Yao National U. Singapore
More informationEstimating Entropy and Entropy Norm on Data Streams
Estiating Entropy and Entropy Nor on Data Streas Ait Chakrabarti 1, Khanh Do Ba 1, and S. Muthukrishnan 2 1 Departent of Coputer Science, Dartouth College, Hanover, NH 03755, USA 2 Departent of Coputer
More informationarxiv: v1 [cs.dc] 19 Jul 2011
DECENTRALIZED LIST SCHEDULING MARC TCHIBOUKDJIAN, NICOLAS GAST, AND DENIS TRYSTRAM arxiv:1107.3734v1 [cs.dc] 19 Jul 2011 ABSTRACT. Classical list scheduling is a very popular and efficient technique for
More informationAcyclic Colorings of Directed Graphs
Acyclic Colorings of Directed Graphs Noah Golowich Septeber 9, 014 arxiv:1409.7535v1 [ath.co] 6 Sep 014 Abstract The acyclic chroatic nuber of a directed graph D, denoted χ A (D), is the iniu positive
More informationCh 12: Variations on Backpropagation
Ch 2: Variations on Backpropagation The basic backpropagation algorith is too slow for ost practical applications. It ay take days or weeks of coputer tie. We deonstrate why the backpropagation algorith
More informationA Markov Framework for the Simple Genetic Algorithm
A arkov Fraework for the Siple Genetic Algorith Thoas E. Davis*, Jose C. Principe Electrical Engineering Departent University of Florida, Gainesville, FL 326 *WL/NGS Eglin AFB, FL32542 Abstract This paper
More informationComplex Quadratic Optimization and Semidefinite Programming
Coplex Quadratic Optiization and Seidefinite Prograing Shuzhong Zhang Yongwei Huang August 4 Abstract In this paper we study the approxiation algoriths for a class of discrete quadratic optiization probles
More informationGraphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes
Graphical Models in Local, Asyetric Multi-Agent Markov Decision Processes Ditri Dolgov and Edund Durfee Departent of Electrical Engineering and Coputer Science University of Michigan Ann Arbor, MI 48109
More informationRobust polynomial regression up to the information theoretic limit
58th Annual IEEE Syposiu on Foundations of Coputer Science Robust polynoial regression up to the inforation theoretic liit Daniel Kane Departent of Coputer Science and Engineering / Departent of Matheatics
More informationThe Weierstrass Approximation Theorem
36 The Weierstrass Approxiation Theore Recall that the fundaental idea underlying the construction of the real nubers is approxiation by the sipler rational nubers. Firstly, nubers are often deterined
More informationPattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition
Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition
More informationAPPROXIMATION BOUNDS FOR SPARSE PRINCIPAL COMPONENT ANALYSIS
APPROXIMATION BOUNDS FOR SPARSE PRINCIPAL COMPONENT ANALYSIS ALEXANDRE D ASPREMONT, FRANCIS BACH, AND LAURENT EL GHAOUI ABSTRACT. We produce approxiation bounds on a seidefinite prograing relaxation for
More informationComputational and Statistical Learning Theory
Coputational and Statistical Learning Theory TTIC 31120 Prof. Nati Srebro Lecture 2: PAC Learning and VC Theory I Fro Adversarial Online to Statistical Three reasons to ove fro worst-case deterinistic
More informationPattern Recognition and Machine Learning. Artificial Neural networks
Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2016 Lessons 7 14 Dec 2016 Outline Artificial Neural networks Notation...2 1. Introduction...3... 3 The Artificial
More informationExact tensor completion with sum-of-squares
Proceedings of Machine Learning Research vol 65:1 54, 2017 30th Annual Conference on Learning Theory Exact tensor copletion with su-of-squares Aaron Potechin Institute for Advanced Study, Princeton David
More informationFeature Extraction Techniques
Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that
More information