Multi-Objective Maximization of Monotone Submodular Functions with Cardinality Constraint


Rajan Udwani (rudwani@mit.edu)

Abstract

We consider the problem of multi-objective maximization of monotone submodular functions subject to a cardinality constraint k, one formulation of which is max_{|A| ≤ k} min_{i ∈ {1,...,m}} f_i(A). Krause et al. (2008) showed that when the number of functions m grows as fast as the cardinality, i.e., m = Ω(k), the problem is inapproximable (unless P = NP). For the more general case of matroid constraint, Chekuri et al. (2010) gave a randomized (1 − 1/e) − ε approximation for constant m. The runtime (number of queries to the function oracle) scales exponentially as n^{m/ε^3}. We give the first polynomial time asymptotically constant factor approximations for m = o(k/log^3 k): (i) a randomized (1 − 1/e) approximation based on Chekuri et al. (2010); (ii) a faster and more practical Õ(n/δ^3) time, randomized (1 − 1/e)^2 − δ approximation based on Multiplicative-Weight-Updates. Finally, we characterize the variation in optimal solution value as a function of the cardinality k, leading to a derandomized approximation for constant m.

1 Introduction

A set function f : 2^N → R on the ground set N is called submodular when f(A + a) − f(A) ≤ f(B + a) − f(B) for all B ⊆ A ⊆ N and a ∈ N \ A. The function is monotone if f(B) ≤ f(A) for all B ⊆ A. W.l.o.g. we assume f(∅) = 0, which combined with monotonicity implies non-negativity. The problem of maximizing a monotone submodular function subject to a cardinality constraint,

P_0 := max{ f(A) : A ⊆ N, |A| ≤ k },

has been well studied. Nemhauser et al. [27, 28] showed that the greedy algorithm gives a guarantee of (1 − 1/e) and that this is best possible in the value-oracle model. Later, Feige [11] showed this is also the best possible approximation under standard complexity assumptions. Further, Badanidiyuru and Vondrák [4] found a faster algorithm for P_0 that improved the quadratic query complexity of the classical greedy algorithm to nearly linear, by trading off on the approximation guarantee. For the more general problem max_{A ∈ I} f(A), where I is the collection of independent sets of a matroid, Calinescu et al. [6] and Vondrák [34] obtained a (1 − 1/e) approximation by (approximately) optimizing the multilinear extension of the submodular function, followed by suitable rounding. Based on this framework, tremendous progress was made over the last decade for a variety of different settings [6, 9, 10, 12, 34–36].

In diverse practical applications, such as sensor placement, influence maximization and feature selection, where instances of P_0 arise naturally, often the objective is uncertain or there are multiple criteria/objectives that must be considered simultaneously [18, 19, 21–25, 31]. In such cases one often wants a set that is robust against the worst-case objective. This motivated the following problem [24],

P_1 := max{ min_{i ∈ {1,2,...,m}} f_i(A) : A ⊆ N, |A| ≤ k },

where f_i(·) is monotone submodular for every i. Krause et al. [24] showed that when m = Ω(k), the problem is inapproximable unless P = NP.

On the other hand, Chekuri et al. [9] gave a randomized (1 − 1/e) − ε approximation for the more general case of matroid constraint, when m is constant. The runtime scales as n^{m/ε^3}. Combined, there is a (1 − 1/e) − ε approximation for constant m and inapproximability for m = Ω(k). We focus on the unanswered regime of m = o(k). Incidentally, [9] considers a different but equivalent formulation of the problem that stems from the influential paper on multi-objective optimization by Papadimitriou and Yannakakis [32]. This alternative formulation, which we introduce in Section 2, is the reason we call this a multi-objective maximization problem (as does [9]). Recently, [29] showed that the greedy algorithm can be generalized to achieve a deterministic 1 − 1/e − ε approximation for the special case of bi-objective maximization subject to a cardinality constraint. The runtime scales as n^{1+1/ε}, with ε ≤ 1/2. Moreover, in practice n can range from tens of thousands to millions [23, 25, 31]. The state-of-the-art approximation algorithms mentioned above are intractable for such n. To get a fast algorithm, we invoke the Multiplicative-Weight-Updates (MWU) framework, which has proven to be a vital tool over the past few decades [1, 2, 5, 14–17, 20, 33, 37, 38]. Linear functions and constraints have been the primary setting of interest in these works, but recent applications have shown the framework's usefulness for non-linear, and in particular submodular, objectives [3, 8]. We apply the MWU framework in the vein of the Plotkin-Shmoys-Tardos scheme for solving LPs [33], essentially showing that the non-linearity only costs us another factor of (1 − 1/e) in the guarantee, and we obtain a nearly linear time algorithm.

Our main contributions: We focus on the cardinality constraint case and show that the problem is approximable up to a factor of (1 − 1/e − ε)(1 − m/(kε^3)), which for m = o(k/log^3 k) and ε = min{1/(8 ln m), (m/k)^{1/4}} tends to 1 − 1/e. This implies a very sharp threshold, since the problem cannot be approximated to within any polynomial factor for m = Ω(k) [24]. However, the runtime is O(n^5). Further, using MWU we give a fast O((n/δ^3) log m log(n/δ)) time (1 − 1/e)^2 (1 − m/(kε^3)) − ε − δ approximation. Under the same asymptotic conditions as above, the guarantee simplifies to (1 − 1/e)^2 − δ. Finally, we give an upper bound on the increase in optimal solution value as a function of the cardinality and essentially resolve a conjecture posed in [30], leading to a deterministic 1 − 1/e − ε approximation for constant m (all other results are randomized).

Outline of the paper: We start with definitions and preliminaries in Section 2; in particular, we review the parts of the algorithm in [9] that are essential for understanding the results here. In Section 3, we state and prove the main results. Finally, we conclude in Section 4 with some open problems.

2 Preliminaries

While our focus will be on the cardinality case, discussing the more general matroid setting will help explain why our improvements do not translate to the matroid case. Given a ground set N of n elements, we use M = (N, I) to denote a matroid with collection of independent sets I, and P(M) denotes the matroid polytope. Also, let d = rank(M). For real vectors x, we use the shorthand |x| to denote the ℓ_1 norm. Define the marginal increase in value of a function f due to a set X, when added to the set A, as f(X|A) = f(A ∪ X) − f(A). Let β(η) = 1 − 1/e^η ∈ [0, 1 − 1/e] for η ∈ [0, 1]. Note that β(1) = 1 − 1/e. Further, since 1 − e^{−η} is concave in η and vanishes at η = 0, we have for k' ≤ k,

β(k'/k) = 1 − e^{−k'/k} ≥ (1 − 1/e) k'/k.   (1)

This function appears naturally in our analysis and will be useful for expressing approximation guarantees of the algorithms. Recall the widely popular greedy algorithm for P_0, summarized as: starting with the empty set, at each step add to the current set an element with maximum marginal value, until k elements are chosen.
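As an illustration (not from the paper), here is a minimal Python sketch of this greedy routine, with `f` given as a set-function oracle; the coverage function in the usage example is our own toy instance:

```python
def greedy(f, ground_set, k):
    """Classical greedy for P_0: maximize f(A) subject to |A| <= k.

    f: oracle taking a frozenset and returning a real value, assumed
       monotone submodular with f(empty set) = 0.
    Makes O(n*k) oracle queries; returns a set with value >= (1 - 1/e) OPT.
    """
    A = set()
    for _ in range(k):
        best_e, best_gain = None, 0.0
        for e in ground_set:
            if e in A:
                continue
            # Marginal value f(e | A).
            gain = f(frozenset(A | {e})) - f(frozenset(A))
            if gain > best_gain:
                best_e, best_gain = e, gain
        if best_e is None:  # no element adds positive value
            break
        A.add(best_e)
    return A

# Toy usage: a coverage function (monotone submodular).
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}
f = lambda A: len(set().union(*(sets[i] for i in A)) if A else set())
print(greedy(f, set(sets), k=2))  # {1, 3}, covering all 5 items
```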

It was shown by Nemhauser et al. [27, 28] that this achieves a guarantee of β(1) for P_0 and that this is best possible. Next, given f : 2^N → R, recall that its multilinear extension at x = (x_1, ..., x_n) ∈ [0, 1]^n is defined as

F(x) = Σ_{S ⊆ N} f(S) Π_{i ∈ S} x_i Π_{j ∉ S} (1 − x_j).

The function can be interpreted as the expected value of f over random sets obtained by including each element i ∈ N independently with probability x_i. While evaluating the exact value of this function is hard in general, for the purpose of using it in optimization algorithms, approximations obtained from a sampling-based oracle suffice [4, 6, 9].

Now, we discuss the two variants of the multi-objective optimization problem and argue that they are equivalent formulations. In both variants we are given m monotone submodular functions f_1(·), ..., f_m(·) and a matroid M with rank d. The first variant, P_1, the cardinality version of which was discussed in Section 1, is to find the optimal solution (or more reasonably, an approximation) to max_{X ∈ I} min_i f_i(X). Note that the function min_i f_i(·) is in general neither submodular nor supermodular, and recall that this variant was shown in [24] to be inapproximable for m = Ω(k). The decision version of this variant is: given t > 0, find a set S* ∈ I such that min_i f_i(S*) ≥ t, or give a certificate of infeasibility. In the second variant, say P_2, which is used in [9], we are given a set of m positive real values V_i and we wish to find a set S* ∈ I such that f_i(S*) ≥ V_i for all i ∈ {1,...,m}, or certify that no such S* exists. Since this is computationally hard, one instead aims to efficiently find a set S ∈ I such that f_i(S) ≥ αV_i for all i and some factor α, or certify that there is no set S* with f_i(S*) ≥ V_i for all i. To see the equivalence, simply observe that w.l.o.g. we can assume V_i = 1 for all i (since we can consider the functions f_i(·)/V_i instead). The equivalence holds even for approximation factors. Finally, we use the notation S*_M for an optimal solution and OPT_M for the optimal solution value of formulation P_1. In the case of a cardinality constraint k, we simply write S*_k, OPT_k.

We now review the algorithm of Chekuri et al. [9], which is based on P_2. We refer to it as Algorithm 1; it has three stages.

Stage 1: Intuitively, the purpose of this stage is to pick a small initial set consisting of elements with large marginal values, i.e. ≥ ε^3 V_i for some i, ensuring that the marginal value added by any leftover element to every function is small (< ε^3 V_i for all i). This is necessary for technical reasons in the rounding procedure of Stage 3. Given a set S ∈ I, index the elements of S = {s_1, s_2, ...} in the order the greedy algorithm would pick them. By virtue of monotonicity, for any fixed i there are at most 1/ε^3 elements with f_i(s_j | {s_1,...,s_{j−1}}) ≥ ε^3 V_i; in fact, due to decreasing marginal values, f_i(s_j | {s_1,...,s_{j−1}}) < ε^3 V_i for every j > 1/ε^3. This implies that we need only focus on sets of size at most m/ε^3 (at most 1/ε^3 elements for each function) to find the right initial set. In particular, one can try all possible initial sets of this size (i.e. run the subsequent stages with each possible starting set), leading to the n^{m/ε^3} term in the runtime. Stages 2 and 3 have runtime polynomial in m (in fact, Stage 3 has runtime independent of m), hence Stage 1 is really the bottleneck.
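As an aside, the sampling-based oracle for F mentioned above is straightforward to realize. The following is a minimal Monte Carlo sketch (our own illustration, not the paper's code; the sample count is a tunable assumption):

```python
import random

def multilinear_estimate(f, x, num_samples=1000, rng=random):
    """Monte Carlo estimate of the multilinear extension F(x).

    f: set-function oracle on frozensets of indices {0, ..., n-1}.
    x: list of coordinates in [0, 1].
    F(x) is the expectation of f over random sets that include each
    element i independently with probability x[i], so the empirical
    mean over independent samples is an unbiased estimator.
    """
    total = 0.0
    for _ in range(num_samples):
        sample = frozenset(i for i, xi in enumerate(x) if rng.random() < xi)
        total += f(sample)
    return total / num_samples
```

Standard concentration arguments control the estimation error; the sample counts needed by the algorithms in [4, 6, 9] are polynomial in n.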
It is not at all obvious whether one can do better than brute-force enumeration over all possible starting sets and still retain the approximation guarantee, since the final solution must be an independent set of a matroid. However, as we show later, for cardinality constraints one can indeed avoid enumeration.

Stage 2: Given a starting set S from Stage 1, we now work with the ground set N \ S, the contracted matroid M\S = {X : X ∪ S ∈ I} with rank d − |S|, and the monotone submodular functions f_i(·|S). We may also assume f_i(e|S) < ε^3 V_i for all i, since otherwise we can discard the starting set S. Let x_S ∈ {0,1}^n be the characteristic vector of S (i.e. its support is S). Starting at the origin, this stage maintains a point x^t ∈ P(M\S) and runs for T = O(d^2) steps. At step t, it solves an LP with one constraint per function i, in addition to the constraints given by the matroid polytope, to obtain a direction y^t ∈ P(M\S).

The point is then updated as x^{t+1} = x^t + y^t/T. The process is referred to as the continuous greedy algorithm [6, 9, 34]. Given x^0 = 0, we have x^t ∈ P(M\S) for all t, and observe that |x^t| < |x^{t+1}| ≤ |x^T| = d − |S|. For 0 < η ≤ 1, define x(M, η) as the final point when this subroutine is executed for ηT steps with matroid polytope P(M). Note that the output could also be a certificate of infeasibility, in which case we move on to the next starting set. We have from [9]:

Lemma 1 ([9], Lemma 7.3). F_i(x(M, 1)) ≥ (1 − 1/e)V_i for all i.

This implies F_i(x(M\S, 1) | S) ≥ (1 − 1/e)(V_i − f_i(S)) for all i, which gives us F_i(x(M\S, 1) + x_S) ≥ (1 − 1/e)V_i for all i. However, in the next section we will need the following more general statement.

Lemma 2. F_i(x(M, η)) ≥ β(η)V_i for all i, and |x(M, η)| = ηd.

Proof. We refer to Sections 2.3 and 3 of [6] and Lemma 7.3 of [9] for intuition and the key details of the core analysis (which we do not repeat here). All we need is to note that the proof of Lemma 3.3 in [6] in fact implies, more generally, that when dealing with a single function i, if the algorithm is terminated after a fraction η of the total steps, we get a point x(M, η) ∈ P(M) such that F_i(x(M, η)) ≥ (1 − (1 − 1/d)^{ηd})V_i ≥ β(η)V_i. Further, |x(M, η)| = ηd. Now, as observed in Lemma 7.3 of [9], when dealing simultaneously with multiple functions, all essential details of the algorithm and analysis remain unchanged, except that the LP solved at each step increases in complexity; hence we get the desired bound for every i. Note that the above property is well known for the classical greedy algorithm [27, 28] (ηk steps lead to a β(η) approximation), so it is not surprising that it also holds for the continuous greedy algorithm.

In terms of runtime, a straightforward implementation of the continuous greedy process can take up to Θ(n^8) steps [6]. This was reduced to O(n^4/ε^3) in [13] and subsequently to Õ(n^2) in [4]. However, the last improvement does not obviously translate to the setting with multiple functions. Thus, currently the fastest algorithm for this stage has runtime worse than O(n^4/ε^3), which can be as bad as O(n^5). Further, unlike the case of a single submodular function, where the LP at each step is an instance of the max-weight independent set problem over a matroid (which lends itself to a fast greedy solution), with multiple functions one needs to solve the LP using a matroid separation oracle, further exacerbating the runtime problem.

Stage 3: For the right starting set S_1, Stage 2 successfully outputs a point x(M\S_1, 1). Stage 3 decomposes x(M\S_1, 1) into a convex combination of bases of M\S_1 (or uses the natural decomposition obtained in Stage 2). This is followed by a random process that collapses these bases into a single basis S_2 such that S_2 ∈ M\S_1 and f_i(S_1 ∪ S_2) ≥ (1 − 1/e)(1 − ε)V_i for all i. Additionally, this requires ε < 1/(8 ln m). The rounding procedure is called swap rounding.

Lemma 3 ([9], Theorems 1.4 and 7.2). Given m monotone submodular functions f_i(·) with all marginal values in [0, ε^3 V_i], a fractional point x ∈ P(M) and ε < 1/(8γ ln m), swap rounding yields a set R ∈ M such that Pr[f_i(R) < (1 − ε)F_i(x)] < m e^{−1/(8ε)} < 1/m^{γ−1} for all i.

Remark: For any γ > 1, the above can be converted to a result holding w.h.p. by standard repetition.

3 Main Results

3.1 A (1 − 1/e) approximation for m = o(k/log^3 k)

We replace the enumeration in Stage 1 with a single starting set, obtained by scanning once over the ground set.
The main idea is simply that, in the cardinality constraint case, any starting set that fulfills the Stage 3 requirement of small marginals is acceptable. As mentioned previously, Stages 2 and 3 scale well in m, so for the modification to make sense at a high level, we should see a decline in the guarantee as m starts to grow towards Ω(k); otherwise we would have a constant factor approximation in a regime which is known to be inapproximable. Indeed, we will incur an additional factor of 1 − m/(kε^3) in the guarantee due to the new Stage 1. For ε ≤ 1/(8 ln m) this factor tends to 1 when m = o(k/log^3 k) but degrades to 0 for m = Ω(k).

New Stage 1: Start with S_1 = ∅ and pass over all elements once, updating S_1 whenever an element is chosen to be added. For each element e, add it to S_1 if f_i(e|S_1) ≥ ε^3 V_i for some i. Note that we add at most m/ε^3 elements (at most 1/ε^3 for each function, as discussed in Section 2). When the subroutine terminates, every remaining element e ∈ N\S_1 satisfies f_i(e|S_1) < ε^3 V_i for all i (as required by Lemma 3).

Modified Stage 2: Since we are no longer concerned with general matroids, let us use the simplified notation x(k, η) to denote the output of Stage 2 of Algorithm 1, with the cardinality constraint k replacing the matroid M. Further, let k_1 = |S_1| ≤ m/ε^3. The modified second stage outputs x(k, 1 − k_1/k). Contrast this with the output of the traditional Stage 2: x(k − k_1, 1). We do this to simplify the analysis here; one can in fact use the traditional output x(k − k_1, 1) just as well, though that adds unnecessary complexity to the analysis at this point. We elaborate on this in Section 3.3. Now, as a direct corollary of Lemma 2:

Corollary 4. F_i(x(k, 1 − k_1/k) | S_1) ≥ β(1 − k_1/k) f_i(S*_k | S_1) ≥ β(1 − k_1/k)(V_i − f_i(S_1)), and |x(k, 1 − k_1/k)| = k − k_1.

Stage 3: We use the swap rounding algorithm on x(k, 1 − k_1/k) to get a set S_2 of size k − k_1, and output the final set S_1 ∪ S_2.

Theorem 5. For ε = min{1/(8 ln m), (m/k)^{1/4}}, with constant probability the algorithm outputs a (1 − 1/e)(1 − ε)(1 − m/(kε^3)) approximate solution. Asymptotically, this is 1 − 1/e approximate for m = o(k/log^3 k).

Proof. From Lemma 3 and Corollary 4 we have f_i(S_2|S_1) ≥ β(1 − k_1/k)(1 − ε)(V_i − f_i(S_1)) for all i. Using (1) to lower bound β(1 − k_1/k), we get

f_i(S_1 ∪ S_2) ≥ β(1)(1 − ε)(1 − m/(kε^3))(V_i − f_i(S_1)) + f_i(S_1) ≥ (1 − 1/e)(1 − ε)(1 − m/(kε^3))V_i for all i.

Note that the runtime is now independent of ε: Stage 1 has runtime O(n), and Stages 2 and 3 have runtime polynomial in n, m. Thus, we can refine the guarantee by picking a suitable ε. We choose ε = min{1/(8 ln m), (m/k)^{1/4}}, where the 1/(8 ln m) term is due to Lemma 3 and the (m/k)^{1/4} term balances ε and m/(kε^3). The resulting guarantee becomes (1 − 1/e)(1 − h(k)), where h(k) → 0 as k → ∞, so long as m = o(k/log^3 k).

Owing to the limitation posed by the runtime of the continuous greedy based Stage 2, highlighted towards the end of Section 2, we next introduce a simple and much faster algorithm. The speedup, however, comes at the cost of an extra factor of (1 − 1/e) in the approximation guarantee.

3.2 A fast (1 − 1/e)^2 − δ approximation for m = o(k/log^3 k)

The algorithm has three stages as before. Stage 1 remains the same as the new Stage 1 introduced in the previous section; let S_1 be its output as before. Stage 2 was the main computational bottleneck above, and we replace it with a fast MWU based subroutine that runs for T = O(ln m / δ^2) rounds and solves an instance of P_0 in each round. Here δ manifests as a subtractive term in the approximation guarantee. The currently fastest algorithm for P_0, in [26], has runtime O(n log(1/δ)) and an expected guarantee of (1 − 1/e) − δ.
However, the slightly slower, but still nearly linear time, O((n/δ) log(n/δ)) thresholding algorithm in [4] has the usual deterministic guarantee of (1 − 1/e) − δ. Either of these would lead to a runtime of Õ(nT/δ) = Õ(n/δ^3), which is a vast improvement over O(n^5).

Consider some algorithm A for P_0 with guarantee α, and let A(f, k) denote its output given monotone submodular function f and cardinality constraint k as input. Note that α can be as large as β(1); also, k_1 = |S_1| as before.

Algorithm Stage 2: MWU
1: Initialize T = 2 ln m / δ^2, λ_i^1 = 1/m, and f̄_i(·) = f_i(·|S_1) / (V_i − f_i(S_1)) for all i
2: while 1 ≤ t ≤ T do
3:     g^t(·) = Σ_{i=1}^m λ_i^t f̄_i(·)
4:     X^t = A(g^t, k − k_1)
5:     m_i^t = f̄_i(X^t) − α
6:     λ_i^{t+1} = λ_i^t (1 − δ m_i^t)
7:     t = t + 1
8: Output: x_2 = (1/T) Σ_{t=1}^T X^t

The point x_2 obtained above is rounded to a set S_2 in Stage 3 (which remains unchanged). The final output is S_1 ∪ S_2. Observe that the experts in this instance of MWU are the normalized functions f̄_i(·). In the Plotkin-Shmoys-Tardos (PST) framework, each round involves solving an LP over a convex combination of the linear constraints (the experts); here we solve an instance of P_0 for the monotone submodular function given by a convex combination of the experts f̄_i. Further, in the PST framework, the convex combination of the solutions obtained over all rounds is approximately feasible due to linearity; here the non-linearity leads to a loss in the guarantee. Next, we formalize this and prove the claimed guarantee and runtime bounds.

Lemma 6. g^t(X^t) ≥ α(1 − m/(kε^3)) Σ_i λ_i^t for all t.

Proof. Consider the optimal set S*_k and note that Σ_i λ_i^t f̄_i(S*_k) ≥ Σ_i λ_i^t for all t. Now, if we use the function g^t(·) = Σ_i λ_i^t f̄_i(·) to greedily index the elements of S*_k = {s_1, ..., s_k}, then the truncated set S' = {s_1, ..., s_{k−|S_1|}} satisfies g^t(S') ≥ (1 − |S_1|/k) Σ_i λ_i^t, and furthermore it has the right cardinality k − |S_1|. Then, since g^t, being a sum of finitely many monotone submodular functions, is itself monotone submodular, Step 4 of the algorithm always outputs a set X^t such that g^t(X^t) ≥ α(1 − |S_1|/k) Σ_i λ_i^t ≥ α(1 − m/(kε^3)) Σ_i λ_i^t for all t.

Lemma 7. (1/T) Σ_t Σ_i λ_i^t (f̄_i(X^t) − α) ≤ δ + (1/T) Σ_t f̄_i(X^t) − α for all i.

To see this, one need only observe that the non-linearity of the functions plays no role in deriving the above inequality, and the proof exactly follows the analysis of the PST framework in [1] (Theorems 3.3 and 2.1), except that instead of querying an LP oracle in each round, we query an algorithm A(·) for P_0. For the sake of completeness we include the proof in Appendix A. Note that in the PST framework, the LP oracle could output a certificate of infeasibility; analogously, in our case, if A outputs a set X^t at some round t such that Lemma 6 is violated, we have a certificate of infeasibility.

Now consider the output x_2 = (1/T) Σ_t X^t (identifying each set X^t with its characteristic vector), and note that in the PST framework, linearity easily implies that x_2 is an approximately feasible solution to the LP. In an ideal scenario, one could hope for an analogous result here, i.e., F_i(x_2 | S_1) ≥ (α − δ)(V_i − f_i(S_1)) for all i. However, due to the non-linearity of our functions we should expect to incur some cost. Fortunately, it turns out that this only costs us an extra factor of (1 − 1/e) in the approximation guarantee.
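For concreteness, here is a self-contained Python sketch of this MWU stage (our own illustration, not the paper's code). The name `maximize_p0` stands in for any algorithm A for P_0, e.g. the greedy sketch from Section 2; the guarantee α = 1 − 1/e is an assumption on A, and the targets V_i and Stage 1 set S_1 are assumed given:

```python
import math

def mwu_stage2(fs, V, S1, ground_set, k, delta, maximize_p0):
    """MWU-based Stage 2 sketch for the multi-objective problem (P_2 form).

    fs:  list of m monotone submodular oracles f_i on frozensets.
    V:   list of target values V_i, assumed to satisfy V_i > f_i(S1).
    S1:  starting set produced by the single-pass New Stage 1.
    maximize_p0(g, ground, size): any approximation algorithm for P_0.
    Returns the sets X^1, ..., X^T whose uniform average is the point x_2.
    """
    m = len(fs)
    alpha = 1.0 - 1.0 / math.e           # assumed guarantee of maximize_p0
    T = max(1, math.ceil(2.0 * math.log(m) / delta ** 2))
    base = [f(frozenset(S1)) for f in fs]

    def f_bar(i, X):
        # Normalized expert: f_i(X | S1) / (V_i - f_i(S1)).
        return (fs[i](frozenset(set(X) | set(S1))) - base[i]) / (V[i] - base[i])

    lam = [1.0 / m] * m                  # weights lambda_i^1
    rounds = []
    for _ in range(T):
        # A convex combination of the experts is again monotone submodular.
        weights = list(lam)
        g = lambda X: sum(w * f_bar(i, X) for i, w in enumerate(weights))
        X_t = maximize_p0(g, set(ground_set) - set(S1), k - len(S1))
        rounds.append(X_t)
        # Multiplicative update with losses m_i^t = f_bar_i(X^t) - alpha.
        lam = [l * (1.0 - delta * (f_bar(i, X_t) - alpha))
               for i, l in enumerate(lam)]
    return rounds                        # x_2 = (1/T) * sum of these sets
```

Stage 3 then rounds the uniform average of the returned sets; the analysis below shows that the non-linearity costs only a further (1 − 1/e) factor.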

Theorem 8. For ε = min{1/(8 ln m), (m/k)^{1/4}}, the algorithm makes O((n/δ^3) log m log(n/δ)) queries and, with constant probability, outputs a feasible (1 − 1/e)^2 (1 − m/(kε^3))(1 − ε) − δ approximate set. Asymptotically, this is (1 − 1/e)^2 − δ approximate for m = o(k/log^3 k).

Proof. We first focus on the approximation guarantee. To that end, we assume α = β(1) and process the LHS of Lemma 7:

(1/T) Σ_t Σ_i λ_i^t (f̄_i(X^t) − β(1)) = (1/T) Σ_t (g^t(X^t) − β(1) Σ_i λ_i^t) ≥ −(1/T) Σ_t (β(1)|S_1|/k)(Σ_i λ_i^t) (using Lemma 6) ≥ −β(1)|S_1|/k.

Thus, we have for every i,

(1/T) Σ_t f̄_i(X^t) ≥ β(1) − (δ + β(1)|S_1|/k) = β(1)(1 − |S_1|/k) − δ.

Next, we show that F_i(x_2) ≥ (1 − 1/e)(1/T) Σ_{t=1}^T f̄_i(X^t). Recall that x_2 = (1/T) Σ_t X^t and that the concave closure of a submodular function f is given by

f^+(x) = max{ Σ_{X ⊆ N} α_X f(X) : Σ_X α_X 1_X = x, Σ_X α_X ≤ 1, α_X ≥ 0 for all X ⊆ N }.

Clearly, f̄_i^+(x_2) ≥ (1/T) Σ_t f̄_i(X^t). So we just need to show F_i(x_2) ≥ (1 − 1/e) f̄_i^+(x_2). This in fact follows directly from Lemmas 4 and 5 in [7], which state f*(x) ≥ f^+(x) and F(x) ≥ (1 − 1/e)f*(x), respectively, for every monotone submodular function f. Here f*(·) is another extension of submodular functions, which we do not need to introduce for the purpose of this proof. Additionally, we give a novel and direct proof of the inequality F(x) ≥ (1 − 1/e)(1/T) Σ_{t=1}^T f(X^t) for x = (1/T) Σ_t X^t in Appendix B. Thus,

F_i(x_2) ≥ (1 − 1/e)(1/T) Σ_{t=1}^T f̄_i(X^t) ≥ β(1)^2 (1 − |S_1|/k) − β(1)δ.

Finally, from Lemma 3, using swap rounding on x_2 gives a set S_2 such that f_i(S_2|S_1) ≥ [(1 − 1/e)^2 (1 − |S_1|/k)(1 − ε) − δ](V_i − f_i(S_1)) for all i, with constant probability, proving the claimed approximation guarantee. Similar to the proof of Theorem 5, for m = o(k/log^3 k) and ε = min{1/(8 ln m), (m/k)^{1/4}}, we have (1 − m/(kε^3))(1 − ε) → 1 as k → ∞, resulting in the guarantee (1 − 1/e)^2 − δ.

To see the runtime bound, recall from Section 3.1 that Stage 1 takes time O(n). Stage 2 runs an instance of A(·) T times, leading to an upper bound of O((n/δ) log(n/δ) · (ln m)/δ^2) = O((n/δ^3) log m log(n/δ)) if we use the thresholding algorithm in [4]; as discussed above, we can get a slightly faster runtime of O((n/δ^2) log m log(1/δ)) by compromising and allowing a guarantee in expectation. Recall that using either of these algorithms incurs an additional multiplicative factor of (1 − δ) in the guarantee above. Finally, since Stage 2 gives a decomposition of x_2 into T sets of size k − k_1 each, swap rounding in Stage 3 directly proceeds in T rounds by probabilistically collapsing these sets into a single set. Each round takes O(k) time, leading to a total runtime of O((k/δ^2) log m). Combining all three, we are left with the dominant O((n/δ^3) log m log(n/δ)) term.
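For the uniform matroid, the T-round merging just described is easy to make concrete. The following sketch (our own illustration based on the description above, not code from [9]) collapses the T equal-size sets from Stage 2 into one set, merging pairwise with weight proportional to the number of sets already merged; for a uniform matroid any pair of differing elements is an exchangeable pair:

```python
import random

def swap_round_uniform(sets_list, rng=random):
    """Swap rounding specialized to a uniform (cardinality) matroid.

    sets_list: the T sets X^1, ..., X^T of equal size from Stage 2,
    whose uniform average is the fractional point x_2.
    The running set C (carrying weight t) and the next set B are
    reconciled one random swap at a time until they coincide.
    """
    C = set(sets_list[0])
    for t, B in enumerate(sets_list[1:], start=1):
        B = set(B)
        while C != B:
            i = next(iter(C - B))  # element only in the running set
            j = next(iter(B - C))  # element only in the new set
            # Keep i with probability proportional to accumulated weight t.
            if rng.random() < t / (t + 1.0):
                B.remove(j)
                B.add(i)
            else:
                C.remove(i)
                C.add(j)
        C = B  # C == B; it carries weight t + 1 in the next merge
    return C
```

Each swap shrinks the symmetric difference by two, which is what makes each of the T merge rounds run in O(k) time, as counted in the proof above.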

3.3 Variation in optimal solution value as k increases

Consider the problem P_0 with cardinality constraint k. Given an optimal solution S*_k with value OPT_k, it is not difficult to see that for arbitrary k' ≤ k there is a subset S' ⊆ S*_k of size k' such that f(S') ≥ (k'/k) OPT_k. For instance, as we argued in Lemma 6, indexing the elements of S*_k using the greedy algorithm and choosing the set formed by the first k' elements gives such a set. This implies OPT_{k'} ≥ (k'/k) OPT_k, and the bound is easily seen to be tight. This raises a natural question: can we generalize this bound on the variation of optimal solution value with varying k to multi-objective maximization? A priori, this is not obvious even for modular functions. In particular, indexing elements in the order they are picked by the greedy algorithm no longer works, since there are many functions and we need to balance values amongst all of them. We show that one can indeed derive such a bound.

Theorem 9. Suppose there exists a set S*_k such that f_i(S*_k) ≥ V_i for all i, and let ε < 1/(8 ln m). For every k' ∈ [m/ε^3, k], there exists S' ⊆ S*_k of size k' such that

f_i(S') ≥ (1 − ε) ((k' − m/ε^3)/(k − m/ε^3)) V_i for all i.

Proof. Since we merely need to show existence, we will not concern ourselves with computational issues. First, restrict the ground set to S*_k and let S_1 be a subset of size at most m/ε^3 such that f_i(e|S_1) < ε^3 V_i for all e ∈ S*_k \ S_1 and all i (recall that we argued the existence of such a set in Section 2, Stage 1). Recall that x_{S*_k\S_1} denotes the characteristic vector of the set S*_k \ S_1, and consider the point x = ((k' − |S_1|)/(k − |S_1|)) x_{S*_k\S_1}. Clearly |x| = k' − |S_1|, and furthermore, from the concavity of multilinear extensions in positive directions (Section 2.1 of [6]), we have

F_i(x | S_1) ≥ ((k' − |S_1|)/(k − |S_1|)) f_i(S*_k\S_1 | S_1) ≥ ((k' − |S_1|)/(k − |S_1|))(V_i − f_i(S_1)) for all i.

Finally, using the swap rounding Lemma 3, there exists a set S_2 of size k' − |S_1| such that f_i(S_1 ∪ S_2) ≥ (1 − ε)((k' − |S_1|)/(k − |S_1|))V_i for all i. Since |S_1| ≤ m/ε^3 and (k' − s)/(k − s) is non-increasing in s, this is at least the claimed bound.

Previously, Lemma 13 in [30] showed a similar property using a different constructive argument, but only for the very restricted case of constant m and k' (their result does generalize for larger m, but leads to a very loose bound).

Resolution of the conjecture in [30]: Note that this essentially resolves the conjecture in [30], which states that for constant m and every k' ≥ m, there exists a set S of size k' such that f_i(S) ≥ Θ(1)(k'/k)V_i for all i. This implies that for large enough k, we can choose sets of size k' (k'-tuples) at each step to get a deterministic (1 − 1/e) − ε approximation with runtime O(n^{m/ε^4}) for the multi-objective maximization problem when m is constant (all previous algorithms, including the ones presented here, are randomized).

Theorem 10. For k' = m/ε^4, choosing k'-tuples greedily w.r.t. h(·) = min_i f_i(·) yields an approximation guarantee of (1 − 1/e)(1 − 2ε) while making n^{m/ε^4} queries.

Proof. The analysis is similar to that of the standard greedy algorithm [27, 28]. Let S^j denote the set at the end of iteration j, with S^0 = ∅, and let the final set be S^{k/k'}. From Theorem 9, at each step j there is some set X ⊆ S*_k \ S^j of size k' such that

f_i(X | S^j) ≥ (1 − ε) ((k' − m/ε^3)/(k − m/ε^3)) (V_i − f_i(S^j)) for all i.

To simplify the presentation, let η = (1 − ε)(k' − m/ε^3)/(k − m/ε^3), and note that η ≤ k'/k ≤ 1; further, η approaches (1 − ε)k'/k when k' ≫ m/ε^3, for fixed m and k' = o(k). We have for every i, f_i(S^{j+1}) ≥ (1 − η)f_i(S^j) + ηV_i; call this

inequality j + 1. Observe that inequality k/k' reads f_i(S^{k/k'}) ≥ (1 − η)f_i(S^{k/k'−1}) + ηV_i for all i. Therefore, multiplying inequality k/k' − j by (1 − η)^j and telescoping, we get for every i,

f_i(S^{k/k'}) ≥ Σ_{j=0}^{k/k'−1} (1 − η)^j η V_i = (1 − (1 − η)^{k/k'}) V_i = (1 − ((1 − η)^{1/η})^{ηk/k'}) V_i ≥ β(ηk/k') V_i ≥ (1 − 1/e)(ηk/k') V_i,

where we used (1) for the last inequality. Let ε = (m/k')^{1/4}, i.e. k' = m/ε^4; then

ηk/k' = (1 − ε)(k/k')((k' − m/ε^3)/(k − m/ε^3)) ≥ (1 − ε)(1 − m/(k'ε^3)) = (1 − ε)^2 = (1 − (m/k')^{1/4})^2.

Therefore, as k grows we get the asymptotic guarantee (1 − 1/e)(1 − (m/k')^{1/4})^2 = (1 − 1/e)(1 − ε)^2 ≥ (1 − 1/e)(1 − 2ε).

Finally, we would like to address a technical point raised in the discussion of the Modified Stage 2 in Section 3.1. We used the output x(k, 1 − k_1/k) from the continuous greedy algorithm (Stage 2) as opposed to x(k − k_1, 1). The reason was simply that using the output x(k − k_1, 1) would have led to the additional difficulty of comparing the values of the sets S*_{k−k_1} and S*_k, which is now straightforward. Formally, F_i(x(k − k_1, 1) | S_1) ≥ β(1) f_i(S*_{k−k_1} | S_1), and for m = o(k/log^3 k) we have k_1 ≤ m/ε^3; therefore, due to Theorem 9, f_i(S*_{k−k_1} | S_1) ≥ (1 − ε)((k − k_1 − m/ε^3)/(k − m/ε^3)) f_i(S*_k | S_1), which gives an asymptotic (1 − 1/e) bound even for x(k − k_1, 1).

4 Conclusion and Open Problems

In summary, we considered the problem of multi-objective maximization of monotone submodular functions subject to a cardinality constraint k, in the regime m = o(k/log^3 k). No polynomial time constant factor approximations or strong inapproximability results were known in this regime, though it was known that the problem is inapproximable for m = Ω(k) and admits a nearly 1 − 1/e approximation for constant m. We showed that for m = o(k/log^3 k) one can indeed approach the best possible guarantee of 1 − 1/e, and we also gave a nearly linear time (1 − 1/e)^2 − δ approximation. Finally, we established a natural bound on how the optimal solution value increases with increasing cardinality.

A natural question, of course, is whether one can achieve constant factor approximations right up to m = o(k). Additionally, it is also of interest to ask whether there are fast algorithms with guarantee closer to 1 − 1/e, in contrast to the (1 − 1/e)^2 − δ shown here. Further, most of the ideas, and all results here, are for the case of a uniform matroid (cardinality constraint). It is unclear whether similar results can be shown for a general matroid constraint.

Acknowledgments

The author would like to thank James B. Orlin and Andreas S. Schulz for their insightful comments and feedback on the first draft of this work.

References

[1] S. Arora, E. Hazan, and S. Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8:121–164, 2012.

[2] S. Arora and S. Kale. A combinatorial, primal-dual approach to semidefinite programs. In STOC, pages 227–236, 2007.

[3] Y. Azar and I. Gamzu. Efficient submodular function maximization under linear packing constraints. In ICALP '12, pages 38–50.

[4] A. Badanidiyuru and J. Vondrák. Fast algorithms for maximizing submodular functions. In SODA '14, pages 1497–1514. SIAM, 2014.

[5] D. Bienstock. Potential function methods for approximately solving linear programming problems: theory and practice, volume 53. Springer, 2006.

[6] G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing, 40(6):1740–1766, 2011.

[7] G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a submodular set function subject to a matroid constraint. In IPCO 2007, volume 7, pages 182–196.

[8] C. Chekuri, T.S. Jayram, and J. Vondrák. On multiplicative weight updates for concave and submodular function maximization. In ITCS '15, pages 201–210.

[9] C. Chekuri, J. Vondrák, and R. Zenklusen. Dependent randomized rounding via exchange properties of combinatorial structures. In FOCS '10, pages 575–584. IEEE, 2010.

[10] S. Dobzinski and J. Vondrák. From query complexity to computational complexity. In STOC '12, pages 1107–1116. ACM, 2012.

[11] U. Feige. A threshold of ln n for approximating set cover. Journal of the ACM (JACM), 45(4):634–652, 1998.

[12] M. Feldman, J.S. Naor, and R. Schwartz. A unified continuous greedy algorithm for submodular maximization. In FOCS '11, pages 570–579. IEEE.

[13] Y. Filmus and J. Ward. Monotone submodular maximization over a matroid via non-oblivious local search. SIAM Journal on Computing, 43:514–542, 2014.

[14] L. Fleischer. Approximating fractional multicommodity flow independent of the number of commodities. SIAM Journal on Discrete Mathematics, 13(4):505–520, 2000.

[15] N. Garg and R. Khandekar. Fractional covering with upper bounds on the variables: Solving LPs with negative entries. In European Symposium on Algorithms, pages 371–382, 2004.

[16] N. Garg and J. Koenemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. SIAM Journal on Computing, 37(2):630–652, 2007.

[17] M. Grigoriadis and L. Khachiyan. Fast approximation schemes for convex programs with many blocks and coupling constraints. SIAM Journal on Optimization, 4(1):86–107, 1994.

[18] C. Guestrin, A. Krause, and A.P. Singh. Near-optimal sensor placements in Gaussian processes. In Proceedings of the 22nd International Conference on Machine Learning, pages 265–272. ACM, 2005.

[19] X. He and D. Kempe. Robust influence maximization. In SIGKDD, pages 885–894, 2016.

[20] C. Koufogiannakis and N. Young. Beating simplex for fractional packing and covering linear programs. In FOCS, pages 494–504, 2007.

[21] A. Krause and C. Guestrin. Near-optimal nonmyopic value of information in graphical models. In UAI '05, pages 324–331, 2005.

[22] A. Krause, C. Guestrin, A. Gupta, and J. Kleinberg. Near-optimal sensor placements: Maximizing information while minimizing communication cost. In Proceedings of the 5th International Conference on Information Processing in Sensor Networks, pages 2–10. ACM, 2006.

[23] A. Krause, J. Leskovec, C. Guestrin, J. VanBriesen, and C. Faloutsos. Efficient sensor placement optimization for securing large water distribution networks. Journal of Water Resources Planning and Management, 134(6):516–526, 2008.

[24] A. Krause, H.B. McMahan, C. Guestrin, and A. Gupta. Robust submodular observation selection. Journal of Machine Learning Research, 9:2761–2801, 2008.

[25] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 420–429. ACM, 2007.

[26] B. Mirzasoleiman, A. Badanidiyuru, A. Karbasi, J. Vondrák, and A. Krause. Lazier than lazy greedy. In AAAI, 2015.

[27] G.L. Nemhauser and L.A. Wolsey. Best algorithms for approximating the maximum of a submodular set function. Mathematics of Operations Research, 3(3):177–188, 1978.

[28] G.L. Nemhauser, L.A. Wolsey, and M.L. Fisher. An analysis of approximations for maximizing submodular set functions I. Mathematical Programming, 14(1):265–294, 1978.

[29] J. Orlin, A.S. Schulz, and R. Udwani. Robust monotone submodular function maximization. In IPCO 2016, pages 312–324.

[30] J. Orlin, A.S. Schulz, and R. Udwani. Robust monotone submodular function maximization. arXiv preprint arXiv:1507.06616, 2016.

[31] A. Ostfeld, J.G. Uber, E. Salomons, J.W. Berry, W.E. Hart, C.A. Phillips, J.P. Watson, G. Dorini, P. Jonkergouw, Z. Kapelan, and F. di Pierro. The battle of the water sensor networks (BWSN): A design challenge for engineers and algorithms. Journal of Water Resources Planning and Management, 2008.

[32] C. Papadimitriou and M. Yannakakis. On the approximability of trade-offs and optimal access of web sources. In FOCS, pages 86–92, 2000.

[33] S. Plotkin, D. Shmoys, and É. Tardos. Fast approximation algorithms for fractional packing and covering problems. In FOCS, pages 495–504, 1991.

[34] J. Vondrák. Optimal approximation for the submodular welfare problem in the value oracle model. In STOC '08, pages 67–74. ACM.

[35] J. Vondrák. Symmetry and approximability of submodular maximization problems. SIAM Journal on Computing, 42(1):265–304, 2013.

[36] J. Vondrák, C. Chekuri, and R. Zenklusen. Submodular function maximization via the multilinear relaxation and contention resolution schemes. In STOC '11, pages 783–792. ACM, 2011.

[37] N. Young. Randomized rounding without solving the linear program. In SODA, volume 95, pages 170–178, 1995.

[38] N.E. Young. Sequential and parallel algorithms for mixed packing and covering. In FOCS, pages 538–546, 2001.

A

Lemma 11. (1/T) Σ_t Σ_i λ_i^t (f̄_i(X^t) − α) ≤ 4δ + (1/T) Σ_t f̄_i(X^t) − α for all i.

Proof. We use the potential function Φ^t = Σ_i λ_i^t. Let p_i^t = λ_i^t / Φ^t and M^t = Σ_i p_i^t m_i^t. Then we have

Φ^{t+1} = Σ_i λ_i^t (1 − δ m_i^t) = Φ^t − δ Φ^t Σ_i p_i^t m_i^t = Φ^t (1 − δ M^t) ≤ Φ^t e^{−δ M^t}.

After T rounds, Φ^T ≤ Φ^1 e^{−δ Σ_t M^t}. Further, for every i,

Φ^T ≥ λ_i^T = (1/m) Π_t (1 − δ m_i^t),

and taking logarithms,

ln(Φ^1 e^{−δ Σ_t M^t}) ≥ Σ_t ln(1 − δ m_i^t) − ln m, i.e., −δ Σ_t M^t ≥ −ln m + Σ_t ln(1 − δ m_i^t).

Using ln(1/(1 − ε)) ≤ ε + ε^2 and ln(1 + ε) ≥ ε − ε^2 for ε ≤ 0.5, and with T = 2 ln m / δ^2 and δ < 1 − 1/e (for a positive approximation guarantee), we have

(1/T) Σ_t M^t ≤ δ + (1/T) Σ_t m_i^t for all i.

B

Lemma 12. F(x) ≥ (1 − 1/e)(1/T) Σ_{t=1}^T f(X^t) for x = (1/T) Σ_{t=1}^T X^t.

Proof. W.l.o.g. assume the sets X^t are indexed such that f(X^j) ≥ f(X^{j+1}) for every j ≥ 1. To simplify notation, let a_t = f(X^t)/T and A = Σ_t a_t. Recall that F(x) can be viewed as the expected function value of the set obtained by independently sampling each element j with probability x_j. Consider the alternative and equivalent process where, starting with t = 1, one samples each element of the set X^t independently with probability 1/T. Thus the random process runs in T steps; suppose the subset newly sampled from X^t in step t adds marginal value Δ_t in expectation. It follows from submodularity that Δ_1 ≥ a_1, and in general, Δ_t ≥ a_t − (1/T) Σ_{j=1}^{t−1} Δ_j. Note that F(x) = Σ_t Δ_t. Now consider the LP where the objective is to minimize Σ_t Δ_t subject to a_1 ≥ a_2 ≥ ... ≥ a_T ≥ 0, Σ_t a_t = A, and Δ_t ≥ a_t − (1/T) Σ_{j=1}^{t−1} Δ_j with Δ_0 = 0. It follows that the minimum occurs

at an extreme point where Σ_t a_t = A, a_t = a_1 for all t ≤ j, and a_t = 0 for t > j, for some j. For all such points, it is not difficult to see that the objective is at least (1 − 1/e)A. Therefore, we have F(x) ≥ (1 − 1/e)A = (1 − 1/e) Σ_t f(X^t)/T, as desired.
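As an informal numerical sanity check of Lemma 12 (our own addition, not part of the paper), the following snippet compares the exact multilinear extension of a small coverage function against (1 − 1/e) times the average of f(X^t); the instance is a hypothetical toy example:

```python
import itertools, math

# Ground set {0,...,5}; f is a coverage function (monotone submodular).
cover = {0: {"a"}, 1: {"a", "b"}, 2: {"b", "c"},
         3: {"c"}, 4: {"d"}, 5: {"d", "e"}}
def f(S):
    return len(set().union(*(cover[i] for i in S))) if S else 0

T = 3
X = [{0, 1}, {2, 3}, {4, 5}]                       # sets X^1, ..., X^T
x = [sum(i in Xt for Xt in X) / T for i in range(6)]

def F(x):
    """Exact multilinear extension by enumeration (fine for n = 6)."""
    total = 0.0
    for r in range(len(x) + 1):
        for S in itertools.combinations(range(len(x)), r):
            p = 1.0
            for i in range(len(x)):
                p *= x[i] if i in S else 1 - x[i]
            total += f(set(S)) * p
    return total

avg = sum(f(Xt) for Xt in X) / T                   # here avg = 2
assert F(x) >= (1 - 1 / math.e) * avg              # Lemma 12, numerically
print(F(x), avg)                                   # approx. 2.556 vs 2.0
```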