Multi-Objective Maximization of Monotone Submodular Functions with Cardinality Constraint


Rajan Udwani (rudwani@mit.edu)

Abstract

We consider the problem of multi-objective maximization of monotone submodular functions subject to a cardinality constraint k, one formulation of which is max_{|A| ≤ k} min_{i ∈ {1,...,m}} f_i(A). Krause et al. (2008) showed that when the number of functions m grows as fast as the cardinality, i.e., m = Ω(k), the problem is inapproximable (unless P = NP). For the more general case of matroid constraint, Chekuri et al. (2010) gave a randomized (1 − 1/e) − ε approximation for constant m. The runtime (number of queries to the function oracle) scales exponentially as n^{m/ε^3}. We give the first polynomial time asymptotically constant factor approximations for m = o(k/log^3 k): (i) a randomized (1 − 1/e) approximation based on Chekuri et al. (2010); (ii) a faster and more practical Õ(n/δ^3) time, randomized (1 − 1/e)^2 − δ approximation based on Multiplicative-Weight-Updates. Finally, we characterize the variation in optimal solution value as a function of the cardinality k, leading to a derandomized approximation for constant m.

1 Introduction

A set function f : 2^N → R on the ground set N is called submodular when f(A + a) − f(A) ≤ f(B + a) − f(B) for all B ⊆ A ⊆ N and a ∈ N \ A. The function is monotone if f(B) ≤ f(A) for all B ⊆ A. W.l.o.g. we assume f(∅) = 0, which combined with monotonicity implies non-negativity. The problem of maximizing a monotone submodular function subject to a cardinality constraint,

P_0 := max{ f(A) : A ⊆ N, |A| ≤ k },

has been well studied. Nemhauser et al. [27, 28] showed that the greedy algorithm gives a guarantee of (1 − 1/e) and that this is best possible in the value-oracle model. Later, Feige [11] showed this is also the best possible approximation under standard complexity assumptions. Further, Badanidiyuru and Vondrák [4] found a faster algorithm for P_0 that improved the quadratic query complexity of the classical greedy algorithm to nearly linear, by trading off on the approximation guarantee. For the more general problem max_{A ∈ I} f(A), where I is the collection of independent sets of a matroid, Calinescu et al. [6] and Vondrák [34] obtained a (1 − 1/e) approximation by (approximately) optimizing the multilinear extension of the submodular function, followed by suitable rounding. Based on this framework, tremendous progress was made over the last decade for a variety of different settings [6, 9, 10, 12, 34–36].

In diverse practical applications, such as sensor placement, influence maximization and feature selection, where instances of P_0 arise naturally, often the objective is uncertain or there are multiple criteria/objectives that must be considered simultaneously [18, 19, 21–25, 31]. In such cases one often wants a set that is robust against the worst-case objective. This motivated the following problem [24],

P_1 := max{ min_{i ∈ {1,2,...,m}} f_i(A) : A ⊆ N, |A| ≤ k },

where f_i(·) is monotone submodular for every i. Krause et al. [24] showed that when m = Ω(k), the problem is inapproximable unless P = NP.

On the other hand, Chekuri et al. [9] gave a randomized (1 − 1/e) − ε approximation for the more general case of matroid constraint, when m is constant. The runtime scales as n^{m/ε^3}. Combined, there is a (1 − 1/e) − ε approximation for constant m and inapproximability for m = Ω(k). We focus on the unanswered regime of m = o(k). Incidentally, [9] considers a different but equivalent formulation of the problem that stems from the influential paper on multi-objective optimization by Papadimitriou and Yannakakis [32]. This alternative formulation, which we introduce in Section 2, is the reason we call this a multi-objective maximization problem (as does [9]). Recently, [29] showed that the greedy algorithm can be generalized to achieve a deterministic 1 − 1/e − ε approximation for the special case of bi-objective maximization subject to a cardinality constraint. The runtime scales as n^{1+1/ε}, with ε ≤ 1/2. Moreover, in practice n can range from tens of thousands to millions [23, 25, 31]. The state-of-the-art approximation algorithms mentioned above are intractable for such n. To get a fast algorithm, we invoke the Multiplicative-Weight-Updates (MWU) framework, which has proven to be a vital tool over the past few decades [1, 2, 5, 14–17, 20, 33, 37, 38]. Linear functions and constraints have been the primary setting of interest in these works, but recent applications have shown the framework's usefulness for non-linear, and in particular submodular, objectives [3, 8]. We apply the MWU framework in the vein of the Plotkin-Shmoys-Tardos scheme for solving LPs [33], essentially showing that the non-linearity only costs us another factor of (1 − 1/e) in the guarantee, and we obtain a nearly linear time algorithm.

Our main contributions: We focus on the cardinality constraint case and show that the problem is approximable up to a factor of (1 − 1/e − ε)(1 − m/(kε^3)), which for m = o(k/log^3 k) and ε = min{1/(8 ln m), (m/k)^{1/4}} tends to 1 − 1/e. This implies a very sharp threshold, since the problem cannot be approximated to within any polynomial factor for m = Ω(k) [24]. However, the runtime is O(n^5). Further, using MWU we give a fast O((n/δ^3) log m log(n/δ)) time (1 − 1/e)^2 (1 − m/(kε^3)) − ε − δ approximation. Under the same asymptotic conditions as above, the guarantee simplifies to (1 − 1/e)^2 − δ. Finally, we give an upper bound on the increase in optimal solution value as a function of the cardinality and essentially resolve a conjecture posed in [30], leading to a deterministic 1 − 1/e − ε approximation for constant m (all other results are randomized).

Outline of the paper: We start with definitions and preliminaries in Section 2; in particular, we review the parts of the algorithm in [9] that are essential for understanding the results here. In Section 3, we state and prove the main results. Finally, we conclude in Section 4 with some open problems.

2 Preliminaries

While our focus will be on the cardinality case, discussing the more general matroid setting will help explain why our improvements do not translate to the matroid case. Given a ground set N of n elements, we use M = (N, I) to denote a matroid with collection of independent sets I, and P(M) denotes the matroid polytope. Also, let d = rank(M). For real vectors x, we use the shorthand |x| to denote the ℓ_1 norm. Define the marginal increase in value of a function f due to a set X, when added to the set A, as f(X|A) = f(A ∪ X) − f(A). Let β(η) = 1 − 1/e^η ∈ [0, 1 − 1/e] for η ∈ [0, 1]. Note that β(1) = 1 − 1/e. Further, since 1 − e^{−η} is concave in η and vanishes at η = 0, we have for k' ≤ k,

β(k'/k) = 1 − e^{−k'/k} ≥ (1 − 1/e) k'/k.   (1)

This function appears naturally in our analysis and will be useful for expressing approximation guarantees of the algorithms. Recall the widely popular greedy algorithm for P_0, summarized as: starting with the empty set, at each step add to the current set an element with maximum marginal value, until k elements are chosen.
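As an illustration (not from the paper), here is a minimal Python sketch of this greedy routine, with `f` given as a set-function oracle; the coverage function in the usage example is our own toy instance:

```python
def greedy(f, ground_set, k):
    """Classical greedy for P_0: maximize f(A) subject to |A| <= k.

    f: oracle taking a frozenset and returning a real value, assumed
       monotone submodular with f(empty set) = 0.
    Makes O(n*k) oracle queries; returns a set with value >= (1 - 1/e) OPT.
    """
    A = set()
    for _ in range(k):
        best_e, best_gain = None, 0.0
        for e in ground_set:
            if e in A:
                continue
            # Marginal value f(e | A).
            gain = f(frozenset(A | {e})) - f(frozenset(A))
            if gain > best_gain:
                best_e, best_gain = e, gain
        if best_e is None:  # no element adds positive value
            break
        A.add(best_e)
    return A

# Toy usage: a coverage function (monotone submodular).
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}
f = lambda A: len(set().union(*(sets[i] for i in A)) if A else set())
print(greedy(f, set(sets), k=2))  # {1, 3}, covering all 5 items
```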

It was shown by Nemhauser et al. [27, 28] that this achieves a guarantee of β(1) for P_0 and that this is best possible. Next, given f : 2^N → R, recall that its multilinear extension at x = (x_1, ..., x_n) ∈ [0, 1]^n is defined as

F(x) = Σ_{S ⊆ N} f(S) Π_{i ∈ S} x_i Π_{j ∉ S} (1 − x_j).

The function can be interpreted as the expected value of f over random sets obtained by including each element i ∈ N independently with probability x_i. While evaluating the exact value of this function is hard in general, for the purpose of using it in optimization algorithms, approximations obtained from a sampling-based oracle suffice [4, 6, 9].

Now, we discuss the two variants of the multi-objective optimization problem and argue that they are equivalent formulations. In both variants we are given m monotone submodular functions f_1(·), ..., f_m(·) and a matroid M with rank d. The first variant, P_1, the cardinality version of which was discussed in Section 1, is to find the optimal solution (or more reasonably, an approximation) to max_{X ∈ I} min_i f_i(X). Note that the function min_i f_i(·) is in general neither submodular nor supermodular, and recall that this variant was shown in [24] to be inapproximable for m = Ω(k). The decision version of this variant is: given t > 0, find a set S* ∈ I such that min_i f_i(S*) ≥ t, or give a certificate of infeasibility. In the second variant, say P_2, which is used in [9], we are given a set of m positive real values V_i and we wish to find a set S* ∈ I such that f_i(S*) ≥ V_i for all i ∈ {1,...,m}, or certify that no such S* exists. Since this is computationally hard, one instead aims to efficiently find a set S ∈ I such that f_i(S) ≥ αV_i for all i and some factor α, or certify that there is no set S* with f_i(S*) ≥ V_i for all i. To see the equivalence, simply observe that w.l.o.g. we can assume V_i = 1 for all i (since we can consider the functions f_i(·)/V_i instead). The equivalence holds even for approximation factors. Finally, we use the notation S*_M for an optimal solution and OPT_M for the optimal solution value of formulation P_1. In the case of a cardinality constraint k, we simply write S*_k, OPT_k.

We now review the algorithm of Chekuri et al. [9], which is based on P_2. We refer to it as Algorithm 1; it has three stages.

Stage 1: Intuitively, the purpose of this stage is to pick a small initial set consisting of elements with large marginal values, i.e. ≥ ε^3 V_i for some i, ensuring that the marginal value added by any leftover element to every function is small (< ε^3 V_i for all i). This is necessary for technical reasons in the rounding procedure of Stage 3. Given a set S ∈ I, index the elements of S = {s_1, s_2, ...} in the order the greedy algorithm would pick them. By virtue of monotonicity, for any fixed i there are at most 1/ε^3 elements with f_i(s_j | {s_1,...,s_{j−1}}) ≥ ε^3 V_i; in fact, due to decreasing marginal values, f_i(s_j | {s_1,...,s_{j−1}}) < ε^3 V_i for every j > 1/ε^3. This implies that we need only focus on sets of size at most m/ε^3 (at most 1/ε^3 elements for each function) to find the right initial set. In particular, one can try all possible initial sets of this size (i.e. run the subsequent stages with each possible starting set), leading to the n^{m/ε^3} term in the runtime. Stages 2 and 3 have runtime polynomial in m (in fact, Stage 3 has runtime independent of m), hence Stage 1 is really the bottleneck.
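As an aside, the sampling-based oracle for F mentioned above is straightforward to realize. The following is a minimal Monte Carlo sketch (our own illustration, not the paper's code; the sample count is a tunable assumption):

```python
import random

def multilinear_estimate(f, x, num_samples=1000, rng=random):
    """Monte Carlo estimate of the multilinear extension F(x).

    f: set-function oracle on frozensets of indices {0, ..., n-1}.
    x: list of coordinates in [0, 1].
    F(x) is the expectation of f over random sets that include each
    element i independently with probability x[i], so the empirical
    mean over independent samples is an unbiased estimator.
    """
    total = 0.0
    for _ in range(num_samples):
        sample = frozenset(i for i, xi in enumerate(x) if rng.random() < xi)
        total += f(sample)
    return total / num_samples
```

Standard concentration arguments control the estimation error; the sample counts needed by the algorithms in [4, 6, 9] are polynomial in n.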
It is not at all obvious whether one can do better than brute-force enumeration over all possible starting sets and still retain the approximation guarantee, since the final solution must be an independent set of a matroid. However, as we show later, for cardinality constraints one can indeed avoid enumeration.

Stage 2: Given a starting set S from Stage 1, we now work with the ground set N \ S, the contracted matroid M\S = {X : X ∪ S ∈ I} with rank d − |S|, and the monotone submodular functions f_i(·|S). We may also assume f_i(e|S) < ε^3 V_i for all i, since otherwise we can discard the starting set S. Let x_S ∈ {0,1}^n be the characteristic vector of S (i.e. its support is S). Starting at the origin, this stage maintains a point x^t ∈ P(M\S) and runs for T = O(d^2) steps. At step t, it solves an LP with one constraint per function i, in addition to the constraints given by the matroid polytope, to obtain a direction y^t ∈ P(M\S).

The point is then updated as x^{t+1} = x^t + y^t/T. The process is referred to as the continuous greedy algorithm [6, 9, 34]. Given x^0 = 0, we have x^t ∈ P(M\S) for all t, and observe that |x^t| < |x^{t+1}| ≤ |x^T| = d − |S|. For 0 < η ≤ 1, define x(M, η) as the final point when this subroutine is executed for ηT steps with matroid polytope P(M). Note that the output could also be a certificate of infeasibility, in which case we move on to the next starting set. We have from [9]:

Lemma 1 ([9], Lemma 7.3). F_i(x(M, 1)) ≥ (1 − 1/e)V_i for all i.

This implies F_i(x(M\S, 1) | S) ≥ (1 − 1/e)(V_i − f_i(S)) for all i, which gives us F_i(x(M\S, 1) + x_S) ≥ (1 − 1/e)V_i for all i. However, in the next section we will need the following more general statement.

Lemma 2. F_i(x(M, η)) ≥ β(η)V_i for all i, and |x(M, η)| = ηd.

Proof. We refer to Sections 2.3 and 3 of [6] and Lemma 7.3 of [9] for intuition and the key details of the core analysis (which we do not repeat here). All we need is to note that the proof of Lemma 3.3 in [6] in fact implies, more generally, that when dealing with a single function i, if the algorithm is terminated after a fraction η of the total steps, we get a point x(M, η) ∈ P(M) such that F_i(x(M, η)) ≥ (1 − (1 − 1/d)^{ηd})V_i ≥ β(η)V_i. Further, |x(M, η)| = ηd. Now, as observed in Lemma 7.3 of [9], when dealing simultaneously with multiple functions, all essential details of the algorithm and analysis remain unchanged, except that the LP solved at each step increases in complexity; hence we get the desired bound for every i. Note that the above property is well known for the classical greedy algorithm [27, 28] (ηk steps lead to a β(η) approximation), so it is not surprising that it also holds for the continuous greedy algorithm.

In terms of runtime, a straightforward implementation of the continuous greedy process can take up to Θ(n^8) steps [6]. This was reduced to O(n^4/ε^3) in [13] and subsequently to Õ(n^2) in [4]. However, the last improvement does not obviously translate to the setting with multiple functions. Thus, currently the fastest algorithm for this stage has runtime worse than O(n^4/ε^3), which can be as bad as O(n^5). Further, unlike the case of a single submodular function, where the LP at each step is an instance of the max-weight independent set problem over a matroid (which lends itself to a fast greedy solution), with multiple functions one needs to solve the LP using a matroid separation oracle, further exacerbating the runtime problem.

Stage 3: For the right starting set S_1, Stage 2 successfully outputs a point x(M\S_1, 1). Stage 3 decomposes x(M\S_1, 1) into a convex combination of bases of M\S_1 (or uses the natural decomposition obtained in Stage 2). This is followed by a random process that collapses these bases into a single basis S_2 such that S_2 ∈ M\S_1 and f_i(S_1 ∪ S_2) ≥ (1 − 1/e)(1 − ε)V_i for all i. Additionally, this requires ε < 1/(8 ln m). The rounding procedure is called swap rounding.

Lemma 3 ([9], Theorems 1.4 and 7.2). Given m monotone submodular functions f_i(·) with all marginal values in [0, ε^3 V_i], a fractional point x ∈ P(M) and ε < 1/(8γ ln m), swap rounding yields a set R ∈ M such that Pr[f_i(R) < (1 − ε)F_i(x)] < m e^{−1/(8ε)} < 1/m^{γ−1} for all i.

Remark: For any γ > 1, the above can be converted to a result holding w.h.p. by standard repetition.

3 Main Results

3.1 A (1 − 1/e) approximation for m = o(k/log^3 k)

We replace the enumeration in Stage 1 with a single starting set, obtained by scanning once over the ground set.
The main idea is simply that, in the cardinality constraint case, any starting set that fulfills the Stage 3 requirement of small marginals is acceptable. As mentioned previously, Stages 2 and 3 scale well in m, so for the modification to make sense at a high level, we should see a decline in the guarantee as m starts to grow towards Ω(k); otherwise we would have a constant factor approximation in a regime which is known to be inapproximable. Indeed, we will incur an additional factor of 1 − m/(kε^3) in the guarantee due to the new Stage 1. For ε ≤ 1/(8 ln m) this factor tends to 1 when m = o(k/log^3 k) but degrades to 0 for m = Ω(k).

New Stage 1: Start with S_1 = ∅ and pass over all elements once, updating S_1 whenever an element is chosen to be added. For each element e, add it to S_1 if f_i(e|S_1) ≥ ε^3 V_i for some i. Note that we add at most m/ε^3 elements (at most 1/ε^3 for each function, as discussed in Section 2). When the subroutine terminates, every remaining element e ∈ N\S_1 satisfies f_i(e|S_1) < ε^3 V_i for all i (as required by Lemma 3).

Modified Stage 2: Since we are no longer concerned with general matroids, let us use the simplified notation x(k, η) to denote the output of Stage 2 of Algorithm 1, with the cardinality constraint k replacing the matroid M. Further, let k_1 = |S_1| ≤ m/ε^3. The modified second stage outputs x(k, 1 − k_1/k). Contrast this with the output of the traditional Stage 2: x(k − k_1, 1). We do this to simplify the analysis here; one can in fact use the traditional output x(k − k_1, 1) just as well, though that adds unnecessary complexity to the analysis at this point. We elaborate on this in Section 3.3. Now, as a direct corollary of Lemma 2:

Corollary 4. F_i(x(k, 1 − k_1/k) | S_1) ≥ β(1 − k_1/k) f_i(S*_k | S_1) ≥ β(1 − k_1/k)(V_i − f_i(S_1)), and |x(k, 1 − k_1/k)| = k − k_1.

Stage 3: We use the swap rounding algorithm on x(k, 1 − k_1/k) to get a set S_2 of size k − k_1, and output the final set S_1 ∪ S_2.

Theorem 5. For ε = min{1/(8 ln m), (m/k)^{1/4}}, with constant probability the algorithm outputs a (1 − 1/e)(1 − ε)(1 − m/(kε^3)) approximate solution. Asymptotically, this is 1 − 1/e approximate for m = o(k/log^3 k).

Proof. From Lemma 3 and Corollary 4 we have f_i(S_2|S_1) ≥ β(1 − k_1/k)(1 − ε)(V_i − f_i(S_1)) for all i. Using (1) to lower bound β(1 − k_1/k), we get

f_i(S_1 ∪ S_2) ≥ β(1)(1 − ε)(1 − m/(kε^3))(V_i − f_i(S_1)) + f_i(S_1) ≥ (1 − 1/e)(1 − ε)(1 − m/(kε^3))V_i for all i.

Note that the runtime is now independent of ε: Stage 1 has runtime O(n), and Stages 2 and 3 have runtime polynomial in n, m. Thus, we can refine the guarantee by picking a suitable ε. We choose ε = min{1/(8 ln m), (m/k)^{1/4}}, where the 1/(8 ln m) term is due to Lemma 3 and the (m/k)^{1/4} term balances ε and m/(kε^3). The resulting guarantee becomes (1 − 1/e)(1 − h(k)), where h(k) → 0 as k → ∞, so long as m = o(k/log^3 k).

Owing to the limitation posed by the runtime of the continuous greedy based Stage 2, highlighted towards the end of Section 2, we next introduce a simple and much faster algorithm. The speedup, however, comes at the cost of an extra factor of (1 − 1/e) in the approximation guarantee.

3.2 A fast (1 − 1/e)^2 − δ approximation for m = o(k/log^3 k)

The algorithm has three stages as before. Stage 1 remains the same as the new Stage 1 introduced in the previous section; let S_1 be its output as before. Stage 2 was the main computational bottleneck above, and we replace it with a fast MWU based subroutine that runs for T = O(ln m / δ^2) rounds and solves an instance of P_0 in each round. Here δ manifests as a subtractive term in the approximation guarantee. The currently fastest algorithm for P_0, in [26], has runtime O(n log(1/δ)) and an expected guarantee of (1 − 1/e) − δ.
However, the slightly slower, but still nearly linear time, O((n/δ) log(n/δ)) thresholding algorithm in [4] has the usual deterministic guarantee of (1 − 1/e) − δ. Either of these would lead to a runtime of Õ(nT/δ) = Õ(n/δ^3), which is a vast improvement over O(n^5).

Consider some algorithm A for P_0 with guarantee α, and let A(f, k) denote its output given monotone submodular function f and cardinality constraint k as input. Note that α can be as large as β(1); also, k_1 = |S_1| as before.

Algorithm Stage 2: MWU
1: Initialize T = 2 ln m / δ^2, λ_i^1 = 1/m, and f̄_i(·) = f_i(·|S_1) / (V_i − f_i(S_1)) for all i
2: while 1 ≤ t ≤ T do
3:     g^t(·) = Σ_{i=1}^m λ_i^t f̄_i(·)
4:     X^t = A(g^t, k − k_1)
5:     m_i^t = f̄_i(X^t) − α
6:     λ_i^{t+1} = λ_i^t (1 − δ m_i^t)
7:     t = t + 1
8: Output: x_2 = (1/T) Σ_{t=1}^T X^t

The point x_2 obtained above is rounded to a set S_2 in Stage 3 (which remains unchanged). The final output is S_1 ∪ S_2. Observe that the experts in this instance of MWU are the normalized functions f̄_i(·). In the Plotkin-Shmoys-Tardos (PST) framework, each round involves solving an LP over a convex combination of the linear constraints (the experts); here we solve an instance of P_0 for the monotone submodular function given by a convex combination of the experts f̄_i. Further, in the PST framework, the convex combination of the solutions obtained over all rounds is approximately feasible due to linearity; here the non-linearity leads to a loss in the guarantee. Next, we formalize this and prove the claimed guarantee and runtime bounds.

Lemma 6. g^t(X^t) ≥ α(1 − m/(kε^3)) Σ_i λ_i^t for all t.

Proof. Consider the optimal set S*_k and note that Σ_i λ_i^t f̄_i(S*_k) ≥ Σ_i λ_i^t for all t. Now, if we use the function g^t(·) = Σ_i λ_i^t f̄_i(·) to greedily index the elements of S*_k = {s_1, ..., s_k}, then the truncated set S' = {s_1, ..., s_{k−|S_1|}} satisfies g^t(S') ≥ (1 − |S_1|/k) Σ_i λ_i^t, and furthermore it has the right cardinality k − |S_1|. Then, since g^t, being a sum of finitely many monotone submodular functions, is itself monotone submodular, Step 4 of the algorithm always outputs a set X^t such that g^t(X^t) ≥ α(1 − |S_1|/k) Σ_i λ_i^t ≥ α(1 − m/(kε^3)) Σ_i λ_i^t for all t.

Lemma 7. (1/T) Σ_t Σ_i λ_i^t (f̄_i(X^t) − α) ≤ δ + (1/T) Σ_t f̄_i(X^t) − α for all i.

To see this, one need only observe that the non-linearity of the functions plays no role in deriving the above inequality, and the proof exactly follows the analysis of the PST framework in [1] (Theorems 3.3 and 2.1), except that instead of querying an LP oracle in each round, we query an algorithm A(·) for P_0. For the sake of completeness we include the proof in Appendix A. Note that in the PST framework, the LP oracle could output a certificate of infeasibility; analogously, in our case, if A outputs a set X^t at some round t such that Lemma 6 is violated, we have a certificate of infeasibility.

Now consider the output x_2 = (1/T) Σ_t X^t (identifying each set X^t with its characteristic vector), and note that in the PST framework, linearity easily implies that x_2 is an approximately feasible solution to the LP. In an ideal scenario, one could hope for an analogous result here, i.e., F_i(x_2 | S_1) ≥ (α − δ)(V_i − f_i(S_1)) for all i. However, due to the non-linearity of our functions we should expect to incur some cost. Fortunately, it turns out that this only costs us an extra factor of (1 − 1/e) in the approximation guarantee.
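For concreteness, here is a self-contained Python sketch of this MWU stage (our own illustration, not the paper's code). The name `maximize_p0` stands in for any algorithm A for P_0, e.g. the greedy sketch from Section 2; the guarantee α = 1 − 1/e is an assumption on A, and the targets V_i and Stage 1 set S_1 are assumed given:

```python
import math

def mwu_stage2(fs, V, S1, ground_set, k, delta, maximize_p0):
    """MWU-based Stage 2 sketch for the multi-objective problem (P_2 form).

    fs:  list of m monotone submodular oracles f_i on frozensets.
    V:   list of target values V_i, assumed to satisfy V_i > f_i(S1).
    S1:  starting set produced by the single-pass New Stage 1.
    maximize_p0(g, ground, size): any approximation algorithm for P_0.
    Returns the sets X^1, ..., X^T whose uniform average is the point x_2.
    """
    m = len(fs)
    alpha = 1.0 - 1.0 / math.e           # assumed guarantee of maximize_p0
    T = max(1, math.ceil(2.0 * math.log(m) / delta ** 2))
    base = [f(frozenset(S1)) for f in fs]

    def f_bar(i, X):
        # Normalized expert: f_i(X | S1) / (V_i - f_i(S1)).
        return (fs[i](frozenset(set(X) | set(S1))) - base[i]) / (V[i] - base[i])

    lam = [1.0 / m] * m                  # weights lambda_i^1
    rounds = []
    for _ in range(T):
        # A convex combination of the experts is again monotone submodular.
        weights = list(lam)
        g = lambda X: sum(w * f_bar(i, X) for i, w in enumerate(weights))
        X_t = maximize_p0(g, set(ground_set) - set(S1), k - len(S1))
        rounds.append(X_t)
        # Multiplicative update with losses m_i^t = f_bar_i(X^t) - alpha.
        lam = [l * (1.0 - delta * (f_bar(i, X_t) - alpha))
               for i, l in enumerate(lam)]
    return rounds                        # x_2 = (1/T) * sum of these sets
```

Stage 3 then rounds the uniform average of the returned sets; the analysis below shows that the non-linearity costs only a further (1 − 1/e) factor.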

Theorem 8. For ε = min{1/(8 ln m), (m/k)^{1/4}}, the algorithm makes O((n/δ^3) log m log(n/δ)) queries and, with constant probability, outputs a feasible (1 − 1/e)^2 (1 − m/(kε^3))(1 − ε) − δ approximate set. Asymptotically, this is (1 − 1/e)^2 − δ approximate for m = o(k/log^3 k).

Proof. We first focus on the approximation guarantee. To that end, we assume α = β(1) and process the LHS of Lemma 7:

(1/T) Σ_t Σ_i λ_i^t (f̄_i(X^t) − β(1)) = (1/T) Σ_t (g^t(X^t) − β(1) Σ_i λ_i^t) ≥ −(1/T) Σ_t (β(1)|S_1|/k)(Σ_i λ_i^t) (using Lemma 6) ≥ −β(1)|S_1|/k.

Thus, we have for every i,

(1/T) Σ_t f̄_i(X^t) ≥ β(1) − (δ + β(1)|S_1|/k) = β(1)(1 − |S_1|/k) − δ.

Next, we show that F_i(x_2) ≥ (1 − 1/e)(1/T) Σ_{t=1}^T f̄_i(X^t). Recall that x_2 = (1/T) Σ_t X^t and that the concave closure of a submodular function f is given by

f^+(x) = max{ Σ_{X ⊆ N} α_X f(X) : Σ_X α_X 1_X = x, Σ_X α_X ≤ 1, α_X ≥ 0 for all X ⊆ N }.

Clearly, f̄_i^+(x_2) ≥ (1/T) Σ_t f̄_i(X^t). So we just need to show F_i(x_2) ≥ (1 − 1/e) f̄_i^+(x_2). This in fact follows directly from Lemmas 4 and 5 in [7], which state f*(x) ≥ f^+(x) and F(x) ≥ (1 − 1/e)f*(x), respectively, for every monotone submodular function f. Here f*(·) is another extension of submodular functions, which we do not need to introduce for the purpose of this proof. Additionally, we give a novel and direct proof of the inequality F(x) ≥ (1 − 1/e)(1/T) Σ_{t=1}^T f(X^t) for x = (1/T) Σ_t X^t in Appendix B. Thus,

F_i(x_2) ≥ (1 − 1/e)(1/T) Σ_{t=1}^T f̄_i(X^t) ≥ β(1)^2 (1 − |S_1|/k) − β(1)δ.

Finally, from Lemma 3, using swap rounding on x_2 gives a set S_2 such that f_i(S_2|S_1) ≥ [(1 − 1/e)^2 (1 − |S_1|/k)(1 − ε) − δ](V_i − f_i(S_1)) for all i, with constant probability, proving the claimed approximation guarantee. Similar to the proof of Theorem 5, for m = o(k/log^3 k) and ε = min{1/(8 ln m), (m/k)^{1/4}}, we have (1 − m/(kε^3))(1 − ε) → 1 as k → ∞, resulting in the guarantee (1 − 1/e)^2 − δ.

To see the runtime bound, recall from Section 3.1 that Stage 1 takes time O(n). Stage 2 runs an instance of A(·) T times, leading to an upper bound of O((n/δ) log(n/δ) · (ln m)/δ^2) = O((n/δ^3) log m log(n/δ)) if we use the thresholding algorithm in [4]; as discussed above, we can get a slightly faster runtime of O((n/δ^2) log m log(1/δ)) by compromising and allowing a guarantee in expectation. Recall that using either of these algorithms incurs an additional multiplicative factor of (1 − δ) in the guarantee above. Finally, since Stage 2 gives a decomposition of x_2 into T sets of size k − k_1 each, swap rounding in Stage 3 directly proceeds in T rounds by probabilistically collapsing these sets into a single set. Each round takes O(k) time, leading to a total runtime of O((k/δ^2) log m). Combining all three, we are left with the dominant O((n/δ^3) log m log(n/δ)) term.
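For the uniform matroid, the T-round merging just described is easy to make concrete. The following sketch (our own illustration based on the description above, not code from [9]) collapses the T equal-size sets from Stage 2 into one set, merging pairwise with weight proportional to the number of sets already merged; for a uniform matroid any pair of differing elements is an exchangeable pair:

```python
import random

def swap_round_uniform(sets_list, rng=random):
    """Swap rounding specialized to a uniform (cardinality) matroid.

    sets_list: the T sets X^1, ..., X^T of equal size from Stage 2,
    whose uniform average is the fractional point x_2.
    The running set C (carrying weight t) and the next set B are
    reconciled one random swap at a time until they coincide.
    """
    C = set(sets_list[0])
    for t, B in enumerate(sets_list[1:], start=1):
        B = set(B)
        while C != B:
            i = next(iter(C - B))  # element only in the running set
            j = next(iter(B - C))  # element only in the new set
            # Keep i with probability proportional to accumulated weight t.
            if rng.random() < t / (t + 1.0):
                B.remove(j)
                B.add(i)
            else:
                C.remove(i)
                C.add(j)
        C = B  # C == B; it carries weight t + 1 in the next merge
    return C
```

Each swap shrinks the symmetric difference by two, which is what makes each of the T merge rounds run in O(k) time, as counted in the proof above.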

3.3 Variation in optimal solution value as k increases

Consider the problem P_0 with cardinality constraint k. Given an optimal solution S*_k with value OPT_k, it is not difficult to see that for arbitrary k' ≤ k there is a subset S' ⊆ S*_k of size k' such that f(S') ≥ (k'/k) OPT_k. For instance, as we argued in Lemma 6, indexing the elements of S*_k using the greedy algorithm and choosing the set formed by the first k' elements gives such a set. This implies OPT_{k'} ≥ (k'/k) OPT_k, and the bound is easily seen to be tight. This raises a natural question: can we generalize this bound on the variation of optimal solution value with varying k to multi-objective maximization? A priori, this is not obvious even for modular functions. In particular, indexing elements in the order they are picked by the greedy algorithm no longer works, since there are many functions and we need to balance values amongst all of them. We show that one can indeed derive such a bound.

Theorem 9. Suppose there exists a set S*_k such that f_i(S*_k) ≥ V_i for all i, and let ε < 1/(8 ln m). For every k' ∈ [m/ε^3, k], there exists S' ⊆ S*_k of size k' such that

f_i(S') ≥ (1 − ε) ((k' − m/ε^3)/(k − m/ε^3)) V_i for all i.

Proof. Since we merely need to show existence, we will not concern ourselves with computational issues. First, restrict the ground set to S*_k and let S_1 be a subset of size at most m/ε^3 such that f_i(e|S_1) < ε^3 V_i for all e ∈ S*_k \ S_1 and all i (recall that we argued the existence of such a set in Section 2, Stage 1). Recall that x_{S*_k\S_1} denotes the characteristic vector of the set S*_k \ S_1, and consider the point x = ((k' − |S_1|)/(k − |S_1|)) x_{S*_k\S_1}. Clearly |x| = k' − |S_1|, and furthermore, from the concavity of multilinear extensions in positive directions (Section 2.1 of [6]), we have

F_i(x | S_1) ≥ ((k' − |S_1|)/(k − |S_1|)) f_i(S*_k\S_1 | S_1) ≥ ((k' − |S_1|)/(k − |S_1|))(V_i − f_i(S_1)) for all i.

Finally, using the swap rounding Lemma 3, there exists a set S_2 of size k' − |S_1| such that f_i(S_1 ∪ S_2) ≥ (1 − ε)((k' − |S_1|)/(k − |S_1|))V_i for all i. Since |S_1| ≤ m/ε^3 and (k' − s)/(k − s) is non-increasing in s, this is at least the claimed bound.

Previously, Lemma 13 in [30] showed a similar property using a different constructive argument, but only for the very restricted case of constant m and k' (their result does generalize for larger m, but leads to a very loose bound).

Resolution of the conjecture in [30]: Note that this essentially resolves the conjecture in [30], which states that for constant m and every k' ≥ m, there exists a set S of size k' such that f_i(S) ≥ Θ(1)(k'/k)V_i for all i. This implies that for large enough k, we can choose sets of size k' (k'-tuples) at each step to get a deterministic (1 − 1/e) − ε approximation with runtime O(n^{m/ε^4}) for the multi-objective maximization problem when m is constant (all previous algorithms, including the ones presented here, are randomized).

Theorem 10. For k' = m/ε^4, choosing k'-tuples greedily w.r.t. h(·) = min_i f_i(·) yields an approximation guarantee of (1 − 1/e)(1 − 2ε) while making n^{m/ε^4} queries.

Proof. The analysis is similar to that of the standard greedy algorithm [27, 28]. Let S^j denote the set at the end of iteration j, with S^0 = ∅, and let the final set be S^{k/k'}. From Theorem 9, at each step j there is some set X ⊆ S*_k \ S^j of size k' such that

f_i(X | S^j) ≥ (1 − ε) ((k' − m/ε^3)/(k − m/ε^3)) (V_i − f_i(S^j)) for all i.

To simplify the presentation, let η = (1 − ε)(k' − m/ε^3)/(k − m/ε^3), and note that η ≤ k'/k ≤ 1; further, η approaches (1 − ε)k'/k when k' ≫ m/ε^3, for fixed m and k' = o(k). We have for every i, f_i(S^{j+1}) ≥ (1 − η)f_i(S^j) + ηV_i; call this

inequality j + 1. Observe that inequality k/k' reads f_i(S^{k/k'}) ≥ (1 − η)f_i(S^{k/k'−1}) + ηV_i for all i. Therefore, multiplying inequality k/k' − j by (1 − η)^j and telescoping, we get for every i,

f_i(S^{k/k'}) ≥ Σ_{j=0}^{k/k'−1} (1 − η)^j η V_i = (1 − (1 − η)^{k/k'}) V_i = (1 − ((1 − η)^{1/η})^{ηk/k'}) V_i ≥ β(ηk/k') V_i ≥ (1 − 1/e)(ηk/k') V_i,

where we used (1) for the last inequality. Let ε = (m/k')^{1/4}, i.e. k' = m/ε^4; then

ηk/k' = (1 − ε)(k/k')((k' − m/ε^3)/(k − m/ε^3)) ≥ (1 − ε)(1 − m/(k'ε^3)) = (1 − ε)^2 = (1 − (m/k')^{1/4})^2.

Therefore, as k grows we get the asymptotic guarantee (1 − 1/e)(1 − (m/k')^{1/4})^2 = (1 − 1/e)(1 − ε)^2 ≥ (1 − 1/e)(1 − 2ε).

Finally, we would like to address a technical point raised in the discussion of the Modified Stage 2 in Section 3.1. We used the output x(k, 1 − k_1/k) from the continuous greedy algorithm (Stage 2) as opposed to x(k − k_1, 1). The reason was simply that using the output x(k − k_1, 1) would have led to the additional difficulty of comparing the values of the sets S*_{k−k_1} and S*_k, which is now straightforward. Formally, F_i(x(k − k_1, 1) | S_1) ≥ β(1) f_i(S*_{k−k_1} | S_1), and for m = o(k/log^3 k) we have k_1 ≤ m/ε^3; therefore, due to Theorem 9, f_i(S*_{k−k_1} | S_1) ≥ (1 − ε)((k − k_1 − m/ε^3)/(k − m/ε^3)) f_i(S*_k | S_1), which gives an asymptotic (1 − 1/e) bound even for x(k − k_1, 1).

4 Conclusion and Open Problems

In summary, we considered the problem of multi-objective maximization of monotone submodular functions subject to a cardinality constraint k, in the regime m = o(k/log^3 k). No polynomial time constant factor approximations or strong inapproximability results were known in this regime, though it was known that the problem is inapproximable for m = Ω(k) and admits a nearly 1 − 1/e approximation for constant m. We showed that for m = o(k/log^3 k) one can indeed approach the best possible guarantee of 1 − 1/e, and we also gave a nearly linear time (1 − 1/e)^2 − δ approximation. Finally, we established a natural bound on how the optimal solution value increases with increasing cardinality.

A natural question, of course, is whether one can achieve constant factor approximations right up to m = o(k). Additionally, it is also of interest to ask whether there are fast algorithms with guarantee closer to 1 − 1/e, in contrast to the (1 − 1/e)^2 − δ shown here. Further, most of the ideas, and all results here, are for the case of a uniform matroid (cardinality constraint). It is unclear whether similar results can be shown for a general matroid constraint.

Acknowledgments

The author would like to thank James B. Orlin and Andreas S. Schulz for their insightful comments and feedback on the first draft of this work.

References

[1] S. Arora, E. Hazan, and S. Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8:121–164, 2012.

[2] S. Arora and S. Kale. A combinatorial, primal-dual approach to semidefinite programs. In STOC, pages 227–236, 2007.

[3] Y. Azar and I. Gamzu. Efficient submodular function maximization under linear packing constraints. In ICALP '12, pages 38–50.

[4] A. Badanidiyuru and J. Vondrák. Fast algorithms for maximizing submodular functions. In SODA '14, pages 1497–1514. SIAM, 2014.

[5] D. Bienstock. Potential function methods for approximately solving linear programming problems: theory and practice, volume 53. Springer, 2006.

[6] G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing, 40(6):1740–1766, 2011.

[7] G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a submodular set function subject to a matroid constraint. In IPCO 2007, volume 7, pages 182–196.

[8] C. Chekuri, T.S. Jayram, and J. Vondrák. On multiplicative weight updates for concave and submodular function maximization. In ITCS '15, pages 201–210.

[9] C. Chekuri, J. Vondrák, and R. Zenklusen. Dependent randomized rounding via exchange properties of combinatorial structures. In FOCS '10, pages 575–584. IEEE, 2010.

[10] S. Dobzinski and J. Vondrák. From query complexity to computational complexity. In STOC '12, pages 1107–1116. ACM, 2012.

[11] U. Feige. A threshold of ln n for approximating set cover. Journal of the ACM (JACM), 45(4):634–652, 1998.

[12] M. Feldman, J.S. Naor, and R. Schwartz. A unified continuous greedy algorithm for submodular maximization. In FOCS '11, pages 570–579. IEEE.

[13] Y. Filmus and J. Ward. Monotone submodular maximization over a matroid via non-oblivious local search. SIAM Journal on Computing, 43:514–542, 2014.

[14] L. Fleischer. Approximating fractional multicommodity flow independent of the number of commodities. SIAM Journal on Discrete Mathematics, 13(4):505–520, 2000.

[15] N. Garg and R. Khandekar. Fractional covering with upper bounds on the variables: Solving LPs with negative entries. In European Symposium on Algorithms, pages 371–382, 2004.

[16] N. Garg and J. Koenemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. SIAM Journal on Computing, 37(2):630–652, 2007.

[17] M. Grigoriadis and L. Khachiyan. Fast approximation schemes for convex programs with many blocks and coupling constraints. SIAM Journal on Optimization, 4(1):86–107, 1994.

[18] C. Guestrin, A. Krause, and A.P. Singh. Near-optimal sensor placements in Gaussian processes. In Proceedings of the 22nd International Conference on Machine Learning, pages 265–272. ACM, 2005.

[19] X. He and D. Kempe. Robust influence maximization. In SIGKDD, pages 885–894, 2016.

[20] C. Koufogiannakis and N. Young. Beating simplex for fractional packing and covering linear programs. In FOCS, pages 494–504, 2007.

[21] A. Krause and C. Guestrin. Near-optimal nonmyopic value of information in graphical models. In UAI '05, pages 324–331, 2005.

[22] A. Krause, C. Guestrin, A. Gupta, and J. Kleinberg. Near-optimal sensor placements: Maximizing information while minimizing communication cost. In Proceedings of the 5th International Conference on Information Processing in Sensor Networks, pages 2–10. ACM, 2006.

[23] A. Krause, J. Leskovec, C. Guestrin, J. VanBriesen, and C. Faloutsos. Efficient sensor placement optimization for securing large water distribution networks. Journal of Water Resources Planning and Management, 134(6):516–526, 2008.

[24] A. Krause, H.B. McMahan, C. Guestrin, and A. Gupta. Robust submodular observation selection. Journal of Machine Learning Research, 9:2761–2801, 2008.

[25] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 420–429. ACM, 2007.

[26] B. Mirzasoleiman, A. Badanidiyuru, A. Karbasi, J. Vondrák, and A. Krause. Lazier than lazy greedy. In AAAI, 2015.

[27] G.L. Nemhauser and L.A. Wolsey. Best algorithms for approximating the maximum of a submodular set function. Mathematics of Operations Research, 3(3):177–188, 1978.

[28] G.L. Nemhauser, L.A. Wolsey, and M.L. Fisher. An analysis of approximations for maximizing submodular set functions I. Mathematical Programming, 14(1):265–294, 1978.

[29] J. Orlin, A.S. Schulz, and R. Udwani. Robust monotone submodular function maximization. In IPCO 2016, pages 312–324.

[30] J. Orlin, A.S. Schulz, and R. Udwani. Robust monotone submodular function maximization. arXiv preprint arXiv:1507.06616, 2016.

[31] A. Ostfeld, J.G. Uber, E. Salomons, J.W. Berry, W.E. Hart, C.A. Phillips, J.P. Watson, G. Dorini, P. Jonkergouw, Z. Kapelan, and F. di Pierro. The battle of the water sensor networks (BWSN): A design challenge for engineers and algorithms. Journal of Water Resources Planning and Management, 2008.

[32] C. Papadimitriou and M. Yannakakis. On the approximability of trade-offs and optimal access of web sources. In FOCS, pages 86–92, 2000.

[33] S. Plotkin, D. Shmoys, and É. Tardos. Fast approximation algorithms for fractional packing and covering problems. In FOCS, pages 495–504, 1991.

[34] J. Vondrák. Optimal approximation for the submodular welfare problem in the value oracle model. In STOC '08, pages 67–74. ACM.

[35] J. Vondrák. Symmetry and approximability of submodular maximization problems. SIAM Journal on Computing, 42(1):265–304, 2013.

[36] J. Vondrák, C. Chekuri, and R. Zenklusen. Submodular function maximization via the multilinear relaxation and contention resolution schemes. In STOC '11, pages 783–792. ACM, 2011.

[37] N. Young. Randomized rounding without solving the linear program. In SODA, volume 95, pages 170–178, 1995.

[38] N.E. Young. Sequential and parallel algorithms for mixed packing and covering. In FOCS, pages 538–546, 2001.

A

Lemma 11. (1/T) Σ_t Σ_i λ_i^t (f̄_i(X^t) − α) ≤ 4δ + (1/T) Σ_t f̄_i(X^t) − α for all i.

Proof. We use the potential function Φ^t = Σ_i λ_i^t. Let p_i^t = λ_i^t / Φ^t and M^t = Σ_i p_i^t m_i^t. Then we have

Φ^{t+1} = Σ_i λ_i^t (1 − δ m_i^t) = Φ^t − δ Φ^t Σ_i p_i^t m_i^t = Φ^t (1 − δ M^t) ≤ Φ^t e^{−δ M^t}.

After T rounds, Φ^T ≤ Φ^1 e^{−δ Σ_t M^t}. Further, for every i,

Φ^T ≥ λ_i^T = (1/m) Π_t (1 − δ m_i^t),

and taking logarithms,

ln(Φ^1 e^{−δ Σ_t M^t}) ≥ Σ_t ln(1 − δ m_i^t) − ln m, i.e., −δ Σ_t M^t ≥ −ln m + Σ_t ln(1 − δ m_i^t).

Using ln(1/(1 − ε)) ≤ ε + ε^2 and ln(1 + ε) ≥ ε − ε^2 for ε ≤ 0.5, and with T = 2 ln m / δ^2 and δ < 1 − 1/e (for a positive approximation guarantee), we have

(1/T) Σ_t M^t ≤ δ + (1/T) Σ_t m_i^t for all i.

B

Lemma 12. F(x) ≥ (1 − 1/e)(1/T) Σ_{t=1}^T f(X^t) for x = (1/T) Σ_{t=1}^T X^t.

Proof. W.l.o.g. assume the sets X^t are indexed such that f(X^j) ≥ f(X^{j+1}) for every j ≥ 1. To simplify notation, let a_t = f(X^t)/T and A = Σ_t a_t. Recall that F(x) can be viewed as the expected function value of the set obtained by independently sampling each element j with probability x_j. Consider the alternative and equivalent process where, starting with t = 1, one samples each element of the set X^t independently with probability 1/T. Thus the random process runs in T steps; suppose the subset newly sampled from X^t in step t adds marginal value Δ_t in expectation. It follows from submodularity that Δ_1 ≥ a_1, and in general, Δ_t ≥ a_t − (1/T) Σ_{j=1}^{t−1} Δ_j. Note that F(x) = Σ_t Δ_t. Now consider the LP where the objective is to minimize Σ_t Δ_t subject to a_1 ≥ a_2 ≥ ... ≥ a_T ≥ 0, Σ_t a_t = A, and Δ_t ≥ a_t − (1/T) Σ_{j=1}^{t−1} Δ_j with Δ_0 = 0. It follows that the minimum occurs

at an extreme point where Σ_t a_t = A, a_t = a_1 for all t ≤ j, and a_t = 0 for t > j, for some j. For all such points, it is not difficult to see that the objective is at least (1 − 1/e)A. Therefore, we have F(x) ≥ (1 − 1/e)A = (1 − 1/e) Σ_t f(X^t)/T, as desired.
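As an informal numerical sanity check of Lemma 12 (our own addition, not part of the paper), the following snippet compares the exact multilinear extension of a small coverage function against (1 − 1/e) times the average of f(X^t); the instance is a hypothetical toy example:

```python
import itertools, math

# Ground set {0,...,5}; f is a coverage function (monotone submodular).
cover = {0: {"a"}, 1: {"a", "b"}, 2: {"b", "c"},
         3: {"c"}, 4: {"d"}, 5: {"d", "e"}}
def f(S):
    return len(set().union(*(cover[i] for i in S))) if S else 0

T = 3
X = [{0, 1}, {2, 3}, {4, 5}]                       # sets X^1, ..., X^T
x = [sum(i in Xt for Xt in X) / T for i in range(6)]

def F(x):
    """Exact multilinear extension by enumeration (fine for n = 6)."""
    total = 0.0
    for r in range(len(x) + 1):
        for S in itertools.combinations(range(len(x)), r):
            p = 1.0
            for i in range(len(x)):
                p *= x[i] if i in S else 1 - x[i]
            total += f(set(S)) * p
    return total

avg = sum(f(Xt) for Xt in X) / T                   # here avg = 2
assert F(x) >= (1 - 1 / math.e) * avg              # Lemma 12, numerically
print(F(x), avg)                                   # approx. 2.556 vs 2.0
```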