Approximating Submodular Functions Nick Harvey University of British Columbia
Approximating Submodular Functions Part 1 Nick Harvey University of British Columbia Department of Computer Science July 11th, 2015 Joint work with M. Goemans, S. Iwata and V. Mirrokni 1 / 71
Submodular Functions Studied for decades in combinatorial optimization and economics Used in approximation algorithms, algorithmic game theory, machine learning, etc. Discrete analogue of convex functions [Lovász 83], [Murota 03] 2 / 71
Submodular Functions Studied for decades in combinatorial optimization and economics Used in approximation algorithms, algorithmic game theory, machine learning, etc. Discrete analogue of convex functions [Lovász 83], [Murota 03] 3 / 71
Valuation Functions A first step in economic modeling: individuals have valuation functions giving utility for different outcomes or events 4 / 71
Valuation Functions A first step in economic modeling: individuals have valuation functions giving utility for different outcomes or events Focus on combinatorial settings: n items, {1, 2,..., n} = [n] f : 2 [n] R Submodularity is often a natural assumption 5 / 71
Topic of Lectures Approximating submodular functions based on few values Motivating Example Microsoft Office consists of a set A of products. e.g., A = {Word, Excel, Outlook, PowerPoint,...}. Consumer has a valuation function f : 2 A R Want to learn f without asking consumer too many questions (Perhaps useful in pricing different bundles of Office?) 6 / 71
Topic of Lectures Approximating submodular functions based on few values Motivating Example Microsoft Office consists of a set A of products. e.g., A = {Word, Excel, Outlook, PowerPoint,...}. Consumer has a valuation function f : 2 A R Want to learn f without asking consumer too many questions (Perhaps useful in pricing different bundles of Office?) Outline Today: Actively querying the function Tomorrow: Observing samples from a distribution 7 / 71
Submodular Functions Definition f : 2 [n] R is submodular if, for all A, B [n]: f (A) + f (B) f (A B) + f (A B) 8 / 71
Submodular Functions Definition f : 2 [n] R is submodular if, for all A, B [n]: f (A) + f (B) f (A B) + f (A B) Equivalent Definition f is submodular if, for all A B and i / B: f (A {i}) f (A) f (B {i}) f (B) Diminishing returns: the more you have, the less you want 9 / 71
Example f is submodular if, for all A B and i / B: f (A {i}) f (A) f (B {i}) f (B) Diminishing returns: the more you have, the less you want 10 / 71
Combinatorial Examples Linear algebra: Let v 1,..., v n R d. For S [n], let f (S) = dim span { v i : i S }. 11 / 71
Combinatorial Examples Linear algebra: Let v 1,..., v n R d. For S [n], let f (S) = dim span { v i : i S }. Matroids: Let M = (E, I) be a matroid. For S E, the rank function is f (S) = max { I : I S, I I }. 12 / 71
Combinatorial Examples Linear algebra: Let v 1,..., v n R d. For S [n], let f (S) = dim span { v i : i S }. Matroids: Let M = (E, I) be a matroid. For S E, the rank function is f (S) = max { I : I S, I I }. Coverage: Let A 1,..., A n U. For S [n], define f (S) = max A i. i S 13 / 71
Minimizing Submodular Functions (Given Oracle Access) Can solve min S f (S) in polynomial time (and oracle calls). First shown using the ellipsoid method. [Grotschel, Lovasz, Schrijver 81] Combinatorial, strongly-polynomial time algorithms known. [Schrijver 01], [Iwata, Fleischer, Fujishige 01],... 14 / 71
Minimizing Submodular Functions (Given Oracle Access) Can solve min S f (S) in polynomial time (and oracle calls). First shown using the ellipsoid method. [Grotschel, Lovasz, Schrijver 81] Combinatorial, strongly-polynomial time algorithms known. [Schrijver 01], [Iwata, Fleischer, Fujishige 01],... One of the most powerful algorithms in combinatorial optimization. Generalizes bipartite matching, matroid intersection, polymatroid intersection, submodular flow,... Example: Matroid Intersection [Edmonds 70] Given matroids M 1 = (E, I 1 ) and M 2 = (E, I 2 ) max{ I : I I 1 I 2 } = min{r 1 (S) + r 2 (E \ S) : S E} 15 / 71
Maximizing Submodular Functions (Given Oracle Access) Maximization Can approximate max S f (S) to within 1/4, assuming f 0: Just pick S uniformly at random! [Feige, Mirrokni, Vondrák 07] 16 / 71
Maximizing Submodular Functions (Given Oracle Access) Maximization Can approximate max S f (S) to within 1/4, assuming f 0: Just pick S uniformly at random! [Feige, Mirrokni, Vondrák 07] Can approximate max S f (S) to within 1/2, assuming f 0. [Buchbinder, Feldman, Naor, Schwartz 12] 17 / 71
Monotone Functions Definition f : 2 [n] R is monotone if, for all A B [n]: f (A) f (B) Constrained Maximization Let M = (E, I) be a matroid. Assume f : 2 E R 0 is monotone and submodular. Can approximate max S I f (S) to within 1 1/e. [Calinescu, Chekuri, Pal, Vondrák 09], [Filmus, Ward 12] 18 / 71
Monotone Functions Definition f : 2 [n] R is monotone if, for all A B [n]: f (A) f (B) Problem Given a monotone, submodular f, construct using poly(n) oracle queries a function ˆf such that: ˆf (S) f (S) α(n) ˆf (S) S [n] Approximation Quality How small can we make α(n)? α(n) = n is trivial 19 / 71
Approximating Submodular Functions Everywhere Positive Result Problem Given a monotone, submodular f, construct using poly(n) oracle queries a function ˆf such that: ˆf (S) f (S) α(n) ˆf (S) S [n] Our Positive Result A deterministic algorithm that constructs ˆf (S) with α(n) = n + 1 for matroid rank functions f, or α(n) = O( n log n) for general monotone submodular f Also, ˆf is submodular: ˆf (S) = c i for some scalars c i. i S 20 / 71
Approximating Submodular Functions Everywhere Almost Tight Our Positive Result A deterministic algorithm that constructs ˆf (S) = c i with α(n) = n + 1 for matroid rank functions f, or α(n) = O( n log n) for general monotone submodular f Our Negative Result With polynomially many oracle calls, α(n) = Ω( n/ log n) (even for randomized algs) i S 21 / 71
Polymatroid Definition Given submodular f, polymatroid { P f = x R n + : } x i f (S) for all S [n] i S A few properties [Edmonds 70]: Can optimize over P f with greedy algorithm Separation problem for P f is submodular fctn minimization For monotone f, can reconstruct f : f (S) = max x P f 1 S, x 22 / 71
Our Approach: Geometric Relaxation We know: Suppose that: Then: where f (S) = max x P f 1 S, x Q P f λq ˆf (S) f (S) λˆf (S) ˆf (S) = max x Q 1 S, x Q P f λq 23 / 71
John s Theorem [1948] Maximum Volume Ellipsoids Definition A convex body K is centrally symmetric if x K x K. 24 / 71
John s Theorem [1948] Maximum Volume Ellipsoids Definition A convex body K is centrally symmetric if x K x K. Definition An ellipsoid E is an α-ellipsoidal approximation of K if E K α E. 25 / 71
John s Theorem [1948] Maximum Volume Ellipsoids Definition A convex body K is centrally symmetric if x K x K. Definition An ellipsoid E is an α-ellipsoidal approximation of K if E K α E. Theorem Let K be a centrally symmetric convex body in R n. Let E max (or John ellipsoid) be maximum volume ellipsoid contained in K. Then K n E max. 26 / 71
John s Theorem [1948] Maximum Volume Ellipsoids Definition A convex body K is centrally symmetric if x K x K. Definition An ellipsoid E is an α-ellipsoidal approximation of K if E K α E. Theorem Let K be a centrally symmetric convex body in R n. Let E max (or John ellipsoid) be maximum volume ellipsoid contained in K. Then K n E max. Algorithmically? 27 / 71
Ellipsoids Basics Definition An ellipsoid is E(A) = {x R n : x T Ax 1} where A 0 is positive definite matrix. Handy notation Write x A = x T Ax. Then E(A) = {x R n : x A 1} Optimizing over ellipsoids max x E(A) c, x = c A 1 28 / 71
Algorithms for Ellipsoidal Approximations Explicitly Given Polytopes Can find E max in P-time (up to ɛ) if explicitly given as K = {x : Ax b} [Grötschel, Lovász and Schrijver 88], [Nesterov, Nemirovski 89], [Khachiyan, Todd 93],... Polytopes given by Separation Oracle only n + 1-ellipsoidal approximation for convex bodies given by weak separation oracle [Grötschel, Lovász and Schrijver 88] No (randomized) n 1 ɛ -ellipsoidal approximation [J. Soto 08] 29 / 71
Finding Larger and Larger Inscribed Ellipsoids Informal Statement We have A 0 such that E(A) K. Suppose we find z K but z far outside of E(A). Then should be able to find A 0 such that E(A ) K vol E(A ) > vol E(A) 30 / 71
Finding Larger and Larger Inscribed Ellipsoids Informal Statement We have A 0 such that E(A) K. Suppose we find z K but z far outside of E(A). Then should be able to find A 0 such that E(A ) K vol E(A ) > vol E(A) 31 / 71
Finding Larger and Larger Inscribed Ellipsoids Formal Statement Theorem If A 0 and z R n with d = z 2 A n then E(A ) is max volume ellipsoid inscribed in conv{e(a), z, z} where A = n d d 1 n 1 A + n d 2 ( 1 d 1 n 1 Moreover, vol E(A ) = k n (d) vol E(A) where (d ) n ( ) n 1 n 1 k n (d) = n d 1 ) Azz T A z E(A) z 32 / 71
Finding Larger and Larger Inscribed Ellipsoids Formal Statement Theorem If A 0 and z R n with d = z 2 A n then E(A ) is max volume ellipsoid inscribed in conv{e(a), z, z} where A = n d d 1 n 1 A + n d 2 ( 1 d 1 n 1 Moreover, vol E(A ) = k n (d) vol E(A) where (d ) n ( ) n 1 n 1 k n (d) = n d 1 z ) Azz T A z 33 / 71
Finding Larger and Larger Inscribed Ellipsoids Remarks vol E(A ) = k n (d) vol E(A) where (d ) n ( ) n 1 n 1 k n (d) = n d 1 Remarks k n (d) > 1 for d > n proves John s theorem Significant volume increase for d n + 1: k n (n + 1) = 1 + Θ(1/n 2 ) Polar statement previously known [Todd 82] A gives formula for minimum volume ellipsoid containing E(A) { x : b c, x b } 34 / 71
Review of Plan Given monotone, submodular f, make n O(1) queries, construct ˆf s.t. ˆf (S) f (S) Õ( n) ˆf (S) S V. 35 / 71
Review of Plan Given monotone, submodular f, make n O(1) queries, construct ˆf s.t. ˆf (S) f (S) Õ( n) ˆf (S) S V. Can reconstruct f from the polymatroid P f = { x R n + : i S x i f (S) S [n] } by f (S) = max x Pf 1 S, x. 36 / 71
Review of Plan Given monotone, submodular f, make n O(1) queries, construct ˆf s.t. ˆf (S) f (S) Õ( n) ˆf (S) S V. Can reconstruct f from the polymatroid P f = { x R n + : i S x i f (S) S [n] } by f (S) = max x Pf 1 S, x. Make P f centrally symmetric by reflections: S(P f ) = { x : ( x 1, x 2,, x n ) P f } 37 / 71
Review of Plan Given monotone, submodular f, make n O(1) queries, construct ˆf s.t. ˆf (S) f (S) Õ( n) ˆf (S) S V. Can reconstruct f from the polymatroid P f = { x R n + : i S x i f (S) S [n] } by f (S) = max x Pf 1 S, x. Make P f centrally symmetric by reflections: S(P f ) = { x : ( x 1, x 2,, x n ) P f } Max volume ellipsoid E max has E max S(P f ) n E max. Take ˆf (S) = max x Emax 1 S, x. 38 / 71
Review of Plan Given monotone, submodular f, make n O(1) queries, construct ˆf s.t. ˆf (S) f (S) Õ( n) ˆf (S) S V. Can reconstruct f from the polymatroid P f = { x R n + : i S x i f (S) S [n] } by f (S) = max x Pf 1 S, x. Make P f centrally symmetric by reflections: S(P f ) = { x : ( x 1, x 2,, x n ) P f } Max volume ellipsoid E max has E max S(P f ) n E max. Take ˆf (S) = max x Emax 1 S, x. Compute ellipsoids E 1, E 2,... in S(P f ) that converge to E max. 39 / 71
Review of Plan Given monotone, submodular f, make n O(1) queries, construct ˆf s.t. ˆf (S) f (S) Õ( n) ˆf (S) S V. Can reconstruct f from the polymatroid P f = { x R n + : i S x i f (S) S [n] } by f (S) = max x Pf 1 S, x. Make P f centrally symmetric by reflections: S(P f ) = { x : ( x 1, x 2,, x n ) P f } Max volume ellipsoid E max has E max S(P f ) n E max. Take ˆf (S) = max x Emax 1 S, x. Compute ellipsoids E 1, E 2,... in S(P f ) that converge to E max. Given E i = E(A i ), need z S(P f ) with z Ai n + 1. If z, can compute Ei+1 of larger volume. If z, then Ei E max. 40 / 71
Remaining Task Ellipsoidal Norm Maximization Ellipsoidal Norm Maximization Given A 0 and well-bounded convex body K by separation oracle. (So B(r) K B(R) where B(d) is ball of radius d.) Solve max x K x A 41 / 71
Remaining Task Ellipsoidal Norm Maximization Ellipsoidal Norm Maximization Given A 0 and well-bounded convex body K by separation oracle. (So B(r) K B(R) where B(d) is ball of radius d.) Solve max x K x A Bad News Ellipsoidal Norm Maximization NP-complete for S(P f ) and P f. (Even if f is a graphic matroid rank function.) 42 / 71
Remaining Task Ellipsoidal Norm Maximization Ellipsoidal Norm Maximization Given A 0 and well-bounded convex body K by separation oracle. (So B(r) K B(R) where B(d) is ball of radius d.) Solve max x K x A Bad News Ellipsoidal Norm Maximization NP-complete for S(P f ) and P f. (Even if f is a graphic matroid rank function.) Approximations are good enough P-time α-approx. algorithm for Ellipsoidal Norm Maximization = P-time α n + 1-ellipsoidal approximation for K (in O(n 3 log(r/r)) iterations) 43 / 71
Ellipsoidal Norm Maximization Taking Advantage of Symmetry Our Task Given A 0, and f find max x S(Pf ) x A. 44 / 71
Ellipsoidal Norm Maximization Taking Advantage of Symmetry Our Task Given A 0, and f find max x S(Pf ) x A. Observation: Symmetry Helps S(P f ) invariant under axis-aligned reflections. (Diagonal {±1} matrices.) = same is true for E max = E max = E(D) where D is diagonal. 45 / 71
Remaining Task Ellipsoidal Norm Maximization Our Task Given diagonal D 0, and f find Equivalently, max x S(P f ) x D max s.t. i d ix 2 i x P f Maximizing convex function over convex set max attained at vertex. 46 / 71
Remaining Task Ellipsoidal Norm Maximization Our Task Given diagonal D 0, and f find max s.t. i d ix 2 i x P f Maximizing convex function over convex set max attained at vertex. Matroid Case If f is matroid rank function = vertices in {0, 1} n = xi 2 = x i. Our task is max i d ix i s.t. x P f This is the max weight base problem, solvable by greedy algorithm. 47 / 71
Remaining Task Ellipsoidal Norm Maximization Our Task Given diagonal D 0, and f find max s.t. i d ix 2 i x P f Maximizing convex function over convex set max attained at vertex. General Monotone Submodular Case More complicated: uses approximate maximization of submodular function [Nemhauser, Wolsey, Fischer 78], etc. Can find O(log n)-approximate maximum. 48 / 71
Summary of Algorithm Theorem In P-time, construct a (submodular) function ˆf (S) = c i with α(n) = n + 1 for matroid rank functions f, or α(n) = O( n log n) for general monotone submodular f. The algorithm is deterministic. i S 49 / 71
Ω( n/ log n) Lower Bound Theorem n With poly(n) queries, cannot approximate f better than log n. Even for randomized algs, and even if f is matroid rank function. 50 / 71
Ω( n/ log n) Lower Bound Informal Idea Theorem n With poly(n) queries, cannot approximate f better than log n. Even for randomized algs, and even if f is matroid rank function. 51 / 71
Ω( n/ log n) Lower Bound Informal Idea Theorem n With poly(n) queries, cannot approximate f better than log n. Even for randomized algs, and even if f is matroid rank function. 52 / 71
Lower Bound Proof Algorithm performs queries S 1,..., S k. It distinguishes f from f iff for some i, β + S i R < min{ S i, n}. (1) 53 / 71
Lower Bound Proof Algorithm performs queries S 1,..., S k. It distinguishes f from f iff for some i, β + S i R < min{ S i, n}. (1) Suppose S i n. Then (1) holds iff S i R > β. 54 / 71
Lower Bound Proof Algorithm performs queries S 1,..., S k. It distinguishes f from f iff for some i, β + S i R < min{ S i, n}. (1) Suppose S i n. Then (1) holds iff S i R > β. Pick R at random, each element w.p. 1/ n. E [ S i R ] = S i / n 1 Chernoff bound = Pr [ S i R > β ] e β/2 = n Θ(1) Union bound implies that no query distinguishes f from f. 55 / 71
Summary Problem Given a monotone, submodular f, construct using poly(n) oracle queries a function ˆf such that: ˆf (S) f (S) α(n) ˆf (S) S [n] Our Positive Result A deterministic algorithm that constructs ˆf (S) = i S c i with α(n) = n + 1 for matroid rank functions f, or α(n) = O( n log n) for general monotone submodular f Our Negative Result With polynomially many oracle calls, α(n) = Ω( n/ log n) (even for randomized algs) 56 / 71
Extensions Existential Approximations Consider the statement: for a non-negative function f : 2 [n] R with f ( ) = 0, there exists a simple function ˆf with ˆf (S) f (S) n ˆf (S). True for any monotone, submodular function. [This talk] 57 / 71
Extensions Existential Approximations Consider the statement: for a non-negative function f : 2 [n] R with f ( ) = 0, there exists a simple function ˆf with ˆf (S) f (S) n ˆf (S). True for any monotone, submodular function. [This talk] True for any fractionally subadditive (XOS) function. [Balcan et al. 2012], [Badanidiyuru et al. 2012] 58 / 71
Extensions Existential Approximations Consider the statement: for a non-negative function f : 2 [n] R with f ( ) = 0, there exists a simple function ˆf with ˆf (S) f (S) n ˆf (S). True for any monotone, submodular function. [This talk] True for any fractionally subadditive (XOS) function. [Balcan et al. 2012], [Badanidiyuru et al. 2012] True for any (non-monotone) symmetric, submodular function. [Balcan, Harvey, Iwata 2012] 59 / 71
Extensions Existential Approximations Consider the statement: for a non-negative function f : 2 [n] R with f ( ) = 0, there exists a simple function ˆf with ˆf (S) f (S) n ˆf (S). True for any monotone, submodular function. [This talk] True for any fractionally subadditive (XOS) function. [Balcan et al. 2012], [Badanidiyuru et al. 2012] True for any (non-monotone) symmetric, submodular function. [Balcan, Harvey, Iwata 2012] True for any subadditive function, with approximation O( n log n). [Badanidiyuru et al. 2012] But, ˆf cannot be found by polynomially many value queries. 60 / 71
Extensions Existential Approximations Consider the statement: for a non-negative function f : 2 [n] R with f ( ) = 0, there exists a simple function ˆf with ˆf (S) f (S) n ˆf (S). True for any monotone, submodular function. [This talk] True for any fractionally subadditive (XOS) function. [Balcan et al. 2012], [Badanidiyuru et al. 2012] True for any (non-monotone) symmetric, submodular function. [Balcan, Harvey, Iwata 2012] True for any subadditive function, with approximation O( n log n). [Badanidiyuru et al. 2012] But, ˆf cannot be found by polynomially many value queries. Open Question True for any submodular function? 61 / 71
Other Notions of Approximation Suppose f : 2 [n] [0, 1] is submodular. Then f is within l 2 -distance ɛ to a function of O( 1 log 1 ɛ 2 ɛ ) variables (a junta ). [Feldman-Vondrak 13] a polynomial of degree O( 1 ɛ 2 ). [Cheragchi et al. 11] a polynomial of degree O( 1 ɛ 4/5 log 1 ɛ ). Moreover, the 4/5 is optimal! [Feldman-Vondrak 15] 62 / 71