Submodular Functions Properties Algorithms Machine Learning


Submodular Functions: Properties, Algorithms, Machine Learning. Rémi Gilleron, Inria Lille - Nord Europe & LIFL & Univ Lille. Jan. 12, revised Aug. 14.

History and Context. Origin: many set functions in economics have the diminishing-returns property, leading to so-called submodular functions. Optimization problems were first studied in the early 70s [Edmonds 71]. Maximization of submodular functions: a greedy algorithm with performance guarantees [Nemhauser et al 78], with applications to machine learning: information gain, network diffusion, active learning. Exact discrete minimization: a fully combinatorial, strongly polynomial algorithm for exact minimization of submodular functions [Iwata & Orlin, SODA 09]; many particular cases admit faster algorithms. Approximate minimization of submodular functions: related to convex optimization problems via the Lovász extension [Lovász 82, Bach 11]. Many applications to machine learning: clustering, subset selection, sparsity.

Motivation for Magnet. Many functions defined on graphs are submodular: cover functions, cut functions, ... Many machine learning settings lead to optimization of submodular functions; thus, if you can prove that the function to be optimized is submodular, you get algorithms, performance guarantees, and complexity analyses for free. It can nevertheless be the case that more efficient algorithms exist for particular submodular functions. There are strong connections between minimization of submodular functions and convex minimization problems. Also, submodular functions allow one to define structured sparsity-inducing norms.

Plan
1. Examples of Submodular Functions
2. Definitions and Properties of Submodular Functions
3. Maximization of Submodular Functions
4. Minimization of Submodular Functions
5. Conclusion

Economics. A factory has the capability of producing any subset S of a given set E of products. Producing subset S induces a setup cost c(S) to make the factory ready to produce S. Suppose that we have decided to produce S and we consider whether to add a product e to S; then we will have to pay an additional cost (or marginal cost) c(S ∪ {e}) − c(S). Economics suggests that this additional cost is a non-increasing function of S, i.e. adding e to a larger set should produce an additional cost no larger than adding e to a smaller set. The cost function c is submodular, i.e. for all S ⊆ T and all e ∉ T, c(S ∪ {e}) − c(S) ≥ c(T ∪ {e}) − c(T).

Set Cover. We have to place sensors in a room with a set E of possible locations. We suppose that each sensor covers a disc of fixed radius. Choosing a subset S of possible locations induces an area f(S) covered by sensors placed at S. Suppose that we have chosen S and T with S ⊆ T, and consider a new location s ∉ T. Then it is easy to show that the covering function f is submodular: for all S ⊆ T and all s ∉ T, f(S ∪ {s}) − f(S) ≥ f(T ∪ {s}) − f(T).
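A quick way to convince oneself numerically: the minimal sketch below (not from the slides) discretizes the room as a grid and spot-checks the diminishing-returns inequality; the grid size, radius, and candidate locations are illustrative assumptions.

```python
# Minimal sketch: a discretized coverage function and a spot check of
# diminishing returns. Locations and radius are illustrative assumptions.
from itertools import product

points = [(x, y) for x, y in product(range(10), range(10))]   # room discretized as a grid
locations = {0: (2, 2), 1: (7, 3), 2: (4, 8), 3: (5, 5)}      # candidate sensor positions
RADIUS = 3.0

def covered(S):
    """Number of grid points covered by sensors placed at the locations in S."""
    return sum(
        any((px - locations[s][0]) ** 2 + (py - locations[s][1]) ** 2 <= RADIUS ** 2
            for s in S)
        for px, py in points
    )

S, T, e = {0}, {0, 1, 2}, 3                      # S ⊆ T, e ∉ T
gain_S = covered(S | {e}) - covered(S)
gain_T = covered(T | {e}) - covered(T)
assert gain_S >= gain_T                          # diminishing returns
print(gain_S, gain_T)
```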

Cut function for undirected graphs. Let G = (V, E) be an undirected graph, let S be a subset of V, and let δ(S) = {(i, j) ∈ E : i ∈ S, j ∉ S}; the cut capacity function is defined by cut(S) = #δ(S) (or cut(S) = w(δ(S)) for weighted graphs). We have cut(S ∪ {v}) − cut(S) = #{(v, j) ∈ E : j ∉ S} − #{(v, j) ∈ E : j ∈ S} and cut(T ∪ {v}) − cut(T) = #{(v, j) ∈ E : j ∉ T} − #{(v, j) ∈ E : j ∈ T}. Since, for S ⊆ T, #{(v, j) ∈ E : j ∉ T} ≤ #{(v, j) ∈ E : j ∉ S} and #{(v, j) ∈ E : j ∈ T} ≥ #{(v, j) ∈ E : j ∈ S}, we get, for all S ⊆ T and all v ∉ T, cut(S ∪ {v}) − cut(S) ≥ cut(T ∪ {v}) − cut(T). Thus the cut capacity function is submodular. Moreover, the cut capacity function is symmetric, i.e. cut(S) = cut(V \ S) for every S, and hence non-monotone.
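The same properties can be verified exhaustively on a small instance. The sketch below (illustrative graph, not from the slides) checks the submodular inequality f(X) + f(Y) ≥ f(X ∪ Y) + f(X ∩ Y) and symmetry for the cut function of a 4-node graph.

```python
# Minimal sketch: the cut function of a small undirected graph, checked against
# the definition f(X) + f(Y) >= f(X ∪ Y) + f(X ∩ Y) and against symmetry.
from itertools import combinations

V = range(4)
E = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]

def cut(S):
    S = set(S)
    return sum(1 for (i, j) in E if (i in S) != (j in S))

subsets = [frozenset(c) for r in range(len(V) + 1) for c in combinations(V, r)]
assert all(cut(X) + cut(Y) >= cut(X | Y) + cut(X & Y) for X in subsets for Y in subsets)
assert all(cut(S) == cut(set(V) - S) for S in subsets)   # symmetry
print("cut is submodular and symmetric on this graph")
```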

Submodular functions for graphs. Let G = (V, E) be an undirected graph: cut functions are submodular (previous slide). Let A be a subset of E; the rank r(A) of A is the maximal size of a subset F of A such that (V, F) has no cycle. The rank function r is submodular. Let A be a subset of E; the function nc, which gives the number of connected components of the subgraph induced by A, is supermodular (its opposite is submodular) because nc(A) = |V| − r(A). Let G = (V, E) be a weighted directed graph and let S be a subset of V; define the cut function cut⁺(S) = Σ_{(i,j) ∈ E, i ∈ S, j ∉ S} w_(i,j). The cut function cut⁺ is submodular. The same holds for cut⁻ (ingoing edges), cut⁺⁻ (outgoing and ingoing edges), and also for s,t-cut functions (source and sink in different subsets), i.e. cut functions are submodular. Note: also, many functions in matroids (e.g. matroid rank functions) are submodular 8-)

Definitions. We consider a set E and we set E = {1, ..., p}. We consider real-valued set functions f : 2^E → R with f(∅) = 0. A set function f is submodular if it satisfies the following equivalent definitions.

Definition with the Diminishing-Returns Property (first-order differences):
(1) for all S ⊆ T and all e ∉ T, f(S ∪ {e}) − f(S) ≥ f(T ∪ {e}) − f(T).

Alternative Definition:
(2) for all X, Y ⊆ E, f(X) + f(Y) ≥ f(X ∪ Y) + f(X ∩ Y).

(2) ⇒ (1): let S ⊆ T and e ∉ T; take X = S ∪ {e} and Y = T. Then f(S ∪ {e}) + f(T) ≥ f(S ∪ {e} ∪ T) + f((S ∪ {e}) ∩ T), thus f(S ∪ {e}) + f(T) ≥ f(T ∪ {e}) + f(S), and then f(S ∪ {e}) − f(S) ≥ f(T ∪ {e}) − f(T).

(1) ⇒ (2): (1) can be rewritten as: for S ⊆ T and e ∉ T, f(S) − f(T) ≤ f(S ∪ {e}) − f(T ∪ {e}). Enumerate Y \ X as e_1, e_2, ..., e_k; then
f(X ∩ Y) − f(X) ≤ f((X ∩ Y) ∪ {e_1}) − f(X ∪ {e_1})
≤ f((X ∩ Y) ∪ {e_1, e_2}) − f(X ∪ {e_1, e_2})
≤ ...
≤ f((X ∩ Y) ∪ {e_1, ..., e_k}) − f(X ∪ {e_1, ..., e_k}) = f(Y) − f(X ∪ Y),
which gives f(X) + f(Y) ≥ f(X ∪ Y) + f(X ∩ Y).
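Both definitions are easy to check exhaustively on a small ground set. The sketch below (illustrative, not from the slides) brute-forces both conditions for a coverage-style function on a three-element ground set; the instance is an assumption made for the example.

```python
# Minimal sketch: brute-force checks that definitions (1) and (2) agree for a
# given set function on a small ground set (illustrated with a coverage function).
from itertools import combinations

def powerset(E):
    return [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]

def diminishing_returns(f, E):
    """(1): f(S ∪ {e}) - f(S) >= f(T ∪ {e}) - f(T) for all S ⊆ T, e ∉ T."""
    for T in powerset(E):
        for S in powerset(T):
            for e in set(E) - T:
                if f(S | {e}) - f(S) < f(T | {e}) - f(T):
                    return False
    return True

def submodular_inequality(f, E):
    """(2): f(X) + f(Y) >= f(X ∪ Y) + f(X ∩ Y) for all X, Y ⊆ E."""
    P = powerset(E)
    return all(f(X) + f(Y) >= f(X | Y) + f(X & Y) for X in P for Y in P)

E = {1, 2, 3}
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}        # illustrative coverage instance
f = lambda S: len(set().union(*(sets[i] for i in S)))
assert diminishing_returns(f, E) == submodular_inequality(f, E) == True
```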

The case of cardinality functions: f(S) as a function of |S|. Example: let E = {1, 2}, A ⊆ E, x = |A|, f(A) = c(x), and

  c(x)                                   ∅    {1}   {2}   {1,2}   Submodular?
  c(x) = x                               0     1     1      2     modular
  c(x) = x²                              0     1     1      4     No (f({2}) − f(∅) = 1 < f({1,2}) − f({1}) = 3)
  c(x) = −x²                             0    −1    −1     −4     Yes
  c(x) = √x                              0     1     1     √2     Yes
  c(x) = Gini(x/|E|) = 4(x/2)(1 − x/2)   0     1     1      0     Yes

where submodular means: for all S ⊆ T and all e ∉ T, f(S ∪ {e}) − f(S) ≥ f(T ∪ {e}) − f(T).

Theorem. Let a set function f be defined as a cardinality function: f(A) = c(|A|) for a scalar function c. Then f is submodular if and only if c is concave.

Note: submodularity says that the (discrete) first derivative is non-increasing, i.e. the function behaves like a concave function.
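The table can be reproduced mechanically. The sketch below (illustrative, not from the slides) checks submodularity of cardinality functions f(A) = c(|A|) on E = {1, 2}: concave choices of c pass, the convex choice x² fails.

```python
# Minimal sketch: cardinality functions on E = {1, 2}, matching the table above.
from itertools import combinations
from math import sqrt

E = {1, 2}
subsets = [frozenset(c) for r in range(3) for c in combinations(E, r)]

def is_submodular(f):
    return all(f(X) + f(Y) >= f(X | Y) + f(X & Y) for X in subsets for Y in subsets)

card = lambda c: (lambda A: c(len(A)))                           # f(A) = c(|A|)
print(is_submodular(card(lambda x: x)))                          # True  (modular)
print(is_submodular(card(lambda x: x ** 2)))                     # False (convex)
print(is_submodular(card(lambda x: -x ** 2)))                    # True  (concave)
print(is_submodular(card(sqrt)))                                 # True  (concave)
print(is_submodular(card(lambda x: 4 * (x / 2) * (1 - x / 2))))  # True  (Gini)
```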

Closure properties. Example: let E = {1, 2},

  function         ∅    {1}   {2}   {1,2}   Submodular?
  f                0     1     0      1     Yes
  g                0     0     1      1     Yes
  h = 3f           0     3     0      3     Yes
  i = 3f + 2g      0     3     2      5     Yes
  j = min(f, g)    0     0     0      1     No
  k = max(f, g)    0     1     1      1     Yes, but not in general (for a counterexample, use the concavity characterization of cardinality functions)

Submodular functions are closed under positive linear combinations. They are not closed under min and max.
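The table above can be verified directly; the sketch below (illustrative, not from the slides) encodes f and g by their value tables and re-checks each row.

```python
# Minimal sketch verifying the closure table: positive combinations of the two
# submodular functions stay submodular, min does not, and max happens to here.
from itertools import combinations

E = {1, 2}
subsets = [frozenset(c) for r in range(3) for c in combinations(E, r)]
is_sub = lambda f: all(f(X) + f(Y) >= f(X | Y) + f(X & Y) for X in subsets for Y in subsets)

f = lambda A: {frozenset(): 0, frozenset({1}): 1, frozenset({2}): 0, frozenset({1, 2}): 1}[A]
g = lambda A: {frozenset(): 0, frozenset({1}): 0, frozenset({2}): 1, frozenset({1, 2}): 1}[A]

print(is_sub(f), is_sub(g))                      # True True
print(is_sub(lambda A: 3 * f(A) + 2 * g(A)))     # True  (positive combination)
print(is_sub(lambda A: min(f(A), g(A))))         # False
print(is_sub(lambda A: max(f(A), g(A))))         # True here, but not in general
```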

So far. Many examples of submodular functions. Cardinality functions are submodular exactly when the underlying scalar function is concave. Submodular functions are closed under positive linear combinations, not under min and max. If f is submodular, then −f is supermodular, and its dual d is supermodular, where d is defined by d(A) = f(E) − f(E \ A). Let us now consider optimization of submodular functions. Note that maximizing a submodular function is equivalent to minimizing its dual.

Exact Submodular Function Maximization. Given a submodular function f, the problem is to find S* = argmax_{S ⊆ E} f(S). Note that we suppose we are given a value oracle for f. The problem can also be restricted to a family F of feasible sets, in which case we suppose we are given a membership oracle for F. The max-cut problem for graphs is a particular case. Recall that the decision problem "given a graph G and an integer k, is there a cut of size at least k in G?" is NP-complete. In general, (exact) submodular function maximization is NP-hard.

Approximate Submodular Function Maximization. Find S* = argmax_{S ⊆ E} f(S), where f is submodular. Let us denote opt = max_{S ⊆ E} f(S). When the function is non-negative, several approximation algorithms with theoretical guarantees exist.

Random: return a uniformly random subset R of E. The expected value of f(R) is shown to be at least (1/4)·opt.

Local Search:
1. S ← {e}, where f({e}) is maximal over singleton sets.
2. If there is e ∈ E \ S with f(S ∪ {e}) ≥ (1 + ε/n²)·f(S), then S ← S ∪ {e}; go to 2.
3. If there is e ∈ S with f(S \ {e}) ≥ (1 + ε/n²)·f(S), then S ← S \ {e}; go to 2.
4. Return the better of S and E \ S (with respect to f).
The returned value is shown to be at least (1/3 − ε/n)·opt. The output S is a local optimum of the local search; it satisfies: if I ⊆ S or S ⊆ I, then f(I) ≤ f(S).
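A minimal sketch of the local-search routine described above (with a strict improvement test so the loop is guaranteed to terminate); f and E are supplied by the caller, and the step threshold follows the (1 + ε/n²) rule from the slide.

```python
# Minimal sketch of local search for non-negative submodular maximization.
def local_search(f, E, eps=0.01):
    E = set(E)
    n = max(len(E), 1)
    S = {max(E, key=lambda e: f({e}))}               # start with the best singleton
    improved = True
    while improved:
        improved = False
        for e in E - S:                              # try to add an element
            if f(S | {e}) > (1 + eps / n ** 2) * f(S):
                S = S | {e}; improved = True; break
        if improved:
            continue
        for e in set(S):                             # try to drop an element
            if f(S - {e}) > (1 + eps / n ** 2) * f(S):
                S = S - {e}; improved = True; break
    return max((S, E - S), key=lambda X: f(X))       # value >= (1/3 - eps/n) * opt
```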

Greedy Submodular Function Maximization. Find S* = argmax_{S ⊆ E} f(S), where f is submodular and non-decreasing (i.e. f(S ∪ {e}) − f(S) ≥ 0). Let us denote opt = max_{S ⊆ E} f(S) and opt_k = max_{S ⊆ E, |S| ≤ k} f(S).

Greedy Algorithm for Non-Decreasing Submodular Functions:
1. S ← ∅.
2. Repeat until |S| = k:
3.   select e ∈ E \ S such that f(S ∪ {e}) − f(S) is maximal, and set S ← S ∪ {e}.

Theorem (Performance Guarantee) [Nemhauser et al 78]. Let f be a non-decreasing submodular function. Greedy outputs a set S such that f(S) ≥ [1 − (1 − 1/k)^k]·opt_k ≥ (1 − 1/e)·opt_k.

The (1 − 1/e)-approximation is optimal: for any ε > 0, it is NP-hard to achieve a (1 − 1/e + ε)-approximation; i.e., Greedy is essentially the best we can do.
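A minimal sketch of the greedy rule above for a non-decreasing submodular f under the cardinality constraint |S| ≤ k; the coverage instance at the end is an illustrative assumption.

```python
# Minimal sketch: greedy maximization of a non-decreasing submodular function.
def greedy_max(f, E, k):
    S = set()
    for _ in range(k):
        e = max(set(E) - S, key=lambda x: f(S | {x}) - f(S))   # largest marginal gain
        S.add(e)
    return S                                                    # f(S) >= (1 - 1/e) * opt_k

# Example with a coverage function (illustrative data):
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"d", "e"}}
cover = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
print(greedy_max(cover, sets.keys(), 2))                        # e.g. {1, 4}
```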

Spread of Influence in Social Networks. Independent Cascade Model: start with a seed set S of active nodes. The process unfolds in discrete steps: when u becomes active at step t, it is given a single chance to activate each of its inactive neighbours v, succeeding with probability p_uv.

Spread of Influence in Social Networks. Influence Maximization Problem: given k, find an initial seed set S of k active nodes of maximum influence f(S), where f(S) is the expected number of nodes that are active at the end of the process.

Spread of Influence in Social Networks. Theorem [Kempe, Kleinberg & Tardos 03]: the influence function is submodular. Hint: one can flip the coins in advance to obtain a set of live edges; then f(S) can be computed from the paths of live edges starting in S. As f is non-negative and non-decreasing, the Greedy algorithm solves the Influence Maximization Problem (find a seed set S) within 1 − 1/e of the optimal influence.
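A minimal sketch of this pipeline (illustrative, not the paper's code): a Monte-Carlo estimate of the independent-cascade influence f(S), using the "flip coins in advance" view, plugged into the greedy routine; the toy graph and edge probabilities are assumptions.

```python
# Minimal sketch: Monte-Carlo influence estimation + greedy seed selection.
import random

edges = {(0, 1): 0.4, (1, 2): 0.3, (0, 3): 0.5, (3, 4): 0.6, (2, 4): 0.2}

def influence(S, trials=500):
    total = 0
    for _ in range(trials):
        live = {e for e, p in edges.items() if random.random() < p}   # flip coins in advance
        active, frontier = set(S), set(S)
        while frontier:                                               # follow live edges
            frontier = {v for (u, v) in live if u in frontier and v not in active}
            active |= frontier
        total += len(active)
    return total / trials

def greedy_seed(k):
    S = set()
    nodes = {u for e in edges for u in e}
    for _ in range(k):
        S.add(max(nodes - S, key=lambda v: influence(S | {v})))       # best marginal gain
    return S

print(greedy_seed(2))                                                 # a seed set of size 2
```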

So far. Examples and properties of submodular functions. Maximization of submodular functions is an NP-hard problem; there are approximate maximization algorithms with performance guarantees. Applications to influence in networks, document summarization, variable selection using information gain, active learning, ... Not presented here: randomized maximization algorithms with performance guarantees, and approximate maximization algorithms under constraints more general than |S| ≤ k, for instance when the family of feasible sets has additional structure (such as a matroid). Let us now consider minimization of submodular functions.

Minimization of Submodular Functions. While (exact) maximization is NP-hard, there is an algorithm that computes the minimum of any submodular function f : {0, 1}^n → R in poly(n) time [Grötschel, Lovász & Schrijver 88]. The combinatorial algorithms are sophisticated: a fully combinatorial, strongly polynomial algorithm for (exactly) minimizing submodular functions by [Iwata & Orlin, SODA 09] in time O(n^8 log n); an O(n^3) algorithm for symmetric submodular functions (f(A) = f(E \ A)); and many other particular cases. Thus, (approximate) minimization algorithms for submodular functions have been designed using continuous optimization. Most of them are based on the convexity of the Lovász extension of any submodular function. Many applications to machine learning: clustering (with entropy or cut functions), MAP inference in Markov random fields, speaker segmentation, image denoising, among others.

Lovász Extension. Let f be a set function; equivalently, let f be a function from {0, 1}^n into R. The Lovász extension g of f is a function from [0, 1]^n into R defined as follows: for x ∈ [0, 1]^n, g(x) = Σ_{i=0}^{n} λ_i f(S_i), where ∅ = S_0 ⊆ S_1 ⊆ ... ⊆ S_n is a chain such that Σ_{i=0}^{n} λ_i 1_{S_i} = x, Σ_{i=0}^{n} λ_i = 1, and λ_i ≥ 0.

Intuition: an input of g is a point of the n-dimensional cube; an input of f is one of the 2^n corners of this cube. The cube can be divided into n! simplices, where a simplex is the set of points whose coordinates are ordered according to a permutation of {1, ..., n}. For instance, x = (0.1, 0.8, 0.3) lies in the simplex {x ∈ R³ : x_2 ≥ x_3 ≥ x_1}, whose corners are (0, 0, 0), (0, 1, 0), (0, 1, 1) and (1, 1, 1). Each point x lies in such a simplex and is a weighted average of its corners, so we define g(x) as the same weighted average of the f-values at the corners:
g(0.1, 0.8, 0.3) = 0.2·f(0, 0, 0) + 0.5·f(0, 1, 0) + 0.2·f(0, 1, 1) + 0.1·f(1, 1, 1).
g can easily be computed from the f-values at the corners.
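A minimal sketch of that computation: sort the coordinates, build the chain of level sets, and take the weighted average of the f-values; it reproduces the worked example for x = (0.1, 0.8, 0.3). The cut function used at the end is an illustrative choice.

```python
# Minimal sketch: computing the Lovász extension by sorting the coordinates.
def lovasz_extension(f, x):
    n = len(x)
    order = sorted(range(n), key=lambda i: -x[i])        # coordinates in decreasing order
    chain = [frozenset()]
    for i in order:                                      # ∅ = S_0 ⊆ S_1 ⊆ ... ⊆ S_n, ground set {1..n}
        chain.append(chain[-1] | {i + 1})
    levels = [1.0] + [x[i] for i in order] + [0.0]
    lambdas = [levels[k] - levels[k + 1] for k in range(n + 1)]   # non-negative, sum to 1
    return sum(l * f(S) for l, S in zip(lambdas, chain))

# Example: cut function of the path 1 - 2 - 3.
cut = lambda S: sum(1 for (i, j) in [(1, 2), (2, 3)] if (i in S) != (j in S))
print(lovasz_extension(cut, [0.1, 0.8, 0.3]))            # 0.2*0 + 0.5*2 + 0.2*1 + 0.1*0 = 1.2
```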

Submodularity and Convexity. Let E = {1, ..., n}. For A ⊆ E, let 1_A be the indicator vector of the set A. The base results [Edmonds 71, Lovász 83] are:
Lovász extension: every submodular function f induces its Lovász extension g on [0, 1]^n, such that g(1_A) = f(A) for every A ⊆ E.
Submodularity and convexity: f is submodular if and only if its Lovász extension g is convex.
Minimization: let f be a submodular function and let g be its Lovász extension; then min_{x ∈ [0,1]^n} g(x) = min_{x ∈ {0,1}^n} g(x) = min_{A ⊆ E} f(A), where x = 1_A.
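To illustrate the minimization result, here is a minimal sketch (not one of the combinatorial algorithms cited above): projected subgradient descent on the Lovász extension over [0,1]^n, using the fact that Edmonds' greedy ordering yields a subgradient, followed by rounding to the best level set. The step size, iteration count, and the modular test function are arbitrary assumptions.

```python
# Minimal sketch: approximate submodular minimization via the Lovász extension.
def minimize_submodular(f, n, iters=200):
    x = [0.5] * n
    best_A, best_val = frozenset(), f(frozenset())
    for t in range(1, iters + 1):
        order = sorted(range(n), key=lambda i: -x[i])
        S, prev, w = set(), f(frozenset()), [0.0] * n
        for i in order:                          # Edmonds' greedy gives a subgradient w of g at x
            S.add(i + 1)
            w[i] = f(frozenset(S)) - prev
            prev += w[i]
        x = [min(1.0, max(0.0, xi - wi / t ** 0.5)) for xi, wi in zip(x, w)]  # projected step
        for theta in set(x):                     # some level set A of x satisfies f(A) <= g(x)
            A = frozenset(i + 1 for i in range(n) if x[i] >= theta)
            if f(A) < best_val:
                best_A, best_val = A, f(A)
    return best_A, best_val

f = lambda A: len(A ^ frozenset({1, 3}))         # modular test function, minimized at {1, 3}
print(minimize_submodular(f, 4))                 # (frozenset({1, 3}), 0)
```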

Structured Sparsity and Submodular Functions. Let us consider estimators obtained by regularized empirical risk minimization: argmin_{w ∈ R^p} (1/n) Σ_{i=1}^{n} ℓ(y_i, w^T x_i) + λ Ω(w), where ℓ is a loss function and Ω is a norm.
Sparsity: the ℓ1-norm is frequently used to promote sparsity.
Structured sparsity goes further with the notion of sparsity patterns, for instance grouped ℓ1-norms with overlapping groups.
Using submodular functions: an alternative is to consider penalty functions of the form w ↦ f(Supp(w)), where Supp(w) = {i ∈ E : w_i ≠ 0} is the support of w. Then, if f is submodular, non-decreasing, and the values of all singletons are strictly positive, the function Ω : w ↦ Ω(w) = g(|w|), with g the Lovász extension of f, is a norm, and it is the convex envelope of w ↦ f(Supp(w)) on the unit ℓ∞-ball. This yields new structured sparsity-inducing norms together with minimization algorithms.
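A minimal sketch of the construction Ω(w) = g(|w|), using the lovasz_extension helper sketched earlier; taking f(A) = |A| recovers the ℓ1-norm (the vector w is an illustrative assumption).

```python
# Minimal sketch: the norm Ω(w) = g(|w|) induced by a non-decreasing submodular f.
card = lambda A: len(A)                          # f(A) = |A| gives the l1 norm
w = [0.3, -0.7, 0.0, 0.2]
omega = lovasz_extension(card, [abs(wi) for wi in w])
print(omega, sum(abs(wi) for wi in w))           # both ≈ 1.2
```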

Conclusion. Useful knowledge for machine learners because: submodular functions appear naturally in many ML settings, and also in NLP settings 8-); they come with well-studied maximization and minimization algorithms, often with complexity analyses and performance guarantees; and they allow one to define new sparsity-inducing norms. Convex or concave? More: constrained optimization algorithms, e.g. under cardinality constraints (|S| ≤ k); optimization algorithms for subclasses of submodular functions; matroids 8-); ...