Discrete Optimization in Machine Learning. Colorado Reed


1 Discrete Optimization in Machine Learning Colorado Reed [ML-RCC] 31 Jan

2 Acknowledgements Some slides/animations are based on tutorials by Krause et al., a tutorial by Pushmeet Kohli, and Jeff Bilmes's class; please consult the original author before reusing any slide marked AK, PK, or JB.

3 What is Discrete Optimization? An optimization problem with a finite or countably infinite set of solutions. Common in ML: image segmentation, games. Classic examples: TSP, Set Cover, Network Flows, Vertex Coloring, Knapsack.

4 Discrete Optimization in ML Given RVs Y, X_1, ..., X_n, predict Y from the k most informative features X_{i_1}, ..., X_{i_k} (e.g., Y = Sick; X_1 = Fever, X_2 = Age, X_3 = Biopsy Result): A* := argmax_A I(X_A; Y) s.t. |A| ≤ k, where I(X_A; Y) = H(Y) - H(Y | X_A). [AK]

5 Discrete Optimization in ML Given RVs X_1, ..., X_n with index set V, partition V into sets that are as independent as possible (e.g., A = {X_1, X_3, X_4} and V \ A = {X_2, X_5, X_6}). Formally: A* := argmin_A I(X_A; X_{V\A}) s.t. |A| ≤ n, where I(X_A; X_{V\A}) = H(X_{V\A}) - H(X_{V\A} | X_A).

6 This Tutorial Given a finite set V and a set function F: 2^V → R, solve A* = argopt_A F(A) s.t. A satisfies constraints. Methods not covered here: relaxations, mixed integer programming, POMDPs, heuristics. Covered here: submodularity, which gives scalable algorithms, occurs naturally, has rising interest in ML, and comes with performance guarantees! Goal: foster submodular intuition.

7 Submodular set functions A set function F: 2^V → R on a finite set V = {1, 2, ..., n} is called submodular if for all A, B ⊆ V: F(A) + F(B) ≥ F(A ∪ B) + F(A ∩ B). [AK] Equivalent diminishing-returns characterization: adding an element s to a small set A yields a large improvement, while adding it to a large set B ⊇ A yields only a small improvement. For A ⊆ B, s ∉ B: F(A ∪ {s}) - F(A) ≥ F(B ∪ {s}) - F(B).
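On small ground sets the definition can be checked by brute force; here is a minimal Python sketch (the toy coverage function "areas" is a made-up illustration, not from the slides):

from itertools import chain, combinations

def subsets(V):
    """All subsets of the ground set V, as tuples."""
    return chain.from_iterable(combinations(sorted(V), r) for r in range(len(V) + 1))

def is_submodular(F, V):
    """Check F(A) + F(B) >= F(A | B) + F(A & B) for all A, B in 2^V."""
    for A in subsets(V):
        for B in subsets(V):
            A_, B_ = set(A), set(B)
            if F(A_) + F(B_) < F(A_ | B_) + F(A_ & B_) - 1e-9:
                return False
    return True

# Toy coverage function: F(A) = size of the union of areas covered by A
areas = {1: {"x", "y"}, 2: {"y", "z"}, 3: {"z"}}
F = lambda A: len(set().union(*(areas[i] for i in A)))
print(is_submodular(F, {1, 2, 3}))  # True: coverage functions are submodular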

8 Intuitive Example: Shopping F(A) = amount of time spent shopping for the items in A. The marginal time for one more item s is smaller when it is added to a large shopping list B than to a small list A ⊆ B: for A ⊆ B, s ∉ B, F(A ∪ {s}) - F(A) ≥ F(B ∪ {s}) - F(B).

9 Example: Entropy Given random variables X_1, ..., X_N and index set V, F(A) = H(X_A) = -Σ_{x_A} p(x_A) log p(x_A) is submodular. Proof: let A ⊆ B ⊆ V and s ∈ V \ B; we need to show H(X_{A∪{s}}) - H(X_A) ≥ H(X_{B∪{s}}) - H(X_B), i.e., H(X_s | X_A) ≥ H(X_s | X_B): "information never hurts". [JB]

10 Example: Entropy F(A) = H(X_A) as above. Proof #2 (mutual information): for A, B ⊆ V, I(X_A; X_B | X_{A∩B}) ≥ 0 expands to H(X_A) + H(X_B) - H(X_{A∪B}) - H(X_{A∩B}) ≥ 0, i.e., H(X_A) + H(X_B) ≥ H(X_{A∪B}) + H(X_{A∩B}).
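The brute-force check above confirms this numerically; a small sketch (assuming numpy) that draws a random joint distribution over three binary variables and tests the submodularity inequality for joint entropy:

import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
p = rng.random((2, 2, 2))
p /= p.sum()  # random joint distribution over (X_0, X_1, X_2)

def H(A):
    """Joint entropy H(X_A): sum out the variables not in A."""
    if not A:
        return 0.0
    marg = p.sum(axis=tuple(i for i in range(3) if i not in A)).ravel()
    marg = marg[marg > 0]
    return float(-(marg * np.log2(marg)).sum())

V = {0, 1, 2}
subs = [set(c) for r in range(4) for c in combinations(sorted(V), r)]
print(all(H(A) + H(B) >= H(A | B) + H(A & B) - 1e-9
          for A in subs for B in subs))  # True: entropy is submodular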

11 Example: Mutual information Given random variables X_1, ..., X_N and index set V, F(A) = I(X_A; X_{V\A}) = H(X_{V\A}) - H(X_{V\A} | X_A). The marginal gain is F(A ∪ {s}) - F(A) = H(X_s | X_A) - H(X_s | X_{V\(A∪{s})}); the first term is nonincreasing in A and the second is nondecreasing in A, so F(A ∪ {s}) - F(A) is monotonically nonincreasing, hence F is submodular.

12 Quick Definition: [Super]modular A set function F: 2^V → R on V = {1, 2, ..., n} is supermodular iff for all A, B ⊆ V: F(A) + F(B) ≤ F(A ∪ B) + F(A ∩ B): "increasing returns" or synergy; equivalently, -F is submodular. Dr. Steven Pinker (Harvard psychologist), answering the NYT question "What scientific concept would improve everybody's cognitive toolkit?": "Emergent systems are ones in which many different elements interact. The pattern of interaction then produces a new element that is greater than the sum of the parts, which then exercises a top-down influence on the constituent elements." [JB] F is modular if it is both submodular and supermodular.

13 Quiz V = {1, ..., n}, A ⊆ V. Define the characteristic vector w^A = (w^A_1, ..., w^A_n), where w^A_i = 1 if i ∈ A and 0 otherwise. Let F(A) = w^A r^T = Σ_{i∈A} r_i, where r_i ∈ R and r ∈ R^n. Prove F(A) is submodular. (Submodular definition: for A ⊆ B, s ∉ B, F(A ∪ {s}) - F(A) ≥ F(B ∪ {s}) - F(B).)

14 Closedness properties Let F_1, ..., F_m be submodular functions on V and λ_1, ..., λ_m > 0. Then F(A) = Σ_i λ_i F_i(A) is submodular: submodularity is closed under nonnegative linear combinations. Very useful: if F_θ(A) is submodular for every θ, then Σ_θ P(θ) F_θ(A) is submodular.

15 Submodularity and convexity V = {1, ..., n}, A ⊆ V; define the characteristic vector w^A = (w^A_1, ..., w^A_n), where w^A_i = 1 if i ∈ A and 0 otherwise. Very important theorem [Lovász 1983]: every submodular function F(·) induces a function g(·) on R^n such that g(w) is convex, F(A) = g(w^A) for all A ⊆ V, and min_A F(A) = min_w g(w) s.t. w ∈ [0, 1]^n. But what is g(w)?

16 Submodular Polytope P_F = {x ∈ R^n : x(A) ≤ F(A) for all A ⊆ V}, where x(A) = Σ_{i∈A} x_i. Example with V = {a, b} and F(∅) = 0, F({a}) = -1, F({b}) = 2, F({a,b}) = 0: P_F is the region of (x_a, x_b) satisfying x({a}) ≤ F({a}), x({b}) ≤ F({b}), and x({a,b}) ≤ F({a,b}). [AK]

17 Lovász extension [Lovász 1983] g(w) = max_{x∈P_F} w^T x, where P_F = {x ∈ R^n : x(A) ≤ F(A) for all A ⊆ V}. Writing x_w = argmax_{x∈P_F} w^T x, we have g(w) = w^T x_w. Quiz: we have defined g(w), but evaluating g(w) involves solving an LP with exponentially many constraints; why exponentially many? [AK, JB]

18 Evaluating the Lovász extension g(w) = max_{x∈P_F} w^T x, P_F = {x ∈ R^n : x(A) ≤ F(A) for all A ⊆ V}. Very important theorem [Edmonds 1971, Lovász 1983]: for any given w, an optimal solution x_w of the LP is obtained by the following greedy algorithm: order V = {e_1, ..., e_n} so that w(e_1) ≥ ... ≥ w(e_n), and let x_w(e_i) = F({e_1, ..., e_i}) - F({e_1, ..., e_{i-1}}). Then w^T x_w = g(w) = max_{x∈P_F} w^T x.
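A minimal Python sketch of this greedy evaluation (the dictionary-backed F is just for illustration):

def lovasz_extension(F, V, w):
    """Evaluate g(w) via Edmonds' greedy algorithm.
    F: set function; V: elements; w: dict element -> weight."""
    order = sorted(V, key=lambda e: -w[e])  # w(e_1) >= ... >= w(e_n)
    g, prefix = 0.0, set()
    for e in order:
        gain = F(prefix | {e}) - F(prefix)  # x_w(e_i)
        g += w[e] * gain
        prefix.add(e)
    return g

# The running example: F({}) = 0, F({a}) = -1, F({b}) = 2, F({a,b}) = 0
table = {frozenset(): 0, frozenset("a"): -1, frozenset("b"): 2, frozenset("ab"): 0}
F = lambda A: table[frozenset(A)]
print(lovasz_extension(F, "ab", {"a": 0, "b": 1}))  # 2.0 = F({b})
print(lovasz_extension(F, "ab", {"a": 1, "b": 1}))  # 0.0 = F({a,b})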

19 Example: Lovász extension Using the table F(∅) = 0, F({a}) = -1, F({b}) = 2, F({a,b}) = 0, take w = [0, 1] and compute g(w) = max_{x∈P_F} w^T x. The greedy ordering is e_1 = b, e_2 = a: x_w(e_1) = F({b}) - F(∅) = 2 and x_w(e_2) = F({b, a}) - F({b}) = -2, so g([0, 1]) = [0, 1]^T [-2, 2] = 2 = F({b}). Likewise, with the ordering e_1 = a, e_2 = b, g([1, 1]) = [1, 1]^T [-1, 1] = 0 = F({a, b}). [AK]

20 Lovász Extension: useful? Theorem [Lovász 1983]: g(w) attains its minimum over [0, 1]^n at a corner of the cube. Translation: the corners correspond to the characteristic vectors, so minimizing g(w) minimizes F(A)! How to minimize?

21 [figure-only slide] Slide credit: Satoru Iwata

22 Submodular Minimization in Practice Base polytope: B_F = P_F ∩ {x : x(V) = F(V)}. Example (same F as above: F(∅) = 0, F({a}) = -1, F({b}) = 2, F({a,b}) = 0). Minimum norm algorithm: 1. Find x* = argmin ||x||_2 s.t. x ∈ B_F; here x* = [-1, 1]. 2. Return A* = {i : x*(i) < 0}; here A* = {a}. Theorem [Fujishige 1991]: A* is an optimal solution. Note: solve step 1 via Wolfe's algorithm; the runtime is finite but unknown. [AK]
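For a ground set this small, the result can be confirmed by brute force; a quick sketch over the same table:

from itertools import combinations

table = {frozenset(): 0, frozenset("a"): -1, frozenset("b"): 2, frozenset("ab"): 0}
best = min((frozenset(c) for r in range(3) for c in combinations("ab", r)),
           key=lambda A: table[A])
print(set(best), table[best])  # {'a'} -1, matching the min-norm output A* = {a}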

23 [figure-only slide]

24 Empirical Justification [Fujishige et al 2006] [plot: running time (seconds) of the min-norm algorithm vs. problem size] DEMO [AK]

25 MNP: Super-fast! [plot: running-time comparison; blue: [Fujishige 2006], red: [Garcia 2012]]

26 Symmetric Submodular Functions If F(A) = F(V \ A), then F can be minimized in O(n^3) (NB: over nontrivial solutions, i.e., excluding ∅ and V). Queyranne's algorithm: a straightforward split/merge algorithm over element sets; takes around 10 minutes to explain; implemented in Krause's Matlab toolbox. Summary: general submodular functions can be exactly minimized with the min-norm algorithm; symmetric submodular functions can be exactly minimized with Queyranne's algorithm in O(n^3).

27 Example: Q-Clustering [Narasimhan, Jojic, Bilmes NIPS 2005] Group data points V into homogeneous clusters: find a partition minimizing F(A_1, ..., A_k) = Σ_i E(A_i), where E is, e.g., entropy H(A) or a cut function. Theorem: F(P) ≤ (2 - 2/k) F(P_opt) for the returned partition P. For k = 2 and symmetric submodular E: use Queyranne's algorithm and obtain the optimum! This was the first algorithm for finding the optimal MDL clustering for k = 2, and it provides optimality guarantees for all symmetric submodular partitions. [AK]

28 Example: MAP for Pairwise MRFs E(x) = Σ_p θ_p(x_p) + Σ_{p,q} θ_pq(x_p, x_q), with unary terms (data) and pairwise terms (coherence): the x_p are discrete variables (e.g., x_p ∈ {0,1}); the θ_p(·) are unary potentials; the θ_pq(·,·) are pairwise potentials. E is submodular iff each pairwise term satisfies θ_pq(0,0) + θ_pq(1,1) ≤ θ_pq(0,1) + θ_pq(1,0). Pairwise submodular functions can be minimized in O(n^3), and in O(n) in practice, via the st-mincut algorithm.

29 Example: MAP for Pairwise MRFs [figure: the energy E(x) encoded as an st-graph; an st-mincut separating source S from sink T yields the solution] [PK]

30 Sub-example: Image Segmentation [figure] [PK]

31 Sub-example: Image Segmentation E(x; θ) = Σ_p θ_p x_p + Σ_{p,q} θ_pq x_p (1 - x_q), i.e., a likelihood (unary) term plus a regularity (pairwise) term, with x_p ∈ {0,1} (0 = fg, 1 = bg) and E: {0,1}^n → R. The global minimum can be found using the min-cut/max-flow algorithm. [PK]

32 Sub-example: Image Segmentation E(a_1, a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2. Augmenting Paths Algorithm: 1. Find a path from source to sink with positive residual capacity. 2. Push the maximum possible flow through this path. 3. Repeat until no such path can be found. [figure: the st-graph over a_1 and a_2; the cut shown has value 11 = E(1,1)]

33 Sub-example: Image Segmentation E(a_1, a_2) = 2a_1 + 5ā_1 + 9a_2 + 4ā_2 + 2a_1ā_2 + ā_1a_2. [figure: the maximum flow pushed through the same st-graph] Augmenting Paths Algorithm: 1. Find a path from source to sink with positive residual capacity. 2. Push the maximum possible flow through this path. 3. Repeat until no such path can be found.
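A sketch of this reduction in Python, assuming networkx is available for max-flow/min-cut; the graph-construction convention used here (a_i = 1 iff node i ends up on the source side) is one standard choice, not something fixed by the slides:

import networkx as nx

# st-graph for E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2
G = nx.DiGraph()
G.add_edge("s", 1, capacity=5)  # pay 5 if a1 = 0 (edge cut when node 1 is on the sink side)
G.add_edge(1, "t", capacity=2)  # pay 2 if a1 = 1
G.add_edge("s", 2, capacity=4)  # pay 4 if a2 = 0
G.add_edge(2, "t", capacity=9)  # pay 9 if a2 = 1
G.add_edge(1, 2, capacity=2)    # pay 2 if a1 = 1 and a2 = 0
G.add_edge(2, 1, capacity=1)    # pay 1 if a1 = 0 and a2 = 1

cut_value, (source_side, _) = nx.minimum_cut(G, "s", "t")
labels = {i: int(i in source_side) for i in (1, 2)}
print(cut_value, labels)  # 8 {1: 1, 2: 0}: E(1,0) = 2 + 4 + 2 = 8 is the minimum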

34 History of Maxflow Algorithms [table: augmenting-path and push-relabel algorithms with their running times; n = #nodes, m = #edges, U = maximum edge weight] Maxflow predates the work of Edmonds and Lovász! [slide credit: Andrew Goldberg, via PK]

35 Sub-example: Image denoising [figure]

36 Sub-example: Image denoising Pairwise Markov Random Field: P(x_1, ..., x_n, y_1, ..., y_n) = Π_{i,j} ψ_{i,j}(y_i, y_j) Π_i φ_i(x_i, y_i), where the X_i are observed pixels and the Y_i true pixels. We want argmax_y P(y | x) = argmax_y log P(x, y) = argmin_y Σ_{i,j} E_{i,j}(y_i, y_j) + Σ_i E_i(y_i), with E_{i,j}(y_i, y_j) = -log ψ_{i,j}(y_i, y_j). Suppose the y_i are binary; let F(A) = E(y^A), where y^A_i = 1 iff i ∈ A. Then min_y E(y) = min_A F(A). [AK]

37 Maximizing submodular functions

38 Concave or Convex? Plot g(|A|) against |A|. Recall: for A ⊆ B, s ∉ B, F(A ∪ {s}) - F(A) ≥ F(B ∪ {s}) - F(B). Theorem: suppose g: N → R and F(A) = g(|A|); then F(A) is submodular if and only if g is concave.

39 Maximizing convex functions: NP hard? Maximizing submodular functions: NP hard? Yes, but in some cases we have approximability guarantees.

40 Maximizing Submodular Functions [table: greedy approximability by constraint type: monotone with cardinality constraints (1 - 1/e); monotone with matroid constraints; nonnegative symmetric submodular; nonnegative (1/3)]

41 Example: Max Cover (NP-Hard) Place sensors in a building so that their discs of coverage fill the floorplan. Possible locations V; for A ⊆ V, F(A) = area covered by sensors placed at A. Goal: place k sensors that cover as much area as possible. [AK]

42 Example: max cover is submodular [figure: a new sensor S' adds more uncovered area to A = {S_1, S_2} than to B = {S_1, S_2, S_3, S_4} ⊇ A, i.e., F(A ∪ {S'}) - F(A) ≥ F(B ∪ {S'}) - F(B)] [AK]

43 Greedy Max Cover Algorithm: repeat (pick the location that covers the maximum uncovered area; mark the area covered by that sensor as covered) until k sensors are placed. GREEDY ≥ (1 - 1/e) OPT: best possible [Feige 1998].

44 Submodularity embraces Greed Theorem [Nemhauser et al 1978]: greedy gives a (1 - 1/e)-approximation for maximizing monotone submodular functions (*subject to cardinality constraints): F(A_greedy) ≥ (1 - 1/e) max_{|A|≤k} F(A). Common scenario; this is the best possible guarantee unless P = NP.
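A generic greedy sketch in Python (the toy coverage instance is invented for illustration; any monotone submodular F plugs in):

def greedy_max(F, V, k):
    """Greedy maximization of a monotone submodular F under |A| <= k."""
    A = set()
    for _ in range(k):
        # add the element with the largest marginal gain
        best = max((e for e in V if e not in A), key=lambda e: F(A | {e}) - F(A))
        A.add(best)
    return A

# Toy max-cover instance: each candidate location covers a set of cells
cover = {1: {1, 2, 3}, 2: {3, 4}, 3: {4, 5, 6, 7}, 4: {1, 7}}
F = lambda A: len(set().union(*(cover[i] for i in A)))
print(greedy_max(F, set(cover), k=2))  # {1, 3}: covers all 7 cells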

45 Theorem [Nemhauser et al 1978]: greedy gives a (1 - 1/e)-approximation for maximizing monotone submodular functions, subject to cardinality constraints. Proof: let S_i be the first i elements selected by greedy and C an optimal solution, F(C) = OPT. We show by induction that F(C) - F(S_i) ≤ (1 - 1/k)^i F(C). Case i = 0: F(C) - F(S_0) ≤ F(C). In step i > 0, greedy selects the element e_i maximizing the marginal gain F_{S_{i-1}}(e) := F(S_{i-1} ∪ {e}) - F(S_{i-1}). By monotonicity and submodularity, F(C) - F(S_{i-1}) ≤ Σ_{e ∈ C\S_{i-1}} F_{S_{i-1}}(e), implying F_{S_{i-1}}(e_i) ≥ (1 / |C \ S_{i-1}|) Σ_{e ∈ C\S_{i-1}} F_{S_{i-1}}(e) ≥ (1/k)(F(C) - F(S_{i-1})).

46 Theorem [Nemhauser et al 1978], proof continued: S_i: first i elements selected by greedy; C: F(C) = OPT; inductive claim: F(C) - F(S_i) ≤ (1 - 1/k)^i F(C). In step i > 0, greedy selects e_i maximizing F_{S_{i-1}}(·), and F_{S_{i-1}}(e_i) ≥ (1/k)(F(C) - F(S_{i-1})). Then F(C) - F(S_i) = F(C) - F(S_{i-1}) - F_{S_{i-1}}(e_i) ≤ F(C) - F(S_{i-1}) - (1/k)(F(C) - F(S_{i-1})) = (1 - 1/k)(F(C) - F(S_{i-1})) ≤ (1 - 1/k)^i F(C). Taking i = k gives F(C) - F(S_k) ≤ (1 - 1/k)^k F(C) ≤ (1/e) F(C), i.e., F(S_k) ≥ (1 - 1/e) F(C).

47 Example: Influence in social networks [Kempe, Kleinberg, Tardos KDD 03] A social graph over Alice, Bob, Charlie, Dorothy, Eric, and Fiona, with probabilities of influence on the edges (e.g., 0.2, 0.4, 0.5). Who should get free cell phones? V = {Alice, Bob, Charlie, Dorothy, Eric, Fiona}; F(A) = expected number of people influenced when targeting A. Key idea: flip the coins in advance to determine the "live" edges. [slide credit: AK]
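The live-edge idea makes F easy to estimate by Monte Carlo; a sketch (the graph and probabilities below are invented, since the slide's figure did not survive transcription):

import random

# Hypothetical influence graph: edge (u, v) is "live" with the given probability
edges = {("Alice", "Bob"): 0.5, ("Bob", "Charlie"): 0.4,
         ("Alice", "Dorothy"): 0.2, ("Dorothy", "Eric"): 0.4,
         ("Eric", "Fiona"): 0.2}

def influence(A, trials=1000):
    """Monte Carlo estimate of F(A): expected number of nodes reached via live edges."""
    total = 0
    for _ in range(trials):
        live = [e for e, p in edges.items() if random.random() < p]
        reached = frontier = set(A)
        while frontier:
            frontier = {v for (u, v) in live if u in frontier and v not in reached}
            reached = reached | frontier
        total += len(reached)
    return total / trials

print(influence({"Alice"}))  # estimated spread when targeting Alice alone

Since F is monotone submodular, the greedy routine sketched after slide 44 applies directly with this estimator plugged in (up to sampling noise).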

48 Adaptive Submodularity [Golovin & Krause 2010] Key idea: optimize over policies (decision trees) instead of sets; there are many more policies than sets. Objective function F(A, x_V), where A is the set of actions taken and x_V a realization of a set of random variables. Expected marginal benefit conditioned on the observations: Δ(e | x_A) = Σ_{x_V} P(x_V | x_A) [F(A ∪ {e}, x_V) - F(A, x_V)]. Adaptive submodularity: Δ(e | x_A) ≥ Δ(e | x_B) for A ⊆ B. Adaptive monotonicity: Δ(e | x_A) ≥ 0 for all e, x_A. Theorem: the adaptive greedy algorithm returns a policy π_greedy with G(π_greedy) ≥ (1 - 1/e) G(π_opt), where G(π) = E[F(π(x_V), x_V)].

49 Adaptive Submodularity Example Prior over diseases P(Y); likelihood of outcomes P(X_V | Y). Suppose P(X_V | Y) is deterministic (noise free): each test eliminates some hypotheses y. How should we test so as to eliminate all incorrect hypotheses? Generalized binary search, equivalent to maximizing information gain. (Example: Y = Sick; tests X_1 = Fever, X_2 = Age, X_3 = Biopsy Result.)

50 Summary: Submodularity in ML Minimization: clustering; MAP inference in MRFs (computer vision); structure learning*. Maximization: active learning; ranking*; feature selection.

51 DEMO See toolbox:

52 Resources Beyond Convexity: Submodularity in Machine Learning (Krause & Guestrin tutorial). Jeff Bilmes's submodular optimization class: ee595a_spring_2011/. Pushmeet Kohli's video lecture on MAP in MRFs. Coursera courses on discrete optimization.
