Andrew Tulloch

Convex Optimization

Trinity College, The University of Cambridge
Contents

1 Introduction 5
  1.1 Setup 5
  1.2 Convexity 6
2 Convexity 9
3 Cones and Generalized Inequalities 15
4 Conjugate Functions 21
  4.1 The Legendre-Fenchel Transform 21
  4.2 Duality Correspondences 24
5 Duality in Optimization 27
6 First-order Methods 31
7 Interior Point Methods 33
8 Support Vector Machines 37
  8.1 Machine Learning 37
  8.2 Linear Classifiers 38
  8.3 Kernel Trick 39
9 Total Variation 41
  9.1 Meyer's G-norm 41
  9.2 Non-local Regularization 42
10 Relaxation 45
Bibliography 49
1 Introduction

1.1 Setup

A setup consists of an energy function f, an input image I, and a candidate reconstructed image u; we seek the minimizer

    min_{u ∈ U} f(u, I)    (1.1)

where f is typically of the form

    f(u, I) = l(u, I) + r(u)    (1.2)

with l the cost (data) term and r the regulariser.

Example 1.1 (Rudin-Osher-Fatemi).

    min_u (1/2) ∫_Ω (u − I)² dx + λ ∫_Ω |∇u| dx    (1.3)

Will reduce contrast, but will not introduce new jumps.

Example 1.2 (Total variation (TV) L1).

    min_u (1/2) ∫_Ω |u − I| dx + λ ∫_Ω |∇u| dx    (1.4)

In general does not cause contrast loss.
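As a concrete illustration of the ROF model (1.3), here is a minimal one-dimensional sketch using gradient descent on a smoothed TV term; the signal, smoothing parameter eps, step size, and iteration count are illustrative choices of mine, not from the notes:

```python
import numpy as np

def rof_denoise_1d(I, lam=0.5, eps=0.1, tau=0.05, iters=500):
    """Minimize 0.5*||u - I||^2 + lam * sum_i sqrt((u_{i+1}-u_i)^2 + eps^2)
    by gradient descent (eps smooths the non-differentiable absolute value)."""
    u = I.copy()
    for _ in range(iters):
        d = np.diff(u)                      # forward differences
        w = d / np.sqrt(d**2 + eps**2)      # derivative of smoothed |d|
        # gradient of the smoothed TV term (negative discrete divergence of w)
        grad_tv = np.concatenate(([-w[0]], w[:-1] - w[1:], [w[-1]]))
        u -= tau * ((u - I) + lam * grad_tv)
    return u

rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(50), np.ones(50)])   # a single jump
noisy = clean + 0.2 * rng.standard_normal(100)
denoised = rof_denoise_1d(noisy)
```

The denoised signal keeps the jump (TV regularisation does not smooth edges away) while the noise is reduced, at the price of a slight contrast loss, as remarked above.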
Can show that if I = 1_{B_r(0)}, then the ROF minimizer is a scaled copy of I:

    u = (1 − 2λ/r) I  if r ≥ 2λ,   u = 0  if r < 2λ.    (1.5)

Example 1.3 (Inpainting).

    min_u (1/2) ∫_{Ω\A} (u − I)² dx + λ ∫_Ω |∇u| dx    (1.6)

1.2 Convexity

Theorem 1.4. If a function f is convex, every local minimizer is a global minimizer.

Theorem 1.5. If a function f is convex, it can (very often) be efficiently optimized (polynomial time in the number of bits of the input).

In computer vision, problems are:

- usually large-scale,
- usually non-differentiable.

Definition 1.6. f is lower semicontinuous (lsc) if

    f(x̄) ≤ liminf_{x→x̄} f(x) := min{α ∈ R̄ | ∃ x^k → x̄ with f(x^k) → α}.    (1.7)

Theorem 1.7. Let f : R^n → R̄. Then the following are equivalent:

(i) f is lower semicontinuous,
(ii) epi f is closed in R^n × R,
(iii) the level sets lev_{≤α} f = {x ∈ R^n | f(x) ≤ α} are closed for all α ∈ R.

Proof. ((i) ⇒ (ii)) Take (x^k, α^k) ∈ epi f with (x^k, α^k) → (x, α) ∈ R^n × R. Then by (1.7)

    f(x) ≤ liminf_k f(x^k) ≤ lim_k α^k = α,

so (x, α) ∈ epi f.

((ii) ⇒ (iii)) epi f closed implies epi f ∩ (R^n × {α}) closed for all α ∈ R, and this set equals

    {x ∈ R^n | f(x) ≤ α} × {α},    (1.8)
so lev_{≤α} f is closed for all α ∈ R. If α = +∞, lev_{≤α} f = R^n. If α = −∞, lev_{≤−∞} f = ∩_{k∈N} lev_{≤−k} f, which is an intersection of closed sets.

((iii) ⇒ (i)) For x^j → x̄, take a subsequence x^k → x̄ with

    f(x^k) → liminf_{x→x̄} f(x) = c.    (1.9)

If c ∈ R, then for every ε > 0 there is K(ε) = K such that f(x^k) ≤ c + ε for all k > K. Then x^k ∈ lev_{≤c+ε} f, so x̄ ∈ lev_{≤c+ε} f by closedness. As ε was arbitrary, x̄ ∈ lev_{≤c} f, i.e. f(x̄) ≤ c = liminf_{x→x̄} f(x). If c = +∞, we are done, as f(x̄) ≤ +∞ = liminf. If c = −∞, the same argument applies with f(x^k) ≤ −k, k ∈ N. □

Example 1.8. f(x) = x is lower semicontinuous, but does not have a minimizer.

Definition 1.9. f : R^n → R̄ is level-bounded if and only if lev_{≤α} f is bounded for all α ∈ R. This is also known as coercivity.

Example 1.10. f(x) = x² is level-bounded. f(x) = 1/x is not level-bounded.

Theorem 1.11. f : R^n → R̄ is level-bounded if and only if f(x^k) → ∞ whenever ‖x^k‖ → ∞.

Proof. Exercise.

Theorem 1.12. Let f : R^n → R̄ be lower semicontinuous, level-bounded, and proper. Then inf_x f(x) ∈ (−∞, +∞) and argmin f = {x | f(x) ≤ inf_x f(x)} is nonempty and compact.

Proof.

    argmin f = {x | f(x) ≤ inf f}    (1.10)
             = {x | f(x) ≤ inf f + 1/k, k ∈ N}    (1.11)
             = ∩_{k∈N} lev_{≤ inf f + 1/k} f.    (1.12)

If inf f = −∞, just replace inf f + 1/k by α_k with α_k → −∞, e.g. α_k = −k, k ∈ N.
These level sets are bounded (by level-boundedness), closed (by f being lower semicontinuous), and non-empty (since 1/k > 0). They are nested and compact, so their intersection is non-empty: picking a point in each level set gives a bounded sequence, which has a convergent subsequence whose limit lies in every level set. Hence argmin f is non-empty and compact.

We still need to show inf f > −∞. If inf f = −∞, the argument above yields x ∈ argmin f with f(x) = −∞. Since f is proper, no such x exists. Thus inf f > −∞.

Remark 1.13. For Theorem 1.12, it suffices to have lev_{≤α} f bounded and non-empty for at least one α ∈ R.

Proposition 1.14. We have the following properties for lower semicontinuity:

(i) If f, g are lower semicontinuous, then f + g is lower semicontinuous.
(ii) If f is lower semicontinuous and λ ≥ 0, then λf is lower semicontinuous.
(iii) If f : R^n → R̄ is lower semicontinuous and g : R^m → R^n is continuous, then f ∘ g is lower semicontinuous.
2 Convexity

Definition 2.1.

(i) f : R^n → R̄ is convex if

    f((1 − τ)x + τy) ≤ (1 − τ)f(x) + τf(y)    (2.1)

for all x, y ∈ R^n, τ ∈ (0, 1).

(ii) A set C ⊆ R^n is convex if and only if its indicator function δ_C is convex.

(iii) f is strictly convex if and only if (2.1) holds strictly whenever x ≠ y and f(x), f(y) ∈ R.

Remark 2.2. C ⊆ R^n is convex if and only if for all x, y ∈ C the connecting line segment is contained in C.

Exercise 2.3. Show that the halfspace

    {x | a^T x + b ≤ 0}    (2.2)

is convex, and that the affine function

    f(x) = a^T x + b    (2.3)

is convex.

Definition 2.4. For x_0, ..., x_m ∈ R^n and λ_0, ..., λ_m ≥ 0 with Σ_{i=0}^m λ_i = 1, the point Σ_i λ_i x_i is a convex combination of the x_i.

Theorem 2.5. f : R^n → R̄ is convex if and only if

    f(Σ_{i=1}^n λ_i x_i) ≤ Σ_{i=1}^n λ_i f(x_i)    (2.4)

for all convex combinations.
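The inequality (2.1) can be checked numerically on sampled points. A small sketch (the test functions and grids are my own illustrative choices):

```python
import numpy as np

def violates_convexity(f, pts, taus, tol=1e-9):
    """Search for a violation of f((1-t)x + t*y) <= (1-t)f(x) + t*f(y)."""
    for x in pts:
        for y in pts:
            for t in taus:
                if f((1 - t) * x + t * y) > (1 - t) * f(x) + t * f(y) + tol:
                    return True   # found x, y, t violating the inequality
    return False

pts = np.linspace(-3.0, 3.0, 13)
taus = np.linspace(0.05, 0.95, 10)
square_ok = not violates_convexity(lambda x: x ** 2, pts, taus)  # x^2 is convex
sine_bad = violates_convexity(np.sin, pts, taus)                 # sin is not
```

A passing check on samples is of course only evidence, not a proof of convexity.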
C ⊆ R^n is convex if and only if C contains all convex combinations of its elements.

Proof. Write

    Σ_{i=1}^m λ_i x_i = λ_m x_m + (1 − λ_m) Σ_{i=1}^{m−1} (λ_i / (1 − λ_m)) x_i,    (2.5)

a convex combination of two points, and proceed by induction on m. The set version is proven by applying this to δ_C. □

Proposition 2.6. If f : R^n → R̄ is convex, then the domain of f is convex.

Proposition 2.7. f : R^n → R̄ is convex if and only if epi f is convex in R^n × R, if and only if the strict epigraph

    {(x, α) ∈ R^n × R | f(x) < α}    (2.6)

is convex in R^n × R.

Proof. epi f is convex if and only if for all τ ∈ (0, 1), x, y, α ≥ f(x), β ≥ f(y), i.e. (x, α), (y, β) ∈ epi f, we have

    f((1 − τ)x + τy) ≤ (1 − τ)α + τβ    (2.7)

⟺ for all τ ∈ (0, 1), x, y:  f((1 − τ)x + τy) ≤ (1 − τ)f(x) + τf(y)    (2.8)

⟺ f is convex.    (2.9)

□

Proposition 2.8. If f : R^n → R̄ is convex, then lev_{≤α} f is convex for all α ∈ R̄.

Proof. For x, y ∈ lev_{≤α} f, f((1 − τ)x + τy) ≤ (1 − τ)f(x) + τf(y) ≤ α, which implies lev_{≤α} f is convex. If α = +∞, then lev_{≤α} f = R^n, which is convex.

Theorem 2.9. Let f : R^n → R̄ be convex. Then
(i) argmin f is convex,
(ii) if x is a local minimizer of f, then x is a global minimizer of f,
(iii) if f is strictly convex and proper, then f has at most one global minimizer.

Proof. (i) If argmin f = ∅, it is convex. Otherwise argmin f = lev_{≤ inf f} f, which is convex by the previous proposition.

(ii) Assume x is a local minimizer and there exists y with f(y) < f(x). Then for τ ∈ (0, 1),

    f((1 − τ)x + τy) ≤ (1 − τ)f(x) + τf(y) < f(x).

Taking τ → 0 shows that x cannot be a local minimizer, and thus no such y can exist.

(iii) Assume x ≠ y both minimize f, which implies f(x) = f(y). Then

    f((1/2)x + (1/2)y) < (1/2)f(x) + (1/2)f(y) = f(x) = f(y),

which contradicts x, y being global minimizers. □

Proposition 2.10. To construct convex functions:

(i) Let f_i, i ∈ I, be convex; then

    f(x) = sup_{i∈I} f_i(x)    (2.10)

is convex.

(ii) Let f_i, i ∈ I, be strictly convex, I finite; then

    sup_{i∈I} f_i(x)    (2.11)

is strictly convex.

(iii) Let C_i, i ∈ I, be convex sets; then

    ∩_{i∈I} C_i    (2.12)

is convex.
(iv) Let f_k, k ∈ N, be convex; then

    limsup_{k→∞} f_k(x)    (2.13)

is convex.

Proof. Exercise.

Example 2.11.

(i) C, D convex does not imply C ∪ D convex (e.g. disjoint sets).
(ii) f : R^n → R̄ convex, C convex implies f + δ_C convex.
(iii) f(x) = |x| = max{x, −x} is convex.
(iv) f(x) = ‖x‖_p, p ≥ 1, is convex, as

    ‖x‖_p = sup_{‖y‖_q ≤ 1} ⟨x, y⟩,  1/p + 1/q = 1.    (2.14)

Theorem 2.12. Let C ⊆ R^n be open and convex, and let f : C → R be differentiable. Then the following are equivalent:

(i) f is [strictly] convex,
(ii) ⟨y − x, ∇f(y) − ∇f(x)⟩ ≥ 0 for all x, y ∈ C [with > 0 for x ≠ y],
(iii) f(x) + ⟨∇f(x), y − x⟩ ≤ f(y) for all x, y ∈ C [with strict inequality for x ≠ y],
(iv) if f is additionally twice differentiable: ∇²f(x) is positive semidefinite for all x ∈ C.

Property (ii) is monotonicity of ∇f.

Proof. Exercise (reduce to n = 1, using that f is convex on C if and only if its restriction to every line segment in C is convex). □

Remark 2.13. For strict convexity the converse of (iv) with positive definiteness does not hold: f(x) = x⁴ is strictly convex, but f''(0) = 0.

Proposition 2.14. The following hold for convex functions:

(i) If f_1, ..., f_m : R^n → R̄ are convex and λ_1, ..., λ_m ≥ 0, then

    f = Σ_i λ_i f_i    (2.15)

is convex, and strictly convex if there exists i such that λ_i > 0 and f_i is strictly convex.
(ii) f_i : R^n → R̄ convex implies f(x_1, ..., x_m) = Σ_i f_i(x_i) is convex, and strictly convex if all f_i are strictly convex.

(iii) f : R^n → R̄ convex, A ∈ R^{n×n}, b ∈ R^n implies g(x) = f(Ax + b) is convex.

Remark 2.15. The operator norm

    ‖M‖ = sup_{‖x‖ ≤ 1} ‖Mx‖

is convex in M, as a supremum of convex functions.

Proposition 2.16.

(i) C_1, ..., C_m convex implies C_1 × ⋯ × C_m convex.
(ii) C ⊆ R^n convex, A ∈ R^{m×n}, b ∈ R^m implies L(C) is convex, where L(x) = Ax + b.
(iii) C ⊆ R^m convex implies the preimage L^{−1}(C) is convex.
(iv) f, g convex implies f + g is convex.
(v) f convex, λ ≥ 0 implies λf is convex.

Definition 2.17. For S ⊆ R^n, x ∈ R^n, define the projection of x onto S as

    Π_S(y) = argmin_{x∈S} ‖x − y‖_2.    (2.16)

Proposition 2.18. If C ⊆ R^n is convex, closed, and non-empty, then Π_C is single-valued; that is, the projection exists and is unique.

Proof. Write

    Π_C(y) = argmin_x (1/2)‖x − y‖² + δ_C(x).    (2.17)

To show uniqueness: (1/2)‖x − y‖² is strictly convex, since ∇²_x (1/2)‖x − y‖² = I ≻ 0; hence the objective f is strictly convex. f is proper (C ≠ ∅), and thus f has at most one minimizer.

To show existence: f is proper and lower semicontinuous (the left part is continuous, the right part is lsc since C is closed). f is level-bounded, as ‖x − y‖² → ∞ when ‖x‖₂ → ∞ and δ_C ≥ 0. Thus argmin f ≠ ∅. □
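Projections onto two standard closed convex sets have simple closed forms, which this sketch checks against the optimality inequality ⟨y − p, x − p⟩ ≤ 0 for all x ∈ C that characterizes projections onto convex sets (set choices and sample sizes are illustrative):

```python
import numpy as np

def proj_box(y, lo, hi):
    """Projection onto the box [lo, hi]^n: clamp each coordinate."""
    return np.clip(y, lo, hi)

def proj_ball(y, r):
    """Projection onto the Euclidean ball {x : ||x|| <= r}: radial shrinkage."""
    n = np.linalg.norm(y)
    return y if n <= r else (r / n) * y

rng = np.random.default_rng(0)
y = np.array([2.0, -3.0, 0.5])
p = proj_ball(y, 1.0)

# sample points x inside the unit ball and check <y - p, x - p> <= 0
xs = rng.uniform(-1.0, 1.0, (1000, 3))
xs = xs[np.linalg.norm(xs, axis=1) <= 1.0]
inequality_holds = bool(np.all((xs - p) @ (y - p) <= 1e-9))
```

Both projections are unique here, exactly as Proposition 2.18 predicts for closed convex non-empty sets.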
Definition 2.19. Let S ⊆ R^n be arbitrary. Then

    con S = ∩_{C convex, C ⊇ S} C    (2.18)

is the convex hull of S.

Remark 2.20. con S is the smallest convex set that contains S.

Theorem 2.21. Let S ⊆ R^n. Then

    con S = {Σ_{i=0}^n λ_i x_i | x_i ∈ S, λ_i ≥ 0, Σ_{i=0}^n λ_i = 1} =: D.    (2.19)

Proof. D is convex and contains S: if x, y ∈ D, then (1 − τ)x + τy ∈ D. Thus con S ⊆ D. Conversely, con S is convex, so by Theorem 2.5 it contains all convex combinations of points in S; thus D ⊆ con S. Hence con S = D. □

Definition 2.22. For a set C ⊆ R^n, define the closure

    cl C = {x ∈ R^n | for all open neighborhoods N of x, N ∩ C ≠ ∅},    (2.20)

the interior

    int C = {x ∈ R^n | there exists an open neighborhood N of x with N ⊆ C},    (2.21)

and the boundary

    ∂C = cl C \ int C.    (2.22)

Remark 2.23.

    cl G = ∩_{S closed, S ⊇ G} S.    (2.23)
3 Cones and Generalized Inequalities

Definition 3.1. K ⊆ R^n is a cone if and only if 0 ∈ K and λx ∈ K for all x ∈ K, λ > 0.

Definition 3.2. A cone is pointed if and only if

    x_1 + ⋯ + x_n = 0, x_i ∈ K  ⟹  x_1 = ⋯ = x_n = 0.    (3.1)

Example 3.3.

(i) R^n is a cone (not pointed for n ≥ 1).
(ii) {(x_1, x_2) ∈ R² | x_2 > 0} is not a cone (it does not contain 0).
(iii) {(x_1, x_2) ∈ R² | x_2 ≥ 0} is a cone, but is not pointed.
(iv) {x ∈ R^n | x_i ≥ 0, i ∈ 1, ..., n} is a pointed cone.

Proposition 3.4. Let K ⊆ R^n be any set. Then the following are equivalent:

(i) K is a convex cone.
(ii) K is a cone and K + K ⊆ K.
(iii) K ≠ ∅ and Σ_{i=0}^n α_i x_i ∈ K for all x_i ∈ K, α_i ≥ 0.

Proof. Exercise.

Proposition 3.5. If K is a convex cone, then K is pointed if and only if K ∩ (−K) = {0}.

Proof. Exercise.

Definition 3.6. The subdifferential of f at x is

    ∂f(x) = {v ∈ R^n | f(x) + ⟨v, y − x⟩ ≤ f(y) ∀y}.    (3.2)
The one-sided directional derivative is

    f′(x; w) = lim_{t↓0} (f(x + tw) − f(x))/t.    (3.3)

Proposition 3.7. Let f, g : R^n → R̄ be convex. Then

(i) f differentiable at x implies ∂f(x) = {∇f(x)}.
(ii) f differentiable at x and g(x) ∈ R imply

    ∂(f + g)(x) = ∇f(x) + ∂g(x).    (3.4)

Proof. (i) is the special case of (ii) with g = 0.

(ii) To show RHS ⊆ LHS: if v ∈ ∂g(x), then ⟨v, y − x⟩ ≤ g(y) − g(x), which together with the gradient inequality f(x) + ⟨∇f(x), y − x⟩ ≤ f(y) (Theorem 2.12 (iii)) gives

    (f + g)(x) + ⟨v + ∇f(x), y − x⟩ ≤ (f + g)(y),    (3.5)-(3.6)

so v + ∇f(x) ∈ ∂(f + g)(x).    (3.7)

For the other direction, let v ∈ ∂(f + g)(x). The subgradient inequality gives

    liminf_{z→x} (f(z) + g(z) − f(x) − g(x) − ⟨v, z − x⟩)/‖z − x‖ ≥ 0.    (3.8)

Subtracting the first-order expansion of f at x,

    liminf_{z→x} (g(z) − g(x) − ⟨v − ∇f(x), z − x⟩)/‖z − x‖ ≥ 0.    (3.9)

Take z(t) = (1 − t)x + ty for fixed y; by convexity g(z(t)) − g(x) ≤ t(g(y) − g(x)), so

    liminf_{t↓0} (t(g(y) − g(x)) − t⟨v − ∇f(x), y − x⟩)/(t‖y − x‖) ≥ 0,    (3.10)-(3.11)

and hence

    g(y) − g(x) − ⟨v − ∇f(x), y − x⟩ ≥ 0 ∀y  ⟺  v − ∇f(x) ∈ ∂g(x),    (3.12)

i.e. v ∈ ∇f(x) + ∂g(x).    (3.13)

□
Theorem 3.8 (Fermat's rule). Let f : R^n → R̄ be proper. Then

    x ∈ argmin f  ⟺  0 ∈ ∂f(x).    (3.14)

Proof.

    0 ∈ ∂f(x)  ⟺  f(x) = f(x) + ⟨0, y − x⟩ ≤ f(y) ∀y.    (3.15)

□

Definition 3.9. For a convex set C ⊆ R^n and a point x ∈ C,

    N_C(x) = {v ∈ R^n | ⟨v, y − x⟩ ≤ 0 ∀y ∈ C}    (3.16)

is the normal cone of C at x. By convention, N_C(x) = ∅ for all x ∉ C.

Proposition 3.10. Let C be convex and C ≠ ∅. Then

    ∂δ_C(x) = N_C(x).    (3.17)

Proof. For x ∈ C, we have

    ∂δ_C(x) = {v | δ_C(x) + ⟨v, y − x⟩ ≤ δ_C(y) ∀y} = {v | ⟨v, y − x⟩ ≤ 0 ∀y ∈ C} = N_C(x).    (3.18)

For x ∉ C, δ_C(x) = +∞ and the subgradient inequality fails for every v (take any y ∈ C), so ∂δ_C(x) = ∅ = N_C(x). □

Proposition 3.11. Let C ⊆ R^n be closed, convex and non-empty, x ∈ R^n. Then

    y = Π_C(x)  ⟺  x − y ∈ N_C(y).    (3.19)

Proof.

    y = Π_C(x)  ⟺  y minimizes g(y′) = (1/2)‖y′ − x‖² + δ_C(y′),    (3.20)

if and only if 0 ∈ ∂g(y) = y − x + N_C(y), if and only if x − y ∈ N_C(y). □

Proposition 3.12. Let f : R^n → R̄ be convex and proper. Then

    ∂f(x) = ∅  if x ∉ dom f,
    ∂f(x) = {v ∈ R^n | (v, −1) ∈ N_{epi f}(x, f(x))}  if x ∈ dom f.    (3.21)
If x ∈ dom f,

    N_{dom f}(x) = {v ∈ R^n | (v, 0) ∈ N_{epi f}(x, f(x))}.    (3.22)

Example 3.13. The subdifferential of f : R → R, f(x) = |x| is

    ∂f(x) = {sign x}  for x ≠ 0,   ∂f(0) = [−1, 1].    (3.23)

Definition 3.14. For C ⊆ R^n, define the affine hull

    aff(C) = ∩_{A affine, C ⊆ A} A    (3.24)

and the relative interior

    rint(C) = {x ∈ R^n | there exists an open neighborhood N of x such that N ∩ aff(C) ⊆ C}.    (3.25)

Example 3.15.

    aff(R^n) = R^n    (3.26)
    aff([0, 1]²) = R²    (3.27)
    rint([0, 1]²) = (0, 1)²    (3.28)
    rint([0, 1] × {0}) = (0, 1) × {0}    (3.29)

Exercise 3.16. We know

    int(A × B) = int A × int B.    (3.30)

Does

    rint(A × B) = rint A × rint B?    (3.31)

Proposition 3.17. Let f : R^n → R̄ be convex.

(i) (a) g(x) = f(x + y) implies ∂g(x) = ∂f(x + y);
    (b) g(x) = f(λx) implies ∂g(x) = λ∂f(λx);
    (c) g(x) = λf(x), λ > 0, implies ∂g(x) = λ∂f(x).

(ii) Let f : R^n → R̄ be proper and convex, A ∈ R^{n×m} such that

    {Ay | y ∈ R^m} ∩ rint dom f ≠ ∅.    (3.32)
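The subdifferential of the absolute value and Fermat's rule can be checked numerically: every v ∈ [−1, 1] satisfies the subgradient inequality at 0, while any v outside fails it. The sample grid is an illustrative choice:

```python
import numpy as np

def is_subgradient(f, x, v, ys, tol=1e-12):
    """Check f(x) + v*(y - x) <= f(y) on sample points ys (the subgradient inequality)."""
    return all(f(x) + v * (y - x) <= f(y) + tol for y in ys)

ys = np.linspace(-2.0, 2.0, 41)
inside = [is_subgradient(abs, 0.0, v, ys) for v in (-1.0, -0.5, 0.0, 1.0)]
outside = is_subgradient(abs, 0.0, 1.5, ys)   # 1.5 is not in [-1, 1]
```

Since 0 ∈ [−1, 1] is a subgradient of |·| at 0, Fermat's rule confirms that x = 0 minimizes |x|.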
Then for x ∈ dom(f ∘ A) we have

    ∂(f ∘ A)(x) = A^T ∂f(Ax).    (3.33)

(iii) Let f_1, ..., f_m : R^n → R̄ be proper, convex, with rint dom f_1 ∩ ⋯ ∩ rint dom f_m ≠ ∅. Then for

    f(x) = f_1(x) + ⋯ + f_m(x)    (3.34)

we have ∂f(x) = ∂f_1(x) + ⋯ + ∂f_m(x).
4 Conjugate Functions

4.1 The Legendre-Fenchel Transform

Definition 4.1. For f : R^n → R̄,

    con f(x) = sup_{g ≤ f, g convex} g(x)    (4.1)

is the convex hull of f.

Proposition 4.2. con f is the greatest convex function majorized by f.

Definition 4.3. For f : R^n → R̄, the (lower) closure of f is defined as

    cl f(x) = liminf_{y→x} f(y).    (4.2)

Proposition 4.4. For f : R^n → R̄,

    epi(cl f) = cl(epi f).    (4.3)

In particular, if f is convex, then cl f is convex.

Proof. Exercise.

Proposition 4.5. If f : R^n → R̄, then

    (cl f)(x) = sup_{g ≤ f, g lsc} g(x).    (4.4)

Proof. cl f is lower semicontinuous and minorizes f, and any lsc g ≤ f satisfies g ≤ cl f. □
Theorem 4.6. Let C ⊆ R^n be closed and convex. Then

    C = ∩_{(b,β) s.t. C ⊆ H_{b,β}} H_{b,β},    (4.5)

where

    H_{b,β} = {x ∈ R^n | ⟨x, b⟩ − β ≤ 0}.    (4.6)

Theorem 4.7. Let f : R^n → R̄ be proper, lsc, and convex. Then

    f(x) = sup_{g ≤ f, g affine} g(x).    (4.7)

(This is quite an involved proof; it is omitted here.)

Definition 4.8. For f : R^n → R̄, let f* : R^n → R̄ be defined by

    f*(v) = sup_{x∈R^n} {⟨v, x⟩ − f(x)},    (4.8)

the conjugate of f. The mapping f ↦ f* is the Legendre-Fenchel transform.

Remark 4.9. For v ∈ R^n,

    f*(v) = sup_{x∈R^n} {⟨v, x⟩ − f(x)}    (4.9)
    ⟺ f*(v) ≥ ⟨v, x⟩ − f(x) ∀x ∈ R^n  ⟺  f(x) ≥ ⟨v, x⟩ − f*(v) ∀x ∈ R^n.    (4.10)

Thus x ↦ ⟨v, x⟩ − f*(v) is the largest affine function with gradient v majorized by f.

Theorem 4.10. Assume f : R^n → R̄. Then

(i) f* = (con f)* = (cl f)* = (cl con f)*, and f ≥ f** = (f*)*, the biconjugate of f.
(ii) If con f is proper, then f*, f** are proper, lower semicontinuous, convex, and f** = cl con f.
(iii) If f : R^n → R̄ is proper, lower semicontinuous, and convex, then

    f** = f.    (4.11)
Proof. We have

    (v, β) ∈ epi f*  ⟺  β ≥ ⟨v, x⟩ − f(x) ∀x ∈ R^n  ⟺  f(x) ≥ ⟨v, x⟩ − β ∀x ∈ R^n.    (4.12)

We claim that for an affine function h we have h ≤ f ⟺ h ≤ con f ⟺ h ≤ cl f ⟺ h ≤ cl con f. This holds since con f is the largest convex function less than or equal to f and h is convex; same for cl and cl con. Thus in (4.12) we can replace f by con f, cl f, or cl con f without changing epi f*, which gives the first part of (i). We also have

    f**(y) = sup_{v∈R^n} {⟨v, y⟩ − sup_{x∈R^n} {⟨v, x⟩ − f(x)}}    (4.13)
           ≤ sup_{v∈R^n} {⟨v, y⟩ − ⟨v, y⟩ + f(y)}    (4.14)
           = f(y).    (4.15)

For the second part, we have con f proper, and claim that cl con f is proper, lower semicontinuous, and convex. Lower semicontinuity is immediate. Convexity is given by the previous proposition (f convex implies cl f convex). Properness is to be shown in an exercise. Applying Theorem 4.7,

    cl con f(x) = sup_{g ≤ cl con f, g affine} g(x)    (4.16)
                = sup_{(v,β)∈epi f*} {⟨v, x⟩ − β}    (4.17)
                = sup_{(v,β)∈epi f*} {⟨v, x⟩ − f*(v)}    (4.18)
                = sup_{v∈dom f*} {⟨v, x⟩ − f*(v)}    (4.19)
                = sup_{v∈R^n} {⟨v, x⟩ − f*(v)} = f**(x),    (4.20)

where we used that an affine g(x) = ⟨v, x⟩ − β satisfies g ≤ cl con f ⟺ (v, β) ∈ epi f*.

To show f* is lower semicontinuous and convex: epi f* is an intersection of closed convex sets (one closed halfspace per x in (4.12)), therefore closed and convex, and hence f* is lower semicontinuous and convex.
To show properness: con f proper implies there exists x ∈ R^n with con f(x) < ∞. Then f*(v) = sup_x {⟨v, x⟩ − f(x)} > −∞ for every v. If f* ≡ +∞, then f** ≡ −∞, contradicting f** = cl con f proper; so f* is proper, lower semicontinuous, and convex. Applying the same to f* (con f* = f* is proper by the previous result), f** is proper, lower semicontinuous, and convex.

For part (iii), apply (ii): f is convex, which implies con f = f, and con f is proper (as f is proper); f is lsc and convex, and thus

    f** = cl con f = cl f = f.

□

4.2 Duality Correspondences

Theorem 4.11. Let f : R^n → R̄ be proper, lower semicontinuous, and convex. Then:

(i) ∂f* = (∂f)^{−1},
(ii) v ∈ ∂f(x) ⟺ f(x) + f*(v) = ⟨v, x⟩ ⟺ x ∈ ∂f*(v),
(iii)

    ∂f(x) = argmax_{v′} {⟨v′, x⟩ − f*(v′)},    (4.21)
    ∂f*(v) = argmax_{x′} {⟨v, x′⟩ − f(x′)}.    (4.22)

Proof. (i) follows directly from (ii). For (ii):

    v ∈ ∂f(x)  ⟺  f(y) ≥ f(x) + ⟨v, y − x⟩ ∀y    (4.24)
              ⟺  ⟨v, x⟩ − f(x) ≥ ⟨v, y⟩ − f(y) ∀y
              ⟺  f*(v) = ⟨v, x⟩ − f(x)    (4.25)
              ⟺  x ∈ argmax_{x′} {⟨v, x′⟩ − f(x′)}.    (4.26)

Since f** = f, the same computation applied to f* gives f(x) + f*(v) = ⟨v, x⟩ ⟺ x ∈ ∂f*(v), and (iii) restates (4.26). □
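The Legendre-Fenchel transform and the biconjugation result f** = f can be explored on a grid. The sketch below approximates f* by a discrete maximum and checks f** ≈ f for the self-conjugate function f(x) = x²/2; the grid sizes are illustrative, and the approximation is only accurate where the maximizer stays inside the grids:

```python
import numpy as np

def conjugate(f_vals, xs, vs):
    """Discrete Legendre-Fenchel transform: f*(v) = max_x (v*x - f(x)) over the grid xs."""
    return np.array([np.max(v * xs - f_vals) for v in vs])

xs = np.linspace(-3.0, 3.0, 601)
vs = np.linspace(-2.0, 2.0, 201)
f = 0.5 * xs**2
fstar = conjugate(f, xs, vs)          # approximates v^2/2 (x^2/2 is self-conjugate)
fstarstar = conjugate(fstar, vs, xs)  # biconjugate, approximates f for |x| <= 2
```

For a non-convex f the same computation approximates cl con f instead, illustrating part (ii) of the theorem.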
Proposition 4.12. Let f : R^n → R̄ be proper, lower semicontinuous, and convex. Then

    (f(·) − ⟨a, ·⟩)* = f*(· + a)    (4.28)
    (f(· + b))* = f*(·) − ⟨·, b⟩    (4.29)
    (f(·) + c)* = f*(·) − c    (4.30)
    (λf(·))* = λf*(·/λ),  λ > 0    (4.31)
    (λf(·/λ))* = λf*(·)    (4.32)

Proof. Exercise.

Proposition 4.13. Let f_i : R^n → R̄, i = 1, ..., m, be proper and f(x_1, ..., x_m) = Σ_i f_i(x_i). Then f*(v_1, ..., v_m) = Σ_i f_i*(v_i).

Definition 4.14. For any set S ⊆ R^n, define the support function

    G_S(v) = sup_{x∈S} ⟨v, x⟩ = (δ_S)*(v).    (4.33)

Definition 4.15. A function h : R^n → R̄ is positively homogeneous if 0 ∈ dom h and h(λx) = λh(x) for all λ > 0, x ∈ R^n.

Proposition 4.16. The set of positively homogeneous proper lower semicontinuous convex functions and the set of closed convex non-empty sets are in one-to-one correspondence through the Legendre-Fenchel transform:

    δ_C ↔ G_C,    (4.34)

with

    x ∈ ∂G_C(v)  ⟺  x ∈ C and G_C(v) = ⟨v, x⟩  ⟺  v ∈ N_C(x) = ∂δ_C(x).    (4.35)-(4.36)

The set of closed convex cones is in one-to-one correspondence with itself:

    δ_K ↔ δ_{K°},    (4.37)

where

    K° = {v ∈ R^n | ⟨v, x⟩ ≤ 0 ∀x ∈ K}    (4.38)

is the polar cone,
and

    x ∈ N_{K°}(v)  ⟺  x ∈ K, v ∈ K°, ⟨x, v⟩ = 0  ⟺  v ∈ N_K(x).    (4.39)-(4.40)
5 Duality in Optimization

Definition 5.1. For f : R^n × R^m → R̄ proper, lower semicontinuous, and convex, we define the primal and dual problems

    inf_{x∈R^n} φ(x),  φ(x) = f(x, 0),    (5.1)
    sup_{y∈R^m} ψ(y),  ψ(y) = −f*(0, y),    (5.2)

and the inf-projections

    p(u) = inf_{x∈R^n} f(x, u),    (5.3)
    q(v) = inf_{y∈R^m} f*(v, y).    (5.4)

f is the perturbation function for φ; p is the associated projection function.

Example. Consider the problem

    inf_x (1/2)‖x − z‖² + δ_{{0}}(Ax − b) = inf_x φ(x)    (5.5)

and the perturbed problem

    f(x, u) = (1/2)‖x − z‖² + δ_{{0}}(Ax − b + u).    (5.6)

Proposition 5.2. Assume f satisfies the assumptions in Definition 5.1. Then

(i) φ and −ψ are convex and lower semicontinuous,
(ii) p, q are convex,
(iii) p(0) = inf_x φ(x),
(iv) p**(0) = sup_y ψ(y),
(v) inf_x φ(x) < ∞ ⟺ 0 ∈ dom p,
(vi) sup_y ψ(y) > −∞ ⟺ 0 ∈ dom q.
Proof. (i) φ is clearly convex and lower semicontinuous. For ψ: f* is lower semicontinuous and convex, which implies −ψ = f*(0, ·) is lower semicontinuous and convex.

(ii) Look at the strict epigraph of p:

    E = {(u, α) ∈ R^m × R | p(u) < α}    (5.7)
      = {(u, α) ∈ R^m × R | ∃x : f(x, u) < α}    (5.8)
      = A({(x, u, α) ∈ R^n × R^m × R | f(x, u) < α}),  A(x, u, α) = (u, α),    (5.9)-(5.10)

the image of a convex set under a linear map. As E is convex, p must be convex. Similarly with q.

(iii) p(0) = inf_x f(x, 0) = inf_x φ(x) by definition.

(iv) For p**(0),

    p*(y) = sup_u {⟨y, u⟩ − p(u)}    (5.11)
          = sup_{u,x} {⟨y, u⟩ − f(x, u)}    (5.12)
          = f*(0, y),    (5.13)

and

    p**(0) = sup_y {⟨0, y⟩ − p*(y)}    (5.14)
           = sup_y −f*(0, y)    (5.15)
           = sup_y ψ(y).    (5.16)

(v) By definition, 0 ∈ dom p ⟺ p(0) = inf_x φ(x) < ∞.

(vi) Analogously, 0 ∈ dom q ⟺ q(0) = inf_y f*(0, y) = −sup_y ψ(y) < ∞. □
Theorem 5.3. Let f be as in Definition 5.1. Then weak duality holds:

    p(0) = inf_x φ(x) ≥ sup_y ψ(y) = p**(0),    (5.17)

and under certain conditions the inf and sup are equal and finite (strong duality): p(0) ∈ R and p lower semicontinuous at 0 if and only if inf_x φ(x) = sup_y ψ(y) ∈ R.

Definition 5.4.

    inf φ − sup ψ ≥ 0    (5.18)

is the duality gap.

Proof. (≥) p** = cl con p ≤ p, so sup ψ = p**(0) ≤ p(0) = inf φ.

(⟺) cl p is lower semicontinuous and convex. If p(0) ∈ R and p is lsc at 0, then cl p(0) = p(0) ∈ R, so cl p is proper, and

    sup ψ = p**(0) = (cl p)(0) = p(0) = inf φ.

□

Proposition 5.5. [Notes missing from lecture.]
6 First-order Methods

The idea is to perform gradient descent on the objective function f, with step size τ_k.

Definition 6.1. Let f : R^n → R̄. Define

(i) the forward (explicit) step

    F_{τ_k f}(x^k) = x^k − τ_k ∇f(x^k) = (I − τ_k ∇f)(x^k),    (6.1)

(ii) the backward (implicit) step

    B_{τ_k f}(x^k) = (I + τ_k ∂f)^{−1}(x^k).    (6.2)

Proposition 6.2. If f : R^n → R̄ is proper, lower semicontinuous, and convex, then for τ_k > 0,

    B_{τ_k f}(x) = argmin_y {(1/(2τ_k))‖y − x‖² + f(y)}.    (6.4)

Proof.

    y ∈ argmin_{y′} {(1/(2τ_k))‖y′ − x‖² + f(y′)}  ⟺  0 ∈ (1/τ_k)(y − x) + ∂f(y)    (6.5)
    ⟺  x ∈ y + τ_k ∂f(y)  ⟺  y = (I + τ_k ∂f)^{−1}(x).    (6.6)

□

[Remaining lecture material missing.]
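For f = λ|·| the backward step (6.2)/(6.4) has the closed form of soft-thresholding. The sketch below compares it against a brute-force grid minimization of (6.4); the grid and parameter values are illustrative choices:

```python
import numpy as np

def prox_abs(x, lam):
    """Backward step B_{lam f}(x) for f = |.|: soft-thresholding (closed form)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_numeric(x, lam, grid):
    """argmin_y (1/(2*lam)) * (y - x)^2 + |y| over a grid, cf. (6.4)."""
    vals = (grid - x) ** 2 / (2 * lam) + np.abs(grid)
    return grid[np.argmin(vals)]

grid = np.linspace(-5.0, 5.0, 100001)
pairs = [(prox_abs(x, 0.5), prox_numeric(x, 0.5, grid))
         for x in (-2.0, -0.3, 0.0, 0.7, 3.0)]
```

Alternating x ← B_{τf}(F_{τg}(x)) for an objective g + f with g smooth is exactly the proximal-gradient (forward-backward) scheme built from these two steps.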
7 Interior Point Methods

Theorem 7.1 (Problem). Consider the problem

    inf_x ⟨c, x⟩ + δ_K(Ax − b)    (7.1)

with K a closed convex cone; thus the constraint reads Ax ⪰_K b.

The idea is to replace δ_K by a smooth approximation F (a barrier), with F(x) → ∞ as x approaches the boundary of K, and solve inf_x ⟨c, x⟩ + (1/t)F(Ax − b), equivalent to solving

    inf_x t⟨c, x⟩ + F(Ax − b).

Proposition 7.2. Let F be the canonical barrier for K. Then F is smooth on dom F = int K and strictly convex, with

    F(λx) = F(x) − Θ_F log λ,  λ > 0,    (7.2)

and

(i) −∇F(x) ∈ dom F,
(ii) ⟨∇F(x), x⟩ = −Θ_F,
(iii) ∇F(−∇F(x)) = −x,
(iv) ∇F(tx) = (1/t)∇F(x).

For the problem

    inf_x ⟨c, x⟩    (7.3)
such that Ax ⪰_K b, with K a closed, convex, self-dual cone, we have the primal problem

    inf_x ⟨c, x⟩ + δ_K(Ax − b),    (7.4)

whose dual is

    sup_y ⟨b, y⟩  s.t.  y ⪰_K 0, A^T y = c.    (7.6)

The corresponding barrier problems are

    inf_x t⟨c, x⟩ + F(Ax − b)    (7.7)

and

    sup_y t⟨b, y⟩ − F(y) − δ_{{A^T y = c}}(y).    (7.8)

Proposition 7.3. For t > 0, define the central path

    x(t) = argmin_x {t⟨c, x⟩ + F(Ax − b)},    (7.9)
    y(t) = argmin_y {−t⟨b, y⟩ + F(y) + δ_{{A^T y = c}}(y)},    (7.10)
    z(t) = (x(t), y(t)).    (7.11)

These paths exist, are unique, and

    (x, y) = (x(t), y(t))  ⟺  A^T y = c and ty + ∇F(Ax − b) = 0.    (7.12)

Proof. The first part follows by definition (strict convexity of F gives uniqueness). For (7.12), the primal optimality condition is 0 = tc + A^T ∇F(Ax − b). The dual optimality condition is 0 ∈ −tb + ∇F(y) + N_{{A^T y = c}}(y), which is

    A^T y = c,  0 ∈ −tb + ∇F(y) + range A.    (7.13)

Assume x satisfies the primal optimality condition, and set y = −(1/t)∇F(Ax − b). Then

    tc + A^T ∇F(Ax − b) = 0  ⟹  tc + A^T(−ty) = 0  ⟹  A^T y = c,    (7.14)-(7.16)

and ty + ∇F(Ax − b) = 0 holds by construction; the dual condition then follows from the barrier properties in Proposition 7.2. □
Proposition 7.4. If x, y are feasible, then

    φ(x) − ψ(y) = ⟨y, Ax − b⟩.    (7.17)

Moreover, for (x(t), y(t)) on the central path,

    φ(x(t)) − ψ(y(t)) = Θ_F / t.    (7.18)

Proof.

    φ(x) − ψ(y) = ⟨c, x⟩ − ⟨b, y⟩    (7.19)
                = ⟨A^T y, x⟩ − ⟨b, y⟩    (7.20)
                = ⟨Ax − b, y⟩.    (7.21)

For the second part, using ty(t) = −∇F(Ax(t) − b) and Proposition 7.2 (ii),

    φ(x(t)) − ψ(y(t)) = ⟨y(t), Ax(t) − b⟩    (7.22)
                      = −(1/t)⟨∇F(Ax(t) − b), Ax(t) − b⟩    (7.23)
                      = Θ_F / t.    (7.24)

□
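A one-dimensional sanity check of Propositions 7.3 and 7.4: take K = R_+, A = 1, b = 0, so the problem is inf{cx : x ≥ 0} with canonical barrier F(x) = −log x and Θ_F = 1. The central path solves tc − 1/x = 0, i.e. x(t) = 1/(tc), and the primal suboptimality c·x(t) − 0 equals Θ_F/t. All numerical values below are illustrative:

```python
import numpy as np

def central_path_point(c, t, grid):
    """Minimize t*c*x + F(x), F(x) = -log(x), over a positive grid."""
    vals = t * c * grid - np.log(grid)
    return grid[np.argmin(vals)]

c = 2.0
grid = np.linspace(1e-4, 5.0, 200001)
points = {t: central_path_point(c, t, grid) for t in (1.0, 10.0, 100.0)}
# closed form: d/dx (t*c*x - log x) = t*c - 1/x = 0  =>  x(t) = 1/(t*c)
```

As t grows, the central path point approaches the constrained optimum x = 0 with gap Θ_F/t = 1/t; interior point methods follow this path while increasing t.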
8 Support Vector Machines

8.1 Machine Learning

Given a sample set X = {x_1, ..., x_n}, x_i ∈ F ⊆ R^m, with F our feature space, and a set of classes C, we seek a map

    h_θ : F → C,  θ ∈ Θ,    (8.1)

with Θ our parameter space. Our task is to find θ such that h_θ is the best mapping from X into C.

(i) Unsupervised learning (only X is known, usually C is not known):
  - clustering,
  - outlier detection,
  - mapping to a lower-dimensional subspace.

(ii) Supervised learning (C known). We have training data T = {(x_1, y_1), ..., (x_n, y_n)}, with y_i ∈ C. We seek

    θ* = argmin_{θ∈Θ} f(θ, T) = Σ_{i=1}^n g(y_i, h_θ(x_i)) + R(θ).    (8.2)
8.2 Linear Classifiers

The idea is to consider

    h_θ : R^m → {−1, 1},  θ = (w, b),    (8.3)
    h_θ(x) = sign(⟨w, x⟩ + b).    (8.4)

We want to consider maximum-margin classifiers, satisfying

    max_{w,b, w≠0} min_i y_i (⟨w, x_i⟩ + b)/‖w‖,    (8.5)

which can be rewritten as

    max_{w,b} c    (8.6)

such that

    y_i (⟨w/‖w‖, x_i⟩ + b/‖w‖) ≥ c  ∀i.    (8.7)

Fixing the scaling c = 1/‖w‖, this becomes

    min_{w,b} (1/2)‖w‖²    (8.10)

such that

    1 ≤ y_i (⟨w, x_i⟩ + b)  ∀i.    (8.11)

Definition 8.1. In standard form,

    inf_{w,b} k(w, b) + h(M(w; b) − e),    (8.12)

where M is the matrix with rows y_i (x_i^T, 1), so that (M(w; b))_i = y_i(⟨w, x_i⟩ + b), and e = (1, ..., 1)^T.
The conjugates are

    k(w, b) = (1/2)‖w‖²,    (8.13)
    k*(u, c) = (1/2)‖u‖² + δ_{{0}}(c),    (8.14)
    h(z) = δ_{≥0}(z),  h*(v) = δ_{≤0}(v).    (8.15)

In saddle-point form,

    inf_{w,b} sup_z (1/2)‖w‖² − ⟨M(w; b) − e, z⟩ − δ_{≥0}(z).    (8.16)

Using the fact that if inf_x k(x) + h(Ax + b) is our primal, then the dual is sup_y ⟨b, y⟩ − k*(−A^T y) − h*(y), and substituting z ↦ −z, the dual problem is

    sup_z ⟨e, z⟩ − δ_{≥0}(z) − (1/2)‖Σ_{i=1}^n y_i x_i z_i‖² − δ_{{0}}(⟨y, z⟩),    (8.17)

and thus

    inf_z (1/2)‖Σ_i y_i x_i z_i‖² − ⟨e, z⟩    (8.18)

such that z ≥ 0, ⟨y, z⟩ = 0.

The optimality conditions give

    w = Σ_i z_i y_i x_i;    (8.19)

the training points x_i with z_i > 0 are the support vectors: only they enter the decision function.

8.3 Kernel Trick

The idea is to embed our features into a higher-dimensional space via a mapping function φ. Then our decision function takes the form

    h_θ(x) = sign(⟨φ(x), w⟩ + b).    (8.20)
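The hard-margin program above is usually handed to a QP solver. As a hedged sketch, here is the soft-margin variant (hinge loss plus (λ/2)‖w‖², a standard relaxation that is not derived in these notes) minimized by plain subgradient descent; data and hyperparameters are illustrative:

```python
import numpy as np

def svm_subgradient(X, y, lam=0.01, lr=0.1, epochs=2000):
    """Minimize lam/2*||w||^2 + (1/n)*sum_i max(0, 1 - y_i*(<w, x_i> + b))
    by subgradient descent on (w, b)."""
    n, m = X.shape
    w, b = np.zeros(m), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1.0             # points inside or violating the margin
        gw = lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        gb = -y[active].sum() / n
        w -= lr * gw
        b -= lr * gb
    return w, b

X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 2.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
w, b = svm_subgradient(X, y)
```

On separable data with small λ, sign(⟨w, x⟩ + b) recovers a separating hyperplane; the points that stay active near the solution play the role of the support vectors.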
9 Total Variation

9.1 Meyer's G-norm

Idea: a regulariser that favours textured regions.

    ‖u‖_G = inf{‖v‖_∞ | div v = u, v ∈ L^∞(R^d, R^d)}.    (9.1)

Discretized, we have

    ‖u‖_G = inf_v {δ_C(v) + δ_{{G^T v = u}}(v)},    (9.2)
    C = {v | ‖v(x)‖_2 ≤ 1 ∀x},    (9.3)

where G is the discrete gradient operator. A Lagrangian duality computation (dualizing the constraint G^T v = u with a multiplier w, and using δ_C*(Gw) = Σ_x ‖(Gw)(x)‖_2 = TV(w)) identifies ‖·‖_G as the support function of the TV unit ball:

    ‖u‖_G = sup{⟨w, u⟩ | TV(w) ≤ 1},    (9.8)

so the G-norm is dual to TV:

    ‖·‖_G = (δ_{B_TV})*,    (9.9)
where B_TV = {u | TV(u) ≤ 1}. Similarly, in the other direction,

    sup{⟨u, w⟩ | ‖w‖_G ≤ 1}    (9.10)
    = sup{⟨u, w⟩ | ∃v : w = G^T v, ‖v‖_∞ ≤ 1}    (9.11)
    = sup{⟨u, G^T v⟩ | ‖v‖_∞ ≤ 1}    (9.12)
    = TV(u).    (9.13)

Why is the G-norm good at separating noise? Consider

    argmin_u (1/2)‖u − g‖²  s.t. ‖u‖_G ≤ λ    (9.14)
    = Π_{λB_G}(g) = g − prox_{λTV}(g),    (9.15)

by Moreau's decomposition (since (δ_{λB_G})* = λTV): the projection onto the G-ball is exactly the residual that ROF denoising with parameter λ removes, i.e. the noise/texture component.

9.2 Non-local Regularization

In real-world images, large ‖∇u‖ does not always mean noise.

Definition 9.1. Let Ω = {1, ..., n}. Given u ∈ R^n, x, y ∈ Ω, and weights w : Ω² → R_{≥0}, define the non-local partial derivative

    ∂_y u(x) = (u(y) − u(x)) w(x, y)    (9.16)

and the non-local gradient

    ∇_w u(x) = (∂_y u(x))_{y∈Ω}.    (9.17)

A suitable divergence

    div_w v(x) = Σ_{y∈Ω} (v(x, y) − v(y, x)) w(x, y)

is adjoint to ∇_w with respect to the Euclidean inner products:

    ⟨div_w v, u⟩ = −⟨v, ∇_w u⟩.    (9.18)

Non-local regularizers are

    J(u) = Σ_{x∈Ω} g(∇_w u(x))    (9.19)
with, for example,

    TV^g_NL(u) = Σ_{x∈Ω} ‖∇_w u(x)‖_2,    (9.20)
    TV^d_NL(u) = Σ_{x∈Ω} Σ_{y∈Ω} |∂_y u(x)|.    (9.21)

This reduces to classical TV if the weights are chosen constant (w(x, y) = 1/h for neighbouring pixels).

How to choose w(x, y)?

(i) Large if the neighbourhoods of x and y are similar with respect to a distance metric.
(ii) Sparse, otherwise we have n² terms in the regulariser (n is the number of pixels).

A possible choice:

    d_u(x, y) = ∫ K_G(t)(u(y + t) − u(x + t))² dt,    (9.22)
    A(x) = argmin_{A ⊆ S(x), |A| = k} Σ_{y∈A} d_u(x, y),    (9.23)
    w(x, y) = 1 if y ∈ A(x) or x ∈ A(y), 0 otherwise,    (9.24)

with K_G a Gaussian kernel of variance σ_G² and S(x) a search window around x.
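The reduction to classical TV for constant weights can be checked directly in one dimension: with nearest-neighbour weights w(x, y) = 1 for |x − y| = 1 and 0 otherwise, the directional non-local TV (9.21) counts every adjacent pair twice, giving exactly 2·TV(u). The signal here is an illustrative choice:

```python
import numpy as np

def tv_nl_directional(u, w):
    """TV_d_NL(u) = sum_x sum_y |(u(y) - u(x)) * w(x, y)|, cf. (9.16) and (9.21)."""
    diffs = u[None, :] - u[:, None]     # diffs[x, y] = u(y) - u(x)
    return np.abs(diffs * w).sum()

def tv_classical(u):
    return np.abs(np.diff(u)).sum()

n = 50
u = np.cumsum(np.random.default_rng(1).standard_normal(n))
w = (np.abs(np.subtract.outer(np.arange(n), np.arange(n))) == 1).astype(float)
ratio = tv_nl_directional(u, w) / tv_classical(u)   # expect 2.0
```

With patch-based weights as in (9.22)-(9.24) the same function instead measures variation only between similar patches, which is what lets texture survive the regularisation.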
10 Relaxation

Example 10.1 (Segmentation). Find C ⊆ Ω that fits the given data and prior knowledge about typical shapes.

Theorem 10.2 (Chan-Vese). Given g : Ω → R, Ω ⊆ R^d, consider

    inf_{C⊆Ω, c_1,c_2∈R} f_CV(C, c_1, c_2),    (10.1)
    f_CV(C, c_1, c_2) = ∫_C (g − c_1)² dx + ∫_{Ω\C} (g − c_2)² dx + λ H^{d−1}(∂C).    (10.2)

Thus c_1 is fitted to the shape, c_2 to the outside of the shape, and C is the region of the shape.

Mumford-Shah model:

    inf_{K⊂Ω closed, u∈C¹(Ω\K)} f(K, u),    (10.3)
    f(K, u) = ∫_Ω (g − u)² dx + λ ∫_{Ω\K} ‖∇u‖²_2 dx + ν H^{d−1}(K).    (10.4)

Chan-Vese is a special case of this, obtained by forcing u = c_1 1_C + c_2 (1 − 1_C).
Rewriting with u = 1_C,

    inf_{C⊆Ω, c_1,c_2∈R} ∫_C (g − c_1)² dx + ∫_{Ω\C} (g − c_2)² dx + λ H^{d−1}(∂C)    (10.5)
    = inf_{u∈BV(Ω,{0,1}), c_1,c_2∈R} ∫_Ω u(g − c_1)² + (1 − u)(g − c_2)² dx + λ TV(u).    (10.6)

Fix c_1, c_2. Then we obtain the combinatorial problem

    inf_{u∈BV(Ω,{0,1})} ∫_Ω u s dx + λ TV(u),  s := (g − c_1)² − (g − c_2)².    (C)

Replacing {0, 1} by its convex hull [0, 1], we obtain

    inf_{u∈BV(Ω,[0,1])} ∫_Ω u s dx + λ TV(u).    (R)

This is a convex relaxation:

(i) replace the non-convex energy by a convex approximation,
(ii) i.e. replace the non-convex f by con f.

Can the minima of (R) have values ∉ {0, 1}? Yes: assume u_1, u_2 ∈ BV(Ω, {0, 1}) both minimize (R) (and (C)), with u_1 ≠ u_2. Then

    (1/2)u_1 + (1/2)u_2 ∉ BV(Ω, {0, 1})    (10.7)

is also a minimizer of (R).

Proposition 10.3. Let c_1, c_2 be fixed. If u* minimizes (R) and u* ∈ BV(Ω, {0, 1}), then u* minimizes (C), and (with u* = 1_C) C minimizes f_CV(·, c_1, c_2).

Proof. Let C ⊆ Ω. Then 1_C ∈ BV(Ω, [0, 1]), so

    f(u*) ≤ f(1_C)    (10.8)

for all C ⊆ Ω. □

Definition 10.4. Let L = BV(Ω, [0, 1]). For u ∈ L, α ∈ [0, 1], define the thresholded function

    u_α(x) = 1_{{u > α}}(x) = 1 if u(x) > α, 0 if u(x) ≤ α.    (10.9)
Then f : L → R̄ satisfies the generalized coarea condition (gcc) if and only if

    f(u) = ∫_0^1 f(u_α) dα.    (10.10)

Proposition 10.5. Let s ∈ L^∞(Ω), Ω bounded. Then f as in (R) satisfies the gcc.

Proof. If f_1, f_2 satisfy the gcc, then f_1 + f_2 satisfies the gcc. λ TV satisfies the gcc by the coarea formula for the total variation. For the linear part, using u(x) = ∫_0^1 1_{{u(x) > α}} dα,

    ∫_Ω u(x) s(x) dx = ∫_Ω (∫_0^1 1_{{u(x) > α}} s(x) dα) dx    (10.11)
                     = ∫_0^1 ∫_Ω u_α(x) s(x) dx dα.    (10.12)

□

Theorem 10.6. Assume f : BV(Ω, [0, 1]) → R̄ satisfies the gcc and

    u* ∈ argmin_{u∈BV(Ω,[0,1])} f(u).    (10.13)

Then for almost every α ∈ [0, 1], u*_α is a minimizer of f over BV(Ω, {0, 1}):

    u*_α ∈ argmin_{u∈BV(Ω,{0,1})} f(u).    (10.14)

Proof. Let

    S = {α | f(u*_α) ≠ f(u*)}.    (10.15)

If the Lebesgue measure L(S) = 0, then for a.e. α,

    f(u*_α) = f(u*) = inf_{u∈BV(Ω,[0,1])} f(u),    (10.16)

and we are done (note f(u*_α) ≥ f(u*) always, since u*_α ∈ BV(Ω, [0, 1])). If L(S) > 0, then there exists ε > 0 such that

    L(S_ε) > 0,  S_ε = {α | f(u*_α) ≥ f(u*) + ε}.    (10.17)
This implies

    f(u*) = ∫_0^1 f(u*_α) dα = ∫_{[0,1]\S_ε} f(u*_α) dα + ∫_{S_ε} f(u*_α) dα    (10.18)
          ≥ f(u*) + ε L(S_ε) > f(u*),    (10.19)-(10.20)

which contradicts the gcc. □

Remark 10.7 (Discretization). Consider

    min_{u∈R^n} f(u),  f(u) = ⟨s, u⟩ + λ Σ_i ‖(Gu)_i‖_2,    (10.21)

subject to 0 ≤ u ≤ 1, versus the combinatorial problem

    min_{u∈{0,1}^n} f(u).    (10.22)

If u* solves the relaxation, does u*_α solve the combinatorial problem? Only if the discretized energy f satisfies the gcc.
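The generalized coarea condition for a one-dimensional discretization of (10.21) (with the anisotropic TV, which satisfies the discrete coarea formula) can be verified numerically by integrating over thresholds α. The grid sizes and random data are illustrative choices:

```python
import numpy as np

def energy(u, s, lam):
    """Discrete 1-D energy f(u) = <s, u> + lam * sum_i |u_{i+1} - u_i|."""
    return s @ u + lam * np.abs(np.diff(u)).sum()

rng = np.random.default_rng(2)
n = 30
u = rng.random(n)             # u in [0, 1]^n
s = rng.standard_normal(n)
lam = 0.7

m = 20000
alphas = (np.arange(m) + 0.5) / m     # midpoint rule on (0, 1)
layer_avg = np.mean([energy((u > a).astype(float), s, lam) for a in alphas])
```

layer_avg approximates the layer integral of f over the thresholded functions u_α and matches f(u), so thresholding a minimizer of the relaxation at almost any α yields a combinatorial minimizer, as in the thresholding theorem above.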
Bibliography
More informationLECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE
LECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE CONVEX ANALYSIS AND DUALITY Basic concepts of convex analysis Basic concepts of convex optimization Geometric duality framework - MC/MC Constrained optimization
More informationIntegral Jensen inequality
Integral Jensen inequality Let us consider a convex set R d, and a convex function f : (, + ]. For any x,..., x n and λ,..., λ n with n λ i =, we have () f( n λ ix i ) n λ if(x i ). For a R d, let δ a
More informationShiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 4. Subgradient
Shiqian Ma, MAT-258A: Numerical Optimization 1 Chapter 4 Subgradient Shiqian Ma, MAT-258A: Numerical Optimization 2 4.1. Subgradients definition subgradient calculus duality and optimality conditions Shiqian
More informationBASICS OF CONVEX ANALYSIS
BASICS OF CONVEX ANALYSIS MARKUS GRASMAIR 1. Main Definitions We start with providing the central definitions of convex functions and convex sets. Definition 1. A function f : R n R + } is called convex,
More informationSubgradients. subgradients. strong and weak subgradient calculus. optimality conditions via subgradients. directional derivatives
Subgradients subgradients strong and weak subgradient calculus optimality conditions via subgradients directional derivatives Prof. S. Boyd, EE364b, Stanford University Basic inequality recall basic inequality
More informationOptimality Conditions for Constrained Optimization
72 CHAPTER 7 Optimality Conditions for Constrained Optimization 1. First Order Conditions In this section we consider first order optimality conditions for the constrained problem P : minimize f 0 (x)
More informationg 2 (x) (1/3)M 1 = (1/3)(2/3)M.
COMPACTNESS If C R n is closed and bounded, then by B-W it is sequentially compact: any sequence of points in C has a subsequence converging to a point in C Conversely, any sequentially compact C R n is
More informationOn duality theory of conic linear problems
On duality theory of conic linear problems Alexander Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 3332-25, USA e-mail: ashapiro@isye.gatech.edu
More informationEE 546, Univ of Washington, Spring Proximal mapping. introduction. review of conjugate functions. proximal mapping. Proximal mapping 6 1
EE 546, Univ of Washington, Spring 2012 6. Proximal mapping introduction review of conjugate functions proximal mapping Proximal mapping 6 1 Proximal mapping the proximal mapping (prox-operator) of a convex
More informationDual methods for the minimization of the total variation
1 / 30 Dual methods for the minimization of the total variation Rémy Abergel supervisor Lionel Moisan MAP5 - CNRS UMR 8145 Different Learning Seminar, LTCI Thursday 21st April 2016 2 / 30 Plan 1 Introduction
More informationTHE INVERSE FUNCTION THEOREM
THE INVERSE FUNCTION THEOREM W. PATRICK HOOPER The implicit function theorem is the following result: Theorem 1. Let f be a C 1 function from a neighborhood of a point a R n into R n. Suppose A = Df(a)
More informationConvex Analysis and Optimization Chapter 4 Solutions
Convex Analysis and Optimization Chapter 4 Solutions Dimitri P. Bertsekas with Angelia Nedić and Asuman E. Ozdaglar Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts http://www.athenasc.com
More informationContents: 1. Minimization. 2. The theorem of Lions-Stampacchia for variational inequalities. 3. Γ -Convergence. 4. Duality mapping.
Minimization Contents: 1. Minimization. 2. The theorem of Lions-Stampacchia for variational inequalities. 3. Γ -Convergence. 4. Duality mapping. 1 Minimization A Topological Result. Let S be a topological
More information3.1 Convexity notions for functions and basic properties
3.1 Convexity notions for functions and basic properties We start the chapter with the basic definition of a convex function. Definition 3.1.1 (Convex function) A function f : E R is said to be convex
More informationL p Spaces and Convexity
L p Spaces and Convexity These notes largely follow the treatments in Royden, Real Analysis, and Rudin, Real & Complex Analysis. 1. Convex functions Let I R be an interval. For I open, we say a function
More informationA SET OF LECTURE NOTES ON CONVEX OPTIMIZATION WITH SOME APPLICATIONS TO PROBABILITY THEORY INCOMPLETE DRAFT. MAY 06
A SET OF LECTURE NOTES ON CONVEX OPTIMIZATION WITH SOME APPLICATIONS TO PROBABILITY THEORY INCOMPLETE DRAFT. MAY 06 CHRISTIAN LÉONARD Contents Preliminaries 1 1. Convexity without topology 1 2. Convexity
More informationSemi-infinite programming, duality, discretization and optimality conditions
Semi-infinite programming, duality, discretization and optimality conditions Alexander Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205,
More informationUTILITY OPTIMIZATION IN A FINITE SCENARIO SETTING
UTILITY OPTIMIZATION IN A FINITE SCENARIO SETTING J. TEICHMANN Abstract. We introduce the main concepts of duality theory for utility optimization in a setting of finitely many economic scenarios. 1. Utility
More informationA Dual Condition for the Convex Subdifferential Sum Formula with Applications
Journal of Convex Analysis Volume 12 (2005), No. 2, 279 290 A Dual Condition for the Convex Subdifferential Sum Formula with Applications R. S. Burachik Engenharia de Sistemas e Computacao, COPPE-UFRJ
More informationSWFR ENG 4TE3 (6TE3) COMP SCI 4TE3 (6TE3) Continuous Optimization Algorithm. Convex Optimization. Computing and Software McMaster University
SWFR ENG 4TE3 (6TE3) COMP SCI 4TE3 (6TE3) Continuous Optimization Algorithm Convex Optimization Computing and Software McMaster University General NLO problem (NLO : Non Linear Optimization) (N LO) min
More informationSemicontinuous functions and convexity
Semicontinuous functions and convexity Jordan Bell jordan.bell@gmail.com Department of Mathematics, University of Toronto April 3, 2014 1 Lattices If (A, ) is a partially ordered set and S is a subset
More informationNonsymmetric potential-reduction methods for general cones
CORE DISCUSSION PAPER 2006/34 Nonsymmetric potential-reduction methods for general cones Yu. Nesterov March 28, 2006 Abstract In this paper we propose two new nonsymmetric primal-dual potential-reduction
More informationAppendix B Convex analysis
This version: 28/02/2014 Appendix B Convex analysis In this appendix we review a few basic notions of convexity and related notions that will be important for us at various times. B.1 The Hausdorff distance
More informationSome Properties of the Augmented Lagrangian in Cone Constrained Optimization
MATHEMATICS OF OPERATIONS RESEARCH Vol. 29, No. 3, August 2004, pp. 479 491 issn 0364-765X eissn 1526-5471 04 2903 0479 informs doi 10.1287/moor.1040.0103 2004 INFORMS Some Properties of the Augmented
More informationVictoria Martín-Márquez
A NEW APPROACH FOR THE CONVEX FEASIBILITY PROBLEM VIA MONOTROPIC PROGRAMMING Victoria Martín-Márquez Dep. of Mathematical Analysis University of Seville Spain XIII Encuentro Red de Análisis Funcional y
More informationDivision of the Humanities and Social Sciences. Supergradients. KC Border Fall 2001 v ::15.45
Division of the Humanities and Social Sciences Supergradients KC Border Fall 2001 1 The supergradient of a concave function There is a useful way to characterize the concavity of differentiable functions.
More informationPractice Exam 1: Continuous Optimisation
Practice Exam : Continuous Optimisation. Let f : R m R be a convex function and let A R m n, b R m be given. Show that the function g(x) := f(ax + b) is a convex function of x on R n. Suppose that f is
More informationLectures on Geometry
January 4, 2001 Lectures on Geometry Christer O. Kiselman Contents: 1. Introduction 2. Closure operators and Galois correspondences 3. Convex sets and functions 4. Infimal convolution 5. Convex duality:
More informationConstrained Optimization and Lagrangian Duality
CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may
More informationMath 273a: Optimization Subgradients of convex functions
Math 273a: Optimization Subgradients of convex functions Made by: Damek Davis Edited by Wotao Yin Department of Mathematics, UCLA Fall 2015 online discussions on piazza.com 1 / 42 Subgradients Assumptions
More informationCHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS. W. Erwin Diewert January 31, 2008.
1 ECONOMICS 594: LECTURE NOTES CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS W. Erwin Diewert January 31, 2008. 1. Introduction Many economic problems have the following structure: (i) a linear function
More informationSet, functions and Euclidean space. Seungjin Han
Set, functions and Euclidean space Seungjin Han September, 2018 1 Some Basics LOGIC A is necessary for B : If B holds, then A holds. B A A B is the contraposition of B A. A is sufficient for B: If A holds,
More informationOptimization for Machine Learning
Optimization for Machine Learning (Problems; Algorithms - A) SUVRIT SRA Massachusetts Institute of Technology PKU Summer School on Data Science (July 2017) Course materials http://suvrit.de/teaching.html
More informationOn the use of semi-closed sets and functions in convex analysis
Open Math. 2015; 13: 1 5 Open Mathematics Open Access Research Article Constantin Zălinescu* On the use of semi-closed sets and functions in convex analysis Abstract: The main aim of this short note is
More informationMath The Laplacian. 1 Green s Identities, Fundamental Solution
Math. 209 The Laplacian Green s Identities, Fundamental Solution Let be a bounded open set in R n, n 2, with smooth boundary. The fact that the boundary is smooth means that at each point x the external
More informationOn duality gap in linear conic problems
On duality gap in linear conic problems C. Zălinescu Abstract In their paper Duality of linear conic problems A. Shapiro and A. Nemirovski considered two possible properties (A) and (B) for dual linear
More informationSubgradient. Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes. definition. subgradient calculus
1/41 Subgradient Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes definition subgradient calculus duality and optimality conditions directional derivative Basic inequality
More informationCONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS
CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS A Dissertation Submitted For The Award of the Degree of Master of Philosophy in Mathematics Neelam Patel School of Mathematics
More informationMAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9
MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended
More informationLecture: Duality of LP, SOCP and SDP
1/33 Lecture: Duality of LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2017.html wenzw@pku.edu.cn Acknowledgement:
More informationUSING FUNCTIONAL ANALYSIS AND SOBOLEV SPACES TO SOLVE POISSON S EQUATION
USING FUNCTIONAL ANALYSIS AND SOBOLEV SPACES TO SOLVE POISSON S EQUATION YI WANG Abstract. We study Banach and Hilbert spaces with an eye towards defining weak solutions to elliptic PDE. Using Lax-Milgram
More informationDUALITY, OPTIMALITY CONDITIONS AND PERTURBATION ANALYSIS
1 DUALITY, OPTIMALITY CONDITIONS AND PERTURBATION ANALYSIS Alexander Shapiro 1 School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA, E-mail: ashapiro@isye.gatech.edu
More informationLecture 8. Strong Duality Results. September 22, 2008
Strong Duality Results September 22, 2008 Outline Lecture 8 Slater Condition and its Variations Convex Objective with Linear Inequality Constraints Quadratic Objective over Quadratic Constraints Representation
More informationZERO DUALITY GAP FOR CONVEX PROGRAMS: A GENERAL RESULT
ZERO DUALITY GAP FOR CONVEX PROGRAMS: A GENERAL RESULT EMIL ERNST AND MICHEL VOLLE Abstract. This article addresses a general criterion providing a zero duality gap for convex programs in the setting of
More informationConvex Optimization and Modeling
Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.-Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual
More informationLECTURE 12 LECTURE OUTLINE. Subgradients Fenchel inequality Sensitivity in constrained optimization Subdifferential calculus Optimality conditions
LECTURE 12 LECTURE OUTLINE Subgradients Fenchel inequality Sensitivity in constrained optimization Subdifferential calculus Optimality conditions Reading: Section 5.4 All figures are courtesy of Athena
More informationMaster 2 MathBigData. 3 novembre CMAP - Ecole Polytechnique
Master 2 MathBigData S. Gaïffas 1 3 novembre 2014 1 CMAP - Ecole Polytechnique 1 Supervised learning recap Introduction Loss functions, linearity 2 Penalization Introduction Ridge Sparsity Lasso 3 Some
More informationConvex Stochastic optimization
Convex Stochastic optimization Ari-Pekka Perkkiö January 30, 2019 Contents 1 Practicalities and background 3 2 Introduction 3 3 Convex sets 3 3.1 Separation theorems......................... 5 4 Convex
More informationFunctional Analysis I
Functional Analysis I Course Notes by Stefan Richter Transcribed and Annotated by Gregory Zitelli Polar Decomposition Definition. An operator W B(H) is called a partial isometry if W x = X for all x (ker
More informationReview of Multi-Calculus (Study Guide for Spivak s CHAPTER ONE TO THREE)
Review of Multi-Calculus (Study Guide for Spivak s CHPTER ONE TO THREE) This material is for June 9 to 16 (Monday to Monday) Chapter I: Functions on R n Dot product and norm for vectors in R n : Let X
More informationMAT 578 FUNCTIONAL ANALYSIS EXERCISES
MAT 578 FUNCTIONAL ANALYSIS EXERCISES JOHN QUIGG Exercise 1. Prove that if A is bounded in a topological vector space, then for every neighborhood V of 0 there exists c > 0 such that tv A for all t > c.
More informationChapter 1. Optimality Conditions: Unconstrained Optimization. 1.1 Differentiable Problems
Chapter 1 Optimality Conditions: Unconstrained Optimization 1.1 Differentiable Problems Consider the problem of minimizing the function f : R n R where f is twice continuously differentiable on R n : P
More informationSubdifferential representation of convex functions: refinements and applications
Subdifferential representation of convex functions: refinements and applications Joël Benoist & Aris Daniilidis Abstract Every lower semicontinuous convex function can be represented through its subdifferential
More informationConvex Analysis Notes. Lecturer: Adrian Lewis, Cornell ORIE Scribe: Kevin Kircher, Cornell MAE
Convex Analysis Notes Lecturer: Adrian Lewis, Cornell ORIE Scribe: Kevin Kircher, Cornell MAE These are notes from ORIE 6328, Convex Analysis, as taught by Prof. Adrian Lewis at Cornell University in the
More informationMath 273a: Optimization Convex Conjugacy
Math 273a: Optimization Convex Conjugacy Instructor: Wotao Yin Department of Mathematics, UCLA Fall 2015 online discussions on piazza.com Convex conjugate (the Legendre transform) Let f be a closed proper
More informationConvexity in R n. The following lemma will be needed in a while. Lemma 1 Let x E, u R n. If τ I(x, u), τ 0, define. f(x + τu) f(x). τ.
Convexity in R n Let E be a convex subset of R n. A function f : E (, ] is convex iff f(tx + (1 t)y) (1 t)f(x) + tf(y) x, y E, t [0, 1]. A similar definition holds in any vector space. A topology is needed
More informationSecond Order Elliptic PDE
Second Order Elliptic PDE T. Muthukumar tmk@iitk.ac.in December 16, 2014 Contents 1 A Quick Introduction to PDE 1 2 Classification of Second Order PDE 3 3 Linear Second Order Elliptic Operators 4 4 Periodic
More informationLecture 2: Convex Sets and Functions
Lecture 2: Convex Sets and Functions Hyang-Won Lee Dept. of Internet & Multimedia Eng. Konkuk University Lecture 2 Network Optimization, Fall 2015 1 / 22 Optimization Problems Optimization problems are
More informationExtreme Abridgment of Boyd and Vandenberghe s Convex Optimization
Extreme Abridgment of Boyd and Vandenberghe s Convex Optimization Compiled by David Rosenberg Abstract Boyd and Vandenberghe s Convex Optimization book is very well-written and a pleasure to read. The
More informationLinear Programming. Larry Blume Cornell University, IHS Vienna and SFI. Summer 2016
Linear Programming Larry Blume Cornell University, IHS Vienna and SFI Summer 2016 These notes derive basic results in finite-dimensional linear programming using tools of convex analysis. Most sources
More information1 Inner Product Space
Ch - Hilbert Space 1 4 Hilbert Space 1 Inner Product Space Let E be a complex vector space, a mapping (, ) : E E C is called an inner product on E if i) (x, x) 0 x E and (x, x) = 0 if and only if x = 0;
More informationMath 341: Convex Geometry. Xi Chen
Math 341: Convex Geometry Xi Chen 479 Central Academic Building, University of Alberta, Edmonton, Alberta T6G 2G1, CANADA E-mail address: xichen@math.ualberta.ca CHAPTER 1 Basics 1. Euclidean Geometry
More information(c) For each α R \ {0}, the mapping x αx is a homeomorphism of X.
A short account of topological vector spaces Normed spaces, and especially Banach spaces, are basic ambient spaces in Infinite- Dimensional Analysis. However, there are situations in which it is necessary
More informationRepresentation of the polar cone of convex functions and applications
Representation of the polar cone of convex functions and applications G. Carlier, T. Lachand-Robert October 23, 2006 version 2.1 Abstract Using a result of Y. Brenier [1], we give a representation of the
More informationLecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016
Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 1 Entropy Since this course is about entropy maximization,
More informationLecture 7 Monotonicity. September 21, 2008
Lecture 7 Monotonicity September 21, 2008 Outline Introduce several monotonicity properties of vector functions Are satisfied immediately by gradient maps of convex functions In a sense, role of monotonicity
More informationIn English, this means that if we travel on a straight line between any two points in C, then we never leave C.
Convex sets In this section, we will be introduced to some of the mathematical fundamentals of convex sets. In order to motivate some of the definitions, we will look at the closest point problem from
More informationIntroduction to Convex and Quasiconvex Analysis
Introduction to Convex and Quasiconvex Analysis J.B.G.Frenk Econometric Institute, Erasmus University, Rotterdam G.Kassay Faculty of Mathematics, Babes Bolyai University, Cluj August 27, 2001 Abstract
More informationLECTURE SLIDES ON BASED ON CLASS LECTURES AT THE CAMBRIDGE, MASS FALL 2007 BY DIMITRI P. BERTSEKAS.
LECTURE SLIDES ON CONVEX ANALYSIS AND OPTIMIZATION BASED ON 6.253 CLASS LECTURES AT THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY CAMBRIDGE, MASS FALL 2007 BY DIMITRI P. BERTSEKAS http://web.mit.edu/dimitrib/www/home.html
More informationConvex Optimization M2
Convex Optimization M2 Lecture 3 A. d Aspremont. Convex Optimization M2. 1/49 Duality A. d Aspremont. Convex Optimization M2. 2/49 DMs DM par email: dm.daspremont@gmail.com A. d Aspremont. Convex Optimization
More informationON GAP FUNCTIONS OF VARIATIONAL INEQUALITY IN A BANACH SPACE. Sangho Kum and Gue Myung Lee. 1. Introduction
J. Korean Math. Soc. 38 (2001), No. 3, pp. 683 695 ON GAP FUNCTIONS OF VARIATIONAL INEQUALITY IN A BANACH SPACE Sangho Kum and Gue Myung Lee Abstract. In this paper we are concerned with theoretical properties
More informationMaths 212: Homework Solutions
Maths 212: Homework Solutions 1. The definition of A ensures that x π for all x A, so π is an upper bound of A. To show it is the least upper bound, suppose x < π and consider two cases. If x < 1, then
More informationEE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 17
EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 17 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory May 29, 2012 Andre Tkacenko
More informationConvex Optimization Boyd & Vandenberghe. 5. Duality
5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized
More information2. Dual space is essential for the concept of gradient which, in turn, leads to the variational analysis of Lagrange multipliers.
Chapter 3 Duality in Banach Space Modern optimization theory largely centers around the interplay of a normed vector space and its corresponding dual. The notion of duality is important for the following
More informationA SHORT INTRODUCTION TO BANACH LATTICES AND
CHAPTER A SHORT INTRODUCTION TO BANACH LATTICES AND POSITIVE OPERATORS In tis capter we give a brief introduction to Banac lattices and positive operators. Most results of tis capter can be found, e.g.,
More informationPROBLEMS. (b) (Polarization Identity) Show that in any inner product space
1 Professor Carl Cowen Math 54600 Fall 09 PROBLEMS 1. (Geometry in Inner Product Spaces) (a) (Parallelogram Law) Show that in any inner product space x + y 2 + x y 2 = 2( x 2 + y 2 ). (b) (Polarization
More informationContinuous Sets and Non-Attaining Functionals in Reflexive Banach Spaces
Laboratoire d Arithmétique, Calcul formel et d Optimisation UMR CNRS 6090 Continuous Sets and Non-Attaining Functionals in Reflexive Banach Spaces Emil Ernst Michel Théra Rapport de recherche n 2004-04
More informationInequality Constraints
Chapter 2 Inequality Constraints 2.1 Optimality Conditions Early in multivariate calculus we learn the significance of differentiability in finding minimizers. In this section we begin our study of the
More informationEconomics 204 Fall 2011 Problem Set 2 Suggested Solutions
Economics 24 Fall 211 Problem Set 2 Suggested Solutions 1. Determine whether the following sets are open, closed, both or neither under the topology induced by the usual metric. (Hint: think about limit
More informationEE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015
EE613 Machine Learning for Engineers Kernel methods Support Vector Machines jean-marc odobez 2015 overview Kernel methods introductions and main elements defining kernels Kernelization of k-nn, K-Means,
More informationCONVEX OPTIMIZATION VIA LINEARIZATION. Miguel A. Goberna. Universidad de Alicante. Iberian Conference on Optimization Coimbra, November, 2006
CONVEX OPTIMIZATION VIA LINEARIZATION Miguel A. Goberna Universidad de Alicante Iberian Conference on Optimization Coimbra, 16-18 November, 2006 Notation X denotes a l.c. Hausdorff t.v.s and X its topological
More informationIterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem
Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem Charles Byrne (Charles Byrne@uml.edu) http://faculty.uml.edu/cbyrne/cbyrne.html Department of Mathematical Sciences
More information