
Chapter 11
Surrogate Risk Consistency: the Classification Case

I. The setting: supervised prediction problem

(a) We have data coming in pairs $(X, Y)$ and a loss $L : \mathbb{R} \times \mathcal{Y} \to \mathbb{R}$ (more general losses are possible).

(b) Often it is hard to minimize $L$ directly (for example, if $L$ is non-convex), so we use a surrogate $\varphi$.

(c) We would like to compare the risks of functions $f : \mathcal{X} \to \mathbb{R}$,
$$R_\varphi(f) := \mathbb{E}[\varphi(f(X), Y)] \quad \text{and} \quad R(f) := \mathbb{E}[L(f(X), Y)].$$
In particular, when does minimizing the surrogate give minimization of the true risk?

(d) Our goal: with the Bayes risks $R_\varphi^* := \inf_f R_\varphi(f)$ and $R^* := \inf_f R(f)$, determine when the following property holds.

Definition 11.1 (Fisher consistency). We say the loss $\varphi$ is Fisher consistent if for any sequence of functions $f_n$,
$$R_\varphi(f_n) \to R_\varphi^* \quad \text{implies} \quad R(f_n) \to R^*.$$

II. Classification case

(a) We focus on the binary classification case, so that $Y \in \{-1, 1\}$.

1. Margin-based losses: we want to predict the sign correctly, so for $\alpha \in \mathbb{R}$ we take
$$L(\alpha, y) = 1\{\alpha y \le 0\} \quad \text{and} \quad \varphi(\alpha, y) = \varphi(y\alpha).$$

2. Consider the conditional version of the risks. Let $\eta(x) = P(Y = 1 \mid X = x)$ be the conditional probability of the label $1$; then
$$R(f) = \mathbb{E}[1\{f(X)Y \le 0\}] = P(\mathrm{sign}(f(X)) \ne Y) = \mathbb{E}\big[\eta(X) 1\{f(X) \le 0\} + (1 - \eta(X)) 1\{f(X) \ge 0\}\big] = \mathbb{E}[\ell(f(X), \eta(X))]$$
and
$$R_\varphi(f) = \mathbb{E}[\varphi(Y f(X))] = \mathbb{E}\big[\eta(X) \varphi(f(X)) + (1 - \eta(X)) \varphi(-f(X))\big] = \mathbb{E}[\ell_\varphi(f(X), \eta(X))],$$
where we have defined the conditional risks
$$\ell(\alpha, \eta) = \eta 1\{\alpha \le 0\} + (1 - \eta) 1\{\alpha \ge 0\} \quad \text{and} \quad \ell_\varphi(\alpha, \eta) = \eta \varphi(\alpha) + (1 - \eta) \varphi(-\alpha).$$

3. Note the minimizer of $\ell$: we have $\alpha^*(\eta) = \mathrm{sign}(\eta - 1/2)$, and $f^*(x) = \mathrm{sign}(\eta(x) - 1/2)$ minimizes the risk $R(f)$ over all $f$.

4. Minimization can be performed pointwise, and we have
$$R^* = \mathbb{E}\Big[\inf_\alpha \ell(\alpha, \eta(X))\Big] \quad \text{and} \quad R_\varphi^* = \mathbb{E}\Big[\inf_\alpha \ell_\varphi(\alpha, \eta(X))\Big].$$
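Since everything reduces to the conditional risks, they are easy to explore numerically. Below is a minimal sketch in Python of $\ell$ and $\ell_\varphi$ for a generic margin loss; the helper names are hypothetical (not from the notes).

```python
import numpy as np

def cond_risk_01(alpha, eta):
    # l(alpha, eta) = eta * 1{alpha <= 0} + (1 - eta) * 1{alpha >= 0}
    return eta * (alpha <= 0) + (1 - eta) * (alpha >= 0)

def cond_risk_phi(alpha, eta, phi):
    # l_phi(alpha, eta) = eta * phi(alpha) + (1 - eta) * phi(-alpha)
    return eta * phi(alpha) + (1 - eta) * phi(-alpha)

# Exponential loss at eta = 0.7: the surrogate conditional risk is smooth
# in alpha, while the 0-1 conditional risk is a step function.
phi_exp = lambda a: np.exp(-a)
for alpha in [-1.0, 0.0, 0.5, 2.0]:
    print(alpha, cond_risk_01(alpha, 0.7), cond_risk_phi(alpha, 0.7, phi_exp))
```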

(b) Example 11.1 (Exponential loss): Consider the exponential loss, used in AdaBoost (among other settings), which sets $\varphi(\alpha) = e^{-\alpha}$. In this case, we have
$$\mathop{\mathrm{argmin}}_\alpha \ell_\varphi(\alpha, \eta) = \frac{1}{2} \log \frac{\eta}{1 - \eta}$$
because $\frac{\partial}{\partial \alpha} \ell_\varphi(\alpha, \eta) = -\eta e^{-\alpha} + (1 - \eta) e^{\alpha}$. Thus $f^*(x) = \frac{1}{2} \log \frac{\eta(x)}{1 - \eta(x)}$, and this loss is Fisher consistent.

(c) Classification calibration

1. Consider pointwise versions of the risk (this is all that is necessary, it turns out).

2. Define the infimal conditional $\varphi$-risks as
$$\ell_\varphi^*(\eta) := \inf_\alpha \ell_\varphi(\alpha, \eta) \quad \text{and} \quad \ell_\varphi^{\mathrm{wrong}}(\eta) := \inf_{\alpha(\eta - 1/2) \le 0} \ell_\varphi(\alpha, \eta).$$

3. Intuition: if we always have $\ell_\varphi^*(\eta) < \ell_\varphi^{\mathrm{wrong}}(\eta)$ for all $\eta \ne 1/2$, we should do fine.

4. Define the sub-optimality function $H : [0, 1] \to \mathbb{R}$ by
$$H(\delta) := \ell_\varphi^{\mathrm{wrong}}\Big(\frac{1 + \delta}{2}\Big) - \ell_\varphi^*\Big(\frac{1 + \delta}{2}\Big).$$

Definition 11.2. The margin-based loss $\varphi$ is classification calibrated if $H(\delta) > 0$ for all $\delta > 0$. Equivalently, for any $\eta \ne \frac{1}{2}$, we have $\ell_\varphi^*(\eta) < \ell_\varphi^{\mathrm{wrong}}(\eta)$.

5. Example (Example 11.1 continued): For the exponential loss, we have
$$\ell_\varphi^{\mathrm{wrong}}(\eta) = \inf_{\alpha(2\eta - 1) \le 0} \big\{ \eta e^{-\alpha} + (1 - \eta) e^{\alpha} \big\} = e^0 = 1,$$
while the unconstrained minimal conditional risk is
$$\ell_\varphi^*(\eta) = \eta \sqrt{\frac{1 - \eta}{\eta}} + (1 - \eta) \sqrt{\frac{\eta}{1 - \eta}} = 2\sqrt{\eta(1 - \eta)},$$
so that $H(\delta) = 1 - \sqrt{1 - \delta^2}$.

Example 11.2 (Hinge loss): We can also consider the hinge loss, which is defined as $\varphi(\alpha) = [1 - \alpha]_+$. We first compute the minimizers of the conditional risk; we have $\ell_\varphi(\alpha, \eta) = \eta [1 - \alpha]_+ + (1 - \eta)[1 + \alpha]_+$, whose unique minimizer (for $\eta \notin \{0, \frac{1}{2}, 1\}$) is $\alpha(\eta) = \mathrm{sign}(2\eta - 1)$. We thus have $\ell_\varphi^*(\eta) = 2\min\{\eta, 1 - \eta\}$ and $\ell_\varphi^{\mathrm{wrong}}(\eta) = \eta + (1 - \eta) = 1$. We obtain
$$H(\delta) = 1 - \min\{1 + \delta, 1 - \delta\} = \delta.$$
Comparing to the sub-optimality function for the exponential loss, the hinge loss's is tighter, since $\delta \ge 1 - \sqrt{1 - \delta^2}$.

6. Pictures: use the exponential loss, with $\eta$ and without.
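These closed forms are easy to sanity-check numerically. The sketch below approximates $\ell_\varphi^*$ and $\ell_\varphi^{\mathrm{wrong}}$ by grid search (assuming the grid is wide enough to contain the minimizers) and compares the resulting $H(\delta)$ against the formulas above.

```python
import numpy as np

def H(delta, phi, grid=np.linspace(-5.0, 5.0, 100001)):
    # H(delta) = l_wrong((1+delta)/2) - l_star((1+delta)/2), via grid search over alpha.
    eta = (1.0 + delta) / 2.0
    risks = eta * phi(grid) + (1.0 - eta) * phi(-grid)
    l_star = risks.min()                                # unconstrained infimum
    l_wrong = risks[grid * (2 * eta - 1) <= 0].min()    # infimum over wrong-sign alpha
    return l_wrong - l_star

phi_exp = lambda a: np.exp(-a)
phi_hinge = lambda a: np.maximum(1.0 - a, 0.0)
for d in [0.1, 0.5, 0.9]:
    print(H(d, phi_exp), 1 - np.sqrt(1 - d**2))   # exponential: H(delta) = 1 - sqrt(1 - delta^2)
    print(H(d, phi_hinge), d)                     # hinge: H(delta) = delta
```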

(d) Our goal: using classification calibration, find some function $\psi$ such that
$$\psi(R(f) - R^*) \le R_\varphi(f) - R_\varphi^*,$$
where $\psi(\delta) > 0$ for all $\delta > 0$. Can we get a convex version of $H$, then maybe use Jensen's inequality to get the results? It turns out we will be able to do this.

III. Some necessary asides on convex analysis

(a) Epigraphs and closures

1. For a function $f$, the epigraph $\mathrm{epi} f$ is the set of points $(x, t)$ such that $f(x) \le t$.

2. A function $f$ is said to be closed if its epigraph is closed, which for convex $f$ occurs if and only if $f$ is lower semicontinuous (meaning $\liminf_{x \to x_0} f(x) \ge f(x_0)$).

3. Note: a one-dimensional closed convex function is continuous.

Lemma 11.3. Let $f : \mathbb{R} \to \bar{\mathbb{R}}$ be convex. Then $f$ is continuous on the interior of its domain. (Proof in notes; just give a picture.)

Lemma 11.4. Let $f : \mathbb{R} \to \bar{\mathbb{R}}$ be closed convex. Then $f$ is continuous on its domain.

4. The closure of a function $f$ is the function $\mathrm{cl} f$ whose epigraph is the closed convex hull of $\mathrm{epi} f$. (Picture.)

(b) Conjugate functions (Fenchel-Legendre transform)

1. Let $f : \mathbb{R}^d \to \mathbb{R}$ be an (arbitrary) function. Its conjugate (or Fenchel-Legendre conjugate) is defined to be
$$f^*(s) := \sup_t \{\langle t, s \rangle - f(t)\}.$$
(Picture here.) Note that we always have $f^*(s) + f(t) \ge \langle s, t \rangle$, or $f(t) \ge \langle s, t \rangle - f^*(s)$.

2. The Fenchel biconjugate is defined to be
$$f^{**}(t) = \sup_s \{\langle t, s \rangle - f^*(s)\}.$$
(Picture here, noting that $f'(t) = s$ implies $f^*(s) = \langle t, s \rangle - f(t)$.)

3. In fact, the biconjugate is the largest closed convex function smaller than $f$:

Lemma 11.5. We have
$$f^{**}(x) = \sup_{a \in \mathbb{R}^d, \, b \in \mathbb{R}} \{\langle a, x \rangle - b : \langle a, t \rangle - b \le f(t) \text{ for all } t\}.$$

Proof. Let $A \subset \mathbb{R}^d \times \mathbb{R}$ denote all the pairs $(a, b)$ minorizing $f$, that is, those pairs such that $f(t) \ge \langle a, t \rangle - b$ for all $t$. Then we have
$$(a, b) \in A \iff f(t) \ge \langle a, t \rangle - b \text{ for all } t \iff b \ge \langle a, t \rangle - f(t) \text{ for all } t \iff b \ge f^*(a) \text{ and } a \in \mathrm{dom} f^*.$$
Thus we obtain the following sequence of equalities:
$$\sup_{(a, b) \in A} \{\langle a, t \rangle - b\} = \sup\{\langle a, t \rangle - b : a \in \mathrm{dom} f^*, \, b \ge f^*(a)\} = \sup_a \{\langle a, t \rangle - f^*(a)\} = f^{**}(t).$$
So the supremum over all the hyperplanes supporting the graph of $f$ from below is $f^{**}$, as desired.

4. Another useful lemma:

Lemma 11.6. Let $h$ be either (i) continuous on $[0, 1]$ or (ii) non-decreasing on $[0, 1]$. (And set $h(1 + \delta) = +\infty$ for $\delta > 0$.) If $h$ satisfies $h(t) > 0$ for $t > 0$ and $h(0) = 0$, then $f = h^{**}$ satisfies $f(t) > 0$ for any $t > 0$. (Proof by picture.)
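Lemma 11.5 also suggests a direct way to compute biconjugates numerically (which is exactly how one can approximate the $\psi$-transform of the next section): conjugate on a grid, then conjugate again. A minimal sketch follows; the nonconvex test function is an arbitrary hypothetical choice.

```python
import numpy as np

def conjugate(ts, f_vals, ss):
    # Discrete Fenchel conjugate: f*(s) = sup_t { s*t - f(t) } over the grid ts.
    return np.array([np.max(s * ts - f_vals) for s in ss])

ts = np.linspace(-2.0, 2.0, 2001)
f_vals = np.abs(ts) + 0.5 * np.sin(5 * ts) ** 2   # a nonconvex test function
ss = np.linspace(-20.0, 20.0, 4001)               # slope grid for the conjugate

f_star = conjugate(ts, f_vals, ss)
f_biconj = conjugate(ss, f_star, ts)              # f**: largest closed convex minorant

print(np.all(f_biconj <= f_vals + 1e-9))          # True: f** <= f pointwise
```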

IV. Classification calibration results:

(a) Getting quantitative bounds on risk: define the $\psi$-transform via
$$\psi(\delta) := H^{**}(\delta). \qquad (11.0.1)$$

(b) Main theorem for today:

Theorem 11.7. Let $\varphi$ be a margin-based loss function and $\psi$ the associated $\psi$-transform. Then for any $f : \mathcal{X} \to \mathbb{R}$,
$$\psi(R(f) - R^*) \le R_\varphi(f) - R_\varphi^*. \qquad (11.0.2)$$
Moreover, the following three statements are equivalent:

1. The loss $\varphi$ is classification-calibrated.
2. For any sequence $\delta_n \in [0, 1]$, $\psi(\delta_n) \to 0$ if and only if $\delta_n \to 0$.
3. For any sequence of measurable functions $f_n : \mathcal{X} \to \mathbb{R}$, $R_\varphi(f_n) \to R_\varphi^*$ implies $R(f_n) \to R^*$.

1. Some insights from the theorem. Recall Examples 11.1 and 11.2. For both of these, we have that $\psi(\delta) = H(\delta)$, as $H$ is convex. For the hinge loss, $\varphi(\alpha) = [1 - \alpha]_+$, we obtain for any $f$ that
$$P(Yf(X) \le 0) - \inf_f P(Yf(X) \le 0) \le \mathbb{E}\big[[1 - Yf(X)]_+\big] - \inf_f \mathbb{E}\big[[1 - Yf(X)]_+\big].$$
On the other hand, for the exponential loss, using $\psi(\delta) = 1 - \sqrt{1 - \delta^2} \ge \delta^2/2$, we have
$$\frac{1}{2}\Big(P(Yf(X) \le 0) - \inf_f P(Yf(X) \le 0)\Big)^2 \le \mathbb{E}[\exp(-Yf(X))] - \inf_f \mathbb{E}[\exp(-Yf(X))].$$
The hinge loss bound is sharper. (A numerical check of the bound (11.0.2) appears below, after Example 11.8.)

2. Example 11.8 (Regression for classification): What about the surrogate loss $\frac{1}{2}(f(x) - y)^2$? In the homework, we show which margin-based loss this corresponds to, and moreover that $H(\delta) = \frac{1}{2}\delta^2$. So regressing on the labels is consistent.
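The inequality (11.0.2) can be checked directly on a small discrete problem where all risks are computable in closed form. Below is a minimal sketch for the exponential loss on a two-point $\mathcal{X}$ with known $\eta(x)$; the specific numbers are arbitrary hypothetical choices.

```python
import numpy as np

# Two-point X, each point with probability 1/2, with known eta(x) = P(Y=1 | X=x).
etas = np.array([0.8, 0.3])
f_vals = np.array([-0.5, 1.0])     # an arbitrary measurable f

# 0-1 risk of sign(f) (convention: f <= 0 predicts -1), Bayes 0-1 risk,
# and the exponential-loss surrogate counterparts.
R = np.mean(np.where(f_vals > 0, 1 - etas, etas))
R_star = np.mean(np.minimum(etas, 1 - etas))
R_phi = np.mean(etas * np.exp(-f_vals) + (1 - etas) * np.exp(f_vals))
R_phi_star = np.mean(2 * np.sqrt(etas * (1 - etas)))   # l_phi^*(eta) = 2 sqrt(eta(1-eta))

psi = lambda d: 1 - np.sqrt(1 - d**2)                  # psi = H for the exponential loss
print(psi(R - R_star), "<=", R_phi - R_phi_star)       # inequality (11.0.2)
```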

(c) Proof of Theorem 11.7

The proof of the theorem proceeds in several parts.

1. We first state a lemma, which follows from the results on convex functions we have already proved. The lemma is useful for several different parts of our proof.

Lemma 11.9. We have the following.
a. The functions $H$ and $\psi$ are continuous.
b. We have $H \ge 0$ and $H(0) = 0$.
c. If $H(\delta) > 0$ for all $\delta > 0$, then $\psi(\delta) > 0$ for all $\delta > 0$.

To see why $H(0) = 0$: at $\eta = 1/2$ the sign constraint is vacuous, so
$$\ell_\varphi^{\mathrm{wrong}}(1/2) := \inf_{\alpha \cdot 0 \le 0} \ell_\varphi(\alpha, 1/2) = \inf_\alpha \ell_\varphi(\alpha, 1/2) = \ell_\varphi^*(1/2),$$
and thus $H(0) = \ell_\varphi^{\mathrm{wrong}}(1/2) - \ell_\varphi^*(1/2) = 0$. (It is clear that the sub-optimality gap $H \ge 0$ by construction.)

2. We begin with the first statement of the theorem, inequality (11.0.2). Consider first the gap (for a fixed margin $\alpha$) in conditional 0-1 risk,
$$\ell(\alpha, \eta) - \inf_\alpha \ell(\alpha, \eta) = \eta 1\{\alpha \le 0\} + (1 - \eta) 1\{\alpha \ge 0\} - \eta 1\{\eta \le 1/2\} - (1 - \eta) 1\{\eta \ge 1/2\} = \begin{cases} 0 & \text{if } \mathrm{sign}(\alpha) = \mathrm{sign}(\eta - \frac{1}{2}) \\ |\eta - (1 - \eta)| = |2\eta - 1| & \text{if } \mathrm{sign}(\alpha) \ne \mathrm{sign}(\eta - \frac{1}{2}). \end{cases}$$
In particular, we obtain that the gap in risks is
$$R(f) - R^* = \mathbb{E}\big[1\{\mathrm{sign}(f(X)) \ne \mathrm{sign}(2\eta(X) - 1)\} \, |2\eta(X) - 1|\big]. \qquad (11.0.3)$$
Now we use expression (11.0.3) to get an upper bound on $R(f) - R^*$ via the $\varphi$-risk. Indeed, consider the $\psi$-transform (11.0.1). By Jensen's inequality, we have
$$\psi(R(f) - R^*) \le \mathbb{E}\big[\psi\big(1\{\mathrm{sign}(f(X)) \ne \mathrm{sign}(2\eta(X) - 1)\} \, |2\eta(X) - 1|\big)\big].$$
Now we recall from Lemma 11.9 that $\psi(0) = 0$. Thus we have
$$\psi(R(f) - R^*) \le \mathbb{E}\big[\psi\big(1\{\mathrm{sign}(f(X)) \ne \mathrm{sign}(2\eta(X) - 1)\} \, |2\eta(X) - 1|\big)\big] = \mathbb{E}\big[1\{\mathrm{sign}(f(X)) \ne \mathrm{sign}(2\eta(X) - 1)\} \, \psi(|2\eta(X) - 1|)\big]. \qquad (11.0.4)$$
Now we use the special structure of the sub-optimality function we have constructed. Note that $\psi \le H$, and moreover, we have for any $\alpha \in \mathbb{R}$ that
$$1\{\mathrm{sign}(\alpha) \ne \mathrm{sign}(2\eta - 1)\} H(|2\eta - 1|) = 1\{\mathrm{sign}(\alpha) \ne \mathrm{sign}(2\eta - 1)\} \Big[\inf_{\alpha'(2\eta - 1) \le 0} \ell_\varphi(\alpha', \eta) - \ell_\varphi^*(\eta)\Big] \le \ell_\varphi(\alpha, \eta) - \ell_\varphi^*(\eta), \qquad (11.0.5)$$
because $(1 + |2\eta - 1|)/2 = \max\{\eta, 1 - \eta\}$, and a wrongly-signed $\alpha$ is feasible for the constrained infimum. Combining inequalities (11.0.4) and (11.0.5), we see that
$$\psi(R(f) - R^*) \le \mathbb{E}\big[1\{\mathrm{sign}(f(X)) \ne \mathrm{sign}(2\eta(X) - 1)\} H(|2\eta(X) - 1|)\big] \le \mathbb{E}\big[\ell_\varphi(f(X), \eta(X)) - \ell_\varphi^*(\eta(X))\big] = R_\varphi(f) - R_\varphi^*,$$
which is our desired result.

3. Having proved the quantitative bound (11.0.2), we now turn to proving the second part of Theorem 11.7. Using Lemma 11.9, we can prove the equivalence of all three items. We begin by showing that IV(b)1 implies IV(b)2. If $\varphi$ is classification calibrated, we have $H(\delta) > 0$ for all $\delta > 0$. Because $\psi$ is continuous and $\psi(0) = 0$, if $\delta_n \to 0$, then $\psi(\delta_n) \to 0$.

It remains to show that $\psi(\delta_n) \to 0$ implies $\delta_n \to 0$. But this is clear because we know that $\psi(0) = 0$ and $\psi(\delta) > 0$ whenever $\delta > 0$, and the convexity of $\psi$ implies that $\psi$ is increasing.

To obtain IV(b)3 from IV(b)2, note that by inequality (11.0.2), we have
$$\psi(R(f_n) - R^*) \le R_\varphi(f_n) - R_\varphi^* \to 0,$$
so we must have that $\delta_n = R(f_n) - R^* \to 0$.

Finally, we show that IV(b)1 follows from IV(b)3. Assume for the sake of contradiction that IV(b)3 holds but IV(b)1 fails, that is, $\varphi$ is not classification calibrated. Then there must exist $\eta < 1/2$ and a sequence $\alpha_n \ge 0$ (i.e., a sequence of predictions with the incorrect sign) satisfying $\ell_\varphi(\alpha_n, \eta) \to \ell_\varphi^*(\eta)$. Construct the classification problem with a singleton $\mathcal{X} = \{x\}$, and set $P(Y = 1) = \eta$. Then the sequence $f_n(x) = \alpha_n$ satisfies $R_\varphi(f_n) \to R_\varphi^*$, but the true 0-1 risk $R(f_n) \not\to R^*$.

V. Classification calibration in the convex case

a. Suppose that $\varphi$ is convex, as we often assume for computational reasons.

b. Theorem 11.10 (Bartlett, Jordan, McAuliffe [1]). If $\varphi$ is convex, then $\varphi$ is classification calibrated if and only if $\varphi'(0)$ exists and $\varphi'(0) < 0$.

Proof. First, suppose that $\varphi$ is differentiable at $0$ and $\varphi'(0) < 0$. Then $\ell_\varphi(\alpha, \eta) = \eta \varphi(\alpha) + (1 - \eta) \varphi(-\alpha)$ satisfies $\frac{\partial}{\partial \alpha} \ell_\varphi(0, \eta) = (2\eta - 1) \varphi'(0)$, and if $\varphi'(0) < 0$, this quantity is negative for $\eta > 1/2$. Thus the minimizing $\alpha(\eta) \in (0, \infty]$. (Proof by picture, but formalized in the full notes.)

For the other direction, assume that $\varphi$ is classification calibrated. Recall the definition of a subgradient $g_\alpha$ of the function $\varphi$ at $\alpha \in \mathbb{R}$: it is any $g_\alpha$ such that
$$\varphi(t) \ge \varphi(\alpha) + g_\alpha (t - \alpha) \quad \text{for all } t \in \mathbb{R}.$$
(Picture.) Let $g_1, g_2$ be subgradients of $\varphi$ at $0$, chosen so that $\varphi(\alpha) \ge \varphi(0) + g_1 \alpha$ and $\varphi(-\alpha) \ge \varphi(0) - g_2 \alpha$ for all $\alpha$, which exist by convexity. We show that both $g_1, g_2 < 0$ and $g_1 = g_2$. By convexity we have
$$\ell_\varphi(\alpha, \eta) \ge \eta (\varphi(0) + g_1 \alpha) + (1 - \eta)(\varphi(0) - g_2 \alpha) = [\eta g_1 - (1 - \eta) g_2] \alpha + \varphi(0). \qquad (11.0.6)$$
We first show that $g_1 = g_2$, meaning that $\varphi$ is differentiable at $0$. Without loss of generality, assume $g_1 > g_2$. Then for $\eta > 1/2$ sufficiently close to $1/2$, we would have $\eta g_1 - (1 - \eta) g_2 > 0$, which would imply that
$$\ell_\varphi(\alpha, \eta) \ge \varphi(0) \ge \inf_{\alpha' \le 0} \big\{ \eta \varphi(\alpha') + (1 - \eta) \varphi(-\alpha') \big\} = \ell_\varphi^{\mathrm{wrong}}(\eta) \quad \text{for all } \alpha \ge 0$$
by (11.0.6), by taking $\alpha' = 0$ in the second inequality. By our assumption of classification calibration, for $\eta > 1/2$ we know that
$$\inf_\alpha \ell_\varphi(\alpha, \eta) < \inf_{\alpha \le 0} \ell_\varphi(\alpha, \eta) = \ell_\varphi^{\mathrm{wrong}}(\eta), \quad \text{so} \quad \ell_\varphi^*(\eta) = \inf_{\alpha \ge 0} \ell_\varphi(\alpha, \eta),$$
and under the assumption that $g_1 > g_2$ we obtain $\ell_\varphi^*(\eta) = \inf_{\alpha \ge 0} \ell_\varphi(\alpha, \eta) \ge \ell_\varphi^{\mathrm{wrong}}(\eta)$, which is a contradiction to classification calibration. We thus obtain $g_1 = g_2$, so that the function $\varphi$ has a unique subgradient at $\alpha = 0$ and is thus differentiable there.

Now that we know $\varphi$ is differentiable at $0$, consider
$$\eta \varphi(\alpha) + (1 - \eta) \varphi(-\alpha) \ge (2\eta - 1) \varphi'(0) \alpha + \varphi(0).$$
If $\varphi'(0) \ge 0$, then for $\alpha \ge 0$ and $\eta > 1/2$ the right hand side is at least $\varphi(0)$, which contradicts classification calibration, because we know that $\ell_\varphi^*(\eta) < \ell_\varphi^{\mathrm{wrong}}(\eta)$, exactly as in the preceding argument.

Proofs of convex analytic results

Proof of Lemma 11.3. First, let $(a, b) \subset \mathrm{dom} f$ and fix $x_0 \in (a, b)$. Consider $x \to x_0$ with $x < x_0$, which is no loss of generality, and we may also assume $x \in (a, b)$. Then we have
$$x = \alpha a + (1 - \alpha) x_0 \quad \text{and} \quad x_0 = \beta b + (1 - \beta) x$$
for some $\alpha, \beta \in [0, 1]$, where $\alpha, \beta \to 0$ as $x \to x_0$. Rearranging by convexity,
$$f(x) \le \alpha f(a) + (1 - \alpha) f(x_0) = f(x_0) + \alpha (f(a) - f(x_0))$$
and
$$f(x_0) \le \beta f(b) + (1 - \beta) f(x), \quad \text{or} \quad \frac{1}{1 - \beta} f(x_0) \le f(x) + \frac{\beta}{1 - \beta} f(b).$$
Taking $\alpha, \beta \to 0$, we obtain
$$\liminf_{x \to x_0} f(x) \ge f(x_0) \quad \text{and} \quad \limsup_{x \to x_0} f(x) \le f(x_0),$$
as desired.

Proof of Lemma 11.4. We need only consider the endpoints of the domain by Lemma 11.3, and since closedness (lower semicontinuity) already gives $\liminf_{x \to x_0} f(x) \ge f(x_0)$, we only need to show that $\limsup_{x \to x_0} f(x) \le f(x_0)$. But this is obvious by convexity: let $x = t y + (1 - t) x_0$ for any $y \in \mathrm{dom} f$; taking $t \to 0$, we have
$$f(x) \le t f(y) + (1 - t) f(x_0) \to f(x_0).$$

Proof of Lemma 11.6. We begin with the case (i). Define the function $h_{\mathrm{low}}(t) := \inf_{s \ge t} h(s)$. Then because $h$ is continuous, we know that over any compact set it attains its infimum, and thus (by assumption on $h$) $h_{\mathrm{low}}(t) > 0$ for all $t > 0$. Moreover, $h_{\mathrm{low}}$ is non-decreasing. Now define $f_{\mathrm{low}}(t) = h_{\mathrm{low}}^{**}(t)$ to be the biconjugate of $h_{\mathrm{low}}$; it is clear that $f \ge f_{\mathrm{low}}$ as $h \ge h_{\mathrm{low}}$. Thus we see that case (ii) implies case (i), so we turn to the more general result, which shows that $f_{\mathrm{low}}(t) > 0$ for all $t > 0$.

For the result in case (ii), assume for the sake of contradiction that there is some $z \in (0, 1)$ satisfying $h^{**}(z) = 0$. It is clear that $h^{**}(0) = 0$ and $h^{**} \ge 0$, so by convexity we must have $h^{**}(z/2) = 0$.

Now, by assumption we have $h(z/2) = b > 0$, whence $h(t) \ge h(z/2) = b > 0$ for all $t \ge z/2$ since $h$ is non-decreasing (in particular, $h(1) \ge b > 0$). In particular, the piecewise linear function defined by
$$g(t) = \begin{cases} 0 & \text{if } t \le z/2 \\ \frac{b}{1 - z/2} (t - z/2) & \text{if } t > z/2 \end{cases}$$
is closed, convex, and satisfies $g \le h$. But $g(z) > 0 = h^{**}(z)$, a contradiction to the fact that $h^{**}$ is the largest (closed) convex function below $h$.
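Theorem 11.10's criterion is simple to apply in practice. As a quick illustration (a sketch, assuming a symmetric difference quotient adequately approximates $\varphi'(0)$ for these losses), the snippet below reports the derivative at zero for several standard convex margin losses; each value is negative, so each loss is classification calibrated.

```python
import numpy as np

def deriv_at_zero(phi, h=1e-6):
    # Symmetric difference quotient approximating phi'(0).
    return (phi(h) - phi(-h)) / (2 * h)

losses = {
    "exponential": lambda a: np.exp(-a),                # phi'(0) = -1
    "hinge":       lambda a: np.maximum(1.0 - a, 0.0),  # phi'(0) = -1
    "logistic":    lambda a: np.log(1.0 + np.exp(-a)),  # phi'(0) = -1/2
    "squared":     lambda a: 0.5 * (1.0 - a) ** 2,      # phi'(0) = -1 (Example 11.8)
}
for name, phi in losses.items():
    print(name, deriv_at_zero(phi))   # negative in each case: calibrated
```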

Bibliography

[1] P. L. Bartlett, M. I. Jordan, and J. McAuliffe. Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473):138-156, 2006.
