arxiv: v3 [math.oc] 8 May 2018


Generalized Self-Concordant Functions: A Recipe for Newton-Type Methods

Tianxiao Sun · Quoc Tran-Dinh

Corresponding author: quoctd@unc.edu. Tianxiao Sun · Quoc Tran-Dinh, Department of Statistics and Operations Research, University of North Carolina at Chapel Hill (UNC), 38 Hanes Hall, CB# 360, UNC Chapel Hill, NC. Email: {tianxias, quoctd}@unc.edu

Abstract We study the smooth structure of convex functions by generalizing a powerful concept, so-called self-concordance, introduced by Nesterov and Nemirovskii in the early 1990s, to a broader class of convex functions which we call generalized self-concordant functions. This notion allows us to develop a unified framework for designing Newton-type methods to solve convex optimization problems. The proposed theory provides a mathematical tool to analyze both local and global convergence of Newton-type methods without imposing unverifiable assumptions, as long as the underlying functionals fall into our class of generalized self-concordant functions. First, we introduce the class of generalized self-concordant functions, which covers the class of standard self-concordant functions as a special case. Next, we establish several properties and key estimates of this function class which can be used to design numerical methods. Then, we apply this theory to develop several Newton-type methods for solving a class of smooth convex optimization problems involving generalized self-concordant functions. We provide an explicit step-size for a damped-step Newton-type scheme which guarantees global convergence without performing any globalization strategy. We also prove local quadratic convergence of this method and its full-step variant without requiring the Lipschitz continuity of the objective Hessian mapping. Then, we extend our result to develop proximal Newton-type methods for a class of composite convex minimization problems involving generalized self-concordant functions. We again achieve both global and local convergence without additional assumptions. Finally, we verify our theoretical results via several numerical examples, and compare them with existing methods.

Keywords Generalized self-concordance · Newton-type method · proximal Newton method · quadratic convergence · local and global convergence · convex optimization

Mathematics Subject Classification C

1 Introduction

The Newton method is a classical numerical scheme for solving systems of nonlinear equations and smooth optimization problems [47, 50]. However, there are at least two reasons that prevent the use of such methods for solving large-scale problems. Firstly, while these methods often have a fast local convergence rate, which can be up to a quadratic rate, their global convergence has not been well understood [46]. In practice, one can use a damped-step scheme utilizing the Lipschitz constant of the objective derivatives to compute a suitable step-size, as often seen in gradient-type methods, or

incorporate the algorithm with a globalization strategy such as line-search, trust-region, or filter to guarantee a descent property [47]. Both strategies allow us to prove global convergence of the underlying Newton-type method in some sense. Unfortunately, in practice, there exist several problems whose objective function does not have a globally Lipschitz gradient or Hessian, such as logarithmic or reciprocal functions. This class of problems does not provide uniform bounds from which to obtain a constant step-size in optimization algorithms. On the other hand, using a globalization strategy for determining step-sizes often requires centralized computation such as function evaluations, which prevents us from using distributed computation and stochastic descent methods. Secondly, Newton algorithms are second-order methods which often have a high per-iteration complexity due to the operations on the Hessian mapping of the objective function or its approximations. In addition, these methods require the underlying functionals to be smooth up to a given smoothness level, which often does not hold in practical models.

Motivation: In recent years, there has been great interest in Newton-type methods for solving convex optimization problems and monotone equations due to the development of new techniques and mathematical tools in optimization, machine learning, and randomized algorithms [6,, 6, 8, 34, 4, 43, 54, 55, 57, 58, 6]. Several combinations of Newton-type methods and other techniques such as proximal operators [8], cubic regularization [4], gradient regularization [55], randomized algorithms such as sketching [54], subsampling [8], and fast eigen-decomposition [6] have opened up a new research direction and attracted great attention in solving nonsmooth and large-scale problems. Hitherto, research in this direction has remained focused on specific classes of problems where standard assumptions such as nonsingularity and Hessian Lipschitz continuity are preserved. However, such assumptions do not hold for many other examples, as shown in [6]. Moreover, even if they are satisfied, we often get only a lower bound on possible step-sizes for our algorithm, which may lead to poor performance, especially in large-scale problems.

In the seminal work [45], Nesterov and Nemirovskii showed that the class of log-barriers does not satisfy the standard assumptions of the Newton method if the solution of the underlying problem is close to the boundary of the barrier function domain. They introduced a powerful concept called self-concordance to overcome this drawback and developed new Newton schemes that achieve global and local convergence without requiring any additional assumption or a globalization strategy. While the self-concordance notion was initially invented to study interior-point methods, it is less well known in other communities. Recent works [, 4, 38, 6, 67, 7] have popularized this concept to solve other problems arising from machine learning, statistics, image processing, scientific computing, and variational inequalities.

Our goals: In this paper, motivated by [, 63, 7], we aim at generalizing the self-concordance concept in [45] to a broader class of smooth and convex functions. To illustrate our idea, we consider a univariate smooth and convex function $\varphi : \mathbb{R} \to \mathbb{R}$. If $\varphi$ satisfies the inequality $|\varphi'''(t)| \le M_\varphi \varphi''(t)^{3/2}$ for all $t$ in the domain of $\varphi$ and for a given constant $M_\varphi \ge 0$, then we say that $\varphi$ is self-concordant (in Nesterov and Nemirovskii's sense [45]). We instead generalize this inequality to

$$|\varphi'''(t)| \le M_\varphi\, \varphi''(t)^{\nu/2}, \tag{1}$$

for all $t$ in the domain of $\varphi$, and for given constants $\nu > 0$ and $M_\varphi \ge 0$.

We emphasize that generalizing from univariate to multivariate functions in the standard self-concordant case (i.e., $\nu = 3$) [45] preserves several important properties, including the multilinear symmetry [40, Lemma 4], while, unfortunately, these properties do not hold for the case $\nu \neq 3$. Therefore, we modify the definition in [45] to overcome this drawback. Note that a similar idea has also been studied in [, 63] for a class of logistic-type functions. Nevertheless, the definition used in these papers is limited, and still creates certain difficulties for developing further theory in general cases.

Our second goal is to develop a unified mechanism to analyze convergence (including global and local convergence) of the following Newton-type scheme:

$$x^{k+1} := x^k - s_k F'(x^k)^{-1} F(x^k), \tag{2}$$

where $F$ can be represented as the right-hand side of a smooth monotone equation $F(x) = 0$, or the optimality condition of a convex optimization or a convex-concave saddle-point problem, $F'$ is the Jacobian map of $F$, and $s_k \in (0, 1]$ is a given step-size. Although the Newton scheme is invariant to a change of variables [6], its convergence property relies on the growth of the Hessian mapping along the Newton iterative process. In classical settings, the Lipschitz continuity and the non-degeneracy of the Hessian mapping in a neighborhood of a given solution are key assumptions to achieve a local quadratic convergence rate [6]. These assumptions have been considered standard, but they are often very difficult to check in practice, especially the second requirement. A natural idea is to classify the functionals of the underlying problem into a known class of functions in order to choose a suitable method for minimizing it. While first-order methods for convex optimization essentially rely on the Lipschitz gradient continuity, Newton schemes usually use the Lipschitz continuity of the Hessian mapping and its non-degeneracy to obtain a well-defined Newton direction, as we have mentioned. For self-concordant functions, the second condition automatically holds, but the first assumption fails to hold. However, both full-step and damped-step Newton methods still work in this case by appropriately choosing a suitable metric. This situation has been observed, and standard assumptions have been modified in different directions to still guarantee convergence of Newton-type methods; see [6] for an intensive study of generic Newton-type methods, and [45, 40] for the self-concordant function class.

Our approach: We attempt to develop some background theory for a broad class of smooth and convex functions under the structure (1). By adopting the local norm defined via the Hessian mapping of such a convex function from [45], we can prove some lower and upper bound estimates for the local norm distance between two points in the domain as well as for the growth of the Hessian mapping. Together with this background theory, we also identify a class of functions used in generalized linear models [37, 39] as well as in empirical risk minimization [68] that falls into our generalized self-concordance class for many well-known loss-type functions, as listed in Table 2.

Applying our generalized self-concordant theory, we develop a class of Newton-type methods to solve the following composite convex minimization problem:

$$F^{\star} := \min_{x \in \mathbb{R}^p} \big\{ F(x) := f(x) + g(x) \big\}, \tag{3}$$

where $f$ is a generalized self-concordant function in our context, and $g$ is a proper, closed, and convex function that can be referred to as a regularization term. We consider two cases. The first case is a non-composite convex problem in which $g$ vanishes (i.e., $g = 0$). In the second case, we assume that $g$ is equipped with a tractable proximal operator (see (34) for the definition).

Our contribution: To this end, our main contribution can be summarized as follows.

(a) We generalize the self-concordant notion in [40] to a broader class of smooth convex functions, which we call generalized self-concordance. We identify several loss-type functions that can be cast into our generalized self-concordant class. We also prove several fundamental properties and show that the sum and linear transformation of generalized self-concordant functions are generalized self-concordant for a given range of $\nu$ or under suitable assumptions.

(b) We develop lower and upper bounds on the Hessian mapping, the gradient mapping, and the function values for generalized self-concordant functions. These estimates are key to analyzing several numerical optimization methods, including Newton-type methods.

(c) We propose a class of Newton methods, including full-step and damped-step schemes, to minimize a generalized self-concordant function. We explicitly show how to choose a suitable step-size to guarantee a descent direction in the damped-step scheme, and prove local quadratic convergence for both the damped-step and the full-step schemes using a suitable metric.

(d) We also extend our Newton schemes to handle the composite setting (3). We develop both full-step and damped-step proximal Newton methods to solve this problem and provide a rigorous theoretical convergence guarantee in both the local and global sense.

(e) We also study a quasi-Newton variant of our Newton scheme to minimize a generalized self-concordant function. Under a modification of the well-known Dennis-Moré condition [5] or a BFGS update, we show that our quasi-Newton method locally converges at a superlinear rate to the solution of the underlying problem.

Let us emphasize the following aspects of our contribution. Firstly, we observe that the self-concordance notion is a powerful concept and has been widely used in interior-point methods as well as in other optimization schemes [8,35,6,7]. Generalizing it to a broader class of smooth convex functions can substantially cover a number of new applications or lead to new methods for solving old problems, including logistic and multinomial logistic regression, optimization involving exponential objectives, and distance-weighted discrimination problems in support vector machines (see Table 2 below). Secondly, verifying theoretical assumptions for convergence guarantees of a Newton method is not trivial. Our theory allows one to classify the underlying functions into different subclasses by using different parameters $\nu$ and $M_\varphi$ in order to choose suitable algorithms to solve the corresponding optimization problem. Thirdly, the theory developed in this paper can potentially apply to other optimization methods such as gradient-type, sketching and sub-sampling Newton, and Frank-Wolfe's algorithms, as done in the literature [49, 54, 57, 58, 6]. Finally, our generalization also shows that it is possible to impose additional structure such as a self-concordant barrier to develop path-following schemes or interior-point-type methods for solving a subclass of composite convex minimization problems of the form (3). We believe that our theory is not limited to convex optimization, but can be extended to solve convex-concave saddle-point problems and monotone equations/inclusions involving generalized self-concordant functions [67].

Summary of generalized self-concordant properties: For ease of reference, we provide a short summary of the main properties of generalized self-concordant (gsc) functions in Table 1. Although several results hold for a different range of $\nu$, the complete theory only holds for $\nu \in [2, 3]$. However, this is sufficient to cover two important cases: $\nu = 2$ in [,] and $\nu = 3$ in [45].

Table 1 A summary of generalized self-concordant properties

| Result | Property | Range of $\nu$ |
|---|---|---|
| Definitions 1 and 2 | definitions of gsc functions | $\nu > 0$ |
| Proposition 1 | sum of gsc functions | $\nu \ge 2$ |
| Proposition 2 | affine transformation of gsc functions with $\mathcal{A}(x) = Ax + b$ | $\nu \in (0, 3]$ for general $A$; $\nu > 3$ for over-completed $A$ |
| Proposition 3(a) | non-degenerate property | $\nu \ge 2$ |
| Proposition 3(b) | unboundedness | $\nu > 0$ |
| Proposition 4(a) | gsc and strong convexity | $\nu \in (0, 3]$ |
| Proposition 4(b) | gsc and Lipschitz gradient continuity | $\nu \ge 2$ |
| Proposition 6 | if $f^{*}$ is the conjugate of a gsc function $f$, then $\nu_{*} + \nu = 6$ | $\nu \in (0, 6)$ if $p = 1$ (univariate); $\nu \in [3, 6)$ if $p > 1$ (multivariate) |
| Propositions 7, 8, 9, and 10 | local norm, Hessian, gradient, and function value bounds | $\nu \ge 2$ |

Related work: Since the self-concordance concept was introduced in the 1990s [45], its first extension was perhaps proposed by [] for a class of logistic regression. In [63], the authors extended [] to study proximal Newton methods for logistic, multinomial logistic, and exponential loss functions. By augmenting a strongly convex regularizer, Zhang and Lin in [7] showed that the regularized logistic loss function is indeed standard self-concordant. In [] Bach continued exploiting his result in [] to show that the averaging stochastic gradient method can achieve the same best-known convergence rate as in the strongly convex case without adding a regularizer. In [6], the authors exploited standard

self-concordance theory in [45] to develop several classes of optimization algorithms, including proximal Newton, proximal quasi-Newton, and proximal gradient methods, to solve composite convex minimization problems. In [35], Lu extended [6] to study randomized block coordinate descent methods. In a recent paper [], Gao and Goldfarb investigated quasi-Newton methods for self-concordant problems. As another example, [53] proposed an alternative to standard self-concordance, called self-regularity. The authors applied this theory to develop a new paradigm for interior-point methods. The theory developed in this paper, on the one hand, is a generalization of the well-known self-concordance notion developed in [45]; on the other hand, it also covers the work in [, 6, 7] as specific examples. Several concrete applications and extensions of the self-concordance notion can also be found in the literature, including [8, 3, 49, 53]. Recently, [4] exploited smooth structures of exponential functions to design interior-point methods for solving two fundamental problems in scientific computing called matrix scaling and balancing.

Paper organization: The rest of this paper is organized as follows. Section 2 develops the foundation theory for our generalized self-concordant functions, including definitions, examples, basic properties, Fenchel's conjugate, smoothing technique, and key bounds. Section 3 is devoted to studying full-step and damped-step Newton schemes to minimize a generalized self-concordant function, including their global and local convergence guarantees. Section 4 considers the composite setting (3), studies proximal Newton-type methods, and investigates their convergence guarantees. Section 5 deals with a quasi-Newton scheme for solving the non-composite problem of (3). Numerical examples are provided in Section 6 to illustrate advantages of our theory. Finally, for clarity of presentation, several technical results and proofs are moved to the appendix.

2 Theory of generalized self-concordant functions

We generalize the class of self-concordant functions introduced by Nesterov and Nemirovskii in [40] to a broader class of smooth and convex functions. We identify several examples of such functions. Then, we develop several properties of this function class by utilizing our new definitions.

Notation: Given a proper, closed, and convex function $f : \mathbb{R}^p \to \mathbb{R} \cup \{+\infty\}$, we denote by $\mathrm{dom}(f) := \{x \in \mathbb{R}^p \mid f(x) < +\infty\}$ the domain of $f$, and by $\partial f(x) := \{w \in \mathbb{R}^p \mid f(y) \ge f(x) + \langle w, y - x\rangle,\ \forall y \in \mathrm{dom}(f)\}$ the subdifferential of $f$ at $x \in \mathrm{dom}(f)$. We use $\mathcal{C}^3(\mathrm{dom}(f))$ to denote the class of three times continuously differentiable functions on its open domain $\mathrm{dom}(f)$. We denote by $\nabla f$ its gradient map, by $\nabla^2 f$ its Hessian map, and by $\nabla^3 f$ its third-order derivative. For a twice continuously differentiable convex function $f$, $\nabla^2 f$ is symmetric positive semidefinite, which is written as $\nabla^2 f(\cdot) \succeq 0$. If it is positive definite, then we write $\nabla^2 f(\cdot) \succ 0$.

Let $\mathbb{R}_{+}$ and $\mathbb{R}_{++}$ denote the sets of nonnegative and positive real numbers, respectively. We use $\mathcal{S}^p_{+}$ and $\mathcal{S}^p_{++}$ to denote the sets of symmetric positive semidefinite and symmetric positive definite matrices of size $p \times p$, respectively. Given a $p \times p$ matrix $H \succ 0$, we define a weighted norm with respect to $H$ as $\|u\|_H := \langle Hu, u\rangle^{1/2}$ for $u \in \mathbb{R}^p$. The corresponding dual norm is $\|v\|_H^{*} := \langle H^{-1}v, v\rangle^{1/2}$. If $H = \mathbb{I}$, the identity matrix, then $\|u\|_H = \|u\|_H^{*} = \|u\|_2$, where $\|\cdot\|_2$ is the standard Euclidean norm. Note that $\|\cdot\|_2^{*} = \|\cdot\|_2$.

We say that $f$ is strongly convex with the strong convexity parameter $\mu_f \ge 0$ if $f(\cdot) - \frac{\mu_f}{2}\|\cdot\|_2^2$ is convex. We also say that $f$ has Lipschitz gradient if $\nabla f$ is Lipschitz continuous with the Lipschitz constant $L_f \in [0, +\infty)$, i.e., $\|\nabla f(x) - \nabla f(y)\|_2^{*} \le L_f \|x - y\|_2$ for all $x, y \in \mathrm{dom}(f)$.

For $f \in \mathcal{C}^3(\mathrm{dom}(f))$, if $\nabla^2 f(x) \succ 0$ at a given $x \in \mathrm{dom}(f)$, then we define a local norm $\|u\|_x := \langle \nabla^2 f(x)u, u\rangle^{1/2}$ as a weighted norm of $u$ with respect to $\nabla^2 f(x)$. The corresponding dual norm $\|v\|_x^{*}$ is defined as $\|v\|_x^{*} := \max\{\langle v, u\rangle \mid \|u\|_x \le 1\} = \langle \nabla^2 f(x)^{-1}v, v\rangle^{1/2}$ for $v \in \mathbb{R}^p$.

2.1 Univariate generalized self-concordant functions

Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a three times continuously differentiable function on the open domain $\mathrm{dom}(\varphi)$. Then, we write $\varphi \in \mathcal{C}^3(\mathrm{dom}(\varphi))$. In this case, $\varphi$ is convex if and only if $\varphi''(t) \ge 0$ for all $t \in \mathrm{dom}(\varphi)$. We introduce the following definition.

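Definition 1 Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a $\mathcal{C}^3(\mathrm{dom}(\varphi))$ univariate function with open domain $\mathrm{dom}(\varphi)$, and let $\nu > 0$ and $M_\varphi \ge 0$ be two constants. We say that $\varphi$ is $(M_\varphi, \nu)$-generalized self-concordant if

$$|\varphi'''(t)| \le M_\varphi\, \varphi''(t)^{\nu/2}, \quad \forall t \in \mathrm{dom}(\varphi). \tag{4}$$

The inequality (4) also indicates that $\varphi''(t) \ge 0$ for all $t \in \mathrm{dom}(\varphi)$. Hence, $\varphi$ is convex. Clearly, if $\varphi(t) = \frac{a}{2}t^2 + bt$ for any constants $a \ge 0$ and $b \in \mathbb{R}$, we have $\varphi''(t) = a$ and $\varphi'''(t) = 0$. The inequality (4) is then automatically satisfied for any $\nu > 0$ and $M_\varphi \ge 0$. The smallest value of $M_\varphi$ is zero. Hence, any convex quadratic function is $(0, \nu)$-generalized self-concordant for any $\nu > 0$. While (4) holds for any other constant $\hat{M}_\varphi \ge M_\varphi$, we often require that $M_\varphi$ is the smallest constant satisfying (4).

Example 1 Let us now provide some common examples satisfying Definition 1.

(a) Standard self-concordant functions: If we choose $\nu = 3$, then (4) becomes $|\varphi'''(t)| \le M_\varphi \varphi''(t)^{3/2}$, which defines the standard self-concordant functions in $\mathbb{R}$ introduced in [45].

(b) Logistic functions: In [], Bach modified the standard self-concordant inequality in [45] to obtain $|\varphi'''(t)| \le M_\varphi \varphi''(t)$, and showed that the well-known logistic loss $\varphi(t) := \log(1 + e^{-t})$ satisfies this definition. In [63] the authors also exploited this definition, and developed a class of first-order and second-order methods to solve composite convex minimization problems. Hence, $\varphi(t) := \log(1 + e^{-t})$ is a generalized self-concordant function with $M_\varphi = 1$ and $\nu = 2$.

(c) Exponential functions: The exponential function $\varphi(t) := e^{-t}$ also satisfies (4) with $M_\varphi = 1$ and $\nu = 2$. This function is often used, e.g., in AdaBoost [33], or in matrix scaling [4].

(d) Distance-weighted discrimination (DWD): We consider a more general function $\varphi(t) := \frac{1}{t^q}$ on $\mathrm{dom}(\varphi) = \mathbb{R}_{++}$ with $q \ge 1$, studied in [36] for DWD used in support vector machines. As shown in Table 2, this function satisfies Definition 1 with $M_\varphi = \frac{q+2}{\sqrt[q+2]{q(q+1)}}$ and $\nu = \frac{2(q+3)}{q+2} \in (2, 3)$.

(e) Entropy function: We consider the well-known entropy function $\varphi(t) := t\ln(t)$ for $t > 0$. We can easily show that $|\varphi'''(t)| = \frac{1}{t^2} = \varphi''(t)^2$. Hence, it is generalized self-concordant with $\nu = 4$ and $M_\varphi = 1$ in the sense of Definition 1.

(f) Arcsine distribution: We consider the function $\varphi(t) := \frac{1}{\sqrt{1 - t^2}}$ for $t \in (-1, 1)$. This function is convex and smooth. Moreover, we can verify that it satisfies Definition 1 with $\nu = \frac{14}{5} \in (2, 3)$ and $M_\varphi = \frac{3\sqrt{495 - 105\sqrt{21}}}{(7 - \sqrt{21})^{7/5}} < 3.25$. We can generalize this function to $\varphi(t) := [(t - a)(b - t)]^{-q}$ for $t \in (a, b)$, where $a < b$ and $q > 0$. Then, we can show that $\nu = \frac{2(q+3)}{q+2} \in (2, 3)$.

(g) Robust regression: Consider a monomial function $\varphi(t) := t^q$ for $q \in (1, 2)$, studied in [7] for robust regression used in statistics. Then, $M_\varphi = \frac{2-q}{\sqrt[2-q]{q(q-1)}}$ and $\nu = \frac{2(3-q)}{2-q} \in (4, +\infty)$.

As concrete examples, Table 2 provides a non-exhaustive list of generalized self-concordant functions used in the literature.

Table 2 Examples of univariate generalized self-concordant functions ($\mathcal{F}_L^{1,1}$ means that $\varphi'$ is Lipschitz continuous)

| Function name | Form of $\varphi(t)$ | $\nu$ | $M_\varphi$ | $\mathrm{dom}(\varphi)$ | Application | $\mathcal{F}_L^{1,1}$ | Reference |
|---|---|---|---|---|---|---|---|
| Log-barrier | $-\ln(t)$ | $3$ | $2$ | $\mathbb{R}_{++}$ | Poisson | no | [0,40,45] |
| Entropy-barrier | $t\ln(t) - \ln(t)$ | $3$ | $2$ | $\mathbb{R}_{++}$ | Interior-point | no | [40] |
| Logistic | $\ln(1 + e^{-t})$ | $2$ | $1$ | $\mathbb{R}$ | Classification | yes | [9] |
| Exponential | $e^{-t}$ | $2$ | $1$ | $\mathbb{R}$ | AdaBoost, etc. | no | [4,33] |
| Negative power | $t^{-q}$, $q > 0$ | $\frac{2(q+3)}{q+2}$ | $\frac{q+2}{\sqrt[q+2]{q(q+1)}}$ | $\mathbb{R}_{++}$ | DWD | no | [36] |
| Arcsine distribution | $\frac{1}{\sqrt{1-t^2}}$ | $\frac{14}{5}$ | $< 3.25$ | $(-1, 1)$ | Random walks | no | [4] |
| Positive power | $t^{q}$, $q \in (1, 2)$ | $\frac{2(3-q)}{2-q}$ | $\frac{2-q}{\sqrt[2-q]{q(q-1)}}$ | $\mathbb{R}_{+}$ | Regression | no | [7] |
| Entropy | $t\ln(t)$ | $4$ | $1$ | $\mathbb{R}_{+}$ | KL divergence | no | [0] |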
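The constants listed for several rows of the table can be sanity-checked numerically. The following is a minimal sketch, not part of the original paper: it evaluates the ratio $|\varphi'''(t)| / \varphi''(t)^{\nu/2}$ on a finite grid and compares its maximum with the claimed $M_\varphi$. The closed-form derivatives are worked out by hand, and the helper names, the grids, and the exponent `Q = 1.5` are assumptions chosen only for illustration. A grid check can expose a wrong constant but cannot prove inequality (4).

```python
import math

def linspace(a, b, n):
    """Return n evenly spaced points from a to b (inclusive)."""
    return [a + (b - a) * i / (n - 1) for i in range(n)]

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

Q = 1.5  # hypothetical exponent for the negative-power example t**(-Q)

POS = linspace(0.05, 20.0, 400)    # sample points in (0, +inf)
REAL = linspace(-20.0, 20.0, 401)  # sample points in R
UNIT = linspace(-0.99, 0.99, 397)  # sample points in (-1, 1)

# name -> (phi'', phi''', nu, claimed M_phi, sample points)
EXAMPLES = {
    "log-barrier": (lambda t: 1 / t**2, lambda t: -2 / t**3, 3.0, 2.0, POS),
    "logistic": (lambda t: sigmoid(t) * (1 - sigmoid(t)),
                 lambda t: sigmoid(t) * (1 - sigmoid(t)) * (1 - 2 * sigmoid(t)),
                 2.0, 1.0, REAL),
    "exponential": (lambda t: math.exp(-t), lambda t: -math.exp(-t),
                    2.0, 1.0, REAL),
    "negative power": (lambda t: Q * (Q + 1) * t ** (-Q - 2),
                       lambda t: -Q * (Q + 1) * (Q + 2) * t ** (-Q - 3),
                       2 * (Q + 3) / (Q + 2),
                       (Q + 2) / (Q * (Q + 1)) ** (1 / (Q + 2)), POS),
    "arcsine": (lambda t: (1 + 2 * t**2) * (1 - t**2) ** -2.5,
                lambda t: 3 * t * (3 + 2 * t**2) * (1 - t**2) ** -3.5,
                14 / 5, 3.25, UNIT),
    "entropy": (lambda t: 1 / t, lambda t: -1 / t**2, 4.0, 1.0, POS),
}

def worst_ratio(d2, d3, nu, points):
    """Largest value of |phi'''(t)| / phi''(t)**(nu/2) over the sample points."""
    return max(abs(d3(t)) / d2(t) ** (nu / 2.0) for t in points)

def check_examples():
    """Map each example name to (worst ratio on the grid, claimed M_phi)."""
    return {name: (worst_ratio(d2, d3, nu, pts), M)
            for name, (d2, d3, nu, M, pts) in EXAMPLES.items()}
```

Calling `check_examples()` returns, for each function, the worst observed ratio next to the claimed constant; for the log-barrier, negative-power, and entropy rows the ratio is constant in $t$, so the grid maximum matches $M_\varphi$ up to rounding.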

Remark 1 All examples given in Table 2 fall into the case $\nu \ge 2$. However, we note that Definition 1 also covers [7, Lemma ] as a special case when $\nu \in (0, 2)$. Unfortunately, as we will see in what follows, it is unclear how to generalize several properties of generalized self-concordance from univariate to multivariable functions for $\nu \in (0, 2)$, except for strongly convex functions.

Table 2 only provides common generalized self-concordant functions used in practice. However, it is possible to combine these functions to obtain mixture functions that preserve the generalized self-concordant inequality given in Definition 1. For instance, the barrier entropy $t\ln(t) - \ln(t)$ is a standard self-concordant function, and it is the sum of the entropy $t\ln(t)$ and the negative logarithmic function $-\log(t)$, which are generalized self-concordant with $\nu = 4$ and $\nu = 3$, respectively.

2.2 Multivariate generalized self-concordant functions

Let $f : \mathbb{R}^p \to \mathbb{R}$ be a $\mathcal{C}^3(\mathrm{dom}(f))$ smooth and convex function with open domain $\mathrm{dom}(f)$. Given the Hessian $\nabla^2 f$ of $f$, $x \in \mathrm{dom}(f)$, and $u, v \in \mathbb{R}^p$, we consider the function $\psi(t) := \langle \nabla^2 f(x + tv)u, u\rangle$. Then, it is straightforward to show that $\psi'(t) = \langle \nabla^3 f(x + tv)[v]u, u\rangle$ for $t \in \mathbb{R}$ such that $x + tv \in \mathrm{dom}(f)$, where $\nabla^3 f$ is the third-order derivative of $f$. It is clear that $\psi(0) = \langle \nabla^2 f(x)u, u\rangle = \|u\|_x^2$. By using the local norm, we generalize Definition 1 to multivariate functions $f : \mathbb{R}^p \to \mathbb{R}$ as follows.

Definition 2 A $\mathcal{C}^3$-convex function $f : \mathbb{R}^p \to \mathbb{R}$ is said to be an $(M_f, \nu)$-generalized self-concordant function of the order $\nu > 0$ and the constant $M_f \ge 0$ if, for any $x \in \mathrm{dom}(f)$ and $u, v \in \mathbb{R}^p$, it holds that

$$|\langle \nabla^3 f(x)[v]u, u\rangle| \le M_f \|u\|_x^2\, \|v\|_x^{\nu-2}\, \|v\|_2^{3-\nu}. \tag{5}$$

Here, we use the convention that $\frac{0}{0} = 0$ for the case $\nu < 2$ or $\nu > 3$. We denote this class of functions by $\mathcal{F}_{M_f,\nu}(\mathrm{dom}(f))$ (for short, $\mathcal{F}_{M_f,\nu}$ when $\mathrm{dom}(f)$ is explicitly defined).

Let us consider the following two extreme cases:

1. If $\nu = 2$, (5) leads to $|\langle \nabla^3 f(x)[v]u, u\rangle| \le M_f \|u\|_x^2 \|v\|_2$, which collapses to the definition introduced in [] by letting $u = v$.
2. If $\nu = 3$ and $u = v$, (5) reduces to $|\langle \nabla^3 f(x)[u]u, u\rangle| \le M_f \|u\|_x^3$, and Definition 2 becomes the standard self-concordant definition introduced in [40, 45].

We emphasize that Definition 2 is not symmetric, but it avoids the use of multilinear mappings as required in [, 45]. However, by [45, Proposition 9] or [40, Lemma 4], Definition 2 with $\nu = 3$ is equivalent to [40, Definition 4] for standard self-concordant functions.

2.3 Basic properties of generalized self-concordant functions

We first show that if $f_1$ and $f_2$ are two generalized self-concordant functions, then $\beta_1 f_1 + \beta_2 f_2$ is also generalized self-concordant for any $\beta_1, \beta_2 > 0$ according to Definition 2.

Proposition 1 (Sum of generalized self-concordant functions) Let $f_i$ be $(M_{f_i}, \nu)$-generalized self-concordant functions satisfying (5), where $M_{f_i} \ge 0$ and $\nu \ge 2$ for $i = 1, \dots, m$. Then, for $\beta_i > 0$, $i = 1, \dots, m$, the function $f(x) := \sum_{i=1}^m \beta_i f_i(x)$ is well-defined on $\mathrm{dom}(f) = \bigcap_{i=1}^m \mathrm{dom}(f_i)$, and is $(M_f, \nu)$-generalized self-concordant with the same order $\nu \ge 2$ and the constant

$$M_f := \max\Big\{ \beta_i^{1 - \frac{\nu}{2}} M_{f_i} \;\Big|\; 1 \le i \le m \Big\} \ge 0.$$

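Proof It is sufficient to prove the result for $m = 2$. For $m > 2$, it follows from $m = 2$ by induction. By [40, Theorem 35], $f$ is a closed and convex function. In addition, $\mathrm{dom}(f) = \mathrm{dom}(f_1) \cap \mathrm{dom}(f_2)$. Let us fix some $x \in \mathrm{dom}(f)$ and $u, v \in \mathbb{R}^p$. Then, by Definition 2, we have

$$|\langle \nabla^3 f_i(x)[v]u, u\rangle| \le M_{f_i} \langle \nabla^2 f_i(x)u, u\rangle \langle \nabla^2 f_i(x)v, v\rangle^{\frac{\nu-2}{2}} \|v\|_2^{3-\nu}, \quad i = 1, 2.$$

Denote $w_i := \langle \nabla^2 f_i(x)u, u\rangle \ge 0$ and $s_i := \langle \nabla^2 f_i(x)v, v\rangle \ge 0$ for $i = 1, 2$. We can derive

$$
\begin{aligned}
\frac{|\langle \nabla^3 f(x)[v]u, u\rangle|}{\langle \nabla^2 f(x)u, u\rangle \langle \nabla^2 f(x)v, v\rangle^{\frac{\nu-2}{2}}}
&\le \frac{\beta_1 |\langle \nabla^3 f_1(x)[v]u, u\rangle| + \beta_2 |\langle \nabla^3 f_2(x)[v]u, u\rangle|}{\langle \nabla^2 f(x)u, u\rangle \langle \nabla^2 f(x)v, v\rangle^{\frac{\nu-2}{2}}} \\
&\le \underbrace{\left[ \frac{M_{f_1}\beta_1 w_1 s_1^{\frac{\nu-2}{2}} + M_{f_2}\beta_2 w_2 s_2^{\frac{\nu-2}{2}}}{(\beta_1 w_1 + \beta_2 w_2)(\beta_1 s_1 + \beta_2 s_2)^{\frac{\nu-2}{2}}} \right]}_{[T]} \|v\|_2^{3-\nu}.
\end{aligned}
\tag{6}
$$

Let $\xi := \frac{\beta_1 w_1}{\beta_1 w_1 + \beta_2 w_2} \in [0, 1]$ and $\eta := \frac{\beta_1 s_1}{\beta_1 s_1 + \beta_2 s_2} \in [0, 1]$. Then, $\frac{\beta_2 w_2}{\beta_1 w_1 + \beta_2 w_2} = 1 - \xi \ge 0$ and $\frac{\beta_2 s_2}{\beta_1 s_1 + \beta_2 s_2} = 1 - \eta \ge 0$. Hence, the term $[T]$ in the square brackets of (6) becomes

$$h(\xi, \eta) := \beta_1^{1-\frac{\nu}{2}} M_{f_1} \xi \eta^{\frac{\nu-2}{2}} + \beta_2^{1-\frac{\nu}{2}} M_{f_2} (1 - \xi)(1 - \eta)^{\frac{\nu-2}{2}}, \quad \xi, \eta \in [0, 1].$$

Since $\nu \ge 2$ and $\xi, \eta \in [0, 1]$, we can upper bound $h(\xi, \eta)$ as

$$h(\xi, \eta) \le \beta_1^{1-\frac{\nu}{2}} M_{f_1} \xi + \beta_2^{1-\frac{\nu}{2}} M_{f_2} (1 - \xi), \quad \xi \in [0, 1].$$

The right-hand side is linear in $\xi$ on $[0, 1]$, so it achieves its maximum at the boundary. Hence, we have

$$\max_{\xi \in [0,1],\, \eta \in [0,1]} h(\xi, \eta) \le \max\Big\{ \beta_1^{1-\frac{\nu}{2}} M_{f_1},\ \beta_2^{1-\frac{\nu}{2}} M_{f_2} \Big\}.$$

Using this estimate in (6), we can show that $f(\cdot) := \beta_1 f_1(\cdot) + \beta_2 f_2(\cdot)$ is $(M_f, \nu)$-generalized self-concordant with $M_f := \max\big\{ \beta_1^{1-\frac{\nu}{2}} M_{f_1},\ \beta_2^{1-\frac{\nu}{2}} M_{f_2} \big\}$.

Using Proposition 1, we can also see that if $f$ is $(M_f, \nu)$-generalized self-concordant and $\beta > 0$, then $g(x) := \beta f(x)$ is also $(M_g, \nu)$-generalized self-concordant with the constant $M_g := \beta^{1-\frac{\nu}{2}} M_f$. The convex quadratic function $q(x) := \frac{1}{2}\langle Qx, x\rangle + c^{\top}x$ with $Q \in \mathcal{S}^p_{+}$ is $(0, \nu)$-generalized self-concordant for any $\nu > 0$. Hence, by Proposition 1, if $f$ is $(M_f, \nu)$-generalized self-concordant, then $f(x) + \frac{1}{2}\langle Qx, x\rangle + c^{\top}x$ is also $(M_f, \nu)$-generalized self-concordant.

Next, we consider an affine transformation of a generalized self-concordant function.

Proposition 2 (Affine transformation) Let $\mathcal{A}(x) := Ax + b$ be an affine transformation from $\mathbb{R}^p$ to $\mathbb{R}^q$, and $f$ be an $(M_f, \nu)$-generalized self-concordant function with $\nu > 0$. Then, the following statements hold:

(a) If $\nu \in (0, 3]$, then $g(x) := f(\mathcal{A}(x))$ is $(M_g, \nu)$-generalized self-concordant with $M_g := M_f \|A\|^{3-\nu}$.

(b) If $\nu > 3$ and $\lambda_{\min}(A^{\top}A) > 0$, then $g(x) := f(\mathcal{A}(x))$ is $(M_g, \nu)$-generalized self-concordant with $M_g := M_f\, \lambda_{\min}(A^{\top}A)^{\frac{3-\nu}{2}}$, where $\lambda_{\min}(A^{\top}A)$ is the smallest eigenvalue of $A^{\top}A$.

Proof Since $g(x) = f(\mathcal{A}(x)) = f(Ax + b)$, it is easy to show that $\nabla^2 g(x) = A^{\top} \nabla^2 f(\mathcal{A}(x)) A$ and $\nabla^3 g(x)[v] = A^{\top} \big(\nabla^3 f(\mathcal{A}(x))[Av]\big) A$. Let us denote $\tilde{x} := Ax + b$, $\tilde{u} := Au$, and $\tilde{v} := Av$. Then, using Definition 2, we have

$$
\begin{aligned}
|\langle \nabla^3 g(x)[v]u, u\rangle| &= |\langle A^{\top}(\nabla^3 f(\tilde{x})[\tilde{v}])Au, u\rangle| = |\langle \nabla^3 f(\tilde{x})[\tilde{v}]\tilde{u}, \tilde{u}\rangle| \\
&\overset{(5)}{\le} M_f \langle \nabla^2 f(\tilde{x})\tilde{u}, \tilde{u}\rangle \langle \nabla^2 f(\tilde{x})\tilde{v}, \tilde{v}\rangle^{\frac{\nu-2}{2}} \|\tilde{v}\|_2^{3-\nu} \\
&= M_f \langle A^{\top}\nabla^2 f(\tilde{x})Au, u\rangle \langle A^{\top}\nabla^2 f(\tilde{x})Av, v\rangle^{\frac{\nu-2}{2}} \|Av\|_2^{3-\nu} \\
&= M_f \langle \nabla^2 g(x)u, u\rangle \langle \nabla^2 g(x)v, v\rangle^{\frac{\nu-2}{2}} \|Av\|_2^{3-\nu}.
\end{aligned}
\tag{7}
$$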

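(a) If $\nu \in (0, 3]$, then we have $\|Av\|_2^{3-\nu} \le \|A\|^{3-\nu} \|v\|_2^{3-\nu}$. Hence, the last inequality (7) implies

$$|\langle \nabla^3 g(x)[v]u, u\rangle| \le M_f \|A\|^{3-\nu} \langle \nabla^2 g(x)u, u\rangle \langle \nabla^2 g(x)v, v\rangle^{\frac{\nu-2}{2}} \|v\|_2^{3-\nu},$$

which shows that $g$ is $(M_g, \nu)$-generalized self-concordant with $M_g := M_f \|A\|^{3-\nu}$.

(b) Note that $\|Av\|_2^2 = v^{\top}A^{\top}Av \ge \lambda_{\min}(A^{\top}A)\|v\|_2^2 \ge 0$, where $\lambda_{\min}(A^{\top}A)$ is the smallest eigenvalue of $A^{\top}A$. If $\lambda_{\min}(A^{\top}A) > 0$ and $\nu > 3$, then we have $\|Av\|_2^{3-\nu} \le \lambda_{\min}(A^{\top}A)^{\frac{3-\nu}{2}} \|v\|_2^{3-\nu}$. Combining this estimate and (7), we can show that $g$ is $(M_g, \nu)$-generalized self-concordant with $M_g := M_f\, \lambda_{\min}(A^{\top}A)^{\frac{3-\nu}{2}}$.

Remark 2 Proposition 2 shows that generalized self-concordance is preserved under affine transformations if $\nu \in (0, 3]$. If $\nu > 3$, then it requires $A$ to be over-completed, i.e., $\lambda_{\min}(A^{\top}A) > 0$. Hence, the theory developed in the sequel remains applicable for $\nu > 3$ if $A$ is over-completed.

The following result is an extension of the corresponding result for standard self-concordant functions ($\nu = 3$). Its proof is very similar to [40, Theorems 43, 44], obtained by replacing the parameters $M_f = 2$ and $\nu = 3$ with the general parameters $M_f \ge 0$ and $\nu > 0$ (or $\nu \ge 2$), respectively. We omit the detailed proof.

Proposition 3 Let $f$ be an $(M_f, \nu)$-generalized self-concordant function with $\nu > 0$. Then:

(a) If $\nu \ge 2$ and $\mathrm{dom}(f)$ contains no straight line, then $\nabla^2 f(x) \succ 0$ for any $x \in \mathrm{dom}(f)$.

(b) If there exists $\bar{x} \in \mathrm{bd}(\mathrm{dom}(f))$, the boundary of $\mathrm{dom}(f)$, then, for any $\bar{x} \in \mathrm{bd}(\mathrm{dom}(f))$ and any sequence $\{x_k\} \subset \mathrm{dom}(f)$ such that $\lim_{k\to\infty} x_k = \bar{x}$, we have $\lim_{k\to\infty} f(x_k) = +\infty$.

Note that Proposition 3(a) only holds for $\nu \ge 2$. If we consider $g(x) := f(\mathcal{A}(x))$ for a given affine operator $\mathcal{A}(x) = Ax + b$, then the non-degeneracy of $\nabla^2 g$ is only guaranteed if $A$ is full-rank. Otherwise, it is non-degenerate in a given subspace of $A$.

2.4 Generalized self-concordant functions with special structures

We first show that if a generalized self-concordant function is strongly convex or has a Lipschitz gradient, then it can be cast into the special case $\nu = 2$ or $\nu = 3$.

Proposition 4 Let $f \in \mathcal{F}_{M_f,\nu}$ be an $(M_f, \nu)$-generalized self-concordant function with $\nu > 0$. Then:

(a) If $\nu \in (0, 3]$ and $f$ is also strongly convex on $\mathrm{dom}(f)$ with the strong convexity parameter $\mu_f > 0$ in the $\ell_2$-norm, then $f$ is also $(\hat{M}_f, \hat{\nu})$-generalized self-concordant with $\hat{\nu} = 3$ and $\hat{M}_f := \frac{M_f}{(\sqrt{\mu_f})^{3-\nu}}$.

(b) If $\nu \ge 2$ and $\nabla f$ is Lipschitz continuous with the Lipschitz constant $L_f \in [0, +\infty)$ in the $\ell_2$-norm, then $f$ is also $(\hat{M}_f, \hat{\nu})$-generalized self-concordant with $\hat{\nu} = 2$ and $\hat{M}_f := M_f L_f^{\frac{\nu}{2}-1}$.

Proof (a) If $f$ is strongly convex with the strong convexity parameter $\mu_f > 0$ in the $\ell_2$-norm, then we have $\langle \nabla^2 f(x)v, v\rangle \ge \mu_f \|v\|_2^2$ for any $v \in \mathbb{R}^p$. Hence, $\frac{\|v\|_2}{\|v\|_x} \le \frac{1}{\sqrt{\mu_f}}$. In this case, (5) leads to

$$|\langle \nabla^3 f(x)[v]u, u\rangle| \le M_f \|u\|_x^2 \left(\frac{\|v\|_2}{\|v\|_x}\right)^{3-\nu} \|v\|_x \le \frac{M_f}{(\sqrt{\mu_f})^{3-\nu}} \|u\|_x^2 \|v\|_x.$$

Hence, $f$ is $(\hat{M}_f, \hat{\nu})$-generalized self-concordant with $\hat{\nu} = 3$ and $\hat{M}_f := \frac{M_f}{(\sqrt{\mu_f})^{3-\nu}}$.

(b) Since $\nabla f$ is Lipschitz continuous with the Lipschitz constant $L_f \in [0, +\infty)$ in the $\ell_2$-norm, we have $\|v\|_x^2 = \langle \nabla^2 f(x)v, v\rangle \le L_f \|v\|_2^2$ for all $v \in \mathbb{R}^p$, which leads to $\frac{\|v\|_x}{\|v\|_2} \le \sqrt{L_f}$ for all $v \in \mathbb{R}^p$. On the other hand, since $f \in \mathcal{F}_{M_f,\nu}$ with $\nu \ge 2$, we can show that

$$|\langle \nabla^3 f(x)[v]u, u\rangle| \le M_f \|u\|_x^2 \left(\frac{\|v\|_x}{\|v\|_2}\right)^{\nu-2} \|v\|_2 \le M_f L_f^{\frac{\nu-2}{2}} \|u\|_x^2 \|v\|_2.$$

Hence, $f$ is also $(\hat{M}_f, \hat{\nu})$-generalized self-concordant with $\hat{\nu} = 2$ and $\hat{M}_f := M_f L_f^{\frac{\nu}{2}-1}$.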
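As an illustration of Proposition 4(a), the following worked instance (added here for concreteness; the regularization weight $\mu > 0$ is an assumed parameter, and only Example 1(b) and Proposition 1 are used) treats the univariate regularized logistic loss. It shows how the order $\nu = 2$ is traded for the standard order $\nu = 3$ at the price of a factor $1/\sqrt{\mu}$.

```latex
% Assumed setting: \varphi(t) := \log(1 + e^{-t}) + \tfrac{\mu}{2} t^2 with \mu > 0.
% The logistic part is (1,2)-gsc (Example 1(b)) and the quadratic part is (0,2)-gsc,
% so Proposition 1 with \beta_1 = \beta_2 = 1 gives M_\varphi = \max\{1, 0\} = 1, \nu = 2:
\[
  |\varphi'''(t)| \le \varphi''(t), \qquad \varphi''(t) \ge \mu > 0 .
\]
% Strong convexity converts the order from 2 to 3:
\[
  |\varphi'''(t)| \le \varphi''(t)
  = \frac{\varphi''(t)^{3/2}}{\varphi''(t)^{1/2}}
  \le \frac{1}{\sqrt{\mu}}\,\varphi''(t)^{3/2},
\]
% which matches Proposition 4(a): \hat{\nu} = 3 and
% \hat{M}_\varphi = M_\varphi / (\sqrt{\mu})^{3-\nu} = 1/\sqrt{\mu}.
```

The constant deteriorates as $\mu \to 0$, which is one way to see why keeping the order $\nu = 2$ can be preferable when the regularization is weak.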

Proposition 4 provides two important properties. If the gradient map $\nabla f$ of a generalized self-concordant function $f$ is Lipschitz continuous, we can always classify it into the special case $\nu = 2$. Therefore, we can exploit both structures, generalized self-concordance and Lipschitz gradient, to develop better algorithms. This idea is also applied to generalized self-concordant and strongly convex functions.

Given $n$ smooth convex univariate functions $\varphi_i : \mathbb{R} \to \mathbb{R}$ satisfying (4) for $i = 1, \dots, n$ with the same order $\nu > 0$, we consider the function $f : \mathbb{R}^p \to \mathbb{R}$ defined by the following form:

$$f(x) := \frac{1}{n}\sum_{i=1}^n \varphi_i(a_i^{\top}x + b_i), \tag{8}$$

where $a_i \in \mathbb{R}^p$ and $b_i \in \mathbb{R}$ are given vectors and numbers, respectively, for $i = 1, \dots, n$. This convex function is called a finite sum and is widely used in machine learning and statistics. The decomposable structure in (8) often appears in generalized linear models [7, ] and empirical risk minimization [7], where $\varphi_i$ is referred to as a loss function as can be found, e.g., in Table 1.

Next, we show that if $\varphi_i$ is generalized self-concordant with $\nu \in [2, 3]$, then $f$ is also generalized self-concordant. This result is a direct consequence of Proposition 1 and Proposition 2.

Corollary 1. If $\varphi_i$ in (8) satisfies (4) for $i = 1, \dots, n$ with the same order $\nu \in [2, 3]$ and $M_{\varphi_i} \geq 0$, then $f$ defined by (8) is also $(M_f, \nu)$-generalized self-concordant in the sense of Definition 2 with the same order $\nu$ and the constant

$$M_f := n^{\frac{\nu}{2}-1}\max\left\{ M_{\varphi_i}\|a_i\|_2^{3-\nu} \mid 1 \leq i \leq n \right\}.$$

Finally, we show that if we regularize $f$ in (8) by a strongly convex quadratic term, then the resulting function becomes self-concordant. The proof can follow the same path as [7, Lemma ].

Proposition 5. Let $f(x) := \frac{1}{n}\sum_{i=1}^n \varphi_i(a_i^{\top}x + b_i) + \psi(x)$, where $\psi(x) := \frac{1}{2}\langle Qx, x\rangle + c^{\top}x$ is a strongly convex quadratic function with $Q \in \mathcal{S}^p_{++}$. If $\varphi_i$ satisfies (4) for $i = 1, \dots, n$ with the same order $\nu \in (0, 3]$ and a constant $M_{\varphi_i} > 0$, then $f$ is $(\hat{M}_f, 3)$-generalized self-concordant in the sense of Definition 2 with

$$\hat{M}_f := \lambda_{\min}(Q)^{\frac{\nu-3}{2}}\max\left\{ M_{\varphi_i}\|a_i\|_2^{3-\nu} \mid 1 \leq i \leq n \right\}.$$

2.5 Fenchel's conjugate of generalized self-concordant functions

Primal-dual theory is fundamental in convex optimization. Hence, it is important to study the Fenchel conjugate of generalized self-concordant functions. Let $f : \mathbb{R}^p \to \mathbb{R}$ be an $(M_f, \nu)$-generalized self-concordant function. We consider Fenchel's conjugate of $f$ as

$$f^*(x) = \sup_{u}\left\{ \langle x, u\rangle - f(u) \mid u \in \mathrm{dom}(f) \right\}. \tag{9}$$

Since $f$ is proper, closed, and convex, $f^*$ is well-defined and also proper, closed, and convex. Moreover, since $f$ is smooth and convex, by Fermat's rule, if $u^*(x)$ satisfies $\nabla f(u^*(x)) = x$, then $f^*$ is well-defined at $x$. This shows that $\mathrm{dom}(f^*) = \{x \in \mathbb{R}^p \mid \nabla f(u^*(x)) = x \text{ is solvable}\}$.

Example 2. Let us look at some univariate functions. By using (9), we can directly show that:

1. If $\varphi(s) = \log(1 + e^s)$, then $\varphi^*(t) = t\log(t) + (1 - t)\log(1 - t)$.
2. If $\varphi(s) = s\log(s)$, then $\varphi^*(t) = e^{t-1}$.
3. If $\varphi(s) = e^s$, then $\varphi^*(t) = t\log(t) - t$.

Intuitively, these examples show that if $\varphi$ is generalized self-concordant, then its conjugate $\varphi^*$ is also generalized self-concordant. For more examples, we refer to [3, Chapter 3]. Let us generalize this result in the following proposition, whose proof is given in Appendix A.
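The closed-form conjugates in Example 2 can be sanity-checked numerically. The sketch below approximates $\varphi^*(t) = \sup_s \{ts - \varphi(s)\}$ by a plain grid search and compares it with the formulas of items 1 and 3. The search interval, grid size, and helper names are assumptions made for illustration only.

```python
import math

def conjugate_on_grid(phi, t, lo=-20.0, hi=20.0, n=40001):
    """Approximate phi*(t) = sup_s { t*s - phi(s) } by a grid search on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    best = -math.inf
    for i in range(n):
        s = lo + i * step
        best = max(best, t * s - phi(s))
    return best

def softplus(s):
    """phi(s) = log(1 + exp(s))."""
    return math.log1p(math.exp(s))

def softplus_conjugate(t):
    """Closed form from item 1, valid for 0 < t < 1."""
    return t * math.log(t) + (1.0 - t) * math.log(1.0 - t)

def exp_conjugate(t):
    """Closed form from item 3, valid for t > 0."""
    return t * math.log(t) - t

print(conjugate_on_grid(softplus, 0.3), softplus_conjugate(0.3))
print(conjugate_on_grid(math.exp, 2.0), exp_conjugate(2.0))
```

The grid search is only reliable when the maximizer ($s = \log\frac{t}{1-t}$ and $s = \log t$, respectively) lies well inside the search interval, which holds for the two values of $t$ used here.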

Proposition 6. If $f$ is $(M_f, \nu)$-generalized self-concordant in $\mathrm{dom}(f) \subseteq \mathbb{R}^p$ such that $\nabla^2 f(x) \succ 0$ for $x \in \mathrm{dom}(f)$, then the conjugate function $f^*$ of $f$ given by (9) is well-defined, and $(M_{f^*}, \nu_*)$-generalized self-concordant on

$$\mathrm{dom}(f^*) := \left\{ x \in \mathbb{R}^p \mid f(u) - \langle x, u\rangle \text{ is bounded from below on } \mathrm{dom}(f) \right\},$$

where $M_{f^*} = M_f$ and $\nu_* = 6 - \nu$, provided that $\nu \in [3, 6)$ if $p > 1$ and $\nu \in (0, 6)$ if $p = 1$. Moreover, we have $\nabla f^*(x) = u^*(x)$ and $\nabla^2 f^*(x) = \nabla^2 f(u^*(x))^{-1}$, where $u^*(x)$ is a unique solution of the maximization problem $\max_u\{\langle x, u\rangle - f(u) \mid u \in \mathrm{dom}(f)\}$ in (9) for any $x \in \mathrm{dom}(f^*)$.

Proposition 6 allows us to apply our generalized self-concordance theory in this paper to the dual problem of a convex problem involving generalized self-concordant functions, especially when the objective function of the primal problem is generalized self-concordant with $\nu \in (3, 4]$. The Fenchel conjugates are certainly useful when we develop optimization algorithms to solve constrained convex optimization involving generalized self-concordant functions, see, e.g., [65, 66].

2.6 Generalized self-concordant approximation of nonsmooth convex functions

Several well-known convex functions are nonsmooth. However, they can be approximated up to an arbitrary accuracy by a generalized self-concordant function via smoothing. Smoothing techniques clearly allow us to enrich the applicability of our theory to nonsmooth convex problems.

Given a proper, closed, possibly nonsmooth, and convex function $f : \mathbb{R}^p \to \mathbb{R} \cup \{+\infty\}$, one can smooth $f$ using the following Nesterov's smoothing technique [4]:

$$f_\gamma(x) := \sup_{u \in \mathrm{dom}(f^*)}\left\{ \langle x, u\rangle - f^*(u) - \gamma\omega(u) \right\}, \tag{10}$$

where $f^*$ is the Fenchel conjugate of $f$, $\omega : \mathrm{dom}(\omega) \subseteq \mathbb{R}^p \to \mathbb{R}$ is a smooth convex function such that $\mathrm{dom}(f^*) \subseteq \mathrm{dom}(\omega)$, and $\gamma > 0$ is called a smoothness parameter. In particular, if $f$ is Lipschitz continuous, then $\mathrm{dom}(f^*)$ is bounded [3]. Hence, the sup operator in (10) reduces to the max operator. Our goal is to choose an appropriate smoothing function $\omega$ such that the smoothed function $f_\gamma$ is well-defined and generalized self-concordant for any fixed smoothness parameter $\gamma > 0$.

Example 3. Let us provide a few examples with well-known nonsmooth convex functions:

(a) Consider the $\ell_1$-norm function $f(x) := \|x\|_1$ in $\mathbb{R}^p$. Then, it can be rewritten as

$$f(x) = \max_u\left\{ \langle x, u\rangle \mid \|u\|_\infty \leq 1 \right\} = \max_{u,v}\left\{ \langle x, u - v\rangle \mid \sum_{i=1}^p (u_i + v_i) = 1,\ u, v \in \mathbb{R}^p_+ \right\}.$$

We can smooth this function by $f_\gamma$ by choosing $\omega(u, v) := \ln(2p) + \sum_{i=1}^p (u_i\ln(u_i) + v_i\ln(v_i))$. In this case, we obtain

$$f_\gamma(x) = \gamma\ln\left(\sum_{i=1}^p \left(e^{x_i/\gamma} + e^{-x_i/\gamma}\right)\right) - \gamma\ln(2p).$$

This function is clearly generalized self-concordant with $\nu = 2$, see [63, Lemma 4]. However, if we choose $\omega(u) := p - \sum_{i=1}^p \sqrt{1 - u_i^2}$, then we get $f_\gamma(x) = \sum_{i=1}^p \sqrt{x_i^2 + \gamma^2} - \gamma p$. In this case, $f_\gamma$ is also generalized self-concordant with $\nu = \frac{8}{3}$ and $M_{f_\gamma} = 3\gamma^{-\frac{2}{3}}$.

(b) The hinge loss function $\varphi(t) := \max\{0, 1 - t\}$ can be written as $\varphi(t) = \frac{1}{2}|1 - t| + \frac{1}{2}(1 - t)$. Hence, we can smooth this function by

$$\varphi_\gamma(t) := \frac{\gamma}{2}\ln\left(\frac{e^{\frac{1-t}{\gamma}} + e^{-\frac{1-t}{\gamma}}}{2}\right) + \frac{1}{2}(1 - t)$$

with a smoothness parameter $\gamma > 0$. Clearly, $\varphi_\gamma$ is generalized self-concordant with $\nu = 2$.
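To make the second smoothing in Example 3(a) concrete, the sketch below evaluates $f_\gamma(x) = \sum_i \sqrt{x_i^2 + \gamma^2} - \gamma p$, measures its gap to $\|x\|_1$ (which lies in $[0, \gamma p]$, since $|x_i| - \gamma \leq \sqrt{x_i^2 + \gamma^2} - \gamma \leq |x_i|$), and checks the univariate inequality $|\varphi'''(t)| \leq 3\gamma^{-2/3}\varphi''(t)^{4/3}$ for $\varphi(t) = \sqrt{t^2 + \gamma^2}$ on a grid. The test vector, the value of $\gamma$, and the function names are illustrative assumptions.

```python
import math

def l1_norm(x):
    """The l1-norm of a list of floats."""
    return sum(abs(xi) for xi in x)

def smoothed_l1(x, gamma):
    """f_gamma(x) = sum_i sqrt(x_i^2 + gamma^2) - gamma * p."""
    return sum(math.sqrt(xi * xi + gamma * gamma) for xi in x) - gamma * len(x)

def gsc_ratio(t, gamma):
    """|phi'''(t)| / phi''(t)^(4/3) for phi(t) = sqrt(t^2 + gamma^2), i.e. nu = 8/3."""
    r2 = t * t + gamma * gamma
    d2 = gamma ** 2 / r2 ** 1.5              # phi''(t)
    d3 = -3.0 * gamma ** 2 * t / r2 ** 2.5   # phi'''(t)
    return abs(d3) / d2 ** (4.0 / 3.0)

x = [1.5, -0.2, 0.0, 3.0]                    # illustrative test vector
gamma = 0.05                                 # illustrative smoothness parameter
gap = l1_norm(x) - smoothed_l1(x, gamma)     # expected to lie in [0, gamma * p]
bound = 3.0 * gamma ** (-2.0 / 3.0)          # the constant M stated in Example 3(a)
worst = max(gsc_ratio(i / 100.0, gamma) for i in range(-1000, 1001))
print(gap, worst <= bound)
```

Smaller values of $\gamma$ tighten the approximation but increase the constant $3\gamma^{-2/3}$, which is the usual trade-off in smoothing.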

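In many practical problems, the conjugate $f^*$ of $f$ can be written as the sum $f^* = \varphi + \delta_{\mathcal{U}}$, where $\varphi$ is a generalized self-concordant function, and $\delta_{\mathcal{U}}$ is the indicator function of a given nonempty, closed, and convex set $\mathcal{U}$. In this case, $f_\gamma$ in (10) becomes

$$f_\gamma(x) := \sup_{u \in \mathcal{U}}\left\{ \langle x, u\rangle - \varphi(u) - \gamma\omega(u) \right\}. \tag{11}$$

If $\omega$ is a generalized self-concordant function such that $\nu_\varphi = \nu_\omega$, and $\mathcal{U} = \mathrm{dom}(\omega) \cap \mathrm{dom}(\varphi)$, then $f_\gamma$ is generalized self-concordant with $\nu_{f_\gamma} = 6 - \nu_\varphi$ as shown in Proposition 6.

2.7 Key bounds on Hessian map, gradient map, and function values

Now, we develop some key bounds on the local norms, Hessian map, gradient map, and function values of generalized self-concordant functions. In this subsection, we assume that the Hessian map $\nabla^2 f$ of $f$ is nondegenerate at any point in its domain. For this purpose, given $\nu \geq 2$, we define the following quantity for any $x, y \in \mathrm{dom}(f)$:

$$d_\nu(x, y) := \begin{cases} M_f\|y - x\|_2 & \text{if } \nu = 2, \\ \left(\frac{\nu}{2} - 1\right)M_f\|y - x\|_2^{3-\nu}\|y - x\|_x^{\nu-2} & \text{if } \nu > 2. \end{cases} \tag{12}$$

Here, if $\nu > 3$, then we require $x \neq y$. Otherwise, we set $d_\nu(x, y) := 0$ if $x = y$. In addition, we also define the function $\omega_\nu : \mathbb{R} \to \mathbb{R}_+$ as

$$\omega_\nu(\tau) := \begin{cases} \dfrac{1}{(1 - \tau)^{\frac{2}{\nu-2}}} & \text{if } \nu > 2, \\ e^{\tau} & \text{if } \nu = 2, \end{cases} \tag{13}$$

with $\mathrm{dom}(\omega_\nu) = (-\infty, 1)$ if $\nu > 2$, and $\mathrm{dom}(\omega_\nu) = \mathbb{R}$ if $\nu = 2$. We also adopt the Dikin ellipsoidal notion from [40] as $W^0(x; r) := \{y \in \mathbb{R}^p \mid d_\nu(x, y) < r\}$.

The next proposition provides a bound on the local norm defined by a generalized self-concordant function $f$. This bound is given for the local distances $\|y - x\|_x$ and $\|y - x\|_y$ between two points $x$ and $y$ in $\mathrm{dom}(f)$.

Proposition 7 (Bound of local norms). If $\nu > 2$, then, for any $x \in \mathrm{dom}(f)$, we have $W^0(x; 1) \subseteq \mathrm{dom}(f)$. For any $x, y \in \mathrm{dom}(f)$, let $d_\nu(x, y)$ be defined by (12), and $\omega_\nu(\cdot)$ be defined by (13). Then, we have

$$\omega_\nu(-d_\nu(x, y))^{\frac{1}{2}}\|y - x\|_x \leq \|y - x\|_y \leq \omega_\nu(d_\nu(x, y))^{\frac{1}{2}}\|y - x\|_x. \tag{14}$$

If $\nu > 2$, then the right-hand side inequality of (14) holds if $d_\nu(x, y) < 1$.

Proof. We first consider the case $\nu > 2$. Let $u \in \mathbb{R}^p$ and $u \neq 0$. Consider the following univariate function:

$$\phi(t) := \langle \nabla^2 f(x + tu)u, u\rangle^{1 - \frac{\nu}{2}} = \|u\|_{x+tu}^{2-\nu}.$$

It is easy to compute the derivative of this function, and obtain

$$\phi'(t) = \frac{2 - \nu}{2}\cdot\frac{\langle \nabla^3 f(x + tu)[u]u, u\rangle}{\langle \nabla^2 f(x + tu)u, u\rangle^{\frac{\nu}{2}}} = \frac{2 - \nu}{2}\cdot\frac{\langle \nabla^3 f(x + tu)[u]u, u\rangle}{\|u\|_{x+tu}^{\nu}}.$$

Using Definition 2 with $u = v$ and $x + tu$ instead of $x$, we have $|\phi'(t)| \leq \frac{\nu-2}{2}M_f\|u\|_2^{3-\nu}$. This implies that $\phi(t) \geq \phi(0) - \frac{\nu-2}{2}M_f\|u\|_2^{3-\nu}|t|$. On the other hand, we can see that $\mathrm{dom}(\phi) = \{t \in \mathbb{R} \mid \phi(t) > 0\}$. Hence, $\mathrm{dom}(\phi)$ contains the interval

$$\left(-\frac{2\phi(0)}{(\nu-2)M_f\|u\|_2^{3-\nu}},\ \frac{2\phi(0)}{(\nu-2)M_f\|u\|_2^{3-\nu}}\right).$$

Using this fact and the definition of $\phi$, we can show that $\mathrm{dom}(f)$ contains

$$\left\{ y := x + tu \ \Big|\ |t| < \frac{2\|u\|_x^{2-\nu}}{(\nu-2)M_f\|u\|_2^{3-\nu}} \right\}.$$

However, since $y - x = tu$, we have $\|y - x\|_2^{3-\nu}\|y - x\|_x^{\nu-2} = |t|\,\|u\|_2^{3-\nu}\|u\|_x^{\nu-2}$, so the condition $|t| < \frac{2\|u\|_x^{2-\nu}}{(\nu-2)M_f\|u\|_2^{3-\nu}}$ is equivalent to $d_\nu(x, y) < 1$. This shows that $W^0(x; 1) \subseteq \mathrm{dom}(f)$.

Since $\left|\int_0^1 \phi'(t)\,dt\right| \leq \int_0^1 |\phi'(t)|\,dt$, integrating $\phi'(t)$ over the interval $[0, 1]$ we get

$$\left|\|u\|_{x+u}^{2-\nu} - \|u\|_x^{2-\nu}\right| \leq \frac{\nu-2}{2}M_f\|u\|_2^{3-\nu}.$$

Using $u = y - x$ in the last inequality, we get $\left|\|y - x\|_y^{2-\nu} - \|y - x\|_x^{2-\nu}\right| \leq \frac{\nu-2}{2}M_f\|y - x\|_2^{3-\nu}$, which is equivalent to

$$\|y - x\|_x^{2-\nu}\left[1 - d_\nu(x, y)\right] \leq \|y - x\|_y^{2-\nu} \leq \|y - x\|_x^{2-\nu}\left[1 + d_\nu(x, y)\right],$$

given that $d_\nu(x, y) < 1$. Taking the power of $\frac{1}{\nu-2} > 0$ on both sides, we get (14) for the case $\nu > 2$.

Now, we consider the case $\nu = 2$. Let $0 \neq u \in \mathbb{R}^p$. We consider the following function:

$$\phi(t) := \ln\left(\langle \nabla^2 f(x + tu)u, u\rangle\right) = \ln\left(\|u\|_{x+tu}^2\right).$$

Clearly, it is easy to show that $\phi'(t) = \frac{\langle \nabla^3 f(x+tu)[u]u, u\rangle}{\langle \nabla^2 f(x+tu)u, u\rangle} = \frac{\langle \nabla^3 f(x+tu)[u]u, u\rangle}{\|u\|_{x+tu}^2}$. Using again Definition 2 with $u = v$ and $x + tu$ instead of $x$, we obtain $|\phi'(t)| \leq M_f\|u\|_2$. Since $\left|\int_0^1 \phi'(t)\,dt\right| \leq \int_0^1 |\phi'(t)|\,dt$, integrating $\phi'(t)$ over the interval $[0, 1]$ we get

$$\left|\ln\|u\|_{x+u}^2 - \ln\|u\|_x^2\right| \leq M_f\|u\|_2.$$

Substituting $u = y - x$ into this inequality, we get $\left|\ln\|y - x\|_y^2 - \ln\|y - x\|_x^2\right| \leq M_f\|y - x\|_2$. Hence, $\ln\|y - x\|_x^2 - M_f\|y - x\|_2 \leq \ln\|y - x\|_y^2 \leq \ln\|y - x\|_x^2 + M_f\|y - x\|_2$. This inequality leads to (14) for the case $\nu = 2$.

Next, we develop new bounds for the Hessian map of $f$ in the following proposition.

Proposition 8 (Bounds of Hessian map). For any $x, y \in \mathrm{dom}(f)$, let $d_\nu(x, y)$ be defined by (12), and $\omega_\nu(\cdot)$ be defined by (13). Then, we have

$$\left[1 - d_\nu(x, y)\right]^{\frac{2}{\nu-2}}\nabla^2 f(x) \preceq \nabla^2 f(y) \preceq \left[1 - d_\nu(x, y)\right]^{-\frac{2}{\nu-2}}\nabla^2 f(x) \quad \text{if } \nu > 2, \tag{15}$$

$$e^{-d_\nu(x, y)}\nabla^2 f(x) \preceq \nabla^2 f(y) \preceq e^{d_\nu(x, y)}\nabla^2 f(x) \quad \text{if } \nu = 2, \tag{16}$$

where (15) holds if $d_\nu(x, y) < 1$ for the case $\nu > 2$.

Proof. Let $\nu > 2$ and $0 \neq u \in \mathbb{R}^p$. Consider the following univariate function on $[0, 1]$:

$$\psi(t) := \langle \nabla^2 f(x + t(y - x))u, u\rangle, \quad t \in [0, 1].$$

If we denote by $y_t := x + t(y - x)$, then $y_t - x = t(y - x)$, $\psi(t) = \|u\|_{y_t}^2$, and $\psi'(t) = \langle \nabla^3 f(y_t)[y - x]u, u\rangle$. By Definition 2, we have

$$|\psi'(t)| \leq M_f\|u\|_{y_t}^2\|y - x\|_{y_t}^{\nu-2}\|y - x\|_2^{3-\nu} = M_f\psi(t)\left[\frac{\|y_t - x\|_{y_t}}{t}\right]^{\nu-2}\|y - x\|_2^{3-\nu},$$

which implies

$$\left|\frac{d\ln\psi(t)}{dt}\right| \leq M_f\left[\frac{\|y_t - x\|_{y_t}}{t}\right]^{\nu-2}\|y - x\|_2^{3-\nu}. \tag{17}$$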

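Assume that $d_\nu(x, y) < 1$. Then, by the definition of $y_t$ and $d_\nu(\cdot)$, we have $d_\nu(x, y_t) = t\,d_\nu(x, y)$ and $\|y_t - x\|_x = t\|y - x\|_x$. Using Proposition 7, we can derive

$$\left[\frac{\|y_t - x\|_{y_t}}{t}\right]^{\nu-2} \leq \frac{1}{t^{\nu-2}}\cdot\frac{\|y_t - x\|_x^{\nu-2}}{1 - d_\nu(x, y_t)} = \frac{\|y - x\|_x^{\nu-2}}{1 - t\,d_\nu(x, y)}.$$

Integrating $\frac{d\ln\psi(t)}{dt}$ with respect to $t$ on $[0, 1]$ and using the last inequality and (17), we get

$$\left|\int_0^1 \frac{d\ln\psi(t)}{dt}\,dt\right| \leq \int_0^1 \left|\frac{d\ln\psi(t)}{dt}\right| dt \leq M_f\|y - x\|_x^{\nu-2}\|y - x\|_2^{3-\nu}\int_0^1 \frac{dt}{1 - t\,d_\nu(x, y)}.$$

Clearly, we can compute this integral explicitly as

$$\left|\ln\left[\frac{\|u\|_y^2}{\|u\|_x^2}\right]\right| = \left|\ln\left[\frac{\psi(1)}{\psi(0)}\right]\right| \leq -\frac{2}{\nu-2}\ln\left[1 - d_\nu(x, y)\right] = \ln\left[1 - d_\nu(x, y)\right]^{-\frac{2}{\nu-2}}.$$

Rearranging this inequality, we obtain

$$\left[1 - d_\nu(x, y)\right]^{\frac{2}{\nu-2}} \leq \frac{\|u\|_y^2}{\|u\|_x^2} \equiv \frac{\langle \nabla^2 f(y)u, u\rangle}{\langle \nabla^2 f(x)u, u\rangle} \leq \left[1 - d_\nu(x, y)\right]^{-\frac{2}{\nu-2}}.$$

Since this inequality holds for any $0 \neq u \in \mathbb{R}^p$, it implies (15). If $u = 0$, then (15) obviously holds.

Now, we consider the case $\nu = 2$. It follows from (17) that

$$\left|\ln\left[\frac{\|u\|_y^2}{\|u\|_x^2}\right]\right| = \left|\int_0^1 \frac{d\ln\psi(t)}{dt}\,dt\right| \leq \int_0^1 \left|\frac{d\ln\psi(t)}{dt}\right| dt \leq M_f\|y - x\|_2\int_0^1 dt = M_f\|y - x\|_2.$$

Since this inequality holds for any $u \in \mathbb{R}^p$, it implies (16).

The following corollary provides a bound on the mean of the Hessian map $G(x, y) := \int_0^1 \nabla^2 f(x + \tau(y - x))\,d\tau$, whose proof is moved to Appendix A.

Corollary 2. For any $x, y \in \mathrm{dom}(f)$, let $d_\nu(x, y)$ be defined by (12). Then, we have

$$\underline{\kappa}_\nu(d_\nu(x, y))\nabla^2 f(x) \preceq \int_0^1 \nabla^2 f(x + \tau(y - x))\,d\tau \preceq \overline{\kappa}_\nu(d_\nu(x, y))\nabla^2 f(x), \tag{18}$$

where

$$\underline{\kappa}_\nu(t) := \begin{cases} \dfrac{1 - e^{-t}}{t} & \text{if } \nu = 2, \\[2mm] \dfrac{\nu-2}{\nu t}\left[1 - (1 - t)^{\frac{\nu}{\nu-2}}\right] & \text{if } \nu > 2, \end{cases}
\qquad
\overline{\kappa}_\nu(t) := \begin{cases} \dfrac{e^{t} - 1}{t} & \text{if } \nu = 2, \\[2mm] -\dfrac{\ln(1 - t)}{t} & \text{if } \nu = 4, \\[2mm] \dfrac{\nu-2}{(\nu-4)t}\left[1 - (1 - t)^{\frac{\nu-4}{\nu-2}}\right] & \text{if } \nu > 2 \text{ and } \nu \neq 4. \end{cases}$$

Here, if $\nu > 2$, then we require $d_\nu(x, y)$ to satisfy $d_\nu(x, y) < 1$ for $x, y \in \mathrm{dom}(f)$ in (18).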
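The bounds (15) and (18) can be illustrated on a univariate example. The sketch below assumes $\varphi(t) = -\log(t)$ on $t > 0$, which satisfies $|\varphi'''| \leq 2(\varphi'')^{3/2}$ (so $\nu = 3$ and $M = 2$), and uses the exact mean Hessian $\int_0^1 \varphi''(x + \tau(y - x))\,d\tau = \frac{1}{xy}$. The choice of function, the points $x$ and $y$, and the helper names are assumptions for illustration.

```python
import math

def hessian_factors(nu, d):
    """Multipliers (lower, upper) of the Hessian at x in (15)-(16)."""
    if nu == 2:
        return math.exp(-d), math.exp(d)
    e = 2.0 / (nu - 2.0)
    return (1.0 - d) ** e, (1.0 - d) ** (-e)

def kappa_factors(nu, d):
    """Multipliers (lower, upper) in (18); needs d > 0, and d < 1 when nu > 2."""
    if nu == 2:
        return (1.0 - math.exp(-d)) / d, (math.exp(d) - 1.0) / d
    a = nu / (nu - 2.0)
    lower = (1.0 - (1.0 - d) ** a) / (a * d)
    if nu == 4:
        return lower, -math.log(1.0 - d) / d
    b = (nu - 4.0) / (nu - 2.0)
    return lower, (1.0 - (1.0 - d) ** b) / (b * d)

# Illustrative data: phi(t) = -log(t), nu = 3, M = 2, two points in the domain.
x, y, nu, M = 2.0, 2.5, 3, 2.0
hess_x, hess_y = 1.0 / x ** 2, 1.0 / y ** 2
local_dist = abs(y - x) * math.sqrt(hess_x)   # ||y - x||_x
d = (nu / 2.0 - 1.0) * M * abs(y - x) ** (3 - nu) * local_dist ** (nu - 2)
mean_hess = 1.0 / (x * y)                     # exact integral of phi'' between x and y

lo, hi = hessian_factors(nu, d)
k_lo, k_hi = kappa_factors(nu, d)
print(lo * hess_x <= hess_y <= hi * hess_x)
print(k_lo * hess_x <= mean_hess <= k_hi * hess_x)
```

For this example $d_\nu(x, y) = |y - x|/x = 0.25$, so the requirement $d_\nu(x, y) < 1$ is met.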

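We prove a bound on the gradient inner product of a generalized self-concordant function $f$.

Proposition 9 (Bounds of gradient map). For any $x, y \in \mathrm{dom}(f)$, we have

$$\bar{\omega}_\nu(-d_\nu(x, y))\|y - x\|_x^2 \leq \langle \nabla f(y) - \nabla f(x), y - x\rangle \leq \bar{\omega}_\nu(d_\nu(x, y))\|y - x\|_x^2, \tag{19}$$

where, if $\nu > 2$, then the right-hand side inequality of (19) holds if $d_\nu(x, y) < 1$, and

$$\bar{\omega}_\nu(\tau) := \begin{cases} \dfrac{e^{\tau} - 1}{\tau} & \text{if } \nu = 2, \\[2mm] -\dfrac{\ln(1 - \tau)}{\tau} & \text{if } \nu = 4, \\[2mm] \dfrac{\nu-2}{\nu-4}\cdot\dfrac{1 - (1 - \tau)^{\frac{\nu-4}{\nu-2}}}{\tau} & \text{otherwise}. \end{cases} \tag{20}$$

Here, $\bar{\omega}_\nu(\tau) \geq 0$ for all $\tau \in \mathrm{dom}(\bar{\omega}_\nu)$.

Proof. Let $y_t := x + t(y - x)$. By the mean-value theorem, we have

$$\langle \nabla f(y) - \nabla f(x), y - x\rangle = \int_0^1 \langle \nabla^2 f(y_t)(y - x), y - x\rangle\,dt = \int_0^1 \frac{1}{t^2}\|y_t - x\|_{y_t}^2\,dt. \tag{21}$$

We consider the function $\omega_\nu$ defined by (13). It follows from Proposition 7 that

$$\omega_\nu(-d_\nu(x, y_t))\|y_t - x\|_x^2 \leq \|y_t - x\|_{y_t}^2 \leq \omega_\nu(d_\nu(x, y_t))\|y_t - x\|_x^2.$$

Now, we note that $d_\nu(x, y_t) = t\,d_\nu(x, y)$ and $\|y_t - x\|_x = t\|y - x\|_x$, so the last estimate leads to

$$\omega_\nu(-t\,d_\nu(x, y))\|y - x\|_x^2 \leq \frac{1}{t^2}\|y_t - x\|_{y_t}^2 \leq \omega_\nu(t\,d_\nu(x, y))\|y - x\|_x^2.$$

Substituting this estimate into (21), we obtain

$$\|y - x\|_x^2\int_0^1 \omega_\nu(-t\,d_\nu(x, y))\,dt \leq \langle \nabla f(y) - \nabla f(x), y - x\rangle \leq \|y - x\|_x^2\int_0^1 \omega_\nu(t\,d_\nu(x, y))\,dt.$$

Using the function $\omega_\nu(\tau)$ from (13) to compute the left-hand side and the right-hand side integrals, we obtain (19).

Finally, we prove a bound on the function values of an $(M_f, \nu)$-generalized self-concordant function $f$ in the following proposition.

Proposition 10 (Bounds of function values). For any $x, y \in \mathrm{dom}(f)$, we have

$$\bar{\bar{\omega}}_\nu(-d_\nu(x, y))\|y - x\|_x^2 \leq f(y) - f(x) - \langle \nabla f(x), y - x\rangle \leq \bar{\bar{\omega}}_\nu(d_\nu(x, y))\|y - x\|_x^2, \tag{22}$$

where, if $\nu > 2$, then the right-hand side inequality of (22) holds if $d_\nu(x, y) < 1$. Here, $d_\nu(x, y)$ is defined by (12) and $\bar{\bar{\omega}}_\nu$ is defined by

$$\bar{\bar{\omega}}_\nu(\tau) := \begin{cases} \dfrac{e^{\tau} - \tau - 1}{\tau^2} & \text{if } \nu = 2, \\[2mm] \dfrac{-\tau - \ln(1 - \tau)}{\tau^2} & \text{if } \nu = 3, \\[2mm] \dfrac{(1 - \tau)\ln(1 - \tau) + \tau}{\tau^2} & \text{if } \nu = 4, \\[2mm] \dfrac{\nu-2}{4-\nu}\cdot\dfrac{1}{\tau}\left[\dfrac{\nu-2}{2(3-\nu)\tau}\left((1 - \tau)^{\frac{2(3-\nu)}{2-\nu}} - 1\right) - 1\right] & \text{otherwise}. \end{cases} \tag{23}$$

Note that $\bar{\bar{\omega}}_\nu(\tau) \geq 0$ for all $\tau \in \mathrm{dom}(\bar{\bar{\omega}}_\nu)$.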
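As a numerical illustration of the bounds (19) and (22) in the case $\nu = 2$, the sketch below uses the univariate logistic loss $\varphi(t) = \log(1 + e^{-t})$, assuming the constant $M = 1$ (since $|\varphi'''| \leq \varphi''$), so that $d_\nu(x, y) = |y - x|$. The two test points and all function names are illustrative assumptions.

```python
import math

def omega_bar(tau):
    """(exp(tau) - 1) / tau: the nu = 2 case of (20), for tau != 0."""
    return (math.exp(tau) - 1.0) / tau

def omega_bar_bar(tau):
    """(exp(tau) - tau - 1) / tau^2: the nu = 2 case of (23), for tau != 0."""
    return (math.exp(tau) - tau - 1.0) / (tau * tau)

def phi(t):
    """Logistic loss log(1 + exp(-t))."""
    return math.log1p(math.exp(-t))

def dphi(t):
    """First derivative of the logistic loss."""
    return -1.0 / (1.0 + math.exp(t))

def d2phi(t):
    """Second derivative of the logistic loss."""
    s = 1.0 / (1.0 + math.exp(-t))
    return s * (1.0 - s)

x, y = 0.5, 2.0                                   # illustrative points
d = abs(y - x)                                    # d_nu(x, y) with M = 1, nu = 2
local_sq = (y - x) ** 2 * d2phi(x)                # ||y - x||_x^2
grad_gap = (dphi(y) - dphi(x)) * (y - x)          # middle term of (19)
func_gap = phi(y) - phi(x) - dphi(x) * (y - x)    # middle term of (22)

print(omega_bar(-d) * local_sq <= grad_gap <= omega_bar(d) * local_sq)
print(omega_bar_bar(-d) * local_sq <= func_gap <= omega_bar_bar(d) * local_sq)
```

Both lower bounds are strictly positive here, which is what makes them usable in descent arguments.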

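Proof. For any $x, y \in \mathrm{dom}(f)$, let $y_t := x + t(y - x)$. Then, $y_t - x = t(y - x)$. By the mean-value theorem, we have

$$f(y) - f(x) - \langle \nabla f(x), y - x\rangle = \int_0^1 \frac{1}{t}\langle \nabla f(y_t) - \nabla f(x), y_t - x\rangle\,dt.$$

Now, by Proposition 9, we have

$$\bar{\omega}_\nu(-d_\nu(x, y_t))\|y_t - x\|_x^2 \leq \langle \nabla f(y_t) - \nabla f(x), y_t - x\rangle \leq \bar{\omega}_\nu(d_\nu(x, y_t))\|y_t - x\|_x^2.$$

Clearly, by the definition, we have $d_\nu(x, y_t) = t\,d_\nu(x, y)$ and $\|y_t - x\|_x = t\|y - x\|_x$. Combining these relations and the above two inequalities, we can show that

$$\|y - x\|_x^2\int_0^1 t\,\bar{\omega}_\nu(-t\,d_\nu(x, y))\,dt \leq f(y) - f(x) - \langle \nabla f(x), y - x\rangle \leq \|y - x\|_x^2\int_0^1 t\,\bar{\omega}_\nu(t\,d_\nu(x, y))\,dt.$$

By integrating the left-hand side and the right-hand side of this estimate using the definition (20) of $\bar{\omega}_\nu(\tau)$, we obtain (22).

3 Generalized self-concordant minimization

We apply the theory developed in the previous sections to design new Newton-type methods to minimize a generalized self-concordant function. More precisely, we consider the following noncomposite convex problem:

$$f^\star := \min_{x \in \mathbb{R}^p} f(x), \tag{24}$$

where $f : \mathbb{R}^p \to \mathbb{R}$ is an $(M_f, \nu)$-generalized self-concordant function in the sense of Definition 2 with $\nu \in [2, 3]$ and $M_f \geq 0$. Since $f$ is smooth and convex, the optimality condition $\nabla f(x^\star) = 0$ is necessary and sufficient for $x^\star$ to be an optimal solution of (24).

The following theorem shows the existence and uniqueness of the solution $x^\star$ of (24). It can be considered as a special case of Theorem 4 below with $g \equiv 0$.

Theorem 1. Suppose that $f \in \widetilde{\mathcal{F}}_{M_f,\nu}(\mathrm{dom}(f))$ for given parameters $M_f > 0$ and $\nu \in [2, 3]$. Denote by $\sigma_{\min}(x) := \lambda_{\min}(\nabla^2 f(x))$ and $\lambda(x) := \|\nabla f(x)\|_x^*$ for $x \in \mathrm{dom}(f)$. Suppose further that there exists $x \in \mathrm{dom}(f)$ such that $\sigma_{\min}(x) > 0$ and

$$\lambda(x) < \frac{2[\sigma_{\min}(x)]^{\frac{3-\nu}{2}}}{(4 - \nu)M_f}.$$

Then, problem (24) has a unique solution $x^\star$ in $\mathrm{dom}(f)$.

We say that the unique solution $x^\star$ of (24) is strongly regular if $\nabla^2 f(x^\star) \succ 0$. The strong regularity of $x^\star$ for (24) is equivalent to the strong second order optimality condition. Theorem 1 covers [40, Theorem 4] for standard self-concordant functions as a special case.

We consider the following Newton-type scheme to solve (24). Starting from an arbitrary initial point $x^0 \in \mathrm{dom}(f)$, we generate a sequence $\{x^k\}_{k \geq 0}$ as follows:

$$x^{k+1} := x^k + \tau_k n_{\mathrm{nt}}^k, \quad \text{where } n_{\mathrm{nt}}^k := -\nabla^2 f(x^k)^{-1}\nabla f(x^k), \tag{25}$$

and $\tau_k \in (0, 1]$ is a given step-size. We call $n_{\mathrm{nt}}^k$ a Newton direction. If $\tau_k = 1$ for all $k \geq 0$, then we call (25) a full-step Newton scheme. Otherwise, i.e., $\tau_k \in (0, 1)$, we call (25) a damped-step Newton scheme.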
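The existence condition of Theorem 1 is easy to evaluate once $\lambda(x)$ and $\sigma_{\min}(x)$ are available. The sketch below does so for the univariate function $f(x) = x - \log(x)$ on $x > 0$, an illustrative choice that is standard self-concordant ($\nu = 3$, $M_f = 2$) with minimizer $x^\star = 1$ and $\lambda(x) = |x - 1|$. The function names are hypothetical.

```python
import math

def existence_rhs(sigma_min, nu, M):
    """Right-hand side 2 * sigma_min^((3 - nu)/2) / ((4 - nu) * M) of the condition in Theorem 1."""
    return 2.0 * sigma_min ** ((3.0 - nu) / 2.0) / ((4.0 - nu) * M)

def decrement(x):
    """lambda(x) = |f'(x)| / sqrt(f''(x)) for f(x) = x - log(x), x > 0."""
    grad = 1.0 - 1.0 / x
    hess = 1.0 / (x * x)
    return abs(grad) / math.sqrt(hess)

def condition_holds(x, nu=3, M=2.0):
    """Check sigma_min(x) > 0 and lambda(x) < bound at the point x."""
    sigma_min = 1.0 / (x * x)   # in one dimension the Hessian is its own eigenvalue
    return sigma_min > 0.0 and decrement(x) < existence_rhs(sigma_min, nu, M)

print(condition_holds(1.5))   # lambda = 0.5, bound = 1
print(condition_holds(3.0))   # lambda = 2.0, bound = 1
```

The condition is only sufficient: it fails at $x = 3$ even though the minimizer exists, and it is enough that it holds at a single point such as $x = 1.5$.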

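Clearly, computing the Newton direction $n_{\mathrm{nt}}^k$ requires solving the following linear system:

$$\nabla^2 f(x^k)n_{\mathrm{nt}}^k = -\nabla f(x^k). \tag{26}$$

Next, we define a Newton decrement $\lambda_k$ and a quantity $\beta_k$, respectively, as

$$\lambda_k := \|n_{\mathrm{nt}}^k\|_{x^k} = \|\nabla f(x^k)\|_{x^k}^* \quad \text{and} \quad \beta_k := M_f\|n_{\mathrm{nt}}^k\|_2 = M_f\|\nabla^2 f(x^k)^{-1}\nabla f(x^k)\|_2. \tag{27}$$

With $\lambda_k$ and $\beta_k$ given by (27), we also define

$$d_k := \begin{cases} \beta_k & \text{if } \nu = 2, \\ \left(\frac{\nu}{2} - 1\right)(M_f\lambda_k)^{\nu-2}\beta_k^{3-\nu} & \text{if } \nu \in (2, 3]. \end{cases} \tag{28}$$

Let us first show how to choose a suitable step-size $\tau_k$ in the damped-step Newton scheme and prove its convergence properties in the following theorem, whose proof can be found in Appendix A.5.

Theorem 2. Let $\{x^k\}$ be the sequence generated by the damped-step Newton scheme (25) with the following step-size:

$$\tau_k := \begin{cases} \dfrac{1}{\beta_k}\ln(1 + \beta_k) & \text{if } \nu = 2, \\[2mm] \dfrac{1}{d_k}\left[1 - \left(1 + \dfrac{4-\nu}{\nu-2}d_k\right)^{-\frac{\nu-2}{4-\nu}}\right] & \text{if } \nu \in (2, 3], \end{cases} \tag{29}$$

where $\lambda_k$ and $\beta_k$ are defined by (27), and $d_k$ is defined by (28). Then, $\tau_k \in (0, 1]$, $\{x^k\}$ stays in $\mathrm{dom}(f)$, and this step-size guarantees the following descent property:

$$f(x^{k+1}) \leq f(x^k) - \Delta_k, \tag{30}$$

where $\Delta_k := \lambda_k^2\tau_k - \bar{\bar{\omega}}_\nu(\tau_k d_k)\tau_k^2\lambda_k^2 > 0$ with $\bar{\bar{\omega}}_\nu$ defined by (23).

Assume that the unique solution $x^\star$ of (24) exists. Then, there exists a neighborhood $\mathcal{N}(x^\star)$ such that if we initialize the Newton scheme (25) at $x^0 \in \mathcal{N}(x^\star) \cap \mathrm{dom}(f)$, then the whole sequence $\{x^k\}$ converges to $x^\star$ at a quadratic rate.

Example 4 (Better step-size for regularized logistic and exponential models). Consider the minimization problem (24) with the objective function $f(\cdot) := \phi(\cdot) + \frac{\gamma}{2}\|\cdot\|_2^2$, where $\phi$ is defined as in (8) with $\varphi_i(t) = \log(1 + e^{-t})$ being the logistic loss. That is,

$$f(x) := \frac{1}{n}\sum_{i=1}^n \log\left(1 + e^{-a_i^{\top}x}\right) + \frac{\gamma}{2}\|x\|_2^2.$$

As shown in Section 2, $f$ is either generalized self-concordant with $\nu = 2$ or generalized self-concordant with $\nu = 3$, but with a different constant $M_f$. Let us define $R_A := \max\{\|a_i\|_2 \mid 1 \leq i \leq n\}$. Then, if we consider $\nu = 2$, we have $M_f^{(2)} = R_A$ due to Corollary 1, while if we choose $\nu = 3$, then $M_f^{(3)} = \frac{1}{\sqrt{\gamma}}R_A$ due to Proposition 4. By the definition of $f$, we have $\nabla^2 f(x) \succeq \gamma\mathbb{I}$. Hence, using this inequality and the definition of $\lambda_k$ and $\beta_k$ from (27), we can show that

$$\beta_k = M_f^{(2)}\|\nabla^2 f(x^k)^{-1}\nabla f(x^k)\|_2 \leq \frac{R_A}{\sqrt{\gamma}}\lambda_k = M_f^{(3)}\lambda_k. \tag{31}$$

For any $\tau > 0$, we have $\frac{\ln(1+\tau)}{\tau} > \frac{1}{1 + 0.5\tau}$. Using this elementary result and (31), we obtain

$$\tau_k^{(2)} = \frac{\ln(1 + \beta_k)}{\beta_k} > \frac{1}{1 + 0.5\beta_k} \geq \frac{1}{1 + 0.5M_f^{(3)}\lambda_k} = \tau_k^{(3)}.$$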

This inequality shows that the step-size $\tau_k$ given by Theorem 2 satisfies $\tau_k^{(2)} > \tau_k^{(3)}$, where $\tau_k^{(\nu)}$ is the step-size computed by (29) for $\nu = 2$ and $\nu = 3$, respectively. This confirms that the damped-step Newton method using $\tau_k^{(2)}$ is theoretically better than using $\tau_k^{(3)}$. This result will be empirically confirmed by our experiments in Section 6.

Next, we study the full-step Newton scheme derived from (25) by setting the step-size $\tau_k = 1$ for all $k \geq 0$. Let $\sigma_k := \lambda_{\min}(\nabla^2 f(x^k))$ be the smallest eigenvalue of $\nabla^2 f(x^k)$. Since $\nabla^2 f(x^k) \succ 0$, we have $\sigma_k > 0$. The following theorem shows a local quadratic convergence of the full-step Newton scheme (25) for solving (24), whose proof can be found in Appendix A.6.

Theorem 3. Let $\{x^k\}$ be the sequence generated by the full-step Newton scheme (25) by setting the step-size $\tau_k = 1$ for $k \geq 0$. Let $d_\nu^k := d_\nu(x^k, x^{k+1})$ be defined by (12) and $\lambda_k$ be defined by (27). Then, the following statements hold:

(a) If $\nu = 2$ and the starting point $x^0$ satisfies $\sigma_0^{-1/2}\lambda_0 < \frac{d_2^\star}{M_f}$, then both sequences $\{\sigma_k^{-1/2}\lambda_k\}$ and $\{d_2^k\}$ decrease and quadratically converge to zero, where $d_2^\star \approx 0.12964$.

(b) If $2 < \nu < 3$, and the starting point $x^0$ satisfies $\sigma_0^{-\frac{3-\nu}{2}}\lambda_0 < \frac{1}{M_f}\min\left\{\frac{2d_\nu^\star}{\nu-2}, \frac{1}{2}\right\}$, then both sequences $\{\sigma_k^{-\frac{3-\nu}{2}}\lambda_k\}$ and $\{d_\nu^k\}$ decrease and quadratically converge to zero, where $d_\nu^\star$ is the unique solution of the equation $(\nu - 2)R_\nu(d_\nu) = 4(1 - d_\nu)^{\frac{4-\nu}{\nu-2}}$ in $d_\nu$ with $R_\nu(\cdot)$ given by (56).

(c) If $\nu = 3$ and the starting point $x^0$ satisfies $\lambda_0 < \frac{1}{2M_f}$, then the sequence $\{\lambda_k\}$ decreases and quadratically converges to zero.

As a consequence, if $\{d_\nu^k\}$ locally converges to zero at a quadratic rate, then $\{\|x^k - x^\star\|_{H_k}\}$ also locally converges to zero at a quadratic rate, where $H_k = \mathbb{I}$, the identity matrix, if $\nu = 2$; $H_k = \nabla^2 f(x^k)$ if $\nu = 3$; and $H_k = \nabla^2 f(x^k)^{\frac{\nu}{2}-1}$ if $2 < \nu < 3$. Hence, $\{x^k\}$ locally converges to $x^\star$, the unique solution of (24), at a quadratic rate.

If we combine the results of Theorem 2 and Theorem 3, then we can design a two-phase Newton algorithm for solving (24) as follows:

Phase 1: Starting from an arbitrary initial point $x^0 \in \mathrm{dom}(f)$, we perform the damped-step Newton scheme (25) until the condition in Theorem 3 is satisfied.

Phase 2: Using the output $x^j$ of Phase 1 as an initial point for the full-step Newton scheme (25) with $\tau_k = 1$, we perform this scheme until it achieves an $\varepsilon$-solution $x^k$ to (24).

We also note that the damped-step Newton scheme (25) can also achieve a local quadratic convergence as shown in Theorem 2. Hence, we combine this fact and the above two-phase scheme to derive the Newton algorithm as shown in Algorithm 1 below.

Per-iteration complexity: The main step of Algorithm 1 is the solution of the symmetric positive definite linear system (26). This system can be solved by using either Cholesky factorization or conjugate gradient methods which, in the worst case, require $\mathcal{O}(p^3)$ operations. Computing $\lambda_k$ requires the inner product $\langle n_{\mathrm{nt}}^k, \nabla f(x^k)\rangle$, which needs $\mathcal{O}(p)$ operations.

Conceptually, the two-phase option of Algorithm 1 requires the smallest eigenvalue of $\nabla^2 f(x^k)$ to terminate Phase 1. However, switching from Phase 1 to Phase 2 can be done automatically by allowing some tolerance in the step-size $\tau_k$. Indeed, the step-size $\tau_k$ given by (29) converges to $1$ as $k \to \infty$. Hence, when $\tau_k$ is close to $1$, e.g., $\tau_k \geq 0.9$, we can automatically set it to $1$ and remove the computation of $\lambda_k$ to reduce the computational time.
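The scheme described above can be sketched for the regularized logistic model of Example 4 in one dimension ($p = 1$), where the linear system (26) reduces to a scalar division. This is a minimal illustration under stated assumptions, not the authors' implementation: the data `a`, the value of `gamma`, the starting point, and the function names are hypothetical, the switch to a full step uses the $\tau_k \geq 0.9$ tolerance mentioned above rather than the exact condition of Theorem 3, and the constants $M_f^{(2)} = R_A$ and $M_f^{(3)} = R_A/\sqrt{\gamma}$ are taken from Example 4.

```python
import math

def grad_hess(a, gamma, x):
    """f'(x), f''(x) for f(x) = mean_i log(1 + exp(-a_i x)) + (gamma/2) x^2."""
    n = len(a)
    s = [1.0 / (1.0 + math.exp(-ai * x)) for ai in a]
    grad = sum(-ai * (1.0 - si) for ai, si in zip(a, s)) / n + gamma * x
    hess = sum(ai * ai * si * (1.0 - si) for ai, si in zip(a, s)) / n + gamma
    return grad, hess

def damped_step(nu, lam, beta, M):
    """Step-size (29) with d_k from (28); requires lam > 0 and beta > 0."""
    if nu == 2:
        return math.log1p(beta) / beta
    d = (nu / 2.0 - 1.0) * (M * lam) ** (nu - 2.0) * beta ** (3.0 - nu)
    e = (nu - 2.0) / (4.0 - nu)
    return (1.0 - (1.0 + d / e) ** (-e)) / d

def newton(a, gamma, x0, nu=2, eps=1e-10, switch=0.9, max_iter=500):
    """Damped-step Newton scheme with an automatic switch to full steps (nu is 2 or 3)."""
    r_a = max(abs(ai) for ai in a)
    M = r_a if nu == 2 else r_a / math.sqrt(gamma)   # constants from Example 4
    x = x0
    for k in range(max_iter):
        grad, hess = grad_hess(a, gamma, x)
        direction = -grad / hess                     # scalar version of system (26)
        lam = abs(grad) / math.sqrt(hess)            # Newton decrement from (27)
        if lam <= eps:
            return x, k
        tau = damped_step(nu, lam, M * abs(direction), M)
        if tau >= switch:                            # heuristic switch to a full step
            tau = 1.0
        x += tau * direction
    return x, max_iter

a, gamma = [1.0, -2.0, 0.5, 3.0], 0.1                # hypothetical data
x2, k2 = newton(a, gamma, 5.0, nu=2)
x3, k3 = newton(a, gamma, 5.0, nu=3)
print(x2, k2, x3, k3)
```

Running the scheme with both parameterizations gives a small empirical view of the comparison in Example 4; in higher dimensions the scalar division would be replaced by a Cholesky or conjugate gradient solve of (26).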

Algorithm 1 (Newton algorithm for generalized self-concordant minimization)

Inputs: Choose an arbitrary initial point $x^0 \in \mathrm{dom}(f)$ and a desired accuracy $\varepsilon > 0$.
Output: An $\varepsilon$-solution $x^k$ of (24).
Initialization: Compute $d_\nu^\star$ according to Theorem 3 if needed.
For $k = 0, \dots, k_{\max}$, perform:
    Compute the Newton direction $n_{\mathrm{nt}}^k$ by solving $\nabla^2 f(x^k)n_{\mathrm{nt}}^k = -\nabla f(x^k)$.
    Compute $\lambda_k := \|n_{\mathrm{nt}}^k\|_{x^k}$, and compute $\beta_k := M_f\|n_{\mathrm{nt}}^k\|_2$ if $\nu \neq 3$.
    If $\lambda_k \leq \varepsilon$, then TERMINATE and return $x^k$.
    If Phase 2 is used, then compute $\sigma_k = \lambda_{\min}(\nabla^2 f(x^k))$ if $\nu < 3$.
    If Phase 2 is used and $(\lambda_k, \sigma_k)$ satisfies Theorem 3, then set $\tau_k := 1$ (full-step). Otherwise, compute the step-size $\tau_k$ by (29) (damped-step).
    Update $x^{k+1} := x^k + \tau_k n_{\mathrm{nt}}^k$.
End for

In the one-phase option, we can always perform only Phase 1 until achieving an $\varepsilon$-optimal solution as shown in Theorem 2. Therefore, the per-iteration complexity of Algorithm 1 is $\mathcal{O}(p^3) + \mathcal{O}(p)$ in the worst case. A careful implementation of conjugate gradient methods with a warm start can significantly reduce this per-iteration computational complexity.

Remark 3 (Inexact Newton methods). We can allow Algorithm 1 to compute the Newton direction $n_{\mathrm{nt}}^k$ approximately. In this case, we approximately solve the symmetric positive definite system (26). By an appropriate choice of stopping criterion, we can still prove convergence of Algorithm 1 under inexact computation of $n_{\mathrm{nt}}^k$. For instance, the following criterion is often used in inexact Newton methods [6], but defined via the local dual norm of $f$:

$$\|\nabla^2 f(x^k)n_{\mathrm{nt}}^k + \nabla f(x^k)\|_{x^k}^* \leq \kappa\|\nabla f(x^k)\|_{x^k}^*,$$

for a given relaxation parameter $\kappa \in [0, 1)$. This extension can be found in our forthcoming work.

4 Composite generalized self-concordant minimization

Let $f \in \widetilde{\mathcal{F}}_{M_f,\nu}(\mathrm{dom}(f))$, and let $g$ be a proper, closed, and convex function. We consider the composite convex minimization problem (32), which we recall here for convenience:

$$F^\star := \min_{x \in \mathbb{R}^p}\left\{ F(x) := f(x) + g(x) \right\}. \tag{32}$$

Note that $\mathrm{dom}(F) := \mathrm{dom}(f) \cap \mathrm{dom}(g)$ may be empty. To make this problem nontrivial, we assume that $\mathrm{dom}(F)$ is nonempty. The optimality condition for (32) can be written as follows:

$$0 \in \nabla f(x^\star) + \partial g(x^\star). \tag{33}$$

Under the qualification condition $0 \in \mathrm{ri}(\mathrm{dom}(g) - \mathrm{dom}(f))$, (33) is necessary and sufficient for $x^\star$ to be an optimal solution of (32), where $\mathrm{ri}(\mathcal{X})$ is the relative interior of $\mathcal{X}$.

4.1 Existence, uniqueness, and regularity of optimal solutions

Assume that $\nabla^2 f(x)$ is positive definite (i.e., nonsingular) at some point $x \in \mathrm{dom}(F)$. We prove in the following theorem that problem (32) has a unique solution $x^\star$. The proof can be found in Appendix A.4. This theorem can also be considered as a generalization of [40, Theorem 4] and [6, Lemma 4] in standard self-concordant settings in [40, 6].

arxiv: v1 [math.oc] 13 Dec 2018

arxiv: v1 [math.oc] 13 Dec 2018 A NEW HOMOTOPY PROXIMAL VARIABLE-METRIC FRAMEWORK FOR COMPOSITE CONVEX MINIMIZATION QUOC TRAN-DINH, LIANG LING, AND KIM-CHUAN TOH arxiv:8205243v [mathoc] 3 Dec 208 Abstract This paper suggests two novel

More information

1 Relative degree and local normal forms

1 Relative degree and local normal forms THE ZERO DYNAMICS OF A NONLINEAR SYSTEM 1 Relative degree and local normal orms The purpose o this Section is to show how single-input single-output nonlinear systems can be locally given, by means o a

More information

(One Dimension) Problem: for a function f(x), find x 0 such that f(x 0 ) = 0. f(x)

(One Dimension) Problem: for a function f(x), find x 0 such that f(x 0 ) = 0. f(x) Solving Nonlinear Equations & Optimization One Dimension Problem: or a unction, ind 0 such that 0 = 0. 0 One Root: The Bisection Method This one s guaranteed to converge at least to a singularity, i not

More information

Lecture 15 Newton Method and Self-Concordance. October 23, 2008

Lecture 15 Newton Method and Self-Concordance. October 23, 2008 Newton Method and Self-Concordance October 23, 2008 Outline Lecture 15 Self-concordance Notion Self-concordant Functions Operations Preserving Self-concordance Properties of Self-concordant Functions Implications

More information

2. ETA EVALUATIONS USING WEBER FUNCTIONS. Introduction

2. ETA EVALUATIONS USING WEBER FUNCTIONS. Introduction . ETA EVALUATIONS USING WEBER FUNCTIONS Introduction So ar we have seen some o the methods or providing eta evaluations that appear in the literature and we have seen some o the interesting properties

More information

YURI LEVIN AND ADI BEN-ISRAEL

YURI LEVIN AND ADI BEN-ISRAEL Pp. 1447-1457 in Progress in Analysis, Vol. Heinrich G W Begehr. Robert P Gilbert and Man Wah Wong, Editors, World Scientiic, Singapore, 003, ISBN 981-38-967-9 AN INVERSE-FREE DIRECTIONAL NEWTON METHOD

More information

Newton s Method. Javier Peña Convex Optimization /36-725

Newton s Method. Javier Peña Convex Optimization /36-725 Newton s Method Javier Peña Convex Optimization 10-725/36-725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, f ( (y) = max y T x f(x) ) x Properties and

More information

Scattered Data Approximation of Noisy Data via Iterated Moving Least Squares

Scattered Data Approximation of Noisy Data via Iterated Moving Least Squares Scattered Data Approximation o Noisy Data via Iterated Moving Least Squares Gregory E. Fasshauer and Jack G. Zhang Abstract. In this paper we ocus on two methods or multivariate approximation problems

More information

Self-Concordant Barrier Functions for Convex Optimization

Self-Concordant Barrier Functions for Convex Optimization Appendix F Self-Concordant Barrier Functions for Convex Optimization F.1 Introduction In this Appendix we present a framework for developing polynomial-time algorithms for the solution of convex optimization

More information

A Simple Explanation of the Sobolev Gradient Method

A Simple Explanation of the Sobolev Gradient Method A Simple Explanation o the Sobolev Gradient Method R. J. Renka July 3, 2006 Abstract We have observed that the term Sobolev gradient is used more oten than it is understood. Also, the term is oten used

More information

Feedback Linearization

Feedback Linearization Feedback Linearization Peter Al Hokayem and Eduardo Gallestey May 14, 2015 1 Introduction Consider a class o single-input-single-output (SISO) nonlinear systems o the orm ẋ = (x) + g(x)u (1) y = h(x) (2)

More information

Selected Methods for Modern Optimization in Data Analysis Department of Statistics and Operations Research UNC-Chapel Hill Fall 2018

Selected Methods for Modern Optimization in Data Analysis Department of Statistics and Operations Research UNC-Chapel Hill Fall 2018 Selected Methods for Modern Optimization in Data Analysis Department of Statistics and Operations Research UNC-Chapel Hill Fall 08 Instructor: Quoc Tran-Dinh Scriber: Quoc Tran-Dinh Lecture 4: Selected

More information

Robust feedback linearization

Robust feedback linearization Robust eedback linearization Hervé Guillard Henri Bourlès Laboratoire d Automatique des Arts et Métiers CNAM/ENSAM 21 rue Pinel 75013 Paris France {herveguillardhenribourles}@parisensamr Keywords: Nonlinear

More information

arxiv: v2 [math.oc] 20 Jan 2018

arxiv: v2 [math.oc] 20 Jan 2018 Composite convex minimization involving self-concordant-lie cost functions Quoc Tran-Dinh, Yen-Huan Li and Volan Cevher Laboratory for Information and Inference Systems LIONS) EPFL, Lausanne, Switzerland

More information

Computing proximal points of nonconvex functions

Computing proximal points of nonconvex functions Mathematical Programming manuscript No. (will be inserted by the editor) Warren Hare Claudia Sagastizábal Computing proximal points o nonconvex unctions the date o receipt and acceptance should be inserted

More information

Roberto s Notes on Differential Calculus Chapter 8: Graphical analysis Section 1. Extreme points

Roberto s Notes on Differential Calculus Chapter 8: Graphical analysis Section 1. Extreme points Roberto s Notes on Dierential Calculus Chapter 8: Graphical analysis Section 1 Extreme points What you need to know already: How to solve basic algebraic and trigonometric equations. All basic techniques

More information

The Clifford algebra and the Chevalley map - a computational approach (detailed version 1 ) Darij Grinberg Version 0.6 (3 June 2016). Not proofread!

The Clifford algebra and the Chevalley map - a computational approach (detailed version 1 ) Darij Grinberg Version 0.6 (3 June 2016). Not proofread! The Cliord algebra and the Chevalley map - a computational approach detailed version 1 Darij Grinberg Version 0.6 3 June 2016. Not prooread! 1. Introduction: the Cliord algebra The theory o the Cliord

More information

ORIE 6326: Convex Optimization. Quasi-Newton Methods

ORIE 6326: Convex Optimization. Quasi-Newton Methods ORIE 6326: Convex Optimization Quasi-Newton Methods Professor Udell Operations Research and Information Engineering Cornell April 10, 2017 Slides on steepest descent and analysis of Newton s method adapted

More information

Optimization methods

Optimization methods Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

More information

Unconstrained optimization

Unconstrained optimization Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout

More information

Convex Functions and Optimization

Convex Functions and Optimization Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized

More information

A globally and R-linearly convergent hybrid HS and PRP method and its inexact version with applications

A globally and R-linearly convergent hybrid HS and PRP method and its inexact version with applications A globally and R-linearly convergent hybrid HS and PRP method and its inexact version with applications Weijun Zhou 28 October 20 Abstract A hybrid HS and PRP type conjugate gradient method for smooth

More information

An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods

An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods An Accelerated Hybrid Proximal Extragradient Method for Convex Optimization and its Implications to Second-Order Methods Renato D.C. Monteiro B. F. Svaiter May 10, 011 Revised: May 4, 01) Abstract This

More information

A Distributed Newton Method for Network Utility Maximization, II: Convergence

A Distributed Newton Method for Network Utility Maximization, II: Convergence A Distributed Newton Method for Network Utility Maximization, II: Convergence Ermin Wei, Asuman Ozdaglar, and Ali Jadbabaie October 31, 2012 Abstract The existing distributed algorithms for Network Utility

More information

Lecture : Feedback Linearization

Lecture : Feedback Linearization ecture : Feedbac inearization Niola Misovic, dipl ing and Pro Zoran Vuic June 29 Summary: This document ollows the lectures on eedbac linearization tought at the University o Zagreb, Faculty o Electrical

More information

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS Igor V. Konnov Department of Applied Mathematics, Kazan University Kazan 420008, Russia Preprint, March 2002 ISBN 951-42-6687-0 AMS classification:

More information

Strong Lyapunov Functions for Systems Satisfying the Conditions of La Salle

Strong Lyapunov Functions for Systems Satisfying the Conditions of La Salle 06 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 6, JUNE 004 Strong Lyapunov Functions or Systems Satisying the Conditions o La Salle Frédéric Mazenc and Dragan Ne sić Abstract We present a construction

More information

Newton s Method. Ryan Tibshirani Convex Optimization /36-725

Newton s Method. Ryan Tibshirani Convex Optimization /36-725 Newton s Method Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, Properties and examples: f (y) = max x

More information

Unconstrained minimization

Unconstrained minimization CSCI5254: Convex Optimization & Its Applications Unconstrained minimization terminology and assumptions gradient descent method steepest descent method Newton s method self-concordant functions 1 Unconstrained

More information
