arxiv: v3 [math.oc] 8 May 2018

Generalized Self-Concordant Functions: A Recipe for Newton-Type Methods

Tianxiao Sun and Quoc Tran-Dinh

Department of Statistics and Operations Research, University of North Carolina at Chapel Hill (UNC), 38 Hanes Hall, CB# 360, UNC Chapel Hill, NC. Email: {tianxias, quoctd}@unc.edu. Corresponding author: quoctd@unc.edu.

Abstract: We study the smooth structure of convex functions by generalizing a powerful concept, so-called self-concordance, introduced by Nesterov and Nemirovskii in the early 1990s, to a broader class of convex functions which we call generalized self-concordant functions. This notion allows us to develop a unified framework for designing Newton-type methods to solve convex optimization problems. The proposed theory provides a mathematical tool to analyze both local and global convergence of Newton-type methods without imposing unverifiable assumptions, as long as the underlying functionals fall into our class of generalized self-concordant functions. First, we introduce the class of generalized self-concordant functions, which covers the class of standard self-concordant functions as a special case. Next, we establish several properties and key estimates of this function class which can be used to design numerical methods. Then, we apply this theory to develop several Newton-type methods for solving a class of smooth convex optimization problems involving generalized self-concordant functions. We provide an explicit step-size for a damped-step Newton-type scheme which can guarantee a global convergence without performing any globalization strategy. We also prove a local quadratic convergence of this method and its full-step variant without requiring the Lipschitz continuity of the objective Hessian mapping. Then, we extend our result to develop proximal Newton-type methods for a class of composite convex minimization problems involving generalized self-concordant functions. We also achieve both global and local convergence without additional assumptions. Finally, we verify our theoretical results via several numerical examples, and compare them with existing methods.

Keywords: Generalized self-concordance · Newton-type method · proximal Newton method · quadratic convergence · local and global convergence · convex optimization

Mathematics Subject Classification C

1 Introduction

The Newton method is a classical numerical scheme for solving systems of nonlinear equations and smooth optimization [47, 50]. However, there are at least two reasons that prevent the use of such methods for solving large-scale problems. Firstly, while these methods often have a fast local convergence rate, which can be up to a quadratic rate, their global convergence has not been well understood [46]. In practice, one can use a damped-step scheme utilizing the Lipschitz constant of the objective derivatives to compute a suitable step-size, as often seen in gradient-type methods, or incorporate the algorithm with a globalization strategy such as line-search, trust-region, or filter to guarantee a descent property [47]. Both strategies allow us to prove a global convergence of the underlying Newton-type method in some sense. Unfortunately, in practice, there exist several problems whose objective function does not have a globally Lipschitz gradient or Hessian, such as logarithmic or reciprocal functions. This class of problems does not provide uniform bounds from which to obtain a constant step-size in optimization algorithms. On the other hand, using a globalization strategy for determining step-sizes often requires centralized computation such as function evaluations, which prevents us from using distributed computation and stochastic descent methods. Secondly, Newton algorithms are second-order methods which often require a high per-iteration complexity due to the operations on the Hessian mapping of the objective function or its approximations. In addition, these methods require the underlying functionals to be smooth up to a given smoothness level, which often does not hold in many practical models.

Motivation: In recent years, there has been a great interest in Newton-type methods for solving convex optimization problems and monotone equations due to the development of new techniques and mathematical tools in optimization, machine learning, and randomized algorithms [6,, 6, 8, 34, 4, 43, 54, 55, 57, 58, 6]. Several combinations of Newton-type methods and other techniques such as proximal operators [8], cubic regularization [4], gradient regularization [55], randomized algorithms such as sketching [54], subsampling [8], and fast eigen-decomposition [6] have opened up a new research direction and attracted great attention in solving nonsmooth and large-scale problems. Hitherto, research in this direction has remained focused on specific classes of problems where standard assumptions such as nonsingularity and Hessian Lipschitz continuity are preserved. However, such assumptions do not hold for many other examples, as shown in [6]. Moreover, if they are satisfied, then we often get a lower bound of possible step-sizes for our algorithm, which may lead to a poor performance, especially in large-scale problems.

In the seminal work [45], Nesterov and Nemirovskii showed that the class of log-barriers does not satisfy the standard assumptions of the Newton method if the solution of the underlying problem is close to the boundary of the barrier function domain. They introduced a powerful concept called self-concordance to overcome this drawback and developed new Newton schemes to achieve global and local convergence without requiring any additional assumption or a globalization strategy. While the self-concordance notion was initially invented to study interior-point methods, it is less well known in other communities. Recent works [, 4, 38, 6, 67, 7] have popularized this concept to solve other problems arising from machine learning, statistics, image processing, scientific computing, and variational inequalities.

Our goals: In this paper, motivated by [, 63, 7], we aim at generalizing the self-concordance concept in [45] to a broader class of smooth and convex functions. To illustrate our idea, we consider a univariate smooth and convex function $\varphi : \mathbb{R} \to \mathbb{R}$. If $\varphi$ satisfies the inequality $|\varphi'''(t)| \le M_{\varphi}\,\varphi''(t)^{3/2}$ for all $t$ in the domain of $\varphi$ and for a given constant $M_{\varphi} \ge 0$, then we say that $\varphi$ is self-concordant (in Nesterov and Nemirovskii's sense [45]). We instead generalize this inequality to

$$|\varphi'''(t)| \le M_{\varphi}\,\varphi''(t)^{\frac{\nu}{2}}, \tag{1}$$

for all $t$ in the domain of $\varphi$, and for given constants $\nu > 0$ and $M_{\varphi} \ge 0$. We emphasize that generalizing from univariate to multivariate functions in the standard self-concordant case (i.e., $\nu = 3$) [45] preserves several important properties including the multilinear symmetry [40, Lemma 4], while, unfortunately, they do not hold for the case $\nu \ne 3$. Therefore, we modify the definition in [45] to overcome this drawback. Note that a similar idea has also been studied in [, 63] for a class of logistic-type functions. Nevertheless, the definition used in these papers is limited, and still creates certain difficulties for developing further theory in general cases.

Our second goal is to develop a unified mechanism to analyze convergence (including global and local convergence) of the following Newton-type scheme:

$$x^{k+1} := x^k - s_k\,F'(x^k)^{-1}F(x^k), \tag{2}$$

where $F$ can be represented as the right-hand side of a smooth monotone equation $F(x) = 0$, or the optimality condition of a convex optimization or a convex-concave saddle-point problem, $F'$ is the Jacobian map of $F$, and $s_k \in (0, 1]$ is a given step-size. Although the Newton scheme is invariant to a change of variables [6], its convergence property relies on the growth of the Hessian mapping along the Newton iterative process. In classical settings, the Lipschitz continuity and the non-degeneracy of the Hessian mapping in a neighborhood of a given solution are key assumptions to achieve a local quadratic convergence rate [6]. These assumptions have been considered to be standard, but they are often very difficult to check in practice, especially the second requirement. A natural idea is to classify the functionals of the underlying problem into a known class of functions in order to choose a suitable method for minimizing it. While first-order methods for convex optimization essentially rely on the Lipschitz gradient continuity, Newton schemes usually use the Lipschitz continuity of the Hessian mapping and its non-degeneracy to obtain a well-defined Newton direction, as we have mentioned. For self-concordant functions, the second condition automatically holds, but the first assumption fails to hold. However, both full-step and damped-step Newton methods still work in this case by appropriately choosing a suitable metric. This situation has been observed, and standard assumptions have been modified in different directions to still guarantee convergence of Newton-type methods; see [6] for an intensive study of generic Newton-type methods, and [45, 40] for the self-concordant function class.

Our approach: We attempt to develop some background theory for a broad class of smooth and convex functions under the structure (1). By adopting the local norm defined via the Hessian mapping of such a convex function from [45], we can prove some lower and upper bound estimates for the local norm distance between two points in the domain as well as for the growth of the Hessian mapping. Together with this background theory, we also identify a class of functions used in generalized linear models [37, 39] as well as in empirical risk minimization [68] that falls into our generalized self-concordance class for many well-known loss-type functions, as listed in Table 2.

Applying our generalized self-concordant theory, we develop a class of Newton-type methods to solve the following composite convex minimization problem:

$$F^{\star} := \min_{x \in \mathbb{R}^p}\big\{ F(x) := f(x) + g(x) \big\}, \tag{3}$$

where $f$ is a generalized self-concordant function in our context, and $g$ is a proper, closed, and convex function that can be referred to as a regularization term. We consider two cases. The first case is a non-composite convex problem in which $g$ vanishes (i.e., $g = 0$). In the second case, we assume that $g$ is equipped with a tractable proximal operator (see (34) for the definition).

Our contribution: To this end, our main contribution can be summarized as follows.

(a) We generalize the self-concordant notion in [40] to a broader class of smooth convex functions, which we call generalized self-concordance. We identify several loss-type functions that can be cast into our generalized self-concordant class. We also prove several fundamental properties and show that the sum and linear transformation of generalized self-concordant functions are generalized self-concordant for a given range of $\nu$ or under suitable assumptions.

(b) We develop lower and upper bounds on the Hessian mapping, the gradient mapping, and the function values for generalized self-concordant functions. These estimates are key to analyzing several numerical optimization methods including Newton-type methods.

(c) We propose a class of Newton methods including full-step and damped-step schemes to minimize a generalized self-concordant function. We explicitly show how to choose a suitable step-size to guarantee a descent direction in the damped-step scheme, and prove a local quadratic convergence for both the damped-step and the full-step schemes using a suitable metric.

(d) We also extend our Newton schemes to handle the composite setting (3). We develop both full-step and damped-step proximal Newton methods to solve this problem and provide a rigorous theoretical convergence guarantee in both local and global sense.

(e) We also study a quasi-Newton variant of our Newton scheme to minimize a generalized self-concordant function. Under a modification of the well-known Dennis-Moré condition [5] or a BFGS update, we show that our quasi-Newton method locally converges at a superlinear rate to the solution of the underlying problem.

Let us emphasize the following aspects of our contribution. Firstly, the self-concordance notion is a powerful concept and has widely been used in interior-point methods as well as in other optimization schemes [8,35,6,7]. Generalizing it to a broader class of smooth convex functions can substantially cover a number of new applications, or can lead to new methods for solving old problems including logistic and multinomial logistic regression, optimization involving exponential objectives, and distance-weighted discrimination problems in support vector machines (see Table 2 below). Secondly, verifying theoretical assumptions for convergence guarantees of a Newton method is not trivial. Our theory allows one to classify the underlying functions into different subclasses by using different parameters $\nu$ and $M_{\varphi}$ in order to choose suitable algorithms to solve the corresponding optimization problem. Thirdly, the theory developed in this paper can potentially apply to other optimization methods such as gradient-type, sketching and sub-sampling Newton, and Frank-Wolfe algorithms, as done in the literature [49, 54, 57, 58, 6]. Finally, our generalization also shows that it is possible to impose additional structure such as a self-concordant barrier to develop path-following schemes or interior-point-type methods for solving a subclass of composite convex minimization problems of the form (3). We believe that our theory is not limited to convex optimization, but can be extended to solve convex-concave saddle-point problems, and monotone equations/inclusions involving generalized self-concordant functions [67].

Summary of generalized self-concordant properties: For convenience of reference, we provide a short summary of the main properties of generalized self-concordant (gsc) functions in Table 1. Although several results hold for a different range of $\nu$, the complete theory only holds for $\nu \in [2, 3]$. However, this is sufficient to cover two important cases: $\nu = 2$ in [,] and $\nu = 3$ in [45].

Table 1: A summary of generalized self-concordant properties

| Result | Property | Range of $\nu$ |
|---|---|---|
| Definitions 1 and 2 | definitions of gsc functions | $\nu > 0$ |
| Proposition 1 | sum of gsc functions | $\nu \ge 2$ |
| Proposition 2 | affine transformation of gsc functions with $\mathcal{A}(x) = Ax + b$ | $\nu \in (0, 3]$ for general $A$; $\nu > 3$ for over-completed $A$ |
| Proposition 3(a) | non-degenerate property | $\nu \ge 2$ |
| Proposition 3(b) | unboundedness | $\nu > 0$ |
| Proposition 4(a) | gsc and strong convexity | $\nu \in (0, 3]$ |
| Proposition 4(b) | gsc and Lipschitz gradient continuity | $\nu \ge 2$ |
| Proposition 6 | if $f^{*}$ is the conjugate of a gsc function $f$, then $\nu_{*} + \nu = 6$ | $\nu \in (0, 6)$ if $p = 1$ (univariate); $\nu \in [3, 6)$ if $p > 1$ (multivariate) |
| Propositions 7, 8, 9, and 10 | local norm, Hessian, gradient, and function value bounds | $\nu \ge 2$ |

Related work: Since the self-concordance concept was introduced in the 1990s [45], its first extension was perhaps proposed by [] for a class of logistic regression. In [63], the authors extended [] to study proximal Newton methods for logistic, multinomial logistic, and exponential loss functions. By augmenting a strongly convex regularizer, Zhang and Lin in [7] showed that the regularized logistic loss function is indeed standard self-concordant. In [], Bach continued exploiting his result in [] to show that the averaging stochastic gradient method can achieve the same best-known convergence rate as in the strongly convex case without adding a regularizer. In [6], the authors exploited standard

self-concordance theory in [45] to develop several classes of optimization algorithms including proximal Newton, proximal quasi-Newton, and proximal gradient methods to solve composite convex minimization problems. In [35], Lu extended [6] to study randomized block coordinate descent methods. In a recent paper [], Gao and Goldfarb investigated quasi-Newton methods for self-concordant problems. As another example, [53] proposed an alternative to the standard self-concordance, called self-regularity. The authors applied this theory to develop a new paradigm for interior-point methods. The theory developed in this paper, on the one hand, is a generalization of the well-known self-concordance notion developed in [45]; on the other hand, it also covers the work in [, 6, 7] as specific examples. Several concrete applications and extensions of the self-concordance notion can also be found in the literature including [8, 3, 49, 53]. Recently, [4] exploited smooth structures of exponential functions to design interior-point methods for solving two fundamental problems in scientific computing called matrix scaling and balancing.

Paper organization: The rest of this paper is organized as follows. Section 2 develops the foundation theory for our generalized self-concordant functions including definitions, examples, basic properties, Fenchel's conjugate, smoothing technique, and key bounds. Section 3 is devoted to studying full-step and damped-step Newton schemes to minimize a generalized self-concordant function, including their global and local convergence guarantees. Section 4 considers the composite setting (3), studies proximal Newton-type methods, and investigates their convergence guarantees. Section 5 deals with a quasi-Newton scheme for solving the non-composite problem of (3). Numerical examples are provided in Section 6 to illustrate advantages of our theory. Finally, for clarity of presentation, several technical results and proofs are moved to the appendix.

2 Theory of generalized self-concordant functions

We generalize the class of self-concordant functions introduced by Nesterov and Nemirovskii in [40] to a broader class of smooth and convex functions. We identify several examples of such functions. Then, we develop several properties of this function class by utilizing our new definitions.

Notation: Given a proper, closed, and convex function $f : \mathbb{R}^p \to \mathbb{R}\cup\{+\infty\}$, we denote by $\mathrm{dom}(f) := \{x \in \mathbb{R}^p \mid f(x) < +\infty\}$ the domain of $f$, and by $\partial f(x) := \{w \in \mathbb{R}^p \mid f(y) \ge f(x) + \langle w, y - x\rangle,\ \forall y \in \mathrm{dom}(f)\}$ the subdifferential of $f$ at $x \in \mathrm{dom}(f)$. We use $\mathcal{C}^3(\mathrm{dom}(f))$ to denote the class of three times continuously differentiable functions on its open domain $\mathrm{dom}(f)$. We denote by $\nabla f$ its gradient map, by $\nabla^2 f$ its Hessian map, and by $\nabla^3 f$ its third-order derivative. For a twice continuously differentiable convex function $f$, $\nabla^2 f$ is symmetric positive semidefinite, and can be written as $\nabla^2 f(\cdot) \succeq 0$. If it is positive definite, then we write $\nabla^2 f(\cdot) \succ 0$.

Let $\mathbb{R}_{+}$ and $\mathbb{R}_{++}$ denote the sets of nonnegative and positive real numbers, respectively. We use $\mathcal{S}^p_{+}$ and $\mathcal{S}^p_{++}$ to denote the sets of symmetric positive semidefinite and symmetric positive definite matrices of size $p \times p$, respectively. Given a $p \times p$ matrix $H \succ 0$, we define a weighted norm with respect to $H$ as $\|u\|_H := \langle Hu, u\rangle^{1/2}$ for $u \in \mathbb{R}^p$. The corresponding dual norm is $\|v\|_H^{*} := \langle H^{-1}v, v\rangle^{1/2}$. If $H = \mathbb{I}$, the identity matrix, then $\|u\|_H = \|u\|_H^{*} = \|u\|_2$, where $\|\cdot\|_2$ is the standard Euclidean norm. Note that $\|\cdot\|_2^{*} = \|\cdot\|_2$.

We say that $f$ is strongly convex with the strong convexity parameter $\mu_f > 0$ if $f(\cdot) - \frac{\mu_f}{2}\|\cdot\|_2^2$ is convex. We also say that $f$ has a Lipschitz gradient if $\nabla f$ is Lipschitz continuous with the Lipschitz constant $L_f \in [0, +\infty)$, i.e., $\|\nabla f(x) - \nabla f(y)\|_2 \le L_f\|x - y\|_2$ for all $x, y \in \mathrm{dom}(f)$.

For $f \in \mathcal{C}^3(\mathrm{dom}(f))$, if $\nabla^2 f(x) \succ 0$ at a given $x \in \mathrm{dom}(f)$, then we define a local norm $\|u\|_x := \langle \nabla^2 f(x)u, u\rangle^{1/2}$ as a weighted norm of $u$ with respect to $\nabla^2 f(x)$. The corresponding dual norm $\|v\|_x^{*}$ is defined as $\|v\|_x^{*} := \max\{\langle v, u\rangle \mid \|u\|_x \le 1\} = \langle \nabla^2 f(x)^{-1}v, v\rangle^{1/2}$ for $v \in \mathbb{R}^p$.

2.1 Univariate generalized self-concordant functions

Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a three times continuously differentiable function on the open domain $\mathrm{dom}(\varphi)$. Then, we write $\varphi \in \mathcal{C}^3(\mathrm{dom}(\varphi))$. In this case, $\varphi$ is convex if and only if $\varphi''(t) \ge 0$ for all $t \in \mathrm{dom}(\varphi)$. We introduce the following definition.

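Definition 1. Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a $\mathcal{C}^3(\mathrm{dom}(\varphi))$ univariate function with open domain $\mathrm{dom}(\varphi)$, and let $\nu > 0$ and $M_{\varphi} \ge 0$ be two constants. We say that $\varphi$ is $(M_{\varphi}, \nu)$-generalized self-concordant if

$$|\varphi'''(t)| \le M_{\varphi}\,\varphi''(t)^{\frac{\nu}{2}}, \quad \forall t \in \mathrm{dom}(\varphi). \tag{4}$$

The inequality (4) also indicates that $\varphi''(t) \ge 0$ for all $t \in \mathrm{dom}(\varphi)$. Hence, $\varphi$ is convex. Clearly, if $\varphi(t) = \frac{a}{2}t^2 + bt$ for any constants $a \ge 0$ and $b \in \mathbb{R}$, we have $\varphi''(t) = a$ and $\varphi'''(t) = 0$. The inequality (4) is automatically satisfied for any $\nu > 0$ and $M_{\varphi} \ge 0$. The smallest value of $M_{\varphi}$ is zero. Hence, any convex quadratic function is $(0, \nu)$-generalized self-concordant for any $\nu > 0$. While (4) holds for any other constant $\hat{M}_{\varphi} \ge M_{\varphi}$, we often require that $M_{\varphi}$ is the smallest constant satisfying (4).

Example 1. Let us now provide some common examples satisfying Definition 1.

(a) Standard self-concordant functions: If we choose $\nu = 3$, then (4) becomes $|\varphi'''(t)| \le M_{\varphi}\,\varphi''(t)^{3/2}$, which is the standard self-concordant function class in $\mathbb{R}$ introduced in [45].

(b) Logistic functions: In [], Bach modified the standard self-concordant inequality in [45] to obtain $|\varphi'''(t)| \le M_{\varphi}\,\varphi''(t)$, and showed that the well-known logistic loss $\varphi(t) := \log(1 + e^{-t})$ satisfies this definition. In [63], the authors also exploited this definition, and developed a class of first-order and second-order methods to solve composite convex minimization problems. Hence, $\varphi(t) := \log(1 + e^{-t})$ is a generalized self-concordant function with $M_{\varphi} = 1$ and $\nu = 2$.

(c) Exponential functions: The exponential function $\varphi(t) := e^{-t}$ also belongs to (4) with $M_{\varphi} = 1$ and $\nu = 2$. This function is often used, e.g., in AdaBoost [33], or in matrix scaling [4].

(d) Distance-weighted discrimination (DWD): We consider a more general function $\varphi(t) := \frac{1}{t^q}$ on $\mathrm{dom}(\varphi) = \mathbb{R}_{++}$ with $q > 0$, studied in [36] for DWD used in support vector machines. As shown in Table 2, this function satisfies Definition 1 with $M_{\varphi} = \frac{q+2}{\sqrt[q+2]{q(q+1)}}$ and $\nu = \frac{2(q+3)}{q+2} \in (2, 3)$.

(e) Entropy function: We consider the well-known entropy function $\varphi(t) := t\ln(t)$ for $t > 0$. We can easily show that $|\varphi'''(t)| = \frac{1}{t^2} = \varphi''(t)^2$. Hence, it is generalized self-concordant with $\nu = 4$ and $M_{\varphi} = 1$ in the sense of Definition 1.

(f) Arcsine distribution: We consider the function $\varphi(t) := \frac{1}{\sqrt{1 - t^2}}$ for $t \in (-1, 1)$. This function is convex and smooth. Moreover, we verify that it satisfies Definition 1 with $\nu = \frac{14}{5} \in (2, 3)$ and a constant $M_{\varphi} < 3.25$. We can generalize this function to $\varphi(t) := [(t - a)(b - t)]^{-q}$ for $t \in (a, b)$, where $a < b$ and $q > 0$. Then, we can show that $\nu = \frac{2(q+3)}{q+2} \in (2, 3)$.

(g) Robust regression: Consider a monomial function $\varphi(t) := t^q$ for $q \in (1, 2)$, studied in [7] for robust regression used in statistics. Then, $M_{\varphi} = \frac{2-q}{\sqrt[2-q]{q(q-1)}}$ and $\nu = \frac{2(3-q)}{2-q} \in (4, +\infty)$.

As concrete examples, the following table, Table 2, provides a non-exhaustive list of generalized self-concordant functions used in the literature.

Table 2: Examples of univariate generalized self-concordant functions ($\mathcal{F}_L^{1,1}$ means that $\nabla\varphi$ is Lipschitz continuous)

| Function name | Form of $\varphi(t)$ | $\nu$ | $M_{\varphi}$ | $\mathrm{dom}(\varphi)$ | Application | $\mathcal{F}_L^{1,1}$ | Reference |
|---|---|---|---|---|---|---|---|
| Log-barrier | $-\ln(t)$ | $3$ | $2$ | $\mathbb{R}_{++}$ | Poisson | no | [0,40,45] |
| Entropy-barrier | $t\ln(t) - \ln(t)$ | $3$ | $2$ | $\mathbb{R}_{++}$ | Interior-point | no | [40] |
| Logistic | $\ln(1 + e^{-t})$ | $2$ | $1$ | $\mathbb{R}$ | Classification | yes | [9] |
| Exponential | $e^{-t}$ | $2$ | $1$ | $\mathbb{R}$ | AdaBoost, etc. | no | [4,33] |
| Negative power | $t^{-q}$, $q > 0$ | $\frac{2(q+3)}{q+2}$ | $\frac{q+2}{\sqrt[q+2]{q(q+1)}}$ | $\mathbb{R}_{++}$ | DWD | no | [36] |
| Arcsine distribution | $\frac{1}{\sqrt{1 - t^2}}$ | $\frac{14}{5}$ | $< 3.25$ | $(-1, 1)$ | Random walks | no | [4] |
| Positive power | $t^q$, $q \in (1, 2)$ | $\frac{2(3-q)}{2-q}$ | $\frac{2-q}{\sqrt[2-q]{q(q-1)}}$ | $\mathbb{R}_{+}$ | Regression | no | [7] |
| Entropy | $t\ln(t)$ | $4$ | $1$ | $\mathbb{R}_{+}$ | KL divergence | no | [0] |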
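The inequality (4) is easy to sanity-check numerically for the closed-form entries of Table 2. The following is a minimal sketch, not part of the paper: the helper name `satisfies_gsc`, the sample points, and the tolerance are illustrative assumptions, and the second and third derivatives are written out by hand from the formulas in Example 1.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def satisfies_gsc(d2, d3, M, nu, points, tol=1e-9):
    """Check |phi'''(t)| <= M * phi''(t)^(nu/2) at the given sample points.

    d2 and d3 are callables returning phi''(t) and phi'''(t).
    This is a pointwise numerical check, not a proof.
    """
    for t in points:
        ratio = abs(d3(t)) / d2(t) ** (nu / 2.0)
        if ratio > M * (1.0 + tol):
            return False
    return True


q = 2.0  # hypothetical exponent for the negative-power (DWD) example

# (name, phi'', phi''', M, nu, sample points inside dom(phi))
EXAMPLES = [
    ("log-barrier -ln(t)",
     lambda t: 1.0 / t**2, lambda t: -2.0 / t**3,
     2.0, 3.0, [0.01, 0.5, 1.0, 7.0, 100.0]),
    ("logistic ln(1 + e^-t)",
     lambda t: sigmoid(t) * (1.0 - sigmoid(t)),
     lambda t: sigmoid(t) * (1.0 - sigmoid(t)) * (1.0 - 2.0 * sigmoid(t)),
     1.0, 2.0, [-20.0, -1.0, 0.0, 0.3, 5.0, 20.0]),
    ("exponential e^-t",
     lambda t: math.exp(-t), lambda t: -math.exp(-t),
     1.0, 2.0, [-5.0, 0.0, 3.0]),
    ("negative power t^-q",
     lambda t: q * (q + 1.0) * t ** (-(q + 2.0)),
     lambda t: -q * (q + 1.0) * (q + 2.0) * t ** (-(q + 3.0)),
     (q + 2.0) / (q * (q + 1.0)) ** (1.0 / (q + 2.0)),
     2.0 * (q + 3.0) / (q + 2.0), [0.1, 1.0, 10.0]),
    ("entropy t ln(t)",
     lambda t: 1.0 / t, lambda t: -1.0 / t**2,
     1.0, 4.0, [0.05, 1.0, 20.0]),
]

for name, d2, d3, M, nu, pts in EXAMPLES:
    print(name, satisfies_gsc(d2, d3, M, nu, pts))
```

For the log-barrier, the negative power, and the entropy, the ratio $|\varphi'''(t)| / \varphi''(t)^{\nu/2}$ is constant in $t$, so the listed $M_{\varphi}$ is attained exactly; for the logistic loss the ratio equals $|1 - 2\sigma(t)|$ with $\sigma$ the sigmoid, which stays below $1$.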

Remark 1. All examples given in Table 2 fall into the case $\nu \ge 2$. However, we note that Definition 1 also covers [7, Lemma ] as a special case when $\nu \in (0, 2)$. Unfortunately, as we will see in what follows, it is unclear how to generalize several properties of generalized self-concordance from univariate to multivariate functions for $\nu \in (0, 2)$, except for strongly convex functions.

Table 2 only provides common generalized self-concordant functions used in practice. However, it is possible to combine these functions to obtain mixture functions that preserve the generalized self-concordant inequality given in Definition 1. For instance, the barrier entropy $t\ln(t) - \ln(t)$ is a standard self-concordant function, and it is the sum of the entropy $t\ln(t)$ and the negative logarithmic function $-\log(t)$, which are generalized self-concordant with $\nu = 4$ and $\nu = 3$, respectively.

2.2 Multivariate generalized self-concordant functions

Let $f : \mathbb{R}^p \to \mathbb{R}$ be a $\mathcal{C}^3(\mathrm{dom}(f))$ smooth and convex function with open domain $\mathrm{dom}(f)$. Given $\nabla^2 f$ the Hessian of $f$, $x \in \mathrm{dom}(f)$, and $u, v \in \mathbb{R}^p$, we consider the function $\psi(t) := \langle \nabla^2 f(x + tv)u, u\rangle$. Then, it is straightforward to show that $\psi'(t) := \langle \nabla^3 f(x + tv)[v]u, u\rangle$ for $t \in \mathbb{R}$ such that $x + tv \in \mathrm{dom}(f)$, where $\nabla^3 f$ is the third-order derivative of $f$. It is clear that $\psi(0) = \langle \nabla^2 f(x)u, u\rangle = \|u\|_x^2$. By using the local norm, we generalize Definition 1 to multivariate functions $f : \mathbb{R}^p \to \mathbb{R}$ as follows.

Definition 2. A $\mathcal{C}^3$-convex function $f : \mathbb{R}^p \to \mathbb{R}$ is said to be an $(M_f, \nu)$-generalized self-concordant function of the order $\nu > 0$ and the constant $M_f \ge 0$ if, for any $x \in \mathrm{dom}(f)$ and $u, v \in \mathbb{R}^p$, it holds that

$$\big|\langle \nabla^3 f(x)[v]u, u\rangle\big| \le M_f\,\|u\|_x^2\,\|v\|_x^{\nu - 2}\,\|v\|_2^{3 - \nu}. \tag{5}$$

Here, we use the convention that $\frac{0}{0} = 0$ for the case $\nu < 2$ or $\nu > 3$. We denote this class of functions by $\mathcal{F}_{M_f,\nu}(\mathrm{dom}(f))$ (shortly, $\mathcal{F}_{M_f,\nu}$ when $\mathrm{dom}(f)$ is explicitly defined).

Let us consider the following two extreme cases:

1. If $\nu = 2$, (5) leads to $|\langle \nabla^3 f(x)[v]u, u\rangle| \le M_f\,\|u\|_x^2\,\|v\|_2$, which collapses to the definition introduced in [] by letting $u = v$.

2. If $\nu = 3$ and $u = v$, (5) reduces to $|\langle \nabla^3 f(x)[u]u, u\rangle| \le M_f\,\|u\|_x^3$, and Definition 2 becomes the standard self-concordant definition introduced in [40, 45].

We emphasize that Definition 2 is not symmetric, but can avoid the use of multilinear mappings as required in [, 45]. However, by [45, Proposition 9] or [40, Lemma 4], Definition 2 with $\nu = 3$ is equivalent to [40, Definition 4] for standard self-concordant functions.

2.3 Basic properties of generalized self-concordant functions

We first show that if $f_1$ and $f_2$ are two generalized self-concordant functions, then $\beta_1 f_1 + \beta_2 f_2$ is also generalized self-concordant for any $\beta_1, \beta_2 > 0$ according to Definition 2.

Proposition 1 (Sum of generalized self-concordant functions). Let $f_i$ be $(M_{f_i}, \nu)$-generalized self-concordant functions satisfying (5), where $M_{f_i} \ge 0$ and $\nu \ge 2$ for $i = 1, \dots, m$. Then, for $\beta_i > 0$, $i = 1, \dots, m$, the function $f(x) := \sum_{i=1}^m \beta_i f_i(x)$ is well-defined on $\mathrm{dom}(f) = \bigcap_{i=1}^m \mathrm{dom}(f_i)$, and is $(M_f, \nu)$-generalized self-concordant with the same order $\nu \ge 2$ and the constant

$$M_f := \max\big\{\beta_i^{1 - \frac{\nu}{2}} M_{f_i} \mid 1 \le i \le m\big\} \ge 0.$$

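Proof. It is sufficient to prove the claim for $m = 2$. For $m > 2$, it follows from $m = 2$ by induction. By [40, Theorem 35], $f$ is a closed and convex function. In addition, $\mathrm{dom}(f) = \mathrm{dom}(f_1) \cap \mathrm{dom}(f_2)$. Let us fix some $x \in \mathrm{dom}(f)$ and $u, v \in \mathbb{R}^p$. Then, by Definition 2, we have

$$\big|\langle \nabla^3 f_i(x)[v]u, u\rangle\big| \le M_{f_i}\,\langle \nabla^2 f_i(x)u, u\rangle\,\langle \nabla^2 f_i(x)v, v\rangle^{\frac{\nu-2}{2}}\,\|v\|_2^{3-\nu}, \quad i = 1, 2.$$

Denote $w_i := \langle \nabla^2 f_i(x)u, u\rangle \ge 0$ and $s_i := \langle \nabla^2 f_i(x)v, v\rangle \ge 0$ for $i = 1, 2$. We can derive

$$\frac{\big|\langle \nabla^3 f(x)[v]u, u\rangle\big|}{\langle \nabla^2 f(x)u, u\rangle\,\langle \nabla^2 f(x)v, v\rangle^{\frac{\nu-2}{2}}} \le \frac{\beta_1\big|\langle \nabla^3 f_1(x)[v]u, u\rangle\big| + \beta_2\big|\langle \nabla^3 f_2(x)[v]u, u\rangle\big|}{\langle \nabla^2 f(x)u, u\rangle\,\langle \nabla^2 f(x)v, v\rangle^{\frac{\nu-2}{2}}} \le \left[\frac{M_{f_1}\beta_1 w_1 s_1^{\frac{\nu-2}{2}} + M_{f_2}\beta_2 w_2 s_2^{\frac{\nu-2}{2}}}{(\beta_1 w_1 + \beta_2 w_2)(\beta_1 s_1 + \beta_2 s_2)^{\frac{\nu-2}{2}}}\right]_{[T]} \|v\|_2^{3-\nu}. \tag{6}$$

Let $\xi := \frac{\beta_1 w_1}{\beta_1 w_1 + \beta_2 w_2} \in [0, 1]$ and $\eta := \frac{\beta_1 s_1}{\beta_1 s_1 + \beta_2 s_2} \in [0, 1]$. Then, $\frac{\beta_2 w_2}{\beta_1 w_1 + \beta_2 w_2} = 1 - \xi \ge 0$ and $\frac{\beta_2 s_2}{\beta_1 s_1 + \beta_2 s_2} = 1 - \eta \ge 0$. Hence, the term $[T]$ in the square brackets of (6) becomes

$$h(\xi, \eta) := \beta_1^{1-\frac{\nu}{2}} M_{f_1}\,\xi\,\eta^{\frac{\nu-2}{2}} + \beta_2^{1-\frac{\nu}{2}} M_{f_2}\,(1 - \xi)(1 - \eta)^{\frac{\nu-2}{2}}, \quad \xi, \eta \in [0, 1].$$

Since $\nu \ge 2$ and $\xi, \eta \in [0, 1]$, we can upper bound $h(\xi, \eta)$ as

$$h(\xi, \eta) \le \beta_1^{1-\frac{\nu}{2}} M_{f_1}\,\xi + \beta_2^{1-\frac{\nu}{2}} M_{f_2}\,(1 - \xi), \quad \forall \xi \in [0, 1].$$

The right-hand side function is linear in $\xi$ on $[0, 1]$. It achieves its maximum at the boundary. Hence, we have

$$\max_{\xi \in [0,1],\,\eta \in [0,1]} h(\xi, \eta) \le \max\big\{\beta_1^{1-\frac{\nu}{2}} M_{f_1},\ \beta_2^{1-\frac{\nu}{2}} M_{f_2}\big\}.$$

Using this estimate in (6), we can show that $f(\cdot) := \beta_1 f_1(\cdot) + \beta_2 f_2(\cdot)$ is $(M_f, \nu)$-generalized self-concordant with $M_f := \max\big\{\beta_1^{1-\frac{\nu}{2}} M_{f_1},\ \beta_2^{1-\frac{\nu}{2}} M_{f_2}\big\}$. $\square$

Using Proposition 1, we can also see that if $f$ is $(M_f, \nu)$-generalized self-concordant, and $\beta > 0$, then $g(x) := \beta f(x)$ is also $(M_g, \nu)$-generalized self-concordant with the constant $M_g := \beta^{1-\frac{\nu}{2}} M_f$. The convex quadratic function $q(x) := \frac{1}{2}\langle Qx, x\rangle + c^{\top}x$ with $Q \in \mathcal{S}^p_{+}$ is $(0, \nu)$-generalized self-concordant for any $\nu > 0$. Hence, by Proposition 1, if $f$ is $(M_f, \nu)$-generalized self-concordant, then $f(x) + \frac{1}{2}\langle Qx, x\rangle + c^{\top}x$ is also $(M_f, \nu)$-generalized self-concordant.

Next, we consider an affine transformation of a generalized self-concordant function.

Proposition 2 (Affine transformation). Let $\mathcal{A}(x) := Ax + b$ be an affine transformation from $\mathbb{R}^p$ to $\mathbb{R}^q$, and $f$ be an $(M_f, \nu)$-generalized self-concordant function with $\nu > 0$. Then, the following statements hold:

(a) If $\nu \in (0, 3]$, then $g(x) := f(\mathcal{A}(x))$ is $(M_g, \nu)$-generalized self-concordant with $M_g := M_f\|A\|^{3-\nu}$.

(b) If $\nu > 3$ and $\lambda_{\min}(A^{\top}A) > 0$, then $g(x) := f(\mathcal{A}(x))$ is $(M_g, \nu)$-generalized self-concordant with $M_g := M_f\,\lambda_{\min}(A^{\top}A)^{\frac{3-\nu}{2}}$, where $\lambda_{\min}(A^{\top}A)$ is the smallest eigenvalue of $A^{\top}A$.

Proof. Since $g(x) = f(\mathcal{A}(x)) = f(Ax + b)$, it is easy to show that $\nabla^2 g(x) = A^{\top}\nabla^2 f(\mathcal{A}(x))A$ and $\nabla^3 g(x)[v] = A^{\top}\big(\nabla^3 f(\mathcal{A}(x))[Av]\big)A$. Let us denote $\tilde{x} := Ax + b$, $\tilde{u} := Au$, and $\tilde{v} := Av$. Then, using Definition 2, we have

$$\begin{aligned} \big|\langle \nabla^3 g(x)[v]u, u\rangle\big| &= \big|\langle A^{\top}(\nabla^3 f(\tilde{x})[\tilde{v}])Au, u\rangle\big| = \big|\langle \nabla^3 f(\tilde{x})[\tilde{v}]\tilde{u}, \tilde{u}\rangle\big| \\ &\overset{(5)}{\le} M_f\,\langle \nabla^2 f(\tilde{x})\tilde{u}, \tilde{u}\rangle\,\langle \nabla^2 f(\tilde{x})\tilde{v}, \tilde{v}\rangle^{\frac{\nu-2}{2}}\,\|\tilde{v}\|_2^{3-\nu} \\ &= M_f\,\langle A^{\top}\nabla^2 f(\mathcal{A}(x))Au, u\rangle\,\langle A^{\top}\nabla^2 f(\mathcal{A}(x))Av, v\rangle^{\frac{\nu-2}{2}}\,\|Av\|_2^{3-\nu} \\ &= M_f\,\langle \nabla^2 g(x)u, u\rangle\,\langle \nabla^2 g(x)v, v\rangle^{\frac{\nu-2}{2}}\,\|Av\|_2^{3-\nu}. \end{aligned} \tag{7}$$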

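(a) If $\nu \in (0, 3]$, then we have $\|Av\|_2^{3-\nu} \le \|A\|^{3-\nu}\|v\|_2^{3-\nu}$. Hence, the last inequality (7) implies

$$\big|\langle \nabla^3 g(x)[v]u, u\rangle\big| \le M_f\|A\|^{3-\nu}\,\langle \nabla^2 g(x)u, u\rangle\,\langle \nabla^2 g(x)v, v\rangle^{\frac{\nu-2}{2}}\,\|v\|_2^{3-\nu},$$

which shows that $g$ is $(M_g, \nu)$-generalized self-concordant with $M_g := M_f\|A\|^{3-\nu}$.

(b) Note that $\|Av\|_2^2 = v^{\top}A^{\top}Av \ge \lambda_{\min}(A^{\top}A)\|v\|_2^2 \ge 0$, where $\lambda_{\min}(A^{\top}A)$ is the smallest eigenvalue of $A^{\top}A$. If $\lambda_{\min}(A^{\top}A) > 0$ and $\nu > 3$, then we have $\|Av\|_2^{3-\nu} \le \lambda_{\min}(A^{\top}A)^{\frac{3-\nu}{2}}\|v\|_2^{3-\nu}$. Combining this estimate and (7), we can show that $g$ is $(M_g, \nu)$-generalized self-concordant with $M_g := M_f\,\lambda_{\min}(A^{\top}A)^{\frac{3-\nu}{2}}$. $\square$

Remark 2. Proposition 2 shows that generalized self-concordance is preserved under affine transformations if $\nu \in (0, 3]$. If $\nu > 3$, then it requires $A$ to be over-completed, i.e., $\lambda_{\min}(A^{\top}A) > 0$. Hence, the theory developed in the sequel remains applicable for $\nu > 3$ if $A$ is over-completed.

The following result is an extension of standard self-concordant functions ($\nu = 3$), whose proof is very similar to [40, Theorems 43, 44] by replacing the parameters $M_f = 2$ and $\nu = 3$ with the general parameters $M_f \ge 0$ and $\nu > 0$ (or $\nu \ge 2$), respectively. We omit the detailed proof.

Proposition 3. Let $f$ be an $(M_f, \nu)$-generalized self-concordant function with $\nu > 0$. Then:

(a) If $\nu \ge 2$ and $\mathrm{dom}(f)$ contains no straight line, then $\nabla^2 f(x) \succ 0$ for any $x \in \mathrm{dom}(f)$.

(b) If there exists $\bar{x} \in \mathrm{bd}(\mathrm{dom}(f))$, the boundary of $\mathrm{dom}(f)$, then, for any $\bar{x} \in \mathrm{bd}(\mathrm{dom}(f))$ and any sequence $\{x_k\} \subset \mathrm{dom}(f)$ such that $\lim_{k\to\infty} x_k = \bar{x}$, we have $\lim_{k\to\infty} f(x_k) = +\infty$.

Note that Proposition 3(a) only holds for $\nu \ge 2$. If we consider $g(x) := f(\mathcal{A}(x))$ for a given affine operator $\mathcal{A}(x) = Ax + b$, then the non-degeneracy of $\nabla^2 g$ is only guaranteed if $A$ is full-rank. Otherwise, it is non-degenerate in a given subspace of $A$.

2.4 Generalized self-concordant functions with special structures

We first show that if a generalized self-concordant function is strongly convex or has a Lipschitz gradient, then it can be cast into the special case $\nu = 2$ or $\nu = 3$.

Proposition 4. Let $f \in \mathcal{F}_{M_f,\nu}$ be an $(M_f, \nu)$-generalized self-concordant function with $\nu > 0$. Then:

(a) If $\nu \in (0, 3]$ and $f$ is also strongly convex on $\mathrm{dom}(f)$ with the strong convexity parameter $\mu_f > 0$ in $\ell_2$-norm, then $f$ is also $(\hat{M}_f, \hat{\nu})$-generalized self-concordant with $\hat{\nu} = 3$ and $\hat{M}_f := \frac{M_f}{(\sqrt{\mu_f})^{3-\nu}}$.

(b) If $\nu \ge 2$ and $\nabla f$ is Lipschitz continuous with the Lipschitz constant $L_f \in [0, +\infty)$ in $\ell_2$-norm, then $f$ is also $(\hat{M}_f, \hat{\nu})$-generalized self-concordant with $\hat{\nu} = 2$ and $\hat{M}_f := M_f L_f^{\frac{\nu}{2}-1}$.

Proof. (a) If $f$ is strongly convex with the strong convexity parameter $\mu_f > 0$ in $\ell_2$-norm, then we have $\langle \nabla^2 f(x)v, v\rangle \ge \mu_f\|v\|_2^2$ for any $v \in \mathbb{R}^p$. Hence, $\frac{\|v\|_2}{\|v\|_x} \le \frac{1}{\sqrt{\mu_f}}$. In this case, since $3 - \nu \ge 0$, (5) leads to

$$\big|\langle \nabla^3 f(x)[v]u, u\rangle\big| \le M_f\,\|u\|_x^2\,\|v\|_x\left(\frac{\|v\|_2}{\|v\|_x}\right)^{3-\nu} \le \frac{M_f}{(\sqrt{\mu_f})^{3-\nu}}\,\|u\|_x^2\,\|v\|_x.$$

Hence, $f$ is $(\hat{M}_f, \hat{\nu})$-generalized self-concordant with $\hat{\nu} = 3$ and $\hat{M}_f := \frac{M_f}{(\sqrt{\mu_f})^{3-\nu}}$.

(b) Since $\nabla f$ is Lipschitz continuous with the Lipschitz constant $L_f \in [0, +\infty)$ in $\ell_2$-norm, we have $\|v\|_x^2 = \langle \nabla^2 f(x)v, v\rangle \le L_f\|v\|_2^2$ for all $v \in \mathbb{R}^p$, which leads to $\frac{\|v\|_x}{\|v\|_2} \le \sqrt{L_f}$ for all $v \in \mathbb{R}^p$. On the other hand, since $f \in \mathcal{F}_{M_f,\nu}$ with $\nu \ge 2$, we can show that

$$\big|\langle \nabla^3 f(x)[v]u, u\rangle\big| \le M_f\,\|u\|_x^2\left(\frac{\|v\|_x}{\|v\|_2}\right)^{\nu-2}\|v\|_2 \le M_f L_f^{\frac{\nu-2}{2}}\,\|u\|_x^2\,\|v\|_2.$$

Hence, $f$ is also $(\hat{M}_f, \hat{\nu})$-generalized self-concordant with $\hat{\nu} = 2$ and $\hat{M}_f := M_f L_f^{\frac{\nu}{2}-1}$. $\square$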
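Proposition 4(a) can be illustrated in one dimension, where (5) reduces to (4). The sketch below is an illustration under stated assumptions rather than the paper's experiment: it takes the logistic loss ($M_{\varphi} = 1$, $\nu = 2$) plus a hypothetical quadratic regularizer $\frac{\mu}{2}t^2$ (which, by Proposition 1, keeps $M_{\varphi} = 1$ and $\nu = 2$), and checks on a grid that the sum satisfies the standard self-concordance inequality with $\hat{\nu} = 3$ and $\hat{M} = M_{\varphi} / (\sqrt{\mu})^{3-\nu}$. The value of `mu`, the grid, and the function names are arbitrary choices.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def regularized_logistic_derivs(t, mu):
    """Return (phi'', phi''') for phi(t) = ln(1 + e^{-t}) + (mu / 2) * t^2."""
    s = sigmoid(t)
    d2 = s * (1.0 - s) + mu              # quadratic term adds mu to phi''
    d3 = s * (1.0 - s) * (1.0 - 2.0 * s)  # quadratic term has zero third derivative
    return d2, d3


def worst_ratio(mu, nu_hat, grid):
    """Largest value of |phi'''| / (phi'')^(nu_hat / 2) over the grid."""
    worst = 0.0
    for t in grid:
        d2, d3 = regularized_logistic_derivs(t, mu)
        worst = max(worst, abs(d3) / d2 ** (nu_hat / 2.0))
    return worst


mu = 0.1                                  # hypothetical strong convexity parameter
grid = [i / 10.0 for i in range(-300, 301)]

M, nu = 1.0, 2.0                          # logistic loss constants from Table 2
M_hat = M / math.sqrt(mu) ** (3.0 - nu)   # constant predicted by Proposition 4(a)

print("nu = 2 bound holds:", worst_ratio(mu, 2.0, grid) <= M)
print("nu = 3 bound holds:", worst_ratio(mu, 3.0, grid) <= M_hat)
```

The check passes on the grid because $|\varphi'''| \le \varphi''$ and $\varphi'' \ge \mu$ together give $|\varphi'''| \le \varphi''^{3/2} / \sqrt{\mu}$, which is exactly the one-dimensional form of the argument in the proof.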

Proposition 4 provides two important properties. If the gradient map $\nabla f$ of a generalized self-concordant function $f$ is Lipschitz continuous, we can always classify it into the special case $\nu = 2$. Therefore, we can exploit both structures, generalized self-concordance and Lipschitz gradient, to develop better algorithms. This idea is also applied to generalized self-concordant and strongly convex functions.

Given $n$ smooth convex univariate functions $\varphi_i : \mathbb{R} \to \mathbb{R}$ satisfying (4) for $i = 1, \dots, n$ with the same order $\nu > 0$, we consider the function $f : \mathbb{R}^p \to \mathbb{R}$ defined by the following form:

$$f(x) := \frac{1}{n}\sum_{i=1}^n \varphi_i(a_i^{\top}x + b_i), \tag{8}$$

where $a_i \in \mathbb{R}^p$ and $b_i \in \mathbb{R}$ are given vectors and numbers, respectively, for $i = 1, \dots, n$. This convex function is called a finite sum and is widely used in machine learning and statistics. The decomposable structure in (8) often appears in generalized linear models [7, ] and empirical risk minimization [7], where $\varphi_i$ is referred to as a loss function, as can be found, e.g., in Table 2.

Next, we show that if $\varphi_i$ is generalized self-concordant with $\nu \in [2, 3]$, then $f$ is also generalized self-concordant. This result is a direct consequence of Proposition 1 and Proposition 2.

Corollary 1. If $\varphi_i$ in (8) satisfies (4) for $i = 1, \dots, n$ with the same order $\nu \in [2, 3]$ and $M_{\varphi_i} \ge 0$, then $f$ defined by (8) is also $(M_f, \nu)$-generalized self-concordant in the sense of Definition 2 with the same order $\nu$ and the constant $M_f := n^{\frac{\nu}{2}-1}\max\big\{M_{\varphi_i}\|a_i\|_2^{3-\nu} \mid 1 \le i \le n\big\}$.

Finally, we show that if we regularize $f$ in (8) by a strongly convex quadratic term, then the resulting function becomes self-concordant. The proof can follow the same path as [7, Lemma ].

Proposition 5. Let $f(x) := \frac{1}{n}\sum_{i=1}^n \varphi_i(a_i^{\top}x + b_i) + \psi(x)$, where $\psi(x) := \frac{1}{2}\langle Qx, x\rangle + c^{\top}x$ is a strongly convex quadratic function with $Q \in \mathcal{S}^p_{++}$. If $\varphi_i$ satisfies (4) for $i = 1, \dots, n$ with the same order $\nu \in (0, 3]$ and a constant $M_{\varphi_i} > 0$, then $f$ is $(\hat{M}_f, 3)$-generalized self-concordant in the sense of Definition 2 with $\hat{M}_f := \lambda_{\min}(Q)^{\frac{\nu-3}{2}}\max\big\{M_{\varphi_i}\|a_i\|_2^{3-\nu} \mid 1 \le i \le n\big\}$.

2.5 Fenchel's conjugate of generalized self-concordant functions

Primal-dual theory is fundamental in convex optimization. Hence, it is important to study the Fenchel conjugate of generalized self-concordant functions.

Let $f : \mathbb{R}^p \to \mathbb{R}$ be an $(M_f, \nu)$-generalized self-concordant function. We consider Fenchel's conjugate $f^{*}$ of $f$ as

$$f^{*}(x) = \sup_{u}\big\{\langle x, u\rangle - f(u) \mid u \in \mathrm{dom}(f)\big\}. \tag{9}$$

Since $f$ is proper, closed, and convex, $f^{*}$ is well-defined and also proper, closed, and convex. Moreover, since $f$ is smooth and convex, by Fermat's rule, if $u^{*}(x)$ satisfies $\nabla f(u^{*}(x)) = x$, then $f^{*}$ is well-defined at $x$. This shows that $\mathrm{dom}(f^{*}) = \{x \in \mathbb{R}^p \mid \nabla f(u^{*}(x)) = x \text{ is solvable}\}$.

Example 2. Let us look at some univariate functions. By using (9), we can directly show that:

1. If $\varphi(s) = \log(1 + e^{s})$, then $\varphi^{*}(t) = t\log(t) + (1 - t)\log(1 - t)$.

2. If $\varphi(s) = s\log(s)$, then $\varphi^{*}(t) = e^{t-1}$.

3. If $\varphi(s) = e^{s}$, then $\varphi^{*}(t) = t\log(t) - t$.

Intuitively, these examples show that if $\varphi$ is generalized self-concordant, then its conjugate $\varphi^{*}$ is also generalized self-concordant. For more examples, we refer to [3, Chapter 3]. Let us generalize this result in the following proposition whose proof is given in Appendix A.

Proposition 6. If $f$ is $(M_f, \nu)$-generalized self-concordant in $\mathrm{dom}(f) \subseteq \mathbb{R}^p$ such that $\nabla^2 f(x) \succ 0$ for $x \in \mathrm{dom}(f)$, then the conjugate function $f^{*}$ of $f$ given by (9) is well-defined, and $(M_{f^{*}}, \nu_{*})$-generalized self-concordant on
$$ \mathrm{dom}(f^{*}) := \left\{ x \in \mathbb{R}^p \mid f(u) - \langle x, u\rangle \text{ is bounded from below on } \mathrm{dom}(f) \right\}, $$
where $M_{f^{*}} = M_f$ and $\nu_{*} = 6 - \nu$, provided that $\nu \in [3, 6)$ if $p > 1$ and $\nu \in (0, 6)$ if $p = 1$. Moreover, we have $\nabla f^{*}(x) = u^{*}(x)$ and $\nabla^2 f^{*}(x) = \nabla^2 f(u^{*}(x))^{-1}$, where $u^{*}(x)$ is the unique solution of the maximization problem $\max_{u} \{\langle x, u\rangle - f(u) \mid u \in \mathrm{dom}(f)\}$ in (9) for any $x \in \mathrm{dom}(f^{*})$.

Proposition 6 allows us to apply the generalized self-concordance theory of this paper to the dual problem of a convex problem involving generalized self-concordant functions, especially when the objective function of the primal problem is generalized self-concordant with $\nu \in (3, 4]$. The Fenchel conjugates are certainly useful when we develop optimization algorithms to solve constrained convex optimization problems involving generalized self-concordant functions, see, e.g., [65, 66].

2.6 Generalized self-concordant approximation of nonsmooth convex functions

Several well-known convex functions are nonsmooth. However, they can be approximated up to an arbitrary accuracy by a generalized self-concordant function via smoothing. Smoothing techniques allow us to extend the applicability of our theory to nonsmooth convex problems.

Let $f : \mathbb{R}^p \to \mathbb{R} \cup \{+\infty\}$ be a proper, closed, possibly nonsmooth, and convex function. One can smooth $f$ using the following Nesterov's smoothing technique [4]:
$$ f_{\gamma}(x) := \sup_{u \in \mathrm{dom}(f^{*})} \left\{ \langle x, u\rangle - f^{*}(u) - \gamma\omega(u) \right\}, \tag{10} $$
where $f^{*}$ is the Fenchel conjugate of $f$, $\omega : \mathrm{dom}(\omega) \subseteq \mathbb{R}^p \to \mathbb{R}$ is a smooth convex function such that $\mathrm{dom}(f^{*}) \subseteq \mathrm{dom}(\omega)$, and $\gamma > 0$ is called a smoothness parameter. In particular, if $f$ is Lipschitz continuous, then $\mathrm{dom}(f^{*})$ is bounded [3]. Hence, the sup operator in (10) reduces to the max operator. Our goal is to choose an appropriate smoothing function $\omega$ such that the smoothed function $f_{\gamma}$ is well-defined and generalized self-concordant for any fixed smoothness parameter $\gamma > 0$.

Example 3. Let us provide a few examples with well-known nonsmooth convex functions:

(a) Consider the $\ell_1$-norm function $f(x) := \|x\|_1$ in $\mathbb{R}^p$. Then, it can be rewritten as
$$ f(x) = \max_{u} \left\{ \langle x, u\rangle \mid \|u\|_{\infty} \leq 1 \right\} = \max_{u,v} \left\{ \langle x, u - v\rangle \;\Big|\; \sum_{i=1}^p (u_i + v_i) = 1,\ u, v \in \mathbb{R}^p_{+} \right\}. $$
We can smooth this function by $f_{\gamma}$ by choosing $\omega(u, v) := \ln(2p) + \sum_{i=1}^p \left[ u_i\ln(u_i) + v_i\ln(v_i) \right]$. In this case, we obtain
$$ f_{\gamma}(x) = \gamma\ln\left( \sum_{i=1}^p \left( e^{x_i/\gamma} + e^{-x_i/\gamma} \right) \right) - \gamma\ln(2p). $$
This function is clearly generalized self-concordant with $\nu = 2$, see [63, Lemma 4]. However, if we choose $\omega(u) := p - \sum_{i=1}^p \sqrt{1 - u_i^2}$, then we get $f_{\gamma}(x) = \sum_{i=1}^p \sqrt{x_i^2 + \gamma^2} - \gamma p$. In this case, $f_{\gamma}$ is also generalized self-concordant with $\nu = \frac{8}{3}$ and $M_{f_{\gamma}} = 3\gamma^{-\frac{2}{3}}$.

(b) The hinge loss function $\varphi(t) := \max\{0, t\}$ can be written as $\varphi(t) = \frac{1}{2}(|t| + t)$. Hence, we can smooth this function by
$$ \varphi_{\gamma}(t) := \frac{\gamma}{2}\ln\left( \frac{e^{t/\gamma} + e^{-t/\gamma}}{2} \right) + \frac{t}{2} $$
with a smoothness parameter $\gamma > 0$. Clearly, $\varphi_{\gamma}$ is generalized self-concordant with $\nu = 2$.
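The square-root smoothing in Example 3(a) can be checked numerically. The following is a minimal sketch, not part of the original paper: the helper names `smoothed_l1` and `gsc_ratio`, the test vector, and the grid are illustrative assumptions. It verifies that $\sum_i \sqrt{x_i^2 + \gamma^2} - \gamma p$ stays within $\gamma p$ below $\|x\|_1$, and that the univariate function $\varphi(t) = \sqrt{t^2 + \gamma^2}$ satisfies $|\varphi'''(t)| \leq M \varphi''(t)^{\nu/2}$ with $\nu = 8/3$ and $M = 3\gamma^{-2/3}$ on a grid of points.

```python
import math


def smoothed_l1(x, gamma):
    """Smoothed l1-norm: sum_i sqrt(x_i^2 + gamma^2) - gamma * p."""
    return sum(math.sqrt(xi * xi + gamma * gamma) for xi in x) - gamma * len(x)


def gsc_ratio(t, gamma, nu=8.0 / 3.0):
    """Ratio |phi'''(t)| / phi''(t)^(nu/2) for phi(t) = sqrt(t^2 + gamma^2)."""
    s = t * t + gamma * gamma
    d2 = gamma ** 2 / s ** 1.5               # phi''(t)
    d3 = -3.0 * gamma ** 2 * t / s ** 2.5    # phi'''(t)
    return abs(d3) / d2 ** (nu / 2.0)


gamma = 0.1                      # illustrative smoothness parameter
x = [1.5, -0.3, 0.0, 2.0]        # illustrative test vector
l1 = sum(abs(xi) for xi in x)
approx = smoothed_l1(x, gamma)

# Candidate constant from Example 3(a) and the largest ratio seen on a grid in [-10, 10].
M_bound = 3.0 * gamma ** (-2.0 / 3.0)
worst_ratio = max(gsc_ratio(0.01 * k, gamma) for k in range(-1000, 1001))

print(l1, approx, worst_ratio, M_bound)
```

A grid check of this kind only illustrates the inequality; it does not replace the analytic argument.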

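In many practical problems, the conjugate $f^{*}$ of $f$ can be written as the sum $f^{*} = \varphi + \delta_{\mathcal{U}}$, where $\varphi$ is a generalized self-concordant function, and $\delta_{\mathcal{U}}$ is the indicator function of a given nonempty, closed, and convex set $\mathcal{U}$. In this case, $f_{\gamma}$ in (10) becomes
$$ f_{\gamma}(x) := \sup_{u} \left\{ \langle x, u\rangle - \varphi(u) - \gamma\omega(u) \mid u \in \mathcal{U} \right\}. \tag{11} $$
If $\omega$ is a generalized self-concordant function such that $\nu_{\varphi} = \nu_{\omega}$, and $\mathcal{U} = \mathrm{dom}(\omega) \cap \mathrm{dom}(\varphi)$, then $f_{\gamma}$ is generalized self-concordant with $\nu_{f_{\gamma}} = 6 - \nu_{\varphi}$, as shown in Proposition 6.

2.7 Key bounds on Hessian map, gradient map, and function values

Now, we develop some key bounds on the local norms, Hessian map, gradient map, and function values of generalized self-concordant functions. In this subsection, we assume that the Hessian map $\nabla^2 f$ of $f$ is nondegenerate at any point in its domain. For this purpose, given $\nu \geq 2$, we define the following quantity for any $x, y \in \mathrm{dom}(f)$:
$$ d_{\nu}(x, y) := \begin{cases} M_f \|y - x\|_2 & \text{if } \nu = 2, \\ \left(\frac{\nu}{2} - 1\right) M_f \|y - x\|_2^{3-\nu} \|y - x\|_x^{\nu-2} & \text{if } \nu > 2. \end{cases} \tag{12} $$
Here, if $\nu > 3$, then we require $x \neq y$. Otherwise, we set $d_{\nu}(x, y) := 0$ if $x = y$. In addition, we also define the function $\omega_{\nu} : \mathbb{R} \to \mathbb{R}_{+}$ as
$$ \omega_{\nu}(\tau) := \begin{cases} \frac{1}{(1 - \tau)^{\frac{2}{\nu-2}}} & \text{if } \nu > 2, \\ e^{\tau} & \text{if } \nu = 2, \end{cases} \tag{13} $$
with $\mathrm{dom}(\omega_{\nu}) = (-\infty, 1)$ if $\nu > 2$, and $\mathrm{dom}(\omega_{\nu}) = \mathbb{R}$ if $\nu = 2$. We also adopt the Dikin ellipsoidal notion from [40] as $W^0(x; r) := \{ y \in \mathbb{R}^p \mid d_{\nu}(x, y) < r \}$.

The next proposition provides a bound on the local norm defined by a generalized self-concordant function $f$. This bound is given for the local distances $\|y - x\|_x$ and $\|y - x\|_y$ between two points $x$ and $y$ in $\mathrm{dom}(f)$.

Proposition 7 (Bound of local norms). If $\nu > 2$, then, for any $x \in \mathrm{dom}(f)$, we have $W^0(x; 1) \subseteq \mathrm{dom}(f)$. For any $x, y \in \mathrm{dom}(f)$, let $d_{\nu}(x, y)$ be defined by (12), and $\omega_{\nu}$ be defined by (13). Then, we have
$$ \omega_{\nu}(-d_{\nu}(x, y))^{\frac{1}{2}} \|y - x\|_x \leq \|y - x\|_y \leq \omega_{\nu}(d_{\nu}(x, y))^{\frac{1}{2}} \|y - x\|_x. \tag{14} $$
If $\nu > 2$, then the right-hand side inequality of (14) holds if $d_{\nu}(x, y) < 1$.

Proof. We first consider the case $\nu > 2$. Let $u \in \mathbb{R}^p$ and $u \neq 0$. Consider the following univariate function:
$$ \phi(t) := \langle \nabla^2 f(x + tu)u, u\rangle^{1 - \frac{\nu}{2}} = \|u\|_{x+tu}^{2-\nu}. $$
It is easy to compute the derivative of this function, and obtain
$$ \phi'(t) = \frac{2-\nu}{2} \cdot \frac{\langle \nabla^3 f(x + tu)[u]u, u\rangle}{\langle \nabla^2 f(x + tu)u, u\rangle^{\frac{\nu}{2}}} = \frac{2-\nu}{2} \cdot \frac{\langle \nabla^3 f(x + tu)[u]u, u\rangle}{\|u\|_{x+tu}^{\nu}}. $$
Using Definition 2 with $u = v$ and $x + tu$ instead of $x$, we have $|\phi'(t)| \leq \left(\frac{\nu}{2} - 1\right) M_f \|u\|_2^{3-\nu}$. This implies that $\phi(t) \geq \phi(0) - \left(\frac{\nu}{2} - 1\right) M_f \|u\|_2^{3-\nu} |t|$. On the other hand, we can see that $\mathrm{dom}(\phi) = \{ t \in \mathbb{R} \mid \phi(t) > 0 \}$. Hence, $\mathrm{dom}(\phi)$ contains the interval
$$ \left( -\frac{\phi(0)}{\left(\frac{\nu}{2} - 1\right) M_f \|u\|_2^{3-\nu}},\ \frac{\phi(0)}{\left(\frac{\nu}{2} - 1\right) M_f \|u\|_2^{3-\nu}} \right). $$
Using this fact and the definition of $\phi$, we can show that $\mathrm{dom}(f)$ contains
$$ \left\{ y := x + tu \;\Big|\; |t| < \frac{\|u\|_x^{2-\nu}}{\left(\frac{\nu}{2} - 1\right) M_f \|u\|_2^{3-\nu}} \right\}. $$
However, since $|t| = \frac{\|y - x\|_x^{\nu-2} \|y - x\|_2^{3-\nu}}{\|u\|_x^{\nu-2} \|u\|_2^{3-\nu}}$, the condition $|t| < \frac{\|u\|_x^{2-\nu}}{\left(\frac{\nu}{2} - 1\right) M_f \|u\|_2^{3-\nu}}$ is equivalent to $d_{\nu}(x, y) < 1$. This shows that $W^0(x; 1) \subseteq \mathrm{dom}(f)$.

Since $\left| \int_0^1 \phi'(t)\,dt \right| \leq \int_0^1 |\phi'(t)|\,dt$, integrating $\phi'(t)$ over the interval $[0, 1]$ we get
$$ \left| \|u\|_{x+u}^{2-\nu} - \|u\|_x^{2-\nu} \right| \leq \left(\frac{\nu}{2} - 1\right) M_f \|u\|_2^{3-\nu}. $$
Using $u = y - x$ in the last inequality, we get $\left| \|y - x\|_y^{2-\nu} - \|y - x\|_x^{2-\nu} \right| \leq \left(\frac{\nu}{2} - 1\right) M_f \|y - x\|_2^{3-\nu}$, which is equivalent to
$$ \|y - x\|_x^{2-\nu} \left[ 1 - d_{\nu}(x, y) \right] \leq \|y - x\|_y^{2-\nu} \leq \|y - x\|_x^{2-\nu} \left[ 1 + d_{\nu}(x, y) \right], $$
given that $d_{\nu}(x, y) < 1$. Taking the power of $\frac{1}{\nu-2} > 0$ in both sides, we get (14) for the case $\nu > 2$.

Now, we consider the case $\nu = 2$. Let $0 \neq u \in \mathbb{R}^p$. We consider the following function:
$$ \phi(t) := \ln\left( \langle \nabla^2 f(x + tu)u, u\rangle \right) = \ln\left( \|u\|_{x+tu}^2 \right). $$
It is easy to show that $\phi'(t) = \frac{\langle \nabla^3 f(x + tu)[u]u, u\rangle}{\langle \nabla^2 f(x + tu)u, u\rangle} = \frac{\langle \nabla^3 f(x + tu)[u]u, u\rangle}{\|u\|_{x+tu}^2}$. Using again Definition 2 with $u = v$ and $x + tu$ instead of $x$, we obtain $|\phi'(t)| \leq M_f \|u\|_2$. Since $\left| \int_0^1 \phi'(t)\,dt \right| \leq \int_0^1 |\phi'(t)|\,dt$, integrating $\phi'(t)$ over the interval $[0, 1]$ we get
$$ \left| \ln \|u\|_{x+u}^2 - \ln \|u\|_x^2 \right| \leq M_f \|u\|_2. $$
Substituting $u = y - x$ into this inequality, we get $\left| \ln \|y - x\|_y^2 - \ln \|y - x\|_x^2 \right| \leq M_f \|y - x\|_2$. Hence,
$$ \ln \|y - x\|_x^2 - M_f \|y - x\|_2 \leq \ln \|y - x\|_y^2 \leq \ln \|y - x\|_x^2 + M_f \|y - x\|_2. $$
This inequality leads to (14) for the case $\nu = 2$.

Next, we develop new bounds for the Hessian map of $f$ in the following proposition.

Proposition 8 (Bounds of Hessian map). For any $x, y \in \mathrm{dom}(f)$, let $d_{\nu}(x, y)$ be defined by (12), and $\omega_{\nu}$ be defined by (13). Then, we have
$$ \left[ 1 - d_{\nu}(x, y) \right]^{\frac{2}{\nu-2}} \nabla^2 f(x) \preceq \nabla^2 f(y) \preceq \left[ 1 - d_{\nu}(x, y) \right]^{-\frac{2}{\nu-2}} \nabla^2 f(x) \quad \text{if } \nu > 2, \tag{15} $$
$$ e^{-d_{\nu}(x, y)} \nabla^2 f(x) \preceq \nabla^2 f(y) \preceq e^{d_{\nu}(x, y)} \nabla^2 f(x) \quad \text{if } \nu = 2, \tag{16} $$
where (15) holds if $d_{\nu}(x, y) < 1$ for the case $\nu > 2$.

Proof. Let $\nu > 2$ and $0 \neq u \in \mathbb{R}^p$. Consider the following univariate function on $[0, 1]$:
$$ \psi(t) := \langle \nabla^2 f(x + t(y - x))u, u\rangle, \quad t \in [0, 1]. $$
If we denote $y_t := x + t(y - x)$, then $y_t - x = t(y - x)$, $\psi(t) = \|u\|_{y_t}^2$, and $\psi'(t) = \langle \nabla^3 f(y_t)[y - x]u, u\rangle$. By Definition 2, we have
$$ |\psi'(t)| \leq M_f \|u\|_{y_t}^2 \|y - x\|_{y_t}^{\nu-2} \|y - x\|_2^{3-\nu} = M_f\, \psi(t) \left[ \frac{\|y_t - x\|_{y_t}}{t} \right]^{\nu-2} \|y - x\|_2^{3-\nu}, $$
which implies
$$ \left| \frac{d\ln\psi(t)}{dt} \right| \leq M_f \left[ \frac{\|y_t - x\|_{y_t}}{t} \right]^{\nu-2} \|y - x\|_2^{3-\nu}. \tag{17} $$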

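Assume that $d_{\nu}(x, y) < 1$. Then, by the definition of $y_t$ and $d_{\nu}$, we have $d_{\nu}(x, y_t) = t\,d_{\nu}(x, y)$ and $\|y_t - x\|_x = t\|y - x\|_x$. Using Proposition 7, we can derive
$$ \|y_t - x\|_{y_t} \leq \frac{\|y_t - x\|_x}{\left[ 1 - d_{\nu}(x, y_t) \right]^{\frac{1}{\nu-2}}} = \frac{t\,\|y - x\|_x}{\left[ 1 - d_{\nu}(x, y)\,t \right]^{\frac{1}{\nu-2}}}. $$
Hence, we can further derive
$$ \left[ \frac{\|y_t - x\|_{y_t}}{t} \right]^{\nu-2} \leq \frac{\|y - x\|_x^{\nu-2}}{1 - d_{\nu}(x, y)\,t}. $$
Integrating $\frac{d\ln\psi(t)}{dt}$ with respect to $t$ on $[0, 1]$ and using the last inequality and (17), we get
$$ \left| \int_0^1 \frac{d\ln\psi(t)}{dt}\,dt \right| \leq \int_0^1 \left| \frac{d\ln\psi(t)}{dt} \right| dt \leq M_f \|y - x\|_x^{\nu-2} \|y - x\|_2^{3-\nu} \int_0^1 \frac{dt}{1 - d_{\nu}(x, y)\,t}. $$
We can compute this integral explicitly as
$$ \left| \ln\left[ \frac{\|u\|_y^2}{\|u\|_x^2} \right] \right| = \left| \ln\frac{\psi(1)}{\psi(0)} \right| \leq -\frac{2}{\nu-2} \ln\left[ 1 - d_{\nu}(x, y) \right] = \ln\left[ 1 - d_{\nu}(x, y) \right]^{-\frac{2}{\nu-2}}. $$
Rearranging this inequality, we obtain
$$ \left[ 1 - d_{\nu}(x, y) \right]^{\frac{2}{\nu-2}} \leq \frac{\|u\|_y^2}{\|u\|_x^2} \equiv \frac{\langle \nabla^2 f(y)u, u\rangle}{\langle \nabla^2 f(x)u, u\rangle} \leq \left[ 1 - d_{\nu}(x, y) \right]^{-\frac{2}{\nu-2}}. $$
Since this inequality holds for any $0 \neq u \in \mathbb{R}^p$, it implies (15). If $u = 0$, then (15) obviously holds.

Now, we consider the case $\nu = 2$. It follows from (17) that
$$ \left| \ln\left[ \frac{\|u\|_y^2}{\|u\|_x^2} \right] \right| = \left| \int_0^1 \frac{d\ln\psi(t)}{dt}\,dt \right| \leq \int_0^1 \left| \frac{d\ln\psi(t)}{dt} \right| dt \leq M_f \|y - x\|_2 \int_0^1 dt = M_f \|y - x\|_2. $$
Since this inequality holds for any $u \in \mathbb{R}^p$, it implies (16).

The following corollary provides a bound on the mean of the Hessian map $G(x, y) := \int_0^1 \nabla^2 f(x + \tau(y - x))\,d\tau$, whose proof is moved to Appendix A.

Corollary 2. For any $x, y \in \mathrm{dom}(f)$, let $d_{\nu}(x, y)$ be defined by (12). Then, we have
$$ \underline{\kappa}_{\nu}(d_{\nu}(x, y))\, \nabla^2 f(x) \preceq \int_0^1 \nabla^2 f(x + \tau(y - x))\,d\tau \preceq \overline{\kappa}_{\nu}(d_{\nu}(x, y))\, \nabla^2 f(x), \tag{18} $$
where
$$ \underline{\kappa}_{\nu}(t) := \begin{cases} \frac{1 - e^{-t}}{t} & \text{if } \nu = 2, \\ \frac{\nu-2}{\nu t} \left[ 1 - (1 - t)^{\frac{\nu}{\nu-2}} \right] & \text{if } \nu > 2, \end{cases} \quad \text{and} \quad \overline{\kappa}_{\nu}(t) := \begin{cases} \frac{e^{t} - 1}{t} & \text{if } \nu = 2, \\ -\frac{\ln(1 - t)}{t} & \text{if } \nu = 4, \\ \frac{\nu-2}{(\nu-4)t} \left[ 1 - (1 - t)^{\frac{\nu-4}{\nu-2}} \right] & \text{if } \nu > 2 \text{ and } \nu \neq 4. \end{cases} $$
Here, if $\nu > 2$, then we require $d_{\nu}(x, y)$ to satisfy $d_{\nu}(x, y) < 1$ for $x, y \in \mathrm{dom}(f)$ in (18).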

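We prove a bound on the gradient inner product of a generalized self-concordant function $f$.

Proposition 9 (Bounds of gradient map). For any $x, y \in \mathrm{dom}(f)$, we have
$$ \bar{\omega}_{\nu}(-d_{\nu}(x, y)) \|y - x\|_x^2 \leq \langle \nabla f(y) - \nabla f(x), y - x\rangle \leq \bar{\omega}_{\nu}(d_{\nu}(x, y)) \|y - x\|_x^2, \tag{19} $$
where, if $\nu > 2$, then the right-hand side inequality of (19) holds if $d_{\nu}(x, y) < 1$, and
$$ \bar{\omega}_{\nu}(\tau) := \begin{cases} \frac{e^{\tau} - 1}{\tau} & \text{if } \nu = 2, \\ -\frac{\ln(1 - \tau)}{\tau} & \text{if } \nu = 4, \\ \frac{\nu-2}{\nu-4} \cdot \frac{1 - (1 - \tau)^{\frac{\nu-4}{\nu-2}}}{\tau} & \text{otherwise.} \end{cases} \tag{20} $$
Here, $\bar{\omega}_{\nu}(\tau) \geq 0$ for all $\tau \in \mathrm{dom}(\bar{\omega}_{\nu})$.

Proof. Let $y_t := x + t(y - x)$. By the mean-value theorem, we have
$$ \langle \nabla f(y) - \nabla f(x), y - x\rangle = \int_0^1 \langle \nabla^2 f(y_t)(y - x), y - x\rangle\,dt = \int_0^1 \frac{1}{t^2} \|y_t - x\|_{y_t}^2\,dt. \tag{21} $$
We consider the function $\omega_{\nu}$ defined by (13). It follows from Proposition 7 that
$$ \omega_{\nu}(-d_{\nu}(x, y_t)) \|y_t - x\|_x^2 \leq \|y_t - x\|_{y_t}^2 \leq \omega_{\nu}(d_{\nu}(x, y_t)) \|y_t - x\|_x^2. $$
Now, we note that $d_{\nu}(x, y_t) = t\,d_{\nu}(x, y)$ and $\|y_t - x\|_x = t\|y - x\|_x$. Hence, the last estimate leads to
$$ \omega_{\nu}(-t\,d_{\nu}(x, y)) \|y - x\|_x^2 \leq \frac{1}{t^2} \|y_t - x\|_{y_t}^2 \leq \omega_{\nu}(t\,d_{\nu}(x, y)) \|y - x\|_x^2. $$
Substituting this estimate into (21), we obtain
$$ \|y - x\|_x^2 \int_0^1 \omega_{\nu}(-t\,d_{\nu}(x, y))\,dt \leq \langle \nabla f(y) - \nabla f(x), y - x\rangle \leq \|y - x\|_x^2 \int_0^1 \omega_{\nu}(t\,d_{\nu}(x, y))\,dt. $$
Using the function $\omega_{\nu}(\tau)$ from (13) to compute the left-hand side and the right-hand side integrals, we obtain (19).

Finally, we prove a bound on the function values of an $(M_f, \nu)$-generalized self-concordant function $f$ in the following proposition.

Proposition 10 (Bounds of function values). For any $x, y \in \mathrm{dom}(f)$, we have
$$ \bar{\bar{\omega}}_{\nu}(-d_{\nu}(x, y)) \|y - x\|_x^2 \leq f(y) - f(x) - \langle \nabla f(x), y - x\rangle \leq \bar{\bar{\omega}}_{\nu}(d_{\nu}(x, y)) \|y - x\|_x^2, \tag{22} $$
where, if $\nu > 2$, then the right-hand side inequality of (22) holds if $d_{\nu}(x, y) < 1$. Here, $d_{\nu}(x, y)$ is defined by (12) and $\bar{\bar{\omega}}_{\nu}$ is defined by
$$ \bar{\bar{\omega}}_{\nu}(\tau) := \begin{cases} \frac{e^{\tau} - \tau - 1}{\tau^2} & \text{if } \nu = 2, \\ \frac{-\tau - \ln(1 - \tau)}{\tau^2} & \text{if } \nu = 3, \\ \frac{(1 - \tau)\ln(1 - \tau) + \tau}{\tau^2} & \text{if } \nu = 4, \\ \frac{\nu-2}{4-\nu} \cdot \frac{1}{\tau} \left[ \frac{\nu-2}{2(3-\nu)\tau} \left( (1 - \tau)^{\frac{2(3-\nu)}{2-\nu}} - 1 \right) - 1 \right] & \text{otherwise.} \end{cases} \tag{23} $$
Note that $\bar{\bar{\omega}}_{\nu}(\tau) \geq 0$ for all $\tau \in \mathrm{dom}(\bar{\bar{\omega}}_{\nu})$.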
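For $\nu = 2$, the Hessian, gradient, and function-value bounds above reduce to closed forms in $d = M_f |y - x|$ with multipliers $e^{\pm d}$, $(e^{\tau} - 1)/\tau$, and $(e^{\tau} - \tau - 1)/\tau^2$ evaluated at $\tau = \pm d$. The following is a minimal numerical sketch, assuming the univariate logistic loss $\varphi(t) = \log(1 + e^{-t})$, which satisfies $|\varphi'''| \leq \varphi''$ and hence is taken here with $M = 1$; the helper names and test points are illustrative and not from the original text.

```python
import math

M = 1.0  # assumed constant for the logistic loss with nu = 2


def phi(t):
    return math.log1p(math.exp(-t))


def dphi(t):
    return -1.0 / (1.0 + math.exp(t))


def d2phi(t):
    s = 1.0 / (1.0 + math.exp(-t))
    return s * (1.0 - s)


def omega_grad(tau):
    """(e^tau - 1) / tau, the nu = 2 multiplier in the gradient bound (tau != 0)."""
    return math.expm1(tau) / tau


def omega_func(tau):
    """(e^tau - tau - 1) / tau^2, the nu = 2 multiplier in the function-value bound."""
    return (math.expm1(tau) - tau) / tau ** 2


def check_bounds(x, y):
    """Return (hessian_ok, gradient_ok, function_ok) for two distinct points x, y."""
    d = M * abs(y - x)
    r2 = d2phi(x) * (y - x) ** 2  # squared local norm ||y - x||_x^2
    hess_ok = math.exp(-d) * d2phi(x) <= d2phi(y) <= math.exp(d) * d2phi(x)
    g = (dphi(y) - dphi(x)) * (y - x)
    grad_ok = omega_grad(-d) * r2 <= g <= omega_grad(d) * r2
    b = phi(y) - phi(x) - dphi(x) * (y - x)
    func_ok = omega_func(-d) * r2 <= b <= omega_func(d) * r2
    return hess_ok, grad_ok, func_ok


for pair in [(0.0, 1.0), (0.5, 2.0), (-1.0, 1.5), (3.0, -2.0)]:
    print(pair, check_bounds(*pair))
```

The check uses only a few point pairs, so it illustrates the inequalities rather than proving them.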

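Proof. For any $x, y \in \mathrm{dom}(f)$, let $y_t := x + t(y - x)$. Then, $y_t - x = t(y - x)$. By the mean-value theorem, we have
$$ f(y) - f(x) - \langle \nabla f(x), y - x\rangle = \int_0^1 \frac{1}{t} \langle \nabla f(y_t) - \nabla f(x), y_t - x\rangle\,dt. $$
Now, by Proposition 9, we have
$$ \bar{\omega}_{\nu}(-d_{\nu}(x, y_t)) \|y_t - x\|_x^2 \leq \langle \nabla f(y_t) - \nabla f(x), y_t - x\rangle \leq \bar{\omega}_{\nu}(d_{\nu}(x, y_t)) \|y_t - x\|_x^2. $$
By definition, we have $d_{\nu}(x, y_t) = t\,d_{\nu}(x, y)$ and $\|y_t - x\|_x = t\|y - x\|_x$. Combining these relations and the above two inequalities, we can show that
$$ \|y - x\|_x^2 \int_0^1 t\,\bar{\omega}_{\nu}(-t\,d_{\nu}(x, y))\,dt \leq f(y) - f(x) - \langle \nabla f(x), y - x\rangle \leq \|y - x\|_x^2 \int_0^1 t\,\bar{\omega}_{\nu}(t\,d_{\nu}(x, y))\,dt. $$
By integrating the left-hand side and the right-hand side of this estimate using the definition (20) of $\bar{\omega}_{\nu}(\tau)$, we obtain (22).

3 Generalized self-concordant minimization

We apply the theory developed in the previous sections to design new Newton-type methods to minimize a generalized self-concordant function. More precisely, we consider the following noncomposite convex problem:
$$ f^{\star} := \min_{x \in \mathbb{R}^p} f(x), \tag{24} $$
where $f : \mathbb{R}^p \to \mathbb{R}$ is an $(M_f, \nu)$-generalized self-concordant function in the sense of Definition 2 with $\nu \in [2, 3]$ and $M_f \geq 0$. Since $f$ is smooth and convex, the optimality condition $\nabla f(x^{\star}_f) = 0$ is necessary and sufficient for $x^{\star}_f$ to be an optimal solution of (24).

The following theorem shows the existence and uniqueness of the solution $x^{\star}_f$ of (24). It can be considered as a special case of Theorem 4 below with $g \equiv 0$.

Theorem 1. Suppose that $f \in \mathcal{F}_{M_f,\nu}(\mathrm{dom}(f))$ for given parameters $M_f > 0$ and $\nu \in [2, 3]$. Denote by $\sigma_{\min}(x) := \lambda_{\min}(\nabla^2 f(x))$ and $\lambda(x) := \|\nabla f(x)\|_x^{*}$ for $x \in \mathrm{dom}(f)$. Suppose further that there exists $x \in \mathrm{dom}(f)$ such that $\sigma_{\min}(x) > 0$ and
$$ \lambda(x) < \frac{2\left[ \sigma_{\min}(x) \right]^{\frac{3-\nu}{2}}}{(4 - \nu) M_f}. $$
Then, problem (24) has a unique solution $x^{\star}_f$ in $\mathrm{dom}(f)$.

We say that the unique solution $x^{\star}_f$ of (24) is strongly regular if $\nabla^2 f(x^{\star}_f) \succ 0$. The strong regularity of $x^{\star}_f$ for (24) is equivalent to the strong second-order optimality condition. Theorem 1 covers [40, Theorem 4] for standard self-concordant functions as a special case.

We consider the following Newton-type scheme to solve (24). Starting from an arbitrary initial point $x^0 \in \mathrm{dom}(f)$, we generate a sequence $\{x^k\}_{k \geq 0}$ as follows:
$$ x^{k+1} := x^k + \tau_k n_{\mathrm{nt}}^k, \quad \text{where } n_{\mathrm{nt}}^k := -\nabla^2 f(x^k)^{-1} \nabla f(x^k), \tag{25} $$
and $\tau_k \in (0, 1]$ is a given step-size. We call $n_{\mathrm{nt}}^k$ a Newton direction. If $\tau_k = 1$ for all $k \geq 0$, then we call (25) a full-step Newton scheme. Otherwise, i.e., $\tau_k \in (0, 1)$, we call (25) a damped-step Newton scheme.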

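Clearly, computing the Newton direction $n_{\mathrm{nt}}^k$ requires solving the following linear system:
$$ \nabla^2 f(x^k)\, n_{\mathrm{nt}}^k = -\nabla f(x^k). \tag{26} $$
Next, we define a Newton decrement $\lambda_k$ and a quantity $\beta_k$, respectively, as
$$ \lambda_k := \|n_{\mathrm{nt}}^k\|_{x^k} = \|\nabla f(x^k)\|_{x^k}^{*} \quad \text{and} \quad \beta_k := M_f \|n_{\mathrm{nt}}^k\|_2 = M_f \|\nabla^2 f(x^k)^{-1} \nabla f(x^k)\|_2. \tag{27} $$
With $\lambda_k$ and $\beta_k$ given by (27), we also define
$$ d_k := \begin{cases} \beta_k & \text{if } \nu = 2, \\ \left(\frac{\nu}{2} - 1\right) M_f^{\nu-2} \lambda_k^{\nu-2} \beta_k^{3-\nu} & \text{if } \nu \in (2, 3]. \end{cases} \tag{28} $$
Let us first show how to choose a suitable step-size $\tau_k$ in the damped-step Newton scheme and prove its convergence properties in the following theorem, whose proof can be found in Appendix A.5.

Theorem 2. Let $\{x^k\}$ be the sequence generated by the damped-step Newton scheme (25) with the following step-size:
$$ \tau_k := \begin{cases} \frac{1}{\beta_k} \ln(1 + \beta_k) & \text{if } \nu = 2, \\ \frac{1}{d_k} \left[ 1 - \left( 1 + \frac{4-\nu}{\nu-2} d_k \right)^{-\frac{\nu-2}{4-\nu}} \right] & \text{if } \nu \in (2, 3], \end{cases} \tag{29} $$
where $\lambda_k$, $\beta_k$ are defined by (27), and $d_k$ is defined by (28). Then, $\tau_k \in (0, 1]$, $\{x^k\} \subset \mathrm{dom}(f)$, and this step-size guarantees the following descent property:
$$ f(x^{k+1}) \leq f(x^k) - \Delta_k, \tag{30} $$
where $\Delta_k := \lambda_k^2 \tau_k - \bar{\bar{\omega}}_{\nu}(\tau_k d_k)\, \tau_k^2 \lambda_k^2 > 0$ with $\bar{\bar{\omega}}_{\nu}$ defined by (23), i.e., the multiplier used in the bound on function values of Proposition 10.

Assume that the unique solution $x^{\star}_f$ of (24) exists. Then, there exists a neighborhood $\mathcal{N}(x^{\star}_f)$ such that if we initialize the Newton scheme (25) at $x^0 \in \mathcal{N}(x^{\star}_f) \cap \mathrm{dom}(f)$, then the whole sequence $\{x^k\}$ converges to $x^{\star}_f$ at a quadratic rate.

Example 4 (Better step-size for regularized logistic and exponential models). Consider the minimization problem (24) with the objective function $f(\cdot) := \phi(\cdot) + \frac{\gamma}{2}\|\cdot\|_2^2$, where $\phi$ is defined as in (8) with $\varphi_i(t) = \log(1 + e^{-t})$ being the logistic loss. That is,
$$ f(x) := \frac{1}{n}\sum_{i=1}^n \log\left( 1 + e^{-a_i^{\top}x} \right) + \frac{\gamma}{2}\|x\|_2^2. $$
As we have shown in Section 2, $f$ is either generalized self-concordant with $\nu = 2$ or generalized self-concordant with $\nu = 3$, but with a different constant $M_f$. Let us define $R_A := \max\{ \|a_i\|_2 \mid 1 \leq i \leq n \}$. Then, if we consider $\nu = 2$, we have $M_f^{(2)} = R_A$ due to Corollary 1, while if we choose $\nu = 3$, then $M_f^{(3)} = \frac{1}{\sqrt{\gamma}} R_A$ due to Proposition 4. By the definition of $f$, we have $\nabla^2 f(x) \succeq \gamma\mathbb{I}$. Hence, using this inequality and the definition of $\lambda_k$ and $\beta_k$ from (27), we can show that
$$ \beta_k = M_f^{(2)} \|\nabla^2 f(x^k)^{-1} \nabla f(x^k)\|_2 \leq \frac{R_A}{\sqrt{\gamma}} \lambda_k = M_f^{(3)} \lambda_k. \tag{31} $$
For any $\tau > 0$, we have $\frac{\ln(1 + \tau)}{\tau} > \frac{1}{1 + 0.5\tau}$. Using this elementary result and (31), we obtain
$$ \tau_k^{(2)} = \frac{\ln(1 + \beta_k)}{\beta_k} > \frac{1}{1 + 0.5\beta_k} \geq \frac{1}{1 + 0.5 M_f^{(3)} \lambda_k} = \tau_k^{(3)}. $$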

This inequality shows that the step-size $\tau_k$ given by Theorem 2 satisfies $\tau_k^{(2)} > \tau_k^{(3)}$, where $\tau_k^{(\nu)}$ is the step-size computed by (29) for $\nu = 2$ and $\nu = 3$, respectively. This confirms that the damped-step Newton method using $\tau_k^{(2)}$ is theoretically better than the one using $\tau_k^{(3)}$. This result will be empirically confirmed by our experiments in Section 6.

Next, we study the full-step Newton scheme derived from (25) by setting the step-size $\tau_k = 1$ for all $k \geq 0$. Let $\sigma_k := \lambda_{\min}(\nabla^2 f(x^k))$ be the smallest eigenvalue of $\nabla^2 f(x^k)$. Since $\nabla^2 f(x^k) \succ 0$, we have $\sigma_k > 0$. The following theorem shows a local quadratic convergence of the full-step Newton scheme (25) for solving (24), whose proof can be found in Appendix A.6.

Theorem 3. Let $\{x^k\}$ be the sequence generated by the full-step Newton scheme (25) by setting the step-size $\tau_k = 1$ for $k \geq 0$. Let $d_{\nu}^k := d_{\nu}(x^k, x^{k+1})$ be defined by (12) and $\lambda_k$ be defined by (27). Then, the following statements hold:

(a) If $\nu = 2$ and the starting point $x^0$ satisfies $\sigma_0^{-1/2} \lambda_0 < \frac{d_2^{\star}}{M_f}$, then both sequences $\{\sigma_k^{-1/2} \lambda_k\}$ and $\{d_2^k\}$ decrease and quadratically converge to zero, where $d_2^{\star} \approx 0.12964$.

(b) If $2 < \nu < 3$ and the starting point $x^0$ satisfies $\sigma_0^{-\frac{3-\nu}{2}} \lambda_0 < \frac{1}{M_f} \min\left\{ \frac{2 d_{\nu}^{\star}}{\nu - 2}, \frac{1}{2} \right\}$, then both sequences $\{\sigma_k^{-\frac{3-\nu}{2}} \lambda_k\}$ and $\{d_{\nu}^k\}$ decrease and quadratically converge to zero, where $d_{\nu}^{\star}$ is the unique solution of the equation $(\nu - 2) R_{\nu}(d_{\nu}) = 4(1 - d_{\nu})^{\frac{4-\nu}{\nu-2}}$ in $d_{\nu}$ with $R_{\nu}$ given by (56).

(c) If $\nu = 3$ and the starting point $x^0$ satisfies $\lambda_0 < \frac{1}{2M_f}$, then the sequence $\{\lambda_k\}$ decreases and quadratically converges to zero.

As a consequence, if $\{d_{\nu}^k\}$ locally converges to zero at a quadratic rate, then $\{\|x^k - x^{\star}_f\|_{H_k}\}$ also locally converges to zero at a quadratic rate, where $H_k = \mathbb{I}$, the identity matrix, if $\nu = 2$; $H_k = \nabla^2 f(x^k)$ if $\nu = 3$; and $H_k = \nabla^2 f(x^k)^{\frac{\nu}{2}-1}$ if $2 < \nu < 3$. Hence, $\{x^k\}$ locally converges to $x^{\star}_f$, the unique solution of (24), at a quadratic rate.

If we combine the results of Theorem 2 and Theorem 3, then we can design a two-phase Newton algorithm for solving (24) as follows:

Phase 1: Starting from an arbitrary initial point $x^0 \in \mathrm{dom}(f)$, we perform the damped-step Newton scheme (25) until the condition in Theorem 3 is satisfied.

Phase 2: Using the output $x^j$ of Phase 1 as an initial point for the full-step Newton scheme (25) with $\tau_k = 1$, we perform this scheme until it achieves an $\varepsilon$-solution $x^k$ to (24).

We also note that the damped-step Newton scheme (25) can also achieve a local quadratic convergence, as shown in Theorem 2. Hence, we combine this fact and the above two-phase scheme to derive the Newton algorithm shown in Algorithm 1 below.

Per-iteration complexity: The main step of Algorithm 1 is the solution of the symmetric positive definite linear system (26). This system can be solved by using either Cholesky factorization or conjugate gradient methods, which, in the worst case, requires $\mathcal{O}(p^3)$ operations. Computing $\lambda_k$ requires the inner product $\langle n_{\mathrm{nt}}^k, \nabla f(x^k)\rangle$, which needs $\mathcal{O}(p)$ operations.

Conceptually, the two-phase option of Algorithm 1 requires the smallest eigenvalue of $\nabla^2 f(x^k)$ to terminate Phase 1. However, switching from Phase 1 to Phase 2 can be done automatically by allowing some tolerance in the step-size $\tau_k$. Indeed, the step-size $\tau_k$ given by (29) converges to $1$ as $k \to \infty$. Hence, when $\tau_k$ is close to $1$, e.g., $\tau_k \geq 0.9$, we can automatically set it to $1$ and remove the computation of $\lambda_k$ to reduce the computational time.
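The damped-step scheme with the explicit step-size of Theorem 2 can be sketched on a tiny instance of Example 4. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a scalar variable so that the Newton system is a single division, takes $\nu = 2$ with $M_f = \max_i |a_i|$ as in Example 4, and the function names `damped_step` and `damped_newton_1d`, the data, and the tolerance are hypothetical.

```python
import math


def softplus(z):
    """Numerically stable log(1 + e^z)."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def damped_step(lam, beta, M, nu):
    """Step-size of the damped-step scheme for nu in [2, 3] (lam, beta as in the text)."""
    if nu == 2:
        return math.log1p(beta) / beta if beta > 0.0 else 1.0
    d = (nu / 2.0 - 1.0) * M ** (nu - 2.0) * lam ** (nu - 2.0) * beta ** (3.0 - nu)
    if d == 0.0:
        return 1.0
    e = (nu - 2.0) / (4.0 - nu)
    return (1.0 - (1.0 + d / e) ** (-e)) / d


def damped_newton_1d(a, gamma, x0, tol=1e-10, max_iter=100):
    """Minimize (1/n) sum_i log(1 + exp(-a_i x)) + (gamma/2) x^2 over scalar x."""
    n = len(a)
    M = max(abs(ai) for ai in a)  # assumed constant for nu = 2

    def f(x):
        return sum(softplus(-ai * x) for ai in a) / n + 0.5 * gamma * x * x

    x, history = x0, [f(x0)]
    for _ in range(max_iter):
        g = sum(-ai * sigmoid(-ai * x) for ai in a) / n + gamma * x
        h = sum(ai * ai * sigmoid(ai * x) * sigmoid(-ai * x) for ai in a) / n + gamma
        step = -g / h                      # Newton direction (scalar case)
        lam = abs(step) * math.sqrt(h)     # Newton decrement (local norm of the direction)
        if lam <= tol:
            break
        beta = M * abs(step)
        x += damped_step(lam, beta, M, 2) * step
        history.append(f(x))
    return x, history


x_hat, hist = damped_newton_1d([1.0, -2.0, 3.0], gamma=0.1, x0=5.0)
print(x_hat, len(hist) - 1)
```

In this sketch the recorded objective values should be nonincreasing, in line with the descent property (30), and the step-size approaches $1$ near the solution, which is what makes the automatic switch to full steps described above possible.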

Algorithm 1 (Newton algorithm for generalized self-concordant minimization)
1: Inputs: Choose an arbitrary initial point $x^0 \in \mathrm{dom}(f)$ and a desired accuracy $\varepsilon > 0$.
2: Output: An $\varepsilon$-solution $x^k$ of (24).
3: Initialization: Compute $d_{\nu}^{\star}$ according to Theorem 3 if needed.
4: For $k = 0, \dots, k_{\max}$, perform:
5: Compute the Newton direction $n_{\mathrm{nt}}^k$ by solving $\nabla^2 f(x^k)\, n_{\mathrm{nt}}^k = -\nabla f(x^k)$.
6: Compute $\lambda_k := \|n_{\mathrm{nt}}^k\|_{x^k}$, and compute $\beta_k := M_f \|n_{\mathrm{nt}}^k\|_2$ if $\nu \neq 3$.
7: If $\lambda_k \leq \varepsilon$, then TERMINATE and return $x^k$.
8: If Phase 2 is used, then compute $\sigma_k = \lambda_{\min}(\nabla^2 f(x^k))$ if $\nu < 3$.
9: If Phase 2 is used and $(\lambda_k, \sigma_k)$ satisfies Theorem 3, then set $\tau_k := 1$ (full-step). Otherwise, compute the step-size $\tau_k$ by (29) (damped-step).
10: Update $x^{k+1} := x^k + \tau_k n_{\mathrm{nt}}^k$.
11: End for

In the one-phase option, we can always perform only Phase 1 until achieving an $\varepsilon$-optimal solution, as shown in Theorem 2. Therefore, the per-iteration complexity of Algorithm 1 is $\mathcal{O}(p^3) + \mathcal{O}(p)$ in the worst case. A careful implementation of conjugate gradient methods with a warm start can significantly reduce this per-iteration complexity.

Remark 3 (Inexact Newton methods). We can allow Algorithm 1 to compute the Newton direction $n_{\mathrm{nt}}^k$ approximately. In this case, we approximately solve the symmetric positive definite system (26). With an appropriate choice of stopping criterion, we can still prove convergence of Algorithm 1 under inexact computation of $n_{\mathrm{nt}}^k$. For instance, the following criterion is often used in inexact Newton methods [6], but defined via the local dual norm of $f$:
$$ \|\nabla^2 f(x^k)\, n_{\mathrm{nt}}^k + \nabla f(x^k)\|_{x^k}^{*} \leq \kappa \|\nabla f(x^k)\|_{x^k}^{*}, $$
for a given relaxation parameter $\kappa \in [0, 1)$. This extension can be found in our forthcoming work.

4 Composite generalized self-concordant minimization

Let $f \in \mathcal{F}_{M_f,\nu}(\mathrm{dom}(f))$, and $g$ be a proper, closed, and convex function. We consider the composite convex minimization problem, which we recall here for convenience of reference:
$$ F^{\star} := \min_{x \in \mathbb{R}^p} \left\{ F(x) := f(x) + g(x) \right\}. \tag{32} $$
Note that $\mathrm{dom}(F) := \mathrm{dom}(f) \cap \mathrm{dom}(g)$ may be empty. To make this problem nontrivial, we assume that $\mathrm{dom}(F)$ is nonempty. The optimality condition for (32) can be written as follows:
$$ 0 \in \nabla f(x^{\star}) + \partial g(x^{\star}). \tag{33} $$
Under the qualification condition $0 \in \mathrm{ri}(\mathrm{dom}(g) - \mathrm{dom}(f))$, (33) is necessary and sufficient for $x^{\star}$ to be an optimal solution of (32), where $\mathrm{ri}(\mathcal{X})$ is the relative interior of $\mathcal{X}$.

4.1 Existence, uniqueness, and regularity of optimal solutions

Assume that $\nabla^2 f(x)$ is positive definite (i.e., nonsingular) at some point $x \in \mathrm{dom}(F)$. We prove in the following theorem that problem (32) has a unique solution $x^{\star}$. The proof can be found in Appendix A.4. This theorem can also be considered as a generalization of [40, Theorem 4] and [6, Lemma 4] in the standard self-concordant settings of [40, 6].
