A full-Newton step infeasible interior-point algorithm for linear programming based on a kernel function

Zhongyi Liu, Wenyu Sun

College of Science, Hohai University, Nanjing 210098, China. Email: zhyi@hhu.edu.cn
School of Mathematics and Computer Science, Nanjing Normal University, Nanjing 210097, China. Email: wysun@njnu.edu.cn

Abstract

This paper proposes an infeasible interior-point algorithm with full-Newton steps for linear programming, which is an extension of the work of Roos (SIAM J. Optim., 16(4):1110-1136, 2006). We introduce a kernel function into the algorithm. For p ∈ [0, 1), polynomial complexity can be proved, and the result coincides with the best known result for infeasible interior-point methods, that is, O(n log(n/ε)).

Keywords: linear programming, infeasible interior-point method, full-Newton step, polynomial complexity, kernel function

AMS subject classification: 65K05, 90C51

1 Introduction

For a comprehensive treatment of interior-point methods (IPMs), we refer to Roos et al. [7]. In Roos [8], a full-Newton step infeasible interior-point algorithm for linear programming (LP) was presented, and it was proved that the complexity of the algorithm coincides with the best known iteration bound for infeasible
IPMs. Some extensions, still based on (LP), were carried out by Liu and Sun [2] and by Mansouri and Roos [4]. Recently Peng et al. [6] introduced a new class of primal-dual IPMs based on self-regular proximities. These methods do not use the classical Newton directions. Instead they use a direction that can be characterized as a steepest descent direction (in a scaled space) for a so-called self-regular barrier function. Each such barrier function is determined by a simple univariate self-regular function, called its kernel function. In this paper we introduce a kernel function into the algorithm, and polynomial complexity can be proved in the case p ∈ [0, 1).

We are concerned with the (LP) problem given in the following standard form, together with its associated dual problem:

(P) min c^T x s.t. Ax = b, x ≥ 0,
(D) max b^T y s.t. A^T y + s = c, s ≥ 0,

where c, x, s ∈ R^n, b, y ∈ R^m and A ∈ R^{m×n} has full row rank. As usual for infeasible IPMs, we assume that the initial iterates (x^0, y^0, s^0) are given by

x^0 = s^0 = ζe, y^0 = 0, µ^0 = ζ²,

where e is the all-one vector of length n, µ^0 is the initial value of the barrier parameter, and ζ > 0 is such that

‖x* + s*‖_∞ ≤ ζ

for some optimal solution (x*, y*, s*) of (P) and (D). It is not trivial to find such an initial feasible interior point. One way to overcome this difficulty is to use the homogeneous self-dual embedding model, which introduces artificial variables. The embedding technique was first presented by Ye et al. [9] and is described in detail in Part I of Roos et al. [7].

It is generally agreed that the total number of inner iterations required by an algorithm is an appropriate measure of its efficiency; this number is referred to as the iteration complexity of the algorithm. Using (x^0)^T s^0 = nζ², the total number of iterations in the algorithm of Roos [8] is bounded above by

24n log( max{nζ², ‖r_b^0‖, ‖r_c^0‖} / ε ),   (1.1)
where r_b^0 and r_c^0 are the initial residual vectors:

r_b^0 = b − Ax^0, r_c^0 = c − A^T y^0 − s^0.

Up to a constant factor, the iteration bound (1.1) was first obtained by Mizuno [5], and it is the best known iteration bound for infeasible IPMs.

Now we recall the main ideas underlying the algorithm in Roos [8]. For any ν with 0 < ν ≤ 1 we consider the perturbed problem (P_ν), defined by

(P_ν) min{ (c − νr_c^0)^T x : Ax = b − νr_b^0, x ≥ 0 },

and its dual problem (D_ν), given by

(D_ν) max{ (b − νr_b^0)^T y : A^T y + s = c − νr_c^0, s ≥ 0 }.

Note that if ν = 1 then x = x^0 yields a strictly feasible solution of (P_ν), and (y, s) = (y^0, s^0) a strictly feasible solution of (D_ν). Due to the choice of the initial iterates we may conclude that if ν = 1 then (P_ν) and (D_ν) each have a strictly feasible solution, which means that both perturbed problems then satisfy the well-known interior-point condition (IPC).

Lemma 1.1. ([8, Lemma 1.1]) The perturbed problems (P_ν) and (D_ν) satisfy the IPC for each ν ∈ (0, 1] if and only if the original problems (P) and (D) are feasible.

Assuming that (P) and (D) are feasible, it follows from Lemma 1.1 that the problems (P_ν) and (D_ν) satisfy the IPC for each ν ∈ (0, 1], and hence their central paths exist. This means that the system

b − Ax = νr_b^0, x ≥ 0,
c − A^T y − s = νr_c^0, s ≥ 0,
xs = µe

has a unique solution for every µ > 0, where xs denotes the Hadamard (componentwise) product of the vectors x and s. If ν ∈ (0, 1] and µ = νζ², we denote this unique solution in the sequel as (x(ν), y(ν), s(ν)). As a consequence, x(ν) is the µ-center of (P_ν) and (y(ν), s(ν)) the µ-center of (D_ν). With this notation we have, by taking ν = 1,

(x(1), y(1), s(1)) = (x^0, y^0, s^0) = (ζe, 0, ζe).
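To make the setup concrete, here is a small numerical sketch (the data A, b, c and ζ are made up for illustration and are not from the paper) verifying that the initial point x^0 = ζe, y^0 = 0, s^0 = ζe is strictly feasible for the perturbed pair (P_1), (D_1):

```python
import numpy as np

# Made-up small LP data (m = 1, n = 2); any full-row-rank A works.
A = np.array([[1.0, 1.0]])
b = np.array([2.0])
c = np.array([1.0, 2.0])
m, n = A.shape

zeta = 4.0  # assumed large enough that ||x* + s*||_inf <= zeta
e = np.ones(n)
x0, y0, s0 = zeta * e, np.zeros(m), zeta * e

# Initial residuals r_b^0 = b - A x^0 and r_c^0 = c - A^T y^0 - s^0.
rb0 = b - A @ x0
rc0 = c - A.T @ y0 - s0

def perturbed_residuals(x, y, s, nu):
    """Constraint residuals of (P_nu) and (D_nu)."""
    primal = b - nu * rb0 - A @ x
    dual = c - nu * rc0 - A.T @ y - s
    return primal, dual

# At nu = 1 the initial point satisfies the perturbed constraints exactly.
p_res, d_res = perturbed_residuals(x0, y0, s0, 1.0)
```

For ν < 1 the same point is no longer feasible for (P_ν) and (D_ν); tracing the perturbed problems down to ν → 0 is exactly what the algorithm does.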
We measure the proximity of iterates (x, y, s) to the µ-center of the perturbed problems (P_ν) and (D_ν) by the quantity δ(x, s; µ), defined as follows:

δ(x, s; µ) := δ(v) := (1/2)‖v − v^{−1}‖, where v := √(xs/µ).   (1.2)

Initially we have x = s = ζe and µ = ζ², whence v = e and δ(x, s; µ) = 0. In the sequel we assume that at the start of each iteration δ(x, s; µ) is smaller than or equal to a (small) threshold value τ > 0; this is certainly true at the start of the first iteration.

For the feasibility step, Roos [8] used search directions Δ^f x, Δ^f y and Δ^f s that are (uniquely) defined by the system

A Δ^f x = θν r_b^0,   (1.3)
A^T Δ^f y + Δ^f s = θν r_c^0,   (1.4)
s Δ^f x + x Δ^f s = µe − xs.   (1.5)

Now we describe one main iteration of the algorithm in Roos [8]. Suppose that for some ν ∈ (0, 1] we have x, y and s satisfying the feasibility conditions of (P_ν) and (D_ν), and such that x^T s = nµ and δ(x, s; µ) ≤ τ, where µ = νζ². Each main iteration consists of a so-called feasibility step, a µ-update, and a few centering steps. First, we find new iterates x^f, y^f and s^f that satisfy the feasibility conditions of (P_{ν+}) and (D_{ν+}), where ν^+ = (1 − θ)ν with θ ∈ (0, 1). As we will see, by taking θ small enough this can be realized by one feasibility step, to be described below. So, as a result of the feasibility step, we obtain iterates that are feasible for (P_{ν+}) and (D_{ν+}). Then we apply a limited number of centering steps with respect to the µ^+-centers of (P_{ν+}) and (D_{ν+}), where µ^+ = ν^+ ζ². The centering steps keep the iterates feasible for (P_{ν+}) and (D_{ν+}); their purpose is to get iterates x^+, y^+ and s^+ such that (x^+)^T s^+ = nµ^+ and δ(x^+, s^+; µ^+) ≤ τ. This process is repeated until the duality gap and the norms of the residual vectors are less than some prescribed accuracy parameter ε. A more formal description of the algorithm is given in Figure 1.
Primal-Dual Infeasible IPM

Input:
  accuracy parameter ε > 0;
  barrier update parameter θ, 0 < θ < 1;
  threshold parameter τ > 0.

begin
  x := ζe; y := 0; s := ζe; ν := 1; µ := ζ²;
  while max{ x^T s, ‖b − Ax‖, ‖c − A^T y − s‖ } ≥ ε do
  begin
    feasibility step: (x, y, s) := (x, y, s) + (Δ^f x, Δ^f y, Δ^f s);
    µ-update: µ := (1 − θ)µ;
    centering steps:
      while δ(x, s; µ) > τ do
        (x, y, s) := (x, y, s) + (Δx, Δy, Δs);
      end while
  end
end

Figure 1: Algorithm

It can easily be seen that if (x, y, s) is feasible for the perturbed problems (P_ν) and (D_ν), then after the feasibility step the iterates satisfy the feasibility conditions for (P_{ν+}) and (D_{ν+}), provided that they satisfy the nonnegativity conditions. Assuming that before the step δ(x, s; µ) ≤ τ holds, and taking θ small enough, it can be guaranteed that after the step the iterates

x^f = x + Δ^f x, y^f = y + Δ^f y, s^f = s + Δ^f s

are nonnegative and moreover δ(x^f, s^f; µ^+) ≤ 1/√2, where µ^+ = (1 − θ)µ. So, after the µ-update the iterates are feasible for (P_{ν+}) and (D_{ν+}) and µ is such that δ(x^f, s^f; µ) ≤ 1/√2. In the centering steps, starting at the iterates (x, y, s) = (x^f, y^f, s^f) and targeting the µ-centers, the search directions Δx, Δy, Δs are the usual primal-dual
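The two quantities driving the algorithm, the proximity (1.2) and the feasibility-step system (1.3)-(1.5), can be made concrete in a few lines. The sketch below (made-up data; a practical implementation would exploit sparsity rather than solve one dense system) computes δ(x, s; µ) and solves (1.3)-(1.5) for the feasibility directions:

```python
import numpy as np

def delta(x, s, mu):
    """Proximity (1.2): delta = 0.5 * ||v - v^{-1}||, v = sqrt(x*s/mu)."""
    v = np.sqrt(x * s / mu)
    return 0.5 * np.linalg.norm(v - 1.0 / v)

def feasibility_step(A, x, y, s, mu, theta, nu, rb0, rc0):
    """Solve (1.3)-(1.5) for (Dfx, Dfy, Dfs), stacked as one linear system."""
    m, n = A.shape
    K = np.zeros((2 * n + m, 2 * n + m))   # unknowns ordered [Dfx, Dfy, Dfs]
    K[:m, :n] = A                          # (1.3): A Dfx = theta*nu*rb0
    K[m:m + n, n:n + m] = A.T              # (1.4): A^T Dfy + Dfs = theta*nu*rc0
    K[m:m + n, n + m:] = np.eye(n)
    K[m + n:, :n] = np.diag(s)             # (1.5): s Dfx + x Dfs = mu*e - x*s
    K[m + n:, n + m:] = np.diag(x)
    rhs = np.concatenate([theta * nu * rb0,
                          theta * nu * rc0,
                          mu * np.ones(n) - x * s])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:n + m], sol[n + m:]

# Made-up data; at the starting point x = s = zeta*e, mu = zeta^2, delta = 0.
A = np.array([[1.0, 1.0]])
b = np.array([2.0]); c = np.array([1.0, 2.0])
zeta = 4.0
x = zeta * np.ones(2); y = np.zeros(1); s = zeta * np.ones(2)
rb0 = b - A @ x; rc0 = c - A.T @ y - s
theta = 0.125
dfx, dfy, dfs = feasibility_step(A, x, y, s, zeta**2, theta, 1.0, rb0, rc0)
```

The system is nonsingular whenever A has full row rank and x, s > 0, which is exactly the situation maintained by the algorithm.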
Newton directions, (uniquely) defined by

A Δx = 0,
A^T Δy + Δs = 0,
s Δx + x Δs = µe − xs.

Denoting the iterates after a centering step as x^+, y^+ and s^+, we recall the following result from Roos et al. [7].

Lemma 1.2. If δ := δ(x, s; µ) ≤ 1, then the primal-dual Newton step is feasible, i.e., x^+ and s^+ are nonnegative, and (x^+)^T s^+ = nµ. Moreover, if δ ≤ 1/√2, then δ(x^+, s^+; µ) ≤ δ².

The centering steps serve to get iterates that satisfy x^T s = nµ^+ and δ(x, s; µ^+) ≤ τ, where τ is (much) smaller than 1/√2. The required number of centering steps is easily obtained from Lemma 1.2: after the µ-update we have δ = δ(x^f, s^f; µ^+) ≤ 1/√2, hence after k centering steps the iterates (x, y, s) satisfy

δ(x, s; µ^+) ≤ (1/√2)^{2^k}.

From this one easily deduces that no more than

⌈ log₂( log₂ (1/τ²) ) ⌉   (1.6)

centering steps are needed.

Before describing the idea of the algorithm further, we give the definition of a kernel function.

Definition 1.3. A twice differentiable function ψ(t) : (0, ∞) → [0, ∞) is called a kernel function if

ψ(1) = 0, ψ′(1) = 0, ψ″(t) > 0 for all t > 0.

Define

d_x^f := v Δ^f x / x, d_s^f := v Δ^f s / s,   (1.7)

with v as defined in (1.2). The system which defines the search directions Δ^f x, Δ^f y and Δ^f s can then be expressed in terms of the scaled search directions
d_x^f and d_s^f as follows:

Ā d_x^f = θν r_b^0,
Ā^T Δ^f y / µ + d_s^f = θν v s^{−1} r_c^0,
d_x^f + d_s^f = v^{−1} − v,

where Ā = A V^{−1} X, with V = diag(v) and X = diag(x). Note that the right-hand side of the third equation is the negative gradient direction of the logarithmic barrier function

Ψ(v) := Σ_{i=1}^n ψ(v_i), v_i = √(x_i s_i / µ),

whose kernel function is

ψ(t) = (1/2)(t² − 1) − log t.

In this paper the feasibility step is a slight modification of the standard Newton direction, defined by the new system

Ā d_x^f = θν r_b^0,
Ā^T Δ^f y / µ + d_s^f = θν v s^{−1} r_c^0,
d_x^f + d_s^f = −∇Φ(v),

where the kernel function of Φ(v) is

φ(t) := (1/2)(t² − 1) + (t^{1−p} − 1)/(p − 1).

By Definition 1.3, φ(t) is indeed a kernel function for p ≥ 0. Here we limit p to [0, 1). Since φ′(t) = t − t^{−p}, the third equation of the system can be written as

d_x^f + d_s^f = v^{−p} − v.   (1.8)

We define

σ(x, s; µ) := σ(v) := (1/2)‖∇Φ(v)‖ = (1/2)‖v^{−p} − v‖.   (1.9)
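The kernel-function properties of φ from Definition 1.3 are easy to spot-check numerically. The sketch below verifies φ(1) = 0, φ′(1) = 0 and φ″(t) = 1 + p t^{−p−1} > 0 on a grid, for several p ∈ [0, 1), and confirms by finite differences that the stated derivative formulas are consistent with φ (an informal check, not a proof):

```python
import numpy as np

def phi(t, p):
    """phi(t) = (t^2 - 1)/2 + (t^{1-p} - 1)/(p - 1), for p in [0, 1)."""
    return 0.5 * (t**2 - 1.0) + (t**(1.0 - p) - 1.0) / (p - 1.0)

def phi1(t, p):
    """First derivative: phi'(t) = t - t^{-p}."""
    return t - t**(-p)

def phi2(t, p):
    """Second derivative: phi''(t) = 1 + p * t^{-p-1} > 0 for t > 0."""
    return 1.0 + p * t**(-p - 1.0)

ts = np.linspace(0.05, 5.0, 200)
ps = [0.0, 0.3, 0.7, 0.99]

# Central finite differences confirm phi1 really is the derivative of phi.
h = 1e-6
fd_ok = all(
    abs((phi(t + h, p) - phi(t - h, p)) / (2 * h) - phi1(t, p)) < 1e-5
    for p in ps for t in [0.5, 1.0, 2.0]
)
```

For p = 0 the kernel reduces to φ(t) = (t² − 1)/2 + 1 − t, whose minimizer is again t = 1, so −∇Φ(v) = e − v in that case.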
Note that σ(v) = 0 if and only if v = e; thus σ(v) is also a suitable proximity measure. The following result shows that the proximity σ(v) is never larger than the proximity δ(v) for p ∈ [0, 1). Note that we do not abandon the logarithmic barrier function in this paper, although we use a new kernel function to define the feasibility step; thus Corollary A.10 in Roos [8] still holds true.

Lemma 1.4. For any t > 0 and p ∈ [0, 1), one has

| t^{−p} − t | ≤ | t^{−1} − t |.

Proof. Since

(t − t^{−1})² − (t − t^{−p})² = (1 − t^{1−p})(1 + t^{1−p} − 2t²) / t²,

and for 0 < t ≤ 1 both factors of the numerator are nonnegative while for t > 1 both are nonpositive, the expression is always nonnegative. Hence the lemma follows.

The following result will be used in Lemma 1.7.

Lemma 1.5. For any t > 0 and p ∈ [0, 1), one has

| t^{(1−p)/2} − t^{−(1−p)/2} | ≤ | t − t^{−1} |.

Proof. Since

(t² + t^{−2}) − (t^{1−p} + t^{−(1−p)}) = t^{1−p}(t^{3−p} − 1)(t^{1+p} − 1) / t^{3−p},

and for 0 < t ≤ 1 both factors (t^{3−p} − 1) and (t^{1+p} − 1) are nonpositive while for t > 1 both are nonnegative, the expression is always nonnegative. Hence the lemma follows.

Lemma 1.6. If ‖v‖² = n, then ‖v^{(1−p)/2}‖² ≤ n for p ∈ [0, 1).

Proof. Note that Σ_{i=1}^n e_i^γ = n for any γ ∈ R; the claim then follows easily from the Hölder inequality.

The following lemma describes the effect of the µ-update on the proximity.

Lemma 1.7. Let (x, s) be a positive primal-dual pair and µ > 0 such that x^T s = nµ. Moreover, let δ(v) = δ(x, s; µ) and ṽ := v^{(1−p)/2}/√(1 − θ). Then

δ(ṽ)² ≤ (1 − θ)δ(v)² + θ²n / (4(1 − θ)).
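Lemmas 1.4-1.6 are elementary scalar inequalities, and it is a useful sanity check to verify them numerically over a grid of t and p (random made-up data; an informal check, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
ts = np.concatenate([np.linspace(0.01, 1.0, 100), np.linspace(1.0, 10.0, 100)])
ps = np.linspace(0.0, 0.99, 25)

# Lemma 1.4: |t^{-p} - t| <= |t^{-1} - t|
lem14 = all(
    abs(t**(-p) - t) <= abs(1.0 / t - t) + 1e-12
    for t in ts for p in ps
)

# Lemma 1.5: |t^{(1-p)/2} - t^{-(1-p)/2}| <= |t - t^{-1}|
lem15 = all(
    abs(t**((1 - p) / 2) - t**(-(1 - p) / 2)) <= abs(t - 1.0 / t) + 1e-12
    for t in ts for p in ps
)

# Lemma 1.6: ||v||^2 = n implies ||v^{(1-p)/2}||^2 = sum_i v_i^{1-p} <= n
n = 50
v = rng.uniform(0.1, 2.0, n)
v *= np.sqrt(n) / np.linalg.norm(v)   # rescale so that ||v||^2 = n
lem16 = all(np.sum(v**(1 - p)) <= n + 1e-9 for p in ps)
```

Componentwise, Lemma 1.4 is exactly the statement σ(v) ≤ δ(v), which is what allows the analysis below to reuse bounds from Roos [8] that were stated in terms of δ.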
Proof. Write u := v^{(1−p)/2}, so that ṽ = u/√(1 − θ) and ṽ^{−1} = √(1 − θ) u^{−1}, and hence

ṽ − ṽ^{−1} = √(1 − θ)(u − u^{−1}) + (θ/√(1 − θ)) u.

Therefore

δ(ṽ)² = (1/4)‖ṽ − ṽ^{−1}‖²
= ((1 − θ)/4)‖u − u^{−1}‖² + (θ²/(4(1 − θ)))‖u‖² + (θ/2)(u − u^{−1})^T u
≤ (1 − θ)δ(v)² + θ²n/(4(1 − θ)),

where we used Lemma 1.5 to obtain ‖u − u^{−1}‖ ≤ ‖v − v^{−1}‖ = 2δ(v), and, since ‖v‖² = n, we used Lemma 1.6 twice: once to bound ‖u‖² ≤ n in the second term, and once to conclude that the cross term (u − u^{−1})^T u = ‖u‖² − n is nonpositive.

Lemma 1.8. ([4, Lemma A.1]) For i = 1, ..., n, let f_i : R_+ → R denote a convex function. Then, for any nonzero vector z ∈ R^n_+, the following inequality holds:

Σ_{i=1}^n f_i(z_i) ≤ (1/(e^T z)) Σ_{j=1}^n z_j [ f_j(e^T z) + Σ_{i≠j} f_i(0) ].

2 Analysis of the feasibility step

Let x, y and s denote the iterates at the start of an iteration, and assume that δ(v) ≤ τ. Recall that in the first iteration we have δ(v) = 0.

2.1 Effect of the feasibility step

As established in Section 1, the feasibility step generates new iterates x^f, y^f and s^f that satisfy the feasibility conditions for (P_{ν+}) and (D_{ν+}), except possibly the nonnegativity conditions. A crucial element in the analysis is to show that after the feasibility step δ(x^f, s^f; µ^+) ≤ 1/√2, i.e., that the new iterates are positive and lie within the region where the Newton process targeting the µ^+-centers of (P_{ν+}) and (D_{ν+}) is quadratically convergent. Note that (1.8) can be rewritten as

s Δ^f x + x Δ^f s = µ^{(1+p)/2}(xs)^{(1−p)/2} − xs.
Then, using xs = µv² and Δ^f x Δ^f s = µ d_x^f d_s^f, we may write

x^f s^f = xs + (s Δ^f x + x Δ^f s) + Δ^f x Δ^f s = µ^{(1+p)/2}(xs)^{(1−p)/2} + Δ^f x Δ^f s = µ(v^{1−p} + d_x^f d_s^f).   (2.1)

Lemma 2.1. The new iterates are strictly feasible if and only if v^{1−p} + d_x^f d_s^f > 0.

Proof. If x^f and s^f are positive, then (2.1) makes clear that v^{1−p} + d_x^f d_s^f > 0. The converse can be proved following Lemma 4.1 in Mansouri and Roos [4]. Thus the lemma follows.

Using (1.7) we may also write

x^f = x + Δ^f x = x + x d_x^f / v = (x/v)(v + d_x^f),   (2.2)
s^f = s + Δ^f s = s + s d_s^f / v = (s/v)(v + d_s^f).   (2.3)

Lemma 2.2. ([2, Lemma 6]) The new iterates are certainly strictly feasible if

‖d_x^f‖_∞ < 1/ρ(δ) and ‖d_s^f‖_∞ < 1/ρ(δ),   (2.4)

where

ρ(δ) := δ + √(δ² + 1).

The proof of Lemma 2.2 makes clear that the elements of the vector v satisfy

1/ρ(δ) ≤ v_i^{1−p} ≤ ρ(δ), i = 1, ..., n.   (2.5)

In the sequel we denote

ω_i := ω_i(v) := (1/2)√( |d_xi^f|² + |d_si^f|² ), ω := ω(v) := (ω_1, ..., ω_n).

This implies

(d_x^f)^T d_s^f ≤ ‖d_x^f‖ ‖d_s^f‖ ≤ (1/2)( ‖d_x^f‖² + ‖d_s^f‖² ) = 2‖ω‖²,

|d_xi^f d_si^f| = |d_xi^f| |d_si^f| ≤ (1/2)( |d_xi^f|² + |d_si^f|² ) = 2ω_i² ≤ 2‖ω‖², 1 ≤ i ≤ n.
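The identity (2.1), together with (2.2)-(2.3), is a purely algebraic consequence of the scaling (1.7) and the third equation (1.8), and can be checked numerically for any scaled directions satisfying (1.8) (random made-up data below):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, mu = 6, 0.4, 0.7
x = rng.uniform(0.5, 2.0, n)
s = rng.uniform(0.5, 2.0, n)
v = np.sqrt(x * s / mu)

# Pick d_x freely; choose d_s so that (1.8) holds: d_x + d_s = v^{-p} - v.
dx = rng.uniform(-0.1, 0.1, n)
ds = v**(-p) - v - dx

# Unscaled steps via (1.7) and the new iterates via (2.2)-(2.3).
Dfx = x * dx / v
Dfs = s * ds / v
xf = (x / v) * (v + dx)
sf = (s / v) * (v + ds)

lhs = xf * sf                      # x^f s^f
rhs = mu * (v**(1 - p) + dx * ds)  # right-hand side of (2.1)
```

Indeed, x^f s^f = (xs/v²)(v + d_x^f)(v + d_s^f) = µ(v² + v(v^{−p} − v) + d_x^f d_s^f) = µ(v^{1−p} + d_x^f d_s^f), which is what the code confirms.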
Lemma 2.3. Assuming v^{1−p} + d_x^f d_s^f > 0, one has

4δ(v^f)² ≤ 4(1 − θ)δ(v)² + (θ²n + 2‖ω‖²)/(1 − θ) + 2(1 − θ)ρ(δ)²‖ω‖² / (1 − 2ρ(δ)‖ω‖²).

Proof. Since

(v^f)² = x^f s^f / µ^+ = µ(v^{1−p} + d_x^f d_s^f)/µ^+ = (v^{1−p} + d_x^f d_s^f)/(1 − θ),

according to (1.2) we have

4δ(v^f)² = Σ_{i=1}^n [ (v_i^f)² + (v_i^f)^{−2} − 2 ]
= Σ_{i=1}^n [ (v_i^{1−p} + d_xi^f d_si^f)/(1 − θ) + (1 − θ)/(v_i^{1−p} + d_xi^f d_si^f) − 2 ]
≤ Σ_{i=1}^n [ (v_i^{1−p} + 2ω_i²)/(1 − θ) + (1 − θ)/(v_i^{1−p} − 2ω_i²) − 2 ].

For each i we define the function

f_i(z_i) := (v_i^{1−p} + z_i)/(1 − θ) + (1 − θ)/(v_i^{1−p} − z_i) − 2.

One can easily verify that f_i(z_i) is convex in z_i provided that v_i^{1−p} − z_i > 0. Taking z_i = 2ω_i², we therefore require

v_i^{1−p} − 2‖ω‖² > 0, i = 1, ..., n.

By (2.5), this certainly holds if

2‖ω‖² < 1/ρ(δ).   (2.6)

We may then apply Lemma 1.8 with z_i = 2ω_i², so that e^T z = 2‖ω‖², and obtain

4δ(v^f)² ≤ Σ_{j=1}^n f_j(2ω_j²) ≤ (1/(2‖ω‖²)) Σ_{j=1}^n 2ω_j² [ f_j(2‖ω‖²) + Σ_{i≠j} f_i(0) ].
Using Lemma 1.7, we obtain

Σ_{i=1}^n f_i(0) = Σ_{i=1}^n [ v_i^{1−p}/(1 − θ) + (1 − θ)/v_i^{1−p} − 2 ] = 4δ(ṽ)² ≤ 4(1 − θ)δ(v)² + θ²n/(1 − θ).

Moreover,

f_j(2‖ω‖²) − f_j(0) = 2‖ω‖²/(1 − θ) + 2(1 − θ)‖ω‖² / ( v_j^{1−p}(v_j^{1−p} − 2‖ω‖²) ).

Then

4δ(v^f)² ≤ Σ_{i=1}^n f_i(0) + (1/(2‖ω‖²)) Σ_{j=1}^n 2ω_j² [ f_j(2‖ω‖²) − f_j(0) ]
≤ 4(1 − θ)δ(v)² + (θ²n + 2‖ω‖²)/(1 − θ) + 2(1 − θ)‖ω‖² max_j 1/( v_j^{1−p}(v_j^{1−p} − 2‖ω‖²) )
≤ 4(1 − θ)δ(v)² + (θ²n + 2‖ω‖²)/(1 − θ) + 2(1 − θ)ρ(δ)²‖ω‖² / (1 − 2ρ(δ)‖ω‖²),

where the last inequality follows from (2.5): since v_j^{1−p} ≥ 1/ρ(δ),

1/( v_j^{1−p}(v_j^{1−p} − 2‖ω‖²) ) ≤ ρ(δ) / (1/ρ(δ) − 2‖ω‖²) = ρ(δ)² / (1 − 2ρ(δ)‖ω‖²).

This completes the proof.

We conclude this section by presenting a value that ω is not allowed to exceed. Because we need δ(v^f) ≤ 1/√2, it follows from Lemma 2.3 that it suffices if

4(1 − θ)δ(v)² + (θ²n + 2‖ω‖²)/(1 − θ) + 2(1 − θ)ρ(δ)²‖ω‖² / (1 − 2ρ(δ)‖ω‖²) ≤ 2.   (2.7)

At this stage we decide to choose

τ = 1/8, θ = α/(4√n), 0 < α ≤ 1.   (2.8)

Since the left-hand side of (2.7) is monotonically increasing with respect to ‖ω‖², one can verify that, for n ≥ 1 and δ(v) ≤ τ, together with (2.6),

‖ω‖ ≤ 1/4 implies δ(v^f) ≤ 1/√2.   (2.9)
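A quick numerical sanity check of (2.9): with τ = 1/8 and θ = α/(4√n), α = 1, the left-hand side of (2.7) evaluated at the extreme admissible values δ(v) = 1/8 and ‖ω‖ = 1/4 stays well below 2 for a range of n (the specific check values are made up for illustration and are not part of the paper's proof):

```python
import numpy as np

def lhs_27(delta, omega_norm, theta, n):
    """Left-hand side of (2.7): the upper bound on 4*delta(v^f)^2 in Lemma 2.3."""
    rho = delta + np.sqrt(delta**2 + 1.0)   # rho(delta)
    w2 = omega_norm**2
    assert 2 * w2 < 1.0 / rho               # condition (2.6) must hold
    return (4 * (1 - theta) * delta**2
            + (theta**2 * n + 2 * w2) / (1 - theta)
            + 2 * (1 - theta) * rho**2 * w2 / (1 - 2 * rho * w2))

tau, alpha = 1.0 / 8.0, 1.0
vals = [lhs_27(tau, 0.25, alpha / (4 * np.sqrt(n)), n)
        for n in [1, 10, 100, 10000]]
```

The slack between the computed values and 2 reflects that (2.9) is far from tight; the binding constraint in the analysis is (2.10) below, not (2.7).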
2.2 Upper bound for ‖ω(v)‖

Let us denote the null space of the matrix Ā by L:

L := { ξ ∈ R^n : Āξ = 0 }.

Obviously, the affine space { ξ ∈ R^n : Āξ = θν r_b^0 } equals d_x^f + L. The row space of Ā equals the orthogonal complement L^⊥ of L, and

d_s^f ∈ θν v s^{−1} r_c^0 + L^⊥.

We recall a lemma from Roos [8].

Lemma 2.4. Let q be the (unique) point in the intersection of the affine spaces d_x^f + L and d_s^f + L^⊥. Then

2‖ω(v)‖ ≤ √( ‖q‖² + (‖q‖ + 2σ(v))² ).

Proof. The proof is obtained by using v^{−p} − v instead of v^{−1} − v in the argument of Roos [8]; the result then follows with δ(v) replaced by σ(v).

Recall from (2.9) that in order to guarantee δ(v^f) ≤ 1/√2 we want ‖ω‖ ≤ 1/4. By Lemma 2.4 this will certainly hold if q satisfies

‖q‖² + (‖q‖ + 2σ(v))² ≤ 1/4.   (2.10)

Still following Roos [8], we can obtain

√µ ‖q‖ ≤ θνζ √( e^T (x/s + s/x) ).   (2.11)

To proceed further we need upper bounds for the elements of the vectors x/s and s/x.

2.3 Bounds for x/s and s/x

Recall that x is feasible for (P_ν) and (y, s) for (D_ν) and, moreover, σ(x, s; µ) ≤ τ, i.e., these iterates are close to the µ-centers of (P_ν) and (D_ν). Based on this information we need to estimate the sizes of the entries of the vectors x/s and s/x. Since τ = 1/8, we can again use a result from Roos [8], namely Corollary A.10, which gives

x/s ≤ 2 x(µ, ν)²/µ, s/x ≤ 2 s(µ, ν)²/µ.
Substitution into (2.11) yields

√µ ‖q‖ ≤ θνζ √( 2 e^T ( x(µ, ν)²/µ + s(µ, ν)²/µ ) ).

This implies

‖q‖ ≤ (√2 θνζ/µ) √( ‖x(µ, ν)‖² + ‖s(µ, ν)‖² ).

By using µ = µ^0 ν = ζ²ν and θ = α/(4√n), we obtain the following upper bound for the norm of q:

‖q‖ ≤ (α / (2√2 √n ζ)) √( ‖x(µ, ν)‖² + ‖s(µ, ν)‖² ).

Defining

κ(ζ, ν) := √( ‖x(µ, ν)‖² + ‖s(µ, ν)‖² ) / (√(2n) ζ), 0 < ν ≤ 1, µ = µ^0 ν,

and

κ(ζ) := max_{0 < ν ≤ 1} κ(ζ, ν),

we have

‖q‖ ≤ (1/2) α κ(ζ).

We found in (2.10) that in order to have δ(v^f) ≤ 1/√2 we should have ‖q‖² + (‖q‖ + 2σ(v))² ≤ 1/4. Since σ(v) ≤ δ(v) ≤ τ = 1/8, it suffices if q satisfies

‖q‖² + (‖q‖ + 1/4)² ≤ 1/4.

This certainly holds if ‖q‖ ≤ (√3 − 1)/4. We conclude that if we take

α = 1/(2κ(ζ)),   (2.12)

we will certainly have δ(v^f) ≤ 1/√2. Following Roos [8], one can prove that κ(ζ) ≤ √(2n).

3 Iteration bound

In the previous sections we have found that if at the start of an iteration the iterates satisfy δ(x, s; µ) ≤ τ, with τ as defined in (2.8), then after the feasibility
step, with θ as in (2.8) and α as in (2.12), the iterates satisfy δ(x, s; µ^+) ≤ 1/√2. According to (1.6), at most

⌈ log₂( log₂ (1/τ²) ) ⌉ = ⌈ log₂( log₂ 64 ) ⌉ = 3

centering steps suffice to get iterates that satisfy δ(x, s; µ^+) ≤ τ. So each main iteration consists of at most 4 so-called inner iterations, in each of which we need to compute a new search direction. In each main iteration both the duality gap and the norms of the residual vectors are reduced by the factor 1 − θ. Hence, using (x^0)^T s^0 = nζ², the total number of main iterations is bounded above by

(1/θ) log( max{ nζ², ‖r_b^0‖, ‖r_c^0‖ } / ε ).

Since

θ = α/(4√n) = 1/(8√n κ(ζ)),

with κ(ζ) ≤ √(2n) the total number of inner iterations is therefore bounded above by

32√2 n log( max{ nζ², ‖r_b^0‖, ‖r_c^0‖ } / ε ).

4 Concluding remarks

In this paper we introduce a kernel function into the infeasible interior-point algorithm with full-Newton steps for linear programming. For p ∈ [0, 1), polynomial complexity can be obtained. For p = 1, the kernel function loses some barrier properties; we shall investigate that specific case separately. More importantly, we expect further research on a larger range of p.

References

[1] Y. Bai, M. El Ghami, and C. Roos. A new efficient large-update primal-dual interior-point method based on a finite barrier. SIAM Journal on Optimization, 13(3):766-782, 2002.

[2] Z. Liu and W. Sun. An infeasible interior-point algorithm with full-Newton step for linear optimization. Numerical Algorithms, 46(2):173-188, 2007.
[3] Z. Liu and W. Sun. A full-Newton step infeasible interior-point algorithm for linear programming based on a special kernel function. Working paper.

[4] H. Mansouri and C. Roos. Simplified O(nL) infeasible interior-point algorithm for linear optimization using full-Newton steps. Optimization Methods and Software, 22(3):519-530, 2007.

[5] S. Mizuno. Polynomiality of infeasible-interior-point algorithms for linear programming. Mathematical Programming, 67(1):109-119, 1994.

[6] J. Peng, C. Roos, and T. Terlaky. Self-regular functions and new search directions for linear and semidefinite optimization. Mathematical Programming, 93(1):129-171, 2002.

[7] C. Roos, T. Terlaky, and J.-Ph. Vial. Theory and Algorithms for Linear Optimization: An Interior-Point Approach. John Wiley and Sons, Chichester, UK, 1997.

[8] C. Roos. A full-Newton step O(n) infeasible interior-point algorithm for linear optimization. SIAM Journal on Optimization, 16(4):1110-1136, 2006.

[9] Y. Ye, M.J. Todd, and S. Mizuno. An O(√n L)-iteration homogeneous and self-dual linear programming algorithm. Mathematics of Operations Research, 19(1):53-67, 1994.