Barrier Method
Javier Peña
Convex Optimization 10-725/36-725
Last time: Newton's method

For root-finding $F(x) = 0$:
$$x^+ = x - F'(x)^{-1} F(x)$$
For optimization $\min_x f(x)$:
$$x^+ = x - \nabla^2 f(x)^{-1} \nabla f(x)$$
Assume $f$ is strongly convex, and both $\nabla f$ and $\nabla^2 f$ are Lipschitz. If $x^{(0)}$ is near $x^\star$, then $x^{(k)} \to x^\star$ and $f(x^{(k)}) \to f^\star$ quadratically.

For global convergence, use damped Newton's method:
$$x^+ = x - t \, \nabla^2 f(x)^{-1} \nabla f(x)$$
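As a quick illustration, the damped Newton update above can be sketched in Python with numpy (a minimal sketch: the function name, fixed step size, and test objective are illustrative choices, not from the slides; in practice the step size would come from a line search):

```python
import numpy as np

def damped_newton(grad, hess, x0, t=1.0, tol=1e-10, max_iter=100):
    """Damped Newton's method: x+ = x - t * hess(x)^{-1} grad(x).
    A fixed step size t is used here for simplicity; t = 1 recovers
    the pure Newton iteration."""
    x = x0.astype(float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - t * np.linalg.solve(hess(x), g)
    return x

# Example: minimize f(x) = x1^4 + x2^2 (minimum at the origin)
grad = lambda x: np.array([4 * x[0]**3, 2 * x[1]])
hess = lambda x: np.array([[12 * x[0]**2, 0.0], [0.0, 2.0]])
x_star = damped_newton(grad, hess, np.array([1.0, 1.0]))
```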
Newton's method for linearly-constrained optimization

For the problem
$$\min_x f(x) \quad \text{subject to } Ax = b$$
Newton's method takes steps $x^+ = x + tv$ where
$$\begin{bmatrix} \nabla^2 f(x) & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} v \\ w \end{bmatrix} = -\begin{bmatrix} \nabla f(x) \\ Ax - b \end{bmatrix}$$
The latter is precisely the root-finding Newton step for the KKT conditions of the above equality-constrained problem, namely
$$\begin{bmatrix} \nabla f(x) + A^T y \\ Ax - b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
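The block KKT system above can be assembled and solved directly with numpy (a sketch with an illustrative helper name; for a quadratic objective, one full step $x + v$ lands exactly on the solution):

```python
import numpy as np

def newton_step_eq(grad_f, hess_f, A, b, x):
    """One Newton direction v (and multiplier w) for
    min f(x) s.t. Ax = b, from the block KKT system
    [H  A^T] [v]   [ -grad f(x) ]
    [A   0 ] [w] = [  b - Ax    ]."""
    n, p = x.size, A.shape[0]
    K = np.block([[hess_f(x), A.T], [A, np.zeros((p, p))]])
    rhs = np.concatenate([-grad_f(x), b - A @ x])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]

# Example: minimize (1/2)||x||^2 subject to x1 + x2 = 1
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
x = np.zeros(2)
v, w = newton_step_eq(lambda x: x, lambda x: np.eye(2), A, b, x)
# The quadratic objective makes x + v the exact solution (0.5, 0.5)
```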
Barrier method

Consider the convex optimization problem
$$\min_x f(x) \quad \text{subject to } Ax = b, \;\; h_i(x) \le 0, \; i = 1, \ldots, m$$
Major challenge: figure out the binding/non-binding constraints. This complication occurs at the boundary of the feasible region. Letting $C := \{x : h_i(x) \le 0, \; i = 1, \ldots, m\}$, we can rewrite the above problem as
$$\min_x f(x) + I_C(x) \quad \text{subject to } Ax = b$$
Main idea of interior-point methods: approximate $I_C$ with a barrier function for $C$, both to avoid the boundary of $C$ and to make the problem amenable to Newton's method.
Log barrier function

Assume $h_1, \ldots, h_m : \mathbb{R}^n \to \mathbb{R}$ are convex and twice differentiable. The function
$$\phi(x) = -\sum_{i=1}^m \log(-h_i(x))$$
is called the logarithmic barrier function for the set $\{x : h_i(x) < 0, \; i = 1, \ldots, m\}$, which we assume is nonempty. Approximate the original problem with
$$\min_x f(x) + \frac{1}{t}\phi(x) \;\; \text{subject to } Ax = b
\qquad \Longleftrightarrow \qquad
\min_x t f(x) + \phi(x) \;\; \text{subject to } Ax = b$$
where $t > 0$.
Outline

Today:
- Central path
- Properties and interpretations
- Barrier method
- Convergence analysis
- Feasibility methods
Log barrier calculus

For the log barrier function
$$\phi(x) = -\sum_{i=1}^m \log(-h_i(x))$$
we have
$$\nabla \phi(x) = -\sum_{i=1}^m \frac{1}{h_i(x)} \nabla h_i(x)$$
and
$$\nabla^2 \phi(x) = \sum_{i=1}^m \frac{1}{h_i(x)^2} \nabla h_i(x) \nabla h_i(x)^T - \sum_{i=1}^m \frac{1}{h_i(x)} \nabla^2 h_i(x)$$
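For affine inequalities $h_i(x) = a_i^T x - b_i$, the formulas above simplify, since $\nabla h_i = a_i$ and $\nabla^2 h_i = 0$. A minimal numpy sketch (the helper name and example interval are illustrative assumptions):

```python
import numpy as np

def log_barrier(A_in, b_in, x):
    """Value, gradient, and Hessian of the log barrier for the affine
    inequalities h_i(x) = a_i^T x - b_i <= 0. Since grad h_i = a_i and
    hess h_i = 0, only the outer-product Hessian term survives."""
    hx = A_in @ x - b_in                  # h_i(x), must be strictly negative
    assert np.all(hx < 0), "x must be strictly feasible"
    phi = -np.sum(np.log(-hx))
    grad = -A_in.T @ (1.0 / hx)                        # -sum_i (1/h_i) a_i
    hess = A_in.T @ ((1.0 / hx**2)[:, None] * A_in)    # sum_i (1/h_i^2) a_i a_i^T
    return phi, grad, hess

# Barrier for the interval 0 < x < 1, evaluated at its center x = 0.5:
A_in = np.array([[1.0], [-1.0]])          # h_1 = x - 1, h_2 = -x
b_in = np.array([1.0, 0.0])
phi, grad, hess = log_barrier(A_in, b_in, np.array([0.5]))
# grad = 0 at the center; hess = 1/0.5^2 + 1/0.5^2 = 8
```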
Central path

Given $t > 0$, consider the barrier problem defined above:
$$\min_x t f(x) + \phi(x) \quad \text{subject to } Ax = b$$
Let $x^\star(t)$ denote the solution of this barrier problem. The central path is the set $\{x^\star(t) : t > 0\}$. Under suitable conditions, this set is a smooth path in $\mathbb{R}^n$ and, as $t \to \infty$, we have $x^\star(t) \to x^\star$, where $x^\star$ is a solution to our original problem.
Perturbed KKT conditions

KKT conditions for the barrier problem:
$$t \nabla f(x^\star(t)) - \sum_{i=1}^m \frac{1}{h_i(x^\star(t))} \nabla h_i(x^\star(t)) + A^T w = 0$$
$$Ax^\star(t) = b, \quad h_i(x^\star(t)) < 0, \; i = 1, \ldots, m$$
KKT conditions of the original problem:
$$\nabla f(x^\star) + \sum_{i=1}^m u_i \nabla h_i(x^\star) + A^T v = 0$$
$$Ax^\star = b, \quad h_i(x^\star) \le 0, \quad u_i \ge 0, \; i = 1, \ldots, m$$
$$h_i(x^\star) \, u_i = 0, \; i = 1, \ldots, m$$
Duality gap

By convexity we have
$$f(x^\star(t)) - f(x^\star) \le \nabla f(x^\star(t))^T (x^\star(t) - x^\star)$$
and
$$h_i(x^\star(t)) - h_i(x^\star) \le \nabla h_i(x^\star(t))^T (x^\star(t) - x^\star), \; i = 1, \ldots, m.$$
Thus the previous two sets of KKT conditions yield
$$f(x^\star(t)) - f^\star \le \frac{m}{t}$$
This is a useful stopping criterion.
Barrier method v.0

For $\epsilon > 0$, pick $t = m/\epsilon$ and solve
$$\min_x t f(x) + \phi(x) \quad \text{subject to } Ax = b$$
to get $f(x^\star(t)) - f^\star \le \epsilon$. This is not a good idea, because the barrier problem is too difficult to solve: the above approach aims to find a point near the end of the central path. A better approach is to generate points along the central path.
Barrier method v.1

Solve a sequence of barrier problems
$$\min_x t f(x) + \phi(x) \quad \text{subject to } Ax = b$$
for increasing values of $t$:

- Pick $t^{(0)} > 0$ and let $k := 0$
- Solve the barrier problem for $t = t^{(0)}$ to produce $x^{(0)} = x^\star(t)$
- While $m/t > \epsilon$:
  - Pick $t^{(k+1)} > t^{(k)}$
  - Solve the barrier problem at $t = t^{(k+1)}$, using Newton's method initialized at $x^{(k)}$, to produce $x^{(k+1)} = x^\star(t)$

A common update is $t^{(k+1)} = \mu t^{(k)}$ for $\mu > 1$. Centering step: the step that solves the barrier problem.
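The loop above can be sketched in Python for the special case of an LP $\min c^T x$ subject to $Gx \le h$ with no equality constraints and a strictly feasible starting point (a minimal sketch: the function names, default parameters, and line-search constants are illustrative choices, not from the slides):

```python
import numpy as np

def centering(c, G, h, x, t, tol=1e-9, max_iter=50):
    """Newton's method with backtracking on f_t(x) = t*c^T x + phi(x),
    where phi is the log barrier for Gx <= h."""
    def f_t(z):
        r = G @ z - h
        return np.inf if np.any(r >= 0) else t * (c @ z) - np.sum(np.log(-r))
    for _ in range(max_iter):
        r = G @ x - h                          # r_i = h_i(x) < 0
        grad = t * c - G.T @ (1.0 / r)
        hess = G.T @ ((1.0 / r**2)[:, None] * G)
        v = -np.linalg.solve(hess, grad)
        if -grad @ v / 2 < tol:                # Newton decrement test
            break
        s = 1.0
        while f_t(x + s * v) > f_t(x) + 0.25 * s * (grad @ v):
            s *= 0.5                           # backtracking line search
        x = x + s * v
    return x

def barrier_method(c, G, h, x0, t0=1.0, mu=10.0, eps=1e-6):
    """Barrier method v.1: solve min c^T x s.t. Gx <= h from a strictly
    feasible x0, with the stopping criterion m/t <= eps."""
    m, x, t = G.shape[0], x0, t0
    x = centering(c, G, h, x, t)               # initial centering step
    while m / t > eps:
        t *= mu
        x = centering(c, G, h, x, t)           # warm-started centering
    return x

# Example LP: min x1 + x2 subject to 0 <= x <= 1; optimum at (0, 0)
G = np.vstack([np.eye(2), -np.eye(2)])
hvec = np.array([1.0, 1.0, 0.0, 0.0])
xs = barrier_method(np.ones(2), G, hvec, np.array([0.5, 0.5]))
```

Each outer iteration multiplies $t$ by $\mu$ and re-centers starting from the previous iterate, so the duality gap bound $m/t$ shrinks geometrically.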
Considerations:
- Choice of $\mu$: if $\mu$ is too small, then many outer iterations might be needed; if $\mu$ is too big, then Newton's method (each centering step) might take many iterations to converge.
- Choice of $t^{(0)}$: if $t^{(0)}$ is too small, then many outer iterations might be needed; if $t^{(0)}$ is too big, then the first Newton solve (the first centering step) might require many iterations to compute $x^{(0)}$.

Fortunately, the performance of the barrier method is often quite robust to the choice of $\mu$ and $t^{(0)}$ in practice. The appropriate range for these parameters is scale dependent.
Example of a small LP in $n = 50$ dimensions, with $m = 100$ inequality constraints (from B & V page 571): [figure: duality gap versus Newton iterations, for $\mu = 2, 50, 150$]
Convergence analysis

Assume that we solve the centering steps exactly. The following result is immediate.

Theorem: The barrier method after $k$ centering steps satisfies
$$f(x^{(k)}) - f^\star \le \frac{m}{\mu^k t^{(0)}}$$
In other words, to reach a desired accuracy level of $\epsilon$, we require
$$\left\lceil \frac{\log\left(m/(t^{(0)} \epsilon)\right)}{\log \mu} \right\rceil$$
centering steps with the barrier method (plus the initial centering step).
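The centering-step count above is a one-line computation; the following sketch evaluates it for some illustrative parameter values (the helper name is an assumption, not from the slides):

```python
import math

def centering_steps(m, t0, eps, mu):
    """Number of centering steps (after the initial one) needed to
    guarantee f(x^(k)) - f* <= eps, from m / (mu^k t0) <= eps."""
    return math.ceil(math.log(m / (t0 * eps)) / math.log(mu))

# m = 100 constraints, t0 = 1, eps = 1e-6:
k_slow = centering_steps(100, 1.0, 1e-6, 2)    # mu = 2: log2(1e8) ~ 26.6 -> 27
k_fast = centering_steps(500, 1.0, 1e-6, 10)   # mu = 10, m = 500: ~ 8.7 -> 9
```

Note how the count depends only logarithmically on $m/(t^{(0)}\epsilon)$ and inversely on $\log \mu$, matching the trade-off in the choice of $\mu$ discussed above.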
Barrier method v.2

The previous algorithm generates points that are exactly on the central path. However, the central path is only a means to an end: there is no need to solve each barrier problem exactly.

- Pick $t^{(0)} > 0$ and let $k := 0$
- Solve the barrier problem for $t = t^{(0)}$ to produce $x^{(0)} \approx x^\star(t)$
- While $m/t > \epsilon$:
  - Pick $t^{(k+1)} > t^{(k)}$
  - Solve the barrier problem at $t = t^{(k+1)}$, using Newton's method initialized at $x^{(k)}$, to produce $x^{(k+1)} \approx x^\star(t)$

Important issues (these can be formalized): How close should each approximation be? How many Newton steps suffice at each centering step?
Example of barrier method progress for an LP with $m$ constraints (from B & V page 575): [figure: duality gap versus Newton iterations, for $m = 50, 500, 1000$]

We can see roughly linear convergence in each case, and logarithmic scaling with $m$.
Seen differently, the number of Newton steps needed (to decrease the initial duality gap by a factor of $10^4$) grows very slowly with $m$: [figure: Newton iterations versus $m$]

Note that the cost of a single Newton step does depend on the size of the problem.
Feasibility methods

We have implicitly assumed that we have a strictly feasible point for the first centering step, i.e., for computing $x^{(0)} = x^\star(t^{(0)})$, the solution of the barrier problem at $t = t^{(0)}$. This is a point $x$ such that
$$h_i(x) < 0, \; i = 1, \ldots, m, \quad Ax = b$$
How do we find such a point? By solving
$$\min_{x,s} \; s \quad \text{subject to } h_i(x) \le s, \; i = 1, \ldots, m, \quad Ax = b.$$
The goal is for $s$ to be negative at the solution. This is known as a feasibility method. We can apply the barrier method to the above problem, since it is easy to find a strictly feasible starting point.
Note that we do not need to solve this problem to high accuracy: once we find a feasible $(x, s)$ with $s < 0$, we can terminate early.

An alternative is to solve the problem
$$\min_{x,s} \; \mathbf{1}^T s \quad \text{subject to } h_i(x) \le s_i, \; i = 1, \ldots, m, \quad Ax = b, \; s \ge 0.$$
Previously $s$ was the maximum infeasibility across all inequalities; now each inequality has its own infeasibility variable $s_i$, $i = 1, \ldots, m$. One advantage: when the original system is infeasible, the solution of the above problem will be informative, since the nonzero entries of $s$ will tell us which of the constraints cannot be satisfied.
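The reason the phase-I problem is easy to initialize is that any $x$ gives a strictly feasible pair once $s$ exceeds the largest violation. A minimal sketch for affine constraints $Gx \le h$ (the helper name and offset of $+1$ are illustrative assumptions):

```python
import numpy as np

def phase1_start(G, h, x=None):
    """A strictly feasible starting point for the phase-I problem
    min s s.t. Gx - h <= s: pick any x, then set s strictly above
    the largest constraint violation."""
    if x is None:
        x = np.zeros(G.shape[1])
    s = np.max(G @ x - h) + 1.0
    return x, s

# Constraints 0 <= x <= 1 written as Gx <= h; x = (2, 2) is infeasible
# with worst violation 1, so the starting s must absorb it.
G = np.vstack([np.eye(2), -np.eye(2)])
hvec = np.array([1.0, 1.0, 0.0, 0.0])
x0, s0 = phase1_start(G, hvec, np.array([2.0, 2.0]))
```

Running the barrier method on $(x, s)$ from this starting point then drives $s$ down; we can stop as soon as $s < 0$.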
References and further reading
- S. Boyd and L. Vandenberghe (2004), Convex Optimization, Chapter 11
- A. Nemirovski (2004), Interior-Point Polynomial Time Methods in Convex Programming, Chapter 4
- J. Nocedal and S. Wright (2006), Numerical Optimization, Chapters 14 and 19
A formal barrier method

A convex function $\phi : D \to \mathbb{R}$ defined on an open convex set $D \subseteq \mathbb{R}^n$ is a self-concordant barrier with parameter $\nu$ if
- $\phi$ is self-concordant
- For all $x \in D$ we have
$$\lambda(x)^2 = \nabla \phi(x)^T \left( \nabla^2 \phi(x) \right)^{-1} \nabla \phi(x) \le \nu.$$

Consider the problem
$$\min_x \; c^T x \quad \text{subject to } x \in \bar{D}.$$
Approximate it with
$$\min_x \; t c^T x + \phi(x).$$
For convenience let $\phi_t(x) := t c^T x + \phi(x)$ and let $\lambda_t(x)$ denote the corresponding Newton decrement.

Key observation: for $t^+ > t$,
$$\lambda_{t^+}(x) \le \frac{t^+}{t} \lambda_t(x) + \left( \frac{t^+}{t} - 1 \right) \sqrt{\nu}.$$

Theorem: If $\lambda_t(x) \le \frac{1}{9}$ and $\frac{t^+}{t} \le 1 + \frac{1}{8\sqrt{\nu}}$, then $\lambda_{t^+}(x^+) \le \frac{1}{9}$ for
$$x^+ = x - \left( \nabla^2 \phi_{t^+}(x) \right)^{-1} \nabla \phi_{t^+}(x).$$

Consequently, if we start with $x^{(0)}, t^{(0)}$ such that $\lambda_{t^{(0)}}(x^{(0)}) < \frac{1}{9}$ and choose $\mu := 1 + \frac{1}{8\sqrt{\nu}}$ in the barrier method, one Newton iteration suffices at each centering step.
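To get a feel for the short-step parameter $\mu = 1 + 1/(8\sqrt{\nu})$, the following sketch computes it and the resulting number of centering steps (the helper name is an assumption; recall that the log barrier for $m$ inequalities is a $\nu$-self-concordant barrier with $\nu = m$):

```python
import math

def mu_and_steps(nu, gap_reduction):
    """Short-step parameter mu = 1 + 1/(8*sqrt(nu)) and the number of
    centering steps needed to shrink the gap bound by gap_reduction;
    with this mu, each centering step needs only one Newton iteration."""
    mu = 1.0 + 1.0 / (8.0 * math.sqrt(nu))
    steps = math.ceil(math.log(gap_reduction) / math.log(mu))
    return mu, steps

# For m = 100 inequalities (nu = 100), mu = 1.0125; shrinking the gap
# by 1e8 takes on the order of sqrt(nu) * log(gap_reduction) steps.
mu, k = mu_and_steps(100.0, 1e8)
```

The trade-off versus barrier method v.1 is visible here: many more centering steps, but each is a single cheap Newton iteration, giving the polynomial-time $O(\sqrt{\nu}\,\log(1/\epsilon))$ guarantee.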