Chapter 1
LINEAR COMPLEMENTARITY PROBLEM, ITS GEOMETRY, AND APPLICATIONS

1.1 THE LINEAR COMPLEMENTARITY PROBLEM AND ITS GEOMETRY

The Linear Complementarity Problem (abbreviated as LCP) is a general problem which unifies linear and quadratic programs and bimatrix games. The study of the LCP has led to many far-reaching benefits. For example, an algorithm known as the complementary pivot algorithm, first developed for solving LCPs, has been generalized in a direct manner to yield efficient algorithms for computing Brouwer and Kakutani fixed points, for computing economic equilibria, and for solving systems of nonlinear equations and nonlinear programming problems. Also, iterative methods developed for solving LCPs hold great promise for handling very large scale linear programs which cannot be tackled with the well-known simplex method because of their large size and the consequent numerical difficulties. For these reasons the study of the LCP offers rich rewards for people learning or doing research in optimization or engaged in practical applications of optimization. In this book we discuss the LCP in all its depth.

Let $M$ be a given square matrix of order $n$ and $q$ a column vector in $R^n$. Throughout this book we will use the symbols $w_1, \ldots, w_n$, $z_1, \ldots, z_n$ to denote the variables in the problem. In an LCP there is no objective function to be optimized. The problem is: find $w = (w_1, \ldots, w_n)^T$, $z = (z_1, \ldots, z_n)^T$ satisfying

$$w - Mz = q, \quad w \geq 0, \quad z \geq 0, \quad \text{and } w_i z_i = 0 \text{ for all } i. \qquad (1.1)$$
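To make the definition concrete, here is a minimal numpy sketch (an illustration, not from the book) that checks whether a given pair $(w, z)$ satisfies (1.1); the function name and tolerance handling are assumptions of this sketch.

```python
import numpy as np

def is_lcp_solution(M, q, w, z, tol=1e-9):
    """Check w - Mz = q, w >= 0, z >= 0, and w_i * z_i = 0 for all i."""
    M, q, w, z = map(np.asarray, (M, q, w, z))
    return (np.allclose(w - M @ z, q, atol=tol)
            and (w >= -tol).all()
            and (z >= -tol).all()
            and np.abs(w * z).max() <= tol)

# The order-2 example solved later in this section:
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-5.0, -6.0])
print(is_lcp_solution(M, q, w=[0, 0], z=[4/3, 7/3]))   # True
```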

The only data in the problem are the column vector $q$ and the square matrix $M$. So we will denote the LCP of finding $w \in R^n$, $z \in R^n$ satisfying (1.1) by the symbol $(q, M)$; it is said to be an LCP of order $n$. In an LCP of order $n$ there are $2n$ variables. As a specific example, let $n = 2$,

$$M = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad q = \begin{pmatrix} -5 \\ -6 \end{pmatrix}.$$

This leads to the LCP

$$w_1 - 2z_1 - z_2 = -5, \qquad w_2 - z_1 - 2z_2 = -6, \qquad (1.2)$$
$$w_1, w_2, z_1, z_2 \geq 0 \quad \text{and} \quad w_1 z_1 = w_2 z_2 = 0.$$

The problem (1.2) can be expressed in the form of a vector equation as

$$w_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + w_2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} + z_1 \begin{pmatrix} -2 \\ -1 \end{pmatrix} + z_2 \begin{pmatrix} -1 \\ -2 \end{pmatrix} = \begin{pmatrix} -5 \\ -6 \end{pmatrix} \qquad (1.3)$$

$$w_1, w_2, z_1, z_2 \geq 0 \quad \text{and} \quad w_1 z_1 = w_2 z_2 = 0. \qquad (1.4)$$

In any solution satisfying (1.4), at least one of the variables in each pair $(w_j, z_j)$ has to equal zero. One approach for solving this problem is to pick one variable from each of the pairs $(w_1, z_1)$, $(w_2, z_2)$ and to fix them at zero value in (1.3). The remaining variables in the system may be called usable variables. After eliminating the zero variables from (1.3), if the remaining system has a solution in which the usable variables are nonnegative, that would provide a solution to (1.3) and (1.4).

Pick $w_1$, $w_2$ as the zero-valued variables. After setting $w_1$, $w_2$ equal to 0 in (1.3), the remaining system is

$$z_1 \begin{pmatrix} -2 \\ -1 \end{pmatrix} + z_2 \begin{pmatrix} -1 \\ -2 \end{pmatrix} = \begin{pmatrix} -5 \\ -6 \end{pmatrix} = q, \qquad z_1 \geq 0, \ z_2 \geq 0. \qquad (1.5)$$

[Figure 1.1: A complementary cone in the $q_1, q_2$-plane, spanned by $(-2,-1)^T$ and $(-1,-2)^T$.]

Equation (1.5) has a solution iff the vector $q$ can be expressed as a nonnegative linear combination of the vectors $(-2,-1)^T$ and $(-1,-2)^T$.

The set of all nonnegative linear combinations of $(-2,-1)^T$ and $(-1,-2)^T$ is a cone in the $q_1, q_2$-space, as in Figure 1.1. Only if the given vector $q = (-5,-6)^T$ lies in this cone does the LCP (1.2) have a solution in which the usable variables are $z_1, z_2$. We verify that the point $(-5,-6)^T$ does lie in the cone, that the solution of (1.5) is $(z_1, z_2) = (4/3, 7/3)$, and hence that a solution of (1.2) is $(w_1, w_2, z_1, z_2) = (0, 0, 4/3, 7/3)$. The cone in Figure 1.1 is known as a complementary cone associated with the LCP (1.2). Complementary cones are generalizations of the well-known class of quadrants or orthants.

1.1.1 Notation

The symbol $I$ usually denotes the unit matrix. If we want to emphasize its order, we denote the unit matrix of order $n$ by the symbol $I_n$. We will use the abbreviation LP for "Linear Program" and BFS for "Basic Feasible Solution". LCP is the abbreviation for "Linear Complementarity Problem" and NLP is the abbreviation for "Nonlinear Program".

Column and Row Vectors of a Matrix. If $A = (a_{ij})$ is a matrix of order $m \times n$, say, we will denote its $j$th column vector $(a_{1j}, \ldots, a_{mj})^T$ by the symbol $A_{.j}$, and its $i$th row vector $(a_{i1}, \ldots, a_{in})$ by $A_{i.}$.

Nonnegative, Semipositive, Positive Vectors. Let $x = (x_1, \ldots, x_n)^T \in R^n$. $x$ is nonnegative, written $x \geq 0$, if $x_j \geq 0$ for all $j$; clearly $0 \geq 0$. $x$ is said to be semipositive, written $x \gneq 0$, if $x_j \geq 0$ for all $j$ and $x_j > 0$ for at least one $j$. Notice the distinction between the two symbols: the zero vector is the only nonnegative vector which is not semipositive. Also, if $x \gneq 0$, then $\sum_{j=1}^n x_j > 0$. The vector $x$ is strictly positive, written $x > 0$, if $x_j > 0$ for all $j$. Given two vectors $x, y \in R^n$, we write $x \geq y$ if $x - y \geq 0$, $x \gneq y$ if $x - y \gneq 0$, and $x > y$ if $x - y > 0$.

Pos Cones. If $\{x^1, \ldots, x^r\} \subset R^n$, the cone $\{x : x = \alpha_1 x^1 + \cdots + \alpha_r x^r, \ \alpha_1, \ldots, \alpha_r \geq 0\}$ is denoted by $\mathrm{Pos}\{x^1, \ldots, x^r\}$. Given the matrix $A$ of order $m \times n$, $\mathrm{Pos}(A)$ denotes the cone $\mathrm{Pos}\{A_{.1}, \ldots, A_{.n}\} = \{x : x = A\alpha \text{ for some } \alpha = (\alpha_1, \ldots, \alpha_n)^T \geq 0\}$.
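Membership in a Pos cone is a linear feasibility question, so it can be tested with any LP solver. The sketch below (an illustration, not the book's method) uses scipy.optimize.linprog to decide whether $q \in \mathrm{Pos}(A)$.

```python
import numpy as np
from scipy.optimize import linprog

def in_pos_cone(A, q):
    """Test whether q is in Pos(A) = {A @ alpha : alpha >= 0}:
    feasibility of A alpha = q, alpha >= 0, checked as an LP with zero objective."""
    A, q = np.asarray(A, float), np.asarray(q, float)
    res = linprog(c=np.zeros(A.shape[1]), A_eq=A, b_eq=q,
                  bounds=[(0, None)] * A.shape[1])
    return res.success

# Columns (-2,-1)^T and (-1,-2)^T, q = (-5,-6)^T from the example above:
print(in_pos_cone([[-2, -1], [-1, -2]], [-5, -6]))   # True
```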

Directions, Rays, Half-Lines, and Step Length. Any point $y \in R^n$, $y \neq 0$, defines a direction in $R^n$. Given the direction $y$, its ray is the half-line obtained by joining the origin $0$ to $y$ and continuing indefinitely in the same direction; it is the set of points $\{\alpha y : \alpha \geq 0\}$. Given $x \in R^n$, by moving from $x$ in the direction $y$ we get points of the form $x + \alpha y$, where $\alpha \geq 0$, and the set of all such points $\{x + \alpha y : \alpha \geq 0\}$ is the half-line or ray through $x$ in the direction $y$. The point $x + \lambda y$ for $\lambda > 0$ is said to have been obtained by moving from $x$ in the direction $y$ a step length of $\lambda$. As an example, if $y = (1, 1)^T \in R^2$, the ray of $y$ is the set of all points of the form $\{(\alpha, \alpha)^T : \alpha \geq 0\}$. In addition, if $x = (1, -1)^T$, the half-line through $x$ in the direction $y$ is the set of all points of the form $\{(1 + \alpha, -1 + \alpha)^T : \alpha \geq 0\}$. See Figure 1.2. On this half-line, letting $\alpha = 1$ we get the point $(2, 0)^T$; this point is obtained by taking a step of length 1 from $x = (1, -1)^T$ in the direction $y = (1, 1)^T$.

[Figure 1.2: Rays and half-lines — the ray of $y$ through the origin, and the half-line through $x$ in the direction $y$.]

1.1.2 Complementary Cones

In the LCP $(q, M)$, the complementary cones are defined by the matrix $M$; the point $q$ does not play any role in the definition of complementary cones. Let $M$ be a given square matrix of order $n$. The class $C(M)$ of complementary cones corresponding to $M$ is obtained as follows.

The pair of column vectors $(I_{.j}, -M_{.j})$ is known as the $j$th complementary pair of vectors, $1 \leq j \leq n$. Pick a vector from the pair $(I_{.j}, -M_{.j})$ and denote it by $A_{.j}$. The ordered set of vectors $(A_{.1}, \ldots, A_{.n})$ is known as a complementary set of vectors. The cone

$$\mathrm{Pos}(A_{.1}, \ldots, A_{.n}) = \{y : y = \beta_1 A_{.1} + \cdots + \beta_n A_{.n}, \ \beta_1 \geq 0, \ldots, \beta_n \geq 0\}$$

is known as a complementary cone in the class $C(M)$. Clearly there are $2^n$ complementary cones.

Example 1.1. Let $n = 2$ and $M = I$. In this case the class $C(I)$ is just the class of orthants in $R^2$. In general, for any $n$, $C(I)$ is the class of orthants in $R^n$; thus the class of complementary cones is a generalization of the class of orthants. See Figure 1.3. Figures 1.4 and 1.5 provide some more examples of complementary cones. In the example in Figure 1.6, since $\{I_{.1}, -M_{.2}\}$ is a linearly dependent set, the cone $\mathrm{Pos}(I_{.1}, -M_{.2})$ has an empty interior; it consists of all the points on the horizontal axis in Figure 1.6 (the thick axis). The remaining three complementary cones there have nonempty interiors.

[Figure 1.3: when $M = I$, the complementary cones are the orthants. Figure 1.4: the four complementary cones $\mathrm{Pos}(I_{.1}, I_{.2})$, $\mathrm{Pos}(I_{.1}, -M_{.2})$, $\mathrm{Pos}(-M_{.1}, I_{.2})$, $\mathrm{Pos}(-M_{.1}, -M_{.2})$ of another $2 \times 2$ matrix $M$.]

Degenerate, Nondegenerate Complementary Cones

Let $\mathrm{Pos}(A_{.1}, \ldots, A_{.n})$ be a complementary cone in $C(M)$. This cone is said to be a nondegenerate complementary cone if it has a nonempty interior, that is, if $\{A_{.1}, \ldots, A_{.n}\}$ is a linearly independent set; and a degenerate complementary cone if its interior is empty, which happens when $\{A_{.1}, \ldots, A_{.n}\}$ is a linearly dependent set. As examples, all the complementary cones in Figures 1.3, 1.4, and 1.5 are nondegenerate. In Figure 1.6 the complementary cone $\mathrm{Pos}(I_{.1}, -M_{.2})$ is degenerate, and the remaining three complementary cones are nondegenerate.

[Figure 1.5: complementary cones of a $2 \times 2$ matrix $M$, all four nondegenerate. Figure 1.6: complementary cones of a $2 \times 2$ matrix $M$ for which $\{I_{.1}, -M_{.2}\}$ is linearly dependent, so $\mathrm{Pos}(I_{.1}, -M_{.2})$ is degenerate.]
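The degeneracy classification is just a rank test on each complementary matrix. A small sketch (the $2 \times 2$ matrix $M$ used here is an arbitrary illustration, not an example from the text):

```python
import itertools
import numpy as np

def complementary_cones(M):
    """Yield (choice, A) for every complementary matrix A of M:
    column j of A is I.j if choice[j] == 'w', and -M.j if choice[j] == 'z'."""
    M = np.asarray(M, float)
    n = M.shape[0]
    I = np.eye(n)
    for choice in itertools.product("wz", repeat=n):
        cols = [I[:, j] if c == "w" else -M[:, j] for j, c in enumerate(choice)]
        yield choice, np.column_stack(cols)

M = np.array([[1.0, -1.0], [0.0, 0.0]])   # chosen so that some cones degenerate
for choice, A in complementary_cones(M):
    kind = ("nondegenerate" if np.linalg.matrix_rank(A) == A.shape[0]
            else "degenerate")
    print(choice, kind)
```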

1.1.3 The Linear Complementarity Problem

Given the square matrix $M$ of order $n$ and the column vector $q \in R^n$, the LCP $(q, M)$ is equivalent to the problem of finding a complementary cone in $C(M)$ that contains the point $q$, that is, of finding a complementary set of column vectors $(A_{.1}, \ldots, A_{.n})$ such that

(i) $A_{.j} \in \{I_{.j}, -M_{.j}\}$ for $1 \leq j \leq n$, where $I$ is the identity matrix of order $n$ and $I_{.j}$ is its $j$th column vector, and
(ii) $q$ can be expressed as a nonnegative linear combination of $(A_{.1}, \ldots, A_{.n})$.

This is equivalent to finding $w \in R^n$, $z \in R^n$ satisfying $\sum_{j=1}^n I_{.j} w_j - \sum_{j=1}^n M_{.j} z_j = q$, $w_j \geq 0$, $z_j \geq 0$ for all $j$, and either $w_j = 0$ or $z_j = 0$ for all $j$. In matrix notation this is

$$w - Mz = q \qquad (1.6)$$
$$w \geq 0, \quad z \geq 0 \qquad (1.7)$$
$$w_j z_j = 0 \quad \text{for all } j. \qquad (1.8)$$

Because of (1.7), the condition (1.8) is equivalent to $\sum_{j=1}^n w_j z_j = w^T z = 0$; this condition is known as the complementarity constraint. In any solution of the LCP $(q, M)$, if one of the variables in the pair $(w_j, z_j)$ is positive, the other must be zero. Hence the pair $(w_j, z_j)$ is known as the $j$th complementary pair of variables, and each variable in this pair is the complement of the other. In (1.6) the column vector corresponding to $w_j$ is $I_{.j}$, and the column vector corresponding to $z_j$ is $-M_{.j}$. For $j = 1$ to $n$, the pair $(I_{.j}, -M_{.j})$ is the $j$th complementary pair of column vectors in the LCP $(q, M)$. For $j = 1$ to $n$, let $y_j \in \{w_j, z_j\}$, and let $A_{.j}$ be the column vector corresponding to $y_j$ in (1.6), so that $A_{.j} \in \{I_{.j}, -M_{.j}\}$. Then $y = (y_1, \ldots, y_n)$ is a complementary vector of variables in this LCP, the ordered set $(A_{.1}, \ldots, A_{.n})$ is the complementary set of column vectors corresponding to it, and the matrix $A$ whose column vectors are $A_{.1}, \ldots, A_{.n}$, in that order, is known as the complementary matrix corresponding to it. If $\{A_{.1}, \ldots, A_{.n}\}$ is linearly independent, $y$ is a complementary basic vector of variables in this LCP, and the complementary matrix $A$ whose column vectors are $A_{.1}, \ldots, A_{.n}$, in that order, is known as the complementary basis for (1.6) corresponding to the complementary basic vector $y$. The cone

$$\mathrm{Pos}(A_{.1}, \ldots, A_{.n}) = \{x : x = \beta_1 A_{.1} + \cdots + \beta_n A_{.n}, \ \beta_1 \geq 0, \ldots, \beta_n \geq 0\}$$

is the complementary cone in the class $C(M)$ corresponding to the complementary set of column vectors $(A_{.1}, \ldots, A_{.n})$, or to the associated complementary vector of variables $y$.

A solution of the LCP $(q, M)$ always means a $(w, z)$ satisfying all the constraints (1.6), (1.7), (1.8). A complementary feasible basic vector for this LCP is a complementary basic vector satisfying the property that $q$ can be expressed as a nonnegative combination of the column vectors of the corresponding complementary basis. Thus each complementary feasible basic vector leads to a solution of the LCP. The union of all the complementary cones associated with the square matrix $M$ is denoted by the symbol $K(M)$. $K(M)$ is clearly the set of all vectors $q$ for which the LCP $(q, M)$ has at least one solution. We will say that the vector $z$ leads to a solution of the LCP $(q, M)$ iff $(w = Mz + q, z)$ is a solution of this LCP. As an illustration, here are all the complementary vectors of variables and the corresponding complementary matrices for (1.2), an LCP of order 2.

Complementary vector of variables, and the corresponding complementary matrix:

$(w_1, w_2)$: $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$;  $(w_1, z_2)$: $\begin{pmatrix} 1 & -1 \\ 0 & -2 \end{pmatrix}$;  $(z_1, w_2)$: $\begin{pmatrix} -2 & 0 \\ -1 & 1 \end{pmatrix}$;  $(z_1, z_2)$: $\begin{pmatrix} -2 & -1 \\ -1 & -2 \end{pmatrix}$.

Since each of these complementary matrices is nonsingular, all the complementary vectors are complementary basic vectors, and all the complementary matrices are complementary bases, in this LCP. Since $q = (-5, -6)^T$ in (1.2) can be expressed as a nonnegative combination of the columns of the complementary matrix corresponding to $(z_1, z_2)$, $(z_1, z_2)$ is a complementary feasible basic vector for this LCP. The reader should draw all the complementary cones corresponding to this LCP on the two-dimensional Cartesian plane and verify that for this LCP their union, the set $K(M)$, is all of $R^2$.

The Total Enumeration Method for the LCP

Consider the LCP $(q, M)$ of order $n$. The complementarity constraint (1.8) implies that in any solution $(w, z)$ of this LCP, for each $j = 1$ to $n$ we must have either $w_j = 0$ or $z_j = 0$. This gives the LCP a combinatorial, rather than nonlinear, flavour, and it automatically leads to an enumeration method for the LCP. There are exactly $2^n$ complementary vectors of variables. Let $y^r = (y^r_1, \ldots, y^r_n)$, $r = 1$ to $2^n$, where $y^r_j \in \{w_j, z_j\}$ for each $j$, be all the complementary vectors of variables, and let $A^r$ be the complementary matrix corresponding to $y^r$. Solve the following system $(P_r)$:

$$A^r y^r = q, \qquad y^r \geq 0. \qquad (P_r)$$

This system can be solved by Phase I of the simplex method for LP, or by other methods for solving systems of linear equations and inequalities. If this system has a feasible solution, $\bar y^r$ say, then

$$y^r = \bar y^r, \quad \text{all variables not in } y^r \text{ equal to zero,}$$

is a solution of the LCP $(q, M)$. If the complementary matrix $A^r$ is singular, the system $(P_r)$ may have no feasible solution, or one, or an infinite number of feasible solutions; each feasible solution of $(P_r)$ leads to a solution of the LCP $(q, M)$ as discussed above. When this is repeated for $r = 1$ to $2^n$, all solutions of the LCP $(q, M)$ can be obtained. The method discussed at the beginning of Section 1.1 to solve an LCP of order 2 is exactly this enumeration method. The enumeration method is convenient to use only when $n = 2$, since $2^2 = 4$ is small, and to check whether the system $(P_r)$ has a solution for any $r$ we can draw the corresponding complementary cone in the two-dimensional Cartesian plane and check whether it contains $q$. When $n > 2$, and particularly for large $n$, this enumeration method becomes impractical, since $2^n$ grows very rapidly. In Chapter 2 and later chapters we discuss efficient pivotal and other methods for solving special classes of LCPs that arise in several practical applications. Later in the book we show that the general LCP is a hard problem; at the moment, the only known algorithms which are guaranteed to solve the general LCP are enumerative methods.
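The enumeration method fits in a few lines of numpy. The book solves each system $(P_r)$ by Phase I of the simplex method, which also covers singular $A^r$; for brevity this illustrative sketch simply skips singular complementary matrices.

```python
import itertools
import numpy as np

def solve_lcp_by_enumeration(M, q, tol=1e-9):
    """Try all 2^n complementary matrices A^r; solve A^r y = q and keep y >= 0.
    Returns a list of solutions (w, z). Singular A^r are skipped in this sketch."""
    M, q = np.asarray(M, float), np.asarray(q, float)
    n = len(q)
    I = np.eye(n)
    solutions = []
    for choice in itertools.product((0, 1), repeat=n):   # 0 -> w_j, 1 -> z_j
        A = np.column_stack([(-M if c else I)[:, j] for j, c in enumerate(choice)])
        if abs(np.linalg.det(A)) < tol:
            continue                      # singular: would need Phase I instead
        y = np.linalg.solve(A, q)
        if (y >= -tol).all():
            w, z = np.zeros(n), np.zeros(n)
            for j, c in enumerate(choice):
                (z if c else w)[j] = y[j]
            solutions.append((w, z))
    return solutions

M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-5.0, -6.0])
print(solve_lcp_by_enumeration(M, q))   # one solution: w = (0, 0), z = (4/3, 7/3)
```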

1.2 APPLICATION TO LINEAR PROGRAMMING

In a general LP there may be some inequality constraints, equality constraints, sign-restricted variables, and unrestricted variables. Transform each lower-bounded variable, say $x_j \geq l_j$, into a nonnegative variable by substituting $x_j = l_j + y_j$, where $y_j \geq 0$. Transform each sign-restricted variable of the form $x_j \leq 0$ into a nonnegative variable by substituting $x_j = -y_j$, where $y_j \geq 0$. Eliminate the unrestricted variables one after the other, using the equality constraints. In the resulting system, if there is still an equality constraint left, use it to eliminate a nonnegative variable from the system, thereby transforming the constraint into an inequality constraint in the remaining variables. Repeat this process until there are no more equality constraints. In the resulting system, transform any inequality constraint of the "$\leq$" form into one of the "$\geq$" form by multiplying both sides by $-1$. If the objective function is to be maximized, replace it by its negative, which should be minimized, and eliminate any constant terms in it. When all this work is completed, the original LP is transformed into

$$\text{Minimize } cx \quad \text{subject to } Ax \geq b, \ x \geq 0, \qquad (1.9)$$

which is in symmetric form. Here suppose $A$ is of order $m \times N$. If $\bar x$ is an optimum feasible solution of (1.9), by the results of the duality theory of linear programming there exist a dual vector $\bar y \in R^m$, a primal slack vector $\bar v \in R^m$, and a dual slack vector $\bar u \in R^N$ which together satisfy

$$\begin{pmatrix} u \\ v \end{pmatrix} - \begin{pmatrix} 0 & -A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} c^T \\ -b \end{pmatrix}, \quad \begin{pmatrix} u \\ v \end{pmatrix} \geq 0, \ \begin{pmatrix} x \\ y \end{pmatrix} \geq 0, \quad \text{and} \quad \begin{pmatrix} u \\ v \end{pmatrix}^T \begin{pmatrix} x \\ y \end{pmatrix} = 0. \qquad (1.10)$$

Conversely, if $u$, $v$, $x$, $y$ together satisfy all the conditions in (1.10), then $x$ is an optimum solution of (1.9). In (1.10) all the vectors and matrices are written in partitioned form; for example, $\begin{pmatrix} u \\ v \end{pmatrix}$ is the vector $(u_1, \ldots, u_N, v_1, \ldots, v_m)^T$. If

$$n = m + N, \quad w = \begin{pmatrix} u \\ v \end{pmatrix}, \quad z = \begin{pmatrix} x \\ y \end{pmatrix}, \quad M = \begin{pmatrix} 0 & -A^T \\ A & 0 \end{pmatrix}, \quad q = \begin{pmatrix} c^T \\ -b \end{pmatrix},$$

then (1.10) is seen to be an LCP of order $n$ of the type (1.6)–(1.8). Solving the LP (1.9) can thus be achieved by solving the LCP (1.10). Also, the various complementary pairs of variables in the LCP (1.10) are exactly the complementary pairs in the primal-dual pair of LPs consisting of (1.9) and its dual.

As an example, consider the following LP:

$$\text{Minimize } -3x_1 + 4x_2$$
$$\text{Subject to } \quad x_1 - x_2 + 3x_3 \geq -6$$
$$\qquad\qquad -3x_1 + x_2 - 3x_3 \geq 12$$
$$\qquad\qquad x_j \geq 0, \quad j = 1, 2, 3.$$

Let $(v_1, y_1)$, $(v_2, y_2)$ denote the nonnegative slack variable and dual variable, respectively, associated with the two primal constraints, in that order. Let $u_1$, $u_2$, $u_3$ denote the nonnegative dual slack variables associated with the dual constraints corresponding to the primal variables $x_1$, $x_2$, $x_3$, in that order. Then the primal and dual systems, together with the complementary slackness conditions for optimality, are

$$x_1 - x_2 + 3x_3 - v_1 = -6$$
$$-3x_1 + x_2 - 3x_3 - v_2 = 12$$
$$y_1 - 3y_2 + u_1 = -3$$
$$-y_1 + y_2 + u_2 = 4$$
$$3y_1 - 3y_2 + u_3 = 0$$

with $x_j, u_j, y_i, v_i \geq 0$ for all $i, j$, and $x_j u_j = y_i v_i = 0$ for all $i, j$.

This is exactly the following LCP (each row lists the coefficients of the variables and the right-hand-side constant $q$):

$u_1$ $u_2$ $u_3$ $v_1$ $v_2$ | $x_1$ $x_2$ $x_3$ | $y_1$ $y_2$ | $q$
 1    0    0    0    0   |  0    0    0   |  1   $-3$  | $-3$
 0    1    0    0    0   |  0    0    0   | $-1$   1   |  4
 0    0    1    0    0   |  0    0    0   |  3   $-3$  |  0
 0    0    0    1    0   | $-1$   1   $-3$ |  0    0   |  6
 0    0    0    0    1   |  3   $-1$   3  |  0    0   | $-12$

All variables $\geq 0$, and $u_1 x_1 = u_2 x_2 = u_3 x_3 = v_1 y_1 = v_2 y_2 = 0$.
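As a sketch of this reduction, the function below assembles $q$ and $M$ of (1.10) from the data $(A, b, c)$ of an LP in symmetric form. It is run on the example above; note that the right-hand-side constant 12 of the second constraint is a reconstruction from a damaged source, so treat that datum as illustrative.

```python
import numpy as np

def lp_to_lcp(A, b, c):
    """Build (q, M) of order m + N for the LP: min cx s.t. Ax >= b, x >= 0,
    with M = [[0, -A^T], [A, 0]] and q = (c^T, -b)^T, as in (1.10)."""
    A = np.asarray(A, float)
    b = np.asarray(b, float).ravel()
    c = np.asarray(c, float).ravel()
    m, N = A.shape
    M = np.block([[np.zeros((N, N)), -A.T],
                  [A, np.zeros((m, m))]])
    q = np.concatenate([c, -b])
    return q, M

A = np.array([[1.0, -1.0, 3.0], [-3.0, 1.0, -3.0]])
b = np.array([-6.0, 12.0])      # second entry reconstructed, see the text
c = np.array([-3.0, 4.0, 0.0])
q, M = lp_to_lcp(A, b, c)
print(q)                        # [-3.  4.  0.  6. -12.]
```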

1.3 QUADRATIC PROGRAMMING

Using the methods discussed in Section 1.2, any problem in which a quadratic objective function has to be optimized subject to linear equality and inequality constraints can be transformed into a problem of the form

$$\text{Minimize } Q(x) = cx + \tfrac{1}{2} x^T D x \quad \text{subject to } Ax \geq b, \ x \geq 0, \qquad (1.11)$$

where $A$ is a matrix of order $m \times N$ and $D$ is a square symmetric matrix of order $N$. There is no loss of generality in assuming that $D$ is a symmetric matrix, because if it is not, replacing $D$ by $(D + D^T)/2$ (which is a symmetric matrix) leaves $Q(x)$ unchanged. We assume that $D$ is symmetric.

1.3.1 Review on Positive Semidefinite Matrices

A square matrix $F = (f_{ij})$ of order $n$, whether it is symmetric or not, is said to be a positive semidefinite matrix if $y^T F y \geq 0$ for all $y \in R^n$. It is said to be a positive definite matrix if $y^T F y > 0$ for all $y \neq 0$. We will use the abbreviations PSD and PD for "positive semidefinite" and "positive definite", respectively.

Principal Submatrices, Principal Subdeterminants

Let $F = (f_{ij})$ be a square matrix of order $n$. Let $\{i_1, \ldots, i_r\} \subset \{1, \ldots, n\}$, with its elements arranged in increasing order. Erase all the entries in $F$ in row $i$ and column $i$ for each $i \notin \{i_1, \ldots, i_r\}$. What remains is a square submatrix of $F$ of order $r$,

$$\begin{pmatrix} f_{i_1 i_1} & \cdots & f_{i_1 i_r} \\ \vdots & & \vdots \\ f_{i_r i_1} & \cdots & f_{i_r i_r} \end{pmatrix}.$$

This submatrix is known as the principal submatrix of $F$ determined by the subset $\{i_1, \ldots, i_r\}$. Denoting the subset $\{i_1, \ldots, i_r\}$ by $J$, we denote this principal submatrix by the symbol $F_{JJ}$; it is $(f_{ij} : i \in J, j \in J)$. The determinant of this principal submatrix is called the principal subdeterminant of $F$ determined by the subset $J$. The principal submatrix of $F$ determined by $\emptyset$, the empty set, is the empty matrix, which has no entries; its determinant is defined by convention to be equal to 1. The principal submatrix of $F$ determined by $\{1, \ldots, n\}$ is $F$ itself. The principal submatrices of $F$ determined by nonempty subsets of $\{1, \ldots, n\}$ are the nonempty principal submatrices of $F$. Since the number of distinct nonempty subsets of $\{1, \ldots, n\}$ is $2^n - 1$, there are $2^n - 1$ nonempty principal submatrices of $F$. The principal submatrices of $F$ determined by proper subsets of $\{1, \ldots, n\}$ are known as proper principal submatrices of $F$, so each proper principal submatrix of $F$ is of order $\leq n - 1$.

Example 1.2. For a square matrix $F = (f_{ij})$ of order 3, the subset $\{1, 3\}$ determines the principal submatrix $\begin{pmatrix} f_{11} & f_{13} \\ f_{31} & f_{33} \end{pmatrix}$, and the subset $\{2\}$ determines the principal submatrix $(f_{22})$, whose single entry is the second element in the principal diagonal of $F$.

Several results useful in studying P(S)D matrices will now be discussed.

Results on P(S)D Matrices

Result 1.1. If $B = (b_{11})$ is a matrix of order 1, it is PD iff $b_{11} > 0$, and it is PSD iff $b_{11} \geq 0$.
Proof. Let $y = (y_1) \in R^1$. Then $y^T B y = b_{11} y_1^2$. So $y^T B y > 0$ for all $y \neq 0$ iff $b_{11} > 0$, and hence $B$ is PD iff $b_{11} > 0$. Also $y^T B y \geq 0$ for all $y \in R^1$ iff $b_{11} \geq 0$, and hence $B$ is PSD iff $b_{11} \geq 0$.

Result 1.2. If $F$ is a PD matrix, all its principal submatrices are also PD.
Proof. Consider the principal submatrix $G$ determined by the subset $\{1, 2\}$,

$$G = \begin{pmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{pmatrix}, \qquad t = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$

Pick $y = (y_1, y_2, 0, \ldots, 0)^T$. Then $y^T F y = t^T G t$. However, since $F$ is PD, $y^T F y > 0$ for all $y \neq 0$. So $t^T G t > 0$ for all $t \neq 0$; hence $G$ is PD too. A similar argument can be used to prove that every principal submatrix of $F$ is also PD.

Result 1.3. If $F$ is PD, then $f_{ii} > 0$ for all $i$. This follows as a corollary of Result 1.2.

Result 1.4. If $F$ is a PSD matrix, all principal submatrices of $F$ are also PSD. This result is proved using arguments similar to those in Result 1.2.

Result 1.5. If $F$ is a PSD matrix, then $f_{ii} \geq 0$ for all $i$. This follows from Result 1.4.

Result 1.6. Suppose $F$ is a PSD matrix. If $f_{ii} = 0$, then $f_{ij} + f_{ji} = 0$ for all $j$.
Proof. To be specific, let $f_{11}$ be 0 and suppose that $f_{12} + f_{21} \neq 0$. By Result 1.4 the principal submatrix

$$\begin{pmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{pmatrix} = \begin{pmatrix} 0 & f_{12} \\ f_{21} & f_{22} \end{pmatrix}$$

must be PSD. Hence $f_{22} y_2^2 + (f_{12} + f_{21}) y_1 y_2 \geq 0$ for all $y_1, y_2$. Since $f_{12} + f_{21} \neq 0$, take $y_1 = (-f_{22} - 1)/(f_{12} + f_{21})$ and $y_2 = 1$. The above inequality is violated, since the left-hand side becomes equal to $-1$, leading to a contradiction.

Result 1.7. If $D$ is a symmetric PSD matrix and $d_{ii} = 0$, then $D_{.i} = D_{i.} = 0$. This follows from Result 1.6.

Definition: The Gaussian Pivot Step

Let $A = (a_{ij})$ be a matrix of order $m \times n$. A Gaussian pivot step on $A$, with row $r$ as the pivot row and column $s$ as the pivot column, can only be carried out if the element lying in both of them, $a_{rs}$, is nonzero; this element $a_{rs}$ is known as the pivot element for the step. The pivot step subtracts suitable multiples of the pivot row from each row $i$ with $i > r$, so as to transform the entry in that row and the pivot column into zero.

Thus the pivot step transforms

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1s} & \cdots & a_{1n} \\ \vdots & & \vdots & & \vdots \\ a_{r1} & \cdots & a_{rs} & \cdots & a_{rn} \\ a_{r+1,1} & \cdots & a_{r+1,s} & \cdots & a_{r+1,n} \\ \vdots & & \vdots & & \vdots \\ a_{m1} & \cdots & a_{ms} & \cdots & a_{mn} \end{pmatrix} \quad \text{into} \quad \begin{pmatrix} a_{11} & \cdots & a_{1s} & \cdots & a_{1n} \\ \vdots & & \vdots & & \vdots \\ a_{r1} & \cdots & a_{rs} & \cdots & a_{rn} \\ a'_{r+1,1} & \cdots & 0 & \cdots & a'_{r+1,n} \\ \vdots & & \vdots & & \vdots \\ a'_{m1} & \cdots & 0 & \cdots & a'_{mn} \end{pmatrix},$$

where $a'_{ij} = a_{ij} - (a_{rj} a_{is})/a_{rs}$ for $i = r+1$ to $m$, $j = 1$ to $n$. As an example, consider the Gaussian pivot step with row 1 as the pivot row and column 3 as the pivot column; the pivot element is $a_{13} = 2$:

$$\begin{pmatrix} 1 & 0 & 2 & -1 \\ 2 & 1 & 4 & 0 \\ -1 & 3 & 2 & 1 \end{pmatrix} \quad \text{is transformed into} \quad \begin{pmatrix} 1 & 0 & 2 & -1 \\ 0 & 1 & 0 & 2 \\ -2 & 3 & 0 & 2 \end{pmatrix}.$$
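A minimal implementation of the Gaussian pivot step, reproducing the worked example above (the function name and 0-indexed convention are choices of this sketch):

```python
import numpy as np

def gaussian_pivot(A, r, s):
    """Gaussian pivot step on A with pivot row r and pivot column s (0-indexed):
    subtract multiples of row r from the rows below it to zero out column s there."""
    A = np.asarray(A, float).copy()
    if A[r, s] == 0:
        raise ValueError("pivot element must be nonzero")
    for i in range(r + 1, A.shape[0]):
        A[i] -= (A[i, s] / A[r, s]) * A[r]
    return A

A = np.array([[ 1.0, 0.0, 2.0, -1.0],
              [ 2.0, 1.0, 4.0,  0.0],
              [-1.0, 3.0, 2.0,  1.0]])
print(gaussian_pivot(A, r=0, s=2))
# [[ 1.  0.  2. -1.]
#  [ 0.  1.  0.  2.]
#  [-2.  3.  0.  2.]]
```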

Result 1.8. Let $D$ be a square symmetric matrix of order $n \geq 2$, and suppose $D$ is PD. Subtract suitable multiples of row 1 from each of the other rows so that all the entries in column 1 except the first are transformed into zero; that is, transform

$$D = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{1n} \\ d_{21} & d_{22} & \cdots & d_{2n} \\ \vdots & \vdots & & \vdots \\ d_{n1} & d_{n2} & \cdots & d_{nn} \end{pmatrix} \quad \text{into} \quad D_1 = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{1n} \\ 0 & \tilde d_{22} & \cdots & \tilde d_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & \tilde d_{n2} & \cdots & \tilde d_{nn} \end{pmatrix}$$

by a Gaussian pivot step with row 1 as the pivot row and column 1 as the pivot column; clearly $\tilde d_{ij} = d_{ij} - d_{1j} d_{i1}/d_{11}$ for all $i, j \geq 2$. Then $E_1$, the matrix obtained by striking off the first row and the first column of $D_1$, is also symmetric and PD. Moreover, if $D$ is an arbitrary square symmetric matrix, it is PD iff $d_{11} > 0$ and the matrix $E_1$ obtained as above is PD.

Proof. Since $D$ is symmetric, $d_{ij} = d_{ji}$ for all $i, j$. Therefore

$$y^T D y = \sum_{i,j} d_{ij} y_i y_j = d_{11} y_1^2 + 2 y_1 \sum_{j=2}^n d_{1j} y_j + \sum_{i,j \geq 2} d_{ij} y_i y_j = d_{11}\Big(y_1 + \frac{1}{d_{11}}\sum_{j=2}^n d_{1j} y_j\Big)^2 + \sum_{i,j \geq 2} y_i \tilde d_{ij} y_j.$$

Letting $y_1 = -\big(\sum_{j=2}^n d_{1j} y_j\big)/d_{11}$, we verify that if $D$ is PD then $\sum_{i,j \geq 2} y_i \tilde d_{ij} y_j > 0$ for all $(y_2, \ldots, y_n) \neq 0$, which implies that $E_1$ is PD. The fact that $E_1$ is also symmetric is clear, since $\tilde d_{ij} = d_{ij} - d_{1j} d_{i1}/d_{11} = \tilde d_{ji}$ by the symmetry of $D$. If $D$ is an arbitrary symmetric matrix, the above equation clearly implies that $D$ is PD iff $d_{11} > 0$ and $E_1$ is PD.

Result 1.9. A square matrix $F$ is PD (or PSD) iff $F + F^T$ is PD (or PSD).
Proof. This follows because $x^T (F + F^T) x = 2 x^T F x$.

Result 1.10. Let $F$ be a square matrix of order $n$ and $E$ a matrix of order $m \times n$. The square matrix $A = \begin{pmatrix} F & -E^T \\ E & 0 \end{pmatrix}$ of order $m + n$ is PSD iff $F$ is PSD.
Proof. Let $\xi = (y_1, \ldots, y_n, t_1, \ldots, t_m)^T \in R^{n+m}$ and $y = (y_1, \ldots, y_n)^T$. For all $\xi$ we have $\xi^T A \xi = y^T F y$. So $\xi^T A \xi \geq 0$ for all $\xi \in R^{n+m}$ iff $y^T F y \geq 0$ for all $y \in R^n$; that is, $A$ is PSD iff $F$ is PSD.

Result 1.11. If $B$ is a square nonsingular matrix of order $n$, then $D = B^T B$ is PD and symmetric.
Proof. The symmetry follows because $D^T = D$. For any $y \in R^n$, $y \neq 0$, we have $y^T D y = y^T B^T B y = \|By\|^2 > 0$, since $B$ nonsingular and $y \neq 0$ imply $By \neq 0$. So $D$ is PD.

Result 1.12. If $A$ is any matrix of order $m \times n$, then $A^T A$ is PSD and symmetric.
Proof. Similar to the proof of Result 1.11.

Principal Subdeterminants of PD, PSD Matrices

We will need the following theorem from elementary calculus.

Theorem 1.1 (Intermediate value theorem). Let $f(\lambda)$ be a continuous real-valued function defined on the closed interval $\lambda_0 \leq \lambda \leq \lambda_1$, where $\lambda_0 < \lambda_1$. Let $\bar f$ be a real number strictly between $f(\lambda_0)$ and $f(\lambda_1)$. Then there exists a $\bar\lambda$ satisfying $\lambda_0 < \bar\lambda < \lambda_1$ and $f(\bar\lambda) = \bar f$.

For a proof of Theorem 1.1 see books on calculus, for example W. Rudin, Principles of Mathematical Analysis, second edition, McGraw-Hill, 1964. Theorem 1.1 states that a continuous real-valued function defined on a closed interval assumes all intermediate values between its initial and final values in this interval. Now we will resume our discussion of PD and PSD matrices.

Theorem 1.2. If $F$ is a PD matrix, whether it is symmetric or not, the determinant of $F$ is strictly positive.

Proof. Let $F$ be of order $n$, and let $I$ be the identity matrix of order $n$. If the determinant of $F$ is zero, $F$ is singular, and hence there exists a nonzero column vector $x \in R^n$ such that $Fx = 0$, which implies $x^T F x = 0$, a contradiction to the hypothesis that $F$ is PD. So the determinant of $F$ is nonzero; in the same manner we conclude that the determinant of any PD matrix is nonzero. For $0 \leq \lambda \leq 1$, define $F(\lambda) = \lambda F + (1 - \lambda)I$ and $f(\lambda) = $ determinant of $F(\lambda)$. Obviously $f(\lambda)$ is a polynomial in $\lambda$, and hence $f(\lambda)$ is a real-valued continuous function on the interval $0 \leq \lambda \leq 1$. Given a column vector $x \in R^n$, $x \neq 0$, we have $x^T F(\lambda) x = \lambda x^T F x + (1 - \lambda) x^T x > 0$ for all $0 \leq \lambda \leq 1$, because $F$ is PD. So $F(\lambda)$ is a PD matrix for all $0 \leq \lambda \leq 1$, and from the above argument $f(\lambda) \neq 0$ for any $\lambda$ satisfying $0 \leq \lambda \leq 1$. Clearly $f(0) = 1$ and $f(1) = $ determinant of $F$. If $f(1) < 0$, then by Theorem 1.1 there exists a $\bar\lambda$ satisfying $0 < \bar\lambda < 1$ and $f(\bar\lambda) = 0$, a contradiction. Hence $f(1) \not< 0$. So the determinant of $F$ cannot be negative; also it is nonzero. Hence the determinant of $F$ is strictly positive.

Theorem 1.3. If $F$ is a PD matrix, whether it is symmetric or not, all principal subdeterminants of $F$ are strictly positive.
Proof. This follows from Result 1.2 and Theorem 1.2.

Theorem 1.4. If $F$ is a PSD matrix, whether it is symmetric or not, its determinant is nonnegative.
Proof. For $0 \leq \lambda \leq 1$, define $F(\lambda)$ and $f(\lambda)$ as in the proof of Theorem 1.2. Since $I$ is PD and $F$ is PSD, $F(\lambda)$ is a PD matrix for $0 \leq \lambda < 1$. Clearly $f(0) = 1$, and $f(1)$ is the determinant of $F$. If $f(1) < 0$, there exists a $\bar\lambda$ satisfying $0 < \bar\lambda < 1$ and $f(\bar\lambda) = 0$, a contradiction since $F(\bar\lambda)$ is a PD matrix. Hence $f(1) \not< 0$. So the determinant of $F$ is nonnegative.

Theorem 1.5. If $F$ is a PSD matrix, whether it is symmetric or not, all its principal subdeterminants are nonnegative.
Proof. Follows from Result 1.4 and Theorem 1.4.

Theorem 1.6. Let

$$H = \begin{pmatrix} d_{11} & \cdots & d_{1n} & d_{1,n+1} \\ \vdots & & \vdots & \vdots \\ d_{n1} & \cdots & d_{nn} & d_{n,n+1} \\ d_{n+1,1} & \cdots & d_{n+1,n} & d_{n+1,n+1} \end{pmatrix}, \qquad D = \begin{pmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & & \vdots \\ d_{n1} & \cdots & d_{nn} \end{pmatrix}$$

be symmetric matrices; $H$ is of order $n + 1$ and $D$ is a principal submatrix of $H$, so $d_{ij} = d_{ji}$ for all $i, j = 1$ to $n + 1$. Let $x \in R^n$, $d = (d_{1,n+1}, \ldots, d_{n,n+1})^T$, and $Q(x) = x^T D x + 2 d^T x + d_{n+1,n+1}$. Suppose $D$ is a PD matrix, and let $x^* = -D^{-1} d$. Then $x^*$ is the point which minimizes $Q(x)$ over $x \in R^n$, and

$$Q(x^*) = (\text{determinant of } H)/(\text{determinant of } D). \qquad (1.12)$$

Also, for any $x \in R^n$,

$$Q(x) = Q(x^*) + (x - x^*)^T D (x - x^*). \qquad (1.13)$$

Proof. Since $H$ is symmetric, $\partial Q(x)/\partial x = 2(Dx + d)$. Hence $x^*$ is the only point in $R^n$ which satisfies $\partial Q(x)/\partial x = 0$. Also, $Dx^* = -d$ implies

$$Q(x^*) = x^{*T} D x^* + 2 d^T x^* + d_{n+1,n+1} = d^T x^* + d_{n+1,n+1}. \qquad (1.14)$$

For $i = 1$ to $n + 1$, let $g_{i,n+1} = d_{i,n+1} + \sum_{j=1}^n d_{ij} x^*_j$, and let $g = (g_{1,n+1}, \ldots, g_{n,n+1})^T$; then $g = d + D x^* = 0$. Also $g_{n+1,n+1} = d_{n+1,n+1} + d^T x^* = Q(x^*)$, from (1.14). Now, from the properties of determinants, it is well known that the value of a determinant is unaltered if a constant multiple of one of its columns is added to another. For $j = 1$ to $n$, multiply the $j$th column of $H$ by $x^*_j$ and add the result to column $n + 1$ of $H$. This leads to

$$\det H = \det \begin{pmatrix} d_{11} & \cdots & d_{1n} & g_{1,n+1} \\ \vdots & & \vdots & \vdots \\ d_{n1} & \cdots & d_{nn} & g_{n,n+1} \\ d_{n+1,1} & \cdots & d_{n+1,n} & g_{n+1,n+1} \end{pmatrix} = \det \begin{pmatrix} d_{11} & \cdots & d_{1n} & 0 \\ \vdots & & \vdots & \vdots \\ d_{n1} & \cdots & d_{nn} & 0 \\ d_{n+1,1} & \cdots & d_{n+1,n} & Q(x^*) \end{pmatrix} = Q(x^*) \cdot \det D,$$

which yields (1.12). Equation (1.13) can be verified by straightforward expansion of its right-hand side, or it also follows from the Taylor expansion of $Q(x)$ around $x^*$, since $\partial^2 Q(x)/\partial x^2 = 2D$ and $x^*$ satisfies $\partial Q(x)/\partial x = 0$. Since $D$ is a PD matrix, we have $(x - x^*)^T D (x - x^*) > 0$ for all $x \in R^n$, $x \neq x^*$. This and (1.13) together imply that $Q(x) > Q(x^*)$ for all $x \in R^n$, $x \neq x^*$. Hence $x^*$ is the point which minimizes $Q(x)$ over $x \in R^n$.
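A quick numerical check of (1.12), on an arbitrarily chosen symmetric PD block $D$ (illustrative data, not from the text):

```python
import numpy as np

D = np.array([[4.0, 1.0], [1.0, 3.0]])
d = np.array([1.0, -2.0])
d_last = 5.0
H = np.block([[D, d[:, None]], [d[None, :], np.array([[d_last]])]])

x_star = -np.linalg.solve(D, d)                   # minimizer of Q
Q = lambda x: x @ D @ x + 2 * d @ x + d_last
print(Q(x_star))                                  # 32/11 = 2.9090...
print(np.linalg.det(H) / np.linalg.det(D))        # same value, as (1.12) asserts
```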

Theorem 1.7. Let $H$, $D$ be the square symmetric matrices defined in Theorem 1.6. $H$ is PD iff $D$ is PD and the determinant of $H$ is strictly positive.
Proof. Suppose $H$ is PD. By Theorem 1.2 the determinant of $H$ is strictly positive, and by Result 1.2 its principal submatrix $D$ is also PD.
Conversely, suppose that $D$ is PD and the determinant of $H$ is strictly positive. Let $x = (x_1, \ldots, x_n)^T$ and $\xi = (x_1, \ldots, x_n, x_{n+1})^T$, and define $d$, $Q(x)$ as in Theorem 1.6. If $x_{n+1} = 0$ but $\xi \neq 0$ (i.e., $x \neq 0$), then $\xi^T H \xi = x^T D x > 0$, since $D$ is PD. Now suppose $x_{n+1} \neq 0$. Let $\eta = (1/x_{n+1}) x$. Then

$$\xi^T H \xi = x^T D x + 2 x_{n+1} d^T x + d_{n+1,n+1} x_{n+1}^2 = x_{n+1}^2 \, Q(\eta).$$

So, when $x_{n+1} \neq 0$,

$$\xi^T H \xi = x_{n+1}^2 \, Q(\eta) \geq x_{n+1}^2 \big(\text{minimum of } Q(\eta) \text{ over } \eta \in R^n\big) = x_{n+1}^2 \, (\det H)/(\det D) > 0.$$

So, under our hypothesis that $D$ is PD and the determinant of $H$ is strictly positive, we have $\xi^T H \xi > 0$ for all $\xi \in R^{n+1}$, $\xi \neq 0$; that is, $H$ is PD.

Theorem 1.8. Let $H$ be the square symmetric matrix defined in Theorem 1.6. $H$ is PD iff the determinants of these $n + 1$ principal submatrices of $H$,

$$(d_{11}), \quad \begin{pmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \end{pmatrix}, \quad \begin{pmatrix} d_{11} & d_{12} & d_{13} \\ d_{21} & d_{22} & d_{23} \\ d_{31} & d_{32} & d_{33} \end{pmatrix}, \quad \ldots, \quad D, \quad H,$$

are strictly positive.
Proof. The proof is by induction on the order of the matrix. Clearly, the statement of the theorem is true if $H$ is of order 1. Now suppose the statement is true for all square symmetric matrices of order $n$. By this and the hypothesis, we know that the matrix $D$ is PD. So $D$ is PD and the determinant of $H$ is strictly positive by the hypothesis; by Theorem 1.7 these facts imply that $H$ is PD too. Hence, by induction, the statement of the theorem is true in general.

Theorem 1.9. A square symmetric matrix is PD iff all its principal subdeterminants are strictly positive.
Proof. Let the matrix be $H$, defined as in Theorem 1.6. If $H$ is PD, all its principal subdeterminants are strictly positive by Theorem 1.3. On the other hand, if all the principal subdeterminants of $H$ are strictly positive, then the $n + 1$ principal subdeterminants of $H$ discussed in Theorem 1.8 are strictly positive, and by Theorem 1.8 this implies that $H$ is PD.

Definition: P-matrix

A square matrix, whether symmetric or not, is said to be a P-matrix iff all its principal subdeterminants are strictly positive. For example, the matrices

$$I, \quad \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$$

are P-matrices, while the matrices

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$$

are not P-matrices.
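A direct (exponential-time) P-matrix test following the definition; the demonstration matrices are from the list above.

```python
import itertools
import numpy as np

def is_P_matrix(F, tol=1e-12):
    """True iff every principal subdeterminant of F is strictly positive."""
    F = np.asarray(F, float)
    n = F.shape[0]
    for r in range(1, n + 1):
        for J in itertools.combinations(range(n), r):
            if np.linalg.det(F[np.ix_(J, J)]) <= tol:
                return False
    return True

print(is_P_matrix([[2, 1], [1, 3]]))    # True: subdeterminants 2, 3, 5
print(is_P_matrix([[0, 1], [1, 0]]))    # False: a zero principal subdeterminant
```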

Theorem 1.10. A symmetric P-matrix is always PD. If a P-matrix is not symmetric, it may not be PD.
Proof. By Theorem 1.9, a symmetric matrix is PD iff it is a P-matrix. Consider the matrix

$$B = \begin{pmatrix} 1 & 0 \\ 6 & 1 \end{pmatrix}, \qquad B + B^T = \begin{pmatrix} 2 & 6 \\ 6 & 2 \end{pmatrix}.$$

Since all its principal subdeterminants are 1, $B$ is a P-matrix. However, the determinant of $B + B^T$ is strictly negative, and hence $B + B^T$ is not a PD matrix by Theorem 1.2; by Result 1.9 this implies that $B$ is not PD. Actually, it can be verified that $(1, -1) B (1, -1)^T = -4 < 0$.

Note. The interesting thing to note is that if $H$ is a symmetric matrix and the $n + 1$ principal subdeterminants of $H$ discussed in Theorem 1.8 are strictly positive, then by Theorems 1.8 and 1.9 all the principal subdeterminants of $H$ are positive. This result may not be true if $H$ is not symmetric.

Exercises

1.1 If $H$ is a square symmetric PSD matrix and its determinant is strictly positive, prove that $H$ is a PD matrix. Construct a numerical example to show that this result is not necessarily true if $H$ is not symmetric.

1.2 Is the following statement true? "$H$ is PSD iff its $n + 1$ principal subdeterminants discussed in Theorem 1.8 are all nonnegative." Why? Illustrate with a numerical example.

By Theorem 1.9 the class of PD matrices is a subset of the class of P-matrices. By Theorem 1.10, when restricted to symmetric matrices, the property of being a PD matrix is the same as the property of being a P-matrix. An asymmetric P-matrix may not be PD; it may be a PSD matrix, as the matrix $\tilde M(n)$ below is, or it may not even be a PSD matrix. Let

$$\tilde M(n) = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ 2 & 1 & 0 & \cdots & 0 & 0 \\ 2 & 2 & 1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 2 & 2 & 2 & \cdots & 2 & 1 \end{pmatrix}. \qquad (1.15)$$

$\tilde M(n)$ is a lower triangular matrix in which all the diagonal entries are 1 and all the entries below the diagonal are 2. All the principal subdeterminants of $\tilde M(n)$ are clearly equal to 1, and hence $\tilde M(n)$ is a P-matrix. However, $\tilde M(n) + (\tilde M(n))^T$ is the matrix in which all the entries are 2, and it can be verified that it is a PSD matrix but not a PD matrix.
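A small check of the claims about $\tilde M(n)$, here for $n = 4$:

```python
import numpy as np

n = 4
M_tilde = np.tril(2 * np.ones((n, n)), k=-1) + np.eye(n)  # 1s on diagonal, 2s below
# Every principal submatrix of M_tilde is lower triangular with unit diagonal,
# so all principal subdeterminants equal 1 and M_tilde is a P-matrix.
S = M_tilde + M_tilde.T                                   # the all-2s matrix
print(np.linalg.eigvalsh(S))   # one eigenvalue 2n = 8, the rest 0: PSD, not PD
```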

Theorem 1.11. Let $F$ be a square PSD matrix of order $n$, whether it is symmetric or not. If $\bar x \in R^n$ is such that $\bar x^T F \bar x = 0$, then $(F + F^T)\bar x = 0$.

Proof. Let $D = F + F^T$. $D$ is symmetric, and by Result 1.9, $D$ is PSD. For all $x \in R^n$, $x^T D x = 2 x^T F x$; so $\bar x^T D \bar x = 0$ too. We wish to prove that $D\bar x = 0$. Let $x \in R^n$. For all real numbers $\lambda$ we have $(\bar x + \lambda x)^T D (\bar x + \lambda x) \geq 0$; that is,

$$2\lambda\, x^T D \bar x + \lambda^2\, x^T D x \geq 0, \qquad (1.16)$$

since $\bar x^T D \bar x = 0$. If $x^T D x = 0$, then by taking $\lambda = 1$ and then $\lambda = -1$ in (1.16), we conclude that $x^T D \bar x = 0$. If $x^T D x \neq 0$, then since $D$ is PSD we have $x^T D x > 0$, and from (1.16) we conclude that $x^T D \bar x \geq -(\lambda/2)\, x^T D x$ for $\lambda > 0$, and $x^T D \bar x \leq -(\lambda/2)\, x^T D x$ for $\lambda < 0$. Taking $\lambda$ to be a real number of very small absolute value, we conclude from these that $x^T D \bar x$ must equal zero in this case also. Thus, whether $x^T D x = 0$ or $x^T D x > 0$, we have $x^T D \bar x = 0$. Since this holds for all $x \in R^n$, we must have $\bar x^T D = 0$, that is, $D \bar x = 0$.

Algorithm for Testing Positive Definiteness

Let $F = (f_{ij})$ be a given square matrix of order $n$. Find $D = F + F^T$; $F$ is PD iff $D$ is. To test whether $F$ is PD, we could compute the $n$ principal subdeterminants of $D$ determined by the subsets $\{1\}, \{1,2\}, \ldots, \{1, \ldots, n\}$: $F$ is PD iff each of these $n$ determinants is positive, by Theorem 1.8. However, this is not an efficient method unless $n$ is very small, since the computation of these separate determinants is time-consuming. We now describe a method for testing positive definiteness of $F$ which requires at most $n$ Gaussian pivot steps on $D$ along its main diagonal; the computational effort required by this method is $O(n^3)$. The method is based on Result 1.8.

(i) If any of the principal diagonal elements of $D$ is nonpositive, $D$ is not PD. Terminate.
(ii) Subtract suitable multiples of row 1 from all the other rows, so that all the entries in column 1 and rows 2 to $n$ of $D$ are transformed into zero; that is, transform $D$ into $D_1$ as in Result 1.8. If any diagonal element of the transformed matrix $D_1$ is nonpositive, $D$ is not PD. Terminate.
(iii) In general, after $r$ steps we will have a matrix $D_r$ of the form

$$D_r = \begin{pmatrix} d_{11} & d_{12} & \cdots & \cdots & \cdots & d_{1n} \\ 0 & \tilde d_{22} & \cdots & \cdots & \cdots & \tilde d_{2n} \\ \vdots & & \ddots & & & \vdots \\ 0 & \cdots & 0 & \hat d_{r+1,r+1} & \cdots & \hat d_{r+1,n} \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & \hat d_{n,r+1} & \cdots & \hat d_{nn} \end{pmatrix}.$$

Subtract suitable multiples of row $r + 1$ in $D_r$ from rows $i$ for $i > r + 1$, so that all the entries in column $r + 1$ and rows $i > r + 1$ are transformed into 0.

This transforms $D_r$ into $D_{r+1}$. If any element in the principal diagonal of $D_{r+1}$ is nonpositive, $D$ is not PD; terminate. Otherwise continue the algorithm in the same manner for $n - 1$ steps, until $D_{n-1}$ is obtained, which is of the form

$$D_{n-1} = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{1n} \\ 0 & \tilde d_{22} & \cdots & \tilde d_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \bar d_{nn} \end{pmatrix}.$$

$D_{n-1}$ is upper triangular; that is why this algorithm is called the superdiagonalization algorithm. If no termination has occurred earlier and all the diagonal elements of $D_{n-1}$ are positive, then $D$, and hence $F$, is PD.

Example 1.3. Test whether $F = \begin{pmatrix} 2 & 1 & 0 \\ -1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix}$ is PD. We form

$$D = F + F^T = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 4 & 2 \\ 0 & 2 & 4 \end{pmatrix}.$$

All the entries in the principal diagonal of $D$ (i.e., the entries $d_{ii}$ for all $i$) are strictly positive, so apply the first step of superdiagonalization. The entries below the diagonal in column 1 are already zero, so $D_1 = D$; all the elements in the principal diagonal of $D_1$ are strictly positive, so continue. The pivot step with row 2 as the pivot row and column 2 as the pivot column gives

$$D_2 = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 4 & 2 \\ 0 & 0 & 3 \end{pmatrix}.$$

The algorithm terminates now. Since all the diagonal entries of $D_2$ are strictly positive, we conclude that $D$, and hence $F$, is PD.
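One way to organize the superdiagonalization test in code is to check each pivot as it is reached, which gives the same verdict for symmetric $D$; a sketch, run on Example 1.3 and on the matrix of Example 1.4 below:

```python
import numpy as np

def is_PD(F):
    """Superdiagonalization test: F is PD iff D = F + F^T reduces to upper
    triangular form with strictly positive pivots down the diagonal."""
    D = np.asarray(F, float)
    D = D + D.T              # doubling a symmetric input does not change PD-ness
    n = D.shape[0]
    for r in range(n):
        if D[r, r] <= 0:
            return False
        for i in range(r + 1, n):
            D[i] -= (D[i, r] / D[r, r]) * D[r]
    return True

print(is_PD([[2, 1, 0], [-1, 2, 1], [0, 1, 2]]))   # True  (Example 1.3)
print(is_PD([[1, 0, 2], [0, 4, 5], [2, 5, 4]]))    # False (Example 1.4 below)
```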

Example 1.4. Test whether $D = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 4 & 5 \\ 2 & 5 & 4 \end{pmatrix}$ is PD. $D$ is already symmetric, and all its diagonal elements are positive. The first step of the algorithm requires performing the operation (row 3) $- 2 \times$ (row 1) on $D$. This leads to

$$D_1 = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 4 & 5 \\ 0 & 5 & 0 \end{pmatrix}.$$

Since the third diagonal element of $D_1$ is not strictly positive, $D$ is not PD.

Algorithm for Testing Positive Semidefiniteness

Let $F = (f_{ij})$ be the given square matrix. Obtain $D = F + F^T$. If any diagonal element of $D$ is zero, all the entries in the row and column of that zero diagonal entry must be zero; otherwise $D$ (and hence $F$) is not PSD, and we terminate. Also, if any diagonal entry of $D$ is negative, $D$ cannot be PSD, and we terminate. If termination has not occurred, reduce the matrix $D$ by striking off the rows and columns of the zero diagonal entries. Start off by performing the row operations as in step (ii) above, that is, transform $D$ into $D_1$. If any diagonal element of $D_1$ is negative, $D$ is not PSD. Let $E_1$ be the submatrix of $D_1$ obtained by striking off the first row and column of $D_1$. If a diagonal element of $E_1$ is zero, all the entries in its row and column in $E_1$ must be zero; otherwise $D$ is not PSD — terminate. Continue if termination does not occur.

In general, after $r$ steps we will have a matrix $D_r$ as in step (iii) above. Let $E_r$ be the square submatrix of $D_r$ obtained by striking off the first $r$ rows and columns of $D_r$. If any diagonal element of $E_r$ is negative, $D$ cannot be PSD. If any diagonal element of $E_r$ is zero, all the entries in its row and column in $E_r$ must be zero; otherwise $D$ is not PSD — terminate. If termination does not occur, continue: let $d_{ss}$ be the first nonzero (and hence positive) diagonal element of $E_r$. Subtract suitable multiples of row $s$ in $D_r$ from rows $i$, $i > s$, so that all the entries in column $s$ and rows $i > s$ of $D_r$ are transformed into 0. This transforms $D_r$ into $D_s$, and we repeat the same operations with $D_s$. If termination does not occur until $D_{n-1}$ is obtained, and if all the diagonal entries of $D_{n-1}$ are nonnegative, then $D$, and hence $F$, is PSD.

If, in the process of obtaining $D_{n-1}$, all the diagonal elements of all the matrices obtained during the algorithm are strictly positive, then $D$, and hence $F$, is not only PSD but actually PD.
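A sketch of the PSD test. Instead of striking off zero rows and columns in advance, it handles zero diagonal entries as they are reached, which has the same effect for symmetric $D$:

```python
import numpy as np

def is_PSD(F, tol=1e-12):
    """PSD test following the superdiagonalization algorithm, on D = F + F^T."""
    D = np.asarray(F, float)
    D = D + D.T
    n = D.shape[0]
    for r in range(n):
        if D[r, r] < -tol:
            return False                       # negative diagonal entry
        if abs(D[r, r]) <= tol:
            # zero diagonal entry: its row in the trailing block must vanish
            if np.abs(D[r, r:]).max() > tol:
                return False
            continue                           # skip to the next diagonal entry
        for i in range(r + 1, n):
            D[i] -= (D[i, r] / D[r, r]) * D[r]
    return True

print(is_PSD([[1, 1, 0, 0, 0], [1, 1, 0, 0, 0], [0, 0, 0, 2, 0],
              [0, 0, -2, 2, 0], [0, 0, 0, 0, 0]]))   # True: PSD but not PD
print(is_PSD([[1, 0, 2], [0, 4, 5], [2, 5, 4]]))     # False
```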

Example 1.5. Is the matrix

$$F = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 0 \\ 0 & 0 & -2 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} \quad \text{PSD?} \qquad D = F + F^T = \begin{pmatrix} 2 & 2 & 0 & 0 & 0 \\ 2 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

The third and fifth diagonal entries of $D$ are zero, and their rows and columns are zero vectors, so we eliminate them, but we will call the remaining matrix by the same name $D = \begin{pmatrix} 2 & 2 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & 4 \end{pmatrix}$. All the diagonal entries of $D$ are nonnegative, so we apply the first step of superdiagonalization. This leads to

$$D_1 = \begin{pmatrix} 2 & 2 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 4 \end{pmatrix}, \qquad E_1 = \begin{pmatrix} 0 & 0 \\ 0 & 4 \end{pmatrix}.$$

The first diagonal entry of $E_1$ is 0, but the first row and column of $E_1$ are both zero vectors, and the remaining diagonal entries of $D_1$ are nonnegative, so continue with superdiagonalization. Since the second diagonal element of $D_1$ is zero, move to the third diagonal element; it is positive and there are no rows below it, so the algorithm terminates with $D_1$ in final form. All the diagonal entries of $D_1$ are nonnegative, so $D$, and hence $F$, is PSD; since a zero diagonal entry appeared, $F$ is not PD.

Example 1.6. Is the matrix $D$ of Example 1.4 PSD? Referring to Example 1.4, after the first step of superdiagonalization we have

$$D_1 = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 4 & 5 \\ 0 & 5 & 0 \end{pmatrix}, \qquad E_1 = \begin{pmatrix} 4 & 5 \\ 5 & 0 \end{pmatrix}.$$

The second diagonal entry of $E_1$ is 0, but the second row and column of $E_1$ are not zero vectors. So $D$ is not PSD.

1.3.2 Relationship of Positive Semidefiniteness to the Convexity of Quadratic Functions

Let $\Gamma$ be a convex subset of $R^n$, and let $g(x)$ be a real-valued function defined on $\Gamma$. $g(x)$ is said to be a convex function on $\Gamma$ if

$$g(\alpha x^1 + (1 - \alpha) x^2) \leq \alpha g(x^1) + (1 - \alpha) g(x^2) \qquad (1.17)$$

for every pair of points $x^1$, $x^2$ in $\Gamma$ and for all $0 \leq \alpha \leq 1$. $g(x)$ is said to be a strictly convex function on $\Gamma$ if (1.17) holds as a strict inequality for every pair of distinct points $x^1$, $x^2$ in $\Gamma$ (i.e., $x^1 \neq x^2$) and for all $0 < \alpha < 1$. See Appendix 3.

Let $F$ be a given square matrix of order $n$ and $c$ a row vector in $R^n$, and let $f(x) = cx + x^T F x$. Here we discuss conditions under which $f(x)$ is convex or strictly convex. Let $D = \frac{1}{2}(F + F^T)$. If $F$ is symmetric then $F = D$; otherwise $D$ is the symmetrized form of $F$. Clearly $f(x) = cx + x^T D x$. It can be verified that

$$\frac{\partial f(x)}{\partial x} = \Big(\frac{\partial f(x)}{\partial x_1}, \ldots, \frac{\partial f(x)}{\partial x_n}\Big)^T = c^T + (F + F^T)x = c^T + 2Dx,$$

and that $\partial^2 f(x)/\partial x^2$, the Hessian of $f(x)$, equals $F + F^T = 2D$. Let $x^1$, $x^2$ be two arbitrary column vectors in $R^n$, let $\xi = x^1 - x^2$, and let $\alpha$ be a number between 0 and 1. By expanding both sides it can be verified that

$$\alpha f(x^1) + (1 - \alpha) f(x^2) - f(\alpha x^1 + (1 - \alpha) x^2) = \alpha(1 - \alpha)\, \xi^T D \xi$$

(the linear terms cancel, and the quadratic terms collect into $\alpha(1-\alpha)[x^{1T} D x^1 + x^{2T} D x^2 - 2 x^{1T} D x^2]$ by the symmetry of $D$). So $\alpha f(x^1) + (1 - \alpha) f(x^2) - f(\alpha x^1 + (1 - \alpha) x^2) \geq 0$ for all $x^1, x^2 \in R^n$ and $0 \leq \alpha \leq 1$ iff $\xi^T D \xi \geq 0$ for all $\xi \in R^n$, that is, iff $D$ (or equivalently $F$) is PSD. Hence $f(x)$ is convex on $R^n$ iff $F$ (or equivalently $D$) is PSD. Also, by the same argument, $\alpha f(x^1) + (1 - \alpha) f(x^2) - f(\alpha x^1 + (1 - \alpha) x^2) > 0$ for all $x^1 \neq x^2$ in $R^n$ and $0 < \alpha < 1$ iff $\xi^T D \xi > 0$ for all $\xi \neq 0$. Hence $f(x)$ is strictly convex on $R^n$ iff $D$ (or equivalently $F$) is PD.

These are the conditions for the convexity or strict convexity of the quadratic function $f(x)$ over the whole space $R^n$. It is possible for $f(x)$ to be convex on a lower-dimensional convex subset of $R^n$ (for example, a subspace of $R^n$) even though the matrix $F$ is not PSD. For example, the quadratic form

$$f(x) = (x_1, x_2) \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = -x_1^2 + x_2^2$$

is convex over the subspace $\{(x_1, x_2) : x_1 = 0\}$, but not over the whole of $R^2$.

Exercise

1.3 Let $K \subset R^n$ be a convex set and $Q(x) = cx + \frac{1}{2} x^T D x$. If $Q(x)$ is convex over $K$ and $K$ has a nonempty interior, prove that $Q(x)$ is convex over the whole space $R^n$.

1.3.3 Necessary Optimality Conditions for Quadratic Programming

We will now resume our discussion of the quadratic program (1.11).

Theorem 1.12. If $\bar x$ is an optimum solution of (1.11), then $\bar x$ is also an optimum solution of the LP

$$\text{Minimize } (c + \bar x^T D)\, x \quad \text{subject to } Ax \geq b, \ x \geq 0. \qquad (1.18)$$

Proof. Notice that the vector of decision variables in (1.18) is $x$; $\bar x$ is a given point, and the cost coefficients in the LP (1.18) depend on $\bar x$. The constraints in (1.11) and (1.18) are the same, and the set of feasible solutions is a convex polyhedron. Let $\hat x$ be any feasible solution. By the convexity of the set of feasible solutions, $x = \alpha\hat x + (1 - \alpha)\bar x = \bar x + \alpha(\hat x - \bar x)$ is also a feasible solution for any $0 < \alpha < 1$. Since $\bar x$ is an optimum feasible solution of (1.11), $Q(x) - Q(\bar x) \geq 0$; that is,

$$\alpha (c + \bar x^T D)(\hat x - \bar x) + \frac{\alpha^2}{2} (\hat x - \bar x)^T D (\hat x - \bar x) \geq 0 \quad \text{for all } 0 < \alpha < 1.$$

Dividing both sides by $\alpha$ leads to

$$(c + \bar x^T D)(\hat x - \bar x) \geq -\frac{\alpha}{2} (\hat x - \bar x)^T D (\hat x - \bar x) \quad \text{for all } 0 < \alpha < 1.$$

Since this holds for arbitrarily small $\alpha > 0$, it implies $(c + \bar x^T D)(\hat x - \bar x) \geq 0$, that is, $(c + \bar x^T D)\hat x \geq (c + \bar x^T D)\bar x$. Since this must hold for an arbitrary feasible solution $\hat x$, $\bar x$ must be an optimum feasible solution of (1.18).

Corollary 1.1. If $\bar x$ is an optimum feasible solution of (1.11), there exist a vector $\bar y \in R^m$ and slack vectors $\bar u \in R^N$, $\bar v \in R^m$ such that $\bar x$, $\bar y$, $\bar u$, $\bar v$ together satisfy

$$\begin{pmatrix} u \\ v \end{pmatrix} - \begin{pmatrix} D & -A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} c^T \\ -b \end{pmatrix}, \quad \begin{pmatrix} u \\ v \end{pmatrix} \geq 0, \ \begin{pmatrix} x \\ y \end{pmatrix} \geq 0, \quad \text{and} \quad \begin{pmatrix} u \\ v \end{pmatrix}^T \begin{pmatrix} x \\ y \end{pmatrix} = 0. \qquad (1.19)$$

Proof. From the above theorem, $\bar x$ must be an optimum solution of the LP (1.18). The corollary follows by applying the results of Section 1.2 to this fact.

Necessary and Sufficient Optimality Conditions for Convex Quadratic Programs

The quadratic minimization problem (1.11) is said to be a convex quadratic program if $Q(x)$ is convex, that is, if $D$ is a PSD matrix (by the results in Section 1.3.2, or Theorem 7 of Appendix 3). If $D$ is not PSD, (1.11) is said to be a nonconvex quadratic program. Associate a Lagrange multiplier $y_i$ with the $i$th constraint $A_{i.} x \geq b_i$, $i = 1$ to $m$, and a Lagrange multiplier $u_j$ with the sign restriction on $x_j$ in (1.11), $j = 1$ to $N$. Let $y = (y_1, \ldots, y_m)^T$, $u = (u_1, \ldots, u_N)^T$. Then the Lagrangian corresponding to the quadratic program (1.11) is $L(x, y, u) = Q(x) - y^T(Ax - b) - u^T x$. The Karush-Kuhn-Tucker necessary optimality conditions for (1.11) are

$$\frac{\partial L}{\partial x}(x, y, u) = c^T + Dx - A^T y - u = 0$$
$$y \geq 0, \quad u \geq 0, \quad y^T (Ax - b) = 0, \quad u^T x = 0$$
$$Ax - b \geq 0, \quad x \geq 0. \qquad (1.20)$$

Denoting the slack variables $Ax - b$ by $v$, the conditions (1.20) can be verified to be exactly those in (1.19), written out in the form of an LCP.

A feasible solution $\bar x$ of (1.11) is said to be a Karush-Kuhn-Tucker point (abbreviated as a KKT point) if there exist Lagrange multiplier vectors $\bar y$, $\bar u$ such that $\bar x$, $\bar y$, $\bar u$ together satisfy (1.20), or the equivalent (1.19). So the LCP (1.19) is the problem of finding a KKT point for (1.11). We now have the following results.

Theorem 1.13. If $\bar x$ is an optimum solution for (1.11), then $\bar x$ must be a KKT point for it, whether $Q(x)$ is convex or not.
Proof. Follows from Theorem 1.12 and Corollary 1.1.

Thus (1.20), or equivalently (1.19), provides the necessary optimality conditions for a feasible solution $\bar x$ of (1.11) to be optimal; in other words, every optimum solution of (1.11) must be a KKT point for it. However, given a KKT point for (1.11), we cannot in general guarantee that it is optimal for (1.11). In the special case where $D$ is PSD, every KKT point for (1.11) is optimal for (1.11); this is proved in Theorem 1.14 below. Thus for convex quadratic programs, (1.20), or equivalently (1.19), provides necessary and sufficient optimality conditions.

Theorem 1.14. If $D$ is PSD and $\bar x$ is a KKT point of (1.11), then $\bar x$ is an optimum feasible solution of (1.11).
Proof. From the definition of a KKT point and the results in Section 1.2, if $\bar x$ is a KKT point for (1.11), it must be an optimum feasible solution of the LP (1.18). Let $x$ be any feasible solution of (1.11). Then

$$Q(x) - Q(\bar x) = (c + \bar x^T D)(x - \bar x) + \frac{1}{2}(x - \bar x)^T D (x - \bar x).$$

The first term on the right-hand side is nonnegative, since $\bar x$ is an optimum feasible solution of (1.18); the second term is also nonnegative, since $D$ is PSD. Hence $Q(x) - Q(\bar x) \geq 0$ for all feasible solutions $x$ of (1.11). This implies that $\bar x$ is an optimum feasible solution of (1.11).

Clearly (1.19) is an LCP. An optimum solution of (1.11) must be a KKT point for it. Solving (1.19) provides a KKT point for (1.11), and if $D$ is PSD, this KKT point is an optimum solution of (1.11). (If $D$ is not PSD and a KKT point is obtained when (1.19) is solved, it may not be an optimum solution of (1.11).)
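A sketch assembling the LCP (1.19) from the data $(D, A, b, c)$ of the quadratic program (1.11); the demonstration data is that of the minimum-distance example that follows (the function name is a choice of this sketch).

```python
import numpy as np

def qp_to_lcp(D, A, b, c):
    """LCP (q, M) whose solutions are the KKT points of
    min cx + (1/2) x^T D x  s.t.  Ax >= b, x >= 0, as in (1.19)."""
    D, A = np.asarray(D, float), np.asarray(A, float)
    b, c = np.asarray(b, float).ravel(), np.asarray(c, float).ravel()
    m = A.shape[0]
    M = np.block([[D, -A.T], [A, np.zeros((m, m))]])
    q = np.concatenate([c, -b])
    return q, M

# Data of the minimum-distance example (Example 1.7) below:
D = np.array([[34.0, 16, 4], [16, 34, 16], [4, 16, 8]])
A = np.array([[-1.0, -1, -1]])
b = np.array([-1.0])
c = np.array([-66.0, -54, -20])
q, M = qp_to_lcp(D, A, b, c)   # an LCP of order 4
```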

Example 1.7 (Minimum Distance Problem). Let $K$ denote the shaded convex polyhedral region in Figure 1.7, and let $P_0$ be the point $(-2, -1)$. Find the point in $K$ that is closest to $P_0$ (in terms of the usual Euclidean distance). Such problems appear very often in operations research applications.

[Figure 1.7: the polyhedron $K$ with extreme points $P_1 = (1, 3)$, $P_2 = (4, 0)$, $P_3 = (5, 2)$, $P_4 = (5, 4)$, and the exterior point $P_0 = (-2, -1)$.]

Every point in $K$ can be expressed as a convex combination of its extreme points (or corner points) $P_1$, $P_2$, $P_3$, $P_4$. That is, the coordinates of a general point in $K$ are

$$(\lambda_1 + 4\lambda_2 + 5\lambda_3 + 5\lambda_4, \; 3\lambda_1 + 0\lambda_2 + 2\lambda_3 + 4\lambda_4),$$

where the $\lambda_i$ satisfy $\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = 1$ and $\lambda_i \geq 0$ for all $i$. Hence the problem of finding the point in $K$ closest to $P_0$ is equivalent to solving

$$\text{Minimize } (\lambda_1 + 4\lambda_2 + 5\lambda_3 + 5\lambda_4 - (-2))^2 + (3\lambda_1 + 2\lambda_3 + 4\lambda_4 - (-1))^2$$
$$\text{Subject to } \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = 1, \quad \lambda_i \geq 0 \text{ for all } i.$$

$\lambda_4$ can be eliminated from this problem by substituting the expression $\lambda_4 = 1 - \lambda_1 - \lambda_2 - \lambda_3$ for it. Doing this and simplifying (a constant term in the objective, which does not affect the minimizer, is dropped) leads to the quadratic program

$$\text{Minimize } (-66, -54, -20)\,\lambda + \frac{1}{2} \lambda^T \begin{pmatrix} 34 & 16 & 4 \\ 16 & 34 & 16 \\ 4 & 16 & 8 \end{pmatrix} \lambda$$
$$\text{Subject to } -\lambda_1 - \lambda_2 - \lambda_3 \geq -1, \quad \lambda \geq 0,$$

where $\lambda = (\lambda_1, \lambda_2, \lambda_3)^T$. Solving this quadratic program is equivalent to solving the LCP

$$\begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ v_1 \end{pmatrix} - \begin{pmatrix} 34 & 16 & 4 & 1 \\ 16 & 34 & 16 & 1 \\ 4 & 16 & 8 & 1 \\ -1 & -1 & -1 & 0 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \\ y_1 \end{pmatrix} = \begin{pmatrix} -66 \\ -54 \\ -20 \\ 1 \end{pmatrix}.$$

All variables $u_1, u_2, u_3, v_1, \lambda_1, \lambda_2, \lambda_3, y_1 \geq 0$, and $u_1 \lambda_1 = u_2 \lambda_2 = u_3 \lambda_3 = v_1 y_1 = 0$.

Let $(\tilde u_1, \tilde u_2, \tilde u_3, \tilde v_1, \tilde\lambda_1, \tilde\lambda_2, \tilde\lambda_3, \tilde y_1)$ be a solution of this LCP, and let $\tilde\lambda_4 = 1 - \tilde\lambda_1 - \tilde\lambda_2 - \tilde\lambda_3$. Then

$$\tilde x = (\tilde\lambda_1 + 4\tilde\lambda_2 + 5\tilde\lambda_3 + 5\tilde\lambda_4, \; 3\tilde\lambda_1 + 2\tilde\lambda_3 + 4\tilde\lambda_4)$$

is the point in $K$ that is closest to $P_0$.

1.3.4 Convex Quadratic Programs and LCPs Associated with PSD Matrices

Consider the LCP $(q, M)$, which is (1.6)–(1.8), in which the matrix $M$ is PSD. Consider also the quadratic program

$$\text{Minimize } z^T (Mz + q) \quad \text{subject to } Mz + q \geq 0, \ z \geq 0.$$

This is a convex quadratic programming problem, since $M$ is PSD. If the optimum objective value in this quadratic program is $> 0$, clearly the LCP $(q, M)$ has no solution. If the optimum objective value in this quadratic program is zero and $\bar z$ is any optimum solution of it, then $(\bar w = M\bar z + q, \bar z)$ is a solution of the LCP. Conversely, if $(\tilde w, \tilde z)$ is any solution of the LCP $(q, M)$, the optimum objective value in the above quadratic program must be zero, and $\tilde z$ is an optimum solution for it. Thus every LCP associated with a PSD matrix can be posed as a convex quadratic program.

Now consider a convex quadratic program in which $Q(x) = cx + \frac{1}{2} x^T D x$ (where $D$ is a symmetric PSD matrix) has to be minimized subject to linear constraints. Replace each equality constraint by a pair of opposing inequality constraints (for example, $Ax = b$ is replaced by $Ax \leq b$ and $Ax \geq b$). Now the problem is one of minimizing $Q(x)$ subject to a system of linear inequality constraints, and this can be transformed into an LCP as discussed in Section 1.3.3. The matrix $M$ in the corresponding LCP is PSD by Result 1.10, since $D$ is PSD. Thus every convex quadratic programming problem can be posed as an LCP associated with a PSD matrix.
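A sketch of the first reduction: the convex program is handed to scipy's SLSQP solver (any convex QP solver would do; the function name, starting point, and tolerance are illustrative choices, not the book's method).

```python
import numpy as np
from scipy.optimize import minimize

def solve_psd_lcp_as_qp(M, q, tol=1e-7):
    """Solve the LCP (q, M) with M PSD via
    min z^T (Mz + q)  s.t.  Mz + q >= 0, z >= 0."""
    M, q = np.asarray(M, float), np.asarray(q, float)
    n = len(q)
    res = minimize(lambda z: z @ (M @ z + q), x0=np.ones(n), method="SLSQP",
                   bounds=[(0, None)] * n,
                   constraints=[{"type": "ineq", "fun": lambda z: M @ z + q}])
    if res.fun > tol:
        return None                      # optimum value > 0: the LCP has no solution
    z = res.x
    return M @ z + q, z                  # the pair (w, z)

M = np.array([[2.0, 1.0], [1.0, 2.0]])   # PD, hence PSD
q = np.array([-5.0, -6.0])
print(solve_psd_lcp_as_qp(M, q))         # w close to 0, z close to (4/3, 7/3)
```

This route is only one way to exploit the equivalence; the pivotal methods developed in later chapters solve such LCPs directly.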