Solving Variational Inequality Problems via Smoothing-Nonsmooth Reformulations


Defeng Sun and Liqun Qi
School of Mathematics, The University of New South Wales, Sydney 2052, Australia
Email: (Defeng Sun) D.Sun@unsw.edu.au, (Liqun Qi) L.Qi@unsw.edu.au

February 5, 1999; Revised, September 21, 1999

Abstract

It has long been known that variational inequality problems can be reformulated as nonsmooth equations. Recently, locally high-order convergent Newton methods for nonsmooth equations have been well established via the concept of semismoothness. When the constraint set of the variational inequality problem is a rectangle, several locally convergent Newton methods for the reformulated nonsmooth equations can also be globalized. In this paper, our main aim is to provide globally and locally high-order convergent Newton methods for solving variational inequality problems with general constraints. To achieve this, we first prove via convolution that these nonsmooth equations can be well approximated by smooth equations, which have desirable properties for the design of Newton methods. We then reformulate the variational inequality problems as equivalent smoothing-nonsmooth equations, apply Newton-type methods to solve the latter systems, and thereby solve the variational inequality problems. Stronger convergence results are obtained.

1 Introduction

The construction of smoothing functions to approximate nonsmooth functions has a long history in mathematical programming. In this paper we restrict our study to the smoothing functions of those nonsmooth functions arising from variational inequality problems. The variational inequality problem (VIP for abbreviation) is to find $x^* \in X$ such that

$(x - x^*)^T F(x^*) \ge 0$ for all $x \in X$,  (1.1)

[Footnote: The main results of this paper were presented at the International Conference on Nonlinear Optimization and Variational Inequalities, Hong Kong, December 15-18, 1998. This work is supported by the Australian Research Council.]
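For the complementarity case $X = R^n_+$ treated below, the projection in the natural-residual reformulation is just the componentwise maximum with zero, so the residual is easy to compute. A minimal numerical sketch (the map $F$ here is an illustrative choice, not an example from the paper):

```python
import numpy as np

def natural_residual(x, F):
    # W(x) = x - Pi_X[x - F(x)] with X = R^n_+, where Pi_X is max(., 0)
    return x - np.maximum(x - F(x), 0.0)

F = lambda x: x - 1.0          # a simple strongly monotone map; solution x* = (1,...,1)
x_star = np.ones(3)
# at the solution: x* >= 0, F(x*) = 0 >= 0, and x*^T F(x*) = 0, so W(x*) = 0
assert np.allclose(natural_residual(x_star, F), 0.0)
# a non-solution gives a nonzero residual
assert np.linalg.norm(natural_residual(np.zeros(3), F)) > 0
```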

where $X$ is a nonempty closed convex subset of $R^n$ and $F : D \to R^n$ is continuously differentiable on some open set $D$, which contains $X$. When $X = R^n_+$, the VIP reduces to the nonlinear complementarity problem (NCP): Find $x^* \in R^n_+$ such that

$F(x^*) \in R^n_+$ and $F(x^*)^T x^* = 0$.  (1.2)

It is well known (see, e.g., [5, 14]) that solving (1.1) is equivalent to finding a root of the following equation:

$W(x) := x - \Pi_X[x - F(x)] = 0$,  (1.3)

where for any $x \in R^n$, $\Pi_X(x)$ is the Euclidean projection of $x$ onto $X$. It is also well known that solving the VIP is equivalent to solving the following normal equation

$E(y) := F(\Pi_X(y)) + y - \Pi_X(y) = 0$  (1.4)

in the sense that if $y^*$ is a solution of (1.4) then $x^* := \Pi_X(y^*)$ is a solution of (1.1), and conversely if $x^*$ is a solution of (1.1) then $y^* := x^* - F(x^*)$ is a solution of (1.4) [33]. Both (1.3) and (1.4) are nonsmooth equations and have led to various generalized Newton methods under semismoothness assumptions [32, 26, 23]. The difference between (1.3) and (1.4) is that $W$ is defined only on $D$ and may be undefined outside $D$ if $F$ is not well defined there, while $E$ is defined on all of $R^n$ even if $F$ is defined only on $X$.

When $X$ is a rectangle, several globally and locally superlinearly convergent Newton-type methods for solving (1.3) and/or (1.4) are available; see [29] for a review. It is interesting to know whether these results can be generalized to the case where $X$ is not a rectangle. In this paper, we address this issue and provide a way to solve (1.3) and (1.4) without assuming $X$ to be a rectangle.

From the above discussion, we can see that the nonsmoothness of the reformulated functions is inherited from the nonsmoothness of the projection operator $\Pi_X$. Because of its key role in our analysis, we discuss its differentiability properties in detail in Section 2. In particular, we show that all elements of the generalized Jacobian of the projection operator $\Pi_X$ are symmetric.
This is actually a key result in analysing the nonsingularity of the generalized Jacobians of $W$ and $E$ and their smoothing counterparts. In Section 3, we discuss some smoothing functions of the projection operator $\Pi_X$. An algorithm for solving variational inequality problems and its convergence analysis are presented in Section 4. In particular, we provide some stronger convergence results for monotone variational inequality problems, both globally and locally. We give some final remarks in Section 5.

2 Preliminaries

2.1 Generalized Jacobians

Suppose that $H : R^n \to R^m$ is locally Lipschitz continuous. Then, by Rademacher's theorem, $H$ is almost everywhere differentiable. There are many definitions of generalized Jacobians of $H$. Among them, Clarke's definition $\partial H$ [4] is the most used in optimization. In this paper we also need other definitions of generalized Jacobians of $H$. Ioffe [16] and Hiriart-Urruty [15] defined the following so-called support bifunction:

$H^0(x; u, v) := \limsup_{y \to x,\, t \downarrow 0} \dfrac{v^T [H(y + tu) - H(y)]}{t}$, $\quad x, u \in R^n$, $v \in R^m$.  (2.1)
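To get a concrete feel for (2.1), the support bifunction of $H = |\cdot|$ on $R$ at $x = 0$ can be approximated numerically; by Theorem 2.1(i) below it equals $\max_{V \in \partial H(0)} vVu = \max_{V \in [-1,1]} vVu = |uv|$. A small sketch (the grid and step sizes are arbitrary choices for illustration):

```python
import numpy as np

def support_bifunction(u, v, t=1e-6):
    # crude approximation of H^0(0; u, v) for H = |.|:
    # sup over y near 0 of v * (H(y + t*u) - H(y)) / t
    ys = np.linspace(-1e-4, 1e-4, 2001)
    return np.max(v * (np.abs(ys + t * u) - np.abs(ys)) / t)

# Theorem 2.1(i): H^0(0; u, v) = max over V in dH(0) = [-1, 1] of v*V*u = |u*v|
for u, v in [(1.0, 1.0), (1.0, -2.0), (-3.0, 0.5)]:
    assert abs(support_bifunction(u, v) - abs(u * v)) < 1e-3
```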

For a set of matrices $\mathcal{A} \subseteq R^{m \times n}$, Sweetser [39] defines the plenary hull

plen $\mathcal{A} := \{A \in R^{m \times n} : Ab \in \mathcal{A}b$ for all $b \in R^n\}$.

In [15], Hiriart-Urruty proved the following useful results.

Theorem 2.1 (i) $H^0(x; u, v) = \max_{V \in \partial H(x)} v^T V u$.
(ii) $\partial_C H(x) = \mathrm{plen}\, \partial H(x)$, where

$\partial_C H(x) := \{V \in R^{m \times n} : v^T V u \le H^0(x; u, v)$ for all $u \in R^n,\, v \in R^m\}$.

2.2 Convolution

Let $\Phi : R^n \to R_+$ be a kernel function, i.e., $\Phi$ is integrable (in the sense of Lebesgue) and $\int_{R^n} \Phi(x)\,dx = 1$. Define $\Theta : R_{++} \times R^n \to R_+$ by

$\Theta(\varepsilon, x) = \varepsilon^{-n} \Phi(\varepsilon^{-1} x)$,

where $(\varepsilon, x) \in R_{++} \times R^n$. Then a smoothing approximation of $H$ via convolution can be described by

$G(\varepsilon, x) := \int_{R^n} H(x - y)\Theta(\varepsilon, y)\,dy = \int_{R^n} H(x - \varepsilon y)\Phi(y)\,dy = \int_{R^n} H(y)\Theta(\varepsilon, x - y)\,dy$,  (2.2)

where $(\varepsilon, x) \in R_{++} \times R^n$. In order to make (2.2) meaningful, we need some assumptions on the kernel function $\Phi$, which will be addressed in the next section. For the case where $H$ is bounded and uniformly continuous, see [27] for some discussion of $G$. For convenience of discussion, we always define $G(0, x) = H(x)$ and, for any $\varepsilon < 0$, let $G(\varepsilon, x) = G(-\varepsilon, x)$, $x \in R^n$.

2.3 Jacobian Characterizations of the Projection Operator $\Pi_X$

The following properties of the projection operator $\Pi_X$ are well known [42].

Proposition 2.1 For any $x, y \in R^n$,

(i) $\|\Pi_X(y) - \Pi_X(x)\| \le \|y - x\|$.

(ii) $(y - x)^T (\Pi_X(y) - \Pi_X(x)) \ge \|\Pi_X(y) - \Pi_X(x)\|^2$.

Part (i) of Proposition 2.1 says that $\Pi_X$ is a nonexpansive and globally Lipschitz continuous operator, while part (ii) implies that $\Pi_X$ is a monotone operator. Then Clarke's generalized Jacobian $\partial \Pi_X$ is well defined everywhere by Rademacher's theorem, and for any $x \in R^n$, all $S \in \partial \Pi_X(x)$ are positive semidefinite [17]. The next theorem, which gives a partial answer to an open question posed by Hiriart-Urruty [15], summarizes some important properties of $\partial \Pi_X$.

Theorem 2.2 For any $x \in R^n$, all $S \in \partial \Pi_X(x)$ are symmetric, positive semidefinite and satisfy $\|S\| \le 1$.

Proof: In view of the discussion preceding this theorem, we only need to prove that all $S \in \partial \Pi_X(x)$ are symmetric. Define $\phi : R^n \to R$ by

$\phi(y) = \frac{1}{2}\left(\|y\|^2 - \|y - \Pi_X(y)\|^2\right)$, $\quad y \in R^n$.

Then, by [42], $\phi$ is continuously differentiable with gradient given by $\nabla \phi(y) = \Pi_X(y)$. Thus, if $\Pi_X$ is differentiable at some point $y$, we have $\Pi_X'(y) = \nabla^2 \phi(y)$, which, according to [22, 3.3.4], proves that $\Pi_X'(y)$ is symmetric. This, by the definition of $\partial \Pi_X$, proves that all $S \in \partial \Pi_X(x)$ are symmetric.

We will see subsequently that the symmetry of all $S \in \partial \Pi_X(x)$ plays an essential role in our analysis.

2.4 Quasi $P_0$-matrices and quasi $P$-matrices

A matrix $A \in R^{n \times n}$ is called a $P_0$-matrix ($P$-matrix) if every one of its principal minors is nonnegative (positive). Here, we introduce some generalizations of $P_0$-matrices and $P$-matrices in order to exploit the properties of the generalized Jacobians of the projection operator $\Pi_X$.

Definition 2.1 A matrix $A \in R^{n \times n}$ is called a quasi $P_0$-matrix (quasi $P$-matrix) if there exists an orthogonal matrix $U \in R^{n \times n}$ such that $UAU^T$ is a $P_0$-matrix ($P$-matrix).

It is obvious that any $P_0$-matrix ($P$-matrix) is a quasi $P_0$-matrix (quasi $P$-matrix). Any quasi $P$-matrix is a quasi $P_0$-matrix, and any quasi $P$-matrix is nonsingular. If $A$ is a quasi $P_0$-matrix, then for any $\varepsilon > 0$, $B := A + \varepsilon I$ is a quasi $P$-matrix, where $I$ is the identity matrix.

We will see later that the concepts of quasi $P_0$-matrix and quasi $P$-matrix are useful in the analysis of the nonsingularity of the generalized Jacobians considered in this paper.

Theorem 2.3 Suppose that $S \in \partial \Pi_X[x - F(x)]$, $x \in R^n$. Then there exists an orthogonal matrix $U \in R^{n \times n}$ such that $\Sigma := USU^T$ is a diagonal matrix with $0 \le \Sigma_{ii} \le 1$, $i \in \{1, 2, \ldots, n\}$. Moreover, if $UF'(x)U^T$ is a $P_0$-matrix ($P$-matrix), then $V := I - S(I - F'(x))$ is a quasi $P_0$-matrix (quasi $P$-matrix).

Proof: By Theorem 2.2 we know that there exists an orthogonal matrix $U$ such that $\Sigma := USU^T$ is a diagonal matrix with $0 \le \Sigma_{ii} \le 1$, $i \in \{1, 2, \ldots, n\}$. Then

$UVU^T = (I - \Sigma) + \Sigma (UF'(x)U^T)$.

Since $0 \le \Sigma_{ii} \le 1$, $i \in \{1, 2, \ldots, n\}$, and $UF'(x)U^T$ is a $P_0$-matrix ($P$-matrix), $UVU^T$ is a $P_0$-matrix ($P$-matrix) as well. This, by Definition 2.1, completes our proof.

In [7], Facchinei and Pang introduced the concept of a generalized $P_0$-function. Suppose that $X$ is the Cartesian product of $m$ (with $m \ge 1$) lower dimensional sets:

$X := \prod_{j=1}^m X^j$,  (2.3)

with each $X^j$ being a nonempty closed convex subset of $R^{n_j}$ and $\sum_{j=1}^m n_j = n$. Correspondingly, suppose that both the variable $x$ and the function $F(x)$ are partitioned in the following way:

$x = (x^1, x^2, \ldots, x^m)$ and $F(x) = (F^1(x), F^2(x), \ldots, F^m(x))$,  (2.4)

where for each $j$, both $x^j$ and $F^j(x)$ belong to $R^{n_j}$. Let $L(X)$ denote the collection of all sets in $R^n$ which have the same partitioned structure as $X$, i.e., $D \in L(X)$ if and only if $D$ can be expressed as $D = \prod_{j=1}^m D^j$ with $D^j \subseteq R^{n_j}$. Then $F$ is called a generalized $P_0$-function on $D \in L(X)$ if for every pair of distinct vectors $x$ and $y$ in $D$, there exists an index $j_0$ such that $x^{j_0} \ne y^{j_0}$ and

$(x^{j_0} - y^{j_0})^T (F^{j_0}(x) - F^{j_0}(y)) \ge 0$.

(Note that this definition is the one given in [7] if $D = X$. The above presentation is more accurate than that in [7].) If $F$ is a generalized $P_0$-function on every $D \in L(X)$, we say that $F$ is a generalized $P_0$-function on $L(X)$.

Corollary 2.1 If $F$ is a continuously differentiable generalized $P_0$-function on $L(X)$ and $x \in R^n$, then for any $S \in \partial \Pi_X[x - F(x)]$, $V := I - S[I - F'(x)]$ is a quasi $P_0$-matrix.

Proof: According to Theorem 2.2, for any $S \in \partial \Pi_X[x - F(x)]$, there exist $m$ matrices $S^i \in \partial \Pi_{X^i}[x^i - F^i(x)]$ such that

$S = \mathrm{diag}(S^1, S^2, \ldots, S^m)$,

and for each $i \in \{1, 2, \ldots, m\}$ there exists an orthogonal matrix $U^i$ such that $\Sigma^i := U^i S^i (U^i)^T$ is a diagonal matrix with $0 \le (\Sigma^i)_{jj} \le 1$, $j \in \{1, 2, \ldots, n_i\}$. Define $U := \mathrm{diag}(U^1, U^2, \ldots, U^m)$ and $\Sigma := \mathrm{diag}(\Sigma^1, \Sigma^2, \ldots, \Sigma^m)$. Then $U$ is an orthogonal matrix and $S = U^T \Sigma U$. Since $F$ is a generalized $P_0$-function on $L(X)$, it is easy to show that for any nonzero vector $h \in R^n$, there exists an index $i \in \{1, 2, \ldots, m\}$ such that $h^i \ne 0$, $h^i \in R^{n_i}$ and $(h^i)^T (F'(x)h)^i \ge 0$. Then for any $d \ne 0$, there exists an index $i \in \{1, 2, \ldots, m\}$ such that $(U^i)^T d^i \ne 0$, $d^i \in R^{n_i}$ and $[(U^i)^T d^i]^T (F'(x) U^T d)^i \ge 0$. Therefore, $(d^i)^T (Qd)^i \ge 0$, where $Q := U F'(x) U^T$. Then there exists an index $j(i)$ such that $d^i_{j(i)} \ne 0$ and $d^i_{j(i)} (Qd)^i_{j(i)} \ge 0$, which implies that $Q$ is a $P_0$-matrix. Thus, by Theorem 2.3, we get the desired result.

Corollary 2.2 If $X$ is a rectangle and, for $x \in R^n$, $F'(x)$ is a $P_0$-matrix ($P$-matrix), or if $F'(x)$ is positive semidefinite (positive definite), then for any $S \in \partial \Pi_X[x - F(x)]$, $V := I - S[I - F'(x)]$ is a quasi $P_0$-matrix (quasi $P$-matrix).

Actually, when $X$ is a rectangle and $F$ is a $P_0$-function ($P$-function), Ravindran and Gowda [34] have proved that $W$ and $E$ are $P_0$-functions ($P$-functions). See [12] for more discussion of this topic. One might expect that if $F$ is monotone, then $W$ or $E$ must be monotone as well. This is, however, not true, as can be seen clearly from the following example. Consider the NCP with

$F(x) = Mx$, $\quad M = \begin{pmatrix} 2 & 0 \\ 3 & 2 \end{pmatrix}$, $\quad x \in R^2$,

and $X = R^2_+$. Then

$V := \begin{pmatrix} 1 & 0 \\ 3 & 2 \end{pmatrix} \in \partial W(0)$.

Clearly, $V$ is not positive semidefinite, and so the function $W$ defined in (1.3) is not monotone even though $F$ itself is strongly monotone. It is also noted that if $X$ is not a rectangle and, for some $x \in R^n$, $F'(x)$ is only a $P_0$-matrix, then $V \in \partial W(x)$ may fail to be a quasi $P_0$-matrix, which can be demonstrated by the following example. Consider the VIP with

$F(x) = Mx$, $\quad M = \begin{pmatrix} 1 & 0 \\ 20 & 1 \end{pmatrix}$, $\quad x \in R^2$,

and $X = \{x \in R^2 : x_1 + x_2 = 0\}$. Note that $M$ is a $P$-matrix, but is not a positive semidefinite matrix.
Then the function $W$ defined in (1.3) is continuously differentiable with

$W'(0, 0) = \begin{pmatrix} -9 & 0 \\ 10 & 1 \end{pmatrix}$.
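Both counterexamples can be checked numerically. A small sketch (for the second, $\Pi_X$ is the linear map $I - \frac{1}{2}ee^T$ onto the subspace $\{x : x_1 + x_2 = 0\}$, so $W$ is differentiable everywhere):

```python
import numpy as np

I = np.eye(2)

# Example 1: F(x) = Mx strongly monotone, X = R^2_+
M1 = np.array([[2.0, 0.0], [3.0, 2.0]])
assert np.linalg.eigvalsh((M1 + M1.T) / 2).min() > 0   # strong monotonicity of F
S1 = np.diag([0.0, 1.0])             # one element of dPi_{R^2_+} at the origin
V1 = I - S1 @ (I - M1)               # element of dW(0)
assert np.allclose(V1, [[1, 0], [3, 2]])
assert np.linalg.eigvalsh((V1 + V1.T) / 2).min() < 0   # V1 not PSD: W not monotone

# Example 2: F(x) = Mx with a P-matrix M, X = {x : x1 + x2 = 0}
M2 = np.array([[1.0, 0.0], [20.0, 1.0]])
S2 = I - 0.5 * np.ones((2, 2))       # Jacobian of the linear projector Pi_X
W_prime = I - S2 @ (I - M2)          # W'(0, 0)
assert np.allclose(W_prime, [[-9, 0], [10, 1]])
T = W_prime + I                       # W'(0,0) + diag(1, 1)
assert np.isclose(np.linalg.det(T), -16.0)  # det(T) < 0, so no quasi P_0 property
```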

The matrix $W'(0, 0)$ is not a quasi $P_0$-matrix, because otherwise $T := W'(0, 0) + \mathrm{diag}(1, 1)$ would be a quasi $P$-matrix and hence $\det(T) > 0$. The latter is not true since $\det(T) = -16 < 0$.

Next, we discuss a result related to the level sets of the function $W$ defined in (1.3).

Theorem 2.4 Suppose that $X$ is of the structure (2.3) and that $F$ is a continuous generalized $P_0$-function on $L(X)$. Suppose that $H : R^n \to R^n$ is continuous, $H$ is partitioned in the same way as $F$ in (2.4), and there exists a constant $\gamma > 0$ such that for any $j \in \{1, 2, \ldots, m\}$ and $y, x \in R^n$,

$(y^j - x^j)^T (H^j(y) - H^j(x)) \ge \gamma \|y^j - x^j\|^2$.  (2.5)

Define $Y : R^n \to R^n$ by

$Y(x) := x - \Pi_X[x - (F(x) + H(x))]$, $\quad x \in R^n$.

Then for any $c \ge 0$, the following set is bounded:

$L_c := \{x \in R^n : \|Y(x)\| \le c\}$.

Proof: Suppose by contradiction that $L_c$ is unbounded. Then there exists a sequence $\{x^k\}$ such that $x^k \in L_c$ and $\{\|x^k\|\}$ is unbounded. For each $k$, $x^k$ can be partitioned as

$x^k = ((x^k)^1, (x^k)^2, \ldots, (x^k)^m)$

with $(x^k)^j \in R^{n_j}$, $j \in \{1, 2, \ldots, m\}$. Let

$I_u := \{j : \{(x^k)^j\}$ is unbounded$\}$ and $I_b := \{j : \{(x^k)^j\}$ is bounded$\}$.

Then $I_u$ is nonempty. Let $a = (a^1, a^2, \ldots, a^m)$ be an arbitrary vector in $X$ with $a^j \in X^j$, $j \in \{1, 2, \ldots, m\}$, and define a sequence $\{y^k\}$ by

$(y^k)^j := (x^k)^j$ if $j \in I_b$, and $(y^k)^j := a^j + Y^j(x^k)$ if $j \in I_u$.

Then $\{y^k\}$ is bounded. Since $F$ is a generalized $P_0$-function on $L(X)$, there exists an index $j_k \in I_u$ (note that $(y^k)^j = (x^k)^j$ for all $j \in I_b$) such that $(y^k)^{j_k} \ne (x^k)^{j_k}$ and

$[(y^k)^{j_k} - (x^k)^{j_k}]^T (F^{j_k}(y^k) - F^{j_k}(x^k)) \ge 0$.  (2.6)

Without loss of generality, assume that $j_k \equiv j_0$. Since

$(x^k)^{j_0} - Y^{j_0}(x^k) = \Pi_{X^{j_0}}[(x^k)^{j_0} - (F^{j_0}(x^k) + H^{j_0}(x^k))]$,

by (ii) of Proposition 2.1 with $y := a^{j_0}$ and $x := (x^k)^{j_0} - (F^{j_0}(x^k) + H^{j_0}(x^k))$, we have

$(-Y^{j_0}(x^k) + F^{j_0}(x^k) + H^{j_0}(x^k))^T [a^{j_0} - (x^k)^{j_0} + Y^{j_0}(x^k)] \ge 0$,

which, together with (2.6), implies that

$(H^{j_0}(x^k))^T [(y^k)^{j_0} - (x^k)^{j_0}] \ge -(Y^{j_0}(x^k) + F^{j_0}(y^k))^T [(y^k)^{j_0} - (x^k)^{j_0}]$.

This means that there exists a constant $c_1 > 0$ such that for all $k$ sufficiently large,

$(H^{j_0}(x^k))^T [(y^k)^{j_0} - (x^k)^{j_0}] \ge -c_1 \|(y^k)^{j_0} - (x^k)^{j_0}\|$.

Thus, for all $k$ sufficiently large, we have

$(H^{j_0}(x^k) - H^{j_0}(y^k))^T [(y^k)^{j_0} - (x^k)^{j_0}] \ge -(c_1 + \|H^{j_0}(y^k)\|)\, \|(y^k)^{j_0} - (x^k)^{j_0}\|$,

which contradicts (2.5) (note that $\{y^k\}$ is bounded while $\{(x^k)^{j_0}\}$ is not). This contradiction shows that $L_c$ is bounded.

When $F$ itself is strongly monotone, Theorem 2.4 yields the following result, which appeared in [24] with a lengthy proof.

Corollary 2.3 Suppose that $F$ is continuous and strongly monotone on $R^n$ and that $W$ is defined in (1.3). Then for any $c \ge 0$, the set $\{x \in R^n : \|W(x)\| \le c\}$ is bounded.

All the results for $W$ in this subsection can be paralleled for $E$. Here we omit the details.

3 Smoothing approximations

In this section, unless otherwise stated, we suppose that $H : R^n \to R^m$ is locally Lipschitz continuous and that $G$ is defined by (2.2). Define

supp$(\Phi) := \{y \in R^n : \Phi(y) > 0\}$.

The kernel function $\Phi$ in supp$(\Phi)$ will be omitted if it is obvious from the context.

3.1 supp$(\Phi)$ is bounded

In this subsection, we suppose that supp$(\Phi)$ is bounded. The first smoothing function we want to discuss is the Steklov averaged function. Define

$\Phi(y) = 1$ if $\max_i |y_i| \le 0.5$, and $\Phi(y) = 0$ otherwise.  (3.1)

Then, for any $\varepsilon > 0$, the function $G(\varepsilon, \cdot)$ defined in (2.2) is, according to [6], the Steklov averaged function of $H$:

$G(\varepsilon, x) = \dfrac{1}{\varepsilon^n} \int_{x_1 - \varepsilon/2}^{x_1 + \varepsilon/2} \cdots \int_{x_n - \varepsilon/2}^{x_n + \varepsilon/2} H(y)\, dy_n \cdots dy_1$, $\quad x \in R^n$.  (3.2)

The following result on the continuous differentiability of the Steklov averaged function was first obtained by Gupal [13]. See also Mayne and Polak [21] and Xu and Chang [41].
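In one dimension with $H = |\cdot|$, the Steklov average (3.2) admits a closed form that makes the smoothing visible: the kink at the origin is replaced by the quadratic $x^2/\varepsilon + \varepsilon/4$ on $[-\varepsilon/2, \varepsilon/2]$, while $G(\varepsilon, x) = |x|$ outside that interval. A minimal sketch:

```python
def steklov_abs(eps, x):
    # G(eps, x) = (1/eps) * integral of |y| over [x - eps/2, x + eps/2],
    # computed exactly from the antiderivative y|y|/2 of |y|
    a, b = x - eps / 2, x + eps / 2
    antider = lambda t: t * abs(t) / 2
    return (antider(b) - antider(a)) / eps

eps = 0.2
# inside the kink region the average is the quadratic x^2/eps + eps/4
for x in (-0.05, 0.0, 0.08):
    assert abs(steklov_abs(eps, x) - (x**2 / eps + eps / 4)) < 1e-12
# outside, it coincides with |x| exactly
assert abs(steklov_abs(eps, 1.0) - 1.0) < 1e-12
```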

Proposition 3.1 Suppose that $H : R^n \to R^m$ is continuous and that $\Phi$ is given by (3.1). Then for any $\varepsilon > 0$, $G(\varepsilon, \cdot)$ is continuously differentiable on $R^n$ and

$(G_j)'_x(\varepsilon, x) = \sum_{i=1}^n e_i^T\, \dfrac{1}{\varepsilon^{n-1}} \int_{x_1 - \varepsilon/2}^{x_1 + \varepsilon/2} dy_1 \cdots \int_{x_{i-1} - \varepsilon/2}^{x_{i-1} + \varepsilon/2} dy_{i-1} \int_{x_{i+1} - \varepsilon/2}^{x_{i+1} + \varepsilon/2} dy_{i+1} \cdots \int_{x_n - \varepsilon/2}^{x_n + \varepsilon/2} dy_n\, \dfrac{1}{\varepsilon}\, [H_j(y_1, \ldots, y_{i-1}, x_i + \varepsilon/2, y_{i+1}, \ldots, y_n) - H_j(y_1, \ldots, y_{i-1}, x_i - \varepsilon/2, y_{i+1}, \ldots, y_n)]$,  (3.3)

where $j = 1, 2, \ldots, m$ and $e_i$ is the $i$th unit coordinate vector, $i = 1, 2, \ldots, n$.

Proposition 3.2 (Sobolev [37], Schwartz [35]) If supp$(\Phi)$ is bounded and $\Phi$ is continuously differentiable, then for any $\varepsilon > 0$, $G(\varepsilon, \cdot)$ is continuously differentiable and for $x \in R^n$

$G'_x(\varepsilon, x) = \int H(y)\, \Theta'_x(\varepsilon, x - y)\, dy$.  (3.4)

Recall that a locally Lipschitz continuous function $H : R^n \to R^m$ is said to be semismooth at $x$ if

$\lim_{V \in \partial H(x + th'),\ h' \to h,\ t \downarrow 0} \{V h'\}$

exists for any $h \in R^n$ [32]. $H$ is said to be strongly semismooth at $x$ if $H$ is semismooth at $x$ and for any $V \in \partial H(x + h)$, $h \to 0$,

$Vh - H'(x; h) = O(\|h\|^2)$.

See [32, 26] for details on semismoothness and strong semismoothness.

Theorem 3.1 If $\Phi$ is defined by (3.1), or if supp$(\Phi)$ is bounded and $\Phi$ is continuously differentiable, then we have:

(i) $G$ is continuously differentiable on $R_{++} \times R^n$.

(ii) $G$ is locally Lipschitz continuous on $R^{n+1}$. If $H$ itself is globally Lipschitz continuous on $R^n$ with Lipschitz constant $L$, then $G$ is Lipschitz continuous on $R^{n+1}$ and for any fixed $\varepsilon > 0$ and $x \in R^n$,

$\|G'_x(\varepsilon, x)\| \le L$.  (3.5)

(iii) $\lim_{z \to x,\ \varepsilon \downarrow 0} G'_x(\varepsilon, z) \in \mathrm{plen}\, \partial H(x)$, i.e., every limit of Jacobians $G'_x(\varepsilon, z)$ as $z \to x$ and $\varepsilon \downarrow 0$ lies in $\mathrm{plen}\, \partial H(x)$.

(iv) For any $x \in R^n$,

$\pi_x \partial G(0, x) \subseteq \mathrm{plen}\, \partial H(x) \subseteq \partial H_1(x) \times \partial H_2(x) \times \cdots \times \partial H_m(x)$,  (3.6)

where $\pi_x \partial G(0, x) := \{A \in R^{m \times n} :$ there exists $a \in R^m$ such that $(a\ A) \in \partial G(0, x)\}$.

(v) For any $\varepsilon > 0$ and $x \in R^n$, $G'_x(\varepsilon, x)$ is positive semidefinite (positive definite) if $H$ is monotone (strongly monotone) on $R^n$.

(vi) If $m = n$ and there exists a continuously differentiable function $f : R^n \to R$ such that $H(y) = \nabla f(y)$ for all $y$, then $G'_x(\varepsilon, x)$ is symmetric at any $(\varepsilon, x) \in R_{++} \times R^n$.

(vii) If $H$ is semismooth at $x$, then for any $(\varepsilon, d) \in R_{++} \times R^n$ with $(\varepsilon, d) \to 0$ we have

$G(\varepsilon, x + d) - G(0, x) - G'(\varepsilon, x + d)\begin{pmatrix} \varepsilon \\ d \end{pmatrix} = o(\|(\varepsilon, d)\|)$,  (3.7)

and

$G(\varepsilon, x + d) - G(0, x) - G'(\varepsilon, x + d)\begin{pmatrix} \varepsilon \\ d \end{pmatrix} = O(\|(\varepsilon, d)\|^2)$  (3.8)

if $H$ is strongly semismooth at $x$.

Proof: (i) Similar to the proofs of Propositions 3.1 and 3.2, we can see that for any fixed $x$, $G(\cdot, x)$ is continuously differentiable on $R_{++}$. Moreover, by (3.3) and (3.4), for any $\varepsilon > 0$ and $x \in R^n$ we have

$\lim_{\tau \to \varepsilon,\ z \to x} G'_x(\tau, z) = G'_x(\varepsilon, x)$.

Then, by the definition, we can prove that $G'(\varepsilon, x)$ exists for any $(\varepsilon, x) \in R_{++} \times R^n$. The continuity of $G'$ follows from the continuous differentiability of $G(\cdot, x)$ and $G(\varepsilon, \cdot)$.

(ii) Let $(\varepsilon, x)$, $(\tau, z)$ be any two points in $R^{n+1}$. Then

$\|G(\varepsilon, x) - G(\tau, z)\| = \left\| \int H(x - \varepsilon y)\Phi(y)\,dy - \int H(z - \tau y)\Phi(y)\,dy \right\| \le \int \|H(x - \varepsilon y) - H(z - \tau y)\|\,\Phi(y)\,dy \le \sup_{y \in \mathrm{supp}} \|H(x - \varepsilon y) - H(z - \tau y)\|$,

which, together with the Lipschitz continuity of $H$, implies the Lipschitz continuity of $G$. By the above arguments we can see that if $H$ is globally Lipschitz continuous with Lipschitz constant $L$, then

$\|G(\varepsilon, y) - G(\varepsilon, z)\| \le L\|y - z\|$ for all $y, z \in R^n$,

which implies (3.5).

(iii) First, from the proof of [16, Theorem 10.4], for any $v \in R^m$ and $u \in R^n$, we have

$v^T G'_x(\varepsilon, z) u = \lim_{t \downarrow 0} \dfrac{v^T [G(\varepsilon, z + tu) - G(\varepsilon, z)]}{t} = \lim_{t \downarrow 0} \int_{\mathrm{supp}} \dfrac{v^T [H(z + tu - \varepsilon y) - H(z - \varepsilon y)]}{t}\,\Phi(y)\,dy \le \sup_{y \in \mathrm{supp}} H^0(z - \varepsilon y; u, v)$.

Therefore,

$v^T G'_x(\varepsilon, z) u \le \sup_{y \in \mathrm{supp}} \max_{V \in \partial H(z - \varepsilon y)} v^T V u$.

Since $\partial H$ is upper semicontinuous, we have

$\limsup_{z \to x,\ \varepsilon \downarrow 0} v^T G'_x(\varepsilon, z) u \le \max_{V \in \partial H(x)} v^T V u$.

Hence, by Theorem 2.1, for any

$A = \lim_{z \to x,\ \varepsilon \downarrow 0} G'_x(\varepsilon, z)$,

we have $v^T A u \le H^0(x; u, v)$, which, by Theorem 2.1, implies $A \in \mathrm{plen}\, \partial H(x)$.

(iv) Define

$B(x) := \{B \in R^{m \times (n+1)} : B = \lim G'(0, y),\ y \to x,\ G'(0, y)$ exists$\}$.

Since, if $G$ is differentiable at any $(0, y) \in \{0\} \times R^n$, then $H$ must be differentiable at $y$ as well and $G'_x(0, y) = H'(y)$, for any $B \in B(x)$ there exist a vector $a \in R^m$ and a matrix $A \in \partial H(x)$ such that

$B = (a\ A)$.  (3.9)

Then, part (iii) of this theorem, (3.9) and the convexity of $\mathrm{plen}\, \partial H(x)$ imply that $\pi_x \partial G(0, x) \subseteq \mathrm{plen}\, \partial H(x)$. Since for each $i$, $\partial H_i(x)$ is compact and convex,

$\mathrm{plen}\, \partial H(x) \subseteq \partial H_1(x) \times \partial H_2(x) \times \cdots \times \partial H_m(x)$.

So (3.6) is proved.

(v) First, from our assumptions there exists a nonnegative number $\alpha$ ($\alpha > 0$ if $H$ is strongly monotone) such that

$(x - z)^T [H(x) - H(z)] \ge \alpha \|x - z\|^2$ for all $x, z \in R^n$.

Then for any $h \in R^n$ and any fixed $\varepsilon > 0$, we have

$h^T G'_x(\varepsilon, x) h = \lim_{t \downarrow 0} \dfrac{h^T [G(\varepsilon, x + th) - G(\varepsilon, x)]}{t} = \lim_{t \downarrow 0} \int_{\mathrm{supp}} \dfrac{h^T [H(x + th - \varepsilon y) - H(x - \varepsilon y)]}{t}\,\Phi(y)\,dy \ge \lim_{t \downarrow 0} \int_{\mathrm{supp}} \alpha h^T h\,\Phi(y)\,dy = \alpha h^T h$,

which proves (v).

(vi) For any $\varepsilon > 0$, define $\gamma_\varepsilon : R^n \to R$ by

$\gamma_\varepsilon(x) = \int f(x - \varepsilon y)\Phi(y)\,dy$, $\quad x \in R^n$.

Then, by (i) of this theorem, $\gamma_\varepsilon$ is continuously differentiable. By direct computation, we have

$\nabla \gamma_\varepsilon(x) = \int \nabla f(x - \varepsilon y)\Phi(y)\,dy = \int H(x - \varepsilon y)\Phi(y)\,dy = G(\varepsilon, x)$, $\quad x \in R^n$.

This, together with (i), means that $\gamma_\varepsilon$ is twice continuously differentiable on $R^n$ and

$G'_x(\varepsilon, x) = \nabla^2 \gamma_\varepsilon(x)$, $\quad x \in R^n$,

which means that $G'_x(\varepsilon, x)$ is symmetric by [22, 3.3.4].

(vii) According to the definition, for any $(\varepsilon, d) \in R_{++} \times R^n$ with $(\varepsilon, d) \to 0$ we have

$G(\varepsilon, x + d) - G(0, x) - G'(\varepsilon, x + d)\begin{pmatrix} \varepsilon \\ d \end{pmatrix} = \lim_{t \downarrow 0} \int_{\mathrm{supp}} \left[ H(x + d - \varepsilon y) - H(x) - \dfrac{H(x + d + td - (\varepsilon + t\varepsilon)y) - H(x + d - \varepsilon y)}{t} \right] \Phi(y)\,dy = \lim_{t \downarrow 0} \int_{\mathrm{supp}} [H(x + d - \varepsilon y) - H(x) - V_t(d - \varepsilon y)]\,\Phi(y)\,dy$,

where the last equality follows from the Mean Value Theorem [4, Proposition 2.6.5] and $V_t$ is an element of the convex hull of $\partial H([x + d - \varepsilon y,\ x + d - \varepsilon y + t(d - \varepsilon y)])$. By Carathéodory's theorem, there exist at most $mn + 1$ elements

$W_{t,i} \in \partial H(x + d - \varepsilon y + t\theta_i(d - \varepsilon y))$

such that

$V_t = \sum_{i=1}^{mn+1} \lambda_i W_{t,i}$,

where $\theta_i, \lambda_i \in [0, 1]$ and $\sum_{i=1}^{mn+1} \lambda_i = 1$. Then

$G(\varepsilon, x + d) - G(0, x) - G'(\varepsilon, x + d)\begin{pmatrix} \varepsilon \\ d \end{pmatrix} = \lim_{t \downarrow 0} \int_{\mathrm{supp}} \sum_{i=1}^{mn+1} \lambda_i [H(x + d - \varepsilon y) - H(x) - W_{t,i}(d - \varepsilon y)]\,\Phi(y)\,dy = \lim_{t \downarrow 0} \int_{\mathrm{supp}} \sum_{i=1}^{mn+1} \lambda_i [H(x + d - \varepsilon y + t\theta_i(d - \varepsilon y)) - H(x) - W_{t,i}(d - \varepsilon y + t\theta_i(d - \varepsilon y)) + O(t)]\,\Phi(y)\,dy$.

Then, if $H$ is semismooth at $x$, we have

$\left\| G(\varepsilon, x + d) - G(0, x) - G'(\varepsilon, x + d)\begin{pmatrix} \varepsilon \\ d \end{pmatrix} \right\| = \sup_{y \in \mathrm{supp}} o(\|d - \varepsilon y\|) = o(\|(\varepsilon, d)\|)$.

This proves (3.7). If $H$ is strongly semismooth at $x$, by following the above arguments, we can prove (3.8).

In proving part (iii) of Theorem 3.1, we used an idea from Ioffe [16]. For the Steklov averaged function, Xu and Chang [41] proved a result similar to part (iii) of Theorem 3.1. Recently, by assuming that $H : R^n \to R^n$ is globally Lipschitz continuous, Xu [40] applied this result for the Steklov averaged function of $H$ to modify an algorithm proposed in [3].

3.2 supp$(\Phi)$ is infinite

Assumption 3.1 (i) $H$ is globally Lipschitz continuous with Lipschitz constant $L$.

(ii) $\kappa := \int \|y\|\,\Phi(y)\,dy < \infty$.

(iii) $\Phi$ is continuously differentiable and for any $\varepsilon > 0$ and $x \in R^n$, the integral $\int H(y)\,\Theta'_x(\varepsilon, x - y)\,dy$ exists.

(iv) For any $\varepsilon > 0$, $x \in R^n$ and $h \to 0$, it holds that

$\int (1 + \|y\|) \sup_{t \in [0,1]} \|[\Theta'_x(\varepsilon, x + th - y) - \Theta'_x(\varepsilon, x - y)]h\|\,dy = o(\|h\|)$.

Proposition 3.3 Suppose that Assumption 3.1 holds. Then for any $\varepsilon > 0$, $G(\varepsilon, \cdot)$ is continuously differentiable with Jacobian given by

$G'_x(\varepsilon, x) = \int H(y)\,\Theta'_x(\varepsilon, x - y)\,dy$, $\quad x \in R^n$.  (3.10)

Proof: By (i) and (ii) of Assumption 3.1, for any $(\varepsilon, x) \in R_{++} \times R^n$, $G$ is well defined.

By (iii) and (iv) of Assumption 3.1, for any $h \in R^n$ with $h \to 0$ we have

$\left\| G(\varepsilon, x + h) - G(\varepsilon, x) - \int H(y)\,\Theta'_x(\varepsilon, x - y)h\,dy \right\| = \left\| \int H(y)\,[\Theta(\varepsilon, x + h - y) - \Theta(\varepsilon, x - y) - \Theta'_x(\varepsilon, x - y)h]\,dy \right\| \le \int \|H(y)\| \max_{t \in [0,1]} \|[\Theta'_x(\varepsilon, x + th - y) - \Theta'_x(\varepsilon, x - y)]h\|\,dy \le L \int \|y\| \sup_{t \in [0,1]} \|[\Theta'_x(\varepsilon, x + th - y) - \Theta'_x(\varepsilon, x - y)]h\|\,dy + (\|H(y^0)\| + L\|y^0\|) \int \sup_{t \in [0,1]} \|[\Theta'_x(\varepsilon, x + th - y) - \Theta'_x(\varepsilon, x - y)]h\|\,dy = o(\|h\|)$,

where $y^0 \in R^n$ is an arbitrary point. Then we have proved that $G'_x(\varepsilon, x)$ exists and (3.10) holds. The continuity of $G'_x(\varepsilon, \cdot)$ follows from (iv) of Assumption 3.1.

Assumption 3.2 (i) For any $\varepsilon > 0$ and $x \in R^n$, the integral

$\int H(y)\,\Theta'_\varepsilon(\varepsilon, x - y)\,dy$  (3.11)

exists.

(ii) For any $\varepsilon > 0$, $x \in R^n$ and $\tau \in R$ with $\tau \to 0$ we have

$\int \sup_{t \in [0,1]} \|[\Theta'_\varepsilon(\varepsilon + t\tau, x - y) - \Theta'_\varepsilon(\varepsilon, x - y)]\tau\|\,dy = o(\tau)$.  (3.12)

(iii) For any $\varepsilon > 0$ and $x \in R^n$ we have

$\lim_{\tau \to \varepsilon,\ z \to x} \int \|\Theta'_\varepsilon(\tau, z - y) - \Theta'_\varepsilon(\varepsilon, x - y)\|\,dy = 0$.  (3.13)

If $H$ is globally Lipschitz continuous, there are plenty of kernel functions satisfying Assumptions 3.1 and 3.2, e.g., the Weierstrass kernel function

$\Phi(y) = \dfrac{1}{(\sqrt{\pi})^n}\, e^{-\|y\|^2}$.

Analogously to Theorem 3.1 we have the following theorem.

Theorem 3.2 Suppose that supp$(\Phi)$ is infinite, $\Phi$ is continuously differentiable, and Assumption 3.1 holds. Then we have:

(i) If Assumption 3.2 holds, then $G$ is continuously differentiable on $R_{++} \times R^n$.

(ii) $G$ is globally Lipschitz continuous on $R^{n+1}$ and for any fixed $\varepsilon > 0$ and $x \in R^n$,

$\|G'_x(\varepsilon, x)\| \le L$.  (3.14)

(iii) $\lim_{z \to x,\ \varepsilon \downarrow 0} G'_x(\varepsilon, z) \in \mathrm{plen}\, \partial H(x)$.

(iv) For any $x \in R^n$,

$\pi_x \partial G(0, x) \subseteq \mathrm{plen}\, \partial H(x) \subseteq \partial H_1(x) \times \partial H_2(x) \times \cdots \times \partial H_m(x)$.  (3.15)

(v) For any $\varepsilon > 0$ and $x \in R^n$, $G'_x(\varepsilon, x)$ is positive semidefinite (positive definite) if $H$ is monotone (strongly monotone) on $R^n$.

(vi) Suppose that $m = n$. If there exists a continuously differentiable function $f : R^n \to R$ such that $H(y) = \nabla f(y)$ for all $y$ and, for all $\varepsilon > 0$ and $x \in R^n$,

$\gamma_\varepsilon(x) := \int f(x - \varepsilon y)\Phi(y)\,dy$

is well defined, then $G'_x(\varepsilon, x)$ is symmetric at any $(\varepsilon, x) \in R_{++} \times R^n$.

(vii) If $H$ is semismooth at $x$, then for any $(\varepsilon, d) \in R_{++} \times R^n$ with $(\varepsilon, d) \to 0$ we have

$G(\varepsilon, x + d) - G(0, x) - G'(\varepsilon, x + d)\begin{pmatrix} \varepsilon \\ d \end{pmatrix} = o(\|(\varepsilon, d)\|)$.  (3.16)

Proof: (i) From parts (i) and (ii) of Assumption 3.2 we can see that for any $(\varepsilon, x) \in R_{++} \times R^n$, $G'_\varepsilon(\varepsilon, x)$ exists and $G'_\varepsilon(\cdot, x)$ is continuous. Part (iii) of Assumption 3.2 guarantees that

$\lim_{\tau \to \varepsilon,\ z \to x} G'_\varepsilon(\tau, z) = G'_\varepsilon(\varepsilon, x)$,

which, together with the continuity of $G'_\varepsilon(\cdot, x)$ and $G'_x(\varepsilon, \cdot)$ (Proposition 3.3), implies that $G$ is continuously differentiable on $R_{++} \times R^n$.

(ii) Let $(\varepsilon, x)$, $(\tau, z)$ be any two points in $R^{n+1}$. Then

$\|G(\varepsilon, x) - G(\tau, z)\| = \left\| \int H(x - \varepsilon y)\Phi(y)\,dy - \int H(z - \tau y)\Phi(y)\,dy \right\| \le \int \|H(x - \varepsilon y) - H(z - \tau y)\|\,\Phi(y)\,dy \le L \int \|(x - \varepsilon y) - (z - \tau y)\|\,\Phi(y)\,dy \le L[\|x - z\| + |\varepsilon - \tau|\,\kappa] \le L\sqrt{2}\,\max\{1, \kappa\}\,\|(\varepsilon, x) - (\tau, z)\|$,

which proves the global Lipschitz continuity of $G$. By the above arguments, we also have

$\|G(\varepsilon, y) - G(\varepsilon, z)\| \le L\|y - z\|$ for all $y, z \in R^n$.

This implies (3.14).

(iii) For any given $\delta > 0$, there exists a number $M > 0$ such that
$$\int_{\|y\| \ge M} \Phi(y)\,dy \le \frac{\delta}{2}.$$
Let $D_M := \{y \mid \|y\| \le M\}$. Suppose that $v \in R^m$ and $u \in R^n$ are two arbitrarily chosen points. Since $\partial H$ is upper semicontinuous, there exists a number $\tau > 0$ such that for all $\varepsilon \in [0, \tau]$ and all $z$ with $\|z - x\| \le \tau$ we have
$$\sup_{y \in D_M}\, \max_{V \in \partial H(z - \varepsilon y)} v^T V u \le \max_{V \in \partial H(x)} v^T V u + \frac{\delta}{2}\|v\|\,\|u\|. \qquad (3.17)$$
Since $H$ is globally Lipschitz continuous, by the definition, we have
$$v^T G_x(\varepsilon, z)u = \lim_{t \downarrow 0} \frac{v^T[G(\varepsilon, z + tu) - G(\varepsilon, z)]}{t}$$
$$= \lim_{t \downarrow 0} \left\{ \int_{D_M} \frac{v^T[H(z + tu - \varepsilon y) - H(z - \varepsilon y)]}{t}\Phi(y)\,dy + \int_{\|y\| \ge M} \frac{v^T[H(z + tu - \varepsilon y) - H(z - \varepsilon y)]}{t}\Phi(y)\,dy \right\}$$
$$\le \sup_{y \in D_M} H^0(z - \varepsilon y; u, v) + L\|v\|\,\|u\|\frac{\delta}{2},$$
which, together with (3.17), implies that for any $\delta > 0$, there exists a number $\tau > 0$ such that for all $\varepsilon \in [0, \tau]$ and all $z$ with $\|z - x\| \le \tau$ we have
$$v^T G_x(\varepsilon, z)u \le \sup_{y \in D_M}\, \max_{V \in \partial H(z - \varepsilon y)} v^T V u + L\|v\|\,\|u\|\frac{\delta}{2} \le \max_{V \in \partial H(x)} v^T V u + (L+1)\|v\|\,\|u\|\frac{\delta}{2}.$$
This means that
$$\limsup_{z \to x,\, \varepsilon \downarrow 0} v^T G_x(\varepsilon, z)u \le \max_{V \in \partial H(x)} v^T V u.$$
Hence, for any
$$A \in \lim_{z \to x,\, \varepsilon \downarrow 0} G_x(\varepsilon, z),$$
we have $v^T A u \le H^0(x; u, v)$, which, by Theorem 2.1, implies $A \in \mathrm{plen}\,\partial H(x)$.

(iv) Similar to the proof of (iv) in Theorem 3.1.

(v) Similar to the proof of (v) in Theorem 3.1.

(vi) By the definition,
$$\gamma_\varepsilon(x+h) - \gamma_\varepsilon(x) - \int_{R^n} H(x - \varepsilon y)^T h\,\Phi(y)\,dy = \int_{R^n} \big[f(x + h - \varepsilon y) - f(x - \varepsilon y) - \nabla f(x - \varepsilon y)^T h\big]\Phi(y)\,dy$$
$$= \int_{R^n} \int_0^1 \big[\nabla f(x + th - \varepsilon y) - \nabla f(x - \varepsilon y)\big]^T h\,dt\,\Phi(y)\,dy.$$

Since $\nabla f = H$ is globally Lipschitz continuous, for any $\delta > 0$ there exists a number $M > 0$ such that
$$\int_{\|y\| \ge M} \int_0^1 \big| [\nabla f(x + th - \varepsilon y) - \nabla f(x - \varepsilon y)]^T h \big|\,dt\,\Phi(y)\,dy \le \frac{\delta}{2}\|h\|^2.$$
For the given $M$, there exists a number $\tau > 0$ such that for all $\varepsilon \in [0, \tau]$ and all $h$ with $\|h\| \le \tau$ we have
$$\int_{\|y\| \le M} \int_0^1 \big| [\nabla f(x + th - \varepsilon y) - \nabla f(x - \varepsilon y)]^T h \big|\,dt\,\Phi(y)\,dy \le \frac{\delta}{2}\|h\|$$
because $\nabla f$ is uniformly continuous on any bounded set. Therefore, for any $\delta > 0$ there exists a number $\tau \in (0, 1]$ such that for all $\varepsilon \in [0, \tau]$ and all $h$ with $\|h\| \le \tau$ we have
$$\int_{R^n} \int_0^1 \big| [\nabla f(x + th - \varepsilon y) - \nabla f(x - \varepsilon y)]^T h \big|\,dt\,\Phi(y)\,dy \le \delta\|h\|.$$
This implies that
$$\gamma_\varepsilon(x+h) - \gamma_\varepsilon(x) - \int_{R^n} H(x - \varepsilon y)^T h\,\Phi(y)\,dy = o(\|h\|),$$
which means that $\gamma_\varepsilon$ is differentiable and
$$\nabla\gamma_\varepsilon(x) = \int_{R^n} H(x - \varepsilon y)\Phi(y)\,dy = G(\varepsilon, x).$$
Since $G(\varepsilon, \cdot)$ is continuous, $\gamma_\varepsilon$ is continuously differentiable. By Proposition 3.3, $G(\varepsilon, \cdot)$ is continuously differentiable, and so $\gamma_\varepsilon(\cdot)$ is twice continuously differentiable with
$$\nabla^2\gamma_\varepsilon(x) = G_x(\varepsilon, x), \quad x \in R^n.$$
This, together with the symmetry of $\nabla^2\gamma_\varepsilon(x)$, implies that $G_x(\varepsilon, x)$ is symmetric.

(vii) First, for any given $\delta > 0$, under the assumptions, we know that there exists a number $M > 0$ such that
$$\max\left\{ \int_{\|y\| \ge M} \Phi(y)\,dy,\ \int_{\|y\| \ge M} \|y\|\Phi(y)\,dy \right\} \le \frac{\delta}{4}.$$
Then for any $d$ and $\varepsilon > 0$, we have
$$\int_{\|y\| \ge M} \left[ \|H(x + d - \varepsilon y) - H(x)\| + \sup_{t \in (0,1]} \frac{\|H(x + d + td - (\varepsilon + t\varepsilon)y) - H(x + d - \varepsilon y)\|}{t} \right]\Phi(y)\,dy \le L(\|d\| + \varepsilon)\frac{\delta}{2}.$$
Next, according to the proof in part (vii) of Theorem 3.1, we know that there exists a number $\tau > 0$ such that for all $(\varepsilon, d) \in R^{n+1}$ with $\varepsilon > 0$ and $\|(\varepsilon, d)\| \le \tau$ we have
$$\left\| \int_{\|y\| \le M} \left[ H(x + d - \varepsilon y) - H(x) - \lim_{t \downarrow 0} \frac{H(x + d + td - (\varepsilon + t\varepsilon)y) - H(x + d - \varepsilon y)}{t} \right]\Phi(y)\,dy \right\| \le \frac{\delta}{2}\|(\varepsilon, d)\|.$$

Therefore, for any given $\delta > 0$, there exists a number $\tau > 0$ such that for all $(\varepsilon, d) \in R^{n+1}$ with $\varepsilon > 0$ and $\|(\varepsilon, d)\| \le \tau$ we have
$$\left\| G(\varepsilon, x+d) - G(0,x) - G'(\varepsilon, x+d)\binom{\varepsilon}{d} \right\|$$
$$= \left\| \int_{R^n} \left[ H(x + d - \varepsilon y) - H(x) - \lim_{t \downarrow 0} \frac{H(x + d + td - (\varepsilon + t\varepsilon)y) - H(x + d - \varepsilon y)}{t} \right]\Phi(y)\,dy \right\|$$
$$\le \left\| \int_{\|y\| \le M} \left[ H(x + d - \varepsilon y) - H(x) - \lim_{t \downarrow 0} \frac{H(x + d + td - (\varepsilon + t\varepsilon)y) - H(x + d - \varepsilon y)}{t} \right]\Phi(y)\,dy \right\|$$
$$\quad + \left\| \int_{\|y\| \ge M} \left[ H(x + d - \varepsilon y) - H(x) - \lim_{t \downarrow 0} \frac{H(x + d + td - (\varepsilon + t\varepsilon)y) - H(x + d - \varepsilon y)}{t} \right]\Phi(y)\,dy \right\|$$
$$\le \frac{\delta}{2}\|(\varepsilon, d)\| + L(\|d\| + \varepsilon)\frac{\delta}{2} \le (L+1)\delta\|(\varepsilon, d)\|,$$
which proves (3.16). $\Box$

Contrary to Theorem 3.1, in Theorem 3.2 we could not prove a result similar to (3.8). When $X$ is a rectangle, such a result holds for several smoothing functions with supp(Φ) being infinite [31].

3.3 Smoothing-nonsmooth reformulations

If in (2.2) $H$ is replaced by $\Pi_X$, a smoothing approximation of $\Pi_X$ via convolution can be described by
$$G(\varepsilon, x) = \int_{R^n} \Pi_X(x - y)\Theta(\varepsilon, y)\,dy, \qquad (3.18)$$
where $(\varepsilon, x) \in R_{++} \times R^n$. Since $\int_{R^n} \Theta(\varepsilon, y)\,dy = 1$, for any $\varepsilon > 0$ and $x \in R^n$ we have $G(\varepsilon, x) \in X$. As we stated earlier, we always define
$$G(0,x) := \Pi_X(x) \quad\text{and}\quad G(-\varepsilon, x) := G(\varepsilon, x), \quad x \in R^n.$$
Suppose that $\Phi$ is chosen such that $G$ is continuous on $R^{n+1}$ and is continuously differentiable on $R_{++} \times R^n$. Then to solve the VIP is equivalent to solve
$$Q(\varepsilon, x) := \binom{\varepsilon}{P(\varepsilon, x)} = 0, \qquad (3.19)$$

where $P(\varepsilon, x) := x - G(\varepsilon, x - F(x))$. Since $Q$ is continuously differentiable on $R_{++} \times R^n$ and may be nonsmooth on $\{0\} \times R^n$, we can see (3.19) as a smoothing-nonsmooth reformulation of the VIP. Another smoothing-nonsmooth reformulation of the VIP is
$$R(\varepsilon, x) := \binom{\varepsilon}{S(\varepsilon, x)} = 0, \qquad (3.20)$$
where $S(\varepsilon, x) := F(G(\varepsilon, x)) + x - G(\varepsilon, x)$.

Theorem 3.3 Suppose that $\Phi$ is defined by (3.1) or suppose that supp(Φ) is bounded and $\Phi$ is continuously differentiable. Then $Q$ is continuously differentiable on $R_{++} \times R^n$, and for any $\varepsilon > 0$ and $x \in R^n$ there exists an orthogonal matrix $U$ such that $\Lambda := U G_x(\varepsilon, x - F(x)) U^T$ is a diagonal matrix with $0 \le \Lambda_{ii} \le 1$, $i \in \{1, 2, \ldots, n\}$. Moreover, we have
(i) if $U F'(x) U^T$ is a $P_0$-matrix ($P$-matrix), then $P_x(\varepsilon, x)$ is a quasi-$P_0$-matrix (quasi-$P$-matrix);
(ii) if $\Pi_X$ is semismooth at $x - F(x)$, then for any $(\varepsilon, d) \in R_{++} \times R^n$ with $(\varepsilon, d) \to 0$ we have
$$\left\| Q(\varepsilon, x+d) - Q(0,x) - Q'(\varepsilon, x+d)\binom{\varepsilon}{d} \right\| = o(\|(\varepsilon, d)\|), \qquad (3.21)$$
and
$$\left\| Q(\varepsilon, x+d) - Q(0,x) - Q'(\varepsilon, x+d)\binom{\varepsilon}{d} \right\| = O(\|(\varepsilon, d)\|^2) \qquad (3.22)$$
if $\Pi_X$ is strongly semismooth at $x - F(x)$ and $F'$ is Lipschitz continuous at $x$.

Proof: First, from Theorem 3.1, $P$ is continuously differentiable on $R_{++} \times R^n$. Then, since $\Pi_X$ is monotone and globally Lipschitz continuous with Lipschitz constant 1, from Theorem 3.1 and the proof of Theorem 2.2 we know that for any $\varepsilon > 0$, $G_x(\varepsilon, x)$ is symmetric, positive semidefinite and $\|G_x(\varepsilon, x)\| \le 1$. Thus, there exists an orthogonal matrix $U$ such that $\Lambda := U G_x(\varepsilon, x - F(x)) U^T$ is a diagonal matrix with $0 \le \Lambda_{ii} \le 1$, $i \in \{1, 2, \ldots, n\}$. Part (i) can be proved similarly to that in Theorem 2.3 and part (ii) follows from (vii) in Theorem 3.1. $\Box$

Theorem 3.4 Suppose that supp(Φ) is infinite and $\Phi$ is continuously differentiable. Suppose that Assumptions 3.1 and 3.2 hold and that $\int_{R^n} \|y\|^2 \Phi(y)\,dy < \infty$. Then $Q$ is continuously differentiable on $R_{++} \times R^n$, and for any $\varepsilon > 0$ there exists an orthogonal matrix $U$ such that $\Lambda := U G_x(\varepsilon, x - F(x)) U^T$ is a diagonal matrix with $0 \le \Lambda_{ii} \le 1$, $i \in \{1, 2, \ldots, n\}$. Moreover, we have
(i) if $U F'(x) U^T$ is a $P_0$-matrix ($P$-matrix), then $P_x(\varepsilon, x)$ is a quasi-$P_0$-matrix (quasi-$P$-matrix);

(ii) if $\Pi_X$ is semismooth at $x - F(x)$, then for any $(\varepsilon, d) \in R_{++} \times R^n$ with $(\varepsilon, d) \to 0$ we have
$$\left\| Q(\varepsilon, x+d) - Q(0,x) - Q'(\varepsilon, x+d)\binom{\varepsilon}{d} \right\| = o(\|(\varepsilon, d)\|). \qquad (3.23)$$

Proof: Similar to the proof of Theorem 3.3, we can prove this theorem by Theorem 3.2. We omit the details. $\Box$

In this subsection, the results that hold for $P$ hold for $R$ as well. We omit the details here.

3.4 Computable smoothing functions of $\Pi_X$

The smoothing approximation function $G$ via convolution is not computable if $X$ is not a rectangle, since in this case a multivariate integral is involved. When $X$ is a rectangle, see [2, 9] for various concrete forms of $G$. In [30], the authors discussed a way to get an approximate smoothing function of $\Pi_X(\cdot)$ when $X$ is explicitly expressed as
$$X := \{x \in R^n \mid g_i(x) \le 0,\ i = 1, 2, \ldots, m\}, \qquad (3.24)$$
where for each $i$, $g_i$ is a twice continuously differentiable convex function. Suppose that the Slater constraint qualification holds, i.e., there exists a point $x^0$ such that $g_i(x^0) < 0$ for all $i \in \{1, 2, \ldots, m\}$. Then for any $x \in R^n$, there exists a vector $\lambda \in R^m_+$ such that
$$y - x + \sum_{i=1}^m \lambda_i \nabla g_i(y) = 0,$$
$$\lambda - \max(0, \lambda + g(y)) = 0. \qquad (3.25)$$
Suppose that $a(\varepsilon, t)$ is the Chen–Harker–Kanzow–Smale (CHKS) smoothing function [1, 18, 36] of $\max(0, t)$, $t \in R$, which is given by
$$a(\varepsilon, t) := \frac{\sqrt{4\varepsilon^2 + t^2} + t}{2}, \quad t \in R.$$
(We can use other smoothing functions of $\max(0, t)$; here, we choose the CHKS smoothing function for ease of discussion.) Define $A: R \times R^m \to R^m$ by
$$A_i(\varepsilon, z) := a(\varepsilon, z_i), \quad i \in \{1, 2, \ldots, m\}.$$
Consider the perturbed system of (3.25)
$$D((y, \lambda), (\varepsilon, x)) := \begin{pmatrix} y - x + \sum_{i=1}^m \lambda_i \nabla g_i(y) \\ \lambda - A(\varepsilon, \lambda + g(y)) \end{pmatrix} = 0, \qquad (3.26)$$
where $\varepsilon$ and $x$ are treated as parameters and $y$ and $\lambda$ as variables. For any $(\varepsilon, x) \in R_{++} \times R^n$, the system (3.26) has a unique solution $(y(\varepsilon, x), \lambda(\varepsilon, x))$.

Proposition 3.4 ([30]) Suppose that the Slater constraint qualification holds. Then

(i) $y(\cdot, \cdot)$ is continuously differentiable on $R_{++} \times R^n$ and for any $(\varepsilon, x) \in R_{++} \times R^n$, $y_x(\varepsilon, x)$ is symmetric, positive semidefinite and $\|y_x(\varepsilon, x)\| \le 1$.
(ii) For any $x^0 \in R^n$, $\lim_{\varepsilon \downarrow 0,\, x \to x^0} y(\varepsilon, x) = \Pi_X(x^0)$.

See [30] for some important differentiability properties of the above defined smoothing function. Note that for any $\varepsilon > 0$ and $x \in R^n$, to compute $y(\varepsilon, x)$ is no more difficult than to compute $\Pi_X(x)$, since $D((\cdot, \cdot), (\varepsilon, x))$ is continuously differentiable everywhere when $\varepsilon \neq 0$. Define
$$y(0, x) := \Pi_X(x) \quad\text{and}\quad y(-\varepsilon, x) := y(\varepsilon, x), \quad (\varepsilon, x) \in R_+ \times R^n.$$
Then we can define a smoothing function $P$ of $W$ as
$$P(\varepsilon, x) := x - y(\varepsilon, x - F(x)),$$
where $(\varepsilon, x) \in R \times R^n$. By the above proposition, $P$ is continuously differentiable everywhere except at $(0, x)$, $x \in R^n$, and $P$ is continuous at $(0, x)$, $x \in R^n$. Similarly, we can define a smoothing function of $E$ as
$$S(\varepsilon, x) := F(y(\varepsilon, x)) + x - y(\varepsilon, x), \quad (\varepsilon, x) \in R \times R^n.$$
It is noted that for any $\varepsilon \neq 0$, $y(\varepsilon, x) \in \mathrm{int}\,X$. This guarantees that $S$ is well defined even if $F$ is only defined on $X$.

4 Algorithm and its convergence

Suppose that $Q: R^{n+1} \to R^{n+1}$ is defined by (3.19). In this section we will discuss a method to find a solution of $Q(z) = 0$, where $z := (\varepsilon, x) \in R \times R^n$. If, for the VIP, $F$ is not well defined outside $X$, we use $R$ defined by (3.20) to replace $Q$.

Assumption 4.1 (i) $Q$ is continuous on $R^{n+1}$ and is continuously differentiable on $R_{++} \times R^n$. (ii) For any $\varepsilon > 0$ and $x \in R^n$, $Q'(\varepsilon, x)$ is nonsingular.

Remark: According to Theorems 3.3 and 3.4 and Corollary 2.1, we know that if $F$ is a continuously differentiable generalized $P_0$-function on $L(X)$, then for any $\varepsilon > 0$ and $x \in R^n$, $P_x(\varepsilon, x)$ is a quasi-$P_0$-matrix. This implies that if we redefine
$$P(\varepsilon, x) := P(\varepsilon, x) + \varepsilon x \quad\text{or}\quad P(\varepsilon, x) := x - G(\varepsilon, x - (F(x) + \varepsilon x)),$$

then $P_x(\varepsilon, x)$ becomes a quasi-$P$-matrix, and thus a nonsingular matrix. Therefore, if $F$ is a generalized $P_0$-function on $L(X)$, then, by redefining $P$ if necessary, part (ii) of Assumption 4.1 always holds.

Choose $\bar\varepsilon \in R_{++}$ and $\gamma \in (0, 1)$ such that $\gamma\bar\varepsilon < 1$. Let $\bar z := (\bar\varepsilon, 0) \in R \times R^n$. Define the merit function $\psi: R^{n+1} \to R_+$ by
$$\psi(z) := \|Q(z)\|^2$$
and define $\beta: R^{n+1} \to R_+$ by
$$\beta(z) := \gamma \min\{1, \psi(z)\}.$$
Let
$$\Omega := \{z = (\varepsilon, x) \in R \times R^n \mid \varepsilon \ge \beta(z)\bar\varepsilon\}.$$
Then, because $\beta(z) \le \gamma < 1$ for any $z \in R^{n+1}$, it follows that $(\bar\varepsilon, x) \in \Omega$ for any $x \in R^n$.

Algorithm 4.1 (Squared Smoothing Newton Method [31])

Step 0. Choose constants $\delta \in (0, 1)$ and $\sigma \in (0, 1/2)$. Let $\varepsilon^0 := \bar\varepsilon$, let $x^0 \in R^n$ be an arbitrary point, and set $k := 0$.
Step 1. If $Q(z^k) = 0$, then stop. Otherwise, let $\beta_k := \beta(z^k)$.
Step 2. Compute $\Delta z^k := (\Delta\varepsilon^k, \Delta x^k) \in R \times R^n$ by
$$Q(z^k) + Q'(z^k)\Delta z^k = \beta_k \bar z. \qquad (4.1)$$
Step 3. Let $l_k$ be the smallest nonnegative integer $l$ satisfying
$$\psi(z^k + \delta^l \Delta z^k) \le [1 - 2\sigma(1 - \gamma\bar\varepsilon)\delta^l]\psi(z^k). \qquad (4.2)$$
Define $z^{k+1} := z^k + \delta^{l_k}\Delta z^k$.
Step 4. Replace $k$ by $k + 1$ and go to Step 1.

The following global convergence result is proved in Qi et al. [31].

Theorem 4.1 Suppose that Assumption 4.1 is satisfied. Then an infinite sequence $\{z^k\}$ is generated by Algorithm 4.1 with $\{z^k\} \subset \Omega$, and each accumulation point $z^*$ of $\{z^k\}$ is a solution of $Q(z) = 0$.

When $F$ is a generalized $P_0$-function on $L(X)$, we have the following stronger result.

Theorem 4.2 Suppose that $Q$ is defined by (3.19) with $P$ given by
$$P(\varepsilon, x) := x - G(\varepsilon, x - (F(x) + \varepsilon x)), \quad (\varepsilon, x) \in R^{n+1},$$
where $G$ is continuously differentiable on $R_{++} \times R^n$ and for any $(\varepsilon, x) \in R_{++} \times R^n$, $G_x(\varepsilon, x)$ is symmetric, positive semidefinite and $\|G_x(\varepsilon, x)\| \le 1$. Suppose that $X$ is of the structure (2.3) and that $F$ is a continuously differentiable generalized $P_0$-function on $L(X)$. If the solution set of the VIP is nonempty and bounded, then a bounded infinite sequence $\{z^k\}$ is generated by Algorithm 4.1 and each accumulation point $z^*$ of $\{z^k\}$ is a solution of $Q(z) = 0$.

Proof: By the proof of Corollary 2.1 we can prove that for any $\varepsilon > 0$ and $x \in R^n$, $P_x(\varepsilon, x)$ is a quasi-$P$-matrix, and so $Q'(\varepsilon, x)$ is nonsingular. Then, by Theorem 4.1, an infinite sequence $\{z^k\}$ is generated by Algorithm 4.1 and $\{z^k\} \subset \Omega$. Since $\{\psi(z^k)\}$ is a decreasing sequence, $\lim_{k\to\infty} \psi(z^k)$ exists. If
$$\lim_{k\to\infty} \psi(z^k) > 0, \qquad (4.3)$$
then there exists an $\underline\varepsilon > 0$ such that $\varepsilon^k \ge \underline\varepsilon$ for all $k$. By the proof of Theorem 2.4, we can prove that for any $\delta > 0$ the following set is bounded:
$$\{x \in R^n \mid \|M(\varepsilon, x)\| \le \delta,\ \varepsilon \in [\underline\varepsilon, \bar\varepsilon]\},$$
where
$$M(\varepsilon, x) := x - \Pi_X[x - (F(x) + \varepsilon x)].$$
Since $G$ is globally Lipschitz continuous, for any $\delta > 0$ the set
$$\{x \in R^n \mid \|P(\varepsilon, x)\| \le \delta,\ \varepsilon \in [\underline\varepsilon, \bar\varepsilon]\}$$
is also bounded. This means that $\{z^k\}$ is bounded. From Theorem 4.1, $\{z^k\}$ then has at least one accumulation point which is a solution of $Q(z) = 0$. This contradicts (4.3). Therefore, we have
$$\lim_{k\to\infty} \psi(z^k) = 0. \qquad (4.4)$$
Since $F$ is a generalized $P_0$-function, it is not difficult to show that $Q: R^{n+1} \to R^{n+1}$ is a weakly univalent function [11]. By the assumption that the solution set of the VIP is nonempty and bounded, the inverse image $Q^{-1}(0)$ is nonempty and bounded. Then, by [34, Theorem 1], we know from (4.4) that $\{z^k\}$ is bounded and any accumulation point of $\{z^k\}$ is a solution of $Q(z) = 0$. $\Box$

In [31], Qi et al. provided a superlinear (quadratic) convergence result by assuming that $Q$ is semismooth at a solution point. Here we cannot prove that $Q$ is semismooth under our assumptions. Nevertheless, we can still provide a superlinear (quadratic) convergence result; its proof can be drawn similarly from that in [31]. This shows that semismoothness, together with some kind of nondegeneracy, is sufficient but not necessary for superlinear (quadratic) convergence. This fact was first observed and exploited by Kummer [19, 20] without using semismoothness, and later by several authors, among them Sun, Fukushima and Qi [38], Fischer [8], Qi, Ralph and Zhou [28], Pu and Zhang [25], and Gowda and Ravindran [10].
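To make the method concrete, the following is a minimal numerical sketch of Algorithm 4.1 for the NCP case $X = R^n_+$, where $\Pi_X$ is the componentwise plus function and the CHKS-smoothed plus function of Section 3.4 stands in for a convolution-based $G$. The test problem ($M$, $q$), all parameter values, and all identifiers are illustrative choices of ours, not part of the paper.

```python
import numpy as np

def chks(eps, t):
    # CHKS smoothing a(eps, t) = (sqrt(4*eps^2 + t^2) + t)/2 of max(0, t).
    return 0.5 * (np.sqrt(4.0 * eps**2 + t**2) + t)

def squared_smoothing_newton(F, dF, x0, eps_bar=1.0, gamma=0.2,
                             delta=0.5, sigma=1e-4, tol=1e-10, max_iter=100):
    """Sketch of Algorithm 4.1 for the NCP (X = R^n_+), with
    P(eps, x) = x - a(eps, x - F(x)) componentwise, so that
    Q(eps, x) = (eps, P(eps, x)) as in (3.19)."""
    n = len(x0)
    eps, x = eps_bar, np.asarray(x0, dtype=float)

    def Q(eps, x):
        return np.concatenate(([eps], x - chks(eps, x - F(x))))

    def psi(eps, x):
        q = Q(eps, x)
        return float(q @ q)

    for _ in range(max_iter):
        q = Q(eps, x)
        if np.linalg.norm(q) <= tol:
            break
        beta_k = gamma * min(1.0, psi(eps, x))
        w = x - F(x)
        r = np.sqrt(4.0 * eps**2 + w**2)
        a_eps = 2.0 * eps / r            # partial of a(eps, t) w.r.t. eps
        a_t = 0.5 * (w / r + 1.0)        # partial of a(eps, t) w.r.t. t
        # Newton equation (4.1): Q(z) + Q'(z) dz = beta_k * zbar, zbar = (eps_bar, 0).
        d_eps = beta_k * eps_bar - eps
        Px = np.eye(n) - a_t[:, None] * (np.eye(n) - dF(x))
        d_x = np.linalg.solve(Px, -q[1:] + a_eps * d_eps)
        # Line search (4.2), with a small-step safeguard for this sketch.
        psi0, t = psi(eps, x), 1.0
        while t > 1e-12 and psi(eps + t * d_eps, x + t * d_x) > \
                (1.0 - 2.0 * sigma * (1.0 - gamma * eps_bar) * t) * psi0:
            t *= delta
        eps, x = eps + t * d_eps, x + t * d_x
    return eps, x

# Illustrative LCP: F(x) = M x + q with a P-matrix M; its solution is x* = (0.5, 0).
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q_vec = np.array([-1.0, 1.0])
eps, x = squared_smoothing_newton(lambda x: M @ x + q_vec, lambda x: M, np.zeros(2))
print(eps, x)   # eps near 0, x near [0.5, 0.0]
```

On this small LCP the iterates drive both the residual and the smoothing parameter $\varepsilon$ to zero, consistent with the behavior Theorems 4.1 and 4.2 describe.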
Theorem 4.3 Suppose that Assumption 4.1 is satisfied and that $z^* := (0, x^*)$ is an accumulation point of the infinite sequence $\{z^k\}$ generated by Algorithm 4.1. Suppose that for any $\varepsilon > 0$ and $d$ with $(\varepsilon, d) \to 0$ we have
$$\left\| P(\varepsilon, x^* + d) - P(0, x^*) - P'(\varepsilon, x^* + d)\binom{\varepsilon}{d} \right\| = o(\|(\varepsilon, d)\|)$$
and that all $V \in \partial Q(z^*)$ are nonsingular. Then the whole sequence $\{z^k\}$ converges to $z^*$,
$$\|z^{k+1} - z^*\| = o(\|z^k - z^*\|)$$

and
$$\varepsilon_{k+1} = o(\varepsilon_k).$$
Furthermore, if for any $\varepsilon > 0$ and $d$ with $(\varepsilon, d) \to 0$ we have
$$\left\| P(\varepsilon, x^* + d) - P(0, x^*) - P'(\varepsilon, x^* + d)\binom{\varepsilon}{d} \right\| = O(\|(\varepsilon, d)\|^2),$$
then
$$\|z^{k+1} - z^*\| = O(\|z^k - z^*\|^2)$$
and
$$\varepsilon_{k+1} = O(\varepsilon_k^2).$$

In Theorem 4.3 we assume that all $V \in \partial Q(z^*)$ are nonsingular in order to get a high-order convergence result. According to Theorems 3.1 and 3.2, this assumption is satisfied by assuming that all
$$T \in I - \mathrm{plen}\,\partial\Pi_X[x^* - F(x^*)]\,[I - F'(x^*)]$$
are nonsingular. By the definition of plen, the latter is equivalent to saying that all
$$T \in I - \partial\Pi_X[x^* - F(x^*)]\,[I - F'(x^*)]$$
are nonsingular. See Section 2 for the conditions to ensure the nonsingularity of the above matrices. In particular, if $F'(x^*)$ is positive definite, all
$$T \in I - \partial\Pi_X(x^* - F(x^*))\,[I - F'(x^*)]$$
are nonsingular.

5 Concluding Remarks

In this paper, by using convolution, we reformulated the VIP equivalently as smoothing-nonsmooth equations, which have some desirable properties. A globally and locally high-order convergent Newton method has been applied for solving the smoothing-nonsmooth equations, and so the VIP. There is no specific assumption on the structure of the constraint set $X$. Due to the multivariate integral involved, it may not be easy to compute the smoothing functions of the projection operator $\Pi_X$ via convolution. However, this observation may lead to the discovery of effective ways to compute smoothing functions of $\Pi_X$ when the structure of $X$ can be exploited. In fact, based on this observation, an effective way to compute the smoothing functions of $\Pi_X$ is proposed in [30] when $X$ can be expressed as the set defined by several twice continuously differentiable convex functions. We believe that the research done in this paper can deepen the understanding of smoothing functions of $\Pi_X$ when $X$ is not a rectangle.

Acknowledgement: The authors would like to thank Houduo Qi for his comments and suggestions during the writing of this paper.
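To illustrate the computable smoothing functions of Section 3.4 referred to above, the following sketch applies Newton's method to the perturbed system (3.26) for the hypothetical choice $X = \{y \mid \|y\|^2 - 1 \le 0\}$ (the unit ball, with Slater point $0$, so $m = 1$ and $g(y) = \|y\|^2 - 1$). The starting point, tolerances, and identifiers are illustrative assumptions; the system itself is smooth for $\varepsilon \neq 0$, as noted in Section 3.4.

```python
import numpy as np

def chks(eps, t):
    # CHKS smoothing a(eps, t) = (sqrt(4*eps^2 + t^2) + t)/2 of max(0, t).
    return 0.5 * (np.sqrt(4.0 * eps**2 + t**2) + t)

def smoothed_projection_ball(eps, x, tol=1e-12, max_iter=50):
    """Newton's method on the perturbed system (3.26) for the
    illustrative set X = {y : g(y) <= 0}, g(y) = ||y||^2 - 1:
        y - x + 2*lam*y          = 0,
        lam - a(eps, lam + g(y)) = 0.
    Returns y(eps, x), a computable smoothing of Pi_X(x)."""
    n = len(x)
    y, lam = np.asarray(x, dtype=float), 0.0
    for _ in range(max_iter):
        s = lam + y @ y - 1.0
        D = np.concatenate((y - x + 2.0 * lam * y, [lam - chks(eps, s)]))
        if np.linalg.norm(D) <= tol:
            break
        a_t = 0.5 * (s / np.sqrt(4.0 * eps**2 + s**2) + 1.0)
        J = np.zeros((n + 1, n + 1))
        J[:n, :n] = (1.0 + 2.0 * lam) * np.eye(n)   # d/dy of first block
        J[:n, n] = 2.0 * y                          # d/dlam of first block
        J[n, :n] = -a_t * 2.0 * y                   # d/dy of second block
        J[n, n] = 1.0 - a_t                         # d/dlam of second block
        step = np.linalg.solve(J, -D)
        y, lam = y + step[:n], lam + step[n]
    return y

print(smoothed_projection_ball(1e-3, np.array([2.0, 0.0])))  # near Pi_X = [1, 0]
print(smoothed_projection_ball(1e-3, np.array([0.2, 0.0])))  # near x itself
```

For small $\varepsilon$, $y(\varepsilon, x)$ lies in the interior of $X$ and approaches $\Pi_X(x)$ as $\varepsilon \downarrow 0$, consistent with Proposition 3.4(ii) and the remark that $y(\varepsilon, x) \in \mathrm{int}\,X$ for $\varepsilon \neq 0$.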

References

[1] B. Chen and P.T. Harker, A non-interior-point continuation method for linear complementarity problems, SIAM Journal on Matrix Analysis and Applications, 14 (1993), 1168–1190.
[2] C. Chen and O.L. Mangasarian, A class of smoothing functions for nonlinear and mixed complementarity problems, Computational Optimization and Applications, 5 (1996), 97–138.
[3] X. Chen, L. Qi and D. Sun, Global and superlinear convergence of the smoothing Newton method and its application to general box constrained variational inequalities, Mathematics of Computation, 67 (1998), 519–540.
[4] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.
[5] B.C. Eaves, On the basic theorem of complementarity, Mathematical Programming, 1 (1971), 68–75.
[6] Y.M. Ermoliev, V.I. Norkin and R.J.-B. Wets, The minimization of semicontinuous functions: mollifier subgradients, SIAM Journal on Control and Optimization, 33 (1995), 149–167.
[7] F. Facchinei and J.S. Pang, Total stability of variational inequalities, Preprint, Department of Mathematical Sciences, Whiting School of Engineering, The Johns Hopkins University, Baltimore, Maryland 21218-2682, U.S.A., 1998.
[8] A. Fischer, Solution of monotone complementarity problems with locally Lipschitzian functions, Mathematical Programming, 76 (1997), 513–532.
[9] S.A. Gabriel and J.J. Moré, Smoothing of mixed complementarity problems, in Complementarity and Variational Problems: State of the Art, M.C. Ferris and J.S. Pang, eds., SIAM, Philadelphia, Pennsylvania, 1997, pp. 105–116.
[10] M.S. Gowda and G. Ravindran, Algebraic univalence theorems for nonsmooth functions, Research Report, Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, Maryland 21250, 1998.
[11] M.S. Gowda and R. Sznajder, Weak univalence and connectedness of inverse images of continuous functions, Mathematics of Operations Research, 24 (1999), 255–261.
[12] M.S. Gowda and M.A. Tawhid, Existence and limiting behavior of trajectories associated with P_0-equations, Computational Optimization and Applications, 12 (1999), 229–251.
[13] A.M. Gupal, On a method for the minimization of almost differentiable functions, Kibernetika (1977), 114–116.
[14] P.T. Harker and J.-S. Pang, Finite-dimensional variational inequality and nonlinear complementarity problems: A survey of theory, algorithms and applications, Mathematical Programming, 48 (1990), 161–220.

[15] J.-B. Hiriart-Urruty, Characterizations of the plenary hull of the generalized Jacobian matrix, Mathematical Programming Study, 17 (1982), 1–12.
[16] A.D. Ioffe, Nonsmooth analysis: differential calculus of nondifferentiable mappings, Transactions of the American Mathematical Society, 266 (1981), 1–56.
[17] H. Jiang and L. Qi, Local uniqueness and convergence of iterative methods for nonsmooth variational inequalities, Journal of Mathematical Analysis and Applications, 196 (1995), 314–331.
[18] C. Kanzow, Some noninterior continuation methods for linear complementarity problems, SIAM Journal on Matrix Analysis and Applications, 17 (1996), 851–868.
[19] B. Kummer, Newton's method for nondifferentiable functions, in Advances in Mathematical Optimization, Akademie-Verlag, Berlin, 1988, pp. 114–125.
[20] B. Kummer, Newton's method based on generalized derivatives for nonsmooth functions: convergence analysis, in Advances in Optimization, Springer, Berlin, 1992, pp. 171–194.
[21] D.Q. Mayne and E. Polak, Nondifferential optimization via adaptive smoothing, Journal of Optimization Theory and Applications, 43 (1984), 19–30.
[22] J.M. Ortega and W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
[23] J.-S. Pang and L. Qi, Nonsmooth equations: motivation and algorithms, SIAM Journal on Optimization, 3 (1993), 443–465.
[24] J.M. Peng and M. Fukushima, A hybrid Newton method for solving the variational inequality problem via the D-gap function, Mathematical Programming, 86 (1999), 367–386.
[25] D. Pu and J. Zhang, Inexact generalized Newton methods for second-order C-differentiable optimization, Journal of Computational and Applied Mathematics, 93 (1998), 107–122.
[26] L. Qi, Convergence analysis of some algorithms for solving nonsmooth equations, Mathematics of Operations Research, 18 (1993), 227–244.
[27] L. Qi and X. Chen, A globally convergent successive approximation method for severely nonsmooth equations, SIAM Journal on Control and Optimization, 33 (1995), 402–418.
[28] L. Qi, D. Ralph and G. Zhou, Semiderivative functions and reformulation methods for solving complementarity and variational problems, to appear in Nonlinear Optimization and Applications 2, G. Di Pillo and F. Giannessi, eds., Kluwer Academic Publishers.
[29] L. Qi and D. Sun, A survey of some nonsmooth equations and smoothing Newton methods, in Progress in Optimization: Contributions from Australasia, A. Eberhard, B. Glover, R. Hill and D. Ralph, eds., Kluwer Academic Publishers, 1999, pp. 121–146.

[30] L. Qi and D. Sun, Smoothing functions and a smoothing Newton method for complementarity and variational inequality problems, Applied Mathematics Report AMR 98/24, School of Mathematics, The University of New South Wales, October 1998.
[31] L. Qi, D. Sun and G. Zhou, A new look at smoothing Newton methods for nonlinear complementarity problems and box constrained variational inequalities, Mathematical Programming, 87 (2000), 1–35.
[32] L. Qi and J. Sun, A nonsmooth version of Newton's method, Mathematical Programming, 58 (1993), 353–367.
[33] S.M. Robinson, Normal maps induced by linear transformations, Mathematics of Operations Research, 17 (1992), 691–714.
[34] G. Ravindran and M.S. Gowda, Regularization of P_0-functions in box variational inequality problems, to appear in SIAM Journal on Optimization.
[35] L. Schwartz, Théorie des Distributions, Hermann, Paris, 1966.
[36] S. Smale, Algorithms for solving equations, in Proceedings of the International Congress of Mathematicians, Berkeley, California, 1986, pp. 172–195.
[37] S.L. Sobolev, Some Applications of Functional Analysis in Mathematical Physics, 3rd edition, Nauka, Moscow, 1988.
[38] D. Sun, M. Fukushima and L. Qi, A computable generalized Hessian of the D-gap function and Newton-type methods for variational inequality problems, in Complementarity and Variational Problems: State of the Art, M.C. Ferris and J.-S. Pang, eds., SIAM, Philadelphia, 1997, pp. 452–473.
[39] T.H. Sweetser, A set-valued strong derivative for vector-valued Lipschitz functions, Journal of Optimization Theory and Applications, 23 (1977), 549–562.
[40] H. Xu, Deterministically computable generalized Jacobians and Newton's methods, Preprint, School of Information Technology and Mathematical Sciences, University of Ballarat, Victoria, VIC 3353, Australia, 1998.
[41] H. Xu and X. Chang, Approximate Newton methods for nonsmooth equations, Journal of Optimization Theory and Applications, 93 (1997), 373–394.
[42] E.H. Zarantonello, Projections on convex sets in Hilbert space and spectral theory, in Contributions to Nonlinear Functional Analysis, E.H. Zarantonello, ed., Academic Press, New York, 1971, pp. 237–424.