Nonnegative Inverse Eigenvalue Problems with Partial Eigendata


Zheng-Jian Bai    Stefano Serra-Capizzano    Zhi Zhao

June 25, 2011

Bai: School of Mathematical Sciences, Xiamen University, Xiamen, People's Republic of China (zjbai@xmu.edu.cn). The research of this author was partially supported by the Natural Science Foundation of Fujian Province of China for Distinguished Young Scholars (No. 2010J06002), NCET, and the Internationalization Grant of U. Insubria 2008.
Serra-Capizzano: Dipartimento di Fisica e Matematica, Università dell'Insubria - Sede di Como, Via Valleggio 11, Como, Italy (stefano.serrac@uninsubria.it). The work of this author was partially supported by MIUR (No. KLJEZ).
Zhao: School of Mathematical Sciences, Xiamen University, Xiamen, People's Republic of China (zhaozhi231@163.com).

Abstract

In this paper we consider the inverse problem of constructing an n-by-n real nonnegative matrix A from prescribed partial eigendata. We first give the solvability conditions for the inverse problem without the nonnegativity constraint and then discuss the associated best approximation problem. To find a nonnegative solution, we reformulate the inverse problem as a monotone complementarity problem and propose a nonsmooth Newton-type method for solving its equivalent nonsmooth equation. Under some mild assumptions, the global and quadratic convergence of our method is established. We also apply our method to the symmetric nonnegative inverse problem and to the cases of prescribed lower bounds and of prescribed entries. Numerical tests demonstrate the efficiency of the proposed method and support our theoretical findings.

Keywords. Nonnegative matrix, inverse problem, monotone complementarity problem, generalized Newton's method, semismooth function.

AMS subject classifications. 49J52, 49M15, 65F18, 90C33

1 Introduction

Inverse eigenvalue problems (IEPs) arise in a wide variety of applications such as structural dynamics, control design, system identification, seismic tomography, remote sensing, geophysics, particle physics, and circuit theory. For the applications, mathematical properties, and algorithmic aspects of general IEPs, we refer to [8, 9, 12, 22, 24, 47] and the references therein.

An m-by-n matrix $M \ge 0$ ($M > 0$, respectively) is called nonnegative (strictly positive, respectively) if $M_{ij} \ge 0$ ($M_{ij} > 0$, respectively) for all $i = 1,\ldots,m$ and $j = 1,\ldots,n$. Nonnegative matrices play an important role in many applications such as game theory, Markov chains, probabilistic algorithms, numerical analysis, discrete distributions, categorical data, group theory, matrix scaling, and economics. One may refer to [1, 2, 29, 40] for the applications and mathematical properties of nonnegative matrices. The nonnegative inverse eigenvalue problem has received much attention since the 1940s (see the survey papers [8, 15] and the references therein). Most of the existing works determine necessary and sufficient conditions such that a given complete set of complex numbers is the spectrum of a nonnegative matrix. Recently, despite the theoretical incompleteness of the nonnegative inverse problem, a few numerical algorithms have been developed for computational purposes, including the isospectral flow method [5, 6, 7, 10] and the alternating projection method [31].

In this paper, we consider the inverse problem of constructing a real nonnegative matrix from given partial eigendata. The canonical problem can be stated as follows.

NIEP. Construct a nontrivial n-by-n nonnegative matrix $A$ from a set of measured partial eigendata $\{(\lambda_k, x_k)\}_{k=1}^p$ ($p \le n$).

In practice, the entries of a nonnegative matrix represent distributed physical parameters such as mass, length, elasticity, inductance, capacitance, and so on. Moreover, an a priori analytical nonnegative matrix $A_a$ can often be obtained by finite element techniques. However, the predicted dynamical behaviors (eigenvalues and eigenvectors) of the analytical nonnegative matrix $A_a$ often disagree with the experimental eigendata [22]. The inverse problem aims to reconstruct a physically feasible nonnegative matrix from the measured eigendata.

In this paper, we first give the solvability conditions for the NIEP without the nonnegativity constraint, and then we study the corresponding best approximation problem with respect to the analytical nonnegative matrix $A_a$. To find a physical solution to the NIEP, we reformulate the NIEP as a monotone complementarity problem (MCP) and propose a nonsmooth Newton-type method for solving its equivalent nonsmooth equation. In fact, we are motivated by recent developments of Newton-type methods for MCPs [13, 16, 18, 19, 23, 25, 26, 48]. Under some mild conditions, we show that the proposed method converges globally and quadratically. We also apply our method to the symmetric nonnegative inverse problem and to the cases of prescribed lower bounds and of prescribed entries. Numerical tests are also reported, in order to illustrate the efficiency of our method and to confirm our theoretical results.

Throughout the paper, we use the following notation. The symbols $M^T$, $M^H$, and $M^\dagger$ denote the transpose, the conjugate transpose, and the Moore-Penrose inverse of a matrix $M$, respectively. $I$ is the identity matrix of appropriate dimension. $\|\cdot\|$ denotes the Euclidean vector norm and the Frobenius matrix norm, and $V^*$ denotes the adjoint of an operator $V$. The symbols $\mathbb{R}^{m\times n}_+$ and $\mathbb{R}^{m\times n}_{++}$ stand for the nonnegative orthant and the strictly positive orthant of $\mathbb{R}^{m\times n}$, respectively. Given $N := \{(i,j) \mid i,j = 1,\ldots,n\}$, the index sets $I, J \subseteq N$ are such that $J = N \setminus I$. Let $|I|$ be the cardinality of the index set $I$. For a matrix $M \in \mathbb{R}^{n\times n}$, $M_I$ is the column vector with entries $M_{ij}$ for all $(i,j) \in I$. Define the linear operator $P : \mathbb{R}^{|I|} \to \mathbb{R}^{n\times n}$ by

  $P_{ij}(M_I) := \begin{cases} M_{ij}, & \text{if } (i,j) \in I, \\ 0, & \text{otherwise.} \end{cases}$

The paper is organized as follows. In Section 2 we study the solvability of the NIEP by neglecting the nonnegativity requirement, and we discuss the associated best approximation problem with respect to an a priori analytical nonnegative matrix. In Section 3 we reformulate the NIEP as an MCP and we propose a Newton-type method for solving its equivalent nonsmooth equation; under some mild conditions, the global and quadratic convergence of our method is established. In Section 4 we discuss the application of our method to specific important cases. In Section 5 we report some numerical results to illustrate the efficiency of the proposed method.
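To make the index-set notation above concrete, here is a minimal Python/NumPy sketch (our illustration, not part of the original paper; the function names are ours) of the vector $M_I$ and the operator $P$:

```python
import numpy as np

def extract(M, I):
    """Return the column vector M_I with entries M_ij for (i, j) in I."""
    return np.array([M[i, j] for (i, j) in I])

def P(v, I, n):
    """Linear operator P: R^|I| -> R^{n x n}; places v on I, zero elsewhere."""
    A = np.zeros((n, n))
    for k, (i, j) in enumerate(I):
        A[i, j] = v[k]
    return A

n = 3
M = np.arange(9.0).reshape(n, n)
I = [(0, 0), (1, 2), (2, 1)]          # a sample index set (0-based here)
v = extract(M, I)
assert np.allclose(extract(P(v, I, n), I), v)   # P reproduces M on I
```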

2 Solvability and Best Approximation

In this section, we give a necessary and sufficient condition for the solvability of the NIEP without the nonnegativity constraint, and then we discuss the associated best approximation problem with respect to an a priori analytical nonnegative matrix $A_a$.

We note that the complex eigenvalues and eigenvectors of a nonnegative matrix $A \in \mathbb{R}^{n\times n}$ appear in complex conjugate pairs. If $a \pm b\sqrt{-1}$ and $x \pm y\sqrt{-1}$ are complex conjugate eigenpairs of $A$, where $a, b \in \mathbb{R}$ and $x, y \in \mathbb{R}^n$, then we have $Ax = ax - by$ and $Ay = bx + ay$, or

  $A[x, y] = [x, y] \begin{bmatrix} a & b \\ -b & a \end{bmatrix}.$

Therefore, without loss of generality, we can assume

  $\Lambda = \mathrm{diag}(\lambda_1^{[2]}, \ldots, \lambda_s^{[2]}, \lambda_{2s+1}, \ldots, \lambda_p) \in \mathbb{R}^{p\times p}$

and

  $X = [x_{1R}, x_{1I}, \ldots, x_{sR}, x_{sI}, x_{2s+1}, \ldots, x_p] \in \mathbb{R}^{n\times p},$

where

  $\lambda_i^{[2]} = \begin{bmatrix} a_i & b_i \\ -b_i & a_i \end{bmatrix}, \quad \lambda_{2i-1,2i} = a_i \pm b_i\sqrt{-1}, \quad x_{2i-1,2i} = x_{iR} \pm x_{iI}\sqrt{-1}, \quad a_i, b_i \in \mathbb{R}, \ x_{iR}, x_{iI} \in \mathbb{R}^n,$

for all $1 \le i \le s$. By neglecting the nonnegativity constraint, the NIEP reduces to the following problem:

Problem 1. Construct a nontrivial n-by-n real matrix $A$ from the measured eigendata $(\Lambda, X) \in \mathbb{R}^{p\times p} \times \mathbb{R}^{n\times p}$.

To study the solvability of Problem 1, we need the following preliminary lemma.

Lemma 2.1 [45, Lemma 1.3] Given $B_1 \in \mathbb{C}^{q\times m}$, $B_2 \in \mathbb{C}^{n\times t}$, $B_3 \in \mathbb{C}^{q\times t}$, and $Y_a \in \mathbb{C}^{m\times n}$, define

  $E := \{Y \in \mathbb{C}^{m\times n} : B_1 Y B_2 = B_3\}.$

Then $E \neq \emptyset$ if and only if $B_1, B_2, B_3$ satisfy

  $B_1 B_1^\dagger B_3 B_2^\dagger B_2 = B_3.$

In the case of $E \neq \emptyset$, any $Y \in E$ can be expressed as

  $Y = B_1^\dagger B_3 B_2^\dagger + G - B_1^\dagger B_1 G B_2 B_2^\dagger,$

where $G \in \mathbb{C}^{m\times n}$ is arbitrary. Moreover, there is a unique matrix $Y^* \in E$, given by

  $Y^* = B_1^\dagger B_3 B_2^\dagger + Y_a - B_1^\dagger B_1 Y_a B_2 B_2^\dagger,$

such that, for any unitarily invariant norm,

  $\|Y^* - Y_a\| = \min_{Y \in E} \|Y - Y_a\|.$

Based on Lemma 2.1, we easily derive the following result on the solvability of Problem 1 and its approximation.

Theorem 2.2 Problem 1 has a solution if and only if

  $X \Lambda X^\dagger X = X \Lambda.$

In this case, the general solution to Problem 1 is given by

  $A = X \Lambda X^\dagger + G(I - X X^\dagger),$

where $G \in \mathbb{R}^{n\times n}$ is arbitrary. Moreover, for an a priori nonnegative matrix $A_a \in \mathbb{R}^{n\times n}$, there exists a unique solution $A^*$ to Problem 1, given by

  $A^* = X \Lambda X^\dagger + A_a(I - X X^\dagger),$

such that

  $\|A^* - A_a\| = \min_{A \in \mathcal{S}} \|A - A_a\|,$

where $\mathcal{S}$ is the solution set of Problem 1.

Remark 2.3 We note from Theorem 2.2 that Problem 1 is solvable if $X$ has full column rank. Furthermore, we can find the unique solution in the solution set of Problem 1 which is nearest to a fixed a priori nonnegative matrix and satisfies the given eigendata, though it may not be physically feasible.
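The formulas in Theorem 2.2 translate directly into a few lines of NumPy. The following sketch (our illustration; the function and variable names are ours) checks the solvability condition and forms the minimum-norm solution or the best approximation to a given $A_a$:

```python
import numpy as np

def niep_unconstrained(Lam, X, Aa=None, tol=1e-10):
    """Solve Problem 1: find A with A X = X Lam (nonnegativity ignored).

    Lam: (p, p) real block-diagonal eigenvalue matrix, X: (n, p) eigenvectors.
    Returns the best approximation to Aa (Theorem 2.2), or the
    minimum-norm solution (G = 0) if Aa is None.
    """
    n = X.shape[0]
    Xp = np.linalg.pinv(X)                      # Moore-Penrose inverse X^+
    # Solvability condition: X Lam X^+ X = X Lam
    if np.linalg.norm(X @ Lam @ Xp @ X - X @ Lam) > tol * (1 + np.linalg.norm(X @ Lam)):
        raise ValueError("eigendata fail the solvability condition of Theorem 2.2")
    if Aa is None:
        return X @ Lam @ Xp
    return X @ Lam @ Xp + Aa @ (np.eye(n) - X @ Xp)

# Consistency check on eigendata drawn from a random nonnegative matrix.
rng = np.random.default_rng(0)
Ahat = rng.random((6, 6))
w, V = np.linalg.eig(Ahat)
k = np.argmax(w.real)                           # Perron eigenpair (real)
Lam, X = np.array([[w[k].real]]), V[:, [k]].real
A = niep_unconstrained(Lam, X, Aa=rng.random((6, 6)))
assert np.allclose(A @ X, X @ Lam)
```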

For the sake of clarity, we give two numerical examples.

Example 2.4 Let $n = 6$. We randomly generate an n-by-n nonnegative matrix $\widehat{A}$:

  $\widehat{A} = $ [entries not recovered]

The eigenvalues of $\widehat{A}$ are $\lambda_1$, the complex conjugate pair $\lambda_{2,3}$, $\lambda_4$, and the complex conjugate pair $\lambda_{5,6}$ [numerical values not recovered]. We use the first three eigenvalues $\{\lambda_i\}_{i=1}^3$ and the associated eigenvectors $\{x_i\}_{i=1}^3$ as the prescribed eigendata.

For Example 2.4, we can easily check that the given eigendata satisfy the solvability condition in Theorem 2.2. Therefore, Problem 1 is solvable, and the minimum-norm solution is given by

  $A_{\min} = $ [entries not recovered]

Suppose that an a priori analytical nonnegative matrix $A_a$ takes the form

  $A_a = $ [entries not recovered]

Then the best approximation is given by

  $A^* = $ [entries not recovered]

We observe that a physically realizable solution is obtained.

Example 2.5 Let $\widehat{A}$ be a random tridiagonal nonnegative matrix of order $n = 6$:

  $\widehat{A} = $ [entries not recovered]

The eigenvalues of $\widehat{A}$ are given by $\{5.6126, \ldots\}$ [remaining values not recovered]. We use the first two eigenvalues $\{\lambda_i\}_{i=1}^2$ and the associated eigenvectors $\{x_i\}_{i=1}^2$ as the prescribed eigendata.

For Example 2.5, the condition of Theorem 2.2 holds and thus Problem 1 is solvable. In particular, the minimum-norm solution is given by

  $A_{\min} = $ [entries not recovered]

Suppose that an a priori analytical symmetric tridiagonal oscillatory matrix $A_a$ takes the following form:

  $A_a = $ [entries not recovered]

Then the best approximation is given by

  $A^* = $ [entries not recovered]

We see that the best approximation is not physically realizable, but the tridiagonal entries are all positive and the off-tridiagonal entries are relatively small.

3 The Nonnegative Inverse Eigenvalue Problem with Partial Eigendata

In this section, we reformulate the NIEP as an MCP and propose a generalized Newton-type method for solving an equivalent nonsmooth equation to the MCP. In the following subsections, we first review some preliminary definitions and basic results for nonlinear nonnegative programming which are used later in this paper. Then, we present a nonsmooth Newton-type method for solving the NIEP and give the convergence analysis.

3.1 Preliminaries

Let $\mathcal{X}$, $\mathcal{Y}$, $\mathcal{Z}$ be three finite-dimensional real vector spaces, each equipped with a scalar inner product $\langle \cdot, \cdot \rangle$ and the related induced norm $\|\cdot\|$. Sparked by the concepts of the strong second-order sufficient condition and constraint nondegeneracy in nonlinear semidefinite programming [43] and by the differential properties of the metric projector over the semidefinite cone [32], in this subsection we briefly discuss some analogous definitions and basic properties for the following nonlinear nonnegative programming problem (NLNNP):

  $\min\ f(x) \quad \text{s.t.} \quad h(x) = 0, \ \ g(x) \in \mathbb{R}^{n\times n}_+,$

where $f : \mathcal{X} \to \mathbb{R}$, $h : \mathcal{X} \to \mathbb{R}^q$, and $g : \mathcal{X} \to \mathbb{R}^{n\times n}$ are all twice continuously differentiable functions. Define the Lagrangian function $l : \mathcal{X} \times \mathbb{R}^q \times \mathbb{R}^{n\times n} \to \mathbb{R}$ by

  $l(x, y, \Psi) = f(x) - \langle y, h(x) \rangle - \langle \Psi, g(x) \rangle, \quad (x, y, \Psi) \in \mathcal{X} \times \mathbb{R}^q \times \mathbb{R}^{n\times n}.$

Then the Karush-Kuhn-Tucker (KKT) condition for the NLNNP is given by

  $J_x l(x, y, \Psi) = 0, \quad h(x) = 0, \quad \text{and} \quad \Psi \in N_{\mathbb{R}^{n\times n}_+}(g(x)), \qquad (1)$

where $J_x l(x, y, \Psi)$ is the derivative of $l$ at $(x, y, \Psi)$ with respect to $x \in \mathcal{X}$ and $N_{\mathbb{R}^{n\times n}_+}(a)$ is the normal cone of $\mathbb{R}^{n\times n}_+$ at the point $a$, defined by ([37])

  $N_{\mathbb{R}^{n\times n}_+}(a) = \begin{cases} \{c \in \mathbb{R}^{n\times n} : \langle c, b - a \rangle \le 0 \ \ \forall\, b \in \mathbb{R}^{n\times n}_+\}, & \text{if } a \in \mathbb{R}^{n\times n}_+, \\ \emptyset, & \text{otherwise.} \end{cases}$

Any point $(x, y, \Psi)$ satisfying (1) is called a KKT point of the NLNNP, and the corresponding point $x$ is called a stationary point of the NLNNP.

Suppose that $x$ is a stationary point of the NLNNP. Then there exists a point $(y, \Psi)$ such that $(x, y, \Psi)$ satisfies the KKT condition (1). From [14], it follows that

  $\Psi \in N_{\mathbb{R}^{n\times n}_+}(g(x)) \iff g(x) = \Pi_{\mathbb{R}^{n\times n}_+}(g(x) + \Psi).$

Hence, the KKT condition (1) can be rewritten as

  $H(x, y, \Psi) := \begin{bmatrix} J_x l(x, y, \Psi) \\ h(x) \\ g(x) - \Pi_{\mathbb{R}^{n\times n}_+}(g(x) + \Psi) \end{bmatrix} = 0, \qquad (2)$

where $\Pi_{\mathbb{R}^{n\times n}_+}(\cdot)$ denotes the metric projection onto $\mathbb{R}^{n\times n}_+$.

We note that we cannot directly use Newton's method to solve (2), since $\Pi_{\mathbb{R}^{n\times n}_+}(\cdot)$ is not continuously differentiable everywhere. Fortunately, one may employ the nonsmooth Newton method for solving (2). To do so, we need the concept of Clarke's generalized Jacobian. We first recall the definitions of Fréchet differentiability and directional differentiability.

Definition 3.1 Let $\Upsilon : \mathcal{X} \to \mathcal{Y}$ and $x \in \mathcal{X}$.

1) If there exists a linear operator, denoted by $J_x \Upsilon$, such that

  $\lim_{h \to 0} \frac{\|\Upsilon(x + h) - \Upsilon(x) - J_x \Upsilon(h)\|}{\|h\|} = 0,$

then $\Upsilon$ is Fréchet-differentiable at $x$ and $J_x \Upsilon$ is the F-derivative of $\Upsilon$ at $x$.

2) Let $h \in \mathcal{X}$. We define the directional derivative $\Upsilon'(x; h)$ of $\Upsilon$ at $x$ by

  $\Upsilon'(x; h) = \lim_{t \downarrow 0} \frac{\Upsilon(x + th) - \Upsilon(x)}{t}.$

$\Upsilon$ is said to be directionally differentiable at $x$ if $\Upsilon'(x; h)$ exists for all $h \in \mathcal{X}$.

We now recall the definition of Clarke's generalized Jacobian [11]. Let $D$ be an open set in $\mathcal{Y}$ and let $\Xi : D \to \mathcal{Z}$ be a locally Lipschitz continuous function on $D$. By Rademacher's theorem [38, Chap. 9.J], $\Xi$ is Fréchet differentiable almost everywhere in $D$. Clarke's generalized Jacobian of $\Xi$ at $y \in D$ is then defined by

  $\partial \Xi(y) := \mathrm{conv}\{\partial_B \Xi(y)\},$

where conv denotes the convex hull and

  $\partial_B \Xi(y) = \{\lim_{j\to\infty} J_{y^j} \Xi(y^j) : y^j \to y, \ \Xi \text{ is Fréchet differentiable at } y^j \in D\}.$

On the generalized Jacobian of composite functions, we have the following result [43, Lemma 2.1].

Lemma 3.2 Let $\Upsilon : \mathcal{X} \to \mathcal{Y}$ be continuously differentiable on an open neighborhood $\mathcal{B}(\bar{x})$ of $\bar{x}$ and let $\Xi : \mathcal{Y} \to \mathcal{Z}$ be locally Lipschitz continuous on an open set $D$ containing $\bar{y} := \Upsilon(\bar{x})$. Suppose that $\Xi$ is directionally differentiable at every point in $D$ and that $J_x \Upsilon(\bar{x}) : \mathcal{X} \to \mathcal{Y}$ is onto. Define $\Theta : \mathcal{B}(\bar{x}) \to \mathcal{Z}$ by $\Theta(x) := \Xi(\Upsilon(x))$, $x \in \mathcal{B}(\bar{x})$. Then

  $\partial_B \Theta(\bar{x}) = \partial_B \Xi(\bar{y}) J_x \Upsilon(\bar{x}).$

Next, we review the properties of the metric projection onto a closed convex set. Let $K \subseteq \mathcal{Z}$ be a closed convex set. For any $z \in \mathcal{Z}$, let $\Pi_K(z)$ be the metric projection of $z$ onto $K$. Then we have the following result on $\partial \Pi_K(z)$ ([27, Proposition 1]).

Lemma 3.3 Let $K \subseteq \mathcal{Z}$ be a closed convex set. Then, for any $z \in \mathcal{Z}$ and $V \in \partial \Pi_K(z)$:

i) $V$ is self-adjoint;

ii) $\langle d, V d \rangle \ge 0$ for all $d \in \mathcal{Z}$;

iii) $\langle V d, d - V d \rangle \ge 0$ for all $d \in \mathcal{Z}$.

For $H : \mathcal{X} \times \mathbb{R}^q \times \mathbb{R}^{n\times n} \to \mathcal{X} \times \mathbb{R}^q \times \mathbb{R}^{n\times n}$ defined in (2), by Lemma 3.2 we can easily derive the following result.

Proposition 3.4 Let $(\bar{x}, \bar{y}, \bar{\Psi}) \in \mathcal{X} \times \mathbb{R}^q \times \mathbb{R}^{n\times n}$. Then, for any $(\Delta x, \Delta y, \Delta \Psi) \in \mathcal{X} \times \mathbb{R}^q \times \mathbb{R}^{n\times n}$, we have

  $\partial H(\bar{x}, \bar{y}, \bar{\Psi})(\Delta x, \Delta y, \Delta \Psi) = \begin{bmatrix} J^2_{xx} l(\bar{x}, \bar{y}, \bar{\Psi})\Delta x - J_x h(\bar{x})^* \Delta y - J_x g(\bar{x})^* \Delta \Psi \\ J_x h(\bar{x})\Delta x \\ J_x g(\bar{x})\Delta x - \partial \Pi_{\mathbb{R}^{n\times n}_+}(g(\bar{x}) + \bar{\Psi})(J_x g(\bar{x})\Delta x + \Delta \Psi) \end{bmatrix}.$

From Proposition 3.4, we know that Clarke's generalized Jacobian of $H$ is given in terms of Clarke's generalized Jacobian of $\Pi_{\mathbb{R}^{n\times n}_+}(\cdot)$. In the following, we characterize Clarke's generalized Jacobian of $\Pi_{\mathbb{R}^{n\times n}_+}(\cdot)$.

Let $\mathbb{R}^{n\times n}$ be equipped with the Frobenius inner product $\langle \cdot, \cdot \rangle$, i.e.,

  $\langle B_1, B_2 \rangle = \mathrm{tr}(B_1^T B_2) \quad \forall\, B_1, B_2 \in \mathbb{R}^{n\times n},$

where tr denotes the trace of a matrix. Under the Frobenius inner product, the projection $C_+ := \Pi_{\mathbb{R}^{n\times n}_+}(C)$ of a matrix $C \in \mathbb{R}^{n\times n}$ onto $\mathbb{R}^{n\times n}_+$ satisfies the following complementarity condition:

  $\mathbb{R}^{n\times n}_+ \ni C_+ \perp (C_+ - C) \in \mathbb{R}^{n\times n}_+, \qquad (3)$

where, for any two matrices $C_1, C_2 \in \mathbb{R}^{n\times n}$, $C_1 \perp C_2$ means $\langle C_1, C_2 \rangle = 0$.
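Entrywise, $\Pi_{\mathbb{R}^{n\times n}_+}$ is simply $\max(\cdot, 0)$, so (3) can be verified numerically in a few lines. A small NumPy sketch (our illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
C = rng.standard_normal((4, 4))       # an arbitrary test matrix
C_plus = np.maximum(C, 0.0)           # metric projection onto R^{n x n}_+

# The three parts of the complementarity condition (3):
assert (C_plus >= 0).all()                         # C_+ in R^{n x n}_+
assert (C_plus - C >= 0).all()                     # C_+ - C in R^{n x n}_+
assert abs(np.sum(C_plus * (C_plus - C))) < 1e-14  # <C_+, C_+ - C> = 0 (Frobenius)
```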

Define three index sets:

  $\alpha := \{(i,j) \mid C_{ij} > 0\}, \quad \beta := \{(i,j) \mid C_{ij} = 0\}, \quad \gamma := \{(i,j) \mid C_{ij} < 0\}.$

Then

  $C = P(C_\alpha) + P(C_\beta) + P(C_\gamma). \qquad (4)$

Define the matrix $U(C) \in \mathbb{R}^{n\times n}$ by

  $U_{ij}(C) := \frac{\max\{C_{ij}, 0\}}{C_{ij}}, \quad i, j = 1, \ldots, n, \qquad (5)$

where $0/0$ is defined to be $1$. It is easy to check that, for any $H \in \mathbb{R}^{n\times n}$, the directional derivative $\Pi'_{\mathbb{R}^{n\times n}_+}(C; H)$ is given by

  $\Pi'_{\mathbb{R}^{n\times n}_+}(C; H) = P(U_{\alpha\cup\gamma}(C) \circ H_{\alpha\cup\gamma}) + P(\Pi_{\mathbb{R}^{|\beta|}_+}(H_\beta)), \qquad (6)$

where $\circ$ denotes the Hadamard product. Based on the previous analysis, we can easily derive the following differential properties of $\Pi_{\mathbb{R}^{n\times n}_+}(\cdot)$.

Proposition 3.5 Let the absolute value function $|\cdot|$ on $\mathbb{R}^{m\times n}$ be defined by

  $|C| := \Pi_{\mathbb{R}^{m\times n}_+}(C) + \Pi_{\mathbb{R}^{m\times n}_+}(-C), \quad C \in \mathbb{R}^{m\times n}.$

(a) $|\cdot|$ and $\Pi_{\mathbb{R}^{n\times n}_+}(\cdot)$ are F-differentiable at $C \in \mathbb{R}^{n\times n}$ if and only if $C$ has no zero entry. In this case, $J_C \Pi_{\mathbb{R}^{n\times n}_+}(C)(H) = U(C) \circ H$, where $U(C)$ is defined in (5).

(b) For any $C \in \mathbb{R}^{n\times n}$, $\Pi'_{\mathbb{R}^{n\times n}_+}(C; \cdot)$ is F-differentiable at $H \in \mathbb{R}^{n\times n}$ if and only if $H_\beta$ has no zero entry.

(c) For any $C, H \in \mathbb{R}^{n\times n}$, $\Pi'_{\mathbb{R}^{n\times n}_+}(C; H)$ is given by (6).
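Formulas (5) and (6) are cheap to evaluate entrywise. The following NumPy sketch (an illustration we add; the names are ours) computes $U(C)$ and the directional derivative $\Pi'(C; H)$, and checks it against a finite difference at a point with no zero entries:

```python
import numpy as np

def U(C):
    """Matrix U(C) from (5): max(C_ij, 0)/C_ij, with 0/0 := 1."""
    out = np.ones_like(C)                 # beta entries (C_ij = 0) get 1
    nz = C != 0
    out[nz] = np.maximum(C[nz], 0.0) / C[nz]
    return out

def proj_dd(C, H):
    """Directional derivative (6) of the projection onto R^{n x n}_+ at C along H."""
    D = U(C) * H                          # Hadamard product on alpha and gamma
    D[C == 0] = np.maximum(H[C == 0], 0)  # projection of H_beta on the beta block
    return D

# Where C has no zero entry, (6) reduces to the F-derivative U(C) o H (Prop. 3.5(a)).
rng = np.random.default_rng(2)
C, H = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
t = 1e-8
fd = (np.maximum(C + t * H, 0) - np.maximum(C, 0)) / t   # finite-difference check
assert np.allclose(proj_dd(C, H), fd, atol=1e-6)
```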

To characterize Clarke's generalized Jacobian of $\Pi_{\mathbb{R}^{n\times n}_+}(\cdot)$, we need the following lemma.

Lemma 3.6 For any $C \in \mathbb{R}^{n\times n}$, define $\Theta(\cdot) := \Pi'_{\mathbb{R}^{n\times n}_+}(C; \cdot)$. Then

  $\partial_B \Pi_{\mathbb{R}^{n\times n}_+}(C) = \partial_B \Theta(0).$

Proof: Let $V \in \partial_B \Pi_{\mathbb{R}^{n\times n}_+}(C)$. By Proposition 3.5 and the definition of the elements in $\partial_B \Pi_{\mathbb{R}^{n\times n}_+}(C)$, there exists a sequence of no-zero-entry matrices $\{C^k\}$ in $\mathbb{R}^{n\times n}$ converging to $C$ such that $V = \lim_{k\to\infty} J_{C^k}\Pi_{\mathbb{R}^{n\times n}_+}(C^k)$. Notice that $C = \lim_k C^k$ implies that, for all $k$ large enough, $C^k_{\alpha\cup\gamma}$ has no zero entry and $\lim_k C^k_\beta = 0$. For any $H \in \mathbb{R}^{n\times n}$, we have

  $Q^k := J_{C^k}\Pi_{\mathbb{R}^{n\times n}_+}(C^k)(H) = U(C^k) \circ H,$

which implies $C^k \circ Q^k = (C^k)_+ \circ H$, i.e.,

  $P(C^k_{\alpha\cup\gamma} \circ Q^k_{\alpha\cup\gamma}) + P(C^k_\beta \circ Q^k_\beta) = P((C^k_{\alpha\cup\gamma})_+ \circ H_{\alpha\cup\gamma}) + P((C^k_\beta)_+ \circ H_\beta).$

Thus, $Q^k_{\alpha\cup\gamma} = U_{\alpha\cup\gamma}(C^k) \circ H_{\alpha\cup\gamma}$ and $Q^k_\beta = U_\beta(C^k) \circ H_\beta$. By taking a subsequence if necessary, we may assume that $\{Q^k_\beta\}$ is a convergent sequence. Then, for any $H \in \mathbb{R}^{n\times n}$, we deduce that

  $V(H) = P(U_{\alpha\cup\gamma}(C) \circ H_{\alpha\cup\gamma}) + P\big(\lim_{k\to\infty}\{U_\beta(C^k) \circ H_\beta\}\big). \qquad (7)$

Let $R^k := P(C^k_\beta)$. Since $C^k_\beta$ has no zero entry, $\Theta$ is F-differentiable at $R^k$ and, for any $H \in \mathbb{R}^{n\times n}$,

  $J_{R^k}\Theta(R^k)(H) = \lim_{t\downarrow 0} \frac{\Theta(R^k + tH) - \Theta(R^k)}{t} = P(U_{\alpha\cup\gamma}(C) \circ H_{\alpha\cup\gamma}) + P\Big(\lim_{t\downarrow 0} \frac{\Pi_{\mathbb{R}^{|\beta|}_+}(C^k_\beta + tH_\beta) - \Pi_{\mathbb{R}^{|\beta|}_+}(C^k_\beta)}{t}\Big) = P(U_{\alpha\cup\gamma}(C) \circ H_{\alpha\cup\gamma}) + P(U_\beta(C^k) \circ H_\beta),$

where the second equality uses (6) and the third equality uses Proposition 3.5 (a) applied to $\Pi_{\mathbb{R}^{|\beta|}_+}$ at the F-differentiable point $C^k_\beta$. The latter, together with (7), implies that $V(H) = \lim_{k\to\infty} J_{R^k}\Theta(R^k)(H)$. By the arbitrariness of $H \in \mathbb{R}^{n\times n}$, we have $V \in \partial_B \Theta(0)$.

Conversely, let $V \in \partial_B \Theta(0)$. Notice that $\Theta$ is F-differentiable at $R \in \mathbb{R}^{n\times n}$ if and only if $R_\beta$ has no zero entry. Then there exists a sequence of matrices $\{R^k\}$ in $\mathbb{R}^{n\times n}$ converging to $0$ such that $R^k_\beta$ has no zero entry for every $k$ and $V = \lim_{k\to\infty} J_{R^k}\Theta(R^k)$. For any $H \in \mathbb{R}^{n\times n}$, we infer that

  $J_{R^k}\Theta(R^k)(H) = \lim_{t\downarrow 0} \frac{\Theta(R^k + tH) - \Theta(R^k)}{t} = P(U_{\alpha\cup\gamma}(C) \circ H_{\alpha\cup\gamma}) + P\Big(\lim_{t\downarrow 0} \frac{\Pi_{\mathbb{R}^{|\beta|}_+}(R^k_\beta + tH_\beta) - \Pi_{\mathbb{R}^{|\beta|}_+}(R^k_\beta)}{t}\Big) = P(U_{\alpha\cup\gamma}(C) \circ H_{\alpha\cup\gamma}) + P(U_\beta(R^k) \circ H_\beta).$

Let $C^k := C + P(R^k_\beta) = P(C_{\alpha\cup\gamma}) + P(R^k_\beta)$. It is obvious that $C^k$ has no zero entry for every $k$, and

  $|C^k| = P(|C_{\alpha\cup\gamma}|) + P(|R^k_\beta|) \quad \text{and} \quad (C^k)_+ = P((C_{\alpha\cup\gamma})_+) + P(\Pi_{\mathbb{R}^{|\beta|}_+}(R^k_\beta)).$

Hence, $\Pi_{\mathbb{R}^{n\times n}_+}$ is differentiable at $C^k$ and, for any $H \in \mathbb{R}^{n\times n}$,

  $Q^k := J_{C^k}\Pi_{\mathbb{R}^{n\times n}_+}(C^k)(H) = U(C^k) \circ H,$

which leads to $C^k \circ Q^k = (C^k)_+ \circ H$. Thus,

  $P(C_{\alpha\cup\gamma} \circ Q^k_{\alpha\cup\gamma}) + P(R^k_\beta \circ Q^k_\beta) = P((C_{\alpha\cup\gamma})_+ \circ H_{\alpha\cup\gamma}) + P((R^k_\beta)_+ \circ H_\beta),$

which gives rise to $Q^k_{\alpha\cup\gamma} = U_{\alpha\cup\gamma}(C) \circ H_{\alpha\cup\gamma}$ and $Q^k_\beta = U_\beta(R^k) \circ H_\beta$. Therefore, $V(H) = \lim_{k\to\infty} J_{C^k}\Pi_{\mathbb{R}^{n\times n}_+}(C^k)(H)$ and then $V \in \partial_B \Pi_{\mathbb{R}^{n\times n}_+}(C)$. The proof is complete.

We are now ready to establish the following result on $\partial \Pi_{\mathbb{R}^{n\times n}_+}(\cdot)$.

Proposition 3.7 Suppose that $C \in \mathbb{R}^{n\times n}$ has the decomposition (4). Then for any $V \in \partial_B \Pi_{\mathbb{R}^{n\times n}_+}(C)$ ($\partial \Pi_{\mathbb{R}^{n\times n}_+}(C)$, respectively), there exists a $W \in \partial_B \Pi_{\mathbb{R}^{|\beta|}_+}(0)$ ($\partial \Pi_{\mathbb{R}^{|\beta|}_+}(0)$, respectively) such that

  $V(H) = P\big(U_{\alpha\cup\gamma}(C) \circ H_{\alpha\cup\gamma}\big) + P\big(W(H_\beta)\big). \qquad (8)$

Conversely, for any $W \in \partial_B \Pi_{\mathbb{R}^{|\beta|}_+}(0)$ ($\partial \Pi_{\mathbb{R}^{|\beta|}_+}(0)$, respectively), there exists a $V \in \partial_B \Pi_{\mathbb{R}^{n\times n}_+}(C)$ ($\partial \Pi_{\mathbb{R}^{n\times n}_+}(C)$, respectively) such that (8) holds.

Proof: We only need to prove (8) for $V \in \partial_B \Pi_{\mathbb{R}^{n\times n}_+}(C)$ and $W \in \partial_B \Pi_{\mathbb{R}^{|\beta|}_+}(0)$. Let $\Theta(\cdot) := \Pi'_{\mathbb{R}^{n\times n}_+}(C; \cdot)$. By (6), we have

  $\Theta(H) = P\big(U_{\alpha\cup\gamma}(C) \circ H_{\alpha\cup\gamma}\big) + P\big(\Pi_{\mathbb{R}^{|\beta|}_+}(H_\beta)\big), \quad H \in \mathbb{R}^{n\times n}.$

By Lemma 3.6, we obtain (8).

We now use Clarke's generalized Jacobian-based Newton method for solving the nonsmooth equation (2):

  $(x^{j+1}, y^{j+1}, \Psi^{j+1}) = (x^j, y^j, \Psi^j) - V_j^{-1} H(x^j, y^j, \Psi^j), \quad V_j \in \partial H(x^j, y^j, \Psi^j), \quad j = 0, 1, 2, \ldots. \qquad (9)$

To guarantee the superlinear convergence of (9), we need the semismoothness of $H$, whose notion is formally defined below [28, 34].

Definition 3.8 Let $\Upsilon : D \subseteq \mathcal{Y} \to \mathcal{Z}$ be a locally Lipschitz continuous function on the open set $D$.

1) $\Upsilon$ is said to be semismooth at $y \in D$ if $\Upsilon$ is directionally differentiable at $y$ and, for any $V \in \partial\Upsilon(y + h)$ and $h \to 0$,

  $\Upsilon(y + h) - \Upsilon(y) - V(h) = o(\|h\|).$

2) $\Upsilon$ is said to be strongly semismooth at $y \in D$ if $\Upsilon$ is semismooth at $y$ and, for any $V \in \partial\Upsilon(y + h)$ and $h \to 0$,

  $\Upsilon(y + h) - \Upsilon(y) - V(h) = O(\|h\|^2).$

Regarding the superlinear convergence of (9), we state the following result [34, Theorem 3.2].

Proposition 3.9 Let $H : \mathcal{O} \subseteq \mathcal{X} \times \mathbb{R}^q \times \mathbb{R}^{n\times n} \to \mathcal{X} \times \mathbb{R}^q \times \mathbb{R}^{n\times n}$ be a locally Lipschitz continuous function on the open set $\mathcal{O}$. Let $(\bar{x}, \bar{y}, \bar{\Psi}) \in \mathcal{X} \times \mathbb{R}^q \times \mathbb{R}^{n\times n}$ be a solution to (2). Suppose that $H$ is semismooth at $(\bar{x}, \bar{y}, \bar{\Psi})$ and that every element in $\partial H(\bar{x}, \bar{y}, \bar{\Psi})$ is nonsingular. Then every sequence generated by (9) converges to $(\bar{x}, \bar{y}, \bar{\Psi})$ superlinearly, provided the initial point $(x^0, y^0, \Psi^0)$ is sufficiently close to $(\bar{x}, \bar{y}, \bar{\Psi})$. Moreover, the convergence rate is quadratic if $H$ is strongly semismooth at $(\bar{x}, \bar{y}, \bar{\Psi})$.

In order to solve (2) by the nonsmooth Newton method (9), the two assumptions in Proposition 3.9 should be satisfied. The strong semismoothness of $H$ holds since the metric projection $\Pi_{\mathbb{R}^{n\times n}_+}(\cdot)$ is strongly semismooth. In what follows, we explore the nonsingularity conditions for Clarke's generalized Jacobian $\partial H(\cdot)$. We need the concepts of the strong second-order sufficient condition and constraint nondegeneracy for the NLNNP.

Let $K \subseteq \mathcal{Z}$ be a closed convex set and define $\mathrm{dist}(z, K) := \inf\{\|z - d\| : d \in K\}$. The tangent cone $T_K(z)$ of $K$ at $z \in K$ is given by ([3, 2.2.4])

  $T_K(z) := \{d \in \mathcal{Z} : \mathrm{dist}(z + td, K) = o(t), \ t \ge 0\}.$

For any $z \in K$, let $\mathrm{lin}(T_K(z))$ denote the lineality space of $T_K(z)$, i.e., the largest linear space contained in $T_K(z)$. By (6), we have

  $T_{\mathbb{R}^{n\times n}_+}(C_+) = \{H \in \mathbb{R}^{n\times n} \mid H = \Pi'_{\mathbb{R}^{n\times n}_+}(C_+; H)\} = \{H \in \mathbb{R}^{n\times n} \mid H_\beta \ge 0, \ H_\gamma \ge 0\} \qquad (10)$

and

  $\mathrm{lin}\big(T_{\mathbb{R}^{n\times n}_+}(C_+)\big) = \{H \in \mathbb{R}^{n\times n} \mid H_\beta = 0, \ H_\gamma = 0\}. \qquad (11)$

The following definition of constraint nondegeneracy for the NLNNP was originally introduced by Robinson [36].

Definition 3.10 We say that a feasible point $\bar{x}$ of the NLNNP is constraint nondegenerate if

  $\begin{pmatrix} J_x h(\bar{x}) \\ J_x g(\bar{x}) \end{pmatrix} \mathcal{X} + \begin{pmatrix} \{0\} \\ \mathrm{lin}\big(T_{\mathbb{R}^{n\times n}_+}(g(\bar{x}))\big) \end{pmatrix} = \begin{pmatrix} \mathbb{R}^q \\ \mathbb{R}^{n\times n} \end{pmatrix}. \qquad (12)$

As noted in [36] and [41], constraint nondegeneracy for the NLNNP is equivalent to the stronger linear independence constraint qualification (LICQ) (cf. [30]): the family $\{J_x h_j(\bar{x})\}_{j=1}^q \cup \{J_x g_{ij}(\bar{x})\}_{(i,j) \in A(\bar{x})}$ is linearly independent, where $A(\bar{x})$ denotes the active index set at $\bar{x}$. Moreover, in this case, the set of Lagrange multipliers

  $M(\bar{x}) := \{(\bar{y}, \bar{\Psi}) \mid (\bar{x}, \bar{y}, \bar{\Psi}) \text{ is a KKT point of the NLNNP}\}$

is a singleton.

In the following, we give the concept of the strong second-order sufficient condition for the NLNNP. We first need to characterize the critical cone of the NLNNP. The critical cone of $\mathbb{R}^{n\times n}_+$ at $C \in \mathbb{R}^{n\times n}$, associated with the complementarity problem (3), is defined by $\mathcal{C}(C; \mathbb{R}^{n\times n}_+) := T_{\mathbb{R}^{n\times n}_+}(C_+) \cap (C - C_+)^\perp$, i.e.,

  $\mathcal{C}(C; \mathbb{R}^{n\times n}_+) = \{H \in \mathbb{R}^{n\times n} \mid H_\beta \ge 0, \ H_\gamma = 0\}. \qquad (13)$

Hence, the affine hull $\mathrm{aff}(\mathcal{C}(C; \mathbb{R}^{n\times n}_+))$ of $\mathcal{C}(C; \mathbb{R}^{n\times n}_+)$ is given by

  $\mathrm{aff}(\mathcal{C}(C; \mathbb{R}^{n\times n}_+)) = \{H \in \mathbb{R}^{n\times n} \mid H_\gamma = 0\}. \qquad (14)$

Assume that $\bar{x}$ is a stationary point of the NLNNP. Then $M(\bar{x})$ is nonempty. The critical cone $\mathcal{C}(\bar{x})$ of the NLNNP at $\bar{x}$ is defined by

  $\mathcal{C}(\bar{x}) := \{d \mid J_x h(\bar{x})d = 0, \ J_x g(\bar{x})d \in T_{\mathbb{R}^{n\times n}_+}(g(\bar{x})), \ J_x f(\bar{x})d \le 0\} = \{d \mid J_x h(\bar{x})d = 0, \ J_x g(\bar{x})d \in T_{\mathbb{R}^{n\times n}_+}(g(\bar{x})), \ J_x f(\bar{x})d = 0\}.$

Since $\bar{x}$ is a stationary point of the NLNNP, there exists $(\bar{y}, \bar{\Psi}) \in M(\bar{x})$ such that

  $J_x l(\bar{x}, \bar{y}, \bar{\Psi}) = 0, \quad h(\bar{x}) = 0, \quad \text{and} \quad \bar{\Psi} \in N_{\mathbb{R}^{n\times n}_+}(g(\bar{x})).$

From [14], we have

  $\bar{\Psi} \in N_{\mathbb{R}^{n\times n}_+}(g(\bar{x})) \iff \mathbb{R}^{n\times n}_+ \ni (-\bar{\Psi}) \perp g(\bar{x}) \in \mathbb{R}^{n\times n}_+.$

Thus $C := g(\bar{x}) + \bar{\Psi}$ takes the form reported in (4), with

  $g(\bar{x}) = P(C_{\alpha\cup\beta}) \quad \text{and} \quad \bar{\Psi} = P(C_{\beta\cup\gamma}). \qquad (15)$

By (10) and (11), we get

  $T_{\mathbb{R}^{n\times n}_+}(g(\bar{x})) = \{H \in \mathbb{R}^{n\times n} \mid H_\beta \ge 0, \ H_\gamma \ge 0\},$
  $T_{\mathbb{R}^{n\times n}_+}(g(\bar{x})) \cap \bar{\Psi}^\perp = \{H \in \mathbb{R}^{n\times n} \mid H_\beta \ge 0, \ H_\gamma = 0\},$
  $\mathrm{lin}\big(T_{\mathbb{R}^{n\times n}_+}(g(\bar{x}))\big) = \{H \in \mathbb{R}^{n\times n} \mid H_\beta = 0, \ H_\gamma = 0\}. \qquad (16)$

Since $M(\bar{x})$ is nonempty,

  $\mathcal{C}(\bar{x}) = \{d \mid J_x h(\bar{x})d = 0, \ (J_x g(\bar{x})d)_\beta \ge 0, \ (J_x g(\bar{x})d)_\gamma = 0\} = \{d \mid J_x h(\bar{x})d = 0, \ J_x g(\bar{x})d \in \mathcal{C}(C; \mathbb{R}^{n\times n}_+)\},$

where $\mathcal{C}(C; \mathbb{R}^{n\times n}_+)$ is the critical cone of $\mathbb{R}^{n\times n}_+$ at $C := g(\bar{x}) + \bar{\Psi}$. It is difficult to determine the affine hull $\mathrm{aff}(\mathcal{C}(\bar{x}))$ of $\mathcal{C}(\bar{x})$. However, based on $(\bar{y}, \bar{\Psi})$, we can provide an approximation by

  $\mathrm{app}(\bar{y}, \bar{\Psi}) := \{d \mid J_x h(\bar{x})d = 0, \ J_x g(\bar{x})d \in \mathrm{aff}(\mathcal{C}(C; \mathbb{R}^{n\times n}_+))\},$

which, together with (14), leads to

  $\mathrm{app}(\bar{y}, \bar{\Psi}) = \{d \mid J_x h(\bar{x})d = 0, \ (J_x g(\bar{x})d)_\gamma = 0\}. \qquad (17)$

On the relations between $\mathrm{app}(\bar{y}, \bar{\Psi})$ and $\mathrm{aff}(\mathcal{C}(\bar{x}))$, we can easily deduce the following result.

Proposition 3.11 Suppose that there exists a direction $d \in \mathcal{C}(\bar{x})$ such that $(J_x g(\bar{x})d)_\beta > 0$. Then we have

  $\mathrm{aff}(\mathcal{C}(\bar{x})) = \mathrm{app}(\bar{y}, \bar{\Psi}).$

The following definition gives a strong second-order sufficient condition for the NLNNP, which is slightly different from the original version introduced by Robinson [35].

Definition 3.12 Let $(\bar{x}, \bar{y}, \bar{\Psi})$ be a KKT point of the NLNNP. We say that the strong second-order sufficient condition holds at $\bar{x}$ if

  $\langle d, J^2_{xx} l(\bar{x}, \bar{y}, \bar{\Psi})d \rangle > 0 \quad \forall\, d \in \mathrm{app}(\bar{y}, \bar{\Psi}) \setminus \{0\}. \qquad (18)$

To show the nonsingularity of Clarke's generalized Jacobian of $H$, we need the following useful result.

Proposition 3.13 Suppose that $G \in \mathbb{R}^{n\times n}$ and $\Psi \in N_{\mathbb{R}^{n\times n}_+}(G)$. Then for any $V \in \partial \Pi_{\mathbb{R}^{n\times n}_+}(G + \Psi)$ and any $\Delta G, \Delta \Psi \in \mathbb{R}^{n\times n}$ such that $\Delta G = V(\Delta G + \Delta \Psi)$, it holds that

  $\langle \Delta G, \Delta \Psi \rangle \ge 0.$

Proof: Let $C := G + \Psi$. We have from [14] that $G = \Pi_{\mathbb{R}^{n\times n}_+}(G + \Psi) = \Pi_{\mathbb{R}^{n\times n}_+}(C)$ and $\langle G, \Psi \rangle = 0$. Hence, $G = P(C_{\alpha\cup\beta})$ and $\Psi = P(C_{\beta\cup\gamma})$. By Proposition 3.7, there exists a $W \in \partial \Pi_{\mathbb{R}^{|\beta|}_+}(0)$ such that

  $V(\Delta G + \Delta \Psi) = P\big(U_{\alpha\cup\gamma}(C) \circ (\Delta G_{\alpha\cup\gamma} + \Delta \Psi_{\alpha\cup\gamma})\big) + P\big(W(\Delta G_\beta + \Delta \Psi_\beta)\big).$

Taking into consideration the assumption that $\Delta G = V(\Delta G + \Delta \Psi)$, we infer that

  $\Delta \Psi_\alpha = 0, \quad \Delta G_\gamma = 0, \quad \Delta G_\beta = W(\Delta G_\beta + \Delta \Psi_\beta). \qquad (19)$

From Lemma 3.3 iii) and (19), we obtain

  $\langle \Delta G, \Delta \Psi \rangle = \langle \Delta G_\alpha, \Delta \Psi_\alpha \rangle + \langle \Delta G_\beta, \Delta \Psi_\beta \rangle + \langle \Delta G_\gamma, \Delta \Psi_\gamma \rangle = \langle \Delta G_\beta, \Delta \Psi_\beta \rangle = \langle W(\Delta G_\beta + \Delta \Psi_\beta), (\Delta G_\beta + \Delta \Psi_\beta) - W(\Delta G_\beta + \Delta \Psi_\beta) \rangle \ge 0.$

We are now ready to state our result on the nonsingularity of Clarke's generalized Jacobian of the mapping $H$ defined in (2).

Theorem 3.14 Let $\bar{x}$ be a feasible solution to the NLNNP and let $(\bar{y}, \bar{\Psi})$ satisfy $H(\bar{x}, \bar{y}, \bar{\Psi}) = 0$. Suppose that the strong second-order sufficient condition (18) holds at $\bar{x}$ and that $\bar{x}$ is constraint nondegenerate. Then every element in $\partial H(\bar{x}, \bar{y}, \bar{\Psi})$ is nonsingular.

Proof: Let $V$ be an arbitrary element in $\partial H(\bar{x}, \bar{y}, \bar{\Psi})$. To show that $V$ is nonsingular, we only need to prove that

  $V(\Delta x, \Delta y, \Delta \Psi) = 0 \implies (\Delta x, \Delta y, \Delta \Psi) = 0 \quad \text{for } (\Delta x, \Delta y, \Delta \Psi) \in \mathcal{X} \times \mathbb{R}^q \times \mathbb{R}^{n\times n}.$

Let $C := g(\bar{x}) + \bar{\Psi}$. Then $C$, $g(\bar{x})$, and $\bar{\Psi}$ take the forms of (4) and (15), respectively. By Lemma 3.2, there exists an element $W \in \partial \Pi_{\mathbb{R}^{n\times n}_+}(C)$ such that

  $V(\Delta x, \Delta y, \Delta \Psi) = \begin{bmatrix} J^2_{xx} l(\bar{x}, \bar{y}, \bar{\Psi})\Delta x - J_x h(\bar{x})^* \Delta y - J_x g(\bar{x})^* \Delta \Psi \\ J_x h(\bar{x})\Delta x \\ J_x g(\bar{x})\Delta x - W(J_x g(\bar{x})\Delta x + \Delta \Psi) \end{bmatrix} = 0. \qquad (20)$

By Proposition 3.7, (17), and the second and third equations of (20), we obtain

  $\Delta x \in \mathrm{app}(\bar{y}, \bar{\Psi}). \qquad (21)$

It follows from the first and second equations of (20) that

  $0 = \langle \Delta x, J^2_{xx} l(\bar{x}, \bar{y}, \bar{\Psi})\Delta x - J_x h(\bar{x})^* \Delta y - J_x g(\bar{x})^* \Delta \Psi \rangle = \langle \Delta x, J^2_{xx} l(\bar{x}, \bar{y}, \bar{\Psi})\Delta x \rangle - \langle \Delta y, J_x h(\bar{x})\Delta x \rangle - \langle \Delta \Psi, J_x g(\bar{x})\Delta x \rangle = \langle \Delta x, J^2_{xx} l(\bar{x}, \bar{y}, \bar{\Psi})\Delta x \rangle - \langle \Delta \Psi, J_x g(\bar{x})\Delta x \rangle. \qquad (22)$

By Proposition 3.13 and the third equation of (20), we find

  $\langle \Delta \Psi, J_x g(\bar{x})\Delta x \rangle \ge 0.$

The latter, together with (22), implies that

  $0 \ge \langle \Delta x, J^2_{xx} l(\bar{x}, \bar{y}, \bar{\Psi})\Delta x \rangle. \qquad (23)$

It follows from (23), (21), and the strong second-order sufficient condition (18) that

  $\Delta x = 0. \qquad (24)$

Then (20) reduces to

  $\begin{bmatrix} J_x h(\bar{x})^* \Delta y + J_x g(\bar{x})^* \Delta \Psi \\ W(\Delta \Psi) \end{bmatrix} = 0. \qquad (25)$

By Proposition 3.7 and the second equation of (25), we obtain

  $(\Delta \Psi)_\alpha = 0. \qquad (26)$

By the constraint nondegeneracy condition (12), there exist $d \in \mathcal{X}$ and $Q \in \mathrm{lin}\big(T_{\mathbb{R}^{n\times n}_+}(g(\bar{x}))\big)$ such that

  $J_x h(\bar{x})d = \Delta y \quad \text{and} \quad J_x g(\bar{x})d + Q = \Delta \Psi,$

which, together with the first equation of (25), implies that

  $\langle \Delta y, \Delta y \rangle + \langle \Delta \Psi, \Delta \Psi \rangle = \langle J_x h(\bar{x})d, \Delta y \rangle + \langle J_x g(\bar{x})d + Q, \Delta \Psi \rangle = \langle d, J_x h(\bar{x})^* \Delta y \rangle + \langle d, J_x g(\bar{x})^* \Delta \Psi \rangle + \langle Q, \Delta \Psi \rangle = \langle d, J_x h(\bar{x})^* \Delta y + J_x g(\bar{x})^* \Delta \Psi \rangle + \langle Q, \Delta \Psi \rangle = \langle Q, \Delta \Psi \rangle. \qquad (27)$

From (16), (26), and (27), we get $\langle \Delta y, \Delta y \rangle + \langle \Delta \Psi, \Delta \Psi \rangle = 0$. Hence,

  $\Delta y = 0 \quad \text{and} \quad \Delta \Psi = 0. \qquad (28)$

We see from (24) and (28) that $V$ is nonsingular. The proof is complete.

Finally, we define monotonicity and related concepts for matrix-valued functions, which were originally introduced for vector-valued functions [4, 16, 48].

Definition 3.15 Let $\mathcal{X}^n$ denote the space of real $n \times n$ matrices or the space of real symmetric $n \times n$ matrices, equipped with the Frobenius inner product $\langle \cdot, \cdot \rangle$ and the induced Frobenius norm.

1) $\Upsilon : \mathcal{X}^n \to \mathcal{X}^n$ is a monotone function if

  $\langle Y - Z, \Upsilon(Y) - \Upsilon(Z) \rangle \ge 0 \quad \text{for all } Y, Z \in \mathcal{X}^n.$

2) $\Upsilon : \mathcal{X}^n \to \mathcal{X}^n$ is a $P_0$-function if, for all $Y, Z \in \mathcal{X}^n$ with $Y \neq Z$, there exists an index $(i,j)$ such that

  $Y_{ij} \neq Z_{ij} \quad \text{and} \quad (Y_{ij} - Z_{ij})\big(\Upsilon_{ij}(Y) - \Upsilon_{ij}(Z)\big) \ge 0.$

3) $\Upsilon : \mathcal{X}^n \to \mathcal{X}^n$ is a $P$-function if, for all $Y, Z \in \mathcal{X}^n$ with $Y \neq Z$, there exists an index $(i,j)$ such that

  $(Y_{ij} - Z_{ij})\big(\Upsilon_{ij}(Y) - \Upsilon_{ij}(Z)\big) > 0.$

We observe that every monotone matrix-valued function is also a $P_0$-function.
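As a quick numerical illustration of Definition 3.15 (our own example, not from the paper), the linear map $\Upsilon(Y) = K^T K Y$ is monotone for any $K$, since $\langle Y - Z, K^T K (Y - Z) \rangle = \|K(Y - Z)\|^2 \ge 0$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
K = rng.standard_normal((n, n))
Upsilon = lambda Y: K.T @ K @ Y      # a linear, monotone matrix-valued map

Y, Z = rng.standard_normal((n, n)), rng.standard_normal((n, n))
inner = np.sum((Y - Z) * (Upsilon(Y) - Upsilon(Z)))   # Frobenius inner product
assert inner >= 0 and np.isclose(inner, np.linalg.norm(K @ (Y - Z), 'fro')**2)
```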

3.2 A Nonsmooth Newton-Type Method

In this subsection, we present a globalized nonsmooth Newton-type method for solving the NIEP. Given the eigendata $(\Lambda, X) \in \mathbb{R}^{p\times p} \times \mathbb{R}^{n\times p}$ as in Section 2, the NIEP is to find a matrix $A \in \mathbb{R}^{n\times n}_+$ such that

  $AX = X\Lambda. \qquad (29)$

Let $K = X^T$ and $B = (X\Lambda)^T$. We note that $A \in \mathbb{R}^{n\times n}_+$ is a solution to (29) if and only if $Y = A^T \in \mathbb{R}^{n\times n}_+$ is a global solution of the following convex optimization problem

  $\min\ f(Y) := \tfrac{1}{2}\|KY - B\|^2 \quad \text{s.t.} \quad Y \in \mathbb{R}^{n\times n}_+ \qquad (30)$

with $f(Y) = 0$. The well-known KKT condition for Problem (30) is given by

  $J_Y f(Y) - Z = 0, \quad \mathbb{R}^{n\times n}_+ \ni Y \perp Z \in \mathbb{R}^{n\times n}_+. \qquad (31)$

From [14], we have

  $-Z \in N_{\mathbb{R}^{n\times n}_+}(Y) \iff Y = \Pi_{\mathbb{R}^{n\times n}_+}(Y - Z).$

Thus, the KKT condition (31) can be written as

  $H(Y, Z) := \begin{bmatrix} J_Y f(Y) - Z \\ Y - \Pi_{\mathbb{R}^{n\times n}_+}(Y - Z) \end{bmatrix} = 0. \qquad (32)$

For Problem (30), the LICQ is obviously satisfied at $Y$. Therefore,

  $M(Y) = \{Z \mid (Y, Z) \text{ is a KKT point of Problem (30)}\}$

is a singleton. As in Subsection 3.1, one may use Clarke's generalized Jacobian-based Newton method for solving (32), where both unknowns $Y$ and $Z$ are to be determined. Sparked by [13, 48], in this paper we instead solve Problem (30) by constructing an equivalent nonsmooth equation to the KKT condition (31). Let

  $F(Y) := J_Y f(Y) = K^T(KY - B). \qquad (33)$

Then the KKT condition (31) reduces to the following MCP:

  $\mathbb{R}^{n\times n}_+ \ni Y \perp F(Y) \in \mathbb{R}^{n\times n}_+, \qquad (34)$

i.e., $Y \ge 0$, $F(Y) \ge 0$, and $\langle Y, F(Y) \rangle = 0$. We use the well-known Fischer function [17, 19] defined by

  $\omega(a, b) := \sqrt{a^2 + b^2} - (a + b), \quad a, b \in \mathbb{R}, \qquad (35)$

which has the important property that

  $\omega(a, b) = 0 \iff a \ge 0, \ b \ge 0, \ ab = 0.$

Solving the MCP (34) is then equivalent to solving the following nonsmooth equation:

  $\Phi(Y) = 0, \qquad (36)$

where $\Phi_{ij}(Y) := \omega(Y_{ij}, F_{ij}(Y))$ for $i, j = 1, \ldots, n$. Also, define the merit function $\varphi : \mathbb{R}^{n\times n} \to \mathbb{R}$ by

  $\varphi(Y) := \tfrac{1}{2}\|\Phi(Y)\|^2. \qquad (37)$

In what follows, we propose a Newton-type method for solving the nonsmooth equation (36).
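The residual map (36) and the merit function (37) are straightforward to code. Below is a small NumPy sketch (our illustration; the function names are ours) for $F$, $\Phi$, and $\varphi$:

```python
import numpy as np

def make_residuals(K, B):
    """Return F, Phi, phi for the MCP (34) with F(Y) = K^T (K Y - B)."""
    F = lambda Y: K.T @ (K @ Y - B)
    # Entrywise Fischer function (35): omega(a, b) = sqrt(a^2 + b^2) - (a + b)
    Phi = lambda Y: np.sqrt(Y**2 + F(Y)**2) - (Y + F(Y))
    phi = lambda Y: 0.5 * np.linalg.norm(Phi(Y), 'fro')**2   # merit function (37)
    return F, Phi, phi

# At a solution of the NIEP, Y = A^T >= 0 satisfies K Y = B, hence F(Y) = 0
# and Phi(Y) = |Y| - Y = 0 entrywise.
rng = np.random.default_rng(4)
A = rng.random((5, 5))                   # a nonnegative "truth"
w, V = np.linalg.eig(A)
k = np.argmax(w.real)                    # Perron eigenpair (real)
X, Lam = V[:, [k]].real, np.array([[w[k].real]])
K, B = X.T, (X @ Lam).T
F, Phi, phi = make_residuals(K, B)
assert phi(A.T) < 1e-18
```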

We first show the monotonicity of the matrix-valued function $F$ in (33).

Proposition 3.16 The function $F : \mathbb{R}^{n\times n} \to \mathbb{R}^{n\times n}$ defined in (33) is a monotone matrix-valued function.

Proof: By (33), for any $Y_1, Y_2 \in \mathbb{R}^{n\times n}$, we find

  $\langle Y_1 - Y_2, F(Y_1) - F(Y_2) \rangle = \langle Y_1 - Y_2, K^T K(Y_1 - Y_2) \rangle = \langle K(Y_1 - Y_2), K(Y_1 - Y_2) \rangle \ge 0.$

This shows that $F$ is monotone.

We note that $F$ defined in (33) is continuously differentiable. Then we have the following result on Clarke's generalized Jacobian of $\Phi$ defined in (36).

Proposition 3.17 Let $Y \in \mathbb{R}^{n\times n}$ and let $\Phi$ be defined in (36). Then, for any $H \in \mathbb{R}^{n\times n}$,

  $\partial\Phi(Y)H \subseteq (\Gamma(Y) - E) \circ H + (\Omega(Y) - E) \circ \big(J_Y F(H)\big),$

where $E$ is the n-by-n matrix of all ones and $\Gamma(Y)$ and $\Omega(Y)$ are two n-by-n matrices with entries determined by

  $\Gamma_{ij}(Y) = \frac{Y_{ij}}{\|(Y_{ij}, F_{ij}(Y))\|}, \quad \Omega_{ij}(Y) = \frac{F_{ij}(Y)}{\|(Y_{ij}, F_{ij}(Y))\|}$

if $(Y_{ij}, F_{ij}(Y)) \neq (0, 0)$, and by

  $\Gamma_{ij}(Y) = \Gamma_{ij}, \quad \Omega_{ij}(Y) = \Omega_{ij}, \quad \text{for every } (\Gamma_{ij}, \Omega_{ij}) \text{ with } \|(\Gamma_{ij}, \Omega_{ij})\| \le 1,$

if $Y_{ij} = 0 = F_{ij}(Y)$.

Proof: It follows from [11, page 75] and the differentiability of $F$.

One important property of the function $\Phi$ defined in (36) is its strong semismoothness.

Proposition 3.18 The function $\Phi$ is strongly semismooth.

Proof: The function $\Phi$ is strongly semismooth if and only if each of its components is strongly semismooth [44]. Notice that $\Phi_{ij}(Y)$ is a composition of the function $\omega : \mathbb{R}^2 \to \mathbb{R}$ and the continuously differentiable function $(Y_{ij}, F_{ij}(Y)) : \mathbb{R}^{n\times n} \to \mathbb{R}^2$. Since the derivative of $F$ at $Y$ is constant, we can show the strong semismoothness of $\Phi_{ij}(Y)$ as in [33, Lemma 3.1].

The following proposition gives two properties of the function $\varphi$ in (37).

Proposition 3.19 a) The function $\varphi$ defined in (37) is continuously differentiable and its gradient at $Y \in \mathbb{R}^{n\times n}$ is given by

  $\nabla\varphi(Y) = \{V^* \Phi(Y) \mid V \in \partial\Phi(Y)\} = (\Gamma(Y) - E) \circ \Phi(Y) + (J_Y F)^*\big((\Omega(Y) - E) \circ \Phi(Y)\big).$

b) Any stationary point of $\varphi$ is a solution to Problem (30).

Proof: We first prove part a). By the generalized Jacobian chain rule in [11, Theorem 2.6.6], we get

  $\partial\varphi(Y) = \{V^* \Phi(Y) \mid V \in \partial\Phi(Y)\}.$

If $Y_{ij} = 0 = F_{ij}(Y)$, then $\Phi_{ij}(Y) = 0$. In this case, the multivalued entries of $\partial\varphi(Y)$ are canceled by the zero components of $\Phi(Y)$, and $\partial\varphi(Y)$ reduces to a singleton. Therefore, by the corollary to [11, Theorem 2.2.4], we know that $\varphi$ is continuously differentiable.

We now establish b). Let $Y$ be an arbitrary stationary point of $\varphi$, i.e., $\nabla\varphi(Y) = 0$. This leads to

  $(\Gamma(Y) - E) \circ \Phi(Y) + (J_Y F)^*\big((\Omega(Y) - E) \circ \Phi(Y)\big) = 0. \qquad (38)$

We only need to show $\Phi(Y) = 0$. By contradiction, suppose that there exists an index $(i,j)$ such that $\Phi_{ij}(Y) \neq 0$, which implies that one of the following holds:

1) $Y_{ij} < 0$ and $F_{ij}(Y) = 0$;

2) $Y_{ij} = 0$ and $F_{ij}(Y) < 0$;

3) $Y_{ij} \neq 0$ and $F_{ij}(Y) \neq 0$.

In every case, we have

  $\Big(\frac{Y_{ij}}{\|(Y_{ij}, F_{ij}(Y))\|} - 1\Big)\Big(\frac{F_{ij}(Y)}{\|(Y_{ij}, F_{ij}(Y))\|} - 1\Big) > 0. \qquad (39)$

Notice that $F$ is affine; then, for any $H \in \mathbb{R}^{n\times n}$, we have

  $F(Y + H) - F(Y) = J_Y F(H).$

By Proposition 3.16, $F$ is monotone. Thus,

  $\langle (\Omega(Y) - E) \circ \Phi(Y), J_Y F\big((\Omega(Y) - E) \circ \Phi(Y)\big) \rangle \ge 0.$

Hence, taking the inner product of (38) with $(\Omega(Y) - E) \circ \Phi(Y)$, we get

  $\langle (\Omega(Y) - E) \circ \Phi(Y), (\Gamma(Y) - E) \circ \Phi(Y) \rangle \le 0.$

By Proposition 3.17 and (39), however,

  $0 < \sum_{(k,l)} \big(\Gamma_{kl}(Y) - 1\big)\big(\Omega_{kl}(Y) - 1\big)\Phi^2_{kl}(Y) = \langle (\Omega(Y) - E) \circ \Phi(Y), (\Gamma(Y) - E) \circ \Phi(Y) \rangle \le 0.$

This is a contradiction. Therefore, $\Phi(Y) = 0$.

We now establish the following useful lemma.

Lemma 3.20 Let $Y \in \mathbb{R}^{n\times n}$ and

  $\mathcal{H}(Y) := \{D_1 \circ \cdot + D_2 \circ J_Y F(\cdot) \mid D_1, D_2 \in \mathbb{R}^{n\times n}_{++}\}.$

Then every element in $\mathcal{H}(Y)$ is nonsingular.

Proof: Let $V$ be an arbitrary element in $\mathcal{H}(Y)$. Then there exist two matrices $D_1, D_2 \in \mathbb{R}^{n\times n}_{++}$ such that

  $V = D_1 \circ \cdot + D_2 \circ J_Y F(\cdot). \qquad (40)$

For the sake of contradiction, assume that $V$ is singular. Then there exists an $H \neq 0$ in $\mathbb{R}^{n\times n}$ such that $V(H) = 0$. By (40), we obtain $D_1 \circ H = -D_2 \circ J_Y F(H)$, or

  $H_{ij} = -\frac{(D_2)_{ij}}{(D_1)_{ij}} \big(J_Y F(H)\big)_{ij} \quad \text{for } i, j = 1, \ldots, n.$

Since $F$ is affine, we have

  $F(Y + H) - F(Y) = J_Y F(H). \qquad (41)$

By Proposition 3.16, $F$ is monotone; hence $F$ is also a $P_0$-function. The latter statement, together with (41) and $D_1, D_2 \in \mathbb{R}^{n\times n}_{++}$, implies that there exists an index $(i_0, j_0)$ such that $H_{i_0 j_0} \neq 0$ and

  $H_{i_0 j_0} \big(J_Y F(H)\big)_{i_0 j_0} = \big((Y + H)_{i_0 j_0} - Y_{i_0 j_0}\big)\big(F_{i_0 j_0}(Y + H) - F_{i_0 j_0}(Y)\big) \ge 0.$

Hence,

  $0 < H^2_{i_0 j_0} = -H_{i_0 j_0} \frac{(D_2)_{i_0 j_0}}{(D_1)_{i_0 j_0}} \big(J_Y F(H)\big)_{i_0 j_0} \le 0,$

a contradiction. Therefore, $V$ is nonsingular.

We now construct a subset of $\partial_B \Phi(Y)$ which is easy to evaluate. Define

  $\mathcal{L}(Y) := \{L = S \circ \cdot + T \circ J_Y F(\cdot) \mid (S, T) \in \mathcal{G}(Y)\},$

where

  $\mathcal{G}(Y) := \{(S, T) \in \mathbb{R}^{n\times n} \times \mathbb{R}^{n\times n} \mid (S, T) = (P(Y, Z), Q(Y, Z)), \ Z \in \mathcal{Z}(Y)\},$

with

  $\mathcal{Z}(Y) := \{Z \in \mathbb{R}^{n\times n} \mid Z_{ij} \neq 0 \text{ if } (Y_{ij}, F_{ij}(Y)) = (0, 0)\},$

and $P(Y, Z)$ and $Q(Y, Z)$ are given by

  $\big(P_{ij}(Y, Z), Q_{ij}(Y, Z)\big) := \begin{cases} \Big(\dfrac{Z_{ij}}{\sqrt{Z_{ij}^2 + (J_Y F(Z))_{ij}^2}} - 1, \ \dfrac{(J_Y F(Z))_{ij}}{\sqrt{Z_{ij}^2 + (J_Y F(Z))_{ij}^2}} - 1\Big), & \text{if } Y_{ij} = 0 = F_{ij}(Y), \\[2mm] \Big(\dfrac{Y_{ij}}{\sqrt{Y_{ij}^2 + F_{ij}^2(Y)}} - 1, \ \dfrac{F_{ij}(Y)}{\sqrt{Y_{ij}^2 + F_{ij}^2(Y)}} - 1\Big), & \text{otherwise.} \end{cases} \qquad (42)$
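An element $L \in \mathcal{L}(Y)$ is specified entrywise by the pair $(S, T) = (P(Y,Z), Q(Y,Z))$ from (42). A small NumPy sketch (our illustration, assuming the affine $F$ of (33); we default to $Z$ equal to the all-ones matrix, which is one admissible choice in $\mathcal{Z}(Y)$):

```python
import numpy as np

def ST_pair(Y, F_val, JF, Z=None):
    """Coefficient matrices (S, T) = (P(Y,Z), Q(Y,Z)) from (42).

    F_val = F(Y); JF(H) applies the (constant) derivative of F;
    Z is any matrix with nonzero entries on the degenerate set
    {(i,j): Y_ij = 0 = F_ij(Y)}.
    """
    if Z is None:
        Z = np.ones_like(Y)
    a, b = Y.copy(), F_val.copy()
    deg = (Y == 0) & (F_val == 0)            # degenerate entries
    JFZ = JF(Z)
    a[deg], b[deg] = Z[deg], JFZ[deg]        # use (Z, JF(Z)) on that set
    r = np.sqrt(a**2 + b**2)
    S, T = a / r - 1.0, b / r - 1.0          # entrywise coefficients of (42)
    return S, T

# The corresponding element of L(Y) acts on H as S o H + T o JF(H).
```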

We state and prove the following result on $\mathcal{L}(Y)$.

Proposition 3.21 For any $H \in \mathbb{R}^{n\times n}$, $\mathcal{L}(Y)H \subseteq \partial_B \Phi(Y)H$.

Proof: Let $L(H) = P(Y, Z) \circ H + Q(Y, Z) \circ J_Y F(H) \in \mathcal{L}(Y)H$, where $Z \in \mathcal{Z}(Y)$ and $P(Y, Z)$ and $Q(Y, Z)$ are defined in (42). Let $Y^k := Y + \epsilon^k Z$, where $0 < \epsilon^k \downarrow 0$. Notice that $Z_{ij} \neq 0$ whenever $Y_{ij} = 0 = F_{ij}(Y)$. By continuity, we can assume that $\epsilon^k$ is sufficiently small so that $Y^k_{ij} \neq 0$ or $F_{ij}(Y^k) \neq 0$ for all $i, j$, and $\Phi$ is Fréchet differentiable at $Y^k$. In the following, we show that $\lim_{k\to\infty} J_{Y^k}\Phi(H) = L(H)$.

If $(Y_{ij}, F_{ij}(Y)) \neq (0, 0)$, then by continuity it is easy to see that $\lim_{k\to\infty}(J_{Y^k}\Phi(H))_{ij} = (L(H))_{ij}$. If $Y_{ij} = 0 = F_{ij}(Y)$, then Proposition 3.17 implies

  $(J_{Y^k}\Phi(H))_{ij} = \Big(\frac{\epsilon^k Z_{ij}}{\sqrt{(\epsilon^k)^2 Z_{ij}^2 + F_{ij}^2(Y^k)}} - 1\Big) H_{ij} + \Big(\frac{F_{ij}(Y^k)}{\sqrt{(\epsilon^k)^2 Z_{ij}^2 + F_{ij}^2(Y^k)}} - 1\Big)\big(J_{Y^k}F(H)\big)_{ij}. \qquad (43)$

Since $F$ is affine, we have, for $Y_{ij} = 0 = F_{ij}(Y)$,

  $F_{ij}(Y^k) = F_{ij}(Y) + \epsilon^k \big(J_Y F(Z)\big)_{ij} = \epsilon^k \big(J_Y F(Z)\big)_{ij}. \qquad (44)$

By (43), (44), and the continuity of $J_Y F$, we obtain $\lim_{k\to\infty}(J_{Y^k}\Phi(H))_{ij} = (L(H))_{ij}$ for $Y_{ij} = 0 = F_{ij}(Y)$ as well. The proof is complete.

Now, we propose a globalized Newton-type method for solving (36). From Proposition 3.21, it seems natural to solve the linear system

  $L(\Delta Y^k) = -\Phi(Y^k), \quad L \in \mathcal{L}(Y^k).$

However, the nonsingularity of an element $L \in \mathcal{L}(Y^k)$ is not guaranteed. We may modify $L$ by a technique similar to that in [48]. Define

  $\widetilde{\mathcal{L}}(Y) := \{\widetilde{L} = (S + \Delta S) \circ \cdot + (T + \Delta T) \circ J_Y F(\cdot) \mid (S, T) \in \mathcal{G}(Y), \ (\Delta S_{ij}, \Delta T_{ij}) \in \mathcal{Q}(Y, S_{ij}, T_{ij}) \text{ for } i, j = 1, \ldots, n\}, \qquad (45)$

where

  $\mathcal{Q}(Y, S_{ij}, T_{ij}) := \begin{cases} \Big\{(\Delta S_{ij}, \Delta T_{ij}) \in \mathbb{R}^2 \mid \Delta S_{ij} = \dfrac{\theta(\varphi(Y))}{T_{ij}}, \ \Delta T_{ij} = 0\Big\}, & \text{if } |S_{ij}| < \delta \text{ and } |T_{ij}| \ge \delta, \\[2mm] \Big\{(\Delta S_{ij}, \Delta T_{ij}) \in \mathbb{R}^2 \mid \Delta S_{ij} = \mu\dfrac{\theta(\varphi(Y))}{T_{ij}}, \ \Delta T_{ij} = (1 - \mu)\dfrac{\theta(\varphi(Y))}{S_{ij}}, \ \mu \in [0, 1]\Big\}, & \text{if } |S_{ij}| \ge \delta \text{ and } |T_{ij}| \ge \delta, \\[2mm] \Big\{(\Delta S_{ij}, \Delta T_{ij}) \in \mathbb{R}^2 \mid \Delta S_{ij} = 0, \ \Delta T_{ij} = \dfrac{\theta(\varphi(Y))}{S_{ij}}\Big\}, & \text{if } |S_{ij}| \ge \delta \text{ and } |T_{ij}| < \delta. \end{cases} \qquad (46)$

Here, $\delta \in (0, 1 - 1/\sqrt{2})$ and $\theta : \mathbb{R}_+ \to \mathbb{R}_+$ is a nondecreasing continuous function such that $\theta(0) = 0$ and $\theta(t) > 0$ for all $t > 0$. Note that, by (42), $(S_{ij} + 1)^2 + (T_{ij} + 1)^2 = 1$, so for $\delta \in (0, 1 - 1/\sqrt{2})$ the case $|S_{ij}| < \delta$ and $|T_{ij}| < \delta$ cannot occur, and the three cases in (46) are exhaustive. If $\varphi(Y) > 0$, then it is easy to see that $(S + \Delta S), (T + \Delta T) \in \mathbb{R}^{n\times n}_{--}$. In this case, by Lemma 3.20, any element $\widetilde{L} \in \widetilde{\mathcal{L}}(Y)$ is nonsingular.

Proposition 3.22 Let $L = S \circ \cdot + T \circ J_Y F(\cdot) \in \mathcal{L}(Y)$ and $\widetilde{L} = L + \Delta S \circ \cdot + \Delta T \circ J_Y F(\cdot) \in \widetilde{\mathcal{L}}(Y)$, where $(S, T) \in \mathcal{G}(Y)$ and $(\Delta S_{ij}, \Delta T_{ij}) \in \mathcal{Q}(Y, S_{ij}, T_{ij})$ for all $i, j$. Then $(\widetilde{L} - L)^* L$ is positive semidefinite.

Proof: Notice that $\widetilde{L} - L = \Delta S \circ \cdot + \Delta T \circ J_Y F(\cdot)$. Then

  $(\widetilde{L} - L)^* L = \Delta S \circ S \circ \cdot + \Delta S \circ T \circ J_Y F(\cdot) + (J_Y F)^*(\Delta T \circ S \circ \cdot) + (J_Y F)^*(\Delta T \circ T \circ J_Y F(\cdot)).$

Since $S_{ij}, T_{ij}, \Delta S_{ij}, \Delta T_{ij} \le 0$ for all $i, j$, both $\Delta S \circ S \circ \cdot$ and $(J_Y F)^*(\Delta T \circ T \circ J_Y F(\cdot))$ are positive semidefinite. Next, we establish the positive semidefiniteness of $\Delta S \circ T \circ J_Y F(\cdot) + (J_Y F)^*(\Delta T \circ S \circ \cdot)$. For any $H \in \mathbb{R}^{n\times n}$,

  $\langle \Delta S \circ T \circ J_Y F(H) + (J_Y F)^*(\Delta T \circ S \circ H), H \rangle = \langle \Delta S \circ T \circ J_Y F(H), H \rangle + \langle \Delta T \circ S \circ H, J_Y F(H) \rangle = \langle H, (\Delta S \circ T + \Delta T \circ S) \circ J_Y F(H) \rangle.$

Thus, we only need to show the positive semidefiniteness of $(\Delta S \circ T + \Delta T \circ S) \circ J_Y F(\cdot)$. Notice that $(\Delta S_{ij}, \Delta T_{ij}) \in \mathcal{Q}(Y, S_{ij}, T_{ij})$ for all $i, j$. This implies that, for all $i, j$,

  $\big(\Delta S_{ij} T_{ij}, \ \Delta T_{ij} S_{ij}\big) = \begin{cases} (\theta(\varphi(Y)), 0), & \text{if } |S_{ij}| < \delta \text{ and } |T_{ij}| \ge \delta, \\ (\mu\theta(\varphi(Y)), (1 - \mu)\theta(\varphi(Y))), & \text{if } |S_{ij}| \ge \delta \text{ and } |T_{ij}| \ge \delta, \\ (0, \theta(\varphi(Y))), & \text{if } |S_{ij}| \ge \delta \text{ and } |T_{ij}| < \delta. \end{cases}$

Thus, $\Delta S_{ij} T_{ij} + \Delta T_{ij} S_{ij} = \theta(\varphi(Y))$ for all $i, j$, where $\theta(\varphi(Y)) \ge 0$. Therefore, $\Delta S \circ T + \Delta T \circ S = \theta(\varphi(Y))E$, and $(\Delta S \circ T + \Delta T \circ S) \circ J_Y F(\cdot) = \theta(\varphi(Y)) J_Y F(\cdot)$ is positive semidefinite since $J_Y F$ is. This completes the proof.

We now state the Newton-type algorithm for solving Problem (30).

Algorithm 3.23 (A Newton-Type Method for the NIEP)

Step 0. Give $Y^0 \in \mathbb{R}^{n\times n}$, $\eta \in (0, 1)$, and $\rho, \sigma \in (0, 1/2)$. Set $k := 0$.

Step 1. If $\varphi(Y^k) = 0$, stop. Otherwise, go to Step 2.

Step 2. Select an element $\widetilde{L}_k \in \widetilde{\mathcal{L}}(Y^k)$. Apply an iterative method (e.g., the transpose-free quasi-minimal residual (TFQMR) method [20]) to the linear system

  $\widetilde{L}_k(\Delta Y^k) + \Phi(Y^k) = 0 \qquad (47)$

to find $\Delta Y^k \in \mathbb{R}^{n\times n}$ such that

  $\|\widetilde{L}_k(\Delta Y^k) + \Phi(Y^k)\| \le \eta_k \|\Phi(Y^k)\| \qquad (48)$

and

  $\langle \nabla\varphi(Y^k), \Delta Y^k \rangle \le -\eta_k \langle \Delta Y^k, \Delta Y^k \rangle, \qquad (49)$

where $\eta_k := \min\{\eta, \|\Phi(Y^k)\|\}$. If (48) and (49) are not attainable, then let $\Delta Y^k := -\nabla\varphi(Y^k)$.

Step 3. Let $l_k$ be the smallest nonnegative integer $l$ such that

  $\varphi(Y^k + \rho^l \Delta Y^k) - \varphi(Y^k) \le \sigma\rho^l \langle \nabla\varphi(Y^k), \Delta Y^k \rangle.$

Set $Y^{k+1} := Y^k + \rho^{l_k} \Delta Y^k$.

Step 4. Replace $k$ by $k + 1$ and go to Step 1.

We note that, in Steps 2 and 3 of Algorithm 3.23, we need to compute $\nabla\varphi(Y^k)$ at the $k$th iteration. By Propositions 3.19 and 3.21, it is easy to see that $\nabla\varphi(Y^k) = L_k^* \Phi(Y^k)$, where $L_k \in \mathcal{L}(Y^k)$. Since the problem size is $n^2$, the direct solution of (47) needs $O(n^4)$ operations, which is very costly if the problem size is large. Therefore, in Algorithm 3.23, we propose to solve (47) inexactly by iterative methods. Moreover, we will see in Subsection 3.3 that requirement (49) is reasonable (see Proposition 3.24 below).
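For small $n$, the whole method fits in a short script. The following sketch (our simplified illustration, not the authors' implementation: it forms the $n^2 \times n^2$ matrix of $\widetilde{L}_k$ explicitly via a Kronecker product and solves (47) directly instead of by TFQMR, fixes $Z$ to the all-ones matrix and $\mu = 1/2$ in (46), and takes $\theta(t) = \sqrt{t}$):

```python
import numpy as np

def newton_niep(K, B, Y0, delta=0.2, eta=0.5, rho=0.5, sigma=0.4,
                theta=lambda t: np.sqrt(t), tol=1e-12, maxit=100):
    """Simplified dense version of Algorithm 3.23 for small n (a sketch)."""
    n = Y0.shape[0]
    KtK = K.T @ K
    JF = np.kron(KtK, np.eye(n))          # matrix of H -> K^T K H (row-major vec)
    F = lambda Y: KtK @ Y - K.T @ B
    Phi = lambda Y: np.sqrt(Y**2 + F(Y)**2) - (Y + F(Y))
    phi = lambda Y: 0.5 * np.linalg.norm(Phi(Y))**2

    Y = Y0.copy()
    for _ in range(maxit):
        P = Phi(Y)
        if phi(Y) <= tol:
            break
        a, b = Y.copy(), F(Y)
        deg = (a == 0) & (b == 0)
        a[deg] = 1.0; b[deg] = (KtK @ np.ones((n, n)))[deg]   # Z = all-ones in (42)
        r = np.sqrt(a**2 + b**2)
        S, T = a / r - 1.0, b / r - 1.0                        # (S, T) in G(Y)
        Ss = np.where(np.abs(S) < 1e-16, -1.0, S)              # guards: these values
        Ts = np.where(np.abs(T) < 1e-16, -1.0, T)              # are never selected
        th = theta(phi(Y))                                     # perturbation (45)-(46)
        dS = np.where(np.abs(S) < delta, th / Ts,
                      np.where(np.abs(T) < delta, 0.0, 0.5 * th / Ts))
        dT = np.where(np.abs(T) < delta, th / Ss,
                      np.where(np.abs(S) < delta, 0.0, 0.5 * th / Ss))
        grad = S * P + KtK @ (T * P)                           # grad phi = L* Phi
        Lmat = np.diag((S + dS).ravel()) + np.diag((T + dT).ravel()) @ JF
        dY = np.linalg.solve(Lmat, -P.ravel()).reshape(n, n)   # Newton system (47)
        if np.sum(grad * dY) > -min(eta, np.linalg.norm(P)) * np.sum(dY * dY):
            dY = -grad                                         # steepest-descent fallback
        t = 1.0                                                # Armijo search (Step 3)
        while phi(Y + t * dY) - phi(Y) > sigma * t * np.sum(grad * dY) and t > 1e-12:
            t *= rho
        Y = Y + t * dY
    return Y

# Usage: with K = X^T and B = (X Lam)^T as in (30), newton_niep returns Y ~ A^T.
```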

3.3 Convergence Analysis

In this subsection, we establish the global and quadratic convergence of Algorithm 3.23. First, we state the following result on the descent property of the solution to $\widetilde{L}(\Delta Y) + \Phi(Y) = 0$, where $\widetilde{L} \in \widetilde{\mathcal{L}}(Y)$.

Proposition 3.24 If $Y \in \mathbb{R}^{n\times n}$ is not a solution of Problem (30), then any solution $\Delta Y \in \mathbb{R}^{n\times n}$ to

  $\widetilde{L}(\Delta Y) + \Phi(Y) = 0, \quad \widetilde{L} \in \widetilde{\mathcal{L}}(Y), \qquad (50)$

is a descent direction for $\varphi$, i.e., $\langle \nabla\varphi(Y), \Delta Y \rangle < 0$.

Proof: We first show $\langle \nabla\varphi(Y), \Delta Y \rangle \le 0$. Let $\widetilde{L} = L + \Delta S \circ \cdot + \Delta T \circ J_Y F(\cdot) \in \widetilde{\mathcal{L}}(Y)$, where $L \in \mathcal{L}(Y)$ and $(\Delta S_{ij}, \Delta T_{ij}) \in \mathcal{Q}(Y, S_{ij}, T_{ij})$ for all $i, j$. By Proposition 3.19 a) and Proposition 3.21, we have

  $\langle \nabla\varphi(Y), \Delta Y \rangle = \langle L^* \Phi(Y), \Delta Y \rangle = \langle L^*(-\widetilde{L}(\Delta Y)), \Delta Y \rangle = -\langle \Delta Y, \widetilde{L}^* L \Delta Y \rangle = -\langle \Delta Y, L^* L \Delta Y \rangle - \langle \Delta Y, (\widetilde{L} - L)^* L \Delta Y \rangle.$

By Proposition 3.22, we deduce that $\langle \nabla\varphi(Y), \Delta Y \rangle \le 0$.

Next, we prove that if $\langle \nabla\varphi(Y), \Delta Y \rangle = 0$ holds for a solution $\Delta Y$ to (50), then $Y$ is a solution of Problem (30). By contradiction, assume that $\langle \nabla\varphi(Y), \Delta Y \rangle = 0$ for a solution $\Delta Y$ to (50), but $Y$ is not a solution of Problem (30). Since $\Phi(Y) \neq 0$, we have $\Delta Y \neq 0$. Let $\widetilde{L} = (S + \Delta S) \circ \cdot + (T + \Delta T) \circ J_Y F(\cdot) \in \widetilde{\mathcal{L}}(Y)$ be such that $\widetilde{L}(\Delta Y) + \Phi(Y) = 0$. Define $\widehat{L} := \widehat{S} \circ \cdot + \widehat{T} \circ J_Y F(\cdot)$, where, for all $i, j$,

  $(\widehat{S}_{ij}, \widehat{T}_{ij}) = \begin{cases} (S_{ij} + \Delta S_{ij}, \ T_{ij} + \Delta T_{ij}), & \text{if } \omega(Y_{ij}, F_{ij}(Y)) = 0, \\ (S_{ij}, T_{ij}), & \text{if } \omega(Y_{ij}, F_{ij}(Y)) \neq 0. \end{cases}$

By Lemma 3.20, $\widehat{L}$ is nonsingular. Thus, for all $i, j$,

  $(\widehat{L}\,\Delta Y)_{ij} = (\widetilde{L}\,\Delta Y)_{ij} = -(\Phi(Y))_{ij} = 0, \quad \text{if } \omega(Y_{ij}, F_{ij}(Y)) = 0, \qquad (51)$
  $(\widehat{L}\,\Delta Y)_{ij} = (L\,\Delta Y)_{ij}, \quad \text{if } \omega(Y_{ij}, F_{ij}(Y)) \neq 0.$

By (51) and the positive semidefiniteness of $(\widetilde{L} - L)^* L$, it follows from $\langle \nabla\varphi(Y), \Delta Y \rangle = 0$ that $L \Delta Y = 0$. Thus, if $\omega(Y_{ij}, F_{ij}(Y)) \neq 0$, then $(\widehat{L}\,\Delta Y)_{ij} = (L\,\Delta Y)_{ij} = 0$. Hence $\widehat{L}\,\Delta Y = 0$ and, taking into account the nonsingularity of $\widehat{L}$, we deduce $\Delta Y = 0$. This is a contradiction, since $\Delta Y \neq 0$, and the proof is complete.

By an argument similar to the proof of Theorem 11 (a) in [13] or Theorem 3.1 in [48], we can derive the following theorem on the global convergence of Algorithm 3.23; we omit the proof here.

Theorem 3.25 Any accumulation point of the sequence $\{Y^k\}$ generated by Algorithm 3.23 is a solution to Problem (30).

The remaining part of this subsection is concerned with the quadratic convergence of Algorithm 3.23. Here, we assume that every element in $\partial_B \Phi(\cdot)$ is nonsingular at the accumulation point (the nonsingularity of $\partial_B \Phi(\cdot)$ is studied in Subsection 3.4).

Assumption 3.26 All elements in $\partial_B \Phi(Y^*)$ are nonsingular, where $Y^*$ is an accumulation point of the sequence $\{Y^k\}$ generated by Algorithm 3.23.

Under Assumption 3.26, we obtain the following lemma; the proof is similar to that of Theorem 11 (b), (c) in [13] and we omit it.

Lemma 3.27 Let $Y^*$ be an accumulation point of the sequence $\{Y^k\}$ generated by Algorithm 3.23. Suppose that Assumption 3.26 is satisfied. Then the sequence $\{Y^k\}$ converges to $Y^*$.

We now establish the quadratic convergence of Algorithm 3.23.

Theorem 3.28 Let $Y^*$ be an accumulation point of the sequence $\{Y^k\}$ generated by Algorithm 3.23. Suppose that Assumption 3.26 is satisfied and that the function $\theta$ satisfies $\theta(t) = O(\sqrt{t})$. Then the sequence $\{Y^k\}$ converges to $Y^*$ quadratically.

Proof: By Proposition 3.18, $\Phi$ is strongly semismooth. Thus, by Proposition 3.21, we get

  $\|\Phi(Y^k) - \Phi(Y^*) - L_k(Y^k - Y^*)\| = O(\|Y^k - Y^*\|^2). \qquad (52)$

Moreover, by the assumption that $\theta(t) = O(\sqrt{t})$, we have, for all $k$ sufficiently large,

  $\theta(\varphi(Y^k)) = O\big(\sqrt{\varphi(Y^k)}\big) = O(\|\Phi(Y^k)\|) = O(\|\Phi(Y^k) - \Phi(Y^*)\|) = O(\|Y^k - Y^*\|). \qquad (53)$

Since Assumption 3.26 is satisfied, by Lemma 3.27, $\lim_{k\to\infty} Y^k = Y^*$. Hence, by Proposition 3.21, for all $k$ sufficiently large, $L_k$ is nonsingular and $\|L_k^{-1}\|$ is uniformly bounded. Since $\theta(\varphi(Y^k)) \to 0$, it is easy to see from (45) and (46) that, for all $k$ sufficiently large, $\|\widetilde{L}_k - L_k\| = O(\theta(\varphi(Y^k)))$ is small enough, and hence $\|\widetilde{L}_k^{-1}\|$ is also uniformly bounded. Thus, for all $k$ sufficiently large, an iterative method can find $\Delta Y^k$ such that both (48) and (49) are satisfied.

Therefore, by (48), (52), and (53), for all $k$ sufficiently large,

  $\|Y^k + \Delta Y^k - Y^*\| = \big\|\widetilde{L}_k^{-1}\big(\Phi(Y^*) - \Phi(Y^k) - L_k(Y^k - Y^*) - (\widetilde{L}_k - L_k)(Y^k - Y^*) + \widetilde{L}_k\Delta Y^k + \Phi(Y^k)\big)\big\|$
  $\le \|\widetilde{L}_k^{-1}\| \|\Phi(Y^*) - \Phi(Y^k) - L_k(Y^k - Y^*)\| + \|\widetilde{L}_k^{-1}\| \|\widetilde{L}_k - L_k\| \|Y^k - Y^*\| + \|\widetilde{L}_k^{-1}\| \|\widetilde{L}_k\Delta Y^k + \Phi(Y^k)\|$
  $= O(\|Y^k - Y^*\|^2) + O(\|Y^k - Y^*\|^2) + \eta_k \|\widetilde{L}_k^{-1}\| \|\Phi(Y^k)\| = O(\|Y^k - Y^*\|^2) + O(\|\Phi(Y^k)\|^2) = O(\|Y^k - Y^*\|^2). \qquad (54)$

On the other hand, for any given $\tau > 0$ and for all $k$ sufficiently large,

  $\langle \nabla\varphi(Y^k), \Delta Y^k \rangle = \langle L_k^* \Phi(Y^k), \Delta Y^k \rangle = \big\langle (L_k - \widetilde{L}_k + \widetilde{L}_k)^* \Phi(Y^k), \ \widetilde{L}_k^{-1}\big((\widetilde{L}_k\Delta Y^k + \Phi(Y^k)) - \Phi(Y^k)\big) \big\rangle$
  $= \langle \Phi(Y^k), (L_k - \widetilde{L}_k)\widetilde{L}_k^{-1}(\widetilde{L}_k\Delta Y^k + \Phi(Y^k)) \rangle - \langle \Phi(Y^k), (L_k - \widetilde{L}_k)\widetilde{L}_k^{-1}\Phi(Y^k) \rangle + \langle \Phi(Y^k), \widetilde{L}_k\Delta Y^k + \Phi(Y^k) \rangle - \langle \Phi(Y^k), \Phi(Y^k) \rangle$
  $\le \|L_k - \widetilde{L}_k\| \|\widetilde{L}_k^{-1}\| \|\Phi(Y^k)\| \|\widetilde{L}_k\Delta Y^k + \Phi(Y^k)\| + \|L_k - \widetilde{L}_k\| \|\widetilde{L}_k^{-1}\| \|\Phi(Y^k)\|^2 + \|\Phi(Y^k)\| \|\widetilde{L}_k\Delta Y^k + \Phi(Y^k)\| - \|\Phi(Y^k)\|^2$
  $\le -2\big(1 - \eta_k - (\eta_k + 1)\|L_k - \widetilde{L}_k\| \|\widetilde{L}_k^{-1}\|\big)\varphi(Y^k) \le -2(1 - \tau)\varphi(Y^k), \qquad (55)$

where the last step uses the facts that $\|\widetilde{L}_k^{-1}\|$ is uniformly bounded and that $\eta_k$ and $\|L_k - \widetilde{L}_k\|$ are small enough for all $k$ sufficiently large. Moreover, there exists a constant $\xi > 0$ such that, for all $k$ sufficiently large,

  $\|Y^k - Y^*\| \le \|\Delta Y^k\| + \|Y^k + \Delta Y^k - Y^*\| \le \|\widetilde{L}_k^{-1}\Phi(Y^k)\| + \|\widetilde{L}_k^{-1}\| \|\widetilde{L}_k\Delta Y^k + \Phi(Y^k)\| + \|Y^k + \Delta Y^k - Y^*\| \le (1 + \eta_k)\|\widetilde{L}_k^{-1}\| \|\Phi(Y^k)\| + \|Y^k + \Delta Y^k - Y^*\| \le \xi\|\Phi(Y^k)\| + \|Y^k + \Delta Y^k - Y^*\|. \qquad (56)$

By (54), for all $k$ large enough, there exists a sufficiently small $\varsigma > 0$ such that

  $\|Y^k + \Delta Y^k - Y^*\| \le \varsigma\|Y^k - Y^*\|.$

This, together with (56), leads to

  $\|Y^k + \Delta Y^k - Y^*\| \le \frac{\varsigma\xi}{1 - \varsigma}\|\Phi(Y^k)\|. \qquad (57)$

Since $\Phi$ is Lipschitz continuous near $Y^*$ with a Lipschitz constant $\varrho > 0$, we have by (57), for all $k$ sufficiently large,

  $\|\Phi(Y^k + \Delta Y^k)\| = \|\Phi(Y^k + \Delta Y^k) - \Phi(Y^*)\| \le \varrho\|Y^k + \Delta Y^k - Y^*\| \le \varrho\frac{\varsigma\xi}{1 - \varsigma}\|\Phi(Y^k)\|,$

or

  $\varphi(Y^k + \Delta Y^k) \le \Big(\frac{\varsigma\zeta}{1 - \varsigma}\Big)^2 \varphi(Y^k), \quad \zeta := \varrho\xi. \qquad (58)$

Since $\sigma \in (0, 1/2)$, by (55) we can choose a sufficiently small $\tau > 0$ for which $2(1 - \tau)\sigma < 1$. Also, by (58), $\varsigma > 0$ can be taken small enough such that

  $1 - \Big(\frac{\varsigma\zeta}{1 - \varsigma}\Big)^2 > 2(1 - \tau)\sigma.$

From (55) and (58), for all $k$ sufficiently large, we obtain

  $\varphi(Y^k + \Delta Y^k) - \varphi(Y^k) \le -\Big(1 - \Big(\frac{\varsigma\zeta}{1 - \varsigma}\Big)^2\Big)\varphi(Y^k) \le -2(1 - \tau)\sigma\varphi(Y^k) \le \sigma\langle \nabla\varphi(Y^k), \Delta Y^k \rangle,$

which implies that, for all $k$ sufficiently large, the unit step is accepted, i.e.,

  $Y^{k+1} = Y^k + \Delta Y^k.$

Thus, by exploiting (54), the proof is complete.

3.4 Nonsingularity Conditions of $\partial\Phi(\cdot)$

Finally, we discuss the nonsingularity of $\partial\Phi(\cdot)$ at a solution $Y^*$ of Problem (30). Notice that the nonsmooth equation (36) is equivalent to the nonsmooth equation (32). As in Subsection 3.1, both $H(\cdot, \cdot)$ and $\Phi(\cdot)$ should share similar nonsingularity conditions.

Let $(Y^*, Z^*)$ be the KKT point of Problem (30). The critical cone $\mathcal{C}(Y^*)$ of Problem (30) at $Y^*$ is defined by

  $\mathcal{C}(Y^*) := \{d \mid d \in T_{\mathbb{R}^{n\times n}_+}(Y^*), \ J_Y f(Y^*)d \le 0\}.$

Since $Y^*$ is a stationary point of Problem (30), we have

  $\mathcal{C}(Y^*) = \{d \mid d \in T_{\mathbb{R}^{n\times n}_+}(Y^*), \ J_Y f(Y^*)d = 0\}.$

Since $(Y^*, Z^*)$ is a KKT point of Problem (30), we have by [14]

  $-Z^* \in N_{\mathbb{R}^{n\times n}_+}(Y^*) \iff \mathbb{R}^{n\times n}_+ \ni Z^* \perp Y^* \in \mathbb{R}^{n\times n}_+.$

Then $C := Y^* - Z^*$ can be rewritten in analogy to (4), i.e.,

  $Y^* = P(C_{\alpha\cup\beta}) \quad \text{and} \quad Z^* = -P(C_{\beta\cup\gamma}).$

By (10) and (11), we get

  $T_{\mathbb{R}^{n\times n}_+}(Y^*) = \{H \in \mathbb{R}^{n\times n} \mid H_\beta \ge 0, \ H_\gamma \ge 0\},$
  $T_{\mathbb{R}^{n\times n}_+}(Y^*) \cap (Z^*)^\perp = \{H \in \mathbb{R}^{n\times n} \mid H_\beta \ge 0, \ H_\gamma = 0\},$
  $\mathrm{lin}\big(T_{\mathbb{R}^{n\times n}_+}(Y^*)\big) = \{H \in \mathbb{R}^{n\times n} \mid H_\beta = 0, \ H_\gamma = 0\}.$

Since $M(Y^*)$ is nonempty, by (13) we infer that

  $\mathcal{C}(Y^*) = \{d \mid d \in \mathcal{C}(C; \mathbb{R}^{n\times n}_+)\} = \mathcal{C}(C; \mathbb{R}^{n\times n}_+).$

Thus

  $\mathrm{aff}(\mathcal{C}(Y^*)) = \{d \in \mathbb{R}^{n\times n} \mid d_\gamma = 0\}.$

Therefore, based on Theorem 3.14, we can prove the following result on the nonsingularity of $\partial\Phi(\cdot)$.

Theorem 3.29 Let $Y^*$ be an accumulation point of the sequence $\{Y^k\}$ generated by Algorithm 3.23. Let $C := Y^* - Z^*$, where $Z^* := F(Y^*)$. Suppose that the strong second-order sufficient condition holds at $Y^*$, i.e.,

  $\langle d, J^2_{YY} f(Y^*)d \rangle > 0 \quad \forall\, d \in \mathrm{aff}(\mathcal{C}(Y^*)) \setminus \{0\}. \qquad (59)$

Then any element in $\partial\Phi(Y^*)$ is nonsingular.

Proof: By Theorem 3.25, we know that $Y^*$ is a solution to Problem (30). Let $V$ be an arbitrary element in $\partial\Phi(Y^*)$ and let $\Delta Y \in \mathbb{R}^{n\times n}$ be such that $V(\Delta Y) = 0$. By Proposition 3.17, we have

  $V(\Delta Y) = (\Gamma(Y^*) - E) \circ \Delta Y + (\Omega(Y^*) - E) \circ \big(J^2_{YY} f(Y^*)(\Delta Y)\big) = 0. \qquad (60)$

By the assumption that $C := Y^* - Z^*$ and $Z^* := F(Y^*)$, we obtain

  $\alpha := \{(i,j) \mid C_{ij} > 0\} = \{(i,j) \mid Y^*_{ij} > 0, \ F_{ij}(Y^*) = 0\},$
  $\beta := \{(i,j) \mid C_{ij} = 0\} = \{(i,j) \mid Y^*_{ij} = 0, \ F_{ij}(Y^*) = 0\},$
  $\gamma := \{(i,j) \mid C_{ij} < 0\} = \{(i,j) \mid Y^*_{ij} = 0, \ F_{ij}(Y^*) > 0\},$

which implies that

  $\Gamma_{ij}(Y^*) = \begin{cases} 1, & \text{if } (i,j) \in \alpha, \\ \Gamma_{ij}, & \text{if } (i,j) \in \beta, \\ 0, & \text{if } (i,j) \in \gamma, \end{cases} \qquad \text{and} \qquad \Omega_{ij}(Y^*) = \begin{cases} 0, & \text{if } (i,j) \in \alpha, \\ \Omega_{ij}, & \text{if } (i,j) \in \beta, \\ 1, & \text{if } (i,j) \in \gamma, \end{cases} \qquad (61)$

where $\|(\Gamma_{ij}, \Omega_{ij})\| \le 1$. From (61), it follows that (60) reduces to

  $\big(J^2_{YY} f(Y^*)(\Delta Y)\big)_\alpha = 0, \quad (\Gamma_\beta(Y^*) - E_\beta) \circ \Delta Y_\beta + (\Omega_\beta(Y^*) - E_\beta) \circ \big(J^2_{YY} f(Y^*)(\Delta Y)\big)_\beta = 0, \quad \Delta Y_\gamma = 0. \qquad (62)$

Let

  $\beta_1 := \{(i,j) \in \beta \mid \Gamma_{ij} = 1, \ \Omega_{ij} = 0\}, \quad \beta_2 := \{(i,j) \in \beta \mid \Gamma_{ij} = 0, \ \Omega_{ij} = 1\}, \quad \beta_3 := \beta \setminus (\beta_1 \cup \beta_2).$

Then (62) takes the form

  $\big(J^2_{YY} f(Y^*)(\Delta Y)\big)_{\alpha\cup\beta_1} = 0, \quad (\Gamma_{\beta_3}(Y^*) - E_{\beta_3}) \circ \Delta Y_{\beta_3} + (\Omega_{\beta_3}(Y^*) - E_{\beta_3}) \circ \big(J^2_{YY} f(Y^*)(\Delta Y)\big)_{\beta_3} = 0, \quad \Delta Y_{\gamma\cup\beta_2} = 0, \qquad (63)$

where $\Gamma_{\beta_3}(Y^*) - E_{\beta_3}$ and $\Omega_{\beta_3}(Y^*) - E_{\beta_3}$ have all entries strictly negative. By the assumption that the strong second-order sufficient condition (59) holds at $Y^*$, we have

  $\langle \Delta Y_{\alpha\cup\beta_1}, \big(J^2_{YY} f(Y^*)(\Delta Y)\big)_{\alpha\cup\beta_1} \rangle > 0 \quad \text{if } \Delta Y_{\alpha\cup\beta_1} \neq 0, \qquad \langle \Delta Y_{\beta_3}, \big(J^2_{YY} f(Y^*)(\Delta Y)\big)_{\beta_3} \rangle > 0 \quad \text{if } \Delta Y_{\beta_3} \neq 0. \qquad (64)$

From the first equality of (63) and the first inequality of (64), we get

  $\Delta Y_{\alpha\cup\beta_1} = 0. \qquad (65)$

From the second equality of (63), we deduce that

  $(\Delta Y_{\beta_3})_{ij} = -\frac{\Omega_{ij}(Y^*) - E_{ij}}{\Gamma_{ij}(Y^*) - E_{ij}} \big(J^2_{YY} f(Y^*)(\Delta Y)\big)_{ij}, \quad (i,j) \in \beta_3.$

This leads to

  $\langle \Delta Y_{\beta_3}, \big(J^2_{YY} f(Y^*)(\Delta Y)\big)_{\beta_3} \rangle = -\sum_{(i,j)\in\beta_3} \frac{\Omega_{ij}(Y^*) - E_{ij}}{\Gamma_{ij}(Y^*) - E_{ij}} \big(J^2_{YY} f(Y^*)(\Delta Y)\big)^2_{ij} \le 0,$

which, together with the second inequality of (64), implies that

  $\Delta Y_{\beta_3} = 0. \qquad (66)$

Therefore, by (65), (66), and the last equality of (63), we conclude that $\Delta Y = 0$, i.e., $V$ is nonsingular.


More information

AN AUGMENTED LAGRANGIAN AFFINE SCALING METHOD FOR NONLINEAR PROGRAMMING

AN AUGMENTED LAGRANGIAN AFFINE SCALING METHOD FOR NONLINEAR PROGRAMMING AN AUGMENTED LAGRANGIAN AFFINE SCALING METHOD FOR NONLINEAR PROGRAMMING XIAO WANG AND HONGCHAO ZHANG Abstract. In this paper, we propose an Augmented Lagrangian Affine Scaling (ALAS) algorithm for general

More information

Positive semidefinite matrix approximation with a trace constraint

Positive semidefinite matrix approximation with a trace constraint Positive semidefinite matrix approximation with a trace constraint Kouhei Harada August 8, 208 We propose an efficient algorithm to solve positive a semidefinite matrix approximation problem with a trace

More information

A Continuation Method for the Solution of Monotone Variational Inequality Problems

A Continuation Method for the Solution of Monotone Variational Inequality Problems A Continuation Method for the Solution of Monotone Variational Inequality Problems Christian Kanzow Institute of Applied Mathematics University of Hamburg Bundesstrasse 55 D 20146 Hamburg Germany e-mail:

More information

Convex Optimization Theory. Chapter 5 Exercises and Solutions: Extended Version

Convex Optimization Theory. Chapter 5 Exercises and Solutions: Extended Version Convex Optimization Theory Chapter 5 Exercises and Solutions: Extended Version Dimitri P. Bertsekas Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts http://www.athenasc.com

More information

Lecture notes: Applied linear algebra Part 1. Version 2

Lecture notes: Applied linear algebra Part 1. Version 2 Lecture notes: Applied linear algebra Part 1. Version 2 Michael Karow Berlin University of Technology karow@math.tu-berlin.de October 2, 2008 1 Notation, basic notions and facts 1.1 Subspaces, range and

More information

Lectures 9 and 10: Constrained optimization problems and their optimality conditions

Lectures 9 and 10: Constrained optimization problems and their optimality conditions Lectures 9 and 10: Constrained optimization problems and their optimality conditions Coralia Cartis, Mathematical Institute, University of Oxford C6.2/B2: Continuous Optimization Lectures 9 and 10: Constrained

More information

ON A CLASS OF NONSMOOTH COMPOSITE FUNCTIONS

ON A CLASS OF NONSMOOTH COMPOSITE FUNCTIONS MATHEMATICS OF OPERATIONS RESEARCH Vol. 28, No. 4, November 2003, pp. 677 692 Printed in U.S.A. ON A CLASS OF NONSMOOTH COMPOSITE FUNCTIONS ALEXANDER SHAPIRO We discuss in this paper a class of nonsmooth

More information

University of Houston, Department of Mathematics Numerical Analysis, Fall 2005

University of Houston, Department of Mathematics Numerical Analysis, Fall 2005 3 Numerical Solution of Nonlinear Equations and Systems 3.1 Fixed point iteration Reamrk 3.1 Problem Given a function F : lr n lr n, compute x lr n such that ( ) F(x ) = 0. In this chapter, we consider

More information

1. Introduction The nonlinear complementarity problem (NCP) is to nd a point x 2 IR n such that hx; F (x)i = ; x 2 IR n + ; F (x) 2 IRn + ; where F is

1. Introduction The nonlinear complementarity problem (NCP) is to nd a point x 2 IR n such that hx; F (x)i = ; x 2 IR n + ; F (x) 2 IRn + ; where F is New NCP-Functions and Their Properties 3 by Christian Kanzow y, Nobuo Yamashita z and Masao Fukushima z y University of Hamburg, Institute of Applied Mathematics, Bundesstrasse 55, D-2146 Hamburg, Germany,

More information

Optimization Theory. A Concise Introduction. Jiongmin Yong

Optimization Theory. A Concise Introduction. Jiongmin Yong October 11, 017 16:5 ws-book9x6 Book Title Optimization Theory 017-08-Lecture Notes page 1 1 Optimization Theory A Concise Introduction Jiongmin Yong Optimization Theory 017-08-Lecture Notes page Optimization

More information

Optimization and Optimal Control in Banach Spaces

Optimization and Optimal Control in Banach Spaces Optimization and Optimal Control in Banach Spaces Bernhard Schmitzer October 19, 2017 1 Convex non-smooth optimization with proximal operators Remark 1.1 (Motivation). Convex optimization: easier to solve,

More information

Introduction to Optimization Techniques. Nonlinear Optimization in Function Spaces

Introduction to Optimization Techniques. Nonlinear Optimization in Function Spaces Introduction to Optimization Techniques Nonlinear Optimization in Function Spaces X : T : Gateaux and Fréchet Differentials Gateaux and Fréchet Differentials a vector space, Y : a normed space transformation

More information

5 Handling Constraints

5 Handling Constraints 5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

An introduction to Birkhoff normal form

An introduction to Birkhoff normal form An introduction to Birkhoff normal form Dario Bambusi Dipartimento di Matematica, Universitá di Milano via Saldini 50, 0133 Milano (Italy) 19.11.14 1 Introduction The aim of this note is to present an

More information

Mathematics 530. Practice Problems. n + 1 }

Mathematics 530. Practice Problems. n + 1 } Department of Mathematical Sciences University of Delaware Prof. T. Angell October 19, 2015 Mathematics 530 Practice Problems 1. Recall that an indifference relation on a partially ordered set is defined

More information

Key words. linear complementarity problem, non-interior-point algorithm, Tikhonov regularization, P 0 matrix, regularized central path

Key words. linear complementarity problem, non-interior-point algorithm, Tikhonov regularization, P 0 matrix, regularized central path A GLOBALLY AND LOCALLY SUPERLINEARLY CONVERGENT NON-INTERIOR-POINT ALGORITHM FOR P 0 LCPS YUN-BIN ZHAO AND DUAN LI Abstract Based on the concept of the regularized central path, a new non-interior-point

More information

Lecture 7 Monotonicity. September 21, 2008

Lecture 7 Monotonicity. September 21, 2008 Lecture 7 Monotonicity September 21, 2008 Outline Introduce several monotonicity properties of vector functions Are satisfied immediately by gradient maps of convex functions In a sense, role of monotonicity

More information

Chap 2. Optimality conditions

Chap 2. Optimality conditions Chap 2. Optimality conditions Version: 29-09-2012 2.1 Optimality conditions in unconstrained optimization Recall the definitions of global, local minimizer. Geometry of minimization Consider for f C 1

More information

The Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment

The Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment he Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment William Glunt 1, homas L. Hayden 2 and Robert Reams 2 1 Department of Mathematics and Computer Science, Austin Peay State

More information

Gerd Wachsmuth. January 22, 2016

Gerd Wachsmuth. January 22, 2016 Strong stationarity for optimization problems with complementarity constraints in absence of polyhedricity With applications to optimization with semidefinite and second-order-cone complementarity constraints

More information

Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming

Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming Iterative Reweighted Minimization Methods for l p Regularized Unconstrained Nonlinear Programming Zhaosong Lu October 5, 2012 (Revised: June 3, 2013; September 17, 2013) Abstract In this paper we study

More information

ON GAP FUNCTIONS OF VARIATIONAL INEQUALITY IN A BANACH SPACE. Sangho Kum and Gue Myung Lee. 1. Introduction

ON GAP FUNCTIONS OF VARIATIONAL INEQUALITY IN A BANACH SPACE. Sangho Kum and Gue Myung Lee. 1. Introduction J. Korean Math. Soc. 38 (2001), No. 3, pp. 683 695 ON GAP FUNCTIONS OF VARIATIONAL INEQUALITY IN A BANACH SPACE Sangho Kum and Gue Myung Lee Abstract. In this paper we are concerned with theoretical properties

More information

Algorithms for Nonsmooth Optimization

Algorithms for Nonsmooth Optimization Algorithms for Nonsmooth Optimization Frank E. Curtis, Lehigh University presented at Center for Optimization and Statistical Learning, Northwestern University 2 March 2018 Algorithms for Nonsmooth Optimization

More information

First order optimality conditions for mathematical programs with second-order cone complementarity constraints

First order optimality conditions for mathematical programs with second-order cone complementarity constraints First order optimality conditions for mathematical programs with second-order cone complementarity constraints Jane J. Ye and Jinchuan Zhou April 9, 05 Abstract In this paper we consider a mathematical

More information

Douglas-Rachford splitting for nonconvex feasibility problems

Douglas-Rachford splitting for nonconvex feasibility problems Douglas-Rachford splitting for nonconvex feasibility problems Guoyin Li Ting Kei Pong Jan 3, 015 Abstract We adapt the Douglas-Rachford DR) splitting method to solve nonconvex feasibility problems by studying

More information

Examination paper for TMA4180 Optimization I

Examination paper for TMA4180 Optimization I Department of Mathematical Sciences Examination paper for TMA4180 Optimization I Academic contact during examination: Phone: Examination date: 26th May 2016 Examination time (from to): 09:00 13:00 Permitted

More information

In English, this means that if we travel on a straight line between any two points in C, then we never leave C.

In English, this means that if we travel on a straight line between any two points in C, then we never leave C. Convex sets In this section, we will be introduced to some of the mathematical fundamentals of convex sets. In order to motivate some of the definitions, we will look at the closest point problem from

More information

A QP-FREE CONSTRAINED NEWTON-TYPE METHOD FOR VARIATIONAL INEQUALITY PROBLEMS. Christian Kanzow 1 and Hou-Duo Qi 2

A QP-FREE CONSTRAINED NEWTON-TYPE METHOD FOR VARIATIONAL INEQUALITY PROBLEMS. Christian Kanzow 1 and Hou-Duo Qi 2 A QP-FREE CONSTRAINED NEWTON-TYPE METHOD FOR VARIATIONAL INEQUALITY PROBLEMS Christian Kanzow 1 and Hou-Duo Qi 2 1 University of Hamburg Institute of Applied Mathematics Bundesstrasse 55, D-20146 Hamburg,

More information

On the Moreau-Yosida regularization of the vector k-norm related functions

On the Moreau-Yosida regularization of the vector k-norm related functions On the Moreau-Yosida regularization of the vector k-norm related functions Bin Wu, Chao Ding, Defeng Sun and Kim-Chuan Toh This version: March 08, 2011 Abstract In this paper, we conduct a thorough study

More information

A Distributed Newton Method for Network Utility Maximization, II: Convergence

A Distributed Newton Method for Network Utility Maximization, II: Convergence A Distributed Newton Method for Network Utility Maximization, II: Convergence Ermin Wei, Asuman Ozdaglar, and Ali Jadbabaie October 31, 2012 Abstract The existing distributed algorithms for Network Utility

More information

Lecture 3. Optimization Problems and Iterative Algorithms

Lecture 3. Optimization Problems and Iterative Algorithms Lecture 3 Optimization Problems and Iterative Algorithms January 13, 2016 This material was jointly developed with Angelia Nedić at UIUC for IE 598ns Outline Special Functions: Linear, Quadratic, Convex

More information

An Introduction to Correlation Stress Testing

An Introduction to Correlation Stress Testing An Introduction to Correlation Stress Testing Defeng Sun Department of Mathematics and Risk Management Institute National University of Singapore This is based on a joint work with GAO Yan at NUS March

More information

Affine scaling interior Levenberg-Marquardt method for KKT systems. C S:Levenberg-Marquardt{)KKTXÚ

Affine scaling interior Levenberg-Marquardt method for KKT systems. C S:Levenberg-Marquardt{)KKTXÚ 2013c6 $ Ê Æ Æ 117ò 12Ï June, 2013 Operations Research Transactions Vol.17 No.2 Affine scaling interior Levenberg-Marquardt method for KKT systems WANG Yunjuan 1, ZHU Detong 2 Abstract We develop and analyze

More information

1 Directional Derivatives and Differentiability

1 Directional Derivatives and Differentiability Wednesday, January 18, 2012 1 Directional Derivatives and Differentiability Let E R N, let f : E R and let x 0 E. Given a direction v R N, let L be the line through x 0 in the direction v, that is, L :=

More information

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 0-0 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R

More information

system of equations. In particular, we give a complete characterization of the Q-superlinear

system of equations. In particular, we give a complete characterization of the Q-superlinear INEXACT NEWTON METHODS FOR SEMISMOOTH EQUATIONS WITH APPLICATIONS TO VARIATIONAL INEQUALITY PROBLEMS Francisco Facchinei 1, Andreas Fischer 2 and Christian Kanzow 3 1 Dipartimento di Informatica e Sistemistica

More information

Semi-infinite programming, duality, discretization and optimality conditions

Semi-infinite programming, duality, discretization and optimality conditions Semi-infinite programming, duality, discretization and optimality conditions Alexander Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205,

More information

Math 5630: Iterative Methods for Systems of Equations Hung Phan, UMass Lowell March 22, 2018

Math 5630: Iterative Methods for Systems of Equations Hung Phan, UMass Lowell March 22, 2018 1 Linear Systems Math 5630: Iterative Methods for Systems of Equations Hung Phan, UMass Lowell March, 018 Consider the system 4x y + z = 7 4x 8y + z = 1 x + y + 5z = 15. We then obtain x = 1 4 (7 + y z)

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

PATTERN SEARCH METHODS FOR LINEARLY CONSTRAINED MINIMIZATION

PATTERN SEARCH METHODS FOR LINEARLY CONSTRAINED MINIMIZATION PATTERN SEARCH METHODS FOR LINEARLY CONSTRAINED MINIMIZATION ROBERT MICHAEL LEWIS AND VIRGINIA TORCZON Abstract. We extend pattern search methods to linearly constrained minimization. We develop a general

More information

Sparse Approximation via Penalty Decomposition Methods

Sparse Approximation via Penalty Decomposition Methods Sparse Approximation via Penalty Decomposition Methods Zhaosong Lu Yong Zhang February 19, 2012 Abstract In this paper we consider sparse approximation problems, that is, general l 0 minimization problems

More information

Department of Social Systems and Management. Discussion Paper Series

Department of Social Systems and Management. Discussion Paper Series Department of Social Systems and Management Discussion Paper Series No. 1262 Complementarity Problems over Symmetric Cones: A Survey of Recent Developments in Several Aspects by Akiko YOSHISE July 2010

More information

Constrained Optimization and Lagrangian Duality

Constrained Optimization and Lagrangian Duality CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

More information

Calibrating Least Squares Semidefinite Programming with Equality and Inequality Constraints

Calibrating Least Squares Semidefinite Programming with Equality and Inequality Constraints Calibrating Least Squares Semidefinite Programming with Equality and Inequality Constraints Yan Gao and Defeng Sun June 12, 2008; Revised on June 30, 2009 Abstract In this paper, we consider the least

More information

A CHARACTERIZATION OF STRICT LOCAL MINIMIZERS OF ORDER ONE FOR STATIC MINMAX PROBLEMS IN THE PARAMETRIC CONSTRAINT CASE

A CHARACTERIZATION OF STRICT LOCAL MINIMIZERS OF ORDER ONE FOR STATIC MINMAX PROBLEMS IN THE PARAMETRIC CONSTRAINT CASE Journal of Applied Analysis Vol. 6, No. 1 (2000), pp. 139 148 A CHARACTERIZATION OF STRICT LOCAL MINIMIZERS OF ORDER ONE FOR STATIC MINMAX PROBLEMS IN THE PARAMETRIC CONSTRAINT CASE A. W. A. TAHA Received

More information

Lecture 9 Monotone VIs/CPs Properties of cones and some existence results. October 6, 2008

Lecture 9 Monotone VIs/CPs Properties of cones and some existence results. October 6, 2008 Lecture 9 Monotone VIs/CPs Properties of cones and some existence results October 6, 2008 Outline Properties of cones Existence results for monotone CPs/VIs Polyhedrality of solution sets Game theory:

More information

1. Introduction. Consider the following quadratically constrained quadratic optimization problem:

1. Introduction. Consider the following quadratically constrained quadratic optimization problem: ON LOCAL NON-GLOBAL MINIMIZERS OF QUADRATIC OPTIMIZATION PROBLEM WITH A SINGLE QUADRATIC CONSTRAINT A. TAATI AND M. SALAHI Abstract. In this paper, we consider the nonconvex quadratic optimization problem

More information

Numerical Methods for Large-Scale Nonlinear Systems

Numerical Methods for Large-Scale Nonlinear Systems Numerical Methods for Large-Scale Nonlinear Systems Handouts by Ronald H.W. Hoppe following the monograph P. Deuflhard Newton Methods for Nonlinear Problems Springer, Berlin-Heidelberg-New York, 2004 Num.

More information

1. Nonlinear Equations. This lecture note excerpted parts from Michael Heath and Max Gunzburger. f(x) = 0

1. Nonlinear Equations. This lecture note excerpted parts from Michael Heath and Max Gunzburger. f(x) = 0 Numerical Analysis 1 1. Nonlinear Equations This lecture note excerpted parts from Michael Heath and Max Gunzburger. Given function f, we seek value x for which where f : D R n R n is nonlinear. f(x) =

More information

Journal of Inequalities in Pure and Applied Mathematics

Journal of Inequalities in Pure and Applied Mathematics Journal of Inequalities in Pure and Applied Mathematics MONOTONE TRAJECTORIES OF DYNAMICAL SYSTEMS AND CLARKE S GENERALIZED JACOBIAN GIOVANNI P. CRESPI AND MATTEO ROCCA Université de la Vallée d Aoste

More information

FIXED POINT ITERATIONS

FIXED POINT ITERATIONS FIXED POINT ITERATIONS MARKUS GRASMAIR 1. Fixed Point Iteration for Non-linear Equations Our goal is the solution of an equation (1) F (x) = 0, where F : R n R n is a continuous vector valued mapping in

More information

Iteration-complexity of first-order penalty methods for convex programming

Iteration-complexity of first-order penalty methods for convex programming Iteration-complexity of first-order penalty methods for convex programming Guanghui Lan Renato D.C. Monteiro July 24, 2008 Abstract This paper considers a special but broad class of convex programing CP)

More information

Solving generalized semi-infinite programs by reduction to simpler problems.

Solving generalized semi-infinite programs by reduction to simpler problems. Solving generalized semi-infinite programs by reduction to simpler problems. G. Still, University of Twente January 20, 2004 Abstract. The paper intends to give a unifying treatment of different approaches

More information

A convergence result for an Outer Approximation Scheme

A convergence result for an Outer Approximation Scheme A convergence result for an Outer Approximation Scheme R. S. Burachik Engenharia de Sistemas e Computação, COPPE-UFRJ, CP 68511, Rio de Janeiro, RJ, CEP 21941-972, Brazil regi@cos.ufrj.br J. O. Lopes Departamento

More information

Convex Functions and Optimization

Convex Functions and Optimization Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized

More information

c 1999 Society for Industrial and Applied Mathematics

c 1999 Society for Industrial and Applied Mathematics SIAM J. OPTIM. Vol. 9, No. 3, pp. 729 754 c 1999 Society for Industrial and Applied Mathematics A POTENTIAL REDUCTION NEWTON METHOD FOR CONSTRAINED EQUATIONS RENATO D. C. MONTEIRO AND JONG-SHI PANG Abstract.

More information

FIRST- AND SECOND-ORDER OPTIMALITY CONDITIONS FOR MATHEMATICAL PROGRAMS WITH VANISHING CONSTRAINTS 1. Tim Hoheisel and Christian Kanzow

FIRST- AND SECOND-ORDER OPTIMALITY CONDITIONS FOR MATHEMATICAL PROGRAMS WITH VANISHING CONSTRAINTS 1. Tim Hoheisel and Christian Kanzow FIRST- AND SECOND-ORDER OPTIMALITY CONDITIONS FOR MATHEMATICAL PROGRAMS WITH VANISHING CONSTRAINTS 1 Tim Hoheisel and Christian Kanzow Dedicated to Jiří Outrata on the occasion of his 60th birthday Preprint

More information

c 2002 Society for Industrial and Applied Mathematics

c 2002 Society for Industrial and Applied Mathematics SIAM J. OPTIM. Vol. 13, No. 2, pp. 386 405 c 2002 Society for Industrial and Applied Mathematics SUPERLINEARLY CONVERGENT ALGORITHMS FOR SOLVING SINGULAR EQUATIONS AND SMOOTH REFORMULATIONS OF COMPLEMENTARITY

More information

An Alternative Proof of Primitivity of Indecomposable Nonnegative Matrices with a Positive Trace

An Alternative Proof of Primitivity of Indecomposable Nonnegative Matrices with a Positive Trace An Alternative Proof of Primitivity of Indecomposable Nonnegative Matrices with a Positive Trace Takao Fujimoto Abstract. This research memorandum is aimed at presenting an alternative proof to a well

More information

Constrained Optimization

Constrained Optimization 1 / 22 Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University March 30, 2015 2 / 22 1. Equality constraints only 1.1 Reduced gradient 1.2 Lagrange

More information

Stationary Points of Bound Constrained Minimization Reformulations of Complementarity Problems1,2

Stationary Points of Bound Constrained Minimization Reformulations of Complementarity Problems1,2 JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 94, No. 2, pp. 449-467, AUGUST 1997 Stationary Points of Bound Constrained Minimization Reformulations of Complementarity Problems1,2 M. V. SOLODOV3

More information

The general programming problem is the nonlinear programming problem where a given function is maximized subject to a set of inequality constraints.

The general programming problem is the nonlinear programming problem where a given function is maximized subject to a set of inequality constraints. 1 Optimization Mathematical programming refers to the basic mathematical problem of finding a maximum to a function, f, subject to some constraints. 1 In other words, the objective is to find a point,

More information

Lagrangian-Conic Relaxations, Part I: A Unified Framework and Its Applications to Quadratic Optimization Problems

Lagrangian-Conic Relaxations, Part I: A Unified Framework and Its Applications to Quadratic Optimization Problems Lagrangian-Conic Relaxations, Part I: A Unified Framework and Its Applications to Quadratic Optimization Problems Naohiko Arima, Sunyoung Kim, Masakazu Kojima, and Kim-Chuan Toh Abstract. In Part I of

More information

Using exact penalties to derive a new equation reformulation of KKT systems associated to variational inequalities

Using exact penalties to derive a new equation reformulation of KKT systems associated to variational inequalities Using exact penalties to derive a new equation reformulation of KKT systems associated to variational inequalities Thiago A. de André Paulo J. S. Silva March 24, 2007 Abstract In this paper, we present

More information

ON REGULARITY CONDITIONS FOR COMPLEMENTARITY PROBLEMS

ON REGULARITY CONDITIONS FOR COMPLEMENTARITY PROBLEMS ON REGULARITY CONDITIONS FOR COMPLEMENTARITY PROBLEMS A. F. Izmailov and A. S. Kurennoy December 011 ABSTRACT In the context of mixed complementarity problems various concepts of solution regularity are

More information

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS

CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS CONVERGENCE PROPERTIES OF COMBINED RELAXATION METHODS Igor V. Konnov Department of Applied Mathematics, Kazan University Kazan 420008, Russia Preprint, March 2002 ISBN 951-42-6687-0 AMS classification:

More information

Nonlinear equations. Norms for R n. Convergence orders for iterative methods

Nonlinear equations. Norms for R n. Convergence orders for iterative methods Nonlinear equations Norms for R n Assume that X is a vector space. A norm is a mapping X R with x such that for all x, y X, α R x = = x = αx = α x x + y x + y We define the following norms on the vector

More information