arxiv: v2 [math.oc] 31 Jul 2017

An unconstrained framework for eigenvalue problems

Yunho Kim

Abstract

In this paper, we propose an unconstrained framework for eigenvalue problems in both discrete and continuous settings. We begin by solving a generalized eigenvalue problem $Ax = \lambda Bx$ with two $N \times N$ real symmetric matrices $A$, $B$ via minimizing a proposed functional whose nonzero critical points $x \in \mathbb{R}^N$ solve the eigenvalue problem and whose local minimizers are in fact global minimizers. Inspired by the properties of the proposed functional, we analyze the convergence of various algorithms that find either critical points or local minimizers. Using the same framework, we also present an eigenvalue problem for differential operators in the continuous setting. Notably, this unconstrained framework is designed to find the smallest eigenvalue through matrix addition and multiplication, and a solution $x \in \mathbb{R}^N$ together with the matrix $B$ suffices to compute the corresponding eigenvalue $\lambda$ without using $A$ in the case of $Ax = \lambda Bx$. At the end, we present a few numerical experiments that confirm our analysis.

1 Introduction

Given an $N \times N$ matrix $A$, the eigenvalue problem of our interest is to find an eigenvalue and a corresponding eigenvector of $A$, that is, to solve $Ax = \lambda x$ for $x$ and $\lambda$. This is one of the most fundamental problems in mathematics, with applications throughout the sciences. In particular, one may be interested in estimating the largest and the smallest eigenvalues. If $A$ is symmetric and positive definite, then finding the smallest eigenvalue of $A$ is the same as finding the largest eigenvalue of $A^{-1}$, which usually involves solving systems of linear equations of type $Ax = b$. We then ask ourselves a question: is it possible to compute the smallest eigenvalue of $A$ through only basic matrix operations such as multiplication and addition, without solving $Ax = b$?
Our interest in estimating the smallest eigenvalue and its corresponding eigenvector extends to the following infinite dimensional application as well. On a compact manifold $M$, the eigenvalues of the Laplacian $\Delta$ reveal important structures of $M$, which makes understanding the eigenvalues of $\Delta$ on $M$ very important. This has interesting applications. For example, in image processing there are a few interesting works, e.g. [6], [], [8], that distinguish reconstructed objects from point cloud data by evaluating $\Delta$ on the surfaces of the objects. We can even consider general self-adjoint linear elliptic operators and find their eigenvalues and eigenfunctions. With these theoretical and numerical points of view in mind, our main discussion will concentrate on finding the smallest eigenvalue and a corresponding eigenvector of a nonzero symmetric matrix, which leads us to begin with the following well-known constrained problem: given a symmetric and positive definite matrix $A$,

$$\min_{x \in \mathbb{R}^N} \langle x, Ax \rangle \quad \text{subject to } \|x\| = 1. \tag{1}$$

Yunho Kim: Department of Mathematical Sciences, UNIST, Ulsan, South Korea, yunhokim@unist.ac.kr

There have been a large number of works that solve (1) under the name of inverse iteration methods. In particular, we would like to mention the work [] by J.E. Dennis and R.A. Tapia, which surveys the historical development of inverse and shifted inverse iterations and of Rayleigh quotient iteration, and which approaches the listed methods from the viewpoint of Newton's method. The unconstrained version of (1) analyzed in [] is

$$\min_{x \in \mathbb{R}^N} \langle x, Ax \rangle + \frac{\gamma}{2} \left( \|x\|^2 - 1 \right)^2. \tag{2}$$

The authors of [] explained why the inverse and the shifted inverse Rayleigh quotient iterations are fast and effective by showing the equivalence between the inverse Rayleigh quotient iteration and Newton's method for (2), and also between the shifted inverse Rayleigh quotient iteration and Newton's method for the shifted version of (2), when the given matrix $A$ is symmetric and invertible. We refer any interested reader to [], and the references therein, for the development of the inverse and shifted inverse Rayleigh quotient iteration methods.

There are, however, a few disadvantageous features of the functional in (2) that caught our attention. First of all, [] considered only nonsingular matrices for (2), just as all other conventional methods do. Second, in the simplest case when $A$ is symmetric and positive definite with smallest eigenvalue $\lambda_1$, if $0 < \gamma < \lambda_1$, then the zero vector is the only critical point of the functional in (2) and, even if $\gamma \ge \lambda_1$, the nonzero critical points of the functional in (2) are only the eigenvectors of $A$ whose eigenvalues are less than $\gamma$. Hence, our goal is twofold: extending existing theories to singular matrices, and removing the additional limitations imposed by the parameter $\gamma$ in (2). Noting that the functional in (2) contains the term $(\|x\|^2 - 1)^2 = (\|x\| - 1)^2 (\|x\| + 1)^2$, we believed that the factor $(\|x\| + 1)^2$, even though it is convex, is not desirable because it tries to push the norm $\|x\|$ towards 0 during minimization. Therefore, our analysis begins without this factor.
The rest of this manuscript is organized as follows. In Section 2, we present our unconstrained framework for solving a generalized eigenvalue problem in a finite dimensional space, where we propose an appropriate functional to be minimized and analyze its properties. We then apply the gradient descent method and Newton's method to the proposed minimization problem and analyze their convergence either to a global minimizer or to a nonzero critical point. Moreover, we present a few variants of our approach for quantitative analysis of the error between a true eigenvector and an estimated one. In Section 3, we present the same unconstrained framework for eigenvalue problems in an infinite dimensional space, such as finding eigenfunctions of self-adjoint differential operators, which shows the universality of our unconstrained framework. In Section 4, we present numerical aspects of our proposed method, confirming the theoretical results obtained in the previous sections.

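2 A generalized eigenvalue problem on a finite dimensional space

First of all, we consider a generalized eigenvalue problem $Ax = \lambda Bx$ with two $N \times N$ real symmetric matrices $A$, $B$, where $B$ is positive definite. Notice that it becomes the usual eigenvalue problem $Ax = \lambda x$ when $B$ is the identity matrix $I$. Here and in what follows, $M_{M \times N}(\mathbb{R})$, $\mathrm{Sym}_N(\mathbb{R})$, $\mathrm{Sym}_{N,p}(\mathbb{R})$ mean, respectively, the set of $M \times N$ real matrices, the set of $N \times N$ real symmetric matrices, and the set of $N \times N$ real symmetric and positive definite matrices. The set of $N \times N$ real matrices will be simply denoted by $M_N(\mathbb{R})$. The case of complex matrices will be mentioned later. Moreover, we consider $x \in \mathbb{R}^N$ as an $N \times 1$ column vector, and for $x, y \in \mathbb{R}^N$ we write $\langle x, y \rangle = y^T x$ and $\|x\| = \sqrt{\langle x, x \rangle}$. For $A \in M_{M \times N}(\mathbb{R})$, the operator norm of $A$ will be denoted by $\|A\|_{op}$, which is

$$\|A\|_{op} = \sup_{\|x\| = 1} \|Ax\|.$$

Note that $\|A\|_{op}$ is the largest singular value of $A$. We also denote by $\sqrt{B}$, for $B \in \mathrm{Sym}_{N,p}(\mathbb{R})$, the matrix $Q \sqrt{\Lambda} Q^T$, where $Q \Lambda Q^T$ is a diagonalization of $B$ and $\sqrt{\Lambda} = \mathrm{diag}(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_N})$ when $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)$.

Given $A \in \mathrm{Sym}_N(\mathbb{R})$, $B \in \mathrm{Sym}_{N,p}(\mathbb{R})$ and a parameter $\gamma$, we define a functional $F_{A,B} : \mathbb{R}^N \to \mathbb{R}$ by

$$F_{A,B}(x) = \frac{1}{2} \langle x, Ax \rangle + \frac{\gamma}{2} \langle x, Bx \rangle - \sqrt{\langle x, Bx \rangle} \tag{3}$$

and propose the following unconstrained problem

$$\min_{x \in \mathbb{R}^N} F_{A,B}(x). \tag{4}$$

When $B = I$, we will simply drop the subscript $B$ and write $F_A$ instead of $F_{A,B}$. By the change of variables $y = \sqrt{B} x$, (3) becomes

$$F_{A,B}(x) = \frac{1}{2} \langle y, Cy \rangle + \frac{\gamma}{2} \|y\|^2 - \|y\| = F_C(y) \quad \text{with } C = \sqrt{B}^{-1} A \sqrt{B}^{-1} \in \mathrm{Sym}_N(\mathbb{R}),$$

which means that analyzing the functional $F_{A,B}$ is equivalent to analyzing $F_C$. Note that $F_{A,B}$ and $F_C$ are both differentiable at $x \ne 0$ and that $\nabla F_{A,B}(x) = \sqrt{B}\, \nabla F_C(\sqrt{B} x)$ implies that the sets of nonzero critical points of $F_{A,B}$ and $F_C$ are equivalent up to the change of variables $y = \sqrt{B} x$. More precisely,

$$\nabla F_{A,B}(x) = Ax + \gamma Bx - \frac{Bx}{\sqrt{\langle Bx, x \rangle}} \quad \text{and} \quad \nabla F_C(y) = Cy + \gamma y - \frac{y}{\|y\|}$$

imply that we can solve $Ax = \lambda Bx$ by finding nonzero critical points of $F_C$. Therefore, we will begin our discussion with $F_A$ for $A \in \mathrm{Sym}_N(\mathbb{R})$ and investigate the minimization problem

$$\min_{x \in \mathbb{R}^N} F_A(x), \tag{5}$$

which is equivalent to

$$\min_{x \in \mathbb{R}^N} \frac{1}{2} \langle x, Ax \rangle + \frac{\gamma}{2} \|x\|^2 - \|x\|.$$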
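As a numerical sanity check of the gradient formula above, the following NumPy sketch evaluates $F_A$ and $\nabla F_A$ at scaled eigenvectors. Setting $\nabla F_A(x) = 0$ gives $Ax = (1/\|x\| - \gamma)x$, so a unit eigenvector for $\lambda$ rescaled to norm $1/(\gamma + \lambda)$ should be a critical point. The parameter name `gamma`, the helper names `F` and `grad_F`, and the random test matrix are assumptions of this illustration, not part of the paper.

```python
import numpy as np

def F(A, x, gamma):
    """F_A(x) = 0.5*<x, Ax> + 0.5*gamma*||x||^2 - ||x||."""
    return 0.5 * x @ (A @ x) + 0.5 * gamma * (x @ x) - np.linalg.norm(x)

def grad_F(A, x, gamma):
    """Gradient of F_A at x != 0: Ax + gamma*x - x/||x||."""
    return A @ x + gamma * x - x / np.linalg.norm(x)

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                      # random symmetric test matrix
lam, Q = np.linalg.eigh(A)             # eigenvalues in ascending order
gamma = max(0.0, -lam[0]) + 1.0        # ensures gamma + lambda_j > 0 for all j

# Rescale each unit eigenvector to norm 1/(gamma + lambda_j).
crit = [Q[:, j] / (gamma + lam[j]) for j in range(5)]
grad_norms = [np.linalg.norm(grad_F(A, x, gamma)) for x in crit]
values = [F(A, x, gamma) for x in crit]
```

In this sketch every entry of `grad_norms` should vanish up to rounding, and the smallest entry of `values` should belong to the eigenvector of the smallest eigenvalue.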

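Lemma 1. Let $\lambda_1 \in \mathbb{R}$ be the smallest eigenvalue of $A$. For $\gamma > \max(0, -\lambda_1)$, the set of nonzero critical points of $F_A$ is

$$\left\{ x \in \mathbb{R}^N : Ax = \lambda x \text{ for some } \lambda \in \mathbb{R} \text{ with } \|x\| = \frac{1}{\gamma + \lambda} \right\},$$

and

$$\min_{x \in \mathbb{R}^N} F_A(x) = -\frac{1}{2(\gamma + \lambda_1)}.$$

Proof. As was noted above, for $x_0 \ne 0$,

$$\nabla F_A(x_0) = 0 \iff Ax_0 = \left( \frac{1}{\|x_0\|} - \gamma \right) x_0,$$

which implies that $x_0$ is a nonzero critical point of $F_A$ if and only if $x_0$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda_0 = \frac{1}{\|x_0\|} - \gamma$ with $\|x_0\| = \frac{1}{\gamma + \lambda_0}$. In addition,

$$F_A(x_0) = -\frac{\|x_0\|}{2} = -\frac{1}{2(\gamma + \lambda_0)}.$$

Therefore, the set of nonzero critical points of $F_A$ is

$$\left\{ x \in \mathbb{R}^N : Ax = \lambda x \text{ for some } \lambda \in \mathbb{R} \text{ with } \|x\| = \frac{1}{\gamma + \lambda} \right\}.$$

Due to the choice of $\gamma > \max(0, -\lambda_1)$, it is easy to see that $F_A$ is bounded from below and that a global minimizer $x^*$ of $F_A$ exists and is an eigenvector of $A$ corresponding to an eigenvalue $\lambda^*$ with $\|x^*\| = \frac{1}{\gamma + \lambda^*}$. Since we can easily see that $\lambda^* = \lambda_1$ and $F_A(x^*) = -\frac{1}{2(\gamma + \lambda^*)} = \min_{x \in \mathbb{R}^N} F_A(x)$, we obtain

$$\min_{x \in \mathbb{R}^N} F_A(x) = -\frac{1}{2(\gamma + \lambda_1)}.$$

When dealing with $F_A$, we will assume $\gamma > \max(0, -\lambda_1)$, where $\lambda_1$ is the smallest eigenvalue of $A$, if no condition on $\gamma$ is stated.

Theorem 1. Any local minimizer $x^*$ of $F_A$ is a global minimizer.

Proof. First of all, 0 is not a local minimizer of $F_A$ because for any nonzero $\theta \in \mathbb{R}^N$, we have

$$\lim_{t \to 0^+} \frac{F_A(t\theta) - F_A(0)}{t \|\theta\|} = -1 < 0.$$

Suppose that $x^*$ is a local minimizer of $F_A$. Lemma 1 says that $x^*$ is an eigenvector of $A$ corresponding to an eigenvalue $\lambda^*$ with $\|x^*\| = \frac{1}{\gamma + \lambda^*}$ and that

$$F_A(x^*) = -\frac{1}{2(\gamma + \lambda^*)} \ge -\frac{1}{2(\gamma + \lambda_1)} = \min_{x \in \mathbb{R}^N} F_A(x),$$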

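where $\lambda_1$ is the smallest eigenvalue of $A$. We may diagonalize $A$ such that $A = Q \Lambda Q^T$, where $\Lambda$ is a diagonal matrix with nondecreasing diagonal entries $\lambda_1 \le \cdots \le \lambda_N$ and $Q$ is an orthogonal matrix having $\frac{x^*}{\|x^*\|}$ as the $j$th column for some $j$, implying $\lambda^* = \lambda_j$. Then, it suffices to show that $\lambda_j = \lambda_1$, which implies that $x^*$ is a global minimizer. Suppose that $\lambda_j > \lambda_1$. With $y = Q^T x$, we have

$$F_A(x) = F_A(Qy) = F_\Lambda(y). \tag{6}$$

For $k = 1, 2, \ldots, N$, we set $e_k$ to be the $k$th column of the identity matrix $I \in M_N(\mathbb{R})$. Then, we can see that $x^*$ being a local minimizer of $F_A$ is equivalent to $\frac{1}{\gamma + \lambda_j} e_j$ being a local minimizer of $F_\Lambda$ and that

$$F_\Lambda(y) = \frac{1}{2} \langle y, \Lambda y \rangle + \frac{\gamma}{2} \|y\|^2 - \|y\| = \frac{1}{2} \sum_{k=1}^N (\lambda_k + \gamma) y_k^2 - \|y\|, \tag{7}$$

where $y = [y_1 \cdots y_N]^T$. Moreover, $\frac{1}{\gamma + \lambda_1} e_1$ is a global minimizer of $F_\Lambda$. Since $\lambda_j > \lambda_1$, $e_j$ is orthogonal to $e_1$ and we can consider $F_\Lambda$ on the subspace spanned by $\{e_1, e_j\}$ by defining $H : \mathbb{R}^2 \to \mathbb{R}$ by

$$H(a, b) = F_\Lambda(a e_1 + b e_j) = \frac{1}{2} (\lambda_1 + \gamma) a^2 + \frac{1}{2} (\lambda_j + \gamma) b^2 - \sqrt{a^2 + b^2}.$$

Then, $\nabla^2 H\left(0, \frac{1}{\gamma + \lambda_j}\right)$ exists and must be positive semidefinite, i.e.,

$$\det \left[ \nabla^2 H\left(0, \frac{1}{\gamma + \lambda_j}\right) \right] \ge 0.$$

However, at $(a, b) = \left(0, \frac{1}{\gamma + \lambda_j}\right)$, we obtain that

$$\det \nabla^2 H(a, b) = \det \begin{bmatrix} \gamma + \lambda_1 - \dfrac{b^2}{(a^2 + b^2)^{3/2}} & \dfrac{ab}{(a^2 + b^2)^{3/2}} \\[2ex] \dfrac{ab}{(a^2 + b^2)^{3/2}} & \gamma + \lambda_j - \dfrac{a^2}{(a^2 + b^2)^{3/2}} \end{bmatrix} = (\lambda_1 - \lambda_j)(\gamma + \lambda_j) < 0,$$

which is a contradiction. Therefore, $\lambda^* = \lambda_j = \lambda_1$, i.e., any local minimizer $x^*$ of $F_A$ is a global minimizer.

In addition, we may be able to find all the eigenvalues and their corresponding eigenvectors of $A$.

Corollary 1. Let $\lambda_1 \le \cdots \le \lambda_N$ be the eigenvalues of $A$. Let $\{x_1, \ldots, x_k\}$ be an orthonormal set of eigenvectors of $A$ corresponding to the eigenvalues $\lambda_1, \ldots, \lambda_k$. We consider the following problem

$$\min_{x \in \mathbb{R}^N} F_A(x) \quad \text{subject to } \langle x, x_i \rangle = 0, \ i = 1, 2, \ldots, k. \tag{8}$$

Then, any local minimizer $x^*$ of (8) is a global minimizer corresponding to the eigenvalue $\lambda_{k+1}$ with $\|x^*\| = \frac{1}{\gamma + \lambda_{k+1}}$.

Proof. With a diagonalization $Q \Lambda Q^T$ of $A$, where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N)$ and the first $k$ columns of $Q$ are $x_1, \ldots, x_k$, and $y = Q^T x$, we have that $F_A(x) = F_\Lambda(y)$ and $\|x\| = \|y\|$, so (8) is equivalent to

$$\min \{ F_\Lambda(y) : y = [0 \cdots 0 \ y_{k+1} \cdots y_N]^T \in \mathbb{R}^N \}. \tag{9}$$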

Let $\Lambda_k$ be the last $(N - k) \times (N - k)$ block of $\Lambda$, i.e., $\Lambda_k = \mathrm{diag}(\lambda_{k+1}, \ldots, \lambda_N)$. Then, (9) is equivalent to

$$\min_{z \in \mathbb{R}^{N - k}} F_{\Lambda_k}(z).$$

Theorem 1 applies to $F_{\Lambda_k}$ and we are done.

Even though $F_A$ is not convex, we have seen that all local minimizers of $F_A$ are global minimizers, which is a rare case for nonconvex functionals, and that nonzero critical points of $F_A$ are eigenvectors of $A$. Hence, one can expect that any algorithm either to minimize the functional $F_A$ or to find a critical point of $F_A$ will work. For example, if we set $G(x) = \|x\|$, then the conjugate functional $G^*$ is

$$G^*(y) = \sup_{x \in \mathbb{R}^N} \langle x, y \rangle - G(x) = \begin{cases} 0, & \text{if } \|y\| \le 1, \\ \infty, & \text{if } \|y\| > 1, \end{cases}$$

and

$$G(x) = G^{**}(x) = \sup_{y \in \mathbb{R}^N} \langle x, y \rangle - G^*(y) = \sup_{\|y\| \le 1} \langle x, y \rangle.$$

Then, (5) becomes

$$\min_{x \in \mathbb{R}^N, \ \|y\| \le 1} \frac{1}{2} \langle x, Ax \rangle + \frac{\gamma}{2} \|x\|^2 - \langle x, y \rangle.$$

In fact, the constraint on $y$ is $\|y\| = 1$, so we have

$$\min_{x \in \mathbb{R}^N, \ \|y\| = 1} F_A(x, y) = \frac{1}{2} \langle x, Ax \rangle + \frac{\gamma}{2} \|x\|^2 - \langle x, y \rangle. \tag{10}$$

Since the functional $F_A(\cdot, y)$ is convex and quadratic in $x$ for any fixed $y$, we may consider an algorithm such as

1. $x = \arg\min_{x \in \mathbb{R}^N} F_A(x, y)$,
2. Update $y$ to be $\frac{x}{\|x\|}$.
3. Iterate the above procedure until it converges.

This algorithm is exactly the inverse power method

$$(A + \gamma I) x_{k+1} = \frac{x_k}{\|x_k\|}.$$

However, (10) is a constrained problem and the above algorithm requires solving a system of linear equations at every iteration, and its rate of convergence is only linear. Therefore, we want to consider an algorithm that satisfies either one of the following two: the rate of convergence is linear, yet the algorithm applies only matrix addition and multiplication; or the rate of convergence is faster than linear if we need to solve systems of linear equations.

2.1 The gradient descent method

As for the first algorithm, we will analyze the gradient descent method for the minimization problem (5) with stepsize $\alpha_k > 0$: with $x_0 \ne 0$,

$$x_{k+1} = x_k - \alpha_k \nabla F_A(x_k) = x_k - \alpha_k \left( Ax_k + \gamma x_k - \frac{x_k}{\|x_k\|} \right). \tag{11}$$
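The gradient descent iteration above uses only matrix-vector products and vector additions. A minimal NumPy sketch follows, assuming a fixed stepsize below $1/(\lambda_N + \gamma)$, where $\lambda_N$ is the largest eigenvalue of $A$. The function name `gradient_descent`, the parameter name `gamma`, the iteration count, and the test matrix with a prescribed spectrum are assumptions of this illustration; the spectrum is used here only to choose `gamma` and `alpha` and to check the result.

```python
import numpy as np

def gradient_descent(A, gamma, x0, alpha, iters=2000):
    """Iterate x <- x - alpha * (A x + gamma x - x/||x||) from a nonzero x0."""
    x = x0.copy()
    for _ in range(iters):
        x = x - alpha * (A @ x + gamma * x - x / np.linalg.norm(x))
    return x

rng = np.random.default_rng(1)
d = np.array([-1.0, 0.5, 1.0, 2.0, 3.0, 4.0])      # prescribed eigenvalues
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))   # random orthogonal basis
A = Q @ np.diag(d) @ Q.T
A = (A + A.T) / 2                                  # remove rounding asymmetry

gamma = max(0.0, -d[0]) + 1.0                      # gamma > max(0, -lambda_1)
alpha = 0.9 / (d[-1] + gamma)                      # fixed step below 1/(lambda_N + gamma)
x = gradient_descent(A, gamma, rng.standard_normal(6), alpha)

lam_est = 1.0 / np.linalg.norm(x) - gamma          # eigenvalue from the norm alone
residual = np.linalg.norm(A @ x - lam_est * x)
```

Note that `lam_est` is read off from $\|x\|$ without applying $A$ again; in this sketch it should approach the smallest prescribed eigenvalue for a generic starting point.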

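Let $\{q_1, \ldots, q_N\}$ be an orthonormal basis for $\mathbb{R}^N$ consisting of unit eigenvectors of $A$ corresponding to the eigenvalues $\lambda_1 \le \cdots \le \lambda_N$, respectively. If $x_k = \mu_{k,1} q_1 + \cdots + \mu_{k,N} q_N$, then

$$x_{k+1} = \mu_{k+1,1} q_1 + \cdots + \mu_{k+1,N} q_N = \sum_{i=1}^N \mu_{k,i} \left[ 1 - \alpha_k \left( \lambda_i + \gamma - \frac{1}{\|x_k\|} \right) \right] q_i = \sum_{i=1}^N \mu_{0,i} \prod_{j=0}^k \left[ 1 - \alpha_j \left( \lambda_i + \gamma - \frac{1}{\|x_j\|} \right) \right] q_i. \tag{12}$$

For simplicity, we assume a fixed stepsize $0 < \alpha_k = \alpha < \frac{1}{\lambda_N + \gamma}$ for all $k \in \mathbb{N}$. Then, (11) always converges to a critical point of $F_A$. In fact, it converges to a global minimizer with probability 1 if an initial point $x_0 \ne 0$ is chosen randomly. Theorem 2 below is given in a general form.

Theorem 2. A sequence $\{x_k\}$ generated by (11) with $x_0 = \mu_{0,1} q_1 + \cdots + \mu_{0,N} q_N \ne 0$ converges to a critical point $x^*$ of $F_A$ which is an eigenvector of $A$ corresponding to the eigenvalue $\lambda_l$ with

$$\|x^*\| = \frac{1}{\gamma + \lambda_l},$$

where $l = \min\{ j \in \{1, \ldots, N\} : \mu_{0,j} \ne 0 \}$. More precisely, $x^*$ is

$$\frac{1}{(\gamma + \lambda_l) \sqrt{\sum_{m : \lambda_m = \lambda_l} \mu_{0,m}^2}} \sum_{m : \lambda_m = \lambda_l} \mu_{0,m} q_m.$$

Proof. For any $x \ne 0$, we get

$$\left\langle \nabla F_A(x), \frac{x}{\|x\|} \right\rangle = \frac{\langle Ax, x \rangle}{\|x\|} + \gamma \|x\| - 1 \le (\lambda_N + \gamma) \|x\| - 1, \tag{13}$$

where $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$. Then, $\alpha < \frac{1}{\lambda_N + \gamma}$ implies $\|x_k\| > \alpha$ for all $k \in \mathbb{N}$ since

$$\|x_{k+1}\| \ge \left\langle x_k - \alpha \nabla F_A(x_k), \frac{x_k}{\|x_k\|} \right\rangle \ge \left( 1 - \alpha(\lambda_N + \gamma) \right) \|x_k\| + \alpha > \alpha.$$

Moreover, the line segment connecting $x_k$ and $x_{k+1}$ for $k \in \mathbb{N}$ lies entirely in $\{ x \in \mathbb{R}^N : \|x\| \ge \alpha \}$. To see this, we take $0 < t < 1$ and observe that for $k \in \mathbb{N}$,

$$\|x_k - t\alpha \nabla F_A(x_k)\| \ge \left( 1 - t\alpha(\lambda_N + \gamma) \right) \|x_k\| + t\alpha > \alpha \left( 1 - t\alpha(\lambda_N + \gamma) + t \right) > \alpha.$$

Noting that

$$\nabla^2 F_A(x) = (A + \gamma I) - \frac{1}{\|x\|} \left( I - \frac{x x^T}{\|x\|^2} \right)$$

and that $A + \gamma I$ and $I - \frac{x x^T}{\|x\|^2}$ are positive semidefinite with $\|A + \gamma I\|_{op} \le \lambda_N + \gamma$ and $\left\| I - \frac{x x^T}{\|x\|^2} \right\|_{op} \le 1$, we can see that for $\|x\| \ge \alpha$,

$$\|\nabla^2 F_A(x)\|_{op} \le \max\left( \lambda_N + \gamma, \frac{1}{\alpha} \right) = \frac{1}{\alpha},$$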

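from which we obtain that for each $k \ge 1$,

$$F_A(x_{k+1}) \le F_A(x_k) + \langle \nabla F_A(x_k), x_{k+1} - x_k \rangle + \frac{1}{2\alpha} \|x_{k+1} - x_k\|^2 = F_A(x_k) - \frac{\alpha}{2} \|\nabla F_A(x_k)\|^2,$$

which implies

$$\frac{\alpha}{2} \|\nabla F_A(x_k)\|^2 \le F_A(x_k) - F_A(x_{k+1}).$$

Note that for any $K \ge 1$,

$$\frac{1}{2\alpha} \sum_{k=1}^K \|x_{k+1} - x_k\|^2 = \frac{\alpha}{2} \sum_{k=1}^K \|\nabla F_A(x_k)\|^2 \le F_A(x_1) - \min_{x \in \mathbb{R}^N} F_A(x). \tag{14}$$

Since $F_A$ is coercive and $F_A(x_k) \le F_A(x_1) < \infty$ for all $k \ge 1$, $\{x_k\}$ must be a bounded sequence in $\{ x \in \mathbb{R}^N : \|x\| \ge \alpha \}$. Choosing a subsequence $\{x_{k_n}\}$ converging to $x^*$, we know from (14) that $\nabla F_A(x^*) = 0$, i.e., $x^*$ is an eigenvector of $A$ corresponding to an eigenvalue $\lambda_i$ for some $1 \le i \le N$ with norm $\|x^*\| = \frac{1}{\gamma + \lambda_i}$. Knowing that $\{F_A(x_k)\}$ is a decreasing and bounded sequence, we can easily derive that any subsequential limit $\tilde{x}$ of $\{x_k\}$ satisfies $F_A(\tilde{x}) = F_A(x^*)$ and $A\tilde{x} = \lambda_i \tilde{x}$ with norm $\|\tilde{x}\| = \frac{1}{\gamma + \lambda_i}$. Hence, we can conclude that

$$\lim_{k \to \infty} \|x_k\| = \frac{1}{\gamma + \lambda_i}.$$

Setting $l = \min\{ j \in \{1, \ldots, N\} : \mu_{0,j} \ne 0 \}$, we can see from (12) that $\lambda_i \ge \lambda_l$. Suppose $\lambda_i > \lambda_l$. We note that for all $k \in \mathbb{N}$,

$$1 - \alpha \left( \lambda_l + \gamma - \frac{1}{\|x_k\|} \right) > 1 - \frac{\lambda_l + \gamma}{\lambda_N + \gamma} + \frac{\alpha}{\|x_k\|} \ge \frac{\alpha}{\|x_k\|} > 0$$

and that as $k \to \infty$,

$$1 - \alpha \left( \lambda_l + \gamma - \frac{1}{\|x_k\|} \right) \to 1 - \alpha(\lambda_l - \lambda_i) > 1.$$

From (12), we can see that

$$\left| \mu_{0,l} \prod_{j=0}^k \left[ 1 - \alpha \left( \lambda_l + \gamma - \frac{1}{\|x_j\|} \right) \right] \right| \to \infty \quad \text{as } k \to \infty.$$

This is a contradiction because $\{x_k\}$ is a bounded sequence. Therefore, $\lambda_i = \lambda_l$. Moreover, for $\lambda_p > \lambda_l$, we have

$$1 - \alpha \left( \lambda_p + \gamma - \frac{1}{\|x_k\|} \right) \to 1 - \alpha(\lambda_p - \lambda_l) \in (0, 1) \quad \text{as } k \to \infty,$$

implying

$$\mu_{0,p} \prod_{j=0}^k \left[ 1 - \alpha \left( \lambda_p + \gamma - \frac{1}{\|x_j\|} \right) \right] \to 0 \quad \text{as } k \to \infty.$$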

9 Hence, referring to, we can see that [ x k+ µ 0,m q m Π k j=0 α λ l + ] 0 as k. x j m:λ m=λ l In addition, convergence of the norm x k implies that Π j=0 [ α λ l + ] = x j + λ l µ m:λm=λl 0,m. Therefore, x k converges to + λ µ 0,m q m. l µ m:λm=λl 0,m m:λ m=λ l Now, going back to the generalized eigenvalue problem Ax = λbx, 5 via minimizing F A,B in 4, we realize that even though 4 is equivalent to 5, the gradient descent method we discussed above is applicable to F C with C = B A B to find a critical point of F A,B. That means that not only do we need to compute B, but also we need to invert it. However, it turns out that applying the gradient descent method directly to F A,B to solve 4 finds solutions of 5, as well. When solving 4, we will use µ A,j, µ B,j to denote the j th smallest eigenvalues of A and B, respectively. Moreover, we assume that µ B, = and that either µ A, 0 or µ A,N µ A, < 0 is true. Note that these assumptions are not restrictions, but simplifications. In relation to 5, when dealing with F A,B, the parameter will be assumed to satisfy > max0, µ A,. We set {r,..., r N } to be an orthonormal set of eigenvectors of C corresponding to the eigenvalues λ λ N. We also set q j = B r j for j N. Then, it is easy to see that {q,..., q N } is an orthonormal basis for R N with respect to the inner product x, y B := x T By. Theorem 3. If we choose 0 < α < then with x 0 0, the following procedure µ A,N + µ B,N 3, x k+ = x k α F A,B x k, k = 0,,,..., 6 produces a sequence {x k } converging to x with Bx, x = solution pair of 5. +λ, where x, λ is a Proof. If B = I, then this theorem is the same as Theorem. Hence, we will assume that µ B,N >. In addition, since the proof mimics that of Theorem with minor differences in 9

10 detail, we will emphasize only those minor differences. Let {x k } be the sequence generated by 6. First of all, we note that for any x k R N, Ax k, Bx k µ A,N µ B,N and = µ B, Bx k, Bx k µ B,N. 7 Bx k, x k Bx k, x k This implies that if 0 < α < µ A,N +µ B,N 3, then for 0 < ɛ := µ B,N, [ Axk, Bx k α Bx k, x k + Bx k, Bx k ] αµ A,N + µ B,N > ɛ. 8 Bx k, x k We can see by 7 and 8 that for k = 0,,,..., Bx k+, x k+ Bx k+, x k Bx k, x k = Bx k, x k ɛ + α xk α F A,B x k, Bx k Bxk, x k = Bx k, x k ɛ + Bx k, x k αɛ + α. 9 This proves that Bx k, x k α for k, which can be improved further as follows: for k, we have resulting in Hence, Bx k, x k ɛ + Bx k, x k αɛ + α Bx k, x k ɛ + α + ɛ, k Bx k+, x k+ α + ɛ ɛ l + α ɛ k. lim inf Bx k, x k α + ɛ k ɛ > α ɛ = αµ B,N and there exists K N such that k K implies l=0 Bx k, x k > αµ B,N. Moreover, we can estimate Bx k + tx k+ x k, x k + tx k+ x k for t 0, as follows: for k >,,..., and 0 < t <, Bx k + tx k+ x k, x k + tx k+ x k = Bx k tα F A x k Bxk Bxk tα F A,B x k, xk tα F A,B x k, Bx k = Bx k Bxk, x k [ [ Axk, Bx k = Bxk, x k tα Bx k, x k + Bx k, Bx k ] + tα Bx k, Bx k ] Bx k, x k Bx k, x k Bx k, x k tɛ + tα = Bx k, x k + tα Bx k, x k ɛ { Bx k, x k, if α Bx k, x k ɛ, Bx k, x k ɛ + α, if α < Bx k, x k ɛ. where ɛ = ɛ. Noting that α < Bx k, x k ɛ implies Bxk, x k ɛ + α > α ɛ ɛ + α = α ɛ > αµ B,N, 0

11 we have that for k K, Therefore, we have min Bx k + tx k+ x k, x k + tx k+ x k > αµ B,N. t [0,] {x k + tx k+ x k : t [0, ], k K} {x R N : Bx, x αµ B,N }. Since Bx B [I Bx,x and A + B op α and Bx Bx Bx T ] B is positive semidefinite and B [I Bx Bx, x T ] Bx Bx op B, α, F A,B x = A + B B [I Bx Bx, x Bx T ] Bx B, Bx we can see that F A,B x op α. The rest of the proof for convergence of the sequence {x k } k>k to x satisfying Bx, x = Bx = + λ, where x, λ is a solution pair of 5, is omitted due to the similarity of that in Theorem. In Theorem 3 above, convergence to a nonzero critical point of F A,B is confirmed. However, if we can efficiently deal with B, then we can guarantee to find a global minimizer of F A,B by considering the gradient descent method with respect to a different inner product. Corollary. If we generate a sequence {x k } by x k+ = x k αb F A,B x k, k = 0,,..., 0 with a randomly chosen x 0 0 and 0 < α < µ A,N +, then the sequence converges to x, where x, λ is a solution pair of 5 satisfying and Bx, x = Bx = + λ λ = min{λ : A λb is singular}. In fact, 0 is the gradient descent of F A,B with respect to the inner product x, y B := x T By. Proof. First of all, as we mentioned, 5 is equivalent to B Ax = λx, and to Cy = λy, with C = B A B, y = Bx.

12 Therefore, x, λ is a solution pair of 5 if and only if x is an eigenvector of B A corresponding to the eigenvalue λ if and only if Bx is an eigenvector of C corresponding to the eigenvalue λ. Note that since y, Cy = x,ax x x for x = B y, and min y = B y µb, =, λ = min y, Cy min µ A, B y y = y = { µ A,, if µ A, < 0, 0, if µ A, 0., Since > max0, µ A, implies > max0, λ, we can apply Theorem to F C y = y, Cy + y. Moreover, since λ N, the largest eigenvalue of C, is at most µ A,N 0 < α < µ A,N + λ N + and that a sequence {y k} generated by > 0, we know that y k+ = y k α F C y k, k = 0,,,..., with y 0 0 chosen at random, converges to y, an eigenvector of C corresponding to the smallest eigenvalue λ = λ with norm y = +λ. On the other hand, with y k = Bx k, k = 0,,..., we have which implies that 0 is nothing but F C y k = B F A,B x k, y k+ = y k α F C y k. Hence, the sequence generated by 0, with a randomly chosen x 0 0, converges to x, where x, λ is a solution pair of 5 satisfying Bx, x = + λ and λ = λ = min{λ : A λb is singular}. In addition, with respect to the inner product x, y B := x T By, we note that B F A,B x = B Ax + x, x, x B which is the gradient of F A,B with respect to the inner product, B because F A,B x = x, B Ax B + x B x B, Note also that B A is self-adjoint with respect to, B. Therefore, 0 is the gradient descent of F A,B with respect to, B. In general, with a nonsymmetrix square matrix E, minimizing the functional F E does not guarantee to find an eigenvector of E. However, Theorem 3 and Corollary make it possible to find eigenvectors of E in certain cases when E decomposes into E = B A with a symmetric matrix A and a symmetric positive definite matrix B. Moreover, if it is easy to compute B, then Corollary applies to find a global minimizer of F E with a better stepsize. We can also find subsequent eigenvectors in the same way as presented in Theorem.
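The gradient descent with respect to the inner product $\langle x, y \rangle_B = x^T B y$ discussed above can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the paper's implementation: the function name `b_gradient_descent`, the parameter name `gamma`, and the iteration count are hypothetical, the linear solve with $B$ stands in for whatever efficient way of applying $B^{-1}$ is available, and the test pencil is built with a prescribed generalized spectrum `d` that is used only to pick `gamma` and `alpha` and to check the result.

```python
import numpy as np

def b_gradient_descent(A, B, gamma, x0, alpha, iters=3000):
    """Iterate x <- x - alpha * B^{-1} grad F_{A,B}(x) from a nonzero x0."""
    x = x0.copy()
    for _ in range(iters):
        Bx = B @ x
        grad = A @ x + gamma * Bx - Bx / np.sqrt(x @ Bx)
        x = x - alpha * np.linalg.solve(B, grad)
    return x

rng = np.random.default_rng(2)
N = 4
Qb, _ = np.linalg.qr(rng.standard_normal((N, N)))
B = Qb @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Qb.T      # symmetric positive definite
B = (B + B.T) / 2
L = np.linalg.cholesky(B)                          # B = L L^T

# Build A so that A x = lambda B x has the prescribed eigenvalues d.
d = np.array([-0.5, 1.0, 2.0, 3.5])
U, _ = np.linalg.qr(rng.standard_normal((N, N)))
A = L @ U @ np.diag(d) @ U.T @ L.T
A = (A + A.T) / 2

gamma = max(0.0, -d[0]) + 1.0
alpha = 0.9 / (d[-1] + gamma)
x = b_gradient_descent(A, B, gamma, rng.standard_normal(N), alpha)

lam_est = 1.0 / np.sqrt(x @ (B @ x)) - gamma       # uses only x and B
residual = np.linalg.norm(A @ x - lam_est * (B @ x))
```

The eigenvalue estimate `lam_est` is computed from $x$ and $B$ alone, in line with the remark in the abstract that $A$ is not needed to recover $\lambda$ once a solution $x$ is known.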

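Corollary 3. Let $(x_1, \lambda_1), \ldots, (x_m, \lambda_m)$ be solution pairs of (15), where $\lambda_1 \le \cdots \le \lambda_m$ are the first $m$ smallest ones in $\{ \lambda \in \mathbb{R} : A - \lambda B \text{ is singular} \}$. We consider the following problem

$$\min_{x \in \mathbb{R}^N} F_{A,B}(x) \quad \text{subject to } \langle x, x_k \rangle_B = 0, \ k = 1, 2, \ldots, m. \tag{21}$$

Then, any local minimizer $x^*$ of $F_{A,B}$ in the subspace orthogonal to $\{x_1, \ldots, x_m\}$ with respect to the inner product $\langle \cdot, \cdot \rangle_B$ is a solution with $\sqrt{\langle Bx^*, x^* \rangle} = \frac{1}{\gamma + \lambda^*}$ and

$$\lambda^* = \min \left( \{ \lambda \in \mathbb{R} : A - \lambda B \text{ is singular} \} \setminus \{ \lambda_1, \ldots, \lambda_m \} \right).$$

2.2 The Newton's method

As for the second algorithm, with a faster rate of convergence, we will analyze Newton's method to find nonzero critical points of (5), which are eigenvectors of $A$. Since the functional $F_A$ is twice continuously differentiable at $x \ne 0$, if we apply Newton's method, we will generate a sequence $\{x_k\}$ by

$$x_{k+1} = x_k - \left[ \nabla^2 F_A(x_k) \right]^{-1} \nabla F_A(x_k), \quad k = 0, 1, \ldots, \tag{22}$$

with an initial $x_0 \ne 0$, unless $\nabla^2 F_A(x_k)$ is singular. We can observe that (22) becomes, for $k = 0, 1, 2, \ldots$,

$$x_{k+1} = \left[ A + \left( \gamma - \frac{1}{\|x_k\|} \right) I + \frac{1}{\|x_k\|} \frac{x_k}{\|x_k\|} \frac{x_k^T}{\|x_k\|} \right]^{-1} \frac{x_k}{\|x_k\|}.$$

Hence, we propose the following scheme: with an initial guess $x_0 \ne 0$, for $k = 0, 1, 2, \ldots$, compute $y_k$ and $x_{k+1}$ by

$$y_k = \frac{x_k}{\|x_k\|}, \quad \text{and} \quad \left[ A + \left( \gamma - \frac{1}{\|x_k\|} \right) I + \frac{1}{\|x_k\|} \frac{x_k}{\|x_k\|} \frac{x_k^T}{\|x_k\|} \right] x_{k+1} = y_k. \tag{23}$$

We wrote (23) in the given form to emphasize its similarity either to the inverse iteration or to the Rayleigh quotient iteration. As for convergence, we will show that convergence of the norm $\|x_k\|$ is equivalent to convergence of $x_k$.

Theorem 4. Let $A \in \mathrm{Sym}_N(\mathbb{R})$ have eigenvalues $\lambda_1 \le \cdots \le \lambda_N$. Let $\gamma > \max(0, -\lambda_1)$, and $0 < \|x_0\| \ne \frac{1}{\gamma + \lambda_j}$ for any $1 \le j \le N$.

1. Suppose that a sequence $\{x_k\}_{k=1}^\infty$ can be generated by (23), i.e., $x_k$ is computable for all $k \in \mathbb{N}$, and that $\|x_k\|$ converges to $\eta > 0$. If $\|x_k\| \ne \frac{1}{\gamma + \lambda_j}$ for any $1 \le j \le N$ and for all $k$, then there exists $1 \le i_0 \le N$ such that $\eta = \frac{1}{\gamma + \lambda_{i_0}}$ and $x_k$ converges to an eigenvector $x^*$ of $A$ corresponding to the eigenvalue $\lambda_{i_0}$ with $\|x^*\| = \frac{1}{\gamma + \lambda_{i_0}}$.

2. On the other hand, suppose that we generate a sequence $\{x_k\}_{k=1}^{k_0}$ for some $k_0 \in \mathbb{N}$ by (23) and $x_{k_0}$ satisfies $\|x_{k_0}\| = \frac{1}{\gamma + \lambda_i}$, where $\lambda_i$ for some $1 \le i \le N$ is an eigenvalue of $A$ with multiplicity 1. Let $q_i$ be a unit eigenvector of $A$ corresponding to $\lambda_i$. If $x_{k_0}$ is not a critical point of $F_A$ with $|q_i^T x_{k_0}| > 0$, then $x_{k_0+1}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda_i$. If $|q_i^T x_{k_0}| \ne \frac{\gamma + \lambda_j}{(\gamma + \lambda_i)^2}$ for $j < i$, then $x_{k_0+2}$ is a critical point of $F_A$, i.e., an eigenvector of $A$ corresponding to the eigenvalue $\lambda_i$ with norm $\|x_{k_0+2}\| = \frac{1}{\gamma + \lambda_i}$. However, if $|q_i^T x_{k_0}| = \frac{\gamma + \lambda_j}{(\gamma + \lambda_i)^2}$ for some $j < i$, then the system becomes singular and we may not compute $x_{k_0+2}$ uniquely. In any case, the algorithm terminates in $k_0 + 2$ iterations.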
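The Newton scheme stated before Theorem 4 solves one linear system per iteration, with a matrix built from $A$, the current norm $\|x_k\|$, and the unit vector $y_k = x_k/\|x_k\|$. A minimal NumPy sketch follows. Since the theorem gives convergence only conditionally, this illustration starts near a known critical point; the function name `newton_scheme`, the parameter name `gamma`, the iteration count, and the test matrix with a prescribed spectrum are all assumptions of the sketch.

```python
import numpy as np

def newton_scheme(A, gamma, x0, iters=20):
    """Solve [A + (gamma - 1/r) I + y y^T / r] x_new = y with r = ||x||, y = x/r."""
    N = A.shape[0]
    x = x0.copy()
    for _ in range(iters):
        r = np.linalg.norm(x)
        y = x / r
        H = A + (gamma - 1.0 / r) * np.eye(N) + np.outer(y, y) / r
        x = np.linalg.solve(H, y)
    return x

rng = np.random.default_rng(4)
d = np.array([-1.0, 0.5, 1.0, 2.0])                # prescribed simple eigenvalues
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q @ np.diag(d) @ Q.T
A = (A + A.T) / 2

gamma = 2.0                                        # gamma > max(0, -lambda_1)
x_star = Q[:, 0] / (gamma + d[0])                  # critical point for lambda_1
x0 = 1.05 * x_star + 0.01 * rng.standard_normal(4) # start close to it

x = newton_scheme(A, gamma, x0)
lam_est = 1.0 / np.linalg.norm(x) - gamma
residual = np.linalg.norm(A @ x - lam_est * x)
```

From a starting point far from any critical point, the scheme may settle on an eigenvector other than the one for the smallest eigenvalue, which is consistent with the theorem only identifying some index $i_0$.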

14 Proof. It suffices to consider the case that A is a diagonal matrix with diagonal entries λ λ λ N. Then, 3 becomes + λ j x k+,j = y k,j x k+ x k x k yt k y k+, j =,,..., N, 4 where x k+ = [ ] T x k+, x k+,n and yk = x k x k. Let {x k} be a sequence generated by 4 with x k +λ j for any j N and for all k N {0}. Firstly, we consider the case that x k converges to η > 0. Since x k is computable for all k N, we can see from 4 that x k+ x k yt k y k+ 0, k 0. By setting J 0 := {j {,,..., N} : x 0,j 0}, we know that for k 0, x k,j 0 if and only if j J 0. We will now prove lim sup k y T k y k+ = by contradiction. Suppose that lim sup yk T y k+ <. k Then, there exists ɛ < with lim sup k yk T y k+ = ɛ. Given δ 0, ɛ, we may choose l N so that k l implies x k η < δ and x k x k+ δ < and yk T y k+ < ɛ + δ. 5 We also choose J J 0 satisfying η + λ J From 4, we see that for k l, η = min + λ j. j J 0 x k+,j x k+ δ ɛ x k,j η + λ J + δ + λ J x k. 6 If η + λ J < ɛ, then by choosing δ 0, ɛ satisfying η + λ J + δ we can see from 6 that lim k x k+,j x k+ η min + λ j j J 0 + λ J < δ ɛ, =, which is impossible. Hence, ɛ i.e., η ɛ + λ or η ɛ + λ, where λ = min j J0 λ j and λ = max j J0 λ j. Suppose that η ɛ +λ. For any 0 < δ < ɛ +λ < ɛ, we can see from 5 that for each j J 0, k l implies x k + λ j < η + δ + λ j ɛ + λ j + λ + δ 4 + λ j < 0,

15 and This results in, for k l, y T k y k+ = x k x k+ yt k y k+ > δ yt k y k+ > 0. xk x k+ N yt k y k+ j= δ yt k y k+ j J 0 y k,j x k + λj yk,j x k + λj < 0. This implies that ɛ 0, i.e., η 0, which is a contradiction, i.e., η Hence, we must have η ɛ +λ. Again with 0 < δ < ɛ +λ for k l, and for each j J 0, x k + λ j > η δ + λ j ɛ + λ j + λ δ which implies that for k l, since ɛ + δ <, y T k y k+ = + λ j ɛ +λ is not possible. < ɛ, we can also see that > ɛ + λ j ɛ + λ j > 0, + λ + λ xk x k+ N yt k y k+ j= δ yt k y k+ j J 0 y k,j x k + λj yk,j x k + λj > 0. Hence, we have If we extract a subsequence y kn of 3 and knowing that 0 ɛ <. 7 such that y T k n y kn+ ɛ as n, then using the form we can see that lim inf n yk T n+ x kn+ Ax k n+ = lim inf n yk T [ n+ x kn+ A + I + x kn yt k n+ay kn+ + x kn x kn λ + + ɛ η η ɛ η η + ɛ ɛ + λ. xkn yt k n+ay kn+ λ xkn T ] x kn+ = yt k n+ y k n x kn+ x kn x kn + x kn yt k n y kn+ = yt k y n+ k n x kn+ 5

16 Therefore, we obtain ɛ + λ η + ɛ ɛ + λ. 8 However, this is a contradiction since 7 implies + ɛ ɛ < ɛ, i.e., η ɛ +λ is not possible, either. Therefore, we conclude that ɛ < is impossible, i.e., lim sup k yk T y k+ =. We can now show that η = +λ i0 for some i 0 N. Suppose that η +λ j for any j N. By considering a subsequence {y kn } with lim n y T k n y kn+ =, it is easy to see using 4 that = lim n max j N x kn x kn+ yt k n y = lim kn+ n <, η + λj N j= y k n,j x kn + λj which is a contradiction. Hence, we have that η = +λ i0 for some i 0 N. Next, we will show that x k converges to an eigenvector x of A corresponding to λ i0 with norm x = η = +λ i0. Firstly, we show that there exists j J 0 such that λ j = λ i0. As above, by choosing a subsequence y kn such that yk T n y kn+ as n, it is easy to see that there must be j J 0 with λ j = λ i0. That is, {j J 0 : λ j = λ i0 }. Hence, without loss of generality we will say that i 0 J 0. Let k 0 N be such that k k 0 implies Then, for k k 0, and for λ j λ i0, + λ i 0 λ j λ i0 < min. x k λ j λ i0 3 + λi0 x k + λj x k Moreover, since we have that for j N, <. x [ K+,j x K+ = Π K x k+ x k yt k y k+ ] x0,j k=0 + λj, K 0, 9 x 0 x k if we choose j for which λ j λ i0, we can see that Since x K+,j x K+,i0 x K+,j = x K+,i0 = x0,j x 0,i0 Π K k=k 0 + λi0 K k0+ x k + λj x k Π k0 k=0 Π k0 k=0 + λi0 + λj x k x k + λi0 + λj x k x k x 0,j x 0,i0 x 0,j 0 as K. x 0,i0 for λ j = λ i0, we can also see that x K+ η as K implies x K+,i0 λ j=λ i0 x0,j as K. x 0,i0 + λ i0 6

17 Let m i0 = x0,j λj=λi0 x 0,i0. Then, as K, This implies that y K+,i0 m i 0 as K. In addition, noting that for all k, and lim k N j= x K+,i0. 30 m i 0 + λ i0 yk T y k+ = x k x k+ N y yt k,j k y k+ j= x k + λj, y k,j x k + λ j =, we know that lim k x k x k+ yt k y k+ = 0. Hence, not only do we have lim sup k y T k y k+ =, but also we can obtain that lim k yt k y k+ =. In fact, since x k,j = x k,i0 x0,j x 0,i0 for λ j = λ i0, we can see that y T k y k+ = and that lim k y T k y k+ = implies x0,j x k,i0 x k+,i0 + x k x k+ x 0,i0 λ j=λ i0 lim x k,i 0 x k+,i0 =. k m i 0 + λ i0 Together with 30, we know that lim k x k,i0 exists and is either m i0 Letting x,i0 = lim k x k,i0, we have that x k converges to x, where x,j = { x 0,j x 0,i0 x,i0, for λ j = λ i0, 0, for λ j λ i0. λ j λ i0 x k,j x k+,j, +λ i0 or m i0 +λ i0. Note that x is an eigenvector of A corresponding to λ i0 with norm x = η = +λ i0. This finishes the first part of the theorem. For the second part of the theorem, we generate a sequence {x k } k0 k= for some k 0 N and suppose that x k0 satisfies x k0 = +λ i, where λ i for some i N is an eigenvalue of A with multiplicity. If x k0 is not a critical point of F A and x k0,i 0, then we have x k0,i < +λ i and x k0 yt k 0 x k0+ =, which turns 4 into λ λ λ N x k0+,. x k0+,n 7 = λ i x k0+,. x k0+,n.

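Since $\lambda_i$ is of multiplicity 1, there exists a unique solution $x_{k_0+1} = \alpha^* e_i$, where $e_i$ is the standard basis element in $\mathbb{R}^N$ with $e_{i,j} = \delta_{ij}$ and $\alpha^* = \frac{1}{x_{k_0,i} (\gamma + \lambda_i)^2}$. Note that $|\alpha^*| > \frac{1}{\gamma + \lambda_i}$ and that $x_{k_0+1}$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda_i$, and yet is not a critical point of $F_A$. Since $x_{k_0+2}$ satisfies (24) with $y_{k_0+1} = \pm e_i$, we can see that for $1 \le j \le N$,

$$\begin{cases} \left( \lambda_j + \gamma - \dfrac{1}{|\alpha^*|} \right) x_{k_0+2,j} = \delta_{ij} \left( 1 - \dfrac{1}{|\alpha^*|} x_{k_0+2,i} \right), & \text{if } y_{k_0+1} = e_i, \\[2ex] \left( \lambda_j + \gamma - \dfrac{1}{|\alpha^*|} \right) x_{k_0+2,j} = -\delta_{ij} \left( 1 + \dfrac{1}{|\alpha^*|} x_{k_0+2,i} \right), & \text{if } y_{k_0+1} = -e_i. \end{cases} \tag{31}$$

Note that $|\alpha^*| = \frac{1}{\gamma + \lambda_j}$ is equivalent to $|x_{k_0,i}| = \frac{\gamma + \lambda_j}{(\gamma + \lambda_i)^2}$. Since $|x_{k_0,i}| < \frac{1}{\gamma + \lambda_i}$, it is possible to have $|\alpha^*| = \frac{1}{\gamma + \lambda_j}$ only if $j < i$. Hence, if $|x_{k_0,i}| \ne \frac{\gamma + \lambda_j}{(\gamma + \lambda_i)^2}$ for $j < i$, then $|\alpha^*| \ne \frac{1}{\gamma + \lambda_j}$ for $j < i$, and (31) is nonsingular and has a unique solution

$$x_{k_0+2,j} = \begin{cases} \pm \dfrac{1}{\gamma + \lambda_i}, & \text{if } j = i, \\ 0, & \text{if } j \ne i, \end{cases}$$

depending on $y_{k_0+1} = \pm e_i$. That is, $x_{k_0+2}$ is a critical point of $F_A$, an eigenvector of $A$ corresponding to the eigenvalue $\lambda_i$ with norm $\|x_{k_0+2}\| = \frac{1}{\gamma + \lambda_i}$, and the algorithm terminates. On the other hand, if $|x_{k_0,i}| = \frac{\gamma + \lambda_j}{(\gamma + \lambda_i)^2}$ for some $j < i$, then $|\alpha^*| = \frac{1}{\gamma + \lambda_j}$, the system (31) becomes singular, and the algorithm terminates.

Remark 1. It is interesting to note that in both the gradient descent method and Newton's method discussed above, the convergence of a generated sequence $\{x_k\}$ is confirmed by the convergence of the sequence of norms $\{\|x_k\|\}$, which rarely happens in general. It would therefore be worth further investigation in a subsequent work.

2.3 Some variants of the proposed framework

We have seen in the previous sections how to solve the generalized eigenvalue problem (15) $Ax = \lambda Bx$ with $A \in \mathrm{Sym}_N(\mathbb{R})$, $B \in \mathrm{Sym}_{N,p}(\mathbb{R})$ in an unconstrained framework. In this section, we continue our discussion with some variants of our proposed framework, inspired by (23), including nonsymmetric cases as well. When $x_k$ converges via (23) to an eigenvector $x^*$ of $A$ corresponding to an eigenvalue $\lambda^*$, we know that $\frac{1}{\|x_k\|}$ converges to $\gamma + \lambda^*$, hence (23) becomes

$$\left( A - \lambda^* I + (\gamma + \lambda^*) \tilde{y} \tilde{y}^T \right) x^* = \tilde{y} \quad \text{with } \tilde{y} = \frac{x^*}{\|x^*\|}.$$

Hence, we may consider the following procedure.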
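One Step Eigenvector Estimation. Given $A \in \mathrm{Sym}_N(\mathbb{R})$, an eigenvalue $\lambda$ of $A$, and $\gamma > 0$ with $\gamma \ne -\lambda$, we choose $x_0$ uniformly at random from $S^{N-1}$ and solve for $x$,

$$\left( A - \lambda I + (\gamma + \lambda) x_0 x_0^T \right) x = x_0. \tag{32}$$

In the case of 0 being an eigenvalue of $A$, by setting $\lambda = 0$, (32) turns into $(A + \gamma x_0 x_0^T) x = x_0$, which means that (32) is equivalent to

$$\left( A - \lambda I + \gamma x_0 x_0^T \right) x = x_0, \quad \gamma \ne 0.$$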
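The one step eigenvector estimation amounts to a single linear solve with a rank-one modification of $A - \lambda I$. A minimal NumPy sketch follows, assuming a simple (multiplicity 1) eigenvalue that is known exactly. The function name `one_step_eigenvector`, the parameter name `gamma`, and the test matrix with a prescribed spectrum are assumptions of this illustration; a normalized Gaussian vector is used to draw $x_0$ uniformly from the unit sphere.

```python
import numpy as np

def one_step_eigenvector(A, lam, gamma, x0):
    """Solve (A - lam*I + (gamma + lam) * x0 x0^T) x = x0 for x."""
    N = A.shape[0]
    M = A - lam * np.eye(N) + (gamma + lam) * np.outer(x0, x0)
    return np.linalg.solve(M, x0)

rng = np.random.default_rng(3)
N = 5
d = np.array([-1.0, 0.5, 1.0, 2.0, 3.0])           # prescribed simple eigenvalues
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
A = Q @ np.diag(d) @ Q.T
A = (A + A.T) / 2

lam = d[1]                                         # a known simple eigenvalue
gamma = 1.0                                        # gamma > 0 and gamma != -lam
x0 = rng.standard_normal(N)
x0 /= np.linalg.norm(x0)                           # uniform on the unit sphere

x = one_step_eigenvector(A, lam, gamma, x0)
residual = np.linalg.norm(A @ x - lam * x)         # should be near zero
scale_check = (gamma + lam) * (x0 @ x)             # should be near one
```

Although $A - \lambda I$ itself is singular, the rank-one term makes the system solvable for a generic $x_0$; the quantity `scale_check` reflects the identity $(\gamma + \lambda) x_0^T x = 1$ obtained by multiplying the system by a unit eigenvector.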

Before proceeding with our discussion, we want to mention a work [3] of G. Peters and J.H. Wilkinson, which was further explained in []. In [3], the authors discussed an idea of computing an approximate eigenvector $x_\lambda$ when an approximate eigenvalue $\lambda$ is given, i.e., when $A - \lambda I$ is very ill-conditioned, or nearly singular, by considering

$$\left( A - \lambda I + x_i p^T \right) x = x_i, \tag{33}$$

with a random vector $p$, inspired by the inverse iteration, i.e., by $(A - \lambda I) x_{i+1} = \frac{x_i}{\|x_i\|}$. The authors noticed that $A - \lambda I + x_i p^T$ can be well-conditioned and that the solution to (33) is nothing but a constant multiple of the solution to $(A - \lambda I) x = x_i$, but they gave reasons why (33) is not in their favor, simply because $A - \lambda I + x_i p^T$ changes its form at every iteration, making computations inefficient. However, with $\lambda$ fixed, even though a limit exists for the inverse iteration, the convergence is still linear. Further discussions can be found in [4], and the references therein, in relation to the shifted inverse iteration and the Rayleigh quotient iteration.

On the other hand, our concern is whether we can make full use of the nonsingular system (32) to analyze quantitatively the error in eigenvector estimation regardless of the multiplicities of the corresponding eigenvalues, which helps in understanding Newton's method. So, we will provide a series of results in the rest of this section.

Proposition 1. Suppose that $\lambda$ has multiplicity 1. With probability 1, the equation (32) has a unique nonzero solution $x$ that is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$.

Proof. Let $q$ be a unit eigenvector of $A$ corresponding to $\lambda$. Note that if we choose $x_0 \in S^{N-1}$ uniformly at random, then we have $q^T x_0 \ne 0$ with probability 1. Moreover, if $q^T x_0 \ne 0$, then $\left( A - \lambda I + (\gamma + \lambda) x_0 x_0^T \right) z = 0$ implies $x_0^T z = 0$. Hence, we have $Az = \lambda z$. Since $\lambda$ has multiplicity 1, $z = aq$ for some $a \in \mathbb{R}$. In addition, $0 = z^T x_0 = a q^T x_0$ implies $a = 0$. That is, $z = 0$. Hence, $A - \lambda I + (\gamma + \lambda) x_0 x_0^T$ is nonsingular and there exists a unique nonzero solution $x$ to (32).
By multiplying 3 by q T, we have + λx T 0 x =, which implies that x also satisfies A λi x = 0. If the multiplicity of an eigenvalue λ is greater than, then 3 becomes singular and Proposition does not apply. However, when the multiplicity m > is known, we can construct another nonsingular system. Corollary 4. Suppose that an eigenvalue λ of A has multiplicity m >. We choose x 0,..., x m uniformly at random from S N and set an N m matrix X 0 = [x 0 x m ]. With probability, the equation A λi + + λx 0 X T 0 X = X 0 has a unique nonzero solution X = [ x 0 x m ], an N m matrix, whose columns x 0,..., x m constitute a basis for the eigenspace corresponding to the eigenvalue λ. Moreover, Proposition below says that, regardless of an eigenvalue s multiplicity and of the symmetry of a matrix, a good estimate of the eigenvalue guarantees a good estimate of a corresponding eigenvector through the nonsingular linear system 3. Proposition. Let A M N R be diagonalizable. Suppose that λ is an eigenvalue of A. Let be such that > 0 and λ. 9

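1. If $N(A - \bar\lambda I)$, the null space of $A - \bar\lambda I$, is of dimension 1, then choosing $x_0 \in S^{N-1}$ uniformly at random, (32) has a unique nonzero solution $\bar x$ with probability 1, which is an eigenvector of $A$ corresponding to $\bar\lambda$.

2. If $N(A - \bar\lambda I)$ has dimension greater than 1, then choosing $x_0 \in S^{N-1}$ uniformly at random, we can see, with probability 1, that for $\lambda$ close enough to $\bar\lambda$,

$$ \left(A - \lambda I + (\epsilon + \lambda) x_0 x_0^T\right) x = x_0 \tag{34} $$

has a unique nonzero solution $x_\lambda$ satisfying

$$ \eta_1 |\lambda - \bar\lambda| < \|x_\lambda - \bar x\| < \eta_2 |\lambda - \bar\lambda| $$

for some $0 < \eta_1 < \eta_2 < \infty$, where $\bar x \in N(A - \bar\lambda I)$ is uniquely determined by $x_0$, and $\eta_1, \eta_2$ do not depend on $\lambda$.

Proof. The first part can be proven in the same way as we proved the previous proposition, with a choice of $x_0 \in S^{N-1}$ satisfying $q^T x_0 \neq 0$ and $\tilde q^T x_0 \neq 0$, where $q$ and $\tilde q$ are unit vectors spanning $N(A - \bar\lambda I)$ and $N(A^T - \bar\lambda I)$, respectively.

Note that $A - \lambda I$ is invertible for $0 < |\lambda - \bar\lambda| < \min_{\lambda_j \neq \bar\lambda} |\lambda_j - \bar\lambda|$, and that if $x_0 \in S^{N-1}$ is chosen uniformly at random, then $x_0$ is not orthogonal to $N(A - \bar\lambda I)$ with probability 1. Moreover, since $\mathbb{R}^N = \mathrm{eigenspace}(\bar\lambda) \oplus V$ with $V = \bigoplus_{\lambda_j \neq \bar\lambda} \mathrm{eigenspace}(\lambda_j)$, representing $x_0 = q_0 + r_0$ uniquely in $\mathbb{R}^N$, where $q_0 \in \mathrm{eigenspace}(\bar\lambda) = N(A - \bar\lambda I)$ and $r_0 \in V$, we have $q_0^T x_0 \neq 0$ with probability 1. We also note that for any $0 < \delta < \min_{\lambda_j \neq \bar\lambda} |\bar\lambda - \lambda_j|$, there exists $K > 0$ such that

$$ \sup_{\lambda \in [\bar\lambda - \delta,\, \bar\lambda + \delta]} \left\|(A - \lambda I)^{-1}\right\|_V < K, \tag{35} $$

where $\|(A - \lambda I)^{-1}\|_V$ is the operator norm of the restriction $(A - \lambda I)^{-1} : V \to V$. This is not difficult to see since $V$ is invariant under $A - \lambda I$, the restriction $A - \lambda I : V \to V$ is invertible, and $\|(A - \lambda I)^{-1}\|_V$ is continuous in $\lambda$ for $\lambda \in [\bar\lambda - \delta, \bar\lambda + \delta]$. So, we will fix $0 < \delta < \min\left\{ \min_{\lambda_j \neq \bar\lambda} |\bar\lambda - \lambda_j|,\ |\epsilon + \bar\lambda|/2 \right\}$ and consider $(A - \lambda I)^{-1}$ as being restricted to $V$ for $\lambda \in [\bar\lambda - \delta, \bar\lambda + \delta]$.

Firstly, with such an $x_0$, we will confirm that for $0 < |\lambda - \bar\lambda| < \delta$,

$$ (A - \lambda I)x = x_0 - (\epsilon + \lambda)\, x_0 x_0^T x \tag{36} $$

has a unique solution $x_\lambda$. If there is a solution $x_\lambda$, then letting $\alpha_\lambda = 1 - (\epsilon + \lambda)\, x_0^T x_\lambda$, we can see that (36) becomes $(A - \lambda I)x_\lambda = \alpha_\lambda x_0$, i.e., $x_\lambda = \alpha_\lambda (A - \lambda I)^{-1} x_0$. Hence, the unique existence of $\alpha_\lambda$ confirms the unique existence of $x_\lambda$. Note that $\alpha_\lambda$ must satisfy

$$ \alpha_\lambda = 1 - \alpha_\lambda (\epsilon + \lambda)\, x_0^T (A - \lambda I)^{-1} x_0, \quad\text{i.e.,}\quad \alpha_\lambda = \frac{1}{1 + (\epsilon + \lambda)\, x_0^T (A - \lambda I)^{-1} x_0}. $$

Together with (35), it is not difficult to see that for $\lambda \in [\bar\lambda - \delta, \bar\lambda + \delta]$,

$$ \frac{\|r_0\|}{\|A - \bar\lambda I\|_V + \delta} < \left\|(A - \lambda I)^{-1} r_0\right\| < K \|r_0\|. $$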

21 Since A λi q 0 = λ λ q 0, we have x T 0 A λi x 0 = λ λ qt 0 x 0 + x T 0 A λi r 0 qt 0 x 0 λ λ A λi r 0 > qt 0 x 0 λ λ K r 0. Noting that + λ > + λ for 0 < λ λ < δ, we can see that if 0 < λ λ < δ 0 := min δ, qt 0 x 0 β where β = max6k r 0, K r λ, then + λxt 0 A λi x 0 > and α λ exists and we can easily see that the unique nonzero solution x λ to 36 is represented as x λ = α λ A λi x 0 =, A λi x λx T 0 A λi x 0 and that α λ λ λ + λq 0 T x 0 q0 T x λx T 0 A λi r 0 = [λ λ + + λx T 0 A λi r λq0 T x 0] + λq 0 T x λ λ 0 < q0 T x K + λ + δ 0 r 0 + λq 0 T x 0 + λ δ 0 q0 T x 0 λ λ + λ λ λ. + δ 0 K r 0 Since + λ δ 0 > + λ / and + λ + δ 0 < 3 + λ / and δ 0 qt 0 x 6K r 0, we obtain that for 0 < λ λ < δ 0, implying that where + λ δ 0 q T 0 x 0 λ λ + λ + δ 0 K r 0 > + λq T 0 x 0 4 is determined by x 0. If we set x = α λ λ λ + λq 0 T x < ω λ λ, 0 ω = 4 qt 0 x K3 + λ / r 0 + λq T 0 x 0 + λx T 0 q0q 0, then for 0 < λ λ < δ 0, A λi x 0 x λ x = + + λx T 0 A λi x 0 + λx T 0 q 0 q 0 αλ λ λ + λx T 0 q q 0 + αλ A λi r 0 0 < η λ λ where η = ω q 0 + K r 0 + ωδ + λx 0. T 0 q0

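On the other hand, we let $z_0 \in V$ be such that $z_0 = (A - \bar\lambda I)^{-1} r_0$, i.e., $(A - \bar\lambda I) z_0 = r_0$, and set

$$ y = z_0 - \frac{z_0^T q_0}{\|q_0\|^2}\, q_0 \quad\text{and}\quad y_0 := \frac{y}{\|y\|}. $$

Then, it is not difficult to see that $y_0^T z_0 \neq 0$ and $y_0^T q_0 = 0$ and

$$ \left| y_0^T (A - \lambda I)^{-1} r_0 - y_0^T z_0 \right| = |\lambda - \bar\lambda| \left| y_0^T (A - \bar\lambda I)^{-1} (A - \lambda I)^{-1} r_0 \right| \le K^2 \|r_0\|\, |\lambda - \bar\lambda|, $$

which implies that there exists $\delta_1 > 0$ such that for $0 < |\lambda - \bar\lambda| < \delta_1$, we have

$$ \left| y_0^T (A - \lambda I)^{-1} r_0 \right| > \frac{|y_0^T z_0|}{2} \quad\text{and}\quad |\alpha_\lambda| > \frac{|\lambda - \bar\lambda|}{2\, |\epsilon + \bar\lambda|\, |x_0^T q_0|}, $$

i.e.,

$$ \|x_\lambda - \bar x\| \ge \left| y_0^T (x_\lambda - \bar x) \right| = |\alpha_\lambda| \left| y_0^T (A - \lambda I)^{-1} r_0 \right| > \eta_1 |\lambda - \bar\lambda|, \quad\text{with}\quad \eta_1 = \frac{|y_0^T z_0|}{4\, |\epsilon + \bar\lambda|\, |x_0^T q_0|} > 0. $$

Therefore, $0 < |\lambda - \bar\lambda| < \min(\delta_0, \delta_1)$ implies

$$ \eta_1 |\lambda - \bar\lambda| < \|x_\lambda - \bar x\| < \eta_2 |\lambda - \bar\lambda|. $$

From the analysis of (32), we can see a similarity between Newton's method and the Rayleigh quotient iteration. When generating a sequence $\{x_k\}$ via

$$ \left(A - \mu_k I + (\epsilon + \mu_k)\, y_k y_k^T\right) x_{k+1} = y_k, \quad\text{where } y_k = \frac{x_k}{\|x_k\|}, $$

we can see that Newton's method is obtained with $\mu_k = \frac{1}{\|x_k\|} - \epsilon$ and the Rayleigh quotient method is obtained with $\mu_k = \frac{\langle x_k, A x_k\rangle}{\|x_k\|^2}$. The same is true for solving $Ax = \lambda Bx$, i.e., if we use $\mu_k = \frac{1}{\|x_k\|_B} - \epsilon$, where $\|x_k\|_B^2 = \langle x_k, x_k\rangle_B$, we have Newton's method,

$$ \left[ A + \epsilon B - \frac{B}{\|x_k\|_B} + \frac{1}{\|x_k\|_B}\, \frac{B x_k}{\|x_k\|_B} \left(\frac{B x_k}{\|x_k\|_B}\right)^{T} \right] x_{k+1} = \frac{B x_k}{\|x_k\|_B}, $$

whereas if we use $\mu_k = \frac{\langle x_k, A x_k\rangle}{\langle x_k, x_k\rangle_B}$, then we have the Rayleigh quotient method. We will present numerical experiments towards the end of this paper showing an interesting feature of our proposed framework: it tends to find the smallest eigenvalues, in line with its philosophy that $F_{A,B}$ is to be minimized, whereas the Rayleigh quotient method tends to find the largest eigenvalues when both methods begin with a random initial point $x_0$.

Remark. We would like to mention that the same framework applies to solving eigenvalue problems involving complex Hermitian matrices and to finding singular values of $A \in M_{M \times N}(\mathbb{R})$. When it comes to singular values of $A$, we may consider either $A^T A$ or $A A^T$ in place of $A$, since $\lambda > 0$ is a singular value of $A$ if and only if $\lambda^2 > 0$ is that of $A^T A$ or $A A^T$.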
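The relation between the two shift rules can be sketched in code. The example below is an illustration under stated assumptions, not the author's implementation: it uses NumPy, a hypothetical test pair $(A, B)$ built so that the generalized eigenvalues are 1, ..., 5, a starting point chosen close to the eigenvector of the smallest eigenvalue, and a fixed number of 15 steps. The helper names `shifted_solve` and `run` are made up for this sketch. Both rules solve the same rank-one modified system and differ only in the shift.

```python
import numpy as np

def shifted_solve(A, B, x, eps, rule):
    """One step of (A - mu*B + (eps + mu) w w^T) x_new = w with w = B x / ||x||_B."""
    nB = np.sqrt(x @ B @ x)                       # B-norm of x
    w = B @ x / nB
    if rule == "newton":
        mu = 1.0 / nB - eps                       # shift coming from the functional
    else:
        mu = (x @ A @ x) / nB**2                  # Rayleigh quotient shift
    M = A - mu * B + (eps + mu) * np.outer(w, w)
    return np.linalg.solve(M, w)

def run(A, B, x, eps, rule, steps=15):
    for _ in range(steps):
        x = shifted_solve(A, B, x, eps, rule)
    nB = np.sqrt(x @ B @ x)
    lam = 1.0 / nB - eps if rule == "newton" else (x @ A @ x) / nB**2
    return lam, x

rng = np.random.default_rng(0)
n, eps = 5, 0.5
# Hypothetical test pair: B = Lf Lf^T and A = Lf S Lf^T with S symmetric,
# so that A x = lam B x has the eigenvalues of S, here 1, ..., 5.
Lf = np.eye(n) + 0.1 * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
B = Lf @ Lf.T
A = Lf @ Q @ np.diag([1.0, 2.0, 3.0, 4.0, 5.0]) @ Q.T @ Lf.T

# Start near the eigenvector of the smallest eigenvalue (assumption of this sketch).
y0 = (Q[:, 0] + 0.01 * rng.standard_normal(n)) / (eps + 1.0)
x0 = np.linalg.solve(Lf.T, y0)

lam_newton, x_newton = run(A, B, x0, eps, "newton")
lam_rq, x_rq = run(A, B, x0, eps, "rayleigh")
print(lam_newton, lam_rq)
```

Setting `B = np.eye(n)` recovers the standard problem $Ax = \lambda x$. Because the start is deliberately close to an eigenvector, this sketch only illustrates the local behavior of the two rules and says nothing about their behavior from random starting points.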
3 An eigenvalue problem on infinite dimensional spaces It is interesting to see that the same framework as 3 applies to eigenvalue problems on infinite dimensional spaces such as the Sturm-Liouville eigenvalue problem, the eigenvalue problem of self-adjoint elliptic operators, etc. We will present one such application of finding an eigenfunction of a self-adjoint uniformly elliptic operator corresponding to the smallest eigenvalue, which is an infinite dimensional version of a real symmetric and positive definite matrix.

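3.1 Symmetric uniformly elliptic operators

Let $\Omega$ be a bounded open subset of $\mathbb{R}^d$ with Lipschitz boundary $\partial\Omega$. We will denote by $L$ a symmetric uniformly elliptic operator defined by

$$ Lu = -\sum_{i,j=1}^{d} \partial_i \left(a_{i,j}\, \partial_j u\right) + cu, \tag{37} $$

where $a_{i,j}, c \in L^\infty(\Omega)$, $i, j = 1, \dots, d$, are such that $a_{i,j}(x) = a_{j,i}(x)$ a.e., and $c(x) \ge 0$ a.e., and there exist $0 < \alpha \le \beta < \infty$ such that for a.e. $x \in \Omega$ and for $\xi \in \mathbb{R}^d$,

$$ \alpha |\xi|^2 \le \sum_{i,j=1}^{d} a_{i,j}(x)\, \xi_i \xi_j \le \beta |\xi|^2. $$

The problem that we are interested in is to find $\varphi \in H_0^1(\Omega)$ that solves

$$ \begin{cases} L\varphi = \lambda\varphi, & \text{in } \Omega, \\ \varphi = 0, & \text{on } \partial\Omega. \end{cases} \tag{38} $$

It is known that the eigenvalues $\lambda_1, \lambda_2, \dots$ of $L$ are nonnegative and we may have them in an increasing order, that is, $0 < \lambda_1 < \lambda_2 \le \lambda_3 \le \cdots$. Therefore, a natural question to ask is to find the smallest eigenvalue $\lambda_1$ of $L$ and its corresponding eigenfunction.

Definition. $\varphi \in H_0^1(\Omega)$ is a weak solution of (38) if for any $\psi \in C_0^\infty(\Omega)$,

$$ \int_\Omega \sum_{i,j=1}^{d} a_{i,j}(x)\, \partial_j \varphi(x)\, \partial_i \psi(x)\, dx + \int_\Omega \left(c(x) - \lambda\right) \varphi(x)\, \psi(x)\, dx = 0. $$

Then, it is natural from the discussion in the previous sections that we want to define a functional $F_L$ with $\epsilon > 0$ by

$$ F_L(u) = \frac{1}{2} \int_\Omega \sum_{i,j=1}^{d} a_{i,j}(x)\, \partial_j u(x)\, \partial_i u(x)\, dx + \frac{1}{2} \int_\Omega c(x)\, u(x)^2\, dx + \frac{\epsilon}{2} \int_\Omega u(x)^2\, dx - \left( \int_\Omega u(x)^2\, dx \right)^{1/2} \tag{39} $$

and solve the following minimization problem

$$ \min_{u \in H_0^1(\Omega)} F_L(u) \tag{40} $$

and investigate the relationship between (38) and (40). The existence of a minimizer of the problem (40) is obvious by the standard method using the compact embedding theorem by Rellich-Kondrachov. Moreover, we can observe the same characteristics of (39) as those of its finite dimensional counterpart. For theorems and lemmas that follow, we will omit their proofs because they are essentially the same as what we presented in the finite dimensional case.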

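Lemma. The set of nonzero critical points of $F_L$ is

$$ \left\{ \varphi \in H_0^1(\Omega) : L\varphi = \lambda\varphi \text{ in } \Omega \text{ and } \|\varphi\|_{L^2(\Omega)} = \frac{1}{\epsilon + \lambda} \right\} $$

and

$$ \min_{u \in H_0^1(\Omega)} F_L(u) = -\frac{1}{2(\epsilon + \lambda_1)}, $$

where $\lambda_1 > 0$ is the smallest eigenvalue of $L$.

Theorem 5. Any local minimizer of (39) is a global minimizer.

3.1.1 Corresponding parabolic PDEs

Corresponding to the uniformly elliptic PDEs of the form (38), we will consider the following parabolic PDE: for $T \in (0, \infty)$, we solve

$$ \begin{cases} u_t = -Lu - \epsilon u + \dfrac{u}{\|u(t)\|} & \text{in } \Omega_T := \Omega \times (0, T], \\ u = 0 & \text{on } \partial\Omega_T := \partial\Omega \times [0, T], \\ u(0) = u_0 \neq 0 & \text{in } H_0^1(\Omega), \end{cases} \tag{41} $$

where $\|u(t)\| = \left( \int_\Omega u(x,t)^2\, dx \right)^{1/2}$. Due to the condition $c \ge 0$ in $\Omega$, $L$ is positive definite. In this context, we will use $\langle \cdot, \cdot \rangle$ and $\|\cdot\|$ for the inner products and norms both in $L^2(\Omega)$ and in $\mathbb{R}^N$ for $N \in \mathbb{N}$. Note that the partial differential equation in (41) is the formal gradient flow of the functional $F_L$ in (39), i.e., $u_t = -\nabla F_L(u)$.

Theorem 6. The problem (41) has a unique weak solution $u$ for any $T \in (0, \infty)$. Moreover, if $\psi_1$ is an eigenfunction of $L$ corresponding to the smallest eigenvalue $\lambda_1 > 0$ with $\|\psi_1\| = 1$, and if $\langle u_0, \psi_1 \rangle \neq 0$, then the solution $u$ satisfies

$$ u(t) \to \frac{1}{\epsilon + \lambda_1}\, v \quad \text{in } L^2(\Omega) \text{ as } t \to \infty $$

for some $v \in \{\pm\psi_1\}$.

Before proving Theorem 6, we will present an ODE version of (41), which will be used when proving Theorem 6. For $A \in \mathrm{Sym}_{N,p}(\mathbb{R})$, as we did previously, if we consider a diagonalization of $A$, $A = Q \Lambda_N Q^T$, where $Q$ is an orthogonal matrix and $\Lambda_N = \mathrm{diag}(\lambda_1, \dots, \lambda_N)$, $0 < \lambda_1 = \cdots = \lambda_p < \lambda_{p+1} \le \cdots \le \lambda_N$ for some $p < N$, we have

$$ F_A(x) = F_{\Lambda_N}(y), \quad \text{with } y = Q^T x. \tag{42} $$

The gradient descent flow associated with the functional $F_{\Lambda_N}$ is $\frac{d}{dt}\Phi_N(t) = -\nabla F_{\Lambda_N}(\Phi_N(t))$, and we are interested in the existence of a solution of (43) below, which is to solve on $(0, \infty)$

$$ \begin{cases} \dfrac{d}{dt}\Phi_N(t) = -\nabla F_{\Lambda_N}(\Phi_N(t)) = -(\Lambda_N + \epsilon I)\,\Phi_N(t) + \dfrac{\Phi_N(t)}{\|\Phi_N(t)\|}, \\ \Phi_N(0) \in \mathbb{R}^N, \quad \varphi_j(0) \neq 0 \text{ for some } j \le p, \end{cases} \tag{43} $$

where $\Phi_N = \Phi_N(t) = [\varphi_1(t) \cdots \varphi_N(t)]^T$ and $\|\Phi_N(t)\| = \left( \sum_{k=1}^{N} \varphi_k(t)^2 \right)^{1/2}$.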
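The flow (43) is easy to explore numerically. The sketch below assumes NumPy and integrates (43) with the explicit Euler method. The eigenvalues, the parameter `eps`, the initial vector, the step size and the number of steps are illustrative assumptions, and `gradient_flow` is a hypothetical helper name. It is only meant to show the expected limiting behavior: the norm approaches $1/(\epsilon + \lambda_1)$, the components outside the eigenspace of $\lambda_1$ decay, and the components inside that eigenspace keep their initial ratio.

```python
import numpy as np

def gradient_flow(lams, phi0, eps, dt=0.01, steps=5000):
    """Explicit Euler for d(phi)/dt = -(Lambda + eps*I) phi + phi / ||phi||."""
    lams = np.asarray(lams, dtype=float)
    phi = np.asarray(phi0, dtype=float).copy()
    for _ in range(steps):
        phi = phi + dt * (-(lams + eps) * phi + phi / np.linalg.norm(phi))
    return phi

# Illustrative data: smallest eigenvalue 1 with multiplicity p = 2.
lams = [1.0, 1.0, 2.0, 3.0, 5.0]
eps = 0.5
phi0 = [0.3, -0.2, 0.5, 0.1, 0.4]

phi = gradient_flow(lams, phi0, eps)
limit_norm = np.linalg.norm(phi)
print(limit_norm, 1.0 / (eps + lams[0]))   # the two values should agree closely
print(phi)
```

The eigenvalue can then be read off from the norm alone as `1.0 / limit_norm - eps`, without evaluating a Rayleigh quotient.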

25 Theorem 7. There exist a unique solution Φ N C [0, of 43 and v R N such that Φ N t v as t, where v is an eigenvector of Λ N corresponding to the eigenvalue λ with v = +λ v, e j 0. Note that { e,..., e N } is the standard basis for R N. Proof. The existence and uniqueness of a solution Φ N on [0, is easily guaranteed by the theory of ODEs once we establish a lower bound ω > 0 for Φ N t, t [0,. Due to φ j 0 0, i.e., Φ N 0 > 0, a solution exists and is unique on [0, ɛ] with ɛ <<. Then, by setting T max = sup{t 0, : a solution Φ N exists and is unique on [0, T ]}, we know that T max ɛ. Note that for t 0, T max and for k N, we have d dt φ kt = λ k + φ kt + Φ N t φ kt > λ N + φ kt, resulting in φ k t 0 for t 0 if and only if φ k 0 0. Let ω = min +λ N, Φ N 0. Suppose {t 0, T max : Φ N t ω}. Then, T ω = inf{t 0, T max : Φ N t ω} exists. Since there exists δ > 0 such that ω < Φ N t < 3 ω on T ω δ, T ω, we have d dt Φ Nt 3 λ N + Φ N t > 0 on T ω δ, T ω, that is, Φ N t is increasing in T ω δ, T ω and lim t Tω Φ N t > ω = Φ N T ω. This is a contradiction. Therefore, we have inf Φ N t > ω, t 0,T max which eventually implies that T max =. In addition, we note that d d dt F Λ N Φ N = dt Φ N, Λ N Φ N + Φ N Φ N Φ N = dφ N dt and that since F ΛN Φ N is bounded below, there exists y R such that implying that lim dφ N = 0 dt t and lim F Λ t N Φ N t = y 44 ΛN lim Φ N + Φ N = 0. t Φ N That is, for each k =,,..., N, φ lim λ k + t Φ N t k t =

26 Since we have ΛN 0 = lim Φ N + ΛN Φ N lim + IΦ N, t Φ N t lim t k= N λ k + φ kt =, 46 which implies that there exist k and {t n } such that t n as n and Unless lim n φ k t n = lim n φ k t n > 0. +λ k, there exists k k such that lim n φ k t n > 0 by taking a subsequence of {t n } if necessary. Then, 45 implies λ k = λ k and lim Φ Nt n =. n + λ k We may repeat this process finitely many times to have lim n {k:λ k =λ k } φ kt n = + λ k, lim n {k:λ k λ k } φ kt n = 0. Suppose that there exist l and {s n } such that λ l λ k and s n as n and n {k:λ k =λ l } lim n φ l s n > 0. Then, as was done above, with a subsequence of {s n } if necessary, we have lim n Φ N s n = +λ l and lim φ ks n = + λ l, lim φ ks n = 0 n and 44 implies that {k:λ k λ l } lim F Λ n N Φ N t n = + λ k = y = lim F Λ n N Φ N s n = + λ l, which is a contradiction. Therefore, we conclude that lim t Φ N t = lim t {k:λ k =λ k } φ kt = + λ k, lim t {k:λ k λ k } +λ k φ kt = 0. We will now claim that λ k = λ. Suppose that λ k > λ, i.e., k > p. Then, since +λ k, there exists T > 0 such that for t > T, +λ +λ k +λ +λ k > Φ N t < + λ + λ k + λ + λ k, and 6

27 and we have d dt p φ kt = k= λ + + Φ N t p φ λ k λ p kt φ + λ + λ kt k for t > T, which results in Φ N t as t. This is a contradiction. Therefore, we N have λ k = λ and lim t k=p+ φ k t = 0 and lim Φ N t = lim t t k= k= Lastly, we will claim that there exists v R N such that v = Note that for k =,..., p, if φ k 0 0, then Φ N t v as t. k= p φ kt = + λ. 47 +λ, and v, e j 0, and d φ k t dt φ kt = λ + + Φ N t for t 0, implying that for T 0,, T lnφ kt lnφ k0 = λ + + dt =: ΨT. 0 Φ N t From 47, we can see that which implies p φ kt = e ΨT k= Hence, for k p, p φ k0 as T, + λ k= Ψ = lim Ψt = ln t + λ φ kt v k as t, ln p k= φ k0. where v k = φ k 0e Ψ. Noting that φ k tφ k 0 > 0 for all t 0 unless φ k 0 = 0, we can easily see that φ k t v k as t. If we set v = [v v p 0 0] T, then v is an eigenvector of Λ N corresponding to λ and v, e j = v j 0 since φ j 0 0 for some j p, and Φ N t v as t and v = +λ. We will now give a proof of Theorem 6. 7

28 Proof. Theorem 6 The proof is inspired by the Galerkin s method. Note that we can find an orthonormal basis {ψ k } k N for L Ω such that ψ k is an eigenfunction of L corresponding to the k th smallest eigenvalue λ k with ψ k H0 Ω. Then, we let V N, N N, be the subspace of L Ω spanned by {ψ,..., ψ N } and let P N be the projection of L Ω onto V N. Suppose that u 0 H0 Ω is given and satisfies u 0, ψ 0. We consider u t = Lu ut u in Ω T, u = 0 on Ω T, 48 u0 = P N u 0. If u N L [0, T ]; H 0 Ω with d dt u N L [0, T ]; H Ω is a solution of 48, then for k =,..., N, we have and d dt u N, ψ k = λ k + u N, ψ k + u N, ψ k u N t d dt u N, ψ k = λ k + u N, ψ k + u N, ψ k u N t with u N 0, ψ k = P N u 0, ψ k. Considering φ k t = u N t, ψ k, k =,..., N, we can see that φ,..., φ N solve 43. Since Φ N = [φ φ N ] T satisfies that for t [0,, u N t Φ N t > ω = min, Φ N λ N Therefore, the existence of a solution to 48 can be obtained by a linear system of ODEs 43 given in the Appendix below, with the initial condition and setting Φ N 0 = [ P N u 0, ψ P N u 0, ψ N ] T, u N x, t = φ tψ x + + φ N tψ N x. Then, u N is a weak solution of 48 such that u N t + λ v in L Ω as t, where φ,..., φ N satisfy 43 and v is either ψ or ψ. Suppose now that u N,, u N, are two such solutions of 48. Let v N = u N, u N,. Then, we have v N = Lv N fu N, fu N,, 50 t where fux, t = ux, t ux,t ut. Since f on L Ω is Lipschitz with a Lipschitz constant µ = ω on {u : u ω} and the two solutions u N,, u N, satisfy 49 for all t 0, by taking the inner product on L Ω with v N on both sides of 50, we have which implies d dt v Nt, v N t Lv N t, v N t + µ v N t, v N t, d dt v N, v N µ v N, v N Lv N, v N

29 Therefore, v N 0 = 0 implies u N, = u N, for a.e. x, t Ω [0, T ] for any T 0,. In fact, we know that u N C [0, ; H 0 Ω as well as u N L [0, ; H 0 Ω, d dt u N L [0, ; H Ω. We now solve 4. Firstly, we fix k 0 N so that u0 < P k0 u 0 with λ k0 < λ k0+ and obtain the solution u N of 48 with N > k 0. We may write u N as u N t = φ,n tψ + + φ N,N tψ N. Let ϕ,n t = φ,n t + + φ k t and ϕ 0,N,N t = φ k t ,N φ N,N t. Then, we have This implies that dϕ,n dt dϕ,n dt λ k0 + ϕ,n + Φ N t ϕ,n, 5 λ k0+ + ϕ,n + Φ N t ϕ,n. 53 ϕ,n t ϕ,n 0e λ k 0 +t+ t 0 Φ N s ds ϕ,n 0e λ k 0 ++t+ t 0 Φ N s ds ϕ,n t i.e., ϕ,n t ϕ,n t ϕ,n 0 ϕ,n 0 eλ k 0 + λ k0 t > e λ k 0 + λ k0 t 54 due to ϕ,n 0 > u0 > ϕ,n 0. Let M = max +λ, u 0, M = min λ k0 +, u 0. Note that dϕ,n dt λ + ϕ,n + implies ϕ,n t Φ N t < M. And 54 implies Φ N t ϕ,n ϕ,n t < M e λ k 0 + λ k0 t. 55 Since we already saw that φ j,n t, j =,,..., N, exist for all t 0,, if we suppose t 0 = inf{t 0, ϕ,n t = M } <, then from 5 and 54, we have Φ N t 0 = ϕ,n t 0 + ϕ,n t 0 < M + e λ k 0 + λ k0 t 0 < M. λ k0 + Therefore, there exists δ > 0 such that for t t 0 δ, t 0, Φ N t < 9 λ k0 +,

30 which implies that dϕ,n t λ k0 + ϕ,n t + dt Φ N ϕ,n t > 0 on t 0 δ, t 0. This is a contradiction since ϕ,n t > M for t [0, t 0. Therefore, we end up with Note that 56 implies That is, the solution u N for N > k 0 satisfies M < ϕ,n t < M for t inf t 0 Φ N t M inf u N t M. 57 t 0 We fix T 0, and consider a sequence of solutions {u N t} N>k0, where u N is the solution of 48 with u N 0 = P N u 0. Using v = u N u M with N, M > k 0 in place of v N in 5, and noting that we may choose ω = M for µ = ω, we obtain u N t u M t e µt u N 0 u M 0 e µt u N 0 u M 0 for t [0, T ], 58 which implies that for any m N, {φ m,n t} N>k0 converges uniformly to φ m, t in [0, T ] and Φ N t Φ t uniformly on [0, T ] as N, where φ m,n t = u N t, ψ m and Φ t = m= φ m, t. Hence, for all m, we have In addition, for k 0 < m N, if we consider d dt φ m, = λ m + φ m, + Φ φ m,. ω m,n t = φ m,nt + + φ N,Nt, then we can obtain, by the same argument for 55, ω m,n t < M e λm λ k 0 t, t 0 59 implying φ m,n t < M e λm λ k t 0 on [0, T ] and, eventually, on 0,. By a slight modification on 55, we can see that even for each m k 0, there exists ζ m > 0 such that φ m,n t < ζ m M e λm λt on [0,, i.e., with η k0 = max k=,...,k0 ζ k, { η k0 M e λm λt, m k 0, φ m,n t < M e λm λ k t 60 0, m > k 0. This implies that { N } λ m φ m,n m= Hence, by defining N N converges uniformly to u x, t = λ m φ m, in [0, T ] as N. m= φ m, tψ m x, m= 30

31 we can easily see that u L [0, T ]; H 0 Ω and d dt u = m= and that for a.e. t [0, T ], and for each v H 0 Ω, Ω t u x, tvxdx + Ω dφ m, tψ m x L [0, T ]; H Ω dt d Ω i,j= u xvxdx u t a i,j x j u x i vxdx + cxu xvxdx Ω Ω u xvxdx = 0 with u 0 = u 0 in L Ω. Therefore, u is a weak solution of 4 for any T 0,. Next, using the functional in 39, we define ft for t 0, by Note that ft = F L u = + ft = d a ij x j u x, t i u x, tdx + Ω i,j= Ω u x, t dx λ m φ m, t + m= Ω u x, t dx Ω φ m, t Φ t. m=, cx u x, t dx Since φ m, t decays exponentially to 0 in 0, as m, we can see that df dt t = = m= λ m φ m, t + φ m, t φ m, t Φ t dφ m, t. dt m= dφm, Therefore, f is non-increasing and bounded below, i.e., a = lim t ft exists. From the fact that for all N N, Φ N t as t, + λ together with 60, we can see that Φ t + λ as t. which eventually implies that for each m N, dφm, dt t 0 as t, i.e., lim λ m + t Φ t dt t φ m, t = 0. 6 Moreover, since we have φ m, t max k=,...,k0 {M ζ k e λ λt } for all m with ζ = and inf t 0 Φ t M, 3

32 we finally obtain, together with 6, which implies that lim t φ, t = for some v {±ψ }. u t 4 Numerical experiments lim Φ t = t λ +, λ +, and λ + v in L Ω as t We will now present a few numerical experiments to confirm those analyzed properties of our proposed framework. 4. The gradient descent method In Figure,, 3 and 4, we show the first 5 eigenfunctions of the Laplace operator corresponding to their eigenvalues in the increasing order on various domains that are subsets of,, in R. We solved 4 using the FDM Finite Difference Method on uniform grids of size 8 8 including the boundaries, i.e., we represented,, as a uniform grid of size 8 8 and set the interior of each domain to have value and the complement of the interior to have value 0. As was seen that the solution u, t to 4 converges to an eigenfunction as t, corresponding to the smallest eigenvalue, no eigenvalue estimate was necessary. We set dt = t n+ t n = 0.7 h and dx = dy = h = 0.05 = 80, and used utn ut n < 0 3 ut n as a stopping criterion for all the numerical simulations. When the algorithm stops, ut n is set to be the eigenvalue corresponding to the eigenfunction u, t n, Using MATLAB, at the n th iteration, representing ut n as an 8 8 matrix, ut n+ is computed by where ut n+ = ut n + dt ut n ut n, 6 ut n ut n = 4 circshiftut n, [0, ] + 4 circshiftut n, [0, ] + 4 circshiftut n, [, 0] + 4 circshiftut n, [, 0] ut n. When considering various domains Ω, we create a mask, which is an 8 8 matrix representing Ω, and multiply ut n, ut n+ and ut n by the mask in 6, which makes this method extremely useful since we can deal with various domains by defining the mask matrix only. 3
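The masked shift-based scheme described above can be sketched in Python, where `np.roll` plays the role of MATLAB's `circshift`. This is a sketch under assumptions that are not the paper's settings: a standard five-point Laplacian divided by $h^2$, a time step `dt = 0.2 * h**2` chosen to stay below the explicit stability limit, `eps = 1.0`, a 21 by 21 grid on the square $(-1,1)^2$, and a fixed number of steps instead of a stopping criterion. The function names are hypothetical.

```python
import numpy as np

def laplacian(u, h):
    """Five-point Laplacian built from periodic shifts (np.roll ~ circshift)."""
    return (np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
            + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1) - 4.0 * u) / h**2

def smallest_eigenpair(mask, h, eps=1.0, steps=4000):
    """Explicit gradient flow u <- u + dt*(Lap u - eps*u + u/||u||) on the masked domain."""
    dt = 0.2 * h**2                      # assumed step size, not the paper's value
    u = mask.astype(float)               # start from the indicator of the domain
    for _ in range(steps):
        norm = h * np.linalg.norm(u)     # discrete L2 norm
        u = mask * (u + dt * (laplacian(u, h) - eps * u + u / norm))
    return 1.0 / (h * np.linalg.norm(u)) - eps, u

n = 21
h = 2.0 / (n - 1)                        # grid on [-1, 1]^2 including the boundary
mask = np.zeros((n, n))
mask[1:-1, 1:-1] = 1.0                   # interior of the square; boundary stays 0
lam_est, u = smallest_eigenpair(mask, h)

# Smallest eigenvalue of the discrete Dirichlet Laplacian on this grid, for comparison.
lam_discrete = 8.0 / h**2 * np.sin(np.pi * h / 4.0) ** 2
print(lam_est, lam_discrete)
```

Other domains only require a different `mask`, provided the outermost grid layer is kept at 0 so that the periodic wrap-around of `np.roll` has no effect.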


33 Figure : The first 5 eigenfunctions of the Laplace operator on an L shape domain, which is a union of three unit squares in R, i.e.,,, \ [0,, 0], corresponding to their eigenvalues in the increasing order. The last figure visualizes the domain, a uniform grid of size 8 8 including the boundary. The 8th and 9th eigenfunctions correspond to the same eigenvalue The 8th and 9th eigenfunctions correspond to the same eigenvalue , The 3rd and 4th eigenfunctions correspond to the same eigenvalue To have rough estimates of eigenvalues and eigenfunctions, we can solve 4 on coarser grids. For example, since the L shape domain in Fig. has a simple structure and there are many previous works e.g. [3], [3], [4] on computing eigenvalues and eigenvectors of the L shape domain in the literature, we indeed observed that 4 on a uniform grid of size computed rough estimates of eigenvalues and eigenfunctions fast. One thing that we would like to emphasize in Fig. is that our method provides as accurate results as possible on uniform grids. To see this, we would like to take a closer look at the 3rd and 4th eigenfunctions and corresponding eigenvalues. It is known that ux, y = sinπx sinπy for x, y,, \ [0,, 0] is an eigenfunction corresponding to the eigenvalue π. If we discretize this eigenfunction on a uniform grid of size 8 8 to have u d 3x i, y j = sinπx i sinπy j and compare it with the 3rd eigenfunction u 3 that we computed in Fig., then we have ũ d 3 ũ 3 = max ũ d 3x i, y j ũ 3 x i, y j 0 4, i,j 8 where ũ d 3, ũ 3 are the L normalizations of u d 3, u 3, and i,j d ũ d 3x i, y j ũ d 3x i, y j λ 3 0 4, where d is the discrete Laplace operator on the uniform grid and λ 3 is the computed eigenvalue λ 3 = u 3. The same is observed for the 4th eigenfunction, as well. As for the 8th and 9th eigenfunctions, we know that the corresponding eigenspace is spanned by the two simpler eigenfunctions sinπx sinπy and sinπx sinπy than the 8th and 9th eigenfunctions u 8, u 9 in Fig.. 
The true eigenvalue is 5π. By setting ũ d 8, ũ d 9 to be the discretized and L normalized eigenfunctions from the true simpler ones, we computed the projections v 8, v 9 of u 8, u 9 onto the eigenspace spanned by {ũ d 8, ũ d 9} and computed u 8 v 8 and u 9 v 9 and observed that the L norms are about 0 4 implying that {u 8, u 9 } is indeed a basis for the same eigenspace. After normalizing the projections v 8, v 9 by their L 33


34 norms, we also observed i,j d ṽ 8 x i, y j ṽ 8 x i, y j λ 8 0 4, where λ 8 is the computed eigenvalue u 8, and same for ṽ 9. However, when we compare computed eigenfunctions with true eigenfunctions of the form sinnπx sinmπy on the fixed uniform grid of size 8 8, we observed that the accuracy of computed eigenvalues is still preserved as we increase n, m, but the accuracy of computed eigenfunctions deteriorates. Figure : The first 5 eigenfunctions of the Laplace operator on an annulus domain corresponding to their eigenvalues in the increasing order. The last figure visualizes the domain. It s clear which eigenfunctions should correspond to the same eigenvalues. As was shown in Fig., orthogonal eigenfunctions corresponding to the same eigenvalues that are rotations of each other can be expected when the domain is rotationally invariant. However, due to the uniform grid that we use and depending on the rotated angle, we observed that estimated eigenvalues in Fig. can be slightly different for such eigenfunctions unlike the L shape domain in Fig.. In Fig. 3 and Fig. 4, we computed eigenfunctions of the Laplace operator on other domains. Especially, we created an domain with no symmetry in Fig. 4. Figure 3: The first 5 eigenfunctions of the Laplace operator on a domain, visualized in the last, corresponding to their eigenvalues in the increasing order. Since we provided a generalized eigenvalue problem in the previous work [?] as an application, we can also see how to apply the Newton s method 3 to solve Ax = λbx by 34


Figure 4: The first 5 eigenfunctions of the Laplace operator on a domain, visualized in the last panel, corresponding to their eigenvalues in the increasing order.

minimizing $F_{A,B}$, where $A$, $B$ are symmetric and $B$ is positive definite and $F_{A,B}$ is given by

$$ F_{A,B}(x) = \frac{1}{2}\langle x, Ax \rangle + \frac{\epsilon}{2}\langle x, Bx \rangle - \sqrt{\langle x, Bx \rangle}, $$

with some $\epsilon > 0$ making $A + \epsilon B$ positive definite. We note that minimizing $F_{A,B}$ by Newton's method with $\|x\|_B := \sqrt{x^T B x}$ generates a sequence $\{x_k\}$ satisfying

$$ \left[ A + \epsilon B - \frac{B}{\|x_k\|_B} + \frac{1}{\|x_k\|_B}\, \frac{B x_k}{\|x_k\|_B} \left( \frac{B x_k}{\|x_k\|_B} \right)^{T} \right] x_{k+1} = \frac{B x_k}{\|x_k\|_B}, \tag{63} $$

which can be rewritten as

$$ \left[ A - \lambda_k B + (\epsilon + \lambda_k)\, \frac{B x_k}{\|x_k\|_B} \left( \frac{B x_k}{\|x_k\|_B} \right)^{T} \right] x_{k+1} = \frac{B x_k}{\|x_k\|_B} \tag{64} $$

with $\lambda_k = \frac{1}{\|x_k\|_B} - \epsilon$. Then, depending on how $\lambda_k$ is updated in (64), either by $\lambda_k = \frac{1}{\|x_k\|_B} - \epsilon$ or by $\lambda_k = \frac{x_k^T A x_k}{x_k^T B x_k}$, we end up with either (63) or an iteration of the Rayleigh quotient type. However, when we tested the two eigenvalue update rules above with randomly selected positive definite matrices $A$, $B$, we observed numerically that the update rule $\lambda = \frac{1}{\|x\|_B} - \epsilon$ found small eigenvalues much more often than the update rule $\lambda = \frac{x^T A x}{x^T B x}$. In fact, we noticed that $\lambda = \frac{1}{\|x\|_B} - \epsilon$ found the true smallest eigenvalues much more often than $\lambda = \frac{x^T A x}{x^T B x}$, as can be seen in Fig. 5. This is consistent with the fact that $\lambda = \frac{1}{\|x\|_B} - \epsilon$ comes from minimizing the functional $F_{A,B}$, which tries to find the smallest eigenvalue.

We also performed the same experiment as in Fig. 5 with different sizes, i.e., with $A_i$'s and $B_i$'s of two different sizes, one shown in blue and the other shown in red in Fig. 6. Solid lines indicate results using the update rule $\lambda = \frac{1}{\|x\|_B} - \epsilon$ and lines with circular dots indicate results using the other update rule $\lambda = \frac{x^T A x}{x^T B x}$. Fig. 6 clearly shows that the update rule $\lambda = \frac{1}{\|x\|_B} - \epsilon$ tends to find smaller eigenvalues than the update rule $\lambda = \frac{x^T A x}{x^T B x}$. It is interesting to observe that the update rule $\lambda = \frac{x^T A x}{x^T B x}$ never found the smallest eigenvalues.

Secondly, we will perform the same numerical experiments as was done with the gradient descent method: finding eigenfunctions of the Laplace operator on various domains. In particular, we will show a comparison result with the L shape experiment. When computing a few eigenfunctions corresponding to their eigenvalues in the increasing order, a drawback of

36 Figure 5: 00 trials with random positive definite pairs A i, B i, i =,,..., 00. A i s and B i s are of size 0 0. For each pair A i, B i, we test the two eigenvalue update rules 000 times starting from random vectors. The red and blue colors indicate the two update rules: λ = x B in red and λ = xt Ax in blue. Top Row: For each i =,..., 00, we plot x T Bx the number of times each update rule finds the true smallest eigenvalue λ min. We counted the number of eigenvalues computed whose difference from λ min is less than 0 3. Middle Row: For each i =,..., 00, we plot the maximum eigenvalue among the 000 times trials. Bottom Row: For each i =,..., 00, we plot the mean eigenvalue among the 000 times trials. the gradient descent method, besides its rate of convergence, is that one needs to apply any type of the Gram-Schmidt process at every iteration to make sure that the next eigenfunction being searched is orthogonal to the previous ones. This slows down the whole process of computing the first n eigenfunctions quite a bit as n increases. On the other hand, if an estimated eigenvalue λ k = u k at the point u k is close enough to a true eigenvalue, then the Newton s method would generate a convergent sequence to a nearby critical point, i.e., a corresponding eigenfunction, which implies that it may be unnecessary to apply the Gram-Schmidt process. Hence, when computing many eigenfunctions, the Newton s method can speed up the total time spent significantly if we can find good starting points. Without any fast and efficient ideas combined, we would like to compare the total time spent by the gradient descent method with that by the Newton s method. to find the first n eigenfunctions. We computed the first 00 eigenfunctions of the Laplace operator on the L shape domain, 0, [0, 0, on a uniform grid of size 8 8 using the gradient descent method and the Newton s method, separately. 
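The cost of the orthogonality constraints mentioned above comes from projecting out all previously computed eigenvectors at every iteration. The following is a minimal sketch of that idea for a small symmetric matrix, assuming NumPy; the test matrix, `eps`, step size, step count and the helper name `descend` are illustrative assumptions, and the projection is a plain Gram-Schmidt step against an orthonormal set.

```python
import numpy as np

def descend(A, prev, eps, x, dt=0.05, steps=3000):
    """Gradient descent on F_A(x) = <x,Ax>/2 + eps*<x,x>/2 - ||x||,
    kept orthogonal to the orthonormal columns of `prev`."""
    for _ in range(steps):
        x = x - prev @ (prev.T @ x)                       # project out known eigenvectors
        x = x - dt * (A @ x + eps * x - x / np.linalg.norm(x))
    return x - prev @ (prev.T @ x)

rng = np.random.default_rng(1)
# Hypothetical symmetric test matrix with eigenvalues 1, 2, 4, 7.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q @ np.diag([1.0, 2.0, 4.0, 7.0]) @ Q.T
eps = 0.5

prev = np.zeros((4, 0))                                   # no eigenvectors known yet
found = []
for _ in range(3):
    x = descend(A, prev, eps, rng.standard_normal(4))
    found.append(1.0 / np.linalg.norm(x) - eps)           # eigenvalue from the norm
    prev = np.column_stack([prev, x / np.linalg.norm(x)])
print(found)
```

Each additional eigenvector adds one more column to `prev`, so the projection cost per iteration grows with the number of eigenpairs already computed.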
Having computed the first k unit eigenfunctions ϕ,..., ϕ k an knowing that the gradient descent method provides convergence to a global minimizer, we can find u 0 satisfying the constraints ϕ j, u 0 = 0, j =,,..., k and u0 + u 0 < ɛ 65 u 0 with ɛ = 0. or 0.0, in which case u 0 is close to the k + st smallest eigenvalue, and consider u 0 as a good starting point for both the gradient descent method and the Newton s 36

Figure 6: The same experiments as in Fig. 5 with different matrix sizes. The red and blue colors indicate the two different sizes: 50 50 in blue and 00 00 in red.

37 Figure 6: The same experiments as in Fig. 5 with different matrix sizes. The red and blue colors indicate the two different sizes: in blue and in red. Solid lines are from the update rule λ = x B and the lines with circular dots are from the update rule λ = xt Ax x T Bx. Top Row: We counted the number of times that the true smallest eigenvalues were computed among the 000 times trials for each random positive definite pair A i, B i, i =,..., 00. Only the counts using the update rule λ = x B are shown because the update rule λ = xt Ax x T Bx never found the smallest eigenvalues. Middle Row: The maximum eigenvalue among the 000 times trials for each A i, B i. Bottom Row: The mean eigenvalue among the 000 times trials for each A i, B i. method to find the k+ st eigenfunction for comparison. Then, we measure the time spent for the gradient descent method to converge starting from u 0 under the orthogonality constraints and also the time spent for the Newton s method to converge starting from the same u 0 without any constraints. In Fig. 7, we present the amount of time elapsed for both cases: the gradient descent method with the orthogonality constraints vs the Newton s method 3 without any constraints. We can observe in this experiment that a linear increase in time due to the linear increase in the number of constraints for the gradient descent method is clear and that good starting points even for the gradient method can reduce the computational time significantly. In Fig. 8, we compare the Newton s method 3 with the Rayleigh quotient iteration with the same starting points used in Fig. 7. As was expected, the gradient descent method with the orthogonality constraints takes much more time to converge, depending on the starting point and on the number of constraints k. 
On the other hand, the time spent for the Newton s method to compute each eigenfunction depends only on the starting point, showing that almost the same amount of time is required for convergence in computing each eigenfunction. Moreover, the Newton s method 3 presents a simliar rate of convergence as the Rayleigh quotient iteration. In addition, the absence of the orthogonality constraints doesn t allow the Newton s method to find orthogonal eigenfunctions corresponding to the same eigenvalues of multiplicity greater than. However, their linear independence is guar- 37
