
FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG
INSTITUT FÜR INFORMATIK (MATHEMATISCHE MASCHINEN UND DATENVERARBEITUNG)
Lehrstuhl für Informatik 10 (Systemsimulation)

On a regularization technique for Kovarik-like approximate orthogonalization algorithms

Aurelian Nicola, Constantin Popa, Ulrich Rüde

Lehrstuhlbericht 09-4


On a regularization technique for Kovarik-like approximate orthogonalization algorithms

Aurelian Nicola, Constantin Popa, Ulrich Rüde

Abstract

In this paper we consider four versions of Kovarik's iterative orthogonalization algorithm for approximating the minimal norm solution of symmetric least squares problems. Although the theoretical convergence rate of these algorithms is at least linear, in practical applications we observed that too large a number of iterations can dramatically deteriorate the already obtained approximation. In this respect we analyze the above mentioned Kovarik-like methods according to the modifications they make on the machine zero eigenvalues of the problem's (symmetric) matrix. We establish a theoretical, almost optimal formula for the number of iterations necessary to obtain a sufficiently accurate approximation, as well as to avoid the above mentioned troubles. Experiments on the collocation discretization of a Fredholm first kind integral equation illustrate the efficiency of our considerations.

2000 MS Classification: 65F10, 65F20
Key words and phrases: Kovarik-like algorithms, approximate orthogonalization, regularization, first kind integral equations.

1 Introduction

In the paper [2] Z. Kovarik proposed two iterative algorithms for the approximate orthogonalization of a finite set of linearly independent vectors from a Hilbert space. In [5] and [6] we extended these algorithms to the approximate orthogonalization of rows and columns of arbitrary rectangular matrices, and in [3, 4, 7] we adapted these methods to symmetric matrices and to approximating the minimal norm solution $x_{LS}$ of symmetric least squares problems of the form

$$\|Ax - b\| = \min\{\|Az - b\|,\; z \in \mathbb{R}^n\}, \qquad (1)$$

with $A$ an $n \times n$ symmetric matrix and $\|\cdot\|$ the Euclidean norm. In this respect, the following four algorithms were obtained: let $A_0 = A$, $b_0 = b$; for $k = 0, 1, \dots$ do until convergence

Algorithm KOAS-rhs
$H_k = I - A_k$
$A_{k+1} = (I + 0.5 H_k) A_k$
$b_{k+1} = (I + 0.5 H_k) b_k$

Algorithm KOBS-rhs
$K_k = (I - A_k)(I + A_k)^{-1}$
$A_{k+1} = (I + K_k) A_k$
$b_{k+1} = (I + K_k) b_k$

Aurelian Nicola: Ovidius University of Constanta, Faculty of Mathematics and Computer Science, Romania; e-mail: anicola@univ-ovidius.ro.
Constantin Popa: Ovidius University of Constanta, Faculty of Mathematics and Computer Science, Romania; e-mail: cpopa@univ-ovidius.ro; for this author the paper was supported by a DAAD Grant in the period 01.07-31.08.2009 at the Institute for Computer Science 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany.
Ulrich Rüde: Institute for Computer Science 10 (Systemsimulation), Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany; e-mail: ulrich.ruede@informatik.uni-erlangen.de.

Algorithm MKOBS-rhs
$K_k = (I - A_k)(I - A_k + A_k^2)$
$A_{k+1} = (I + K_k) A_k$
$b_{k+1} = (I + K_k) b_k$

Algorithm IFKOBS-rhs
$K_k = (I - A_k)(I - 0.5 A_k)$
$A_{k+1} = (I + K_k) A_k$
$b_{k+1} = (I + K_k) b_k$

Theorem 1 Let us suppose that $A$ is also positive semidefinite and that its spectral radius $\rho(A)$ satisfies

$$\rho(A) < 1. \qquad (2)$$

Then any of the above algorithms generates sequences with the following properties:
(i) If the problem (1) is consistent, then $\lim_{k \to \infty} b_k = x_{LS}$;
(ii) If the problem (1) is inconsistent, then $\lim_{k \to \infty} A_k b_k = x_{LS}$. Moreover, in this case we have $\lim_{k \to \infty} \|b_k\| = \infty$.

The previous algorithms have linear or superlinear convergence and a mesh independent behavior for problems arising from the collocation discretization of Fredholm first kind integral equations (see [1, 3, 4]). Moreover, the assumption (2) is not restrictive, because it can be enforced e.g. by the scaling

$$A := \frac{A}{1 + \|A\|}. \qquad (3)$$

But, in spite of these good properties, we have observed in practical computations that too large a number of iterations can deteriorate the computed solution dramatically. This divergent behavior can be observed very well in the following example. Let us consider the integral equation $\int_0^1 k(s,t)\, x(t)\, dt = y(s)$, $s \in [0,1]$, discretized by the collocation method from [1] with the collocation points $s_i = i/n$, $i = 1, \dots, n$. Then the associated least squares formulation (1) has the elements of $A$ and $b$ given by $(A)_{ij} = \int_0^1 k(s_i,t)\, k(s_j,t)\, dt$ and $(b)_i = y(s_i)$. We used the kernel $k(s,t) = 1 + s^{0.5+t}$, for which the matrix coefficients $(A)_{ij}$ can be obtained and computed analytically. The $n \times n$ matrix $A$ is symmetric and positive semidefinite, with $\mathrm{rank}(A) = n/2$ ($n$ even) or $(n+1)/2$ ($n$ odd) (see e.g. [3, 4]). If the right hand side $y(s)$ is such that $x_{ex}(t) = 1$, $t \in [0,1]$ is a solution of the initial integral equation, we obtain a consistent least squares formulation of the form (1), which we shall denote by P-cons. Then, by keeping the matrix $A$ unchanged and adding a perturbation to the right hand side $b$ (in Matlab notation)

rand('state', 0); pert = rand(n, 1); b = b + pert;

we get an inconsistent version of (1), denoted by P-pert. We applied the above four algorithms to both problems P-cons and P-pert, using different numbers of iterations and fixed $n = 32$. Below we present the behavior of the residuals ($\log(\|A b_k - b\|)$ for P-cons in figures 1-2, and $\log(\|A(A(A_k b_k) - b)\|)$ for P-pert in figures 3-4) versus the number of iterations. We can observe that in each case there exists a critical value such that, if the number of iterations exceeds it, the norm of the residual begins to increase and the computed approximation deteriorates dramatically (the critical values for KOAS-rhs, KOBS-rhs, MKOBS-rhs, IFKOBS-rhs are 88, 51, 51, 52 for P-cons and 48, 29, 32, 32 for P-pert, respectively).

2 The analysis of the divergent behavior

In order to understand and analyze the unpleasant behavior mentioned in section 1, we first consider the following general description of the above four algorithms.

Figure 1: P-cons; n = 32: KOAS-rhs (left) and KOBS-rhs (right)

Figure 2: P-cons; n = 32: MKOBS-rhs (left) and IFKOBS-rhs (right)

Algorithm K-rhs. Let $A_0 = A$, $b_0 = b$; for $k = 0, 1, \dots$ do until convergence

$$A_{k+1} = f(A_k)\, A_k, \quad b_{k+1} = f(A_k)\, b_k, \qquad (4)$$

where the function $f$ is given by

$$f(x) = \begin{cases} \frac{2}{1+x}, & \text{if } K = KOBS \\ 1 + \frac{1}{2}(1-x), & \text{if } K = KOAS \\ 1 + (1-x)(1-x+x^2), & \text{if } K = MKOBS \\ 1 + (1-x)\left(1-\frac{x}{2}\right), & \text{if } K = IFKOBS \end{cases} \qquad (5)$$

Lemma 1 Let $Q, B$ be two $n \times n$ matrices such that $Q$ is orthogonal (i.e. $Q^T Q = Q Q^T = I$). Then, for any expression of $f$ in (5) we have:
(i) $f(Q B Q^T) = Q f(B) Q^T$;
(ii) $f(\mathrm{diag}(\gamma_1, \dots, \gamma_n)) = \mathrm{diag}(f(\gamma_1), \dots, f(\gamma_n))$.

Proof. (i) If $f$ is a polynomial the proof is obvious. Let us then suppose that $f(x) = \frac{2}{1+x}$. We obtain the equality in (i) by the following sequence of equalities (also using that $Q^{-1} = Q^T$):

$$f(Q B Q^T) = 2\,(I + Q B Q^T)^{-1} = 2\left[Q(I + B)Q^T\right]^{-1} = Q\left[2\,(I + B)^{-1}\right]Q^T = Q f(B) Q^T.$$

For (ii) the proof is also obvious.

Let now $Q^T A_0 Q = D_0 = \mathrm{diag}(\lambda_1^{(0)}, \dots, \lambda_r^{(0)}, 0, \dots, 0)$ be a spectral decomposition of $A$ (with $Q$ orthogonal), where $r = \mathrm{rank}(A)$ and the eigenvalues satisfy $\lambda_i^{(0)} \in (0,1)$, $i = 1, \dots, r$ (according to Theorem 1). Then, for the algorithm K-rhs a recursive argument gives us $A_k = Q D_k Q^T$, where

$$D_k = \mathrm{diag}(\lambda_1^{(k)}, \dots, \lambda_r^{(k)}, 0, \dots, 0), \quad \lambda_i^{(k)} = f(\lambda_i^{(k-1)})\, \lambda_i^{(k-1)}, \quad k \geq 1. \qquad (6)$$
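To fix ideas, the iteration (4)-(5) can be sketched in a few lines of NumPy. The code below is our own illustration, not part of the paper: the names F_MATRIX and k_rhs are ours, and the example matrix is an arbitrary symmetric positive semidefinite one, scaled as in (3) so that (2) holds.

```python
import numpy as np

# Matrix versions of f from (5); the labels are ours, the formulas follow (4)-(5).
F_MATRIX = {
    "KOAS":   lambda A, I: I + 0.5 * (I - A),
    "KOBS":   lambda A, I: 2.0 * np.linalg.inv(I + A),  # equals I + (I-A)(I+A)^{-1}
    "MKOBS":  lambda A, I: I + (I - A) @ (I - A + A @ A),
    "IFKOBS": lambda A, I: I + (I - A) @ (I - 0.5 * A),
}

def k_rhs(A, b, variant="KOBS", iters=20):
    """Algorithm K-rhs (4): A_{k+1} = f(A_k) A_k, b_{k+1} = f(A_k) b_k."""
    I = np.eye(A.shape[0])
    A_k, b_k = A.copy(), b.copy()
    for _ in range(iters):
        f_Ak = F_MATRIX[variant](A_k, I)
        A_k, b_k = f_Ak @ A_k, f_Ak @ b_k
    return A_k, b_k

# Example: a symmetric positive semidefinite A, scaled as in (3) so that rho(A) < 1.
M = np.random.default_rng(0).standard_normal((8, 4))
A = M @ M.T
A = A / (1.0 + np.linalg.norm(A, 2))
b = A @ np.ones(8)                      # consistent right hand side
_, x = k_rhs(A, b, "KOBS", iters=20)    # for consistent problems, b_k -> x_LS
```

Note that in exact arithmetic all four variants only rescale the eigenvalues of $A_k$, exactly as described by (6); the rounding effects discussed next are invisible in this idealized picture.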

Figure 3: P-pert; n = 32: KOAS-rhs (left) and KOBS-rhs (right)

Figure 4: P-pert; n = 32: MKOBS-rhs (left) and IFKOBS-rhs (right)

The convergence analysis from Theorem 1 is based on the analysis of the sequence implicitly generated by (6),

$$x_{k+1} = f(x_k)\, x_k, \quad x_0 \in \{\lambda_1^{(0)}, \dots, \lambda_r^{(0)}\} \subset (0,1). \qquad (7)$$

Additionally, it is shown that $x_k$ is strictly increasing to 1. Then, from (6) we get (from a theoretical point of view)

$$\lim_{k \to \infty} A_k = Q\, \mathrm{diag}\left(\lim_{k \to \infty} \lambda_1^{(k)}, \dots, \lim_{k \to \infty} \lambda_r^{(k)}, 0, \dots, 0\right) Q^T = Q\, \mathrm{diag}(1, \dots, 1, 0, \dots, 0)\, Q^T = A^+ A = P_{R(A)} = I - P_{N(A)}, \qquad (8)$$

where $A^+$ is the Moore-Penrose pseudoinverse of $A$ and $P_{R(A)}$, $P_{N(A)}$ are the orthogonal projections onto the corresponding subspaces of $A$. Unfortunately, from a practical viewpoint, the 0's on positions $r+1, \dots, n$ in the initial diagonal matrix $D_0 = \mathrm{diag}(\lambda_1^{(0)}, \dots, \lambda_r^{(0)}, 0, \dots, 0)$ are only $fl(0)$ = machine-zero numbers (e.g. in double precision $fl(0) \approx 10^{-17}$). Then, for $\hat{x}_0 = fl(0)$, the iterative process (see (7)) $\hat{x}_{k+1} = f(\hat{x}_k)\, \hat{x}_k$, $k \geq 0$, generates larger and larger values. Thus, in practice, we start algorithm K-rhs not with $A$, but with the matrix $\hat{A}_0 = Q \hat{D}_0 Q^T$, with $\hat{D}_0 = \mathrm{diag}(\lambda_1^{(0)}, \dots, \lambda_r^{(0)}, fl(0), \dots, fl(0))$. Then, after $k$ iterations we get, instead of $D_k$ from (6), the matrix

$$\hat{D}_k = \mathrm{diag}(\lambda_1^{(k)}, \dots, \lambda_r^{(k)}, \alpha_{r+1}(k), \dots, \alpha_n(k)), \qquad (9)$$

where the values $\alpha_j(k) > 0$, $j = r+1, \dots, n$ can become large enough to destroy the computed approximations.
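This mechanism is easy to observe on the scalar iteration (7) itself: a genuine eigenvalue $x_0 \in (0,1)$ increases to 1 as the theory predicts, while a machine zero $\hat{x}_0 = 10^{-17}$ is amplified by a factor $f(\hat{x}_k) \approx 2$ per step (for KOBS) until it too approaches 1. A minimal Python sketch of this experiment (ours, not the paper's):

```python
# Scalar version of f from (5), here for the KOBS variant: f(x) = 2/(1+x).
def f_kobs(x):
    return 2.0 / (1.0 + x)

for x0, label in [(0.3, "true eigenvalue"), (1e-17, "machine zero fl(0)")]:
    x = x0
    print(label)
    for k in range(1, 61):
        x = f_kobs(x) * x                      # the iteration (7)
        if k in (10, 30, 48, 60):
            print(f"  k = {k:2d}: x_k = {x:.3e}")
```

Since $2^{47} \cdot 10^{-17} \approx 1.4 \cdot 10^{-3}$, the spurious values pass the level $10^{-3}$ after roughly 47-48 iterations, which is consistent with the almost optimal iteration numbers derived in the next section.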

3 The regularization procedure

In order to overcome the difficulties mentioned in the above section, we shall formulate and try to solve the two problems below.

Problem P1. Find a threshold $\alpha^*$ such that, if

$$\max\{\alpha_j(k),\; j = r+1, \dots, n\} \geq \alpha^*, \qquad (10)$$

then the matrix $\hat{D}_k$ from (9) starts to affect the computations.

Problem P2. Suppose we know the above value $\alpha^*$; then find an integer $k^+(\alpha^*)$ (as small as possible!) such that

$$x_{k^+(\alpha^*)} \geq \alpha^*, \qquad (11)$$

where $x_{k^+(\alpha^*)}$ is the corresponding term of the sequence from (7) generated with the initial approximation

$$x_0 = 10^{-17}. \qquad (12)$$

We shall first try to solve problem P2.

Theorem 2 A number with the property (11)-(12) is given by

$$k^+(\alpha^*) = 1 + \left\lfloor \frac{\ln\left(\frac{(\alpha^* - x_0)\, H(\alpha^*)}{y_1} + 1\right)}{\ln(1 + H(\alpha^*))} \right\rfloor, \quad \text{with } y_1 = (f(x_0) - 1)\, x_0, \qquad (13)$$

and the values of $H(\alpha^*) > 0$ are as follows:

$$H(\alpha^*) = \begin{cases} \frac{1 - 2\alpha^* - (\alpha^*)^2}{(1+\alpha^*)^2}, & \text{if } K = KOBS \\ 0.5 - \alpha^*, & \text{if } K = KOAS \\ 1 - 4\alpha^*, & \text{if } K = MKOBS \\ 1 - 3\alpha^*, & \text{if } K = IFKOBS \end{cases} \quad \text{for } 0 < \alpha^* < \frac{1}{4}. \qquad (14)$$

Proof. For the sequence $(x_k)_{k \geq 0}$ defined by (7) with the initial approximation (12), let $k \geq 1$ be arbitrary, but fixed. We define $y_j = x_j - x_{j-1}$, $j = 1, \dots, k$, and obtain

$$x_k = x_0 + y_1 + \dots + y_k. \qquad (15)$$

On the other hand, from (7) we get $y_j = x_j - x_{j-1} = (f(x_{j-1}) - 1)\, x_{j-1}$, thus

$$y_j - y_{j-1} = (f(x_{j-1}) - 1)\, x_{j-1} - (f(x_{j-2}) - 1)\, x_{j-2} = y_{j-1}\, g(x_{j-2}, x_{j-1}), \quad j = 2, \dots, k, \qquad (16)$$

with $g(x_{j-2}, x_{j-1})$ given by

$$g(x_{j-2}, x_{j-1}) = \begin{cases} \frac{1 - x_{j-1} - x_{j-2} - x_{j-1} x_{j-2}}{(1 + x_{j-1})(1 + x_{j-2})}, & \text{if } K = KOBS \\ \frac{1}{2}\,(1 - x_{j-1} - x_{j-2}), & \text{if } K = KOAS \\ 1 - 2(x_{j-1} + x_{j-2}) + 2(x_{j-1}^2 + x_{j-1} x_{j-2} + x_{j-2}^2) - (x_{j-1} + x_{j-2})(x_{j-1}^2 + x_{j-2}^2), & \text{if } K = MKOBS \\ 1 - \frac{3}{2}(x_{j-1} + x_{j-2}) + \frac{1}{2}(x_{j-1}^2 + x_{j-2}^2 + x_{j-1} x_{j-2}), & \text{if } K = IFKOBS \end{cases} \qquad (17)$$

According to (11) we shall suppose that $0 < x_0 < x_1 < \dots < x_{k-1} < \alpha^* < \frac{1}{4}$. Then it can be easily proved that

$$g(x_{j-2}, x_{j-1}) > H(\alpha^*) > 0, \quad j = 2, \dots, k, \qquad (18)$$

with $H(\alpha^*)$ from (14), respectively. From (15)-(18) we then obtain

$$x_k > x_0 + y_1 + (1 + H(\alpha^*))\, y_1 + \dots + (1 + H(\alpha^*))^{k-1}\, y_1 = x_0 + y_1\, \frac{(1 + H(\alpha^*))^k - 1}{H(\alpha^*)}. \qquad (19)$$

We then define $k^+(\alpha^*)$ as the smallest integer $k$ such that $\alpha^* \leq x_0 + y_1 \frac{(1 + H(\alpha^*))^k - 1}{H(\alpha^*)}$, and we successively get the values from (13)-(14); the proof is complete.

Now, we come back to problem P1. Unfortunately, in this case we no longer have a clear and complete solution as for problem P2 (see the next section).
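Evaluating (13)-(14) numerically is immediate; the sketch below (our own code, with hypothetical helper names) computes $k^+(\alpha^*)$ for all four variants with $x_0 = 10^{-17}$. For the thresholds used in section 4 it reproduces the entries of Table 1 up to at most one iteration, a discrepancy plausibly due to rounding conventions.

```python
import math

F0 = {  # scalar f from (5), evaluated at x0
    "KOAS":   lambda x: 1.0 + 0.5 * (1.0 - x),
    "KOBS":   lambda x: 2.0 / (1.0 + x),
    "MKOBS":  lambda x: 1.0 + (1.0 - x) * (1.0 - x + x * x),
    "IFKOBS": lambda x: 1.0 + (1.0 - x) * (1.0 - 0.5 * x),
}

H = {  # H(alpha*) from (14), valid for 0 < alpha* < 1/4
    "KOAS":   lambda a: 0.5 - a,
    "KOBS":   lambda a: (1.0 - 2.0 * a - a * a) / (1.0 + a) ** 2,
    "MKOBS":  lambda a: 1.0 - 4.0 * a,
    "IFKOBS": lambda a: 1.0 - 3.0 * a,
}

def k_plus(alpha, variant, x0=1e-17):
    """Almost optimal iteration count k+(alpha*) from (13)."""
    y1 = (F0[variant](x0) - 1.0) * x0
    h = H[variant](alpha)
    return 1 + math.floor(math.log((alpha - x0) * h / y1 + 1.0) / math.log(1.0 + h))

for v in ("KOAS", "KOBS", "MKOBS", "IFKOBS"):
    print(v, k_plus(1e-3, v), k_plus(1e-10, v))  # P-cons and P-pert thresholds
```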

4 Numerical experiments

The solution to problem P1 essentially depends on the class of problems we solve. For example, in the case of discretizations of Fredholm first kind integral equations, we considered a smaller dimension (e.g. $n = 32$ for our problems P-cons and P-pert) and made several tests, which in the end gave us an appropriate choice for the threshold $\alpha^*$, namely: $\alpha^* = 10^{-3}$ for P-cons and $\alpha^* = 10^{-10}$ for P-pert. For these values we computed the corresponding integers $k^+(\alpha^*)$ defined by (13) and we got the values from Table 1.

Table 1: Values of $k^+(\alpha^*)$

Problem ($\alpha^*$)   | KOAS | KOBS | MKOBS | IFKOBS
P-cons ($10^{-3}$)     |  80  |  48  |  47   |  47
P-pert ($10^{-10}$)    |  40  |  25  |  24   |  24

Then, we performed experiments on both problems, with $n = 32, 64, 128, 256, 512$, and we obtained the results in figures 5-8. We can see there that the values of $k^+(\alpha^*)$ determined in the particular case $n = 32$ are good also for bigger dimensions, in the sense that before $k^+(\alpha^*)$ iterations no instability appears in the computed solution.

REMARK. The big difference between $\alpha^* = 10^{-3}$ for P-cons and $\alpha^* = 10^{-10}$ for P-pert can be explained by using the information given in Theorem 1 (ii). According to this, in the inconsistent case we have $\lim_{k \to \infty} \|b_k\| = +\infty$. Moreover, e.g. in the case of the IFKOBS algorithm we get $b_k = A_k x + 2^k P_{N(A)}(b)$, where $x \in \mathbb{R}^n$ is a vector with the property $Ax = P_{R(A)}(b)$; then $\lim_{k \to \infty}(A_k x) = (\lim_{k \to \infty} A_k)\, x = (I - P_{N(A)})\, x$. Thus, $\|b_k\|$ goes to $+\infty$ as $2^k \|P_{N(A)}(b)\|$ does and, if we let the algorithm run, e.g. until $x_k$ exceeds the value $\alpha^* = 10^{-3}$, the value $2^k \|P_{N(A)}(b)\|$ will become too large and will pollute the vector $b_k$, and so the approximation $A_k b_k$ of $x_{LS}$.

References

[1] H. W. Engl, Regularization methods for the stable solution of inverse problems, Surv. Math. Ind., 3 (1993), pp. 71-143.
[2] Z. Kovarik, Some iterative methods for improving orthogonality, SIAM J. Numer. Anal., 7(3) (1970), pp. 386-389.
[3] M. Mohr, C. Popa, and U. Rüde, An iterative algorithm for approximate orthogonalization of symmetric matrices, Intern. J. Computer Math., 81(2) (2004), pp. 215-226.
[4] M. Mohr and C. Popa, Numerical solution of symmetric least-squares problems with an inversion-free Kovarik type-algorithm, Intern. J. Computer Math., 85(2) (2008), pp. 271-286.
[5] C. Popa, Extension of an approximate orthogonalization algorithm to arbitrary rectangular matrices, Linear Alg. Appl., 331 (2001), pp. 181-192.
[6] C. Popa, A method for improving orthogonality of rows and columns of matrices, Intern. J. Computer Math., 77(3) (2001), pp. 469-480.
[7] C. Popa, On a modified Kovarik algorithm for symmetric matrices, Annals of Ovidius University Constanta, Series Mathematics, XI(1) (2003), pp. 147-156.

Figure 5: P-cons; n = 32, 64, 128, 256, 512: KOAS-rhs (left) and KOBS-rhs (right)

Figure 6: P-cons; n = 32, 64, 128, 256, 512: MKOBS-rhs (left) and IFKOBS-rhs (right)

Figure 7: P-pert; n = 32, 64, 128, 256, 512: KOAS-rhs (left) and KOBS-rhs (right)

Figure 8: P-pert; n = 32, 64, 128, 256, 512: MKOBS-rhs (left) and IFKOBS-rhs (right)