OPTIMAL SCALING FOR P-NORMS AND COMPONENTWISE DISTANCE TO SINGULARITY

Published in IMA Journal of Numerical Analysis (IMAJNA), Vol. 23, 1-9, 2003.

OPTIMAL SCALING FOR P-NORMS AND COMPONENTWISE DISTANCE TO SINGULARITY

SIEGFRIED M. RUMP*

Abstract. In this note we give lower and upper bounds for the optimal p-norm condition number achievable by two-sided diagonal scalings. There are no assumptions on the irreducibility of certain matrices. The bounds are shown to be optimal for the 2-norm. For the 1-norm and ∞-norm the (known) exact value of the optimal condition number is confirmed. We give means of calculating the minimizing diagonal matrices. Furthermore, a class of new lower bounds for the componentwise distance to the nearest singular matrix is given. They are shown to be superior to existing ones.

1. Introduction and notation. Throughout the paper let A be an n×n nonsingular real matrix. The condition number of A is defined by

    κ_p(A) := lim_{ε→0} sup_{‖ΔA‖_p ≤ ε‖A‖_p} ‖(A + ΔA)^{-1} − A^{-1}‖_p / (ε ‖A^{-1}‖_p),

where ‖·‖_p denotes the Hölder p-norm, 1 ≤ p ≤ ∞. It is well known (cf. [4, Theorems 6.4 and 7.2]) that

    κ_p(A) = lim_{ε→0} sup { ‖Δx‖_p / (ε ‖x‖_p) : Ax = b = (A + ΔA)(x + Δx), ‖ΔA‖_p ≤ ε‖A‖_p } = ‖A^{-1}‖_p ‖A‖_p.

The question arises what is the minimum condition number achievable by two-sided diagonal scaling, i.e. the value of

(1)    µ_p := inf_{D_1,D_2 ∈ D_n} κ_p(D_1 A D_2),

where D_n ⊆ M_n(ℝ) denotes the set of n×n diagonal matrices with positive diagonal elements. Various results are known (cf. [4, Section 7]). Especially for the ∞-norm, Bauer [2] shows (cf. [4, Theorem 7.8, Exercise 7.9])¹

(2)    µ_∞ = ϱ(|A^{-1}| |A|) if |A^{-1}| |A| and |A| |A^{-1}| are irreducible,

ϱ denoting the spectral radius. In the following we derive general two-sided bounds for µ_p including (2) without irreducibility condition. We will prove

(3)    µ_p ≤ ϱ(|A^{-1}| |A|) ≤ n^{2 min(1/p, 1−1/p)} µ_p.

The bounds differ at most by a factor n (for the 2-norm). For p ∈ {1, ∞} the exact value of µ_p is ϱ(|A^{-1}| |A|) for general A, and we show that the bounds in (3) are sharp for the 2-norm. Furthermore, the minimizing diagonal matrices can be calculated explicitly. Finally, the results are used to derive new lower bounds for the componentwise distance to the nearest singular matrix. The bounds are shown to be superior to existing ones. Frequently, the derived methods allow to calculate the exact value of the componentwise distance to the nearest singular matrix. This seems remarkable because Poljak and Rohn [7] showed that the computation of this number is NP-hard. Computational results for matrices up to dimension n = 500 are presented.

Throughout the paper, absolute value and comparison of matrices are to be understood entrywise. For example, |Ẽ| ≤ E is equivalent to |Ẽ_ij| ≤ E_ij for all i, j.

* Inst. f. Informatik III, Technical University Hamburg-Harburg, Schwarzenbergstr. 95, 21071 Hamburg, Germany.
¹ The starting point of this paper was the question by N. Higham whether the irreducibility assumptions in [4, Theorem 7.8] can be omitted.
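A minimal NumPy sketch of the quantities in (3) for p = 2, on a random test matrix (the helper names kappa_2 and rho are ad hoc, not from the paper):

```python
import numpy as np

def kappa_2(A):
    # kappa_2(A) = ||A||_2 * ||A^-1||_2 = sigma_max / sigma_min
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[-1]

def rho(M):
    # spectral radius
    return max(abs(np.linalg.eigvals(M)))

rng = np.random.default_rng(0)
n = 8
A = rng.uniform(-1.0, 1.0, (n, n))
r = rho(np.abs(np.linalg.inv(A)) @ np.abs(A))

# (3) for p = 2 reads  mu_2 <= rho(|A^-1||A|) <= n * mu_2,  so the (unknown)
# optimal scaled condition number mu_2 lies in [r/n, r]:
print(f"kappa_2(A)     = {kappa_2(A):.3e}")  # unscaled condition number >= mu_2
print(f"rho(|A^-1||A|) = {r:.3e}")           # upper bound for mu_2 by (3)
print(f"r / n          = {r/n:.3e}")         # lower bound for mu_2 by (3)
```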

2. A result from matrix theory. In this section we will prove the following theorem.

Theorem 2.1. Let A_1, ..., A_k ∈ M_n(ℝ) be given and denote by m, 0 ≤ m ≤ k, the number of nonnegative matrices among the A_ν. For fixed p, 1 ≤ p ≤ ∞, define

(4)    µ := inf_{D_1,...,D_k ∈ D_n} ‖D_1^{-1} A_1 D_2‖_p ‖D_2^{-1} A_2 D_3‖_p ⋯ ‖D_k^{-1} A_k D_1‖_p.

Then for α := n^{min(1/p, 1−1/p)} we have

(5)    µ ≤ ϱ(|A_1| |A_2| ⋯ |A_k|) ≤ α^{k−m} µ.

The inequalities are equalities if all A_ν are nonnegative. The left bound is sharp for all p and all n. For p = 2, the right bound is sharp at least for the infinitely many values of n for which an n×n Hadamard matrix exists.

Before we prove Theorem 2.1 we need some auxiliary results. First we generalize a result by Albrecht [1] to arbitrary nonnegative matrices.

Lemma 2.2. Let nonnegative M ∈ M_n(ℝ) be given. Then for all p, 1 ≤ p ≤ ∞,

    inf_{D ∈ D_n} ‖D^{-1} M D‖_p = ϱ(M).

Proof. Albrecht [1] proved this result for irreducible M. Furthermore, he proved the infimum to be a minimum in that case and gave explicit formulas for the minimizing D depending on the left and right Perron vectors of M. For general M, we may assume without loss of generality M to be in its irreducible normal form

(6)    M = [ M_11 M_12 ... M_1k ; 0 M_22 ... M_2k ; ... ; 0 0 ... M_kk ],

block upper triangular, where each M_νν, 1 ≤ ν ≤ k, is irreducible or a 1×1 zero matrix. This is because (6) is achieved by a permutational similarity transformation P^T M P, and ‖D^{-1} P^T M P D‖_p = ‖(P D P^T)^{-1} M (P D P^T)‖_p with diagonal P D P^T. By Albrecht's theorem there are D_ν ∈ D of appropriate size such that

    ‖D_ν^{-1} M_νν D_ν‖_p = ϱ(M_νν).

Set D = diag(D_1, εD_2, ..., ε^{k−1} D_k) ∈ D_n. Then D^{-1} M D is block upper triangular with diagonal blocks D_ν^{-1} M_νν D_ν and with O(ε) blocks above the block diagonal. Therefore,

(7)    ‖D^{-1} M D‖_p = max_ν ‖D_ν^{-1} M_νν D_ν‖_p + O(ε) = max_ν ϱ(M_νν) + O(ε) = ϱ(M) + O(ε).

Using Albrecht's result, a D satisfying (7) can be constructed explicitly for every ε > 0.

For the 2-norm and irreducible nonnegative M this is particularly easy. Denote r := ϱ(M) > 0, Mx = rx with x > 0, M^T y = ry with y > 0, and define D such that D^{-1} x = D y, i.e. D = diag((x_i/y_i)^{1/2}). Then for N := D^{-1} M D,

    N^T N Dy = N^T D^{-1} M D D^{-1} x = r N^T D^{-1} x = r D M^T D^{-1} D y = r² Dy,

showing Dy > 0 to be the Perron vector of N^T N, and therefore ‖D^{-1} M D‖_2 = r.
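The explicit 2-norm scaling at the end of this proof is easy to check numerically. The following sketch assumes M entrywise positive (hence irreducible) and forms D = diag((x_i/y_i)^{1/2}) from the right and left Perron vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
M = rng.uniform(0.1, 1.0, (n, n))   # positive, hence irreducible

def perron_vector(B):
    # For nonnegative B the spectral radius is an eigenvalue (Perron-Frobenius);
    # take the eigenvector of the eigenvalue with largest real part.
    w, V = np.linalg.eig(B)
    v = V[:, np.argmax(w.real)].real
    return np.abs(v)                # entrywise positive for positive B

x = perron_vector(M)                # M x   = r x
y = perron_vector(M.T)              # M^T y = r y
D = np.diag(np.sqrt(x / y))         # then D^{-1} x = D y entrywise
N = np.linalg.inv(D) @ M @ D

r = max(abs(np.linalg.eigvals(M)))
print(f"rho(M)         = {r:.10f}")
print(f"||D^-1 M D||_2 = {np.linalg.svd(N, compute_uv=False)[0]:.10f}")  # equal
```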

For the proof of Theorem 2.1 and general M we also need

(8)    ‖M‖_p ≤ ‖ |M| ‖_p ≤ n^{min(1/p, 1−1/p)} ‖M‖_p,

which is true for all M ∈ M_n(ℝ) and all 1 ≤ p ≤ ∞ (cf. [4, Exercise 6.15]).

Proof of Theorem 2.1. First we transform the definition of µ into the minimization of the p-norm of a certain matrix. Define the block cyclic matrix

(9)    A = [ 0 A_1 0 ... 0 ; 0 0 A_2 ... 0 ; ... ; 0 0 ... 0 A_{k−1} ; A_k 0 ... 0 0 ] ∈ M_{kn}(ℝ).

Then ‖A‖_p = max_{1≤ν≤k} ‖A_ν‖_p because the p-norm is invariant under row and column permutations. For D := diag(D_1, ..., D_k) ∈ D_kn it follows that

(10)    ‖D^{-1} A D‖_p = max_{1≤ν≤k} ‖D_ν^{-1} A_ν D_{ν+1}‖_p,

with the index k+1 interpreted as 1. This and the infimum in (4) imply

(11)    µ^{1/k} ≤ inf_{D ∈ D_kn} ‖D^{-1} A D‖_p.

Now we can prove the inequalities in (5). Suppose, according to Lemma 2.2, D ∈ D_kn is given such that ϱ(|A|) + ε = ‖D^{-1} |A| D‖_p. Then by (11) and (10),

    µ^{1/k} ≤ ‖D^{-1} A D‖_p ≤ ‖D^{-1} |A| D‖_p = ϱ(|A|) + ε.

But the cyclicity of A defined by (9) implies ϱ(|A|)^k = ϱ(|A_1| ⋯ |A_k|), and this proves the left inequality in (5). For the right inequality suppose D_1, ..., D_k ∈ D_n are given such that

    µ + ε = ‖D_1^{-1} A_1 D_2‖_p ⋯ ‖D_k^{-1} A_k D_1‖_p.

Then by (8),

    µ + ε ≥ α^{−(k−m)} ‖D_1^{-1} |A_1| D_2‖_p ⋯ ‖D_k^{-1} |A_k| D_1‖_p
          ≥ α^{−(k−m)} ‖D_1^{-1} |A_1| |A_2| ⋯ |A_k| D_1‖_p
          ≥ α^{−(k−m)} ϱ(D_1^{-1} |A_1| ⋯ |A_k| D_1) = α^{−(k−m)} ϱ(|A_1| ⋯ |A_k|).

The left inequality in (5) is obviously sharp for all A_ν equal to the identity matrix I. Let p = 2 and let H be an n×n Hadamard matrix, i.e. H_ij = ±1 for all i, j and H^T H = nI. Define A_ν := n^{-1/2} H for 1 ≤ ν ≤ k. Then the A_ν are orthogonal, and obviously µ = 1. On the other hand, ϱ(|A_1| ⋯ |A_k|) = n^{k/2} = α^k, such that the right inequality in (5) is sharp for p = 2 and infinitely many values of n. The theorem is proved.

Note that there is in fact equality in (11), although this is, as remarked by a referee, not necessary for the proof. This is because in optimality all norms ‖D_ν^{-1} A_ν D_{ν+1}‖_p must be equal; to see this it suffices to rescale the D_ν by suitable multiples α_ν I of the identity matrix.

Note that for A_1 = ... = A_k = Q with some orthogonal matrix Q, e.g. orth(rand(n)) in Matlab [5] notation, frequently |Q_ij| is of the order n^{-1/2} for all i, j; then ϱ(|Q|) is of the order n^{1/2}, so that ϱ(|Q| ⋯ |Q|) is of the order n^{k/2} = α^k while µ = 1. This means that for general orthogonal Q = A_1 = ... = A_k the right bound in (5) is almost sharp.
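The Hadamard sharpness example can be reproduced directly. A short sketch, assuming Sylvester's construction (which yields Hadamard matrices only for n a power of two): with A_ν = n^{-1/2} H the product of 2-norms equals 1 for D_ν = I, while ϱ(|A_1| ⋯ |A_k|) = n^{k/2} = α^k:

```python
import numpy as np

def sylvester_hadamard(m):
    # 2^m x 2^m Hadamard matrix: entries +-1, H^T H = n I
    H = np.array([[1.0]])
    for _ in range(m):
        H = np.block([[H, H], [H, -H]])
    return H

n, k = 8, 3
A = sylvester_hadamard(3) / np.sqrt(n)        # orthogonal: A^T A = I

norm2 = lambda M: np.linalg.svd(M, compute_uv=False)[0]
print(norm2(A) ** k)                          # product of norms for D_nu = I: 1.0

P = np.abs(A)
for _ in range(k - 1):
    P = P @ np.abs(A)                         # |A_1| ... |A_k|
print(max(abs(np.linalg.eigvals(P))))         # n^{k/2} = 8^{3/2} ~ 22.627
print(n ** (k / 2))                           # alpha^k for p = 2, m = 0
```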

3. Optimal two-sided scaling. With these preparations we can state our two-sided bounds for the optimal p-norm condition number with respect to two-sided diagonal scaling.

Theorem 3.1. Let nonsingular A ∈ M_n(ℝ) be given and let 1 ≤ p ≤ ∞. Define

    µ_p := inf_{D_1,D_2 ∈ D_n} κ_p(D_1 A D_2).

Then

    µ_p ≤ ϱ(|A^{-1}| |A|) ≤ α_p² µ_p    with α_p := n^{min(1/p, 1−1/p)}.

For p ∈ {1, ∞} both inequalities are equalities. The left bound is sharp for all p. For p = 2 the right inequality is sharp at least for the infinitely many values of n for which an n×n Hadamard matrix exists.

The proof is an immediate consequence of Theorem 2.1. Note that by the proof of Lemma 2.2 one may find D_1, D_2 explicitly such that κ_p(D_1 A D_2) approximates ϱ(|A^{-1}| |A|), a value not too far from the optimum. For the computation of the true value of the minimum 2-norm condition number we mention the following. By (10) and (11) we may transform the problem into

(12)    inf_{D_1,D_2 ∈ D_n} κ_2(D_1 A D_2) = inf_{D ∈ D_2n} ‖ D^{-1} [0 A; A^{-1} 0] D ‖_2².

In 1990, Sezginer and Overton [9] gave an ingenious proof of the fact that for fixed M ∈ M_n(ℝ), the function ‖e^{-D} M e^{D}‖_2 is convex in the diagonal entries D_νν. This offers reasonable ways to compute µ_2 by means of convex optimization (see also [10]).

4. Componentwise distance to singularity. In [3] the componentwise distance to the nearest singular matrix ω(A, E) was investigated. It is defined by

(13)    ω(A, E) := min{ α ≥ 0 : |Ẽ| ≤ αE, det(A + Ẽ) = 0 }.

As before we will assume A to be nonsingular. A well known lower bound is

(14)    ϱ(|A^{-1}| E)^{-1} ≤ ω(A, E).

This follows easily: det(A + Ẽ) = 0 = det(I + A^{-1} Ẽ) with |Ẽ| ≤ αE and α = ω(A, E) implies

    1 ≤ ϱ(A^{-1} Ẽ) ≤ ϱ(|A^{-1} Ẽ|) ≤ ϱ(|A^{-1}| |Ẽ|) ≤ α ϱ(|A^{-1}| E),

the latter deductions using classical Perron-Frobenius theory. In interval analysis this is a well known tool to show nonsingularity of an interval matrix A := [A − E, A + E] = {Ã : A − E ≤ Ã ≤ A + E}. An interval matrix is called nonsingular if all Ã ∈ A share this property. Now ϱ(|A^{-1}| E) < 1 implies A to be nonsingular, and therefore an interval matrix with this property is called strongly regular [6]. The lower bound (14) was for a long time the only known (simple) criterion for regularity of an interval matrix. In [8, Corollary 1.1], we proved

(15)    ( ‖A^{-1}‖_2 ‖E‖_2 )^{-1} ≤ ω(A, E).

It was shown that for both criteria (14) and (15) there are examples where the one is satisfied and the other is not. Our analysis yields new lower bounds for ω(A, E) and hence new sufficient criteria for nonsingularity of an interval matrix. The new lower bounds will be shown to be superior to the lower bound ϱ(|A^{-1}| E)^{-1}.
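Both classical criteria are cheap to evaluate. A small sketch on a random matrix with E = |A| (a hedged illustration; neither bound dominates the other in general):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10
A = rng.uniform(-1.0, 1.0, (n, n))
E = np.abs(A)

# (14): 1 / rho(|A^-1| E)
lb14 = 1.0 / max(abs(np.linalg.eigvals(np.abs(np.linalg.inv(A)) @ E)))

# (15): 1 / (||A^-1||_2 ||E||_2) = sigma_min(A) / sigma_max(E)
lb15 = np.linalg.svd(A, compute_uv=False)[-1] / np.linalg.svd(E, compute_uv=False)[0]

print(f"(14): {lb14:.4e}   (15): {lb15:.4e}")  # both are lower bounds for omega(A, E)
```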

Note that a lower bound r ≤ ω(A, E) implies all matrices Ã with |Ã − A| < rE to be nonsingular. Also note that the computation of ω(A, E) is known to be NP-hard [7].

The reciprocal of ‖A^{-1}‖_p ‖E‖_p is always a lower bound for ω(A, E). This is seen as before: det(A + Ẽ) = 0 = det(I + A^{-1} Ẽ) with |Ẽ| ≤ αE and α = ω(A, E) implies

    1 ≤ ϱ(A^{-1} Ẽ) ≤ ‖A^{-1} Ẽ‖_p ≤ ‖A^{-1}‖_p ‖Ẽ‖_p ≤ α ‖A^{-1}‖_p ‖E‖_p

with the aid of (8). But ω(A, E) is invariant under left and right diagonal scaling, so for all D_1, D_2 ∈ D_n:

    ( ‖D_2^{-1} A^{-1} D_1^{-1}‖_p ‖D_1 E D_2‖_p )^{-1} ≤ ω(A, E).

The infimum of the left-hand side can be bounded by means of Theorem 2.1, and we obtain the following.

Theorem 4.1. Let nonsingular A ∈ M_n(ℝ) and nonnegative E ∈ M_n(ℝ) be given. Define for 1 ≤ p ≤ ∞

(16)    ν_p := inf_{D_1,D_2 ∈ D_n} ‖D_2^{-1} A^{-1} D_1^{-1}‖_p ‖D_1 E D_2‖_p.

Then

(17)    ν_p^{-1} ≤ ω(A, E) for all 1 ≤ p ≤ ∞.

Moreover, for all 1 ≤ p ≤ ∞,

    ν_p ≤ ϱ(|A^{-1}| E) ≤ α_p ν_p,    where α_p := n^{min(1/p, 1−1/p)}.

That means the new lower bound (17) is the same as (14) for p ∈ {1, ∞}, and at least as good as (14) for all other values of p. Moreover, for p = 2 the bound (17) may be better by up to a factor n^{1/2}. This factor is achieved, for example, for all Hadamard matrices H by setting A = n^{-1/2} H, E = |A|. In this case ϱ(|A^{-1}| E) = n, and

    n^{1/2} = α_2^{-1} ϱ(|A^{-1}| E) ≤ ν_2 ≤ ‖A^{-1}‖_2 ‖E‖_2 = n^{1/2}

implies ν_2 = n^{1/2} = n^{-1/2} ϱ(|A^{-1}| E). Is ν_2 always a global minimum of the ν_p?

Finally, the computation of ν_2 also yields a method to calculate an upper bound for ω(A, E). This is done by calculating some Ẽ with det(A + Ẽ) = 0; then |Ẽ| ≤ βE implies ω(A, E) ≤ β. By (10) and (11) we have

(18)    ν_2^{1/2} = inf_{D ∈ D_2n} ‖ D^{-1} [0 E; A^{-1} 0] D ‖_2.

Equality is seen as in the remark after the proof of Theorem 2.1: for D_1 := αI, D_2 := I and block diagonal D := diag(D_1, D_2) we observe

    ‖ D^{-1} [0 E; A^{-1} 0] D ‖_2 = max( α ‖A^{-1}‖_2, α^{-1} ‖E‖_2 ).

This proves ‖A^{-1}‖_2 = ‖E‖_2 in optimality.
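Numerically, ν_2 can be approximated via (18) by minimizing the convex map d ↦ ‖e^{-D} [0 E; A^{-1} 0] e^{D}‖_2, D = diag(d) (cf. [9], [10]). The sketch below uses a generic solver; solver choice and tolerances are assumptions, not the paper's implementation. By the scaling invariance displayed before Theorem 4.1, even a rough minimizer yields a valid lower bound for ω(A, E):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 6
A = rng.uniform(-1.0, 1.0, (n, n))
E = np.abs(A)
M = np.block([[np.zeros((n, n)), E],
              [np.linalg.inv(A), np.zeros((n, n))]])   # matrix in (18)

def f(d):
    # largest singular value of e^{-D} M e^{D}; convex in d by [9]
    return np.linalg.svd(np.exp(-d)[:, None] * M * np.exp(d)[None, :],
                         compute_uv=False)[0]

res = minimize(f, np.zeros(2 * n), method="Nelder-Mead",
               options={"maxiter": 40000, "xatol": 1e-10, "fatol": 1e-12})

nu2 = res.fun ** 2                  # upper approximation of nu_2
print(f"nu_2 ~ {nu2:.6f}")
print(f"(17): omega(A, E) >= {1.0 / nu2:.6f}")  # valid lower bound
rho = max(abs(np.linalg.eigvals(np.abs(np.linalg.inv(A)) @ E)))
print(f"(14): omega(A, E) >= {1.0 / rho:.6f}")  # lower bound (14) for comparison
```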

For the moment assume the infimum in (18) is a minimum and A and E are scaled such that ν_2^{1/2} = ‖[0 E; A^{-1} 0]‖_2. By (10) and the infimum process in the original definition (16) this implies

    ‖A^{-1}‖_2 = ‖E‖_2 = ν_2^{1/2} =: r.

Denote singular vectors of A^{-1} and E corresponding to r by ‖u‖_2 = ‖v‖_2 = ‖x‖_2 = ‖y‖_2 = 1 with Ev = ru and A^{-1} y = rx. Then

    [0 E; A^{-1} 0] (y; v) = r (u; x),

such that (y; v) and (u; x) span a 2-dimensional right and left singular vector space of [0 E; A^{-1} 0] corresponding to r. Suppose the multiplicity of the largest singular value r of [0 E; A^{-1} 0] is equal to 2. Then by [10] the minimization property of ν_2 implies that the i-th rows of the matrices of right and left singular vectors to r have the same 2-norm for each i. This means |y| = |u| and |v| = |x|, or, for some signature matrices S_1, S_2 with |S_1| = |S_2| = I, we have y = S_1 u and v = S_2 x. Then

    A^{-1} S_1 E S_2 x = A^{-1} S_1 E v = r A^{-1} S_1 u = r A^{-1} y = r² x,

and therefore

    det(r² I − A^{-1} S_1 E S_2) = 0 = det(A − r^{-2} S_1 E S_2).

For Ẽ := −r^{-2} S_1 E S_2 this means

    det(A + Ẽ) = 0 and |Ẽ| ≤ r^{-2} E = ν_2^{-1} E.

In turn, if the multiplicity of the largest singular value of the minimizing matrix in (18) is two, then that of A^{-1} and of E is one and the infimum in (18) is a minimum.

The minimization in (18) may be performed following [10] by minimizing the convex function ‖e^{-D} [0 E; A^{-1} 0] e^{D}‖_2. Suppose α = ‖Ê‖_2 = ‖Â^{-1}‖_2 = ν_2^{1/2} for Ê and Â^{-1} being the diagonally scaled E and A^{-1}, respectively. Then (18) and (17) imply

(19)    α^{-2} ≤ ω(A, E).

Now take singular vectors u, v, x, y of Ê and Â^{-1} as above and set S_1 := diag(sign(u ∘ y)), S_2 := diag(sign(v ∘ x)), with ∘ denoting the entrywise (Hadamard) product. If entries of u, v, x, y are zero, some heuristic may be applied to choose the corresponding diagonal elements of S_1, S_2 in {−1, 1}. For such S_1, S_2 define

    β := max{ |λ| : λ a real eigenvalue of A^{-1} S_1 E S_2 },

with β := 0 if A^{-1} S_1 E S_2 does not have a real eigenvalue. If β ≠ 0, then

    det(sβ I − A^{-1} S_1 E S_2) = 0 = det(A − β^{-1} s S_1 E S_2)

for some s ∈ {−1, +1}, with |β^{-1} s S_1 E S_2| = β^{-1} E. By the definition (13) and by (19) it follows that

(20)    α^{-2} ≤ ω(A, E) ≤ β^{-1}.

This process might be repeated for other signature matrices S_1, S_2, each pair producing a valid upper bound for ω(A, E). In a practical application the computation of ν_2 to high accuracy is costly, although it amounts to a convex minimization problem. Therefore it may be advisable to perform only a few minimization steps.

5. Numerical results. For different dimensions n we generated 100 random matrices A ∈ M_n(ℝ) with entries chosen randomly and uniformly distributed within [−1, 1]. For each matrix we computed bounds [r, r̄] for ω(A, |A|) as in (20), that is, with respect to relative perturbations of A. The following table displays the median and maximum relative error (r̄ − r)/(r̄ + r).

    relative error
    n        median         maximum
    100      3.0 · 10^{-3}  4.4 · 10^{-2}
    200      6.4 · 10^{-3}  8.0 · 10^{-2}
    500      1.9 · 10^{-6}  2.7 · 10^{-2}
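The experiment can be reproduced in outline as follows. This is a sketch under assumptions (generic convex solver, signs read off the first singular vector pair of the scaled matrix, a simple heuristic for zero entries), not the author's code: it approximately minimizes (18), forms S_1 and S_2, and returns the enclosure (20).

```python
import numpy as np
from scipy.optimize import minimize

def omega_bounds(A, E):
    """Enclosure [r, rbar] for omega(A, E) following (18)-(20)."""
    n = A.shape[0]
    Ainv = np.linalg.inv(A)
    M = np.block([[np.zeros((n, n)), E],
                  [Ainv, np.zeros((n, n))]])           # matrix in (18)

    def f(d):
        return np.linalg.svd(np.exp(-d)[:, None] * M * np.exp(d)[None, :],
                             compute_uv=False)[0]

    res = minimize(f, np.zeros(2 * n), method="Nelder-Mead",
                   options={"maxiter": 50000, "xatol": 1e-9, "fatol": 1e-12})
    alpha = res.fun                                    # ~ nu_2^{1/2}
    lower = alpha ** -2                                # lower bound (19)

    # Singular vectors of the scaled matrix: left (u; x), right (y; v).
    # Positive diagonal scaling preserves signs, so S_1, S_2 can be read off.
    d = res.x
    Ms = np.exp(-d)[:, None] * M * np.exp(d)[None, :]
    U, s, Vt = np.linalg.svd(Ms)
    u, x = U[:n, 0], U[n:, 0]
    y, v = Vt[0, :n], Vt[0, n:]
    S1 = np.diag(np.where(u * y >= 0.0, 1.0, -1.0))    # heuristic: +1 for zeros
    S2 = np.diag(np.where(v * x >= 0.0, 1.0, -1.0))

    lam = np.linalg.eigvals(Ainv @ S1 @ E @ S2)
    real = lam.real[np.abs(lam.imag) < 1e-10]
    beta = np.max(np.abs(real)) if real.size else 0.0
    upper = 1.0 / beta if beta > 0.0 else np.inf       # upper bound (20)
    return lower, upper

rng = np.random.default_rng(6)
A = rng.uniform(-1.0, 1.0, (8, 8))
r, rbar = omega_bounds(A, np.abs(A))
print(r, rbar)                 # r <= omega(A, |A|) <= rbar
```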

In a number of cases the exact value of ω(A, |A|) was enclosed fairly accurately within the bounds [r, r̄]. This seems remarkable for this NP-hard problem and the dimensions in use. The surprisingly good median value for n = 500 seems to have occurred accidentally.

Acknowledgement. The author wishes to thank two anonymous referees and Nick Higham for various helpful and constructive comments.

REFERENCES

[1] J. Albrecht. Minimal norms of non-negative, irreducible matrices. Linear Algebra Appl., 249:255-258, 1996.
[2] F.L. Bauer. Optimally scaled matrices. Numerische Mathematik, 5:73-87, 1963.
[3] J.W. Demmel. The componentwise distance to the nearest singular matrix. SIAM J. Matrix Anal. Appl., 13(1):10-19, 1992.
[4] N.J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM Publications, Philadelphia, 1996.
[5] MATLAB User's Guide, Version 5. The MathWorks Inc., 1997.
[6] A. Neumaier. Interval Methods for Systems of Equations. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 1990.
[7] S. Poljak and J. Rohn. Checking robust nonsingularity is NP-hard. Math. of Control, Signals, and Systems, 6:1-9, 1993.
[8] S.M. Rump. Verification methods for dense and sparse systems of equations. In J. Herzberger, editor, Topics in Validated Computations (Studies in Computational Mathematics), pages 63-136. Elsevier, Amsterdam, 1994.
[9] R. Sezginer and M. Overton. The largest singular value of e^X A e^{-X} is convex on convex sets of commuting matrices. IEEE Trans. Autom. Control, 35:229-230, 1990.
[10] G.A. Watson. Computing the structured singular value. SIAM J. Matrix Anal. Appl., 13(4):1054-1066, 1992.