A PARTIAL CONDITION NUMBER FOR LINEAR LEAST SQUARES PROBLEMS


A PARTIAL CONDITION NUMBER FOR LINEAR LEAST SQUARES PROBLEMS
MARIO ARIOLI, MARC BABOULIN, AND SERGE GRATTON
CERFACS Technical Report TR/PA/04/, 2004. Also appeared as Rutherford Appleton Laboratory Technical Report RAL-TR.

Abstract. We consider here the linear least squares problem min_{y in R^n} ||Ay - b||_2, where b in R^m and A in R^{m x n} is a matrix of full column rank n, and we denote by x its solution. We assume that both A and b can be perturbed and that these perturbations are measured using the Frobenius or the spectral norm for A and the Euclidean norm for b. In this paper we are concerned with the condition number of a linear function of x, namely L^T x where L in R^{n x k}, for which we provide a sharp estimate that lies within a small factor of the true condition number. Provided the triangular factor R of A from A^T A = R^T R is available, this estimate can be computed in O(kn^2) flops. We also propose a statistical method that estimates the partial condition number by using the exact condition numbers in random orthogonal directions. If R is available, this statistical approach enables us to obtain a condition estimate at a lower computational cost. In the case of the Frobenius norm, we derive a closed formula for the partial condition number that is based on the singular values and the right singular vectors of the matrix A.

Keywords: linear least squares, normwise condition number, statistical condition estimate, parameter estimation.

1. Introduction. Perturbation theory has been applied to many problems of linear algebra such as linear systems, linear least squares, or eigenvalue problems [, 4, , 8]. In this paper we consider the problem of calculating the quantity L^T x, where x is the solution of the linear least squares problem (LLSP) min_{x in R^n} ||Ax - b||_2, where b in R^m and A in R^{m x n} is a matrix of full column rank n. This estimation is a fundamental problem of parameter estimation in the framework of the Gauss-Markov model [7, p 7]. More precisely, we focus here on the evaluation of the sensitivity of L^T x to small perturbations of the matrix A and/or the right-hand side b, where L in R^{n x k} and x is the solution of the LLSP. The interest in this question stems for instance from parameter estimation, where the parameters of the model can often be divided into two parts: the variables of physical significance and a set of ancillary variables involved in the model. For example, this situation occurs in the determination of positions using the GPS system, where the 3-D coordinates are the quantities of interest but the statistical model involves other parameters such as clock drift and GPS ambiguities [12] that are generally estimated during the solution process. It is then crucial to ensure that the solution components of interest can be computed with satisfactory accuracy. The main goal of this paper is to formalize this problem in terms of a condition number and to describe practical methods to compute or estimate this quantity. Note that, as far as the sensitivity of a subset of the solution components is concerned, the matrix L is a projection whose columns consist of vectors of the canonical basis of R^n.

The condition number of a map g : R^m -> R^n at y_0 measures the sensitivity of g(y_0) to perturbations of y_0. If we assume that the data space R^m and the solution space R^n are equipped respectively with the norms ||.||_D and ||.||_S, the condition number K(y_0) is defined by

   K(y_0) = lim_{delta -> 0}  sup_{0 < ||y_0 - y||_D <= delta}  ||g(y_0) - g(y)||_S / ||y_0 - y||_D,

whereas the relative condition number is defined by K^{(rel)}(y_0) = K(y_0) ||y_0||_D / ||g(y_0)||_S. This definition shows that K(y_0) measures an
asymptotic sensitivity and that this quantity depends on the chosen norms for the data and solution spaces. If g is a Fréchet-differentiable (F-differentiable)
Rutherford Appleton Laboratory, Oxfordshire, England (m.arioli@rl.ac.uk). CERFACS, 42 Avenue Gaspard Coriolis, 31057 Toulouse Cedex, France (baboulin@cerfacs.fr, gratton@cerfacs.fr).

function at y_0, then K(y_0) is the norm |||g'(y_0)||| of the F-derivative g'(y_0) (see [6]), where |||.||| is the operator norm induced by the choice of the norms on the data and solution spaces. For the full rank LLSP, we have g(A, b) = (A^T A)^{-1} A^T b. If we consider the product norm ||(A, b)|| = (||A||_F^2 + ||b||_2^2)^{1/2} for the data space and ||x||_2 for the solution space, then [8] gives an explicit formula for the relative condition number K^{(rel)}(A, b):

   K^{(rel)}(A, b) = ||A^+||_2 ( ||A^+||_2^2 ||r||_2^2 + ||x||_2^2 + 1 )^{1/2} ||(A, b)|| / ||x||_2,

where A^+ denotes the pseudo-inverse of A, r = b - Ax is the residual vector, and ||.||_F and ||.||_2 are respectively the Frobenius and Euclidean norms.

But does the value of K^{(rel)}(A, b) give us useful information about the sensitivity of L^T x? Can it in some cases overestimate the error in components or, on the contrary, be too optimistic? Let us consider the following example:

   A = ( ε ε 0 ε 0 ε ε ε ε ),   x = ( ε ε ε )^T   and   b = ( ε ε + ε ε + ε ε + ε )^T,

where x is the exact solution of the LLSP min_{x in R^3} ||Ax - b||_2. If we take ε = 10^{-8}, then we have x = (10^{-8}, 10^{-8}, 10^{-8})^T and the solution computed in Matlab with a machine precision of about 10^{-16} is x̃ = (1.5 x 10^{-8}, 0.5 x 10^{-8}, 10^{-8})^T. We can evaluate the LLSP condition number K^{(rel)}(A, b) from the formula above; the relative errors on the components of x are |x_1 - x̃_1|/|x_1| = |x_2 - x̃_2|/|x_2| = 0.5, while the error on x_3 is negligible. Then, if L consists of the first two columns of the identity matrix, we expect a large value for the condition number of L^T x because there is a 50% relative error on x_1 and x_2. If now L = (0, 0, 1)^T, then we expect the condition number of L^T x to be small because x̃_3 = x_3. For these two values of L, the LLSP condition number is far from giving a good idea of the sensitivity of L^T x. Note that in this case the perturbations are due to roundoff errors.

Let us now consider a simple example in the framework of parameter estimation where, in addition to roundoff errors, random errors are involved. Let b = {b_i}, i = 1, ..., 10, be a series of observed values depending on data s = {s_i}, where s_i = 10 + i, i = 1, ..., 10. We determine a third-degree polynomial that approximates b in the least squares sense, and we suppose that the following relationship holds:

   b = x_1 + x_2 s + x_3 s^2 + x_4 s^3,   with x_1 = x_2 = x_3 = x_4 = 1.

We assume that the perturbation on each b_i is 10^{-8} multiplied by a normally distributed random number and denote by b̃ = {b̃_i} the perturbed quantity. This corresponds to the LLSP min_{x in R^4} ||Ax - b̃||_2, where A is the Vandermonde matrix defined by A_ij = s_i^{j-1}. Let x̃ and ỹ be the computed solutions corresponding to two perturbed right-hand sides. Then we obtain the following relative errors on each component: |x̃_1 - ỹ_1|/|x_1| is about 10^{-7}, |x̃_2 - ỹ_2|/|x_2| about 6 x 10^{-6}, |x̃_3 - ỹ_3|/|x_3| about 6 x 10^{-5}, and |x̃_4 - ỹ_4|/|x_4| about 10^{-4}, whereas K^{(rel)}(A, b) is about 10^5. Given this disparity between the sensitivities of the components, we need a quantity that evaluates more precisely the sensitivity of each solution component of the LLSP.
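The component-wise sensitivity in the polynomial-fitting example can be reproduced with a few lines of code. The following is a minimal Python/NumPy sketch in the spirit of the experiment above; the grid s_i = 10 + i, the cubic model with unit coefficients and the 10^{-8} perturbation level follow the text, while the number of points, the random seed and all variable names are illustrative choices.

   import numpy as np

   rng = np.random.default_rng(0)
   m, n = 10, 4                          # number of observations and of coefficients (illustrative)
   s = 10.0 + np.arange(1, m + 1)        # data points s_i = 10 + i
   A = np.vander(s, n, increasing=True)  # Vandermonde matrix, A[i, j] = s_i**j
   x_exact = np.ones(n)                  # x_1 = x_2 = x_3 = x_4 = 1
   b = A @ x_exact

   # Two independently perturbed right-hand sides, perturbation 1e-8 * N(0, 1) on each entry
   b1 = b + 1e-8 * rng.standard_normal(m)
   b2 = b + 1e-8 * rng.standard_normal(m)

   x1 = np.linalg.lstsq(A, b1, rcond=None)[0]
   x2 = np.linalg.lstsq(A, b2, rcond=None)[0]

   # Relative difference on each solution component: the higher-degree coefficients
   # are typically far more sensitive than the constant term
   print(np.abs(x1 - x2) / np.abs(x_exact))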

The idea of analyzing the accuracy of some solution components in linear algebra is by no means new. For linear systems Ax = b, A in R^{n x n}, and for LLSP, [3] defines so-called componentwise condition numbers that correspond to amplification factors of the relative errors in solution components due to perturbations in the data A or b, and explains how to estimate them. In our formalism, these quantities are upper bounds of the condition number of L^T x where L is a column of the identity matrix. We also emphasize that the term componentwise refers here to the solution components and must be distinguished from the metric used for matrices, for which [21] provides a condition number for generalized inversion and linear least squares. For LLSP, [14] provides a statistical estimate for componentwise condition numbers due to either relative or structured perturbations. In the case of linear systems, [2] proposes a statistical approach, based on [13], that enables one to compute the condition number of L^T x in O(n^2) operations.

Our approach differs from the previous studies in the following aspects: we are interested in the condition of L^T x where L is a general matrix and not only a canonical vector of R^n, and we are looking for a condition number based on the Fréchet derivative, and not only for an upper bound of this quantity. We present in this paper three ways to obtain information on the condition of L^T x. The first one uses an explicit formula based on the singular value decomposition of A. The second is at the same time an upper bound of this condition number and a sharp estimate of it. The third method supplies a statistical estimate. The choice between these three methods will depend on the size of the problem (computational cost) and on the accuracy desired for this quantity.

This paper is organized as follows. In Section 2, we define the notion of a partial condition number. Then, when perturbations on A are measured using a Frobenius norm, we give a closed formula for this condition number in the general case where L in R^{n x k} and in the particular case where L in R^n. In Section 3, we establish bounds on the partial condition number in the Frobenius as well as in the spectral norm, and we show that these bounds can be considered as sharp estimates of it. In Section 4 we describe a statistical method that enables us to estimate the partial condition number. In Section 5 we present numerical results in order to compare the statistical estimate and the exact condition number on sample matrices A and L. In Section 6 we give a summary comparing the three ways to compute the condition of L^T x, as well as a numerical illustration. Finally, some concluding remarks are given in Section 7.

Throughout this paper we will use the following notation. We use the Frobenius norm ||.||_F and the spectral norm ||.||_2 on matrices and the usual Euclidean norm ||.||_2 on vectors. The matrix I is the identity matrix and e_i is the i-th canonical vector. We also denote by Im(A) the space spanned by the columns of A and by Ker(A) the null space of A.

2. The partial condition number of an LLSP. Let L be an n x k matrix, with k <= n. We consider the function

   g : R^{m x n} x R^m -> R^k,   (A, b) -> g(A, b) = L^T x(A, b) = L^T (A^T A)^{-1} A^T b.

Since A has full rank n, g is continuously F-differentiable in a neighbourhood of (A, b) and we denote by g' its F-derivative. Let α and β be two positive real numbers. In the present paper we consider the Euclidean norm for the solution space R^k. For the data space R^{m x n} x R^m, we use the product norms defined by

   ||(A, b)||_F = ( α^2 ||A||_F^2 + β^2 ||b||_2^2 )^{1/2},   α, β > 0,
and
   ||(A, b)||_2 = ( α^2 ||A||_2^2 + β^2 ||b||_2^2 )^{1/2},   α, β > 0.
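To fix ideas, the following minimal Python/NumPy sketch spells out the map g and the weighted product norm just defined; the function names and the default weights α = β = 1 are illustrative choices.

   import numpy as np

   def g(A, b, L):
       """g(A, b) = L^T x(A, b), with x the full-rank least squares solution."""
       x = np.linalg.lstsq(A, b, rcond=None)[0]
       return L.T @ x

   def product_norm_F(dA, db, alpha=1.0, beta=1.0):
       """Weighted product norm ||(dA, db)||_F = sqrt(alpha^2 ||dA||_F^2 + beta^2 ||db||_2^2)."""
       return np.hypot(alpha * np.linalg.norm(dA, 'fro'), beta * np.linalg.norm(db))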

These norms are very flexible since they allow us to monitor the perturbations on A and b. For instance, large values of α (resp. β) enable us to obtain condition number problems where mainly b (resp. A) is perturbed. A more general weighted Frobenius norm ||(A T, β b)||_F, where T is a positive diagonal matrix, is sometimes chosen. This is for instance the case in [20], which gives an explicit expression for the condition number of rank-deficient linear least squares problems using this norm.

According to [6], the absolute condition numbers of g at the point (A, b) using the two product norms defined above are given by

   κ_{g,F}(A, b) = max_{(ΔA, Δb) ≠ 0} ||g'(A, b).(ΔA, Δb)||_2 / ||(ΔA, Δb)||_F
and
   κ_{g,2}(A, b) = max_{(ΔA, Δb) ≠ 0} ||g'(A, b).(ΔA, Δb)||_2 / ||(ΔA, Δb)||_2.

The corresponding relative condition numbers of g at (A, b) are expressed by

   κ_{g,F}^{(rel)}(A, b) = κ_{g,F}(A, b) ||(A, b)||_F / ||g(A, b)||_2
and
   κ_{g,2}^{(rel)}(A, b) = κ_{g,2}(A, b) ||(A, b)||_2 / ||g(A, b)||_2.

We call the condition numbers related to L^T x(A, b) partial condition numbers of the LLSP with respect to the linear operator L. The partial condition number defined using the product norm ||(., .)||_F is given by the following theorem.

Theorem 1. Let A = U Σ V^T be the thin singular value decomposition of A defined in [7], with Σ = diag(σ_i) and σ_1 ≥ σ_2 ≥ ... ≥ σ_n > 0. The absolute condition number of g(A, b) = L^T x(A, b) is given by

   κ_{g,F}(A, b) = ||S V^T L||_2,

where S in R^{n x n} is the diagonal matrix with diagonal elements

   S_ii = (1/σ_i) ( (||r||_2^2/σ_i^2 + ||x||_2^2)/α^2 + 1/β^2 )^{1/2}.

Proof. The demonstration is divided into three parts. In Part 1, we establish an explicit formula for g'(A, b).(ΔA, Δb). In Part 2, we derive an upper bound for ||g'(A, b)||. In Part 3, we show that this bound is attained for a particular (ΔA, Δb).

Part 1: Let ΔA in R^{m x n} and Δb in R^m. Using the chain rule of composition of derivatives, we get

   g'(A, b).(ΔA, Δb) = -L^T (A^T A)^{-1} (ΔA^T A + A^T ΔA) (A^T A)^{-1} A^T b + L^T (A^T A)^{-1} ΔA^T b + L^T A^+ Δb,

i.e.

   g'(A, b).(ΔA, Δb) = L^T (A^T A)^{-1} ΔA^T r - L^T A^+ ΔA x + L^T A^+ Δb.

We write ΔA = ΔA_1 + ΔA_2 by defining ΔA_1 = A A^+ ΔA (the projection of ΔA onto Im(A)) and ΔA_2 = (I - A A^+) ΔA (the projection of ΔA onto Im(A)^⊥). We have ΔA_1^T r = 0 because r lies in Im(A)^⊥, and A^+ ΔA_2 = 0. Then we obtain

   g'(A, b).(ΔA, Δb) = L^T (A^T A)^{-1} ΔA_2^T r - L^T A^+ ΔA_1 x + L^T A^+ Δb.
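The closed-form directional derivative obtained in Part 1 can be checked numerically. The minimal Python/NumPy sketch below compares the formula L^T (A^T A)^{-1} ΔA^T r - L^T A^+ ΔA x + L^T A^+ Δb with a finite-difference quotient of g; the test dimensions, the step size and the random seed are illustrative choices.

   import numpy as np

   rng = np.random.default_rng(1)
   m, n, k = 8, 4, 2
   A = rng.standard_normal((m, n))
   b = rng.standard_normal(m)
   L = rng.standard_normal((n, k))
   dA = rng.standard_normal((m, n))
   db = rng.standard_normal(m)

   Apinv = np.linalg.pinv(A)              # A^+
   x = Apinv @ b                          # least squares solution
   r = b - A @ x                          # residual

   # Closed-form directional derivative from Part 1
   deriv = (L.T @ (np.linalg.inv(A.T @ A) @ (dA.T @ r))
            - L.T @ (Apinv @ (dA @ x))
            + L.T @ (Apinv @ db))

   # Finite-difference approximation of the same directional derivative
   t = 1e-7
   x_t = np.linalg.lstsq(A + t * dA, b + t * db, rcond=None)[0]
   fd = (L.T @ x_t - L.T @ x) / t

   print(np.linalg.norm(deriv - fd) / np.linalg.norm(deriv))   # should be small, of order t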

5 Part : We now prove that κ g, A, b) SV T L Let u i and v i be the i-th column of respectively U and V rom A = V Σ U T, we get AA = UU T = n u iu T i and since n v ivi T = I, we have A = n u iu T i A and A = I AA ) A n v ivi T Moreover, still using the thin SVD of A and A, it follows that 5 A T A) v i = v i σ i, A u i = v i σ i and A b = v i u T i b σ i 4) Thus ) becomes g A, b) A, b) = L T v i [v T i AT I AA ) r σ i n = L T v i y i, u T i A x σ i + u T i b σ i ] where we set y i = vi T AT I AA ) r u T σi i A x σ i + u T i b σ i R Thus if Y = y, y,, y n ) T, we get g A, b) A, b) = L T V Y and then g A, b) A, b) = L T V SS Y SV T L S Y We denote by w i = vt i AT I AA )r S iiσ i ut i Ax S iiσ i + ut i b S iiσ i the i-th component of S Y Then we have w i α v T i AT I AA ) T r αs ii σ i r α Sii σ4 i + x α S ii σ i + β Sii σ i + α u T i A x αs ii σ i + β u T i b βs ii σ i = S ii S ii α I AA ) Av i + α u T i A + β u T i b ) ) α I AA ) Av i + α u T i A + β u T i b ) Hence S Y n α I AA ) Av i + α u T i A + β u T i b = α I AA ) AV + α U T A + β U T b = α I AA ) A + α U T A + β U T b Since U T A = UU T A = AA A and U T b = UU T b b, we get S Y α A + α A + β b rom A = A + A, we get S Y A, b) and thus g A, b) A, b) SV T L A, b) So we have shown that SV T L is an upper bound for κ g, A, b) Part :

6 6 We now prove that this upper bound can be reached ie that SV T L A,b) A, b) = g A, b) for some A, b) R m n R m Let consider the particular choice of A, b) defined by holds A, b) = A + A, b) = α i α r r v T i + β i α u x T i, x γ i β u i) where α i, β i, γ i are real constants to be chosen in order to achieve the upper bound obtained in Part Since A T r = 0 and A A = 0, it follows from ) and 4) that g A, b) A, b) = L T A T A) = L T n = α i ασ i L T v i α i ασ i n α i α r vt i L T A v i r L T Thus by denoting ξ i = [L T r v i, L T x v ασi i X = α, β, γ,, α n, β n, γ n ) T R n we get n β i ασ i v i x + L T r β i ασ i x + γ i βσ i ) ασ i, LT v i βσ i β i α u i x + L T A γ i βσ i v i γ i β u i ] R k and Γ = [ξ,, ξ n ] R k n, and g A, b) A, b) = ΓX 5) ) ) Since i, j trace r r v T i )T r r v T i ) x = trace u T i x ) T x u T i x ) = δ ij ) where δ ij is the Kronecker symbol and trace r r v T i )T x u T i x ) = 0, then { r r v T i } x,,n and {u T i x },,n form an orthonormal set of matrices for the robenius norm and we get A = n α i + β i ) It follows that and Equation 5) yields We know that Γ = max X ΓX X n A, b) = n α i + n βi + γi = X, g A, b) A, b) A, b) = ΓX X is reached for some X = α, β, γ,, α n, β n, γ n ) T Then for the A, b) corresponding to this X, we have g A,b) A, b) A, b) = Γ urthermore we have ΓΓ T = L T v r α σ 4 + x α σ + β σ )v T L + + L T v n r α σn 4 + x α σn + β σn )vn T L = L T v Sv T L + + L T v n Snnv n T L = L T V S)SV T L) Hence Γ = ΓΓ T = SV T L

In other words, the perturbation (ΔA, Δb) associated with this X and the coefficients α_1, β_1, γ_1, ..., α_n, β_n, γ_n are such that ||g'(A, b).(ΔA, Δb)||_2 / ||(ΔA, Δb)||_F = ||S V^T L||_2. Thus ||S V^T L||_2 ≤ κ_{g,F}(A, b), which concludes the proof.

Remark 1. Let l_j be the j-th column of L, j = 1, ..., k. From

   S V^T L = ( S_11 v_1^T ; ... ; S_nn v_n^T ) ( l_1, ..., l_k ),   whose (i, j) entry is S_ii v_i^T l_j,

it follows that ||S V^T L||_2 is large when there exists at least one large S_ii and an l_j such that v_i^T l_j ≠ 0. In particular, the condition number of L^T x(A, b) is large when A has small singular values and L has components in the corresponding right singular vectors, or when r is large.

Remark 2. In the general case where L is an n x k matrix, the computation of κ_{g,F}(A, b) via the exact formula given in Theorem 1 requires the singular values and the right singular vectors of A, which might be expensive in practice since it involves O(mn^2) operations if we use an R-SVD algorithm and if m >> n (see [7, p 54]). If the LLSP is solved using a direct method, the R factor of the QR decomposition of A (or, equivalently in exact arithmetic, the Cholesky factor of A^T A) might be available. Since the right singular vectors of A are also those of R, the condition number can then be computed in about O(n^3) flops (using the Golub-Reinsch SVD, [7, p 54]). Using R is even more interesting when L lies in R^n, since from

   ||L^T A^+||_2 = ||R^{-T} L||_2   and   ||L^T (A^T A)^{-1}||_2 = ||R^{-1} (R^{-T} L)||_2,   (6)

it follows that the computation of κ_{g,F}(A, b) can be done by solving two successive n-by-n triangular systems, which involves about 2n^2 flops.
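The two norms appearing in (6) can be obtained exactly as Remark 2 suggests, by two successive triangular solves with the R factor. The following is a minimal Python/SciPy sketch of that computation; the function name, the use of scipy.linalg and the random test data are illustrative choices rather than the authors' implementation.

   import numpy as np
   from scipy.linalg import qr, solve_triangular

   def norms_from_R(R, L):
       """Return ||L^T A^+||_2 = ||R^{-T} L||_2 and ||L^T (A^T A)^{-1}||_2 = ||R^{-1} R^{-T} L||_2."""
       y = solve_triangular(R, L, trans='T')   # y = R^{-T} L  (first triangular solve)
       z = solve_triangular(R, y)              # z = R^{-1} y  (second triangular solve)
       return np.linalg.norm(y, 2), np.linalg.norm(z, 2)

   # Illustrative data: economy-size QR of a random full-rank matrix
   rng = np.random.default_rng(2)
   A = rng.standard_normal((20, 5))
   L = rng.standard_normal(5)                  # here L is a single vector, as in Remark 2
   _, R = qr(A, mode='economic')
   print(norms_from_R(R, L))

In practice R would already be available from the direct solution of the LLSP; the QR factorization above only serves to produce an R factor for the example.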

2.1. Special cases and GSVD. In this section, we analyze some special cases of practical relevance. Moreover, we relate the formula given in Theorem 1 for κ_{g,F}(A, b) to the Generalized Singular Value Decomposition (GSVD) ([1, p 57], [7, p 466], and [15, 19]). Using the GSVD of A and L^T, there exist orthogonal matrices U_A in R^{m x m} and U_L in R^{k x k} and an invertible matrix Z in R^{n x n} such that

   U_A^T A = ( D_A ; 0 ) Z   and   U_L^T L^T = ( D_L  0 ) Z,

with D_A = diag(α_1, ..., α_n), D_L = diag(β_1, ..., β_k), α_i^2 + β_i^2 = 1 for i = 1, ..., k, and α_i = 1 for i = k+1, ..., n. The diagonal matrix S can be decomposed into the product of two diagonal matrices, S = Σ^{-1} D, with

   D_ii = ( (||r||_2^2/σ_i^2 + ||x||_2^2)/α^2 + 1/β^2 )^{1/2}.

Then, taking into account the relations

   ||S V^T L||_2 = ||L^T V S||_2,   L^T V S = L^T V Σ^{-1} U^T U D = L^T A^+ U D,   L^T A^+ = U_L ( D_L  0 ) Z Z^{-1} ( D_A^{-1}  0 ) U_A^T,

we can represent κ_{g,F}(A, b) as

   κ_{g,F}(A, b) = ||T H D||_2,

where T in R^{k x k} is the diagonal matrix with T_ii = β_i/α_i, i = 1, ..., k, and H in R^{k x n} is H = ( I  0 ) U_A^T U. Note that ||L^T A^+||_2 = ||T||_2. We also point out that the diagonal entries of T are the nonzero generalized eigenvalues of λ A^T A z = L L^T z.

There are two interesting special cases where the expression of κ_{g,F}(A, b) is simpler. First, when r = 0, i.e. the LLSP is consistent, we have

   D = ( ||x||_2^2/α^2 + 1/β^2 )^{1/2} I   and   κ_{g,F}(A, b) = ||T H||_2 ( ||x||_2^2/α^2 + 1/β^2 )^{1/2}.

Second, if we allow only perturbations on b and if we use the expression of the derivative of g(A, b) obtained in the proof of Theorem 1, we get

   κ_{g,F}(A, b) = ||L^T A^+||_2 / β = ||T||_2 / β

(see Remark 4 in Section 3). Other relevant cases where the expression for κ_{g,F}(A, b) has a special interest are L = I and L a column vector. In the special case where L = I, the formula given by Theorem 1 becomes

   κ_{g,F}(A, b) = ||S V^T||_2 = ||S||_2 = max_i S_ii = (1/σ_n) ( (||r||_2^2/σ_n^2 + ||x||_2^2)/α^2 + 1/β^2 )^{1/2}.

Since ||A^+||_2 = 1/σ_n, we obtain that

   κ_{g,F}(A, b) = ||A^+||_2 ( (||A^+||_2^2 ||r||_2^2 + ||x||_2^2)/α^2 + 1/β^2 )^{1/2}.

This corresponds to the result known from [8] and also to a generalization of the formula of the condition number in Frobenius norm given in [6, p 9] (where only A was perturbed). Finally, let us study the particular case where L is a column vector, i.e. when g is a scalar-valued function.

Corollary 1. In the particular case when L is a vector (L in R^n), the absolute condition number of g(A, b) = L^T x(A, b) is given by

   κ_{g,F}(A, b) = ( ||L^T (A^T A)^{-1}||_2^2 ||r||_2^2/α^2 + ||L^T A^+||_2^2 ( ||x||_2^2/α^2 + 1/β^2 ) )^{1/2}.
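The closed formulas above are straightforward to evaluate once the thin SVD of A is available. The following minimal Python/NumPy sketch implements κ_{g,F}(A, b) = ||S V^T L||_2 from Theorem 1 and, for a single column L, compares it with the expression of Corollary 1; the helper name, the choice α = β = 1 and the random test data are illustrative choices.

   import numpy as np

   def kappa_F(A, b, L, alpha=1.0, beta=1.0):
       """Theorem 1: kappa_{g,F}(A, b) = ||S V^T L||_2, with S_ii as in the theorem."""
       U, sigma, Vt = np.linalg.svd(A, full_matrices=False)   # thin SVD, A = U diag(sigma) V^T
       x = Vt.T @ ((U.T @ b) / sigma)                          # least squares solution
       r = b - A @ x
       S = np.sqrt((r @ r / sigma**2 + x @ x) / alpha**2 + 1.0 / beta**2) / sigma
       return np.linalg.norm(S[:, None] * (Vt @ L), 2)

   # Comparison with Corollary 1 when L is a single column (alpha = beta = 1)
   rng = np.random.default_rng(3)
   A = rng.standard_normal((15, 4))
   b = rng.standard_normal(15)
   l = rng.standard_normal((4, 1))
   x = np.linalg.lstsq(A, b, rcond=None)[0]
   r = b - A @ x
   lhs = kappa_F(A, b, l)
   rhs = np.sqrt(np.linalg.norm(np.linalg.inv(A.T @ A) @ l)**2 * (r @ r)
                 + np.linalg.norm(np.linalg.pinv(A).T @ l)**2 * (x @ x + 1.0))
   print(lhs, rhs)   # the two values should agree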

9 9 Proof By replacing A T A) = V Σ V T and A = V Σ U T in the expression of K = L T A T A) r + L T A x + )) we get K = L T V Σ V T = L T V Σ = Σ V T L r α + L T V Σ U T x α + β ) r α + L T V Σ x α + β ) r α + Σ V T L x α + β ) By writing z,, z n ) T the vector V T L R n we obtain K = = = z i σ 4 i z i σ i S ii z i r α + = SV T L, z i σ i x α + β ) σ i r + x α + β ) and Theorem gives the result Sharp estimate of the partial condition number in robenius and spectral norms In many cases, obtaining a lower and/or an upper bound of κ g, A, b) is satisfactory when these bounds are tight enough and significantly cheaper to compute than the exact formula Moreover, many applications use condition numbers expressed in the spectral norm In the following theorem, we give sharp bounds for the partial condition numbers in the robenius and spectral norms Theorem The absolute condition numbers of ga, b) = L T xa, b) L R n k ) in the robenius and spectral norms can be respectively bounded as follows fa, b) κ g, A, b) fa, b) fa, b) κ g, A, b) fa, b) where fa, b) = L T A T A) r α + L T A x α + β ) ) Proof Part : We start by establishing the lower bounds Let w and w resp a and a ) be right resp the

10 0 left) singular vectors corresponding to the largest singular values of respectively L T A T A) and L T A We use a particular perturbation A, b) expressed as x T r A, b) = w T α r + ɛw, ɛ w α x β ), where ɛ = ± By replacing this value of A, b) in ) we get g A, b) A, b) = r α LT A T A) w + ɛ α x L T A T A) xw T r L T A r wt x α r ɛ x α LT A w ɛ β LT A w Since r ImA) we have A r = 0 Moreover we have w KerLT A ) and thus w ImA+T L) and can be written w = A+T Lδ for some δ R k Then w T r = δt L T A r = 0 It follows that g A, b) A, b) = r α LT A T A) w ɛ x α LT A w ɛ β LT A w rom L T A T A) w = L T A T A) a and L T A w = L T A a, we obtain g A, b) A, b) = L T A T A) r α a ɛ x α + β ) L T A a Since a and a are unit vectors, g A, b) A, b) can be be developed as g A, b) A, b) = L T A T A) r α + L T A x α + β ) ɛ L T A T A) r α x α + β ) L T A cosa, a ) By choosing ɛ = signcosa, a )) the third term of the above expression becomes positive urthermore we have x α + β ) x α + β Then we obtain ie g A, b) A, b) L T A T A) g A, b) A, b) fa, b) r α + L T A x α + β ) ) On the other hand, we have A = r w T α r + w x T α x r + ɛ trace w T ) T w x T ) ) α r α x and w β = β with r w T α r = w x T α x = α and trace r w T α r )T w x T ) ) = 0 α x

11 Then A, b) = and thus we have g A,b) A, b) A, b) fa,b) A,b) A, b) A, b) urthermore, from A, b) A, b) we get g A, b) the same particular value of A, b)) Then we obtain κ g, A, b) fa,b) and κ g, A, b) fa,b) for a particular value of fa,b) Part : Let us now establish the upper bound for κ g, A, b) and κ g, A, b) If A = AA A and A = I AA ) A, then it comes from ) that A, b) R m n R m for g A, b) A, b) L T A T A) A r + L T A A x + L T A b = Y X, where L T A T A) Y = r, α L T A x, α L T A β ) and X = α A, α A, β b ) T Hence, from the Cauchy-Schwarz inequality we get g A, b) A, b) Y X, ) with and X = α A + α A + β b α A + α A + β b Y = fa, b) Then, since A = A + A, we have X A, b) and ) yields g A, b) A, b) A, b) Y which implies that κ g, A, b) fa, b) An upper bound of κ g, A, b) can be computed in a similar manner: we get from ) that g A, b) A, b) L T A T A) r + L T A x ) A + L T A b = Y X, ) where Y L = T A T A) r + L T A x α, LT A β and X = α A, β b ) T Since X = A, b) we have κ g, A, b) Y Using then the inequality L T A T A) r + L T A ) x L T A T A) r + L T A ) x

12 we get Y Y and finally obtain κ g, A, b) fa, b) which concludes the proof Theorem shows that fa, b) can be considered as a very sharp estimate of the partial condition number expressed either in robenius or spectral norm Indeed, it lies within a factor of κ g, A, b) or κ g, A, b) Another observation is that we have 6 κ g, A, b) κ g, A, b) Thus even if the robenius and spectral norms of a given matrix can be very different for X R m n, we have X X n X ), the condition numbers expressed in both norms are of same order It results that a good estimate of κ g, A, b) is also a good estimate of κ g, A, b) Moreover 6) shows that if the R factor of A is available, fa, b) can be computed by solving two n-by-n triangular systems with k right-hand sides and thus the computational cost is kn Remark We can check on the following example that κ g, A, b) is not equal to fa, b) Let us consider We have and we get A = , L = 0 0 ) and b = / / x = /, / ) T and x = r =, κ g, A, b) = 45 4 < fa, b) = Remark 4 Using the definition of the condition number and of the product norms, tight estimates for the partial condition number for perturbations of A only resp b only) can be obtained by taking α > 0 and β = + resp β > 0 and α = + ) in Theorem In particular, when we perturb only b we have, with the notations of Section, L T A fa, b) = = T = κ g, A, b) β β Moreover, when r = 0 we have fa, b) = ) L T A x ) α + x β = T α + β Remark 5 In the special case where L = I, we have fa, b) = A T A) Since A T A) = A we obtain that r α + ) A x α + β ) fa, b) = A A r + x α + β

13 In that case κ g, A, b) is exactly equal to fa, b) due to [8] Regarding the condition number in spectral norm, since we have A, b) A, b) we get κ g, A, b) fa, b) This lower bound is similar to that obtained in [6] where only A is perturbed) As mentioned in [6], an upper bound of κ g, A) is κ u g,a) = A r + A x If we take α = and β = +, we notice that fa, b) κ u g,a) fa, b) showing thus that our upper bound and κ u g, A) are essentially the same Remark 6 Generalization to other product norms: Other product norms may have been used for the data space R m n R m If we consider a norm ν on R such that c νx, y) x + y c νx, y) then we can define a product norm A, b),ν = να A, β b ) or instance in [9], ν corresponds to Note that the product norm, ) used throughout this paper corresponds to ν = and that with the above notation we have A, b), = A, b) Then the following inequality holds c A, b),ν A, b) c A, b),ν If we denote κ g,,ν A, b) = max A, b) g A,b) A, b) A, b),ν we obtain κ g,,ν A, b) c κ g, A, b) κ g,,νa, b) c Using the bounds for κ g, given in Theorem we can obtain tight bounds for the partial condition number expressed using the product norm based on ν and when the perturbations on matrices are measured with the robenius norm: c fa, b) κ g,,ν A, b) c fa, b) Similarly, if the perturbations on matrices are measured with the spectral norm, we get c fa, b) κ g,,ν A, b) c fa, b) The bounds obtained for three possible product norms ν =, ν = and ν = ) are given in Table when using the robenius norm for matrices and in Table when using the spectral norm for matrices product norm ν, c, c lower bound upper bound factor of fa, b)) factor of fa, b)) max{α A, β b },, 6 α A + β b,,, α A + β b,, Table Bounds for partial condition number robenius norm on matrices) 4 Statistical estimation of the partial condition number In this section we compute a statistical estimate of the partial condition number We have seen in Section that using the robenius or the spectral norm for the matrices gives condition numbers that are of the same order of magnitude or sake of simplicity, we compute here a statistical estimate of κ g, A, b) Let z, z,, z q ) be an orthonormal basis for a subspace of dimension q q k) that has been randomly and uniformly selected from the space of all q-dimensional subspaces of R k this can be done by choosing q random vectors and then orthogonalizing) Let us denote g i A, b) = Lz i ) T xa, b)

14 4 product norm ν, c, c lower bound upper bound factor of fa, b)) factor of fa, b)) max{α A, β b },, 6 α A + β b,, α A + β b,, Table Bounds for partial condition number spectral norm on matrices) Since Lz i R n, the absolute condition number of g i can be computed via the exact formula given in Corollary ie κ gi, A, b) = Lzi ) T A T A) We define the random variable φq) by r α + Lzi ) T A )) x α + β 4) φq) = k q q κ gi, A, b) ) Let the operator E) denote the expected value The following proposition shows that the root mean squared of φq), defined by Rφq)) = Eφq) ) can be considered as an estimate for the condition number of ga, b) = L T xa, b) Proposition The absolute condition number can be bounded as follows: Rφq)) k κ g, A, b) Rφq)) 4) Proof Let vec be the operator that stacks the columns of a matrix into a long ) vector and M vecα A) be the k-by-mn + ) matrix such that vecg A, b) A, b)) = M Note that M vecβ b) depends on A, b, L and not on the z i Then we have: κ g, A, b) = g A, b) A, b) max vecg A, b) A, b)) = max ) A, b) A, b) A, b) vecα A) vecβ b) M z = max = M z R mn+),z 0 z = M T Let Z = [z, z,, z q ] be the k-by-q random matrix with orthonormal columns z i rom [0] it follows that k q M T Z is an unbiased estimator of the robenius norm of the mn + )-by-k matrix M T ie we have E k q M T Z ) = M T rom M T Z = Z T M z T M = zq T M

15 5 we get, since zi T M is a row vector, M T Z = q z T i M We notice that for all vector u R k, if we consider the function g u A, b) = u T ga, b), then we have u T M = g u A, b) = κ g u, A, b) and therefore z T i M = κ gi, A, b) Eventually we obtain M T = Ek q q κ gi, A, b) ) = Eφq) ) Moreover, considering that M T R mn+) k and using the well-known inequality M T k M T M T, we get the result 4 Then we will consider φq) A,b) L T x as an estimator of κ rel) g, A, b) The root mean squared of φq) is an upper bound of κ g A, b), and estimates κ g, A, b) within a factor k Proposition involves the computation of the the condition number of each g i A, b), i =,, q rom Remark, it follows that the computational cost of each κ gi, A, b) is n if the R factor of the QR decomposition of A is available) Hence, for a given sample of vectors z i, i =,, q, computing φq) requires about qn flops However, Proposition is mostly of theoretical interest, since it relies on the computation of the root mean squared of a random variable, without providing a practical method to obtain it In the next proposition, the use of the small sample estimate theory developed by Kenney and Laub [0] gives a first answer to this question by showing that the evaluation of φq) using only one sample of q vectors z, z,, z q in the unit sphere may provide an acceptable estimate Proposition Using conjecture [0, p 78], we have the following result: or any α > 0, ) φq) P r α k κ g, A, b) αφq) α q This probability approaches very fast as q increases or α = and q = the probability for φq) to estimate κ g, A, b) within a factor k is 999% Proof We define as in the proof of Proposition the matrix M as the matrix related to the vec operation representing the linear operator g A, b) rom [0, 4) p 78 and 9) p 78] we get P r M T α φq) α M T ) α q 4) We have seen in the proof of Proposition that κ g, A, b) = M T Then we have κ g, A, b) M T κ g, A, b) k It follows that, for the random variable φq), we have κg, A, b) P r φq) ακ g, A, b) ) M T k P r α α φq) α M T )

16 6 Then we obtain the result from κg, A, b) P r φq) ακ g, A, b) ) k α ) φq) = P r α k κ g, A, b) αφq) We see from this proposition that it may not be necessary to estimate the root mean squared of φq) using sophisticated algorithms Indeed only one sample of φq) obtained for q = provides an estimate of κ g, A, b) within a factor α k Remark 7 If k = then Z = and the problem is reduced to computing κ g A, b) In this case, φ) is exactly the partial condition number of L T xa, b) Remark 8 Concerning the computation of the statistical estimate in the presence of roundofferrors, the numerical reliability of the statistical estimate relies on an accurate computation of the κ gi, A, b) for a given z i Let A be a 7-by- Vandermonde matrix, b a random vector and L R n the right singular vector v n Using the Mathematica software that computes in exact arithmetic, we obtained κ rel) g, A, b) If the triangular factor R form A T A = R T R is obtained by the QR decomposition of A, we get κ rel) g, A, b) 5 08 If R is computed via a classical Cholesky factorization, we get κ g, A, b) rel) 0 0 Corollary and Remark show that the computation of κ g, A, b) rel) involves linear systems of the type A T Ax = d, which differs from the usual normal equation for least squares in their right-hand side Our observation that for this kind of ill-conditioned systems, a QR factorization is more accurate than a Cholesky factorization is in agreement with [5] 5 Numerical experiments All experiments were performed in Matlab 65 using a machine precision Examples or the examples of Section, we compute the partial condition number using the formula given in Theorem In the first example we have A = ɛ ɛ 0 ɛ 0 ɛ ɛ ɛ ɛ and we assume that only A is perturbed If we consider the values for L that are 0 0 and 0 0 L = 0, 0, ) T then we obtain partial condition numbers κ rel) g, A) that are respectively 04 and, as expected since there is 50% relative error on x and x and there is no error on x In the second example where A is the 0 by 4 Vandermonde matrix defined by A ij = 0+i) j and only b is perturbed, the partial condition numbers κ rel) g, b) with respect to each component x, x, x, x 4 are respectively 45 0, 0 4, 0 5, which is consistent with the error variation given in Section for each component 5 Average behaviour of the statistical estimate We compare here the statistical estimate described in the previous section with the partial condition number obtained via the exact formula given in Theorem We suppose that only A is perturbed and then the partial condition number can be expressed as κ rel) g, A) We use the method described in [6] in order to construct test problems [A, x, r, b] = P m, n, n r, l) with ) D A = Y Z 0 T R m n, Y = I yy T, Z = I zz T,

17 where y R m and z R n are random unit vectors ) and D = n l diagn l, n ) l,, ) 0 x =,,, n ) T is given and r = Y R c m is computed with c R m n random vector ) DZx of norm n r The right-hand side is b = Y By construction, the condition number of A c and D is n l In our experiments, we consider the matrices ) ) A E A = I and L =, E A 0 where A R m n, A R m n, L R n n, m + m = m, n + n = n, and E and E contain the same element e p which defines the coupling between A and A The matrices A and A are randomly generated using respectively P m, n, n r, l ) and P m, n, n r, l ) or each sample matrix, we compute in Matlab: the partial condition number κ rel) g, A) using the exact formula given in Theorem and based on the singular value decomposition of A, the statistical estimate φ) using three random orthogonal vectors and computing each κ gi, A, b), i =, with the R factor of the QR decomposition of A These data are then compared by computing the ratio γ = 7 φ) κ rel) g, A) Table 5 contains the mean γ and the standard deviation s of γ obtained on 000 random matrices l with m =, n = 0, m = 7, n = by varying the condition numbers n l and n of respectively A and A and the coupling coefficient e p The residual norms are set to n r = n r = In all cases, γ is close to and s is about 0 The statistical estimate φ) lies within a factor of κ rel) g, A) which is very accurate in condition number estimation We notice that in two cases, φ) is lower than This is possible because Proposition shows that Eφ) ) is an upper bound of κ g, A) but not necessarily φ) condition e p = 0 5 e p = e p = 0 5 l l γ s γ s γ s Table 5 Ratio between statistical and exact condition number of L T x 6 Estimates vs exact formula We assume that the R factor of the QR decomposition of A is known We gather in Table 6 the results obtained in this paper in terms of accuracy and flops counts for the estimation of the partial condition number for the LLSP Table 6 gives the estimates and flops counts in the particular situation where A = m = 500, n = 000, k = 50, 0 0 ) 0, L = 0 ),

   κ_{g,F}(A, b)               flops                                           accuracy
   exact formula               n^3 (mn^2 if the SVD of A itself is computed)   exact value
   sharp estimate f(A, b)      kn^2                                            f(A, b) lies within a small constant factor of κ_{g,F}(A, b)
   statistical estimate φ(q)   qn^2                                            Pr( φ(q)/(α sqrt(k)) ≤ κ_{g,F}(A, b) ≤ α φ(q) ) close to 1, for α > 0
   Table 6.1. Comparison between the exact formula and the estimates for κ_{g,F}(A, b).

A = ( A_1  0 ; 0  I ) and b = (1, ..., 1)^T, L = ( L_1  0 ; 0  I ). We see here that the statistical estimate may provide information on the condition number using a very small amount of floating point operations compared with the two other methods.

   κ_{g,F}^{(rel)}(A, b)    f(A, b) ||(A, b)||_F / ||L^T x||_2    φ(q) ||(A, b)||_F / ||L^T x||_2
   Gflops                   00 Mflops                             6 Mflops
   Table 6.2. Flops and accuracy: exact formula vs. estimates.

7. Conclusion. We have shown the relevance of the partial condition number for test cases coming from parameter estimation. This partial condition number evaluates the sensitivity of L^T x, where x is the solution of an LLSP, when A and/or b are perturbed. It can be computed via a closed formula, a sharp estimate, or a statistical estimate. The choice will depend on the size of the LLSP and on the accuracy needed. The closed formula requires O(n^3) flops and is affordable for small problems only. The sharp estimate and the statistical estimate will be preferred for larger problems, especially if k << n, since their computational cost is in O(n^2).

REFERENCES

[1] Å. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, 1996.
[2] Y. Cao and L. Petzold, A subspace error estimate for linear systems, SIAM J. Matrix Analysis and Applications, 24 (2003).
[3] S. Chandrasekaran and I. C. Ipsen, On the sensitivity of solution components in linear systems of equations, Numerical Linear Algebra with Applications, 2 (1995), pp. 271-286.
[4] L. Eldén, Perturbation theory for the least squares problem with linear equality constraints, SIAM Journal on Numerical Analysis, 17 (1980), pp. 338-350.
[5] V. Frayssé, S. Gratton, and V. Toumazou, Structured backward error and condition number for linear systems of the type A*Ax = b, BIT, pp. 74-8.
[6] A. J. Geurts, A contribution to the theory of condition, Numerische Mathematik, 39 (1982).
[7] G. Golub and C. van Loan, Matrix Computations, third edition, The Johns Hopkins University Press, 1996.
[8] S. Gratton, On the condition number of linear least squares problems in a weighted Frobenius norm, BIT, 36 (1996), pp. 523-530.
[9] J. Grcar, Adjoint formulas for condition numbers applied to linear and indefinite least squares, Technical Report LBNL-55, Lawrence Berkeley National Laboratory, 2005.
[10] T. Gudmundsson, C. S. Kenney, and A. J. Laub, Small-sample statistical estimates for matrix norms, SIAM J. Matrix Analysis and Applications, 16 (1995).
[11] N. Higham, Accuracy and Stability of Numerical Algorithms, second edition, SIAM, Philadelphia, 2002.

[12] E. D. Kaplan, Understanding GPS: Principles and Applications, Artech House Publishers, Boston, 1996.
[13] C. S. Kenney and A. J. Laub, Small-sample statistical condition estimates for general matrix functions, SIAM J. Sci. Comput., 15 (1994), pp. 36-61.
[14] C. S. Kenney, A. J. Laub, and M. S. Reese, Statistical condition estimation for linear least squares, SIAM J. Matrix Analysis and Applications, 19 (1998).
[15] C. C. Paige and M. A. Saunders, Towards a generalized singular value decomposition, SIAM Journal on Numerical Analysis, 18 (1981).
[16] C. C. Paige and M. A. Saunders, LSQR: An algorithm for sparse linear equations and sparse least squares, ACM Trans. Math. Software, 8 (1982), pp. 43-71.
[17] C. R. Rao and S. K. Mitra, Generalized Inverse of Matrices and Its Applications, Wiley, New York, 1971.
[18] G. W. Stewart and J. Sun, Matrix Perturbation Theory, Academic Press, New York, 1990.
[19] C. Van Loan, Generalizing the singular value decomposition, SIAM Journal on Numerical Analysis, 13 (1976), pp. 76-83.
[20] Y. Wei, H. Diao, and S. Qiao, Condition number for weighted linear least squares problem and its condition number, Technical Report CAS 04-0-SQ, Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada, 2004.
[21] Y. Wei, W. Xu, S. Qiao, and H. Diao, Componentwise condition numbers for generalized matrix inversion and linear least squares, Technical Report CAS 0--SQ, Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada, 2003.


AM 205: lecture 6. Last time: finished the data fitting topic Today s lecture: numerical linear algebra, LU factorization AM 205: lecture 6 Last time: finished the data fitting topic Today s lecture: numerical linear algebra, LU factorization Unit II: Numerical Linear Algebra Motivation Almost everything in Scientific Computing

More information

Key words. conjugate gradients, normwise backward error, incremental norm estimation.

Key words. conjugate gradients, normwise backward error, incremental norm estimation. Proceedings of ALGORITMY 2016 pp. 323 332 ON ERROR ESTIMATION IN THE CONJUGATE GRADIENT METHOD: NORMWISE BACKWARD ERROR PETR TICHÝ Abstract. Using an idea of Duff and Vömel [BIT, 42 (2002), pp. 300 322

More information

AM 205: lecture 8. Last time: Cholesky factorization, QR factorization Today: how to compute the QR factorization, the Singular Value Decomposition

AM 205: lecture 8. Last time: Cholesky factorization, QR factorization Today: how to compute the QR factorization, the Singular Value Decomposition AM 205: lecture 8 Last time: Cholesky factorization, QR factorization Today: how to compute the QR factorization, the Singular Value Decomposition QR Factorization A matrix A R m n, m n, can be factorized

More information

Inverses. Stephen Boyd. EE103 Stanford University. October 28, 2017

Inverses. Stephen Boyd. EE103 Stanford University. October 28, 2017 Inverses Stephen Boyd EE103 Stanford University October 28, 2017 Outline Left and right inverses Inverse Solving linear equations Examples Pseudo-inverse Left and right inverses 2 Left inverses a number

More information

2. Review of Linear Algebra

2. Review of Linear Algebra 2. Review of Linear Algebra ECE 83, Spring 217 In this course we will represent signals as vectors and operators (e.g., filters, transforms, etc) as matrices. This lecture reviews basic concepts from linear

More information

OPTIMAL SCALING FOR P -NORMS AND COMPONENTWISE DISTANCE TO SINGULARITY

OPTIMAL SCALING FOR P -NORMS AND COMPONENTWISE DISTANCE TO SINGULARITY published in IMA Journal of Numerical Analysis (IMAJNA), Vol. 23, 1-9, 23. OPTIMAL SCALING FOR P -NORMS AND COMPONENTWISE DISTANCE TO SINGULARITY SIEGFRIED M. RUMP Abstract. In this note we give lower

More information

Review of Some Concepts from Linear Algebra: Part 2

Review of Some Concepts from Linear Algebra: Part 2 Review of Some Concepts from Linear Algebra: Part 2 Department of Mathematics Boise State University January 16, 2019 Math 566 Linear Algebra Review: Part 2 January 16, 2019 1 / 22 Vector spaces A set

More information

Lanczos tridigonalization and Golub - Kahan bidiagonalization: Ideas, connections and impact

Lanczos tridigonalization and Golub - Kahan bidiagonalization: Ideas, connections and impact Lanczos tridigonalization and Golub - Kahan bidiagonalization: Ideas, connections and impact Zdeněk Strakoš Academy of Sciences and Charles University, Prague http://www.cs.cas.cz/ strakos Hong Kong, February

More information

Properties of Matrices and Operations on Matrices

Properties of Matrices and Operations on Matrices Properties of Matrices and Operations on Matrices A common data structure for statistical analysis is a rectangular array or matris. Rows represent individual observational units, or just observations,

More information

Applied Linear Algebra in Geoscience Using MATLAB

Applied Linear Algebra in Geoscience Using MATLAB Applied Linear Algebra in Geoscience Using MATLAB Contents Getting Started Creating Arrays Mathematical Operations with Arrays Using Script Files and Managing Data Two-Dimensional Plots Programming in

More information

Applied Mathematics 205. Unit II: Numerical Linear Algebra. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit II: Numerical Linear Algebra. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit II: Numerical Linear Algebra Lecturer: Dr. David Knezevic Unit II: Numerical Linear Algebra Chapter II.3: QR Factorization, SVD 2 / 66 QR Factorization 3 / 66 QR Factorization

More information

Backward perturbation analysis for scaled total least-squares problems

Backward perturbation analysis for scaled total least-squares problems NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. 009; 16:67 648 Published online 5 March 009 in Wiley InterScience (www.interscience.wiley.com)..640 Backward perturbation analysis

More information

Problem # Max points possible Actual score Total 120

Problem # Max points possible Actual score Total 120 FINAL EXAMINATION - MATH 2121, FALL 2017. Name: ID#: Email: Lecture & Tutorial: Problem # Max points possible Actual score 1 15 2 15 3 10 4 15 5 15 6 15 7 10 8 10 9 15 Total 120 You have 180 minutes to

More information

UNIFYING LEAST SQUARES, TOTAL LEAST SQUARES AND DATA LEAST SQUARES

UNIFYING LEAST SQUARES, TOTAL LEAST SQUARES AND DATA LEAST SQUARES UNIFYING LEAST SQUARES, TOTAL LEAST SQUARES AND DATA LEAST SQUARES Christopher C. Paige School of Computer Science, McGill University, Montreal, Quebec, Canada, H3A 2A7 paige@cs.mcgill.ca Zdeněk Strakoš

More information

LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM

LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM Unless otherwise stated, all vector spaces in this worksheet are finite dimensional and the scalar field F is R or C. Definition 1. A linear operator

More information

Rounding error analysis of the classical Gram-Schmidt orthogonalization process

Rounding error analysis of the classical Gram-Schmidt orthogonalization process Cerfacs Technical report TR-PA-04-77 submitted to Numerische Mathematik manuscript No. 5271 Rounding error analysis of the classical Gram-Schmidt orthogonalization process Luc Giraud 1, Julien Langou 2,

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 Instructions Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 The exam consists of four problems, each having multiple parts. You should attempt to solve all four problems. 1.

More information

The University of Texas at Austin Department of Electrical and Computer Engineering. EE381V: Large Scale Learning Spring 2013.

The University of Texas at Austin Department of Electrical and Computer Engineering. EE381V: Large Scale Learning Spring 2013. The University of Texas at Austin Department of Electrical and Computer Engineering EE381V: Large Scale Learning Spring 2013 Assignment Two Caramanis/Sanghavi Due: Tuesday, Feb. 19, 2013. Computational

More information

MULTIPLICATIVE PERTURBATION ANALYSIS FOR QR FACTORIZATIONS. Xiao-Wen Chang. Ren-Cang Li. (Communicated by Wenyu Sun)

MULTIPLICATIVE PERTURBATION ANALYSIS FOR QR FACTORIZATIONS. Xiao-Wen Chang. Ren-Cang Li. (Communicated by Wenyu Sun) NUMERICAL ALGEBRA, doi:10.3934/naco.011.1.301 CONTROL AND OPTIMIZATION Volume 1, Number, June 011 pp. 301 316 MULTIPLICATIVE PERTURBATION ANALYSIS FOR QR FACTORIZATIONS Xiao-Wen Chang School of Computer

More information

ETNA Kent State University

ETNA Kent State University C 8 Electronic Transactions on Numerical Analysis. Volume 17, pp. 76-2, 2004. Copyright 2004,. ISSN 1068-613. etnamcs.kent.edu STRONG RANK REVEALING CHOLESKY FACTORIZATION M. GU AND L. MIRANIAN Abstract.

More information

MOORE-PENROSE INVERSE IN AN INDEFINITE INNER PRODUCT SPACE

MOORE-PENROSE INVERSE IN AN INDEFINITE INNER PRODUCT SPACE J. Appl. Math. & Computing Vol. 19(2005), No. 1-2, pp. 297-310 MOORE-PENROSE INVERSE IN AN INDEFINITE INNER PRODUCT SPACE K. KAMARAJ AND K. C. SIVAKUMAR Abstract. The concept of the Moore-Penrose inverse

More information

Notes on Solving Linear Least-Squares Problems

Notes on Solving Linear Least-Squares Problems Notes on Solving Linear Least-Squares Problems Robert A. van de Geijn The University of Texas at Austin Austin, TX 7871 October 1, 14 NOTE: I have not thoroughly proof-read these notes!!! 1 Motivation

More information

MS&E 318 (CME 338) Large-Scale Numerical Optimization

MS&E 318 (CME 338) Large-Scale Numerical Optimization Stanford University, Management Science & Engineering (and ICME MS&E 38 (CME 338 Large-Scale Numerical Optimization Course description Instructor: Michael Saunders Spring 28 Notes : Review The course teaches

More information

October 25, 2013 INNER PRODUCT SPACES

October 25, 2013 INNER PRODUCT SPACES October 25, 2013 INNER PRODUCT SPACES RODICA D. COSTIN Contents 1. Inner product 2 1.1. Inner product 2 1.2. Inner product spaces 4 2. Orthogonal bases 5 2.1. Existence of an orthogonal basis 7 2.2. Orthogonal

More information

Lecture notes: Applied linear algebra Part 1. Version 2

Lecture notes: Applied linear algebra Part 1. Version 2 Lecture notes: Applied linear algebra Part 1. Version 2 Michael Karow Berlin University of Technology karow@math.tu-berlin.de October 2, 2008 1 Notation, basic notions and facts 1.1 Subspaces, range and

More information

7. Symmetric Matrices and Quadratic Forms

7. Symmetric Matrices and Quadratic Forms Linear Algebra 7. Symmetric Matrices and Quadratic Forms CSIE NCU 1 7. Symmetric Matrices and Quadratic Forms 7.1 Diagonalization of symmetric matrices 2 7.2 Quadratic forms.. 9 7.4 The singular value

More information

A Note on Inverse Iteration

A Note on Inverse Iteration A Note on Inverse Iteration Klaus Neymeyr Universität Rostock, Fachbereich Mathematik, Universitätsplatz 1, 18051 Rostock, Germany; SUMMARY Inverse iteration, if applied to a symmetric positive definite

More information

LinGloss. A glossary of linear algebra

LinGloss. A glossary of linear algebra LinGloss A glossary of linear algebra Contents: Decompositions Types of Matrices Theorems Other objects? Quasi-triangular A matrix A is quasi-triangular iff it is a triangular matrix except its diagonal

More information

Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices

Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices Vahid Dehdari and Clayton V. Deutsch Geostatistical modeling involves many variables and many locations.

More information

Lecture 3: Review of Linear Algebra

Lecture 3: Review of Linear Algebra ECE 83 Fall 2 Statistical Signal Processing instructor: R Nowak Lecture 3: Review of Linear Algebra Very often in this course we will represent signals as vectors and operators (eg, filters, transforms,

More information

Lecture 3: Review of Linear Algebra

Lecture 3: Review of Linear Algebra ECE 83 Fall 2 Statistical Signal Processing instructor: R Nowak, scribe: R Nowak Lecture 3: Review of Linear Algebra Very often in this course we will represent signals as vectors and operators (eg, filters,

More information

On The Belonging Of A Perturbed Vector To A Subspace From A Numerical View Point

On The Belonging Of A Perturbed Vector To A Subspace From A Numerical View Point Applied Mathematics E-Notes, 7(007), 65-70 c ISSN 1607-510 Available free at mirror sites of http://www.math.nthu.edu.tw/ amen/ On The Belonging Of A Perturbed Vector To A Subspace From A Numerical View

More information

B553 Lecture 5: Matrix Algebra Review

B553 Lecture 5: Matrix Algebra Review B553 Lecture 5: Matrix Algebra Review Kris Hauser January 19, 2012 We have seen in prior lectures how vectors represent points in R n and gradients of functions. Matrices represent linear transformations

More information

Lecture 1: Review of linear algebra

Lecture 1: Review of linear algebra Lecture 1: Review of linear algebra Linear functions and linearization Inverse matrix, least-squares and least-norm solutions Subspaces, basis, and dimension Change of basis and similarity transformations

More information

G1110 & 852G1 Numerical Linear Algebra

G1110 & 852G1 Numerical Linear Algebra The University of Sussex Department of Mathematics G & 85G Numerical Linear Algebra Lecture Notes Autumn Term Kerstin Hesse (w aw S w a w w (w aw H(wa = (w aw + w Figure : Geometric explanation of the

More information

Chapter 3. Matrices. 3.1 Matrices

Chapter 3. Matrices. 3.1 Matrices 40 Chapter 3 Matrices 3.1 Matrices Definition 3.1 Matrix) A matrix A is a rectangular array of m n real numbers {a ij } written as a 11 a 12 a 1n a 21 a 22 a 2n A =.... a m1 a m2 a mn The array has m rows

More information

Basic Elements of Linear Algebra

Basic Elements of Linear Algebra A Basic Review of Linear Algebra Nick West nickwest@stanfordedu September 16, 2010 Part I Basic Elements of Linear Algebra Although the subject of linear algebra is much broader than just vectors and matrices,

More information