Total least squares
Gérard Meurant
October 2008
1. Introduction to total least squares
2. Approximation of the TLS secular equation
3. Numerical experiments
Introduction to total least squares

In least squares (LS) only the right-hand side is perturbed, whereas total least squares (TLS) considers perturbations of both the vector of observations $c$ and the $m \times n$ data matrix $A$:
\[
\min_{E,r} \|(E \;\; r)\|_F \quad \text{subject to the constraint} \quad (A+E)x = c+r.
\]
This amounts to finding the smallest perturbations $E$ and $r$ such that $c+r$ is in the range of $A+E$; see Golub and Van Loan; Van Huffel and Vandewalle; Paige and Strakoš.
Theorem (Golub and Van Loan). Let $C = (A \;\; c)$ and let $U^T C V = \Sigma$ be its SVD. Assume that the singular values of $C$ satisfy
\[
\sigma_1 \ge \cdots \ge \sigma_k > \sigma_{k+1} = \cdots = \sigma_{n+1}.
\]
Then
\[
\min \|(E \;\; r)\|_F = \sigma_{n+1}
\]
and the solution of the TLS problem is given by
\[
x_{TLS} = -\frac{y}{\alpha},
\]
where the vector $(y^T \;\; \alpha)^T$ of norm 1 with $\alpha \ne 0$ lies in the subspace $S_k$ spanned by the right singular vectors $\{v_{k+1},\ldots,v_{n+1}\}$ of $V$. If there is no such vector with $\alpha \ne 0$, the TLS problem has no solution.
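As a quick numerical illustration of the theorem (a sketch of my own, not from the slides), one can form $C = (A \;\; c)$, take the right singular vector belonging to $\sigma_{n+1}$, and read off $x_{TLS}$:

```python
import numpy as np

# Small illustration (my own sketch) of the Golub-Van Loan theorem:
# x_TLS is read off the right singular vector of C = (A  c)
# belonging to sigma_{n+1}.
rng = np.random.default_rng(0)
m, n = 20, 5
A = rng.standard_normal((m, n))
x_true = 1.0 / np.arange(1, n + 1)
c = A @ x_true + 1e-3 * rng.standard_normal(m)

C = np.column_stack([A, c])
U, s, Vt = np.linalg.svd(C, full_matrices=False)
v = Vt[-1]                 # right singular vector for sigma_{n+1}
alpha = v[-1]              # the theorem requires alpha != 0
x_tls = -v[:n] / alpha     # x_TLS = -y / alpha
sigma_np1 = s[-1]          # min ||(E  r)||_F = sigma_{n+1}
```

One can then check numerically that $x_{TLS}$ also satisfies $(A^T A - \sigma_{n+1}^2 I)x = A^T c$, which is the characterization used below.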
The right singular vectors $v_i$ are the eigenvectors of $C^T C$, and $S_k$ is the invariant subspace associated with the smallest eigenvalue $\sigma_{n+1}^2$. The TLS solution $x_{TLS}$ solves the eigenvalue problem
\[
C^T C \begin{pmatrix} x \\ -1 \end{pmatrix} = \sigma_{n+1}^2 \begin{pmatrix} x \\ -1 \end{pmatrix}.
\]

Theorem. If $\sigma_n^A > \sigma_{n+1}$, then $x_{TLS}$ exists and is the unique solution of the TLS problem,
\[
x_{TLS} = (A^T A - \sigma_{n+1}^2 I)^{-1} A^T c.
\]
Moreover, $\sigma_{n+1}$ satisfies the secular equation
\[
\sigma_{n+1}^2 \left[ 1 + \sum_{i=1}^n \frac{d_i^2}{(\sigma_i^A)^2 - \sigma_{n+1}^2} \right] = \rho_{LS}^2,
\]
where $d = U_A^T c$ ($U_A$ being the matrix of left singular vectors of $A$) and $\rho_{LS}^2 = \|c - A x_{LS}\|^2$.
The secular equation can also be written as
\[
\sigma_{n+1}^2 = c^T c - c^T A (A^T A - \sigma_{n+1}^2 I)^{-1} A^T c.
\]
This is obtained by writing
\[
\begin{pmatrix} A^T A & A^T c \\ c^T A & c^T c \end{pmatrix} \begin{pmatrix} x \\ -1 \end{pmatrix} = \sigma_{n+1}^2 \begin{pmatrix} x \\ -1 \end{pmatrix}
\]
and eliminating $x$. For data least squares (DLS), where only the matrix is perturbed, the secular equation is
\[
c^T c - c^T A (A^T A - \sigma^2 I)^{-1} A^T c = 0.
\]
This can also be written as
\[
c^T (A A^T - \sigma^2 I)^{-1} c = 0.
\]
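A minimal check (my own sketch, with an invented small test problem) that $\sigma_{n+1}^2$ obtained from the SVD of $C = (A \;\; c)$ is indeed a root of this form of the secular equation:

```python
import numpy as np

# Sketch: sigma_{n+1}^2 from the SVD of C = (A  c) is a root of
#   f(s2) = c^T c - c^T A (A^T A - s2 I)^{-1} A^T c - s2.
rng = np.random.default_rng(1)
m, n = 30, 4
A = rng.standard_normal((m, n))
c = A @ np.ones(n) + 1e-2 * rng.standard_normal(m)

def secular(s2):
    # evaluate f(s2) with a linear solve instead of an explicit inverse
    w = np.linalg.solve(A.T @ A - s2 * np.eye(n), A.T @ c)
    return c @ c - c @ (A @ w) - s2

sigma_np1 = np.linalg.svd(np.column_stack([A, c]), compute_uv=False)[-1]
residual = secular(sigma_np1 ** 2)   # approximately zero
```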
[Figure: example of a TLS secular function as a function of $\sigma^2$.]
Approximation of the TLS secular equation

We approximate the quadratic form in the TLS secular equation using one of the Golub–Kahan bidiagonalization algorithms with $c$ as the starting vector. It reduces $A$ to lower bidiagonal form and generates the $(k+1) \times k$ matrix
\[
C_k = \begin{pmatrix}
\gamma_1 & & & \\
\delta_1 & \gamma_2 & & \\
& \ddots & \ddots & \\
& & \delta_{k-1} & \gamma_k \\
& & & \delta_k
\end{pmatrix}
\]
such that $C_k^T C_k = J_k$, the tridiagonal matrix generated by the Lanczos algorithm for the matrix $A^T A$.
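A sketch of this bidiagonalization (recurrence written in my own notation, without reorthogonalization, so it is only meant for small well-behaved examples):

```python
import numpy as np

# Golub-Kahan lower bidiagonalization started from c (sketch):
# after k steps, A V_k = U_{k+1} C_k with C_k (k+1) x k lower bidiagonal,
# and C_k^T C_k = J_k, the Lanczos tridiagonal for A^T A.
def golub_kahan_lower(A, c, k):
    m, n = A.shape
    U = np.zeros((m, k + 1))
    V = np.zeros((n, k))
    Ck = np.zeros((k + 1, k))
    u = c / np.linalg.norm(c)
    U[:, 0] = u
    v = np.zeros(n)
    delta = 0.0
    for j in range(k):
        w = A.T @ u - delta * v      # next right Lanczos vector
        gamma = np.linalg.norm(w)
        v = w / gamma
        V[:, j] = v
        Ck[j, j] = gamma             # diagonal entry gamma_j
        w = A @ v - gamma * u        # next left Lanczos vector
        delta = np.linalg.norm(w)
        u = w / delta
        U[:, j + 1] = u
        Ck[j + 1, j] = delta         # subdiagonal entry delta_j
    return Ck, U, V
```

For a small random $A$ one can verify that $U_{k+1}^T A V_k = C_k$ and that $C_k^T C_k$ is tridiagonal.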
At iteration $k$ we approximate the TLS secular equation by
\[
c^T c - \|c\|^2 (e^1)^T C_k (C_k^T C_k - \sigma^2 I)^{-1} C_k^T e^1 = \sigma^2.
\]
This corresponds to the Gauss quadrature rule. Let $C_k = U_k S_k V_k^T$ be the SVD of $C_k$, let $\sigma_i^{(k)}$ be its singular values, and let $\xi^{(k)} = U_k^T e^1$. The equation becomes
\[
(\xi_{k+1}^{(k)})^2 - \sigma^2 \sum_{i=1}^k \frac{(\xi_i^{(k)})^2}{(\sigma_i^{(k)})^2 - \sigma^2} = \frac{\sigma^2}{\|c\|^2}.
\]
We need to compute its smallest zero; secular equation solvers use rational interpolation. When an approximate solution $\sigma_{tls}^2$ has been computed, we solve for
\[
x_{tls} = (A^T A - \sigma_{tls}^2 I)^{-1} A^T c.
\]
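The expansion in singular values can be checked directly; in this sketch the entries of $C_k$ are invented for illustration:

```python
import numpy as np

# Sketch: for a lower-bidiagonal C_k, the quadratic form
# (e^1)^T C_k (C_k^T C_k - s2 I)^{-1} C_k^T e^1 equals its expansion in
# the singular values sigma_i^(k) with weights xi^(k) = U_k^T e^1,
# which yields the xi-form of the reduced secular equation.
k = 4
gammas = [2.0, 1.5, 1.0, 0.7]   # invented diagonal entries
deltas = [0.9, 0.8, 0.6, 0.4]   # invented subdiagonal entries
Ck = np.zeros((k + 1, k))
for j in range(k):
    Ck[j, j] = gammas[j]
    Ck[j + 1, j] = deltas[j]

Uk, sk, _ = np.linalg.svd(Ck)   # full SVD: Uk is (k+1) x (k+1)
e1 = np.zeros(k + 1)
e1[0] = 1.0
xi = Uk.T @ e1

s2 = 0.5 * sk[-1] ** 2          # a test value below the smallest sigma^2
direct = e1 @ (Ck @ np.linalg.solve(Ck.T @ Ck - s2 * np.eye(k), Ck.T @ e1))
via_svd = np.sum(xi[:k] ** 2 * sk ** 2 / (sk ** 2 - s2))
rearranged = xi[k] ** 2 - s2 * np.sum(xi[:k] ** 2 / (sk ** 2 - s2))
```

Here `rearranged` equals `1 - direct` because the components of $\xi^{(k)}$ have unit norm, which is exactly the rewriting used in the displayed equation.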
The Gauss–Radau rule

We implement the Gauss–Radau rule using the other Golub–Kahan bidiagonalization algorithm, with $A^T c$ as the starting vector. It reduces $A$ to upper bidiagonal form. If
\[
B_k = \begin{pmatrix}
\gamma_1 & \delta_1 & & \\
& \ddots & \ddots & \\
& & \gamma_{k-1} & \delta_{k-1} \\
& & & \gamma_k
\end{pmatrix},
\]
the matrix $B_k$ is the Cholesky factor of the Lanczos matrix $J_k$.
To obtain the Gauss–Radau rule we must modify $B_k$ to have a prescribed eigenvalue $z$. Let $\omega$ be the solution of
\[
(B_{k-1}^T B_{k-1} - z I)\,\omega = (\gamma_{k-1}\delta_{k-1})^2 e^{k-1}
\]
and let
\[
\tilde{\gamma}_k^2 = (z + \omega_{k-1}) - \delta_{k-1}^2.
\]
The modified matrix giving the Gauss–Radau rule is
\[
\tilde{B}_k = \begin{pmatrix}
\gamma_1 & \delta_1 & & \\
& \ddots & \ddots & \\
& & \gamma_{k-1} & \delta_{k-1} \\
& & & \tilde{\gamma}_k
\end{pmatrix}.
\]
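The same modification can be written on the tridiagonal $J_k = B_k^T B_k$: change only its last diagonal entry so that $z$ becomes an eigenvalue. A sketch (indexing and test matrix are my own):

```python
import numpy as np

# Sketch of the Gauss-Radau modification on J_k = B_k^T B_k:
# solve (J_{k-1} - z I) w = beta_{k-1}^2 e^{k-1}, with beta_{k-1} the last
# off-diagonal entry, and replace the last diagonal entry by z + w_{k-1}.
def radau_modify(J, z):
    k = J.shape[0]
    beta = J[k - 1, k - 2]          # beta_{k-1} = gamma_{k-1} * delta_{k-1}
    rhs = np.zeros(k - 1)
    rhs[-1] = beta ** 2
    w = np.linalg.solve(J[:k - 1, :k - 1] - z * np.eye(k - 1), rhs)
    Jt = J.copy()
    Jt[k - 1, k - 1] = z + w[-1]    # modified last diagonal entry
    return Jt

# verify on a small tridiagonal built from an invented upper-bidiagonal B_k
gam = np.array([2.0, 1.4, 1.1, 0.8])
dlt = np.array([0.5, 0.4, 0.3])
B = np.diag(gam) + np.diag(dlt, 1)
J = B.T @ B
z = 0.1                              # prescribed eigenvalue
Jt = radau_modify(J, z)
gap = np.min(np.abs(np.linalg.eigvalsh(Jt) - z))   # ~ 0
```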
Using $\tilde{B}_k$ we solve the secular equation
\[
\|c\|^2 - \|A^T c\|^2 (e^1)^T (\tilde{B}_k^T \tilde{B}_k - \sigma^2 I)^{-1} e^1 = \sigma^2
\]
with the SVD of $\tilde{B}_k$.
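A sketch of the whole procedure on a small dense problem. Bisection stands in for the rational secular solvers of the slides, the Gauss rule is used (no Radau modification), and full reorthogonalization keeps the small demo clean:

```python
import numpy as np

# Sketch: approximate the TLS secular equation with the Lanczos matrix J_k
# for A^T A started at A^T c / ||A^T c||, then locate the smallest root of
#   f(s2) = ||c||^2 - ||A^T c||^2 (e^1)^T (J_k - s2 I)^{-1} e^1 - s2
# by bisection (the slides use rational interpolation instead).
rng = np.random.default_rng(3)
m, n = 40, 8
k = n                                # full run, so the Gauss rule is exact
A = rng.standard_normal((m, n))
c = A @ np.ones(n) + 1e-2 * rng.standard_normal(m)

q = A.T @ c
q /= np.linalg.norm(q)
Q, alphas, betas = [q], [], []
for j in range(k):                   # Lanczos on A^T A
    w = A.T @ (A @ Q[j])
    alphas.append(Q[j] @ w)
    for qq in Q:                     # full reorthogonalization
        w -= (qq @ w) * qq
    if j < k - 1:
        betas.append(np.linalg.norm(w))
        Q.append(w / betas[-1])
J = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)

def f(s2):
    e1 = np.zeros(k)
    e1[0] = 1.0
    quad = e1 @ np.linalg.solve(J - s2 * np.eye(k), e1)
    return c @ c - (A.T @ c) @ (A.T @ c) * quad - s2

# f(0) = ||c - A x_LS||^2 > 0 and f -> -inf as s2 -> lambda_min(J),
# so the smallest root is bracketed by (0, lambda_min(J))
lo, hi = 0.0, np.linalg.eigvalsh(J)[0] * (1.0 - 1e-12)
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
sigma2_tls = 0.5 * (lo + hi)
x_tls = np.linalg.solve(A.T @ A - sigma2_tls * np.eye(n), A.T @ c)
```

Since $k = n$ here, the computed root agrees with $\sigma_{n+1}^2$ of $C = (A \;\; c)$; for $k < n$ one gets the Gauss-rule approximation instead.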
Numerical experiments

The test matrices are constructed as
\[
A_s = U_s \Sigma_s V_s^T, \qquad U_s = I - 2\,\frac{u_s u_s^T}{\|u_s\|^2}, \qquad V_s = I - 2\,\frac{v_s v_s^T}{\|v_s\|^2},
\]
where $u_s$ and $v_s$ are random vectors and $\Sigma_s$ is an $m \times n$ diagonal matrix with diagonal elements $1, \ldots, n$. Let $x_s$ be the vector whose $i$th component is $1/i$ and $c_s = A_s x_s$. The perturbed data are
\[
A = A_s + \xi\, \mathrm{randn}(m, n), \qquad c = c_s + \xi\, \mathrm{randn}(m, 1).
\]
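The construction translates directly; in this sketch NumPy's generator replaces Matlab's randn:

```python
import numpy as np

# Sketch of the test-problem construction: Householder reflectors U_s, V_s,
# prescribed singular values 1..n, exact solution x_s with components 1/i.
rng = np.random.default_rng(2)
m, n, xi = 100, 50, 0.3e-2            # xi is the noise level

u = rng.standard_normal(m)
v = rng.standard_normal(n)
Us = np.eye(m) - 2.0 * np.outer(u, u) / (u @ u)
Vs = np.eye(n) - 2.0 * np.outer(v, v) / (v @ v)
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(np.arange(1.0, n + 1.0))
As = Us @ Sigma @ Vs.T                # A_s = U_s Sigma_s V_s^T

xs = 1.0 / np.arange(1.0, n + 1.0)    # i-th component is 1/i
cs = As @ xs
A = As + xi * rng.standard_normal((m, n))   # perturbed data
c = cs + xi * rng.standard_normal(m)
```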
A small example

Example TLS1, m = 100, n = 50, BNS1, ε = 10⁻⁶. (L it.: Lanczos/bidiagonalization iterations; s it.: total secular solver iterations.)

  ξ           L it.   s it.   sol.                  exact sol.
  0.3·10⁻²      30      57    0.01703479103104873   0.01703478979190218
  0.3·10⁻¹      26      49    0.169448388286749     0.1694483528865543
  0.3           28      73    1.464892131470029     1.464891451263777
  30            33      64    88.21012648624229     88.21012652906667
A larger example

We cannot store $A$, which is a dense matrix, in Matlab. We use the vectors $u_s$ and $v_s$ to perform matrix multiplications with $A_s$ or $A_s^T$, and we perturb the singular values in the same way as the right-hand side.

Example TLS3, m = 10000, n = 5000, noise = 0.3, BNS1

  ε       L it.   s it.   min it.   max it.   av. it.   solution
  10⁻⁶     250     273       1         2        1.09    1.418582932414374
  10⁻¹⁰    328     660       1         3        2.01    1.418576233569240

This works fine, but it is too expensive.
The Gauss–Radau rule

Example TLS1, m = 100, n = 50, Gauss–Radau, noise = 0.3, ε = 10⁻⁶

  Met.   L it.   z       s it.   min it.   max it.   solution
  Newt     28    σ_min    130       2        14      1.464891376927382
  Newt     28    σ_max     79       2         4      1.464892626809155
  Rat      28    σ_min     98       2         5      1.464891376927382
  Rat      28    σ_max     74       2         3      1.464892626809155
Example TLS3, m = 10000, n = 5000, Gauss–Radau, noise = 0.3, ε = 10⁻⁶

  Met.   L it.   z       s it.   min it.   max it.   solution
  Newt    250    σ_min   2572       3        31      1.418576232676234
  Newt    250    σ_max   1926       3        26      1.418583305908228
  Rat     250    σ_min    837       2         5      1.418576232676233
  Rat     250    σ_max    653       2         4      1.418583305908227
Optimization of the algorithm

To reduce the cost:
- we monitor the convergence of the smallest singular value of $A$; for this we solve a secular equation at every Lanczos iteration,
- we use a third-order rational approximation and tridiagonal solves,
- the Gauss and Gauss–Radau estimates are only computed at the end.
Example TLS3, m = 10000, n = 5000, noise = 0.3, ε = 10⁻⁶; L it. = 250, trid = 551

  Met.    z           s it.   solution
  Gauss   -             2     1.418582932414440
  G–R     σ_min(B_k)    2     1.418582932414443
  G–R     σ_max(B_k)    3     1.418583305908306
Example TLS4, m = 100000, n = 50000, noise = 0.3, ε = 10⁻⁶; L it. = 755, trid = 1775

  Met.    z           s it.   solution
  Gauss   -             1     0.8721122166701496
  G–R     σ_min(B_k)    2     0.8721122166735605
  G–R     σ_max(B_k)    3     0.8721124331415380
For example TLS3 with m = 10000, n = 5000 and ε = 10⁻⁶, the computing time when solving for Gauss and Gauss–Radau at each iteration was 117 seconds; with the last algorithm it is 12 seconds.
References

J.R. Bunch, C.P. Nielsen and D.C. Sorensen, Rank-one modification of the symmetric eigenproblem, Numer. Math., v 31 (1978), pp 31-48

G.H. Golub and C. Van Loan, An analysis of the total least squares problem, SIAM J. Numer. Anal., v 17 n 6 (1980), pp 883-893

Ren-Cang Li, Solving secular equations stably and efficiently, Report UCB CSD-94-851, University of California, Berkeley (1994)

A. Melman, A unifying convergence analysis of second-order methods for secular equations, Math. Comp., v 66 n 217 (1997), pp 333-344

A. Melman, A numerical comparison of methods for solving secular equations, J. Comp. Appl. Math., v 86 (1997), pp 237-249
C.C. Paige and Z. Strakoš, Bounds for the least squares residual using scaled total least squares, in Proc. 3rd int. workshop on TLS and errors-in-variables modelling, S. Van Huffel and P. Lemmerling eds., Kluwer (2001), pp 25-34

C.C. Paige and Z. Strakoš, Unifying least squares, total least squares and data least squares, in Proc. 3rd int. workshop on TLS and errors-in-variables modelling, S. Van Huffel and P. Lemmerling eds., Kluwer (2001), pp 35-44

C.C. Paige and Z. Strakoš, Bounds for the least squares distance using scaled total least squares problems, Numer. Math., v 91 (2002), pp 93-115

C.C. Paige and Z. Strakoš, Scaled total least squares fundamentals, Numer. Math., v 91 (2002), pp 117-146

C.C. Paige and Z. Strakoš, Core problems in linear algebraic systems, SIAM J. Matrix Anal. Appl., v 27 n 3 (2006), pp 861-874
S. Van Huffel and J. Vandewalle, The total least squares problem: computational aspects and analysis, SIAM, (1991)