Tikhonov Regularization for Weighted Total Least Squares Problems

Yimin Wei, Naimin Zhang, Michael K. Ng, Wei Xu

Abstract

In this paper, we study and analyze the regularized weighted total least squares (RWTLS) formulation. Our regularization of the weighted total least squares problem is based on Tikhonov regularization. Numerical examples are presented to demonstrate the effectiveness of the RWTLS method.

1 Introduction

In this paper, we study the regularized weighted total least squares (RWTLS) formulation. Our regularization of the weighted total least squares problem is based on Tikhonov regularization [1]. For the total least squares (TLS) problem [2], the truncation approach has already been studied by Fierro et al. [3]. In [4], Golub et al. considered the Tikhonov regularization approach for TLS problems. They derived a new regularization method in which

Research supported in part by the National Natural Science Foundation of China and Shanghai Education Committee, and by Hong Kong RGC Grant Nos. 713/2P and 746/3P.

Department of Mathematics & Laboratory of Mathematics for Nonlinear Sciences, Fudan University, Shanghai, 200433, People's Republic of China. E-mail: ymwei@fudan.edu.cn

Department of Applied Mathematics, Dalian University of Technology, Dalian, 116024, P. R. China, and Information Engineering College, Dalian University, Dalian, 116622, P. R. China. E-mail: nmzhang@dlut.edu.cn

Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong. E-mail: mng@maths.hku.hk

Department of Computing and Software, McMaster University, Hamilton, Ont., Canada, L8S 4L7.
stabilization enters the formulation in a natural way, and which is able to produce regularized solutions with superior properties for certain problems in which the perturbations are large. In the present work, we focus on RWTLS problems. We show that the RWTLS solution is closely related to the Tikhonov solution of the weighted least squares problem.

Our paper is organized as follows. In Section 2, we introduce the RWTLS formulation and study its regularizing properties. Computational aspects are described in Section 3. In Section 4, numerical examples are presented to demonstrate the usefulness of the RWTLS method.

2 The Regularized Weighted Total Least Squares

A general version of Tikhonov's formulation for the linear weighted TLS problem takes the form [5]:

    min_{Ã, b̃, x} ‖U[(A, b) − (Ã, b̃)]V‖_F  subject to  b̃ = Ãx,  ‖Dx‖_S ≤ δ,    (1)

where U, W and D are nonsingular matrices, S is a symmetric positive definite matrix, the matrix V is of the block diagonal form V = diag(W, γ), the norm ‖·‖_S is defined by ‖y‖²_S = y^T S y, and δ and γ are nonzero constants.

Weighted Tikhonov regularization has the equivalent formulation

    min_x ‖UAx − Ub‖₂  subject to  ‖Dx‖_S ≤ δ,    (2)

where δ is a positive constant. Problem (2) is a weighted LS problem with a quadratic constraint. Using the Lagrange multiplier formulation, the above Tikhonov regularization of the weighted TLS problem can be rewritten as follows:

    L(Ã, x, µ) = ‖U[(A, b) − (Ã, b̃)]V‖²_F + µ(‖Dx‖²_S − δ²),    (3)

where b̃ = Ãx and µ is the Lagrange multiplier, which is zero if the inequality constraint is inactive. The solution x_δ to this problem is different from the solution x_WTLS to

    min_{Ã, b̃, x} ‖U[(A, b) − (Ã, b̃)]V‖_F  subject to  b̃ = Ãx,

when δ is less than ‖Dx_WTLS‖_S. The solutions of the two regularized problems (2) and (1) have an interesting relationship, presented in Theorem 1.
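Problem (2) with the constraint enforced through a Lagrange multiplier µ reduces to the normal equations (A^T U^T U A + µ D^T S D)x = A^T U^T U b. A minimal sketch (Python/NumPy, with randomly generated data and an arbitrary fixed µ; purely illustrative, not the paper's code) checks this against the equivalent stacked least squares formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 8
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
U = np.diag(rng.uniform(0.5, 2.0, m))       # weighting, assumed diagonal here
D = np.eye(n)
S_half = np.diag(rng.uniform(0.5, 2.0, n))  # S = S_half.T @ S_half
S = S_half.T @ S_half
mu = 0.1                                    # fixed Lagrange multiplier (assumed)

# Normal-equations form: (A^T U^T U A + mu D^T S D) x = A^T U^T U b
lhs = A.T @ U.T @ U @ A + mu * D.T @ S @ D
x_ne = np.linalg.solve(lhs, A.T @ U.T @ U @ b)

# Equivalent stacked least squares: min || [U A; sqrt(mu) S^{1/2} D] x - [U b; 0] ||_2
K = np.vstack([U @ A, np.sqrt(mu) * S_half @ D])
rhs = np.concatenate([U @ b, np.zeros(n)])
x_ls = np.linalg.lstsq(K, rhs, rcond=None)[0]

assert np.allclose(x_ne, x_ls)
```

The equivalence holds because K^T K equals the regularized normal-equations matrix and K^T rhs equals A^T U^T U b.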
Before we show the properties of the solution to (3), we have the following results on matrix differentiation with respect to Ã for the matrices A, Ã, W and U.

Lemma 1.

(i)   ∂/∂Ã tr(W^T A^T U^T U Ã W) = U^T U A W W^T
(ii)  ∂/∂Ã tr(W^T Ã^T U^T U A W) = U^T U A W W^T
(iii) ∂/∂Ã tr(W^T Ã^T U^T U Ã W) = 2 U^T U Ã W W^T
(iv)  ∂/∂Ã (b^T U^T U Ã x) = U^T U b x^T
(v)   ∂/∂Ã (x^T Ã^T U^T U b) = U^T U b x^T
(vi)  ∂/∂Ã (x^T Ã^T U^T U Ã x) = 2 U^T U Ã x x^T

Proof. Since (i) is equivalent to (ii), (iv) is equivalent to (v), and (vi) is a special case of (iii), we only give the proofs of (i) and (iii). We first note that for any matrices X = (x_ij) ∈ R^{m×n}, Z = (z_ij) ∈ R^{p×q}, G ∈ R^{p×m}, H ∈ R^{n×q}, C ∈ R^{p×n} and D ∈ R^{m×q}, the following two properties are equivalent (see Theorem 7.1 in [6]):

    ∂Z/∂x_ij = G E_ij^{(mn)} H + C (E_ij^{(mn)})^T D,  i = 1, ..., m, j = 1, ..., n,

and

    ∂z_ij/∂X = G^T E_ij^{(pq)} H^T + D (E_ij^{(pq)})^T C,  i = 1, ..., p, j = 1, ..., q,

where E_ij^{(kl)} is the k-by-l matrix whose entries are all zero except the (i, j)-entry, which equals one.

For (i), we let Y = W^T A^T U^T U, so that tr(YÃW) = Σ_i (YÃW)_ii. Since ∂(YÃW)/∂ã_ij = Y E_ij W, we obtain ∂(YÃW)_ii/∂Ã = Y^T E_ii W^T, and hence

    ∂ tr(YÃW)/∂Ã = Σ_i Y^T E_ii W^T = Y^T W^T = U^T U A W W^T.

The result follows.

For (iii), we find that ∂[(UÃW)^T (UÃW)]/∂ã_ij = W^T E_ij^T U^T U Ã W + (UÃW)^T U E_ij W, and therefore

    ∂[(UÃW)^T (UÃW)]_ii/∂Ã = U^T U Ã W E_ii^T W^T + U^T U Ã W E_ii W^T.

It follows that

    ∂/∂Ã tr[(UÃW)^T (UÃW)] = Σ_i ∂[(UÃW)^T (UÃW)]_ii/∂Ã = Σ_i (U^T U Ã W E_ii^T W^T + U^T U Ã W E_ii W^T) = 2 U^T U Ã W W^T.
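The identities of Lemma 1 can be checked by finite differences. The sketch below (Python/NumPy, with randomly generated matrices; an illustration, not part of the paper) verifies identity (iii); since the functional is quadratic in Ã, central differences are exact up to rounding:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 5
U = rng.standard_normal((m, m))
W = rng.standard_normal((n, n))
At = rng.standard_normal((m, n))           # plays the role of A-tilde

def f(X):
    # f(X) = tr(W^T X^T U^T U X W), the functional in Lemma 1 (iii)
    return np.trace(W.T @ X.T @ U.T @ U @ X @ W)

grad_formula = 2 * U.T @ U @ At @ W @ W.T  # Lemma 1 (iii)

# central finite differences, entry by entry
eps = 1e-6
grad_fd = np.zeros_like(At)
for i in range(m):
    for j in range(n):
        E = np.zeros((m, n))
        E[i, j] = eps
        grad_fd[i, j] = (f(At + E) - f(At - E)) / (2 * eps)

assert np.allclose(grad_formula, grad_fd, rtol=1e-5, atol=1e-5)
```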
With Lemma 1, we have the following main theorem.

Theorem 1. The RWTLS solution x to (1), with the inequality constraint replaced by an equality constraint, is a solution to the problem

    (A^T U^T U A + α W^{-T} W^{-1} + β D^T S D) x = A^T U^T U b,    (4)

where the parameters α and β are given by

    α = −γ² ‖b − Ax‖²_{U^T U} / (1 + γ² ‖x‖²_{W^{-T} W^{-1}}),  β = (µ/γ²)(1 + γ² ‖x‖²_{W^{-T} W^{-1}}),    (5)

and µ is the Lagrange multiplier in (3); here ‖y‖²_{U^T U} = ‖Uy‖²₂ and ‖x‖²_{W^{-T} W^{-1}} = ‖W^{-1}x‖²₂. The two parameters are related by

    β δ² = (Ub)^T U(b − Ax) + α/γ²,    (6)

and the weighted TLS residual satisfies ‖U[(A, b) − (Ã, b̃)]V‖²_F = −α.

Proof. We characterize the solution to (1) by setting the partial derivatives of L(Ã, x, µ) to zero. Using Lemma 1, the differentiation of L(Ã, x, µ) with respect to Ã (after premultiplying by U^{-T}) yields

    U Ã W W^T − U A W W^T − γ r x^T = 0,    (7)

where r = γ U(b − Ãx) = γ U(b − b̃). Moreover, the differentiation of L(Ã, x, µ) with respect to the entries of x yields

    −γ Ã^T U^T r + µ D^T S D x = 0,  or  (γ² Ã^T U^T U Ã + µ D^T S D) x = γ² Ã^T U^T U b.    (8)

By using (7), we have UA = UÃ − γ r x^T W^{-T} W^{-1}, and hence, using (8) to replace γ Ã^T U^T r by µ D^T S D x in the cross terms,

    A^T U^T U A = (UÃ − γ r x^T W^{-T} W^{-1})^T (UÃ − γ r x^T W^{-T} W^{-1})
                = Ã^T U^T U Ã + γ² ‖r‖²₂ W^{-T} W^{-1} x x^T W^{-T} W^{-1} − µ D^T S D x x^T W^{-T} W^{-1} − µ W^{-T} W^{-1} x x^T D^T S D,

and

    Ã^T U^T U b = A^T U^T U b + γ W^{-T} W^{-1} x r^T U b.

By using the assumption that ‖Dx‖_S = δ and gathering the above terms, we obtain (4) with

    α = µ δ² − γ² ‖r‖²₂ ‖x‖²_{W^{-T} W^{-1}} − γ r^T U b  and  β = (µ/γ²)(1 + γ² ‖x‖²_{W^{-T} W^{-1}}).
In order to obtain the expression for α in (5), we first rewrite r as

    r = γ U(b − Ãx) = γ U(b − Ax) − γ² r ‖x‖²_{W^{-T} W^{-1}},

from which we obtain the relation

    r = γ U(b − Ax) / (1 + γ² ‖x‖²_{W^{-T} W^{-1}}).    (9)

From (8), we have

    µ = γ x^T Ã^T U^T r / (x^T D^T S D x) = (γ Ub − r)^T r / δ².    (10)

By inserting (9) and (10) into the expression for α, we obtain (5). Equation (6) is proved by multiplying β by δ² and inserting (9) and (10). Finally, we note from (7) that

    UAW − UÃW = −γ r x^T W^{-T},  and hence  (UAW, γUb) − (UÃW, γUÃx) = (−γ r x^T W^{-T}, r).

It follows that

    ‖U[(A, b) − (Ã, b̃)]V‖²_F = γ² ‖r x^T W^{-T}‖²_F + ‖r‖²₂ = (1 + γ² ‖x‖²_{W^{-T} W^{-1}}) ‖r‖²₂ = γ² ‖b − Ax‖²_{U^T U} / (1 + γ² ‖x‖²_{W^{-T} W^{-1}}) = −α.

The next theorem gives the relationship between the RWTLS solution and the WTLS solution without regularization.

Theorem 2. For a given value of δ, the RWTLS solution x_RWTLS(δ) is related to the solution x_WTLS of the weighted total least squares problem without regularization as follows:

    δ                      solution                  α                                 β
    δ < ‖D x_WTLS‖_S       x_RWTLS(δ) ≠ x_WTLS       α < 0 and α increases with δ      β > 0
    δ ≥ ‖D x_WTLS‖_S       x_RWTLS(δ) = x_WTLS       α = −σ_min((UAW, γUb))²           β = 0

Here σ_min((UAW, γUb)) is the smallest singular value of the matrix (UAW, γUb).

Proof. For δ < ‖D x_WTLS‖_S, the inequality constraint is active and therefore the Lagrange multiplier µ is positive, since this is a necessary condition for optimality; see [7]. By (5), we know that β is positive. Again from (6), we find that when δ increases, α increases, and therefore the TLS residual ‖U[(A, b) − (Ã, b̃)]V‖²_F = −α decreases. For δ ≥ ‖D x_WTLS‖_S, the Lagrange multiplier µ is equal to zero, and the solution becomes the unconstrained minimizer x_WTLS. The result follows.
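These relations can be verified numerically in the unconstrained case (µ = 0, so β = 0). The sketch below (Python/NumPy, randomly generated data; an illustration, not the paper's code) computes x_WTLS from the smallest right singular vector of (UAW, γUb) and checks both that α from (5) equals −σ_min² and that the normal equations (4) hold with β = 0, as Theorem 2 states:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 30, 6
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true + 0.01 * rng.standard_normal(m)
U = np.diag(rng.uniform(0.5, 2.0, m))
W = np.diag(rng.uniform(0.5, 2.0, n))
gamma = 1.5

# smallest right singular vector of M = (UAW, gamma*Ub) is proportional
# to (W^{-1}x; -1/gamma), which gives the unconstrained WTLS solution
M = np.hstack([U @ A @ W, gamma * (U @ b)[:, None]])
_, sig, Vt = np.linalg.svd(M)
v = Vt[-1]
x = -W @ v[:n] / (gamma * v[n])

Winv = np.linalg.inv(W)
r = b - A @ x
alpha = -gamma**2 * (U @ r) @ (U @ r) / (1 + gamma**2 * (Winv @ x) @ (Winv @ x))

assert np.isclose(-alpha, sig[-1]**2)      # residual identity: -alpha = sigma_min^2
lhs = (A.T @ U.T @ U @ A + alpha * Winv.T @ Winv) @ x
assert np.allclose(lhs, A.T @ U.T @ U @ b)  # equation (4) with beta = 0
```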
For δ = ‖D x_WTLS‖_S, the Lagrange multiplier is zero, and the solution becomes the unconstrained minimizer x_WTLS. The value α = −σ_min((UAW, γUb))² follows from Theorem 4.1 in [5]. The constraint is never again active for larger δ, so the solution remains unchanged.

3 Computational method

To compute the RWTLS solutions, we have found it most convenient to avoid explicit use of δ; instead we use β as the free parameter, fixing its value and then computing the value of α that satisfies (5) and is smallest in absolute value. The corresponding value of δ can then be computed from relation (6).

We now discuss how to solve (4) efficiently for many values of α and β. We notice that the equation is equivalent to the augmented system

    [ I_m          0                 UA                 ] [ r ]   [ Ub ]
    [ 0            I_p               β^{1/2} S^{1/2} D  ] [ s ] = [ 0  ],    (11)
    [ (UA)^T   β^{1/2} D^T S^{1/2}   −α W^{-T} W^{-1}   ] [ x ]   [ 0  ]

where r = Ub − UAx and s = −β^{1/2} S^{1/2} D x; eliminating r and s from the last block row recovers (4). Our algorithm is based on this formulation. We reduce UA to m × n bidiagonal form B by means of orthogonal transformations, H^T (UA) K = B, in such a way that C = J^T (S^{1/2} D) K retains a banded form. Using a sequence of Givens transformations, it is easy to obtain J, H and K. Once B and C have been computed, we can recast the augmented system (11) in the following form:

    [ I_n        0             B                         ] [ H^T r ]   [ H^T Ub ]
    [ 0          I_p           β^{1/2} C                 ] [ J^T s ] = [ 0      ].    (12)
    [ B^T    β^{1/2} C^T   −α K^T W^{-T} W^{-1} K        ] [ K^T x ]   [ 0      ]

Since α changes more frequently than β in our approach, we now use Givens rotations to annihilate β^{1/2} C against B by means of Eldén's algorithm [8], which can be represented as

    [ B        ]       [ B̄ ]            [ G_11  G_12 ]
    [ β^{1/2} C ] = G  [ 0  ],  G =     [ G_21  G_22 ],

where B̄ is again bidiagonal.
When we insert this G into the augmented system (12), it becomes

    [ I_n    0      B̄                       ] [ r̂    ]   [ G_11^T H^T Ub ]
    [ 0      I_p    0                        ] [ ŝ    ] = [ G_12^T H^T Ub ],
    [ B̄^T   0     −α K^T W^{-T} W^{-1} K    ] [ K^T x ]   [ 0             ]

where r̂ = G_11^T H^T r + G_21^T J^T s and ŝ = G_12^T H^T r + G_22^T J^T s. The middle block row is now decoupled, and we obtain

    [ I_n    B̄                     ] [ r̂    ]   [ G_11^T H^T Ub ]
    [ B̄^T   −α K^T W^{-T} W^{-1} K ] [ K^T x ] = [ 0             ].

Finally, we apply a symmetric perfect shuffle reordering, n + 1, 1, n + 2, 2, n + 3, 3, ..., 2n, n, to the rows and columns of the above matrix, to obtain a symmetric, tridiagonal, indefinite matrix of order 2n:

    [ −α     b̂_11                          ]
    [ b̂_11    1      b̂_12                  ]
    [        b̂_12    −α      b̂_22          ]
    [               b̂_22     1       ⋱     ]
    [                        ⋱       ⋱     ]

where b̂_ii and b̂_{i,i+1} denote the nonzero entries of B̄, and we can solve the permuted system by a general tridiagonal solver.

4 Numerical Examples

In this section, we present numerical results that illustrate the usefulness of the RWTLS method. Our computations are carried out in MATLAB. We consider an example from [4]. This test problem is a discretization, by means of Gauss-Laguerre quadrature, of the inverse Laplace transform

    ∫₀^∞ exp(−st) f(t) dt = 1/s − 1/(s + 4/25),  s > 0.

The exact solution f(t) = 1 − exp(−4t/25) is known. This example has been implemented in the function ilaplace(n, 2) in Hansen's regularization toolbox [9].
In the tests, the coefficient matrix has size 64 × 64. The perturbation E of the coefficient matrix has elements generated from a normal distribution with zero mean and unit standard deviation. The perturbed right-hand side is generated as

    b = (A + σ ‖E‖_F^{-1} E) x + σ ‖e‖₂^{-1} e,

where the elements of e are drawn from a normal distribution with zero mean and unit standard deviation, and x is the exact solution. In Figure 1, we show the results for σ = 0.1, 0.01, 0.001. The solid line is the exact solution (that of the discretized 64 × 64 problem), the line marked with "*" is the solution from RWTLS, and the dotted line is the solution from RTLS (i.e., the regularized TLS solution without the weighting). In the RWTLS method, we select U to be a diagonal matrix whose first 16 diagonal elements are 1/σ; the last 16 diagonal elements are set to 3σ if this value is not larger than 0.1, and otherwise are repeatedly divided by 10 until the condition is satisfied; the remaining diagonal elements are equal to 1. The first half of the diagonal elements of W are ones, while the last half are equal to σ. The matrix D is the identity matrix, and we set γ = 1. In each case, the optimal regularization parameter µ is selected. We see from the figures that the solutions provided by the RWTLS method are better than those provided by the RTLS method.

One direction for future research is to study how to choose the weight W without knowledge of the noise. We expect that suitable optimization models can be incorporated into the objective function, so that the weighting is determined by the optimization process; see for instance [10].

Figure 1: Numerical solutions for the different methods: (a) σ = 0.1; (b) σ = 0.01; (c) σ = 0.001.
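The noise model above scales both perturbations to have norm exactly σ. A small sketch (Python/NumPy, with a random matrix standing in for the ilaplace(64, 2) discretization; illustrative only) makes this construction concrete:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 64
sigma = 0.01
A = rng.standard_normal((n, n))     # stands in for the ilaplace(64, 2) matrix
x = rng.standard_normal(n)          # stands in for the exact solution

E = rng.standard_normal((n, n))     # coefficient-matrix perturbation
e = rng.standard_normal(n)          # right-hand-side perturbation

A_pert = A + sigma * E / np.linalg.norm(E, 'fro')
b = A_pert @ x + sigma * e / np.linalg.norm(e)

# the injected perturbations have Frobenius / Euclidean norm exactly sigma
assert np.isclose(np.linalg.norm(A_pert - A, 'fro'), sigma)
assert np.isclose(np.linalg.norm(b - A_pert @ x), sigma)
```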
References

[1] H. Engl, M. Hanke and A. Neubauer, Regularization of Inverse Problems, Kluwer Academic Publishers, Netherlands, 1996.

[2] S. Van Huffel and J. Vandewalle, The Total Least Squares Problem: Computational Aspects and Analysis, SIAM, Philadelphia, 1991.

[3] R. Fierro, G. Golub, P. Hansen and D. O'Leary, Regularization by truncated total least squares, SIAM J. Sci. Comput., 18 (1997), pp. 1223-1241.

[4] G. Golub, P. Hansen and D. O'Leary, Tikhonov regularization and total least squares, SIAM J. Matrix Anal. Appl., 21 (1999), pp. 185-194.

[5] G. Golub and C. Van Loan, An analysis of the total least squares problem, SIAM J. Numer. Anal., 17 (1980), pp. 883-893.

[6] G. Rogers, Matrix Derivatives, Lecture Notes in Statistics 2, New York, 1980.

[7] S. Nash and A. Sofer, Linear and Nonlinear Programming, McGraw-Hill, New York, 1996.

[8] L. Eldén, Algorithms for the regularization of ill-conditioned least squares problems, BIT, 17 (1977), pp. 134-145.

[9] P. Hansen, Regularization tools: a Matlab package for analysis and solution of discrete ill-posed problems, Numer. Algorithms, 6 (1994), pp. 1-35.

[10] H. Fu and J. Barlow, A regularized total least squares algorithm for high resolution image reconstruction, to appear in Linear Algebra and its Applications.