
Departement Elektrotechniek ESAT-SISTA/TR 03-04

Structured weighted low rank approximation (1)

Mieke Schuermans, Philippe Lemmerling and Sabine Van Huffel (2)

January 2003

Accepted for publication in Numerical Linear Algebra with Applications

(1) This report is available by anonymous ftp from ftp.esat.kuleuven.ac.be in the directory pub/sista/mschuerm/reports/03-04.ps.gz
(2) K.U.Leuven, Dept. of Electrical Engineering (ESAT), Research group SISTA, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium. Tel. 32/16/32 17 10, Fax 32/16/32 19 70, WWW: http://www.esat.kuleuven.ac.be/sista, E-mail: mieke.schuermans@esat.kuleuven.ac.be. Prof. dr. Sabine Van Huffel is a full professor, dr. Philippe Lemmerling is a postdoctoral researcher of the FWO (Fund for Scientific Research - Flanders) and Mieke Schuermans is a research assistant at the Katholieke Universiteit Leuven, Belgium. The funding sources supporting this research are listed in the Acknowledgements.

NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS
Numer. Linear Algebra Appl. 2003; 10:1-10

Structured weighted low rank approximation

M. Schuermans, P. Lemmerling and S. Van Huffel

K.U.Leuven, ESAT-SCD, Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium
(Correspondence to: K.U.Leuven, ESAT-SCD, Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium. Received January 10, 2003; revised October 10, 2003.)

SUMMARY

This paper extends the Weighted Low Rank Approximation (WLRA) approach towards linearly structured matrices. In the case of Hankel matrices an equivalent unconstrained optimization problem is derived and an algorithm for solving it is proposed. The correctness of the latter algorithm is verified on a benchmark problem. Finally, the statistical accuracy and numerical efficiency of the proposed algorithm are compared with those of STLNB, a previously proposed algorithm for solving Hankel WLRA problems.

KEY WORDS: rank reduction; structured matrices; weighted norm

1. Introduction

Multivariate linear problems play an important role in many applications, including MIMO system identification and deconvolution problems with multiple outputs. These linear models can be described as $AX \approx B$, or equivalently as $[A\ B][X^T\ {-I}]^T \approx 0$, with $A \in \mathbb{R}^{n \times (m-d)}$, $B \in \mathbb{R}^{n \times d}$ the observed variables and $X \in \mathbb{R}^{(m-d) \times d}$ the parameter matrix to be estimated. Since often all variables (i.e., those in $A$ and $B$) are perturbed by noise, the Total Least Squares (TLS) approach is used instead of the Least Squares (LS) approach. The TLS approach solves the following problem:

$$\min_{\Delta A,\,\Delta B,\,X} \|[\Delta A\ \ \Delta B]\|_F^2 \quad \text{such that} \quad (A + \Delta A)X = B + \Delta B, \qquad (1)$$

where $\|\cdot\|_F$ stands for the Frobenius norm, $S \equiv [A\ B]$ is the observed data matrix and $\Delta S \equiv [\Delta A\ \ \Delta B]$ is the correction applied to $S$. The standard way of solving the TLS problem is by means of the Singular Value Decomposition (SVD) [1, 2]. The latter approach yields maximum likelihood (ML) estimates of $X$ when the noise part of $S$ has rows that are independently identically distributed (i.i.d.) with common zero mean vector and a common covariance matrix that is an (unknown) multiple of the identity matrix $I_m$. (The noise part corresponds to $S_n \in \mathbb{R}^{n \times m}$ in $S = S_o + S_n$, where $S_o \in \mathbb{R}^{n \times m}$ contains the unobserved noiseless data and $S \in \mathbb{R}^{n \times m}$ contains the observed noisy data.)
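For illustration, the TLS solution of $AX \approx B$ can be computed from the right singular vectors of $[A\ B]$ in a few lines. The following Python sketch assumes the generic case (the block $V_{22}$ below is invertible); the function name tls is ours, not from the paper:

```python
import numpy as np

def tls(A, B):
    # TLS solution of A X = B via the SVD of [A B] (cf. [1, 2]):
    # partition the right singular vectors and set X = -V12 V22^{-1}
    p = A.shape[1]                        # p = m - d unknowns per column of B
    V = np.linalg.svd(np.hstack([A, B]))[2].T
    V12, V22 = V[:p, p:], V[p:, p:]       # blocks of the d trailing singular vectors
    return -V12 @ np.linalg.inv(V22)      # assumes the generic case: V22 invertible

# small usage example with noise on both A and B
rng = np.random.default_rng(0)
A0 = rng.standard_normal((20, 3))
X0 = rng.standard_normal((3, 2))
A = A0 + 1e-3 * rng.standard_normal(A0.shape)
B = A0 @ X0 + 1e-3 * rng.standard_normal((20, 2))
print(np.max(np.abs(tls(A, B) - X0)))     # small: X is recovered up to the noise level
```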

However, in many applications the observed data matrix $S$ is structured and, since the SVD does not preserve structure, $[A + \Delta A\ \ B + \Delta B]$ will typically be unstructured. In [3] it was shown that this loss of structure implies, under noise conditions that occur naturally in many problems, a loss of statistical accuracy of the estimated parameter matrix. These specific noise conditions can best be explained by representing the structured matrix $S$ by its minimal vector representation $s$ (e.g., a Hankel matrix $S \in \mathbb{R}^{n \times m}$ is converted into a minimal representation $s \in \mathbb{R}^{n+m-1}$). The noise part of $s$ has to obey a Gaussian i.i.d. distribution in order to generate the noise condition mentioned earlier. To obtain a maximum likelihood parameter matrix in these frequently occurring structured cases, one has to solve a so-called Structured Weighted Low Rank Approximation (SWLRA) problem:

$$\min_{\substack{R \in \Omega \\ \mathrm{rank}(R) \le r < k}} \|S - R\|_V^2, \qquad (2)$$

with $R, S \in \mathbb{R}^{n \times m}$, $\Omega$ the set of all matrices having the same structure as $S$ and $\|M\|_V^2 \equiv \mathrm{vec}(M)^T V \mathrm{vec}(M)$, where $\mathrm{vec}(M)$ stands for the vectorized form of $M$, i.e., a vector constructed by stacking the consecutive columns of $M$ in one vector. So, for a given structured data matrix $S$ of a certain rank $k$, we are looking for the nearest (in weighted norm $\|\cdot\|_V^2$ sense) lower rank matrix $R$ with the same structure as $S$. We define $\bar r \equiv m - r$.

For specific choices of $V$, $r$ and $\Omega$, (2) corresponds to well-known problems:

- If $V$ is the identity matrix, denoted as $V = I$, and $\Omega = \mathbb{R}^{n \times m}$, problem (2) reduces to the well-studied TLS problem (1) with $R = [A + \Delta A\ \ B + \Delta B]$ and $d = \bar r$, i.e., we enforce $\bar r$ independent linear relations among the columns of $R$ in order to reduce $R$ to rank $r$.
- When $\Omega$ is a set of matrices sharing a particular linear structure (for example, Hankel matrices) and $V$ is a diagonal matrix, then (2) corresponds to the so-called Structured TLS (STLS) problem. STLS problems with $\bar r = 1$ have been studied extensively [3, 4, 5, 6]. For $\bar r > 1$ only a few algorithms have been described [7, 8], most of which yield statistically suboptimal results, such as the algorithm of [7], which finds lower rank structured matrices by alternating iterations between lower rank matrices and structured matrices.
- The case in which $\Omega = \mathbb{R}^{n \times m}$ corresponds to the previously introduced Unstructured Weighted Low Rank Approximation (UWLRA) problem [9].

In this paper we consider (2) with $\Omega$ the set of a particular type of linearly structured matrices (for example, the set of Hankel matrices). From a statistical point of view it only makes sense to treat identical elements in an identical way, e.g., in a Hankel matrix all the elements on one antidiagonal should be treated the same way. Therefore (2) is reformulated as:

$$\min_{\substack{\mathrm{vec}_2(R) \\ \mathrm{rank}(R) \le r < k}} \|S - R\|_W^2, \qquad (3)$$

with $\|M\|_W^2 \equiv \mathrm{vec}_2(M)^T W \mathrm{vec}_2(M)$, where $\mathrm{vec}_2(M)$ is a minimal vector representation of the linearly structured matrix $M$. E.g., when $\Omega$ represents the set of Hankel matrices, $\mathrm{vec}_2(M)$ is a vector containing the different elements of the different antidiagonals. Note that, due to the one-to-one relation between $M$ and $\mathrm{vec}_2(M)$, the condition $R \in \Omega$ no longer appears explicitly in (3). Problem (3) is the so-called Structured Weighted Low Rank Approximation (SWLRA) problem.

The major contribution of this paper is the extension of the concept and the algorithm presented in [9] to linearly structured matrices (i.e., the extension of UWLRA to SWLRA problems).
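To make the minimal representation concrete for the Hankel case, the following Python sketch (an illustration only; the diagonal weight matrix chosen here is arbitrary) evaluates $\|S - R\|_W^2$ directly on the antidiagonal vectors $\mathrm{vec}_2(S)$ and $\mathrm{vec}_2(R)$:

```python
import numpy as np
from scipy.linalg import hankel

n, m = 4, 3
s = np.arange(1.0, n + m)                     # vec2(S): one entry per antidiagonal
r = s + 0.1 * np.random.default_rng(1).standard_normal(s.size)  # vec2(R), R Hankel
S = hankel(s[:n], s[n - 1:])                  # S[i, j] = s[i + j]
R = hankel(r[:n], r[n - 1:])
W = np.diag(np.arange(1.0, n + m))            # an assumed diagonal weight matrix
d = s - r
print(d @ W @ d)                              # ||S - R||_W^2: one term per antidiagonal
print(np.linalg.norm(S - R, 'fro') ** 2)      # the Frobenius norm instead counts each
                                              # antidiagonal with its multiplicity
```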

Problem (3) can also be interpreted as an extension of the STLS problems described in [4, 3, 5, 6] by allowing $\bar r$ to be larger than 1.

The paper is structured as follows. In section 2, an unconstrained optimization problem equivalent to problem (3) is derived by means of the method of Lagrange multipliers. Section 3 introduces an algorithm for solving this unconstrained optimization problem. Finally, section 4 concludes with some numerical experiments illustrating the statistical optimality of the SWLRA algorithm, and the SWLRA algorithm is compared with the STLNB algorithm described in [8].

2. Derivation

The first step in the derivation is to reformulate (3) into an equivalent double-minimization problem:

$$\min_{\substack{N \in \mathbb{R}^{m \times (m-r)} \\ N^T N = I}} \Big( \min_{\substack{R \in \mathbb{R}^{n \times m} \\ RN = 0}} \|S - R\|_W^2 \Big). \qquad (4)$$

The second step consists of finding a closed form expression $f(N)$ for the solution of the inner minimization $\min_{R \in \mathbb{R}^{n \times m},\, RN = 0} \|S - R\|_W^2$. The latter is obtained as follows. Applying the technique of Lagrange multipliers to the inner minimization of (4) yields the Lagrangian

$$\psi(L, R) = \mathrm{vec}_2(S - R)^T W \mathrm{vec}_2(S - R) - \mathrm{tr}\big(L^T (RN)\big), \qquad (5)$$

where $\mathrm{tr}(A)$ stands for the trace of matrix $A$ and $L$ is the matrix of Lagrange multipliers. Using the equalities

$$\mathrm{vec}(A)^T \mathrm{vec}(B) = \mathrm{tr}(A^T B), \qquad \mathrm{vec}(ABC) = (C^T \otimes A)\,\mathrm{vec}(B),$$

(5) becomes

$$\psi(L, R) = \mathrm{vec}_2(S - R)^T W \mathrm{vec}_2(S - R) - \mathrm{vec}(L)^T (N^T \otimes I)\,\mathrm{vec}(R). \qquad (6)$$

Note that when $R$ is a linearly structured matrix it is straightforward to write down a relation between $\mathrm{vec}(R)$ and its minimal vector representation $\mathrm{vec}_2(R)$:

$$\mathrm{vec}(R) = H \mathrm{vec}_2(R), \qquad (7)$$

where $H \in \mathbb{R}^{nm \times q}$ and $q$ is the number of different elements in $R$. E.g., in the case of a Hankel matrix, $q = n + m - 1$ and $\mathrm{vec}_2(R)$ can be constructed from the first column and last row of $R$. Substituting (7) in (6), the following expression is obtained for the Lagrangian:

$$\psi(L, R) = \mathrm{vec}_2(S - R)^T W \mathrm{vec}_2(S - R) - \mathrm{vec}(L)^T (N^T \otimes I_n) H \mathrm{vec}_2(R). \qquad (8)$$

Setting the derivatives of $\psi$ w.r.t. $\mathrm{vec}_2(R)$ and $L$ equal to 0 yields the following set of equations:

$$\begin{bmatrix} 2W & H^T (N \otimes I_n) \\ (N^T \otimes I_n) H & 0 \end{bmatrix} \begin{bmatrix} \mathrm{vec}_2(R) \\ \mathrm{vec}(L) \end{bmatrix} = \begin{bmatrix} 2W \mathrm{vec}_2(S) \\ 0 \end{bmatrix}. \qquad (9)$$

Using the fact that

$$\begin{bmatrix} A & B \\ B^T & 0 \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} - A^{-1} B (B^T A^{-1} B)^{-1} B^T A^{-1} & A^{-1} B (B^T A^{-1} B)^{-1} \\ (B^T A^{-1} B)^{-1} B^T A^{-1} & -(B^T A^{-1} B)^{-1} \end{bmatrix},$$

and setting $H_2 \equiv (N \otimes I_n)^T H$, it follows from (9) that

$$\mathrm{vec}_2(R) = \big(I_q - W^{-1} H_2^T (H_2 W^{-1} H_2^T)^{-1} H_2\big) \mathrm{vec}_2(S), \qquad \mathrm{vec}_2(S - R) = W^{-1} H_2^T (H_2 W^{-1} H_2^T)^{-1} H_2 \mathrm{vec}_2(S). \qquad (10)$$

As a result, the double-minimization problem (4) can be written as the following optimization problem:

$$\min_{\substack{N \in \mathbb{R}^{m \times (m-r)} \\ N^T N = I}} \mathrm{vec}_2(S)^T H_2^T (H_2 W^{-1} H_2^T)^{-1} H_2 \mathrm{vec}_2(S). \qquad (11)$$
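The closed forms (10) and (11) can be coded directly. The following Python sketch (the helper names hankel_H and swlra_inner are ours) builds the selection matrix $H$ of (7) for the Hankel case, forms $H_2 = (N \otimes I_n)^T H$ and returns the inner minimizer and the cost. It assumes $H_2 W^{-1} H_2^T$ is nonsingular, which, as explained in the next section, holds for $\bar r = 1$ but typically fails for $\bar r > 1$:

```python
import numpy as np

def hankel_H(n, m):
    # selection matrix H of (7): vec(R) = H @ vec2(R) for an n x m Hankel R
    H = np.zeros((n * m, n + m - 1))
    for j in range(m):
        for i in range(n):
            H[j * n + i, i + j] = 1.0      # column-major vec; entry (i, j) lies
    return H                               # on antidiagonal i + j

def swlra_inner(s, N, W, n, m):
    # closed-form inner minimizer (10) and cost (11) for a fixed nullspace basis N;
    # s = vec2(S), N is m x (m - r) with orthonormal columns
    H2 = np.kron(N, np.eye(n)).T @ hankel_H(n, m)   # H2 = (N kron I_n)^T H
    Winv = np.linalg.inv(W)
    M = H2 @ Winv @ H2.T                   # must be nonsingular (rbar = 1 case)
    t = np.linalg.solve(M, H2 @ s)
    return s - Winv @ H2.T @ t, s @ H2.T @ t   # vec2(R) and the cost in (11)
```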

3. Algorithm

From this point on we will focus on a specific $\Omega$ since, as will become clear further on, it is not possible to derive a general algorithm that can deal with any type of linearly structured matrix. We investigate the Hankel structure, since this is one of the most frequently occurring structures in signal processing applications. Toeplitz matrices are thereby dealt with as well, since they can be converted into Hankel matrices by a simple permutation of the rows, which does not change the solution of the corresponding SWLRA problem.

The straightforward approach for solving (4) would be to apply a nonlinear least squares (NLLS) solver to (11). For $\bar r = 1$ this works fine, but for $\bar r > 1$ this approach breaks down by yielding the trivial solution $R = 0$. This can easily be understood by considering (4) with $W$ equal to the identity matrix. As can be seen from (10), the inner minimization of (4) then corresponds to an orthogonal projection of $\mathrm{vec}_2(S)$ onto the orthogonal complement of the column space of $H_2^T \in \mathbb{R}^{(n+m-1) \times n\bar r}$. Therefore it is clear that for $\bar r = 1$ the orthogonal complement of the column space of $H_2^T$ will never be empty (the dimension of the column space of $H_2^T$ equals $\mathrm{rank}(H_2^T) = n$), but for $\bar r > 1$ the latter orthogonal complement will typically be empty since $n \geq m$. As a result the projection will yield $R = 0$. The problem lies in an overparameterization of the matrix $H_2^T = H^T (N \otimes I_n)$.

To find a solution to the latter problem, we first state the inner minimization in (4) in words: given a matrix $N$ whose column space is the nullspace of a non-trivial Hankel matrix, find the Hankel matrix $R$ closest to $S$ such that the nullspace of $R$ is spanned by the column space of $N$. A parameterization of the nullspace of a Hankel matrix can be found by means of the so-called Vandermonde decomposition of a Hankel matrix. Given a Hankel matrix $R \in \mathbb{R}^{n \times m}$ of rank $r$, it is straightforward to see that it can be parameterized as follows:

$$R = \begin{bmatrix} 1 & 1 & \ldots & 1 \\ z_1 & z_2 & \ldots & z_r \\ \vdots & \vdots & & \vdots \\ z_1^{n-1} & z_2^{n-1} & \ldots & z_r^{n-1} \end{bmatrix} \begin{bmatrix} c_1 & & \\ & \ddots & \\ & & c_r \end{bmatrix} \begin{bmatrix} 1 & z_1 & \ldots & z_1^{m-1} \\ 1 & z_2 & \ldots & z_2^{m-1} \\ \vdots & \vdots & & \vdots \\ 1 & z_r & \ldots & z_r^{m-1} \end{bmatrix}, \qquad (12)$$

with $z_i$, $i = 1, \ldots, r$ the so-called complex signal poles and $c_i$, $i = 1, \ldots, r$ the so-called complex amplitudes. By defining a vector $p = [p_1\ p_2\ \ldots\ p_{r+1}]^T$ such that the following polynomial in $\lambda$

$$p_1 + p_2 \lambda + \ldots + p_{r+1} \lambda^r$$

has roots $z_i$, $i = 1, \ldots, r$,

it is easy to see that the nullspace of $R$ can be parameterized by $p_i$, $i = 1, \ldots, r+1$, via the following matrix whose rows span the nullspace of $R$:

$$\begin{bmatrix} p_1 & p_2 & \ldots & p_{r+1} & 0 & 0 & \ldots & 0 \\ 0 & p_1 & p_2 & \ldots & p_{r+1} & 0 & \ldots & 0 \\ & & \ddots & & & & \ddots & \\ 0 & 0 & \ldots & 0 & p_1 & p_2 & \ldots & p_{r+1} \end{bmatrix} \in \mathbb{R}^{(m-r) \times m}.$$

We could now use the previous matrix to parameterize $N$ in (11), but a different approach is taken here. First note that from (12) it follows that the vector $\mathrm{vec}_2(R)$ (with $R$ a rank $r$ Hankel matrix) can be written as the following linear combination:

$$\mathrm{vec}_2(R) = c_1 b_1 + c_2 b_2 + \ldots + c_r b_r,$$

with vectors $b_i \equiv [1\ z_i\ \ldots\ z_i^{n+m-2}]^T$ and scalar coefficients $c_i$ for $i = 1, \ldots, r$. As a result, it is clear that the rows of the following matrix

$$H_3 \equiv \begin{bmatrix} p_1 & p_2 & \ldots & p_{r+1} & 0 & 0 & \ldots & 0 \\ 0 & p_1 & p_2 & \ldots & p_{r+1} & 0 & \ldots & 0 \\ & & \ddots & & & & \ddots & \\ 0 & 0 & \ldots & 0 & p_1 & p_2 & \ldots & p_{r+1} \end{bmatrix} \in \mathbb{R}^{(n+m-1-r) \times (n+m-1)}$$

span the space perpendicular to the $b_i$, $i = 1, \ldots, r$. Therefore, bearing in mind the goal of the inner minimization of (4), the rank deficient Hankel matrix $R$ closest to $S$ can be found by means of the following orthogonal projection:

$$\mathrm{vec}_2(R) = \big(I - W^{-1} H_3^T (H_3 W^{-1} H_3^T)^{-1} H_3\big) \mathrm{vec}_2(S), \qquad (13)$$

and (11) thus becomes

$$\min_{p}\ \mathrm{vec}_2(S)^T H_3^T (H_3 W^{-1} H_3^T)^{-1} H_3 \mathrm{vec}_2(S), \qquad (14)$$

with $p \equiv [p_1\ p_2\ \ldots\ p_{r+1}]^T$. Note that the new expressions are similar to the earlier ones, except that we now work with a vector $p$ instead of a matrix $N$.
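As a numerical check of this parameterization (with arbitrarily chosen poles and amplitudes, so an illustration rather than part of the algorithm), the following Python sketch builds a rank $r$ Hankel matrix via the Vandermonde decomposition (12) and verifies that the banded matrix generated by the coefficients of the polynomial annihilates it:

```python
import numpy as np

n, m, r = 6, 5, 2
z = np.array([0.9, 0.4])                      # assumed distinct signal poles
c = np.array([2.0, -1.0])                     # assumed amplitudes
# Vandermonde decomposition (12): R is an n x m Hankel matrix of rank r
R = np.vander(z, n, increasing=True).T @ np.diag(c) @ np.vander(z, m, increasing=True)
p = np.poly(z)[::-1]                          # p_1 + p_2*l + ... + p_{r+1}*l^r, roots z_i
P = np.zeros((m - r, m))                      # banded matrix whose rows span null(R)
for i in range(m - r):
    P[i, i:i + r + 1] = p
print(np.abs(R @ P.T).max())                  # ~1e-16: R P^T = 0 as claimed
```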

The algorithm for solving the SWLRA problem for Hankel matrices can therefore be summarized as follows:

Algorithm SWLRA
Input: Hankel matrix $S \in \mathbb{R}^{n \times m}$ with first column $[s_1\ s_2\ \ldots\ s_n]^T$ and last row $[s_n\ \ldots\ s_{n+m-1}]$, rank $r$ and weighting matrix $W$.
Output: Hankel matrix $R$ of rank $r$, such that $R$ is as close as possible to $S$ in $\|\cdot\|_W$ sense.
Begin
Step 1: Construct the matrix $\bar S$ by rearranging the elements of $S$ such that $\bar S \in \mathbb{R}^{(n+m-r-1) \times (r+1)}$ is a Hankel matrix with first column $[s_1\ \ldots\ s_{n+m-r-1}]^T$ and last row $[s_{n+m-r-1}\ \ldots\ s_{n+m-1}]$.
Step 2: Compute the SVD of $\bar S$: $U \Sigma V^T$.
Step 3: Take the starting value $p_0$ equal to the $(r+1)$-th right singular vector of $\bar S$.
Step 4: Minimize the cost function in (14).
Step 5: Compute $R$ using (13).
End

In Step 4 a standard NLLS solver (Matlab's lsqnonlin) is used. In order to use an NLLS routine, the cost function has to be cast in the form $f^T f$; to do so, the Cholesky factor of $H_3^T (H_3 W^{-1} H_3^T)^{-1} H_3$ has to be computed. This can be done by a QR factorisation of $W^{-1/2} H_3^T$. The computationally most intensive step is Step 4.

4. Numerical experiments

In this section we first consider a small SWLRA problem of which the solution is known, in order to illustrate the correctness of the proposed algorithm. The last subsection compares the efficiency and statistical accuracy of algorithm SWLRA and the STLNB algorithm proposed in [8].

4.1. Benchmarks

To illustrate the numerical correctness of algorithm SWLRA, we use two small SWLRA problems proposed in [8], of which the exact solution can be calculated analytically. In [8], modeling problems are discussed which approximate a given data sequence $z \in \mathbb{C}^p$ by the impulse response $\hat z \in \mathbb{C}^p$ of a finite-dimensional linear time-invariant (LTI) system of a given order, yielding the following SWLRA problem:

$$\min_{\hat z} \sum_{k=1}^{p} \big[(z_k - \hat z_k) w_k\big]^2 \quad \text{s.t.} \quad \hat C \begin{bmatrix} X \\ -I \end{bmatrix} = 0, \qquad (15)$$

where $\hat C \in \mathbb{C}^{n \times m}$ is a Toeplitz matrix constructed from $\hat z_k$ and the $w_k$ are appropriate weights.

Example 1: Consider the following LTI system of order 4:

$$w_{k+1} = \mathrm{diag}(0.4, 0.3, 0.2, 0.1)\, w_k, \qquad z_k = [1\ 1\ 1\ 1]\, w_k, \qquad w_0 = [1\ 1\ 1\ 1]^T.$$

By arranging the first eight samples $z_k$ of the impulse response in a Toeplitz matrix

$$C = \begin{bmatrix} 0.1 & 0.3 & 1 & 4 \\ 0.0354 & 0.1 & 0.3 & 1 \\ 0.013 & 0.0354 & 0.1 & 0.3 \\ 0.00489 & 0.013 & 0.0354 & 0.1 \\ 0.00187 & 0.00489 & 0.013 & 0.0354 \end{bmatrix}$$

and computing the best rank 1 structure-preserving approximation $\hat C$ of $C$ using the weights $(w_1^2, w_2^2, \ldots, w_8^2) = (1, 2, 3, 4, 4, 3, 2, 1)$, we obtain with algorithm SWLRA the same ratio $\hat z_k / \hat z_{k-1} = 0.26025661$, $k = 2, \ldots, 8$, as the analytically determined solution in [8]. The result is only accurate up to 8 digits, since accuracy is lost due to the squaring effect of the data matrix $S$ in the cost function of (14).
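Putting the pieces together, the following Python sketch transcribes Algorithm SWLRA; it substitutes scipy's least_squares for Matlab's lsqnonlin and a Cholesky factorization of $H_3 W^{-1} H_3^T$ for the QR factorization of $W^{-1/2} H_3^T$ (both yield a residual $f$ with $f^T f$ equal to the cost in (14)). Applied to Example 1, after flipping the Toeplitz matrix upside down into a Hankel matrix, it should reproduce the ratio 0.2603 up to the accuracy discussed above:

```python
import numpy as np
from scipy.linalg import cholesky, hankel, solve_triangular, svd
from scipy.optimize import least_squares

def banded(p, q):
    # (q - r) x q banded matrix H3 built from p = [p_1, ..., p_{r+1}]
    r = p.size - 1
    H3 = np.zeros((q - r, q))
    for i in range(q - r):
        H3[i, i:i + r + 1] = p
    return H3

def swlra_hankel(s, n, m, r, w):
    # s: antidiagonal vector vec2(S) of the n x m Hankel S; w: diagonal of W
    q = n + m - 1
    Winv = 1.0 / w

    def residual(p):
        H3 = banded(p, q)
        C = cholesky((H3 * Winv) @ H3.T, lower=True)    # H3 W^{-1} H3^T = C C^T
        return solve_triangular(C, H3 @ s, lower=True)  # f with f^T f = cost (14)

    Sbar = hankel(s[:q - r], s[q - r - 1:])   # Steps 1-2: (q-r) x (r+1) Hankel matrix
    p0 = svd(Sbar)[2][r]                      # Step 3: (r+1)-th right singular vector
    p = least_squares(residual, p0).x         # Step 4: NLLS minimization of (14)
    H3 = banded(p, q)                         # Step 5: projection (13)
    M = (H3 * Winv) @ H3.T
    return s - Winv * (H3.T @ np.linalg.solve(M, H3 @ s))

# Example 1 of [8]: row-flipping the 5 x 4 Toeplitz C gives a Hankel matrix whose
# antidiagonal vector is the reversed impulse response; the weight vector happens
# to be palindromic, so it needs no reordering.
z = np.array([4, 1, 0.3, 0.1, 0.0354, 0.013, 0.00489, 0.00187])
w = np.array([1, 2, 3, 4, 4, 3, 2, 1], dtype=float)   # squared weights w_k^2
zh = swlra_hankel(z[::-1], 5, 4, 1, w)[::-1]          # recovered impulse response
print(zh[1:] / zh[:-1])                               # each ratio ~0.2603 (cf. [8])
```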

Example 2: Consider the following LTI system of order 2:

$$w_{k+1} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} w_k, \qquad z_k = [1\ 0]\, w_k, \qquad w_0 = \begin{bmatrix} 6 \\ 1 \end{bmatrix}.$$

We arrange the data $z_k$ in the $5 \times 2$ Toeplitz matrix

$$C = \begin{bmatrix} 5 & 6 \\ 4 & 5 \\ 3 & 4 \\ 2 & 3 \\ 1 & 2 \end{bmatrix}$$

and determine the best rank 1 structure-preserving approximation $\hat C$ of $C$ by solving the corresponding SWLRA problem with weights $(w_1^2, w_2^2, \ldots, w_6^2) = (1, 2, 2, 2, 2, 1)$. We obtain the optimal solution with $\hat z_k / \hat z_{k-1} = 0.76292301$, $k = 2, \ldots, 6$, as determined in [8]. Also here the accuracy is limited in the same way: we lose accuracy, in contrast to STLNB, which attains an accuracy up to $10^{-16}$, because of the squaring effect in the cost function $f^T f$ that is minimized by algorithm SWLRA.

4.2. Statistical accuracy and numerical efficiency

To compare the statistical accuracy and the computational efficiency of the SWLRA algorithm and the STLNB algorithm on larger examples, we perform Monte-Carlo simulations. In Table I the results for a $30 \times 20$ Hankel matrix are presented, whereas Table II contains the results for a $50 \times 30$ Hankel matrix. The Monte-Carlo simulations are performed as follows. A Hankel matrix of the required rank $r = m - \bar r$ is constructed. The latter matrix is perturbed by a Hankel matrix that is constructed from a vector containing i.i.d. Gaussian noise of standard deviation $10^{-5}$. Finally, the SWLRA and the STLNB algorithm are applied to this perturbed Hankel matrix. Each Monte-Carlo simulation consists of 100 runs, i.e., it considers 100 noisy realizations. The results presented are averaged over these runs and were obtained by coding both algorithms in Matlab (version 6.1) and running them on a PC i686 with 800 MHz and 256 MB memory.

From both tables we can conclude the following. The statistical accuracy of the SWLRA algorithm is worse than that of the STLNB algorithm for small $\bar r$, but for large $\bar r$ the SWLRA algorithm performs best. From the computational point of view, the SWLRA algorithm outperforms the STLNB algorithm.
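The Monte-Carlo setup can be sketched as follows, reusing swlra_hankel from the previous sketch; the pole range, the amplitudes and the choice $W = I$ are our assumptions, since the paper does not specify them:

```python
import numpy as np
from scipy.linalg import hankel

rng = np.random.default_rng(0)
n, m, rbar = 30, 20, 4
r = m - rbar                                  # required rank r = m - rbar
q = n + m - 1
k = np.arange(q)
z_poles = rng.uniform(0.5, 0.99, r)           # assumed real, distinct poles
c = rng.standard_normal(r)                    # assumed amplitudes
s0 = (c * z_poles ** k[:, None]).sum(axis=1)  # exact rank-r Hankel data vector
errs = []
for _ in range(100):                          # 100 noisy realizations
    s = s0 + 1e-5 * rng.standard_normal(q)    # Hankel-structured perturbation
    sr = swlra_hankel(s, n, m, r, np.ones(q)) # W = I assumed
    S, R = hankel(s[:n], s[n - 1:]), hankel(sr[:n], sr[n - 1:])
    errs.append(np.linalg.norm(S - R, 'fro'))
print(np.mean(errs))                          # averaged ||S - R||_F, as in the tables
```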

Table I. Numerical results for the 30 x 20 matrix

 r̄     SWLRA ||S-R||_F   cputime (sec)    STLNB ||S-R||_F   cputime (sec)
  2     8.04e-07          6.07e-01         7.87e-07          4.83e-02
  4     8.41e-07          5.52e-01         7.19e-07          8.89e-02
  6     6.85e-07          5.00e-01         6.48e-07          1.55e-01
  8     9.36e-07          4.39e-01         5.96e-07          2.90e-01
 10     5.70e-07          3.78e-01         4.40e-07          4.18e-01
 12     4.11e-07          3.14e-01         3.82e-07          4.20e-01
 14     2.93e-07          2.53e-01         2.74e-07          2.41e-01
 16     1.45e-07          2.00e-01         1.01e-07          4.48e-01
 18     7.40e-08          1.90e-01         1.12e-04          2.05

Table II. Numerical results for the 50 x 30 matrix

 r̄     SWLRA ||S-R||_F   cputime (sec)    STLNB ||S-R||_F   cputime (sec)
  2     7.54e-07          2.43             6.76e-07          1.18e-01
  4     1.22e-06          2.27             6.67e-07          4.72e-01
  6     3.40e-06          2.11             6.38e-07          1.58
  8     1.07e-06          1.96             6.00e-07          2.07
 10     2.30e-06          1.78             5.32e-07          4.52
 12     2.07e-06          1.61             4.94e-07          4.21
 14     1.81e-06          1.45             4.39e-07          5.4
 16     9.54e-07          1.27             4.07e-07          6.11
 18     7.32e-07          1.11             3.03e-07          6.66
 20     1.37e-06          9.38e-01         2.69e-07          7.09
 22     2.58e-07          8.43e-01         2.31e-07          5.15
 24     1.37e-07          6.91e-01         1.17e-07          2.87
 26     1.39e-07          5.36e-01         9.34e-08          3.88
 28     7.40e-09          3.89e-01         7.40e-09          2.83

The behaviour of the computational time of both algorithms can be explained as follows. The number of flops for minimizing the cost function (Step 4) in algorithm SWLRA is dominated by a QR decomposition (inside the NLLS solver) whose computational cost is of the order of $2m^2(n - m/3)$ for an $n \times m$ matrix, so in our case of the order of $2(r+1)^2\big(n + m - 1 - r - (r+1)/3\big)$. As a result, the number of flops decreases for decreasing $r$, or equivalently for increasing $\bar r$. On the contrary, the cost of the STLNB algorithm is dominated by a QR factorization of an $(n + m - 1 + n\bar r) \times (n + m - 1 + m\bar r)$ matrix, requiring $2(n + m - 1 + m\bar r)^2\big(n + m - 1 + n\bar r - (n + m - 1 + m\bar r)/3\big)$ flops, which results in an increasing number of flops for increasing $\bar r$. However, in both algorithms the computational cost can be decreased significantly by exploiting the matrix structure in the computation of the QR decomposition, e.g., by using displacement rank theory [10, 11]. This is the subject of future work.
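Plugging the above estimates into the $30 \times 20$ case confirms the opposite trends of the two operation counts as $\bar r$ grows (a quick arithmetic check, not from the paper):

```python
n, m = 30, 20
for rbar in (2, 10, 18):
    r = m - rbar
    swlra = 2 * (r + 1) ** 2 * (n + m - 1 - r - (r + 1) / 3)     # falls with rbar
    cols = n + m - 1 + m * rbar
    stlnb = 2 * cols ** 2 * (n + m - 1 + n * rbar - cols / 3)    # grows with rbar
    print(rbar, f"SWLRA ~{swlra:.1e} flops", f"STLNB ~{stlnb:.1e} flops")
```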

5. Conclusions

In this paper we have developed an extension of the Weighted Low Rank Approximation introduced by Manton et al. [9] towards linearly structured matrices. For a particular type of structure, namely the Hankel structure, an algorithm was developed. The numerical accuracy of the latter algorithm was tested on benchmark problems. Furthermore, the statistical accuracy and the computational efficiency were compared for the STLNB [8] and the SWLRA algorithm. For both properties, the SWLRA algorithm performs better than the STLNB algorithm for large $\bar r$.

ACKNOWLEDGEMENTS

Prof. Sabine Van Huffel is a full professor, dr. Philippe Lemmerling is a postdoctoral researcher of the FWO (Fund for Scientific Research - Flanders) and Mieke Schuermans is a research assistant at the Katholieke Universiteit Leuven, Belgium. Our research is supported by the Research Council KUL: GOA-Mefisto 666, IDO/99/003 and IDO/02/009 (predictive computer models for medical classification problems using patient data and expert knowledge), several PhD/postdoc & fellow grants; the Flemish Government: FWO PhD/postdoc grants and projects G.0200.00 (damage detection in composites by optical fibers), G.0078.01 (structured matrices), G.0407.02 (support vector machines), G.0269.02 (magnetic resonance spectroscopic imaging), G.0270.02 (nonlinear Lp approximation), research communities (ICCoS, ANMMM); AWI: bilateral international collaboration Hungary/Poland; IWT: PhD grants; the Belgian Federal Government: DWTC (IUAP IV-02 (1996-2001) and IUAP V-22 (2002-2006): Dynamical Systems and Control: Computation, Identification & Modelling); the EU: NICONET, INTERPRET, PDT-COIL, MRS/MRI signal processing (TMR); and contract research/agreements: Data4s, IPCOS.

REFERENCES

1. Van Huffel S, Vandewalle J. The Total Least Squares Problem: Computational Aspects and Analysis. Frontiers in Applied Mathematics, Vol. 9. SIAM: Philadelphia, 1991.
2. Golub GH, Van Loan CF. An analysis of the total least squares problem. SIAM Journal on Numerical Analysis 1980; 17:883-893.
3. Abatzoglou TJ, Mendel JM, Harada GA. The constrained total least squares technique and its applications to harmonic superresolution. IEEE Transactions on Signal Processing 1991; 39:1070-1086.
4. Abatzoglou TJ, Mendel JM. Constrained total least squares. Proceedings of the IEEE International Conference on Acoustics, Speech & Signal Processing, Dallas, 1987; 1485-1488.
5. De Moor B. Total least squares for affinely structured matrices and the noisy realization problem. IEEE Transactions on Signal Processing 1994; 42:3104-3113.
6. Rosen JB, Park H, Glick J. Total least norm formulation and solution for structured problems. SIAM Journal on Matrix Analysis and Applications 1996; 17(1):110-126.
7. Chu MT, Funderlic RE, Plemmons RJ. Structured low rank approximation. Linear Algebra and Its Applications 2003; 366:157-172.
8. Van Huffel S, Park H, Rosen JB. Formulation and solution of structured total least norm problems for parameter estimation. IEEE Transactions on Signal Processing 1996; 44(10).
9. Manton JH, Mahony R, Hua Y. The geometry of weighted low rank approximations. IEEE Transactions on Signal Processing, in press, 2002.
10. Lemmerling P, Mastronardi N, Van Huffel S. Fast algorithm for solving the Hankel/Toeplitz structured total least squares problem. Numerical Algorithms 2000; 23:371-392.
11. Kailath T, Sayed AH (eds). Fast Reliable Algorithms for Matrices with Structure. SIAM: Philadelphia, 1999.