Structured Lower Rank Approximation

Moody T. Chu (NCSU), joint work with Robert E. Funderlic (NCSU) and Robert J. Plemmons (Wake Forest)

March 5, 1998

Outline

Introduction: Problem Description; Difficulties
Algebraic Structure: Algebraic Varieties; Rank Deficient Toeplitz Matrices
Constructing Lower Rank Structured Matrices: Lift and Project Method; Parameterization by SVD
Implicit Optimization: Engineers' Misconception; Simplex Search Method
Explicit Optimization: constr in MATLAB; LANCELOT on NEOS

Structure Preserving Rank Reduction Problem

Given: a target matrix $A \in \mathbb{R}^{n \times n}$; an integer $k$, $1 \le k < \operatorname{rank}(A)$; a class of matrices with linear structure; and a fixed matrix norm $\|\cdot\|$.

Find: a matrix $\hat{B}$ of rank $k$ in the structured class such that
$$\|A - \hat{B}\| = \min_{B,\ \operatorname{rank}(B)=k} \|A - B\|. \quad (1)$$

Examples of linear structure: Toeplitz or block Toeplitz matrices; Hankel or banded matrices.

Applications: signal and image processing with Toeplitz structure; the model reduction problem in speech encoding and filter design with Hankel structure; regularization of ill-posed inverse problems.

Difficulties

There is no easy way to characterize, either algebraically or analytically, a given class of structured lower rank matrices. The lack of an explicit description of the feasible set makes it difficult to apply classical optimization techniques. There has been little discussion of whether lower rank matrices with the specified structure actually exist.
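
To see the difficulty concretely: the classical best rank-$k$ approximation given by the truncated SVD (Eckart-Young) generally leaves the structured class, so unstructured tools alone do not stay in the feasible set. A minimal Python/NumPy sketch (ours, not from the slides):

```python
# Truncating the SVD of a Toeplitz matrix generally destroys the
# Toeplitz structure -- a sketch illustrating the difficulty.
import numpy as np
from scipy.linalg import toeplitz

T = toeplitz([1.0, 2, 3, 4, 5, 6])      # symmetric Toeplitz target
U, s, Vt = np.linalg.svd(T)
s[3:] = 0.0                              # keep k = 3 singular values
B = (U * s) @ Vt                         # best unstructured rank-3 approximation
print(np.allclose(B, toeplitz(B[0])))    # False: B is no longer Toeplitz
```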

An Example of Existence

Physics sometimes sheds additional light. The Toeplitz matrix
$$H := \begin{bmatrix} h_n & h_{n+1} & \cdots & h_{2n-1} \\ h_{n-1} & h_n & \cdots & h_{2n-2} \\ \vdots & & \ddots & \vdots \\ h_1 & h_2 & \cdots & h_n \end{bmatrix}, \qquad h_j := \sum_{i=1}^{k} \alpha_i z_i^j, \quad j = 1, \ldots, 2n-1,$$
where $\{\alpha_i\}$ and $\{z_i\}$ are two sequences of arbitrary nonzero numbers satisfying $z_i \ne z_j$ whenever $i \ne j$ and $k \le n$, is a Toeplitz matrix of rank $k$.

The general Toeplitz structure preserving rank reduction problem as described in (1) remains open. Existence of lower rank matrices of a specified structure does not guarantee a closest such matrix: there is no $x > 0$ for which $1/x$ is minimal. For other types of structures, the existence question is usually a hard algebraic problem.
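
A quick numerical check of this existence example (a sketch; the variable names and the particular $\alpha_i$, $z_i$ are our choices):

```python
# Build h_j = sum_i alpha_i * z_i**j for j = 1, ..., 2n-1 and form the
# Toeplitz matrix H with first row (h_n, ..., h_{2n-1}); its rank is k.
import numpy as np
from scipy.linalg import toeplitz

n, k = 6, 3
alpha = np.array([1.0, -2.0, 0.5])           # nonzero coefficients
z = np.array([1.5, -0.7, 2.0])               # distinct nonzero nodes
h = np.array([alpha @ z**j for j in range(1, 2 * n)])   # h_1, ..., h_{2n-1}
# first column (h_n, h_{n-1}, ..., h_1), first row (h_n, ..., h_{2n-1})
H = toeplitz(h[n - 1::-1], h[n - 1:])
print(np.linalg.matrix_rank(H))              # 3
```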

Another Hidden Catch

The set of all $n \times n$ matrices with rank $\le k$ is a closed set. The approximation problem
$$\min_{B,\ \operatorname{rank}(B) \le k} \|A - B\|$$
is always solvable, so long as the feasible set is nonempty. But the rank condition here is less than or equal to $k$, not necessarily exactly equal to $k$. It is possible that a given target matrix $A$ has no nearest rank $k$ structured matrix approximation, yet does have a nearest structured matrix approximation of rank $k-1$ or lower.

Our Contributions

We introduce two procedures to tackle the structure preserving rank reduction problem numerically. The procedures can be applied to problems with any linear structure and any matrix norm. We use the symmetric Toeplitz structure with the Frobenius matrix norm to illustrate the ideas.

Structure of Lower Rank Toeplitz Matrices

Identify a symmetric Toeplitz matrix by its first row:
$$T = T([t_1, \ldots, t_n]) = \begin{bmatrix} t_1 & t_2 & \cdots & t_n \\ t_2 & t_1 & \ddots & t_{n-1} \\ \vdots & \ddots & \ddots & \vdots \\ t_n & t_{n-1} & \cdots & t_1 \end{bmatrix}.$$
Let $\mathcal{T}$ denote the affine subspace of all $n \times n$ symmetric Toeplitz matrices.

Spectral decomposition of symmetric rank-$k$ matrices:
$$M = \sum_{i=1}^{k} \lambda_i\, y^{(i)} y^{(i)T}. \quad (2)$$
Writing $T = T([t_1, \ldots, t_n])$ in terms of (2) gives
$$\sum_{i=1}^{k} \lambda_i\, y_j^{(i)} y_{j+s}^{(i)} = t_{s+1}, \qquad s = 0, 1, \ldots, n-1, \quad 1 \le j \le n-s. \quad (3)$$

Lower rank matrices therefore form an algebraic variety, i.e., the solution set of a system of polynomial equations.

Some Examples

The case $k = 1$ is trivial: rank-one symmetric Toeplitz matrices form two simple one-parameter families,
$$T = \lambda_1 T([1, \ldots, 1]) \quad \text{or} \quad T = \lambda_1 T([1, -1, 1, \ldots, (-1)^{n-1}]),$$
with arbitrary $\lambda_1 \ne 0$.

For $4 \times 4$ symmetric Toeplitz matrices of rank 2, there are 10 unknowns in 6 equations. The system can be solved explicitly for $\lambda_1$, $\lambda_2$ and several eigenvector components in terms of the remaining unknowns, but the resulting expressions already fill a slide. An explicit description of the algebraic equations for higher dimensional lower rank symmetric Toeplitz matrices becomes unbearably complicated.

Let's See It!

Rank deficient $T([t_1, t_2, t_3])$:
$$\det(T) = (t_1 - t_3)\,(t_1^2 + t_1 t_3 - 2 t_2^2) = 0,$$
a union of two algebraic varieties.

Figure 1: Lower rank, symmetric, Toeplitz matrices of dimension 3, identified as a surface in $\mathbb{R}^3$ with coordinates $(t_1, t_2, t_3)$.

The structured lower rank approximation problem can have multiple local solutions.
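
The determinant factorization is easy to verify symbolically; a small sympy check (our sketch):

```python
# Verify det(T([t1, t2, t3])) = (t1 - t3) * (t1^2 + t1*t3 - 2*t2^2)
# for the 3 x 3 symmetric Toeplitz matrix.
import sympy as sp

t1, t2, t3 = sp.symbols('t1 t2 t3')
T = sp.Matrix([[t1, t2, t3],
               [t2, t1, t2],
               [t3, t2, t1]])
print(sp.factor(T.det()))   # (t1 - t3)*(t1**2 + t1*t3 - 2*t2**2)
```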

Constructing Lower Rank Toeplitz Matrices

Idea: Rank-$k$ matrices in $\mathbb{R}^{n \times n}$ form a surface $\mathcal{R}(k)$. Rank-$k$ Toeplitz matrices $= \mathcal{R}(k) \cap \mathcal{T}$.

Two approaches:

Parameterization by SVD: Identify $M \in \mathcal{R}(k)$ by the triplet $(U, \Sigma, V)$ of its singular value decomposition $M = U \Sigma V^T$, where $U$ and $V$ are orthogonal matrices and $\Sigma = \operatorname{diag}\{s_1, \ldots, s_k, 0, \ldots, 0\}$ with $s_1 \ge \cdots \ge s_k > 0$; then enforce the structure.

Alternate projections between $\mathcal{R}(k)$ and $\mathcal{T}$ to find intersections (Cheney & Goldstein '59, Cadzow '88).

Lift and Project Algorithm

Given $A^{(0)} = A$, repeat the following projections until convergence:

LIFT. Compute $B^{(\nu)} \in \mathcal{R}(k)$ nearest to $A^{(\nu)}$: from $A^{(\nu)} \in \mathcal{T}$, first compute its SVD $A^{(\nu)} = U^{(\nu)} \Sigma^{(\nu)} V^{(\nu)T}$; then replace $\Sigma^{(\nu)}$ by $\operatorname{diag}\{s_1^{(\nu)}, \ldots, s_k^{(\nu)}, 0, \ldots, 0\}$ and define $B^{(\nu)} := U^{(\nu)} \Sigma^{(\nu)} V^{(\nu)T}$.

PROJECT. Compute $A^{(\nu+1)} \in \mathcal{T}$ nearest to $B^{(\nu)}$: form $A^{(\nu+1)}$ by replacing each diagonal of $B^{(\nu)}$ with the average of its entries.

The general approach remains applicable to any other linear structure, and symmetry can be enforced; the only thing that needs to be modified is the projection in the second (PROJECT) step.
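
A minimal sketch of the iteration for the symmetric Toeplitz case with the Frobenius norm (the function names are ours; the steps follow LIFT and PROJECT above):

```python
# Lift-and-project for symmetric Toeplitz structure, Frobenius norm.
import numpy as np
from scipy.linalg import toeplitz

def lift(A, k):
    """LIFT: nearest matrix of rank <= k, via a truncated SVD."""
    U, s, Vt = np.linalg.svd(A)
    s[k:] = 0.0                      # keep the k largest singular values
    return (U * s) @ Vt

def project(B):
    """PROJECT: nearest symmetric Toeplitz matrix, obtained by
    averaging the entries of each super/subdiagonal pair of B."""
    n = B.shape[0]
    t = [np.mean(np.r_[np.diagonal(B, s), np.diagonal(B, -s)])
         for s in range(n)]
    return toeplitz(t)               # T([t_1, ..., t_n]) from its first row

def lift_and_project(A0, k, tol=1e-12, maxit=10000):
    A = A0.copy()
    for _ in range(maxit):
        B = lift(A, k)
        A_new = project(B)
        if np.linalg.norm(A_new - A, 'fro') < tol:
            break
        A = A_new
    return A_new
```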

Geometric Sketch

Figure 2: Algorithm 1 alternates between the surface $\mathcal{R}(k)$ of lower rank matrices and the subspace $\mathcal{T}$ of Toeplitz matrices, $A^{(\nu)} \mapsto B^{(\nu)} \mapsto A^{(\nu+1)}$, moving toward their intersection.

Black-box Function

Descent property:
$$\|A^{(\nu+1)} - B^{(\nu+1)}\|_F \le \|A^{(\nu+1)} - B^{(\nu)}\|_F \le \|A^{(\nu)} - B^{(\nu)}\|_F.$$
The descent is with respect to the Frobenius norm, which is not necessarily the norm used in the structure preserving rank reduction problem. If all $A^{(\nu)}$ are distinct, then the iteration converges to a Toeplitz matrix of rank $k$. In principle, the iteration could be trapped at an impasse where $A^{(\nu)}$ and $B^{(\nu)}$ no longer improve, but this has not been experienced in practice.

The lift and project iteration provides a means to define a black-box function
$$P : \mathcal{T} \longrightarrow \mathcal{T} \cap \mathcal{R}(k).$$
$P(T)$ is presumably piecewise continuous, since all the projections involved are continuous.
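
The descent property is easy to observe numerically; this check reuses lift() and project() from the sketch above (assumed in scope):

```python
# The Frobenius gap ||A^(nu) - B^(nu)||_F never increases.
import numpy as np
from scipy.linalg import toeplitz

T0 = toeplitz([1.0, 2, 3, 4, 5, 6])
A, gaps = T0, []
for _ in range(50):
    B = lift(A, 3)                     # B in R(k)
    gaps.append(np.linalg.norm(A - B, 'fro'))
    A = project(B)                     # back to the Toeplitz subspace
assert all(g1 >= g2 - 1e-12 for g1, g2 in zip(gaps, gaps[1:]))
```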

The Graph of $P(T)$

Consider $P : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$. Use the $xy$-plane to represent the domain of $P$ for symmetric Toeplitz matrices $T(t_1, t_2)$, and the $z$-axis to represent the images $p_{11}(T)$ and $p_{12}(T)$, respectively.

Figure 3: Graph of $P(T)$ for 2-dimensional symmetric Toeplitz $T$; one surface panel for each of the entries $p_{11}$ and $p_{12}$ over the $(t_1, t_2)$-plane.

Toeplitz matrices of the form $T(t_1, 0)$ or $T(0, t_2)$, corresponding to points on the axes, converge to the zero matrix.

Implicit Optimization

Implicit formulation:
$$\min_{T = \operatorname{toeplitz}(t_1, \ldots, t_n)} \|T_0 - P(T)\|, \quad (4)$$
where $T_0$ is the given target matrix. $P(T)$, regarded as a black-box function evaluation, provides a handle by which to manipulate the objective function $f(T) := \|T_0 - P(T)\|$. The norm used in (4) can be any matrix norm.

Engineers' misconception: $P(T)$ is not necessarily the closest rank-$k$ Toeplitz matrix to $T$. In practice, $P(T_0)$ has been used "as a cleansing process whereby any corrupting noise, measurement distortion or theoretical mismatch present in the given data set (namely, $T_0$) is removed." More needs to be done in order to find the closest lower rank Toeplitz approximation to the given $T_0$, since $P(T_0)$ is merely known to lie in the feasible set.
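
A sketch of formulation (4) in the Frobenius norm, treating $P$ as a black box. The slides use MATLAB's fmins; SciPy's Nelder-Mead simplex search is the analogous derivative-free method here, and lift_and_project is the sketch defined earlier (assumed in scope):

```python
# Implicit optimization: minimize f(t) = ||T0 - P(T(t))||_F over the
# first row t of the symmetric Toeplitz matrix T(t).
import numpy as np
from scipy.linalg import toeplitz
from scipy.optimize import minimize

def implicit_fit(T0, k):
    f = lambda t: np.linalg.norm(T0 - lift_and_project(toeplitz(t), k), 'fro')
    res = minimize(f, T0[0], method='Nelder-Mead',   # start from T(0) = T0
                   options={'xatol': 1e-6, 'fatol': 1e-6})
    return lift_and_project(toeplitz(res.x), k)

T3 = implicit_fit(toeplitz([1.0, 2, 3, 4, 5, 6]), k=3)
```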

Numerical Experiment

An ad hoc optimization technique: the simplex search method of Nelder and Mead requires only function evaluations. Routine fmins in MATLAB, which employs the simplex search method, is ready for use in our application.

An example: suppose $T_0 = T([1, 2, 3, 4, 5, 6])$. Start with $T^{(0)} = T_0$, and set the worst case precision to $10^{-6}$. We are able to calculate all lower rank matrices while maintaining the symmetric Toeplitz structure. Always so?

The smallest calculated singular values are nearly machine-zero $\Longrightarrow$ $T_k$ is computationally of rank $k$. $T_k$ is only a local solution. Still, $\|T_k - T_0\| < \|P(T_0) - T_0\|$, which, though it represents only a slight improvement, clearly indicates that $P(T_0)$ alone does not give rise to an optimal solution.

Table 1: Test results for a case of $n = 6$ symmetric Toeplitz structure. For each rank $k = 5, 4, 3, 2, 1$, the table reports the number of simplex iterations, the number of SVD calls, the optimal first row, the distance $\|T_0 - T_k\|$, and the singular values of $T_k$. The distance grows as the rank drops (from 0.5868 at $k = 5$ to 8.5959 at $k = 1$); the last $n - k$ singular values of each $T_k$ sit at machine-zero level ($\approx 10^{-14}$), confirming the computed rank; and at $k = 1$ the optimal first row is constant, consistent with the rank-one family $T = \lambda_1 T([1, \ldots, 1])$.

Explicit Optimization

It is difficult to compute the gradient of $P(T)$. There are other ways to parameterize structured lower rank matrices: use eigenvalues and eigenvectors for symmetric matrices; use singular values and singular vectors for general matrices. This parameterization is robust, but it might overdetermine the problem.

An Illustration

Define
$$M(\lambda_1, \ldots, \lambda_k, y^{(1)}, \ldots, y^{(k)}) := \sum_{i=1}^{k} \lambda_i\, y^{(i)} y^{(i)T}.$$
Reformulate the symmetric Toeplitz structure preserving rank reduction problem explicitly as
$$\min \|T_0 - M(\lambda_1, \ldots, \lambda_k, y^{(1)}, \ldots, y^{(k)})\| \quad (5)$$
subject to
$$m_{j,\,j+s-1} = m_{1,\,s}, \qquad s = 1, \ldots, n-1, \quad j = 2, \ldots, n-s+1, \quad (6)$$
if $M = [m_{ij}]$.

The objective function in (5) is described in terms of the nonzero eigenvalues $\lambda_1, \ldots, \lambda_k$ and the corresponding eigenvectors $y^{(1)}, \ldots, y^{(k)}$ of $M$. The constraints in (6) are used to ensure that $M$ is symmetric and Toeplitz. For other types of structures, we only need to modify the constraint statement accordingly. The norm used in (5) can be arbitrary, but is fixed.
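
A sketch of (5)-(6) as a constrained program. The slides use MATLAB's constr; SciPy's SLSQP plays the role of the sequential quadratic programming routine here, and the variable packing and starting point are our choices:

```python
# Explicit optimization over (lambda_1..lambda_k, y^(1)..y^(k)) with
# equality constraints forcing M to be Toeplitz (symmetry is automatic).
import numpy as np
from scipy.optimize import minimize

def explicit_fit(T0, k):
    n = T0.shape[0]

    def assemble(x):
        lam, Y = x[:k], x[k:].reshape(k, n)
        return sum(l * np.outer(y, y) for l, y in zip(lam, Y))

    def objective(x):
        return np.linalg.norm(T0 - assemble(x), 'fro')

    def toeplitz_residuals(x):           # constraints (6), zero-based indices
        M = assemble(x)
        return np.array([M[j, j + s] - M[0, s]
                         for s in range(n) for j in range(1, n - s)])

    # start from the truncated eigendecomposition of T0
    w, V = np.linalg.eigh(T0)
    idx = np.argsort(-np.abs(w))[:k]
    x0 = np.concatenate([w[idx], V[:, idx].T.ravel()])
    res = minimize(objective, x0, method='SLSQP',
                   constraints={'type': 'eq', 'fun': toeplitz_residuals})
    return assemble(res.x), res
```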

Redundant Constraints

Symmetric centrosymmetric matrices have special spectral properties: $\lceil n/2 \rceil$ of the eigenvectors are symmetric and $\lfloor n/2 \rfloor$ are skew-symmetric. A vector $v = [v_i] \in \mathbb{R}^n$ is symmetric (or skew-symmetric) if $v_i = v_{n+1-i}$ (or $v_i = -v_{n+1-i}$).

Symmetric Toeplitz matrices are symmetric and centrosymmetric. The formulation in (5) does not take this spectral structure of the eigenvectors $y^{(i)}$ into account: more variables than needed have been introduced. We may also have overlooked internal relationships among the $n(n-1)/2$ equality constraints, inadvertently adding computational complexity.
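
The claimed eigenvector structure is easy to confirm numerically for a generic symmetric Toeplitz matrix (a sketch; it assumes distinct eigenvalues, so each eigenvector is determined up to sign):

```python
# Every eigenvector of a (generic) symmetric Toeplitz matrix is
# symmetric or skew-symmetric about its midpoint.
import numpy as np
from scipy.linalg import toeplitz

n = 6
T = toeplitz(np.random.randn(n))         # random symmetric Toeplitz
_, V = np.linalg.eigh(T)
for v in V.T:
    r = v[::-1]                          # reversed vector
    print('symmetric' if np.allclose(v, r)
          else 'skew-symmetric' if np.allclose(v, -r)
          else 'neither')
# prints 'symmetric' ceil(n/2) = 3 times and 'skew-symmetric' 3 times
```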

Using constr in MATLAB

Routine constr in MATLAB: uses a sequential quadratic programming method; solves the Kuhn-Tucker equations by a quasi-Newton updating procedure; can estimate derivative information by finite difference approximations; is readily available in the Optimization Toolbox.

Our experiments: We use the same data as in the implicit formulation. The case $k = 5$ is computationally the same as before. We have trouble in the cases $k = 2$ or $k = 3$: 1. the iterations do not improve the approximations at all; 2. yet MATLAB reports that the optimization terminated successfully.

Using LANCELOT on NEOS

The reasons for the failure of MATLAB are not clear: the constraints might no longer be linearly independent; the termination criteria in constr might not be adequate; the difficult geometry may simply mean hard-to-satisfy constraints.

We therefore turn to a more sophisticated optimization package, LANCELOT: a standard Fortran 77 package for solving large-scale nonlinearly constrained optimization problems. It breaks the functions down into sums of element functions to obtain sparse Hessian matrices. It is a huge code; see http://www.rl.ac.uk/departments/ccd/numerical/lancelot/sif/sifhtml.html. It is available on the NEOS Server through a socket-based interface and uses the ADIFOR automatic differentiation tool.

LANCELOT Works

LANCELOT finds optimal solutions of problem (5) for all values of $k$. The results from LANCELOT agree, up to the required accuracy $10^{-6}$, with those from fmins. Rank affects the computational cost nonlinearly.

rank $k$            | 5  | 4  | 3  | 2  | 1
# of variables ($= k + kn$) | 35 | 28 | 21 | 14 | 7

Table 2: Cost overhead in using LANCELOT for $n = 6$; the function/constraint call counts and total running times, also recorded, vary nonlinearly with the rank.

Conclusions

Structure preserving rank reduction problems arise in many important applications, particularly in the broad areas of signal and image processing. Constructing the nearest approximation of a given matrix by one of any specified rank and any linear structure is difficult in general. We have proposed two ways to formulate these problems as standard optimization computations, so it is now possible to tackle them numerically with standard optimization packages. The ideas were illustrated using the Toeplitz structure with the Frobenius norm, but our approach readily generalizes to rank reduction problems for any given linear structure and any given matrix norm.

Table 3: A typical log of intermediate results from constr (columns f-count, FUNCTION, MAX{g}, STEP, Procedures). Throughout the run the objective value FUNCTION remains frozen at 0.95896, the maximum constraint violation MAX{g} stays at machine-precision level, and the step length fluctuates between roughly $10^{-9}$ and order one; the message "Hessian modified twice" recurs throughout, yet the log ends with "Optimization Converged Successfully".