Structured Lower Rank Approximation

Moody T. Chu (NCSU), joint work with Robert E. Funderlic (NCSU) and Robert J. Plemmons (Wake Forest)

March 5, 1998

Outline

Introduction: Problem Description; Difficulties
Algebraic Structure: Algebraic Varieties; Rank Deficient Toeplitz Matrices
Constructing Lower Rank Structured Matrices: Lift and Project Method; Parameterization by SVD
Implicit Optimization: Engineers' Misconception; Simplex Search Method
Explicit Optimization: constr in MATLAB; LANCELOT on NEOS

Structure Preserving Rank Reduction Problem

Given: a target matrix $A \in \mathbb{R}^{n \times n}$; an integer $k$, $1 \le k < \operatorname{rank}(A)$; a class of matrices with linear structure; and a fixed matrix norm $\|\cdot\|$.

Find: a matrix $\hat{B}$ of rank $k$ in the structured class such that
$$\|A - \hat{B}\| = \min_{B,\ \operatorname{rank}(B)=k} \|A - B\|. \quad (1)$$

Examples of linear structure: Toeplitz or block Toeplitz matrices; Hankel or banded matrices.

Applications: signal and image processing with Toeplitz structure; the model reduction problem in speech encoding and filter design with Hankel structure; regularization of ill-posed inverse problems.

Difficulties

There is no easy way to characterize, either algebraically or analytically, a given class of structured lower rank matrices. The lack of an explicit description of the feasible set makes it difficult to apply classical optimization techniques. There has been little discussion of whether lower rank matrices with the specified structure actually exist.
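
To see the difficulty concretely: the classical best rank-$k$ approximation given by the truncated SVD (Eckart-Young) generally leaves the structured class, so unstructured tools alone do not stay in the feasible set. A minimal Python/NumPy sketch (ours, not from the slides):

```python
# Truncating the SVD of a Toeplitz matrix generally destroys the
# Toeplitz structure -- a sketch illustrating the difficulty.
import numpy as np
from scipy.linalg import toeplitz

T = toeplitz([1.0, 2, 3, 4, 5, 6])      # symmetric Toeplitz target
U, s, Vt = np.linalg.svd(T)
s[3:] = 0.0                              # keep k = 3 singular values
B = (U * s) @ Vt                         # best unstructured rank-3 approximation
print(np.allclose(B, toeplitz(B[0])))    # False: B is no longer Toeplitz
```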

An Example of Existence

Physics sometimes sheds additional light. The Toeplitz matrix
$$H := \begin{bmatrix} h_n & h_{n+1} & \cdots & h_{2n-1} \\ h_{n-1} & h_n & \cdots & h_{2n-2} \\ \vdots & & \ddots & \vdots \\ h_1 & h_2 & \cdots & h_n \end{bmatrix}, \qquad h_j := \sum_{i=1}^{k} \alpha_i z_i^j, \quad j = 1, \ldots, 2n-1,$$
where $\{\alpha_i\}$ and $\{z_i\}$ are two sequences of arbitrary nonzero numbers satisfying $z_i \ne z_j$ whenever $i \ne j$ and $k \le n$, is a Toeplitz matrix of rank $k$.

The general Toeplitz structure preserving rank reduction problem as described in (1) remains open. Existence of lower rank matrices of a specified structure does not guarantee a closest such matrix: there is no $x > 0$ for which $1/x$ is minimal. For other types of structures, the existence question is usually a hard algebraic problem.
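
A quick numerical check of this existence example (a sketch; the variable names and the particular $\alpha_i$, $z_i$ are our choices):

```python
# Build h_j = sum_i alpha_i * z_i**j for j = 1, ..., 2n-1 and form the
# Toeplitz matrix H with first row (h_n, ..., h_{2n-1}); its rank is k.
import numpy as np
from scipy.linalg import toeplitz

n, k = 6, 3
alpha = np.array([1.0, -2.0, 0.5])           # nonzero coefficients
z = np.array([1.5, -0.7, 2.0])               # distinct nonzero nodes
h = np.array([alpha @ z**j for j in range(1, 2 * n)])   # h_1, ..., h_{2n-1}
# first column (h_n, h_{n-1}, ..., h_1), first row (h_n, ..., h_{2n-1})
H = toeplitz(h[n - 1::-1], h[n - 1:])
print(np.linalg.matrix_rank(H))              # 3
```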

Another Hidden Catch

The set of all $n \times n$ matrices with rank $\le k$ is a closed set. The approximation problem
$$\min_{B,\ \operatorname{rank}(B) \le k} \|A - B\|$$
is always solvable, so long as the feasible set is nonempty. But the rank condition here is less than or equal to $k$, not necessarily exactly equal to $k$. It is possible that a given target matrix $A$ has no nearest rank $k$ structured matrix approximation, yet does have a nearest structured matrix approximation of rank $k-1$ or lower.

Our Contributions

We introduce two procedures to tackle the structure preserving rank reduction problem numerically. The procedures can be applied to problems with any linear structure and any matrix norm. We use the symmetric Toeplitz structure with the Frobenius matrix norm to illustrate the ideas.

Structure of Lower Rank Toeplitz Matrices

Identify a symmetric Toeplitz matrix by its first row:
$$T = T([t_1, \ldots, t_n]) = \begin{bmatrix} t_1 & t_2 & \cdots & t_n \\ t_2 & t_1 & \ddots & t_{n-1} \\ \vdots & \ddots & \ddots & \vdots \\ t_n & t_{n-1} & \cdots & t_1 \end{bmatrix}.$$
Let $\mathcal{T}$ denote the affine subspace of all $n \times n$ symmetric Toeplitz matrices.

Spectral decomposition of symmetric rank-$k$ matrices:
$$M = \sum_{i=1}^{k} \lambda_i\, y^{(i)} y^{(i)T}. \quad (2)$$
Writing $T = T([t_1, \ldots, t_n])$ in terms of (2) gives
$$\sum_{i=1}^{k} \lambda_i\, y_j^{(i)} y_{j+s}^{(i)} = t_{s+1}, \qquad s = 0, 1, \ldots, n-1, \quad 1 \le j \le n-s. \quad (3)$$

Lower rank matrices therefore form an algebraic variety, i.e., the solution set of a system of polynomial equations.

Some Examples

The case $k = 1$ is trivial: rank-one symmetric Toeplitz matrices form two simple one-parameter families,
$$T = \lambda_1 T([1, \ldots, 1]) \quad \text{or} \quad T = \lambda_1 T([1, -1, 1, \ldots, (-1)^{n-1}]),$$
with arbitrary $\lambda_1 \ne 0$.

For $4 \times 4$ symmetric Toeplitz matrices of rank 2, there are 10 unknowns in 6 equations. The system can be solved explicitly for $\lambda_1$, $\lambda_2$ and several eigenvector components in terms of the remaining unknowns, but the resulting expressions already fill a slide. An explicit description of the algebraic equations for higher dimensional lower rank symmetric Toeplitz matrices becomes unbearably complicated.

Let's See It!

Rank deficient $T([t_1, t_2, t_3])$:
$$\det(T) = (t_1 - t_3)\,(t_1^2 + t_1 t_3 - 2 t_2^2) = 0,$$
a union of two algebraic varieties.

Figure 1: Lower rank, symmetric, Toeplitz matrices of dimension 3, identified as a surface in $\mathbb{R}^3$ with coordinates $(t_1, t_2, t_3)$.

The structured lower rank approximation problem can have multiple local solutions.
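
The determinant factorization is easy to verify symbolically; a small sympy check (our sketch):

```python
# Verify det(T([t1, t2, t3])) = (t1 - t3) * (t1^2 + t1*t3 - 2*t2^2)
# for the 3 x 3 symmetric Toeplitz matrix.
import sympy as sp

t1, t2, t3 = sp.symbols('t1 t2 t3')
T = sp.Matrix([[t1, t2, t3],
               [t2, t1, t2],
               [t3, t2, t1]])
print(sp.factor(T.det()))   # (t1 - t3)*(t1**2 + t1*t3 - 2*t2**2)
```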

Constructing Lower Rank Toeplitz Matrices

Idea: Rank-$k$ matrices in $\mathbb{R}^{n \times n}$ form a surface $\mathcal{R}(k)$. Rank-$k$ Toeplitz matrices $= \mathcal{R}(k) \cap \mathcal{T}$.

Two approaches:

Parameterization by SVD: Identify $M \in \mathcal{R}(k)$ by the triplet $(U, \Sigma, V)$ of its singular value decomposition $M = U \Sigma V^T$, where $U$ and $V$ are orthogonal matrices and $\Sigma = \operatorname{diag}\{s_1, \ldots, s_k, 0, \ldots, 0\}$ with $s_1 \ge \cdots \ge s_k > 0$; then enforce the structure.

Alternate projections between $\mathcal{R}(k)$ and $\mathcal{T}$ to find intersections (Cheney & Goldstein '59, Cadzow '88).

Lift and Project Algorithm

Given $A^{(0)} = A$, repeat the following projections until convergence:

LIFT. Compute $B^{(\nu)} \in \mathcal{R}(k)$ nearest to $A^{(\nu)}$: from $A^{(\nu)} \in \mathcal{T}$, first compute its SVD $A^{(\nu)} = U^{(\nu)} \Sigma^{(\nu)} V^{(\nu)T}$; then replace $\Sigma^{(\nu)}$ by $\operatorname{diag}\{s_1^{(\nu)}, \ldots, s_k^{(\nu)}, 0, \ldots, 0\}$ and define $B^{(\nu)} := U^{(\nu)} \Sigma^{(\nu)} V^{(\nu)T}$.

PROJECT. Compute $A^{(\nu+1)} \in \mathcal{T}$ nearest to $B^{(\nu)}$: form $A^{(\nu+1)}$ by replacing each diagonal of $B^{(\nu)}$ with the average of its entries.

The general approach remains applicable to any other linear structure, and symmetry can be enforced; the only thing that needs to be modified is the projection in the second (PROJECT) step.
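
A minimal sketch of the iteration for the symmetric Toeplitz case with the Frobenius norm (the function names are ours; the steps follow LIFT and PROJECT above):

```python
# Lift-and-project for symmetric Toeplitz structure, Frobenius norm.
import numpy as np
from scipy.linalg import toeplitz

def lift(A, k):
    """LIFT: nearest matrix of rank <= k, via a truncated SVD."""
    U, s, Vt = np.linalg.svd(A)
    s[k:] = 0.0                      # keep the k largest singular values
    return (U * s) @ Vt

def project(B):
    """PROJECT: nearest symmetric Toeplitz matrix, obtained by
    averaging the entries of each super/subdiagonal pair of B."""
    n = B.shape[0]
    t = [np.mean(np.r_[np.diagonal(B, s), np.diagonal(B, -s)])
         for s in range(n)]
    return toeplitz(t)               # T([t_1, ..., t_n]) from its first row

def lift_and_project(A0, k, tol=1e-12, maxit=10000):
    A = A0.copy()
    for _ in range(maxit):
        B = lift(A, k)
        A_new = project(B)
        if np.linalg.norm(A_new - A, 'fro') < tol:
            break
        A = A_new
    return A_new
```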

Geometric Sketch

Figure 2: Algorithm 1 alternates between the surface $\mathcal{R}(k)$ of lower rank matrices and the subspace $\mathcal{T}$ of Toeplitz matrices, $A^{(\nu)} \mapsto B^{(\nu)} \mapsto A^{(\nu+1)}$, moving toward their intersection.

Black-box Function

Descent property:
$$\|A^{(\nu+1)} - B^{(\nu+1)}\|_F \le \|A^{(\nu+1)} - B^{(\nu)}\|_F \le \|A^{(\nu)} - B^{(\nu)}\|_F.$$
The descent is with respect to the Frobenius norm, which is not necessarily the norm used in the structure preserving rank reduction problem. If all $A^{(\nu)}$ are distinct, then the iteration converges to a Toeplitz matrix of rank $k$. In principle, the iteration could be trapped at an impasse where $A^{(\nu)}$ and $B^{(\nu)}$ no longer improve, but this has not been experienced in practice.

The lift and project iteration provides a means to define a black-box function
$$P : \mathcal{T} \longrightarrow \mathcal{T} \cap \mathcal{R}(k).$$
$P(T)$ is presumably piecewise continuous, since all the projections involved are continuous.
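
The descent property is easy to observe numerically; this check reuses lift() and project() from the sketch above (assumed in scope):

```python
# The Frobenius gap ||A^(nu) - B^(nu)||_F never increases.
import numpy as np
from scipy.linalg import toeplitz

T0 = toeplitz([1.0, 2, 3, 4, 5, 6])
A, gaps = T0, []
for _ in range(50):
    B = lift(A, 3)                     # B in R(k)
    gaps.append(np.linalg.norm(A - B, 'fro'))
    A = project(B)                     # back to the Toeplitz subspace
assert all(g1 >= g2 - 1e-12 for g1, g2 in zip(gaps, gaps[1:]))
```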

The Graph of $P(T)$

Consider $P : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$. Use the $xy$-plane to represent the domain of $P$ for symmetric Toeplitz matrices $T(t_1, t_2)$, and the $z$-axis to represent the images $p_{11}(T)$ and $p_{12}(T)$, respectively.

Figure 3: Graph of $P(T)$ for 2-dimensional symmetric Toeplitz $T$; one surface panel for each of the entries $p_{11}$ and $p_{12}$ over the $(t_1, t_2)$-plane.

Toeplitz matrices of the form $T(t_1, 0)$ or $T(0, t_2)$, corresponding to points on the axes, converge to the zero matrix.

Implicit Optimization

Implicit formulation:
$$\min_{T = \operatorname{toeplitz}(t_1, \ldots, t_n)} \|T_0 - P(T)\|, \quad (4)$$
where $T_0$ is the given target matrix. $P(T)$, regarded as a black-box function evaluation, provides a handle by which to manipulate the objective function $f(T) := \|T_0 - P(T)\|$. The norm used in (4) can be any matrix norm.

Engineers' misconception: $P(T)$ is not necessarily the closest rank-$k$ Toeplitz matrix to $T$. In practice, $P(T_0)$ has been used "as a cleansing process whereby any corrupting noise, measurement distortion or theoretical mismatch present in the given data set (namely, $T_0$) is removed." More needs to be done in order to find the closest lower rank Toeplitz approximation to the given $T_0$, since $P(T_0)$ is merely known to lie in the feasible set.
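
A sketch of formulation (4) in the Frobenius norm, treating $P$ as a black box. The slides use MATLAB's fmins; SciPy's Nelder-Mead simplex search is the analogous derivative-free method here, and lift_and_project is the sketch defined earlier (assumed in scope):

```python
# Implicit optimization: minimize f(t) = ||T0 - P(T(t))||_F over the
# first row t of the symmetric Toeplitz matrix T(t).
import numpy as np
from scipy.linalg import toeplitz
from scipy.optimize import minimize

def implicit_fit(T0, k):
    f = lambda t: np.linalg.norm(T0 - lift_and_project(toeplitz(t), k), 'fro')
    res = minimize(f, T0[0], method='Nelder-Mead',   # start from T(0) = T0
                   options={'xatol': 1e-6, 'fatol': 1e-6})
    return lift_and_project(toeplitz(res.x), k)

T3 = implicit_fit(toeplitz([1.0, 2, 3, 4, 5, 6]), k=3)
```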

Numerical Experiment

An ad hoc optimization technique: the simplex search method of Nelder and Mead requires only function evaluations. Routine fmins in MATLAB, which employs the simplex search method, is ready for use in our application.

An example: suppose $T_0 = T([1, 2, 3, 4, 5, 6])$. Start with $T^{(0)} = T_0$, and set the worst case precision to $10^{-6}$. We are able to calculate all lower rank matrices while maintaining the symmetric Toeplitz structure. Always so?

The smallest calculated singular values are nearly machine-zero $\Longrightarrow$ $T_k$ is computationally of rank $k$. $T_k$ is only a local solution. Still, $\|T_k - T_0\| < \|P(T_0) - T_0\|$, which, though it represents only a slight improvement, clearly indicates that $P(T_0)$ alone does not give rise to an optimal solution.

Table 1: Test results for a case of $n = 6$ symmetric Toeplitz structure. For each rank $k = 5, 4, 3, 2, 1$, the table reports the number of simplex iterations, the number of SVD calls, the optimal first row, the distance $\|T_0 - T_k\|$, and the singular values of $T_k$. The distance grows as the rank drops (from 0.5868 at $k = 5$ to 8.5959 at $k = 1$); the last $n - k$ singular values of each $T_k$ sit at machine-zero level ($\approx 10^{-14}$), confirming the computed rank; and at $k = 1$ the optimal first row is constant, consistent with the rank-one family $T = \lambda_1 T([1, \ldots, 1])$.

Explicit Optimization

It is difficult to compute the gradient of $P(T)$. There are other ways to parameterize structured lower rank matrices: use eigenvalues and eigenvectors for symmetric matrices; use singular values and singular vectors for general matrices. This parameterization is robust, but it might overdetermine the problem.

An Illustration

Define
$$M(\lambda_1, \ldots, \lambda_k, y^{(1)}, \ldots, y^{(k)}) := \sum_{i=1}^{k} \lambda_i\, y^{(i)} y^{(i)T}.$$
Reformulate the symmetric Toeplitz structure preserving rank reduction problem explicitly as
$$\min \|T_0 - M(\lambda_1, \ldots, \lambda_k, y^{(1)}, \ldots, y^{(k)})\| \quad (5)$$
subject to
$$m_{j,\,j+s-1} = m_{1,\,s}, \qquad s = 1, \ldots, n-1, \quad j = 2, \ldots, n-s+1, \quad (6)$$
if $M = [m_{ij}]$.

The objective function in (5) is described in terms of the nonzero eigenvalues $\lambda_1, \ldots, \lambda_k$ and the corresponding eigenvectors $y^{(1)}, \ldots, y^{(k)}$ of $M$. The constraints in (6) are used to ensure that $M$ is symmetric and Toeplitz. For other types of structures, we only need to modify the constraint statement accordingly. The norm used in (5) can be arbitrary, but is fixed.
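
A sketch of (5)-(6) as a constrained program. The slides use MATLAB's constr; SciPy's SLSQP plays the role of the sequential quadratic programming routine here, and the variable packing and starting point are our choices:

```python
# Explicit optimization over (lambda_1..lambda_k, y^(1)..y^(k)) with
# equality constraints forcing M to be Toeplitz (symmetry is automatic).
import numpy as np
from scipy.optimize import minimize

def explicit_fit(T0, k):
    n = T0.shape[0]

    def assemble(x):
        lam, Y = x[:k], x[k:].reshape(k, n)
        return sum(l * np.outer(y, y) for l, y in zip(lam, Y))

    def objective(x):
        return np.linalg.norm(T0 - assemble(x), 'fro')

    def toeplitz_residuals(x):           # constraints (6), zero-based indices
        M = assemble(x)
        return np.array([M[j, j + s] - M[0, s]
                         for s in range(n) for j in range(1, n - s)])

    # start from the truncated eigendecomposition of T0
    w, V = np.linalg.eigh(T0)
    idx = np.argsort(-np.abs(w))[:k]
    x0 = np.concatenate([w[idx], V[:, idx].T.ravel()])
    res = minimize(objective, x0, method='SLSQP',
                   constraints={'type': 'eq', 'fun': toeplitz_residuals})
    return assemble(res.x), res
```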

Redundant Constraints

Symmetric centrosymmetric matrices have special spectral properties: $\lceil n/2 \rceil$ of the eigenvectors are symmetric and $\lfloor n/2 \rfloor$ are skew-symmetric. A vector $v = [v_i] \in \mathbb{R}^n$ is symmetric (or skew-symmetric) if $v_i = v_{n+1-i}$ (or $v_i = -v_{n+1-i}$).

Symmetric Toeplitz matrices are symmetric and centrosymmetric. The formulation in (5) does not take this spectral structure of the eigenvectors $y^{(i)}$ into account: more variables than needed have been introduced. We may also have overlooked internal relationships among the $n(n-1)/2$ equality constraints, inadvertently adding computational complexity.
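
The claimed eigenvector structure is easy to confirm numerically for a generic symmetric Toeplitz matrix (a sketch; it assumes distinct eigenvalues, so each eigenvector is determined up to sign):

```python
# Every eigenvector of a (generic) symmetric Toeplitz matrix is
# symmetric or skew-symmetric about its midpoint.
import numpy as np
from scipy.linalg import toeplitz

n = 6
T = toeplitz(np.random.randn(n))         # random symmetric Toeplitz
_, V = np.linalg.eigh(T)
for v in V.T:
    r = v[::-1]                          # reversed vector
    print('symmetric' if np.allclose(v, r)
          else 'skew-symmetric' if np.allclose(v, -r)
          else 'neither')
# prints 'symmetric' ceil(n/2) = 3 times and 'skew-symmetric' 3 times
```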

Using constr in MATLAB

Routine constr in MATLAB: uses a sequential quadratic programming method; solves the Kuhn-Tucker equations by a quasi-Newton updating procedure; can estimate derivative information by finite difference approximations; is readily available in the Optimization Toolbox.

Our experiments: We use the same data as in the implicit formulation. The case $k = 5$ is computationally the same as before. We have trouble in the cases $k = 2$ or $k = 3$: 1. the iterations do not improve the approximations at all; 2. yet MATLAB reports that the optimization terminated successfully.

Using LANCELOT on NEOS

The reasons for the failure of MATLAB are not clear: the constraints might no longer be linearly independent; the termination criteria in constr might not be adequate; the difficult geometry may simply mean hard-to-satisfy constraints.

We therefore turn to a more sophisticated optimization package, LANCELOT: a standard Fortran 77 package for solving large-scale nonlinearly constrained optimization problems. It breaks the functions down into sums of element functions to obtain sparse Hessian matrices. It is a huge code; see http://www.rl.ac.uk/departments/ccd/numerical/lancelot/sif/sifhtml.html. It is available on the NEOS Server through a socket-based interface and uses the ADIFOR automatic differentiation tool.

LANCELOT Works

LANCELOT finds optimal solutions of problem (5) for all values of $k$. The results from LANCELOT agree, up to the required accuracy $10^{-6}$, with those from fmins. Rank affects the computational cost nonlinearly.

rank $k$            | 5  | 4  | 3  | 2  | 1
# of variables ($= k + kn$) | 35 | 28 | 21 | 14 | 7

Table 2: Cost overhead in using LANCELOT for $n = 6$; the function/constraint call counts and total running times, also recorded, vary nonlinearly with the rank.

Conclusions

Structure preserving rank reduction problems arise in many important applications, particularly in the broad areas of signal and image processing. Constructing the nearest approximation of a given matrix by one of any specified rank and any linear structure is difficult in general. We have proposed two ways to formulate these problems as standard optimization computations, so it is now possible to tackle them numerically with standard optimization packages. The ideas were illustrated using the Toeplitz structure with the Frobenius norm, but our approach readily generalizes to rank reduction problems for any given linear structure and any given matrix norm.

Table 3: A typical log of intermediate results from constr (columns f-count, FUNCTION, MAX{g}, STEP, Procedures). Throughout the run the objective value FUNCTION remains frozen at 0.95896, the maximum constraint violation MAX{g} stays at machine-precision level, and the step length fluctuates between roughly $10^{-9}$ and order one; the message "Hessian modified twice" recurs throughout, yet the log ends with "Optimization Converged Successfully".