H2-optimal model reduction of MIMO systems


P. Van Dooren, K. A. Gallivan, P.-A. Absil

Abstract. We consider the problem of approximating a $p\times m$ rational transfer function $H(s)$ of high degree by another $p\times m$ rational transfer function $\hat H(s)$ of much smaller degree. We derive the gradients of the $H_2$-norm of the approximation error and show how stationary points can be described via tangential interpolation.

Keywords: multivariable systems, model reduction, optimal $H_2$ approximation, tangential interpolation.

1 Introduction

In this paper we will consider the problem of approximating a real $p\times m$ rational transfer function $H(s)$ of McMillan degree $N$ by a real $p\times m$ rational transfer function $\hat H(s)$ of lower McMillan degree $n$, using the $H_2$-norm as approximation criterion. Since a transfer function has an unbounded $H_2$-norm if it is not proper (a rational transfer function is proper if it is zero at $s=\infty$), we will constrain both $H(s)$ and $\hat H(s)$ to be proper. Such transfer functions have state-space realizations $(A,B,C)\in\mathbb{R}^{N\times N}\times\mathbb{R}^{N\times m}\times\mathbb{R}^{p\times N}$ and $(\hat A,\hat B,\hat C)\in\mathbb{R}^{n\times n}\times\mathbb{R}^{n\times m}\times\mathbb{R}^{p\times n}$ satisfying
$$H(s) := C(sI_N - A)^{-1}B \quad\text{and}\quad \hat H(s) := \hat C(sI_n - \hat A)^{-1}\hat B. \qquad (1)$$
The realization $\{\hat A,\hat B,\hat C\}$ is not unique in the sense that the triple $\{\hat A_T,\hat B_T,\hat C_T\} := \{T^{-1}\hat A T,\ T^{-1}\hat B,\ \hat C T\}$ for any matrix $T\in GL(n,\mathbb{R})$ defines the same transfer function:
$$\hat H(s) = \hat C(sI_n - \hat A)^{-1}\hat B = \hat C_T(sI_n - \hat A_T)^{-1}\hat B_T.$$
It is known (see e.g. Theorem 4.7 in Byrnes and Falb [3]) that the geometric quotient of $\mathbb{R}^{n\times n}\times\mathbb{R}^{n\times m}\times\mathbb{R}^{p\times n}$ under $GL(n,\mathbb{R})$ is a smooth, irreducible variety of dimension $n(m+p)$. This implies that the set $\mathrm{Rat}^{n}_{p,m}$ of $p\times m$ proper rational transfer functions of degree $n$ can be parameterized with only $n(m+p)$ real parameters in a locally smooth manner.
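The invariance of the transfer function under state-space transformations $T$ is easy to check numerically. The following is a minimal sketch (the matrices, the transformation $T$ and the evaluation point are arbitrary illustrative choices, not taken from the paper):

```python
import numpy as np

# A small realization (arbitrary illustrative values).
A_hat = np.array([[-1.0, 0.5], [0.0, -2.0]])
B_hat = np.array([[1.0], [1.0]])
C_hat = np.array([[1.0, 2.0]])

def transfer(A, B, C, s):
    """Evaluate C (sI - A)^{-1} B at a complex point s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B)

# State-space transformation {T^{-1} A T, T^{-1} B, C T} with an invertible T.
T = np.array([[2.0, 1.0], [1.0, 1.0]])
Ti = np.linalg.inv(T)
A_T, B_T, C_T = Ti @ A_hat @ T, Ti @ B_hat, C_hat @ T

s0 = 1.0 + 1.0j  # arbitrary evaluation point away from the poles
H1 = transfer(A_hat, B_hat, C_hat, s0)
H2 = transfer(A_T, B_T, C_T, s0)
assert np.allclose(H1, H2)  # both triples realize the same transfer function
```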
A possible approach for building a reduced-order model $\{\hat A,\hat B,\hat C\}$ from a full-order model $\{A,B,C\}$ is tangential interpolation, which can always be achieved (see [4]) by solving two Sylvester equations for the unknowns $W,V\in\mathbb{R}^{N\times n}$,
$$AV - V\Sigma_\sigma + BR = 0, \qquad (2)$$
$$W^TA - \Sigma_\mu^T W^T + L^TC = 0, \qquad (3)$$
and constructing the reduced-order model of degree $n$ as follows:
$$\{\hat A,\hat B,\hat C\} := \{(W^TV)^{-1}W^TAV,\ (W^TV)^{-1}W^TB,\ CV\}, \qquad (4)$$

Corresponding author. E-mail address: paul.vandooren@uclouvain.be. CESAME, Université catholique de Louvain, B-1348 Louvain-la-Neuve, Belgium. School of Computational Science, Florida State University, Tallahassee FL 32306, USA. This paper presents research supported by the Belgian Network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme, initiated by the Belgian State, Science Policy Office, and by the National Science Foundation under contract OCI-3-24944. The scientific responsibility rests with its authors.

provided the matrix $W^TV$ is invertible (which also implies that $V$ and $W$ must have full rank $n$). The interpolation conditions $\{\Sigma_\sigma,R\}$ and $\{\Sigma_\mu,L\}$, where $\Sigma_\sigma,\Sigma_\mu\in\mathbb{R}^{n\times n}$, $R\in\mathbb{R}^{m\times n}$ and $L\in\mathbb{R}^{p\times n}$, are known to uniquely determine the projected system $\{\hat A,\hat B,\hat C\}$ [4]. The equations above can be expressed in another coordinate system by applying invertible transformations of the type $\{Q^{-1}\Sigma_\sigma Q, RQ\}$ and $\{P^{-1}\Sigma_\mu P, LP\}$ to the interpolation conditions, which yields transformed matrices $VQ$ and $WP$ but does not affect the transfer function of the reduced-order model $\{\hat A,\hat B,\hat C\}$ (see [4]). Therefore the interpolation conditions essentially impose $n(m+p)$ real conditions, since $\Sigma_\sigma$ and $\Sigma_\mu$ can be transformed to their Jordan canonical form. In the case that both matrices are simple (no Jordan blocks of size larger than 1) we can assume $\Sigma_\sigma$ and $\Sigma_\mu$ to be block diagonal, with a $1\times 1$ diagonal block $\sigma_i$ or $\mu_i$ for each real condition and a $2\times 2$ diagonal block
$$\begin{bmatrix}\sigma_i & \sigma_{i+1}\\ -\sigma_{i+1} & \sigma_i\end{bmatrix} \quad\text{or}\quad \begin{bmatrix}\mu_i & \mu_{i+1}\\ -\mu_{i+1} & \mu_i\end{bmatrix}$$
for each pair of complex conjugate conditions. We refer to [1] for a more elaborate discussion of this and for a discrete-time version of the results of this paper. In this paper we first compute the gradients of the $H_2$ error of the approximation problem and then show that its stationary points satisfy special tangential interpolation conditions that generalize earlier results for SISO systems and help understand numerical algorithms for this model reduction problem.

2 The H2 approximation problem

Let $E(s)$ be an arbitrary proper transfer function, with realization triple $\{A_e,B_e,C_e\}$. If $E(s)$ is unstable, its $H_2$-norm is defined to be $\infty$. Otherwise, its squared $H_2$-norm is defined as the trace of a matrix integral [2]:
$$J := \|E(s)\|^2_{H_2} := \mathrm{tr}\int_{-\infty}^{\infty} E(j\omega)^H E(j\omega)\,\frac{d\omega}{2\pi} = \mathrm{tr}\int_{-\infty}^{\infty} E(j\omega)E(j\omega)^H\,\frac{d\omega}{2\pi}.$$
By Parseval's identity, this can also be expressed using the state-space realization as (see [2])
$$J = \mathrm{tr}\int_0^{\infty} [C_e e^{A_e t}B_e][C_e e^{A_e t}B_e]^T dt = \mathrm{tr}\int_0^{\infty} [C_e e^{A_e t}B_e]^T[C_e e^{A_e t}B_e]\,dt.$$
This can also be related to an expression involving the Gramians $P_e$ and $Q_e$, defined as
$$P_e := \int_0^{\infty} [e^{A_e t}B_e][e^{A_e t}B_e]^T dt, \qquad Q_e := \int_0^{\infty} [C_e e^{A_e t}]^T[C_e e^{A_e t}]\,dt,$$
which are also known to be the solutions of the Lyapunov equations
$$A_e P_e + P_e A_e^T + B_e B_e^T = 0, \qquad Q_e A_e + A_e^T Q_e + C_e^T C_e = 0. \qquad (5)$$
Using these, it easily follows that the squared $H_2$-norm of $E(s)$ can also be expressed as
$$J = \mathrm{tr}\, B_e^T Q_e B_e = \mathrm{tr}\, C_e P_e C_e^T. \qquad (6)$$
We now apply this to the error function $E(s) := H(s) - \hat H(s)$. A realization of $E(s)$ in partitioned form is given by
$$\{A_e,B_e,C_e\} := \left\{\begin{bmatrix}A & 0\\ 0 & \hat A\end{bmatrix},\ \begin{bmatrix}B\\ \hat B\end{bmatrix},\ \begin{bmatrix}C & -\hat C\end{bmatrix}\right\},$$
and the Lyapunov equations (5) become
$$P_e := \begin{bmatrix}P & X\\ X^T & \hat P\end{bmatrix}, \qquad \begin{bmatrix}A & 0\\ 0 & \hat A\end{bmatrix}\begin{bmatrix}P & X\\ X^T & \hat P\end{bmatrix} + \begin{bmatrix}P & X\\ X^T & \hat P\end{bmatrix}\begin{bmatrix}A^T & 0\\ 0 & \hat A^T\end{bmatrix} + \begin{bmatrix}B\\ \hat B\end{bmatrix}\begin{bmatrix}B^T & \hat B^T\end{bmatrix} = 0, \qquad (7)$$
and
$$Q_e := \begin{bmatrix}Q & Y\\ Y^T & \hat Q\end{bmatrix}, \qquad \begin{bmatrix}A^T & 0\\ 0 & \hat A^T\end{bmatrix}\begin{bmatrix}Q & Y\\ Y^T & \hat Q\end{bmatrix} + \begin{bmatrix}Q & Y\\ Y^T & \hat Q\end{bmatrix}\begin{bmatrix}A & 0\\ 0 & \hat A\end{bmatrix} + \begin{bmatrix}C^T\\ -\hat C^T\end{bmatrix}\begin{bmatrix}C & -\hat C\end{bmatrix} = 0. \qquad (8)$$
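The Gramian characterization (5),(6) can be checked numerically. The sketch below solves the Lyapunov equations by vectorization (a Kronecker-product linear solve, fine for tiny examples; production codes would use a dedicated Lyapunov solver) and verifies that both trace formulas in (6) agree; the stable diagonal test system is an arbitrary choice, not from the paper:

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 by vectorization (column-stacking vec).
    vec(AX) = (I kron A) vec(X),  vec(X A^T) = (A kron I) vec(X)."""
    n = A.shape[0]
    M = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    x = np.linalg.solve(M, -W.flatten(order='F'))
    return x.reshape((n, n), order='F')

# Arbitrary stable example: H(s) = 1/(s+1) + 1/(s+2).
A = np.diag([-1.0, -2.0])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])

P = lyap(A, B @ B.T)      # controllability Gramian:  A P + P A^T + B B^T = 0
Q = lyap(A.T, C.T @ C)    # observability Gramian:    A^T Q + Q A + C^T C = 0

J1 = np.trace(B.T @ Q @ B)   # squared H2 norm, first form of eq. (6)
J2 = np.trace(C @ P @ C.T)   # squared H2 norm, second form of eq. (6)
assert np.isclose(J1, J2)
assert np.isclose(J1, 17.0 / 12.0)  # closed form for this diagonal example
```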

To minimize the $H_2$-norm $J$ of the error function $E(s)$ we must minimize
$$J = \mathrm{tr}\begin{bmatrix}B^T & \hat B^T\end{bmatrix}\begin{bmatrix}Q & Y\\ Y^T & \hat Q\end{bmatrix}\begin{bmatrix}B\\ \hat B\end{bmatrix} = \mathrm{tr}\left(B^TQB + 2B^TY\hat B + \hat B^T\hat Q\hat B\right), \qquad (9)$$
where $Q$, $Y$ and $\hat Q$ depend on $A$, $\hat A$, $C$ and $\hat C$ through the Lyapunov equation (8), or equivalently
$$J = \mathrm{tr}\begin{bmatrix}C & -\hat C\end{bmatrix}\begin{bmatrix}P & X\\ X^T & \hat P\end{bmatrix}\begin{bmatrix}C^T\\ -\hat C^T\end{bmatrix} = \mathrm{tr}\left(CPC^T - 2CX\hat C^T + \hat C\hat P\hat C^T\right), \qquad (10)$$
where $P$, $X$ and $\hat P$ depend on $A$, $\hat A$, $B$ and $\hat B$ through the Lyapunov equation (7). Note that the terms $B^TQB$ and $CPC^T$ in the above expressions are constant, and hence can be discarded in the optimization.

3 Optimality conditions

The expansions above can be used to express first-order optimality conditions for the squared $H_2$-norm in terms of the gradients of $J$ with respect to $\hat A$, $\hat B$ and $\hat C$. We define a gradient as follows.

Definition 3.1 The gradient of a real scalar function $f(X)$ of a real matrix variable $X\in\mathbb{R}^{n\times p}$ is the real matrix $\nabla_X f(X)\in\mathbb{R}^{n\times p}$ defined by
$$[\nabla_X f(X)]_{i,j} = \frac{d}{dX_{i,j}}f(X), \quad i=1,\dots,n,\ j=1,\dots,p. \qquad (11)$$
It yields the expansion
$$f(X+\Delta) = f(X) + \langle \nabla_X f(X), \Delta\rangle + O(\|\Delta\|^2), \quad\text{where}\quad \langle M,N\rangle := \mathrm{tr}(M^TN).$$

The following lemma is useful in the derivation of our results (see [7]).

Lemma 3.2 If $AM + MB + C = 0$ and $NA + BN + D = 0$, then $\mathrm{tr}(CN) = \mathrm{tr}(DM)$.

Starting from the characterizations (7),(10) and (8),(9) of the $H_2$-norm and using Lemma 3.2 we easily derive succinct forms of the gradients. This theorem is originally due to Wilson [8].

Theorem 3.3 The gradients $\nabla_{\hat A}J$, $\nabla_{\hat B}J$ and $\nabla_{\hat C}J$ of $J := \|E(s)\|^2_{H_2}$ are given by
$$\nabla_{\hat A}J = 2(\hat Q\hat P + Y^TX), \qquad \nabla_{\hat B}J = 2(\hat Q\hat B + Y^TB), \qquad \nabla_{\hat C}J = 2(\hat C\hat P - CX),$$
where
$$A^TY + Y\hat A - C^T\hat C = 0, \qquad \hat A^T\hat Q + \hat Q\hat A + \hat C^T\hat C = 0, \qquad (12)$$
$$X^TA^T + \hat AX^T + \hat BB^T = 0, \qquad \hat P\hat A^T + \hat A\hat P + \hat B\hat B^T = 0. \qquad (13)$$

Proof. For finding an expression for $\nabla_{\hat A}J$ we consider the characterization
$$J = \mathrm{tr}(B^TQB + 2B^TY\hat B + \hat B^T\hat Q\hat B), \qquad A^TY + Y\hat A - C^T\hat C = 0, \qquad \hat A^T\hat Q + \hat Q\hat A + \hat C^T\hat C = 0.$$
Then the first-order perturbation $\Delta_J$ corresponding to $\Delta_{\hat A}$ is given by
$$\Delta_J = \mathrm{tr}(2\hat BB^T\Delta_Y + \hat B\hat B^T\Delta_{\hat Q}),$$
where $\Delta_Y$ and $\Delta_{\hat Q}$ depend on $\Delta_{\hat A}$ via the equations
$$A^T\Delta_Y + \Delta_Y\hat A + Y\Delta_{\hat A} = 0, \qquad \hat A^T\Delta_{\hat Q} + \Delta_{\hat Q}\hat A + \Delta_{\hat A}^T\hat Q + \hat Q\Delta_{\hat A} = 0. \qquad (14)$$

It follows from applying Lemma 3.2 to the Sylvester equations (13),(14) that
$$\mathrm{tr}(\hat BB^T\Delta_Y) = \mathrm{tr}(X^TY\Delta_{\hat A}) \quad\text{and}\quad \mathrm{tr}(\hat B\hat B^T\Delta_{\hat Q}) = \mathrm{tr}\big(\hat P(\Delta_{\hat A}^T\hat Q + \hat Q\Delta_{\hat A})\big),$$
and therefore
$$\Delta_J = \mathrm{tr}(2X^TY\Delta_{\hat A} + \hat P\Delta_{\hat A}^T\hat Q + \hat P\hat Q\Delta_{\hat A}) = \mathrm{tr}(2X^TY\Delta_{\hat A} + 2\hat P\hat Q\Delta_{\hat A}) = \langle 2(\hat Q\hat P + Y^TX),\ \Delta_{\hat A}\rangle.$$
Since $\Delta_J$ also equals $\langle \nabla_{\hat A}J, \Delta_{\hat A}\rangle$, it follows that $\nabla_{\hat A}J = 2(\hat Q\hat P + Y^TX)$. To find an expression for $\nabla_{\hat B}J$ we perturb $\hat B$ in the characterization
$$J = \mathrm{tr}(B^TQB + 2B^TY\hat B + \hat B^T\hat Q\hat B),$$
which yields the first-order perturbation
$$\Delta_J = \mathrm{tr}(2B^TY\Delta_{\hat B} + \Delta_{\hat B}^T\hat Q\hat B + \hat B^T\hat Q\Delta_{\hat B}) = \langle 2(Y^TB + \hat Q\hat B),\ \Delta_{\hat B}\rangle.$$
Since $\Delta_J$ also equals $\langle \nabla_{\hat B}J, \Delta_{\hat B}\rangle$, it follows that $\nabla_{\hat B}J = 2(\hat Q\hat B + Y^TB)$. In a similar fashion we can write the first-order perturbation of $J = \mathrm{tr}(CPC^T - 2CX\hat C^T + \hat C\hat P\hat C^T)$ to obtain $\nabla_{\hat C}J = 2(\hat C\hat P - CX)$.

The gradient forms of Theorem 3.3 allow us to derive our fundamental theoretical result.

Theorem 3.4 At every stationary point of $J$ where $\hat P$ and $\hat Q$ are invertible, we have the following identities:
$$\hat A = W^TAV, \quad \hat B = W^TB, \quad \hat C = CV, \quad W^TV = I_n, \quad\text{with}\quad W := -Y\hat Q^{-1},\ V := X\hat P^{-1}, \qquad (15)$$
where $X$, $Y$, $\hat P$ and $\hat Q$ satisfy the Sylvester equations (12),(13).

Proof. Since we are at a stationary point of $J$, the gradients with respect to $\hat A$, $\hat B$ and $\hat C$ must be zero:
$$\hat Q\hat P + Y^TX = 0, \qquad \hat Q\hat B + Y^TB = 0, \qquad \hat C\hat P - CX = 0.$$
Since $\hat P$ and $\hat Q$ are invertible, we can define $W := -Y\hat Q^{-1}$ and $V := X\hat P^{-1}$. It then follows that $W^TV = I_n$, $\hat B = W^TB$, $\hat C = CV$. Multiplying the first equation of (13) on the right with $W$ and using $X^T = \hat PV^T$ yields
$$\hat PV^TA^TW + \hat A\hat PV^TW + \hat BB^TW = 0.$$
Using $V^TW = I_n$, $B^TW = \hat B^T$ and the second equation of (13), it then follows that $\hat A = W^TAV$.

If we rewrite the above theorem as a projection problem, then we are constructing a projector $\Pi := VW^T$ (implying $W^TV = I_n$), where $V$ and $W$ are given by the following transposed Sylvester equations:
$$\hat QW^TA + \hat A^T\hat QW^T + \hat C^TC = 0, \qquad AV\hat P + V\hat P\hat A^T + B\hat B^T = 0. \qquad (16)$$
Notice that $\hat P$ and $\hat Q$ can be interpreted as normalizations to ensure that $W^TV = I_n$.
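The gradient expressions of Theorem 3.3 are easy to sanity-check numerically. Since $Q_e$ does not depend on $\hat B$, $J = \mathrm{tr}(B_e^TQ_eB_e)$ is quadratic in $\hat B$ and a central finite difference should reproduce $\nabla_{\hat B}J = 2(\hat Q\hat B + Y^TB)$ essentially to machine precision. The sketch below uses arbitrary illustrative matrices, with $Y$ and $\hat Q$ read off as blocks of $Q_e$ (equivalently, solutions of the Sylvester equations (12)):

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 via a Kronecker-product linear solve."""
    n = A.shape[0]
    M = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    return np.linalg.solve(M, -W.flatten(order='F')).reshape((n, n), order='F')

# Full model (arbitrary stable example) and a reduced-order candidate.
A = np.diag([-1.0, -2.0, -3.0]); B = np.ones((3, 1)); C = np.ones((1, 3))
Ah = np.diag([-1.5, -2.5]);      Bh = np.ones((2, 1)); Ch = np.ones((1, 2))

def J(Bh):
    """Squared H2 norm of the error system, J = tr(B_e^T Q_e B_e)."""
    Ae = np.block([[A, np.zeros((3, 2))], [np.zeros((2, 3)), Ah]])
    Ce = np.hstack([C, -Ch])
    Qe = lyap(Ae.T, Ce.T @ Ce)    # A_e^T Q_e + Q_e A_e + C_e^T C_e = 0
    Be = np.vstack([B, Bh])
    return np.trace(Be.T @ Qe @ Be)

# Gradient from Theorem 3.3: 2 (Qh Bh + Y^T B), with Y, Qh blocks of Q_e.
Ae = np.block([[A, np.zeros((3, 2))], [np.zeros((2, 3)), Ah]])
Ce = np.hstack([C, -Ch])
Qe = lyap(Ae.T, Ce.T @ Ce)
Y, Qh = Qe[:3, 3:], Qe[3:, 3:]
grad = 2 * (Qh @ Bh + Y.T @ B)

# Central finite differences in each entry of Bh.
h = 1e-5
fd = np.zeros_like(Bh)
for k in range(Bh.shape[0]):
    E = np.zeros_like(Bh); E[k, 0] = h
    fd[k, 0] = (J(Bh + E) - J(Bh - E)) / (2 * h)

assert np.allclose(grad, fd, atol=1e-6)
```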

It was shown in [4] that projecting a system via Sylvester equations always amounts to satisfying tangential interpolation conditions. The Sylvester equations (16) show that the parameters of reduced-order models corresponding to stationary points must have specific relationships with the parameters of the tangential interpolation conditions (2),(3),(4). First note that comparing (16) with (2),(3) gives $\Sigma_\sigma = \Sigma_\mu^T = -\hat A^T$: the left and right interpolation points are identical and equal to the negatives of the poles of the reduced-order model. For SISO systems, choosing identical left and right interpolation point sets implies that $\hat H(s)$ and $H(s)$ and, at least, their first derivatives match at the interpolation points. Theorem 3.4 therefore generalizes to MIMO systems the conditions of [6] on the $H_2$-norm stationary points for SISO systems. The simple additive result for the orders of rational interpolation for SISO systems, however, is replaced by more complicated tangential conditions for MIMO systems that require the definition of tangential direction vectors, which can be vector polynomials in $s$. The Sylvester equations (16) show that these direction vectors are also related to the parameters of realizations of $\hat H(s)$. If the Sylvester equations are expressed in the coordinate system with $\hat A$ in Jordan form, then the transformed $\hat B$ and $\hat C$ contain the parameters that define the tangential interpolation directions.

4 Tangential interpolation revisited

Theorem 3.4 provides the fundamental characterization of the stationary points of $J$ via tangential interpolation conditions and their relationship to the realizations of $\hat H(s)$. It is instructive to illustrate those relationships in a particular coordinate system and derive an explicit form of the tangential interpolation conditions. We assume here that all poles of $\hat H(s)$ are distinct but possibly complex (the so-called generic case). Hence the transfer functions $H(s)$ and $\hat H(s)$ have real realizations $\{A,B,C\}$ and $\{\hat A,\hat B,\hat C\}$ with $\hat A$ diagonalizable.
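With $\hat A$ diagonalizable, the residue directions of the partial fraction expansion can be read off from the right and left eigenvectors of $\hat A$: with $\hat As_i = \lambda_is_i$, $t_i^H\hat A = \lambda_it_i^H$ normalized so that $t_i^Hs_i = 1$, one has $\hat c_i = \hat Cs_i$ and $\hat b_i^H = t_i^H\hat B$. A small numerical sketch (the reduced model and evaluation point are arbitrary choices):

```python
import numpy as np

# Arbitrary diagonalizable reduced model with distinct complex poles.
Ah = np.array([[0.0, 1.0], [-2.0, -2.0]])   # eigenvalues -1 +/- 1j
Bh = np.array([[0.0], [1.0]])
Ch = np.array([[1.0, 0.0]])

lam, S = np.linalg.eig(Ah)   # right eigenvectors: Ah S = S diag(lam)
T = np.linalg.inv(S)         # rows are left eigenvectors, normalized t_i^H s_i = 1

s0 = 0.5 + 0.3j              # arbitrary test point away from the poles
H_direct = Ch @ np.linalg.solve(s0 * np.eye(2) - Ah, Bh)

# Partial fraction expansion (17): sum_i (Ch s_i)(t_i^H Bh) / (s0 - lam_i)
H_pf = sum((Ch @ S[:, [i]]) @ (T[[i], :] @ Bh) / (s0 - lam[i]) for i in range(2))

assert np.allclose(H_direct, H_pf)
```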
The interpretation of these conditions for multiple poles or higher-order poles becomes more involved and can be found in an extended version of this paper [1]. Given our assumptions, we have for $\hat H(s)$ the partial fraction expansion
$$\hat H(s) = \sum_{i=1}^{n} \frac{\hat c_i\hat b_i^H}{s - \lambda_i}, \qquad (17)$$
where $\hat b_i\in\mathbb{C}^m$ and $\hat c_i\in\mathbb{C}^p$, and where $\{(\lambda_i,\hat b_i,\hat c_i),\ i=1,\dots,n\}$ is a self-conjugate set. We must keep in mind that the number of parameters in $\{\hat A,\hat B,\hat C\}$ is not minimal, and hence that the gradient conditions of Theorem 3.3 must be redundant. We make this more explicit in the theorem below. For this we will need $s_i$ and $t_i^H$, the complex right and left eigenvectors of the real matrix $\hat A$ corresponding to the complex eigenvalue $\lambda_i$. Because of the expansion (17), we then have:
$$\hat As_i = \lambda_is_i, \qquad \hat Cs_i = \hat c_i, \qquad t_i^H\hat A = \lambda_it_i^H, \qquad t_i^H\hat B = \hat b_i^H.$$

Theorem 4.1 Let $\hat H(s) = \sum_{i=1}^{n}\hat c_i\hat b_i^H/(s-\lambda_i)$ have distinct first-order poles, where $\{(\lambda_i,\hat b_i,\hat c_i),\ i=1,\dots,n\}$ is self-conjugate. Then
$$-\tfrac{1}{2}\nabla_{\hat B}J^T s_i = [H^T(-\lambda_i) - \hat H^T(-\lambda_i)]\hat c_i, \qquad (18)$$
$$-\tfrac{1}{2}t_i^H\nabla_{\hat C}J^T = \hat b_i^H[H^T(-\lambda_i) - \hat H^T(-\lambda_i)], \qquad (19)$$
$$\tfrac{1}{2}t_i^H\nabla_{\hat A}J^T s_i = \hat b_i^H\,\frac{d}{ds}[H^T(s) - \hat H^T(s)]\Big|_{s=-\lambda_i}\hat c_i, \qquad (20)$$
$$\tfrac{1}{2}t_i^H\nabla_{\hat A}J^T s_j = \frac{1}{2(\lambda_j-\lambda_i)}\big[\hat b_i^H\nabla_{\hat B}J^T s_j - t_i^H\nabla_{\hat C}J^T\hat c_j\big], \quad i\neq j. \qquad (21)$$

Proof. Define $y_i := Ys_i$, $q_i := \hat Qs_i$, $x_i := Xt_i$ and $p_i := \hat Pt_i$. Then from (12),(13) we have
$$(A^T + \lambda_iI)y_i = C^T\hat c_i, \qquad (\hat A^T + \lambda_iI)q_i = -\hat C^T\hat c_i,$$
$$x_i^H(A^T + \lambda_iI) = -\hat b_i^HB^T, \qquad p_i^H(\hat A^T + \lambda_iI) = -\hat b_i^H\hat B^T.$$
It follows that
$$y_i = (A^T + \lambda_iI)^{-1}C^T\hat c_i, \qquad q_i = -(\hat A^T + \lambda_iI)^{-1}\hat C^T\hat c_i, \qquad (22)$$
$$x_i^H = -\hat b_i^HB^T(A^T + \lambda_iI)^{-1}, \qquad p_i^H = -\hat b_i^H\hat B^T(\hat A^T + \lambda_iI)^{-1}, \qquad (23)$$
from which we obtain
$$-\tfrac{1}{2}\nabla_{\hat B}J^Ts_i = -(\hat B^T\hat Q + B^TY)s_i = [H^T(-\lambda_i) - \hat H^T(-\lambda_i)]\hat c_i,$$
$$-\tfrac{1}{2}t_i^H\nabla_{\hat C}J^T = -t_i^H(\hat P\hat C^T - X^TC^T) = \hat b_i^H[H^T(-\lambda_i) - \hat H^T(-\lambda_i)].$$
From (22),(23) it also follows that
$$\tfrac{1}{2}t_i^H\nabla_{\hat A}J^Ts_j = t_i^H(\hat P\hat Q + X^TY)s_j = \hat b_i^H\big[\hat B^T(\hat A^T + \lambda_iI)^{-1}(\hat A^T + \lambda_jI)^{-1}\hat C^T - B^T(A^T + \lambda_iI)^{-1}(A^T + \lambda_jI)^{-1}C^T\big]\hat c_j.$$
If we use $\frac{d}{ds}H(s) = -C(sI-A)^{-2}B$ and $\frac{d}{ds}\hat H(s) = -\hat C(sI-\hat A)^{-2}\hat B$, then for $i = j$ we obtain
$$\tfrac{1}{2}t_i^H\nabla_{\hat A}J^Ts_i = \hat b_i^H\,\frac{d}{ds}[H^T(s) - \hat H^T(s)]\Big|_{s=-\lambda_i}\hat c_i.$$
For $i\neq j$ we use the identity
$$(M + \lambda_iI)^{-1}(M + \lambda_jI)^{-1} = \frac{1}{\lambda_i-\lambda_j}\big[(M + \lambda_jI)^{-1} - (M + \lambda_iI)^{-1}\big]$$
to obtain
$$\tfrac{1}{2}t_i^H\nabla_{\hat A}J^Ts_j = \frac{1}{\lambda_i-\lambda_j}\,\hat b_i^H\big\{[H^T(-\lambda_j) - \hat H^T(-\lambda_j)] - [H^T(-\lambda_i) - \hat H^T(-\lambda_i)]\big\}\hat c_j,$$
and finally
$$\tfrac{1}{2}t_i^H\nabla_{\hat A}J^Ts_j = \frac{1}{2(\lambda_j-\lambda_i)}\big[\hat b_i^H\nabla_{\hat B}J^Ts_j - t_i^H\nabla_{\hat C}J^T\hat c_j\big].$$

Let $S := [s_1\ \cdots\ s_n]$. The above theorem shows that the off-diagonal elements of $S^{-1}\nabla_{\hat A}J^TS$ vanish when $\nabla_{\hat B}J^T$ and $\nabla_{\hat C}J^T$ vanish. Therefore we need to impose only conditions on $\mathrm{diag}(S^{-1}\nabla_{\hat A}J^TS)$, on $\nabla_{\hat B}J^T$ and on $\nabla_{\hat C}J^T$ to characterize the stationary points of $J$. These are exactly $n(m+p)$ conditions, since the vectors $\hat b_i^H$ or $\hat c_i$ can be scaled as indicated in Section 1. Moreover, one can view them as $n(m+p)$ real conditions since the poles $\lambda_i$ come in complex conjugate pairs. The following corollary easily follows.

Corollary 4.2 If $\nabla_{\hat B}J^T = 0$, $\nabla_{\hat C}J^T = 0$ and $\mathrm{diag}(S^{-1}\nabla_{\hat A}J^TS) = 0$, then $\nabla_{\hat A}J = 0$ and the following tangential interpolation conditions are satisfied for all $\lambda_i$, $i = 1,\dots,n$:
$$[H^T(-\lambda_i) - \hat H^T(-\lambda_i)]\hat c_i = 0, \quad \hat b_i^H[H^T(-\lambda_i) - \hat H^T(-\lambda_i)] = 0, \quad \hat b_i^H\,\frac{d}{ds}[H^T(s) - \hat H^T(s)]\Big|_{s=-\lambda_i}\hat c_i = 0. \qquad (24)$$
Notice that we retrieve the conditions of [6] for the SISO case, since then $\hat b_i^H$ and $\hat c_i$ are just nonzero scalars that can be divided out.
The conditions above then become the familiar $2n$ interpolation conditions
$$H(-\lambda_i) = \hat H(-\lambda_i), \qquad \frac{d}{ds}H(s)\Big|_{s=-\lambda_i} = \frac{d}{ds}\hat H(s)\Big|_{s=-\lambda_i}, \qquad i = 1,\dots,n.$$
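These $2n$ conditions can be reached in practice with the fixed-point iteration suggested in the concluding remarks: alternately solve the Sylvester equations (12),(13) for the current reduced model and re-project. The sketch below is a toy implementation under stated assumptions: it uses $V = X$, $W = Y$ with a $(W^TV)^{-1}$ factor, which agrees with the $\hat P$, $\hat Q$-normalized map of Theorem 3.4 up to a state-space similarity of the iterate; the 3rd-order SISO example and the order $n = 1$ are arbitrary choices. At convergence it checks the interpolation condition $H(-\hat\lambda) = \hat H(-\hat\lambda)$:

```python
import numpy as np

def sylv(A, B, C):
    """Solve A X + X B + C = 0 via vectorization."""
    m, n = C.shape
    M = np.kron(np.eye(n), A) + np.kron(B.T, np.eye(m))
    return np.linalg.solve(M, -C.flatten(order='F')).reshape((m, n), order='F')

# Full SISO model (arbitrary stable example); reduced order n = 1.
A = np.diag([-1.0, -2.0, -3.0]); B = np.ones((3, 1)); C = np.ones((1, 3))
Ah = np.array([[-1.0]]); Bh = np.array([[1.0]]); Ch = np.array([[1.0]])

for _ in range(200):
    # Sylvester equations (12),(13):  A^T Y + Y Ah = C^T Ch,  A X + X Ah^T = -B Bh^T
    Y = sylv(A.T, Ah, -C.T @ Ch)
    X = sylv(A, Ah.T, B @ Bh.T)
    # Re-project; the (W^T V)^{-1} factor replaces the P-hat, Q-hat
    # normalizations (same transfer function, different coordinates).
    E = np.linalg.inv(Y.T @ X)
    Ah, Bh, Ch = E @ Y.T @ A @ X, E @ Y.T @ B, C @ X

lam = Ah[0, 0]   # converged pole of the reduced model
H = lambda s: (C @ np.linalg.solve(s * np.eye(3) - A, B))[0, 0]
Hh = lambda s: (Ch @ np.linalg.solve(s * np.eye(1) - Ah, Bh))[0, 0]
assert lam < 0                          # stable reduced model
assert abs(H(-lam) - Hh(-lam)) < 1e-8   # interpolation at the mirrored pole
```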

5 Concluding remarks

The $H_2$-norm of a stable proper transfer function $E(s)$ is a smooth function of the parameters $\{A_e,B_e,C_e\}$ of its state-space realization, because the squared norm of $E(s)$ is differentiable with respect to the parameters $\{A_e,B_e,C_e\}$ as long as $A_e$ is stable (the Lyapunov equations are then invertible linear maps and the trace is a smooth function of its parameters). If $\hat H(s)$ is an isolated local minimum of the error function $\|H(s) - \hat H(s)\|^2_{H_2}$, then the continuity of the norm implies that a small perturbation of $H(s)$ will also induce only a small perturbation of that local minimum. This explains why we can construct a characterization of the optimality conditions without assuming anything about the structure of the poles of the transfer functions $H(s)$ and $\hat H(s)$.

Those ideas also lead to algorithms. One can view (12),(13) and (15) as two coupled equations, $(X,Y,\hat P,\hat Q) = F(\hat A,\hat B,\hat C)$ and $(\hat A,\hat B,\hat C) = G(X,Y,\hat P,\hat Q)$, for which we have a fixed point $(\hat A,\hat B,\hat C) = G(F(\hat A,\hat B,\hat C))$ at every stationary point of $J(\hat A,\hat B,\hat C)$. This automatically suggests the iterative procedure
$$(X,Y,\hat P,\hat Q)^{(i+1)} = F\big((\hat A,\hat B,\hat C)^{(i)}\big), \qquad (\hat A,\hat B,\hat C)^{(i+1)} = G\big((X,Y,\hat P,\hat Q)^{(i+1)}\big),$$
which is expected to converge to a nearby fixed point. This is essentially the idea behind existing algorithms using Sylvester equations in their iterations (see [2]). Another approach would be to use the gradients or the interpolation conditions of Theorem 4.1 to develop descent methods or even Newton-like methods, as was done for the SISO case in [5].

The two fundamental contributions of this paper are, first, the characterization of the stationary points of $J$ via tangential interpolation conditions and their relationship to the realizations of $\hat H(s)$, given by Theorem 3.4, and, second, the fact that this can be done using Sylvester equations without assuming anything about the structure of either $H(s)$ or $\hat H(s)$, thereby providing a framework to relate existing algorithms and to develop and understand new ones.

References

[1] P.-A. Absil, K. A. Gallivan and P. Van Dooren.
Multivariable H2-optimal approximation and tangential interpolation. Internal CESAME Report, Catholic University of Louvain, September 2007.
[2] A. Antoulas. Approximation of Large-Scale Dynamical Systems. SIAM Publications, Philadelphia, 2005.
[3] C. Byrnes and P. Falb. Applications of algebraic geometry in systems theory. American Journal of Mathematics, 101:337-363, April 1979.
[4] K. A. Gallivan, A. Vandendorpe and P. Van Dooren. Model reduction of MIMO systems via tangential interpolation. SIAM J. Matrix Anal. Appl., 26(2):328-349, 2004.
[5] S. Gugercin, A. Antoulas and C. Beattie. Rational Krylov methods for optimal H2 model reduction, submitted for publication, 2006.
[6] L. Meier and D. Luenberger. Approximation of linear constant systems. IEEE Trans. Aut. Contr., 12:585-588, 1967.
[7] W.-Y. Yan and J. Lam. An approximate approach to H2 optimal model reduction. IEEE Trans. Aut. Contr., 44(7):1341-1358, 1999.
[8] D. A. Wilson. Optimum solution of model reduction problem. Proc. Inst. Elec. Eng., 117:1161-1165, 1970.