Departement Elektrotechniek ESAT-SISTA/TR: About the choice of State Space Basis in Combined Deterministic-Stochastic Subspace Identification


Katholieke Universiteit Leuven
Departement Elektrotechniek ESAT-SISTA/TR 1994-24

About the choice of State Space Basis in Combined Deterministic-Stochastic Subspace Identification 1

Peter Van Overschee and Bart De Moor 2

December 1994. Submitted for Publication to Automatica.

1 This report is available by anonymous ftp from gate.esat.kuleuven.ac.be in the directory /pub/sista/vanoverschee/reports/bal_pap.ps.Z. This paper presents research results of the Belgian Programme on Interuniversity Poles of Attraction (IUAP no. 17 and 50), initiated by the Belgian State, Prime Minister's Office for Science, Technology and Culture. The scientific responsibility rests with its authors.

2 ESAT - Katholieke Universiteit Leuven, Kardinaal Mercierlaan 94, B-3001 Leuven (Heverlee), Belgium, tel 32/16/321709, fax 32/16/321986, email: peter.vanoverschee@esat.kuleuven.ac.be and bart.demoor@esat.kuleuven.ac.be. Peter Van Overschee is a research assistant and Bart De Moor a senior research associate of the Belgian National Fund for Scientific Research.

Abstract

This paper describes how the state space basis of models identified with subspace identification algorithms can be determined. It is shown that this basis is determined by the input spectrum and by user defined input and output weightings. Through the connections between subspace identification and frequency weighted balancing, the state space basis of the subspace identified models is shown to coincide with a frequency weighted balanced basis.

1 Introduction

The identification problem considered in the combined deterministic-stochastic subspace identification papers [4, 8, 10, 11] is the following: let $u_k \in \mathbb{R}^m$, $y_k \in \mathbb{R}^l$ be the observed input and output generated by the unknown system:

$$x_{k+1} = A x_k + B u_k + w_k, \qquad y_k = C x_k + D u_k + v_k, \qquad (1)$$

with

$$E\left[ \begin{pmatrix} w_k \\ v_k \end{pmatrix} \begin{pmatrix} w_l^T & v_l^T \end{pmatrix} \right] = \begin{pmatrix} Q & S \\ S^T & R \end{pmatrix} \delta_{kl} \ge 0, \qquad (2)$$

and $A, Q \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{l \times n}$, $D \in \mathbb{R}^{l \times m}$, $S \in \mathbb{R}^{n \times l}$ and $R \in \mathbb{R}^{l \times l}$. $v_k \in \mathbb{R}^l$ and $w_k \in \mathbb{R}^n$ are unobserved, zero mean, white noise vector sequences. $\{A, C\}$ and $\{A, [B\ Q^{1/2}]\}$ are assumed to be observable respectively controllable. The main identification problem is then stated as: given $N$ input-output measurements generated by the system (1)-(2), find $A, B, C, D, Q, R, S$ up to within a similarity transformation.

Several solutions for this problem have been described in the literature [4, 8, 10, 11]. Although the solutions look very different at first sight, it was shown in [9] that these algorithms use the same basic subspace, but weighted in a different way. In this paper the effect of these weights will be further explored. It will be shown that the state space basis of the identified model corresponds to an input-output frequency weighted balanced basis as described by Enns [3]. The weights in the frequency domain are a function of the input $u_k$ applied to the system and of the weighting matrices $W_1$ and $W_2$ introduced in [9]. The main Theorem introduces specific choices for these weighting matrices, such that the system is identified in a predefined state space basis.
As a special case, we will investigate the state space basis of the N4SID algorithm [8]. A nice side result is the lower order identification problem. Since the basis in which the state space matrices are identified is well defined and is frequency weighted balanced, it is very easy to truncate the model after identification to a lower order model. This corresponds exactly to the technique of frequency weighted model reduction of Enns [3].

1 $E$ denotes the expected value operator and $\delta_{kl}$ the Kronecker delta.

This paper is organized as follows: in Section 2 we introduce some notation and background. Section 3 shortly introduces the concepts of frequency weighted balancing. In Section 4 the

main Theorem is presented. Section 5 addresses the problem of reduced order identification. Finally Section 6 contains the conclusions.

2 Notation and Background

In this Section we introduce the notation used throughout the paper: input and output block Hankel matrices, system related matrices and weighting matrices. We also shortly revise some results on subspace identification.

Throughout the paper, we will consider the model (1)-(2) in its forward innovations form as [6]:

$$x_{k+1} = A x_k + B u_k + E e_k, \qquad y_k = C x_k + D u_k + F e_k, \qquad E[e_k e_l^T] = I \delta_{kl}, \qquad (3)$$

with $E \in \mathbb{R}^{n \times l}$, $F \in \mathbb{R}^{l \times l}$, and the innovations $e_k \in \mathbb{R}^l$ uncorrelated with $u_k$. The transformation from the system (1)-(2) to this innovations model can be done through the solution of a Riccati equation [6]. The steady state Kalman gain is given by $K = E F^{-1} \in \mathbb{R}^{n \times l}$.

Subspace identification algorithms make extensive use of input and output block Hankel matrices. We define:

$$U_p = \begin{pmatrix} u_0 & u_1 & \ldots & u_{j-1} \\ u_1 & u_2 & \ldots & u_j \\ \vdots & \vdots & & \vdots \\ u_{i-1} & u_i & \ldots & u_{i+j-2} \end{pmatrix}, \qquad U_f = \begin{pmatrix} u_i & u_{i+1} & \ldots & u_{i+j-1} \\ u_{i+1} & u_{i+2} & \ldots & u_{i+j} \\ \vdots & \vdots & & \vdots \\ u_{2i-1} & u_{2i} & \ldots & u_{2i+j-2} \end{pmatrix}.$$

Somewhat loosely we denote $U_p$ as the past inputs and $U_f$ as the future inputs. Through a similar definition, $Y_p$ and $Y_f$ are defined as the past respectively the future outputs, and $E_p$ and $E_f$ as the past respectively the future innovations. We assume that $j \to \infty$ throughout the paper and that all sequences are ergodic. The time averaging operator $E_j$ is defined as: $E_j[\,\bullet\,] = \lim_{j \to \infty} \frac{1}{j}[\,\bullet\,]$.

The covariance matrix of the past inputs $R_p$ will play an important role in several derivations: $R_p = E_j[U_p U_p^T] = L_p L_p^T$, where $L_p$ is a lower triangular square root of $R_p$ obtained e.g. via a Cholesky decomposition of $R_p$.

The extended ($i > n$) observability matrix $\Gamma_i$ is defined as: $\Gamma_i = \left( C^T \ (CA)^T \ \ldots \ (CA^{i-1})^T \right)^T$. The extended ($i > n$) reversed deterministic $\Delta_i^d$ and stochastic $\Delta_i^s$ controllability matrices are defined as: $\Delta_i^d = \left( A^{i-1}B \ \ldots \ AB \ B \right)$, $\Delta_i^s = \left( A^{i-1}E \ \ldots \ AE \ E \right)$.
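As an illustration of these definitions, the following sketch (Python with NumPy; the data, dimensions and helper name `block_hankel` are illustrative assumptions, not from the paper) builds the block Hankel matrices $U_p$ and $U_f$ and a finite-$j$ approximation of $R_p$ with its Cholesky factor $L_p$:

```python
import numpy as np

def block_hankel(u, i, j):
    """Stack i block rows of the sequence u (shape m x N) into a block
    Hankel matrix whose column t contains u_t, ..., u_{t+i-1}."""
    m, N = u.shape
    assert N >= i + j - 1
    return np.vstack([u[:, k:k + j] for k in range(i)])

rng = np.random.default_rng(0)
m, i, j = 2, 3, 500
u = rng.standard_normal((m, 2 * i + j - 1))     # samples u_0, ..., u_{2i+j-2}

Up = block_hankel(u[:, :i + j - 1], i, j)       # past inputs U_p  (mi x j)
Uf = block_hankel(u[:, i:], i, j)               # future inputs U_f (mi x j)

# time average R_p ~ E_j[U_p U_p^T] (finite-j approximation) and a lower
# triangular square root L_p via Cholesky, so that R_p = L_p L_p^T
Rp = Up @ Up.T / j
Lp = np.linalg.cholesky(Rp)
```

In the paper $j \to \infty$; here $j$ is finite, so `Rp` only approximates $E_j[U_p U_p^T]$.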

Furthermore we define the block Toeplitz matrices containing the Markov parameters of the deterministic and the stochastic system as:

$$H_i^d = \begin{pmatrix} D & 0 & \ldots & 0 \\ CB & D & \ldots & 0 \\ \vdots & \vdots & & \vdots \\ CA^{i-2}B & CA^{i-3}B & \ldots & D \end{pmatrix}, \qquad H_i^s = \begin{pmatrix} F & 0 & \ldots & 0 \\ CE & F & \ldots & 0 \\ \vdots & \vdots & & \vdots \\ CA^{i-2}E & CA^{i-3}E & \ldots & F \end{pmatrix}.$$

The (non-steady state) Kalman filter state sequence $\tilde{X}_i$ is defined as in [8]: $\tilde{X}_i = \left( \tilde{x}_i \ \tilde{x}_{i+1} \ \tilde{x}_{i+2} \ \ldots \ \tilde{x}_{i+j-1} \right)$. Each of its columns is the output of a non-steady state Kalman filter (see [8, 9] for more details).

The Z-transforms (initial state equal to zero) of $u_k$ and $y_k$ are denoted by respectively $U(z)$ and $Y(z)$, while the spectral factor of $e_k$ is denoted by $E(z)$. From (3) we then find: $Y(z) = G(z)U(z) + H(z)E(z)$ with $G(z) = D + C(zI - A)^{-1}B$ and $H(z) = F + C(zI - A)^{-1}E$. The spectral factor of $u_k$ is denoted by $S_u(z)$: $U(z)U^T(z^{-1}) = S_u(z)S_u^T(z^{-1})$ with all poles of $S_u(z)$ and $S_u^{-1}(z)$ inside the unit circle.

We define the input and output weighting matrix functions as: $W^u(z) = D_u + C_u(zI - A_u)^{-1}B_u$, $W^y(z) = D_y + C_y(zI - A_y)^{-1}B_y$. From the Markov parameters of these functions, the weighting matrices $W_i^u$ and $W_i^y$ can be formed:

$$W_i^u = \begin{pmatrix} D_u & 0 & \ldots & 0 \\ C_u B_u & D_u & \ldots & 0 \\ \vdots & \vdots & & \vdots \\ C_u A_u^{i-2} B_u & C_u A_u^{i-3} B_u & \ldots & D_u \end{pmatrix}, \qquad W_i^y = \begin{pmatrix} D_y & 0 & \ldots & 0 \\ C_y B_y & D_y & \ldots & 0 \\ \vdots & \vdots & & \vdots \\ C_y A_y^{i-2} B_y & C_y A_y^{i-3} B_y & \ldots & D_y \end{pmatrix}.$$

$\Pi_A$ denotes the operator that projects the row space of a matrix onto the row space of $A$ (which is assumed to be of full row rank): $\Pi_A = A^T(AA^T)^{-1}A$. The projection of the row space of $B$ onto the row space of $A$ is defined as: $B/A = B \Pi_A = B A^T(AA^T)^{-1}A$. The orthogonal complement of the row space of $A$ is denoted by $A^{\perp}$.

The unifying Theorem of [9] describes how the extended observability matrix $\Gamma_i$ and the states $\tilde{X}_i$ can be recovered directly from the input-output data $u_k$ and $y_k$. The Theorem introduces two weighting matrices $W_1 \in \mathbb{R}^{li \times li}$ and $W_2 \in \mathbb{R}^{j \times j}$ that will play an important role in this paper.
The main results of the Theorem [9] are that the system order $n$ and the matrices $\Gamma_i$ and $\tilde{X}_i$ can be determined from an infinite number of input-output data through the non-zero singular values and the left and right singular vectors of the matrix:

$$W_1 \cdot E_j\!\left[ \left( Y_f / U_f^{\perp} \right) \cdot \left( \begin{pmatrix} U_p \\ Y_p \end{pmatrix} \Big/ U_f^{\perp} \right)^{\!T} \right] \cdot \left( E_j\!\left[ \left( \begin{pmatrix} U_p \\ Y_p \end{pmatrix} \Big/ U_f^{\perp} \right) \cdot \left( \begin{pmatrix} U_p \\ Y_p \end{pmatrix} \Big/ U_f^{\perp} \right)^{\!T} \right] \right)^{\!-1} \cdot \begin{pmatrix} U_p \\ Y_p \end{pmatrix} \cdot W_2,$$

[Figure 1 (diagram not recovered): Cascade system used for the interpretation of frequency weighted balancing, with the input $u_k$ entering $G(z)$ through the weight $W^u(z)$, the noise $e_k$ entering through $H(z)$, and the output $y_k$ weighted by $W^y(z)$. The weights $W^u(z)$ and $W^y(z)$ are user defined. Note that the noise input (the input to $H(z)$) has no extra weight, since from an input-output view, this weight is indistinguishable from $H(z)$.]

which has a singular value decomposition given by:

$$U_1 S_1 V_1^T. \qquad (4)$$

$\Gamma_i$ and $\tilde{X}_i$ then follow from:

$$W_1 \Gamma_i = U_1 S_1^{1/2}, \qquad \tilde{X}_i W_2 = S_1^{1/2} V_1^T. \qquad (5)$$

We refer to [9] for more details. It is known from linear system theory that $\Gamma_i$ and $\tilde{X}_i$ are only determined up to within a non-singular similarity transformation $T \in \mathbb{R}^{n \times n}$: $\Gamma_i \to \Gamma_i T$ and $\tilde{X}_i \to T^{-1} \tilde{X}_i$. This implies that the following question makes sense: in which state space basis are $\Gamma_i$ and $\tilde{X}_i$ determined when a subspace method is used to estimate them? In what follows, we will show that this basis is a function of the weights $W_1$ and $W_2$, and that by a proper choice of these weights, the basis can be altered in a user controlled manner. Furthermore, it will be shown that the singular values $S_1$ of (4) used to determine the system order have a clear interpretation from a linear system theoretical point of view.

3 Frequency Weighted Balancing

In this Section we recapitulate the results of Enns [3] for frequency weighted balancing. We also show how the frequency weighted Gramians introduced by Enns can be calculated from the extended observability and controllability matrices and from the weighting matrices.

Well known in system theory is the notion of "balanced realization" [7]. Enns [3] has developed a frequency weighted extension of this result. The idea is that input and output frequency weights can be introduced as to emphasize certain frequency bands in the balancing procedure (Figure 1). Instead of using the regular controllability and observability Gramians, Enns uses frequency weighted Gramians:

Definition 1 (Frequency Weighted Gramians). The solution $P$ of the Lyapunov equation:

$$\begin{pmatrix} P_{11} & P_{12} \\ P_{12}^T & P_{22} \end{pmatrix} = \begin{pmatrix} A & BC_u \\ 0 & A_u \end{pmatrix} \begin{pmatrix} P_{11} & P_{12} \\ P_{12}^T & P_{22} \end{pmatrix} \begin{pmatrix} A & BC_u \\ 0 & A_u \end{pmatrix}^{\!T} + \begin{pmatrix} BD_u & E \\ B_u & 0 \end{pmatrix} \begin{pmatrix} BD_u & E \\ B_u & 0 \end{pmatrix}^{\!T} \qquad (6)$$

is called the $W^u(z)$ weighted controllability Gramian and is denoted by $P[W^u(z)] = P_{11}$. The solution $Q$ of the Lyapunov equation:

$$\begin{pmatrix} Q_{11} & Q_{12} \\ Q_{12}^T & Q_{22} \end{pmatrix} = \begin{pmatrix} A & 0 \\ B_y C & A_y \end{pmatrix}^{\!T} \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{12}^T & Q_{22} \end{pmatrix} \begin{pmatrix} A & 0 \\ B_y C & A_y \end{pmatrix} + \begin{pmatrix} D_y C & C_y \end{pmatrix}^{\!T} \begin{pmatrix} D_y C & C_y \end{pmatrix} \qquad (7)$$

is called the $W^y(z)$ weighted observability Gramian and is denoted by $Q[W^y(z)] = Q_{11}$.

Just as for the classical balancing procedure, a similarity transformation can be found that makes both Gramians diagonal and equal to each other. In that case the system is said to be "Frequency Weighted Balanced".

Definition 2 (Frequency Weighted Balancing). The system (3) is called $[W^u(z), W^y(z)]$ frequency weighted balanced when $P[W^u(z)] = Q[W^y(z)] = \Sigma$ where $\Sigma = \mathrm{diag}[\sigma_1, \sigma_2, \ldots, \sigma_n]$. The diagonal elements $\sigma_k$ are called the frequency weighted Hankel singular values, and will be denoted by $\sigma_k[W^u(z), W^y(z)]$.

Even though (6) and (7) are easily solvable for $P[W^u(z)]$ and $Q[W^y(z)]$, we present a different way to compute these weighted Gramians. These expressions will enable us to make the connection between subspace identification and frequency weighted balancing.

Lemma 1. With $A$ asymptotically stable, we have:

$$P[W^u(z)] = \lim_{i \to \infty} \left[ \Delta_i^d W_i^u (W_i^u)^T (\Delta_i^d)^T + \Delta_i^s (\Delta_i^s)^T \right], \qquad (8)$$

$$Q[W^y(z)] = \lim_{i \to \infty} \left[ \Gamma_i^T (W_i^y)^T (W_i^y) \Gamma_i \right]. \qquad (9)$$

A proof can be found in Appendix A.

4 Subspace Identification and Frequency Weighted Balancing

In this Section we consider the connection between frequency weighted balancing and subspace identification. We show how the weights $W_1$ and $W_2$ influence the Gramians $P[W^u(z)]$ and $Q[W^y(z)]$ corresponding to the state space basis of $\Gamma_i$ and $\tilde{X}_i$.
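For the unweighted special case $W^y(z) = I$, the two ways of computing the observability Gramian can be checked numerically: a series solution of the Lyapunov equation must agree with the truncated limit expression $\Gamma_i^T \Gamma_i$ of Lemma 1. A minimal sketch (the example matrices and the helper name `dlyap_obs` are assumptions, not from the paper):

```python
import numpy as np

def dlyap_obs(A, C, terms=400):
    """Solve Q = A^T Q A + C^T C for stable A by summing the series
    Q = sum_k (A^T)^k C^T C A^k (a sketch; a dedicated Stein/Lyapunov
    solver would be used in practice)."""
    Q, M = np.zeros((A.shape[0],) * 2), np.eye(A.shape[0])
    for _ in range(terms):
        Q += M.T @ C.T @ C @ M
        M = A @ M
    return Q

A = np.array([[0.6, 0.2], [0.0, 0.4]])
C = np.array([[1.0, 1.0]])

# Lyapunov route: the (unweighted) observability Gramian
Q = dlyap_obs(A, C)

# Lemma-1 route with identity weight, truncated at a large i:
# Gamma_i^T Gamma_i with Gamma_i the extended observability matrix
i = 200
Gamma = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(i)])
```

Since the spectral radius of this `A` is 0.6, both truncations have converged to machine precision, and `Q` and `Gamma.T @ Gamma` coincide.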

4.1 Main Result

Theorem 1 (Main Theorem). With $A$ asymptotically stable and $i \to \infty$, we have with:

$$W_1 = W_i^y, \qquad W_2 = U_p^T R_p^{-1} W_i^u L_p^{-1} U_p + \Pi_{U_p^{\perp}}, \qquad (10)$$

that the $W^u(z)$ weighted controllability Gramian and the $W^y(z)$ weighted observability Gramian of the state space basis corresponding to $\Gamma_i$ and $\tilde{X}_i$ are given by (with $S_1$ from (4)):

$$P[W^u(z)] = S_1 = Q[W^y(z)]. \qquad (11)$$

A proof of the Theorem can be found in Appendix B. The Theorem implies that the state space basis of $\Gamma_i$ and $\tilde{X}_i$ for the choice of $W_1$ and $W_2$ given by (10) is the $[W^u(z), W^y(z)]$ frequency weighted balanced basis. It also implies that the singular values $S_1$ are the $[W^u(z), W^y(z)]$ frequency weighted Hankel singular values.

4.2 Special cases of the first main Theorem

Even though the weighting matrices $W_i^u$ and $W_i^y$ can be chosen arbitrarily, there are some special cases that lead to algorithms published in the literature.

N4SID. N4SID stands for Numerical algorithms for Subspace State Space System Identification [8, 11]. With the results of [9], it is easy to show that N4SID delivers the following choice of weighting matrices in Theorem 1: $W_i^u = L_p$ and $W_i^y = I$. It is easy to verify that (for $i \to \infty$) the lower triangular matrix $L_p$ corresponds to the Toeplitz matrix generated by the Markov parameters of the spectral factor $S_u(z)$ of $u_k$. This implies that (for N4SID) the input weight $W^u(z)$ in the frequency weighted balancing procedure corresponds to the spectral factor $S_u(z)$.

Balanced Realization. With the weighting matrices $W_i^u = I$ and $W_i^y = I$ we find: $P[I(z)] = S_1 = Q[I(z)]$. Now it is easy to verify that $P[I(z)]$ and $Q[I(z)]$ are equal to the unweighted controllability respectively observability Gramian. This implies that the basis of $\Gamma_i$ and $\tilde{X}_i$ is the classical balanced basis as described in [7]. A similar result for pure deterministic systems was obtained in [5].

5 Consequences for Reduced Order Identification

In this section we apply the results of the main Theorem to the identification of lower order systems.
The connections with frequency weighted model reduction are exploited.

2 Note that first $j \to \infty$ through the operator $E_j$, after which $i \to \infty$.

As has been proven in this paper, subspace identification of a model of order $n$ (the exact state space order) leads to a state space system that is $[W^u(z), W^y(z)]$ frequency weighted

balanced. This $n$th order model can then be easily reduced to a model of lower order $r$ by truncating it as follows:

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \qquad B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}, \qquad C = \begin{pmatrix} C_1 & C_2 \end{pmatrix}, \qquad E = \begin{pmatrix} E_1 \\ E_2 \end{pmatrix},$$

with $A_{11} \in \mathbb{R}^{r \times r}$, $B_1 \in \mathbb{R}^{r \times m}$, $C_1 \in \mathbb{R}^{l \times r}$ and $E_1 \in \mathbb{R}^{r \times l}$. The reduced order model is described by the matrices: $A_{11}, B_1, C_1, D, E_1, F$. The reduced transfer functions are denoted by: $\hat{G}(z) = D + C_1(zI - A_{11})^{-1}B_1$ and $\hat{H}(z) = F + C_1(zI - A_{11})^{-1}E_1$.

Enns [3] now suggested the following conjecture in his Thesis: when truncating a $[W^u(z), W^y(z)]$ frequency weighted balanced system, the infinity norm of the difference between the original and the reduced system can be upper bounded by the neglected weighted Hankel singular values. In the framework of this paper (see also Figure 1), this conjecture becomes:

$$\left\| \left( W^y(z)\left[ G(z) - \hat{G}(z) \right] W^u(z) \ \middle|\ W^y(z)\left[ H(z) - \hat{H}(z) \right] \right) \right\|_{\infty} \le 2 \sum_{k=r+1}^{n} \sigma_k[W^u(z), W^y(z)]\,(1 + \epsilon), \qquad \epsilon > 0. \qquad (12)$$

The conjecture consists of the fact that $\epsilon$ is "small". Let us ponder a bit about this conjecture. We have tried to find a simple expression for $\epsilon$, but (just as Enns) did not succeed (as didn't anyone else as far as we know, even though the problem has been open since 1984). Even though the result is ambiguous ($\epsilon$ has never been proven to be bounded, let alone to be small), the "heuristic" model reduction technique seems to work very well in practice [2, 12]. It turns out that, even though not a real upper bound, two times the sum of the neglected singular values ($\epsilon = 0$) gives a good indication about the size of the error. More importantly, (12) states that the fit of the truncated lower order model will be good where $W^u(z)$ and $W^y(z)$ are large. This implies that by a proper choice of $W^u(z)$ and $W^y(z)$ the distribution of the error in the frequency domain can be shaped.

We find for the special cases:

N4SID.

$$\left\| \left( \left[ G(z) - \hat{G}(z) \right] S_u(z) \ \middle|\ \left[ H(z) - \hat{H}(z) \right] \right) \right\|_{\infty} \le 2 \sum_{k=r+1}^{n} \sigma_k\,(1 + \epsilon).$$

We can conclude that the error of the model will be small where the frequency content of the input is large.
This is a very intuitive result: a lot of input energy in a certain frequency band leads to an accurate model in that band. Also note that the error on the noise model can not be shaped by the user.

Balanced basis.

$$\left\| \left( \left[ G(z) - \hat{G}(z) \right] \ \middle|\ \left[ H(z) - \hat{H}(z) \right] \right) \right\|_{\infty} \le 2 \sum_{k=r+1}^{n} \sigma_k.$$

Note that for this case setting $\epsilon = 0$ is justified, since for (unweighted) balanced model reduction two times the sum of the Hankel singular values is actually a proven upper bound for the truncation error (see for instance [1]).
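For this unweighted case the whole chain (Gramians, balancing transformation, truncation, and the proven bound $2\sum_{k=r+1}^{n}\sigma_k$) can be sketched numerically. All names (`series_gramians`, `balance`) and the example system are illustrative assumptions, not from the paper; the $H_\infty$ error is approximated on a frequency grid of the unit circle:

```python
import numpy as np

def series_gramians(A, B, C, terms=400):
    # P = sum_k A^k B B^T (A^T)^k,  Q = sum_k (A^T)^k C^T C A^k  (stable A)
    n = A.shape[0]
    P, Q, M = np.zeros((n, n)), np.zeros((n, n)), np.eye(n)
    for _ in range(terms):
        P += M @ B @ B.T @ M.T
        Q += M.T @ C.T @ C @ M
        M = A @ M
    return P, Q

def balance(A, B, C):
    # similarity transformation T with T^{-1} P T^{-T} = T^T Q T = diag(sigma)
    P, Q = series_gramians(A, B, C)
    L = np.linalg.cholesky(P)
    lam, U = np.linalg.eigh(L.T @ Q @ L)
    lam, U = lam[::-1], U[:, ::-1]              # descending order
    T = L @ U @ np.diag(lam ** -0.25)
    Ti = np.linalg.inv(T)
    return Ti @ A @ T, Ti @ B, C @ T, np.sqrt(lam)   # sigma = Hankel sing. values

# a small stable example system (illustrative values)
A = np.array([[0.9, 0.3], [0.0, -0.2]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.5]])
D = np.array([[0.0]])

Ab, Bb, Cb, sigma = balance(A, B, C)
r = 1                                           # truncate to order r
bound = 2 * sigma[r:].sum()                     # proven bound, cf. [1]

def g(Am, Bm, Cm, z):                           # transfer function value at z
    return (D + Cm @ np.linalg.solve(z * np.eye(Am.shape[0]) - Am, Bm))[0, 0]

err = max(abs(g(Ab, Bb, Cb, z) - g(Ab[:r, :r], Bb[:r, :], Cb[:, :r], z))
          for z in np.exp(1j * np.linspace(0.0, np.pi, 200)))
```

After `balance`, both Gramians of the transformed system equal `diag(sigma)`, and the gridded truncation error `err` stays below `bound`, in line with the discussion above.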

6 Conclusions

In this paper we have shown that the state space basis of the subspace identified models corresponds to a frequency weighted balanced basis. The frequency weights are determined by the input spectrum and by user defined input and output weighting functions.

References

[1] Al-Saggaf U.M., Franklin G.F. (1987). An error bound for a discrete reduced order model of a linear multivariable system. IEEE Trans. on Aut. Control, Vol. 32, pp. 815-819.

[2] Anderson B.D.O., Moore J.B. (1989). Optimal Control - Linear Quadratic Methods. Prentice Hall.

[3] Enns D. (1984). Model reduction for control system design. Ph.D. dissertation, Dep. Aeronaut. Astronaut., Stanford University, Stanford CA.

[4] Larimore W.E. (1990). Canonical Variate Analysis in Identification, Filtering, and Adaptive Control. Proceedings of the 29th Conference on Decision and Control, Hawaii, pp. 596-604.

[5] Moonen M., Ramos J. (1991). A subspace algorithm for balanced state space system identification. IEEE Trans. Automatic Control, Vol. 38, pp. 1727-1729.

[6] Pal D. (1982). Balanced Stochastic Realization and Model Reduction. Master's thesis, Washington State University, Electrical Engineering.

[7] Moore B.C. (1981). Principal component analysis in linear systems: controllability, observability and model reduction. IEEE Trans. on Aut. Control, Vol. 26, pp. 17-32.

[8] Van Overschee P., De Moor B. (1994). N4SID: Subspace Algorithms for the Identification of Combined Deterministic-Stochastic Systems. Automatica (Special Issue on Statistical Signal Processing and Control), Vol. 30, No. 1, pp. 75-93.

[9] Van Overschee P., De Moor B. (1994). A Unifying Theorem for three Subspace System Identification Algorithms. ESAT/SISTA report 1993-50, Kath. Universiteit Leuven, Dept. E.E., Belgium. Accepted for publication in Automatica.

[10] Verhaegen M. (1994). Identification of the deterministic part of MIMO state space models given in innovations form from input-output data.
Automatica (Special Issue on Statistical Signal Processing and Control), Vol. 30, No. 1, pp. 61-74.

[11] Viberg M., Ottersten B., Wahlberg B., Ljung L. (1993). Performance of Subspace Based State Space System Identification Methods. Proc. of the 12th IFAC World Congress, Sydney, Australia, 18-23 July, Vol. 7, pp. 369-372.

[12] Wortelboer P.M.R., Bosgra O.H. (1992). Generalized frequency weighted balanced reduction. Selected Topics in Identification, Modeling and Control, Delft University Press, Vol. 5, pp. 29-36.

A Proof of Lemma 1

We first prove equation (8). Hereto we consider the different sub-blocks of the weighted controllability Lyapunov equation (6):

$$P_{11} = A P_{11} A^T + E E^T + A P_{12} C_u^T B^T + B C_u P_{12}^T A^T + B \left[ C_u P_{22} C_u^T + D_u D_u^T \right] B^T, \qquad (13)$$

$$P_{12}^T = A_u P_{12}^T A^T + \left[ A_u P_{22} C_u^T + B_u D_u^T \right] B^T, \qquad (14)$$

$$P_{22} = A_u P_{22} A_u^T + B_u B_u^T. \qquad (15)$$

We first prove that with $\Delta_i^u = \left( A_u^{i-1}B_u \ \ldots \ A_u B_u \ B_u \right)$ we have:

$$P_{22} = \lim_{i \to \infty} \left[ \Delta_i^u (\Delta_i^u)^T \right], \qquad (16)$$

$$P_{12}^T = \lim_{i \to \infty} \left[ \Delta_i^u (W_i^u)^T (\Delta_i^d)^T \right]. \qquad (17)$$

Proof of (16): Since $\Delta_i^u = \left( A_u \Delta_{i-1}^u \ \ B_u \right)$,

$$\Delta_i^u (\Delta_i^u)^T = A_u \left[ \Delta_{i-1}^u (\Delta_{i-1}^u)^T \right] A_u^T + B_u B_u^T. \qquad (18)$$

For stable $A_u$ ($\lim_{i \to \infty} [A_u^{i-1} B_u] = 0$) we also have that:

$$\lim_{i \to \infty} \left[ \Delta_i^u (\Delta_i^u)^T \right] = \lim_{i \to \infty} \left[ \Delta_{i-1}^u (\Delta_{i-1}^u)^T + \underbrace{A_u^{i-1} B_u}_{\to 0} B_u^T (A_u^{i-1})^T \right] = \lim_{i \to \infty} \left[ \Delta_{i-1}^u (\Delta_{i-1}^u)^T \right]. \qquad (19)$$

Taking the limit for $i \to \infty$ on both sides of (18), we find with (19) that $\lim_{i \to \infty} [\Delta_i^u (\Delta_i^u)^T]$ is the solution of the same Lyapunov equation (15) as $P_{22}$, which proves (16).

Proof of (17):

$$\Delta_i^u (W_i^u)^T (\Delta_i^d)^T = \begin{pmatrix} A_u \Delta_{i-1}^u & B_u \end{pmatrix} \begin{pmatrix} (W_{i-1}^u)^T & (\Delta_{i-1}^u)^T C_u^T \\ 0 & D_u^T \end{pmatrix} \begin{pmatrix} (\Delta_{i-1}^d)^T A^T \\ B^T \end{pmatrix} = A_u \left[ \Delta_{i-1}^u (W_{i-1}^u)^T (\Delta_{i-1}^d)^T \right] A^T + \left[ A_u \left( \Delta_{i-1}^u (\Delta_{i-1}^u)^T \right) C_u^T + B_u D_u^T \right] B^T. \qquad (20)$$

For stable $A_u$ and $A$ we have: $\lim_{i \to \infty} [\Delta_i^u (W_i^u)^T (\Delta_i^d)^T] = \lim_{i \to \infty} [\Delta_{i-1}^u (W_{i-1}^u)^T (\Delta_{i-1}^d)^T]$. Taking the limit for $i \to \infty$ on both sides of (20), we thus find with (16) that:

$$\lim_{i \to \infty} \left[ \Delta_i^u (W_i^u)^T (\Delta_i^d)^T \right] = A_u \left[ \lim_{i \to \infty} \left[ \Delta_i^u (W_i^u)^T (\Delta_i^d)^T \right] \right] A^T + \left[ A_u P_{22} C_u^T + B_u D_u^T \right] B^T,$$

which proves that $\lim_{i \to \infty} [\Delta_i^u (W_i^u)^T (\Delta_i^d)^T]$ is the solution of the same equation (14) as $P_{12}^T$ and thus proves (17).

Finally, we prove (8). Using $\Delta_i^d = (A \Delta_{i-1}^d \ \ B)$, $\Delta_i^s = (A \Delta_{i-1}^s \ \ E)$ and the partition of $W_i^u$ used in (20):

$$\Delta_i^d W_i^u (W_i^u)^T (\Delta_i^d)^T + \Delta_i^s (\Delta_i^s)^T = A \left[ \Delta_{i-1}^d W_{i-1}^u (W_{i-1}^u)^T (\Delta_{i-1}^d)^T + \Delta_{i-1}^s (\Delta_{i-1}^s)^T \right] A^T + E E^T + A \left[ \Delta_{i-1}^d W_{i-1}^u (\Delta_{i-1}^u)^T \right] C_u^T B^T + B C_u \left[ \Delta_{i-1}^u (W_{i-1}^u)^T (\Delta_{i-1}^d)^T \right] A^T + B \left[ C_u \left( \Delta_{i-1}^u (\Delta_{i-1}^u)^T \right) C_u^T + D_u D_u^T \right] B^T. \qquad (21)$$

Through a similar reasoning as before, it is easy to prove that with $A$ and $A_u$ stable:

$$\lim_{i \to \infty} \left[ \Delta_i^d W_i^u (W_i^u)^T (\Delta_i^d)^T + \Delta_i^s (\Delta_i^s)^T \right] = \lim_{i \to \infty} \left[ \Delta_{i-1}^d W_{i-1}^u (W_{i-1}^u)^T (\Delta_{i-1}^d)^T + \Delta_{i-1}^s (\Delta_{i-1}^s)^T \right],$$

which proves with (21), (16) and (17) that $\lim_{i \to \infty} [\Delta_i^d W_i^u (W_i^u)^T (\Delta_i^d)^T + \Delta_i^s (\Delta_i^s)^T]$ is the solution of the same equation (13) as $P_{11}$ and thus proves (8). The proof of (9) is analogous to the proof of (8).

B Proof of Theorem 1

From [8] we have:

$$\tilde{X}_i = \begin{pmatrix} [A^{i-1} - Q_i \Gamma_i]\, S (R^{-1})_{|mi} + \Delta_i^d - Q_i H_i^d & \ Q_i \end{pmatrix} \begin{pmatrix} U_p \\ Y_p \end{pmatrix}. \qquad (22)$$

For the meaning of the symbols we refer to [8]. The only properties we use here are (from [8], for stable $A$):

$$\lim_{i \to \infty} [A^{i-1} - Q_i \Gamma_i] = 0, \qquad \lim_{i \to \infty} [Q_i \Gamma_i] = 0, \qquad \lim_{i \to \infty} \left[ Q_i H_i^s (H_i^s)^T Q_i^T \right] = \lim_{i \to \infty} \left[ \Delta_i^s (\Delta_i^s)^T \right]. \qquad (23)$$

Using $Y_p = \Gamma_i X_p + H_i^d U_p + H_i^s E_p$, where $X_p$ are the past states of the forward innovation model (3) [8], we can rewrite (22) as:

$$\tilde{X}_i = M_i + \Delta_i^d U_p + Q_i H_i^s E_p, \qquad (24)$$

with: $M_i = [A^{i-1} - Q_i \Gamma_i]\, S (R^{-1})_{|mi}\, U_p + Q_i \Gamma_i X_p$. Since the input $u_k$ and the noise $e_k$ are uncorrelated, we also know that [8]:

$$E_j[U_p E_p^T] = 0, \qquad E_j[M_i E_p^T] = 0, \qquad E_j[E_p W_2 W_2^T E_p^T] = E_j[E_p E_p^T] = I. \qquad (25)$$

Formula (24) combined with the expression for $W_2$ in (10) leads to:

$$E_j[\tilde{X}_i W_2 W_2^T \tilde{X}_i^T] = E_j\!\left[ (M_i + \Delta_i^d U_p + Q_i H_i^s E_p)\, W_2 W_2^T\, (M_i + \Delta_i^d U_p + Q_i H_i^s E_p)^T \right]$$
$$= \Delta_i^d\, E_j[U_p W_2 W_2^T U_p^T]\, (\Delta_i^d)^T + Q_i H_i^s\, E_j[E_p W_2 W_2^T E_p^T]\, (H_i^s)^T Q_i^T + C_i$$
$$= \Delta_i^d W_i^u (W_i^u)^T (\Delta_i^d)^T + Q_i H_i^s (H_i^s)^T Q_i^T + C_i, \qquad (26)$$

where the cross terms with $E_p$ vanish by (25), where $C_i = E_j[M_i W_2 W_2^T M_i^T + M_i W_2 W_2^T U_p^T (\Delta_i^d)^T + \Delta_i^d U_p W_2 W_2^T M_i^T]$, and where we used $E_j[U_p U_p^T] = R_p$ and (25) to obtain $E_j[U_p W_2 W_2^T U_p^T] = R_p R_p^{-1} W_i^u L_p^{-1} R_p L_p^{-T} (W_i^u)^T R_p^{-1} R_p = W_i^u (W_i^u)^T$.

Using (23) it is easy to show that $\lim_{i \to \infty} [C_i] = 0$. Taking the limit for $i \to \infty$ on both sides of (26) thus gives (use (23)):

$$\lim_{i \to \infty} \left[ E_j[\tilde{X}_i W_2 W_2^T \tilde{X}_i^T] \right] = \lim_{i \to \infty} \left[ \Delta_i^d W_i^u (W_i^u)^T (\Delta_i^d)^T + \Delta_i^s (\Delta_i^s)^T \right]. \qquad (27)$$

On the other hand, we know from (5) that:

$$\lim_{i \to \infty} \left[ E_j[\tilde{X}_i W_2 W_2^T \tilde{X}_i^T] \right] = \lim_{i \to \infty} \left[ S_1^{1/2} V_1^T V_1 S_1^{1/2} \right] = S_1. \qquad (28)$$

Finally, from (27) and (28) we find:

$$\lim_{i \to \infty} \left[ \Delta_i^d W_i^u (W_i^u)^T (\Delta_i^d)^T + \Delta_i^s (\Delta_i^s)^T \right] = S_1.$$

From Lemma 1 we know that the left part of this expression is equal to $P[W^u(z)]$. This leads to $P[W^u(z)] = S_1$, which is exactly the first part of (11). From (5) we know that $\Gamma_i^T W_1^T W_1 \Gamma_i = S_1^{1/2} U_1^T U_1 S_1^{1/2} = S_1$. Combining this with Lemma 1 (9) and with $W_1 = W_i^y$ (10), we find:

$$Q[W^y(z)] = \lim_{i \to \infty} \left[ \Gamma_i^T (W_i^y)^T W_i^y \Gamma_i \right] = \lim_{i \to \infty} \left[ \Gamma_i^T W_1^T W_1 \Gamma_i \right] = S_1,$$

which is the second part of (11).
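As a numerical sanity check on Lemma 1, the sketch below (Python/NumPy; all system and weight values are illustrative assumptions with a scalar input, $m = 1$) compares the weighted controllability Gramian $P_{11}$ from the augmented Lyapunov equation (6), summed as a series, with the limit expression (8) truncated at a large $i$:

```python
import numpy as np

# system (A, B, E) and a first-order input weight
# W^u(z) = D_u + C_u (zI - A_u)^{-1} B_u  (illustrative values)
A = np.array([[0.7, 0.2], [0.0, 0.3]]); B = np.array([[1.0], [0.5]])
E = np.array([[0.4], [0.1]])
Au = np.array([[0.5]]); Bu = np.array([[1.0]])
Cu = np.array([[0.8]]); Du = np.array([[1.0]])

# P11 from the augmented Lyapunov equation (6), solved by series summation
M = np.block([[A, B @ Cu], [np.zeros((1, 2)), Au]])
N = np.block([[B @ Du, E], [Bu, np.zeros((1, 1))]])
P_aug, Mk = np.zeros((3, 3)), np.eye(3)
for _ in range(300):
    P_aug += Mk @ N @ N.T @ Mk.T
    Mk = M @ Mk
P11 = P_aug[:2, :2]              # the W^u(z) weighted controllability Gramian

# the limit expression (8), truncated at a large i
i = 150
Dd = np.hstack([np.linalg.matrix_power(A, i - 1 - k) @ B for k in range(i)])
Ds = np.hstack([np.linalg.matrix_power(A, i - 1 - k) @ E for k in range(i)])
w = [Du[0, 0]] + [(Cu @ np.linalg.matrix_power(Au, k - 1) @ Bu)[0, 0]
                  for k in range(1, i)]        # Markov parameters of W^u(z)
Wu = sum(w[k] * np.diag(np.ones(i - k), -k) for k in range(i))   # Toeplitz W_i^u
P11_limit = Dd @ Wu @ Wu.T @ Dd.T + Ds @ Ds.T
```

With both spectral radii well below one, the truncated sums have converged, and the two computations of the weighted Gramian agree, as Lemma 1 asserts.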