CME 345: MODEL REDUCTION Balanced Truncation Charbel Farhat & David Amsallem Stanford University cfarhat@stanford.edu These slides are based on the recommended textbook: A.C. Antoulas, Approximation of Large-Scale Dynamical Systems, Advances in Design and Control, SIAM, ISBN-0-89871-529-6 1 / 42
Outline 1 Reachability and Observability 2 Balancing 3 Balanced Truncation Method 4 Error Analysis 5 Stability Analysis 6 Computational Complexity 7 Comparison with the POD Method 8 Application 9 Balanced POD Method 2 / 42
Reachability and Observability Systems Considered Consider the following stable LTI high-dimensional system d w(t) dt = Aw(t) + Bu(t) y(t) = Cw(t) w(0) = w 0 w R N : State variables u R p : Input variables, typically p N y R q : Output variables, typically q N Recall that the solution w(t) of the above linear ODE can be written as w(t) = φ(t, u; t 0, w 0 ) = e A(t t0) w(t 0 )+ t t 0 e A(t τ) Bu(τ)dτ, t t 0 3 / 42
y(t) = Cφ(t, 0; 0, w) = 0 q 4 / 42 CME 345: MODEL REDUCTION - Balanced Truncation Reachability and Observability Reachability, Controllability and Observability Definition (1) A state w R N is reachable if there exists an input function u(.) of finite energy and a time T < such that under this input and zero initial condition, the state of the system is w Definition (2) A state w R N is controllable to the zero state if there exist an input function u(.) and a time T < such that φ(t, u; 0, w) = 0 N Definition (3) A state w R N is unobservable if for all t 0,
Reachability and Observability Completely Controllable Dynamical System Definition (4 - R.E. Kalman, 1963) A linear dynamical system (A, B, C) is completely controllable at time t 0 if it is not equivalent, for all t t 0, to a system of the type d dt w(1) = A (1,1) w (1) + A (1,2) w (2) + B (1) u d dt w(2) = A (2,2) w (2) y(t) = C (1) w (1) + C (2) w (2) Interpretation: It is not possible to find a coordinate system in which the state variables are separated into two groups, w (1) and w (2), such that the second group is not affected neither by the first group, nor by the inputs to the system Controllability is only a property of (A, B) This definition can be extended to linear time-variant systems 5 / 42
Reachability and Observability Completely Observable Dynamical System Definition (5) A linear dynamical system (A, B, C) is completely observable at time t 0 if it is not equivalent, for all t t 0, to any system of the type d dt w(1) = A (1,1) w (1) + B (1) u d dt w(2) = A (2,1) w (1) + A (2,2) w (2) + B (2) u y(t) = C (1) w (1) Dual of the previous definition Interpretation: It is not possible to find a coordinate system in which the state variables are separated into two groups, w (1) and w (2), such that the second group is not affected neither by the first group, nor by the outputs of the system Observability is only a property of (A, C) This definition can be extended to linear time-variant systems 6 / 42
Reachability and Observability Example: Simple RLC Circuit y 1 R =1 R =1 u 1 + _ w 2 w 1 L C The equation of the network in terms of the current w 1 flowing through the inductor, and the potential w 2 across the capacitor (LC = R = 1), is given by d dt w 1 = 1 L w 1 + u 1 d dt w 2 = 1 C w 2 + u 1 y 1 = 1 L w 1 + 1 C w 2 + u 1 7 / 42
Reachability and Observability Example: Simple RLC Circuit Under the change of variable w 1 = 0.5(w 1 + w 2 ) and w 2 = 0.5(w 1 w 2 ), the system becomes d dt w 1 = 1 L w 1 + u 1 d dt w 2 = 1 L w 2 y 1 = 2 L w 2 + u 1 w 1 is controllable but not observable w 2 is observable but not controllable 8 / 42
Reachability and Observability Canonical Structure Theorem Theorem (Kalman 1961) Consider a dynamical system (A, B, C). Then (i) There is a state space coordinate system in which the components of the state vector can be decomposed into four parts w = [w (a), w (b), w (c), w (d) ] T (ii) The sizes N a, N b, N c and N d of these vectors do not depend on the choice of basis (iii) The system matrices take the form A = A (a,a) A (a,b) A (a,c) A (a,d) 0 A (b,b) 0 A (b,d) 0 0 A (c,c) A (c,d) 0 0 0 A (d,d) C = [ 0 C (b) 0 C (d) ], B = B (a) B (b) 0 0, 9 / 42
Reachability and Observability Canonical Structure Theorem The four parts of w can be interpreted as follows Part (a) is completely controllable but unobservable Part (b) is completely controllable and completely observable Part (c) is uncontrollable and unobservable Part (d) is uncontrollable and completely observable This theorem can be extended to linear time-variant systems 10 / 42
Reachability and Observability Reachable and Controllable Subspaces Definition (6) The reachable subspace W reach R N of a system (A, B, C) is the set containing all of the reachable states of the system. R(A, B) = [B, AB,, A N 1 B, ] is the reachability matrix of the system. Definition (7) The controllable subspace W contr R N of a system (A, B, C) is the set containing all of the controllable states of the system 11 / 42
Reachability and Observability Reachable and Controllable Subspaces Theorem Consider the system (A, B, C). Then W reach = im R(A, B) Corollary (i) AW reach W reach (ii) The system is completely reachable if and only if rank R(A, B) = N (iii) Reachability is basis independent 12 / 42
Reachability and Observability Reachability and Observability Gramians Definition The reachability (controllability) Gramian at time t < is defined as the N N symmetric positive semi-definite matrix P(t) = t 0 e Aτ BB e A τ dτ Definition The observability Gramian at time t < is defined as the N N symmetric positive semi-definite matrix Q(t) = t 0 e A τ C Ce Aτ dτ 13 / 42
Reachability and Observability Reachability and Observability Gramians Proposition The columns of P(t) span the reachability subspace W reach = im R(A, B) Corollary A system (A, B, C) is reachable if and only if P(t) is symmetric positive definite at some time t > 0 14 / 42
Reachability and Observability Equivalence Between Reachability and Controllability Theorem For continuous linear dynamical systems, the notions of controllability and reachability are equivalent that is, W reach = W contr 15 / 42
Reachability and Observability Unobservability Subspace Definition The unobservability subspace W unobs R N is the set of all unobservable states of the system. O(C, A) = [C, A C,, (A ) i C, ] is the observability matrix of the system. 16 / 42
Reachability and Observability Unobservability Subspace Theorem Consider the system (A, B, C). Then W unobs = ker O(C, A) Corollary (i) AW unobs W unobs (ii) The system is completely observable if and only if rank O(C, A) = N (iii) Observability is basis independent 17 / 42
Reachability and Observability Infinite Gramians Definition The infinite reachability (controllability) Gramian is defined for a stable LTI system as the N N symmetric positive semi-definite matrix P = 0 e At BB e A t dt Definition The infinite observability Gramian is defined for a stable LTI system as the N N symmetric positive semi-definite matrix Q = 0 e A t C Ce At dt 18 / 42
Reachability and Observability Infinite Gramians Using the Parseval theorem, the two previously defined Gramians can be written in the frequency domain as follows infinite reachability Gramian P = 1 2π infinite observability Gramian Q = 1 2π (jωi N A) 1 BB ( jωi N A ) 1 dω ( jωi N A ) 1 C C(jωI N A) 1 dω 19 / 42
Reachability and Observability Infinite Gramians The two infinite Gramians are solution of the following Lyapunov equations infinite reachability Gramian infinite observability Gramian AP + PA + BB = 0 N A Q + QA + C C = 0 N 20 / 42
Reachability and Observability Energetic Interpretation P and Q are respective bases for the reachable and observable subspaces P 1 and Q are semi-norms For a reachable system, the inner product based on P 1 characterizes the minimal energy required to steer the state from 0 to w as t w 2 P = 1 wt P 1 w The inner product based on Q indicates the maximal energy produced by observing the output of the system corresponding to an initial state w 0 when no input is applied w 0 2 Q = w T 0 Qw 0 21 / 42
Balancing Model Order Reduction Based on Balancing Model order reduction strategy: Eliminate the states w that are at the same time difficult to reach, i.e., require a large energy w 2 P to be reached 1 difficult to observe, i.e., produce a small observation energy w 2 Q The above notions are basis dependent One would like to consider a basis where both concepts are equivalent, i.e., where the system is balanced 22 / 42
Balancing The Effect of Basis Change on the Gramians Balancing requires changing the basis for the state using a transformation T GL(N) w = Tw Then e At B Te At T 1 TB = Te At B B e A t B T T e A t T = B e A t T the reachability Gramian becomes P = TPT Ce At CT 1 Te At T 1 = Ce At T 1 e A t C T e A t T T C = T e A t C the observability Gramian becomes Q = T QT 1 23 / 42
Balancing Balancing Transformation The balancing transformations T bal and T 1 bal can be computed as follows 1 compute the Cholesky factorization P = UU 2 compute the eigenvalue decomposition of U QU U QU = KΣ 2 K where the entries in Σ are ordered decreasingly 3 compute the transformations One can then check that T bal = Σ 1 2 K U 1 T 1 bal = UKΣ 1 2 T bal PT bal = T bal QT 1 bal = Σ Definition (Hankel Singular Values) Σ = diag(σ 1,, σ N ) contains the N Hankel singular values of the system 24 / 42
Balancing Variational Interpretation Computing the balancing transformation T bal is equivalent to minimizing the following function min f (T) = min T GL(N) T GL(N) trace(tpt + T QT 1 ) For the optimal transformation T bal, the function takes the value f (T bal ) = 2tr(Σ) = 2 N i=1 where {σ i } N i=1 are the Hankel singular values σ i 25 / 42
Balanced Truncation Method Block Partitioning of the System Applying the change of variable w = T bal w transforms the given dynamical system into (Ã, B, C) where T bal AT 1 bal = Ã, T balb = B, CT 1 bal = C Let 1 k N; the system can be partitioned in blocks as [ ] [ ] Ã11 Ã Ã = 12 B, B = 1 [ ], C = C Ã 21 Ã 22 B 1 C 2 2 The subscripts 1 and 2 denote the dimensions k and N k, respectively 26 / 42
Balanced Truncation Method Block Partitioning of the System The blocks with the subscript 1 correspond to the most observable and reachable states The blocks with the subscript 2 correspond to the least observable and reachable states The ROM of size k obtained by Balanced Truncation is then the system (A r, B r, C r ) = (Ã11, B 1, C 1 ) R k k R k p R q k The left and right ROBs are W = T bal(:, 1 : k) and V = S bal (:, 1 : k), respectively where S bal = T 1 bal 27 / 42
Error Analysis Error Criterion Definition (The Hardy space H ) The H -norm associated with a system characterized by a transfer function G( ) is defined as G H = sup z C + G(z) = sup z C + σ max (G(z)) Proposition (i) G H (ii) G H = sup ω R G(iω) = sup σ max (G(iω)) ω R y( ) 2 = sup = sup u 0 u( ) 2 u 0 0 y(t) 2 2 dt 0 u(t) 2 2 dt The H norm of the error system between the high-dimensional and reduced-order models can (and will) be used as an error criterion 28 / 42
Error Analysis Theorem Theorem (Error Bounds) The Balanced Truncation procedure yields the following error bound for the output of interest. Let { σ i } N SV i=1 {σ i} N i=1 denote the distinct Hankel singular values of the system and { σ i } N SV i=n k +1 the ones that have been truncated. Then y( ) y r ( ) 2 2 N SV i=n k +1 σ i u( ) 2 Equivalently, the above result can be written in terms of the H -norm of the error system as follows G( ) G r ( ) H 2 N SV i=n k +1 where G and G r are the full- and reduced-order transfer functions. Equality holds when σ Nk +1 = σ NSV (all truncated singular values are equal). σ i 29 / 42
Error Analysis Theorem Proof. The proof proceeds in 3 steps 1 consider the error system E(s) = G(s) G r (s) 2 chow that if all the truncated singular values are all equal to σ, then E( ) H = 2σ 3 use this result to show that in the general case E( ) H 2 n SV i=n k +1 σ i 30 / 42
Stability Analysis Theorem Theorem (Stability Preservation) Consider (A r, B r, C r ) = (Ã 11, B 1, C 1 ), a ROM obtained by Balanced Truncation. Then (i) A r = Ã1 has no eigenvalues in the open right half plane (ii) Furthermore, if the systems (Ã 11, B 1, C 1 ) and (Ã 22, B 2, C 2 ) have no Hankel singular values in common, A r has no eigenvalues on the imaginary axis 31 / 42
Computational Complexity Numerical Methods Because of numerical stability issues, computing the transformations T bal = Σ 1 2 K U 1, T 1 bal = UKΣ 1 2 is usually ill-advised (computation of inverses) Instead, it is better advised to use the following procedure 1 start from the Cholesky decompositions of the Gramians P = UU and Q = ZZ 2 compute the SVD 3 construct the matrices U Z = WΣV T bal = V Z and T 1 bal = UW 32 / 42
Computational Complexity Cost and Limitations Balanced Truncation leads to ROMs with quality and stability guarantees; however the computation of the Gramians is expensive as it requires the solution of a Lyapunov equation (O(N 3 ) operations) for this reason, Balanced Truncation is in general impractical for large systems of size N 10 5 33 / 42
Comparison with the POD Method POD Recall the theorem underlying the construction of a POD basis Theorem Let K R N N be the real symmetric semi-definite positive matrix defined as T ˆK = w(t)w(t) T dt 0 Let ˆλ 1 ˆλ 2 ˆλ N 0 denote the ordered eigenvalues of K, and let φ i R N, i = 1,, N denote the associated eigenvectors K φ i = ˆλ i φi, i = 1,, N. The subspace V = range( V) of dimension k minimizing J(Π V,V ) is the invariant subspace of K associated with the eigenvalues ˆλ 1,, ˆλ k 34 / 42
Comparison with the POD Method POD for an Impulse Response The response of an LTI system to a single impulse input with a zero initial condition is w(t) = e At B Consequently, the reachability Gramian is K = T 0 w(t)w(t) T dt = T 0 e At BB T e AT t dt = P Unlike the Balanced Truncation method, the POD method does not take into account the observability Gramian to determine the ROM, and therefore every observable state may be truncated 35 / 42
Application CD Player System The goal is to model the position of the lens controlled by a swing arm 2-inputs/2-outputs system 36 / 42
Application CD Player System Bode plots associated with the HDM-based solution (N = 120): Each column represents one input and each row represents a different output From: In(1) Bode Diagram From: In(2) 100 To: Out(1) 0 100 Magnitude (db) 200 100 0 To: Out(2) 100 200 10 0 10 5 Frequency (rad/sec) 10 0 10 5 37 / 42
Application CD Player System Bode plots associated with the Balanced Truncation-based ROM solution: Each column represents one input and each row represents a different output To: Out(1) 150 100 50 0 From: In(1) Bode Diagram From: In(2) ssfom ssbt_5 ssbt_10 ssbt_20 ssbt_30 ssbt_40 50 Magnitude (db) 100 50 To: Out(2) 0 50 100 150 10 5 Frequency (rad/sec) 10 5 38 / 42
Balanced POD Method The Balanced POD Method The Balanced POD (BPOD) method generates snapshots for the dual system in addition to the POD snapshots S = [ (jω 1 I A) 1 B,, (jω Nsnap I A) 1 B ] S dual = [ ( jω 1 I A ) 1 C,, ( jω Nsnap I A ) 1 C ] Then, right and left ROBs can be computed as follows S T duals = UΣZ T (SVD) V = SZ k Σ 1/2 k W = S dual U k Σ 1/2 k where the subscript k designates the first k terms of the singular value decomposition If no truncation occurs, the result is equivalent to two-sided moment matching at s i {ω 1,, ω Nsnap } 39 / 42
Balanced POD Method Balanced Truncation and POD in the Time Domain The POD method in the time domain is based solely on the reachability concept However, the BPOD method introduces also the notion of observability in the construction of a ROM is tractable for very large-scale systems provides an approximation to the Balanced Truncation method does not however guarantee in general the stability of the resulting ROM 40 / 42
Balanced POD Method Application Supersonic Inlet Problem (part of the Oberwolfach Model Reduction Benchmark Collection repository) Incoming flow Inlet throat Shock Inlet disturbance E d w(t) dt = Aw(t) + Bu(t) y(t) = Cw(t) N = 11, 370 (2-D Euler equations) p = 1 input (density disturbance of the inlet flow) q = 1 output (average Mach number at the diffuser throat) 41 / 42
Balanced POD Method Application Model reduction in the frequency domain using POD BPOD Same frequency sampling for the computation of the snapshots in both cases Plot of the magnitude of the relative error in the transfer function within the sampled frequency interval as a function of the ROM dimension k 10 2 10 0 10 2 10 4 10 6 POD Balanced POD 10 8 10 12 14 16 18 20 42 / 42