An efficient DC-gain matched balanced truncation realization for VLSI interconnect circuit order reduction


Microelectronic Engineering 60 (2002) 3-15
www.elsevier.com/locate/mee

Xuan Zeng a,*, Dian Zhou b, Wei Cai c

a ASIC & System State-Key-Lab, Microelectronic Department, Fudan University, Shanghai 200433, PR China
b E. E. Department, University of Texas at Dallas, Richardson, TX 75083, USA
c E. E. Department, University of North Carolina at Charlotte, Charlotte, NC 28223, USA

Abstract

In this paper, we present a linear time DC-gain matched BTR (DBTR) method for VLSI interconnect order reduction. From the circuit point of view, the original BTR has a serious drawback: the DC gains of the original and the reduced order systems do not match. We propose the DBTR method, which both matches the DC gain and guarantees the performance of the reduced order system. Moreover, considering that the order of a practical VLSI circuit can reach several thousand, we derive a linear time algorithm for computing a DBTR by extending the O(n) Krylov Subspace Oblique Projection. With the linear time algorithm, the obstacle posed by the expensive computation cost of BTR for large circuit order reduction is removed, and the advantage of BTR, guaranteed performance of the reduced order system, can be fully utilized. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Order reduction; Balanced truncation realization; Krylov subspace oblique projection; VLSI interconnect

1. Introduction

With the remarkable evolution of VLSI technology, the minimum feature size in VLSI circuits reaches 0.1 micrometers and the operating frequency reaches multi-GHz. In high-speed deep-submicron chips, circuit performance is dominated by the VLSI interconnects. Evaluating circuit performance therefore requires fast and accurate circuit simulation. Since the frequency is high, the interconnections and packaging should be modeled by distributed circuits rather than lumped components.
Because of the circuit density and the features of deep submicron processes, the parasitic resistors, capacitors and inductors of the interconnects must be considered. Such a detailed modeling level eventually results in extremely large-scale linear circuits to be analyzed. Reducing the circuit order (or size) is then necessary in order to evaluate the circuit performance and characteristics in a reasonable amount of time, as required by real design practice. The process of reducing the order of a linear system is called linear system order reduction.

*Corresponding author. Tel.: +86-1-6564-764; fax: +86-1-6564-0850. E-mail address: xzeng@fudan.edu.cn (X. Zeng).
0167-9317/02/$ - see front matter. PII: S0167-9317(01)00576-7

The pioneering work on VLSI interconnect circuit order reduction, Asymptotic Waveform Evaluation for Timing Analysis (AWE), is based on the Pade approximation [1]. AWE approximates the Laplace-domain transfer function of a linear network by a reduced-order model containing only a relatively small number of dominant poles and residues. The original numerical method used in AWE suffered severe numerical difficulties when a high order approximation was needed. This problem was resolved later by using a more stable numerical method, the Lanczos algorithm [2]. Other problems of the approaches based on the Pade approximation, such as preserving passivity and reciprocity, have also been resolved [3-5]. In the past 10 years, AWE and its variations have emerged as the dominant approach for the efficient analysis of large linear networks [6,7,,8,9].

Although order reduction methods based on the Pade approximation have had spectacular success in solving many practical VLSI interconnect/packaging problems, it is well known that the Pade approximation generally does not guarantee performance [10,11]. The Pade approximation matches the original and approximated systems at a chosen frequency point. The solution is valid only in the neighborhood of the frequency expansion point, and consequently the resulting error cannot be bounded over a wide frequency range. The remedial techniques, such as scaling, frequency shifting, and complex frequency hopping, are sometimes heuristic, hard to apply automatically, and may be computationally expensive, as pointed out in []. In addition, those heuristics often lack a strong theoretical basis.
Recently, effort has been made to develop new order reduction methods that guarantee an error bound on the approximation. Chen [12] proposes a method called Pade Approximation via Bilinear Conformal Transformation (PVBCT). This method provides an error bound in the time domain. However, the proposed bound is too weak to be practically useful, and, as with AWE, it is not easy to find a proper order that gives a good approximation. In this sense, the method remains heuristic.

Recently a fast wavelet collocation method (FWCM) [13-15] has been developed for solving Ordinary Differential Equations (ODE). The wavelet method achieves order reduction by representing the system transfer function as a wavelet series and discarding the insignificant terms in the expansion. The wavelet method guarantees the performance of the reduced-order system, since the transfer function of the original system can be approximated to any specified accuracy. Moreover, the characteristics of the reduced-order system can be easily analyzed because wavelets are localized in both the frequency and the time domain. However, the complexity of wavelet order reduction is $O(n^3)$, where $n$ is the number of state variables of the original system.

Rabiei [11] uses a well-known method developed in the control area, Balanced Truncation Realization (BTR), observing that BTR guarantees the performance of the reduced order system [16]. However, the reduced order system obtained through BTR generally does not match the DC gain of the original system [17,18]. Such a characteristic is very undesirable in practical circuit design, since DC gain correctness is a critical criterion in any circuit design, though it is less important in the control area. In addition to the DC gain mismatch, the computation cost of BTR is very high, which has prevented applying BTR to large VLSI circuits whose order can easily reach several thousand.
In an effort to improve the computational efficiency, Li [10] proposed to compute only the controllability gramian and not to use the observability gramian. The resulting reduced order system is therefore not balanced, and as a result its performance is not guaranteed. Even with the ADI computation method proposed in [10], the time complexity remains $O(n^3)$, where $n$ is the order of the original system. As a matter of fact, all existing order reduction methods (including the Pade approximation based methods) have a computational complexity of $O(n^3)$, because they need either to compute $A^{-1}$, where $A$ is the matrix describing the linear system, or to find the singular values of the original system.

This paper addresses the above mentioned problems. We develop (i) an order reduction method that guarantees the performance of the reduced order system (with an upper bound on the error) and a matched DC gain, and (ii) an efficient linear time algorithm for the proposed order reduction method. The results are significant for linear system order reduction from both the theoretical and the practical point of view.

2. Problem formulation

We define the linear system order reduction problem first. Assume a linear system is specified by

$$\dot{x}(t) = A x(t) + B u(t), \tag{1}$$

$$y(t) = C x(t) + D u(t), \tag{2}$$

where for each $t$, $u(t) \in \mathbb{C}^m$, $x(t) \in \mathbb{C}^n$ and $y(t) \in \mathbb{C}^p$ are the vectors of inputs, states, and outputs, respectively. The matrices $A \in \mathbb{C}^{n \times n}$, $B \in \mathbb{C}^{n \times m}$, $C \in \mathbb{C}^{p \times n}$ and $D \in \mathbb{C}^{p \times m}$ are assumed to be constant. The transfer function of the system is given by

$$G(s) = D + C(sI - A)^{-1}B. \tag{3}$$

It has McMillan degree $n$ if the system is minimal. The model order reduction problem is to find a system whose transfer function $\hat{G}_k(s)$ has McMillan degree $k < n$, such that the reduced order system retains some of the fundamental properties of the original system and/or is close to the original system in a suitable sense.

3. Order reduction method for guaranteed performance and matched DC gain

Mullis and Roberts first introduced balanced realizations in the context of roundoff noise in digital filters [19]. Later, Moore [20] proposed balanced truncation model reduction. Enns [21] and Glover [16] further gave the error bound for this method. The method is defined as follows.
A stable minimal system $(A, B, C, D)$ (Eqs. (1)-(2)) is called balanced if the solutions $P$ and $Q$ to the Lyapunov equations

$$AP + PA' + BB' = 0, \tag{4}$$

$$A'Q + QA + C'C = 0 \tag{5}$$

are such that $P = Q = \Sigma = \mathrm{diag}(\sigma_1 I_{n_1}, \sigma_2 I_{n_2}, \ldots, \sigma_l I_{n_l})$ with Hankel singular values $\sigma_1 > \sigma_2 > \cdots > \sigma_l$, where $\sum_{i=1}^{l} n_i = n$ and $n_i$ is the multiplicity of the Hankel singular value $\sigma_i$. The important fact is that for a stable minimal system there exists a state space transformation that brings the system into balanced coordinates. If a $k$-dimensional reduced order system is sought, the model reduction method consists of block partitioning the balanced system $(A, B, C, D)$ and retaining the subsystem $(A_{11}, B_1, C_1, D)$ that corresponds to the first $k$ states, where $k = \sum_{i=1}^{j} n_i$, $k < n$. Let $\hat{G}_k$ be the transfer function of the reduced order system. The following $H_\infty$-norm bound holds in the frequency domain,

$$\sup_{\omega} \|G(j\omega) - \hat{G}_k(j\omega)\| = \sup_{\omega} \bar{\sigma}\left[G(j\omega) - \hat{G}_k(j\omega)\right] \le 2 \sum_{i=j+1}^{l} \sigma_i(G(s)), \tag{6}$$

where $\bar{\sigma}[\cdot]$ denotes the largest singular value, and in the time domain,

$$\sup_{\|u(t)\| \le 1} \frac{\|e(t)\|}{\|u(t)\|} \le 2 \sum_{i=j+1}^{l} \sigma_i(G(s)), \tag{7}$$

where $u(t)$ is any input signal with $\|u(t)\| \le 1$, and $e(t)$ is the output error between the original model and the balanced truncated model. Clearly, the maximum ratio of the error signal energy to the input signal energy is bounded by $2\sum_{i=j+1}^{l} \sigma_i(G(s))$.

A VLSI interconnect circuit is a passive network and hence is stable, i.e., the eigenvalues of $A$ in Eq. (1) lie in the left half plane. The controllability and observability gramians $P$ and $Q$ are at least positive semidefinite. Thus, a Cholesky factorization of the matrix $Q$ exists:

$$Q = R^T R. \tag{8}$$

Next, the matrix $RPR^T$ can be diagonalized as

$$RPR^T = U \Sigma^2 U^T, \quad U^T U = I, \tag{9}$$

$$\Sigma = \mathrm{diag}(\sigma_1 I_{s_1}, \sigma_2 I_{s_2}, \ldots, \sigma_N I_{s_N}), \quad \sigma_1 > \sigma_2 > \cdots > \sigma_N, \quad \sum_{i=1}^{N} s_i = n, \tag{10}$$

where $\sigma_i$, $i = 1, \ldots, N$, are the Hankel singular values of the system of Eqs. (1)-(2), with multiplicities $s_i$. The balancing transformation is obtained as

$$T = \Sigma^{-1/2} U^T R. \tag{11}$$

The transformed system $\{A_b = TAT^{-1},\ B_b = TB,\ C_b = CT^{-1},\ D_b = D\}$ has balanced controllability and observability gramians $P_b = Q_b = \Sigma$. Write the system in a partitioned form with respect to the controllability and observability gramians,

$$A_b = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad B_b = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}, \quad C_b = \begin{bmatrix} C_1 & C_2 \end{bmatrix}, \quad D_b, \tag{12}$$

where the sub-matrix $A_{11}$ has dimension

$$k = \sum_{i=1}^{j} s_i, \tag{13}$$

and the other blocks have the corresponding dimensions. The balanced truncated model is

$$\{A_{11}, B_1, C_1, D\} \tag{14}$$

with dimension $k$, i.e., reduced order $k$. The important point is that the $H_\infty$-norm of the model reduction error in the $s$-domain (roughly speaking, the frequency domain) is bounded by

$$\|G(s) - \hat{G}_k(s)\|_\infty \le 2 \sum_{i=j+1}^{N} \sigma_i. \tag{15}$$
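The balancing steps of Eqs. (8)-(11) and the bound of Eq. (15) can be tried on a small problem. The following is a minimal sketch, not the authors' implementation: it assumes dense NumPy/SciPy matrices, a real-valued SISO system with distinct Hankel singular values, and a hypothetical 4-node RC ladder as the test circuit. The function names `balanced_truncation` and `freq_response` are illustrative only.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, eigh


def balanced_truncation(A, B, C, k):
    """Balance (A, B, C) following Eqs. (8)-(11) and keep the first k states."""
    # Gramians from the Lyapunov equations (4) and (5).
    P = solve_continuous_lyapunov(A, -B @ B.T)
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)
    P, Q = (P + P.T) / 2, (Q + Q.T) / 2        # remove round-off asymmetry
    R = cholesky(Q, lower=False)               # Q = R^T R, Eq. (8)
    lam, U = eigh(R @ P @ R.T)                 # R P R^T = U Sigma^2 U^T, Eq. (9)
    idx = np.argsort(lam)[::-1]                # Hankel singular values, descending
    sigma, U = np.sqrt(lam[idx]), U[:, idx]
    T = np.diag(sigma ** -0.5) @ U.T @ R       # Eq. (11)
    Tinv = np.linalg.inv(T)
    Ab, Bb, Cb = T @ A @ Tinv, T @ B, C @ Tinv
    return Ab[:k, :k], Bb[:k], Cb[:, :k], sigma


def freq_response(A, B, C, w):
    """Evaluate C (jw I - A)^{-1} B on a frequency grid w (SISO only)."""
    eye = np.eye(A.shape[0])
    return np.array([(C @ np.linalg.solve(1j * wk * eye - A, B)).item() for wk in w])


# Hypothetical test circuit: 4-node RC ladder, driven at node 1, observed at node 4.
n, k = 4, 2
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = np.zeros((n, 1)); B[0, 0] = 1.0
C = np.zeros((1, n)); C[0, -1] = 1.0

Ak, Bk, Ck, sigma = balanced_truncation(A, B, C, k)
w = np.logspace(-3, 3, 400)
err = np.max(np.abs(freq_response(A, B, C, w) - freq_response(Ak, Bk, Ck, w)))
bound = 2.0 * sigma[k:].sum()                  # right-hand side of Eq. (15)
print("Hankel singular values:", sigma)
print("max sampled error:", err, " bound:", bound)
```

The sampled error only probes a finite frequency grid, so it is a lower estimate of the true $H_\infty$ error; it should nevertheless stay below the bound printed next to it.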

Although BTR offers an upper bound on the reduction error, the approach has a drawback. A detailed study reveals that, while BTR tries to minimize the response difference between the original and the reduced-order system for any input with limited energy $\|u(t)\| \le 1$, the DC gains of the two systems are not matched. For simplicity of presentation, we only discuss the case of single input/output systems here. Generalizations to the multivariable case are given in [17,18]. The following result is due to Mahil et al. [22] and, in its current form, to Ober [23]. Let $G(s) = C(sI - A)^{-1}B$ be the transfer function of a stable minimal system. Then

$$G(0) = 2 \sum_{i=1}^{n} s_i \sigma_i(G(s)). \tag{16}$$

Here $s_i = \pm 1$ are signs that are uniquely determined by the system, $P = Q = \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_n)$, and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n$ are the Hankel singular values of $G(s)$. In fact, $s_1\sigma_1(G(s)), \ldots, s_n\sigma_n(G(s))$ are the eigenvalues of the Hankel operator associated with the system []. If $(A_{11}, B_1, C_1)$ is a $k$-dimensional ($k < n$) balanced reduced-order system of $(A, B, C)$ with transfer function $\hat{G}_k(s)$, then

$$\hat{G}_k(0) = 2 \sum_{i=1}^{k} s_i \sigma_i(G(s)). \tag{17}$$

This implies that the DC gain of the reduced order system deviates by $2\sum_{i=k+1}^{n} s_i \sigma_i(G(s))$ from that of the original system, and it is in fact typically far different.

Because of its importance in real circuit design, we illustrate this DC gain mismatch problem with an example. We computed the Hankel singular values of the example of Fig. 16 in [1]:

$$\{\sigma_1(G(s)), \sigma_2(G(s)), \ldots, \sigma_{10}(G(s))\} = \{0.5724, 0.077, 0.005, 0.000, 0.000, 0.000, 0, 0, 0, 0\}. \tag{18}$$

Note that only the first three Hankel singular values are non-zero. A third order balanced realization would perfectly match the original system. However, there will be a DC gain mismatch if a lower order approximation is used. In this particular example the first three sign parameters are $s_1 = 1$, $s_2 = -1$, $s_3 = 1$.
Hence, the DC gain of the original system is $G(0) = 1.008$, and that of the first order balanced approximant is $\hat{G}_1(0) = 2\sigma_1 = 1.1448$. The DC gain mismatch between the original system and a first order balanced approximant is $2\sum_{i=2}^{10} s_i \sigma_i(G(s))$, with magnitude 0.16. A matched DC gain is in fact a critical criterion in all VLSI interconnect/packaging circuit design. Note that the Pade approximation always results in a correct DC gain if the expansion frequency point is chosen at zero. But, as pointed out earlier, such an approach does not provide an error bound over a wide frequency range.

In order to have both a matched DC gain and a bounded reduction error, we propose the following modification of the balanced model reduction method [24]. First, the transfer function $G(s)$ is replaced by another transfer function $G_c(s)$, obtained from $G(s)$ by setting $G_c(s) := G(1/s)$. This mapping can be performed in state space, using elementary state space manipulations. In particular, if $(A, B, C, D)$ is a realization of $G(s)$, then a realization of $G_c(s)$ is obtained by

$$A_c = A^{-1}, \quad B_c = A^{-1}B, \quad C_c = -CA^{-1}, \quad D_c = D - CA^{-1}B. \tag{19}$$

Next, $G_c(s)$ is approximated by a lower order system $\hat{G}_c(s)$ using BTR. Finally, $\hat{G}_c(s)$ is mapped back to obtain $\hat{G}(s)$ by setting $\hat{G}(s) := \hat{G}_c(1/s)$, using the same transformation as in Eq. (19). Because truncation leaves the feedthrough term $D_c = G(0)$ untouched, and $\hat{G}(0) = \hat{G}_c(\infty) = D_c$, the DC gain is preserved. The two fundamental properties of this method are [24]:

1. The same error bound applies as in the original BTR method;
2. The DC error is zero.

A major drawback of the original BTR is therefore removed.

To demonstrate the effectiveness of this method, Fig. 1 shows a circuit (an example taken from [1]) in which the voltage across C7 is the output. Fig. 2 plots the output of the original and the reduced order systems. When BTR is used, the 1st order approximation has a significantly mismatched DC gain, while the proposed modified BTR overcomes this problem effectively. The figure also plots the performance of the 2nd and 3rd order approximations. It can be seen that the 2nd order approximation already matches the original system very well.

Fig. 1. An RC tree with a floating capacitor.

Fig. 2. Modified BTR with the matched DC gain.
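The three-step procedure above (map with Eq. (19), truncate, map back) can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' code: it uses dense matrices, a real-valued SISO system, and a hypothetical 4-node RC ladder. The helper names `reciprocal`, `btr`, `dbtr` and `dc_gain` are made up for this example.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, eigh


def reciprocal(A, B, C, D):
    """Realization of G(1/s) built from a realization of G(s), Eq. (19)."""
    Ainv = np.linalg.inv(A)
    return Ainv, Ainv @ B, -C @ Ainv, D - C @ Ainv @ B


def btr(A, B, C, D, k):
    """Plain balanced truncation to order k (square-root steps, Eqs. (8)-(11))."""
    P = solve_continuous_lyapunov(A, -B @ B.T)
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)
    R = cholesky((Q + Q.T) / 2, lower=False)            # Q = R^T R
    lam, U = eigh(R @ ((P + P.T) / 2) @ R.T)            # eigenvalues are sigma_i^2
    idx = np.argsort(lam)[::-1]
    T = np.diag(lam[idx] ** -0.25) @ U[:, idx].T @ R    # Sigma^{-1/2} U^T R
    Tinv = np.linalg.inv(T)
    return (T @ A @ Tinv)[:k, :k], (T @ B)[:k], (C @ Tinv)[:, :k], D


def dbtr(A, B, C, D, k):
    """DC-gain matched BTR: transform, truncate, transform back."""
    return reciprocal(*btr(*reciprocal(A, B, C, D), k))


def dc_gain(A, B, C, D):
    """G(0) = D - C A^{-1} B for a SISO system."""
    return (D - C @ np.linalg.solve(A, B)).item()


# Hypothetical test circuit: 4-node RC ladder, driven at node 1, observed at node 4.
n = 4
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = np.zeros((n, 1)); B[0, 0] = 1.0
C = np.zeros((1, n)); C[0, -1] = 1.0
D = np.zeros((1, 1))

g0 = dc_gain(A, B, C, D)
g_btr = dc_gain(*btr(A, B, C, D, 1))
g_dbtr = dc_gain(*dbtr(A, B, C, D, 1))
print("G(0):", g0, " BTR order 1:", g_btr, " DBTR order 1:", g_dbtr)
```

Even at order 1 the DBTR model reproduces the DC gain of the full ladder up to round-off, whereas the plain BTR model of the same order does not.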

4. Linear time algorithm for linear system order reduction

In the previous section we proposed a new method for order reduction that guarantees a matched DC gain and performance. We call the proposed method DBTR (DC gain matched BTR). In this section we proceed to address another important issue: the computational aspects of BTR and DBTR. We first show how a BTR can be obtained in linear time, and then extend the result to DBTR.

So far the two major model reduction schemes for large linear circuits, the PVL implementation of AWE [2] and the Predominant Controllability Approximation [10], have a computational complexity of $O(n^3)$, where $n$ is the number of circuit nodes. Such a cost is prohibitively expensive for the large circuits needed for accurate prediction of interconnect effects at the board and chip level. In this paper, we propose to use an $O(n)$ model reduction scheme, an oblique projection onto both the controllability and the observability spaces via a Lanczos-Lyapunov solver.

In the PVL/AWE approximation, the transfer function $G(s)$ at frequency $s = 0$ is approximated by a rational function

$$H_q(s) = \frac{b_{q-1}s^{q-1} + \cdots + b_1 s + b_0}{a_q s^q + \cdots + a_1 s + a_0}. \tag{20}$$

In the original AWE scheme [1], the coefficients of the rational function are obtained by a Taylor expansion of $H_q(s)$ near $s = 0$ and then matching with those of the original transfer function $G(s)$. The Taylor expansion of $G(s)$ involves the calculation of $A^{-k}r$, where $r$ is a vector; therefore, the inversion of the matrix $A$ is unavoidable, which is $O(n^3)$ by the standard LU decomposition. It was later realized that the AWE procedure is badly conditioned, and a more stable implementation of AWE, PVL, was proposed in [2], based on the observation of a connection between the Pade approximation and the classical Lanczos procedure [25].
The Lanczos procedure is an iterative way to reduce a square matrix to a tridiagonal matrix. In the context of the rational function, the eigenvalues and eigenvectors of that tridiagonal matrix turn out to be related to the locations of the poles of $H_q(s)$ and the residues of the poles. The PVL method improves the numerical stability of the original AWE procedure. Unfortunately, the Lanczos procedure used in PVL is applied to the transfer function $G(s)$, which again involves the inversion of the matrix $A$, thus resulting in an $O(n^3)$ algorithm overall.

Recently, a new model reduction scheme was suggested that explores the predominant controllability space associated with the linear system of Eqs. (1)-(2). However, it can be shown that the method proposed in [10], which only considers the controllability gramian $P$ while ignoring the observability gramian $Q$, can produce results deviating dramatically from the original system. Moreover, in terms of computational complexity, the route taken in [10] again results in an $O(n^3)$ algorithm. This is a consequence of the ADI iterative scheme [26] applied to the Lyapunov equation for the controllability gramian $P$. To illustrate, if $X_j$ is the approximation to $P$, the ADI scheme iterates as follows:

$$X_0 = 0,$$
$$(p_j I - A)X_{j-1/2} = BB' + X_{j-1}(A' + p_j I), \tag{21}$$
$$(p_j I - A)X_j = BB' + X'_{j-1/2}(A' + p_j I),$$

where the $p_j$ are ADI iteration parameters. Therefore, a matrix inversion involving $A$ is again required, resulting in an $O(n^3)$ cost. In addition, the large condition number of the matrix $A$ (a consequence of the different scales present in the circuits) will usually render the ADI iteration inefficient and slow to converge.

We follow a different route in the design of model reduction, using both the predominant controllability and observability spaces. The major difference in our approach is the realization that there is no need to solve the full Lyapunov equations (4) and (5) for $P$ and $Q$ before we can obtain approximations to their predominant spaces. A dynamic error monitoring procedure can be implemented to terminate the computation with only the needed predominant spaces calculated, and nothing more. To accomplish this task, we extend the $O(n)$ Krylov Subspace Oblique Projection methods (OPM) in [27,28] to our linear circuit order reduction procedure. The general procedure is described as follows.

In the oblique projection method, the original Lyapunov equations (4) and (5) are projected onto lower dimensional Krylov spaces of order $m$ (footnote 1) generated by $A$ and $A'$, respectively. The low rank ($m \ll n$) gramians $P_m$ and $Q_m$ obtained from the reduced order Lyapunov equations are shown to approximate the original $P$ and $Q$, respectively, as $m$ becomes larger. Moreover, the reduced order transfer function $G_m(s)$ can be shown to approximate the original transfer function over the whole frequency range.

For illustration, consider only a single input and output circuit system. Then $B = b$ and $C = c$ are vectors, and the Krylov spaces for $A$ and $A'$ are

$$K_m(A, b) = \mathrm{span}\{b, Ab, \ldots, A^{m-1}b\} \quad \text{and} \quad K_m(A', c') = \mathrm{span}\{c', A'c', \ldots, (A')^{m-1}c'\}. \tag{22}$$

The following Lanczos procedure produces bases $V_m = [v_1, \ldots, v_m]$ and $W_m = [w_1, \ldots, w_m]$ for the two Krylov spaces, respectively, which are biorthogonal, i.e., $V'_m W_m = I_m$.

1. Start: set $\beta_1 = \sqrt{|cb|}$, $\delta_1 = \beta_1 \cdot \mathrm{sign}[cb]$ and define $v_1 = b/\delta_1$, $w_1 = c'/\beta_1$ (with $v_0 = w_0 = 0$).
2. Iterate: for $j = 1, 2, \ldots$, do

$$\alpha_j = (Av_j, w_j),$$
$$\hat{v}_{j+1} = Av_j - \alpha_j v_j - \beta_j v_{j-1}, \tag{23}$$
$$\hat{w}_{j+1} = A'w_j - \alpha_j w_j - \delta_j w_{j-1},$$
$$\beta_{j+1} = \sqrt{|(\hat{v}_{j+1}, \hat{w}_{j+1})|}, \quad \delta_{j+1} = \beta_{j+1} \cdot \mathrm{sign}[(\hat{v}_{j+1}, \hat{w}_{j+1})],$$
$$v_{j+1} = \hat{v}_{j+1}/\delta_{j+1}, \quad w_{j+1} = \hat{w}_{j+1}/\beta_{j+1}.$$

With the new bases $V_m$ and $W_m$, we can project the original Lyapunov equations (4) and (5) for both $P$ and $Q$ (footnote 2) to form the following reduced order Lyapunov equation:

$$A_m \hat{P} + \hat{P} A'_m + B_m B'_m = 0, \tag{24}$$

where the reduced system matrices are $A_m = W'_m A V_m$ and $B_m = W'_m B$. The Lyapunov equation (24), of much smaller size $m \ll n$, can be solved with the direct methods proposed by Bartels and Stewart [29] and Hammarling [30] in $O(m^3)$ operations.

Footnote 1: The rank is not necessarily greater, but no smaller, than the rank $k$ of the reduced order system. In general we choose $m = k$.
Footnote 2: For simplicity, we only describe the computation of $P$ here; the treatment of $Q$ is similar.

The approximation to the predominant space of $P$ can be defined as

$$P_m = V_m \hat{P} V'_m. \tag{25}$$

Notice that $P_m$ is an $n \times n$ matrix. The residual of the reduced gramian for the original Lyapunov equation,

$$R(P_m) \equiv AP_m + P_m A' + BB', \tag{26}$$

satisfies the following Galerkin condition with respect to the Krylov subspace $W_m$:

$$W'_m R(P_m) W_m = 0. \tag{27}$$

This implies that, as the Lanczos procedure progresses, the reduced gramian $P_m$, with small rank $m$, will approximate the original gramian $P$. As a matter of fact, an error estimate can be derived [28] to monitor the accuracy of such an approximation. A reduced model can then be constructed using the projected system matrix $A_m$ as follows:

$$G_m(s) = D_m + C_m(sI - A_m)^{-1}B_m, \tag{28}$$

where $C_m = CV_m$ and $D_m = D$. This reduced transfer function approximates the original transfer function over the whole imaginary $s$-axis.

The cost of the Lanczos-Lyapunov procedure described above is $O(mn + m^3)$, using the fact that the sparsity of the matrix $A$ implies $O(n)$ non-zero entries in $A$. The derived algorithm for obtaining a BTR therefore has linear time complexity with respect to the problem size $n$.

It is straightforward to extend the derived algorithm to DBTR. Note that DBTR involves the operation of Eq. (19) in addition to those of the original BTR. Here, we only have to show that the operation of Eq. (19) can be carried out efficiently. As with the Lyapunov equation, we can apply the operation of Eq. (19) to the reduced system $(A_m, B_m, C_m, D_m)$ obtained through the Krylov space projection. Thus, the operations in Eq. (19) can be done in $O(m^3)$. Consequently, a DBTR can be obtained in $O(mn + m^3)$; again, this is linear time with respect to the original problem size.

5. Experiment results

In this section, we present some of our experimental results.
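Before the circuit examples, the projection idea of Section 4 (Eqs. (24)-(27)) can be tried on a toy problem. The sketch below makes several simplifying assumptions that are not from the paper: it swaps the two-sided Lanczos recurrence of Eq. (23) for a one-sided Arnoldi (orthonormal) basis, so the Galerkin condition is taken with respect to $V_m$ instead of $W_m$; it stores $A$ densely for brevity (a sparse matrix would be used in practice); and it uses a hypothetical 60-node ladder with a symmetric negative definite $A$, which keeps the projected matrix stable. The names `arnoldi` and `projected_gramian` are illustrative.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov


def arnoldi(A, b, m):
    """Orthonormal basis of the Krylov space K_m(A, b) (one-sided Arnoldi)."""
    V = np.zeros((b.size, m))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(1, m):
        w = A @ V[:, j - 1]
        for _ in range(2):                      # Gram-Schmidt, repeated once for stability
            w -= V[:, :j] @ (V[:, :j].T @ w)
        V[:, j] = w / np.linalg.norm(w)
    return V


def projected_gramian(A, b, m):
    """Solve the m-by-m projected Lyapunov equation; P_m = V X V^T as in Eq. (25)."""
    V = arnoldi(A, b, m)
    Am, bm = V.T @ A @ V, V.T @ b
    X = solve_continuous_lyapunov(Am, -np.outer(bm, bm))
    return V, X


# Hypothetical test problem: 60-node ladder with leakage at every node.
n = 60
A = -3.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
b = np.zeros(n); b[0] = 1.0

residuals = {}
for m in (4, 8, 12):
    V, X = projected_gramian(A, b, m)
    Pm = V @ X @ V.T
    Rm = A @ Pm + Pm @ A.T + np.outer(b, b)     # residual of Eq. (26)
    residuals[m] = np.linalg.norm(Rm)
    galerkin_defect = np.linalg.norm(V.T @ Rm @ V)

print("residual norms by m:", residuals)
print("Galerkin defect at m = 12:", galerkin_defect)
```

Only the $m \times m$ equation is solved for each $m$; the full residual is formed here purely to monitor the approximation, and would be replaced by a cheaper error estimate in a large-scale setting.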
The first example is a balanced clock tree, which is modeled by a distributed RLC circuit. The order of the original system is 1010. The proposed DBTR algorithm was used to find the reduced-order (0 and 4) model. The step responses at one leaf node of the original system and of the reduced-size systems are plotted in Fig. 3. The figure shows that the 4-order system is a very good approximation to the original system.

Fig. 3. Step response of a clock tree.

The second example is a coupled interconnect network, which models seven parallel bus lines, as shown in Fig. 4. Each line is modeled by 16 RLC segments. There are coupling capacitances and mutual inductances between any two parallel segments in different lines. The original system has 198 states. A voltage source located at the left end of line 1 drives the network, and the observation point is at the other end of the wire. Fig. 5 shows the simulation waveforms, in which the cross-talk waveforms are also plotted for comparison. The solid line with circles is the response of the original system; the dashed lines with plus markers and the dotted lines with stars are the responses of the reduced systems with order 0 and 0, respectively. The response of the reduced order system with order 0 is indistinguishable from the response of the original system.

Fig. 4. The distributed circuit schematic of the 7 parallel bus lines.

Fig. 5. Step response of 7 parallel bus lines.

6. Conclusion

In this paper, we have solved two important problems that arise when applying the BTR to VLSI interconnect circuit order reduction. First, we showed that the original BTR has a serious drawback from the circuit point of view: the DC gains of the original and the reduced-order systems do not match. A new method, DBTR, was proposed that matches the DC gain and guarantees the performance of the reduced-order system. Second, we derived a linear-time algorithm for computing a DBTR. Previously, the expensive computational cost was a major obstacle to applying the BTR to VLSI circuit order reduction, where the circuit order can easily reach several thousand. With the linear-time algorithm, such practical problems can be solved efficiently, and the advantage of the BTR, guaranteed performance of the reduced-order system, can be fully utilized. The results obtained are not limited to VLSI interconnect design. They are also relevant to the fundamental theory of linear system order reduction and to numerical methods for linear system computation.

Acknowledgements

This research is supported by NSFC overseas young scientist joint research project 699840, the doctoral program foundation of Ministry of Education of China 0000468, NSFC project 69806004, National High Technology Research and Development 86 Plan 86-SOC-Y--, Air Force Office of Scientific Research grant F4960-96-1-04 and NSF NYI Award MIP-945740.

References

[1] L.T. Pillage, R.A. Rohrer, Asymptotic waveform evaluation for timing analysis, IEEE Trans. CAD 9 (4) (1990) 5 66.
[2] P. Feldmann, R.W. Freund, Efficient linear analysis by Padé approximation via Lanczos process, IEEE Trans. CAD 14 (5) (1995) 69 649.
[3] B.N. Sheehan, ENOR: Model order reduction of RLC circuits using nodal equations for efficient factorization, in: Proc. DAC '99, 1999, pp. 17 1.
[4] R.W. Freund, Passive reduced-order models for interconnect simulation and their computation via Krylov-subspace algorithm, in: Proc. DAC '99, 1999, pp. 195 00.
[5] L.M. Silveira, M. Kamon, I. Elfadel, J. White, A coordinate transformed Arnoldi algorithm for generating guaranteed stable reduced-order models of RLC circuits, Tech. Dig. IEEE/ACM ICCAD (1996) 88 94.
[6] E. Chiprout, M.S. Nakhla, Asymptotic Waveform Evaluation and Moment Matching for Interconnect Analysis, Kluwer Academic, Norwell, MA, 1994.
[7] E. Chiprout, M.S. Nakhla, Analysis of interconnect networks using complex frequency hopping (CFH), IEEE Trans. CAD of Integrated Circuits and Systems 14 () (1995) 186 00.
[8] K.J. Kerns, I.L. Wemple, A.T. Yang, Stable and efficient reduction of substrate model networks using congruence transforms, in: Proc. ICCAD-95, 1995, pp. 07 14.
[9] Y. Liu, L.T. Pileggi, A.J. Strojwas, Model order-reduction of RC(L) interconnect including variational analysis, in: Proc. DAC '99, 1999, pp. 01 06.
[10] J.R. Li, F. Wang, J. White, An efficient equation based approach for generating reduced-order models of interconnect, in: Proc. DAC '99, 1999, pp. 1 7.
[11] P. Rabiei, M. Pedram, Model order reduction of large circuits using balanced truncation, in: Proc. of ASP-DAC '99, 1999, pp. 7 40.
[12] C.P. Chen, D.F. Wong, Error bounded Padé approximation via bilinear conformal transformation, in: Proc. DAC '99, 1999, pp. 7 1.
[13] D. Zhou, W. Cai, A fast wavelet collocation method for high-speed circuit simulation, in: IEEE Trans. CAS-I, 1999, pp. 90 90.
[14] D. Zhou, W. Cai, W. Zhang, An adaptive wavelet method for nonlinear circuit simulation, in: IEEE Trans. CAS-I, 1999, pp. 91 98.
[15] D. Zhou, N. Chen, W. Cai, A fast wavelet collocation method for high-speed circuit simulation, in: Proc. ICCAD '95, 1995, pp. 115 1.
[16] K. Glover, All optimal Hankel-norm approximations of linear multivariable systems and their $L^\infty$-error bounds, Int. J. Control 9 (6) (1984) 1115 119.
[17] R.J. Ober, Balanced realizations for finite and infinite dimensional systems, Ph.D. Thesis, Engineering Department, Cambridge University, 1987.
[18] A. Gheondea, R.J. Ober, A trace formula for Hankel operators, in: Proc. of the American Mathematical Society, 1999, pp. 007 01.
[19] C.T. Mullis, R.A. Roberts, Synthesis of minimum roundoff noise fixed point digital filters, IEEE Trans. on Circuits and Systems (1976) 551 56.
[20] B.C. Moore, Principal component analysis in linear systems: controllability, observability, and model reduction, IEEE Trans. on Automatic Control 5 (1981) 17.
[21] D. Enns, Model reduction with balanced realizations: an error bound and a frequency weighted generalization, in: Proc. Conference on Decision and Control, Las Vegas, Nevada, 1984.
[22] S.S. Mahil, F.W. Fairman, B.S. Lee, Some integral properties for balanced realizations of scalar systems, IEEE Trans. on Automatic Control 9 (1984) 181 18.
[23] R.J. Ober, Balanced parameterization of classes of linear systems, SIAM J. Control Optimization 9 (1991) 151 187.
[24] Y. Liu, B.D.O. Anderson, Singular perturbation approximation of balanced systems, Int. J. Control 50 (1989) 179 1405.
[25] C. Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, J. Res. Nat. Bur. Standards 45 (1950) 55 8.
[26] A. Lu, E.L. Wachspress, Solution of Lyapunov equations by ADI iteration, Computers Math. Applic. 1 (9) (1991) 4 58.
[27] Y. Saad, Numerical solution of large Lyapunov equations, in: M.A. Kaashoek, J.H. Van Schuppen, A.C.M. Ran (Eds.), Signal Processing, Scattering, Operator Theory and Numerical Methods, Birkhauser, Boston, 1990, pp. 50 511.
[28] I.M. Jaimoukha, E.M. Kasenally, Oblique projection method for large scale model reduction, SIAM J. Matrix Anal. Appl. 16 () (1995) 60 67.
[29] R.H. Bartels, W. Stewart, Solution of the matrix equation AX + XB = C, Comm. ACM 15 (197) 80 86.
[30] S.J. Hammarling, Numerical solution of the stable, non-negative definite Lyapunov equation, IMA J. Numer. Anal. (198) 0.