ON THE IMPLICIT INTEGRATION OF DIFFERENTIAL-ALGEBRAIC EQUATIONS OF MULTIBODY DYNAMICS. by Dan Negrut


A thesis submitted in partial fulfillment of the requirements for the Doctor of Philosophy degree in Mechanical Engineering in the Graduate College of The University of Iowa

July 1998

Thesis supervisor: Professor Edward J. Haug

Graduate College
The University of Iowa
Iowa City, Iowa

CERTIFICATE OF APPROVAL

PH.D. THESIS

This is to certify that the Ph.D. thesis of Dan Negrut has been approved by the Examining Committee for the thesis requirement for the Doctor of Philosophy degree in Mechanical Engineering at the July 1998 graduation.

Thesis committee: Thesis supervisor, Member, Member, Member, Member

To my parents

ACKNOWLEDGMENTS

I would like to thank my advisor, Professor Edward J. Haug, for his continuous guidance; I learned many worthwhile things from him. I would also like to thank Professor Florian A. Potra for his advice and help over the last 5 years, and Professor Jeffrey S. Freeman for his patience and understanding during my early years at The University of Iowa. My thanks and good thoughts go to Cristina, for always supporting and encouraging me, and to Fel, for providing inspiration and always cheering me up. I am deeply grateful to my teachers and my friends.

ABSTRACT

The topic of the thesis is implicit integration of the differential-algebraic equations (DAE) of Multibody Dynamics. Methods used in the thesis for the solution of DAE are based on state-space reduction via generalized coordinate partitioning. In this approach, a subset of independent generalized coordinates, equal in number to the number of degrees of freedom of the mechanical system, is used to express the time evolution of the mechanical system. The second order state-space ordinary differential equations (SSODE) that describe the time variation of independent coordinates are numerically integrated using implicit formulas. Efficient means for acceleration and integration Jacobian computation are proposed and numerically implemented.

Methods proposed for numerical solution of the index 3 DAE of Multibody Dynamics are the State-Space Reduction Method, the Descriptor Form Method, and the First Order Reduction Method. Algorithms based on the State-Space Reduction and Descriptor Form Methods employ the extensively used family of Newmark multi-step formulas for implicit integration of the SSODE. More refined Runge-Kutta formulas are used in conjunction with both First Order Reduction and Descriptor Form Methods. The Rosenbrock-Nystrom and SDIRK formulas of order 4 that are employed are L-stable methods with sound stability and accuracy properties. All integration formulas are provided with robust error control mechanisms based on integration step-size selection.

Several algorithms are developed, based on the proposed methods for numerical solution of index 3 DAE of Multibody Dynamics. These algorithms are shown to be robust and accurate. Typically, a speed-up of two orders of magnitude is achieved when these algorithms are compared to previously used, well established, explicit numerical integration algorithms for simulation of a stiff model of the High Mobility Multipurpose Wheeled Vehicle (HMMWV) of the US Army. Computational methods developed in this thesis enable efficient dynamic analysis of systems containing bushings, stiff subsystem compliance elements, and high frequency subsystems that heretofore required tremendous amounts of CPU time, due to limitations of the previously employed numerical algorithms.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER

1. INTRODUCTION
    1.1 Motivation
    1.2 Thesis Overview
2. REVIEW OF LITERATURE
    2.1 Review of Methods for the Solution of DAE
    2.2 Review of Methods for the Solution of ODE
3. THEORETICAL CONSIDERATIONS
    3.1 Generalized Coordinates Partitioning Algorithm
    3.2 State-Space Reduction Method
        3.2.1 Multi-step Methods
        3.2.2 Runge-Kutta Methods
    3.3 Descriptor Form Method
        3.3.1 Multi-step Methods
        3.3.2 Runge-Kutta Methods
    3.4 First Order Reduction Method
        3.4.1 Theoretical Considerations in First Order Reduction
        3.4.2 Computing Accelerations in the Cartesian Representation
        3.4.3 Computing Accelerations in Minimal Representation
4. NUMERICAL IMPLEMENTATIONS
    4.1 Trapezoidal-Based State-Space Implicit Integration
        4.1.1 General Considerations
        4.1.2 Algorithm Pseudo-code
    4.2 SDIRK Based Descriptor Form Implicit Integration
        4.2.1 General Considerations
        4.2.2 SDIRK4/
        4.2.3 Algorithm Pseudo-code
    4.3 Trapezoidal-Based Descriptor Form Implicit Integration
        4.3.1 Algorithm Pseudo-code
    4.4 Rosenbrock-Based First Order Implicit Integration
        4.4.1 General Consideration
        4.4.2 Rosenbrock-Nystrom Methods
        4.4.3 Order Conditions For Rosenbrock-Nystrom Algorithm
        4.4.4 Algorithm Pseudo-code
    4.5 SDIRK-Based First Order Implicit Integration
        4.5.1 SDIRK4/
        4.5.2 Algorithm Pseudo-code
5. NUMERICAL EXPERIMENTS
    5.1 Validation of First Order Reduction Method
    5.2 Explicit versus Implicit Integration
    5.3 Method Comparison
6. CONCLUSIONS AND RECOMMENDATIONS

APPENDIX A. PARALLEL COMPUTATION OF INTEGRATION JACOBIAN
APPENDIX B. TANGENT-PLANE PARAMETRIZATION-BASED IMPLICIT INTEGRATION
REFERENCES

LIST OF TABLES

Table
1. Butcher's Tableau
2. Numerical Results: Solving For Accelerations and Lagrange Multipliers
3. Timing Results for Seven Body Mechanism
4. Timing Results for Chain of Pendulums
5. HMMWV14 Model - Component Bodies
6. Operation Count for Gaussian Elimination
7. Operation Count for Alg-S
8. Operation Count for Alg-D
9. JR Linear Solvers CPU Results
10. Timing Profiles Alg-S and Alg-D
11. Pseudo-code for Trapezoidal-Based State-Space Method
12. Butcher's Tableau for SDIRK Formulas
13. SDIRK4/6 Formula for Descriptor Form Method
14. Pseudo-code for SDIRK4/6-Based Descriptor Form Method
15. Pseudo-code for Trapezoidal-Based Descriptor Form Method
16. Pseudo-code for Rosenbrock-Nystrom-Based First Order Reduction Method
17. Pseudo-code for SDIRK4/5-Based First Order Reduction Method
18. Parameters for the Double Pendulum
19. Initial Conditions for Double Pendulum
20. Position Error Analysis ForSDIRK
21. Position Error Analysis ForRosen
22. Velocity Error Analysis ForSDIRK
23. Velocity Error Analysis ForRosen
24. HMMWV14 Explicit Integration Simulation CPU Times
25. HMMWV14 Implicit Integration Simulation Results
26. Timing Results for InflSDIRK
27. Timing Results for InflTrap
28. Timing Results for ForSDIRK
29. Timing Results for ForRosen
30. ForSDIRK Analytical/Numerical Computation of Integration Jacobian
31. Pseudo-code for Parallel Computation of Integration Jacobian

LIST OF FIGURES

Figure
1. Seven Body Mechanism
2. Chain of Pendulums
3. Graph Representation: Seven-Body Mechanism
4. Reduced Matrix B: Seven-Body Mechanism
5. Two Joint Numbering Schemes: Chain of Pendulums
6. Reduced Matrix: Chain of Pendulums
7. A Pair of Connected Bodies in JR
8. A Tree Structure
9. HMMWV14 Body Model: Topology Graph
10. Spanning Tree HMMWV
11. Double Pendulum
12. Orientation Body
13. Angular Velocity Body
14. US Army HMMWV
15. Body Model of HMMWV
16. Topology Graph for HMMWV
17. Chassis Height HMMWV
18. Explicit Integration Results for Tolerance
19. Implicit Integration Results for Tolerance
20. Timing Results for Different Tolerances
21. Algorithm Comparison for Different Tolerances
22. Algorithm Comparison for Different Simulation Lengths
23. Step-Size History for ForSDIRK and SspTrap
24. Step-Size History for ForRosen and InflSDIRK
25. Number of Iterations for SspTrap

CHAPTER 1
INTRODUCTION

1.1 Motivation

The motivation for this thesis lies in the importance of having effective mathematical and computational tools for virtual prototyping. In a world of increasing global competition and shrinking windows of opportunity, as design cycles are continuously compressed under the pressure of high quality standards and short time-to-market objectives, virtual prototyping becomes a powerful tool in the hands of designers. Once the privilege of few companies, virtual prototyping has been established by steady growth in inexpensive computational power as an essential link in the design cycle of virtually any successful company. In applications ranging from the Aircraft and Automotive Industries to Biomechanics and Mechatronics, cutting costs and reducing development cycles by eliminating hardware prototypes, along with quality improvement through design sensitivity and what-if studies, are attributes that have made virtual prototyping an important segment of CAE/CAD/CAM integrated environments.

With these considerations in mind, at the beginning of 1996, the writer critically evaluated some of the mathematical methods of virtual prototyping in Multibody Dynamics. Research topics were identified that had the potential to qualitatively improve methods in use at that time. A first goal was to answer the challenge posed by dynamic analysis of richer and more complex dynamic systems. A second objective was to develop algorithms and methods that would enable effective numerical implementations of theoretical methods once deemed to be computationally intractable.

Computational stability and efficiency were the key issues, to be addressed using new algorithms for topology-based linear algebra and numerical integration of Differential-Algebraic Equations of Multibody Dynamics.

1.2 Thesis Overview

This Section provides an outline of the thesis. Brief remarks describe the content of each Chapter of the thesis.

Chapter 2 contains a review of the literature. Many different techniques for the solution of differential-algebraic equations (DAE) of Multibody Dynamics have been proposed over time. Some of the more important approaches are presented, pointing out their merits and deficiencies.

Chapter 3 contains the methods proposed for implicit numerical integration of DAE of multibody dynamics. The first method presented is the State Space Reduction Method, in which the DAE of Multibody Dynamics are reduced, via generalized coordinate partitioning (Wehage and Haug, 1982), to a set of ordinary differential equations (ODE). The next method is the Descriptor Form Method, in which the index one DAE problem, obtained by associating the equations of motion with the acceleration kinematic constraint equation, is directly integrated using an ODE numerical integration formula. Constraint error accumulation is prevented in this method by recovering dependent positions and velocities via the position and velocity kinematic constraint equations. The last method presented for the solution of stiff DAE of Multibody Dynamics is the First Order Reduction Method, which has the potential of using any standard implicit numerical code to numerically solve the resulting ODE problem. In this formulation, the second order differential equations governing the time evolution of independent positions are reduced to a system of first order differential equations. Central to the First Order Reduction Method is the issue of generalized acceleration computation. Two Sections of Chapter 3 are devoted to a comprehensive analysis of generalized acceleration computation in the Cartesian and minimal coordinate representations, respectively.

Based on the theoretical considerations of Chapter 3, several algorithms and codes were developed for the implicit integration of the DAE of Multibody Dynamics. These codes represent numerical implementations of the State Space, Descriptor Form, and First Order Reduction methods introduced in Sections 3.2, 3.3, and 3.4, respectively. The codes are presented in Chapter 4. The methods implemented are as follows:

(a) The State Space Method is implemented based on the Newmark family of implicit integration formulas. In particular, the Trapezoidal formula is used throughout the numerical experiments, due to its higher order compared to other Newmark formulas.
(b) The Descriptor Form Method is implemented based on two integration formulas:
    (b1) The Trapezoidal formula.
    (b2) A five stage, order 4, A-stable, stiffly-accurate singly diagonal implicit Runge-Kutta (SDIRK) method.
(c) The first order approach was implemented using two different integration formulas:
    (c1) A four stage, order 4 Rosenbrock formula.
    (c2) An SDIRK five stage, order 4, A-stable, stiffly-accurate formula of Hairer and Wanner (1996).

When applicable, each numerical implementation is detailed in the following three aspects:

(1) Specifics of DAE-to-ODE reduction.
(2) ODE integration stage.
(3) Iteration stopping criteria and step size control.

Based on the algorithms developed, numerical experiments are carried out using two test problems. Chapter 5 contains the results of these simulations. First, the numerical methods are validated in terms of asymptotic behavior. Then, the implicit integrators defined are compared in terms of efficiency with an explicit alternative for integration of stiff ODE of Multibody Dynamics. A third set of numerical experiments is aimed at comparing the implicit integrators among themselves.

Chapter 6 presents potential directions of future research, and concludes the thesis with final remarks concerning the topic of generalized coordinate-based state-space implicit integration of DAE of Multibody Dynamics.

Appendix A presents more recent theoretical results regarding the potential of using multiprocessor architectures for speeding up the otherwise computationally intensive task of DAE implicit integration. Fast integration Jacobian evaluation is the focus of the analysis in this Section, which concludes with a strategy that can take advantage of parallel computer architectures.

Finally, the theoretical framework derived for the coordinate partitioning approach to solving DAE of Multibody Dynamics is extended in Appendix B to the tangent-plane parametrization method. In this context, coordinate partitioning is in fact a particular case of the tangent-plane parametrization method, obtained by choosing a certain projection matrix.

CHAPTER 2
REVIEW OF LITERATURE

2.1 Review of Methods for the Solution of DAE

In the present work, only the case of rigid body mechanical system models is considered. However, the methods proposed in this work are suitable for dynamic analysis of models incorporating flexible components. The issue of generating derivative information specific to systems containing flexible components has not yet been addressed.

Throughout this document, $q = [q_1, q_2, \ldots, q_n]^T$ denotes the vector of generalized coordinates. The $n$ generalized coordinates $q$ define the state of the mechanical system at the position level; i.e., given a set of $n$ values for $q$, the position of each element of the mechanical system model is uniquely determined. The generalized coordinates may be absolute (Cartesian) coordinates of body reference frames, relative coordinates between bodies, or a combination of both.

Generalized velocities are defined as the first time derivative of the generalized coordinates. In what follows an over-dot signifies time derivative. Thus, generalized velocity is defined as

$$\dot{q} = [\dot{q}_1, \dot{q}_2, \ldots, \dot{q}_n]^T \tag{2.1}$$

The set of generalized positions and velocities define the state of the mechanical system model; i.e., once these quantities are available there is a unique configuration of the system at a given instant in time. Conversely, to each state of the mechanical system there corresponds a unique generalized position $q$ and velocity $\dot{q}$.

Joints connecting bodies of a mechanical system model restrict the relative and/or absolute motion of components of the model. From a mathematical standpoint, these mobility constraints are accounted for by the requirement that a set of algebraic expressions must be satisfied throughout the simulation. In the most general case, the constraints may be equalities or inequalities involving generalized coordinates and their first time derivatives. In this thesis only the case of scleronomic and holonomic equality constraints will be considered; i.e., the constraint equations do not depend explicitly on time, and they do not contain any time derivatives of generalized coordinates. Inequality constraint equations are not treated in the thesis. Technically, this scenario can be addressed by considering event driven integration of DAE, when event location (discontinuity treatment) becomes the main issue of concern. More information about event driven integration can be found in the work of Winckler (1997).

Under the above assumptions, the position kinematic constraint equations assume the form

$$\Phi(q) \equiv [\Phi_1(q), \Phi_2(q), \ldots, \Phi_m(q)]^T = 0 \tag{2.2}$$

Differentiating the position kinematic constraint equation of Eq. (2.2) with respect to time yields the velocity kinematic constraint equation,

$$\Phi_q(q)\,\dot{q} = 0 \tag{2.3}$$

where subscripts denote partial differentiation; i.e., $\Phi_q = [\partial \Phi_i / \partial q_j]$. Finally, taking another time derivative of Eq. (2.3) yields the acceleration kinematic equation,

$$\Phi_q(q)\,\ddot{q} = -(\Phi_q \dot{q})_q\,\dot{q} \equiv \tau(q, \dot{q}) \tag{2.4}$$

Equations (2.1) through (2.4) characterize the admissible motion of the constrained mechanical system at position, velocity, and acceleration levels.
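The three constraint levels of Eqs. (2.2) through (2.4) can be illustrated on a small example. The following is a minimal sketch, assuming a planar pendulum described by the Cartesian coordinates of its tip, so that $n = 2$ and $m = 1$; the pendulum, the rod length `L`, and all function names are assumptions made for illustration and do not come from the thesis.

```python
import math

L = 1.0  # rod length (assumed value)

def phi(q):
    """Position constraint of Eq. (2.2): x^2 + y^2 - L^2 = 0 (one equation, m = 1)."""
    x, y = q
    return x * x + y * y - L * L

def phi_q(q):
    """Constraint Jacobian Phi_q, a 1 x 2 row stored as a tuple."""
    x, y = q
    return (2.0 * x, 2.0 * y)

def tau(qd):
    """Right side of Eq. (2.4): -(Phi_q qd)_q qd = -2 (xd^2 + yd^2)."""
    xd, yd = qd
    return -2.0 * (xd * xd + yd * yd)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# A kinematically admissible state built from an angle, its rate, and its second derivative.
theta, omega, theta_dd = 0.3, 1.5, -2.0
q = (L * math.sin(theta), -L * math.cos(theta))
qd = (L * omega * math.cos(theta), L * omega * math.sin(theta))
qdd = (L * (theta_dd * math.cos(theta) - omega ** 2 * math.sin(theta)),
       L * (theta_dd * math.sin(theta) + omega ** 2 * math.cos(theta)))

# Residuals of the position, velocity, and acceleration constraint equations.
res_position = phi(q)
res_velocity = dot(phi_q(q), qd)
res_acceleration = dot(phi_q(q), qdd) - tau(qd)
```

All three residuals vanish up to round-off for any choice of `theta`, `omega`, and `theta_dd`, which reflects the fact that Eqs. (2.3) and (2.4) are consequences of Eq. (2.2) along any admissible motion.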

The state of the mechanical system will change in time under the effect of both applied and constraint forces. The Lagrange multiplier form of the equations of motion for the mechanical system model is (Haug, 1989)

$$M(q)\,\ddot{q} + \Phi_q^T(q)\,\lambda = Q^A(q, \dot{q}, t) \tag{2.5}$$

where $M \in \mathbb{R}^{n \times n}$ is the mass matrix, which depends on the generalized coordinates $q$; $\lambda \in \mathbb{R}^m$ is the vector of Lagrange multipliers that account for the workless constraint forces; and $Q^A \in \mathbb{R}^n$ is the vector of generalized applied forces that may depend on generalized coordinates, their time derivatives, and time.

Equations (2.2) through (2.5) are the so-called Newton-Euler constrained equations of motion. From a mathematical standpoint, they comprise a system of differential-algebraic equations (DAE). Mathematically, DAE are not ODE (Petzold, 1982). The task of obtaining a numerical solution of the DAE problem of Eqs. (2.2) through (2.5) is substantially more difficult and more computationally intensive than the task of solving an ODE (Potra, 1994). This trend is more pronounced for higher index DAE, where the index (Brenan, Campbell, Petzold, 1989) is defined as the number of derivatives required to transform the DAE problem into an ODE problem.

To find the index of the DAE of Multibody Dynamics, note that the acceleration kinematic constraint equation of Eq. (2.4) is obtained after taking two time derivatives of the position kinematic constraint equation of Eq. (2.2). The acceleration kinematic equations are associated with the equations of motion to obtain a linear system in generalized accelerations and Lagrange multipliers,

$$\begin{bmatrix} M & \Phi_q^T \\ \Phi_q & 0 \end{bmatrix} \begin{bmatrix} \ddot{q} \\ \lambda \end{bmatrix} = \begin{bmatrix} Q^A \\ \tau \end{bmatrix} \tag{2.6}$$

The expression obtained for Lagrange multipliers, by formally solving Eq. (2.6), is a function of generalized positions and velocities, $q$ and $\dot{q}$, respectively. Taking a third time derivative, this time of the Lagrange multipliers, and combining these derivatives with Eq. (2.5), results in a set of ODE. Consequently, the index of the DAE of Multibody Dynamics is three.

In practice this sequence of steps converting the DAE to ODE is never used to obtain the numerical solution of the original problem. Its only purpose is to determine the index of the DAE being analyzed. This information is useful, as generally the index is regarded as a measure of the difficulty that should be expected when numerically solving the DAE. In particular, the index 3 of the DAE of Multibody Dynamics is a rather high index when compared with DAE obtained from modeling problems arising in other areas of Physics and Engineering.

Dynamic analysis of mechanical systems concerns their time evolution under the action of applied forces. An analytical solution to the DAE of Multibody Dynamics is highly unlikely to be available, so approximate solutions are obtained by means of numerical methods. In this context, Eqs. (2.2) and (2.5) alone cease to characterize the time evolution of the system, since Eqs. (2.3) and (2.4) are not guaranteed to be satisfied. When this is the case, the numerical solution at velocity and acceleration levels drifts away from the analytical solution. Consequently, future position configurations of the system will be wrong, since the numerical integration stage embedded in the numerical algorithm uses corrupted derivative information.

Special numerical methods have been developed to deal with DAE, and the theory surrounding these methods builds around the index of the DAE. Thus, different numerical methods are proposed for different index DAE problems. In this Section, the focus is on methods for the solution of the index 3 DAE of Multibody Dynamics. Most of the methods for the solution of this class of DAE belong to one of the following categories:

(a) Stabilization Methods
(b) Projection Methods
(c) State Space Methods

An early stabilization-based numerical algorithm that allows for integration of DAE is the so-called constraint stabilization technique (Baumgarte, 1972). The problem is reduced to index 1, as in Eq. (2.6), and is directly integrated. Since the constraints fail to be satisfied after direct integration, at the next time step the right side of the acceleration kinematic constraint equation is modified to take into account the constraint violation. The form of the right side of the acceleration kinematic constraint equation is altered to

$$\bar{\tau} = \tau - 2\alpha\dot{\Phi} - \beta\Phi \tag{2.7}$$

where the last two terms are the so-called compensation terms. These two terms do not appear in the original form of the acceleration kinematic constraint equation of Eq. (2.4), and they compensate for errors in satisfying constraint equations at position and velocity levels. Ostermeyer (1990) discusses criteria for optimally choosing the positive scalars $\alpha$ and $\beta$. This process is problematic (Ascher et al., 1995) and is yet to be resolved. Usually, $\alpha = \gamma$ and $\beta = \gamma^2$, where $\gamma > 0$. The manifold then becomes an attractor of the solution of the newly obtained system of ordinary differential equations of Eqs. (2.5) and (2.7).

Ideally, the value of the scalar $\gamma$ would be independent of both the method used to discretize the new set of ODE and the integration step-size. A simple example shows, however, that choosing an optimal $\gamma$ depends on both these factors. Consider for example the hypothetical case of a system with linear constraints at the position level,

$$\Phi(q) = Gq \tag{2.8}$$

Using the forward Euler integration formula, Baumgarte's technique results in

$$\begin{gathered} \dot{q}_{n+1} = \dot{q}_n + hM^{-1}Q(q_n, \dot{q}_n) - hM^{-1}G^T(GM^{-1}G^T)^{-1}\left[GM^{-1}Q(q_n, \dot{q}_n) + \alpha Gq_n + \beta G\dot{q}_n\right] \\ q_{n+1} = q_n + h\dot{q}_n \end{gathered} \tag{2.9}$$

In order to see how constraint equations are satisfied at the new time step $n+1$, multiply the new positions and velocities by $G$ to obtain

$$\begin{bmatrix} \hat{q}_{n+1} \\ \hat{\dot{q}}_{n+1} \end{bmatrix} = \begin{bmatrix} I & hI \\ -\alpha hI & (1-\beta h)I \end{bmatrix} \begin{bmatrix} \hat{q}_n \\ \hat{\dot{q}}_n \end{bmatrix} \tag{2.10}$$

where $\hat{q} \equiv Gq$ and $\hat{\dot{q}} \equiv G\dot{q}$ are errors in constraints at the position and velocity levels, and $I$ is the $m \times m$ identity matrix. In this example $\alpha$ multiplies the position-level error and $\beta$ the velocity-level error. Denoting by $B$ the coefficient matrix in Eq. (2.10), ideally $B = 0$. Since this is not possible for the forward Euler method, $\alpha$ and $\beta$ are optimally chosen; i.e., such that $B^2 = 0$. These optimal values are $\alpha = 1/h^2$ and $\beta = 2/h$, since $B^2 = 0$ requires $1 - \alpha h^2 = 0$ and $2 - \beta h = 0$. In other words, the scalar $\gamma$ above depends on both the method used to discretize the ordinary differential equations and the integration step size, which is a drawback of the approach. More sophisticated techniques should be considered.

Ascher et al. (1995) propose several algorithms that have as starting point the underlying ODE associated with the index 3 DAE of Multibody Dynamics. The underlying ODE is obtained by formally eliminating Lagrange multipliers from the equations of motion of Eq. (2.5), using the acceleration kinematic constraint equation to obtain

$$\lambda = (\Phi_q M^{-1} \Phi_q^T)^{-1}\left[\Phi_q M^{-1} Q^A - \tau\right] \tag{2.11}$$

Lagrange multipliers are substituted back into Eq. (2.5) to obtain the underlying ODE associated with the DAE as

$$M\ddot{q} = Q^A(q, \dot{q}) - \Phi_q^T(\Phi_q M^{-1} \Phi_q^T)^{-1}\left[\Phi_q M^{-1} Q^A(q, \dot{q}) - \tau(q, \dot{q})\right] \tag{2.12}$$

which, theoretically, can be further transformed to a first order system of ODE

$$\dot{z} = \hat{f}(z) \tag{2.13}$$

with $z \equiv (q^T, \dot{q}^T)^T$. By directly integrating the set of second order ODE in Eq. (2.12), kinematic constraint equations at position and velocity levels will cease to be satisfied. In the spirit of Baumgarte's method, the idea of Ascher et al. (1995) is to add a more general constraint stabilization term to the right side of Eq. (2.13), to obtain

$$\dot{z} = \hat{f}(z) - \gamma F(z)\,h(z) \tag{2.14}$$

where $\gamma > 0$ is a parameter, $F(z)$ is a $2n \times 2m$ matrix, and

$$h(z) = \begin{bmatrix} \Phi(q) \\ \Phi_q \dot{q} \end{bmatrix} \tag{2.15}$$

are the kinematic constraint equations at position and velocity levels. In a numerical implementation, the system of ODE of Eq. (2.14) is directly integrated by any explicit integration scheme to advance the simulation. While in Baumgarte's method the parameters $\alpha$ and $\beta$ were to be chosen, the approach proposed by Ascher needs to provide the matrix $F(z)$ of Eq. (2.14). The authors suggested several forms for this matrix. Based on simulation results of two small-scale planar mechanical models, they concluded the approach is reliable for nonstiff and highly oscillatory problems.

The addition of a stabilization term is not justified from a physical, but only from a mathematical standpoint; i.e., it is introduced to limit the drifting effect induced by direct integration of all generalized coordinates in the formulation. Depending on the value of the parameter $\gamma$ and the expression of the compensation coefficient matrix $F(z)$, the dynamics of the system would be altered if the formulation of Eq. (2.14) is directly integrated. Consequently, Ascher et al. slightly modify the proposed approach as follows: First, use an integrator of choice to directly integrate the underlying ODE reformulated as in Eq. (2.13). Then use Eq. (2.14) to stabilize the constraint equations. In this respect this approach is very similar to coordinate projection techniques presented later in this Section. Finally, the constraint stabilization algorithm proposed is as follows:

(1) Apply direct integration to the underlying ODE

$$\tilde{z}_{n+1} = \Psi_h^{\hat{f}}(z_n) \tag{2.16}$$

(2) Apply stabilization

$$z_{n+1} = \tilde{z}_{n+1} - \gamma h\, F(\tilde{z}_{n+1})\, h(\tilde{z}_{n+1}) \tag{2.17}$$

Ascher, Petzold, and Chin (1994) claim that, provided the integration formula used to obtain $\tilde{z}_{n+1}$ in Eq. (2.16) is of order $p$, the method has global error $O(h^p)$ and constraints at the position and velocity levels are satisfied to $O(h^{p+1})$. More details about the choice of the compensation coefficient matrix $F(z)$ and the stability range for the parameter $\gamma$ can be found in Ascher et al. (1995), Ascher, Petzold, and Chin (1994), and Ascher and Petzold (1993).

A second class of algorithms for numerical solution of the DAE of Multibody Dynamics is based on so-called projection techniques (Eich et al., 1990), in which all generalized coordinates are integrated at each time step. Additional multipliers are introduced to account for the requirement that the solution satisfy constraint equations at position, velocity, and for some formulations, acceleration level. Gear et al. (1985) reduce the DAE to an analytically equivalent index 2 problem, in which projection is only performed at the velocity level. An extra multiplier $\mu$ is introduced to ensure that the velocity kinematic constraint equation of Eq. (2.3) is also satisfied. The algorithm uses a backward differentiation formula (BDF) to discretize the following form of the equations of motion:

$$\begin{gathered} \dot{q} = v - \Phi_q^T(q)\,\mu \\ M(q)\,\dot{v} = Q^A - \Phi_q^T(q)\,\lambda \\ \Phi(q) = 0 \\ \Phi_q(q)\,\dot{q} = 0 \end{gathered} \tag{2.18}$$

In a similar approach proposed by Fuhrer and Leimkuhler (1991), the DAE is reduced to index 1 and an additional multiplier $\eta$ is introduced, along with the requirement that the acceleration kinematic equation of Eq. (2.4) is satisfied. In an analytical framework, all additional multipliers can be proved to be zero for the actual solution. However, under discretization, these multipliers assume nonzero values, due to truncation errors of the integration formula being used.

Starting from the previous index 1 formulation, Fuhrer and Leimkuhler (1991) propose an algorithm based on an index 1 formulation with no extra multipliers. All variables are integrated, and kinematic constraint equations at position, velocity, and acceleration levels are imposed. Under discretization, an over-determined set of $2n + 3m$ nonlinear equations in $2n + m$ unknowns must be solved at each integration step. The discretization involves a backward differentiation formula (BDF). Because of truncation errors, the equations become inconsistent and can only be solved in a generalized sense. While the so-called ssf-solution obtained using a special oblique projection technique is, in the case of linear constraint equations, equivalent to that obtained by integrating the state-space form using the same discretization formula, this ceases to be the case in general (Potra, 1993). The method is robust, and it is comparable in terms of efficiency to the index 1 formulation with additional multipliers.

The methods proposed by Fuhrer and Leimkuhler (1991) and Gear et al. (1985) belong to the class of so-called derivative projection methods; i.e., expressions for derivatives are modified by additional multipliers that ensure constraint satisfaction. A second projection technique is based on the coordinate projection approach. The derivatives are no longer modified, and integration is carried out to obtain a solution of the index 1, or for some formulations index 2, DAE. Since all variables are integrated, they do not satisfy the constraint equations, so some form of coordinate projection technique is employed to bring the ODE solution to the constraint manifold. This is the approach followed by Lubich (1990), Shampine (1986), Eich (1993), and Brasey (1994). From a physical standpoint, the projection stage is conventional, and typically the underlying ODE is integrated with very high accuracy to reduce the weight of the projection stage in the overall algorithm. The code MEXX (Lubich et al., 1992) for integration of multibody systems is based on coordinate projection and uses relatively expensive but very accurate extrapolation methods for integration of the ODE.

When coordinate projection methods are applied, the index of the DAE of Multibody Dynamics is usually reduced, typically to two (Lubich (1990), Brasey (1994)). The most representative methods in this class are the half-explicit methods of Brasey (1994). The idea of half-explicit methods is introduced by starting with the simplest method; i.e., forward Euler. Starting from consistent initial values $(q_0, v_0)$, one step of the explicit Euler method is applied to the equations of motion of Eq. (2.5), yielding

$$\begin{gathered} q_1 - q_0 = h v_0 \\ M(q_0)(\hat{v}_1 - v_0) = hQ^A(q_0, v_0) - h\Phi_q^T(q_0)\,\lambda(q_0, v_0) \end{gathered} \tag{2.19}$$

After this step, the velocity is stabilized. There are several ways in which this can be done. One possibility is to keep the value $q_1$ fixed, and to project $\hat{v}_1$ onto the manifold defined by the velocity kinematic constraint equation of Eq. (2.3), as suggested by Alishenas (1992) and Lubich (1991). The solution $v_1$ is obtained by solving

$$\begin{gathered} M(q_1)(v_1 - \hat{v}_1) = -\Phi_q^T(q_1)\,\mu \\ \Phi_q(q_1)\,v_1 = 0 \end{gathered} \tag{2.20}$$
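The explicit Euler step of Eq. (2.19) followed by the velocity projection of Eq. (2.20) can be sketched for a small example. The code below is a minimal illustration only, assuming a planar point-mass pendulum in Cartesian coordinates with a constant mass matrix $M = \text{mass} \cdot I$ and one constraint, so that Eq. (2.11) and the projection multiplier reduce to scalar divisions. The pendulum, the parameter values, and all function names are hypothetical and are not taken from the thesis.

```python
import math

g, L, mass = 9.81, 1.0, 1.0  # assumed values; the mass matrix is M = mass * I

def phi_q(q):
    """Jacobian of the constraint x^2 + y^2 - L^2 = 0."""
    return (2.0 * q[0], 2.0 * q[1])

def tau(v):
    """Right side of the acceleration constraint, Eq. (2.4), for this constraint."""
    return -2.0 * (v[0] ** 2 + v[1] ** 2)

def q_applied(q, v):
    """Generalized applied force: gravity only."""
    return (0.0, -mass * g)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def euler_then_project(q0, v0, h):
    G0 = phi_q(q0)
    Q = q_applied(q0, v0)
    # lambda(q0, v0) from Eq. (2.11), specialized to M = mass * I and m = 1
    lam = (dot(G0, Q) / mass - tau(v0)) / (dot(G0, G0) / mass)
    # explicit Euler step, Eq. (2.19)
    q1 = (q0[0] + h * v0[0], q0[1] + h * v0[1])
    v_hat = tuple(v0[i] + h * (Q[i] - G0[i] * lam) / mass for i in range(2))
    # velocity projection, Eq. (2.20): M (v1 - v_hat) = -G1^T mu, G1 v1 = 0
    G1 = phi_q(q1)
    mu = dot(G1, v_hat) / (dot(G1, G1) / mass)
    v1 = tuple(v_hat[i] - G1[i] * mu / mass for i in range(2))
    return q1, v_hat, v1

theta, omega = 0.3, 1.5
q0 = (L * math.sin(theta), -L * math.cos(theta))
v0 = (L * omega * math.cos(theta), L * omega * math.sin(theta))
q1, v_hat, v1 = euler_then_project(q0, v0, 0.01)

residual_before = dot(phi_q(q1), v_hat)  # velocity constraint violated after the Euler step
residual_after = dot(phi_q(q1), v1)      # restored to round-off by the projection
```

In this sketch the unprojected velocity violates Eq. (2.3) at the new position by an amount proportional to $h^2$, while the projected velocity satisfies it to round-off. The position-level constraint is not corrected by this step.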

The idea of half-explicit methods is to replace the argument $q_1$ with $q_0$ in the first line of Eq. (2.20). Adding the resulting equation to the second of Eqs. (2.19) eliminates $\hat{v}_1$. Introducing a new variable $\lambda_0$ for $\lambda(q_0, v_0) + \mu/h$, the following algorithm is obtained:

$$\begin{gathered} q_1 - q_0 = h v_0 \\ M(q_0)(v_1 - v_0) = hQ^A(q_0, v_0) - h\Phi_q^T(q_0)\,\lambda_0 \\ \Phi_q(q_1)\,v_1 = 0 \end{gathered} \tag{2.21}$$

The first relation defines $q_1$, whereas the remaining equations represent a linear system for $v_1$ and $\lambda_0$. The advantage of the approach of Eq. (2.21) over the one of Eqs. (2.19) and (2.20) is that neither the value $\lambda(q_0, v_0)$ (which requires the solution of a linear system) nor the intermediate value $\hat{v}_1$ must be calculated. Consequently, this algorithm does not require the acceleration kinematic constraint equation of Eq. (2.4).

Application of the above idea to each stage of an explicit Runge-Kutta method with coefficients $a_{ij}$ and $b_i$ yields the algorithm

$$\begin{gathered} P_i = q_0 + h\sum_{j=1}^{i-1} a_{ij} V_j \\ V_i = v_0 + h\sum_{j=1}^{i-1} a_{ij} V'_j \\ 0 = \Phi_q(P_i)\,V_i \end{gathered} \tag{2.22}$$

where $V'_j$ is given by

$$M(P_j)\,V'_j = Q^A(P_j, V_j) - \Phi_q^T(P_j)\,\Lambda_j \tag{2.23}$$

and $q_1 = P_{s+1}$, $v_1 = V_{s+1}$. For simplicity, the notation $a_{s+1,i} = b_i$, $i = 1, \ldots, s$, has been used. Attractive features of the proposed method are that only linear systems of equations must be solved, and the acceleration kinematic equation does not appear in the formulation.
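The half-explicit Euler step of Eq. (2.21) can be sketched in a few lines. This is a minimal illustration, assuming the same hypothetical planar point-mass pendulum with $M = \text{mass} \cdot I$ and a single constraint, for which the linear system in $v_1$ and $\lambda_0$ can be solved in closed form; the model, parameter values, step size, and function names are assumptions, not part of the thesis.

```python
import math

g, L, mass = 9.81, 1.0, 1.0  # assumed values; the mass matrix is M = mass * I

def phi(q):
    return q[0] ** 2 + q[1] ** 2 - L ** 2

def phi_q(q):
    return (2.0 * q[0], 2.0 * q[1])

def q_applied(q, v):
    """Generalized applied force: gravity only."""
    return (0.0, -mass * g)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def half_explicit_euler_step(q0, v0, h):
    # first relation of Eq. (2.21): defines q1
    q1 = (q0[0] + h * v0[0], q0[1] + h * v0[1])
    G0, G1 = phi_q(q0), phi_q(q1)
    Q = q_applied(q0, v0)
    # unconstrained velocity update w = v0 + h M^{-1} Q
    w = tuple(v0[i] + h * Q[i] / mass for i in range(2))
    # v1 = w - h M^{-1} G0^T lam0; lam0 follows from Phi_q(q1) v1 = 0
    lam0 = dot(G1, w) / (h * dot(G1, G0) / mass)
    v1 = tuple(w[i] - h * G0[i] * lam0 / mass for i in range(2))
    return q1, v1, lam0

h, steps = 1.0e-3, 100
q = (L * math.sin(0.3), -L * math.cos(0.3))  # released from rest at 0.3 rad
v = (0.0, 0.0)
max_vel_residual = 0.0
for _ in range(steps):
    q, v, lam0 = half_explicit_euler_step(q, v, h)
    max_vel_residual = max(max_vel_residual, abs(dot(phi_q(q), v)))
pos_drift = phi(q)
```

Neither $\lambda(q_0, v_0)$ nor $\tau$ is evaluated in this sketch, which matches the remark that the acceleration kinematic constraint is not needed. The velocity constraint holds to round-off at every step, whereas the position constraint is not enforced and accumulates a small drift.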

Another class of algorithms for the solution of DAE is based on the state-space reduction method. The DAE is reduced to an equivalent ODE, using a parametrization of the constraint manifold. The dimension of the equivalent second order state-space ODE (SSODE) is significantly reduced to $ndof = n - m$. This method has the potential of using well established theory and reliable numerical techniques for the solution of ODE. Since constraint equations in multibody dynamics are generally nonlinear, a parametrization of the constraint manifold can only be determined locally. Computational overhead results each time the parametrization is changed. Nonlinearity also leads to computational effort in retrieving dependent generalized coordinates through the parametrization. This stage requires the solution of a system of nonlinear equations, for which Newton-like methods are generally used.

The choice of constraint parametrization differentiates among algorithms in this class. The most used parametrization is based on an independent subset of position coordinates of the mechanical system (Wehage and Haug, 1982). The partition of variables into dependent and independent sets is based on LU factorization of the constraint Jacobian matrix. This partition is maintained as long as the sub-Jacobian matrix (the derivative of the constraint functions with respect to dependent coordinates) is not ill conditioned. The method has been used extensively with large scale applications in multibody dynamics and has proved to be reliable and accurate. This approach is presented in detail in Section 3.1.

State-space methods for solution of the DAE of multibody dynamics have been subject to critique in two aspects. First, the choice of projection subspace is generally not global. Second, as Alishenas and Olafsson (1994) have pointed out, bad choices of the projection space result in SSODE that are demanding in terms of numerical treatment, mainly at the expense of overall efficiency of the algorithm.
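The two stages described above, partitioning the coordinates and recovering the dependent ones by a Newton-like iteration, can be sketched on a one-constraint example. The code below is a simplified illustration, assuming a planar pendulum with constraint $x^2 + y^2 - L^2 = 0$; with a single constraint, the pivoting of an LU factorization reduces to selecting the Jacobian entry of largest magnitude. The model and all function names are hypothetical, and the sketch does not reproduce the algorithm of Section 3.1.

```python
import math

L = 1.0  # rod length (assumed value)

def phi(q):
    return q[0] ** 2 + q[1] ** 2 - L ** 2

def phi_q(q):
    return (2.0 * q[0], 2.0 * q[1])

def choose_dependent(q):
    """Index of the dependent coordinate: the largest |entry| of Phi_q (the pivot)."""
    G = phi_q(q)
    return max(range(len(G)), key=lambda i: abs(G[i]))

def recover_dependent(q_guess, dep, tol=1e-12, max_iter=20):
    """Newton iteration on Phi(q) = 0, updating only the dependent coordinate."""
    q = list(q_guess)
    for _ in range(max_iter):
        r = phi(q)
        if abs(r) < tol:
            break
        q[dep] -= r / phi_q(q)[dep]
    return tuple(q)

def dependent_velocity(q, v_ind, dep):
    """Velocity constraint Phi_u u_dot = -Phi_v v_dot solved for the dependent velocity."""
    ind = 1 - dep
    G = phi_q(q)
    qd = [0.0, 0.0]
    qd[ind] = v_ind
    qd[dep] = -G[ind] * v_ind / G[dep]
    return tuple(qd)

q_old = (math.sin(0.3), -math.cos(0.3))
dep = choose_dependent(q_old)        # y is dependent here, x is independent

# Suppose an integrator has advanced the independent coordinate x to 0.35
# and the independent velocity to 0.8 (both values are made up).
q_pred = (0.35, q_old[1])
q_new = recover_dependent(q_pred, dep)
qd_new = dependent_velocity(q_new, 0.8, dep)
```

The previous value of the dependent coordinate serves as the Newton starting point, which also keeps the iteration on the correct branch of the constraint. If the pendulum swings toward the horizontal, the pivot entry $2y$ shrinks and a new partition should be selected, which corresponds to the repartitioning overhead mentioned above.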

These critiques are answered to some extent by considering the tangent-plane parametrization method proposed by Mani et al. (1985), where the parametrization variables are obtained as linear combinations of generalized coordinates. The parametrization variables $z$ are components of the vector $z = V_I\, q$, where $q$ is the vector of generalized coordinates, and $V_I \in \mathbb{R}^{ndof \times n}$ contains the last $ndof$ rows of the matrix $V$ in the singular value decomposition (SVD) $\Phi_q = U D V^T$ of the constraint Jacobian. The SVD can be replaced by a more efficient QR factorization, while the rest of the approach is maintained. The benefits of this reduction are anticipated to be twofold. First, the resulting SSODE is expected to be numerically better conditioned and allow for significantly larger integration step-sizes. Second, dependent variable recovery can take advantage of information generated during state-space reduction.

The state-space based reduction alternatives presented above are particular cases of a more general formulation proposed by Potra and Rheinboldt (1991). The main idea is that, under discretization, Eqs. (2.2) through (2.5) result in an over-determined nonlinear algebraic system that is inconsistent. Therefore, projection is first performed at the position and velocity levels,

$$
\begin{aligned}
A_1^T (\dot{q} - v) &= 0 \\
A_2^T (\dot{v} - a) &= 0
\end{aligned}
\tag{2.24}
$$

where $A_1$ and $A_2$ are $n \times ndof$ matrices, chosen such that the augmented matrices

$$
\begin{bmatrix} A_i^T \\ \Phi_q \end{bmatrix}, \quad i = 1, 2,
$$

are nonsingular. Equation (2.24) is appended to Eqs. (2.2) through (2.5), and rewritten in the form

$$
\begin{aligned}
\Phi(q) &= 0 \\
\Phi_q(q)\, v &= 0 \\
\Phi_q(q)\, a &= \tau(q, v) \\
M(q)\, a + \Phi_q^T \lambda &= Q^A(t, q, v)
\end{aligned}
$$

to obtain an index 1 DAE that is discretized by an integration formula. The resulting well determined nonlinear algebraic system is solved at each time step to recover position $q$, velocity $v$, acceleration $a$, and Lagrange multiplier $\lambda$. The methods of Wehage and Haug (1982), Mani et al. (1985), and Haug and Yen (1992) can be obtained from the formulation presented above by choosing particular projection matrices $A_1$ and $A_2$.

The last state-space type method discussed is the differentiable null space method of Liang and Lance (1987). It reduces the DAE to an equivalent SSODE by projecting the equations of motion onto the tangent hyperplane of the manifold. The projection is done before discretization, and Lagrange multipliers are eliminated from the problem. The algorithm requires a set of $ndof$ vectors that span the constraint manifold hyperplane, along with their first time derivative. The Gram-Schmidt factorization method (Atkinson, 1989) is used to obtain this information. The algorithm is efficient and robust, the resultant SSODE of dimension $ndof$ being well conditioned. The implementation of an implicit formula to integrate the resulting state-space ODE is difficult, because of the Gram-Schmidt process embedded in the algorithm.

There are several conceptually different methods to obtain numerical solutions of the DAE of Multibody Dynamics. They share common features, e.g., the existence of an integration formula, and face similar challenges, e.g., the solution of systems of nonlinear equations. However, they are significantly different in terms of the sequence of steps taken for solving the DAE problem. Because of the heterogeneous character of the algorithms and techniques used by different numerical DAE solvers, robustness and efficiency considerations limit the

breadth of the proposed research to state-space methods for solving DAE. This choice is motivated by the better accuracy and reliability of the algorithms in this class, when used for dynamic analysis of large scale mechanical systems.

2.2 Review of Methods for the Solution of ODE

There are three important classes of methods for numerical integration of ODE: multi-step methods, Runge-Kutta methods, and extrapolation methods. They are briefly discussed below, along with error control mechanisms.

When compared to one step methods, multi-step methods usually require fewer function evaluations to meet accuracy requirements. They are potentially efficient and useful if derivative information is expensive to obtain, as is the case in simulation of Multibody Dynamics. In the language of simulation of Multibody Dynamics, a function evaluation is equivalent to independent acceleration computation. Given independent positions and velocities, computing independent accelerations requires solution of a set of non-linear equations to retrieve dependent coordinates and of a linear system to obtain dependent velocities. Finally, after evaluating the composite mass matrix and generalized force vector, independent accelerations are obtained by solving the linear system of Eq. (2.6).

The most widely used multi-step formulas belong to the Adams family. They are typically used for non-stiff ODE integration. Multi-step formulas are also effective when dense output is required. This feature regards the capability of a method to generate cheap numerical approximations of the solution and its derivative between grid points. It is important for practical questions such as graphical output, or event location.

Multi-step methods are fast and reliable over a large range of tolerances, due to the fact that they can vary both order and step size automatically. In terms of error

control, current implementations of multi-step methods use an estimate of the local truncation error, via a scaled predictor-corrector difference (Eich et al., 1995). Using past solution and derivative information, multi-step methods construct an interpolating polynomial that is used to produce the numerical solution at the new time step. The grid points for this polynomial can be chosen arbitrarily (Krogh, 1969), or equidistant (Gear, 1971). In the latter case, a time step change requested by the error control mechanism requires determination of the new off-grid values by interpolation. An order change is much simpler in this respect. It is done by adding more past values of the solution and/or derivative. Families of formulas such as Adams are very attractive from this standpoint. The opposite is true for Nordsieck-based multi-step formulas (Nordsieck, 1962); i.e., step-size change only requires rescaling of each entry of the Nordsieck vector, while an order change is more complex.

A drawback of multi-step formulas is the need for starting values. Starting multi-step formulas is usually done by using one step formulas to generate the required number of past values, but more recently, self-starting multi-step methods that start with order one and very small step lengths have gained acceptance. This drawback is a matter of concern with state-space methods, because a change of parametrization usually results in a restart of integration. Furthermore, the method is not recommended for integration of intermittent motion, when repeated integration restarts must be effectively dealt with.

There are several good codes that are based on multi-step methods. The most popular is DEABM (Shampine and Gordon, 1975), which belongs to the package DEPAC designed by Shampine and Watts (1979). The code implements the variable step size, divided difference representation of the Adams formulas, using the Predict-Evaluate-Correct-Evaluate (PECE) strategy. The implementation requires two function evaluations for each successful step.
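
The PECE strategy mentioned above can be sketched on a much simpler formula pair than the one used by DEABM. The fixed step-size, second order Adams-Bashforth predictor with trapezoidal (Adams-Moulton) corrector below is an assumption chosen for brevity, as are the function name `abm2_pece` and the scalar test problem. The sketch counts function evaluations to show the cost of two evaluations per step, and forms a scaled predictor-corrector difference as a local error indicator (the factor 1/6 follows from the error constants 5/12 and -1/12 of this particular pair).

```python
import math


def abm2_pece(f, t0, y0, h, n_steps):
    """Fixed-step second order Adams PECE for a scalar ODE y' = f(t, y)."""
    evals = 0

    def F(t, y):
        nonlocal evals
        evals += 1
        return f(t, y)

    t, y = t0, y0
    f_prev = None
    f_curr = F(t, y)
    max_est = 0.0
    for _ in range(n_steps):
        if f_prev is None:
            y_pred = y + h * f_curr                          # P: Euler starter
        else:
            y_pred = y + 0.5 * h * (3.0 * f_curr - f_prev)   # P: Adams-Bashforth 2
        f_pred = F(t + h, y_pred)                            # E
        y_new = y + 0.5 * h * (f_curr + f_pred)              # C: trapezoidal rule
        if f_prev is not None:
            # Scaled predictor-corrector difference (Milne-type estimate)
            max_est = max(max_est, abs(y_new - y_pred) / 6.0)
        t, y = t + h, y_new
        f_prev, f_curr = f_curr, F(t, y)                     # E
    return y, evals, max_est


y_end, n_evals, est = abm2_pece(lambda t, y: -y, 0.0, 1.0, 0.01, 100)
err = abs(y_end - math.exp(-1.0))
```

A production code would additionally vary the order and the step-size based on the error indicator; this sketch only shows the predictor, corrector, and evaluation sequence.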

VODE implements the Adams method in Nordsieck form. It is due to Brown et al. (1989). The code is recommended for problems with widely different active time scales. The nonlinear equation is solved (for non-stiff initial value problems) by fixed point iteration, and for many steps one iteration is sufficient. The order selection strategy is to maximize the step-size. The order is kept constant over long intervals, which is reasonable since a change of order for the Nordsieck representation is expensive. Finally, for non-stiff ODE, LSODE is another implementation of the Adams family, based on the fixed step size Nordsieck representation. It behaves similarly to VODE.

Hairer, Nørsett, and Wanner (1993) carried out numerical experiments on a set of small and large problems to compare these multi-step codes. For the sake of completeness, they also included the code DOPRI853 (Dormand and Prince, 1980), based on a one step method (Runge-Kutta) that is discussed below. Their conclusion was that LSODE and DEABM usually require, for equal accuracy, fewer function evaluations, with DEABM being the best for high precision ($Tol \le 10^{-6}$). This suggests that the error control mechanism in DEABM is better designed than the one in LSODE. When computing time was measured instead of function evaluations, the situation changed dramatically in favor of DOPRI853. This was observed for small problems in which derivative evaluation is cheap. When the derivative is expensive to evaluate, the discrepancy is not as large. This is explained by the overhead of the error control mechanism for multi-step methods, compared to the one in DOPRI853, and by the cost of derivative evaluation for the test problems considered.

Single step methods for solving ODE require only knowledge of the solution at the current point, in order to advance the solution to the next grid-point. The most used single-step methods are Runge-Kutta (RK) methods and extrapolation methods. The

truncation error for RK methods can be controlled in a more straightforward manner than for multi-step methods. However, these methods require more function evaluations, and this is a matter of concern in multibody dynamics.

Error control for RK methods is almost always based on step size selection only. The error estimation is obtained using either extrapolation methods, or (in most cases) embedded formulas of different order. The latter approach computes two approximations of the solution, $y_1$ and $\hat{y}_1$, and an estimate of the local truncation error is obtained as $y_1 - \hat{y}_1$. Componentwise, this error is required to be smaller than some limit value that is computed based on user prescribed absolute and relative tolerances that are ensured by the error control mechanism. The question of which value to choose, $y_1$ or $\hat{y}_1$, to continue the integration is usually answered by taking the value obtained with the method of higher order. This strategy is called local extrapolation. The rationale is that, due to unknown stability properties of the differential system, local errors have little in common with global error. Therefore, the error estimate $y_1 - \hat{y}_1$ is used solely for step-size selection and afterwards discarded.

There are many codes based on Runge-Kutta methods. Several that are commonly used are presented below. These codes are characterized by good error control that leads to efficient implementations.

The code RKF45 was written by Shampine and Watts (1979). It is based on a pair of embedded formulas of order 4 and 5, and it uses local extrapolation. Because of the low orders considered, except for precision, the code requires the largest number of function calls, compared to DOPRI5, DOPRI853, and DVERK, which are discussed below. When the comparison is made in terms of CPU time, RKF45 shows some improvement relative to the performance of DOPRI5, DOPRI853, and DVERK, suggesting small overhead.
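
The embedded-formula error control and local extrapolation described above can be sketched with the smallest possible pair, Heun's method (order 2) with an embedded explicit Euler step (order 1). This pair, the safety factor 0.9, the step-size ratio limits 0.2 and 5, and the function name `heun_euler_adaptive` are assumptions chosen for illustration; the codes discussed in this section use higher order pairs and more refined controllers.

```python
import math


def heun_euler_adaptive(f, t0, y0, t_end, rtol=1e-6, atol=1e-6, h=0.1):
    """Adaptive integration of a scalar ODE with an embedded order 2(1) pair."""
    t, y = t0, y0
    accepted = rejected = 0
    while t_end - t > 1e-14:
        h = min(h, t_end - t)
        k1 = f(t, y)
        k2 = f(t + h, y + h * k1)
        y_low = y + h * k1                    # order 1 (explicit Euler)
        y_high = y + 0.5 * h * (k1 + k2)      # order 2 (Heun)

        # Error measured against a mixed absolute/relative tolerance
        scale = atol + rtol * max(abs(y), abs(y_high))
        err = abs(y_high - y_low) / scale

        if err <= 1.0:
            t, y = t + h, y_high              # local extrapolation: keep higher order
            accepted += 1
        else:
            rejected += 1

        # The estimate is then discarded; it only drives the next step-size.
        factor = 0.9 * max(err, 1e-16) ** -0.5
        h *= min(5.0, max(0.2, factor))
    return y, accepted, rejected


y_end, n_acc, n_rej = heun_euler_adaptive(lambda t, y: -y, 0.0, 1.0, 1.0)
err = abs(y_end - math.exp(-1.0))
```

The exponent -1/2 corresponds to an error estimate that behaves like $h^2$ for this pair; for a pair whose lower order is $q$, the exponent would be $-1/(q+1)$.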

The code DOPRI5 is based on a RK method of order 5, with an embedded error estimator of order 4, as proposed by Dormand and Prince. Since the coefficients of the method are carefully chosen, error constants are much more optimized than for RKF45. Thus, for comparable numerical effort, between half and one digit better numerical precision is obtained for DOPRI5, compared to RKF45. The code performs well for medium precision error tolerances (between $10^{-3}$ and $10^{-5}$).

The code DOPRI853 was theoretically expected to perform well for high accuracy. However, it outperformed codes based on lower order formulas even for low accuracy. Whenever more than 3 or 4 exact digits are desired, this code is recommended. The method is of order 8, while the error estimator is based on a 5th order method with 3rd order correction.

The code DVERK is based on a 6th order method due to Verner (1978), and is included in the IMSL library. The error constants are less optimal, and this code surpasses the performance of DOPRI5 only for very stringent error tolerances.

Another class of one step methods is based on extrapolation techniques. These methods are less popular than multi-step and RK methods. They are usually used when very high accuracy (errors less than $10^{-12}$) is sought. Extrapolation methods use the asymptotic expansion of the global error to successively eliminate more and more terms of the truncation error associated with an integration formula by repeated extrapolation. The method generates a tableau of numerical results that form a sequence of embedded methods. This allows easy estimates of local error and strategies for variable order formulas (Hairer, Nørsett, and Wanner, 1993). The method can easily generate dense output, and is adequate for multi-rate integration of ODE.
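
The tableau of extrapolated results mentioned above can be sketched as follows. The choice of the explicit Euler method as the base formula, the harmonic step-number sequence $n_j = j$, and the function names are assumptions made for this illustration; they are not taken from the codes cited in this section. Each row refines the basic step $H$ with more substeps, and the Aitken-Neville recursion eliminates one more term of the error expansion per column.

```python
import math


def euler_substeps(f, t0, y0, H, n):
    """Integrate y' = f(t, y) over one basic step H with n explicit Euler substeps."""
    h = H / n
    t, y = t0, y0
    for _ in range(n):
        y += h * f(t, y)
        t += h
    return y


def extrapolation_tableau(f, t0, y0, H, k_max=6):
    """Build the extrapolation tableau T[j][k] for one basic step of size H."""
    T = []
    for j in range(1, k_max + 1):
        row = [euler_substeps(f, t0, y0, H, j)]        # first column: n_j = j
        for k in range(1, j):
            prev = T[j - 2]                            # previous row of the tableau
            ratio = j / (j - k)                        # n_j / n_{j-k}
            row.append(row[k - 1] + (row[k - 1] - prev[k - 1]) / (ratio - 1.0))
        T.append(row)
    return T


T = extrapolation_tableau(lambda t, y: -y, 0.0, 1.0, 0.5)
exact = math.exp(-0.5)
err_first_column = abs(T[-1][0] - exact)
err_diagonal = abs(T[-1][-1] - exact)
```

The difference between neighboring entries of a row, such as `T[-1][-1] - T[-1][-2]`, is the kind of quantity an extrapolation code can use as a local error estimate when selecting order and step-size.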

The best known code based on the extrapolation method is the code ODEX of Hairer, Nørsett, and Wanner (1993). Numerical experiments show good performance of the code, especially for stringent accuracy where it outperformed DOPRI853.

Many of the explicit codes discussed so far do a very poor job when applied for the solution of stiff initial value problems. While practical causes for stiffness are intuitive, there is controversy about its correct mathematical definition. The eigenvalues of the Jacobian of the derivative function play a key role in deciding if an initial value problem is stiff or not, but other quantities such as the smoothness of the solution, the dimension of the problem, and the length of the integration interval are also important (Hairer and Wanner, 1996). From a practical standpoint, engineering applications such as the dynamic analysis of a car, tractor semi-trailer, etc., result in stiff problems, due to bushings, dampers, stiff springs, and flexible components present in the models. This indicates that stiff integration formulas must be used quite frequently for efficient solution of these problems.

The best known one step RK methods for stiff ODE are the collocation methods, diagonally implicit RK (DIRK) methods, and Rosenbrock-type methods. All these methods are implicit.

The most common collocation methods are fully implicit RK methods, with intermediate steps that are usually the zeros of certain orthogonal polynomials. The number of stages is rather low, because of the limitation imposed by the nonlinear system that must be solved at each time step. Thus, if the dimension of the ODE is $n$, an $s$ stage RK method of this type requires the solution of a nonlinear system of dimension $ns$ at each time step. The best known method in this class is RADAU5 (Hairer and Wanner,

1996). It is based on Radau quadrature, the intermediate points $c_1, \ldots, c_s$ of the $s$ stage RK method being the zeros of the polynomial

$$
\frac{d^{\,s-1}}{dx^{\,s-1}} \left( x^{s-1} (x - 1)^s \right)
$$

and the weights $b_i$ are determined by the quadrature conditions

$$
\sum_{i=1}^{s} b_i\, c_i^{\,q-1} = \frac{1}{q}, \quad q = 1, \ldots, s
$$

There are also good methods based on Lobatto (Axelsson, 1972) and Gauss (Ehle, 1968) quadrature formulas. When compared to $s$ stage explicit RK methods, the order of the $s$ stage fully implicit methods is large. Furthermore, the $s$ stage methods, of order $2s$ for Gaussian quadrature, $2s - 1$ for Radau, and $2s - 2$ for Lobatto, have excellent stability properties. Thus, the 3 stage Radau and Lobatto formulas of order 5 and 4, respectively, are A-stable, a property that for multi-step methods can be associated only with orders up to two (Dahlquist, 1963).

Diagonally implicit Runge-Kutta (DIRK) methods are defined as

$$
\begin{aligned}
k_i &= h f\Big( x_0 + c_i h,\; y_0 + \sum_{j=1}^{i} a_{ij} k_j \Big), \quad i = 1, \ldots, s \\
y_1 &= y_0 + \sum_{i=1}^{s} b_i k_i
\end{aligned}
\tag{2.25}
$$

They do not require the costly solution of a system of nonlinear equations of dimension $ns$. Instead, these methods solve, at each stage, a system of nonlinear equations of dimension $n$, for the stage variables $k_i$.

The efficiency of these methods is further improved if all elements $a_{ii}$ in Eq. (2.25) are identical. The resulting methods are called singly diagonally implicit Runge-Kutta (SDIRK). In this case, the Jacobian matrix used by the quasi-Newton algorithm is

identical at each stage, one factorization of this matrix allowing for recovery of all stage values $k_i$. The better efficiency is obtained at the price of lower order methods, compared to the fully implicit approach. However, $s$ stage SDIRK methods can result in order $s$ methods that are L-stable (Hairer and Wanner, 1996).

Finally, Rosenbrock-type formulas attempt to improve efficiency by applying only one Newton iteration for the solution $k_i$ required by DIRK methods. When applied to an autonomous differential equation, a DIRK method is linearized to yield

$$
\begin{aligned}
k_i &= h \left[ f(g_i) + a_{ii} f'(g_i)\, k_i \right] \\
g_i &= y_0 + \sum_{j=1}^{i-1} a_{ij} k_j
\end{aligned}
\tag{2.26}
$$

Important computational advantage is obtained by replacing the Jacobian $f'(g_i)$ by $J = f'(y_0)$, so that the method requires its calculation and factorization only once. Additional linear combinations of terms $J k_j$ are introduced in Eq. (2.26) (Kaps and Rentrop, 1979) to obtain an $s$ stage Rosenbrock method,

$$
\begin{aligned}
k_i &= h f\Big( y_0 + \sum_{j=1}^{i-1} a_{ij} k_j \Big) + h J \sum_{j=1}^{i} \gamma_{ij} k_j, \quad i = 1, \ldots, s \\
y_1 &= y_0 + \sum_{i=1}^{s} b_i k_i
\end{aligned}
\tag{2.27}
$$

where $a_{ij}$, $\gamma_{ij}$, and $b_i$ are formula defining coefficients and $J = f'(y_0)$.

In terms of step-size control, with some modifications, strategies based on Richardson extrapolation or embedded methods are as for explicit RK methods. Integrators for stiff initial value problems replace the quantity $\hat{y}_1 - y_1$ previously used for step-size selection by $P(\hat{y}_1 - y_1)$, where $P = (I - h\gamma J)^{-1}$ depends on the method being used through the coefficient $\gamma$. The new error estimate for stiff equations is more reliable, compared to the older one, which becomes unbounded for the typical Dahlquist test problem $y' = \lambda y$, $y(0) = 1$.
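
To make Eq. (2.27) concrete, the sketch below takes its simplest instance: one stage with $\gamma_{11} = \gamma = 1$ and $b_1 = 1$, which is the linearly implicit Euler method. This particular choice of coefficients, the scalar setting, and the function names are assumptions made for illustration. Applied to the Dahlquist test problem with a large negative $\lambda$, the single linear solve per step gives a decaying solution at a step-size for which explicit Euler diverges.

```python
def rosenbrock1_step(f, dfdy, y0, h, gamma=1.0):
    """One-stage Rosenbrock step for a scalar autonomous ODE y' = f(y).

    Eq. (2.27) with s = 1 reads k1 = h f(y0) + h J gamma k1, with J = f'(y0),
    so k1 follows from a single linear solve: (1 - h gamma J) k1 = h f(y0).
    """
    J = dfdy(y0)
    k1 = h * f(y0) / (1.0 - h * gamma * J)
    return y0 + k1


def explicit_euler_step(f, y0, h):
    return y0 + h * f(y0)


lam = -1000.0                      # stiff Dahlquist test problem y' = lam * y
f = lambda y: lam * y
dfdy = lambda y: lam

y_ros = y_exp = 1.0
for _ in range(10):
    y_ros = rosenbrock1_step(f, dfdy, y_ros, 0.1)
    y_exp = explicit_euler_step(f, y_exp, 0.1)
```

In a system setting the scalar division becomes a linear solve with the matrix $I - h\gamma J$, which is the matrix that must be refactored whenever the step-size changes.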

Newton-like methods are used to solve the discretized systems of nonlinear equations. Since the step-size appears in expressions for matrices that must be factored (such as in $P$ above), a step-size change requires refactorization of these matrices. Therefore, the step-size is decreased, based on stability considerations, whenever necessary, but it is only increased when the estimated new step-size $h_{new}$ is significantly larger than the current one. Hairer and Wanner (1996), in their step size control mechanism for RADAU5, do not change the step size as long as $c_1 h_{old} \le h_{new} \le c_2 h_{old}$, with $c_1 = 1.0$ and $c_2 = 1.2$. The values for these constants should be adjusted to take into account the dimension of the problem and the complexity of the LU factorization associated with a step-size change.

In terms of multi-step formulas, much as numerical integration considerations lead to the Adams family of formulas for the numerical solution of non-stiff ODE, numerical differentiation is used to obtain the family of BDF that, due to their stability properties, are well suited for numerical integration of stiff initial value problems. The idea is to determine an interpolation polynomial satisfying $P_k(x_{n+1-j}) = y_{n+1-j}$, $j = 0, 1, \ldots, k$, and require that at the new configuration $x_{n+1}$,

$$
P'_k(x_{n+1}) = f(x_{n+1}, y_{n+1}) \tag{2.28}
$$

The value $y_{n+1} = P_k(x_{n+1})$ is taken as the numerical solution of the ODE at grid point $x_{n+1}$. The BDF are implicit formulas, since recovering $y_{n+1}$ amounts to solving a system of non-linear algebraic equations resulting from Eq. (2.28).

Although the procedure outlined above produces formulas of any order, the high order formulas cannot be used in practice. If the step-size is constant, formulas of order 7 and higher are computationally useless, because they are not stable (Shampine, 1994). Furthermore, BDF are not as accurate as Adams-Moulton formulas of the same order, and only relatively low order BDF are stable. Nevertheless, the stable BDF are so much

more stable than the Adams formulas of the same order that they are essential for the solution of stiff problems. If simple calculations are carried out, BDF can be brought to the standard form

$$
y_{n+1} = \sum_{i=1}^{k} \gamma_i\, y_{n+1-i} + h \gamma_0 f(x_{n+1}, y_{n+1}) \tag{2.29}
$$

These formulas come in families of order 1, 2, 3, etc. The lowest order is one (backward Euler), which is a one step method. Starting integration with this formula, which is L-stable, and building up to an appropriate order is the usual strategy for BDF-based codes. The order increase is eventually followed by a step-size increase. This strategy is recent (Shampine and Zhang, 1990), while older implementations (Brankin et al., 1988) start the integration using RK methods with dense output capabilities.

Compared to explicit multi-step formulas, the error control mechanism for BDF is more conservative, in terms of step-size variation. A change of step-size occurs when the step fails on accuracy considerations, or convergence of the iterative process that retrieves $y_{n+1}$ is unacceptably slow. On the other hand, the step-size is increased only when the overhead associated with this process is worthwhile. There are two main sources of overhead. First, any step-size change requires refactoring certain matrices used in the iterative process. Second, a step-size change is equivalent to an integration restart, since past information to suit the new step-size must be generated.

Although it does not explicitly appear in Eq. (2.29), it is usually assumed that the grid points are equidistant; i.e., past values $y_n, \ldots, y_{n+1-k}$ are obtained with the same step-size $h$. There are implementations that allow arbitrary grid points, but such methods must recompute some of the coefficients $\gamma_i$ of Eq. (2.29) at each time step. Finally, for most methods, adjusting step-size is more difficult than changing order. Frequent time step changes corrupt the stability properties of BDF methods,

especially if the step change occurs more often than once every $k$ steps, where $k$ is the number of past values used in Eq. (2.29).

Hairer and Wanner (1996) performed numerical experiments to compare the performance of several ODE codes based on one and multiple-step methods for integration of stiff initial value problems. For multi-step methods, the codes available are basically the same as for explicit integrators. Nevertheless, they use implicit formulas, and the user must decide if the problem should be deemed stiff or not.

Among the codes based on multi-step methods, VODE, LSODE, and DEBDF were compared. All of them are variable order, variable step-size implementations. The code VODE was briefly discussed previously, and it is based on BDF methods on a non-uniform grid. The codes LSODE and DEBDF are very similar, and are based on the Nordsieck representation of the uniform grid BDF methods. They are similar in performance, DEBDF being usually slightly faster than LSODE. When used to integrate small problems, the codes LSODE and DEBDF were faster than VODE. The difference vanished when larger problems were considered.

When comparing the performance of multi-step and one step methods for small problems, the code RADAU5, a 3 stage, order 5 fully implicit RK method, outperformed the multi-step methods. This trend reverses when the codes are tested on large problems, because for RADAU5 the dimension of the nonlinear system to be solved at each step becomes rather large. However, RADAU5 proves to be a reliable code with consistent behavior over all test problems considered.

In terms of one step methods, RADAU5 and the code RODAS, based on a 5 stage, 4th order Rosenbrock method with embedded error control of order 3 (Hairer and Wanner, 1996), turned out to be the best. For small problems, RODAS appears to perform better than RADAU5. It should be mentioned that the latter code is also used to integrate low index DAE. One drawback of the codes based on Rosenbrock methods is

the requirement to provide an accurate Jacobian. For complex problems, when an analytical representation of the Jacobian cannot be provided, numerical techniques to generate this information are employed, and this corrupts performance.

Finally, there are codes based on implicit extrapolation formulas, i.e., SODEX and SEULEX (the stiff versions of ODEX and EULEX), that have not been described here. For very stringent error tolerances (usually less than $10^{-12}$), these are the methods of choice. A presentation of these methods can be found in the work of Bader and Deuflhard (1983) and the book of Hairer and Wanner (1996).

CHAPTER 3 THEORETICAL CONSIDERATIONS

3.1 Generalized Coordinates Partitioning Algorithm

In Chapter 2 several methods for the solution of differential-algebraic equations (DAE) of Multibody Dynamics were presented, each with its own merits and drawbacks. This thesis is not aimed at the otherwise worthy goal of comparing these methods in a consistent and unitary manner, but rather its objective is to extend the range of applicability of the state-space method using implicit numerical integrators. The focus is set primarily on coordinate partitioning based state-space reduction (Wehage and Haug, 1982). Theoretical considerations pertaining to tangent-plane parametrization-based state-space reduction are made in Appendix B, but this alternative has not been numerically investigated.

Central to the coordinate partitioning based DAE-to-ODE reduction is the notion of partitioning the vector of generalized coordinates $q$ into dependent and independent vectors $u$ and $v$, respectively. Let $q_0$ be a consistent configuration of the mechanical system; i.e., $\Phi(q_0) = 0$. In this configuration, the constraint Jacobian $\Phi_q$ is evaluated and numerically factored using Gaussian elimination with full pivoting (Atkinson, 1989), yielding

$$
\Phi_q(q_0) \;\xrightarrow{\;\text{Gauss}\;}\; \left[\, \Phi_u^* \;\; \Phi_v^* \,\right] \tag{3.1}
$$

where, provided the Jacobian has full row rank, the matrix $\Phi_u^*$ is upper triangular and

$$
\det(\Phi_u) \neq 0 \tag{3.2}
$$
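
The factorization of Eq. (3.1) can be sketched as follows. This is a minimal dense implementation written for illustration only; the function name `partition_coordinates`, the 0-based indexing, and the rank tolerance are assumptions, and a production code would rely on a library factorization that exploits sparsity. The returned lists play the role of the permutation array: the first holds the indices of the dependent coordinates $u$, the second those of the independent coordinates $v$.

```python
def partition_coordinates(jac, tol=1e-12):
    """Select dependent/independent coordinates by Gaussian elimination
    with full pivoting on an m x n constraint Jacobian (list of rows)."""
    a = [row[:] for row in jac]                # work on a copy
    m, n = len(a), len(a[0])
    perm = list(range(n))                      # column permutation array
    for k in range(m):
        # Full pivoting: largest entry in the remaining block
        p, c = max(((i, j) for i in range(k, m) for j in range(k, n)),
                   key=lambda ij: abs(a[ij[0]][ij[1]]))
        if abs(a[p][c]) < tol:
            raise ValueError("constraint Jacobian is rank deficient")
        a[k], a[p] = a[p], a[k]                # row swap
        for row in a:                          # column swap, recorded in perm
            row[k], row[c] = row[c], row[k]
        perm[k], perm[c] = perm[c], perm[k]
        for i in range(k + 1, m):              # eliminate below the pivot
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return perm[:m], perm[m:]                  # (dependent, independent)


# Planar pendulum at q0 = (0.6, -0.8): Phi_q = [x, y]
dep, ind = partition_coordinates([[0.6, -0.8]])
```

For the pendulum example the larger Jacobian entry belongs to $y$, so $y$ is selected as the dependent coordinate and $x$ as the independent one, which keeps the sub-Jacobian with respect to the dependent coordinate as far from singular as possible at this configuration.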

Let $s(i)$, $i = 1, \ldots, n$, be the set of column permutations done during the Gaussian elimination algorithm. By formally rearranging the generalized coordinates in the vector $q$ such that

$$
q_{new}(i) = q(s(i)), \quad i = 1, \ldots, n \tag{3.3}
$$

the first $m$ components of $q_{new}$ comprise the vector of dependent coordinates $u$, while the remaining $ndof = n - m$ components form the vector of independent coordinates $v$. It will be assumed henceforth that reordering has been done, so the permutation array is equal to the identity; i.e.,

$$
s(i) = i, \quad i = 1, \ldots, n \tag{3.4}
$$

Thus, no column permutation is necessary during the factorization process. This is not a limiting assumption, and it is introduced here to simplify the notation and enhance the clarity of the presentation. Under this assumption, the kinematic constraint equations at position, velocity, and acceleration levels, along with the Newton-Euler form of the equations of motion, can be expressed in partitioned form as

$$
M^{vv}(u, v)\, \ddot{v} + M^{vu}(u, v)\, \ddot{u} + \Phi_v^T(u, v)\, \lambda = Q^v(u, v, \dot{u}, \dot{v}) \tag{3.5}
$$

$$
M^{uv}(u, v)\, \ddot{v} + M^{uu}(u, v)\, \ddot{u} + \Phi_u^T(u, v)\, \lambda = Q^u(u, v, \dot{u}, \dot{v}) \tag{3.6}
$$

$$
\Phi(u, v) = 0 \tag{3.7}
$$

$$
\Phi_u(u, v)\, \dot{u} + \Phi_v(u, v)\, \dot{v} = 0 \tag{3.8}
$$

$$
\Phi_u(u, v)\, \ddot{u} + \Phi_v(u, v)\, \ddot{v} = \tau(u, v, \dot{u}, \dot{v}) \tag{3.9}
$$

The condition of Eq. (3.2) implies that

$$
\det(\Phi_u) \neq 0 \tag{3.10}
$$

over some time interval. The implicit function theorem (Corwin and Szczarba, 1982) guarantees that Eq. (3.7) can be locally solved in a neighborhood of the consistent configuration $q_0$ for $u$ as a function of $v$. Thus,

$$
u = g(v) \tag{3.11}
$$

where the function $g(v)$ has as many continuous derivatives as does the constraint function $\Phi(q)$. In other words, at an admissible configuration $q_0$, there exist neighborhoods $U_1(v_0)$ and $U_2(u_0)$, and a function $g: U_1 \to U_2$, such that for any $v \in U_1$, Eq. (3.7) is identically satisfied when $u$ is as given by Eq. (3.11). Note that the dependency of $u$ on $v$ in Eq. (3.11) is not explicitly determined, but is a theoretical result that enables DAE-to-ODE reduction.

Since the coefficient matrix $\Phi_u$ in Eq. (3.8) is nonsingular, $\dot{u}$ can be expressed in terms of $v$ and $\dot{v}$, as

$$
\dot{u} = -\Phi_u^{-1} \Phi_v\, \dot{v} \equiv H \dot{v} \tag{3.12}
$$

The dependency of the quantities in Eq. (3.12) and the following equations on $v$ and $\dot{v}$ is suppressed for notational simplicity. Following the same argument, the dependent accelerations are expressed as functions of independent positions, velocities, and accelerations using Eqs. (3.9), (3.11), and (3.12),

$$
\ddot{u} = H \ddot{v} + \Phi_u^{-1} \tau \tag{3.13}
$$

Finally, Lagrange multipliers are formally expressed as functions of independent positions, velocities, and accelerations, by using Eq. (3.6) to obtain

$$
\lambda = \Phi_u^{-T} \left[ Q^u - M^{uv} \ddot{v} - M^{uu} \left( H \ddot{v} + \Phi_u^{-1} \tau \right) \right] \tag{3.14}
$$

Once $u$, $\dot{u}$, $\ddot{u}$, and $\lambda$ are formally expressed as functions of independent variables, the DAE is reduced to a second order ODE called the state-space ODE (SSODE), which is obtained by substituting the dependent variables in Eq. (3.5) to yield

$$
\hat{M} \ddot{v} = \hat{Q} \tag{3.15}
$$

where

$$
\hat{M} = M^{vv} + M^{vu} H + H^T \left( M^{uv} + M^{uu} H \right) \tag{3.16}
$$

$$
\hat{Q} = Q^v + H^T Q^u - \left( M^{vu} + H^T M^{uu} \right) \Phi_u^{-1} \tau \tag{3.17}
$$

An argument based on the positive definiteness of the quadratic form associated with the kinetic energy of any mechanical system is at the foundation of a result (Haug, 1989) that states that the coefficient matrix $\hat{M}$ in Eq. (3.15) is positive definite. Therefore, the system in Eq. (3.15) has a unique solution at each time step, which is used to numerically advance the integration to the next time step.

From a practical standpoint, the independent accelerations are not computed by first evaluating the matrices $\hat{M}$ and $\hat{Q}$, and then solving the system in Eq. (3.15). This strategy for solving for independent accelerations would be inefficient, because of the costly matrix-matrix multiplications involved. Rather, an approach based on first solving the linear system of Eq. (2.6), and then extracting the independent accelerations $\ddot{v}$ from the set of generalized accelerations $\ddot{q}$, based on the permutation array $s$, is more efficient. Details about this strategy are given in later Sections.

In this context, it is worth pointing out that efficient state-space based explicit integration of the DAE of Multibody Dynamics requires in the first place a fast method to obtain the independent accelerations $\ddot{v}$. Using an explicit integration formula, the independent positions $v$ and velocities $\dot{v}$ are obtained at the next time step by integrating the initial value problem (IVP) $\ddot{v} = f(t, v, \dot{v})$ (with $f = \hat{M}^{-1} \hat{Q}$) for a given set of initial conditions $v_0$ and $\dot{v}_0$; then, using Eqs. (3.7) and (3.8), $u$ and $\dot{u}$ at the new time step can be obtained. These two last stages amount to the solution of a set of non-linear equations and a set of linear equations, respectively. Thus, for explicit integration the

linear algebra stage is central. It appears during each of the three steps of the process (computation of $\ddot{v}$, recovery of $u$, recovery of $\dot{u}$), and the extent to which topology information of the mechanical system model is taken into account will determine the overall performance of the method.

The following Sections address the issue of implicit integration in the framework of the state-space method. Both multi-step and Runge-Kutta methods are considered. Theoretical considerations in these Sections are based on results of Haug, Negrut and Iancu (1997a), and Negrut, Haug, and Iancu (1997).

3.2 State-Space Reduction Method

In the state-space reduction framework, several alternatives for DAE-to-ODE reduction are available. Although not mentioned explicitly in the title, the method presented in this section uses the coordinate partitioning based reduction algorithm (Wehage and Haug, 1982). Multi-step and Runge-Kutta integration formulas are going to be used to integrate the resulting SSODE.

Multi-step Methods

General Considerations on Multi-step Methods

This section contains a brief introduction of the multi-step numerical integration methods. Characteristic properties relevant in the context of implicit integration of the DAE of Multibody Dynamics via state-space reduction are presented. The works of Gear (1971), Hairer, Nørsett, and Wanner (1993), and Hairer and Wanner (1996) should be

consulted for a comprehensive treatment of the topic of multi-step numerical integration methods.

Consider the general implicit multi-step integration formula (Atkinson, 1989)

$$
y_{n+1} = \sum_{j=0}^{p} a_j\, y_{n-j} + h \sum_{j=-1}^{p} b_j f(x_{n-j}, y_{n-j}) \tag{3.18}
$$

where $h$ is the integration step-size, $a_0, \ldots, a_p, b_{-1}, b_0, \ldots, b_p$ are constants, and $p \ge 0$. This integration formula is rewritten as

$$
y_{n+1} = \tilde{y}_{n+1} + b\, h f(x_{n+1}, y_{n+1}) \tag{3.19}
$$

where $b \equiv b_{-1}$, and

$$
\tilde{y}_{n+1} = \sum_{j=0}^{p} a_j\, y_{n-j} + h \sum_{j=0}^{p} b_j f(x_{n-j}, y_{n-j})
$$

includes all the terms that do not depend on $y_{n+1}$. Depending on the set of coefficients $a_0, \ldots, a_p, b_{-1}, b_0, \ldots, b_p$, the multi-step formula will have certain order and stability properties.

In what follows, multi-step methods are applied for numerical integration of the generic second order IVP

$$
\ddot{v} = f(t, v, \dot{v}) \tag{3.20}
$$

with $v(t_0) = v_0$ and $\dot{v}(t_0) = \dot{v}_0$. This development will be instrumental in presenting an approach for the multi-step state-space based integration of DAE of Multibody Dynamics in the next Section. Moreover, anticipating the significance of $v$, $\dot{v}$, and $\ddot{v}$, these variables are called the independent positions, velocities, and accelerations.

Generally, two different formulas could be used, first to integrate independent accelerations to obtain independent velocities, and then to integrate independent

velocities to obtain independent positions. Based on Eq. (3.19), independent velocities and positions can be expressed as

$$
\dot{v}_{n+1} = \tilde{\dot{v}}_{n+1} + \gamma h\, \ddot{v}_{n+1} \tag{3.21}
$$

$$
v_{n+1} = \bar{v}_{n+1} + \bar{\beta} h\, \dot{v}_{n+1} \tag{3.22}
$$

Notice that $\tilde{\dot{v}}_{n+1}$ and $\bar{v}_{n+1}$ contain only past information, and generally $\gamma \neq \bar{\beta}$. Using Eq. (3.21) to express independent velocities in Eq. (3.22) in terms of independent accelerations, the independent positions assume the form

$$
v_{n+1} = \tilde{v}_{n+1} + \beta h^2\, \ddot{v}_{n+1} \tag{3.23}
$$

where

$$
\tilde{v}_{n+1} \equiv \bar{v}_{n+1} + \bar{\beta} h\, \tilde{\dot{v}}_{n+1}, \qquad \beta \equiv \gamma \bar{\beta}
$$

Discretizing the second order ODE in Eq. (3.20), and using Eqs. (3.21) and (3.23), yields

$$
\ddot{v}_{n+1} = f\big( t_{n+1},\, v_{n+1}(\ddot{v}_{n+1}),\, \dot{v}_{n+1}(\ddot{v}_{n+1}) \big) \tag{3.24}
$$

The dependence of independent positions and velocities on independent accelerations is via the integration formulas (3.23) and (3.21), respectively. Equation (3.24) is a system of non-linear algebraic equations that must be solved at each time step for the set of independent accelerations $\ddot{v}_{n+1}$. Recasting this system into the form

$$
\Psi(\ddot{v}_{n+1}) \equiv \ddot{v}_{n+1} - f\big( t_{n+1},\, v_{n+1}(\ddot{v}_{n+1}),\, \dot{v}_{n+1}(\ddot{v}_{n+1}) \big) = 0 \tag{3.25}
$$

its solution is found by means of a quasi-Newton method. For this, Jacobian information must be provided to the non-linear solver. After applying the chain rule of differentiation, the Jacobian (henceforth this quantity is going to be called the integration Jacobian, to avoid confusion with the constraint Jacobian $\Phi_q$) is obtained as

$$\Psi_{\ddot v} = I - f_{\dot v}\,\dot v_{\ddot v} - f_v\,v_{\ddot v} \tag{3.26}$$

In Eq. (3.26), subscripts $n+1$ have been dropped for convenience. Using Eqs. (3.21) and (3.23), the derivatives of independent positions and velocities with respect to independent accelerations are readily obtained as

$$v_{\ddot v} = \beta h^2 I \tag{3.27}$$

$$\dot v_{\ddot v} = \gamma h I \tag{3.28}$$

where $I$ is the identity matrix of appropriate dimension. Finally, substituting the expressions of $v_{\ddot v}$ and $\dot v_{\ddot v}$ in the expression for the integration Jacobian yields

$$\Psi_{\ddot v} = I - \gamma h f_{\dot v} - \beta h^2 f_v \tag{3.29}$$

The derivatives $f_{\dot v}$ and $f_v$ are problem dependent, and they are to be provided to the non-linear solver. How to provide them for the particular case of the SSODE of Multibody Dynamics is the topic of the next Section.

With these considerations, the algorithm proposed for the implicit integration of the second order IVP is as follows:

(1). Provide an initial estimate for the set of independent accelerations $\ddot v$.
(2). Integrate to obtain independent positions and velocities, using the expressions in Eqs. (3.23) and (3.21), respectively.
(3). If the stopping criteria are satisfied, then stop. Otherwise, using a quasi-Newton correction, correct the value of $\ddot v$ and go to step (2).

The stopping criteria in step (3) are based on the norm of the error in satisfying the non-linear system $\Psi(\ddot v) = 0$, and on the size of the last correction applied to $\ddot v$. In a practical implementation, the integration Jacobian $\Psi_{\ddot v}$ is computed once at the end of each successful integration step, and kept constant during the iterative process at least for

the following integration step. This strategy is motivated by the CPU-intensive process of generating $f_{\dot v}$ and $f_v$.

Multi-step State-Space Based Implicit Integration

The previous Section described how an implicit multi-step integration formula is used to integrate a generic set of second order ODE. The generic ODE is replaced here with the SSODE obtained via coordinate partitioning based DAE-to-ODE reduction. The challenge of multi-step state-space based implicit integration lies in providing derivative information, which is the backbone of integration Jacobian computation. The focus is thus set on computing the analogue of the quantities $f_{\dot v}$ and $f_v$ of the previous Section.

Using a multi-step implicit integration formula to directly discretize the SSODE of Eq. (3.15) is impractical, since solving the resultant set of non-linear equations requires Jacobian information whenever a Newton-like method is involved. For this implicit form ODE, generating the integration Jacobian would be a difficult task. Instead, using Eq. (3.5), from which Eq. (3.15) originates, leads to a consistent and tractable way to obtain the needed derivative information.

Equations (3.21) and (3.23) are used to discretize the independent equations of motion of Eq. (3.5). From a theoretical standpoint, calling these differential equations independent, in the sense in which the generalized coordinates are independent, is not rigorous. The term is used only to underline that these are the equations leading to the state-space ODE of Eq. (3.15) after substitution of all dependent variables. Thus, regarding Eq. (3.5) as a set of second order ODE in independent coordinates $v$, and substituting $v_{n+1}$ and $\dot v_{n+1}$ from Eqs. (3.23) and (3.21), yields a set of non-linear equations in independent accelerations at time $t_{n+1}$,

$$\Psi(\ddot v_{n+1}) \equiv M^{vv}_{n+1}\ddot v_{n+1} + M^{vu}_{n+1}\ddot u_{n+1} + (\Phi_v^T)_{n+1}\lambda_{n+1} - Q^v_{n+1} = 0 \tag{3.30}$$

The system of non-linear equations $\Psi(\ddot v_{n+1}) = 0$, obtained after discretization of the independent equations of motion, is numerically solved for $\ddot v_{n+1}$ by employing a quasi-Newton method. The process of obtaining the required integration Jacobian turns out to be an exercise in applying the chain rule of differentiation, at the level of kinematic constraint equations and dependent equations of motion. Taking the derivative of Eq. (3.30) with respect to independent accelerations and using the chain rule of differentiation yields

$$\begin{aligned}
\Psi_{\ddot v} ={}& M^{vv} + (M^{vv}\ddot v)_u u_{\ddot v} + (M^{vv}\ddot v)_v v_{\ddot v} + M^{vu}\ddot u_{\ddot v} + (M^{vu}\ddot u)_u u_{\ddot v} + (M^{vu}\ddot u)_v v_{\ddot v} \\
&+ \Phi_v^T\lambda_{\ddot v} + (\Phi_v^T\lambda)_u u_{\ddot v} + (\Phi_v^T\lambda)_v v_{\ddot v} - Q^v_u u_{\ddot v} - Q^v_v v_{\ddot v} - Q^v_{\dot u}\dot u_{\ddot v} - Q^v_{\dot v}\dot v_{\ddot v}
\end{aligned} \tag{3.31}$$

In Eq. (3.31), the subscript $n+1$ is suppressed for notational simplicity, and a term such as $(M^{vv}\ddot v)_u$ denotes the partial derivative of the product with respect to $u$, with $\ddot v$ held fixed. In what follows the derivatives $u_{\ddot v}$, $\dot u_{\ddot v}$, $\ddot u_{\ddot v}$, and $\lambda_{\ddot v}$ are evaluated.

In order to compute the derivative of dependent positions with respect to independent accelerations, the position kinematic constraint equation of Eq. (3.7) is differentiated with respect to $\ddot v$, to obtain $\Phi_u u_{\ddot v} = -\Phi_v v_{\ddot v}$. Using Eq. (3.27), this reduces to

$$u_{\ddot v} = -\beta h^2 \Phi_u^{-1}\Phi_v = \beta h^2 H \tag{3.32}$$

Taking the derivative of the velocity kinematic constraint equation of Eq. (3.8) with respect to $\ddot v$ yields

$$\begin{aligned}
\Phi_u \dot u_{\ddot v} &= -\left[(\Phi_u\dot u)_u u_{\ddot v} + (\Phi_u\dot u)_v v_{\ddot v} + \Phi_v\dot v_{\ddot v} + (\Phi_v\dot v)_u u_{\ddot v} + (\Phi_v\dot v)_v v_{\ddot v}\right] \\
&= -\beta h^2\left[(\Phi_u\dot u)_u H + (\Phi_u\dot u)_v + (\Phi_v\dot v)_u H + (\Phi_v\dot v)_v\right] - \gamma h \Phi_v
\end{aligned} \tag{3.33}$$

This equation can be further reduced to

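$$\dot u_{\ddot v} = -\beta h^2 \Phi_u^{-1}\left[(\Phi_q\dot q)_u H + (\Phi_q\dot q)_v\right] - \gamma h \Phi_u^{-1}\Phi_v \equiv \beta h^2 J + \gamma h H \tag{3.34}$$

The derivative $\ddot u_{\ddot v}$ is obtained by differentiating the acceleration kinematic equation of Eq. (3.9) with respect to $\ddot v$, to yield

$$\begin{aligned}
\Phi_u\ddot u_{\ddot v} &= \tau_u u_{\ddot v} + \tau_v v_{\ddot v} + \tau_{\dot u}\dot u_{\ddot v} + \tau_{\dot v}\dot v_{\ddot v} - \left[(\Phi_q\ddot q)_u u_{\ddot v} + (\Phi_q\ddot q)_v v_{\ddot v} + \Phi_v\right] \\
&= \beta h^2\left[\tau_u H + \tau_v + \tau_{\dot u}J - (\Phi_q\ddot q)_u H - (\Phi_q\ddot q)_v\right] + \gamma h\left(\tau_{\dot u}H + \tau_{\dot v}\right) - \Phi_v
\end{aligned} \tag{3.35}$$

Then

$$\begin{aligned}
\ddot u_{\ddot v} &= \beta h^2 \Phi_u^{-1}\left\{\left[\tau_u - (\Phi_q\ddot q)_u\right]H + \tau_v + \tau_{\dot u}J - (\Phi_q\ddot q)_v\right\} + \gamma h \Phi_u^{-1}\left(\tau_{\dot u}H + \tau_{\dot v}\right) + H \\
&\equiv \beta h^2 L + \gamma h N + H
\end{aligned} \tag{3.36}$$

Finally, differentiating the dependent equations of motion of Eq. (3.6) with respect to $\ddot v$ yields

$$\begin{aligned}
\Phi_u^T\lambda_{\ddot v} ={}& Q^u_u u_{\ddot v} + Q^u_v v_{\ddot v} + Q^u_{\dot u}\dot u_{\ddot v} + Q^u_{\dot v}\dot v_{\ddot v} - \big[(\Phi_u^T\lambda)_u u_{\ddot v} + (\Phi_u^T\lambda)_v v_{\ddot v} + M^{uv} \\
&+ (M^{uv}\ddot v)_u u_{\ddot v} + (M^{uv}\ddot v)_v v_{\ddot v} + M^{uu}\ddot u_{\ddot v} + (M^{uu}\ddot u)_u u_{\ddot v} + (M^{uu}\ddot u)_v v_{\ddot v}\big] \\
={}& \beta h^2\big[Q^u_u H + Q^u_v + Q^u_{\dot u}J - (\Phi_u^T\lambda)_u H - (\Phi_u^T\lambda)_v - (M^{uv}\ddot v)_u H - (M^{uv}\ddot v)_v \\
&- M^{uu}L - (M^{uu}\ddot u)_u H - (M^{uu}\ddot u)_v\big] + \gamma h\left(Q^u_{\dot u}H + Q^u_{\dot v} - M^{uu}N\right) - M^{uv} - M^{uu}H
\end{aligned}$$

and the derivative of Lagrange multipliers with respect to independent accelerations is

$$\begin{aligned}
\lambda_{\ddot v} ={}& \beta h^2 \Phi_u^{-T}\big\{\left[Q^u_u - (\Phi_u^T\lambda)_u - (M^{uv}\ddot v)_u - (M^{uu}\ddot u)_u\right]H + Q^u_v + Q^u_{\dot u}J - (\Phi_u^T\lambda)_v \\
&- (M^{uv}\ddot v)_v - (M^{uu}\ddot u)_v - M^{uu}L\big\} + \gamma h \Phi_u^{-T}\left(Q^u_{\dot u}H + Q^u_{\dot v} - M^{uu}N\right) - \Phi_u^{-T}\left(M^{uv} + M^{uu}H\right) \\
\equiv{}& -\Phi_u^{-T}\left(\beta h^2 R + \gamma h S + M^{uv} + M^{uu}H\right)
\end{aligned} \tag{3.37}$$

In Eqs. (3.32) through (3.37), the following notations are used:

$$J = -\Phi_u^{-1}\left[(\Phi_q\dot q)_u H + (\Phi_q\dot q)_v\right] \tag{3.38}$$

$$L = \Phi_u^{-1}\left\{\left[\tau_u - (\Phi_q\ddot q)_u\right]H + \tau_v + \tau_{\dot u}J - (\Phi_q\ddot q)_v\right\} \tag{3.39}$$

$$N = \Phi_u^{-1}\left(\tau_{\dot u}H + \tau_{\dot v}\right) \tag{3.40}$$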

$$R = -\left\{\left[Q^u_u - (\Phi_u^T\lambda)_u - (M^u\ddot q)_u\right]H + Q^u_v + Q^u_{\dot u}J - (\Phi_u^T\lambda)_v - (M^u\ddot q)_v - M^{uu}L\right\} \tag{3.41}$$

$$S = -\left(Q^u_{\dot u}H + Q^u_{\dot v} - M^{uu}N\right) \tag{3.42}$$

Here $M^u\ddot q \equiv M^{uv}\ddot v + M^{uu}\ddot u$ and, analogously, $M^v\ddot q \equiv M^{vv}\ddot v + M^{vu}\ddot u$. After substituting the derivative expressions into Eq. (3.31) and collecting terms according to powers of the step-size $h$, the integration Jacobian can be expressed as

$$\begin{aligned}
\Psi_{\ddot v} ={}& M^{vv} + M^{vu}H + H^T\left(M^{uv} + M^{uu}H\right) + \gamma h\left(M^{vu}N + H^T S - Q^v_{\dot v} - Q^v_{\dot u}H\right) \\
&+ \beta h^2\big[(M^v\ddot q)_u H + (M^v\ddot q)_v + M^{vu}L + H^T R + (\Phi_v^T\lambda)_u H + (\Phi_v^T\lambda)_v \\
&- Q^v_u H - Q^v_v - Q^v_{\dot u}J\big]
\end{aligned} \tag{3.43}$$

Taking into account the definition of $\hat M$, and introducing the notations

$$\hat M_1 \equiv M^{vu}N + H^T S - Q^v_{\dot v} - Q^v_{\dot u}H \tag{3.44}$$

$$\hat M_2 \equiv (M^v\ddot q)_u H + (M^v\ddot q)_v + M^{vu}L + H^T R + (\Phi_v^T\lambda)_u H + (\Phi_v^T\lambda)_v - Q^v_u H - Q^v_v - Q^v_{\dot u}J \tag{3.45}$$

the integration Jacobian assumes the form

$$\Psi_{\ddot v} = \hat M + \gamma h \hat M_1 + \beta h^2 \hat M_2 \tag{3.46}$$

The integration Jacobian in Eq. (3.46) is nonsingular for small enough step-sizes $h$, since the matrix $\hat M$ is positive definite. The solution $\ddot v$ is found via an iterative process of the form

$$\ddot v^{(k+1)} = \ddot v^{(k)} - \left[\Psi_{\ddot v}(\ddot v^{(0)})\right]^{-1}\Psi(\ddot v^{(k)}) \tag{3.47}$$

In a numerical implementation, the rather complicated form of the integration Jacobian needs to be coded only once. In other words, software should be provided to compute $\hat M$, $\hat M_1$, and $\hat M_2$. This is a matter of book-keeping and efficient level 2 and level 3 Basic Linear Algebra Subprograms (BLAS) operations. This process should, however, be supported by a library of derivatives that contains, for each typical joint and force element,

derivatives with respect to generalized positions and velocities. This is a one-time effort, and in terms of constraint derivatives the problem is addressed elegantly by providing the derivative information for several basic constraint functions. These constraints are $\Phi^{d1}(a_i, a_j)$, $\Phi^{d2}(a_i, d_{ij})$, $\Phi^{s}(P_i, P_j)$, and $\Phi^{ss}(P_i, P_j, C)$, along with several other absolute constraint functions (Haug, 1989). These are the building blocks that, in a Cartesian representation of a mechanical system, are assembled to generate derivative information for complex joint elements.

Computing force derivatives is more difficult, because the family of forces that are readily expressible in an analytic closed form is rather small. Derivative information is easily obtainable for Translational-Spring-Damper-Actuator (TSDA) and Rotational-Spring-Damper-Actuator (RSDA) elements, but it is virtually impossible to generate for the interaction force between a bulldozer's blade and the pile of gravel it pushes, for example. In the latter case, the solution is either to neglect the contribution of these derivatives in the quasi-Newton algorithm, or to compute them numerically.

The issue of obtaining derivative information is not the focus of this thesis. A comprehensive treatment of this topic can be found in the work of Serban (1998). Here, it is assumed that derivative information is available and organized in a library that is called during the numerical integration stage to compute, among other things, the integration Jacobian of Eq. (3.43). It remains to assemble and use this information, according to the particular integration method being considered.
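As an illustration of the "compute them numerically" option mentioned above, the following minimal sketch approximates the position and velocity derivatives of a force element by central differences. The scalar TSDA force law, its parameter values, and the function names are assumptions made for this example; they are not taken from the derivative library referred to in the text.

```python
def tsda_force(length, rate, k=2.0e4, c=150.0, free_length=0.5, actuator=0.0):
    """Scalar TSDA force along the element axis (hypothetical sign convention)."""
    return -k * (length - free_length) - c * rate + actuator


def central_difference(func, x, step):
    """Second-order central-difference approximation of d(func)/dx at x."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


# Illustrative state of the element: current length and rate of change of length.
length, rate = 0.62, -0.8

# Numerical derivatives with respect to position and velocity.
dF_dlength = central_difference(lambda l: tsda_force(l, rate), length, 1.0e-6)
dF_drate = central_difference(lambda r: tsda_force(length, r), rate, 1.0e-6)

# For this linear force law the analytic derivatives are -k and -c,
# so the two numbers can be compared directly against 2.0e4 and 150.0.
print(dF_dlength, dF_drate)
```

For a force with no closed form, `tsda_force` would be replaced by whatever routine evaluates the force; the perturbation size then has to balance truncation error against round-off.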

Runge-Kutta Methods

General Considerations on Runge-Kutta Methods

This Section presents basic concepts underlying the family of Runge-Kutta numerical integrators. There is a deep theory behind this class of numerical integrators, and the presentation here is by no means complete. A very thorough presentation of this very active area of Numerical Analysis has been given by Hairer, Nørsett, and Wanner (1993), and Hairer and Wanner (1996).

A Runge-Kutta integration formula can be represented using the so-called Butcher's tableau of Table 1.

Table 1. Butcher's Tableau

$$\begin{array}{c|cccc}
c_1 & a_{11} & a_{12} & \cdots & a_{1s} \\
c_2 & a_{21} & a_{22} & \cdots & a_{2s} \\
\vdots & \vdots & \vdots & & \vdots \\
c_s & a_{s1} & a_{s2} & \cdots & a_{ss} \\
\hline
 & b_1 & b_2 & \cdots & b_s
\end{array}$$

The Runge-Kutta formula defined in Table 1 is an $s$-stage formula, defined by the coefficients $a_{ij}$, $b_i$, and $c_i$. If this integration formula is applied to numerically integrate the IVP

$$y' = f(x, y) \tag{3.48}$$

with $y(x_0) = y_0$, after one integration step the solution is given by

$$y_1 = y_0 + h\sum_{i=1}^{s} b_i k_i \tag{3.49}$$

where, with $h$ being the integration step-size,

$$k_i = f\Big(x_0 + c_i h,\; y_0 + h\sum_{j=1}^{s} a_{ij}k_j\Big), \qquad i = 1, \ldots, s \tag{3.50}$$

If $a_{ij} = 0$ for $j \ge i$, the formula is called explicit. When some of the values $a_{ij}$, $j \ge i$, are non-zero, the formula is called implicit. If all $a_{ij}$ are non-zero, the formula is called fully implicit. A particular class of implicit formulas is the so-called family of diagonally implicit formulas (DIRK), where $a_{ij} = 0$ for $i < j$; i.e., all the supradiagonal elements are zero. In the case of DIRK formulas, a significant gain in efficiency can be obtained if all the diagonal entries in the Butcher tableau are identical, yielding the so-called singly diagonally implicit Runge-Kutta (SDIRK) integrators.

In the present work, formulas considered for numerical integration of the SSODE of Multibody Dynamics belong to the SDIRK family. However, the theoretical considerations that are made here in conjunction with this class of integrators are directly applicable to the class of fully implicit Runge-Kutta integrators. For the dynamic analysis of the class of problems considered so far, the performance of the SDIRK family has been very good. More challenging applications might recommend taking the extra step of implementing and using the more complex class of fully implicit Runge-Kutta formulas. In this context, the RADAU family of collocation methods (Hairer and Wanner, 1996) is generally considered to be the most promising direction to follow.

The basic idea behind the Runge-Kutta family of integration formulas is to choose the coefficients defining the formula, i.e., $a_{ij}$, $b_i$, and $c_i$ of Table 1, such that the Taylor expansion of the solution of the IVP in Eq. (3.48) is identical to the Taylor expansion of the numerical solution provided by Eqs. (3.49) and (3.50). Taylor expansions are done

with respect to the variable $h$, and the number of matching terms in the Taylor expansion determines the order of the method. Among the conditions usually imposed on the coefficients are (Hairer, Nørsett, and Wanner, 1993),

$$\sum_{j=1}^{i} a_{ij} = c_i \tag{3.51}$$

$$\sum_{i=1}^{s} b_i = 1 \tag{3.52}$$

As the upper summation limit in Eq. (3.51) suggests, the theoretical framework is that of SDIRK formulas ($a_{ii} = \gamma$, $i = 1, \ldots, s$). For this family of implicit integrators, Eq. (3.50) assumes the form

$$k_i = f\Big(x_0 + c_i h,\; y_0 + h\sum_{j=1}^{i} a_{ij}k_j\Big)$$

By introducing the stage variables

$$z_i \equiv y_0 + h\sum_{j=1}^{i} a_{ij}k_j \tag{3.53}$$

$k_i$ can be expressed as

$$k_i = f(x_0 + c_i h, z_i)$$

In the context of Multibody Dynamics, the stage variables $k_i$ and $z_i$ can be regarded as stage accelerations and velocities, respectively, or as stage velocities and positions, respectively. From a physical standpoint, the stage variables are not the numerical solution at the intermediate integration grid points $x_0 + c_i h$; they are merely intermediate values that are used to provide the numerical solution at the end of one macro-step of the Runge-Kutta formula.
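The stage-by-stage structure of Eqs. (3.49) through (3.53) can be sketched for a scalar first order IVP as follows. The tableau used is a well-known two-stage, second-order, L-stable SDIRK formula with $\gamma = 1 - \sqrt{2}/2$; it is chosen here only for brevity and is an assumption of this example, not one of the order 4 formulas used in the thesis. The function names and the test problem are likewise illustrative, and a full Newton iteration is used in place of a quasi-Newton one.

```python
import math

# Two-stage SDIRK tableau (assumed for illustration): a_11 = a_22 = GAMMA.
GAMMA = 1.0 - math.sqrt(2.0) / 2.0
A = [[GAMMA, 0.0], [1.0 - GAMMA, GAMMA]]
B = [1.0 - GAMMA, GAMMA]
C = [sum(row) for row in A]  # condition (3.51): c_i = sum_j a_ij


def sdirk_step(f, dfdy, x0, y0, h, tol=1e-12, max_iter=20):
    """One macro-step of a scalar SDIRK formula; returns y1 and the stage slopes k."""
    k = []
    for i in range(len(B)):
        known = y0 + h * sum(A[i][j] * k[j] for j in range(i))  # past-stage part of z_i
        x_i = x0 + C[i] * h
        k_i = f(x0, y0)  # initial estimate for the stage slope
        for _ in range(max_iter):
            z_i = known + h * GAMMA * k_i            # stage variable, Eq. (3.53)
            residual = k_i - f(x_i, z_i)
            jac = 1.0 - h * GAMMA * dfdy(x_i, z_i)   # derivative of the residual w.r.t. k_i
            delta = -residual / jac
            k_i += delta
            if abs(delta) < tol:
                break
        k.append(k_i)
    y1 = y0 + h * sum(b * k_i for b, k_i in zip(B, k))  # Eq. (3.49)
    return y1, k


def integrate(f, dfdy, x0, y0, x_end, n_steps):
    """Repeated macro-steps with a constant step-size."""
    h = (x_end - x0) / n_steps
    x, y = x0, y0
    for _ in range(n_steps):
        y = sdirk_step(f, dfdy, x, y, h)[0]
        x += h
    return y


# Test problem y' = -2 y, y(0) = 1, whose exact solution is exp(-2 x).
y_end = integrate(lambda x, y: -2.0 * y, lambda x, y: -2.0, 0.0, 1.0, 1.0, 100)
print(y_end, math.exp(-2.0))
```

Because only the diagonal coefficient multiplies the unknown stage slope, each stage is a separate small non-linear solve, which is the efficiency argument made above for the SDIRK family.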

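Before applying an SDIRK formula for numerical integration of the SSODE, it is instructive to apply the same formula to integrate a generic second order set of ODE. This will later serve as the model to be followed when integrating the independent equations of motion of Eq. (3.5). Consider for this the IVP

$$\ddot v = f(t, v, \dot v) \tag{3.54}$$

with $v(t_0) = v_0$ and $\dot v(t_0) = \dot v_0$. The second order ODE is formally reduced to a first order ODE

$$\frac{dy}{dt} = g(t, y), \qquad y \equiv \begin{bmatrix} v \\ \dot v \end{bmatrix}, \qquad g(t, y) \equiv \begin{bmatrix} \dot v \\ f(t, v, \dot v) \end{bmatrix} \tag{3.55}$$

Then,

$$k_i = g\Big(t_0 + c_i h,\; y_0 + h\sum_{j=1}^{i} a_{ij}k_j\Big) \tag{3.56}$$

and the solution at the new time step is given by

$$y_1 = y_0 + h\sum_{i=1}^{s} b_i k_i$$

Introducing the stage variables $z_i \equiv [v^{(i)T}\ \dot v^{(i)T}]^T$ as

$$z_i = y_0 + h\sum_{j=1}^{i} a_{ij}k_j \tag{3.57}$$

the stage variables $k_i$ are

$$k_i = g(t_0 + c_i h, z_i) = \begin{bmatrix} \dot v^{(i)} \\ f(t_0 + c_i h, v^{(i)}, \dot v^{(i)}) \end{bmatrix} = \begin{bmatrix} \dot v^{(i)} \\ \ddot v^{(i)} \end{bmatrix} \tag{3.58}$$

Writing Eq. (3.57) explicitly yields

$$\begin{bmatrix} v^{(i)} \\ \dot v^{(i)} \end{bmatrix} = \begin{bmatrix} v_0 \\ \dot v_0 \end{bmatrix} + h\sum_{j=1}^{i} a_{ij}\begin{bmatrix} \dot v^{(j)} \\ \ddot v^{(j)} \end{bmatrix}$$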

or equivalently,

$$v^{(i)} = v_0 + h\sum_{j=1}^{i} a_{ij}\dot v^{(j)} \tag{3.59}$$

$$\dot v^{(i)} = \dot v_0 + h\sum_{j=1}^{i} a_{ij}\ddot v^{(j)} \tag{3.60}$$

If the stage values for $\dot v^{(j)}$ from Eq. (3.60) are substituted back into Eq. (3.59), the stage value $v^{(i)}$ can be expressed exclusively in terms of $\ddot v^{(j)}$,

$$v^{(i)} = v_0 + h c_i \dot v_0 + h^2\sum_{j=1}^{i}\mu_{ij}\ddot v^{(j)} \tag{3.61}$$

with

$$\mu_{ij} \equiv \sum_{k=j}^{i} a_{ik}a_{kj} \tag{3.62}$$

Since the function $f(\cdot)$ in Eq. (3.54) is generally non-linear, a system of non-linear equations must be solved at each stage of the formula to retrieve $\ddot v^{(i)}$. The strategy to find the solution $\ddot v^{(i)}$ is based on a quasi-Newton approach, and therefore derivative information must be provided to the non-linear solver. This process of retrieving $\ddot v^{(i)}$, and thus $k_i$, is the cornerstone of the integration algorithm, since the solution at the new grid-point is immediately obtained as a linear combination of these quantities, as indicated in Eq. (3.49). At each stage, $\ddot v^{(i)}$ is obtained as follows:

(1). Provide an initial estimate for $\ddot v^{(i)}$.
(2). Based on Eqs. (3.61) and (3.60), obtain the stage variables $v^{(i)}$ and $\dot v^{(i)}$.
(3). If the stopping criteria are satisfied, then the iterative process is stopped, and $k_i = [\dot v^{(i)T}\ \ddot v^{(i)T}]^T$. Otherwise, using a quasi-Newton correction, correct the value of $\ddot v^{(i)}$ and go to step (2).
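The elimination of stage velocities that leads from Eqs. (3.59) and (3.60) to Eqs. (3.61) and (3.62) can be checked numerically. The sketch below computes the coefficients $\mu_{ij}$ for a lower-triangular tableau and confirms that the nested sums and the direct formula give the same stage positions. The two-stage tableau with $\gamma = 1 - \sqrt{2}/2$ and all numerical values are assumptions made for this example only.

```python
import math

GAMMA = 1.0 - math.sqrt(2.0) / 2.0
A = [[GAMMA, 0.0], [1.0 - GAMMA, GAMMA]]   # lower-triangular SDIRK coefficients (assumed)
C = [sum(row) for row in A]                # c_i = sum_j a_ij, Eq. (3.51)
S = len(A)                                 # number of stages

# mu_ij = sum_{k=j}^{i} a_ik * a_kj, Eq. (3.62); zero above the diagonal.
MU = [[sum(A[i][k] * A[k][j] for k in range(j, i + 1)) if j <= i else 0.0
       for j in range(S)] for i in range(S)]


def stage_states_nested(v0, vd0, h, acc):
    """Stage positions and velocities from the nested sums, Eqs. (3.59)-(3.60)."""
    vd = [vd0 + h * sum(A[i][j] * acc[j] for j in range(i + 1)) for i in range(S)]
    v = [v0 + h * sum(A[i][j] * vd[j] for j in range(i + 1)) for i in range(S)]
    return v, vd


def stage_positions_direct(v0, vd0, h, acc):
    """Stage positions written directly in terms of stage accelerations, Eq. (3.61)."""
    return [v0 + h * C[i] * vd0 + h * h * sum(MU[i][j] * acc[j] for j in range(i + 1))
            for i in range(S)]


acc = [0.7, -1.3]  # arbitrary stage accelerations, used only for the comparison
v_nested, vd_nested = stage_states_nested(1.0, 0.5, 0.1, acc)
v_direct = stage_positions_direct(1.0, 0.5, 0.1, acc)
print(v_nested, v_direct)
```

Note that the diagonal entries satisfy $\mu_{ii} = a_{ii}^2 = \gamma^2$, which is the origin of the factor $\gamma^2 h^2$ in the stage Jacobian.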

The stopping criteria in step (3) are based on the size of the norm of the error in satisfying Eq. (3.54), as well as on the size of the last correction in $\ddot v^{(i)}$. Central to the process of obtaining the stage value $\ddot v^{(i)}$ is the correction step. The non-linear system to be solved is

$$\ddot v^{(i)} - f\left(t_0 + c_i h,\; v^{(i)}(\ddot v^{(i)}),\; \dot v^{(i)}(\ddot v^{(i)})\right) = 0 \tag{3.63}$$

The quantities $v^{(i)}$ and $\dot v^{(i)}$ in Eq. (3.63) depend on $\ddot v^{(i)}$ through the integration formulas of Eqs. (3.61) and (3.60), respectively. Consequently, the Jacobian of the non-linear system is

$$J = 1 - f_v\,v^{(i)}_{\ddot v^{(i)}} - f_{\dot v}\,\dot v^{(i)}_{\ddot v^{(i)}} \tag{3.64}$$

The derivatives $v^{(i)}_{\ddot v^{(i)}}$ and $\dot v^{(i)}_{\ddot v^{(i)}}$ are readily available by taking the partial derivative of Eqs. (3.61) and (3.60) with respect to $\ddot v^{(i)}$. On the other hand, the derivatives $f_v$ and $f_{\dot v}$ are problem dependent and must be provided.

Usually, in a quasi-Newton approach for the solution of this non-linear system, the quantities $f_v$ and $f_{\dot v}$ are evaluated at the beginning of a macro-step, and kept constant during at least that macro-step, thus saving a substantial amount of CPU effort by circumventing the costly process of derivative evaluation. CPU savings associated with the SDIRK method, as noted at the beginning of this Section, are due to the expression assumed by the derivatives $v^{(i)}_{\ddot v^{(i)}}$ and $\dot v^{(i)}_{\ddot v^{(i)}}$. For all stages, these derivatives are identical. In particular, for an SDIRK method with $a_{11} = a_{22} = \cdots = a_{ss} = \gamma$,

$$v^{(i)}_{\ddot v^{(i)}} = \gamma^2 h^2 \tag{3.65}$$

$$\dot v^{(i)}_{\ddot v^{(i)}} = \gamma h \tag{3.66}$$

Therefore, for an SDIRK integration formula, the Jacobian used at each stage to solve for $\ddot v^{(i)}$ is the same and assumes the form

$$J = 1 - \gamma h f_{\dot v} - \gamma^2 h^2 f_v \tag{3.67}$$
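A minimal sketch of the stage solve described by Eqs. (3.63) through (3.67) is given below for a scalar problem. The Jacobian of Eq. (3.67) is evaluated once from the state at the beginning of the macro-step and then kept constant during the iteration. The damped pendulum right-hand side, the value of $\gamma$, the step-size, and all names are assumptions made for this illustration.

```python
import math

GAMMA = 1.0 - math.sqrt(2.0) / 2.0   # assumed SDIRK diagonal coefficient


def f(t, v, vd):
    """Damped pendulum acceleration (illustrative right-hand side)."""
    return -9.81 * math.sin(v) - 0.2 * vd


def f_v(t, v, vd):
    """Partial derivative of f with respect to position."""
    return -9.81 * math.cos(v)


def f_vd(t, v, vd):
    """Partial derivative of f with respect to velocity."""
    return -0.2


def solve_stage(t_i, v_tilde, vd_tilde, h, jac, acc0, tol=1e-12, max_iter=50):
    """Quasi-Newton solve of Eq. (3.63) with a Jacobian that is kept constant."""
    acc = acc0
    for iteration in range(max_iter):
        v = v_tilde + (GAMMA * h) ** 2 * acc    # stage position in terms of acc
        vd = vd_tilde + GAMMA * h * acc         # stage velocity in terms of acc
        residual = acc - f(t_i, v, vd)
        delta = -residual / jac
        # Stop on both the residual and the size of the correction.
        if abs(residual) < tol and abs(delta) < tol:
            return acc, iteration
        acc += delta
    raise RuntimeError("stage iteration did not converge")


# State at the beginning of the macro-step.
t0, v0, vd0, h = 0.0, 0.5, 0.0, 0.05

# Jacobian of Eq. (3.67), evaluated once and then frozen.
jac = 1.0 - GAMMA * h * f_vd(t0, v0, vd0) - (GAMMA * h) ** 2 * f_v(t0, v0, vd0)

# First stage: no earlier stage accelerations, so the "past" terms reduce to
# v0 + h*c_1*vd0 and vd0, with c_1 = GAMMA.
v_tilde = v0 + h * GAMMA * vd0
vd_tilde = vd0
acc, n_iter = solve_stage(t0 + GAMMA * h, v_tilde, vd_tilde, h, jac, f(t0, v0, vd0))
print(acc, n_iter)
```

Since the frozen Jacobian differs from the exact one only through terms of order $h^2$, the iteration still converges in a few steps for moderate step-sizes, which is the trade-off exploited above.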

The foregoing is the framework in which numerical integration of the SSODE is carried out. Appropriate modifications of the formulas in this Section accommodate the situation in which vector quantities, rather than scalar values, are dealt with.

Runge-Kutta State-Space Based Implicit Integration

As in the case of multi-step methods, discretization using an SDIRK formula is applied to the independent equations of motion, rather than to the implicit form second order SSODE in Eq. (3.15). In light of the considerations made for multi-step methods, applying a different numerical ODE integrator to find the solution of the state-space ODE is straightforward. Instead of using an implicit multi-step formula, the second order ODE is integrated using the methodology introduced in the previous Section.

In order to extend the presentation from implicit multi-step methods to SDIRK formulas, Eqs. (3.61) and (3.60) are reformulated in vector notation as

$$v^{(i)} = \tilde v^{(i)} + h^2\gamma^2\ddot v^{(i)} \tag{3.68}$$

and

$$\dot v^{(i)} = \tilde{\dot v}^{(i)} + \gamma h\ddot v^{(i)} \tag{3.69}$$

respectively. With $\mu_{ij}$ defined in Eq. (3.62), the following notations have been used:

$$\tilde v^{(i)} = v_0 + h c_i\dot v_0 + h^2\sum_{j=1}^{i-1}\mu_{ij}\ddot v^{(j)} \tag{3.70}$$

$$\tilde{\dot v}^{(i)} = \dot v_0 + h\sum_{j=1}^{i-1} a_{ij}\ddot v^{(j)} \tag{3.71}$$

Comparing Eqs. (3.68) and (3.69) to Eqs. (3.23) and (3.21), it results that each stage of one macro-step is identical to one step of a generic multi-step method, as

introduced in the Sections on multi-step methods. Consequently, the integration Jacobian that is required during each stage of an SDIRK method is obtained as in Eq. (3.43), with the only difference that $\beta = \gamma^2$. This might be the case even for multi-step methods, provided the same integration formula is used to integrate independent accelerations to obtain independent velocities, and then independent velocities to obtain independent positions. In other words, the integration Jacobian required for SDIRK based implicit integration of the state-space ODE is

$$\begin{aligned}
\Psi_{\ddot v} ={}& M^{vv} + M^{vu}H + H^T\left(M^{uv} + M^{uu}H\right) + \gamma h\left(M^{vu}N + H^T S - Q^v_{\dot v} - Q^v_{\dot u}H\right) \\
&+ \gamma^2 h^2\big[(M^v\ddot q)_u H + (M^v\ddot q)_v + M^{vu}L + H^T R + (\Phi_v^T\lambda)_u H + (\Phi_v^T\lambda)_v \\
&- Q^v_u H - Q^v_v - Q^v_{\dot u}J\big]
\end{aligned} \tag{3.72}$$

or, using the matrix notation introduced for the multi-step case,

$$\Psi_{\ddot v} = \hat M + \gamma h\hat M_1 + \gamma^2 h^2\hat M_2 \tag{3.73}$$

where $\gamma$ is the diagonal element in Butcher's tableau for the SDIRK formula considered. Technically, these considerations complete the derivation of SDIRK based state-space implicit integration.

3.3 Descriptor Form Method

This Section introduces a new method for the implicit integration of the DAE of Multibody Dynamics. The results presented here are based on the work of Haug, Negrut, and Engstler (1998); Haug, Negrut, and Iancu (1997b); and Iancu, Haug, and Negrut (1997). Section 3.3.1 presents the case in which the method employs an implicit multi-step method to discretize the index 1 DAE of Eqs. (3.5), (3.6), and (3.9). The case in which an SDIRK formula is used for this purpose is detailed in the Section that follows it. Although the discretization is done at the index 1 DAE level, the method is truly a state-space method, since after each integration step, dependent variables at position and velocity levels are recovered using kinematic constraint equations.

Compared to the State-Space Method, this approach results in a system of non-linear algebraic equations of significantly larger dimension. Thus, the discretization of the index 1 DAE results in a set of $n + m$ non-linear equations that is solved at each step/stage for generalized accelerations and Lagrange multipliers. Since implicit multi-step methods and the family of SDIRK formulas were introduced in earlier Sections, the focus here is primarily on integration Jacobian computation.

Generic implicit integration formulas

$$v_{n+1} = \tilde v_{n+1} + \beta h^2\ddot v_{n+1} \tag{3.74}$$

$$\dot v_{n+1} = \tilde{\dot v}_{n+1} + \gamma h\ddot v_{n+1} \tag{3.75}$$

are used to present the new method, where $\tilde v_{n+1}$ and $\tilde{\dot v}_{n+1}$ contain only past information. To simplify the presentation, it is assumed that a reordering of the vector of generalized coordinates has been done, such that the first $m$ entries are dependent generalized coordinates, while the last $ndof$ entries are independent generalized coordinates. Using the integration formulas of Eqs. (3.74) and (3.75), the equations of motion and the acceleration constraint equation are discretized to obtain

$$\Psi \equiv \begin{bmatrix} M(q_{n+1})\ddot q_{n+1} + \Phi_q^T(q_{n+1})\lambda_{n+1} - Q^A(q_{n+1}, \dot q_{n+1}) \\ \Phi_q(q_{n+1})\ddot q_{n+1} - \tau(q_{n+1}, \dot q_{n+1}) \end{bmatrix} = 0 \tag{3.76}$$

This system is solved at each integration step of a multi-step method, or at each stage of an SDIRK method, for $\ddot q_{n+1}$ and $\lambda_{n+1}$. Once these values are available, the integration formulas are used to obtain the independent positions and velocities. Dependent variables (positions and velocities) are recovered via the kinematic constraint equations at the position and velocity levels.

The integration proceeds as follows:

(1). Provide a starting value for $\ddot q_{n+1}$ and $\lambda_{n+1}$.
(2). Use Eqs. (3.74) and (3.75) to obtain independent positions and velocities.
(3). If the stopping criteria are met, stop. Otherwise, in a quasi-Newton framework, correct the values of the generalized accelerations and Lagrange multipliers, and go to step (2).

For an SDIRK method, these three steps are repeated at each stage of the method. The stopping criteria in the last step should be designed based on the norm of the residual $\Psi^{(k)}$ (evaluated at the end of iteration $k$), and the norm of the last correction in generalized accelerations and Lagrange multipliers.

The remainder of this Section focuses on how to compute the Jacobian of the discretized system of non-linear algebraic equations of Eq. (3.76). Subscripts will be suppressed to keep the presentation simple. Derivatives of independent positions and velocities with respect to independent accelerations are easily obtained by differentiating Eqs. (3.74) and (3.75), respectively, yielding

$$v_{\ddot v} = \beta h^2 I \tag{3.77}$$

$$\dot v_{\ddot v} = \gamma h I \tag{3.78}$$

where $I$ is the identity matrix of dimension $ndof$.

If a Boolean matrix is used to select the independent generalized coordinates, as

$$v = Pq \tag{3.79}$$

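the simplifying assumption introduced earlier concerning the ordering of variables in the vector of generalized coordinates $q$ implies that $P = [0 \ \ I]$, with $P \in \mathbb{R}^{ndof \times n}$, and $I$ being the identity matrix of dimension $ndof$. Recalling that $P$ is constant,

$$\ddot v = P\ddot q, \qquad \text{and} \qquad \ddot v_{\ddot q} = P \tag{3.80}$$

Based on the results in Eqs. (3.77) and (3.32), the derivative of the generalized coordinate vector $q$ with respect to independent accelerations $\ddot v$ is

$$q_{\ddot v} = \beta h^2\begin{bmatrix} H \\ I \end{bmatrix} \tag{3.81}$$

Using the chain rule of differentiation yields

$$q_{\ddot q} = q_{\ddot v}\,\ddot v_{\ddot q} = \beta h^2\begin{bmatrix} HP \\ P \end{bmatrix} \equiv \beta h^2\hat H \tag{3.82}$$

In order to compute the derivative $\dot q_{\ddot v}$, first recall the definition of the matrix $J$ of Eq. (3.38) and the expression for $\dot u_{\ddot v}$ of Eq. (3.34). Using these results, the desired derivative is

$$\dot q_{\ddot v} = \gamma h\begin{bmatrix} H \\ I \end{bmatrix} + \beta h^2\begin{bmatrix} J \\ 0 \end{bmatrix} \tag{3.83}$$

Applying the chain rule of differentiation and using the expressions of $\dot q_{\ddot v}$ and $\ddot v_{\ddot q}$,

$$\dot q_{\ddot q} = \dot q_{\ddot v}\,\ddot v_{\ddot q} = \gamma h\begin{bmatrix} HP \\ P \end{bmatrix} + \beta h^2\begin{bmatrix} JP \\ 0 \end{bmatrix} \equiv \gamma h\hat H + \beta h^2\hat J$$

Finally, using the chain rule of differentiation and the expressions for $q_{\ddot q}$ and $\dot q_{\ddot q}$, the integration Jacobian required by the quasi-Newton algorithm is

$$\Psi_{\ddot q} = \begin{bmatrix} M + \beta h^2\left[(M\ddot q)_q + (\Phi_q^T\lambda)_q - Q^A_q\right]\hat H - Q^A_{\dot q}\left(\gamma h\hat H + \beta h^2\hat J\right) \\ \Phi_q + \beta h^2\left[(\Phi_q\ddot q)_q - \tau_q\right]\hat H - \tau_{\dot q}\left(\gamma h\hat H + \beta h^2\hat J\right) \end{bmatrix}$$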

$$\Psi_\lambda = \begin{bmatrix} \Phi_q^T \\ 0 \end{bmatrix}$$

The iterative process is carried out as follows:

$$\begin{bmatrix} M + \beta h^2\left\{\left[(M\ddot q)_q + (\Phi_q^T\lambda)_q - Q^A_q\right]\hat H - Q^A_{\dot q}\hat J\right\} - h\gamma Q^A_{\dot q}\hat H & \Phi_q^T \\ \Phi_q + \beta h^2\left\{\left[(\Phi_q\ddot q)_q - \tau_q\right]\hat H - \tau_{\dot q}\hat J\right\} - h\gamma\tau_{\dot q}\hat H & 0 \end{bmatrix}\begin{bmatrix} \Delta\ddot q \\ \Delta\lambda \end{bmatrix}^{(j)} = -\Psi^{(j-1)}$$

$$\begin{bmatrix} \ddot q \\ \lambda \end{bmatrix}^{(j)} = \begin{bmatrix} \ddot q \\ \lambda \end{bmatrix}^{(j-1)} + \begin{bmatrix} \Delta\ddot q \\ \Delta\lambda \end{bmatrix}^{(j)}$$

where $j$ is the iteration counter. At each iteration, the integration formulas determine $v$ and $\dot v$, and the kinematic constraint equations at the position and velocity levels are solved for $u$ and $\dot u$. Iteration is continued until the stopping criteria are met.

One attractive feature of this method is that, when compared to the previously introduced State-Space Method, computation of the integration Jacobian is by far less CPU intensive. This is primarily due to the absence of the expensive matrix-matrix multiplications that are required in integration Jacobian computation for the State-Space Method. Likewise, this method does not require the extra effort of computing $\ddot u$ and $\lambda$ at each iteration, as does the State-Space Reduction Method, since these quantities are now being solved for.

The major drawback of this method is the rather large dimension of the non-linear system that must be solved at each step/stage. For a High Mobility Multipurpose Wheeled Vehicle (HMMWV) model comprising 14 bodies (Serban, Negrut, and Haug, 1998) that serves as a test problem later in this document, the dimension of this non-linear system becomes 178. The same model, when analyzed using the State-Space Method, results in a non-linear system of dimension 18. To some extent, this dimensional problem for the Descriptor Form Method is alleviated if sparse solvers are

used to carry out the iterative process. For the HMMWV model, the fill-in ratio of the integration Jacobian is in fact less than 15%; i.e., over 85% of the entries are zero.

Multi-step Methods

In the case of a multi-step formula used in the framework of the Descriptor Form Method, the generic formulas of Eqs. (3.74) and (3.75) are replaced by actual multi-step formulas. As in Section 3.2.1.1, the actual multi-step integration formulas are used to obtain independent positions and velocities as

$$v_{n+1} = \tilde v_{n+1} + \beta h^2\ddot v_{n+1}$$

$$\dot v_{n+1} = \tilde{\dot v}_{n+1} + \gamma h\ddot v_{n+1}$$

with $\tilde v_{n+1}$ and $\tilde{\dot v}_{n+1}$ containing formula-specific past information. Consequently, nothing is changed in the approach presented in the previous Section. The integration Jacobian assumes exactly the same form, with the coefficients $\gamma$ and $\beta$ provided by the multi-step integration formula being considered. With this, the three steps presented for the Descriptor Form Method are immediately applicable.

Runge-Kutta Methods

When an SDIRK type formula is used to express independent positions and velocities in terms of independent accelerations, this dependency was shown earlier to assume the form

$$v^{(i)} = \tilde v^{(i)} + h^2\gamma^2\ddot v^{(i)}$$

$$\dot v^{(i)} = \tilde{\dot v}^{(i)} + \gamma h\ddot v^{(i)}$$

With $\mu_{ij}$ defined in Eq. (3.62),

$$\tilde v^{(i)} = v_0 + h c_i\dot v_0 + h^2\sum_{j=1}^{i-1}\mu_{ij}\ddot v^{(j)}$$

$$\tilde{\dot v}^{(i)} = \dot v_0 + h\sum_{j=1}^{i-1} a_{ij}\ddot v^{(j)}$$

where $a_{ij}$ are the coefficients in Butcher's tableau, and $\gamma$ is the diagonal element of the formula. The superscript $(i)$ refers to the stage of the formula. The integration Jacobian is then easily obtained by replacing the coefficient $\beta$ with $\gamma^2$.

As noted before, an attractive feature of SDIRK formulas is that the integration Jacobian needs to be evaluated only once, at the beginning of the macro-step. This Jacobian is then used during each stage to compute $\ddot q^{(i)}$ and $\lambda^{(i)}$. Once these quantities are available for each stage, independent positions and velocities are obtained at the new time step as

$$v_1 = v_0 + h\dot v_0 + h^2\sum_{i=1}^{s}\varphi_i\ddot v^{(i)}$$

$$\dot v_1 = \dot v_0 + h\sum_{i=1}^{s} b_i\ddot v^{(i)}$$

where $\varphi_i = \sum_{j=1}^{s} b_j a_{ji}$. At the end of the macro-step, it remains to recover dependent positions and velocities, based on the kinematic constraint equations at the position and velocity levels.
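The end-of-step formulas above, together with the stage relations built on $\mu_{ij}$, can be exercised on an unconstrained scalar problem. The sketch below performs SDIRK macro-steps for the linear oscillator $\ddot v = -\omega^2 v$ and compares the result with the exact solution. The two-stage tableau with $\gamma = 1 - \sqrt{2}/2$, the oscillator, and all names are assumptions made for this example; the constrained case would add the recovery of dependent coordinates, which is not shown.

```python
import math

GAMMA = 1.0 - math.sqrt(2.0) / 2.0
A = [[GAMMA, 0.0], [1.0 - GAMMA, GAMMA]]   # assumed two-stage SDIRK tableau
B = [1.0 - GAMMA, GAMMA]
S = len(B)
C = [sum(row) for row in A]

# mu_ij of Eq. (3.62), stored as a ragged lower-triangular list.
MU = [[sum(A[i][k] * A[k][j] for k in range(j, i + 1)) for j in range(i + 1)]
      for i in range(S)]
# phi_i = sum_j b_j * a_ji, the weights of the stage accelerations in v_1.
PHI = [sum(B[j] * A[j][i] for j in range(S)) for i in range(S)]

OMEGA = 2.0


def f(t, v, vd):
    """Linear oscillator acceleration (illustrative right-hand side)."""
    return -OMEGA ** 2 * v


def macro_step(t0, v0, vd0, h):
    """One SDIRK macro-step in terms of stage accelerations."""
    # Stage Jacobian 1 - gamma*h*f_vd - gamma^2*h^2*f_v, with f_v = -OMEGA^2, f_vd = 0.
    jac = 1.0 + (GAMMA * h * OMEGA) ** 2
    acc = []
    for i in range(S):
        # Contributions of the initial state and of already computed stages.
        v_tilde = v0 + h * C[i] * vd0 + h * h * sum(MU[i][j] * acc[j] for j in range(i))
        vd_tilde = vd0 + h * sum(A[i][j] * acc[j] for j in range(i))
        a_i = f(t0, v0, vd0)  # initial estimate
        for _ in range(20):
            residual = a_i - f(t0 + C[i] * h,
                               v_tilde + (GAMMA * h) ** 2 * a_i,
                               vd_tilde + GAMMA * h * a_i)
            a_i -= residual / jac
            if abs(residual) < 1e-13:
                break
        acc.append(a_i)
    v1 = v0 + h * vd0 + h * h * sum(p * a for p, a in zip(PHI, acc))
    vd1 = vd0 + h * sum(b * a for b, a in zip(B, acc))
    return v1, vd1


t, v, vd, h = 0.0, 1.0, 0.0, 0.01
for _ in range(100):
    v, vd = macro_step(t, v, vd, h)
    t += h
print(v, math.cos(OMEGA * t), vd, -OMEGA * math.sin(OMEGA * t))
```

Working with stage accelerations as the only unknowns keeps the non-linear system at the size of the second order problem, rather than the doubled size of its first order form.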

First Order Reduction Method

The State Space and First Order Reduction Methods share the same DAE-to-ODE reduction strategy. The starting point for both methods is the independent equations of motion

$$M^{vv}(u, v)\ddot v + M^{vu}(u, v)\ddot u + \Phi_v^T(u, v)\lambda = Q^v(u, v, \dot u, \dot v)$$

In the State Space Reduction Method, this set of second order differential equations is discretized, and the resulting non-linear equations are solved for the independent accelerations $\ddot v$. The idea behind the First Order Reduction Method is to go one step further, and transform this set of second order ordinary differential equations into an equivalent set of first order differential equations. Whereas the dimension of the second order ODE dealt with in the State Space Method is equal to the number of degrees of freedom $ndof$ of the mechanism, the dimension of the resulting first order ODE in the First Order Reduction Method is $2 \times ndof$.

The motivation for the extra step of first order reduction resides in the existence of very good software for the numerical solution of stiff first order IVP. The goal is to adapt some of these codes to accommodate numerical integration of the first order ODE obtained from the SSODE of Multibody Dynamics after order reduction.

The implication of the additional order reduction step for the First Order Reduction Method is substantial. If a standard ODE integration code is applied, corrections during the iterative solution of the discretized non-linear equations are done in independent positions and velocities. During each iteration, after recovering dependent positions and velocities, generalized accelerations and Lagrange multipliers must be computed. This sets the First Order Reduction Method on a different path than the one followed by the State Space Method presented in Section 3.2. The computation of generalized

accelerations, as the solution of a set of linear equations, brings into the picture the challenge of efficient linear algebra. Sparsity and topology-based matrix manipulations are at the heart of two methods proposed in subsequent Sections for efficiently accommodating the linear algebra demands of the First Order Reduction Method.

Theoretical Considerations in First Order Reduction

When the First Order Reduction Method is compared to the State Space Reduction Method, there are no qualitative differences between the DAE-to-ODE reduction stages, and the new method requires only an extra two-to-one ODE order reduction. The derivative information for the method is the standard information required by any well established code for numerical solution of first order systems of ordinary differential equations; how to compute this derivative information is discussed in a subsequent Section. Relevant aspects of the quasi-Newton algorithm for solution of the discretized non-linear algebraic equations are also discussed, followed by remarks on the First Order Reduction Method that bring together, in a unitary framework, the issues of numerical ODE integration and method-characteristic linear algebra.

First Order Reduction of SSODE

By replacing all dependent variables in the independent equations of motion

$$M^{vv}(u, v)\ddot v + M^{vu}(u, v)\ddot u + \Phi_v^T(u, v)\lambda = Q^v(u, v, \dot u, \dot v)$$

the second order set of State-Space differential equations was shown in Section 3.1 to assume the form

$$\hat M\ddot v = \hat Q \tag{3.84}$$

with $\hat M$ a positive definite matrix. Expressions for $\hat M$ and $\hat Q$ are provided in Eqs. (3.16) and (3.17). Since $\hat M$ is positive definite, the implicit form of the second order differential equations in Eq. (3.84) can be expressed as a set of ordinary differential equations of the form

$$\ddot v = f(t, v, \dot v) \tag{3.85}$$

where $f \equiv \hat M^{-1}\hat Q$. The second order ODE in Eq. (3.85) is further reduced to a first order system. Denoting $w \equiv [v^T\ \dot v^T]^T$, the first order ODE assumes the form

$$\dot w = g(t, w) \tag{3.86}$$

with

$$g(t, w) = \begin{bmatrix} \dot v \\ f(t, v, \dot v) \end{bmatrix}$$

First order implicit numerical integration requires derivative information for the solution of the discretized non-linear algebraic equations. Consequently, the integration Jacobian $G \equiv g_w$ needs to be provided. For the particular form of the first order ODE in Eq. (3.86), the integration Jacobian assumes the form

$$G \equiv \begin{bmatrix} 0 & I \\ f_v & f_{\dot v} \end{bmatrix} \tag{3.87}$$

where $I$ is the identity matrix of dimension $ndof$. To compute $G$, the derivatives of the right side of the differential equation of Eq. (3.85) must be provided. In other words, $J_1 \equiv \ddot v_v$ and $J_2 \equiv \ddot v_{\dot v}$ are required. The next Section details the process of computing $J_1$ and $J_2$.
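The role of the matrix $G$ of Eq. (3.87) in a first order implicit integrator can be sketched with the simplest implicit formula. In the example below, a backward Euler step is applied to $\dot w = g(t, w)$, with the Newton matrix $I - hG$ built once and kept constant. Backward Euler, the scalar damped oscillator standing in for $f \equiv \hat M^{-1}\hat Q$, and all names are assumptions made for this illustration; they are not the stiff ODE codes referred to in the text.

```python
K, C_DAMP = 4.0, 0.5   # illustrative stiffness and damping, ndof = 1


def f(t, v, vd):
    """Stand-in for the state-space acceleration f(t, v, vd)."""
    return -K * v - C_DAMP * vd


def g(t, w):
    """First order right-hand side of Eq. (3.86), with w = [v, vd]."""
    v, vd = w
    return [vd, f(t, v, vd)]


def jacobian_G(t, w):
    """Integration Jacobian of Eq. (3.87): [[0, I], [f_v, f_vd]]."""
    return [[0.0, 1.0], [-K, -C_DAMP]]


def solve_2x2(m, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    x0 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    x1 = (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det
    return [x0, x1]


def backward_euler_step(t, w, h, tol=1e-12, max_iter=20):
    """One backward Euler step, w_new = w + h*g(t+h, w_new), by quasi-Newton iteration."""
    t_new = t + h
    w_new = list(w)
    G = jacobian_G(t_new, w_new)   # evaluated once, then kept fixed
    newton = [[1.0 - h * G[0][0], -h * G[0][1]],
              [-h * G[1][0], 1.0 - h * G[1][1]]]
    for _ in range(max_iter):
        gw = g(t_new, w_new)
        residual = [w_new[i] - w[i] - h * gw[i] for i in range(2)]
        if max(abs(r) for r in residual) < tol:
            break
        delta = solve_2x2(newton, [-r for r in residual])
        w_new = [w_new[i] + delta[i] for i in range(2)]
    return w_new


w1 = backward_euler_step(0.0, [1.0, 0.0], 0.1)
print(w1)
```

The unknowns of the iteration are positions and velocities, so in the constrained setting each residual evaluation would require recovering dependent coordinates and computing accelerations, as noted above for the First Order Reduction Method.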

Computing the Derivative Information

Taking the partial derivative of the independent set of equations of motion with respect to independent positions yields

$$\begin{aligned}
&M^{vv}J_1 + (M^{vv}\ddot v)_v + (M^{vv}\ddot v)_u u_v + M^{vu}\ddot u_v + (M^{vu}\ddot u)_v + (M^{vu}\ddot u)_u u_v \\
&\quad + \Phi_v^T\lambda_v + (\Phi_v^T\lambda)_v + (\Phi_v^T\lambda)_u u_v = Q^v_v + Q^v_u u_v + Q^v_{\dot u}\dot u_v
\end{aligned} \tag{3.88}$$

In order to compute $J_1$, the quantities $u_v$, $\dot u_v$, $\ddot u_v$, and $\lambda_v$ must be obtained. Taking the derivative of the position kinematic constraint equation with respect to independent positions and applying the chain rule of differentiation yields

$$\Phi_u u_v + \Phi_v = 0$$

Using the definition of the matrix $H$ in Eq. (3.12),

$$u_v = H \tag{3.89}$$

Taking the derivative of the velocity kinematic constraint equation in Eq. (3.8) with respect to independent positions yields

$$\Phi_u\dot u_v + (\Phi_u\dot u)_u u_v + (\Phi_u\dot u)_v + (\Phi_v\dot v)_v + (\Phi_v\dot v)_u u_v = 0$$

Then,

$$\Phi_u\dot u_v = -\left[(\Phi_q\dot q)_v + (\Phi_q\dot q)_u H\right]$$

Using the definition of the matrix $J$ in Eq. (3.38), the derivative of dependent velocities with respect to independent positions becomes

$$\dot u_v = J \tag{3.90}$$

In order to compute the derivative of dependent accelerations with respect to independent positions, the acceleration kinematic constraint equation of Eq. (3.9) is differentiated to yield

$$\Phi_v J_1 + (\Phi_q\ddot q)_v + (\Phi_q\ddot q)_u H + \Phi_u\ddot u_v = \tau_v + \tau_u H + \tau_{\dot u}J$$

Using the definition of the matrix $L$ in Eq. (3.39), the derivative $\ddot{u}_v$ is

$$\ddot{u}_v = HJ_1 + L \tag{3.91}$$

Finally, in order to compute the derivative of Lagrange multipliers with respect to independent positions, the dependent equations of motion are differentiated to obtain

$$M^{uv}J_1 + (M^{uv}\ddot{v})_u H + (M^{uv}\ddot{v})_v + M^{uu}(HJ_1 + L) + (M^{uu}\ddot{u})_v + (M^{uu}\ddot{u})_u H + \Phi_u^T\lambda_v + (\Phi_u^T\lambda)_v + (\Phi_u^T\lambda)_u H = 0$$

Using the definition of the matrix $R$ in Eq. (3.41), the derivative of Lagrange multipliers with respect to independent positions is obtained as

$$\lambda_v = -(\Phi_u^T)^{-1}\left[R + (M^{uv} + M^{uu}H)J_1\right] \tag{3.92}$$

Once expressions for the required derivatives are available, results in Eqs. (3.89) through (3.92) are substituted into Eq. (3.88) to obtain the matrix $J_1$ as the solution of the multiple right side linear equation

$$\hat{M}J_1 = Q^v_v + Q^v_u H + Q^v_{\dot{u}} J - \left(M^{vu}L + H^TR\right) - \left[(\Phi_v^T\lambda)_v + (\Phi_v^T\lambda)_u H\right] - \left[(M^{vv}\ddot{v} + M^{vu}\ddot{u})_v + (M^{vv}\ddot{v} + M^{vu}\ddot{u})_u H\right] \tag{3.93}$$

where $\hat{M}$ is defined in Eq. (3.6). The same steps are taken to compute the derivative $J_2$ of independent accelerations with respect to independent velocities. Differentiating the independent equations of motion with respect to independent velocities, and using the chain rule of differentiation, yields

$$M^{vv}J_2 + M^{vu}\ddot{u}_{\dot{v}} + \Phi_v^T\lambda_{\dot{v}} = Q^v_{\dot{u}}\dot{u}_{\dot{v}} + Q^v_{\dot{v}} \tag{3.94}$$

The quantities $\dot{u}_{\dot{v}}$, $\ddot{u}_{\dot{v}}$, and $\lambda_{\dot{v}}$ are evaluated below based on kinetic and kinematic information. Taking the derivative of the velocity kinematic constraint equation with respect to independent velocities yields

$$\Phi_u\dot{u}_{\dot{v}} + \Phi_v = 0$$

so

$$\dot{u}_{\dot{v}} = H \tag{3.95}$$

To compute $\ddot{u}_{\dot{v}}$, the acceleration kinematic constraint equation is differentiated with respect to independent velocities to obtain

$$\Phi_u\ddot{u}_{\dot{v}} + \Phi_v J_2 = \tau_{\dot{u}}\dot{u}_{\dot{v}} + \tau_{\dot{v}}$$

Using the definition of the matrix $N$ in Eq. (3.40), the derivative of dependent accelerations with respect to independent velocities assumes the form

$$\ddot{u}_{\dot{v}} = N + HJ_2 \tag{3.96}$$

Finally, in order to compute the derivative of Lagrange multipliers with respect to independent velocities, the dependent equations of motion are differentiated with respect to independent velocities to yield

$$M^{uv}J_2 + M^{uu}(N + HJ_2) + \Phi_u^T\lambda_{\dot{v}} = Q^u_{\dot{u}}H + Q^u_{\dot{v}}$$

The quantity $\lambda_{\dot{v}}$ is then obtained as

$$\lambda_{\dot{v}} = (\Phi_u^T)^{-1}\left[Q^u_{\dot{u}}H + Q^u_{\dot{v}} - M^{uu}N - (M^{uv} + M^{uu}H)J_2\right] \tag{3.97}$$

By substituting expressions for $\dot{u}_{\dot{v}}$, $\ddot{u}_{\dot{v}}$, and $\lambda_{\dot{v}}$ into Eq. (3.94), the matrix $J_2$ is obtained as the solution of the multiple right side linear system

$$\hat{M}J_2 = W - H^TS \tag{3.98}$$

where the matrix $S$ is defined in Eq. (3.42), and the matrix $W$ is defined as

$$W = Q^v_{\dot{u}}H + Q^v_{\dot{v}} - M^{vu}N \tag{3.99}$$

Iterative Process in the First Order Reduction Method

The integration Jacobian $G$ is used to correct, via an iterative process, the configuration $w$ at each time step. Corrections $\delta w$ are the solution of a linear system of equations that generally assumes the form

$$(\alpha I - G)\,\delta w = err \tag{3.100}$$

where $err$ is the residual in satisfying the integration formula for a multi-step formula, or the stage equation for a Runge-Kutta method. Likewise, $\alpha \equiv 1/(\gamma h)$ is a coefficient that depends on the integration step-size $h$ and an integration-formula coefficient $\gamma$, while $I$ is the identity matrix of appropriate dimension. For the reduced order SSODE, the dimension of this matrix is $2 \times$ ndof.

When solving the system of Eq. (3.100), advantage can be taken of the special structure of $G$. As a consequence of the fact that the first order ODE is obtained after reducing the original second order SSODE, the vector $\delta w \equiv [\delta p^T \;\; \delta v^T]^T$ is the solution of the linear system

$$\begin{bmatrix} \alpha I & -I \\ -J_1 & \alpha I - J_2 \end{bmatrix} \begin{bmatrix} \delta p \\ \delta v \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$

where $I$ is the identity matrix of dimension ndof and $err \equiv [b_1^T \;\; b_2^T]^T$. Analytically, the solution of this system can be obtained by solving two linear systems of dimension ndof for the variations $\delta p$ and $\delta v$. The variation in independent positions is obtained as the solution of

$$(\alpha^2 I - \alpha J_2 - J_1)\,\delta p = \alpha b_1 + b_2 - J_2 b_1 \tag{3.101}$$

whereas the variation in independent velocities is obtained by solving

$$(\alpha^2 I - \alpha J_2 - J_1)\,\delta v = \alpha b_2 + J_1 b_1 \tag{3.102}$$
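The block elimination behind Eqs. (3.101) and (3.102) can be checked numerically. The following is a minimal sketch, not the thesis implementation: the helper `solve` is a plain Gaussian elimination written for this example, and the values of `alpha`, `J1`, `J2`, `b1`, and `b2` are arbitrary assumptions. It solves the two ndof-sized systems and then evaluates the residual of the original block system of size 2 × ndof.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            fac = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= fac * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def corrections(alpha, J1, J2, b1, b2):
    """Return (dp, dv) from the two ndof-sized systems (3.101) and (3.102)."""
    n = len(b1)
    # Shared coefficient matrix: alpha^2 I - alpha J2 - J1
    C = [[(alpha ** 2 if i == j else 0.0) - alpha * J2[i][j] - J1[i][j]
          for j in range(n)] for i in range(n)]
    J2b1, J1b1 = matvec(J2, b1), matvec(J1, b1)
    rhs_p = [alpha * b1[i] + b2[i] - J2b1[i] for i in range(n)]
    rhs_v = [alpha * b2[i] + J1b1[i] for i in range(n)]
    return solve(C, rhs_p), solve(C, rhs_v)


# Arbitrary illustrative data (assumptions, ndof = 2)
alpha = 10.0
J1 = [[-4.0, 1.0], [0.5, -3.0]]
J2 = [[-0.2, 0.0], [0.1, -0.3]]
b1, b2 = [0.1, -0.2], [0.05, 0.3]

dp, dv = corrections(alpha, J1, J2, b1, b2)

# Residuals of the original block system
J1dp, J2dv = matvec(J1, dp), matvec(J2, dv)
res_top = [alpha * dp[i] - dv[i] - b1[i] for i in range(2)]
res_bot = [-J1dp[i] + alpha * dv[i] - J2dv[i] - b2[i] for i in range(2)]
```

Both systems share one coefficient matrix, so a single factorization serves the position and the velocity corrections.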

As noted in the previous Section, $J_1$ and $J_2$ are the solution of two systems of linear equations with multiple right-hand sides. CPU savings are obtained if, instead of solving these systems, Eqs. (3.101) and (3.102) are multiplied by the nonsingular matrix $(1/\alpha^2)\hat{M}$. Using the notation

$$\Pi \equiv \hat{M} - \frac{1}{\alpha}\hat{M}J_2 - \frac{1}{\alpha^2}\hat{M}J_1 \tag{3.103}$$

corrections in independent positions and velocities are solutions of

$$\Pi\,\delta p = \frac{1}{\alpha^2}\hat{M}(\alpha b_1 + b_2) - \frac{1}{\alpha^2}\hat{M}J_2\,b_1 \tag{3.104}$$

$$\Pi\,\delta v = \frac{1}{\alpha}\hat{M}b_2 + \frac{1}{\alpha^2}\hat{M}J_1\,b_1 \tag{3.105}$$

Two things are worth noting here. First, there is no need to solve the systems in Eqs. (3.93) and (3.98). This is because the matrices $J_1$ and $J_2$ no longer appear isolated, but only in products of the form $\hat{M}J_1$ and $\hat{M}J_2$. Wherever these products appear, they can be replaced by the right sides of Eqs. (3.93) and (3.98). This leads to the second observation. After making these substitutions for $\hat{M}J_1$ and $\hat{M}J_2$ in the expression for $\Pi$, $\Pi$ is identical to $\Psi_{\ddot{v}}$ in Eq. (3.43). This is true provided the same integration formula is used in the State Space Method to integrate for both independent velocities and positions. In other words, the following holds:

Proposition 1. Assume that the same integration formula is consistently used for the resulting ODE in the State Space and First Order Reduction Methods. Then, the coefficient matrices of the linear systems that provide corrections in independent accelerations, and in independent positions and velocities, for the State Space and First Order Reduction Methods, respectively, are the same.

The proof of this result is obtained by direct substitution.

The benefit of the result stated in Proposition 1 is that the rather complicated Jacobian computation for two different methods can be numerically managed using only one set of software routines. Furthermore, for the First Order Reduction Method, the proposed approach eliminates the need for explicit computation of the matrices $J_1$ and $J_2$.

Observations Regarding the First Order Reduction Method

Once the derivatives $\ddot{v}_v$ and $\ddot{v}_{\dot{v}}$ are evaluated, using Eqs. (3.93) and (3.98), the integration Jacobian $G$ in Eq. (3.87) is available. Thus, practically any standard implicit ODE solver can be considered to determine the time evolution of the independent positions and velocities. It remains to provide a set of robust and efficient routines for dependent variable recovery (at the position and velocity levels), along with a fast algorithm for computation of accelerations and Lagrange multipliers.

Dependent position and velocity recovery is an important issue for efficiency and robustness. In this work, this issue is not addressed. A comprehensive analysis of this topic can be found in the work of Serban (1998). However, the issue of efficiently computing accelerations and Lagrange multipliers is addressed in detail in what follows.

For the implicit integration of the DAE of Multibody Dynamics, as proposed in Section 3.4, each correction in independent positions and velocities requires computation of generalized accelerations $\ddot{q}$. On the other hand, when explicit integration is considered, Eq. (2.6) must be solved for $\ddot{q}$, since independent accelerations are used to advance the integration to the next time step. Depending on the type of analysis being considered, the Lagrange multipliers are also quantities of interest. These are the reasons that motivated development of methods for efficient computation of generalized accelerations $\ddot{q}$ and Lagrange multipliers $\lambda$.

In terms of the number of variables used to model a multibody system, there are two extreme approaches:

(1). The descriptor form, or the Cartesian representation, or the body representation, in which mechanical systems are represented using, for each body, a set of coordinates specifying the position of a particular point on that body, along with a choice of parameters that specify the orientation of the body with respect to a global reference frame

(2). The minimal form, or the recursive formulation, or the joint representation, in which mechanical systems are represented in terms of a minimal set of generalized coordinates

Throughout this document, BR (for Body Representation) and JR (for Joint Representation) denote these two approaches. The Cartesian formulation (BR) is convenient for representing the state of a mechanical system, because kinetic and kinematic information is readily available for each body in the system. The major drawback of the Cartesian approach is that the dimension of the problem increases dramatically, compared to the alternative provided by the recursive formulation (JR).

The Sections that follow focus on constructing methods that efficiently solve for generalized accelerations $\ddot{q}$ and Lagrange multipliers $\lambda$ in both formulations. These quantities are the solution of the linear system in Eq. (2.6). The coefficient matrix of this system is called in what follows the augmented matrix. The objective is to take advantage of both structure and topology-induced sparsity when solving for $\ddot{q}$ and $\lambda$. As mentioned by Andrzejewski and Schwerin (1995), no code seems to exploit both the structure of the augmented matrix and the sparsity in a satisfying manner.

Computing Accelerations in the Cartesian Representation

In this Section, a method for computing accelerations and Lagrange multipliers in the Cartesian representation is presented, following the approach presented by Negrut, Serban, and Potra (1996). A slightly different approach is proposed by Serban, Negrut, Haug, and Potra (1997), where numerical results showing the good performance of the algorithm were provided.

The Newton-Euler constrained equations of motion for a multibody system assume the form

$$M(q)\ddot{q} + \Phi_q^T(q)\lambda = Q(q, \dot{q}), \qquad \Phi(q) = 0 \tag{3.106}$$

where $q$, $\dot{q}$, and $\ddot{q}$ are vectors in $\mathbb{R}^n$ that represent generalized position, velocity, and acceleration; $Q(q, \dot{q}) \in \mathbb{R}^n$ is the vector of generalized forces; $\lambda \in \mathbb{R}^m$ is the vector of Lagrange multipliers; and $M(q)$ is the $n \times n$ mass matrix. Finally, $\Phi_q$ is the $m \times n$ constraint Jacobian, $m < n$. The kinematic constraints are assumed to be independent, so the Jacobian matrix has full row rank.

Differentiating the position kinematic constraint equation twice with respect to time, and replacing the position kinematic constraint equation with the newly obtained acceleration kinematic constraint equation, reduces the index of the DAE of Eq. (3.106) from 3 to 1. The resulting index 1 DAE assumes the form

$$\begin{bmatrix} M & \Phi_q^T \\ \Phi_q & 0 \end{bmatrix} \begin{bmatrix} \ddot{q} \\ \lambda \end{bmatrix} = \begin{bmatrix} Q \\ \tau \end{bmatrix} \tag{3.107}$$

where the right side of the acceleration kinematic equation is defined in Eq. (2.4). The coefficient matrix of the linear system in Eq. (3.107) is of dimension $(n+m) \times (n+m)$, and in what follows it is denoted by $A$, and called the augmented matrix.
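As a small illustration of the augmented system, consider a planar point-mass pendulum in Cartesian coordinates. This example is an assumption introduced here and does not appear in the thesis: the constraint $\Phi = x^2 + y^2 - L^2 = 0$ gives $\Phi_q = [2x \;\; 2y]$ and, after two time differentiations, $\tau = -2(\dot{x}^2 + \dot{y}^2)$. The helper `solve` and the function `pendulum_augmented` are hypothetical names.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            fac = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= fac * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def pendulum_augmented(x, y, xd, yd, mass=1.0, grav=9.81):
    """Assemble and solve [M Phi_q^T; Phi_q 0][qdd; lam] = [Q; tau].

    Here n = 2 (x, y) and m = 1 (constant length), so A is 3 x 3.
    """
    phi_q = [2.0 * x, 2.0 * y]                 # constraint Jacobian
    tau = -2.0 * (xd * xd + yd * yd)           # acceleration right side
    A = [[mass, 0.0, phi_q[0]],
         [0.0, mass, phi_q[1]],
         [phi_q[0], phi_q[1], 0.0]]
    rhs = [0.0, -mass * grav, tau]
    xdd, ydd, lam = solve(A, rhs)
    return xdd, ydd, lam


# Released from rest in the horizontal position: free fall, no rod force
horizontal = pendulum_augmented(1.0, 0.0, 0.0, 0.0)
# Hanging at rest: zero acceleration, multiplier balances gravity
hanging = pendulum_augmented(0.0, -1.0, 0.0, 0.0)
```

Even for this one-body example the augmented matrix is $3 \times 3$ for a single degree of freedom, which hints at how quickly $n + m$ grows for realistic models.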

The dimension of the augmented matrix assumes large values for relatively simple mechanical systems. For the seven body mechanism in Figure 1, modeled as a planar mechanical system, n = 21; i.e., 7 bodies with 3 coordinates each, two positions and one orientation angle. The number of constraints m is 20. The mechanism has one degree of freedom, and the augmented matrix is of dimension 41. The situation is more drastic if the mechanism is modeled as a spatial system. In this case n is 42, while the number of constraints imposed is 41. The augmented matrix is thus of dimension 83. It can be seen that the dimension of the augmented matrix is large, especially when spatial mechanical systems with few degrees of freedom are considered.

Figure 1. Seven Body Mechanism

The strategy proposed for solution of the linear system in Eq. (3.107) proceeds by formally expressing accelerations $\ddot{q}$ in terms of Lagrange multipliers. Assuming for the

moment that the mass matrix is nonsingular, accelerations are substituted back into the acceleration kinematic equation to obtain

$$\left(\Phi_q M^{-1}\Phi_q^T\right)\lambda = \Phi_q M^{-1}Q - \tau \tag{3.108}$$

For notational convenience, the dependency of the Jacobian on positions, and of the generalized force on both positions and velocities, is suppressed. The matrix $B \equiv \Phi_q M^{-1}\Phi_q^T$ is referred to as the reduced matrix. Diagonal blocks in this matrix have the form

$$B_{jj} = (\Phi^j_{q_1})M_1^{-1}(\Phi^j_{q_1})^T + (\Phi^j_{q_2})M_2^{-1}(\Phi^j_{q_2})^T \tag{3.109}$$

if joint $j$ connects bodies 1 and 2. Off-diagonal blocks assume the form

$$B_{jk} = (\Phi^j_{q_i})M_i^{-1}(\Phi^k_{q_i})^T, \quad j \neq k \tag{3.110}$$

if joints $j$ and $k$ are on the same body $i$. Otherwise, they are zero.

In Eqs. (3.109) and (3.110), $\Phi^j_{q_i}$ is the derivative of the constraint function corresponding to joint $j$ with respect to the generalized coordinates $q_i$ of body $i$, and $M_i^{-1}$ is the inverse of the mass matrix of body $i$.

The Planar Case

For planar mechanisms, the mass matrix $M$ is positive definite. Therefore, since the constraint Jacobian is assumed to have full row rank, the reduced matrix $B$ is positive definite. Assuming that Lagrange multipliers are available, accelerations are easily computed using the equations of motion

$$M\ddot{q} = Q - \Phi_q^T\lambda \tag{3.111}$$

The mass matrix $M$ is of dimension $(3nb \times 3nb)$, where $nb$ is the number of bodies in the system, and has a diagonal block structure

$$M = \begin{bmatrix} M_1 & 0 & \cdots & 0 \\ 0 & M_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & M_{nb} \end{bmatrix} \tag{3.112}$$

For the planar case, given a certain joint numbering scheme, the following algorithm is used to compute $\ddot{q}$ and $\lambda$:

Algorithm 1
(1). Assemble the reduced matrix $B$, based on Eqs. (3.109) and (3.110)
(2). Solve the linear system of Eq. (3.108) for $\lambda$
(3). Recover the generalized accelerations, based on Eqs. (3.111) and (3.112)

Step 2 is discussed in the Section on factoring the reduced matrix. Step 3 requires the solution of $nb$ systems of linear equations of dimension $3 \times 3$ to recover generalized accelerations. This process can be carried out in parallel. For rigid body simulation, the matrices $M_i$ are diagonal, and they are constant over the entire simulation. Therefore, in an efficient implementation, matrix $M$ is factored during the pre-processing stage of the simulation, so that computation of generalized accelerations requires only a matrix-vector multiplication.

The Spatial Case

With notations used by Haug (1989), generalized accelerations and Lagrange multipliers are the solution of the linear system

$$\begin{bmatrix} M & 0 & \Phi_r^T & 0 \\ 0 & 4G^TJ'G & \Phi_p^T & (\Phi^p_p)^T \\ \Phi_r & \Phi_p & 0 & 0 \\ 0 & \Phi^p_p & 0 & 0 \end{bmatrix} \begin{bmatrix} \ddot{r} \\ \ddot{p} \\ \lambda \\ \lambda^p \end{bmatrix} = \begin{bmatrix} F^A \\ 2G^Tn'^A + 8\dot{G}^TJ'\dot{G}p \\ \tau \\ \tau^p \end{bmatrix} \tag{3.113}$$

where

$$\begin{aligned} r &\equiv [r_1^T, \ldots, r_{nb}^T]^T & p &\equiv [p_1^T, \ldots, p_{nb}^T]^T \\ F^A &\equiv [F_1^{A\,T}, \ldots, F_{nb}^{A\,T}]^T & n'^A &\equiv [n_1'^{A\,T}, \ldots, n_{nb}'^{A\,T}]^T \\ M &\equiv \mathrm{diag}(M_i) & J' &\equiv \mathrm{diag}(J'_i) \\ G &\equiv \mathrm{diag}(G_i) & \Phi^p &\equiv [p_1^Tp_1 - 1, \ldots, p_{nb}^Tp_{nb} - 1]^T \\ \Phi^p_p &\equiv 2\,\mathrm{diag}(p_i^T) \end{aligned} \tag{3.114}$$

The quantity $\tau$ in Eq. (3.113) is the right side of the acceleration kinematic equation, as given in Eq. (2.4). The right side of the Euler parameter constraint equation is obtained easily as

$$\tau^p = [-2\dot{p}_1^T\dot{p}_1, \ldots, -2\dot{p}_{nb}^T\dot{p}_{nb}]^T$$

In Eq. (3.113), Lagrange multipliers corresponding to position kinematic constraint equations are denoted by $\lambda$, while those corresponding to the Euler parameter normalization constraint equations $p_i^Tp_i = 1$ are denoted by $\lambda^p$. The coefficient matrix in Eq. (3.113) can be brought to a form that resembles the two dimensional case by denoting

$$\bar{M} \equiv \mathrm{diag}(M, 4G^TJ'G), \qquad \bar{\Phi} \equiv [\Phi^T, \Phi^{p\,T}]^T$$

Then, the coefficient matrix would assume the form in Eq. (3.107). However, the idea used for the two dimensional case is not directly applicable here, since the newly defined

composite mass matrix $\bar{M}$ fails to be positive definite. In fact, by simply taking the vector $z \equiv [0^T, p^T]^T \neq 0$, $\bar{M}z = 0$. This is a direct consequence of the fact that in any consistent configuration of the bodies in the system, $Gp = 0$ (Haug, 1989). The approach in which the reduced system is first solved for Lagrange multipliers, and then generalized accelerations are recovered, fails, since it is based on positive definiteness of the mass matrix.

The solution proposed is as follows: keep the algorithm and temporarily change the representation. Thus, instead of following the Euler parameter representation of the constrained equations of motion for a mechanical system, for the moment consider the original Newton-Euler formulation in which acceleration is retrieved by solving the linear system (Haug, 1989)

$$\begin{bmatrix} M & 0 & \Phi_r^T \\ 0 & J' & \Phi_{\pi'}^T \\ \Phi_r & \Phi_{\pi'} & 0 \end{bmatrix} \begin{bmatrix} \ddot{r} \\ \dot{\omega}' \\ \lambda \end{bmatrix} = \begin{bmatrix} F^A \\ n'^A - \tilde{\omega}'J'\omega' \\ \tau \end{bmatrix} \tag{3.115}$$

At each integration time step, since the quantities $p$ and $\dot{p}$ are known, the matrix $G$ can be constructed, and once the time derivative of the angular velocities $\dot{\omega}'$ is available, the Euler parameter accelerations are calculated using the relation (Haug, 1989)

$$\ddot{p} = \frac{1}{2}G^T\dot{\omega}' - (\dot{p}^T\dot{p})\,p \tag{3.116}$$

Notice that the construction of the coefficient matrix and the right side of Eq. (3.115) is less CPU intensive when compared to the approach in Eq. (3.113). The reason for which the $\omega$-formulation (angular velocity formulation) of Eq. (3.115) is not commonly used is that the angular velocity is not integrable (Haug, 1989). The Euler parameter formulation avoids this difficulty. Of the two formulations discussed above, one is not recommended on mathematical integration grounds, but is desirable based on linear

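algebra considerations, and conversely. The idea proposed by Negrut, Serban, and Potra (1996) is to consider each formulation where it is more beneficial: the $\omega$-formulation for the linear algebra part, and the Euler parameter formulation for the numerical integration part. What makes this feasible is the ability to move back and forth between the two formulations at relatively no CPU penalty. With this, the case of spatial mechanisms can be treated much as the planar case. With $nb$ being the number of bodies of the mechanical system model, the following notation is introduced

$$x_i \equiv [\ddot{r}_i^T \;\; \dot{\omega}_i'^T]^T, \qquad x \equiv [x_1^T, x_2^T, \ldots, x_{nb}^T]^T \tag{3.117}$$

$$\mathbf{Q}_i \equiv \begin{bmatrix} F_i^A \\ n_i'^A - \tilde{\omega}_i'J_i'\omega_i' \end{bmatrix}, \qquad \mathbf{Q} \equiv [\mathbf{Q}_1^T, \mathbf{Q}_2^T, \ldots, \mathbf{Q}_{nb}^T]^T \tag{3.118}$$

$$\mathbf{M}_i \equiv \mathrm{diag}(M_i, J_i'), \qquad \mathbf{M} \equiv \mathrm{diag}(\mathbf{M}_1, \mathbf{M}_2, \ldots, \mathbf{M}_{nb}) \tag{3.119}$$

The spatial case reduces to finding the solution of the system

$$\begin{bmatrix} \mathbf{M} & \Phi_q^T \\ \Phi_q & 0 \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} \mathbf{Q} \\ \tau \end{bmatrix} \tag{3.120}$$

The matrix $\mathbf{M}$ is positive definite and the algorithm presented for the planar case can be employed to compute the unknowns $x$ and $\lambda$. Thus, first proceed by solving for Lagrange multipliers the equivalent of Eq. (3.108), which for the spatial case, with the notation in Eqs. (3.117) through (3.119), assumes the form

$$\left(\Phi_q\mathbf{M}^{-1}\Phi_q^T\right)\lambda = \Phi_q\mathbf{M}^{-1}\mathbf{Q} - \tau \tag{3.121}$$

and then recover $\ddot{r}$ and $\dot{\omega}'$ by solving

$$\mathbf{M}x = \mathbf{Q} - \Phi_q^T\lambda \tag{3.122}$$

The following algorithm is proposed: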

Algorithm 2
(1). Assemble the reduced matrix $B$, based on Eqs. (3.109) and (3.110)
(2). With the notations introduced in Eqs. (3.117) through (3.119), solve the reduced linear system of Eq. (3.121)
(3). For each body in the mechanical system, solve the $6 \times 6$ system of Eq. (3.122) to recover $\ddot{r}$ and $\dot{\omega}'$
(4). Compute $\ddot{p}$ via Eq. (3.116)

Factoring the Reduced Matrix

In order to efficiently solve Eq. (3.108) for Lagrange multipliers, the topology of the mechanical system should be used to advantage. Two simple examples are presented to illustrate this point. The first example is Andrew's squeezing mechanism, or the Seven-Body Mechanism, which is a standard test problem. A description of this mechanism can be found in the work of Schiehlen (1990). The mechanism is presented in Figure 1. The second example is a 50-link chain consisting of simple pendulums, shown in Figure 2. This problem is larger and admits a joint numbering scheme that results in a block diagonal matrix $B$.

The Seven-Body Mechanism is a closed loop mechanism, whose graph is presented in Figure 3. The joints are represented as vertices, while the bodies are connecting edges. This representation is different from the usual one, in which bodies are vertices and joints are connecting edges of the graph. In the proposed representation, one joint connects exactly two bodies, and the same body can appear more than once as an edge. This is the case when a body is connected to two or more bodies by joints. Thus,

when a body is attached through $k$ joints to neighboring bodies, it yields $k$ edges in the graph. The ground is not represented as a body, so free vertices located at the end of the graph, vertices 1, 8, and 10 in Figure 3(a), connect the adjacent body to ground.

Figure 2. Chain of Pendulums

Figure 3. Graph Representation: Seven-Body Mechanism, panels (a) and (b)
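The joint-based graph described above also gives the block sparsity pattern of the reduced matrix, since by Eqs. (3.109) and (3.110) a block $B_{jk}$ can be nonzero only when $j = k$ or when joints $j$ and $k$ act on a common body. The sketch below is an illustration under assumptions: joints are given as hypothetical `(body_a, body_b)` pairs, body `0` stands for ground, and the function name `block_pattern` is not from the thesis.

```python
GROUND = 0  # assumed label for ground, which is not treated as a body


def block_pattern(joints):
    """Return pattern[j][k] = True where block B_jk can be nonzero.

    joints: list of (body_a, body_b) pairs, one pair per joint.
    Two different joints are adjacent when they share a body other
    than ground; every diagonal block is present.
    """
    n = len(joints)
    pattern = [[False] * n for _ in range(n)]
    for j, pair_j in enumerate(joints):
        for k, pair_k in enumerate(joints):
            shared = (set(pair_j) & set(pair_k)) - {GROUND}
            pattern[j][k] = (j == k) or bool(shared)
    return pattern


# Chain of four pendulums: joint 1 attaches body 1 to ground,
# joint 2 connects bodies 1 and 2, and so on (natural numbering).
chain = [(0, 1), (1, 2), (2, 3), (3, 4)]
pattern = block_pattern(chain)

for row in pattern:
    print("".join("X" if entry else "." for entry in row))
```

With this numbering the nonzero blocks sit on the diagonal and its immediate neighbours; two joints that touch only ground, such as `(0, 1)` and `(0, 2)`, are not adjacent.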

For the seven-body mechanism, two different joint numbering sequences, shown in Figure 3(a) and (b), yield the corresponding reduced matrices $B_1$ and $B_2$ in Figure 4(a) and (b), respectively. Only non-zero elements are represented.

Figure 4. Reduced Matrix B: Seven-Body Mechanism

Figure 6 shows the sparsity pattern of the $B$ matrix corresponding to two different joint-numbering schemes for the chain of 50 pendulums. The first matrix is obtained by numbering the joints in the natural order (1, 2, 3, ...), starting with the joint between body 1 and ground, as shown in Figure 5(a). The second matrix corresponds to the numbering in which the first joint of the chain is 1, the second is 50, the third is 2, the next is 49, and so on, as shown in Figure 5(b). This latter numbering scheme is unlikely to ever be used, and it is considered only to illustrate the impact of a bad joint numbering scheme on the sparsity pattern of the reduced matrix $B$.

These results show that the bandwidth of the reduced matrix, and therefore the efficiency of a sparse solver, depend on the joint-numbering scheme. The choice of joint numbering is made at the stage of mechanical system modeling, and then used throughout

the simulation. For improved performance, it is important to determine a strategy that automatically numbers the joints of the model, such that the amount of work in factoring the reduced matrix is minimized.

Figure 5. Two Joint Numbering Schemes: Chain of Pendulums

Figure 6. Reduced Matrix: Chain of Pendulums

Since the reduced matrix is positive definite, factorization is based on block Cholesky decomposition, taking advantage of the sparse-block structure of the reduced matrix. The block structure is given by Eqs. (3.109) and (3.110). The block width $\beta_i$ of row $i$ is defined in terms of blocks as

$$\beta_i = \max\{(i - j) : B_{ij} \neq 0,\; j < i\} \tag{3.123}$$
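The effect of joint numbering on the row widths of Eq. (3.123) can be reproduced for the chain of 50 pendulums. The sketch below is an illustration under stated assumptions: the joint graph is taken to be a simple path (consecutive joints share a body), the largest row width is reported as the bandwidth and the sum of row widths as the envelope, and the renumbering routine is a plain Cuthill-McKee breadth-first ordering used here as a stand-in. It is not the Gibbs-King or Gibbs-Poole-Stockmeyer algorithm, and all function names are hypothetical.

```python
from collections import deque


def chain_graph(n):
    """Joint adjacency for a chain: joint k shares a body with joint k + 1."""
    adj = [[] for _ in range(n)]
    for k in range(n - 1):
        adj[k].append(k + 1)
        adj[k + 1].append(k)
    return adj


def row_widths(adj, label):
    """beta_i = max(i - j) over neighbours j < i, for 1-based labels."""
    beta = [0] * (len(adj) + 1)
    for u, neighbours in enumerate(adj):
        for w in neighbours:
            if label[w] < label[u]:
                beta[label[u]] = max(beta[label[u]], label[u] - label[w])
    return beta[1:]


def interleaved_labels(n):
    """The deliberately bad numbering 1, n, 2, n - 1, ... along the chain."""
    return [k // 2 + 1 if k % 2 == 0 else n - k // 2 for k in range(n)]


def cuthill_mckee(adj):
    """Breadth-first renumbering started from a minimum-degree vertex."""
    n = len(adj)
    label, visited, next_label = [0] * n, [False] * n, 1
    for start in sorted(range(n), key=lambda u: len(adj[u])):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            u = queue.popleft()
            label[u] = next_label
            next_label += 1
            for w in sorted(adj[u], key=lambda x: len(adj[x])):
                if not visited[w]:
                    visited[w] = True
                    queue.append(w)
    return label


def summary(beta):
    """Bandwidth, envelope, and the work estimate sum(beta (beta + 3)) / 2."""
    return max(beta), sum(beta), sum(b * (b + 3) for b in beta) // 2


adj = chain_graph(50)
natural = summary(row_widths(adj, list(range(1, 51))))
bad = summary(row_widths(adj, interleaved_labels(50)))
renumbered = summary(row_widths(adj, cuthill_mckee(adj)))
```

For the path graph the breadth-first renumbering recovers the natural order, so `renumbered` matches `natural`, while the interleaved numbering pushes the bandwidth to 49 blocks.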

The bandwidth $\beta$ of the reduced matrix, in terms of blocks, is given by the maximum row width as

$$\beta = \max\{\beta_i : i = 1, 2, \ldots, m\} \tag{3.124}$$

The envelope or profile of the matrix is given by

$$\mathrm{env}(B) = \sum_{i=1}^{m}\beta_i \tag{3.125}$$

In the Cholesky factorization of $B$, the computational work of an algorithm that makes use of an envelope storage scheme can be bounded from above by (Negrut, Serban, and Potra, 1996)

$$\mathrm{work}(B) = \frac{1}{2}\sum_{i=2}^{m}\beta_i(\beta_i + 3) \tag{3.126}$$

This estimate is an upper bound on the actual work in a block oriented Cholesky factorization algorithm. An operation in Eq. (3.126) is considered to be either a block-block multiplication, or a block inversion. Equation (3.126) does not take into account the square root of the diagonal elements (for the scalar case), or the Cholesky decomposition of diagonal blocks (in the block form). If these operations were to be considered, the above formula becomes

$$\mathrm{work}(B) = \frac{1}{2}\sum_{i=1}^{m}(\beta_i + 1)(\beta_i + 2)$$

The values of the bandwidth, envelope, and the actual work performed in factorization of the reduced matrix depend on the choice of row and column ordering in matrix $B$. In general the minima for these three quantities will not be obtained with the same ordering. Minimizing the bandwidth of a matrix is an NP-complete problem, and minimizing any of the other two quantities considered above is an intractable task.

It is common practice to use a bandwidth and/or envelope reduction algorithm to reorder the matrix, prior to applying Cholesky factorization. Although this approach does not exactly minimize the amount of work in the Cholesky factorization, the results are

satisfactory. Algorithms such as Gibbs-King, or Gibbs-Poole-Stockmeyer (Lewis, 1982) perform this operation. These algorithms employ a local search in the adjacency graph of the matrix. Central to this discussion is the fact that this graph is identical to the graph representation of the mechanism, as proposed earlier in this Section. Thus, permutation of rows and columns of the reduced matrix via a symmetric permutation of the form $PBP^T$, i.e., renumbering the vertices in the adjacency graph of the matrix, is equivalent to renumbering the joints of the mechanical system. Therefore, the reordered index set given by any of the above algorithms is translated into a new numbering of the joints of the mechanical system.

In the discussion above, blocks of the reduced matrix were manipulated as if they were simple entries in a matrix. These blocks are inherited from the structure of the problem, and their dimension is dictated by the number of constraint equations associated with different joint types, as expressed by Eqs. (3.109) and (3.110). These blocks could be further broken down, going to entry level. The bandwidth and/or envelope reduction algorithms would then result in better-structured matrices, in terms of operations needed for Cholesky factorization. However, in this case the immediate relationship between the topology of a mechanism, as defined at the beginning of this Section, and the structure of the reduced matrix is lost. Ultimately, this is the difference between regarding a certain mechanical joint as a set of basic constraint equations that are manipulated together and manipulating these basic constraint equations on an individual basis, disregarding the fact that some of them together describe a certain mechanical joint.

Numerical Results

The Seven-Body Mechanism and the chain of pendulums are considered to illustrate the benefit of using the proposed algorithm. Numerical results for other

mechanical systems are presented in the paper of Serban, Negrut, Haug, and Potra (1997). The conclusions drawn there, along with the observed speed-up factors for the proposed algorithm, are qualitatively the same.

Based on the theoretical considerations of the preceding Sections, an algorithm to compute Lagrange multipliers and generalized accelerations is defined as follows:

Algorithm 3
(1). For an arbitrary joint numbering, create the reduced matrix $B$, based on Eqs. (3.109) and (3.110).
(2). Use a joint numbering algorithm, such as Gibbs-King (Lewis, 1982) or Gibbs-Poole-Stockmeyer (Lewis, 1982), to reduce the profile of matrix $B$.
(3). Renumber the joints of the mechanical system model, as suggested by the profile reduction algorithm.
(4). Apply Algorithm 1 for the planar case, or Algorithm 2 for the spatial case, to recover first the Lagrange multipliers and then, on a per body basis, the generalized accelerations.

During simulation, steps (1) through (3) are done once at the pre-processing stage. The resulting joint numbering is then used throughout the simulation. On the other hand, step (4) is taken once at each integration step when explicit integration is used, or once for each iteration of the quasi-Newton algorithm when implicit integration is used.

A set of numerical experiments, denoted below with ExpA, is carried out to compare two strategies for obtaining accelerations and Lagrange multipliers. The strategies are as follows

ExpA1: Solve the augmented system of Eq. (3.107), using the routine ma28 of the Harwell direct sparse solver library (Duff, 1980)

ExpA2: Apply Algorithm 3 above

The routine ma28 employs a multi-frontal sparse Gaussian elimination, combined with a modified Markowitz strategy for local pivoting. Table 2 contains CPU times in seconds required to compute accelerations and Lagrange multipliers 1000 times. All numerical experiments were performed on an HP9000 model J210 computer, with two PA 7200 processors.

Table 2. Numerical Results: Solving For Accelerations and Lagrange Multipliers

         Seven Body Mechanism    Chain of 50 Pendulums
ExpA1
ExpA2

The results suggest that a speed-up of a factor of 6 to 8 is obtained by using the proposed method, compared to the direct approach of solving the augmented system. The observed speed-up ratio is cut down to values of 3 to 4 when spatial mechanical system models are considered (Serban, Negrut, Haug, Potra, 1997).

A second set of numerical experiments was carried out to assess the impact of different joint numbering schemes on CPU time required to solve the reduced system for Lagrange multipliers. For both planar and spatial cases, the reduced matrix $B$ is positive definite.
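The two strategies compared in ExpA1 and ExpA2 return the same accelerations and multipliers, and differ only in the linear algebra performed. The sketch below shows this on a tiny made-up system and makes no claim about timing: it assumes a diagonal mass matrix, uses a small dense Gaussian elimination (`solve`, a hypothetical helper) in place of both ma28 and the block sparse Cholesky solver, and the numerical data are arbitrary.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            fac = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= fac * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def solve_augmented(m_diag, phi_q, Q, tau):
    """Direct solve of [M Phi_q^T; Phi_q 0][qdd; lam] = [Q; tau]."""
    n, m = len(m_diag), len(phi_q)
    A = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        A[i][i] = m_diag[i]
        for j in range(m):
            A[i][n + j] = phi_q[j][i]   # Phi_q^T block
            A[n + j][i] = phi_q[j][i]   # Phi_q block
    sol = solve(A, list(Q) + list(tau))
    return sol[:n], sol[n:]


def solve_reduced(m_diag, phi_q, Q, tau):
    """Reduced approach: B lam = Phi_q M^-1 Q - tau, then M qdd = Q - Phi_q^T lam."""
    n, m = len(m_diag), len(phi_q)
    m_inv = [1.0 / mi for mi in m_diag]
    B = [[sum(phi_q[j][i] * m_inv[i] * phi_q[k][i] for i in range(n))
          for k in range(m)] for j in range(m)]
    rhs = [sum(phi_q[j][i] * m_inv[i] * Q[i] for i in range(n)) - tau[j]
           for j in range(m)]
    lam = solve(B, rhs)
    qdd = [m_inv[i] * (Q[i] - sum(phi_q[j][i] * lam[j] for j in range(m)))
           for i in range(n)]
    return qdd, lam


# Arbitrary illustrative data (assumptions): n = 4 coordinates, m = 2 constraints
m_diag = [2.0, 2.0, 1.0, 1.0]
phi_q = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
Q = [1.0, -2.0, 0.5, 3.0]
tau = [0.1, -0.2]

qdd_a, lam_a = solve_augmented(m_diag, phi_q, Q, tau)
qdd_r, lam_r = solve_reduced(m_diag, phi_q, Q, tau)
```

The reduced approach works on an $m \times m$ matrix followed by an inexpensive per-coordinate recovery, instead of a single $(n+m) \times (n+m)$ solve.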

The proposed algorithm is compared to well established solvers from LAPACK and Harwell. The sparse Cholesky solver developed is block oriented and takes advantage of the sparse skyline format used to store the reduced matrix $B$. Two LAPACK routines are considered, one based on the dgetrf/dgetrs pair and a second that takes advantage of the positive definiteness of the reduced matrix (dppsvx). Finally, the sparse solver MA28 of Harwell is used for comparison. The CPU results in seconds for 1000 solutions of the reduced system are presented in Tables 3 and 4. A bad joint numbering is considered first (case ExpB1), while a good joint numbering, obtained after applying the Gibbs-King (Lewis, 1982) envelope reduction algorithm, is used in the second case (ExpB2).

Table 3. Timing Results for Seven Body Mechanism

         dgetrf/dgetrs    dppsvx    MA28    Sparse Cholesky
ExpB1
ExpB2

Table 4. Timing Results for Chain of Pendulums

         dgetrf/dgetrs    dppsvx    MA28    Sparse Cholesky
ExpB1
ExpB2

For the algorithm developed, the envelope reducing approach of reordering blocks in the matrix $B$ influences the efficiency of Lagrange multiplier computation. While for

the other methods a reordering process is irrelevant, for the sparse Cholesky algorithm it leads to a speed-up of more than 3 for a rather restrictive example of a closed loop mechanical system such as the Seven-Body Mechanism.

The proposed algorithm is especially attractive when open loop mechanisms such as the chain of 50 pendulums are considered. In this instance, since the reduced matrix is block diagonal, the proposed algorithm is extremely fast when compared to the dense alternatives provided by LAPACK. Compared with the sparse solver from the Harwell library, the speed-up obtained is about 4, and this was similar for both test problems. Surprisingly enough, the Harwell solver compares poorly with the dense solvers from LAPACK for the Seven-Body Mechanism. Apparently, the dimension of the problem (20 by 20) is small and the fill-in index for this problem is too high for the sparse solver to have an impact. In this case, however, a speed-up factor of 4 can be observed when comparing the proposed algorithm to all the other alternatives, dense or sparse.

Computing Accelerations in Minimal Representation

This Section presents an algorithm that takes advantage of the topology of the mechanical system modeled using a minimal set of generalized coordinates. It is shown that the generalized mass matrix, the so called composite inertia matrix (CIM), associated with any open or closed loop mechanism is positive definite. Based on this result, an algorithm that efficiently solves for accelerations is designed. Significant speed-ups are obtained, due both to the no fill-in factorization of the composite inertia matrix and to the degree of parallelism attainable with the new algorithm.

The discussion is organized as follows. First, the joint representation (JR) approach to modeling multibody systems is briefly outlined. The notation and relevant results of Tsai (1989) are referred to in this discussion. Next,

positive definiteness of the CIM for open and closed loop mechanisms is proved. The proposed technique of factoring the CIM is then presented, and the potential for parallel implementation of the proposed factorization is outlined. Results of numerical experiments aimed at showing the capabilities of the new approach conclude the discussion.

JR Modeling of Multibody Systems

Basic Concepts

The formalism behind JR modeling of multibody systems is complex. This Section provides a brief introduction to this topic, with the goal of defining quantities related to the issue of computing generalized accelerations. The interested reader is referred to the work of Tsai (1989) for a detailed description of the formalism.

The main concept in JR modeling is that body $j$ is viewed as being located and oriented relative to its inboard body $i$. Figure 7 shows a pair of connected bodies with general relative rotation and translation. The inboard body $i$ is located by the position vector $r_i$ from the origin of the global $xyz$ coordinate frame to the origin of the body $x'_iy'_iz'_i$ frame. The $x'_iy'_iz'_i$ frame is oriented by an orthogonal transformation matrix $A_i$, which transforms a vector in the body reference frame to the global reference frame. A joint $x''_{ij}y''_{ij}z''_{ij}$ reference frame is defined and fixed on body $i$ at the joint connection point $O''_{ij}$, which is located by the constant vector $s_{ij}$ from the origin of the $x'_iy'_iz'_i$ frame. A vector $d_{ij}$ is defined from the origin of the joint $x''_{ij}y''_{ij}z''_{ij}$ frame to the origin $O'_j$ of the $x'_jy'_jz'_j$ frame on the outboard body. Reference frames for each successive body in the kinematic chain are defined in the same way as those for body $i$.

The so called body reference frame is the reference frame located at the joint that connects the considered body to its inboard body. This definition applies to all bodies except the base body, which has no inboard body. For the base body, a body reference frame is attached at the center of gravity. The transformation from the inboard $x'_iy'_iz'_i$ body reference frame to the outboard body reference frame requires only the constant transformation to the $x''_{ij}y''_{ij}z''_{ij}$ joint reference frame, followed by a joint transformation to the $x'_jy'_jz'_j$ reference frame of body $j$. The reason for defining the body reference frame at the inboard joint is that every body, except for the base body, has one and only one inboard body in the kinematic chain, while it may have several outboard bodies.

Figure 7. A Pair of Connected Bodies in JR

The origin of the body reference frame at the inboard joint is uniquely defined. Given the position of body $i$, the origin of the body $j$ reference frame is located by the position vector $r_j$,

$$r_j = r_i + s_{ij} + d_{ij} \tag{3.127}$$

where $d_{ij} = A_iC_{ij}d''_{ij}(q_{ij})$ is a vector from the joint reference frame origin $O''_{ij}$ on body $i$ to the body reference frame origin $O'_j$ on body $j$, $q_{ij}$ is the vector of relative coordinates for the joint, and $C_{ij}$ is the constant orthogonal transformation matrix between the $O''_{ij}$ and $O'_i$ frames on body $i$. The angular velocity of body $j$ can be expressed as

$$\omega_j = \omega_i + \omega_{ij} \tag{3.128}$$

where $\omega_i$ is the angular velocity of body $i$, $\omega_j$ is the angular velocity of body $j$, and $\omega_{ij}$ is the angular velocity of body $j$ relative to body $i$. The vector $\omega_{ij}$ can be obtained from the relative coordinate velocity $\dot{q}_{ij}$ as

$$\omega_{ij} = H(A_i, q_{ij})\,\dot{q}_{ij} \tag{3.129}$$

where $H(A_i, q_{ij})$ is a transformation matrix that depends on the orientation of body $i$ and the vector $q_{ij}$ of relative coordinates, and is defined for each type of joint. The velocity of body $j$ can be found by differentiating Eq. (3.127) with respect to time to yield

$$\dot{r}_j = \dot{r}_i + \dot{s}_{ij} + \dot{d}_{ij} \tag{3.130}$$

where

$$\dot{d}_{ij} = \frac{d}{dt}\left(A_iC_{ij}d''_{ij}(q_{ij})\right) = \tilde{\omega}_i d_{ij} + \frac{\partial d_{ij}}{\partial q_{ij}}\dot{q}_{ij}$$

The tilde operator (as in $\tilde{\omega}_i$) signifies the skew symmetric vector product operator (Haug, 1989) applied to the vector $\omega_i$. After simple manipulations, Eqs. (3.128) and (3.130) can be combined in matrix form to yield

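$$ \begin{bmatrix} \dot r_j + \tilde r_j \omega_j \\ \omega_j \end{bmatrix} = \begin{bmatrix} \dot r_i + \tilde r_i \omega_i \\ \omega_i \end{bmatrix} + \begin{bmatrix} \dfrac{\partial d_{ij}}{\partial q_{ij}} + \tilde r_j H_{ij} \\ H_{ij} \end{bmatrix} \dot q_{ij} \tag{3.31} $$

with $H_{ij} \equiv H(A_i, q_{ij})$. Equation (3.31) can be expressed in the so-called state-vector notation (Tsai, 1989) as

$$ \hat Y_j = \hat Y_i + B_{ij} \dot q_{ij} \tag{3.32} $$

where the velocity state-vector of body i is defined as

$$ \hat Y_i = \begin{bmatrix} \dot r_i + \tilde r_i \omega_i \\ \omega_i \end{bmatrix} \tag{3.33} $$

and the velocity transformation matrix $B_{ij}$ between bodies i and j is defined as

$$ B_{ij} = \begin{bmatrix} \dfrac{\partial d_{ij}}{\partial q_{ij}} + \tilde r_j H_{ij} \\ H_{ij} \end{bmatrix} \tag{3.34} $$

For the base body, this matrix is the identity matrix of appropriate dimension. Finally, the acceleration state-vector of body j is obtained by differentiating Eq. (3.32) with respect to time, to obtain

$$ \dot{\hat Y}_j = \dot{\hat Y}_i + B_{ij} \ddot q_{ij} + \dot B_{ij} \dot q_{ij} \equiv \dot{\hat Y}_i + B_{ij} \ddot q_{ij} + D_{ij} \tag{3.35} $$

Once the velocity state-vector $\hat Y_j$ and the acceleration state-vector $\dot{\hat Y}_j$ are available, the Cartesian velocity and acceleration, $Y_j$ and $\dot Y_j$, for body j can be recovered by using the transformations $Y_j = T_j \hat Y_j$ and $\dot Y_j = T_j \dot{\hat Y}_j - R_j$, with $T_j$ and $R_j$ defined as

$$ T_j = \begin{bmatrix} I & -\tilde r_j \\ 0 & I \end{bmatrix}, \qquad R_j = -\dot T_j \hat Y_j = \begin{bmatrix} \dot{\tilde r}_j \omega_j \\ 0 \end{bmatrix} \tag{3.36} $$

The process of generating the equations of motion for an open-loop mechanical system model is outlined next, followed by the case of closed-loop mechanical systems.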
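The relation between the Cartesian velocity and the velocity state-vector in Eqs. (3.33) and (3.36) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `tilde`, `state_velocity`, and `T_matrix` and the sample values are illustrative and do not come from the thesis.

```python
import numpy as np

def tilde(a):
    """Skew-symmetric matrix such that tilde(a) @ b equals the cross product a x b."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def state_velocity(r, r_dot, omega):
    """Velocity state-vector of Eq. (3.33): [r_dot + tilde(r) omega; omega]."""
    return np.concatenate([r_dot + tilde(r) @ omega, omega])

def T_matrix(r):
    """Transformation T of Eq. (3.36): [[I, -tilde(r)], [0, I]]."""
    T = np.eye(6)
    T[:3, 3:] = -tilde(r)
    return T

# Hypothetical sample data for one body
r = np.array([0.4, -1.0, 2.5])
r_dot = np.array([1.0, 0.2, -0.3])
omega = np.array([0.1, -0.5, 0.8])

Y_hat = state_velocity(r, r_dot, omega)
Y = T_matrix(r) @ Y_hat          # recovers the Cartesian velocity [r_dot; omega]
```

The product `T_matrix(r) @ Y_hat` cancels the `tilde(r) @ omega` term, which is why the Cartesian velocity is recovered without any additional data.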

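Equations of Motion for a Tree-Structure System

The variational form of the equations of motion can be written in matrix notation as

$$ \delta Z^T \left( M \dot Y - Q \right) = 0 \tag{3.37} $$

where the Cartesian acceleration $\dot Y$, Cartesian virtual displacement $\delta Z$, modified mass matrix M, and generalized force Q are defined as

$$ \dot Y = \begin{bmatrix} \ddot r \\ \dot\omega \end{bmatrix}, \quad \delta Z = \begin{bmatrix} \delta r \\ \delta\pi \end{bmatrix}, \quad M = \begin{bmatrix} mI & -m\tilde\rho \\ m\tilde\rho & J \end{bmatrix}, \quad Q = \begin{bmatrix} F - m\tilde\omega\tilde\omega\rho \\ n - \tilde\omega J \omega \end{bmatrix} \tag{3.38} $$

In Eq. (3.38), $\rho$ is the vector from the body reference frame location $O'$ to the body center of gravity location. The corresponding state-vector form of Eq. (3.37) is

$$ \delta \hat Z^T \left( \hat M \dot{\hat Y} - \hat Q \right) = 0 \tag{3.39} $$

where, with the notations of Eq. (3.36),

$$ \delta \hat Z = T^{-1} \delta Z, \qquad \hat M = T^T M T, \qquad \hat Q = T^T \left( Q + M R \right) \tag{3.40} $$

For the multibody system of Figure 8, the variational form of the equations of motion assumes the form

$$ \sum_{i=1}^{p} \delta \hat Z_i^T ( \hat M_i \dot{\hat Y}_i - \hat Q_i ) + \sum_{i=p+1}^{m} \delta \hat Z_i^T ( \hat M_i \dot{\hat Y}_i - \hat Q_i ) + \sum_{i=m+1}^{n} \delta \hat Z_i^T ( \hat M_i \dot{\hat Y}_i - \hat Q_i ) \equiv EQ(1) + EQ(2) + EQ(3) = 0 \tag{3.41} $$

The state variation $\delta \hat Z_i$ of body i in chain 2 is expressed recursively, in terms of inboard joint relative coordinate variations $\delta q_k$ and the state variation $\delta \hat Z_p$ of the junction body, as

$$ \delta \hat Z_i = \delta \hat Z_p + \sum_{k=p+1}^{i} B_k \delta q_k $$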

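while the acceleration state $\dot{\hat Y}_i$ of body i is recursively expressed as

$$ \dot{\hat Y}_i = \dot{\hat Y}_p + \sum_{k=p+1}^{i} \left( B_k \ddot q_k + D_k \right) $$

with $B_k$ and $D_k$ defined in Eqs. (3.34) and (3.35).

Figure 8. A Tree Structure (Chain 1: bodies 1 to p; Chain 2: bodies p+1 to m; Chain 3: bodies m+1 to n)

Concentrating on chain 2, direct manipulations bring Eq. (3.41) to the form

$$ EQ(1) + EQ(3) + \delta \hat Z_p^T \left[ K_{p+1} \dot{\hat Y}_p + \sum_{k=p+1}^{m} K_k B_k \ddot q_k - ( L_{p+1} - K_{p+1} D_{p+1} ) \right] + \sum_{i=p+1}^{m} \delta q_i^T \left[ B_i^T K_i \dot{\hat Y}_p + B_i^T \sum_{k=p+1}^{m} K_v B_k \ddot q_k - B_i^T \left( L_i - K_i \sum_{j=p+1}^{i} D_j \right) \right] = 0 \tag{3.42} $$

where the subscript v of $K_v$ is i, if i > k, or k if i < k. The composite mass and force matrices K and L are recursively defined as

$$ K_i = K_{i+1} + \hat M_i, \qquad L_i = L_{i+1} - K_{i+1} D_{i+1} + \hat Q_i \tag{3.43} $$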
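The backward recursion of Eq. (3.43) along a single chain can be sketched as follows. This is an illustrative sketch only, assuming NumPy and randomly generated 6x6 symmetric positive definite matrices standing in for the reduced mass matrices; the name `composite_chain` and the data are hypothetical. List index 0 corresponds to the most inboard body of the chain and the last index to the end body.

```python
import numpy as np

def composite_chain(M_hat, Q_hat, D):
    """Composite mass K_i and force L_i along one chain, per Eq. (3.43).

    The recursion starts at the end body (last list entry), where
    K = M_hat and L = Q_hat, and proceeds inward.
    """
    n = len(M_hat)
    K = [None] * n
    L = [None] * n
    K[-1] = M_hat[-1].copy()
    L[-1] = Q_hat[-1].copy()
    for i in range(n - 2, -1, -1):
        K[i] = K[i + 1] + M_hat[i]
        L[i] = L[i + 1] - K[i + 1] @ D[i + 1] + Q_hat[i]
    return K, L

# Hypothetical data for a chain of four bodies
rng = np.random.default_rng(0)
n_bodies = 4
M_hat = []
for _ in range(n_bodies):
    A = rng.standard_normal((6, 6))
    M_hat.append(A @ A.T + 6.0 * np.eye(6))   # symmetric positive definite
Q_hat = [rng.standard_normal(6) for _ in range(n_bodies)]
D = [rng.standard_normal(6) for _ in range(n_bodies)]

K, L = composite_chain(M_hat, Q_hat, D)
```

Unrolling the recursion shows that the composite mass of the most inboard body is the sum of all reduced mass matrices outboard of it (itself included), which is the property used later when the positive definiteness of the composite inertia matrix is discussed.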

The recursion starts from the last body in the chain; in this case the end body m, and proceeds inward along the chain toward the base body. For the end body, the composite mass and force matrices $K_m$ and $L_m$ are identical to the state-space reduced mass matrix $\hat M_m$ and force matrix $\hat Q_m$, respectively. For chains 3 and 1, the same steps as for chain 2 are followed, expressing the state variations $\delta \hat Z_i$ of body i in terms of the inboard joint relative coordinate variations $\delta q_k$ and state variation $\delta \hat Z_p$ for chain 3 and $\delta \hat Z_1$ for chain 1. State-vector accelerations $\dot{\hat Y}_i$ are generated recursively along the chain toward the base body. For junction body p, the composite mass and force matrices are defined as

$$ K_p = \hat M_p + K_{p+1} + K_{m+1}, \qquad L_p = \hat Q_p + ( L_{p+1} - K_{p+1} D_{p+1} ) + ( L_{m+1} - K_{m+1} D_{m+1} ) \tag{3.44} $$

After direct manipulation, the variational equations of motion assume the form

$$ \begin{aligned} & \delta \hat Z_1^T \left[ K_1 \dot{\hat Y}_1 - L_1 + \sum_{k=2}^{n} K_k B_k \ddot q_k \right] \\ & + \sum_{i=2}^{p} \delta q_i^T \left[ B_i^T K_i \dot{\hat Y}_1 + B_i^T \sum_{k=2}^{n} K_v B_k \ddot q_k - B_i^T \left( L_i - K_i \sum_{j=2}^{i} D_j \right) \right] \\ & + \sum_{i=p+1}^{m} \delta q_i^T \left[ B_i^T K_i \dot{\hat Y}_1 + B_i^T \sum_{k=2}^{p} K_v B_k \ddot q_k + B_i^T \sum_{k=p+1}^{m} K_v B_k \ddot q_k - B_i^T \left( L_i - K_i \Big( \sum_{j=2}^{p} D_j + \sum_{j=p+1}^{i} D_j \Big) \right) \right] \\ & + \sum_{i=m+1}^{n} \delta q_i^T \left[ B_i^T K_i \dot{\hat Y}_1 + B_i^T \sum_{k=2}^{p} K_v B_k \ddot q_k + B_i^T \sum_{k=m+1}^{n} K_v B_k \ddot q_k - B_i^T \left( L_i - K_i \Big( \sum_{j=2}^{p} D_j + \sum_{j=m+1}^{i} D_j \Big) \right) \right] = 0 \end{aligned} \tag{3.45} $$

Equation (3.45) is obtained by starting from Eq. (3.41) and recursively expressing state-vector virtual displacements and accelerations. Equating to zero the expressions multiplying arbitrary virtual displacements in the variational form of the equations of motion produces the differential equations of motion of the open loop

mechanism in Figure 8. For any tree-structured mechanism, the equations of motion are obtained following the steps outlined above.

Equations of Motion for a Closed-Loop System

If a mechanism contains closed loops, a number of joints are cut in the process of obtaining a spanning tree. Constraints should therefore be imposed in order to preserve the behavior of the mechanism. Consequently, the virtual displacements in Eq. (3.45) are no longer arbitrary. They are related through the constraint equations. Thus, Eq. (3.45), expressed in matrix form as

$$ \delta q^T ( M \ddot q - Q ) = 0 \tag{3.46} $$

should hold for any virtual displacement satisfying $\Phi_q \delta q = 0$. It is assumed here that the collection of all cut joints is replaced by an equivalent set of constraint equations that assumes the form $\Phi(q) = 0$. In this framework, using the Lagrange multiplier theorem (Haug, 1989), and taking into account the constraint acceleration equations, the equations of motion for closed-loop mechanical systems assume the form

$$ \begin{bmatrix} M & \Phi_q^T \\ \Phi_q & 0 \end{bmatrix} \begin{bmatrix} \ddot q \\ \lambda \end{bmatrix} = \begin{bmatrix} Q \\ \tau \end{bmatrix} \tag{3.47} $$

Positive Definiteness of CIM

Two properties of the composite inertia matrix M (CIM), the special structure (or sparsity pattern) and the form of the entries, are used to prove its positive definiteness. Both these properties are dictated by the topology of the system, as reflected by the equations of motion. In JR, a graph is associated with each mechanism by considering its

bodies and joints as being vertices and connecting edges of the graph, respectively. In addition to the graph concepts introduced by Tsai (1989), a few others are defined in what follows.

Two concepts introduced earlier induce a direction in the spanning tree that is obtained after cutting a required number of joints. These concepts are the inboard-outboard relationship between pairs of neighbor bodies, along with the fact that each body in the spanning tree has one and only one inboard body, whereas it can have an arbitrary number of outboard bodies. In the following, the directed spanning tree associated with the cut joint mechanism is referred to as the spanning tree, unless otherwise stated.

Body j is a descendant of body i if there is a path in the spanning tree from body i to body j. The family of body i, denoted by F(i), is the set of all descendants of body i. If body i is added to F(i), then the new collection of bodies is denoted by F[i]. The sequence of bodies on the unique path starting from body i and ending at body j is called a subchain, denoted d[i, j]. If body i is left out of this sequence, the subchain is denoted by d(i, j]. The subchains d[i, j) and d(i, j) are defined similarly. The subchain starting at the base body and ending at body i is denoted by d[i]. Note that a subchain inherits its order from the spanning tree, while a family does not. By concatenating subchain d[i] to family F(i), the subtree c(i) is obtained, which will be ordered from the base body outward to body i. If b is the base body, the subtree c(b) represents the entire spanning tree and, for convenience, it is denoted simply by S. Finally, body i is a leaf of the spanning tree if it has no outboard bodies.

The results below are based on the work of Negrut, Serban, and Potra (1998).
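The graph notions defined above can be made concrete with a small example. The sketch below is purely illustrative: it assumes a hypothetical six-body spanning tree stored as a `parent` map (inboard body of each body), and the helper names `subchain`, `family`, `subtree`, and `leaves` are not taken from the thesis.

```python
# Hypothetical spanning tree: parent[b] is the inboard body of body b.
# Body 1 is the base body; bodies 3, 4, and 6 are leaves.
parent = {1: None, 2: 1, 3: 2, 4: 2, 5: 1, 6: 5}

def subchain(i):
    """d[i]: bodies on the path from the base body to body i, in tree order."""
    path = []
    while i is not None:
        path.append(i)
        i = parent[i]
    return path[::-1]

def family(i, closed=False):
    """F(i): all descendants of body i; with closed=True, F[i], which also contains i."""
    fam = {b for b in parent if b != i and i in subchain(b)}
    return fam | {i} if closed else fam

def subtree(i):
    """c(i): subchain d[i] concatenated with family F(i)."""
    return subchain(i) + sorted(family(i))

def leaves():
    """Bodies with no outboard bodies."""
    return sorted(b for b in parent if not family(b))
```

For this tree, `subtree(1)` lists every body, which matches the statement that c(b) for the base body b is the whole spanning tree S.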

Lemma 1. Let $x_w$ be a matrix or vector quantity associated with body w and $y_s$ be a matrix or vector quantity associated with body s, such that the multiplication $x_w y_s$ is well defined. Then,

$$ \sum_{s \in F(r)} \sum_{w \in F[s]} x_w y_s = \sum_{w \in F(r)} \sum_{s \in d(r, w]} x_w y_s $$

where r is an arbitrary body of the spanning tree.

The right-side term is actually a reordering of the summation in the left side. Based on Lemma 1, taking r to be a fictitious inboard body of the base body, the following result is obtained.

Corollary 1. With $y_s^T$ and $x_w$ vectors of appropriate dimension associated with bodies s and w, respectively,

$$ \sum_{s \in S} \sum_{w \in F[s]} y_s^T x_w = \sum_{w \in S} \sum_{s \in d[w]} y_s^T x_w $$

Generally, the unknowns related to body u occupy position i in the vector of unknowns. A permutation p is defined such that, for each body, it gives its position in the global vector of unknowns, p(u) = i. If u and v are two bodies of the spanning tree, i = p(u) and j = p(v), the block entry (i, j) of the CIM of Eq. (3.47) is given as

$$ M[i, j] = \begin{cases} B_u^T K_u B_v & \text{if } v \in d[u] \\ B_u^T K_v B_v & \text{if } v \in F(u) \\ 0 & \text{if } v \notin c(u) \end{cases} \tag{3.48} $$

The matrix B is defined in Eq. (3.34), while the K matrix is defined in Eqs. (3.43) and (3.44). Rather than dealing with scalar entries, M is defined in terms of blocks. The number of rows and columns of block M[i, j] is equal to the dimension of $q_i$ and $q_j$, respectively. Generally, if N is the number of bodies in the system and $n_i$ is the

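dimension of the generalized coordinate vector $q_i$, $q_i \in \mathbb{R}^{n_i}$, then $M \in \mathbb{R}^{n \times n}$, with $n = \sum_{i=1}^{N} n_i$.

Proposition 2. The composite inertia matrix M defined in Eq. (3.48) is positive definite.

Defining $v = [v_1^T, v_2^T, \ldots, v_N^T]^T$, it is to be proved that $\frac{1}{2} v^T M v \ge 0$ and that equality is obtained only if v = 0. Considering the vector z = Mv and denoting $y_s \equiv B_s v_s$, the block of z associated with body r is

$$ \begin{aligned} z(r) &= B_r^T K_r \sum_{s \in d[r]} y_s + B_r^T \sum_{s \in F(r)} K_s y_s \\ &= B_r^T \sum_{w \in F[r]} \hat M_w \sum_{s \in d[r]} y_s + B_r^T \sum_{s \in F(r)} \sum_{w \in F[s]} \hat M_w y_s \end{aligned} $$

where the second line uses the fact that, by the recursions of Eqs. (3.43) and (3.44), $K_r = \sum_{w \in F[r]} \hat M_w$. Using the result of Lemma 1 and defining $u_w \equiv \sum_{s \in d[w]} y_s$,

$$ \begin{aligned} z(r) &= B_r^T \sum_{w \in F[r]} \sum_{s \in d[r]} \hat M_w y_s + B_r^T \sum_{w \in F(r)} \sum_{s \in d(r, w]} \hat M_w y_s \\ &= B_r^T \hat M_r \sum_{s \in d[r]} y_s + B_r^T \sum_{w \in F(r)} \sum_{s \in d[w]} \hat M_w y_s = B_r^T \sum_{w \in F[r]} \sum_{s \in d[w]} \hat M_w y_s \\ &= B_r^T \sum_{w \in F[r]} \hat M_w \sum_{s \in d[w]} y_s = B_r^T \sum_{w \in F[r]} \hat M_w u_w \end{aligned} $$

Finally, using the result of Corollary 1,

$$ \begin{aligned} \frac{1}{2} v^T M v &= \frac{1}{2} \sum_{r \in S} \sum_{w \in F[r]} y_r^T \hat M_w u_w = \frac{1}{2} \sum_{w \in S} \sum_{r \in d[w]} y_r^T \hat M_w u_w \\ &= \frac{1}{2} \sum_{w \in S} u_w^T \hat M_w u_w = \frac{1}{2} \sum_{w \in S} \| u_w \|_{\hat M_w}^2 \end{aligned} \tag{3.49} $$

In Eq. (3.49), $\| \cdot \|_{\hat M_w}$ defines a norm, since $\hat M_w$ is positive definite. To see this, first note that the state-vector-reduced matrix assumes the form

$$ \hat M_w = T^T M T = T^T H M_c H^T T $$

with M and T defined as in Eqs. (3.38) and (3.36), and $M_c$ and H given by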

$$ M_c = \begin{bmatrix} mI & 0 \\ 0 & J_c \end{bmatrix}, \qquad H = \begin{bmatrix} I & 0 \\ \tilde\rho & I \end{bmatrix} \tag{3.50} $$

where $J_c$ is the body inertia matrix with respect to a reference frame located at the centroid of the body and parallel to the body reference frame. Therefore, $J_c$ is positive definite. Since matrices H and T are nonsingular, and $M_c$ is positive definite, it results that $\hat M_w$ is positive definite.

The last sum in Eq. (3.49) is zero only when $u_w = 0$, for all $w \in S$. If this is the case, then $y_w = 0$, for all $w \in S$. Equivalently, $B_w v_w = 0$, for all $w \in S$. Since proper modeling in JR requires $B_w$ to be of full column rank, it is concluded that $v_w = 0$, for all $w \in S$. This completes the proof. ∎

Taking

$$ v = [v_1^T, v_2^T, \ldots, v_N^T]^T \equiv [\dot q_1^T, \dot q_2^T, \ldots, \dot q_N^T]^T \tag{3.51} $$

defines the quadratic form $\frac{1}{2} v^T M v$ as representing the kinetic energy of the tree-structured mechanism, in the state-vector space. Note that $u_w$, as defined above, represents the state-vector velocity of body w; i.e., the vector previously denoted by $\hat Y_w$. Hence, Eq. (3.49) implies that the kinetic energy of the system in the state-vector space is equal to the sum of the kinetic energy in the state-vector space of all bodies in the system. Based on Eq. (3.40) and the fact that $Y_w = T_w \hat Y_w$, the following holds.

Corollary 2. The kinetic energy of each body in the system is the same as its kinetic energy in the state-vector space. Therefore the kinetic energy of the tree-structured mechanism has the same property.
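The identity behind Eq. (3.49) can be illustrated on a scalar analogue. The sketch below makes strong simplifying assumptions that are not in the thesis: each joint has one coordinate, every B matrix is the scalar 1, and each reduced mass matrix is a positive scalar `mass[w]`. Under these assumptions the block entry of Eq. (3.48) reduces to the sum of the masses of all bodies whose subchain contains both bodies. The tree, the masses, and the function names are hypothetical.

```python
# Hypothetical spanning tree and scalar "reduced masses"
parent = {1: None, 2: 1, 3: 2, 4: 2, 5: 1, 6: 5}
mass = {1: 4.0, 2: 2.0, 3: 1.0, 4: 1.5, 5: 3.0, 6: 0.5}

def subchain(w):
    """Set of bodies on the path from the base body to body w (d[w])."""
    out = set()
    while w is not None:
        out.add(w)
        w = parent[w]
    return out

def cim_entry(u, v):
    """Scalar analogue of Eq. (3.48): composite mass shared by bodies u and v."""
    return sum(mass[w] for w in parent if u in subchain(w) and v in subchain(w))

def quadratic_form(vel):
    """v^T M v computed entry by entry from the composite inertia matrix."""
    return sum(vel[a] * cim_entry(a, b) * vel[b] for a in parent for b in parent)

def energy_sum(vel):
    """Sum over bodies of mass * u_w^2, with u_w the sum of joint rates along d[w]."""
    return sum(mass[w] * sum(vel[s] for s in subchain(w)) ** 2 for w in parent)

vel = {1: 0.3, 2: -1.2, 3: 0.7, 4: 2.0, 5: -0.4, 6: 1.1}
```

In this setting `cim_entry(3, 4)` is zero because bodies 3 and 4 are neither inboard nor outboard of each other, and the two ways of evaluating the quadratic form agree up to rounding.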

Factoring the Augmented Matrix

In the most general case of a mechanism containing closed loops, the augmented matrix that must be factored is the coefficient matrix of the linear system of Eq. (3.47), denoted here by A. In the case of a tree-structured mechanism, the augmented matrix is identical to the composite inertia matrix. Block Cholesky factorization of the matrix A will result in an $LL^T$ decomposition, which is detailed below.

Cholesky factorization is not directly applicable when closed loops are present in the mechanical system, since the augmented matrix is not positive definite, due to the presence of the constraint Jacobian. The augmented matrix will therefore be factored as follows:

$$ A \equiv \begin{bmatrix} M & \Phi_q^T \\ \Phi_q & 0 \end{bmatrix} = \begin{bmatrix} L & 0 \\ T^T & I_m \end{bmatrix} \begin{bmatrix} I_n & 0 \\ 0 & -T^T T \end{bmatrix} \begin{bmatrix} L^T & T \\ 0 & I_m \end{bmatrix} \tag{3.52} $$

where L is obtained from the Cholesky factorization of $M = LL^T$, and $T \in \mathbb{R}^{n \times m}$ is the solution of the matrix equation

$$ L T = \Phi_q^T \tag{3.53} $$

The matrix $T^T T$ is positive definite since the Jacobian of the position kinematic constraint equations is assumed to have full row rank.

Factoring Composite Inertia Matrix

Contrary to general perception, sparsity in the CIM is not lost, and furthermore it is structured. Qualitatively, the sparsity is dictated by the topology of the mechanism, and in the case of closed loop mechanisms also by the cut joints. The major problem with direct algorithms for sparse systems is that, to a certain extent, sparsity is lost during factorization. The proposed algorithm preserves the sparsity pattern of CIM; i.e., no fill-in occurs during the Cholesky factorization. For this to happen, a renumbering of the bodies of the model may be necessary. In a sense, the strategy required to exploit sparsity in the JR formulation is complementary to the one introduced for the Cartesian representation. While in the latter case the joints were renumbered, in the JR formulation the bodies are renumbered to yield a certain order of the unknowns. If the vector of unknowns is of the form $\ddot q \equiv [\ddot q_1^T, \ddot q_2^T, \ldots, \ddot q_N^T]^T$, a block permutation defined by a block permutation matrix P is applied to $\ddot q$, to obtain

$$ \ddot q_{new} = P \ddot q = [\ddot q_{b(1)}^T, \ddot q_{b(2)}^T, \ldots, \ddot q_{b(N)}^T]^T \tag{3.54} $$

where b is a permutation such that, if body u is assigned via the permutation P position i in $\ddot q_{new}$, then b(i) = u. Note that b is the inverse of the permutation p introduced before defining the entries of the composite inertia matrix in Eq. (3.48).

Lemma 2. Let u and v be two bodies in the spanning tree associated with a mechanical system. No fill-in occurs during the classical block Cholesky factorization of CIM if p(u) < p(v) whenever $u \in F(v)$.

To prove this result the symmetry and the block structure of the composite inertia matrix are used. Matrix M is factored using a block-oriented Cholesky factorization and it is to be shown that blockwise, L[k, j] = 0, if M[k, j] = 0. L[1, 1] is determined from $L[1,1] L^T[1,1] = M[1,1]$ and, because of the positive definiteness of M, it is not identically zero. The matrix L[k, 1] is the solution of $L[1,1] L^T[k,1] = M[k,1]$, $1 < k \le N$, and clearly L[k, 1] = 0 when M[k, 1] = 0. Suppose that, up to column j - 1, the conclusion of Lemma 2 holds. The positive definiteness of M implies that L[j, j], obtained from

$$ L[j,j] L^T[j,j] = M[j,j] - \sum_{i=1}^{j-1} L[j,i] L^T[j,i] $$

is not identically zero. For k > j, L[k, j] is the solution of

$$ L[j,j] L^T[k,j] = M[k,j] - \sum_{i=1}^{j-1} L[j,i] L^T[k,i] \tag{3.55} $$

Let M[k, j] = 0, and suppose there is an i, i < j < k, such that $L[j,i] L^T[k,i] \ne 0$. Then $M[i,j] \ne 0$ and $M[i,k] \ne 0$. Let bodies u, v, and w be such that p(u) = i, p(v) = j, and p(w) = k. Because of the way the precedence is defined in the vector of unknowns, this implies that $v \in d[u]$ and $w \in d[u]$. However, since j < k, $w \in d[v]$. Therefore, $M[k,j] \ne 0$, which is a contradiction. Consequently, the right side of Eq. (3.55) is zero. Then, $L[j,i] L^T[k,i] = 0$ and L[k, j] = 0. This completes the proof. ∎

The reordering induced by the permutation array p is not unique, and developing one is straightforward. A good initial joint numbering, based on the preceding lemma, ensures a factorization of the CIM with no fill-in for the entire simulation.

The Closed-loop Case

In the case of a closed-loop mechanical system, M joints connecting bodies in the system are cut to obtain a spanning tree of the mechanism. A set of constraint equations $\Phi^{(1)} = \Phi^{(2)} = \ldots = \Phi^{(M)} = 0$ is imposed to account for the cut joints. Earlier, the collection of constraint equations was denoted by $\Phi = [\Phi^{(1)T}, \Phi^{(2)T}, \ldots, \Phi^{(M)T}]^T$, with $m = \sum_{j=1}^{M} m_j$ and $\Phi^{(j)} \in \mathbb{R}^{m_j}$. With a columnwise partition of T in Eq. (3.53), the component $T^{(j)} \in \mathbb{R}^{n \times m_j}$ is obtained by solving $L T^{(j)} = \Phi_q^{(j)T}$, $1 \le j \le M$. With the precedence in the vector of unknowns induced by Lemma 2, the following result singles out the zeros of the matrix T.

Lemma 3. Let constraint j account for the cut joint between bodies k and l, and let i = p(u), with $u \notin d[k] \cup d[l]$. Then $T^{(j)}[i] = 0$.

For proof, let i be the smallest integer that satisfies the hypothesis of the lemma and yet violates the conclusion. Row i of $L T^{(j)} = \Phi_q^{(j)T}$ is given by

$$ \sum_{v=1}^{i-1} L[i,v] \, T^{(j)}[v] + L[i,i] \, T^{(j)}[i] = \Phi_{q_i}^{(j)T} $$

Since $u \notin d[k] \cup d[l]$, $\Phi_{q_i}^{(j)} = 0$ and, therefore, i cannot be 1. Suppose there exists v, $1 \le v < i$, such that $L[i,v] \, T^{(j)}[v] \ne 0$. With the precedence induced by Lemma 2 and with w defined by w = b(v), $w \in F(u)$. Therefore, $u \in d[w]$. Finally, since $T^{(j)}[v] \ne 0$ and v < i, $w \in d[k] \cup d[l]$, which implies that $u \in d[k] \cup d[l]$. However, $u \notin d[k] \cup d[l]$, which is a contradiction. This completes the proof. ∎

Algorithm for Factorization of Composite Inertia Matrix

The proposed algorithm is based on the decomposition in Eq. (3.52). The ordering of elements of the unknown vector induced by Lemma 2 is assumed.

Algorithm 4
Step 0    Set $y_n = 0$, $y_n \in \mathbb{R}^n$
Step 1    Factor $M = LL^T$
Step 2    Solve $L z_n = Q^A$, $z_n \in \mathbb{R}^n$
IF (closed-loop) then:
Step 3.1    Solve $L T = \Phi_q^T$, $T \in \mathbb{R}^{n \times m}$
Step 3.2    Set $z_m = T^T z_n - \tau$, $z_m \in \mathbb{R}^m$
Step 3.3    Compute $T^T T$
Step 3.4    Solve $T^T T \lambda = z_m$, $\lambda \in \mathbb{R}^m$
Step 3.5    Set $y_n = T \lambda$

ENDIF
Step 4    Solve $L^T \ddot q = z_n - y_n$

During Step 1, CIM is factored using block Cholesky with no fill-in. Therefore, the sparsity pattern for L is known beforehand. Solving for $z_n$ requires only forward substitution. Without taking sparsity into account in Step 2, row i of the linear system $L z_n = Q^A$ would be

$$ \sum_{v=1}^{i-1} L[i,v] \, z_n[v] + L[i,i] \, z_n[i] = Q^A[i] $$

A reduced number of operations results if, when solving for $z_n[i]$, in the light of Lemma 2, row i is equivalently expressed as

$$ \sum_{k \in F(b(i))} L[i, p(k)] \, z_n[p(k)] + L[i,i] \, z_n[i] = Q^A[i] $$

The same result holds when backward substitution is used to retrieve the accelerations $\ddot q$ in Step 4.

Using the result of Lemma 3, further advantage can be taken of the problem structure when computing the matrix T during Step 3.1. Some entries of this matrix are known beforehand to be zero and need not be computed. The coefficient matrix in $T^T T \lambda = z_m$ is dense and positive definite, and it is factored via Cholesky decomposition.

In general, it is not possible to give an operation count for the proposed algorithm. The number of operations will depend on the topology of the particular mechanical system model being considered. A later Section presents a comparison, in terms of number of operations and CPU time, of several alternatives to compute generalized accelerations and Lagrange multipliers when dealing with a vehicle model.
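The steps of Algorithm 4 can be sketched with dense linear algebra, in the spirit of a dense variant that ignores topology. This is a minimal sketch under stated assumptions, not the thesis implementation: it assumes NumPy, uses the general solver `np.linalg.solve` for the triangular systems (a dedicated triangular solver would be used in practice), and the function name `solve_augmented` and the random test data are hypothetical.

```python
import numpy as np

def solve_augmented(M, Phi_q, Q, tau):
    """Dense sketch of Algorithm 4 for [[M, Phi_q^T], [Phi_q, 0]] [qdd; lam] = [Q; tau]."""
    n, m = M.shape[0], Phi_q.shape[0]
    y_n = np.zeros(n)                                  # Step 0
    L = np.linalg.cholesky(M)                          # Step 1: M = L L^T
    z_n = np.linalg.solve(L, Q)                        # Step 2: L z_n = Q
    lam = np.zeros(0)
    if m > 0:                                          # closed-loop case
        T = np.linalg.solve(L, Phi_q.T)                # Step 3.1: L T = Phi_q^T
        z_m = T.T @ z_n - tau                          # Step 3.2
        TtT = T.T @ T                                  # Step 3.3
        Lt = np.linalg.cholesky(TtT)                   # Step 3.4: Cholesky of T^T T
        lam = np.linalg.solve(Lt.T, np.linalg.solve(Lt, z_m))
        y_n = T @ lam                                  # Step 3.5
    qdd = np.linalg.solve(L.T, z_n - y_n)              # Step 4: L^T qdd = z_n - y_n
    return qdd, lam

# Hypothetical test problem: 5 coordinates, 2 constraints
rng = np.random.default_rng(1)
n, m = 5, 2
A0 = rng.standard_normal((n, n))
M = A0 @ A0.T + n * np.eye(n)          # symmetric positive definite
Phi_q = rng.standard_normal((m, n))    # full row rank with probability one
Q = rng.standard_normal(n)
tau = rng.standard_normal(m)

qdd, lam = solve_augmented(M, Phi_q, Q, tau)

# Reference solution from the assembled augmented matrix
A = np.block([[M, Phi_q.T], [Phi_q, np.zeros((m, m))]])
ref = np.linalg.solve(A, np.concatenate([Q, tau]))
```

The two Cholesky factorizations act on positive definite matrices only, so no pivoting is needed, whereas the assembled matrix `A` is indefinite and the reference solve relies on a pivoted factorization.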

Taking Advantage of Parallelism

In this Section, it is assumed that the precedence in the vector of unknowns is as in Lemma 2. Let j be an integer such that $1 \le j \le N$, where N is the number of bodies in the mechanical system model. Define

$$ D(j) \equiv \{ i \mid i < j \text{ and } b(i) \in F(b(j)) \}, \qquad U(j) \equiv \{ i \mid i \le N \text{ and } b(i) \in d[b(j)] \} $$

Lemma 4. During Cholesky factorization of the CIM, for any j, $j \le k \le N$ and $k \in U(j)$, L[k, j] can be computed, provided that for each $i \in D(j)$, L[l, i] is available for $l \in U(i)$.

For proof, let v be the body in the system for which p(v) = j. It is first shown that the conclusion holds if j corresponds to a leaf v. Then it is shown that it holds for any j. One stage of the block Cholesky algorithm is defined as follows:

$$ L[j,j] L^T[j,j] = M[j,j] - \sum_{i=1}^{j-1} L[j,i] L^T[j,i] \tag{3.56} $$

$$ L[j,j] L^T[k,j] = M[k,j] - \sum_{i=1}^{j-1} L[j,i] L^T[k,i] \tag{3.57} $$

Equation (3.56) is solved for L[j, j], and then L[k, j], $j < k \le N$, is computed from Eq. (3.57).

Let j be such that v, with p(v) = j, is a leaf. Then, $D(j) = \emptyset$. For i < j, let u be such that p(u) = i. Then, $u \notin F(v)$, because otherwise $D(j) \ne \emptyset$. Likewise, $u \notin d[v]$, since p(u) < p(v), and this would violate the precedence induced by Lemma 2. Therefore, $u \notin c(v)$, and with the definition of M[j, i] given in Eq. (3.48), along with the result of Lemma 2, L[j, i] = 0 and L[j, j] is the solution of $L[j,j] L^T[j,j] = M[j,j]$. Furthermore, since L[j, i] = 0 for i < j, L[k, j] is the solution of $L[j,j] L^T[k,j] = M[k,j]$,

$j < k \le N$. Thus, for the case of j corresponding to a leaf of a tree, L[k, j] can be computed for $k \in U(j)$.

For the last step of the proof, let j be such that $D(j) \ne \emptyset$; i.e., v in p(v) = j is not a leaf of the tree. Assume that for each $i \in D(j)$, L[l, i] is available for $l \in U(i)$. It is to be shown that L[k, j] can be computed for $k \in U(j)$. With i being the summation index in Eq. (3.56), let u be such that p(u) = i. There are two alternatives, relative to the position of u in the spanning tree of the mechanism: $u \notin F(v)$, or $u \in F(v)$. In the first case, $u \notin d[v]$. Otherwise, the precedence induced by Lemma 2 is violated. Then, $u \notin c(v)$ and again, with the definition of M[j, i] along with the result of Lemma 2, L[j, i] = 0. On the other hand, if $u \in F(v)$, then $j \in U(i)$. Therefore, L[j, i] is known. Consequently, one can evaluate the summation in Eq. (3.56) and obtain the value L[j, j].

Finally, to compute L[k, j] for $k \in U(j)$, it is necessary to evaluate the summation in Eq. (3.57). It was shown above that L[j, i] is either identically zero (in this case L[k, i] need not be evaluated) or known (when $j \in U(i)$). In the latter situation, the value of L[k, i] is needed. If $j \in U(i)$, since $k \in U(j)$, $k \in U(i)$ as well. Consequently, L[k, i] is known and the summation can be evaluated. This completes the proof. ∎

When the Cholesky factorization progresses through the sequence described by Eqs. (3.56) and (3.57), the process moves columnwise. Each column is filled, starting from the diagonal element and proceeding down to the last row of the matrix. Lemma 4 states that, based on a certain amount of information, some of the entries of column j, namely L[k, j] with $k \in U(j)$, can be computed. The other entries are identically zero, since when $k \notin U(j)$ and $j < k \le N$, L[k, j] = 0.
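The effect of the body ordering of Lemma 2 on fill-in can be observed on a small scalar model. The sketch below is illustrative only and rests on assumptions not made in the thesis: scalar joints with B = 1 and unit reduced masses, so that the entry of Eq. (3.48) for two bodies is the number of bodies whose subchain contains both. The tree, the two orderings, and the helper names are hypothetical; a plain column-oriented Cholesky is written out so that the sequence of operations matches Eqs. (3.56) and (3.57).

```python
import math

# Hypothetical spanning tree: parent[b] is the inboard body of body b
parent = {1: None, 2: 1, 3: 2, 4: 2, 5: 1, 6: 5}

def path_to_base(b):
    """Set of bodies on the path from body b to the base body."""
    out = set()
    while b is not None:
        out.add(b)
        b = parent[b]
    return out

def scalar_cim(order):
    """Scalar CIM for the given ordering; order[i] is the body at position i."""
    paths = {w: path_to_base(w) for w in parent}
    n = len(order)
    return [[float(sum(1 for w in parent
                       if order[i] in paths[w] and order[j] in paths[w]))
             for j in range(n)] for i in range(n)]

def cholesky(M):
    """Column-oriented Cholesky factorization, following Eqs. (3.56)-(3.57)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        L[j][j] = math.sqrt(M[j][j] - sum(L[j][i] ** 2 for i in range(j)))
        for k in range(j + 1, n):
            s = sum(L[j][i] * L[k][i] for i in range(j))
            L[k][j] = (M[k][j] - s) / L[j][j]
    return L

def fill_in(order):
    """Number of entries that are zero in M but nonzero in its Cholesky factor."""
    M = scalar_cim(order)
    L = cholesky(M)
    n = len(order)
    return sum(1 for k in range(n) for j in range(k)
               if M[k][j] == 0.0 and abs(L[k][j]) > 1e-12)

leaf_first = [3, 4, 2, 6, 5, 1]   # every descendant precedes its ancestors
base_first = [1, 2, 3, 4, 5, 6]   # base body eliminated first
```

With the base body eliminated first, its dense column couples bodies in different branches and fill-in appears, while the leaf-first ordering leaves the sparsity pattern of the factor identical to that of the matrix.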

The results in this Section give a practical way in which the CIM can be factored; column j can be computed once the columns i = p(u) corresponding to all $u \in F(v)$ have been computed. In other words, the factorization progresses independently from the tree-end bodies (leaves) toward the root of the tree. These observations yield the following corollary, based on the result of Lemma 4.

Corollary 3. Let u and v be two bodies in the spanning tree associated with a mechanical system, with $u \notin F(v)$ and $v \notin F(u)$. Then, the columns k = p(w) for $w \in F(u)$ and l = p(t) for $t \in F(v)$ of the matrix L in the Cholesky factorization of CIM can be computed in parallel.

Note that there is another stage in Algorithm 4 that can be easily parallelized, since any forward or backward substitution involving the matrix L can be done in parallel.

Numerical Experiments

Numerical experiments are performed on a 14-body model of the Army's High Mobility Multipurpose Wheeled Vehicle (HMMWV) (Serban, Negrut, Haug, 1998). The topological graph of the mechanical system model is shown in Figure 9. The bodies of the model are represented as the vertices of the graph, while the joints are the connecting edges. Here, R stands for revolute joint, T for translational joint, S for spherical joint, and D for distance constraint. The bodies used to model the vehicle are listed in Table 5.

For this problem, the coefficient matrix in Eq. (3.47) has dimension 43. A total of 16 constraint equations account for the cut joints; the vector of generalized coordinates is of dimension 27. The vehicle model has 11 degrees of freedom. The constraints marked with an arrow in Figure 9 are cut to obtain the spanning tree in Figure 10(a).
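One way to read the observation that the factorization progresses from the leaves toward the root is as a level schedule: columns of bodies in the same level do not depend on each other and could be processed concurrently. The sketch below is one possible scheduling heuristic under that reading, not a procedure taken from the thesis; the tree and the function name `factorization_levels` are hypothetical.

```python
def factorization_levels(parent):
    """Group bodies by height above the leaves.

    Level 0 holds the leaves; a body is placed one level above its highest
    outboard body, so all of its descendants sit in lower levels.
    """
    children = {b: [] for b in parent}
    for b, p in parent.items():
        if p is not None:
            children[p].append(b)

    height = {}

    def h(b):
        if b not in height:
            height[b] = 1 + max((h(c) for c in children[b]), default=-1)
        return height[b]

    levels = {}
    for b in parent:
        levels.setdefault(h(b), []).append(b)
    return [sorted(levels[k]) for k in sorted(levels)]

# Hypothetical spanning tree: parent[b] is the inboard body of body b
parent = {1: None, 2: 1, 3: 2, 4: 2, 5: 1, 6: 5}
schedule = factorization_levels(parent)
```

Within each level no body is a descendant of another, so the columns of that level satisfy the independence condition of Corollary 3; the level schedule is coarser than what the corollary allows, since unrelated subtrees could also proceed at different paces.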

Figure 9. HMMWV14 Body Model: Topology Graph

Table 5. HMMWV14 Model - Component Bodies

Vertex   Body
1        Chassis
2        Right front upper control arm
3        Right front wheel spindle
4        Left front upper control arm
5        Left front wheel spindle
6        Right rear upper control arm
7        Right rear wheel spindle
8        Left rear upper control arm
9        Left rear wheel spindle
10       Rack
11       Right front lower control arm
12       Left front lower control arm
13       Right rear lower control arm
14       Left rear lower control arm

Figure 10. Spanning Tree HMMWV14, panels (a) and (b)

Four methods for the solution of the augmented linear system in Eq. (3.47) are compared. The comparison is made in terms of CPU time and, for Gaussian elimination and the proposed algorithm, also in terms of number of operations. The first method analyzed is based on Gaussian elimination, and is denoted by Gauss. The method Symmetric is based on a $PAP^T = LDL^T$ decomposition of the symmetric augmented matrix A, where D is a block diagonal matrix with blocks of dimension 1×1 or 2×2. The method Harwell solves the augmented system by using linear algebra subroutines from the Harwell library. The fourth method considered is the one introduced here, denoted by Alg-S, the S standing for sparse.

In the NADS Vehicle Dynamics Software (1995), the strategy for solving the augmented system in Eq. (3.47) is based on Gaussian elimination. Table 6 contains the number of operations for Gaussian elimination for this model. In this table, Fact stands

for factorization, FS for forward substitution, and BS for back substitution. The number of additions A, multiplications M, divisions D, and square roots SQ is counted at each stage of Gauss.

Table 6. Operation Count for Gaussian Elimination (columns: Fact, FS, BS, Total; rows: A, M, D, SQ)

Table 7 provides operation counts for algorithm Alg-S, following the steps of Algorithm 4. This implementation of the proposed algorithm uses topology information to take advantage of sparsity. In order to preserve the sparsity pattern of CIM, in the light of Lemma 2, the bodies of the system were renumbered as shown in Figure 10(b).

The number of additions, multiplications, and divisions for Alg-S is clearly smaller than for Gauss. As implemented for the test problem, Alg-S required the calculation of 43 square roots when performing the two Cholesky factorizations, while Gauss required none. Square root calculations can be eliminated using an $LDL^T$ approach. The benefit of this alternative has not yet been investigated, though.

One advantage of Alg-S over the Gaussian elimination family of solvers is that no pivoting is involved. Gaussian elimination requires pivoting in order to ensure numerical stability. Alg-S avoids this by employing Cholesky factorizations twice (during Steps

1 and 3 of Algorithm 4), and this factorization method is known to be numerically stable (Golub and Van Loan, 1989).

Table 7. Operation Count for Alg-S (columns: steps of Algorithm 4, Total; rows: A, M, D, SQ)

A disadvantage of the new algorithm is the sparsity-related overhead; i.e., the data accessing pattern and the number of vector touches (number of memory accesses). While for Gaussian elimination the fashion in which data is manipulated is intuitive, it is not straightforward for the proposed algorithm. It is difficult to assess to what extent this will affect the overall performance. It depends on the particular mechanical system being modeled and the way the algorithm is coded. An ideal implementation of Alg-S would be a no-loop, hard-coded, problem-dependent version that is generated during the pre-processing stage of the simulation. This would require a program that, based on topology information, writes the code for the specific mechanical system, implementing Algorithm 4 at the pre-processing stage.

An attempt was made to evaluate sparsity-related overhead for the test problem considered. For this, the steps of Algorithm 4 were followed, but the sparsity-related book-keeping present in Alg-S was eliminated by using dense-matrix operations. Basic Linear Algebra Subroutine (BLAS) 2 and 3 operations, along with dense Cholesky factorization,

are at the core of a new algorithm denoted below by Alg-D, the D standing for dense. For this algorithm, there is no need to renumber the bodies of the mechanical system, or to keep track of topology information. The operation count for Alg-D, using the same HMMWV example, is provided in Table 8.

Table 8. Operation Count for Alg-D (columns: steps of Algorithm 4, Total; rows: A, M, D, SQ)

Table 9 lists the CPU times in microseconds for the methods discussed above. The times correspond to one solution of the augmented linear system. For comparison, timing results for Harwell and Symmetric were also included. All numerical experiments presented in this Section were performed on an HP 9000 model J210 computer with two PA 7200 processors.

Table 9. JR Linear Solvers CPU Results (columns: Gauss, Symmetric, Harwell, Alg-S, Alg-D)

For the test problem considered, Alg-D is the algorithm of choice. The problem considered is too small for Alg-S to show any benefit from taking into account the topology-induced sparsity. It should be noted, however, that the numerical implementation of Alg-S could be further improved.

In Alg-D, the positive definite matrices M and $T^T T$ were factored using the driver dposv. In Gauss, Symmetric, and Harwell, the routines used were dgesv, dsysv, and the triple ma28ad/ma28bd/ma28cd, respectively. With the exception of the triple ma28, taken from an older Harwell public domain library, the routines were available in LAPACK. The faster routine ma47 of the more recent versions of Harwell was unavailable at the time of this study, and it should improve the performance of the algorithm Harwell.

Finally, Table 10 presents a detailed profile of the algorithms Alg-S and Alg-D, listing CPU times in microseconds for each step of Algorithm 4 in the two implementations.

Table 10. Timing Profiles Alg-S and Alg-D (columns: steps of Algorithm 4, Total; rows: Alg-S, Alg-D)

For the test problem considered, there is no advantage in considering sparsity for forward/backward elimination. It is only when sparsity information is used for the coefficient and right-side matrices in Step 3.1 (when computing the T matrix) that Alg-S gains an edge over Alg-D.

CHAPTER 4
NUMERICAL IMPLEMENTATIONS

This Chapter contains details regarding numerical implementation of methods presented in Chapter 3 for the implicit numerical integration of the DAE of Multibody Dynamics. To define the methods in Chapter 3, generic integration formulas were used. In this Chapter, specific integration formulas are considered, and details are provided about how a particular integration formula is embedded in the overall architecture of an implicit DAE integration algorithm.

4.1 Trapezoidal-Based State-Space Implicit Integration

4.1.1 General Considerations

The trapezoidal integration formula is often used for implicit integration of mildly stiff initial value problems (IVP). It is a popular multi-step formula that uses information only from the prior time step. The order of the method is two, and the method is A-stable (Atkinson, 1989). For the IVP $y' = f(x, y)$, $y(x_0) = y_0$, it assumes the form

$$ y_1 = y_0 + \frac{h}{2} \left( f(x_0, y_0) + f(x_1, y_1) \right) $$

This formula is used to integrate independent accelerations to obtain independent velocities, and then to integrate independent velocities to obtain independent positions. Thus,

$$ v_1 = \tilde v + \frac{h^2}{4} \ddot v_1 \tag{4.1} $$

$$ \dot v_1 = \tilde{\dot v} + \frac{h}{2} \ddot v_1 \tag{4.2} $$

where

$$ \tilde v \equiv v_0 + h \dot v_0 + \frac{h^2}{4} \ddot v_0, \qquad \tilde{\dot v} \equiv \dot v_0 + \frac{h}{2} \ddot v_0 $$

The generic formulas of Eqs. (3.23) and (3.2) of Section 3.2.1 are replaced with Eqs. (4.1) and (4.2). Therefore, the coefficients that appear in the integration Jacobian of Eq. (3.43) are $\gamma = 0.5$ and $\beta = 0.25$.

The trapezoidal formula is known to be a simple answer to the challenge of solving stiff IVP. Compared to other integration formulas, in particular to a member of the Rosenbrock family presented later, the order is rather low, and the stability properties are weaker. A typical problem encountered when using the trapezoidal formula is the loss of accuracy for very stiff ODE. This is caused by its lack of L-stability, a notion that is defined when discussing Runge-Kutta formulas in conjunction with the Descriptor Form Method.

Algorithm Pseudo-code

The numerical implementation of the State Space Method presented in this document is based on the trapezoidal formula of the previous Section. The pseudo-code of the implementation is provided in Table 11.
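Equations (4.1) and (4.2) can be written as a small update routine. The following is a minimal sketch, not the thesis code: the function name `trapezoidal_update` and the sample values are hypothetical, and the new acceleration is simply passed in, whereas in the actual method it is the unknown of the discretized equations of motion.

```python
def trapezoidal_update(v0, vd0, vdd0, vdd1, h):
    """Position and velocity at the new time step from Eqs. (4.1)-(4.2).

    v0, vd0, vdd0: position, velocity, acceleration at the old time step
    vdd1: acceleration at the new time step
    h: step-size
    """
    v_tilde = v0 + h * vd0 + 0.25 * h * h * vdd0   # predictor part of the position
    vd_tilde = vd0 + 0.5 * h * vdd0                # predictor part of the velocity
    v1 = v_tilde + 0.25 * h * h * vdd1             # Eq. (4.1), beta = 0.25
    vd1 = vd_tilde + 0.5 * h * vdd1                # Eq. (4.2), gamma = 0.5
    return v1, vd1

# Constant acceleration: the trapezoidal formula reproduces the exact motion
v1, vd1 = trapezoidal_update(v0=1.0, vd0=3.0, vdd0=2.0, vdd1=2.0, h=0.5)
```

For constant acceleration the update coincides with the exact solution, since the formula is of order two; for a general acceleration history the local error is governed by the third derivative of the position.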

Table 11. Pseudo-code for Trapezoidal-Based State-Space Method

1. Initialize Simulation
2. Set Integration Tolerance
3. While (t < tend) do
4. Setup Integration Step
5. Get Integration Jacobian
6. Factor Integration Jacobian
7. Do while (.NOT. converged)
8. Integrate
9. Recover Dependent Positions
10. Recover Dependent Velocities
11. Evaluate Mass Matrix and Active Forces
12. Evaluate Dependent Accelerations
13. Evaluate Lagrange Multipliers
14. Evaluate Error Term
15. Correct Independent Accelerations
16. End do
17. Check Accuracy. Select Step-size
18. Check Partition
19. End do

Step 1 initializes the simulation. A consistent set of initial conditions is determined, simulation starting and ending times are defined, and an initial step-size is provided. User-set integration tolerances are read during Step 2.

Step 3 starts the simulation loop. The system state is stored, to be used later to restart the integration upon a rejected step. An initial estimate for the independent accelerations is also provided. Currently, values of independent accelerations from the previous time step are used to start the iterative process. Several other operations are done during this step, as is noted below.

At Step 5, the integration Jacobian of Eq. (3.43) is computed, with $\gamma = 0.5$ and $\beta = 0.25$. The integration Jacobian is factored at Step 6. Since the dimension of the integration Jacobian is equal to the number of degrees of freedom of the model, this matrix is rather small and dense. In the case of a moderately large HMMWV military vehicle that is modeled using 14 bodies (Serban, Negrut, and Haug 1998), which is used in the next Chapter for validation purposes, the dimension of the integration Jacobian is 8. Consequently, dense LAPACK routines are used to factor this matrix.

Step 7 starts the loop that solves the discretized non-linear algebraic equations. For the State-Space Method, the system

$$ \Psi(\ddot v) \equiv M^{vv} \ddot v + M^{vu} \ddot u + \Phi_v^T \lambda - Q^v = 0 \tag{4.3} $$

which is obtained after discretizing the independent equations of motion, is solved for independent accelerations.

Given new independent accelerations, independent velocities and positions are obtained at Step 8 using Eqs. (4.1) and (4.2). Since dependent accelerations are available, they are also integrated twice, using the same integration formulas, to provide a better starting point for dependent positions recovery during Step 9. With a given v, dependent coordinates u are obtained via a quasi-Newton approach,

$$ u^{(k+1)} = u^{(k)} - \Phi_u^{-1}(q_0) \, \Phi(u^{(k)}, v) $$

where $\Phi(u, v) = 0$ represents the position kinematic constraint equation of Eq. (3.7). Here $q_0$ is the vector of generalized coordinates from the last time step, or the value computed

during the previous iteration for $\ddot{v}$; i.e., the value obtained as a result of the last call to position recovery. At the end of Step 9, the sub-Jacobian $\Phi_u$ is factored in a consistent configuration of the mechanism. Dependent velocities are cheaply obtained during Step 10, by solving

\[
\Phi_u \dot{u} = -\Phi_v \dot{v}
\]

During Steps 11 through 13, the generalized mass and external forces are evaluated in the most recent configuration; i.e., using the last generalized positions and velocities available after Step 10. The computation proceeds by solving the linear systems

\[
\Phi_u \ddot{u} = -\Phi_v \ddot{v} + \tau
\]

to obtain dependent accelerations, and

\[
\Phi_u^T \lambda = Q^u - M^{uv}\ddot{v} - M^{uu}\ddot{u}
\]

to obtain the Lagrange multipliers. Note that finding the solution of these systems is inexpensive, since a factorization of the sub-Jacobian $\Phi_u$ is available.

During Step 14, the error in satisfying the non-linear system of Eq. (4.3) is evaluated. Based on the norm of the residual and the norm of the last correction in independent accelerations, the decision to stop or continue the iterative process is taken. If stopping criteria are not met, another iteration is taken, unless the number of iterations already exceeds a specified limit. In the latter case, the step-size is decreased, the code returns to Step 4, and the iterative process is restarted. However, the costly integration Jacobian computation is skipped. The integration Jacobian was shown earlier to assume the form

\[
\Psi_{\ddot{v}} = \hat{M} + \gamma h \hat{M}_1 + \beta h^2 \hat{M}_2 \tag{4.4}
\]
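Because the step-size enters Eq. (4.4) only through the scalar factors $\gamma h$ and $\beta h^2$, the Jacobian can be reassembled from cached matrices at negligible cost. The following minimal sketch illustrates this with plain nested lists. The function name `assemble_integration_jacobian` and the 2×2 data are illustrative assumptions, not part of the thesis code; the defaults $\gamma = 0.5$, $\beta = 0.25$ are the standard trapezoidal choice.

```python
def assemble_integration_jacobian(m_hat, m1_hat, m2_hat, h, gamma=0.5, beta=0.25):
    """Return M_hat + gamma*h*M1_hat + beta*h**2*M2_hat, the form of Eq. (4.4).

    The three inputs are square matrices stored as nested lists. They are
    assumed to have been computed once and cached, so only this cheap linear
    combination is redone when the step-size h changes.
    """
    n = len(m_hat)
    return [
        [m_hat[i][j] + gamma * h * m1_hat[i][j] + beta * h * h * m2_hat[i][j]
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical cached matrices for a model with two degrees of freedom.
M_HAT = [[2.0, 0.0], [0.0, 2.0]]
M1_HAT = [[1.0, 1.0], [0.0, 1.0]]
M2_HAT = [[4.0, 0.0], [4.0, 4.0]]

# First attempt at the step.
jac_first_try = assemble_integration_jacobian(M_HAT, M1_HAT, M2_HAT, h=0.5)
# After a rejection only h changes; the cached matrices are reused as they are.
jac_retry = assemble_integration_jacobian(M_HAT, M1_HAT, M2_HAT, h=0.25)
```

In a production code the reassembled matrix would still have to be refactored; the saving is in not recomputing the three matrices themselves.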

When the time step is rejected, matrices $\hat{M}$, $\hat{M}_1$, and $\hat{M}_2$ are reused and only the step-size changes. Therefore, time step rejection is cheap as far as integration Jacobian evaluation is concerned.

If stopping criteria are not met and the number of iterations is less than the limit, the independent accelerations at iteration $k$ are corrected at Step 15 as

\[
\ddot{v}^{(k+1)} = \ddot{v}^{(k)} - \Psi_{\ddot{v}}^{-1}\,\Psi(\ddot{v}^{(k)})
\]

and the code proceeds to Step 8 for a new iteration.

Once the iterative process has converged, accuracy of the numerical solution is verified. The configuration of the mechanical system at the new time step is known, and it remains to determine whether accuracy of the numerical solution meets the requirements imposed by the user via the integration tolerances set at Step 2. An embedded method is used to obtain a second approximate solution that is used for integration error estimation. The difference of the two approximate solutions is taken as the estimate of the local truncation error. The new step-size is obtained by imposing the condition that the norm of the local truncation error is smaller than a certain composite tolerance, based on user imposed integration tolerances. This approach is detailed in Section 4.2 in the framework of Runge-Kutta methods. The embedded formula used in conjunction with the trapezoidal method is the backward Euler formula. If accuracy requirements are met, the step-size computed as a by-product of error analysis is used for the next time step. Otherwise, the time step is rejected and, after proceeding to Step 4, the new step-size is used in conjunction with Eq. (4.4) to reevaluate the integration Jacobian.

The partitioning of generalized coordinates is checked at Step 18. A repartitioning is triggered by a large value of the condition number of the dependent sub-Jacobian $\Phi_u$. In this context, large means a condition number that exceeds by a factor of

α = 25. the value of the first condition number associated with the current partition. The value α = 25. was determined as a result of numerical experiments. For the case of a vehicle model with 14 bodies and 98 states, the initial partitioning of coordinates was valid for 40 seconds of simulation, and no repartitioning request was made. This mechanism did not cause unjustified repartitioning requests, and proved to be reliable. Increasing the factor α reduces the number of repartitionings, and conversely. Note that computing the condition number of the dependent sub-Jacobian is inexpensive, since a factorization of this matrix is available after the last call to dependent position recovery.

There are two CPU intensive stages of the proposed algorithm. The first is computation of the integration Jacobian, which employs many matrix-matrix multiplications, as shown earlier. The computation of matrices $\hat{M}$, $\hat{M}_1$, $\hat{M}_2$, $R$, and $S$ is CPU intensive. The second is factorization of the dependent sub-Jacobian $\Phi_u$, which in the current numerical implementation is done after each dependent position recovery (Step 9). This is an important matrix that appears often in the implementation. It is used to obtain dependent velocities (Step 10), recover dependent accelerations $\ddot{u}$ (Step 12), evaluate Lagrange multipliers $\lambda$ (Step 13), and compute the matrices $H$, $J$, $N$, and $L$ (Step 5). These latter matrices are obtained as solutions of multiple right-side systems of linear equations whose coefficient matrix is the dependent sub-Jacobian $\Phi_u$. Finally, the factorization of $\Phi_u$ is needed in Step 18, to evaluate the condition number required for checking the validity of the current partitioning.

Usually, the dimension of the dependent sub-Jacobian $\Phi_u$ is rather large. For the HMMWV14 model mentioned above, the dimension of this matrix is 80. Typically, 3 iterations are required to solve the non-linear system $\Psi(\ddot{v}) = 0$ for $\ddot{v}$. Consequently, each successful integration step requires 3 factorizations of $\Phi_u$ and one

factorization of the integration Jacobian. As will be shown later, the Rosenbrock formula embedded in the framework of the First Order Reduction Method has better stability properties (it is L-stable), has order 4, includes a more reliable step-size control mechanism, and requires about the same computational effort per time step as the trapezoidal formula used in conjunction with the State-Space Method.

4.2 SDIRK Based Descriptor Form Implicit Integration

4.2.1 General Considerations

Accuracy and stability are the issues of concern when applying a numerical integration formula to determine the solution of an IVP. When dealing with stiff IVP, explicit integration formulas are compelled to very small step-sizes due to stability limitations. For most mechanical engineering applications, precision in the range of $10^{-4}$ through $10^{-2}$ usually suffices. For non-stiff IVP, the step-size of an explicit mid- to high-order formula might assume large values and still meet these relatively mild accuracy requirements. Yet when applied to stiff IVP, explicit codes produce meaningless results, unless the step-size is reduced to extremely small values to keep the process from becoming unstable. Implicit integration formulas are effective in avoiding this drawback of explicit methods, and the step-size is again limited by accuracy considerations, as was the case with explicit formulas applied to non-stiff IVP.

However, the good stability properties of implicit formulas have a computational penalty, as these formulas are more complex, more difficult to implement, and on a per-step basis more CPU intensive. Overall though, CPU times for solving stiff IVP can be cut by orders of magnitude if implicit integration is used.

From a theoretical standpoint, implicit formulas introduce concepts such as A- and L-stability, contractivity (B-convergence) and order reduction, existence of the numerical solution, and convergence issues. There are also issues regarding the numerical implementation of these formulas, in which linear algebra plays an important role. Stopping criteria, how these are reflected in the global integration error, and starting values for solving the discretized system of algebraic equations are issues that increase the complexity of implicit methods.

General notions related to the class of implicit Runge-Kutta methods are considered next. These are the notions that are relevant to the particular Runge-Kutta formulas that are used in conjunction with the Descriptor Form and First Order Reduction Methods. An excellent reference on the topic of implicit integration is the book of Hairer and Wanner (1996), which contains a detailed treatment of this problem.

Consider the Dahlquist test equation $y' = \lambda y$ with $y(x_0) = 1$. If the forward Euler formula is applied to integrate this IVP, after one integration step the solution is given by

\[
y_1 = (1 + h\lambda)\, y_0
\]

or,

\[
y_1 = R(h\lambda)\, y_0
\]

where

\[
R(z) \equiv 1 + z
\]

The function $R(z)$ is called the stability function, which is specific to the integration formula. By definition, setting $z = h\lambda$, the stability function is obtained as the solution of the Dahlquist test problem after one integration step. The set

\[
S = \{ z \in \mathbb{C} \;;\; |R(z)| \le 1 \}
\]

is called the stability domain of the method. A method whose stability domain satisfies

\[
S \supset \mathbb{C}^- = \{ z \;;\; \mathrm{Re}\, z \le 0 \}
\]

is called A-stable. This is a desirable property of an integration formula, to successfully deal with the class of stiff IVP. However, as Alexander (1977) has remarked, A-stability is not the entire answer to the problem of stiff equations. Using A-stable one-step methods to solve large systems of stiff non-linear differential equations, Prothero and Robinson (1974) concluded that some A-stable methods give highly unstable solutions. The accuracy of solutions obtained when the equations are very stiff often appeared to be unrelated to the order of the method used. The observations of Prothero and Robinson characterize what is called the process of order reduction of implicit Runge-Kutta methods when applied to very stiff differential equations.

Methods that display good behavior, even for extremely stiff IVP, have been designed. One attribute of these methods is their L-stability property. A method is L-stable if it is A-stable and satisfies the condition

\[
\lim_{z \to \infty} R(z) = 0
\]

One class of methods that satisfy this condition (Hairer and Wanner, 1996) is the class of stiffly-accurate Runge-Kutta methods, for which

\[
a_{sj} = b_j, \qquad j = 1, \ldots, s
\]

In other words, the last row of $a_{ij}$ elements and the stage value weights in Butcher's tableau coincide. In particular, this class of methods is extensively used for the solution of singularly perturbed problems and low index DAE via direct methods.

For step-size control, Runge-Kutta formulas use a lower order method that works in conjunction with the integration formula to provide a second approximate solution. This additional solution is generally used only to calculate an approximation of the local truncation error. Keeping this local error lower than user prescribed accuracy requirements is the basic idea of the step-size controller.

To compute the second approximation of the solution at the new grid point efficiently, information available during the process of obtaining the numerical solution should be used to advantage. Usually, stage values $k_i$ are used with weights $\hat{b}_i$ to provide a second approximation $\hat{y}_1$ of the solution

\[
\hat{y}_1 = y_0 + \sum_{i=1}^{s} \hat{b}_i k_i \tag{4.5}
\]

Following the presentation of Hairer, Nørsett, and Wanner (1993), an estimate of the error for the less precise result is $y_1 - \hat{y}_1$. Componentwise, this error is kept smaller than a composite error tolerance $sc_i$:

\[
|y_{1i} - \hat{y}_{1i}| \le sc_i, \qquad sc_i = Atol_i + \max(|y_{0i}|, |y_{1i}|) \cdot Rtol_i \tag{4.6}
\]

where $Atol$ and $Rtol$ are user prescribed integration tolerances. As a measure of the error, the value

\[
err = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{y_{1i} - \hat{y}_{1i}}{sc_i} \right)^2} \tag{4.7}
\]

is considered. Other norms might be chosen; one alternative used often is the max norm. Then, $err$ in Eq. (4.7) is compared to 1, in order to find an optimal step-size.
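The error measure of Eqs. (4.6) and (4.7) can be sketched in a few lines. This is a minimal illustration, assuming scalar tolerances shared by all components; the function name `scaled_error_norm` is hypothetical and not taken from the thesis code.

```python
import math


def scaled_error_norm(y0, y1, y1_hat, atol, rtol):
    """Root-mean-square error measure in the form of Eqs. (4.6)-(4.7).

    y0     : solution at the start of the step
    y1     : solution produced by the integration formula
    y1_hat : solution produced by the embedded formula
    atol, rtol : scalar tolerances (an assumption; they may also be
                 prescribed per component)
    """
    n = len(y1)
    total = 0.0
    for i in range(n):
        # Composite tolerance sc_i of Eq. (4.6).
        sc = atol + max(abs(y0[i]), abs(y1[i])) * rtol
        total += ((y1[i] - y1_hat[i]) / sc) ** 2
    return math.sqrt(total / n)


# A step is acceptable when the measure does not exceed 1.
err = scaled_error_norm([1.0, 2.0], [1.0, 2.0], [0.5, 1.5], atol=0.5, rtol=0.0)
step_accepted = err <= 1.0
```

Dividing by $sc_i$ before taking the norm makes the acceptance test dimensionless, so components of very different magnitude contribute comparably.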

From asymptotic error behavior, $err \approx C h^{q+1}$, and from $1 \approx C h_{opt}^{q+1}$ (where $q = \min(p, \hat{p})$, with $p$ and $\hat{p}$ being the orders of the formulas used), the optimal step-size is obtained as

\[
h_{opt} = h \left( \frac{1}{err} \right)^{\frac{1}{q+1}} \tag{4.8}
\]

A safety factor $fac$ usually multiplies $h_{opt}$ such that the error is acceptable at the end of the next step with high probability. Further, $h$ is not allowed to increase or decrease too fast. Thus, the value used for the new step-size is

\[
h_{new} = h \cdot \min\left( facmax,\; \max\left( facmin,\; fac \cdot \left( \frac{1}{err} \right)^{\frac{1}{q+1}} \right) \right)
\]

and if at the end of the current step $err \le 1$, the step is accepted. The solution is then advanced with $y_1$ and a new step is tried, with $h_{new}$ as step-size. Otherwise, the step is rejected and computations for the current step are repeated with the new step-size $h_{new}$. The maximal step-size increase $facmax$, usually chosen between 1.5 and 5, prevents the code from taking too large a step and contributes to its safety. When chosen too small, it may also unnecessarily increase the computational work. Finally, it is advisable to put $facmax = 1$ in steps after a step-rejection (Shampine and Watts, 1979).

The last important aspect, from a practical standpoint, is a dense output capability. This capability is important for many practical questions such as event location and treatment of discontinuities in differential equations, graphical output, etc. From a mathematical perspective, this issue is important if the number of output points is very large. Cutting the step-size to meet output requests, rather than accuracy or stability limitations, may reduce efficiency of the algorithm significantly. These are considerations that motivate the construction of dense output formulas (Horn, 1983). The idea is to provide, in addition to the numerical result $y_1$, cheap

numerical approximations to $y(x_0 + \theta h)$, for the entire integration interval $0 \le \theta \le 1$. Thus, using stage values $k_i$, a set of coefficients depending on the output point through $\theta$ is provided to give an approximation of the solution at the intermediate point $x_0 + \theta h$. This approximate solution is then obtained as

\[
u(\theta) = y_0 + h \sum_{i=1}^{s} b_i(\theta)\, k_i \tag{4.9}
\]

where $b_i(\theta)$ are polynomials in $\theta$. Based on order conditions, they are determined such that

\[
u(\theta) - y(x_0 + \theta h) = O(h^{p+1})
\]

where $y(x)$ is the solution of the IVP and $p$ is the order of the method.

A Runge-Kutta method provided with a formula such as Eq. (4.9) is called a continuous Runge-Kutta method. The two SDIRK methods presented in this document for implicit integration of SSODE are continuous methods. Further information regarding continuous methods can be found in the works of Hairer, Nørsett, and Wanner (1993) and Shampine (1986).

SDIRK4/16

Since there is no extra price for using a stiffly-accurate Runge-Kutta formula that can successfully handle even very stiff ODE, this class of integrators was embedded in the Descriptor Form Method. The Runge-Kutta method implemented was a 5 stage, order 4, stiffly accurate formula, with error control based on an order 3 embedded formula. With $\gamma$ being the diagonal element of the SDIRK formula, define

\[
\begin{aligned}
p_1 &= 1 - \gamma \\
p_2 &= \tfrac{1}{2} - 2\gamma + \gamma^2 \\
p_3 &= \tfrac{1}{3} - 2\gamma + 3\gamma^2 - \gamma^3 \\
p_4 &= \tfrac{1}{6} - \tfrac{3}{2}\gamma + 3\gamma^2 - \gamma^3
\end{aligned}
\]

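\[
\begin{aligned}
p_5 &= \tfrac{1}{4} - 2\gamma + \tfrac{9}{2}\gamma^2 - 4\gamma^3 + \gamma^4 \\
p_6 &= \tfrac{1}{8} - \tfrac{4}{3}\gamma + 4\gamma^2 - 4\gamma^3 + \gamma^4 \\
p_7 &= \tfrac{1}{12} - \gamma + \tfrac{7}{2}\gamma^2 - 4\gamma^3 + \gamma^4 \\
p_8 &= \tfrac{1}{24} - \tfrac{2}{3}\gamma + 3\gamma^2 - 4\gamma^3 + \gamma^4
\end{aligned}
\]

The following simplified conditions assure order four of the method (Hairer and Wanner, 1996):

\[
\begin{aligned}
b_1 + b_2 + b_3 + b_4 &= p_1 \\
b_2 c_2' + b_3 c_3' + b_4 c_4' &= p_2 \\
b_2 c_2'^2 + b_3 c_3'^2 + b_4 c_4'^2 &= p_3 \\
b_3 a_{32} c_2' + b_4 (a_{42} c_2' + a_{43} c_3') &= p_4 \\
b_2 c_2'^3 + b_3 c_3'^3 + b_4 c_4'^3 &= p_5 \\
b_3 c_3' a_{32} c_2' + b_4 c_4' (a_{42} c_2' + a_{43} c_3') &= p_6 \\
b_3 a_{32} c_2'^2 + b_4 (a_{42} c_2'^2 + a_{43} c_3'^2) &= p_7 \\
b_4 a_{43} a_{32} c_2' &= p_8
\end{aligned}
\tag{4.10}
\]

The coefficients $c_i'$ are defined as $c_i' = \sum_{j=1}^{i-1} a_{ij}$. The associated Butcher's tableau is given in Table 2.

Table 2. Butcher's Tableau for SDIRK Formulas

\[
\begin{array}{c|cccc}
c_1 & \gamma & 0 & \cdots & 0 \\
c_2 & a_{21} & \gamma & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
c_s & a_{s1} & a_{s2} & \cdots & \gamma \\
\hline
 & b_1 & b_2 & \cdots & b_s
\end{array}
\]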
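The simplified conditions of Eq. (4.10) can be checked in exact rational arithmetic. The sketch below does so for $\gamma = 1/4$, using the sub-diagonal coefficients and weights that appear later in Table 3. The function name `order_condition_residuals` and the dictionary layout are assumptions made for illustration only.

```python
from fractions import Fraction as F

GAMMA = F(1, 4)

# Sub-diagonal entries a[(i, j)] and weights b[i] for gamma = 1/4,
# stages numbered from 1 (values as listed in Table 3).
a = {
    (2, 1): F(1, 2),
    (3, 1): F(17, 50), (3, 2): F(-1, 25),
    (4, 1): F(371, 1360), (4, 2): F(-137, 2720), (4, 3): F(15, 544),
}
b = {1: F(25, 24), 2: F(-49, 48), 3: F(125, 16), 4: F(-85, 12)}

# c'_i is the sum of the sub-diagonal entries in row i.
cp = {i: sum((a[(i, j)] for j in range(1, i)), F(0)) for i in range(1, 5)}


def order_condition_residuals(g):
    """Return left side minus right side for the eight conditions of Eq. (4.10)."""
    p = [
        1 - g,
        F(1, 2) - 2 * g + g**2,
        F(1, 3) - 2 * g + 3 * g**2 - g**3,
        F(1, 6) - F(3, 2) * g + 3 * g**2 - g**3,
        F(1, 4) - 2 * g + F(9, 2) * g**2 - 4 * g**3 + g**4,
        F(1, 8) - F(4, 3) * g + 4 * g**2 - 4 * g**3 + g**4,
        F(1, 12) - g + F(7, 2) * g**2 - 4 * g**3 + g**4,
        F(1, 24) - F(2, 3) * g + 3 * g**2 - 4 * g**3 + g**4,
    ]
    lhs = [
        b[1] + b[2] + b[3] + b[4],
        sum(b[i] * cp[i] for i in (2, 3, 4)),
        sum(b[i] * cp[i] ** 2 for i in (2, 3, 4)),
        b[3] * a[(3, 2)] * cp[2] + b[4] * (a[(4, 2)] * cp[2] + a[(4, 3)] * cp[3]),
        sum(b[i] * cp[i] ** 3 for i in (2, 3, 4)),
        b[3] * cp[3] * a[(3, 2)] * cp[2]
        + b[4] * cp[4] * (a[(4, 2)] * cp[2] + a[(4, 3)] * cp[3]),
        b[3] * a[(3, 2)] * cp[2] ** 2
        + b[4] * (a[(4, 2)] * cp[2] ** 2 + a[(4, 3)] * cp[3] ** 2),
        b[4] * a[(4, 3)] * a[(3, 2)] * cp[2],
    ]
    return [left - right for left, right in zip(lhs, p)]


residuals = order_condition_residuals(GAMMA)
```

Using `Fraction` rather than floating point avoids any question of round-off when deciding whether a condition holds exactly.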

The A-stability property of the method is determined by the choice of $\gamma$ in the tableau of Table 2 (Hairer and Wanner, 1996). In order to obtain good stability properties, along with a small leading coefficient of the local truncation error, Hairer and Wanner (1996) suggest values of $\gamma$ in a range starting at 0.25. For the Descriptor Form Method, the value chosen was $\gamma = 0.25$ (or $\gamma = 4/16$, hence the name of this formula). According to Butcher's tableau, if the number of stages is $s = 5$, there are 10 coefficients to be determined, namely the sub-diagonal elements $a_{ij}$. Once these coefficients are known, since the method is stiffly accurate, the coefficients $b_i$ are known. Finally, the coefficients $c_i$ are obtained using the conditions

\[
c_i = \sum_{j=1}^{i} a_{ij}, \qquad i = 1, \ldots, 5
\]

With the coefficient $\gamma$ chosen, eight order conditions are available in Eq. (4.10) to compute 10 unknowns. The order of the method and its stability properties are satisfied by the conditions imposed. Therefore, the two extra degrees of freedom in the choice of coefficients are used to minimize the fifth-order error terms. This is expected to attain additional accuracy for the otherwise fourth order formula. This condition leads to the recommendation that $c_2' = 0.5$ and $c_3' = 0.3$ (Hairer and Wanner, 1996). With these two conditions, a non-linear system is solved for $a_{ij}$, and the resulting stiffly-accurate, L-stable, 5 stage, order 4, Runge-Kutta formula obtained is shown in Table 3.

Step-size control is based on an order 3 embedded formula. To obtain the coefficients of the second formula, order conditions are revisited. They are reformulated in general form, to obtain a set of equations for the new weights $\hat{b}_i$. Since the same stage values are used, the $a_{ij}$ and $c_i$ coefficients remain the same. Only the weights change, to produce a new approximation of the solution. The order three conditions assume the form

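\[
\sum_{j} \hat{b}_j = 1, \qquad
\sum_{j,k} \hat{b}_j a_{jk} = \frac{1}{2}, \qquad
\sum_{j,k,l} \hat{b}_j a_{jk} a_{jl} = \frac{1}{3}, \qquad
\sum_{j,k,l} \hat{b}_j a_{jk} a_{kl} = \frac{1}{6}
\tag{4.11}
\]

Table 3. SDIRK4/16 Formula for Descriptor Form Method

\[
\begin{array}{c|ccccc}
1/4 & 1/4 & 0 & 0 & 0 & 0 \\
3/4 & 1/2 & 1/4 & 0 & 0 & 0 \\
11/20 & 17/50 & -1/25 & 1/4 & 0 & 0 \\
1/2 & 371/1360 & -137/2720 & 15/544 & 1/4 & 0 \\
1 & 25/24 & -49/48 & 125/16 & -85/12 & 1/4 \\
\hline
y_1 = & 25/24 & -49/48 & 125/16 & -85/12 & 1/4 \\
\hat{y}_1 = & 59/48 & -17/96 & 225/32 & -85/12 & 0
\end{array}
\]

Using values of $a_{ij}$ from Table 3, these equations lead to the following linear system for the weights $\hat{b}_i$: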

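\[
\begin{bmatrix}
1 & 1 & 1 & 1 \\
c_1 & c_2 & c_3 & c_4 \\
c_1^2 & c_2^2 & c_3^2 & c_4^2 \\
\sum_k a_{1k} c_k & \sum_k a_{2k} c_k & \sum_k a_{3k} c_k & \sum_k a_{4k} c_k
\end{bmatrix}
\begin{bmatrix} \hat{b}_1 \\ \hat{b}_2 \\ \hat{b}_3 \\ \hat{b}_4 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 1/2 \\ 1/3 \\ 1/6 \end{bmatrix}
\]

The solution of this system is provided as the last row in Table 3. Finally, the coefficients of the dense output formula are given by

\[
\begin{aligned}
b_1(\theta) &= \tfrac{11}{3}\theta - \tfrac{463}{72}\theta^2 + \tfrac{217}{36}\theta^3 - \tfrac{20}{9}\theta^4 \\
b_2(\theta) &= \tfrac{11}{2}\theta - \tfrac{385}{16}\theta^2 + \tfrac{661}{24}\theta^3 - 10\,\theta^4 \\
b_3(\theta) &= -\tfrac{125}{18}\theta + \tfrac{20125}{432}\theta^2 - \tfrac{8875}{216}\theta^3 + \tfrac{250}{27}\theta^4 \\
b_4(\theta) &= -\tfrac{85}{4}\theta^2 + \tfrac{85}{6}\theta^3 \\
b_5(\theta) &= -\tfrac{11}{9}\theta + \tfrac{557}{108}\theta^2 - \tfrac{359}{54}\theta^3 + \tfrac{80}{27}\theta^4
\end{aligned}
\tag{4.12}
\]

These coefficients are obtained by symbolically solving simplified order conditions; e.g., those provided in Eq. (4.10), with modified right-sides that are functions of $\theta$. Details are provided by Hairer and Wanner (1996).

Algorithm Pseudo-code

This Section provides the pseudo-code of the algorithm developed based on the SDIRK4/16 formula used in conjunction with the Descriptor Form Method.

During Step 1, the initial configuration of the mechanical system is read in, along with initial and final times. An initial estimate for the integration step-size is provided. This step-size is subsequently changed by the step-size controller, to accommodate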

accuracy constraints. Step 2 reads values of tolerances prescribed by the user; i.e., the values $Atol$ and $Rtol$ of Eq. (4.6).

Table 4. Pseudo-code for SDIRK4/16-Based Descriptor Form Method

 1. Initialize Simulation
 2. Set Integration Tolerance
 3. While (t < tend) do
 4.   Setup Macro-step
 5.   Get Integration Jacobian
 6.   Sparse Factor Integration Jacobian
 7.   Do stage 1 to 5
 8.     Setup Stage
 9.     Do while (.NOT. converged)
10.       Integrate
11.       Recover Positions and Velocities
12.       Get Error Term
13.       Correct Accelerations and Lagrange Multipliers
14.     End do
15.   End do
16.   Check Accuracy. Determine New Step-size
17.   Check Partition
18. End do

At Step 3 the simulation loop is started, and the code proceeds by setting the new macro-step. If the step is successful, the current configuration is saved. Otherwise, when rejected, the initial settings are used again to restart the integration, with the step-size suggested by the step-size controller.

Step 5 is the pivotal point of the implementation. If the current time step has not been rejected, the integration Jacobian is computed according to considerations of Section 3.3, which is the CPU intensive part of the code. Otherwise, if the current call to the integration Jacobian computation comes after an unsuccessful time step, the integration Jacobian is known, since it assumes the form $M_1 + \gamma h M_2 + \beta h^2 M_3$, and matrices $M_1$, $M_2$, and $M_3$ are available from the first rejected attempt. Only the step-size $h$ is modified, and re-computing the integration Jacobian is cheap.

Harwell sparse linear algebra routines are used for factorization of the integration Jacobian. Since the sparsity pattern of the integration Jacobian does not change during integration, the factorization process is cheap, once the routine ma48ad of Harwell has analyzed its structure and a factorization sequence has been determined. All subsequent calls to integration Jacobian factorization use the much faster ma48bd factorization routine. Care should be taken to make sure the defining call to ma48ad is done with the typical sparsity pattern of the integration Jacobian; sometimes this pattern is not induced by the first configuration of the mechanism.

The loop over the stage values $k_i$ starts at Step 7. Values iterated for are the generalized accelerations and Lagrange multipliers. At Step 8, starting values for these quantities are provided, and the iteration counter is reset to zero. If during a stage of the formula this counter exceeds a limit value, the time step is deemed rejected, the integration step-size is decreased, and the code proceeds to Step 4.

The solution of the discretized non-linear algebraic equations is obtained during the loop that starts at Step 9 and ends at Step 14. As described in Section 3.3.2, based on the SDIRK formula of Table 3, accelerations are integrated to obtain generalized velocities, which are integrated to obtain generalized positions. After direct integration of all generalized accelerations and velocities, the generalized positions and velocities fail to satisfy the kinematic constraint equations at position and velocity levels. Dependent positions obtained after direct integration are considered only as starting estimates for recovering the consistent configuration of the mechanism, via solution of the position kinematic constraint equations. The same observations apply for dependent velocities, which are computed using the velocity kinematic constraint equations. This is the reason for which, although the discretization is done at the index 1 DAE level, the Descriptor Form is truly a state-space method.

At Step 12, the error term in satisfying the index 1 DAE is evaluated. At Step 13, corrections in generalized accelerations and Lagrange multipliers are computed. Based on the norm of these corrections, the integration tolerance, and the norm of the error in the discretized index 1 DAE, an iteration stopping decision is made. If stopping criteria are met, the code proceeds to Step 16, once all five stages of the SDIRK4/16 formula have been completed. Otherwise, corrections in accelerations and Lagrange multipliers are made, and if the limit number of iterations has not been reached, another iteration is started.

During Step 16, the step-size controller described in Section 4.2.1, based on the embedded formula provided in the last row of Table 3, analyzes the accuracy of the numerical solution. If accuracy of the approximate solution is satisfactory, the configuration at this grid point is accepted, and integration proceeds with a step-size that is obtained as a by-product of the accuracy check. Otherwise, with the newly computed

step-size, the code proceeds to Step 4 to restart integration, this time on the premise of a rejected time step.

During Step 17, the partitioning of the vector of generalized coordinates is checked. If necessary, a new dependent/independent partitioning is determined. This process was detailed earlier for the trapezoidal-based State-Space Method. Finally, Step 18 is the end of the simulation loop.

4.3 Trapezoidal-Based Descriptor Form Implicit Integration

The same trapezoidal integration formula used in the framework of the State-Space Method is used here in conjunction with the Descriptor Form Method. Characteristics of this integrator, such as stability properties, order, etc., were presented in Section 4.1.1. Pseudo-code for trapezoidal-based implicit integration of DAE via the Descriptor Form Method is presented below.

Algorithm Pseudo-code

A large part of the preceding discussion regarding SDIRK4/16-based integration via the Descriptor Form Method applies here. Although the discretization is based on the trapezoidal rather than the SDIRK4/16 formula, the numerical implementation is identical, with the exception that instead of having 5 stages, there is only one stage that advances the simulation.

The pseudo-code of the algorithm is provided in Table 5. During Step 4, the system configuration at the beginning of the time step is saved. This information is used after a rejected time step for reinitializing the state of the system and restarting integration with a new step-size. Compared to the pseudo-code for

SDIRK4/16, the Setup Stage step and the loop over the 5 stages do not appear. All other steps of the implementation are as discussed earlier in conjunction with the SDIRK4/16 algorithm. Additional information about this algorithm is provided in the work of Haug, Negrut, and Engstler (1998).

Table 5. Pseudo-code for Trapezoidal-Based Descriptor Form Method

 1. Initialize Simulation
 2. Set Integration Tolerance
 3. While (t < tend) do
 4.   Setup Step
 5.   Get Integration Jacobian
 6.   Sparse Factor Integration Jacobian
 7.   Do while (.NOT. converged)
 8.     Integrate
 9.     Recover Positions and Velocities
10.     Get Error Term
11.     Correct Accelerations and Lagrange Multipliers
12.   End do
13.   Check Accuracy. Determine New Step-size
14.   Check Partition
15. End do
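The accept/reject logic of the accuracy check can be illustrated on the scalar test equation $y' = \lambda y$, $y(0) = 1$, for which both the trapezoidal formula and its backward Euler companion have closed-form updates. The sketch below is a toy under stated assumptions, not the multibody implementation: the function name `integrate_test_equation` and the default values of `fac`, `facmin`, and `facmax` are illustrative choices, and no Newton iteration or partition check is needed for this linear problem.

```python
import math


def integrate_test_equation(lam, t_end, h, atol, rtol,
                            fac=0.9, facmin=0.2, facmax=5.0):
    """Adaptive trapezoidal integration of y' = lam*y, y(0) = 1.

    Backward Euler supplies the embedded solution. Returns the final time,
    the final value, and the counts of accepted and rejected steps.
    """
    t, y = 0.0, 1.0
    accepted = rejected = 0
    just_rejected = False
    q = 1  # min(order 2 of trapezoidal, order 1 of backward Euler)
    while t_end - t > 1e-12:
        h = min(h, t_end - t)            # do not step past the end time
        z = h * lam
        y_trap = y * (1 + z / 2) / (1 - z / 2)   # trapezoidal update
        y_be = y / (1 - z)                        # backward Euler update
        sc = atol + max(abs(y), abs(y_trap)) * rtol
        err = max(abs(y_trap - y_be) / sc, 1e-10)  # floor avoids division by zero
        # No step-size growth is allowed right after a rejection.
        growth_cap = 1.0 if just_rejected else facmax
        h_new = h * min(growth_cap, max(facmin, fac * (1 / err) ** (1 / (q + 1))))
        if err <= 1.0:
            t, y = t + h, y_trap         # accept and advance
            accepted += 1
            just_rejected = False
        else:
            rejected += 1                # retry from the same state
            just_rejected = True
        h = h_new
    return t, y, accepted, rejected


t_final, y_final, n_acc, n_rej = integrate_test_equation(
    lam=-2.0, t_end=1.0, h=0.5, atol=1e-4, rtol=1e-4)
```

With the deliberately large initial step-size used here, the first attempts are rejected and the controller shrinks the step until the error measure drops below 1.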

4.4 Rosenbrock-Based First Order Implicit Integration

4.4.1 General Considerations

The fundamental idea of the proposed algorithm is to reduce the second order SSODE to an equivalent first order ODE, and apply a Rosenbrock formula to integrate the ODE. The discussion starts with a brief description of Rosenbrock methods, followed by a more detailed presentation of how this class of methods is used to integrate the SSODE of Multibody Dynamics. Finally, a four stage, order 4, L-stable Rosenbrock method is derived (Sandu, et al., 1998).

For the differential equation

\[
y' = f(y) \tag{4.13}
\]

an $s$ stage diagonally implicit Runge-Kutta method defined by the set of coefficients $(a_{ij}, b_i)$ assumes the form

\[
k_i = h f\left( y_0 + \sum_{j=1}^{i-1} a_{ij} k_j + a_{ii} k_i \right), \qquad i = 1, \ldots, s \tag{4.14}
\]

\[
y_1 = y_0 + \sum_{i=1}^{s} b_i k_i \tag{4.15}
\]

At stage $i$, the system of nonlinear equations of Eq. (4.14) is solved for $k_i$. The solution is obtained as a linear combination of stage values $k_i$ as indicated in Eq. (4.15).

For Rosenbrock methods, instead of solving a non-linear system at each stage, the term $f(\cdot)$ in the right side of Eq. (4.14) is linearized in terms of $k_i$. The stage value $k_i$ is then obtained as the solution of a linear system of equations. The benefit of Rosenbrock methods lies in this approach to finding $k_i$. It can be shown (Hairer and Wanner, 1996) that a Rosenbrock method can be interpreted as the application of one

Newton iteration to each stage in Eq. (4.14), with starting values $k_i^{(0)} = 0$. This approach yields the following method:

\[
k_i = h f(g_i) + h f'(g_i)\, a_{ii} k_i, \qquad g_i \equiv y_0 + \sum_{j=1}^{i-1} a_{ij} k_j \tag{4.16}
\]

To gain even more computational advantage, the Jacobian term $f'(g_i)$ in Eq. (4.16) is replaced by $J \equiv f'(y_0)$. To satisfy order conditions more easily, additional linear combinations of terms $J k_j$ are introduced into Eq. (4.16) (Nørsett and Wolfbrandt 1979, Kaps and Rentrop 1979), to arrive at the following class of methods.

Definition 4.1. An $s$-stage Rosenbrock method is given by the formulas

\[
\begin{aligned}
k_i &= h f\left( y_0 + \sum_{j=1}^{i-1} a_{ij} k_j \right) + h J \sum_{j=1}^{i} e_{ij} k_j, \qquad i = 1, \ldots, s \\
y_1 &= y_0 + \sum_{i=1}^{s} b_i k_i
\end{aligned}
\tag{4.17}
\]

where $a_{ij}$, $e_{ij}$, and $b_i$ are the defining coefficients and $J = f'(y_0)$.

Each stage of this method consists of a system of linear equations, with unknowns $k_i$ and coefficient matrix $I - h e_{ii} J$. Of special interest are methods with $e_{11} = \cdots = e_{ss} = e$, for which only one LU decomposition is needed per step.

The case of non-autonomous ordinary differential equations

\[
y' = f(t, y) \tag{4.18}
\]

is converted to autonomous form by appending to the original ODE the differential equation $t' = 1$, and regarding time as one of the variables. In this case, if the method of Eq. (4.17) is applied, the corresponding Rosenbrock method is obtained as

\[
\begin{aligned}
k_i &= h f\left( t_0 + a_i h,\; y_0 + \sum_{j=1}^{i-1} a_{ij} k_j \right) + e_i h^2 f_t(t_0, y_0) + h f_y(t_0, y_0) \sum_{j=1}^{i} e_{ij} k_j \\
y_1 &= y_0 + \sum_{i=1}^{s} b_i k_i
\end{aligned}
\tag{4.19}
\]

where the additional coefficients are given by

\[
a_i = \sum_{j=1}^{i-1} a_{ij}, \qquad e_i = \sum_{j=1}^{i} e_{ij} \tag{4.20}
\]

and the notation $f_t \equiv \partial f / \partial t$ and $f_y \equiv \partial f / \partial y$ has been used.

Rosenbrock-Nystrom Methods

Assume in what follows that a second order scalar ordinary differential equation is given in the form

\[
y'' = f(t, y, y') \tag{4.21}
\]

The assumption that the differential equation in Eq. (4.21) is scalar simplifies notation and makes the presentation more accessible. The implications of the case in which $y$ is a vector will be pointed out when necessary. Technically, they amount to substituting certain matrix-matrix and matrix-vector products with tensorial products.

The goal is to apply the formula of Eq. (4.19) to the differential equation in Eq. (4.21). This is the scenario when the SSODE obtained after DAE-to-ODE reduction is integrated numerically. First, the second order ODE is reduced to the first order ODE

\[
\begin{bmatrix} y \\ y' \end{bmatrix}' = \begin{bmatrix} y' \\ f(t, y, y') \end{bmatrix} \tag{4.22}
\]
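The reduction of Eq. (4.22) amounts to wrapping the acceleration function so that it returns the derivative of the stacked state $(y, y')$. A minimal sketch follows; the helper name `to_first_order` and the sample right side are assumptions made for illustration.

```python
def to_first_order(f):
    """Wrap y'' = f(t, y, yp) as a first order system in the form of Eq. (4.22).

    The returned function maps (t, (y, yp)) to the state derivative
    (yp, f(t, y, yp)).
    """
    def rhs(t, state):
        y, yp = state
        return (yp, f(t, y, yp))
    return rhs


# Hypothetical example: a damped, forced oscillator y'' = -4 y - y' + t.
def acceleration(t, y, yp):
    return -4.0 * y - yp + t


first_order_rhs = to_first_order(acceleration)
state_derivative = first_order_rhs(1.0, (2.0, 3.0))
```

Any first order integrator can then be applied to `first_order_rhs`; the Rosenbrock-Nystrom construction that follows exploits the special structure of this system rather than treating it as a generic one.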

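The Rosenbrock method is formally applied to integrate the first order ODE of Eq. (4.22). If $k_i$ and $r_i$ are intermediate stage values at independent position and velocity levels, applying the method of Eq. (4.19) to the differential equation of Eq. (4.22) yields

\[
\begin{bmatrix} k_i \\ r_i \end{bmatrix}
= h \begin{bmatrix} y_0' + \sum_{j=1}^{i-1} a_{ij} r_j \\[4pt] f\left( t_0 + a_i h,\; y_0 + \sum_{j=1}^{i-1} a_{ij} k_j,\; y_0' + \sum_{j=1}^{i-1} a_{ij} r_j \right) \end{bmatrix}
+ e_i h^2 \begin{bmatrix} 0 \\ f_t(t_0, y_0, y_0') \end{bmatrix}
+ h \begin{bmatrix} 0 & I \\ J_1 & J_2 \end{bmatrix} \sum_{j=1}^{i} e_{ij} \begin{bmatrix} k_j \\ r_j \end{bmatrix}
\tag{4.23}
\]

\[
\begin{bmatrix} y_1 \\ y_1' \end{bmatrix} = \begin{bmatrix} y_0 \\ y_0' \end{bmatrix} + \sum_{i=1}^{s} b_i \begin{bmatrix} k_i \\ r_i \end{bmatrix}
\tag{4.24}
\]

where

\[
J_1 \equiv f_y(t_0, y_0, y_0'), \qquad J_2 \equiv f_{y'}(t_0, y_0, y_0') \tag{4.25}
\]

Although $a_{ii}$, $i = 1, \ldots, s$, do not appear among the coefficients of the formula, they are introduced and formally set to zero, to simplify the summation indices. Likewise, the following notation is introduced

\[
a_i = \sum_{j=1}^{i} a_{ij}, \qquad e_i = \sum_{j=1}^{i} e_{ij} \tag{4.26}
\]

To obtain a numerical method able to directly integrate Eq. (4.21), the $y$-stage unknowns $k_i$ are expressed in terms of the $y'$-stage unknowns $r_i$. Denoting $\beta_{ij} = a_{ij} + e_{ij}$, the first row of Eq. (4.23) becomes

\[
k_i = h y_0' + h \sum_{j=1}^{i} \beta_{ij} r_j, \qquad i = 1, \ldots, s \tag{4.27}
\]

In what follows, using Eq. (4.27), the unknowns $k_i$ are systematically eliminated from Eq. (4.23).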

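The term $\sum_{j=1}^{i} a_{ij} k_j$ appears as the second argument of $f(\cdot, \cdot, \cdot)$. Using Eq. (4.27), and interchanging the order of summation, this term is expressed as

\[
\begin{aligned}
\sum_{j=1}^{i} a_{ij} k_j
&= \sum_{j=1}^{i} a_{ij} \left( h y_0' + h \sum_{m=1}^{j} \beta_{jm} r_m \right)
 = h a_i y_0' + h \sum_{j=1}^{i} \sum_{m=1}^{j} a_{ij} \beta_{jm} r_m \\
&= h a_i y_0' + h \left( \sum_{j=1}^{i} a_{ij} \beta_{j1}\, r_1 + \sum_{j=2}^{i} a_{ij} \beta_{j2}\, r_2 + \cdots + a_{ii} \beta_{ii}\, r_i \right)
 = h a_i y_0' + h \sum_{j=1}^{i} \mu_{ij} r_j
\end{aligned}
\]

where

\[
\mu_{ij} = \sum_{m=j}^{i} a_{im} \beta_{mj} \tag{4.28}
\]

Since this procedure of expressing $k$ in terms of $r$ will appear several times during this discussion, a matrix representation that simplifies the process is introduced. If $k \equiv [k_1, k_2, \ldots, k_s]^T$ and $r \equiv [r_1, r_2, \ldots, r_s]^T$, based on Eq. (4.27),

\[
k = h y_0' \mathbf{1} + h B r \tag{4.29}
\]

where the vector $\mathbf{1} \in R^s$ is defined as $[1, 1, \ldots, 1]^T$, and

\[
B \equiv (\beta_{ij}), \qquad B \in R^{s \times s} \tag{4.30}
\]

For stages $i = 1, \ldots, s$, terms such as $\sum_{j=1}^{i} a_{ij} k_j$ can be simultaneously evaluated by using the matrix notation introduced. In this notation, using Eq. (4.29), the array $z \in R^s$, $z \equiv \left( \sum_{j=1}^{i} a_{ij} k_j \right)$, can be expressed as

\[
z = A k = A \left( h y_0' \mathbf{1} + h B r \right) = h y_0' \begin{bmatrix} a_1 \\ \vdots \\ a_s \end{bmatrix} + h\, A B\, r
\]

where, if multiplication is carried out to compute the matrix $AB$, it can be seen that $AB = (\mu_{ij})$, with $\mu_{ij}$ given by Eq. (4.28). The second row of Eq. (4.23) can now be expressed as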

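\[
\begin{aligned}
r_i = {} & h f\left( t_0 + a_i h,\; y_0 + h a_i y_0' + h \sum_{j=1}^{i-1} \mu_{ij} r_j,\; y_0' + \sum_{j=1}^{i-1} a_{ij} r_j \right) \\
& + e_i h^2 \frac{\partial f}{\partial t}(t_0, y_0, y_0') + h J_1 \sum_{j=1}^{i} e_{ij} k_j + h J_2 \sum_{j=1}^{i} e_{ij} r_j
\end{aligned}
\tag{4.31}
\]

In order to eliminate the $k_j$ that multiply $J_1$ in Eq. (4.31), the matrix notation and the notation in Eq. (4.26) are used to yield

\[
\left( \sum_{j=1}^{i} e_{ij} k_j \right) = E k = E \left( h y_0' \mathbf{1} + h B r \right) = h y_0' \begin{bmatrix} e_1 & \cdots & e_s \end{bmatrix}^T + h Q r
\]

where $Q = (q_{ij}) \equiv EB$. Stage $i$ finally assumes the form

\[
\begin{aligned}
r_i = {} & h f\left( t_0 + a_i h,\; y_0 + h a_i y_0' + h \sum_{j=1}^{i-1} \mu_{ij} r_j,\; y_0' + \sum_{j=1}^{i-1} a_{ij} r_j \right) \\
& + e_i h^2 \left( \frac{\partial f}{\partial t}(t_0, y_0, y_0') + J_1 y_0' \right) + h^2 J_1 \sum_{j=1}^{i} q_{ij} r_j + h J_2 \sum_{j=1}^{i} e_{ij} r_j
\end{aligned}
\tag{4.32}
\]

Since $q_{ii} = e_{ii} \beta_{ii} = e_{ii} (e_{ii} + a_{ii}) = e_{ii}^2 = e^2$, at stage $i$, $1 \le i \le s$, $r_i$ is computed as the solution of the linear system

\[
\begin{aligned}
S r_i = {} & h f\left( t_0 + a_i h,\; y_0 + h a_i y_0' + h \sum_{j=1}^{i-1} \mu_{ij} r_j,\; y_0' + \sum_{j=1}^{i-1} a_{ij} r_j \right) \\
& + e_i h^2 \left( \frac{\partial f}{\partial t}(t_0, y_0, y_0') + J_1 y_0' \right) + h^2 J_1 \sum_{j=1}^{i-1} q_{ij} r_j + h J_2 \sum_{j=1}^{i-1} e_{ij} r_j
\end{aligned}
\tag{4.33}
\]

\[
S \equiv I - h e J_2 - h^2 e^2 J_1 \tag{4.34}
\]

Due to the choice of coefficients $e_{11} = \cdots = e_{ss} = e$, only one matrix factorization needs to be computed per successful time step. At each stage, this factorization is used to compute $r_i$. Note that the right side of Eq. (4.33) depends on past information only. Once $r_i$, $i = 1, \ldots, s$, are available, the solution is obtained from Eq. (4.24) via Eq. (4.27).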
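To make the structure of Eqs. (4.33) and (4.34) concrete, the sketch below specializes them to the simplest possible case: one stage ($s = 1$) for a scalar equation, with $a_{11} = 0$, $e_{11} = e$, and $b_1 = 1$, so that $\beta_{11} = e$. This is a linearly implicit Euler scheme of order one, chosen only for illustration; it is not the four stage, order 4 formula developed in this Chapter. The function name `rn_one_stage_step` and its argument list are hypothetical.

```python
def rn_one_stage_step(f, f_t, j1, j2, t0, y0, yp0, h, e=1.0):
    """One step of a one-stage Rosenbrock-Nystrom scheme for scalar y'' = f(t, y, y').

    f_t, j1, j2 are the partial derivatives of f with respect to t, y, and y',
    all evaluated at (t0, y0, yp0) and supplied by the caller.
    Assumed coefficients: a_11 = 0, e_11 = e, b_1 = 1.
    """
    # Scalar version of S = I - h*e*J2 - h^2*e^2*J1, Eq. (4.34).
    s_matrix = 1.0 - h * e * j2 - (h * e) ** 2 * j1
    # Right side of Eq. (4.33); with a single stage both sums are empty.
    rhs = h * f(t0, y0, yp0) + e * h * h * (f_t + j1 * yp0)
    r1 = rhs / s_matrix
    # Velocity and position updates with b_1 = 1 and m_1 = b_1 * beta_11 = e.
    yp1 = yp0 + r1
    y1 = y0 + h * yp0 + h * e * r1
    return y1, yp1


# Linear oscillator y'' = -k*y - c*y' with hypothetical data k = 4, c = 1.
K_STIFF, C_DAMP = 4.0, 1.0


def oscillator(t, y, yp):
    return -K_STIFF * y - C_DAMP * yp


y_new, yp_new = rn_one_stage_step(
    oscillator, f_t=0.0, j1=-K_STIFF, j2=-C_DAMP,
    t0=0.0, y0=1.0, yp0=0.0, h=0.5)
```

For a linear problem and $e = 1$ this step coincides with backward Euler applied to the equivalent first order system, which gives a simple way to check the formulas by hand.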

In matrix notation, if $b$ is the row vector $b = [b_1, \ldots, b_s]$, the solution at the velocity level is given by

\[
y_1' = y_0' + b r \tag{4.35}
\]

Taking into account Eq. (4.29) and the fact that $b_1 + \cdots + b_s = 1$ for Runge-Kutta methods, the solution at the position level is obtained as

\[
y_1 = y_0 + h y_0' + h m r \tag{4.36}
\]

where the row vector $m$ is defined as

\[
m = bB \tag{4.37}
\]

To summarize, the following linearly implicit method for solution of the second order ODE of Eq. (4.21) has been defined:

\[
Y_i = y_0 + h a_i y_0' + h \sum_{j=1}^{i-1} \mu_{ij} r_j \tag{4.38}
\]

\[
Y_i' = y_0' + \sum_{j=1}^{i-1} a_{ij} r_j \tag{4.39}
\]

\[
S r_i = h f(t_0 + a_i h, Y_i, Y_i') + e_i h^2 \left( \frac{\partial f}{\partial t}(t_0, y_0, y_0') + J_1 y_0' \right) + h^2 J_1 \sum_{j=1}^{i-1} q_{ij} r_j + h J_2 \sum_{j=1}^{i-1} e_{ij} r_j \tag{4.40}
\]

\[
y_1 = y_0 + h y_0' + h \sum_{i=1}^{s} m_i r_i \tag{4.41}
\]

\[
y_1' = y_0' + \sum_{i=1}^{s} b_i r_i \tag{4.42}
\]

Following a remark of Hairer and Wanner (1996), when implementing the Rosenbrock method of Eqs. (4.38) through (4.42), one matrix-vector multiplication can be saved. For this, a new set of unknowns is introduced as

\[
u_i \equiv \sum_{j=1}^{i} e_{ij} r_j \tag{4.43}
\]

Using matrix notation, the original unknowns $r$ are expressed in terms of $u \equiv [u_1, \ldots, u_s]^T$ as

\[
r = E^{-1} u \tag{4.44}
\]

Denoting $\tilde{f} \equiv [f(t_0 + a_1 h, Y_1, Y_1'), \ldots, f(t_0 + a_s h, Y_s, Y_s')]^T$, Eq. (4.32) can be rewritten in matrix form as

\[
e r = h e \tilde{f} + e h^2 (f_t + J_1 y_0')\,(e_i) + h^2 e J_1 Q r + h e J_2 E r \tag{4.45}
\]

Simple manipulations of Eq. (4.45) yield

\[
S u = h e \tilde{f} + e h^2 (f_t + J_1 y_0')\,(e_i) + h^2 J_1 e\, (Q r - e u) + (E - \mathrm{diag}(e))\, r \tag{4.46}
\]

Using the new set of unknowns $u$ eliminates the multiplication in Eq. (4.46) with $J_2$. It remains to express terms containing the unknowns $r$ in the right side of Eq. (4.46) in terms of $u$. Thus,

\[
(E - \mathrm{diag}(e))\, r = (E - \mathrm{diag}(e))\, E^{-1} u = e \left( \mathrm{diag}(e^{-1}) - E^{-1} \right) u \tag{4.47}
\]

and

\[
\begin{aligned}
Q r - e u &= Q E^{-1} u - e u = \left[ E (A + E) E^{-1} - \mathrm{diag}(e) \right] u \\
&= \left[ E A E^{-1} + E - \mathrm{diag}(e) \right] u
\end{aligned}
\tag{4.48}
\]

To take advantage of the matrix notation introduced, in what follows $(e_{ij})$ represents the matrix $E$ used above, $(a_{ij})$ stands for the matrix whose elements are the coefficients $a_{ij}$, etc. With this, the algorithm for Rosenbrock-based integration of the differential equation of Eq. (4.21) is as follows:

Algorithm: Rosenbrock-Nystrom

Given the coefficients $(a_{ij})$, $(e_{ij})$, $(b_i)$, $(\hat{b}_i)$ of an $s$ stage Rosenbrock method, the associated Rosenbrock-Nystrom method is defined as

153 4 s y = y0 + hy0 + h pu, y$ y hy h pu $ = (4.49) = s = y = y + su 0 s =, y$ = y + su $ 0 s = (4.50) Y = y + ha y + h t u 0 0 j j j= (4.5) - Í YŠ= yš + w u 0 j j j= (4.52) 2 f Su = hef ( t0 + ah, Y, Y ) + h ee ( t0, y0, y0 ) + Jy0 t (4.53) 2 + e c u + h ej d u j= j j j= j j where ( w ) = ( a )( e ) j j j ( c ) = dag( e) ( e ) j ( d ) = ( e ) + ( e )( a )( e ) j j j j j ( t ) = ( a ) + ( a ) ( e ) 2 j j j j ( s ) = ( b)( e ) j ( s$ ) = ( b)( e ) j ( p ) = ( b) + ( b)( a )( e ) j j ( p$ ) = ( b $ ) + ( b $ )( a )( e ) j j ( e ) = ( e ) j ( a ) = ( a ) j j (4.54) The matrces of coeffcents ( a j ), ( e j ), and ( b ) defne the Rosenbrock-Nystrom algorthm, ncludng order and stablty propertes. All other coeffcents of the algorthm are derved based on these coeffcents.

Depending on the application that is numerically integrated, different orders and stability requirements should be considered. For the specific class of Multibody Dynamics problems, a low-to-medium accuracy method with good stability properties is desirable. Formulas with few function evaluations are favored, since function evaluations for mechanical system simulation amount to acceleration evaluations, which are expensive. A method that is L-stable allows for robust integration of very stiff problems. Consequently, bushing elements and flexible components used in modeling mechanical systems can be efficiently handled. An order 4 formula should reliably meet all accuracy requirements likely to be considered for Multibody Dynamics simulation.

With these considerations in mind, the defining matrices of coefficients $(a_{ij})$, $(e_{ij})$, and $(b_i)$ are selected such that the resulting Rosenbrock-Nystrom method is of order 4 and is L-stable. The number of stages of the method is chosen to be 4. This leaves some freedom in the choice of the embedded method for step-size control. Following an idea suggested by Hairer and Wanner (1996), the number of function evaluations is kept to 3; i.e., one function evaluation is saved. This makes the Rosenbrock-Nystrom method competitive with the trapezoidal method whenever the latter method requires 3 or more iterations for convergence. However, the proposed method is L-stable while the trapezoidal method is not. Furthermore, the difference in order is 2: 4 for Rosenbrock-Nystrom, and 2 for trapezoidal.

Order Conditions For Rosenbrock-Nystrom Algorithm

The order conditions are the conditions that the defining coefficients $(a_{ij})$, $(e_{ij})$, and $(b_i)$ should satisfy such that the method has a certain order. The discussion here is based

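on results presented by Hairer and Wanner (1996). In order to construct an order 4 Rosenbrock method, the order conditions are as follows:

$$b_1 + b_2 + b_3 + b_4 = 1 \tag{4.55}$$
$$b_2\beta_2' + b_3\beta_3' + b_4\beta_4' = \tfrac{1}{2} - e \tag{4.56}$$
$$b_2 a_2^2 + b_3 a_3^2 + b_4 a_4^2 = \tfrac{1}{3} \tag{4.57}$$
$$b_3\beta_{32}\beta_2' + b_4(\beta_{42}\beta_2' + \beta_{43}\beta_3') = \tfrac{1}{6} - e + e^2 \tag{4.58}$$
$$b_2 a_2^3 + b_3 a_3^3 + b_4 a_4^3 = \tfrac{1}{4} \tag{4.59}$$
$$b_3 a_3 a_{32}\beta_2' + b_4 a_4 (a_{42}\beta_2' + a_{43}\beta_3') = \tfrac{1}{8} - \tfrac{e}{3} \tag{4.60}$$
$$b_3\beta_{32} a_2^2 + b_4(\beta_{42} a_2^2 + \beta_{43} a_3^2) = \tfrac{1}{12} - \tfrac{e}{3} \tag{4.61}$$
$$b_4\beta_{43}\beta_{32}\beta_2' = \tfrac{1}{24} - \tfrac{e}{2} + \tfrac{3}{2}e^2 - e^3 \tag{4.62}$$

where the following abbreviations are used:

$$a_i = \sum_{j=1}^{i-1} a_{ij}, \qquad \beta_i' = \sum_{j=1}^{i-1} \beta_{ij}$$

The step-size control mechanism is based on an embedded formula (Kaps and Rentrop, 1979) of the form

$$\hat{y}_1 = y_0 + \sum_{j=1}^{s} \hat{b}_j k_j$$

which uses the stage values $k_j$ from Eqs. (4.9). The embedded method should be of order 3; i.e., the order conditions of Eqs. (4.55) through (4.58) must be satisfied. These conditions lead to the system

$$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & \beta_2' & \beta_3' & \beta_4' \\ 0 & a_2^2 & a_3^2 & a_4^2 \\ 0 & 0 & \beta_{32}\beta_2' & \beta_{42}\beta_2' + \beta_{43}\beta_3' \end{bmatrix} \begin{bmatrix} \hat{b}_1 \\ \hat{b}_2 \\ \hat{b}_3 \\ \hat{b}_4 \end{bmatrix} = \begin{bmatrix} 1 \\ \tfrac{1}{2} - e \\ \tfrac{1}{3} \\ \tfrac{1}{6} - e + e^2 \end{bmatrix} \tag{4.63}$$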

If the coefficient matrix in Eq. (4.63) is non-singular, uniqueness of the solution of this linear system implies $b_i = \hat{b}_i$, $i = 1, \ldots, 4$, and the order 3 embedded method cannot be used for step-size control. Consequently, besides the order conditions imposed on the defining coefficients $a_{ij}$, $e_{ij}$, and $b_i$, one additional condition should be considered for step-size control purposes, namely that the coefficient matrix in Eq. (4.63) be singular. Because the weights $b_i$ of the order 4 method already solve this system, a singular matrix admits further solutions $\hat{b}_i \neq b_i$. This condition therefore guarantees the existence of a third order embedded method when the system of non-linear equations (4.55) through (4.62) possesses a solution. The condition assumes the form

$$(\beta_2' a_4^2 - \beta_4' a_2^2)\,\beta_{32}\beta_2' = (\beta_2' a_3^2 - \beta_3' a_2^2)(\beta_{42}\beta_2' + \beta_{43}\beta_3') \tag{4.64}$$

The number of conditions that should be satisfied so far by the coefficients of the method is 9. The number of unknowns is 17: the diagonal element $e$, 6 coefficients $e_{ij}$, 6 coefficients $a_{ij}$, and 4 weights $b_i$. There are several degrees of freedom in the choice of coefficients, which are primarily used to construct a method with few function evaluations. A first set of conditions is set in the form

$$a_{43} = 0, \qquad a_{42} = a_{32}, \qquad a_{41} = a_{31} \tag{4.65}$$

These conditions ensure that the argument of $f$ in Eqs. (4.40) and (4.53) is the same for stages 3 and 4. Hence, the number of function evaluations is reduced by one. Further, free parameters can be determined such that several conditions of order five are satisfied. When Eq. (4.65) is satisfied, one of the nine order 5 conditions is satisfied, provided

$$a_3 = \frac{\tfrac{1}{5} - \tfrac{a_2}{4}}{\tfrac{1}{4} - \tfrac{a_2}{3}} \tag{4.66}$$

A second order 5 condition is satisfied by imposing the condition

157 2 e e b4β 43a3( a3 a2) = a (4.67) 45 Next, two condtons are chosen as b a 3 2 = 0 = 2e (4.68) to make the task of fndng a soluton more tractable. The last condton regards the choce of the dagonal element e. The value of ths parameter determnes the stablty propertes of the Rosenbrock method. Snce nterest s n constructng an L-stable method, the coeffcent e should be taken as (Harer and Wanner, 996) e = (4.69) Equatons (4.55) through (4.62) and (4.64) through (4.69) comprse a system of 7 non-lnear equatons n 7 unknowns. The soluton of ths system s (Sandu, et al., 998): e = e e e e e e = = = = = = a a a a a a = = = = = = 0.0 (4.70)

158 46 b b b b = = = 00. = These coeffcents defne the proposed 4-stage, L-stable, order 4 Rosenbrock-Nystrom method wth 3 functon evaluatons per successful ntegraton tme-step. The weghts of the embedded formula are computed as the soluton of the lnear system n Eq. (4.63), b$ = b$ 2 = b$ 3 = b$ = (4.7) In order to save one matrx-vector multplcaton, the approach n Eqs. (4.49) through (4.53) s mplemented. The coeffcents n ths formulaton are obtaned from the coeffcents of Eqs. (4.70) and (4.7), based on Eq. (4.54). The followng are the coeffcents that are actually used n the numercal mplementaton of the proposed Rosenbrock-Nystrom method: w w w w w w c c c c c c = E0 = E = E0 = E = E0 = 00. = E = E = E0 = E = E0 = E0 (4.72) (4.73)

159 47 d d d d d d t t t t t t s s s s ev ev ev ev p p p p ep ep ep ep = E = E = E0 = E0 = E 0 = E0 = E = E0 = E0 = E0 = E0 = 00. = E = E0 = E0 = E = E 0 = E 0 = E 0 = E 0 = E = E0 = 00. = E0 = E 0 = E 0 = E0 = E 0 (4.74) (4.75) (4.76) (4.77) (4.78) (4.79)

160 48 e e e e a a a = E0 = E = E0 2 3 = E0 = E = E0 = E0 (4.80) (4.8) In Eqs. (4.77) and (4.79), nstead of provdng the value of the weghts $s and $p for the embedded method, the dfference between the weghts of the actual and embedded methods were computed. The coeffcents ev and ep, =, K, 4 are used to compute the composte error at the poston and velocty levels, usng the stage unknowns u Algorthm Pseudo-code Pseudo-code for numercal mplementaton of the Rosenbrock-Nystrom method presented n the prevous two Sectons s provded n Table 6. Codng ths algorthm requres mplementaton of Eqs. (4.49) through (4.53), usng coeffcents provded n the prevous secton. The frst two steps of the mplementaton are dentcal to those of the prevously presented pseudo-codes. Step 4 saves the system confguraton to be used upon a rejected tme step. Durng the followng two steps, the ntegraton Jacoban s evaluated and factored. The quantty that s computed s n fact not the matrx S of Eq. (4.34), as the Rosenbrock formula suggests, but the matrx Π ( e 2 ) MS, $ wth M $ beng the postve defnte matrx of Eq. (3.6). Matrx Π s not obtaned by frst computng the matrces M $ and S, and then multplyng them. As shown n Secton (Proposton ), the matrx Π s dentcal to the ntegraton Jacoban Ψ &&v of the State

Space Method. Compared to matrix $S$, $\Psi_{\ddot{v}}$ is easier to compute. Therefore, at each stage of the Rosenbrock formula, instead of solving systems of the form $Sx = b$, the vector $x$ is obtained as the solution of the equivalent system

$$\Pi x = \hat{M} b \tag{4.82}$$

Table 16. Pseudo-code for Rosenbrock-Nystrom-Based First Order Reduction Method

1. Initialize Simulation
2. Set Integration Tolerance
3. While (t < tend) do
4.    Set Macro-step
5.    Get Integration Jacobian
6.    Factor Integration Jacobian
7.    Get the Time Derivative
8.    Resolve Stage 1
9.    Resolve Stage 2
10.   Resolve Stage 3
11.   Resolve Stage 4
12.   Get Solution. Check Accuracy. Compute new Step-size
13.   Check Partition
14. End do

The matrix $\Pi$ is factored during Step 6. The dimension of this matrix is equal to the number of degrees of freedom of the model, and therefore is rather small. During

Step 7, the partial derivative $\partial f / \partial t$ is evaluated in the consistent configuration from the beginning of each macro-step. In the case of scleronomic mechanical systems, for which all the external applied forces are time independent, the quantity $\partial f / \partial t$ is zero. Otherwise, adequate means to provide this quantity should be embedded in the code.

The next four steps compute the stage variables $u_i$, $i = 1, \ldots, 4$. The challenge is how to retrieve the right side of Eq. (4.53). During each of the four stages, some or all of the following sub-steps are taken:

(a). Obtain a consistent configuration at position and velocity levels
(b). Compute generalized accelerations
(c). Evaluate right side of Eq. (4.53)
(d). Solve linear system to obtain the stage values $u_i$

Sub-step (a) is justified by the necessity to compute the generalized accelerations required by sub-step (c). Since the fundamental idea in DAE integration hinges upon state-space reduction, the Rosenbrock formula actually integrates independent variables only. Dependent variables must be recovered, since any call to acceleration computation requires consistent position and velocity information. Independent positions are computed based on Eq. (4.51), while independent velocities are computed at each stage, as indicated by Eq. (4.52). After recovering dependent positions and velocities, based on the kinematic constraint equations of Eqs. (3.7) and (3.8), the topology based linear algebra algorithms introduced earlier are employed to compute accelerations during sub-step (b).

During sub-step (c), the available information is used to obtain the right side of Eq. (4.53). The actual implementation is based on the matrix $\Pi$. Therefore, one additional step is taken to multiply $\hat{M}$ of Eq. (3.6) with the original right side. As

shown earlier, this approach adds one matrix-vector multiplication, but eliminates the need to explicitly solve for $J_1$ and $J_2$.

Due to the particular choice of coefficients defining the Rosenbrock formula and the way in which the code was implemented, each of the four stages has its own particularities. Thus,

(1). Stage 1 (Step 8 of the pseudo-code) coincides with the beginning of the current macro-step, or equivalently the end of the prior macro-step. Therefore, the system is in an assembled configuration, and sub-step (a) is skipped. During this stage, the matrix $\Pi$ is evaluated, and generalized accelerations are obtained cheaply as a byproduct of this effort. To obtain $\Pi$, the matrix $\hat{M}$ is computed, and the dependent sub-Jacobian $\Phi_u$ is factored. The latter is needed to obtain the matrices $H$, $J$, $L$, and $N$ introduced earlier. Once these two quantities are available, $\ddot{v}$ is first computed as the solution of the positive definite linear system of Eq. (3.5), and $\ddot{u}$ is computed using the acceleration kinematic equation of Eq. (3.9), since $\Phi_u$ is already factored. Consequently, sub-steps (a) and (b) are effectively dealt with by Stage 1. Obtaining the right side during sub-step (c) is straightforward. Then, the matrix $\Pi$ is factored, and the stage value $u_1$ is computed. Note that the factorization of $\Pi$ is used for all stages of the formula, since the diagonal elements in Butcher's tableau are identical.

(2). Stages 2 and 3 (Steps 9 and 10) simply follow sub-steps (a) through (d) above.

(3). Stage 4 bypasses sub-step (b), the function evaluation, due to the choice of formula defining coefficients in Eq. (4.65). Since no acceleration computation is required, there is no need to provide a consistent configuration at the position and velocity levels. Therefore sub-step (a) is skipped. It remains to compute the right

side of the linear equation and, with the coefficient matrix already factored, to do a forward/backward substitution to obtain $u_4$.

Once all the stage values $u_i$, $i = 1, \ldots, 4$ are available, the solution at the new grid point is computed during Step 12, based on Eqs. (4.49) and (4.50). Extra effort goes into recovering dependent positions and velocities to obtain a consistent configuration of the mechanism, which is used during the next time step, at Stage 1, sub-step (a). Based on a second approximation of the solution at the new grid point given by the embedded formula, values $\hat{y}_1$ and $\hat{y}_1'$ of Eqs. (4.49) and (4.50), the accuracy of the solution is analyzed, and the step is accepted or rejected. As a result of error analysis, a new step-size is provided. This is used to recompute the current time step upon a rejected step, or to proceed to the next time step upon a successful step.

Finally, the last step of the algorithm checks the partitioning. Based on the condition number of the dependent sub-Jacobian $\Phi_u$, which is available on exit from Step 12, and the value of the repartitioning coefficient $\alpha$ previously introduced, the partitioning can be rejected or preserved for the next time step.

The most important feature of any Rosenbrock formula is its ability to provide accuracy and stability without the penalty of having to solve discretized non-linear algebraic equations. No iterative process is embedded in the formula, and the delicate issue of stopping criteria is eliminated. It is recognized in the numerical analysis community that implementation of a Rosenbrock formula is significantly simpler than that of other implicit Runge-Kutta type formulas. The factor that has limited extensive use of the attractive class of Rosenbrock formulas is the requirement that the integration Jacobian be exact. This is a limiting factor for many engineering applications that are likely to contain complex components for which computing an exact Jacobian is an intractable task. In the context of

the dynamic analysis of multibody systems, the limiting factor remains the task of providing force derivatives. In situations such as these, when no closed-form analytic expressions are available to compute the required derivative information, the solution is either to use automatic differentiation tools such as ADIFOR (Bischof et al., 1994), or to use numerical means to obtain derivative information. In the latter case, however, using a Rosenbrock type formula is no longer an option, and the only alternative is to use more robust and somewhat more CPU-expensive methods. One of these methods is the SDIRK4/15 formula that is discussed in the next Section.

4.5 SDIRK-Based First Order Implicit Integration

4.5.1 SDIRK4/15

According to Hairer and Wanner (1996), the choice $\gamma = 4/15$ instead of $\gamma = 4/16$ as the diagonal element of a 5 stage, order 4 stiffly-accurate SDIRK method is numerically superior. This motivated their effort to code this formula, rather than the one proposed in Section 4.2 in conjunction with the Descriptor Form Method. The resulting code is much more refined, and special attention was paid to issues such as stopping criteria, mechanisms for early detection of convergence failure, estimation of iteration starting points, avoiding rounding errors, etc.

One positive feature of the First Order Reduction Method is its ability to link to any standard code for the numerical solution of stiff ODE. This feature is exploited here by embedding a public domain ODE code of Hairer and Wanner in the framework of the First Order Reduction Method. The objective is to keep as much as possible of the original layout that made the code of Hairer and Wanner robust and efficient. There is no

166 54 major dfference between the SDIRK formula of ths Secton, and the one defned n Secton 4.2.2, so the notaton of that Secton s retaned. Smplfed order condtons for SDIRK4/5 are provded n Eq. (4.0), and values for p, =, K, 8, are obtaned by settng γ = 45. The SDIRK4/5 formula s chosen to be stffly-stable, and the partcular choce of γ mples that t s A-stable (Harer and Wanner, 996). Consequently, t s L-stable. As n Secton 4.2.2, n order to mnmze ffth-order error terms, the same values c 2 Š = 05. and c 3 Š = 03. are consdered for the two extra coeffcents defnng the ntegraton formula. After solvng the non-lnear system n Eq. (4.0), the coeffcents of the formula are obtaned as Stage a = 45 Stage 2 Stage 3 Stage 4 Stage 5 a a a a a a a a = 2 = 45 = = = 45 = = = a44 = 45 a a a a = = = = a55 = 45

167 55 Snce the formula s desgned to be stffly stable, b = a 5, =, K, 5. The coeffcents c are obtaned as c = Í j= a j, =, K, 5 It remans to provde an embedded formula for step-sze control. Much lke the case of SDIRK4/6, the orgnal order condtons are mposed up to order 3 to construct a formula that uses the same values of a j and c. The order condtons for the embedded formula are gven n Eq. (4.). Wth the values of a j gven above, the lnear system that gves the weghts $ b for the embedded order 3 formula s b$ + b$ + b$ + b$ = b$ + b$ 2 + b$ 3+ b$ 4 = $ 529 $ 289 $ b+ b2 + b $ 3+ b4 = $ 76 $ $ b+ b $ 2 + b3+ b4 = The weghts $ b obtaned by solvng ths system are $b = $b 2 = $b 3 = (4.83) $b 4 = $b 5 = 0

To reduce the influence of round-off errors during numerical integration of the IVP $y' = f(x, y)$, $y(x_0) = y_0$, Hairer and Wanner preferred to work in their code, at each stage, with the quantities

$$z_i = h \sum_{j=1}^{i} a_{ij}\, f(x_0 + c_j h,\; y_0 + z_j), \qquad i = 1, \ldots, s \tag{4.84}$$

Whenever the solution $z_1, z_2, \ldots, z_s$ of the system in Eq. (4.84) is known, the stage values $k_i$ are obtained as

$$k_i = f(x_0 + c_i h,\; y_0 + z_i)$$

This requires $s$ extra function evaluations. The additional effort can be avoided if the matrix $A = (a_{ij})$; i.e., the matrix of $a_{ij}$ coefficients in Butcher's tableau, is non-singular. Since this is the case with any SDIRK formula, the system in Eq. (4.84) is rewritten as

$$\begin{bmatrix} z_1 \\ \vdots \\ z_s \end{bmatrix} = A \begin{bmatrix} h f(x_0 + c_1 h,\; y_0 + z_1) \\ \vdots \\ h f(x_0 + c_s h,\; y_0 + z_s) \end{bmatrix}$$

The solution at the new grid point is obtained as

$$y_1 = y_0 + \sum_{i=1}^{s} d_i z_i \tag{4.85}$$

where

$$(d_1, \ldots, d_s) = (b_1, \ldots, b_s) A^{-1}$$

One attractive feature of the $z$-approach is that, since SDIRK4/15 is stiffly accurate, the vector $d$ is simply $(0, 0, 0, 0, 1)$. Another advantage of using the $z$-approach is the following. The quantities $z_1, z_2, \ldots, z_s$ are computed iteratively and are therefore affected by small iteration errors. The evaluation of $k_i$; i.e., evaluations of the form $f(x_0 + c_i h,\; y_0 + z_i)$, would then, due to the large Lipschitz constant of $f$, amplify these errors. This could be disastrously inaccurate for a stiff problem (Shampine, 1980).
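The claim that $d = bA^{-1}$ collapses to a unit vector for a stiffly accurate formula can be checked numerically. The sketch below is illustrative only: it does not use the coefficients of the formula of this Section, but the well-known two-stage L-stable SDIRK method with $\gamma = 1 - 1/\sqrt{2}$ as a small stand-in, and the helper name `weights_in_z_form` is hypothetical.

```python
import math

def weights_in_z_form(A, b):
    """Solve d A = b for a lower-triangular Butcher matrix A (SDIRK case).

    Column j of d A = b reads d_j*A[j][j] + sum_{i>j} d_i*A[i][j] = b_j,
    so the unknowns are recovered from the last one backwards.
    """
    s = len(b)
    d = [0.0] * s
    for j in reversed(range(s)):
        acc = b[j] - sum(d[i] * A[i][j] for i in range(j + 1, s))
        d[j] = acc / A[j][j]
    return d

# Stand-in tableau (assumption: NOT the 5-stage formula of the text).
gamma = 1.0 - 1.0 / math.sqrt(2.0)
A = [[gamma, 0.0],
     [1.0 - gamma, gamma]]
b = A[-1][:]          # stiffly accurate: weights equal the last row of A

d = weights_in_z_form(A, b)
print(d)              # all zeros except a trailing one, up to round-off
```

With $d$ of this form, the new solution is simply $y_0$ plus the last stage increment, so no extra linear combination of stage values is needed.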

169 57 A consequence of the z -approach s that expressons for stage values k, =, K, s must be replaced by lnear combnatons of z, =, K, s, as nduced by Eq. (4.85). Accordngly, the step-sze controller needs to be modfed to accommodate the z -representaton. In other words, as the vector b of weghts s replaced by the d vector, the weghts of the embedded method must be modfed accordngly. The second approxmaton of the soluton $y s obtaned as where, y$ = y + d$ z 0 s Í = ( d $, K, d $ ) = ( b $, K, b $ ) A s Snce the actual quantty of nterest s the approxmaton y error, ths s obtaned as s - - y$ of the local truncaton - err = ( b - b$ )A z dz (4.86) - where d = ( b-b$ )A. Based on the values of b, b $, and A, the vector d s obtaned as d d d d d = = = = = (4.87) The orgnal code wrtten by Harer and Wanner also provded dense output formulas. Wthout gong nto detals regardng the process of obtanng these coeffcents, they are

170 58 d = d2 = d3 = d4 = d = d d d d d d d d d d d d d d d = = = = = = = = = = = = = = = The way n whch these coeffcents are computed was dscussed n Secton 4.2.2, when defnng dense output for SDIRK4/6. These coeffcents were scaled by A - to account for the z rather the k mplementaton of the formula. For =, K, 5, wth b ( θ) the soluton at an ntermedate pont x + θ 0 h, 0< θ, s gven by = 4 Í j= d j θ j

$$y(x_0 + \theta h) \approx y_0 + \sum_{i=1}^{5} b_i(\theta)\, z_i$$

which is of order 3 for $0 < \theta < 1$, and reduces to the fourth order approximation $y_1$ for $\theta = 1$.

Algorithm Pseudo-code

The SDIRK4/15 formula was embedded in the First Order Reduction Method to provide a robust and efficient algorithm. The numerical implementation is based on a public domain code developed by Hairer and Wanner for the integration of stiff IVP, which was adapted to support the proposed algorithm. An obstacle to immediate conversion of the algorithm is the need to embed dependent variable recovery in the original code. Likewise, the SSODE obtained in Multibody Dynamics is of second order, and the original code was designed for first order systems with tailored linear algebra, iteration starting values, and stopping criteria. The pseudo-code of the resulting algorithm is presented in Table 17.

The first four steps of the implementation are the same as discussed previously. At Step 5, the integration Jacobian is evaluated. The quantity that is computed is the matrix $\Pi$, which is used to carry out the iterative process (Steps 9 through 16). The next step factors $\Pi$. Dense LAPACK routines are used for this operation, since the dimension of the matrix is low and sparsity is not a factor.
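The benefit of factoring the integration Jacobian once per macro-step and reusing the factors for every stage and iteration can be illustrated with a small dense LU factorization. This is a minimal pure-Python sketch, standing in for the dense LAPACK routines mentioned above; the function names `lu_factor` and `lu_solve` and the 2x2 matrix are illustrative assumptions, not taken from the thesis code.

```python
def lu_factor(a):
    """LU factorization with partial pivoting; returns (lu, piv), leaves a intact."""
    n = len(a)
    lu = [row[:] for row in a]
    piv = list(range(n))              # piv[i] = original row now stored at row i
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(lu[r][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            piv[k], piv[p] = piv[p], piv[k]
        for r in range(k + 1, n):
            lu[r][k] /= lu[k][k]      # multiplier, stored in the lower triangle
            for c in range(k + 1, n):
                lu[r][c] -= lu[r][k] * lu[k][c]
    return lu, piv

def lu_solve(lu, piv, b):
    """Solve with previously computed factors: cost O(n^2) per right-hand side."""
    n = len(b)
    x = [b[piv[i]] for i in range(n)]           # apply the row permutation
    for i in range(n):                          # forward substitution (unit L)
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):                # back substitution (U)
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x

# Hypothetical 2x2 stand-in for the integration Jacobian: factor once ...
factors = lu_factor([[4.0, 1.0], [1.0, 3.0]])
# ... then reuse the factors for each stage or iteration right-hand side.
x_a = lu_solve(*factors, [1.0, 2.0])
x_b = lu_solve(*factors, [5.0, 4.0])
```

Only the forward/backward substitutions are repeated inside the stage loop, which is why a small dense matrix makes each additional stage or iteration cheap.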

Table 17. Pseudo-code for SDIRK4/15-Based First Order Reduction Method

1. Initialize Simulation
2. Set Integration Tolerance
3. While (t < tend) do
4.    Set Macro-Step
5.    Get Integration Jacobian
6.    Factor Integration Jacobian
7.    For stage from 1 to 5 do
8.       Set up Stage
9.       While (.NOT. converged) do
10.         Evaluate Accelerations
11.         Build RHS
12.         Get Correction in Independent Positions
13.         Get Correction in Independent Velocities
14.         Analyze Convergence
15.         Recover Dependent Positions
16.         Recover Dependent Velocities
17.      End do
18.   End do
19.   Check Accuracy. Compute new Step-size.
20.   Check Partition
21. End do

Step 7 starts the loop over the stages of the formula. First, the stage is set up at Step 8. The code provides initial iteration estimates, based on information available from prior stages. When setting up the last stage, the starting point is provided by the embedded formula, which uses information only from the first four stages. The embedded formula is of order 3, so this should be a good approximation of the solution for stage 5. This idea works because SDIRK4/15 is stiffly-accurate, and the last stage initial prediction can be taken to be the solution provided by the embedded method.

The iterative process for recovering $z_i$ starts at Step 9. The process adopted was described earlier. To see how the $z$-formulation relates to that iterative method, first note that in the case of the SSODE, following the notation of Section 3.4.1.1,

$$z_i = h \sum_{j=1}^{i} a_{ij}\, g(x_0 + c_j h,\; w_0 + z_j) \tag{4.88}$$

In Section 3.4.1.1, the second order SSODE obtained after DAE reduction was considered to assume the form $\dot{w} = g(t, w)$, with $w \equiv [v^T \; \dot{v}^T]^T$. The stage values $z_i \equiv [v^{(i)T} \; \dot{v}^{(i)T}]^T$ are obtained as the solution of the non-linear system of Eq. (4.88). The quasi-Newton algorithm employed requires computation of the integration Jacobian of Eq. (3.87). The corrections in $z_i$; i.e., in $v^{(i)}$ and $\dot{v}^{(i)}$, are given in Eqs. (3.101) and (3.102). A computational advantage is obtained if, instead of explicitly computing the derivatives $J_1$ and $J_2$, the matrix $\Pi$ is introduced, and appropriate alterations are made in the right side of the linear system. In light of Eq. (4.88), the coefficients $b_1^{(i,k)}$ and $b_2^{(i,k)}$ at stage $i$, iteration $k$, are

$$b_1^{(i,k)} = h \sum_{j=1}^{i-1} a_{ij}\,(\dot{v}_0 + \dot{v}^{(j)}) + \gamma h\,(\dot{v}_0 + \dot{v}^{(i,k)})$$

$$b_2^{(i,k)} = h \sum_{j=1}^{i-1} a_{ij}\,\ddot{v}^{(j)} + \gamma h\,\ddot{v}^{(i,k)}$$

where $\ddot{v}^{(j)}$ represents independent accelerations at stage $j$, corresponding to the configuration $v_0 + v^{(j)}$ and $\dot{v}_0 + \dot{v}^{(j)}$, with $v_0$ and $\dot{v}_0$ being independent positions and velocities at the beginning of the macro-step. Thus, for $j = 1, \ldots, i-1$, since all past information is available, the first terms in the expressions for $b_1^{(i,k)}$ and $b_2^{(i,k)}$ can be immediately obtained. They are held constant during the iterative process. To evaluate the last part of the right side of these two expressions, $\gamma = 4/15$, $h$ is the step-size, and the value $\dot{v}^{(i,k)}$ is provided by the iterative process. Earlier discussion showed how to compute generalized accelerations $\ddot{v}^{(i,k)}$ at a given system configuration.

The remaining steps of the pseudo-code manipulate data according to the considerations above. During stage $i$, $i = 1, \ldots, 5$, Step 10 computes, at each iteration $k$, the independent accelerations $\ddot{v}^{(i,k)}$. The quantities $b_1^{(i,k)}$ and $b_2^{(i,k)}$ are computed in Step 11, while during Steps 12 and 13 corrections in $v^{(i)}$ and $\dot{v}^{(i)}$ are made using the matrix $\Pi$, according to considerations presented earlier. Step 14 contains a sophisticated mechanism that does the following:

(a). Convergence rate analysis
(b). Stops the iterative process based on stopping criteria
(c). Convergence forecast

These functions are closely related to estimation of iteration error. Since convergence of the quasi-Newton algorithm (Steps 9 through 16) is linear, corrections at two consecutive iterations satisfy

$$\|\delta z^{(i,k+1)}\| \le \theta\, \|\delta z^{(i,k)}\| \tag{4.89}$$

with $\theta < 1$. With $z^{(i)}$ the exact solution of the non-linear system in Eq. (4.88), applying the triangle inequality to

$$z^{(i,k+1)} - z^{(i)} = (z^{(i,k+1)} - z^{(i,k+2)}) + (z^{(i,k+2)} - z^{(i,k+3)}) + \cdots$$

and taking into account Eq. (4.89) yields

$$\|z^{(i,k+1)} - z^{(i)}\| \le \frac{\theta}{1 - \theta}\, \|\delta z^{(i,k)}\| \tag{4.90}$$

An estimate of the convergence rate at iteration $k$ is available as

$$\theta_k = \frac{\|\delta z^{(i,k)}\|}{\|\delta z^{(i,k-1)}\|}$$

It is clear that the iteration error should not be larger than the local discretization error, which because of the step-size control mechanism is kept close to $Tol$ (the integration tolerance computed based on the accuracy requirements imposed by the user). Therefore, taking into account Eq. (4.90), iteration is stopped when, with $\eta_k = \theta_k / (1 - \theta_k)$,

$$\eta_k\, \|\delta z^{(i,k)}\| \le \kappa \cdot Tol \tag{4.91}$$

The value $z^{(i,k)}$ is then accepted as the approximation of $z^{(i)}$. This strategy can be applied after at least 2 iterations. In order to be able to apply this stopping criterion even after the first iteration, for $k = 0$ the quantity $\eta_0$ is defined as

$$\eta_0 = \left(\max(\eta_{old},\, uround)\right)^{0.8}$$

where $\eta_{old}$ is the last $\eta_k$ of the preceding step. It remains to make a good choice of the parameter $\kappa$ in Eq. (4.91). After extensive numerical experiments with values between $10$ and $10^{-4}$, Hairer and Wanner (1996) recommended a value of $\kappa$ in the range $10^{-1}$ to $10^{-2}$.

It remains to address point (c) above, which attempts to avoid unjustified computational effort in an iterative process that appears at some point to be doomed to fail. Usually, $k_{max} = 7$ to $10$ iterations are allowed. However, the computation is stopped and the step rejected as soon as one of the following conditions holds:

(c1). There is a $k$ for which $\theta_k \ge 1$ (the iteration is expected to diverge)
(c2). For some $k$,

$$\frac{\theta_k^{\,k_{max} - k}}{1 - \theta_k}\, \|\delta z^{(i,k)}\| > \kappa \cdot Tol$$

The left side of the last expression is an estimate of the iteration error to be expected after $k_{max} - 1$ iterations. Whenever the step is rejected because of (c1) or (c2), integration is restarted with a smaller step-size, usually $h/2$.

If the iteration process has not converged, and if, based on observations obtained in Step 14, there is no apparent sign of future convergence failure, the value $z^{(i,k)}$ is corrected to produce the next approximation $z^{(i,k+1)}$. During Steps 15 and 16, dependent positions and velocities corresponding to the newly computed independent positions and velocities are obtained. Both these steps in the actual code are based on iterative processes. Most importantly, they use the same matrix, namely the constraint sub-Jacobian $\Phi_u$ that was factored in the configuration at the beginning of the macro-step.

Once all the stages have been completed, accuracy of the solution is verified. Since the integration formula used is stiffly-accurate, stage 5 provides the solution at the new time step. An approximation of the local error is computed, based on the coefficients $d$ of Eq. (4.87). The step-size selection process used was detailed earlier. Step 20 checks the validity of the partitioning, as was described for the trapezoidal method used in conjunction with the State-Space Reduction Method. Finally, Step 21 is the end of the simulation loop.
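The convergence-rate estimate, the stopping test, and the two rejection conditions described above can be sketched on a scalar model problem. This is a hedged illustration, not the logic of the Hairer and Wanner code itself: the function name, its arguments, the status strings, and the scalar setting are all assumptions made for the example, and the exponent 0.8 and the default `kappa` follow the values quoted from Hairer and Wanner (1996).

```python
def simplified_newton(residual, apply_jac_inverse, z0, tol,
                      kappa=0.05, k_max=10, eta_old=1.0, uround=1e-16):
    """Quasi-Newton iteration with a frozen Jacobian and rate-based control.

    Returns (z, status), status being "converged", "diverging",
    "too_slow" or "max_iter". Scalar unknown for simplicity.
    """
    z = z0
    eta = max(eta_old, uround) ** 0.8      # eta_0: used before a rate estimate exists
    prev_norm = None
    for k in range(k_max):
        dz = -apply_jac_inverse(residual(z))
        norm = abs(dz)
        z += dz
        if prev_norm is not None:
            theta = norm / prev_norm       # observed convergence rate theta_k
            if theta >= 1.0:               # (c1): iteration expected to diverge
                return z, "diverging"
            eta = theta / (1.0 - theta)
            # (c2): forecast of the error remaining after the allowed iterations
            if theta ** (k_max - k) / (1.0 - theta) * norm > kappa * tol:
                return z, "too_slow"
        if eta * norm <= kappa * tol:      # stopping test, cf. Eq. (4.91)
            return z, "converged"
        prev_norm = norm
    return z, "max_iter"

# Model residual G(z) = 6 z + 5 with root z = -5/6 (hypothetical example).
G = lambda z: 6.0 * z + 5.0
# Slightly wrong frozen Jacobian (5 instead of 6): linear convergence, rate 0.2.
z_ok, status_ok = simplified_newton(G, lambda r: r / 5.0, 0.0, tol=1e-3)
# Badly wrong Jacobian (2 instead of 6): corrections grow, step is rejected.
z_bad, status_bad = simplified_newton(G, lambda r: r / 2.0, 0.0, tol=1e-3)
```

In an integrator, a "diverging" or "too_slow" status would trigger the step rejection and step-size reduction described in the text, while the final `eta` of a converged iteration would be carried over as `eta_old` for the next step.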

177 65 CHAPTER 5 NUMERICAL EXPERIMENTS 5. Valdaton of Frst Order Reducton Method Ths Secton focuses on valdaton of the codes mplemented n conjuncton wth the Frst Order Reducton Method of Secton 4.4. Valdaton of the code based on the State-Space Reducton Method of Secton 4. s presented by Haug, Negrut, and Iancu (997a) and Negrut, Haug, and Iancu (997). Codes based on the Descrptor From Method presented n Sectons 4.2 and 4.3 are valdated n the work of Haug, Negrut, and Engstler (998). The double pendulum shown n Fgure s the model used for valdaton purposes. It s a two-body, two-degree-of-freedom planar mechancal system. The mechansm s modeled usng planar Cartesan coordnates;. e., the x and y coordnates of the body center of the mass and the angle θ that defnes the orentaton of the local centrodal reference frames wth respect to the global reference frame. A large amount of stffness s nduced by means of two rotatonal sprng-damper-actuators (RSDA). The parameters of the model are provded n SI unts n Table 8. Table 8. Parameters for the Double Pendulum L m k C L 2 m 2 k 2 c E5 5.0E4

178 66 Fgure. Double Pendulum The double pendulum problem analyzed s on a scale startng from mldly stff, and endng wth extremely stff, somewhere n between, wth a domnant egenvalue that has a small magnary part, and real part of the order of Intal condton for the problem are gven n Table 9. The frst row contans poston nformaton and the second contans velocty nformaton. Table 9. Intal Condtons for Double Pendulum Body Body 2 x-coordnate y-coordnate θ-coordnate x-coordnate y-coordnate θ-coordnate π π/

For validation purposes, several tolerances are imposed, and simulation results are analyzed to see if the imposed accuracy requirements are met. A reference solution is first generated by imposing a very tight tolerance. All simulations are compared to this reference simulation, to find the infinity norm of the error, the time at which it occurred, and the average error per time step.

A second code that compares simulation results has been developed. This post-processing tool reads results of the current simulation and fits each time step (grid point) between grid points of the reference simulation. The latter points are expected to be much more numerous, due to the stringent accuracy with which the reference simulation is run. Cubic spline interpolation is used to generate a reference value at each grid point of the current simulation, based on reference simulation data. The reference value and the current value at each grid point are then compared. This comparison is done for all grid points of the current simulation, and the largest difference in solutions is defined as the infinity norm of the error. This quantity, along with the grid point at which it occurred, is reported by the post-processing tool. The average error per time step is obtained by summing the squared simulation errors at each grid point, and dividing the square root of this sum by the number of time steps taken during the simulation.

Suppose that $n$ time steps are taken during the current simulation, and the variable used for error analysis is denoted by $e$. The grid points of the current simulation are denoted by $t_{init} = t_1 < t_2 < \cdots < t_n = t_{end}$, and results of the current simulation are obtained as $e_i$, for $1 \le i \le n$. If $N$ is the number of time steps taken during the reference simulation, it is expected that $N \gg n$. Let $T_{init} = T_1 < \cdots < T_N = T_{end}$ be the reference simulation time steps, and $E_j$, for $1 \le j \le N$, be the corresponding reference values. For each $i$, $1 \le i \le n$, an integer $r(i)$ is defined such that $T_{r(i)} \le t_i \le T_{r(i)+1}$. Based on reference data

180 68 E r( )-, E r( ), E r( )+, and E r( )+2, splne cubc nterpolaton s used to generate an nterpolated value E * at tme t. If r( ) - 0, the frst four reference ponts are consdered for nterpolaton, whle f r( ) + 2 N, the last four reference ponts are consdered for nterpolaton. The error at tme step s defned as * = E -e (5.) The nfnty norm of the smulaton error s defned as ( k ) = max n (5.2) where k denotes the tolerance set for the current smulaton. The average error per tme step s defned as ( k ) = n 2 Í n = (5.3) The code ForRosen s an mplementaton of the Rosenbrock-Nystrom formula dscussed n Secton 4.4. ForSDIRK s an mplementaton of SDIRK4/5 formula of Secton The codes are run wth tolerances between 0-2 and 0-5, and results are compared to the reference soluton. Fgure 2 presents the tme varaton of orentaton angle θ, for whch error analyss s carred out. The length of the reference smulaton;. e., the length for whch error analyss s carred out, s T = 2 seconds. Reference data are generated usng the code ForSDIRK. Wth an ntegraton tolerance set to 0-8, the code takes 354,090 successful tme steps to run 2 seconds of smulaton.
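The post-processing comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool used in the thesis: a local cubic Lagrange interpolant through four neighbouring reference points stands in for the cubic spline interpolation, and the function names and test data are hypothetical.

```python
import bisect
import math

def cubic_through_four(Ts, Es, t):
    """Evaluate at t the Lagrange cubic through four (T, E) pairs."""
    total = 0.0
    for i in range(4):
        w = 1.0
        for j in range(4):
            if j != i:
                w *= (t - Ts[j]) / (Ts[i] - Ts[j])
        total += w * Es[i]
    return total

def error_metrics(t, e, T, E):
    """Compare a run (t, e) against a denser reference run (T, E).

    Returns (max error, grid point where it occurs, average error per step),
    the average being sqrt(sum of squared errors) / number of steps.
    Requires at least four reference points.
    """
    n, N = len(t), len(T)
    errs = []
    for ti, ei in zip(t, e):
        r = bisect.bisect_right(T, ti) - 1   # largest r with T[r] <= ti
        lo = min(max(r - 1, 0), N - 4)       # clamp the 4-point window at both ends
        ref = cubic_through_four(T[lo:lo + 4], E[lo:lo + 4], ti)
        errs.append(abs(ref - ei))
    max_err = max(errs)
    t_star = t[errs.index(max_err)]
    avg_err = math.sqrt(sum(d * d for d in errs)) / n
    return max_err, t_star, avg_err

# Hypothetical data: the reference follows T**3 on a fine grid, and the coarse
# run matches it everywhere except for one perturbed grid point.
T = [0.1 * j for j in range(21)]
E = [x ** 3 for x in T]
t = [0.25 * i for i in range(9)]
e = [x ** 3 for x in t]
e[3] += 0.01
max_err, t_star, avg_err = error_metrics(t, e, T, E)
```

Clamping the four-point window reproduces the rule in the text that the first or last four reference points are used near the ends of the reference data.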

181 Angle Body [rad] Smulaton Tme [s] Fgure 2. Orentaton Body Table 20 contans results of error analyss at the poston level for the code ForSDIRK. The frst column contans the value of the tolerance wth whch the smulaton was run. Relatve and absolute tolerances ( Atol and Rtol of Eq. (4.6)) are set to 0 k, and they apply for both poston and velocty. The second column contans the tme t * at whch the largest error of Eq. (5.2) occurred. The thrd column contans ( k ). Column four contans the relatve error, defned as RelErr = ( k ) œ00. 0 (5.4) * E where E * s the reference value evaluated va cubc splne nterpolaton, at tme t *. Equaton (5.4) holds, provded E * s not too close to zero. Fnally, the last column contans the average error per tme step, as defned n Eq. (5.3). The most relevant nformaton for method valdaton s ( k ). If k = 3;. e., accuracy of 0-3 s demanded, (-3) should have ths order of magntude. It can be seen

in Table 20 that this is the case for all tolerances, except for the case $k = 5$, where the magnitude of the error is nevertheless close to order $10^{-5}$. One reason why the value of $k$ is not always reflected exactly in the value of $\Delta^{(k)}$ is that the relative tolerance comes into play. For this experiment, the relative tolerance is not zero. In light of Eq. (4.6), depending on the magnitude of the variable being analyzed, it loosens (for large magnitudes) or tightens the step-size control. Based on results shown in Figure 2, the relative tolerance is multiplied by a value that oscillates between 4.0 and 6.0. Therefore, the actual upper bound of accuracy imposed on the solution (according to Eq. (4.6)) fluctuates and reaches values up to $7 \cdot 10^{-k}$.

Table 20. Position Error Analysis, ForSDIRK
$k$    $t^*$    $\Delta^{(k)}$    RelErr [%]    $\bar{\Delta}^{(k)}$

The results shown in Figure 2 and Table 20 confirm the reliability of the step-size controller embedded in ForSDIRK, and indicate that it is slightly optimistic in step-size prediction. This is not the case with the code ForRosen, for which error analysis results are provided in Table 21. The step-size controller for this method is conservative, and this has a negative impact on the CPU performance of the method, especially for high accuracy requirements, as shown in Section 5.3. The step-size

controller is, in all situations, almost one order of magnitude too stringent; i.e., this code is quite conservative.

Table 21. Position Error Analysis, ForRosen
$k$    $t^*$    $\Delta^{(k)}$    RelErr [%]    $\bar{\Delta}^{(k)}$

The reason why ForRosen is conservative can be explained using Dahlquist's test problem $y' = \lambda y$, $y(x_0) = y_0$. For large values of step-size $h$, as $\lambda$ assumes large real negative values, $h\lambda \to -\infty$, and the local truncation error obtained using the embedded method is approximately (Hairer and Wanner, 1996)

$$\hat{y}_1 - y_1 \approx \gamma h \lambda y_0 \qquad (5.5)$$

where $\gamma$ is a coefficient depending on the integration formula considered. The truncation error in Eq. (5.5) is large in magnitude, and it increases as $|h\lambda|$ increases. Therefore, this mechanism for step-size control ceases to be effective when the problem is very stiff, or when, due to loose tolerances, the step size becomes large. These considerations explain very accurately the behavior of ForRosen, and why the looser the tolerance, the more conservative the results.

Shampine suggests using the quantity $(1 - h\gamma\lambda)^{-1}(\hat{y}_1 - y_1)$ instead of $\hat{y}_1 - y_1$ for step-size control purposes. For the general IVP $y' = f(t, y)$, $y(x_0) = y_0$, the quantity considered is

$$\mathrm{err} = (I - h\gamma J)^{-1}(\hat{y}_1 - y_1) \qquad (5.6)$$

where $J = \partial f / \partial y$ is evaluated at $x_0$, and $I$ is the identity matrix of appropriate dimension. The factorization of the matrix $\frac{1}{h\gamma} I - J$ is available, since it is used to compute the stage variables $k_i$. Therefore, err is cheap to compute. Especially for very stiff systems, the idea of Shampine eliminates conservatism of the step-size controller. This approach, initially in the code of Hairer and Wanner and inherited by ForSDIRK, is currently not implemented in ForRosen.

The results in Tables 20 and 21 indicate that theoretical predictions are confirmed by numerical results, and they underline the importance of the step-size controller. A code with an optimistic step-size controller may not attain the required accuracy, while an integrator that is too pessimistic in terms of step-size control requires excessive computational effort.

Error analysis is also performed at the velocity level. The time variation of angular velocity $\dot{\theta}_1$ of body 1 is shown in Figure 3.

Figure 3. Angular Velocity of Body 1 (axes: angular velocity of body 1 [rad/s] versus simulation time [s])
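The effect of Shampine's filter on the scalar test problem $y' = \lambda y$ can be illustrated numerically. The sketch below takes the asymptotic estimate of Eq. (5.5) at face value (it only holds for large $|h\lambda|$) and compares it with the filtered quantity $(1 - h\gamma\lambda)^{-1}(\hat{y}_1 - y_1)$. The function names and the values of `gamma`, `h`, and `y0` are illustrative assumptions, not coefficients of ForRosen or ForSDIRK.

```python
def raw_estimate(h, lam, gamma, y0):
    """Embedded-method error estimate gamma*h*lam*y0 (asymptotic form, large |h*lam|)."""
    return gamma * h * lam * y0

def filtered_estimate(h, lam, gamma, y0):
    """Filtered estimate: the raw estimate multiplied by (1 - h*gamma*lam)^-1."""
    return raw_estimate(h, lam, gamma, y0) / (1.0 - h * gamma * lam)

if __name__ == "__main__":
    gamma, y0, h = 0.25, 1.0, 0.1  # illustrative values only
    for lam in (-1e1, -1e3, -1e5):
        raw = raw_estimate(h, lam, gamma, y0)
        filt = filtered_estimate(h, lam, gamma, y0)
        print(f"lambda={lam:10.1e}  raw={raw:12.4e}  filtered={filt:12.4e}")
```

As $h\lambda \to -\infty$ the raw estimate grows without bound, while the filtered one approaches $-y_0$ and therefore stays bounded by $|y_0|$. A controller driven by the filtered quantity is thus not forced into needlessly small steps on very stiff problems.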

The angular velocity of body 1 fluctuates between 0 and 7 rad/s. The absolute and relative tolerances were set to $10^{-k}$, $k = 2, \ldots, 5$, for all numerical experiments. The order of magnitude for $\Delta^{(k)}$ is expected to be $10^{-k+1}$. Results in Table 22 are obtained with ForSDIRK. They confirm theoretical predictions. The optimistic tendency of the step-size controller embedded in ForSDIRK can be seen in these results.

Table 22. Velocity Error Analysis, ForSDIRK
$k$    $t^*$    $\Delta^{(k)}$    RelErr [%]    $\bar{\Delta}^{(k)}$

Results obtained at the velocity level with ForRosen are presented in Table 23, and they confirm the observations made earlier about this code. The step-size controller is slightly conservative. Instead of values of $10^{-k+1}$, the accuracy of the results as indicated by $\Delta^{(k)}$ is one order of magnitude better, most of the time.

Results presented in this Section validate the codes and confirm the reliability of the embedded step-size controllers. With one slightly on the optimistic side (ForSDIRK) and one on the conservative side (ForRosen), the step-size controllers show that the idea of using an embedded formula for local error estimation is sound. The accuracy obtained with these codes is good. It remains to adapt the step-size controller for ForRosen, according to Eq. (5.6), to avoid unjustified CPU penalties.
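The expected error magnitudes quoted for position and velocity follow from how absolute and relative tolerances combine. The sketch below assumes the common mixed form Atol + Rtol·|y| with Atol = Rtol = $10^{-k}$. That form is an assumption about the bound used in the thesis, and `error_bound` is a hypothetical helper name.

```python
def error_bound(k, magnitude):
    """Mixed tolerance Atol + Rtol*|y| with Atol = Rtol = 10**-k (assumed form)."""
    tol = 10.0 ** (-k)
    return tol + tol * abs(magnitude)

if __name__ == "__main__":
    k = 3
    # Position level: the angle magnitude oscillates between 4.0 and 6.0
    print(error_bound(k, 4.0), error_bound(k, 6.0))
    # Velocity level: the angular velocity reaches about 7 rad/s
    print(error_bound(k, 7.0))
```

Under this assumption a magnitude of 6.0 gives a bound of $7 \cdot 10^{-k}$, and a velocity of about 7 rad/s gives $8 \cdot 10^{-k}$, which is why errors of order $10^{-k+1}$ are expected at the velocity level.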

Table 23. Velocity Error Analysis, ForRosen
$k$    $t^*$    $\Delta^{(k)}$    RelErr [%]    $\bar{\Delta}^{(k)}$

Explicit versus Implicit Integration

This Section contains a comparison, in terms of CPU time, of two different numerical methods used for dynamic analysis of a High Mobility Multipurpose Wheeled Vehicle (HMMWV). A picture of the vehicle is provided in Figure 4. The topology of the mechanism is presented in Figure 5. The vehicle is modeled using the same bodies as in an earlier Section, but the body numbering of that Section is changed. Since 14 bodies are used to model this vehicle, in what follows the model is referred to as HMMWV14. More details about HMMWV-specific parameters are provided by Serban, Negrut, and Haug (1998). Finally, all timing results in this and the following sections are obtained on an SGI Onyx machine with eight R10000 processors.

Figure 4. US Army HMMWV

Figure 5. 14 Body Model of HMMWV
