EVALUATING HIGHER DERIVATIVE TENSORS BY FORWARD PROPAGATION OF UNIVARIATE TAYLOR SERIES


MATHEMATICS OF COMPUTATION, Volume 69, Number 231, Pages 1117-1130, S 0025-5718(00)01120-0, article electronically published on February 17, 2000

EVALUATING HIGHER DERIVATIVE TENSORS BY FORWARD PROPAGATION OF UNIVARIATE TAYLOR SERIES

ANDREAS GRIEWANK, JEAN UTKE, AND ANDREA WALTHER

Abstract. This article considers the problem of evaluating all pure and mixed partial derivatives of some vector function defined by an evaluation procedure. The natural approach to evaluating derivative tensors might appear to be their recursive calculation in the usual forward mode of computational differentiation. However, with the approach presented in this article, much simpler data access patterns and similar or lower computational counts can be achieved through propagating a family of univariate Taylor series of a suitable degree. It is applicable for arbitrary orders of derivatives. Also it is possible to calculate derivatives only in some directions instead of the full derivative tensor. Explicit formulas for all tensor entries as well as estimates for the corresponding computational complexities are given.

1. Introduction

Many applications in scientific computing require second- and higher-order derivatives. Therefore, this article deals with the problem of calculating derivative tensors of some vector function y = f(x) with f : D ⊆ R^n → R^m that is the composition of (at least locally) smooth elementary functions. Assume that f is given by an evaluation procedure in C or some other programming language. Then f can be differentiated automatically [6]. The Jacobian matrix of first partial derivatives can be computed by the forward or reverse mode of the chain-rule based technique known commonly as computational differentiation (CD). CD also yields second derivatives that are needed in optimization [8] and even higher derivatives that are called for in numerical bifurcation, beam dynamics [2] and other nonlinear calculations.
Even though the reverse mode of CD may be more efficient when the number of dependent variables is small compared to the number of independents [6], only the forward mode will be considered here. The mechanics of this direct application of the chain rule are completely independent of the number of dependent variables, so it is possible to restrict the analysis to a scalar-valued function y = f(x) with f : D ⊆ R^n → R.

Received by the editor January 2, 1998 and, in revised form, June 30, 1998. 1991 Mathematics Subject Classification. Primary 65D05, 65Y20, 68Q40. Key words and phrases. Higher order derivatives, computational differentiation. This work was partially supported by the Deutsche Forschungsgemeinschaft under grant GR 705/4-1. © 2000 American Mathematical Society

In other words, formulas for one component f(x) of the original vector function f(x) are derived. This greatly simplifies the notation, and the full tensors can then easily be obtained by an outer loop over the component index. The natural approach to evaluating derivative tensors seems to be their recursive calculation using the usual forward mode of CD. This technique has been implemented by Berz [3], Neidinger [10], and others. The only complication here is the need to utilize the symmetry in the higher derivative tensors, which leads to fairly complex addressing schemes. As has been mentioned in [7] and [1], much simpler data access patterns and similar or lower computational counts can be achieved through propagating a family of univariate Taylor series of an arbitrary degree. At the end, their values are interpolated to yield the tensor coefficients.

The paper is organized as follows. Section 2 introduces the notations that are used and makes some general observations. In Section 3 the complexity of storing and propagating multivariate and univariate Taylor polynomials is examined, and the advantages of the univariate Taylor series are shown. Section 4 derives formulas for the calculation of all mixed partial derivatives up to degree d from a family of univariate Taylor polynomials of degree d. Some run-time results are presented in Section 5. Finally, Section 6 outlines some conclusions.

2. Notations and basic observations

In many applications, one does not require full derivative tensors but only the derivatives in n̂ ≤ n directions s_i ∈ R^n. Therefore suppose a collection of n̂ ≤ n directions s_i ∈ R^n is given, and that they form a matrix S = [s_1, s_2, ..., s_n̂] ∈ R^{n×n̂}. One possible choice is S = I with n̂ = n. Here, I denotes the identity matrix in R^{n×n}. Of particular interest is the case n̂ = 1, where only derivatives in one direction are calculated.
With a coefficient vector z ∈ R^n̂, one may wish to calculate the derivative tensors

    ∂f(x + Sz)/∂z |_{z=0} = ( ∇f(x)^T s_j )_{j=1,...,n̂} ∈ R^n̂,
    ∂²f(x + Sz)/∂z² |_{z=0} = ( s_i^T ∇²f(x) s_j )_{i,j=1,...,n̂} ∈ R^{n̂×n̂},

and so on. The last equation is already an abuse of the usual matrix-vector notation. Here, the abbreviation ∇_S^k f(x) ∈ R^{n̂^k} will be used in order to denote the k-th derivative tensor of f(x + Sz) with respect to z at z = 0. The use of the seed matrix S allows us to restrict our considerations to a subspace spanned by the columns s_i along which the properties of f might be particularly interesting. This situation arises for example in optimization and bifurcation theory, where the range of S might be the tangent space of the active constraints or the nullspace of a degenerate Jacobian. Especially when n̂ < n it is obviously preferable to calculate the restricted tensors ∇_S^k f(x) directly, rather than first evaluating the full tensors ∇^k f(x) ≡ ∇_I^k f(x) ∈ R^{n^k}

and then contracting them by multiplications with the directions s_i. One way to evaluate the desired tensors ∇_S^k f(x) is to recursively calculate, for all intermediate quantities w, the corresponding restricted tensors ∇_S w, ∇_S² w, etc. The process would be started with the initialization ∇_S x = S and ∇_S^k x = 0 for k > 1. For all subsequent intermediates, the derivative tensors are defined by the usual differentiation rules. For example, in the case w = u · v there are the well known differentiation rules

    ∇_S w = u ∇_S v + v ∇_S u

and

    ∇_S² w = u ∇_S² v + ∇_S u (∇_S v)^T + ∇_S v (∇_S u)^T + v ∇_S² u.

For the third derivative tensor, the matrix-vector notation is no longer sufficient, but the following observation is generally applicable. To calculate ∇_S^k w, each element of ∇_S^j u with 0 ≤ j ≤ k has to be considered and then multiplied with all elements of ∇_S^{k−j} v. The result is then multiplied by a binomial coefficient given by Leibniz' theorem, and finally incremented to the appropriate element of ∇_S^k w. The next section studies the complexity of storing and propagating multivariate and univariate Taylor polynomials. Taylor coefficients are used since they are somewhat better scaled than the corresponding derivatives and satisfy slightly simpler recurrences.

3. The complexity of propagating univariate Taylor series

If the k-th tensor were stored and manipulated as a full n^k array, the corresponding loops could be quite simple to code, but the memory requirement and operations count would be unnecessarily high. In the case k = 2, ignoring the symmetry of Hessians would almost double the storage per intermediate (from (n+1)n/2 to n²) and increase the operations count for a multiplication by about fifty percent. This price may be worth paying in return for the resulting sequential or at least constant-stride data access. However, by standard combinatorial arguments, ∇_S^k f(x) ∈ R^{n^k} has exactly

(1)    C(n+k−1, k) = n(n+1)···(n+k−1) / k!

distinct elements.
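The count in equation (1) is easy to tabulate. The following sketch (illustrative only, not part of the paper) contrasts the full n^k storage with the number of distinct elements of the symmetric tensor; the choice n = 10 is arbitrary.

```python
from math import comb

def distinct_elements(n, k):
    # Equation (1): number of distinct entries of the symmetric k-th
    # derivative tensor, C(n+k-1, k) = n(n+1)...(n+k-1)/k!
    return comb(n + k - 1, k)

n = 10
for k in (2, 3, 4):
    full = n ** k                       # naive n^k storage
    sym = distinct_elements(n, k)       # symmetric storage
    # the ratio full/sym approaches k! as n grows
    print(k, full, sym, round(full / sym, 2))
```

For k = 2 this reproduces the "(n+1)n/2 versus n²" comparison made in the text.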
Hence, the symmetry reduces the number of distinct elements in ∇_S^k w almost by the factor k!. Therefore in the case k = 3 the number of distinct elements is reduced almost by six, and in the case k = 4 the storage ratio becomes almost 24. Since higher order tensors have very many entries, one has to utilize symmetric storage modes. The drawback of symmetric storage modes is that the access of individual elements is somewhat complicated, requiring for example three integer operations for address computations in the implementation of Berz [3]. Moreover, the resulting memory locations may be far apart with irregular spacing, so that significant paging overhead may be incurred. None of these difficulties arises when n̂ = 1. Then for any intermediate value w the directional derivatives w, ∇_s w, ∇_s² w, ..., ∇_s^d w can be stored and manipulated as a contiguous array of (d + 1) scalars. Here d denotes

the highest degree that is desired. It will be shown in Section 4 that to reconstruct the full tensors ∇_S^k f(x) for k = 0, ..., d, exactly as many univariate Taylor series are needed as the highest tensor ∇_S^d f(x) has distinct elements. As we will see, that entails a slight increase in storage, but a very significant gain in code simplicity and efficiency. After this outlook, the manipulation of Taylor polynomials is considered. The collection of Taylor coefficients that constitute the tensors ∇_S^k f(x), k = 0, ..., d, represents a general polynomial of degree d in n variables. As can be concluded from (1), it contains

(2)    Σ_{k=0}^{d} C(n+k−1, k) = C(n+d, d)

distinct monomials. The truncated Taylor polynomials of all intermediate scalar quantities w have exactly the same structure. Considering again a product w = u · v, for the computation of ∇_S^k w each element of ∇_S^j u, 0 ≤ j ≤ k, has to be multiplied with all elements of ∇_S^{k−j} v. It follows that

    Σ_{j=0}^{k} C(n+j−1, j) C(n+k−j−1, k−j) = C(2n+k−1, k)

multiplications are necessary because of (1). Hence, the total count of multiplications for computing ∇_S w is

    Σ_{k=0}^{d} C(2n+k−1, k) = C(2n+d, d).

When n > 1 there are also a significant number of additions and other overhead for each multiplication. Nevertheless it is possible to consider the number

    C(2n+d, d) ≈ (2n)^d / d!    if n ≫ d

as a reasonable approximation for the factor by which the cost to evaluate f grows when the calculation is performed in Taylor arithmetic of degree d and order n. Here, we have tacitly used the fact that propagating Taylor polynomials through nonlinear functions such as the exponential, logarithm, and trigonometric functions is about as costly as the convolution for the product discussed above. All linear operations are cheaper, of course, since for them the effort is roughly C(n+d, d) and thus about the same as the number of data entries that need to be fetched and stored from and into memory.
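The two combinatorial identities used above (a Vandermonde-type convolution and a hockey-stick sum) can be spot-checked numerically; the sketch below is an illustration, not part of the paper, and the ranges of n and k are arbitrary.

```python
from math import comb

def conv_count(n, k):
    # Multiplications needed for degree k of one product w = u*v:
    # sum_j C(n+j-1, j) * C(n+k-j-1, k-j), which collapses to C(2n+k-1, k).
    return sum(comb(n + j - 1, j) * comb(n + k - j - 1, k - j)
               for j in range(k + 1))

for n in (1, 3, 7):
    for k in range(6):
        assert conv_count(n, k) == comb(2 * n + k - 1, k)

# Summing over k = 0..d telescopes to the total count C(2n+d, d):
n, d = 3, 4
assert sum(comb(2 * n + k - 1, k) for k in range(d + 1)) == comb(2 * n + d, d)
print("identities verified")
```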
Provided there is a significant fraction of nonlinear elementary functions in the overall calculation, one may consider

    C(2n+d, d) / C(n+d, d) ≈ 2^d

as an approximate computation/communication ratio for propagating Taylor polynomials. Consequently, even on a modern super-scalar processor with comparatively slow memory access, communication should not be the bottleneck. Now suppose that instead of propagating one Taylor polynomial in n variables and degree d one propagates C(n+d−1, d) univariate Taylor polynomials of the same

degree. Since the common constant term needs to be stored only once, the amount of data per intermediate becomes

(3)    1 + d · C(n+d−1, d) = 1 + (d·n/(d+n)) · C(n+d, d).

By inspection of the right-hand side we see this is at most d times that for the standard case (see (2)). Furthermore,

    C(d+2, 2) = (d+2)(d+1)/2

multiplications are needed to propagate one univariate Taylor polynomial of degree d through an elementary multiplication. Hence, the total amount of the computational effort for propagating C(n+d−1, d) univariate Taylor polynomials of degree d is essentially given by

(4)    C(n+d−1, d) · C(d+2, 2) = C(2n+d, d) · q(d, n),

where

    q(d, n) ≡ (d+2)(d+1)/2 · [n(n+1)···(n+d−1)] / [(2n+1)(2n+2)···(2n+d)].

It is easy to check through induction that q is never greater than 3/2 and that

    q(d) ≡ lim_{n→∞} q(d, n) = (d+2)(d+1) / 2^{d+1}.

One has in particular q(0, n) = 1 and

    q(1, n) = 3/2 − 3/(4n+2) = 3/2 − O(1/n),
    q(2, n) = 3/2 − 3/(4n+2) = 3/2 − O(1/n),
    q(3, n) = 5n(n+2) / ((2n+3)(2n+1)) = 5/4 − O(1/n),
    q(4, n) = 15n(n+3) / (4(2n+3)(2n+1)) = 15/16 − O(1/n),
    q(5, n) = 21n(n+3)(n+4) / (4(2n+5)(2n+3)(2n+1)) = 21/32 − O(1/n),

as well as, for all higher derivatives d ≥ 6,

    q(d, n) = (d+2)(d+1)/2^{d+1} + O(1/n).

In other words, the computational effort is dramatically reduced when the degree d is quite large. This fact can also be seen in Figure 1, which displays the function q(d, n). For the probably more important moderate values d ≤ 5 the complexity ratio is surprisingly close to one. The computations/communications ratio is about d/2, almost independently of n (see (3) and (4)). For d ≤ 4 that ratio might appear rather small. However, the memory access is now strictly sequential, and no addressing calculations are needed. In this section, we have demonstrated that for moderate values of d the propagation of C(n+d−1, d) univariate Taylor series has essentially the same operations count as multivariate derivative propagation. We found also that for higher degree derivatives, one needs considerably fewer operations using univariate Taylor series instead of the multivariate one.
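The cost ratio q(d, n) of equation (4) can be evaluated directly from the binomial-coefficient form; the following sketch (an illustration, not the paper's code) checks a few of the closed forms quoted above.

```python
from math import comb

def q(d, n):
    # Equation (4): cost of propagating C(n+d-1, d) univariate series,
    # relative to C(2n+d, d) for one multivariate series.
    return comb(n + d - 1, d) * comb(d + 2, 2) / comb(2 * n + d, d)

n = 1000  # large n, to approach the limits quoted in the text
assert abs(q(1, n) - 3 / 2) < 1e-2          # q(1,n) -> 3/2
assert abs(q(3, n) - 5 / 4) < 1e-2          # q(3,n) -> 5/4
assert abs(q(5, n) - 21 / 32) < 1e-2        # q(5,n) -> 21/32
assert abs(q(8, n) - 10 * 9 / 2 ** 9) < 1e-2  # limit (d+2)(d+1)/2^(d+1)
assert all(q(d, m) <= 1.5 for d in range(11) for m in range(1, 50))
print("q(d, n) checks passed")
```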
In the next section, an efficient scheme for interpolating all partial derivatives up to degree d from these C(n+d−1, d) univariate Taylor series is developed.

[Figure 1. q(d, n) for n = 4, ..., 100 and d = 0, ..., 10]

4. Interpolation with univariate Taylor series

For each direction s ∈ R^n one can obtain the Taylor expansion

(5)    f(x + ts) = f(x) + f^(1)(x; s) t + f^(2)(x; s) t² + ··· + f^(d)(x; s) t^d + O(t^{d+1}).

The notation f^(m)(x; s) will be used throughout to denote the m-th homogeneous polynomial in s in the Taylor expansion of f at x. Hence, x is considered constant and s ∈ R^n variable with the homogeneity property

(6)    f^(m)(x; ts) = t^m f^(m)(x; s)    for t ∈ R.

Since only the last coefficient f^(d)(x; s) contains any information about the highest tensor ∇_S^d f(x) with its C(n+d−1, d) distinct elements, it is clear that one needs to evaluate at least that many univariate expansions. We will see that this number is sufficient. Let i = (i_1, ..., i_n) ∈ N_0^n with i_j ∈ N_0 (j = 1, ..., n) be a multi-index. The norm of i is defined by

    |i| = Σ_{j=1}^{n} i_j.

Consider the matrix S throughout this section as fixed. Now choose the directions

    Si    for i ∈ N_0^n with |i| = d.

After selecting a particular d, Taylor coefficients are propagated along these C(n+d−1, d) necessary directions only. In other words, for each multi-index i ∈ N_0^n with |i| = d, we consider the Taylor series for the univariate function

    ϕ_i : R → R,    ϕ_i(t) ≡ f(x + tSi),

and evaluate the Taylor coefficients of ϕ_i up to the degree d. The dependence of the ϕ_i on S is not denoted explicitly, because S is considered as fixed. Note that there is no propagation along the directions Si with |i| < d. However, we will see that it is still possible to obtain all lower order derivative information that is necessary to compute the tensors ∇_S^m f with m < d.

Interpolating second derivatives. To illustrate our approach, let us first consider the computation of the Hessian ∇_S² f with S = I when the maximal degree d equals 2. Denote the i-th unit vector in R^n by e_i. Then, the restricted gradient components of ∇_S f(x) are obtained immediately as

(7)    ∇_S f(x) e_i = (1/2) ϕ_i'(0) = (1/2) ∇f(x)^T Si    with i ∈ N_0^n,  i_k = 2 if k = i, and i_k = 0 if k ≠ i.

Similarly, the pure second derivatives can be obtained from

    e_i^T ∇_S² f(x) e_i = (1/4) ϕ_i''(0) = (1/4) (Si)^T ∇²f(x) Si

with the same multi-index i as in (7). However, the mixed second derivatives e_i^T ∇_S² f(x) e_j, i ≠ j, are not directly available. To get them, one has to consider the diagonal direction Sl defined by the multi-index l ∈ N_0^n with

    l_k = 1 if k = i or k = j,  and  l_k = 0 if k ≠ i and k ≠ j.

Because of |l| = 2 the second order Taylor coefficient of ϕ_l is known:

    ϕ_l''(0) = (e_i + e_j)^T ∇_S² f(x) (e_i + e_j)
             = e_i^T ∇_S² f(x) e_i + e_j^T ∇_S² f(x) e_j + 2 e_i^T ∇_S² f(x) e_j.

This identity yields the interpolation formula

(8)    e_i^T ∇_S² f(x) e_j = (1/2) [ (e_i + e_j)^T ∇_S² f(x) (e_i + e_j) − e_i^T ∇_S² f(x) e_i − e_j^T ∇_S² f(x) e_j ]
                           = (1/2) ϕ_l''(0) − (1/8) ϕ_i''(0) − (1/8) ϕ_j''(0)

with the multi-indices i, j, l ∈ N_0^n and

(9)    i_k = 2 if k = i (else 0),    j_k = 2 if k = j (else 0),    l_k = 1 if k = i or k = j (else 0).

When d = 3 this scheme is not directly applicable because |l| = 2 < 3.
Therefore l does not aim at a derivative of the highest order, and one does not propagate a Taylor series along the direction Sl. Hence, ϕ_l''(0) is unknown. To handle that situation, and generally the higher order cases, one needs a systematic way of generating formulas of the form (8) for mixed partials.
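The d = 2 scheme of (7)-(9) can be exercised with a few lines of truncated Taylor arithmetic. The sketch below is illustrative only (it is not ADOL-C or the authors' driver): degree-2 series are stored as coefficient triples, the test function f(x1, x2) = exp(x1·x2) and the point x are arbitrary choices, and the mixed Hessian entry recovered via formula (8) is compared against the analytic value exp(x1·x2)(1 + x1·x2).

```python
import math

# Truncated univariate Taylor arithmetic of degree 2: value = (c0, c1, c2)
def tmul(a, b):
    return (a[0] * b[0],
            a[0] * b[1] + a[1] * b[0],
            a[0] * b[2] + a[1] * b[1] + a[2] * b[0])

def texp(a):
    # exp(c0 + c1 t + c2 t^2) = e^{c0} (1 + c1 t + (c2 + c1^2/2) t^2) + O(t^3)
    e = math.exp(a[0])
    return (e, e * a[1], e * (a[2] + 0.5 * a[1] ** 2))

def f_taylor(x, v):
    # Evaluation procedure for f(x) = exp(x1*x2), propagated along x + t v
    u1 = (x[0], v[0], 0.0)
    u2 = (x[1], v[1], 0.0)
    return texp(tmul(u1, u2))

x = (0.3, 0.7)
# Directions Si for the multi-indices (2,0), (0,2), (1,1) of (9), with S = I:
phi_i = f_taylor(x, (2.0, 0.0))
phi_j = f_taylor(x, (0.0, 2.0))
phi_l = f_taylor(x, (1.0, 1.0))
# The t^2 Taylor coefficient is phi''(0)/2; apply formula (8):
h12 = 0.5 * (2 * phi_l[2]) - 0.125 * (2 * phi_i[2]) - 0.125 * (2 * phi_j[2])
exact = math.exp(x[0] * x[1]) * (1 + x[0] * x[1])
print(h12, exact)
```

Note that only three univariate series of degree 2 were propagated, matching the count C(n+d−1, d) = C(3, 2) = 3 for n = 2.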

Computing mixed partials from grid values. For any polynomial P of degree n or less, it can be checked that

(10)    ∂^n P(Sz) / (∂z_1 ∂z_2 ··· ∂z_n) = Σ_{i_1=0}^{1} ··· Σ_{i_n=0}^{1} P(i_1 s_1 + ··· + i_n s_n) (−1)^{n−(i_1+···+i_n)}

through integration of the constant function on the left side over the unit cube in n dimensions. The important observation here is that one only has to know the value of the polynomial at the corners of the parallelepiped {Sz : 0 ≤ z_i ≤ 1} in order to compute the mixed derivative with respect to all z_i. Naturally, one does not want to assume that f itself is a polynomial. However, one can use the m-th coefficients of the univariate Taylor expansions to compute values of the homogeneous approximating polynomials f^(m)(x; s), 1 ≤ m ≤ d, satisfying (5) and (6). For example, it follows immediately for the second mixed partials with P(Sz) = f^(2)(x; Sz) and 1 ≤ j < i ≤ n:

(11)    ∂² f^(2)(x; Sz) / (∂z_i ∂z_j) = f^(2)(x; S(e_i+e_j)) − f^(2)(x; Se_i) − f^(2)(x; Se_j) + f^(2)(x; 0),

where f^(2)(x; S(e_i+e_j)) = (1/2) ϕ_l''(0), f^(2)(x; Se_i) = (1/8) ϕ_i''(0), and f^(2)(x; Se_j) = (1/8) ϕ_j''(0), with the multi-indices i, j, and l defined as in (9). Through the homogeneity property (6) one obtains that

(12)    f^(m)(x; 0) = 0,    m > 0,

so that the last term in (11) drops out and one gets

    ∂² f^(2)(x; Sz) / (∂z_i ∂z_j) = ∂² f(x + Sz) / (∂z_i ∂z_j) |_{z=0}

because of the previous formula (8). For fixed S and given d, the real numbers f^(|i|)(x; Si) with |i| ≤ d will be referred to as the grid values. By considering seed matrices in R^{n×d} with repeated columns, one can derive from (10), for any polynomial P of degree up to d, its generalization

    ∂^{|i|} P(Sz) / (∂z_1^{i_1} ∂z_2^{i_2} ··· ∂z_n^{i_n}) |_{z=0} = Σ_{k_1=0}^{i_1} Σ_{k_2=0}^{i_2} ··· Σ_{k_n=0}^{i_n} C(i_1, k_1) C(i_2, k_2) ··· C(i_n, k_n) (−1)^{|i−k|} P(Sk),

where now S ∈ R^{n×n} and i may be any multi-index with |i| = deg(P). Applying this identity to the homogeneous components f^(|i|)(x; s) of f at x and using binomial coefficient notation for multi-indices, one obtains, with (12) and the zero vector 0 ∈ N_0^n,

(13)    f_i(x) ≡ ∂^{|i|} f(x + Sz) / (∂z_1^{i_1} ∂z_2^{i_2} ··· ∂z_n^{i_n}) |_{z=0} = Σ_{0<k≤i} C(i, k) (−1)^{|i−k|} f^(|i|)(x; Sk).

We conclude this subsection with a geometric illustration of the situation. Suppose one wishes to obtain the partials f_i for all i with 1 ≤ |i| ≤ d. The functions f^(m)(x; s), s ∈ R^n, denote the m-th homogeneous polynomial at x as in equation (5). Through formula (13) it is possible to calculate f_i(x) from the grid values f^(|i|)(x; Sk) for all k ≤ i. However, the propagation of univariate Taylor series only yields some of the grid values directly. The others must be calculated in a second interpolation step. The values f^(m)(x; Si) are known for all |i| = d and

for all 0 ≤ m ≤ d. They will be referred to as ray values.

[Figure 2. Grid values and ray values for n = 2 and d = 4]

Figure 2 illustrates this situation for n = 2 and d = 4. The thin lines represent the directions of the propagated Taylor series including the e_1- and the e_2-axes. The grid points whose values can be obtained by scaling of the ray values f^(m)(x; Si) with |i| = d and 0 ≤ m ≤ d are marked by unfilled circles. The black circles denote the values that are desired because they arise on the right hand side of (13) but are as yet unknown. The grey shaded circles mark points that do not belong to the grid but whose values can be obtained by scaling. The following section describes the way to compute the unknown grid values from the known ray values.

Computing grid values from ray values. For all multi-indices k with |k| < d, the fact that f^(m)(x; s) is homogeneous of degree m implies

(14)    f^(m)(x; Sk) = (|k|/d)^m f^(m)(x; S(dk/|k|)).

The ray values are placed at equal distances, and the polynomial C(z, j) f^(m)(x; Sj) in z is nonzero for only one ray point. Hence, an interpolation process similar to the one dimensional Lagrange interpolation formula yields the equation

(15)    f^(m)(x; Sz) = Σ_{|j|=d} C(z, j) f^(m)(x; Sj)

for z = dk/|k| or any other vector z ∈ R^n with value |z| = d. Therefore the ray values f^(m)(x; S(dk/|k|)) can be computed for any k with |k| < d. Substituting

(15) into (14), one obtains

(16)    f^(m)(x; Sk) = Σ_{|j|=d} (|k|/d)^m C(dk/|k|, j) f^(m)(x; Sj).

Direct computation of mixed partials from ray values. Now the main result of this article can be established, namely an efficient scheme for calculating all mixed partial derivatives up to degree d from the known ray values. The following proposition contains the explicit formula and an estimation of the complexity of the interpolation procedure.

Proposition 4.1. Let f be at least d-times continuously differentiable at some point x ∈ R^n. Then for any i ∈ N_0^n with 1 ≤ |i| ≤ d the partial derivative f_i(x) defined in (13) is given by the formula

(17)    f_i(x) = Σ_{|j|=d} f^(|i|)(x; Sj) c_{i,j}    with    c_{i,j} ≡ Σ_{0<k≤i} (−1)^{|i−k|} C(i, k) (|k|/d)^{|i|} C(dk/|k|, j).

The number of nonvanishing coefficients

(18)    c_{i,j} ≠ 0    for i, j ∈ N_0^n,  1 ≤ |i| ≤ d,  |j| = d,

is less than or equal to

    p(d, n) ≡ Σ_{m=1}^{d} C(n, m) C(d, m) C(m+d−1, d),

which bounds the number of operations needed to calculate all partial derivatives from the univariate Taylor series.

Proof. Combining (13) and (16), one obtains the identity (17), which can be written in the simpler form

(19)    f_i(x) = Σ_{|j|=d} c_{i,j} f^(|i|)(x; Sj).

Define the sign function for multi-indices componentwise as follows:

    sign(i) ≡ (sign(i_1), ..., sign(i_n))    for i ∈ N_0^n.

For two multi-indices i and j with 1 ≤ |i| ≤ d and |j| = d, the coefficient c_{i,j} can only be nonzero if

(20)    sign(j) ≤ sign(i),

since otherwise the second binomial coefficient in (17) must vanish for all 0 < k ≤ i. Now, consider an arbitrary m ∈ N with 1 ≤ m ≤ d. A multi-index i with |sign(i)| = m has m nonvanishing components. Therefore the inequality m ≤ |i| ≤ d must hold for all i with |sign(i)| = m. The number of distinct possibilities for choosing m positive integers so that their sum is no greater than d is given by C(d, m). It follows that the total number of multi-indices i with 1 ≤ |i| ≤ d and |sign(i)| = m is equal to C(n, m) C(d, m). For each of these multi-indices the coefficient c_{i,j} can be nonzero only if j satisfies the necessary condition (20). This implies that |sign(j)| ≤ m.

[Figure 3. r(d, n) for n = 1, ..., 100 and d = 0, ..., 10]

Therefore, one has that for each multi-index i with 1 ≤ |i| ≤ d and |sign(i)| = m the number of multi-indices j with sign(j) ≤ sign(i) is given by C(m+d−1, d). Hence we conclude that the function

(21)    p(d, n) ≡ Σ_{m=1}^{d} C(n, m) C(d, m) C(m+d−1, d)

denotes an upper bound for the number of nonvanishing coefficients c_{i,j}. One can derive from (19) that (21) is also an upper bound for the total number of multiplications of the interpolation procedure. □

It follows from the tables at the end of this section that for d = 2, 3 and any n ∈ N, the function p(d, n) gives the exact number of nonvanishing coefficients c_{i,j}. Therefore it may be conjectured that p(d, n) corresponds exactly to the number of nonvanishing factors c_{i,j}, but so far there is no proof of this. Define

    r(d, n) ≡ p(d, n) / C(2n+d, d) = (1/2) C(2d, d) 2^{−d} + O(1/n)

as the ratio of the upper bound p(d, n) and the complexity C(2n+d, d) of propagating multivariate Taylor series through a single multiplication operation. Hence, one obtains for large d

    r(d, n) ≈ 2^{d−1} / √(πd) + O(1/n).

Table 1. Nonvanishing coefficients for d = 2

    Multi-indices                       c_{i,j}
    i ∈ M_1, j ∈ M_2, i^T j ≠ 0         1/2
    i ∈ M_2, j ∈ M_2, i^T j ≠ 0         1/2
    i ∈ M_3, j ∈ M_2, i^T j ≠ 0         −1/4
    i ∈ M_3, j ∈ M_3, i^T j = 2         1

As can be seen in Figure 3, the ratio r(d, n) is quite small for the more important moderate values d ≤ 5. It follows in particular that

    r(1, n) = 1/2 − 1/(4n+2) = 1/2 + O(1/n),
    r(2, n) = n(1+3n) / (2(2n+1)(n+1)) = 3/4 + O(1/n),
    r(3, n) = n(1+3n+5n²) / ((2n+3)(2n+1)(n+1)) = 5/4 + O(1/n),

as well as

    r(4, n) = 35/16 + O(1/n) ≈ 2 + O(1/n),
    r(5, n) = 63/16 + O(1/n) ≈ 4 + O(1/n),
    r(6, n) = 231/32 + O(1/n) ≈ 7 + O(1/n),
    r(7, n) = 429/32 + O(1/n) ≈ 13 + O(1/n).

To apply the formula (19) at various points x ∈ R^n it makes sense to precompute the rational coefficients (18). So far a simple, explicit formula for them has not been found, but it is possible to further reduce the computation by grouping the additive terms together. A listing is given for d = 2 and d = 3. When d = 2 one has to consider the following four groups of multi-indices:

(22)    M_0 = {0 ∈ N_0^n},
        M_1 = {i ∈ N_0^n : |i| = 1},
        M_2 = {i ∈ N_0^n : |i| = 2 and |sign(i)| = 1},
        M_3 = {i ∈ N_0^n : |i| = 2 and |sign(i)| = 2}.

Then, the coefficient c_{i,j} has to be calculated for each i ∈ M_0 ∪ M_1 ∪ M_2 ∪ M_3 and each j ∈ M_2 ∪ M_3. Table 1 lists the nonvanishing c_{i,j}. For d = 3 there are three groups of multi-indices besides those of (22):

        M_4 = {i ∈ N_0^n : |i| = 3 and |sign(i)| = 1},
        M_5 = {i ∈ N_0^n : |i| = 3 and |sign(i)| = 2},
        M_6 = {i ∈ N_0^n : |i| = 3 and |sign(i)| = 3}.

Now, for each i ∈ M_0 ∪ ··· ∪ M_6 and each j ∈ M_4 ∪ M_5 ∪ M_6 the coefficients (18) must be considered. The nonvanishing c_{i,j} are given by Table 2. As one can see, all coefficients c_{i,j} for d ≤ 3 are of moderate size and most are positive. It should also be noted that the interpolation procedure does not involve any divisions, so it does not amplify errors incurred in propagating the univariate Taylor series.
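The coefficients c_{i,j} of Proposition 4.1 can be precomputed mechanically from the inner sum of formula (17). The following sketch (an illustration, not the authors' code) uses exact rational arithmetic and a generalized binomial coefficient, since the ray points dk/|k| can have fractional components, and reproduces the d = 2 entries of Table 1.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def gbinom(x, m):
    # Generalized binomial coefficient C(x, m) for rational x, integer m >= 0
    num = Fraction(1)
    for r in range(m):
        num *= (x - r)
    return num / factorial(m)

def c(i, j, d):
    # Inner sum of formula (17):
    #   c_{i,j} = sum_{0<k<=i} (-1)^|i-k| C(i,k) (|k|/d)^|i| C(d*k/|k|, j)
    n = len(i)
    total = Fraction(0)
    for k in product(*[range(ic + 1) for ic in i]):
        nk = sum(k)
        if nk == 0:
            continue
        sign = (-1) ** (sum(i) - nk)
        binom_ik = Fraction(1)
        for a in range(n):
            binom_ik *= gbinom(Fraction(i[a]), k[a])
        binom_ray_j = Fraction(1)
        for a in range(n):
            binom_ray_j *= gbinom(Fraction(d * k[a], nk), j[a])
        total += sign * binom_ik * Fraction(nk, d) ** sum(i) * binom_ray_j
    return total

# Reproduce Table 1 (d = 2, n = 2):
print(c((1, 0), (2, 0), 2))  # i in M_1, j in M_2: 1/2
print(c((2, 0), (2, 0), 2))  # i in M_2, j in M_2: 1/2
print(c((1, 1), (2, 0), 2))  # i in M_3, j in M_2: -1/4
print(c((1, 1), (1, 1), 2))  # i in M_3, j in M_3: 1
```

Running the same routine over all multi-index pairs with 1 ≤ |i| ≤ d and |j| = d gives a direct way to count the nonvanishing coefficients against the bound p(d, n).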

Table 2. Nonvanishing coefficients for d = 3

    Multi-indices                       c_{i,j}
    i ∈ M_1, j ∈ M_4, i^T j ≠ 0         1/3
    i ∈ M_2, j ∈ M_4, i^T j ≠ 0         2/9
    i ∈ M_3, j ∈ M_4, i^T j ≠ 0         −5/36
    i ∈ M_3, j ∈ M_5, i^T j = 3         1/4
    i ∈ M_4, j ∈ M_4, i^T j ≠ 0         2/9
    i ∈ M_5, j ∈ M_4, i^T j = 3         2/27
    i ∈ M_5, j ∈ M_4, i^T j = 6         −5/27
    i ∈ M_5, j ∈ M_5, i^T j = 5         2/3
    i ∈ M_5, j ∈ M_5, i^T j = 4         −1/3
    i ∈ M_6, j ∈ M_4, i^T j ≠ 0         2/27
    i ∈ M_6, j ∈ M_5, i^T j = 3         −1/6
    i ∈ M_6, j ∈ M_6, i^T j = 3         1

5. Some run time results

This section presents some run time results for computing higher derivative tensors using the interpolation with univariate Taylor expansions described above. To optimize bevel gears, exact knowledge of the geometric properties of bevel gear tooth flanks is necessary (see e.g. [9]). Arbitrary points on a given grid on the tooth flank can be calculated with machine accuracy by an analytical characterization [5] and computational differentiation. To that end one needs to compute the higher derivative tensors of a vector function f : R³ → R³ describing a family of surfaces. A corresponding evaluation code consisting of approximately 160 C statements was differentiated using ADOL-C [4] and a special driver for the calculation of higher derivative tensors by forward propagation of univariate Taylor series. The run times observed on a Sun Sparc 10, always normalized by the time for evaluating f without derivatives, are listed in Table 3. The third column states the theoretical run time ratios given by (d+2)²(d+1)²/4 (see equation (4) in Section 3). The next column contains the ratios for the complete calculation of ∇^k f(x), k = 0, ..., d, using ADOL-C and the special driver mentioned above. As can be seen, the total ratio is about 50%-70% of the theoretical estimate. The ratio of the run time of the interpolation process only to the run time of the function evaluation is contained in the last column. In agreement with the asymptotic expansion for r(d, n), one observes a doubling of that ratio with each increment of d by 1.
Nevertheless, one finds in this example for d ≤ 9 that no more than five percent of the run time for the derivative calculation is needed to interpolate the coefficients of the derivative tensors from the univariate Taylor series.

Table 3. Run-time ratios for derivative tensor evaluations

d | distinct elements in ∂^k f(x), k = 0, …, d | theoretical run time ratio | ratio for evaluating ∂^k f(x), k = 0, …, d | ratio for the interpolation
2 |  10 |   36 |   16.2 |  0.6
3 |  20 |  100 |   45.0 |  1.9
4 |  35 |  225 |   93.3 |  4.0
5 |  56 |  441 |  184.7 |  8.5
6 |  84 |  784 |  356.0 | 15.1
7 | 120 | 1296 |  655.3 | 26.3
8 | 165 | 2025 | 1174.0 | 44.4
9 | 220 | 3025 | 2040.7 | 81.0
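The second and third columns of Table 3 follow directly from quantities stated in the text: for f : R³ → R³ the number of distinct tensor elements up to order d is the binomial coefficient C(n + d, d) with n = 3, and the theoretical ratio is (d + 2)²(d + 1)²/4. A short check (a sketch; variable names are illustrative):

```python
from math import comb

n = 3  # f maps R^3 to R^3 in the bevel gear example
ds = range(2, 10)

# Distinct elements of the derivative tensors up to order d.
distinct = [comb(n + d, d) for d in ds]
# Theoretical run time ratio (d + 2)^2 (d + 1)^2 / 4; the division
# is exact since either (d + 1) or (d + 2) is even.
theoretical = [(d + 2)**2 * (d + 1)**2 // 4 for d in ds]

assert distinct == [10, 20, 35, 56, 84, 120, 165, 220]
assert theoretical == [36, 100, 225, 441, 784, 1296, 2025, 3025]
```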

6. Conclusions

Many applications in scientific computation require higher order derivatives. This article described a promising approach to computing higher order derivatives using univariate Taylor series. The main result is equation (17), which represents general partial derivatives in terms of univariate Taylor coefficients. It was found that the post-processing effort for the interpolation, given by the function p(d, n) in Section 4, is very small, especially for the more important moderate values of d. Also, for these d the complexity ratio q(d, n) analyzed in Section 3 between the new univariate Taylor approach and a more conventional multivariate Taylor approach is essentially 1. In addition, the data structures and memory access patterns are much simpler and more regular, so that actual run times should be significantly reduced.

References

1. Christian H. Bischof, George F. Corliss, and Andreas Griewank, Structured second- and higher-order derivatives through univariate Taylor series. Optimization Methods and Software 2 (1993), 211–232.
2. Martin Berz, Differential algebraic description of beam dynamics to very high orders. Particle Accelerators 12 (1989), 109–124.
3. Martin Berz, Algorithms for higher derivatives in many variables with applications to beam physics. In George F. Corliss and Andreas Griewank, editors, Automatic Differentiation of Algorithms: Theory, Implementation, and Applications, SIAM (1991), 147–156. MR 92h:65031
4. Andreas Griewank, David Juedes, and Jean Utke, ADOL-C: A package for the automatic differentiation of algorithms written in C/C++. ACM Trans. Math. Software 22 (1996), 131–167.
5. Andreas Griewank and George W. Reddien, Computation of cusp singularities for operator equations and their discretizations. J. Comput. Appl. Math. 26 (1989), 133–153. MR 91e:58036
6. Andreas Griewank, On automatic differentiation. In Masao Iri and K. Tanabe, editors, Mathematical Programming: Recent Developments and Applications, Kluwer Academic Publishers (1989), 83–108. MR 92k:65002
7. Andreas Griewank, Automatic evaluation of first- and higher-derivative vectors. In R. Seydel, F. W. Schneider, T. Küpper, and H. Troger, editors, Proceedings of the Conference Bifurcation and Chaos: Analysis, Algorithms, Applications, Birkhäuser (1991), 124–137. MR 91m:58001
8. Andreas Griewank, Computational differentiation and optimization. In J. Birge and K. Murty, editors, Mathematical Programming: State of the Art, University of Michigan (1994), 102–131.
9. Ulf Hutschenreiter, A new method for bevel gear tooth flank computation. In Martin Berz, Christian Bischof, George F. Corliss, and Andreas Griewank, editors, Computational Differentiation: Techniques, Applications, and Tools, SIAM (1996), 161–172.
10. Richard Neidinger, An efficient method for the numerical evaluation of partial derivatives of arbitrary order. ACM Trans. Math. Software 18 (1992), 159–173. MR 93b:65040

Institute of Scientific Computing, Technical University Dresden, D-01062 Dresden, Germany
E-mail address: griewank@math.tu-dresden.de

Framework Technologies, Inc., 10 South Riverside Plaza, Suite 1800, Chicago, Illinois 60606
E-mail address: utke@fti-consulting.com

Institute of Scientific Computing, Technical University Dresden, D-01062 Dresden, Germany
E-mail address: awalther@math.tu-dresden.de