Physics 251: Results for Matrix Exponentials (Spring 2017)

1. Properties of the Matrix Exponential

Let A be a real or complex n × n matrix. The exponential of A is defined via its Taylor series,

    e^A = I + Σ_{n=1}^∞ A^n/n! ,    (1)

where I is the n × n identity matrix. The radius of convergence of the above series is infinite. Consequently, eq. (1) converges for all matrices A. In these notes, we discuss a number of key results involving the matrix exponential and provide proofs of four important theorems. First, we consider some elementary properties.

Property 1: If [A, B] ≡ AB − BA = 0, then

    e^{A+B} = e^A e^B = e^B e^A .    (2)

This result can be proved directly from the definition of the matrix exponential given by eq. (1). The details are left to the ambitious reader.

Remarkably, the converse of Property 1 is FALSE. One counterexample is sufficient. Consider the 2 × 2 complex matrices

    A = ( 0    1   ) ,       B = ( 0   −1   ) .    (3)
        ( 0   2πi )              ( 0   2πi )

An elementary calculation yields

    e^A = e^B = e^{A+B} = I ,    (4)

where I is the 2 × 2 identity matrix. Hence, eq. (2) is satisfied. Nevertheless, it is a simple matter to check that AB ≠ BA, i.e. [A, B] ≠ 0.

Indeed, one can use the above counterexample to construct a second counterexample that employs only real matrices. Here, we make use of the well known isomorphism between the complex numbers and the real 2 × 2 matrices, which is given by the mapping

    z = a + ib  ↦  ( a   −b ) .    (5)
                   ( b    a )

It is straightforward to check that this isomorphism respects the multiplication law of two complex numbers. Using eq. (5), we can replace each complex number in eq. (3) with the corresponding real 2 × 2 matrix,

    A = ( 0  0   1    0  ) ,       B = ( 0  0  −1    0  ) .
        ( 0  0   0    1  )             ( 0  0   0   −1  )
        ( 0  0   0   −2π )             ( 0  0   0   −2π )
        ( 0  0   2π   0  )             ( 0  0   2π   0  )
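The counterexample is easy to verify numerically. The sketch below (assuming NumPy and SciPy are available) uses one pair of 2 × 2 complex matrices with the required properties of eq. (3): each has eigenvalues 0 and 2πi and is diagonalizable, which is why all three exponentials collapse to the identity.

```python
# Numerical check: e^A = e^B = e^{A+B} = I even though [A, B] != 0.
# (A sketch assuming SciPy's expm; the matrices are one pair with the
# properties claimed for eq. (3).)
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [0.0, 2j * np.pi]])
B = np.array([[0.0, -1.0], [0.0, 2j * np.pi]])
I2 = np.eye(2)

assert np.allclose(expm(A), I2)        # e^A = I
assert np.allclose(expm(B), I2)        # e^B = I
assert np.allclose(expm(A + B), I2)    # e^{A+B} = I, so eq. (2) is satisfied
assert not np.allclose(A @ B, B @ A)   # ...and yet [A, B] != 0
```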
One can again check that eq. (4) is satisfied, where I is now the 4 × 4 identity matrix, whereas AB ≠ BA as before.

It turns out that a small modification of Property 1 is sufficient to avoid any such counterexamples.

Property 2: If e^{t(A+B)} = e^{tA} e^{tB} = e^{tB} e^{tA} for all t lying within some open interval (a, b) of the real line (where a < b), then it follows that [A, B] = 0.

Property 3: If S is a non-singular matrix, then for any matrix A,

    exp( S A S^{−1} ) = S e^A S^{−1} .    (6)

The above result can be derived simply by making use of the Taylor series definition [cf. eq. (1)] of the matrix exponential.

Property 4: If [A(t), dA/dt] = 0, then

    d/dt e^{A(t)} = e^{A(t)} dA/dt = (dA/dt) e^{A(t)} .

This result should be self-evident, since it replicates the well known result for ordinary (commuting) functions. Note that Theorem 2 below generalizes this result to the case of [A(t), dA/dt] ≠ 0.

Property 5: If [A, [A, B]] = 0, then e^A B e^{−A} = B + [A, B]. To prove this result, we define

    B(t) ≡ e^{tA} B e^{−tA} ,

and compute

    dB(t)/dt = A e^{tA} B e^{−tA} − e^{tA} B e^{−tA} A = [A, B(t)] ,

    d²B(t)/dt² = A² e^{tA} B e^{−tA} − 2 A e^{tA} B e^{−tA} A + e^{tA} B e^{−tA} A² = [A, [A, B(t)]] .

By assumption, [A, [A, B]] = 0, which must also be true if one replaces A → tA for any number t. Hence, it follows that [A, [A, B(t)]] = 0, and we can conclude that d²B(t)/dt² = 0. It then follows that B(t) is a linear function of t, which can be written as

    B(t) = B(0) + t ( dB(t)/dt )|_{t=0} .

Noting that B(0) = B and ( dB(t)/dt )|_{t=0} = [A, B], we end up with

    e^{tA} B e^{−tA} = B + t[A, B] .    (7)

By setting t = 1, we arrive at the desired result. If the double commutator does not vanish, then one obtains a more general result, which is presented in Theorem 1 below.
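Property 5 can also be tested numerically. In the sketch below (assuming SciPy; the 3 × 3 strictly upper triangular matrices are an illustrative choice), [A, B] is nonzero but commutes with A, which is exactly the hypothesis of Property 5.

```python
# Numerical check of Property 5: e^A B e^{-A} = B + [A, B] when
# [A, [A, B]] = 0.  (A sketch; these "Heisenberg-like" matrices are
# chosen so that [A, B] is central.)
import numpy as np
from scipy.linalg import expm

A = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 0.]])
B = np.array([[0., 0., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])

comm = A @ B - B @ A                         # [A, B]
assert np.allclose(A @ comm - comm @ A, 0)   # hypothesis: [A, [A, B]] = 0

lhs = expm(A) @ B @ expm(-A)                 # e^A B e^{-A}
assert np.allclose(lhs, B + comm)            # Property 5
```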
If [A, B] ≠ 0, then in general e^A e^B ≠ e^{A+B}. The general result is called the Baker-Campbell-Hausdorff formula, which will be proved in Theorem 4 below. Here, we shall prove a somewhat simpler version,

Property 6: If [A, [A, B]] = [B, [A, B]] = 0, then

    e^A e^B = exp( A + B + ½[A, B] ) .    (8)

To prove eq. (8), we define a function,

    F(t) = e^{tA} e^{tB} .

We shall now derive a differential equation for F(t). Taking the derivative of F(t) with respect to t yields

    dF/dt = A e^{tA} e^{tB} + e^{tA} e^{tB} B = A F(t) + e^{tA} B e^{−tA} F(t) = { A + B + t[A, B] } F(t) ,    (9)

after noting that B commutes with e^{tB} and employing eq. (7). By assumption, both A and B, and hence their sum, commute with [A, B]. Thus, in light of Property 4 above, it follows that the solution to eq. (9) is

    F(t) = exp{ t(A + B) + ½ t² [A, B] } F(0) .

Setting t = 0, we identify F(0) = I, where I is the identity matrix. Finally, setting t = 1 yields eq. (8).

Property 7: For any matrix A,

    det exp A = exp( Tr A ) .    (10)

If A is diagonalizable, then one can use Property 3, where S is chosen to diagonalize A. In this case, D = S A S^{−1} = diag(λ₁, λ₂, …, λ_n), where the λ_i are the eigenvalues of A (allowing for degeneracies among the eigenvalues if present). It then follows that

    det e^A = ∏_i e^{λ_i} = e^{λ₁+λ₂+⋯+λ_n} = exp( Tr A ) .

However, not all matrices are diagonalizable. One can modify the above derivation by employing the Jordan canonical form. But here, I prefer another technique that is applicable to all matrices, whether or not they are diagonalizable. The idea is to define a function f(t) = det e^{At}, and then derive a differential equation for f(t). If δt/t ≪ 1, then

    det e^{A(t+δt)} = det( e^{At} e^{Aδt} ) = det e^{At} det e^{Aδt} = det e^{At} det( I + A δt ) ,    (11)

after expanding out e^{Aδt} to linear order in δt.
We now consider

    det( I + A δt ) = det ( 1 + A_{11}δt    A_{12}δt     …    A_{1n}δt  )
                          ( A_{21}δt     1 + A_{22}δt    …    A_{2n}δt  )
                          (     ⋮              ⋮         ⋱        ⋮     )
                          ( A_{n1}δt        A_{n2}δt     …  1 + A_{nn}δt )

                    = (1 + A_{11}δt)(1 + A_{22}δt) ⋯ (1 + A_{nn}δt) + O( (δt)² )

                    = 1 + δt (A_{11} + A_{22} + ⋯ + A_{nn}) + O( (δt)² )

                    = 1 + δt Tr A + O( (δt)² ) .

Inserting this result back into eq. (11) yields

    [ det e^{A(t+δt)} − det e^{At} ] / δt = Tr A det e^{At} + O(δt) .

Taking the limit as δt → 0 yields the differential equation,

    d/dt det e^{At} = Tr A det e^{At} .    (12)

The solution to this equation is

    ln det e^{At} = t Tr A ,    (13)

where the constant of integration has been determined by noting that ( det e^{At} )_{t=0} = det I = 1. Exponentiating eq. (13), we end up with

    det e^{At} = exp( t Tr A ) .

Finally, setting t = 1 yields eq. (10). Note that this last derivation holds for any matrix A (including matrices that are singular and/or are not diagonalizable).

Remark: For any invertible matrix function A(t), Jacobi's formula is

    d/dt det A(t) = det A(t) Tr( A^{−1}(t) dA(t)/dt ) .    (14)

Note that for A(t) = e^{At}, eq. (14) reduces to eq. (12) derived above.

2. Four Important Theorems Involving the Matrix Exponential

The adjoint operator ad_A, which is a linear operator acting on the vector space of n × n matrices, is defined by

    ad_A(B) = [A, B] ≡ AB − BA .    (15)

Note that

    (ad_A)^n(B) = [ A, [A, … [A, B] … ] ]    (16)

involves n nested commutators.
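The adjoint operator of eq. (15) and its nested powers of eq. (16) translate directly into code. The partial sums below reproduce e^A B e^{−A}, which is the content of Theorem 1 in what follows (a sketch assuming SciPy; the random 4 × 4 test matrices are arbitrary).

```python
# Sketch of ad_A and its nested powers: partial sums of
# sum_n (ad_A)^n(B)/n! converge to e^A B e^{-A}.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = 0.1 * rng.standard_normal((4, 4))   # small, arbitrary test matrices
B = rng.standard_normal((4, 4))

def ad(X, Y):
    """ad_X(Y) = [X, Y], eq. (15)."""
    return X @ Y - Y @ X

term = B.copy()          # the n = 0 term
total = B.copy()
for n in range(1, 20):
    term = ad(A, term) / n           # (ad_A)^n(B)/n!, built recursively
    total += term                    # n nested commutators, eq. (16)

assert np.allclose(total, expm(A) @ B @ expm(-A))
```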
Theorem 1:

    e^A B e^{−A} = exp(ad_A)(B) ≡ Σ_{n=0}^∞ (1/n!) (ad_A)^n(B) = B + [A, B] + ½[A, [A, B]] + ⋯ .    (17)

Proof: Define

    B(t) ≡ e^{tA} B e^{−tA} ,    (18)

and compute the Taylor series of B(t) around the point t = 0. A simple computation yields B(0) = B and

    dB(t)/dt = A e^{tA} B e^{−tA} − e^{tA} B e^{−tA} A = [A, B(t)] = ad_A(B(t)) .    (19)

Higher derivatives can also be computed. It is a simple exercise to show that

    d^n B(t)/dt^n = (ad_A)^n(B(t)) .    (20)

Theorem 1 then follows by substituting t = 1 in the resulting Taylor series expansion of B(t).

We now introduce two auxiliary functions that are defined by their power series:

    f(z) = (e^z − 1)/z = Σ_{n=0}^∞ z^n/(n+1)! ,          |z| < ∞ ,    (21)

    g(z) = ln z/(z − 1) = Σ_{n=0}^∞ (1 − z)^n/(n+1) ,    |z − 1| < 1 .    (22)

These functions satisfy:

    f(ln z) g(z) = 1 ,   for |z − 1| < 1 ,    (23)

    f(z) g(e^z) = 1 ,    for |z| < ln 2 .    (24)

Theorem 2:

    ( d/dt e^{A(t)} ) e^{−A(t)} = f(ad_{A(t)}) ( dA/dt ) ,    (25)

where f(z) is defined via its Taylor series in eq. (21). Note that in general, A(t) does not commute with dA/dt. A simple example, A(t) = A + tB, where A and B are independent of t and [A, B] ≠ 0, illustrates this point. In the special case where [A(t), dA/dt] = 0, eq. (25) reduces to

    ( d/dt e^{A(t)} ) e^{−A(t)} = dA/dt ,   if [ A(t), dA/dt ] = 0 .    (26)

Proof: Define

    B(s, t) ≡ ( ∂/∂t e^{sA(t)} ) e^{−sA(t)} ,    (27)
and compute the Taylor series of B(s, t) around the point s = 0. It is straightforward to verify that B(0, t) = 0 and

    ( ∂^n B(s, t)/∂s^n )|_{s=0} = (ad_{A(t)})^{n−1} ( dA/dt ) ,    (28)

for all positive integers n. Assembling the Taylor series for B(s, t) and inserting s = 1 then yields Theorem 2. Note that if [A(t), dA/dt] = 0, then ( ∂^n B(s, t)/∂s^n )|_{s=0} = 0 for all n ≥ 2, and we recover the result of eq. (26).

Theorem 3:

    d/dt e^{A(t)} = ∫₀¹ e^{sA} ( dA/dt ) e^{(1−s)A} ds .    (29)

This integral representation is an alternative version of Theorem 2.

Proof: Consider

    ∂/∂s ( e^{sA} e^{(1−s)B} ) = A e^{sA} e^{(1−s)B} − e^{sA} e^{(1−s)B} B = e^{sA} (A − B) e^{(1−s)B} .    (30)

Integrate eq. (30) from s = 0 to s = 1. Using

    ∫₀¹ ds ∂/∂s ( e^{sA} e^{(1−s)B} ) = e^{sA} e^{(1−s)B} |_{s=0}^{s=1} = e^A − e^B ,    (31)

it follows that

    e^A − e^B = ∫₀¹ ds e^{sA} (A − B) e^{(1−s)B} .    (32)

In eq. (32), we can replace B → A + hB, where h is an infinitesimal quantity:

    e^A − e^{A+hB} = −h ∫₀¹ ds e^{sA} B e^{(1−s)(A+hB)} .    (33)

Taking the limit as h → 0,

    lim_{h→0} (1/h) [ e^{A+hB} − e^A ] = ∫₀¹ ds e^{sA} B e^{(1−s)A} .    (34)

Finally, we note that the definition of the derivative can be used to write:

    d/dt e^{A(t)} = lim_{h→0} [ e^{A(t+h)} − e^{A(t)} ] / h .    (35)

Using

    A(t+h) = A(t) + h dA/dt + O(h²) ,    (36)
it follows that:

    d/dt e^{A(t)} = lim_{h→0} (1/h) { exp[ A(t) + h dA/dt ] − exp[ A(t) ] } .    (37)

Thus, we can use the result of eq. (34) with B = dA/dt to obtain

    d/dt e^{A(t)} = ∫₀¹ e^{sA} ( dA/dt ) e^{(1−s)A} ds ,    (38)

which is the result quoted in Theorem 3.

Second proof of Theorem 2: One can now derive Theorem 2 directly from Theorem 3. Multiplying eq. (29) on the right by e^{−A(t)} yields

    ( d/dt e^{A(t)} ) e^{−A(t)} = ∫₀¹ e^{sA} ( dA/dt ) e^{(1−s)A} e^{−A} ds = ∫₀¹ e^{sA} ( dA/dt ) e^{−sA} ds .    (39)

Using Theorem 1 [see eq. (17)], it follows that:

    ( d/dt e^{A(t)} ) e^{−A(t)} = ∫₀¹ exp(s ad_A) ( dA/dt ) ds .    (40)

The integral over s is trivial,

    ∫₀¹ exp(s ad_A) ds = ( e^{ad_A} − 1 ) / ad_A ,    (41)

and one finds:

    ( d/dt e^{A(t)} ) e^{−A(t)} = [ ( e^{ad_A} − 1 ) / ad_A ] ( dA/dt ) = f(ad_A) ( dA/dt ) ,    (42)

which coincides with Theorem 2.

Theorem 4: The Baker-Campbell-Hausdorff (BCH) formula,

    ln( e^A e^B ) = B + ∫₀¹ g[ exp(t ad_A) exp(ad_B) ](A) dt ,    (43)

where g(z) is defined via its Taylor series in eq. (22). Since g(z) is only defined for |z − 1| < 1, it follows that the BCH formula for ln( e^A e^B ) converges provided that ‖e^A e^B − I‖ < 1, where I is the identity matrix and ‖·‖ is a suitably defined matrix norm. Expanding the BCH formula, using the Taylor series definition of g(z), yields:

    e^A e^B = exp( A + B + ½[A, B] + (1/12)[A, [A, B]] + (1/12)[B, [B, A]] + ⋯ ) ,    (44)
assuming that the resulting series is convergent. An example where the BCH series does not converge occurs for the following element of SL(2,R):

    M = ( −e^λ      0     )  =  exp[ λ ( 1    0 ) ] exp[ π (  0   1 ) ] ,    (45)
        (  0     −e^{−λ}  )          ( 0   −1 )          ( −1   0 )

where λ is any nonzero real number. It is easy to prove that no real matrix C exists such that M = exp C.[1] Nevertheless, the BCH formula is guaranteed to converge in a neighborhood of the identity of any Lie group.

Proof of the BCH formula: Define

    C(t) = ln( e^{tA} e^B ) ,    (46)

or equivalently,

    e^{C(t)} = e^{tA} e^B .    (47)

Using Theorem 1, it follows that for any complex n × n matrix H,

    exp[ ad_{C(t)} ](H) = e^{C(t)} H e^{−C(t)} = e^{tA} e^B H e^{−B} e^{−tA}
                        = e^{tA} [ exp(ad_B)(H) ] e^{−tA} = exp(ad_{tA}) exp(ad_B)(H) .    (48)

Hence, the following operator equation is valid:

    exp[ ad_{C(t)} ] = exp(t ad_A) exp(ad_B) ,    (49)

after noting that exp(ad_{tA}) = exp(t ad_A). Next, we use Theorem 2 to write:

    ( d/dt e^{C(t)} ) e^{−C(t)} = f(ad_{C(t)}) ( dC/dt ) .    (50)

However, we can compute the left-hand side of eq. (50) directly:

    ( d/dt e^{C(t)} ) e^{−C(t)} = A e^{tA} e^B e^{−B} e^{−tA} = A e^{tA} e^{−tA} = A ,    (51)

[1] The characteristic equation for any 2 × 2 matrix A is given by λ² − (Tr A)λ + det A = 0. Hence, the eigenvalues of any 2 × 2 traceless matrix A ∈ sl(2,R) [that is, A is an element of the Lie algebra of SL(2,R)] are given by λ± = ±(−det A)^{1/2}. Then,

    Tr e^A = exp(λ₊) + exp(λ₋) = 2 cosh (−det A)^{1/2} ,  if det A ≤ 0 ,
                                 2 cos (det A)^{1/2} ,    if det A > 0 .

Thus, if det A ≤ 0, then Tr e^A ≥ 2, and if det A > 0, then −2 ≤ Tr e^A ≤ 2. It follows that for any A ∈ sl(2,R), Tr e^A ≥ −2. For the matrix M defined in eq. (45), Tr M = −2 cosh λ < −2 for any nonzero real λ. Hence, no real matrix C exists such that M = exp C.
since B is independent of t and A commutes with e^{tA}. Combining the results of eqs. (50) and (51),

    A = f(ad_{C(t)}) ( dC/dt ) .    (52)

Multiplying both sides of eq. (52) by g(exp ad_{C(t)}) and using eq. (24) yields:

    dC/dt = g(exp ad_{C(t)})(A) .    (53)

Employing the operator equation, eq. (49), one may rewrite eq. (53) as:

    dC/dt = g( exp(t ad_A) exp(ad_B) )(A) ,    (54)

which is a differential equation for C(t). Integrating from t = 0 to t = T, one easily solves for C. The end result is

    C(T) = B + ∫₀^T g( exp(t ad_A) exp(ad_B) )(A) dt ,    (55)

where the constant of integration, B, has been obtained by setting T = 0 [since C(0) = ln e^B = B]. Finally, setting T = 1 in eq. (55) yields the BCH formula.

Finally, we shall use eq. (43) to obtain the terms exhibited in eq. (44). In light of the series definition of g(z) given in eq. (22), we need to compute

    I − exp(t ad_A) exp(ad_B) = I − ( I + t ad_A + ½ t² ad_A² )( I + ad_B + ½ ad_B² )
        = −ad_B − t ad_A − t ad_A ad_B − ½ ad_B² − ½ t² ad_A² ,    (56)

and

    [ I − exp(t ad_A) exp(ad_B) ]² = ad_B² + t ad_A ad_B + t ad_B ad_A + t² ad_A² ,    (57)

after dropping cubic terms and higher. Hence, using eq. (22),

    g( exp(t ad_A) exp(ad_B) ) = I − ½ ad_B − ½ t ad_A − (1/6) t ad_A ad_B + (1/3) t ad_B ad_A
                                   + (1/12) ad_B² + (1/12) t² ad_A² .    (58)

Noting that ad_A(A) = [A, A] = 0, it follows that to cubic order,

    B + ∫₀¹ g( exp(t ad_A) exp(ad_B) )(A) dt = B + A − ½[B, A] − (1/12)[A, [B, A]] + (1/12)[B, [B, A]]
        = A + B + ½[A, B] + (1/12)[A, [A, B]] + (1/12)[B, [B, A]] ,    (59)

which confirms the result of eq. (44).
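The truncated series of eq. (44) can be checked numerically for matrices close enough to zero that the omitted quartic (and higher) nested commutators are negligible. A sketch, assuming SciPy's expm and logm and an arbitrary pair of small random matrices:

```python
# Numerical check of the truncated BCH series, eq. (44).  The matrices
# are scaled small so that the dropped quartic terms are far below the
# comparison tolerance.
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(1)
A = 0.01 * rng.standard_normal((3, 3))
B = 0.01 * rng.standard_normal((3, 3))

def c(X, Y):
    """Commutator [X, Y]."""
    return X @ Y - Y @ X

exact = logm(expm(A) @ expm(B))                    # ln(e^A e^B)
bch = (A + B + c(A, B) / 2
       + c(A, c(A, B)) / 12 + c(B, c(B, A)) / 12)  # eq. (44), truncated

# Agreement up to the omitted quartic (and higher) terms:
assert np.allclose(exact, bch, atol=1e-6)
```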
References:

The proofs of Theorems 1, 2 and 4 can be found in section 5.1 of Symmetry Groups and Their Applications, by Willard Miller, Jr. (Academic Press, New York, 1972). The proof of Theorem 3 is based on results given in section 6.5 of Positive Definite Matrices, by Rajendra Bhatia (Princeton University Press, Princeton, NJ, 2007). Bhatia notes that eq. (29) has been attributed variously to Duhamel, Dyson, Feynman and Schwinger. See also R.M. Wilcox, J. Math. Phys. 8, 962 (1967). Theorem 3 is also quoted in eq. (5.75) of Weak Interactions and Modern Particle Theory, by Howard Georgi (Dover Publications, Mineola, NY, 2009) [although the proof of this result is relegated to an exercise]. The proof of Theorem 2 using the results of Theorem 3 is based on my own analysis, although I would not be surprised to find this proof elsewhere in the literature. Finally, a nice discussion of the SL(2,R) matrix that cannot be written as a single exponential can be found in section 3.4 of Matrix Groups: An Introduction to Lie Group Theory, by Andrew Baker (Springer-Verlag, London, UK, 2002), and in section 10.5(b) of Group Theory in Physics, Volume 2, by J.F. Cornwell (Academic Press, London, UK, 1984).