Tensors - Lecture 4

1 Introduction

The concept of a tensor is derived from considering the properties of a function under a transformation of the coordinate system. As previously discussed, such transformations allow the insertion of symmetries into the mathematical representations of physical processes. We know that a description of a physical process cannot depend on the coordinate orientation or origin. This idea can be expanded to look for symmetries in descriptions other than geometric ones. Symmetries have been a powerful tool in the development of physical intuition about physical phenomena.

In the previous lecture we discussed the mathematical definition of scalar and vector functions. It is best to start the description of tensors by reviewing the transformation of a vector function under a rotation of coordinates. Consider the vector function

\vec{F}(x, y, z) = f_x(x, y, z)\,\hat{x} + f_y(x, y, z)\,\hat{y} + f_z(x, y, z)\,\hat{z}

Thus a vector function is really 3 functions, one associated with each of the 3 spatial coordinate directions. Under a rotation through an angle \theta about the \hat{z} axis we transform from the functions f_i(x, y, z) to f'_i(x', y', z'). The transformation is written

f'_i = \sum_j a_{ij} f_j

a_{ij} = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}

Generally one can rotate about any axis, and the transformation for the other axes is an obvious permutation. As an aside, note that a rotation about an arbitrary axis can be obtained by rotations through the three Euler angles \alpha, \beta, \gamma in succession, as illustrated in Figure 1. Thus there is a transformation a_{ij} which takes the function \vec{F}, defined in the coordinate frame (x, y, z), into the function \vec{F}' in the coordinate frame (x', y', z'):

a_{ij} = \begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{pmatrix}
\begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}

The rotation matrices are unitary; that is, they preserve the length of the rotated vector. Multiply the matrix by its transpose to show this.
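The unitarity claim is easy to check numerically. The following plain-Python sketch (the angle 0.7 is chosen arbitrarily; none of these helper names appear in the lecture) multiplies the z-axis rotation matrix by its transpose and evaluates its determinant:

```python
import math

def rot_z(theta):
    """Rotation matrix a_ij for a rotation by theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

a = rot_z(0.7)
# a_ij a_ik summed over i should give delta_jk (the identity matrix)
prod = matmul(transpose(a), a)
# cofactor expansion of the determinant along the first row
det = (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
     - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
     + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
print(prod)  # the identity matrix, to rounding
print(det)   # +1, to rounding
```

The transpose times the matrix gives the identity (length is preserved) and the determinant is +1 (a proper rotation, no reflection).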
Figure 1: An example to obtain the rotation of a coordinate frame about an arbitrary axis defined by using the Euler angles

\vec{F}' \cdot \vec{F}' = \sum_i \Big( \sum_j a_{ij} f_j \Big) \Big( \sum_k a_{ik} f_k \Big) = \sum_{jk} f_j f_k \sum_i a_{ij} a_{ik}

\sum_i a_{ij} a_{ik} = \delta_{jk}

so that \vec{F}' \cdot \vec{F}' = \vec{F} \cdot \vec{F}. The matrix obtained in the above sum over the direct product of the a_{ij} is a diagonal matrix with 1's along the diagonal. This matrix has determinant +1.

We might also consider an operation which preserves length but reflects the vector through the origin, that is, the transformation which changes a right-handed coordinate system into a left-handed one:

x' = -x \qquad y' = -y \qquad z' = -z

This forms a matrix that is diagonal with -1's along the diagonal. It has determinant -1. It represents the symmetry operation of parity. A vector changes sign upon reflection, ie a vector in a right-handed frame is the negative vector in a left-handed frame. If a vector does not change sign under reflection, it is a pseudo-vector or axial vector. The cross product of two real vectors is a pseudo-vector, as is obvious since each vector changes sign upon reflection and their product does not.

The operations of rotation and reflection can be written as components of matrices and, as such, represent linear transformations of the vector components. Remember that a scalar function remains constant under rotations and also under reflections. Since a scalar is represented by one function, the transformation is a 1-D matrix, or just the number 1. However, if the scalar function changes sign under reflection, it is a pseudo-scalar.

Using the concepts of vector and scalar linear transformations, we can expand the dimensions to write linear operators for transformations between higher-order forms. The fundamental component of the definition is the linear connection between
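The claim that the cross product is a pseudo-vector can be checked directly: under parity both factors flip sign, so their cross product is unchanged. A small Python sketch (the components are chosen arbitrarily):

```python
def cross(u, v):
    """Cartesian cross product u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

u = [1.0, 2.0, 3.0]
v = [-1.0, 0.5, 2.0]

# parity: every true vector changes sign
pu = [-c for c in u]
pv = [-c for c in v]

print(cross(u, v))
print(cross(pu, pv))  # identical: the cross product does NOT change sign
```

A true vector would have flipped sign under the reflection; the cross product did not, which is exactly the pseudo-vector behavior described above.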
the differential forms of a function in different coordinate frames. Thus a tensor of rank 0 is a scalar (or pseudo-scalar) and a tensor of rank 1 is a vector (or pseudo-vector). A tensor of higher rank transforms the function F_{ij\cdots} as

F'_{ij\cdots} = \sum_{k,l,\cdots} a_{ik}\, a_{jl} \cdots F_{kl\cdots}

2 Summation Convention

Suppose we have n independent variables x_i with i = 1, 2, \ldots, n. The set of values x_i will define a point in an n-dimensional space. There are n independent functions \phi_i(x_1, x_2, \ldots, x_n) in this space. For these functions to be linearly independent, the Jacobian cannot vanish. Therefore

J = \begin{vmatrix} \partial\phi_1/\partial x_1 & \cdots & \partial\phi_n/\partial x_1 \\ \vdots & & \vdots \\ \partial\phi_1/\partial x_n & \cdots & \partial\phi_n/\partial x_n \end{vmatrix} \neq 0

Let x'_i = \phi_i define another coordinate system, so in the same way we developed the scale factors previously,

\sum_i \frac{\partial x'_k}{\partial x_i} \frac{\partial x_i}{\partial x'_j} = \delta^k_j

A direction at a point is

dx'_i = \sum_j \frac{\partial x'_i}{\partial x_j}\, dx_j

Now introduce the summation convention, which assumes that a repeated index, called a dummy variable, is summed, and the summation sign is dropped. Thus the partial derivative equations above are written as

\frac{\partial x'_k}{\partial x_i} \frac{\partial x_i}{\partial x'_j} = \delta^k_j \qquad dx'_i = \frac{\partial x'_i}{\partial x_j}\, dx_j

3 Contravariant and covariant vectors

We will now use a superscript on the variable instead of a subscript to represent a tensor component. This is because we will now use this difference to denote different types of tensors. Note that
a distance transforms as

dx'^i = \frac{\partial x'^i}{\partial x^j}\, dx^j

Thus this form represents a true vector, and it has the linear transformation

A'^i = \frac{\partial x'^i}{\partial x^j} A^j

To clarify, the above transformation describes how a component of a contravariant vector transforms. Next consider the gradient operator

\vec{\nabla}\phi = \frac{\partial \phi}{\partial x^i}\, \hat{x}^i

This form transforms as

\frac{\partial \phi}{\partial x'^i} = \frac{\partial x^j}{\partial x'^i} \frac{\partial \phi}{\partial x^j}

Note that the gradient transforms with \partial x^j / \partial x'^i rather than \partial x'^i / \partial x^j. This is the transformation of a covariant vector component, which will be noted by a subscript:

A'_i = \frac{\partial x^j}{\partial x'^i} A_j

In a transformation between Cartesian coordinate frames

\frac{\partial x'^i}{\partial x^j} = \frac{\partial x^j}{\partial x'^i} = a^i_j = a_j^i

Thus there is no difference between contravariant and covariant vectors. In a general curvilinear coordinate frame this is not the case. Therefore, unless stated differently, we will use a superscript to denote a contravariant component of a tensor and a subscript to denote a covariant component. A contravariant vector determines the direction and magnitude of a displacement at some point in space; it forms a vector field. A covariant vector describes the change of the field at the point in space.

Let \lambda^i be any n functions of the coordinates x^j. The contravariant transformation is

\lambda'^i = \frac{\partial x'^i}{\partial x^j} \lambda^j

Then write

\frac{\partial x^k}{\partial x'^i} \lambda'^i = \lambda^j \frac{\partial x^k}{\partial x'^i} \frac{\partial x'^i}{\partial x^j} = \lambda^j \delta^k_j = \lambda^k
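The statement that the two Jacobians coincide for Cartesian rotations can be verified numerically: the forward Jacobian \partial x'^i/\partial x^j of a rotation equals the inverse Jacobian \partial x^j/\partial x'^i with the roles of the indices exchanged. A plain-Python sketch (the angle 0.4 is arbitrary):

```python
import math

theta = 0.4
c, s = math.cos(theta), math.sin(theta)

# forward Jacobian dx'^i/dx^j for a rotation about z (contravariant rule)
fwd = [[c, s, 0], [-s, c, 0], [0, 0, 1]]

# inverse Jacobian dx^j/dx'^i: the rotation by -theta (covariant rule)
inv = [[c, -s, 0], [s, c, 0], [0, 0, 1]]

# reading dx^j/dx'^i with i labeling the rows is the transpose of inv,
# which reproduces fwd exactly: a^i_j = a_j^i
transposed_inv = [list(row) for row in zip(*inv)]
print(transposed_inv == fwd)  # True
```

This is why superscripts and subscripts can be used interchangeably in Cartesian frames; in curvilinear coordinates the two Jacobians genuinely differ.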
Thus we have found the inverse transformation for the components of a contravariant vector. Suppose we rewrite the above relations for vector components \lambda_j and \mu^k and compose the form (summation convention)

\lambda'_j \mu'^j = \lambda_i \frac{\partial x^i}{\partial x'^j}\, \mu^k \frac{\partial x'^j}{\partial x^k} = \lambda_i \mu^k \delta^i_k = \lambda_j \mu^j

This is an invariant of the transformation. The result is a scalar which is invariant under coordinate transformations. In our old way of combining vectors this is just the dot, or scalar, product.

4 Tensors

To begin, define two contravariant vectors, \lambda^i and \eta^i. We study their transformations from an un-primed to a primed coordinate frame. Also define two covariant vectors, \mu_i and \zeta_i. The following forms are to be used:

A^{ij} = \lambda^i \eta^j \qquad A_{ij} = \mu_i \zeta_j \qquad A^i_j = \lambda^i \mu_j

We may define combinations of the primed forms in a similar way. The transformations between these are

A'^{ij} = A^{kl} \frac{\partial x'^i}{\partial x^k} \frac{\partial x'^j}{\partial x^l}

A'_{ij} = A_{kl} \frac{\partial x^k}{\partial x'^i} \frac{\partial x^l}{\partial x'^j}

A'^i_j = A^k_l \frac{\partial x'^i}{\partial x^k} \frac{\partial x^l}{\partial x'^j}

These forms are tensors of 2nd order. A^{kl} is a contravariant tensor, A_{kl} is a covariant tensor, and A^k_l is a mixed tensor. Note that there are n^2 elements in each tensor. The Kronecker delta, \delta^k_j, is a mixed tensor of 2nd order:

\delta'^k_l = \frac{\partial x'^k}{\partial x^i} \frac{\partial x^j}{\partial x'^l}\, \delta^i_j = \frac{\partial x'^k}{\partial x'^l}

Tensors of any order may be constructed in a similar way. The above construction uses the direct product of tensors of lower order to produce one of higher order. Tensors are defined
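The invariance of \lambda_j \mu^j can be seen numerically. In a Cartesian frame the two types of components coincide, so a rotation applied to both vectors must leave the dot product unchanged. A sketch (angle and components chosen arbitrarily):

```python
import math

theta = 1.1
c, s = math.cos(theta), math.sin(theta)
R = [[c, s, 0], [-s, c, 0], [0, 0, 1]]  # rotation about z

lam = [2.0, -1.0, 0.5]
mu = [0.3, 4.0, -2.0]

def apply(m, v):
    """Transform the components of v by the matrix m."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

lam_p, mu_p = apply(R, lam), apply(R, mu)
dot = sum(a * b for a, b in zip(lam, mu))
dot_p = sum(a * b for a, b in zip(lam_p, mu_p))
print(dot, dot_p)  # equal to rounding: the scalar product is invariant
```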
by their linear transformation properties. Thus

A'^{i,j,\ldots,k} = A^{m,n,\ldots,p} \Big( \frac{\partial x'^i}{\partial x^m} \Big) \cdots \Big( \frac{\partial x'^k}{\partial x^p} \Big)

The components of covariant and mixed tensors transform in a similar way. However, note that the order of the indices (not the order of the terms in parentheses) is important.

Consider an arbitrary contravariant tensor A^{ij}. Write this as

A^{ij} = (1/2)[A^{ij} + A^{ji}] + (1/2)[A^{ij} - A^{ji}]

Addition/subtraction of tensors can only be done for tensors of the same order. The first term on the right side of the above equation forms a symmetric tensor and the second an anti-symmetric tensor. Thus an arbitrary tensor is a combination of symmetric and anti-symmetric tensors.

We developed the concept of tensors by multiplying two tensors of lower order to obtain a tensor of higher order. This is the outer (direct) product of the tensors. The inner (dot or scalar) product of two tensors forms a tensor of lower order than the direct product. Thus consider the inner product of the tensors A^{ij} and B_{jkl}. The inner product sums over the repeated index, in this case j, to get a tensor of rank 3:

A^{ij} B_{jkl} = \Gamma^i_{kl}

To see that this is a tensor, transform the direct product and sum over j:

A'^{ij} B'_{jkl} = A^{rs} B_{wuv} \Big( \frac{\partial x'^i}{\partial x^r} \Big) \Big( \frac{\partial x'^j}{\partial x^s} \Big) \Big( \frac{\partial x^w}{\partial x'^j} \Big) \Big( \frac{\partial x^u}{\partial x'^k} \Big) \Big( \frac{\partial x^v}{\partial x'^l} \Big)

Note that

\Big( \frac{\partial x'^j}{\partial x^s} \Big) \Big( \frac{\partial x^w}{\partial x'^j} \Big) = \delta^w_s

This results in

A'^{ij} B'_{jkl} = \Gamma^r_{uv} \Big( \frac{\partial x'^i}{\partial x^r} \Big) \Big( \frac{\partial x^u}{\partial x'^k} \Big) \Big( \frac{\partial x^v}{\partial x'^l} \Big)

The above transforms as a mixed tensor of rank 3. The process of reducing the order of a tensor by the inner product is called contraction.
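The symmetric/anti-symmetric decomposition can be demonstrated on an arbitrary 3 x 3 array (the entries below are illustrative, not from the lecture):

```python
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

# S_ij = (1/2)(A_ij + A_ji), symmetric part
S = [[0.5 * (A[i][j] + A[j][i]) for j in range(3)] for i in range(3)]
# T_ij = (1/2)(A_ij - A_ji), anti-symmetric part
T = [[0.5 * (A[i][j] - A[j][i]) for j in range(3)] for i in range(3)]

assert all(S[i][j] == S[j][i] for i in range(3) for j in range(3))
assert all(T[i][j] == -T[j][i] for i in range(3) for j in range(3))
assert all(S[i][j] + T[i][j] == A[i][j] for i in range(3) for j in range(3))
print("decomposition verified")
```

Note that the anti-symmetric part has zeros on the diagonal, so in 3 dimensions it carries only 3 independent components, a point that returns below in the discussion of dual tensors.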
5 Conjugate Tensors

The metric g_{ij} is a symmetric covariant tensor of order 2. Thus g_{ij} = g_{ji}. Further, we write the determinant

g = \begin{vmatrix} g_{11} & \cdots & g_{1n} \\ \vdots & & \vdots \\ g_{n1} & \cdots & g_{nn} \end{vmatrix}

Then let g^{ij} be the cofactor of the element g_{ij} divided by g. The cofactor of a tensor element A_{ij} is given by

A^{ij} = (-1)^{i+j} M_{ij}

where M_{ij} is the minor of the element A_{ij}. The minor is obtained from the determinant of the tensor after deleting the row and column containing the element. This means that the determinant of the tensor A is

|A| = \sum_{j=1}^{n} A_{ij} (-1)^{i+j} M_{ij}

This then gives the relation

g^{ij} g_{kj} = \delta^i_k

The g^{ij} are elements of a contravariant tensor of order 2. This tensor is the conjugate of g_{ij}. Using this symmetric tensor, one may obtain a tensor of the same order but of different character (raise or lower the index, ie change from covariant to contravariant or from contravariant to covariant):

A^l_{jk} = g^{li} A_{ijk} \qquad A^{lmp} = g^{li} g^{mj} g^{pk} A_{ijk}

Note that the process is reversible. Also

g^{ij} g_{kj} = (1/g) \sum_j (-1)^{i+j} M_{ij}\, g_{kj}

Unless i = k, we have the product of the elements of one row with the cofactors of another row, which vanishes. In the case i = k the determinant results, so the sum equals g/g = 1. As a result

a^i = g^{ij} a_j \qquad a_j = g_{ij} a^i
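The cofactor construction of the conjugate metric can be checked numerically. The sketch below (an arbitrary symmetric 3 x 3 metric, not one from the lecture) builds g^{ij} from cofactors and verifies g^{ij} g_{kj} = \delta^i_k:

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def cofactor(m, i, j):
    """(-1)^(i+j) times the minor obtained by deleting row i and column j."""
    minor = [[m[r][c] for c in range(3) if c != j] for r in range(3) if r != i]
    return (-1) ** (i + j) * (minor[0][0] * minor[1][1] - minor[0][1] * minor[1][0])

g = [[2.0, 0.5, 0.0],
     [0.5, 3.0, 1.0],
     [0.0, 1.0, 4.0]]  # an arbitrary symmetric "metric"

det = det3(g)
g_up = [[cofactor(g, i, j) / det for j in range(3)] for i in range(3)]

# check g^ij g_kj = delta^i_k
for i in range(3):
    for k in range(3):
        val = sum(g_up[i][j] * g[k][j] for j in range(3))
        assert abs(val - (1.0 if i == k else 0.0)) < 1e-12
print("conjugate metric verified")
```

Because g is symmetric, its cofactor matrix is also symmetric, so no transpose is needed in forming the conjugate.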
6 Further discussion on the metric

The metric is used in determining the differential length in a coordinate frame:

ds^2 = g_{ij}\, d\eta^i d\eta^j \qquad g_{jk} = \sum_i \frac{\partial x^i}{\partial \eta^j} \frac{\partial x^i}{\partial \eta^k}

The metric is restricted so that its determinant g \neq 0, but otherwise ds^2 can be < 0, so we must take |ds^2| to be the measure of length. The metric is still defined by

ds^2 = g_{ij}\, dx^i dx^j

An arc length between a and b is

s = \int_a^b dt\, \sqrt{ e\, g_{ij} \frac{\partial x^i}{\partial t} \frac{\partial x^j}{\partial t} }

The factor e is taken as \pm 1 so that each element in the sum is > 0, and t is the parametric variable.

7 Levi-Civita symbol

Define the following tensor of rank 3:

\epsilon_{123} = \epsilon_{231} = \epsilon_{312} = 1

\epsilon_{132} = \epsilon_{213} = \epsilon_{321} = -1

All other \epsilon_{ijk} = 0

This is the 3-D Levi-Civita tensor, and it is obviously a pseudo-tensor, as odd permutations of the indices are -1 times the even permutations. We can also define the conjugate tensor \epsilon^{ijk}. Suppose a set of vectors \lambda^i, \eta^i, \zeta^i with i = 1, 2, 3. Contract them with the Levi-Civita tensor to produce the pseudo-scalar \phi:

\phi = \epsilon_{ijk} \lambda^i \eta^j \zeta^k

\phi = \lambda^1 \eta^2 \zeta^3 - \lambda^1 \eta^3 \zeta^2 + \lambda^2 \eta^3 \zeta^1 - \lambda^2 \eta^1 \zeta^3 + \lambda^3 \eta^1 \zeta^2 - \lambda^3 \eta^2 \zeta^1
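The contraction \phi is just the scalar triple product \lambda \cdot (\eta \times \zeta), which can be confirmed by brute force (the three vectors below are arbitrary test values):

```python
def eps(i, j, k):
    """3-D Levi-Civita symbol with indices 0, 1, 2."""
    if len({i, j, k}) < 3:
        return 0  # repeated index
    # even (cyclic) permutations of (0, 1, 2) give +1, odd give -1
    return 1 if (i, j, k) in [(0, 1, 2), (1, 2, 0), (2, 0, 1)] else -1

lam = [1.0, 2.0, 3.0]
eta = [0.0, 1.0, 4.0]
zeta = [5.0, 6.0, 0.0]

# phi = eps_ijk lam^i eta^j zeta^k, summed over all 27 index triples
phi = sum(eps(i, j, k) * lam[i] * eta[j] * zeta[k]
          for i in range(3) for j in range(3) for k in range(3))

# compare against lam . (eta x zeta)
cr = [eta[1] * zeta[2] - eta[2] * zeta[1],
      eta[2] * zeta[0] - eta[0] * zeta[2],
      eta[0] * zeta[1] - eta[1] * zeta[0]]
triple = sum(l * c for l, c in zip(lam, cr))
print(phi, triple)  # equal
```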
The Levi-Civita tensor may be expanded to rank 4, \epsilon_{ijkl}, by a similar definition. If we allow the vectors \lambda^i, \eta^i, \zeta^i to be differential vectors lying along each of the Cartesian coordinate axes, ie

\lambda \rightarrow (dx, 0, 0) \qquad \eta \rightarrow (0, dy, 0) \qquad \zeta \rightarrow (0, 0, dz)

then the value of \phi is dx\, dy\, dz = d\tau. This is a differential volume element, which we identify as a pseudo-scalar.

8 Dual tensors

Any anti-symmetric tensor of rank 2 or greater has a dual representation. This is understood by contracting the tensor with a Levi-Civita tensor. To be specific, consider the anti-symmetric tensor built from position and momentum, whose dual gives the angular momentum:

C_{ij} = \begin{pmatrix} 0 & C_{12} & C_{13} \\ C_{21} & 0 & C_{23} \\ C_{31} & C_{32} & 0 \end{pmatrix} = \begin{pmatrix} 0 & x_1 p_2 - x_2 p_1 & x_1 p_3 - x_3 p_1 \\ x_2 p_1 - x_1 p_2 & 0 & x_2 p_3 - x_3 p_2 \\ x_3 p_1 - x_1 p_3 & x_3 p_2 - x_2 p_3 & 0 \end{pmatrix}

where, to be anti-symmetric, C_{ij} = -C_{ji}. Then the contraction with the 3-D Levi-Civita tensor yields a vector:

c_i = (1/2)\, \epsilon_{ijk} C_{jk}

This produces, for the angular momentum example,

\vec{L} = [y p_z - z p_y]\hat{x} - [x p_z - z p_x]\hat{y} + [x p_y - y p_x]\hat{z}

Note that \vec{c} transforms as a vector, as previously proven by tensor contraction, but this vector is a pseudo-vector because of its symmetry properties. We see that the cross product is a special anti-symmetric tensor whose dual has the rotation properties of a vector, but the symmetry under inversion of a tensor of rank 2.

Tensor properties may also be developed from the concept of the increasing order of surfaces. The properties of a point are equivalent to a scalar, the properties of a line are vectors, the properties of a surface are pseudo-vectors, etc.

9 Lorentz covariance

Assume a 4-D Cartesian coordinate system in Minkowski space:

x_1 = x \qquad x_2 = y \qquad x_3 = z \qquad x_4 = ict
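The duality between the anti-symmetric tensor C_{jk} and the cross product can be checked component by component (position and momentum values below are arbitrary):

```python
x = [1.0, 2.0, 3.0]   # position components x_1, x_2, x_3
p = [0.5, -1.0, 2.0]  # momentum components p_1, p_2, p_3

# anti-symmetric tensor C_jk = x_j p_k - x_k p_j
C = [[x[j] * p[k] - x[k] * p[j] for k in range(3)] for j in range(3)]

def eps(i, j, k):
    """3-D Levi-Civita symbol with indices 0, 1, 2."""
    if len({i, j, k}) < 3:
        return 0
    return 1 if (i, j, k) in [(0, 1, 2), (1, 2, 0), (2, 0, 1)] else -1

# dual vector c_i = (1/2) eps_ijk C_jk
c = [0.5 * sum(eps(i, j, k) * C[j][k] for j in range(3) for k in range(3))
     for i in range(3)]

# the ordinary cross product x x p
L = [x[1] * p[2] - x[2] * p[1],
     x[2] * p[0] - x[0] * p[2],
     x[0] * p[1] - x[1] * p[0]]
print(c == L)  # True: the dual reproduces the angular momentum
```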
The metric is then

ds^2 = g_{ij}\, dx^i dx^j = dx^2 + dy^2 + dz^2 - (c\, dt)^2

This is an invariant length (scalar) in the 4-D space. The metric written as a matrix is

g_{ij} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}

The sign of the component g_{44} depends on the definition of the 4th component. In this case we defined x_4 = ict. If we had defined x_4 = ct, then g_{44} = -1. You may see this variation in a number of different expositions of Lorentz covariance.

Introduce the Lorentz transformation in the general form

x'_\mu = a_{\mu\nu} x_\nu \qquad a_{\mu\nu} a_{\mu\lambda} = \delta_{\nu\lambda}

Then for a boost along the x_3 axis

a_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma & i\gamma\beta \\ 0 & 0 & -i\gamma\beta & \gamma \end{pmatrix}

In the above, \beta = v/c and \gamma = [1 - \beta^2]^{-1/2}. This looks like a rotation in coordinate space. Recall that a rotation matrix rotates a vector, preserving its length, and has the form, for a rotation through an angle \theta about the x_1 axis,

a_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}

For the Lorentz transformation the length is preserved, as seen above, since ds is a scalar; however, we identify \cos\theta \rightarrow \gamma and \sin\theta \rightarrow i\beta\gamma. Thus we identify

\cosh\zeta = \gamma \qquad \sinh\zeta = \beta\gamma

\sinh^2\zeta = \cosh^2\zeta - 1 = \frac{\beta^2}{1 - \beta^2}

The parameter \zeta is called the rapidity.
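The rapidity relations and the invariance of the interval under a boost are easy to verify numerically. The sketch below uses the real (x, y, z, ct) form rather than the ict convention, so that plain floats suffice; \beta = 0.6 and the event coordinates are arbitrary test values:

```python
import math

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

# rapidity: zeta = atanh(beta), so cosh(zeta) = gamma and sinh(zeta) = beta*gamma
zeta = math.atanh(beta)
print(abs(math.cosh(zeta) - gamma) < 1e-12)         # True
print(abs(math.sinh(zeta) - beta * gamma) < 1e-12)  # True

# boost along x_3: check that ds^2 is unchanged
x, y, z, ct = 1.0, 2.0, 3.0, 5.0
zp = gamma * (z - beta * ct)
ctp = gamma * (ct - beta * z)
s2 = x * x + y * y + z * z - ct * ct
s2p = x * x + y * y + zp * zp - ctp * ctp
print(abs(s2 - s2p) < 1e-9)  # True: the interval is invariant
```

In this real-coordinate form the boost mixes z and ct with hyperbolic functions exactly as an ordinary rotation mixes two spatial axes with circular ones.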
10 Maxwell's equations

Maxwell's equations are

\vec{\nabla} \times \vec{E} = -\frac{\partial \vec{B}}{\partial t} \qquad \vec{\nabla} \times \vec{H} = \rho\vec{V} + \frac{\partial \vec{D}}{\partial t}

\vec{\nabla} \cdot \vec{D} = \rho \qquad \vec{\nabla} \cdot \vec{B} = 0

\vec{D} = \epsilon \vec{E} \qquad \vec{B} = \mu \vec{H}

Introduce the vector and scalar potentials:

\vec{B} = \vec{\nabla} \times \vec{A} \qquad \vec{E} = -\vec{\nabla}\phi - \frac{\partial \vec{A}}{\partial t}

The Lorentz condition is \vec{\nabla} \cdot \vec{A} + \epsilon\mu \frac{\partial \phi}{\partial t} = 0. This is used to separate the equations. Also use \epsilon_0 \mu_0 = 1/c^2 to obtain

[\nabla^2 - (1/c^2) \frac{\partial^2}{\partial t^2}]\, \vec{A} = -\mu_0\, \rho\vec{V}

[\nabla^2 - (1/c^2) \frac{\partial^2}{\partial t^2}]\, \phi = -\rho/\epsilon_0

Consider the potential functions as components of a 4-vector, A_j = c\epsilon_0 a_j (where the a_j are the components of the vector potential), A_4 = i\epsilon_0 \phi, and the source terms as the 4-vector j_i = \rho V_i / c, j_4 = i\rho. Then we have Maxwell's equations in the form

\sum_\lambda \frac{\partial^2}{\partial x_\lambda^2} A_\mu = -j_\mu

This expression transforms as a covariant vector. Note that the differential \frac{\partial^2}{\partial x_\lambda^2} is a covariant tensor of rank 2 which is contracted with a contravariant tensor of rank 1 (a vector), resulting in a covariant tensor of rank 1.
Many forms in electrodynamics, relativity, and other fields of physics can be developed using the symmetry properties of tensors.

11 Derivatives of a tensor field

The ordinary derivatives of a scalar field are components of a covariant vector field. In general, however, the derivatives of a tensor field will NOT form a new tensor field. The derivative of a vector compares its value at one point to its value at another, slightly displaced point. These two vectors do not transform with exactly the same coefficients, so the derivative of the components differs. Thus, to obtain a derivative which remains a tensor, we must compare vectors which are displaced parallel to each other. If this is done, then the linear transformation between coordinate frames is preserved.

Previously, when obtaining vector operators, we found that changes in a vector were due not only to changes in magnitude, but also to changes in direction. Thus when a vector is translated to a neighboring point without changing its Cartesian components, it has been parallel displaced. In Cartesian coordinates this is easily done, but it becomes complex in general curvilinear coordinates. Assume a contravariant vector with components h_n f_n along the unit vectors \hat{a}_n. The change in the vector is

\delta \vec{A} = \sum_n \hat{a}_n\, \delta(h_n f_n) + \sum_n h_n f_n\, \delta(\hat{a}_n)

This can be written as

\delta \vec{A} = \sum_n \hat{a}_n \Big[ \delta(h_n f_n) + \sum_m h_m f_m \Gamma^n_{mj}\, \delta\eta^j \Big]

The change in an individual component is

\delta A^n = \Big[ \frac{\partial (h_n f_n)}{\partial \eta^j} + \sum_m h_m f_m \Gamma^n_{mj} \Big] \delta\eta^j

The form \Gamma^n_{mj} is a Christoffel symbol of the 2nd kind. It describes the change of the unit vectors:

\frac{\partial \hat{a}_i}{\partial \eta^j} = \Gamma^k_{ij}\, \hat{a}_k \qquad \Gamma^k_{ij} = \hat{a}^k \cdot \frac{\partial \hat{a}_i}{\partial \eta^j}

In the previous lecture, we demonstrated that the change in the unit vectors \hat{a}_i can be expressed as

\frac{\partial \hat{a}_i}{\partial \eta^j} = \frac{\hat{a}_j}{h_i} \frac{\partial h_j}{\partial \eta^i}
This shows that for orthogonal coordinates

\Gamma^i_{jk} = 0 \quad \text{if } i, j, k \text{ are all different}

\Gamma^i_{ii} = \frac{1}{h_i} \frac{\partial h_i}{\partial \eta^i}

\Gamma^i_{ij} = \Gamma^i_{ji} = \frac{1}{h_i} \frac{\partial h_i}{\partial \eta^j}

\Gamma^j_{ii} = -\frac{h_i}{h_j^2} \frac{\partial h_i}{\partial \eta^j}

The above was developed for a contravariant vector, but a similar development provides the transformation for a covariant vector. At times the Christoffel symbol is written in the form

\Gamma^n_{mj} = \begin{Bmatrix} n \\ m\ j \end{Bmatrix}

A space that has parallel displacement defined is an affine space, and the \Gamma^n_{mj} are the components of an affine connection. Note that the Christoffel symbols are not tensors, and neither are the derivatives of the vector function. However, the combination of the two through the above equation does produce a tensor.

A Christoffel symbol of the 1st kind is defined as

[ij, k] = (1/2)\Big[ \frac{\partial g_{ik}}{\partial \eta^j} + \frac{\partial g_{jk}}{\partial \eta^i} - \frac{\partial g_{ij}}{\partial \eta^k} \Big] \qquad \Gamma^l_{ij} = g^{lk}\, [ij, k]

In the case of Cartesian coordinates, the Christoffel symbols vanish.

12 Covariant derivative

The covariant derivative assures that a vector is independent of its description in an arbitrary coordinate frame. That is, the vector has a magnitude and points in a specific direction independent of the frame of reference. The covariant derivative removes the change in the vector due to the curvature of the coordinate frame. Therefore the components of the covariant derivative, representing the rate of change of an ordinary vector \vec{A} with respect to the \eta^i axis, are

A^j_{,i} = \frac{\partial A^j}{\partial \eta^i} + \sum_k A^k \Gamma^j_{ki}
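As a numerical cross-check of the 1st-kind formula, the sketch below (plain Python with finite-difference derivatives, an illustration not taken from the lecture) uses plane polar coordinates, metric g_{ij} = diag(1, r^2), and recovers the standard symbols \Gamma^r_{\theta\theta} = -r and \Gamma^\theta_{r\theta} = 1/r:

```python
r0 = 2.0   # evaluation point (arbitrary)
h = 1e-6   # finite-difference step

def g(r):
    """Metric of the plane in polar coordinates (eta^0 = r, eta^1 = theta)."""
    return [[1.0, 0.0], [0.0, r * r]]

def dg(i, j, k, r):
    """Numerical derivative of g_ij with respect to eta^k."""
    if k == 1:  # the metric does not depend on theta
        return 0.0
    return (g(r + h)[i][j] - g(r - h)[i][j]) / (2.0 * h)

def first_kind(i, j, k, r):
    """Christoffel symbol of the 1st kind, [ij, k]."""
    return 0.5 * (dg(i, k, j, r) + dg(j, k, i, r) - dg(i, j, k, r))

g_inv = [[1.0, 0.0], [0.0, 1.0 / (r0 * r0)]]  # conjugate (inverse) metric

# Gamma^l_ij = g^lk [ij, k]
Gamma = [[[sum(g_inv[l][k] * first_kind(i, j, k, r0) for k in range(2))
           for j in range(2)]
          for i in range(2)]
         for l in range(2)]

print(Gamma[0][1][1])  # Gamma^r_{theta theta} = -r   -> about -2.0
print(Gamma[1][0][1])  # Gamma^theta_{r theta} = 1/r  -> about 0.5
```

These are exactly the symbols that appear when the Laplacian or the geodesic equation is written in polar coordinates.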
The comma indicates that this is the covariant derivative. The A^j_{,i} are the components of a mixed tensor, contravariant with respect to the index j and covariant with respect to the index i. Covariant differentiation can also be extended to tensors of higher order.

The contracted tensor, \sum_n A^n_{,n}, represents the divergence of the vector \vec{A}:

\sum_n \frac{\partial A^n}{\partial \eta^n} + \sum_{n,m} A^n \Gamma^m_{nm} = \sum_n \frac{1}{h_1 h_2 h_3} \frac{\partial}{\partial \eta^n} (A^n h_1 h_2 h_3)

Identify A^n = F_n / h_n. Substitution in the above yields

\text{Div}\, \vec{F} = \frac{1}{h_1 h_2 h_3} \sum_n \frac{\partial}{\partial \eta^n} \Big( \frac{h_1 h_2 h_3}{h_n} F_n \Big)

The curl of a vector \vec{B} can also be developed for orthogonal coordinates by consideration of the component A^i = \frac{1}{h_1 h_2 h_3} [B_{j,k} - B_{k,j}]. The scalar Laplacian can also be written as a covariant derivative,

\nabla^2 \phi = \sum_n \Big( \frac{1}{h_n^2} \frac{\partial \phi}{\partial \eta^n} \Big)_{,n}

Thus the use of the covariant derivative allows one to express an equation in the same form for any coordinate system.

13 Geodesic lines

In a general curvilinear coordinate system (Riemann space) there is a unique shortest line that connects two points. As you know, in a spherical system this line is a great circle. These lines are called geodesics. The length is given by

s = \int_a^b ds = \int_a^b dt \Big[ g_{ij} \frac{dx^i}{dt} \frac{dx^j}{dt} \Big]^{1/2}

In the above, t is the parametric variable. To find the minimum we apply the calculus of variations, which will be discussed later in the course. Basically, one varies all paths between the end points a and b, choosing the one for which the length is stationary; this is similar to setting a derivative to zero. This reduces to the Euler-Lagrange equations and finally the differential equation

\frac{d^2 x^l}{ds^2} + \Gamma^l_{ik} \frac{dx^i}{ds} \frac{dx^k}{ds} = 0

In a Cartesian system the Christoffel symbols vanish, which leads to a linear relation between the length and the coordinates.
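As an illustrative numerical check (not part of the lecture), one can integrate the geodesic equation in plane polar coordinates, where \Gamma^r_{\theta\theta} = -r and \Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r, and confirm that a geodesic of the flat plane is a straight line. A simple Euler-integration sketch:

```python
import math

def accel(r, dr, dth):
    """Geodesic equation d2x^l/ds2 = -Gamma^l_ik (dx^i/ds)(dx^k/ds) in polar coordinates."""
    d2r = r * dth * dth            # -Gamma^r_{theta theta} (dtheta)^2 = +r (dtheta)^2
    d2th = -2.0 * dr * dth / r     # -2 Gamma^theta_{r theta} (dr)(dtheta)
    return d2r, d2th

# start at (x, y) = (1, 0), moving in the +y direction with unit speed
r, th = 1.0, 0.0
dr, dth = 0.0, 1.0

ds = 1e-4
for _ in range(10000):  # integrate from s = 0 to s = 1
    ar, ath = accel(r, dr, dth)
    r += dr * ds
    th += dth * ds
    dr += ar * ds
    dth += ath * ds

x, y = r * math.cos(th), r * math.sin(th)
print(x, y)  # close to (1, 1): the geodesic stays on the straight line x = 1
```

The coordinates r and theta change in a complicated way along the path, yet the Christoffel terms conspire so that the Cartesian image is the expected straight line, consistent with the remark above that the symbols vanish in a Cartesian system.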