
dr bob's elementary differential geometry
a slightly different approach based on elementary undergraduate linear algebra, multivariable calculus and differential equations

by bob jantzen (Robert T. Jantzen)
Department of Mathematical Sciences, Villanova University

Copyright 2007, 2008 with Hans Kuo, Taiwan
in progress: version: March 4, 2010

Abstract

There are lots of books on differential geometry, including at the introductory level. Why yet another one by an author who doesn't seem to take himself that seriously and occasionally refers to himself in the third person? This one is a bit different than all the rest. dr bob loves this stuff, but how to teach it to students at his own (not elite) university in order to have a little more fun at work than usual? This unique approach may not work for everyone, but it attempts to explain the nuts and bolts of how a few basically simple ideas, taken seriously, underlie the whole mess of formulas and concepts, without worrying about technicalities which only serve to put off students at the first pass through this scenery. It is also presented with an eye towards being able to understand the key concepts needed for the mathematical side of modern physical theories, while still providing the tools that underlie the classical theory of surfaces in space.

Contents

Preface

Part I: ALGEBRA

0 Introduction: motivating index algebra

1 Foundations of tensor algebra
  1.1 Index conventions
  1.2 A vector space V
  1.3 The dual space V*
  1.4 Linear transformations of a vector space into itself (and tensors)
  1.5 Linear transformations of V into itself and a change of basis
  1.6 Linear transformations between V and V*

2 Symmetry properties of tensors
  2.1 Measure motivation and determinants
  2.2 Tensor symmetry properties
  2.3 Epsilons and deltas
  2.4 Antisymmetric tensors
  2.5 Symmetric tensors and multivariable Taylor series

3 Time out
  3.1 Whoa! Review of what we've done so far

4 Antisymmetric tensors, subspaces and measure
  4.1 Determinants gone wild
  4.2 The wedge product
  4.3 Subspace orientation and duality
  4.4 Wedge and duality on R^n in practice

Part II: CALCULUS

5 From multivariable calculus to the foundation of differential geometry
  5.1 The tangent space in multivariable calculus
  5.2 More motivation for the re-interpretation of the tangent space
  5.3 Flow lines of vector fields
  5.4 Frames and dual frames and Lie brackets
  5.5 Non-Cartesian coordinates on R^n (polar coordinates in R^2)
  5.6 Cylindrical and spherical coordinates on R^3
  5.7 Cylindrical coordinate frames
  5.8 Spherical coordinate frames
  5.9 Lie brackets and noncoordinate frames

6 Covariant derivatives
  6.1 Covariant derivatives on R^n with Euclidean metric
  6.2 Notation for covariant derivatives
  6.3 Covariant differentiation and the general linear group
  6.4 Covariant constant tensor fields
  6.5 The clever way of evaluating the components of the covariant derivative
  6.6 Noncoordinate frames
  6.7 Geometric interpretation of the Lie bracket

7 More on covariant derivatives
  7.1 Gradient and divergence
  7.2 Second covariant derivatives and the Laplacian
  7.3 Spherical coordinate orthonormal frame
  7.4 Rotations and derivatives

8 Parallel transport
  8.1 Covariant differentiation along a curve and parallel transport
  8.2 Geodesics
  8.3 Parametrized curves as motion of point particles
  8.4 The Euclidean plane and the Kepler problem
  8.5 The 2-sphere of radius r_0
  8.6 The torus
  8.7 Geodesics as extremal curves: a peek at the calculus of variations

9 Intrinsic curvature
  9.1 Calculating the curvature tensor
  9.2 Interpretation of curvature
  9.3 The limiting loop parallel transport coordinate calculation
  9.4 The limiting loop parallel transport frame curvature calculation
  9.5 The symmetry of the covariant derivative

10 Extrinsic curvature
  10.1 The extrinsic curvature tensor
  10.2 Spheres and cylinders: a pair of useful concrete examples
  10.3 Cones: a useful cautionary example
  10.4 Total curvature: intrinsic plus extrinsic curvature

11 Integration of differential forms
  11.1 Changing the variable in a single variable integral
  11.2 Changing variables in multivariable integrals
  11.3 Parametrized p-surfaces and pushing forward the coordinate grid and tangent vectors
  11.4 Pulling back functions, covariant tensors and differential forms
  11.5 Changing the parametrization
  11.6 The exterior derivative d
  11.7 The exterior derivative and a metric
  11.8 Induced orientation
  11.9 Stokes' theorem
  11.10 Worked examples of Stokes' theorem for R^3
  11.11 Spherical coordinates on R^4 and 3-spheres: a useful example with n > 3

12 Wrapping things up
  12.1 Final remarks
  12.2 MATH 5600 Spring 1991 Differential Geometry: Take Home Final

A Miscellaneous background
  A.1 From trigonometry to hyperbolic functions and hyperbolic geometry

B Maple worksheets

C Solutions
  C.1 Chapter 1
  C.2 Chapter 2
  C.3 Chapter 3
  C.4 Chapter 4
  C.5 Chapter 5
  C.6 Chapter 6
  C.7 Chapter 7
  C.8 Chapter 8
  C.9 Chapter 9
  C.10 Chapter 10
  C.11 Chapter 11
  C.12 Chapter 12: final exam worked
  Final exam

List of Figures

Preface

This book began as a set of handwritten notes from a course given at Villanova University in the spring semester of 1991 that were scanned and posted on the web in 2006 at http://www34.homepage.villanova.edu/robert.jantzen/notes/dg1991/ and were converted to a LaTeX compuscript and completely revised in 2007-2008 with the help of Hans Kuo of Taiwan through a serendipitous internet collaboration and a chance second offering of the course to actual students in the spring semester of 2008, offering the opportunity for serious revision with feedback. Life then intervened and the necessary cleanup operations to put this into a finished form were delayed indefinitely.

Most undergraduate courses on differential geometry are leftovers from the early part of the last century, focusing on curves and surfaces in space, which is not very useful for the most important application of the twentieth century: general relativity and field theory in theoretical physics. Most mathematicians who teach such courses are not well versed in physics, so perhaps this is a natural consequence of the distancing of mathematics from physics, two fields which developed together in creating these ideas from Newton to Einstein and beyond. The idea of these notes is to develop the essential tools of modern differential geometry while bypassing more abstract notions like manifolds, which, although important for global questions, are not essential for local differential geometry and therefore need not steal precious time from a first course aimed at undergraduates.

Part 1 (Algebra) develops the vector space structure of R^n and its dual space of real-valued linear functions, and builds the tools of tensor algebra on that structure, getting the index manipulation part of tensor analysis out of the way first. Part 2 (Calculus) then develops R^n as a manifold first analyzed in Cartesian coordinates, beginning by redefining the tangent space of multivariable calculus to be the space of directional derivatives at a point, so that all of the tools of Part 1 can then be applied pointwise to the tangent space. Non-Cartesian coordinates and the Euclidean metric are then used as a shortcut to what would be the consideration of more general manifolds with Riemannian metrics in a more ambitious course, followed by the covariant derivative and parallel transport, leading naturally into curvature. The exterior derivative and integration of differential forms is the final topic, showing how conventional vector analysis fits into a more elegant unified framework.

The theme of Part 1 is that one needs to distinguish the linearity properties from the inner product ("metric") properties of elementary linear algebra. The inner product geometry governs lengths and angles, and the determinant then enables one to extend the linear measure of length to area and volume in the plane or 3-dimensional space, and to p-dimensional objects in R^n. The determinant also tests linear independence of a set of vectors and hence is key to characterizing subspaces independent of the particular set of vectors we use to describe them, while assigning an actual measure to the p-parallelepipeds formed by a particular set, once an inner product sets the length scale for orthogonal directions. By appreciating the details of these basic notions in the setting of R^n, one is ready for the tools needed point by point in the tangent spaces to R^n, once one understands the relationship between each tangent space and the simpler enveloping space.

Part I
ALGEBRA

Chapter 0
Introduction: motivating index algebra

Elementary linear algebra is the mathematics of linearity, whose basic objects are 1- and 2-dimensional arrays of numbers, which can be visualized as at most 2-dimensional rectangular arrangements of those numbers on sheets of paper or computer screens. Arrays of numbers of dimension d can be described as sets that can be put into a 1-1 correspondence with regular rectangular grids of points in R^d whose coordinates are integers, used as index labels:
$$\{a_i \mid i = 1,\dots,n\} \qquad \text{1-d array: } n \text{ entries,}$$
$$\{a_{ij} \mid i = 1,\dots,n_1,\ j = 1,\dots,n_2\} \qquad \text{2-d array: } n_1 n_2 \text{ entries,}$$
$$\{a_{ijk} \mid i = 1,\dots,n_1,\ j = 1,\dots,n_2,\ k = 1,\dots,n_3\} \qquad \text{3-d array: } n_1 n_2 n_3 \text{ entries.}$$
1-dimensional arrays (vectors) and 2-dimensional arrays (matrices), coupled with the basic operation of matrix multiplication, itself an organized way of performing dot products of two sets of vectors, combine into a powerful machine for linear computation. When working with arrays of specific dimensions (3-component vectors, $2\times 3$ matrices, etc.), one can avoid index notation and the sigma summation symbol $\sum_{i=1}^{n}$ after using it perhaps to define the basic operation of dot products for vectors of arbitrary dimension, but to discuss theory for indeterminate dimensions ($n$-component vectors, $m\times n$ matrices), index notation is necessary. However, index positioning (distinguishing subscript and superscript indices) is not essential and rarely used, especially by mathematicians. Going beyond 2-dimensional arrays to d-dimensional arrays for d > 2, the arena of tensors, index notation and index positioning are instead both essential to an efficient computational language.

Suppose we start with 3-vectors to illustrate the basic idea. The dot product between two vectors is symmetric in the two factors
$$\vec a = \langle a_1, a_2, a_3\rangle,\quad \vec b = \langle b_1, b_2, b_3\rangle, \qquad \vec a \cdot \vec b = a_1 b_1 + a_2 b_2 + a_3 b_3 = \sum_{i=1}^{3} a_i b_i = \vec b \cdot \vec a,$$
but using it to describe a linear function on R^3, a basic asymmetry is introduced
$$f_{\vec a}(\vec x) = \vec a \cdot \vec x = a_1 x_1 + a_2 x_2 + a_3 x_3 = \sum_{i=1}^{3} a_i x_i.$$

The left factor is a constant vector of coefficients, while the right factor is the vector of variables; this choice of left and right is arbitrary but convenient, although some mathematicians like to reverse it for some reason. To reflect this distinction, we introduce superscripts (up position) to denote the variable indices and subscripts (down position) to denote the coefficient indices, and then agree to sum over the understood 3 values of the index range for any repeated such pair of indices (one up, one down)
$$f_{\vec a}(\vec x) = a_1 x^1 + a_2 x^2 + a_3 x^3 = \sum_{i=1}^{3} a_i x^i = a_i x^i.$$
The last convention, called the Einstein summation convention, turns out to be an extremely convenient and powerful shorthand, which in this example streamlines the notation for taking a linear combination of variables.

This index positioning notation encodes the distinction between rows and columns in matrix notation. Now we will represent a matrix $(a_{ij})$ representing a linear transformation as $(a^i{}_j)$, with row indices (left) associated with superscripts, and column indices (right) with subscripts. A single row matrix or column matrix is used to denote respectively a coefficient vector and a variable vector
$$\begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix}, \qquad \begin{pmatrix} x^1 \\ x^2 \\ x^3 \end{pmatrix},$$
where the entries of a single row matrix are labeled by the column index (down), and the entries of a single column matrix are labeled by the row index (up). The matrix product of a row matrix on the left by a column matrix on the right re-interprets the dot product between two vectors as the way to combine a row vector (left factor) of coefficients with a column vector (right factor) of variables to produce a single number, the value of a linear function of the variables
$$\begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix} \begin{pmatrix} x^1 \\ x^2 \\ x^3 \end{pmatrix} = a_1 x^1 + a_2 x^2 + a_3 x^3 = \vec a \cdot \vec x.$$
If we agree to use an underlined kernel symbol $\underline{x}$ for a column vector, and the transpose $\underline{a}^T$ for a row vector, where the transpose simply interchanges rows and columns of a matrix, this can be represented as $\underline{a}^T \underline{x} = \vec a \cdot \vec x$.

Extending the matrix product to more than one row in the left factor is the second step in defining a general matrix product, leading to a column vector result
$$\begin{pmatrix} a^1{}_1 & a^1{}_2 & a^1{}_3 \\ a^2{}_1 & a^2{}_2 & a^2{}_3 \end{pmatrix} \begin{pmatrix} x^1 \\ x^2 \\ x^3 \end{pmatrix} = \begin{pmatrix} \underline{a}^{1T} \\ \underline{a}^{2T} \end{pmatrix} \underline{x} = \begin{pmatrix} \vec a^{\,1} \cdot \vec x \\ \vec a^{\,2} \cdot \vec x \end{pmatrix} = \begin{pmatrix} a^1{}_i x^i \\ a^2{}_i x^i \end{pmatrix}.$$
Thinking of the coefficient matrix as a 1-dimensional vertical array of row vectors (the first right hand side of this sequence of equations), one gets a corresponding array of numbers (a column) as the result, consisting of the corresponding dot products of the rows with the single column.

Denoting the left matrix factor by A, the product column matrix has entries
$$[A\,\underline{x}]^i = \sum_{k=1}^{3} a^i{}_k x^k = a^i{}_k x^k, \qquad 1 \le i \le 2.$$
Finally, adding more columns to the right factor in the matrix product, we generate corresponding columns in the matrix product, with the resulting array of numbers representing all possible dot products between the row vectors on the left and the column vectors on the right, labeled by the same row and column indices as the factor vectors from which they come
$$\begin{pmatrix} a^1{}_1 & a^1{}_2 & a^1{}_3 \\ a^2{}_1 & a^2{}_2 & a^2{}_3 \end{pmatrix} \begin{pmatrix} x^1{}_1 & x^1{}_2 \\ x^2{}_1 & x^2{}_2 \\ x^3{}_1 & x^3{}_2 \end{pmatrix} = \begin{pmatrix} \underline{a}^{1T} \\ \underline{a}^{2T} \end{pmatrix} \begin{pmatrix} \underline{x}_1 & \underline{x}_2 \end{pmatrix} = \begin{pmatrix} \vec a^{\,1}\cdot\vec x_1 & \vec a^{\,1}\cdot\vec x_2 \\ \vec a^{\,2}\cdot\vec x_1 & \vec a^{\,2}\cdot\vec x_2 \end{pmatrix}.$$
Denoting the new left matrix factor again by A and the right matrix factor by X, the product matrix has entries (row index left up, column index right down)
$$[AX]^i{}_j = \sum_{k=1}^{3} a^i{}_k x^k{}_j = a^i{}_k x^k{}_j, \qquad 1 \le i \le 2,\ 1 \le j \le 2,$$
where the sum over three entries (representing the dot product) is implied by our summation convention in the second equality, and the row and column indices here go from 1 to 2 to label the entries of the 2 rows and 2 columns of the product matrix. Thus matrix multiplication in this example is just an organized way of displaying all such dot products of two ordered sets of vectors in an array where the rows of the left factor in the matrix product correspond to the coefficient vectors in the left set and the columns in the right factor in the matrix product correspond to the variable vectors in the right set. The dot product itself in this context of matrix multiplication is representing the natural evaluation of linear functions (left row) on vectors (right column). No geometry (lengths and angles in Euclidean geometry) is implied in this context, only linearity and the process of linear combination.

The matrix product of a matrix with a single column vector can be reinterpreted in terms of the more general concept of a vector-valued linear function of vectors, namely a linear combination of vectors, in which case the right factor column vector entries play the role of coefficients. In this case the left factor matrix must be thought of as a horizontal array of column vectors
$$\begin{pmatrix} \underline{v}_1 & \underline{v}_2 & \underline{v}_3 \end{pmatrix} \begin{pmatrix} x^1 \\ x^2 \\ x^3 \end{pmatrix}
= \begin{pmatrix} v^1{}_1 & v^1{}_2 & v^1{}_3 \\ v^2{}_1 & v^2{}_2 & v^2{}_3 \end{pmatrix} \begin{pmatrix} x^1 \\ x^2 \\ x^3 \end{pmatrix}
= \begin{pmatrix} v^1{}_1 x^1 + v^1{}_2 x^2 + v^1{}_3 x^3 \\ v^2{}_1 x^1 + v^2{}_2 x^2 + v^2{}_3 x^3 \end{pmatrix}
= x^1 \begin{pmatrix} v^1{}_1 \\ v^2{}_1 \end{pmatrix} + x^2 \begin{pmatrix} v^1{}_2 \\ v^2{}_2 \end{pmatrix} + x^3 \begin{pmatrix} v^1{}_3 \\ v^2{}_3 \end{pmatrix}
= x^1 \underline{v}_1 + x^2 \underline{v}_2 + x^3 \underline{v}_3 = x^i \underline{v}_i.$$
Thus in this case the summed-over index pair performs a linear combination of the columns of the left factor of the matrix product, whose coefficients are the entries of the right column matrix factor.
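The column-combination reading of the matrix product is easy to see numerically. Here is a small Python/NumPy sketch (an illustration added alongside the text; the particular matrix and coefficients are arbitrary choices, and the notes themselves use Maple for such checks):

```python
import numpy as np

V = np.array([[1.0, 4.0, 2.0],
              [3.0, 0.0, 5.0]])   # columns are v_1, v_2, v_3 (three 2-component vectors)
x = np.array([2.0, -1.0, 3.0])    # coefficients x^1, x^2, x^3

# The matrix-vector product V @ x ...
product = V @ x

# ... equals the linear combination x^1 v_1 + x^2 v_2 + x^3 v_3 of the columns.
combination = x[0] * V[:, 0] + x[1] * V[:, 1] + x[2] * V[:, 2]

print(np.allclose(product, combination))  # True
```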

This interpretation extends to more columns in the right matrix factor, leading to a matrix product consisting of the same number of columns, each of which represents a linear combination of the column vectors of the left factor matrix. In this case the coefficient indices are superscripts since the labels of the vectors being combined linearly are subscripts, but the one up, one down repeated index summation is still consistent. Note that when the left factor matrix is not square (in this example, a $2\times 3$ matrix multiplied by a $3\times 1$ matrix), one is dealing with vectors $\underline{v}_i$ and $\underline{x}$ of different dimensions, in this example combining three 2-component vectors by linear combination.

If we call our basic column vectors just "vectors" (contravariant vectors, indices up) and call row vectors "covectors" (covariant vectors, indices down), then combining them with the matrix product represents the evaluation operation for linear functions, and implies no geometry in the sense of lengths and angles usually associated with the dot product, although one can easily carry over this interpretation. In this example R^3 is our basic vector space consisting of all possible ordered triplets of real numbers, and the space of all linear functions on it is equivalent to another copy of R^3, the space of all coefficient vectors. The space of linear functions on a vector space is called the dual space, and given a basis of the original vector space, expressing linear functions with respect to this basis leads to a component representation in terms of their matrix of coefficients as above. It is this basic foundation of a vector space and its dual, together with the natural evaluation represented by matrix multiplication in component language, reflected in superscript and subscript index positioning respectively associated with column vectors and row vectors, that is used to go beyond elementary linear algebra to the algebra of tensors, or d-dimensional arrays for any positive integer d. Index positioning together with the Einstein summation convention is essential in letting the notation itself directly carry the information about its role in this scheme of linear mathematics extended beyond the elementary level.

Combining this linear algebra structure with multivariable calculus leads to differential geometry. Consider R^3 with the usual Cartesian coordinates $x^1, x^2, x^3$ thought of as functions on this space. The differential of any function on this space can be expressed in terms of partial derivatives by the formula
$$df = \frac{\partial f}{\partial x^1}\,dx^1 + \frac{\partial f}{\partial x^2}\,dx^2 + \frac{\partial f}{\partial x^3}\,dx^3 = \partial_i f\,dx^i = f_{,i}\,dx^i,$$
using first the abbreviation $\partial_i = \partial/\partial x^i$ for the partial derivative operator and then the abbreviation $f_{,i}$ for the corresponding partial derivatives of the function f. At each point of R^3, the differentials df and $dx^i$ play the role of linear functions on the tangent space. The differential of f acts on a tangent vector $\vec v$ at a given point by evaluation to form the directional derivative along the vector
$$D_{\vec v} f = \frac{\partial f}{\partial x^1} v^1 + \frac{\partial f}{\partial x^2} v^2 + \frac{\partial f}{\partial x^3} v^3 = \frac{\partial f}{\partial x^i} v^i,$$
so that the coefficients of this linear function of a tangent vector $\vec v$ at a given point are the values of the partial derivative functions there, and hence have indices down compared to the up indices of the tangent vector itself, which belongs to the tangent space, the fundamental vector space describing the differential geometry near each point of the whole space.
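The evaluation $D_{\vec v} f = v^i\,\partial f/\partial x^i$ can also be carried out symbolically. Here is a short Python/SymPy sketch added for illustration (the function f, the vector v, and the evaluation point are arbitrary choices, not examples from the notes):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
f = x1**2 * x2 + sp.sin(x3)          # an arbitrary function on R^3

# Components of the differential df: the partial derivatives f_{,i}.
df = [sp.diff(f, xi) for xi in (x1, x2, x3)]

# Directional derivative along a tangent vector v at a point: D_v f = v^i f_{,i}.
v = (2, -1, 3)
point = {x1: 1, x2: 2, x3: 0}
D_v_f = sum(vi * fi.subs(point) for vi, fi in zip(v, df))

print(df, D_v_f)
```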

In the linear function notation, the application of the linear function df to the vector $\vec v$ gives the same result
$$df(\vec v) = \frac{\partial f}{\partial x^i} v^i.$$
If $\partial f/\partial x^i$ are therefore the components of a covector, and $v^i$ the components of a vector in the tangent space, what is the basis of the tangent space, analogous to the natural (ordered) basis $\{e_1, e_2, e_3\} = \{\langle 1,0,0\rangle, \langle 0,1,0\rangle, \langle 0,0,1\rangle\}$ of R^3 thought of as a vector space in our previous discussion? In other words, how do we express a tangent vector in the abstract form like in the naive R^3 discussion, where $\vec x = \langle x^1, x^2, x^3\rangle = x^i e_i$ is expressed as a linear combination of the standard basis vectors $\{e_i\} = \{\langle 1,0,0\rangle, \langle 0,1,0\rangle, \langle 0,0,1\rangle\}$, usually denoted by $\vec i, \vec j, \vec k$? This question will be answered in the following notes, making the link between old fashioned tensor analysis and modern differential geometry.

One last remark about matrix notation is needed. We adopt here the notational conventions of the computer algebra system Maple for matrices and vectors. A vector $\langle u^1, u^2\rangle$ will be interpreted as a column matrix in matrix expressions
$$\underline{u} = \langle u^1, u^2\rangle = \begin{pmatrix} u^1 \\ u^2 \end{pmatrix},$$
while its transpose will be denoted by
$$\underline{u}^T = \langle u^1 \mid u^2\rangle = \begin{pmatrix} u^1 & u^2 \end{pmatrix}.$$
In other words, within triangle bracket delimiters a comma will represent a vertical separator in a list, while a vertical line will represent a horizontal separator in a list. A matrix can then be represented as a vertical list of rows or as a horizontal list of columns, as in
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \langle\,\langle a \mid b\rangle, \langle c \mid d\rangle\,\rangle = \langle\,\langle a, c\rangle \mid \langle b, d\rangle\,\rangle.$$
Finally, if A is a matrix, we will not use a lowercase letter $a^i{}_j$ for its entries but retain the same symbol: $A = (A^i{}_j)$.

Since matrix notation and matrix multiplication, which suppress all indices and the summation, are so efficient, it is important to be able to translate between the summed index notation and the corresponding index-free matrix symbols. In the usual language of matrix multiplication, the ith row and jth column entry of the product matrix is
$$[AB]_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}.$$
In our application of this to matrices with indices in various up/down positions, the left index will always be the row index and the right index the column index, and to translate from indexed notation to symbolic matrices we always have to use the above correspondence independent of the index up or down position: only left-right position counts.

Thus to translate an expression like $M_{ij} B^i{}_m B^j{}_n$ we need to first rearrange the factors to $B^i{}_m M_{ij} B^j{}_n$ and then recognize that the second summed index $j$ is in the right adjacent pair of positions for interpretation as matrix multiplication, but the first summed index $i$ is in the row instead of the column position, so the transpose is required to place it adjacent to the middle matrix factor:
$$(B^i{}_m M_{ij} B^j{}_n) = ([B^T M B]_{mn}) = B^T M B.$$
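To make the index-to-matrix translation concrete, here is a small numerical check (a Python/NumPy sketch, not part of the original notes, which rely on Maple): it verifies that the summed expression $B^i{}_m M_{ij} B^j{}_n$ agrees entry by entry with the matrix product $B^T M B$.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))   # components M_ij (both indices down)
B = rng.standard_normal((3, 3))   # components B^i_m (row index up, column index down)

# Index form: sum over i and j of B^i_m M_ij B^j_n, leaving the free indices m, n.
index_form = np.einsum('im,ij,jn->mn', B, M, B)

# Matrix form: the first summed index i sits in the row slot of B, so a transpose
# is needed to place it adjacent to M; the second summed index j is already adjacent.
matrix_form = B.T @ M @ B

print(np.allclose(index_form, matrix_form))  # True
```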

Chapter 1
Foundations of tensor algebra

1.1 Index conventions

We need an efficient abbreviated notation to handle the complexity of mathematical structure before us. We will use indices of a given type to denote all possible values of given index ranges. By index type we mean a collection of similar letter types, like those from the beginning or middle of the Latin alphabet, or Greek letters
$$a, b, c, \dots \qquad i, j, k, \dots \qquad \alpha, \beta, \gamma, \dots$$
each index of which is understood to have a given common range of successive integer values. Variations of these might be barred or primed letters or capital letters. For example, suppose we are looking at linear transformations between R^n and R^m where $m \neq n$. We would need two different index ranges to denote vector components in the two vector spaces of different dimensions, say $i, j, k, \dots = 1, 2, \dots, n$ and $\alpha, \beta, \gamma, \dots = 1, 2, \dots, m$.

In order to introduce the so-called Einstein summation convention, we agree to the following limitations on how indices may appear in formulas. A given index letter may occur only once in a given term in an expression (call this a "free index"), in which case the expression is understood to stand for the set of all such expressions for which the index assumes its allowed values, or it may occur twice but only as a superscript-subscript pair (one up, one down), which will stand for the sum over all allowed values (call this a "repeated index"). Here are some examples. If $i, j = 1, \dots, n$ then

  $A^i$ : $n$ expressions: $A^1, A^2, \dots, A^n$
  $A^i{}_i$ : $\sum_{i=1}^{n} A^i{}_i$, a single expression with $n$ terms
  $A^{ji}{}_i$ : $\sum_{i=1}^{n} A^{1i}{}_i, \dots, \sum_{i=1}^{n} A^{ni}{}_i$, that is $n$ expressions, each of which has $n$ terms in the sum
  $A^{ii}$ : no sum, just an expression for each $i$, if we want to refer to a specific diagonal component (entry) of a matrix, for example
  $A_i(v^i + w^i) = A_i v^i + A_i w^i$ : 2 sums of $n$ terms each, or one combined sum

A repeated index is a dummy index, like the dummy variable in a definite integral $\int_a^b f(x)\,dx = \int_a^b f(u)\,du$. We can change them at will: $A^i{}_i = A^j{}_j$.
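The repeated-index rule maps directly onto NumPy's einsum notation, which can help when first getting used to the convention. A Python sketch added purely for illustration (the software does not track the up/down index positions, only which indices are summed, and the sample array is an arbitrary choice):

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3).astype(float)  # a 3x3 array of components A^i_j

# A^i_i : a repeated (summed) index pair gives a single number, the trace.
trace = np.einsum('ii->', A)          # same as np.trace(A)

# A^i_j v^j : one repeated index j (summed), one free index i (n expressions).
v = np.array([1.0, 2.0, 3.0])
Av = np.einsum('ij,j->i', A, v)       # same as A @ v

# A^{ii} with no sum: just the list of diagonal entries, one for each i.
diagonal = np.einsum('ii->i', A)      # same as np.diag(A)

print(trace, Av, diagonal)
```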

1.2 A vector space V

Let V be an n-dimensional real vector space. Elements of this space are called vectors. Ordinary real numbers (let R denote the set of real numbers) will be called scalars and denoted by a, b, c, ..., while vectors will be denoted by various symbols depending on the context: u, v, w or $u_{(1)}, u_{(2)}, \dots$, where here the parentheses indicate that the subscripts are only labeling the vectors in an ordered set of vectors, to distinguish them from component indices. Sometimes X, Y, Z, W are convenient vector symbols. The basic structure of a real vector space is that it has two operations defined, vector addition and scalar multiplication, which can then be combined together to perform linear combinations of vectors:

vector addition: the sum u + v of two vectors is again a vector in the space,

scalar multiplication: the product cu of a scalar c and a vector u is again a vector in the space, called a scalar multiple of the vector,

so that linear combinations au + bv of two or more vectors with scalar coefficients are defined. These operations satisfy a list of properties that we take for granted when working with sums and products of real numbers alone, i.e., the set of real numbers R thought of as a 1-dimensional vector space.

A basis of V, denoted by $\{e_i\}$, $i = 1, 2, \dots, n$, or just $\{e_i\}$, where it is understood that a free index (meaning not repeated and therefore not summed over) like the i in this expression will assume all of its possible values, is a linearly independent spanning set for V:

1. spanning condition: any vector $v \in V$ can be represented as a linear combination of the basis vectors
$$v = \sum_{i=1}^{n} v^i e_i = v^i e_i,$$
whose coefficients $v^i$ are called the components of v with respect to $\{e_i\}$. The index i on $v^i$ labels the components (coefficients), while the index i on $e_i$ labels the basis vectors.

2. linear independence: if $v^i e_i = 0$, then $v^i = 0$ (i.e., more explicitly, if $v = \sum_{i=1}^{n} v^i e_i = 0$, then $v^i = 0$ for all $i = 1, 2, \dots, n$).
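Concretely, finding the components $v^i$ of a vector of R^n with respect to a given basis amounts to solving the linear system whose coefficient columns are the basis vectors. A small Python/NumPy illustration (the basis chosen here is just an arbitrary example, not one used later in the notes):

```python
import numpy as np

# Columns of E are the basis vectors e_1, e_2, e_3 of R^3 (any invertible matrix works).
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
v = np.array([2.0, 3.0, 5.0])

# Components v^i with respect to {e_i}: solve  v = v^i e_i,  i.e.  E @ c = v.
c = np.linalg.solve(E, v)

# Reassembling the linear combination recovers v, confirming the spanning condition.
print(c, np.allclose(E @ c, v))
```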

Example 1.2.1. $V = R^n = \{u = (u^1, \dots, u^n) = (u^i) \mid u^i \in R\}$, the space of n-tuples of real numbers with the natural basis
$$e_1 = (1, 0, \dots, 0),\quad e_2 = (0, 1, \dots, 0),\quad \dots,\quad e_n = (0, 0, \dots, 1),$$
which we will refer to as the standard basis or natural basis. In R^3, these basis vectors are customarily denoted by $\vec i, \vec j, \vec k$. When we want to distinguish the vector properties of R^n from its point properties, we will emphasize the difference by using angle brackets instead of parentheses: $\langle u^1, u^2, u^3\rangle$. In the context of matrix calculations, this representation of a vector will be understood to be a column matrix.

As a set of points, R^n has a natural set of Cartesian coordinate functions $x^i$ which pick out the ith entry in an n-tuple, for example on R^3: $x^1((a^1, a^2, a^3)) = a^1$, etc. These are linear functions on the space. Interpreting the points as vectors, these coordinate functions pick out the individual components of the vectors with respect to the standard basis.

Any two n-dimensional vector spaces are isomorphic. This just means there is some invertible map from one to the other, say $\Phi : V \to W$, and it does not matter whether the vector operations (vector sum and scalar multiplication, i.e., linear combination, which encompasses them both) are done before or after using the map: $\Phi(au + bv) = a\Phi(u) + b\Phi(v)$. The practical implication of this rather abstract statement is that once you establish a basis in any n-dimensional vector space V, the n-tuples of components of vectors with respect to this basis undergo the usual vector operations in R^n when the vectors they represent undergo the vector operations in V. For example, the set of at most quadratic polynomial functions in a single variable $a x^2 + b x + c = a(x^2) + b(x) + c(1)$ has the natural basis $\{1, x, x^2\}$, and under linear combination of these functions the triplet of coordinates (c, b, a) (coefficients ordered by increasing powers) undergoes the corresponding linear combination as vectors in R^3. We might as well just work in R^3 to visualize relationships between vectors in the original abstract space.

Exercise 1.2.1
By expanding at most quadratic polynomial functions in a Taylor series about x = 1, one expresses these functions in the new basis $\{(x-1)^p\}$, p = 0, 1, 2, say as $A(x-1)^2 + B(x-1) + C(1)$. Express (c, b, a) as linear functions of (C, B, A) by expanding out this latter expression. Then solve these relations for the inverse expressions, giving (C, B, A) as functions of (c, b, a), and express both relationships in matrix form, showing explicitly the coefficient matrices. Alternatively, actually evaluate (C, B, A) in terms of (c, b, a) using the Taylor series expansion technique. Make a crude drawing of the three new basis vectors in R^3 which correspond to the new basis functions, or use technology to draw them.

Exercise 1.2.2
An antisymmetric matrix is a square matrix which reverses sign under the transpose operation: $A^T = -A$.

Any $3 \times 3$ antisymmetric matrix has the form
$$A = \begin{pmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{pmatrix}
= a_1 \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}
+ a_2 \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}
+ a_3 \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\equiv a_1 E_1 + a_2 E_2 + a_3 E_3.$$
The space of all such matrices is a 3-dimensional vector space with basis $\{E_i\}$, since it is defined as the span of this set of vectors (hence a subspace of the vector space of $3 \times 3$ matrices), and setting the linear combination equal to the zero matrix forces all the coefficients to be zero, proving the linear independence of this set of vectors (which is therefore a linearly independent spanning set).

a) Show that matrix multiplication of a vector in R^3 by such a matrix A is equivalent to taking the cross product with the corresponding vector $\vec a = \langle a_1, a_2, a_3\rangle$: $A\,\underline{b} = \vec a \times \vec b$. (A numerical check of this appears in the sketch below.)

b) Although the result of two successive cross products $\vec a \times (\vec b \times \vec u)$ is not equivalent to a single cross product $\vec c \times \vec u$, the difference of two such successive cross products is. Confirm the matrix identity
$$A B - B A = (\vec a \times \vec b)^i E_i.$$
Then by part a) it follows that $(AB - BA)\,\underline{u} = (\vec a \times \vec b) \times \vec u$.

c) Use the matrix distributive law to fill in the one further step which then proves the vector identity
$$\vec a \times (\vec b \times \vec u) - \vec b \times (\vec a \times \vec u) = (\vec a \times \vec b) \times \vec u.$$

Example 1.2.2. The field C of complex numbers is a 2-dimensional real vector space isomorphic to R^2 through the isomorphism $z = x + iy \leftrightarrow (x, y)$, which associates the basis $\{1, i\}$ with the standard basis $\{e_1 = (1, 0), e_2 = (0, 1)\}$.

A p-dimensional linear subspace of a vector space V can be represented as the set of all possible linear combinations of a set of p linearly independent vectors, and such a subspace results from the solution of a set of linear homogeneous conditions on the variable components of a vector variable expressed in some basis. Thus if $\underline{x} = \langle x^1, \dots, x^n\rangle$ is the column matrix of components of an unknown vector in V with respect to a basis $\{e_i\}$, and A is an $m \times n$ matrix of rank m (i.e., the rows are linearly independent), the solution space of $A\,\underline{x} = 0$ will be a $(p = n - m)$-dimensional subspace, since $m < n$ independent conditions on n variables leave $n - m$ variables freely specifiable. In R^3, these are the lines (p = 1) and planes (p = 2) through the origin. In higher dimensional R^n spaces, the $(n-1)$-dimensional subspaces are called hyperplanes in analogy with the ordinary planes in the case n = 3, and we can refer to p-planes through the origin for the values of p between 2 and $n - 1$.
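As promised in Exercise 1.2.2a, here is a quick numerical check of parts a) and b) and of the vector identity in part c) (a Python/NumPy sketch added for illustration; the notes themselves use Maple, and the random vectors are arbitrary):

```python
import numpy as np

def antisym(a):
    """The antisymmetric matrix A with A @ b = np.cross(a, b)."""
    a1, a2, a3 = a
    return np.array([[0.0, -a3,  a2],
                     [ a3, 0.0, -a1],
                     [-a2,  a1, 0.0]])

rng = np.random.default_rng(1)
a, b, u = rng.standard_normal((3, 3))

A, B = antisym(a), antisym(b)

print(np.allclose(A @ u, np.cross(a, u)))                    # part a)
print(np.allclose(A @ B - B @ A, antisym(np.cross(a, b))))   # part b)
print(np.allclose(np.cross(a, np.cross(b, u)) - np.cross(b, np.cross(a, u)),
                  np.cross(np.cross(a, b), u)))              # the identity in part c)
```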

1.2. A vector space V 19 Elementary linear algebra: solving systems of linear equations It is worth remembering the basic problem of elementary linear algebra: solving m linear equations in n unknowns or variables x i, which is most efficiently handled with matrix notation A 1 1x 1 + A 1 nx n = b 1.., Ax = b. A m 1x 1 + A m nx n = b m The interpretation of the problem requires a slight shift in emphasis to the n columns u (i) R m of the coefficient matrix by defining u i (j) = A i j or A = u (1) u (n). Then this is equivalent to setting a linear combination of these columns equal to the right hand side vector b = b 1,..., b m R m A x = x 1 u (1) + x n u (n) = b. If b = 0, the homogeneous case, this is equivalent to trying to find a linear relationship among the n column vectors, namely a linear combination of them equal to the zero vector whose coefficients are not all zero; then for each nonzero coefficient, one can solve for the vector it multiplies and express it as a linear combination of the remaining vectors in the set. When no such relationship exists among the vectors, they are called linearly independent, otherwise they are called linearly dependent. The span (set of all possible linear combinations) of the set of these column vectors is called the column space Col(A) of the coefficient matrix A. If b 0, then the system admits a solution only if b belongs to the column space, and is inconsistent if not. If b 0 and the vectors are linearly independent, then if the solution admits a solution, it is unique. If they are not linearly independent, then the solution is not unique but involves a number of free parameters. The solution technique is row reduction involving a sequence of elementary row operations of three types: adding a multiple of one row to another row, multiplying a row by a nonzero number, and interchanging two rows. These row operations correspond to taking new independent combinations of the equations in the system, or scaling a particular equation, or changing their order, none of which changes the solution of the system. The row reduced echelon form A R b R of the augmented matrix A b leads to an equivalent ( reduced ) system of equations A R x = b R which is easily solved. The row reduced echelon form has all the zero rows (if any) at the bottom of the matrix, the leading (first from left to right) entry of each nonzero row is 1, the columns containing those leading 1 entries (the leading columns) have zero entries above and below those leading 1 entries, and finally the pattern of leading 1 entries moves down and to the right, i.e., the leading entry of the next nonzero row is to the right of a preceding leading entry. The leading 1 entries of the matrix are also called the pivot entries, and the corresponding columns, the pivot columns. A pivot consists of the set of add row operations which makes the remaining entries of a pivot column zero. The number of nonzero rows of the reduced augmented matrix is called the rank of the augmented matrix and represents the number of independent equations in the original set. The number of nonzero rows of the reduced coefficient matrix alone is called its rank: r = rank(a) m and equals the number of leading 1 entries in A R, in turn the number of leading 1 columns of A R. The remaining n r n m columns are called free columns. This classification

The associated variables of the system of linear equations then fall into two groups, the leading variables ($r \le m$ in number) and the free variables ($n - r$ in number), since each variable corresponds to one of the columns of the coefficient matrix. Each leading variable can immediately be solved for in its corresponding reduced system equation and expressed in terms of the free variables, whose values are then not constrained and may take any real values. Setting the $n - r$ free variables equal to arbitrary parameters $t^B$, $B = 1, \dots, n - r$, leads to a solution in the form
$$x^i = x^i_{\text{(particular)}} + t^B v^i_{(B)}.$$
The particular solution satisfies $A\,\underline{x}_{\text{(particular)}} = \underline{b}$, while the remaining part is the general solution of the related homogeneous linear system for which $\underline{b} = 0$, an $(n - r)$-dimensional subspace Null(A) of R^n called the null space of the matrix A, since it consists of those vectors which are taken to zero under multiplication by that matrix:
$$A(t^B \underline{v}_{(B)}) = t^B (A\,\underline{v}_{(B)}) = 0.$$
This form of the solution defines a basis $\{\underline{v}_{(B)}\}$ of the null space since by definition any solution of the homogeneous equations can be expressed as a linear combination of them, and if such a linear combination is zero, every parameter $t^B$ is forced to be zero, so they are linearly independent. This basis of coefficient vectors $\{\underline{v}_{(B)}\} \subset R^n$ is really a basis of the space of linear relationships among the original n vectors $\{u_{(1)}, \dots, u_{(n)}\}$, each one representing the coefficients of an independent linear relationship among those vectors:
$$0 = A^j{}_i v^i_{(B)} = v^i_{(B)} u^j{}_{(i)}.$$
In fact these relationships correspond to the fact that each free column of the reduced matrix can be expressed as a linear combination of the leading columns which precede it going from left to right in the matrix, and in fact the same linear relationships apply to the original set of vectors (since the coefficients $x^i$ of the solution space are the same!). Thus one can remove the free columns from the original set of vectors to get a basis of the column space of the matrix consisting of its r leading columns, so the dimension of the column space is the rank r of the matrix.

By introducing the row space of the coefficient matrix Row(A) $\subset$ R^n, consisting of all possible linear combinations of the rows of the matrix, the row reduction process can be interpreted as finding a basis of this subspace that has a certain characteristic form: the r nonzero rows of the reduced matrix. The dimension of the row space is thus equal to the rank r of the matrix. Each equation of the original system corresponding to each (nonzero) row of the coefficient matrix separately has a solution space which represents a hyperplane in R^n, namely an $(n-1)$-dimensional subspace. Re-interpreting the linear combination of the variables as a dot product with the row vector, in the homogeneous case these hyperplanes consist of all vectors orthogonal to the original row vector, and the joint solution of all the equations of the system is the subspace which is orthogonal to the entire row space, namely the orthogonal complement of the row space within R^n. Thus Null(A) and Row(A) decompose the total space R^n into an orthogonal decomposition with respect to the dot product, and the solution algorithm for the homogeneous linear system provides a basis of each such subspace.

Left multiplication of A by a row matrix of variables $\underline{y}^T = \langle y_1 \mid \cdots \mid y_m\rangle$ yields a row matrix, so one can consider the transposed linear system in which that product is set equal to a constant row vector $\underline{c}^T = \langle c_1 \mid \cdots \mid c_m\rangle$
$$\underline{y}^T A = \underline{c}^T, \qquad\text{or}\qquad A^T \underline{y} = \underline{c}.$$
This is the linear system of equations associated with the transpose of the matrix, which interchanges rows and columns and hence the row space and column space
$$\mathrm{Row}(A^T) = \mathrm{Col}(A), \qquad \mathrm{Col}(A^T) = \mathrm{Row}(A),$$
but adds one more space Null(A^T), which can be interpreted as the subspace orthogonal to $\mathrm{Row}(A^T) = \mathrm{Col}(A)$, hence determining an orthogonal decomposition of R^m as well.
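The whole classification above is easy to experiment with in software. Here is a small Python/SymPy sketch (added for illustration only; the original notes use Maple for such computations, and the sample matrix is an arbitrary choice) that computes the reduced row echelon form, the pivot (leading) columns, and bases of the column space, row space and null space:

```python
import sympy as sp

# A small example: 3 equations, 4 unknowns.
A = sp.Matrix([[1, 2, 0, 1],
               [0, 0, 1, 3],
               [1, 2, 1, 4]])

A_R, pivot_cols = A.rref()        # reduced row echelon form and leading column indices
r = len(pivot_cols)               # rank = number of leading 1 entries

col_basis = A.columnspace()       # the leading columns of A: a basis of Col(A)
null_basis = A.nullspace()        # one basis vector v_(B) per free variable
row_basis = A.rowspace()          # the nonzero rows of A_R: a basis of Row(A)

# Each null space vector encodes a linear relationship among the columns of A.
print(r, all((A * v) == sp.zeros(3, 1) for v in null_basis))
```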

1.2. A vector space V 21 Left multiplication of A by a row matrix of variables y T = y 1... y m yields a row matrix, so one can consider the transposed linear system in which that product is set equal to a constant row vector c T = c 1... c m y T A = c T, or A T y = c This is the linear system of equations associated with the transpose of the matrix, which interchanges rows and columns and hence the row space and column space Row(A T ) = Col(A), Col(A T ) = Row(A), but adds one more space Null(A T ), which can be interpreted as the subspace orthogonal to Row(A T ) = Col(A), hence determining an orthogonal decomposition of R m as well. Example 1.2.3. Here is the augmented matrix and its row reduced echelon form for 5 equations in 7 unknowns 1 2 4 11 0 4 1 16 1 2 0 1 0 0 0 2 1 2 1 4 0 2 0 5 0 0 1 3 0 0 0 3 A b = 0 0 4 12 0 2 4 12, A R b R = 0 0 0 0 0 1 0 0 3 6 4 15 0 2 4 42 0 0 0 0 0 0 1 6 4 8 1 7 0 1 3 7 0 0 0 0 0 0 0 0 and its solution 2 + 2t 1 t 2 2 2 1 0 t 1 0 1 0 0 3 3t 2 3 0 3 0 x = t 2 = 0 + t 1 0 + t 2 1 + t 3 0 = x (particular) + t B v (B). t 3 0 1 0 1 0 0 0 0 0 6 6 0 0 0 The rank of the 5 7 coefficient matrix (and of the 5 8 augmented matrix) is r = 4 with 4 leading variables {x 1, x 3, x 6, x 7 } and 3 free variables {x 2, x 4, x 5 }. By inspection one sees that the 2nd, 4th, and 5th columns are linear combinations of the preceding leading columns with coefficients which are exactly the entries of those columns. The same linear relationships apply to the original matrix, so columns 1,3,6,7 of the coefficient matrix A = u 1... u 7, namely {u 1, u 3, u 6, u 7 }, are a basis of the column space Col(A) R 5. The 4 nonzero rows of the reduced coefficient matrix A R are a basis of the row space Row(A) R 7. The three columns {v (1), v (2), v (3) } appearing in the solution vector x multiplied by the arbitrary parameters {t 1, t 2, t 3 } are a basis of the homogeneous solution space Null(A) R 7. Together these 7 vectors form a basis of R 7. One concludes that the right hand side vector b R 5 can be expressed in the form b = x i u (i) = x i (particular)u (i) + t B v i (B)u (i) = x i (particular)u (i) = 2u (1) + 3u (3) + 6u (7)

Notice that the fifth column $u_{(5)} = 0$; the zero vector makes any set of vectors trivially linearly dependent, so $t^3$ is a trivial parameter and $\underline{v}_{(3)}$ represents that trivial linear relationship. Thus there are only two independent relationships among the 6 nonzero columns of A.

The row space $\mathrm{Row}(A^T) = \mathrm{Col}(A)$ is a 4-dimensional subspace of R^5. If one row reduces the $7 \times 5$ transpose matrix $A^T$, the 4 nonzero rows of the reduced matrix are a basis of this space, and one finds one free variable and a single basis vector $\langle 258, 166, 165, 96, 178\rangle/178$ for the 1-dimensional subspace Null(A^T), which is the orthogonal subspace to the 4-dimensional subspace Col(A) $\subset$ R^5.

Don't worry. We will not need the details of row and column spaces in what follows, so if your first introduction to linear algebra stopped short of this topic, don't despair.

Example 1.2.4. We can also consider multiple linear systems with the same coefficient matrix. For example, consider the two linearly independent vectors $X_{(1)} = \langle 1, 3, 2\rangle$, $X_{(2)} = \langle 2, 3, 1\rangle$, which span a plane through the origin in R^3, and let
$$X = \langle X_{(1)} \mid X_{(2)}\rangle = \begin{pmatrix} 1 & 2 \\ 3 & 3 \\ 2 & 1 \end{pmatrix}.$$
Clearly the sum $X_{(1)} + X_{(2)} = \langle 3, 6, 3\rangle$ and difference $X_{(2)} - X_{(1)} = \langle 1, 0, -1\rangle$ vectors are a new basis of the same subspace (since they are not proportional), so if we try to express each of them in turn as linear combinations of the original basis vectors, we know already the unique solutions for each
$$\begin{pmatrix} 1 & 2 \\ 3 & 3 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} u^1 \\ u^2 \end{pmatrix} = \begin{pmatrix} 3 \\ 6 \\ 3 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 2 \\ 3 & 3 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} v^1 \\ v^2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}
\qquad\Longrightarrow\qquad
\begin{pmatrix} u^1 \\ u^2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
\begin{pmatrix} v^1 \\ v^2 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$
Clearly from the definition of matrix multiplication, we can put these two linear systems together as
$$\begin{pmatrix} 1 & 2 \\ 3 & 3 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} u^1 & v^1 \\ u^2 & v^2 \end{pmatrix} = \begin{pmatrix} 3 & 1 \\ 6 & 0 \\ 3 & -1 \end{pmatrix},$$
which has the form $X Z = Y$, where X is the $3 \times 2$ coefficient matrix, Y is the $3 \times 2$ right hand side matrix, and Z is the unknown $2 \times 2$ matrix whose columns tell us how to express the vectors $Y_{(1)}, Y_{(2)}$ as linear combinations of the vectors $X_{(1)}, X_{(2)}$. Of course here we know the unique solution is
$$Z = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix},$$
a matrix which together with its inverse can be used to transform the components of vectors from one basis to the other.

In other words, it is sometimes useful to generalize the simple linear system $A\,\underline{x} = \underline{b}$ to an unknown matrix X of more than one column, $AX = B$, when the right hand side matrix B has more than one column:
$$\underbrace{A}_{m \times n}\ \underbrace{X}_{n \times p} = \underbrace{B}_{m \times p}.$$

Elementary linear algebra: the eigenvalue problem and linear transformations

The next step in elementary linear algebra is to understand how a square $n \times n$ matrix acts on R^n by matrix multiplication as a linear transformation of the space into itself
$$\underline{x} \to A\,\underline{x}, \qquad x^i \to A^i{}_j x^j,$$
which maps each vector $\underline{x}$ to the new location $A\,\underline{x}$. Under this mapping the standard basis vectors $e_i$ are mapped to the new vectors $A\,e_i$, each of which can be expressed as a unique linear combination of the basis vectors with coefficients $A^j{}_i$, hence the index notation
$$e_i \to A\,e_i = e_j A^j{}_i,$$
which makes those coefficients for each value of i into the columns of the matrix A. To understand how this matrix multiplication moves around the vectors in the space, one looks for special directions ("eigendirections") along which matrix multiplication reduces to scalar multiplication, i.e., subspaces along which the direction of the new vectors remains parallel to their original directions (although they might reverse direction)
$$A\,\underline{x} = \lambda \underline{x}, \qquad \underline{x} \neq 0,$$
which defines a proportionality factor $\lambda$ called the eigenvalue associated with the eigenvector $\underline{x}$, which must be nonzero to have a direction to speak about. This eigenvector condition is equivalent to
$$(A - \lambda I)\,\underline{x} = A\,\underline{x} - \lambda \underline{x} = 0.$$
In order for the square matrix $A - \lambda I$ to admit nonzero solutions it must row reduce to a matrix which has at least one free variable and hence at least one zero row, and hence zero determinant, so a necessary condition for finding an eigenvector is that the characteristic equation
$$\det(A - \lambda I) = 0$$
be satisfied by the eigenvalue. The roots of this nth degree polynomial are the eigenvalues of the matrix, and once found they can be separately backsubstituted into the linear system to find the solution space which defines the corresponding eigenspace.

The row reduction procedure provides a default basis of this eigenspace, i.e., a set of linearly independent eigenvectors for each eigenvalue. It is easily shown that eigenvectors corresponding to distinct eigenvalues are linearly independent, so this process leads to a basis of the subspace of R^n spanned by all these eigenspace bases. If they are n in number, this is a basis of the whole space and the matrix can be diagonalized.

Let $B = \langle \underline{b}_1 \mid \cdots \mid \underline{b}_n\rangle$ be the matrix whose columns are such an eigenbasis of R^n, with $A\,\underline{b}_i = \lambda_i \underline{b}_i$. In other words, define $B^j{}_i = b^j{}_i$ as the jth component of the ith eigenvector. Then
$$A\,B = \langle A\,\underline{b}_1 \mid \cdots \mid A\,\underline{b}_n\rangle = \langle \lambda_1 \underline{b}_1 \mid \cdots \mid \lambda_n \underline{b}_n\rangle = \langle \underline{b}_1 \mid \cdots \mid \underline{b}_n\rangle \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix},$$
where the latter diagonal matrix multiplies each column by its corresponding eigenvalue, so that
$$B^{-1} A\,B = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix} \equiv A_B$$
is a diagonal matrix whose diagonal elements are the eigenvalues listed in the same order as the corresponding eigenvectors. Thus (multiplying this equation on the left by B and on the right by $B^{-1}$) the matrix A can be represented in the form $A = B\,A_B\,B^{-1}$.

This matrix transformation has a simple interpretation in terms of a linear transformation of the Cartesian coordinates of the space, expressing the old coordinates $x^i$ (with respect to the standard basis) as linear combinations of the new basis vectors $\underline{b}_j$ whose coefficients are the new coordinates, $x^i = y^j b^i{}_j = B^i{}_j y^j$, which takes the matrix form
$$\underline{x} = B\,\underline{y}, \qquad x^i = B^i{}_j y^j,$$
$$\underline{y} = B^{-1}\underline{x}, \qquad y^i = B^{-1\,i}{}_j x^j.$$
The top line expresses the old coordinates as linear functions of the new Cartesian coordinates $y^i$. Inverting this relationship by multiplying both sides of the top matrix equation by $B^{-1}$, one arrives at the bottom line, which instead expresses the new coordinates as linear functions of the old coordinates. Then under matrix multiplication of the old coordinates by A, namely $\underline{x} \to A\,\underline{x}$, the new coordinates are mapped to
$$y^i = B^{-1\,i}{}_j x^j \to B^{-1\,i}{}_j (A^j{}_k x^k) = B^{-1\,i}{}_j A^j{}_k B^k{}_m y^m = [A_B]^i{}_m y^m,$$
so $A_B$ is just the new matrix of the linear transformation with respect to the new basis of eigenvectors. In the eigenbasis, matrix multiplication is reduced to distinct scalar multiplications along each eigenvector, which may be interpreted as a stretch ($1 < \lambda_i$) or a contraction ($0 \le \lambda_i < 1$), but no change if $\lambda_i = 1$, combined with a change in direction (reflection) if the eigenvalue is negative, $\lambda_i < 0$. Not all square matrices can be diagonalized in this way. For example, rotations occur in the interesting case in which one cannot find enough independent (real) eigenvectors to form a complete basis, but correspond instead to complex conjugate pairs of eigenvectors.

Don't worry. We will not need to deal with the eigenvector problem in most of what follows, except in passing for symmetric matrices $A = A^T$, which can always be diagonalized by an orthogonal matrix B. However, the change of basis example is fundamental to everything we will do.

Example 1.2.5. Consider the matrix
$$A = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix} = B\,A_B\,B^{-1}, \qquad A_B = \begin{pmatrix} 5 & 0 \\ 0 & -1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix}, \qquad B^{-1} = \frac{1}{3}\begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix}.$$
Under matrix multiplication by A, the first eigenvector $\underline{b}_1 = \langle 1, 1\rangle$ is stretched by a factor of 5 while the second one $\underline{b}_2 = \langle 2, -1\rangle$ is reversed in direction. As shown in Figure 1.1, this reflects the letter F across the $y^1$ axis and then stretches it in the $y^1$ direction by a factor of 5.

Figure 1.1: The action of a linear transformation on a figure shown with a grid adapted to the new basis of eigenvectors. Vectors are stretched by a factor 5 along the $y^1$ direction and reflected across that direction along the $y^2$ direction.
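Example 1.2.5 is easy to check numerically. Here is a small Python/NumPy sketch (added for illustration; the notes use Maple) that recovers the eigenvalues of A and verifies the change of basis formula $A_B = B^{-1} A B$:

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 3.0]])
B = np.array([[1.0,  2.0],
              [1.0, -1.0]])   # columns are the eigenvectors b_1 = <1,1>, b_2 = <2,-1>

# The matrix of the transformation in the eigenbasis is diagonal.
A_B = np.linalg.inv(B) @ A @ B
print(np.round(A_B, 10))       # diag(5, -1)

# np.linalg.eig returns the same eigenvalues (possibly in another order,
# with the eigenvectors normalized to unit length).
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)             # [ 5. -1.]

# b_1 is stretched by 5, b_2 is reversed in direction.
print(A @ B[:, 0], A @ B[:, 1])   # [5. 5.] and [-2.  1.]
```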

1.3 The dual space $V^*$

Let $V^*$ be the dual space of $V$, just a fancy name for the space of real-valued linear functions on $V$; elements of $V^*$ are called covectors. (Sometimes I will slip and call them 1-forms, in the same sense that one sometimes speaks of a linear form or a quadratic form on a vector space.) The condition of linearity is

linearity condition: $f \in V^* \iff f(au + bv) = a f(u) + b f(v)$,

or in words: the value on a linear combination = the linear combination of the values. This easily extends to linear combinations with any number of terms; for example

$$ f(v) = f\!\left( \sum_{i=1}^N v^i e_i \right) = \sum_{i=1}^N v^i f(e_i), $$

where the coefficients $f_i \equiv f(e_i)$ are the components of the covector with respect to the basis $\{e_i\}$, or in our shorthand notation

$$
\begin{aligned}
f(v) &= f(v^i e_i) && \text{(express in terms of basis)} \\
     &= v^i f(e_i) && \text{(linearity)} \\
     &= v^i f_i.   && \text{(definition of components)}
\end{aligned}
$$

A covector $f$ is entirely determined by its values $f_i$ on the basis vectors, namely its components with respect to that basis. Our linearity condition is usually presented as a pair of separate conditions on the two operations which define a vector space:

sum rule: the value of the function on a sum of vectors is the sum of the values, $f(u + v) = f(u) + f(v)$;

scalar multiple rule: the value of the function on a scalar multiple of a vector is the scalar times the value on the vector, $f(cu) = c f(u)$.

Example 1.3.1. In the usual calculus notation on $\mathbb{R}^3$, with Cartesian coordinates $(x^1, x^2, x^3) = (x, y, z)$, linear functions are of the form $f(x, y, z) = ax + by + cz$, but a function with an extra additive term $g(x, y, z) = ax + by + cz + d$ is called linear as well. Only linear homogeneous functions (no additive term) satisfy the basic linearity property $f(au + bv) = a f(u) + b f(v)$. Unless otherwise indicated, the term linear here will always be intended in its narrow meaning of linear homogeneous.
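To make the homogeneous-versus-affine distinction of Example 1.3.1 concrete, here is a small numerical sketch (not from the text); the coefficients $a, b, c, d$ and the test vectors are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative coefficients a, b, c and offset d (not from the text).
a, b, c, d = 1.0, -2.0, 3.0, 4.0

f = lambda p: a * p[0] + b * p[1] + c * p[2]        # linear homogeneous
g = lambda p: a * p[0] + b * p[1] + c * p[2] + d    # "linear" plus a constant

u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 3.0, -1.0])
s, t = 2.0, -0.5

# f satisfies f(s u + t v) = s f(u) + t f(v); the affine g does not.
print(np.isclose(f(s * u + t * v), s * f(u) + t * f(v)))   # True
print(np.isclose(g(s * u + t * v), s * g(u) + t * g(v)))   # False
```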

Warning: In this example, the variables $(x, y, z)$ in the defining statement $f(x, y, z) = ax + by + cz$ are simply placeholders for any three real numbers in the equation, while the Cartesian coordinate functions denoted by the same symbols are instead the names of three independent (linear) functions on the vector space whose values on any triplet of numbers are just the corresponding number from the triplet: $y(1, 2, 3) = 2$, for example. To emphasize that it is indeed a function of the vector $u = (1, 2, 3)$, we might also write this as $y(u) = y((1, 2, 3)) = 2$, or even $y(\langle 1, 2, 3 \rangle)$ if we adopt the vector delimiters $\langle\,,\,\rangle$ instead of the point delimiters $(\,,\,)$. Notation is extremely important in conveying mathematical meaning, but we only have so many symbols to go around, so flexibility in interpretation is also required.

The dual space $V^*$ is itself an $n$-dimensional vector space, with linear combinations of covectors defined in the usual way that one takes linear combinations of any functions, i.e., in terms of values:

covector addition: $(af + bg)(v) \equiv a f(v) + b g(v)$, with $f, g$ covectors and $v$ a vector.

Exercise 1.3.1. Show that this defines a linear function $af + bg$, so that the space is closed under this linear combination operation. [All the other vector space properties of $V^*$ are inherited from the linear structure of $V$.] In other words, show that if $f, g$ are linear functions, satisfying our linearity condition, then $c_1 f + c_2 g$ also satisfies the linearity condition for linear functions.

Let us produce a basis for $V^*$, called the dual basis $\{\omega^i\}$ or the basis dual to $\{e_i\}$, by defining $n$ covectors which satisfy the following duality relations

$$ \omega^i(e_j) = \delta^i{}_j \equiv \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j, \end{cases} $$

where the symbol $\delta^i{}_j$ is called the Kronecker delta, nothing more than a symbol for the components of the $n \times n$ identity matrix $I = (\delta^i{}_j)$. We then extend them to any other vector by linearity. Then by linearity

$$
\begin{aligned}
\omega^i(v) &= \omega^i(v^j e_j) && \text{(expand in basis)} \\
            &= v^j\, \omega^i(e_j) && \text{(linearity)} \\
            &= v^j\, \delta^i{}_j  && \text{(duality)} \\
            &= v^i                 && \text{(Kronecker delta definition)}
\end{aligned}
$$

where the last equality follows since for each $i$, only the term with $j = i$ in the sum over $j$ contributes. Alternatively, matrix multiplication of a vector on the left by the identity matrix, $\delta^i{}_j v^j = v^i$, does not change the vector. Thus the calculation shows that the $i$-th dual basis covector $\omega^i$ picks out the $i$-th component $v^i$ of a vector $v$.

Notice that a Greek letter has been introduced for the covectors $\omega^i$, partially following a convention that distinguishes vectors and covectors using Latin and Greek letters, but this convention is obviously incompatible with our more familiar calculus notation in which $f$ denotes a function, so we limit it to our conventional symbol for the dual basis associated with a starting basis $\{e_i\}$.
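Before turning to the basis conditions, here is a small numerical illustration (not part of the text) of the duality relations and of the fact that $\omega^i$ picks out the $i$-th component; for the purposes of this sketch the dual basis covectors are realized as the rows of the identity matrix.

```python
import numpy as np

n = 3
E = np.eye(n)          # columns E[:, j] are the standard basis vectors e_j

# The i-th dual basis covector just reads off the i-th component,
# omega^i(v) = v^i; here it is realized as the i-th row of the identity.
def omega(i, v):
    return E[i] @ v

# Duality relations omega^i(e_j) = delta^i_j (the identity matrix):
duality = np.array([[omega(i, E[:, j]) for j in range(n)] for i in range(n)])
print(np.allclose(duality, np.eye(n)))                 # True

# omega^i picks out the i-th component of an arbitrary vector:
v = np.array([7.0, -2.0, 5.0])
print([float(omega(i, v)) for i in range(n)])          # [7.0, -2.0, 5.0]
```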

Why do the $n$ covectors $\{\omega^i\}$ form a basis of $V^*$? We can easily show that the two conditions for a basis are satisfied.

1. spanning condition: Using linearity and the definition $f_i = f(e_i)$, this calculation shows that every linear function $f$ can be written as a linear combination of these covectors (a numerical sketch follows Example 1.3.2 below):

$$
\begin{aligned}
f(v) &= f(v^i e_i)                && \text{(expand in basis)} \\
     &= v^i f(e_i)                && \text{(linearity)} \\
     &= v^i f_i                   && \text{(definition of components)} \\
     &= v^i\, \delta^j{}_i\, f_j  && \text{(Kronecker delta definition)} \\
     &= v^i\, \omega^j(e_i)\, f_j && \text{(dual basis definition)} \\
     &= (f_j \omega^j)(v^i e_i)   && \text{(linearity)} \\
     &= (f_j \omega^j)(v).        && \text{(expansion in basis, in reverse)}
\end{aligned}
$$

Thus $f$ and $f_i \omega^i$ have the same value on every $v \in V$, so they are the same function: $f = f_i \omega^i$, where $f_i = f(e_i)$ are the components of $f$ with respect to the basis $\{\omega^i\}$ of $V^*$, also said to be the components of $f$ with respect to the basis $\{e_i\}$ of $V$ already introduced above. The index $i$ on $f_i$ labels the components of $f$, while the index $i$ on $\omega^i$ labels the dual basis covectors.

2. linear independence: Suppose $f_i \omega^i = 0$ is the zero covector. Then evaluating each side of this equation on $e_j$ and using linearity,

$$
\begin{aligned}
0 &= 0(e_j)                && \text{(zero scalar = value of zero linear function)} \\
  &= (f_i \omega^i)(e_j)   && \text{(expand zero covector in basis)} \\
  &= f_i\, \omega^i(e_j)   && \text{(definition of linear combination function value)} \\
  &= f_i\, \delta^i{}_j    && \text{(duality)} \\
  &= f_j                   && \text{(Kronecker delta definition)}
\end{aligned}
$$

forces all the coefficients of $\omega^i$ to vanish, i.e., no nontrivial linear combination of these covectors exists which equals the zero covector (the existence of which would be a linear relationship among them), so these covectors are linearly independent.

Thus $V^*$ is also an $n$-dimensional vector space.

Example 1.3.2. The familiar Cartesian coordinates on $\mathbb{R}^n$ are defined by $x^i((u^1, \dots, u^n)) = u^i$ (the value of the $i$-th number in the $n$-tuple). But this is exactly what the basis $\{\omega^i\}$ dual to the natural basis $\{e_i\}$ does, i.e., the set of Cartesian coordinates $\{x^i\}$, interpreted as linear functions on the vector space $\mathbb{R}^n$ (why are they linear?), is the dual basis: $\omega^i = x^i$. A general linear function on $\mathbb{R}^n$ has the familiar form $f = f_i \omega^i = f_i x^i$.
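Here is the numerical sketch promised under the spanning condition (not part of the text): a linear function on $\mathbb{R}^3$, with arbitrary illustrative coefficients, is reconstructed from its values $f_i = f(e_i)$, i.e., $f = f_i\,\omega^i = f_i\,x^i$.

```python
import numpy as np

n = 3
E = np.eye(n)                                   # standard basis e_i as columns

# An arbitrary linear function on R^3 (illustrative coefficients only).
def f(v):
    return 2.0 * v[0] - 1.0 * v[1] + 4.0 * v[2]

# Its components f_i = f(e_i):
f_i = np.array([f(E[:, i]) for i in range(n)])  # array([ 2., -1.,  4.])

# Spanning: f = f_i omega^i, i.e. f(v) = f_i v^i for every v.
v = np.array([1.0, -3.0, 0.5])
print(np.isclose(f(v), f_i @ v))                # True
```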

If we return to $\mathbb{R}^3$ and calculus notation, where a general linear function has the form $f = ax + by + cz$, then all we are doing is abstracting the familiar relations

$$ \begin{pmatrix} x(1,0,0) & x(0,1,0) & x(0,0,1) \\ y(1,0,0) & y(0,1,0) & y(0,0,1) \\ z(1,0,0) & z(0,1,0) & z(0,0,1) \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} $$

for the values of the Cartesian coordinates on the standard basis unit vectors along the coordinate axes, making the three simple linear functions $\{x, y, z\}$ a dual basis to the standard basis, usually designated by the unit vectors $\{\hat\imath, \hat\jmath, \hat k\}$ with or without hats (the physics notation to indicate unit vectors).

Note that linearity of a function can be interpreted in terms of linear interpolation of intermediate values of the function. Given any two points $u, v$ in $\mathbb{R}^n$, the set of points $t\,u + (1-t)\,v$ for $t = 0$ to $t = 1$ is the directed line segment between the two points. Then the linearity condition $f(t\,u + (1-t)\,v) = t f(u) + (1-t) f(v)$ says that the value of the function at a certain fraction of the way from $u$ to $v$ is exactly that fraction of the way between the values of the function at those two points.

Figure 1.2: Vector addition: the sum $u + v$ is the main diagonal of the parallelogram formed by $u$ and $v$ from the origin (the zero vector).

Vectors and vector addition are best visualized by interpreting points in $\mathbb{R}^n$ as directed line segments from the origin ("arrows"). Functions can instead be visualized in terms of their level surfaces $f(x) = f_i x^i = t$ ($t$ a parameter), which are a family of parallel hyperplanes, best represented by selecting an equally spaced set of such hyperplanes, say by choosing integer values of the parameter $t$. However, it is enough to graph two such level surfaces, $f(x) = 0$ and $f(x) = 1$, to have a mental picture of the entire family, since they completely determine the orientation and separation of all other members of this family.
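The linear interpolation reading of linearity noted above is easy to verify numerically; in this sketch (not from the text) the components and the two endpoints are arbitrary illustrative choices.

```python
import numpy as np

# A linear homogeneous function f(x) = f_i x^i with illustrative components.
f_i = np.array([1.0, 2.0, -1.0])
f = lambda p: f_i @ p

u = np.array([1.0, 1.0, 0.0])
v = np.array([4.0, -2.0, 3.0])

# Along the segment t*u + (1-t)*v, the value of f interpolates linearly
# between f(v) (at t = 0) and f(u) (at t = 1).
for t in np.linspace(0.0, 1.0, 5):
    assert np.isclose(f(t * u + (1 - t) * v), t * f(u) + (1 - t) * f(v))
print("linear interpolation property verified")
```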

Figure 1.3: Geometric representation of a covector: the representative hyperplanes of values 0 and 1 are enough to capture its orientation and magnitude.

This pair of planes also enables one to have a geometric interpretation of covector addition on the vector space itself, like the parallelogram law for vectors. However, instead of the directed main-diagonal line segment, one has the cross-diagonal hyperplane for the result. Let's look at two pairs of such hyperplanes representing $f$ and $g$ but edge on, namely in the 2-plane orthogonal to the $(n-2)$-plane of intersection of the two $(n-1)$-planes which are these hyperplanes. The intersection of two nonparallel hyperplanes, each of which represents the solution of a single linear homogeneous condition on $n$ variables, represents the solution of two independent conditions on $n$ variables, and hence must be an $(n-2)$-dimensional plane.

This is easier to see if we are more concrete. Figures 1.4 and 1.5 illustrate this in three dimensions. The first figure, looking at the intersecting planes edge on down the lines of intersection, is actually the two-dimensional example, where it is clear that the cross-diagonal intersection points of the two pairs of lines must both belong to the line $(f + g)(x) = 1$, on which the sum covector has the value $1 = 0 + 1 = 1 + 0$. The second line of the pair, $(f + g)(x) = 0$, needed to represent the sum covector is the parallel line passing through the origin. If we now rotate our point of view away from the edge-on orientation, we get the picture depicted in Figure 1.5, which looks like a honeycomb of intersecting planes, with the cross-diagonal plane of intersection representing the sum covector.

Of course the dual space $(\mathbb{R}^n)^*$ is isomorphic to $\mathbb{R}^n$:

$$ f = f_i\,\omega^i = f_i\,x^i \in (\mathbb{R}^n)^* \;\leftrightarrow\; f^\flat = (f_i) = (f_1, \dots, f_n) \in \mathbb{R}^n, $$

where the flat symbol notation reminds us that a correspondence has been established between two different objects (effectively lowering the component index), and since $(\mathbb{R}^n)^*$ is a vector space itself, covector addition is just the usual parallelogram vector addition there. However, the above hyperplane interpretation of dual space covector addition occurs on the original vector space! These same pictures apply to any finite-dimensional vector space.
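The cross-diagonal claim is a one-line consequence of linearity and can be checked numerically; in this sketch (not part of the text) the two covectors on $\mathbb{R}^2$ are arbitrary non-parallel illustrative choices, corresponding to the edge-on (two-dimensional) picture.

```python
import numpy as np

# Two non-parallel covectors on R^2 (illustrative components), each pictured
# "edge on" by its pair of lines f(x) = 0, f(x) = 1 and g(x) = 0, g(x) = 1.
f = np.array([1.0, 2.0])
g = np.array([3.0, -1.0])
M = np.vstack([f, g])          # rows: the two covectors

# The cross-diagonal corners of the parallelogram cut out by the two pairs
# of lines are the points where (f, g) = (1, 0) and (f, g) = (0, 1).
p1 = np.linalg.solve(M, np.array([1.0, 0.0]))
p2 = np.linalg.solve(M, np.array([0.0, 1.0]))

# Both lie on the line (f + g)(x) = 1 that represents the sum covector.
print(np.isclose((f + g) @ p1, 1.0))   # True
print(np.isclose((f + g) @ p2, 1.0))   # True
```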

Figure 1.4: Covector addition seen edge-on in $\mathbb{R}^3$. The plane $(f + g)(x) = 1$ representing the addition of two covectors is the plane through the lines of intersection along the cross-diagonal of the parallelogram formed by the intersection of the two pairs of planes when seen edge-on down the lines of intersection. Moving that plane parallel to itself until it passes through the origin gives the second plane of the pair representing the sum covector.

The difference in geometrical interpretation between directed line segments and directed hyperplane pairs is one reason for carefully distinguishing $V$ from $V^*$ by switching index positioning. For $\mathbb{R}^n$ the distinction between $n$-tuples of numbers which are vectors and (the component $n$-tuples of) covectors is still made using matrix notation. Vectors in $\mathbb{R}^n$ are identified with column matrices and covectors in the dual space with row matrices:

$$ u = (u^1, \dots, u^n) \leftrightarrow \begin{pmatrix} u^1 \\ \vdots \\ u^n \end{pmatrix}, \qquad f = f_i\,\omega^i \leftrightarrow (f_1, \dots, f_n) \leftrightarrow \begin{pmatrix} f_1 & \cdots & f_n \end{pmatrix} \text{ [no commas here]}, $$

which we will sometimes designate respectively by $\langle u^1, \dots, u^n \rangle$ and $\langle f_1 \cdots f_n \rangle$ to emphasize the vector/covector column/row matrix dual interpretation of the $n$-tuple of numbers. The natural evaluation of a covector on a vector then corresponds to matrix multiplication

$$ f(u) = f_i\,u^i = \begin{pmatrix} f_1 & \cdots & f_n \end{pmatrix} \begin{pmatrix} u^1 \\ \vdots \\ u^n \end{pmatrix}. $$

This evaluation of a covector (represented by a row matrix on the left) on a vector (represented by a column matrix on the right), which is just the value of the linear function $f = f_i x^i$ at the point with position vector $u$, is a matrix product of two different objects, although it can be

Figure 1.5: Covector addition in $\mathbb{R}^3$ no longer seen edge-on. One has a honeycomb of intersecting planes, with the sum covector represented by the cross-diagonal plane of intersection.
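Finally, the row-times-column evaluation displayed above translates directly into matrix arithmetic; this short sketch (not part of the text) uses arbitrary illustrative components.

```python
import numpy as np

# Covector as a 1 x n row matrix, vector as an n x 1 column matrix
# (components chosen only for illustration).
f_row = np.array([[1.0, -2.0, 3.0]])        # (f_1  f_2  f_3)
u_col = np.array([[2.0], [0.0], [1.0]])     # column of components u^1, u^2, u^3

# The evaluation f(u) = f_i u^i is the 1 x 1 product row @ column.
value = f_row @ u_col                        # shape (1, 1)
print(float(value[0, 0]))                    # 5.0
```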