Notes on Special Relativity. Historical background - the Newtonian/Galilean picture

The purpose of these notes is to give students the minimum background necessary to survive an introductory general relativity course. There are many excellent texts on SR, and the reader should get hold of one to fill in the many gaps in the presentation.

Historical background - the Newtonian/Galilean picture

Newton's three laws of mechanics:

1. A body not subject to an external force moves in a straight line with constant velocity.
2. The rate of change of momentum of any body is equal to the (vector) sum of the forces acting on it: $d\mathbf{p}/dt = \mathbf{f}$, where $\mathbf{p} = m\mathbf{v}$. If $m$ is constant, this reads $m\mathbf{a} = \mathbf{f}$.
3. The forces two bodies exert on each other are equal in magnitude and opposite in direction.

Observers who want to do experiments set up coordinate systems in which measurements and observations are performed. Any rectilinear coordinate system in which Newton's first law holds is called an inertial frame, and an observer at rest in such a frame is called an inertial observer.

Because there is no way to shield ourselves from gravitational forces, there are no global inertial frames. Inertial frames certainly exist locally, to the extent that gravitational effects can be neglected or otherwise accounted for. For the present, we simply ignore gravitation and assume the existence of an inertial frame in which each event is uniquely labelled by the 4 numbers $(t, x, y, z)$. This can be taken to be a system in which the axes are aligned with the distant stars.

Given one inertial frame, there are many others: the original coordinate system can be rotated and translated in space, and given a constant velocity $\mathbf{v}$. The new coordinates are related to the old by

$$\mathbf{x}' = R\mathbf{x} + \mathbf{b} + t\mathbf{v},$$

where $R$ is a $3 \times 3$ orthogonal matrix ($R^t = R^{-1}$) and $\mathbf{b}$, $\mathbf{v}$ are constant 3-vectors. Such a transformation is called a Galilean transformation. Time translation ($t' = t + t_0$, with $t_0$ constant) is also allowed - this just corresponds to the freedom to set a clock to read $t = 0$.
In a Galilean transformation, the 1-parameter family of surfaces given by $t$ = constant is mapped into itself: the 3-spaces are rotated, translated in time, and put into motion, but the family itself is unchanged. These transformations form a 10-parameter group (3 for rotations, 3 for the velocity, 3 for the translation, and 1 for the origin of time).

The general form of the Galilean transformation asserted above can readily be deduced from the following requirement: any curve $\mathbf{x}(t)$ satisfying $\ddot{\mathbf{x}}(t) = 0$ in the original system must likewise satisfy $\ddot{\mathbf{x}}'(t') = 0$ in the new system (this is a good exercise in the use of the chain rule!).

Under a Galilean transformation in which $R = I$, so the coordinate axes are aligned,

$$m\frac{d^2\mathbf{x}}{dt^2} = \mathbf{f} \iff m\frac{d^2\mathbf{x}'}{dt'^2} = \mathbf{f}.$$
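The claim that a Galilean transformation sends unaccelerated motion to unaccelerated motion can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `galilean` and `acceleration`, and the particular rotation, translation and velocity, are illustrative assumptions. The time translation is left at zero so that $t' = t$.

```python
def galilean(t, x, R, b, v, t_shift=0.0):
    """Apply t' = t + t_shift, x' = R x + b + t v to the event (t, x)."""
    rotated = [sum(R[i][j] * x[j] for j in range(3)) for i in range(3)]
    x_new = [rotated[i] + b[i] + t * v[i] for i in range(3)]
    return t + t_shift, x_new


def acceleration(path, t, h=1e-3):
    """Central finite-difference estimate of the second derivative of path at t."""
    before, here, after = path(t - h), path(t), path(t + h)
    return [(after[i] - 2.0 * here[i] + before[i]) / h**2 for i in range(3)]


# Illustrative (hypothetical) transformation: rotation by 90 degrees about z,
# a spatial translation B and a relative velocity V.
R_Z90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
B = [1.0, 2.0, 3.0]
V = [0.5, 0.0, -0.25]


def free_path(t):
    """An unaccelerated path x(t) = x0 + u t in the original frame."""
    return [1.0 + 2.0 * t, -1.0 * t, 0.5]


def transformed_path(t):
    """The same path in the new frame (t' = t because t_shift is 0)."""
    return galilean(t, free_path(t), R_Z90, B, V)[1]


# Both estimates should vanish up to rounding error.
print(acceleration(free_path, 2.0))
print(acceleration(transformed_path, 2.0))
```

The finite-difference check is only a sanity test; the chain-rule exercise mentioned above is what actually proves the statement.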

This leads to the principle of relativity:

All inertial frames are equivalent for the description of mechanics.

This means: in any inertial frame, if you compute $m\,d^2\mathbf{x}/dt^2$ for some particle, you'll get the same result (once you've corrected for any rotation of the axes) as any other inertial observer.

Toward the end of the 19th century, a problem arose. Suppose that $\mathbf{x}(t)$ is the path of an unaccelerated object in the unprimed inertial frame, and that the primed frame is given by $\mathbf{x}' = \mathbf{x} + t\mathbf{v}$. Then the velocities of the object in the two frames are related by

$$\frac{d\mathbf{x}'}{dt} = \frac{d\mathbf{x}}{dt} + \mathbf{v}.$$

This is the Galilean law for the addition of velocities. So something moving with velocity $c$ (the speed of light, $\approx 3 \times 10^{10}$ cm/sec) in the unprimed frame should move with velocity $c + v$ in the primed frame if the direction of motion is parallel to $\mathbf{v}$.

Maxwell had recently shown that electromagnetic waves in vacuum moved at the speed of light, and it was tacitly assumed (at first) that this meant at the speed of light relative to the inertial frame of the fixed stars. This being the case, it should have been easy to observe different values for $c$ in frames which were not at rest relative to the fixed stars. All attempts to verify this experimentally failed, as is well documented in any modern physics text, and we'll assume it's known to the reader.

The constant $c$ appears explicitly in Maxwell's equations for the electromagnetic field in vacuum, and it doesn't change to $c \pm v$ in a moving frame. Also, Maxwell's equations are first-order in time, so that under a Galilean transformation, the quantity $\partial\mathbf{E}/\partial t$ would have to be replaced by $\partial\mathbf{E}'/\partial t' + (\mathbf{v}\cdot\nabla')\mathbf{E}'$, which is not correct. So Maxwell's equations fail to obey the principle of relativity when subjected to a Galilean transformation. These equations give the correct (non-quantum) description of electrodynamics, and it was confusing and quite unsatisfactory to have separate sets of rules for electromagnetic and mechanical phenomena.
Einstein cleared up the problem in an amazing intellectual tour-de-force [1] by proposing that

The speed of light in vacuum is constant in all inertial frames of reference,

and by broadening the principle of relativity to read

All inertial frames are equivalent for the description of physics.

He then proceeded to work out the consequences of these assumptions, to which we now turn.

[1] A. Einstein, On the electrodynamics of moving bodies, Ann. d. Phys. 17 (1905). During this same year, Einstein also (a) invented the photon to explain the photoelectric effect, (b) explained the phenomenon of Brownian motion, and (c) published several other seminal papers in statistical physics.

Measurements of time and distance

Let's begin by repeating a gedanken experiment of Einstein's which formalizes part of the discussion above. Suppose the inertial observer $O_B$ moves with constant velocity $v$ in the positive $x$ direction in the frame of $O_A$. From the Newtonian/Galilean viewpoint, we have $x_B = x_A - vt_A$, $y_B = y_A$, $z_B = z_A$, and $t_B = t_A$. Notice that at $t = 0$, the two origins coincide.

Let a photon be emitted at $x_A = x_B = 0$ at time $t = 0$, moving in the positive $x$ direction. At $t_A = 1$, it has gone a distance of one light-second and is found at $x_A = c$. How does this appear to the observer $O_B$? In this frame, the photon has a speed of only $c - v$ in the positive $x_B$ direction. After 1 second, it has therefore gone a distance $x_B = c - v$. At $t_B = 1$, we have $x_B = c - v = x_A - v$ (using the Galilean transformation); therefore, $x_A = c$ and everything is logically consistent.

But it is just this consistency which is contradicted by experiment. All the experiments show that $x_B = c$, and not $c - v$. But if $x_B = c$ after the photon has travelled for 1 second in the frame of $O_B$, how can this be reconciled with the fact that (a) $O_A$ and $O_B$ are in relative motion, and (b) the photon appears to have gone the same distance in both frames in the same time? Clearly when the photon arrives at the location $x_B = c$ in the $O_B$ frame, it has passed the location $x_A = c$ in the $O_A$ frame due to the relative motion. So how can the two distances be the same?

Einstein's solution to the problem involves a careful examination of how we actually assign coordinates to events, and how distances and times are measured. How, for example, in the situation just discussed, do we know that $x_B = x_A - vt_A$ or $t_A = t_B$ is correct? We need to address this issue experimentally (or operationally, as the philosophers say) and give a clear prescription that is consistent with both relativity and the observed facts regarding the speed of light.
Transformation between inertial frames

Most of the interesting features of relativity appear when we restrict our attention to what happens in the direction of the relative motion. So for the moment, we ignore two spatial dimensions and stick with the observers $O_A$ and $O_B$ whose relative motion is along the common $x$ direction. Any event will be labeled by coordinates $(t_A, x_A)$ and $(t_B, x_B)$ respectively by the two observers. How are they related?

To begin with, we assume the transformation connecting them is linear, since the transformation preserves uniform, rectilinear motion. Thus, there will exist constants $\alpha, \beta, \gamma, \delta$ such that

$$t_B = \alpha t_A + \beta x_A \qquad (1)$$
$$x_B = \gamma t_A + \delta x_A \qquad (2)$$

If $O_B$ moves with velocity $v$ in the positive $x$ direction, then the spatial origin in the $O_B$ system ($x_B = 0$) corresponds to $x_A = vt_A$. Similarly, $x_A = 0 \Rightarrow x_B = -vt_B$. Now if $x_A = 0$, equation (1) gives $t_B = \alpha t_A$, while (2) $\Rightarrow -vt_B = \gamma t_A$, and so $\gamma = -\alpha v$. And now, if $x_B = 0$, then $x_A = vt_A$ and (2) $\Rightarrow \delta = \alpha$. So we can rewrite the transformation as

$$t_B = \alpha t_A + \beta x_A \qquad (3)$$
$$x_B = \alpha(-vt_A + x_A) \qquad (4)$$

To get further, it is necessary to go beyond the principle of relativity. The Galilean transformations result from the (insufficiently examined) assumption that $t_A = t_B$, from which it follows that $\beta = 0$ and $\alpha = 1$.

To get the Lorentz transformation, we first note that, interchanging the roles of the two observers, we'll have $x_A = \alpha'(vt_B + x_B)$. Invoking the principle of relativity again, we must have $\alpha' = \alpha$:

$$x_A = \alpha(vt_B + x_B). \qquad (5)$$

Now suppose the event we're looking at lies on a light ray emitted from the common origin. This is the example discussed above. According to Einstein's postulate, we must have, at this event, both $x_B = ct_B$ and $x_A = ct_A$. From (5), we get

$$ct_A = \alpha(vt_B + ct_B) = \alpha t_B(v + c),$$

while from (4), we obtain

$$ct_B = \alpha(ct_A - vt_A) = \alpha t_A(c - v).$$

Multiplying these together, and cancelling the term $t_A t_B$, we arrive at

$$\alpha^2 = \frac{c^2}{c^2 - v^2}, \quad\text{or}\quad \alpha = \frac{1}{\sqrt{1 - v^2/c^2}}, \qquad (6)$$

where we've taken the positive root, since $\alpha > 0$.

The matrix of the linear transformation connecting the two systems has determinant $\alpha^2 + \alpha\beta v$, and in order to get equation (5) when we invert the matrix, this determinant must have the value 1. Solving algebraically for $\beta$, we find

$$\beta = -\frac{v\alpha}{c^2} = \frac{-v/c^2}{\sqrt{1 - v^2/c^2}}. \qquad (7)$$

Putting this together, we arrive at the Lorentz transformation

$$t_B = \alpha(t_A - vx_A/c^2) \qquad (8)$$
$$x_B = \alpha(x_A - vt_A), \qquad (9)$$

where

$$\alpha = \frac{1}{\sqrt{1 - v^2/c^2}}.$$
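The Lorentz transformation (8)-(9) is easy to experiment with numerically. The sketch below is an illustration rather than anything from the original notes: the names `alpha` and `boost` are assumptions, and SI units are used for `C`. It checks the two properties used in the derivation: an event on the light ray $x_A = ct_A$ is mapped to an event with $x_B = ct_B$, and boosting back with $-v$ recovers the original coordinates, as in equation (5).

```python
import math

C = 299_792_458.0  # speed of light in m/s (SI units assumed for this sketch)


def alpha(v, c=C):
    """The factor alpha = 1 / sqrt(1 - v^2/c^2) of equation (6)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def boost(t_a, x_a, v, c=C):
    """Equations (8) and (9): coordinates of an event in the frame of O_B."""
    a = alpha(v, c)
    t_b = a * (t_a - v * x_a / c**2)
    x_b = a * (x_a - v * t_a)
    return t_b, x_b


v = 0.6 * C

# An event on the light ray through the origin, as seen by O_A.
t_b, x_b = boost(1.0, C * 1.0, v)
print(x_b / t_b / C)  # ratio of the photon speed in the O_B frame to c

# Interchanging the observers amounts to replacing v by -v.
print(boost(*boost(2.0, 3.0e8, v), -v))
```

Writing the inverse as a boost with velocity $-v$ is exactly the statement that the determinant of the transformation equals 1.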

Since there's no motion in the other two directions, the complete 4-dimensional Lorentz transformation includes the additional equations $y_B = y_A$, $z_B = z_A$. If $v/c \ll 1$, this reduces to the Galilean transformation, as expected.

It is customary, and quite useful, to introduce a system of units in which $c = 1$. We do this by defining $\tilde{x}_A = x_A/c$, and similarly $\tilde{x}_B = x_B/c$. If the original units were MKS, then $\tilde{x}_A$ has units of seconds, and $x_A = 1$ km means $\tilde{x}_A = x_A/c \approx 3.3 \times 10^{-6}$ sec = the time required for a photon to go 1 km, or 1 light-kilometer. [2] Similarly, $v$ is replaced by $\tilde{v} = v/c$. Removing the tildes, we get a more symmetric looking form for the equations:

$$t_B = \alpha(t_A - vx_A) \qquad (10)$$
$$x_B = \alpha(x_A - vt_A) \qquad (11)$$

where

$$\alpha = \frac{1}{\sqrt{1 - v^2}}. \qquad (12)$$

We shall use whichever system of units seems appropriate to the subject.

Definition: This particular Lorentz transformation is called a boost in the x-direction. There are likewise boosts in the y and z directions.

We can already see the non-Euclidean nature of spacetime from the boost. We set up coordinates for $O_A$ in the usual way, with the time axis $t_A$ as a vertical line (in spacetime diagrams, time always increases from the bottom to the top of the figure), and the space axis $x_A$ horizontal:

[Figure: spacetime diagram showing the axes $t_A$, $x_A$ of $O_A$, the axes $t_B$, $x_B$ of $O_B$, and a light ray.]

$O_A$'s time axis is given by the equation $x_A = 0$, and his space axis by $t_A = 0$. Constant time surfaces for this observer are given by the equations $t_A$ = const, and consist of the set of all lines parallel to the space axis. There's one light ray shown, and in these units ($c = 1$), it must have slope 1. The time axis for $O_B$, given by $x_B = 0$, is given in this coordinate system by $0 = \alpha(x_A - vt_A)$, or $t_A = (1/v)x_A$. It has slope $1/v > 1$. The space axis, given by $t_B = 0$, that is $t_A = vx_A$, has slope $v < 1$.

[2] Another way to set $c = 1$ is to use $ct$ instead of $t$. Then time is measured in light-seconds - the distance travelled by light in one second.

Operational definition of spacetime coordinates

The alert reader has noticed that nothing has been said about how these coordinates are to be assigned. The manner in which this is done must be consistent both with Einstein's postulates and with what we actually do (in principle). The simplest solution is to use "radar-ranging": we suppose that, at each event, there's a 0-dimensional spherical mirror which reflects photons back along their incoming directions. We also suppose that each inertial observer possesses a standard clock - standard in the sense that they all keep identical time whenever they're at rest in the same frame. Each observer is also equipped with a cool flashlight which emits single photons in any direction the observer chooses, at any time.

The observer $O_A$ is located at $x_A = y_A = z_A = 0$, and $t_A$ is the time registered on his [3] clock, so his world-line or path in spacetime is the parametric curve $t_A \mapsto (t_A, 0, 0, 0)$ in his coordinate system. An event $E$ not on $O_A$'s world line is assigned coordinates as follows: a photon is emitted by $O_A$ at time $t_A = t_1$, bounces off the mirror at $E$, and is reflected back to $O_A$, where it's received at time $t_A = t_2$. (These times are unique! See the figure on the next page.) Then the time of the event $E$ is the midpoint of the time interval $[t_1, t_2]$, and the distance $r_A$ from $O_A$ to $E$ is half the distance travelled by the photon:

$$t_A(E) = \tfrac{1}{2}(t_2 + t_1), \qquad r_A(E) = \tfrac{c}{2}(t_2 - t_1).$$

The $x, y, z$ coordinates can be assigned once the distance $r_A(E)$ is known, since $O_A$ also knows the angular coordinates $(\theta, \phi)$ of $E$ from the direction in which the flashlight was pointing. To simplify things, we continue with our 2-dimensional spacetime, in which

$$x_A(E) = \pm\tfrac{c}{2}(t_2 - t_1),$$

with the positive sign being taken if the direction of $E$ is along the positive $x_A$ axis. From now on, we'll revert to units in which $c = 1$.
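The radar prescription above translates directly into a few lines of code. This is a minimal sketch for the 2-dimensional case; the function names `radar_coordinates` and `echo_times` are hypothetical, and `echo_times` is only a helper for generating test data (the emission and reception times an observer at $x_A = 0$ would record for a given event).

```python
def radar_coordinates(t1, t2, direction=+1, c=1.0):
    """Coordinates (t, x) assigned to an event from the emission time t1
    and reception time t2 of the reflected photon.

    direction is +1 if the flashlight pointed along the positive x axis,
    and -1 otherwise.
    """
    t_event = 0.5 * (t2 + t1)
    x_event = direction * 0.5 * c * (t2 - t1)
    return t_event, x_event


def echo_times(t_event, x_event, c=1.0):
    """Helper for testing: the times (t1, t2) at which the observer at x = 0
    must emit, and will receive, a photon reflected at (t_event, x_event)."""
    travel = abs(x_event) / c
    return t_event - travel, t_event + travel


t1, t2 = echo_times(4.0, 2.0)
print(t1, t2)                      # emission and reception times
print(radar_coordinates(t1, t2))   # recovers the event coordinates
```

The round trip from an event to its echo times and back is the consistency check: the assignment uses nothing but the observer's own clock and the constancy of $c$.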
Spacetime Diagrams

We draw the worldline of observer A as a straight line, with $t_A$ increasing from the bottom to the top. The worldlines of photons will have slope $\pm 1$ in these units, and will be drawn with dotted lines. The first figure illustrates some photon worldlines, used by $O_A$ to assign coordinates to a few events. The two events $E$ and $F$ are simultaneous with respect to $O_A$ since $t_A(E) = t_A(F)$.

It's important to note that worldlines are sequences or histories of events, and don't move around as we change coordinates. These same photon worldlines are used by all inertial observers when they assign coordinates to these same events. In the second figure, we've added observer B, and evidently the events $E$ and $F$ are not simultaneous in the $O_B$ frame. These figures indicate clearly the "relativity of simultaneity"; it is an observer-dependent phenomenon.

[Figure 1: $O_A$ assigns coordinates using radar. Note that $E$ and $F$ are simultaneous events for $O_A$. The segment $S_A$ is part of the surface given by $t_A$ = const. in the $O_A$ frame.]

[Figure 2: A different observer $O_B$ has a different set of events which are simultaneous with $E$. The surfaces $S_A$ and $S_B$ consist of events which are simultaneous with respect to $O_A$ and $O_B$.]

Note: The angles formed by the space and time axes are not observables. We are representing our 2-dimensional spacetime in the Euclidean plane, but, as we'll see in a bit, the relativistic geometry of this plane is non-Euclidean.

[3] It is really too much trouble to rework every sentence so that it's neutered. Apologies for any offense this causes.

The k-factor

[Figure 3: worldlines of $O_A$ and $O_B$, with two photons emitted by $O_A$ an interval $T$ apart and received by $O_B$ at the events $E_1$ and $E_2$, an interval $kT$ apart.]

In Figure 3, $O_A$ emits two photons at times $t_A = t_1$ and $t_A = t_2$, with $t_2 - t_1 = \Delta t_A$. They'll be received by $O_B$ at times $t_B = t_1'$ and $t_B = t_2'$, $\Delta t_B = t_2' - t_1'$ seconds apart. If we set $T = \Delta t_A$, then evidently $\Delta t_B = kT$ for some positive number $k$.

In $O_A$'s system, the first photon's world line is given parametrically by $s \mapsto (t_1 + s, s)$, and events on $O_B$'s world line satisfy $x_A = vt_A$. So the 2 lines intersect when $v(t_1 + s) = s$, or when $s = vt_1/(1 - v)$, and the corresponding event has the coordinates

$$(t_A(E_1), x_A(E_1)) = \left(t_1 + \frac{vt_1}{1 - v}, \frac{vt_1}{1 - v}\right) = t_1\left(\frac{1}{1 - v}, \frac{v}{1 - v}\right),$$

with a similar expression for $E_2$. So the vector $E_2 - E_1$ has the $O_A$-components (which should not be confused with the emission interval $T$)

$$(\Delta t_A, \Delta x_A) = T\left(\frac{1}{1 - v}, \frac{v}{1 - v}\right),$$

and therefore, by (10),

$$\Delta t_B = \frac{1}{\sqrt{1 - v^2}}(\Delta t_A - v\Delta x_A) = T\frac{\sqrt{1 - v^2}}{1 - v}.$$

We call

$$k_{AB} = \sqrt{\frac{1 + v}{1 - v}}$$

the k-factor, and note that it's independent of $T$. It's better known as the relativistic Doppler shift: if $\lambda$ is the wavelength as measured by $O_A$, then $O_B$ finds it to be $k_{AB}\lambda$. The reader should be able to verify that $k_{BA} = k_{AB}$, and that $k_{AC} = k_{AB}k_{BC}$ if $O_C$ is a third observer receding from both $O_A$ and $O_B$.
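The derivation of the k-factor can be replayed numerically: find the two reception events in $O_A$'s coordinates, transform the time difference to $O_B$'s frame with equation (10), and compare with the closed form. The sketch below uses units with $c = 1$; the function names and the sample values of $v$, $T$ and $t_1$ are illustrative assumptions, not part of the notes.

```python
import math


def k_factor(v):
    """Closed form k = sqrt((1 + v) / (1 - v)) for recession speed v (c = 1)."""
    return math.sqrt((1.0 + v) / (1.0 - v))


def reception_event(t_emit, v):
    """Event (t_A, x_A) at which a photon emitted by O_A at time t_emit,
    travelling in the +x direction, meets the worldline x_A = v t_A of O_B."""
    s = v * t_emit / (1.0 - v)
    return t_emit + s, s


def reception_interval(T, v, t1=1.0):
    """Time between the two receptions as measured by O_B, via equation (10)."""
    ta1, xa1 = reception_event(t1, v)
    ta2, xa2 = reception_event(t1 + T, v)
    a = 1.0 / math.sqrt(1.0 - v * v)
    return a * ((ta2 - ta1) - v * (xa2 - xa1))


v, T = 0.6, 2.0
print(reception_interval(T, v), k_factor(v) * T)

# The result should not depend on when the first photon is emitted.
print(reception_interval(T, v, t1=5.0))
```

For $v = 0.6$ the closed form gives $k = \sqrt{1.6/0.4} = 2$, so photons emitted 2 seconds apart are received 4 seconds apart.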

If, instead, $O_B$ is approaching $O_A$, then

$$k \to 1/k = \sqrt{\frac{1 - v}{1 + v}}.$$

We can use the k-factor to derive the expression for the relativistic addition of velocities. First, invert the formula for $k$, obtaining

$$v = \frac{k^2 - 1}{k^2 + 1}.$$

Then adding a third observer $O_C$, we have

$$v_{AC} = \frac{k_{AC}^2 - 1}{k_{AC}^2 + 1}.$$

Substituting $k_{AC} = k_{AB}k_{BC}$ leads, after a bit of algebra, to the result

$$v_{AC} = \frac{v_{AB} + v_{BC}}{1 + v_{AB}v_{BC}},$$

which differs from the Galilean result because of the denominator. In particular, if $v_{AB} = v = v_{BC}$, then $v_{AC} = 2v/(1 + v^2)$, not the $2v$ we'd expect.

Length contraction and time dilation

The factor $\alpha(v) = \dfrac{1}{\sqrt{1 - v^2}}$, which appears prominently in the Lorentz transformation, has some surprising effects.

If $E_1, E_2$ are two events on the worldline of $O_A$, with coordinates $(t_1, 0)$, $(t_2, 0)$, then $\Delta t_A = t_2 - t_1$. We know that $\Delta t_B = \alpha(v)\Delta t_A$ (since $\Delta x_A = 0$ for these events). Hence

$$\frac{\Delta t_B}{\Delta t_A} = \alpha(v) > 1 \quad\text{if } v \neq 0.$$

If $\Delta t_A$ is the time between 2 ticks of A's clock, then B reports that these ticks occur farther apart than $t_2 - t_1$ - that is, according to B, A's clock is running more slowly than B's. (Of course, the two observers are symmetric here: A says the same about B's clock.) This is the phenomenon of time dilation.

This is a very real phenomenon: muons are created in the upper atmosphere by cosmic rays, typically with huge velocities, say $.99c$. If the muons are created at a height of 100 km, they will reach the surface of the earth in $100/(.99c) \approx 3.3 \times 10^{-4}$ seconds. This time, measured in a lab frame on the surface of the earth, is far greater than the mean lifetime of a muon at rest (about $2.2 \times 10^{-6}$ seconds). So it would seem that we shouldn't see very many; nearly all should have decayed before reaching the earth's surface. But... this is not the time experienced in the rest frame of the muons (which is all the muons care about!). In that frame, the elapsed time between the two events (creation of the muon and its detection at the surface) can be found by noting that $\Delta t_{lab} = \alpha\,\Delta t_\mu$ (since $\Delta x_\mu = 0$).
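So $\Delta t_\mu = (1/\alpha)\Delta t_{lab} \approx 4.6 \times 10^{-5}$ seconds, nearly an order of magnitude less. So a far larger fraction of the muons survives the trip than the lab-frame estimate suggests, and they are indeed observed at the surface.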
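The arithmetic of the muon example can be reproduced in a few lines. This is a sketch under the assumptions stated in the text (height 100 km, speed $.99c$); the variable and function names are hypothetical, and the speed of light is taken as 299,792.458 km/s. The results agree with the rounded figures quoted above to within a few percent.

```python
import math

C_KM_PER_S = 299_792.458  # speed of light in km/s


def alpha(beta):
    """Time-dilation factor 1 / sqrt(1 - beta^2), with beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def travel_times(height_km, beta):
    """Return (lab-frame time, muon rest-frame time) in seconds for a muon
    covering height_km at speed beta * c."""
    t_lab = height_km / (beta * C_KM_PER_S)
    t_muon = t_lab / alpha(beta)
    return t_lab, t_muon


t_lab, t_muon = travel_times(100.0, 0.99)
print(f"alpha  = {alpha(0.99):.3f}")
print(f"t_lab  = {t_lab:.3e} s")
print(f"t_muon = {t_muon:.3e} s")
```

The ratio of the two times is just $\alpha(.99) \approx 7.09$, independent of the height chosen.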

Now consider a ruler with length $L$, at rest in A's frame. When we say it has length $L$, we mean $\Delta x_A = L$, where $\Delta x_A = x_A(E_2) - x_A(E_1)$ for two simultaneous events $E_1, E_2$ on the two worldlines corresponding to the ends of the ruler. Now observer B, moving with velocity $v$, can also measure the length of the ruler, using the same prescription: he computes $\Delta x_B$ for two events on the ends of the ruler which are simultaneous for him.

[Figure 4: The edges of the ruler are $O_A$'s world line and the line parallel to it. Surfaces of equal time for the two observers are $S_A$ and $S_B$. When $O_A$ measures the length of the ruler, he computes $\Delta x_A = L$ for $E_2 - E_1$. $O_B$, on the other hand, computes $\Delta x_B = L'$ for $E_2' - E_1$.]

We do the computation in A's frame. We have $E_2' - E_1 = (E_2 - E_1) + (E_2' - E_2)$, so the components of this vector in $O_A$ are $(0, \Delta x_A) + (\Delta t_A, 0)$. Now $\Delta x_A = L$ and $\Delta t_A = v\Delta x_A = vL$. (The equation of the line $S_B$ is $x = (1/v)t$ in $O_A$.) So $(\Delta t_A, \Delta x_A) = (vL, L)$, and the Lorentz transformation gives us

$$\Delta x_B = \alpha(v)(\Delta x_A - v\Delta t_A) = \sqrt{1 - v^2}\,\Delta x_A, \quad\text{or}\quad L' = \sqrt{1 - v^2}\,L.$$

This is the Lorentz-Fitzgerald contraction: the ruler is in motion according to $O_B$, and its length appears shorter by a factor of $1/\alpha$.

The spacetime interval

As we've seen, measurements of distance and time are observer-dependent, as is the notion of simultaneity. One might wonder if there's any observable which is independent of relative motion. Indeed there is:

Definition: The spacetime interval between two events is $\Delta\tau^2$, given in conventional units by

$$c^2\Delta\tau^2 = c^2\Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2,$$

where the coordinate differentials are computed in any inertial frame, with $c^2 = 1$ in simplified units. [4]

[4] Some authors reverse the signs here, and write something like $ds^2 = dx^2 + dy^2 + dz^2 - c^2dt^2$.

Theorem: The number $\Delta\tau^2$ depends only on the two events, and not on the inertial frame in which it's computed.

Proof: Just write it out, using equations (10) and (11). This supplies the proof for the special 2-dimensional Lorentz transformation we've been studying. It's called a boost in the x direction. There are boosts in the other two spatial directions as well, and the proof for these is the same, with $x$ replaced by either $y$ or $z$.

$\Delta\tau^2$ is also invariant under spatial rotations: these are given by $(t, \mathbf{x}) \to (t, R\mathbf{x})$, where $R^tR = I$ and $\det(R) = 1$. Under a rotation, $\Delta t$ doesn't change, and we know from Pythagoras' theorem that the sum of the squares of the spatial coordinate differentials is invariant.

This gives us 6 parameters: boosts and rotations along and about the $(x, y, z)$ axes. It's not hard to show that the interval is invariant under finite products of these transformations as well. The set of all such products is a 6-dimensional group called the Lorentz group. More precisely, it's called the proper, orthochronous Lorentz group and is denoted in the literature by $L_+^\uparrow$. If we add into the mix the transformations of time-reversal and spatial reflection ($\mathbf{x} \to -\mathbf{x}$), we get the full Lorentz group. In this course, we'll be happy with just $L_+^\uparrow$.

The interval is also invariant under (constant) space-time translations of the form $(t, \mathbf{x}) \to (t + a^0, \mathbf{x} + \mathbf{a})$. These are not homogeneous (the origin is not a fixed point), so they're not represented by matrices. Clearly the coordinate differentials $\Delta t$, $\Delta x$, etc. remain unchanged under translations, and so does the interval. Composing the translations with the homogeneous ($L_+^\uparrow$) ones gives us a 10-parameter group called the (proper orthochronous) Poincaré group, denoted $P_+^\uparrow$.

Starting with one inertial frame, all the others can be obtained from it by one of the transformations in the Poincaré group, so the theorem is proved.

Exercise: If $P_+^\uparrow = \{(L, a) : L \in L_+^\uparrow, \text{ and } a \in \mathbb{R}^4\}$, find the multiplication law for the group.
Invariance of the spacetime interval is the relativistic analog of Pythagoras's theorem: in $\mathbb{R}^3$, the quantity $\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2$ is invariant under the Euclidean group of rigid motions. These include rotations about the origin, spatial translations, and their compositions. In this context, the word invariant means that (a) one computes $\Delta s^2$ using the same formula in any frame obtained from the initial Cartesian one by a rigid motion, and (b) one gets the same number; it depends only on the pair of points, and not on their coordinates.
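The "just write it out" step of the proof can also be spot-checked numerically for the boost (10)-(11). The following is a minimal sketch in units with $c = 1$; the names `boost` and `interval_squared` and the sample separations are illustrative assumptions. Because the boost is linear, it can be applied directly to the coordinate differences $(\Delta t, \Delta x)$.

```python
import math


def boost(dt, dx, v):
    """Equations (10) and (11), applied to coordinate differences (c = 1)."""
    a = 1.0 / math.sqrt(1.0 - v * v)
    return a * (dt - v * dx), a * (dx - v * dt)


def interval_squared(dt, dx, dy=0.0, dz=0.0):
    """The invariant dt^2 - dx^2 - dy^2 - dz^2 in units with c = 1."""
    return dt**2 - dx**2 - dy**2 - dz**2


# Separations chosen for illustration: one timelike, one spacelike, one null.
for dt, dx in [(3.0, 1.0), (1.0, 2.0), (1.0, 1.0)]:
    for v in (0.1, 0.6, -0.9):
        dt_b, dx_b = boost(dt, dx, v)
        print(interval_squared(dt, dx), interval_squared(dt_b, dx_b))
```

Each printed pair should agree up to rounding error, whatever the sign of the interval.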

Minkowski space

Euclidean geometry is the study of objects in $\mathbb{R}^n$ whose properties (such as angles, areas, etc.) are determined by quantities invariant under rigid motions. Similarly, special relativity is the study of objects in $\mathbb{R}^4$ whose properties are determined by Poincaré invariant quantities. The space $\mathbb{R}^4$, together with the invariant interval $\Delta\tau^2$, is known as 4-dimensional Minkowski space or $M^4$.

A vector in $M^4$ will be represented by uppercase Roman letters: $U$, $V$, etc. In the reference frame of $O_A$ (who has now been promoted 2 dimensions in rank), we write $U_A = (\Delta t_A, \Delta\mathbf{x}_A)$, and the invariant $\Delta\tau^2$ for the vector $U$ will be written as

$$U \cdot U = \Delta t_A^2 - \Delta\mathbf{x}_A \cdot \Delta\mathbf{x}_A.$$

No subscript on the left hand side is needed, since this is independent of the reference frame.

The non-Euclidean nature of Minkowski space follows from the fact that, despite the way it's written, the interval $\Delta\tau^2$ can be either positive, negative or zero (unlike $\Delta s^2$, which is positive, except for $\Delta\mathbf{x} = 0$).

Definition: The 4-vector $U \neq 0$ is said to be timelike if $U \cdot U > 0$, spacelike if $U \cdot U < 0$, and null if $U \cdot U = 0$.

If $E_1$ and $E_2$ are any two events on the worldline of an observer, then the 4-vector $U = E_2 - E_1$ is timelike. This is because, in the inertial frame of the observer, $\Delta t$ is the only non-zero component of $U$, and so $U \cdot U = \Delta t^2 > 0$. Since this holds in one inertial frame, it holds in any such frame: in another frame, we'll have $U_A = (\Delta t_A, \Delta\mathbf{x}_A)$, but $U \cdot U = \Delta t_A^2 - \Delta\mathbf{x}_A \cdot \Delta\mathbf{x}_A = \Delta t^2 > 0$.

If $E_1$ and $E_2$ are any two events which are simultaneous with respect to some inertial observer, then $U = E_2 - E_1$ is spacelike, since in the observer's frame, $\Delta t = 0$ and $\Delta\mathbf{x} \neq 0$. So $U \cdot U = -\Delta\mathbf{x} \cdot \Delta\mathbf{x} < 0$. And, to be repetitive, if this holds in one frame, it holds in all.

And finally, if the two events both lie on the path of a photon, $U$ is null, since in any frame, $U = (\Delta t, \Delta\mathbf{x}) = \Delta t(1, \Delta\mathbf{x}/\Delta t)$, and $|\Delta\mathbf{x}/\Delta t| = c = 1$ (Einstein's postulate!) in these units. So $U \cdot U = \Delta t^2(1 - 1) = 0$.
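The classification of 4-vectors given above is a one-line test on the sign of $U \cdot U$. Here is a small sketch in units with $c = 1$, with vectors stored as tuples $(\Delta t, \Delta x, \Delta y, \Delta z)$; the function names are hypothetical, and the optional tolerance is an addition for floating-point input rather than anything in the notes.

```python
def minkowski_dot(U, V):
    """U . V = t1 t2 - x1 x2 - y1 y2 - z1 z2 for 4-vectors (t, x, y, z)."""
    return U[0] * V[0] - U[1] * V[1] - U[2] * V[2] - U[3] * V[3]


def classify(U, tol=0.0):
    """Return 'timelike', 'spacelike' or 'null' for a non-zero 4-vector U.

    tol is an assumed tolerance for treating U . U as zero when the
    components come from floating-point computations.
    """
    if all(component == 0 for component in U):
        raise ValueError("the definition applies to non-zero vectors only")
    s = minkowski_dot(U, U)
    if s > tol:
        return "timelike"
    if s < -tol:
        return "spacelike"
    return "null"


print(classify((1, 0, 0, 0)))   # two events on an observer's worldline
print(classify((0, 1, 0, 0)))   # two simultaneous events
print(classify((5, 3, 0, 4)))   # two events on a photon path: 25 - 9 - 16 = 0
```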
In the figure below (2 spatial dimensions suppressed), the time axis of an observer is indicated by the dotted line. One of the missing dimensions can be recovered by rotating the figure about this time axis. $A$ and $B$ are future-pointing timelike vectors lying in the future null cone of the event $O$. Both these vectors have the same length since their tips lie on the pseudosphere of radius 1. Either could be made to have any apparent Euclidean length by sliding them farther out on the hyperboloid which is asymptotic to the null cone. But there's no physics here, since Euclidean length is not an observable. $C$ is a spacelike vector, and $D$ is a past-pointing timelike vector. The null vectors at $O$ have their tips on the null cone.

[Figure 5: the null cone at $O$, the hyperboloid $\Delta\tau^2 = 1$, $t > 0$, and the vectors $A$, $B$, $C$, $D$. The Minkowski analogs of spheres in Euclidean space are the 3-dimensional surfaces given by $\Delta\tau^2$ = constant, and called pseudo-spheres. In some (hence any) inertial frame, they are the points satisfying an equation of the form $t^2 - x^2 - y^2 - z^2 = a$, for some number $a$. If $a > 0$, then the pseudo-sphere is a 2-sheeted hyperboloid which is asymptotic to the null cone, defined as the pseudo-sphere with radius $a = 0$.]

The geometry of a boost can be understood by changing to null coordinates. We just diagonalize the Lorentz transformation given by the matrix

$$L = \alpha(v)\begin{pmatrix} 1 & -v \\ -v & 1 \end{pmatrix}.$$

The eigenvalues are $\lambda_1 = \alpha(v)(1 + v) = k$, $\lambda_2 = \alpha(v)(1 - v) = 1/k$, with eigenvectors $\xi_1 = (1, -1)^t$, $\xi_2 = (1, 1)^t$. The new coordinates are

$$\mu = t + x, \qquad \nu = t - x,$$

and in the basis $\{\xi_1, \xi_2\}$ the transformation becomes

$$\hat{L} = \begin{pmatrix} k & 0 \\ 0 & k^{-1} \end{pmatrix}$$

(so that $\nu \to k\nu$ and $\mu \to \mu/k$). The coordinates are called null because the lines $\mu$ = constant and $\nu$ = constant are just the two families of parallel null rays in $M^2$. Under the Lorentz transformation, the null vectors at $O$ are rescaled by factors of $k$ and $k^{-1}$ respectively. So points along the null ray parallel to $\xi_1$ are moved outward along the ray, while those along the ray parallel to $\xi_2$ are moved inward (toward the origin). Points on the pseudosphere $\Delta\tau^2 = 1$ (which is invariant) are slid around to the left, and so on.

Remark: The notion of future-pointing timelike and null vectors is invariant under transformations in $L_+^\uparrow$. For suppose that $U$ is a future-pointing timelike vector with $U \cdot U = a^2$. Since the quantity $U \cdot U$ is invariant, the only possibility for $LU$ to be past-pointing is for $L$ to interchange the two sheets $\Delta\tau = a$ and $\Delta\tau = -a$ of the hyperboloid. But this transformation involves time-reversal, which, by construction, is not in $L_+^\uparrow$.
Similarly for null vectors.
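The diagonalization of the boost in null coordinates, described earlier in this section, can be verified by direct multiplication. The sketch below works in $M^2$ with $c = 1$ and plain Python lists; the helper names `boost_matrix` and `apply` are assumptions made for the illustration. It checks that $\xi_1 = (1, -1)^t$ is stretched by $k$, that $\xi_2 = (1, 1)^t$ is shrunk by $1/k$, and that the null coordinates $\mu = t + x$, $\nu = t - x$ are rescaled by $1/k$ and $k$ respectively.

```python
import math


def boost_matrix(v):
    """The 2x2 matrix L = alpha(v) [[1, -v], [-v, 1]] acting on (t, x)."""
    a = 1.0 / math.sqrt(1.0 - v * v)
    return [[a, -a * v], [-a * v, a]]


def apply(M, u):
    """Matrix-vector product for a 2x2 matrix and a 2-component vector."""
    return [M[0][0] * u[0] + M[0][1] * u[1], M[1][0] * u[0] + M[1][1] * u[1]]


v = 0.6
k = math.sqrt((1.0 + v) / (1.0 - v))
L = boost_matrix(v)

print(apply(L, [1.0, -1.0]), k)      # L xi_1 compared with k xi_1
print(apply(L, [1.0, 1.0]), 1 / k)   # L xi_2 compared with xi_2 / k

# Null coordinates of an arbitrary event before and after the boost.
t, x = 2.0, 0.5
t_b, x_b = apply(L, [t, x])
print((t_b + x_b) / (t + x), (t_b - x_b) / (t - x))  # ratios 1/k and k
```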

The physical interpretation of $\Delta\tau$

Definition: We write $A \ll B$, read "A precedes B", if the vector $B - A$ is future-pointing and timelike. The set $I^+(A) = \{B \text{ such that } A \ll B\}$ is called the future of $A$.

$I^+(A)$, together with the future null cone of $A$, is the set of all events which can be influenced by $A$. If $A \ll B$, then there's an inertial observer present at both events, and there is no ambiguity concerning which event occurs first. For this observer, the components of the vector $U = B - A$ take the form $(\Delta t, 0)$. So if $A \ll B$, the elapsed time between the two events recorded by this observer is just $\Delta\tau$, which is therefore called the proper time between the two events.

Suppose $u \mapsto \gamma(u)$ is a smooth parametric curve, with $\gamma(u_1) = A$, $\gamma(u_2) = B$. We say that $\gamma$ is a timelike curve from $A$ to $B$ if, at each event, the tangent vector $d\gamma/du$ is timelike and future pointing. The characterization is invariant under reparametrization $u \to \bar{u}(u)$, provided that $d\bar{u}/du > 0$. We now define the infinitesimal

$$d\tau = \sqrt{d\tau^2} = \sqrt{dt^2 - d\mathbf{x} \cdot d\mathbf{x}} = \sqrt{1 - v^2}\,dt, \qquad v = |d\mathbf{x}/dt|,$$

and we compute, in a relativistic analogy to arc length in Euclidean space, the quantity

$$L[\gamma] = \int_\gamma d\tau = \int_{u_1}^{u_2} \sqrt{\frac{d\gamma}{du} \cdot \frac{d\gamma}{du}}\;du.$$

$L[\gamma]$ is the elapsed time between the events $A$ and $B$ recorded by the clock of an observer whose world line contains the segment $\gamma$. There's no requirement that the observer be inertial. $L[\gamma]$ is independent of the coordinate system, and of the parametrization of the curve itself. It's an invariant of the curve in $M^4$.

The twin paradox

[Figure: worldlines of two observers, $\beta$ (straight) and $\gamma$ (curved), both passing through the events $A$ and $B$.]

This is a non-paradox which is nevertheless very interesting. The figure shows the worldlines of two observers, one of whom ($\beta$) is inertial, and the other ($\gamma$) not. Both are present at the two events $A$ and $B$. We compute in the frame of the observer $\beta$; $A$ and $B$ will have the local coordinates $(t_1, 0)$ and $(t_2, 0)$ respectively, with $t_2 > t_1$. $\gamma$ will be given parametrically by something of the form $t \mapsto (t, \mathbf{x}(t))$. We have

$$\int_\gamma d\tau = \int_{t_1}^{t_2} \sqrt{1 - v^2}\,dt \le \int_{t_1}^{t_2} dt = \int_\beta d\tau.$$
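The inequality above can be illustrated numerically by approximating a worldline $t \mapsto (t, x(t))$ with short straight segments and summing $\sqrt{\Delta t^2 - \Delta x^2}$ over them. The sketch below is a rough illustration in units with $c = 1$; the function name `proper_time`, the step count, and the out-and-back trajectory $x(t) = 2\sin(\pi t/10)$ are all assumptions chosen for the example (its speed never exceeds $2\pi/10 < 1$).

```python
import math


def proper_time(x_of_t, t1, t2, steps=10_000):
    """Approximate the proper time along t -> (t, x_of_t(t)) between t1 and t2
    by summing sqrt(dt^2 - dx^2) over short straight segments (c = 1)."""
    h = (t2 - t1) / steps
    total = 0.0
    for i in range(steps):
        ta = t1 + i * h
        dx = x_of_t(ta + h) - x_of_t(ta)
        total += math.sqrt(h * h - dx * dx)
    return total


def stay_at_home(t):
    """The inertial observer beta, at rest at x = 0."""
    return 0.0


def traveller(t):
    """A hypothetical non-inertial observer gamma: leaves x = 0 at t = 0
    and returns at t = 10."""
    return 2.0 * math.sin(math.pi * t / 10.0)


print(proper_time(stay_at_home, 0.0, 10.0))  # elapsed time for beta
print(proper_time(traveller, 0.0, 10.0))     # elapsed time for gamma, smaller
```

Any other timelike trajectory with the same endpoints could be substituted for `traveller`; the inertial value is the upper bound in every case.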

Since the inequality holds for all such curves $\gamma$, we have the following:

The proper time elapsed between two events $A \ll B$ is maximized by an inertial observer present at both events.

There is no symmetry in this situation, and it is not true that "all motion is relative", or some such nonsense. Yes, both observers are in motion relative to one another, but only one is inertial. So there's no paradox in the following: one of two twins stays at rest in an inertial frame in the solar system, while the other makes a round trip to Alpha Centauri. When they're reunited, the twin who stayed behind is older (exactly how much depends on the details of the younger one's trip, since his worldline is not that of an inertial observer). This result stands in obvious contrast to the Euclidean result, which is that distance is minimized along straight lines!

Exercise:

1. What is the resolution of the gedanken experiment mentioned before?
2. Show that if $|v|, |w| < 1$, then $\left|\dfrac{v + w}{1 + vw}\right| < 1$ and conversely. What does this say about the possibility of accelerating past the speed of light?
3. A particle moves rectilinearly in $O_A$'s and hence any inertial frame. Suppose its velocity in this frame is $\mathbf{u}_A = \Delta\mathbf{x}_A/\Delta t_A$. Its velocity in the frame of $O_B$ is $\mathbf{u}_B = \Delta\mathbf{x}_B/\Delta t_B$. Express the components of $\mathbf{u}_B$ in terms of those of $\mathbf{u}_A$.
4. Observer B is cruising down the common x axis of both A and B. Observer A has a garage which is 10 feet long. Observer B is carrying a 20 ft pole. How fast must B be moving in order for the pole to fit in A's garage? (Which observer sees this? What does the other observer see?) It helps to draw a spacetime diagram here. How is it possible they observe different things if the motion is uniform and rectilinear?
5. $\alpha$-Centauri is 4 light years from the Earth. A rocket travels with constant speed $c/2$ from here to there and back. How much time elapses on the Earth between departure and arrival of the rocket? How much time elapses on the rocket?
What happens as the speed of the rocket approaches $c$?

The metric tensor and $\Delta\tau^2$

The expression $\mathbf{x} \cdot \mathbf{x}$ in Euclidean 3-space gives the squared length of the vector $\mathbf{x}$. It's derived from the dot product $\mathbf{x} \cdot \mathbf{y}$. This is an object with the following properties which you'll recognize:

1. It is bilinear (linear in both $\mathbf{x}$ and $\mathbf{y}$), meaning
(a) $(c_1\mathbf{x}_1 + c_2\mathbf{x}_2) \cdot \mathbf{y} = c_1\mathbf{x}_1 \cdot \mathbf{y} + c_2\mathbf{x}_2 \cdot \mathbf{y}$.

(b) $\mathbf{x} \cdot (c_1\mathbf{y}_1 + c_2\mathbf{y}_2) = c_1\mathbf{x} \cdot \mathbf{y}_1 + c_2\mathbf{x} \cdot \mathbf{y}_2$, for any real $c_1, c_2$.
2. It is symmetric: $\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}$.
3. It is non-degenerate: $\mathbf{x} \cdot \mathbf{y} = 0\ \forall \mathbf{y} \Rightarrow \mathbf{x} = 0$.
4. It is positive-definite: $\mathbf{x} \cdot \mathbf{x} \ge 0$, with strict inequality unless $\mathbf{x} = 0$.

The first two properties assert that the dot product is a symmetric bilinear form, and can be represented as $\mathbf{x} \cdot \mathbf{y} = \mathbf{x}^t g \mathbf{y}$, where $g$ is a $3 \times 3$ symmetric matrix. Property (3) asserts that $\det(g) \neq 0$, and (4) means precisely that $\mathbf{x}^t g \mathbf{x} > 0,\ \forall \mathbf{x} \neq 0$. In the usual Cartesian coordinates, $g = I$, of course, and this just reads $\mathbf{x} \cdot \mathbf{y} = \mathbf{x}^t\mathbf{y}$.

Under a linear change of basis in the vector space $\mathbb{R}^3$ given by the non-singular matrix $P$, we'll have new quantities $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{g}$ obeying the usual formulae:

$$\hat{\mathbf{x}} = P^{-1}\mathbf{x}; \qquad \mathbf{x}^t\mathbf{y} = \hat{\mathbf{x}}^t\hat{g}\hat{\mathbf{y}} = \mathbf{x}^t(P^t)^{-1}\hat{g}P^{-1}\mathbf{y},\ \forall \mathbf{x}, \mathbf{y} \Rightarrow \hat{g} = P^tP.$$

If the change of basis is orthogonal, then $P^tP = I$, and the numerical form of the dot product is unchanged, meaning that it's computed in the new (Cartesian) coordinate system using the same formula as in the original system. The dot product is invariant under orthogonal transformations. The bilinear form represented by the dot product in Euclidean space is also called the metric tensor of Euclidean space.

In $M^4$, in contrast, we have the bilinear form associated to $\Delta\tau^2$, which we can write as

$$U \cdot V = t_1t_2 - x_1x_2 - y_1y_2 - z_1z_2 = U^t g V,$$

where $g = \mathrm{Diag}\{1, -1, -1, -1\}$. The relativistic bilinear form $g$ satisfies all of the properties of the dot product, except for (4): it is not positive-definite, due to the existence of null and spacelike vectors. In particular, there's no way (consistent with the postulates of relativity) to define a distance function with the same sort of properties we get from $\mathbf{x} \cdot \mathbf{y}$ in Euclidean space. Certainly $\Delta\tau$ does not behave like a distance function:

Example: the reverse triangle inequality

In Euclidean space, for any three points $A, B, C$, we have $d(A, C) \le d(A, B) + d(B, C)$, where $d(A, C)$ is the distance from $A$ to $C$.
In Minkowski space, if $A < B < C$ (each event lying in the future of the one before), and we write $\delta(A, B)$ for $\tau$ of the vector $B - A$, we have shown above (the twin non-paradox) that $\delta(A, C) \ge \delta(A, B) + \delta(B, C)$.

The Lorentz metric tensor represented by the form $U \cdot V$ is the fundamental object in relativity. It is more usual to write $g(U, V)$ instead of $U \cdot V$, and this is the convention we will (eventually) adopt. We may also use the form $U^t g V$ to avoid polluting the computations with indices (although in GR, we'll have to give this up). Any of these expressions is fine.

Under a change of inertial frame, if $\hat{U} = L^{-1} U$, where $L$ is the matrix of a Lorentz transformation, we must have $g = \hat{g}$ due to the invariance of $\tau^2$. Thus

\[ U^t g U = \hat{U}^t \hat{g} \hat{U} = \hat{U}^t g \hat{U} = U^t (L^t)^{-1} g L^{-1} U \]

for all vectors $U$. Since both matrices are symmetric, it follows that $(L^t)^{-1} g L^{-1} = g$, and so $g = L^t g L$. This is the formal way of asserting that the metric is Lorentz-invariant. Equivalently, $U \cdot V$ is likewise Lorentz-invariant.

Exercise: Let $h$ be any non-degenerate, symmetric bilinear form on $\mathbb{R}^n$. Show that

\[ G = \{ A_{n \times n} \text{ such that } A^t h A = h \} \]

is a group under matrix multiplication.

Remark: The matrix $h$ given above can be diagonalized by an orthogonal matrix since it's symmetric. So for some $R$, $R^{-1} h R = D$ is diagonal. But $R$ is orthogonal, so $R^{-1} = R^t$, and we have $D = R^t h R$ with the eigenvalues of $h$ on the diagonal. The eigenvalues are real, since $h$ is symmetric, and non-zero, since $h$ is non-degenerate, so we can scale them each to be $\pm 1$: $\hat{D} = P^t D P$, where $P$ is the diagonal matrix with $P_{ii} = 1/\sqrt{|\lambda_i|}$, $\lambda_i$ being the $i$th eigenvalue. So, for any non-degenerate symmetric bilinear form, there's a basis in which it is diagonal, with entries of $\pm 1$ on the diagonal. This is called the canonical form of $h$. If there are $p$ positive entries, then the group $G$ in the exercise above is the Lie group known as $O(p, n - p)$. For Euclidean space, the group is $O(n)$, the orthogonal group. For Minkowski space, it's $O(1, 3)$, the full Lorentz group.

As a topological space, $O(1, 3)$ has 4 connected components. $L_+$ is the connected component containing the identity. The other 3 components are obtained from $L_+$ by composing its elements with time-reversal and/or space reversal $(x, y, z) \to (-x, -y, -z)$. Algebraically, $L_+$ is characterized as the subgroup of $O(1, 3)$ consisting of those matrices $L$ with (a) $\det(L) = 1$, and (b) $L_{00} > 0$, where $L_{00}$ is the time-time entry of $L$.

Exercise: 1.
The addition formula for the hyperbolic tangent reads

\[ \tanh(\theta_1 + \theta_2) = \frac{\tanh(\theta_1) + \tanh(\theta_2)}{1 + \tanh(\theta_1)\tanh(\theta_2)}, \]

which exactly reflects the addition law of velocities and suggests that we define a parameter $\theta$ by $v = \tanh(\theta)$. Recall that (a) $\tanh(\theta) = \sinh(\theta)/\cosh(\theta)$, and (b) $\cosh^2(\theta) - \sinh^2(\theta) = 1$, and derive the alternative form

\[ \begin{pmatrix} t_B \\ x_B \end{pmatrix} = \begin{pmatrix} \cosh(\theta) & -\sinh(\theta) \\ -\sinh(\theta) & \cosh(\theta) \end{pmatrix} \begin{pmatrix} t_A \\ x_A \end{pmatrix} \]

of the boost. Show that these compose the way we want them to: $L(\theta_1) L(\theta_2) = L(\theta_1 + \theta_2)$. This is why boosts are sometimes called hyperbolic rotations.
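A numerical sanity check can be useful before attempting the algebra. The sketch below is not a proof and is not part of the original notes. It works only in the $(t, x)$ plane and checks three things for sample values of $\theta$: that boosts compose additively, that $L^t g L = g$, and that $\tanh$ reproduces the velocity addition law. The helper names `boost`, `matmul`, `transpose` and `close` are illustrative, and the minus sign on the off-diagonal entries is an assumed convention; the same checks go through with the opposite sign.

```python
import math

# Minkowski metric restricted to the (t, x) plane: g = Diag{1, -1}.
G = [[1.0, 0.0], [0.0, -1.0]]


def boost(theta):
    """Boost L(theta) in the (t, x) plane, with v = tanh(theta).

    Assumption: the off-diagonal entries carry a minus sign. The checks
    below hold equally well for the opposite sign convention.
    """
    c, s = math.cosh(theta), math.sinh(theta)
    return [[c, -s], [-s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    """Transpose of a 2x2 matrix stored as nested lists."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to rounding error."""
    return all(abs(a[i][j] - b[i][j]) <= tol * max(1.0, abs(b[i][j]))
               for i in range(2) for j in range(2))


theta1, theta2 = 0.3, 1.1  # arbitrary sample values
L1, L2 = boost(theta1), boost(theta2)

# Composition: L(theta1) L(theta2) = L(theta1 + theta2).
composes = close(matmul(L1, L2), boost(theta1 + theta2))

# Lorentz invariance of the metric: L^t g L = g.
preserves_metric = close(matmul(transpose(L1), matmul(G, L1)), G)

# Velocity addition: tanh(theta1 + theta2) = (v1 + v2) / (1 + v1 v2).
v1, v2 = math.tanh(theta1), math.tanh(theta2)
v_sum = (v1 + v2) / (1.0 + v1 * v2)
adds_velocities = math.isclose(v_sum, math.tanh(theta1 + theta2))

print(composes, preserves_metric, adds_velocities)
```

All three checks should hold up to floating-point rounding. The exercise itself still asks for the identities $\cosh(\theta_1 + \theta_2) = \cosh\theta_1 \cosh\theta_2 + \sinh\theta_1 \sinh\theta_2$ and $\sinh(\theta_1 + \theta_2) = \sinh\theta_1 \cosh\theta_2 + \cosh\theta_1 \sinh\theta_2$ to be applied by hand.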