General Relativity Lecture Notes, Lecture notes of Relativity Theory

These are lecture notes from a course on General Relativity taught by Alan Guth at MIT in Spring 2017. The notes cover topics such as why general relativity is important, a review of special relativity, and the kinematic consequences of special relativity. The notes also include thought experiments and examples to help explain the concepts.

Typology: Lecture notes

2016/2017

Uploaded on 05/11/2023

8.962 General Relativity, Spring 2017
Massachusetts Institute of Technology
Department of Physics
Lectures by: Alan Guth
Notes by: Andrew P. Turner
May 26, 2017
1 Lecture 1 (Feb. 8, 2017)
1.1 Why general relativity?
Why should we be interested in general relativity?
(a) General relativity is the uniquely greatest triumph of analytic reasoning in all of science. Simultaneity is not absolute in special relativity, so Newton’s law of gravity, which depends on the instantaneous positions of distant masses, becomes ill-defined. Using only special relativity and the fact that Newton’s theory of gravity works terrestrially, Einstein was able to produce what we now know as general relativity.
(b) Understanding gravity has now become an important part of most considerations in fundamental physics. Historically, it was easy to leave gravity out phenomenologically, because it is a factor of 10^38 weaker than the other forces. If one tries to build a quantum field theory from general relativity, it fails to be renormalizable, unlike the quantum field theories for the other fundamental forces. Nowadays, gravity has become an integral part of attempts to extend the standard model. Gravity is also important in the field of cosmology, which became more prominent after the discovery of the cosmic microwave background, progress on calculations of big bang nucleosynthesis, and the introduction of inflationary cosmology.
1.2 Review of Special Relativity
The basic assumption of special relativity is as follows: All laws of physics, including the statement that light travels at speed c, hold in any inertial coordinate system. Furthermore, any coordinate system that is moving at fixed velocity with respect to an inertial coordinate system is also inertial. The statements that all laws of physics hold in any inertial frame, and that any frame moving at fixed velocity with respect to an inertial frame is also inertial, are often called Galilean relativity.
The statement that the speed of light should always be c might seem somewhat counterintuitive. We can avoid the seeming contradictions that this causes by relaxing our assumptions about how observations in two different reference frames are related; everything is consistent once we change our ideas about how the observations of different observers are connected.
The kinematic consequences of special relativity can be summarized in three statements:


  • Time dilation: A clock moving at speed v relative to an inertial frame will “appear” to run slow by a factor of $\gamma = 1/\sqrt{1 - v^2/c^2}$.

To demonstrate that time dilation is essential, we can consider the thought experiment of the light clock. Consider a clock that, in its rest frame, consists of two stationary parallel mirrors with a light beam bouncing back and forth between them. By measuring the time it takes the light beam to make one full transit, we can use this setup as a clock. Now consider another inertial frame in which the same clock is moving at velocity v, in a direction parallel to the plane of the mirrors. In this frame, the light pulse moves diagonally from one mirror to the other and back. This path is longer than the path in the rest frame of the clock, and so the clock runs slower as viewed from this frame. This uses the assumption that the speed of light is the same for all observers, as well as the assumption that the separation between the two mirrors is the same in both frames. The latter assumption can be argued by consistency. Consider the case that the travelling clock passed through a similar clock in our rest frame. If the separation changed due to the motion, the moving mirrors would either pass between or around the mirrors in the rest frame; either situation violates the symmetry between the two frames, and so the separation between the mirrors cannot change.

  • Lorentz–Fitzgerald contraction: Any rod moving along its length at speed v relative to an inertial frame will “appear” contracted by a factor of γ. There is no contraction along directions perpendicular to the direction of movement.

There is another simple thought experiment that allows us to derive the necessity of length contraction. We again consider a light clock, and we now assume that we measure the separation between the mirrors exactly using a measuring rod. We then consider going to a frame in which the clock is moving at speed v in a direction perpendicular to the plane of the mirrors. As we have seen, we know this clock must slow down by a factor of γ due to time dilation. We can then compute the separation of the mirrors in this frame necessary to explain this slowing of the clock, and the result is that the separation must have contracted by a factor of γ.

  • Relativity of simultaneity: If two clocks that are synchronized in their rest frame are viewed in a frame where they are moving along their line of separation at speed v, the trailing clock will read a time later than the leading clock by an amount $v\ell_0/c^2$, where $\ell_0$ is the rest-frame separation of the clocks.

Imagine two clocks at rest, separated by a measuring rod of rest length $\ell_0$, that have been synchronized in the rest frame of the system. Now consider a frame in which this system is moving at a speed v along the direction of separation between the clocks. The clocks will not be synchronized in this frame. The time $t_{\text{clock 1}}$ read off of the leading clock will be related to the time $t_{\text{clock 2}}$ of the trailing clock by

$$ t_{\text{clock 2}} = t_{\text{clock 1}} + \frac{v \ell_0}{c^2} . $$

How can we synchronize the clocks in their rest frame? If we know the separation $\ell_0$ between the clocks in the rest frame, we can emit a light pulse at time $t_1$ from the first clock. When the light pulse reaches the second clock, we set the second clock to read time

$$ t_2 = t_1 + \frac{\ell_0}{c} . $$
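The three kinematic statements above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the lectures: the mirror separation, clock speed, and rod length are hypothetical illustration values, and the light-clock tick is computed directly from the diagonal path described in the thought experiment rather than from the formula for γ.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v, c=C):
    """Lorentz factor for a speed v < c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def light_clock_tick_moving(d, v, c=C):
    """Round-trip time of the pulse in the frame where the clock moves.

    The mirrors are a distance d apart and move at speed v parallel to their
    planes. In half a tick of duration t_half the pulse travels the diagonal
    sqrt(d**2 + (v * t_half)**2) at speed c, which gives
    t_half = d / sqrt(c**2 - v**2).
    """
    t_half = d / math.sqrt(c ** 2 - v ** 2)
    return 2.0 * t_half


# Hypothetical illustration values (not taken from the notes).
d = 1.0        # mirror separation in metres
v = 0.6 * C    # speed of the moving clock
l0 = 10.0      # rest-frame separation of two synchronized clocks, in metres

g = gamma(v)
tick_rest = 2.0 * d / C
tick_moving = light_clock_tick_moving(d, v)

time_dilation_ratio = tick_moving / tick_rest  # agrees with gamma
contracted_length = l0 / g                     # Lorentz-Fitzgerald contraction
clock_offset = v * l0 / C ** 2                 # simultaneity offset in seconds

print(g, time_dilation_ratio, contracted_length, clock_offset)
```

The ratio of the two tick times reproduces γ without assuming it, which is the content of the light-clock argument.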

2 Lecture 2 (Feb. 15, 2017)

2.1 Lorentz Transformations

In special relativity, we denote coordinate vectors by x. By $x^\mu$, we mean the coordinates of the vector x in the unprimed frame,

$$ x^\mu = (x^0, x^1, x^2, x^3) \equiv (ct, x, y, z) . \tag{2.1} $$

The same vector can be expressed in another frame, which we can denote with a prime on the index, as

$$ x^{\mu'} = (x^{0'}, x^{1'}, x^{2'}, x^{3'}) \equiv (ct', x', y', z') . \tag{2.2} $$

These two inertial frames are related by Lorentz transformations, $\Lambda^{\mu'}{}_{\nu}$. It is important to be careful with our conventions here; in Carroll’s notation, $\Lambda^{\mu'}{}_{\nu}$ transforms from the unprimed frame to the primed frame, while $\Lambda^{\mu}{}_{\nu'}$ transforms from the primed frame to the unprimed frame. These two transformations are inverses of one another,

$$ \Lambda^{\mu}{}_{\nu'} \Lambda^{\nu'}{}_{\lambda} = \delta^{\mu}_{\lambda} . \tag{2.3} $$

In other texts, such as those by Wald or Weinberg, indices are used simply as variables that run from 0 to 3, with no assumption that primes on indices have any special meaning. In that notation, Eq. (2.3) would be written as

$$ (\Lambda^{-1})^{\mu}{}_{\nu} \Lambda^{\nu}{}_{\lambda} = \delta^{\mu}_{\lambda} . \tag{2.4} $$

In this course, we will be using Carroll’s notation. Lorentz transformations act on coordinates as

$$ x^{\mu'} = \Lambda^{\mu'}{}_{\nu} x^{\nu} . \tag{2.5} $$

There are two types of Lorentz transformations: rotations and Lorentz boosts. Let’s first discuss rotations. As an example, we can consider a counterclockwise rotation about the z-axis by an angle θ. This transformation leaves the t and z directions unaffected and rotates the x- and y-directions into one another as

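$$ x' = x\cos\theta + y\sin\theta , \tag{2.6} $$
$$ y' = -x\sin\theta + y\cos\theta , \tag{2.7} $$
$$ t' = t , \tag{2.8} $$
$$ z' = z . \tag{2.9} $$

We can express this in the form of a matrix:

$$ \Lambda^{\mu'}{}_{\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{2.10} $$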

Here, the rows are labelled by the value of the index $\mu'$, while the columns are labelled by the value of the index $\nu$.

The other type of Lorentz transformation is a Lorentz boost, which mixes the spatial and temporal components of spacetime. Consider a boost in which the primed coordinate system moves in the +x-direction at velocity v relative to the unprimed system, as shown in Fig. 1. We imagine a series of clocks at rest in the unprimed frame, laid out along the x-axis and synchronized in this frame, and similarly a set of clocks at rest in the primed frame, laid out along the $x'$-axis and synchronized in the primed frame. At time t = 0 in the unprimed frame, the origins of both frames are coincident and the clock at rest at the origin of the primed frame reads $t'_0 = 0$.

Figure 1: Unprimed and primed coordinate systems, each constructed from rods and clocks, shown as seen in the unprimed frame at time t. The primed system is moving at speed v in the $+\hat{x}$-direction relative to the unprimed frame. The clocks shown on each axis are at rest and are synchronized in the rest frame of their respective axes. The x- and $x'$-axes actually lie on top of one another, but are shown separated to avoid clutter.

To determine the transformation equations, we will calculate the t- and x-coordinates of the clock shown at position $x'$ on the $x'$-axis, where the time shown on the clock is denoted by $t'_{x'}$. Starting with time, we note that the clocks on the $x'$-axis run slowly in the unprimed frame due to time dilation, and so the clock at rest at the origin of the primed frame must now read $t'_0 = t/\gamma$. The clock at position $x'$, which is synchronized in the primed frame with the clock at the origin of the primed frame, is seen in the unprimed frame to lag behind the clock at the origin by an amount $vx'/c^2$ due to the relativity of simultaneity. The clock at position $x'$ thus reads

$$ t'_{x'} = \frac{t}{\gamma} - \frac{v x'}{c^2} . $$

We can drop the subscript $x'$ on $t'_{x'}$, so $t'_{x'} \equiv t'$, since the time on the clock at $x'$ defines the time at $x'$ in the primed coordinate system. Then solving this equation for t gives

$$ t = \gamma \left( t' + \frac{v x'}{c^2} \right) . $$

Thus we have expressed the coordinate time t in the unprimed frame as a function of the $t'$ and $x'$ coordinates in the primed frame. We can do something similar to find an expression for x. We know that in the time t, the entire primed coordinate system has moved a distance vt in the unprimed frame. The clock at the origin of the primed system is therefore at position x = vt in the unprimed frame. The clock at position $x'$ has a rest separation of $x'$ from the clock at the origin of the primed system, so due to Lorentz contraction, the separation in the unprimed frame is $x'/\gamma$. Thus,

$$ x = vt + \frac{x'}{\gamma} . $$
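As a quick consistency check, the two relations just derived can be composed and compared against the symmetric textbook form $x = \gamma(x' + vt')$, which follows by substituting the expression for t into the expression for x. The sketch below is an illustration only; the function names, sample events, and boost speed are hypothetical, and units with c = 1 are assumed by default.

```python
import math


def gamma(v, c=1.0):
    """Lorentz factor for a speed v < c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def unprimed_from_clock_argument(t_p, x_p, v, c=1.0):
    """Map primed coordinates (t', x') to (t, x) using the two relations above."""
    g = gamma(v, c)
    t = g * (t_p + v * x_p / c ** 2)  # time dilation plus simultaneity offset
    x = v * t + x_p / g               # moving origin plus contracted separation
    return t, x


def unprimed_from_standard_boost(t_p, x_p, v, c=1.0):
    """Textbook form of the same boost, written symmetrically in t' and x'."""
    g = gamma(v, c)
    return g * (t_p + v * x_p / c ** 2), g * (x_p + v * t_p)


def interval(t, x, c=1.0):
    """Lorentz interval -(c t)^2 + x^2 of an event measured from the origin."""
    return -(c * t) ** 2 + x ** 2


# Hypothetical sample events (t', x') and boost speed, in units with c = 1.
events = [(0.0, 1.0), (2.0, -3.0), (1.5, 0.25)]
v = 0.8
max_mismatch = 0.0
for t_p, x_p in events:
    t1, x1 = unprimed_from_clock_argument(t_p, x_p, v)
    t2, x2 = unprimed_from_standard_boost(t_p, x_p, v)
    max_mismatch = max(max_mismatch, abs(t1 - t2), abs(x1 - x2))

print(max_mismatch)
```

The `interval` helper is included because the same events can also be used to confirm that the boost leaves $-c^2t^2 + x^2$ unchanged.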

2.3 Lorentz Metric

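Rather than writing out the Lorentz interval explicitly, we can define a tensor

$$ \eta_{\mu\nu} \equiv \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{2.21} $$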

The Lorentz interval can then be expressed as

$$ \Delta s^2 = \eta_{\mu\nu}\, \Delta x^\mu \Delta x^\nu . \tag{2.22} $$

We can take this further by defining

$$ \Delta x_\mu \equiv \eta_{\mu\nu}\, \Delta x^\nu , \tag{2.23} $$

which allows us to express the Lorentz interval as

$$ \Delta s^2 = \Delta x_\mu \Delta x^\mu . \tag{2.24} $$

We can now generalize this notation to allow us to lower indices of any tensor object using the metric:

$$ T^{\cdots}{}_{\mu}{}^{\cdots} = \eta_{\mu\nu}\, T^{\cdots\nu\cdots} . \tag{2.25} $$

We can similarly use the metric to raise indices. To do so, we first introduce the inverse metric $\eta^{\mu\nu}$, which is defined by the relation

$$ \eta^{\mu\alpha} \eta_{\alpha\nu} = \eta_{\nu\beta} \eta^{\beta\mu} = \delta^{\mu}_{\nu} . \tag{2.26} $$

For the moment, this may seem like an odd definition, because $\eta = \eta^{-1}$, but it is still useful to make these definitions because in the future we will deal with metrics that are not equal to their own inverse. This new quantity allows us to raise indices, such as

$$ \Delta x^\mu = \eta^{\mu\nu}\, \Delta x_\nu . \tag{2.27} $$

If an index is lowered and then raised, or vice versa, then the original tensor is recovered. More generally, we have

$$ T^{\cdots\mu\cdots} = \eta^{\mu\nu}\, T^{\cdots}{}_{\nu}{}^{\cdots} . \tag{2.28} $$

We can now show a special property of the Lorentz transformation: it leaves the metric invariant. We begin by using the fact that the spacetime interval is invariant:

$$ \Delta s'^2 = \Delta s^2 = \eta_{\mu\nu}\, \Delta x^\mu \Delta x^\nu = \eta_{\mu\nu}\, \Lambda^{\mu}{}_{\mu'} \Delta x^{\mu'} \Lambda^{\nu}{}_{\nu'} \Delta x^{\nu'} = \eta_{\mu\nu}\, \Lambda^{\mu}{}_{\mu'} \Lambda^{\nu}{}_{\nu'}\, \Delta x^{\mu'} \Delta x^{\nu'} . \tag{2.29} $$

On the other hand, we have

$$ \Delta s'^2 = \eta_{\mu'\nu'}\, \Delta x^{\mu'} \Delta x^{\nu'} . \tag{2.30} $$

Since Eqs. (2.29) and (2.30) must match for every $\Delta x^{\mu'}$, we conclude that

$$ \eta_{\mu\nu}\, \Lambda^{\mu}{}_{\mu'} \Lambda^{\nu}{}_{\nu'} = \eta_{\mu'\nu'} . \tag{2.31} $$

In fact, this is typically used as the defining property of a Lorentz transformation: the group of Lorentz transformations is precisely the group of transformations that satisfy this property. We can write this as a matrix equation by being careful with our indices. We have

$$ \Lambda^{\mu}{}_{\mu'}\, \eta_{\mu\nu}\, \Lambda^{\nu}{}_{\nu'} = \eta_{\mu'\nu'} . \tag{2.32} $$

By using the transpose

$$ (\Lambda^{T})_{\mu'}{}^{\mu} \equiv \Lambda^{\mu}{}_{\mu'} , \tag{2.33} $$

we can write this as

$$ (\Lambda^{T})_{\mu'}{}^{\mu}\, \eta_{\mu\nu}\, \Lambda^{\nu}{}_{\nu'} = \eta_{\mu'\nu'} . \tag{2.34} $$

Since we are only contracting adjacent indices in this expression, we can write it directly as a matrix equation,

$$ \Lambda^{T} \eta \Lambda = \eta . \tag{2.35} $$

By inverting the matrices to move them to the other sides of the equation, we can also write this in the form

$$ \eta^{-1} \Lambda^{T} \eta = \Lambda^{-1} . \tag{2.36} $$
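The matrix conditions in Eqs. (2.35) and (2.36) are easy to test numerically. The sketch below assumes units with c = 1, the coordinate ordering (t, x, y, z), and the signature (−, +, +, +) used in these notes; the helper names and the specific boost and rotation parameters are hypothetical choices for illustration. It uses plain nested lists so that it needs nothing beyond the standard library.

```python
import math

# Minkowski metric with signature (-, +, +, +), ordered as (t, x, y, z), c = 1.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def rotation_z(theta):
    """Rotation about the z-axis, matching the matrix in Eq. (2.10)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, s, 0.0],
            [0.0, -s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def boost_x(v):
    """Boost to a frame moving at speed v along +x (standard form, c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def preserves_metric(lam):
    """Check Lambda^T eta Lambda = eta, Eq. (2.35)."""
    return close(matmul(matmul(transpose(lam), ETA), lam), ETA)


def inverse_from_metric(lam):
    """Candidate inverse eta^{-1} Lambda^T eta from Eq. (2.36); eta is its own inverse."""
    return matmul(matmul(ETA, transpose(lam)), ETA)


lam = matmul(boost_x(0.6), rotation_z(0.3))           # a combined transformation
stretch = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]  # not a Lorentz transformation

print(preserves_metric(lam), preserves_metric(stretch))
```

Because the condition is closed under matrix multiplication, the product of a boost and a rotation also passes, while a simple rescaling of one axis does not.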

2.4 Curved Spacetime

We now wish to generalize the concept of the metric to a non-Euclidean space (i.e. a non-Minkowskian spacetime). Whereas before we wrote

$$ \Delta s^2 = \eta_{\mu\nu}\, \Delta x^\mu \Delta x^\nu \tag{2.37} $$

for arbitrarily large separations $\Delta x^\mu$, this no longer makes sense in a curved spacetime, because the metric varies from point to point. Thus, we can only write a simple expression for infinitesimal separations $dx^\mu$. Furthermore, we will denote the metric by $g_{\mu\nu}(x)$ rather than $\eta_{\mu\nu}$ to make it clear that we are talking about a general curved metric,

$$ ds^2 = g_{\mu\nu}(x)\, dx^\mu\, dx^\nu . \tag{2.38} $$

Note that the metric $g_{\mu\nu}(x)$ is now a function of spacetime. It is worth mentioning that there is nothing special or intrinsic about our choice of coordinates; they are simply one of many ways of assigning coordinates to spacetime, and each way is as good as any other (though certain choices can make particular calculations easier). We do assume that $g_{\mu\nu}(x)$ has the signature [−1, 1, 1, 1], meaning that it has three positive eigenvalues and one negative eigenvalue. This ensures that our spacetime has one temporal direction and three spatial directions.

We will now make another definition to make our lives easier. For any massive object moving through spacetime, two points on its worldline will be timelike separated, because the object is moving at speed less than c (we will set c = 1 from here on out). As such, the spacetime interval between these points will satisfy $ds^2 < 0$. This is inconvenient, so we define

$$ d\tau^2 \equiv -ds^2 . \tag{2.39} $$

This $d\tau$ is precisely the proper time, meaning the time interval measured on a clock carried by an observer who moves at the velocity needed to travel from the first event to the second.

In the absence of external forces, particles in spacetime travel along geodesics, which are essentially the closest paths in the curved spacetime to straight lines. Today, we want to derive the geodesic equation, which defines the curve on which free-falling particles will travel. Particles must always move along timelike trajectories, because particles can only travel at up to the speed of light. It is thus conventional to define

$$ d\tau^2 \equiv -ds^2 . \tag{3.2} $$

This quantity has a natural interpretation; everything we learned from special relativity still applies in general relativity, because we can always go to a coordinate system that is locally flat. If the separation $d\tau^2$ is positive, then this means we have a timelike separation, and we can then go to a frame in which the two events happen at the same spatial location. In this frame, we simply have $d\tau^2 = dt^2$. Thus, $d\tau$ is the proper time between the two events, that is, the time as measured on a clock that moves from one event to the other at uniform velocity. The concept of constant velocity is somewhat problematic because we have not defined what coordinate systems we are using, but this concept does make sense so long as our coordinate system is smooth (differentiable) and the separation is infinitesimal (any acceleration caused by curving of spacetime will be second order in the infinitesimal separation).

3.2 Geodesics

Now let us consider a trajectory connecting two points A and B in spacetime. We will parametrize the curve by the parameter λ, such that $\lambda = \lambda_1$ at point A and $\lambda = \lambda_2$ at point B. The trajectory can then be written as a function $x^\mu(\lambda)$. Note that the parameter λ is not unique, as there are many ways we could parametrize the trajectory. The length of the trajectory can then be computed as

$$ D = \int_A^B d\tau = \int_{\lambda_1}^{\lambda_2} \frac{d\tau}{d\lambda}\, d\lambda . \tag{3.3} $$

We can express the proper time as

$$ d\tau = \sqrt{-g_{\mu\nu}(x(\lambda))\, dx^\mu\, dx^\nu} = \sqrt{-g_{\mu\nu}(x(\lambda))\, \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda}}\; d\lambda . \tag{3.4} $$

We can then write

$$ D = \int_{\lambda_1}^{\lambda_2} \sqrt{-g_{\mu\nu}(x(\lambda))\, \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda}}\; d\lambda . \tag{3.5} $$

This path length is the proper time experienced by an observer travelling along the trajectory. A geodesic is a trajectory $x^\mu(\lambda)$ for which this length D is stationary, meaning that the first-order variation of D due to a small change in the path vanishes. Thus, in order to determine the condition for D to be stationary, we need to consider a nearby path

$$ \tilde{x}^\mu(\lambda) = x^\mu(\lambda) + \delta x^\mu(\lambda) . \tag{3.6} $$

We assume that $\delta x^\mu(\lambda)$ is small enough that we can treat it to first order, and we will insist that the variation of D vanish when we do so. We must guarantee that the endpoints A and B of the path remain fixed, so we fix

$$ \delta x^\mu(\lambda_1) = \delta x^\mu(\lambda_2) = 0 . \tag{3.7} $$
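To make the idea of a stationary path concrete, the sketch below evaluates the proper time of Eq. (3.5) numerically in the simplest possible setting, which is an assumption made here for illustration: flat 1+1-dimensional spacetime with c = 1. The reference path is a straight worldline with hypothetical velocity 0.5, and the perturbation $\epsilon \sin(\pi\lambda)$ is a hypothetical choice that vanishes at both endpoints, as Eq. (3.7) requires. If the straight line is a geodesic, the change in D should have no term linear in ε, so doubling ε should roughly quadruple the change.

```python
import math


def proper_time(eps, v=0.5, n=2000):
    """Proper time D along t(lam) = lam, x(lam) = v*lam + eps*sin(pi*lam).

    Units with c = 1 and the flat metric diag(-1, 1) are assumed, so
    dtau/dlam = sqrt((dt/dlam)**2 - (dx/dlam)**2). The perturbation
    eps*sin(pi*lam) vanishes at lam = 0 and lam = 1, so both endpoints
    stay fixed.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        lam = (k + 0.5) * h  # midpoint rule
        dt = 1.0
        dx = v + eps * math.pi * math.cos(math.pi * lam)
        total += math.sqrt(dt * dt - dx * dx) * h
    return total


d0 = proper_time(0.0)                     # straight worldline
delta_small = proper_time(0.01) - d0      # change in D for a small wiggle
delta_large = proper_time(0.02) - d0      # change in D for twice the wiggle

print(d0, delta_small, delta_large, delta_large / delta_small)
```

Both changes come out negative, reflecting the fact that in flat spacetime the straight timelike path maximizes the proper time between its endpoints.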

We now wish to calculate

$$ \delta D = D[x^\mu(\lambda) + \delta x^\mu(\lambda)] - D[x^\mu(\lambda)] . \tag{3.8} $$

Here, the square brackets are used to indicate that D is a functional, rather than simply a function; its argument is itself a function, and D depends on all values of the function it takes as input. Before dealing with the form of D given in Eq. (3.5), we will derive a general formula by considering

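$$ D = \int_{\lambda_1}^{\lambda_2} L\!\left( x^\mu(\lambda), \frac{dx^\mu}{d\lambda} \right) d\lambda , \tag{3.9} $$

and we can specialize later. The variation is

$$ \delta D = \int_{\lambda_1}^{\lambda_2} \left\{ \frac{\partial L}{\partial x^\mu}(x(\lambda))\, \delta x^\mu + \frac{\partial L}{\partial\!\left( \frac{dx^\mu}{d\lambda} \right)}(x(\lambda))\, \delta\!\left( \frac{dx^\mu}{d\lambda} \right) \right\} d\lambda . \tag{3.10} $$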

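The first term in braces is essentially the first term in a Taylor expansion of L, which is all we need because we are treating the variation $\delta x^\mu$ only to first order. The second term is the same, but expanding with respect to the four variables $dx^\mu/d\lambda$; as far as L is concerned, these are simply additional variables. We now note that

$$ \delta\!\left( \frac{dx^\mu}{d\lambda} \right) = \frac{d\tilde{x}^\mu}{d\lambda} - \frac{dx^\mu}{d\lambda} = \frac{d}{d\lambda} \left[ \left( x^\mu(\lambda) + \delta x^\mu(\lambda) \right) - x^\mu(\lambda) \right] = \frac{d}{d\lambda}\, \delta x^\mu(\lambda) . \tag{3.11} $$

We then have

$$ \delta D = \int_{\lambda_1}^{\lambda_2} \left\{ \frac{\partial L}{\partial x^\mu}\, \delta x^\mu + \frac{\partial L}{\partial\!\left( \frac{dx^\mu}{d\lambda} \right)}\, \frac{d}{d\lambda}\, \delta x^\mu(\lambda) \right\} d\lambda . \tag{3.12} $$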

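The trick at this point is to integrate the second term by parts, which will introduce a surface term:

$$ \delta D = \int_{\lambda_1}^{\lambda_2} \left\{ \frac{\partial L}{\partial x^\mu}\, \delta x^\mu - \frac{d}{d\lambda} \left[ \frac{\partial L}{\partial\!\left( \frac{dx^\mu}{d\lambda} \right)} \right] \delta x^\mu \right\} d\lambda + \left[ \frac{\partial L}{\partial\!\left( \frac{dx^\mu}{d\lambda} \right)}\, \delta x^\mu(\lambda) \right]_{\lambda_1}^{\lambda_2} . \tag{3.13} $$

The surface term vanishes precisely because we chose the boundary conditions

$$ \delta x^\mu(\lambda_1) = \delta x^\mu(\lambda_2) = 0 . \tag{3.14} $$

Thus, we are left with

$$ \delta D = \int_{\lambda_1}^{\lambda_2} \left\{ \frac{\partial L}{\partial x^\mu} - \frac{d}{d\lambda} \left[ \frac{\partial L}{\partial\!\left( \frac{dx^\mu}{d\lambda} \right)} \right] \right\} \delta x^\mu\, d\lambda . \tag{3.15} $$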

Now, we insist that δD = 0. Normally, the requirement that the integral vanishes does not imply that the integrand must vanish; however, in this case we know that δD must vanish for every possible $\delta x^\mu$, which necessitates that the term in braces must vanish identically. We will now give a rough argument for why this should be true, via a proof by contradiction. For ease of notation, define

$$ \{\}_\mu \equiv \left\{ \frac{\partial L}{\partial x^\mu} - \frac{d}{d\lambda} \left[ \frac{\partial L}{\partial\!\left( \frac{dx^\mu}{d\lambda} \right)} \right] \right\} . $$

Now, suppose by way of contradiction that $\{\}_\mu > 0$ for some $\lambda_0$, and assume that all quantities of interest are continuous. Thus, because $\{\}_\mu$ is positive at $\lambda_0$ and $\{\}_\mu$ is continuous, there must be some neighborhood of $\lambda_0$ for which $\{\}_\mu > 0$. We can then choose a $\delta x^\mu(\lambda)$ to be zero everywhere

This is the geodesic equation. Though many texts treat this simply as an intermediate result on the way to the form of the geodesic equation we will write down later, it is very useful in its own right. First, it has only one term involving a derivative of $g_{\mu\nu}$, while the other form involves three. Second, if the metric is independent of a particular coordinate $x^\alpha$, then the right-hand side vanishes, telling us immediately that

$$ g_{\alpha\nu} \frac{dx^\nu}{d\tau} = \text{constant} . \tag{3.28} $$

Could we have started with τ as our parameter to begin with and avoided the extra complications in this computation? If we had started by parametrizing our paths by τ, then the trajectory length would be

$$ D = \int_{\tau_1}^{\tau_2} d\tau = \tau_2 - \tau_1 , \tag{3.29} $$

and it would be hard to tell how to vary the path. To vary the path, we would have to allow $\tau_2$ to depend on the path, requiring a more complicated formalism. By parametrizing the path with parameter λ, we were able to easily insist that $\lambda_1$ and $\lambda_2$ were the same for both the original and the varied paths. This raises the question of whether it is valid to set λ = τ after the fact; it is, because the geodesic equation we arrived at in terms of the parameter λ is local in λ: it holds point by point along the path for any choice of parameter. After arriving at this equation, we can then set the parameter λ = τ.

We will now derive the standard form of the geodesic equation. We expand the left-hand side of Eq. (3.27) as

$$ \frac{d}{d\tau} \left[ g_{\sigma\nu} \frac{dx^\nu}{d\tau} \right] = \frac{\partial g_{\sigma\nu}}{\partial x^\mu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} + g_{\sigma\nu} \frac{d^2 x^\nu}{d\tau^2} . \tag{3.30} $$

Plugging this back into Eq. (3.27) and rearranging terms then yields

$$ g_{\sigma\nu} \frac{d^2 x^\nu}{d\tau^2} = -\frac{\partial g_{\sigma\nu}}{\partial x^\mu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} + \frac{1}{2} \frac{\partial g_{\mu\nu}}{\partial x^\sigma} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} . \tag{3.31} $$

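We can factor the expression on the right side to reach

$$ g_{\sigma\nu} \frac{d^2 x^\nu}{d\tau^2} = -\frac{1}{2} \left[ \frac{\partial g_{\sigma\nu}}{\partial x^\mu} + \frac{\partial g_{\sigma\mu}}{\partial x^\nu} - \frac{\partial g_{\mu\nu}}{\partial x^\sigma} \right] \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} , \tag{3.32} $$

where we have used the fact that

$$ \frac{\partial g_{\sigma\nu}}{\partial x^\mu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = \frac{1}{2} \left[ \frac{\partial g_{\sigma\nu}}{\partial x^\mu} + \frac{\partial g_{\sigma\mu}}{\partial x^\nu} \right] \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} , $$

which follows because $\frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau}$ is symmetric in the indices μ and ν. The final step in simplifying this equation is to introduce the inverse metric $g^{\mu\nu}$, which is defined by the property

$$ g^{\mu\sigma} g_{\sigma\nu} = g_{\nu\sigma} g^{\sigma\mu} = \delta^{\mu}_{\nu} . \tag{3.35} $$

We can then contract both sides of Eq. (3.32) with the inverse metric $g^{\lambda\sigma}$ to reach

$$ \frac{d^2 x^\lambda}{d\tau^2} = -\Gamma^{\lambda}_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} , \tag{3.36} $$

where we have defined

$$ \Gamma^{\lambda}_{\mu\nu} \equiv \frac{1}{2} g^{\lambda\sigma} \left[ \frac{\partial g_{\sigma\nu}}{\partial x^\mu} + \frac{\partial g_{\sigma\mu}}{\partial x^\nu} - \frac{\partial g_{\mu\nu}}{\partial x^\sigma} \right] , \tag{3.37} $$

which is called the affine connection or the Christoffel symbol. The geodesic equation as given in Eq. (3.36) is in the standard form.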

3.2.1 Example

Consider the two-dimensional metric

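$$ g_{ij} = \begin{pmatrix} 1 & 0 \\ 0 & x^2 \end{pmatrix} , \qquad x^i = (x, y) . \tag{3.38} $$

The interval then takes the form

$$ ds^2 = dx^2 + x^2\, dy^2 . \tag{3.39} $$

We can work out the Christoffel symbols, and we find

$$ \Gamma^{x}_{xx} = \Gamma^{x}_{xy} = \Gamma^{x}_{yx} = \Gamma^{y}_{xx} = \Gamma^{y}_{yy} = 0 , $$
$$ \Gamma^{x}_{yy} = \frac{1}{2} g^{xx} \left( -\partial_x g_{yy} \right) = -x , $$
$$ \Gamma^{y}_{xy} = \Gamma^{y}_{yx} = \frac{1}{2} g^{yy} \left( \partial_x g_{yy} \right) = \frac{x}{x^2} = \frac{1}{x} . \tag{3.40} $$

Thus, the geodesic equation gives us the equations of motion

$$ \ddot{x} = -\Gamma^{x}_{yy}\, \dot{y} \dot{y} = x\, \dot{y} \dot{y} , \tag{3.41} $$
$$ \ddot{y} = -\Gamma^{y}_{xy}\, \dot{x} \dot{y} - \Gamma^{y}_{yx}\, \dot{x} \dot{y} = -\frac{2}{x}\, \dot{x} \dot{y} . \tag{3.42} $$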

We note that the metric has the same form as that of standard polar coordinates for the plane,

ds^2 = dr^2 + r^2 dθ^2 , (3.43)

so although the metric as we have written it does not immediately appear to be the one for flat space, it turns out that a change of coordinates will leave us with the metric for flat space. Thus, this metric does in fact describe flat space, and the geodesics are actually straight lines in the Cartesian coordinates.
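The claim that these geodesics are straight lines can be tested by integrating Eqs. (3.41) and (3.42) numerically. The following is a minimal sketch under stated assumptions: a hand-written fourth-order Runge-Kutta stepper, hypothetical initial data (x, y) = (1, 0) with velocity (0, 1), and the identification of (x, y) with polar coordinates (r, θ), so that the Cartesian position is (x cos y, x sin y). With these initial data the Cartesian path should be the vertical line X = 1. Since the metric does not depend on y, the quantity $g_{yy}\,\dot{y} = x^2 \dot{y}$ should also stay constant along the path.

```python
import math


def rhs(state):
    """Right-hand side of Eqs. (3.41)-(3.42) written as a first-order system."""
    x, y, xdot, ydot = state
    return (xdot, ydot, x * ydot * ydot, -2.0 * xdot * ydot / x)


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def nudge(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = rhs(state)
    k2 = rhs(nudge(k1, 0.5 * h))
    k3 = rhs(nudge(k2, 0.5 * h))
    k4 = rhs(nudge(k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, tau_end, steps):
    h = tau_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


# Hypothetical initial data: start at (x, y) = (1, 0) moving purely in y.
start = (1.0, 0.0, 0.0, 1.0)
x, y, xdot, ydot = integrate(start, 1.0, 1000)

# Map back to Cartesian coordinates, reading (x, y) as polar (r, theta).
X, Y = x * math.cos(y), x * math.sin(y)
conserved = x * x * ydot  # g_yy * dy/dtau, constant because g is independent of y

print(X, Y, conserved)
```

The same integrator can be pointed at other initial data; the only requirement is that x stays away from zero, where the coordinates are singular.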

4 Lecture 4 (Feb. 22, 2017)

4.1 General Coordinate Transformations

In special relativity, the only transformations we talk about are linear transformations, namely the Lorentz transformations

$$ x^\mu \to x^{\nu'} = \Lambda^{\nu'}{}_{\mu}\, x^\mu . \tag{4.1} $$

These transformations take us from one frame where the laws of physics are simple (special relativity) to another frame where the same simple laws apply.

4.2 Embeddings

So far we have talked about transformations from one (3 + 1)-dimensional space to another, but sometimes we are interested in talking about transformations from one space to a higher- or lower-dimensional space. Thus, it is useful to give some formalism for a general embedding, which is a mapping from some space into a higher-dimensional space. Suppose we have a mapping from R^2 → R^3. To describe this concretely, we use maps $x^i(u^\alpha)$, where i = 1, 2, 3 and α = 1, 2. Suppose the metric on R^3 is Euclidean, so that $g_{ij} = \delta_{ij}$. Then we can determine the metric on the embedded curved two-dimensional surface from the metric of the three-dimensional space using Eq. (4.7), giving

$$ g_{\alpha\beta} = \frac{\partial x^{i'}}{\partial u^\alpha} \frac{\partial x^{j'}}{\partial u^\beta}\, g_{i'j'} . \tag{4.12} $$

This is called the induced metric, or the pullback of the metric on R^3. This phrase originates from the fact that we used the maps $x^i(u^\alpha)$, which map “forward” from the coordinates on the surface to the coordinates in R^3, to “pull back” the metric of R^3 to a metric on the surface.

Note that the affine connection $\Gamma^{\lambda}_{\mu\nu}$ is not a tensor. Why? One important property of tensors is that if a tensor vanishes in one frame, it must vanish in all frames; this can be seen by inspecting the transformation rule Eq. (4.9). We saw an example in the previous lecture where the affine connection was zero for the Euclidean plane in Cartesian coordinates but nonzero for the Euclidean plane in polar coordinates, so clearly it cannot be a tensor.
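The induced metric of Eq. (4.12) can be illustrated with a concrete surface. The sketch below is a hypothetical example, not one from the lectures: it embeds the unit sphere in R^3 with coordinates $u^\alpha = (\theta, \phi)$, approximates the derivatives $\partial x^i/\partial u^\alpha$ by central differences, and contracts them with the Euclidean metric $\delta_{ij}$. The expected result is the familiar $ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2$.

```python
import math


def embed(theta, phi):
    """Unit sphere in R^3 (hypothetical example embedding x^i(u^alpha))."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def induced_metric(embedding, u, h=1e-6):
    """Pull back the Euclidean metric delta_ij using central differences."""
    n = len(u)
    jac = []  # jac[a][i] approximates d x^i / d u^a
    for a in range(n):
        up = list(u)
        dn = list(u)
        up[a] += h
        dn[a] -= h
        xp, xm = embedding(*up), embedding(*dn)
        jac.append([(p - m) / (2.0 * h) for p, m in zip(xp, xm)])
    # g_ab = sum_i (d x^i / d u^a) (d x^i / d u^b), since g_ij = delta_ij
    return [[sum(ja * jb for ja, jb in zip(jac[a], jac[b])) for b in range(n)]
            for a in range(n)]


theta, phi = 0.7, 1.3  # arbitrary sample point on the sphere
g = induced_metric(embed, (theta, phi))

print(g, math.sin(theta) ** 2)
```

Swapping in a different `embed` function gives the induced metric of any other surface described by maps from two parameters into R^3.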

4.3 Example

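Recall the geodesic equation

$$ \frac{d^2 x^\lambda}{d\tau^2} = -\Gamma^{\lambda}_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} , \tag{4.13} $$

which describes a freely falling particle not under the influence of any force other than gravity. Suppose that

$$ ds^2 = -dt^2 + g_{ij}\, dx^i\, dx^j , \tag{4.14} $$

for i, j = 1, 2, 3, where we take $g_{ij}$ to be independent of t. In this case, the only nonzero components of the affine connection are the $\Gamma^{i}_{jk}$. From the geodesic equation, we then see that

$$ \frac{d^2 t}{d\tau^2} = 0 , \qquad \frac{d^2 x^i}{d\tau^2} = -\Gamma^{i}_{jk} \frac{dx^j}{d\tau} \frac{dx^k}{d\tau} . \tag{4.15} $$

From the first of these equations, we see that

$$ \frac{dt}{d\tau} = \text{constant} . \tag{4.16} $$

The proper time is given by

$$ d\tau^2 = dt^2 - g_{ij}\, dx^i\, dx^j . \tag{4.17} $$

Dividing through by $d\tau^2$, this becomes

$$ 1 = \left( \frac{dt}{d\tau} \right)^2 - g_{ij} \frac{dx^i}{d\tau} \frac{dx^j}{d\tau} . \tag{4.18} $$

Because $dt/d\tau$ is a constant, we thus conclude that

$$ g_{ij} \frac{dx^i}{d\tau} \frac{dx^j}{d\tau} = \text{constant} . \tag{4.19} $$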

This can then be interpreted as a kinetic energy for the system that will be conserved.

4.4 The Equivalence Principle

The Weak Equivalence Principle is the statement that for any object, the inertial mass is equal to the gravitational mass. By inertial mass, we mean the parameter m appearing in the equation F = ma, which determines how the object responds to forces acting on it; by gravitational mass, we mean the parameter m appearing in the gravitational force law F = −m∇φ, where φ is the gravitational potential. This statement tells us that the acceleration under gravity is the same for all objects.

General relativity uses a stronger form of the equivalence principle, called the Einstein Equivalence Principle. The Einstein Equivalence Principle says that local physics in a locally free-falling frame is indistinguishable from special relativity. Another way to say this is that local physics in a gravitational field is indistinguishable from the physics in an accelerating reference frame. The classic picture to demonstrate the equivalence principle is to imagine observers inside a box sitting at rest on the surface of the Earth; the statement is that this situation can never be distinguished, by an observer within the box, from the situation in which the box is actually sitting in a spaceship that is accelerating. There are some subtleties with this picture; in actuality, one could tell the difference by measuring the local gravity vector at opposite ends of the box, which would point in slightly different directions if the box were on Earth. This is why the equivalence principle only holds for very small distances.
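To get a feel for how small the effect at the ends of the box is, the sketch below estimates the angle between the gravity vectors at the two ends of a box resting on the Earth. It is an illustration under simplifying assumptions: a spherical Earth of mean radius 6.371 × 10^6 m, a flat floor tangent to the surface at its midpoint, and a hypothetical box width of 10 m.

```python
import math

R_EARTH = 6.371e6  # mean radius of the Earth in metres


def gravity_angle(width, radius=R_EARTH):
    """Angle in radians between the directions to the Earth's centre
    at the two ends of a flat floor of the given width, tangent to a
    sphere of the given radius at the floor's midpoint."""
    return 2.0 * math.atan(0.5 * width / radius)


width = 10.0                        # hypothetical box width in metres
angle = gravity_angle(width)
small_angle_estimate = width / R_EARTH

print(angle, small_angle_estimate)
```

The angle is of order 10^-6 radians for a box this size and shrinks in proportion to the width, which is the sense in which the equivalence principle becomes exact only in the limit of small regions.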

4.5 Locally Inertial Frames

We will now discuss an important theorem of Riemannian geometry, which tells us that we can always find a coordinate transformation that causes the metric to be Minkowskian and have vanishing derivative at a particular point. This is very useful, because it allows us to locally treat general relativity in the same way that we treat special relativity.

Theorem 1 For any point $x_0$ in spacetime, we can find a local inertial coordinate system (also called a free-falling coordinate system, a locally Minkowskian coordinate system, or geodesic normal coordinates) such that

$$ g_{\mu\nu}(x_0) = \eta_{\mu\nu} , \qquad \partial_\lambda g_{\mu\nu}(x_0) = 0 . \tag{4.20} $$

Note that $\partial_\lambda g_{\mu\nu}(x_0) = 0$ implies that $\Gamma^{\lambda}_{\mu\nu}(x_0) = 0$.

Proof We will first show that we can achieve $g_{\mu\nu}(x_0) = \eta_{\mu\nu}$. We know that the metric transforms under a general coordinate transformation as

$$ g_{\mu\nu} = \frac{\partial x^{\lambda'}}{\partial x^\mu} \frac{\partial x^{\sigma'}}{\partial x^\nu}\, g_{\lambda'\sigma'} . \tag{4.21} $$

We want to appeal to our intuition about matrices, and so we define the matrix

$$ M = [M]^{\lambda'}{}_{\mu} = \frac{\partial x^{\lambda'}}{\partial x^\mu} . $$

But here we have overcounted because the objects are not all distinct. For any choice of ordering, we could rearrange all the dots among themselves, which could be done in R! ways, and rearrange the partitions among themselves, which could be done in (D − 1)! ways. Thus, the number of arrangements is

$$ N = \frac{(R + D - 1)!}{R!\, (D - 1)!} . $$

This is the number of independent components of a symmetric, R-index tensor for which the indices take on D possible values.

Now we return to the proof. Assume we have already done the necessary transformations to reach g = η at $x_0$. We let $x_0 = 0$ without loss of generality. In the primed coordinates, we have $g_{\mu'\nu'}(0) = \eta_{\mu'\nu'}$. We can then expand the metric in a Taylor series as

$$ g_{\mu'\nu'}(x) = \eta_{\mu'\nu'} + A_{\mu'\nu'\lambda'}\, x^{\lambda'} + \frac{1}{2} B_{\mu'\nu'\lambda'\sigma'}\, x^{\lambda'} x^{\sigma'} + \cdots . \tag{4.32} $$

The tensor $A_{\mu'\nu'\lambda'}$ is symmetric in μ′, ν′, which gives us ten independent components, and for each of these there are four possible values of λ′, so $A_{\mu'\nu'\lambda'}$ has 40 independent components. Similarly, $B_{\mu'\nu'\lambda'\sigma'}$ is symmetric in μ′, ν′ and symmetric in λ′, σ′, which gives 100 independent components.

We will now briefly abandon the Carroll notation for indices, so that indices will just represent numbers that run from 0 to 3, and we will put primes on the coordinates themselves. We consider a coordinate transformation to the unprimed frame, in which we will try to arrange for $\partial_\lambda g_{\mu\nu}(0) = 0$. The coordinates $x'^\lambda$ and $x^\lambda$ can be related by an infinite power series

$$ x'^\lambda = x^\lambda + \frac{1}{2} C^{\lambda}{}_{\mu\nu}\, x^\mu x^\nu + \frac{1}{6} D^{\lambda}{}_{\mu\nu\rho}\, x^\mu x^\nu x^\rho + \cdots . \tag{4.33} $$

(The first term on the right is the reason why the Carroll convention for indices would be awkward: we want to have the same index on both a primed and an unprimed coordinate.) The metric in the unprimed frame can be written as

$$ g_{\mu\nu} = \frac{\partial x'^\lambda}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu}\, g'_{\lambda\sigma} . \tag{4.34} $$

Using the expansion Eq. (4.33) gives us

$$ \frac{\partial x'^\lambda}{\partial x^\mu} = \delta^{\lambda}_{\mu} + C^{\lambda}{}_{\mu\nu}\, x^\nu + \frac{1}{2} D^{\lambda}{}_{\mu\nu\rho}\, x^\nu x^\rho + \cdots . \tag{4.35} $$

Using this along with Eq. (4.32), we have

$$ g_{\mu\nu} = \frac{\partial x'^\lambda}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu}\, g'_{\lambda\sigma} = \left[ \delta^{\lambda}_{\mu} + C^{\lambda}{}_{\mu\rho}\, x^\rho + \cdots \right] \left[ \delta^{\sigma}_{\nu} + C^{\sigma}{}_{\nu\rho}\, x^\rho + \cdots \right] \left[ \eta_{\lambda\sigma} + A_{\lambda\sigma\tau}\, x'^\tau + \cdots \right] . \tag{4.36} $$

We want to express the right-hand side in terms of unprimed coordinates, so the quantity $x'^\tau$ must be further expanded using Eq. (4.33). Expanding out the full right-hand side order by order, we find to first order that

$$ g_{\mu\nu} = \eta_{\mu\nu} + C_{\nu\mu\rho}\, x^\rho + C_{\mu\nu\rho}\, x^\rho + A_{\mu\nu\rho}\, x^\rho . \tag{4.37} $$

We see that to make the derivative of $g_{\mu\nu}$ vanish to first order, it is sufficient to achieve

$$ C_{\nu\mu\rho} + C_{\mu\nu\rho} = -A_{\mu\nu\rho} . \tag{4.38} $$

Recall that, for our problem, the As are given, determined by the metric in the primed coordinates. Our goal is to find a set of Cs such that $\partial_\lambda g_{\mu\nu}(0) = 0$. Since $C_{\mu\nu\rho}$ is symmetric in its last two indices and $A_{\mu\nu\rho}$ is symmetric in its first two indices, A and C each have 40 independent components. Forty linear equations in 40 unknowns will typically have a solution, but this could fail if the equations are not linearly independent. Next time, we will solve Eq. (4.38), completing the proof.
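The component counts quoted in this argument all follow from the formula N = (R + D − 1)!/(R!(D − 1)!) for a symmetric R-index tensor in D dimensions. The sketch below evaluates the formula for the cases used above and cross-checks it by brute-force enumeration of sorted index tuples; the function and variable names are hypothetical.

```python
import math
from itertools import combinations_with_replacement


def symmetric_components(rank, dim):
    """Independent components of a symmetric tensor with `rank` indices,
    each taking `dim` values: (R + D - 1)! / (R! (D - 1)!)."""
    return math.factorial(rank + dim - 1) // (
        math.factorial(rank) * math.factorial(dim - 1))


def symmetric_components_by_enumeration(rank, dim):
    """Count sorted index tuples directly, as a check on the formula."""
    return sum(1 for _ in combinations_with_replacement(range(dim), rank))


D = 4
metric_components = symmetric_components(2, D)         # symmetric pair (mu, nu)
a_components = metric_components * D                   # A: symmetric pair times a free index
b_components = metric_components * metric_components   # B: two symmetric pairs
c_components = D * symmetric_components(2, D)          # C: free index times symmetric pair

print(metric_components, a_components, b_components, c_components)
```

The equality of `a_components` and `c_components` is the statement that Eq. (4.38) has as many equations as unknowns.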

5 Lecture 5 (Feb. 27, 2017)

5.1 Locally Inertial Frames

We will now complete the proof of Thm. 1 from last lecture. We will begin by summarizing our progress thus far.

Proof (cont.) Thus far, we have chosen $x_0 = 0$ without loss of generality. We wrote down the transformation of the metric under an arbitrary change of coordinates as

$$ g = M^{T} g' M , \tag{5.1} $$

where

$$ M^{\mu'}{}_{\nu} = \frac{\partial x^{\mu'}}{\partial x^\nu} . \tag{5.2} $$

Note that the matrix M has sixteen parameters. We did this transformation in two steps. First, we diagonalized g (which is always possible because g is symmetric) via an orthogonal matrix O as

$$ g_{\text{diag}} = O^{T} g' O . \tag{5.3} $$

There are six parameters in the orthogonal matrix O. Second, we scaled the eigenvalues of $g_{\text{diag}}$ using the transformation

$$ \eta = \tilde{M}^{T} g_{\text{diag}} \tilde{M} , \tag{5.4} $$

where

$$ \tilde{M} = \begin{pmatrix} m_1 & 0 & 0 & 0 \\ 0 & m_2 & 0 & 0 \\ 0 & 0 & m_3 & 0 \\ 0 & 0 & 0 & m_4 \end{pmatrix} . \tag{5.5} $$

This transformation clearly has four parameters. There are six parameters in M that are undetermined by the transformations we have done thus far (six parameters are fixed by O and four by $\tilde{M}$). These six parameters are the parameters of an arbitrary Lorentz transformation, which leaves the Minkowski metric invariant.

For the second half of the proof, we assume that $g_{\mu'\nu'}(0) = \eta_{\mu'\nu'}$, but $\partial_{\lambda'} g_{\mu'\nu'} \neq 0$. We want to find an unprimed coordinate system in which

$$ g_{\mu\nu}(0) = \eta_{\mu\nu} , \qquad \partial_\lambda g_{\mu\nu} = 0 . \tag{5.6} $$

In Eq. (4.32), we wrote down a Taylor expansion for the metric

$$ g_{\mu'\nu'}(x) = \eta_{\mu'\nu'} + A_{\mu'\nu'\lambda'}\, x^{\lambda'} + \frac{1}{2} B_{\mu'\nu'\lambda'\sigma'}\, x^{\lambda'} x^{\sigma'} + \cdots . \tag{5.7} $$

We noted that A was symmetric in its first two indices, meaning that there are ten parameters in A for each value of λ′, giving a total of 40 parameters in A. The tensor B is symmetric in its first two indices and symmetric in its last two indices, giving a total of 100 parameters in B.