Special Relativity
At speeds that are a substantial fraction of the speed of light, the framework of Newtonian mechanics no longer suffices to describe many physical phenomena. Instead, one must start to take into account Einstein's theory of special relativity, which deals with the "special" case of physics in the absence of gravity. The more "general" case of general relativity takes into account gravitational effects.
Introduction
Given what is known about modern physics today, it is quite easy to take for granted that the speed of light, approximately \( 3 \cdot 10^8\: \text{m}/\text{s} \), is the unwavering speed limit of the universe. However, just a little over a century ago, this seemingly simple fact shook to the ground the Newtonian laws of physics that had stood for over two centuries and became the basis of Einstein's revolutionary 1905 paper On the Electrodynamics of Moving Bodies, in which Einstein posited the fundamental ideas of special relativity [1].
While Newtonian physics is adequate to describe the motion of most macroscopic objects, which usually move at speeds much slower than the speed of light, many macroscopic phenomena are fundamentally microscopic in nature. Due to their low mass-to-charge ratio, microscopic particles are relatively easy to accelerate to high speeds and must often be described relativistically. As it turns out, special relativity is crucial to correctly explaining the chemical activity of the elements, the behavior of magnetic materials, and the behavior of subatomic particles in particle accelerators, all of which involve particles such as electrons or protons moving at relativistic speeds.
Relativity also matters on macroscopic scales when an extremely precise measurement is required. For instance, GPS satellites, which orbit Earth roughly twenty thousand kilometers above its surface, must account for the effects of both special and general relativity on the passage of time in order to provide positions accurate enough for daily use. Other well-known cases that require general relativity are the anomalous precession of the orbit of the planet Mercury and the dynamics of black holes and other large astronomical bodies.
The marriage of special relativity and quantum mechanics in the 1930s allowed for the prediction of antimatter and soon led to the birth of particle physics via quantum field theory.
History of Special Relativity
The beginnings of special relativity came well before Einstein's 1905 paper introducing special relativity. In the decades after Maxwell fleshed out a framework for describing electricity and magnetism in the mid-nineteenth century, physicists started to become aware of possible holes in the laws of physics.
One of the direct implications of what became known as Maxwell's equations is that a propagating electromagnetic wave—that is, light—must travel at a fixed speed of approximately \( 3 \cdot 10^8 \: \text{m}/\text{s} \). This is a result that stands somewhat at odds with classical physics, which inherently requires that no speeds be absolute.
Recall that Newtonian physics assumes that relative velocities add directly according to the so-called Galilean transformation. Suppose an observer in an inertial frame of reference moves at velocity \( \mathbf{v} \) relative to some lab frame. If this observer measures an object to move with velocity \( \mathbf{w} \) in his or her frame, then one measures the velocity of the object in the lab frame to be \( \mathbf{v} + \mathbf{w} \). As a result, there always exists a frame in which an object's velocity is greater than in its rest frame. If a car driving at \( 40 \text{ m}/\text{s} \) with respect to observers on the side of the road flashes its headlights in the direction of its motion, then classical physics predicts that the observers on the side of the road should measure the speed of the light emitted from the headlights to be \( 40 \text{ m}/\text{s} + 3 \cdot 10^8 \text{ m}/\text{s} \). However, this is not what is predicted by Maxwell's equations as we currently understand them.
Similarly troubling was the fact that the velocity dependence of Maxwell's equations led to apparent asymmetries in the electric and magnetic forces. Consider, for instance, Einstein's example of a stationary loop of wire and a moving magnet. According to Faraday's law of induction, the moving magnet changes the magnetic flux through the wire loop and leads to an induced current, or an electric field, in the wire. However, the loop feels no magnetic force since it is stationary. Equivalently, one can also view the magnet as stationary and the loop as moving, in which case the loop experiences a magnetic force but no electric force. In this simplified analysis, the forces calculated in both frames turn out to be equal, but in one frame the loop perceives a magnetic field, while in the other it perceives an electric field.
Worse yet, when one considers Ampère's law (with Maxwell's modifications), one obtains an additional magnetic field term resulting from a changing electric field in the case of the moving magnet. This additional term is small but nonetheless cannot be accounted for classically.
Einstein's breakthrough in addressing these paradoxes in electromagnetism was to apply the mathematical framework of Lorentz and Poincaré from decades earlier to modify Galilean relativity. Confronted with these puzzles plus experimental observations in the late \(19^\text{th}\) century that suggested that the speed of light was indeed constant in all frames, Einstein realized Lorentz and Poincaré's mathematics could be perfectly adapted to account for a velocity bound in nature. Einstein's seminal paper on special relativity in electrodynamics incorporated their so-called Lorentz transformation to explain why propagating electromagnetic waves were consistent with the existing laws of physics. Einstein's genius was to make the bold statement that it was not Maxwell's equations but, rather, the centuries-old Newtonian framework that had to be modified. On this basis, Einstein was able to elegantly derive the Lorentz transformation from first principles in physics, leading to the modern theory of special relativity as we know it today.
Einstein's Postulates
Einstein proposed two simple postulates as the basis for special relativity, which were justified by previous experimental results and fundamental ideals for physical theories:
Postulate 1. All inertial frames are equivalent with respect to physical laws. There is no "preferred" frame of reference.
Postulate 2. The speed of light is measured to be the same value, \( c \approx 3 \cdot 10^8 \: \text{m}/\text{s} \), by observers in all inertial frames.
Recall that an inertial frame is one that is not accelerating. It is generally assumed that a frame is inertial if and only if no fictitious forces like the centrifugal force appear in the frame.
It can be shown that these two seemingly innocuous postulates have the following counterintuitive consequences, among many others:
- Time dilation. A time interval measured by an observer moving with respect to a stationary observer may be measured to be longer in the frame of the stationary observer.
- Length contraction. A length measured by an observer moving with respect to a stationary observer may be measured to be shorter in the frame of the stationary observer.
- Loss of simultaneity. Events measured to be simultaneous by an observer moving with respect to a stationary observer may not be simultaneous in the frame of the stationary observer.
Note that Postulate 1 demands that the consequences be symmetric between two frames. For example, a time interval measured in the frame of a stationary observer must also be longer in the frame of the moving observer, to whom his or her own frame appears stationary and the "stationary" observer appears to be moving. As a result, it no longer makes sense to assign absolute durations or lengths, since measured intervals of distance and time depend on the frame of measurement. The relativity of such physical quantities gave rise to the name of the theory.
Time Dilation
An immediate consequence of Postulate 2 is that measurements of time depend on the frame of reference. In general, the duration of some event (for instance, the tick of a clock) measured by an observer moving with respect to a stationary observer will be longer in the frame of the stationary observer. If an intergalactic spaceship flying by Earth measures each pulse of its laser to take one millisecond, observers on Earth measure each pulse to take longer than one millisecond; the pulse is slower (and thus "dilated") compared to the time measured by the inhabitants of the spaceship.
Time dilation is a simple consequence of the fact that if the speed of light is fixed to be the same value for two observers in different frames, either length or time will change from the usual Newtonian value when transforming between frames.
Specifically, consider a clock carried by an observer on a train moving at speed \( v \) relative to a stationary observer. For one tick of this clock, the ratio of the time \( t \) measured in the stationary observer's frame to the time \( t' \) measured in the train observer's frame is given by
\[ \frac{t}{t'} = \frac{c}{\sqrt{c^2 - v^2}}. \]
Since lengths perpendicular to the direction of motion of a frame are measured to be the same by observers in all frames (as argued below), it makes sense to consider a thought experiment containing a "light clock" set up in the perpendicular direction.
Suppose an observer in an open car of a train of height \( l \) moving to the right at a speed of \( v \) keeps track of time by pulsing light from the bottom of the train directly toward a mirror at the top of the train and measuring the time elapsed between the initial pulse and the return of the light back to the source after being reflected at the top of the train. Clearly, to the observer in the train, the time \( t' \) elapsed for one such pulse is
\[ t' = \frac{2l}{c}, \]
where \( c \) is the speed of light.
An observer outside the train, however, cannot measure the same \( t' \) for the duration of the pulse, or else the total speed of the light pulse would be
\[ v_{\text{total}} = \sqrt{v_x^2 + v_y^2} = \sqrt{v^2 + c^2}, \]
which is larger than \( c \) and thus violates Postulate 2.
Instead, it must be the case that \( v_{\text{total}} = c \), in which case
\[ c = \sqrt{v^2 + v_y^2}, \]
so
\[ v_y = \sqrt{c^2 - v^2}. \]
Therefore, the time \( t \) measured by an observer outside the train is
\[ t = \frac{2l}{\sqrt{c^2 - v^2}}, \]
and the ratio \( \frac{t}{t'} \) of time measured in the stationary observer's frame to the time measured in the train observer's frame is
\[ \frac{t}{t'} = \frac{c}{\sqrt{c^2 - v^2}}.\ _\square \]
It is customary in special relativity to name this ratio \( \gamma \) (the Greek letter gamma) and rewrite the expression in the form
\[ \gamma = \frac{c}{\sqrt{c^2 - v^2}} = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}, \]
in which case one may write simply
\[ \frac{t}{t'} = \gamma. \]
In this form, the asymptotic behavior of \( \gamma \) as a function of \( v \) is clear. As \( v \rightarrow 0 \), \( \gamma \rightarrow 1\), and as \( v \rightarrow c \), \(\gamma \rightarrow \infty \). In other words, as the speed of the train goes to zero, the time measured in the stationary observer's frame approaches that measured in the train observer's frame. However, as the speed of the train approaches the speed of light, the ratio between the two times increases without bound.
In all cases, \( \gamma \geq 1 \), so \( t \geq t' \), which proves the earlier statement that a time interval measured by an observer moving with respect to a stationary observer may be measured to be longer in the frame of the stationary observer.
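As a numerical sanity check, the light-clock argument can be sketched in a few lines of Python. The function names `lorentz_factor` and `light_clock_times`, the clock height of 3 m, and the rounded value of \( c \) are illustrative assumptions rather than part of the derivation itself.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded value used in this article)


def lorentz_factor(v: float) -> float:
    """Return gamma = 1 / sqrt(1 - v^2 / c^2) for a speed v with |v| < c."""
    if abs(v) >= C:
        raise ValueError("speed must be strictly less than c")
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def light_clock_times(height: float, v: float) -> tuple:
    """Return (t_train, t_ground) for one tick of a perpendicular light clock.

    t_train  = 2 l / c                 (measured on the train)
    t_ground = 2 l / sqrt(c^2 - v^2)   (measured by the stationary observer)
    """
    t_train = 2.0 * height / C
    t_ground = 2.0 * height / math.sqrt(C**2 - v**2)
    return t_train, t_ground


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9):
        t_train, t_ground = light_clock_times(height=3.0, v=beta * C)
        print(f"v = {beta}c: t/t' = {t_ground / t_train:.4f}, "
              f"gamma = {lorentz_factor(beta * C):.4f}")
```

For each speed the two printed numbers should agree, since the ratio of the two tick durations is exactly the factor \( \gamma \) defined above, independent of the height of the clock.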
Muon decay 1. Elementary particles called muons are constantly produced in the upper atmosphere by collisions of cosmic rays with the atmosphere. Because muons are relatively light (about 200 times heavier than an electron), they travel at nearly the speed of light. For the sake of simplicity, assume that all of the muons produced travel at \( 0.998 c \).
Given an average distance of \( 15 \, \text{km} \) from the upper atmosphere to the earth's surface, classically one might believe that no muons should reach the surface because muons have a half-life of \( 1.56 \cdot 10^{-6} \, \text{s} \). Even though the muons travel at nearly the speed of light \( \big(3 \cdot 10^8 \, \text{m}/\text{s}\big), \) they exist for such a short time that the average muon travels no more than a few hundred meters.
However, note that the decay time is the time measured in the muon's frame. As measured by an observer on the earth, the half-life is longer by a factor of \( \gamma \approx 15.8 \). As a result, according to special relativity, the muon travels almost \( 16 \) times farther in the stationary observer's frame than the classical prediction. This extra distance traveled (on the order of kilometers) allows an appreciable number of muons to reach the earth's surface. Indeed, the relativistic prediction matches the experimental observation of muon detection on the earth's surface.
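The arithmetic of this example can be reproduced with a short script. This is only a sketch under the simplifying assumptions of the example (a single muon speed, a fixed distance of 15 km, and the rounded constants quoted in the text); the helper name `surviving_fraction` is hypothetical.

```python
import math

C = 3.0e8            # speed of light, m/s (rounded)
HALF_LIFE = 1.56e-6  # muon half-life in its rest frame, s (value from the text)
SPEED = 0.998 * C    # assumed common muon speed, m/s
DISTANCE = 15.0e3    # upper atmosphere to the surface, m

gamma = 1.0 / math.sqrt(1.0 - (SPEED / C) ** 2)

# Distance covered during one half-life, without and with time dilation.
classical_range = SPEED * HALF_LIFE
dilated_range = SPEED * gamma * HALF_LIFE


def surviving_fraction(range_per_half_life: float) -> float:
    """Fraction of muons left after covering DISTANCE.

    The population halves once for every `range_per_half_life` meters traveled.
    """
    return 0.5 ** (DISTANCE / range_per_half_life)


print(f"gamma                  = {gamma:.2f}")
print(f"classical range        = {classical_range:.0f} m")
print(f"relativistic range     = {dilated_range / 1e3:.2f} km")
print(f"surviving (classical)  = {surviving_fraction(classical_range):.2e}")
print(f"surviving (relativity) = {surviving_fraction(dilated_range):.2f}")
```

Under these assumptions roughly a quarter of the muons survive the trip in the relativistic calculation, compared with a fraction of order \( 10^{-10} \) in the classical one.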
Length Contraction
With the time dilation result in hand, one can show that lengths parallel to the direction of travel are shorter in the frame of the stationary observer. Lengths perpendicular to the direction of travel are the same in both frames, however, as argued below.
Analogous to the time dilation result, one can express the ratio of the length measured in the stationary observer's frame \( l \) to the length measured in the train observer's frame \( l' \). The result is
\[ \frac{l}{l'} = \frac{1}{\gamma}. \]
Consider a thought experiment containing a "light clock" set up in the parallel direction. Place a mirror at the right end of the train and a light source at the left end. Let the length of the train in the train observer's frame be \( l' \), and let the length of the train measured in the stationary observer's frame outside of the train be \( l \). (Note that a priori it is not necessarily true that \( l = l' \).)
Suppose that the train moves to the right at a speed of \( v \) and that the observer in the train keeps track of time by pulsing light from the left end of the train and measuring the time elapsed between the initial pulse and the return of the light back to the source after being reflected at the right end. To the observer in the train, the time \( t' \) elapsed for one such pulse is
\[ t' = \frac{2l'}{c}, \]
where \( c \) is the speed of light.
However, an observer outside the train measures a different result. The time to reach the mirror from the source is no longer just the length of the train divided by the speed of the light but rather
\[ t_\text{right} = \frac{l}{c - v}. \]
To obtain this result, note that the train travels distance \(v t_\text{right}\) in time \(t_\text{right}\), so
\[ t_\text{right} = \frac{l+v t_\text{right}}{c}\]
and solving for \( t_\text{right} \) gives \( t_\text{right} = \frac{l}{c - v}\).
By the same argument with a sign change, the time needed for light to reach the source from the right end is
\[ t_\text{left} = \frac{l}{c + v}. \]
Therefore, the total time \( t \) for one tick of the clock in the stationary observer's frame is
\[ t = t_\text{right} + t_\text{left} = \frac{2lc}{c^2 - v^2}. \]
Since \( \frac{t}{t'} = \gamma \), it follows that
\[ \frac{2lc}{c^2 - v^2} = \frac{2 \gamma l'}{c}. \]
Straightforward algebra leads to the desired result:
\[ \gamma = \frac{l'}{l}.\ _\square \]
As the speed of the moving frame approaches the speed of light, the length measured in the stationary observer's frame therefore becomes arbitrarily small, whereas for \( v = 0 \) one has \( \gamma = 1 \) and the length is unchanged.
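The algebra of the parallel light clock can also be checked numerically. The sketch below solves the tick-time equation for \( l \) and compares it with \( l'/\gamma \); the function name `contracted_length` and the 100 m train length are hypothetical choices made for illustration.

```python
import math

C = 3.0e8  # speed of light, m/s (rounded)


def contracted_length(rest_length: float, v: float) -> float:
    """Length l in the stationary frame, obtained from the light-clock argument."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    t_train = 2.0 * rest_length / C  # one tick in the train frame: 2 l' / c
    t_ground = gamma * t_train       # the same tick in the stationary frame
    # Stationary frame: t = t_right + t_left = 2 l c / (c^2 - v^2); solve for l.
    return t_ground * (C**2 - v**2) / (2.0 * C)


rest_length = 100.0  # hypothetical train length in its own frame, m
for beta in (0.0, 0.5, 0.9, 0.99):
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    l = contracted_length(rest_length, beta * C)
    print(f"v = {beta}c: l = {l:.2f} m, l'/gamma = {rest_length / gamma:.2f} m")
```

The two columns should coincide at every speed, with \( l = l' \) at \( v = 0 \).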
Muon decay 2. Postulate 1 suggests that the previous muon decay example should be equivalent when viewed in the frame of the muon, which travels toward the earth at a speed \( 0.998 c \). In this frame, the half-life is properly \( 1.56 \cdot 10^{-6} \, \text{s} \). Do muons still reach the earth's surface?
In the frame of the muon, the earth travels toward the muon at almost the speed of light, so lengths at rest in the earth's frame are measured in the muon's frame to be shorter by a factor of \( \gamma \approx 15.8 \). As a result, the atmosphere-earth distance of \( 15 \, \text{km} \) is reduced to about \( 0.95 \, \text{km} \), which produces results equivalent to the calculation performed in the frame of the earth: an appreciable number of muons still reach the surface.
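The equivalence of the two descriptions can be made concrete by counting how many half-lives elapse during the trip in each frame. This is a minimal sketch using the rounded constants from the example; the variable names are illustrative.

```python
import math

C = 3.0e8                # speed of light, m/s (rounded)
HALF_LIFE = 1.56e-6      # muon half-life in its rest frame, s
SPEED = 0.998 * C        # relative speed of muon and earth, m/s
DISTANCE_EARTH = 15.0e3  # atmosphere-to-surface distance in the earth's frame, m

gamma = 1.0 / math.sqrt(1.0 - (SPEED / C) ** 2)

# Earth's frame: full distance, half-life dilated by gamma.
half_lives_earth = (DISTANCE_EARTH / SPEED) / (gamma * HALF_LIFE)

# Muon's frame: distance contracted by gamma, proper half-life.
distance_muon = DISTANCE_EARTH / gamma
half_lives_muon = (distance_muon / SPEED) / HALF_LIFE

print(f"contracted distance    = {distance_muon:.0f} m")
print(f"half-lives (earth)     = {half_lives_earth:.3f}")
print(f"half-lives (muon)      = {half_lives_muon:.3f}")
```

Both frames give the same number of elapsed half-lives (about two for these values), so both predict the same surviving fraction of muons.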
A simple thought experiment, discussed below, shows that lengths perpendicular to the direction of motion of a frame must be measured to be the same by observers in all frames.
There are two sticks of equal length aligned parallel to each other and perpendicular to the direction of their relative motion. If the stick on the right moves toward the other, show that the length measured in the frame of either stick is the same.
Suppose that the stick on the right has globs of paint on both ends. For the sake of contradiction, suppose that lengths measured by an observer in a moving frame are measured to be shorter in the frame of a stationary observer. In that case, in the frame of the stick on the left, the stick on the right is shorter, and therefore the right stick paints the left stick as the right stick passes through the left stick. However, in the frame of the stick on the right, the stick on the left is moving toward the right stick, which means that the left stick is shorter in the frame of the right stick. Therefore, the right stick does not paint the left stick as the left stick passes through the right stick. Whether or not paint marks appear on the left stick cannot depend on the frame, so the two frames disagree in violation of Postulate 1. Our assumption must therefore be false, and both sticks must be the same length in both frames.
Similarly, a contradiction results if one supposes that lengths are longer to observers in a stationary frame.
Loss of Simultaneity
In the derivation for the length contraction result, there was a certain asymmetry between the pulse of light moving toward the right end of the train (in the direction of travel of the train) and the light moving back toward the left end of the train (opposite to the direction of travel of the train). This observation is the basis for the fact that events simultaneous in the train frame may not be simultaneous in the frame of the stationary observer.
Suppose events at the left and right ends of a train are simultaneous in the train frame. It turns out that these events are not simultaneous in a stationary observer's frame outside the train. Furthermore, the time elapsed between the events in the stationary observer's frame is
\[ \Delta t = \frac{\gamma l' v}{c^2}, \]
with the event on the side of the left end of the train occurring first, \( l' \) the length of the train in the train frame, and \( v \) the speed of the train.
Suppose a light source is placed in the center of a train, which in the stationary observer's frame is of length \( l \). In the train frame, the light reaches both ends of the train simultaneously, but this is not the case in the stationary observer's frame. If the train moves to the right, the amount of time required for light to reach the right end as measured by the stationary observer is
\[ t_\text{right} = \frac{l}{2(c - v)}, \]
whereas the time for light to reach the left end is
\[ t_\text{left} = \frac{l}{2(c + v)}, \]
via the same reasoning as before. Using the expression for length contraction \( l/l' = 1/\gamma \) yields
\[ \Delta t = t_\text{right} - t_\text{left} = \frac{l'}{\gamma} \left[\frac{1}{2(c - v)} - \frac{1}{2(c + v)}\right], \]
which can be rearranged to
\[ \Delta t = \frac{\gamma l' v}{c^2}. \]
While the expressions for time dilation and length contraction involved only time or length coordinates separately, here there is clearly "mixing" of the length and time coordinates in different frames. For this reason, in relativistic physics, one often refers to "spacetime" as a single entity, in which time sits on equal footing with the three spatial dimensions.
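The rearrangement leading to \( \Delta t = \gamma l' v / c^2 \) can be verified numerically. The sketch below computes the delay directly from the two light travel times and compares it with the closed form; the function name `simultaneity_gap`, the 100 m train length, and the speed \( 0.6c \) are hypothetical choices.

```python
import math

C = 3.0e8  # speed of light, m/s (rounded)


def simultaneity_gap(rest_length: float, v: float) -> float:
    """Stationary-frame delay between light reaching the left and right ends."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    l = rest_length / gamma        # contracted length in the stationary frame
    t_right = l / (2.0 * (C - v))  # light chasing the receding right end
    t_left = l / (2.0 * (C + v))   # light meeting the approaching left end
    return t_right - t_left


rest_length = 100.0  # hypothetical train length l' in the train frame, m
v = 0.6 * C          # hypothetical train speed, for which gamma = 1.25
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)

direct = simultaneity_gap(rest_length, v)
closed_form = gamma * rest_length * v / C**2

print(f"direct      = {direct:.3e} s")
print(f"closed form = {closed_form:.3e} s")
```

The gap is positive, consistent with the event at the left end occurring first in the stationary observer's frame.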
Lorentz Transformation
As they have been presented, the expressions for the key relativistic effects (time dilation, length contraction, and loss of simultaneity) only apply directly to each of the effects taken separately. In general, all of the effects can be systematically dealt with in one fell swoop. One can show that the so-called Lorentz transformation, which relates the stationary observer's coordinates to those of an observer in a moving frame, is consistent with all of the expressions previously obtained:
\[ \begin{align} ct' &= \gamma \left( ct - \frac{vx}{c}\right)\\ x' &= \gamma\left(x - \frac{v}{c} ct\right) \\ y' &= y \\ z' &= z. \end{align} \]
Here, the moving frame travels along the \( x \)-axis with velocity \( v \), with the coordinates of the moving frame denoted by primes. Usually, the Lorentz transformation is written as a matrix transformation between vectors of coordinates, defining \(\beta = \frac{v}{c}\):
\[\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma \beta & 0 & 0 \\ -\gamma \beta & \gamma & 0 &0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 &1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} .\]
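A minimal sketch of the transformation in code may help show how it reproduces the earlier results. The function name `boost`, the choice \( v = 0.6c \), and the sample coordinates are assumptions made for illustration; all coordinates are expressed in meters, with time entering as \( ct \).

```python
import math

C = 3.0e8  # speed of light, m/s (rounded)


def boost(ct: float, x: float, y: float, z: float, v: float) -> tuple:
    """Lorentz-transform (ct, x, y, z) into a frame moving at speed v along +x."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    ct_prime = gamma * (ct - beta * x)
    x_prime = gamma * (x - beta * ct)
    return ct_prime, x_prime, y, z


v = 0.6 * C  # for this speed, gamma = 1.25

# Time dilation: a clock at the moving frame's origin sits at x = v t in the
# stationary frame. When ct = 5 m, it is at x = 3 m.
ct_p, x_p, _, _ = boost(5.0, 3.0, 0.0, 0.0, v)
print(ct_p, x_p)  # approximately 4.0 and 0.0, so t / t' = 1.25 = gamma

# Loss of simultaneity: two events at ct' = 0 in the moving frame, at x' = 0
# and x' = 100 m. Boosting by -v returns to the stationary frame.
ct_left, _, _, _ = boost(0.0, 0.0, 0.0, 0.0, -v)
ct_right, _, _, _ = boost(0.0, 100.0, 0.0, 0.0, -v)
print((ct_right - ct_left) / C)  # equals gamma * l' * v / c^2
```

Boosting by \( -v \) is used as the inverse transformation, which follows from Postulate 1: the stationary frame moves at \( -v \) relative to the moving one.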
Relativistic Dynamics
If special relativity is true, why is it that one never encounters relativistic phenomena in everyday experience? The key lies in the size of \( \gamma \) at everyday velocities. For \( v \ll c \), \(1 - v^2/c^2 \approx 1\), and therefore \( \gamma \approx 1 \). Thus, at speeds much smaller than the speed of light, relativistic effects are quite minimal, since the \(\gamma\) factor modifying equations of motion effectively does not contribute any correction.
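To make the approximation quantitative, one can use the standard binomial expansion (a supplementary step not needed for the rest of the discussion). For \( v \ll c \),
\[ \gamma = \left(1 - \frac{v^2}{c^2}\right)^{-1/2} \approx 1 + \frac{v^2}{2c^2}. \]
For a car moving at \( 40 \text{ m}/\text{s} \), the correction term is \( \frac{v^2}{2c^2} \approx 9 \cdot 10^{-15} \), far too small to notice in everyday experience.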
To illustrate how \(\gamma\) changes with \(v\), consider the table below of values of \( \gamma \) for several speeds \( v \) (expressed as fractions of the speed of light \( c \)). For comparison, the speed of light is more than twenty thousand times the escape velocity from the earth (about \( 11 \, \text{km}/\text{s} \)):
\[ \begin{array}{cc} \\ v & \gamma \\[.01cm] \hline \\[.01cm] 0.1 c & 1.005 \\ 0.25 c & 1.033 \\ 0.5 c & 1.155 \\ 0.9 c & 2.294 \\ 0.99 c & 7.089 \end{array} \]
Clearly, the value of \( \gamma \) is barely larger than \( 1 \) for any speeds that are not an appreciable fraction of \( c \).
One of the clear implications of special relativity is the fact that no object with mass can travel at the speed of light or faster. This presents a clear problem with the Newtonian expressions of various dynamical quantities such as the kinetic energy \( \frac{1}{2} mv^2 \) and the momentum \( m \mathbf{v} \). If \( v < c \), both quantities are bounded above, even though no physical law prevents an arbitrarily large (but finite) amount of energy from being added to a system.
It turns out that there exist proper relativistic expressions for the energy and momentum that have the proper asymptotic behavior as \( v \) approaches \( c \). The total energy of a particle of mass \( m \) and speed \( v \) is given by
\[ E = \gamma mc^2. \]
One peculiar outcome of this expression is that a particle at rest must have some rest energy \(E_0 = mc^2 \), which is a powerful implication in and of itself.
Similarly, the relativistic momentum is
\[ \mathbf{p} = \gamma m \mathbf{v}, \]
which, like the relativistic energy, has the correct limiting behavior.
These equations, like all others in special relativity, can be derived by demanding consistency with Einstein's two postulates or equivalently enforcing the Lorentz transformation in a variety of physical circumstances.
As in classical physics, the total energy and momentum of an isolated system are always conserved. (However, mass is no longer conserved in general. Einstein's equation \( E = mc^2 \) shows that energy may be converted to mass and vice versa, and only the total energy, including the rest energy of the masses, is conserved.)
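The limiting behavior of the relativistic expressions can be explored numerically. The sketch below uses an electron as the test particle (with a rounded mass) and takes the kinetic energy to be \( E - mc^2 \), the standard definition, which is not stated explicitly above; the function names are illustrative.

```python
import math

C = 3.0e8                 # speed of light, m/s (rounded)
ELECTRON_MASS = 9.11e-31  # kg (rounded)


def gamma(v: float) -> float:
    """Lorentz factor for a speed v with |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def total_energy(m: float, v: float) -> float:
    """Relativistic total energy E = gamma m c^2, in joules."""
    return gamma(v) * m * C**2


def momentum(m: float, v: float) -> float:
    """Magnitude of the relativistic momentum p = gamma m v, in kg m/s."""
    return gamma(v) * m * v


def kinetic_energy(m: float, v: float) -> float:
    """Total energy minus rest energy (standard definition, assumed here)."""
    return total_energy(m, v) - m * C**2


m = ELECTRON_MASS
for beta in (0.01, 0.5, 0.99):
    v = beta * C
    ratio = kinetic_energy(m, v) / (0.5 * m * v**2)
    print(f"v = {beta}c: relativistic KE / Newtonian KE = {ratio:.4f}")
```

At low speeds the ratio is very close to 1, recovering the Newtonian result, while it grows without bound as \( v \) approaches \( c \).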
References
[1] Einstein, A. "On the Electrodynamics of Moving Bodies." Annalen der Physik, 1905.
[2] Morin, D.J. Introduction to Classical Mechanics. Cambridge University Press, 2007.
[3] Taylor, J.R. Classical Mechanics. University Science Books, 2005.