Lagrangian Mechanics
Newton's laws of motion are the foundation on which all of classical mechanics is built. Everything from celestial mechanics to rotational motion, to the ideal gas law, can be explained by the powerful principles that Newton wrote down. The main difficulty in applying the Newtonian algorithm is in identifying all the forces between objects, which requires some ingenuity.
This all stems from the fact that Newton's laws are written in terms of vector quantities that are easiest to use in Cartesian coordinates. One can of course rewrite a vector in any system of coordinates, but it is not the most transparent operation. We also need an insightful choice of coordinate system to simplify all our calculations. Moreover, choosing the wrong reference frame can give rise to confusing artifacts.
The scheme would be more useful if our important quantities could be easily rewritten and solved in the most convenient coordinate system. What we imagine is the ability to describe our systems in terms of scalars, instead of vectors. That is, to write down numbers like mass, energy, or momentum squared which are invariant under a change in coordinates. Such is the aim of the Lagrangian formulation of mechanics.
Let's begin by reviewing some hard problems of Newtonian mechanics and pointing out what makes them so difficult to resolve.
Contents
- Hard Problems for the Newtonian Algorithm
- Complaints about Newtonian Mechanics
- Newtonian Mechanics: The Elevator Pitch
- Throwing out Forces
- Kicking out Vectors, Bringing in Energy
- Implicit Constraints
- Some Solved Problems
- Generalized Momenta and Conservation
- Simple Solutions to Hard Problems in Newtonian Mechanics
Hard Problems for the Newtonian Algorithm
Without making an attempt at solution, we point out some challenging problems for Newton's approach.
Mass on a sliding wedge
Consider the block-shaped mass on a sliding wedge below. Both the block and the wedge are free to move without friction under the force of gravity. Setting this problem up is difficult for two reasons. The first is that the mass \(M\) moves along the floor while \(m\) moves at an angle, which creates a tension in the choice of coordinates (the Cartesian grid in the frame of the ground, and another in the frame of the tilted surface). The second is that the normal force between \(M\) and \(m\) depends upon the motion of \(M\) and \(m\).
Bead on a spinning hoop
We might also consider a bead that is free to slide on a rotating hoop. This situation does not easily yield to a vector description because gravity beckons us to use Cartesian coordinates, while the constraint of the hoop begs that we use polar coordinates. Finding a representation that is at once easy to think about, and straightforward to calculate in, is no easy task.
Coupled pendulums
As a final example, consider the coupled pendulum, where one pendulum hangs from the end of another. This scenario is plagued by the choice of convenient coordinates, the coupling of the two pendulums, and the constraint that keeps the two pendulums from breaking apart.
Complaints about Newtonian Mechanics
Each of these problems would be extremely difficult to resolve by the usual approach of identifying forces, writing down constraints, transforming to a convenient system of coordinates, and possibly uncoupling equations by clever substitutions. Let us outline these difficulties explicitly, in an anti-wish list of sorts.
Vector representation: Representing a system using vectors cripples us when we need to change coordinate systems. For example, though it is easy to write down the force of gravity as \(-mg\hat{z}\) in a Cartesian coordinate system, it is most natural to represent the motion of a pendulum in terms of the angle \(\theta\) of the swing. Rewriting the acceleration vector of the pendulum bob \(\ddot{\vec{r}}\) in terms of \(\theta\) is tedious and requires special knowledge of the transformation properties of the radial and angular unit vectors. In more involved problems, this only becomes more of a hindrance. Contrast this with the ease with which we can transform scalar quantities such as the kinetic energy \(\textrm{K.E.}=\frac12 m v^2 \rightarrow \frac12 m \omega^2 r^2 +\frac12 m \dot{r}^2 \), where \(\omega = \dot{\theta}\). Thus, we would benefit from a mechanics freed from vector quantities.
Identifying forces: In using Newton's laws, we must account for all forces as well as their counter forces. In systems with more than one particle, this can become paralyzing as the set of contact forces between particles explodes into a combinatorial nightmare. We should hope for a mechanics where this bookkeeping does not keep us from analyzing problems of interest. Thus, if we can find a way to free ourselves from the chains of vector forces, we should be able to reason about more complex systems.
Constraints: In Newtonian mechanics, we must explicitly build constraints into the equations of motion. For example, a mass on an inclined plane must remain on the surface of the plane, and this must be treated by introducing a normal force representing the constraint of the surface. This can quickly become very complicated, especially when the constraining surface is itself free to move, as in the first example above. It would make our task much easier if constraints could be imposed implicitly, for instance, in our choice of coordinates.
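As a sketch of why the kinetic energy quoted under "Vector representation" transforms so easily, take plane polar coordinates with \(x = r\cos\theta\) and \(y = r\sin\theta\) (the symbols \(x\) and \(y\) are introduced here only for illustration). Differentiating with respect to time gives
\[\begin{align} \dot{x} &= \dot{r}\cos\theta - r\dot{\theta}\sin\theta \\ \dot{y} &= \dot{r}\sin\theta + r\dot{\theta}\cos\theta, \end{align}\]
and on squaring and adding, the cross terms cancel:
\[v^2 = \dot{x}^2 + \dot{y}^2 = \dot{r}^2 + r^2\dot{\theta}^2.\]
With \(\omega = \dot{\theta}\), the kinetic energy is \(\frac12 m \dot{r}^2 + \frac12 m \omega^2 r^2\). No unit vectors, or their time derivatives, were needed along the way.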
Newtonian Mechanics: The Elevator Pitch
We now remind ourselves of the principles of Newtonian mechanics, and suggest a formulation that will aid us in our effort to reshape mechanics in an energy-based, vector-free manner. Newton's second law states that the mass of any particle times its acceleration is equal to the net force upon the particle, where the acceleration and net force are both vectors in one, two, or three dimensions:
\[\vec{F}_\text{net} = m\vec{a}.\]
We recall the work-energy principle, which states that the work done by the net force on a particle is equal to the change in kinetic energy of the particle. In practice only the non-constraint forces contribute: ideal constraint forces act perpendicular to the displacements that the constraint allows, and thus perform no work on the system.
Consider an arbitrary, infinitesimal displacement \(\delta\vec{r}\) of the position of our particle under the net force. We can say \(\vec{F}\cdot \delta\vec{r} = m\vec{a}\cdot\delta\vec{r}\). If we integrate along a finite displacement, we find that
\[W = \int \vec{F}_\text{net} \cdot \delta\vec{r}\]
and therefore
\[\int \left(\vec{F}_\text{net}-m\vec{a}\right)\cdot \delta\vec{r} = 0.\]
This statement is known as d'Alembert's principle and is the jumping-off point for our efforts.
We now proceed with a reformulation of Newtonian mechanics that addresses the complaints we outlined in our anti-wish list.
Throwing out Forces
Our first desire is to exchange vector forces for scalar energies. We recall that conservative forces are those that can be written as the spatial derivative of a potential function, i.e.
\[F_i = -\frac{\partial V(r_1, \ldots, r_n)}{\partial r_i}.\]
At the lowest level (i.e. particle interactions) all forces are conservative, but in macroscopic perspectives, conservative forces between particles are often approximated by dissipative forces like contact friction or viscosity. For now, let us proceed with conservative forces, and worry about how to accommodate dissipative kludges later. Our first step is to rewrite Newton's second law \(\vec{F} = m\ddot{\vec{r}}\) as
\[m\ddot{\vec{r}} \cdot \delta\vec{r} = - \frac{\partial V(\vec{r})}{\partial \vec{r}}\cdot \delta\vec{r} = - \big[\nabla_r V(\vec{r})\big] \cdot \delta\vec{r}.\]
Kicking out Vectors, Bringing in Energy
We have thus dispensed with forces by relating the spatial derivatives of the potential energy to the acceleration vector. However, we are still dealing in vectors. Our next objective is to eliminate our dependence upon using vector quantities to represent positions and velocities. Concretely, we would like to rid ourselves of any terms containing derivatives of position vectors and replace them with derivatives of scalar quantities like energy.
We proceed from the statement of d'Alembert's principle
\[\vec{F}_\text{net} \cdot \delta\vec{r} = m\frac{d^2\vec{r}}{dt^2} \cdot \delta\vec{r},\]
where \(\delta \vec{r}\) is some infinitesimal movement of the particle. We can immediately rewrite the right-hand side as
\[m\left[ \frac{d}{dt}\left( \frac{d\vec{r}}{dt}\cdot \delta\vec{r} \right) - \frac{d\vec{r}}{dt}\cdot\frac{d\delta\vec{r}}{dt}\right],\]
where we have applied the identity \(\left(xy\right)^\prime = x^\prime y + y^\prime x\).
As written, we have an equation for the vector \(\vec{r}\) that makes no reference to a particular choice of coordinates (Cartesian, spherical, etc). We now expand the RHS in a coordinate basis \(r_1, \ldots, r_n:\)
\[\begin{align} &m\sum_k\left[\frac{d}{dt}\left( \frac{d\vec{r}}{dt}\cdot \frac{\partial \vec{r}}{\partial r_k}\delta r_k \right) - \frac{d\vec{r}}{dt}\cdot\frac{\partial\dot{\vec{r}}}{\partial r_k} \delta r_k\right] \\ =& m\sum_k\left[\frac{d}{dt}\left( \dot{\vec{r}}\cdot \frac{\partial \vec{r}}{\partial r_k}\right) - \dot{\vec{r}}\cdot\frac{\partial\dot{\vec{r}}}{\partial r_k} \right]\delta r_k. \end{align} \]
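One way to justify pulling \(\delta r_k\) outside the time derivative is sketched here, assuming that \(\vec{r}\) depends on time only through the coordinates \(r_k(t)\). Writing \(\delta\vec{r} = \sum_k \frac{\partial \vec{r}}{\partial r_k}\delta r_k\), the product rule gives
\[\begin{align} \frac{d}{dt}\left(\dot{\vec{r}}\cdot\delta\vec{r}\right) &= \sum_k\left[\frac{d}{dt}\left(\dot{\vec{r}}\cdot\frac{\partial \vec{r}}{\partial r_k}\right)\delta r_k + \dot{\vec{r}}\cdot\frac{\partial \vec{r}}{\partial r_k}\frac{d\,\delta r_k}{dt}\right] \\ \dot{\vec{r}}\cdot\frac{d\,\delta\vec{r}}{dt} &= \sum_k\left[\dot{\vec{r}}\cdot\frac{d}{dt}\left(\frac{\partial \vec{r}}{\partial r_k}\right)\delta r_k + \dot{\vec{r}}\cdot\frac{\partial \vec{r}}{\partial r_k}\frac{d\,\delta r_k}{dt}\right]. \end{align}\]
The terms containing \(\frac{d\,\delta r_k}{dt}\) cancel in the difference, and by the chain rule together with the symmetry of mixed partial derivatives,
\[\frac{d}{dt}\left(\frac{\partial \vec{r}}{\partial r_k}\right) = \sum_j \frac{\partial^2 \vec{r}}{\partial r_j\,\partial r_k}\dot{r}_j = \frac{\partial \dot{\vec{r}}}{\partial r_k},\]
which leaves the second line of the expansion above.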
Now, for any \(\vec{r}\), we have the following identity, known as dot-cancellation: (why is this true?)
Dot-cancellation
If we expand a vector \(\vec{r}\) in any coordinate basis \(\{r_1,\ldots,r_n\}\), we have
\[\frac{\partial \vec{r}}{\partial r_k} = \frac{\partial \dot{\vec{r}}}{\partial \dot{r}_k}.\]
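A short argument for this identity, sketched under the assumption that \(\vec{r}\) depends on time only through the coordinates \(r_j(t)\): by the chain rule,
\[\dot{\vec{r}} = \sum_j \frac{\partial \vec{r}}{\partial r_j}\dot{r}_j.\]
The coefficients \(\frac{\partial \vec{r}}{\partial r_j}\) depend on the coordinates but not on the velocities, so \(\dot{\vec{r}}\) is linear in the \(\dot{r}_j\). Differentiating with respect to \(\dot{r}_k\) therefore picks out the single term with \(j=k\):
\[\frac{\partial \dot{\vec{r}}}{\partial \dot{r}_k} = \frac{\partial \vec{r}}{\partial r_k}.\]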
Employing this identity, the right-hand side becomes
\[m\sum_k\left[\frac{d}{dt}\left( \dot{\vec{r}}\cdot \frac{\partial \dot{\vec{r}}}{\partial \dot{r}_k}\right) - \dot{\vec{r}}\cdot\frac{\partial\dot{\vec{r}}}{\partial r_k} \right]\delta r_k. \]
Recognizing \(\dot{\vec{r}} = \vec{v}\), we rewrite this as
\[\begin{align} &m\sum_k\left[\frac{d}{dt}\left(\frac12 \frac{\partial v^2}{\partial \dot{r}_k}\right) - \frac12 \frac{\partial v^2}{\partial r_k} \right]\delta r_k \\ &=\sum_k\left[\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{r}_k}\right) - \frac{\partial T}{\partial r_k} \right]\delta r_k, \end{align}\]
where in the second line we've made the replacement \(T = \frac12 mv^2.\)
Thus
\[\begin{align} - \sum_k \frac{\partial V}{\partial r_k} \delta r_k &= \sum_k\left[\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{r}_k}\right) - \frac{\partial T}{\partial r_k} \right] \delta r_k \\ 0 &= \sum_k\left[\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{r}_k}\right) - \frac{\partial T}{\partial r_k} + \frac{\partial V}{\partial r_k} \right] \delta r_k . \end{align} \]
Now, remember that the small displacement \(\delta \vec{r}\) was arbitrary. In three dimensions, we could have \(\delta\vec{r} = \langle \delta_x, 0, \delta_z \rangle\) or \(\langle 0, 0, \delta_z \rangle\). But the equation must be true independent of the displacement, i.e. every term in the sum is equal to zero independent of the coordinate displacements \(\delta r_k.\) Therefore, we have
\[0 = \frac{d}{dt}\left(\frac{\partial T}{\partial \dot{r}_k}\right) + \frac{\partial\left(V - T\right)}{\partial r_k} \]
for each \(k\).
Now, for conservative forces, \(V\) is a pure function of the position variables so that \(\frac{\partial V}{\partial \dot{r}_k}=0\). We can therefore move \(V\) into the derivative and write
\[\frac{\partial\left(T - V\right)}{\partial r_k} = \frac{d}{dt}\frac{\partial \left(T-V\right)}{\partial \dot{r}_k}. \]
We have thus completely eliminated the use of position and velocity vectors from classical mechanics. If we agree to call the difference between the kinetic energy and the potential energy \(L=T-V\), we can shorten our equation to
\[\frac{\partial L}{\partial r_k} = \frac{d}{dt}\frac{\partial L}{\partial \dot{r}_k} .\]
The quantity \(T-V\) is called the Lagrangian of the system, and the equation for \(L\) is called the Euler-Lagrange equation, which we will refer to as the Euler equation for short. In any problem of interest, we obtain the equations of motion in a straightforward manner by evaluating the Euler equation for each variable. For example, in a spherical coordinate system, we would have three Euler equations for \(r, \theta\), and \(\phi.\)
The Lagrangian, \(L\), of a system is the difference of the kinetic energy \(T\) and the potential energy \(V:\)
\[L(r, \dot{r}) \equiv T(r, \dot{r}) - V(r).\]
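To make the Euler equation concrete, here is a minimal numerical sketch for a single coordinate. It estimates \(\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q}\) along a trial path using finite differences. The helper names (`partial`, `euler_residual`) and the mass-on-a-spring test system are assumptions chosen for illustration; they do not come from the article.

```python
import math

def partial(f, args, i, h=1e-5):
    """Central-difference estimate of the partial derivative of f in args[i]."""
    up, down = list(args), list(args)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

def euler_residual(lagrangian, q, t, h=1e-4):
    """Estimate d/dt(dL/dq_dot) - dL/dq at time t along the path q(t)."""
    def q_dot(s):
        return (q(s + h) - q(s - h)) / (2 * h)

    def momentum(s):
        # Generalized momentum dL/dq_dot evaluated on the path.
        return partial(lagrangian, (q(s), q_dot(s)), 1)

    dp_dt = (momentum(t + h) - momentum(t - h)) / (2 * h)
    dL_dq = partial(lagrangian, (q(t), q_dot(t)), 0)
    return dp_dt - dL_dq

# Illustrative system (an assumption, not from the text): a mass on a spring.
m, k = 1.0, 4.0

def spring_lagrangian(x, x_dot):
    return 0.5 * m * x_dot**2 - 0.5 * k * x**2

def true_path(t):
    return math.cos(math.sqrt(k / m) * t)

def wrong_path(t):
    return t**2

print(euler_residual(spring_lagrangian, true_path, 1.0))   # close to zero
print(euler_residual(spring_lagrangian, wrong_path, 1.0))  # clearly nonzero
```

A path that obeys the equation of motion gives a residual near zero (up to finite-difference error), while an arbitrary path does not. The step sizes are rough choices and would need care for stiffer problems.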
Implicit Constraints
Our last aim was to eliminate the explicit inclusion of constraint forces. Perhaps it won't be so surprising that having eliminated forces, we've eliminated constraint forces as well. In fact, they were tossed out at the introduction of d'Alembert's principle. In the Lagrangian formulation, constraints can be incorporated in several ways; we'll focus on the simplest.
When we eliminated vectors in favor of coordinates, an important fact went unmentioned. Not only did we eliminate vectors, but we restated mechanics in terms of generalized coordinates, i.e. we are free to choose any coordinate system whatsoever that is most natural to describe our system. For example, if a particle is constrained to move on the surface of a sphere, we can use a spherical coordinate system where \(r\) is held constant. Thus, we include constraints intrinsically by our choice of coordinates!
Some Solved Problems
This is all a bit abstract. Let's apply the Lagrangian formulation to a few problems and develop a concrete feeling for what we've done.
Mass on an inclined plane
The mass slides along the inclined plane, so we can choose our coordinate to be the distance \(s\) along the plane, measured up the incline from the starting point. In this coordinate system, the kinetic energy of the mass is given by \(T=\frac12 m \dot{s}^2\). If we define the initial potential energy to be zero, then the potential energy as a function of \(s\) is given by \(V=+mgs\sin\theta\), where \(\theta\) is the angle of the incline.
Thus our Lagrangian is given by
\[L = \frac12 m \dot{s}^2 - mgs\sin\theta.\]
As \(s\) is the only variable coordinate in this problem, we have one Euler equation to evaluate, which we now do:
\[\begin{align} \frac{d}{dt}\frac{\partial L}{\partial \dot{s}} &= \frac{\partial L}{\partial s} \\ m\frac{d}{dt}\dot{s} &= - mg\sin\theta \\ m\ddot{s} &= - mg\sin\theta\\ \ddot{s} &= - g\sin\theta, \end{align}\]
which is the equation of motion we expect for the mass on an inclined plane: the minus sign says that the mass accelerates down the incline, in the direction of decreasing \(s\). Notice that our solution required almost no thought. As soon as we write down the kinetic and potential energies, the solution amounts to taking some mindless derivatives.
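As a quick sanity check on this result, the sketch below evaluates the constant-acceleration motion implied by \(\ddot{s} = -g\sin\theta\) and confirms that \(T+V\) stays fixed along it. The function names, the 30 degree angle, and the initial speed are illustrative assumptions.

```python
import math

def incline_motion(t, theta, s0=0.0, v0=0.0, g=9.81):
    """Position and velocity along the incline from s_ddot = -g*sin(theta)."""
    a = -g * math.sin(theta)
    s = s0 + v0 * t + 0.5 * a * t**2
    v = v0 + a * t
    return s, v

def energy_per_mass(s, v, theta, g=9.81):
    """(T + V)/m with T = m*v**2/2 and V = m*g*s*sin(theta)."""
    return 0.5 * v**2 + g * s * math.sin(theta)

theta = math.radians(30.0)  # illustrative incline angle
for t in (0.0, 0.5, 1.0, 2.0):
    s, v = incline_motion(t, theta, s0=0.0, v0=3.0)
    # The last column should be the same at every time.
    print(t, s, v, energy_per_mass(s, v, theta))
```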
Celestial mechanics
Let's suppose that we have a star of mass \(M\) and a planet whose mass \(m\) is much less than that of the star. The planet and star interact through the gravitational potential \(V(r) = -GMm/r,\) where \(r\) is the distance separating the star and planet. What are the equations of motion?
Given the symmetry of the problem, it is most convenient to write in polar coordinates, with \(r\) being the distance between the star and the planet, and \(\theta\) the angle of our planet around its orbit.
The kinetic energy of the planet is given by \(T = \frac12 m \dot{r}^2 + \frac12 m \left(r\dot{\theta}\right)^2\).
Thus our Lagrangian is given by
\[L = \frac12 m \dot{r}^2 + \frac12 m \left(r\dot{\theta}\right)^2 + GMm/r.\]
Now, it is possible for both \(r\) and \(\theta\) to vary, so we have two Euler equations to set up. The one for \(r\) is given by
\[\begin{align} \frac{d}{dt}\frac{\partial L}{\partial \dot{r}} &= \frac{\partial L}{\partial r} \\ m\ddot{r} &= mr\dot{\theta}^2 - GMm/r^2, \end{align}\]
which is the equation of motion we expect for \(r\).
For \(\theta\) we have
\[\begin{align} \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} &= \frac{\partial L}{\partial \theta} \\ \frac{d}{dt}\left(mr^2\dot{\theta}\right) &= 0. \end{align}\]
The Euler equation for \(\theta\) tells us something remarkable. It says that the quantity in parentheses, the angular momentum of the planet, is a constant of the motion.
Notice that we didn't need to know anything about torque or angular momentum; the conservation law came for free as a result of our Euler equation. Compare our Lagrangian approach to the solution using the Newtonian algorithm in deriving Kepler's laws.
This instance of finding a conserved quantity from our Euler equation is no happy accident. It is an example of a general feature of Lagrangian mechanics. Before stating the general connection between the form of a Lagrangian and the conserved quantities of motion, we'll make a further observation about our Lagrangian formalism.
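The two Euler equations can also be integrated numerically to watch the conservation law at work. The sketch below uses a hand-written fourth-order Runge-Kutta step, with \(\ddot{\theta} = -2\dot{r}\dot{\theta}/r\) obtained by expanding \(\frac{d}{dt}\left(r^2\dot{\theta}\right)=0\). The units (\(GM = 1\)), the initial conditions, and the function names are assumptions made for illustration.

```python
def deriv(state, gm):
    """Time derivatives of (r, r_dot, theta, theta_dot) from the Euler equations."""
    r, r_dot, theta, theta_dot = state
    r_ddot = r * theta_dot**2 - gm / r**2
    theta_ddot = -2.0 * r_dot * theta_dot / r
    return (r_dot, r_ddot, theta_dot, theta_ddot)

def rk4_step(state, dt, gm):
    """Advance the state by one classical Runge-Kutta step."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = deriv(state, gm)
    k2 = deriv(shifted(k1, dt / 2), gm)
    k3 = deriv(shifted(k2, dt / 2), gm)
    k4 = deriv(shifted(k3, dt), gm)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def angular_momentum_per_mass(state):
    r, _, _, theta_dot = state
    return r * r * theta_dot

def simulate(state, dt, steps, gm=1.0):
    for _ in range(steps):
        state = rk4_step(state, dt, gm)
    return state

start = (1.0, 0.0, 0.0, 1.2)  # r, r_dot, theta, theta_dot: a bound, non-circular orbit
end = simulate(start, dt=1e-3, steps=10000)
print(angular_momentum_per_mass(start), angular_momentum_per_mass(end))
```

The radius changes substantially over the run, yet \(r^2\dot{\theta}\) should agree at the start and end up to integration error.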
Generalized Momenta and Conservation
Notice that in Cartesian coordinates, the kinetic energy is given by \(T = \frac12 m \dot{r}^2\). Since potentials are functions of position, the velocity \(\dot{r}\) will only ever appear in the kinetic energy term. Thus, the quantity \(\frac{\partial L}{\partial \dot{r}}\) in the Euler equation will always be equal to \(m\dot{r}\). However, this is just the momentum of the particle. Thus, we can say that \(p_r = \frac{\partial L}{\partial \dot{r}}\), i.e. the momentum of the particle is just the derivative of the Lagrangian with respect to the velocity.
Thus, in Cartesian coordinates, we can rewrite the Euler equation as
\[\frac{d}{dt}p_r = -\frac{\partial V}{\partial r},\]
which is just Newton's second law for a conservative force.
Notice also, in the orbital mechanics example, that the second Euler equation produced \(\frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta}\), the angular momentum of the particle. Thus, we have \(p_\theta = \frac{\partial L}{\partial \dot{\theta}}\), just as we had for linear momentum above.
In fact, it is generally the case that the partial derivative of the Lagrangian with respect to any velocity variable is equal to the generalized momentum associated with that variable.
Thus, we can recast the Lagrangian formulation as
\[\dot{p}_i = \frac{\partial L}{\partial r_i}.\]
We can now state a conservation principle about Lagrangians.
Noether's theorem for momenta
If the coordinate \(r_k\) does not appear in the Lagrangian of a system, then the corresponding momentum \(p_k\) is a conserved quantity.
Suppose the coordinate \(r_k\) makes no appearance in the Lagrangian (though its velocity \(\dot{r}_k\) may). Then
\[\dot{p}_k = \frac{\partial L}{\partial r_k} = 0.\]
Thus, in the example of orbital mechanics, we can see that the Lagrangian makes no reference to the angular variable \(\theta\), and thus the angular momentum does not vary in time.
Simple Solutions to Hard Problems in Newtonian Mechanics
We close the article by revisiting two of the hard problems we posed for Newtonian mechanics at the beginning, followed by one further example. It is strongly recommended that you attempt to set those problems up using the Newtonian approach so as to more fully appreciate the power of the Lagrangian formulation.
Mass on a sliding wedge
In this problem, both the mass and the wedge are free to slide. Find the acceleration of the mass.
One way to set up our coordinates is to use a Cartesian system in the frame of the ground. Let us call the horizontal position of the left edge of the wedge \(x_w\), and the horizontal position of the mass \(x_m\). The only challenge here is expressing the height of the mass in terms of \(x_w\) and \(x_m\). The relative horizontal distance between the edge of the wedge and the mass is \(x_m - x_w\). Here we take the incline to slope downward to the right from that edge, so that \(y\) denotes the vertical drop of the mass below the top of the incline. If we imagine \(x_m - x_w\) to be the base of a triangle whose hypotenuse is at an angle of \(\theta\), then we can find the vertical drop by
\[\begin{align} \text{hyp}\cos\theta &= (x_m - x_w) \\ \text{hyp}\sin\theta &= y \\ \Rightarrow y &= \tan\theta\left(x_m - x_w\right). \end{align}\]
Therefore, we have \(y = \tan\theta\left(x_m - x_w\right)\) and \(\dot{y} = \tan\theta\left(\dot{x}_m - \dot{x}_w\right)\). Since \(y\) is a drop, the gravitational potential energy is \(V = -mg\tan\theta\left(x_m - x_w\right)\), up to an additive constant that depends on the choice of zero and does not affect the equations of motion. We can then write down the Lagrangian of our system as
\[L = \frac12 M \dot{x}_w^2 + \frac12 m \dot{x}_m^2 + \frac12 m \tan^2\theta\left(\dot{x}_m - \dot{x}_w\right)^2 + mg\tan\theta\left(x_m - x_w\right).\]
The motion of the system is thus defined by two Euler equations
\[\begin{align} \frac{d}{dt}\frac{\partial L}{\partial \dot{x}_w} &= \frac{\partial L}{\partial x_w} \\ M\ddot{x}_w + m\left(\ddot{x}_w - \ddot{x}_m\right)\tan^2\theta &= -mg\tan\theta \end{align}\]
and
\[\begin{align} \frac{d}{dt}\frac{\partial L}{\partial \dot{x}_m} &= \frac{\partial L}{\partial x_m} \\ m\ddot{x}_m + m\left(\ddot{x}_m - \ddot{x}_w\right)\tan^2\theta &= mg\tan\theta. \end{align}\]
We can easily solve this system of equations for \(\ddot{x}_w\) and \(\ddot{x}_m\) to find the motion of the system.
Solving for \(\ddot{x}_m\), we find
\[\ddot{x}_m = \dfrac{Mg\cot\theta}{m+M\csc^2\theta}.\]
To check our solution, we can take a familiar limit.
If \(M \gg m\), then we should recover the usual inclined plane solution where the plane is fixed in place. If \(M\gg m,\) then the horizontal acceleration of the mass \(\ddot{x}_m\) becomes \(g\cot\theta/\csc^2\theta,\) which is equal to \(g\cos\theta\sin\theta\).
Dividing this horizontal component by \(\cos\theta\), we find that the acceleration of the mass along the plane is equal to \(g\sin\theta\), as we expect.
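The two Euler equations are linear in \(\ddot{x}_w\) and \(\ddot{x}_m\), so they can be solved mechanically. The sketch below does so with Cramer's rule and compares the result with the closed form and with the heavy-wedge limit. The function name and the sample masses and angle are illustrative assumptions.

```python
import math

def wedge_accelerations(m, M, theta, g=9.81):
    """Solve the two Euler equations for (x_w_ddot, x_m_ddot) by Cramer's rule."""
    t = math.tan(theta)
    # (M + m t^2) a_w -      m t^2  a_m = -m g t
    #     -m t^2  a_w + (m + m t^2) a_m =  m g t
    a11, a12, b1 = M + m * t * t, -m * t * t, -m * g * t
    a21, a22, b2 = -m * t * t, m + m * t * t, m * g * t
    det = a11 * a22 - a12 * a21
    a_w = (b1 * a22 - a12 * b2) / det
    a_m = (a11 * b2 - a21 * b1) / det
    return a_w, a_m

m, M, theta, g = 1.0, 3.0, 0.5, 9.81  # sample values, chosen arbitrarily
a_w, a_m = wedge_accelerations(m, M, theta, g)
closed_form = M * g / math.tan(theta) / (m + M / math.sin(theta) ** 2)
print(a_m, closed_form)        # the two values should agree
print(M * a_w + m * a_m)       # horizontal momentum balance, should be ~0

# Heavy-wedge limit: the horizontal acceleration approaches g*sin(theta)*cos(theta).
print(wedge_accelerations(m, 1e9, theta, g)[1], g * math.sin(theta) * math.cos(theta))
```

Adding the two Euler equations gives \(M\ddot{x}_w + m\ddot{x}_m = 0\), which is the statement that the total horizontal momentum is conserved; the second printed value checks this.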
Bead on a rotating hoop
In this problem, a bead slides on a hoop that is spun by a motor at angular velocity \(\dot{\phi} = \omega\). At first the bead will slide down or up depending on its initial position, but eventually it will settle at some equilibrium angle given by the balance of gravity and the centripetal acceleration due to the spinning of the hoop. Our task is to find the equilibrium angle \(\theta_\text{eq}.\)
The bead is free to slide along the rigid circular hoop, and therefore the only free variable in the system is the angle \(\theta\) that the bead makes with the downward vertical. The choice of spherical coordinates is clear from the symmetry of the problem. We can write the kinetic energy of the bead as \(T = \frac12 m \left(\dot{\phi} r\sin\theta\right)^2 + \frac12 m \left(r\dot{\theta}\right)^2\). If we set the bottom of the hoop as the minimum of the gravitational potential energy, then we have \(V = mg(1-\cos\theta) r\). Thus our Lagrangian is equal to
\[L = \frac12 m \left(\dot{\phi} r\sin\theta\right)^2 + \frac12 m \left(r\dot{\theta}\right)^2 + mg(\cos\theta - 1) r\]
and our Euler equation for \(\theta\) is
\[\begin{align} \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} &= \frac{\partial L}{\partial \theta} \\ mr^2\ddot{\theta} &= -mg\sin\theta r + m\dot{\phi}^2r^2\sin\theta\cos\theta \\ &= -m\sin\theta\left( gr- \omega^2r^2\cos\theta\right). \end{align}\]
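We therefore have \(\ddot{\theta}=0\) when \(\cos\theta_\text{eq} = \frac{g}{\omega^2 r}\), i.e. \(\theta_\text{eq} = \cos^{-1}\frac{g}{\omega^2 r}\), which requires \(\omega^2 r \geq g\). The factor of \(\sin\theta\) also vanishes at the bottom and top of the hoop, \(\theta = 0\) and \(\theta = \pi\), which are equilibria for any \(\omega\).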
Note that \(\ddot{\theta} = 0\) alone does not make the bead stationary: an equilibrium also requires \(\dot{\theta}=0\). A bead placed at rest at \(\theta_\text{eq}\) has both \(\dot{\theta}=0\) and \(\ddot{\theta}=0\), and so it remains there.
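The equilibrium condition is easy to check numerically. The sketch below evaluates \(\ddot{\theta}\) from the Euler equation and returns the off-axis equilibrium when the hoop spins fast enough, falling back to the bottom of the hoop otherwise. The function names and the sample radius and spin rates are illustrative assumptions.

```python
import math

def theta_ddot(theta, omega, r, g=9.81):
    """Angular acceleration from the Euler equation, divided through by m*r**2."""
    return -math.sin(theta) * (g / r - omega**2 * math.cos(theta))

def equilibrium_angle(omega, r, g=9.81):
    """Off-axis equilibrium if omega**2 * r > g, otherwise the bottom (theta = 0)."""
    if omega**2 * r <= g:
        return 0.0
    return math.acos(g / (omega**2 * r))

r = 0.5  # hoop radius in metres, chosen arbitrarily
for omega in (2.0, 6.0, 12.0):
    eq = equilibrium_angle(omega, r)
    # theta_ddot evaluated at the equilibrium angle should vanish.
    print(omega, math.degrees(eq), theta_ddot(eq, omega, r))
```

As the spin rate grows, the equilibrium angle climbs from the bottom of the hoop toward \(90^\circ\) but never reaches it, since \(\cos\theta_\text{eq}\) stays positive.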
Bead on a rotating rod
Consider a smooth rod that is spun by a motor with angular velocity \(\omega\). At time zero, a bead of mass \(m\) is placed a distance of \(\epsilon\) along the rod, free to slide without friction, and the motor is switched on. Show that the radial position of the bead as a function of time is given by
\[r(t)=\epsilon\cosh \omega t.\]
Since the rod spins in a circle, polar coordinates are a natural choice to describe the system. There is no potential energy associated with the system, so the Lagrangian is given purely by the kinetic energy:
\[L = \frac12 m \dot{r}^2 + \frac12 m r^2\dot{\theta}^2.\]
From inspecting the Lagrangian, it might seem as though \(p_\theta\) is constant, since \(\theta\) does not appear in the Lagrangian. However, recall that \(\dot{\theta}=\omega\) is imposed by the motor spinning the rod, so \(\theta\) is not a free coordinate. Thus, we have one Euler equation for \(r:\)
\[\begin{align} \frac{d}{dt}\frac{\partial L}{\partial \dot{r}} &= \frac{\partial L}{\partial r}\\ m\ddot{r} &= mr\dot{\theta}^2\\ &= mr\omega^2, \end{align} \]
which has the solution \(r = Ae^{\omega t} + Be^{-\omega t}\).
At time zero, we have \(r = \epsilon\) and \(\dot{r} = 0\), so we can solve for \(A\) and \(B\) with these initial conditions. We find \(A=B=\epsilon/2\) and we have \(r = \epsilon\frac{e^{\omega t} + e^{-\omega t}}{2} = \epsilon \cosh \omega t.\)
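As a check on this solution, the sketch below integrates \(\ddot{r} = \omega^2 r\) from the stated initial conditions with a fourth-order Runge-Kutta scheme and compares the result with \(\epsilon\cosh\omega t\). The function name, step size, and sample values of \(\epsilon\) and \(\omega\) are illustrative assumptions.

```python
import math

def bead_radius(t_end, epsilon, omega, dt=1e-3):
    """Integrate r_ddot = omega**2 * r from r = epsilon, r_dot = 0 using RK4."""
    def deriv(r, v):
        return v, omega**2 * r

    r, v = epsilon, 0.0
    for _ in range(round(t_end / dt)):
        k1r, k1v = deriv(r, v)
        k2r, k2v = deriv(r + 0.5 * dt * k1r, v + 0.5 * dt * k1v)
        k3r, k3v = deriv(r + 0.5 * dt * k2r, v + 0.5 * dt * k2v)
        k4r, k4v = deriv(r + dt * k3r, v + dt * k3v)
        r += dt / 6 * (k1r + 2 * k2r + 2 * k3r + k4r)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return r

epsilon, omega = 0.1, 2.0  # sample values, chosen arbitrarily
for t in (0.5, 1.0, 1.5):
    # The numerical and analytic radii should agree closely.
    print(t, bead_radius(t, epsilon, omega), epsilon * math.cosh(omega * t))
```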