To summarize our work, we've found an object , which we can define for any probability distribution , that serves as a wellspring for all quantities of interest associated with the distribution. It makes calculations more transparent, eases more advanced manipulations, and frees up mental computing power for more interesting tasks.
Our object , defined as , works as it does because we exploit the derivative properties of as a tool. However, because we chose , our derivatives are a little ugly.
For instance, to find the mean, we take a single derivative, , but for the mean squared value we add two derivatives together, . As we try to find higher moments, the dance gets less and less appetizing.
This situation arises because each time we take a derivative of , we bring down a new function of : rather than . Things would be nicer if this could be arranged.
An alternative choice that gets the job done is , so that .
In our voting model, we'd instead have , and find the simpler relations
The object that results from this choice is called the moment generating function and it is a powerful tool. It is so powerful in fact that a foundational domain of theoretical physics, statistical mechanics, has been constructed around its use.
In a system that is close to thermodynamic equilibrium, the likelihood of the system occupying a particular state , of energy is proportional to the factor .
The engine of statistical mechanics is a kind of moment generating function, a sum over the likelihood of all energetic states, , known as the partition function, . It encapsulates all the statistical properties of a physical system and allows us to obtain them through imaginative manipulations.
The probability of occupying the state is given by the relative likelihood, normalized by the partition function (the sum of all likelihoods):
What this translates to is that, all else being equal (temperature, volume, pressure, etc.), a system tends to occupy states of lower energy.
Though slightly more intimidating than before, we find that useful quantities such as the average energy can be found through derivatives of
Physics and politics should never be equated. We can, nevertheless, obtain our higher moments () through derivatives similar () to what we used in the voting model.
The success of the partition function in physics has catapulted it to use in areas as far flung as neuroscience, computation biology, and the study of natural language.
Consider a particular DNA molecule (top strand in diagrams below) diffusing in liquid with millions of other DNA molecules. It is more likely to be found in a state of maximum pair bonding
than in a state with multiple mismatches
The reason for this is that each stable bond the DNA molecule makes with its partner molecule (A:T, or G:C) lowers the energy of interaction, contributing a factor .
On the other hand, each mismatch (A:G, C:T, etc.) produces a high energy bulge, contributing a factor .
The bulges are expensive ( ) to maintain in the face of lower energy alternatives and, so, DNA molecules are most likely to be found with their most stable partners. The probability distribution of the various binding events can be captured through a suitably constructed partition function describing the energetics.
Hopefully, you can now see the divine beauty of the generating function approach.
Let's wrap things up, using the partition function to fold a simple protein.