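The straightforward way to calculate the mean value, the mean squared value, or any other moment of a distribution is to take a weighted average:

$$\langle n^k \rangle = \sum_n n^k P(n).$$

For example, in the voting model, the mean value of $n$ is given formally by

$$\langle n \rangle = \sum_{n=0}^{N} n P(n),$$

where $P(n)$ is given by the binomial distribution: $P(n) = \binom{N}{n} p^n q^{N-n}$, with $q = 1 - p$.

As we expect, the mean value of $n$ is

$$\langle n \rangle = Np.$$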
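The weighted average can be checked numerically. The following is a minimal sketch, assuming a binomial distribution over `N` trials with success probability `p`; the function names and the values `N = 10`, `p = 0.3` are illustrative choices, not taken from the text.

```python
from math import comb

def binomial_pmf(n, N, p):
    """Probability of n successes in N independent trials with success probability p."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

def moment(k, N, p):
    """k-th moment <n^k>, computed as a weighted average over the distribution."""
    return sum(n**k * binomial_pmf(n, N, p) for n in range(N + 1))

# Illustrative values (assumptions, not from the text).
N, p = 10, 0.3

mean = moment(1, N, p)
print(mean, N * p)  # the two numbers agree up to floating-point rounding
```

Even for this small case the sum has eleven terms, which hints at why a more compact route to the same numbers is attractive.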
In principle, we can extract anything we'd like to know from a distribution using similar series. In general, however, these sums get cumbersome and quite unfriendly. It would be preferable to bypass them altogether by finding an equivalent representation of the distribution that reveals its properties by simpler means.
A similar problem arises in statistical mechanics, where we have a series of states, characterized by some energies $E_i$, for which we'd like to know the likelihood of finding a system in any one of those states. In principle, we have to calculate some dreadful weighted sum.
However, it is often possible to express the probability distribution in a compact form known as the partition function $Z$ and obtain various bulk averages by means of simple operations on $Z$, rather than wading through the murky waters of series manipulation. Can we do something similar here?
Let's consider the object $(p + q)^N$. On its face this may not seem like much of anything. For one thing, it is always equal to 1, since $q = 1 - p$.
If we expand it, we see that it is a sum of terms, each one corresponding to a $P(n)$ in the binomial distribution, i.e.

$$(p + q)^N = \sum_{n=0}^{N} \binom{N}{n} p^n q^{N-n} = \sum_{n=0}^{N} P(n).$$
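The claim that the expansion reproduces the binomial probabilities, and that they sum to 1, can be checked exactly with rational arithmetic. This is a small sketch; `expansion_terms` is a hypothetical helper, and the values of `p` and `N` are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import comb

def expansion_terms(p, q, N):
    """Terms of the binomial expansion of (p + q)**N, indexed by the power n of p."""
    return [comb(N, n) * p**n * q**(N - n) for n in range(N + 1)]

# Illustrative values (assumptions, not from the text).
p = Fraction(3, 10)
q = 1 - p
N = 6

terms = expansion_terms(p, q, N)
total = sum(terms)

# Each term is one binomial probability; together they rebuild (p + q)**N = 1.
print(total == (p + q)**N == 1)  # True
```

Using `Fraction` avoids floating-point rounding, so the equality holds exactly rather than approximately.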
This is a good sign: we have found a concise expression that reproduces the binomial distribution. It would be great if we could build something into it that lets us move easily from the distribution to the mean value, to the mean squared value, and so on.
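Right now, we have an expression for $P(n)$, while we'd like one for $n P(n)$.
Consider $G(x) = (p e^{x} + q)^N$. Upon expanding, we find that this is

$$G(x) = \sum_{n=0}^{N} \binom{N}{n} (p e^{x})^n q^{N-n},$$

and in particular $G(0) = (p + q)^N = 1$.
Cleaning things up a bit, we have

$$G(x) = \sum_{n=0}^{N} P(n) e^{nx}.$$

Now, if we take the derivative of $G(x)$ with respect to $x$, we can bring down a factor of $n$ in each term of the sum, i.e.

$$\frac{dG}{dx} = \sum_{n=0}^{N} n P(n) e^{nx},$$

and, by setting $x = 0$, we obtain the mean value of the distribution

$$\left. \frac{dG}{dx} \right|_{x=0} = \sum_{n=0}^{N} n P(n) = \langle n \rangle.$$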
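The derivative trick can be tried numerically. The sketch below assumes the compact expression has the form $(p e^{x} + q)^N$ with $q = 1 - p$ (the symbol names and this exponential form are assumptions), and estimates the derivative at $x = 0$ with a central finite difference rather than symbolically.

```python
from math import exp

def G(x, N, p):
    """Assumed compact form (p*e**x + q)**N, with q = 1 - p."""
    return (p * exp(x) + (1 - p))**N

def first_derivative_at_zero(N, p, h=1e-5):
    """Central finite-difference estimate of dG/dx at x = 0."""
    return (G(h, N, p) - G(-h, N, p)) / (2 * h)

# Illustrative values (assumptions, not from the text).
N, p = 10, 0.3

estimate = first_derivative_at_zero(N, p)
print(estimate)  # close to N * p = 3.0, up to finite-difference error
```

No sum over $n$ appears anywhere in this calculation: the mean comes from two evaluations of the compact form.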
This is amazing. Rather than summing the series by hand, manipulating the terms, applying identities, and so on, we take a single derivative and find $\langle n \rangle$.
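It is natural to then ask what happens if we take the second derivative. Let us see!

$$\frac{d^2 G}{dx^2} = \sum_{n=0}^{N} n^2 P(n) e^{nx}.$$

Then, setting $x = 0$, we have

$$\left. \frac{d^2 G}{dx^2} \right|_{x=0} = \sum_{n=0}^{N} n^2 P(n) = \langle n^2 \rangle.$$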
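As a rough check of the second-derivative result, the sketch below compares a finite-difference estimate of the second derivative at $x = 0$ against the mean squared value computed directly from the weighted sum. It again assumes the compact form $(p e^{x} + q)^N$; the function names and the values of `N` and `p` are illustrative.

```python
from math import comb, exp

def G(x, N, p):
    """Assumed compact form (p*e**x + q)**N, with q = 1 - p."""
    return (p * exp(x) + (1 - p))**N

def second_derivative_at_zero(N, p, h=1e-4):
    """Central finite-difference estimate of the second derivative of G at x = 0."""
    return (G(h, N, p) - 2 * G(0.0, N, p) + G(-h, N, p)) / h**2

def mean_square_by_sum(N, p):
    """<n^2> computed the long way, as a weighted sum over the binomial distribution."""
    return sum(n**2 * comb(N, n) * p**n * (1 - p)**(N - n) for n in range(N + 1))

# Illustrative values (assumptions, not from the text).
N, p = 10, 0.3

from_derivative = second_derivative_at_zero(N, p)
from_sum = mean_square_by_sum(N, p)
print(from_derivative, from_sum)  # the two agree up to finite-difference error
```

Combining this with the first derivative gives the variance, $\langle n^2 \rangle - \langle n \rangle^2$, which for the binomial distribution is $Npq$.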
It appears we have stumbled on something quite valuable: for each derivative we perform on $G(x)$, we receive another piece of information about the distribution $P(n)$.
At this point you might wonder what all of this is good for. After all, we began with some ugly-looking sum, and now we have arrived at a slightly less ugly but more abstract sum to take its place. But remember, we are working with the abstract representation of $G(x)$ in order to derive rules for manipulating its simpler form, $(p e^{x} + q)^N$. Calculations are simpler and more transparent once these rules are in place.
Next, let's try out some of our tools.