Better living through probability generating functions

The straightforward way to calculate the mean value, mean squared value, or any other moment of a distribution is by taking a weighted average:

\[\begin{align} \langle x \rangle &= \sum\limits_i x_ip(x_i) \\ \vdots \\ \langle x^m \rangle &= \sum\limits_i x_i^mp(x_i) \end{align} \]

For example, in the voting model, the mean value of A^\hat{A} is given formally by

A^=0p(0)+1p(1)+2p(2)++Np(N)=mmp(m) \begin{aligned} \langle\hat{A}\rangle &= 0\cdot p(0)+1\cdot p(1)+2\cdot p(2)+\ldots+N\cdot p(N) \\ &= \sum\limits_m m\cdot p(m) \end{aligned}

where p(m)p(m) is given by the binomial distribution: (Nm)pAm(1pA)Nm\binom{N}{m}p_A^m\left(1-p_A\right)^{N-m}.

As we expect, the mean value of B^\hat{B} is

B^=m(Nm)p(m)=NA^ \begin{aligned} \langle\hat{B}\rangle &= \sum\limits_m (N-m)\cdot p(m) \\ &= N - \langle \hat{A} \rangle \end{aligned}

In principle, we can extract anything we'd like to know from a distribution using similar series. In general, however, these sums can get cumbersome and quite unfriendly. It would be preferred if we could find a way to bypass their use altogether, if we could find an equivalent representation that reveals its properties by simpler means.

A similar problem arises in statistical mechanics where we have a series of states, characterized by some energies EiE_i, for which we'd like to know the likelihood of finding a system in any one of those states. In principle we have to calculate some dreadful weighted sum.

However, it is often possible to express the probability distribution in a compact form known as the partition function Z(Ei)Z(E_i) and obtain various bulk average by means of simple operations on ZZ, rather than wading through the murky waters of series manipulation. Can we do something similar here?

Let's consider the object Z=(pA+pB)N\displaystyle Z = \left(p_A + p_B\right)^N. On its face this may not seem like much of anything. For one thing, it is always equal to 1.

If we expand it, we see that it is a sum of terms, each one corresponding to a p(m)p(m) in the binomial distribution, i.e.

(pA+pB)N=m=0N(Nm)pAmpBNm\left(p_A + p_B\right)^N = \sum\limits_{m=0}^{N}\binom{N}{m}p_A^mp_B^{N-m}

This is a good sign, we have found a concise expression that reproduces the binomial distribution. It would be great if we could implant something that allows us to easily transform the compact expression from distribution, to mean value, to mean squared value, etc.

Right now, we have an expression for m=0N(Nm)pAmpBNm\displaystyle\sum\limits_{m=0}^{N}\binom{N}{m}p_A^mp_B^{N-m} while we'd like one for m=0Nm(Nm)pAmpBNm\displaystyle\sum\limits_{m=0}^{N}m\binom{N}{m}p_A^mp_B^{N-m}.

Consider (pAz+pB)N\left(p_Az+p_B\right)^N. Upon expanding, we find that this is

f(z)=m=0Nzm(Nm)pAmpBNmf(z) = \displaystyle\sum\limits_{m=0}^{N}z^m\binom{N}{m}p_A^mp_B^{N-m}

and in particular f(1)=m=0N(Nm)pAmpBNmf(1) = \displaystyle\sum\limits_{m=0}^{N}\binom{N}{m}p_A^mp_B^{N-m}.

Cleaning things up a bit, we have f(z)=m=0Nzmp(m)f(z) = \displaystyle\sum\limits_{m=0}^{N}z^mp(m)

Now, if we take the derivative of ff with respect to zz, we can bring down a factor of mm in each term of the sum, i.e.

f(z)=m=0Nmzm1p(m)f'(z) = \displaystyle\sum\limits_{m=0}^{N}mz^{m-1}p(m)

and, by setting z=1z=1, we obtain the mean value of the distribution f(1)=m=0Nmp(m)=mf'(1) = \displaystyle\sum\limits_{m=0}^{N}m\cdot p(m) = \langle m \rangle

This is amazing. Rather than summing the series by hand, manipulating the terms, applying identities, etc. we perform a single derivative and find f(1)=mf'(1) = \langle m \rangle .

It is natural to then ask what happens if we take the second derivative. Let us see!

f(1)=m(m1)zm2p(m)z=1=[m2zm2p(m)mzm2p(m)]z=1=m2m \begin{aligned} f''(1) &= \sum\limits m(m-1)z^{m-2}p(m)\bigg|_{z=1} \\ &= \left[\displaystyle\sum m^2z^{m-2}p(m) - \displaystyle\sum mz^{m-2}p(m)\right]\bigg|_{z=1} \\ &= \langle m^2 \rangle - \langle m \rangle \end{aligned}

Then, we have m2=f(1)+f(1)\langle m^2 \rangle = f''(1) + f'(1)


σ2(m)=m2m2=f(1)+f(1)f(1)2 \begin{aligned} \sigma^2(m) &= \langle m^2 \rangle - \langle m \rangle ^2 \\ &= f''(1)+f'(1)-f'(1)^2 \end{aligned}

It appears we have stumbled on something quite valuable: for each derivative we perform on f(z)f(z), we receive another piece of information about the distribution p(m)p(m).

At this point you might wonder what all of this is good for. After all, we began with some ugly looking sum, and now we have arrived at some slightly less ugly but more abstract sum to take its place. But remember, we are working with the abstract representation for f(z)f(z) in order to derive rules to manipulate ff's simpler form, (pAz+pB)N\left(p_Az+p_B\right)^N. Calculations are simpler and more transparent once these are in place.

Next, let's try out some of our tools.

Note by Josh Silverman
7 years, 3 months ago

No vote yet
1 vote

  Easy Math Editor

This discussion board is a place to discuss our Daily Challenges and the math and science related to those challenges. Explanations are more than just a solution — they should explain the steps and thinking strategies that you used to obtain the solution. Comments should further the discussion of math and science.

When posting on Brilliant:

  • Use the emojis to react to an explanation, whether you're congratulating a job well done , or just really confused .
  • Ask specific questions about the challenge or the steps in somebody's explanation. Well-posed questions can add a lot to the discussion, but posting "I don't understand!" doesn't help anyone.
  • Try to contribute something new to the discussion, whether it is an extension, generalization or other idea related to the challenge.
  • Stay on topic — we're all here to learn more about math and science, not to hear about your favorite get-rich-quick scheme or current world events.

MarkdownAppears as
*italics* or _italics_ italics
**bold** or __bold__ bold

- bulleted
- list

  • bulleted
  • list

1. numbered
2. list

  1. numbered
  2. list
Note: you must add a full line of space before and after lists for them to show up correctly
paragraph 1

paragraph 2

paragraph 1

paragraph 2

[example link]( link
> This is a quote
This is a quote
    # I indented these lines
    # 4 spaces, and now they show
    # up as a code block.

    print "hello world"
# I indented these lines
# 4 spaces, and now they show
# up as a code block.

print "hello world"
MathAppears as
Remember to wrap math in \( ... \) or \[ ... \] to ensure proper formatting.
2 \times 3 2×3 2 \times 3
2^{34} 234 2^{34}
a_{i-1} ai1 a_{i-1}
\frac{2}{3} 23 \frac{2}{3}
\sqrt{2} 2 \sqrt{2}
\sum_{i=1}^3 i=13 \sum_{i=1}^3
\sin \theta sinθ \sin \theta
\boxed{123} 123 \boxed{123}


Sort by:

Top Newest

I didnt understand it but its good, so I reshared it.

Julian Poon - 6 years, 8 months ago

Log in to reply

what does this symbol (!) mean?

cyrus wilson - 5 years, 11 months ago

Log in to reply

Sorry, where in the note do you see the ! symbol? I can't find it.

Josh Silverman Staff - 5 years, 11 months ago

Log in to reply

on a problem ,algebra, all I remember is something like 6!-4! divided by 10!, or something like that. do not understand it . can you help!

cyrus wilson - 5 years, 11 months ago

Log in to reply

@Cyrus Wilson Oh, I see. !! means the factorial function. It is easy to see what it means from example, 5!=5×4×3×2×15! = 5\times 4\times 3\times 2\times 1. Not much more complicated than that!

Josh Silverman Staff - 5 years, 11 months ago

Log in to reply


Problem Loading...

Note Loading...

Set Loading...