A simplified model for molecular diffusion, stock price changes, genetic drift, etc. is the random walk, which is a particle that, at each point in time, can move a given distance in a random direction. For instance, an object currently at position \(n\) on a 1d line can move to \(n-1\), or \(n+1\), each with probability \(\frac12\).

Of course, each individual particle will take a different trajectory, and so, is unpredictable. Despite this, exact statistical properties of the movement can be obtained without much work. We'll take the simple discretized case where particles move the same unit distance at each step. A diagram for the flow of particles in this process is shown below:

\[\displaystyle \boxed{n-1}\longleftrightarrow \boxed{n}\longleftrightarrow \boxed{n+1}\]

This means that at each time step, a particle at position \(n\) can drift to the right, or the left. We can write this down as a simple differential equation for the change in the number of particles at position \(n\), \(N_n\), with time:

\[\displaystyle\frac{dN_n(t)}{dt}=N_{n+1}(t)+N_{n-1}(t)-2N_n(t)\]

The 2 reflects the fact that particles at \(n\) can drift to the right or the left, with equal probability.

Several simplifications are appropriate:

- I won't premultiply everything by the probability one half, it amounts to a rescaling of the time unit.
- Instead of thinking about this in terms of particle number, we can divide the equation by the total number of diffusing particles, and in that way, speak of the probability that a given position is occupied, rather than a numerical amount of particles. This is more generalizable and allows us to think of probability flows.
- Moreover, instead of thinking of many particles on a single lattice, we can think of the equation as describing an infinite number of particles, with the probabilities describing the bulk behavior of the system. In this way, we can calculate statistical properties of the process very easily.

Rewriting the differential equation in the suggested way, we find:

\[\displaystyle\frac{dp_n(t)}{dt}=p_{n+1}(t)+p_{n-1}(t)-2p_n{t}\]

It's not trivial to solve a differential equation like that just by staring at it. It's also not straightforward to get a nice recurrence relation. In order to solve it, we can use a probability generating function \(\displaystyle F(z,t) = \sum z^np_n(t)\). It's usefulness is best seen by example, but first we can point out two of its properties that reveal how it can be used.

First of all, if we set \(z=1\), then every coefficient multiplying the \(p_n(t)\) in \(F(z,t)\) is equal to 1, and we have the sum of all probabilities at a given time. But at all times, the probabilities have to add up to 1. Therefore, \(F(1,t)=1\).

The next property is a beautiful formula of complex analysis. The variable \(z\) can be taken to be a complex number, so, we can use the full array of complex analysis theorems to dissect it. If, in general, we had an interesting expression for \(F(z,t)\), and we want to know the probability being at position \(n\), \(p_n(t)\), we could simply take the contour integral of \(\displaystyle \frac{F(z,t)}{z^{n+1}}\) around the origin. The reason for this is that \(p_n\) has the coefficient \(z^n\), and the only thing that survives a contour integral about the origin are terms with \(1/z\). This is known as Cauchy's integral formula

Therefore, we have:

\[\displaystyle p_n(t)=\frac{1}{2\pi i}\oint\frac{F(z,t)}{z^{n+1}}dz\]

We want to find a simple expression for \(F\), for which we can take these derivatives. If we take the differential equation of the probability flow for \(p_n\), we can multiply it by \(z^n\) and sum over all \(n\) to get the following relation:

\[\displaystyle\frac{dF(z,t)}{dt}=\left(z+\frac{1}{z}-2\right)F(z,t)\]

This can be integrated directly to produce the simple formula:

\[\displaystyle F(z,t)=K(z)e^{\left(z+z^{-1}-2\right)t}\]

We can choose an initial condition in which all the particles start situated at the origin. I.e., \(p_n=0 \forall n\) except for \(p_0\) which is equal to 1. This implies \(F(z,0)\) = 1, which shows that the constant \(K\) is 1.

At this point, we have solved the problem in principle, but are still wanting for a solution in \(n\)/. We could expand the expression for \(\displaystyle F\) out into a Taylor series to find the coefficient on the \(\displaystyle z^n\) which are the \(\displaystyle p_n(t)\). However, carrying this out results in ugly looking sums and Bessel functions.

Instead, we can do a bit more complex analysis, and make a useful approximation to get an analytic equation for \(p_n(t)\).

Recalling the contour integral described above, we have:

\[\displaystyle p_n=\frac{1}{2\pi i}\oint F(z,t)z^{-\left(n+1\right)}dz\]

The only pole of this integral is at the origin and, so, we can take the contour to be the unit circle. In this case,

\[z=e^{i\theta}\] and \[dz=ie^{i\theta}d\theta\]

Also, it's clear that

\[\displaystyle z+z^{-1} = 2\cos\theta\]

So that the expression for \(F\) becomes

\[F(z,t)=e^{\left(2\cos\theta-2\right)t}\]

Writing this contour integral out in full form, we are faced with evaluating

\[\displaystyle p_n(t)=\frac{1}{2\pi}\oint e^{2t\left(\cos\theta-1\right)}e^{-in\theta}d\theta\]

which is something complicated. However, we notice that for times \(t>1\), the exponential drops off very quickly for increasing \(\theta\), so that contributions to the integral become negligible well before \(\pm\pi\), as seen in the plot below.

With this realization, we can expand the \(\cos\theta\) to second order with very little error. The integral is therefore

\[\displaystyle p_n(t)\approx\frac{1}{2\pi}\int\limits_{-\pi}^{\pi}e^{-t\theta^2-in\theta}d\theta\]

Further, as there is essentially no contribution to the integral for \(\theta > \pi\) we can extend the upper limit of the integral to infinity without any error. We can also extend the bottom limit to negative infinity, although this introduces small error at small \(t\). However, for \(t\) modestly large, the error is negligible and as \(t\rightarrow\infty\), it is non-existent. Therefore

\[\displaystyle p_n(t)=\frac{1}{2\pi}\int\limits_{-\infty}^{\infty}e^{-t\theta^2-in\theta}d\theta\]

Finally, this can be solved analytically through the Hubbard-Stratonovich transformation from field theory (or simply, completing the square) and we arrive at

\[\displaystyle p_n(t)=\frac{1}{\sqrt{4\pi t}}e^{-\frac{n^2}{4t}}\]

This is a Gaussian distribution that spreads symmetrically to the left and right as time proceeds; the variance increases linearly with time.

We show the probability distribution below for \(n \in {0,20}\) and \(t\in{0,100}\). Remember, we started with the initial condition that all particles start at the origin. We can see that as time progresses, the particles flow out from the origin to greater values of \(n\).

In another view, here is a movie of the same process.

## Means and variance

Now that we've done the difficult job of finding the time dependent probability distribution, we can get some values for free. First, the mean position of the diffusing particles.

We can find this by evaluating the expectation value:

\[\displaystyle\left<n\right>=\int\limits_{-\infty}^{\infty} np_n(t)dn=\frac{1}{\sqrt{4\pi t}}\int\limits_{-\infty}^{\infty}ne^{\frac{-n^2}{4t}}dn\]

We can see that this is zero because of the symmetric distribution. On average, particles diffuse about the origin without any bias, and so, are still centered about the initial position.

It is clear that the particles are spreading out with time, so, we expect that the average particle will make it away from the origin. Specifically, we can find the mean squared distance. We find this, again, by taking an expectation value:

\[\displaystyle\left<n^2\right>=\int\limits_{-\infty}^{\infty}n^2p_n(t)dn=\frac{1}{\sqrt{4\pi t}}\int\limits_{-\infty}^{\infty}n^2e^{\frac{-n^2}{4t}}dn\]

This integral can be solved in a number of ways, most beautifully by taking a derivative underneath the integral sign. It turns out that the average squared distance grows linearly in time:

\[\displaystyle\left<n^2\right>=2t\]

and therefore we have the classic behavior of the random walk, that the average particle diffuses \(\sqrt{2t}\) steps from the origin by time \(t\), which is in good agreement with real random walkers like \(\textit{E. coli}\) swimming in liquid, or gas diffusing through the air.

## Comments

Sort by:

TopNewestIs this related to Pólya's Random Walk Constants? – Lionel Doolan · 1 year, 3 months ago

Log in to reply

– Josh Silverman Staff · 1 year, 1 month ago

Yes it is, but that problem (whether or not the particle returns to the origin) is a harder question about the same kind of trajectory.Log in to reply

Nice note about random motion – Rajat Dwivedi · 3 years, 1 month ago

Log in to reply

Nice post! Would you recommend Lawler's "Random Walk and the Heat Equation" to students? I haven't read it. Also, problem for you to think about: what happens if you have multiple random walkers in one dimension who are too polite to scoot past each other or occupy the same position? – Eric Brattain · 3 years, 3 months ago

Log in to reply

I haven't read that! My first experience with this kind of stuff was Stochastic Processes in Physics and Chemistry by van Kampen, which I love and think is great but more than a little dense. I would also recommend these course notes from Matt Scott.

I've never tried that situation before, but it sounds like a humdinger, I will put it on my list. – Josh Silverman Staff · 3 years, 3 months ago

Log in to reply