The cumulative distribution function (CDF) of a random variable gives the probability of finding the random variable at a value less than or equal to a given cutoff. For a continuous random variable, it is obtained by integrating the probability density function (PDF). Many questions and computations about probability distributions are convenient to rephrase or perform in terms of CDFs, e.g. computing the PDF of a function of a random variable.
For any random variable $X$, the cumulative distribution function $F_X$ is defined as
$$F_X(x) = P(X \le x),$$
which is the probability that $X$ is less than or equal to $x$.
Using this definition, one can write the probability that $X$ takes a value in a certain interval $[a, b]$ without using an integral. Recall that previously this probability was defined in terms of a PDF $f_X$:
$$P(a \le X \le b) = \int_a^b f_X(x) \, dx.$$
Now, the probability is rewritten as the difference in values of the CDF:
$$P(a \le X \le b) = F_X(b) - F_X(a).$$
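The equivalence of the two expressions can be checked numerically. The following is a minimal sketch, assuming an exponential distribution with rate 1 as the example (this distribution and the helper names `pdf`, `cdf`, and `integrate` are illustrative choices, not part of the discussion above). It compares a midpoint-rule integral of the PDF with the difference of CDF values.

```python
import math

def pdf(x):
    """PDF of the assumed example distribution: exponential with rate 1."""
    return math.exp(-x) if x >= 0 else 0.0

def cdf(x):
    """CDF of the same distribution: F(x) = 1 - exp(-x) for x >= 0."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = 0.5, 2.0
via_pdf = integrate(pdf, a, b)   # area under the PDF between a and b
via_cdf = cdf(b) - cdf(a)        # difference of CDF values
print(via_pdf, via_cdf)          # the two numbers should agree closely
```

The CDF route needs two function evaluations, while the PDF route needs a full numerical integration, which is one reason CDFs are convenient in practice.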
So the difference of two CDF values gives the area underneath the PDF between two points, and $F_X(x)$ itself is the area under the PDF to the left of $x$. The CDF increases from zero (for very low values of $x$) to one (for very high values of $x$). As $x \to -\infty$, $F_X(x) \to 0$, because the probability of finding $X$ that far out vanishes, as it must for a normalized PDF. As $x \to \infty$, $F_X(x)$ approaches $P(X < \infty)$, which is one because it is certain that $X$ takes some finite value.
In the case of discrete random variables, the value of $F_X(x)$ makes a discrete jump at all possible values of $X$; the size of the jump corresponds to the probability of that value. In the case of a continuous random variable, the function increases continuously; it is not meaningful to speak of the probability that $X = x$ because this probability is always zero. Instead one considers the probability that the value of $X$ lies in a given interval:
$$P(a \le X \le b) = F_X(b) - F_X(a).$$
Note that it does not matter if the inequalities are strict (if the interval is $(a, b)$ or $[a, b]$, for example): since the probability of any single value is zero for a continuous random variable, the endpoints can be included or not without changing any probabilities.
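For contrast, the discrete case can be sketched in a few lines. The example below assumes a fair six-sided die (an illustrative choice, not taken from the text above) and uses exact fractions. It shows that the CDF jumps by the probability of each value, and that including or excluding an endpoint does change the probability of an interval when the variable is discrete.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each face with probability 1/6.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def cdf(x):
    """P(X <= x): sum the probabilities of all faces not exceeding x."""
    return sum(p for k, p in pmf.items() if k <= x)

# Size of the jump of the CDF at x = 3, measured just below and at 3.
jump_at_3 = cdf(3) - cdf(2.999)

# Endpoints matter in the discrete case:
half_open = cdf(3) - cdf(2)          # P(2 < X <= 3)
closed = cdf(3) - cdf(2) + pmf[2]    # P(2 <= X <= 3)

print(jump_at_3, half_open, closed)
```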
Still, one frequently wants to make use of the probability density function rather than the CDF. Since the CDF corresponds to the integral of the PDF, the PDF corresponds to the derivative of the CDF:
$$f_X(x) = \frac{dF_X(x)}{dx}.$$
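The derivative relationship can be illustrated numerically. This is a minimal sketch, assuming a standard normal distribution as the example (an illustrative choice; the helper names are hypothetical). It differentiates the CDF with a central difference and compares the result with the known PDF.

```python
import math

def normal_cdf(x):
    """CDF of the standard normal, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    """PDF of the standard normal."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

points = [-2.0, -1.0, 0.0, 0.5, 1.5]
max_error = max(abs(derivative(normal_cdf, x) - normal_pdf(x)) for x in points)
print(max_error)  # small: the slope of the CDF matches the PDF
```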
A fly lands on a ruler of length $L$ cm at a random position chosen uniformly along the ruler. Let $X$ be the position of the fly in centimeters, and let $f_X$ be the probability density function for $X$. What is $f_X(x)$ at a point $x$ on the ruler?
This probability distribution is uniform, meaning that the probability density is constant on the entire interval $[0, L]$. This means that the CDF is a linear function: $F_X(x) = x/L$ for $0 \le x \le L$. The probability density function is the derivative:
$$f_X(x) = \frac{dF_X(x)}{dx} = \frac{1}{L}.$$
Therefore the probability density function at any point on the ruler is equal to $1/L$, in units of $\text{cm}^{-1}$.
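The ruler example can be checked with a short simulation. This is a sketch under stated assumptions: the ruler length `LENGTH_CM = 30.0` and the evaluation point `x = 12.0` are hypothetical values chosen for illustration, not values from the problem. The empirical CDF of the simulated landings should be close to a straight line, and its slope should be close to one over the length.

```python
import random

random.seed(0)        # fixed seed so the run is reproducible
LENGTH_CM = 30.0      # hypothetical ruler length, not taken from the text
N = 200_000

# Landing positions drawn uniformly along the ruler.
samples = [random.uniform(0.0, LENGTH_CM) for _ in range(N)]

def empirical_cdf(x):
    """Fraction of simulated landings at a position <= x."""
    return sum(1 for s in samples if s <= x) / N

def estimated_pdf(x, h=1.0):
    """Central-difference slope of the empirical CDF around x."""
    return (empirical_cdf(x + h) - empirical_cdf(x - h)) / (2 * h)

x = 12.0
print(empirical_cdf(x), x / LENGTH_CM)    # empirical vs. linear CDF
print(estimated_pdf(x), 1 / LENGTH_CM)    # empirical vs. constant PDF
```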
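A dart player always hits the dartboard (with a radius of $R$ cm), but has such a poor aim that the distribution of darts is uniform across the entire board. Let $X$ be the distance in cm between the dart and the center. Evaluate the probability density function for $X$ at two distances $r_1$ and $r_2$ from the center.
The probability that the dart lands within a distance $r$ of the center is directly proportional to the area of a circle with radius $r$:
$$F_X(r) = P(X \le r) = \frac{\pi r^2}{\pi R^2} = \frac{r^2}{R^2}, \quad 0 \le r \le R.$$
The probability density function is the derivative:
$$f_X(r) = \frac{dF_X(r)}{dr} = \frac{2r}{R^2}.$$
Thus one obtains:
$$f_X(r_1) = \frac{2 r_1}{R^2}, \qquad f_X(r_2) = \frac{2 r_2}{R^2}.$$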
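A simulation gives a quick sanity check of the dartboard result. In this sketch the board radius `RADIUS_CM = 20.0` and the evaluation distance `r = 10.0` are hypothetical values chosen for illustration, not values from the problem. Points are drawn uniformly on the disk by rejection sampling from the enclosing square, and the empirical CDF of the distance is compared with the quadratic law (distance squared over radius squared).

```python
import math
import random

random.seed(1)        # fixed seed so the run is reproducible
RADIUS_CM = 20.0      # hypothetical board radius, not taken from the text
N = 100_000

def random_distance():
    """Distance from the center of a point drawn uniformly on the disk."""
    while True:
        x = random.uniform(-RADIUS_CM, RADIUS_CM)
        y = random.uniform(-RADIUS_CM, RADIUS_CM)
        r = math.hypot(x, y)
        if r <= RADIUS_CM:    # keep only points that land on the board
            return r

distances = [random_distance() for _ in range(N)]

def empirical_cdf(r):
    """Fraction of simulated darts within distance r of the center."""
    return sum(1 for d in distances if d <= r) / N

def exact_pdf(r):
    """Derivative of r^2 / R^2 with respect to r."""
    return 2 * r / RADIUS_CM**2

r = 10.0
estimated_pdf = (empirical_cdf(r + 1) - empirical_cdf(r - 1)) / 2
print(empirical_cdf(r), (r / RADIUS_CM) ** 2)   # empirical vs. exact CDF
print(estimated_pdf, exact_pdf(r))              # empirical vs. exact PDF
```

Note that the density grows with distance: a thin ring far from the center has more area than a ring of the same width near the center.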
One question that often comes up in applications of continuous probability is the following: given the PDF of a random variable, is it possible to find the PDF of an arbitrary function of that random variable?
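The answer is yes, and the easiest method uses the CDF of the random variable. The general case goes as follows: consider the CDF $F_X(x)$ of the random variable $X$, and let $Y = g(X)$ be a function of $X$. It's important to note the distinction between upper and lower case: $X$ is a random variable while $x$ is a real number. Recall that the PDF is given by the derivative of the CDF:
$$f_X(x) = \frac{dF_X(x)}{dx}.$$
Now write the formula for the CDF of $Y$:
$$F_Y(y) = P(Y \le y) = P(g(X) \le y).$$
If $g$ is invertible and increasing, then $g(X) \le y$ exactly when $X \le g^{-1}(y)$, so $F_Y(y) = F_X\big(g^{-1}(y)\big)$, and by the chain rule:
$$f_Y(y) = \frac{dF_Y(y)}{dy} = f_X\big(g^{-1}(y)\big) \, \frac{d g^{-1}(y)}{dy}.$$
This formula can be generalized straightforwardly to cases where $g$ is not invertible or not increasing.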
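The change-of-variables recipe for an invertible, increasing function can be sketched in code. The concrete choices here are assumptions made for illustration only: a uniform variable on $[0, 1]$ and the map $g(x) = x^2$, which is invertible and increasing on that interval; the helper `pdf_of_transform` is a hypothetical name. The sketch builds the transformed density from the original density, the inverse of $g$, and the derivative of that inverse, then compares the implied CDF with a simulation.

```python
import math
import random

def pdf_of_transform(pdf_x, g_inv, g_inv_prime):
    """PDF of Y = g(X) for invertible, increasing g:
    f_Y(y) = f_X(g_inv(y)) * d g_inv(y) / dy."""
    return lambda y: pdf_x(g_inv(y)) * g_inv_prime(y)

# Assumed example: X uniform on [0, 1] and g(x) = x**2.
pdf_x = lambda x: 1.0 if 0.0 <= x <= 1.0 else 0.0
g_inv = math.sqrt                              # inverse of g on [0, 1]
g_inv_prime = lambda y: 0.5 / math.sqrt(y)     # derivative of sqrt(y)

pdf_y = pdf_of_transform(pdf_x, g_inv, g_inv_prime)

# Monte Carlo check: F_Y(y) = F_X(sqrt(y)) = sqrt(y) for this example.
random.seed(2)
N = 200_000
ys = [random.random() ** 2 for _ in range(N)]

y = 0.25
empirical_cdf = sum(1 for v in ys if v <= y) / N
print(pdf_y(y))                       # f_Y(0.25) = 1 / (2 * sqrt(0.25))
print(empirical_cdf, math.sqrt(y))    # empirical vs. exact CDF of Y
```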
Consider a uniform random variable on the interval . Find the distribution (i.e., PDF) of .
Note that $Y = g(X)$, where $g$ is an invertible and increasing function, so the discussion above applies. The CDF of $Y$ is:
This is consistent with the formula derived above.