Discrete Random Variables - Probability Density Function (PDF)
The probability density function (PDF) of a random variable is a function that gives the probability of each possible value occurring. For instance, a random variable describing the result of a single die roll has the p.d.f.
\[ \text{Pr}(X=x) = \begin{cases} \frac{1}{6} & x = 1 \\ \frac{1}{6} & x = 2 \\ \frac{1}{6} & x = 3 \\ \frac{1}{6} & x = 4 \\ \frac{1}{6} & x = 5 \\ \frac{1}{6} & x = 6 \end{cases}\]
In general, the value of the p.d.f. at any point must be nonnegative (since a negative probability is impossible), and the sum of the probabilities must be equal to 1 (as exactly one outcome must occur).
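To make this concrete, here is a minimal sketch in Python (using a hypothetical `die_pdf` dictionary, not anything from the text above) that represents the p.d.f. of a fair die and checks the two properties just mentioned:

```python
from fractions import Fraction

# p.d.f. of a single fair die roll, stored as {outcome: probability}
die_pdf = {x: Fraction(1, 6) for x in range(1, 7)}

# Every probability must be nonnegative ...
assert all(p >= 0 for p in die_pdf.values())

# ... and the probabilities must sum to exactly 1.
assert sum(die_pdf.values()) == 1

print(die_pdf[3])  # Pr(X = 3) = 1/6
```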
The p.d.f. of a random variable is useful in analyzing its expected value, along with other measures such as variance and median.
Expected value
The expected value of a random variable is a weighted average of each case, defined by:
\[\mathbb{E}[X] = \sum_x x \cdot \text{Pr}(X=x)\]
which weights the value of each outcome by the probability that it occurs. For example, the expected value of a single die roll is
\[\mathbb{E}[X]=\frac{1}{6} \cdot 1 + \frac{1}{6} \cdot 2 + \ldots + \frac{1}{6} \cdot 6 = 3.5\]
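The same expected value can be computed directly from the p.d.f.; the sketch below reuses the hypothetical `die_pdf` dictionary from the earlier example:

```python
from fractions import Fraction

die_pdf = {x: Fraction(1, 6) for x in range(1, 7)}

# E[X] = sum over x of x * Pr(X = x)
expected_value = sum(x * p for x, p in die_pdf.items())
print(expected_value)  # 7/2, i.e. 3.5
```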
Note that the expected value is not necessarily the most likely value to occur (the mode); indeed, 3.5 is an impossible value for a die roll to take on. The best interpretation of the expected value is its significance across multiple trials: after \(n\) rolls of a die, the sum of the results is approximately \(3.5n\). This is also an illustration of linearity of expectation, which states that for any two (not necessarily independent) random variables \(X\) and \(Y\),
\[\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y]\]
where \(X+Y\) is the random variable representing the sum of \(X\) and \(Y\). In this particular case, this says that if \(X_1, X_2, \ldots, X_n\) are random variables each representing a single die roll, then \(X_1+X_2+\ldots+X_n\) is a random variable representing the sum of \(n\) die rolls, and that
\[\mathbb{E}[X_1+X_2+\ldots+X_n]=\mathbb{E}[X_1]+\mathbb{E}[X_2]+\ldots+\mathbb{E}[X_n]=3.5n\]
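This across-many-trials interpretation can also be checked empirically. Below is a small Monte Carlo sketch using Python's standard `random` module, with an arbitrary choice of \(n\):

```python
import random

n = 100_000
total = sum(random.randint(1, 6) for _ in range(n))

# By linearity of expectation, E[X_1 + ... + X_n] = 3.5 * n,
# so the observed sum should be close to 3.5 * n.
print(total, 3.5 * n)
```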
Variance
The variance of a random variable is a measure of dispersion, or how "spread out" the values are; it is defined as the expected squared deviation from the mean. More specifically,
\[\text{Var}[X] = \mathbb{E}[(X-\mu)^2]\]
where \(\mu=\mathbb{E}[X]\) is the mean of \(X\).
This is often rewritten in the following manner:
\[ \begin{align*} \text{Var}[X] &= \mathbb{E}[(X-\mathbb{E}[X])^2] \\ &=\mathbb{E}[X^2-2X\mathbb{E}[X]+\mathbb{E}[X]^2] \\ &=\mathbb{E}[X^2]-2\mathbb{E}[X]\mathbb{E}[X]+\mathbb{E}[X]^2 \\ &=\mathbb{E}[X^2]-\mathbb{E}[X]^2 \end{align*} \]
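Both forms of the variance formula give the same answer; the sketch below verifies this for the fair die, again using the hypothetical `die_pdf` dictionary:

```python
from fractions import Fraction

die_pdf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in die_pdf.items())        # E[X]   = 7/2
mean_sq = sum(x**2 * p for x, p in die_pdf.items())  # E[X^2] = 91/6

var_definition = sum((x - mean)**2 * p for x, p in die_pdf.items())  # E[(X - mu)^2]
var_shortcut = mean_sq - mean**2                                     # E[X^2] - E[X]^2

print(var_definition, var_shortcut)  # both 35/12
```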
While expectation is always additive, the corresponding property for variance holds only under the additional assumption of independence:
\[ \text{Var}[X + Y] = \text{Var}[X] + \text{Var}[Y] \ \text{for independent random variables} \ X \ \text{and} \ Y. \]
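For two independent fair dice, this additivity can be verified directly from the joint distribution. The brute-force sketch below (a check written for illustration, not part of the original discussion) compares \(\text{Var}[X+Y]\) with \(\text{Var}[X]+\text{Var}[Y]\):

```python
from fractions import Fraction
from itertools import product

die_pdf = {x: Fraction(1, 6) for x in range(1, 7)}

def variance(pdf):
    mean = sum(x * p for x, p in pdf.items())
    return sum((x - mean)**2 * p for x, p in pdf.items())

# p.d.f. of the sum of two independent die rolls, built from the joint distribution.
sum_pdf = {}
for (x, px), (y, py) in product(die_pdf.items(), repeat=2):
    sum_pdf[x + y] = sum_pdf.get(x + y, 0) + px * py

print(variance(sum_pdf))                      # 35/6
print(variance(die_pdf) + variance(die_pdf))  # 35/6
```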
As an example, consider the following problem: the prices (in dollars) of two stocks are random variables \(X\) and \(Y\) with \(\mathbb{E}[X] = \mathbb{E}[Y] = P.\) Estimators of these prices have variances of 4 and 8, respectively. Let \(\hat{P}\) be an unbiased estimator of \(P\) with
\[\hat{P} = aX + (1 - a)Y.\]
For what value of \(a\) is the variance of \(\hat{P}\) minimized?
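One way to approach this numerically: if the two estimators are assumed independent (an assumption not stated explicitly above), the additivity property, together with the scaling rule \(\text{Var}[cX] = c^2\,\text{Var}[X]\), gives \(\text{Var}[\hat{P}] = 4a^2 + 8(1-a)^2\). A minimal grid-search sketch under that assumption:

```python
# Assuming independence: Var[aX + (1-a)Y] = a^2 * 4 + (1-a)^2 * 8.
def var_p_hat(a, var_x=4, var_y=8):
    return a**2 * var_x + (1 - a)**2 * var_y

# Search a fine grid of a-values in [0, 1] for the minimizer.
best_a = min((i / 1000 for i in range(1001)), key=var_p_hat)
print(best_a, var_p_hat(best_a))  # under this assumption, the exact minimizer is a = 2/3
```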