Discrete Random Variables - Definition
A random variable is a variable that takes on one of several possible values, each occurring with some probability. When there are finitely many or countably infinitely many such values, the random variable is discrete. Random variables contrast with "regular" variables, which have a fixed (though often unknown) value.
For instance, a single roll of a standard die can be modeled by the random variable
\[ X = \begin{cases} 1 & \text{if the die shows } 1 \\ 2 & \text{if the die shows } 2 \\ 3 & \text{if the die shows } 3 \\ 4 & \text{if the die shows } 4 \\ 5 & \text{if the die shows } 5 \\ 6 & \text{if the die shows } 6, \end{cases}\]
where each case occurs with probability \(\frac{1}{6}\).
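As a quick illustration, the following Python sketch (the helper name roll_die is ours, purely for illustration) samples this random variable many times and checks that each of the six values occurs with empirical frequency close to \(\frac{1}{6}\).

```python
import random
from collections import Counter

def roll_die():
    """Sample the random variable X: the face shown by one roll of a fair die."""
    return random.randint(1, 6)  # each value 1, ..., 6 with probability 1/6

# Empirically, each value should appear with frequency near 1/6 ≈ 0.167.
counts = Counter(roll_die() for _ in range(60_000))
for face in range(1, 7):
    print(face, counts[face] / 60_000)
```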
Random variables are important throughout probability and statistics; in fact, the standard probability distributions are best understood as describing the behavior of random variables. Random variables also appear in problems involving conditional probability and linearity of expectation.
Formal Definition
A probability space consists of
- a sample space \(\Omega\), the set of all possible outcomes,
- a set of events \(\mathcal{F}\), where each event is a subset of \(\Omega,\) and
- a function \(P: \mathcal{F} \rightarrow \mathbb{R}\), which assigns a probability to each event.
For example, consider flipping a fair coin. The sample space is the set \(\{\text{heads}, \text{tails}\}\), and the possible events are \(\{\}, \{\text{heads}\}, \{\text{tails}\}, \{\text{heads}, \text{tails}\}\). Here the empty set refers to neither heads nor tails occurring, and the set \(\{\text{heads}, \text{tails}\}\) refers to either one of heads or tails occurring. The function \(P\) is thus defined by
\[P\big(\{\}\big)=0,\quad P\big(\{\text{heads}\}\big)=\frac{1}{2},\quad P\big(\{\text{tails}\}\big)=\frac{1}{2},\quad P\big(\{\text{heads},\text{tails}\}\big)=1.\]
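Because this probability space is finite, it can be written out explicitly in code. The sketch below (the names omega, events, and P are illustrative choices, not standard objects) encodes \(\Omega\), \(\mathcal{F}\), and \(P\) for a single fair coin flip and checks that \(P(\Omega)=1\).

```python
from fractions import Fraction

# Sample space Omega, event set F, and probability function P for one fair coin flip.
omega = frozenset({"heads", "tails"})
events = [frozenset(), frozenset({"heads"}), frozenset({"tails"}), omega]

P = {
    frozenset(): Fraction(0),
    frozenset({"heads"}): Fraction(1, 2),
    frozenset({"tails"}): Fraction(1, 2),
    omega: Fraction(1),
}

assert set(P) == set(events)          # P assigns a probability to every event
assert P[omega] == 1                  # P(Omega) = 1
# Additivity on the disjoint events {heads} and {tails}:
assert P[frozenset({"heads"})] + P[frozenset({"tails"})] == P[omega]
```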
The set \(\mathcal{F}\) consists of selected subsets of \(\Omega\) which represent relevant events of the experiment. For example, if the experiment consists of rolling a 6-sided die \(\big(\Omega = \{1,2,3,4,5,6\}\big),\) one can take \(\mathcal{F}\) to be the set of all subsets of \(\Omega\). Then \(\{1,3,5\}\in \mathcal{F}\) is the event which can be described as "an odd number is rolled." If the experiment produced the outcome \(\omega\in \Omega\), then each event \(F\in \mathcal{F}\) with \(\omega\in F\) is said to have happened. For example, if a 6-sided die is rolled and the number \(2\) is the outcome, some of the events that have happened are \(\{2\}\) ("\(2\) is rolled"), \(\{1,2,3\}\) ("a number less than \(4\) is rolled"), and \(\{2,4,6\}\) ("an even number is rolled").
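To make the notion of an event "happening" concrete, here is a short sketch (the event descriptions are taken from the example above) that checks, for the outcome \(2\), which events contain it.

```python
# Outcome of one die roll and a few events of interest (subsets of Omega = {1, ..., 6}).
outcome = 2
named_events = {
    "2 is rolled": {2},
    "a number less than 4 is rolled": {1, 2, 3},
    "an even number is rolled": {2, 4, 6},
    "an odd number is rolled": {1, 3, 5},
}

# An event F is said to have happened exactly when the outcome is an element of F.
for description, event in named_events.items():
    print(description, "happened:", outcome in event)
```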
In general, \(\mathcal{F}\) need not contain all possible subsets of \(\Omega\). The restrictions on \(\mathcal{F}\) are
- \(\Omega\in \mathcal{F};\)
- if \(A\in \mathcal{F}\), then \(\Omega\setminus A \in \mathcal{F};\)
- if \(\{A_i\}_{i\in I}\) is a countable family of subsets of \(\Omega\) with \(A_i\in \mathcal{F}\) for each \(i\in I\), then \(\displaystyle\bigcup_{i\in I} A_i \in \mathcal{F}\).
The function \(P\) must satisfy \(P(\Omega) = 1\) and is required to be countably additive, that is, for any countable collection of pairwise disjoint sets \(\{F_i\}_{i\in I},\)
\[P\bigg(\displaystyle\bigcup_{i\in I} F_i\bigg) = \displaystyle\sum_{i\in I} P(F_i).\]
In this way, the probability space \((\Omega, \mathcal{F}, P)\) is a measure space in which \(\mathcal{F}\) is the collection of measurable sets and \(P\) is a measure with total mass \(1\).
The requirements on the probability space reflect basic rules of probability. For example, if \(A\in \mathcal{F},\) then \(A^{c}:= \Omega\setminus A \in \mathcal{F}\) and \(A, A^{c}\) are disjoint sets with \(A\cup A^{c} = \Omega\). Since \(P\) is countably additive, \(1 = P(\Omega) = P(A\cup A^{c}) = P(A) + P(A^{c}),\) which is the complement rule of probability.
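The complement rule can be checked numerically for the fair-die space; a minimal sketch, assuming each singleton outcome has probability \(\frac{1}{6}\) and using finite additivity:

```python
from fractions import Fraction

# Fair 6-sided die: each outcome in Omega has probability 1/6.
omega = {1, 2, 3, 4, 5, 6}
P_point = {k: Fraction(1, 6) for k in omega}

def P(event):
    """Probability of an event, computed by (finite) additivity over its outcomes."""
    return sum(P_point[w] for w in event)

A = {1, 3, 5}             # the event "an odd number is rolled"
A_complement = omega - A  # Omega \ A
assert P(A) + P(A_complement) == P(omega) == 1  # the complement rule
```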
A random variable \(X\) is formally defined as a measurable function from the sample space \(\Omega\) to another measurable space \(S\). The requirement that \(X\) is measurable means that the inverse image of each measurable set \(B\) in \(S\) is an event, i.e. \(X^{-1}(B)\in\mathcal{F}\).
Commonly, \(X\) is a function taking values in \(\mathbb{R}\) which describes some property of the outcomes of the probability space. For example, a random variable \(X\) that denotes the number of heads in a single coin flip would have
\[X(\text{heads})=1,\quad X(\text{tails})=0.\]
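In code, such a random variable is simply a function from outcomes to real numbers; a minimal sketch (the name heads_count is our own):

```python
def heads_count(outcome):
    """Random variable X for a single coin flip: the number of heads shown."""
    return 1 if outcome == "heads" else 0

print(heads_count("heads"))  # 1
print(heads_count("tails"))  # 0
```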
Note that the random variable \(X\) does not return a probability by itself; rather, it is the probability space that assigns probabilities. This is perhaps clearer when considering multiple coin flips: the probability space describing 3 consecutive coin flips has 8 possible outcomes (ordered triples of flips), and the probability that exactly two heads are flipped is the probability that the event \(\{(\text{heads}, \text{heads}, \text{tails}), (\text{heads}, \text{tails}, \text{heads}), (\text{tails}, \text{heads}, \text{heads})\}\in \mathcal{F}\) happened:
\[P\big(\{(\text{heads}, \text{heads}, \text{tails}), (\text{heads}, \text{tails}, \text{heads}), (\text{tails}, \text{heads}, \text{heads})\}\big) = \frac{3}{8}.\]
If \(X\) is the random variable that denotes the number of heads flipped, this can be written as
\[P(X=2) = \frac{3}{8}.\]
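This computation can be reproduced by enumerating all \(8\) equally likely outcomes of the three flips; a sketch, assuming a fair coin:

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of 3 consecutive fair coin flips (ordered triples).
outcomes = list(product(["heads", "tails"], repeat=3))

def X(outcome):
    """Random variable: the number of heads among the three flips."""
    return outcome.count("heads")

# Pr(X = 2) = (number of outcomes with exactly two heads) / 8
prob = Fraction(sum(1 for w in outcomes if X(w) == 2), len(outcomes))
print(prob)  # 3/8
```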
A discrete random variable is a random variable which takes only finitely many or countably infinitely many different values.
However, this does not imply that the sample space must have at most countably infinitely many outcomes. For example, if a point \(a\) is chosen uniformly at random in the interval \([-1,1]\), consider the random variable \(X\) which takes the value \(-1\) if \(-1\leq a<0\) and \(1\) otherwise. Although the sample space is the interval \([-1,1]\), which is uncountable, the random variable \(X\) is discrete since it takes only finitely many (namely, two) values.
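This example can be simulated directly; the sketch below draws points uniformly from \([-1,1]\) (via Python's random.uniform, as an illustration) and confirms that \(X\) takes only the two values \(-1\) and \(1\), each roughly half the time.

```python
import random
from collections import Counter

def X(a):
    """Discrete random variable on the uncountable sample space [-1, 1]."""
    return -1 if a < 0 else 1

# Draw points uniformly at random from [-1, 1] and apply X.
samples = [X(random.uniform(-1.0, 1.0)) for _ in range(100_000)]
counts = Counter(samples)
print(sorted(counts))             # only two values: [-1, 1]
print(counts[-1] / len(samples))  # roughly 1/2
```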