A random variable is a variable that takes on one of multiple different values, each occurring with some probability. When there are a finite (or countable) number of such values, the random variable is discrete. Random variables contrast with "regular" variables, which have a fixed (though often unknown) value.
For instance, a single roll of a standard die can be modeled by the random variable
$$X \in \{1, 2, 3, 4, 5, 6\},$$
where each value occurs with probability $\frac{1}{6}$.
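As a minimal sketch of the die example, assuming a fair six-sided die, the distribution can be written as a table of exact fractions. The names `pmf`, `total`, and `p_even` are illustrative only.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so every face is equally likely.
faces = range(1, 7)
pmf = {face: Fraction(1, 6) for face in faces}

# The probabilities of all possible values must add up to 1.
total = sum(pmf.values())

# Probability that the roll is even: add the probabilities of 2, 4, and 6.
p_even = sum(p for face, p in pmf.items() if face % 2 == 0)
```

Using `Fraction` rather than floats keeps the arithmetic exact, so the check that the probabilities sum to 1 is not affected by rounding.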
Random variables are important in several areas of statistics; in fact, the standard probability distributions can be viewed as random variables. They also model situations involving conditional probability as well as problems involving linearity of expectation.
A probability space $(\Omega, \mathcal{F}, P)$ consists of
- a sample space $\Omega$, the set of all possible outcomes,
- a set of events $\mathcal{F}$, where each event is a subset of $\Omega$, and
- a function $P : \mathcal{F} \to [0, 1]$, which assigns a probability to each event.
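For example, consider flipping a fair coin. The sample space is the set $\Omega = \{H, T\}$, and the possible events are $\mathcal{F} = \{\emptyset, \{H\}, \{T\}, \{H, T\}\}$. Here the empty set $\emptyset$ refers to neither heads nor tails occurring, and the set $\{H, T\}$ refers to either one of heads or tails occurring. The function $P$ is thus defined by
$$P(\emptyset) = 0, \quad P(\{H\}) = \frac{1}{2}, \quad P(\{T\}) = \frac{1}{2}, \quad P(\{H, T\}) = 1.$$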
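The fair-coin probability space can be sketched in a few lines, under the assumption that the events are all subsets of the two-outcome sample space and that each outcome is equally likely. The helper names `power_set` and `prob` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Sample space for one flip of a fair coin.
omega = ("H", "T")

def power_set(outcomes):
    """Return every subset of the outcomes as a frozenset."""
    return [
        frozenset(subset)
        for size in range(len(outcomes) + 1)
        for subset in combinations(outcomes, size)
    ]

# Events: all subsets of the sample space (4 of them here).
events = power_set(omega)

def prob(event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event), len(omega))

# Table mapping each event to its probability.
table = {event: prob(event) for event in events}
```

Here `table[frozenset()]` is 0, each single-outcome event has probability 1/2, and the full sample space has probability 1.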
The set $\mathcal{F}$ consists of selected subsets of $\Omega$ which represent relevant events of the experiment. For example, if the experiment consists of rolling a 6-sided die, one can take $\mathcal{F}$ to be the set of all subsets of $\Omega = \{1, 2, 3, 4, 5, 6\}$. Then $\{1, 3, 5\}$ is the event which can be described as "an odd number is rolled." If the experiment produced the outcome $\omega$, then each event $A \in \mathcal{F}$ with $\omega \in A$ is said to have happened. For example, if a 6-sided die is rolled and the number $2$ is the outcome, some of the events that have happened are $\{2\}$ ("$2$ is rolled"), $\{1, 2\}$ ("a number less than $3$ is rolled"), and $\{2, 4, 6\}$ ("an even number is rolled").
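In general, $\mathcal{F}$ need not contain all possible subsets of $\Omega$. The restrictions on $\mathcal{F}$ are
- if $A \in \mathcal{F}$, then $A^c = \Omega \setminus A \in \mathcal{F}$;
- if $\{A_i\}$ is a countable family of subsets of $\Omega$ with $A_i \in \mathcal{F}$ for each $i$, then $\bigcup_i A_i \in \mathcal{F}$.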
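The function $P$ must satisfy $P(\Omega) = 1$ and is required to be countably additive, that is, for any countable collection of pairwise disjoint sets $A_1, A_2, \ldots \in \mathcal{F}$,
$$P\left(\bigcup_i A_i\right) = \sum_i P(A_i).$$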
In this way, the probability space $(\Omega, \mathcal{F}, P)$ becomes a measure space with measure $P$ and the collection of all measurable sets $\mathcal{F}$.
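The requirements on the probability space reflect basic rules of probability. For example, if $A \in \mathcal{F}$, then $A$ and $A^c$ are disjoint sets with $A \cup A^c = \Omega$. Since $P$ is countably additive, $P(A) + P(A^c) = P(\Omega) = 1$, so $P(A^c) = 1 - P(A)$, which is the complement rule of probability.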
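For a finite sample space these requirements can be checked by brute force. The sketch below assumes a fair six-sided die with every subset of the outcomes taken as an event; on a finite space, closure under pairwise unions already implies closure under countable unions. All names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Assumption: fair six-sided die, events are all 64 subsets of the outcomes.
omega = frozenset(range(1, 7))
events = {
    frozenset(subset)
    for size in range(len(omega) + 1)
    for subset in combinations(sorted(omega), size)
}

def prob(event):
    """Uniform probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(omega))

# Closure of the event set under complements and (pairwise) unions.
closed_under_complement = all(omega - a in events for a in events)
closed_under_union = all(a | b in events for a in events for b in events)

# Additivity on disjoint pairs, and the complement rule derived from it.
additive = all(
    prob(a | b) == prob(a) + prob(b)
    for a in events
    for b in events
    if not (a & b)
)
complement_rule = all(prob(omega - a) == 1 - prob(a) for a in events)
```

All four checks evaluate to `True` for this space, and `prob(omega)` equals 1 as required.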
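A random variable $X$ is formally defined as a measurable function from the sample space $\Omega$ to another measurable space $S$. The requirement that $X$ is measurable means that the inverse image $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\}$ of each measurable set $B$ in $S$ is an event, that is, $X^{-1}(B) \in \mathcal{F}$.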
Commonly, $X$ is a function taking values in $\mathbb{R}$ which describes some property of the outcomes of the probability space. For example, a random variable $X$ that denotes the number of heads in a single coin flip would have
$$X(H) = 1, \quad X(T) = 0.$$
Note that the random variable does not return a probability by itself; rather, it is the probability space itself that assigns probabilities. This is perhaps more obvious when considering multiple coin flips; for instance, the probability space describing 3 consecutive coin flips has 8 possible outcomes,
$$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\},$$
and the probability that exactly two heads are flipped is the probability that the event $\{HHT, HTH, THH\}$ happened:
$$P(\{HHT, HTH, THH\}) = \frac{3}{8}.$$
If $X$ is a variable that denotes the number of heads flipped,
$$P(X = 2) = P(\{\omega \in \Omega : X(\omega) = 2\}) = P(\{HHT, HTH, THH\}) = \frac{3}{8}.$$
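The three-flip example can be reproduced by enumerating the sample space. In this sketch the random variable is an ordinary function on outcomes, and the probability comes from the (assumed uniform) measure on events; `X`, `prob`, and `distribution` are illustrative names.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 8 sequences of three fair coin flips, e.g. "HTH".
omega = ["".join(flips) for flips in product("HT", repeat=3)]

def X(outcome):
    """Random variable: number of heads in the outcome."""
    return outcome.count("H")

def prob(event):
    """Uniform probability on the sample space (fair, independent flips)."""
    return Fraction(len(event), len(omega))

# The event "exactly two heads" is the preimage of 2 under X.
two_heads = {w for w in omega if X(w) == 2}
p_two_heads = prob(two_heads)  # Fraction(3, 8)

# Full distribution of X: the probability of each preimage X^{-1}({k}).
distribution = {k: prob({w for w in omega if X(w) == k}) for k in range(4)}
```

Note that `X` itself never returns a probability; `prob` is applied to the set of outcomes that `X` maps to a given value.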
A discrete random variable is a random variable which takes only finitely many or countably infinitely many different values.
However, this does not imply that the sample space must have at most countably infinitely many outcomes. For example, if a point $x$ is chosen uniformly at random in the interval $[0, 1]$, consider the random variable $X$ which takes the value $1$ if $x < \frac{1}{2}$ and $0$ otherwise. Although the sample space is the interval $[0, 1]$, which is infinite and uncountable, the variable $X$ is discrete since it takes only finitely many (i.e. 2) values.
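A quick simulation illustrates a discrete variable on an uncountable sample space. The specifics here are assumptions made for the sketch: the point is drawn from the unit interval with Python's `random` module (which samples from $[0, 1)$; the missing endpoint has probability zero), and the variable is 1 when the point is below an assumed threshold of 1/2 and 0 otherwise.

```python
import random

# Seeded generator so the run is reproducible.
rng = random.Random(0)

def X(point):
    """Indicator variable: 1 if the point is below 1/2 (assumed threshold), else 0."""
    return 1 if point < 0.5 else 0

# Draw many points from the unit interval and apply X to each.
samples = [X(rng.random()) for _ in range(100_000)]

# X only ever takes two values, although the points themselves are real numbers.
values_taken = set(samples)

# Empirical frequency of X = 1; it should be close to 1/2.
freq_one = sum(samples) / len(samples)
```

The set `values_taken` contains only 0 and 1, which is what makes the variable discrete regardless of the size of the underlying sample space.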