# This note has been used to help create the Bayes' Theorem and Conditional Probability wiki

by Matt Enlow

Probability problems are notorious for yielding surprising and counterintuitive results. One famous example -- or pair of examples -- is the following:

## 1. A couple has two children and the older child is a boy. If the probabilities of having a boy or a girl are both \(50\%\), what is the probability that the couple has two boys?

We already know that the older child is a boy. The probability of two boys is equivalent to the probability that the younger child is a boy, which is \(50\%\).

## 2. A couple has two children, one of which is a boy. If the probabilities of having a boy or a girl are both \(50\%\), what is the probability that the couple has two boys?

At first glance, this appears to be asking the same question. We might reason as follows: “We know that one is a boy, so the only question is whether the other one is a boy, and the chances of that being the case are \(50\%\). So again, the answer is \(50\%\).”

This makes perfect sense. It also happens to be incorrect.

There are several approaches we could take to straighten out this logical tangle, as well as many other tangles that arise in probability. One common tool is known as **Bayes’ Theorem**. Before we state the theorem, we will discuss conditional probability.

When calculating probabilities of certain events, we often obtain new information that may cause our calculations to change. For example, if I draw a single card from a standard \(52\)-card deck and ask the probability that my card is a Heart, you would quickly say \(\frac{13}{52}=\frac{1}{4}\), as there are \(13\) Hearts in the deck. However, if I then peek at the card and tell you that it is red, then we will have narrowed the possibilities for my card to \(26\) possible cards, and the probability that my card is a Heart is now \(\frac{13}{26}= \frac{1}{2}\).

Let’s define the events \(E\) and \(F\) as “the card is a Heart” and “the card is red,” respectively. Our initial value of \(\frac{1}{4}\) is the probability that my card is a Heart before we have any other information about it. This is called the *prior probability* of event \(E\), and is denoted \(P(E)\). Once we take into account the knowledge that the card is red, we calculate that the *posterior probability* of the card being a Heart is \(\frac{1}{2}\). This probability is denoted \(P(E \mid F)\), and read “the probability of \(E\) given \(F\).”
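As a rough sanity check, the prior and posterior probabilities above can be reproduced by counting cards. The sketch below is only an illustration: the names `deck`, `prob`, `is_heart`, and `is_red` are made up for this example, and it assumes every card is equally likely to be drawn.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Hearts", "Diamonds", "Clubs", "Spades"]

# A standard 52-card deck as (rank, suit) pairs.
deck = list(product(RANKS, SUITS))


def is_heart(card):
    return card[1] == "Hearts"


def is_red(card):
    return card[1] in ("Hearts", "Diamonds")


def prob(event, given=None):
    """P(event | given), found by counting equally likely cards."""
    pool = [card for card in deck if given is None or given(card)]
    hits = [card for card in pool if event(card)]
    return Fraction(len(hits), len(pool))


p_e = prob(is_heart)                        # prior P(E)
p_e_given_f = prob(is_heart, given=is_red)  # posterior P(E | F)
print(p_e, p_e_given_f)  # 1/4 1/2
```

Conditioning on “the card is red” simply shrinks the pool of cards we count over from \(52\) to \(26\).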

What about the values of \(P(F)\) and \(P(F \mid E)\)? It’s clear that \(P(F)\), the prior probability that the card is red, is \(\frac{26}{52}=\frac{1}{2}\). To think about the posterior probability \(P(F \mid E)\), we must ask, “What is the probability that the card is red, if we *assume* that it’s a Heart?” Phrased that way, the probability is clearly \(100\%\), or \(1\). Sometimes conditional probabilities are not this easy to calculate. For a given pair of events \(A\) and \(B\), it is often the case that the value of one of \(P(A \mid B)\) or \(P(B \mid A)\) is much easier to calculate than the other. This is where Bayes’ Theorem comes to the rescue, by giving a relationship between these two conditional probabilities. There are several formulations of Bayes’ Theorem; we first state one of the simpler formulations.

## Bayes’ Theorem

For any two events \(A\) and \(B\) with \(P(B) > 0\),

\[P(A \mid B) = \frac{P(A)P(B \mid A)}{P(B)}. \]

The proof of Bayes' Theorem follows from the fact that both \(P(B) P(A \mid B)\) and \(P(A) P(B \mid A)\) are equal to \(P(A \cap B)\), the probability of both events occurring.
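Spelled out, and assuming \(P(A)\) and \(P(B)\) are both nonzero so that the conditional probabilities are defined, the argument runs roughly as follows:

\[\begin{align} P(A \mid B) &= \frac{P(A \cap B)}{P(B)} \quad \mbox{and} \quad P(B \mid A) = \frac{P(A \cap B)}{P(A)} \\ \Rightarrow \; P(B)P(A \mid B) &= P(A \cap B) = P(A)P(B \mid A) \\ \Rightarrow \; P(A \mid B) &= \frac{P(A)P(B \mid A)}{P(B)}. \end{align}\]

As a check against the card example, taking \(A = E\) and \(B = F\) gives \(P(E \mid F) = \frac{P(E)P(F \mid E)}{P(F)} = \frac{\frac{1}{4} \cdot 1}{\frac{1}{2}} = \frac{1}{2}\), which matches the value found by counting.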

## Example

### Suppose you are given two coins. The first coin is fair and the second coin comes up Heads \(75\%\) of the time. (You can’t tell from appearances which one is which.) You choose one of the coins and flip it three times, yielding Heads, Heads, and Tails (\(HHT\)). Is the coin you flipped more likely to be the fair one or the unfair one?

We want to find the probability that the coin we flipped is unfair, given that we flipped it three times and got \(HHT\). So let’s define event \(A\) as “the coin we flipped is unfair,” and event \(B\) as “the outcome of 3 flips was \(HHT\).” We want to find \(P(A \mid B)\). To use Bayes’ Theorem, we must first determine \(P(A)\), \(P(B \mid A)\), and \(P(B)\).

\(P(A)\) is the probability that the coin we flipped is unfair, without taking into account any flipping that took place. We can assume that we were just as likely to select the fair coin as we were to select the unfair one, so \(P(A)= \frac{1}{2}\). \(P(B \mid A)\) is the probability that the outcome is \(HHT\) if the coin we flip is the unfair one. Since the unfair coin comes up Heads \(75\%\) of the time, \(P(B \mid A) = \frac{3}{4} \cdot \frac{3}{4} \cdot \frac{1}{4}= \frac{9}{64}.\) That leaves \(P(B)\), the probability that the outcome is \(HHT\) if we pick either coin and flip it three times. Here, we come across a little bump in the road. “Well, that depends,” we say. “If it’s the fair coin, the probability is \(\left( \frac{1}{2} \right)^3 = \frac{1}{8}\). But if it’s the unfair coin, we just calculated that the probability is \(\frac{9}{64}\). How can we calculate the probability if we don’t know whether or not the coin is fair?”

We will use the fact that for any two events \(A\) and \(B\), the events \(A \cap B\) and \(\overline{A} \cap B\) partition \(B\), where \(\overline{A}\) denotes the complement of \(A\). This implies

\[ P(B) = P(A \cap B) + P(\overline{A}\cap B) = P(A)P(B \mid A) + P(\overline{A})P(B \mid \overline{A}),\]

which gives us another formulation of Bayes’ Theorem:

\[P(A \mid B) = \frac{P(A)P(B \mid A)}{P(A)P(B \mid A) + P(\overline{A})P(B \mid \overline{A})}.\]

We can now plug everything in:

\[P(A \mid B) = \frac{ \frac{1}{2}\cdot \frac{9}{64} }{ \frac{1}{2}\cdot\frac{9}{64} + \frac{1}{2}\cdot \frac{1}{8} }=\frac{9}{17} \approx 52.94 \%.\]

Therefore, given the evidence of the flips, the coin we’ve been flipping is slightly more likely to be the unfair one. \(_\square \)
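The same calculation can be sketched in a few lines of Python using exact fractions. The helper names `likelihood` and `posterior` are invented for this illustration, and the sketch assumes the flips are independent and that each coin was equally likely to be picked.

```python
from fractions import Fraction


def likelihood(flips, p_heads):
    """Probability of one exact sequence of independent flips."""
    result = Fraction(1)
    for flip in flips:
        result *= p_heads if flip == "H" else 1 - p_heads
    return result


def posterior(prior, like_if_a, like_if_not_a):
    """P(A | B) from P(A), P(B | A), and P(B | not A)."""
    numerator = prior * like_if_a
    return numerator / (numerator + (1 - prior) * like_if_not_a)


p_b_given_unfair = likelihood("HHT", Fraction(3, 4))  # 9/64
p_b_given_fair = likelihood("HHT", Fraction(1, 2))    # 1/8
p_unfair = posterior(Fraction(1, 2), p_b_given_unfair, p_b_given_fair)
print(p_unfair)  # 9/17
```

Changing the flip string or the bias passed to `likelihood` gives a quick way to experiment with variations on this example.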

## Diagnosing Disease

One common application of Bayes’ Theorem is in analyzing the quality of diagnostic tests. Consider the following.

### The reliability of a particular skin test for tuberculosis (TB) is as follows: If the subject has TB, the test comes back positive \(98\%\) of the time. If the subject does not have TB, the test comes back negative \(99\%\) of the time. (Another way to say this is that the sensitivity of the test is \(0.98\), and the specificity of the test is \(0.99\).)

### From a large population, in which \(2\) in every \(10,000\) people have TB, a person is selected at random and given the test, which comes back positive. What is the probability that the person actually has TB?

Before we apply Bayes’ Theorem, take a guess as to what the answer might be. (You might require some convincing that the answer is not simply \(98\%\)!)

Let’s define event \(A\) as “the person has TB” and event \(B\) as “the person tests positive for TB”. It is clear that the prior probability \(P(A)\) is equal to \(0.0002\) and \(P(\overline{A})= 1 - P(A)=0.9998\).

What about \(P(B \mid A)\), the probability that the person will test positive for TB given that the person has TB? This was given to us as \(0.98\). The other value we need is \(P(B \mid \overline{A})\), the probability that the person will test positive for TB given that the person does not have TB. Since a person who does not have TB will test negative \(99\%\) of the time, he or she will test positive \(1\%\) of the time, and therefore \(P(B \mid \overline{A})=0.01\). This implies

\[\begin{align} P(A \mid B) &= \frac{ P(A) P(B \mid A)}{P(A)P(B\mid A) + P(\overline{A})P(B\mid \overline{A})} \\ & = \frac{0.0002 \cdot 0.98}{0.0002\cdot 0.98 + 0.9998 \cdot 0.01} \\ &= \frac{98}{5097} \\ & \approx 1.92 \%. \end{align}\]

You might find it hard to believe that fewer than \(2\%\) of people who test positive for TB using this test actually have the disease. Even though the sensitivity and specificity of this test are both high, the extremely low incidence of TB in the population has a tremendous effect on the test’s positive predictive value, that is, the proportion of people testing positive who actually have the disease. To see this for yourself, you might try answering the same question assuming that the incidence of TB in the population is \(2\) in \(100\) instead of \(2\) in \(10,000\). \( _\square \)
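For readers who would like to experiment with different incidence rates, here is a minimal sketch of the calculation. The function name `positive_predictive_value` and its parameter names are chosen for this illustration only; the formula is the second formulation of Bayes’ Theorem given earlier.

```python
from fractions import Fraction


def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(has disease | tests positive) via Bayes' Theorem."""
    true_positive = prevalence * sensitivity                # P(A) P(B | A)
    false_positive = (1 - prevalence) * (1 - specificity)   # P(not A) P(B | not A)
    return true_positive / (true_positive + false_positive)


SENSITIVITY = Fraction(98, 100)
SPECIFICITY = Fraction(99, 100)

ppv = positive_predictive_value(Fraction(2, 10_000), SENSITIVITY, SPECIFICITY)
print(ppv)  # 98/5097

# To explore the effect of incidence, swap in another prevalence,
# for example Fraction(2, 100), and compare the result.
```

Because the false positives come from the very large healthy group, even a \(1\%\) false-positive rate can swamp the true positives when the disease is rare.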

Let’s see how to use Bayes’ Theorem to solve the two problems at the beginning of this post:

## 1. A couple has two children, the older of which is a boy. What is the probability that they have two boys?

## 2. A couple has two children, one of which is a boy. What is the probability that they have two boys?

Let’s define three events, \(A\), \(B\), and \(C\), as follows:

\[ \begin{align} A & = \mbox{ both children are boys}\\ B & = \mbox{ the older child is a boy}\\ C & = \mbox{ at least one of their children is a boy} \end{align}\]

Question 1 is asking for \(P(A \mid B)\), and Question 2 is asking for \(P(A \mid C)\). The first is computed using the simpler version of Bayes’ Theorem:

\[P(A \mid B) = \frac{P(A)P(B \mid A)}{P(B)} = \frac{ \frac{1}{4}\cdot 1 }{\frac{1}{2}} = \frac{1}{2}.\]

To find \(P(A \mid C)\), we must determine \(P(C)\), the prior probability that the couple has at least one boy. This is equal to \(1 - P(\mbox{both children are girls}) = 1 - \frac{1}{4}=\frac{3}{4}\). Therefore the desired probability is

\[P(A \mid C) = \frac{P(A)P(C \mid A)}{P(C)} = \frac{\frac{1}{4}\cdot 1}{\frac{3}{4}} = \frac{1}{3}. \;\; _\square \]
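As a cross-check on both answers, we can list the four equally likely families and count directly. This is only a sketch: the names `families`, `conditional`, and the three event functions are made up for the example, and it assumes each child is independently a boy or a girl with probability \(50\%\).

```python
from fractions import Fraction
from itertools import product

# Each family is (older child, younger child); all four are equally likely.
families = list(product("BG", repeat=2))


def both_boys(family):
    return family == ("B", "B")


def older_is_boy(family):
    return family[0] == "B"


def at_least_one_boy(family):
    return "B" in family


def conditional(event, given):
    """P(event | given) by counting equally likely families."""
    pool = [family for family in families if given(family)]
    hits = [family for family in pool if event(family)]
    return Fraction(len(hits), len(pool))


p_a_given_b = conditional(both_boys, older_is_boy)      # Question 1
p_a_given_c = conditional(both_boys, at_least_one_boy)  # Question 2
print(p_a_given_b, p_a_given_c)  # 1/2 1/3
```

Knowing that the older child is a boy leaves two possible families, while knowing only that at least one child is a boy leaves three, and that is where the two answers part ways.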

There are probably simpler ways to come to an understanding of how these two questions are different. But Bayes’ Theorem is always an option, and an important tool to keep in our toolbox!

## Practice Problems

## The Paradox of the Second Ace

The following pair of problems comes from Greg Ross’ wonderful blog, Futility Closet.

(1) You are watching four people play bridge, where a hand begins by dealing each player 13 cards.

a. After a hand is dealt, you ask a player, “Do you have at least one Ace?” She says, “Yes.” What is the probability that she’s holding more than one Ace?

b. On a later hand, you ask another player, “Do you have the Ace of Spades?” She says, “Yes.” What is the probability that she’s holding more than one Ace?

## Taxi Hit-And-Run

The following problem is a favorite of Stanford professor and author Keith Devlin.

(2) A certain town has two taxi companies: Blue Birds, whose cabs are blue, and Night Owls, whose cabs are black. Blue Birds has \(15\) taxis in its fleet, and Night Owls has \(75\). Late one night, there is a hit-and-run accident involving a taxi. The town's \(90\) taxis were all on the streets at the time of the accident. A witness saw the accident and claims that a blue taxi was involved. At the request of the police, the witness undergoes a vision test under conditions similar to those on the night in question. Presented repeatedly with a blue taxi and a black taxi, in random order, he shows he can successfully identify the color of the taxi \(4\) times out of \(5\). Which company is more likely to have been involved in the accident?

## Coin Tossing

(3) In front of you are two identical-looking coins. One is fair, and the other comes up Heads \(90\%\) of the time. You choose one of the coins, and flip it ten times, yielding \(HTHHTHHHHT\). What is the probability that the coin you’ve been flipping is the unfair one?

(4) In front of you are two identical-looking coins. One is fair, and the other comes up Heads \(60\%\) of the time. You flip one of the coins five times, yielding \(HHTTH\). You flip the other coin three times, yielding \(THH\). Which one is more likely to be the unfair coin?

(5) In front of you are three identical-looking coins. Two are fair, and the third comes up Heads \(80\%\) of the time. You flip one of the coins three times, yielding \(HTH\). You flip a second coin once, yielding \(H\). Which coin is most likely to be the unfair one?

(6) You have three identical-looking coins in front of you. Two of them are fair, and the third comes up Heads \(80\%\) of the time. You flip one of the coins twice, yielding \(TH\). You flip a second coin once, yielding \(H\). You are permitted to perform one more flip, with any coin you choose, after which you will be asked to identify which coin you believe to be the unfair one. Which coin should you flip in order to maximize the likelihood of your guessing correctly?

## Comments

Can you please supply the answers to those problems so that we check our solution? – Titas Dutta · 2 years, 7 months ago

Under Diagnosing Disease, when calculating the probability of people testing positive for TB having TB, shouldn't the numerator notation be 0.0002*0.98 (multiplication) rather than subtraction (0.0002 - 0.98) as it is now? This does not affect the answer arrived at. – Christopher Williams Staff · 1 year, 2 months ago

Yes it should be. I've edited the page. Thanks! – Calvin Lin Staff · 1 year, 2 months ago

A nice written theory. Thanks sir. It provided a good insight into Bayes theorum – Pranav Rao · 1 year, 4 months ago

There are two bags A and B. A contains n white and 2 black balls and B contains 2 white and n black balls. One of the two bags is selected at random and two balls are drawn from it without replacement. If both the balls drawn are white and the probability that the bag A was used to draw the balls is 6/7, find the value of n – Zishan Ahmad · 1 year, 6 months ago

Very nicely written sir! – Snehal Shekatkar · 2 years, 5 months ago

Where are the answers – Robert Forte · 2 years, 6 months ago
