Newcomb's Paradox

Newcomb's paradox (or Newcomb's problem) is a problem in decision theory in which the seemingly rational decision ends up with a worse outcome than the seemingly irrational decision. The paradox revolves around a particular example, where an agent will give you rewards depending on how it predicts you will act. In a way, it is similar to the prisoner's dilemma. It is deeply tied to problems in prediction, causality, decisions, and free will. Newcomb's problem was invented by William Newcomb, first published by Robert Nozick^[1], and popularized by Martin Gardner^[2].

The core problem in Newcomb's paradox imagines a superintelligent agent named Omega. Omega is a robot that lives in space and amuses itself by rocketing down to Earth and giving humans difficult problems to work on. Today, Omega has chosen you as the target of its game. Omega appears in front of you with two boxes, labeled $A$ and $B$, and places them on the ground. It tells you that it has placed a certain amount of money in each of the boxes. Specifically, box $A$ is opaque and has either $1,000,000 or $0 in it (you don't know which), and box $B$ is transparent and has $1,000 in it. Omega says you may open either box $A$ alone or both boxes $A$ and $B$.

More on One Box or Two Boxes?


Now Omega tells you that what it has said so far is accurate, but there is a twist: Omega decides what to put in the boxes based on what it predicts you will do with them after it leaves. If Omega predicts that you will open only box $A$, it puts $1,000,000 into box $A$ and $1,000 into box $B$. However, if Omega predicts that you will open both boxes, it puts $0 into box $A$ and $1,000 into box $B$.

You know that every time Omega has made a prediction about anyone's behavior, it has been correct. In fact, you know that Omega will always perfectly predict what you will do. (This assumption is obviously unrealistic; the case where Omega is not a perfect predictor is dealt with later.)

After explaining the situation to you, Omega rockets back to its space lair, leaving you with both of the boxes. The question is: should you open just box $A$ or both boxes?

There are two contradictory lines of reasoning:

1. Once Omega puts down the boxes, there either is money in $A$ or there is not. If there is money in $A$, then taking both boxes gets you $1,000 more, and if there is no money in $A$, taking both boxes still gets you $1,000 more. Therefore, you should take both boxes.
2. If you take both boxes, then by the assumption of the problem there will be no money in $A$, so you will end up with $1,000. If you take only $A$, then there will be money in it, so you will end up with $1,000,000. Therefore, you should take only $A$.
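
The two arguments can be compared side by side in a short sketch. It is only an illustration: the function name `payoff` and the encoding of choices as the strings `"one"` and `"two"` are assumptions made here, not part of the original problem statement.

```python
# Dollar amounts from the problem statement.
BOX_A_FULL = 1_000_000
BOX_B = 1_000

def payoff(choice: str, prediction: str) -> int:
    """Money received for a choice ("one" or "two") given Omega's prediction."""
    box_a = BOX_A_FULL if prediction == "one" else 0
    return box_a + (BOX_B if choice == "two" else 0)

# Argument 1 (dominance): hold the contents of box A fixed and compare choices.
for prediction in ("one", "two"):
    gain = payoff("two", prediction) - payoff("one", prediction)
    print(f"Omega predicted {prediction}-boxing: opening both gains {gain}")

# Argument 2 (perfect prediction): the prediction always equals the choice.
for choice in ("one", "two"):
    print(f"You {choice}-box: you receive {payoff(choice, choice)}")
```

The first loop reports a gain of 1000 in both cases, which is the dominance argument. The second loop reports 1000000 for opening one box against 1000 for opening both, which is the argument from perfect prediction.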

Payoff Matrices

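In game theory, the possible actions of two players and the corresponding rewards are represented in a payoff matrix. Here is the payoff matrix for Newcomb's paradox, with your choice in the rows and Omega's prediction in the columns:

| | Omega predicts only $A$ | Omega predicts both boxes |
|---|---|---|
| You open only $A$ | $1,000,000 | $0 |
| You open both boxes | $1,001,000 | $1,000 |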

The diagonal elements of the matrix (top-left and bottom-right) represent the cases in which Omega predicted your actions correctly. The off-diagonal elements (bottom-left and top-right) represent the cases in which Omega predicted incorrectly. However, by the statement of the problem, Omega cannot predict incorrectly, so only the diagonal entries can actually occur: you receive $1,000,000 if you open only box $A$ and $1,000 if you open both boxes.

Breaking apart the payoff matrix in this way allows a generalization of the problem to the case in which Omega is not a perfect predictor. If Omega predicts your action correctly with probability $p$, then each diagonal outcome occurs with probability $p$ and each off-diagonal outcome occurs with probability $1 - p$.

Note that this covers the case in which Omega is worse than chance at predicting your actions, where $p < 0.5$. It also takes into account how much of a difference of reward there is between choosing one and two boxes. In the standard setup, Omega doesn't need to be anywhere near a perfect predictor for taking only box $A$ to be the right choice since there is such a large difference in payoff.
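
To make the role of $p$ concrete, here is a worked expected-value comparison for the standard amounts. It assumes, which the text does not state explicitly, that you simply maximize expected money and that Omega is correct with probability $p$ whichever choice you make. The symbols $E_1$ and $E_2$ are introduced here for the expected payoffs of opening one box and opening both boxes.

$$E_1 = p \cdot 1{,}000{,}000 + (1 - p) \cdot 0 = 1{,}000{,}000\,p$$

$$E_2 = p \cdot 1{,}000 + (1 - p) \cdot 1{,}001{,}000 = 1{,}001{,}000 - 1{,}000{,}000\,p$$

$$E_1 > E_2 \iff 2{,}000{,}000\,p > 1{,}001{,}000 \iff p > 0.5005$$

Under these assumptions, opening only box $A$ has the higher expected payoff as soon as Omega is right slightly more than half of the time.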

Similarly, if the problem were changed such that the payoff from $B$ were at least as large as the possible payoff from $A$ (and in particular if it were two or more times as large), then choosing two boxes would be the right strategy no matter how accurate Omega is.
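
The dependence on the two payoffs can be sketched in code, assuming you maximize expected money and Omega is correct with probability `p` whichever choice you make. The function names `expected_payoffs` and `accuracy_threshold`, and the parameters `a` (the amount that may be in box $A$) and `b` (the amount in box $B$), are hypothetical names chosen for this illustration.

```python
def expected_payoffs(p: float, a: float, b: float) -> tuple[float, float]:
    """Return (one-box, two-box) expected payoffs when Omega is right with probability p."""
    one_box = p * a                      # box A is full only if Omega was right
    two_box = p * b + (1 - p) * (a + b)  # box A is full only if Omega was wrong
    return one_box, two_box

def accuracy_threshold(a: float, b: float) -> float:
    """Accuracy above which one-boxing has the higher expected payoff (requires a > 0)."""
    # Solving p * a > p * b + (1 - p) * (a + b) for p gives p > 1/2 + b / (2a).
    return 0.5 + b / (2 * a)

print(accuracy_threshold(1_000_000, 1_000))  # standard setup
print(accuracy_threshold(1_000, 2_000))      # box B worth twice box A
```

A threshold of 1 or more means that no achievable accuracy makes one-boxing strictly better. Under these assumptions that happens whenever `b` is at least `a`, which includes the case where `b` is twice `a` or more.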

Transparent Newcomb Problem

Imagine that instead of the original case in which $B$ was transparent and $A$ was opaque, both boxes were transparent. In this case, you would know exactly what was in box $A$ once Omega flew away. Again, there are two ways of reasoning about this.

1. If you already know what is in the box, then you should obviously take both boxes. Either the money is there or it isn't, and box $B$ adds $1,000 either way.
2. This setup is the same as the previous one. If Omega predicts that you are the kind of person who will take two boxes, there will not be any money in box $A$. There should not exist a world in which there is money in box $A$ but you take both. Therefore, committing to take only $A$ puts you in the world where there is money in box $A$, and you will get $1,000,000.
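
The second line of reasoning treats your decision as a policy that Omega can inspect in advance. A minimal sketch of that view follows. It assumes Omega fills box $A$ only if it predicts you would open just box $A$ after seeing it full, and it models the prediction by running the policy directly; the function names `play`, `always_one_box`, and `always_two_box` are hypothetical.

```python
BOX_A_FULL = 1_000_000
BOX_B = 1_000

def play(policy) -> int:
    """Run the transparent game for a policy mapping what you see in box A to a choice."""
    # Assumption: Omega fills box A only if it predicts, here by simply running
    # the policy, that you would open just box A after seeing it full.
    box_a = BOX_A_FULL if policy(BOX_A_FULL) == "one" else 0
    choice = policy(box_a)  # you now see the actual contents and decide
    return box_a + (BOX_B if choice == "two" else 0)

def always_two_box(seen_in_a: int) -> str:
    return "two"

def always_one_box(seen_in_a: int) -> str:
    return "one"

print(play(always_two_box))  # 1000
print(play(always_one_box))  # 1000000
```

Because the prediction is modeled as running the same policy, a full box $A$ never coincides with a two-boxing policy, which is the point of the second argument.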

Parfit's Hitchhiker

Newcomb's problem appears to deal with strange notions of causal reasoning, in which a decision now affects something that happened in the past. As a result, many people assume that the problem is just a weird philosophical thought experiment that doesn't relate to decisions in ordinary life. However, similar problems, known as Newcomblike problems, occur all the time^[3]. Consider a more realistic example, known as Parfit's hitchhiker^[4]. A hitchhiker is stranded along a hot desert road, hoping to get to the next town. A driver pulls up alongside the hitchhiker, and says that she will give the hitchhiker a ride, but only if the hitchhiker gives her $100 when they get into town.

The hitchhiker thinks about this and realizes that his best option is to promise to pay the $100 when they get into town, but then not do that. After all, once he is in town, paying the money can't causally affect whether or not he is there. Paying the money at that point would be irrational.

The driver, realizing that the hitchhiker is reasoning this way, drives away. Unfortunately, the seemingly rational choice in this setup leaves the hitchhiker worse off than he would be if he could commit to paying the $100, even though, once in town, he would benefit from not paying.

Parfit's hitchhiker is structurally the same as the transparent Newcomb problem. Choosing to take the ride and pay $100 at the town corresponds to choosing one box, while choosing to take the ride without paying at the town corresponds to choosing both boxes. The driver will only offer the ride if she predicts that the hitchhiker will follow through on his promise to pay $100. The advice to one-box on Newcomb's problem maps to the hitchhiker actually being an honest person, such that the driver trusts him enough to give him a ride. The advice to two-box maps to the hitchhiker deceiving the driver into thinking that he is honest, and then not following through on his promise.
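
The correspondence can be sketched with the same structure as the box problem. The text gives only the $100 fare, so the value of reaching town (`RESCUE_VALUE`) is a hypothetical number chosen for illustration, as is the function name `outcome`. The sketch also assumes the driver's prediction always matches what the hitchhiker would really do.

```python
FARE = 100
RESCUE_VALUE = 10_000  # hypothetical worth of reaching town; not given in the text

def outcome(pays_in_town: bool) -> int:
    """Hitchhiker's net result when the driver correctly predicts his behavior."""
    # Assumption: the driver offers the ride only if she predicts payment,
    # and her prediction always matches what the hitchhiker would really do.
    gets_ride = pays_in_town
    if not gets_ride:
        return 0  # left in the desert
    return RESCUE_VALUE - FARE

print(outcome(pays_in_town=True))   # 9900
print(outcome(pays_in_town=False))  # 0
```

As with one-boxing, the disposition to pay costs $100 but is what makes the ride available in the first place.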

References

[1] Nozick, R. (1969). Newcomb's Problem and Two Principles of Choice. Essays in Honor of Carl G. Hempel.
[2] Gardner, M. (1974). Mathematical Games. Scientific American.
[3] Soares, N. Newcomblike Problems are the Norm. Retrieved September 24, 2014, from http://mindingourway.com/newcomblike-problems-are-the-norm/
[4] Parfit, D. (1984). Reasons and Persons.
