Vacationer's Dilemma

from Wikipedia, the free encyclopedia

The vacationer's dilemma is a game-theoretic thought experiment devised by Kaushik Basu in 1994, in which the participants can earn more by acting "incorrectly" in the game-theoretic sense than by playing the "correct" solution. The original English title, "traveler's dilemma", is not to be confused with the traveling salesman problem. The dilemma is not a zero-sum game: the payouts are never negative, even though the bonus gained by one player equals the penalty imposed on the other.

Background story

Figure: The method by which the payouts are calculated.

The background story exists in several versions, since Basu published the dilemma several times and kept embellishing it. The version given here comes from an article in the magazine "Spektrum der Wissenschaft", which is probably the first German-language presentation of the dilemma.

Tanja and Markus went on vacation to the same remote Pacific island at the same time, but they do not meet until after their return flight, at their home airport, in the office of the claims department. The airline has broken the antique vases of which each of them had bought one on the island. The clerk accepts their claims without further ado but, with the best will in the world, cannot assess the value of the artworks. He expects little from questioning the travelers, apart from gross exaggerations. After some deliberation, he therefore decides on a trickier approach. He asks both of them to write down the value of the vase in euros, independently of each other, as a whole number between 2 and 100. Any prior agreement is of course prohibited. What he does announce beforehand is the payment procedure: if both state the same value, he will take it to be the true purchase price and pay it to each of them. If the values differ, however, he will take the lower one to be true and the higher one to be an attempted fraud. In that case both are reimbursed the lower amount, with one difference: the one who wrote down the lower value receives 2 euros more as a reward for honesty, while the other has a penalty of 2 euros deducted. So if Tanja writes 46, for example, and Markus 100, she gets 48 euros and he only 44.
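The payment procedure described by the clerk can be written down as a small function. The following is a minimal sketch in Python; the function name `payouts` and the parameter `bonus` are chosen here for illustration and do not come from Basu's article.

```python
def payouts(claim_a, claim_b, bonus=2):
    """Return (payout to A, payout to B) for the two written claims.

    Assumption for this sketch: claims are whole numbers, and the
    reward and the penalty are both equal to `bonus` (2 euros in the story).
    """
    if claim_a == claim_b:
        # Equal claims are taken as the true price and paid to both.
        return claim_a, claim_b
    low = min(claim_a, claim_b)
    if claim_a < claim_b:
        # A named the lower value: A gets the reward, B the penalty.
        return low + bonus, low - bonus
    # B named the lower value: B gets the reward, A the penalty.
    return low - bonus, low + bonus


# The example from the story: Tanja writes 46, Markus writes 100.
print(payouts(46, 100))  # (48, 44)
```

The default of 2 for `bonus` mirrors the story; other values can be passed to explore variants with a larger reward and penalty.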

The paradox

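Payout matrix of the dilemma (rows: choice of the first player; columns: choice of the second player; each cell: payout to the first player / payout to the second player)

        2       3       4       ...     98        99        100
2       2/2     4/0     4/0     ...     4/0       4/0       4/0
3       0/4     3/3     5/1     ...     5/1       5/1       5/1
4       0/4     1/5     4/4     ...     6/2       6/2       6/2
...     ...     ...     ...     ...     ...       ...       ...
98      0/4     1/5     2/6     ...     98/98     100/96    100/96
99      0/4     1/5     2/6     ...     96/100    99/99     101/97
100     0/4     1/5     2/6     ...     96/100    97/101    100/100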

The surprising thing about this game is that game theory predicts that rational players would choose the value 2 euros. This answer, of course, runs contrary to common sense, but it can be understood through a few logical considerations.

Tanja and Markus, or abstractly A and B, will each consider how the other will act. The first choice is of course 100, as this promises the highest payout. However, player A can increase his payout to 101 by writing 99 and collecting the bonus. Since player B thinks the same way as player A (this is one of the properties that game theory summarizes under the term "rational"), he will have come to the same conclusion, so that both now choose 99. A knows that B thinks the same way and tries, once again, to increase his payout: he chooses the next lower value, 98, which earns him the bonus (while B still chooses 99) and thus a payout of 100. B will catch up again, A will undercut him by the same reasoning, and so on. The result is that for every number there is a better one, namely the next lower one. So the logical choice for both players is 2. Once both choose 2, deviating (for example to 3) can only make one's own payout worse, in this case 0 instead of 2. This is the Nash equilibrium of the game. Yet the choice of the equilibrium strategy 2 by both players is anything but advantageous, since only minimal payouts are achieved.
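The undercutting argument can be checked by brute force. The sketch below is an illustration only: the names `payoff`, `best_response`, `undercutting_chain` and `pure_nash_equilibria` are hypothetical helpers introduced here, and the range 2 to 100 with a bonus of 2 is taken from the story. It follows the chain of best responses starting at 100 and then searches all pairs of choices for those in which neither player can gain by deviating alone.

```python
LOW, HIGH, BONUS = 2, 100, 2
CHOICES = list(range(LOW, HIGH + 1))


def payoff(own, other):
    """Payout to the player who wrote `own` when the other wrote `other`."""
    if own == other:
        return own
    if own < other:
        return own + BONUS
    return other - BONUS


def best_response(other):
    """A choice that maximizes the own payout against a fixed `other`."""
    return max(CHOICES, key=lambda own: payoff(own, other))


def undercutting_chain(start=HIGH):
    """Repeatedly replace the current value by the best response to it."""
    chain = [start]
    while best_response(chain[-1]) != chain[-1]:
        chain.append(best_response(chain[-1]))
    return chain


def pure_nash_equilibria():
    """All pairs (a, b) in which neither player gains by deviating alone."""
    best = {o: max(payoff(x, o) for x in CHOICES) for o in CHOICES}
    return [(a, b) for a in CHOICES for b in CHOICES
            if payoff(a, b) == best[b] and payoff(b, a) == best[a]]


chain = undercutting_chain()
print(chain[:4], "...", chain[-1])  # [100, 99, 98, 97] ... 2
print(pure_nash_equilibria())       # [(2, 2)]
```

The search covers pure strategies only, that is, players who each pick one fixed number.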

The flaw in the reasoning

The people involved may pursue at least three different goals. If player A's aim is to win no less than player B, choosing 2 euros is correct and understandable. If a player aims to obtain the highest possible total payout from the insurance, he will choose 100 euros. The decision is harder for player A if he wants to maximize his personal profit. He will choose 2 euros only if he considers it negligibly unlikely that player B chooses more than 3 euros. It is more likely, however, that player B also strives to maximize his personal profit and names a high amount.

It is true that player A cannot do worse by switching from 100 to 99 euros. If player B chose 100 euros, player A receives 101 euros instead of 100; if player B chose 99 euros, player A receives 99 euros instead of the 97 euros he would have received with 100. A change from 100 to 98 euros also makes sense. A change from 99 to 98 euros, however, is not always advantageous. Assuming that player B chooses the amounts in the upper range with approximately equal probability, a change from 97 to 96 euros would no longer bring any advantage.
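These comparisons can be reproduced with exact arithmetic. The following sketch makes one assumption that the text leaves open: the "upper range" is taken to be 90 to 100, with player B choosing each of these values with equal probability. The helper names `payoff` and `expected_payoff` are introduced here for illustration.

```python
from fractions import Fraction

BONUS = 2


def payoff(own, other):
    """Payout to the player who wrote `own` when the other wrote `other`."""
    if own == other:
        return own
    if own < other:
        return own + BONUS
    return other - BONUS


def expected_payoff(own, others):
    """Average payout if the other player picks uniformly from `others`."""
    return Fraction(sum(payoff(own, b) for b in others), len(others))


# Assumed "upper range" for player B; the text does not fix its bounds.
upper_range = range(90, 101)

for own in (100, 99, 98, 97, 96, 95):
    print(own, float(expected_payoff(own, upper_range)))

# 99 is never worse than 100, whatever B writes between 2 and 100.
print(all(payoff(99, b) >= payoff(100, b) for b in range(2, 101)))  # True
# 98 is worse than 99 if B writes 100.
print(payoff(98, 100) < payoff(99, 100))  # True
```

Under this assumption the expected payout rises when going from 100 down to 97 and is the same for 97 and 96, which matches the statement that the step from 97 to 96 brings no further advantage.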

Mixed strategies as an explanation

Figure (three panels):
Top: uniform distribution of the probabilities.
Middle: distribution of the probabilities for player A if player B keeps the uniform distribution and A's probability for each value is proportional to its expected value.
Bottom: the limit distribution, which deviates only slightly from the parabolic shape.

One way to approximate human behavior is based on probability theory rather than game theory. The players do not choose one particular value (from 2 to 100) but choose each value with a certain probability. Since player A does not know how B chooses his probabilities, he can, for example, assume a uniform distribution. For every choice of A one can then calculate the expected value, that is, the payout A can expect on average for this choice if B adheres to his distribution. If one assumes that the probability with which A chooses a value is proportional to this expected value, one can calculate the distribution of A's probabilities. The result can then be used for B in place of the uniform distribution. Repeating the procedure with the new distribution yields yet another distribution, which in turn serves as the starting distribution. If this is done several times, the distribution converges towards a limit distribution with its maximum at 97.
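The iteration can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the computation used in the cited articles: the function names (`step`, `iterate`, `mode`), the number of rounds and the use of floating-point numbers are choices made here. No particular output is claimed, since the exact position of the peak can depend on such details.

```python
LOW, HIGH, BONUS = 2, 100, 2
CHOICES = list(range(LOW, HIGH + 1))


def payoff(own, other):
    """Payout to the player who wrote `own` when the other wrote `other`."""
    if own == other:
        return own
    if own < other:
        return own + BONUS
    return other - BONUS


def uniform():
    """Starting point: every value from LOW to HIGH is equally likely."""
    return [1.0 / len(CHOICES)] * len(CHOICES)


def step(dist):
    """One round: new probabilities proportional to the expected payouts
    of each choice against an opponent who plays `dist`."""
    expected = [sum(p * payoff(a, b) for b, p in zip(CHOICES, dist))
                for a in CHOICES]
    total = sum(expected)
    return [e / total for e in expected]


def iterate(rounds=50):
    """Apply `step` repeatedly, starting from the uniform distribution."""
    dist = uniform()
    for _ in range(rounds):
        dist = step(dist)
    return dist


def mode(dist):
    """A value with the highest probability (the first one in case of ties)."""
    return max(CHOICES, key=lambda a: dist[a - LOW])


first = step(uniform())
limit = iterate()
print("peak after one round:", mode(first))
print("peak after many rounds:", mode(limit))
```

After the first round the expected payouts follow a parabola in the chosen value, which is why the figure's middle panel has a parabolic shape.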

Real behavior of people in the vacationer's dilemma

Over time, several experiments have been carried out to find out how real people behave in the vacationer's dilemma. Almost always (with low bonuses) the overwhelming majority named the maximum (100 in the original version); the rest were divided roughly equally among three alternatives: the Nash equilibrium, values just below the maximum, and random values in between. In every case, the average of the values named was relatively high.

A real player will not simply accept the Nash equilibrium derived above but will question parts of the logical chain. He may see the insurance as a further opponent, or measure the possible winnings against those of an uninvolved, fictitious other player.

Note that, for a strictly logical player, switching back from 99 to 100 euros is ruled out only if he sees the game as a pure duel between Tanja and Markus. By switching back, he would leave the other player with the lower number and would incur the deduction himself. A joint switch back by both players is also ruled out, since each looks at the game only from his own perspective. The original 99 × 99 table (values 2 to 100) has thus been shortened to a 98 × 98 table, so to speak. Under this condition, backward induction leaves only a single cell at the end, the one in which both choose 2.
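One way to make the step-by-step shortening of the table concrete is iterated elimination of weakly dominated strategies. This is a related technique swapped in here for illustration, not necessarily the exact argument of the cited articles, and the helper names `weakly_dominates` and `eliminate_from_top` are hypothetical. In each round the sketch removes the highest remaining value if the next lower value is never worse and sometimes better against all remaining values.

```python
BONUS = 2


def payoff(own, other):
    """Payout to the player who wrote `own` when the other wrote `other`."""
    if own == other:
        return own
    if own < other:
        return own + BONUS
    return other - BONUS


def weakly_dominates(x, y, strategies):
    """True if x is never worse than y and at least once better,
    against every choice of the opponent in `strategies`."""
    diffs = [payoff(x, b) - payoff(y, b) for b in strategies]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def eliminate_from_top(low=2, high=100):
    """Drop the highest value as long as the next lower one dominates it.

    Returns the surviving values and the number of elimination rounds.
    """
    strategies = list(range(low, high + 1))
    rounds = 0
    while (len(strategies) > 1
           and weakly_dominates(strategies[-2], strategies[-1], strategies)):
        strategies.pop()  # the table loses its last row and column
        rounds += 1
    return strategies, rounds


print(eliminate_from_top())  # ([2], 98)
```

Each round corresponds to removing the last row and column of the payout table, so 98 rounds are needed to get from 99 values down to the single value 2.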

Contrary to the theoretical considerations, a person in the situation described will focus on maximizing his personal profit. The comparison with the other policyholder remains secondary to him. He will prefer to choose as high an amount as possible in order to preserve the chance of a high payout. From this point of view, it would be nonsensical to choose the smallest possible amount merely to end up 2 euros better off than the opponent. Because the other player also chooses a correspondingly high number, this approach actually pays off. Basu calls this a "superordinate rationality".

Parallels to other problems

The vacationer's dilemma is essentially a generalization of the better-known prisoner's dilemma, which corresponds to a vacationer's dilemma with the lower limit 2 and the upper limit 3, i.e. the upper left four cells of the payout matrix. The prisoner's dilemma therefore poses difficulties similar to those of the vacationer's dilemma; however, the gap between human choices and the prediction of game theory is much more apparent in the vacationer's dilemma.
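The correspondence can be checked on the four cells directly. In the sketch below, the mapping of the choice 2 to "defect" and 3 to "cooperate", as well as the usual labels T, R, P and S for the prisoner's dilemma payoffs, are assumptions added for illustration.

```python
BONUS = 2


def payoff(own, other):
    """Payout to the player who wrote `own` when the other wrote `other`."""
    if own == other:
        return own
    if own < other:
        return own + BONUS
    return other - BONUS


# Restrict the game to the two values 2 and 3.
choices = (2, 3)
matrix = {(a, b): (payoff(a, b), payoff(b, a))
          for a in choices for b in choices}

# Assumed mapping: 2 = "defect", 3 = "cooperate".
T = payoff(2, 3)  # temptation: undercut a cooperating opponent
R = payoff(3, 3)  # reward: both cooperate
P = payoff(2, 2)  # punishment: both defect
S = payoff(3, 2)  # sucker's payoff: cooperate and be undercut

is_prisoners_dilemma = T > R > P > S

for cell, value in sorted(matrix.items()):
    print(cell, value)
print(T, R, P, S, is_prisoners_dilemma)  # 4 3 2 0 True
```

The ordering T > R > P > S is the standard condition under which defecting is the better reply to either choice of the opponent, even though mutual cooperation pays more than mutual defection.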

References

  1. Kaushik Basu: The vacationer's dilemma. In: Spektrum der Wissenschaft. 08/07, 2007, pp. 82-88.
  2. Christoph Pöppe: What is really rational? In: Spektrum der Wissenschaft. 10/07, 2007, pp. 98-103.