Prisoner's Dilemma

from Wikipedia, the free encyclopedia

The Prisoner's Dilemma is a mathematical game from game theory. It models the situation of two prisoners who are accused of having committed a crime together. The two prisoners are interrogated individually and cannot communicate with each other. If both deny the crime, both receive a mild sentence, since only a lesser offense can be proven against them. If both confess, both receive a heavy penalty, though not the maximum penalty, because of their confessions. If, however, only one of the two prisoners confesses, that one goes unpunished as a key witness, while the other receives the maximum sentence as a convicted perpetrator who did not confess.

The dilemma consists in the fact that each prisoner has to decide either to deny (i.e. to cooperate with the other prisoner) or to confess (i.e. to betray the other) without knowing the other prisoner's decision. The sentence ultimately imposed, however, depends on how the two prisoners testified together, and thus not only on one's own decision but also on the decision of the other prisoner.

The prisoner's dilemma is a symmetric game with complete information that can be represented in normal form. The dominant strategy of both prisoners is to confess; this strategy combination is also the only Nash equilibrium. Cooperation between the prisoners, on the other hand, would lead to a lower sentence for each and thus to a lower overall sentence.

The prisoner's dilemma appears in a variety of sociological and economic questions. In economics, the prisoner's dilemma, as part of game theory, is also assigned to the decision-oriented theories of organization. It is not to be confused with the prisoner's paradox about conditional probabilities or with the 100 prisoners problem from combinatorics.

Development and naming

Thomas Hobbes (1588–1679) already dealt with the underlying problem. Hobbes was an English mathematician, state theorist and philosopher of the early modern period; in his main work Leviathan he developed a theory of absolutism. Alongside John Locke and Jean-Jacques Rousseau, Hobbes was one of the most important contract theorists. (See also: the prisoner's dilemma and business ethics in Leviathan.)

The basic concept of the prisoner's dilemma was formulated in the 1950s by two employees of the RAND Corporation. To illustrate their abstract theoretical results, Merrill M. Flood and Melvin Dresher described a two-person game that shows how individually rational decisions can lead to collectively worse results.

The term "prisoner's dilemma" goes back to Albert William Tucker of Princeton University. He saw the payoff matrix at Melvin Dresher's in 1950 and adopted it because of its clarity. When he was asked to give a lecture on game theory to psychologists, he decided to illustrate the abstract payoff matrix with the scenario of a social dilemma: two (guilty) prisoners on remand face the choice of denying or confessing. For the individual it is safest to confess, but mutual denial promises the best overall result.

Since then, the term prisoner's dilemma has become established for all interactions with the same framework conditions (two actors, two alternative courses of action, symmetric payoffs, no possibility of agreement, mutual interdependence).

Description of the situation

To illustrate this, Tucker formulated the game-theoretic problem as a social dilemma:

Two prisoners are suspected of having committed a crime together. Both are interrogated in separate rooms and have no opportunity to consult and coordinate their behavior. The maximum sentence for the crime is six years. If both prisoners remain silent (cooperation), each is sentenced to two years in prison for lesser offenses. If both confess the act (defection), each faces a prison sentence, but because of their cooperation with the investigating authorities not the maximum sentence, only four years. If only one confesses (defection) and the other remains silent (cooperation), the confessing prisoner receives a one-year suspended sentence as a key witness, while the other receives the maximum sentence of six years in prison.

Entering the values in a payoff matrix (bimatrix) yields the following picture, including the overall result:

                        B is silent            B confesses
    A is silent         A: −2  B: −2           A: −6  B: −1
                        total: −4              total: −7
    A confesses         A: −1  B: −6           A: −4  B: −4
                        total: −7              total: −8
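The dominance and equilibrium claims made for this matrix can be checked mechanically. A minimal sketch (the dictionary encoding and the strategy labels C for silence/cooperation and D for confession/defection are illustrative choices, not part of the article):

```python
# Encode the bimatrix above and verify that defecting (confessing) is a
# dominant strategy and (D, D) the only Nash equilibrium.
payoff = {
    ("C", "C"): (-2, -2),  # both silent
    ("C", "D"): (-6, -1),  # A silent, B confesses
    ("D", "C"): (-1, -6),  # A confesses, B silent
    ("D", "D"): (-4, -4),  # both confess
}

def is_nash(a, b):
    """No player can gain by unilaterally switching strategies."""
    ua, ub = payoff[(a, b)]
    best_a = max(payoff[(x, b)][0] for x in "CD")
    best_b = max(payoff[(a, y)][1] for y in "CD")
    return ua == best_a and ub == best_b

# D strictly dominates C for player A (and, by symmetry, for B):
assert all(payoff[("D", b)][0] > payoff[("C", b)][0] for b in "CD")

equilibria = [(a, b) for a in "CD" for b in "CD" if is_nash(a, b)]
print(equilibria)  # [('D', 'D')] -- yet (C, C) would be better for both
```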


Possible outcomes for the individual:

    Betrayal when the other is silent (defection against cooperation):  one year imprisonment (−1)
    Silence when the other is silent (cooperation with cooperation):    two years imprisonment (−2)
    Betrayal when the other betrays too (defection against defection):  four years imprisonment (−4)
    Silence when the other betrays (cooperation against defection),
    the sucker's payoff (the reward of the gullible):                   six years imprisonment (−6)

In general terms, the prisoner's dilemma for two players A and B can be represented with the following payoff matrix:

                        B cooperates           B defects
    A cooperates        A: R  B: R             A: S  B: T
    A defects           A: T  B: S             A: P  B: P

with T > R > P > S, where T is the temptation payoff for one-sided defection, R the reward for mutual cooperation, P the punishment for mutual defection and S the sucker's payoff for one-sided cooperation. In the example above, T = −1, R = −2, P = −4 and S = −6.

A player's payoff therefore depends not only on his own decision but also on the decision of his accomplice (interdependence of behavior).

Collectively , it is objectively more beneficial for both of them to remain silent. If both prisoners were to cooperate, each would only have to go to prison for two years. The loss for both together is four years, and any other combination of confession and silence results in a greater loss.

Individually , it seems to be more beneficial for both to testify. For the individual prisoner, the situation is as follows:

  1. If the other confesses, his testimony reduces the sentence from six to four years;
  2. but if the other is silent, he can reduce the sentence from two years to one year with his testimony!

From an individual perspective, "confess" is therefore the recommended strategy. This recommendation does not depend on the other's behavior: it always seems advantageous to confess. Such a strategy, chosen regardless of the opponent's, is called a dominant strategy in game theory.

The dilemma is based on the fact that collective and individual analysis lead to different recommendations for action.

The design of the game prevents communication and provokes one-sided betrayal, through which the traitor hopes to achieve the better result for himself: one year (if the fellow prisoner is silent) or four instead of six years (if the fellow prisoner confesses). But if both prisoners pursue this strategy, they worsen their situation, also individually, since they each receive four years instead of two years in prison.

In this divergence of the possible strategies lies the prisoners' dilemma. The supposedly rational, step-by-step analysis of the situation leads both prisoners to confess, which produces a poor result (a suboptimal allocation). The better result could be achieved through cooperation, which is, however, vulnerable to a breach of trust. The rational players meet at the point where the dominant strategies intersect. This point is called the Nash equilibrium. The paradox is that neither player has a reason to deviate from the Nash equilibrium, even though the Nash equilibrium is not Pareto-optimal here.

The role of trust

The players' dilemma is based on ignorance of the other player's behavior. Game theory is concerned with the optimal strategies in the prisoner's dilemma. The optimal outcome for both would be to trust one another and cooperate. Trust can be established in two ways: on the one hand through communication (which is not allowed under the rules of the game) and appropriate evidence of trustworthiness, and on the other hand by punishing the other player in the event of a breach of trust.

The economist and game theorist Thomas Schelling addresses such problems under the conditions of the Cold War ("balance of terror") in his work The Strategy of Conflict: the punishment for a breach of trust would have been so severe that the breach was not worthwhile. When the prisoner's dilemma is played iteratively, most strategies rely on using information from previous rounds. If the other player cooperates in a round, the successful strategy tit for tat ("like for like") trusts that he will continue to do so and in turn extends trust; otherwise it punishes him in order to avoid being exploited.

William Poundstone points out that it is not a dilemma if one immediately and always chooses cooperation based on trust.

The role of guilt and innocence

In the prisoner's dilemma, the question of actual guilt or innocence is excluded. A prisoner always benefits from a confession, even if he confesses when he is actually innocent. On the other hand, he achieves a worse result if moral concerns and the hope that his innocence will be proven prevent him from making a confession.

If the penalty for not confessing is very high, innocent people also tend to confess; this effect is particularly evident in show trials .

Playing styles

One-shot game

According to the classic analysis of the game, in the prisoner's dilemma played only once (English: one shot), the only rational strategy for a player interested in his own payoff is to confess and thereby betray the fellow prisoner. His decision cannot influence the behavior of the other player, and regardless of the other's decision he is always better off not cooperating with the fellow prisoner. This analysis assumes that the players meet only once and that their decisions cannot influence later interactions. Since this is a genuine dilemma, the analysis yields no clear prescriptive instruction for real interactions that correspond to a prisoner's dilemma.

In the one-off game, however, it should be noted that it does not matter whether the two parties come to an agreement beforehand: the situation after such a conversation remains exactly the same.


Experiments have shown that a large number of players cooperate even in a one-off game. It is believed that there are different types of players. The actual distribution of the cooperation observed in the experiments cannot be explained by the standard theory of “rational strategy”. In an experiment with 40 players who each played 20 games in pairs, the average cooperation rate was 22%.

In an experiment published by Frank, Gilovich and Regan in 1993, the behavior of first-year economics students was compared with that of students shortly before their final exams, as well as with the behavior of students from other disciplines, under prisoner's-dilemma conditions. The students each received two dollars if both cooperated and one dollar each if neither cooperated; in the case of unilateral cooperation, the cooperating student received nothing, while the non-cooperating student received three dollars. It turned out that both freshmen and students from other disciplines opted for cooperative strategies by a large majority. Fourth-year economics students, on the other hand, tended not to cooperate. Frank et al. concluded that, with regard to the general welfare as well as the welfare of their students, economists should present a less narrow view of human motivation in their teaching than had previously been the case.

Multiple (finite) game

The situation can change if the game is played over several rounds (iterated or repeated prisoner's dilemma). Whether it changes depends on whether the players know the number of rounds. If the players know when the game ends, it is worthwhile even for players who have been cooperating to defect in the last round, because retaliation is no longer possible. But then the penultimate round effectively becomes the last one in which a real decision has to be made, whereupon the same reasoning applies again. By induction it follows that the Nash equilibrium in this case is constant betrayal: if both sides defect constantly, neither can achieve a better result by unilaterally changing strategy. A game in which both players know the number of rounds is therefore to be treated exactly like a one-shot game. In practice, however, this theoretically rational behavior (known as backward induction) is not always observed, because a rational player cannot know whether the other player is also acting rationally. If there is a possibility that the other player acts irrationally, it can be advantageous even for the rational player to deviate from constant betrayal and instead play tit for tat.
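The backward-induction claim can be verified by brute force for a twice-repeated game. The sketch below represents a strategy as a triple (first move, round-2 reply if the opponent cooperated, round-2 reply if he defected), which suffices for two rounds; the representation and the example payoffs are illustrative assumptions:

```python
# Enumerate all 8x8 pure-strategy pairs of a twice-repeated prisoner's
# dilemma and collect the on-path play of every Nash equilibrium.
from itertools import product

PAYOFF = {("C", "C"): (-2, -2), ("C", "D"): (-6, -1),
          ("D", "C"): (-1, -6), ("D", "D"): (-4, -4)}

STRATS = list(product("CD", repeat=3))  # (move1, reply_to_C, reply_to_D)

def play(s, t):
    """Total payoffs and the on-path moves of a 2-round match."""
    a1, b1 = s[0], t[0]
    a2 = s[1] if b1 == "C" else s[2]
    b2 = t[1] if a1 == "C" else t[2]
    u1, v1 = PAYOFF[(a1, b1)]
    u2, v2 = PAYOFF[(a2, b2)]
    return u1 + u2, v1 + v2, (a1, b1, a2, b2)

def is_nash(s, t):
    u, v, _ = play(s, t)
    return (all(play(s2, t)[0] <= u for s2 in STRATS) and
            all(play(s, t2)[1] <= v for t2 in STRATS))

paths = {play(s, t)[2] for s in STRATS for t in STRATS if is_nash(s, t)}
print(paths)  # every equilibrium path is all-defect
```

Several strategy pairs are equilibria (they differ only off the equilibrium path), but all of them play mutual defection in both rounds.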

It is fundamentally different if the players do not know the number of rounds. Since the players do not know which round will be the last, there is no backward induction. A game repeated an unknown number of times can be treated like an infinitely repeated game.

Infinite game

In infinitely repeated games, as in games repeated an unknown number of times, there is no backward induction. The repeated interaction makes it possible to reward cooperation in subsequent rounds, which leads to higher total payoffs, or to punish defection, which leads to lower payoffs. Tit for tat ("like for like") means punishment for betrayal in the next period. In this case one speaks of calculative trust.

To interpret the results of a game, the payoffs of the individual rounds are combined into a total payoff, which reflects a player's success in the game. Usually the payoffs of the individual rounds are simply added unweighted, but they can also be discounted with a discount factor.
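Both ways of aggregating round payoffs can be written out directly (function and variable names are illustrative):

```python
# Combine per-round payoffs into a total, unweighted or discounted with a
# discount factor delta (round t is weighted by delta**t).
def total_payoff(rounds, delta=1.0):
    """Sum of per-round payoffs, round t discounted by delta**t."""
    return sum(x * delta**t for t, x in enumerate(rounds))

rounds = [-2, -2, -4, -2]          # payoffs of four played rounds
print(total_payoff(rounds))        # unweighted sum: -10.0
print(total_payoff(rounds, 0.5))   # discounted: -2 - 1 - 1 - 0.25 = -4.25
```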

In repeated play, the payoff matrix is typically designed so that, in addition to the universal inequality T > R > P > S, the condition 2R > T + S also holds; this is fulfilled in the sample payoff matrix of the introduction: 2 · (−2) > (−1) + (−6). Otherwise two players could gain an advantage over consistently cooperating players by alternately exploiting and being exploited, or by simply sharing the sum of the payoffs for one-sided cooperation and one-sided defection.
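With the introduction's example values (temptation T = −1, reward R = −2, punishment P = −4, sucker's payoff S = −6), both dilemma conditions can be checked directly:

```python
# Check the two conditions of the repeated prisoner's dilemma with the
# example values from the introduction.
T, R, P, S = -1, -2, -4, -6

assert T > R > P > S          # the universal inequality
assert 2 * R > T + S          # rules out profitable alternating exploitation

# Per-round average when two players alternately exploit each other,
# versus steady mutual cooperation:
alternating = (T + S) / 2     # -3.5 per round
cooperating = R               # -2.0 per round
print(alternating < cooperating)  # True: alternation does not pay
```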

It makes a difference whether one wants to win or to profit. If one wants to win (i.e. beat the other player), it is effectively a different game: the game becomes a zero-sum game in which only victory counts in the end. If one wants to profit (maximize one's own payoff), it is worthwhile to offer the other player cooperation by cooperating oneself. If the other accepts, one ends up with a higher profit than through constant betrayal. Even if one first has to elicit the other's cooperation through one's own cooperation, one increases one's profit.

More than two actors

A prisoner's dilemma with several people arises, for example, if the N people involved can choose between two strategies (C = cooperation; D = defection) and the payoffs are as follows: a cooperating player receives 2 · (k + 1) and a defecting player receives 3 · (k + 1).

Here k denotes the number of other players who choose strategy C, i.e. who cooperate.

The payoff functions show that choosing D always yields a higher payoff than choosing C, so D is a strictly dominant strategy and leads to the Nash equilibrium. The Nash equilibrium is not a Pareto optimum, as all players could improve through a mutually binding agreement. As in the two-person game, the Pareto optimum of universal cooperation is not a Nash equilibrium, since an egoistic player always has an incentive to defect.

In a symmetric game with two options per player, as in this example, the payoff matrix for 100 players can be shown in the following form.

         0    1    2    3  ...   98   99
    C    2    4    6    8  ...  198  200
    D    3    6    9   12  ...  297  300

The row label stands for the strategy of an arbitrary player; the column headings give the number of other players who choose strategy C, i.e. who cooperate. For N = 2 this yields a payoff matrix corresponding to the two-person game.
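The table's rows can be reproduced from payoff functions read off from it; the formulas 2 · (k + 1) and 3 · (k + 1) below are an interpretation consistent with the table, not something the table states explicitly:

```python
# Generate the two table rows: a cooperator receives 2*(k+1) and a
# defector 3*(k+1) when k of the other 99 players cooperate.
N = 100

def payoff_C(k):       # cooperate while k others cooperate
    return 2 * (k + 1)

def payoff_D(k):       # defect while k others cooperate
    return 3 * (k + 1)

row_C = [payoff_C(k) for k in range(N)]
row_D = [payoff_D(k) for k in range(N)]
print(row_C[:4], row_C[-1])   # [2, 4, 6, 8] 200
print(row_D[:4], row_D[-1])   # [3, 6, 9, 12] 300

# Defection strictly dominates, yet all-cooperate beats all-defect:
assert all(payoff_D(k) > payoff_C(k) for k in range(N))
assert payoff_C(N - 1) > payoff_D(0)   # 200 > 3
```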

In general, a game with N actors (N ≥ 2) is a prisoner's dilemma if, with k denoting the number of cooperating actors among the other players, the payoff functions G_C (for cooperating) and G_D (for defecting) satisfy for all k:

    G_D(k) > G_C(k)    and    G_C(N − 1) > G_D(0)

The first condition means that when k other actors cooperate, an individual always receives a higher payoff by defecting than by becoming the (k + 1)-th cooperator; defection is therefore the dominant strategy. The second condition means that cooperation by all actors leads to a higher payoff for each than universal defection. This makes the Nash equilibrium inefficient.

In the classic prisoner's dilemma with several people, the actors can only choose between two strategies and cannot choose the extent of their cooperation. A generalization in which the latter is possible is the public-goods game.


Axelrod's computer tournament

The American political scientist Robert Axelrod organized a computer tournament for the repeated prisoner's dilemma at the beginning of the 1980s, in which he let computer programs with different strategies compete against each other. The most successful strategy overall, and at the same time one of the simplest, was the aforementioned tit-for-tat strategy, developed by Anatol Rapoport. It cooperates in the first move (a friendly strategy) and in the following moves "refrains from betrayal" as long as the other also cooperates. If the other tries to gain an advantage ("betrayal"), it retaliates in the next move (it cannot be exploited), but cooperates again immediately as soon as the other cooperates (it is not resentful).

In his widely acclaimed 1984 book The Evolution of Cooperation, Axelrod described the results of his computer tournaments. As the most important real-world case of cooperation as the dominant strategy in the repeated prisoner's dilemma, Axelrod identified the "live and let live" principle in the First World War.

Evolution dynamic tournaments

A further development of the game over several rounds is play over several generations. After all strategies have competed against each other and against themselves over several rounds, the results of each strategy are added up. For the next generation, successful strategies replace less successful ones; the most successful strategy is the most common in the next generation. Axelrod also carried out this tournament variant.

Strategies inclined to betrayal achieved relatively good results at the beginning, as long as they encountered strategies inclined to cooperation, which they could exploit. But as treacherous strategies succeed, cooperative strategies become rarer from generation to generation: the treacherous strategies undermine the very basis of their own success. When two treacherous strategies meet, they achieve worse results than two cooperating strategies. Treacherous strategies can only grow by exploiting fellow players, while cooperative strategies grow best when they meet each other. A minority of cooperative strategies such as tit for tat can thus assert itself even among a majority of treacherous strategies and grow into the majority. Strategies that can establish themselves over generations in this way and are also resistant to invasion by other strategies are called evolutionarily stable strategies.
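This generational dynamic can be sketched in a few lines. Everything below, the choice of three strategies, ten-round matches, the starting shares and the exponential update rule, is an illustrative assumption, not Axelrod's exact procedure:

```python
# A minimal generation-by-generation dynamic: each strategy's fitness is
# its share-weighted average match payoff; successful strategies grow.
from math import exp

ROUNDS = 10
PAYOFF = {("C", "C"): -2, ("C", "D"): -6, ("D", "C"): -1, ("D", "D"): -4}

def match(strat_a, strat_b):
    """Row player's total payoff over an iterated match."""
    hist_a, hist_b, total = [], [], 0
    for _ in range(ROUNDS):
        a, b = strat_a(hist_b), strat_b(hist_a)
        total += PAYOFF[(a, b)]
        hist_a.append(a)
        hist_b.append(b)
    return total

strategies = {
    "always_cooperate": lambda opp: "C",
    "always_defect": lambda opp: "D",
    "tit_for_tat": lambda opp: opp[-1] if opp else "C",
}
shares = {"always_cooperate": 0.2, "always_defect": 0.4, "tit_for_tat": 0.4}

for generation in range(200):
    fitness = {s: sum(shares[t] * match(strategies[s], strategies[t])
                      for t in strategies) for s in strategies}
    weights = {s: shares[s] * exp(0.1 * fitness[s]) for s in strategies}
    norm = sum(weights.values())
    shares = {s: w / norm for s, w in weights.items()}

print(max(shares, key=shares.get))  # tit_for_tat ends up in the majority
```

With these starting shares, always-cooperate is exploited and shrinks, always-defect then starves on its own kind, and tit for tat takes over the population.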

Tit for tat was beaten only in 2004, by a new strategy called "master and servant" (exploiter and victim) from the University of Southampton: after an initial coded exchange, associated participants who recognize each other take on an exploiter role or a victim role, so that the exploiters (individually) achieve top positions. Considering exploiter and victim together (collectively), they fare worse than tit for tat with the above payoff values. Moreover, a certain critical minimum number of participants is necessary for the good individual results; that is, master and servant cannot establish itself from a small starting population. Since the partners communicate in coded form about their initial moves, the objection arises that the master-and-servant strategy violates the rules of the game, according to which the players are questioned in isolation from one another. The strategy is reminiscent of insect colonies, in which female workers completely renounce reproduction and devote their labor to the welfare of the fertile queen.

Necessary conditions for the spread of cooperative strategies are: a) that several rounds are played, b) that the players can recognize each other from round to round in order to retaliate if necessary, and c) that it is not known when the players will meet for the last time.

Asymmetric variation: sequential decision

The variant of the prisoner's dilemma in which the players decide one after the other puts the players in asymmetric positions. Such a situation arises, for example, in transactions concluded on eBay. First the buyer has to decide whether to cooperate, i.e. whether to transfer the purchase amount to the seller. Then the seller decides whether to ship the goods. Trivially, the seller will never send the goods if the buyer has not transferred the purchase amount.

(Note: in the following, the focus is not on rational decision-making in the sense of an optimal strategy, but on emotional motivation.) The buyer is in a situation of "fear" that the seller might not send the goods even though he, the buyer, transfers the purchase price. Once the money has been received, the seller faces the temptation ("greed") not to send the goods. In this sequential setting, fear and greed can thus be assigned to the two players separately as emotions, while in the usual simultaneous decision both players can feel both emotions equally.

This difference makes it possible to analyze the influence of social identity (simplified: a sense of "we"). The traditional hypothesis is that an existing sense of unity generally strengthens the tendency toward cooperation. Yamagishi and Kiyonari, however, put forward the thesis that a we-feeling does have an influence, but that in the sequential prisoner's dilemma a much stronger effect of reciprocal cooperation renders the presence or absence of a we-feeling insignificant: through his own cooperation, the buyer motivates the seller to cooperate as well. Simpson was able to show, however, that the evidence Yamagishi and Kiyonari cite for their thesis is also compatible with the assumption that an existing sense of togetherness leads players not to give in to greed, while the fear that the other might not cooperate remains a crucial influence.

Such a situation would be particularly suited to explaining why, in Tajfel's minimal-group experiments, the players were observed to seek not to maximize their own group's gain, but to maximize the profit difference to the other group while minimizing the differences within their own group. If one assumes that two players in a prisoner's dilemma both feel part of a group in some way, and that this group membership is salient at the moment of the game, one must assume that the two players strive on the one hand for outcomes as similar as possible and on the other hand for the lowest possible total penalty (or the highest possible total reward). If a player assumes that the other cooperates (the other could be kept from cooperating only by greed), both goals are achieved through cooperation (difference: 0; sum: −4). If, however, the player assumes that the other does not cooperate (fear of being exploited), the two goals point to different strategies: the difference suggests non-cooperation (difference 0, via mutual defection), but the sum suggests cooperation (−7 instead of −8).


Some selected strategies

There are many different strategies for playing the prisoner's dilemma over several rounds, and names have become established for some of them (translations in brackets). For several strategies, the expected total payoff is given below, assuming that the number of rounds is unknown and that after each move there is a probability p of a further move; the probability that the game lasts at least i moves is then p^(i−1).

  • Tit for Tat: Cooperates in the first round and copies the partner's previous move in subsequent rounds. This strategy is in principle willing to cooperate, but retaliates in the event of betrayal; if the other player cooperates again, it is not resentful and responds with cooperation.
  • The tit-for-tat player (TFT) receives:
    • against a perpetual cooperator (C): −2/(1 − p) (the cooperator receives the same payoff)
    • against another tit-for-tat player: −2/(1 − p)
    • against a perpetual defector/traitor (D): −6 − 4p/(1 − p)
  • mistrust: Defects in the first round and copies the partner's previous move in subsequent rounds (like tit for tat), but is not willing to cooperate on its own initiative.
  • spite (resentment): Cooperates until the other player defects for the first time, and always defects thereafter. Very resentful.
  • punisher: Cooperates until the first deviation. Then it defects until the other player's profit from his deviation has been used up, after which it cooperates again until the next deviation from the cooperative solution. This strategy is optimal among players who are willing to cooperate but make mistakes, i.e. who mistakenly play a confrontational move. With few repetitions or very large differences in the payoff matrix, however, it can happen that a loss caused by an opponent's mistake can no longer be compensated; such games are called incurable.
  • pavlov: Cooperates in the first round and defects if the other player's previous move differed from its own. It cooperates if both players cooperated or both defected in the previous round. This leads to a change in behavior if the payoff in the previous round was small, but to keeping the behavior if it was large.
  • gradual: Cooperates until the other player defects for the first time, then defects once and cooperates twice. If the other player defects again after this sequence, gradual defects twice and cooperates twice; after a further defection it defects three times and cooperates twice, and so on. This strategy cooperates in principle, but punishes every attempt at exploitation with increasing severity.
  • prober: Plays cooperate, defect, defect in the first three moves and defects from then on if the opponent cooperated on the second and third moves; otherwise plays tit for tat. It tests whether the other player can be exploited without retaliation, exploits non-retaliating players, but adapts to those who retaliate.
  • master and servant ("Southampton strategy"): During the first five to ten rounds this strategy plays a coded sequence of moves for recognition, thereby determining whether the other player is also playing master and servant, i.e. whether he is a "relative". If so, one player becomes an exploiter ("master") who always defects, and the other becomes a victim ("servant") who cooperates unconditionally and apparently against all reason. If the other player does not play master and servant, the strategy defects constantly in order to damage competitors. This leads to a very good result for the strategy as a whole, since in master-and-servant encounters the master almost always receives the maximum possible payoff for one-sided betrayal, which is extremely unlikely in ordinary encounters. Success in a tournament can be increased further by submitting many similar master-and-servant strategies that recognize each other as "related". Whether master and servant can beat tit for tat depends on the points awarded (the payoff matrix); under unfavorable payoffs, the strategy has a hard time winning against tit for tat.
  • always defect: Always defects, regardless of what the partner does.
Against a perpetual cooperator (C), the defector/traitor (D) receives: −1/(1 − p)
Against another perpetual defector/traitor (D), the defector receives: −4/(1 − p)
  • always cooperate: Always cooperates, no matter what the partner does.
Against another perpetual cooperator (C), the cooperator receives: −2/(1 − p)
Against a perpetual defector/traitor (D), the cooperator receives: −6/(1 − p)
  • random: Defects or cooperates based on a 50:50 random decision.
  • per kind (periodic and friendly): Periodically plays the sequence cooperate/cooperate/defect. This strategy tries to lull the other player into a sense of security by cooperating twice, in order then to exploit him once.
  • per nasty (periodic and unfriendly): Periodically plays the sequence defect/defect/cooperate.
  • go by majority (decide according to the majority): Cooperates in the first round and then plays the move used most frequently by the other player so far. In the event of a tie, it cooperates.
  • Tit for Two Tats (a more good-natured tit for tat): Cooperates in the first round. If the other player cooperated last, it cooperates as well; if the other player defected last, it cooperates or defects with equal probability. This tit-for-tat variant can form colonies very successfully even when the relationship is occasionally disturbed by "misunderstandings" (sabotage or poor communication). Ordinary tit-for-tat players can be caught by such a disturbance in a cycle in which the two alternately cooperate and defect; this cycle is only broken by a further disturbance.
Against a perpetual defector/traitor (D), the tit-for-two-tats player (TFTT) receives the expected payoff: −6 − 5p/(1 − p)
Against a perpetual cooperator (C), a tit-for-tat player, or another tit-for-two-tats player, it receives: −2/(1 − p)
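With continuation probability p, tit for tat's expected totals follow from the geometric series: against an eternal defector it collects −6 in the first round and −4 in every later round, giving −6 − 4p/(1 − p); against an eternal cooperator, −2 per round, giving −2/(1 − p). A numerical check with the illustrative value p = 0.9 (a horizon of 1000 rounds makes the truncation error negligible):

```python
# Verify the closed-form expected payoffs by summing p**(i-1) * payoff_i.
p = 0.9

def expected_total(payoff_of_round, horizon=1000):
    return sum(p**(i - 1) * payoff_of_round(i) for i in range(1, horizon + 1))

# Tit for tat against an eternal defector: -6 in round 1, then -4 forever.
tft_vs_defector = expected_total(lambda i: -6 if i == 1 else -4)
assert abs(tft_vs_defector - (-6 - 4 * p / (1 - p))) < 1e-9   # about -42.0

# Tit for tat against an eternal cooperator: -2 every round.
tft_vs_cooperator = expected_total(lambda i: -2)
assert abs(tft_vs_cooperator - (-2 / (1 - p))) < 1e-9         # about -20.0
```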

Optimal strategy

The strategy tit for tat is, if played strictly, a simple but very effective and successful long-term strategy. But if the game also allows miscommunication and misunderstanding (e.g. a cooperation is misread as a betrayal), strict tit for tat has a flaw: a betrayal introduced by a misunderstanding is then perpetuated by a sequence of mutual retaliations and does not die out. Both players can lock themselves into an ongoing conflict of retaliatory reactions and significantly reduce their results. This circumstance is called vendetta (Italian for blood feud) or echo effect (one's own actions echo back with a delay of one round). Among tit-for-tat players, a vendetta can only arise through miscommunication, since the tit-for-tat strategy never defects unprovoked. The vendetta can likewise only be interrupted by a further miscommunication (if a betrayal is misread as cooperation), since the tit-for-tat strategy never abandons retaliation on its own.

One possible adaptation of the tit-for-tat strategy to reduce the risk of extended vendettas is therefore to make it somewhat less relentless in retaliation, i.e. to build a forgiveness mechanism into the strategy. Not every betrayal is then repaid; with a certain probability a betrayal is tolerated without retaliation. One such "good-natured tit for tat" is the aforementioned tit for two tats. As long as the frequency of miscommunication between the players is not so high that it obscures which strategy is being played, it is still possible to achieve nearly optimal results. For this purpose, the frequency of forgiveness must be chosen in proportion to the frequency of communication errors.
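The vendetta and the effect of forgiveness can be reproduced deterministically. The sketch below uses a deterministic forgiving variant (retaliate only after two consecutive defections), a cousin of the probabilistic tit for two tats described above; the single misread move is an assumption of the model:

```python
# Simulate two identical players; in one round, player A misreads player
# B's cooperation as a betrayal (the "miscommunication").
def simulate(strategy, rounds=20, garbled_round=5):
    seen_by_a, seen_by_b, moves = [], [], []
    for r in range(1, rounds + 1):
        a, b = strategy(seen_by_a), strategy(seen_by_b)
        moves.append((a, b))
        seen_by_a.append("D" if r == garbled_round else b)  # the misreading
        seen_by_b.append(a)
    return moves

def tit_for_tat(opp):
    return opp[-1] if opp else "C"

def tit_for_two_tats(opp):
    return "D" if opp[-2:] == ["D", "D"] else "C"

echo = simulate(tit_for_tat)
# From round 6 on, exactly one player defects each round: the vendetta.
assert all(m.count("D") == 1 for m in echo[5:])

calm = simulate(tit_for_two_tats)
# The forgiving variant never retaliates after a single misread move.
assert all(m == ("C", "C") for m in calm)
```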


From politics and society

The prisoner's dilemma can be applied to many situations in practice. If, for example, two countries agree on arms control, it is always better for each individual country to secretly rearm. Neither country keeps its promise, and both are worse off due to the armament (higher risk potential, higher economic costs), but better off than if only the other had armed (danger of aggression by the other).

A common example of a prisoner's dilemma with several people results from the permanent occupancy of lounge chairs in a recreational facility. If the number of loungers is such that, based on average usage time, they would suffice for all guests, a bottleneck can still arise if guests switch to occupying the loungers permanently, for example by placing a towel on them.

The Braess paradox is also based on a prisoner's dilemma: because of it, the construction of an additional road can worsen the situation for all drivers.
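How an extra road can hurt is easiest to see in the classic textbook instance of the Braess paradox. The network and cost functions below are the standard illustrative assumptions, not figures from this article:

```python
# Classic Braess network (assumed parameters): 4000 drivers travel from S to E.
# Route 1: S->A (n/100 minutes, n = drivers on the edge), then A->E (45 minutes).
# Route 2: S->B (45 minutes), then B->E (n/100 minutes).
DRIVERS = 4000

def cost_without_link():
    # In equilibrium the drivers split evenly over the two symmetric routes.
    n = DRIVERS // 2
    return n / 100 + 45  # either route: 2000/100 + 45 = 65 minutes

def cost_with_link():
    # A free shortcut A->B is added. S->A (at most 40 min) is never worse than
    # S->B (45 min), and B->E never worse than A->E, so switching to
    # S->A->B->E is dominant for every driver; all 4000 end up on it.
    return DRIVERS / 100 + 0 + DRIVERS / 100  # 40 + 0 + 40 = 80 minutes

print(cost_without_link())  # 65.0
print(cost_with_link())     # 80.0
```

As in the prisoner's dilemma, each driver's individually rational switch to the shortcut leaves everyone 15 minutes worse off; removing the new road would restore the better outcome.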

From the economy

There are also examples of the prisoner's dilemma in business, for example in agreements within cartels or oligopolies: two companies agree on an output quota (for example in oil production), but individually it is worthwhile for each to exceed the agreed quota. Both companies produce more, and the cartel collapses. The companies in the oligopoly are forced to lower prices because of the increased production, which reduces their monopoly profits.

If several companies compete in one market, advertising expenditure keeps increasing because each wants to outdo the others a little. This theory was confirmed in the United States in 1971, when a law banning cigarette advertising on television was passed: there were hardly any protests from the ranks of the cigarette manufacturers, because the law resolved the prisoner's dilemma the industry had gotten into.

Another example is a traveling salesman who, against prepayment (possibly paid with bad checks), can either deliver sound goods to his customers (lower profit, but secure in the long term) or deliver no goods at all (high short-term profit). In such scenarios, traders with a bad reputation disappear from the market because nobody does business with them and they cannot cover their fixed costs. Here tit for tat leads to a market with little "fraud". A well-known example based on this model is the eBay rating system: retailers who do not deliver the agreed goods despite receiving payment receive poor ratings and thus disappear from the market.

Noteworthy is the vendor's dilemma, which influences the prices of the goods on offer. The suppliers themselves do not benefit in the presence of the dilemma, but the overall welfare of an economy can increase because customers benefit from low prices. A supplier dilemma is often generated artificially through government intervention in the form of competition policy, for example by prohibiting agreements between suppliers. Institutions thus create more competition in order to protect consumers.

The auction of UMTS licenses in Germany can also serve as an example. Twelve frequency blocks for UMTS were auctioned, which could be purchased in packages of either two or three. Seven bidders ( E-Plus / Hutchison , Mannesmann , T-Mobile , Group 3G / Quam , debitel , mobilcom and Viag Interkom ) took part in the auction in August 2000. As in the theoretical original, agreements between the players, i.e. the mobile network providers, were prevented. After debitel withdrew following the 126th round on August 11, 2000, twelve licenses remained for six mobile phone providers, i.e. two for each; the total bid for all licenses at that time was DM 57.6 billion. However, since the providers were speculating on the departure of another provider and the chance of acquiring three licenses, they continued to submit bids. In the 173rd round on August 17, 2000, two licenses each went to the six remaining providers - a result that could already have been achieved in the 127th round. The sum that the providers paid for all licenses was now DM 98.8 billion.
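A quick back-of-the-envelope calculation on the auction figures just quoted shows what the continued bidding cost; the even split across the six remaining providers is a simplifying assumption for illustration:

```python
# UMTS auction figures from the text, in billion DM.
total_round_126 = 57.6   # total bid after debitel's exit (round 126)
total_round_173 = 98.8   # final total (round 173), same allocation: two each
providers = 6

extra = round(total_round_173 - total_round_126, 1)
extra_per_provider = round(extra / providers, 2)
print(extra)               # 41.2 billion DM paid beyond the round-126 level
print(extra_per_provider)  # about 6.87 billion DM per remaining provider
```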

From criminology

The Mafia's so-called omertà ("keep silent or die!") tries to ensure silence (cooperation) by threatening violations with particularly drastic sanctions. This strengthens cooperation, while at the same time a one-sided confession is discouraged by the threat of extreme loss. This amounts to an internalization of a negative external effect ("negative" in a purely game-theoretical sense).

Omertà tries to get the players to trust each other, but cannot resolve the fundamental dilemma. As a countermeasure, the judiciary can, for example, offer traitors impunity and/or a new identity in order to undermine the trust between accomplices (leniency program). A simple police interrogation strategy (albeit impermissible in Germany under § 136a StPO) can consist in unsettling a suspect by falsely claiming that the accomplice has already confessed.

In a study of mentally disordered test subjects, Rilling found that a deficit in cooperation goes hand in hand with deficits in the emotional and behavioral domains. Psychopathy is viewed as a disorder primarily of the affects underlying social interaction; it is defined as a socially impairing personality disorder with affective, social, and behavioral problems. In agreement with Axelrod's (1987) assumptions, psychopaths are much less likely to want to enter into and maintain stable relationships. That these deficits co-occur in a clinical population which also performs poorly in the iterated prisoner's dilemma points to the close relationship between the ability to cooperate, empathy, and emotional affect.

Influence on welfare

The extent to which the prisoner's dilemma improves or worsens social welfare depends on the circumstances under consideration. In the case of a cartel or oligopoly, the prisoner's dilemma leads to an improvement in the situation. The " market failure " caused by a reduced supply can be remedied. However, if one looks at the arms build-up of states or the advertising expenditure of companies, then the prisoner's dilemma leads to poorer welfare, since only costs are created that do not lead to any new benefit.

In his conception of business ethics, Karl Homann assumes that it is the task of the state or the legislature to shape the regulatory framework so that desirable dilemma situations are maintained and undesirable dilemma situations are overcome by creating or changing institutions. For example, statutory minimum standards securing consumer rights (e.g. the law on general terms and conditions) can dispel distrust of the seller (an undesirable dilemma situation) and thus lead to more trade; at the same time, the competition among sellers and among buyers is to be maintained as a desirable dilemma situation.

Related problems

The symmetric two-person non-zero-sum games also include the game of chicken (coward's game), the stag hunt, the traveler's dilemma, and the battle of the sexes.

Another example of how individual and collective rationality lead to different results is the rationality trap.


Literature

  • Anatol Rapoport, Albert M. Chammah: Prisoner's Dilemma - A Study in Conflict and Cooperation. University of Michigan Press, 1965
  • Robert Axelrod: The Evolution of Cooperation. Oldenbourg Verlag, 2000, ISBN 3-486-53995-7
  • Winfried Eggebrecht, Klaus Manhart: Fatal Logic - Egoism or Cooperation in Computer Simulation. c't 6/1991
  • Rilling, J. K., Glenn, A. L., Jairam, M. R., Pagnoni, G., Goldsmith, D. R., Elfenbein, H. A., Lilienfeld, S. O. (2007): Neural Correlates of Social Cooperation and Non-Cooperation as a Function of Psychopathy. In: Biological Psychiatry, 61, pp. 1260-1271


Individual evidence

  1. Wolf, J. (2008): Organization, Management, Unternehmensführung - Theories, Practical Examples and Criticism, Wiesbaden 2008, p. XVII.
  2. Manz, K., Albrecht, B., Müller, F. (1994): Table of contents, in: Manz, K., Albrecht, B., Müller, F. (Eds., 1994): Organizationstheorie, Munich 1994, pp. VII-IX; here: p. VII.
  3. Gosch, KN (2013): Differences in the interpretation and acceptance of globally binding rules within multinational companies - a study with special consideration of culture, Hamburg 2013, p. 66.
  4. Kollock, P. (1998): Social Dilemmas - The Anatomy of Cooperation, in: Annual Review of Sociology, Vol. 24, No. 1, pp. 183-214; here: p. 185.
  5. Gosch, KN (2013): Differences in the interpretation and acceptance of globally binding rules within multinational companies - a study with special consideration of culture, Hamburg 2013, p. 67.
  6. Straffin, PD (1983), in: The Two-Year College Mathematics Journal, Vol. 14, No. 3, pp. 228-232; here: p. 229.
  7. Gosch, KN (2013): Differences in the interpretation and acceptance of globally binding rules within multinational companies - a study with special consideration of culture, Hamburg 2013, p. 67 f.
  8. Tucker, AW (1950): A Two-Person Dilemma - The Prisoner's Dilemma; reprinted in: Straffin, PD (1983): The Mathematics of Tucker - A Sampler, in: The Two-Year College Mathematics Journal, Vol. 14, No. 3, pp. 228-232; here: p. 228.
  9. Gosch, KN (2013): Differences in the interpretation and acceptance of globally binding rules within multinational companies - a study with special consideration of culture, Hamburg 2013, p. 69 f.
  10. Axelrod, R. (2005): The Evolution of Cooperation, 6th edition, Munich 2005, p. 9.
  11. Nash, J. (1950): Equilibrium Points in N-Person Games, in: Proceedings of the National Academy of Sciences, Vol. 36, No. 1, pp. 48-49; here: p. 49.
  12. William Poundstone: Prisoner's Dilemma - John von Neumann, Game Theory, and the Puzzle of the Bomb. Anchor/Random House, 1992.
  13. Kwasnica, AM, Sherstyuk, K. (2007): Collusion and Equilibrium Selection in Auctions, in: Economic Journal, Vol. 117, No. 516, 2007, pp. 120-145; here: p. 127.
  14. Carsten Vogt: Cooperation in the prisoner's dilemma through endogenous learning, inaugural dissertation, archived copy (memento of the original from September 30, 2007 in the Internet Archive).
  15. Robert H. Frank, Thomas Gilovich, Dennis Regan: Does Studying Economics Inhibit Cooperation? In: Journal of Economic Perspectives, Vol. 7, No. 2, Spring 1993, pp. 159-171 (PDF; 788 KB).
  16. Gächter, S., Kovác, J. (1999): Intrinsic Motivation and Extrinsic Incentives in a Repeated Game with Incomplete Contracts, in: Journal of Economic Psychology, Vol. 20, No. 3, 1999, pp. 251-284; here: p. 262.
  17. Robert Axelrod: The Evolution of Co-operation, 1984, p. 10.
  18. Luce, RD, Raiffa, H. (1957): Games and Decisions - Introduction and Critical Survey, New York et al. 1957, p. 98 f.
  19. Martin J. Osborne, Ariel Rubinstein: A Course in Game Theory. MIT Press, 1994, p. 135.
  20. Gosch, KN (2013): Differences in the interpretation and acceptance of globally binding rules within multinational companies - a study with special consideration of culture, Hamburg 2013, p. 71.
  21. William Poundstone: Prisoner's Dilemma - John von Neumann, Game Theory, and the Puzzle of the Bomb. Anchor/Random House, 1992, pp. 101 ff.
  22. Andreas Diekmann: Game Theory. Rowohlt Taschenbuch Verlag, Reinbek bei Hamburg 2009, ISBN 978-3-499-55701-9, pp. 113-120.
  23. Anatol Rapoport: Decision Theory and Decision Behavior. Macmillan Press, London 1998, ISBN 978-1-349-39988-8, pp. 259-260 (Google Books).
  24. Robyn M. Dawes: Social Dilemmas. In: Annual Review of Psychology, Vol. 31, 1980, pp. 178-180 (preview).
  25. Axelrod, R. (1980): Effective Choice in the Prisoner's Dilemma, in: Journal of Conflict Resolution, Vol. 24, No. 1, 1980, pp. 3-25; here: p. 7.
  26. Axelrod, R. (1980): Effective Choice in the Prisoner's Dilemma, in: Journal of Conflict Resolution, Vol. 24, No. 1, 1980, pp. 3-25; here: p. 4 ff.
  27. Robert Axelrod: The Evolution of Cooperation. New York 1984, pp. 73-87.
  28. Yamagishi, T., Kiyonari, T. (2000): The Group as the Container of Generalized Reciprocity, in: Social Psychology Quarterly, Vol. 63, pp. 116-132.
  29. Brent Simpson: Social Identity and Cooperation in Social Dilemmas, in: Rationality and Society, Vol. 18, 2006, p. 443, doi:10.1177/1043463106066381.
  30. Tajfel, H.: Experiments in Intergroup Discrimination, in: Scientific American, Vol. 223, November 1970, pp. 96-102.
  31. Gernot Sieg: Game Theory. 3rd edition, Oldenbourg Verlag, Munich 2011, ISBN 978-3-486-59657-1, p. 7 f.