Fictitious play

Fictitious Play (literally: fictional game, analogously: game moves in the mind) , also called Brown-Robinson learning process, is an analysis of game theory . More precisely, fictitious play is a convergent process for determining value and optimal strategy. This method was introduced in 1951 by GW Brown (1951) as an algorithm to find the optimal value in zero-sum games .

Fictitious play belongs to the sub-area of evolutionary game theory . The basic idea of these theories is based on the assumption that the players do not act (fully) rationally in search of optimizing solutions.

history

Brown described in his 1951 review, fictitious play as a strategy in which the players go through the fictitious game flow in their head, update it in each round and adapt it to the strategy that was chosen by their opponents in advance. Hence the naming of this strategy, whereby the strategies do not only take place in the head, but can all be played.

In modern game theory, Brown's ideas are no longer given the status that this theory used to have. The new representations differ from the original publication, in the process of updating the new beliefs before each game round. Brown assumes that the beliefs of the individual players are alternately adjusted and updated in relation to the opponent's empirical distribution function . In contrast, modern game theorists believe that the beliefs are updated simultaneously by the players.

After 1951, beliefs were only updated simultaneously. a. is due to the symmetrical treatment of decisions.

Furthermore, fictitious play was used at that time for approximation in order to calculate the optimal strategy. This is also very out of date and at the present time only the Nash equilibrium algorithm is used in practice. Thus, the optimal strategies are no longer approximated, but determined by the following algorithm:

Optimize the decision of player i = 1, ..., n with (arbitrarily) fixed strategies of all other players: Mark the highest payouts that can be achieved for player i under these circumstances. Repeat this for all possible strategy combinations of the other players.
Perform 1. for all players.

Today, because of its enormous complexity, fictitious play is almost exclusively only used in computer science. There are very complex game theory problems with two people, such as B. chess or backgammon, analyzed with fictitious play.

Evolutionary game theory

Fictitious play supports the assumptions of evolutionary, also called evolutionary game theory. Their approach differs from "conventional" game theory in that it turns away from the rationality arguments of the players. Instead, it is assumed that each player is assigned exactly one possible strategy which, due to an irrational assumption, does not optimize the player's payout. In addition, in such games the player is not aware that he is in a game. This strategy is not limited to the clear rules of a game, but represents an evolutionary development in which the players symbolize a "species".

In the model of evolutionary game theory, the players always meet in pairs. The probability of encountering a particular type of strategy depends on the proportion of this strategy in the totality of all possible strategies. In this model one no longer speaks of a set of strategies, but of a "population".

The theories of evolutionary game theory find practical application in economics, especially in the fields of tax and electoral systems, behavioral standards and property structures.

Learning strategies

Fictitious play is assigned to this sub-area of game theory. These strategies are only applied to repetitive games because their best answers are based on decisions made in the past. The beliefs are adapted to these decisions in the past, which is also known as belief learning.

Reinforcement learning is a reinforcement of learning strategies, as it is assumed that players play strategies more often that have made them profitable in the past.

Model Fictitious Play

Fictitious Play was developed to enable agents to approximate optimal solutions within a very short time. In each round, each player plays the best answer, which is based on the previous opponent's moves and generates the highest probability of winning. In fictitious play, each player assumes that the other players are pursuing a stationary strategy. A stationary strategy is characterized by individual decisions that are made regardless of the time of the decision.

In the first round in Fictitious Play, any strategy should be chosen, as you cannot fall back on any historical strategies of the other players.

From the second round onwards, the players take the average of the opponent's historical strategies as the basis for predicting the action in the next game. This is based on the assumption that the opponents will also play in the coming game according to their historical strategy frequency. Using this analysis of past strategies, the player plays the best strategy against them.

Fictitious play is only successful if the opponents actually pursue a stationary strategy. If, on the other hand, the opponents pursue a strategy that is based on a different basis (for example, if the opponent only uses the strategy last played), fictious play is not a successful strategy.

Assumptions of the model

It is based on a repetitive game with two fixed people who make their decisions simultaneously. The beliefs of the players are based on the strategies played by the opponent in the past. Both players play stationary strategies.

Mathematical Definitions

In the following, the symmetrical formation of beliefs in a game with two players is considered.

${\ displaystyle S}$ is the pure strategy space of the two players with ${\ displaystyle m = \ left | S \ right |}$

${\ displaystyle h_ {T} = \ left \ {i_ {t} \ right \} _ {t = 1} ^ {T}}$ , be a historical course of the game.

${\ displaystyle f_ {T} ^ {i} = \ left | \ left \ {s: i_ {s} = i, 1 \ leq s \ leq T \ right \} \ right |}$ be the number with which in occurs. So is and ${\ displaystyle i \ in S}$ ${\ displaystyle h_ {T}}$ ${\ displaystyle \ sum _ {i = 1} ^ {m} f_ {T} ^ {i} = T}$ ${\ displaystyle {\ bar {f}} _ {T} ^ {i} = {\ frac {f_ {T} ^ {i}} {T}}}$

The start conditions for are now determined. For be ${\ displaystyle T = 1}$ ${\ displaystyle i \ in S}$ ${\ displaystyle x_ {0} ^ {i} \ geq 0, \ sum _ {i = 1} ^ {m} x_ {0} ^ {i}> 0}$

With these conditions now the averages are calculated as ${\ displaystyle {\ bar {x}} _ {T} ^ {i} = {\ frac {f_ {T} ^ {i} + x_ {0} ^ {i}} {\ sum _ {k = 1} ^ {m} \ left (f_ {T} ^ {k} + x_ {0} ^ {k} \ right)}}}$

There again , this corresponds ${\ displaystyle \ sum _ {k = 1} ^ {m} f_ {T} ^ {i} = T}$ ${\ displaystyle {\ bar {x}} _ {T} ^ {i} = {\ frac {{\ bar {f}} _ {T} ^ {i} + \ left ({\ frac {1} {T }} \ right) x_ {0} ^ {i}} {1+ \ left ({\ frac {1} {T}} \ right) \ sum _ {k = 1} ^ {m} x_ {0} ^ {k}}} \ approx {\ bar {f}} _ {T} ^ {i}}$

Brown's 1951 definition

A player's best answer to the opponent's strategy using . ${\ displaystyle x}$ ${\ displaystyle y}$ ${\ displaystyle x \ in B \ left (y \ right)}$

Given and is called a pair of functions ${\ displaystyle \ left \ {x_ {0} ^ {i} \ right \} _ {i = 1} ^ {m}}$ ${\ displaystyle \ left \ {y_ {0} ^ {j} \ right \} _ {j = 1} ^ {m}}$

${\ displaystyle x: \ triangle \ left (S \ right) \ rightarrow}$ ${\ displaystyle \ triangle \ left (S \ right)}$ , Fictitious Play, if every period applies ${\ displaystyle y: \ triangle \ left (S \ right) \ rightarrow}$ ${\ displaystyle \ triangle \ left (S \ right)}$ ${\ displaystyle t \ geq 1}$

that and and are determined recursively by and . ${\ displaystyle x \ left ({\ bar {x}} _ {T} \ right) \ in B \ left ({\ bar {y}} _ {T} \ right)}$ ${\ displaystyle y \ left ({\ bar {y}} _ {T} \ right) \ in B \ left ({\ bar {x}} _ {T} \ right)}$ ${\ displaystyle {\ bar {x}} _ {T}, {\ bar {y}} _ {T}}$ ${\ displaystyle \ left ({\ bar {x}} _ {0}, {\ bar {y}} _ {0} \ right)}$ ${\ displaystyle \ left (x, y \ right)}$

This means that a player only behaves optimally when his opponent is actually playing or - otherwise not! ${\ displaystyle {\ bar {x}} _ {T}}$ ${\ displaystyle {\ bar {y}} _ {T}}$

Convergence properties of fictitious play

In the long term, fictitious play only leads to a Nash equilibrium under special assumptions . In fictitious play, Nash equilibria are absorbent, which means that if a Nash equilibrium is played by all players in any round of the game, it follows that the Nash equilibrium is also played for each subsequent game. Furthermore, it can be shown that a correlation between the Nash equilibrium and the probabilities is formed when fictitious play converges to a distribution.

Since fictitious play does not always converge to distributions, other update dynamics are used to form beliefs for the coming round, e.g. B. Replicator Dynamics. This method is more complicated than Fictitious Play, but it can be safely assumed that the results will converge.

Fictitious Play is sure to converge in the following cases:

Zero-sum games with two players (Robinson 1951)
Generally all 2x2 games (Miyasawa 1961; Monderer and Shapley 1996)
Games that can be solved with iterative, strictly dominant strategy (Neighbor 1990)
Games with weighted potential (Monderer and Shapley 1996)

Fictitious play is still not considered a general solution concept (Shapley 1964), but in principle it converges to half of approximated equilibria (Conitzer 2009; Goldberg, Savani, Sorensen, Ventre 2011).

If all players choose their strategies according to fictitious play, the strategy is weakened by the resulting Nash equilibria. A typical result for this is the limit-cycle behavior, which increases with the number of games. In some specific cases the product of the empirical distribution converges to the Nash equilibrium, although playing in cycles, as in matching pennies .

Practical use

Poker game tree with one card and one round

Nowadays, fictitious play is most popular only in poker . Duziak was a pioneer in the use of fictitious play in poker (here we are only talking about the poker variant Texas Hold'em ). Fictitious play greatly simplifies the complexity of the game states.

There are 10 ^ 18 possible game states at Poker, which are reduced to 10 ^ 7 with Fictitious Play. This abstraction can be based on various algorithms. In poker, the so-called "bucketing" is mainly used, as the size of the algorithm is reduced without abstracting the underlying cause. This is done by grouping hands (poker) with similar values into different "buckets". If you have now abstracted the game, it is now essential if you want to optimize your fictitious play strategy to develop a well-defined game tree. This tree quickly takes on gigantic dimensions, so that lower probabilities of winning are sorted out and these paths of the tree are cut off. The game tree is getting smaller and smaller and the pseudo-optimal strategy can be calculated during the game. The pseudo-optimality comes about because the game states are greatly simplified and thus the chances of winning decrease.

Criticism of Fictitious Play

In practice, except in poker, fictious play is rarely found, as it is not suitable for games with several people and does not play optimal strategies. Even simple games with more than two people can no longer be calculated due to the excessive number of terms, so that this will only be possible for computers in the foreseeable future.

For more complex games there is a more efficient algorithm called Joint Strategy Fictitious Play. This simplifies the complexity of fictitious play, as it considers the historical decisions and thus approximates the upcoming strategies of the opponents. The simplification lies in the fact that one's own strategy is chosen at random and is not adapted to all players. Thus, the computational effort is significantly less, at the expense of optimality. In the case of the two-player game, the strategy of Joint Strategy Fictitious Play corresponds to the original Fictitious Play. However, in contrast to fictitious play, Joint Strategy Fictitious Play is practicable for games with a number of people> 2. In addition to the high computational effort of fictitious play, the observation effort of the previous strategies is another aspect that makes this strategy hardly possible in practice.

Web links

http://www.gambit-project.org/

literature

Original paper: GW Brown: Iterative solution of games by Fictitious Play, in Activity Analysis of Production and Allocation, ed. TC Koopmans, New York, Wiley (1951), 374-376

Berger, U. (2004). Two more classes of games with the Fictitious Play property. Mimeo, Vienna University of Economics.

Berger, U. (2005) “Fictitious Play in 2xN Games”, Journal of Economic Theory 120, 139–154.

Berger, U. (2007) "Brown's original Fictitious Play", Journal of Economic Theory 135: 572-578
Danskin, JM (1954), Fictitious play for continuous games. Naval Research Logistics Quarterly, 1: 313-320
Duziak, W. (2006). Using Fictitious Play to find pseudo-optimal solutions for full-scale poker. In Proceedings of the 2006 International Conference on Artificial Intelligence (ICAI-2006).
JR Marden, G. Arslan, JS Shamma, “Joint strategy fictitious play with inertia for potential games,” in Proceedings of the 44th IEEE Conference on Decision and Control, December 2005, pp. 6692-6697
D. Monderer, L. Shapley: Fictitious Play property for games with identical interests, J. Econ. Theory, 68: 258-265 (1996)
von Neumann and Morgenstern (1944), Theory of Games and Economic Behavior, Princeton and Woodstock: Princeton University Press.
Samuelson, L. (1998) Evolutionary Games and Equilibrium Selection, MITP.
Shapley L. (1964) "Some Topics in Two-Person Games" In Advances in Game Theory M. Dresher, LS Shapley, and AW Tucker (Eds.), Princeton: Princeton University Press.
Weibull, J. (1995) Evolutionary Game Theory, MITP. Fudenberg, D. & David K. Levine, D. (1998) The Theory of Learning in Games, MITP.
M. Zinkevich, M. Bowling, MJ, & Piccione, C. (2007). Regret minimization in games with incomplete information. Advances in Neural Information Processing Systems (NIPS 2007), 374-380.

Individual evidence

↑ JM Danskin, (2006), Fictitious Play for continuous games, p. 314
^ GW Brown: Iterative solution of games by Fictitious Play, in Activity Analysis of Production and Allocation, ed.TC Koopmans, New York, Wiley (1951), 375
↑ Berger, U. (2007) “Brown's original Fictitious Play”, Journal of Economic Theory p. 574
^ Nash equilibrium
↑ Edgar Saliger, Business Decision Theory, 2003, Oldenbourg Wissenschaftsverlag, p. 105
↑ Duziak, W. (2006). Using fictitious play to find pseudo-optimal solutions for full-scale poker. In Proceedings of the 2006 International Conference on Artificial Intelligence (ICAI-2006)
↑ D. Billings, N. Burch, A. Davidson, R. Holte, J. Schaeffer, T. Schauenberg, and D. Szafron. Approximating Game-Theoretic Optimal Strategies for Full-scale Poker. Proceedings of the 2003 International Joint Conference on Artificial Intelligence
↑ JR Marden, G. Arslan, JS Shamma, “Joint strategy Fictitious Play with inertia for potential games,” in Proceedings of the 44th IEEE Conference on Decision and Control, December 2005, pp. 6694

[1] JM Danskin, (2006), Fictitious Play for continuous games, p. 314

[2] GW Brown: Iterative solution of games by Fictitious Play, in Activity Analysis of Production and Allocation, ed.TC Koopmans, New York, Wiley (1951), 375

[3] Berger, U. (2007) “Brown's original Fictitious Play”, Journal of Economic Theory p. 574

[4] Nash equilibrium

[5] Edgar Saliger, Business Decision Theory, 2003, Oldenbourg Wissenschaftsverlag, p. 105

[6] Duziak, W. (2006). Using fictitious play to find pseudo-optimal solutions for full-scale poker. In Proceedings of the 2006 International Conference on Artificial Intelligence (ICAI-2006)

[7] D. Billings, N. Burch, A. Davidson, R. Holte, J. Schaeffer, T. Schauenberg, and D. Szafron. Approximating Game-Theoretic Optimal Strategies for Full-scale Poker. Proceedings of the 2003 International Joint Conference on Artificial Intelligence

[8] JR Marden, G. Arslan, JS Shamma, “Joint strategy Fictitious Play with inertia for potential games,” in Proceedings of the 44th IEEE Conference on Decision and Control, December 2005, pp. 6694