Expectiminimax algorithm

from Wikipedia, the free encyclopedia

The Expectiminimax algorithm is an extended version of the Minimax algorithm , which takes random elements such as the falling of a die into account. The algorithm is therefore most frequently used when programming zero-sum games with two participants, such as backgammon .

properties

The algorithm is largely the same as the conventional MiniMax algorithm, but is expanded to include a case distinction. All possible events (e.g. resulting from different numbers) are calculated separately as if they were real. Subsequently, all probability branches of a probability depth that arise in this way are then multiplied by the probability of their occurrence and added. The sum then forms the criterion for the known minimization / maximization process. The values in the same tree so formal mathematical definition of the expected value (Engl. Expected value ).

Pseudo code

The Expectiminimax algorithm was first proposed by Donald Michie . The following is the pseudocode :

function expectiminimax (branch, depth)

    WENN Ast ist ein Bewertungsast ODER Tiefe = 0
        return die heuristische Bewertung der aktuellen Situation
    WENN der Gegner am Zug ist
        // Gib den mit dem geringsten Wert bewerteten Zug zurück
        let α := +∞
        FÜR JEDEN Zug des Astes
            α := min(α, expectiminimax(Nachfolger, Tiefe-1))
    SONST WENN wir am Zug sind
        // Gib den mit dem höchsten Wert bewerteten Zug zurück
        let α := -∞
        FÜR JEDEN Zug des Astes
            α := max(α, expectiminimax(Nachfolger, Tiefe-1))
    SONST WENN zufälliges Ereignis
        // Gib durchschnittliche Bewertung aller Möglichkeiten zurück
        let α := 0
        FÜR JEDEN Nachfolger des Astes
            α := α + (Wahrscheinlichkeit[Nachfolger] * expectiminimax(Nachfolger, Tiefe-1))
    return α

Evaluation function

As in the MiniMax algorithm, the evaluation function must evaluate the current position in the tree on the basis of heuristics . In contrast to the MiniMax algorithm, however, the value range of the function plays a decisive role. If profit or loss positions are marked with or in the MiniMax algorithm , such handling in the Expectiminimax algorithm leads to a loss of information. For example, if a player can achieve a good situation with a number of 1, 2 and 3, but loses if he rolls a 4, the algorithm results with the same probabilities: The move is discarded although it has only a low probability of losing of the game. This effect becomes particularly problematic at the end of a game, when profit and loss evaluations pile up and possibly occur with the same level of probability . To prevent such errors, fixed value ranges must be defined and profit and loss evaluations must be treated separately. In practice, mostly complex data types are used as evaluation values ​​in which the evaluation of the situation is divided into basic evaluation and profit / loss chance.

Pruning

A more efficient design of the calculation by cutting away branches ( pruning ) as in the alpha-beta algorithm is also possible with the Expectiminimax algorithm. However, the implementation is more complex due to the definition of certain value ranges and differs for different applications.

Individual evidence

  1. ^ D. Michie : Game-playing and game-learning automata. In: Leslie Fox (Ed.): Advances in Programming and Non-Numerical Computation. Pergamon Press, Oxford et al. 1966, pp. 183-200.
  2. ^ Stuart Jonathan Russell, Peter Norvig: Artificial intelligence. A modern approach. 3. Edition. Prentice-Hall, Upper Saddle River NJ et al. 2010, ISBN 978-0-13-604259-4 , p. 177 f.