Bellman's principle of optimality

The Bellman equation is a fundamental principle of optimization . It is named after Richard Bellman and states that for some optimization problems, every optimal solution is composed of optimal partial solutions. On this principle are based algorithms of dynamic programming .

One example is the computation of a shortest path in a graph (e.g. a road network). A shortest path P between nodes (cities) A and B, which goes through nodes X and Y, must also use a shortest path between these two nodes between X and Y. If this were not the case, P could be shortened by using a shorter partial path between X and Y, and then P would not have been a shortest path between A and B, contrary to the assumption. The Bellman-Ford algorithm for calculating the shortest paths, which is based on dynamic programming, makes use of this principle. Such graphs are shown in a source-sink tree .

Definition (classic)

"An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision."

- Bellman, 1957

"An optimal decision sequence has the property that whatever the initial situation was and whatever the first decision was made, the remaining decisions must form an optimal decision sequence, based on the state that resulted from the first decision."

Is meant:

“An optimal decision sequence has the property that whatever the initial state was and the first decision was made, the remaining decisions must also form an optimal decision sequence, considered over all possible decision sequences that begin with the state from the first decision results. "

Definition (formal)

If there is an optimization function that works on lists, then Bellman's principle of optimality applies to a -digit function if: ${\ displaystyle h}$ ${\ displaystyle k}$ ${\ displaystyle f}$

${\ displaystyle h ([f (x_ {1}, \ ldots, x_ {k}) | x_ {1} \ leftarrow z_ {1}, \ ldots, x_ {k} \ leftarrow z_ {k}]) = h ([f (x_ {1}, \ ldots, x_ {k}) | x_ {1} \ leftarrow h (z_ {1}), \ ldots, x_ {k} \ leftarrow h (z_ {k})]) }$
${\ displaystyle h (z_ {1} {++} z_ {2}) = h (h (z_ {1}) {++} h (z_ {2}))}$

(Giegerich et al., 2002)

${\ displaystyle z_ {i}, 1 \ leq i \ leq k}$ are lists of the type . is of the type . This is the list link operator and is the list description notation as defined in Haskell . ${\ displaystyle A}$ ${\ displaystyle h}$ ${\ displaystyle h :: [A] -> [A]}$ ${\ displaystyle ++}$ ${\ displaystyle [|]}$

literature

Richard Bellman: Dynamic Programming . Princeton University Press, 1957.
Thomas L. Morin: Monotonicity and the Principle of Optimality . In: Journal of mathematical analysis and applications . tape 86 , 1982, pp. 665-674 .
R. Giegerich, C. Meyer, P. Steffen: Towards a Discipline of Dynamic Programming . In: GI Edition - Lecture Notes in Informatics . Bonner Köllen Verlag, 2002, p. 3–44 ( http://bibiserv.techfak.uni-bielefeld.de/adp/ps/adp_discipline.pdf 260 KB [PDF]).