Bellman algorithm

The Bellman algorithm constructs an optimal binary search tree from a given list of keys and their search probabilities. The algorithm is based on a theorem on optimal expected search times in binary search trees, found by Richard Bellman in 1957, and uses dynamic programming.

Algorithm

Input

$n$ search keys $k_i$, $0 < i \le n$, given in sorted order. In addition, the search probability $p_i$ is given for each key $k_i$. For each $k_i$, $q_{i-1}$ denotes the probability that a non-existent key $x$ is searched for, with $k_{i-1} < x < k_i$ for $1 < i \le n$ or $x < k_i$ for $i = 1$. Correspondingly, $q_n$ is the probability of searching for a non-existent key $x > k_n$.

Since the $p_i$ and $q_i$ are probabilities, the sum of all $p_i$ and $q_i$ must be 1:

$\sum_{i=1}^{n} p_i + \sum_{i=0}^{n} q_i = 1$

Output

The minimum expected search time of an optimal binary search tree for the key set $k_i$, together with an optimal search tree that achieves this minimum expected search time.

Optimality refers to the expected value only. If the probabilities decrease geometrically, the search time for the correspondingly rare keys cannot be bounded logarithmically.

Calculation of the search time

The search time (or search cost) for a key is the number of nodes visited on the path from the root to the key's node in the binary search tree. So if a key $k_i$ has depth $d(k_i)$ in the tree, its search cost is $d(k_i) + 1$.

To model the search time for non-existent keys, every empty child position is filled with a dummy leaf; in particular, each leaf $k_i$ is given the two child nodes $d_{i-1}$ and $d_i$. If a leaf $d_i$ is reached during the search, the key searched for is not contained in the binary search tree.

For a given search tree $T$, the expected search time can be calculated:

$\begin{aligned} E(T) &= \sum_{i=1}^{n} (d(k_i) + 1)\, p_i + \sum_{i=0}^{n} (d(d_i) + 1)\, q_i \\ &= \sum_{i=1}^{n} d(k_i)\, p_i + \sum_{i=1}^{n} p_i + \sum_{i=0}^{n} d(d_i)\, q_i + \sum_{i=0}^{n} q_i \\ &= 1 + \sum_{i=1}^{n} d(k_i)\, p_i + \sum_{i=0}^{n} d(d_i)\, q_i \end{aligned}$
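The formula for $E(T)$ can be checked on a small tree with a few lines of Python. This is only an illustrative sketch: the function name `expected_search_time` and the representation of a tree by two lists of node depths are assumptions made for this example, not part of the algorithm.

```python
def expected_search_time(key_depths, dummy_depths, p, q):
    """Evaluate E(T) = sum (d(k_i)+1) p_i + sum (d(d_i)+1) q_i.

    key_depths[i-1]  is d(k_i) for i = 1..n (hypothetical representation),
    dummy_depths[i]  is d(d_i) for i = 0..n,
    p[i-1] is p_i and q[i] is q_i.
    """
    n = len(p)
    assert len(key_depths) == n
    assert len(q) == n + 1 and len(dummy_depths) == n + 1
    hits = sum((d + 1) * pi for d, pi in zip(key_depths, p))
    misses = sum((d + 1) * qi for d, qi in zip(dummy_depths, q))
    return hits + misses


# Tree with k_1 as root and k_2 as its right child:
# d_0 is the left child of k_1; d_1 and d_2 are the children of k_2.
example = expected_search_time(
    key_depths=[0, 1],
    dummy_depths=[1, 2, 2],
    p=[0.4, 0.2],
    q=[0.1, 0.2, 0.1],
)
```

For this tree the value is approximately 1.9, which agrees with the last line of the derivation: $1 + (0 \cdot 0.4 + 1 \cdot 0.2) + (1 \cdot 0.1 + 2 \cdot 0.2 + 2 \cdot 0.1) = 1.9$.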

Recursive computation

The Bellman algorithm calculates the expected search time of an optimal binary search tree recursively over subsequences of the search key sequence. The algorithm is specified by means of matrix recurrences.

Initialization:

$M[i, i-1] = q_{i-1}, \quad 1 \le i \le n+1$

Recursion:

$M[i,j] = \min_{i \le r \le j} \left\{ M[i,r-1] + M[r+1,j] + w(i,j) \right\}, \quad 1 \le i \le j \le n$

Each cell $M[i,j]$ holds the minimum search time of an optimal search tree for the subsequence $i, j$ of the search key sequence $k_i$, where $w(i,j)$ denotes the sum of all search probabilities belonging to the tree for this subsequence. So the minimum search time for the entire sequence is stored in the cell $M[1,n]$.

In the recursion, each choice of $r$ corresponds to choosing $k_r$ as the root of the tree for the subsequence $i, j$. Creating the root increases the depth of every node in its two subtrees by 1, and the root itself is visited in every search within this tree. So the expected search time in this tree must be increased by $w(i,j)$.

$w(i,j)$ is defined as

$w(i,j) = \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l$

and can be calculated efficiently with a matrix recurrence.
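The recurrences above can be sketched in Python as follows. This is a minimal sketch rather than a reference implementation: the function name `optimal_bst_cost` and the list-based tables are assumptions made for this example. For $w$ it uses the recurrence $w(i,i-1) = q_{i-1}$ and $w(i,j) = w(i,j-1) + p_j + q_j$, which follows directly from the definition of $w(i,j)$.

```python
def optimal_bst_cost(p, q):
    """Return M[1, n], the minimum expected search time.

    p[i-1] is p_i for i = 1..n, q[i] is q_i for i = 0..n.
    """
    n = len(p)
    assert len(q) == n + 1
    p = [0] + list(p)  # shift so that p[i] is p_i, as in the formulas

    # Rows i = 1..n+1, columns j = 0..n (row 0 is unused padding).
    w = [[0] * (n + 1) for _ in range(n + 2)]
    M = [[0] * (n + 1) for _ in range(n + 2)]

    # Initialization: empty subsequences contain only the dummy leaf d_{i-1}.
    for i in range(1, n + 2):
        w[i][i - 1] = q[i - 1]
        M[i][i - 1] = q[i - 1]

    # Fill the tables by increasing subsequence length j - i + 1.
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            M[i][j] = min(
                M[i][r - 1] + M[r + 1][j] for r in range(i, j + 1)
            ) + w[i][j]
    return M[1][n]


# Example instance with five keys (probabilities sum to 1).
cost = optimal_bst_cost(
    [0.15, 0.10, 0.05, 0.10, 0.20],
    [0.05, 0.10, 0.05, 0.05, 0.05, 0.10],
)
```

For the example instance the result is 2.75 up to floating-point rounding. Since $w(i,j)$ does not depend on $r$, the sketch adds it outside the minimum, which is equivalent to the recursion as stated.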

Backtracking

In order to construct an optimal search tree with the minimum expected search time, the calculation of the optimal value $M[1,n]$ must be traced back using backtracking. Alternatively, an implementation of the algorithm can use an additional auxiliary matrix, which is filled with the optimal value of $r$ for each $i, j$ during the calculation of $M$ and is evaluated after the calculation of $M$ has been completed.
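The variant with an auxiliary matrix might look like the following sketch. The names `optimal_bst`, `R` and `build`, as well as the representation of a tree as nested tuples `(left, r, right)` with `None` for a dummy leaf, are assumptions made for this illustration. Exact fractions are used in the example so that ties between equally good roots are resolved deterministically (the smallest such $r$ is kept).

```python
from fractions import Fraction


def optimal_bst(p, q):
    """Return (M[1, n], tree) where tree is (left, r, right) or None."""
    n = len(p)
    assert len(q) == n + 1
    p = [0] + list(p)  # shift so that p[i] is p_i

    w = [[0] * (n + 1) for _ in range(n + 2)]
    M = [[0] * (n + 1) for _ in range(n + 2)]
    R = [[0] * (n + 1) for _ in range(n + 2)]  # auxiliary matrix of roots

    for i in range(1, n + 2):
        w[i][i - 1] = M[i][i - 1] = q[i - 1]

    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            # min() returns the first (smallest) r among equally good roots.
            best_r = min(
                range(i, j + 1),
                key=lambda r: M[i][r - 1] + M[r + 1][j],
            )
            R[i][j] = best_r
            M[i][j] = M[i][best_r - 1] + M[best_r + 1][j] + w[i][j]

    def build(i, j):
        # An empty subsequence corresponds to a dummy leaf.
        if i > j:
            return None
        r = R[i][j]
        return (build(i, r - 1), r, build(r + 1, j))

    return M[1][n], build(1, n)


p = [Fraction(x, 100) for x in (15, 10, 5, 10, 20)]
q = [Fraction(x, 100) for x in (5, 10, 5, 5, 5, 10)]
cost, tree = optimal_bst(p, q)
```

In this example the root of the whole tree is $k_2$, with $k_1$ as its left child and $k_5$ as its right child. The recursive `build` helper is adequate for small $n$; for very long key sequences an iterative reconstruction would avoid Python's recursion limit.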

Complexity

The runtime for calculating the matrix of the values $w(i,j)$ is in $\mathcal{O}(n^2)$. The matrix $M$ contains $\mathcal{O}(n^2)$ entries, and each entry must be optimized over $\mathcal{O}(n)$ elements. So the runtime complexity of the algorithm is in $\mathcal{O}(n^3)$ and the memory requirement is in $\mathcal{O}(n^2)$.

The iteration over $r$ in the recursion can be further restricted: if $R[i,j]$ denotes an optimal root for the subsequence $i, j$, it suffices to consider $R[i,j-1] \le r \le R[i+1,j]$. The total runtime of all iterations for subsequences of the same length is then in $\mathcal{O}(n)$. So the total runtime of the algorithm modified in this way is in $\mathcal{O}(n^2)$.
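A sketch of the restricted iteration is given below. It assumes the root-monotonicity bound $R[i,j-1] \le r \le R[i+1,j]$ described by Knuth; the function name `optimal_bst_cost_restricted` and the table layout are assumptions made for this example. Subsequences of length 1 are handled separately because their only possible root is $r = i$.

```python
from fractions import Fraction


def optimal_bst_cost_restricted(p, q):
    """Return M[1, n], searching r only in R[i][j-1] .. R[i+1][j]."""
    n = len(p)
    assert len(q) == n + 1
    p = [0] + list(p)  # shift so that p[i] is p_i

    w = [[0] * (n + 1) for _ in range(n + 2)]
    M = [[0] * (n + 1) for _ in range(n + 2)]
    R = [[0] * (n + 1) for _ in range(n + 2)]

    for i in range(1, n + 2):
        w[i][i - 1] = M[i][i - 1] = q[i - 1]

    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            if length == 1:
                lo, hi = i, i  # a single key is its own root
            else:
                lo, hi = R[i][j - 1], R[i + 1][j]
            # Ties are resolved towards the smallest r in the range.
            best_r = min(
                range(lo, hi + 1),
                key=lambda r: M[i][r - 1] + M[r + 1][j],
            )
            R[i][j] = best_r
            M[i][j] = M[i][best_r - 1] + M[best_r + 1][j] + w[i][j]
    return M[1][n]


p = [Fraction(x, 100) for x in (15, 10, 5, 10, 20)]
q = [Fraction(x, 100) for x in (5, 10, 5, 5, 5, 10)]
restricted_cost = optimal_bst_cost_restricted(p, q)
```

The example uses exact fractions on purpose: with floating-point inputs, rounding can change which of several equally good roots is recorded, so a careful implementation may want to guard against an empty range `lo > hi`.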

Literature

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein: Introduction to Algorithms. 2nd edition. MIT Press, Cambridge (Massachusetts) 2001, ISBN 0-262-03293-7, pp. 356-363.
Donald Ervin Knuth: The Art of Computer Programming 3. Sorting and Searching. 2nd edition. Addison-Wesley Longman, Amsterdam 1998, ISBN 0-201-89685-0, pp. 436-442.

References

Donald Ervin Knuth: The Art of Computer Programming 3. Sorting and Searching. 2nd edition. Addison-Wesley Longman, Amsterdam 1998, ISBN 0-201-89685-0, pp. 436-442.
