B * tree

The B * -tree is a data or index structure in computer science and a variant of the B-tree , which was proposed by Donald Knuth in 1973 and differs from the B-tree in the requirement that nodes are at least 2/3 full must (instead of just 1/2 full). This is achieved primarily through a modified split strategy, in which 2 full nodes are divided into 3 nodes with a filling degree of 2/3.

For historical reasons, the name is often used for the B + tree , another B-tree variant in which data is only stored in the leaf nodes and range inquiries are supported more efficiently through the chaining of the leaf nodes.

definition

Knuth defines a B * -tree of the order by the following requirements: ${\ displaystyle m}$

All nodes except the root have at most children, ${\ displaystyle m}$
all nodes except roots and leaves have at least children, ${\ displaystyle {\ frac {2m-1} {3}}}$
the root has at least and at most children, ${\ displaystyle 2}$ ${\ displaystyle 2 \ left \ lfloor {\ frac {2m-2} {3}} \ right \ rfloor +1}$
all leaves are the same distance from the root and
all nodes that are not leaves with children contain keys. ${\ displaystyle k}$ ${\ displaystyle k-1}$

Differences to the B-tree

Overflow handling in a B * tree of order 6: With the first insert, the node overflows into the neighboring node, with the second insert both nodes overflow and a new node is created.

As in the B-tree, the inner nodes in the B * -tree also contain data. The main difference to B-trees lies in the second requirement that nodes must be filled. This requires an adaptation of the B-tree algorithm for overflow handling. Instead of creating a new node immediately in the event of an overflow, the system first checks whether there is still space in the neighboring node on the right. If this is the case, the keys of the two nodes and the separating key in the parent node are distributed equally between the two nodes. If the second node contains keys, the key is the new separating key. If there is also no space in the neighboring node, a new node is created and the keys of the two original nodes, the inserted node and the separating key of the parent node are distributed to all three nodes. The keys and the separating keys are in the parent node. ${\ displaystyle 2/3}$ ${\ displaystyle j}$ ${\ displaystyle \ lfloor (m + j) / 2 \ rfloor +1}$ ${\ displaystyle \ lfloor (2m-2) / 3 \ rfloor +1}$ ${\ displaystyle \ lfloor (4m-1) / 3 \ rfloor}$

The higher data density results in a higher degree of branching and thus a lower tree height. This increases the performance compared to the B-tree, since fewer nodes have to be loaded. For the level of a key and therefore the necessary for an access node of the code applies in a B * tree with keys: . In the B-tree, however, nodes must be loaded for access. ${\ displaystyle l}$ ${\ displaystyle m}$ ${\ displaystyle N}$ ${\ displaystyle l \ leq 1+ \ log _ {\ lceil (2m-1) / 3 \ rceil} {\ frac {N + 1} {2}}}$ ${\ displaystyle l \ leq 1+ \ log _ {\ lceil m / 2 \ rceil} {\ frac {N + 1} {2}}}$

application

Like the B-tree, the B * -tree is optimized for storage in the secondary memory. Data blocks, also called pages, of a fixed size are transferred between the main memory and the secondary memory . Since accesses to the secondary memory take a long time compared to calculations and accesses to the main memory, the number of pages from the secondary memory required for an operation must be minimized. The size of the nodes is therefore chosen so that they fit exactly on one side. This makes the different variants of the B-tree very suitable for databases and file systems .

designation

Another variant of the B-tree is often referred to as a B * -tree, which is also described by Knuth, but is not explicitly named. Hartmut Wedekind gave this tree the name B * tree in 1974, but in 1979 Douglas Comer called it a B ⁺ tree for better delimitation . However, as early as 1977 Rudolf Bayer used the term B * tree for the variant later called B ⁺ tree, so that a clear delimitation could no longer be fully established. The National Institute of Standards and Technology lists the variants of the B-tree according to Comer's subdivision.

Individual evidence

↑ ^a ^b ^c ^d Knuth, DE: The Art of Computer Programming, Volume III: Sorting and Searching Addison-Wesley , 1973, pages 476-479
^ ^A ^b Thomas H. Cormen, Charles E. Leiserson , Ronald L. Rivest , Clifford Stein: Introduction to Algorithms . 2nd Edition. MIT Press, Cambridge (Massachusetts) 2001, ISBN 0-262-03293-7 .
↑ Comer, D .: Ubiquitous B-Tree ACM Computing Surveys 11 , 1979, 2, 121-137
↑ Wedekind, H. On the Selection of Access Paths in a Data Base System IFIP Working Conference Data Base Management, 1974, 385-397
^ Bayer, R. & Unterauer, K .: Prefix B-Trees ACM Trans . Database Syst., 1977, 2, 11-26
^ Paul E. Black: "B * -tree" , in Dictionary of Algorithms and Data Structures [online], Paul E. Black, ed. US National Institute of Standards and Technology. November 6, 2007, accessed January 24, 2011. Available from: http://xlinux.nist.gov/dads/HTML/bstartree.html
↑ Paul E. Black: "B ⁺ tree" , in Dictionary of Algorithms and Data Structures [online], Paul E. Black, ed., US National Institute of Standards and Technology . November 6, 2007, accessed January 24, 2011. Available from: http://xlinux.nist.gov/dads/HTML/bplustree.html

[Knuth1973-1] Knuth, DE: The Art of Computer Programming, Volume III: Sorting and Searching Addison-Wesley , 1973, pages 476-479

[Cormen2001-2] A ^b Thomas H. Cormen, Charles E. Leiserson , Ronald L. Rivest , Clifford Stein: Introduction to Algorithms . 2nd Edition. MIT Press, Cambridge (Massachusetts) 2001, ISBN 0-262-03293-7 .

[Comer1979-3] Comer, D .: Ubiquitous B-Tree ACM Computing Surveys 11 , 1979, 2, 121-137

[Wedekind1974-4] Wedekind, H. On the Selection of Access Paths in a Data Base System IFIP Working Conference Data Base Management, 1974, 385-397

[Bayer1977-5] Bayer, R. & Unterauer, K .: Prefix B-Trees ACM Trans . Database Syst., 1977, 2, 11-26

[Nistbstartree-6] Paul E. Black: "B * -tree" , in Dictionary of Algorithms and Data Structures [online], Paul E. Black, ed. US National Institute of Standards and Technology. November 6, 2007, accessed January 24, 2011. Available from: http://xlinux.nist.gov/dads/HTML/bstartree.html

[Nistbplustree-7] Paul E. Black: "B ⁺ tree" , in Dictionary of Algorithms and Data Structures [online], Paul E. Black, ed., US National Institute of Standards and Technology . November 6, 2007, accessed January 24, 2011. Available from: http://xlinux.nist.gov/dads/HTML/bplustree.html