Heap (data structure)
A heap (English literally pile or heap ) in computer science is usually on trees based abstract data structure . Objects or elements can be stored in a heap and removed from it again. They are used to store quantities . The elements are assigned a key that defines the priority of the elements. The elements themselves are also often used as keys.
A total order must therefore be defined for the set of keys, which defines the order of the inserted elements. For example, the set of integers together with the comparison operator <could act as a key set.
The term heap is often understood to mean the same thing as a partially ordered tree (see figure). Occasionally one understands only a special form of implementation of a partially ordered tree, namely the embedding in an array , as a heap.
Heaps are mainly used where it is important to quickly remove an element with the highest priority from the heap ( HIFO principle ), for example with priority queues .
Operations
Depending on the type, heaps support a whole range of operations . The main operations are
- insert
- to insert an element,
- remove
- to remove an element and
- extractMin
- to return and remove an element with a minimum key, i.e. the highest priority.
In addition, the changeKey operations for adapting the key and merge for merging two heaps are often provided. The changeKey operation is often formed by executing the remove and insert operations one after the other , with the key being adjusted after the removal and before the reinsertion . In some cases, heaps only offer the decreaseKey operation instead of changeKey , in which the key can only be reduced.
Heap condition
A distinction is made between min heaps and max heaps. In min-heaps , the property that the keys of the children of a node are always greater than or equal to the key of their father is called the heap condition . This means that at the root of the tree there is always an element with a minimal key in the tree. Conversely, the heap condition with max heaps requires that the keys of the children of a node are always less than or equal to those of their father. There is always an element with the maximum key at the root of the tree.
Mathematically, the only difference between the two variants is the opposing order of the elements. Since the definition of ascending and descending is purely arbitrary, it is therefore a question of interpretation whether a specific implementation is a min heap or a max heap.
It often happens that a heap is supposed to map the heap property in both directions. In this case, two heaps are simply created, one according to the smaller and the other according to the larger relation. Such a data structure is then called a double heap or deap for short .
When using double heaps, it should be noted that not every implementation of heaps retains its runtime behavior for the individual operations. For example, Fibonacci heaps only support a decreaseKey to decrease the key with amortized constant running time . A more general changeKey for changing the key, which would be needed in the case of such a double heap, needs at least a logarithmic runtime amortized.
A variant of heaps that supports the heap condition only to a limited extent or in a modified form are so-called min-max heaps . These are double heaps which, thanks to the modified form of the heap condition, can hold the data in just one tree.
implementation
Heaps are usually implemented with an implicit heap data structure, which is an implicit data structure made up of a fixed-size array or a dynamic array, with each element representing the node of a tree whose parent-child relationship is implicitly defined its index is defined. After an element has been added to or deleted from a heap, the heap property can be violated and the heap must be balanced by exchanging elements within the array.
In an implicit heap data structure, the first or last element contains the root . The next two elements of the array contain its children. The next four elements contain the four children of the two child nodes, etc. Thus, the children of the node at position n would be at positions 2n and 2n + 1 in an array with starting index 1 or at positions 2n + 1 and 2n + 2 in one Array with start index 0. Calculating the index of the parent node of the element at position n is also straightforward. For an array with start index 1, the parent element is at position n / 2 . In the case of an array with start index 0, the parent element at position (n - 1) / 2 . This enables the tree to be traversed by simple index calculations. Balancing a heap is done by swapping out elements that are out of order. Since we can create a heap from an array without using additional memory, Heapsort can be used to sort an array in-place .
Different types of heaps implement the operations in different ways. In particular, however, the insertion is often done by adding the new element to the end of the heap in the first available free space. This generally violates the heap property. As a result, the items are moved up until the heap property is restored. Similarly, root deletion is done by removing the root and then inserting the last element into the root and sifting it to compensate. The replacement is done by deleting the root and inserting the new element in the root and sifting through, avoiding an upward step compared to lowering the last element followed by ascending the new element.
Variants of heaps
There are numerous types of heaps with different runtime behavior for the various operations they make available. Examples of heaps are:
In addition, there is a variant with soft heaps in which a better runtime behavior is achieved in that a certain proportion of the keys can be corrupted. These corrupted keys no longer have their original value.
Runtime behavior
The following table shows the runtimes of various heaps with regard to their operations.
surgery | Binary heap | Binomial heap | Fibonacci heap | Pairing | Leftist | 2-3 heap |
---|---|---|---|---|---|---|
amortized | amortized | at thelocated | ||||
extract-min | ||||||
remove | ||||||
insert | ( presumed) | or | ||||
decrease-key | ( presumed) | or | ||||
merge | ( presumed) | or |
application
Heaps have a wide range of uses. Often it is mainly used in priority queues , such as those used in servers or operating systems to determine the sequence of execution of tasks bare required.
However, there are also applications that do not explicitly require the use of queues . In general, a heap is an ideal data structure for greedy algorithms that gradually make local, optimized decisions. For example, in the Heapsort sorting algorithm, a binary heap is used for sorting. Fibonacci heaps are used in the Dijkstra and Prim algorithms .
See also
literature
- Thomas H. Cormen, Charles E. Leiserson , Ronald L. Rivest , Clifford Stein: Introduction to Algorithms . 2nd Edition. MIT Press, Cambridge (Massachusetts) 2001, ISBN 0-262-03293-7 , pp. 127 .
- Donald E. Knuth : The Art of Computer Programming : Volume 3 Sorting and Searching . 2nd Edition. Addison-Wesley, Reading MA 1997, ISBN 0-201-89685-0 , pp. 144-155 .