Cartesian tree

from Wikipedia, the free encyclopedia
A sequence of numbers and the Cartesian tree derived from them.

A Cartesian tree is a binary heap derived from a sequence of numbers , with the additional property that an in-order run returns the original sequence. The Cartesian tree for a sequence can be constructed in linear time.

definition

The Cartesian tree of a sequence of totally ordered elements is defined as follows:

  1. The Cartesian tree is a binary tree .
  2. There is exactly one node in the tree for each element of the sequence .
  3. An in-order traversal of the tree provides the original sequence. That means: the left subtree of the root consists of the elements that are in the sequence before the root element, the right subtree of the elements after it; this applies to every subtree.
  4. The tree fulfills the heap property (hereinafter always min heap).

According to the heap property, the root element is the smallest element in the sequence. This means that the tree can also be defined recursively: the root is the minimum value of the sequence and the left and right subtree are the Cartesian trees of the subsequences to the left and right of the root.

If the elements of the sequence are not different in pairs, their Cartesian tree is not uniquely determined. The uniqueness can be ensured by choosing a deterministic tie-break rule (for example: "Consider the first occurrence of two identical elements as the smaller").

construction

The recursive definition already results in a naive construction method with a worst-case runtime . However, the construction of a Cartesian tree of a given sequence is possible in linear time. For this purpose, the sequence of elements is iterated from left to right so that the Cartesian tree of the first elements is already available at every point in time (i.e. in iteration ) . To add the next element in the next iteration , start at the node that corresponds to the previous element and from there follow the path to the root until the deepest node is reached whose associated element is smaller than . The node for is now appended to as a right subtree and the previously right subtree of will instead become the left subtree of the newly inserted node . In the special case that no element is found on the path to the root (inclusive) that is smaller than , the root of the new Cartesian tree becomes . To do this, the old root is attached as a left child.

The total runtime for the construction is linear in the number of sequence elements, since the time for the search for the node can be offset against the number of nodes that are no longer on the right-most path after the iteration during the operations to insert the new one Node and relocating a subtree are possible in constant time.

Applications

Range minimum query

A minimum range query (RMQ) on a sequence searches for the minimum element of a contiguous subsequence. In the Cartesian tree is the minimum of the partial sequence in the deepest common ancestor (find lowest common ancestor ) of the first and the last node of the subsequence. In the example sequence used above, for example, the minimum in the Lowest Common Ancestor of the first and last element ( and ) can be found for the partial sequence .

Since the Lowest Common Ancestor can be determined using a data structure with linear storage space in linear time, RMQs within these limits can also be answered with the help of a Cartesian tree.

3-page area inquiries

The name of the Cartesian tree comes from its use to answer three-way range inquiries in the Cartesian plane. Find all points that meet conditions (1) and (2) . To do this, the points are first sorted according to the x coordinate and the Cartesian tree with heap property is constructed with respect to the y coordinate. For a specific query, the points that fulfill condition (1) now form a coherent partial sequence, with the first node of the partial sequence (with the smallest x coordinate) and the last (with the largest x coordinate) being. The lowest common ancestor of and contains the point from this interval with the minimum y-coordinate. If the y-coordinate is smaller than the limit , ( i.e. lies in the searched area), the search is carried out recursively on the partial sequences between and and between and . In this way, after the nodes and have been determined once (e.g. with a binary search), all points within the searched area can be determined in constant time per point.

history

Cartesian trees go back to Vuillemin (1980), who described a special case of the Cartesian trees described above for a sequence of points in the Cartesian coordinate system: The heap property refers to the y-coordinate of the points, and an in-order run provides the sorted sequence of the x-coordinates. Gabow, Bentley, and Tarjan (1984) and other authors followed the definition given here, in which a Cartesian tree is defined for any sequence and thus abstracted from the original geometric problem.

Individual evidence

  1. a b c Gabow, HN, JL Bentley, and RE Tarjan: Scaling and related techniques for geometry problems . In: Proceedings of the sixteenth annual ACM symposium on Theory of Computing . 1984, pp. 135-143. doi : 10.1145 / 800057.808675 .
  2. Baruch Schieber, Uzi Vishkin: On finding lowest common ancestors: simplification and parallelization . In: SIAM Journal on Computing . 17, No. 6, 1988, pp. 1253-1262. doi : 10.1137 / 0217079 .
  3. ^ Dov Harel, Robert E. Tarjan: Fast algorithms for finding nearest common ancestors . In: SIAM Journal on Computing . 13, No. 2, 1984, pp. 338-355. doi : 10.1137 / 0213024 .
  4. ^ Jean Vuillemin: A unifying look at data structures . In: ACM (Ed.): Commun. ACM . 23, No. 4, New York, NY, USA, 1980, pp. 229-239. doi : 10.1145 / 358841.358852 .