Top-Down Induction of Decision Trees

from Wikipedia, the free encyclopedia

Top-Down Induction of Decision Trees or TDIDT for short is a non-incremental learning method in the research field of machine learning that is based on the use of decision trees .

A learning set L of examples and a set T of available tests serve as a starting point . The function F represents a termination condition for a node. Furthermore, a method M is required that enables a test t to be selected from T.

Starting from the root node, each successor node is now recursively examined to determine whether the termination condition F is fulfilled at this node. If so, the node is defined as a leaf and labeled with the output of F. If the node could not be identified as a leaf, a test t is selected from T using M , and the node is labeled with it. For in this branch the following node t from the set T removed. Due to the conditions of t , corresponding successor nodes with connecting edges are formed from the current one. The set of examples L is also divided into disjoint subsets on the following nodes due to the conditions of t . During the recursion through all nodes, the learning set L and the set of available tests T change until finally these sets ( e.g. L ) are empty. All examples from L were assigned to one sheet.

Of course, the goal must be to obtain a decision tree that is as efficient as possible, i.e. the smallest possible. This can be achieved from the outset in that the method M selects a test in each case that splits the available examples L into subsets of the same size as possible. During construction, termination conditions F can be used to aim for termination as early as possible. In retrospect, techniques such as tree pruning can be used to make the tree smaller.

As a non-incremental learning process, TDIDT must be completely rebuilt when the examples L are changed due to new observations (i.e. new examples) or changes in behavior among each other.

Frequently used TDIDT methods are ID3 and C4.5 .

See also

literature

  • JR Quinlan: Induction of Decision Trees . In: Machine Learning 1 . Kluwer Academic Publishers, Boston 1986, p. 81–106 ( oregonstate.edu [PDF; accessed July 10, 2010]).