Iterative Dichotomiser 3

Iterative Dichotomiser 3 (ID3) is an algorithm used in decision making: it generates decision trees from a set of training data. The Australian researcher J. Ross Quinlan first published the algorithm in 1986. ID3 was very influential in its early years and is still used in some products today. It is considered the predecessor of the C4.5 algorithm.

ID3 is used when many different attributes are relevant, the amount of data is large, and a decision tree should therefore be generated without extensive computation. It usually produces simple decision trees; however, there is no guarantee that better trees do not exist.

The basic structure of ID3 is iterative. For each attribute that has not yet been used, the entropy with respect to the training set is calculated. The attribute with the highest information gain (equivalently, the smallest remaining expected entropy) is selected, and a new tree node is generated from it. The process terminates when all training instances have been classified, i.e. when a class is assigned to each leaf node.

Algorithm

If all elements of (data) belong to one class
Then
    // create a leaf //
    Construct a leaf with the class as its label
Else
    // recursively create a subtree //
    Choose the most "informative" feature
    Begin: For_all occurring values of the feature
        Recursively construct a subtree with the corresponding subset as data
    End: For_all
    Construct a tree node labelled with the feature and attach all subtrees created
End Else
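The following is a minimal Python sketch of the procedure above. The data layout (a list of attribute-to-value dictionaries plus a parallel list of class labels), the function names, and the majority-class fallback when no attributes remain are illustrative assumptions rather than part of Quinlan's original formulation; attribute selection by information gain is described in the next section.

from collections import Counter
from math import log2

# Illustrative sketch; names and data layout are assumptions, not Quinlan's notation.

def entropy(labels):
    # Entropy H of a list of class labels.
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    # Gain achieved by splitting (rows, labels) on the given attribute.
    total = len(labels)
    remainder = 0.0
    for value in set(row[attribute] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def id3(rows, labels, attributes):
    # All examples belong to one class: construct a leaf with that class.
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left: fall back to the majority class (a common convention,
    # not spelled out in the pseudocode above).
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Choose the most "informative" attribute (highest information gain).
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    node = {best: {}}
    # Recursively construct one subtree per occurring value of the attribute.
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        node[best][value] = id3(
            [rows[i] for i in idx],
            [labels[i] for i in idx],
            [a for a in attributes if a != best],
        )
    return node

# Usage on a tiny invented data set:
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["play", "stay", "play", "stay"]
print(id3(rows, labels, ["outlook", "windy"]))
# e.g. {'windy': {'no': 'play', 'yes': 'stay'}}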

Selection of attributes (features)

To form each subtree, the attribute that is most informative in the sense of information theory is selected.

Let $E$ be the set of training examples together with their respective classifications, $A$ the attribute to be examined from the set of available attributes, $\mathrm{values}(A)$ the set of possible values of $A$, and $E_v \subseteq E$ the subset of $E$ on which $A$ takes the value $v$. The information gain achieved by selecting $A$ is then calculated as the difference between the entropy of $E$ and the expected (average) entropy of $E$ once $A$ is fixed:

$$IG(E, A) = H(E) - \sum_{v \in \mathrm{values}(A)} \frac{|E_v|}{|E|}\, H(E_v),$$

where $H(E) = -\sum_{c} p_c \log_2 p_c$ is the entropy of $E$ and $p_c$ is the proportion of examples in $E$ belonging to class $c$.
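As a purely illustrative (invented) example: suppose $E$ contains 10 examples, 6 of one class and 4 of the other, and an attribute $A$ splits $E$ into $E_{v_1}$ (4 examples, all of the first class) and $E_{v_2}$ (6 examples, 2 of the first class and 4 of the second). Then

$$H(E) = -(0.6 \log_2 0.6 + 0.4 \log_2 0.4) \approx 0.971, \qquad H(E_{v_1}) = 0, \qquad H(E_{v_2}) \approx 0.918,$$

$$IG(E, A) \approx 0.971 - \left(\tfrac{4}{10}\cdot 0 + \tfrac{6}{10}\cdot 0.918\right) \approx 0.420.$$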

Finally, an attribute with the greatest gain is chosen from the set of available attributes as the next attribute.

This choice favors attributes with many possible values and thus leads to broad trees. To counteract this, the gain can be normalized by the number of possible values.
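One common way such a normalization is realized (in the successor algorithm C4.5 it appears as the gain ratio) is to divide the gain by the split information of the attribute, which grows with the number of its values; this is only a sketch of the idea, not part of ID3 itself:

$$\mathrm{GainRatio}(E, A) = \frac{IG(E, A)}{-\sum_{v \in \mathrm{values}(A)} \frac{|E_v|}{|E|} \log_2 \frac{|E_v|}{|E|}}.$$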

References

  1. ^ J. Ross Quinlan: Induction of Decision Trees. In: Machine Learning. Vol. 1, No. 1, March 1986, pp. 81–106, doi:10.1007/BF00116251.