Feature subset selection


Feature subset selection (FSS), or feature selection for short, is an approach in machine learning in which only a subset of the available features is used by a learning algorithm. FSS is necessary because it is sometimes technically infeasible to include all features, or because discrimination becomes difficult when a large number of features but only a small number of training examples are available.

Filter approach

A measure that distinguishes between the classes is chosen, each feature is weighted by this measure, and the best-scoring features are selected. The learning algorithm is then applied to this feature subset. Filters evaluate intrinsic properties of the data either univariately (e.g. Euclidean distance, chi-square test) or multivariately (e.g. correlation-based filters).
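
As a minimal sketch of a univariate filter, assuming Python with scikit-learn and its bundled iris data (the article prescribes neither): the features are scored with the chi-square test and only the best-scoring ones are handed to the learner.

    # Univariate filter: score each feature with chi-square, keep the best two.
    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SelectKBest, chi2

    X, y = load_iris(return_X_y=True)        # illustrative dataset

    selector = SelectKBest(score_func=chi2, k=2)
    X_reduced = selector.fit_transform(X, y)

    print(selector.scores_)    # per-feature chi-square scores
    print(X_reduced.shape)     # (150, 2): only the two best features remain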

Advantages:

  • Fast to compute
  • Scalable
  • Intuitive to interpret

Disadvantages:

  • Selects redundant features (correlated features receive similar weights)
  • Ignores interactions with the learning algorithm

Wrapper approach

The space of all possible feature subsets is searched, and the learning algorithm is applied to each candidate subset in order to evaluate it. The search can be deterministic (e.g. forward selection, backward elimination) or randomized (e.g. simulated annealing, genetic algorithms).
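
The following is a hedged sketch of a deterministic wrapper search (greedy forward selection); scikit-learn, the logistic-regression learner, and the wine dataset are illustrative assumptions, not part of the article.

    # Forward selection: grow the subset one feature at a time, scoring each
    # candidate subset by retraining the learner under cross-validation.
    from sklearn.datasets import load_wine
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = load_wine(return_X_y=True)        # illustrative dataset
    model = LogisticRegression(max_iter=5000)

    selected, remaining = [], list(range(X.shape[1]))
    best_score = 0.0
    while remaining:
        # Score every one-feature extension of the current subset.
        scores = {f: cross_val_score(model, X[:, selected + [f]], y, cv=5).mean()
                  for f in remaining}
        f_best, s_best = max(scores.items(), key=lambda kv: kv[1])
        if s_best <= best_score:             # stop when nothing improves
            break
        selected.append(f_best)
        remaining.remove(f_best)
        best_score = s_best

    print(selected, best_score)

Note that the learner is retrained for every candidate subset, which is exactly why wrappers are time-consuming.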

Advantages:

  • Finds a feature subset that optimally fits the learning algorithm
  • Considers combinations of features, not just each feature in isolation
  • Removes redundant features
  • Easy to implement
  • Interacts with the learning algorithm

Disadvantages:

  • Very time-consuming
  • Heuristic search procedures risk finding only local optima
  • Risk of overfitting the data
  • Dependence on the learning algorithm

Embedded approach

The search for an optimal subset is integrated directly into the learning algorithm: feature selection takes place as part of model training.

Advantages:

• Better runtimes and lower complexity
  • Dependencies between data points are modeled

Disadvantage:

  • The choice of the subset strongly depends on the learning algorithm used.

Examples:

  • Decision trees
  • Weighted naive Bayes
  • Selection of the subset using the weight vector of an SVM (see the sketch below)
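
As a hedged illustration of the last example, the sketch below assumes scikit-learn: an L1-penalized linear SVM drives many entries of its weight vector to exactly zero during training, so the surviving features form the selected subset.

    # Embedded selection: the L1 penalty makes training and selection one step.
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import SelectFromModel
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC

    X, y = load_breast_cancer(return_X_y=True)    # illustrative dataset
    X = StandardScaler().fit_transform(X)

    svm = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=10000).fit(X, y)
    selector = SelectFromModel(svm, prefit=True)  # keep features with nonzero weight

    print(X.shape[1], "->", selector.transform(X).shape[1])  # features kept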
