Machine learning

Machine learning is a generic term for the "artificial" generation of knowledge from experience: an artificial system learns from examples and can generalize from them once the learning phase has ended. To do this, machine learning algorithms build a statistical model based on training data. The examples are therefore not simply memorized; instead, patterns and regularities in the training data are recognized. As a result, the system can also assess unseen data (learning transfer), or it can fail on unseen data (overfitting). The wide range of possible applications includes automated diagnostic methods, detection of credit card fraud, stock market analysis, classification of nucleotide sequences, speech and text recognition, and autonomous systems.
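The difference between memorizing examples and recognizing a regularity can be illustrated with a minimal sketch. The toy data (points on the line y = 2x + 1) and the names fit_line, memorized and predict are assumptions chosen for illustration; they do not come from any particular library.

```python
# Toy training data following y = 2x + 1 (an assumption made for illustration).
train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def fit_line(pairs):
    """Least-squares fit of y = a*x + b to (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


# "Learning by heart": a lookup table that only knows the training inputs.
memorized = dict(train)

# Learning a regularity: the fitted line summarizes the pattern in the data.
a, b = fit_line(train)


def predict(x):
    return a * x + b


unseen_x = 10.0
rote_answer = memorized.get(unseen_x)  # None, the table has no entry for 10.0
model_answer = predict(unseen_x)       # extrapolates from the learned pattern
```

The lookup table has nothing to say about an input it has never seen, whereas the fitted model produces an answer from the learned regularity.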

The topic is closely related to "Knowledge Discovery in Databases" and "Data Mining", which, however, are primarily concerned with finding new patterns and regularities. Many algorithms can be used for either purpose. Methods of knowledge discovery in databases can be used to produce or preprocess training data for machine learning. Conversely, machine learning algorithms are used in data mining. The term must also be distinguished from "deep learning", which is only one possible learning variant, based on artificial neural networks.

Symbolic and non-symbolic approaches

In machine learning, the type and expressive power of the knowledge representation play an important role. A distinction is made between symbolic approaches, in which the knowledge (both the examples and the induced rules) is explicitly represented, and non-symbolic approaches, such as neural networks, which are "trained" to behave in a predictable manner but do not allow any insight into the learned solutions; here the knowledge is represented implicitly.

In the symbolic approaches, a distinction is made between propositional and predicate logic systems. Representatives of the former are ID3 and its successor C4.5. The latter are developed in the field of inductive logic programming.
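As a rough illustration of how a propositional learner such as ID3 chooses what to test first, the following sketch computes the information gain of each attribute, the entropy-based criterion on which ID3 is built. The tiny data set and the names entropy, information_gain and rows are hypothetical, and the sketch covers only the attribute selection step, not the full tree construction.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` obtained by splitting on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical examples: each row is an explicit, human-readable fact.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

gains = {a: information_gain(rows, a, "play") for a in ("outlook", "windy")}
best_attribute = max(gains, key=gains.get)  # attribute to test at the root
```

In this toy data the attribute "outlook" separates the classes completely, so it would become the root test; the resulting rule remains explicitly readable, which is the hallmark of the symbolic approach.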

Algorithmic approaches

The practical implementation takes place by means of algorithms. The various machine learning algorithms can be roughly divided into two groups: supervised learning and unsupervised learning.

Supervised learning

The algorithm learns a function from given pairs of inputs and outputs. During learning, a "teacher" provides the correct function value for each input. The aim of supervised learning is that, after several passes over different inputs and outputs, the network is able to form associations. One area of supervised learning is automatic classification. An example application is handwriting recognition.
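A minimal sketch of learning from input/output pairs is the classic perceptron rule, shown here on the logical AND function. The choice of the perceptron, the data and the names train_perceptron and classify are assumptions made for illustration; the "teacher" is simply the target value stored with each input.

```python
def train_perceptron(pairs, epochs=20, lr=1.0):
    """Learn weights and a bias from (input vector, target) pairs."""
    weights = [0.0] * len(pairs[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in pairs:
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            output = 1 if activation > 0 else 0
            error = target - output  # the teacher's correction signal
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


def classify(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Input/output pairs for logical AND (linearly separable, so the rule converges).
pairs = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(pairs)
predictions = [classify(weights, bias, x) for x, _ in pairs]
```

After training, the predictions reproduce the targets for all four inputs. Problems that are not linearly separable would need a different model.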

A few further subcategories of supervised learning that are frequently mentioned in the literature can be identified:

Semi-supervised learning: The associated outputs are known for only some of the inputs.

Reinforcement learning: The algorithm learns, through reward and punishment, a tactic for how to act in potentially occurring situations in order to maximize the benefit to the agent (i.e. the system to which the learning component belongs). This is the most common form of human learning.

Active learning: The algorithm can ask for the correct output for some of the inputs. It must select the queries that promise a high information gain in order to keep the number of queries as small as possible.

Self-training: This algorithm can be divided into two main components. The first component (the teacher) derives further data sets with pseudo-labels from an existing labeled data set. The second component then learns from the extended labeled data set and applies the patterns it finds to its own model.
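The two components of self-training can be sketched with a deliberately simple model. Here both teacher and student are nearest-centroid classifiers on one-dimensional data; this model choice, the data and the names fit_centroids and classify are assumptions for illustration only.

```python
def fit_centroids(points):
    """Return the mean feature value for each label in (value, label) pairs."""
    sums, counts = {}, {}
    for x, label in points:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(centroids, x):
    """Assign x to the label whose centroid is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


labeled = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
unlabeled = [1.5, 2.5, 3.0, 7.0, 7.5, 8.5]

# Component 1 (teacher): trained on the labeled data, produces pseudo-labels.
teacher = fit_centroids(labeled)
pseudo_labeled = [(x, classify(teacher, x)) for x in unlabeled]

# Component 2 (student): learns from the extended labeled data set.
student = fit_centroids(labeled + pseudo_labeled)
```

Practical variants usually keep only pseudo-labels the teacher is confident about and may repeat the cycle several times. This sketch accepts every pseudo-label in a single round.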

Unsupervised learning

For a given set of inputs, the algorithm generates a statistical model that describes the inputs, contains recognized categories and relationships, and thus enables predictions. Clustering methods divide the data into several categories that differ from one another by characteristic patterns. The network thus independently creates classifiers according to which it divides the input patterns. An important algorithm in this context is the EM algorithm, which iteratively determines the parameters of a model so that it optimally explains the observed data. It assumes the existence of unobservable categories and alternates between estimating which category each data point belongs to and estimating the parameters that define the categories. One application of the EM algorithm is in hidden Markov models (HMMs). Other methods of unsupervised learning, e.g. principal component analysis, dispense with categorization. They aim to translate the observed data into a simpler representation that reproduces them as accurately as possible despite drastically reduced information.
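The alternation described for the EM algorithm can be sketched for a mixture of two one-dimensional Gaussians. The mixture model, the data, the starting means and the names gaussian and em_two_gaussians are assumptions chosen to keep the example small; the variance floor of 1e-6 is likewise an illustrative safeguard.

```python
from math import exp, pi, sqrt


def gaussian(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return exp(-((x - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)


def em_two_gaussians(data, initial_means, iterations=50):
    means = list(initial_means)
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: estimate how strongly each point belongs to each category.
        resp = []
        for x in data:
            p = [weights[k] * gaussian(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate the parameters that define each category.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # guard against a collapsing variance
            weights[k] = nk / len(data)
    return means, variances, weights


# Unlabeled data drawn around two hidden categories (near 1 and near 5).
data = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
means, variances, weights = em_two_gaussians(data, initial_means=[0.0, 6.0])
```

With these inputs the estimated means settle close to 1 and 5, although no category labels were ever provided.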

Furthermore, a distinction is made between batch learning, in which all input/output pairs are available at the same time, and continuous (sequential) learning, in which the structure of the network develops incrementally over time.

In addition, a distinction is made between off-line learning, in which all data is stored and can therefore be accessed repeatedly, and on-line learning, in which each data item is discarded after it has been processed once and the weights have been adjusted. Batch training is always off-line, and on-line training is always incremental. Incremental learning, however, can be done on-line or off-line.
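The contrast between batch training on stored data and on-line training on a stream can be sketched with a one-parameter linear model y = w * x fitted by gradient descent. The model, the learning rate, the synthetic data (y = 3x) and the names batch_fit, online_update and stream are assumptions for illustration.

```python
def batch_fit(pairs, lr=0.05, epochs=200):
    """Off-line batch training: every step uses all stored pairs."""
    w = 0.0
    for _ in range(epochs):
        grad = sum((w * x - y) * x for x, y in pairs) / len(pairs)
        w -= lr * grad
    return w


def online_update(w, x, y, lr=0.05):
    """On-line training: adjust the weight from a single example."""
    return w - lr * (w * x - y) * x


def stream(n=1000):
    """Hypothetical data stream; each pair is produced once and not stored."""
    for i in range(n):
        x = 1.0 + (i % 3)
        yield x, 3.0 * x


# Batch: the whole data set is kept and revisited in every epoch.
stored = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
w_batch = batch_fit(stored)

# On-line: each example adjusts the weight once and is then discarded.
w_online = 0.0
for x, y in stream():
    w_online = online_update(w_online, x, y)
```

Both procedures arrive at a weight close to 3 here. The on-line variant never needs more than one example in memory, which is the property described above.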

Software

Caffe is a program library for deep learning.

Deeplearning4j is open source software programmed in Java that implements an artificial neural network.

ELKI is open source software programmed in Java with an emphasis on unsupervised learning and with index support to accelerate algorithms.

GNU R is free statistics software available on many platforms with extensions for machine learning (e.g. rpart, randomForest) and data mining.

Matlab is proprietary software with libraries and user interfaces for machine learning.

ML.NET is a free machine learning library from Microsoft for .NET languages. One component of this is Infer.NET, which is a cross-platform open source framework for statistical modeling and online learning.

Keras offers a uniform interface for various backends, including TensorFlow, Microsoft Cognitive Toolkit (formerly CNTK) and Theano.

KNIME is an open source data mining, workflow and data pipelining software.

OpenNN is a program library written in C++ that implements an artificial neural network.

PHP-ML is a machine learning library in PHP. It is freely available on GitLab.

PyTorch is an open source program library for the Python programming language geared towards machine learning. With LibTorch there is also a native C++ API available.

RapidMiner is an operator-based graphical user interface for machine learning with commercial support, but also a community edition.

Scikit-learn is a machine learning library that builds on the open source numerical and scientific Python libraries NumPy and SciPy.

Shogun is an open source toolbox for kernel methods.

TensorFlow is an open source machine learning software library developed by Google.

WEKA is a Java-based open source software with numerous learning algorithms.

Literature

Sebastian Raschka, Vahid Mirjalili: Machine Learning with Python and Scikit-Learn and TensorFlow: The comprehensive practical manual for data science, predictive analytics and deep learning. MITP-Verlags GmbH & Co. KG, December 13, 2017, ISBN 978-3-95845-735-5.
Andreas C. Müller, Sarah Guido: Introduction to Machine Learning with Python. O'Reilly-Verlag, Heidelberg 2017, ISBN 978-3-96009-049-6.
Christopher M. Bishop: Pattern Recognition and Machine Learning. Information Science and Statistics. Springer-Verlag, Berlin 2008, ISBN 978-0-387-31073-2.
David J. C. MacKay: Information Theory, Inference and Learning Algorithms. Cambridge University Press, Cambridge 2003, ISBN 978-0-521-64298-9 (online).
Trevor Hastie, Robert Tibshirani, Jerome Friedman: The Elements of Statistical Learning. Data Mining, Inference, and Prediction. 2nd Edition. Springer-Verlag, 2008, ISBN 978-0-387-84857-0 (stanford.edu [PDF]).
Thomas Mitchell: Machine Learning. McGraw-Hill, London 1997, ISBN 978-0-07-115467-3.
D. Michie, D. J. Spiegelhalter: Machine Learning, Neural and Statistical Classification. In: Ellis Horwood Series in Artificial Intelligence. E. Horwood Publishing, New York 1994, ISBN 978-0-13-106360-0.
Richard O. Duda, Peter E. Hart, David G. Stork: Pattern Classification. Wiley, New York 2001, ISBN 978-0-471-05669-0.
David Barber: Bayesian Reasoning and Machine Learning. Cambridge University Press, Cambridge 2012, ISBN 978-0-521-51814-7.

Web links

Machine Learning Crash Course. In: developers.google.com. Retrieved November 6, 2018.

Heinrich Vasce: Machine Learning - Basics. In: Computerwoche. July 13, 2017, accessed January 16, 2019.

Miroslav Stimac: This is how developers get started with machine learning. In: golem.de. November 12, 2018.

Introduction to Machine Learning (English)

Reading list deep learning (English)

Machine learning overview page of the AITopics Group (English)

Machine learning open source software

List of Machine Learning APIs. mashape, April 16, 2013, accessed April 16, 2013 (list of 40+ machine learning web APIs).

Learning machines - reaching your goal without understanding, science feature, Deutschlandfunk, April 10, 2016. Audio, manuscript

References

Tobias Reitmaier: Active learning for classification problems using structural information. kassel university press, Kassel 2015, ISBN 978-3-86219-999-0, p. 1 (Google Books).
Lillian Pierson: Data Science for Dummies. 1st edition. Wiley-VCH Verlag, Weinheim 2016, ISBN 978-3-527-80675-1, pp. 105 f. (Google Books).
Pat Langley: The changing science of machine learning. In: Machine Learning. Vol. 82, no. 3, February 18, 2011, pp. 275-279, doi:10.1007/s10994-011-5242-y.
ftp://ftp.sas.com/pub/neural/FAQ.html#questions
Ralf Mikut: Data Mining in Medicine and Medical Technology. KIT Scientific Publishing, 2008, ISBN 978-3-86644-253-5, p. 34 (Google Books).
Paul Fischer: Algorithmic learning. Springer-Verlag, 2013, ISBN 978-3-663-11956-2, pp. 6-7 (Google Books).
Self-training with Noisy Student improves ImageNet classification. In: Arxiv. Retrieved December 20, 2019.
ftp://ftp.sas.com/pub/neural/FAQ2.html#A_styles
