Bayesian network

A Bayesian network or Bayes network (named after Thomas Bayes) is a directed acyclic graph (DAG) in which the nodes represent random variables and the edges represent conditional dependencies between the variables. Each node of the network is assigned a conditional probability distribution of the random variable it represents, conditioned on the random variables at its parent nodes. In the discrete case these distributions are described by conditional probability tables. The distributions can be arbitrary, but discrete distributions or normal distributions are often used. The parents of a node v are those nodes with an edge leading to v.
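A minimal sketch of this structure in Python: each node stores the list of its parents and a conditional probability table (CPT) that maps an assignment of the parent variables to a distribution over the node's own values. The rain/sprinkler network and all numbers below are illustrative assumptions, not taken from the article.

```python
# Hypothetical example network: node -> (parents, CPT).
# The CPT maps a tuple of parent values to {value_of_node: probability}.
network = {
    "Rain":      ([], {(): {True: 0.2, False: 0.8}}),
    "Sprinkler": (["Rain"], {
        (True,):  {True: 0.01, False: 0.99},
        (False,): {True: 0.40, False: 0.60},
    }),
    "GrassWet":  (["Sprinkler", "Rain"], {
        (True, True):   {True: 0.99, False: 0.01},
        (True, False):  {True: 0.90, False: 0.10},
        (False, True):  {True: 0.80, False: 0.20},
        (False, False): {True: 0.00, False: 1.00},
    }),
}

def parents(node):
    """Parents of a node, i.e. the nodes with an edge leading to it."""
    return network[node][0]

def cpt(node):
    """Conditional probability table attached to the node."""
    return network[node][1]
```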

A Bayesian network is used to represent the joint probability distribution of all variables involved as compactly as possible by exploiting known conditional independencies. The conditional (in)dependence of subsets of the variables is combined with a priori knowledge.

If X_1, ..., X_n are some of the random variables occurring in the graph (a set that is closed under adding parent variables), their joint distribution is calculated as

P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{parents}(X_i)\bigr).

If a node has no parents, the associated probability distribution is an unconditional distribution.
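Assuming the dict-based representation sketched above, this factorization can be written down directly; the following function is an illustrative sketch, not a library API. For a root node the parent tuple is empty, so its CPT entry is exactly the unconditional distribution mentioned above.

```python
def joint_probability(network, assignment):
    """P(X_1, ..., X_n) as the product over all nodes of P(node | parents(node)).

    `assignment` maps every node of the network to a value; for a root node the
    parent tuple is empty, so its CPT entry is an unconditional distribution.
    """
    p = 1.0
    for node, (parents, cpt) in network.items():
        parent_values = tuple(assignment[par] for par in parents)
        p *= cpt[parent_values][assignment[node]]
    return p

# e.g. joint_probability(network, {"Rain": True, "Sprinkler": False, "GrassWet": True})
# multiplies P(Rain=T) * P(Sprinkler=F | Rain=T) * P(GrassWet=T | Sprinkler=F, Rain=T).
```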

Example

[Figure: Example of a Bayesian network]

Inference in Bayesian networks

If the values of some of the variables, say E_1, ..., E_m, are known, i.e. if evidence is available, the conditional probability distribution of X_1, ..., X_n given E_1, ..., E_m can be computed with the help of various algorithms; this is how inference is carried out in a Bayesian network.
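As an illustration of how such a conditional distribution can in principle be obtained, the following sketch performs inference by brute-force enumeration over the dict-based network representation assumed above. It is exponential in the number of unobserved variables; the algorithms discussed below are far more efficient.

```python
import itertools

def query(network, variable, evidence):
    """P(variable | evidence) by enumerating the factorized joint distribution."""
    def joint(assignment):
        p = 1.0
        for node, (parents, cpt) in network.items():
            p *= cpt[tuple(assignment[par] for par in parents)][assignment[node]]
        return p

    # possible values of each node, read off from its CPT
    values = {node: list(next(iter(cpt.values())).keys())
              for node, (_, cpt) in network.items()}
    hidden = [n for n in network if n not in evidence and n != variable]

    dist = {}
    for v in values[variable]:
        total = 0.0
        for combo in itertools.product(*(values[h] for h in hidden)):
            assignment = dict(evidence)
            assignment[variable] = v
            assignment.update(zip(hidden, combo))
            total += joint(assignment)
        dist[v] = total
    norm = sum(dist.values())
    return {v: p / norm for v, p in dist.items()}

# e.g. query(network, "Rain", {"GrassWet": True})
```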

The inference problem in Bayesian networks, both exact and approximate, is NP-hard. In larger networks, however, approximate methods can be used instead. Exact methods are somewhat more precise than approximate ones, but in practice this often plays only a minor role, since Bayesian networks are used for decision making, where exact probabilities are not required.

It should be noted that software implementations of exact inference procedures often use floating-point numbers of only double precision. This limits the accuracy of these calculations.
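As a generic illustration of this precision issue (not a description of any particular software package), the following snippet shows how a product of many small probabilities underflows in double precision; working with logarithms is a common mitigation, assumed here rather than taken from the article.

```python
import math

probs = [1e-5] * 80
print(math.prod(probs))                  # 0.0 -- the product underflows in double precision
log_p = sum(math.log(p) for p in probs)  # about -921.0, still representable
print(log_p)
```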

Exact inference

Exact inference in Bayesian networks can be carried out with, among others, variable elimination and the junction tree algorithm.
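The following is a compact, unoptimized sketch of variable elimination over the dict-based network representation used above. The factor representation and the elimination order are simple illustrative choices, not those of any particular implementation.

```python
from collections import defaultdict

def network_factors(network):
    """Turn every CPT of the dict-based network into a factor (variables, table)."""
    factors = []
    for node, (parents, cpt) in network.items():
        variables = list(parents) + [node]
        table = {}
        for parent_values, dist in cpt.items():
            for value, p in dist.items():
                table[tuple(parent_values) + (value,)] = p
        factors.append((variables, table))
    return factors

def restrict(factor, evidence):
    """Drop all table rows that contradict the observed evidence."""
    variables, table = factor
    keep = {row: p for row, p in table.items()
            if all(evidence.get(v, row[i]) == row[i] for i, v in enumerate(variables))}
    return (variables, keep)

def multiply(f1, f2):
    """Pointwise product of two factors over the union of their variables."""
    v1, t1 = f1
    v2, t2 = f2
    variables = v1 + [v for v in v2 if v not in v1]
    table = {}
    for r1, p1 in t1.items():
        for r2, p2 in t2.items():
            if all(r1[v1.index(v)] == r2[v2.index(v)] for v in v1 if v in v2):
                row = tuple(r1[v1.index(v)] if v in v1 else r2[v2.index(v)]
                            for v in variables)
                table[row] = p1 * p2
    return (variables, table)

def sum_out(factor, variable):
    """Marginalize a variable out of a factor."""
    variables, table = factor
    idx = variables.index(variable)
    new_table = defaultdict(float)
    for row, p in table.items():
        new_table[row[:idx] + row[idx + 1:]] += p
    return (variables[:idx] + variables[idx + 1:], dict(new_table))

def variable_elimination(network, query_var, evidence):
    """P(query_var | evidence): eliminate the hidden variables one at a time."""
    factors = [restrict(f, evidence) for f in network_factors(network)]
    for hidden in [n for n in network if n != query_var and n not in evidence]:
        related = [f for f in factors if hidden in f[0]]
        factors = [f for f in factors if hidden not in f[0]]
        product = related[0]
        for f in related[1:]:
            product = multiply(product, f)
        factors.append(sum_out(product, hidden))
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    variables, table = result
    idx = variables.index(query_var)
    dist = defaultdict(float)
    for row, p in table.items():
        dist[row[idx]] += p
    norm = sum(dist.values())
    return {value: p / norm for value, p in dist.items()}

# e.g. variable_elimination(network, "Rain", {"GrassWet": True})
```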

Approximate inference

Approximate inference is typically based on stochastic sampling, for example rejection sampling, likelihood weighting or Markov chain Monte Carlo methods.
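As an illustration of sampling-based approximate inference, the following sketch implements rejection sampling for the dict-based representation assumed above. It is a didactic example, not an efficient method; when the evidence is unlikely, most samples are wasted.

```python
import random

def rejection_sampling(network, query_var, evidence, n_samples=100_000):
    """Approximate P(query_var | evidence) by sampling complete assignments.

    Assumes the `network` dict (node -> (parents, CPT)) lists parents before
    their children, so one pass generates each sample in topological order.
    Samples that contradict the evidence are rejected.
    """
    counts = {}
    for _ in range(n_samples):
        sample = {}
        for node, (parents, cpt) in network.items():
            dist = cpt[tuple(sample[p] for p in parents)]
            r, acc = random.random(), 0.0
            for value, p in dist.items():
                acc += p
                if r < acc:
                    sample[node] = value
                    break
            else:
                sample[node] = value  # guard against rounding at the boundary
        if all(sample[var] == val for var, val in evidence.items()):
            counts[sample[query_var]] = counts.get(sample[query_var], 0) + 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sample matched the evidence; increase n_samples")
    return {v: c / total for v, c in counts.items()}
```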

Inference types

  • Diagnostic: from effects to causes
  • Causal: from causes to effects
  • Inter-causal: between the causes of a common effect
  • Mixed: a combination of the preceding (each direction is illustrated in the sketch below)
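Using the hypothetical sprinkler network and the brute-force `query` function sketched earlier, the three basic directions correspond to different choices of query variable and evidence:

```python
# Causal: from cause to effect -- how likely is wet grass if it rains?
print(query(network, "GrassWet", {"Rain": True}))

# Diagnostic: from effect to cause -- how likely is rain given wet grass?
print(query(network, "Rain", {"GrassWet": True}))

# Inter-causal ("explaining away"): observing one cause of a common effect
# changes the belief in the other cause.
print(query(network, "Sprinkler", {"GrassWet": True}))
print(query(network, "Sprinkler", {"GrassWet": True, "Rain": True}))
```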

Learning Bayesian networks

If a Bayesian network that describes the available data as well as possible is to be generated automatically from that data, two problems can arise. Either the graph structure of the network is already given, so that the conditional independencies no longer have to be determined and only the conditional probability distributions at the nodes of the network have to be computed, or a suitable network structure has to be learned in addition to the parameters.

Parameter learning

If one does not start from a full (Bayesian) probability model, maximum likelihood estimation is generally chosen as the estimation method. In the case of a full (Bayesian) probability model, the natural point estimate is the maximum a posteriori (MAP) estimate. In the case of complete data and fully observed variables, local maxima of the likelihood or a posteriori functions can usually be found with common optimization algorithms such as gradient-based methods. In the case of missing observations (which is the rule rather than the exception), the powerful and widely used expectation-maximization (EM) algorithm is usually employed.
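For complete data and fully observed discrete variables, maximum likelihood estimation reduces to counting. The following sketch (with hypothetical function and parameter names) also shows how a pseudocount can act as a simple Dirichlet-style prior in the spirit of MAP estimation; missing data, and hence EM, is not covered here.

```python
import itertools
from collections import defaultdict

def fit_cpts(structure, data, pseudocount=0.0):
    """Estimate the CPTs of a discrete Bayesian network from fully observed data.

    `structure` maps every node to the list of its parents; `data` is a list of
    complete assignments (dicts). With pseudocount=0 this is the maximum
    likelihood estimate; a positive pseudocount avoids zero probabilities.
    """
    values = {node: sorted({row[node] for row in data}, key=str) for node in structure}
    cpts = {}
    for node, parents in structure.items():
        counts = defaultdict(lambda: defaultdict(float))
        for row in data:
            counts[tuple(row[p] for p in parents)][row[node]] += 1.0
        cpt = {}
        for combo in itertools.product(*(values[p] for p in parents)):
            seen = counts[combo]
            total = sum(seen.values()) + pseudocount * len(values[node])
            if total == 0:
                # parent combination never observed and no prior: fall back to uniform
                cpt[combo] = {v: 1.0 / len(values[node]) for v in values[node]}
            else:
                cpt[combo] = {v: (seen[v] + pseudocount) / total for v in values[node]}
        cpts[node] = cpt
    return cpts

# e.g. fit_cpts({"Rain": [], "Sprinkler": ["Rain"]}, data, pseudocount=1.0)
```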

Structure learning

Structure learning can be performed with, among others, the K2 algorithm (approximate, using a suitable objective function) or the PC algorithm.
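The following sketch illustrates the score-and-search idea behind K2-style structure learning: given a fixed node ordering, parents are added greedily as long as they improve a scoring function (here the BIC score stands in for a "suitable objective function"). It is a simplified illustration under these assumptions, not a full implementation of the K2 or PC algorithm.

```python
import math
from collections import defaultdict

def family_bic(node, parents, data):
    """BIC score of one node given a candidate parent set (discrete data)."""
    node_values = sorted({row[node] for row in data}, key=str)
    counts = defaultdict(lambda: defaultdict(float))
    for row in data:
        counts[tuple(row[p] for p in parents)][row[node]] += 1.0
    loglik = 0.0
    for combo, seen in counts.items():
        total = sum(seen.values())
        for n in seen.values():
            if n > 0:
                loglik += n * math.log(n / total)
    n_params = (len(node_values) - 1) * math.prod(
        len({row[p] for row in data}) for p in parents)
    return loglik - 0.5 * n_params * math.log(len(data))

def greedy_structure(ordering, data, max_parents=2):
    """Greedy parent selection under a fixed node ordering (K2-style sketch)."""
    structure = {}
    for i, node in enumerate(ordering):
        parents, best = [], family_bic(node, [], data)
        candidates = list(ordering[:i])   # only predecessors may become parents
        improved = True
        while improved and candidates and len(parents) < max_parents:
            improved = False
            score, cand = max((family_bic(node, parents + [c], data), c)
                              for c in candidates)
            if score > best:
                best, improved = score, True
                parents.append(cand)
                candidates.remove(cand)
        structure[node] = parents
    return structure
```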

Conditional independence

To determine the conditional independence of two sets of variables given a third set, it suffices to examine the graph structure of the network: one can show that the (graph-theoretic) notion of d-separation coincides with the notion of conditional independence.
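A d-separation test can be sketched directly from this, using the classical reduction to ordinary graph separation in the moralized ancestral graph. The structure is assumed to be given as a node-to-parents dict, as in the earlier sketches.

```python
from collections import deque

def d_separated(structure, xs, ys, zs):
    """Check whether the variable sets `xs` and `ys` are d-separated by `zs`.

    Take the ancestral subgraph of xs | ys | zs, moralize it (connect co-parents,
    drop edge directions), remove the nodes in zs, and test whether xs and ys
    are still connected. Assumes the three sets are pairwise disjoint.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)

    # 1. Ancestral subgraph: the queried nodes and all of their ancestors.
    relevant, stack = set(), list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in relevant:
            relevant.add(node)
            stack.extend(structure[node])

    # 2. Moral graph: undirected parent-child edges plus edges between co-parents.
    neighbours = {n: set() for n in relevant}
    for node in relevant:
        for p in structure[node]:
            neighbours[node].add(p)
            neighbours[p].add(node)
        for a in structure[node]:
            for b in structure[node]:
                if a != b:
                    neighbours[a].add(b)

    # 3. Remove the conditioning set and test connectivity with BFS.
    visited, queue = set(), deque(xs - zs)
    while queue:
        node = queue.popleft()
        if node in ys:
            return False          # a path from xs to ys survives -> not d-separated
        if node in visited:
            continue
        visited.add(node)
        queue.extend(neighbours[node] - visited - zs)
    return True
```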

Application

Bayesian networks are used as a form of probabilistic expert system, with areas of application including bioinformatics, pattern analysis, medicine and engineering. In the tradition of artificial intelligence, the focus of Bayesian networks is on exploiting their graphical structure to enable abductive and deductive inferences that would be impracticable in an unfactorized probability model. This is realized by the various inference algorithms.

The basic idea of Bayesian networks, namely the graphical factorization of a probability model, is also used in other traditions, such as Bayesian statistics and the tradition of so-called graphical models, for data modeling purposes. Areas of application are primarily epidemiology, medicine and the social sciences.

Software

Literature

  • Enrique Castillo, Jose Manuel Gutierrez, Ali S. Hadi: Expert Systems and Probabilistic Network Models . Springer-Verlag, New York 1997, ISBN 0-387-94858-9 .
  • Finn V. Jensen: Bayesian Networks and Decision Graphs . Springer-Verlag, New York 2001, ISBN 0-387-95259-4 .
  • Richard E. Neapolitan: Learning Bayesian Networks . Prentice Hall, 2003, ISBN 0-13-012534-2 .
  • Judea Pearl: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference . Morgan Kaufmann Publishers, San Francisco 1988, ISBN 0-934613-73-7 .
  • Judea Pearl: Causality . Cambridge University Press, Cambridge 2000, ISBN 0-521-77362-8 .
  • Stuart Russell, Peter Norvig: Artificial Intelligence - A Modern Approach . Pearson Education Germany, Germany 2004, ISBN 3-8273-7089-2 .