Probability theory

from Wikipedia, the free encyclopedia

Probability theory, also called probability calculus or probabilistics, is a branch of mathematics that emerged from the formalization, modeling and analysis of random events. Together with mathematical statistics, which makes statements about the underlying model based on observations of random processes, it forms the mathematical field of stochastics.

The central objects of probability theory are random events, random variables and stochastic processes.

Axiomatic structure

Like every branch of modern mathematics, probability theory is formulated in set theory and built on axioms. Its starting point is events, which are understood as sets and to which probabilities are assigned. Probabilities are real numbers between 0 and 1, and the assignment of probabilities to events must meet certain minimum requirements.

These definitions give no indication of how to determine the probabilities of individual events; they also say nothing about what chance and what probability actually are. The mathematical formulation of probability theory is thus open to various interpretations, but its results are nevertheless exact and independent of the respective understanding of the concept of probability.


Conceptually, the mathematical treatment starts from a random process or random experiment. All possible outcomes of this random process are collected in the result set $\Omega$. Often one is not interested in the exact outcome $\omega \in \Omega$ at all, but only in whether it lies in a certain subset of the result set, which can be interpreted to mean that an event has or has not occurred. An event is therefore defined as a subset of $\Omega$. If the event contains exactly one element of the result set, it is an elementary event. Compound events contain several outcomes. An outcome is thus an element of the result set, whereas an event is a subset of it.

So that probabilities can be assigned to events in a meaningful way, the events are collected in a system of sets, the algebra of events or event system $\Sigma$ on $\Omega$. This is a set of subsets of $\Omega$ with the following properties: it contains $\Omega$ and is a σ-algebra (σ-field), that is, it is closed under the set operations of union and complement (relative to $\Omega$) as well as under the union of countably infinitely many sets. The probabilities are then the images of a mapping $P$ from the event space into the interval [0,1]. Such a mapping is called a probability measure. The triple $(\Omega, \Sigma, P)$ is called a probability space.
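The closure properties just described can be checked mechanically when the result set is finite. The following Python sketch is only an illustration: the helper names `power_set` and `is_sigma_algebra` are hypothetical, and the check relies on the fact that on a finite set closure under pairwise unions already gives closure under countable unions.

```python
from itertools import combinations


def power_set(omega):
    """Return all subsets of omega as a set of frozensets."""
    items = list(omega)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


def is_sigma_algebra(omega, events):
    """Check the closure properties for a finite system of events."""
    omega = frozenset(omega)
    if omega not in events:
        return False
    for a in events:
        if omega - a not in events:       # closed under complement
            return False
        for b in events:
            if a | b not in events:       # closed under union
                return False
    return True


omega = {1, 2, 3, 4}
print(is_sigma_algebra(omega, power_set(omega)))                      # True
print(is_sigma_algebra(omega, {frozenset(), frozenset(omega)}))       # True
print(is_sigma_algebra(omega, {frozenset(omega), frozenset({1})}))    # False
```

The last system fails because the complement of the full set, the empty set, is missing.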

Axioms of Kolmogorov

The axiomatic foundation of probability theory was developed by Andrei Kolmogorov in the 1930s. A probability measure must satisfy the following three axioms:


  1. For every event $A \in \Sigma$, the probability of $A$ is a real number between 0 and 1: $0 \le P(A) \le 1$.
  2. The certain event $\Omega$ has probability 1: $P(\Omega) = 1$.
  3. The probability of a union of countably many incompatible events is equal to the sum of the probabilities of the individual events. Events $A_1, A_2, \dots$ are called incompatible if they are pairwise disjoint, i.e. $A_i \cap A_j = \emptyset$ for all $i \ne j$. It therefore holds that $P\left(\bigcup_i A_i\right) = \sum_i P(A_i)$. This property is also called σ-additivity.

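Example: As part of a physical model, a probability measure is used to describe the outcome of a coin toss; the possible outcomes are tails and heads.

  • Then the result set is $\Omega = \{\text{tails}, \text{heads}\}$.
  • The power set can be chosen as the event space, thus $\Sigma = \{\emptyset, \{\text{tails}\}, \{\text{heads}\}, \Omega\}$.
  • For the probability measure $P$ it follows from the axioms: $P(\emptyset) = 0$, $P(\{\text{tails}\}) = 1 - P(\{\text{heads}\})$ and $P(\Omega) = 1$.

Additional physical assumptions about the nature of the coin can now lead, for example, to the choice $P(\{\text{heads}\}) = P(\{\text{tails}\}) = 0.5$.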
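As a minimal sketch, the coin-toss model can be written down in Python and checked against the three axioms. The fair-coin weights of 1/2 are an assumption made for this illustration, and the names `weights` and `prob` are hypothetical.

```python
from fractions import Fraction

omega = frozenset({"heads", "tails"})

# Assumption: a fair coin, so each outcome gets the weight 1/2.
weights = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}


def prob(event):
    """P(event), computed as the sum of the weights of its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))


# The event space is the power set of omega.
event_space = [frozenset(), frozenset({"heads"}), frozenset({"tails"}), omega]

for event in event_space:
    print(sorted(event), prob(event))
```

Changing the two weights to any other pair of non-negative numbers that sum to 1 gives another valid probability measure on the same event space.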


Some direct consequences follow from the axioms:

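1. From the additivity of the probability of disjoint events it follows that complementary events have complementary probabilities: $P(\Omega \setminus A) = 1 - P(A)$.

Proof: We have $(\Omega \setminus A) \cup A = \Omega$ and $(\Omega \setminus A) \cap A = \emptyset$. Consequently, by axiom (3), $P(\Omega \setminus A) + P(A) = P(\Omega)$, and then by axiom (2), $P(\Omega \setminus A) + P(A) = 1$. Rearranging gives $P(\Omega \setminus A) = 1 - P(A)$.

2. It follows that the impossible event, the empty set $\emptyset$, has probability zero: $P(\emptyset) = 0$.

Proof: We have $\emptyset \cup \Omega = \Omega$ and $\emptyset \cap \Omega = \emptyset$, so by axiom (3): $P(\emptyset) + P(\Omega) = P(\Omega)$. It follows that $P(\emptyset) = 0$.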

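3. For the union of two not necessarily disjoint events $A$ and $B$ it follows that $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

[Figure: the sets used in the proof]
Proof: The set $A \cup B$ can be represented as the union of three disjoint sets:
$A \cup B = (A \setminus B) \cup (A \cap B) \cup (B \setminus A)$.
According to (3) it follows: $P(A \cup B) = P(A \setminus B) + P(A \cap B) + P(B \setminus A)$.
On the other hand, according to (3), both
$P(A) = P(A \setminus B) + P(A \cap B)$
as well as
$P(B) = P(A \cap B) + P(B \setminus A)$.
Addition gives:
$P(A) + P(B) = P(A \setminus B) + P(A \cap B) + P(A \cap B) + P(B \setminus A) = P(A \cup B) + P(A \cap B)$.
Rearranging results in $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.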
The Poincaré-Sylvester sieve formula generalizes this assertion to the case of n different (not necessarily disjoint) subsets.
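The three consequences can be verified numerically on a small example. The sketch below assumes one roll of a fair die with the uniform measure; the events `a` and `b` and the helper `prob` are hypothetical names chosen for illustration.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))      # assumed example: one roll of a fair die


def prob(event):
    """Uniform probability on omega: |event| / |omega|."""
    return Fraction(len(event & omega), len(omega))


a = frozenset({2, 4, 6})            # "even number"
b = frozenset({4, 5, 6})            # "at least four"

complement_ok = prob(omega - a) == 1 - prob(a)
empty_ok = prob(frozenset()) == 0
union_ok = prob(a | b) == prob(a) + prob(b) - prob(a & b)

print(complement_ok, empty_ok, union_ok)    # True True True
```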

Furthermore, a distinction must be made between countable and uncountable result sets.

Countable result set

Example: A wheel of fortune with result set $\Omega$, event space $\Sigma$ (here the power set of $\Omega$) and probability measure $P$.

With a countable result set, each elementary event can be assigned a positive probability. If $\Omega$ is finite or countably infinite, one can choose the power set of $\Omega$ as the σ-algebra $\Sigma$. The sum of the probabilities of all elementary events from $\Omega$ is then 1.
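A countably infinite result set can still carry positive probabilities on every elementary event, as long as they sum to 1. The following sketch uses the hypothetical choice $P(\{k\}) = 2^{-k}$ for $k = 1, 2, \dots$, which is an assumption for illustration and not taken from the text, and shows the partial sums approaching 1.

```python
from fractions import Fraction


def p(k):
    """Hypothetical elementary probability P({k}) = 2**(-k) for k = 1, 2, ..."""
    return Fraction(1, 2 ** k)


def partial_sum(n):
    """Sum of P({k}) for k = 1, ..., n."""
    return sum((p(k) for k in range(1, n + 1)), Fraction(0))


for n in (1, 5, 10, 20):
    # The gap to 1 is exactly 2**(-n) and shrinks with growing n.
    print(n, partial_sum(n), float(1 - partial_sum(n)))
```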

Uncountable result set

The probability of hitting a specific point on a target with a dart whose tip is assumed to be a single point is zero. A meaningful mathematical theory can only be based on the probability of hitting certain regions of the target. Such probabilities can be described by a probability density.

A prototype of an uncountable result set is the set of real numbers. In many models it is not possible to meaningfully assign a probability to all subsets of the real numbers. As an event system, instead of the power set of the real numbers, one usually chooses the Borel σ-algebra, that is, the smallest σ-algebra that contains all intervals of real numbers as elements. The elements of this σ-algebra are called Borel sets or (Borel-)measurable sets. If the probability $P(A)$ of every Borel set $A$ can be written as an integral

$P(A) = \int_A f(x)\,dx$

over a probability density $f$, then $P$ is called absolutely continuous. In this case (but not only in this case) all elementary events { x } have probability 0. The probability density of an absolutely continuous probability measure is uniquely determined only almost everywhere, i.e., it can be modified on any Lebesgue null set, that is, a set of Lebesgue measure 0, without changing $P$. If the first derivative of the distribution function of $P$ exists, then it is a probability density of $P$. However, the values of the probability density are not to be interpreted as probabilities.
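The relation between a density and the probabilities of intervals can be illustrated numerically. The sketch below assumes the exponential density $f(x) = e^{-x}$ for $x \ge 0$ as an example (it is not mentioned in the text) and approximates the integral with a simple midpoint rule; `density` and `prob_interval` are hypothetical names.

```python
import math


def density(x):
    """Density of the exponential distribution with rate 1 (assumed example)."""
    return math.exp(-x) if x >= 0 else 0.0


def prob_interval(a, b, steps=100_000):
    """Approximate P([a, b]) as the integral of the density (midpoint rule)."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    return h * sum(density(a + (i + 0.5) * h) for i in range(steps))


print(prob_interval(0.0, 1.0))      # close to 1 - exp(-1)
print(prob_interval(1.0, 1.0))      # a single point has probability 0
print(density(0.0))                 # a density value, not a probability
```

For a more sharply peaked density, such as $2e^{-2x}$, the density value at 0 would exceed 1, which is a further reminder that density values are not probabilities.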

Special properties in the case of discrete probability spaces

Laplace experiments

If one assumes that only a finite number of elementary events are possible and that all of them are equally likely, i.e. occur with the same probability (such as when tossing an ideal coin, where {tails} and {heads} each have a probability of 0.5), one speaks of a Laplace experiment. Probabilities can then be calculated easily: we assume a finite result set $\Omega$ with cardinality $|\Omega| = n$, i.e., it has $n$ elements. Then the probability of each elementary event is simply $P = \frac{1}{n}$.

Proof: If $|\Omega| = n$, then there are $n$ elementary events $E_1, \dots, E_n$. On the one hand $\Omega = E_1 \cup \dots \cup E_n$, and on the other hand any two elementary events are disjoint (incompatible: if one occurs, the other cannot occur). So the conditions for axiom (3) are fulfilled, and we have: $P(E_1) + \dots + P(E_n) = P(\Omega)$.
Since on the other hand $P(\Omega) = 1$ is required, we get $n \cdot P = 1$, and therefore after rearranging: $P = \frac{1}{n}$, as claimed.

As a consequence, an event that is composed of several elementary events has the corresponding multiple of this probability. If $A$ is an event of cardinality $|A| = m$, then $A$ is the union of $m$ elementary events. Each of these has probability $P = \frac{1}{n}$, so $P(A) = m \cdot \frac{1}{n} = \frac{m}{n}$. This gives the simple relation $P(A) = \frac{|A|}{|\Omega|}$.

In Laplace experiments, the probability of an event is equal to the number of outcomes that are favorable to that event divided by the total number of possible outcomes.

The following is an example of rolling an ideal die.

$\Omega = \{1, 2, 3, 4, 5, 6\}$, $H = \{5, 6\}$, $P(H) = \frac{|H|}{|\Omega|} = \frac{2}{6} = \frac{1}{3}$

The event $H$ = high number (5 or 6) has a probability of 1/3.

Other typical Laplace experiments are drawing a card from a deck of cards or drawing a ball from an urn. Here every elementary event has the same probability. Combinatorial methods are often used to determine the number of elementary events in Laplace experiments.
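The counting rule "favorable outcomes divided by possible outcomes" translates directly into code. The sketch below reproduces the die example and adds a hypothetical card question (two cards drawn from 32, both hearts) to show how combinatorial counting enters; the function name `laplace_prob` is illustrative.

```python
from fractions import Fraction
from math import comb


def laplace_prob(favorable, possible):
    """Number of favorable outcomes divided by the number of possible outcomes."""
    return Fraction(favorable, possible)


# Die example: event H = {5, 6} in the result set {1, ..., 6}.
omega = {1, 2, 3, 4, 5, 6}
high = {5, 6}
print(laplace_prob(len(high), len(omega)))        # 1/3

# Counting with combinatorics (hypothetical question): two cards are drawn
# from 32 without replacement, and both are among the 8 hearts.
print(laplace_prob(comb(8, 2), comb(32, 2)))      # 7/124
```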

The concept of the Laplace experiment can be generalized to the case of a continuous uniform distribution.

Conditional probability

A conditional probability is the probability of an event $A$ occurring, given that another event $B$ is already known to have occurred. Of course, $B$ must be able to occur, so it cannot be the impossible event. One then writes $P(A \mid B)$ or, less often, $P_B(A)$ for "the probability of $A$ under the assumption $B$", in short "$P$ of $A$, given $B$".

Example: The probability of drawing a heart card from a Skat deck (event $A$) is 1/4, because there are 32 cards and 8 of them are heart cards. Then $P(A) = \frac{8}{32} = \frac{1}{4}$. The complementary event is then diamonds, spades or clubs and therefore has the probability $\frac{24}{32} = \frac{3}{4}$.

[Figure: Result set when drawing a card from a Skat deck]

If, however, the event $B$ "The card is red" has already occurred (a heart or diamond card was drawn, but it is not known which of the two suits), the choice is only among the 16 red cards, and the probability that it is a heart card is then $P(A \mid B) = \frac{8}{16} = \frac{1}{2}$.

This consideration applied to a Laplace experiment. For the general case, the conditional probability of "$A$ given $B$" is defined as

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

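That this definition is meaningful is shown by the fact that the probability defined in this way satisfies Kolmogorov's axioms if one restricts oneself to $B$ as the new result set, i.e. that the following applies:

  1. $0 \le P(A \mid B) \le 1$
  2. $P(B \mid B) = 1$
  3. If $A_1, \dots, A_k$ are pairwise disjoint, then $P(A_1 \cup \dots \cup A_k \mid B) = P(A_1 \mid B) + \dots + P(A_k \mid B)$

Proof:

  1. $P(A \mid B)$ is the quotient of two probabilities for which, by axiom (1), $P(A \cap B) \ge 0$ and $P(B) \ge 0$ hold. Since $B$ is not supposed to be the impossible event, even $P(B) > 0$ holds. So $P(A \mid B) \ge 0$ also applies to the quotient. Furthermore, $A \cap B$ and $B \setminus A$ are disjoint, and their union is $B$. So by axiom (3): $P(A \cap B) = P(B) - P(B \setminus A)$. Since $P(B \setminus A) \ge 0$, it follows that $P(A \cap B) \le P(B)$ and therefore $P(A \mid B) \le 1$.
  2. It is $P(B \mid B) = \frac{P(B \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1$.
  3. Furthermore: $P(A_1 \cup \dots \cup A_k \mid B) = \frac{P((A_1 \cap B) \cup \dots \cup (A_k \cap B))}{P(B)} = \frac{P(A_1 \cap B) + \dots + P(A_k \cap B)}{P(B)} = P(A_1 \mid B) + \dots + P(A_k \mid B)$.
This was to be shown.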

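Example: As above, let $A$ be the event "Drawing a heart card" and $B$ the event "It is a red card". Then $P(A \cap B) = \frac{8}{32} = \frac{1}{4}$ and $P(B) = \frac{16}{32} = \frac{1}{2}$. Consequently, $P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{1/4}{1/2} = \frac{1}{2}$.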
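The same numbers can be obtained by enumerating the deck. The following sketch models a 32-card Skat deck as pairs of suit and rank and applies the definition of conditional probability; the names `DECK`, `prob` and `cond_prob` are hypothetical.

```python
from fractions import Fraction
from itertools import product

SUITS = ["clubs", "spades", "hearts", "diamonds"]
RANKS = ["7", "8", "9", "10", "jack", "queen", "king", "ace"]
DECK = frozenset(product(SUITS, RANKS))           # 32 cards


def prob(event):
    """Laplace probability of an event (a set of cards)."""
    return Fraction(len(event), len(DECK))


def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); b must have positive probability."""
    if prob(b) == 0:
        raise ValueError("conditioning event must have positive probability")
    return prob(a & b) / prob(b)


hearts = frozenset(c for c in DECK if c[0] == "hearts")
red = frozenset(c for c in DECK if c[0] in ("hearts", "diamonds"))

print(prob(hearts), cond_prob(hearts, red))       # 1/4 1/2
```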



The following consequences result from the definition of the conditional probability:

Joint probability (intersections of events)

The simultaneous occurrence of two events $A$ and $B$ corresponds in set theory to the occurrence of the compound event $A \cap B$. Its probability, the joint probability, is calculated as $P(A \cap B) = P(A) \cdot P(B \mid A) = P(B) \cdot P(A \mid B)$.

Proof: According to the definition of the conditional probability, on the one hand

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

and on the other hand also

$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$

Solving for $P(A \cap B)$ then immediately gives the assertion.

Example: A card is drawn from 32 cards. Let $A$ be the event "It is a king" and $B$ the event "It is a heart card". Then $A \cap B$ is the simultaneous occurrence of $A$ and $B$, thus the event "The card drawn is the king of hearts". Evidently $P(A) = \frac{4}{32} = \frac{1}{8}$. Furthermore $P(B \mid A) = \frac{1}{4}$, because there is only one heart card among the four kings. And indeed, $P(A \cap B) = P(A) \cdot P(B \mid A) = \frac{1}{8} \cdot \frac{1}{4} = \frac{1}{32}$ is the probability of the king of hearts.

Bayes' theorem

The conditional probability of $A$ under the condition $B$ can be expressed through the conditional probability of $B$ under the condition $A$ by

$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$

if one knows the total probabilities $P(A)$ and $P(B)$ (Bayes' theorem).
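As a small worked sketch, Bayes' theorem can be applied to a 32-card deck with the events "king" and "heart card". The input values are the Laplace probabilities for such a deck; the function name `bayes` is hypothetical.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """Return P(A | B) = P(B | A) * P(A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


p_a = Fraction(4, 32)            # A: "the card is a king"
p_b = Fraction(8, 32)            # B: "the card is a heart card"
p_b_given_a = Fraction(1, 4)     # one of the four kings is a heart card

print(bayes(p_b_given_a, p_a, p_b))    # 1/8
```

The result agrees with direct counting: exactly one of the eight heart cards is a king.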

Dependence and independence of events

Events are called independent of one another if the occurrence of one does not affect the probability of the other. Otherwise they are called dependent. One defines:

Two events $A$ and $B$ are independent if $P(A \cap B) = P(A) \cdot P(B)$ holds.
Put imprecisely but memorably: for independent events, the probabilities can be multiplied.

That this does justice to the term "independence" can be seen by solving for $P(A)$:

$P(A) = \frac{P(A \cap B)}{P(B)} = P(A \mid B)$

This means: the total probability of $A$ is just as large as the probability of $A$ given $B$; so the occurrence of $B$ does not affect the probability of $A$.

Example: One of 32 cards is drawn. Let $A$ be the event "It is a heart card" and $B$ the event "It is a picture card". These events are independent, because knowing that a picture card was drawn does not affect the probability that it is a heart card (the proportion of heart cards among the picture cards is just as large as the proportion of heart cards among all cards). Evidently $P(A) = \frac{8}{32} = \frac{1}{4}$ and $P(B) = \frac{12}{32} = \frac{3}{8}$. $A \cap B$ is the event "It is a heart picture card". Since there are three of them, $P(A \cap B) = \frac{3}{32}$. And indeed one finds that $P(A) \cdot P(B) = \frac{1}{4} \cdot \frac{3}{8} = \frac{3}{32}$.
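The product criterion can be tested by enumeration. The sketch below again models a 32-card deck as pairs of suit and rank, treats jack, queen and king as picture cards, and contrasts an independent pair of events with a dependent one; all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

SUITS = ["clubs", "spades", "hearts", "diamonds"]
RANKS = ["7", "8", "9", "10", "jack", "queen", "king", "ace"]
DECK = frozenset(product(SUITS, RANKS))           # 32 cards


def prob(event):
    """Laplace probability of an event (a set of cards)."""
    return Fraction(len(event), len(DECK))


def independent(a, b):
    """Product criterion: P(a and b) == P(a) * P(b)."""
    return prob(a & b) == prob(a) * prob(b)


hearts = frozenset(c for c in DECK if c[0] == "hearts")
pictures = frozenset(c for c in DECK if c[1] in ("jack", "queen", "king"))
red = frozenset(c for c in DECK if c[0] in ("hearts", "diamonds"))

print(independent(hearts, pictures))    # True
print(independent(hearts, red))         # False
```

The second pair is dependent because every heart card is red, so knowing that the card is red changes the probability of hearts from 1/4 to 1/2.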

Another example of very small and very large probabilities can be found in the Infinite Monkey Theorem.

Measure-theoretic perspective

Classical probability calculus only considers probabilities on discrete probability spaces and continuous models with density functions. These two approaches can be unified and generalized through the modern formulation of probability theory, which is based on the concepts and results of the theory of measure and integration.

Probability spaces

In this view, a probability space is a measure space $(\Omega, \Sigma, P)$ with a probability measure $P$. This means that the result set $\Omega$ is an arbitrary set, the event space $\Sigma$ is a σ-algebra with base set $\Omega$, and $P \colon \Sigma \to [0,1]$ is a measure that is normalized by $P(\Omega) = 1$.

Important standard cases of probability spaces are:

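  • $\Omega$ is a countable set and $\Sigma$ is the power set of $\Omega$. Then every probability measure $P$ is uniquely determined by its values $P(\{\omega\})$ on the one-element subsets of $\Omega$, and for all $A \in \Sigma$ it holds that $P(A) = \sum_{\omega \in A} P(\{\omega\})$.
  • $\Omega$ is a subset of $\mathbb{R}^n$ and $\Sigma$ is the Borel σ-algebra on $\Omega$. If the probability measure $P$ is absolutely continuous with respect to the Lebesgue measure, then according to the Radon-Nikodým theorem it has a Lebesgue density $f$, i.e., for all $A \in \Sigma$ it holds that $P(A) = \int_A f(x)\,dx$. Conversely, for a non-negative measurable function $f$ that fulfills the normalization condition $\int_\Omega f(x)\,dx = 1$, this formula defines a probability measure on $\Omega$.
  • $\Omega = \prod_{i \in I} \Omega_i$ is a Cartesian product and $\Sigma$ is the product σ-algebra of σ-algebras $\Sigma_i$ on $\Omega_i$. If probability measures $P_i$ on $\Omega_i$ are given, then the product measure $P = \bigotimes_{i \in I} P_i$ defines a probability measure on $\Omega$ which models the independent execution of the individual experiments $(\Omega_i, \Sigma_i, P_i)$ one after the other.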
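For two finite experiments, the product measure from the last case can be built explicitly. The sketch below combines a fair coin and a fair die, both assumed purely for illustration, and checks that events referring to different factors are independent under the product measure; `product_measure` and `prob` are hypothetical names.

```python
from fractions import Fraction
from itertools import product

# Assumed individual experiments: a fair coin and a fair die.
coin = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
die = {k: Fraction(1, 6) for k in range(1, 7)}


def product_measure(p1, p2):
    """Elementary probabilities of the product measure on pairs (w1, w2)."""
    return {(w1, w2): p1[w1] * p2[w2] for w1, w2 in product(p1, p2)}


joint = product_measure(coin, die)


def prob(event):
    """P(event) as the sum over the elementary probabilities in the event."""
    return sum((joint[w] for w in event), Fraction(0))


heads = {w for w in joint if w[0] == "heads"}     # event about the coin only
six = {w for w in joint if w[1] == 6}             # event about the die only

print(prob(set(joint)))                               # 1
print(prob(heads & six) == prob(heads) * prob(six))   # True
```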

Random variable

A random variable is the mathematical concept for a quantity whose value depends on chance. From a measure-theoretic point of view it is a measurable function $X$ from a probability space $(\Omega, \Sigma, P)$ into a measurable space $(\Omega', \Sigma')$ consisting of a set $\Omega'$ and a σ-algebra $\Sigma'$ on $\Omega'$. Measurability means that for all $A' \in \Sigma'$ the preimage $X^{-1}(A')$ is an element of the σ-algebra $\Sigma$. The distribution of $X$ is then nothing other than the image measure

$P_X = P \circ X^{-1}$

which is induced by $X$ on the measurable space $(\Omega', \Sigma')$ and makes it a probability space $(\Omega', \Sigma', P_X)$.

The expected value of a real-valued random variable $X$ averages the possible results. It can be defined abstractly as the integral of $X$ with respect to the probability measure $P$:

$E(X) = \int_\Omega X \, dP$
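On a finite probability space, the image measure and the expected value reduce to finite sums. The following sketch assumes two rolls of a fair die and the random variable "sum of the two rolls" as an example; it computes the distribution as an image measure and the expected value in two equivalent ways. The names `image_measure`, `X` and `dist` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed probability space: two rolls of a fair die with the uniform measure.
omega = list(product(range(1, 7), repeat=2))
p = {w: Fraction(1, 36) for w in omega}


def X(w):
    """Random variable: the sum of the two rolls."""
    return w[0] + w[1]


def image_measure(p, X):
    """Distribution of X: P_X({x}) = P({w : X(w) = x})."""
    dist = defaultdict(Fraction)
    for w, pw in p.items():
        dist[X(w)] += pw
    return dict(dist)


dist = image_measure(p, X)

# Expected value as a sum over omega and, equivalently, over the values of X.
expectation = sum(X(w) * pw for w, pw in p.items())
via_distribution = sum(x * px for x, px in dist.items())

print(dist[7], expectation, via_distribution)     # 1/6 7 7
```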


Probability Theory and Statistics

Probability theory and mathematical statistics are collectively referred to as stochastics. Both areas are closely interrelated:

  • Statistical distributions are regularly modeled under the assumption that they are the result of random processes.
  • Statistical methods can provide indications of the behavior of probability distributions in a numerical way.

Application areas

The theory of probability arose from the problem of fairly dividing the stakes in interrupted games of chance. Other early applications also came from the area of gambling.

Today probability theory is a foundation of statistics. Applied statistics uses the results of probability theory to analyze survey results or to make economic forecasts.

Large areas of physics such as thermodynamics and quantum mechanics use probability theory for the theoretical description of their results.

It is also the basis for mathematical disciplines such as reliability theory, renewal theory and queuing theory and the tool for analysis in these areas.

Probability theory is also of central importance in pattern recognition.

Probability Theory in School

Due to its diverse areas of application and its everyday relevance even for young students, probability theory is taught from grade 1 in all types of school as part of mathematics lessons. While elementary school is still about getting to know the basic concepts of probability calculus and evaluating first random experiments with regard to their chances of winning, in lower secondary school the concept of probability is increasingly examined analytically in its diversity, and increasingly complex random experiments become the focus of interest. In upper secondary school, this prior knowledge is expanded to include specific topics such as Bernoulli chains, conditional probability and Laplace experiments.
