Gaussian test

In mathematical statistics, the Gauss test or Z-test is a group of hypothesis tests whose test statistic follows the standard normal distribution under the null hypothesis. The test is named after Carl Friedrich Gauß.

The Z-test uses sample means to test hypotheses about the expected values of the populations from which the samples are drawn.

The Gaussian test follows a procedure similar to that of the t-test. The most important difference lies in the prerequisites: while the t-test works with the empirical standard deviations of the samples, the Gaussian test requires the standard deviations of the populations to be known. Furthermore, the Gauss test uses the standard normal distribution as the distribution of the test statistic, whereas the t-test uses the t-distribution. The Gauss test is therefore suitable for small samples only to a limited extent, namely when the population standard deviation is actually known rather than estimated.

Mathematical basics

If $X_1, \dots, X_n$ are independent, normally distributed random variables with expected value $\mu$ and standard deviation $\sigma$, then their arithmetic mean

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

is normally distributed with expected value $\mu$ and standard error $\sigma / \sqrt{n}$.

The sampling function

$$Z = \sqrt{n}\,\frac{\bar{X} - \mu_0}{\sigma}$$

is then standard normally distributed under the null hypothesis $\mu = \mu_0$ and is used as the test statistic.

The test statistic can be written as

$$Z = \sqrt{n}\,\frac{\bar{X} - \mu}{\sigma} + \sqrt{n}\,\frac{\mu - \mu_0}{\sigma},$$

that is, as a standard normally distributed random variable plus a constant that expresses the distance between the true and the hypothesized expected value in standardized form.
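
The decomposition means that, when the true expected value is $\mu$, the statistic $Z$ is normally distributed with mean $\delta = \sqrt{n}\,(\mu - \mu_0)/\sigma$ and variance 1. The following Python sketch illustrates this by computing the probability that $Z$ exceeds a threshold $c$; the function name `prob_z_exceeds` and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_z_exceeds(c, mu, mu0, sigma, n):
    """Return P(Z > c) when the true expected value is mu.

    Z = sqrt(n) * (mean - mu0) / sigma is then normal with mean
    delta = sqrt(n) * (mu - mu0) / sigma and variance 1.
    """
    delta = sqrt(n) * (mu - mu0) / sigma
    return 1.0 - NormalDist(mu=delta, sigma=1.0).cdf(c)


# Illustrative numbers only (not taken from the article).
under_h0 = prob_z_exceeds(1.645, mu=10.0, mu0=10.0, sigma=2.0, n=25)
shifted = prob_z_exceeds(1.645, mu=11.0, mu0=10.0, sigma=2.0, n=25)
print(f"no shift: {under_h0:.3f}, true mean one unit larger: {shifted:.3f}")
```

With no shift the exceedance probability is simply the tail of the standard normal distribution; the larger the standardized distance $\delta$, the more probability mass moves beyond the threshold.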

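If, in addition, there are independent, normally distributed random variables $Y_1, \dots, Y_m$ with expected value $\mu_Y$, standard deviation $\sigma_Y$ and arithmetic mean

$$\bar{Y} = \frac{1}{m} \sum_{j=1}^{m} Y_j$$

that are also independent of the sample $X_1, \dots, X_n$ (whose expected value and standard deviation are now written $\mu_X$ and $\sigma_X$), then $\bar{X} - \bar{Y}$ is normally distributed with expected value $\mu_X - \mu_Y$ and standard deviation $\sqrt{\sigma_X^2 / n + \sigma_Y^2 / m}$.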

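The sampling function

$$Z = \frac{\bar{X} - \bar{Y} - \omega_0}{\sqrt{\sigma_X^2 / n + \sigma_Y^2 / m}}$$

is then standard normally distributed under the null hypothesis $\mu_X - \mu_Y = \omega_0$ and is used as the test statistic.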

One-sample Gaussian test

Application

The one-sample Gaussian test uses the arithmetic mean of a sample to check whether the expected value of the associated population differs from (or is smaller or larger than) a specified value.

The sample $x_1, \dots, x_n$ consists of realizations of $n$ independent random variables and comes from a normally distributed population with unknown expected value $\mu$ and known standard deviation $\sigma$.

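The test is carried out as one of the following:

  • two-sided test: $H_0: \mu = \mu_0$ against $H_1: \mu \neq \mu_0$
  • right-sided test: $H_0: \mu \le \mu_0$ against $H_1: \mu > \mu_0$
  • left-sided test: $H_0: \mu \ge \mu_0$ against $H_1: \mu < \mu_0$

The value of $\mu_0$ is specified by the user.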

Calculation of the test statistic

The sample mean $\bar{x}$ is used to calculate the test statistic

$$z = \sqrt{n}\,\frac{\bar{x} - \mu_0}{\sigma}.$$
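
As a minimal sketch, the one-sample statistic $z = \sqrt{n}\,(\bar{x} - \mu_0)/\sigma$ can be computed with the Python standard library as follows. The function name `one_sample_z` and the data are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import fmean


def one_sample_z(sample, mu0, sigma):
    """Gauss test statistic z = sqrt(n) * (mean - mu0) / sigma.

    sigma is the KNOWN population standard deviation, not an estimate.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    n = len(sample)
    return sqrt(n) * (fmean(sample) - mu0) / sigma


# Hypothetical measurements, for illustration only.
z = one_sample_z([5.1, 4.9, 5.3, 5.2], mu0=5.0, sigma=0.2)
print(f"z = {z:.2f}")
```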

Two-sample Gaussian test for independent samples

Application

The two-sample Gaussian test for independent samples uses the arithmetic means of the samples to check whether the expected values of the associated populations differ.

The independent samples $x_1, \dots, x_n$ and $y_1, \dots, y_m$ should come from mutually independent, normally distributed populations with unknown expected values $\mu_X$ and $\mu_Y$ and known standard deviations $\sigma_X$ and $\sigma_Y$.

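The test is carried out as one of the following:

  • two-sided test: $H_0: \mu_X - \mu_Y = \omega_0$ against $H_1: \mu_X - \mu_Y \neq \omega_0$
  • right-sided test: $H_0: \mu_X - \mu_Y \le \omega_0$ against $H_1: \mu_X - \mu_Y > \omega_0$
  • left-sided test: $H_0: \mu_X - \mu_Y \ge \omega_0$ against $H_1: \mu_X - \mu_Y < \omega_0$

The value of $\omega_0$ is specified by the user.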

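Calculation of the test statistic

The sample means $\bar{x}$ and $\bar{y}$ are used to calculate the test statistic

$$z = \frac{\bar{x} - \bar{y} - \omega_0}{\sqrt{\sigma_X^2 / n + \sigma_Y^2 / m}}.$$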
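
A minimal Python sketch of the two-sample statistic, assuming both population standard deviations are known, might look like this. The name `two_sample_z`, the parameter `omega0` for the hypothesized difference of expected values, and the data are hypothetical.

```python
from math import sqrt
from statistics import fmean


def two_sample_z(xs, ys, sigma_x, sigma_y, omega0=0.0):
    """z = (mean(xs) - mean(ys) - omega0) / sqrt(sx^2/n + sy^2/m)."""
    if sigma_x <= 0 or sigma_y <= 0:
        raise ValueError("standard deviations must be positive")
    # Standard deviation of the difference of the two sample means.
    se = sqrt(sigma_x**2 / len(xs) + sigma_y**2 / len(ys))
    return (fmean(xs) - fmean(ys) - omega0) / se


# Hypothetical samples, for illustration only.
z = two_sample_z([10, 12, 11, 13], [9, 10, 11, 10], sigma_x=1.0, sigma_y=1.0)
print(f"z = {z:.3f}")
```

The two samples may have different sizes; only the standard error in the denominator changes.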

Two-sample Gaussian test for dependent (connected) samples

Application

For the two-sample Gaussian test for dependent samples, pairs of measured values $(x_i, y_i)$ must be available, as found, for example, in before-and-after measurements. The pair differences $d_i = x_i - y_i$ are used to check whether the expected value of the associated population of differences differs from (or is smaller or larger than) a specified value.

The differences should come from a normally distributed population with unknown expected value $\mu_D$ and known standard deviation $\sigma_D$.

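The test is carried out as one of the following:

  • two-sided test: $H_0: \mu_D = \omega_0$ against $H_1: \mu_D \neq \omega_0$
  • right-sided test: $H_0: \mu_D \le \omega_0$ against $H_1: \mu_D > \omega_0$
  • left-sided test: $H_0: \mu_D \ge \omega_0$ against $H_1: \mu_D < \omega_0$

The value of $\omega_0$ is specified by the user. In most applications the test is for inequality ($\mu_D \neq 0$), in which case $\omega_0 = 0$.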

Calculation of the test statistic

The differences $d_i$ form a new sample with arithmetic mean $\bar{d}$. The one-sample Gaussian test can therefore be applied to the sample of differences, which gives the test statistic

$$z = \sqrt{n}\,\frac{\bar{d} - \omega_0}{\sigma_D}.$$
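
The reduction to the one-sample case can be sketched in Python as follows. The function name `paired_z`, the convention "first value minus second value" for the differences, and the data are assumptions made for this illustration.

```python
from math import sqrt
from statistics import fmean


def paired_z(first, second, sigma_d, omega0=0.0):
    """One-sample Gauss statistic applied to the pair differences.

    sigma_d is the KNOWN standard deviation of the differences.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    if sigma_d <= 0:
        raise ValueError("sigma_d must be positive")
    diffs = [a - b for a, b in zip(first, second)]
    return sqrt(len(diffs)) * (fmean(diffs) - omega0) / sigma_d


# Hypothetical before/after measurements, for illustration only.
before = [10, 12, 11, 13]
after = [11, 14, 12, 15]
z = paired_z(before, after, sigma_d=1.0)
print(f"z = {z:.1f}")
```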

Decision on the hypotheses

In all three Gaussian tests, the general criteria for hypothesis tests are used to decide whether to retain or reject the null hypothesis. Since the test statistic $Z$ is standard normally distributed under the null hypothesis, the following rules are obtained.

$H_0$ is rejected (i.e. $H_1$ is accepted) at significance level $\alpha$ if:

  • in the two-sided test: $|z| > u_{1-\alpha/2}$ (where $u_{1-\alpha/2}$ is the $(1-\alpha/2)$-quantile of the standard normal distribution)
  • in the right-sided test: $z > u_{1-\alpha}$
  • in the left-sided test: $z < u_{\alpha} = -u_{1-\alpha}$
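
The three rejection rules can be sketched in Python with `statistics.NormalDist`, whose `inv_cdf` method returns quantiles of the normal distribution. The function name `reject_h0` and the labels used for the `alternative` argument are hypothetical.

```python
from statistics import NormalDist


def reject_h0(z, alpha=0.05, alternative="two-sided"):
    """Return True if H0 is rejected at level alpha for the observed z."""
    u = NormalDist().inv_cdf  # quantile function of the standard normal
    if alternative == "two-sided":
        return abs(z) > u(1 - alpha / 2)
    if alternative == "right":
        return z > u(1 - alpha)
    if alternative == "left":
        return z < u(alpha)  # u(alpha) equals -u(1 - alpha)
    raise ValueError("alternative must be 'two-sided', 'right' or 'left'")


# The same z can lead to different decisions depending on the alternative.
print(reject_h0(1.8), reject_h0(1.8, alternative="right"))
```

For $\alpha = 0.05$ the critical values are approximately 1.96 (two-sided) and 1.645 (one-sided), so $z = 1.8$ is rejected only by the right-sided test.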

Gaussian test for non-normally distributed random variables

For large sample sizes ($n > 30$ as a rule of thumb), the assumption of a normal distribution can be dropped by virtue of the central limit theorem. If the requirements of the Gauss test on the expected values and standard deviations of the random variables involved are met, the sums needed to calculate $z$ are approximately normally distributed, and the Gauss test gives correct results to a good approximation.
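
The quality of this approximation can be explored by simulation. The sketch below draws samples from an exponential distribution with rate 1 (expected value 1, standard deviation 1), which is clearly not normal, and counts how often a two-sided Gauss test at level 0.05 rejects a true null hypothesis. The choice of distribution, sample size, number of trials and seed are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(n=40, trials=5000, alpha=0.05, seed=0):
    """Share of simulated samples for which a true H0 is rejected."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    mu0 = 1.0    # true expected value of the exponential(1) distribution
    sigma = 1.0  # its known standard deviation
    rejections = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        z = sqrt(n) * (fmean(sample) - mu0) / sigma
        if abs(z) > crit:
            rejections += 1
    return rejections / trials


rate = rejection_rate()
print(f"empirical rejection rate under H0: {rate:.3f}")
```

If the normal approximation is adequate, the printed rate should be close to the nominal level 0.05; repeating the experiment with a much smaller `n` shows how the approximation degrades.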

Example

A certain blood parameter B is, to a very good approximation, normally distributed in the population with known standard deviation $\sigma$. A group of chemically related drugs is known to be able to shift the distribution of the blood parameter, i.e. they may change the expected value while preserving the shape of the distribution.

For a drug P from this group, it is to be checked whether such a change actually occurs. Independent random samples of size n = 22 give the following measurements for B:

without administration of P   $x_i$   12 13 10 12 14 11 14 18 15 13 15 13 11 17 11 12 13 14 15 13 14 13
with administration of P      $y_i$   13 14 13 17 13 16 16 19 17 15 17 15 15 20 15 15 14 15 13 15 16 15

Various hypotheses are to be tested with these measured values. The significance level is 0.05 in each case; the associated quantiles of the standard normal distribution are $u_{0.95} \approx 1.645$ and $u_{0.975} \approx 1.96$ (all values here and below are rounded).

For the sample means one obtains $\bar{x} \approx 13.3$ and $\bar{y} \approx 15.4$.

  • 1st hypothesis: The expected value of B after administration of P is above 15.
Procedure: right-sided one-sample Gaussian test
$H_0: \mu_Y \le 15$ against $H_1: \mu_Y > 15$; the test statistic $z = \sqrt{22}\,(\bar{y} - 15)/\sigma$ does not exceed $u_{0.95}$.
Decision: $H_0$ is retained. It could not be shown that the administration of P leads to an expected B value above 15.
  • 2nd hypothesis: The values of B differ on average between the two populations with and without the administration of P.
Procedure: two-sided two-sample Gaussian test for independent samples
$H_0: \mu_X = \mu_Y$ against $H_1: \mu_X \neq \mu_Y$; the test statistic $z = (\bar{x} - \bar{y})/\sqrt{\sigma^2/22 + \sigma^2/22}$ satisfies $|z| > u_{0.975}$.
Decision: $H_0$ is rejected in favor of $H_1$. At a significance level of 0.05, it has been shown that the B values differ on average depending on whether or not P is administered.
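
The first two hypotheses can be recomputed from the table with a few lines of Python. The numerical value of the known standard deviation is not available in the text above, so the sketch uses $\sigma = 2$ purely as an illustrative assumption; with a different value the test statistics change accordingly.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Measurements from the table: without (x) and with (y) administration of P.
x = [12, 13, 10, 12, 14, 11, 14, 18, 15, 13, 15,
     13, 11, 17, 11, 12, 13, 14, 15, 13, 14, 13]
y = [13, 14, 13, 17, 13, 16, 16, 19, 17, 15, 17,
     15, 15, 20, 15, 15, 14, 15, 13, 15, 16, 15]

SIGMA = 2.0   # ASSUMPTION: the known sigma is not stated in the text
ALPHA = 0.05
u = NormalDist().inv_cdf  # quantiles of the standard normal distribution
n = len(x)                # both samples have size 22

# 1st hypothesis: right-sided one-sample test of mu_Y > 15.
z1 = sqrt(n) * (fmean(y) - 15) / SIGMA
reject1 = z1 > u(1 - ALPHA)

# 2nd hypothesis: two-sided two-sample test of mu_X != mu_Y.
z2 = (fmean(x) - fmean(y)) / sqrt(SIGMA**2 / n + SIGMA**2 / n)
reject2 = abs(z2) > u(1 - ALPHA / 2)

print(f"z1 = {z1:.2f}, reject H0: {reject1}")
print(f"z2 = {z2:.2f}, reject H0: {reject2}")
```

Under the assumed $\sigma$, the decisions agree with those stated in the text: the null hypothesis is retained in the first test and rejected in the second.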

Let us now consider an experiment with dependent samples. Extensive before-and-after studies have shown that the change in the B values caused by administering the drugs in question is also normally distributed, with known standard deviation $\sigma_D$. The values printed one above the other in the table are now taken to be pairs obtained in a before-and-after test.

  • 3rd hypothesis: The values of B after the administration of P are on average more than 1.25 above the values before the administration of P.
Procedure: left-sided two-sample Gaussian test for dependent samples
With the differences $d_i = x_i - y_i$ (before minus after) and $\bar{d} = \bar{x} - \bar{y} \approx -2.05$: $H_0: \mu_D \ge -1.25$ against $H_1: \mu_D < -1.25$; the test statistic $z = \sqrt{22}\,(\bar{d} + 1.25)/\sigma_D$ lies below $-u_{0.95}$.
Decision: $H_0$ is rejected in favor of $H_1$. At a significance level of 0.05, it has been shown that in before-and-after examinations the B values after the administration of P are on average more than 1.25 above the B values before the administration of P.
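
The third hypothesis can be checked in the same way by treating the table columns as before/after pairs. The known standard deviation of the differences is not available in the text above; the value $\sigma_D = 1.5$ used below is an illustrative assumption only.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Paired measurements from the table: before (x) and after (y) giving P.
x = [12, 13, 10, 12, 14, 11, 14, 18, 15, 13, 15,
     13, 11, 17, 11, 12, 13, 14, 15, 13, 14, 13]
y = [13, 14, 13, 17, 13, 16, 16, 19, 17, 15, 17,
     15, 15, 20, 15, 15, 14, 15, 13, 15, 16, 15]

SIGMA_D = 1.5   # ASSUMPTION: the known sigma_D is not stated in the text
OMEGA0 = -1.25  # H1: mean of (before - after) is below -1.25
ALPHA = 0.05

diffs = [before - after for before, after in zip(x, y)]
d_bar = fmean(diffs)

# Left-sided one-sample Gauss test applied to the differences.
z3 = sqrt(len(diffs)) * (d_bar - OMEGA0) / SIGMA_D
reject3 = z3 < NormalDist().inv_cdf(ALPHA)

print(f"mean difference = {d_bar:.2f}, z3 = {z3:.2f}, reject H0: {reject3}")
```

Under the assumed $\sigma_D$, the null hypothesis is rejected, in agreement with the decision stated in the text.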
