Operating characteristics

from Wikipedia, the free encyclopedia
Figure: Dependence of the type 2 error risk on the true value of the alternative parameter $\mu_1$ (called $\theta_1$ in the text) for a one-sided and a two-sided hypothesis test.

In statistics, the operating characteristic curve, also called the OC curve or OC function (OC: operating characteristic), is a concept from the theory of statistical tests. It expresses the probability of a type 2 error as a function of the actual value of the unknown parameter of a distribution function.

Definition

Figure: Influence of the sample size on the power function (quality function, test strength) of a right-sided test.
Figure: Influence of the sample size on the power function (quality function, test strength) of a two-sided test.

Given is a random variable $X$ with a distribution function that depends on an unknown parameter $\theta$. To estimate the parameter, $n$ observations $x_1, \dots, x_n$ of the random variable are made. The parameter can then be estimated by an estimator

$\hat{\theta} = g(x_1, \dots, x_n)$.

A presumption regarding the true, unknown parameter $\theta$ is to be checked statistically. A hypothesis about this parameter is therefore set up, the so-called null hypothesis $H_0: \theta = \theta_0$. If the null hypothesis is true, the estimate $\hat{\theta}$ should lie close to the true parameter $\theta_0$; $H_0$ is rejected if the distance is too great, i.e. if $\hat{\theta}$ falls within the rejection region of the test. The rejection region is set in such a way that, even if $H_0$ were true, a proportion $\alpha$ of all samples would lead to rejection (often one chooses $\alpha = 0.05$).

There are two types of errors that can be made in hypothesis testing:

  • One rejects $H_0$ although $\theta_0$ is the true parameter. This is the so-called α error or type 1 error.
  • One does not reject $H_0$ although a different parameter $\theta_1$ is the true parameter. This is the β error or type 2 error.
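The two error types can be made concrete with a small simulation. The following is a minimal sketch, assuming a left-sided z-test of $H_0: \theta \ge \theta_0$ on the mean of normally distributed data with known standard deviation. The function name `rejection_rate` and all numeric values (θ0 = 0, θ1 = −0.5, σ = 1, n = 25) are hypothetical choices for illustration and do not come from the text.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(true_theta, theta0=0.0, sigma=1.0, n=25,
                   alpha=0.05, trials=5000, seed=1):
    """Fraction of simulated samples in which H0: theta >= theta0 is rejected.

    Assumption: left-sided z-test on the sample mean with known sigma.
    """
    rng = random.Random(seed)
    std_error = sigma / n ** 0.5
    # Critical value: reject H0 when the sample mean falls below it.
    crit = theta0 + NormalDist().inv_cdf(alpha) * std_error
    rejected = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(true_theta, sigma) for _ in range(n))
        if sample_mean < crit:
            rejected += 1
    return rejected / trials

# Type 1 error: H0 is true (theta = theta0) but gets rejected.
type1_rate = rejection_rate(true_theta=0.0)
# Type 2 error: H0 is false (theta = -0.5) but is not rejected.
type2_rate = 1.0 - rejection_rate(true_theta=-0.5)

print(f"simulated type 1 error rate: {type1_rate:.3f}")
print(f"simulated type 2 error rate: {type2_rate:.3f}")
```

Under these assumptions the simulated type 1 rate should lie near the chosen α, while the type 2 rate depends on how far the hypothetical true parameter lies from θ0.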

$\alpha$ is fixed before the test is carried out, whereas $\beta$ depends on the true parameter $\theta_1$, which is usually unknown. To assess the risk of a wrong decision, one can calculate the β error for various alternative parameter values $\theta_1$. The β error for an alternative parameter $\theta_1$ is the probability that the test statistic falls within the non-rejection region, i.e. that $H_0$ is not rejected, although the distribution with parameter $\theta_1$ actually holds:

$\beta = P(H_0 \text{ is not rejected} \mid \theta = \theta_1)$.

$\beta$ therefore depends on $\theta_1$ and can be represented as a function of the alternative parameter:

$\beta = \beta(\theta_1)$.

This function is called the operating characteristic, often abbreviated OC. The complementary probability $1 - \beta(\theta_1)$ is the probability that $H_0$ is rejected and $H_1$ is accepted if $\theta_1$ is the true parameter. Rejecting $H_0$ in favor of $H_1$ is desirable in this case, which is why the corresponding function $1 - \beta(\theta_1)$ is called the power function (also translated as quality function), and its value for a given $\theta_1$ is called the power (selectivity, test strength) of the test.
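The relationship between the operating characteristic and its complement can be sketched in a few lines. This is an illustration under stated assumptions, not a general implementation: it assumes a left-sided test whose estimator is normally distributed with a known standard error, and the function names `oc_left_sided` and `power_left_sided` as well as the example values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def oc_left_sided(theta1, theta0, std_error, alpha=0.05):
    """beta(theta1): probability of NOT rejecting H0 if theta1 is the true parameter.

    Assumption: the estimator is normal with mean theta1 and the given
    standard error, and H0 is rejected for estimates below the critical value.
    """
    crit = theta0 + STD_NORMAL.inv_cdf(alpha) * std_error
    # H0 is kept whenever the estimate is at least as large as crit.
    return 1.0 - NormalDist(theta1, std_error).cdf(crit)

def power_left_sided(theta1, theta0, std_error, alpha=0.05):
    """Complement of the operating characteristic: probability of rejecting H0."""
    return 1.0 - oc_left_sided(theta1, theta0, std_error, alpha)

# Hypothetical values: theta0 = 0, standard error 1.
for theta1 in (0.0, -1.0, -2.0, -3.0):
    beta = oc_left_sided(theta1, theta0=0.0, std_error=1.0)
    print(f"theta1 = {theta1:5.1f}  beta = {beta:.4f}  power = {1.0 - beta:.4f}")
```

At $\theta_1 = \theta_0$ the operating characteristic equals $1 - \alpha$, and it falls as $\theta_1$ moves further into the alternative.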

The power function and the operating characteristic thus each provide a complete characterization of the associated test. They can be used, for example, to see whether the test improves steadily as the number of observations grows (consistency) and whether the probability of rejecting $H_0$ is greater if $H_1$ is true than if $H_0$ is true (unbiasedness).
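Both properties can be checked numerically for a concrete test. The sketch below assumes a left-sided z-test with known standard deviation; the function name `power` and the values θ0 = 0, σ = 1 and the chosen alternatives are hypothetical. A check on a finite grid only illustrates the properties, it does not prove them.

```python
from statistics import NormalDist

def power(theta1, n, theta0=0.0, sigma=1.0, alpha=0.05):
    """Probability of rejecting H0: theta >= theta0 if theta1 is the true mean.

    Assumption: left-sided z-test on the mean of n observations, known sigma.
    """
    std_error = sigma / n ** 0.5
    crit = theta0 + NormalDist().inv_cdf(alpha) * std_error
    # H0 is rejected when the sample mean falls below crit.
    return NormalDist(theta1, std_error).cdf(crit)

# Consistency (illustration): for a fixed alternative the power grows with n.
sample_sizes = [5, 25, 100, 400]
powers_by_n = [power(-0.3, n) for n in sample_sizes]
consistent_trend = all(a < b for a, b in zip(powers_by_n, powers_by_n[1:]))

# Unbiasedness (illustration): rejection is at least as likely under H1
# as at the boundary of H0, where the rejection probability equals alpha.
alternatives = [-1.0, -0.5, -0.1]
unbiased_on_grid = all(power(t, 25) >= power(0.0, 25) for t in alternatives)

print(dict(zip(sample_sizes, (round(p, 3) for p in powers_by_n))))
print("power increases with n:", consistent_trend)
print("unbiased on the grid:", unbiased_on_grid)
```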

Example

Figure (β error): The red normal distribution curve shows how the sample mean $\bar{X}$ would be distributed if $\mu = 260$ g. The red area represents the α error of 5%. The blue curve shows the distribution of $\bar{X}$ if $\mu$ were in truth 255 g. The blue area is then the probability that $\bar{X} \ge 256.7$ and that $H_0$ is not rejected, although the trout are on average underweight. The same applies to the green curve, where the true average weight of the trout is only 252 g; as can be seen, the type 2 risk of classifying them as normal weight is now much lower.
Figure (operating characteristic): The ordinate of the graph gives the β error as a function of the unknown parameter $\mu_1$. For $\mu_1 = 260$ the value is 0.95, i.e. exactly $1 - \alpha$.

A trout farmer supplies his bulk buyer with trout that should weigh at least 260 g on average. Upon delivery, a test checks whether the average weight is at least 260 g. If this hypothesis is rejected, the delivery is refused. Suppose it is known that the weight of the trout is normally distributed with known variance $\sigma^2$ and unknown expected value $\mu$. A sample of $n$ trout is weighed, with the $i$-th trout weighing $x_i$. The average weight

$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

of these trout is determined. Since the mean value turns out differently for each sample, it is itself a random variable $\bar{X}$, normally distributed with the parameters

$E(\bar{X}) = \mu$ and $\operatorname{Var}(\bar{X}) = \sigma^2 / n$.

In the calculations of this example, the standard deviation of the sample mean is $\sigma / \sqrt{n} = 2$ g.

The hypotheses are now $H_0: \mu \ge \mu_0 = 260$ and $H_1: \mu < \mu_0 = 260$.

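If the type 1 error is set, for example, to $\alpha = 0.05$, the critical value for the test statistic $\bar{X}$ results as

$\bar{x}_c = \mu_0 + z_{0.05} \cdot \frac{\sigma}{\sqrt{n}} = 260 - 1.645 \cdot 2 \approx 256.7$

with $z_{0.05} = -1.645$ as the 0.05-quantile of the standard normal distribution.

$H_0$ is rejected if $\bar{x} < 256.7$; the interval $(-\infty;\ 256.7)$ is the rejection region. If $\mu = 260$ is actually true, 5% of all samples would fall into the rejection region and the delivery would be wrongly returned, which corresponds to the α error.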
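The critical value and the resulting decision rule can be reproduced with Python's standard library. This is a minimal sketch: the standard deviation of the sample mean of 2 g is the value used in the β calculation of this example, while the constant names and the helper `reject_h0` are hypothetical.

```python
from statistics import NormalDist

MU0 = 260.0        # hypothesized minimum average weight in grams
STD_ERROR = 2.0    # standard deviation of the sample mean in grams
ALPHA = 0.05       # chosen type 1 error probability

# 0.05-quantile of the standard normal distribution (negative, about -1.645)
z_alpha = NormalDist().inv_cdf(ALPHA)

# Left-sided test: sample means below this value lead to rejection of H0.
critical_value = MU0 + z_alpha * STD_ERROR

def reject_h0(sample_mean):
    """Return True if the delivery would be refused for this sample mean."""
    return sample_mean < critical_value

print(f"z_alpha = {z_alpha:.3f}, critical value = {critical_value:.2f} g")
print("refuse delivery at 256.0 g:", reject_h0(256.0))
print("refuse delivery at 258.0 g:", reject_h0(258.0))
```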

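But it can also happen, for example, that the average weight is in truth $\mu_1 = 255$ g, but that by chance $\bar{x} \ge 256.7$, so that $H_0$ is not rejected. That is the β error for $\mu_1 = 255$. With unchanged variance, the test statistic is in truth normally distributed as

$\bar{X} \sim N(255;\ 2^2)$.

The probability that the null hypothesis will not be rejected is then

$\beta(255) = P(\bar{X} \ge 256.7 \mid \mu = 255)$

and is calculated using the normal distribution as

$\beta(255) = 1 - \Phi(256.7 \mid 255;\ 2) = 1 - \Phi_z\left(\frac{256.7 - 255}{2}\right) = 1 - \Phi_z(0.85) \approx 1 - 0.8023 = 0.1977$,

where $\Phi(256.7 \mid 255;\ 2)$ is the value of the normal distribution function with parameters 255 and 2 at the point 256.7 and $\Phi_z(0.85)$ is the corresponding value of the standard normal distribution. The delivery would therefore be accepted in approx. 20% of all samples, although the trout are on average underweight. If, on the other hand, the true mean is $\mu_1 = 252$ g, the β error results as

$\beta(252) = 1 - \Phi_z\left(\frac{256.7 - 252}{2}\right) = 1 - \Phi_z(2.35) \approx 1 - 0.9906 = 0.0094$;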

here the risk of a wrong decision is very low. The graph of the operating characteristic shows how the β error decreases as the distance of $\mu_1$ from $\mu_0$ increases. The aim is to get into the range of a small β error as quickly as possible. By increasing the sample size, one can reduce the β error. A test with a small β error is also called selective, because the distributions are strongly separated here.
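The β errors of the trout example, and the operating characteristic as a whole, can be recomputed as follows. This is a sketch under stated assumptions: μ0 = 260, α = 0.05 and the standard deviation of the sample mean of 2 g are taken from the example, whereas `ASSUMED_SIGMA = 10.0` and the sample sizes in the last step are hypothetical values, chosen only so that n = 25 reproduces a standard error of 2 g.

```python
from statistics import NormalDist

MU0, ALPHA = 260.0, 0.05

def beta_error(mu1, std_error):
    """Probability of accepting the delivery if mu1 is the true mean weight."""
    crit = MU0 + NormalDist().inv_cdf(ALPHA) * std_error
    # H0 is not rejected when the sample mean is at least the critical value.
    return 1.0 - NormalDist(mu1, std_error).cdf(crit)

# The two alternatives discussed in the example (standard error 2 g).
beta_255 = beta_error(255.0, 2.0)
beta_252 = beta_error(252.0, 2.0)

# Operating characteristic on a grid of true mean weights from 250 g to 260 g.
oc_curve = [(mu1, beta_error(mu1, 2.0)) for mu1 in range(250, 261)]

# Effect of the sample size at mu1 = 255 g.
ASSUMED_SIGMA = 10.0  # hypothetical value, not stated in the text
beta_by_n = {n: beta_error(255.0, ASSUMED_SIGMA / n ** 0.5) for n in (25, 50, 100)}

print(f"beta(255) = {beta_255:.4f}, beta(252) = {beta_252:.4f}")
for mu1, beta in oc_curve:
    print(f"mu1 = {mu1} g  beta = {beta:.4f}")
print({n: round(b, 4) for n, b in beta_by_n.items()})
```

Because the sketch uses the unrounded critical value, its results differ slightly from a hand calculation with 256.7: β(255) comes out at roughly 0.196, in line with the approximately 20% stated in the text. Under the assumed σ, the β error at 255 g shrinks as the sample size grows.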

Literature

  • Joachim Hartung, Bärbel Elpelt, Karl-Heinz Klösener: Statistik. Lehr- und Handbuch der angewandten Statistik. 9th revised edition, Oldenbourg, Munich 1993, in particular pp. 135 ff. and 381 ff.

References

  1. Bernd Rönz, Hans G. Strohe (1994): Lexikon Statistik. Gabler Verlag, p. 268.