# Selectivity of a test

The selectivity of a test, also called the quality, power (English *power*: might, performance, strength), test strength, or test severity (severity for short), describes the decision-making ability of a statistical test in test theory, a branch of mathematical statistics. In the context of assessing a binary classifier, the selectivity of a test is also referred to as sensitivity. Just like the level of a test, the selectivity of a test is a term derived from the quality function (selectivity function).

The power of a test indicates its ability to detect differences (effects) when they actually exist. More precisely, the selectivity is the probability with which a statistical test correctly rejects the null hypothesis $H_0$ ("there is no difference") when the alternative hypothesis $H_1$ ("there is a difference") is true. If the null hypothesis represents the absence of a specific disease ("not sick"), the alternative hypothesis the presence of the disease ("sick"), and the rejection of the null hypothesis a positive diagnostic test, then the selectivity of the test is equivalent to the sensitivity of the test (the probability that a sick person will have a positive test result). This fact provides a bridge between test theory and the theory of diagnostic testing.

The selectivity of the test can therefore be interpreted as its "power to reject". High selectivity speaks against, low selectivity for, the null hypothesis $H_0$. One tries to determine the rejection region $A$ in such a way that the probability of rejecting a "false null hypothesis" $H_0$, i.e. of maintaining the alternative hypothesis $H_1$ given that $H_1$ is true, is as large as possible: $\operatorname{Pr}(T \in A \mid H_1) = 1-\beta$. In order to calculate the selectivity of a test, the alternative hypothesis $H_1$ must be specified in the form of a concrete point hypothesis.

It forms the complement to the type II error probability $\beta$, i.e. the probability of wrongly deciding in favor of the null hypothesis $H_0$ when the alternative $H_1$ is valid. The selectivity itself is the probability $1-\beta$ of avoiding such an error.

## Description

For a type II error probability $\beta$, the corresponding selectivity is $1-\beta$. For example, if experiment E has a power of $0.7$ and experiment F has a power of $0.95$, then experiment E is more likely to produce a type II error than experiment F, and experiment F is more reliable than experiment E because of its lower type II error probability. Equivalently, the selectivity of a test can be viewed as the probability that a statistical test correctly rejects the null hypothesis $H_0$ ("there is no difference") when the alternative hypothesis $H_1$ ("there is a difference") is true, i.e.

$\text{Selectivity} = \Pr\left(\text{reject } H_0 \mid H_1 \text{ is true}\right) = 1 - \beta$.

It can thus be seen as the ability of a test to detect a specific effect when that specific effect is actually present. If $H_1$ is not a point hypothesis but only the negation of $H_0$ (for example, for an unobservable population parameter $\mu$ one would simply have the negation $H_1\colon \mu \neq 0$ of $H_0\colon \mu = 0$), then the selectivity of the test cannot be calculated unless the probabilities for all possible values of the parameter that violate the null hypothesis are known. One therefore generally refers to the selectivity of a test against a specific alternative hypothesis (point hypothesis).

As the selectivity $1-\beta$ increases, the probability of a type II error $\beta$ decreases. A related concept is the type I error probability $\alpha$. The smaller $\beta$ is for a given type I error probability $\alpha$, the more sharply the test separates $H_0$ and $H_1$. A test is called selective if, for a given $\alpha$, it has a relatively high selectivity compared with other possible tests. If $H_0$ is true, the maximum power of a test equals $\alpha$.

Selectivity analyses (power analyses) can be used to calculate the minimum sample size required to detect an effect of a certain size (effect size) with sufficient probability. Example: "How many times do I have to flip a coin to conclude that it has been tampered with to some extent?"
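The coin-flip question can be answered by such a power analysis. Below is a minimal Python sketch using the standard normal approximation for a two-sided test of a proportion; the alternative heads probabilities, the significance level $\alpha = 0.05$, and the target power of $0.8$ are illustrative assumptions, not values from the text:

```python
from math import ceil, sqrt
from statistics import NormalDist

def coin_sample_size(p1: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Minimum number of flips needed to detect a coin whose heads
    probability is p1 instead of 0.5, via the two-sided normal
    approximation for a one-sample proportion test."""
    p0 = 0.5
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for power 1 - beta
    n = ((z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1)))
         / (p1 - p0)) ** 2
    return ceil(n)

# A mild bias (0.6 instead of 0.5) needs roughly 200 flips at 80 % power;
# a strong bias (0.7) needs far fewer.
print(coin_sample_size(0.6), coin_sample_size(0.7))
```

The calculation illustrates the general pattern discussed below: the smaller the effect to be detected, the larger the sample required for the same selectivity.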

In the context of assessing a binary classifier , the selectivity of a test is also referred to as sensitivity .

## Decision table

| Decision | Reality: $H_0$ is true | Reality: $H_1$ is true |
| --- | --- | --- |
| … for $H_0$ | Correct decision (specificity), probability: $1-\alpha$ | Type II error, probability: $\beta$ |
| … for $H_1$ | Type I error, probability: $\alpha$ | Correct decision (selectivity of the test), probability: $1-\beta$ |
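The correspondence between this decision table and the confusion matrix of a binary classifier can be made concrete: the sensitivity is exactly the probability $1-\beta$ of correctly rejecting $H_0$ for a sick patient, and the specificity the probability $1-\alpha$ of correctly retaining $H_0$ for a healthy one. A minimal sketch with made-up illustrative counts:

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Selectivity 1 - beta of a diagnostic test: the share of truly
    sick cases (H1 true) in which H0 is correctly rejected."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives: int, false_positives: int) -> float:
    """Probability 1 - alpha of correctly keeping H0 when H0 is true."""
    return true_negatives / (true_negatives + false_positives)

# Illustrative counts: of 100 sick patients the test flags 85,
# of 100 healthy patients it wrongly flags 5.
print(sensitivity(85, 15), specificity(95, 5))
```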

## Choice of the β-error level

*Figure: Influence of the sample size on the quality function (selectivity) of a one-sided (here left-sided) test.*
*Figure: Influence of the sample size on the quality function (selectivity) of a two-sided test.*

For studies of the effectiveness of medical treatments, Cohen (1969: 56) suggests a value for $\beta$ that is four times as high as that for the significance level $\alpha$. So if $\alpha = 5\,\%$, the error level $\beta$ should be 20 %. If the probability $\beta$ of a type II error in an investigation lies below this 20 % limit, the selectivity $1-\beta$ is thus greater than 80 %.

It should be borne in mind that the $\beta$ error generally cannot be controlled directly at a given, fixed significance level $\alpha$. In many asymptotic or nonparametric tests the $\beta$ error is simply not predictable, or only simulation studies exist. In contrast, with some tests, for example the t-test, the $\beta$ error can be controlled if the statistical evaluation is preceded by sample-size planning.

An equivalence test (induced from the parameters of the t-test) can be used to control the (t-test) $\beta$ error independently of sample-size planning. In this case the (t-test) significance level $\alpha$ is variable.

## Determining factors of selectivity

There are several ways to increase the power of a test. The selectivity $1-\beta$ increases:

• with increasing difference $(\mu_0 - \mu_1)$ (this means: a large difference between two subpopulations is less often overlooked than a small difference)
• with decreasing feature spread $\sigma$
• with increasing significance level $\alpha$ (if $\beta$ is not fixed)
• with increasing sample size $n$, since the standard error $\sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}}$ then becomes smaller: smaller effects can be detected with a larger sample size
• for one-sided tests compared to two-sided tests: for the two-sided test, a sample size approximately $25\,\%$ larger is needed to achieve the same selectivity as for the one-sided test
• through the use of the best, i.e. most selective (English *most powerful*), test
• by reducing the scatter in the data, e.g. through the use of filters or the choice of homogeneous subgroups (stratification)
• by increasing the sensitivity of the measuring process (strengthening the effects, e.g. through higher dosages)
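For a one-sided z-test (a simplifying stand-in for the t-test with known variance), these determinants can be read off directly from the power formula $1-\beta = \Phi\!\left(\frac{\Delta\mu}{\sigma}\sqrt{n} - z_{1-\alpha}\right)$. A minimal Python sketch; all numerical values are illustrative assumptions:

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided(mu_diff: float, sigma: float, n: int,
                    alpha: float = 0.05) -> float:
    """Power 1 - beta of a one-sided z-test for a mean difference mu_diff,
    feature spread sigma, sample size n, and significance level alpha."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # The standard error sigma / sqrt(n) shrinks as n grows.
    return NormalDist().cdf(mu_diff * sqrt(n) / sigma - z_alpha)

# Power grows with the effect, with n, with alpha, and with smaller sigma:
p_base = power_one_sided(0.5, 1.0, 25)
assert power_one_sided(0.8, 1.0, 25) > p_base        # larger difference
assert power_one_sided(0.5, 1.0, 50) > p_base        # larger sample
assert power_one_sided(0.5, 0.5, 25) > p_base        # smaller spread
assert power_one_sided(0.5, 1.0, 25, 0.10) > p_base  # larger alpha
```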

The type of statistical test is also important for the selectivity or power: if the distributional assumption holds, parametric tests such as the t-test always have, at the same sample size, a higher selectivity than nonparametric tests such as the Wilcoxon signed-rank test. However, if the assumed and the true distribution differ from one another, for example if the data actually follow a Laplace distribution while a normal distribution was assumed, nonparametric methods can also have a considerably higher selectivity than their parametric counterparts.
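This contrast can be illustrated by a small Monte Carlo simulation. The sketch below is deliberately simplified: the Wilcoxon signed-rank test is replaced by the even simpler (and also distribution-free) sign test, the t-test by a z-test with known unit variance, and the shift of 0.3, sample size 50, and simulation count are arbitrary illustrative choices:

```python
import random
from math import comb, log, sqrt
from statistics import NormalDist, mean

def laplace(mu: float, scale: float) -> float:
    """Draw from a Laplace distribution via inverse transform sampling."""
    u = random.random() - 0.5
    return mu - scale * (1 if u >= 0 else -1) * log(1 - 2 * abs(u))

def empirical_power(draw, n: int = 50, sims: int = 3000) -> tuple:
    """Monte Carlo power of a one-sided mean (z) test and a sign test
    for H0: mu = 0 against mu > 0 at alpha = 0.05."""
    z_crit = NormalDist().inv_cdf(0.95)
    # Smallest k with P(Bin(n, 0.5) >= k) <= 0.05: sign-test critical value.
    k_crit = min(k for k in range(n + 1)
                 if sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n <= 0.05)
    z_hits = sign_hits = 0
    for _ in range(sims):
        x = [draw() for _ in range(n)]
        if mean(x) * sqrt(n) >= z_crit:        # assumes unit variance
            z_hits += 1
        if sum(v > 0 for v in x) >= k_crit:    # distribution-free
            sign_hits += 1
    return z_hits / sims, sign_hits / sims

random.seed(1)
# Same variance (1) and same shift (0.3) under both distributions.
z_nrm, sign_nrm = empirical_power(lambda: random.gauss(0.3, 1.0))
z_lap, sign_lap = empirical_power(lambda: laplace(0.3, 1 / sqrt(2)))
```

Under normal data the mean-based test has the higher empirical power; under Laplace data with the same variance the distribution-free sign test comes out ahead, mirroring the statement above.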

## Opposite notation

In some sources, which can cause confusion, the exact opposite notation is used for the type II error and the selectivity: there, the probability of committing a type II error is denoted by the value $1-\beta$, whereas the selectivity is denoted by $\beta$.

## Web links

Wiktionary: Power – explanations of meaning, word origin, synonyms, translations

## References

1. Ludwig Fahrmeir, Rita Künstler, Iris Pigeot, Gerhard Tutz: Statistics. The way to data analysis. 8th, revised and expanded edition. Springer Spektrum, Berlin/Heidelberg 2016, ISBN 978-3-662-50371-3, p. 393.
2. Otfried Beyer, Horst Hackel: Probability calculation and mathematical statistics. 1976, p. 154.
3. Ludwig von Auer: Econometrics. An introduction. 6th, revised and updated edition. Springer, 2013, ISBN 978-3-642-40209-8, p. 128.
4. Otfried Beyer, Horst Hackel: Probability calculation and mathematical statistics. 1976, p. 154.
5. This is true because $1-\beta = 1 - \frac{f_{\text{n}}}{f_{\text{n}} + r_{\text{p}}} = \frac{f_{\text{n}} + r_{\text{p}}}{f_{\text{n}} + r_{\text{p}}} - \frac{f_{\text{n}}}{f_{\text{n}} + r_{\text{p}}} = \frac{r_{\text{p}}}{f_{\text{n}} + r_{\text{p}}} = \text{Sensitivity}$. For the meaning of the notation, see the truth matrix: right and wrong classifications.
6. Frederick J. Dorey: Statistics in Brief: Statistical Power: What Is It and When Should It Be Used? (2011), pp. 619–620.
7. Ludwig von Auer: Econometrics. An introduction. 6th, revised and updated edition. Springer, 2013, ISBN 978-3-642-40209-8, p. 128.
8. Lothar Sachs, Jürgen Hedderich: Applied Statistics: Collection of Methods with R. 8th, revised and expanded edition. Springer Spektrum, Berlin/Heidelberg 2018, ISBN 978-3-662-56657-2, p. 461.
9. J. Bortz: Statistics for social scientists. Springer, Berlin 1999, ISBN 3-540-21271-X.
10. Erwin Kreyszig: Statistical methods and their applications. 7th edition. Göttingen 1998, pp. 209 ff.