p-hacking

from Wikipedia, the free encyclopedia

p-hacking, also known as specification searching, describes the distortion and manipulation of research results by adjusting the test parameters after the fact.

The p-value is "hacked", i.e. artificially pushed below the 5% threshold, so that the results appear to be statistically significant. The frequent misinterpretation of p-values and the use of p-hacking have led to many false research findings that have harmed science. p-hacking can be seen as a reaction of scientific authors to the fact that studies with significant results are preferred for publication, while analyses that do not show significant results remain unpublished (file drawer problem). Meta-analyses can help to uncover p-hacking.

Statistical significance by chance

Mining a single data set by meticulously searching for combinations of variables that might be correlated implicitly tests a large number of hypotheses.

Conventional significance tests fix in advance the probability of a type I error (a false positive), known as the significance level. The risk of accepting a false test result therefore has to be tolerated. When many statistical tests are carried out, some will produce false positives purely by chance: on average, 5% of the tested hypotheses for which no real effect exists are significant at the 5% level, 1% at the 1% level, and so on. If enough hypotheses are tested, it is practically certain that some of them will wrongly appear to be statistically significant.
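The effect described above can be reproduced with a small simulation. The following is a minimal sketch, not taken from the cited sources: it assumes independent tests, two groups of normally distributed data with no real difference and a known variance of 1, and a two-sided z-test. The function names `null_p_value`, `fraction_significant`, and `prob_at_least_one` are made up for this illustration.

```python
import math
import random


def null_p_value(rng, n=30):
    """Two-sided z-test p-value for two groups drawn from the same N(0, 1) population.

    Because both groups share one distribution, any "significant" result
    is a false positive by construction.
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    diff = sum(a) / n - sum(b) / n
    z = diff / math.sqrt(2.0 / n)  # assumed known variance of 1 in each group
    return math.erfc(abs(z) / math.sqrt(2.0))


def fraction_significant(num_tests, alpha=0.05, seed=0):
    """Share of true-null tests that fall below alpha purely by chance."""
    rng = random.Random(seed)
    hits = sum(null_p_value(rng) < alpha for _ in range(num_tests))
    return hits / num_tests


def prob_at_least_one(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent true-null tests is significant."""
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    print("fraction significant:", fraction_significant(10_000))
    for k in (1, 10, 50, 100):
        print(k, "tests:", round(prob_at_least_one(k), 3))
```

Under these assumptions the simulated share of significant results should lie close to the chosen significance level, and the chance of at least one false positive approaches 1 as the number of tests grows.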

Example: chocolate diet

In a satirical study in 2015, John Bohannon claimed that dark chocolate could lead to weight loss as part of a diet.

In order to publish this claim with an error probability of less than 5% (i.e. p < 0.05), he defined 18 different criteria in advance on which dark chocolate might have an effect, for example weight, cholesterol level, blood pressure, and sleep quality. For each criterion taken individually, it was very unlikely that dark chocolate would show a statistically significant effect. But because there were so many criteria, there was a high probability from the start that at least one of them would appear to be significantly influenced by dark chocolate consumption. In this study, it was the weight loss claim that turned out to be "statistically significant".
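As a rough illustration of why 18 criteria make a "significant" result likely: if the 18 tests are treated as independent and dark chocolate has no real effect on any criterion (both are simplifying assumptions made here, not figures from the study itself), the probability of at least one false positive at the 5% level is:

```latex
\[
P(\text{at least one } p < 0.05)
  = 1 - (1 - 0.05)^{18}
  = 1 - 0.95^{18}
  \approx 0.60
\]
```

In practice, criteria such as weight and cholesterol level are correlated, so the exact figure would differ. The calculation only shows how quickly the chance of a spurious finding grows with the number of criteria.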

The study deliberately contained numerous other methodological flaws in order to draw attention to precisely these deficiencies.

Countermeasures

An increasing number of journals are now adopting the registered report format to counter scientific misconduct such as p-hacking and HARKing.

References

  1. Megan L. Head et al.: The Extent and Consequences of P-Hacking in Science. In: PLOS Biology. March 13, 2015, p. 1, doi:10.1371/journal.pbio.1002106.
  2. Regina Nuzzo: When researchers fail the significance test. In: Spektrum.de. February 2, 2014, accessed April 11, 2018.
  3. io9.gizmodo.com
  4. Promoting reproducibility with registered reports. In: Nature.com. January 10, 2017, doi:10.1038/s41562-016-0034.