Reproducibility (psychology)

The reproducibility (also replicability or repeatability; see replication (experiment)) of test results by other researchers is a fundamental requirement of scientific research, especially in the natural sciences. In other empirical sciences, such as psychology and medicine, important results should likewise be checked by independent and qualified investigators. Associated with this is the expectation that scientific research monitors itself in this way and develops gradually on the basis of replicated findings.

Task in psychology

A psychological experiment or other piece of research should be described so precisely in its methods that it can be checked. From the perspective of the philosophy of science of critical rationalism, testing a theory against all attempts at refutation is a fundamental principle (see falsification). From the growing stock of relatively reliable results (facts), increasingly secure specialist knowledge can be gained, as is necessary for the formation of theories and for applications in the practical fields of psychology. Scientific definitions of reproducibility (International Union of Pure and Applied Chemistry, see reproducibility) cannot be adopted directly, because psychological and social-science studies with humans take place under special conditions.

Replication strategies

Methodologically, a distinction must be made between different replication strategies (see also Schmidt 2009, Schweizer 1989):

  • The direct (exact) replication is the repetition (duplication) of a particular study; it is also known as identical or exact replication. Strictly speaking, it is only a similar repetition with other participants. An exact repetition is - apart from computer-aided experiments with a high degree of standardization of simple procedures - possible at most in the same laboratory. Even if the experiment is recorded very precisely and the presentation of the independent variables and the measurement of the dependent variables are standardized, there usually remain particular technical skills of the investigator, peculiarities of the research style, peculiarities of the investigator-participant interaction, and other potentially important context variables (see reactivity (social sciences)). Brief journal articles usually do not contain sufficient information for a direct replication.
  • In a reanalysis, independent scientists re-examine the (ideally accessible) dataset of a published scientific work.
  • The approximate replication tries to repeat the original study as closely as possible. How well this is achieved is not easy to assess because of the numerous methodological aspects involved.
  • In a partial replication, only one of the important study conditions is changed: the selection of participants, the presentation of the independent variable (in duration, intensity, quality, etc.), or the measurement of the dependent variable, for example with a newly developed measurement or test method.
  • A systematic replication deliberately varies two or more major study conditions at once. This procedure appears more economical because, in the positive case, it creates a broader base of evidence; in the negative case, however, it remains unclear why the outcome differed.
  • The constructive (conceptual) replication consists of a newly designed study that adopts the general theoretical approach and the research hypothesis but chooses different operational definitions (methods) of the independent and dependent variables, provided these are theoretically regarded as adequate. The objective is retained; the methodological implementation is more or less redesigned. Conceptual replications are relatively common, though rarely under this name, appearing instead as more or less loose references to earlier studies. They show whether the phenomenon of interest is stable across different settings. It remains questionable, however, whether a phenomenon operationalized so differently is still "the same" (Siri Carpenter, 2012).

The distinction proposed by Asendorpf et al. (2013) between reproducibility (i.e. identical results when the same data set is analyzed independently), replicability (i.e. generalizability in multiple dimensions), and generalizability (i.e. the exclusion of certain moderator effects) is unfortunate, because it follows neither the widespread use of these terms nor Lee J. Cronbach's concept of generalizability theory.

Stefan Schmidt (2009) suggests a functional classification of replication approaches according to their guiding intention: to control for chance effects, to control for possible artifacts (deficiencies in internal validity), to check for fraud, to generalize to another population, and to confirm the hypotheses underlying the original experiment. Is the lack of replication attempts a blind spot of psychology and the social sciences, as Schmidt argues? He therefore calls for a more thorough methodological discussion, greater attention in textbooks, and a change in editorial policy.

The fact that numerous non-reproducible research results have been published is well known from the history and methodology of the sciences (cf. publication bias in medicine). In the past, too, there were individual voices calling for the replication of psychological results. As early as 1959, analyzing the articles in four psychology journals, the statistician Theodore Sterling found that almost all of the papers reported positive results. He saw a connection with the selection criteria for submitted manuscripts, which favor the publication of "positive" results; a follow-up analysis in 1995 showed that the situation had remained unchanged.

The systematic reanalysis of the original data from already published articles appears to be a difficult path. Although the guidelines of the American Psychological Association (its Ethical Standards and Publication Manual) stipulate that such data are generally to be made available by the authors, Wicherts and colleagues, who had requested the data of 141 selected articles in APA journals in order to analyze the influence of outlier values, actually received them in only 27 percent of cases; they then had to abandon their project.

In the textbooks on psychological methodology, the strategies of replication research are treated rather casually. There is still a lack of methodological discussion, conventions, and systematic approaches - and there is, as Ed Yong put it, "lots of botch":

“Positive outcomes in psychology are like rumors - easy to spread but difficult to take back. They shape the content of most specialist journals, which is no wonder, as the journals prefer to report on new, exciting studies. Attempts to reproduce these, on the other hand, often go unpublished, especially if they fail.”

Failed replication attempts are more likely to attract attention if the original results were particularly interesting but highly dubious. The "precognition" findings reported by the social psychologist Daryl Bem provoked three (failed) replication attempts in memory experiments. The critical report on this failed replication was rejected by Science and two psychology journals before appearing in the online journal PLOS ONE.

The question of the reproducibility of important findings is discussed in some areas of psychology - as in medicine - mainly in in-depth reviews of controversial topics or in statistically summarizing meta-analyses (see evidence-based medicine). By contrast, the psychological literature databases contain only relatively few publications on successful and unsuccessful replications of psychological experiments or on systematically varied generalizability studies. Occasionally, the awareness of this problem in specialist circles, in the face of unexpected and implausible results, finds expression in ironic, mocking references to the Journal of Irreproducible Results, whose articles are intended to make readers laugh and then think (scientific humor). This magazine, founded in 1955, was followed in 1995 by the satirical Annals of Improbable Research, with real and fictional experiments on often absurd topics.

Resistance to Replication

What are the reasons for this lack of scientific self-control? Work on replication may not be regarded as creative; corresponding publications would then contribute little to a scientific reputation and would be less advantageous, at least for younger scientists, than the publication of "new" findings. The very reserved attitude of the editors of many scientific journals supports this assumption. In a survey of 79 editors of social science journals, 94 percent declined to accept manuscripts on replication studies, and 54 percent of reviewers said they would prefer a new study to a replication study. Could the concern that too many published results might not be reproducible also play a role? Siri Carpenter quotes various opinions on the Reproducibility Project. While this bold initiative is acknowledged, there is concern that the project, if only a few experiments were confirmed, could amount to an unfair accusation against psychology:

"I think one would want to see a similar effort done in another area before one concluded that low replication rates are unique to psychology. It would really be a shame if a field that was engaging in a careful attempt at evaluating itself were somehow punished for that. It would discourage other fields from doing the same. "

A senior figure in psychology advised against the planned Reproducibility Project because psychology is under pressure and such a project would make it look bad. Other scientists, in contrast, praised the bold initiative; other disciplines could benefit from this type of self-reflection. The organizer of the project, Brian Nosek, explained his point of view:

"We're doing this because we love science. The goal is to align the values ​​that science embodies - transparency, sharing, self-critique, reproducibility - with its practices. "

Arguments for More Replication Studies

In the USA, and gradually also in Germany, an increasingly critical attitude towards the usual forms of publication and the lack of internal control can be observed in general science journals. Systematic evidence of statistical flaws and extreme cases of data falsification have increased interest in replication studies. There are growing demands for quality control, for example quality assurance in psychological diagnostics.

  • A recent example of fraud and falsification in science was provided by the well-known social psychologist Diederik Stapel, who produced at least 30 publications based on invented data. (These falsifications, however, were discovered not through replication attempts but through information from within his working group.) There are also current allegations against two other social psychologists, Dirk Smeesters and Jens Förster.
  • The number of retractions of scientific publications that are no longer trustworthy, in medicine but also in the social sciences, is small but has increased significantly, with fraud being the main reason. The retraction rate also appears to be related to the impact factor, i.e. the reputation of the journal.
  • A survey of research practices among psychologists in the USA, based on 2,155 responses, found that 43 percent admitted to having left out inconvenient data, 35 percent to having presented a surprising result as if it had been expected all along, and 2 percent to having tweaked data.
  • Investigators have leeway in planning an experiment: how many participants, how many dependent variables, and so on. For example, the chance of obtaining significant results can be roughly doubled if the participants are split into two age groups or broken down by gender; in addition, several test statistics can be calculated in parallel. An investigator has many "degrees of freedom" of this kind and may be tempted to achieve the desired "positive" results through such "flexible" decisions, possibly made after the fact; in extreme cases, the hypotheses are only formulated once the results are available (see the simulation sketch after this list).
  • In many research areas of psychology and medicine, studies with only 20 to 30 participants are common, for example in the neurosciences, because of the considerable effort involved. It is often overlooked that with so little data the statistical results can even be reversed, depending on whether the author includes or excludes a conspicuous value, an "outlier", before the calculations.
  • A systematic reanalysis of clinical trials showed that in 35% of the publications the conclusions about the investigated treatments differed substantially from those of the original publications. If the size of an effect decreases markedly in follow-up studies, this is referred to as a decline effect.
  • Other authors point to the limited informative value of the statistical significance of a finding and demand that the magnitude of an effect (effect size) always be reported in suitable parameters, together with a power analysis. A review of 1,000 publications showed, contrary to theoretical expectation, a systematic relationship between effect size and sample size, i.e. a particular form of publication bias can be assumed (see the correlation sketch after this list).
  • In extensive reviews of medical research, the epidemiologist John Ioannidis very often found deficiencies. His frequently cited study was criticized on statistical grounds; a later estimate based on 77,430 articles in five major medical journals between 2000 and 2010 put the rate of false-positive results at 14 percent, with no increase over this period.
  • Numerous publications in psychology contain errors in statistical analysis. In 18 percent of the 281 articles examined there were deficiencies in the statistical analysis, and in 15 percent at least one error, which often turned out to be in favor of the hypothesis.
  • Since almost all research results in psychology and medicine today rest on statistical analyses, i.e. on testing the probability of the observed result against chance expectation, a larger body of published findings must contain some results that are positive, and some that are negative, purely by chance. Studies, however, have shown an implausibly high percentage of "positive" results in many areas of science. Some investigators, faced with a negative result that contradicts their expectations, will be inclined to leave the work in the drawer ("file drawer problem") and to publish preferentially their significant positive results. An analysis of 4,600 studies from different disciplines showed a relatively high proportion of positive results, especially in psychology and psychiatry: 91.5 percent of these studies confirmed the research hypothesis, making the odds of a positive result about five times higher than, for example, in the geosciences. Fanelli suggests that the "softer" sciences impose fewer constraints on the conscious and unnoticed tendencies in favor of a positive result.
  • The current system of scientific publications in psychology favors the publication of non-replicable results.
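
The "flexible" analysis decisions described in the list above can be illustrated with a short simulation. The following sketch is not taken from any of the cited studies; it is a minimal illustration under simple assumptions: two groups are drawn from the same population, so there is no true effect, and a hypothetical analyst who, besides the single planned test, also tries subgroup splits and an ad-hoc outlier removal and then reports the smallest p-value obtains "significant" results far more often than the nominal 5 percent.

    import numpy as np
    from scipy import stats

    # Minimal simulation sketch (hypothetical, not from the cited studies):
    # no true effect exists, yet "flexible" analysis choices inflate the
    # rate of nominally significant (p < .05) results.
    rng = np.random.default_rng(1)
    n_sim, n_per_group, alpha = 10_000, 25, 0.05
    planned_hits, flexible_hits = 0, 0

    for _ in range(n_sim):
        a = rng.normal(size=n_per_group)                 # control group, no true effect
        b = rng.normal(size=n_per_group)                 # "treatment" group, no true effect
        gender_a = rng.integers(0, 2, size=n_per_group)  # arbitrary subgroup labels
        gender_b = rng.integers(0, 2, size=n_per_group)

        p_values = [stats.ttest_ind(a, b).pvalue]        # the single planned test
        planned_hits += p_values[0] < alpha

        # flexible decisions: subgroup analyses and ad-hoc outlier removal
        for g in (0, 1):
            sub_a, sub_b = a[gender_a == g], b[gender_b == g]
            if len(sub_a) > 2 and len(sub_b) > 2:
                p_values.append(stats.ttest_ind(sub_a, sub_b).pvalue)
        b_trimmed = np.delete(b, np.argmax(np.abs(b)))   # drop the most extreme value
        p_values.append(stats.ttest_ind(a, b_trimmed).pvalue)

        flexible_hits += min(p_values) < alpha           # report the "best" analysis

    print(f"false-positive rate, planned test only: {planned_hits / n_sim:.3f}")
    print(f"false-positive rate, flexible analyses: {flexible_hits / n_sim:.3f}")

In typical runs the single planned test stays close to the nominal 5 percent, while reporting the best of several analyses inflates the false-positive rate considerably.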
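
The relationship between effect size and sample size mentioned in the list above can be checked with a simple rank correlation. The sketch below uses invented numbers purely for illustration: without publication bias, effect size and sample size should be roughly uncorrelated, whereas a clearly negative correlation (small studies reporting large effects) is the pattern interpreted as bias.

    import numpy as np
    from scipy import stats

    # Hypothetical effect sizes (Cohen's d) and total sample sizes extracted
    # from a set of published studies; the numbers are invented for illustration.
    effect_sizes = np.array([0.82, 0.61, 0.45, 0.38, 0.30, 0.28, 0.22, 0.20])
    sample_sizes = np.array([18, 24, 40, 55, 80, 95, 150, 210])

    rho, p = stats.spearmanr(sample_sizes, effect_sizes)
    print(f"Spearman rho between N and effect size: {rho:.2f} (p = {p:.3f})")
    # rho close to 0 would be expected without bias; a markedly negative rho
    # is the pattern reported by Kühberger et al. (2014) as a sign of bias.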

Important aspects of the methodological discussion are continued in a collection of essays, the Special Section on Replicability in Psychological Science: A Crisis of Confidence?. In connection with an overview of the problem, Joachim Funke has set up a blog on the reproducibility of psychological research.

The Reproducibility Project

Task

The Reproducibility Project, founded by Brian Nosek together with numerous American and some international collaborators, has set itself the following task:

"Do normative scientific practices and incentive structures produce a biased body of research evidence? The Reproducibility Project is a crowdsourced empirical effort to estimate the reproducibility of a sample of studies from scientific literature. The project is a large-scale, open collaboration currently involving more than 150 scientists from around the world.

The investigation is currently sampling from the 2008 issues of three prominent psychology journals - Journal of Personality and Social Psychology , Psychological Science , and Journal of Experimental Psychology: Learning, Memory, and Cognition . Individuals or teams of scientists follow a structured protocol for designing and conducting a close, high-powered replication of a key effect from the selected articles. We expect to learn about:

  • The overall rate of reproducibility in a sample of the published psychology literature
  • Obstacles that arise in conducting effective replications of original study procedures
  • Predictors of replication success, such as the journal in which the original finding was published, the citation impact of the original report, and the number of direct or conceptual replications that have been published elsewhere
  • Aspects of a procedure that are or are not critical to a successful direct replication, such as the setting, specific characteristics of the sample, or details of the materials."

The Reproducibility Project is organized and financed within the Center for Open Science (COS). This non-profit organization aims to "increase the openness, integrity, and reproducibility of scientific research." For the project, the first 30 articles of the 2008 volume of the three named journals were initially selected for the most accurate replication possible. Important details and criteria are set out in written instructions, and the replicating researchers are expected to contact the original authors for methodological details.

In the US, this project received a lot of attention in science magazines and was welcomed as a courageous initiative that had to overcome internal concerns. Psychologists commented very differently on the intent of the project and the concept of reproducibility.

270 scientists from 125 institutions, including 14 German institutes, took part in the project. The report is based on 100 publications from the three American journals. The selection from a total of 488 articles from the year 2008 is described as "quasi-random": there were a number of eligibility criteria and a step-by-step process by which topics were gradually offered to potential project members for replication. The 100 of 113 replication attempts that were completed in time for the report were included. Because of this peculiar selection process, the results cannot be generalized to all 488 publications, let alone to experimental psychology as a whole.

Results

The replicating investigators tried to reproduce the experiment and its individual conditions, including the statistical evaluation, as precisely as possible; as a rule they were supported by the original investigators and the project management team. After the detailed statistical evaluations had been completed, the replicating investigators assessed whether the replication was successful. In 39% of cases this question was answered in the affirmative. The majority of the published research results could therefore not be confirmed.

The summarizing project report and the supplementary documents contain detailed statistical analyses that take various aspects and criteria of such comparisons into account. In addition to statistical significance (whether a result exceeds chance expectation), the size of the experimentally induced difference between experimental and control group (effect size) can be used as a criterion. Furthermore, the original and replication studies can be combined statistically, and the correlation of both indices with influencing variables (moderator variables) can be examined (a small worked example follows the quoted summary below). The group of authors summarizes the Reproducibility Project:

“We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline. Ninety-seven percent of original studies had statistically significant results. Thirty-six percent of replications had statistically significant results; 47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects.”
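
The criteria named above can be made concrete with a small worked example. The sketch below uses hypothetical numbers and expresses both effects as correlation coefficients: it tests whether the replication effect is itself significant, computes the 95% confidence interval of the replication effect via the Fisher z-transformation, and checks whether the original effect size falls inside that interval. Pooling the two studies and the moderator analyses done in the project report are omitted here.

    import math
    from scipy import stats

    # Hypothetical original and replication effects, expressed as correlations.
    r_orig = 0.45               # effect size of the (hypothetical) original study
    r_rep, n_rep = 0.20, 160    # (hypothetical) high-powered replication

    # 1) Is the replication effect itself statistically significant?
    t = r_rep * math.sqrt((n_rep - 2) / (1 - r_rep ** 2))
    p_rep = 2 * stats.t.sf(abs(t), df=n_rep - 2)

    # 2) 95% confidence interval of the replication effect (Fisher z-transformation)
    z = math.atanh(r_rep)
    se = 1 / math.sqrt(n_rep - 3)
    ci_low, ci_high = math.tanh(z - 1.96 * se), math.tanh(z + 1.96 * se)

    print(f"replication: r = {r_rep:.2f}, p = {p_rep:.4f}")
    print(f"95% CI of the replication effect: [{ci_low:.2f}, {ci_high:.2f}]")
    print(f"original effect inside replication CI: {ci_low <= r_orig <= ci_high}")

With these invented numbers the replication is significant on its own, yet the original effect lies outside the replication's confidence interval - the "decline" pattern described in the report.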

Comments and criticism

In retrospect, the group of authors commented:

"We conducted this project because we care deeply about the health of our discipline and believe in its promise for accumulating knowledge about human behavior that can advance the quality of the human condition. Reproducibility is central to that aim. Accumulating evidence is the scientific community's method of self-correction and is the best available option for achieving that ultimate goal: truth. "

Shortly after its publication (August 28, 2015), the main result was also discussed critically in the German media. The disappointing outcome of the Reproducibility Project poses a major challenge for psychologists and their professional societies to rethink fundamental research strategies and to reform the requirements for scientific publications. Other disciplines are encouraged to follow the example of this self-critical open-science collaboration in psychology. The low reproducibility is not only problematic from a scientific point of view: several studies show that it also damages public trust in psychology.

The German Society for Psychology (DGPs) commented on the results rather positively in a statement, which in turn provoked criticism from some specialist representatives. The criticism concerns, on the one hand, the overly positive presentation of the results in the DGPs announcement and, on the other hand, fundamental shortcomings of the study.

A further limitation of the study is that the selected works mainly concern specific topics and sub-disciplines, i.e. cognitive psychology, priming, attitude effects induced by special instructions, and questions based on simple, computer-aided experiments, so the topics are not representative of psychology as a whole. Studies that are more demanding in terms of research effort, methods, equipment and participants, i.e. that do not rely only on psychology students, are in the minority. The project deals with experiments, whereas a large part of psychological research consists of quasi-experimental investigations (see psychological experiment), measurement of change, correlation analyses, and criterion predictions. The problem of context dependency and the question of the practically important external and ecological validity, which can only be tested under everyday conditions in field experiments and laboratory-field comparisons, are not addressed at all. Consequently, with all due recognition for the Reproducibility Project, which is outstanding in scale and methodology, its findings cannot simply be generalized to the research results of empirical psychology as a whole.

In addition, it is noted that there can hardly be a general standard for what percentage of failed replication attempts should be considered problematic or very problematic. Nosek thinks that the main result may not be the number of reproducible studies but rather the insight into which features characterize a successfully replicated study. The project report contains numerous methodological considerations and suggestions for future investigations into the reproducibility of scientific work, also in other disciplines.

Inferences

Recommendations

A group of authors names some general methodological principles and demands that authors make their research more transparent: the research plan should be documented before the start of the investigation and, if possible, archived with open access; the research materials and, above all, the data should as a rule be made available; and there are hopes for an international study register. Internet-based cooperation opens up many new possibilities. Recommendations are also addressed to journal editors and reviewers, academic teachers, and institutions and funders. Will the test of reproducibility one day become the scientific standard in psychology? So far, specific measures and facilities have hardly been created by the professional associations, but rather by individual initiatives.

More precise publication guidelines

The Committee on Publication Ethics (COPE), together with other organizations, has developed the Principles of Transparency and Best Practice in Scholarly Publishing (revised and updated).

Brian Nosek and members of the project group formulated guidelines for transparency, openness and reproducibility in an accompanying essay. The eight standards of the Transparency and Openness Promotion (TOP) guidelines are each divided into three levels of increasing stringency; they are intended to help classify the methodological quality of an article and to increase the credibility of the scientific literature.

Study Register

The PsychFileDrawer system enables the archiving of successful and unsuccessful replications from all areas of psychology, combined with a discussion forum. An overview of replication studies from 1989 to 2013 lists 53 replication attempts, most of which failed. Jeffrey Spies, Brian Nosek and others have created the Open Science Framework (OSF), a website where information about projects, study designs (registered before the start of the investigation) and study materials can be documented in a citable manner and thus also registered. One of its tools allows users whose replication attempt has failed to find reports of similar experiences.

Open access data

The Open Access movement demands that the primary data associated with a scientific publication be made accessible. In Germany, data sets from psychology can be archived on a voluntary basis in the research data center for psychology within the Leibniz Center for Psychological Information and Documentation (ZPID). This data sharing platform was specially designed for psychological research, but this particular option is not currently used very widely.

The American Psychological Association has not yet required the archiving of primary data for each publication in the journals it publishes. In addition to the legally difficult questions of ownership and the rights to use such data (copyright), there are also organizational problems. At least for research projects funded from public money, it should be possible to ensure that not only the reports but also the data are publicly accessible; this should be defined and assured when the grant application is submitted.

Data sets that are of particular value for reanalysis can be archived in the Journal of Open Psychology Data (JOPD).

Journals for negative results

New journals in which so-called null results and replication attempts that came out negative for the hypothesis (falsifications) can be published are intended to counteract publication bias. The PsychFileDrawer: Archive of Replication Attempts in Experimental Psychology publishes experimental-psychology replication studies regardless of their outcome; it also contains a list of the 20 studies whose replication is most requested by visitors to the website.

There are now journals for the publication of non-significant findings in medicine and the natural sciences: the Journal of Articles in Support of the Null Hypothesis, the Journal of Contradicting Results in Science, the Journal of Negative Results in Ecology and Evolutionary Biology, the Journal of Negative Results in Biomedicine, and The All Results Journals.

Literature

  • Alexander, Anita; Barnett-Cowan, Michael; Bartmess, Elizabeth; Bosco, Frank A.; Brandt, Mark; Carp, Joshua; Chandler, Jesse J.; Clay, Russ; Cleary, Hayley; Cohn, Michael; Costantini, Giulio; DeCoster, Jamie; Dunn, Elizabeth; Eggleston, Casey; Estel, Vivien; Farach, Frank J.; Feather, Jenelle; Fiedler, Susann; Field, James G.; Foster, Joshua D.; Frank, Michael; Frazier, Rebecca S.; Fuchs, Heather M.; Galak, Jeff; Galliani, Elisa Maria; Garcia, Sara; Giammanco, Elise M.; Gilbert, Elizabeth A.; Giner-Sorolla, Roger; Goellner, Lars; Goh, Jin X.; Goss, R. Justin; Graham, Jesse; Grange, James A.; Gray, Jeremy R.; Gripshover, Sarah; Hartshorne, Joshua; Hayes, Timothy B.; Jahn, Georg; Johnson, Kate; Johnston, William; Joy-Gaba, Jennifer A.; Lai, Calvin K.; Lakens, Daniel; Lane, Kristin; LeBel, Etienne P.; Lee, Minha; Lemm, Kristi; Mackinnon, Sean; May, Michael; Moore, Katherine; Motyl, Matt; Müller, Stephanie M.; Munafo, Marcus; Nosek, Brian A.; Olsson, Catherine; Paunesku, Dave; Perugini, Marco; Pitts, Michael; Ratliff, Kate; Renkewitz, Frank; Rutchick, Abraham M.; Sandstrom, Gillian; Saxe, Rebecca; Selterman, Dylan; Simpson, William; Smith, Colin Tucker; Spies, Jeffrey R.; Strohminger, Nina; Talhelm, Thomas; van't Veer, Anna; Vianello, Michelangelo: An open, large-scale, collaborative effort to estimate the reproducibility of psychological science. In: Perspectives on Psychological Science. Vol. 7 (6), 2012, pp. 657-660. (online)
  • Jens Asendorpf, Mark Connor, Filip de Fruyt, Jan de Houwer, Jaap J. A. Denissen, Klaus Fiedler, Susann Fiedler, David C. Funder, Reinhold Kliegl, Brian A. Nosek, Marco Perugini, Brent W. Roberts, Manfred Schmitt, Marcel A. G. van Aken, Hannelore Weber, Jelte M. Wicherts: Recommendations for increasing replicability in psychology. In: European Journal of Personality. Vol. 27, 2013, pp. 108-119. (online)
  • Jürgen Bortz, Nicola Döring: Research methods and evaluation for human and social scientists. 4th edition. Springer, Heidelberg 2006, ISBN 3-540-33305-3.
  • Siri Carpenter: Psychology's bold initiative. In an unusual attempt at scientific self-examination, psychology researchers are scrutinizing their field's reproducibility. In: Science. Vol. 335, 30 March 2012, pp. 1558-1561. (online)
  • Open Science Collaboration: Estimating the reproducibility of psychological science. In: Science. Vol. 349, 2015, doi:10.1126/science.aac4716.
  • Fred N. Kerlinger, Howard B. Lee: Foundations of behavioral research. 3rd edition. Harcourt, Fort Worth 2000, ISBN 0-15-507897-6.
  • Brian A. Nosek, Jeffrey R. Spies, Matt Motyl: Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. In: Perspectives on Psychological Science. Vol. 7, 2012, pp. 615-631. (online)
  • Karl Schweizer: An analysis of the concepts, conditions and objectives of replications. In: Archives for Psychology. 141, 1989, pp. 85-97.
  • Stefan Schmidt: Shall we really do it again? The powerful concept of replication is neglected in the social sciences. In: Review of General Psychology. 13 (2), 2009, pp. 90-100, doi:10.1037/a0015108.
  • Ed Yong: Lots of botch. Many scientific studies cannot be reproduced. This raises questions about research - and the practice of publishing specialist journals. In: Spectrum of Science. February 2013, pp. 58-63.

Web links

Individual evidence

  1. Nathaniel E. Smith: Replication Study: A neglected aspect of psychological research. In: American Psychologist. Vol. 25 (10), 1970, pp. 970-975.
  2. ^ Theodore D. Sterling: Publication decisions and their possible effects on inferences drawn from tests of significance - or vice versa . In: Journal of the American Statistical Association. Vol. 54 (285), 1959, pp. 30-34.
  3. ^ Theodore D. Sterling, Wilf F. Rosenbaum, James J. Weinkam: Publication decisions revisited: The effect of the outcome of statistical tests on the decision to publish and vice versa. In: American Statistician. Vol. 49, 1995, pp. 108-112.
  4. Jelte M. Wicherts, Denny Borsboom, Judith Kats, Dylan Molenaar: The poor availability of psychological research data for reanalysis. In: American Psychologist. Vol. 61, 2006, pp. 726-728.
  5. Ed Yong: Lots of botch. In: Spectrum of Science . February 2013, pp. 58–63.
  6. J. Ritchie, Richard Wiseman, Christopher C. French: Failing the future: Three unsuccessful attempts to replicate Bem's Retroactive Facilitation of Recall Effect. In: PLoS ONE. 7, 2012, e33423.
  7. Ed Yong: Lots of botch . In: Spectrum of Science. February 2013, pp. 58–63.
  8. James W. Neuliep, Rick Crandell: Editorial bias against replication research . In: Journal of Social Behavior and Personality . Vol. 8, 1993, pp. 21-29.
  9. ^ Siri Carpenter: Psychology's bold initiative . In: Science. 2012, pp. 1558–1561.
  10. ^ Siri Carpenter: Psychology's bold initiative . In: Science. 2012, p. 1559.
  11. ^ Siri Carpenter: Psychology's bold initiative . In: Science. 2012, p. 1559.
  12. Jürgen Margraf: On the situation of psychology. In: Psychologische Rundschau, 60 (1), 2015, 1–30.
  13. Ferric C. Fang, Arturo Casadevall: Retracted science and the retraction index. In: Infection and Immunity, 79 (10), 2011, 3855-3859. doi: 10.1128 / IAI.05661-11 .
  14. Leslie K. John, George Loewenstein, Drazen Prelec: Measuring the Prevalence of Questionable Research Practices with Incentives for Truth Telling . In: Psychological Science . Vol. 23, 2012, pp. 524-532.
  15. Joseph Simmons, Leif D. Nelson, Uri Simonsohn: False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. In: Psychological Science . Vol. 22, 2011, pp. 1359-1366.
  16. ^ Katherine S. Button, John PA Ioannidis, Claire Mokrysz, Brian A. Nosek, Jonathan Flint, Emma SJ Robinson, Marcus R. Munafo: Power failure: why small sample size undermines the reliability of neuroscience. In: Nature Reviews Neuroscience . Vol. 14, May 2013, pp. 365-376.
  17. Michael Springer: The (too) small world of brain researchers. Statistically, neuroscience stands on feet of clay. Gloss. In: Spectrum of Science. May 2013, p. 20.
  18. Z. N. Sohani et al.: Reanalysis of Randomized Clinical Trial Data. In: JAMA - The Journal of the American Medical Association. 312 (10), 2014, pp. 1024-1032.
  19. Anton Kühberger, Astrid Fritz, Thomas Scherndl: Publication bias in psychology: a diagnosis based on the correlation between effect size and sample size. In: PLoS ONE. 2014, 9 (9), e105825, ISSN 1932-6203.
  20. ^ JP Ioannidis: Why most published research findings are false. In: PLoS medicine. Volume 2, number 8, August 2005, p. E124, doi: 10.1371 / journal.pmed.0020124 , PMID 16060722 , PMC 1182327 (free full text).
  21. Leah R. Jager, Jeffrey T. Leek: An estimate of the science-wise false discovery rate and application to the top medical literature. In: Biostatistics. Vol. 15 (1), Jan. 2014, PMID 24068246 , pp. 1-12.
  22. Marjan Bakker, Jelte M. Wicherts: The (mis) reporting of statistical results in psychology journals . In: Behavior Research Methods. Vol. 43 (3), 2011, pp. 666-678.
  23. ^ Daniele Fanelli: Negative results are disappearing from most disciplines and countries . In: Scientometrics. Vol. 90 (3), 2012), pp. 891-904.
  24. ^ John P. Ioannidis: Why most published research findings are false. In: PLoS Medicine. Vol. 2 (8), 2005, p. E124.
  25. Daniele Fanelli: Positive results receive more citations, but only in some disciplines. In: Scientometrics. Vol. 94 (2), 2013, pp. 701-709.
  26. See, among others, Keith R. Laws: Negativland - a home for all findings in Psychology. In: BMC Psychology. 2013, 1 (2).
  27. Marjan Bakker, Annette van Dijk, Jelte M. Wicherts: The rules of the game called psychological science. In: Perspectives on Psychological Science. Vol. 7 (6), 2012, pp. 543-554.
  28. Perspectives on Psychological Science , 7 (6), 2012; doi: 10.1177 / 1745691612465253 .
  29. http://f20.blog.uni-heidelberg.de/2012/11/18/zur-reproduzierbarkeit-psychologischer-forschung/
  30. ^ Siri Carpenter: Psychology's bold initiative. In: Science. 2012, pp. 1558–1561.
  31. John Bohannon: Psychologists launch a bare-all research initiative. In: Science Magazine. March 5, 2013.
  32. ^ Ed Yong: Replication studies: Bad copy. In the wake of high-profile controversies, psychologists are facing up to problems with replication. In: Nature. May 16, 2012.
  33. ^ Sarah Estes: The myth of self-correcting science. In: The Atlantic. 20th Dec 2012.
  34. ^ Open Peer Commentary . In: European Journal of Personality . Vol. 27, 2013, pp. 120-144.
  35. Estimating the reproducibility of psychological science. In: Science. 349, 2015, aac4716, doi:10.1126/science.aac4716.
  36. faz.net/aktuell/wissen/mensch-gene/die-meisten-psycho-studien-sind-zweifelhaft
  37. spiegel.de/wissenschaft/mensch/psychologie-wissenschaften-hunder-studien-nicht- Wiederholbar
  38. Farid Anvari, Daniël Lakens: The replicability crisis and public trust in psychological science. In: Comprehensive Results in Social Psychology. November 19, 2019, ISSN 2374-3603, pp. 1-21, doi:10.1080/23743603.2019.1684822.
  39. Tobias Wingen, Jana B. Berkessel, Birte Englich: No Replication, No Trust? How Low Replicability Influences Trust in Psychology. In: Social Psychological and Personality Science. October 24, 2019, ISSN 1948-5506, doi:10.1177/1948550619877412.
  40. Replications of studies ensure quality in science and advance research. Website of the German Society for Psychology. Retrieved September 7, 2015.
  41. Discussion forum: Quality assurance in research. Website of the German Society for Psychology. Retrieved September 7, 2015.
  42. Jochen Fahrenberg, Michael Myrtek, Kurt Pawlik, Meinrad Perrez: Outpatient assessment - recording behavior in everyday life. A behavioral science challenge to psychology . In: Psychologische Rundschau , Volume 58, 2007, pp. 12-23.
  43. ^ Siri Carpenter: Psychology's bold initiative. In: Science. 2012, p. 1561.
  44. ^ Jens Asendorpf et al .: Recommendations for increasing replicaility in psychology. In: European Journal of Personality. Vol. 27, 2013, pp. 108-119.
  45. publicationethics.org /
  46. publicationethics.org/news/principles-transparency-and-best-practice-scholarly-publishing-revised-and-updated
  47. B. A. Nosek et al.: Scientific standards. Promoting an open research culture. In: Science. Volume 348, number 6242, June 2015, pp. 1422-1425, doi:10.1126/science.aab2374, PMID 26113702, PMC 4550299 (free full text).
  48. psychfiledrawer.org
  49. psychfiledrawer.org/private_networking.php .
  50. Jochen Fahrenberg: Open Access - only texts or also primary data? Working Paper Series of the Council for Social and Economic Data (RatSWD), ed. by GG Wagner, funded by the Ministry of Education and Research. No. 200, June 2012, pp. 1-30.
  51. open-access.net/informationen-zu-open-access/open-access-bei-forschungsdaten