Randomized experiment
A randomized experiment is an experiment in which the treatments whose effects are to be evaluated are randomly assigned to observation units. Because of the random allocation, the groups of observation units should not differ on average, except in the treatment they receive. The counterpart is the quasi-experiment.
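The claim that randomly formed groups do not differ on average can be illustrated with a small simulation. The following is a minimal sketch, not part of the original text: the covariate ("age"), the sample size, and the function name `balance_demo` are hypothetical choices made only for illustration.

```python
import random
from statistics import mean


def balance_demo(n_units=1000, seed=0):
    """Randomly split units into two equal groups and compare a covariate.

    The covariate is a hypothetical "age" drawn from a normal distribution;
    it stands in for any pre-treatment characteristic of the units.
    """
    rng = random.Random(seed)
    ages = [rng.gauss(40, 10) for _ in range(n_units)]

    # Random allocation: shuffle the unit indices and cut the list in half.
    units = list(range(n_units))
    rng.shuffle(units)
    half = n_units // 2
    treated, control = units[:half], units[half:]

    # Difference in mean covariate between the two groups.
    diff = mean(ages[i] for i in treated) - mean(ages[i] for i in control)
    return treated, control, diff


treated, control, diff = balance_demo()
print(f"difference in mean age between groups: {diff:.2f}")
```

With 500 units per group, the difference in means is typically small compared with the spread of the covariate itself (a standard deviation of 10 in this sketch). Any single randomization can still produce some imbalance; the balance holds on average over repeated randomizations.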
Ronald Fisher's Randomized Experiment
Ronald Fisher is considered to be the inventor of the randomized experiment. In The Design of Experiments (1935) he described his concept using a now-famous example. The aim is to check whether a woman can tell, by tasting a cup of tea with milk, whether the milk or the tea was added to the cup first. In Fisher's day, the predominant approach to such questions was to hold constant all covariates that might affect the outcome. In this case this would mean, for example, exactly matching the temperature and strength of the tea, the amount of added sugar or milk, and the type of cup for both treatments (tea first, milk first). Fisher rejected this approach for two reasons. First, it is impossible. Second, even if it were close to being possible, it would be too expensive.
Instead of following the prevailing dictum of holding all factors constant, Fisher suggested holding nothing constant and relying on randomization. To address the specific question, Fisher suggested filling four cups first with milk and then with tea, and four other cups first with tea and then with milk. The woman is told that four cups received milk first and four received tea first, but not which cups are which. The eight cups are presented to her in random order. Her task is to use taste alone to assign the cups to the correct groups. So the number of cups is 8. The order of presentation of the cups is a random variable, and each particular presentation is a realization of this random variable. A particular presentation can be described, for example, by a sequence of eight labels stating for each cup whether milk or tea was poured first. All possible presentations are elements of the set of all possible presentations Ω. Next, a result is observed: in the example above, the woman should correctly assign all of the cups. Finally, the experiment should decide whether the null hypothesis (the woman cannot taste whether tea or milk was added to the cup first) can be rejected at a given probability of error.
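The randomization step can be sketched in a few lines of code. This is only an illustration under stated assumptions: the labels "milk first" and "tea first" and the function name `random_presentation` are hypothetical, and Fisher of course did not use a computer to randomize.

```python
import random


def random_presentation(n_per_group=4, seed=None):
    """Return one random presentation order of the cups.

    Each entry records which liquid was poured first into that cup.
    The order is one realization of the random presentation.
    """
    rng = random.Random(seed)
    cups = ["milk first"] * n_per_group + ["tea first"] * n_per_group
    rng.shuffle(cups)  # every ordering of the eight cups is equally likely
    return cups


order = random_presentation(seed=1)
for position, preparation in enumerate(order, start=1):
    print(position, preparation)
```

The list returned by `random_presentation` corresponds to one element of Ω. The taster sees only the positions, not the labels.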
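All possible outcomes should be enumerated before a randomized experiment is performed. The number of elements in Ω is central. Since Fisher's experiment amounts to a permutation of eight cups in which the four cups within each group are interchangeable, it can be calculated as follows:
$$|\Omega| = \binom{8}{4} = \frac{8!}{4!\,4!} = 70$$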
So there are 70 possible arrangements (and also 70 possible results). Fisher then asked what the probability was that the woman would correctly assign all eight cups by chance alone. This probability is $1/70 \approx 0.014$. If she does assign all cups correctly, one can conclude with an error probability of less than 2% that the woman actually has the ability to taste the order in which the tea and milk were poured. Under a less strict definition of the ability, according to which two allocation errors are allowed (one milk-first cup and one tea-first cup swapped), there are 16 such results in addition to the single perfect one, so the probability of error would already be $17/70 \approx 0.243$. Under this definition, the experiment described above would no longer have sufficient statistical significance.
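These counts can be checked by brute-force enumeration. The sketch below assumes, purely for illustration, that cups 0 to 3 are the milk-first cups. It lists every way the taster could pick four cups as "milk first" and counts how many picks reach a given number of correct cups.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

N_CUPS = 8
# Hypothetical ground truth: cups 0..3 received milk first.
milk_first = {0, 1, 2, 3}

# Every possible answer: a choice of four cups declared "milk first".
answers = list(combinations(range(N_CUPS), 4))
assert len(answers) == comb(8, 4) == 70


def p_at_least(k):
    """Chance probability of naming at least k milk-first cups correctly."""
    hits = sum(1 for a in answers if len(milk_first & set(a)) >= k)
    return Fraction(hits, len(answers))


p_all_correct = p_at_least(4)   # 1/70, about 0.014
p_one_swap_ok = p_at_least(3)   # 17/70, about 0.243

print(p_all_correct, float(p_all_correct))
print(p_one_swap_ok, float(p_one_swap_ok))
```

Naming three of the four milk-first cups correctly means exactly one milk-first cup and one tea-first cup are swapped, which is two misallocated cups. There are 4 × 4 = 16 such answers plus the one perfect answer, giving 17 of 70.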
Core elements
Rosenbaum (2002) summarizes the core elements of a randomized experiment as follows:
- Experiments do not require homogeneity of the treatment units
- Experiments do not require a random sample from a population of treatment units
- In order to be able to draw a valid conclusion on the effects of a treatment from an experiment, the treatments must be randomly distributed among the treatment units
- In the experiment, probability only plays a role in connection with the assignment of treatments to treatment units
Types of randomized experiments and statistical tests
Fisher's method became the gold standard in many areas, such as agriculture, computer science, manufacturing processes, medicine, and welfare. In addition to the completely randomized experiment, there are variants such as the block design and paired randomized experiments. There are also a number of statistical tests that, in randomized experiments (as opposed to non-randomized experiments), require almost no assumptions. Rosenbaum (2002) summarizes them as follows:
- Tests for binary results: Fisher's exact test, Mantel–Haenszel statistic, McNemar test
- Tests for ordinal results: Mantel's (1959) extension of the Mantel–Haenszel statistic
- Tests for a single stratum with interval scale and ratio scale: Wilcoxon rank sum test
- Tests for ordinal results (with a large number of strata compared to the number of samples): Hodges–Lehmann estimator
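To illustrate why such tests need almost no assumptions, the following sketch computes an exact one-sided p-value for the Wilcoxon rank sum statistic by enumerating every possible random assignment. The function name `exact_rank_sum_p` and the outcome values are hypothetical, and the sketch assumes there are no tied values. The only source of probability is the random assignment itself.

```python
from itertools import combinations


def exact_rank_sum_p(treated, control):
    """Exact one-sided p-value for the Wilcoxon rank sum statistic.

    Under the null hypothesis of no treatment effect, every choice of
    len(treated) units out of all units is an equally likely treated group.
    Assumes all outcome values are distinct (no ties).
    """
    pooled = sorted(treated + control)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    observed = sum(rank[v] for v in treated)

    n, m = len(pooled), len(treated)
    total = 0
    at_least_as_large = 0
    # Enumerate every set of ranks the treated group could have received.
    for ranks in combinations(range(1, n + 1), m):
        total += 1
        if sum(ranks) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / total


# Hypothetical outcomes for three treated and four control units.
w, p = exact_rank_sum_p([7.1, 8.4, 9.0], [5.2, 6.3, 6.9, 7.5])
print(w, p)
```

In this example the treated units hold ranks 4, 6 and 7, so the rank sum is 17. Only 2 of the 35 possible assignments give a rank sum at least that large, so the one-sided p-value is 2/35, about 0.057. Enumeration is practical only for small samples; larger samples call for tabulated or approximate distributions.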
Criticism of randomized social experiments
While the randomized experiment has proven very useful in many applications since Fisher, criticism has been raised over the past three decades against its use with human subjects. In particular, it has been criticized that assignment to a control group deprives some people of treatment, which can be unethical and/or illegal.
James Heckman and colleagues also emphasized the need to model the processes that lead people to participate or not in programs or treatments. The criticism was also directed against the fundamental assumption of the randomized experiment that randomization eliminates selection bias.
References
- Shenyang Guo & Mark W. Fraser: Propensity Score Analysis: Statistical Methods and Applications. Sage Publications, 2009. ISBN 9781412953566. Pp. 5-12.