Randomized response technique

from Wikipedia, the free encyclopedia

The randomized response technique (RRT) is a method used in psychology and the social sciences to reduce certain distortions in survey responses.

For some survey topics, honest answers can be embarrassing or incriminating for the respondent, or answers can be distorted by social desirability bias. In such cases the randomized response technique makes it possible to estimate the true survey result through anonymization. Because the question is anonymized rather than the person, so to speak, personal details such as name, age and address can be recorded in the same survey without the (true) answer being attributable to a specific person.

The concept is related to plausible deniability. Whereas with plausible deniability a person can credibly claim that someone else said “yes” to something, with RRT the respondent can credibly claim to have said “yes”, but in response to a different question.

Procedure

The randomized response technique has been developed further over time, and new variants have been added. The forced response method by Boruch (1971) serves as an example here. Before the “sensitive question” is answered, a random device decides whether the respondent should answer honestly or simply answer “yes”. The interviewer does not know what the random device decided, which protects the “yes” answer, i.e. the admission of the embarrassing attribute. Other common variants are:

  • Unrelated Question Technique (UQT)
  • Two step procedure
  • Card design according to Kuk
  • Warner's original version
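The forced response method described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `forced_response_estimate`, the parameter `p_forced_yes` (the probability that the random device forces a “yes”) and the sample numbers are hypothetical and do not come from Boruch (1971). The sketch assumes that respondents who are not forced answer truthfully, so that the share of “yes” answers equals `p_forced_yes + (1 - p_forced_yes) * pi`.

```python
def forced_response_estimate(yes_count: int, n: int, p_forced_yes: float) -> float:
    """Estimate the share of respondents with the sensitive attribute.

    Assumption: each respondent is forced to say "yes" with probability
    p_forced_yes and otherwise answers the sensitive question truthfully.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= p_forced_yes < 1:
        raise ValueError("p_forced_yes must lie in [0, 1)")
    yes_share = yes_count / n
    # P(yes) = p_forced_yes + (1 - p_forced_yes) * pi, solved for pi
    return (yes_share - p_forced_yes) / (1 - p_forced_yes)


# Hypothetical numbers: 1,000 respondents, 400 "yes" answers,
# and a device that forces a "yes" in one out of four cases.
estimate = forced_response_estimate(400, 1000, 0.25)
print(round(estimate, 3))
```

The larger `p_forced_yes` is, the better an individual “yes” is protected, but the fewer answers carry information about the sensitive question.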

Example

The aim is to determine the percentage of the population that has ever driven a car under the influence of alcohol. Each respondent (chosen purely at random from the population) receives three cards from the interviewer, each bearing a question. The first card asks “Have you ever driven a car under the influence of alcohol?”; the second asks “Is there a black triangle here?” (and shows no black triangle); the third asks the same question, “Is there a black triangle here?” (and does show a black triangle). The respondent is given all three cards face down. Without the interviewer seeing the cards, the respondent draws one of them and simply answers “Yes” or “No”. The interviewer therefore does not know which question the respondent answered, so the respondent has no reason to answer untruthfully.

Suppose 3,000 people are surveyed and 1,200 of them answer “Yes” (which question each answer refers to does not matter here). On average, about a third of the respondents, i.e. about 1,000 people, drew the card with the black triangle and truthfully answered “Yes”. Another 1,000 or so drew the card without a triangle (and therefore answered “No”). About 1,000 people drew the card with the alcohol question, and the remaining 200 “Yes” answers can be attributed to them. It can therefore be estimated that about 200 of these 1,000 people (i.e. 20%) have driven a car under the influence of alcohol.
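The arithmetic of the three-card example can be reproduced with a short Python sketch. The function name `three_card_estimate` is hypothetical. The sketch assumes that each of the three cards is drawn with probability 1/3 and that all answers are truthful, so the share of “Yes” answers equals 1/3 plus 1/3 times the sensitive proportion.

```python
def three_card_estimate(yes_count: int, n: int) -> float:
    """Estimate the sensitive proportion in the three-card design.

    Assumption: one card carries the sensitive question, one always
    yields "Yes" and one always yields "No"; each is drawn with
    probability 1/3 and respondents answer truthfully.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_sensitive = 1 / 3   # chance of drawing the alcohol question
    p_always_yes = 1 / 3  # chance of drawing the card showing a triangle
    yes_share = yes_count / n
    # P(Yes) = p_always_yes + p_sensitive * pi, solved for pi
    return (yes_share - p_always_yes) / p_sensitive


# Numbers from the example: 3,000 respondents, 1,200 "Yes" answers.
print(round(three_card_estimate(1200, 3000), 3))
```

With the numbers from the example the estimate comes out at about 0.2, matching the 20% obtained by counting.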

Application

This method was used during the Vietnam War, when the US Army command wanted to know what proportion of the US troops stationed there were using drugs. According to rumors, this proportion was very high, so the command wanted to check it empirically. Direct questioning would most likely have produced a very inaccurate result, since drug use is, at the very least, a criminal offense.

A 2010 comparison between official statistics on doping and drug consumption (from the German National Anti-Doping Agency) and the results of an RRT study showed a significant difference: the athletes surveyed reported consumption that was more frequent, in some cases many times more frequent, than the officially recorded figures.[1]

Original version

In the original version by Warner (1965), the procedure is somewhat different: the “sensitive question” is formulated in two complementary versions, and a random device decides which of the two the respondent should answer (honestly). The interviewer thus receives a “true” or “not true” answer without knowing which version it refers to. For mathematical reasons, the randomization must not be “fair” (½ to ½). If $p$ is the probability that the sensitive question has to be answered and $\pi$ is the true proportion of respondents with the embarrassing attribute, then the proportion of “true” answers, $\lambda = Y/n$, with $Y$ the total number of “true” answers and $n$ the total number of people surveyed, is given by:

$$\lambda = p\,\pi + (1 - p)(1 - \pi)$$

Solving for $\pi$ gives

$$\pi = \frac{\lambda + p - 1}{2p - 1}$$

Mathematical derivation of the formula

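Assume the sample space $\{A, B\}$, which consists of the events

$$A = \{\text{the respondent has the sensitive attribute}\}$$

and

$$B = \{\text{the respondent does not have the sensitive attribute}\}.$$

The random variables $X_1, \dots, X_n$ are independent and identically distributed. Each of these random variables can be viewed as one respondent. Let the probabilities of the two events be $P(X_i = A) = \pi$ and $P(X_i = B) = 1 - \pi$. In our example, $\pi$ represents the actual proportion of people who have already driven a car under the influence of alcohol. However, the probability $\pi$ is unknown. Now a random experiment is carried out with the outcomes $A$ and $B$ and the known probabilities $p$ and $1 - p$. The result of this random experiment is observed only by the respondent, not by the questioner. The respondent then tells the questioner whether the result matches his or her group membership (i.e. $A$ or $B$). A new random variable can now be defined as follows:

$$Z_i = \begin{cases} 1 & \text{if the } i\text{-th respondent answers ``yes''} \\ 0 & \text{if the } i\text{-th respondent answers ``no''} \end{cases}$$

The information obtained consists of the realizations of the random variables $Z_1, \dots, Z_n$. The probability of $Z_i = 1$ can now be written in terms of conditional probabilities as follows:

$$P(Z_i = 1) = P(Z_i = 1 \mid X_i = A)\,P(X_i = A) + P(Z_i = 1 \mid X_i = B)\,P(X_i = B) = p\,\pi + (1 - p)(1 - \pi) = \lambda$$

Correspondingly, the probability of $Z_i = 0$ can be written as:

$$P(Z_i = 0) = (1 - p)\,\pi + p\,(1 - \pi) = 1 - \lambda$$

Let $Y$ be the number of "yes" answers; then

$$Y = \sum_{i=1}^{n} Z_i$$

Since each $Z_i$ can take only the values 1 and 0, with probabilities $\lambda$ and $1 - \lambda$, the $Z_i$ are Bernoulli distributed. Hence $Y$ is binomially distributed, $Y \sim B(n, \lambda)$. The parameter $\lambda$ can now be estimated by the sample proportion, i.e. the number of ones relative to the total number of results. This gives:

$$\hat{\lambda} = \frac{Y}{n}$$

The moment estimator $\hat{\pi}$ for $\pi$ can now be determined from the following equation:

$$\hat{\lambda} = p\,\hat{\pi} + (1 - p)(1 - \hat{\pi})$$

Rearranging then yields the moment estimator for $\pi$:

$$\hat{\pi} = \frac{\hat{\lambda} + p - 1}{2p - 1}$$

This shows that the method is valid only for $p \neq \tfrac{1}{2}$.

The expected value of this estimator can now be determined:

$$E(\hat{\pi}) = \frac{E(\hat{\lambda}) + p - 1}{2p - 1} = \frac{\lambda + p - 1}{2p - 1} = \frac{\pi\,(2p - 1)}{2p - 1} = \pi$$

So $\hat{\pi}$ is an unbiased estimator for $\pi$.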
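The unbiasedness claim can be checked empirically with a small Monte Carlo sketch in Python. All names (`warner_estimate`, `simulate_survey`) and the chosen parameter values are hypothetical illustrations. The sketch assumes the model of the derivation: a respondent answers "yes" exactly when the outcome of the random experiment matches his or her own group, and the estimator is (share of "yes" answers + p - 1) / (2p - 1).

```python
import random


def warner_estimate(yes_count: int, n: int, p: float) -> float:
    """Moment estimator for the sensitive proportion in Warner's design."""
    if p == 0.5:
        raise ValueError("p must differ from 1/2")
    yes_share = yes_count / n
    return (yes_share + p - 1) / (2 * p - 1)


def simulate_survey(n: int, pi: float, p: float, rng: random.Random) -> int:
    """Simulate n truthful respondents and return the number of "yes" answers."""
    yes_count = 0
    for _ in range(n):
        has_attribute = rng.random() < pi  # respondent belongs to group A
        device_shows_a = rng.random() < p  # random experiment yields A
        # "yes" means the device outcome matches the respondent's group
        if has_attribute == device_shows_a:
            yes_count += 1
    return yes_count


rng = random.Random(0)  # fixed seed so the run is reproducible
true_pi, p, n, runs = 0.125, 1 / 6, 500, 1000
estimates = [
    warner_estimate(simulate_survey(n, true_pi, p, rng), n, p)
    for _ in range(runs)
]
mean_estimate = sum(estimates) / runs
print(f"true proportion: {true_pi}, mean of estimates: {mean_estimate:.4f}")
```

Individual estimates scatter noticeably, and can even fall below 0, but their average over many simulated surveys should lie close to the true proportion.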

Example

  • Alternative 1: "I have already driven a car under the influence of alcohol."
  • Alternative 2: "I've never driven a car under the influence of alcohol."

The respondents roll a die out of the interviewer's sight and respond to the first statement only if they roll a 6, otherwise to the second, so $p = 1/6$. The proportion of “true” answers is thus made up of those who have driven a car under the influence of alcohol and rolled a 6, and those who have never done so and rolled a different number. Out of 100 respondents, 75 answer “true” ($\lambda = 0.75$). Inserting this into the formula $\pi = (\lambda + p - 1)/(2p - 1)$ gives

$$\pi = \frac{0.75 + \frac{1}{6} - 1}{2 \cdot \frac{1}{6} - 1} = \frac{-\frac{1}{12}}{-\frac{2}{3}} = 0.125$$

If all respondents answered honestly, the estimated proportion of people who have driven a car under the influence of alcohol is therefore 12.5%.
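The dice example can be recomputed exactly with Python's `fractions` module, which avoids rounding. The function name `warner_estimate` is a hypothetical choice. The sketch assumes that the share of “true” answers equals p times the sensitive proportion plus (1 - p) times its complement, as described in the example, and solves that relation for the sensitive proportion.

```python
from fractions import Fraction


def warner_estimate(true_share: Fraction, p: Fraction) -> Fraction:
    """Solve true_share = p*pi + (1 - p)*(1 - pi) for pi."""
    if p == Fraction(1, 2):
        raise ValueError("p must differ from 1/2")
    return (true_share + p - 1) / (2 * p - 1)


p = Fraction(1, 6)              # first statement is answered only on a 6
true_share = Fraction(75, 100)  # 75 of 100 respondents answer "true"
pi_hat = warner_estimate(true_share, p)
print(pi_hat, float(pi_hat))    # prints: 1/8 0.125
```

Numerator and denominator are both negative here because p is smaller than 1/2; the formula works just as well when the sensitive statement is the more likely one.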

Literature

  • C. Hesse: The basics of clear thinking. (= Beck'sche series.) 2009, pp. 284–303.
  • C. Hesse: Lecture notes, Mathematical Statistics, summer semester 2010, University of Stuttgart.
  • S. L. Warner: Randomized response: a survey technique for eliminating evasive answer bias. In: Journal of the American Statistical Association 60, 1965, pp. 63–69.
  • B. G. Greenberg et al.: The Unrelated Question Randomized Response Model: Theoretical Framework. In: Journal of the American Statistical Association 64 (326), 1969, pp. 520–539.
  • A. Chaudhuri, R. Mukerjee: Randomized response: theory and techniques.
  • M. Ostapczuk, M. Moshagen, Z. Zhao, J. Musch: Assessing sensitive attributes using the randomized-response-technique: Evidence for the importance of response symmetry. In: Journal of Educational and Behavioral Statistics 34, 2009, pp. 267–287.
  • M. Ostapczuk, J. Musch, M. Moshagen: A randomized-response investigation of the education effect in attitudes towards foreigners. In: European Journal of Social Psychology 39, 2009, pp. 920–931.

References

  1. H. Striegel, R. Ulrich, P. Simon: Randomized response estimates for doping and illicit drug use in elite athletes. In: Drug and Alcohol Dependence 106 (2–3), 2010, pp. 230–232, doi:10.1016/j.drugalcdep.2009.07.026.