Local stochastic independence

Local stochastic independence means that the probability of solving a given item (a task in a test) is independent of whether any other item was previously solved or not. The point is that the probability of solving an item should depend only on the person parameter (the ability of the person) and on one item parameter (the difficulty of the item).

Background

Local stochastic independence is a term from item response theory (IRT). Classical test theory (CTT) has been criticized because it assumes that a person's total score on a test is, by fiat, an indicator of that person's trait level (a person 1 who achieves a high score is classified as, for example, highly intelligent, whereas a person 2 whose total score is low is classified as less gifted). However, CTT cannot explain how exactly the item responses came about. This is where IRT comes in. Test instruments developed on the basis of IRT address the question of what conclusions can be drawn from the response to an item about the attitudes and abilities of a test taker. Item response theory treats the response to an item as a manifest, i.e. observable, variable that is determined by an underlying, unobservable latent variable. Such latent variables correspond to abilities, attitudes or dispositions. (This background is also reflected in the term "latent-trait" models.) In classical test theory, the separation of latent and manifest variables does not carry this importance.
Furthermore, each item in IRT can be assigned a probability of solution that depends on its item properties (the item parameters) and on the ability of the test taker (the person parameter); hence the German name for item response theory: probabilistic test theory. The item parameters are the difficulty sigma (the greater it is, the lower the probability of a solution), the discrimination lambda (how well the response to a single item differentiates between 'able' and 'not able'; it is greatest at a solution probability of 0.5) and the guessing probability gamma (which is higher the more easily the item can be solved without knowledge). In IRT one can choose between one-parameter models (sigma only), two-parameter models (sigma and lambda) and three-parameter models (sigma, lambda and gamma). Strictly speaking, a one-parameter model also has all three item parameters, but lambda is fixed at 1 and gamma is fixed at 0.
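
The relationship between the three item parameters and the probability of solution can be sketched in a few lines of Python. The logistic curve used here is an assumption of this sketch (it is the form commonly used in IRT, but the text above does not specify one), and the function name `solution_probability` is purely illustrative.

```python
import math


def solution_probability(theta, sigma, lam=1.0, gamma=0.0):
    """Probability that a person with ability theta solves an item.

    sigma: item difficulty, lam: discrimination, gamma: guessing probability.
    The logistic form is an assumption of this sketch.
    """
    logistic = 1.0 / (1.0 + math.exp(-lam * (theta - sigma)))
    return gamma + (1.0 - gamma) * logistic


# One-parameter case (lam = 1, gamma = 0): ability equal to difficulty.
p_1pl = solution_probability(theta=0.0, sigma=0.0)
# Three-parameter case: guessing raises the lower bound of the curve.
p_3pl = solution_probability(theta=0.0, sigma=0.0, lam=1.5, gamma=0.2)
print(p_1pl, p_3pl)
```

With the default values `lam=1.0` and `gamma=0.0` the function reduces to the one-parameter case, which mirrors the remark that a one-parameter model is a three-parameter model with two parameters held constant.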

The terms "difficulty" and "ability" come from the field of performance measurement, such as intelligence tests. However, the same terms are also used for personality tests. There the question is mostly how strongly someone identifies with something (personality traits) and how strongly they prefer one of the response alternatives (for example: on a scale from 1 to 10, how anxious have I been in the last 2 weeks?). A high "ability" then stands for a high level of the trait being measured (the person identifies very strongly with the construct, here anxiety). "Difficulty" reflects how few test takers respond in the direction of a higher trait level: the less likely a high rating on the item is, the higher the item difficulty, since it in effect represents the 'unattractiveness' of the response options.

Definition

If the latent variable influences the manifest variables, the test items will correlate with each other (a prerequisite for being able to infer a latent dimension from manifest behavior). In other words, the latent variable produces the variation in the manifest variables.

However, the answer to item A may also depend on the answer to item B; in that case the items would correlate with one another as well. Items are considered indicators of a latent variable (that is, indications of a person's level on the latent construct) only on the condition that their probabilities of solution do not depend on each other but are determined solely by the person and item parameters. The probability of solving an item must therefore not depend on whether another item was solved. (Rolling a die is often given as an example: rolling a six has nothing to do with having previously rolled a one. The events are independent of one another.)
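
Stated formally, the condition can be written as follows. The notation is introduced here only for illustration and does not appear in the text above: $X_A$ and $X_B$ denote the responses to items A and B (1 = solved), and $\theta$ denotes the person parameter.

$P(X_A = 1, X_B = 1 \mid \theta) = P(X_A = 1 \mid \theta) \cdot P(X_B = 1 \mid \theta)$

The equation is required to hold for every fixed value of $\theta$, not for the population as a whole, which is what the word "local" refers to.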

In order to infer a latent variable from the correlation of the items, the local stochastic independence of the items must be established. The items must not correlate with anything other than their latent construct (the person parameter). If all persons within a group had the same person parameter, the items should no longer show any intercorrelations. This is where the term local stochastic independence comes from. The person variable is a continuum that can take any value, and a person's trait level can be described by stating where on the continuum it is located. Local stochastic independence holds only at one such place (locus) on the continuum. The items are therefore allowed to correlate with each other, but only if the group contains persons at different loci of the person characteristic and these differences are the cause of the correlation.

If the items correlate and are also locally stochastically independent, they are referred to as "homogeneous" with regard to the latent variable. The items then all measure the same latent dimension (person characteristic) and are not locally disturbed by further intercorrelations with confounding variables. In this case the latent variable is actually the reason for the variation of the manifest variables, and it does not matter which items a test taker works on: in each subset of items, the test taker's result is a sufficient (exhaustive) statistic for their level on the trait being measured. This means that for each person in the result matrix the row total (row score) contains all the information about their trait level, regardless of which answer was given to which item.

Under these assumptions the column totals are likewise sufficient statistics for the item parameters. Each column of the matrix contains the answers that the test takers gave to one item. Its column total indicates how many of the participants were able to 'solve' the item (i.e. knew the answer in performance tests, or endorsed it in personality tests). The smaller this proportion, the more difficult the item. Regardless of which persons gave which answer to the item, the column total provides exhaustive information about the item parameter.
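
The claim that items correlate in a mixed group but not among persons at the same locus can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: responses follow a one-parameter logistic model, abilities in the mixed group are drawn from a standard normal distribution, and both items have difficulty 0. None of these choices come from the text; all function names are illustrative.

```python
import math
import random


def rasch_probability(theta, sigma):
    """Solution probability in a one-parameter logistic model (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(theta - sigma)))


def correlation(xs, ys):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_item_pair(thetas, sigma_a, sigma_b, rng):
    """Draw independent responses to items A and B for each ability value."""
    a = [int(rng.random() < rasch_probability(t, sigma_a)) for t in thetas]
    b = [int(rng.random() < rasch_probability(t, sigma_b)) for t in thetas]
    return a, b


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000

# Mixed group: abilities are spread along the continuum.
mixed_thetas = [rng.gauss(0.0, 1.0) for _ in range(n)]
a_mixed, b_mixed = simulate_item_pair(mixed_thetas, 0.0, 0.0, rng)
r_mixed = correlation(a_mixed, b_mixed)

# Local group: every person sits at the same locus of the continuum.
local_thetas = [0.0] * n
a_local, b_local = simulate_item_pair(local_thetas, 0.0, 0.0, rng)
r_local = correlation(a_local, b_local)

print(f"mixed group: r = {r_mixed:.3f}, fixed ability: r = {r_local:.3f}")
```

In the mixed group the two items correlate clearly, because able persons tend to solve both and less able persons tend to solve neither. In the group with a single fixed ability the correlation is close to zero, since the simulated responses are generated independently given the person parameter.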

Verification

In order to check whether the correlations of the items are caused only by differences in the latent dimension, the latent variable is held constant at individual levels. If the items are homogeneous and locally stochastically independent, the correlations of the items vanish at these levels. In practice one subdivides the total sample, for example, according to different person characteristics (age, gender, school education, socio-economic status of the parents) and then estimates the item parameters separately for each of these subsamples. Since the model claims that the item parameters can be estimated independently of the person parameter, no statistically significant differences (i.e. differences beyond chance) should be found between the subgroups. If such differences are found, the model must be discarded, or at least modified and retested on a fresh sample. The most prominent method for carrying out such a model check is the conditional maximum likelihood (CML) method, which works as just described.
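
The idea of estimating item parameters separately in subsamples can be illustrated with a deliberately simplified sketch. It is not the full CML procedure and contains no significance test. It assumes a one-parameter logistic model with two items and uses the fact that, in this model, among persons who solved exactly one of the two items the odds of having solved A rather than B equal exp(sigma_B - sigma_A), whatever the person's ability. All names, group definitions and parameter values are hypothetical.

```python
import math
import random


def rasch_probability(theta, sigma):
    """Solution probability in a one-parameter logistic model (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(theta - sigma)))


def conditional_difficulty_gap(thetas, sigma_a, sigma_b, rng):
    """Estimate sigma_b - sigma_a from persons who solved exactly one item."""
    only_a = 0
    only_b = 0
    for theta in thetas:
        solved_a = rng.random() < rasch_probability(theta, sigma_a)
        solved_b = rng.random() < rasch_probability(theta, sigma_b)
        if solved_a and not solved_b:
            only_a += 1
        elif solved_b and not solved_a:
            only_b += 1
    # Log odds of "only A" versus "only B" among persons with total score 1.
    return math.log(only_a / only_b)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 20000
sigma_a, sigma_b = -0.5, 0.5  # hypothetical difficulties, true gap = 1.0

# Two subsamples that differ strongly in their ability distribution.
low_group = [rng.gauss(-1.0, 1.0) for _ in range(n)]
high_group = [rng.gauss(1.0, 1.0) for _ in range(n)]

gap_low = conditional_difficulty_gap(low_group, sigma_a, sigma_b, rng)
gap_high = conditional_difficulty_gap(high_group, sigma_a, sigma_b, rng)
print(f"low group: {gap_low:.2f}, high group: {gap_high:.2f}")
```

Because the simulated data satisfy the model, both subsamples yield roughly the same difficulty gap even though their ability levels differ. A clear discrepancy between the subgroup estimates in real data would speak against the model.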

In addition to such comparisons, the check can be stated in statistical terms via the multiplication theorem for independent events: the joint probability of agreeing to several items (the probability that both events occur together under a fixed condition) equals the product of the probabilities of agreeing to each individual item (i.e. the conditional individual probabilities that each event occurs under the specified condition).

Example: If the probability of agreeing to item A is p = .10 and the probability of agreeing to item B is p = .30, then the items are locally stochastically independent if the product of these probabilities equals the observed joint probability of agreeing to both.
In this case: ${\displaystyle p_{A}(.10)\cdot p_{B}(.30)=p_{AB}(.03)}$
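
The same comparison can be written as a short Python check. This is only a sketch: the function name and the fixed tolerance are hypothetical, and a real analysis would use a statistical test instead of a cut-off.

```python
import math


def is_locally_independent(p_a, p_b, p_ab, tolerance=0.01):
    """Compare an observed joint probability with the product of the marginals.

    The tolerance is a hypothetical choice; a real check would use a
    statistical test rather than a fixed cut-off.
    """
    return math.isclose(p_a * p_b, p_ab, abs_tol=tolerance)


# Values from the example: the product .10 * .30 matches the joint value .03.
print(is_locally_independent(0.10, 0.30, 0.03))
# A joint probability of .10 would be far more frequent than the product.
print(is_locally_independent(0.10, 0.30, 0.10))
```

The probabilities passed to the function are meant to be conditional on a fixed level of the latent variable, as required by the multiplication theorem under a fixed condition.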
