Mokken analysis

From Wikipedia, the free encyclopedia

Mokken analysis is a nonparametric method of item response theory for the statistical analysis of test and questionnaire data.

It is named after the Dutch political scientist and methodologist Robert Jan (Rob) Mokken (born 1929), who first described the method in 1971.

The distinctive feature of Mokken analysis is that it works without parametric assumptions, which permits an exploratory approach without restrictions: no functional form has to be fixed for the ICC, and no distribution has to be specified for parameter estimation. Nonparametric item response theory methods are particularly suited to revealing the different dimensions underlying the data. Depending on the number of dimensions found, the items are grouped into new item sets, and items that do not contribute to measuring the latent trait are removed. Importantly, an item set should belong to only one dimension; this property is known as unidimensionality. Mokken analysis is not a scaling method that quantifies the deviations between empirical data and theoretical assumptions; its focus lies on checking the model assumptions. Mokken's original scaling method covered dichotomous items; it was later extended so that polytomous items can also be analyzed.

Both the Rasch model and Mokken analysis build on the basic ideas of Louis Guttman. Mokken (1971) adopts principles already used by Rasch (1960): for each person and each item, the probability of a correct answer (coded as 1) is described by the respective parameter. The item response function is also known as the item characteristic curve (ICC) or trace line. Depending on the model, ICCs can take different shapes; Mokken analysis explicitly allows the ICCs to differ in shape.

Mokken analysis rests on four basic assumptions:

  1. Unidimensionality - all items measure one and the same ability, without the influence of further traits. If a test tapped several abilities of a test person at once, no unambiguous value for any single ability could be determined.
  2. Local stochastic independence - the answer to an item does not depend on the processing of previous or later items; the probability of answering an item correctly depends only on the ability being measured.
  3. Monotonicity of the ICCs - the probability of solving an item never decreases as the latent ability, and with it the test score, increases: of two test persons, the one with the higher score solves the item with at least the same probability. The ICC is therefore a non-decreasing function of the test score.
  4. Non-intersecting ICCs - if the ICCs do not intersect (i.e., the curves never cross), the items can be ranked by their difficulty, and this ranking is the same for all test persons.
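
The monotonicity assumption can be examined empirically through so-called manifest monotonicity: for each item, the proportion of correct answers per rest-score group should be non-decreasing. A minimal sketch in Python (the function and the simple group-wise comparison are illustrative, not the procedure from Mokken's book):

```python
import numpy as np

def check_monotonicity(X, item):
    """Estimate the ICC of one item over rest-score groups.

    The rest score is the total score on all *other* items; under the
    monotonicity assumption the proportion of correct answers should be
    non-decreasing from one rest-score group to the next."""
    X = np.asarray(X)
    rest = X.sum(axis=1) - X[:, item]            # rest score per person
    groups = np.unique(rest)
    props = np.array([X[rest == r, item].mean() for r in groups])
    return groups, props, bool(np.all(np.diff(props) >= 0))

# Perfect Guttman data: the estimated ICC of item 1 rises with the rest score.
X = np.array([[0, 0, 0],
              [1, 0, 0],
              [1, 1, 0],
              [1, 1, 1]])
groups, props, ok = check_monotonicity(X, 1)
print(ok)   # → True
```

With real data the group proportions are noisy, so practical software additionally pools small rest-score groups before testing.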

Models

Mokken analysis comprises two models: the model of monotone homogeneity and the model of double monotonicity.

Model of monotone homogeneity

Combining Mokken's first three assumptions (unidimensionality, local stochastic independence and monotonicity of the ICCs) yields the monotone homogeneity model, MHM for short. Under the MHM, if test person v is more likely than test person w to answer an item i correctly, then v also answers every other item with a higher probability than w: the persons are ordered consistently across all items. A scale with this property is described as cumulative, the probabilistic counterpart of Guttman's deterministic cumulative scale. For an item parameter to be called homogeneous, the items must exhibit a fixed order. This property can also be shown graphically: under valid homogeneity the ICCs show no intersections, and each function increases monotonically.

Model of double monotonicity

Extending the model of monotone homogeneity with the assumption of monotonically ordered item parameters yields the model of double monotonicity, DMM for short. It implies ICCs that do not intersect. Whether double monotonicity holds can be tested with various methods. Because double monotonicity fixes the ordering of the item parameters, the ranking of item difficulties must be the same for all test persons; besides the intended population, this can also be tested on other groups.
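
Non-intersection can also be checked against data. The following crude Python sketch (conditioning on the total score is a simplification of the rest-score methods used in practice) orders the items by overall difficulty and verifies that this order holds within every score group:

```python
import numpy as np

def check_nonintersection(X):
    """Order the items by overall difficulty (easiest first), then verify
    that within every total-score group the estimated success proportions
    keep that order, i.e. the empirical ICCs never cross.  Small groups
    make this noisy; dedicated software pools groups first."""
    X = np.asarray(X)
    order = np.argsort(-X.mean(axis=0))        # easiest item first
    total = X.sum(axis=1)
    for t in np.unique(total):
        props = X[total == t][:, order].mean(axis=0)
        if np.any(np.diff(props) > 1e-12):     # an easier item became harder
            return False
    return True
```

For a perfect Guttman pattern the check succeeds in every score group, since each group answers exactly the easiest items correctly.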

Homogeneity coefficient H

The homogeneity coefficient is the statistic used to assess double monotonicity, which combines monotone homogeneity with homogeneously ordered item parameters. The coefficient introduced by Mokken (1971) is based on Loevinger's homogeneity coefficient. Three variants are distinguished, depending on the starting point:

  • Hij denotes the homogeneity of a pair of items
  • Hi gives a value for the relationship between one item and the remaining items of the set combined
  • H refers to all items of the scale

If the homogeneity coefficient H takes the value 0, no correlation can be assumed; if H reaches 1, one speaks of a perfect Guttman scale. Mokken gives guideline values for interpreting H: he calls a scale weak if 0.3 ≤ H < 0.4, medium if 0.4 ≤ H < 0.5, and strong if H ≥ 0.5.
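
For dichotomous items these coefficients can be computed as ratios of observed covariances to the maximum covariances the item marginals permit. A minimal Python sketch, assuming a persons-by-items 0/1 matrix (the function name is illustrative):

```python
import numpy as np

def scalability_coefficients(X):
    """Loevinger/Mokken scalability coefficients for a persons-by-items
    matrix of dichotomous (0/1) scores.

    Each observed covariance is divided by the maximum covariance that
    the item marginals allow; returns the pairwise matrix Hij, the
    per-item vector Hi, and the overall coefficient H."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    p = X.mean(axis=0)                           # item popularities
    cov = np.cov(X, rowvar=False, bias=True)     # observed covariances
    cov_max = np.minimum.outer(p, p) - np.outer(p, p)
    ratio = cov / cov_max
    off = ~np.eye(k, dtype=bool)                 # off-diagonal mask
    Hij = np.where(off, ratio, np.nan)
    Hi = (cov * off).sum(axis=1) / (cov_max * off).sum(axis=1)
    iu = np.triu_indices(k, 1)
    H = cov[iu].sum() / cov_max[iu].sum()
    return Hij, Hi, H

# A perfect Guttman pattern yields the maximal value H = 1.
X = np.array([[0, 0, 0],
              [1, 0, 0],
              [1, 1, 0],
              [1, 1, 1]])
Hij, Hi, H = scalability_coefficients(X)
print(round(H, 3))   # → 1.0
```

The ratio construction makes the bounds from the text visible: independent items give covariances of 0 and hence H = 0, while a Guttman pattern attains the maximal covariance for every pair and hence H = 1.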

Parameter estimation - item parameters and person parameters

Two parameters are central in the Mokken scaling model: the person parameter, which describes the ability of the test person, and the item parameter, which describes the difficulty of an item relative to the test persons. Neither parameter is expressed as a numerical value; instead, the test persons are ranked according to their latent ability and the items according to their difficulty. In nonparametric models the total score stands in for the person's position on the latent continuum. The resulting order is therefore ordinal: only the ranking of trait levels is meaningful, not the distances between them.
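
The ordinal character described above can be made concrete in a few lines: person and item "parameters" are nothing more than rank orders derived from total scores and proportions correct. A small illustration in Python:

```python
import numpy as np

# Rows are persons, columns are dichotomous (0/1) items.
X = np.array([[1, 0, 0],
              [1, 1, 1],
              [0, 0, 0],
              [1, 1, 0]])

# Person "parameter": only the rank order of the total scores is used.
person_rank = np.argsort(X.sum(axis=1))    # persons, least to most able

# Item "parameter": items ranked by proportion correct, easiest first.
item_rank = np.argsort(-X.mean(axis=0))

# The distances between ranks carry no meaning; only the order does.
print(person_rank, item_rank)
```
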

Mokken scaling method

The procedure begins with the selection of a scaling method and the assessment of its suitability. Hypotheses are formulated about the possible latent and manifest variables and about how the given variables relate to one another; the data set is inspected and a scaling method, in this case Mokken analysis, is selected. The validity is then examined: it is checked whether the assumptions made are substantively sound. The algorithm used to analyze the data is called SCAMMO; it extracts the existing scales and excludes items that belong to no scale. A scale is called robust if an equivalent value of the homogeneity coefficient is obtained for different groups of test persons. The test scores of the test persons are computed in order to build scales from them. If an item that has not yet been included in a scale has a negative Hij with an item already in the scale, it is never admitted, even if only a single item in the scale is concerned and the negative correlation is minimal. The attentiveness of the analyst is always required: an item may, for example, correlate slightly positively and yet have a value below the constant c; here too it must be decided whether the item is to be included in the scale or excluded.

Mokken describes the construction of a scale from an existing item set. At the outset he states the required knowledge about the interplay between the variable of interest and the items; the item set is assumed to contain homogeneous items. A scale is understood as a set of dichotomously coded items that correlate positively with one another, i.e. Hij > 0 (covariances greater than 0), and that, with respect to the homogeneity coefficients Hi and H and the termination criterion c, satisfy Hi, H ≥ c > 0. The automatically running algorithm is called the Automated Item Selection Procedure, AISP for short.

Mokken points out that the iterative algorithm may wrongly admit an item whose estimated Hi lies below the termination criterion c; this can be corrected manually afterwards. Other failures of the AISP occur as well: the value of Hi can exceed c although the monotonicity requirement for the ICC is violated, which happens when the ICC changes in the same direction over most ability levels despite local violations. Furthermore, very flat ICCs are accepted as monotone when the termination criterion is low; if c is set high, such an item is not included in the scale.

The scale-building algorithm of the Mokken analysis proceeds as follows. A scale must contain at least two items; a suitable item is selected and further items are added according to their suitability. In the initial step, Hij is computed for all item pairs; this coefficient must be greater than 0 and greater than the chosen constant c. The start pair is the one with the highest Hij; if several pairs share this highest value, the pair listed first is chosen. An item that correlates negatively with the start item can in no case be integrated into the resulting scale. Next, the coefficients H and Hi are computed; both must exceed 0 and c, and the item with the highest value is added to the scale. If several items show the same values of H and Hi, the item with the highest difficulty is chosen. Mokken notes the problem that items with low homogeneity values can enter the scale and therefore recommends checking the Hi values. He also refers to the problem of capitalization on chance and to the possibility that the scale found is not necessarily optimal, since the entire scale depends on the start item; exchanging the start item for other combinations could yield alternative scales.
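
The selection steps above can be sketched as a greedy algorithm. The following simplified Python version is not Mokken's SCAMMO/AISP implementation; it reduces the start step and the tie-breaking rules to their essentials and builds a single scale:

```python
import numpy as np
from itertools import combinations

def aisp(X, c=0.3):
    """Greedy sketch of an automated item selection for one scale.

    Simplifications vs. Mokken's AISP: ties are broken by list position
    only, and H/Hi are recomputed against the growing scale without the
    final re-checks Mokken recommends."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    p = X.mean(axis=0)
    cov = np.cov(X, rowvar=False, bias=True)
    cov_max = np.minimum.outer(p, p) - np.outer(p, p)

    def H_pair(i, j):
        return cov[i, j] / cov_max[i, j]

    # Start with the item pair whose Hij is largest; it must exceed c.
    start = max(combinations(range(k), 2), key=lambda ij: H_pair(*ij))
    if H_pair(*start) <= c:
        return []
    scale = list(start)
    remaining = [i for i in range(k) if i not in scale]

    while remaining:
        def H_i(i):   # Hi of a candidate with respect to the current scale
            return (sum(cov[i, j] for j in scale)
                    / sum(cov_max[i, j] for j in scale))
        # An item correlating non-positively with a scale item is never admitted.
        candidates = [i for i in remaining
                      if all(cov[i, j] > 0 for j in scale) and H_i(i) > c]
        if not candidates:
            break
        best = max(candidates, key=H_i)
        scale.append(best)
        remaining.remove(best)
    return sorted(scale)
```

Applied to a data set in which three items form a Guttman pattern and a fourth item is unrelated to them, this sketch selects the three scalable items and excludes the fourth.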

Criticism of the Mokken analysis

Jansen (1982) criticized the Mokken analysis eleven years after the publication of Mokken (1971). He questions the use of the homogeneity coefficient as a criterion for homogeneous items and demonstrates items that show no homogeneity but were nevertheless recognized as homogeneous by the Mokken analysis. His example involves ICCs that do not run in parallel; with a parallel course, as in the Rasch model, homogeneity follows per se. Jansen states: “A set of perfectly homogeneous items can be judged as 'homogeneous' or 'not homogeneous' in SCAMMO depending on the minimal boundary for scalability and the distance between the item 'latent parameters'.” If the distances between the item difficulties are too small, the homogeneity coefficient cannot exceed the termination criterion, so items that are actually scalable are judged to be unscalable. Jansen further divides the concept of homogeneity into classical and modern homogeneity: the former describes homogeneity as a relationship, the latter concerns the specific characteristics used to select items for a scale. Under this division, Loevinger's homogeneity coefficient falls into the modern category.

Sijtsma responded to the criticism in 1984. He confirms Jansen's logical-mathematical derivation but does not agree with transferring the findings directly to the Mokken analysis. He corrects that Mokken's homogeneity coefficient serves to establish the relationship between a scale and the perfect Guttman scale; the same applies to the termination criterion c, which, like the homogeneity coefficient, refers to the Guttman scale.

Shortly afterwards, Jansen, Roskam and Van den Wollenberg (1984) published the article “Discussion on the Usefulness of the Mokken Procedure for Nonparametric Scaling”. In it, the authors address two fundamental questions concerning the H coefficient in the context of Mokken analysis: what coefficient H actually measures, and how the homogeneity coefficient relates to Mokken's DMM. First, the authors refute the possibility, assumed by Sijtsma (1984), of taking a high value of H as an indicator that identical total scores of test persons stem from congruent answers to the same items; they support this with an example involving two test persons and state that, besides the coefficient H, further conditions must hold for this conclusion to be valid. If the test persons have similar levels of the ability in question, i.e. their person parameters lie close together, Molenaar (1982) expects the homogeneity coefficient to take a low value; with this consideration, Jansen et al. (1984) argue that the magnitude of H alone is not decisive.

In 1986 Sijtsma responded with a statement on the claims of Jansen et al. (1984). He notes that their example with two test persons and one item, intended to explain a high value of H, cannot count as representative or as evidence of a false assumption. He also points out that “Coefficient H, however, does not express a probability, nor is it based on probabilities as defined by Jansen et al. (1984)” (Sijtsma, 1986, p. 428). Roskam, Van den Wollenberg and Jansen presented a critical paper on the Mokken analysis within a university research group in 1983 and published its content in 1986. In their article, the authors argue that the homogeneity coefficient H is not suited to making statements about the homogeneity and holomorphism of a data set. They also state that “The Mokken scale [...] appears to be a revival of the Guttman scale” (Roskam et al., 1986, p. 277).

Mokken, Lewis and Sijtsma (1986) refute these allegations and criticize the biased approach of Roskam et al. (1986), which they attribute to sympathy for the Rasch model. Sijtsma, Van Abswoude and Van der Ark (2004) apply different nonparametric IRT scaling methods to a data set in order to compare the procedures. The authors find that the fixed value c = 0.3 recommended by Mokken (1971, p. 153) need not be retained: when the Mokken scaling method was run with different values of the termination criterion, correct scales were produced.

Meijer, Smits and Timmerman (2012) build on this finding and state that Mokken analysis often fails to depict the empirically found data structure adequately when constructing scales; this depends on the underlying structure of the data material. Since it often cannot be clarified a priori whether the data are suitable for analysis with Mokken scaling, the authors propose varying the termination criterion c, thereby increasing the chance of a correct mapping. Two conditions must be met for these methods to be usable: “[…] factors are not strongly correlated and the items do not differ substantially in the item strength.” The authors emphasize promising results when running the procedure with different termination criteria, but note that further applications to a wide variety of data are still pending.

Literature

  • J. Rost: Test theory - test construction . Hans Huber, Bern 1996.
  • BT Hemker, K. Sijtsma: A Practical Comparison Between the Weighted and the Unweighted Scalability Coefficients of the Mokken Model . In: Kwantitatieve Methoden , 1993, 14, pp. 59-73.
  • Klaas Sijtsma: New Developments in Psychometrics . Springer, New York 2003.
  • J. Gerich: Non-parametric scaling according to Mokken - Contributions to the qualitative analysis of quantitative data . Trauner, Linz 2001.
  • PGM Van der Heijden, S. Van Buuren, M. Fekkes, J. Radder, E. Verrips: Unidimensionality and Reliability under Mokken Scaling of the Dutch Language Version of the SF-36 . In: Quality of Life Research , 2003, 12, pp. 189-198.
  • G. Rasch: Probabilistic Models for Some Intelligence and Attainment Tests . University of Chicago Press, Chicago 1980.
  • RJ Mokken: A Theory and Procedure of Scale Analysis: With Applications in Political Research . Walter de Gruyter, Berlin 1971.
  • K. Sijtsma, IW Molenaar: Introduction to Nonparametric Item Response Theory . SAGE Publications, London 2002.
  • WH Van Schuur: Mokken Scale Analysis: Between the Guttman Scale and Parametric Item Response Theory . In: Political Analysis , 2003, 11, pp. 139-163.
  • W. Meredith: Some Results Based On A General Stochastic Model For Mental Tests . In: Psychometrika , 1965, 30, pp. 419-440.
  • K. Sijtsma: Methodology Review: Nonparametric IRT Approaches to the Analysis of Dichotomous Item Scores . In: Applied Psychological Measurement , 1998, 22, pp. 3-31.
  • J. Loevinger: The Technic of Homogeneous Tests Compared With Some Aspects Of “Scale Analysis” and Factor Analysis . In: Psychological Bulletin , 1948, 45, pp. 507-529.
  • PGW Jansen: Homogeneity measurement using the Loevinger coefficient H: a critical discussion . In: Psychologische Beiträge , 1982, 24, pp. 96-105.
  • K. Sijtsma: Useful Nonparametric Scaling: A Reply to Jansen . In: Psychologische Beiträge , 1984, 26, pp. 423-437.
  • PGW Jansen, EE Ch. I. Roskam, AL Van den Wollenberg: Discussion on the Usefulness of the Mokken Procedure for Nonparametric Scaling . In: Psychologische Beiträge , 1984, 26, pp. 722-735.

References

  1. Meijer, Smits, Timmerman, 2012, p. 536