Matching (statistics)

from Wikipedia, the free encyclopedia

Matching, or pairwise assignment, refers to statistical methods that link similar observations across two or more data sets. Using the features the data sets have in common, each observation in one data set is assigned one or more similar observations from the other data sets. This enables a joint analysis of the data even though, in most cases, no unit actually appears in more than one data set. In medical statistics, matching is used in the design of observational studies.

Description

Consider two data sets: the results of a survey on income and the results of a separate survey on living conditions. The respondents differ between the two surveys, but both recorded a set of common characteristics (e.g. gender, age group, place of residence). Matching methods use these common characteristics to assign one or more similar observations from one data set to each observation in the other. This enables a joint analysis of income and living conditions even though probably no respondent took part in both surveys. The quality of such an analysis depends heavily on the quality of the matching.
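
The survey example can be sketched in a few lines of Python. This is only a minimal illustration, assuming exact agreement on the common characteristics counts as "similar"; the function name match_records, the field names, and the toy records are hypothetical and not taken from any particular survey or library.

```python
from collections import defaultdict


def match_records(recipients, donors, keys):
    """For each recipient, return the donors that share all values in `keys`."""
    # Index donors by their tuple of common characteristics.
    donors_by_profile = defaultdict(list)
    for donor in donors:
        donors_by_profile[tuple(donor[k] for k in keys)].append(donor)

    matches = []
    for recipient in recipients:
        profile = tuple(recipient[k] for k in keys)
        # An empty list means no similar observation was found.
        matches.append((recipient, donors_by_profile.get(profile, [])))
    return matches


# Hypothetical toy data: the surveys share only gender and age group.
income_survey = [
    {"id": "A1", "gender": "f", "age_group": "30-39", "income": 3200},
    {"id": "A2", "gender": "m", "age_group": "40-49", "income": 2900},
]
housing_survey = [
    {"id": "B1", "gender": "f", "age_group": "30-39", "rooms": 3},
    {"id": "B2", "gender": "f", "age_group": "30-39", "rooms": 4},
    {"id": "B3", "gender": "m", "age_group": "50-59", "rooms": 2},
]

matched = match_records(income_survey, housing_survey, ["gender", "age_group"])
for person, similar in matched:
    print(person["id"], [d["id"] for d in similar])
```

In this toy data, respondent A1 receives two similar observations and A2 receives none, which shows why the quality of the matching limits the quality of the joint analysis. Practical procedures usually relax exact agreement to a distance or score between observations.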

Matching procedures

Special matching procedures include propensity score matching and optimal matching.

Applications

In medical statistics, matching is used in the design of observational studies. In case-control studies, cases and controls can be selected so that they agree on certain criteria (e.g. gender, socio-economic status, age group). This can be done at the individual level (for each case, a control that matches on the selected criteria is chosen) or as group matching (also called frequency matching). In group matching, the controls are selected so that their overall composition mirrors that of the cases. For example, if 80% of the cases are women, the control group is assembled with a similar percentage of women.
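
Group (frequency) matching can be illustrated with a short Python sketch. It assumes a single matching criterion and a pool of potential controls that is large enough; the function name frequency_match, the field name "sex", and the data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def frequency_match(cases, control_pool, key, n_controls, seed=0):
    """Draw controls so that their distribution over `key` mirrors the cases."""
    rng = random.Random(seed)
    case_counts = Counter(case[key] for case in cases)

    # Group the potential controls by stratum (value of the matching criterion).
    pool_by_stratum = defaultdict(list)
    for control in control_pool:
        pool_by_stratum[control[key]].append(control)

    selected = []
    for stratum, count in case_counts.items():
        # Quota proportional to the stratum's share among the cases.
        # Because of rounding, quotas need not add up to n_controls exactly.
        quota = round(n_controls * count / len(cases))
        candidates = pool_by_stratum[stratum]
        if quota > len(candidates):
            raise ValueError(f"not enough controls in stratum {stratum!r}")
        selected.extend(rng.sample(candidates, quota))
    return selected


# Hypothetical data: 80% of the cases are women.
cases = [{"sex": "f"}] * 8 + [{"sex": "m"}] * 2
control_pool = [{"sex": "f"} for _ in range(50)] + [{"sex": "m"} for _ in range(50)]

controls = frequency_match(cases, control_pool, key="sex", n_controls=20)
share_women = sum(c["sex"] == "f" for c in controls) / len(controls)
print(share_women)
```

Here 16 of the 20 selected controls are women, so the control group reproduces the 80% share among the cases without pairing any control to a specific case.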

Individually matched case-control studies that are analyzed with logistic regression should use a special form of the method, conditional logistic regression. If cases are matched individually on several factors, there is a risk that for some cases no control can be found that satisfies all matching criteria.
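
To indicate what the conditional form looks like, the following sketch gives the standard conditional likelihood for the simplest design, 1:1 matching. The notation is an assumption introduced here: x_{i1} and x_{i0} are the covariate vectors of the case and the control in pair i, and beta is the coefficient vector.

```latex
\[
  L(\beta) \;=\; \prod_{i=1}^{n}
  \frac{\exp\!\left(\beta^{\top} x_{i1}\right)}
       {\exp\!\left(\beta^{\top} x_{i1}\right) + \exp\!\left(\beta^{\top} x_{i0}\right)}
\]
```

Each factor is the probability that, within pair i, the observed case rather than the control is the case. Pair-specific intercepts cancel from this ratio, so the matching is taken into account without estimating a separate parameter for every pair.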

Literature

  • Susanne Rässler: Statistical Matching: A Frequentist Theory, Practical Applications and Alternative Bayesian Approaches. Springer, 2008, ISBN 978-0-387-95516-2.

References

  1. Deborah N. Peikes, Lorenzo Moreno, Sean Michael Orzol: Propensity score matching. In: The American Statistician, 62.3, 2008.
  2. Rajeev H. Dehejia, Sadek Wahba: Propensity score-matching methods for nonexperimental causal studies. In: Review of Economics and Statistics, 84.1, 2002, pp. 151–161.
  3. Marco Caliendo, Sabine Kopeinig: Some practical guidance for the implementation of propensity score matching. In: Journal of Economic Surveys, 22.1, 2008, pp. 31–72.
  4. Donald B. Rubin, Neal Thomas: Combining propensity score matching with additional adjustments for prognostic covariates. In: Journal of the American Statistical Association, 95.450, 2000, pp. 573–585.
  5. Christian Erzberger, Gerald Prein: Optimal-Matching-Technik: An analysis method for the comparability and order of individually different life courses. 1997.
  6. Andrew Abbott, Angela Tsay: Sequence analysis and optimal matching methods in sociology: review and prospect. In: Sociological Methods & Research, 29.1, 2000, pp. 3–33.
  7. Christel Weiß: Basic knowledge of medical statistics. 5th edition. 2010.