Standardization (epidemiology)

from Wikipedia, the free encyclopedia

The term standardization is used in epidemiology to describe two different methods for calculating measures that make meaningful comparisons between populations possible. A fundamental distinction is made between direct and indirect standardization.

In epidemiology, for example when presenting results from cancer registries, measures are standardized with regard to gender and age structure. Such a procedure is referred to as gender and age standardization, and its outputs as gender- and age-standardized results and measures. In principle, however, standardization can also be carried out with regard to any other structural feature or combination of structural features. For readability, the following explanations and examples are limited to age standardization.

Direct standardization

Goal and requirements

The aim of direct age standardization is to calculate measures for populations that are as independent as possible of the age structure of the individual populations and are therefore comparable between populations with different age structures.

A prerequisite for direct age standardization is that separately determined measures are available for every age group used in the standardization, in every population considered. With direct standardization, results from different populations and from different studies can be compared adequately, regardless of differences in age structure. Direct age standardization does, however, require relatively large populations so that sufficiently stable statistical measures can be calculated within narrow age groups. If broader age groups are used instead, differences in age distribution within a group are not accounted for.

Procedure

To calculate directly age-standardized measures, a uniform (fictitious) age structure, namely that of a so-called standard population, is assumed for all populations under consideration (see below). From the (real) age-group-specific measures of a population under consideration and the age group sizes of the agreed standard population, a (fictitious) measure is calculated: the value that would result if the population had the age group sizes of the standard population while exactly its own age-specific measures applied within each age group (see example). This is the directly age-standardized value for the population under consideration. Note: direct age standardization amounts to a weighted summary of the results from the age groups; identical results can also be obtained by statistical calculations with weighted individual observations.
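The weighted summary described above can be written compactly. The notation is an assumption made for this sketch and does not come from the source: r_i is the age-specific measure of the population under consideration in age group i, N_i is the number of persons in age group i of the standard population, and w_i is the resulting weight.

```latex
R_{\text{direct}}
  = \frac{\sum_i N_i \, r_i}{\sum_i N_i}
  = \sum_i w_i \, r_i ,
\qquad
w_i = \frac{N_i}{\sum_j N_j}
```

Because the weights w_i depend only on the standard population, two populations with identical age-specific measures r_i always receive the same directly standardized value, whatever their own age structures are.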

Calculation example

Fictitious example: disease rates are determined for two districts, A and B, each with 30,000 inhabitants. A total of 900 inhabitants fell ill in district A, which gives a disease rate of 30 sick persons per 1,000 inhabitants. In district B, by contrast, 2,020 inhabitants fell ill, which gives a disease rate of 67.3 sick persons per 1,000 inhabitants. More than twice as many people fell ill in district B as in district A, which could indicate a health hazard in district B. On the other hand, a different age structure in the two districts could be responsible for the difference.

For direct standardization, incidence rates in individual age groups must first be calculated for both districts A and B (for clarity, the example distinguishes only three age groups):

District A
Age            Residents     Sick    Sick per 1,000
0 to 49           20,000      200                10
50 to 64           8,000      400                50
65 and over        2,000      300               150
total             30,000      900                30

District B
Age            Residents     Sick    Sick per 1,000
0 to 49           12,000      120                10
50 to 64           8,000      400                50
65 and over       10,000    1,500               150
total             30,000    2,020              67.3
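The rates in the two district tables can be reproduced with a few lines of Python. This is a minimal sketch: the counts are taken from the tables above, while the function and variable names are chosen for illustration only.

```python
# Counts from the district tables: (residents, sick) per age group.
districts = {
    "A": {"0 to 49": (20_000, 200), "50 to 64": (8_000, 400), "65 and over": (2_000, 300)},
    "B": {"0 to 49": (12_000, 120), "50 to 64": (8_000, 400), "65 and over": (10_000, 1_500)},
}


def rate_per_1000(sick, residents):
    """Return the number of sick persons per 1,000 residents."""
    return 1000 * sick / residents


def age_specific_rates(groups):
    """Rate per 1,000 for each age group of one district."""
    return {age: rate_per_1000(sick, residents) for age, (residents, sick) in groups.items()}


def crude_rate(groups):
    """Rate per 1,000 over all age groups combined (not standardized)."""
    residents = sum(r for r, _ in groups.values())
    sick = sum(s for _, s in groups.values())
    return rate_per_1000(sick, residents)


for name, groups in districts.items():
    print(name, age_specific_rates(groups), round(crude_rate(groups), 1))
```

The age-specific rates come out identical for both districts, while the crude rates differ (30 versus about 67.3 per 1,000).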

The age-specific incidence rates of both populations are decisive for the further calculations (given here as sick persons per 1,000 inhabitants). They are used to calculate (fictitious) numbers of sick persons for district A and district B, with both populations treated as if they had the same age structure as a previously (arbitrarily) selected standard population. In the present case, the standard population was simply formed by adding up the age group sizes of districts A and B.

District A
Age            Standard population    Sick per 1,000 (A)    Sick (fictitious)
0 to 49                     32,000                    10                  320
50 to 64                    16,000                    50                  800
65 and over                 12,000                   150                1,800
total                       60,000                                      2,920

District B
Age            Standard population    Sick per 1,000 (B)    Sick (fictitious)
0 to 49                     32,000                    10                  320
50 to 64                    16,000                    50                  800
65 and over                 12,000                   150                1,800
total                       60,000                                      2,920

In the example, the totals in the last rows of the tables yield identical age-standardized disease rates of 4.87% (or 48.7 sick persons per 1,000 inhabitants), obtained by dividing the calculated fictitious number of sick persons (2,920) by the total number of people in the standard population (60,000). The differences in the "raw" disease rates between districts A and B mentioned at the outset therefore result, in this example, exclusively from the different age structures of the two districts.

In the simple example given, a quick look at the age-group-specific disease rates of the two districts would have been enough to see that they match and thus to identify the age differences between the populations as the cause of the different disease frequencies. In practice, however, standardizations often differentiate considerably more finely (e.g. 20 age groups, each for men and women) and cover a larger number of populations (e.g. 16 federal states) and a larger number of measures (e.g. incidence rates for different diagnoses). In such cases, standardized single values can make comparisons considerably easier, and presenting all measures for all subgroups with gender- and age-specific values would hardly be possible, especially in print publications.
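The calculation in this example can be sketched in Python as follows. The figures are those of the example (a standard population formed from the sums of both districts, and the age-specific rates per 1,000); the function name `directly_standardized_rate` is a hypothetical helper introduced only for illustration.

```python
# Standard population of the example: age group sizes of districts A and B added up.
standard_population = {"0 to 49": 32_000, "50 to 64": 16_000, "65 and over": 12_000}

# Age-specific rates (sick persons per 1,000 inhabitants) from the example.
rates_per_1000 = {
    "A": {"0 to 49": 10, "50 to 64": 50, "65 and over": 150},
    "B": {"0 to 49": 10, "50 to 64": 50, "65 and over": 150},
}


def directly_standardized_rate(rates, standard):
    """Apply a population's age-specific rates to the standard population.

    Returns the fictitious number of sick persons per 1,000 persons
    of the standard population.
    """
    fictitious_sick = sum(standard[age] * rates[age] / 1000 for age in standard)
    return 1000 * fictitious_sick / sum(standard.values())


for name, rates in rates_per_1000.items():
    print(name, round(directly_standardized_rate(rates, standard_population), 1))
```

Both districts yield about 48.7 per 1,000, matching the value derived in the text. With a different standard population the standardized value would change, but it would still be the same for both districts, because their age-specific rates are identical.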

Standard populations for direct standardizations

The choice of a standard population depends on the populations considered, the intended comparisons, and the intended use of the results. It is always arbitrary to some extent and can influence the results and their interpretation. For results that are to be compared internationally, there are a number of proposals for fictitious standard populations, provided for example by the World Health Organization (WHO) or Eurostat. If a study is meant to support statements about current national measures (e.g. disease rates in Germany), direct standardization to the current gender and age structure of the country in question can be helpful, unless the gender and age groups are already represented proportionately in the study. If the aim of a study is to compare standardized measures between subgroups of the study population, it can also be useful to standardize directly to the structure of the overall study population.

Indirect standardization

Goal and requirements

Indirect age standardization is likewise intended to enable comparative evaluations of results for (sub)populations that differ in their actually observed age structures. As a rule, indirect standardization is used mainly for evaluations of smaller subpopulations for which more detailed data on a superordinate total population is available at the same time (for example, evaluations of regional districts in Germany, for which nationwide data is also available).

The prerequisite for indirect age standardization is that separately determined measures for all age groups considered are available only for the reference or total population. For the subpopulations under consideration, only their age structure and their results or measures across all ages need to be known; individual age groups may also be sparsely populated or missing in the subpopulations. A limitation of indirect standardization is that it primarily yields only relative deviations of the measures in the subpopulations from the results for the total population. Comparisons between the subpopulations are legitimate only under certain conditions and assumptions.

Procedure

To calculate indirectly age-standardized measures, expected results are determined for each subpopulation in addition to the results observed across all ages. The expected results are those that would arise from the real age structure of the subpopulation if the age-specific measures of the total population applied within each age group. From the observed and expected results, quotients are then calculated; these form the primary result of indirect standardization (see example). A value of 1.00 results if the observed and expected results do not differ, values greater than 1 if the values in the subpopulation are higher than expected for its age structure, and values less than 1 if the observed results are lower than expected. If the calculations are evaluations of mortality, these quotients are referred to as the standardized mortality ratio (SMR).
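The quotient described above can be stated as a formula. The symbols are assumptions made for this sketch and are not taken from the source: O is the number of events observed in the subpopulation across all ages, n_i is the number of persons in age group i of the subpopulation, and r_i^ref is the age-specific measure of the reference (total) population in age group i.

```latex
E = \sum_i n_i \, r_i^{\text{ref}} ,
\qquad
\text{ratio} = \frac{O}{E}
```

Only the age structure n_i and the overall count O of the subpopulation enter the calculation, which is why age-specific results for the subpopulation itself are not required.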

Calculation example

First of all, incidence rates in individual age groups must be calculated for the total population (in this example, age-specific results from combined data on districts A and B).

District A + B (total population)
Age            Residents     Sick    Sick per 1,000 (A + B)
0 to 49           32,000      320                        10
50 to 64          16,000      800                        50
65 and over       12,000    1,800                       150

On the basis of the age-specific incidence rates in the total population (here district A + B), the expected numbers of sick people in the subpopulations (here district A and district B) are calculated.

District A (subpopulation)
Age            Residents (A)    Sick per 1,000 (A + B)    Expected sick (A)
0 to 49               20,000                        10                  200
50 to 64               8,000                        50                  400
65 and over            2,000                       150                  300
total                 30,000                        30                  900

District B (subpopulation)
Age            Residents (B)    Sick per 1,000 (A + B)    Expected sick (B)
0 to 49               12,000                        10                  120
50 to 64               8,000                        50                  400
65 and over           10,000                       150                1,500
total                 30,000                      67.3                2,020

From the observed and expected disease numbers in subpopulations A and B, quotients are finally calculated:

Subpopulation A: observed sick (A) / expected sick (A) = 900 / 900 = 1.00

Subpopulation B: observed sick (B) / expected sick (B) = 2,020 / 2,020 = 1.00

The quotients of 1.00 obtained by indirect age standardization show, for both subpopulation A and subpopulation B, that the incidence rates do not differ from those of the total population (here A and B combined) once the effects of the different age structures are taken into account.
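The indirect standardization of the example can be sketched in Python as follows. The figures come from the tables of the example; the names `expected_cases` and `standardized_ratio` are hypothetical helpers introduced for illustration.

```python
# Age-specific rates of the total population A + B (sick persons per 1,000).
reference_rates_per_1000 = {"0 to 49": 10, "50 to 64": 50, "65 and over": 150}

# For each subpopulation only the age structure and the overall observed count are needed.
subpopulations = {
    "A": {
        "residents": {"0 to 49": 20_000, "50 to 64": 8_000, "65 and over": 2_000},
        "observed": 900,
    },
    "B": {
        "residents": {"0 to 49": 12_000, "50 to 64": 8_000, "65 and over": 10_000},
        "observed": 2_020,
    },
}


def expected_cases(residents, reference_rates):
    """Sick persons expected if the reference rates applied to this age structure."""
    return sum(residents[age] * reference_rates[age] / 1000 for age in residents)


def standardized_ratio(observed, residents, reference_rates):
    """Quotient of observed to expected sick persons."""
    return observed / expected_cases(residents, reference_rates)


for name, data in subpopulations.items():
    expected = expected_cases(data["residents"], reference_rates_per_1000)
    ratio = standardized_ratio(data["observed"], data["residents"], reference_rates_per_1000)
    print(name, expected, round(ratio, 2))
```

The expected counts are 900 for district A and 2,020 for district B, so both quotients equal 1.00, as in the text.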

Literature

Epidemiological methods, standardization
  • Harold A. Kahn, Christopher T. Sempos: Statistical Methods in Epidemiology. Oxford University Press, New York / Oxford 1989, ISBN 0-19-505751-1.
Standard populations used internationally
  • Omar B. Ahmad et al.: Age Standardization of Rates: A New WHO Standard. In: GPE Discussion Paper Series: No. 31. EIP/GPE/EBD. World Health Organization, 2001.
Information on the population structure in Germany