Chi-square test
In mathematical statistics, a chi-square test ($\chi^2$ test) denotes any of a group of hypothesis tests whose test statistic is chi-square distributed.
A distinction is mainly made between the following tests:
- Distribution test (also called goodness-of-fit test): Here it is checked whether given data follow a particular distribution.
- Independence test: Here it is checked whether two characteristics are stochastically independent.
- Homogeneity test: Here it is checked whether two or more samples come from the same distribution, i.e. from a homogeneous population.
The chi-square test and its test statistic were first described by Karl Pearson in 1900.
Distribution test
We consider a statistical characteristic $X$ whose probability distribution in the population is unknown. With regard to the distribution of $X$, the provisionally general null hypothesis

- $H_0$: The characteristic $X$ has the probability distribution $F_0$.

is set up.
Method
There are $n$ independent observations of the characteristic $X$ that fall into $m$ different categories. If the characteristic has a large number of possible values, it is expedient to group them into classes and to regard the classes as categories. The number of observations in the $j$-th category is the observed frequency $n_j$.
One now considers how many observations would on average have to lie in each category if $X$ actually had the hypothesized distribution. To do this, one first calculates the probability $p_j$ that a value of $X$ falls into category $j$. The absolute frequency to be expected under $H_0$ is

$$n_{0j} = n \, p_j .$$
If the frequencies observed in the sample at hand deviate "too much" from these expected frequencies, the null hypothesis is rejected. The test statistic

$$\chi^2 = \sum_{j=1}^{m} \frac{(n_j - n_{0j})^2}{n_{0j}}$$

measures the size of the deviation.
For a sufficiently large sample size, the test statistic is approximately chi-square distributed with $m - 1$ degrees of freedom. If the null hypothesis is true, the differences between the observed and the theoretically expected frequencies should be small, so $H_0$ is rejected for large values of the test statistic; the rejection region for $H_0$ lies on the right.
At a significance level $\alpha$, $H_0$ is rejected if the value of the test statistic obtained from the sample is larger than the $(1-\alpha)$-quantile of the chi-square distribution with $m - 1$ degrees of freedom.
There are tables of these quantiles (critical values) as a function of the number of degrees of freedom and the desired significance level (see below).
If the significance level that belongs to a particular value of the test statistic is to be determined, an intermediate value must usually be computed from the table; logarithmic interpolation is used for this.
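The whole procedure amounts to a few lines of code. The following Python sketch (the counts and probabilities are purely illustrative; SciPy is assumed to be available) runs the test just described:

```python
# Minimal sketch of the goodness-of-fit procedure described above.
# The observed counts and the hypothesized probabilities are made up.
from scipy.stats import chisquare

observed = [18, 55, 27]                     # n_j for m = 3 categories
p0 = [0.2, 0.5, 0.3]                        # hypothesized p_j
expected = [sum(observed) * p for p in p0]  # n_0j = n * p_j

stat, p_value = chisquare(observed, f_exp=expected)  # df = m - 1
print(stat, p_value)  # reject H0 at level alpha if p_value < alpha
```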
Particularities
Estimation of distribution parameters
In general, the parameters of the distribution are specified in the distribution hypothesis. If they cannot be specified, they must be estimated from the sample. With the chi-square distribution, one degree of freedom is lost for each estimated parameter; it therefore has $m - 1 - k$ degrees of freedom, where $k$ is the number of estimated parameters. For the normal distribution, $k = 2$ if both the expected value and the variance are estimated.
Minimum size of the expected frequencies
So that the test statistic can be regarded as approximately chi-square distributed, each expected frequency must have a certain minimum size. Various textbooks put this at 1 or 5. If an expected frequency is too small, several classes can be combined in order to reach the minimum size.
Example of the distribution test
The sales of approx. 200 listed companies are available. The following histogram shows their distribution.
Let $X$ be the turnover of a company [million €].
We now want to test the hypothesis that $X$ is normally distributed.
Since the data take many different values, they have been grouped into classes. This resulted in the following table:
| Class $j$ | Interval: over | Interval: to | Observed frequency $n_j$ |
|---|---|---|---|
| 1 | … | 0 | 0 |
| 2 | 0 | 5,000 | 148 |
| 3 | 5,000 | 10,000 | 17 |
| 4 | 10,000 | 15,000 | 5 |
| 5 | 15,000 | 20,000 | 8 |
| 6 | 20,000 | 25,000 | 4 |
| 7 | 25,000 | 30,000 | 3 |
| 8 | 30,000 | 35,000 | 3 |
| 9 | 35,000 | … | 9 |
| Total | | | 197 |
Since no parameters are specified, they are determined from the sample: the sample mean $\bar{x}$ is used as an estimate of the expected value, and the sample standard deviation $s$ as an estimate of the standard deviation.
It is tested:

- $H_0$: $X$ is normally distributed with expected value $\mu = \bar{x}$ and standard deviation $\sigma = s$.
In order to determine the frequencies expected under $H_0$, the probabilities $p_{0j}$ that $X$ falls into the given classes are first calculated. For the first class,

$$p_{01} = P(X \le 0) = \Phi\!\left(\frac{0 - \bar{x}}{s}\right) = 0.3228 ,$$

where $\Phi$ denotes the distribution function of the standard normal distribution.
Analogously one calculates

- $p_{02} = P(0 < X \le 5000) = \Phi\!\left(\frac{5000 - \bar{x}}{s}\right) - \Phi\!\left(\frac{0 - \bar{x}}{s}\right) = 0.1270$
- …

This gives the expected frequencies

- $n_{0j} = n \cdot p_{0j}$, e.g. $n_{02} = 197 \cdot 0.1270 \approx 25.02$
- …
For example, about 25 companies would on average have to have a turnover between 0 and 5,000 if the characteristic turnover were actually normally distributed.
The expected frequencies, along with the observed frequencies, are listed in the following table.
| Class $j$ | Interval: over | Interval: to | Observed frequency $n_j$ | Probability $p_{0j}$ | Expected frequency $n_{0j}$ |
|---|---|---|---|---|---|
| 1 | … | 0 | 0 | 0.3228 | 63.59 |
| 2 | 0 | 5,000 | 148 | 0.1270 | 25.02 |
| 3 | 5,000 | 10,000 | 17 | 0.1324 | 26.08 |
| 4 | 10,000 | 15,000 | 5 | 0.1236 | 24.35 |
| 5 | 15,000 | 20,000 | 8 | 0.1034 | 20.36 |
| 6 | 20,000 | 25,000 | 4 | 0.0774 | 15.25 |
| 7 | 25,000 | 30,000 | 3 | 0.0519 | 10.23 |
| 8 | 30,000 | 35,000 | 3 | 0.0312 | 6.14 |
| 9 | 35,000 | … | 9 | 0.0303 | 5.98 |
| Total | | | 197 | 1.0000 | 197.00 |
The test statistic is now determined as

$$\chi^2 = \sum_{j=1}^{9} \frac{(n_j - n_{0j})^2}{n_{0j}} .$$

At a significance level $\alpha = 0.05$, the critical value is $\chi^2_{0.95;\,6} = 12.59$ (9 classes and 2 estimated parameters give $9 - 1 - 2 = 6$ degrees of freedom). The value of the test statistic far exceeds this critical value; class 2 alone contributes $(148 - 25.02)^2 / 25.02 \approx 604.5$. The null hypothesis is therefore rejected. It can be assumed that the characteristic turnover is not normally distributed in the population.
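The computation can be retraced with a few lines of Python using the values from the table above (SciPy assumed; the variable names are our own):

```python
# Recomputing the distribution-test example from the table above.
import numpy as np
from scipy.stats import chi2

n_obs = np.array([0, 148, 17, 5, 8, 4, 3, 3, 9])     # n_j
n_exp = np.array([63.59, 25.02, 26.08, 24.35, 20.36,
                  15.25, 10.23, 6.14, 5.98])          # n_0j

stat = ((n_obs - n_exp) ** 2 / n_exp).sum()           # approx. 710.7
df = len(n_obs) - 1 - 2                               # 2 estimated parameters
print(stat, chi2.ppf(0.95, df))                       # 12.59 -> reject H0
```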
Complement
The above data were subsequently logarithmized. Based on the result of a chi-square distribution test of the logarithmized data for normality, the null hypothesis of a normal distribution could not be rejected at a significance level of 0.05. Provided that the logarithmized sales data actually come from a normal distribution, the original sales data are log-normally distributed.
The following histogram shows the distribution of the logarithmized data.
Chi-square distribution test in jurisdiction
In Germany, the chi-square distribution test has been judicially confirmed, as part of the application of Benford's law, as a means for a tax authority to contest the correctness of cash records. Specifically, the frequency distribution of digits in cash-book entries was examined using the chi-square test, which yielded a "strong indication of manipulation of the recorded receipts". However, the applicability is subject to restrictions, and other statistical methods may have to be used (see Benford's law).
Independence test
The independence test is a significance test for stochastic independence in a contingency table.
We consider two statistical characteristics $X$ and $Y$, which may have any scale level. One is interested in whether the characteristics are stochastically independent. The null hypothesis

- $H_0$: The characteristics $X$ and $Y$ are stochastically independent.

is set up.
Method
The observations of $X$ fall into $m$ categories, those of $Y$ into $r$ categories. If a characteristic has a large number of possible values, it is expedient to group them into classes and to understand membership in the $j$-th class as the $j$-th category. In total, there are $n$ paired observations of $X$ and $Y$, which are divided among $m \cdot r$ categories.
Conceptually, the test is to be understood as follows:
Consider two discrete random variables $X$ and $Y$ whose joint probabilities can be represented in a probability table.
One now counts how often the $j$-th value of $X$ occurs together with the $k$-th value of $Y$. The observed joint absolute frequencies $n_{jk}$ can be entered in a two-dimensional frequency table with $m$ rows and $r$ columns.
| Feature $X$ \ Feature $Y$ | 1 | 2 | … | $k$ | … | $r$ | Sum Σ |
|---|---|---|---|---|---|---|---|
| 1 | $n_{11}$ | $n_{12}$ | … | $n_{1k}$ | … | $n_{1r}$ | $n_{1.}$ |
| 2 | $n_{21}$ | $n_{22}$ | … | $n_{2k}$ | … | $n_{2r}$ | $n_{2.}$ |
| … | … | … | … | … | … | … | … |
| $j$ | $n_{j1}$ | … | … | $n_{jk}$ | … | … | $n_{j.}$ |
| … | … | … | … | … | … | … | … |
| $m$ | $n_{m1}$ | $n_{m2}$ | … | $n_{mk}$ | … | $n_{mr}$ | $n_{m.}$ |
| Sum Σ | $n_{.1}$ | $n_{.2}$ | … | $n_{.k}$ | … | $n_{.r}$ | $n$ |
The row and column sums give the absolute marginal frequencies $n_{j.}$ and $n_{.k}$ as

- $n_{j.} = \sum_{k=1}^{r} n_{jk}$ and $n_{.k} = \sum_{j=1}^{m} n_{jk}$

Correspondingly, $h_{jk} = n_{jk}/n$ denotes the joint relative frequencies, and $h_{j.} = n_{j.}/n$ and $h_{.k} = n_{.k}/n$ the relative marginal frequencies.
From probability theory it is known that if two events $A$ and $B$ are stochastically independent, the probability of their joint occurrence equals the product of the individual probabilities:

$$P(A \cap B) = P(A) \cdot P(B) .$$

Analogously, one argues that under stochastic independence of $X$ and $Y$ it would also have to hold that

$$\frac{n_{jk}}{n} \approx \frac{n_{j.}}{n} \cdot \frac{n_{.k}}{n} ,$$

or, multiplied by $n$,

$$n_{jk} \approx \frac{n_{j.} \cdot n_{.k}}{n} .$$
If the differences $n_{jk} - n_{j.} n_{.k} / n$ are small for all $j$ and $k$, one can assume that $X$ and $Y$ are indeed stochastically independent.
If one sets, for the expected frequency in the presence of independence,

$$n^*_{jk} = \frac{n_{j.} \cdot n_{.k}}{n} ,$$

the test statistic for the independence test results from the above consideration as

$$\chi^2 = \sum_{j=1}^{m} \sum_{k=1}^{r} \frac{(n_{jk} - n^*_{jk})^2}{n^*_{jk}} .$$
If the expected frequencies are sufficiently large, the test statistic is approximately chi-square distributed with $(m-1)(r-1)$ degrees of freedom.
If the value of the test statistic is small, the hypothesis of independence is presumed to be compatible with the data; $H_0$ is rejected for large values, so the rejection region for $H_0$ is on the right.
At a significance level $\alpha$, $H_0$ is rejected if the value of the test statistic exceeds the $(1-\alpha)$-quantile of the chi-square distribution with $(m-1)(r-1)$ degrees of freedom.
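In practice, the entire procedure is available as a single library call. A sketch with a purely illustrative table (SciPy assumed):

```python
# Illustrative independence test on a made-up 2x2 frequency table.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[30, 10],
                     [20, 40]])

stat, p_value, dof, expected = chi2_contingency(observed, correction=False)
print(stat, p_value, dof)  # dof = (2-1)*(2-1) = 1; reject H0 if p_value < alpha
```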
Particularities
So that the test statistic can be regarded as approximately chi-square distributed, every expected frequency must have a certain minimum size. Various textbooks put this at 1 or 5. If an expected frequency is too small, several categories can be combined in order to reach the minimum size.
Alternatively, the sampling distribution of the test statistic can be approximated by bootstrapping, based on the given marginal distributions and the assumption that the characteristics are independent.
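One simple Monte Carlo variant of this idea redraws tables from the product of the estimated marginal distributions (a scheme that fixes the margins exactly would use permutations instead); the following Python sketch is our own illustration:

```python
# Simulating the distribution of the test statistic under independence.
import numpy as np

rng = np.random.default_rng(0)

def chi2_stat(table):
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    mask = expected > 0                      # guard against empty margins
    return ((table - expected)[mask] ** 2 / expected[mask]).sum()

def bootstrap_pvalue(table, reps=10_000):
    n = table.sum()
    p0 = np.outer(table.sum(axis=1) / n, table.sum(axis=0) / n).ravel()
    observed_stat = chi2_stat(table)
    sims = rng.multinomial(n, p0, size=reps).reshape(reps, *table.shape)
    sim_stats = np.array([chi2_stat(s) for s in sims])
    return (sim_stats >= observed_stat).mean()  # share of sims at least as extreme

print(bootstrap_pvalue(np.array([[30, 10], [20, 40]])))
```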
Example of the independence test
As part of quality management, the customers of a bank were surveyed, among other things, about their satisfaction with the handling of their transactions and about their overall satisfaction. The degree of satisfaction was expressed on the school grading scale.
The data yield a cross table of the bank customers' overall satisfaction versus their satisfaction with the handling of transactions. It turned out that some of the expected frequencies were too small.
Reducing the categories to three by combining grades 3–6 into a new overall grade 3 yielded methodologically valid results.
The following table contains the expected frequencies $n^*_{jk}$, which are calculated as

$$n^*_{jk} = \frac{n_{j.} \cdot n_{.k}}{n} :$$
| Feature $X$ \ Feature $Y$ | 1 | 2 | 3 | Σ |
|---|---|---|---|---|
| 1 | 44.35 | 44.84 | 12.81 | 102 |
| 2 | 156.09 | 157.82 | 45.09 | 359 |
| 3 | 69.57 | 70.34 | 20.10 | 160 |
| Σ | 270 | 273 | 78 | 621 |
The test statistic is then determined as described above. For $\alpha = 0.05$, the critical value is $\chi^2_{0.95;\,4} = 9.49$ (with $(3-1)(3-1) = 4$ degrees of freedom). Since the value of the test statistic computed from the data exceeds this critical value, the hypothesis $H_0$ is rejected; it is therefore assumed that satisfaction with the handling of transactions and overall satisfaction are associated.
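The expected frequencies in the table can be retraced from the marginal sums alone; a small Python sketch (variable names are our own):

```python
# Expected frequencies under independence from the marginal sums above.
import numpy as np

row_sums = np.array([102, 359, 160])   # n_j.
col_sums = np.array([270, 273, 78])    # n_.k
n = row_sums.sum()                     # 621

expected = np.outer(row_sums, col_sums) / n
print(np.round(expected, 2))           # first row: 44.35, 44.84, 12.81
```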
Homogeneity test
With the chi-square homogeneity test, the associated sample distributions can be used to check whether $k$ (independent) samples of a discrete characteristic with sample sizes $n_1, \dots, n_k$ come from identically distributed (i.e. homogeneous) populations. It thus helps to decide whether several samples come from the same population or distribution, or whether a characteristic is distributed in the same way in different populations (e.g. men and women). Like the other chi-square tests, it can be used at any scale level.
The hypotheses are:
- $H_0$: The $k$ independent samples are identically distributed.
- $H_1$: At least two of the samples are distributed differently.

If the distribution function of the $i$-th population is denoted by $F_i$, the hypotheses can also be formulated as follows:

- $H_0$: $F_1 = F_2 = \dots = F_k$
- $H_1$: $F_i \ne F_j$ for at least one pair $i \ne j$
Method
Let the random variable under examination (the characteristic), e.g. the answer to the "Sunday question", be categorical; that is, there are $c$ feature categories (the characteristic has $c$ possible values), e.g. SPD, CDU, B90/Greens, FDP, Die Linke and others (i.e. $c = 6$). The $k$ samples can be, for example, the survey results of different opinion research institutes. It could then be of interest to check whether the survey results differ significantly.
The observed frequencies $n_{ij}$ per sample $i$ (survey) and feature category $j$ (here: party) are entered in a corresponding $k \times c$ cross table (here 3 × 3):

| Sample \ Feature category | Category 1 | Category 2 | Category 3 | Total |
|---|---|---|---|---|
| Sample 1 | $n_{11}$ | $n_{12}$ | $n_{13}$ | $n_{1.}$ |
| Sample 2 | $n_{21}$ | $n_{22}$ | $n_{23}$ | $n_{2.}$ |
| Sample 3 | $n_{31}$ | $n_{32}$ | $n_{33}$ | $n_{3.}$ |
| Total | $n_{.1}$ | $n_{.2}$ | $n_{.3}$ | $n$ |
The deviations between the observed (empirical) frequency distributions of the samples across the categories of the characteristic are now examined: the observed cell frequencies are compared with the frequencies that would be expected if the null hypothesis were valid.
The cell frequencies expected under the null hypothesis of a homogeneous population are determined from the marginal distributions:

$$e_{ij} = \frac{n_{i.} \cdot n_{.j}}{n} ,$$

where $e_{ij}$ denotes the expected number of observations (absolute frequency) from sample $i$ in category $j$.
On the basis of the values calculated in this way, the following approximately chi-square distributed test statistic is computed:

$$\chi^2 = \sum_{i=1}^{k} \sum_{j=1}^{c} \frac{(n_{ij} - e_{ij})^2}{e_{ij}}$$
In order to arrive at a test decision, the obtained value of the test statistic is compared with the associated critical value, i.e. with the quantile of the chi-square distribution corresponding to the number of degrees of freedom $(k-1)(c-1)$ and the significance level $\alpha$ (alternatively, the p-value can be determined). If the deviations between at least two sample distributions are significant, the null hypothesis of homogeneity is rejected, i.e. $H_0$ is rejected if

$$\chi^2 > \chi^2_{1-\alpha;\,(k-1)(c-1)} .$$

The rejection region for $H_0$ lies to the right of the critical value.
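A sketch of the homogeneity test for the polling scenario above; the counts are invented, and SciPy's contingency-table routine computes exactly this statistic:

```python
# Homogeneity test: three invented polls over three party groups.
import numpy as np
from scipy.stats import chi2_contingency

polls = np.array([[520, 310, 170],   # institute 1
                  [480, 290, 230],   # institute 2
                  [505, 305, 190]])  # institute 3

stat, p_value, dof, expected = chi2_contingency(polls)
print(stat, p_value, dof)            # dof = (3-1)*(3-1) = 4
```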
Conditions of use
So that the test statistic can be regarded as approximately chi-square distributed, the following approximation conditions must hold:

- "large" sample size (commonly $n \ge 30$)
- $e_{ij} \ge 1$ for all $i, j$
- at least 80 % of the $e_{ij} \ge 5$
- Rinne (2003) and Voss (2000) additionally call for cell frequencies $n_{ij} \ge 5$
If some of the expected frequencies are too small, several classes or feature categories must be combined in order to comply with the approximation conditions.
If the random variable under examination has a large number of (possible) values, e.g. because it is metrically continuous, it is expedient to group them into classes (= categories) so that the now classified random variable can be examined with the chi-square test. It should be noted, however, that the way the observations are classified can influence the test result.
Comparison to independence and distribution test
The homogeneity test can also be interpreted as an independence test if the samples are viewed as values of a second characteristic. It can likewise be viewed as a form of distribution test that compares not one empirical and one theoretical distribution but several empirical distributions with one another. However, the independence test and the distribution test are one-sample problems, while the homogeneity test is a multi-sample problem: in the independence test, a single sample is taken with regard to two characteristics; in the distribution test, a sample is taken with regard to one characteristic; in the homogeneity test, several samples are taken with regard to one characteristic.
Four-field test
The chi-square four-field test is a statistical test. It is used to check whether two dichotomous characteristics are stochastically independent of one another, or equivalently whether the distribution of a dichotomous characteristic is identical in two groups.
Method
The four-field test is based on a 2 × 2 contingency table that visualizes the (bivariate) frequency distribution of two characteristics:

| Feature Y \ Feature X | Expression 1 | Expression 2 | Row total |
|---|---|---|---|
| Expression 1 | a | b | a + b |
| Expression 2 | c | d | c + d |
| Column total | a + c | b + d | n = a + b + c + d |
According to a rule of thumb, the expected frequency in each of the four fields must be at least 5. An expected frequency is calculated as row total × column total / total number. If an expected frequency is less than 5, statisticians recommend Fisher's exact test.
Test statistic
In order to test the null hypothesis that the two characteristics are stochastically independent, the following test statistic is first calculated for a two-sided test:

$$\chi^2 = \frac{n \, (ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)} .$$
The test statistic is approximately chi-square distributed with one degree of freedom. It should only be used if each of the two samples contains at least six observations.
Test decision
If the test value obtained from the sample is smaller than the critical value belonging to the chosen significance level (i.e. the corresponding quantile of the chi-square distribution), the test could not demonstrate a significant difference. If, on the other hand, the test value is greater than or equal to the critical value, there is a significant difference between the samples.
The probability of obtaining the calculated (or an even larger) test value purely by chance due to sampling (the p-value) can be approximated with a rule-of-thumb formula; the approximation to the actual p-value is good if the test value lies between 2.0 and 8.0.
Examples and applications
When asked whether a medical measure is effective or not, the four-field test is very helpful because it focuses on the main decision criterion.
Example 1
50 randomly selected women and 50 randomly selected men are asked whether they smoke or not.
The result is:
- Women: 25 smokers, 25 non-smokers
- Men: 30 smokers, 20 non-smokers
If a four-field test is carried out on the basis of this survey, the formula shown above yields a test value of approx. 1.01. Since this value is smaller than the critical value 3.841, the null hypothesis that smoking behavior is independent of gender cannot be rejected. The proportions of smokers and non-smokers do not differ significantly between the sexes.
Example 2
500 randomly selected women and 500 randomly selected men are asked whether they smoke or not.
The result is:
- Women: 250 non-smokers, 250 smokers
- Men: 300 non-smokers, 200 smokers
The four-field test yields a test value of approx. 10.1, which is greater than the critical value 3.841. The null hypothesis that the characteristics "smoking behavior" and "gender" are stochastically independent of one another can therefore be rejected at a significance level of 0.05.
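Both examples can be retraced with the four-field formula above; a minimal Python sketch:

```python
# Four-field test statistic: n(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d)).
def four_field_chi2(a, b, c, d):
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

print(four_field_chi2(25, 25, 30, 20))      # Example 1: approx. 1.01 < 3.841
print(four_field_chi2(250, 250, 300, 200))  # Example 2: approx. 10.10 > 3.841
```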
Table of the quantiles of the chi-square distribution
The table shows the most important quantiles of the chi-square distribution. The degrees of freedom are entered in the left column, the levels $1 - \alpha$ in the top row. Reading example: the quantile of the chi-square distribution with 2 degrees of freedom at a level of $\alpha$ = 1 % is 9.21.
| Degrees of freedom \ $1 - \alpha$ | 0.900 | 0.950 | 0.975 | 0.990 | 0.995 | 0.999 |
|---|---|---|---|---|---|---|
| 1 | 2.71 | 3.84 | 5.02 | 6.63 | 7.88 | 10.83 |
| 2 | 4.61 | 5.99 | 7.38 | 9.21 | 10.60 | 13.82 |
| 3 | 6.25 | 7.81 | 9.35 | 11.34 | 12.84 | 16.27 |
| 4 | 7.78 | 9.49 | 11.14 | 13.28 | 14.86 | 18.47 |
| 5 | 9.24 | 11.07 | 12.83 | 15.09 | 16.75 | 20.52 |
| 6 | 10.64 | 12.59 | 14.45 | 16.81 | 18.55 | 22.46 |
| 7 | 12.02 | 14.07 | 16.01 | 18.48 | 20.28 | 24.32 |
| 8 | 13.36 | 15.51 | 17.53 | 20.09 | 21.95 | 26.12 |
| 9 | 14.68 | 16.92 | 19.02 | 21.67 | 23.59 | 27.88 |
| 10 | 15.99 | 18.31 | 20.48 | 23.21 | 25.19 | 29.59 |
| 11 | 17.28 | 19.68 | 21.92 | 24.72 | 26.76 | 31.26 |
| 12 | 18.55 | 21.03 | 23.34 | 26.22 | 28.30 | 32.91 |
| 13 | 19.81 | 22.36 | 24.74 | 27.69 | 29.82 | 34.53 |
| 14 | 21.06 | 23.68 | 26.12 | 29.14 | 31.32 | 36.12 |
| 15 | 22.31 | 25.00 | 27.49 | 30.58 | 32.80 | 37.70 |
| 16 | 23.54 | 26.30 | 28.85 | 32.00 | 34.27 | 39.25 |
| 17 | 24.77 | 27.59 | 30.19 | 33.41 | 35.72 | 40.79 |
| 18 | 25.99 | 28.87 | 31.53 | 34.81 | 37.16 | 42.31 |
| 19 | 27.20 | 30.14 | 32.85 | 36.19 | 38.58 | 43.82 |
| 20 | 28.41 | 31.41 | 34.17 | 37.57 | 40.00 | 45.31 |
| 21 | 29.62 | 32.67 | 35.48 | 38.93 | 41.40 | 46.80 |
| 22 | 30.81 | 33.92 | 36.78 | 40.29 | 42.80 | 48.27 |
| 23 | 32.01 | 35.17 | 38.08 | 41.64 | 44.18 | 49.73 |
| 24 | 33.20 | 36.42 | 39.36 | 42.98 | 45.56 | 51.18 |
| 25 | 34.38 | 37.65 | 40.65 | 44.31 | 46.93 | 52.62 |
| 26 | 35.56 | 38.89 | 41.92 | 45.64 | 48.29 | 54.05 |
| 27 | 36.74 | 40.11 | 43.19 | 46.96 | 49.64 | 55.48 |
| 28 | 37.92 | 41.34 | 44.46 | 48.28 | 50.99 | 56.89 |
| 29 | 39.09 | 42.56 | 45.72 | 49.59 | 52.34 | 58.30 |
| 30 | 40.26 | 43.77 | 46.98 | 50.89 | 53.67 | 59.70 |
| 40 | 51.81 | 55.76 | 59.34 | 63.69 | 66.77 | 73.40 |
| 50 | 63.17 | 67.50 | 71.42 | 76.15 | 79.49 | 86.66 |
| 60 | 74.40 | 79.08 | 83.30 | 88.38 | 91.95 | 99.61 |
| 70 | 85.53 | 90.53 | 95.02 | 100.43 | 104.21 | 112.32 |
| 80 | 96.58 | 101.88 | 106.63 | 112.33 | 116.32 | 124.84 |
| 90 | 107.57 | 113.15 | 118.14 | 124.12 | 128.30 | 137.21 |
| 100 | 118.50 | 124.34 | 129.56 | 135.81 | 140.17 | 149.45 |
| 200 | 226.02 | 233.99 | 241.06 | 249.45 | 255.26 | 267.54 |
| 300 | 331.79 | 341.40 | 349.87 | 359.91 | 366.84 | 381.43 |
| 400 | 436.65 | 447.63 | 457.31 | 468.72 | 476.61 | 493.13 |
| 500 | 540.93 | 553.13 | 563.85 | 576.49 | 585.21 | 603.45 |
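Such quantiles no longer have to be read from a printed table; a short sketch of how, for instance, SciPy computes them directly:

```python
# Critical values (quantiles) of the chi-square distribution via SciPy.
from scipy.stats import chi2

print(chi2.ppf(0.99, 2))  # 9.21  -> the reading example above
print(chi2.ppf(0.95, 6))  # 12.59 -> used in the distribution-test example
```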
Alternatives to the chi-square test
The chi-square test is still widely used, although better alternatives are available today. Its test statistic is particularly problematic with small expected frequencies per cell (rule of thumb: less than 5), while the chi-square test remains reliable for large samples.
The original advantage of the chi-square test was that its test statistic can also be calculated by hand, especially for smaller tables, because the most difficult computational step is squaring, whereas for the more precise G-test the most difficult step is taking logarithms. The G-test statistic is likewise approximately chi-square distributed and remains robust when the contingency table contains rare events.
In computational linguistics, the G-test has established itself, since the frequency analysis of rarely occurring words and text modules is a typical problem there.
Since today's computers offer enough computing power, both tests can be replaced by Fisher's exact test .
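Both alternatives amount to a single call in SciPy, shown here as a sketch on the 2 × 2 table from Example 1 above:

```python
# G-test and Fisher's exact test on the 2x2 table from Example 1.
from scipy.stats import chi2_contingency, fisher_exact

table = [[25, 25], [30, 20]]  # women / men x smokers / non-smokers

g_stat, g_p, dof, _ = chi2_contingency(table, lambda_="log-likelihood")
odds_ratio, fisher_p = fisher_exact(table)
print(g_p, fisher_p)
```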
Web links
- Calculator for simple and cumulative probabilities as well as quantiles and confidence intervals of the chi-square distribution
- Article on the problem of the one- and two-sided chi-square significance test
- Exact p-values (English)
Individual evidence
- ^ Karl Pearson: On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. In: The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science. Vol. 50, No. 5, 1900, pp. 157–175, doi:10.1080/14786440009463897.
- ↑ Decision of the FG Münster of November 10, 2003 (Ref.: 6 V 4562/03 E, U) (PDF)
- ↑ Horst Rinne: Taschenbuch der Statistik. 3rd edition. Verlag Harri Deutsch, 2003, pp. 562–563.
- ↑ Bernd Rönz, Hans G. Strohe: Lexikon Statistik. Gabler Verlag, 1994, p. 69.
- ↑ see Political Parties in Germany
- ↑ a b Horst Rinne: Taschenbuch der Statistik. 3rd edition. Verlag Harri Deutsch, 2003, p. 562.
- ↑ a b c Werner Voss: Taschenbuch der Statistik. 1st edition. Fachbuchverlag Leipzig, 2000, p. 447.
- ^ Jürgen Bortz, Nicola Döring: Forschungsmethoden und Evaluation für Human- und Sozialwissenschaftler. 4th edition. Springer, 2006, p. 103.
- ^ Hans-Hermann Dubben, Hans-Peter Beck-Bornholdt: Der Hund, der Eier legt. 4th edition. Rowohlt, 2009, p. 293.