F-test

The F-test is a group of statistical tests whose test statistic follows an F-distribution under the null hypothesis. In the context of regression analysis, the F-test examines a combination of linear hypotheses on the coefficients. In the special case of analysis of variance, the F-test denotes a test that decides, with a certain degree of confidence, whether two samples from different, normally distributed populations differ significantly in their variances. It is therefore used, among other things, for general checking of differences between two statistical populations.

The test goes back to one of the most famous statisticians, Ronald Aylmer Fisher (1890–1962).

F-test for two samples

The F-test is a term from mathematical statistics; it describes a group of hypothesis tests with F-distributed test statistics. In the analysis of variance, the F-test denotes the test that checks whether the variances of two samples from different, normally distributed populations differ.

The F-test assumes two different normally distributed populations (groups) with the parameters $\mu_1$ and $\sigma_1^2$, and $\mu_2$ and $\sigma_2^2$, respectively. It is suspected that the variance in the second population (group) could be greater than that in the first. To check this, a random sample is drawn from each population, where the sample sizes $n_1$ and $n_2$ may differ. The sample variables $X_{1,1}, \dots, X_{1,n_1}$ of the first population and $X_{2,1}, \dots, X_{2,n_2}$ of the second population must be independent both within each group and from each other.

For the test of the null hypothesis $H_0\colon\, \sigma_2^2 = \sigma_1^2$ against the alternative hypothesis $H_1\colon\, \sigma_2^2 > \sigma_1^2$ the F-test is suitable; its test statistic is the quotient of the estimated variances of the two samples:

$$F_{\mathrm{sample}} = \frac{S_2^2}{S_1^2} = \frac{\dfrac{1}{n_2-1} \sum_{i=1}^{n_2} (X_{2,i} - \overline{X}_2)^2}{\dfrac{1}{n_1-1} \sum_{i=1}^{n_1} (X_{1,i} - \overline{X}_1)^2}.$$

Here $S_1^2$ and $S_2^2$ are the sample variances and $\overline{X}_1$ and $\overline{X}_2$ the sample means within the two groups.
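As a minimal numerical sketch (using NumPy; the sample data here are made up), the test statistic is simply the ratio of the two unbiased sample variances:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical samples from two normal populations; n1 and n2 may differ.
x1 = rng.normal(loc=0.0, scale=1.0, size=30)  # first population
x2 = rng.normal(loc=0.0, scale=1.5, size=40)  # second population

# Unbiased sample variances (ddof=1 gives the 1/(n-1) factor).
s1_sq = np.var(x1, ddof=1)
s2_sq = np.var(x2, ddof=1)

# Test statistic: ratio of the second sample variance to the first.
f_sample = s2_sq / s1_sq
```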

Under the null hypothesis, the test statistic $F_{\mathrm{sample}}$ is F-distributed with $n_2 - 1$ degrees of freedom in the numerator and $n_1 - 1$ in the denominator. The null hypothesis is rejected for values of the test statistic that are too large. For this, the critical value is determined or the p-value of the observed test value is calculated; the easiest way to do either is with the help of an F-value table.

The critical value K results from the condition:

$$P\left(F(n_2-1,\, n_1-1) \geq K \mid H_0\right) \leq \alpha_0,$$

where $\alpha_0$ is the desired level of significance.

The p-value is calculated using:

$$p = P\left(F(n_2-1,\, n_1-1) \geq f_{\mathrm{sample}} \mid H_0\right),$$

where $f_{\mathrm{sample}}$ is the value of the test statistic $F_{\mathrm{sample}}$ observed in the sample.

Once $K$ has been determined, $H_0$ is rejected if $f_{\mathrm{sample}} \geq K$. Once the p-value $p$ has been calculated, $H_0$ is rejected if $p \leq \alpha_0$.
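Both decision criteria can be sketched with SciPy's F distribution (`scipy.stats.f`); the samples here are invented for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha0 = 0.05  # chosen significance level

# Illustrative samples; the second group has the (suspected) larger variance.
x1 = rng.normal(0.0, 1.0, size=25)
x2 = rng.normal(0.0, 2.0, size=30)

f_sample = np.var(x2, ddof=1) / np.var(x1, ddof=1)
df_num, df_den = len(x2) - 1, len(x1) - 1

# Critical value K: the (1 - alpha0) quantile of F(df_num, df_den).
K = stats.f.ppf(1 - alpha0, df_num, df_den)

# p-value: upper-tail probability of the observed statistic.
p = stats.f.sf(f_sample, df_num, df_den)

# Both criteria yield the same decision.
reject_by_K = f_sample >= K
reject_by_p = p <= alpha0
```

Rejecting for $f_{\mathrm{sample}} \geq K$ and rejecting for $p \leq \alpha_0$ are equivalent, since the upper-tail probability is a decreasing function of the test statistic.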

The value 5% is often chosen for the significance level $\alpha_0$. However, this is just a common convention; see also the article Statistical significance. No direct conclusions about the probability of the validity of the alternative hypothesis can be drawn from the probability $P(F \mid H_0)$ obtained.

Example

A company wants to convert the manufacture of one of its products to a process that promises better quality. The new method would be more expensive, but it should have a smaller spread. As a test, 100 products manufactured using the new method B are compared with 120 products manufactured using the old method A. The B products have a variance of 80, the A products a variance of 95. The test is

$$H_0\colon\ \sigma_A = \sigma_B$$

against

$$H_1\colon\ \sigma_A > \sigma_B$$

The observed value of the test statistic is:

$$f_{\mathrm{sample}} = \frac{s_A^2}{s_B^2} = \frac{95}{80} = 1.1875$$

Under the null hypothesis, this F-value comes from an $F(119,\,99)$ distribution. The p-value of the sample result is therefore:

$$P(F(119,\,99) \geq 1.1875) \approx 0.189.$$
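This p-value can be checked numerically; a sketch with SciPy, where the survival function `sf` gives the upper-tail probability:

```python
from scipy import stats

# Sample variances and sizes from the example.
var_A, n_A = 95.0, 120  # old method A
var_B, n_B = 80.0, 100  # new method B

f_sample = var_A / var_B                    # = 1.1875
p = stats.f.sf(f_sample, n_A - 1, n_B - 1)  # P(F(119, 99) >= 1.1875)
```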

The null hypothesis therefore cannot be rejected, and production is not converted to the new process. The question remains whether this decision is justified: what if the new method actually produced a smaller variance, but the sample failed to detect it? Conversely, even if the null hypothesis had been rejected, i.e. a significant difference between the variances had been found, that difference could have been practically insignificant. The first question is whether the test would be able to detect such a difference at all. For this, consider the power of the test. The significance level $\alpha_0$ is also the minimum value of the power, so that alone does not lead any further. In practice, production would of course only be converted if a substantial improvement could be expected, e.g. a decrease in the standard deviation of 25%. How likely is it that the test would find such a difference? That is precisely the value of the power for $\sigma_B = 0.75\,\sigma_A$. The calculation first requires the critical value $f_{\mathrm{crit}}$. For this we assume $\alpha_0 = 5\,\%$ and read from a table:

$$f_{\mathrm{crit}} = 1.378.$$

The following applies:

$$P(F_{\mathrm{sample}} \geq 1.378 \mid H_0) = 0.05.$$

The desired value of the power is the probability of detecting the aforementioned decrease in the standard deviation, i.e.:

$$P(H_0 \ \text{rejected} \mid \sigma_B = \tfrac{3}{4}\sigma_A) = P(F_{\mathrm{sample}} \geq f_{\mathrm{crit}} \mid \sigma_B = \tfrac{3}{4}\sigma_A)$$
$$= P\left(\left.\frac{S_A^2}{S_B^2} \geq f_{\mathrm{crit}} \,\right|\, \frac{\sigma_B}{\sigma_A} = \tfrac{3}{4}\right) = P\left(\frac{S_A^2/\sigma_A^2}{S_B^2/\sigma_B^2} \geq \left(\tfrac{3}{4}\right)^2 f_{\mathrm{crit}}\right)$$
$$= P\left(F(119,\,99) \geq 0.775\right) = 0.91.$$

This means: if the standard deviation decreases by 25% or more, the test detects this in at least 91% of cases.
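The critical value and the power computed above can be reproduced with SciPy (a sketch; `ppf` is the quantile function, `sf` the upper-tail probability):

```python
from scipy import stats

n_A, n_B = 120, 100
alpha0 = 0.05

# Critical value for the one-sided test at the 5% level.
f_crit = stats.f.ppf(1 - alpha0, n_A - 1, n_B - 1)  # about 1.378

# Power against sigma_B = 0.75 * sigma_A: rescale the threshold by (3/4)^2.
power = stats.f.sf(0.75**2 * f_crit, n_A - 1, n_B - 1)  # about 0.91
```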

F-test for multiple sample comparisons

One-way analysis of variance is also based on the F-test. Here the treatment sum of squares and the residual sum of squares are compared.
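This comparison is implemented, for example, in SciPy's `f_oneway`; a sketch with invented treatment groups:

```python
from scipy import stats

# Three hypothetical treatment groups.
g1 = [4.2, 3.9, 4.5, 4.1, 4.3]
g2 = [4.8, 5.1, 4.9, 5.3, 5.0]
g3 = [4.0, 4.4, 4.2, 3.8, 4.1]

# f_oneway compares the treatment sum of squares (between groups)
# to the residual sum of squares (within groups).
stat, p = stats.f_oneway(g1, g2, g3)
```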

F-test for the overall significance of a model

The global F-test (also known as the overall F-test or the F-test for the overall significance of a model) checks whether at least one explanatory variable contributes explanatory power to the model, i.e. whether the model as a whole is significant.
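As a sketch of what this test computes (plain NumPy/SciPy with made-up data; the formula $F = (R^2/k) \,/\, ((1-R^2)/(n-k-1))$ assumes an intercept plus $k$ regressors):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Made-up data: n observations, k explanatory variables.
n, k = 50, 2
X = rng.normal(size=(n, k))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=n)  # only the first regressor matters

# Ordinary least squares with an intercept column.
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta

# Coefficient of determination and the global F statistic.
ss_res = resid @ resid
ss_tot = ((y - y.mean()) ** 2).sum()
r2 = 1.0 - ss_res / ss_tot
F = (r2 / k) / ((1.0 - r2) / (n - k - 1))
p = stats.f.sf(F, k, n - k - 1)  # small p: the model as a whole is significant
```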