Goldfeld-Quandt test

from Wikipedia, the free encyclopedia

The Goldfeld-Quandt test is a statistical test for heteroscedasticity (non-constant variance of the error terms) in regression analysis. It is based on comparing the residual variances of two subsets of the sample. It is named after Stephen Goldfeld and Richard E. Quandt.

Procedure

Figure: Procedure for the Goldfeld-Quandt test

The sample is divided into two subsets according to the values of one explanatory variable (see figure). The two subsets must be disjoint, so that no observation occurs in both. However, the two subsets together do not have to cover the entire sample; in the figure, for example, the middle part of the observations belongs to neither subset (gray). A separate regression is estimated for each subset, and the sample variance of the residuals $s_i^2$, $i = 1, 2$, is computed for each (labeled so that $s_1^2 \geq s_2^2$). The test value $F = s_1^2 / s_2^2$ is then compared with a critical value from the F-distribution. The example in the figure shows heteroscedasticity, because the regression for one subset has a high residual variance (red), while the regression for the other has a low residual variance (blue).
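The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions that are not part of the article: a regression with a single explanatory variable plus intercept (so two estimated parameters per subset), and a hypothetical `drop_fraction` parameter that controls how many middle observations are left out. The function names are made up for this sketch.

```python
import random


def ols_residual_variance(x, y):
    """Fit y = b0 + b1*x by least squares and return RSS / (n - 2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return rss / (n - 2)  # two estimated parameters: intercept and slope


def goldfeld_quandt(x, y, drop_fraction=0.2):
    """Return (F, df1, df2) with the larger residual variance in the numerator."""
    pairs = sorted(zip(x, y))            # order observations by x
    n = len(pairs)
    n_drop = int(n * drop_fraction)      # middle observations in neither subset
    n_low = (n - n_drop) // 2
    low = pairs[:n_low]
    high = pairs[n_low + n_drop:]
    parts = []
    for part in (low, high):
        xs = [p[0] for p in part]
        ys = [p[1] for p in part]
        parts.append((ols_residual_variance(xs, ys), len(part) - 2))
    # Put the subset with the larger residual variance first.
    (s2_a, df_a), (s2_b, df_b) = sorted(parts, reverse=True)
    return s2_a / s2_b, df_a, df_b


# Synthetic data whose error standard deviation grows with x (assumed setup).
random.seed(1)
x_demo = [random.uniform(1, 10) for _ in range(200)]
y_demo = [2 + 3 * v + random.gauss(0, v) for v in x_demo]
f_demo, df1_demo, df2_demo = goldfeld_quandt(x_demo, y_demo)
print(f_demo, df1_demo, df2_demo)
```

The returned value would then be compared with a critical value of the F-distribution with `df1` and `df2` degrees of freedom.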

Mathematical formulation

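Requirements

The classical linear regression model $y_i = x_i^\top \beta + \varepsilon_i$, or in matrix form $y = X\beta + \varepsilon$, is assumed, with independent and normally distributed error terms. The test is sensitive to violations of the normality assumption.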

Hypotheses and test statistics

The null and alternative hypotheses are

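$H_0\colon \sigma_1^2 = \sigma_2^2$ (homoscedasticity) vs. $H_1\colon \sigma_1^2 \neq \sigma_2^2$ (heteroscedasticity), where $\sigma_i^2$ denotes the error variance in the $i$th subset.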

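Under the null hypothesis, the test statistic is distributed as

$$F = \frac{s_1^2}{s_2^2} \sim F_{n_1 - k,\; n_2 - k},$$

with $n_i$ the number of observations in the $i$th subset, $k$ the number of estimated regression parameters, and

$$s_i^2 = \frac{1}{n_i - k} \sum_{j=1}^{n_i} \hat{\varepsilon}_{ij}^2,$$

where $\hat{\varepsilon}_{ij}$ is the $j$th residual of the regression fitted to the $i$th subset.

The null hypothesis (homoscedasticity) is rejected if the test value is greater than the critical value from the F-distribution with $n_1 - k$ and $n_2 - k$ degrees of freedom at a predefined significance level $\alpha$.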
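Comparing the test value with a critical value is equivalent to comparing the upper-tail probability (p-value) of the test value with the significance level. The following sketch computes that tail probability with the standard library only; it relies on the identity P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 f), where I is the regularized incomplete beta function, and integrates the beta density with Simpson's rule. The function name `f_upper_tail` is hypothetical, and the sketch assumes both degrees of freedom are greater than 2 so that the integrand stays bounded.

```python
import math


def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for an F(df1, df2) variable (df1, df2 > 2)."""
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_value)      # upper integration limit in (0, 1]
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def beta_density(t):
        # The density vanishes at both endpoints when a > 1 and b > 1.
        if t <= 0.0 or t >= 1.0:
            return 0.0
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta)

    # Composite Simpson's rule on [0, x]; steps must be even.
    h = x / steps
    total = beta_density(0.0) + beta_density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * beta_density(i * h)
    return total * h / 3.0


def reject_h0(f_value, df1, df2, alpha=0.05):
    """Reject homoscedasticity when the tail probability falls below alpha."""
    return f_upper_tail(f_value, df1, df2) < alpha
```

In practice a statistics library would normally supply this probability; the hand-written integration is only meant to make the decision rule concrete.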

Example

Variable  Meaning
medv      Median purchase price of a house in US$ 1000
lstat     Proportion of the lower-class population
rm        Average number of rooms
dis       Weighted distance to the five most important employment centers

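For the example, linear regressions were fitted to the Boston Housing data set. The variables listed in the table were recorded for each of the 506 districts, and a multiple linear regression was carried out:

$$\text{medv} = \beta_0 + \beta_1\,\text{lstat} + \beta_2\,\text{rm} + \beta_3\,\text{dis} + \varepsilon.$$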

If the residuals are plotted against the variable dis, the variance of the residuals can be seen to decrease as the values of dis increase. The data are therefore divided into two parts, a red part and a blue part. A separate regression model is fitted to each part and the sum of squared residuals is calculated for each.

The test value is the ratio of the residual variance of the red part to that of the blue part, and the critical value for the chosen significance level is taken from the F-distribution with 108 and 45 degrees of freedom. Since the test value is greater than the critical value, the null hypothesis of homoscedasticity must be rejected.
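The step from the two sums of squared residuals to the test value can be written out as follows. This is a sketch only: the function names are hypothetical, and the sums of squared residuals used in the call are placeholder numbers chosen for easy arithmetic, not the values from the Boston Housing example. Only the degrees of freedom (108 and 45) are taken from the text.

```python
def gq_test_value(rss_high, df_high, rss_low, df_low):
    """Ratio of residual variances, each computed as RSS / degrees of freedom."""
    s2_high = rss_high / df_high   # residual variance of the high-variance part
    s2_low = rss_low / df_low      # residual variance of the low-variance part
    return s2_high / s2_low


def reject_homoscedasticity(test_value, critical_value):
    """Decision rule: reject H0 when the test value exceeds the critical value."""
    return test_value > critical_value


# Placeholder RSS values (NOT from the example); degrees of freedom from the text.
value = gq_test_value(rss_high=216.0, df_high=108, rss_low=45.0, df_low=45)
print(value)  # 2.0
```

The critical value passed to `reject_homoscedasticity` has to be looked up for the chosen significance level and the F-distribution with 108 and 45 degrees of freedom.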

Literature

  • William E. Griffiths, R. Carter Hill, George G. Judge: Learning and Practicing Econometrics. 1st edition. 1993, ISBN 0-471-51364-4, p. 494 ff.

References

  1. Stephen M. Goldfeld, R. E. Quandt: Some Tests for Homoscedasticity. In: Journal of the American Statistical Association. 60, No. 310, June 1965, pp. 539-547. JSTOR 2282689. doi:10.1080/01621459.1965.10480811.