Geostatistics
The term geostatistics refers to a set of stochastic methods for characterizing and estimating spatially correlated, georeferenced data, for example surface temperatures at different points in a lake. The aim is to use the data measured at individual points as the starting point for spatial interpolation, i.e. to derive estimated values at arbitrarily many unsampled locations from a finite number of measured values, with the estimates as close as possible to the true values.
Because of spatial correlation, the estimated value of a physical quantity (such as the surface temperature) at an estimation location depends more strongly on the measured values at adjacent locations than on those at remote measurement locations. The neighboring measured values must therefore be given greater weight in the estimation. A distinction is made between two approaches, non-statistical and statistical interpolation, the latter being based on a geostatistical model (often a special random field).
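As an illustration of the non-statistical approach, the following minimal sketch uses inverse distance weighting, one common deterministic interpolator that is not named in the text above and is chosen here only as an example. The function name idw_estimate, the power parameter and the lake temperatures are hypothetical.

```python
import numpy as np

def idw_estimate(coords, values, target, power=2.0):
    """Inverse distance weighting: a simple non-statistical interpolator.

    Weights fall off as 1 / distance**power; no model of the spatial
    correlation is fitted to the data.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    dists = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    # If the target coincides with a measurement location, return that value.
    hit = dists == 0.0
    if hit.any():
        return float(values[hit][0])
    weights = 1.0 / dists**power
    return float(np.sum(weights * values) / np.sum(weights))

# Hypothetical lake-surface temperatures (deg C) at four points (x, y in m).
coords = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
temps = [18.0, 19.0, 17.5, 20.0]
estimate = idw_estimate(coords, temps, target=(10.0, 10.0))
```

The estimate at (10, 10) is dominated by the nearby measurement at (0, 0). The weighting rule is fixed in advance rather than derived from the data, which is the main difference from the statistical approach.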
To find out up to what maximum distance (range) and to what extent measured values depend on neighboring or more distant measured values, so-called experimental semivariograms are computed and modeled: for all distances (as x values) that separate two measuring locations of the data set, half the mean squared difference between the respective measured values is plotted (as y values). The increasing dissimilarity with increasing distance is reflected in the increase in y values with increasing x values up to a certain limit value (the sill). This dependency is expressed with a model function, for example a spherical or exponential model.
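A minimal sketch of how an experimental semivariogram might be computed: all pairs of measuring locations are grouped into distance classes, and for each class half the mean squared difference of the measured values is taken. The function name experimental_semivariogram, the choice of bin edges and the transect data are assumptions made for illustration.

```python
import numpy as np

def experimental_semivariogram(coords, values, bin_edges):
    """Return mean lag distance and semivariance for each distance class."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # Indices of all unordered pairs of measuring locations.
    i, j = np.triu_indices(len(values), k=1)
    dists = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diffs = (values[i] - values[j]) ** 2
    lags, gammas = [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (dists >= lo) & (dists < hi)
        if in_bin.any():  # skip empty distance classes
            lags.append(dists[in_bin].mean())
            # Semivariance: half the mean squared difference in this class.
            gammas.append(0.5 * sq_diffs[in_bin].mean())
    return np.array(lags), np.array(gammas)

# Hypothetical temperatures (deg C) along a transect, one point every 10 m.
coords = [(x, 0.0) for x in (0.0, 10.0, 20.0, 30.0, 40.0, 50.0)]
temps = [18.0, 18.2, 18.5, 19.1, 19.0, 19.4]
lags, gammas = experimental_semivariogram(
    coords, temps, bin_edges=[5.0, 15.0, 25.0, 35.0, 45.0, 55.0]
)
```

Plotting gammas against lags gives the point cloud to which a model function is then fitted. With so few points the larger lags rest on very few pairs, so in practice they would be treated with caution.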
The function obtained from the analysis of the measured values is the basis for the subsequent interpolation of a distribution of estimated values in space, a process called kriging. The measured values enter the calculation of the estimated value with different weighting factors, which are derived from the modeled semivariogram and depend on their proximity to the estimation location (counterexample: with the arithmetic mean as estimator, all measured values receive the same weight).
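The weighting step can be sketched with ordinary kriging, one common kriging variant, under the assumption that a semivariogram model has already been fitted. The spherical model and its parameters (nugget, sill, range) as well as the function names below are illustrative assumptions, not values taken from any data set.

```python
import numpy as np

def spherical_model(h, nugget, sill, rng):
    """Spherical semivariogram model; sill is the total sill (incl. nugget)."""
    h = np.asarray(h, dtype=float)
    inside = nugget + (sill - nugget) * (1.5 * h / rng - 0.5 * (h / rng) ** 3)
    gamma = np.where(h < rng, inside, sill)
    return np.where(h == 0.0, 0.0, gamma)  # semivariance is zero at lag 0

def ordinary_kriging(coords, values, target, variogram):
    """Estimate the value at target; returns (estimate, weights)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Semivariances between all pairs of measuring locations.
    pair_d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(pair_d)
    A[n, n] = 0.0  # last row/column enforce that the weights sum to 1
    # Semivariances between each measuring location and the target.
    b = np.ones(n + 1)
    target_d = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    b[:n] = variogram(target_d)
    solution = np.linalg.solve(A, b)
    weights = solution[:n]  # solution[n] is the Lagrange multiplier
    return float(weights @ values), weights

# Hypothetical lake-surface temperatures (deg C) at four points (x, y in m).
coords = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
temps = [18.0, 19.0, 17.5, 20.0]
model = lambda h: spherical_model(h, nugget=0.0, sill=1.0, rng=150.0)
estimate, weights = ordinary_kriging(coords, temps, (10.0, 10.0), model)
```

In this sketch the weights sum to one, and the measurement closest to the estimation location receives the largest weight. Unlike the arithmetic mean, the weights follow from the semivariogram model and the geometry of the measuring locations.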
A prerequisite for interpolation is that the distribution of measured values in the investigation area is homogeneous (criterion of stationarity or homogeneity). An example of inhomogeneity is the aluminum content of rocks in a study area in which two completely different rock units lie side by side because of displacement along a fault and adjoin one another without a transition zone.
For the example of the surface temperature of a lake, the result of kriging would be a distribution of estimated values in the plane, which can be visualized, for example, as an isotherm map or as a surface relief ("flying carpet") with temperature on the height axis.
Standard literature
- H. Wackernagel: Multivariate Geostatistics. Springer, Berlin / Heidelberg / New York 1995, ISBN 3-540-60127-9.
- J.-P. Chilès, P. Delfiner: Geostatistics: Modeling Spatial Uncertainty. Wiley, New York 1999, ISBN 0-471-08315-1.
- N. Cressie: Statistics for Spatial Data. World Scientific, Singapore 2007.