Data quality (GIS)

From Wikipedia, the free encyclopedia

In geographic information systems (GIS), data quality describes how well a dataset is suited to its intended use. It is assessed against the following criteria:

  • Completeness
  • Positional accuracy
  • Attribute accuracy
  • Temporal accuracy
  • Resolution / scale
  • Logical consistency
  • Purpose for which the data was collected (origin of the data)

Depending on the type of data, information on data quality can be obtained either from the associated metadata or from the applicable standard.

Completeness

Completeness (or reliability) indicates the probability that data is available for a given event and location. This value can either be estimated from the way the data was collected or calculated from samples.

Examples of reliability:

  • The reliability of house numbers that are managed in a GIS can be estimated by taking random samples and using statistical methods to calculate the probability and the confidence interval.
  • The reliability of soil appraisal data can be assessed by evaluating the collection process (how the data was recorded, what effort was made to obtain comprehensive coverage, how old the data is, etc.).
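
The sampling approach for house numbers can be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes a simple random sample in which each checked address is either present in the GIS or missing, and it uses the Wilson score interval as one common choice of confidence interval. The function name `completeness_estimate` and the sample figures are hypothetical.

```python
import math

def completeness_estimate(found, sampled, z=1.96):
    """Estimate completeness from a random sample, with a Wilson score interval.

    found   -- number of sampled real-world objects that are present in the GIS
    sampled -- total number of objects checked in the sample
    z       -- normal quantile (1.96 corresponds to roughly 95 % confidence)
    """
    if sampled <= 0 or not 0 <= found <= sampled:
        raise ValueError("need sampled > 0 and 0 <= found <= sampled")
    p = found / sampled  # observed share of objects that are present
    denom = 1 + z**2 / sampled
    centre = (p + z**2 / (2 * sampled)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / sampled + z**2 / (4 * sampled**2)
    ) / denom
    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)

# Hypothetical sample: 200 addresses checked in the field, 188 found in the GIS.
p, low, high = completeness_estimate(188, 200)
print(f"estimated completeness: {p:.3f} (95 % interval {low:.3f} to {high:.3f})")
```

A larger sample narrows the interval, so the sample size can be chosen according to how precisely the completeness needs to be known.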

Positional accuracy

Accuracy indicates the deviation of the data from the true value (or, if this is not known, from the expected value). The deviation is usually stated in a unit of measurement and determined by a statistical assessment that takes the method of data collection into account.

For positional accuracy, a distinction can be made between absolute and relative accuracy. Absolute accuracy describes the deviation from the actual coordinates of an object, while relative accuracy indicates how accurately the directions and distances between two points were mapped.

Example of accuracy:
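
As a minimal sketch of the difference between absolute and relative accuracy, the following code compares mapped points with reference positions. It assumes planar coordinates in metres and uses the root-mean-square error as one common summary of absolute deviation. Only distances are checked for relative accuracy, not directions. The function names and all coordinates are hypothetical.

```python
import math

def absolute_rmse(mapped, reference):
    """Root-mean-square distance between mapped points and their reference positions."""
    squared = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(mapped, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))

def relative_distance_error(mapped, reference, i, j):
    """Mapped distance between points i and j minus their reference distance."""
    return math.dist(mapped[i], mapped[j]) - math.dist(reference[i], reference[j])

# Hypothetical reference positions in metres (e.g. from a precise survey).
reference = [(100.0, 200.0), (150.0, 200.0), (150.0, 260.0)]

# Hypothetical mapped positions: every point is shifted 3 m east and 4 m north.
mapped = [(x + 3.0, y + 4.0) for x, y in reference]

abs_err = absolute_rmse(mapped, reference)
rel_err = relative_distance_error(mapped, reference, 0, 1)
print(f"absolute RMSE: {abs_err:.2f} m, relative distance error: {rel_err:.2f} m")
```

Because every point is displaced by the same offset, the absolute error is 5 m while the distance between the first two points is unchanged. A dataset can therefore have good relative accuracy despite poor absolute accuracy.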

Attribute accuracy

This characteristic relates to inaccuracies in the properties (attributes) assigned to objects. Such errors are caused by:

  • Faulty source files
  • Misinterpretation (human error)
  • Database errors