Semantic integration

from Wikipedia, the free encyclopedia

Semantic integration refers to the resolution of semantic conflicts between heterogeneous data sources as part of information integration. Such conflicts arise from differences in the meaning of the terms and concepts used. Semantic integration resolves them by making these meanings explicit and unambiguous. In contrast, syntactic integration is limited to the structure of the data and leaves out context and meaning.

Semantic heterogeneity

Data are semantically heterogeneous when the meaning, interpretation, or intended use of their data models differ.

Example: product databases

The operator of an Internet comparison platform would like to add two new providers: a supplier of locking systems and a real estate platform. Both have records with the German designation "Schloss". In the context of locking systems this denotes a mechanical device (a lock), in the context of real estate a stately building (a castle or palace). The term is a homonym. Without a semantic distinction, the operator cannot tell the two kinds of records apart.
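One simple way to make the distinction explicit is to use the data source as context and map each source-specific term to a shared concept. The following is a minimal sketch under stated assumptions, not a description of any particular platform: it uses the German term "Schloss", which means both a lock and a castle, and the source names, concept identifiers, and the `resolve_concept` helper are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical mapping from (source, local term) to a shared concept identifier.
# In practice such a mapping would be maintained by hand or derived from an ontology.
CONCEPT_BY_SOURCE = {
    ("locking_systems", "Schloss"): "concept:Lock",
    ("real_estate", "Schloss"): "concept:Castle",
}


@dataclass(frozen=True)
class Record:
    source: str       # which provider the record came from
    designation: str  # the term used by that provider
    name: str         # free-text product name


def resolve_concept(record: Record) -> str:
    """Return the shared concept for a record, using its source as context."""
    key = (record.source, record.designation)
    if key not in CONCEPT_BY_SOURCE:
        raise KeyError(f"no concept mapping for {key}")
    return CONCEPT_BY_SOURCE[key]


records = [
    Record("locking_systems", "Schloss", "Cylinder lock, 30 mm"),
    Record("real_estate", "Schloss", "Baroque castle with park"),
]

# Same designation, different concepts once the source is taken into account.
concepts = [resolve_concept(r) for r in records]
```

Both records carry the same designation, yet they resolve to different concepts, so the platform can place them in separate categories.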

Example: calendar dates

The partners Martin and Sabrina have so far kept separate digital calendars and now want to merge them. They transfer the data from both calendars to a new, shared calendar, in which duplicates are detected and combined. However, the duplicate detection only works for syntactically identical calendar entries, for example two identical entries "06/20/2015: Wedding Ute". Entries that describe the same event but were not entered with the same syntax are not merged and still appear side by side in the new calendar. For example, the entry "06/20/2015: Ute's wedding" differs syntactically, but not semantically, from the entry mentioned above.
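The difference between the two kinds of duplicate detection can be sketched as follows. This is an illustration only: the `semantic_key` function is a hypothetical and deliberately crude normalization (lowercasing, ignoring word order and the possessive "s"), not real semantic matching, and the "Brunch" entry is made up for the example.

```python
import re


def semantic_key(entry: str) -> tuple[str, frozenset[str]]:
    """Reduce 'MM/DD/YYYY: title' to the date plus a set of normalized words."""
    date, _, title = entry.partition(":")
    words = re.findall(r"[a-z]+", title.lower())
    # Drop one-letter fragments such as the "s" left over from "Ute's".
    return date.strip(), frozenset(w for w in words if len(w) > 1)


def merge(entries, key):
    """Keep the first entry seen for each key, preserving insertion order."""
    seen = {}
    for entry in entries:
        seen.setdefault(key(entry), entry)
    return list(seen.values())


martin = ["06/20/2015: Wedding Ute", "06/21/2015: Brunch"]
sabrina = ["06/20/2015: Ute's wedding", "06/20/2015: Wedding Ute"]

# Syntactic merge: only byte-identical strings count as duplicates.
syntactic = merge(martin + sabrina, key=lambda e: e)

# Merge on the normalized key: both wedding entries collapse into one.
semantic = merge(martin + sabrina, key=semantic_key)
```

The syntactic merge keeps both wedding entries, while the normalized merge keeps one. A normalization like this would still miss synonyms or translations, which is where ontology-based approaches come in.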

Ontology-based semantic integration

Beyond a certain scope, complexity, and rate of change, manually creating mappings between concepts from different data sources is no longer practical. Ontology-based integration enables automatic or semi-automatic integration. Here, ontologies are used to resolve the semantic heterogeneity. In contrast to classic databases, which do not provide any information about the meaning of the stored data, ontologies contain formal specifications of the data as well as rules about relationships within the data. This is intended to enable computer programs to derive relationships between concepts automatically. Special description languages are used for the specification, such as RDF Schema or OWL.
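How a program might derive relationships from such a specification can be sketched in plain Python, without an RDF library. The triples below imitate the RDF and RDF Schema vocabulary (`rdf:type`, `rdfs:subClassOf`), but every `ex:` name and both helper functions are hypothetical, and the code covers only subclass reasoning, a small part of what RDF Schema or OWL reasoners do.

```python
# Triples in (subject, predicate, object) form. The "ex:" names are made up
# for illustration; the predicates mirror the RDF and RDF Schema vocabulary.
TRIPLES = {
    ("ex:Castle", "rdfs:subClassOf", "ex:Building"),
    ("ex:Building", "rdfs:subClassOf", "ex:RealEstate"),
    ("ex:PadLock", "rdfs:subClassOf", "ex:LockingDevice"),
    ("ex:item1", "rdf:type", "ex:Castle"),
    ("ex:item2", "rdf:type", "ex:PadLock"),
}


def superclasses(cls, triples):
    """All classes reachable from cls via rdfs:subClassOf (transitively)."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def inferred_types(instance, triples):
    """Declared types of an instance plus every superclass of those types."""
    direct = {o for s, p, o in triples if s == instance and p == "rdf:type"}
    result = set(direct)
    for cls in direct:
        result |= superclasses(cls, triples)
    return result


castle_types = inferred_types("ex:item1", TRIPLES)
lock_types = inferred_types("ex:item2", TRIPLES)
```

Although `ex:item1` is only declared to be a castle, the program can conclude that it is also a building and a piece of real estate, whereas `ex:item2` ends up in the locking-device hierarchy.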

Literature

  • Pellegrini, Tassilo; Blumauer, Andreas (eds.): Semantic Web. Wege zur vernetzten Wissensgesellschaft. Springer, Berlin 2006.
  • Hribernik, Kramer, Hans, & Thoben (2010). A Semantic Mediator for Data Integration in Autonomous Logistics Processes. In Enterprise Interoperability IV (pp. 157–167). Springer London
  • Franke et al. Semantic Data Integration Approach for the Vision of a Digital Factory. In: Enterprise Interoperability VII. Springer, Cham, 2016, pp. 77–86.