Semantic gap

from Wikipedia, the free encyclopedia

The semantic gap describes the semantic, i.e. meaning-related, difference between two descriptions of an object that arises when different forms of representation (languages) are chosen. The term is used in computer science and becomes apparent wherever a part of real life has to be translated into a formal, machine-processable representation.

More precisely, the term describes the difference between the formulation of contextual knowledge in a powerful language (e.g. natural language) and its formal, automatically reproducible representation in a less powerful formal language (e.g. a programming language). Natural language can express relationships that cannot be evaluated in a formal language. For this reason, the difference in expressiveness cannot itself be described formally.

The Church-Turing thesis states that a machine can carry out precisely those formal operations that a calculating human can perform. However, such a formal set of rules does not guarantee that the operations needed to carry out a calculation correctly can be selected. If the underlying task is not fully computable, a formal procedure delivers no result or only an incomplete result, or the application of the rules does not terminate. A human, in contrast, can formulate and recognize such tasks, for example the halting problem.
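The outcomes described above can be pictured with a small sketch. The Collatz iteration is used here purely as an illustration of our own choosing (it is not mentioned in the sources of this article); whether it terminates for every start value is an open question, so a practical program has to impose a step budget and accept that "no answer" is a possible result. The names `collatz_step`, `steps_to_one` and `max_steps` are hypothetical.

```python
from typing import Optional


def collatz_step(n: int) -> int:
    """One application of the rule: halve even numbers, map odd n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def steps_to_one(start: int, max_steps: int) -> Optional[int]:
    """Count rule applications until 1 is reached, giving up after max_steps.

    Returning None means "no answer within the budget"; it does not prove
    that the rule application would never terminate.
    """
    n = start
    for steps in range(max_steps + 1):
        if n == 1:
            return steps
        n = collatz_step(n)
    return None


if __name__ == "__main__":
    print(steps_to_one(6, 100))   # reaches 1 within the budget
    print(steps_to_one(27, 50))   # budget too small: prints None
```

The program itself cannot tell whether `None` means "needs more steps" or "never stops"; drawing that distinction is left to the human reading the result.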

This discrepancy in knowledge modeling results from the context-dependent, undecidable ambiguity of spoken language, which in the Chomsky hierarchy is called extended context. Practical programs that automatically reproduce knowledge, on the other hand, depend on unambiguity and decidability. For this reason, the semantic gap can probably never be closed completely with the resources currently available. Instead, an abstraction from the elementary low-level information and tools to the high-level expert knowledge of the application context must be developed for each application. This corresponds to programming and parameterizing an algorithm.

Formal languages in practice

In practice, applications are formalized with programming languages. The basis of the current von Neumann architecture is Boolean algebra, in which every operation our computers can perform is expressed. In addition, there are mechanisms for storing the binary data and for defining the processing sequence, which corresponds to a Turing machine. This lowest level is determined by what is currently technically feasible; it would be somewhat different with, for example, a quantum computer. Complex algorithms are difficult to implement on such a Turing machine, and modern applications such as operating systems or word processors are practically impossible to implement on it directly. Programming languages are therefore needed as tools to make the work easier. The first stage is machine or assembly language, which, for example, combines arithmetic and memory operations into commands and makes them readable. High-level programming languages combine increasingly complex sequences of these low-level operations into instructions that are increasingly easy to understand. However, since these instructions can still only be executed on a von Neumann computer, the Turing machine remains the limit of what is feasible, no matter how complex the high-level programming language appears to be. The usual tools, compilers and interpreters, therefore do not close the semantic gap on their own.
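The layering described above can be made concrete with a minimal sketch in which integer addition is built from nothing but Boolean operations (AND, OR, XOR) and then compared with the single `+` instruction of a high-level language. The function names `full_adder` and `add_bitwise` and the default width of 8 bits are assumptions made for this illustration.

```python
from typing import Tuple


def full_adder(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """Add three bits using only AND, OR and XOR; return (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def add_bitwise(x: int, y: int, width: int = 8) -> int:
    """Ripple-carry addition of two unsigned integers, modulo 2**width."""
    result = 0
    carry = 0
    for i in range(width):
        # Feed bit i of each operand, plus the previous carry, into the adder.
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result


if __name__ == "__main__":
    # The high-level "+" hides the whole carry chain behind one symbol.
    print(add_bitwise(19, 23) == 19 + 23)
```

Both forms compute the same function; the high-level notation is easier to read, but it does not extend what the underlying machine can express.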

Mapping natural language

Even with programming languages, writing a program for a real-world application still requires translating the user's knowledge about the application from the natural, domain-specific language into the language of the Turing machine. In addition, it can be deduced from investigations of the Chomsky hierarchy that precisely this step cannot be automated, i.e. that interaction with humans is always necessary. A practical consequence is that any use of computers to solve a real problem requires the user to have a certain amount of knowledge about what is technically feasible. A word processor, for example, hides data structures, memory access, and search and sorting algorithms behind a suitable user interface, so the user can concentrate on creating content at a more abstract level than selecting ASCII codes. The underlying technology is abstracted to such an extent that the user touches a low-level function only when saving and loading the document. In more complex applications, such as a decision-support system in medicine, this abstraction becomes much more difficult. In theory, the user would have to know which methods exist for assigning measured values to observations as the application requires. The developer, in turn, must know which combinations of measured values and observations occur in order to select appropriate methods for learning the decision function. The semantic gap manifests itself precisely at this change of domain.

Software engineering as a solution

It remains the general task of software engineering to bridge the gap between application knowledge and what is technically feasible. For this purpose, the domain knowledge (high level) about a problem must be transferred into an algorithm and a parameterization (low level). This requires a dialogue between the user and the software developer, which has to be conducted anew for each domain. The goal is always software that enables users to interpret the results of the algorithm without technical explanations from the developer, and to express their knowledge in parameterizations without knowing the technical details of the implementation. A suitable user interface plays a central role in this.
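As a minimal sketch of this two-way translation, the following example lets a user state a setting in domain vocabulary, maps it to a numeric parameter, and translates a raw score back into a readable statement. Everything in it is hypothetical: the class and function names, the three sensitivity levels, and the threshold values are invented for illustration and would in practice be the outcome of the dialogue between user and developer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainSettings:
    """What the user states, in the vocabulary of the application domain."""
    sensitivity: str  # "low", "medium" or "high"


@dataclass(frozen=True)
class AlgorithmParameters:
    """What the algorithm needs: a (hypothetical) numeric decision threshold."""
    threshold: float


# Illustrative values only; they carry no domain meaning of their own.
_THRESHOLDS = {"low": 0.8, "medium": 0.5, "high": 0.2}


def to_parameters(settings: DomainSettings) -> AlgorithmParameters:
    """High level to low level: turn a domain term into a parameter."""
    try:
        return AlgorithmParameters(_THRESHOLDS[settings.sensitivity])
    except KeyError:
        raise ValueError(f"unknown sensitivity: {settings.sensitivity!r}") from None


def interpret(score: float, params: AlgorithmParameters) -> str:
    """Low level to high level: turn a raw score into a readable statement."""
    return "flag for review" if score >= params.threshold else "no action"


if __name__ == "__main__":
    params = to_parameters(DomainSettings("medium"))
    print(interpret(0.6, params))
```

The mapping table is the place where the gap is bridged by hand; the code cannot derive it from the domain terms alone.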

Examples

A typical domain that requires both a high degree of abstraction from low-level methods and a high degree of automation is diagnostic support in medicine. Here, complex relationships are stored in the data structures of expert systems, which the user can efficiently train and search without needing knowledge of artificial intelligence methods. The problem of the semantic gap is even more complex in automated image analysis. The aim is to recognize the content of an image and to assign one or more meanings to it. The only data available for this is unspecific pixel data as low-level information. To recognize the depicted objects or scenes from this raw data, algorithms for pixel selection or manipulation must be suitably combined, parameterized and linked with natural-language terms. Implementing natural description categories such as color or shape requires entirely different mathematical formalization concepts, which the user must be familiar with in addition to the natural-language formulation.
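To illustrate how much formalization even a simple category such as color requires, the following sketch maps raw RGB pixels to coarse color words via the HSV color space, using `colorsys` from the Python standard library. The hue boundaries and the saturation and brightness cut-offs are assumptions chosen for this example; other values are equally defensible, which is exactly where the gap shows. The names `color_name` and `dominant_color` are hypothetical.

```python
import colorsys
from collections import Counter
from typing import Iterable, Tuple

# Upper hue bounds in degrees; an assumed partition of the color circle.
_HUE_NAMES = [(30, "red"), (90, "yellow"), (150, "green"),
              (210, "cyan"), (270, "blue"), (330, "magenta"), (360, "red")]


def color_name(r: int, g: int, b: int) -> str:
    """Map one 8-bit RGB pixel to a coarse natural-language color term."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if v < 0.2:                      # too dark to carry a hue
        return "black"
    if s < 0.2:                      # too unsaturated to carry a hue
        return "white" if v > 0.8 else "gray"
    hue_deg = h * 360
    for upper, name in _HUE_NAMES:
        if hue_deg < upper:
            return name
    return "red"                     # hue of exactly 360 degrees wraps around


def dominant_color(pixels: Iterable[Tuple[int, int, int]]) -> str:
    """Name the color term that occurs most often among the pixels."""
    return Counter(color_name(*p) for p in pixels).most_common(1)[0][0]


if __name__ == "__main__":
    print(dominant_color([(255, 0, 0), (250, 10, 10), (0, 0, 255)]))
```

A user who asks for "red images" has to accept, or adjust, these numeric boundaries without necessarily knowing that they exist.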

Literature

  1. Arnold W. M. Smeulders, Marcel Worring, Simone Santini, Amarnath Gupta, Ramesh Jain: Content-Based Image Retrieval at the End of the Early Years. In: IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol. 22, No. 12, 2000, pp. 1349-1380, doi:10.1109/34.895972.
  2. Chitra Dorai, Svetha Venkatesh: Bridging the Semantic Gap with Computational Media Aesthetics. In: IEEE MultiMedia. Vol. 10, No. 2, 2003, pp. 15-17, doi:10.1109/MMUL.2003.1195157.